Sorry for reviving such an old thread, I just really didn't want to make another one, considering this question is related to the adventure started in this thread.
I know the KOS has this stuff implemented already, I think it's in "kernel/arch/dreamcast/include/dc/math.h"?
Either way, for learning purposes I've been trying to utilize the SH4's vector functionality.
I took a look at the KOS's fmath.h header, which seems to use inline assembly optimizations for a shitton of stuff. For example, faster square roots, vector products, matrix multiplication, etc.
Anywhoo, enough blabbering lol.
The KOS has this macro here:
Code:
#define __fipr(x, y, z, w, a, b, c, d) ({ \
register float __x __asm__("fr0") = (x); \
register float __y __asm__("fr1") = (y); \
register float __z __asm__("fr2") = (z); \
register float __w __asm__("fr3") = (w); \
register float __a __asm__("fr4") = (a); \
register float __b __asm__("fr5") = (b); \
register float __c __asm__("fr6") = (c); \
register float __d __asm__("fr7") = (d); \
__asm__ __volatile__( \
"fipr fv4,fv0" \
: "+f" (__w) \
: "f" (__x), "f" (__y), "f" (__z), "f" (__w), \
"f" (__a), "f" (__b), "f" (__c), "f" (__d) \
); \
__w; })
Which clearly calculates the inner product of two 4D vectors, (x, y, z, w) and (a, b, c, d). But I'm very confused as to exactly how this works.
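Just to spell out what I mean by inner product, here is a plain C sketch of the value I'd expect the macro to hand back. The name fipr_reference is something I made up for illustration; it's not part of the KOS, and it says nothing about how the registers actually get assigned.

```c
/* Hypothetical helper (not part of the KOS): a plain C reference for the
   value that __fipr(x, y, z, w, a, b, c, d) should produce. */
static float fipr_reference(float x, float y, float z, float w,
                            float a, float b, float c, float d)
{
    /* First vector is (x, y, z, w), second vector is (a, b, c, d).
       The inner product is the sum of the component-wise products. */
    return x * a + y * b + z * c + w * d;
}
```

For example, fipr_reference(1, 2, 3, 4, 5, 6, 7, 8) works out to 5 + 12 + 21 + 32 = 70.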
The constraints passed to GCC just say to throw each value into a floating-point register, but the values would need to land in specific registers, in order, for the result to be accurate. Which leads me to believe that the values are being placed into the FR registers when they're declared like this:
Code:
register float __x __asm__("fr0") = (x); \
But looking into the "register" keyword, GCC can never promise they will actually be stored in a register.
Sorry if I've been too vague, I'm just quite confused. This is more of a question about GCC, but this seems quite different from any x86 inline assembly I've seen.
EDIT: Just corrected a few things...
Thanks!