DreamHAL - Dreamcast Hardware Abstraction Layer

Moopthehedgehog
DCEmu Freak
Posts: 85
Joined: Wed Jan 05, 2011 4:25 pm
Has thanked: 4 times
Been thanked: 39 times

DreamHAL - Dreamcast Hardware Abstraction Layer

Post by Moopthehedgehog »

DreamHAL - Dreamcast Hardware Abstraction Layer

If you have ever used an embedded device like an STM32, MSP430, Arduino, etc., you have probably come across something known as a hardware abstraction layer (or HAL, for short). Essentially, it is a set of functions meant to easily initialize and configure various modules of a microcontroller in a completely on-demand way. This means things like timer units, DMA controllers, UART/serial ports, etc. can be added to or removed from a project as needed by including a header file, calling one initialization function with some initialization parameters, and then calling functions to make use of the desired peripheral. When designed and optimized appropriately, this can provide essentially no overhead and very high performance, all in an easy-to-use way that does not depend on any external libraries.

This is in contrast to using standard libraries or building around kernels, as any unused functionality is simply not included in the project, so you only get the bare minimum you need to do what you want.
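To make the "include a header, call one init function, then use the peripheral" pattern concrete, here is a minimal sketch. Everything in it is hypothetical: the `demo_timer_*` names and the register layout are invented for illustration and do not correspond to the SH4's timer unit or to any actual DreamHAL module. The register block is passed in as a pointer so the same code can be exercised against plain memory off-target.

```c
#include <stdint.h>

/* Register block of a made-up down-counting timer.
 * Assumption: illustrative layout only, NOT the SH4 TMU. */
typedef struct {
    volatile uint32_t reload;   /* value reloaded when the counter underflows */
    volatile uint32_t count;    /* current counter value */
    volatile uint32_t control;  /* bit 0: run */
} demo_timer_regs;

/* Initialization parameters handed to the single init call. */
typedef struct {
    uint32_t period_ticks;
} demo_timer_config;

#define DEMO_TIMER_CTRL_RUN 0x1u

/* Configure the timer; it stays stopped until demo_timer_start(). */
static inline void demo_timer_init(demo_timer_regs *regs,
                                   const demo_timer_config *cfg)
{
    regs->control = 0;                  /* stop while reconfiguring */
    regs->reload  = cfg->period_ticks;
    regs->count   = cfg->period_ticks;
}

static inline void demo_timer_start(demo_timer_regs *regs)
{
    regs->control |= DEMO_TIMER_CTRL_RUN;
}

static inline uint32_t demo_timer_read(const demo_timer_regs *regs)
{
    return regs->count;
}
```

Because every function is `static inline` in a header-style unit, a project that never includes it pays nothing for it, which is the "only what you use" property described above.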

I've been thinking about how the Dreamcast is actually a pretty standard system as far as embedded hardware goes, and through conversing with user mrneo it appears evident that there may be some desire for something like this, particularly a HAL that is highly optimized. Note that this is NOT the same as KOS, as it would only be a collection of .c/.h files pertaining to using the SH4. One way of thinking about it is turning the Renesas SH7750 hardware manual into usable code, ideally highly optimized. The SH4 is not a very complicated CPU, and it doesn't really have that many components to it (I think it's about 15?), so I don't think it would be too crazy a project.

Those who have looked at the source code for dcload-ip that I've been working on may have noticed that perfctr.h/perfctr.c (attached), which are for controlling the supposedly "undocumented" performance counters, are in fact completely standalone, modular code. That's the kind of thing I have in mind: just do that for each subsystem of the SH4. Particularly with the bit of revival going on with SH4 patent expiry, this may also be useful outside of the Dreamcast universe.

I'd like to keep this something like MIT-licensed (although whatever modules I write I'd probably just go straight public domain. So many fewer headaches that way) to maximize adoption, although how licensing would be specifically handled is a discussion to be had only if I'm not going to be the only contributor ;). I can't guarantee when it might be done, only that it will at some point because I finish things I start (and I have one very big non-Dreamcast project in particular that's a bit higher-priority). If you want to help out, please feel free, but just keep that licensing restriction in mind. Getting involved is as simple as grabbing the "SH7750, SH7750S, SH7750R Group User's Manual: Hardware" manual (not the 7751, and not always the 7750R stuff in the 7750 manual; the Dreamcast's SH7091 is somewhere in between the SH7750 and SH7750R) from the Renesas website and making a module for a peripheral that hasn't been done yet. ...Which right now is most of them. I also want to make one single header file that has '#define's for every single memory-mapped address in the SH4's memory map, with each one having a comment stating exactly what it is.
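As a rough idea of what such a register-map header could look like, here is a small sketch covering a few timer unit (TMU) registers. The `SH4_TMU_*` and `SH4_REG*` names are hypothetical, not taken from DreamHAL, and the addresses should be double-checked against the SH7750 hardware manual before use.

```c
#include <stdint.h>

/* TMU (timer unit) registers in the SH4's P4 area.
 * Assumption: macro names are invented for this sketch; addresses are as
 * listed in the SH7750 hardware manual and should be verified there. */
#define SH4_TMU_BASE   0xFFD80000u
#define SH4_TMU_TOCR   (SH4_TMU_BASE + 0x00u)  /* Timer output control register (8-bit) */
#define SH4_TMU_TSTR   (SH4_TMU_BASE + 0x04u)  /* Timer start register (8-bit) */
#define SH4_TMU_TCOR0  (SH4_TMU_BASE + 0x08u)  /* Channel 0 timer constant register (32-bit) */
#define SH4_TMU_TCNT0  (SH4_TMU_BASE + 0x0Cu)  /* Channel 0 timer counter (32-bit) */
#define SH4_TMU_TCR0   (SH4_TMU_BASE + 0x10u)  /* Channel 0 timer control register (16-bit) */

/* Typed volatile accessors so each register is touched at its real width. */
#define SH4_REG8(addr)   (*(volatile uint8_t  *)(uintptr_t)(addr))
#define SH4_REG16(addr)  (*(volatile uint16_t *)(uintptr_t)(addr))
#define SH4_REG32(addr)  (*(volatile uint32_t *)(uintptr_t)(addr))

/* Only meaningful on real hardware: read timer channel 0's counter. */
static inline uint32_t sh4_tmu0_count(void)
{
    return SH4_REG32(SH4_TMU_TCNT0);
}
```

Keeping the access width in the accessor macro (rather than in the address define) means the address list can stay a plain table of numbers with one comment per register.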

A fair amount of this might already be out there, but it's not all in one place and not necessarily super-optimized. Additionally, it's probably not fully-featured; for example, the SCI and SCIF modules should probably have functions for polled, interrupt, and DMA modes of operation, etc.

Hopefully this name isn't already taken, as well. I wasn't able to find anything that had it already, so unless I missed something DreamHAL it is. :thumbup: (I really just wanted to make this post now in order to claim the name, get feedback on the idea, etc. I'm thinking of this as a more long-term project.)

admin edit: github repo was removed, for now can be downloaded at https://dreamcast.wiki/DreamHAL
Last edited by Moopthehedgehog on Tue Dec 17, 2019 1:13 am, edited 3 times in total.
These users thanked the author Moopthehedgehog for the post (total 3):
Protofall, freakdave, Ian Robinson
I'm sure Aleron Ives feels weird with his postcount back to <10668
:D
Ayla
DC Developer
Posts: 142
Joined: Thu Apr 03, 2008 7:01 am
Has thanked: 0
Been thanked: 4 times

Re: DreamHAL - Dreamcast Hardware Abstraction Layer

Post by Ayla »

Public domain is not a license. Have a look at the CC0 license.
Moopthehedgehog
DCEmu Freak
Posts: 85
Joined: Wed Jan 05, 2011 4:25 pm
Has thanked: 4 times
Been thanked: 39 times

Re: DreamHAL - Dreamcast Hardware Abstraction Layer

Post by Moopthehedgehog »

Hmm, how about BSD 0-clause, then?
Apparently public domain isn’t the same thing worldwide like I thought it was...
BSD 0-Clause wrote: Permission to use, copy, modify, and/or distribute this software for any purpose with or without fee is hereby granted.

THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
BB Hood
DC Developer
Posts: 189
Joined: Fri Mar 30, 2007 12:09 am
Has thanked: 41 times
Been thanked: 10 times

Re: DreamHAL - Dreamcast Hardware Abstraction Layer

Post by BB Hood »

Post a link to the source code repository
Moopthehedgehog
DCEmu Freak
Posts: 85
Joined: Wed Jan 05, 2011 4:25 pm
Has thanked: 4 times
Been thanked: 39 times

Re: DreamHAL - Dreamcast Hardware Abstraction Layer

Post by Moopthehedgehog »

Moopthehedgehog
DCEmu Freak
Posts: 85
Joined: Wed Jan 05, 2011 4:25 pm
Has thanked: 4 times
Been thanked: 39 times

Re: DreamHAL - Dreamcast Hardware Abstraction Layer

Post by Moopthehedgehog »

If anyone is interested in getting updates to that repository, I recommend hitting the "watch" button on the repo. Stars don't send you e-mails whenever I add to it. E.g. I just added a ton of new high-resolution video modes the Dreamcast can do.
Moopthehedgehog
DCEmu Freak
Posts: 85
Joined: Wed Jan 05, 2011 4:25 pm
Has thanked: 4 times
Been thanked: 39 times

Re: DreamHAL - Dreamcast Hardware Abstraction Layer

Post by Moopthehedgehog »

So I had a pretty dumb problem where division involving negative numbers was pretty broken using MATH_Fast_Divide() and MATH_Invert(). This has now been fixed in sh4_math.h version 1.1.1.

The now-fixed MATH_Invert() has been renamed to MATH_Fast_Invert() for consistency with naming convention (so if anyone was using MATH_Invert() before, just add the _Fast part because MATH_Invert() no longer exists). :P
hlabrand
DCEmu Newbie
Posts: 6
Joined: Sat Jul 18, 2020 5:38 pm
Has thanked: 0
Been thanked: 0

Re: DreamHAL - Dreamcast Hardware Abstraction Layer

Post by hlabrand »

Hi everyone,
Somebody mentioned this lib on Twitter and it is very cool! (I'm not sure if its author is still around? Also let me know if I should have created a new topic instead of replying here.)

If I'm not mistaken, the MATH_Fast_Sqrt function can be optimized further using a small trick, which saves one fsrra (bringing the cost down to 1 fsrra + 1 mul). See the commit here on my GitHub.

It's a pretty nice speed improvement, but as always with these things, you have to see if the precision loss / error in the approximation is acceptable. My intuition is that it is more numerically stable, as each fsrra introduces a small (2^{-21}) error; but I could be wrong. And it wouldn't be a good replacement if it gives inaccurate results (unless that kind of precision isn't that important).
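For readers without the commit at hand, the "1 fsrra + 1 mul" form presumably relies on the identity sqrt(x) = x * (1/sqrt(x)). The sketch below shows that identity in portable C. It is not the DreamHAL code: `rsqrt_standin()` is a made-up host-side substitute for the FSRRA instruction (a bit-level initial guess refined by two Newton-Raphson steps) so the example can run off-target, and `fast_sqrt_sketch()` is likewise a hypothetical name.

```c
#include <stdint.h>
#include <string.h>

/* Host-side stand-in for the SH4 FSRRA instruction (approximate 1/sqrt(x)).
 * Assumption: illustration only; on real hardware the instruction itself
 * would be used. Valid for finite x > 0. */
static float rsqrt_standin(float x)
{
    uint32_t bits;
    float y;

    memcpy(&bits, &x, sizeof bits);       /* reinterpret the float's bits */
    bits = 0x5f3759dfu - (bits >> 1);     /* classic initial guess */
    memcpy(&y, &bits, sizeof y);

    /* Two Newton-Raphson refinements: y <- y * (1.5 - 0.5 * x * y * y) */
    y = y * (1.5f - 0.5f * x * y * y);
    y = y * (1.5f - 0.5f * x * y * y);
    return y;
}

/* sqrt(x) = x * (1/sqrt(x)): one reciprocal square root plus one multiply. */
static float fast_sqrt_sketch(float x)
{
    if (x <= 0.0f)
        return 0.0f;   /* guard: at x == 0 the product would be 0 * infinity = NaN */
    return x * rsqrt_standin(x);
}
```

The guard for zero is worth noting when judging the trick: a bare `x * rsqrt(x)` yields NaN at exactly 0, so either the caller guarantees x > 0 or a check like this is needed, which eats into the saving.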

Would anyone be able to benchmark this new function with regards to its speed, and maybe also see if the result looks accurate enough? (Unfortunately I can't do it myself: I'm barely a beginner on Dreamcast dev - but I have a background in math, so I figured I would take a look at the code :grin: )

Thanks in advance!
TapamN
DC Developer
Posts: 105
Joined: Sun Oct 04, 2009 11:13 am
Has thanked: 2 times
Been thanked: 89 times

Re: DreamHAL - Dreamcast Hardware Abstraction Layer

Post by TapamN »

That should have lower latency than an FSQRT instruction. FSQRT's latency is 12 cycles. FSRRA has a latency of 6 cycles and FMUL has a latency of 3 cycles, so your approximate square root would take 6 + 3 = 9 cycles, which is 3 cycles faster than FSQRT.
hlabrand
DCEmu Newbie
DCEmu Newbie
Joined: Sat Jul 18, 2020 5:38 pm
Has thanked: 0
Been thanked: 0

Re: DreamHAL - Dreamcast Hardware Abstraction Layer

Post by hlabrand »

Thanks for confirming, I didn't know the latency of each instruction! (Although it's odd, since Moop's source indicates that their code was already 3x faster in practice than fsqrt? What am I missing?)
TapamN
DC Developer
DC Developer
Joined: Sun Oct 04, 2009 11:13 am
Has thanked: 2 times
Been thanked: 89 times

Re: DreamHAL - Dreamcast Hardware Abstraction Layer

Post by TapamN »

Not totally sure, but it seems like something is wrong with KOS's math headers which prevents GCC from generating the builtin instructions. It seems to always generate a function call.

GCC can emit the FSQRT, FSRRA, and FSCA instructions if you use the compiler options "-ffast-math -ffp-contract=fast -mfsrra -mfsca", but under KOS you have to manually use the GCC builtin function calls ("__builtin_sqrt()", "__builtin_sincos()"). At the moment, calling sqrt() or sincos() under KOS seems to always call the slow, 100% IEEE compliant library functions, regardless of compile options. GCC can automatically generate FSRRA when it sees "1.0f / __builtin_sqrt(...)". This will also allow GCC to generate FMAC instructions when it's useful. I guess there's something wrong with KOS's math headers that doesn't allow GCC to handle this correctly on its own, but I haven't looked into why yet. The speed difference measured is probably from GCC using a function call instead of just the FSQRT instruction.

Having GCC generate these instructions itself, instead of inline asm, results in better code since GCC knows what's going on. If you use inline assembly to generate these instructions, GCC does not look inside the asm block to see what timing or math is going on. It thinks all the results of the asm block are ready at the end of the block, and doesn't try to schedule instructions around it. (i.e. If GCC generates a FSQRT instruction, it doesn't have the code sit and wait until the result is ready, it tries to rearrange the code so that the CPU has something to do while it completes. If you use inline asm to use FSQRT, GCC doesn't know that it needs to find something else to do.)

Also, since GCC doesn't know what's going on in the asm block, it can't simplify equations. If you call sincosf with a compile-time constant value, and use the builtin functions, GCC can take advantage that it knows ahead of time what the result is to simplify how it's used. With an angle of 0, GCC knows the sin result is 0 and the cos is 1, and it can eliminate any pointless multiply by 1's, or it can skip calculating anything that ends up multiplied by 0. If you use inline asm, GCC doesn't know what's going on, and will always assume the worst.
hlabrand
DCEmu Newbie
DCEmu Newbie
Joined: Sat Jul 18, 2020 5:38 pm
Has thanked: 0
Been thanked: 0

Re: DreamHAL - Dreamcast Hardware Abstraction Layer

Post by hlabrand »

Oh I see! Basically "don't try to be smarter than GCC"? :D

So does this mean that a "fast_sqrt" function taking 9 cycles should be written as "return x*1.0f/__builtin_sqrt(x)" (or in 2 different steps maybe) and let gcc take care of it? (And should it be written in KOS instead of in DreamHAL?)
Ian Robinson
DC Developer
Posts: 114
Joined: Mon Mar 11, 2019 7:12 am
Has thanked: 209 times
Been thanked: 41 times

Re: DreamHAL - Dreamcast Hardware Abstraction Layer

Post by Ian Robinson »

hlabrand wrote: Fri Aug 06, 2021 9:15 pm Oh I see! Basically "don't try to be smarter than GCC"? :D

So does this mean that a "fast_sqrt" function taking 9 cycles should be written as "return x*1.0f/__builtin_sqrt(x)" (or in 2 different steps maybe) and let gcc take care of it? (And should it be written in KOS instead of in DreamHAL?)
GCC is not all that smart with the SH4 target; in many cases it messes things up when using builtins. Use the SH4 Compiler Explorer and check the asm output; don't just trust it.
GyroVorbis
Elysian Shadows Developer
Posts: 1874
Joined: Mon Mar 22, 2004 4:55 pm
Location: #%^&*!!!11one Super Sonic
Has thanked: 80 times
Been thanked: 61 times

Re: DreamHAL - Dreamcast Hardware Abstraction Layer

Post by GyroVorbis »

TapamN wrote: GCC can emit the FSQRT, FSRRA, and FSCA instructions if you use the compiler options "-ffast-math -ffp-contract=fast -mfsrra -mfsca", but under KOS you have to manually use the GCC builtin function calls ("__builtin_sqrt()", "__builtin_sincos()"). At the moment, calling sqrt() or sincos() under KOS seems to always call the slow, 100% IEEE compliant library functions, regardless of compile options. GCC can automatically generate FSRRA when it sees "1.0f / __builtin_sqrt(...)". This will also allow GCC to generate FMAC instructions when it's useful. I guess there's something wrong with KOS's math headers that doesn't allow GCC to handle this correctly on its own, but I haven't looked into why yet. The speed difference measured is probably from GCC using a function call instead of just the FSQRT instruction.
It's because we do not use -ffast-math or the other flags in standard KOS headers.

This is something we're aware of. I have brought it up to BlueCrab before, and even tried to make them automatic within the new CMake Toolchain file, but he decided that since they are not 100% accurate and IEEE conforming that doing such a thing would be a bad idea for a default configuration.

...that being said, maybe they should be slid in as a comment in the sample environ.sh just so that people know it's there and can enable them by uncommenting a single line...

I can also concur with TapamN's findings, though, that using the fast math and math instructions generates better assembly than Moop's inline assembly.
TapamN
DC Developer
DC Developer
Joined: Sun Oct 04, 2009 11:13 am
Has thanked: 2 times
Been thanked: 89 times

Re: DreamHAL - Dreamcast Hardware Abstraction Layer

Post by TapamN »

GyroVorbis wrote: Fri May 26, 2023 8:36 pm It's because we do not use -ffast-math or the other flags in standard KOS headers.
What I was talking about was that even when you used -ffast-math, GCC still wouldn't generate FSCA/FSRRA unless you manually specified the builtin version. I didn't know it at the time I wrote that post, but the cause of that problem is that environ_base.sh adds -fno-builtin to KOS_CFLAGS, which prevents a lot of clib optimizations, and forces real function calls. Beyond preventing ffast-math from generating FSCA/FSRRA, it also prevents optimizations like converting small memcpy/memsets to inline code, forcing a function call when you want to copy or clear an 8 byte struct.
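The memcpy/memset point can be illustrated with a struct of exactly 8 bytes. In the sketch below, `vec2i` and the two helpers are hypothetical names chosen for the example. With builtins enabled and optimization on, GCC is free to expand calls like these into a couple of loads and stores; with -fno-builtin it has to emit real calls to the library functions. What actually gets generated depends on the compiler version and flags, so it is worth checking the assembly output.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 8-byte struct, the size mentioned in the post above. */
typedef struct {
    int32_t x;
    int32_t y;
} vec2i;

/* Copy via memcpy: with builtins enabled the compiler may inline this
 * as plain loads/stores; with -fno-builtin it stays a function call. */
static void vec2i_copy(vec2i *dst, const vec2i *src)
{
    memcpy(dst, src, sizeof *dst);
}

/* Clear via memset: same consideration as the copy above. */
static void vec2i_clear(vec2i *v)
{
    memset(v, 0, sizeof *v);
}
```

The source code is identical either way; only the build flags decide whether the small copy costs a call and return.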

Not defaulting to -ffast-math is the right choice, but I see no reason for -fno-builtin. Builtins remain fully C standard compliant (ignoring any compiler bugs, or opting out of standards with things like ffast-math). I've been using GCC 12.2 since its release, with the -fno-builtin removed from my environ_base.sh, and I haven't noticed any issues with it.
GyroVorbis
Elysian Shadows Developer
Posts: 1874
Joined: Mon Mar 22, 2004 4:55 pm
Location: #%^&*!!!11one Super Sonic
Has thanked: 80 times
Been thanked: 61 times

Re: DreamHAL - Dreamcast Hardware Abstraction Layer

Post by GyroVorbis »

TapamN wrote: Fri May 26, 2023 9:05 pm
GyroVorbis wrote: Fri May 26, 2023 8:36 pm It's because we do not use -ffast-math or the other flags in standard KOS headers.
What I was talking about was that even when you used -ffast-math, GCC still wouldn't generate FSCA/FSRRA unless you manually specified the builtin version. I didn't know it at the time I wrote that post, but the cause of that problem is that environ_base.sh adds -fno-builtin to KOS_CFLAGS, which prevents a lot of clib optimizations, and forces real function calls. Beyond preventing ffast-math from generating FSCA/FSRRA, it also prevents optimizations like converting small memcpy/memsets to inline code, forcing a function call when you want to copy or clear an 8 byte struct.
JESUS. Okay, relaying this information... I did not know that we were screwing up enabling fast math by using -fno-builtin... that's what I get for looking at Compiler Explorer and not adding that flag there...
TapamN wrote:Not defaulting to -ffast-math is the right choice, but I see no reason for -fno-builtin. Builtins remain fully C standard compliant (ignoring any compiler bugs, or opting out of standards with things like ffast-math). I've been using GCC 12.2 since its release, with the -fno-builtin removed from my environ_base.sh, and I haven't noticed any issues with it.
Thank you, I will be relaying this information and making a PR to fix this if I get the green light.
GyroVorbis
Elysian Shadows Developer
Posts: 1874
Joined: Mon Mar 22, 2004 4:55 pm
Location: #%^&*!!!11one Super Sonic
Has thanked: 80 times
Been thanked: 61 times

Re: DreamHAL - Dreamcast Hardware Abstraction Layer

Post by GyroVorbis »

Okay, so I did some research on -fno-builtin and built everything myself without it. I verified, at least on GCC 13.1.0, that it does indeed work, and at least the few things I've tested with are fine...

HOWEVER, I do see some new compiler warnings that kind of sketch me out... and I really don't feel comfortable just straight up changing the defaults with so little testing, and I'm betting BlueCrab won't go for it either...

So I have a proposal, here: https://github.com/KallistiOS/KallistiOS/pull/222 that I feel like gives everybody what they want (a choice in their build configuration) PLUS it lets us keep our defaults that we know have been rigorously tested... It does remove -fno-builtin from environ_base.sh; however, it still keeps it enabled by default in the environ.sh.sample file... It also straight up documents the fast math flags for users who had to go do side research for them previously... What do you guys think? Does it suit your needs?

Then we'll see what BlueCrab thinks tomorrow when he wakes up. lol.

EDIT: Huh. After actually digging into some of the new warnings (KOS built cleanly with -fno-builtin), we found out that it looks like they're actually legit... I'm guessing the compiler has more visibility into what's going on around the call-site now that it's using its own implementations of some of these functions? Definitely need to clean those up...

EDIT2: Well, well, well. Look what Quzar found: https://gcc.gnu.org/onlinedocs/gcc/C-Di ... tions.html. Specifically:
GCC C Dialect Options wrote: In addition, when a function is recognized as a built-in function, GCC may use information about that function to warn about problems with calls to that function [...]
So it does make sense that suddenly we're seeing warnings which actually appear to be legit potential issues within the code surrounding them... Very interesting.
These users thanked the author GyroVorbis for the post:
Ian Robinson
GyroVorbis
Elysian Shadows Developer
Posts: 1874
Joined: Mon Mar 22, 2004 4:55 pm
Location: #%^&*!!!11one Super Sonic
Has thanked: 80 times
Been thanked: 61 times

Re: DreamHAL - Dreamcast Hardware Abstraction Layer

Post by GyroVorbis »

BlueCrab was fine with it! The PR has been merged!

If you pull down the latest KOS, remember to update your environ.sh with the switches that have been added to environ.sh.sample!
TapamN
DC Developer
DC Developer
Joined: Sun Oct 04, 2009 11:13 am
Has thanked: 2 times
Been thanked: 89 times

Re: DreamHAL - Dreamcast Hardware Abstraction Layer

Post by TapamN »

Neat.

I actually talked about the -fno-builtin switch a bit when the GCC 12.2 port was originally released.

https://dcemulation.org/phpBB/viewtopic ... 9#p1059779

https://dcemulation.org/phpBB/viewtopic ... 5#p1059785
These users thanked the author TapamN for the post (total 2):
GyroVorbis, Ian Robinson
GyroVorbis
Elysian Shadows Developer
Posts: 1874
Joined: Mon Mar 22, 2004 4:55 pm
Location: #%^&*!!!11one Super Sonic
Has thanked: 80 times
Been thanked: 61 times

Re: DreamHAL - Dreamcast Hardware Abstraction Layer

Post by GyroVorbis »

My bad, I got sidetracked with date/time and a few other PRs. Then to be honest, I was so new to KOS that I didn't even feel comfortable doing things like changing its flags. At the time I was working on the CMake toolchain file, but that was a separate thing from the main KOS flags. The GCC 12 stuff was literally my first KOS PR that was ever merged...

Then after a while, I totally forgot about it. Good thing it was brought up again, actually. :grin:

By the way, we know who you are. We read what you put out. We know you know what you're doing here on the Dreamcast. If there's anything else you think we need to be looking at, feel free to either let me know or raise a bug on GitHub. We've been really active lately, and we will make it a point to take a look at it.
These users thanked the author GyroVorbis for the post:
Ian Robinson