I hacked together some scripts to automate testing of 24 different compiler configurations.
Hardware used:
US Dreamcast model 1 with 32MB mod, DreamBoot BIOS, loading tests from embedded dcload-ip
Software
KallistiOS Git revision 98f913d
Modified version of the pvrmark example in kos/examples/dreamcast/pvr/pvrmark -- modification done to avoid the inconsistencies and issues the stock pvrmark has.
pvrmark goes through several phases. This modification waits until pvrmark reaches PHASE_FINAL, then waits until 5 more FPS/PPS debug messages (we'll call these "cycles") have been printed (approximately one every 5 seconds). It then collects the polygons-per-second scores from the next 25 cycles and averages them. Sometimes a cycle spikes up (printing >90fps, for example) or dips down (to 10-30fps); when that happens, the cycle is dropped from the results and the next cycle is counted instead.
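The sampling rule described above can be sketched in Python like this. It only illustrates the logic; it is not the actual pvrmark modification (which lives in the C example). The function name `average_pps`, the `(fps, pps)` tuple layout, and the exact cutoffs (30 and 90 fps) are assumptions inferred from the description.

```python
from typing import Iterable, Tuple


def average_pps(cycles: Iterable[Tuple[float, int]],
                warmup: int = 5,
                wanted: int = 25,
                fps_low: float = 30.0,
                fps_high: float = 90.0) -> float:
    """Average polygons-per-second over `wanted` accepted cycles.

    `cycles` yields (fps, pps) pairs printed after PHASE_FINAL begins.
    The fps cutoffs are assumed values, not taken from the real code.
    """
    samples = []
    for index, (fps, pps) in enumerate(cycles):
        if index < warmup:
            continue  # ignore the first few cycles after PHASE_FINAL
        if not (fps_low < fps <= fps_high):
            continue  # spike or dip: drop it and count the next cycle instead
        samples.append(pps)
        if len(samples) == wanted:
            return sum(samples) / wanted
    raise ValueError("benchmark ended before enough cycles were collected")
```

Because a dropped cycle is simply skipped, a noisy run takes longer but still averages exactly 25 samples.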
All 24 tests were run and data was collected, then it was done all over again: the system was rebooted and the test suite was run a second time. And when that was finished, a third time. All in all, it took about 4.5 hours for the Dreamcast to run 3 passes of these tests.
Compiler setups tested:
KOS "testing" -- gcc-13.1.0 with Newlib 4.3.0 and binutils 2.40 (PR on GitHub to be merged soon; since the final tarball hasn't been uploaded yet, we are using GCC 13.1.0 RC2)
KOS "testing" -- gcc-12.2.0 with Newlib 4.3.0 and binutils 2.40
KOS "stable" -- gcc-9.3.0 with Newlib 3.3.0 and binutils 2.34
KOS "legacy" -- gcc-4.7.4 with Newlib 2.0.0 and binutils 2.34
Flags tested:
-O1
-O1 -fomit-frame-pointer
-O2 -fomit-frame-pointer
-O3 -fomit-frame-pointer
-Os -fomit-frame-pointer
-Wall -g -ml -m4-single-only -Os -fomit-frame-pointer -ffast-math -fno-strict-aliasing -fwrapv
-O3 -fipa-pta -fomit-frame-pointer
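For reference, here is a minimal Python sketch of how the 24 configurations could be enumerated. It assumes there are six flag sets (one per code block above, with the standalone "-O1" line being a label rather than a seventh set) and that the last set is meant to start with `-O3`. The names `TOOLCHAINS`, `FLAG_SETS`, and `build_matrix` are made up for illustration and are not from the actual scripts.

```python
from itertools import product

# Toolchain labels as listed in the post.
TOOLCHAINS = ["gcc-13.1.0", "gcc-12.2.0", "gcc-9.3.0", "gcc-4.7.4"]

# Assumption: six flag sets, one per code block in the post.
FLAG_SETS = [
    "-O1 -fomit-frame-pointer",
    "-O2 -fomit-frame-pointer",
    "-O3 -fomit-frame-pointer",
    "-Os -fomit-frame-pointer",
    "-Wall -g -ml -m4-single-only -Os -fomit-frame-pointer "
    "-ffast-math -fno-strict-aliasing -fwrapv",
    "-O3 -fipa-pta -fomit-frame-pointer",
]


def build_matrix(toolchains=TOOLCHAINS, flag_sets=FLAG_SETS):
    """Return every (toolchain, flags) pair in a stable order."""
    return list(product(toolchains, flag_sets))


matrix = build_matrix()
print(len(matrix))  # 4 toolchains x 6 flag sets = 24
```

Adding a new set of CFLAGS to a test run would then just mean appending one string to `FLAG_SETS`.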
Results
While the top 3 spots shuffled around between the three passes, a variant of GCC 13.1.0 was always on top. On average, GCC 13.1.0 with -O3 -fipa-pta performed the best. Even so, GCC 4.7.4 had a very impressive showing at -O2.
Binary sizes: Speaking strictly of bin sizes, GCC 13.1.0 wins with -Os. Great, right?
Comparing binary sizes and performance: Well, not so great for GCC 13.1.0 with -Os. Despite its excellent low bin size, its performance is near the bottom.
That being said, looking at the top of the performance charts, GCC 13.1.0 at -O3 or -O3 -fipa-pta not only edges out 4.7.4 at -O2 for the performance crown, but also comes in a good chunk smaller than the 4.7.4 top performers.
If you really need to shave off bin size, GCC 12.2.0 with -Os or iansflags might be the way to go, as you sacrifice a little bit of performance for a pretty sizeable reduction in binary size.
Beyond
- I have no idea if pvrmark is truly a good benchmark or if it's representative of the typical work done by homebrew software on a Dreamcast. I figured it was a decent starting point once I got it to (relatively) consistently produce results I could compare. My scripts are adaptable and it would be trivial to replace pvrmark with a different benchmark as long as the benchmark program prints out a score to the console and exits back to dcload.
- I could try more configurations. Does anyone have a set of CFLAGS they think can beat out these top configs? I'm more than willing to go for a round two with this test to see how hard we can push this console with our available tools.
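On the first point, swapping in another benchmark mostly comes down to pulling its score out of the captured console text. Here is a rough Python sketch of that step. It assumes a purely hypothetical output convention where the benchmark prints a line like `SCORE: 1234567` before exiting back to dcload; the pattern and the `extract_score` name are illustrative and not taken from the actual scripts or from pvrmark.

```python
import re
from typing import Optional

# Hypothetical convention: the benchmark prints one line like "SCORE: 1234567".
SCORE_RE = re.compile(r"^SCORE:\s*(\d+)\s*$", re.MULTILINE)


def extract_score(console_output: str) -> Optional[int]:
    """Return the last score found in the captured console text, or None."""
    matches = SCORE_RE.findall(console_output)
    return int(matches[-1]) if matches else None
```

Taking the last match means any earlier warm-up prints are ignored, and a `None` result can be used to flag a run that crashed or never reported a score.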