Dreamcast Lighting Engine With Bumpmapping
- LightDark
- DCEmu Fast Newbie
- Posts: 18
- Joined: Sat Mar 09, 2013 4:39 pm
- Has thanked: 0
- Been thanked: 1 time
Dreamcast Lighting Engine With Bumpmapping
So I've been working on this for quite some time now, and it has only really taken off in recent days with the acquisition of a BBA: https://github.com/Light-Dark/dclightengine.
This is a very simple dynamic lighting engine for the Dreamcast with bumpmapping (it currently only supports omni-directional diffuse lights). The main lighting routine is written in SH4 assembly for speed, while all calculations for the bumpmapping are done in C with the fmath macros and a fast approximation of atan2 that BlueCrab pointed me to :)! The bumpmapping is done in C because that makes it easier to follow for anyone who may be looking for an example of how to use the feature (although I think my assembly code is decently documented and pretty straightforward). Please tell me if there are any glaring errors in my math, routines, etc.
Re: Dreamcast Lighting Engine With Bumpmapping
Is that atan really fast though? Looks very branchy. I'd try something like this
http://cellperformance.beyond3d.com/art ... n_spu.html
Wiki & tutorials: http://dcemulation.org/?title=Development
Wiki feedback: viewtopic.php?f=29&t=103940
My libgl playground (not for production): https://bitbucket.org/bogglez/libgl15
My lxdream fork (with small fixes): https://bitbucket.org/bogglez/lxdream
- LightDark
- DCEmu Fast Newbie
- Posts: 18
- Joined: Sat Mar 09, 2013 4:39 pm
- Has thanked: 0
- Been thanked: 1 time
Re: Dreamcast Lighting Engine With Bumpmapping
bogglez wrote:Is that atan really fast though? Looks very branchy. I'd try something like this
http://cellperformance.beyond3d.com/art ... n_spu.html

It is supposedly 3-5x faster than the standard atan2 and has good accuracy.
reference: http://www.lists.apple.com/archives/Per ... 00051.html
- BlueCrab
- The Crabby Overlord
- Posts: 5652
- Joined: Mon May 27, 2002 11:31 am
- Location: Sailing the Skies of Arcadia
- Has thanked: 9 times
- Been thanked: 69 times
Re: Dreamcast Lighting Engine With Bumpmapping
A wise man once said that optimizing by hand before seeing if there's a performance problem in the first place is never a good idea.
That, and that SPE code is pretty ugly to start with. It may not have any obvious conditionals in it, but a lot of that is because it is using a bunch of special operations of the SPE through intrinsics. One would probably be hard-pressed to translate that into a nice routine to use on the SH4 without spending inordinate amounts of time on it.
The moral here is that writing code for the SPEs on the Cell Processor is a pain in the butt. I'd know, as I've done it before -- both for classes and for work. I actually got the research position I have based on taking a parallel programming course that focused on the Cell processor when I was an undergrad.
- LightDark
- DCEmu Fast Newbie
- Posts: 18
- Joined: Sat Mar 09, 2013 4:39 pm
- Has thanked: 0
- Been thanked: 1 time
Re: Dreamcast Lighting Engine With Bumpmapping
BlueCrab wrote:A wise man once said that optimizing by hand before seeing if there's a performance problem in the first place is never a good idea.

Well hey, my hand-optimized lighting routine runs at 60fps with 3 active lights! However, once you go above 7 or 8 64x64 bumpmapped quad meshes, the performance takes a nose dive for no apparent reason, down to less than 30 fps.
- BlueCrab
- The Crabby Overlord
- Posts: 5652
- Joined: Mon May 27, 2002 11:31 am
- Location: Sailing the Skies of Arcadia
- Has thanked: 9 times
- Been thanked: 69 times
Re: Dreamcast Lighting Engine With Bumpmapping
BlueCrab wrote:A wise man once said that optimizing by hand before seeing if there's a performance problem in the first place is never a good idea.
LightDark wrote:Well hey, my hand-optimized lighting routine runs at 60fps with 3 active lights! However, once you go above 7 or 8 64x64 bumpmapped quad meshes, the performance takes a nose dive for no apparent reason, down to less than 30 fps.

Well, I should have said over-optimizing. But, at least you know what your performance constraints are. Now, if you choose to do so, you can optimize further to them.
All I was saying is that doing a lot of pre-optimization before even looking at performance results often ends up making programmers overthink things. When we overthink things, we make mistakes or actually often make things slower. It's human nature.
- LightDark
- DCEmu Fast Newbie
- Posts: 18
- Joined: Sat Mar 09, 2013 4:39 pm
- Has thanked: 0
- Been thanked: 1 time
Re: Dreamcast Lighting Engine With Bumpmapping
BlueCrab wrote:Well, I should have said over-optimizing. But, at least you know what your performance constraints are. Now, if you choose to do so, you can optimize further to them.
All I was saying is that doing a lot of pre-optimization before even looking at performance results often ends up making programmers overthink things. When we overthink things, we make mistakes or actually often make things slower. It's human nature.

I agree 100%. I've actually screwed myself over big time by pre-optimizing a tile-loading routine for a NES project I was working on.
Re: Dreamcast Lighting Engine With Bumpmapping
BlueCrab wrote:A wise man once said that optimizing by hand before seeing if there's a performance problem in the first place is never a good idea.
That, and that SPE code is pretty ugly to start with. It may not have any obvious conditionals in it, but a lot of that is because it is using a bunch of special operations of the SPE through intrinsics. One would probably be hard-pressed to translate that into a nice routine to use on the SH4 without spending inordinate amounts of time on it.
The moral here is that writing code for the SPEs on the Cell Processor is a pain in the butt. I'd know, as I've done it before -- both for classes and for work. I actually got the research position I have based on taking a parallel programming course that focused on the Cell processor when I was an undergrad.

I remember that you said something about working on the Cell on IRC.
I just mentioned that because he specifically talked about the atan2, which is probably also very important in his lighting code.
The best performance is achieved by learning some linear algebra and avoiding atan2 altogether anyway.
Wiki & tutorials: http://dcemulation.org/?title=Development
Wiki feedback: viewtopic.php?f=29&t=103940
My libgl playground (not for production): https://bitbucket.org/bogglez/libgl15
My lxdream fork (with small fixes): https://bitbucket.org/bogglez/lxdream
- PH3NOM
- DC Developer
- Posts: 576
- Joined: Fri Jun 18, 2010 9:29 pm
- Has thanked: 0
- Been thanked: 5 times
Re: Dreamcast Lighting Engine With Bumpmapping
BlueCrab wrote:A wise man once said that optimizing by hand before seeing if there's a performance problem in the first place is never a good idea.
LightDark wrote:Well hey, my hand-optimized lighting routine runs at 60fps with 3 active lights! However, once you go above 7 or 8 64x64 bumpmapped quad meshes, the performance takes a nose dive for no apparent reason, down to less than 30 fps.

Hey man, cool stuff!
It is important to know where the time is spent.
Try to measure your CPU time and GPU time independently (by isolating all pvr_* functions).
If you can only light 7 or 8 quads, 8*4 = 32 vertices, with 3 light sources per frame on the CPU before breaking the 60fps budget, you are in trouble.
OpenGL can handle way more than that, even using a more complex lighting formula. So that should really not be your problem here.
Possibly the PVR is choking while performing the blending / bump-mapping on the pixels.
It would be interesting to benchmark PVR GPU time with bump-maps, as I have not done much at all with bump-maps.
I know the PVR begins to struggle when sorting several overlapped transparent polygons. One idea is to try disabling auto-sort on the PVR by setting the initialization flag (thanks to BlueCrab for the recent update to KOS that added such support).
Another thing, from a quick glance at your assembly code: I don't think you need to prefetch the way you are. You should save a cycle simply by not calling that operation.
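A rough sense of scale for the numbers above: 32 vertices with 3 lights is 96 light evaluations per frame, while a 200 MHz SH4 has roughly 3.3 million cycles per 60fps frame, so the lighting math alone is unlikely to be the limit. To check this with measurements, one possible bookkeeping sketch follows. It assumes the caller takes three timestamps per frame, in microseconds, from whatever timer the platform offers: before the lighting math, after it, and after all pvr_* calls have returned. All names here are hypothetical, and time spent inside pvr_* calls is only a proxy for GPU time since it includes waiting.

```c
#include <stdint.h>

#define FRAME_BUDGET_US 16667u /* one frame at 60fps, about 1/60 s */

/* Hypothetical accumulator for per-frame timings. */
typedef struct {
    uint64_t cpu_us; /* total time spent in lighting and other math */
    uint64_t pvr_us; /* total time spent inside pvr_* calls */
    uint32_t frames; /* number of frames recorded */
} frame_profile_t;

/* Record one frame from three timestamps in microseconds. */
void profile_add_frame(frame_profile_t *p, uint64_t t_start,
                       uint64_t t_cpu_done, uint64_t t_pvr_done)
{
    p->cpu_us += t_cpu_done - t_start;
    p->pvr_us += t_pvr_done - t_cpu_done;
    p->frames++;
}

uint32_t profile_avg_cpu_us(const frame_profile_t *p)
{
    return p->frames ? (uint32_t)(p->cpu_us / p->frames) : 0;
}

uint32_t profile_avg_pvr_us(const frame_profile_t *p)
{
    return p->frames ? (uint32_t)(p->pvr_us / p->frames) : 0;
}

/* Nonzero when the average frame no longer fits in a 60fps frame. */
int profile_over_budget(const frame_profile_t *p)
{
    return profile_avg_cpu_us(p) + profile_avg_pvr_us(p) > FRAME_BUDGET_US;
}
```

Comparing the two averages with bumpmapping enabled and disabled would show on which side of the split the slowdown appears.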
- LightDark
- DCEmu Fast Newbie
- Posts: 18
- Joined: Sat Mar 09, 2013 4:39 pm
- Has thanked: 0
- Been thanked: 1 time
Re: Dreamcast Lighting Engine With Bumpmapping
BlueCrab wrote:A wise man once said that optimizing by hand before seeing if there's a performance problem in the first place is never a good idea.
LightDark wrote:Well hey, my hand-optimized lighting routine runs at 60fps with 3 active lights! However, once you go above 7 or 8 64x64 bumpmapped quad meshes, the performance takes a nose dive for no apparent reason, down to less than 30 fps.
PH3NOM wrote:Hey man, cool stuff!
It is important to know where the time is spent.
Try to measure your CPU time and GPU time independently (by isolating all pvr_* functions).
If you can only light 7 or 8 quads, 8*4 = 32 vertices, with 3 light sources per frame on the CPU before breaking the 60fps budget, you are in trouble.
OpenGL can handle way more than that, even using a more complex lighting formula. So that should really not be your problem here.
Possibly the PVR is choking while performing the blending / bump-mapping on the pixels.
It would be interesting to benchmark PVR GPU time with bump-maps, as I have not done much at all with bump-maps.
I know the PVR begins to struggle when sorting several overlapped transparent polygons. One idea is to try disabling auto-sort on the PVR by setting the initialization flag (thanks to BlueCrab for the recent update to KOS that added such support).
Another thing, from a quick glance at your assembly code: I don't think you need to prefetch the way you are. You should save a cycle simply by not calling that operation.

Thank you for your support and critiques! I'll update the repo asap.
I think it is the PVR struggling with the bump maps, because when I disable bump mapping altogether it's right back to a clean 60fps with a screen full of lit 64*64 quads. I've removed the math from the bumpmapping routine and it still takes a nose dive when just rendering the maps. I'll give disabling autosort the old college try.
- LightDark
- DCEmu Fast Newbie
- Posts: 18
- Joined: Sat Mar 09, 2013 4:39 pm
- Has thanked: 0
- Been thanked: 1 time
Re: Dreamcast Lighting Engine With Bumpmapping
Can anyone out there with the means and enough knowledge about bumpmapping write up a quick test program to check the hypothesis that the PVR chokes when more than 7 or 8 bumpmaps are drawn in a scene? That way we can determine whether it's a hardware limitation or terrible code on my part.
- Jae686
- Insane DCEmu
- Posts: 112
- Joined: Sat Sep 22, 2007 9:43 pm
- Location: Braga - Portugal
- Has thanked: 0
- Been thanked: 0
Re: Dreamcast Lighting Engine With Bumpmapping
Hi there. I'm trying to build it on KOS, but I get the following error:
k++ ?
Code: Select all
jaerder@jaerder-G750JS ~/development/code_to_study/dclightengine/test $ source ../../../Tools/dreamcast/environ.sh
jaerder@jaerder-G750JS ~/development/code_to_study/dclightengine/test $ make
rm -f Game/1ST_READ.bin
rm -f Game/main.elf
rm -f romdisk.*
/home/jaerder/development/Tools/dreamcast/kallistios/utils/genromfs/genromfs -f romdisk.img -d romdisk -v
0 rom 550dc8bc [0xffffffff, 0xffffffff] 37777777777, sz 0, at 0x0
1 . [0x806 , 0x4c13b2 ] 0040755, sz 0, at 0x20
1 .. [0x806 , 0x4c13a0 ] 0040755, sz 0, at 0x40 [link to 0x20 ]
1 text.raw [0x806 , 0x4c13b4 ] 0100644, sz 264208, at 0x60
1 bumpmap.raw [0x806 , 0x4c13b3 ] 0100644, sz 264208, at 0x40890
/home/jaerder/development/Tools/dreamcast/kallistios/utils/bin2o/bin2o romdisk.img romdisk romdisk.o
/home/jaerder/development/Tools/dreamcast/sh-elf/bin/sh-elf-gcc -fomit-frame-pointer -ml -m4-single-only -ffunction-sections -fdata-sections -I/home/jaerder/development/Tools/dreamcast/kallistios/../kos-ports/include -I/home/jaerder/development/Tools/dreamcast/kallistios/include -I/home/jaerder/development/Tools/dreamcast/kallistios/kernel/arch/dreamcast/include -I/home/jaerder/development/Tools/dreamcast/kallistios/addons/include -D_arch_dreamcast -D_arch_sub_pristine -Wall -g -fno-builtin -fno-strict-aliasing -O3 -O2 -ml -m4-single-only -Wl,-Ttext=0x8c010000 -Wl,--gc-sections -T/home/jaerder/development/Tools/dreamcast/kallistios/utils/ldscripts/shlelf.xc -nodefaultlibs -L/home/jaerder/development/Tools/dreamcast/kallistios/lib/dreamcast -L/home/jaerder/development/Tools/dreamcast/kallistios/addons/lib/dreamcast -o Game/main.elf light_t.o main_t.o romdisk.o -lkosutils -loggvorbisplay -lpng -lz -lk++ -lstdc++ -lm -Wl,--start-group -lkallisti -lc -lgcc -Wl,--end-group
/home/jaerder/development/Tools/dreamcast/sh-elf/lib/gcc/sh-elf/4.7.3/../../../../sh-elf/bin/ld: cannot find -lk++
collect2: error: ld returned 1 exit status
make: *** [Game/main.elf] Error 1
jaerder@jaerder-G750JS ~/development/code_to_study/dclightengine/test $
Best Regards.
- BlueCrab
- The Crabby Overlord
- Posts: 5652
- Joined: Mon May 27, 2002 11:31 am
- Location: Sailing the Skies of Arcadia
- Has thanked: 9 times
- Been thanked: 69 times
Re: Dreamcast Lighting Engine With Bumpmapping
Jae686 wrote:Hi there. I'm trying to build it on KOS, but I get the following error:
k++ ?
Code: Select all
[...]
/home/jaerder/development/Tools/dreamcast/sh-elf/lib/gcc/sh-elf/4.7.3/../../../../sh-elf/bin/ld: cannot find -lk++
collect2: error: ld returned 1 exit status
make: *** [Game/main.elf] Error 1
Best Regards.

Nothing should be using libk++ anymore... It was removed from the KOS tree quite a while ago...
If it's in the Makefile, just remove it from the Makefile. Otherwise, something might be wrong in your environ script or something...
- LightDark
- DCEmu Fast Newbie
- Posts: 18
- Joined: Sat Mar 09, 2013 4:39 pm
- Has thanked: 0
- Been thanked: 1 time
Re: Dreamcast Lighting Engine With Bumpmapping
BlueCrab wrote:Nothing should be using libk++ anymore... It was removed from the KOS tree quite a while ago...
If it's in the Makefile, just remove it from the Makefile. Otherwise, something might be wrong in your environ script or something...

Whoops, my bad. I'll remove it from the Makefile ASAP.