HDR rendering on Dreamcast

Twada
DC Developer
Posts: 42
Joined: Wed Jan 20, 2016 4:55 am
Has thanked: 18 times
Been thanked: 53 times

HDR rendering on Dreamcast

Post by Twada »

Hello, I'm Tashi. I can't change my username...
This is something I had previously given up on. Thanks to Ian Robinson, Protofall, TapamN and others!
Attachment: bloom04.rar (1.05 MiB)
Based on:"Practical Implementation of SH Lighting and HDR Rendering on PlayStation 2" GDC 2005
(http://research.tri-ace.com/)

Features
  • All processing is done on the PVR.
  • It runs at a maximum of 30 fps.
hdr01.png
Creates a 128 * 96 bloom texture in a single render using a 512 * 128 work buffer.

The PVR renders in 32 * 32 tiles, which are arranged in order to build up a frame. The drawing order is fixed at initialization time.
KOS draws tiles from the left side of the frame to the right side. A region on the left that has already been drawn can therefore be used as the texture for a polygon further to the right.
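
The left-to-right dependency can be sketched in plain C. This is only an illustration: it assumes the 512 * 128 work buffer is split into 32 * 32 tiles and that whole columns are finished one after another. The order inside a column and all names (`tile_at`, `draw_index`, `column_ready`) are hypothetical and not taken from KOS or from the sample.

```c
#include <stdbool.h>

#define TILE_SIZE 32
#define BUF_W     512
#define BUF_H     128
#define TILES_X   (BUF_W / TILE_SIZE)   /* 16 tile columns */
#define TILES_Y   (BUF_H / TILE_SIZE)   /* 4 tile rows */

typedef struct { int tx, ty; } tile_t;

/* The i-th tile in draw order, assuming column-by-column (left to right) order. */
static tile_t tile_at(int i) {
    tile_t t = { i / TILES_Y, i % TILES_Y };
    return t;
}

/* Inverse of tile_at(): position of a tile in the assumed draw order. */
static int draw_index(tile_t t) {
    return t.tx * TILES_Y + t.ty;
}

/* A source region ending at pixel column src_x_end (exclusive) is safe to
 * sample from tile column dst_tx only if every tile column it touches lies
 * strictly to the left of dst_tx, i.e. has already been drawn. */
static bool column_ready(int src_x_end, int dst_tx) {
    int last_src_tx = (src_x_end - 1) / TILE_SIZE;
    return last_src_tx < dst_tx;
}
```

Under these assumptions, a 128-pixel-wide source on the far left can feed any polygon that starts at tile column 4 or later, but not one that starts at column 3.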
hdr02.png
1. Render the frame into a 640 * 480 texture.

2. Extract the high-brightness part from the frame.
Reduce the frame to 128 * 96 and extract the pixels in the RGB 128-255 range by inversion / addition / inversion. Then add the resulting texture to itself to double the color, which stretches the extracted range back out to RGB 0-255.

3. Apply Gaussian blur.
Move to the right in the work buffer and use the previously drawn texture as the source. Draw it with bilinear filtering, shifted by 1 pixel, and blend it semi-transparently to obtain a 3 * 3 Gaussian blur.
Reduce the result and apply the 3 * 3 blur again, repeating this to create three textures of different sizes.

4. Completion of bloom texture.
Scale the three textures to 128 * 96 and combine them additively. The result is used as the bloom texture.

5. Tone mapping and final rendering
The color of the first 640 * 480 frame texture is doubled by additive blending, so RGB (0-127) becomes RGB (0-255).
The bloom texture is then added on top, and the frame is complete.
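
The per-channel arithmetic behind steps 2 and 5 can be sketched on the CPU as follows. This is a model of the blending math only, not the PVR code from the sample: it assumes saturating 8-bit additive blending and assumes the constant added between the two inversions is 128 (the post gives the 128-255 range but not the constant). The function names are hypothetical.

```c
#include <stdint.h>

/* Additive blending that clamps at 255, as a model of saturating blending. */
static uint8_t sat_add(uint8_t a, uint8_t b) {
    unsigned sum = (unsigned)a + (unsigned)b;
    return (uint8_t)(sum > 255u ? 255u : sum);
}

static uint8_t invert(uint8_t c) {
    return (uint8_t)(255u - c);
}

/* Step 2: invert, add, invert behaves like a clamped subtraction,
 * max(c - 128, 0). Doubling then stretches 0-127 back to roughly 0-255. */
static uint8_t bright_pass(uint8_t c) {
    uint8_t above = invert(sat_add(invert(c), 128));
    return sat_add(above, above);
}

/* Step 5: double the scene color, then add the bloom value on top. */
static uint8_t tone_map(uint8_t scene, uint8_t bloom) {
    return sat_add(sat_add(scene, scene), bloom);
}
```

For example, a channel value of 200 passes the bright pass as 144, while anything at or below 128 comes out as 0. Scene values above 127 saturate in step 5, which is why the scene has to be rendered at half brightness for the doubling to act as tone mapping.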

Tips
  • The PVR is not good at translucency. Cover as few tiles as possible!
  • Use user clips so that polygons do not straddle tiles!
  • Since the work buffer is not a twiddled texture, avoid bilinear filtering as much as possible!

Please, find a better way!
Last edited by Twada on Tue Jan 19, 2021 8:12 am, edited 1 time in total.
Ian Robinson
DC Developer
Posts: 114
Joined: Mon Mar 11, 2019 7:12 am
Has thanked: 206 times
Been thanked: 41 times

Re: HDR rendering on Dreamcast

Post by Ian Robinson »

Hi Tashi.
Great to see you posting here. I'm Ian Micheal, but my username here is Ian Robinson.

Re: HDR rendering on Dreamcast

Post by Twada »

Sorry! I have corrected the post. Thank you for reading.

It's better than before. :)
Protofall
DCEmu Freak
Posts: 78
Joined: Sun Jan 14, 2018 8:03 pm
Location: Emu land
Has thanked: 21 times
Been thanked: 18 times

Re: HDR rendering on Dreamcast

Post by Protofall »

Nice stuff, although I don't think I was involved in helping with this. Maybe you mean someone else?
Moving Day: A clone of Dr Mario with 8-player support <https://dcemulation.org/phpBB/viewtopic ... 4&t=105389>
A recreation of Minesweeper for the Dreamcast <viewtopic.php?f=34&t=104820>

Twitter <https://twitter.com/ProfessorToffal>
YouTube (Not much there, but there are a few things) <https://www.youtube.com/user/TrueMenfa>

Re: HDR rendering on Dreamcast

Post by Twada »

I got a hint about alpha blending in the previous topic "How do I subtract with alpha blending?"
(https://dcemulation.org/phpBB/viewtopic ... 9&t=105628).

Thanks to that, all the processing is completed on PVR.

Re: HDR rendering on Dreamcast

Post by Protofall »

Twada wrote: Wed Jan 20, 2021 1:28 am I got a hint about alpha blending in the previous topic "How do I subtract with alpha blending?"
(https://dcemulation.org/phpBB/viewtopic ... 9&t=105628).

Thanks to that, all the processing is completed on PVR.
Ah yes, that's right
GyroVorbis
Elysian Shadows Developer
Posts: 1873
Joined: Mon Mar 22, 2004 4:55 pm
Location: #%^&*!!!11one Super Sonic
Has thanked: 79 times
Been thanked: 61 times

Re: HDR rendering on Dreamcast

Post by GyroVorbis »

Hello, Twada! This is one of the most impressive things I've ever seen! I had always meant to get around to trying bloom on DC using the paper published in GPU Gems, taken from Tron for the Xbox, which uses a somewhat similar method... I didn't even think HDR was doable...

Anyway, I've downloaded the source and have it up and running in the source tree on the latest version of KOS with GCC 13.2.1, with just a teensy rendering issue I need to figure out (tile binning artifacts in only one spot)...

...and I was wondering if there was any chance in hell we could have your blessing to include this as a built-in example with KOS under kos/examples/dreamcast/pvr/hdr_bloom? We would be honored. I want this fine work not only preserved beyond these forums but also able to be seen and leveraged by the rest of the scene, so that they can see what is possible on the platform and have an example of how to do it.

Re: HDR rendering on Dreamcast

Post by Twada »

I am glad you are interested in HDR!
The frame buffer on the Dreamcast is 16-bit, so reusing the frame buffer would result in a noticeable dither pattern.
So we take the roundabout route of rendering the reduced scene and the final scene separately. This is similar to the real-time glow technique in GPU Gems. As a by-product, it saves fill rate.

If you could add it to the KOS sample, I would be honoured! However, there is a lot to tweak!
If we were to twiddle the reduced buffer, it would work much faster. In that case, the tile matrix would have to be uniquely aligned (e.g. horizontally) and the KOS PVR would have to be hijacked, so it might not be suitable as a sample.
If that is OK with you, please give me some time. Also, where do the artefacts appear?
GyroVorbis wrote: Fri Jan 12, 2024 4:41 pm Anyway, I've downloaded the source and have it up and running on the latest version of KOS with the latest version of GCC13.2.1, in the source tree with just a teensy rendering issue I need to figure out (tile binning artifacts in only one spot)...
When I pull KOS and try to rebuild, I get an error.

Code: Select all

/opt/toolchains/dc/kos/kernel/thread/thread.c:338: undefined reference to `___builtin_set_thread_pointer'
Rebuild KOS. Give me time...

Re: HDR rendering on Dreamcast

Post by GyroVorbis »

Give me just a sec to clean up and upload the slight modifications I've made for the latest KOS. A few trivial things like the way we handle romdisks changed in the Makefiles.

In the meantime:
Twada wrote: Sun Jan 14, 2024 4:29 am When I pull KOS and try to rebuild, I get an error.

Code: Select all

/opt/toolchains/dc/kos/kernel/thread/thread.c:338: undefined reference to `___builtin_set_thread_pointer'
Rebuild KOS. Give me time...
Once we added support for C and C++ thread-local storage (TLS), we started requiring the toolchain to be rebuilt with TLS enabled... It's my fault for not giving a better error message to tell you to rebuild the toolchain. I've promised to do better in the future if the toolchain needs to be updated again.

But yeah, do this:

Code: Select all

cd /opt/toolchains/dc/kos/utils/dc-chain
cp config/config.mk.stable.sample config.mk
Now open up the config file that got copied to /opt/toolchains/dc/kos/utils/dc-chain/config.mk. The config file lets you customize any extra settings you may want and enable optional features on the toolchain. At the very least, make sure to set the "makejobs" variable based on your number of processor threads, so the build doesn't take a year. When you're done customizing, save. Finally run:

Code: Select all

make
And everything should get built for you with GCC 13.2.0.

EDIT: Oh yeah, if you're that out-of-date, you're going to want to replace your environ.sh file with the file located in kos/doc/environ.sh.sample. We added a bunch of new flags and ways to customize things.

Re: HDR rendering on Dreamcast

Post by GyroVorbis »

WHOOOO!!! SHE LIVES!!! Just got it 100% working on the latest everything.

https://cdn.discordapp.com/attachments/ ... f456c6a18&

OKAY, I did have to make a few changes, plus I cleaned some stuff up for you and got it into the KOS examples tree. This directory is /kos/examples/dreamcast/pvr/bloom_hdr for me (if you don't like the name, we can change it):
Attachment: bloom_hdr.zip (1.22 MiB)
What changed:
1) Fixed all the warnings
2) Changed the Makefile to support the new romdisk mechanism plus to automatically clean/rebuild the static lib in the "poly" subdirectory
3) Not only did I have to change the PVR_BINSIZEs in pvr_init_params_t to be 32 in main.c, but I also had to allocate an overflow bin, which is functionality that was just recently added to KOS... Tbh, I'm not sure how this could've ever worked in old KOS?
4) Had to add two debug printf() + fflush() statements in tex_load() in game.c at ~line 129. Without these, the ELF just aborts on me without an error message... I have no idea how or why this fixes it, but it does. I'm very much not okay with this. About to hook it up to the debugger and do some sleuthing... Never seen anything like it with KOS, but I will get on looking into it, so we can get it resolved. Just wanted to make you aware of the issue.

I built with just -O3 WITHOUT LTO or fast math being enabled. I get the same abort with LTO enabled. This is something I'm going to have to help look into too, so we can leverage all the performance gains we can get here...

EDIT: By the way, I'm a graphics programmer by trade who has implemented HDR and bloom in my own engine... only it's on PC with modern OpenGL shaders and with the programmable pipeline.... Since joining team KOS, my work has mostly been low-level OS driver-y things, because I've been passionate about working in that area... However my goal was always to learn the PVR and get really good in that area too. Currently studying the code and the paper linked to so that I am not useless in this area and can help! :mrgreen:

Re: HDR rendering on Dreamcast

Post by GyroVorbis »

Twada wrote: Sun Jan 14, 2024 4:29 am If we were to twiddle the reduced buffer, it would work much faster. In that case, the tile matrix would have to be uniquely aligned (e.g. horizontally) and the KOS PVR would have to be hijacked, so it might not be suitable as a sample.
If that is OK with you, please give me some time.
If you have any ideas on how to make this thing faster and better, let's do it! This is one of the most impressive things I have EVER seen on DC, and it has always been a plan of mine to work on bloom for the platform... I think it's worth every bit of investment to show off what the DC is actually capable of and to have as an example for others to learn from and leverage as well.

As for hijacking the KOS PVR API... DO IT!!! If you have to go around KOS's PVR API to do this efficiently, then the API should be extended to allow for such behavior, in my humble opinion.
TapamN
DC Developer
Posts: 104
Joined: Sun Oct 04, 2009 11:13 am
Has thanked: 2 times
Been thanked: 88 times

Re: HDR rendering on Dreamcast

Post by TapamN »

Twada wrote: Sun Jan 14, 2024 4:29 am If we were to twiddle the reduced buffer, it would work much faster. In that case, the tile matrix would have to be uniquely aligned (e.g. horizontally) and the KOS PVR would have to be hijacked, so it might not be suitable as a sample.
Yes, I've done 60 FPS bloom on the DC, and twiddling the bloom texture was worth it. I was using a custom PVR driver that supported render-to-texture on any texture size, and wasn't limited to framebuffer size like KOS's driver, which saved a lot of fillrate.

Why do you need a horizontal tile matrix to twiddle the texture? The way I twiddled it was by taking advantage of the fact that ConvertToTwiddled(x) is equivalent to Untwiddle(Untwiddle(Untwiddle(x))). So I had the PVR take the untwiddled texture, have the PVR read it as a twiddled texture, then write it to memory, then repeat this two more times. Does a horizontal TM make a better way available?
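
The identity ConvertToTwiddled(x) = Untwiddle(Untwiddle(Untwiddle(x))) can be checked on the CPU with a small sketch. It assumes a square 128 * 128 texture and a twiddled layout with the y bits in the even positions and the x bits in the odd positions of the index. The function names are hypothetical, and the loops only model what the PVR would do in hardware when it reads a texture as twiddled and writes it out linearly. The check covers this size only; other sizes may need a different number of passes.

```c
#include <stdint.h>
#include <string.h>

#define N 128   /* texture is N * N texels, index needs 14 bits */

/* Spread the low 7 bits of v into the even bit positions. */
static uint32_t spread_bits(uint32_t v) {
    uint32_t r = 0;
    for (int i = 0; i < 7; i++)
        r |= ((v >> i) & 1u) << (2 * i);
    return r;
}

/* Assumed twiddled index: y bits in even positions, x bits in odd ones. */
static uint32_t twid(uint32_t x, uint32_t y) {
    return spread_bits(y) | (spread_bits(x) << 1);
}

/* One "untwiddle" pass: read src as if twiddled, write dst linearly. */
static void untwiddle_pass(uint16_t *dst, const uint16_t *src) {
    for (uint32_t y = 0; y < N; y++)
        for (uint32_t x = 0; x < N; x++)
            dst[y * N + x] = src[twid(x, y)];
}

/* Reference conversion: linear source to twiddled destination. */
static void convert_to_twiddled(uint16_t *dst, const uint16_t *src) {
    for (uint32_t y = 0; y < N; y++)
        for (uint32_t x = 0; x < N; x++)
            dst[twid(x, y)] = src[y * N + x];
}

/* Returns 1 if three untwiddle passes equal one direct conversion. */
static int three_passes_match(void) {
    static uint16_t a[N * N], b[N * N], c[N * N], ref[N * N];
    for (uint32_t i = 0; i < N * N; i++)
        a[i] = (uint16_t)i;          /* unique value per texel */
    untwiddle_pass(b, a);
    untwiddle_pass(c, b);
    untwiddle_pass(b, c);
    convert_to_twiddled(ref, a);
    return memcmp(b, ref, sizeof ref) == 0;
}
```

With this bit order, one pass permutes the 14 index bits in a cycle of length four, so three passes undo one, which is the same as applying the inverse mapping.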

Re: HDR rendering on Dreamcast

Post by Twada »

TapamN wrote: Sun Jan 28, 2024 5:22 pm Yes, I've done 60 FPS bloom on the DC, and twiddling the bloom texture was worth it. I was using a custom PVR driver that supported render-to-texture on any texture size, and wasn't limited to framebuffer size like KOS's driver, which saved a lot of fillrate.
Awesome! Your post process is incredible!
I'm very interested in how you implement it. In particular, how does DOF compare depth?
Screenshot 2024-01-29 12-02-21.png
It's still a work in progress, but my HDR sample renders a bloom texture like this.
TapamN wrote: Sun Jan 28, 2024 5:22 pm Why do you need a horizontal tile matrix to twiddle the texture? The way I twiddled it was by taking advantage of the fact that ConvertToTwiddled(x) is equivalent to Untwiddle(Untwiddle(Untwiddle(x))). So I had the PVR take the untwiddled texture, have the PVR read it as a twiddled texture, then write it to memory, then repeat this two more times. Does a horizontal TM make a better way available?
After seeing your post, I think I'm using the same method to twiddle.
The texture is 512*256, and the tile matrix is set so that the top half can be rendered vertically and the bottom half horizontally.
The 128*128 texture in the upper right is Untwiddled three times in the lower half.
I think this is a good way to generate textures without delay in one rendering, but is there a better way?

Re: HDR rendering on Dreamcast

Post by GyroVorbis »

TapamN wrote: Sun Jan 28, 2024 5:22 pm
Twada wrote: Sun Jan 14, 2024 4:29 am If we were to twiddle the reduced buffer, it would work much faster. In that case, the tile matrix would have to be uniquely aligned (e.g. horizontally) and the KOS PVR would have to be hijacked, so it might not be suitable as a sample.
Yes, I've done 60 FPS bloom on the DC, and twiddling the bloom texture was worth it. I was using a custom PVR driver that supported render-to-texture on any texture size, and wasn't limited to framebuffer size like KOS's driver, which saved a lot of fillrate.
Hell yeah, man! I've never seen this work from you. It looks amazing! :o

By the way, have you ever made the modifications that you had to make for KOS's render-to-texture for arbitrary sizes available? That's something that has been on my todo list to add support for ever since I went through and brushed up on the PVR API. Seems like a pretty bad limitation, and we already have people on Discord saying it's an issue for them.

I was going to try to see if I could track down any homebrew code that successfully bypassed that, but I don't know of any off the top of my head. If that approach failed, I was going to see if I could reverse engineer how it's supposed to work by looking at something like Flycast. :lol:

Re: HDR rendering on Dreamcast

Post by Twada »

GyroVorbis wrote: Sun Jan 14, 2024 3:45 pm
Twada wrote: Sun Jan 14, 2024 4:29 am If we were to twiddle the reduced buffer, it would work much faster. In that case, the tile matrix would have to be uniquely aligned (e.g. horizontally) and the KOS PVR would have to be hijacked, so it might not be suitable as a sample.
If that is OK with you, please give me some time.
If you have any ideas on how to make this thing faster and better, lets do it! This is one of the most impressive things I have EVER seen on DC, and it has always been a plan of mine to work on bloom for the platform... I think it's worth every bit of investment to show off what the DC is actually capable of and have as an example for others to learn and leverage as well.

As for highjacking the KOS PVR API... DO IT!!! If you have to go around KOS's PVR API to do this efficiently, then the API should be extended to allow for such behavior, in my humble opinion.
Sorry for the delay. The HDR sample is complete.
I intended to focus on ease of understanding, but it may have had the opposite effect. Please let me know if anything needs to be changed.
Attachment: bloom_hdr.rar (143.21 KiB)
I sped it up by twiddling the textures. Now that I think about it, the previous sample might have been running at 15 fps.
For speed reasons, I changed from Gaussian blur to box blur. You can check the texture generation by using the commented out lines 397-398.
This is multi-pass rendering.
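
To relate the box blur used here to the 3 * 3 Gaussian blur from the first post: a bilinear sample taken half a texel off-center averages a 2 * 2 block, and applying two such 2 * 2 box passes one after the other gives the 3 * 3 kernel with weights 1-2-1 / 2-4-2 / 1-2-1 over 16. Whether the original sample built its Gaussian exactly this way is an assumption; the sketch below only verifies the kernel arithmetic, and its function names are hypothetical.

```c
/* Convolve two 2 * 2 all-ones box kernels. The result holds the integer
 * weights of the combined 3 * 3 kernel (divide by kernel_sum to normalize). */
static void box2_twice(int out[3][3]) {
    for (int r = 0; r < 3; r++)
        for (int c = 0; c < 3; c++)
            out[r][c] = 0;

    for (int ay = 0; ay < 2; ay++)          /* tap of the first box pass  */
        for (int ax = 0; ax < 2; ax++)
            for (int by = 0; by < 2; by++)  /* tap of the second box pass */
                for (int bx = 0; bx < 2; bx++)
                    out[ay + by][ax + bx] += 1;
}

/* Sum of all weights, used as the normalization divisor. */
static int kernel_sum(int k[3][3]) {
    int sum = 0;
    for (int r = 0; r < 3; r++)
        for (int c = 0; c < 3; c++)
            sum += k[r][c];
    return sum;
}
```

A single box pass keeps the flat 2 * 2 weights, which is cheaper on fill rate but gives a less smooth falloff than the two-pass kernel.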

vertex32.c sets the flag member of the structure pvr_vertex_t to w, assuming that it is used as an index. Please be careful if you reuse it.
matrix_identity2() sets up a shifted identity matrix. With the ftrv instruction, this converts (x, y, z, w) to (w, x, y, z).
Since w is computed first, its reciprocal can be taken right away to produce (w, x/w, y/w, z/w). This is done by the vector4_transform2() function. I'm hoping that keeping w at the front will make clipping faster.
I think it's a good method because paired single-precision data transfers can also be used without rearranging xyz, but if you have any suggestions for improvement, please let me know.
Also, for some reason the fipr instruction didn't work properly, so I used explicit multiplications and additions instead.
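
The effect of the shifted identity matrix can be illustrated in plain C. This is a sketch of the math only, not the code from vertex32.c: the matrix-vector product stands in for what ftrv does with XMTRX, the array layout is ordinary C rather than the SH-4 register layout, and the names `mat4_mul_vec4`, `shifted_identity` and `transform_w_first` are hypothetical.

```c
/* out = m * v, emulating the matrix-vector product in plain C. */
static void mat4_mul_vec4(const float m[4][4], const float v[4], float out[4]) {
    for (int r = 0; r < 4; r++)
        out[r] = m[r][0] * v[0] + m[r][1] * v[1]
               + m[r][2] * v[2] + m[r][3] * v[3];
}

/* Identity matrix with its rows rotated by one:
 * maps (x, y, z, w) to (w, x, y, z). */
static const float shifted_identity[4][4] = {
    { 0.0f, 0.0f, 0.0f, 1.0f },
    { 1.0f, 0.0f, 0.0f, 0.0f },
    { 0.0f, 1.0f, 0.0f, 0.0f },
    { 0.0f, 0.0f, 1.0f, 0.0f },
};

/* Transform, then divide by w: result is (w, x/w, y/w, z/w). */
static void transform_w_first(const float v[4], float out[4]) {
    float t[4];
    mat4_mul_vec4(shifted_identity, v, t);   /* t = (w, x, y, z) */
    float inv_w = 1.0f / t[0];               /* w is the first component */
    out[0] = t[0];
    out[1] = t[1] * inv_w;
    out[2] = t[2] * inv_w;
    out[3] = t[3] * inv_w;
}
```

In a real transform the same row rotation would presumably be applied to the full model-view-projection matrix rather than to the identity, so that w still comes out in the first component.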

Re: HDR rendering on Dreamcast

Post by GyroVorbis »

Holy fucking shit, this is absolutely INCREDIBLE, man!!! I can't believe how far you got that optimized! I'm pretty sure i just measured 47fps at WORST with the effect enabled, and up to 177fps with it off? What the hell!??! I'm studying the hell out of this code right now!

Such amazing, epic work, man! Also thanks so much TapamN for those tips and for getting in on this endeavor with us!

I've been in the process of making sure all of our ASM routines in KOS, KGL, and GLdc work with -m4-single and not just -m4-single-only, because a lot of new homebrew stuff is breaking without true double-precision floating point, and I've found that if you aren't actually using doubles in a critical path, there's not really a performance hit... You only get hit if you actually use doubles...

Anyway, apparently the register passing order is different between -m4-single and -m4-single-only. We now have this file thanks to zcrc to help fix that: https://github.com/KallistiOS/KallistiO ... /args.h#L3

This file provides macro wrapper utilities around your registers when writing inline ASM so that they resolve to the correct register index with both -m4-single and -m4-single-only, as seen here: https://github.com/KallistiOS/KallistiO ... ase.h#L137