Search found 108 matches

by TapamN
Tue Nov 22, 2022 4:43 am
Forum: Programming Discussion
Topic: Using /pc/ filesystem with DreamSDK?
Replies: 6
Views: 459

Re: Using /pc/ filesystem with DreamSDK?

The -c option is just for chroot, to remap the root of /pc to something besides /. /pc should just work as long as you aren't passing dcload the -n option to disable it. I've used dcload-ip to access files on both Windows and Linux, without using the -c option. If you're having problems, you can use...
by TapamN
Sat Jul 16, 2022 6:09 am
Forum: Programming Discussion
Topic: Bump Mapping sample
Replies: 13
Views: 2985

Re: Bump Mapping sample

Maybe this is a stupid question, but would applying an empty or special normal map to this let you force something similar to Phong shading (lighting calculated per pixel across the entire surface, instead of Gouraud)? I've worked a bit on trying to get something like that workin...
by TapamN
Wed Jan 05, 2022 5:17 pm
Forum: Programming Discussion
Topic: Streaming Music Playback CPU Usage
Replies: 6
Views: 2292

Re: Streaming Music Playback CPU Usage

I get an internal server error when I try to embed the code in this post. I'll put it on Pastebin. This is ripped out of my DMA testing code; I haven't tested the code to see if it works by itself, but I think everything important is in there to get DMA running. I can't really list everything ...
by TapamN
Tue Dec 14, 2021 9:27 pm
Forum: Programming Discussion
Topic: Streaming Music Playback CPU Usage
Replies: 6
Views: 2292

Streaming Music Playback CPU Usage

tl;dr: Sound decompression in KOS is much more CPU intensive than it needs to be (it uses up to 60% of the CPU). I think it can be lowered to less than 15%. When I was working on the sound for my version of Gens4All, I looked at the code for KOS's MP3 and Vorbis streamers to see how to push my own sampl...
by TapamN
Tue Dec 14, 2021 3:20 pm
Forum: Programming Discussion
Topic: Race condition in KOS timer_??_gettime*
Replies: 10
Views: 1863

Re: Race condition in KOS timer_??_gettime*

My one question about them is why do 'used=timer_count(TMU2)' in the overflow conditional branch? Was this just a redundancy or is it expected that checking the TCR for overflow would change the value? If it's a matter of timing, why not check overflow first then always set used directly after? I t...
by TapamN
Mon Sep 27, 2021 5:14 am
Forum: Programming Discussion
Topic: Race condition in KOS timer_??_gettime*
Replies: 10
Views: 1863

Race condition in KOS timer_??_gettime*

(I had a lot of trouble getting this post to work. I would get server errors when I tried to preview the post if I had too many/too large {code} sections. I ended up attaching the code.) I've been using timer_us_gettime64 for profiling for years, and I've noticed the occasional hiccup with results f...
by TapamN
Fri Aug 06, 2021 3:44 pm
Forum: Programming Discussion
Topic: DreamHAL - Dreamcast Hardware Abstraction Layer
Replies: 21
Views: 5409

Re: DreamHAL - Dreamcast Hardware Abstraction Layer

Not totally sure, but it seems like something is wrong with KOS's math headers which prevents GCC from generating the builtin instructions. It seems to always generate a function call. GCC can emit the FSQRT, FSRRA, and FSCA instructions if you use the compiler options "-ffast-math -ffp-contrac...
by TapamN
Fri Aug 06, 2021 3:59 am
Forum: Programming Discussion
Topic: DreamHAL - Dreamcast Hardware Abstraction Layer
Replies: 21
Views: 5409

Re: DreamHAL - Dreamcast Hardware Abstraction Layer

That should have lower latency than an FSQRT instruction. FSQRT's latency is 12 cycles, while FSRRA has a latency of 6 cycles and FMUL has a latency of 3 cycles. So your approximate square root (6 + 3 = 9 cycles) would be 3 cycles faster than FSQRT.
by TapamN
Fri Jul 30, 2021 2:07 am
Forum: Programming Discussion
Topic: SH4 Superscalar
Replies: 2
Views: 694

Re: SH4 Superscalar

If two instructions can't be executed in parallel, it runs the first instruction this cycle, then the next cycle it tries to run the second and third instructions in parallel.
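
The issue rule described above can be sketched as a toy cycle counter. This is only a model of the "pair if possible, otherwise issue one and retry from the next instruction" behaviour. The `pairs_with_next` flags are made-up input, not the real SH4 grouping rules, and `count_issue_cycles` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of in-order dual issue. pairs_with_next[i] is nonzero when
 * instruction i can issue in the same cycle as instruction i + 1.
 * The flags are illustrative input, not real SH4 pairing rules. */
static size_t count_issue_cycles(const int *pairs_with_next, size_t n)
{
    size_t cycles = 0;
    size_t i = 0;

    assert(pairs_with_next != NULL || n == 0);
    while (i < n) {
        if (i + 1 < n && pairs_with_next[i])
            i += 2;   /* both instructions issue this cycle */
        else
            i += 1;   /* only the first issues; pairing restarts at the next one */
        cycles++;
    }
    return cycles;
}
```

With three instructions where the first cannot pair with the second but the second can pair with the third, the model reports two cycles: instruction 1 alone, then instructions 2 and 3 together.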
by TapamN
Fri Jul 30, 2021 2:00 am
Forum: Programming Discussion
Topic: SH4 Contention between IF and MA
Replies: 5
Views: 926

Re: SH4 Contention between IF and MA

Those instruction/data access issues only happen on SuperHs with unified caches, like the SH-2 and SH-3. All SuperHs read instructions by fetching a long worth of instructions at a time (2 instructions). This fetch is always aligned to a long boundary. On the SuperH 1 through 3, what this works out t...
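
To make the aligned fetch concrete, here is a small sketch of the address arithmetic only: each 16-bit instruction lives inside a 32-bit fetch whose address has the low two bits cleared. The function names are hypothetical, and this models nothing about the cache or bus timing.

```c
#include <assert.h>
#include <stdint.h>

/* Address of the long-aligned 32-bit fetch that contains the
 * 16-bit instruction at pc. */
static uint32_t fetch_address(uint32_t pc)
{
    assert((pc & 1u) == 0);   /* SuperH instructions are 2-byte aligned */
    return pc & ~UINT32_C(3);
}

/* Nonzero when the instruction at pc is the first of its fetched pair,
 * i.e. when executing it requires a new instruction fetch. */
static int starts_new_fetch(uint32_t pc)
{
    return (pc & 2u) == 0;
}
```

For example, the instructions at 0x8C010000 and 0x8C010002 share one fetch at 0x8C010000, and only the first of them starts a new fetch.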
by TapamN
Thu Jun 17, 2021 11:04 am
Forum: Programming Discussion
Topic: High Resolution Dreamcast Video Modes!
Replies: 14
Views: 29529

Re: High Resolution Dreamcast Video Modes!

As far as I can tell, it's not possible to truly increase the output resolution of the Dreamcast without major compatibility problems. Trying to raise the true resolution results in the frame rate dropping. For video output, there's a pixel clock that controls when a new pixel is output to the displ...
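
The link between pixel clock and frame rate can be sketched with simple arithmetic: with a fixed pixel clock, one pixel (visible or blanking) goes out per tick, so the frame rate is the clock divided by the total pixels per frame. `refresh_hz` is a hypothetical helper, and the numbers used in the usage note below are illustrative values for a 480-line mode, not figures taken from the post.

```c
#include <assert.h>

/* Frames per second for a fixed pixel clock. A frame takes
 * total_width * total_height clock ticks, where both totals
 * include the blanking intervals. */
static double refresh_hz(double pixel_clock_hz,
                         unsigned total_width, unsigned total_height)
{
    assert(total_width > 0 && total_height > 0);
    return pixel_clock_hz / ((double)total_width * (double)total_height);
}
```

Assuming a 27 MHz clock and an 858 x 525 total frame, this gives roughly 59.94 Hz. Doubling the line count to 1050 at the same clock halves that to about 29.97 Hz, which is the frame rate drop described above.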
by TapamN
Wed Dec 23, 2020 2:53 am
Forum: Programming Discussion
Topic: Unsure how some PVR features behave
Replies: 13
Views: 4557

Re: Unsure how some PVR features behave

In pvr_setup() you override some of the context struct values. gen.clip_mode is obvious, but I was wondering if there was any particular reason you modified gen.shading, blend.src and blend.dst? Are they personal preference or is the user clipping affected by those parameters? Shading is set to f...
by TapamN
Tue Dec 15, 2020 11:16 pm
Forum: Programming Discussion
Topic: How do I subtract with alpha blending?
Replies: 5
Views: 1815

Re: How do I subtract with alpha blending?

I think the closest you can get to subtractive blending is to set the source blend func to zero and the dest blend func to 1-src. (So in OpenGL, this would be glBlendFunc(GL_ZERO, GL_ONE_MINUS_SRC_COLOR).) This gives you a blend equation of Source_color * 0.0 + Dest_color * (1 - Source_color), which...
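
A per-channel sketch of that blend equation, written as ordinary C arithmetic rather than PVR or OpenGL calls. `blend_zero_inv_src` is a hypothetical name, and colour values are assumed to be normalised to the range 0 to 1.

```c
#include <assert.h>

/* One colour channel, all values in [0, 1].
 * Source factor 0 and destination factor (1 - src) give
 * result = src * 0 + dst * (1 - src). */
static float blend_zero_inv_src(float src, float dst)
{
    assert(src >= 0.0f && src <= 1.0f);
    assert(dst >= 0.0f && dst <= 1.0f);
    return src * 0.0f + dst * (1.0f - src);
}
```

With dst = 0.8 and src = 0.5 the result is 0.4, whereas a true subtraction (dst - src) would give 0.3. The effect darkens the destination in proportion to the source, so it approximates subtraction without being identical to it.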
by TapamN
Fri Oct 23, 2020 4:16 am
Forum: Programming Discussion
Topic: Need help to optimize my display routine
Replies: 10
Views: 2091

Re: Need help to optimize my display routine

Your pvr_wait_ready and timer_ms_gettime64 usage looks fine to me. Do you have any threads for stuff like sound that might be taking up CPU time? You don't have a lot of printf's going on during gameplay, do you? In the code example of how you added the timing measurements, you're printing every fra...
by TapamN
Sun Oct 18, 2020 9:09 pm
Forum: Programming Discussion
Topic: Need help to optimize my display routine
Replies: 10
Views: 2091

Re: Need help to optimize my display routine

The sprite count should not be a problem. Even a Saturn or PSX should be able to handle that. Timing information from emulators won't be accurate to real hardware. While emulators for older consoles can have accurate timing, everything starting with and after the Saturn and PSX typically treat the C...
by TapamN
Sun Sep 20, 2020 2:25 am
Forum: Programming Discussion
Topic: Objdump SH4 / can't disassemble for architecture UNKNOWN
Replies: 11
Views: 4634

Re: Objdump SH4 / can't disassemble for architecture UNKNOWN

If you specify the target object file as binary with the parameter "-b binary" you can disassemble raw binaries. You will also need to specify a machine type with "-m <MACHINE>". You can get a list of machine types with "[...]objdump -h". As an example, to disassemble a...
by TapamN
Wed Aug 19, 2020 7:07 pm
Forum: Programming Discussion
Topic: Bug in sem_init
Replies: 2
Views: 918

Bug in sem_init

This was frustrating to figure out. I'm working on a new PVR driver, and I ran into a bug in sem_init. I was initializing a semaphore as part of a malloc-ed struct, and the semaphore appeared to be getting corrupted or not initialized (When I went to wait on the semaphore, I sometimes found the coun...
by TapamN
Mon Jun 15, 2020 7:04 pm
Forum: Programming Discussion
Topic: KOS vs Ninja - simple test.
Replies: 25
Views: 7404

Re: KOS vs Ninja - simple test.

As for the SH7750 and SH7091 differences: at the hardware level, the main difference is that the 7750 has its GPIO port A lines shared with the data bus, while the 7091 has them shared with the address bus. That is a pretty wise move, because the DC uses quite a few of the address bits for local SDRAM access only, and access everything els...
by TapamN
Sun May 24, 2020 5:22 am
Forum: Programming Discussion
Topic: KOS vs Ninja - simple test.
Replies: 25
Views: 7404

Re: KOS vs Ninja - simple test.

Not OCINDEX, no. Basically the trick is doing a write-through write to memory, and then reading back that data in write-back memory, doing all the operations on the write-back memory (which is only in the cache), and then invalidating the cache without allowing the data to get written back to the s...
by TapamN
Wed May 13, 2020 12:34 pm
Forum: Programming Discussion
Topic: KOS vs Ninja - simple test.
Replies: 25
Views: 7404

Re: KOS vs Ninja - simple test.

Sorry, yeah, looks like I used poor word choice here. I had actually posted a way to make a kind of "fake" 2-way associativity by abusing 29-bit addressing that makes use of manual cache management in the Simulant Discord not too long ago. (Unrelated, but I gotta admit that it's nice havi...