DCEmulation

The Dreamcast Homebrew Community Online

All times are UTC - 8 hours [ DST ]




PostPosted: Mon Jun 20, 2011 2:01 am 
Offline
DC Developer
DC Developer

Joined: Thu Aug 20, 2009 9:00 am
Posts: 316
I doubt the lack of a cache makes THAT much of a difference to the MIPS. A 68000 will give about 3 MIPS at that clock rate, and the ARM7 is MUCH more efficient about clock usage. In fact, the speed of nearly all instructions is limited by how fast the bus is; if the bus allows a load/store in a single cycle, most ARM instructions are only 2 or 3 cycles long. Given that the instructions are 32 bits and the RAM bus for the audio in the DC is 16 bits, that should probably be doubled, but that's still MUCH faster than a 68000. The bus speed for the DC sound RAM is supposed to be 66MHz, but the CPU will be sharing that with the sound channels, so if a lot of channels are playing, you will slow the CPU down. I doubt a few channels will slow the CPU at all. It sounds like it's close to what you need for SPEEX, which takes 5 to 10 MIPS to decode. It might be worth trying.

Do you have any more specific info on the CPU? Like what the FIQ hooks to (if anything)? I have the ARM7DI datasheet - at least that was easy to find.


 
PostPosted: Mon Jun 20, 2011 6:33 am 
Offline
Insane DCEmu
Insane DCEmu

Joined: Thu Apr 03, 2008 5:01 am
Posts: 119
Have a look at that:
viewtopic.php?f=29&t=51558&p=545827&hilit=+aica#p545827

Heliophobe (RIP) theorized that the ARM would virtually run at ~2.8MHz.
What's funny here is that I benchmarked it, and measured ~2.7 MIPS.

The big problem is that even if the channels are deactivated, the mixer still uses as much memory bandwidth (see the last line of the post).

You can generate a FIQ on the ARM at the end of a DMA transfer, on a timer event, or from the SH-4, and probably from the DSP.
Does that answer your question (sorry if I didn't get it correctly)?


 
PostPosted: Mon Jun 20, 2011 3:29 pm 
Offline
DC Developer
DC Developer

Joined: Thu Aug 20, 2009 9:00 am
Posts: 316
That was all conjecture. I rather doubt the AICA keeps the registers in ram - that's just plain silly and was only even suggested because it was the only way the person could think of using ram cycles to slow the ARM. It's more likely like Quzar mentioned - that speed control for the ARM... perhaps it's not the master clock but the bus clock for the ARM. It seems there needs to be some more testing done.

The info on the FIQ is what I meant, but where are the specifics? I'm having trouble finding that in the KOS code.

By the way, the multiply on the ARM7DI takes a minimum of 2 clock cycles, and a max of 17 clock cycles... assuming no bus slow downs.


 
PostPosted: Mon Jun 20, 2011 5:18 pm 
Offline
Insane DCEmu
Insane DCEmu

Joined: Thu Apr 03, 2008 5:01 am
Posts: 119
Yes it does, but as far as I remember the ARM7DI does not have a 32*32=>64 mul opcode, so an MP3 player, which would be possible on a 25MHz ARM7TDMI, becomes impossible here.

And as you said, Heliophobe conjectured it. However, as I said earlier, I did benchmark it, with a hand-made ASM program which counts the number of NOP instructions executed in one second. And the speed control register does not help at all (and I guess it overclocks correctly, since it crashes the ARM with values that are too high): the MIPS value I get is still the same, probably because while the clock is higher, the processor still spends its time waiting tirelessly.
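
To make the measurement method concrete, here is a minimal sketch in C of the counting idea. It is not the hand-made ASM program mentioned above; the names `on_stop_interrupt`, `spin_until_stopped` and `mips_from_count` are made up for illustration, and the number of instructions per loop iteration has to be read from the compiled loop rather than assumed.

```c
#include <stdint.h>

/* Set by an interrupt handler (an AICA timer or a request from the SH-4)
 * when the measurement interval is over. */
static volatile int stop_requested = 0;

/* Hypothetical interrupt hook: on real hardware this would be called
 * from the FIQ handler. */
void on_stop_interrupt(void)
{
    stop_requested = 1;
}

/* Spin and count loop iterations until the stop flag is raised. */
uint32_t spin_until_stopped(void)
{
    uint32_t iterations = 0;
    while (!stop_requested)
        iterations++;
    return iterations;
}

/* Convert a loop count into millions of instructions per second.
 * instr_per_iteration must be taken from the disassembly of the loop. */
double mips_from_count(uint32_t iterations, uint32_t instr_per_iteration,
                       double seconds)
{
    return (double)iterations * instr_per_iteration / seconds / 1e6;
}
```

With this arithmetic, 2,700,000 single-instruction iterations in one second correspond to 2.7 MIPS.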

About the AICA, all the info you need is on one file I mirrored here: http://crapouillou.net/~paul/aica_v08.txt


 
PostPosted: Mon Jun 20, 2011 5:53 pm 
Offline
DC Developer
DC Developer

Joined: Thu Aug 20, 2009 9:00 am
Posts: 316
Ayla wrote:
Yes it does, but as far as I remember the ARM7DI does not have a 32*32=>64 mul opcode, so an MP3 player, which would be possible on a 25MHz ARM7TDMI, becomes impossible here.


Yeah, it's 32*32->32. That can be used to make a 64 bit multiply exactly like the 16*16->32 on the 68000, but it's still slower doing so. You'd have to find some way around that... new assembly code that perhaps loses accuracy in exchange for speed. Still, it looks like it's probably not possible. SPEEX might be, given its lower computational requirements.
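
For anyone who wants to see that trick spelled out, here is a minimal sketch in C, assuming unsigned operands. It builds the full 64-bit product from four 16x16 partial products, each of which fits a 32*32->32 multiply. The function name `mul32x32_64` is made up for illustration, and a real decoder would do this in hand-tuned assembly.

```c
#include <stdint.h>

/* Full 32x32 -> 64-bit unsigned product using only multiplies whose
 * results fit in 32 bits, as on a CPU without a long-multiply opcode.
 * The result is returned as two 32-bit halves. */
void mul32x32_64(uint32_t a, uint32_t b, uint32_t *hi, uint32_t *lo)
{
    uint32_t a_lo = a & 0xFFFFu, a_hi = a >> 16;
    uint32_t b_lo = b & 0xFFFFu, b_hi = b >> 16;

    uint32_t p0 = a_lo * b_lo;  /* contributes to bits  0..31 */
    uint32_t p1 = a_lo * b_hi;  /* contributes to bits 16..47 */
    uint32_t p2 = a_hi * b_lo;  /* contributes to bits 16..47 */
    uint32_t p3 = a_hi * b_hi;  /* contributes to bits 32..63 */

    /* Sum of the 16-bit pieces that land on bits 16..31.
     * At most 3 * 0xFFFF, so it cannot overflow 32 bits. */
    uint32_t mid = (p0 >> 16) + (p1 & 0xFFFFu) + (p2 & 0xFFFFu);

    *lo = (p0 & 0xFFFFu) | (mid << 16);
    /* mid >> 16 is the carry out of the low word. */
    *hi = p3 + (p1 >> 16) + (p2 >> 16) + (mid >> 16);
}
```

That is four multiplies plus a handful of shifts, masks and adds per product, which is where the extra cost comes from compared to a single long-multiply instruction.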


Quote:
And as you said, Heliophobe conjectured it. However, as I said earlier, I did benchmark it, with a hand-made ASM program which counts the number of NOP instructions executed in one second. And the speed control register does not help at all (and I guess it overclocks correctly, since it crashes the ARM with values that are too high): the MIPS value I get is still the same, probably because while the clock is higher, the processor still spends its time waiting tirelessly.


Hmm - how did you time it? It might be worth trying to do several million nops while timing it by hand to double check that it wasn't some timer issue.


Quote:
About the AICA, all the info you need is on one file I mirrored here: http://crapouillou.net/~paul/aica_v08.txt


Ah, I see. Thanks!

One thing I noticed in that note is that the DSP uses a ring buffer for the DSP program. Given there's a pointer and size you set, that would be in ram. So perhaps it's the DSP slowing the bus access by the ARM. That would make more sense than blaming it on the AICA channel registers. Perhaps a different DSP program would help. Of course there's even less on the DSP than the rest.


 
PostPosted: Mon Jun 20, 2011 8:19 pm 
Offline
DC Developer
DC Developer
User avatar

Joined: Fri Jun 18, 2010 7:29 pm
Posts: 336
Chilly Willy wrote:
Ayla wrote:
Yes it does, but as far as I remember the ARM7DI does not have a 32*32=>64 mul opcode, so an MP3 player, which would be possible on a 25MHz ARM7TDMI, becomes impossible here.


Yeah, it's 32*32->32. That can be used to make a 64 bit multiply exactly like the 16*16->32 on the 68000, but it's still slower doing so. You'd have to find some way around that... new assembly code that perhaps loses accuracy in exchange for speed. Still, it looks like it's probably not possible. SPEEX might be, given its lower computational requirements.


Yeah... that's what I believe stopped the Helix MP3 decoder assembly code from compiling for the DC's ARM, and that's what stopped me from continuing with this idea.

It would be interesting to really know how many MIPS it can handle. 3 MIPS seems quite low, but it's probably right.


 
PostPosted: Tue Jun 21, 2011 1:03 am 
Offline
Insane DCEmu
Insane DCEmu

Joined: Thu Apr 03, 2008 5:01 am
Posts: 119
Chilly Willy wrote:
Hmm - how did you time it? It might be worth trying to do several million nops while timing it by hand to double check that it wasn't some timer issue.


Well I did it twice, at first by starting the process with a signal from the SH-4, and stopping it by raising an interrupt on the ARM7 from the SH-4. Then I tried with one of the AICA's timers and got the same result.

Chilly Willy wrote:
One thing I noticed in that note is that the DSP uses a ring buffer for the DSP program. Given there's a pointer and size you set, that would be in ram. So perhaps it's the DSP slowing the bus access by the ARM. That would make more sense than blaming it on the AICA channel registers. Perhaps a different DSP program would help. Of course there's even less on the DSP than the rest.


Well, I think that in that case I would get a "bus" FIQ, which AFAIK occurs when the DSP tries to read the shared memory.
However, that would be interesting to verify. I know it is possible to really deactivate some parts of the AICA, like the hardware mixer or the DSP.


 
PostPosted: Tue Jun 21, 2011 6:58 am 
Offline
Dream Coder
Dream Coder
User avatar

Joined: Tue Jul 30, 2002 10:14 pm
Posts: 7444
Location: Behind NeoDC
Chilly Willy wrote:
That was all conjecture. I rather doubt the AICA keeps the registers in ram - that's just plain silly and was only even suggested because it was the only way the person could think of using ram cycles to slow the ARM. It's more likely like Quzar mentioned - that speed control for the ARM... perhaps it's not the master clock but the bus clock for the ARM. It seems there needs to be some more testing done.


Only problem is that there is no evidence that the ARM actually does change clock speed. You'd think that tidbit would have been discovered by emulator authors, no? My point was purely conjecture. I believe Ayla's testing, as I've followed what he's been doing. Though I still think it would be interesting to attempt a modification of the clock as mentioned in the docs and then run a benchmark. Not sure it's ever been done.

_________________
"When you post fewer lines of text than your signature, consider not posting at all." - A Wise Man


 
PostPosted: Tue Jun 21, 2011 3:47 pm 
Offline
DC Developer
DC Developer

Joined: Thu Aug 20, 2009 9:00 am
Posts: 316
Hmm - we've pretty thoroughly hijacked this thread. Perhaps one of the mods could split this into its own thread?

It's just hard to believe that a 22.5 MHz ARM7DI would get less than 3 MIPS. Just look at the charts... even the early architectures got better MIPS at much lower clock rates.

http://en.wikipedia.org/wiki/List_of_AR ... ssor_cores

If an ARM2 with no cache can get 4 MIPS at 8 MHz, I find it very hard to believe a 22 MHz ARM7 gets 3 MIPS; it should be getting close to 10 MIPS.
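
Here is the arithmetic behind that expectation as a small C sketch. It assumes throughput scales linearly with clock rate, which is only a rough first guess, and it uses the figures quoted in this thread (4 MIPS at 8 MHz for the ARM2, 22.5 MHz for the AICA's ARM7DI, about 2.7 MIPS measured). The function names are made up for illustration.

```c
/* MIPS expected at target_mhz if throughput scaled linearly with clock
 * from a reference core (a simplifying assumption). */
double scaled_mips(double ref_mips, double ref_mhz, double target_mhz)
{
    return ref_mips / ref_mhz * target_mhz;
}

/* Average clock cycles spent per instruction for a measured MIPS figure. */
double cycles_per_instruction(double clock_mhz, double mips)
{
    return clock_mhz / mips;
}
```

`scaled_mips(4.0, 8.0, 22.5)` gives 11.25 MIPS, while `cycles_per_instruction(22.5, 2.7)` gives roughly 8.3 cycles per instruction, so if the benchmark is right, most of the ARM's cycles are going somewhere other than executing code.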


 
PostPosted: Tue Jun 21, 2011 5:15 pm 
Offline
Dream Coder
Dream Coder
User avatar

Joined: Tue Jul 30, 2002 10:14 pm
Posts: 7444
Location: Behind NeoDC
Chilly Willy wrote:
Hmm - we've pretty thoroughly hijacked this thread. Perhaps one of the mods could split this into its own thread?

It's just hard to believe that a 22.5 MHz ARM7DI would get less than 3 MIPS. Just look at the charts... even the early architectures got better MIPS at much lower clock rates.

http://en.wikipedia.org/wiki/List_of_AR ... ssor_cores

If an ARM2 with no cache can get 4 MIPS at 8 MHz, I find it very hard to believe a 22 MHz ARM7 gets 3 MIPS; it should be getting close to 10 MIPS.


The following links are the basic resource for this: Rand Linden's various hints on the forum about the DC's ARM/AICA. We've basically held Rand's word on DC hardware and capabilities as gospel since he went over the entire system with a fine-toothed comb.

viewtopic.php?f=29&t=64862 "There is NO cache for AICA."
viewtopic.php?f=29&t=27769 "Instructions must run from memory directly..." (which directly follows if it's true that it has no cache).
viewtopic.php?p=545709#p545709 "Incomprehensibly slow"

_________________
"When you post fewer lines of text than your signature, consider not posting at all." - A Wise Man


 
PostPosted: Tue Jun 21, 2011 6:46 pm 
Offline
DC Developer
DC Developer
User avatar

Joined: Fri Jun 18, 2010 7:29 pm
Posts: 336
Quzar wrote:
Chilly Willy wrote:
Hmm - we've pretty thoroughly hijacked this thread. Perhaps one of the mods could split this into its own thread?

It's just hard to believe that a 22.5 MHz ARM7DI would get less than 3 MIPS. Just look at the charts... even the early architectures got better MIPS at much lower clock rates.

http://en.wikipedia.org/wiki/List_of_AR ... ssor_cores

If an ARM2 with no cache can get 4 MIPS at 8 MHz, I find it very hard to believe a 22 MHz ARM7 gets 3 MIPS; it should be getting close to 10 MIPS.


The following links are the basic resource for this: Rand Linden's various hints on the forum about the DC's ARM/AICA. We've basically held Rand's word on DC hardware and capabilities as gospel since he went over the entire system with a fine-toothed comb.

viewtopic.php?f=29&t=64862 "There is NO cache for AICA."
viewtopic.php?f=29&t=27769 "Instructions must run from memory directly..." (which directly follows if it's true that it has no cache).
viewtopic.php?p=545709#p545709 "Incomprehensibly slow"


@Chilly - No, I believe the topic of discussion fits this thread just fine. We'll let the moderators decide...

@Quzar - I had read one of those threads, and Rand's words were what stopped me from pursuing the idea further.

Rand Linden wrote:
...a sound system is definitely feasible -- it all depends on how much time you want to spend on it... If you're looking for a research project that'll suck huge amount of time and have little progress to show for it, this is the one.

Rand.


But, what did you end up deciding about sending multi-channel streams to the SPU?
Would it be possible to send 6-ch wave to the SPU, and let the hardware do the mixing?
Or is data transfer too slow to send 6-ch wave to the SPU in real-time?


 
PostPosted: Sat Jun 25, 2011 5:08 pm 
Offline
Dream Coder
Dream Coder
User avatar

Joined: Tue Jul 30, 2002 10:14 pm
Posts: 7444
Location: Behind NeoDC
PH3NOM wrote:
But, what did you end up deciding about sending multi-channel streams to the SPU?
Would it be possible to send 6-ch wave to the SPU, and let the hardware do the mixing?
Or is data transfer too slow to send 6-ch wave to the SPU in real-time?


Don't know. I have tested 44k/16bit stereo wav and it's faster to mix on the AICA. Never tried more channels. I believe that Scherzo had 5 channel audio in Nester DC and found it faster. It would likely be a worthwhile benchmark to test software vs hardware mixing across different numbers of channels and audio formats (bitrates, sampling rates, encodings).

I would expect that it would always be faster to mix to hardware until the point where you would simply be unable to transfer the audio fast enough to update (even with DMA there could be significant impacts against performance of main memory). It's possible though that something like mixing two channels down while transferring 4 others, then transferring the new mixed one might be faster than transferring 6. Really don't know.
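
To put a number on the transfer side of that question, here is a small C sketch of the raw data rate for uncompressed PCM. It says nothing about how fast the SH-4 can actually push data to sound RAM, which is the part that would need benchmarking; the function name is made up for illustration.

```c
#include <stdint.h>

/* Raw data rate of uncompressed PCM audio, in bytes per second. */
uint32_t pcm_bytes_per_second(uint32_t sample_rate_hz,
                              uint32_t bits_per_sample,
                              uint32_t channels)
{
    return sample_rate_hz * (bits_per_sample / 8u) * channels;
}
```

For 44.1 kHz, 16-bit audio, `pcm_bytes_per_second(44100, 16, 2)` is 176,400 bytes per second for stereo, and `pcm_bytes_per_second(44100, 16, 6)` is 529,200 bytes per second (roughly 517 KiB/s) for six channels.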

_________________
"When you post fewer lines of text than your signature, consider not posting at all." - A Wise Man


 
PostPosted: Sun Jun 26, 2011 8:50 am 
Offline
Moderator
Moderator
User avatar

Joined: Fri Oct 12, 2007 11:52 am
Posts: 614
Location: Munich, Germany
I just noticed this topic. "JUST FUCKING AWESOME" is all I can say. If you need any more beta testers, count me in.

_________________
..::SEGA-DC.DE - The home of the German Dreamcast scene::..


 
PostPosted: Tue Jul 19, 2011 2:59 pm 
Offline
DCEmu Cool Newbie
DCEmu Cool Newbie

Joined: Sun Apr 03, 2011 8:48 am
Posts: 16
No news? Thanks.


 
PostPosted: Mon Jul 25, 2011 7:12 pm 
Offline
DCEmu Newbie
DCEmu Newbie

Joined: Mon Jul 25, 2011 7:03 pm
Posts: 5
Give it some time mouZ, they'll have more updates when they can post them.


 
PostPosted: Tue Jul 26, 2011 6:39 pm 
Offline
DC Developer
DC Developer
User avatar

Joined: Fri Jun 18, 2010 7:29 pm
Posts: 336
Thanks for the interest.

In the early stages, progress was very rapid on this project.
Reaching later stages, it seems more time is spent with less to show for it.

The photo, music, and application loaders are basically finished.
Funny, the thing I have been working on the longest, still proves to need the most work :evil:

Take a look at some of my other posts here recently to get an idea of what progress is being made on various aspects of this project.

Something I have not mentioned yet, I would like to add support for some emulators.
For example, SNES4ALL can be compiled to load a certain rom at load time.
So, I can implement rom browsing in DCMC, and when a rom is selected, load the emulator binary, referencing the selected rom.
I will also add a return function, so you can return to DCMC when you're done playing :lol:


 
PostPosted: Wed Jul 27, 2011 1:31 pm 
Offline
DC Developer
DC Developer
User avatar

Joined: Sat Dec 01, 2007 7:51 am
Posts: 292
Great, but wouldn't it be better (though obviously more difficult and more time-consuming) to use the emulator core (and not load the emu bin) to load the ROM directly from DCMC?

Two cons of this are the ROM recognition (either a different file extension for each ROM type or the header would work), and that I get the feeling it would be a mess when loading bins.


 
PostPosted: Wed Jul 27, 2011 8:02 pm 
Offline
Dream Coder
Dream Coder
User avatar

Joined: Tue Jul 30, 2002 10:14 pm
Posts: 7444
Location: Behind NeoDC
Neoblast wrote:
Great, but wouldn't it be better (though obviously more difficult and more time-consuming) to use the emulator core (and not load the emu bin) to load the ROM directly from DCMC?

Two cons of this are the ROM recognition (either a different file extension for each ROM type or the header would work), and that I get the feeling it would be a mess when loading bins.


The problem with doing something like that is the memory overhead of having the rest of the system still working. Remember, at a minimum the binary must be loaded into memory, and an extra few MB isn't nothing. This might be a workable solution for one or two emulators, but beyond that you may find things breaking (also, video playback will suffer).

_________________
"When you post fewer lines of text than your signature, consider not posting at all." - A Wise Man


 
PostPosted: Thu Jul 28, 2011 7:27 pm 
Offline
DC Developer
DC Developer
User avatar

Joined: Fri Jun 18, 2010 7:29 pm
Posts: 336
Neoblast wrote:
Great, but wouldn't it be better (though obviously more difficult and more time-consuming) to use the emulator core (and not load the emu bin) to load the ROM directly from DCMC?

Two cons of this are the ROM recognition (either a different file extension for each ROM type or the header would work), and that I get the feeling it would be a mess when loading bins.


As far as rom recognition, I don't see any problems.
True, the build of DCMC you tested determines how to process a file based on its extension.
It would be easy to change this to actually open the file first, read its header, then decide how to process it.
Simply, there has been no need to do so, until now...
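
As a rough idea of what the header check could look like, here is a minimal sketch in C. It is not DCMC's actual code: the `file_kind` enum and `sniff_header` function are made up for illustration, and only a few well-known signatures are covered.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical set of file types, for illustration only. */
typedef enum {
    FILE_UNKNOWN,
    FILE_WAV,
    FILE_OGG,
    FILE_JPEG,
    FILE_PNG,
    FILE_NES_ROM
} file_kind;

/* Guess the file type from the first bytes of the file.
 * buf holds the start of the file, len is how many bytes were read. */
file_kind sniff_header(const unsigned char *buf, size_t len)
{
    static const unsigned char png_magic[8] =
        { 0x89, 'P', 'N', 'G', '\r', '\n', 0x1A, '\n' };
    static const unsigned char ines_magic[4] = { 'N', 'E', 'S', 0x1A };

    /* RIFF container with a WAVE form type */
    if (len >= 12 && memcmp(buf, "RIFF", 4) == 0 &&
        memcmp(buf + 8, "WAVE", 4) == 0)
        return FILE_WAV;
    /* Ogg page capture pattern */
    if (len >= 4 && memcmp(buf, "OggS", 4) == 0)
        return FILE_OGG;
    /* JPEG start-of-image marker followed by another marker */
    if (len >= 3 && buf[0] == 0xFF && buf[1] == 0xD8 && buf[2] == 0xFF)
        return FILE_JPEG;
    if (len >= 8 && memcmp(buf, png_magic, 8) == 0)
        return FILE_PNG;
    /* iNES header used by NES ROM images */
    if (len >= 4 && memcmp(buf, ines_magic, 4) == 0)
        return FILE_NES_ROM;

    return FILE_UNKNOWN;
}
```

The caller would read the first dozen or so bytes of the selected file and fall back to the extension when this returns `FILE_UNKNOWN`. That fallback matters because some formats, SNES ROM dumps for instance, do not start with a signature at offset 0.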

Adding the EMU CORE into DCMC is a good idea.
But, as Quzar mentioned, this would inflate the binary size, and that is something I don't want to do.
DCMC (including the romdisk) is currently ~2.5 MB.
If I had enough time, which I don't, I would implement a module system, similar to how Dreamshell functions.
Considering my time constraints, it is a realistic goal to simply go about this the way I originally mentioned.


 
PostPosted: Sat Jul 30, 2011 3:09 pm 
Offline
DCEmu Newbie
DCEmu Newbie

Joined: Mon Jul 25, 2011 7:03 pm
Posts: 5
No rush here, real life takes precedence over development. It's just great to see people still willing to work on applications for the Dreamcast, I along with the community appreciate the efforts being made on this project.


 