dreamcast armc cpu uses?
- Neoblast
- DC Developer
- Posts: 315
- Joined: Sat Dec 01, 2007 8:51 am
- Has thanked: 5 times
- Been thanked: 1 time
Re: dreamcast armc cpu uses?
What I mean is, it "seems" to use the sound memory, and perhaps it would be worth looking at whether they use the ARM in all of these.
- PH3NOM
- DC Developer
- Posts: 576
- Joined: Fri Jun 18, 2010 9:29 pm
- Has thanked: 0
- Been thanked: 5 times
Re: dreamcast armc cpu uses?
Ayla wrote: We have plenty of documentation. NullDC and MAME have a working AICA emulator, that's a source of documentation.
Since 2007 I have been slowly working on an OS for the ARM to ease development of the AICA. I've been reading a lot, and testing a lot (that's why I can affirm the running speed of the ARM and DSP as written previously). I'm more and more certain that we can't use both processors for anything useful, but I'm not losing hope.
Well, today I decided to try decoding ADX on the ARM processor.
Surprisingly, I was able to get the decoder running much faster than real time using stereo 44.1kHz, but it is not yet outputting the audio.
The SH4 only has to read the ADX bit stream from the CD, then load the bit stream to the AICA memory
Code: Select all
if( ADX_REG_STAT_SH4 == ADX_NEED_BUF )
{
    fread( buffer, 1, 2048, fd );
    spu_memload( idx, buffer, 2048 );
    ADX_REG_STAT_SH4 = ADX_HAVE_BUF;
    printf("LibADX: DECODED %.0f%s\n", ((float)ftell(fd)/(ts-sizeof(ADX_INFO)))*100, "%" );
}
- RyoDC
- Mental DCEmu
- Posts: 366
- Joined: Wed Mar 30, 2011 12:13 pm
- Has thanked: 2 times
- Been thanked: 0
Re: dreamcast armc cpu uses?
PH3NOM wrote: The SH4 only has to read the ADX bit stream from the CD, then load the bit stream to the AICA memory
Phenom, can this be done by means of DMA?
- Ayla
- DC Developer
- Posts: 142
- Joined: Thu Apr 03, 2008 7:01 am
- Has thanked: 0
- Been thanked: 4 times
- Contact:
Re: dreamcast armc cpu uses?
Isn't ADX just standard ADPCM? If so the AICA decodes that in hardware.
If you want an easier way to program the ARM, have a look at my lib: https://github.com/pcercuei/AICAOS/
- PH3NOM
- DC Developer
- Posts: 576
- Joined: Fri Jun 18, 2010 9:29 pm
- Has thanked: 0
- Been thanked: 5 times
Re: dreamcast armc cpu uses?
Actually, there was a bug in the memory allocation I was making ( with the ARM RAM ), that led to the decoding "seeming" to finish extremely fast.
After fixing the bug, the real speed becomes apparent.
The ARM is able to decode MONO ADX @ 44.1kHz faster than real time, but stereo is actually slower than real-time.
I have made an important optimization to the decode routine that has increased the decoder speed in stereo, but it is still not quite fast enough. I will make one more pass at optimizing the decode algorithm on the ARM before I make a final verdict.
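To put rough numbers on the mono versus stereo result, here is a small host-side C sketch of the per-sample cycle budget and the stream bandwidth. It rests on two assumptions that are not measured here: the ~2.8 MHz effective ARM speed quoted elsewhere in this thread, and the commonly described ADX layout of 18-byte blocks holding 32 samples per channel. The function names are made up for illustration.

```c
#include <stdio.h>

/* CPU cycles available per decoded sample when `channels` channels
   must be produced at `rate_hz`. cpu_hz is an ASSUMED effective clock
   (the thread quotes roughly 2.8 MHz for the ARM7). */
double cycles_per_sample(double cpu_hz, double rate_hz, int channels)
{
    return cpu_hz / (rate_hz * channels);
}

/* Compressed ADX input rate in bytes per second, ASSUMING 18-byte
   blocks that each carry 32 samples for one channel. */
double adx_bytes_per_sec(double rate_hz, int channels)
{
    return rate_hz / 32.0 * 18.0 * channels;
}

/* Uncompressed 16-bit PCM rate in bytes per second, for comparison. */
double pcm_bytes_per_sec(double rate_hz, int channels)
{
    return rate_hz * 2.0 * channels;
}

/* Print the budget for mono and stereo at 44.1 kHz. */
void print_budget(void)
{
    const double cpu = 2800000.0, rate = 44100.0;
    for (int ch = 1; ch <= 2; ch++) {
        printf("%d ch: %.1f cycles/sample, ADX %.1f B/s, PCM %.1f B/s\n",
               ch, cycles_per_sample(cpu, rate, ch),
               adx_bytes_per_sec(rate, ch), pcm_bytes_per_sec(rate, ch));
    }
}
```

Under those assumptions the ARM has roughly 63 cycles per mono sample and about 32 per stereo sample, which would be consistent with mono keeping up while stereo falls behind.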
RyoDC wrote: The SH4 only has to read the ADX bit stream from the CD, then load the bit stream to the AICA memory
Phenom, it can be done by means of DMA?
Sure, the samples could be sent through the DMA channels; that may be something to consider if things work out.
But there is probably not much difference, since we are simply sending a compressed bit stream to the AICA, over a bus with enough bandwidth to handle uncompressed data.
Ayla wrote: Isn't ADX just standard ADPCM? If so the AICA decodes that in hardware.
If you want an easier way to program the ARM, have a look at my lib: https://github.com/pcercuei/AICAOS/
No, ADX is not "just standard ADPCM":
http://en.wikipedia.org/wiki/ADX_%28file_format%29
It is similar, but its prediction algorithm is different, and it actually results in slightly larger sample size than ADPCM.
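As an illustration of how that prediction step differs from plain ADPCM, here is a minimal C sketch of decoding a single ADX block. It follows the publicly described layout (an 18-byte block made of a 2-byte big-endian scale followed by 32 four-bit samples, with a second-order predictor in 12-bit fixed point) and is not the LibADX code discussed in this thread. In a real file the two coefficients are derived from values in the header; here they are simply passed in as parameters, and all function names are hypothetical.

```c
#include <stdint.h>

/* Sign-extend a 4-bit value (0..15) to the range -8..7. */
static int nibble_to_int(unsigned n)
{
    return (int)((n & 0xF) ^ 8) - 8;
}

/* Clamp a 32-bit intermediate to the signed 16-bit PCM range. */
static int16_t clamp16(int32_t v)
{
    if (v > 32767)  return 32767;
    if (v < -32768) return -32768;
    return (int16_t)v;
}

/* Decode one 18-byte block into 32 PCM samples for one channel.
   coef1/coef2: predictor coefficients in 12-bit fixed point (4096 == 1.0).
   hist[0] is the previous output sample, hist[1] the one before it;
   both are updated so the next block can continue from them.
   Note: >> on a negative value is assumed to be an arithmetic shift. */
void adx_decode_block(const uint8_t block[18], int16_t out[32],
                      int coef1, int coef2, int32_t hist[2])
{
    int32_t scale = ((int32_t)block[0] << 8) | block[1];

    for (int i = 0; i < 32; i++) {
        uint8_t byte = block[2 + i / 2];
        /* High nibble first, then low nibble. */
        unsigned nib = (i & 1) ? (unsigned)(byte & 0x0F) : (unsigned)(byte >> 4);

        int32_t predicted = (coef1 * hist[0] + coef2 * hist[1]) >> 12;
        int16_t sample = clamp16(nibble_to_int(nib) * scale + predicted);

        out[i]  = sample;
        hist[1] = hist[0];
        hist[0] = sample;
    }
}
```

For stereo, the blocks of the two channels are decoded with separate history pairs, which is what the `prev[0]` / `prev[1]` state in the ARM code appears to hold.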
Your lib seems very well done.
But I don't know how to use it for this task...
Right now, I use the ARM to decode the ADX samples sent from the SH4 into Sound RAM.
All I need is to figure out how to output the audio from the samples being generated on the ARM.
Can your lib do this? Take a sample buffer ( in sound ram ) and output the audio to the speakers?
- Ayla
- DC Developer
- Posts: 142
- Joined: Thu Apr 03, 2008 7:01 am
- Has thanked: 0
- Been thanked: 4 times
- Contact:
Re: dreamcast armc cpu uses?
My lib does not do this, but I designed it to ease development on the ARM. For instance, an ARM program using my lib can call printf(), open and read files with fopen() / fclose() directly from the CD-rom, and launch pre-registered routines on the main processor.
It provides these features:
- an API that allows you to register a function, either from the ARM or the SH4, as callable from the other processor;
- it's integrated with newlib, so you get all the benefits of the libc like memory management routines, on the ARM. The I/O functions like open, read etc. on the ARM are defined as remote functions, so each call newlib makes to read() on the ARM will call read() on the SH4, for instance. I'll let you imagine how useful it is when you need to stream data.
- it provides basic threads (called tasks).
It still has a couple of bugs I will fix when I get back on dreamcast development (I've been abroad for almost a year, without my console). For instance, calling remote functions of the ARM from the SH-4 sometimes locks, but the other way works fine.
Once I'm done with that library, my plan is to start a new one, to offer an API on the ARM side for the features of the AICA (playing samples, playing a stream, using the FBO, using the DSP etc), as well as a library to offer 3D sound (think OpenAL) and enhanced stereo.
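To make the idea of functions registered as callable from the other processor more concrete, here is a generic C sketch of a mailbox plus dispatch table. It is not the AICAOS API: every name below (`mailbox_t`, `remote_register`, `remote_post`, `remote_service`) is hypothetical, and it runs in a single process purely to show the control flow. On real hardware the mailbox would sit in memory both CPUs can see, and posting a request would be followed by raising an interrupt on the other side.

```c
#include <stdint.h>
#include <stddef.h>

#define MAX_REMOTE 8

typedef int32_t (*remote_fn)(int32_t arg);

/* Hypothetical mailbox shared between the two processors. */
typedef struct {
    volatile uint32_t pending; /* 1 while a request waits to be serviced */
    volatile uint32_t id;      /* which registered function to run */
    volatile int32_t  arg;     /* single argument, for simplicity */
    volatile int32_t  result;  /* filled in by the servicing side */
} mailbox_t;

static remote_fn remote_table[MAX_REMOTE];

/* Callee side: expose fn under a numeric id. Returns 0 on success. */
int remote_register(uint32_t id, remote_fn fn)
{
    if (id >= MAX_REMOTE || fn == NULL)
        return -1;
    remote_table[id] = fn;
    return 0;
}

/* Caller side: write the request, then flag it as pending last. */
void remote_post(mailbox_t *mb, uint32_t id, int32_t arg)
{
    mb->id = id;
    mb->arg = arg;
    mb->pending = 1;
}

/* Callee side (for example from an interrupt handler): run one pending
   request. Returns 1 if handled, 0 if nothing was pending, -1 if the id
   is not registered. */
int remote_service(mailbox_t *mb)
{
    if (!mb->pending)
        return 0;
    uint32_t id = mb->id;
    int rc = -1;
    if (id < MAX_REMOTE && remote_table[id] != NULL) {
        mb->result = remote_table[id](mb->arg);
        rc = 1;
    }
    mb->pending = 0; /* lets the caller see that the result is ready */
    return rc;
}

/* Example function to register; doubles its argument. */
int32_t example_double(int32_t x)
{
    return 2 * x;
}
```

A caller would post a request, wait until `pending` drops back to 0, and then read `result`; a remote read() or printf() would follow the same pattern with more arguments.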
- These users thanked the author Ayla for the post:
- Ian Robinson
- MisterDave
- DCEmu Freak
- Posts: 58
- Joined: Mon Apr 08, 2013 1:16 pm
- Has thanked: 0
- Been thanked: 0
Re: dreamcast armc cpu uses?
Thanks Ayla for the link to your ARM OS code. So far I have been trying to write my ARM code based on Dan Potter's s3mplay.c example which is as low level as you can get (no external libraries). Having a mechanism to call functions on the SH4 from the ARM is really nice, particularly printf for debugging as the only other way to see inside the ARM is the lxdream debugger.
- PH3NOM
- DC Developer
- Posts: 576
- Joined: Fri Jun 18, 2010 9:29 pm
- Has thanked: 0
- Been thanked: 5 times
Re: dreamcast armc cpu uses?
Ayla wrote: My lib does not do this, but I designed it to ease development on the ARM. [...]
Hmm, I think I understand now: your lib uses the ARM for processing general tasks, without using it for sound output?
Or do you expect another program to be running on the ARM, in another thread, to handle outputting sound through the AICA?
I can see good reasons for this, making things simpler for the developer, but I can also imagine reasons why that approach may not be the best idea. I think that giving the ARM that much control reduces its potential efficiency.
For example, for my ADX decoder, now I am using a double-buffer approach where one buffer in sound ram is being filled with ADX samples by the SH4, while the other buffer in sound ram is being decoded by the ARM.
The ARM will decode the first buffer, then signal that buffer as needing more samples, then swap to the next buffer to decode.
While the ARM is decoding the 2nd buffer, the SH4 will check and see that the first buffer needs more data, and then will fread() and spu_memload() the ADX bit-stream into sound ram, and mark the first buffer as having more data.
Generally, the ARM should never have to wait for input bit-stream to be available, so the decoder can run at full tilt.
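The handshake described above can be sketched in plain C as follows. This is a single-process model for illustration only: `producer_step` plays the SH4 role (fread() plus spu_memload() in the real setup), `consumer_step` plays the ARM role, and "decoding" is replaced by summing bytes. The struct, the state names and both functions are hypothetical, not the actual register layout used in the decoder.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define CHUNK 2048

enum { BUF_NEED = 0, BUF_HAVE = 1 };

/* Stand-in for the flags and the two input buffers in sound RAM. */
typedef struct {
    volatile uint32_t state[2];
    uint8_t data[2][CHUNK];
} shared_t;

void shared_init(shared_t *s)
{
    s->state[0] = BUF_NEED;
    s->state[1] = BUF_NEED;
}

/* Producer (SH4 role): fill every buffer flagged NEED while source
   data remains. Returns the number of buffers filled. */
int producer_step(shared_t *s, const uint8_t *src, size_t *pos, size_t len)
{
    int filled = 0;
    for (int b = 0; b < 2; b++) {
        if (s->state[b] == BUF_NEED && *pos + CHUNK <= len) {
            memcpy(s->data[b], src + *pos, CHUNK);
            *pos += CHUNK;
            s->state[b] = BUF_HAVE; /* publish only after the copy */
            filled++;
        }
    }
    return filled;
}

/* Consumer (ARM role): if the current buffer is ready, process it,
   hand it back as NEED and switch to the other buffer.
   Returns 1 if a buffer was consumed, 0 if it would have to wait. */
int consumer_step(shared_t *s, int *cur, uint32_t *checksum)
{
    int b = *cur;
    if (s->state[b] != BUF_HAVE)
        return 0;
    for (size_t i = 0; i < CHUNK; i++)
        *checksum += s->data[b][i]; /* placeholder for the real decode */
    s->state[b] = BUF_NEED;
    *cur = !b;
    return 1;
}
```

Each buffer has exactly one state at a time, so a buffer is only handed back after it has been fully consumed and only marked ready after it has been fully written. On the real hardware the two sides run concurrently and bus contention still applies while the SH4 is writing.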
Instead, I imagine using your library to perform this task.
When the ADX decoder has finished decoding the current bit-stream packet, it will make a call to fread() to get another ADX bit-stream packet. Since fread() is a remote function, the ARM will have to wait for the SH4 to finish what it is currently executing, notice that the ARM made the request, and then finally execute the fread() and spu_memload() to put the data into sound ram. This means the ARM will be stalled for some time every packet.
To use a double-buffer would require a second thread running on the ARM that is calling fread() while the first thread is decoding the packets. Again, I don't think that is a very efficient way to do things.
Also, what are you doing on the poor ARM that needs a full libc? I've only had to contrive 1 stdlib function to do what I have needed to on the ARM, malloc().
I think that your plan for a new lib sounds very useful and promising.
My question is still: what is the simplest way to get sound output from the ARM to the AICA?
The decoder I am working on uses the ARM to decode ADX samples in sound ram.
What I have right now is a buffer of PCM samples in sound ram.
So far, my aica code is based off aica.c that Dan Potter included with the s3m player source, but no audio is heard, not even distortion or static.
This is the main function that is executed on the ARM:
Code: Select all
int arm_main()
{
    ADX_REG_STAT = 0;
    ADX_REG_SAMP = 0;
    aica_init(); /* Initialize the AICA part of the SPU */
    SAMPLE_BUFFER = malloc( 65535*4 );
    /* Clear the predictor history for both channels */
    prev[0].s1 = prev[0].s2 = prev[1].s1 = prev[1].s2 = 0;
    ADX_ParseHeader( fd );
    /* Tell the SH4 we are ready and that both input buffers need data */
    ADX_REG_STAT |= ADX_STAT_INITIALIZED;
    ADX_REG_STAT |= ADX_STAT_NEED_BUF0;
    ADX_REG_STAT |= ADX_STAT_NEED_BUF1;
    while(!(ADX_REG_STAT & ADX_STAT_DONE))
    {
        switch(CUR_BUF)
        {
            case 0:
                /* Wait until the SH4 has filled buffer 0 */
                while(!(ADX_REG_STAT & ADX_STAT_HAVE_BUF0));
                ADX_DecodeChunk( fd, 2048 );
                ADX_REG_STAT ^= ADX_STAT_NEED_BUF0;
                ADX_REG_STAT |= ADX_STAT_HAVE_BUF0;
                CUR_BUF = !CUR_BUF; /* Swap to the other buffer */
                break;
            case 1:
                /* Wait until the SH4 has filled buffer 1 */
                while(!(ADX_REG_STAT & ADX_STAT_HAVE_BUF1));
                ADX_DecodeChunk( fd, 2048 );
                ADX_REG_STAT ^= ADX_STAT_NEED_BUF1;
                ADX_REG_STAT |= ADX_STAT_HAVE_BUF1;
                CUR_BUF = !CUR_BUF; /* Swap to the other buffer */
                break;
        }
        /* Once the sample buffer is full, start playback and rewind */
        if(SAMPLES==65536)
        {
            aica_play( ADX_Info->channels,
                       SAMPLE_BUFFER,
                       0,
                       0,
                       SAMPLES,
                       ADX_Info->rate,
                       0xFF,
                       0x80,
                       0 );
            SAMPLES=0;
        }
        ADX_REG_SAMP = SAMPLES; /* Report progress to the SH4 */
    }
    return 0;
}
- Ayla
- DC Developer
- Posts: 142
- Joined: Thu Apr 03, 2008 7:01 am
- Has thanked: 0
- Been thanked: 4 times
- Contact:
Re: dreamcast armc cpu uses?
PH3NOM wrote: Hmm I think I understand now, your lib uses the ARM for processing general tasks, without using it for sound output? [...]
Well, this is mostly true. But you seem to forget that the ARM and the SH4/DMA cannot access the sound RAM at the same time, so even with your design the execution is stalled during a transfer. Nonetheless, using a double-buffer solution with my lib wouldn't be as bad as you think. A request coming from the ARM will interrupt the SH4's execution and start transferring data without delay; it won't wait until the SH4 finishes what it is currently executing, as you describe. Furthermore, my lib uses interrupts, so the SH4 doesn't waste cycles polling for incoming events.
Finally, you could still do a design like yours using my lib, you're not obliged to fread() from the ARM. But having things like printf() out of the box is a big plus.
PH3NOM wrote: Also, what are you doing on the poor ARM that needs a full libc? I've only had to contrive 1 stdlib function to do what I have needed to on the ARM, malloc().
I think that your plan for a new lib sounds very useful and promising.
Only the functions you use are compiled in, so the full libc isn't integrated. So if you only use printf() it will include only printf() and the underlying functions like write().
- PH3NOM
- DC Developer
- Posts: 576
- Joined: Fri Jun 18, 2010 9:29 pm
- Has thanked: 0
- Been thanked: 5 times
Re: dreamcast armc cpu uses?
Ayla wrote: Well this is mostly true. But you seem to forget that the ARM and the SH4/DMA cannot access the sound RAM at the same time, so even with your design the execution is stalled during a transfer.
Aah, I had considered that, but was not certain. Thanks for the clarification.
Still, can someone help with how to actually output the sound from the samples being decoded on the ARM?
What I have is the aica_init / aica_play functions from Dan's S3M player, that does not seem to be working...
- Christuserloeser
- Moderator
- Posts: 5948
- Joined: Thu Aug 28, 2003 12:16 am
- Location: DCEvolution.net
- Has thanked: 10 times
- Been thanked: 0
- Contact:
Re: dreamcast armc cpu uses?
Ayla wrote: The DC's arm7 runs at ~2.8 MHz, so the only use it could have is as a module player, or as a controller for the ~5MHz DSP.
That is slow indeed. I wonder if it still could be used to run DrZ80 (from PicoDrive) to emulate the Z80 at a couple of cycles? (which should be sufficient for Mega Drive / Genesis emulation)
Insane homebrew collector.
- Christuserloeser
- Moderator
- Posts: 5948
- Joined: Thu Aug 28, 2003 12:16 am
- Location: DCEvolution.net
- Has thanked: 10 times
- Been thanked: 0
- Contact:
Re: dreamcast armc cpu uses?
Ayla wrote: My lib does not do this, but I designed it to ease development on the ARM. [...]
That sounds pretty exciting btw. I noticed you uploaded an update three months ago. Really looking forward to more news on this.
- MisterDave
- DCEmu Freak
- Posts: 58
- Joined: Mon Apr 08, 2013 1:16 pm
- Has thanked: 0
- Been thanked: 0
Re: dreamcast armc cpu uses?
Exciting stuff. I love the bare-bones approach to programming on the DC's ARM CPU, so it's nice to see examples of it being used going all the way back to Dan Potter's S3M player.