pl_mpegDC ported running but community help needed
- Ian Robinson
- DC Developer
- Posts: 126
- Joined: Mon Mar 11, 2019 7:12 am
- Has thanked: 224 times
- Been thanked: 45 times
pl_mpegDC ported running but community help needed
Name: PL_MPEGDC
Copyright: 7/31/20
Author: Ian micheal + Magnes(Bertholet)
Date: 31/07/23 09:03
Description: Dreamcast preliminary port KallistiOS video PVR without sound
All patents related to MPEG1 and MP2 have expired, so it's completely free now.
We need a proper MPEG API for indie games with a completely free license, so I started porting this. We have it running, but very slowly, of course.
Instead of converting to RGB on the CPU, I had the idea of using the PVR YUV422 format and have it running, but of course colour conversion is still a problem.
Here is the GitHub repository:
https://github.com/ianmicheal/pl_mpegDC
There is more info there, and the bleeding-edge, crazy versions are in this folder:
https://github.com/ianmicheal/pl_mpegDC ... /%E2%80%9C
Working, but very slow: https://streamable.com/vand9u
Working much faster (100 ms a frame faster), but with the wrong colours: https://streamable.com/mzxeyg
Any ideas and help to speed up this full single-file, MIT-licensed library for C/C++ are welcome.
All I can say is: please help.
- Twada
- DC Developer
- Posts: 45
- Joined: Wed Jan 20, 2016 4:55 am
- Has thanked: 20 times
- Been thanked: 56 times
Re: pl_mpegDC ported running but community help needed
I am glad that you are working on this issue!
I was also working on speeding up pl_mpeg.
Here is my half-finished code.
It's video only, but it can play 320x240 videos at over 100% speed. (This sample is 368x208.)
pl_mpeg5.rar
I worked as follows.
1. I call plm_decode_video() every frame instead of using plm_decode().
TODO: The high-level functions should be reworked to have reasonable wait times.
2. It uses the Dreamcast YUV converter.
That's why the video is decoded block by block. A reference frame is created only for I-frames and P-frames. (lines 2156-2241 of pl_mpeg.c)
3. I replaced simple and beautiful code with conditionals.
(lines 2430-2936 of pl_mpeg.c) This gave a significant speedup, achieved by referencing Berkeley's MPEG player.
I was unable to do:
1. Proper speed control.
2. Asynchronous processing using threads.
3. Sound.
I need help too. The Dreamcast community will be delighted when the MPEG player is complete!
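On the open item "proper speed control": one possible approach, sketched here under assumptions and not taken from the attached code, is to compute when each frame is due and wait out the difference. The helper name frame_wait_ms is hypothetical. On the Dreamcast the frame rate would presumably come from plm_get_framerate(), the elapsed time from timer_ms_gettime64(), and the wait itself could be a thd_sleep() call.

```c
#include <stdint.h>

/* Hypothetical helper: how many milliseconds to wait before presenting
 * frame number frame_index, given the stream frame rate and the time
 * elapsed since playback started. Returns 0 when the frame is already
 * due or late, so a slow decoder never waits. */
static uint32_t frame_wait_ms(uint32_t frame_index, double fps, uint64_t elapsed_ms)
{
    /* Presentation time of this frame, in ms from the start of playback. */
    uint64_t due_ms = (uint64_t)((double)frame_index * 1000.0 / fps);

    return due_ms > elapsed_ms ? (uint32_t)(due_ms - elapsed_ms) : 0;
}
```

Because the wait is derived from the frame index and not from the previous frame's timestamp, timing errors do not accumulate over a long clip.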
- These users thanked the author Twada for the post (total 2):
- Ian Robinson • |darc|
- Ian Robinson
- DC Developer
- Posts: 126
- Joined: Mon Mar 11, 2019 7:12 am
- Has thanked: 224 times
- Been thanked: 45 times
Re: pl_mpegDC ported running but community help needed
Twada wrote: ↑Fri Aug 11, 2023 3:30 am I am glad that you are working on this issue! I was also working on speeding up pl_mpeg. Here is my half-finished code.
Wonderful work! I hope we can combine our efforts to get this done. I am reading through your changes; they are very smart, and your code is always so nice and well done. You have done a lot of work, thanks for sharing it. It runs great; I just built it from source. A major problem on my side is that the RGB conversion on the CPU is super slow, 170+ ms a frame. I wanted to do what you are doing, but only got as far as green and pink colours on the display.
- lerabot
- Insane DCEmu
- Posts: 134
- Joined: Sun Nov 01, 2015 8:25 pm
- Has thanked: 2 times
- Been thanked: 19 times
Re: pl_mpegDC ported running but community help needed
Thank you so much to both of you!
This is great work.
It would be amazing if we could get closer to 640x480, but I don't know if this is possible.
- These users thanked the author lerabot for the post (total 2):
- Ian Robinson • Twada
- Ian Robinson
- DC Developer
- Posts: 126
- Joined: Mon Mar 11, 2019 7:12 am
- Has thanked: 224 times
- Been thanked: 45 times
Re: pl_mpegDC ported running but community help needed
Twada wrote: ↑Fri Aug 11, 2023 3:30 am 2. Utilizes the Dreamcast YUV converter. That's why we're decoding the video block by block.
Could you explain how to convert planar YUV 4:2:0 to packed YUV 4:2:2? I knew it was possible, but I just could not get it done, and I would like to know how this works:
Code: Select all
void app_on_video(plm_t *mpeg, plm_frame_t *frame, void *user)
{
    unsigned int *dest = (unsigned int *)disp_tex;
    unsigned int *src;
    volatile unsigned int *d = (volatile unsigned int *)0xa05f8148;
    volatile unsigned int *cfg = (volatile unsigned int *)0xa05f814c;
    volatile unsigned int *stride_reg = (volatile unsigned int *)0xa05f80e4;
    int stride_value;
    int stride = 0;
    int x, y, w, h, i;

    /* Check the frame before dereferencing it. */
    if (!frame)
        return;
    src = (unsigned int *)frame->display;

    /* Frame size in 16x16 macroblocks. */
    w = frame->width >> 4;
    h = frame->height >> 4;
    stride_value = (w >> 1); /* 16 pixel / 2 */

    /* Set Stride value. */
    *stride_reg &= 0xffffffe0;
    *stride_reg |= stride_value & 0x01f;

    /* Set SQ to YUV converter. */
    *d = ((unsigned int)dest) & 0xffffff;
    *cfg = 0x00000f1f;
    x = *cfg; /* read back once */

    // QACR0 = ((((unsigned int)0x10800000) >> 26) << 2) & 0x1c;
    // QACR1 = ((((unsigned int)0x10800000) >> 26) << 2) & 0x1c;
    for (y = 0; y < h; y++)
    {
        /* One macroblock is 384 bytes, i.e. 96 words of src. */
        for (x = 0; x < w; x++, src += 96)
        {
            sq_cpy((void *)0x10800000, (void *)src, 384);
        }
        if (!stride)
        {
            /* Send dummy mb */
            for (i = 0; i < 32 - w; i++)
            {
                sq_set((void *)0x10800000, 0, 384);
            }
        }
    }
    /* Fill the remaining macroblock rows of the texture with dummy data. */
    for (i = 0; i < 16 - h; i++)
    {
        if (!stride)
            sq_set((void *)0x10800000, 0, 384 * 32);
        else
            sq_set((void *)0x10800000, 0, 384 * w);
    }
}
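I am struggling to understand how this is possible. I know something like this had been done before. I see you are saying it utilizes the Dreamcast YUV converter, and that this is why the video is decoded block by block, with a reference frame created only for I-frames and P-frames (lines 2156-2241 of pl_mpeg.c), but I don't get how it works, to be honest.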
- Twada
- DC Developer
- Posts: 45
- Joined: Wed Jan 20, 2016 4:55 am
- Has thanked: 20 times
- Been thanked: 56 times
Re: pl_mpegDC ported running but community help needed
Ian Robinson wrote: ↑Fri Aug 11, 2023 12:01 pm I see you are saying it utilizes the Dreamcast YUV converter, and that this is why the video is decoded block by block, but I don't get how it works, to be honest.
Ah, I didn't fix the parts I hardcoded. I am embarrassed.
The YUV converter has two registers.
Code: Select all
volatile unsigned int *d = (volatile unsigned int *)0xa05f8148;
volatile unsigned int *cfg = (volatile unsigned int *)0xa05f814c;
/* Set SQ to YUV converter. */
*d = ((unsigned int)dest) & 0xffffff;
*cfg = 0x00000f1f;
x = *cfg; /* read back once */
In KOS they are PVR_YUV_ADDR (0x0148) and PVR_YUV_CFG_1 (0x014c).
First, set the output destination VRAM address in PVR_YUV_ADDR.
Code: Select all
PVR_SET(PVR_YUV_ADDR, (((unsigned int)dest) & 0xffffff));
Next, set the size and format of the data in PVR_YUV_CFG_1.
Code: Select all
PVR_SET(PVR_YUV_CFG_1, ((height << 8) | width));
height and width are the size of the output texture.
You can choose from 32, 64, 128, 256, 512, 1024; divide by 16 and subtract 1.
The actual values to set are 1, 3, 7, 15, 31, 63.
The sample sets the height to 256 and the width to 512, so the value is 0x00000f1f.
The width can be specified more finely when using stride textures.
For example, 320 is 320/16-1=19.
I am not using stride textures this time, so I am transferring dummy blocks.
Code: Select all
/* Send dummy mb */
for (i = 0; i < 32 - w; i++)
{
    sq_set((void *)0x10800000, 0, 384);
}
After setting the two registers, transfer the data to the YUV converter (0x10800000).
The data is sent block by block. Each block is 64 bytes of U data, 64 bytes of V data, and 256 bytes of Y data, 384 bytes in total. Do not change the order.
I'm modding pl_mpeg to create this 384-byte format at the decoding stage. Since the data is contiguous, it speeds things up quite a bit.
However, I-frames and P-frames still need full-frame data, because P-frames and B-frames refer to them.
So only for those frames do we also create the reference data. I want to make this smarter.
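The register arithmetic in this post can be checked on any machine. The following is a minimal sketch, not code from pl_mpeg5; the helper names yuv_cfg_size, dummy_blocks_per_row and dummy_rows are hypothetical. It assumes the rules as stated above: each dimension is divided by 16 and reduced by 1, and without stride textures every macroblock row is padded to the full texture width.

```c
#include <stdint.h>

#define MB_SIZE 16 /* macroblocks are 16x16 pixels */

/* Size field for PVR_YUV_CFG_1: ((height/16 - 1) << 8) | (width/16 - 1). */
static uint32_t yuv_cfg_size(uint32_t tex_width, uint32_t tex_height)
{
    return (((tex_height / MB_SIZE) - 1) << 8) | ((tex_width / MB_SIZE) - 1);
}

/* Dummy macroblocks to send after each row of real ones (no stride texture). */
static uint32_t dummy_blocks_per_row(uint32_t tex_width, uint32_t frame_width)
{
    return (tex_width / MB_SIZE) - (frame_width / MB_SIZE);
}

/* Whole rows of dummy macroblocks to send below the last real row. */
static uint32_t dummy_rows(uint32_t tex_height, uint32_t frame_height)
{
    return (tex_height / MB_SIZE) - (frame_height / MB_SIZE);
}
```

For the 368x208 sample in a 512x256 texture this gives the 0x0f1f configuration value, 9 dummy blocks per row, and 3 dummy rows, which matches the 32 - w and 16 - h loop bounds in the sample code.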
- These users thanked the author Twada for the post:
- Ian Robinson
- Ian Robinson
- DC Developer
- Posts: 126
- Joined: Mon Mar 11, 2019 7:12 am
- Has thanked: 224 times
- Been thanked: 45 times
Re: pl_mpegDC ported running but community help needed
Twada wrote: ↑Fri Aug 11, 2023 4:50 pm Ah, I didn't fix that I hardcoded. I am embarrassed.
No need to be embarrassed. It works, and works well. Thanks for the info; it is a very interesting solution to me.
- |darc|
- DCEmu Webmaster
- Posts: 16389
- Joined: Wed Mar 14, 2001 6:00 pm
- Location: New Orleans, LA
- Has thanked: 121 times
- Been thanked: 92 times
- Contact:
Re: pl_mpegDC ported running but community help needed
Moving to Programming Discussion as this has quickly shifted from an idea to reality!
Thanks everyone!
- These users thanked the author |darc| for the post (total 2):
- Ian Robinson • Twada
It's thinking...
- Ian Robinson
- DC Developer
- Posts: 126
- Joined: Mon Mar 11, 2019 7:12 am
- Has thanked: 224 times
- Been thanked: 45 times
Re: pl_mpegDC ported running but community help needed
Twada wrote: ↑Fri Aug 11, 2023 4:50 pm The YUV converter has two registers.
I have also been working on dreamroq. I ported it up to KOS 2.0 and fixed threading and other things. https://github.com/ianmicheal/DREAMROQ-WORKING-SOUND- now works, but the sound lags a bit. I do wonder if we could use your YUV converter idea on it as well.
https://github.com/ianmicheal/DREAMROQ- ... lib.c#L103 Let me know if you think that's possible. We might be able to use the sound part of lib dcmc for this MPEG version.
- Twada
- DC Developer
- Posts: 45
- Joined: Wed Jan 20, 2016 4:55 am
- Has thanked: 20 times
- Been thanked: 56 times
Re: pl_mpegDC ported running but community help needed
Ian Robinson wrote: ↑Tue Aug 22, 2023 7:48 am I have been also working on dreamroq [...] i do wonder if we could use your idea of the YUV converter on it as well..
The ROQ format is also very interesting. Thank you for your work!
I'm not sure about the ROQ format, is it similar to VQ textures?
If so, you might be able to use the YUV texture format as well.
I was able to get the sound working thanks to dreamcast.wiki!
However, I found plm_decode_audio() to be ridiculously slow. It's twice as slow as the accelerated plm_decode_video().
It looks like I'll have to work on speeding up the sound from now on.
I might try the dcmc library after that...
- These users thanked the author Twada for the post:
- Ian Robinson
- Ian Robinson
- DC Developer
- Posts: 126
- Joined: Mon Mar 11, 2019 7:12 am
- Has thanked: 224 times
- Been thanked: 45 times
Re: pl_mpegDC ported running but community help needed
Twada wrote: ↑Fri Aug 25, 2023 5:29 pm However, I found plm_decode_audio() to be ridiculously slow. It's twice as slow as the accelerated plm_decode_video().
Not only that, the AICA is very slow too. You can see this in TapamN's benchmark post: https://dcemulation.org/phpBB/viewtopic ... 8#p1058848
- BB Hood
- DCEmu Banned
- Posts: 189
- Joined: Fri Mar 30, 2007 12:09 am
- Has thanked: 42 times
- Been thanked: 10 times
Re: pl_mpegDC ported running but community help needed
First off, awesome stuff!! I'm going to look into creating an example that shows how to use the TA to convert YUV420 => YUV422 once I fully get the grasp of it. After playing with your code a bit, I rewrote some video-related functions in your main.c file. I cut some stuff that doesn't seem to make a difference, moved other things so they are only done once, and replaced your hard-coded values and magic numbers with PVR_* equivalents and #defines. There is one thing I'm stumped on: where does the 96 come from in for (x = 0; x < w; x++, src += 96)? Do you use GitHub or GitLab at all? It would be a lot easier to share code back and forth. Oh, and join the Discord :^p https://discord.gg/NjwBRKbk
Code: Select all
/* png example for KOS 1.1.x
* Jeffrey McBeth / Morphogenesis
* <mcbeth@morphogenesis.2y.net>
*
* Heavily borrowed from from 2-D example
* AndrewK / Napalm 2001
* <andrewk@napalm-x.com>
*/
#include <kos.h>
#include "perfctr.h"
// #define PL_MPEG_IMPLEMENTATION
#include "pl_mpeg.h"
#define min(a, b) (((a) < (b)) ? (a) : (b))
plm_t *plm;
/* textures */
pvr_ptr_t disp_tex;
snd_stream_hnd_t snd_hnd;
__attribute__((aligned(32))) unsigned char snd_buf[65536 + 16384];
// Output texture width and height initial values
// You can choose from 32, 64, 128, 256, 512, 1024
#define PVR_TEXTURE_WIDTH 512
#define PVR_TEXTURE_HEIGHT 256
pvr_poly_hdr_t hdr;
pvr_vertex_t vert[4];
void setup_graphics()
{
pvr_poly_cxt_t cxt;
pvr_poly_cxt_txr(&cxt, PVR_LIST_OP_POLY, PVR_TXRFMT_YUV422 | PVR_TXRFMT_NONTWIDDLED, PVR_TEXTURE_WIDTH, PVR_TEXTURE_HEIGHT, disp_tex, PVR_FILTER_BILINEAR);
pvr_poly_compile(&hdr, &cxt);
hdr.mode3 |= PVR_TXRFMT_STRIDE; // Was 0x02000000; which had one too many zeros. Should be 0x0200000
vert[0].z = vert[1].z = vert[2].z = vert[3].z = 1.0f;
vert[0].argb = vert[1].argb = vert[2].argb = vert[3].argb = PVR_PACK_COLOR(1.0f, 1.0f, 1.0f, 1.0f);
vert[0].oargb = vert[1].oargb = vert[2].oargb = vert[3].oargb = 0;
vert[0].flags = vert[1].flags = vert[2].flags = PVR_CMD_VERTEX;
vert[3].flags = PVR_CMD_VERTEX_EOL;
vert[0].x = 1;
vert[0].y = 1;
vert[0].u = 0;
vert[0].v = 0;
vert[1].x = 640;
vert[1].y = 1;
vert[1].u = 0.71875;
vert[1].v = 0.0;
vert[2].x = 1;
vert[2].y = 480;
vert[2].u = 0;
vert[2].v = 0.8125;
vert[3].x = 640;
vert[3].y = 480;
vert[3].u = 0.71875;
vert[3].v = 0.8125;
// Point to the dest texture in the PVR
unsigned int *dest = (unsigned int *)disp_tex;
/* Set SQ to YUV converter. */
PVR_SET(PVR_YUV_ADDR, (((unsigned int)dest) & 0xffffff));
// Divide texture width and texture height by 16 and subtract 1.
// The actual values to set are 1, 3, 7, 15, 31, 63.
PVR_SET(PVR_YUV_CFG_1, (((PVR_TEXTURE_HEIGHT / 16) - 1) << 8) | ((PVR_TEXTURE_WIDTH / 16) - 1));
PVR_GET(PVR_YUV_CFG_1);
}
void app_on_video(plm_t *mpeg, plm_frame_t *frame, void *user)
{
unsigned int *src;
int x, y, w, h;
int stride = 0;
/* Check the frame before dereferencing it. */
if (!frame)
return;
src = (unsigned int *)frame->display;
/* Set Stride value. */
// This can be moved outside this function too and only needs to be executed once.
//https://multimedia.cx/eggs/roq-on-dreamcast/#comment-167893
//PVR_SET(PVR_TEXTURE_MODULO, 640/32); // -1 not needed ???
//So if you want a 640*480 texture, you need to set the standard power-of-two-size to be a larger than the real size
//(so for a 640*480 real size, you set it to 1024*512) then set the stride bit (PVR_TXRFMT_STRIDE) on the header before you submit it.
if(stride)
PVR_SET(PVR_TEXTURE_MODULO, frame->width/32); // -1 needed ??? Not according to your *stride_reg |= stride_value & 0x01f;
/* set frame size. */
w = frame->width >> 4;
h = frame->height >> 4;
if (!stride) {
for (y = 0; y < h; y++) {
for (x = 0; x < w; x++, src += 96) { // += 384/4(size of int) = 96
sq_cpy((void *)0x10800000, (void *)src, 384);
}
// Send dummy mb
sq_set((void *)0x10800000, 0, 384 * (32 - w));
}
} else {
for (y = 0; y < h; y++) {
for (x = 0; x < w; x++, src += 96) { // += 384/4(size of int) = 96
sq_cpy((void *)0x10800000, (void *)src, 384);
}
}
}
}
void app_on_audio(plm_t *mpeg, plm_samples_t *samples, void *user)
{
int size = sizeof(float) * samples->count * 2;
// SDL_QueueAudio(self->audio_device, samples->interleaved, size);
// snd_sh4_to_aica_stop();
// snd_sh4_to_aica((void *)samples->interleaved, 200);
// snd_sh4_to_aica_start();
}
void *sound_callback(snd_stream_hnd_t hnd, int size, int *size_out)
{
plm_samples_t *sample = plm_decode_audio(plm);
// if (sample == NULL)
// {
// return NULL;
// }
// if(size > (PLM_AUDIO_SAMPLES_PER_FRAME * 2))
// {
// size = (PLM_AUDIO_SAMPLES_PER_FRAME * 2);
// }
*size_out = size;
//printf("%d::%d ", size, *size_out);
return (void *)sample->interleaved;
}
/* romdisk */
extern uint8 romdisk_boot[];
KOS_INIT_ROMDISK(romdisk_boot);
int main(void)
{
int done = 0;
double elapsed_time = 0.0;
double current_time = 0.0;
double last_time = 0.0;
PMCR_Init(1, PMCR_ELAPSED_TIME_MODE, 2);
/* init kos */
pvr_init_defaults();
disp_tex = pvr_mem_malloc(PVR_TEXTURE_WIDTH * PVR_TEXTURE_HEIGHT * 2);
setup_graphics();
plm = plm_create_with_filename("/rd/sample.mpg");
if (plm == 0)
return 0;
// plm_set_video_decode_callback(plm, app_on_video, 0);
// plm_set_audio_decode_callback(plm, app_on_audio, 0);
// plm_set_loop(plm, TRUE);
plm_set_audio_enabled(plm, 1);
last_time = (double)timer_ms_gettime64() / 1000.0;
// snd_stream_init();
// snd_hnd = snd_stream_alloc(sound_callback, PLM_AUDIO_SAMPLES_PER_FRAME << 3);
// snd_stream_reinit(snd_hnd, sound_callback);
// snd_stream_volume(snd_hnd, 0xff);
// snd_stream_queue_enable(snd_hnd);
// snd_stream_start(snd_hnd, 44100, 1);
// snd_stream_queue_go(snd_hnd);
/* keep drawing frames until start is pressed */
while (!done)
{
MAPLE_FOREACH_BEGIN(MAPLE_FUNC_CONTROLLER, cont_state_t, st)
if (st->buttons & CONT_START)
done = 1;
MAPLE_FOREACH_END()
pvr_wait_ready();
// plm_decode(plm, elapsed_time);
// if (plm_has_ended(plm))
// {
// plm_destroy(plm);
// break;
// }
// snd_sh4_to_aica_start();
plm_frame_t *frame = plm_decode_video(plm);
// plm_samples_t *sample = plm_decode_audio(plm);
if (!frame)
{
break;
}
app_on_video(plm, frame, 0);
// plm_decode_audio(plm);
// Decode
current_time = (double)timer_ms_gettime64() / 1000.0;
elapsed_time = min(current_time - last_time, 1.0 / 30.0);
last_time = current_time;
pvr_scene_begin();
pvr_list_begin(PVR_LIST_OP_POLY);
pvr_prim(&hdr, sizeof(hdr));
pvr_prim(&vert[0], sizeof(pvr_vertex_t));
pvr_prim(&vert[1], sizeof(pvr_vertex_t));
pvr_prim(&vert[2], sizeof(pvr_vertex_t));
pvr_prim(&vert[3], sizeof(pvr_vertex_t));
pvr_list_finish();
pvr_scene_finish();
}
pvr_mem_free(disp_tex);
// snd_mem_free();
// snd_mem_shutdown();
// snd_shutdown();
return 0;
}
Last edited by BB Hood on Sun Sep 03, 2023 12:23 am, edited 2 times in total.
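The texture coordinates 0.71875 and 0.8125 in setup_graphics() appear to be the 368x208 sample frame divided by the 512x256 texture. If that reading is right, they could be computed from the frame size and not hard-coded. A minimal sketch, with a hypothetical helper name tex_coord_max:

```c
/* Fraction of a power-of-two texture that the video frame covers along
 * one axis. Intended as the u or v value for the right/bottom vertices. */
static float tex_coord_max(unsigned frame_size, unsigned texture_size)
{
    return (float)frame_size / (float)texture_size;
}
```

With this, vert[1].u would be tex_coord_max(frame->width, PVR_TEXTURE_WIDTH) and vert[2].v would be tex_coord_max(frame->height, PVR_TEXTURE_HEIGHT), so other video sizes would not need new constants.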
- These users thanked the author BB Hood for the post:
- Ian Robinson
- Twada
- DC Developer
- Posts: 45
- Joined: Wed Jan 20, 2016 4:55 am
- Has thanked: 20 times
- Been thanked: 56 times
Re: pl_mpegDC ported running but community help needed
BB Hood wrote: ↑Sat Sep 02, 2023 10:46 pm There is one thing Im stomped on. What is the 96 coming from in for (x = 0; x < w; x++, src += 96). Do you use GitHub or GitLab at all?
Thank you for considering creating a sample!
YUV data must be in 16x16 blocks. The arrangement is horizontal.
In the case of YUV420, U data and V data are 64 bytes, Y data is 256 bytes, for a total of 384 bytes. Please do not change this order.
Since I am using a uint32 pointer, divide by 4 to get 96.
Sorry, I don't use Git or Discord at the moment. It seems convenient but seems difficult...
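The arithmetic behind the 96 can be written out as constants. This is only an illustrative sketch; the enum names and mb_word_offset are hypothetical and do not appear in pl_mpeg.

```c
#include <stddef.h>
#include <stdint.h>

enum {
    MB_U_BYTES = 8 * 8,   /* 8x8 U samples for one 16x16 macroblock */
    MB_V_BYTES = 8 * 8,   /* 8x8 V samples */
    MB_Y_BYTES = 16 * 16, /* 16x16 Y samples */
    MB_BYTES   = MB_U_BYTES + MB_V_BYTES + MB_Y_BYTES
};

/* Offset of macroblock mb_index from the start of the frame buffer,
 * counted in 32-bit words because src is an unsigned int pointer. */
static size_t mb_word_offset(size_t mb_index)
{
    return mb_index * (MB_BYTES / sizeof(uint32_t));
}
```

So src += 96 advances exactly one 384-byte macroblock per loop iteration.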
- BB Hood
- DCEmu Banned
- Posts: 189
- Joined: Fri Mar 30, 2007 12:09 am
- Has thanked: 42 times
- Been thanked: 10 times
Re: pl_mpegDC ported running but community help needed
Thanks!! I just found this post (https://dcemulation.org/phpBB/viewtopic ... 7#p1027297) by Phenom, and it shows a way to do this with DMA. KOS is currently missing some of the functionality, so I plan to add it soon.
- BB Hood
- DCEmu Banned
- Posts: 189
- Joined: Fri Mar 30, 2007 12:09 am
- Has thanked: 42 times
- Been thanked: 10 times
Re: pl_mpegDC ported running but community help needed
Code: Select all
static void convert() {
    int i, j, index, x_blk, y_blk;
    unsigned char u_block[64] __attribute__((aligned(32)));
    unsigned char v_block[64] __attribute__((aligned(32)));
    unsigned char y_block[256] __attribute__((aligned(32)));
    for (y_blk = 0; y_blk < PVR_TEXTURE_HEIGHT; y_blk += 16) {
        for (x_blk = 0; x_blk < PVR_TEXTURE_WIDTH; x_blk += 16) {
            // Extract U
            for (j = 0; j < 8; ++j) {
                for (i = 0; i < 8; ++i) {
                    index = (y_blk / 2 + j) * (PVR_TEXTURE_WIDTH / 2) + (x_blk / 2 + i);
                    u_block[j * 8 + i] = u_plane[index];
                }
            }
            // Extract V
            for (j = 0; j < 8; ++j) {
                for (i = 0; i < 8; ++i) {
                    index = (y_blk / 2 + j) * (PVR_TEXTURE_WIDTH / 2) + (x_blk / 2 + i);
                    v_block[j * 8 + i] = v_plane[index];
                }
            }
            // Extract Y
            for (j = 0; j < 16; ++j) {
                for (i = 0; i < 16; ++i) {
                    index = (y_blk + j) * PVR_TEXTURE_WIDTH + (x_blk + i);
                    y_block[j * 16 + i] = y_plane[index];
                }
            }
            sq_cpy((void *)PVR_TA_YUV_CONV, (void *)u_block, 64);
            sq_cpy((void *)PVR_TA_YUV_CONV, (void *)v_block, 64);
            sq_cpy((void *)PVR_TA_YUV_CONV, (void *)y_block, 256);
        }
        // Send dummy mb
        //sq_set((void *)PVR_TA_YUV_CONV, 0, 384 * (32 - (PVR_TEXTURE_WIDTH >> 4)));
    }
}
- Twada
- DC Developer
- Posts: 45
- Joined: Wed Jan 20, 2016 4:55 am
- Has thanked: 20 times
- Been thanked: 56 times
Re: pl_mpegDC ported running but community help needed
Ah, I didn't tell you.
The Y data is 16x16, 256 bytes, and must follow the macroblock format.
In other words, it must be laid out as four 8x8 blocks in a row. The orientation is horizontal.
I created a variable k and rewrote it as follows.
Code: Select all
for (k = 0; k < 4; ++k)
{
    for (j = 0; j < 8; ++j)
    {
        for (i = 0; i < 8; ++i)
        {
            index = (y_blk + j + (k / 2 * 8)) * PVR_TEXTURE_WIDTH + (x_blk + i) + (k % 2 * 8);
            y_block[k * 64 + j * 8 + i] = y_plane[index];
        }
    }
}
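Putting the whole layout from this thread together (64 bytes of U, 64 bytes of V, then Y as four 8x8 blocks), a standalone packer for one macroblock might look like the sketch below. It is written under assumptions and is not the code in pl_mpeg.c: the function names are hypothetical, the planes are assumed to be tightly packed planar 4:2:0, and width is assumed to be a multiple of 16.

```c
#include <stddef.h>
#include <stdint.h>

/* Source index in the Y plane for sample (i, j) of 8x8 sub-block k
 * (0 = top-left, 1 = top-right, 2 = bottom-left, 3 = bottom-right)
 * of macroblock (mb_x, mb_y). */
static size_t luma_src_index(int width, int mb_x, int mb_y, int k, int j, int i)
{
    return (size_t)(mb_y * 16 + (k / 2) * 8 + j) * (size_t)width
         + (size_t)(mb_x * 16 + (k % 2) * 8 + i);
}

/* Source index in the U or V plane (half resolution in both directions). */
static size_t chroma_src_index(int width, int mb_x, int mb_y, int j, int i)
{
    return (size_t)(mb_y * 8 + j) * (size_t)(width / 2) + (size_t)(mb_x * 8 + i);
}

/* Fill out[384] with one macroblock: U, then V, then the four Y blocks. */
static void pack_macroblock(const uint8_t *y_plane, const uint8_t *u_plane,
                            const uint8_t *v_plane, int width,
                            int mb_x, int mb_y, uint8_t out[384])
{
    int i, j, k;

    for (j = 0; j < 8; ++j) {
        for (i = 0; i < 8; ++i) {
            size_t c = chroma_src_index(width, mb_x, mb_y, j, i);
            out[j * 8 + i] = u_plane[c];      /* bytes 0..63 */
            out[64 + j * 8 + i] = v_plane[c]; /* bytes 64..127 */
        }
    }
    for (k = 0; k < 4; ++k)
        for (j = 0; j < 8; ++j)
            for (i = 0; i < 8; ++i)           /* bytes 128..383 */
                out[128 + k * 64 + j * 8 + i] =
                    y_plane[luma_src_index(width, mb_x, mb_y, k, j, i)];
}
```

On the Dreamcast the filled buffer would then go to the converter with a single 384-byte sq_cpy() in place of three separate copies, provided the buffer is 32-byte aligned like the blocks in the earlier convert() code.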
Ian Robinson wrote: ↑Sat Aug 26, 2023 5:28 am Not only that the AICA is very slow you can see this with TapamN's post benchmark https://dcemulation.org/phpBB/viewtopic ... 8#p1058848
And decoding MP2 audio is not fast at all. The decoding itself is slow.
I'd like to devise a better data arrangement by referring to libmp3, but I'm at my limit. (lines 3759-3781 of pl_mpeg.c)
This is a hard-coded version that just makes sound. I need help…
- These users thanked the author Twada for the post (total 2):
- Ian Robinson • BB Hood
- BB Hood
- DCEmu Banned
- Posts: 189
- Joined: Fri Mar 30, 2007 12:09 am
- Has thanked: 42 times
- Been thanked: 10 times
Re: pl_mpegDC ported running but community help needed
<333333 Thank you! That fixed it alright. Im gonna take a look at your code and see what I can do.
- These users thanked the author BB Hood for the post:
- Ian Robinson
- BB Hood
- DCEmu Banned
- Posts: 189
- Joined: Fri Mar 30, 2007 12:09 am
- Has thanked: 42 times
- Been thanked: 10 times
Re: pl_mpegDC ported running but community help needed
It plays pretty well running on my Dreamcast. It's late here, but I took a quick look at your sound code in main.c. I think I can help and will share code with you tomorrow. Unfortunately, I have no input on the decoding of the sound data itself. Sorry.
- These users thanked the author BB Hood for the post (total 2):
- Twada • Ian Robinson
- BB Hood
- DCEmu Banned
- Posts: 189
- Joined: Fri Mar 30, 2007 12:09 am
- Has thanked: 42 times
- Been thanked: 10 times
Re: pl_mpegDC ported running but community help needed
Twada, sorry. It's taking longer than I thought. The code I wrote took stress off the video, but the sound came out awful. Ultimately, what you want to do is run an audio thread that calls snd_stream_poll, instead of calling it every frame:
Code: Select all
void* snd_thread() {
    while(audio_status != AUDIO_STATUS_DONE) {
        snd_stream_poll(snd_hnd);
        thd_sleep(20);
    }
    return NULL;
}
and call your decode function every frame
Code: Select all
plm_samples_t *sample = plm_decode_audio(plm);
and store those results in a ring buffer that the stream callback can then just read from:
Code: Select all
sound_callback(snd_stream_hnd_t hnd, int size, int *size_out)
You will need a mutex around the ring buffer so the audio thread and the main thread won't touch it at the same time.
I did the above for your code, but I guess my implementation was bad. I copied what I did in my dreamroq repo and it's just not working out. I will need to dig deeper for a better implementation. Again, my apologies for not finding a working solution for you.
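The ring buffer described in this post could look roughly like the sketch below. It is not the code from the dreamroq repo; the type and function names are hypothetical. It assumes the decoder produces float samples (as the sizeof(float) in app_on_audio suggests) and that the stream wants signed 16-bit PCM. Locking is left to the caller, so each call would be wrapped in the mutex mentioned above.

```c
#include <stddef.h>
#include <stdint.h>

#define RING_CAPACITY 8192 /* samples; hypothetical size, must be a power of two */

typedef struct {
    int16_t data[RING_CAPACITY];
    size_t head; /* total samples written so far */
    size_t tail; /* total samples read so far */
} sample_ring_t;

static size_t ring_available(const sample_ring_t *r)
{
    return r->head - r->tail;
}

/* Main thread: convert floats in [-1, 1] to 16-bit and append them.
 * Returns how many samples fit. Call with the mutex held. */
static size_t ring_write_float(sample_ring_t *r, const float *src, size_t count)
{
    size_t space = RING_CAPACITY - ring_available(r);
    size_t n = count < space ? count : space;

    for (size_t i = 0; i < n; ++i) {
        float s = src[i];
        if (s > 1.0f) s = 1.0f;
        if (s < -1.0f) s = -1.0f;
        r->data[(r->head + i) & (RING_CAPACITY - 1)] = (int16_t)(s * 32767.0f);
    }
    r->head += n;
    return n;
}

/* Audio callback: copy out up to count samples and pad the rest with
 * silence. Returns how many real samples were read. Call with the mutex held. */
static size_t ring_read(sample_ring_t *r, int16_t *dst, size_t count)
{
    size_t avail = ring_available(r);
    size_t n = count < avail ? count : avail;

    for (size_t i = 0; i < n; ++i)
        dst[i] = r->data[(r->tail + i) & (RING_CAPACITY - 1)];
    for (size_t i = n; i < count; ++i)
        dst[i] = 0;
    r->tail += n;
    return n;
}
```

Padding with silence means an underrun produces a gap and not stale or garbage audio, which makes it easier to hear whether the decoder is keeping up.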
- These users thanked the author BB Hood for the post:
- Ian Robinson
- Twada
- DC Developer
- Posts: 45
- Joined: Wed Jan 20, 2016 4:55 am
- Has thanked: 20 times
- Been thanked: 56 times
Re: pl_mpegDC ported running but community help needed
Thank you for listening to my unreasonable request.
I'm not familiar with stream-related code, so this is a very valuable hint!
Again, we need to speed up audio decoding. I'll try again after I cool my head.
- These users thanked the author Twada for the post (total 2):
- Ian Robinson • BB Hood