pvr_txr_load question

patbier
DC Developer
Posts: 152
Joined: Fri Aug 29, 2003 1:25 am
Has thanked: 0
Been thanked: 0

pvr_txr_load question

Post by patbier »

Hello,

I have a question about pvr_txr_load or equivalent.

I prepare a texture in RAM. The buffer is 1024*512, but I only need to display the first 640*480 pixels of it.

Then I send the texture to the PVR using pvr_txr_load.

Code:

pvr_txr_load(buffer, texture, 1024*512*2);

It works very well, but the problem is that pvr_txr_load is a bit too slow.

Only 640*480 of this texture is needed, so part of the 1024*512 buffer is blank.

So I am trying to copy the 1024*512 buffer into 2 textures: a 512*512 one and a 128*512 one, which should be quicker than a 1024*512 copy.

I would then display the 512*512 and the 128*512 textures side by side to cover the whole screen: 640*512.

I am looking for a way to copy the relevant part of the buffer into the 512*512 texture, and then the relevant part of the buffer into the 128*512 texture.

Do you have any ideas?

PS: I can't create two buffers at the beginning; the 1024*512 buffer is the output of a video decoder.
Alice Dreams Tournament Dreamcast fans: http://www.facebook.com/alicedreamst
In August 2015, we had to change "Dynamite Dreams" name to "Alice Dreams Tournament"
BlueCrab
The Crabby Overlord
Posts: 5652
Joined: Mon May 27, 2002 11:31 am
Location: Sailing the Skies of Arcadia
Has thanked: 9 times
Been thanked: 69 times

Re: pvr_txr_load question

Post by BlueCrab »

pvr_ptr_t is simply a void*. You can cast it to another type and use pointer arithmetic to move around inside the buffer if you need to. Keep in mind that you're still going to have to deal with the fact that the buffer is 1024 pixels wide, so it's not going to be quite as easy as you might like.

One thing to note is that pvr_txr_load is basically a memcpy (with the requirement that the length of the buffer copied is a multiple of 32 bytes). The easiest thing you could do is just to copy 1024 * 480, but that won't help you much. You could also just make a loop that does 480 copies of size 512 * 2 bytes, and use pointer arithmetic to move through the two buffers. It's arguable whether that will help all that much.

The final thing you could look into is non-blocking texture DMA, which would allow you to be doing useful work while the texture copy is in progress.
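
A minimal sketch of the row-by-row copy with pointer arithmetic described above. It is plain C that runs on any host: memcpy stands in for the actual transfer to VRAM, and copy_rect16 is a hypothetical helper name, not a KOS function. The assert only guards against reading past the end of a source row.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy a w x h rectangle of 16bpp pixels, starting at column x0 of a
   source whose rows are src_stride pixels apart, into a destination
   whose rows are dst_stride pixels apart. */
static void copy_rect16(const uint16_t *src, size_t src_stride, size_t x0,
                        uint16_t *dst, size_t dst_stride,
                        size_t w, size_t h)
{
    size_t y;

    assert(x0 + w <= src_stride);   /* rectangle must fit inside a source row */
    assert(w <= dst_stride);        /* and inside a destination row */

    for (y = 0; y < h; y++)
        memcpy(dst + y * dst_stride,        /* start of row y in dst */
               src + y * src_stride + x0,   /* column x0 of row y in src */
               w * sizeof(uint16_t));
}
```

For the case in the question, the two calls would be roughly copy_rect16(buffer, 1024, 0, left, 512, 512, 480) for the 512-wide texture and copy_rect16(buffer, 1024, 512, right, 128, 128, 480) for the 128-wide one, where left and right are placeholder names for the two destinations.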
OneThirty8
Damn Dirty Ape
Posts: 5031
Joined: Thu Nov 07, 2002 11:11 pm
Location: Saugerties, NY
Has thanked: 0
Been thanked: 0

Re: pvr_txr_load question

Post by OneThirty8 »

This is essentially what I did in VC/DC. I made some changes so it would be obvious what everything is, so if it doesn't work it's probably because I made a typo when I posted it here. It's basically the sq_cpy function from KOS (found in kernel/arch/dreamcast/hardware/sq.c) except it will let you allocate a buffer of 640 x 480 pixels to create whatever image you want to copy, and then copy it to the upper-left corner of a 1024 x 512 buffer. So, it'll copy 640 pixels, and then start a new line, instead of needing to copy 1024 pixels, and you also get the benefit of allocating less of your main memory, since you only need 640 x 480 x 2 bytes.

Code:

/* QACR0/QACR1 are the store queue address control registers. This assumes
   a KOS version whose headers (dc/sq.h) define them. */
static void vo_txr_load(void *src, pvr_ptr_t vid_tex, int g_width, int g_height) {
    int i, n;
    uint32 count;
    unsigned int *d, *s;

    s = src; /* incoming video frame, g_width x g_height at 16bpp */

    /* Bytes per source row. Only whole 32-byte blocks are copied below,
       so g_width * 2 should be a multiple of 32. */
    count = g_width * 2;

    /* Set store queue memory area as desired */
    QACR0 = ((((unsigned int)vid_tex) >> 26) << 2) & 0x1c;
    QACR1 = ((((unsigned int)vid_tex) >> 26) << 2) & 0x1c;

    if(count % 4)
        count = (count & 0xfffffffc) + 4;

    for(i = 0; i < g_height; i++) {
        /* Start of row i in the texture. 2048 bytes per row means a
           1024-pixel-wide texture at 16bpp. */
        d = (unsigned int *)(void *)
            (0xe0000000 | (((unsigned long)vid_tex + (i * 2048)) & 0x03ffffe0));

        n = count >> 5; /* number of 32-byte blocks in this row */

        while(n--) {
            asm("pref @%0" : : "r" (s + 8)); /* prefetch 32 bytes for next loop */
            d[0] = *(s++);
            d[1] = *(s++);
            d[2] = *(s++);
            d[3] = *(s++);
            d[4] = *(s++);
            d[5] = *(s++);
            d[6] = *(s++);
            d[7] = *(s++);
            asm("pref @%0" : : "r" (d)); /* flush this store queue to VRAM */
            d += 8;
        }
    }

    /* Wait for both store queues to complete */
    d = (unsigned int *)0xe0000000;
    d[0] = d[8] = 0;
}
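
The address arithmetic in vo_txr_load can be checked on its own. Below is a small host-side sketch of the two expressions used there; the function names are hypothetical, and 0xa4000000 in the usage note is only an assumed example of a texture address, not something stated in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Address inside the store queue window (0xe0000000) for a destination.
   The low 5 bits are masked off, so dest should be 32-byte aligned. */
static uint32_t sq_window_addr(uint32_t dest)
{
    assert((dest & 0x1fu) == 0);
    return 0xe0000000u | (dest & 0x03ffffe0u);
}

/* Value written to QACR0/QACR1: bits 28..26 of dest, placed in bits 4..2. */
static uint32_t sq_qacr_value(uint32_t dest)
{
    return ((dest >> 26) << 2) & 0x1cu;
}

/* Destination address of a given row, for a row stride in bytes
   (2048 in the function above). */
static uint32_t sq_row_dest(uint32_t tex, uint32_t row, uint32_t stride_bytes)
{
    return tex + row * stride_bytes;
}
```

With the assumed address 0xa4000000, sq_qacr_value gives 0x04 and sq_window_addr gives 0xe0000000; the second row (sq_row_dest(0xa4000000, 1, 2048)) maps to 0xe0000800.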
PH3NOM
DC Developer
Posts: 576
Joined: Fri Jun 18, 2010 9:29 pm
Has thanked: 0
Been thanked: 5 times

Re: pvr_txr_load question

Post by PH3NOM »

OneThirty8-
I thank you for your function vo_txr_load ( it has been put to use in my project, DCMC )

However, it is still limited to width being evenly divisible by 32.

I have used this function to allow arbitrary resolution texture loading.
It is a bit application-specific, however, you should get the idea:

Code:

/* Copy a frame from RAM to VRAM, pixel by pixel. Mutex protected. */
int pvr_pixel_load( uint16 * src, struct pvr_frame *pvr ) {
    int x, y;
    uint16 *image   = src;                       /* source frame in RAM */
    uint16 *texture = (uint16 *)pvr->vram_tex;   /* destination texture in VRAM */

    /* Copy the image data from RAM to VRAM, one 16-bit pixel at a time */
    PVR_lock_mutex();
    for (y = 0; y < pvr->tex_height; y++)
    {
        for (x = 0; x < pvr->tex_width; x++)
            texture[x] = image[x];
        texture += pvr->vram_width;   /* next row of the texture */
        image   += pvr->tex_width;    /* next row of the image */
    }
    PVR_unlock_mutex();

    /* Nothing to release here: the loop uses neither the store queues
       nor any temporary buffer. */
    return 1;
}

I do have a related question. Does the PVR Memory HAVE to be allocated by powers of 2?
Currently, I use this to allocate PVR Memory:

Code:

/* Find and allocate the smallest possible PVR VRAM texture area */
int pvr_malloc( struct pvr_frame * pvr ) {

    /* Check if Texture resolution exceeds PVR maximum */
    if ( pvr->tex_width > PVR_TEX_WIDTH || pvr->tex_height > PVR_TEX_HEIGHT ) {
        printf("ERROR: IMAGE Exceeds PVR maximum resolution!\n");
        return 0;
    }

    /* Find the smallest power-of-two texture size that fits the image,
       from 8x8 to 1024x1024 */
    pvr->vram_width = pvr->vram_height = 0x08;
    while( pvr->vram_width < pvr->tex_width )
        pvr->vram_width <<= 1;
    while( pvr->vram_height < pvr->tex_height )
        pvr->vram_height <<= 1;

    /* Allocate PVR VRAM for the current Texture (16bpp) */
    if(pvr->vram_tex) pvr_mem_free(pvr->vram_tex);
    pvr->vram_tex = pvr_mem_malloc(pvr->vram_width * pvr->vram_height * 2);
    if(!pvr->vram_tex)
        return 0;   /* out of texture memory */

    return 1;
}

But, is that really necessary?
It seems like a waste of vram to allocate 1024x512 for a 640x480 image.
Is it ok to allocate PVR Memory by 640x480?
BlueCrab
The Crabby Overlord
Posts: 5652
Joined: Mon May 27, 2002 11:31 am
Location: Sailing the Skies of Arcadia
Has thanked: 9 times
Been thanked: 69 times

Re: pvr_txr_load question

Post by BlueCrab »

PH3NOM, while the way you're doing things does get around the 32-byte issue, it is quite likely a lot slower than using the store queues. Doing 16-bit writes across the bus to the PVR's memory is at least doing 2 times as many bus transactions, and probably more than that (since in the worst case, you're doing 32-bit writes to the PVR area when using the SQ). Not to mention the fact that you're tying up the CPU for longer periods of time with the memory access than is needed. That's why the store queues are generally used for PVR memory access. You can fill up one store queue, do the burst transfer, and fill up the other while the first one empties.

I doubt there are many uses of the video memory where having to do things on 32-byte boundaries is that much of a problem. Most sane resolutions (and pretty much everything the DC supports natively) are multiples of 32 pixels in the X direction, which should give you nice even 64-byte boundaries (assuming 16bpp).
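
To put rough numbers on the two points above, here is a small host-side sketch. The helper names are made up for illustration; it only counts CPU-side writes versus 32-byte store queue blocks, and says nothing about how the hardware breaks those blocks into bus transactions.

```c
#include <assert.h>
#include <stdint.h>

#define SQ_BLOCK_BYTES 32u   /* one store queue burst */

/* Bytes in one row of an image. */
static uint32_t row_bytes(uint32_t width_px, uint32_t bytes_per_px)
{
    return width_px * bytes_per_px;
}

/* Non-zero if a row splits evenly into 32-byte blocks. */
static int row_fits_blocks(uint32_t bytes)
{
    return (bytes % SQ_BLOCK_BYTES) == 0;
}

/* Number of 32-byte blocks needed for a whole frame, row by row. */
static uint32_t frame_blocks(uint32_t width_px, uint32_t height_px,
                             uint32_t bytes_per_px)
{
    uint32_t rb = row_bytes(width_px, bytes_per_px);

    assert(row_fits_blocks(rb));
    return (rb / SQ_BLOCK_BYTES) * height_px;
}
```

For a 640x480 frame at 16bpp, each row is 1280 bytes (40 blocks), so the frame takes 19200 blocks, compared with 307200 individual 16-bit writes for a pixel-by-pixel copy.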

Anyway, you do not have to allocate memory on the PVR in powers of two. The memory allocator for the PVR memory pool will allow you to allocate basically any size you want. However, if you want to use the memory as a texture (which is what you'd generally be doing with memory allocated with pvr_mem_malloc), you must use powers-of-two as your resolutions (or use a strided texture, which is a whole other can of worms). The PVR requires all textures that aren't strided textures to use powers of two, and there's no way around that requirement.
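
As an illustration of what the power-of-two requirement costs for a 640x480 frame, here is a small host-side sketch. The function names are hypothetical, and the texture coordinates assume the image sits in the upper-left corner of the texture, as in the code earlier in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest power of two >= n, starting from min (itself a power of two). */
static uint32_t next_pow2(uint32_t n, uint32_t min)
{
    uint32_t p = min;

    assert(min != 0 && (min & (min - 1)) == 0);
    while (p < n)
        p <<= 1;
    return p;
}

/* Bytes needed for a width x height surface at 16bpp. */
static uint32_t bytes_16bpp(uint32_t width, uint32_t height)
{
    return width * height * 2u;
}

/* Texture coordinate of the edge of the used area (0.0 to 1.0). */
static float tex_coord(uint32_t used, uint32_t tex_size)
{
    return (float)used / (float)tex_size;
}
```

For 640x480 this gives a 1024x512 texture, which is 1048576 bytes of VRAM for 614400 bytes of image. When drawing, only the used part would be mapped: u runs from 0 to 640/1024 = 0.625 and v from 0 to 480/512 = 0.9375.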
patbier
DC Developer
DC Developer
Posts: 152
Joined: Fri Aug 29, 2003 1:25 am
Has thanked: 0
Been thanked: 0

Re: pvr_txr_load question

Post by patbier »

Thanks for all your answers!
I'll test this ASAP!
PH3NOM
DC Developer
Posts: 576
Joined: Fri Jun 18, 2010 9:29 pm
Has thanked: 0
Been thanked: 5 times

Re: pvr_txr_load question

Post by PH3NOM »

Sorry to bump an old thread, but recently I have done something along these lines.

Using DMA to transfer video frames from RAM->VRAM, things only work nicely when the texture dimensions are powers of 2.

Well, I decided to make a DMA transfer function that does not have such strict restrictions.
Looking back at this post, I believe this is what BlueCrab had mentioned as a possibility...

Using DMA this function can copy a 640x480 image in RAM into a 1024*512 texture in VRAM

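Code:

void pvr_dma_load2( unsigned char *src, struct pvr_frame *pvr )
{
    int i          = pvr->tex_height,
        row_width  = pvr->tex_width*2,    /* bytes per image row (16bpp) */
        row_stride = pvr->vram_width*2;   /* bytes per texture row (16bpp) */
    unsigned char * dst = (unsigned char *)pvr->vram_tex;

    /* Write the whole source frame back from the data cache so that the
       DMA engine sees current data */
    dcache_flush_range((uint32)src, row_width * pvr->tex_height);
    pvr_wait_ready();
    while (!pvr_dma_ready());

    /* One blocking DMA per row. src and row_width are expected to be
       32-byte aligned for DMA. */
    while(i--)
    {
        pvr_dma_transfer( src, (uint32)dst, row_width, PVR_DMA_VRAM64, 1, NULL, 0 );
        src += row_width;
        dst += row_stride;
    }
}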