General Insanity (Genesis + PVR)

BlackAura
DC Developer
Posts: 9951
Joined: Sun Dec 30, 2001 9:02 am

Post by BlackAura » Mon Jun 07, 2004 12:33 pm

I've been wondering how feasible it would be to use the PVR2DC to render the Genesis display. My conclusion is that we'll never know until we try, so what the hell...

Obviously the first thing we need to do before we can do this is work out how to display the raw tile data on the screen using the PVR. There's little point in trying to display sprites and backgrounds when you can't get the image data over to the PVR to be drawn.

For the moment, we'll just consider the standard video modes, where tiles are 8x8 pixels, in 16 colours. Just to make it a little simpler. I don't know of many games that used interlace mode (Sonic 2 comes to mind), which used 8x16 tiles.

There is a striking similarity between the Genesis tile data format and the Dreamcast's 4-bit texture format. Basically, they use the exact same encoding, but the Dreamcast requires that the image data be twiddled.

The obvious solution would be to have one big texture containing all the video data, and send that over. That's probably a bit stupid, because you'd have to re-twiddle and re-send the entire texture every time VRAM changed.

A better solution might be to send the individual tiles as separate textures. We can keep a list of which tiles have been changed, and which tiles are in use. If a tile is both in use and has changed, we re-twiddle it and send it to the Dreamcast's VRAM.

One other interesting thing is that the tile data is 32 bytes long - the exact same size as the SH-4's cache line, and the size of each store queue. Effectively, that means we could fetch the data for a tile using a prefetch instruction, twiddle it into one of the store queue buffers, and then send that along to VRAM. Both the prefetch instruction and the store queue write can run in parallel with the twiddling operation, so we might be able to make this little loop fully parallelized. If we did that, I think it'd be pretty much as efficient as it could possibly be.
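
To make the twiddling step concrete, here is a rough sketch in plain C of re-ordering one 8x8, 4-bit tile (32 bytes in, 32 bytes out). It is only an illustration under stated assumptions: the twiddled order is taken to be a bit-interleave of x and y with y in the lowest bit, even texels are assumed to sit in the low nibble of each output byte, and the Genesis tile is assumed to store the left pixel of each pair in the high nibble. The names `morton8`, `md_pixel` and `twiddle_tile` are made up for this example, and a real version would write into a store queue rather than an ordinary buffer.

```c
#include <stdint.h>
#include <string.h>

/* Interleave the low 3 bits of x and y into a 6-bit texel index.
   Assumption: y supplies the least significant bit. */
unsigned morton8(unsigned x, unsigned y)
{
    unsigned idx = 0;
    for (unsigned bit = 0; bit < 3; bit++) {
        idx |= ((y >> bit) & 1u) << (2 * bit);
        idx |= ((x >> bit) & 1u) << (2 * bit + 1);
    }
    return idx;
}

/* Read one pixel from a Genesis tile: 4 bytes per row,
   assumed high nibble = left pixel of each pair. */
unsigned md_pixel(const uint8_t tile[32], unsigned x, unsigned y)
{
    uint8_t b = tile[y * 4 + (x >> 1)];
    return (x & 1u) ? (unsigned)(b & 0x0Fu) : (unsigned)(b >> 4);
}

/* Re-order one tile into the assumed twiddled layout. */
void twiddle_tile(const uint8_t src[32], uint8_t dst[32])
{
    memset(dst, 0, 32);
    for (unsigned y = 0; y < 8; y++) {
        for (unsigned x = 0; x < 8; x++) {
            unsigned idx = morton8(x, y);
            unsigned p = md_pixel(src, x, y);
            /* Assumption: even texel index goes in the low nibble. */
            dst[idx >> 1] |= (uint8_t)((idx & 1u) ? (p << 4) : p);
        }
    }
}
```

A table-driven or unrolled version would be the obvious next step if this per-pixel loop turns out to be too slow.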

One tricky bit would be working out which tiles need to be updated, and storing that information efficiently. It's no good having a really fast bit of code to transfer the data to VRAM if it takes us forever to find it. Unfortunately, I've not thought of an efficient way to do this. One complication is that tile data can be stored anywhere in the MD's VRAM, so we don't really have any way to distinguish tile data from background data.

Once the tile data is there, it should be fairly easy to draw. Basically, you can convert from the MegaDrive's palette format to one of the Dreamcast's colour formats (I'd prefer ARGB1555, because we could use it for both punch-through and opaque modes, and it matches what the Genesis can do pretty well), and then map those palettes to the Dreamcast's. The PVR palette table can store 1024 entries, which gives us 64 16-colour palettes to play with, which should be enough to cope with games that change palettes in mid-frame.
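
As a sketch of the palette conversion, the following assumes the usual Genesis colour RAM word layout of `----BBB-GGG-RRR-` (3 bits per channel) and packs it into ARGB1555. The function names are hypothetical, widening 3 bits to 5 by bit replication is just one possible choice, and treating entry 0 of each 16-colour bank as transparent is an assumption made for punch-through drawing. With 1024 PVR palette entries, 1024 / 16 = 64 such banks fit.

```c
#include <stdint.h>

/* Widen a 3-bit channel to 5 bits by replicating its top bits. */
uint16_t expand3to5(unsigned c)
{
    return (uint16_t)((c << 2) | (c >> 1));
}

/* Convert one colour RAM word (assumed ----BBB-GGG-RRR-) to ARGB1555. */
uint16_t md_to_argb1555(uint16_t cram, int opaque)
{
    unsigned r = (cram >> 1) & 7u;
    unsigned g = (cram >> 5) & 7u;
    unsigned b = (cram >> 9) & 7u;
    return (uint16_t)((opaque ? 0x8000u : 0u)
                      | ((unsigned)expand3to5(r) << 10)
                      | ((unsigned)expand3to5(g) << 5)
                      | (unsigned)expand3to5(b));
}

/* Convert one 16-colour bank; entry 0 is left transparent (assumption). */
void convert_palette(const uint16_t cram[16], uint16_t out[16])
{
    for (int i = 0; i < 16; i++)
        out[i] = md_to_argb1555(cram[i], i != 0);
}
```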

One problem with drawing the tiles is that we'd need to send a separate polygon header for each tile we're going to draw. That means we're going to be pushing a hell of a lot of data out to VRAM, so the code to generate the display lists is going to have to be very, very fast. I think that implementing this using 8x8 textures (as above) would be easier than using a large texture containing the whole of VRAM, because we wouldn't have to make any calculations for the texture coordinates. We can just allocate a block of VRAM the same size as the MD's, and use the offset values directly from the MD's VDP.

I haven't even considered raster effects, mid-frame palette changes, different video modes, windowing, or a huge number of other features of the VDP. That can wait until we've worked out if any of this is even possible, and if it's worth the hassle.

Any ideas, comments, suggestions? Have I made any glaringly obvious omissions or stupid mistakes? Will my head spontaneously turn into a ripe tomato? Tune in next week for the exciting continuation!

(Sorry - it's 3:30AM here, and this is basically a load of random musings and condensed versions of the three pages of notes I made when I tried to go to bed)
Alexvrb
DCEmu Ultra Poster
Posts: 1754
Joined: Wed Jul 17, 2002 11:25 am

Post by Alexvrb » Mon Jun 07, 2004 2:12 pm

Are you planning on accelerating everything? Could you just accelerate, say, sprites? It might not break as many things. You could start with something like that, maybe?

Oh, and if you've read the thread over at dcemu.co.uk, Troy had some success compiling C68k with -O2 using GCC 3.4.0. Apparently 3.0.4 was breaking it with even -O1. C68k still has some bugs, but Stef is obviously still working on it.
If you have twenty monkeys,
banging randomly on typewriters,
they will in twenty minutes produce the complete source code to World of Warcraft.
doragasu
DCEmu Cool Poster
Posts: 1048
Joined: Thu May 16, 2002 5:01 pm
Location: Madrid, Spain

Post by doragasu » Mon Jun 07, 2004 3:26 pm

Looks terribly difficult to get that working without breaking compatibility, but that could speed things up terribly. Unfortunately I know next to nothing about the PVR or the Genesis VDP, so I won't be able to help you with that... Maybe when I finish my exams I'll be able to study the KOS documentation and examples (studying and working don't let me code for the DC as much as I'd like)...

The only thing that comes to mind is that if the Genesis VRAM is not too big (I suppose it's small), you can keep one image in the DC's VRAM exactly as it would be on the Genesis, and another with all the data (no matter whether it's a tile or a tilemap) in twiddled format. Then, when you need to read tiles, you use the "twiddled" VRAM image, and when you need to read tilemaps you use the normal image. As I said, I know next to nothing about the Genesis VDP, but I suppose it should be easy to tell what is a tilemap and what is a tile while rendering the screen, because the Genesis must have some registers, memory locations or whatever that point to the tilemaps (and I suppose it's also easy to know the size of a tilemap). I don't know if this would be fast enough, as you would be copying all the data twice and twiddling even the tilemaps that don't need to be twiddled, but considering most tiles and tilemaps don't change every frame, maybe it could work.

Man, if you get that working, the only ones that will be above you in my "Most talented DC homebrew coders" list will be Neill Corlett and the creators of KOS :D.
Heliophobe
Smeg Creator
Posts: 246
Joined: Thu Mar 14, 2002 2:40 pm

Post by Heliophobe » Mon Jun 07, 2004 4:05 pm

Why do you have to twiddle the textures? Is it just a requirement for paletted texture modes?
BlackAura
DC Developer
Posts: 9951
Joined: Sun Dec 30, 2001 9:02 am

Post by BlackAura » Mon Jun 07, 2004 11:44 pm

Heliophobe - Yeah, paletted textures have to be twiddled, because the bits in the texture format to enable twiddling are re-used for palette index selection when paletted textures are used. The alternative is converting all the tile data to 16-bit, but that'd involve far more work than just twiddling the things.

It'd be so much easier if we didn't have to twiddle things - you could just dump the appropriate parts of the Genesis' VRAM over to the DC's VRAM. Oh well...

doragasu - The Genesis' VRAM is 64KB, as far as I'm aware.

It wouldn't be a good idea to keep a raw copy of the Genesis' VRAM in the DC's VRAM, because access time is too slow. We do still need an unmodified version to work from, and Genesis Plus normally keeps that in system RAM, which is the best place if you want to be able to read the background data.

Alexvrb - Initially, I'm going to try to draw sprites using the PVR. I don't know if I can correctly composite hardware sprites with software-driven backgrounds though. I think I might be able to do it if I modified the software rendering code to output directly to VRAM, and to output to four different textures instead of just one. Then I can draw the backgrounds at the appropriate Z coordinates, and draw the sprites after that.

I think I do know how to determine what's a tile and what's a tilemap. Basically, you have to read the tilemap data (which is pointed to by the VDP registers), and you can determine which tiles are actually being used. The catch is that you'd have to read through all of the tilemap data to do it. You'd then have to work out which of the used tiles have actually been modified (there's no point in copying tiles that are the same as last time), and somehow do this very quickly. It's easy enough to keep a list of modified tile data (when the data's modified, add it to the list), but keeping track of what tiles are actually in use is going to be a pain.
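
For what it's worth, the "read through the tilemap" step could look something like the sketch below. It assumes the emulator keeps Genesis VRAM as a flat 64KB byte array with big-endian 16-bit name table entries, and that the low 11 bits of each entry are the tile index; `mark_used_tiles`, `table_offset` and `cells` are names invented for this example, and the caller would take the table address and size from the VDP registers.

```c
#include <stdint.h>
#include <stddef.h>

#define MD_TILE_COUNT 2048

/* Walk one name table and flag every tile it references.
   vram:         64KB copy of Genesis VRAM (assumed big-endian words)
   table_offset: byte offset of the name table inside VRAM
   cells:        number of 16-bit entries in the table */
void mark_used_tiles(const uint8_t *vram, size_t table_offset,
                     size_t cells, uint8_t used[MD_TILE_COUNT])
{
    for (size_t i = 0; i < cells; i++) {
        size_t off = (table_offset + i * 2) & 0xFFFF;
        uint16_t entry = (uint16_t)((vram[off] << 8) | vram[(off + 1) & 0xFFFF]);
        used[entry & 0x07FF] = 1;   /* low 11 bits: tile index */
    }
}
```

Running this over both scroll planes (and something similar over the sprite table) every frame is exactly the cost being worried about above.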
doragasu
DCEmu Cool Poster
Posts: 1048
Joined: Thu May 16, 2002 5:01 pm
Location: Madrid, Spain

Post by doragasu » Tue Jun 08, 2004 4:22 am

BlackAura wrote:It wouldn't be a good idea to keep a raw copy of the Genesis' VRAM in the DC's VRAM, because access time is too slow. We do still need an unmodified version to work from, and Genesis Plus normally keeps that in system RAM, which is the best place if you want to be able to read the background data.
Yes, you're right, I thought about that and I was going to post it now. Then you should maybe have a copy of Genesis VRAM in system RAM and keep the "twiddled" copy in DC's VRAM (as you don't need to read it, you only need it to render the screen).
BlackAura wrote:I think I do know how to determine what's a tile and what's a tilemap. Basically, you have to read the tilemap data (which is pointed to by the VDP registers), and you can determine which tiles are actually being used.
That would work for tilemaps (what I don't know is whether it would work with sprites, as I don't know how the Genesis identifies and manages sprites). Also, that can be dangerous, because the VDP registers can point to invalid tilemaps (during initialization, or while tiles and tilemaps are being replaced in VRAM), so you would be twiddling invalid tiles.
BlackAura wrote:The catch is that you'd have to read through all of the tilemap data to do it. You'd then have to work out which of the used tiles have actually been modified (there's no point in copying tiles that are the same as last time), and somehow do this very quickly. It's easy enough to keep a list of modified tile data (when the data's modified, add it to the list), but keeping track of what tiles are actually in use is going to be a pain.
That's the reason that led me to suggest having an unmodified copy of the Genesis VRAM in system memory and a twiddled version in the DC's VRAM. Then you can forget about managing which tiles are used/modified or whatever. You use the copy in VRAM to render the image with the PVR (as you have all the needed tiles twiddled) and the copy in system RAM for everything else (every time something needs to be read from VRAM and it's not the PVR rendering the screen). Of course, to get the tilemaps you must also use the copy in system RAM, like you said. If a Genesis program reads the VRAM, you read from the copy in system RAM. If a Genesis program modifies the Genesis VRAM, you apply that modification in system RAM and then twiddle it (only the modified portion) and send it to the DC's VRAM. Maybe that would be faster than managing used/modified tiles and reading the entire tilemap data to start guessing which of the used tiles have been modified. Anyway, with your method you would also have a copy in system RAM and a twiddled copy of the tiles in the DC's VRAM. You would only avoid twiddling and copying the tilemap data to the DC's VRAM, but twiddling and copying it should be faster and easier than managing used/modified tiles, as tilemaps tend to be really small compared to tiles (at least on the GBA, which is the only console I've coded for) and should not be modified every frame.

I hope I explained myself clear enough...
BlackAura
DC Developer
Posts: 9951
Joined: Sun Dec 30, 2001 9:02 am

Post by BlackAura » Tue Jun 08, 2004 4:40 am

doragasu wrote:I hope I explained myself clear enough...
Err... almost... You could do with some paragraphs in there. :wink:

I'm not totally sure if it'd be worth trying to upload only used tile data. Keeping track of modified data is pretty easy (and I've done it before in an SMS emulator, and Genesis Plus does it already). As you said, most games probably won't be writing to VRAM very often, and even if they do they'll either be changing everything, or a couple of tiles / a couple of bits of the tilemap.

Since the entire set of Genesis VRAM is only 64KB, and that won't be changed every frame, I think the (slightly) simpler solution is the best. Keep a copy of VRAM in system RAM, and keep a twiddled copy in the DC's VRAM for drawing. Treating the VRAM as if it contains only tiles is probably the easiest way to do it.

Even the Genesis VDP itself does this. Basically, you can put tile data anywhere that's not used for something else, even inside the tilemap areas. It'd massively simplify things if we mirror the design of the VDP a little more closely.

I think that'd probably be a good place to start at least. It's not obviously doing too much work, like re-transferring everything every time one byte is changed, and it's not doing any hideously complex calculations to avoid doing extra work.

I still think that twiddling individual tiles will be easier than trying to twiddle all the tiles as if they're one big texture. We won't be accessing the VRAM randomly, because each tile fits neatly within the DC's cache line size, and we can shove the data into a store queue to get it into VRAM. Basically, we'll end up with a copy of VRAM in the DC's VRAM, but each 32-byte block will have been twiddled. Otherwise, they're all in the same order, and haven't been modified.

It also means that finding tile data will be easy. The sprite tables and tile maps store an 11-bit number which is an index into VRAM. We just shift that left 5 bits (multiplying it by 32), and add it to the base address of our copy of VRAM. Poke that into the polygon header, and it'll display the correct tile.
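
The lookup described above is small enough to sketch in C. The bit layout used here (priority in bit 15, palette in bits 14-13, vertical flip in bit 12, horizontal flip in bit 11, tile index in bits 10-0) is the commonly documented name table entry format; the struct and function names, and `vram_copy_base` as the address of the twiddled copy in the DC's VRAM, are placeholders for this example.

```c
#include <stdint.h>

/* Decoded fields of one 16-bit name table entry. */
typedef struct {
    unsigned tile;      /* 11-bit tile index      */
    unsigned hflip;     /* horizontal flip flag   */
    unsigned vflip;     /* vertical flip flag     */
    unsigned palette;   /* palette line, 0..3     */
    unsigned priority;  /* priority bit           */
} md_cell;

md_cell md_decode_cell(uint16_t entry)
{
    md_cell c;
    c.tile     = entry & 0x07FFu;
    c.hflip    = (entry >> 11) & 1u;
    c.vflip    = (entry >> 12) & 1u;
    c.palette  = (entry >> 13) & 3u;
    c.priority = (entry >> 15) & 1u;
    return c;
}

/* Address of a tile inside the twiddled VRAM copy:
   index shifted left by 5 (times 32 bytes) plus the base address. */
uint32_t tile_texture_addr(uint32_t vram_copy_base, unsigned tile)
{
    return vram_copy_base + ((uint32_t)(tile & 0x07FFu) << 5);
}
```

The resulting address is what would get poked into the polygon header's texture field, in whatever form the header expects.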

Edit: Some info on the VDP and the Genesis hardware
doragasu
DCEmu Cool Poster
Posts: 1048
Joined: Thu May 16, 2002 5:01 pm
Location: Madrid, Spain

Post by doragasu » Tue Jun 08, 2004 1:21 pm

Thanks for the links, I'll check them when I get some time.
BlackAura wrote:I still think that twiddling individual tiles will be easier than trying to twiddle all the tiles as if they're one big texture.
Definitely, I haven't explained myself correctly because I was trying to explain how to twiddle individual tiles without having to worry about checking what tiles are used for backgrounds/sprites. Let's try with a little more detail.

Let's state we have 2 copies of the Genesis VRAM:

1. SysGenVram: A copy of the Genesis VRAM as it would be in the original console, allocated in the DC system RAM.

2. PvrTwGenVram: A twiddled copy of Genesis VRAM, allocated in the DC VRAM.

And let's have also a table of 64Kbyte/32byte = 2048 positions:
3. VramTileDirty[2048]: This reflects what tiles have been written. All positions in this table must be initialized to FALSE.

Now, each time Genesis VRAM needs to be written, you call this (note it's pseudo code):
GenVramWrite(Offset, Data)
__SysGenVram[Offset] <-- Data
__VramTileDirty[Offset/32] <-- TRUE
End GenVramWrite
And when frame calculations are complete and you are going to render the screen, you can do this:
GenesisFrameRender() begin
__...
__...
__for n = 0 to 2047 begin
____if VramTileDirty[n] = TRUE then
______TwiddleTileAndLoadToPvrTwGenVram(n)
______VramTileDirty[n] <-- FALSE
____end if
__end for
__...
__...
end GenesisFrameRender
With this process you'll be copying only the tiles modified during the frame, and you can still use a store queue to copy each tile to the DC VRAM (inside the function I called TwiddleTileAndLoadToPvrTwGenVram(n)).
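
The same idea as a C sketch, for anyone who prefers real code to pseudocode. The array and function names are adapted from the pseudocode above rather than taken from any emulator, and the twiddle-and-upload step is passed in as a callback so the example stays self-contained; `record_upload` is only a stand-in that records which tile it was asked to send.

```c
#include <stdint.h>
#include <stdbool.h>

#define MD_VRAM_SIZE  0x10000
#define MD_TILE_SIZE  32
#define MD_TILE_COUNT (MD_VRAM_SIZE / MD_TILE_SIZE)   /* 2048 */

uint8_t sys_gen_vram[MD_VRAM_SIZE];      /* untouched copy in system RAM */
bool    vram_tile_dirty[MD_TILE_COUNT];  /* one flag per 32-byte block   */

typedef void (*tile_upload_fn)(unsigned tile, const uint8_t *data);

/* Called for every byte the emulated program writes to VRAM. */
void gen_vram_write(uint16_t offset, uint8_t data)
{
    sys_gen_vram[offset] = data;
    vram_tile_dirty[offset / MD_TILE_SIZE] = true;
}

/* Called once per frame before rendering; returns tiles uploaded. */
unsigned flush_dirty_tiles(tile_upload_fn upload)
{
    unsigned count = 0;
    for (unsigned n = 0; n < MD_TILE_COUNT; n++) {
        if (!vram_tile_dirty[n])
            continue;
        upload(n, &sys_gen_vram[n * MD_TILE_SIZE]);
        vram_tile_dirty[n] = false;
        count++;
    }
    return count;
}

/* Hypothetical stand-in for the real twiddle-and-upload routine. */
unsigned last_uploaded_tile;
void record_upload(unsigned tile, const uint8_t *data)
{
    (void)data;
    last_uploaded_tile = tile;
}
```

Calling `flush_dirty_tiles(record_upload)` after a few `gen_vram_write` calls uploads each touched 32-byte block once, however many bytes inside it were written.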
BlackAura
DC Developer
Posts: 9951
Joined: Sun Dec 30, 2001 9:02 am

Post by BlackAura » Tue Jun 08, 2004 10:05 pm

That's pretty much the same way I ended up with.
Stef.D
DCEmu Respected
Posts: 114
Joined: Wed Oct 15, 2003 1:46 am

Post by Stef.D » Wed Jun 09, 2004 2:38 am

I think using PVR acceleration for VDP rendering can only help for a simple tile-based renderer... trying to keep raster effects and line scrolls will probably kill any speed gain: instead of having 8x8 textures, we'll need 8x1 textures (one pattern line)!

Palette reprogramming can be done, as BlackAura said, by using extra textures, but unfortunately the effect can only apply from the next tile (we can't change the colour in the middle of a tile, since a complete tile uses only one palette).

I think the simplest way of doing it would be the best; otherwise the renderer will become too complex and probably too slow.
Having a twiddled copy of VRAM in video memory seems OK, and the 2048-entry table, which lets us update the VRAM copy each frame, will make things far easier, especially for DMA.

Nice points about PVR rendering:
- horizontal and vertical tile flip can be done by changing the UV map.
- the transparent colour is handled by a transparent colour in the palette.
- priority can be done with the Z coordinate, with a base Z coordinate per tile:
8 = background
7 = scroll B
6 = scroll A & window
5 = sprite
then when a tile's priority bit is set, we add -4 to the Z coordinate :)
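
The flip and priority points above can be sketched like this. The layer numbers and the "-4 when the priority bit is set" rule are taken straight from the list; everything else (the names, UVs running from 0 to 1 across a tile) is assumed for the example. In this scheme a smaller number means nearer the viewer, so it would still need mapping onto whatever depth convention the PVR setup actually uses.

```c
/* Texture coordinates for the two opposite corners of a tile quad. */
typedef struct {
    float u0, v0;   /* top-left corner     */
    float u1, v1;   /* bottom-right corner */
} tile_uv;

/* Flip a tile by swapping its U and/or V extents. */
tile_uv tile_uv_for_flip(int hflip, int vflip)
{
    tile_uv uv = { 0.0f, 0.0f, 1.0f, 1.0f };
    if (hflip) { float t = uv.u0; uv.u0 = uv.u1; uv.u1 = t; }
    if (vflip) { float t = uv.v0; uv.v0 = uv.v1; uv.v1 = t; }
    return uv;
}

/* Base depth per layer, using the numbers proposed above. */
enum md_layer {
    MD_LAYER_BACKGROUND = 8,
    MD_LAYER_SCROLL_B   = 7,
    MD_LAYER_SCROLL_A   = 6,   /* also the window */
    MD_LAYER_SPRITE     = 5
};

/* Subtract 4 when the tile's priority bit is set. */
int tile_depth(enum md_layer layer, int priority)
{
    return (int)layer - (priority ? 4 : 0);
}
```

With these numbers a high priority scroll B tile (depth 3) ends up in front of a low priority sprite (depth 5), which is the ordering the scheme is after.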

Maybe we should also keep a table to detect tilemap changes and modify the display list accordingly.
Actually we can have a static display list which represents the tilemap plane (the display list is re-built only on writes to the tilemap size registers); the only things we have to modify then are the texture, Z coordinate and UV coordinates.
Global scrolling can be done with a transformation matrix; tile-based scrolling would require some extra modifications of the display list...

Well, besides all that, I think PVR rendering could be interesting for a very fast but simple VDP renderer.
For a 100% complete and accurate VDP renderer, we have to stick with a 100% software renderer :-/
As I'm not familiar with PVR stuff, I think I'll put my effort into making a faster software VDP renderer... and as I don't know if it's possible to achieve 60 FPS with the software VDP, having a faster VDP core using PVR features for games which don't use complex rendering would be really nice :)
Alexvrb
DCEmu Ultra Poster
Posts: 1754
Joined: Wed Jul 17, 2002 11:25 am

Post by Alexvrb » Wed Jun 09, 2004 3:23 am

I still think you could use some combination of the two, use the PVR for the sprites at least, as I don't think they usually did anything crazy with the sprites?
BlackAura
DC Developer
Posts: 9951
Joined: Sun Dec 30, 2001 9:02 am

Post by BlackAura » Wed Jun 09, 2004 4:22 am

I think we might be able to emulate some of the things the VDP could do (like mid-frame palette changes) using modifier volumes, or some other insane thing. The PVR has a lot of really weird features that we could use.

Of course, I don't think a fully hardware-driven VDP emulator could possibly replace a software one. For example, I would seriously not want to run Road Rash on it.
Stef.D wrote:Actually we can have a static display list which represents the tilemap plane (the display list is re-built only on writes to the tilemap size registers); the only things we have to modify then are the texture, Z coordinate and UV coordinates.
Global scrolling can be done with a transformation matrix; tile-based scrolling would require some extra modifications of the display list...
Might work. The problem is that we don't get transformations for free - we have to do them on the SH-4 side. I don't know how much data we'd be pushing, but most of the vertex data will probably be quite small. It might actually be faster to generate the display lists on the fly, rather than try to use prebuilt lists. We could transfer prebuilt lists very quickly using DMA, but actually modifying them each frame will probably kill any speed that we might have gained by doing that.
Stef.D wrote:having a faster VDP core using PVR features for games which don't use complex rendering would be really nice
Yep. Can't hurt to try.

One thing I'm not sure about: How could we handle sprite priorities correctly once we've got the background layers? It seems possible to have a low priority sprite (priority bit not set) overlap a high priority sprite, but then to have a background layer overlap the low priority sprite, without overlapping the high priority sprite. Painful...

It might be possible to do a little more than just a simple renderer using the PVR. Obviously, we want to do a nice and simple renderer first off, but if it's fast enough we could start adding bits which try to detect when the game is trying to do something weird, and mimic it. Probably starting off with effects which are common and probably not too hard to implement, until it starts getting in the way of emulating simple games quickly.
Stef.D
DCEmu Respected
Posts: 114
Joined: Wed Oct 15, 2003 1:46 am

Post by Stef.D » Wed Jun 09, 2004 5:56 am

BlackAura wrote:I think we might be able to emulate some of the things the VDP could do (like mid-frame palette changes) using modifier volumes, or some other insane thing. The PVR has a lot of really weird features that we could use.
Modifier volumes sound like a good idea for palette reprogramming, though I don't know how to use them.
BlackAura wrote:Of course, I don't think a fully hardware-driven VDP emulator could possibly replace a software one. For example, I would seriously not want to run Road Rash on it.
Yeah, Castlevania Bloodlines is a pain too ;) it uses many VDP tricks.
BlackAura wrote:Might work. The problem is that we don't get transformations for free - we have to do them on the SH-4 side. I don't know how much data we'd be pushing, but most of the vertex data will probably be quite small. It might actually be faster to generate the display lists on the fly, rather than try to use prebuilt lists. We could transfer prebuilt lists very quickly using DMA, but actually modifying them each frame will probably kill any speed that we might have gained by doing that.
OK, I thought that vertex transformation was quicker than generating a display list and sending it :P
If we have a static display list, we don't need to modify it much; patterns are rarely modified...
BlackAura wrote:One thing I'm not sure about: How could we handle sprite priorities correctly once we've got the background layers? It seems possible to have a low priority sprite (priority bit not set) overlap a high priority sprite, but then to have a background layer overlap the low priority sprite, without overlapping the high priority sprite. Painful...
This is true, and yeah, that's really painful, even with a software renderer :-/
Fortunately very few games use that trick (as far as I remember, only Sonic 2 on the title screen and Castlevania Bloodlines in some parts)... so don't bother with that feature ;)
BlackAura wrote:It might be possible to do a little more than just a simple renderer using the PVR. Obviously, we want to do a nice and simple renderer first off, but if it's fast enough we could start adding bits which try to detect when the game is trying to do something weird, and mimic it. Probably starting off with effects which are common and probably not too hard to implement, until it starts getting in the way of emulating simple games quickly.
Yeah, that's the idea :) If the speed is good, of course we can add more stuff, but really we should start with something quite simple.
Alexvrb wrote:I still think you could use some combination of the two, use the PVR for the sprites at least, as I don't think they usually did anything crazy with the sprites?
That's true; except for Sonic 2 and Castlevania Bloodlines (again), no game does raster effects on sprites.
But since sprites are modified very often, all the caching we could do with the PVR becomes pointless :-/
BlackAura
DC Developer
Posts: 9951
Joined: Sun Dec 30, 2001 9:02 am

Post by BlackAura » Wed Jun 09, 2004 9:35 am

Modifier volumes are basically regions defined as part of a display list. When you draw something outside a modifier volume, it has one set of attributes (like colour, texture, palette index), and when it's inside the volume it uses an alternate one. That'd work pretty well (and be simple to implement) for one palette change in a frame, but might be harder (or even impossible) with three or more. I don't know if any games (intentionally) do that. I think some of the Sonic games on water levels (like Chemical Plant or Aquatic Ruins in Sonic 2) start off with the underwater palette, switch to the normal palette, and then switch back to the underwater palette, but I think we could probably code around that.

I'm pretty sure that I can do that one at least.

The main thing with modifying a display list in memory and then sending that over is that you're having to read a hell of a lot more data from memory each frame. If you're generating it on the fly, you're re-doing a lot of work, but nothing that you wouldn't have to do anyway if you were reading a pre-built display list and modifying it.

If we had something like a GeForce which does hardware TnL, it'd be a different story. We'd still have to modify the textures for each tile if the name tables change, but that wouldn't be a big deal. Pixel and vertex shaders would probably be useful too, but we'll have to make do with what we've got.

I don't think we'll bother with weird sprite/background priorities for now then. If we render it in this order:
Low priority A
Low priority B
Low priority sprites
High priority A
High priority B
High priority sprites

we should probably be OK for most games then. That simplifies things a lot. I'm not even sure if it's possible to emulate that with a 3D accelerator.

Just checking - low priority backgrounds and sprites are rendered at half intensity, right? I think we could do that pretty easily by setting the vertex colours to gray instead of white. I still don't think we can do shadow/hilight mode without completely killing the speed though, because we'd have to use the translucent display list, where normal sprites and backgrounds would have to be in the opaque or punch-through display lists, and we'd also need to generate two sets of cached tile data. Not worth bothering with really.

I suppose we could simply not bother to upload any pattern data that's inside the name tables or the sprite table, but that'd cause problems if a game tried to use unused portions of those for patterns. Not really worth the effort then.

A BBA would be so useful for this kind of stuff. It's really hard to write fairly low-level code if you don't have a sufficiently fast way to test it. I can't really use emulators either, because this is far too low level, and the available emulators aren't accurate enough, or fast enough to give me an indication about the speed it'll run on a real Dreamcast.

Once I work out how the background scrolling and windowing stuff works, I'll give this a go. I might have to do a prototype in OpenGL (on a PC) first, mostly because I'm more comfortable with OpenGL than programming the DC's PVR directly.

Thinking about it, doing this for an SMS would be child's play compared to this. It doesn't have anything like the complexity of the Genesis VDP, only one known bug (which wasn't present in later versions, and only one game used it), and would be well suited to this. It's almost a shame that we already have full-speed SMS emulation. I wonder how many arcade games could have the same principles applied to them?
Alexvrb
DCEmu Ultra Poster
Posts: 1754
Joined: Wed Jul 17, 2002 11:25 am

Post by Alexvrb » Wed Jun 09, 2004 2:18 pm

BlackAura wrote:I wonder how many arcade games could have the same principles applied to them?
Probably the older ones (system 16, etc?), if you had a more-or-less complete emulator + source to start with.

Stef: You can check out the source to a volume modifier program by Heinrich Tillack here.
doragasu
DCEmu Cool Poster
Posts: 1048
Joined: Thu May 16, 2002 5:01 pm
Location: Madrid, Spain

Post by doragasu » Wed Jun 09, 2004 2:36 pm

BlackAura wrote:I wonder how many arcade games could have the same principles applied to them?
I know nothing about SDL, but maybe it would be a lot easier to do a PVR renderer for SDL than to make one for the DC, and lots of homebrew games and ports would get a speed boost.

Another system that might be easier to get working with the PVR is the Capcom CPS-1. Most of its games didn't use strange effects (the most complicated one I can remember right now is the line scrolling of the ground in the Street Fighter II series).
Rand Linden
bleemcast! Creator
Posts: 882
Joined: Wed Oct 17, 2001 7:44 pm
Location: Los Angeles, CA

Post by Rand Linden » Wed Jun 09, 2004 3:10 pm

<HINT>

Speed is all about knowing what you need to do -- and thus (more importantly), what you don't need to do.

The PVR2DC hardware is incredibly fast and powerful -- don't waste time with things that aren't necessary.

Rand.
Ian Micheal
Soul Sold for DCEmu
Posts: 4853
Joined: Fri Jul 11, 2003 9:56 pm

Post by Ian Micheal » Wed Jun 09, 2004 6:11 pm

BlackAura wrote:It's almost a shame that we already have full-speed SMS emulation.
SMS emulation is not full speed yet. Frameskip 2 to 1 as far as I can see in the source, and most games are not at 100% speed. Not far from it, though.
Dreamcast forever!!!
Heliophobe
Smeg Creator
Posts: 246
Joined: Thu Mar 14, 2002 2:40 pm

Post by Heliophobe » Wed Jun 09, 2004 7:02 pm

Ian Micheal wrote:
SMS emulation is not full speed yet. Frameskip 2 to 1 as far as I can see in the source, and most games are not at 100% speed. Not far from it, though.

Whatchoo talkin' 'bout Ian? SMSPlus (Kroustibat's port) and Smeg are both full speed without frameskipping.
Ian Micheal
Soul Sold for DCEmu
Posts: 4853
Joined: Fri Jul 11, 2003 9:56 pm

Post by Ian Micheal » Wed Jun 09, 2004 10:10 pm

SMS Plus is mostly full speed, but it's not 100% full speed, and it has no FM sound. The source I had was using frameskip. Until it has FM sound without slowdown, that's not 100% speed - it's not fully emulating the machine. It's fully playable, but you can tell it's not 100% full speed on some scrolling games.


Smeg is full speed but lacks SMS Plus's features.