Quote: Unfortunately nothing is connected to the Z80 IO ports on Genesis :-/
Oh bum. That would have made things easier. I guess that's because the I/O ports are used for SMS compatibility mode.
Z80 emulation
-
- DC Developer
- Posts: 9951
- Joined: Sun Dec 30, 2001 9:02 am
- Has thanked: 0
- Been thanked: 1 time
- az_bont
- Administrator
- Posts: 13567
- Joined: Sat Mar 09, 2002 8:35 am
- Location: Swansea, Wales
- Has thanked: 0
- Been thanked: 0
- Contact:
As far as I know, it has always been backwards compatible. Well, most models of it are - apparently the Genesis 3 and Nomad consoles lack a vital piece of hardware.
There were a couple of adaptors made, some still available at stores like Lik-Sang. There was also a limited edition cartridge of Phantasy Star I for the Mega Drive (Genesis) in Japan, which was basically the Master System game in a Mega Drive cartridge that took advantage of the compatibility mode.
It would have made a lot of sense to allow MS carts to be plugged in from the beginning, but then Sega were never really any good at making intelligent decisions...
Sick of sub-par Dreamcast web browsers that fail to impress? Visit Psilocybin Dreams!
-
- DCEmu Uncool Newbie
- Posts: 1459
- Joined: Sat Dec 27, 2003 10:40 pm
- Has thanked: 0
- Been thanked: 0
- Contact:
Quote: how come i've never seen a cartridge converter to play SMS games on the Genesis
http://cgi.ebay.co.uk/ws/eBayISAPI.dll? ... 94084&rd=1
-
- Smeg Creator
- Posts: 246
- Joined: Thu Mar 14, 2002 2:40 pm
- Has thanked: 0
- Been thanked: 0
- Contact:
q_006 wrote: since when did the Genesis have backward compatibility with the Master System? and how come i've never seen a cartridge converter to play SMS games on the Genesis (if what i'm hearing is true)?
(sorry to be offtopic)
An adaptor was sold as the 'Power Base Converter'.
I'm not sure exactly why it wasn't supported out of the box, but I think they were trying to avoid the perception that the Genesis was just an upgraded Master System instead of a whole new system. The Japanese Mark III and Master System were cartridge-compatible with the previous SG-1000 system, which Sega may have thought hurt their sales, as the Master System might have been perceived as more of an "SG-1000+" than a major improvement over the SG-1000.
By offering the Power Base Converter they had it covered both ways: Master System fans could play their old carts on the new system, but the public wouldn't think of it as a "Master System +". The Power Base Converter was large and looked like it would contain a lot of circuitry, perhaps a whole embedded SMS, but really it was just a simple pin converter.
(sorry to continue to be off topic)
- Stef.D
- DCEmu Respected
- Posts: 114
- Joined: Wed Oct 15, 2003 1:46 am
- Has thanked: 0
- Been thanked: 0
- Contact:
Rev. Layle wrote: edit: just read your post again, i guess you would want to test for RAM and banked area first because they are going to be hit the most. and (again edit) come to think of it: a few "if"s may do the trick better and faster than the stupid shifts, ands, and switches i used.
is there anything in the 0x50XX range of memory?
Here's how i see that :
Code: Select all
if (adr < 0x4000) return ram[adr & 0x1FFF];
if (adr >= 0x8000) return read_byte_68000(adr + bank);
// IO PORT
...
As far as i remember, there is nothing in the 0x5xxx area...
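For anyone following along, here is a minimal, compilable sketch of that kind of "few ifs" read handler. Only the two `if` tests come from the snippet above; `z80_read_byte`, `m68k_mem`, `read_io`, the 0xFF returned for the I/O range and the exact meaning of `bank` are assumptions made up for illustration, not what Genesis Plus actually does.

```c
#include <stdint.h>

static uint8_t  ram[0x2000];        /* 8 KB of Z80 RAM, mirrored below 0x4000 */
static uint8_t  m68k_mem[0x10000];  /* stand-in for the 68000 address space (assumption) */
static uint32_t bank;               /* hypothetical: 68000 base of the window minus 0x8000,
                                       so that adr + bank lands on the right byte */

/* Stand-in for the real 68000-side read function (assumption). */
static uint8_t read_byte_68000(uint32_t adr)
{
    return m68k_mem[adr & 0xFFFF];
}

/* Placeholder for everything between 0x4000 and 0x7FFF (sound chips, bank
   register, ...). Returning 0xFF is an arbitrary choice for this sketch. */
static uint8_t read_io(uint16_t adr)
{
    (void)adr;
    return 0xFF;
}

/* Most frequent cases first: RAM, then the banked window, then the rest. */
static uint8_t z80_read_byte(uint16_t adr)
{
    if (adr < 0x4000)  return ram[adr & 0x1FFF];
    if (adr >= 0x8000) return read_byte_68000(adr + bank);
    return read_io(adr);
}
```

The ordering is the point: the two cheap comparisons handle the accesses that are expected to be hit the most, and only the rare I/O accesses pay for any further decoding.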
-
- Insane DCEmu
- Posts: 190
- Joined: Sun Jun 27, 2004 8:35 pm
- Location: stillwater, ok
- Has thanked: 0
- Been thanked: 0
- Contact:
-
- Modder Of Rage
- Posts: 805
- Joined: Mon Mar 18, 2002 12:41 pm
- Location: Midwest
- Has thanked: 0
- Been thanked: 0
- Contact:
hey warmtoe- any breakthroughs or progress on the sound..
Check out the beats of rage community at http://borrevolution.vg-network.com/
-
- Modder Of Rage
- Posts: 805
- Joined: Mon Mar 18, 2002 12:41 pm
- Location: Midwest
- Has thanked: 0
- Been thanked: 0
- Contact:
well then i apologize- source releases and then compilations just flew by so frequently before, and that's when everyone started to work everything out fast. It seems like this working together has narrowed, unless all of them are still sending updated sources back and forth privately.
-
- DCEmu Uncool Newbie
- Posts: 1459
- Joined: Sat Dec 27, 2003 10:40 pm
- Has thanked: 0
- Been thanked: 0
- Contact:
or maybe the parts they are working on require time, don't you think? anyways, i can understand your haste, this project is very exciting, but to me it would be better to let them work on it calmly and not ask for news every 2 days. but if blackaura/warmtoe/stef don't find it annoying they are free to correct me.
-
- DC Developer
- Posts: 9951
- Joined: Sun Dec 30, 2001 9:02 am
- Has thanked: 0
- Been thanked: 1 time
Quote: or maybe the parts they are working on requires time, dont you think?
They certainly do. I don't really want to release what I have until it works much better than it does now.
Quote: unless all of them are still sending updated sources back and forth privately.
Mostly, yes. Or just working on whatever-it-is independently. I did have another couple of source releases, but they weren't worth compiling and releasing.
-
- DC Developer
- Posts: 453
- Joined: Thu May 16, 2002 8:29 am
- Location: ice88's house
- Has thanked: 0
- Been thanked: 0
- Contact:
Stef.D wrote:
Warmtoe wrote: Stef,
I made an initial stab at doing z80 work with a jump table - it's working but I have only implemented a fraction of the opcodes at the moment.
How do you remove the call for each fetch though? I will carry on with my hack - I'm sure yours will be much better - but I don't see how you can eliminate the fetch. One thought I had is to use the much-maligned cache to speed things up - by pointing it at the location that represents the current PC for the z80 - will that help?
Anyway - any insight!
As I did in C68K : in a standard CPU emulator, you store the current PC of the emulated cpu in a variable (register is better) we can just call "PC"
Then when you need to fetch the next opcode (and opcode parameter) you'll have to read data at this address.
Since we execute code from this area, we can see it as a large memory space : RAM / ROM , this is the "fetch area".
Imagine we define fetch area as follow (for a Z80 CPU) :
0x0000-0x7FFF = rom_data
0xE000-0xFFFF = ram_data
(code can't be executed from IO port, that doesn't make sense.)
Well, the trick is that PC is equal to (cpu PC + fetch base)
Imagine we have the following instruction : bra 0x450
with a conventional CPU core we only need to do :
...
NewPC = FetchWord;
PC = NewPC;
...
but here we'll do :
...
NewPC = FetchWord;
PC = FetchBase[NewPC >> X] + NewPC;
...
where FetchBase is a table with (0x10000 >> X) entries containing the fetch base of each area (X can be 4-12, depending on what we need).
Then when you need to fetch data, you only have to do :
data = *(u8*)PC; for byte
data = *(u16*)PC; for word
....
That makes things a bit more complex, since PC doesn't contain the real PC value, but fetches are a lot faster.
My English is really limited, i hope you can understand my explanations.
Edit : That trick about PC can also be done on SP (stack pointer) since we *normally* use it only to store data in memory area (no ports), but imo it isn't worth the effort.
OK - I want to have a play with this - can you explain a little further? I'm not sure I understand it - but I want to
Read my blog: http://unrational.blogspot.com
- Quzar
- Dream Coder
- Posts: 7498
- Joined: Wed Jul 31, 2002 12:14 am
- Location: Miami, FL
- Has thanked: 4 times
- Been thanked: 10 times
- Contact:
The PC variable for the current CPU seems to hold the fetch address itself. So every time you do a fetch it only has to add a small amount to the PC value to get the data you are looking for. At least that is what I gathered from his explanation above and from converting NeoCD from Musashi to C68k.
"When you post fewer lines of text than your signature, consider not posting at all." - A Wise Man
- Stef.D
- DCEmu Respected
- Posts: 114
- Joined: Wed Oct 15, 2003 1:46 am
- Has thanked: 0
- Been thanked: 0
- Contact:
Actually the trick is that normally in a CPU emulator, if the current PC = 0x200 then your PC variable will be 0x200.
Here, PC = (0x200 + memory address of fetch area).
For instance, from 0x0000 to 0x1FFF we have the rom (stored in rom_data), then PC = 0x200 + rom_data = &(rom_data[0x200])
Then instead of doing ReadByte(PC) to fetch the next instruction, we can just do *PC, which is a lot faster.
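To make the idea concrete, here is a small sketch of a "PC holds a host address" fetch with a FetchBase table. It is not C68K code: the names `fetch_init`, `set_pc`, `get_pc`, `fetch_byte` and `PCBase`, the choice of X = 12 and the use of `uintptr_t` arithmetic are assumptions for illustration. The memory layout (rom_data at 0x0000-0x7FFF, ram_data at 0xE000-0xFFFF) is the example layout from the explanation quoted earlier.

```c
#include <stdint.h>

#define FETCH_SHIFT 12                        /* "X" from the post; 4 KB pages (assumption) */
#define FETCH_PAGES (0x10000 >> FETCH_SHIFT)

static uint8_t rom_data[0x8000];              /* Z80 0x0000-0x7FFF */
static uint8_t ram_data[0x2000];              /* Z80 0xE000-0xFFFF */

/* FetchBase[page] + z80_address == host address of that byte.
   Kept as integers so that "array start minus 0xE000" is well defined. */
static uintptr_t FetchBase[FETCH_PAGES];

static uintptr_t PC;      /* a HOST address, not the real Z80 PC */
static uintptr_t PCBase;  /* fetch base of the current area, to recover the real PC */

static void fetch_init(void)
{
    unsigned a;
    for (a = 0; a < FETCH_PAGES; a++)
        FetchBase[a] = 0;                     /* unmapped: must never be executed from */
    for (a = 0x0000; a < 0x8000; a += 1u << FETCH_SHIFT)
        FetchBase[a >> FETCH_SHIFT] = (uintptr_t)rom_data;
    for (a = 0xE000; a < 0x10000; a += 1u << FETCH_SHIFT)
        FetchBase[a >> FETCH_SHIFT] = (uintptr_t)ram_data - 0xE000;
}

/* Every jump/call/return goes through this: one table lookup and one add. */
static void set_pc(uint16_t z80_pc)
{
    PCBase = FetchBase[z80_pc >> FETCH_SHIFT];
    PC = PCBase + z80_pc;
}

/* The real Z80 PC is only rebuilt when something needs it (push on call, debugger...). */
static uint16_t get_pc(void)
{
    return (uint16_t)(PC - PCBase);
}

/* The fetch itself is a plain pointer dereference, no function call, no decoding. */
static uint8_t fetch_byte(void)
{
    uint8_t v = *(const uint8_t *)PC;
    PC++;
    return v;
}
```

The cost moves from every fetch to every jump, which is a good trade since fetches are far more frequent. Two things this sketch ignores: code running straight across the end of an area (for example from 0x7FFF into 0x8000) and jumps into unmapped pages; a real core has to deal with both.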
- blargg
- DCEmu Newbie
- Posts: 2
- Joined: Thu Jul 15, 2004 6:02 am
- Has thanked: 0
- Been thanked: 0
- Contact:
CPU Emulator Optimization Techniques
I saw a post about optimizing a Z80 core. I'm working on one and if I did my timing correctly, it's fairly fast.
I have written an emulator for the GameBoy CPU (which is a subset of the Z80) as a part of a sound emulation library. I timed it on my 120 MHz PowerMac computer and it executes 3,472,608 instructions per second (assuming I didn't mis-time it). The core compiles to 3700 bytes of PowerPC code and 2200 bytes of data.
The most significant technique is aggressive sharing of common instruction behavior, which greatly reduces code size and thus cache impact. There are a couple of more optimizations I haven't applied yet; I detail a few of them on a page about a 6502 emulator I wrote: http://www.slack.net/~ant/nes-emu/6502.html
One optimization I haven't implemented yet is to defer status flag determination until the flag is actually needed (I did implement it in an 8085 emulator many years ago). For example in a full Z80 core I'd have a new 8-bit variable "parity" to which I'd assign the result of any instruction which modified the parity bit, and only do the actual parity determination if a branch on parity or push flags instruction were encountered. The half-carry is another example; in the 8085 emulator I had two variables holding the previous and new value of an instruction which set the half-carry, and calculated it only for PUSH FA, DAA, and the end of an emulation run.
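As a rough illustration of the deferred parity idea (this is not taken from the 8085 emulator mentioned above; `cpu_t`, `op_and`, `parity_even` and `materialize_flags` are hypothetical names, and only a single parity-setting instruction is shown), the instruction just records its result, and the parity bit is worked out only when the flag byte is really needed:

```c
#include <stdint.h>

#define FLAG_P 0x04          /* parity/overflow bit of the Z80 F register */

typedef struct {
    uint8_t a;               /* accumulator */
    uint8_t f;               /* flags; the P bit stored here may be stale */
    uint8_t parity_src;      /* last result that defines the parity flag */
} cpu_t;

/* AND only remembers its result; no parity computation in the hot path.
   (The other flags AND affects are left out to keep the sketch short.) */
static void op_and(cpu_t *c, uint8_t v)
{
    c->a &= v;
    c->parity_src = c->a;
}

/* Returns 1 when v has an even number of set bits. */
static int parity_even(uint8_t v)
{
    v ^= v >> 4;
    v ^= v >> 2;
    v ^= v >> 1;
    return !(v & 1);
}

/* Called only by the few users of the real flag byte:
   PUSH AF, a branch on parity, or the end of an emulation run. */
static uint8_t materialize_flags(const cpu_t *c)
{
    uint8_t f = c->f & (uint8_t)~FLAG_P;
    if (parity_even(c->parity_src))
        f |= FLAG_P;
    return f;
}
```

On the Z80 the same bit doubles as the overflow flag for arithmetic instructions, so a full core would also need to remember which kind of instruction last wrote it; that bookkeeping is left out here.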
Until I've released the GameBoy sound emulation library, here are the relevant bits which demonstrate some techniques:
Code: Select all
// all memory accesses go through a function pointer table
// all instruction accesses use a mapping table; no function call
typedef unsigned (*reader_t)( unsigned addr );
typedef void (*writer_t)( unsigned addr, unsigned value );
reader_t data_reader [256];
writer_t data_writer [256];
uint8_t* code_map [256];
#define READ( addr ) (data_reader [(addr) >> 8]( addr ))
#define WRITE( addr, value ) (data_writer [(addr) >> 8]( addr, value ))
#define READ_PROG( addr ) (code_map [(addr) >> 8] [(addr) & 255])
void z80_stop() {
cycles_remain = 0;
}
void z80_emulate( registers_t& r )
{
unsigned pc = r.pc;
unsigned sp = r.sp;
unsigned flags = r.flags;
goto loop;
inc_pc_loop:
pc++;
loop:
int cyc = cycles_remain - cycles_per_instruction;
cycles_remain = cyc;
// in actual emulator these are efficiently read together as a word
unsigned op = READ_PROG( pc );
pc++;
unsigned data = READ_PROG( pc ); // pre-fetch data
if ( cyc <= 0 )
goto stop;
// 25% of the time is spent stalling in this switch dispatch
// since the destination address isn't known in advance for prefetch.
switch ( op ) {
// ...
case 0x20: // JR NZ
if ( flags & z_flag )
goto inc_pc_loop;
// fall through
case 0x18: // JR
jr_taken:
pc += int8_t (data); // sign-extend
goto inc_pc_loop;
case 0x28: // JR Z
if ( flags & z_flag )
goto jr_taken;
goto inc_pc_loop;
case 0x30: // JR NC
if ( !(flags & c_flag) )
goto jr_taken;
goto inc_pc_loop;
case 0x38: // JR C
if ( flags & c_flag )
goto jr_taken;
goto inc_pc_loop;
case 0xE9: // JP_HL
pc = rp.hl;
goto loop;
// ...
case 0xBE: // CMP (HL)
data = rp.hl;
data = READ( data );
goto cmp_comm;
case 0xB8: // CMP B
case 0xB9: // CMP C
case 0xBA: // CMP D
case 0xBB: // CMP E
case 0xBC: // CMP H
case 0xBD: // CMP L
data = R8( op & 7 ); // indexes b, c, d, e, h, l, -, a
goto cmp_comm;
case 0xFE: // CMP IMM
pc++;
cmp_comm:
op = rg.a;
data = op - data;
sub_set_flags:
flags = ((op & 15) - (data & 15)) & h_flag;
flags |= (data >> 4) & c_flag;
flags |= n_flag;
if ( data & 0xff )
goto loop;
flags |= z_flag;
goto loop;
case 0x96: // SUB (HL)
data = rp.hl;
data = READ( data );
goto sub_comm;
case 0x90: // SUB B
case 0x91: // SUB C
case 0x92: // SUB D
case 0x93: // SUB E
case 0x94: // SUB H
case 0x95: // SUB L
case 0x97: // SUB A
data = R8( op & 7 );
goto sub_comm;
case 0xD6: // SUB IMM
pc++;
sub_comm:
op = rg.a;
data = op - data;
rg.a = data;
goto sub_set_flags; // share flag-setting code with CMP
// ...
}
stop:
r.pc = pc;
r.sp = sp;
r.flags = flags;
// ...
}
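The posted code uses `code_map` but doesn't show how it gets filled. One possible setup, sketched under assumptions: `map_code`, `init_code_map` and the `unmapped` filler page are hypothetical helpers, and the layout used (32 KB of ROM at 0x0000, 8 KB of RAM at 0xC000) is just the usual GameBoy arrangement, not necessarily what the library does.

```c
#include <stdint.h>

static uint8_t  rom[0x8000];      /* 0x0000-0x7FFF */
static uint8_t  ram[0x2000];      /* 0xC000-0xDFFF */
static uint8_t  unmapped[256];    /* filler page so every entry stays valid (assumption) */
static uint8_t *code_map[256];    /* one pointer per 256-byte page */

#define READ_PROG( addr ) (code_map [(addr) >> 8] [(addr) & 255])

/* Point every page in [start, start + size) at the matching part of mem. */
static void map_code(unsigned start, unsigned size, uint8_t *mem)
{
    unsigned off;
    for (off = 0; off < size; off += 256)
        code_map[(start + off) >> 8] = mem + off;
}

static void init_code_map(void)
{
    unsigned page;
    for (page = 0; page < 256; page++)
        code_map[page] = unmapped;
    map_code(0x0000, sizeof rom, rom);
    map_code(0xC000, sizeof ram, ram);
}
```

With this in place an opcode fetch is two array indexings and no function call, and switching a ROM bank is just a matter of calling `map_code` again for the affected pages.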
- Stef.D
- DCEmu Respected
- Posts: 114
- Joined: Wed Oct 15, 2003 1:46 am
- Has thanked: 0
- Been thanked: 0
- Contact:
What about benchmarking your Z80 core against the one currently used in Genesis Plus to see whether it performs better ?
I was thinking about code size reduction for my new Z80 core, i don't know how "code stream breaks" (goto) affect execution speed on dreamcast though... at least on X86, it's often more efficient to have huge code that executes fewer instructions...
-
- Smeg Creator
- Posts: 246
- Joined: Thu Mar 14, 2002 2:40 pm
- Has thanked: 0
- Been thanked: 0
- Contact:
Stef.D wrote: What about trying to bench your Z80 core against current one used in Genesis Plus to see how it performs better ?
I was thinking about code size reduction for my new Z80 core, i don't know how "code stream break" (goto) affect execution speed on dreamcast though... at least on X86, it's often more efficient to have a huge code with limited instructions execution...
Sayten's core was written with the opposite in mind - code cache misses have a lot more impact on the dreamcast than they do on a typical desktop computer, so the theory was to cut down on cache misses by sharing more code, even at the expense of executing more instructions (of course, there is likely a break-even point). For x86 based cores, the opposite approach is usually taken because PCs generally have larger instruction caches, a large secondary cache, and probably faster ram relative to the CPU speed (on this point I am not certain).
It is, however, difficult to say if it would matter, especially in the case of Genesis Plus where you're already emulating a 68000 and your code cache might be shot to hell anyway, especially if you are interleaving the emulation of the 68k, z80, and other components throughout the frame (scanline per scanline, perhaps) rather than running all the z80 for a single frame at once. You might be better off with a large z80 core that executes a smaller number of instructions if it looks like the cache misses are inevitable. It might even depend on the game: some might use a more diverse set of instructions than others.
-
- DCEmu Ultra Poster
- Posts: 1754
- Joined: Wed Jul 17, 2002 11:25 am
- Has thanked: 0
- Been thanked: 0
Heliophobe wrote: and probably faster ram relative to the CPU speed (on this point I am not certain).
This is OT, but: I think the DC's main memory is actually more balanced than many X86 machines. Not every P4 running at ~2-3GHz has dual DDR 400, and even if they did, is that better than 100MHz SDRAM on a 200MHz SH4 when it comes to integer work? They have a large cache though, so it doesn't matter that much, although that's not the only reason they have a large cache... the pipeline is LONG.
- blargg
- DCEmu Newbie
- Posts: 2
- Joined: Thu Jul 15, 2004 6:02 am
- Has thanked: 0
- Been thanked: 0
- Contact:
GameBoy Z80 CPU core emulator
The Gb_Snd_Emu GameBoy sound emulator (with Z80-subset core) is available at the address below, if anyone would like to examine its performance:
http://www.slack.net/~ant/nes-emu/