> I'd also suggest that given nobody's worked out the exact timings of the Genesis that a byuu Genesis emulator would be interesting
While there are some great Genesis games, the overall Genesis scene is very unappealing to me.
It's not a secret: I'm not spectacular at reverse engineering. I succeed more as a project motivator. Find people who also want something done, and ask them to help. And importantly, do as much work as I can as well.
But in the Genesis scene, four out of five emulators are closed source. The authors are all great (Steve and Aamir are really cool people), but I don't want to have to ask every single question. I got into the SNES scene in part because it was totally open at the time.
> we would need: 1. cothread to keep CPU sync'd as they need 2.
To be fair, you have that now, right? If you're trying not to use them in the SNES CPU, I can promise it will be painful. You can be in the middle of an opcode, and break out in the middle of a cycle or the middle of a DMA sequence triggering in the middle of that, and in the middle of a synchronization event inside the DMA of that.
> Whatever that is - the Linux version is about 15-20 FPS on my 4.7 GHz i7. I assume it's not that bad on Windows.
It's not that bad on Linux, either. I get about 100fps with a 4.4GHz i7 (200 with scanline rendering, 300-400 with the hacked up cores.) The absolute most demanding special chips drop me down to 60fps.
I'm guessing your card is reporting that it has a 30-bit texture available and is doing software conversion back to 24-bit. Try changing away from OpenGL to X-video.
I need to get an Xlib function ready to detect actual display depth.
As far as overall performance, it could be a lot better. Probably 3x faster at this point in its most optimized form. But I'm going for source readability over speed.
For instance, if a register is ten bits, I use a uint10 type. Behind the scenes, that's an unsigned int that gets masked to ten bits on assignment operations. Most of the time you don't need to mask, so it does add extra overhead.
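A minimal sketch of that idea, to show where the overhead comes from. This is not the emulator's actual class: the `UintN` template, its members, and the 1 to 31 bit limit are all assumptions made for illustration. Only the `uint10` name and the "mask on assignment" behavior come from the description above.

```cpp
#include <cstdint>

// Hypothetical fixed-width unsigned integer, written for illustration only.
template <unsigned Bits>
struct UintN {
  static_assert(Bits >= 1 && Bits <= 31, "this sketch only handles 1..31 bits");
  static constexpr unsigned mask = (1u << Bits) - 1;

  unsigned data = 0;

  constexpr UintN() = default;
  constexpr UintN(unsigned value) : data(value & mask) {}

  // Reads cost nothing extra: the stored value is always already in range.
  constexpr operator unsigned() const { return data; }

  // Every write pays for a mask, even when the caller knows the value fits.
  constexpr UintN& operator=(unsigned value) { data = value & mask; return *this; }
  constexpr UintN& operator+=(unsigned value) { data = (data + value) & mask; return *this; }
  constexpr UintN& operator++() { data = (data + 1) & mask; return *this; }
};

using uint10 = UintN<10>;
```

With this, a ten-bit counter wraps on its own: incrementing a `uint10` that holds 1023 gives 0, and assigning 0x7ff stores 0x3ff. The register code stays readable because it never has to write the mask by hand, at the cost of masking on writes that did not need it.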
I use threads everywhere for consistency, but they don't make sense when you have two chips that share a huge piece of RAM. Idle loop skipping on DSPs would vastly increase their speed, too.
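To give a rough picture of what idle loop skipping buys you, here is a toy sketch. Everything in it is hypothetical: `ToyDsp`, the one-cycle poll cost, and the assumption that the wake-up cycle is known in advance are simplifications for illustration, not how any real DSP core is emulated.

```cpp
#include <cstdint>

// Hypothetical toy DSP that spins in a wait loop until a flag is raised.
struct ToyDsp {
  uint64_t clock = 0;     // cycles elapsed so far
  uint64_t wakeAt = 0;    // cycle at which the flag gets raised (assumed known)
  bool ready = false;     // the flag the wait loop polls
  uint64_t stepsRun = 0;  // how many loop iterations were actually emulated

  // One iteration of the wait loop: poll the flag, then spend one cycle.
  void step() {
    if (clock >= wakeAt) ready = true;
    clock += 1;
    stepsRun += 1;
  }

  // Naive approach: emulate every single poll.
  void runNaive() {
    while (!ready) step();
  }

  // Idle skipping: nothing can change before wakeAt, so jump the clock
  // straight there and only emulate the poll that succeeds.
  void runSkipping() {
    if (!ready && clock < wakeAt) clock = wakeAt;
    while (!ready) step();
  }
};
```

Both paths leave the clock at the same value, so timing is preserved, but with `wakeAt = 1000` the naive loop emulates 1001 polls and the skipping version emulates one. The hard part in a real core, which this sketch skips entirely, is proving that the loop has no side effects and knowing when the next event arrives.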
I've always been willing to help write a more efficiency-focused emulator, but I won't do it alone. I don't have enough time for that.