That was poorly worded. I meant that some behaviors are more important than others, as evidenced by the fact that a lot more software relies upon them.

Okay, I wrote this up to help you guys out:

And blargg's tests are here:

It includes additional research beyond what was mentioned previously, and it should now cover 100% of the possible edge cases, e.g. writes during computation, reads of RDDIV in the middle of a multiplication (yes, it actually updates that too), and full cycle-stepping validation.

You're definitely going to need your core broken into individual cycles (e.g. no word reads), but it doesn't need external synchronization between cycles. Just call the alu_edge() function between cycles and you should be good.
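
To show the shape I mean, here's a minimal C++ sketch. Only the alu_edge() name and the "one byte per cycle" rule come from this post. The Alu and Cpu structs, the pending counter, and the tiny memory array are hypothetical stand-ins I made up for illustration, and the real per-edge ALU behavior is whatever the write-up specifies, not this counter.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the ALU. The counter only models "an operation
// that needs N more edges"; it does not compute any real result.
struct Alu {
    unsigned pending = 0;     // edges left in the current operation (assumed model)
    unsigned edges_seen = 0;  // total edges delivered, for inspection

    void start(unsigned steps) {
        assert(steps > 0);
        pending = steps;
    }

    // Called once between CPU cycles.
    void alu_edge() {
        ++edges_seen;
        if (pending > 0) --pending;
    }

    bool busy() const { return pending > 0; }
};

// Hypothetical CPU core reduced to its bus accesses.
struct Cpu {
    Alu alu;
    std::array<std::uint8_t, 256> memory{};  // toy address space, wraps at 256

    // One bus cycle: a single byte access, then one ALU edge.
    std::uint8_t read_byte(std::uint16_t addr) {
        std::uint8_t value = memory[addr % memory.size()];
        alu.alu_edge();
        return value;
    }

    // A 16-bit read is two separate byte cycles, never one word access,
    // so the ALU advances between the low byte and the high byte.
    std::uint16_t read_word(std::uint16_t addr) {
        std::uint16_t lo = read_byte(addr);
        std::uint16_t hi = read_byte(static_cast<std::uint16_t>(addr + 1));
        return static_cast<std::uint16_t>(lo | (hi << 8));
    }
};
```

With this layout, a read_word() call delivers two edges instead of one, so an operation started with start(8) (an arbitrary count for the example) has 6 edges left afterwards. Nothing outside the core has to schedule the ALU, which is the point of not needing external synchronization.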