Okay, I've written some tests to understand this HDMA-MDR behavior more: http://byuu.org/temp/test_mdrhdma2.zip

The good news is that we don't have a serious problem with bus timing nor HDMA execution order.

The bad news is that the behavior is flat out nonsense. In my first example, HDMA really does fetch before the S-CPU opcode cycle, and yet the subsequent open bus result is the HDMA fetch and not the S-CPU opcode fetch.

We can simulate it with some really fucked up code, but I would love to understand how the hell this is possible on a hardware level.

Also, an interesting side discovery came out of this: it wasn't storing the HDMA read value in MDR, it was storing the next HDMA line counter value, even though a fetch wasn't needed at all (continuous HDMA mode was active with a high counter value).

What this means is that for each active channel, it always fetches the next line counter value from the HDMA table. It will only actually use that value and increment the HDMA address if (current line counter & 0x7f) == 0, of course.

That makes sense from a hardware perspective: it's essentially saying there is no idle cycle. It's actually reading memory while checking the line counter at the same time.
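
To make the fetch-always, commit-sometimes idea concrete, here's a rough sketch of how an emulator might model it. Everything named here (Bus, HdmaChannel, hdmaUpdate) is made up for illustration and not taken from any real code, and it only covers the table fetch described above, not the counter decrement or the actual transfer:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical model; all names here are illustrative only.
struct Bus {
  std::vector<uint8_t> memory;
  uint8_t mdr = 0;  // last value seen on the data bus (what open bus returns)

  uint8_t read(uint32_t addr) {
    mdr = memory[addr % memory.size()];  // every read updates MDR
    return mdr;
  }
};

struct HdmaChannel {
  uint32_t tableAddress = 0;  // current position in the HDMA table
  uint8_t lineCounter = 0;    // low 7 bits = lines remaining (assumed layout)
};

// Per-line table fetch for one active channel.
// Assumption: the read always happens, so MDR always changes; the value is
// only committed when the low 7 bits of the line counter are zero.
void hdmaUpdate(Bus& bus, HdmaChannel& ch) {
  uint8_t next = bus.read(ch.tableAddress);  // unconditional fetch
  if ((ch.lineCounter & 0x7f) == 0) {
    ch.lineCounter = next;  // use the fetched value
    ch.tableAddress++;      // and only now advance the table address
  }
}
```

With a line counter of 0x85, calling hdmaUpdate leaves the channel state alone but still puts the next table byte into MDR, which is the side effect the test picked up.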

It may explain the short-circuit behavior of indirect HDMA a bit better, but I haven't fully contrasted this finding with that one yet.