I like the general idea, Marty. Here are a couple of suggestions.
*You should be able to define multiple systems. For instance, some games were released with no ROM changes whatsoever in Japan, the US, and Europe, so in those cases you could have something like this: system="famicom,nes.ntsc,nes.pal". Also, I would recommend adding "ffe" and "dr.pcjr" to the list of system types.
*Allow more than one board type to be specified. The emulator would simply use the first listed type that it supports. For instance, Teenage Mutant Ninja Turtles (US, Ultra Games) has been found on all the following boards: Nintendo's NES-SLROM-04, NES-SLROM-05, and NES-SLROM-06, and Konami's 351908. All of these boards are functionally identical and the ROM is exactly the same on every one. But it would be nice to be able to accurately document the range of actual hardware that the game shipped on. (By the way, board information is from BootGod's excellent database site.) Listing the board manufacturer would be good, too.
*In line with the above suggestion, allow two classes for board types: "confirmed" and "virtual". Confirmed means that this is the exact board name on the actual hardware PCB. Virtual means that it is a compatible board that the game was never released on, but that would work, and that emulators could more easily recognize. For instance, "4 Nin Uchi Mahjong" (Japan, Nintendo) is on a black blob board that is called HVC-FJ. That is the real board name, but it is 100% compatible with standard NES/HVC-NROM-128. There are tons of bootleg NROM games that use boards with weird or no names. By using an added "virtual" name in addition to the real one, we can ensure compatibility with emulators while still preserving the actual board name.
For instance, "Ski or Die" (US, Konami) was published only on a 351908 Konami board. But this is 100% equivalent to SLROM. So, the board descriptions would be set up something like this:
<board class="confirmed" name="351908" manufacturer="Konami" />
<board class="virtual" name="NES-SLROM" />
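Here is a rough sketch of how an emulator might handle the "use the first valid board" rule from the suggestions above. The <cartridge> wrapper element and the SUPPORTED_BOARDS set are made up for illustration; only the <board> element and its attributes come from the example above.

```python
import xml.etree.ElementTree as ET

# Hypothetical set of board names that this emulator implements.
SUPPORTED_BOARDS = {"NES-SLROM", "NES-NROM-128"}

# The <cartridge> wrapper is an assumption; the <board> lines are from the post.
SKI_OR_DIE = """
<cartridge>
  <board class="confirmed" name="351908" manufacturer="Konami" />
  <board class="virtual" name="NES-SLROM" />
</cartridge>
"""

def pick_board(xml_fragment, supported=SUPPORTED_BOARDS):
    """Return the name of the first listed board the emulator supports, or None."""
    root = ET.fromstring(xml_fragment)
    for board in root.iter("board"):  # iterates in document order
        name = board.get("name")
        if name in supported:
            return name
    return None

if __name__ == "__main__":
    print(pick_board(SKI_OR_DIE))
```

An emulator that knows the Konami board directly would pick "351908"; one that only knows the Nintendo names falls through to the virtual "NES-SLROM" entry.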
*Rather than a mirroring tag, allow tags for the actual solder pads on the board. This better describes what is actually going on in the hardware, and allows for flexibility with stuff like the MMC5 ExROM boards that have more solder pads than just H and V. For instance, something like this (the below example is standard vertical mirroring):
<pad h="bridged" v="open" />
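An emulator that only cares about simple mirroring could derive it from the pad tag along these lines. This is just a sketch: the vertical case is the one given above, the horizontal case is assumed by symmetry, and anything else is left to board-specific code.

```python
import xml.etree.ElementTree as ET

def mirroring_from_pad(pad_xml):
    """Map a <pad> element to a mirroring mode, or None if it is not a simple case."""
    pad = ET.fromstring(pad_xml)
    h = pad.get("h")
    v = pad.get("v")
    if h == "bridged" and v == "open":
        return "vertical"      # the example given in the post
    if h == "open" and v == "bridged":
        return "horizontal"    # assumed: the symmetric case
    return None                # other combinations need board-specific handling

if __name__ == "__main__":
    print(mirroring_from_pad('<pad h="bridged" v="open" />'))
```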
*I would recommend standardizing on MD5 as a hash algorithm. It's relatively simple to implement. I'm not worried about deliberate collisions with CRC32, but accidental ones; the latter cannot entirely be ruled out even if the probability is low. MD5, on the other hand, makes an accidental collision nearly impossible.
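To show how little work is involved, here is a minimal sketch that computes both checksums with Python's standard library. The function name and the returned dictionary layout are my own, not part of any proposed spec.

```python
import hashlib
import zlib

def rom_checksums(data: bytes) -> dict:
    """Return the CRC32 and MD5 of a ROM dump as lowercase hex strings."""
    return {
        # Mask to 32 bits so the value is always unsigned.
        "crc32": format(zlib.crc32(data) & 0xFFFFFFFF, "08x"),
        "md5": hashlib.md5(data).hexdigest(),
    }

if __name__ == "__main__":
    print(rom_checksums(b"123456789"))
```

CRC32 gives 32 bits and MD5 gives 128 bits, which is where the difference in accidental collision odds comes from.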
*How about appending the XML to a standard *.NES image? The prg and chr fields, instead of specifying file names, could give the size and the offset (from the start of the file). This would offer the best of both worlds: a cleanly documented and implemented format for new emulators, while maintaining backwards compatibility with old ones. You'd only have 16 bytes of overhead (the existing header). (You would probably also want to add a special tag right before the XML, so the parser could easily find it and not mistake it for ROM data if the header were corrupt.)
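A quick sketch of the appended-XML idea, to make it concrete. Everything named here is hypothetical: the MARKER bytes, the <rom> wrapper, and the "offset" and "size" attribute names are placeholders, and the sketch assumes a plain image with a 16-byte header and no trainer.

```python
HEADER_SIZE = 16                 # standard *.NES header length
MARKER = b"\x00NESXML\x00"       # hypothetical tag placed right before the XML

def append_xml(nes_image: bytes, prg_size: int, chr_size: int) -> bytes:
    """Return the image with a marker and an XML description appended."""
    prg_offset = HEADER_SIZE
    chr_offset = prg_offset + prg_size
    xml = (
        f'<rom><prg offset="{prg_offset}" size="{prg_size}" />'
        f'<chr offset="{chr_offset}" size="{chr_size}" /></rom>'
    ).encode("ascii")
    # The original bytes are untouched, so old emulators can still load the file.
    return nes_image + MARKER + xml

def extract_xml(image: bytes):
    """Find the last marker and return the XML after it, or None if absent."""
    pos = image.rfind(MARKER)
    if pos == -1:
        return None
    return image[pos + len(MARKER):].decode("ascii")

if __name__ == "__main__":
    # Dummy image: header, 16 KB of PRG, 8 KB of CHR (all zero-filled).
    dummy = b"NES\x1a" + bytes(12) + bytes(16384) + bytes(8192)
    print(extract_xml(append_xml(dummy, 16384, 8192)))
```

Searching from the end of the file means the parser does not have to trust the header at all to locate the XML.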
If desired, I will try to write up a full draft spec and submit it to the board (and Marty) for approval.