Alright, finally got back to sdlmame. Feeling completely lost was a bit off-putting.

I installed the oprofile kernel module and userspace tools on my N900 and figured out that all I have to do to generate the missing symbols is SYMBOLS=1...
I have recorded a few profiles and will paste them momentarily. Adding an extra ~80 MB to the executable noticeably slows things down on its own, but I guess it's a rough guide.
I was wondering if there's a simple way to disable certain drivers without breaking things. Nothing newer than 1999 is going to run on the OMAP anyway, and it would be nice to speed up building and reduce binary size.
Also, do you have any suggestions on which ROMs are best for testing?
Here's the usage breakdown for the mame executable. Notably, the second-highest usage was by pulseaudio. Setting no sound in the cfg must not be enough?
  369990 68.1283 mame-symbols
        GPTIMER_CYCLES:16|
          samples|      %|
        ------------------
           331149 89.5021 mame-symbols
            28745  7.7691 no-vmlinux
             3605  0.9744 libc-2.5.so
             1745  0.4716 libpulsecommon-0.9.15.so
             1219  0.3295 libpulse.so.0.8.0
             1159  0.3133 libpthread-2.5.so
             1045  0.2824 libm-2.5.so
              721  0.1949 libSDL-1.2.so.0.11.1
   59717 10.9960 pulseaudio
        GPTIMER_CYCLES:16|
          samples|      %|
        ------------------
            19093 31.9725 no-vmlinux
            16969 28.4157 module-nokia-voice.so
             5997 10.0424 libpulsecommon-0.9.15.so
             4573  7.6578 libpulsecore-0.9.15.so
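To put the pulseaudio numbers in perspective, here is a rough Python sketch that takes the two per-process summary lines from the output above and works out the implied total sample count. The helper name `parse` and the variable names are made up for illustration, and it assumes the second column is each process's share of all samples on the system.

```python
# Per-process summary lines copied from the opreport output above.
SUMMARY = """\
369990 68.1283 mame-symbols
59717 10.9960 pulseaudio
"""


def parse(text):
    """Return {process: (samples, percent_of_all_samples)}."""
    out = {}
    for line in text.splitlines():
        samples, percent, name = line.split()
        out[name] = (int(samples), float(percent))
    return out


procs = parse(SUMMARY)
mame_samples, mame_pct = procs["mame-symbols"]
pa_samples, pa_pct = procs["pulseaudio"]

# Assumption: the percentage is relative to every sample recorded,
# so samples / fraction gives the system-wide total.
total = mame_samples / (mame_pct / 100.0)

# module-nokia-voice.so accounts for 16969 of pulseaudio's samples.
nokia_voice_system_pct = 100.0 * 16969 / total

print(f"implied total samples: {total:.0f}")
print(f"pulseaudio samples relative to mame: {pa_samples / mame_samples:.2f}")
print(f"module-nokia-voice.so, whole system: {nokia_voice_system_pct:.1f}%")
```

If that assumption holds, the run recorded roughly 543,000 samples in total, and module-nokia-voice.so alone accounts for about 3% of them.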
And here's the breakdown by function. The two big winners are both gfx related.
Counted GPTIMER_CYCLES events (32KiHz timer clock cycles between interrupts) with a unit mask of 0x00 (No unit mask) count 16
samples  %        image name    symbol name
108496   29.3240  mame-symbols  drawsdl_rgb565_draw_quad_palette16_none
102284   27.6451  mame-symbols  get_texel_palette16_nearest
30087     8.1318  mame-symbols  ay8910_update
28745     7.7691  no-vmlinux    /no-vmlinux
14177     3.8317  mame-symbols  drawsdl_rgb565_draw_rect
4520      1.2217  mame-symbols  generate_resampled_data
4345      1.1744  mame-symbols  chan_calc
4170      1.1271  mame-symbols  drawsdl_rgb565_draw_quad_argb32_alpha
3939      1.0646  mame-symbols  cpu_execute_m6809
3605      0.9744  libc-2.5.so   /lib/libc-2.5.so
3343      0.9035  mame-symbols  cpu_execute_z80
3277      0.8857  mame-symbols  read_byte_generic
2827      0.7641  mame-symbols  advance_eg_channel
This is for the Ghosts 'n Goblins ROM. I tried a few other older and slightly newer ROMs and every one runs at about 11% speed according to the message in xterm; I guess that's because those two drawing routines are overwhelmingly the bottleneck. The stripped binary from my last build runs at about 33%.
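For a quick sanity check on how dominant the drawing code is, here is a small Python sketch that adds up the sample counts from the listing above. The names `ROWS`, `MAME_TOTAL` and `share` are just for illustration. It assumes the percentages are relative to the 369990 samples attributed to the mame process, which matches the numbers (108496 / 369990 is 29.324%).

```python
# Rows copied from the per-function listing: (samples, symbol name).
ROWS = [
    (108496, "drawsdl_rgb565_draw_quad_palette16_none"),
    (102284, "get_texel_palette16_nearest"),
    (30087, "ay8910_update"),
    (14177, "drawsdl_rgb565_draw_rect"),
    (4170, "drawsdl_rgb565_draw_quad_argb32_alpha"),
]

# Assumption: percentages in the listing are relative to this total.
MAME_TOTAL = 369990


def share(symbols):
    """Percentage of mame's samples spent in the given symbols."""
    hits = sum(samples for samples, name in ROWS if name in symbols)
    return 100.0 * hits / MAME_TOTAL


top_two = share({
    "drawsdl_rgb565_draw_quad_palette16_none",
    "get_texel_palette16_nearest",
})

# Every listed symbol that looks like part of the software renderer.
all_draw = share({
    name for _, name in ROWS
    if name.startswith(("drawsdl_", "get_texel_"))
})

print(f"top two drawing routines: {top_two:.1f}%")      # 57.0%
print(f"all drawing routines listed: {all_draw:.1f}%")  # 61.9%
```

So the two routines together take about 57% of the samples in this profile, and the listed drawing code as a whole a bit under 62%, with ay8910_update a distant third.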
I'm building now with symbols and optimizations to see if it shows a different bottleneck.