Page 6 of 9 1 2 3 4 5 6 7 8 9
Joined: Mar 2001
Posts: 17,259
Likes: 267
R
Very Senior Member
Nice. A9 sounds great on paper (shorter pipeline, out-of-order), can't wait to see real-world benchmarks.

Joined: Jul 2006
Posts: 87
L
Member
I don't have the right to quote numbers... The 25% speed increase in DMIPS over the A8 quoted on ARM's site isn't unusual.

It's a really nice chip, but I'm very biased :)

Joined: Dec 2009
Posts: 24
F
Member
Alright, finally got back to sdlmame. Feeling completely lost was a bit off-putting. :D I installed the oprofile kernel module and userspace tools on my N900 and figured out that all I have to do to generate the missing symbols is SYMBOLS=1...

I have recorded a few profiles and will paste them momentarily. Adding an extra ~80 MB to the executable noticeably slows things down on its own, but I guess it's a rough guide.

I was wondering if there's a simple way to disable certain drivers without breaking things. Nothing newer than 1999 is going to run on the OMAP anyway, and it would be nice to speed up building and reduce binary size.

Also, do you have any suggestions on which ROMs are best for testing?

Here's the breakdown of usage by the mame executable. Notably, the second highest usage was by pulseaudio. Setting no sound in the cfg must not be enough?
Code
369990 68.1283 mame-symbols
    GPTIMER_CYCLES:16|
      samples|      %|
    ------------------
       331149 89.5021 mame-symbols
        28745  7.7691 no-vmlinux
         3605  0.9744 libc-2.5.so
         1745  0.4716 libpulsecommon-0.9.15.so
         1219  0.3295 libpulse.so.0.8.0
         1159  0.3133 libpthread-2.5.so
         1045  0.2824 libm-2.5.so
          721  0.1949 libSDL-1.2.so.0.11.1
59717 10.9960 pulseaudio
    GPTIMER_CYCLES:16|
      samples|      %|
    ------------------
        19093 31.9725 no-vmlinux
        16969 28.4157 module-nokia-voice.so
         5997 10.0424 libpulsecommon-0.9.15.so
         4573  7.6578 libpulsecore-0.9.15.so


And here's the breakdown by function. Two big winners here are gfx related.
Code
Counted GPTIMER_CYCLES events (32KiHz timer clock cycles between interrupts) with a unit mask of 0x00 (No unit mask) count 16
samples  %        image name               symbol name
108496   29.3240  mame-symbols             drawsdl_rgb565_draw_quad_palette16_none
102284   27.6451  mame-symbols             get_texel_palette16_nearest
30087     8.1318  mame-symbols             ay8910_update
28745     7.7691  no-vmlinux               /no-vmlinux
14177     3.8317  mame-symbols             drawsdl_rgb565_draw_rect
4520      1.2217  mame-symbols             generate_resampled_data
4345      1.1744  mame-symbols             chan_calc
4170      1.1271  mame-symbols             drawsdl_rgb565_draw_quad_argb32_alpha
3939      1.0646  mame-symbols             cpu_execute_m6809
3605      0.9744  libc-2.5.so              /lib/libc-2.5.so
3343      0.9035  mame-symbols             cpu_execute_z80
3277      0.8857  mame-symbols             read_byte_generic
2827      0.7641  mame-symbols             advance_eg_channel

This is for the Ghosts 'n Goblins ROM. I tried a few other older and slightly newer ROMs, and every one runs at about 11% according to the message in xterm; I guess that's because those two drawing routines are overwhelmingly the bottleneck. The stripped binary from my last build runs at about 33%.
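
For readers wondering what a palette16-to-RGB565 draw boils down to, here is a minimal sketch of the general technique: convert the palette to RGB565 once, then do one table lookup per pixel. This is not MAME's actual renderer; the function names (`pack_rgb565`, `build_lut565`, `blit_pal16_to_rgb565`) and the XRGB8888 palette layout are assumptions made for illustration, and it handles only the unscaled case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack 8-bit R, G, B into one RGB565 pixel (5 bits red, 6 green, 5 blue). */
static uint16_t pack_rgb565(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

/* Convert the whole palette once, instead of once per drawn pixel.
 * Assumes (hypothetically) that palette entries are stored as 0x00RRGGBB. */
static void build_lut565(const uint32_t *palette_xrgb, size_t entries,
                         uint16_t *lut)
{
    for (size_t i = 0; i < entries; i++) {
        uint32_t c = palette_xrgb[i];
        lut[i] = pack_rgb565((uint8_t)(c >> 16), (uint8_t)(c >> 8), (uint8_t)c);
    }
}

/* Unscaled blit: one table lookup and one 16-bit store per pixel.
 * Pitches are counted in pixels, not bytes. Every source value must be
 * a valid index into lut; this sketch does not check that. */
static void blit_pal16_to_rgb565(const uint16_t *src, size_t src_pitch,
                                 uint16_t *dst, size_t dst_pitch,
                                 size_t width, size_t height,
                                 const uint16_t *lut)
{
    assert(src_pitch >= width && dst_pitch >= width);
    for (size_t y = 0; y < height; y++) {
        const uint16_t *s = src + y * src_pitch;
        uint16_t *d = dst + y * dst_pitch;
        for (size_t x = 0; x < width; x++)
            d[x] = lut[s[x]];
    }
}
```

The inner loop has no function call and no per-pixel color math, which is the kind of shape a hot drawing path usually wants on a small in-order core. Whether it would beat the routines in the profile above would have to be measured.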

I'm building now with symbols and optimizations to see if it shows a different bottleneck.

Last edited by Flandry; 01/10/10 07:51 PM.
Joined: Mar 2001
Posts: 17,259
Likes: 267
R
Very Senior Member
That's not at all surprising - it indicates that the software renderer is eating your lunch. You need to get OpenGL running.

Joined: Dec 2009
Posts: 24
F
Member
Things look different with -O3:
Code
samples  %        image name               symbol name
65701    46.2634  mame-o3-symbols          drawsdl_rgb565_setup_and_draw_textured_quad
14508    10.2158  no-vmlinux               /no-vmlinux
6459      4.5481  mame-o3-symbols          ay8910_update
4897      3.4482  libc-2.5.so              /lib/libc-2.5.so
4539      3.1961  mame-o3-symbols          cpu_execute_z80
4522      3.1842  mame-o3-symbols          ym2203_update_one
3393      2.3892  mame-o3-symbols          drawsdl_rgb565_draw_rect

That was quite the optimization...

What exactly does that line mean:
Average speed: 34.91% (42 seconds)

Because it seemed to be running at full speed. Does that mean it was automatically dropping frames?

What bits of armel conversion are you working on?

Last edited by Flandry; 01/10/10 10:57 PM.
Joined: Mar 2001
Posts: 17,259
Likes: 267
R
Very Senior Member
Heh, yeah, trying to profile unoptimized MAME is probably not worthwhile. Press F11 during the emulation to see frameskips and actual speed - 34.91% means precisely that you were only getting 1/3rd speed.
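
As a rough illustration of what that percentage means, here is a small sketch that treats speed as emulated time divided by wall-clock time. The function names are hypothetical and this is not MAME's own bookkeeping, just the arithmetic behind "1/3rd speed".

```c
#include <assert.h>

/* Speed as a percentage: emulated time advanced per unit of wall-clock time. */
static double speed_percent(double emulated_seconds, double real_seconds)
{
    assert(real_seconds > 0.0);
    return 100.0 * emulated_seconds / real_seconds;
}

/* Wall-clock time needed to emulate a given span at a given speed. */
static double real_seconds_needed(double emulated_seconds, double percent)
{
    assert(percent > 0.0);
    return emulated_seconds * 100.0 / percent;
}
```

With these definitions, `real_seconds_needed(1.0, 34.91)` comes out a little under 2.9, so each second of game time takes nearly three seconds of real time at that speed.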

Joined: Feb 2003
Posts: 168
Senior Member
Originally Posted by Flandry
Here's the breakdown of usage by the mame executable. Notably, the second highest usage was by pulseaudio. Setting no sound in the cfg must not be enough?

If I remember correctly, two or more years ago MAME changed the nosound flag so that it only prevents you from hearing it, but it is still rendered (so it still uses cpu cycles).

Joined: Mar 2001
Posts: 17,259
Likes: 267
R
Very Senior Member
Yeah, we had to do that because on a fair number of games the game won't run properly without the sound CPU (on Atari games it processes the coin inputs, for instance).
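
The behaviour described in the last two posts can be pictured with a toy sketch like the one below: the sound hardware is always stepped, and the sound switch only decides whether the result is audible. The `toy_chip` type and both functions are made up for illustration and are not MAME code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an emulated sound chip. */
typedef struct {
    uint32_t phase;        /* internal state that must keep advancing */
    uint32_t samples_run;  /* how many samples have been emulated so far */
} toy_chip;

/* Advance the chip and render samples. This work happens regardless of mute. */
static void toy_chip_update(toy_chip *chip, int16_t *buf, size_t n)
{
    assert(chip != NULL && buf != NULL);
    for (size_t i = 0; i < n; i++) {
        chip->phase += 1024;
        buf[i] = (int16_t)((chip->phase >> 8) & 0x7FFF);
        chip->samples_run++;
    }
}

/* Always run the emulation; the switch only controls what reaches the output. */
static void sound_frame(toy_chip *chip, int16_t *out, size_t n,
                        bool sound_enabled)
{
    toy_chip_update(chip, out, n);
    if (!sound_enabled)
        memset(out, 0, n * sizeof out[0]);  /* silent, but the cycles were spent */
}
```

In this sketch, turning sound off changes the output buffer but not `samples_run`, which is why the sound update routines still show up in a profile.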

Joined: Jul 2006
Posts: 87
L
Member
You don't need to set SYMBOLS to 1 to get symbols, that's a MAME oddity :)

Just add -g (and perhaps -fno-omit-frame-pointer for oprofile backtrace; I'm not sure whether it's needed or not) to CCOMFLAGS in the appropriate place. And also remove the stripping of the executable.

As Arbee pointed out, you shouldn't waste time profiling unoptimized code.

Oh and BTW SDL is sometimes extremely crappy. On an emulator I was working on it was doing a stupid conversion that was eating 2/3 of the total time. Replacing the offending routine with my version made the blitting almost negligible.

OTOH converting SDL OpenGL to OpenGL ES 2 probably is the way to go :)

EDIT: contrary to popular belief, adding -g to gcc doesn't change the quality of the generated code.

Last edited by ldesnogu; 01/11/10 07:21 PM.
Joined: Dec 2009
Posts: 24
F
Member
Hey exciting news with the closer tracking of upstream!

I found that using a YUV overlay gives pretty good performance, but the setup and drawing of the quad is still the single biggest user of CPU time. Have you made a start on optimizing that for ARM or implementing GL ES support that I would be duplicating?

Most of the problems I have run into in the learning process here are due to weird/obsolete stuff in the makefile, so hopefully the new merged project will clean that process up. If not, I may submit some patches.

Ldesnogu: did you get .136 to build with the extra extern var declaration added? I couldn't get it to link.


Moderated by  R. Belmont 
