Re: MAME on ODROID-N2 (Linux) [Re: Robert Hildinger] #116201 10/17/19 05:26 PM
Tafoid (Senior Member)
Originally Posted by Robert Hildinger
Originally Posted by Steve Bourg
That stings, as I need a capability/fix as of 0.201. I'll definitely give it a try though.


As it turns out, the maciici emulation doesn't suffer much of a performance hit from the memory system rewrite in 0.200. The actual point where the hit occurs is between 0.203 and 0.204, but I don't yet know which commit caused it. You could try compiling 0.203 and see if that gives you a performance boost.


In some quick testing, it appears that the refactoring/improvement of m68kmmu on 11/2/18 and 11/3/18 (the machine broke with the initial commits and was fixed on 11/3/18) results in a roughly 10% drop in performance. Not sure if it can be undone or what the improvements are.

Re: MAME on ODROID-N2 (Linux) [Re: Steve Bourg] #116202 10/17/19 05:46 PM
R. Belmont (Very Senior Member)
Can't be undone, and the improvements are that it doesn't explode on contact with OSes that have more mature VM strategies than System 7.

Re: MAME on ODROID-N2 (Linux) [Re: Steve Bourg] #116203 10/17/19 06:26 PM
belegdol (Senior Member)
Originally Posted by Steve Bourg
I may have just encountered a problem with crashoverride's bgfx changes (whether it's something specific to crashoverride's code that enables the use of bgfx or it's some other problem with bgfx on odroid-n2, or just something else about that compiled binary, I don't know). The game I'm playing consistently locks up at a screen change that only happens with that bgfx binary. So I recommend handling that contribution separately.

I stopped using the patch you originally provided that completely disabled/stripped bgfx, and I have been using your second approach, "BGFX to OpenGL ES if NO_X11". I believe it took both your "NO_OPENGL" and your "BGFX to OpenGL ES if NO_X11" patches to get mame compiled on the odroid n2 without X. I would just mention that even with the second approach, running mame with "-video bgfx" did not work. I encounter the error:
Code
../../../../../3rdparty/bgfx/src/glcontext_egl.cpp (177): BGFX 0x00000002: Failed to create display 0x0
Aborted

Your second approach did enable me to get MAME compiled/linked, and MAME's performance with -video accel is just 2% less than -video bgfx from crashoverride's patch. This runs at ~ 90% emulation speed on the odroid n2. So your second approach is a big win even without a working -video bgfx option.

We have a bit of a problem. With the new bgfx, simply switching to OpenGL ES no longer works; mame fails to link:
Code
/usr/bin/ld: ../../../../linux_gcc/bin/x64/Release/libbgfx.a(glcontext_egl.o): in function `bgfx::gl::GlContext::createSwapChain(void*)':
glcontext_egl.cpp:(.text+0x82e): undefined reference to `eglGetCurrentSurface'
/usr/bin/ld: ../../../../linux_gcc/bin/x64/Release/libbgfx.a(glcontext_egl.o): in function `bgfx::gl::GlContext::destroySwapChain(bgfx::gl::SwapChainGL*)':
glcontext_egl.cpp:(.text+0x944): undefined reference to `eglGetCurrentSurface'
/usr/bin/ld: glcontext_egl.cpp:(.text+0x94c): undefined reference to `eglGetCurrentContext'
collect2: error: ld returned 1 exit status

Adding -lEGL to LDFLAGS takes care of that - mame links and runs:
Code
diff --git a/scripts/src/3rdparty.lua b/scripts/src/3rdparty.lua
index 970afc7d45..2f108c7cd7 100644
--- a/scripts/src/3rdparty.lua
+++ b/scripts/src/3rdparty.lua
@@ -1383,6 +1383,14 @@ end
                "__STDC_CONSTANT_MACROS",
                "BGFX_CONFIG_MAX_FRAME_BUFFERS=128",
        }
+
+       if _OPTIONS["NO_X11"]=="1" then
+               defines {
+               "BGFX_CONFIG_RENDERER_OPENGLES=1",
+               "BGFX_CONFIG_RENDERER_OPENGL=0",
+               }
+       end
+
        files {
                MAME_DIR .. "3rdparty/bgfx/src/bgfx.cpp",
                MAME_DIR .. "3rdparty/bgfx/src/debug_renderdoc.cpp",
diff --git a/scripts/src/osd/sdl.lua b/scripts/src/osd/sdl.lua
index ed660e65f9..9cd5415c88 100644
--- a/scripts/src/osd/sdl.lua
+++ b/scripts/src/osd/sdl.lua
@@ -29,6 +29,10 @@ function maintargetosdoptions(_target,_subtarget)
                        "X11",
                        "Xinerama",
                }
+       else
+               links {
+                       "EGL",
+               }
        end
 
        if _OPTIONS["NO_USE_XINPUT"]~="1" then

The issue is that forcing -lEGL whenever NO_X11 is specified is likely to break at least the Windows and Mac builds. I feel like I am in over my head and that someone more familiar with the interdependencies between X11, OpenGL and OpenGL ES should take care of incorporating the build switches.

Re: MAME on ODROID-N2 (Linux) [Re: Steve Bourg] #116204 10/17/19 06:33 PM
R. Belmont (Very Senior Member)
Put the NO_X11 EGL link inside a check that _OPTIONS["targetos"] is "linux", "netbsd", or "openbsd".

Re: MAME on ODROID-N2 (Linux) [Re: Steve Bourg] #116206 10/17/19 07:07 PM
belegdol (Senior Member)

Re: MAME on ODROID-N2 (Linux) [Re: belegdol] #116209 10/17/19 08:13 PM
Steve Bourg (OP, Member)
Originally Posted by belegdol

Yes, these are the droids I am looking for. You could try passing ARCHOPTS=-march=native to make (need to do REGENIE=1 for the change to take effect). Make sure the flag actually does anything by running
Code
$ gcc -march=native -E -v - </dev/null 2>&1 | grep cc1

Don't hold your breath, but if a couple percent speed bump is all you need this might just do the trick.


In the majority of my tests, mame binaries compiled on this odroid-n2 with ARCHOPTS=-march=native win by a sliver. It's such a tiny margin (typically 0.1% to 0.25%), and there are cases where the reverse is true, that I couldn't fault someone for suspecting statistical noise. Same outcome whether compiled with gcc7 or gcc8. Worth a shot.

Code
# gcc -march=native -E -v - </dev/null 2>&1 | grep cc1
 /usr/lib/gcc/aarch64-linux-gnu/8/cc1 -E -quiet -v -imultiarch aarch64-linux-gnu - -mlittle-endian -mabi=lp64 -march=armv8-a+crypto+crc -fstack-protector-strong -Wformat -Wformat-security

# gcc -E -v - </dev/null 2>&1 | grep cc1
 /usr/lib/gcc/aarch64-linux-gnu/8/cc1 -E -quiet -v -imultiarch aarch64-linux-gnu - -mlittle-endian -mabi=lp64 -fstack-protector-strong -Wformat -Wformat-security
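
To see at a glance what -march=native actually adds, the two cc1 lines can be compared token by token. This is only a rough sketch; the function name added_flags is made up for illustration, and it assumes the flags contain no spaces or glob characters.

```shell
# Print every flag that appears in the first cc1 command line
# but not in the second (hypothetical helper, not part of MAME or gcc).
added_flags() {
    with_native=$1
    without_native=$2
    for flag in $with_native; do
        case " $without_native " in
            *" $flag "*) ;;                 # present in both lines; skip it
            *) printf '%s\n' "$flag" ;;     # only present with -march=native
        esac
    done
}

# On a machine with gcc installed, the two lines could be captured like this:
#   with=$(gcc -march=native -E -v - </dev/null 2>&1 | grep cc1)
#   without=$(gcc -E -v - </dev/null 2>&1 | grep cc1)
#   added_flags "$with" "$without"
```

Fed the two lines posted above, the only flag it reports is -march=armv8-a+crypto+crc.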


I'll also note here that I have been testing mame's maciici emulation performance compiled with gcc7 vs. gcc8, and gcc7 is winning by a margin of nearly 2%. No net improvement for me, as I have been using gcc7 from the start.
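
With margins this small, it may help to repeat each timed run several times and compare the mean and spread rather than single runs. A minimal sketch, assuming the "Average speed: NN.NN% (N seconds)" line format that mame prints on exit; the function name summarize_speeds is hypothetical.

```shell
# Read "Average speed: NN.NN% (N seconds)" lines on stdin and print
# the number of runs plus the mean, minimum and maximum speed.
summarize_speeds() {
    awk '/^Average speed:/ {
             v = $3; sub(/%/, "", v); v += 0      # "91.30%" -> 91.3
             if (n == 0 || v < min) min = v
             if (n == 0 || v > max) max = v
             sum += v; n++
         }
         END {
             if (n > 0)
                 printf "runs=%d mean=%.2f min=%.2f max=%.2f\n", n, sum / n, min, max
         }'
}

# Possible usage (the 2>&1 is a guess in case the line goes to stderr):
#   for i in 1 2 3 4 5; do
#       ./mame maciici -video accel -hard1 OS_608_500MB.chd -nbc enetnb -str 120 2>&1
#   done | summarize_speeds
```

If the min/max ranges of two binaries overlap, a sub-1% difference in the means is probably not telling you much.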

Re: MAME on ODROID-N2 (Linux) [Re: Steve Bourg] #116216 10/18/19 08:24 PM
Steve Leung (Member)
FWIW, if you need to try for another few percent, I've observed measurable improvements from profile-guided optimization.

You would need to compile and link with -fprofile-generate (I believe this needs to be added to both ARCHOPTS and LDOPTS) to produce an instrumented binary, then run that binary on a representative workload. That will produce a bunch of *.gcda files in the build directory.

Then you can blow away all of the object files (leaving the *.gcda files in place), then recompile and relink with -fprofile-use instead of -fprofile-generate. See how the new binary performs.

Some potential caveats:
  • I've only tried this on amd64 Linux
  • On the second build, I've gotten errors about corrupted profile counts, which seem to be fixable by retrying the first build with -fprofile-update=atomic in addition to -fprofile-generate (cf. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58306), but YMMV.
  • Second build probably requires NOWERROR=1 since you'll probably get some warnings about missing profile counts, if that code was not reached during your instrumentation run.
  • Whenever you're changing the makefile options, you probably need REGENIE=1. I don't really know when you need to specify that.
  • You're already memory-constrained, and I have no idea what impact this would have on memory requirements during the build.


AutoFDO is a way to potentially cut out that extra instrumentation build, but I'm not sure whether the necessary infrastructure for that exists on ARM.
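
The two-build sequence described above could be sketched roughly as follows. Treat it as an outline rather than a tested recipe: pgo_make_cmd is a made-up helper that only prints the make command line for each phase, the ./build object directory is an assumption, and you would add whatever other make options you normally use.

```shell
# Hypothetical helper: print the make command line for one PGO phase.
pgo_make_cmd() {
    case $1 in
        generate)
            # Instrumented build; -fprofile-update=atomic is the workaround
            # mentioned above for corrupted profile counts.
            arch="-fprofile-generate -fprofile-update=atomic"
            link="-fprofile-generate"
            extra=""
            ;;
        use)
            # Optimized build; NOWERROR=1 because missing-profile warnings
            # are likely for code the instrumented run never reached.
            arch="-fprofile-use"
            link="-fprofile-use"
            extra=" NOWERROR=1"
            ;;
        *)
            echo "usage: pgo_make_cmd generate|use" >&2
            return 1
            ;;
    esac
    printf 'make REGENIE=1%s ARCHOPTS="%s" LDOPTS="%s"\n' "$extra" "$arch" "$link"
}

# Intended sequence (step 3 assumes objects live under ./build):
#   1. eval "$(pgo_make_cmd generate)"
#   2. run ./mame on a representative workload so the *.gcda files get written
#   3. find build -name '*.o' -delete      # keep the *.gcda files
#   4. eval "$(pgo_make_cmd use)"
```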

Re: MAME on ODROID-N2 (Linux) [Re: belegdol] #116217 10/18/19 10:54 PM
Steve Bourg (OP, Member)
Originally Posted by belegdol

I pulled the latest from master after this was merged. I can confirm a successful compile (of all systems) and a successful execution for maciici on the Odroid N2 (minimal/framebuffer). Many thanks to the MAME team. Odroid-N2 Minimal out-of-the-box!


Code
# make NO_X11=1 NOWERROR=1 NO_USE_XINPUT=1 NO_OPENGL=1 -j2

Code
# ./mame maciici -video accel -hard1 OS_608_500MB.chd -nbc enetnb

Re: MAME on ODROID-N2 (Linux) [Re: Steve Leung] #116226 10/19/19 11:26 PM
Steve Bourg (OP, Member)
Originally Posted by Steve Leung
FWIW, if you need to try for another few percent, I've observed measurable improvements from profile-guided optimization.

You would need to compile and link with -fprofile-generate (I believe this needs to be added to both ARCHOPTS and LDOPTS) to produce an instrumented binary, then run that binary on a representative workload. That will produce a bunch of *.gcda files in the build directory.

Then you can blow away all of the object files (leaving the *.gcda files in place), then recompile and relink with -fprofile-use instead of -fprofile-generate. See how the new binary performs.

...


Code

## maciici performance before profiling ##

## Launches into System 6.08 for 2 minutes ##
# ./mame maciici -video accel -hard1 ..OS_608_500MB.chd -nbc enetnb -str 120
Average speed: 91.30% (119 seconds)

## Briefly played NetTrek on maciici ##
# ./mame maciici -video accel -hard1 OS_608_500MB.chd -nbc enetnb 
Average speed: 89.12% (89 seconds)



## Build profiling version of mame, run with intended workload, build new version of mame using newly generated profiles ##

# gcc --version
gcc (Ubuntu/Linaro 7.4.0-1ubuntu1~18.04.1) 7.4.0

# make NO_X11=1 NOWERROR=1 NO_USE_XINPUT=1 NO_OPENGL=1 SOURCES=src/mame/drivers/mac.cpp ARCHOPTS="-march=native -fprofile-generate" LDOPTS="-fprofile-generate" -j1

# make clean 

## Played NetTrek under maciici / System 6.08 ##
# ./mame maciici -video accel -hard1 OS_608_500MB.chd -nbc enetnb
Average speed: 34.61% (73 seconds)

# make NO_X11=1 NOWERROR=1 NO_USE_XINPUT=1 NO_OPENGL=1 SOURCES=src/mame/drivers/mac.cpp ARCHOPTS="-march=native -fprofile-use" LDOPTS="-fprofile-use" -j1



## maciici performance after profiling ##

## Launches into System 6.08 for 2 minutes ##
# ./mame maciici -video accel -hard1 OS_608_500MB.chd -nbc enetnb -str 120
Average speed: 99.60% (119 seconds)

## Briefly played NetTrek on maciici ##
# ./mame maciici -video accel -hard1 OS_608_500MB.chd -nbc enetnb
Average speed: 97.29% (138 seconds)


That is fantastic. Audio artifacts are virtually gone at this performance level.

Had to reduce compile concurrency down to 1 job on my 2GB model, for the -fprofile-generate build at least. My first attempt failed with even just two jobs because of memory exhaustion.

Thank you sir!

Last edited by Steve Bourg; 10/19/19 11:30 PM.
Re: MAME on ODROID-N2 (Linux) [Re: Steve Bourg] #116227 10/19/19 11:36 PM
Steve Bourg (OP, Member)
I should qualify that all of my odroid-n2/mame benchmarking is for emulation of the maciici system. I'm sure that emulation performance can vary wildly from one emulated system to another, and that results from optimization techniques could vary from one emulated system to another.
