#106795 08/10/16 08:22 AM
Joined: Jun 2015
Posts: 55
N
NLS Offline OP
Member
I am curious about this.
I haven't really verified (sorry - can't from where I am), but seems that MAME still is a single core application. I think there is a multithread switch somewhere in the ini but it used to be "not recommended". I might be totally wrong because I don't remember.

So could I ask, what is the current status and future (where "future" means next few months or short years) prospect of MAME doing things in parallel?
By that I don't mean only utilizing multiple cores (and threads), but also utilizing GPU cores and such new technologies where possible.

I would never say to sacrifice the core mission of MAME to emulate hardware in the lowest possible level and accuracy (I am not talking about taking "shortcuts"*), but real hardware that MAME tries to emulate DOES run things in parallel anyway, right? (I mean each component does its job, not one component after the other)

Please shed some light?

---

(* this just came to me: "shortcuts" maybe shouldn't be out of the question either... I mean, sometimes just to get things forward SIMULATING a part may be a stepping stone towards EMULATING it... maybe a component's status in the future could be "not working", "incomplete", "simulated", "working". This would kick forward many mechanical machines I believe...)

Thoughts?


Last edited by NLS; 08/10/16 08:25 AM.
Joined: Jun 2001
Posts: 503
Likes: 20
O
Senior Member
Current status is "not much, a little for 3d", future status is "maybe, someone has to do it though".

OG.

Joined: May 2009
Posts: 2,120
Likes: 152
J
Very Senior Member
Those are some good questions. They've come up over the years before, but I haven't seen anyone ask them in a long while, so I'll take them one by one:

Originally Posted By NLS
I haven't really verified (sorry - can't from where I am), but seems that MAME still is a single core application. I think there is a multithread switch somewhere in the ini but it used to be "not recommended". I might be totally wrong because I don't remember.


MAME is multithreaded to the extent that it currently can be, within the two most important constraints of A) what the developers are interested in implementing, and B) what the developers have time to implement.

MAME will happily use whatever cores you throw at it, up to about 4 or 5 (there are diminishing returns here), in order to accelerate 3D drawing in drivers that support it. The actual rasterization of the 3D for the games on these drivers is done in software, rather than utilizing your GPU, for reasons I'll go over later. The drivers and games that benefit from this are largely those that use the poly.h system for threading off work units, which include Atari/Midway Seattle and Vegas games (CarnEvil, San Francisco Rush, NFL Blitz, Gauntlet Legends), the Nintendo 64 driver, Sega Model 2, Sega Model 3, the Gaelco 3D games (Radikal Bikers, Surf Planet, and Speed Up), and a number of others.

This can make a real difference in the performance of these games, although you wind up with diminishing returns: If you have two triangles that cover the same vertical areas of the screen, they need to be drawn in-order, and as such they can't be assigned to different CPU cores, as you don't know which one will be done first. The more triangles you have, the more you can bucket them off onto different CPU cores, but the more contention you have for these buckets.
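To make the bucketing idea concrete, here is a rough sketch, assuming the screen is split into fixed-height horizontal bands. It is not MAME's actual poly.h code; `Triangle`, `bucket_by_band`, and the band height are hypothetical names chosen for illustration. Each band keeps its triangles in submission order, so a worker that drains one band front to back preserves the draw order for overlapping triangles, while different bands can go to different cores.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical triangle record: only the vertical extent matters here.
struct Triangle {
    int min_y;  // first scanline covered (inclusive)
    int max_y;  // last scanline covered (inclusive)
};

// Split the screen into horizontal bands of band_height scanlines and
// record, for each band, the indices of the triangles that touch it.
// Indices are appended in submission order, so draining one band front
// to back draws overlapping triangles in the order they were submitted.
std::vector<std::vector<std::size_t>> bucket_by_band(
    const std::vector<Triangle>& tris, int screen_height, int band_height)
{
    assert(screen_height > 0 && band_height > 0);
    const int band_count = (screen_height + band_height - 1) / band_height;
    std::vector<std::vector<std::size_t>> bands(band_count);

    for (std::size_t i = 0; i < tris.size(); ++i) {
        // Skip triangles that are entirely off-screen.
        if (tris[i].max_y < 0 || tris[i].min_y >= screen_height)
            continue;
        // Clamp to the visible scanlines before picking bands.
        const int first = std::max(tris[i].min_y, 0) / band_height;
        const int last = std::min(tris[i].max_y, screen_height - 1) / band_height;
        for (int b = first; b <= last; ++b)
            bands[b].push_back(i);
    }
    return bands;
}
```

A tall triangle lands in several bands at once, which is one way to picture the contention: the more bands a triangle spans, the more workers have to touch it.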

Originally Posted By NLS
So could I ask, what is the current status and future (where "future" means next few months or short years) prospect of MAME doing things in parallel?
By that I don't mean only utilizing multiple cores (and threads), but also utilizing GPU cores and such new technologies where possible.


The only thing modern general-purpose GPU computing technology would be good for, in the context of MAME, is providing additional acceleration to the rendering on drivers that have 3D hardware.

Based on your questions, it appears that you think that it should be possible to run different devices on different host CPU cores, and that's just not something that's either worth doing or even doable at non-glacial speeds. You also appear to misunderstand how MAME works internally.

First, the different chips in an arcade machine - and indeed, any computer or console - are in fact very tightly-coupled. When a signal goes high or low, it happens within a single clock tick or less. This means that these chips can potentially exchange data, relying on each other's results, with remarkably tight synchronization. For an example of how costly that kind of synchronization is on a component level, just check out the driver for Pong - it doesn't even have a CPU, yet it's one of the slower drivers in MAME. By contrast, the different cores in your PC operate at their best when they have little to no communication between them.

By way of an analogy, the components on a PCB are more akin to a team interacting on a football pitch for a common goal, whereas the cores in your CPU are more like a synchronized swimming team, all doing the same thing but in unison.

Originally Posted By NLS
I would never say to sacrifice the core mission of MAME to emulate hardware in the lowest possible level and accuracy (I am not talking about taking "shortcuts"*), but real hardware that MAME tries to emulate DOES run things in parallel anyway, right? (I mean each component does its job, not one component after the other)

Please shed some light?

---

(* this just came to me: "shortcuts" maybe shouldn't be out of the question either... I mean, sometimes just to get things forward SIMULATING a part may be a stepping stone towards EMULATING it... maybe a component's status in the future could be "not working", "incomplete", "simulated", "working". This would kick forward many mechanical machines I believe...)

Thoughts?


And this is why I think you fundamentally misunderstand how MAME works internally. Compared to the lowest possible level of accuracy - emulating all of the ICs on a board individually and how they're interconnected - MAME is nowhere near that level in most cases. The developers deviate from that particular level of accuracy all the time. The best examples I can think of are the implementations of the video hardware in various '80s (and later) arcade games: in most cases it's not handled by a single IC, it's either implemented with multiple ASICs or controllers, or with a bunch of TTL ICs. It's rare indeed for MAME to emulate the video hardware on that level.

I'm looking at making that sort of trade-off myself right now: I'm looking into emulating the Fairlight CMI series of synthesizers, and I have schematics for two of the types of boards that would be in the rack unit. I could, in theory, emulate the components on the schematic using MAME's netlist library. In practice, I'm very unlikely to do that, because at the end of the day the schematics already exist and I feel that MAME loses its documentary value for the games themselves if it needlessly sacrifices speed for accuracy when it can be both sufficiently speedy and sufficiently accurate, when evaluated in context.

Accuracy is not an all-or-nothing approach, and it ultimately comes down to what level of accuracy a given driver author is comfortable with, and a tacit agreement among the developers as to a minimum level of accuracy that we strive to maintain. MAME wouldn't be able to benefit from any additional level of parallelism other than accelerating 3D drivers, quite simply.

Joined: Aug 2015
Posts: 405
Senior Member
You can also say that MAME itself is an event-based scheduler, where events are either scheduled on timers or triggered by software that changes the state of signals between emulated entities. The entities in turn are more or less generic and can be reused in different combinations for different PCBs.

So yes, there is a single host process for the emulation core, but it implements semi-parallelism for the emulated parts, AFAICT. I'm still learning this, though.
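As an illustration of the event-queue idea, here is a minimal single-threaded sketch. It is an assumption-laden toy, not MAME's real scheduler API: `EventScheduler`, `schedule`, and `run_until` are hypothetical names, and time is just a tick counter. Events fire strictly in emulated-time order, and a callback may schedule further events, which is how independent emulated parts appear to run side by side on one host thread.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical single-threaded event scheduler; not MAME's real scheduler API.
class EventScheduler {
public:
    using Callback = std::function<void(EventScheduler&)>;

    // Schedule cb to fire delay ticks after the current emulated time.
    void schedule(std::uint64_t delay, Callback cb) {
        queue_.push(Event{now_ + delay, next_seq_++, std::move(cb)});
    }

    // Fire every event due at or before until, in emulated-time order.
    void run_until(std::uint64_t until) {
        assert(until >= now_);
        while (!queue_.empty() && queue_.top().when <= until) {
            Event ev = queue_.top();
            queue_.pop();
            now_ = ev.when;   // emulated time jumps straight to the event
            ev.cb(*this);     // the callback may schedule more events
        }
        now_ = until;
    }

    std::uint64_t now() const { return now_; }

private:
    struct Event {
        std::uint64_t when;
        std::uint64_t seq;    // tie-breaker: earlier-scheduled events fire first
        Callback cb;
    };
    struct Later {
        bool operator()(const Event& a, const Event& b) const {
            return a.when != b.when ? a.when > b.when : a.seq > b.seq;
        }
    };

    std::priority_queue<Event, std::vector<Event>, Later> queue_;
    std::uint64_t now_ = 0;
    std::uint64_t next_seq_ = 0;
};
```

Note that nothing here ever sleeps or waits: the loop simply pops the next due event, which matches the "single host process, semi-parallel emulated parts" picture.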


Because I can
Joined: Jun 2015
Posts: 55
N
NLS Offline OP
Member
Thanks all. I understand JD.

Joined: May 2007
Posts: 568
Likes: 3
M
Senior Member
I wonder whether it will be possible some time to use threads for loosely coupled components (in particular implementing device_execute_interface), e.g. used in peripheral devices like floppy drives with built-in controller. The Hexbus floppy drive (for the 99/8) had a TMS9995 CPU inside.

Sure, I know that we only have virtual real-time (something that looks like real time but is only in sync at discrete points in time), so it is questionable whether thinking about any kind of coupling from tight to loose makes any sense.
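One way to picture "virtual real-time" is a time-sliced loop, sketched below under simplifying assumptions. `Device` and `run_slice` are hypothetical names and this is not how MAME's scheduler is actually written. Each device is advanced by one slice in turn on a single host thread, so the devices only agree on the emulated time at slice boundaries.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical emulated device: just a clock rate and a cycle counter.
struct Device {
    std::uint64_t clock_hz;
    std::uint64_t cycles_run;
};

// Advance every device by one time slice, one after another, on one host
// thread. The devices agree on the emulated time only at slice boundaries.
void run_slice(std::vector<Device>& devices, std::uint64_t slice_us) {
    assert(slice_us > 0);
    for (Device& dev : devices) {
        // Cycles this device would execute during the slice (rounded down).
        dev.cycles_run += dev.clock_hz * slice_us / 1000000;
    }
}
```

Shrinking the slice tightens the coupling between devices at the cost of more switching overhead, which is the trade-off the question about tight versus loose coupling is really about.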

Joined: May 2004
Posts: 979
Likes: 58
D
Senior Member
We are at a point where we emulate discrete components that are connected over slow links (serial for example). In the future, I could see something like this:

- Thread 1: Computer
- Thread 2: Serial keyboard with CPU
- Thread 3: Serial floppy with CPU

We have specific sync points (bit changes) where we can sync up the threads.

This makes most sense when emulating something at a low level (MFM hard drive?).
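A minimal sketch of what syncing threads at bit changes could look like, assuming one sender and one receiver on a serial line. Everything here is hypothetical (`SyncPoint`, `transfer_bits`, the two-barrier handshake per bit) and nothing like it exists in MAME today. The barrier is hand-rolled with a mutex and condition variable so the example needs nothing newer than C++11.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical reusable barrier: every participant blocks in
// arrive_and_wait() until all of them have reached the same sync point.
class SyncPoint {
public:
    explicit SyncPoint(std::size_t participants) : participants_(participants) {
        assert(participants > 0);
    }

    void arrive_and_wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        const std::size_t my_generation = generation_;
        if (++arrived_ == participants_) {
            arrived_ = 0;        // last arrival resets the barrier
            ++generation_;       // and releases everyone waiting on it
            cv_.notify_all();
        } else {
            cv_.wait(lock, [&] { return generation_ != my_generation; });
        }
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    std::size_t participants_;
    std::size_t arrived_ = 0;
    std::size_t generation_ = 0;
};

// Runs a "keyboard" thread that drives a serial line and a "computer"
// thread that samples it. Each bit period has two sync points: one after
// the sender has written the bit, one after the receiver has read it.
std::vector<int> transfer_bits(const std::vector<int>& bits) {
    SyncPoint sync(2);
    int line = 0;                // shared serial line state
    std::vector<int> received;

    std::thread keyboard([&] {
        for (int bit : bits) {
            line = bit;              // drive the line for this bit period
            sync.arrive_and_wait();  // the bit is now stable
            sync.arrive_and_wait();  // wait until the receiver has sampled it
        }
    });
    std::thread computer([&] {
        for (std::size_t i = 0; i < bits.size(); ++i) {
            sync.arrive_and_wait();  // wait for the sender to drive the line
            received.push_back(line);
            sync.arrive_and_wait();  // release the sender for the next bit
        }
    });

    keyboard.join();
    computer.join();
    return received;
}
```

Between sync points each thread would be free to run its own emulated CPU. Two barrier waits per bit is a lot of host-side overhead, so this only pays off if each thread has substantial work to do per bit period.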

Joined: Mar 2001
Posts: 17,008
Likes: 94
R
Very Senior Member
Originally Posted By Just Desserts
If you have two triangles that cover the same vertical areas of the screen, they need to be drawn in-order, and as such they can't be assigned to different CPU cores, as you don't know which one will be done first. The more triangles you have, the more you can bucket them off onto different CPU cores, but the more contention you have for these buckets.


Which is why GPU rendering is attractive. I'd love to see a BGFX port of the proof-of-concept Voodoo D3D11 renderer; if nothing else it would solidify exactly how that needs to be done.

Joined: Mar 2001
Posts: 17,008
Likes: 94
R
Very Senior Member
Duke: the issue there is that in most cases you don't win much by threading off, say, a 4 MHz 8051 in a serial keyboard.

Joined: Aug 2015
Posts: 405
Senior Member
I think there is no gain from it, though. The only reason to decouple things from MAME would be if they touch a real-world device, IMHO. This is already done.

Threading for its own sake is like working 10 hours per day Monday through Thursday and only getting Friday afternoon off in return. If your computer is already at max CPU utilization, you will only lose more cycles by dividing the task into subtasks, because they need to talk to each other. However, if your code is polling for things to complete, threading can instead let you do other important things.

MAME is not waiting for anything; it is executing code from a timer queue, either "exactly" at the pace of the emulated system or as fast as it can on your computer.

Since C++ abstracts devices in a nice way, there is no need to abstract them using threads as well; it would only complicate things.

Personally I would rather see a multi-instance MAME where it would be *easy* to interconnect two or more MAME instances using emulated networks and even interconnect them with real hardware.


Because I can