#71061 - 06/22/11 06:39 PM The "Of Course HLSL Is Slow" Megathread!  
Joined: May 2009
Posts: 1,589
Just Desserts Offline
Very Senior Member

I've heard a few complaints that the current HLSL functionality is slow - "slower than Crysis" slow.

My reasoned response: Of course HLSL is slow.

Here, I'll explain. I'm not trying to be a jerk, I'm just saying, there are a number of logical reasons why the HLSL functionality is slow.

First up is prescaling. In order for the Y axis to be "clean" and not blended, we need a decently high scaling value. The HLSL functionality currently attempts to allocate a renderable texture that is the next power of two higher than the screen's dimensions on each axis, similar to how MAME determines the game-sized render target size. If your computer's resolution is 640x480, it bumps up to 1024x512; at 800x600, it bumps up to 1024x1024; at 1024x768, it bumps up to 2048x1024; and here's the important bit: if you're running at 1920x1080, you'll end up with a 2048x2048 render texture.
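To make the sizing rule concrete, here is a minimal Python sketch that reproduces the numbers above. It is not MAME's actual code; the function names are made up for illustration, and it assumes "next power of two higher" means strictly greater than the dimension, which is what the 1024 -> 2048 example implies.

```python
def next_pow2_above(n: int) -> int:
    """Smallest power of two strictly greater than n (for n >= 1)."""
    # bit_length() is the number of bits needed to represent n, so shifting
    # 1 by that amount always lands on the next power of two above n.
    return 1 << n.bit_length()


def render_target_size(width: int, height: int) -> tuple:
    """Hypothetical helper: apply the rule to each axis independently."""
    return next_pow2_above(width), next_pow2_above(height)


for res in [(640, 480), (800, 600), (1024, 768), (1920, 1080)]:
    print(res, "->", render_target_size(*res))
```

Note that each axis is rounded up on its own, so 1920x1080 nearly doubles in height even though the width barely grows.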

This in itself wouldn't be too slow - just a little slow, not slower than Crysis by any means - but the HLSL system, at a minimum, is slinging around three or four of them every frame in order to do its multi-pass rendering. When you turn on defocusing, that increases by yet another two passes.
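As a rough back-of-the-envelope sketch of what "slinging around" those targets costs, the Python below estimates the size of one 2048x2048 target and the pixels written per frame. The 4 bytes per pixel figure and the idea that every pass fills one full target are assumptions for illustration, not details taken from the HLSL code.

```python
# Assumptions (not from the post): 4 bytes per pixel (32-bit RGBA) and
# every pass writing one full render target.
BYTES_PER_PIXEL = 4


def target_bytes(width, height, bytes_per_pixel=BYTES_PER_PIXEL):
    """Memory footprint of one render target, in bytes."""
    return width * height * bytes_per_pixel


def pixels_written_per_frame(width, height, passes):
    """Pixels written per frame if every pass fills one full target."""
    return width * height * passes


screen = (1920, 1080)
target = (2048, 2048)

print("one target:", target_bytes(*target) / 2**20, "MiB")
for passes in (3, 4, 6):  # 3 or 4 base passes; 6 = 4 plus 2 defocus passes
    written = pixels_written_per_frame(*target, passes)
    ratio = written / (screen[0] * screen[1])
    print(passes, "passes:", written, "pixels,", round(ratio, 1), "x the visible screen")
```

Under those assumptions, even the cheapest case writes several times more pixels per frame than the screen actually shows.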

Secondly, the defocus pass runs at render-target resolution and needs to take 8 samples for every outgoing pixel, and it does this twice for a smoother blur. This is much slower than the usual box-blur or separable Gaussian-blur shaders found in games, as those typically run at a lower resolution than the final target, not higher.

Lastly, bringing up the rear for just that little extra bit of anti-performance is YIQ functionality. Last I checked, it needs to sample something like 40 texels for every outgoing pixel. However, it gets a free pass, as it operates at the emulated machine's resolution and not the target resolution, so in light of the other slow functionality it's probably not even a blip on the radar.
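A quick sketch of why YIQ gets a free pass next to defocus: the Python below compares texture samples per frame for the two. The 320x240 emulated resolution is an assumed example (the real value depends on the game), and the 2048x2048 target is the 1920x1080 case from earlier in this post; the helper name is hypothetical.

```python
def samples_per_frame(width, height, samples_per_pixel, passes=1):
    """Total texture samples taken per frame for one effect."""
    return width * height * samples_per_pixel * passes


# Defocus: 8 samples per pixel, run twice, at render-target resolution.
defocus = samples_per_frame(2048, 2048, 8, passes=2)

# YIQ: roughly 40 texels per pixel, but at the emulated machine's resolution.
# 320x240 is an assumption chosen for illustration only.
yiq = samples_per_frame(320, 240, 40)

print("defocus samples per frame:", defocus)
print("YIQ samples per frame:    ", yiq)
print("defocus / YIQ ratio:      ", round(defocus / yiq, 1))
```

With these assumed numbers, defocus takes on the order of twenty times as many samples per frame as YIQ, despite YIQ's much higher per-pixel cost.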

So, what's to be done about the performance?

In short, for the best-looking settings, not much.

However, people with graphics cards that can't sling around large render targets with ease shouldn't feel left out in the cold - the next time I have a block of time free for MAME, I'm going to be adding functionality to disable or limit the HLSL pre-scaling. It might already be in the INI and I've forgotten.

Anyway, that's the long and short of it. HLSL is not dead, it just needs to rest for a few days because I burnt myself way out on it.

#71065 - 06/22/11 07:18 PM Re: The "Of Course HLSL Is Slow" Megathread! [Re: Just Desserts]  
Joined: Jun 2011
Posts: 2
Josef 1975 Offline
Member

VERONA, ITALY
Well, on my Core 2 Duo E8400 3.0 GHz with an ATI Radeon HD 5850 1 GB, I had no problem with MAME 142u4 and the "arcade perfect" preset at full HD resolution (1920x1280), but on 142u5 all the new presets I tried are quite slow; I can't get more than 15-20 fps.
Anyway... thanks for your work!! I'll keep waiting for news :)

Last edited by Josef 1975; 06/22/11 07:18 PM.
#71082 - 06/23/11 01:14 AM Re: The "Of Course HLSL Is Slow" Megathread! [Re: Just Desserts]  
Joined: Apr 2011
Posts: 271
B2K24 Offline
Senior Member

I'm wondering if performance favors Nvidia cards over ATI, similar to the Folding@home GPU2/GPU3 clients.

It would be interesting to be able to execute, say, 1 million HLSL calculations and compare times between different cards.




#71083 - 06/23/11 01:35 AM Re: The "Of Course HLSL Is Slow" Megathread! [Re: Just Desserts]  
Joined: Sep 2007
Posts: 143
Augusto Offline
Village Idiot
Senior Member

ATI Radeon HD GPUs may have "more speed" in benchmark tests, but NVIDIA has better texture quality, better solid color quality, a real and true frame rate, and does not have a terrible number of driver bugs.
Unfortunately ATI has a lot of bugs, and they also drop support for GPUs that are less than 3 years old.
I have an ATI Radeon 3000 series GPU, and AMD never added OpenCL support for my 3300/3450/3650 GPUs.
NVIDIA added CUDA for the GF8000 series. The GF8400 has CUDA support, but the 8400 is a GPU from 2005!
The Radeon HD 3650 has more than 100 GFLOPS of performance, and ATI DID NOT ADD OPENCL SUPPORT FOR A GPU FROM 2009!
Lesson learned.


The "Village Idiot" title is a bug caused by a misunderstanding, but fixing it was forgotten.
#71084 - 06/23/11 02:02 AM Re: The "Of Course HLSL Is Slow" Megathread! [Re: Just Desserts]  
Joined: Jul 2001
Posts: 100
Reznor007 Online content
Senior Member

Norman, OK, USA
"The 6XX series of cards do not have the required hardware to execute OpenCL kernels" directly from AMD.

The 6xx cards refer to the R600 series GPUs, i.e. the Radeon 3000 series.

For what it's worth, the GeForce 8000 series can only support CUDA 1.1, but the current version is 4.0.

#71085 - 06/23/11 04:31 AM Re: The "Of Course HLSL Is Slow" Megathread! [Re: Just Desserts]  
Joined: Oct 2005
Posts: 351
ReadOnly Offline
Senior Member

Correct me if I'm wrong, but isn't this the 15th thread about HLSL?

#71086 - 06/23/11 04:50 AM Re: The "Of Course HLSL Is Slow" Megathread! [Re: Just Desserts]  
Joined: Sep 2007
Posts: 143
Augusto Offline
Village Idiot
Senior Member

@Reznor007

"The 6XX series of cards do not have the required hardware to execute OpenCL kernels" directly from AMD.
Hmm... if that is true, why was OpenCL added to the HD 4300, given that the HD 4300 is the same as the HD 3450 under another name?
The HD 3300 is DX 10 and the 3450 is 10.1.
Some sources say that the 4300 is a 3450 modified to be an IGP.
The 4300 is DX 10.1 and has OpenCL.
Some developers say that it is possible to add OpenCL to the 3000 series, but for ATI money comes first, while for NVIDIA both money and users come first =)

"The 6xx cards refers to the R600 series GPU's, which is the Radeon 3000 series."
yes.

"For what it's worth, the Geforce 8000 series can only support CUDA1.1, but the current version is 4.0."
Right, but it was added.
ATI may have the newest, fastest GPU in the world, but is the frame rate true?
I had a GF6200, and its performance is slower than the ATI Radeon HD 3450, BUT its frame rate is perfect and honest.
The 3450 does not actually render all frames, but for DX and games the ATI driver reports that all frames are being rendered.
Try using MAME with vsync and triple buffering enabled. You will perhaps say "strange... vsync does not work".
MAME on a GF6200 has better video output quality than on an HD 3000 series card. Colors and brightness are very good, and an HD 3600 does not have better color output than a GeForce.
The GF6200 is slow, but the frame rate and colors are better.

For me that was a lesson learned.

@MESSfan
"Correct me if I'm wrong but isn't it the 15th thread about HLSL?"
Yes, this is a bit off from the main thread, but there is one detail here that concerns HLSL.
I do not yet have a new MAME version with HLSL support.
I am waiting for a MAME version update.

Is HLSL great?
What is the really good side of HLSL?
Perhaps it fixes the "ATI Radeon color quality" in MAME?


#71087 - 06/23/11 05:33 AM Re: The "Of Course HLSL Is Slow" Megathread! [Re: Just Desserts]  
Joined: Jul 2001
Posts: 100
Reznor007 Online content
Senior Member

Norman, OK, USA
VSync works fine here (Radeon HD6950), and the framerate is true. If you are having tearing issues, perhaps you went into the Catalyst Control Center and set vsync to disabled.

There are no color quality issues either. If you are having color problems, check the color settings in the control center as well.

And yes, HLSL is a big improvement over the standard MAME output. With some tweaking you can get it to look very similar to an original arcade CRT monitor.

#71090 - 06/23/11 10:18 AM Re: The "Of Course HLSL Is Slow" Megathread! [Re: Just Desserts]  
Joined: Sep 2007
Posts: 143
Augusto Offline
Village Idiot
Senior Member

"VSync works fine here(Radeon HD6950)"
Vsync is not good on a 3450 or 3650.

"There are no color quality issues either. If you are having color problems, check the color settings in the control center as well"
Power on two PCs, one with a Radeon HD and one with a GeForce, and see which PC has better color and brightness quality.
The GeForce has better video output quality.
It has nothing to do with the Catalyst Control Center.
GeForce video cards have better output on monitors.
For me, the ATI Radeon HD 3000 series color palette does not have good brightness and color quality.
Well... everything I say here is not a request for support or a fix; it is just for posting here.
Just a post.


Last edited by Augusto; 06/23/11 10:20 AM.

#71091 - 06/23/11 10:25 AM Re: The "Of Course HLSL Is Slow" Megathread! [Re: Just Desserts]  
Joined: Feb 2008
Posts: 98
JonasP Offline
Member

FWIW VSync doesn't work here either (Nvidia Quadro FX 880M). It's turned on in the Nvidia control panel and mess.ini but I still experience tearing.
