#57252 12/22/09 12:36 AM
Joined: May 2009
Posts: 2,208
Likes: 354
J
Very Senior Member
OP Offline
Make this faster:

Code
INLINE void COMBINER_EQUATION(UINT8 *out, UINT8 *A, UINT8 *B, UINT8 *C, UINT8 *D)
{
	INT32 color = (((*A-*B)* *C) + (*D << 8) + 0x80);
	color >>= 8;
	if (color > 255)
	{
		*out = 255;
	}
	else if (color < 0)
	{
		*out = 0;
	}
	else
	{
		*out = (UINT8)color;
	}
}

Joined: Mar 2006
Posts: 1,079
Likes: 6
L
Very Senior Member
Offline
Code
INLINE void COMBINER_EQUATION(UINT8 *out, UINT8 *A, UINT8 *B, UINT8 *C, UINT8 *D)
{
	INT32 color = (((*A-*B)* *C) + 0x80);
	color >>= 8;
	color += *D;
	if (color > 255)
	{
		*out = 255;
	}
	else if (color < 0)
	{
		*out = 0;
	}
	else
	{
		*out = (UINT8)color;
	}
}
This saves a shift by adding D after the shift instead of before it, but it MIGHT have issues because of that? Not sure. Do you have a testbench for this?
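
On the testbench question, here is a minimal sketch of one. It assumes 8-bit inputs and a compiler where `>>` on a negative signed value is an arithmetic shift (implementation-defined in C, though the code in this thread relies on it too). The typedefs stand in for the emulator's own `UINT8`/`INT32`, the helper names `clamp_u8`, `combine_before`, `combine_after` and `count_mismatches` are made up for illustration, and `t` stands in for the product `(*A - *B) * *C`. Since `*D << 8` is a multiple of 256, adding it before the shift should give the same result as adding `*D` afterwards, and the loop checks that over the whole range the product can take.

```c
#include <stdint.h>

/* Stand-ins for the emulator's own typedefs (assumption). */
typedef uint8_t UINT8;
typedef int32_t INT32;

/* Reference clamp to 0..255, as in the original post. */
static UINT8 clamp_u8(INT32 color)
{
	if (color > 255) return 255;
	if (color < 0) return 0;
	return (UINT8)color;
}

/* Original order: D is added before the shift. */
static UINT8 combine_before(INT32 t, UINT8 d)
{
	return clamp_u8((t + (d << 8) + 0x80) >> 8);
}

/* Reordered: D is added after the shift. */
static UINT8 combine_after(INT32 t, UINT8 d)
{
	return clamp_u8(((t + 0x80) >> 8) + d);
}

/* Count disagreements for every t in -255*255..255*255 and every D. */
static long count_mismatches(void)
{
	long bad = 0;
	INT32 t;
	int d;

	for (t = -255 * 255; t <= 255 * 255; t++)
		for (d = 0; d < 256; d++)
			if (combine_before(t, (UINT8)d) != combine_after(t, (UINT8)d))
				bad++;
	return bad;
}
```

Calling `count_mismatches()` from a scratch `main` and printing the result is enough to run it; a return value of 0 means the two orderings agree for every input in that range.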


"When life gives you zombies... *CHA-CHIK!* ...you make zombie-ade!"
Joined: Mar 2006
Posts: 1,079
Likes: 6
L
Very Senior Member
Offline
Oh boy, I just thought of something truly gross which may be faster still:
Code
INLINE void COMBINER_EQUATION(UINT8 *out, UINT8 *A, UINT8 *B, UINT8 *C, UINT8 *D)
{
	INT32 color = (((*A-*B)* *C) + (*D << 8) + 0x80);
	// at this point color is located in the 0x0000nn00 bits
	if (color&0x7FFF0000) *out=(UINT8)(~color>>23);
	else
	{
		*out = (UINT8)(color>>8);
	}
}
The trick is that if any of bits 0x7FFF0000 are set, the result is out of range: a negative value has all of them set through sign extension, and a result above 255 has bit 0x00010000 set. Because of the limited range of the multipliers, bits 0x7F800000 are always all zero for a non-negative value (or all one if the number is negative). Hence if you invert the value and put bits 0x7F800000 in the result whenever any of the mask bits were set, you get instant clamping: 255 on overflow, 0 on underflow.
Now, can I make it faster...
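
To make the range argument above concrete, here is a small sketch that compares the mask-and-invert clamp against a plain clamp over every value the pre-shift sum can take with 8-bit inputs. The typedefs stand in for the emulator's own, the function names are invented for this example, and arithmetic right shift of negative values is assumed.

```c
#include <stdint.h>

/* Stand-ins for the emulator's own typedefs (assumption). */
typedef uint8_t UINT8;
typedef int32_t INT32;

/* Plain shift-then-clamp, used as the reference. */
static UINT8 clamp_ref(INT32 color)
{
	color >>= 8;
	if (color > 255) return 255;
	if (color < 0) return 0;
	return (UINT8)color;
}

/* Mask-and-invert clamp, same expressions as in the post above. */
static UINT8 clamp_bits(INT32 color)
{
	if (color & 0x7FFF0000)
		return (UINT8)(~color >> 23);
	return (UINT8)(color >> 8);
}

/* Walk the full range of ((A-B)*C) + (D<<8) + 0x80 for 8-bit A, B, C, D. */
static long count_bit_trick_mismatches(void)
{
	const INT32 lo = -255 * 255 + 0x80;             /* A=0, B=255, C=255, D=0 */
	const INT32 hi = 255 * 255 + (255 << 8) + 0x80; /* A=255, B=0, C=255, D=255 */
	long bad = 0;
	INT32 color;

	for (color = lo; color <= hi; color++)
		if (clamp_bits(color) != clamp_ref(color))
			bad++;
	return bad;
}
```

The largest sum is 0x1FD81, so a positive overflow only ever sets bit 16 and leaves bits 23 to 30 clear, which is why inverting and shifting by 23 yields 255 in that case and 0 for a negative input.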

Last edited by Lord Nightmare; 12/22/09 02:23 AM. Reason: add explanation pt 2.
Joined: Sep 2004
Posts: 392
Likes: 4
A
Senior Member
Offline
Put the common case first and test for the outliers using a single unsigned compare. Also agree that you can move the addition of D after the shift.

Code
INLINE void COMBINER_EQUATION(UINT8 *out, UINT8 *A, UINT8 *B, UINT8 *C, UINT8 *D)
{
	INT32 color = (((*A-*B)* *C) + 0x80);
	color >>= 8;
	color += *D;
	if ((UINT32)color < 256)
	{
		*out = color;
	}
	else if (color < 0)
	{
		*out = 0;
	}
	else
	{
		*out = 255;
	}
}

A bigger win would be to identify a few common cases (like C == 0x100, B == 0, D == 0, etc) and check for those in the outer loops so that you can avoid a bunch of the math in the innermost loop, and might even be able to avoid the clamping (if B and D are 0, for example).
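
A rough sketch of that idea for the B == 0, D == 0 case, under stated assumptions: `combine_span` is a hypothetical outer loop (the real renderer's loop structure is not shown in this thread) in which B and D are constant across a span, the typedefs stand in for the emulator's own, and inputs are passed by value for brevity. With B and D both zero the result is `(A * C + 0x80) >> 8`, which tops out at 254 and cannot go negative, so the clamp can be dropped.

```c
#include <stdint.h>

/* Stand-ins for the emulator's own typedefs (assumption). */
typedef uint8_t UINT8;
typedef int32_t INT32;
typedef uint32_t UINT32;

/* General path: same math and clamp as the version above. */
static UINT8 combine_general(UINT8 a, UINT8 b, UINT8 c, UINT8 d)
{
	INT32 color = ((((INT32)a - (INT32)b) * (INT32)c + 0x80) >> 8) + d;
	if ((UINT32)color < 256) return (UINT8)color;
	return (color < 0) ? 0 : 255;
}

/* Hypothetical span loop: the special case is tested once, outside the inner loop. */
static void combine_span(UINT8 *out, const UINT8 *a, UINT8 b,
                         const UINT8 *c, UINT8 d, int n)
{
	int i;

	if (b == 0 && d == 0)
	{
		/* 0 <= (a*c + 0x80) >> 8 <= 254, so no clamp is needed. */
		for (i = 0; i < n; i++)
			out[i] = (UINT8)((a[i] * c[i] + 0x80) >> 8);
	}
	else
	{
		for (i = 0; i < n; i++)
			out[i] = combine_general(a[i], b, c[i], d);
	}
}

/* Check the fast path against the general path for every A and C. */
static long count_fast_path_mismatches(void)
{
	UINT8 a[256], c[256], out[256];
	long bad = 0;
	int av, i;

	for (i = 0; i < 256; i++)
		c[i] = (UINT8)i;
	for (av = 0; av < 256; av++)
	{
		for (i = 0; i < 256; i++)
			a[i] = (UINT8)av;
		combine_span(out, a, 0, c, 0, 256);
		for (i = 0; i < 256; i++)
			if (out[i] != combine_general((UINT8)av, 0, c[i], 0))
				bad++;
	}
	return bad;
}
```

Whether this pays off depends on how often such spans occur, which would need profiling on real content.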

Joined: Mar 2006
Posts: 1,079
Likes: 6
L
Very Senior Member
Offline
I think so, by combining both ways:
Code
INLINE void COMBINER_EQUATION(UINT8 *out, UINT8 *A, UINT8 *B, UINT8 *C, UINT8 *D)
{
	INT32 color = (((*A-*B)* *C) + 0x80);
	color >>= 8;
	color += *D;
	// at this point color is located in the 0x000000nn bits
	if (color&0x7FFFFF00) *out=(UINT8)(~color>>11);
	else
	{
		*out = (UINT8)(color);
	}
}

Joined: Sep 2004
Posts: 392
Likes: 4
A
Senior Member
Offline
Cute, that should work even better. Still, put the common case first (i.e., swap the if/else).

Joined: Mar 2006
Posts: 1,079
Likes: 6
L
Very Senior Member
Offline
Like this?
Code
INLINE void COMBINER_EQUATION(UINT8 *out, UINT8 *A, UINT8 *B, UINT8 *C, UINT8 *D)
{
	INT32 color = (((*A-*B)* *C) + 0x80);
	color >>= 8;
	color += *D;
	// at this point color is located in the 0x000000nn bits
	if ((color&0x7FFFFF00)==0) *out = (UINT8)color;
	else *out=(UINT8)(~color>>11);
}

Joined: May 2009
Posts: 2,208
Likes: 354
J
Very Senior Member
OP Offline
I'm going to have to profile against this, because I'm pretty sure I win:

Code
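/* In VIDEO_START(n64): */
	/* Stage 1: ((A - B) * C + 0x80) >> 8 for every 8-bit A, B, C, kept as 16 bits. */
	for(i = 0; i < (1 << 24); i++)
	{
		UINT8 A = (i >> 16) & 0x000000ff;
		UINT8 B = (i >> 8) & 0x000000ff;
		UINT8 C = i & 0x000000ff;
		/* Stored unsigned (assumes cc_lut1 is a UINT16 array) so that
		   (cc_lut1[...] << 8) never becomes a negative index into cc_lut2. */
		cc_lut1[i] = (UINT16)((((((INT32)A - (INT32)B) * (INT32)C) + 0x80) >> 8) & 0x0000ffff);
	}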

	for(i = 0; i < (1 << 16); i++)
	{
		for(j = 0; j < (1 << 8); j++)
		{
			INT32 temp = (INT32)((INT16)i) + j;
			if(temp > 255)
			{
				cc_lut2[(i << 8) | j] = 255;
			}
			else if(temp < 0)
			{
				cc_lut2[(i << 8) | j] = 0;
			}
			else
			{
				cc_lut2[(i << 8) | j] = (UINT8)temp;
			}
		}
	}

...

INLINE void COMBINER_EQUATION(UINT8 *out, UINT8 *A, UINT8 *B, UINT8 *C, UINT8 *D)
{
	/* The speedy, lookup table enabled version */
	*out = cc_lut2[(cc_lut1[(*A << 16) | (*B << 8) | *C] << 8) | *D];
}

Brings COMBINER_EQUATION from 13.98% of total execution time to 9.09%. :)

ETA: Also, I'm not trying to show you guys up or anything, it's just my 'net access went down shortly after I made this post, so I had to come up with it on my own in the meantime.
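
The two-table decomposition above can be sanity-checked without allocating the tables, by computing what each entry would hold on the fly. This is only a sketch: `stage1` and `stage2` are invented names that mirror the fill loops for `cc_lut1` and `cc_lut2`, the stage-1 value is assumed to be stored as an unsigned 16-bit quantity, the typedefs stand in for the emulator's own, and the `(INT16)` reinterpretation assumes the usual two's-complement wraparound. Under those same element-size assumptions the real tables would take about 32 MiB and 16 MiB.

```c
#include <stdint.h>

/* Stand-ins for the emulator's own typedefs (assumption). */
typedef uint8_t UINT8;
typedef int16_t INT16;
typedef uint16_t UINT16;
typedef int32_t INT32;
typedef uint32_t UINT32;

/* What cc_lut1[(A << 16) | (B << 8) | C] would hold, as 16 unsigned bits. */
static UINT16 stage1(UINT8 a, UINT8 b, UINT8 c)
{
	return (UINT16)(((((INT32)a - (INT32)b) * (INT32)c) + 0x80) >> 8);
}

/* What cc_lut2[index] would hold, where index = (stage1 << 8) | D. */
static UINT8 stage2(UINT32 index)
{
	INT32 temp = (INT32)(INT16)(index >> 8) + (INT32)(index & 0xff);
	if (temp > 255) return 255;
	if (temp < 0) return 0;
	return (UINT8)temp;
}

/* The lookup expression, with both table reads replaced by the functions above. */
static UINT8 combine_lut_style(UINT8 a, UINT8 b, UINT8 c, UINT8 d)
{
	return stage2(((UINT32)stage1(a, b, c) << 8) | d);
}

/* Direct computation with the clamp, used as the reference. */
static UINT8 combine_ref(UINT8 a, UINT8 b, UINT8 c, UINT8 d)
{
	INT32 color = (((INT32)a - (INT32)b) * (INT32)c + ((INT32)d << 8) + 0x80) >> 8;
	if (color > 255) return 255;
	if (color < 0) return 0;
	return (UINT8)color;
}

/* Compare for every difference A - B in -255..255, every C and every D. */
static long count_lut_mismatches(void)
{
	long bad = 0;
	int diff, c, d;

	for (diff = -255; diff <= 255; diff++)
	{
		UINT8 a = (UINT8)(diff > 0 ? diff : 0);
		UINT8 b = (UINT8)(diff < 0 ? -diff : 0);
		for (c = 0; c < 256; c++)
			for (d = 0; d < 256; d++)
				if (combine_lut_style(a, b, (UINT8)c, (UINT8)d) !=
				    combine_ref(a, b, (UINT8)c, (UINT8)d))
					bad++;
	}
	return bad;
}
```

The loop iterates over the difference A - B rather than every (A, B) pair, since the result depends on A and B only through that difference.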

Joined: Mar 2006
Posts: 1,079
Likes: 6
L
Very Senior Member
Offline
Yes, that works, but you can optimize the table creation a little by using some stuff from the thread. Not that an 'only run once on startup' thing is a major speed loss.

LN

Joined: May 2009
Posts: 2,208
Likes: 354
J
Very Senior Member
OP Offline
Actually, your final suggestion performed worse by nearly 2% of total execution time versus the baseline case... :(
