Joined: Nov 2006
Posts: 239
Senior Member
Hi All ..
Great work, I love the adaptation of the gl dispatcher etc.
I have fixed a few bugs ..
You can get it from my usual place:
http://www.jausoft.com/cgi-bin/gitweb.cgi?p=mame/src/osd/sdl;a=shortlog;h=sgothel_stable
( Well, the bilinear filter is not completely bug-free yet, so if things look corrupted (e.g. in a small window), just force power-of-2 textures until then. )
2007-10-04 sgothel
+ - Fix: LUT bitmap size calculation
+ Remainder of sqrt was lost, visible in galaxian (shot invisible)
+
+ - Fix: Bilinear GLSL fragment shader
+ - add 0.5 texel in coordinates
+ - ease the coordinate usage a bit
+
+2007-10-04 sgothel
- sync 0120u2
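The LUT size fix in the changelog above is a rounding issue: if the texture side is taken from a truncated square root, the rectangle can hold fewer texels than the table has entries. Below is a minimal sketch of one way to size such a bitmap, assuming the goal is a near-square width x height with width * height >= entries. The helper `lut_dims` and its signature are hypothetical and are not taken from the sdlmame source.

```c
#include <assert.h>

/* Hypothetical helper, not the sdlmame code: choose a near-square
 * width x height that holds at least `entries` LUT texels. */
static void lut_dims(unsigned entries, unsigned *width, unsigned *height)
{
    unsigned w = 1;

    assert(entries > 0);

    /* integer ceil(sqrt(entries)): smallest w with w*w >= entries */
    while ((unsigned long long)w * w < entries)
        w++;

    *width = w;

    /* round the row count up so the remainder is not dropped */
    *height = (entries + w - 1) / w;
}
```

For example, with 32768 entries a truncated root gives 181 x 181 = 32761 texels, seven short, while the rounded-up version gives 182 x 181.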
Joined: Mar 2001
Posts: 17,316 Likes: 280
Very Senior Member
Joined: Mar 2007
Posts: 6
Member
Damn, the pthread error still happens, though it seems to get further now than in 120u1: it at least gets to the warning screen for the game.
Joined: Mar 2001
Posts: 17,316 Likes: 280
Very Senior Member
The pthread error is likely a glibc issue rather than a kernel one. The fact that Fedora 7 and Mac OS X run solidly with no errors tends to point in that direction (OS X doesn't even use glibc).
Joined: Nov 2006
Posts: 239
Senior Member
Well, it happens here as well :-| where "here" is:
OpenSuSE 10.3, linux 2.6.22.6
cpu: Intel Dual Core (i686, no amd64 extension)
GNU C Library stable release version 2.6.1 (20070803)
gcc 4.2.1
ati fglrx driver 8.42.3
+++
But it works on my x86_64 machines, running in 64bit mode:
OpenSuSE 10.3, linux 2.6.22.6 or linux 2.6.23-24
cpu: AMD64 x2 x86_64
GNU C Library stable release version 2.6.1 (20070803)
gcc 4.2.1
NVIDIA-Linux-x86_64-100.14.19
+++
Funny ..
So I must assume some arch-dependent code, either within glibc or sdlmame, is responsible for this. Since a bunch of other software runs well on the 'dirty' machine, well ..
Joined: Mar 2001
Posts: 17,316 Likes: 280
Very Senior Member
Hmm, it could be a 32-bit thing. That's interesting, I'll try a 32-bit build.
Joined: Mar 2001
Posts: 17,316 Likes: 280
Very Senior Member
Yeah, it's definitely 32-bit only. No idea why though.
Joined: Feb 2004
Posts: 2,651 Likes: 371
Very Senior Member
It can't be a problem with SDLMAME - we do the same thing with pthread condition variables on every platform and architecture.
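For readers who have not seen the pattern being referred to, an event built on a pthread mutex plus a condition variable usually looks something like the sketch below. It is a generic illustration and not the sdlwork.c code; the `sketch_event` type and function names are made up for this example.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <time.h>

/* Generic sketch only; names are hypothetical, not from sdlmame. */
typedef struct
{
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    int             signalled;
} sketch_event;

static int sketch_event_init(sketch_event *ev)
{
    assert(ev != NULL);
    ev->signalled = 0;
    if (pthread_mutex_init(&ev->mutex, NULL) != 0)
        return -1;
    if (pthread_cond_init(&ev->cond, NULL) != 0)
    {
        pthread_mutex_destroy(&ev->mutex);
        return -1;
    }
    return 0;
}

static void sketch_event_set(sketch_event *ev)
{
    pthread_mutex_lock(&ev->mutex);
    ev->signalled = 1;
    pthread_cond_broadcast(&ev->cond);
    pthread_mutex_unlock(&ev->mutex);
}

static void sketch_event_reset(sketch_event *ev)
{
    pthread_mutex_lock(&ev->mutex);
    ev->signalled = 0;
    pthread_mutex_unlock(&ev->mutex);
}

/* Wait up to timeout_ms; returns 1 if signalled, 0 otherwise. */
static int sketch_event_wait(sketch_event *ev, long timeout_ms)
{
    struct timespec deadline;
    int rc = 0;
    int result;

    /* pthread_cond_timedwait takes an absolute deadline */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L)
    {
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&ev->mutex);
    /* loop because condition variables may wake spuriously;
       stop on timeout or any other error */
    while (!ev->signalled && rc == 0)
        rc = pthread_cond_timedwait(&ev->cond, &ev->mutex, &deadline);
    result = ev->signalled;
    pthread_mutex_unlock(&ev->mutex);
    return result;
}
```

The flag is always read and written under the mutex, so if the mutex or cond memory were overwritten by something else, a wait like this could fail or hang regardless of how correct the pattern itself is.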
Joined: Mar 2001
Posts: 17,316 Likes: 280
Very Senior Member
I dunno, the behavior is very much like memory trashing taking place at some point and stomping either the mutex or cond.
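One cheap way to test the stomping theory, sketched here on the assumption that the mutex and cond can be wrapped in a struct: put known guard words on either side of them and verify the guards whenever a pthread call fails. The `guarded_sync` type, its functions and the guard value are all hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define GUARD_WORD 0x5AFEC0DEu  /* arbitrary marker value */

/* Hypothetical wrapper: guard words surround the sync objects. */
typedef struct
{
    uint32_t        guard_head;
    pthread_mutex_t mutex;
    uint32_t        guard_mid;
    pthread_cond_t  cond;
    uint32_t        guard_tail;
} guarded_sync;

static int guarded_sync_init(guarded_sync *gs)
{
    assert(gs != NULL);
    gs->guard_head = GUARD_WORD;
    gs->guard_mid  = GUARD_WORD;
    gs->guard_tail = GUARD_WORD;
    if (pthread_mutex_init(&gs->mutex, NULL) != 0)
        return -1;
    if (pthread_cond_init(&gs->cond, NULL) != 0)
    {
        pthread_mutex_destroy(&gs->mutex);
        return -1;
    }
    return 0;
}

/* Returns 1 while all guards are intact, 0 if one was overwritten. */
static int guarded_sync_intact(const guarded_sync *gs)
{
    return gs->guard_head == GUARD_WORD &&
           gs->guard_mid  == GUARD_WORD &&
           gs->guard_tail == GUARD_WORD;
}
```

This only catches overruns that cross a guard word. A stray write that lands directly inside the mutex or cond would go unnoticed, so a memory checker such as valgrind is the more thorough option.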
Joined: Feb 2007
Posts: 507
Senior Member
The following patch is not strictly performance related, so I am posting it here. It brings over some minor missing bits and pieces from winwork.c to sdlwork.c. In addition, it removes some obsolete includes and introduces inlines (they might as well be implemented as defines) to get the source closer to winwork.c.
More coherence could be achieved by renaming mame_thread_info to thread_info. I have not touched this one, since it was introduced by Vas and I fear changing it may break something on OSX.
In my opinion, the definition/inlining of all interlocked_* functions should be moved to osinline.h/eigc*.h, and they should be renamed to osd_interlocked* or atomic_interlocked*. Since this would affect winwork.c, I have not made any changes there.
diff -Nur /tmp/sdlmame0120u2/src/osd/sdl/sdlwork.c src/osd/sdl/sdlwork.c
--- /tmp/sdlmame0120u2/src/osd/sdl/sdlwork.c 2007-11-03 15:09:43.000000000 +0100
+++ src/osd/sdl/sdlwork.c 2007-11-06 23:31:02.000000000 +0100
@@ -16,13 +16,6 @@
#include "os2work.c"
#else
-// standard headers
-#include <time.h>
-#if defined(SDLMAME_UNIX) && !defined(SDLMAME_DARWIN)
-#include <sys/time.h>
-#endif
-#include <pthread.h>
-#include <unistd.h>
#include "osdcore.h"
#include "osinline.h"
@@ -48,7 +41,6 @@
//============================================================
#define INFINITE (osd_ticks_per_second()*10000)
-#define MAX_THREADS (16)
#define SPIN_LOOP_TIME (osd_ticks_per_second() / 1000)
@@ -58,7 +50,7 @@
//============================================================
#if KEEP_STATISTICS
-#define add_to_stat(v,x) do { atomic_add32((v), (x)); } while (0)
+#define add_to_stat(v,x) do { interlocked_add((v), (x)); } while (0)
#define begin_timing(v) do { (v) -= osd_profiling_ticks(); } while (0)
#define end_timing(v) do { (v) += osd_profiling_ticks(); } while (0)
#else
@@ -79,8 +71,8 @@
struct
{
volatile INT32 haslock; // do we have the lock?
- UINT8 filler[60]; // assumes a 64-byte cache line
- } slot[MAX_THREADS]; // one slot per thread
+ INT32 filler[64/4-1]; // assumes a 64-byte cache line
+ } slot[WORK_MAX_THREADS]; // one slot per thread
volatile INT32 nextindex; // index of next slot to use
};
@@ -139,7 +131,7 @@
volatile INT32 done; // is the item done?
};
-
+typedef void *PVOID;
//============================================================
// FUNCTION PROTOTYPES
@@ -153,19 +145,33 @@
// INLINE FUNCTIONS
//============================================================
-#ifndef osd_interlocked_increment
-INLINE INT32 osd_interlocked_increment(INT32 volatile *ptr)
+INLINE INT32 interlocked_exchange32(INT32 volatile *ptr, INT32 value)
{
- return atomic_add32(ptr, 1);
+ return atomic_exchange32(ptr, value);
}
-#endif
-#ifndef osd_interlocked_decrement
-INLINE INT32 osd_interlocked_decrement(INT32 volatile *ptr)
+INLINE INT32 interlocked_increment(INT32 volatile *ptr)
{
- return atomic_add32(ptr, -1);
+ return osd_interlocked_increment(ptr);
+}
+
+
+INLINE INT32 interlocked_decrement(INT32 volatile *ptr)
+{
+ return osd_interlocked_decrement(ptr);
+}
+
+
+INLINE INT32 interlocked_add(INT32 volatile *ptr, INT32 add)
+{
+ return atomic_add32(ptr, add);
}
-#endif
+
+
+
+//============================================================
+// Scalable Locks
+//============================================================
INLINE void scalable_lock_init(scalable_lock *lock)
{
@@ -176,7 +182,7 @@
INLINE INT32 scalable_lock_acquire(scalable_lock *lock)
{
- INT32 myslot = (osd_interlocked_increment(&lock->nextindex) - 1) & (MAX_THREADS - 1);
+ INT32 myslot = (osd_interlocked_increment(&lock->nextindex) - 1) & (WORK_MAX_THREADS - 1);
#if defined(__i386__) || defined(__x86_64__)
register INT32 tmp;
@@ -236,15 +242,15 @@
register INT32 tmp = TRUE;
__asm__ __volatile__ (
" xchg %[haslock], %[tmp] ;"
- : [haslock] "+m" (lock->slot[(myslot + 1) & (MAX_THREADS - 1)].haslock)
+ : [haslock] "+m" (lock->slot[(myslot + 1) & (WORK_MAX_THREADS - 1)].haslock)
, [tmp] "+r" (tmp)
:
);
#elif defined(__ppc__) || defined (__PPC__) || defined(__ppc64__) || defined(__PPC64__)
- lock->slot[(myslot + 1) & (MAX_THREADS - 1)].haslock = TRUE;
+ lock->slot[(myslot + 1) & (WORK_MAX_THREADS - 1)].haslock = TRUE;
__asm__ __volatile__ ( " eieio " : : );
#else
- osd_exchange32(&lock->slot[(myslot + 1) & (MAX_THREADS - 1)].haslock, TRUE);
+ osd_exchange32(&lock->slot[(myslot + 1) & (WORK_MAX_THREADS - 1)].haslock, TRUE);
#endif
}
@@ -287,7 +293,7 @@
queue->threads = (flags & WORK_QUEUE_FLAG_MULTI) ? (numprocs - 1) : 1;
// clamp to the maximum
- queue->threads = MIN(queue->threads, MAX_THREADS);
+ queue->threads = MIN(queue->threads, WORK_MAX_THREADS);
// allocate memory for thread array (+1 to count the calling thread)
queue->thread = malloc((queue->threads + 1) * sizeof(queue->thread[0]));
@@ -367,22 +373,27 @@
// process what we can as a worker thread
worker_thread_process(queue, thread);
- // spin until we're done
- begin_timing(thread->spintime);
- while (queue->items != 0 && osd_ticks() < stopspin)
- osd_yield_processor();
- end_timing(thread->spintime);
+ // if we're a high frequency queue, spin until done
+ if (queue->flags & WORK_QUEUE_FLAG_HIGH_FREQ)
+ {
+ // spin until we're done
+ begin_timing(thread->spintime);
+ while (queue->items != 0 && osd_ticks() < stopspin)
+ osd_yield_processor();
+ end_timing(thread->spintime);
+ begin_timing(thread->waittime);
+ return (queue->items == 0);
+ }
begin_timing(thread->waittime);
- return (queue->items == 0);
}
// reset our done event and double-check the items before waiting
osd_event_reset(queue->doneevent);
- atomic_exchange32(&queue->waiting, TRUE);
+ interlocked_exchange32(&queue->waiting, TRUE);
if (queue->items != 0)
osd_event_wait(queue->doneevent, timeout);
- atomic_exchange32(&queue->waiting, FALSE);
+ interlocked_exchange32(&queue->waiting, FALSE);
// return TRUE if we actually hit 0
return (queue->items == 0);
@@ -504,7 +515,7 @@
do
{
item = (osd_work_item *)queue->free;
- } while (item != NULL && compare_exchange_ptr((void * volatile *)&queue->free, item, item->next) != item);
+ } while (item != NULL && compare_exchange_ptr((PVOID volatile *)&queue->free, item, item->next) != item);
// if nothing, allocate something new
if (item == NULL)
@@ -537,7 +548,7 @@
scalable_lock_release(&queue->lock, lockslot);
// increment the number of items in the queue
- atomic_add32(&queue->items, numitems);
+ interlocked_add(&queue->items, numitems);
add_to_stat(&queue->itemsqueued, numitems);
// look for free threads to do the work
@@ -634,7 +645,7 @@
{
next = (osd_work_item *)item->queue->free;
item->next = next;
- } while (compare_exchange_ptr((void * volatile *)&item->queue->free, next, item) != next);
+ } while (compare_exchange_ptr((PVOID volatile *)&item->queue->free, next, item) != next);
}
@@ -681,8 +692,8 @@
break;
// indicate that we are live
- atomic_exchange32(&thread->active, TRUE);
- osd_interlocked_increment(&queue->livethreads);
+ interlocked_exchange32(&thread->active, TRUE);
+ interlocked_increment(&queue->livethreads);
// process work items
for ( ;; )
@@ -692,15 +703,17 @@
// process as much as we can
worker_thread_process(queue, thread);
- // spin for a while looking for more work
+ // if we're a high frequency queue, spin for a while before giving up
if (queue->flags & WORK_QUEUE_FLAG_HIGH_FREQ)
{
+ // spin for a while looking for more work
begin_timing(thread->spintime);
stopspin = osd_ticks() + SPIN_LOOP_TIME;
while (queue->list == NULL && osd_ticks() < stopspin)
osd_yield_processor();
end_timing(thread->spintime);
}
+
// if nothing more, release the processor
if (queue->list == NULL)
break;
@@ -708,8 +721,8 @@
}
// decrement the live thread count
- atomic_exchange32(&thread->active, FALSE);
- osd_interlocked_decrement(&queue->livethreads);
+ interlocked_exchange32(&thread->active, FALSE);
+ interlocked_decrement(&queue->livethreads);
}
return NULL;
}
@@ -754,9 +767,8 @@
end_timing(thread->actruntime);
// decrement the item count after we are done
- osd_interlocked_decrement(&queue->items);
- item->done = TRUE;
-
+ interlocked_decrement(&queue->items);
+ interlocked_exchange32(&item->done, TRUE);
add_to_stat(&thread->itemsdone, 1);
// if it's an auto-release item, release it
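As a footnote to the patch, the two ideas it leans on can be sketched in portable C11: thin interlocked_* wrappers over an atomic add, and lock slots padded out to one cache line. This is an illustration using <stdatomic.h>, not the atomic_add32 from osinline.h. The sketch_ names are hypothetical, the 64-byte line size is the same assumption the patch comment makes, and the wrapper is assumed to return the new value, which is what the `- 1` in scalable_lock_acquire implies.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SKETCH_CACHE_LINE  64   /* assumed cache line size in bytes */
#define SKETCH_MAX_THREADS 16   /* power of two, so the mask below works */

/* Atomically add and return the NEW value. */
static inline int32_t sketch_interlocked_add(_Atomic int32_t *ptr, int32_t add)
{
    return atomic_fetch_add(ptr, add) + add;
}

/* One lock slot per thread, padded so that neighbouring slots do not
 * share a cache line (avoids false sharing while spinning). */
typedef struct
{
    _Atomic int32_t haslock;
    int32_t         filler[SKETCH_CACHE_LINE / 4 - 1];
} sketch_slot;

static_assert(sizeof(sketch_slot) == SKETCH_CACHE_LINE,
              "slot should fill exactly one cache line");

/* Hand out slot indices round-robin: take a ticket, wrap with a mask. */
static inline int32_t sketch_next_slot(_Atomic int32_t *nextindex)
{
    return (sketch_interlocked_add(nextindex, 1) - 1) & (SKETCH_MAX_THREADS - 1);
}
```

Sizing the filler as `64/4-1` INT32s rather than 60 bytes says the same thing, but it makes the "one line minus the haslock word" intent easier to see and keeps everything INT32-aligned.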