Equalizer uses reference pointers in some places, and the reference count so far was protected by a pthread_mutex (or the equivalent on Windows). The simple reason is that when I implemented it, I didn’t want to spend the time on something better.
This week I’ve found the time to replace the lock-protected counter with an atomic variable, with surprisingly good results. The ‘frame throughput’ when just rendering a quad in eqPly increased from ~200 FPS to ~750 FPS on my MacBook Pro! As soon as one starts rendering something more complex (like the rockerArm), the speedup is much less noticeable. I haven’t done any tests yet on a multi-GPU system where lock contention should be more apparent.
So here is some non-news for all you parallel programming guys: Locks are bad for performance!
4. July 2008 at 14:55 |
ORLY??
Congrats on the speed-up, though.