Archive for July, 2008

Parallel Rendering Videos – Pixel

31. July 2008

Below is the next video – Pixel-based decomposition:

Parallel Rendering Videos – Stereo

30. July 2008

Below is the next video – EYE (stereo) decomposition:

DB (sort-last) load-balancing

29. July 2008

Continuing our set of movies, below is a video showing automatic database load-balancing.

This is a fairly new feature, checked in just yesterday. To my knowledge, Equalizer is the only toolkit on the market doing this (unless you count MPK 😉). The only thing required from the application is the ability to render a different database range each frame; Equalizer does the rest behind the scenes.
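To make the idea of a per-frame database range concrete, here is a minimal C++ sketch of how such ranges could be rebalanced from the previous frame's render times. This is not Equalizer's actual algorithm or API: `Range` and `rebalance` are hypothetical names, and the sketch assumes the rendering cost is spread evenly within each old range.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical type: a normalized slice [start, end) of the database.
struct Range
{
    double start;
    double end;
};

// Given last frame's range and render time per channel, compute new
// contiguous ranges covering [0, 1] so that every channel gets roughly
// the same estimated cost.
// Assumption: cost is uniformly distributed within each old range.
std::vector<Range> rebalance(const std::vector<Range>& ranges,
                             const std::vector<double>& times)
{
    assert(!ranges.empty() && ranges.size() == times.size());
    const std::size_t n = ranges.size();

    double total = 0.0;
    for (double t : times)
        total += t;
    if (total <= 0.0)
        return ranges; // no timing data yet, keep the current split

    std::vector<Range> result(n);
    std::size_t src = 0;   // old range that contains the next split point
    double consumed = 0.0; // summed cost of the old ranges before src
    double start = 0.0;

    for (std::size_t i = 0; i < n; ++i)
    {
        double end = 1.0; // the last channel always ends at 1
        if (i + 1 < n)
        {
            // Cumulative cost at which channel i should stop.
            const double goal = total * double(i + 1) / double(n);
            while (src + 1 < n && consumed + times[src] < goal)
                consumed += times[src++];

            // Interpolate the split position inside old range src.
            double fraction = 0.0;
            if (times[src] > 0.0)
                fraction = (goal - consumed) / times[src];
            fraction = std::max(0.0, std::min(1.0, fraction));
            end = ranges[src].start +
                  fraction * (ranges[src].end - ranges[src].start);
        }
        result[i] = Range{start, end};
        start = end;
    }
    return result;
}
```

With two channels that each rendered half of the database in 30 ms and 10 ms, this moves the split point from 0.5 to about 0.33, so both channels are estimated at 20 ms for the next frame. The application then only has to render whatever range it is handed.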

All this stuff is available in the brand-new 0.5.4 developer release.

Parallel Rendering Videos – 2D load-balancing

28. July 2008

And another video, 2D load-balancing with eVolve:

Parallel Rendering Videos – DB

27. July 2008

Here is the next Equalizer video – Database decomposition with eVolve:

Parallel Rendering Videos – 2D

26. July 2008

I’ve started uploading some parallel rendering videos to YouTube, showing Equalizer in action.

I’ll post them here as they become ready. Here is the first one, showing static 2D decomposition – enjoy!

Edit: replaced the video with in-movie annotations.

Equalizer in hardware?

21. July 2008

The web (well, a small part of it) has been abuzz with a new company claiming to provide ‘a near-linear to above-linear increase in performance’.

The solution consists of a chip in front of the graphics cards, which ‘decomposes a complex scene into well-balanced parallel tasks, and then recompose each task into the correct final image with no overhead’, of course transparently for the latest versions of OpenGL and DirectX.

Color me skeptical. First of all, decomposing an OpenGL stream correctly is not easy, and doing it in a way to provide a linear speedup is even harder. Keeping it compatible and up-to-date with the latest extensions will take quite some resources. Then there is the little problem that the single application thread has to be capable of feeding the GL pipeline fast enough, and the compositing has to take things like transparency and antialiasing into account.

I guess we’ll have to wait for the vapor to go away and for real hardware to arrive. I know I will keep an eye on it, since this could be useful for Equalizer on visualization clusters.

New Equalizer Poll

15. July 2008

I’ve created a new poll to get a feeling for which scene graph integration is most wanted in Equalizer.

We’d like to find out where the biggest need is for porting scene graph-based OpenGL applications to multi-GPU systems and visualization clusters.

Unfortunately I can’t embed the poll in this posting, so just weasel over to the Equalizer website and take the poll there!

If you have any input or don’t find your scene graph listed, just leave a comment below.

Two Methods for driving OpenGL Display Walls

7. July 2008

Recently the VMML at the University of Zürich performed a benchmark comparing Chromium and Equalizer on a display wall. The result surprised me: I would have expected less difference between the two solutions in this setup, since only static display lists are used. Unfortunately, neither InfiniBand nor the broadcast SPU was available for this test, both of which should improve Chromium's performance.

The performance graph is on the left. You can download the White Paper from the Equalizer website.

pthread_mutex vs atomic operations

3. July 2008

Equalizer uses reference pointers in some places, and the reference count so far was protected by a pthread_mutex (or the equivalent on Windows). The simple reason is that when I implemented it, I didn’t want to spend the time on something better.

This week I’ve found the time to replace the lock-protected counter with an atomic variable, with surprisingly good results. The ‘frame throughput’ when just rendering a quad in eqPly increased from ~200 FPS to ~750 FPS on my MacBook Pro! As soon as one starts rendering something more complex (like the rockerArm), the speedup is much less noticeable. I haven’t done any tests yet on a multi-GPU system where lock contention should be more apparent.
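For illustration, here is a minimal sketch of the two approaches side by side. It is not the Equalizer code: the class names are made up, and it uses `std::mutex` and `std::atomic` from C++11 as stand-ins for pthread_mutex and the platform's atomic operations.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Reference count protected by a lock: every ref()/unref() has to
// acquire and release the mutex, even when there is no contention.
class LockedRefCount
{
public:
    void ref()
    {
        std::lock_guard<std::mutex> lock(_mutex);
        ++_count;
    }

    // Returns true when the last reference was released.
    bool unref()
    {
        std::lock_guard<std::mutex> lock(_mutex);
        assert(_count > 0);
        return --_count == 0;
    }

    int count() const
    {
        std::lock_guard<std::mutex> lock(_mutex);
        return _count;
    }

private:
    mutable std::mutex _mutex;
    int _count = 0;
};

// Same interface on top of an atomic counter: ref()/unref() become a
// single atomic read-modify-write operation, without taking a lock.
class AtomicRefCount
{
public:
    void ref() { _count.fetch_add(1, std::memory_order_relaxed); }

    // Returns true when the last reference was released.
    bool unref()
    {
        // acq_rel makes all writes to the object visible to the thread
        // that drops the last reference and deletes the object.
        const int previous = _count.fetch_sub(1, std::memory_order_acq_rel);
        assert(previous > 0);
        return previous == 1;
    }

    int count() const { return _count.load(std::memory_order_acquire); }

private:
    std::atomic<int> _count{0};
};
```

Both classes behave the same from the caller's point of view; a smart pointer would call `ref()` when copied and delete the object once `unref()` returns true.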

So here is some non-news for all you parallel programming guys: Locks are bad for performance! 😉