You gotta love volume rendering. 🙂
A colleague just benchmarked eVolve, the new Equalizer volume rendering example, on a 16-node graphics cluster. You can see the performance graphs below (click on them for larger versions).
The first one tests a 512x512x512, 8-bit data set. As mentioned in earlier posts on this blog, it's a bad idea to overcommit GPU memory. 😉
As soon as the database decomposition makes the slice rendered by each node small enough, we see a nice jump in performance, up to interactive levels. From there on it scales nicely to 16 nodes. The 2D case comes close to linear scalability, which is respectable given that it still has to hold the full texture on every node.
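The jump can be sketched with some back-of-the-envelope arithmetic. The 32 MiB texture budget below is a made-up number for illustration only; the real threshold depends on the cluster's GPUs and texture overhead.

```python
# Per-node texture footprint under database (DB) decomposition of the
# 512^3, 8-bit volume. GPU_BUDGET is a hypothetical texture budget chosen
# for illustration -- the real threshold depends on the hardware.

DIM = 512                 # 512 x 512 x 512 volume
VOXEL_BYTES = 1           # 8-bit scalars
GPU_BUDGET = 32 * 2**20   # assumed usable texture memory: 32 MiB

full_volume = DIM**3 * VOXEL_BYTES  # 128 MiB for the whole data set

for nodes in (1, 2, 4, 8, 16):
    slice_bytes = full_volume // nodes  # each node holds 1/nodes of the data
    status = "fits" if slice_bytes <= GPU_BUDGET else "overcommits"
    print(f"{nodes:2d} nodes: {slice_bytes / 2**20:6.1f} MiB per node -> {status}")
```

Once the per-node slice drops below the budget, the texture stays resident on the GPU and rendering becomes interactive; before that, texture thrashing dominates.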
The second benchmark uses the same data set at 256^3 resolution. Since this is only 1/8 of the data, we don't see any caching effects. Scalability is also nice until it hits the IO limit of the system, an acceptable 25 fps at 1280×1024. And this is not yet the hard limit: the test used IP-over-IB instead of the faster SDP-over-IB, because the system was not set up for the latter.
The 25 fps are reached with 12 nodes; adding more nodes does not improve performance. For database decomposition, the performance limit is 12.5 fps, since each node has to handle twice as much data during compositing (see the direct-send paper for details). This limit is already reached at 10 nodes.
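The factor-of-two ceiling can be sketched as follows. The exact 2x factor is the asymptotic direct-send approximation for large node counts, and the RGBA frame format is my assumption, not something stated in the benchmark.

```python
# Sketch of the framerate-ceiling argument: with direct-send compositing
# for DB decomposition, each node sends its partial frame AND receives the
# tiles it composites -- roughly twice the pixel data that the busiest node
# of a 2D compound (the display node) handles per frame.

FRAME = 1280 * 1024 * 4   # assumed bytes per RGBA frame at benchmark resolution
IO_LIMIT_2D_FPS = 25.0    # measured IO ceiling for the 2D compound

def limiting_node_bytes(mode: str) -> int:
    """Per-frame bytes moved by the busiest node (asymptotic sketch)."""
    if mode == "2D":
        return FRAME       # display node receives one full frame
    if mode == "DB":
        return 2 * FRAME   # ~1 frame sent out + ~1 frame received back
    raise ValueError(mode)

for mode in ("2D", "DB"):
    fps = IO_LIMIT_2D_FPS * FRAME / limiting_node_bytes(mode)
    print(f"{mode}: expected IO-limited ceiling = {fps:.1f} fps")
```

Halving the per-node IO budget halves the achievable framerate, which is exactly the 25 fps vs 12.5 fps split seen in the graphs.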