[Open-graphics] The future of GPUs
Lourens Veen
lourens at rainbowdesert.net
Sun May 25 18:51:33 EDT 2008
On Saturday 24 May 2008 12:12:27 Dieter wrote:
> It looks like ATI plans to take a similar path with GPUs
> as with CPUs. Rather than keep making GPUs bigger and bigger
> with resulting increases in cost, power consumption and heat,
> it looks like they plan to make smaller GPUs and use more
> than one together to build high end products. The little I've
> read indicates that they will be using multiple dies.
>
> This approach has some manufacturing advantages. If you can
> build a range of product with a single type of die, it would
> cost less to manufacture. Only one mask to have made. Larger
> quantities of a single chip. With smaller dies, a defect would
> spoil a smaller percentage of the wafer. Yield would increase.
>
> I know a bit about SMP, but close to nothing about the
> Crossfire/SLI style multiple GPU systems. How well does it
> scale?
Way back we discussed the needed precision in the multipliers and
reciprocals for the 3D engine, and it became quite clear that we'd have
to chop up the trapezoids that we're rendering into smaller bits.
Essentially, we'd tile the screen to keep the spans small and limit
accumulation of roundoff errors.
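
To put a rough number on that roundoff argument, here is a minimal sketch. It assumes a span is walked by repeatedly adding a slope that has been rounded to fixed point, and that each tile restarts from a freshly computed value. The 12 fractional bits, the 1/3 slope and the function names are illustrative assumptions, not OGP design figures.

```python
from fractions import Fraction

FRAC_BITS = 12          # assumed fixed-point precision, not an OGP figure
ONE = 1 << FRAC_BITS    # fixed-point representation of 1.0


def to_fixed(x):
    """Round an exact Fraction to the nearest fixed-point integer."""
    return round(x * ONE)


def max_error(width, tile, start, slope):
    """Walk `width` pixels by repeated addition of a rounded slope,
    restarting from a freshly computed value every `tile` pixels.
    Returns the worst absolute error, in fixed-point LSBs."""
    step = to_fixed(slope)
    worst = Fraction(0)
    for x in range(width):
        if x % tile == 0:
            # Tile edge: recompute the value instead of carrying the drift over.
            acc = to_fixed(start + slope * x)
        exact = (start + slope * x) * ONE
        worst = max(worst, abs(acc - exact))
        acc += step
    return float(worst)


untiled = max_error(1024, 1024, Fraction(0), Fraction(1, 3))
tiled = max_error(1024, 32, Fraction(0), Fraction(1, 3))
print(untiled, tiled)
```

With these assumed numbers the untiled 1024-pixel span drifts by about 341 LSBs, while 32-pixel tiles keep the drift under 11, because the error can only build up over one tile width.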
Once you have that, having a separate renderer for each tile seems easy
enough. Tiling like that has been done on video cards before (PowerVR
was the first consumer-level card that did it, I think) and it's being
done on a larger scale with tiled display walls (one machine for every
couple of monitors).
Essentially, it's a blackboard style architecture, or even a tuple space
kind of thing if you just DMA the command sets to a separate piece of
memory and have the processors scoop them up and execute them whenever
they are available.
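
The "scoop them up whenever they are available" idea can be sketched in software as a shared work queue. This is only an analogy: threads stand in for the tile renderers, a Python queue stands in for the DMA'd command memory, and `run` and `worker` are hypothetical names.

```python
import queue
import threading


def run(tiles_x, tiles_y, n_workers):
    """Put one command per tile on a shared queue and let the workers
    take commands whenever they are idle. Returns {tile: worker_id}."""
    commands = queue.Queue()
    for ty in range(tiles_y):
        for tx in range(tiles_x):
            commands.put((tx, ty))

    drawn = {}
    lock = threading.Lock()

    def worker(worker_id):
        while True:
            try:
                tile = commands.get_nowait()
            except queue.Empty:
                return  # nothing left to render
            # Stand-in for real rendering: record who drew the tile.
            with lock:
                drawn[tile] = worker_id

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return drawn


result = run(4, 3, 3)
print(len(result), "tiles rendered")
```

No worker is assigned a fixed set of tiles, so a slow tile delays only the worker that picked it up. The lock around `drawn` is the software counterpart of the shared framebuffer access problem.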
The main problem, it seems to me, is getting all these processors to
access the framebuffer at the same time. I suppose the memory would just
have to be fast enough to keep up with the renderers, though; you'd have
the same problem with a single very fast GPU.
> How much extra work is it to create a multiple GPU
> system? Would it be feasible for OGP to go this route? If
> we can, this could allow us to be *far* more competitive
> while keeping chip fab costs down.
Right, I've been thinking about this in the context of a completely free
PC. Build a ground plate that supplies power and cooling, and then
stack a bunch of cubes containing a CPU (at say, 586 level of
performance), some memory, and some fast interconnect to all sides on
top of it. Some cubes would have external I/O on them. Need more
computing power, simply add more cubes, which would be cheap
individually because they would be made in large volumes.
The challenge would of course be the operating system, because you're
not going to hand-rewrite your software to run efficiently on your
particular topology of cubes, so it would have to be partitioned
dynamically. Essentially it'd be a microcluster, with all the
advantages and disadvantages that come with it. But that's something
for the other mailing list I guess.
> It looks like ray tracing and radiosity are going to become
> more and more important. Does OGP need to do anything to
> be ready for this? (e.g. architecture to support it)
It's been a decade or so since I've dealt with those, but let's see what
I can remember. Back then radiosity was cool because it was used in
Quake II as a (very slow) preprocessing step for calculating lightmaps.
IIRC, the main part of a radiosity calculation is calculating the
transfer function. Given two polygons, it tells you how much they "see"
of each other, and then uses that to figure out how much of the light
radiated by one ends up on the other polygon. It's linear algebra,
probably a bunch of dot and cross products. You do that for each pair
of polygons to calculate a transfer matrix, and then take the initial
luminosities of the polygons and multiply them by the matrix repeatedly
(adding the emitted light back in on each pass) until you get to a
steady state, or until you get to the shipping
deadline on your game. I'm not sure about the details, but it sounds
about right. So, maybe we should explore that DSP idea again.
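
A minimal sketch of that repeated multiplication, written in the usual textbook form B = E + rho * F * B, where the transfer matrix F is normally called the form factor matrix. Each pass adds a patch's own emission back in, and the loop stops when the values no longer change. The two-patch scene and all of its numbers are made up for illustration.

```python
def solve_radiosity(emission, reflectance, form_factors,
                    tol=1e-9, max_iters=1000):
    """Iterate B = E + rho * (F @ B) until B stops changing.
    Returns (radiosities, number of passes used)."""
    n = len(emission)
    b = list(emission)  # start from the emitted light only
    for iteration in range(max_iters):
        new_b = [
            emission[i] + reflectance[i] * sum(
                form_factors[i][j] * b[j] for j in range(n))
            for i in range(n)
        ]
        if max(abs(x - y) for x, y in zip(new_b, b)) < tol:
            return new_b, iteration + 1
        b = new_b
    return b, max_iters  # ran out of passes (the "shipping deadline")


# Hypothetical scene: patch 0 is a light, patch 1 only reflects.
E = [1.0, 0.0]
rho = [0.5, 0.5]
F = [[0.0, 0.5],
     [0.5, 0.0]]
B, passes = solve_radiosity(E, rho, F)
print(B, passes)
```

The inner sum is a chain of multiply-adds per patch, and every row of the matrix can be evaluated independently of the others within a pass.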
If you want correct shadows, you also have to take any objects in
between the two faces into account for the transfer function, which is
where the ray tracing part comes from. I think most radiosity renderers
from that era would just shoot one or a few rays between the polygons
and multiply the result by the proportion that got through, Monte Carlo
style. Or you can forgo the radiosity strategy completely and do
everything by ray tracing. I think they also use ray tracing for real
time 3D sound.
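
The "shoot a few rays and multiply by the proportion that got through" step might look roughly like this. The geometry is a deliberately simple assumption: two unit-square patches facing each other at z=0 and z=1, with an optional opaque rectangle in the plane z=0.5, so a ray is blocked exactly when its midpoint lands inside the rectangle. `visible_fraction` is a hypothetical name.

```python
import random


def visible_fraction(occluder, n_rays=2000, seed=0):
    """Monte Carlo estimate of how much two facing unit squares
    (at z=0 and z=1) see of each other.
    occluder: (xmin, xmax, ymin, ymax) in the plane z=0.5, or None."""
    rng = random.Random(seed)
    unblocked = 0
    for _ in range(n_rays):
        # Random endpoints, one on each patch.
        x0, y0 = rng.random(), rng.random()
        x1, y1 = rng.random(), rng.random()
        # The ray crosses z=0.5 at the midpoint of its endpoints.
        mx, my = (x0 + x1) / 2, (y0 + y1) / 2
        blocked = False
        if occluder is not None:
            xmin, xmax, ymin, ymax = occluder
            blocked = xmin <= mx <= xmax and ymin <= my <= ymax
        if not blocked:
            unblocked += 1
    return unblocked / n_rays


unoccluded_transfer = 0.2  # made-up value for the unblocked transfer
fraction = visible_fraction((0.25, 0.75, 0.25, 0.75))
print(unoccluded_transfer * fraction)
```

Each ray is independent of every other ray, which is what makes this part easy to spread over many processors.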
Anyway, it seems that it's all linear algebra, lots of adds, mults, and
mult-adds. And that this too could be parallelised, come to think of
it...
Cheers,
Lourens, who should really get started on his LinuxTag presentation
rather than writing long posts about parallel graphics hardware :-)