[Open-graphics] Shaders

James Richard Tyrer tyrerj at acm.org
Sun Apr 8 23:11:44 EDT 2007


Nicholas S-A wrote:
> I was thinking about how we will implement our OpenGL pipeline,
> mostly by looking at other implementations and the OpenGL spec.
> I don't think we were planning on having programmable shaders on
> the OGA, but then I stumbled across this software, ShaderGen:
> http://developer.3dlabs.com/downloads/shadergen/index.htm
> Using something like that (that we port to or write for Linux, or even
> the embedded cpu), we could simply load a "standard OpenGL
> pipeline" from ROM, and if a programmed shader is requested then
> we can load it in place of the "fixed function" pipeline.
> 
> This could easily reduce the silicon area needed to implement the
> pipeline, since it combines many stages into a simple floating point
> processor. The problems that I see are:
> 1) Context Switches. How can we load a programmable shader
> without disrupting pixel output?

This is just microcode, so all you need to do is have the processor
wait until the current pixel is complete before it starts using the new
microcode.  The old and new programs can sit at different locations in
writable microcode memory.
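
To make the switch-at-a-pixel-boundary idea concrete, here is a minimal
Python sketch of a toy model, not a hardware design.  The ShaderCore class,
its two-bank layout, and the method names are all hypothetical; the only
thing taken from the discussion above is that new microcode is written to a
different location and is picked up only once the current pixel is done.

```python
class ShaderCore:
    """Toy model of a shader processor with two microcode banks (assumed)."""

    def __init__(self, program):
        # Bank 0 starts with the "fixed function" program; bank 1 is empty.
        self.banks = [list(program), []]
        self.active = 0      # bank currently being executed
        self.pending = None  # bank to switch to at the next pixel boundary
        self.pc = 0          # index of the next instruction in the bank
        self.value = None    # working value for the pixel in flight

    def load_program(self, program):
        """Write new microcode into the idle bank and request a switch."""
        idle = 1 - self.active
        self.banks[idle] = list(program)
        self.pending = idle

    def start_pixel(self, pixel):
        """Begin a new pixel; this is the only place the bank may change."""
        if self.pending is not None:
            self.active = self.pending
            self.pending = None
        self.pc = 0
        self.value = pixel

    def step(self):
        """Execute one instruction; return True when the pixel is complete."""
        program = self.banks[self.active]
        if self.pc < len(program):
            self.value = program[self.pc](self.value)
            self.pc += 1
        return self.pc >= len(program)
```

If load_program() is called while a pixel is half done, the remaining
step() calls still run the old program; the new one takes effect at the
next start_pixel(), so no pixel is ever computed from a mix of the two.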

> 2) It could reduce pixel output, since pipelining doesn't really
> work. It could also increase output because we can have a bunch
> of parallel vertex/fragment shaders.
> 3) This is a very nonstandard way to implement the OpenGL pipeline.

Actually, I think that this is how 3Dlabs' post-Oxygen video cards work,
so it isn't really nonstandard.

> We would need to design a (very) simple shader if this could happen.
> We might even be able to run it at a faster clock speed than a FF
> pipeline if we fiddle with it.
> 
> Is this a possibility? What blatant implementation problems am I
> missing? Feel free to ignore this. I just wanted to get it out before I
> forgot it - I imagine this will be more relevant much later on.

What you are missing is that it will take "X" FMACs (floating-point
multiply-accumulates), plus some less expensive instructions, to run a
pixel through the shader.  So, if we had an array of ALUs that could each
do an FMAC, then this would be a good idea, since the hardware could
easily be changed to do different tasks.

But, if we have only one ALU then it is going to take a while no matter 
how we implement it.
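
A rough way to put numbers on this is the back-of-the-envelope estimate
below.  It is only a sketch under stated assumptions: each ALU retires one
FMAC per clock, pixels are independent so no ALU ever idles, and the
cheaper instructions are ignored.  The function name and the example
figures (20 FMACs per pixel, a 100 MHz clock) are made up for illustration
and are not OGA numbers.

```python
def pixels_per_second(fmacs_per_pixel, num_alus, clock_hz):
    """Upper bound on pixel rate for an array of FMAC-capable ALUs.

    Assumes one FMAC per ALU per clock and perfect parallelism across
    pixels; real hardware would come in below this figure.
    """
    if fmacs_per_pixel <= 0 or num_alus <= 0 or clock_hz <= 0:
        raise ValueError("all arguments must be positive")
    return num_alus * clock_hz / fmacs_per_pixel


# Hypothetical figures: X = 20 FMACs per pixel at 100 MHz.
one_alu = pixels_per_second(20, 1, 100e6)     # a single ALU
eight_alus = pixels_per_second(20, 8, 100e6)  # an array of eight ALUs
```

Under these assumptions the rate scales linearly with the number of ALUs
and inversely with X, which is why a single ALU is the limiting case.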

-- 
JRT

