[Open-graphics] Shaders
James Richard Tyrer
tyrerj at acm.org
Sun Apr 8 23:11:44 EDT 2007
Nicholas S-A wrote:
> I was thinking about how we will implement our OpenGL pipeline,
> mostly by looking at other implementations and the OpenGL spec.
> I don't think we were planning on having programmable shaders on
> the OGA, but then I stumbled across this software, ShaderGen:
> http://developer.3dlabs.com/downloads/shadergen/index.htm
> Using something like that (ported to or written for Linux, or even
> the embedded CPU), we could simply load a "standard OpenGL
> pipeline" from ROM, and if a programmed shader is requested then
> we can load it in place of the "fixed function" pipeline.
>
> This could easily reduce the silicon area needed to implement the
> pipeline, since it combines many stages into a simple floating point
> processor. The problems that I see are:
> 1) Context Switches. How can we load a programmable shader
> without disrupting pixel output?
This is just microcode, so all you need to do is have the processor
wait until the current pixel is complete before it starts using the new
microcode. The old and new microcode can sit at different locations in
writable microcode memory.
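
As a rough software sketch of that idea (not a hardware design): the new
program is written to a different region of the microcode store while the
old one keeps running, and the switch is applied only at a pixel boundary.
The class name, method names, and base addresses here are hypothetical,
chosen only for illustration.

```python
class MicrocodeStore:
    """Hypothetical model of a writable microcode memory with several programs."""

    def __init__(self):
        self.programs = {}   # base address -> list of micro-ops (callables)
        self.active = None   # base address the processor is executing from
        self.pending = None  # base address to switch to at the next pixel boundary

    def load(self, base, program):
        # Writing to a different region leaves the running program untouched.
        if base == self.active:
            raise ValueError("cannot overwrite the active program")
        self.programs[base] = list(program)

    def request_switch(self, base):
        # Only record the request; the switch itself is deferred.
        if base not in self.programs:
            raise KeyError(base)
        self.pending = base

    def begin_pixel(self):
        # The pending switch takes effect here, between pixels, never mid-pixel.
        if self.pending is not None:
            self.active = self.pending
            self.pending = None
        return self.programs[self.active]


def shade(store, pixels):
    """Run every pixel through whichever program is active when it starts."""
    out = []
    for value in pixels:
        for op in store.begin_pixel():
            value = op(value)
        out.append(value)
    return out
```

A switch requested while a pixel is in flight changes nothing for that
pixel; the next call to begin_pixel() picks up the new program.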
> 2) It could reduce pixel output, since pipelining doesn't really
> work. It could also increase output because we can have a bunch
> of parallel vertex/fragment shaders.
> 3) This is a very nonstandard way to implement the OpenGL pipeline.
Actually, I think that this is how 3Dlabs' post-Oxygen video cards work,
so it isn't really nonstandard.
> We would need to design a (very) simple shader if this could happen.
> We might even be able to run it at a faster clock speed than a FF
> pipeline if we fiddle with it.
>
> Is this a possibility? What blatant implementation problems am I
> missing? Feel free to ignore this. I just wanted to get it out before I
> forgot it - I imagine this will be more relevant much later on.
What you are missing is that it will take "X" FMACs (floating-point
multiply-accumulates), plus some less expensive instructions, to run a
pixel through the shader. So, if we had an array of ALUs that could each
do an FMAC, then this would be a good idea, since the hardware could
easily be changed to do different tasks.
But if we have only one ALU, then it is going to take a while no matter
how we implement it.
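
A back-of-the-envelope version of that argument, as a sketch. It assumes
each ALU retires one FMAC per clock and ignores the cheaper instructions;
the function name and the numbers in the usage note are made up for
illustration and are not measurements of any design.

```python
def pixels_per_second(fmacs_per_pixel, num_alus, clock_hz):
    """Upper bound on pixel rate, assuming one FMAC per ALU per clock cycle."""
    if fmacs_per_pixel <= 0 or num_alus <= 0 or clock_hz <= 0:
        raise ValueError("all arguments must be positive")
    # Total FMACs available per second, divided by the FMACs each pixel costs.
    return num_alus * clock_hz / fmacs_per_pixel
```

With a hypothetical shader costing 20 FMACs per pixel on a 100 MHz clock,
pixels_per_second(20, 1, 100_000_000) gives 5 million pixels per second
for a single ALU, and the bound scales linearly with the number of ALUs.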
--
JRT