[Open-graphics] HELP: Weird discrepancy between fbZPutImage and memcpy

Zan Lynx zlynx at acm.org
Wed Jul 11 19:54:51 EDT 2007


Timothy Normand Miller wrote:
> In testing and experimenting, I noticed something strange, and I was
> wondering if anyone could help shed some light on this.
> 
> Keep in mind that OGD1 does not support DMA at this time, so
> everything we do is simple PIO access to the card.  If I use the test
> "x11perf -putimage500", I get a result of about 99/sec.  That
> translates to a bus throughput of about 94 megabytes/sec.
> 
> If, on the other hand, I use memcpy, I only get about 24 megs/sec.
> 
> I've looked at the source to x.org, and I just can't see them doing
> anything special.  I don't see any use of inline assembly or
> processor-specific instructions.  They use memcpy for aligned copies
> and they do something more complex for unaligned copies.
> 
> I tried doing just 32-bit word copies (to try to imitate what they're
> doing), and even that didn't get me any faster than about 24 or 25
> megs/sec.
> 
> What could possibly be making my code so much slower than theirs?
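
The figures quoted above can be sanity-checked with quick arithmetic. This is only a sketch: it assumes that the -putimage500 test pushes a 500x500 image and that the screen runs at 32 bits per pixel, neither of which is stated in the message.

```python
# Assumptions (not stated in the thread): 500x500 pixels, 4 bytes per pixel.
WIDTH = 500
HEIGHT = 500
BYTES_PER_PIXEL = 4
PUTIMAGE_PER_SEC = 99  # x11perf result quoted above


def throughput_mib_per_sec(ops_per_sec, width=WIDTH, height=HEIGHT,
                           bytes_per_pixel=BYTES_PER_PIXEL):
    """Bytes written per second, expressed in MiB/s."""
    return ops_per_sec * width * height * bytes_per_pixel / (1024 * 1024)


x_rate = throughput_mib_per_sec(PUTIMAGE_PER_SEC)
print(f"putimage500: {x_rate:.1f} MiB/s")
print(f"ratio vs. a 24 MiB/s memcpy: {x_rate / 24:.1f}x")
```

Under those assumptions the result lands close to the quoted 94 megabytes/sec, so the gap to the memcpy test is roughly a factor of four.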

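I bet that it is the MTRR setting. With a write-combining range over the framebuffer, the CPU can merge consecutive stores into burst writes; without one, the aperture is uncached and each store goes out as a separate bus transaction.
Here is /proc/mtrr from a system running an Nvidia card with the nv driver: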
$ cat /proc/mtrr
reg00: base=0x00000000 (   0MB), size=1024MB: write-back, count=1
reg01: base=0xf0000000 (3840MB), size=  64MB: write-combining, count=1
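
To check this on another machine, something like the following could be used to see which MTRR entry covers the card's aperture. It is a minimal sketch that assumes the /proc/mtrr line format shown above (other kernels may print the fields in a different order); the names parse_mtrr and covering_type are made up for illustration.

```python
import re

# Matches lines in the format shown above, e.g.
# "reg01: base=0xf0000000 (3840MB), size=  64MB: write-combining, count=1"
MTRR_LINE = re.compile(
    r"reg(?P<reg>\d+):\s*base=0x(?P<base>[0-9a-fA-F]+)\s*\(\s*\d+MB\),\s*"
    r"size=\s*(?P<size>\d+)(?P<unit>[KMG])B:\s*(?P<type>[\w-]+),\s*"
    r"count=(?P<count>\d+)"
)
UNIT = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_mtrr(text):
    """Return a list of {base, size, type} dicts, one per matching line."""
    entries = []
    for line in text.splitlines():
        m = MTRR_LINE.search(line)
        if m:
            entries.append({
                "base": int(m.group("base"), 16),
                "size": int(m.group("size")) * UNIT[m.group("unit")],
                "type": m.group("type"),
            })
    return entries


def covering_type(entries, addr):
    """Type of the first entry containing addr, or None.

    Simplification: real hardware has precedence rules for overlapping
    ranges; this just returns the first match.
    """
    for e in entries:
        if e["base"] <= addr < e["base"] + e["size"]:
            return e["type"]
    return None


SAMPLE = """\
reg00: base=0x00000000 (   0MB), size=1024MB: write-back, count=1
reg01: base=0xf0000000 (3840MB), size=  64MB: write-combining, count=1
"""

entries = parse_mtrr(SAMPLE)
print(covering_type(entries, 0xF0000000))
```

On a live system the text would come from reading /proc/mtrr, and the address would be the physical base of the card's memory region.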

This is the line from the X log where the driver sets up that write-combining range:
(==) NV(0): Write-combining range (0xf0000000,0x4000000)
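
The two numbers in that log line are a base address and a size in bytes, and they describe the same range as reg01 in /proc/mtrr. A small sketch of the conversion follows; parse_wc_range is a hypothetical helper, and the regular expression assumes the log line looks exactly like the one above.

```python
import re

LOG_LINE = "(==) NV(0): Write-combining range (0xf0000000,0x4000000)"
MIB = 1 << 20


def parse_wc_range(line):
    """Return (base, size) in bytes from a write-combining log line, or None."""
    m = re.search(
        r"Write-combining range \(0x([0-9a-fA-F]+),\s*0x([0-9a-fA-F]+)\)",
        line,
    )
    if m is None:
        return None
    return int(m.group(1), 16), int(m.group(2), 16)


base, size = parse_wc_range(LOG_LINE)
print(f"base = {base // MIB} MB, size = {size // MIB} MB")
```

This gives a base of 3840 MB and a size of 64 MB, matching the reg01 entry.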

