3815 Commits

Author SHA1 Message Date
Chris Wilson 92f4d978c8 sna: More micro-optimisation of messing around with clip regions
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-10-11 15:52:13 +01:00
Chris Wilson 57151f6547 sna: Micro-optimise checking for singular clip boxes
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-10-11 15:23:17 +01:00
Chris Wilson 823a4272c5 sna/gen3: Avoid RENDER/BLT context switch for fill boxes
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-10-11 13:51:41 +01:00
Chris Wilson 887361de17 sna: Enable single fill fast path for PolySegment
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-10-11 13:51:25 +01:00
Chris Wilson 721cf30e9e sna/accel: If the data is already on the GPU, use it for the source
Fixes regression from 1ec6a0e277 (sna: Move the source to the GPU
if it is reused).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-10-11 13:29:28 +01:00
Chris Wilson 15a4410cec sna: use correct insertion point for sorting partials
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-10-11 13:28:30 +01:00
Chris Wilson a9b53c4877 sna: Mark the spans render functions as fastcall
This reduces the amount of dancing required to call into the span
functions as we can pass the arguments in both the integer and floating
point registers.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-10-11 10:56:51 +01:00
Chris Wilson 208fa8e6b8 sna/trapezoid: Perform the NULL check for damage in the caller
Save the function call overhead in the common case.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-10-11 10:56:38 +01:00
Chris Wilson 5050fead0e sna/gen3: avoid applying zero offset to common spans
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-10-11 10:54:53 +01:00
Daniel Vetter d0184b5909 snb: implement PIPE_CONTROL workaround
Sandybridge requires an elaborate dance to flush caches without
hanging the gpu. See public docs Vol2Part1 1.7.4.1 PIPE_CONTROL
or the corresponding code in mesa/kernel.

This (together with the corresponding patch for the kernel) seems to
fix the hangs in cairo-perf-traces I'm seeing on my snb machine.

v2: Incorporate review from Chris Wilson. For paranoia keep all three
PIPE_CONTROL cmds in the same batchbuffer to avoid upsetting the gpu.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2011-10-11 09:54:17 +02:00
Chris Wilson 4a2e833ab1 sna/gen7: Add render support for fill one
To prevent the RENDER to BLT transition and potential stall.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-10-11 08:48:54 +01:00
Chris Wilson 41f525fab5 sna/gen6: Add render support for fill-one-box
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-10-11 08:48:54 +01:00
Chris Wilson 5b6575bdde sna: Support a fast composite hook for solitary boxes
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-10-11 08:48:54 +01:00
Chris Wilson c5414ec992 sna: Use BLT operations to avoid fallbacks in core glyph rendering
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-10-04 19:30:35 +01:00
Chris Wilson 6b62b9d7c4 sna: Increase reserved space in batch to accommodate gen5 workaround
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-10-04 19:30:06 +01:00
Chris Wilson 6351d8eb82 sna/gen[23]: Fix compilation with debugging enabled
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-10-04 19:29:46 +01:00
Chris Wilson 1ec6a0e277 sna: Move the source to the GPU if it is reused
We attempt to skip uploading a source pixmap to the GPU in the event it is
used only once (for example during image upload by firefox). However, if
we continue to use the CPU source pixmap then it obviously was worth
uploading to the GPU. So if we use the CPU pixmap a second time, do the
upload and then blit.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-10-04 12:23:26 +01:00
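
The reuse heuristic described in 1ec6a0e277 can be sketched roughly as follows. This is a minimal illustration in Python, not the driver's code: the `Pixmap` class, the `cpu_source_uses` counter, and the returned path names are all hypothetical, and the threshold of two uses is taken only from the wording of the commit message.

```python
from dataclasses import dataclass


@dataclass
class Pixmap:
    """Hypothetical stand-in for a pixmap that starts out in CPU memory."""
    name: str
    cpu_source_uses: int = 0  # times used as a source while CPU-only
    on_gpu: bool = False


def choose_source_path(pixmap: Pixmap) -> str:
    """Decide how a copy that reads from `pixmap` would be serviced.

    Already on the GPU: use the GPU copy directly.
    First use from the CPU: skip the upload, since it may be a one-shot image.
    Second use from the CPU: the pixmap is being reused, so upload, then blit.
    """
    if pixmap.on_gpu:
        return "gpu-blit"
    pixmap.cpu_source_uses += 1
    if pixmap.cpu_source_uses >= 2:
        pixmap.on_gpu = True
        return "upload-then-gpu-blit"
    return "use-cpu-source"
```

Checking `on_gpu` first mirrors the follow-up fix in 721cf30e9e, which uses data that is already on the GPU as the source.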
Chris Wilson 413c9f7111 sna/blt: Suppress repeated SETUP_BLT
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-10-03 08:54:50 +01:00
Chris Wilson 1067335305 sna/blt: SETUP_BLT needs 9 dwords of batch space, not 3!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-10-03 00:09:00 +01:00
Chris Wilson d8c96a6a1d sna/blt: Use SETUP_MONO to reduce the number of fill relocations
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-10-02 23:23:51 +01:00
Chris Wilson 4d227d43f0 sna/accel: Correct syntax for constifying BoxPtr
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-10-02 12:26:07 +01:00
Chris Wilson 3dd8052416 sna/accel: Only throttle after flushing
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-10-02 11:21:32 +01:00
Chris Wilson 04b8f0a5a1 sna/accel: Add a compile option to force flushing on every blockhandler
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-10-02 11:21:32 +01:00
Chris Wilson 32cef71efe sna/accel: Add a compile option to disable use of spans
Using spans has a tremendous effect (~100x) on x11perf, some good but
mostly bad. However, in reality operations are mixed and so preventing
migration on alternate operations is a win. In the x11perf slowdowns, it
appears to be CPU bound and so it seems like there should be plenty of
scope for recovering the lost performance.

However, for the time being, just go back to the old fallbacks.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-10-02 11:21:32 +01:00
Chris Wilson dc1ec0dd1a sna/accel: Only disable the flush migitration if nothing changed
Previously we ignored updating the scanout in place, and so we were not
amortizing the shadow cost of common core rendering operations.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-10-01 20:39:09 +01:00
Chris Wilson c6acf13258 sna/accel: Micro-optimise sna_fill_spans_blt
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-10-01 20:39:09 +01:00
Chris Wilson 8029765515 sna/accel: Don't attempt converting to spans if we will only fallback
As the span code does not yet handle plane masks or stippling, it is
disadvantageous to convert to spans only to fallback.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-10-01 20:39:09 +01:00
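
The idea in 8029765515 is to test up front for the cases the span code cannot handle, rather than converting to spans and then falling back anyway. A minimal sketch under stated assumptions: the `GC` class, the `ALL_PLANES` constant, and the path names are hypothetical, and the "no plane mask" test is simplified to a comparison against an all-ones 32-bit mask.

```python
from dataclasses import dataclass

ALL_PLANES = 0xFFFFFFFF  # assumed value meaning "every plane is written"


@dataclass
class GC:
    """Hypothetical, heavily reduced graphics context."""
    plane_mask: int = ALL_PLANES
    stippled: bool = False


def spans_can_be_accelerated(gc: GC) -> bool:
    """True only if the span path would not immediately fall back."""
    return gc.plane_mask == ALL_PLANES and not gc.stippled


def choose_draw_path(gc: GC) -> str:
    # Decide before doing any conversion work: converting a primitive to
    # spans is wasted effort if the span code would then fall back.
    if not spans_can_be_accelerated(gc):
        return "fallback"
    return "convert-to-spans"
```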
Chris Wilson cd11bd69f4 sna/accel: Use miPolyArc to convert arcs into spans for gpu bo
This is actually trickier than it looks since miPolyArc() sometimes uses
an intermediate bitmap which performs worse than the fbPolyArc() fallback.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-10-01 20:39:08 +01:00
Chris Wilson d07256cc33 sna/accel: Convert segments into spans similarly to PolyLine
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-10-01 20:39:08 +01:00
Chris Wilson d09a229e32 sna/accel: Use the mi*Line routines to convert the line into spans for gpu bo
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-10-01 20:39:08 +01:00
Chris Wilson e7a662b92e sna: Sort partials by remaining space
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-10-01 20:39:08 +01:00
Chris Wilson 13b9b5d8d6 sna/io: Only mark the buffer as LAST if we know we will flush the IO
Otherwise we can continue to batch up the data upload into larger
buffers.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-10-01 09:23:41 +01:00
Chris Wilson 7ecc6993b8 sna/gen6: Fix offset of Scan-Line-Compare register
Reported-by: Frank Mariak <fmariak@macrosystem.de>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-09-30 16:11:05 +01:00
Chris Wilson d8fe941bc2 sna: Check for request retires after every batch
In the beginning, I did perform a retire after every batch. Then I
decided that it was too much CPU overhead for too little gain. On
reflection, i.e. further benchmarking, we do see a performance
improvement for recycling active buffers faster.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-09-28 23:03:00 +01:00
Chris Wilson e74a39b454 sna/gen7: Confusion reigns as trying to fix errors found by an outdated checker
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-09-28 15:05:33 +01:00
Chris Wilson 6395894ada sna/gen7: Fix up a couple instances of my inability to count
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-09-27 23:26:33 +01:00
Chris Wilson a53538659d sna/accel: Fix s/x/y/ typo in computing relative drawing coordinates
Reported-by: Roman Jarosz <kedgedev@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=41165
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-09-25 12:40:01 +01:00
Chris Wilson 960688d168 sna/accel: Debug option to force CPU fallbacks
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-09-25 12:23:10 +01:00
Chris Wilson 4fd46b8bb7 sna/glyphs: Add glyphs directly onto a client temporary buffer
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-09-25 10:02:02 +01:00
Chris Wilson 5154e281ed sna/glyph: Avoid useless attempt at GPU glyph rendering to a1 destinations
The actual bug is a little involved as we don't damage the temporary
glyph mask correctly presuming that we only hit GPU paths. However,
should we fail to prepare the composite operation that paints the mask
on to the destination, things fail horribly.

One particular example is that wine likes to create its own temporary a1
buffer for the glyphs (which we render to via another temporary mask...)
which triggers the delayed fallback and then sw compositing with a random
buffer.

Reported-by: Roman Jarosz <kedgedev@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=41165
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-09-25 09:56:16 +01:00
Chris Wilson 46fedf0cf1 sna/kgem: Check all operation bo in a single amalgamation
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-09-25 09:38:43 +01:00
Chris Wilson ccf6547a8f sna: Paranoid debug flush after every op (as well as before)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-09-24 23:16:57 +01:00
Chris Wilson 0233760034 sna/gen5: Debug option to disable state caching
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-09-24 23:16:43 +01:00
Chris Wilson af4d3853ae sna/glyphs: Convert all sub-8bpp masks to a8
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-09-24 19:47:41 +01:00
Chris Wilson c79e90da71 sna: Add a debug option to disable caching
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-09-24 19:47:41 +01:00
Kenneth Graunke 6bbb88af09 Fix incorrect maximum PS thread count on IvyBridge
I mistakenly set GEN7_PS_MAX_THREAD_SHIFT to 23; it's actually 24 on
Ivybridge.  Not only did this halve our thread count, it caused us to
write 1 into bit 23, which is marked as MBZ (must be zero).
Furthermore, it made us write an even number into this field, which is
apparently not allowed.  Apparently we were just lucky it worked.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-09-24 09:28:45 +01:00
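
The arithmetic behind 6bbb88af09 can be worked through with a small Python sketch. Several things here are assumptions rather than facts from the commit message: that the field is programmed as the thread count minus one, that it is 8 bits wide, and the thread count of 86, which is chosen purely for illustration.

```python
CORRECT_SHIFT = 24  # shift stated in the commit message for Ivybridge
WRONG_SHIFT = 23    # the mistaken value
FIELD_MASK = 0xFF   # assumed 8-bit field starting at bit 24


def encode(max_threads: int, shift: int) -> int:
    """Pack the thread count into a dword (assumed encoding: count - 1)."""
    return (max_threads - 1) << shift


def decode_as_hardware(dword: int) -> tuple[int, int]:
    """Return (field value read from bit 24 upwards, state of bit 23)."""
    field = (dword >> CORRECT_SHIFT) & FIELD_MASK
    bit23 = (dword >> 23) & 1
    return field, bit23


good = decode_as_hardware(encode(86, CORRECT_SHIFT))
bad = decode_as_hardware(encode(86, WRONG_SHIFT))
print(good)  # (85, 0)
print(bad)   # (42, 1)
```

With the wrong shift, the value 85 lands one bit too low: the hardware sees 85 >> 1 = 42 in the field (half the intended count, and an even number), while the low bit of 85 spills into bit 23, the must-be-zero bit.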
Chris Wilson 7f7f95abbf sna/accel: Use the PolyFillRect to handle tiled spans
Would be preferable to duplicate the tiling logic. Leave the task of
reimplementing XAA to another day!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-09-23 11:49:43 +01:00
Chris Wilson 964c96b181 sna/accel: Always subtract the enlarged region from the outstanding GPU damage
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-09-23 11:49:43 +01:00
Chris Wilson c68856f346 sna/accel: Only skip undamaging the GPU for reads
Introduced with ac1b83240e (sna/accel: Simplify single pixel read-back)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-09-23 11:49:43 +01:00
Chris Wilson 5913c90967 sna/accel: fix assert to include the offset of copy
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-09-22 17:13:15 +01:00