Commit Graph

4285 Commits

Author SHA1 Message Date
Chris Wilson 0cda7b4fa8 sna: Implement extended fallback handling for src == dst copies
Only marginally better than falling all the way back to using the CPU,
is to perform a double copy to work around the overlapping copy.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-20 18:46:47 +00:00
Chris Wilson d257a96739 sna: Explicitly handle depth==1 scratch pixmaps
Short-cut the determination of whether it can be tiled and accelerated
-- we know it can't! This is mainly to clean up the debug logs.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-20 12:55:01 +00:00
Chris Wilson 0c12f7cb01 sna: Tidy up some recent valgrind complaints with reuse of scratch pixmaps
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-20 12:48:28 +00:00
Chris Wilson d3a4f5db14 sna: Fixup the refcnt to avoid an assert
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-20 12:17:27 +00:00
Chris Wilson ac52a1fcd1 sna: Don't immediately check for region intersection after subtract
In the READ==0 case we know that the region does not intersect damage
because we have just subtracted, and checking the intersection causes us
to immediately apply the subtraction operation defeating the
optimisation and forcing the expensive operation each time.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-20 08:44:14 +00:00
Chris Wilson 4071dca0ef sna: Don't mark mapping as synchronous by default
Only those that point into scratch memory need to be synchronized before
control is handed back to the client.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-20 00:25:29 +00:00
Chris Wilson 4c2a97e9d2 sna: Always pass the damage to sna_drawable_use_gpu_bo()
As it now assumes that the damage is always writable.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-19 23:10:45 +00:00
Chris Wilson 06a1792f91 sna: Avoid the GPU readback with READ==0 move_to_cpu
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-19 21:04:02 +00:00
Chris Wilson 0b5fec3f80 sna: Drop the is-mapped flag after operating via the GPU
Mark the end of a sequence of CPU operations and force the decision to
map again to be based on the current upload operation.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-19 20:13:05 +00:00
Chris Wilson 351c8f1633 sna: Discard all damage when replacing pixmap contents
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-19 20:11:08 +00:00
Chris Wilson addf66dda7 sna: Tweak the rendering priorities
If the last operation was on the GPU, continue on the GPU if this
operation overlaps any GPU damage or does not overlap CPU damage.
Otherwise, if the last operation was on the CPU only switch to the GPU
if we do not overlap any CPU damage and overlap existing GPU damage.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-19 20:11:08 +00:00
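The priority rule described in the commit above can be sketched as a small predicate. This is only an illustration of the two stated cases; the enum, the function name, and the boolean inputs are hypothetical and are not taken from the driver source.

```c
#include <stdbool.h>

/* Hypothetical record of where the previous operation ran. */
enum last_op { LAST_OP_CPU, LAST_OP_GPU };

/*
 * Return true if the next operation should run on the GPU.
 * The overlap flags are assumed to be computed by the caller from the
 * pixmap's GPU and CPU damage regions.
 */
bool prefer_gpu(enum last_op last,
                bool overlaps_gpu_damage,
                bool overlaps_cpu_damage)
{
    if (last == LAST_OP_GPU) {
        /* Stay on the GPU unless only CPU damage is touched. */
        return overlaps_gpu_damage || !overlaps_cpu_damage;
    }

    /* Last on the CPU: switch only when it is clearly a GPU operation. */
    return !overlaps_cpu_damage && overlaps_gpu_damage;
}
```

The asymmetry is the point of the sketch: an operation touching neither kind of damage stays wherever the previous operation ran.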
Chris Wilson 9b6ade1234 sna: Create a GPU bo for accelerated core drawing
As we now can accelerate most of the common core drawing operations, we
can create a GPU bo for accelerated drawing on first use without undue
fear of readbacks. This especially benefits Qt, which heavily uses the
core drawing operations.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-19 20:11:08 +00:00
Chris Wilson eeb81dd6b4 sna: Remove the forced inplace upload
Make the decision per-operation, with a tendency to remain mapped.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-19 20:10:57 +00:00
Chris Wilson c3a8d77a2b sna: Tune the inplace cross-over point to be half-cache size
The theory being that we will also require cache space to copy from when
uploading into the shadow.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-19 20:10:47 +00:00
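A minimal sketch of the half-cache cross-over described in the commit above. The function name, its parameters, and the strict comparison are assumptions made for illustration; the driver's actual test may differ.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: choose an inplace upload once the data exceeds
 * half of the cache, on the theory that the other half is needed for
 * the source of the copy into the shadow.
 */
bool should_upload_inplace(size_t upload_bytes, size_t cache_bytes)
{
    return upload_bytes > cache_bytes / 2;
}
```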
Chris Wilson d53b1b2895 configure: Bump the required pixman version
UXA now also uses pixman_triangle_t for its fallback, so we
need to bump the required pixman version for UXA as well as SNA.

Reported-by: Fabio Pedretti <fabio.ped@libero.it>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=43946
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-19 15:39:06 +00:00
Chris Wilson 1fa5721f06 sna: Reset the GTT mapping flag when freeing the shadow pointers
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-19 00:37:43 +00:00
Chris Wilson 7326d30986 sna: Restore CPU shadow after a GTT memory
When mixing operations and switching between GTT and CPU mappings we
need to restore the original CPU shadow rather than accidentally
overwrite.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-19 00:37:43 +00:00
Chris Wilson ae32aaf4b2 sna/gen[23]: We need to check the batch before doing an inline flush
A missing check before emitting a dword into the batch opened up the
possibility of overflowing the batch and corrupting our state.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-19 00:37:43 +00:00
Chris Wilson e32ad64676 sna: Continue searching the linear lists for CPU mappings
Prefer to reuse an available CPU mapping, as these are considered precious
and are reaped if we keep too many unused entries available.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-18 23:42:07 +00:00
Chris Wilson 15a769a66f sna: Distinguish between GTT and CPU maps when searching for VMA
Similarly try to avoid borrowing any vma when all we intend to do is
pwrite.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-18 23:00:46 +00:00
Chris Wilson d0ee695ef0 sna: the active cache is not marked as purgeable, so skip checking it
Otherwise we do a lot of list walking for no-ops.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-18 20:28:18 +00:00
Chris Wilson 8df9653135 sna: clear the request list when reusing a flushing bo
That the rq is NULL when on the flushing list is no longer true, but
now it points to the static request instead.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-18 20:08:19 +00:00
Chris Wilson b51e3de662 sna: When freeing vma, first see if the kernel evicted any
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-18 19:51:17 +00:00
Chris Wilson fed8d145c1 sna: Use a safe iterator whilst searching for inactive linear bo
As we may free a purged bo whilst iterating, we need to keep the next bo
as a local member.

Include the debugging that led to this find.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-18 19:46:08 +00:00
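The safe-iteration pattern mentioned in the commit above can be illustrated on a plain singly linked list. This is a sketch under assumptions: the struct, its fields, and the function name are hypothetical stand-ins, and the driver itself uses its own list helpers rather than this code.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a cached buffer object. */
struct bo {
    struct bo *next;
    int purged;     /* non-zero if the kernel evicted its pages */
};

/*
 * Free every purged entry and return the number of entries kept.
 * The next pointer is read into a local before the current node may be
 * freed, so the walk never touches released memory.
 */
size_t reap_purged(struct bo **head)
{
    struct bo **link = head;   /* pointer that refers to the current node */
    struct bo *bo, *next;
    size_t kept = 0;

    for (bo = *head; bo != NULL; bo = next) {
        next = bo->next;       /* saved before bo can be freed */
        if (bo->purged) {
            *link = next;      /* unlink, then release */
            free(bo);
        } else {
            link = &bo->next;
            kept++;
        }
    }
    return kept;
}
```

Reading `bo->next` in the loop increment instead would dereference a node that may already have been freed, which is the class of bug the commit describes.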
Chris Wilson 2a98dabcab sna: Purge all evicted bo
After we find a single bo that has been evicted from our cache, the
kernel is likely to have evicted many more so check our caches for any
more bo to reap.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-18 19:23:10 +00:00
Chris Wilson 8ae105b2c7 sna: Only retire for the VMA search if there are cached VMA
If there are no VMA that might become inactive, there is no point
scanning the inactive lists if we are searching for VMA.

This prevents the regression in firefox-fishbowl whilst maintaining most
of the improvement with PutComposite.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-18 18:29:33 +00:00
Chris Wilson a0c0a3765c sna: Retire if the inactive vma list is empty
Try to recycle vma by first trying to populate the inactive list before
scanning for a vma bo to harvest.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-18 17:38:05 +00:00
Chris Wilson 34efb73146 sna: Hint likely usage of CPU bo
If we are going to transfer GPU damage back to the CPU bo, then we can
reuse an active buffer and so improve the recycling.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-18 17:00:22 +00:00
Chris Wilson 3018967438 sna: Only upload to the source GPU bo if we need tiling to avoid TLB misses
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-18 16:25:49 +00:00
Chris Wilson b7f5d75aa5 Silence uxa-only compilation
Kill the stray warning for the undeclared extern used by the module
loader.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-18 16:25:28 +00:00
Chris Wilson a73cc4bf1e sna/gen5: Tidy checking against hardcoded maximum 3D size
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-18 14:55:23 +00:00
Chris Wilson b43548af39 sna: Explicitly handle errors from madv
In order to avoid conflating whether a bo was marked purgeable with its
retained state, we need to carefully handle the errors from madv.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-18 13:19:26 +00:00
Chris Wilson 954cf5129d sna/gen[67]: check for context switch after preparing source
If we used the BLT to prepare the source, see if we can continue the
operation on the BLT.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-18 11:17:35 +00:00
Chris Wilson 90a432431c sna/gen[23]: Try BLT if the source/target do not fit in the 3D pipeline
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-18 10:25:40 +00:00
Chris Wilson eeb9741981 sna/gen3: Tidy checks against hardcoded maximum 3D pipeline size
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-18 10:04:08 +00:00
Chris Wilson dcfcf438a5 sna/gen2+: If we use the BLT to prepare the target, try using BLT for op
If we incurred a context switch to the BLT in order to prepare the
target (uploading damage for instance), we should recheck whether we can
continue the operation on the BLT rather than force a switch back to
RENDER.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-18 09:58:08 +00:00
Chris Wilson 507debe801 sna/gen5: If we need to flush the composite op, check to see if we can blit
If we need to halt the 3D engine in order to flush the pipeline for a
dirty source, we may as well re-evaluate whether we can use the BLT
instead.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-18 01:47:27 +00:00
Chris Wilson de530f89a3 sna/gen5+: First try a blt composite if the source/dest are too large
If we will need to extract either the source or the destination, we
should see if we can do the entire operation on the BLT.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-18 01:47:27 +00:00
Chris Wilson 7b88f87945 sna: Upload images in place from CopyArea
As for PutImage, if the damage will be immediately flushed out to the
GPU bo, we may as well do the write directly to the GPU bo rather than
staging it via the shadow.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-18 00:48:33 +00:00
Chris Wilson 1418e4f315 sna: Tune the default pixmap upload paths
One issue with the heuristic is that it is based on total pixmap size
whereas the goal is to pick the placement for the next series of
operations. The next step in refinement is to combine an overall
placement to avoid frequent migrations along with a per-operation
override.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-17 23:42:06 +00:00
Chris Wilson 25c353503a sna: Simplify write domain tracking
Replace the growing bitfield with an enum marking where it was last
used.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-17 21:26:35 +00:00
Chris Wilson d20d167a75 sna: Upload to large pixmaps inplace
When the pixmap is large, larger than L2 cache size, we are unlikely to
benefit from first copying the data to a shadow buffer -- as that shadow
buffer itself will mostly reside in main memory. In such circumstances
we may as well perform the write to the GTT mapping of the GPU bo. As such,
it is a fragile heuristic that may require further tuning.

Avoiding that extra copy gives a 30% boost to putimage500/shmput500 at
~10% cost to putimage10/shmput10 on Atom (945gm/PineView), without any
noticeable impact upon cairo.

Reported-by: Michael Larabel <Michael@phoronix.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-17 21:26:35 +00:00
Chris Wilson dd8fd6c906 sna: Search through the inactive VMA cache for potential upload bo
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-17 21:26:35 +00:00
Chris Wilson 8ef5d8c195 sna: Map the upload buffer using an LLC bo
In order to avoid having to perform a copy of the cacheable buffer into
GPU space, we can map a bo as cacheable and write directly to its
contents. This is only a win on systems that can avoid the clflush, and
also we have to go to greater measures to avoid unnecessary
serialisation upon that CPU bo. Sadly, we do not yet go to sufficient lengths
to avoid negatively impacting ShmPutImage, but that does not appear to
be an artefact of stalling upon a CPU buffer.

Note, LLC is a SandyBridge feature enabled by default in kernel 3.1 and
later. In time, we should be able to expose similar support for
snoopable buffers for other generations.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-17 21:26:35 +00:00
Chris Wilson 6e47f28371 sna/gen3: Enforce a minimum width of 2 elements for the render target
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-17 21:26:34 +00:00
Chris Wilson 2ff0826f94 sna: Discard GPU damage first before choosing where to fill_boxes()
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-17 21:26:34 +00:00
Chris Wilson 55520bab57 sna/gen3: Initialise missing value of need ca pass for fill_boxes()
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-17 21:26:34 +00:00
Chris Wilson e56d5081ea sna: Wrap I915_GEM_GET_PARAM with valgrind markup
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-17 21:26:34 +00:00
Chris Wilson e0399ec161 sna: Suppress an overwritten XY_SRC_COPY
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-17 21:26:34 +00:00
Chris Wilson 1684ed6a5e sna: Clean up compiler warnings for shadowed variables
No outright bug, just plenty of unwanted noise.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-17 21:26:34 +00:00