If we are copying a region that does not fill its extents (i.e. is not
singular) then we must be careful not to discard the CPU damage that is
not overwritten by the copy.
Fixes regression from 77ee922485
(sna: Use full usage flags for moving the dst pixmap for a copy).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The goal of the heuristic is to reduce readbacks and damage tracking on
active GPU bo whilst simultaneously offering the best performance for
small operations which would prefer to be performed on the shadow rather
than in place.
This restores ShmPutImage performance.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
As proxies are short-lived and are not used outside of the operation for
which they are created, dirtied or flushed, we can safely copy the dirty
status onto the proxy object itself.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Given the rarity of encountering a purged buffer versus the frequency of
scanning the list and the then likely result of allocating a new buffer,
simply abort the search on the first purged bo.
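
A minimal sketch of the early exit described above, using a hypothetical
singly linked cache list. The struct bo layout and the search_cache name
are assumptions made for illustration and are not the driver's own
definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a cached buffer object. */
struct bo {
    size_t size;
    bool purged;      /* backing pages have been reclaimed */
    struct bo *next;  /* next entry in the cache list */
};

/*
 * Return the first cached bo of at least `size` bytes, or NULL when the
 * caller should allocate a new buffer. The scan stops at the first
 * purged entry instead of skipping over it and continuing.
 */
static struct bo *search_cache(struct bo *head, size_t size)
{
    assert(size > 0);
    for (struct bo *bo = head; bo != NULL; bo = bo->next) {
        if (bo->purged)
            return NULL;   /* abort the search: allocate afresh */
        if (bo->size >= size)
            return bo;
    }
    return NULL;
}
```

The trade-off is that a usable buffer sitting behind a purged one is
ignored, in exchange for a shorter scan in the common case.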
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
This way all paths can test to see if they might be able to reduce the
tiled fill or the opaque fill into a solid fill.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Having removed the double analysis for the fast paths, at least, the
span filling code on the GPU is now faster than doing the same
operations in cache memory for the majority of cases. So allow the
driver to prefer to use those functions when it has a GPU bo.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
miZeroPolyArc may use either FillSpans or PolyPoint to generate its
curves, so also provide custom point filling routines.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Since we compute it for the pixmap migration, we may as well use it to
perform the clipping within FillSpans as well.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The goal is to avoid the overhead of performing multiple region analyses
when calling sna_fill_spans by doing it once at the top level and then
choosing the most appropriate drawing method.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The goal is to avoid the overhead of performing multiple region analyses
when calling sna_fill_spans by doing it once at the top level and then
choosing the most appropriate drawing method.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The fast path to decide whether to use the GPU bo for the core drawing
operations forgot to update the active status of the pixmap. This
included forgetting to clear the is-cleared flag.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
In order for us to produce stable downsampled images across multiple
frames, we need to sample the same pairs of pixels every time. This
requires us to align the origin of the sample region to an even pixel.
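
As an illustration of the alignment described above, the sketch below
rounds the origin of a sample box down to an even pixel. It assumes a
2x downsample in which pixels are paired as (2k, 2k+1); the struct box
type and the function names are hypothetical.

```c
#include <assert.h>

/* Hypothetical box type: x1,y1 inclusive origin, x2,y2 exclusive end. */
struct box { int x1, y1, x2, y2; };

/* Round down to the nearest even value (two's complement integers). */
static int align_down_even(int v)
{
    return v & ~1;
}

/*
 * Move the origin of the sample region onto an even pixel so that the
 * same pixel pairs are combined on every frame. The box only grows, so
 * it still covers the area originally requested.
 */
static struct box align_sample_origin(struct box b)
{
    assert(b.x1 <= b.x2 && b.y1 <= b.y2);
    b.x1 = align_down_even(b.x1);
    b.y1 = align_down_even(b.y1);
    return b;
}
```

With an origin of (5, 7) the sample region would start at (4, 6), so a
box that moves by one pixel between frames does not flip which
neighbours are averaged together.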
Reported-by: Clemens Eisserer <linuxhippy@gmail.com>
References: https://bugs.freedesktop.org/show_bug.cgi?id=45086
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The idea behind that optimisation is for the inactive pixmap to be
refreshed and allowed to be transferred back to the GPU when it is
entirely redrawn. As such, performing the subtraction when it does not
completely remove it only incurs additional overhead.
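
A rough sketch of that check, assuming the CPU damage can be summarised
by a single extents box. The struct damage and discard_if_overwritten
names are hypothetical and only illustrate the idea of skipping the
subtraction unless it would remove the damage completely.

```c
#include <assert.h>
#include <stdbool.h>

struct box { int x1, y1, x2, y2; };

/* Hypothetical damage summary: a flag plus the bounding extents. */
struct damage {
    bool present;
    struct box extents;
};

static bool box_contains(const struct box *outer, const struct box *inner)
{
    return outer->x1 <= inner->x1 && outer->y1 <= inner->y1 &&
           outer->x2 >= inner->x2 && outer->y2 >= inner->y2;
}

/*
 * Drop the CPU damage only if the drawn box overwrites all of it.
 * Returns true when no CPU damage remains afterwards. A partial
 * overlap leaves the damage untouched and performs no subtraction.
 */
static bool discard_if_overwritten(struct damage *cpu, const struct box *drawn)
{
    assert(drawn->x1 < drawn->x2 && drawn->y1 < drawn->y2);
    if (!cpu->present)
        return true;
    if (box_contains(drawn, &cpu->extents)) {
        cpu->present = false;
        return true;
    }
    return false;
}
```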
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Somewhere somewhen it appears that I am discarding the all-damaged flag
on the pointer. The only possibility I can see is for a no-op
subtraction, so put an assert there just in case the impossible is
happening.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
This may not be true for external buffers that are put on the flushing
list because they have foreign requests pending.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Another restriction on the render pipeline, it turns out, is that before
the blend unit can read back the dst pixels in a subsequent primitive,
we must stall the pipeline for the completion of that earlier primitive.
This is demonstrated by cacomposite.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
It is surprisingly common for a pixmap to be created, cleared and then
used as an upload target or, even worse, as a source for a ShmGetImage.
In order to prevent this folly, we can trivially track when we clear an
entire pixmap and its GPU bo and avoid the readback in such cases.
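
The tracking described above might look roughly like the following
sketch. The struct pixmap fields, the function names and the readback
counter are all assumptions made for illustration; the readback itself
is only simulated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical pixmap state, not the driver's private structure. */
struct pixmap {
    int width, height;
    bool clear;            /* whole pixmap known to hold clear_color */
    uint32_t clear_color;
    int readbacks;         /* counts simulated GPU to CPU transfers */
};

/* A fill sets the flag only when it covers the entire pixmap. */
static void pixmap_fill(struct pixmap *p, int x, int y, int w, int h,
                        uint32_t color)
{
    p->clear = x <= 0 && y <= 0 &&
               x + w >= p->width && y + h >= p->height;
    if (p->clear)
        p->clear_color = color;
}

/* Any other drawing invalidates the flag. */
static void pixmap_draw(struct pixmap *p)
{
    p->clear = false;
}

/* Returns true if a readback from the GPU bo was required. */
static bool pixmap_read(struct pixmap *p, uint32_t *dst)
{
    int n = p->width * p->height;

    assert(dst != NULL);
    if (p->clear) {
        for (int i = 0; i < n; i++)
            dst[i] = p->clear_color;   /* synthesise the contents */
        return false;
    }
    p->readbacks++;                    /* stand-in for the real copy */
    memset(dst, 0, (size_t)n * sizeof(*dst));
    return true;
}
```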
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
In the unlikely event that this makes a difference, provide the hint as
to when we do not read back from the destination and so a streaming copy
would be preferable.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Now that the migration code can decide for itself when to not move
damage, we can pass the hints to the code rather than perform the
optimisation in sna_copy_boxes.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If we attempt to read from a drawable that is partially off its backing
pixmap (such as a partially offscreen composite window) we need to fix up
the read from the out-of-bounds regions to return clear. Since we don't,
the easier answer is just to switch to the render pipeline for such
an operation.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The gen3 backend, among others, uses the unique id of a buffer to track
the currently attached buffer and uses 0 as the invalid value. Linear
buffers as created by kgem_create_buffer_2d were not being assigned a
unique id, causing mayhem when they were then passed to the backends as
render targets and sources. In particular, gen3 did not notice the
switch in render target and so neither emitted commands to change the
GPU target nor attached the buffer to the batch, causing
sna_read_boxes to fail and us to trigger an assertion for an
unconsumed read buffer.
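
To illustrate why a missing id matters, here is a small sketch of a
backend that caches the id of its current render target, with 0 meaning
"none". All types and function names are hypothetical; only the
convention that 0 is the invalid value comes from the description above.

```c
#include <assert.h>
#include <stdint.h>

struct bo { uint32_t unique_id; };            /* 0 means "no id" */
struct allocator { uint32_t next_id; };
struct render_state {
    uint32_t target_id;                       /* 0 means "no target" */
    int target_switches;                      /* state emissions */
};

/* Hand out a non-zero id, skipping 0 if the counter wraps. */
static uint32_t alloc_unique_id(struct allocator *a)
{
    if (++a->next_id == 0)
        ++a->next_id;
    return a->next_id;
}

/* Every buffer, linear ones included, must receive an id on creation. */
static void create_linear_buffer(struct allocator *a, struct bo *bo)
{
    bo->unique_id = alloc_unique_id(a);
}

/* Re-emit target state only when the cached id changes. */
static void set_target(struct render_state *s, const struct bo *bo)
{
    assert(bo->unique_id != 0);   /* an id of 0 would alias "no target" */
    if (s->target_id == bo->unique_id)
        return;
    s->target_id = bo->unique_id;
    s->target_switches++;
}
```

Without the assignment in create_linear_buffer, two distinct buffers
would both carry id 0 and the cached comparison could not tell them
apart, which matches the missed target switch described above.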
Reported-by: Clemens Eisserer <linuxhippy@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=42718
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The empty glyph still needs the correct advance, and copying it too late
left it as zero and so we were collapsing spaces in PolyText8 and
friends.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Move the workaround CS stall into the emit drawrect which is the only
non-pipelined op we emit. This removes the split between deciding
whether we will emit a drawrect and actual emission.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>