If the pixmap is mapped to the GPU bo, we should continue to use the
current mapping rather than revoke it. Otherwise, if we write to the GPU
bo in place, thereby discarding the CPU bo, we set the very pointer we
are about to copy to to NULL.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The mi routines do not ensure that their output is suitably constrained
to the clip extents, so we must run it through the clipper.
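A minimal sketch of the kind of clipping meant here, assuming a
hypothetical span type and a box with exclusive x2/y2 (the names
clip_span, struct span and struct box are illustrative, not the
driver's own):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the server's box and span types. */
struct box { int x1, y1, x2, y2; };   /* x2/y2 are exclusive */
struct span { int x, y, width; };

/*
 * Clip one horizontal span against the clip extents.
 * Returns false if nothing of the span remains inside the clip.
 */
static bool clip_span(struct span *s, const struct box *clip)
{
    int x1 = s->x;
    int x2 = s->x + s->width;

    assert(s->width >= 0);

    /* Reject spans on scanlines outside the clip. */
    if (s->y < clip->y1 || s->y >= clip->y2)
        return false;

    /* Trim both ends to the clip extents. */
    if (x1 < clip->x1)
        x1 = clip->x1;
    if (x2 > clip->x2)
        x2 = clip->x2;
    if (x1 >= x2)
        return false;

    s->x = x1;
    s->width = x2 - x1;
    return true;
}
```

A real clip region may hold several boxes; this only shows the
per-box step for a single span.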
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
In order to avoid using the wrong function for a scratch GC created
during the course of a MI function whilst we have a specialised GC in
use, we need to avoid modifying the original function table.
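As an illustration only, with a much-reduced hypothetical function
table (struct gc, struct gc_funcs and draw_specialised are made-up
names, not the server's GCOps), the idea is to point the GC at a
private copy of the table for the duration of the call:

```c
#include <assert.h>
#include <stddef.h>

struct gc;

/* Hypothetical, much-reduced drawing function table. */
struct gc_funcs {
    int (*fill_spans)(struct gc *gc, int n);
};

struct gc {
    const struct gc_funcs *funcs;
};

static int generic_fill_spans(struct gc *gc, int n) { (void)gc; return n; }
static int special_fill_spans(struct gc *gc, int n) { (void)gc; return -n; }

/* Shared table, also used by any scratch GC created meanwhile. */
static const struct gc_funcs generic_funcs = { generic_fill_spans };

/*
 * Draw with a specialised fill_spans by swapping in a local copy of
 * the table; the shared table itself is never written to.
 */
static int draw_specialised(struct gc *gc, int n)
{
    const struct gc_funcs *saved;
    struct gc_funcs local;
    int ret;

    assert(gc != NULL && gc->funcs != NULL);

    saved = gc->funcs;
    local = *saved;
    local.fill_spans = special_fill_spans;

    gc->funcs = &local;
    ret = gc->funcs->fill_spans(gc, n);
    gc->funcs = saved;   /* restore before returning */

    return ret;
}
```

A scratch GC that still points at generic_funcs keeps calling the
generic routine while the specialised one is in use.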
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
As drawable_gc_flags() operates on lower level information than the
hint, it is able to spot more opportunities to reduce the READ flags, and
so the assertion was overly optimistic.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If we are copying a region that does not fill its extents (i.e. is not
singular) then we must be careful not to discard the CPU damage that is not
overwritten by the copy.
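A conservative sketch of such a check, assuming a hypothetical region
type that records its extents and its number of rectangles
(copy_overwrites_damage and the structs are illustrative names only):

```c
#include <assert.h>
#include <stdbool.h>

struct box { int x1, y1, x2, y2; };   /* x2/y2 are exclusive */

/* Hypothetical region: bounding extents plus rectangle count. */
struct region {
    struct box extents;
    int num_rects;
};

static bool box_contains(const struct box *outer, const struct box *inner)
{
    return outer->x1 <= inner->x1 && outer->y1 <= inner->y1 &&
           outer->x2 >= inner->x2 && outer->y2 >= inner->y2;
}

/*
 * Only claim the CPU damage is fully overwritten when the copy region
 * is a single box (so it fills its extents) that contains the damage.
 * A multi-box region may leave holes, so answer false for it.
 */
static bool copy_overwrites_damage(const struct region *copy,
                                   const struct box *cpu_damage)
{
    assert(copy->num_rects >= 1);

    if (copy->num_rects != 1)
        return false;

    return box_contains(&copy->extents, cpu_damage);
}
```

The check errs on the side of keeping the damage: a non-singular
region that happens to cover it is still reported as not overwriting.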
Fixes regression from 77ee922485
(sna: Use full usage flags for moving the dst pixmap for a copy).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The goal of the heuristic is to reduce readbacks and damage tracking on
active GPU bo whilst simultaneously offering the best performance for
small operations which would prefer to be performed on the shadow rather
than in place.
This restores ShmPutImage performance.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
As proxies are short-lived and are not used outside of the operation for
which they are created, dirtied or flushed, we can safely copy the dirty
status onto the proxy object itself.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Given the rarity of encountering a purged buffer versus the frequency of
scanning the list and the then likely result of allocating a new buffer,
simply abort the search on the first purged bo.
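Roughly, and assuming a hypothetical singly linked cache list (struct
bo and search_cache below are simplified stand-ins, not the kgem
structures), the search becomes:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified cached buffer object. */
struct bo {
    struct bo *next;
    int size;
    bool purged;   /* backing pages were reclaimed by the kernel */
};

/*
 * Return the first cached bo that is large enough, or NULL.
 * Hitting a purged bo ends the search at once; the caller is then
 * expected to allocate a fresh buffer instead.
 */
static struct bo *search_cache(struct bo *head, int size)
{
    struct bo *bo;

    assert(size > 0);

    for (bo = head; bo != NULL; bo = bo->next) {
        if (bo->purged)
            return NULL;   /* abort rather than skip and continue */
        if (bo->size >= size)
            return bo;
    }

    return NULL;
}
```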
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
This way all paths can test to see if they might be able to reduce the
tiled fill or the opaque fill into a solid fill.
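One plausible shape for such a test, under the assumption that a tile
known to be a single pixel value and an opaque stipple with identical
foreground and background both degenerate to a solid colour (the
struct and function names are hypothetical):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum fill_style {
    FILL_SOLID,
    FILL_TILED,
    FILL_STIPPLED,
    FILL_OPAQUE_STIPPLED
};

/* Hypothetical subset of the GC state that matters for the test. */
struct fill_state {
    enum fill_style style;
    uint32_t fg, bg;
    bool tile_is_pixel;    /* tile is a single pixel value */
    uint32_t tile_pixel;
};

/* Returns true and stores the colour if the fill reduces to solid. */
static bool fill_is_solid(const struct fill_state *f, uint32_t *color)
{
    assert(f != NULL && color != NULL);

    switch (f->style) {
    case FILL_SOLID:
        *color = f->fg;
        return true;
    case FILL_TILED:
        if (!f->tile_is_pixel)
            return false;
        *color = f->tile_pixel;
        return true;
    case FILL_OPAQUE_STIPPLED:
        /* Set and unset stipple bits paint the same colour. */
        if (f->fg != f->bg)
            return false;
        *color = f->fg;
        return true;
    default:
        return false;
    }
}
```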
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Having removed the double analysis for the fast paths, at least, the
span filling code on the GPU is now faster than doing the same
operations in cache memory for the majority of cases. So allow the
driver to prefer to use those functions when it has a GPU bo.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
miZeroPolyArc may use either FillSpans or PolyPoint to generate its
curves, so also provide custom point filling routines.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Since we compute it for the pixmap migration, we may as well use it to
perform the clipping within FillSpans as well.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The goal is to avoid the overhead of performing multiple region analyses
when calling sna_fill_spans by doing it once at the top level and then
choosing the most appropriate drawing method.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The goal is to avoid the overhead of performing multiple region analyses
when calling sna_fill_spans by doing it once at the top level and then
choosing the most appropriate drawing method.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The fast path to decide whether to use the GPU bo for the core drawing
operations forgot to update the active status of the pixmap. This
included forgetting to clear the is-cleared flag.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
In order for us to produce stable downsampled images across multiple
frames, we need to sample the same pairs of pixels every time. This
requires us to align the origin of the sample region to an even pixel.
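For a 2x downsample the alignment amounts to rounding the origin down
to an even coordinate. A small sketch, assuming non-negative pixmap
coordinates and a hypothetical box type (align_sample_origin is an
illustrative name):

```c
#include <assert.h>

struct box { int x1, y1, x2, y2; };   /* x2/y2 are exclusive */

/*
 * Round the origin of the sample region down to an even pixel so that
 * every frame averages the same pairs of source pixels.
 */
static void align_sample_origin(struct box *b)
{
    /* Assumption: coordinates inside a pixmap are never negative. */
    assert(b->x1 >= 0 && b->y1 >= 0);

    b->x1 &= ~1;
    b->y1 &= ~1;
}
```

The region may grow by one pixel on the left and top; the far edge is
left untouched here.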
Reported-by: Clemens Eisserer <linuxhippy@gmail.com>
References: https://bugs.freedesktop.org/show_bug.cgi?id=45086
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The idea behind that optimisation is for the inactive pixmap to be
refreshed and allowed to be transferred back to the GPU when it is
entirely redrawn. As such, performing the subtraction when it does not
completely remove it only incurs additional overhead.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Somewhere somewhen it appears that I am discarding the all-damaged flag
on the pointer. The only possibility I can see is for a no-op
subtraction, so put an assert there just in case the impossible is
happening.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
This may not be true for external buffers that are put on the flushing
list because they have foreign requests pending.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Another restriction on the render pipeline, it turns out, is that before
the blend unit can read back the dst pixels in a subsequent primitive,
we must stall the pipeline for the completion of that earlier primitive.
This is demonstrated by cacomposite.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>