The presumption that the pixmap is the scanout and so will always be
pinned is false if there is a shadow or if we are running under a
compositor. In those cases, the pixmap may be idle and so the GPU bo may
be reaped. This was compounded by the video path not marking the pixmap
as busy. So whilst watching a video under xfce4 with compositing enabled
(it has to be a non-GL compositor) the video would suddenly stall.
Reported-by: Paul Neumann <paul104x@yahoo.de>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45279
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If the render target is thin enough to fit within the 3D pipeline, but is
too tall, we can fudge the address of the origin and coordinates to fit
within the constraints of the pipeline.
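The origin fudge can be sketched as below. This is only an illustration,
not the driver's code: MAX_3D_SIZE, TILE_HEIGHT, struct fudged_origin and
fudge_origin() are all hypothetical names and values. The idea is to move
the base address down to a tile-aligned row and subtract that row from
every y coordinate.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, for illustration only. */
#define MAX_3D_SIZE 8192  /* hypothetical pipeline limit, in rows */
#define TILE_HEIGHT 32    /* hypothetical tile height, in rows */

struct fudged_origin {
	uint32_t byte_offset; /* added to the base address of the target */
	int dy;               /* subtracted from every y coordinate */
};

/*
 * Try to place rows [y, y+h) within the pipeline limit by moving the
 * origin down to the nearest tile row at or above y.
 * Returns 1 on success, 0 if the operation itself is too tall.
 */
static int fudge_origin(uint32_t pitch, int y, int h,
			struct fudged_origin *out)
{
	int base;

	assert(y >= 0 && h > 0);

	base = y - (y % TILE_HEIGHT); /* align down to a tile row */
	if (y + h - base > MAX_3D_SIZE)
		return 0;

	out->byte_offset = (uint32_t)base * pitch;
	out->dy = base;
	return 1;
}
```

A caller would emit the surface state with the adjusted address and draw
at (x, y - dy); when the function returns 0 the operation has to be
handled some other way, for example by splitting it.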
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If the source is thin enough such that the pitch is within the sampler's
constraints and the sample size is small enough, just fudge the origin
of the bo such that it can be sampled.
This avoids having to create a temporary bo and use the BLT to extract
it and helps, for example, firefox-asteroids which uses a 64x11200
texture atlas.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Combine the two very similar routines that decided if we should render
into the GPU bo, CPU bo or shadow pixmap into a single function.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If the hw is wedged, then the pixmap creation routines will return an
ordinary unattached pixmap. The code presumed that it would only return
a pixmap with an attached bo, and so would segfault as it chased the
invalid pointer when the server was restarted after a GPU hang.
Considering that we already checked that the GPU wasn't wedged before we
started, this is just mild paranoia, but on a run-once piece of code.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The spec says that they must be wholly contained within the valid
BorderClip for a Window or within the Pixmap or else a BadMatch is
thrown. Rely on this behaviour and do not perform the clipping ourselves.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If the source is not attached to a buffer (be it a GPU bo or a CPU bo),
a temporary upload buffer would be required and so it is not worth
forcing the target to the destination in that case (should the target
not be on the GPU already).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
As the blitter on gen4+ does not require fence registers, it is not
restricted to operating on large objects within the mappable aperture.
As we do not need to operate on such large GPU bo in place, we can relax
the restriction on the maximum bo size for gen4+ to allocate for use
with the GPU.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If the bo is larger than a quarter of the aperture, it is unlikely that
we will be able to evict enough contiguous space in the GATT to
accommodate that buffer. So don't attempt to map them and use the
indirect access instead.
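As a minimal sketch of that heuristic (can_map_via_gtt() is a
hypothetical helper, and the sizes are whatever the caller has queried
from the kernel):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Decide whether it is worth trying to map a bo through the aperture.
 * Anything larger than a quarter of the aperture is assumed to be too
 * hard to find contiguous space for, so indirect access is used instead.
 */
static bool can_map_via_gtt(size_t bo_size, size_t aperture_size)
{
	assert(aperture_size > 0);
	return bo_size <= aperture_size / 4;
}
```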
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
It is preferable to reuse a slightly larger bo than it is to create a
fresh one and map it into the aperture. So search the bucket above us as
well.
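A rough sketch of searching the next bucket up, assuming a cache of
power-of-two buckets measured in pages. The structures and names here
(struct bo_cache, bucket_for(), cache_take()) are hypothetical and much
simpler than a real bo cache:

```c
#include <assert.h>
#include <stddef.h>

#define NUM_BUCKETS 16
#define CACHE_PAGE_SIZE 4096 /* assumed page size */

struct cached_bo {
	size_t size;
	struct cached_bo *next;
};

struct bo_cache {
	struct cached_bo *bucket[NUM_BUCKETS];
};

/* Bucket b holds bo of more than 2^(b-1) and at most 2^b pages. */
static int bucket_for(size_t size)
{
	size_t pages = (size + CACHE_PAGE_SIZE - 1) / CACHE_PAGE_SIZE;
	int b = 0;

	while (b < NUM_BUCKETS - 1 && ((size_t)1 << b) < pages)
		b++;
	return b;
}

/* Look in our own bucket first, then in the bucket above. */
static struct cached_bo *cache_take(struct bo_cache *cache, size_t size)
{
	int first = bucket_for(size);
	int last = first + 1 < NUM_BUCKETS ? first + 1 : first;
	int b;

	assert(size > 0);

	for (b = first; b <= last; b++) {
		struct cached_bo **p;

		for (p = &cache->bucket[b]; *p; p = &(*p)->next) {
			if ((*p)->size >= size) {
				struct cached_bo *bo = *p;

				*p = bo->next; /* unlink from the bucket */
				bo->next = NULL;
				return bo;
			}
		}
	}
	return NULL; /* caller must create a fresh bo */
}
```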
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
In order to handle rotations and fractional offsets produced by the act
of downsampling, we need to compute the full affine transformation and
apply it to the vertices rather than attempt to fudge it with an integer
offset.
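For illustration, composing the downsampling with a rotation and then
applying the result to a vertex might look like the following. The
struct affine layout and both helpers are hypothetical, and the order of
composition is an assumption of this sketch:

```c
#include <assert.h>

/* x' = xx*x + xy*y + x0,  y' = yx*x + yy*y + y0 */
struct affine {
	double xx, xy, x0;
	double yx, yy, y0;
};

/* Returns the transform equivalent to applying b first, then a. */
static struct affine affine_multiply(const struct affine *a,
				     const struct affine *b)
{
	struct affine r;

	assert(a && b);

	r.xx = a->xx * b->xx + a->xy * b->yx;
	r.xy = a->xx * b->xy + a->xy * b->yy;
	r.x0 = a->xx * b->x0 + a->xy * b->y0 + a->x0;
	r.yx = a->yx * b->xx + a->yy * b->yx;
	r.yy = a->yx * b->xy + a->yy * b->yy;
	r.y0 = a->yx * b->x0 + a->yy * b->y0 + a->y0;
	return r;
}

/* Transform a single vertex. */
static void affine_apply(const struct affine *m, double x, double y,
			 double *ox, double *oy)
{
	*ox = m->xx * x + m->xy * y + m->x0;
	*oy = m->yx * x + m->yy * y + m->y0;
}
```

A downsample by two with a fractional offset of a quarter pixel is
{0.5, 0, 0.25, 0, 0.5, 0.25}; multiplying it by a rotation keeps the
fractional part, which an integer offset cannot represent.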
References: https://bugs.freedesktop.org/show_bug.cgi?id=45086
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Even on non-LLC systems if we can prevent the migration of such
objects, we can still benefit immensely from being able to map them into
the GTT as required.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Take advantage of the fact that we know we will have to clflush the
unbound bo before use by the GPU and populate it in place.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
We need to adjust the clip to include the border pixels when migrating
damage from the backing pixmap. This also requires relaxing the
constraint that a read must be within the drawable.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The pathological case is nx1 or 1xm, which results in an illegal
allocation request of 0 bytes.
One such example is
wolframalpha.com: x = (200 + x) / 100
which generates an approximately 8500x1 image and so needs downscaling
to fit in the render pipeline on all but IvyBridge. Bring on Ivy!
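The failure mode and one way to guard against it can be sketched as
follows. downscale_dim() and downscaled_bytes() are hypothetical helpers
and clamping to one pixel is an assumption of this sketch, not
necessarily what the driver does:

```c
#include <assert.h>
#include <stddef.h>

/* Scale a dimension down, but never below a single pixel. */
static int downscale_dim(int size, int scale)
{
	int scaled;

	assert(size > 0 && scale > 0);

	scaled = size / scale; /* 1 / 2 == 0 without the clamp below */
	return scaled > 0 ? scaled : 1;
}

/* Bytes needed for the downscaled copy, with cpp bytes per pixel. */
static size_t downscaled_bytes(int width, int height, int scale, int cpp)
{
	return (size_t)downscale_dim(width, scale) *
	       (size_t)downscale_dim(height, scale) * (size_t)cpp;
}
```

With an 8500x1 source, a scale of 2 and 4 bytes per pixel this requests
4250 * 1 * 4 bytes rather than 0.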
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
sna_accel.c: In function 'sna_copy_plane':
sna_accel.c:5022:21: warning: 'ret' may be used uninitialized in this
function [-Wuninitialized]
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Prepare the source first as this has the dual benefit of letting us
decide how best to proceed with the op (on the CPU or GPU) and preventing
modification of the damage after we have chosen our preferred path.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The source window is (src->x, src->y)x(src->width, src->height) in
pixmap space. However, we then need to use this to clip against the
destination region, and so we need to translate from the source
coordinate to the destination coordinate.
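The translation can be sketched like this, assuming the operation copies
from (src_x, src_y) in the source to (dst_x, dst_y) in the destination.
struct box and the helpers are hypothetical stand-ins for the server's
own box and region types:

```c
#include <assert.h>

struct box {
	int x1, y1, x2, y2; /* x2, y2 are exclusive */
};

/* Shrink a to its intersection with b; returns 0 if that is empty. */
static int box_intersect(struct box *a, const struct box *b)
{
	if (a->x1 < b->x1) a->x1 = b->x1;
	if (a->y1 < b->y1) a->y1 = b->y1;
	if (a->x2 > b->x2) a->x2 = b->x2;
	if (a->y2 > b->y2) a->y2 = b->y2;
	return a->x1 < a->x2 && a->y1 < a->y2;
}

/*
 * Clip the destination box against the source window. The window is in
 * source coordinates, so shift it by (dst - src) before intersecting.
 */
static int clip_to_source(struct box *dst, struct box src_window,
			  int src_x, int src_y, int dst_x, int dst_y)
{
	int dx = dst_x - src_x;
	int dy = dst_y - src_y;

	assert(src_window.x1 <= src_window.x2);
	assert(src_window.y1 <= src_window.y2);

	src_window.x1 += dx;
	src_window.x2 += dx;
	src_window.y1 += dy;
	src_window.y2 += dy;
	return box_intersect(dst, &src_window);
}
```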
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
This allows us to discard any busy GPU or CPU bo when we know we are
going to clear the shadow pixmap afterwards.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If the pixmap is mapped to the GPU bo, we should continue to use the
current mapping rather than revoke it. Otherwise if we write to the GPU
bo inplace, thereby discarding the CPU bo, we set the pointer we are
about to copy to, to NULL.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The mi routines do not ensure that their output is suitably constrained
to the clip extents, so we must run it through the clipper.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
In order to avoid using the wrong function for a scratch GC created
during the course of a MI function whilst we have a specialised GC in
use, we need to avoid modifying the original function table.
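One way to picture this, using hypothetical types rather than the
server's GC structures: point the specialised GC at a private table
instead of writing into the shared one, so that anything else still
referring to the shared table keeps the generic functions.

```c
#include <assert.h>
#include <stddef.h>

struct gc_ops {
	int (*fill_spans)(int n);
};

struct gc {
	const struct gc_ops *ops;
};

static int generic_fill_spans(int n) { return n; }
static int fast_fill_spans(int n) { return -n; } /* stand-in fast path */

/* Shared by every GC, including scratch GCs; never written to. */
static const struct gc_ops shared_ops = { generic_fill_spans };
/* Private table used only while a GC is specialised. */
static const struct gc_ops fast_ops = { fast_fill_spans };

/* Swap in the private table and return the one to restore later. */
static const struct gc_ops *gc_specialise(struct gc *gc)
{
	const struct gc_ops *saved;

	assert(gc != NULL && gc->ops != NULL);

	saved = gc->ops;
	gc->ops = &fast_ops;
	return saved;
}

static void gc_restore(struct gc *gc, const struct gc_ops *saved)
{
	gc->ops = saved;
}
```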
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
As drawable_gc_flags() operates on lower level information than the
hint, it is able to spot more opportunities to reduce the READ flags and
so the assertion was overly optimistic.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>