On the older generation, we have severe alignment penalties for fenced
regions which dramatically reduce the amount of space we can effectively
use in a batch. To accommodate this, reduce the tiling step size.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
With a large object, we try harder to operate inplace (to avoid creating
a second large CPU bo). This introduced an issue where we tried to read
from the GPU bo when there was already existing damage in the CPU -
triggering an assertion.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
We can optimistically only require that we waste the largest fence
region in a batch, as all other fences will then be naturally aligned as
well. So long as the kernel succeeds in defragmenting the aperture...
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
When promoting a partial move_area_to_gpu to a full move_to_gpu, we have
to disable certain optimisations that we try to use if MOVE_READ==0.
Reported-and-tested-by: Matti Hamalainen <ccr@tnsp.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71198
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
On older architectures, large BO have to be untiled and so we can reuse
an existing CPU bo by adjusting its caching mode.
References: https://bugs.freedesktop.org/show_bug.cgi?id=70924
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The older generations have stricter requirements for alignment of fenced
GPU surfaces, so accommodate this by reducing our estimate available
space for the temporary tile.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If we install a BLT fill operation early in the drawing sequence (i.e.
before calling a mi routine), we may lose our state to delayed
initialisation of sources and so need to subsequently recheck.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
kgem.c: In function 'kgem_check_bo':
kgem.c:4768:7: warning: declaration of 'active' shadows a global declaration [-Wshadow]
kgem.c:692:21: warning: shadowed declaration is here [-Wshadow]
kgem.c: In function 'kgem_check_many_bo_fenced':
kgem.c:4907:7: warning: declaration of 'active' shadows a global declaration [-Wshadow]
kgem.c:692:21: warning: shadowed declaration is here [-Wshadow]
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Following a complex path through multiple layers of indirections and
tiling fallbacks, resulted in hitting a path where the source offset was
subsequently ignored. This leads to the operation reading from invalid
memory (or hitting the assert warning about the same).
References: https://bugs.freedesktop.org/show_bug.cgi?id=70924
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If the stipple box is outside of the stipple pixmap, we need to
carefully upload the stipple using the modulus operation.
Buzilla: https://bugs.launchpad.net/bugs/1247785
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
As write_boxes itself decides whether or not to stage the upload into
the destination bo, we can destroy the temporary allocation along the
write_boxes fallback path (i.e. after failing to map the destination
bo).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Along the sna_replace__xor path we destroyed the priv->gpu_bo twice upon
successfully replacing it.
References: https://bugs.freedesktop.org/show_bug.cgi?id=70527
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Since we extend the write in the cache-aligned routine, it runs the risk
of reading from beyond the end of the allocation. As such, callers
should be carefully vetted to make sure that their allocations are
already cache-aligned (typically page-aligned). To make it obvious that
this complexity exists, rename the routine.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
When we call kgem_bo_write() we have less control over the allocation of
the buffer, and do not ensure it meets the alignment criteria required
for the cacheline optimisation. So use the simple pwrite routine to
avoid reading beyond the end of the allocation.
Reported-and-tested-by: Mark Kettenis <mark.kettenis@xs4all.nl>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If we are ignoring CPU damage, we also need only to check GPU damage
when considering placement of the target bo.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Another fix for
commit e3f15cbf39 [2.99.905]
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Tue Oct 22 15:19:15 2013 +0100
sna: Move gc back to GPU after failure to move it to CPU
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Old generations have very strict alignment requirements even for
unfenced tiled accesses which restricts the amount of aperture space
available for use, and in the process estimate for the effect of
framebuffer fragmentation on the mappable aperture.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Before attempting to map the destination for uploading into after a
failure to use the BLT, we need to recheck that it is indeed mappable.
References: https://bugs.freedesktop.org/show_bug.cgi?id=70924
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The macro still had buried returns which were no longer valid after the
translation to handle clipping. They needed to be breaks from the inner
most loops to the outer clip box instead.
Reported-by: Clemens Eisserer <linuxhippy@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70802
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
commit d580a30aaf
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Thu Oct 31 15:58:47 2013 +0000
sna/gen7: Flush render cache when changing CC state
ultimately doesn't prevent the issue and in the process adversely
affects perforamnce.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
After converting aperture_mappable to count in pages, there were a few
residual users expecting a byte count.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71117
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
In order to fix some font corruption, it appears that we need an extra
flush in the Sandybridge pipeline when we change the CC stage and the
render cache is dirty. We previously triggered a full pipeline stall
for this case.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Some of the mi routines recuse back into the generic accel routines and
so confuse our trapping of SIGBUS. Add extra assertions to pinpoint the
recursion and unwrap sufficiently to avoid that recursion.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
For an extra layer of paranoia, catch any sigbus when trying to upload
a bitmap, and convert it to a no-op.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The BYT vsync is closer to the IVB vsync, and using gen6 is just
erroneous. Apparently. At least that is what is in bspec today.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Yikes, the new assertions found cases where we would retain an active
buffer even though it was still owned by the next batch. Fortunately,
this should only be a bookkeeping issue rather than lead to corruption.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
As we know that we had a request to retire, we know that we may make
progress retiring active buffers.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>