The aim is to improve concurrency by keeping the GPU busy. The possible
complication is that we incur more overhead due to small batches.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
==29553== Invalid read of size 4
==29553== at 0x4980E1B: _list_del (intel_list.h:218)
==29553== by 0x4980EB3: list_del (intel_list.h:240)
==29553== by 0x4981F53: free_list (sna_damage.c:403)
==29553== by 0x4985139: __sna_damage_destroy (sna_damage.c:1467)
==29553== by 0x49A527E: sna_render_composite_redirect_done (sna_render.c:1921)
==29553== by 0x49C6904: gen2_render_composite_done (gen2_render.c:1136)
==29553== by 0x497F917: sna_composite (sna_composite.c:567)
==29553== by 0x8150C41: ??? (in /usr/bin/Xorg)
==29553== by 0x8142F13: CompositePicture (in /usr/bin/Xorg)
==29553== by 0x8145F58: ??? (in /usr/bin/Xorg)
==29553== by 0x81436F2: ??? (in /usr/bin/Xorg)
==29553== by 0x807965C: ??? (in /usr/bin/Xorg)
==29553== Address 0x9407e188 is not stack'd, malloc'd or (recently) free'd
Reported-by: bonbons67@internet.lu
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56785
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
I thought these were completely specified via the LOAD_STATE_IMMEDIATE
commands we used whilst setting up the render pipeline. I was wrong.
Reported-by: Timo Kamph <timo@kamph.org>
References: https://bugs.freedesktop.org/show_bug.cgi?id=55455
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
After we have computed the source offset vector for the transformed
source bo, we need to use that with respect to the destination rectangle
to verify that the source sample is wholly within bounds.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
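The bounds check described above can be pictured roughly as follows. This
is a minimal sketch, not the driver's code: struct box, the function name
and its parameters are all hypothetical, and only a pure integer
translation is handled (a general transform would need each corner mapped
through the matrix first).

```c
#include <stdbool.h>

/* Hypothetical rectangle type: x1/y1 inclusive, x2/y2 exclusive. */
struct box {
	int x1, y1, x2, y2;
};

/*
 * Translate the destination rectangle by the source offset (dx, dy)
 * and report whether every sample lands inside a source of the given
 * size. Assumption: the offset is a plain translation.
 */
static bool source_is_within_bounds(const struct box *dst,
				    int dx, int dy,
				    int src_width, int src_height)
{
	if (dst->x1 + dx < 0 || dst->y1 + dy < 0)
		return false;
	if (dst->x2 + dx > src_width || dst->y2 + dy > src_height)
		return false;
	return true;
}
```

The point of the sketch is that the offset is applied to the destination
rectangle, and it is the translated rectangle that is compared against the
source size.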
This reverts commit 5a5212117e.
The clean up is in effect too early, as this is during preparation and
the actual work is already being correctly done at the end.
Submit early, submit often in order to keep the GPU busy. As always we
trade off CPU overhead versus concurrency.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The only real user now has its own heuristics, so convert the remaining
users over to !is_gpu().
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The fall-through in this instance is irrelevant, and the static
analysers complain that the fall-through is not commented. Silence the
analyser by removing the fall-through.
Reported-by: Zdenek Kabelac <zkabelac@redhat.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Due to the unresolved flushing bug it is no faster (so only enable when
we definitely can't do the operation inplace), however it does eliminate
a chunk of CPU overhead.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
It was only ever used in conjunction with HAS_DEBUG_FULL. For debug
purposes it is as easy to redefine DBG locally. By simplifying the DBG
macro we can create it consistently and so reduce the number of compiler
warnings.
Long term, this has to be dynamic. Sigh.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If we expect to only emit this set of copy_boxes() and then submit the
batch, we would prefer to use the BLT for its lower latency.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Seemingly only advisable when already committed to using the GPU. This
first pass is still a little naive as it makes no attempt to avoid empty
tiles, nor aims to be efficient.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If we lack the ability to use a shader to compute the gradients
per-pixel, we need to use pixman to render a fallback texture. We can
reduce the size of this texture and upsample to reduce the cost with
hopefully imperceptible loss of quality.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
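As an illustration of shrinking the fallback gradient texture, the sketch
below picks a power-of-two reduction factor for one dimension. The helper
name, the max_size budget and the power-of-two policy are assumptions made
for the example; the commit does not say how the driver chooses the
reduced size.

```c
/*
 * Hypothetical helper: choose the smallest power-of-two factor that
 * brings one dimension of the fallback texture down to at most
 * max_size pixels. The factor is returned through *scale so that the
 * sampler can upsample by the same amount.
 */
static int reduced_size(int size, int max_size, int *scale)
{
	int s = 1;

	if (max_size < 1)
		max_size = 1;

	/* Round up so the reduced texture still covers the whole area. */
	while ((size + s - 1) / s > max_size)
		s *= 2;

	*scale = s;
	return (size + s - 1) / s;
}
```

The fallback would then be rendered with pixman at the reduced size and
stretched back to the full extents when it is sampled.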
Do not attempt to further reduce the operator locally in each backend as
the reduction is already performed in the upper layer.
References: https://bugs.freedesktop.org/show_bug.cgi?id=42606
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
As if we try to perform the operation with outstanding operations on the
source pixmaps, we will stall waiting for them to complete.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If both the source and destination are on the CPU, then the thinking was
it would be quicker to operate on those on the CPU rather than copy both
to the GPU and then perform the operation. This turns out to be a false
assumption if transformation is involved -- something to be reconsidered
if pixman should ever be improved.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
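The revised heuristic might be summarised as the predicate below. It is a
sketch under assumptions: struct op_state and prefer_cpu() are hypothetical
names, and the real decision takes more state into account than three
flags.

```c
#include <stdbool.h>

/* Hypothetical summary of where an operation's pixmaps currently live. */
struct op_state {
	bool src_on_gpu;
	bool dst_on_gpu;
	bool src_has_transform;
};

/*
 * Stay on the CPU only when both pixmaps are already there and the
 * source is untransformed; with a transform, copying to the GPU is
 * assumed to be the cheaper path.
 */
static bool prefer_cpu(const struct op_state *op)
{
	if (op->src_on_gpu || op->dst_on_gpu)
		return false;

	return !op->src_has_transform;
}
```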
Rather than the specialised routines that assumed pDrawable was
non-NULL, which was no longer true after f30be6f743.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
We treat any pixmap that is not attached to either a CPU or GPU bo as
requiring the pixel data to be uploaded to the GPU before we can
composite. Normally this is true, except for the solid cache.
References: https://bugs.freedesktop.org/show_bug.cgi?id=45672
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If the pixmap is larger than the pipeline, but the operation extents fit
within the pipeline, we may be able to create a proxy target to
transform the operation into one that fits within the constraints of the
render pipeline.
This fixes the infinite recursion hit with partially displayed extremely
large images.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
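A rough sketch of the proxy-target decision follows. The types, the
function name and the max_size parameter are hypothetical, and the sketch
ignores the tiling and alignment constraints a real proxy into a bo would
have to respect.

```c
#include <stdbool.h>

/* Hypothetical operation extents: x1/y1 inclusive, x2/y2 exclusive. */
struct extents {
	int x1, y1, x2, y2;
};

/* Hypothetical proxy description: size plus the coordinate delta. */
struct proxy {
	int dx, dy;
	int width, height;
};

/*
 * If the operation extents fit within the pipeline limit, describe a
 * proxy target covering just those extents. Drawing coordinates are
 * then adjusted by (dx, dy) so that the extents start at the origin.
 */
static bool make_proxy(const struct extents *e, int max_size,
		       struct proxy *out)
{
	int w = e->x2 - e->x1;
	int h = e->y2 - e->y1;

	if (w <= 0 || h <= 0 || w > max_size || h > max_size)
		return false;

	out->dx = -e->x1;
	out->dy = -e->y1;
	out->width = w;
	out->height = h;
	return true;
}
```

When make_proxy() fails, the operation itself is too large and some other
strategy (tiling or a fallback) is needed instead of retrying the same
path.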
Having noticed that eog was failing to perform an 8k x 8k copy with
compiz running on a 965gm, it was time the checks for batch overflow
were implemented.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
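A batch overflow check of the kind mentioned above could look something
like this. It is a generic sketch, not the SNA implementation: struct
batch, batch_flush() and batch_reserve() are hypothetical, and flushing is
reduced to resetting a counter.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical batch buffer accounting, measured in dwords. */
struct batch {
	size_t used;
	size_t size;
	unsigned flushes;
};

/* Stand-in for submitting the batch to the GPU and starting afresh. */
static void batch_flush(struct batch *b)
{
	b->used = 0;
	b->flushes++;
}

/*
 * Reserve space before emitting commands. If the request does not fit
 * in the remaining space, flush first; if it cannot fit even in an
 * empty batch, fail so the caller can split the operation.
 */
static bool batch_reserve(struct batch *b, size_t ndwords)
{
	if (ndwords > b->size)
		return false;

	if (ndwords > b->size - b->used)
		batch_flush(b);

	b->used += ndwords;
	return true;
}
```

The failure return is what matters for a very large copy: the caller has
to break the work into pieces rather than overrun the batch.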
On gen4+ devices the maximum render pitch is much larger than is simply
required for the maximum coordinates. This makes it possible to use
proxy textures as a subimage into the oversized texture without having
to blit into a temporary copy for virtually every single bo we use.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If the source is not attached to a buffer (be it a GPU bo or a CPU bo),
a temporary upload buffer would be required and so it is not worth
forcing the target to the destination in that case (should the target
not be on the GPU already).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>