Having noticed that eog was failing to perform an 8k x 8k copy with
compiz running on a 965gm, it was time the checks for batch overflow
were implemented.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
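As an illustration of the kind of batch overflow check described above, here is a minimal sketch. The structure, the 16-dword size and the names batch_has_room(), batch_emit() and batch_submit() are hypothetical and chosen for illustration; they are not the driver's actual API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified batch buffer; not the driver's real structure. */
#define BATCH_DWORDS 16

struct batch {
	uint32_t data[BATCH_DWORDS];
	size_t used;      /* dwords already emitted */
	unsigned flushes; /* number of times the batch was submitted */
};

/* Stand-in for handing the batch to the hardware and starting afresh. */
static void batch_submit(struct batch *b)
{
	b->used = 0;
	b->flushes++;
}

/* True if n more dwords fit without overflowing the buffer. */
static bool batch_has_room(const struct batch *b, size_t n)
{
	return n <= BATCH_DWORDS - b->used;
}

/* Emit a whole command; submit first if it would not otherwise fit. */
static void batch_emit(struct batch *b, const uint32_t *cmd, size_t n)
{
	assert(n <= BATCH_DWORDS); /* one command must fit in an empty batch */
	if (!batch_has_room(b, n))
		batch_submit(b);
	for (size_t i = 0; i < n; i++)
		b->data[b->used++] = cmd[i];
}
```

The point of checking the whole command up front is that a command is never split across two batches.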
On gen4+ devices the maximum render pitch is much larger than is simply
required for the maximum coordinates. This makes it possible to use
proxy textures as a subimage into the oversized texture without having
to blit into a temporary copy for virtually every single bo we use.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If the source is not attached to a buffer (be it a GPU bo or a CPU bo),
a temporary upload buffer would be required and so it is not worth
forcing the target to the destination in that case (should the target
not be on the GPU already).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
This allows us to implement backend-specific workarounds and use the
more appropriate device-specific flushing.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
One of the side-effects of emitting the composite state is that it
tags the destination surface as dirty as a result of the *forthcoming*
operation. Emitting the flush after emitting the composite state
therefore clears that tag, and we need to restore it for future coherency.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Be sure the mask picture has a valid format even though it points to the
same pixels as the valid source. And also be wary if the source was
converted to a solid, but the mask is not.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Use a single idiom and reuse the check built into the state emission,
for both spans/boxes paths.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Initially, batch->mode was only set upon an actual mode switch, and
batch submission would not reset the mode. However, to facilitate fast
ring switching with semaphores, resetting the mode upon batch submission
is desired, which means that if we submit the batch in the middle of an
operation we must redeclare its mode before continuing.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
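The redeclaration described above can be sketched as follows. This is a simplified model, assuming a hypothetical struct mode_batch with helpers mode_batch_submit(), mode_batch_set_mode() and emit_box(); only the idea of resetting the mode on submission and restoring it mid-operation comes from the commit text.

```c
#include <assert.h>
#include <stddef.h>

enum ring_mode { MODE_NONE, MODE_RENDER, MODE_BLT };

/* Hypothetical model of a batch that remembers which ring it targets. */
struct mode_batch {
	enum ring_mode mode;
	size_t used;       /* dwords emitted so far */
	size_t size;       /* capacity in dwords */
	unsigned switches; /* number of mode declarations performed */
};

/* Submission resets the mode so the next user may pick either ring. */
static void mode_batch_submit(struct mode_batch *b)
{
	b->used = 0;
	b->mode = MODE_NONE;
}

static void mode_batch_set_mode(struct mode_batch *b, enum ring_mode mode)
{
	if (b->mode != mode) {
		b->mode = mode;
		b->switches++;
	}
}

/* Emit one box of an ongoing operation that runs in the given mode. */
static void emit_box(struct mode_batch *b, enum ring_mode mode, size_t dwords)
{
	assert(dwords <= b->size);
	if (b->size - b->used < dwords) {
		mode_batch_submit(b);
		/* The submit cleared the mode: redeclare before continuing. */
		mode_batch_set_mode(b, mode);
	}
	b->used += dwords;
}
```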
Just in the unlikely event that we hit the delete-partial-upload path
which prefers destroying the last bo first.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If we are forced to perform a render operation to a bo too large to fit
in the pipeline, copy to an intermediate and split the operation into
tiles rather than fallback.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
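A rough sketch of splitting an oversized operation into tiles, as described above. The callback type, for_each_tile() and the sum_area() helper are hypothetical names; the copy to and from the intermediate buffer is left to the callback and not shown.

```c
#include <assert.h>

struct box { int x1, y1, x2, y2; };

/* Hypothetical per-tile callback; a driver would render the tile here. */
typedef void (*tile_fn)(const struct box *tile, void *closure);

/*
 * Walk a width x height target in tiles no larger than max_size on
 * either axis, invoking fn for each. Returns the number of tiles.
 */
static int for_each_tile(int width, int height, int max_size,
			 tile_fn fn, void *closure)
{
	int count = 0;

	assert(max_size > 0);
	for (int y = 0; y < height; y += max_size) {
		for (int x = 0; x < width; x += max_size) {
			struct box tile;

			tile.x1 = x;
			tile.y1 = y;
			/* Clamp the last tile in each row/column to the edge. */
			tile.x2 = width - x < max_size ? width : x + max_size;
			tile.y2 = height - y < max_size ? height : y + max_size;
			fn(&tile, closure);
			count++;
		}
	}
	return count;
}

/* Example callback: accumulate the covered area into a long. */
static void sum_area(const struct box *tile, void *closure)
{
	long *area = closure;

	*area += (long)(tile->x2 - tile->x1) * (tile->y2 - tile->y1);
}
```

Summing the tile areas is a cheap way to confirm that the tiles cover the target exactly once.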
As we try to use the diffuse/specular and only resort to using a texture
operation for convenience in the rare case of a solid mask.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If the new mode can be done either using a logic op or with the blend
unit, prefer the currently enabled unit.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Convert the linear gradient to a texture ramp and compute the texture
coordinates in the standard manner.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
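For reference, the usual way to derive a texture coordinate for a linear gradient is to project the pixel onto the gradient vector; the sketch below assumes that formulation, a two-stop ramp and pad (clamp) repeat. All names here are hypothetical and the real driver may compute this differently, for example with a transformation matrix applied by the hardware.

```c
#include <assert.h>
#include <stddef.h>

struct point { double x, y; };

/*
 * Project p onto the gradient vector p1->p2. The result is 0 at p1,
 * 1 at p2, and constant along lines perpendicular to the vector.
 */
static double linear_gradient_coord(struct point p1, struct point p2,
				    struct point p)
{
	double dx = p2.x - p1.x, dy = p2.y - p1.y;
	double len2 = dx * dx + dy * dy;

	assert(len2 > 0.0); /* degenerate gradients are handled elsewhere */
	return ((p.x - p1.x) * dx + (p.y - p1.y) * dy) / len2;
}

/* Fill an n-texel 1D ramp interpolating one channel from c0 to c1. */
static void build_ramp(double c0, double c1, double *ramp, size_t n)
{
	assert(n >= 2);
	for (size_t i = 0; i < n; i++)
		ramp[i] = c0 + (c1 - c0) * (double)i / (double)(n - 1);
}

/* Nearest-texel lookup with the coordinate clamped to [0, 1]. */
static double sample_ramp(const double *ramp, size_t n, double t)
{
	if (t < 0.0)
		t = 0.0;
	if (t > 1.0)
		t = 1.0;
	return ramp[(size_t)(t * (double)(n - 1) + 0.5)];
}
```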
GTK+ has a clever trick for premultiplying its images by loading the
same pixel data into both the source and mask, and then performing the
composite. This causes us to upload the same pixel data twice!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Replace the source picture+alpha with a bo that contains the RGB
channels from source and A from the alpha map.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
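The per-pixel merge described above might look like the following sketch. It assumes both the source and the alpha map are a8r8g8b8 and of equal size, which the commit text does not state; merge_alpha_map() is a hypothetical name, and a real implementation would also handle other formats and the alpha-map origin.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Build the replacement pixels: RGB channels from the source picture,
 * A channel from the alpha map. Assumes a8r8g8b8 for both inputs.
 */
static void merge_alpha_map(const uint32_t *src, const uint32_t *alpha,
			    uint32_t *out, size_t n)
{
	assert(src && alpha && out);
	for (size_t i = 0; i < n; i++)
		out[i] = (src[i] & 0x00ffffffu) | (alpha[i] & 0xff000000u);
}
```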
Only marginally better than falling all the way back to using the CPU
is to perform a double copy to work around the overlapping copy.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
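The double copy can be pictured with a CPU-side analogy: copy the source rectangle into a temporary, then copy the temporary to the destination, so that no read ever sees pixels already overwritten. The function copy_overlapping() below is a hypothetical illustration with the stride given in pixels; the driver performs the equivalent with GPU buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Copy a w x h rectangle from (sx, sy) to (dx, dy) within one surface,
 * going through a temporary so overlapping regions are handled.
 * Returns 0 on success, -1 if the temporary cannot be allocated.
 */
static int copy_overlapping(uint32_t *pixels, int stride,
			    int sx, int sy, int dx, int dy, int w, int h)
{
	size_t row = (size_t)w * sizeof(uint32_t);
	uint32_t *tmp;

	assert(w > 0 && h > 0);
	tmp = malloc(row * (size_t)h);
	if (tmp == NULL)
		return -1;

	/* First copy: source rectangle into the temporary. */
	for (int y = 0; y < h; y++)
		memcpy(tmp + (size_t)y * w,
		       pixels + (size_t)(sy + y) * stride + sx, row);

	/* Second copy: temporary into the destination rectangle. */
	for (int y = 0; y < h; y++)
		memcpy(pixels + (size_t)(dy + y) * stride + dx,
		       tmp + (size_t)y * w, row);

	free(tmp);
	return 0;
}
```

The cost is twice the bandwidth of a direct copy, which is why the commit calls it only marginally better than the CPU fallback.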
A missing check before emitting a dword into the batch opened up the
possibility of overflowing the batch and corrupting our state.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If we incurred a context switch to the BLT in order to prepare the
target (uploading damage for instance), we should recheck whether we can
continue the operation on the BLT rather than force a switch back to
RENDER.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
gen2/3 have a restriction that the 3D pipeline cannot render to a pixmap
with a pitch less than 8/16 respectively. Rather than mandating that all
pixmaps be created with a stride greater than 16, fix up the bo on
the rare occasions when it is necessary.
Reported-by: Paul Neumann <paul104x@yahoo.de>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=43688
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
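A small sketch of the check implied above: only when a render target's pitch falls below the generation's minimum does it need fixing up. The helper names are hypothetical, the limits are the 8/16 values quoted in the commit text, and returning 0 for other generations is an assumption of this sketch, not a statement about that hardware.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimum pitch the 3D pipeline can render to, per the text above. */
static int min_render_pitch(int gen)
{
	switch (gen) {
	case 2: return 8;
	case 3: return 16;
	default: return 0; /* assumption: no such limit modelled here */
	}
}

/* True on the rare occasions when the bo must be replaced. */
static bool needs_pitch_fixup(int gen, int pitch)
{
	return pitch < min_render_pitch(gen);
}

/* Pitch to use for the replacement bo; unchanged if already valid. */
static int fixup_pitch(int gen, int pitch)
{
	int min = min_render_pitch(gen);

	assert(pitch > 0);
	return pitch < min ? min : pitch;
}
```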
Make sure that the damage is always set, even if only to NULL, so that
we are safe if in future the operation state is not initially cleared.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
There is no point even attempting a BLT operation if we know that it is
an unusual render operation.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
This is set in configure and redefining it later inside the C files just
leads to trouble and broken compilation.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
This reverts commit 15266e1b95.
KDE relies upon the ability to render into a sampler and then render
upon itself. Not the first sign of madness...
Will have to find another way of winning back the compwinwin
performance.
As exemplified by KDE (using Kate) on gen3, it would attempt to render a
large set of boxes using OVER and a transparent colour. As gen3 copied
across some of the BLT assumptions, it was incorrectly reducing that to
a CLEAR and thus rendering incorrectly.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
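To illustrate why OVER with a transparent colour must not become a CLEAR: with premultiplied alpha, OVER computes src + (1 - alpha) * dst, so a fully transparent source leaves the destination untouched, whereas CLEAR zeroes it. The reduction below is a hypothetical sketch of a safe rule set, not the driver's actual code; the enum and reduce_fill_op() are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

enum fill_op { FILL_CLEAR, FILL_SRC, FILL_OVER };

/*
 * Simplify a solid fill where that is safe. OVER may only become SRC
 * when the colour is opaque; only a SRC of all zeroes may become CLEAR.
 * OVER with a transparent colour is deliberately left as OVER.
 */
static enum fill_op reduce_fill_op(enum fill_op op, uint32_t argb)
{
	assert(op == FILL_CLEAR || op == FILL_SRC || op == FILL_OVER);

	if (op == FILL_OVER && (argb >> 24) == 0xff)
		op = FILL_SRC; /* opaque OVER replaces the destination */
	if (op == FILL_SRC && argb == 0)
		op = FILL_CLEAR;
	return op;
}
```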
It appears the hardware trashes the BLT registers after a 3D context
switch, so we need to reload.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Although this is slower than falling back to swrast for x11perf (up to
4x slower on SNB), it is still faster than doing that rasterisation
through a WC-mapping and much faster in ordinary usage due to avoiding
the readback hit.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
For many of the core drawing routines, passing a BoxRec for the fill is
more convenient since they already have one generated by the clip
intersection.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Let's only have one special gen2 value for the source channel pixel
colour and so remove the confusion and misrendering.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
This reduces the amount of dancing required to call into the span
functions as we can pass the arguments in both the integer and floating
point registers.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>