Even if it means incurring a context switch, the BLT unit is
significantly faster so long as we do enough fills. And there is the
catch ;-)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If we are rendering to or from a ShmPixmap, we need to be sure that the
operation is complete prior to sending an XSync response to the client in
order to preserve mixed rendering coherency.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If an operation overflows from one batch into another, we submit the
full batch and begin a new one. That new batch will not be submitted
until it is filled or the next delayed flush update occurs. This can cause
a flicker as a large operation is broken up, such as performing a
CopyArea through a Clipmask. So if we submit a full batch during a flush
interval, immediately flush any partial batch at the next blockhandler.
This stops rude Santa flashing Rudolf in xsnow!
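
A minimal sketch of the policy described above, not the driver's actual
code. The struct and function names (batch, batch_emit, block_handler)
are hypothetical stand-ins: overflowing a batch submits it and sets a
flag, and the next block handler then flushes whatever partial batch
was left behind.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the driver's batch state. */
struct batch {
	int used;          /* commands queued in the current batch */
	int capacity;      /* maximum commands per batch */
	bool flushed_full; /* a full batch was submitted this flush interval */
	int submitted;     /* number of submissions, for illustration only */
};

/* Submit the current batch to the (imaginary) GPU, if non-empty. */
static void batch_submit(struct batch *b)
{
	if (b->used == 0)
		return;
	b->submitted++;
	b->used = 0;
}

/* Queue ncmds commands, spilling into a fresh batch on overflow. */
static void batch_emit(struct batch *b, int ncmds)
{
	if (b->used + ncmds > b->capacity) {
		batch_submit(b);
		b->flushed_full = true; /* remember the operation was split */
	}
	b->used += ncmds;
}

/* Called before the server sleeps: flush the tail of a split operation. */
static void block_handler(struct batch *b)
{
	if (b->flushed_full) {
		batch_submit(b);
		b->flushed_full = false;
	}
}
```

With a capacity of 4, two emits of 3 commands each submit the first
batch on overflow, and the following block_handler() call submits the
remaining 3 rather than leaving them for the delayed flush.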
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Otherwise we may leave one behind...
A regression from the introduction of sna_poly_rectangles:
40af32a0e9 (sna: Execute blits directly for PolyRectangle)
Reported-by: Matti Hamalainen <ccr@tnsp.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=42568
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If the pixmap were to be used multiple times within a batch with
multiple formats, the cache would only return the initial location with
the incorrect format and so cause rendering glitches. For instance, GTK+
uses the same pixmap as an xrgb source and as an argb mask in order to
premultiply and composite in a single pass. Rather than introduce an
overly complicated (handle, format) caching mechanism, KISS and remove
the invalid implementation.
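
To illustrate the failure mode, here is a small sketch under assumed
names (struct binding, lookup_by_handle and the format enum are
hypothetical, not the driver's types). A lookup keyed on the handle
alone hands back the first binding regardless of the format the caller
wanted.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical formats and cache entry, for illustration only. */
enum fmt { FMT_XRGB, FMT_ARGB };

struct binding {
	uint32_t handle; /* buffer object handle */
	enum fmt format; /* format the surface state was emitted with */
	int offset;      /* location of the surface state in the batch */
};

/* Flawed lookup: matches on the handle and ignores the format. */
static const struct binding *
lookup_by_handle(const struct binding *cache, size_t n, uint32_t handle)
{
	for (size_t i = 0; i < n; i++)
		if (cache[i].handle == handle)
			return &cache[i];
	return NULL;
}
```

If handle 7 was first bound as FMT_XRGB, a later request for the same
handle as an FMT_ARGB mask still receives the FMT_XRGB binding.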
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=40926
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Applications may use the same pixmap with multiple formats within the
same operation. For instance, you can premultiply and composite a normal
pixmap in this manner. However, as we reused the sampler binding
locations of the source (without an alpha channel) for the mask, we
failed to read and multiply by the alpha channel causing it to remain
black instead of transparent.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=40926
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Damage bypasses the Text interface, preventing the backend from hooking
into the font and storing private glyph representations, and calls
directly into the Glyph routines. So to prevent a segfault we have to
restore them.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
We know that the length is nicely aligned and so can avoid a relatively
expensive call into memcpy.
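
As a rough sketch of the idea, assuming the length is a multiple of
four bytes and that both buffers are dword-aligned and do not overlap
(copy_dwords is a hypothetical name, not the routine touched by this
change), the copy reduces to a simple word loop:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Copy len_bytes from src to dst one 32-bit word at a time.
 * Assumes len_bytes is a multiple of 4 and the buffers do not overlap.
 */
static void copy_dwords(uint32_t *dst, const uint32_t *src, size_t len_bytes)
{
	size_t n = len_bytes / sizeof(uint32_t);

	while (n--)
		*dst++ = *src++;
}
```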
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The use of a gpu-only scratch bo is uncommon with the core acceleration
routines, and we can eliminate the check for not incrementing the damage
by allocating a damage-all and using the common optimisation of
reduce_damage().
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
A secondary effect is that this prevents needless migration of the
tiling pixmap which we want to optimistically keep on the GPU.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Now that we have the rudiments of accelerated deep-plane copies, we can
begin to benefit from using a BO for the core dix/mi routines like
ShmPutImage.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
When comparing drawable clip extents against pixmap boundaries we need
to include the pixmap screen offset on a Composited desktop.
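
A minimal sketch of such a bounds check, with hypothetical names
(struct box, extents_inside_pixmap). Here dx and dy are assumed to be
the translation from drawable coordinates into pixmap coordinates,
which is non-zero when a window is redirected to its own pixmap.

```c
#include <stdbool.h>

/* Hypothetical box type; x2/y2 are exclusive. */
struct box { int x1, y1, x2, y2; };

/*
 * Return true if the extents, once translated by (dx, dy) into pixmap
 * space, lie entirely within a pixmap of the given size.
 */
static bool extents_inside_pixmap(const struct box *extents,
				  int dx, int dy, int width, int height)
{
	return extents->x1 + dx >= 0 && extents->y1 + dy >= 0 &&
	       extents->x2 + dx <= width && extents->y2 + dy <= height;
}
```

For a 60x60 pixmap backing a window at screen position (100, 100),
extents of (100, 100)-(150, 150) pass with dx = dy = -100 but would be
wrongly rejected if the offset were ignored.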
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
These still get used (see Wine and Swing) by applications which like to
do "crisp" 1-bit rendering on the client side and then put onto the
scanout. So avoid the readbacks, and push them through the BLT instead. It
turns out to be faster than using fb too, bonus!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
SHADER_CONSTANT is expected here, the other IMMEDIATES however should
have already been handled.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>