The complexity of the function has been moved to move-to-cpu so we can
take further advantage of the simplified logic in put_zpixmap to clean
up the code by removing an unwanted goto.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Since b266ae6f6f protected the static allocations from being reaped in
the normal course of events, we need to penetrate those defenses in
order to finally free the SHM mappings.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If we do not need to write back the GPU damage to the bo because we
intend to overwrite it in the next operation, we can forgo allocating
the active CPU bo and skip the synchronisation overhead.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
We can replace the custom heuristics for PutImage by applying them to
the common path, where hopefully they are equally valid.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
sna_accel.c: In function 'sna_put_image':
sna_accel.c:3730:18: warning: 'src_bo' may be used uninitialized in this
function [-Wmaybe-uninitialized]
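For illustration, a minimal sketch of the kind of pattern that triggers
-Wmaybe-uninitialized and one common way to silence it, initialising the
pointer at its declaration. The struct and function names below are
hypothetical and are not taken from sna_accel.c:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer object; not the driver's type. */
struct bo {
	int handle;
};

/* Returns the handle of the chosen source bo, or -1 if none was chosen. */
static int pick_src_handle(struct bo *candidate, int use_candidate)
{
	/* Initialise at declaration so every path sees a defined value. */
	struct bo *src_bo = NULL;

	if (use_candidate) {
		assert(candidate != NULL);
		src_bo = candidate;
	}

	return src_bo ? src_bo->handle : -1;
}
```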
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
As these do not flush the active state if we have read-read mappings, we
need to be careful with our asserts concerning the busy flag.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
As we may fail the size check with an empty batch and a pair of large
bo, we need to check before submitting that batch in order to not run
afoul of our internal sanity checks.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
It's far too slow due to the register-starved instruction set producing
atrocious code and the extra overhead in the kernel for managing memory
mappings.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Filling the rings is a very unpleasant user experience, so cap the
number of batches we allow to be in flight at any one time.
Interestingly, as also found with SNA, throttling can improve
performance by reducing RSS. However, typically throughput is improved
(at the expense of latency) by oversubscribing work to the GPU and a
10-20% slowdown is commonplace for cairo-traces. Notably, x11perf is
less affected and in particular application level benchmarks show no
change.
Note that this exposes another bug in libdrm-intel 2.4.40 on gen2/3.
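A minimal sketch of such a cap, assuming a simple counter of submitted
but not yet retired batches. The limit of 4 and all names here are
hypothetical, chosen only to show the idea, and do not reflect the
driver's actual code:

```c
#include <assert.h>

#define MAX_INFLIGHT 4 /* assumed cap, for illustration only */

struct throttle {
	unsigned inflight; /* batches submitted but not yet retired */
};

/* Returns 1 if another batch may be submitted now, 0 if the caller
 * should first wait for an earlier batch to retire. */
static int throttle_can_submit(const struct throttle *t)
{
	return t->inflight < MAX_INFLIGHT;
}

/* Account for a newly submitted batch. */
static void throttle_submit(struct throttle *t)
{
	assert(throttle_can_submit(t));
	t->inflight++;
}

/* Account for a batch that the GPU has completed. */
static void throttle_retire(struct throttle *t)
{
	assert(t->inflight > 0);
	t->inflight--;
}
```

A caller would check throttle_can_submit() before each submission and
block on the oldest outstanding batch whenever it returns 0.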
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The normal source upload into GPU bo knows a few more tricks that we may
want to apply first before copying into the shadow of the GPU bo.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If we have to fall back and the configuration is wonky, make sure that
all known outputs are disabled as we take over the console.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
config->compat_output needs to be sanitized during device initialization
or we may dereference an invalid xf86OutputPtr.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Copied from commit c789d06cf8
Author: Dave Airlie <airlied@redhat.com>
Date: Mon Jan 7 13:57:21 2013 +1000
This fixes the damage posting to happen in the correct ordering,
not sure if this fixes anything, but it should make things more consistent.
Signed-off-by: Dave Airlie <airlied@redhat.com>