5822 Commits

Author SHA1 Message Date
Chris Wilson 441ef916ae intel: Throttle harder
Filling the rings is a very unpleasant user experience, so cap the
number of batches we allow to be inflight at any one time.

Interestingly, as also found with SNA, throttling can improve
performance by reducing RSS. However, typically throughput is improved
(at the expense of latency) by oversubscribing work to the GPU and a
10-20% slowdown is commonplace for cairo-traces. Notably, x11perf is
less affected and in particular application level benchmarks show no
change.

Note that this exposes another bug in libdrm-intel 2.4.40 on gen2/3.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-11 12:56:08 +00:00
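The cap described in 441ef916ae can be pictured as a counter of outstanding batches that is checked before each submission. The following is only a minimal sketch of that idea; the struct, the function names and the cap value are hypothetical and are not taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bookkeeping; none of these names come from the driver. */
struct batch_throttle {
	unsigned inflight;     /* batches submitted but not yet retired */
	unsigned max_inflight; /* cap on outstanding batches */
};

/* True if the caller should wait for a retirement before submitting. */
static inline bool throttle_must_wait(const struct batch_throttle *t)
{
	return t->inflight >= t->max_inflight;
}

/* Record a submission; the caller must have checked throttle_must_wait(). */
static inline void throttle_submit(struct batch_throttle *t)
{
	assert(t->inflight < t->max_inflight);
	t->inflight++;
}

/* Record that the GPU has finished one batch. */
static inline void throttle_retire(struct batch_throttle *t)
{
	assert(t->inflight > 0);
	t->inflight--;
}
```

A lower cap trades throughput for latency, which matches the slowdown on cairo-traces that the message reports.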
Chris Wilson a37d56f338 sna: Use some surplus bits to back our temporary pixman_image_t
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-10 16:28:24 +00:00
Chris Wilson 09ea1f4402 sna: Prefer to use the GPU for copies from SHM onto tiled destinations
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-10 16:26:24 +00:00
Chris Wilson c63147a3c3 sna: Allow CPU bo to copy to GPU bo if the device is idle.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-10 15:19:36 +00:00
Chris Wilson 2933e75958 sna: Ignore the last pixmap cpu setting if overwriting all damage
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-10 15:19:36 +00:00
Chris Wilson 934ea64f7f sna: With a GPU bo and a shm source, do not fall all the way back
The normal source upload into GPU bo knows a few more tricks that we may
want to apply first before copying into the shadow of the GPU bo.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-10 15:19:36 +00:00
Chris Wilson 8a8edfe407 sna: Make sure all outputs are disabled if no CompatOutput is defined
If we have to fall back and the configuration is wonky, make sure that
all known outputs are disabled as we take over the console.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-10 03:34:04 +00:00
Chris Wilson 5449e16c0c sna: Open-code xf86CompatOutput() to avoid invalid pointers
config->compat_output needs to be sanitized during device initialization
or we may dereference an invalid xf86OutputPtr.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-10 02:52:57 +00:00
Mickaël THOMAS 8881a14200 Set initial value for backlight_active_level
If the "Backlight" option is set, backlight_active_level is not set, which
results in a default value of 0, causing a black screen upon starting Xorg.
2013-01-07 20:26:03 +00:00
Chris Wilson b8c9598294 sna: fixup damage posting to be done correctly around slave pixmap
Copied from commit c789d06cf8
Author: Dave Airlie <airlied@redhat.com>
Date:   Mon Jan 7 13:57:21 2013 +1000

This fixes the damage posting to happen in the correct order;
not sure if this fixes anything, but it should make things more consistent.
2013-01-07 09:37:51 +00:00
Dave Airlie c789d06cf8 intel: fixup damage posting to be done correctly around slave pixmap
This fixes the damage posting to happen in the correct order;
not sure if this fixes anything, but it should make things more consistent.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-01-07 13:57:21 +10:00
Dave Airlie 5891c89ff2 intel: drop pointless error printf in the slave pixmap sync code.
This is left over and spams the logs; get rid of it.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-01-07 13:54:47 +10:00
Chris Wilson 27550e8148 sna/dri: Transfer the DRI2 reference to the new TearFree pixmap
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=58814
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-06 17:30:34 +00:00
Chris Wilson 1a5e4fb725 sna: Only disable upon a failed pageflip after at least one pipe flips
If we have yet to update a pipe for a pageflip, then the state remains
consistent and we can fall back to a blit without disabling any pipes. If
we fail after flipping a pipe, then unless we disable an output the
state becomes inconsistent (the pipes disagree on what the attached fb
is).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-06 17:08:56 +00:00
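The rule in 1a5e4fb725 amounts to counting how many pipes have already flipped when a failure occurs. A minimal sketch of that decision follows; the enum, the callback type and the helper names are hypothetical and only illustrate the logic stated in the message.

```c
#include <assert.h>
#include <stdbool.h>

enum flip_result {
	FLIP_OK,             /* every pipe flipped */
	FLIP_FALLBACK_BLIT,  /* nothing flipped yet: state is still consistent */
	FLIP_DISABLE_OUTPUT  /* partial flip: pipes disagree on the attached fb */
};

/* Hypothetical callback: returns true if the given pipe flipped. */
typedef bool (*flip_fn)(int pipe, void *data);

static inline enum flip_result
flip_all_pipes(int num_pipes, flip_fn flip, void *data)
{
	int flipped = 0;

	assert(flip);
	for (int pipe = 0; pipe < num_pipes; pipe++) {
		if (!flip(pipe, data))
			return flipped == 0 ? FLIP_FALLBACK_BLIT
					    : FLIP_DISABLE_OUTPUT;
		flipped++;
	}
	return FLIP_OK;
}

/* Example callback for experiments: fails only on the pipe in *data. */
static inline bool example_fail_at(int pipe, void *data)
{
	return pipe != *(int *)data;
}
```

With two pipes, a failure on pipe 0 leaves nothing flipped and so selects the blit fallback, while a failure on pipe 1 happens after pipe 0 has flipped and so requires disabling an output.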
Chris Wilson dd66ba8e56 sna: Try to create userptr with the unsync'ed flag set first
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-06 16:58:34 +00:00
Chris Wilson 9051f43fa3 sna/gen4+: Handle solids passed to the general texcoord emitter
The general texcoord emitter does handle solids (for the case of a
transformed mask) and so we need to be careful to set up the
VERTEX_ELEMENTS accordingly.

Fixes regression from
commit 2559cfcc4c
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Jan 2 10:22:14 2013 +0000

    sna/gen4+: Specialise linear vertex emission

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-06 15:20:14 +00:00
Chris Wilson 4af910e8be sna/gen4+: Trim the redundant float from the fill vertices
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-06 13:47:12 +00:00
Chris Wilson 3244e4b233 Revert "sna/gen4+: Backport tight vertex packing for simple renderblits"
This reverts commit 8ff76fad1f and
commit 48e4dc4bd4.

I forgot gen4 and gen5 do not have the 'non-normalized' bit in their
sampler states.
2013-01-06 13:30:37 +00:00
Chris Wilson d3be77f879 sna/trapezoids: filter out cancelling edges upon insertion to edge-list
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-05 18:08:40 +00:00
Chris Wilson 2b4a2f52c4 sna/trapezoids: filter out zero-length runs
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-05 17:21:34 +00:00
Chris Wilson 59a7b8b32c sna: Clear up the caches after handling a request allocation failure
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-04 18:23:36 +00:00
Chris Wilson 3c31a9fc21 sna: Embed the pre-allocation of the static request into the device
So that in the case where we are driving multiple independent screens
each having their own device, we do not share the global reserved
request in the event of an allocation failure.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-04 18:11:12 +00:00
Chris Wilson b5b3cfb0ad sna: Flush the batch prior to referencing work from another ring
In the case where the kernel is inserting semaphores to serialise work
between rings, we want to only delay the surface that is coming from the
other ring and not interfere with work already queued.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-04 00:02:52 +00:00
Chris Wilson ea2da97773 sna: Convert allocation request from bytes to num_pages when shrinking
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-03 18:04:55 +00:00
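The unit conversion named in ea2da97773 is a round-up division. The sketch below assumes a 4096-byte page purely for illustration; the macro and function names are hypothetical and the log does not show the driver's own code.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size, for illustration only */

/* Round a byte count up to a whole number of pages. */
static inline size_t bytes_to_num_pages(size_t bytes)
{
	size_t pages = bytes / SKETCH_PAGE_SIZE +
		       (bytes % SKETCH_PAGE_SIZE != 0);

	/* Zero pages only for a zero-byte request. */
	assert((pages == 0) == (bytes == 0));
	return pages;
}
```

Passing a byte count where a page count is expected would overstate the request by a factor of the page size, which is the kind of mismatch the subject line describes.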
Chris Wilson 2bd6e4dcd4 sna: Add a pair of asserts to validate fls()/cache_bucket()
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-03 18:04:55 +00:00
Chris Wilson f9d2730974 sna: Also recognise __i386__ for fls asm
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-03 18:04:52 +00:00
Chris Wilson 69dde74a00 sna: Fix off-by-one in C version of fls
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-03 16:36:03 +00:00
Matt Turner fc702cdf53 sna: Rewrite __fls without dependence upon x86 assembly
The asm() prevents SNA from compiling on ia64.

Fixes https://bugs.gentoo.org/show_bug.cgi?id=448570
2013-01-02 16:23:13 +00:00
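A portable loop is one way to find the last set bit without inline assembly, as fc702cdf53 sets out to do. The sketch below is not the code from that commit. It assumes the helper should return the zero-based index of the most significant set bit, which is what the x86 bsr instruction produces, and it adds a one-based variant to show where an off-by-one of the kind fixed in 69dde74a00 can creep in. Both function names are hypothetical.

```c
#include <assert.h>

/* Zero-based index of the most significant set bit; x must be non-zero. */
static inline int fls_index(unsigned long x)
{
	int n = 0;

	assert(x != 0);
	while (x > 1) {
		x >>= 1;
		n++;
	}
	return n;
}

/* One-based variant in the style of BSD fls(): returns 0 for x == 0. */
static inline int fls_one_based(unsigned long x)
{
	return x ? fls_index(x) + 1 : 0;
}
```

The two conventions differ by exactly one for every non-zero input, so mixing them shifts every result by one position.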
Chris Wilson bc67bdcec8 sna/gen6+: Fine tune placement of DRI copies
Avoid offsetting the overhead of the render copy only to be penalised by
the overhead of the semaphore. So compromise.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-02 16:16:10 +00:00
Chris Wilson 2559cfcc4c sna/gen4+: Specialise linear vertex emission
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-02 11:52:58 +00:00
Chris Wilson 0996ed85fd sna/gen2+: Precompute the affine transformation scale factors
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-02 11:52:58 +00:00
Chris Wilson d36cae801f sna/gen4+: Tidy special handling of 2s2s vertex elements
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-02 11:52:58 +00:00
Chris Wilson 8582c6f0bb sna/gen6+: Remove vestigial CC viewport state
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-02 11:52:58 +00:00
Chris Wilson 24264af291 sna: Fast path inplace addition of solid trapezoids
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-02 11:52:58 +00:00
Chris Wilson e9a9f9b029 sna: Micro-optimise glyph_valid()
Note that this requires fixing up the glyph->info if the xserver didn't
create a GlyphPicture.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-02 11:52:31 +00:00
Chris Wilson 372c14aae8 sna: Remove some obsolete Options
Throttling and delayed-flush are now redundant.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-01 20:42:41 +00:00
Chris Wilson 65924da91d sna: Tidy compat interfaces
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-01 11:40:15 +00:00
Chris Wilson 0a35d92873 sna/gen2: Always try to use the BLT pipeline first
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-01 11:15:07 +00:00
Chris Wilson c1457fbd8a sna/gen2: Tidy a pair of vertex emitters
Switch to the new inline scaled transforms.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-01 10:49:27 +00:00
Chris Wilson 48a5797c0f sna/gen4: Tweak single-thread SF w/a for solids
Allow multiple threads for the rare case of compositing with a solid
color.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-12-31 17:51:00 +00:00
Chris Wilson e4f6ba6b47 sna/gen6+: Hint that we prefer to use the BLT with uncached scanouts
Once again balancing the trade-off of faster smaller copies with the BLT
versus the faster larger copies with the RENDER ring.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-12-31 15:55:28 +00:00
Chris Wilson 6e87e7ddfe sna/dri: Use the default choice of backend for copying the region
Notably, if everything is idle, using the BLT is a win as we can emit
them so much faster than a rendercopy, and as the target is uncached we
do not benefit as much from the rendercache.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-12-31 15:55:28 +00:00
Chris Wilson a7988bf77f sna/dri: Fix triple buffering to not penalise missed frames
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-12-31 15:55:28 +00:00
Chris Wilson 736b89504a uxa: Align surface allocations to even tile rows
Align surface sizes to an even number of tile rows to cater for sampler
prefetch. If we read beyond the last page we may catch the PTE in a
state of flux and trigger a GPU hang. Also detected by enabling invalid
PTE access checking.

References: https://bugs.freedesktop.org/show_bug.cgi?id=56916
References: https://bugs.freedesktop.org/show_bug.cgi?id=55984
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-12-30 10:36:05 +00:00
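The alignment described in 736b89504a can be expressed as rounding the surface height up to a multiple of two tile heights. This is a minimal sketch under that reading; the function name is hypothetical and the tile height is left as a parameter because it depends on the tiling mode and hardware generation.

```c
#include <assert.h>

/*
 * Round a surface height (in pixel rows) up to an even number of tile rows.
 * tile_height is the number of pixel rows covered by one tile.
 */
static inline unsigned
align_to_even_tile_rows(unsigned height, unsigned tile_height)
{
	unsigned step = 2 * tile_height;

	assert(tile_height > 0);
	return (height + step - 1) / step * step;
}
```

With an assumed tile height of 8 rows, a 17-row surface would be padded to 32 rows, so a sampler prefetch that runs one tile row past the last used row still lands inside the allocation.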
Chris Wilson 43336c632b sna: Seed the solid color cache with an invalid value to prevent false hits
After flushing, we *do* need to make sure we cannot hit a false lookup
via the last cache.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-12-29 16:59:00 +00:00
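The hazard in 43336c632b is a "last lookup" shortcut that compares against a stale entry after the cache has been flushed. The sketch below shows one way to rule that out with a sentinel. How the driver chooses its invalid value is not stated in the log, so widening the key to 64 bits is a choice made for this illustration, and all names here are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CACHE_SLOTS 4
#define INVALID_KEY UINT64_MAX /* cannot equal any 32-bit color */

struct color_cache {
	uint64_t key[CACHE_SLOTS]; /* 32-bit colors, widened so a sentinel fits */
	int last;                  /* slot of the most recent insert */
};

/* Reset every slot to the sentinel so that no later lookup can match. */
static inline void cache_flush(struct color_cache *c)
{
	for (int i = 0; i < CACHE_SLOTS; i++)
		c->key[i] = INVALID_KEY;
	c->last = 0;
}

/* Fast path: does the most recently used slot hold this color? */
static inline bool cache_last_hit(const struct color_cache *c, uint32_t color)
{
	return c->key[c->last] == color;
}

/* Store a color in the given slot and remember it as the last one used. */
static inline void cache_insert(struct color_cache *c, int slot, uint32_t color)
{
	assert(slot >= 0 && slot < CACHE_SLOTS);
	c->key[slot] = color;
	c->last = slot;
}
```

Without the reseeding in cache_flush(), the fast path could report a hit for a color whose backing storage had already been released.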
Chris Wilson f605038209 sna/dri: Gracefully handle failures from pageflip
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-12-29 16:41:03 +00:00
Chris Wilson 1c2ece3691 sna/gen4+: Try using the BLT before doing a tiled copy
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-12-29 16:06:19 +00:00
Chris Wilson 09ca8feb34 sna: Move the primary color cache into the alpha cache
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-12-29 16:06:19 +00:00
Chris Wilson 8c56c9b1da sna: Allow a flush to occur before batching a flush-bo
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-12-29 15:46:00 +00:00
Chris Wilson 2f53fb389c sna: DBG compile fixes
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-12-28 22:58:02 +00:00