9314 Commits

Author SHA1 Message Date
Chris Wilson bca4e0e35e sna: Limit generic convolution to smallish kernels
Since the naive implementation uses an 8bit temporary, we can only
support so many passes before the quantization artefacts become
apparent. We have to be extra conservative in order to support
multi-pass convolution algorithms (notably 2-pass separable Gaussian
kernels).

References: https://bugs.freedesktop.org/show_bug.cgi?id=95091
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-23 22:56:08 +01:00
Chris Wilson cac8e1ee74 uxa: Enable Y-tiling BLT support
Mesa wants to pass Y-tiled framebuffers onto scanout. Admittedly, this
isn't quite that but it does prevent them being jumbled up.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-22 22:48:54 +01:00
Chris Wilson 46caee86db sna: Fix reporting of errno after setcrtc failure
As we now do more syscalls after the setcrtc, we cannot rely on errno
storing the pertinent error code. Instead we have to save it immediately
after the drmIoctl() and propagate that back.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-20 21:41:11 +01:00
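The errno handling described in 46caee86db can be illustrated with a minimal sketch. The functions fake_setcrtc() and fake_cleanup() are hypothetical stand-ins, not the driver's code: the first fails the way a drmIoctl() might, and the second represents any later call that overwrites errno.

```c
#include <errno.h>

/* Hypothetical stand-in for the failing setcrtc ioctl. */
static int fake_setcrtc(void)
{
	errno = EINVAL;
	return -1;
}

/* Hypothetical stand-in for a later syscall that clobbers errno. */
static void fake_cleanup(void)
{
	errno = ENOENT;
}

/* Returns 0 on success, or the negated errno of the setcrtc failure,
 * no matter what later calls do to errno. */
static int apply_mode(void)
{
	int err = 0;

	if (fake_setcrtc() < 0)
		err = errno; /* save immediately, before any other call */

	fake_cleanup(); /* errno is now unrelated to the setcrtc failure */

	return -err;
}
```

Here apply_mode() reports the EINVAL from the first call even though errno itself holds ENOENT by the time the function returns.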
Chris Wilson c62177ec32 sna: Force the shadow buffer even after we fail to set the crtc for TearFree
As the first choice of orientation and tiling may be invalid, e.g.
left/right rotation on Skylake, we need to force the second pass here to
try an alternate non-native rotation.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-20 21:31:46 +01:00
Chris Wilson 562ae1f29f sna/present: Postpone recursed vblank during TearFree by 1ms
Avoid postponing until the next vblank to avoid continually recursing
every TearFree update, and to minimise the presentation delay.

References: https://bugs.freedesktop.org/show_bug.cgi?id=94982
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-18 20:56:05 +01:00
Chris Wilson 680ae24ea9 sna: Block SIGIO when we are trying to flip
Temporarily stopping the pointer whilst we try to queue the flip should
help keep the output latency down.

Reported-by: Rafael Ristovski <rafael.ristovski@gmail.com>
References: https://bugs.freedesktop.org/show_bug.cgi?id=94980
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-17 18:58:09 +01:00
Chris Wilson 81029be073 sna/gen8+: Flush pipecontrols when forcing a pipeline stall
In order to actually stall the pipeline completely and to wait for
earlier flushes to complete, we have to set a flag in the pipecontrol.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-17 16:02:30 +01:00
Chris Wilson f7f5ef714f sna: Tweak flushing before adding a new bo into a batch
Try to reduce the frequency we flush between operations, to only
consider known-idle bo.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-17 15:06:57 +01:00
Chris Wilson 29b70ccdf6 sna/present: Fix requeuing after interrupting TearFree
Increment the target_msc by one, not the last known msc!

Reported-by: Rafael Ristovski <rafael.ristovski@gmail.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-17 15:06:57 +01:00
Chris Wilson f2a46458a2 sna: Fix alignment vs length check when adjusting dst pointer
When doing the misaligned copy from the start of the dst pointer, the
important check is whether there are enough bytes remaining to the next
alignment position, next from the last.

References: https://bugs.freedesktop.org/show_bug.cgi?id=94928
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-16 18:17:31 +01:00
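One way to read the check in f2a46458a2 is as a comparison of the copy length against the distance from dst to the next alignment boundary. The sketch below is an illustration under that assumption, not the code from the commit; the 16-byte COPY_ALIGN and the name misaligned_head() are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define COPY_ALIGN 16 /* assumed store alignment, not taken from the commit */

/* Number of bytes to copy with unaligned stores before dst reaches the
 * next COPY_ALIGN boundary. If fewer than that many bytes remain, the
 * whole remaining length is the misaligned head. */
static size_t misaligned_head(uintptr_t dst, size_t len)
{
	/* Distance from dst up to the next boundary (0 if already aligned). */
	size_t to_boundary = (size_t)(-dst & (COPY_ALIGN - 1));

	return to_boundary < len ? to_boundary : len;
}
```

For example, a destination one byte past a boundary needs a 15-byte head when at least 15 bytes remain, but only a 4-byte head when the copy is 4 bytes long.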
Chris Wilson c3dc831057 sna: Mark the transformed cursor image as dirty
So that when we switch from transformed cursors to non-transformed, we
remember to clear the entire area.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-15 19:35:00 +01:00
Chris Wilson d30d276aee sna/blt: Reuse computed partial tile offset in copy_from_tiled
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-15 19:29:47 +01:00
Chris Wilson bb69256b52 sna/gen6: Encourage migration of small BLT operations
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-15 15:16:00 +01:00
Chris Wilson 0d38419cbe sna/gen9: Update mocs selection
Since the choice is now boolean (use PTE or use WB), remove the third
uncached condition.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-15 14:58:57 +01:00
Chris Wilson a7526ea2e0 sna/present: Prevent recursion when handling TearFree waits
When draining the flipqueue for TearFree, we may recurse from the vblank
handlers. Avoid this by delaying the Present vblank until next frame.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-10 16:46:46 +01:00
Chris Wilson b6917eced7 sna: Restrict reduction of ADD white when we have compatible formats
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-09 20:45:29 +01:00
Chris Wilson 6c07f467a3 sna: Apply the precomputed BLT colors for SRC reductions
Eek, after computing what the resultant color should be, we should
endeavour to use it!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-09 20:15:41 +01:00
Chris Wilson d08221edab sna: Replace lost offset when copying from tiled memory
Fixes typo from 28e3bdd758

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-09 19:12:47 +01:00
Chris Wilson de44aaa2dd sna/present: Refuse to queue a vblank on a disabled CRTC
Kick the error back to the upper layer for it to sort out.

Reported-by: Timo Aaltonen <tjaalton@ubuntu.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-08 21:08:08 +01:00
Chris Wilson edcfb3efb8 sna/present: Handle 64bit wraparound in msc comparisons
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-08 20:58:02 +01:00
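Commit edcfb3efb8 has no body, so the following is only a sketch of a common wraparound-safe counter comparison, not necessarily what the commit does. The helper name msc_after() is hypothetical. It subtracts in unsigned arithmetic and interprets the difference as signed, which assumes the usual two's complement conversion and that the two counters are less than 2^63 apart.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if msc a is later than msc b, even when the 64-bit counter has
 * wrapped between them. The unsigned subtraction wraps modulo 2^64;
 * the cast reinterprets the distance as signed (two's complement
 * conversion assumed, as on all common compilers). */
static bool msc_after(uint64_t a, uint64_t b)
{
	return (int64_t)(a - b) > 0;
}
```

With this form, msc_after(0, UINT64_MAX) is true because 0 is one tick after the counter's maximum value, whereas a plain a > b would report the opposite.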
Chris Wilson e5bbf519bd sna/present: Fixup msc when reporting a fake vblank with 0 delay
If we have to fake a vblank because the CRTC is off and we compute the
delay as being 0, then we would report the event immediately - but with
the earlier msc. Instead, we want to report the completion.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-08 20:49:13 +01:00
Chris Wilson e783ffa497 sna/gen9: Bias GT for pipeline selection
Each GT on Skylake is bigger than previous generations. For reusing the
placement logic, we then want to pretend that Skylake has a higher
GT-equivalence.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-08 15:28:34 +01:00
Chris Wilson 15903e7c68 sna: Avoid rep mov (builtin memcpy) for WC writes
Lesson learnt, rep mov is terrible when applied to WC.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-08 11:00:22 +01:00
Chris Wilson ab041b9b91 sna: Specialise alignment paths for storing
Switch between aligned/unaligned stores for the bulk copy inner loops.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-08 10:01:29 +01:00
Chris Wilson e62010374c sna: Unroll the innermost SSE2 loop one more time
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-08 08:24:44 +01:00
Chris Wilson 27ec7e49da sna: Force inlinement of SSE2 builtins
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-08 08:03:11 +01:00
Chris Wilson 65c72d9871 sna: Invert the function wrapping for sse64xN/sse64
We should be consistent in making the code simpler for the compiler and
so not rely on it eliminating the dead-code for a single loop of
sse64xN!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-08 07:56:07 +01:00
Chris Wilson 59d371a9b2 sna: Don't limit CRTC id
Don't bake in the assumption that the CRTCs will always be allocated in
the low byte of the identifiers range. It is only used in a pair of
other functions (Xv plane updates), so not a big deal.

Reported-by: Mark Kettenis <kettenis@openbsd.org>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-07 21:57:00 +01:00
Chris Wilson 28e3bdd758 sna: Fixup SSE2 alignment instructions for copying to client memory
Only the tiled surface alignment is known, the client we must assume is
unaligned.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-07 18:41:58 +01:00
Chris Wilson d4818c74b1 sna/present: Clamp to maximum timer delay
Timers can only be set for a maximum of int32_t milliseconds into the
future. Respect that - if we need more, we'll just requeue!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-07 14:17:01 +01:00
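The clamp-and-requeue idea in d4818c74b1 can be sketched as below. Both helper names are hypothetical, and the loop only counts how many times a timer would have to be armed; it does not model the real timer API.

```c
#include <stdint.h>

/* Clamp a requested delay to the largest value a timer accepts
 * (INT32_MAX milliseconds, per the commit message). */
static int32_t clamp_delay_ms(uint64_t delay_ms)
{
	return delay_ms > INT32_MAX ? INT32_MAX : (int32_t)delay_ms;
}

/* Count how many times the timer must be armed to cover delay_ms when
 * each arm is clamped and the remainder is requeued. */
static unsigned timer_arms_needed(uint64_t delay_ms)
{
	unsigned arms = 0;

	do {
		delay_ms -= (uint64_t)clamp_delay_ms(delay_ms);
		arms++;
	} while (delay_ms > 0);

	return arms;
}
```

A delay that fits in int32_t is armed once; anything longer is split into a maximum-length wait followed by a requeue for the remainder.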
Chris Wilson 74b755fe0a sna/present: Clear flags on the vblank event's CRTC early
We store a flag on the vblank's CRTC to indicate whether we have marked
the target CRTC as having an immediately pending vblank. We should clear
this set of flags early so that we don't have to worry about the flag
whilst processing the vblank, and so that we don't get confused if we
have to requeue the vblank.

Reported-by: Christoph Haag <haagch@frickel.club>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94829#c32
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-06 22:55:39 +01:00
Chris Wilson 2077272b12 sna/blt: Don't skip the final src/dst_stride adjustment
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-06 20:48:43 +01:00
Chris Wilson 8d3c8a6c0d sna: Restrict sse2 routines to __x86_64__
After fixing the 32bit build (sigh), testing out the manual unwinding of
the sse2 memcpy doesn't look worthwhile (at least on pnv). So leave it
off for 32bit builds.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-06 19:43:41 +01:00
Chris Wilson 4d220adcad sna: Mark sse2 routines as "fast"
Trying to unify all the target attributes to chase down:

blt.c: In function ‘memcpy_from_tiled_x__swizzle_0__sse2’:
blt.c:345:1: error: inlining failed in call to always_inline
‘memcpy_sse64xN’: target specific option mismatch
 memcpy_sse64xN(uint8_t *dst, const uint8_t *src, int bytes)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-06 18:44:07 +01:00
Chris Wilson b3a2d6c84e sna: Manually expand sse2 memcpy to compensate for a bad compiler
Eek, this doubles the memcpy performance on skl with gcc-4.8. Still, not
ideal.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-06 18:13:06 +01:00
Chris Wilson 4e172a38e1 sna/gen9: Quick and dirty implementation
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-06 17:24:01 +01:00
Chris Wilson ff0ab2c2ea sna/present: Only compensate the timer delay on the final frame
For delays over a frame, we aim to fire a frame early and so
compensating again for less than a whole frame is irrelevant.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-06 10:03:36 +01:00
Chris Wilson a76560107f sna/present: Only use the HW vblank for the last frame
Rather than queuing vblanks for future events up to 4000 years and
setting the hardware to stay awake for all that period (reporting a tick
every refresh), use the timer for the first part. (The timer should
allow a precise-(ish) single wakeup of the hardware.) We set the timer a
frame ahead, so that we can then queue an actual hw event for the final
vblank for precision.

References: https://bugs.freedesktop.org/show_bug.cgi?id=94829
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-06 09:37:35 +01:00
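The split between timer and hardware vblank described in a76560107f might look roughly like the sketch below. The types and the plan_wait() helper are hypothetical, frame_ms is an assumed whole-millisecond refresh period, and counter wraparound and multiplication overflow are ignored for brevity.

```c
#include <stdint.h>

enum wait_kind { WAIT_HW_VBLANK, WAIT_TIMER };

struct wait_plan {
	enum wait_kind kind;
	uint64_t timer_ms; /* only meaningful for WAIT_TIMER */
};

/* Decide how to wait for target_msc. If the target is more than one
 * frame away, sleep on a timer until one frame before it; otherwise
 * queue a real hardware vblank event for precision. */
static struct wait_plan plan_wait(uint64_t now_msc, uint64_t target_msc,
				  uint32_t frame_ms)
{
	struct wait_plan plan = { WAIT_HW_VBLANK, 0 };

	if (target_msc <= now_msc || target_msc - now_msc <= 1)
		return plan; /* final frame: use the hardware event */

	plan.kind = WAIT_TIMER;
	plan.timer_ms = (target_msc - now_msc - 1) * frame_ms;
	return plan;
}
```

When the timer fires, the caller would run plan_wait() again; by then the target is one frame away, so the hardware vblank is queued only for that last frame.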
Chris Wilson 9ce7d47a86 sna/present: Skip the unflip if a no-op
If the screen is already scanning out from the desired framebuffer
(because we failed when flipping and already restored the mode), skip
the unflip.

References: https://bugs.freedesktop.org/show_bug.cgi?id=94829
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-06 08:55:14 +01:00
Chris Wilson afddc9fe7f sna/present: Arm the fake vblank timer to wake up one frame early
If we wake up early from the timer, we can use the hw to give us a more
precise wakeup for the vblank (rather than estimating to the approximate
millisecond).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-05 21:35:37 +01:00
Chris Wilson e091ace4d8 sna/present: Remove stale assert that fake vblanks only have one event
Since commit 02f535e8f3
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Thu Mar 24 09:39:06 2016 +0000

    sna/present: Requeue early vblank completions

we may hit the fake vblank timer path with an old hw struct that may have
multiple associated events. The assert that we only called the fake
vblank from sna_present_queue_vblank is no longer correct.

Reported-by: Christoph Haag <haagch@frickel.club>
References: https://bugs.freedesktop.org/show_bug.cgi?id=94829#c15
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-05 20:04:25 +01:00
Chris Wilson 4b4324cd05 sna/present: Update the vblank timestamp after a blocking wait
After doing a blocking wait (rather than event driven) for the imminent
arrival of the next frame (when it is expected less than 1ms in the
future), update the timestamp and frame counter that we then report back
to present.

Reported-by: Christoph Haag <haagch@frickel.club>
References: https://bugs.freedesktop.org/show_bug.cgi?id=94829#c15
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-05 20:01:07 +01:00
Chris Wilson f4ce7fee6c sna/present: And drop bogus queued assertion
Don't forget that along the vblank path, we may use a timer instead of
the hw event, in which case we don't actually count the info as queued
(to hardware).

Reported-by: Christoph Haag <haagch@frickel.club>
References: https://bugs.freedesktop.org/show_bug.cgi?id=94829#c13
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-05 19:12:29 +01:00
Chris Wilson 84c545b1b7 sna/present: Markup hw vblanks queued after a fake vblank
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-05 18:27:51 +01:00
Chris Wilson 65dc4176d8 sna/present: Prevent reporting an incomplete event
If we cancel a flip, we may try to restore the current mode and this may
flush the partial flip (in a multi-monitor setup). We report the completed
event back to present and free the event info. Then we report the error
back to present, and free the event info a second time. Chaos and
corruption ensues.

Reported-and-tested-by: Christoph Haag <haagch@frickel.club>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94829
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-05 18:09:21 +01:00
Chris Wilson bb5194eebd sna: Add alignment hints to tiled memcpy
Telling the compiler the known alignment should improve the memcpy
operation, but only has a small impact today (a few bytes/instructions
per function).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-05 07:02:54 +01:00
Chris Wilson 90792c933d sna: Only print "Failed to prepare CRTC ... disabling TearFree" once
The actual disablement doesn't take place until the next modeset, and
until then we are likely to keep spamming the error message.

References: https://bugs.freedesktop.org/show_bug.cgi?id=94806
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-03 20:18:03 +01:00
Chris Wilson 2c4890001d sna/video: Use the GPU to prescale overlay sprites
Since Haswell, we lost the ability to use hardware scalers on the
overlay planes. Allow Xv clients to pass in unscaled data and use the 3D
pipe to prescale the images before display.

(I doubt I have the rotations corrected!...)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-03-31 20:41:26 +01:00
Chris Wilson 3fafabe562 sna/dri3: Ensure foreign bo are marked as unclean on creation
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-03-31 12:31:41 +01:00
Chris Wilson ebe86fdaa9 test/dri2-race: Don't leak the Display after detecting the race
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-03-30 14:44:55 +01:00