Forcing the batch emission on virtually every glyph eats a lot of CPU
time sending very short commands to the GPU, and is totally unnecessary.
Reported-by: Arkadiusz Miskiewicz <a.miskiewicz@gmail.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
In commit dcf9b5ae18
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Tue Sep 17 22:27:45 2013 +0100
intel: Compile fixes for base install of SLED11.sp3
the includes were juggled around to avoid pulling in xorg-server.h
outside of the driver. However, missing xorg-server.h leads to subtle
bugs in the layout of structures, in this case breaking xf86Options.
Reported-by: FBrown <francisbrwn9@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69555
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Move the wrapping out of the main code body and hide it with the others
in our compatibility header.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
SLED11 also requires us to poke around in the privates as it does not
provide the more recent privates API.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Too often our implementation of vsync or pageflip is buggy, or for some
other reason it is desired by the user to disable those code paths. Make
it possible once again by restoring the Options for the user to control.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
We always are going to write to the image, despite the flag set in
commit fa961ec99a
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Sun Jul 21 18:00:22 2013 +0100
sna: Allow linear inplace uploads along the tiled X PutImage blt paths
which was accidentally conditional on the image not being too large.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
We need to be careful not to copy too much data during the vertex flush
or else that becomes the rate-limiting step. The goal here is to do the
early flush to warm up the GPU, then transition to larger batches.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Given how fragile the render operations are, taking the hit from
transitioning from the slow render operations to the comparatively fast
BLT (when possible) is always worth it.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
This problematic GPU still seems to like to fall over when faced with
Y-tiling. It was reserved only for use with glyphs, but even that
occasionally runs into trouble, so disable all selection of Y-tiling for
our own use.
Bugzilla: https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/1222203
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Oh my, bspec is missing a few details on how to perform a scanline wait
on Haswell. But by using the extra steps required for Ivybridge, we can
successfully send events from the scanout to the BCS ring. Sadly this
again means that to use vsync on Haswell requires preventing the GPU
from sleeping whilst it waits for the scanout to advance.
Reported-by: Dan Doel <dan.doel@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69119
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Hmm, this should have meant that we never actually waited for a
scan-line on pipe A. I wonder if it even works...
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Sigh. I added the new functions for the asserts, updated the parameters,
but forgot to change the actual functions themselves.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
When using sna_copy_boxes__inplace(), we need to remember that the
region is in destination space, so we need to offset the boxes when
comparing against the source. The assertion forgot to do so, and so
failed as soon as it met a little complexity.
Reported-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
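To illustrate the sna_copy_boxes__inplace() assertion fix described above, here is a minimal sketch of a bounds check that translates destination-space boxes into source space before comparing. The struct box type, the function names and the (dx, dy) convention are assumptions made for this example and are not taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the X server's BoxRec (x2/y2 are exclusive). */
struct box {
	int16_t x1, y1, x2, y2;
};

/*
 * Return true if every box, given in destination coordinates, lies inside
 * a source of size src_w x src_h once translated by (dx, dy) into source
 * space. Skipping the translation is the mistake the commit describes.
 */
static bool boxes_within_source(const struct box *boxes, int n,
				int dx, int dy, int src_w, int src_h)
{
	for (int i = 0; i < n; i++) {
		int x1 = boxes[i].x1 + dx, y1 = boxes[i].y1 + dy;
		int x2 = boxes[i].x2 + dx, y2 = boxes[i].y2 + dy;

		if (x1 < 0 || y1 < 0 || x2 > src_w || y2 > src_h)
			return false;
	}
	return true;
}

/* Debug-style wrapper, showing how such a check might sit in an assert. */
static void assert_boxes_within_source(const struct box *boxes, int n,
				       int dx, int dy, int src_w, int src_h)
{
	assert(boxes_within_source(boxes, n, dx, dy, src_w, src_h));
	(void)boxes; (void)n; (void)dx; (void)dy; (void)src_w; (void)src_h;
}
```

With a 10x10 source and a box at (100,100)-(110,110) in the destination, the check only passes when the (-100, -100) offset is applied; comparing the untranslated box against the source fails.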
For SNB, the difference between RCS and BCS is more marginal but it is
slightly in favour of using rendercopy on GT2.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
On gt1, the BCS is faster than the RCS for all equivalent operations,
unlike gt2+ where the RCS is faster (but at greater power draw).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The new error message was added in
commit ea30967245 [2.99.902]
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Fri Sep 6 22:54:48 2013 +0000
configure: Disable UXA build without DRI2
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
It appears possible to race the framebuffer resize with a VT switch and
so end up attempting to update the CRTCs whilst not master. The code
complains, but in reality we can just ignore the requested change until
we VT switch back and then apply the updates upon restoration of master.
Reported-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
DRI2 is interwoven into the UXA structs, so simply disable building UXA
if DRI2 is not available.
Reported-by: Ross Burton <ross@burtonini.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69056
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The adapter names are not uniform, so we need to scan the directory and
find the entry that corresponds to the Mains power supply. However, the
acpid does continue to report generic ac_adapter events.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
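The directory scan for the Mains adapter described above could look roughly like the following. This is a sketch only: it assumes the Linux sysfs layout in which each entry under a base directory such as /sys/class/power_supply has a "type" attribute, and the function name find_mains_adapter() is hypothetical rather than the driver's own.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/*
 * Scan base (for example "/sys/class/power_supply") for the first entry
 * whose "type" attribute reads "Mains". On success, copy the entry name
 * into name and return 0; return -1 if no such entry was found.
 */
static int find_mains_adapter(const char *base, char *name, size_t len)
{
	DIR *dir;
	struct dirent *de;
	int ret = -1;

	assert(base != NULL && name != NULL);

	dir = opendir(base);
	if (dir == NULL)
		return -1;

	while (ret < 0 && (de = readdir(dir)) != NULL) {
		char path[1024], type[32];
		FILE *file;

		/* Skip ".", ".." and hidden entries. */
		if (de->d_name[0] == '.')
			continue;

		snprintf(path, sizeof(path), "%s/%s/type", base, de->d_name);
		file = fopen(path, "r");
		if (file == NULL)
			continue;

		/* The adapter name varies, so match on the type instead. */
		if (fgets(type, sizeof(type), file) != NULL &&
		    strncmp(type, "Mains", 5) == 0 &&
		    strlen(de->d_name) < len) {
			strcpy(name, de->d_name);
			ret = 0;
		}
		fclose(file);
	}

	closedir(dir);
	return ret;
}
```

Matching on the type rather than on a fixed name such as "AC" or "ADP1" is the point of the scan; the caller can then read that entry's "online" attribute to tell whether mains power is present.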
As the BackLeft is the only one that could be flipped, it is the only one
that may end up as the scanout and so is the only one that should be
allocated from the scanout cache.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The BLT is more power-efficient for the operations it can handle, so use
it when possible (following the usual caveats) if we know we only have
battery power.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
When on-battery, we would prefer to use more power efficient operations.
For example, the BCS is far more economical to move data around with, but
it doesn't have quite the same throughput as the hungry RCS. (Not that
there is any reason why, the BCS is supposed to run at full memory
speed, unfortunately that is main memory speed and not the caches...)
Note that X already listens to acpid for video switch notifications; it
would be useful if we could extend that interface to emit power
notifications as well.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Rather than always switching over to using the GPU bo and immediately
discarding the CPU bo, keep it around as we may want to reuse the cached
data.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
This applies the copy-from-tiled-X GetImage optimisation to the
ShmGetImage paths - when we don't have userptr available.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Fixes regression from
commit d2f19d5a1f [2.99.901]
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Tue Sep 3 19:05:41 2013 +0100
sna: Tidy walking the window tree for updating our pixmaps
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
As we ourselves only track the BBox of damage on the virtual outputs, we
can ask X to amalgamate the damage events as well.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
As we only use these buffers once, we should not benefit from requesting
them to be moved into L3/LLC cache - over and above the default
recommendations we make when creating the buffer. Indeed, this may even
lead to artefacts if we fail to invalidate those other caches when
reusing the buffers.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Some versions of the Xserver lose Damage tracking across the modeset,
causing a loss of damage notifications and repainting to cease on the
virtual outputs. We can work around this by reattaching the damage every
time we receive notification that the local Screen configuration
changes.
Reported-and-tested-by: Severin Strobl <fd@severin-strobl.de>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68987
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The timer will be enabled if a reconfiguration actually takes place and
we mark the damaged region to be redrawn.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>