We are always going to write to the image, despite the flag set in
commit fa961ec99a
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Sun Jul 21 18:00:22 2013 +0100
sna: Allow linear inplace uploads along the tiled X PutImage blt paths
which was accidentally conditional on the image not being too large.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
We need to be careful not to copy too much data during the vertex flush
or else that becomes the rate-limiting step. The goal here is to do the
early flush to warm up the GPU, then transition to larger batches.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Given how fragile the render operations are, taking the hit from
transitioning from the slow render operations to the comparatively fast
BLT (when possible) is always worth it.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
This problematic GPU still seems to like to fall over when faced with
Y-tiling. It was reserved only for use with glyphs, but even that
occasionally runs into trouble, so disable all selection of Y-tiling for
our own use.
Bugzilla: https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/1222203
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Oh my, bspec is missing a few details on how to perform a scanline wait
on Haswell. But by using the extra steps required for Ivybridge, we can
successfully send events from the scanout to the BCS ring. Sadly this
again means that to use vsync on Haswell requires preventing the GPU
from sleeping whilst it waits for the scanout to advance.
Reported-by: Dan Doel <dan.doel@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69119
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Hmm, this should have meant that we never actually waited for a
scan-line on pipe A. I wonder if it even works...
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Sigh. I added the new functions for the asserts, updated the parameters,
but forgot to change the actual functions themselves.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
When using sna_copy_boxes__inplace(), we need to remember that the
region is in destination space, so we need to offset the boxes when
comparing against the source. The assertion forgot to do so, and so
failed as soon as it met a little complexity.
Reported-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
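The offsetting described above can be sketched roughly as follows. This is an illustration only, not the driver's code: struct box is a hypothetical stand-in for the server's box type, and boxes_inside_source() and its parameters are names invented for this sketch. It assumes (dx, dy) is the offset that maps destination coordinates onto the source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the server's box type; x2/y2 are exclusive. */
struct box {
	int16_t x1, y1, x2, y2;
};

/* True if the box, once shifted by (dx, dy), is non-empty and lies
 * entirely inside a surface of w x h pixels. */
static bool box_inside_after_offset(const struct box *b,
				    int dx, int dy, int w, int h)
{
	int x1 = b->x1 + dx, y1 = b->y1 + dy;
	int x2 = b->x2 + dx, y2 = b->y2 + dy;

	return x1 >= 0 && y1 >= 0 && x2 <= w && y2 <= h &&
	       x1 < x2 && y1 < y2;
}

/* The boxes are in destination space, so each one is translated by the
 * destination-to-source offset before it is compared with the source. */
static bool boxes_inside_source(const struct box *boxes, int n,
				int dx, int dy, int src_w, int src_h)
{
	for (int i = 0; i < n; i++)
		if (!box_inside_after_offset(&boxes[i], dx, dy, src_w, src_h))
			return false;
	return true;
}
```

Checking the untranslated boxes against the source extents is the kind of mistake that passes for a copy at the origin and fails once the source and destination positions differ.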
For SNB, the difference between RCS and BCS is more marginal but it is
slightly in favour of using rendercopy on GT2.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
On gt1, the BCS is faster than the RCS for all equivalent operations,
unlike gt2+ where the RCS is faster (but at greater power draw).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
It appears possible to race the framebuffer resize with a VT switch and
so end up attempting to update the CRTCs whilst not master. The code
complains, but in reality we can just ignore the requested change until
we VT switch back and then apply the updates upon restoration of master.
Reported-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The adapter names are not uniform, so we need to scan the directory and
find the entry that corresponds to the Mains power supply. However, the
acpid does continue to report generic ac_adapter events.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
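A minimal sketch of such a directory scan, assuming the Linux sysfs layout in which each entry under /sys/class/power_supply has a "type" attribute (reading "Mains" for the AC adapter) and an "online" attribute. The helper names read_attr() and mains_online() are invented for this sketch and are not taken from the driver; the directory is passed in as a parameter so the logic can be tried against a scratch directory.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Read the first line of <dir>/<name>/<attr> into buf, stripping the
 * trailing newline. Returns 0 on success, -1 on any failure. */
static int read_attr(const char *dir, const char *name, const char *attr,
		     char *buf, size_t len)
{
	char path[1024];
	FILE *file;
	int n, ok;

	n = snprintf(path, sizeof(path), "%s/%s/%s", dir, name, attr);
	if (n < 0 || n >= (int)sizeof(path))
		return -1;

	file = fopen(path, "r");
	if (file == NULL)
		return -1;
	ok = fgets(buf, (int)len, file) != NULL;
	fclose(file);
	if (!ok)
		return -1;

	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}

/* Scan dir for the first entry whose type is "Mains".
 * Returns 1 if it is online, 0 if it is offline, and -1 if no such
 * entry exists or its state could not be read. */
static int mains_online(const char *dir)
{
	DIR *d = opendir(dir);
	struct dirent *de;
	char buf[64];
	int ret = -1;

	if (d == NULL)
		return -1;

	while ((de = readdir(d)) != NULL) {
		if (de->d_name[0] == '.')
			continue;
		/* Adapter names vary (AC, ADP1, ...), so match on type. */
		if (read_attr(dir, de->d_name, "type", buf, sizeof(buf)) ||
		    strcmp(buf, "Mains") != 0)
			continue;
		if (read_attr(dir, de->d_name, "online", buf, sizeof(buf)) == 0)
			ret = buf[0] == '1';
		break;
	}
	closedir(d);
	return ret;
}
```

On a typical system this would be called as mains_online("/sys/class/power_supply"), for instance whenever acpid reports an ac_adapter event.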
As the BackLeft is the only one that could be flipped, it is the only one
that may end up as the scanout and so is the only one that should be
allocated from the scanout cache.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The BLT is more power-efficient for the operations it can handle, so use
it when possible (following the usual caveats) if we know we only have
battery power.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
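Taken together with the gt1/gt2 notes in this log, the ring preference might be sketched as below. This is a simplified illustration under stated assumptions, not the driver's heuristic: struct hw_state, its fields and prefer_ring() are hypothetical names, and the "usual caveats" are reduced to a single blt_capable flag.

```c
#include <stdbool.h>

enum ring { RING_RENDER, RING_BLT };

/* Hypothetical summary of the state the decision depends on. */
struct hw_state {
	int gt;            /* 1 for gt1, 2 or more for gt2+ */
	bool on_battery;   /* true if no mains supply is online */
	bool blt_capable;  /* can this operation be done by the blitter? */
};

static enum ring prefer_ring(const struct hw_state *hw)
{
	/* Operations the blitter cannot express must use the render ring. */
	if (!hw->blt_capable)
		return RING_RENDER;

	/* On battery, favour the more power-efficient BLT. */
	if (hw->on_battery)
		return RING_BLT;

	/* On mains, gt1 is faster on the BCS and gt2+ on the RCS. */
	return hw->gt < 2 ? RING_BLT : RING_RENDER;
}
```

The order of the tests encodes the priority: capability first, then power source, then raw throughput.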
When on-battery, we would prefer to use more power efficient operations.
For example, the BCS is far more economical to move data around with, but
it doesn't have quite the same throughput as the hungry RCS. (Not that
there is any reason why, the BCS is supposed to run at full memory
speed, unfortunately that is main memory speed and not the caches...)
Note that X already listens to acpid for video switch notifications, it
would be useful if we could extend that interface to emit power
notifications as well.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Rather than always switching over to using the GPU bo and immediately
discarding the CPU bo, keep it around as we may want to reuse the cached
data.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
This applies the copy-from-tiled-X GetImage optimisation to the
ShmGetImage paths - when we don't have userptr available.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Fixes regression from
commit d2f19d5a1f [2.99.901]
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Tue Sep 3 19:05:41 2013 +0100
sna: Tidy walking the window tree for updating our pixmaps
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
As we only use these buffers once, we should not benefit from requesting
them to be moved into L3/LLC cache - over and above the default
recommendations we make when creating the buffer. Indeed, this may even
lead to artefacts if we fail to invalidate those other caches when
reusing the buffers.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
With lots of updates by Christopher James Halse Rogers as he updated the
XMir API - but now supposedly frozen!
"<RAOF> ickle: I think the xmir api should be pretty much stable now,
barring people coming up with more awesome ways of doing things."
Signed-off-by: Christopher James Halse Rogers <raof@ubuntu.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Now that the WriteThrough ABI is upstream, we can rely on runtime
detection of the current interface.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Like its sibling sna_pixmap_move_to_gpu(), it helps to know the private
sna_pixmap after the operation rather than just a boolean success/fail
result, and make it more robust in the process.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The latest proposal for passing swap_interval==0 is through the normal
ScheduleSwap() call, so we can remove the specialised function.
Link: http://lists.x.org/archives/xorg-devel/2013-September/037661.html
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
As we may call the ->detect() routines during the fallback initial
probing, we need to handle the case where the output callbacks are
called before RandR is setup.
Regression from
commit 8ecb758697
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Sat Aug 31 19:44:50 2013 +0100
sna: Expand the array of fake outputs if the last is used
Reported-by: Andreas Reis <andreas.reis@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68843
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Sometimes the window size is not a simple box, but a full region. In
which case we do need to process it and not just assert that it is a
box!
Reported-by: Jiri Slaby <jirislaby@gmail.com>
References: https://bugs.freedesktop.org/show_bug.cgi?id=47597
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Always maintain one spare so that we can reconfigure for any number of
desired outputs on the fly.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
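The "one spare" rule can be sketched as a small growable array. This is illustrative only: struct fake_outputs, ensure_spare() and mark_used() are hypothetical names, and the real driver tracks far more per-output state than a single flag.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical bookkeeping: one flag per fake output. */
struct fake_outputs {
	bool *used;
	int count;
};

/* Make sure the last slot is unused, appending a fresh slot if the
 * array is empty or its last entry has been taken.
 * Returns 0 on success, -1 if the allocation failed. */
static int ensure_spare(struct fake_outputs *o)
{
	bool *used;

	if (o->count > 0 && !o->used[o->count - 1])
		return 0;

	used = realloc(o->used, (size_t)(o->count + 1) * sizeof(*used));
	if (used == NULL)
		return -1;

	used[o->count] = false;
	o->used = used;
	o->count++;
	return 0;
}

/* Claim slot idx, then restore the invariant that a spare exists. */
static int mark_used(struct fake_outputs *o, int idx)
{
	if (idx < 0 || idx >= o->count)
		return -1;

	o->used[idx] = true;
	return ensure_spare(o);
}
```

Starting from an empty array, each time the final slot is claimed the array grows by exactly one, so a request for one more output can always be satisfied without a separate reallocation step.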
Prevent a NULL dereference for the small system pixmaps. Introduced with
commit f22d7f68b8
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Wed Aug 28 14:24:33 2013 +0100
sna/gen6+: Improve ring stickyness for BLT composite ops
Reported-by: Sami Farin <hvtaifwkbgefbaei@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68728
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
We also need to correctly offset the current_msc for the normal
pageflip, so rearrange the code flow so that we only do the calculation
of target_msc once.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Previously, we instantiated a fake output in case we had a machine with
no output. (For certain server-class products.) The Bumblebee project were
also doing something very similar in order to fake an extended desktop
on the Intel igfx and copy portions onto a discrete GPU. (The preferred
method for doing this upstream is through the use of PRIME). As the code
is very similar, we can support both use-cases simultaneously.
This adds the option:
  Section "Device"
    Driver "intel"
    Option "VirtualHeads" "<count>"
  EndSection
to allow the user to specify an additional set of fake outputs, which
can then be controlled using xrandr.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
A cut'n'paste error dropped the clip region copy, resulting in the
port not being set on the window instead.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The final version that was upstreamed differed from the original version
we implemented. The final version allows for both destination/source
colorkeying, but left the ddx out of date.
Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
We can arrive there with a COW and wanting a CPU mapping, which is
unfortunate and requires the indirect path instead.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Being able to read back the fbcon handle as a non-Master process is an
information leak that will be fixed. We should already be Master by
this point by virtue of the sequence in which we obtain the device fd.
However, to be pedagogically correct, call drmSetMaster() before the
fbcon copy.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>