When you don't have many cycles to play with, every one counts.
Here we make sure we cache negative lookups for large glyphs.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
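As a rough illustration of caching a negative lookup, the sketch below remembers that a glyph was too large for the cache, so the expensive check runs only once. Every name here (struct glyph, MAX_CACHED_SIZE, glyph_cache_slow_lookup) is hypothetical and not taken from the driver.

```c
#include <stdbool.h>

#define MAX_CACHED_SIZE 32 /* hypothetical size limit for cacheable glyphs */

enum cache_state { STATE_UNKNOWN = 0, STATE_CACHED, STATE_TOO_LARGE };

/* Hypothetical glyph record carrying its own cached lookup result. */
struct glyph {
    int width, height;
    enum cache_state state;
};

static int slow_lookups; /* counts how often the slow path actually runs */

/* Stand-in for the expensive lookup: only small glyphs fit in the cache. */
static bool glyph_cache_slow_lookup(const struct glyph *g)
{
    slow_lookups++;
    return g->width <= MAX_CACHED_SIZE && g->height <= MAX_CACHED_SIZE;
}

/* Record failures as well as successes so a large glyph is probed only once. */
static bool glyph_is_cached(struct glyph *g)
{
    if (g->state == STATE_UNKNOWN)
        g->state = glyph_cache_slow_lookup(g) ? STATE_CACHED : STATE_TOO_LARGE;
    return g->state == STATE_CACHED;
}
```

Calling glyph_is_cached() repeatedly on an oversized glyph takes the slow path on the first call only.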
Obviously we can only replace the bo if it is not pinned and so just
incur a stall when we could have instead rerouted the rendering through
its CPU bo.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The inplace write distinction is not important on LLC, so pick any
buffer that is on the GPU and available for reuse.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
There is yet another race in drm initialisation where X is starting long
before the drm device is completely ready, and is being told that the
output has a valid mode, but with bogus settings. Ignore it, and hope it
comes to its senses later on.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Start adding the infrastructure to disable direct hardware access if X
is being run under a system compositor (aka "hosted").
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Repairs the build for f16 which has an older version of xorg-macros.
Fortunately, as it doesn't define XORG_TESTSET_CFLAG it also doesn't use
it to generate noisy output.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Framebuffers created from stolen bo were not being released and so the
kernel would keep the fb and bo alive, causing the memory to remain
unusable whilst X lived, and so we leaked all available stolen
memory.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
We can shave a few instructions off the routine by incrementally
performing the "is-empty" check as soon as we compute the intersection
in each dimension.
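The incremental check described above might look like the following minimal sketch. The struct box layout (x2/y2 exclusive) and the function name are assumptions for illustration, not the routine from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical box type: x1/y1 inclusive, x2/y2 exclusive. */
struct box {
    int16_t x1, y1, x2, y2;
};

/*
 * Intersect a and b into out. Bail out as soon as one dimension turns out
 * to be empty, rather than computing both and testing at the end.
 * On a false return the contents of out are only partially written.
 */
static bool box_intersect(struct box *out,
                          const struct box *a, const struct box *b)
{
    assert(out && a && b);

    out->x1 = a->x1 > b->x1 ? a->x1 : b->x1;
    out->x2 = a->x2 < b->x2 ? a->x2 : b->x2;
    if (out->x1 >= out->x2)
        return false; /* empty in x, no need to look at y */

    out->y1 = a->y1 > b->y1 ? a->y1 : b->y1;
    out->y2 = a->y2 < b->y2 ? a->y2 : b->y2;
    return out->y1 < out->y2;
}
```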
Otherwise we'd fail saying DRI1 wasn't possible, when that
is exactly what we asked for.
[ickle: The breakage was introduced with
commit bd6ffd1ad2 [2.21.14]
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Sat Jul 27 15:33:19 2013 +0100
configure: Print a summary of compilation options
which modified the search to only take place if UMS was enabled, but
missed mollifying the resulting error.]
Signed-off-by: Dave Airlie <airlied@redhat.com>
Using __packed__ as shorthand for __attribute__((__packed__)) confuses
clang. (I guess it expands (__packed__), which gcc skips.) As
clang also uses packed in its builtins, we have to find a compromise,
and so tightly_packed wins for being a more verbose description without
the dangerous leading underscores.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
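A minimal sketch of the tightly_packed idea, assuming a GCC-compatible compiler. Only the macro name comes from the message above; the struct is a made-up example.

```c
#include <assert.h>
#include <stdint.h>

/* A macro name without leading underscores, so it cannot collide with the
 * attribute keyword or with identifiers reserved for the implementation. */
#define tightly_packed __attribute__((__packed__))

/* Hypothetical example struct, not one from the driver. */
struct wire_header {
    uint8_t tag;
    uint32_t length;
} tightly_packed;

/* Without packing, length would typically be aligned to 4 bytes (size 8). */
static_assert(sizeof(struct wire_header) == 5,
              "wire_header should contain no padding");
```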
Whilst we reserved exec entry slots for the deferred VBO, there were no
relocation spaces reserved. So if we submitted a render command followed
by a multitude of BLT copies, we could then overrun the relocation array
when adding the deferred vbo to the batch.
Reported-by: Danny <moondrake@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67504
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
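The accounting problem can be pictured with the sketch below, which holds back both an exec slot and a relocation slot for the deferred vbo. The array sizes, the reserve counts and all names are assumptions chosen for illustration, not the driver's values.

```c
#include <stdbool.h>

/* Hypothetical batch limits. */
#define MAX_EXEC  8
#define MAX_RELOC 16

/* Slots held back for the deferred vbo. Reserving the exec entry alone
 * is not enough, as the vbo also needs a relocation when it is added. */
#define RESERVED_EXEC  1
#define RESERVED_RELOC 1

struct batch {
    int nexec;  /* exec entries in use */
    int nreloc; /* relocation entries in use */
};

/* True if a command needing the given number of slots still fits while
 * leaving the reserved slots untouched. */
static bool batch_has_room(const struct batch *b, int num_exec, int num_reloc)
{
    return b->nexec + num_exec <= MAX_EXEC - RESERVED_EXEC &&
           b->nreloc + num_reloc <= MAX_RELOC - RESERVED_RELOC;
}
```

With such a check, a long run of BLT copies would force a batch flush before it could consume the last relocation entry.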
Ideally, the method of swapping is something that the applications have
control over, along with how to synchronise to the vertical refresh.
Whilst triple buffering is good to reduce jitter for games (at the cost of
an extra frame of latency, usually considered a good tradeoff), it
prevents the applications from accurately controlling the presentation
of animations. One vocal critic is Owen Taylor, who demands accurate
swap control for smooth animations in gnome-shell. For example,
http://blog.fishsoup.net/2012/11/28/avoiding-jitter-in-composited-frame-display/
In lieu of application control, just apply a quirk for the compositor.
Everyone else will just have to wait for DRI3.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
As we need optimal copy code for the general case, where unlike
swizzling the run lengths are not known beforehand, we need to call the
arch specific routines from glibc.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
xf86InterpretEDID() doesn't copy the EDID raw data in xf86MonPtr but
just stores the given pointer. The DDX driver needs to make sure that
data stays valid.
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
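The lifetime issue can be sketched as follows. Here interpret_edid() and struct monitor are hypothetical stand-ins that mimic the store-the-pointer behaviour described above; they are not the real Xorg API. The caller keeps a private copy of the blob alive for as long as the parsed monitor is in use.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the parsed monitor: it keeps the caller's pointer. */
struct monitor {
    const uint8_t *raw;
};

/* Stand-in parser: stores the pointer it is given, without copying. */
static struct monitor *interpret_edid(const uint8_t *raw)
{
    struct monitor *mon = malloc(sizeof(*mon));
    if (mon)
        mon->raw = raw;
    return mon;
}

/* Hypothetical per-output state owning both the copy and the monitor. */
struct output {
    uint8_t *edid_copy;
    struct monitor *mon;
};

/* Copy the blob first, so the monitor never points at memory that the
 * caller may free or overwrite. Returns 0 on success, -1 on failure. */
static int output_attach_edid(struct output *o, const uint8_t *blob, size_t len)
{
    uint8_t *copy;
    struct monitor *mon;

    if (len == 0)
        return -1;

    copy = malloc(len);
    if (!copy)
        return -1;
    memcpy(copy, blob, len);

    mon = interpret_edid(copy);
    if (!mon) {
        free(copy);
        return -1;
    }

    /* Release any previous monitor together with the data it pointed at. */
    free(o->mon);
    free(o->edid_copy);
    o->mon = mon;
    o->edid_copy = copy;
    return 0;
}
```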
If we allocate the scanout from stolen, we cannot then access it via the
CPU - so prevent the mapping in those cases.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
On Iris, we may store the framebuffer in the eLLC/LLC and mark it as
being Write-Through cached. This means that we can treat it as being
cached for read accesses (either by the GPU or CPU), but must be careful
to still not write directly to the scanout with the CPU (only the GPU
writes are cached and coherent with the display).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
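The access rules above can be summarised in a small sketch. The enum values, struct and helper names are hypothetical and only encode the policy as stated: write-through counts as cached for reads, but direct CPU writes are still not allowed.

```c
#include <stdbool.h>

/* Hypothetical cache levels for a buffer object. */
enum cache_level {
    CACHE_NONE, /* uncached */
    CACHE_WT,   /* write-through */
    CACHE_LLC   /* fully cached and coherent */
};

struct bo {
    enum cache_level cache;
};

/* Reads may be treated as cached for both write-through and LLC. */
static bool cpu_can_read_directly(const struct bo *bo)
{
    return bo->cache != CACHE_NONE;
}

/* Direct CPU writes are only safe when fully cached: with write-through,
 * only GPU writes are coherent with the display. */
static bool cpu_can_write_directly(const struct bo *bo)
{
    return bo->cache == CACHE_LLC;
}
```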
GT3 has twice the number of cores and URB as GT2, and so we can use
more threads and URB entries.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Whilst we are force flushing vertexes we are not using the threaded
emitter, so simply hide it from the compiler to prevent it warning about
the unused function.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
All the current reports by gcc are false positives. Instead we have better
static analysis tools at our disposal and valgrind.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Warning about redundant declarations within the xorg headers hides
genuine warnings in our own code - disable them until the headers are
cleaned up.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If the DRI buffer is stale, the drawable may have been recreated and no
longer be associated with DRI. In this case, the pixmap may not be on
the GPU, so just substitute the client's old bo and hope it catches
up and does a GetBuffers in the near future.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
So do not offset it again when processing the fallback composite
operation.
Regression from commit 6921abd810
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Thu Jul 18 16:21:27 2013 +0100
sna: Add a fast path for the most common fallback for CPU-CPU blits
References: https://bugs.freedesktop.org/show_bug.cgi?id=66990
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
As a first step towards working out what to do with the remaining
used-once PCI IDs, delete the used-never ones.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Typically, PutImage is not a performance critical path since ShmPutImage
uses CopyArea and so PutImage is relegated to small one-off transfers.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>