RegionNotEmpty() is only valid if we only use the Region API, and as we
mix direct operations on the region extents, we need to also do our own
final check.
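
A minimal sketch of the kind of final check meant here, assuming a simplified box type (`struct box` and `box_is_empty()` are illustrative names, not the driver's actual code):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the X server's box type. */
struct box {
	int16_t x1, y1, x2, y2;
};

/* After adjusting the extents by hand, the box may have collapsed
 * without the Region API noticing, so test the extents directly. */
static bool box_is_empty(const struct box *b)
{
	return b->x2 <= b->x1 || b->y2 <= b->y1;
}
```

The check is independent of any region bookkeeping, so it stays valid however the extents were modified.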
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Fixes regression in
commit 6921abd810
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Thu Jul 18 16:21:27 2013 +0100
sna: Add a fast path for the most common fallback for CPU-CPU blits
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
This path will mostly be used for individual glyph uploads, for which
the malloc overhead is significant.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
There are times, such as rendering into the scanout, where continuing to
use the GTT is preferable even when wedged.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
For some operations we do not know the true extents and so check the whole
drawable when considering placement. In this case, the drawing may only
partially cover the drawable and so we cannot simply ignore existing CPU
damage.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Hopefully a final regression from:
commit 07926bfe50
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Thu Jul 11 15:28:55 2013 +0100
sna: Remove the temporary region allocation from sna_do_copy
References: https://bugs.freedesktop.org/show_bug.cgi?id=67055
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
This can happen if the DRI client passes in a stale DRI2Drawable, that
is, the Drawable now references a new Pixmap against which the client
has not run DRI2GetBuffers.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Simply reject any attempts to copy using stale references (i.e. the
DRI2Drawable has changed structure but the client hasn't yet noticed).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Move our ifdefs out of line from the main code into a header file, where
we can also apply a little bit of syntactic sugar.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Since __cpuid_count() was only introduced into gcc-4.4, we obviously
cannot use it with earlier versions or with compilers that do not
provide compatible interfaces.
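
A sketch of such a guard, assuming an x86 target; `HAVE_CPUID_COUNT` and `query_cpuid()` are hypothetical names used only for illustration:

```c
#include <stdbool.h>

/* __cpuid_count() lives in <cpuid.h>; only use it with gcc >= 4.4 on x86. */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__)) && \
	(__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 4))
#include <cpuid.h>
#define HAVE_CPUID_COUNT 1
#else
#define HAVE_CPUID_COUNT 0
#endif

/* Hypothetical helper: fill regs[] with eax, ebx, ecx, edx for the given
 * leaf/subleaf. Returns false if the interface is unavailable or the
 * leaf is not supported, so that the caller can fall back. */
static bool query_cpuid(unsigned leaf, unsigned subleaf, unsigned regs[4])
{
#if HAVE_CPUID_COUNT
	if (__get_cpuid_max(leaf & 0x80000000u, 0) < leaf)
		return false;
	__cpuid_count(leaf, subleaf, regs[0], regs[1], regs[2], regs[3]);
	return true;
#else
	(void)leaf;
	(void)subleaf;
	(void)regs;
	return false;
#endif
}
```

Compilers that fail the version test simply take the fallback branch rather than breaking the build.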
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
ALIGN() expects its alignment argument to be a power of two. As buffer
size depended upon cache_size, which is not always a power of two,
issues could arise with unexpected buffer sizes.
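
To illustrate the problem, here is the usual mask-based form of such a macro next to a division-based alternative. This is a sketch only: the macro body is the common idiom and `round_up()` is a hypothetical helper, neither is necessarily what the driver uses.

```c
#include <assert.h>
#include <stddef.h>

/* Mask-based alignment: only correct when a is a power of two. */
#define ALIGN(v, a) (((v) + (a) - 1) & ~((a) - 1))

static int is_power_of_two(size_t a)
{
	return a != 0 && (a & (a - 1)) == 0;
}

/* Hypothetical helper: round v up to a multiple of any non-zero a. */
static size_t round_up(size_t v, size_t a)
{
	assert(a != 0);
	return (v + a - 1) / a * a;
}
```

With these definitions ALIGN(100, 64) gives 128 as expected, whereas ALIGN(100, 24) gives 104, which is not a multiple of 24; round_up(100, 24) gives 120.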
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Comparing y2 against y1 for the intersection was a silly typo,
especially as the routine for computing the intersection already
existed.
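
For reference, a box intersection of this shape can be sketched as follows (`struct box` and `box_intersect()` are illustrative names, not copied from the driver). Each edge is only ever compared against the same edge of the other box, and the emptiness test compares y1 with y2 of the result:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the X server's box type. */
struct box {
	int16_t x1, y1, x2, y2;
};

/* Clip a against b in place; returns false if nothing remains. */
static bool box_intersect(struct box *a, const struct box *b)
{
	if (a->x1 < b->x1)
		a->x1 = b->x1;
	if (a->x2 > b->x2)
		a->x2 = b->x2;
	if (a->x1 >= a->x2)
		return false;

	if (a->y1 < b->y1)
		a->y1 = b->y1;
	if (a->y2 > b->y2)
		a->y2 = b->y2;
	return a->y1 < a->y2;
}
```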
Fixes regression in commit 34c9b759fb
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Tue Jul 16 19:39:37 2013 +0100
sna: Note that borderClip region may be more than a singular box
Reported-by: Clemens Eisserer <linuxhippy@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66991
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Older hardware does not support cache size probing via cpuid4, so we
need to implement the older algorithm which requires a table-based
lookup. (In hindsight, this is why I thought cache probing via cpuid to
be quite hairy.) For the moment, just use the value found in /proc/cpuinfo.
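
A rough sketch of reading that value, assuming the usual Linux "cache size : N KB" line format; both function names are hypothetical:

```c
#include <stdio.h>

/* Parse one line of /proc/cpuinfo; returns the size in bytes, or 0 if
 * the line is not a "cache size" entry. */
static unsigned parse_cache_size(const char *line)
{
	unsigned kb;

	if (sscanf(line, "cache size : %u KB", &kb) == 1)
		return kb * 1024;
	return 0;
}

/* Scan the file for the first cache size entry; 0 means unknown, in
 * which case the caller has to pick its own default. */
static unsigned cpuinfo_cache_size(const char *path)
{
	char line[256];
	unsigned size = 0;
	FILE *file = fopen(path, "r");

	if (file == NULL)
		return 0;
	while (size == 0 && fgets(line, sizeof(line), file))
		size = parse_cache_size(line);
	fclose(file);
	return size;
}
```

Typical usage would be cpuinfo_cache_size("/proc/cpuinfo"), treating a return of 0 as "not available".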
Reported-by: Oscar Dario Trujillo Tejada <oscardt19@gmail.com>
Reported-by: Ferry Toth <ftoth@telfort.nl>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If the linear bo is still in the CPU domain, we can map it through the
CPU with no penalty, so treat it as mappable.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If the child is obscured, then borderClip will contain a list of valid
boxes rather than a single extents box. I thought this was covered by
the clipList, but I was wrong.
Reported-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66970
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Since we will not sample depth=1 pixmaps from the GPU, we may as well
directly allocate these in system memory and avoid tickling the upload
cache. This then avoids an issue within the size calculation code which
makes the assumption that bpp>=8.
Reported-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Since we are retrieving the hw values rather than choosing a default for
ourselves, it is more consistent to use PROBED rather than INFO for our
message.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
We no longer seem to be hitting the same random hangs, so presumably
another w/a is taking effect.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Occasionally when forced to use an intermediate destination surface, we
know that we will completely overwrite the contents of the surface and
so we can forgo the initial copy from the target.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Courtesy of a patch from Chad Versace via Ben Widawsky, actually digging
through CPUID for the cache info looks quite easy in comparison to the
fragile approach of parsing a Linux-specific file that may or may not be
available.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
These assertions were checking that the previous state prior to
performing the new mapping was consistent. Given that the checks were
occurring after the update in mapping, the asserts were bogus.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Thomas Jones reported that the build was failing with gcc-4.5 due to the
memcpy routines requesting an unsupported optimisation mode (-Ofast) and
supplied this patch to only enable Ofast for gcc-4.6+
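
The guard can be sketched like this; `FAST_OPT` and `copy_bytes()` are hypothetical names, and the exact macro in the patch may differ:

```c
#include <stddef.h>

/* Only request Ofast from compilers known to understand it (gcc >= 4.6). */
#if defined(__GNUC__) && !defined(__clang__) && \
	(__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6))
#define FAST_OPT __attribute__((optimize("Ofast")))
#else
#define FAST_OPT
#endif

/* Hypothetical copy routine tagged with the optional optimisation mode. */
static FAST_OPT void copy_bytes(void *dst, const void *src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	while (len--)
		*d++ = *s++;
}
```

On older compilers the macro expands to nothing and the routine is built with the default optimisation level.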
Reported-by: Thomas Jones <thomas.jones@utoronto.ca>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
In order to preserve the optimisation of discarding incomplete batches,
we don't always want to immediately submit the batch after inserting the
first command. As we currently only cancel a batch if it only touches
the bo being discarded, we can skip the immediate flush if it only
accesses one bo and may be able to use the undo optimisation later.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Fixes regression from
commit 8751c0f5ad
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Fri Jul 5 17:55:10 2013 +0100
sna: Flush blt copies if no operations pending
Reported-by: Andreas Reis <andreas.reis@gmail.com>
Reported-by: Mike Lothian <mike@fireburn.co.uk>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66742
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If there is stolen memory reserved by the BIOS, we want to utilize it in
preference to regular system memory. However, the caveat that it is not
suitable for CPU access rules out most use cases - but it is a good
match for framebuffers.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Remove the hack from the glyph path to force the use of an auxiliary
channel, and reduce the maximum amount of inflight vertices until we can
then render glyphs with no corruption (at least in my test case).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Inspect whether this rectangle will be added to the previous primitive
and so charge it against the current number of inflight rectangles.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
This is an abhorrent workaround for some internal GPU brokenness. A
slight refinement since earlier times is the recognition that 16 is a
magic number limiting the maximum number of inflight rectangles through
the GPU.
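
The bookkeeping can be pictured roughly as follows. This is a toy model with hypothetical names (`rect_stream`, `stream_emit()`, `stream_flush()`), not the driver's vertex emission code:

```c
#define MAX_INFLIGHT_RECTS 16

/* Toy state: rectangles queued in the current primitive and the number
 * of flushes issued so far. */
struct rect_stream {
	unsigned inflight;
	unsigned flushes;
};

static void stream_flush(struct rect_stream *s)
{
	if (s->inflight) {
		s->inflight = 0;
		s->flushes++;
	}
}

/* Queue count rectangles, never letting more than the limit accumulate
 * before a flush. */
static void stream_emit(struct rect_stream *s, unsigned count)
{
	while (count) {
		unsigned space = MAX_INFLIGHT_RECTS - s->inflight;
		unsigned n = count < space ? count : space;

		s->inflight += n;
		count -= n;
		if (s->inflight == MAX_INFLIGHT_RECTS)
			stream_flush(s);
	}
}
```

Rectangles appended to an existing primitive count against the same limit, which is why the running total is kept across calls.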
References: https://bugs.freedesktop.org/show_bug.cgi?id=55500
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
I was being overzealous at the time of making the COW and trying to be
sure that we would never write through a mapping. Then I started to
allow clones to be mapped (for reads) and missed relaxing this assertion.
Reported-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
No actual initial configuration magic is required, all we need to do is
set the initial framebuffer size with no connected outputs and leave it
to the core to select CompatOutput() and the like.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>