Using spans has a tremendous effect (~100x) on x11perf, some good but
mostly bad. However, in reality operations are mixed and so preventing
migration on alternate operations is a win. The x11perf slowdowns appear
to be CPU bound, so there should be plenty of scope for recovering the
lost performance.
However, for the time being, just go back to the old fallbacks.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Previously we ignored updating the scanout in place, and so we were not
amortizing the shadow cost of common core rendering operations.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
As the span code does not yet handle plane masks or stippling, it is
disadvantageous to convert to spans only to fall back.
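A minimal sketch of that early-out, using stand-in types rather than the X
server's GC structures. `struct gc_state`, `enum fill_style`,
`full_planemask()` and `can_use_spans()` are hypothetical names chosen for
illustration; only the two conditions (plane mask, stippling) come from the
text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the GC state the check cares about. */
enum fill_style { FILL_SOLID, FILL_TILED, FILL_STIPPLED, FILL_OPAQUE_STIPPLED };

struct gc_state {
	uint32_t planemask;   /* planes the client allows us to write */
	enum fill_style fill; /* how the fill source is generated */
};

/* All-ones plane mask for a drawable of the given depth (1..32). */
static uint32_t full_planemask(int depth)
{
	assert(depth >= 1 && depth <= 32);
	return depth == 32 ? 0xffffffffu : (1u << depth) - 1;
}

/* Convert to spans only when the span code can finish the job itself;
 * otherwise take the fallback directly and skip the wasted conversion. */
static bool can_use_spans(const struct gc_state *gc, int depth)
{
	uint32_t full = full_planemask(depth);

	if ((gc->planemask & full) != full)
		return false; /* partial plane mask: not handled by spans */
	if (gc->fill == FILL_STIPPLED || gc->fill == FILL_OPAQUE_STIPPLED)
		return false; /* stippling: not handled by spans */
	return true;
}
```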
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
This is actually trickier than it looks since miPolyArc() sometimes uses
an intermediate bitmap which performs worse than the fbPolyArc() fallback.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
In the beginning, I did perform a retire after every batch. Then I
decided that it was too much CPU overhead for too little gain. On
reflection, i.e. further benchmarking, we do see a performance
improvement for recycling active buffers faster.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The actual bug is a little involved as we don't damage the temporary
glyph mask correctly presuming that we only hit GPU paths. However,
should we fail to prepare the composite operation that paints the mask
on to the destination, things fail horribly.
One particular example is that wine likes to create its own temporary a1
buffer for the glyphs (which we render to via another temporary mask...)
which triggers the delayed fallback and then sw compositing with a random
buffer.
Reported-by: Roman Jarosz <kedgedev@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=41165
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
I mistakenly set GEN7_PS_MAX_THREAD_SHIFT to 23; it's actually 24 on
Ivybridge. Not only did this halve our thread count, it caused us to
write 1 into bit 23, which is marked as MBZ (must be zero).
Furthermore, it made us write an even number into this field, which is
apparently not allowed. Apparently we were just lucky it worked.
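The effect of the off-by-one shift can be checked with a little arithmetic.
The sketch below assumes the field occupies bits 31:24 and is programmed as
(threads - 1); both are assumptions made for the illustration, as is the
thread count of 86 used in the usage note. The helper and macro names are
hypothetical and not the driver's.

```c
#include <assert.h>
#include <stdint.h>

#define PS_MAX_THREAD_SHIFT_WRONG 23 /* the mistaken value */
#define PS_MAX_THREAD_SHIFT_IVB   24 /* the value after the fix */

/* Pack (threads - 1) into the dword at the given shift. */
static uint32_t pack_max_threads(unsigned threads, unsigned shift)
{
	assert(threads >= 1 && threads <= 256);
	return (uint32_t)(threads - 1) << shift;
}

/* What would be read back from bits 31:24 (assumed field position). */
static unsigned field_31_24(uint32_t dw)
{
	return dw >> 24;
}

/* Bit 23, which the text above describes as MBZ. */
static unsigned bit_23(uint32_t dw)
{
	return (dw >> 23) & 1;
}
```

With an illustrative count of 86 threads, the correct shift stores 85 in the
field and leaves bit 23 clear. The wrong shift stores 42 (half the intended
value, and even) and sets bit 23.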
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Would be preferable to duplicate the tiling logic. Leave the task of
reimplementing XAA to another day!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Since we have no recycling of wc pages in the kernel, we try hard to
recycle buffers in userspace to avoid GTT thrashing. This requires
co-operation between DRI clients and X, which is sadly lacking and so we
need to discard any buffer given out to a client after it is finished.
We cheat slightly for page-flips and access to the scanout.
A further compromise.
References: https://bugs.freedesktop.org/show_bug.cgi?id=38732
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
On x11perf one of the major hotspots is the search through the active
list for an object large enough to reuse as the target surface. We can
eliminate that overhead by keeping those active objects in pre-sorted
lists by size.
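One way to picture the pre-sorted lists is as power-of-two size buckets, so
that a lookup starts at the first bucket able to satisfy the request instead
of walking every active object. This is a sketch under that assumption, not
the driver's cache; `struct bo`, `struct bo_cache`, `NUM_BUCKETS` and the
helpers are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE_BYTES 4096u
#define NUM_BUCKETS 16

/* Hypothetical buffer object: only the fields the sketch needs. */
struct bo {
	size_t size;     /* allocation size in bytes */
	struct bo *next; /* next object in the same bucket */
};

struct bo_cache {
	/* Bucket i holds objects of up to PAGE_SIZE_BYTES << i bytes;
	 * the last bucket also takes anything larger. */
	struct bo *active[NUM_BUCKETS];
};

/* Smallest bucket whose capacity covers size. */
static int bucket_for(size_t size)
{
	size_t cap = PAGE_SIZE_BYTES;
	int i = 0;

	while (cap < size && i < NUM_BUCKETS - 1) {
		cap <<= 1;
		i++;
	}
	return i;
}

static void cache_add(struct bo_cache *c, struct bo *bo)
{
	int i;

	assert(bo != NULL);
	i = bucket_for(bo->size);
	bo->next = c->active[i];
	c->active[i] = bo;
}

/* Skip every bucket that is too small, then unlink and return the
 * first object that is large enough. */
static struct bo *cache_find(struct bo_cache *c, size_t size)
{
	for (int i = bucket_for(size); i < NUM_BUCKETS; i++) {
		struct bo **prev = &c->active[i];

		for (struct bo *bo = *prev; bo; prev = &bo->next, bo = bo->next) {
			if (bo->size >= size) {
				*prev = bo->next;
				bo->next = NULL;
				return bo;
			}
		}
	}
	return NULL;
}
```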
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
All the guesswork is so that when we require an inactive bo, we do
actually get a buffer that is not currently on a GPU active list. For
some unresolved reason, this assertion was firing when putting the
buffer onto the inactive list - so just work around the worrisome issue
by delaying the check until use.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
We were tracking the 32bit value of the prev_x using only a 16bit
variable, and so failing to sort the edges after advancing to the next
scanline.
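The failure mode can be reproduced in isolation. The sketch below models the
narrowing explicitly so that it behaves the same on any compiler; the
function names and the sample x values are hypothetical, and the real
rasteriser's edge structures are not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model of storing a 32-bit value in a 16-bit signed variable: keep the
 * low 16 bits and reinterpret them as two's complement. The sketch only
 * models non-negative inputs. */
static int32_t truncate_to_16(int32_t x)
{
	int32_t low;

	assert(x >= 0);
	low = x & 0xffff;
	return low >= 0x8000 ? low - 0x10000 : low;
}

/* Edges need re-sorting when an x is smaller than its predecessor. */
static bool needs_sort_32(const int32_t *x, int n)
{
	int32_t prev_x = INT32_MIN;

	for (int i = 0; i < n; i++) {
		if (x[i] < prev_x)
			return true;
		prev_x = x[i];
	}
	return false;
}

/* The same test with prev_x narrowed to 16 bits, as in the bug. */
static bool needs_sort_16(const int32_t *x, int n)
{
	int32_t prev_x = INT16_MIN;

	for (int i = 0; i < n; i++) {
		if (x[i] < prev_x)
			return true;
		prev_x = truncate_to_16(x[i]);
	}
	return false;
}
```

For the pair {70000, 5000} the 32-bit version reports that a sort is needed,
while the 16-bit version remembers 70000 as 4464 and misses it.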
Fixes cairo a1-clip-fill-rule.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Removes 17 instances of:
warning: comparison of unsigned expression >= 0 is always true
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Enums are unsigned by default in gcc and we can't rely on any specific
signedness for the other compilers.
i965_render.c: In function ‘i965_prepare_composite’:
i965_render.c:2018:2: warning: comparison of unsigned expression < 0 is always false
i965_render.c:2025:2: warning: comparison of unsigned expression < 0 is always false
i965_render.c:2050:3: warning: comparison of unsigned expression < 0 is always false
i965_render.c:2057:3: warning: comparison of unsigned expression < 0 is always false
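One portable way to avoid such comparisons is to report "no match" with an
extra enumerator rather than a negative value. The sketch below illustrates
that pattern only; the enum and function names are illustrative and are not
taken from i965_render.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative enum with a trailing sentinel. */
enum sampler_filter {
	FILTER_NEAREST,
	FILTER_BILINEAR,
	FILTER_COUNT /* also serves as the "unsupported" result */
};

/* Map a request code to a filter, returning FILTER_COUNT instead of a
 * negative value when there is no match. */
static enum sampler_filter filter_from_code(int code)
{
	switch (code) {
	case 0:
		return FILTER_NEAREST;
	case 1:
		return FILTER_BILINEAR;
	default:
		return FILTER_COUNT;
	}
}

/* Compare against the sentinel: the result does not depend on whether
 * the compiler picks a signed or unsigned underlying type. */
static bool filter_is_valid(enum sampler_filter f)
{
	assert((int)f <= (int)FILTER_COUNT);
	return f != FILTER_COUNT;
}
```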
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
[ickle: take advantage and rename the enum values]
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Didn't spot anything that might have led to a genuine bug, but this
should help improve the signal-to-noise ratio of warnings in the future.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Under a compositing manager where we have fun values for both
drawable->x/y and pixmap->screen.x/y, we were neither drawing the glyphs
into the mask correctly nor compositing the mask in the right position
on top of the pixmap.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
As the active bo is still referenced in the request list, we cannot
simply free it but need to wait for it to be purged on expiration.
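The deferred release can be modelled with two flags. This is a simplified
sketch, assuming a single outstanding request per buffer; `struct bo`,
`bo_destroy()` and `bo_retire()` are hypothetical names and do not reflect
the driver's actual bookkeeping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical buffer object state. */
struct bo {
	bool active; /* still referenced by an outstanding request */
	bool purged; /* owner dropped it; free once it retires */
};

/* Drop the owner's reference. Returns true if freed immediately. */
static bool bo_destroy(struct bo *bo)
{
	assert(bo != NULL);
	if (bo->active) {
		bo->purged = true; /* the request list still points at it */
		return false;
	}
	free(bo);
	return true;
}

/* Called once the request referencing the bo has completed.
 * Returns true if the bo was freed here. */
static bool bo_retire(struct bo *bo)
{
	assert(bo != NULL);
	bo->active = false;
	if (bo->purged) {
		free(bo);
		return true;
	}
	return false;
}
```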
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Julien Cristau disliked my broadcasting of the git tree used to build
his distribution package as it bore little relevance to his users. As it
is only useful for people installing their own drivers (as a means of
sanity checking that they are running the right driver), we introduce
the --with-builderstring idiom borrowed from the xserver. This allows
the builder to override the use of `git describe` and either leave it
blank or to fill it with something useful for their own purposes.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>