3273 Commits

Author SHA1 Message Date
Kenneth Graunke 07cc488bcf render: New Ivybridge assembly programs for render acceleration.
These are exactly the same as the ones for Sandybridge, but with message
registers translated (hopefully) in the same way as Haihao's new
programs (m1 == g65).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Eric Anholt <eric@anholt.net>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-28 15:00:17 -07:00
Chris Wilson 1b9e82b4b5 sna: Revert enabling scan-line wait on SNB
Hanging the machine does indeed prevent video tearing. Just not quite
what the user expected...

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=39497
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-26 08:29:36 +01:00
Chris Wilson 6dbbb74bde sna: Enable gen6 scan-line waiting
The code was ready and waiting, just forgot to turn it on.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-20 22:00:53 +01:00
Chad Versace 3e55f3e88b dri: Do not tile stencil buffer
Until now, the stencil buffer was allocated as a Y tiled buffer, because
in several locations the PRM states that it is. However, it is actually
W tiled. From the PRM, 2011 Sandy Bridge, Volume 1, Part 2, Section
4.5.2.1 W-Major Format:
    W-Major Tile Format is used for separate stencil.

The GTT is incapable of W fencing, so we allocate the stencil buffer with
I915_TILING_NONE and decode the tile's layout in software.

This commit mutually depends on the mesa commit:
    intel: Fix stencil buffer to be W tiled
    Author: Chad Versace <chad@chad-versace.us>
    Date:   Mon Jul 18 00:37:45 2011 -0700

Signed-off-by: Chad Versace <chad@chad-versace.us>
Reviewed-by: Ian Romanick <ian.romanick@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2011-07-19 13:08:18 -07:00
Chris Wilson 212fa98687 Disable adding normal RTF modes for an eDP
This is causing a hard hang with 2.6.39+, we don't know why so play safe
and disable for the time being.

References: https://bugs.freedesktop.org/show_bug.cgi?id=38012
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-13 21:11:14 +01:00
Chris Wilson 7a695c9f6b sna: Fast-path single span boxes
These are very common when compositing unclipped trapezoids, and the
majority of the overhead is in handling an arbitrary number of boxes,
which misses out on the constant folding the compiler can do if it is
known we have just one box.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-13 17:43:13 +01:00
Chris Wilson 0190964906 sna/damage: Avoid testing against a completely damaged region
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-13 17:43:13 +01:00
Chris Wilson b929717c89 sna/gen3: Tune emit_spans_primitive_constant
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-13 17:43:13 +01:00
Chris Wilson fbdbfaf38d sna/glyphs: Discard GLYPH_PICTURE hint if the glyph doesn't fit into the cache
If the glyph is too big to fit into the cache, then ideally we do want
to keep an associated GPU bo around for future use. As it is too large
to fit into the cache, it is of reasonable size and there is little wastage
in allocating an individual GPU bo for each oversized glyph.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-13 17:41:02 +01:00
Chris Wilson 12f52530db sna: Add some extra debugging to the texture upload fallback paths
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-13 17:41:02 +01:00
Chris Wilson a861094c23 sna/dri: Fix a couple of typos
Somehow these were lost in the rebasing.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-13 17:41:02 +01:00
Chris Wilson c221d0356d sna/dri: Remove the unused id/type members for Resource tracking
...and reduce it to a simple list.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-13 17:41:02 +01:00
Chris Wilson 644b1a9033 dri: Always initialise resource members of DRI2FrameEvent
As we now attempt to always decouple the lists upon freeing the frame
event, we need to initialise them along all code paths.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-12 11:51:58 +01:00
Chris Wilson 32f4235814 sna/dri: Add some simple debugging
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-11 22:14:15 +01:00
Chris Wilson a46598220e sna/dri: Refactor common code for assigning a pixmap to the DRI2 buffer
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-11 22:14:15 +01:00
Chris Wilson 7538be3315 dri: Enable triple-buffered pageflips
By popular demand.

Triple-buffering trades off output latency against jitter. By having a
pre-rendered frame ready to swap in following a pageflip, we avoid the
scenario where the latency between receiving the flip complete signal
from the kernel, waking up the vsynced application, it rendering the new
frame and then the server processing the swap request is greater
than the frame interval, causing us to miss the vblank. The result is
that the application can become frame-locked to 30fps. Instead, we report to
the application that the first frame swap is immediately completed,
supply a new back buffer (or else the rendering would be blocked on
waiting for the front-buffer to be swapped away from the scanout) and
let them proceed to render the second frame. The second frame is added
to the swap queue, and the client throttled to vrefresh. (If the client
missed the vblank, the swap queue is empty and the client is immediately
woken again, whilst the pageflip is pending.)

Note, for practical reasons this only applies to page-flipping, for
example, calls to glXSwapBuffers() on fullscreen applications.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-11 22:13:01 +01:00
Chris Wilson 2608a367ac dri: Prevent abuse of the Resource database
The Resource database is only designed to store a single value for a
particular type associated with an XID. Due to the asynchronous nature
of the vblank/flip requests, we would often associate multiple frame
events with a particular drawable/client. Upon freeing the resource, we
would not necessarily decouple the right value, leaving a stale pointer
behind. Later when the client disappeared, we would write through that
stale pointer upsetting valgrind and causing memory corruption. MDK.

Instead, we need to implement an extra layer for tracking multiple
frames within a single Resource.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=37700
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-11 21:46:36 +01:00
Chris Wilson ab1000821a dri: Remove the shadow copy of attachment
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-09 19:47:00 +01:00
Chris Wilson 9f22ea7ca4 sna: Clamp results for computing BoxRec coords from xRectangle
As the width/height in the rectangle is specified as uint16_t, the
result may be larger than is storable in the int16_t of the box. Of
course it would take a really inane client to attempt to draw
something much larger than the largest possible surface... Is it strange
that the first example I've found to do so is a Java application?

Reported-by: Nicolas Kalkhof <nkalkhof@web.de>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-09 14:58:35 +01:00
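
The overflow described above can be illustrated with a minimal sketch. It is not the driver's code: `struct rect` and `struct box` are hypothetical stand-ins for the X server's xRectangle and BoxRec, and `clamp_extent` and `box_from_rect` are illustrative names. The sum is computed in a wider type and saturated before it is narrowed back to int16_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for xRectangle and BoxRec (same field widths). */
struct rect { int16_t x, y; uint16_t width, height; };
struct box  { int16_t x1, y1, x2, y2; };

/* Add an unsigned 16-bit extent to a signed 16-bit origin.
 * The sum is formed in 32 bits, where it cannot overflow,
 * and is saturated at INT16_MAX before narrowing. */
static int16_t clamp_extent(int16_t origin, uint16_t extent)
{
    int32_t v = (int32_t)origin + (int32_t)extent;
    return v > INT16_MAX ? INT16_MAX : (int16_t)v;
}

/* Convert a rectangle to a box, clamping the far edges. */
static struct box box_from_rect(const struct rect *r)
{
    struct box b;

    assert(r != NULL);
    b.x1 = r->x;
    b.y1 = r->y;
    b.x2 = clamp_extent(r->x, r->width);
    b.y2 = clamp_extent(r->y, r->height);
    return b;
}
```

Without the clamp, a rectangle at x=100 with width=65535 would wrap to a negative x2 once stored in an int16_t.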
Chris Wilson f4c5dc8817 sna/accel: Fix fallback for depth=1 copy
A little carelessness with passing down the offsets caused us to
incorrectly copy depth=1 bitmaps, as exemplified by gkrellm.

Reported-by: Nicolas Kalkhof <nkalkhof@web.de>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-09 14:19:04 +01:00
Chris Wilson 649ebcef09 sna: A buffer only needs a flush if it remains dirty at the end of the batch
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-08 18:59:05 +01:00
Chris Wilson 625e37b317 sna/gen3: So we also need to ensure stippling is cleared...
My theory that we used nothing that invoked polygon stippling proved
baseless.

Fixes regression from 3b5971bd23

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-08 18:58:40 +01:00
Chris Wilson 1e2cae0ab3 sna/gen3: Restore disabling the use of stencil/fog in the invariant
One cleanup too far causing spurious results after rebooting. We also
need to ensure that the writemask is fully enabled (ie not disabled)
as well.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-08 10:00:41 +01:00
Chris Wilson ec3dd64e73 sna/dri: Enable chaining of page-flips
Trade off extra frames of latency for extra frames of anti-jitter
buffering and loss of completion information; compiz users rejoice.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-08 10:00:41 +01:00
Chris Wilson a32694b0f0 sna/dri: Remove redundant NULL check in reference
The buffer has already been dereferenced by this point...

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-07 23:11:39 +01:00
Chris Wilson d180c5f5f7 sna: Take advantage of the needs_flush tracking on the front buffer
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-07 11:50:12 +01:00
Chris Wilson 0be47f459b sna: Check against an execbuffer reference before discarding partials
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-06 13:06:17 +01:00
Chris Wilson f6573fe757 sna: Compute aligned tiled heights for gen2 correctly
We were underestimating the height of X-tiled surfaces (and less
harmfully overestimating the height of Y-tiled surfaces.)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-06 13:06:14 +01:00
Chris Wilson d6afd66461 sna: Reset unused partial buffers
Whilst searching for available space on the active partial buffer list,
if we discover an unreferenced one, reset its used counter to zero.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-05 23:36:09 +01:00
Chris Wilson 6e7a0c8641 sna: Discard unused partial buffers
If we allocate a partial buffer and then fallback for the operation, the
buffer would remain on the partial list waiting for another user.
Discard any unused partials at the next batch submission or expiration
point.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-05 23:36:09 +01:00
Chris Wilson 3b5971bd23 sna/gen2: Restore invariant ENABLES
One deletion too many, unnoticed until the next reboot. Besides the
failure to disable logic op and enable colour buffer blending which
causes a hang if you subsequently try to enable both, you also need
to request texture caching...

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-05 22:22:41 +01:00
Chris Wilson 5fa3e73f2c sna/gen[23]: Do as the comments suggest and prefer the BLT
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-05 21:07:10 +01:00
Chris Wilson f749ed618e sna: Reduce tiling if pitch is less than a tile_width/height only on pre-G33
(Note this only applies to 2D pixmaps.)

The rationale, borne out by experimentation with cairo-perf-trace, is
that on the pre-G33 devices we always need a fence region
for tiled surfaces, i.e. at least .5/1MiB in size, and that combined
with the smaller GTT on those devices, we lose the benefit of tiling to
the excessive GTT thrashing.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-05 21:05:08 +01:00
Chris Wilson b9de6a98d3 sna: Remove unused aperture_size member
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-05 17:48:11 +01:00
Chris Wilson fd3bc2af47 sna: Clamp object size to the min of 1/4 of the whole GTT or 1/2 the mappable
... for those pesky early devices whose GTT was no larger than the AGP
aperture.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-04 17:13:57 +01:00
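
A rough sketch of the clamp named in the subject line, under stated assumptions: `max_object_size` and its parameters are hypothetical names, sizes are in bytes, and the mappable aperture is taken to be a window within the GTT. This is not the driver's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* Upper bound on a single object's size: the smaller of a quarter of
 * the whole GTT and half of the CPU-mappable aperture.
 * Assumption: the mappable aperture never exceeds the total GTT. */
static uint64_t max_object_size(uint64_t gtt_total, uint64_t mappable)
{
    uint64_t quarter_gtt = gtt_total / 4;
    uint64_t half_mappable = mappable / 2;

    assert(mappable <= gtt_total);
    return quarter_gtt < half_mappable ? quarter_gtt : half_mappable;
}
```

On an early device where the GTT is no larger than the aperture (say both are 256MiB), the quarter-GTT term is the smaller one and gives 64MiB. With a 2GiB GTT and a 256MiB aperture, the half-mappable term gives 128MiB.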
Chris Wilson d294e41a6a sna: Update flush/retirement lists after an implicit flush for mmap
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-04 17:13:57 +01:00
Chris Wilson 3e53b0f3a3 sna: Enable relaxed-fencing for gen2 devices
(Just as dependent upon non-buggy kernels as gen3...)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-04 17:13:57 +01:00
Chris Wilson 33ddaf5429 sna: Fix gen2 tiled surface sizes
Actually use the gen2 path for gen2 devices!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-04 17:13:57 +01:00
Chris Wilson 9eceddf69f sna/gen2: fix batch buffer accounting
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-04 17:13:57 +01:00
Chris Wilson 3f80f7edb8 sna: Manually set to the GTT domain for mmap
...since the kernel no longer does strict coherency.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-04 17:13:47 +01:00
Chris Wilson f91ee24b2d sna: Trim number of downsample passes
If we can fit the entire width or the entire height into the pipeline
when downsampling, do so.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-04 15:27:54 +01:00
Chris Wilson 6db93720a7 sna: Don't change tiling modes on replace
This was trying to work around a kernel bug, and instead causes a
performance cliff for textures that *need* to be tiled.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-04 15:27:54 +01:00
Chris Wilson 430c905ef3 sna: Force tiled modes for large pitches
If the surface is so big that the 2x2 texel sampling will cause a TLB
miss every time, i.e. the row pitch exceeds 4096, then we need to
encourage tiling to prevent atrocious performance.

For example, try downscaling a 2560x1600 background image on a gen3
device using I915_TILING_NONE...

Using slideshow-demo /usr/share/backgrounds/cosmos/whirlpool.jpg, on a
PineView netbook, fps goes from under 4 to over 40.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-04 15:27:40 +01:00
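
The pitch heuristic above might look something like the following sketch. The 4096-byte threshold comes from the commit message. The enum, the function name, and the choice of X tiling as the forced mode are assumptions made for illustration and are not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tiling modes, loosely mirroring I915_TILING_*. */
enum tiling { TILING_NONE, TILING_X, TILING_Y };

#define PITCH_LIMIT 4096u /* bytes; threshold quoted in the commit message */

/* Promote a linear surface to a tiled one when its row pitch exceeds
 * the limit. Choosing X tiling here is an assumption for illustration. */
static enum tiling choose_tiling(enum tiling requested,
                                 uint32_t width, uint32_t bytes_per_pixel)
{
    uint64_t pitch;

    assert(bytes_per_pixel > 0);
    pitch = (uint64_t)width * bytes_per_pixel;

    if (requested == TILING_NONE && pitch > PITCH_LIMIT)
        return TILING_X;
    return requested;
}
```

For the 2560x1600 background mentioned above at 4 bytes per pixel, the pitch is 10240 bytes, so a linear request would be promoted. A 1024-pixel-wide surface at the same depth has a pitch of exactly 4096 and is left alone.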
Chris Wilson ae567b783e sna: Finer-grained debugging for trapezoids
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-04 14:45:00 +01:00
Chris Wilson 98f2e3855d sna/video: Downgrade severity of "overlay not found" message
We don't need to warn the user that their hardware does not support the
video overlay plane (but Jesse is working on it!), but merely inform
them that its presence is lacking.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-02 09:53:11 +01:00
Chris Wilson 01c258718e sna/gen2: Add missing stub debug files
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-02 07:51:15 +01:00
Chris Wilson 5c8a108d2c sna/gen2: Recompute blend pipeline for component-alpha pass
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-01 21:41:23 +01:00
Chris Wilson 121511d3bd sna/gen2: Pack solid sources into the default diffuse component
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-01 21:41:23 +01:00
Chris Wilson a303f85c16 sna/gen2: Remove unused state from invariant setup
... and also some state that gets clobbered when we install the
composite pipelines.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-01 21:41:23 +01:00
Chris Wilson 120c98ac10 sna: Downsample sources 2x too large to fit in the 3D pipeline
This is quite trivial to hit given the 2k limits on gen2/gen3. We
compromise on image quality by pre-downscaling the source by a fixed
factor to make it fit into the pipeline in preference to performing the
entire operation on the CPU.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-01 21:41:23 +01:00
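
As a rough illustration of the pre-downscaling decision, the sketch below halves both dimensions of a source that is at most twice the pipeline limit and reports failure (CPU fallback) otherwise. `plan_downsample`, `DOWNSCALE_FACTOR`, and the choice to scale both axes are assumptions for the example and are not taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DOWNSCALE_FACTOR 2u /* fixed factor, assumed from the "2x" in the subject */

/* Decide the size at which a source should be fed to the 3D pipeline.
 * Returns false if it is too large even after downscaling, in which
 * case the caller would fall back to the CPU. */
static bool plan_downsample(uint32_t width, uint32_t height, uint32_t max_size,
                            uint32_t *out_width, uint32_t *out_height)
{
    assert(max_size > 0 && out_width && out_height);

    if (width <= max_size && height <= max_size) {
        /* Already fits: no downscaling needed. */
        *out_width = width;
        *out_height = height;
        return true;
    }

    if (width > DOWNSCALE_FACTOR * max_size ||
        height > DOWNSCALE_FACTOR * max_size)
        return false;

    /* Round up so that odd sizes do not lose their last row or column. */
    *out_width = (width + DOWNSCALE_FACTOR - 1) / DOWNSCALE_FACTOR;
    *out_height = (height + DOWNSCALE_FACTOR - 1) / DOWNSCALE_FACTOR;
    return true;
}
```

With the 2k limit quoted above, a 2560x1600 source would be reduced to 1280x800 before compositing, at some cost in image quality.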