5577 Commits

Author SHA1 Message Date
Chris Wilson 286b0e1a48 sna: Refresh experimental userptr vmap support
Bring the code up to date with both kernel interface changes and internal
adjustments following the creation of CPU buffers with set-cacheing.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-21 11:45:37 +01:00
Chris Wilson 93c794eb3f sna: Micro-optimise copying boxes with the blitter
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-21 00:01:59 +01:00
Chris Wilson a0d95a9c2d sna: Only update a buffer when it becomes dirty
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-21 00:01:59 +01:00
Chris Wilson c52d265b83 sna: Tweak CPU bo promotion rules for CopyArea
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-20 22:00:54 +01:00
Chris Wilson f92a64dd91 sna: Only set the vmap flag after we make the bo snoopable
Otherwise if we fail then we incorrectly add the handle to the vmap
cache.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-20 15:54:35 +01:00
Chris Wilson 8b4cf24f14 sna: Also check whether the first upload box can use the BLT
No point checking boxes 1..n if box 0 is the troublemaker!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-20 14:46:32 +01:00
Chris Wilson df14b285be sna/gen6: Prefer the more flexible render ring for large surfaces
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-20 14:35:28 +01:00
Chris Wilson 578ff11c37 sna: Just use composite.box() when we only have one box
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-20 14:24:06 +01:00
Chris Wilson fb7987fc0b sna/dri: Cleanup ring selection for SNB+ CopyRegion
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-20 13:12:27 +01:00
Chris Wilson 3b56588fba sna: Update WIP userptr example usage
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-20 10:19:25 +01:00
Chris Wilson 473a1dfb68 sna: Rename kgem_partial_bo to kgem_buffer
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-20 10:09:14 +01:00
Chris Wilson 8e6e8a2fa8 sna: Allow the snoopable upload buffer to take pages from the CPU vma cache
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-20 09:51:46 +01:00
Chris Wilson 979035bb9c sna: Remove topmost unused 'flush' attribute
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-20 07:28:54 +01:00
Chris Wilson b83011909a sna: Replace 'sync' flag with equivalent 'flush'
The only difference is in semantics. Currently 'sync' was only used on
CPU buffers for shared memory segments with 2D clients, and 'flush' on GPU
buffers shared with DRI clients.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-20 07:28:43 +01:00
Chris Wilson 88bee3caea sna: Remove unused scanout-is-dirty? flag
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-20 07:28:42 +01:00
Chris Wilson 6f60f89588 sna/gen6: Bump the WM thread count to 80
Note that we should only do this when "WiZ Hashing" is disabled. So we
should be checking the GT_MODE register (bring on i915_read!) to be sure
that it is safe to do so. However, it gives a big boost to performance of
render copies...  It also causes perf benchmarks to hit thermal limits
much quicker.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-19 17:55:00 +01:00
Chris Wilson fc39d4b5cb sna/gen6: Add a simple DBG option to limit usage of either BLT/RENDER
We can force the code to either select only BLT or RENDER operations -
for those that we have a choice for at least!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-19 16:51:54 +01:00
Chris Wilson 15d3eea700 sna: Handle mixed bo/buffers in assertions
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-19 16:51:50 +01:00
Chris Wilson e4fce3b780 sna/gen4: Hookup composite spans
Due to the unresolved flushing bug it is no faster (so only enable when
we definitely can't do the operation inplace), however it does eliminate
a chunk of CPU overhead.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-19 11:01:52 +01:00
Chris Wilson 5f138176bf sna: Tweak order of screen re-initialisation
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-19 10:32:33 +01:00
Chris Wilson 9bd0f8f3e7 i810: Correct the double negative and enable XAA when available
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-19 09:40:07 +01:00
Chris Wilson d145d0e145 i810: Handle initialisation without the XAA module present at runtime
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-19 09:25:47 +01:00
Chris Wilson 7a3b98e05b sna: Re-register the SHM funcs every server generation
As the SHM layer hooks into the CloseScreen chain to free its privates,
we then need to call the registration function again on the next
generation to ensure that the private is reallocated before use.

Reported-by: Pawel Sikora <pluto@agmk.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=52255
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-19 08:57:02 +01:00
Chris Wilson 4bcab83bbd i810: DRI is not dependent upon XAA
The blit routines it uses are independent of the XAA driver interface
and can be used separately.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-18 22:21:22 +01:00
Chris Wilson 558c825129 sna/gen4+: Drop unsupported source formats
Once again I've confused existence of the enum with the ability of the
sampler to read that format.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-18 22:00:30 +01:00
Chris Wilson 9f3b3098c9 sna/dri: Allow DRI2 to be loaded even if we are wedged
Just because the GPU is spitting EIO at us does not necessarily imply
that a DRI client will also suffer. Spit out a warning for later bug
reporting and let them find out for themselves!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-18 21:42:51 +01:00
Chris Wilson 15b7191fd3 sna/gen6: Micro-optimise render copy emission
Backport of the changes made for IVB.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-18 21:36:26 +01:00
Chris Wilson 4eea9ac003 sna/gen7: Micro-optimise render copy emission
The goal is to bring the overhead down to that of using the blitter. Tricky
given the number of steps to using the 3D pipeline compared to the
BLT...

A stretch goal would be to make IVB GPU bound for -copywinpix10!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-18 21:36:26 +01:00
Chris Wilson 267429bbb1 sna: Enable runtime detection of set-cacheing ioctl
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-18 21:36:26 +01:00
Chris Wilson c0b3674d04 sna/trapezoids: Only reduce bounded operators to a single pass
Only for a few operators can we replace the opacity mask by
premultiplying into the source.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-18 21:36:26 +01:00
Chris Wilson bb0303677c sna/trapezoids: Use pixman from within the spans to reduce two-pass operations
Reduce the two pass CompositeTrapezoids if we can perform the operation
inplace by calling pixman_image_composite from the span. This step
enables this for xrgb32.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-18 10:44:35 +01:00
Chris Wilson bee1a14618 sna: Fix processing of the last fallback box
The evil typo caused us to misalign the clip boxes and run over a
garbage array on 64-bit builds.

Reported-by: Edward Sheldrake <ejsheldrake@gmail.com>
Reported-by: Clemens Eisserer <linuxhippy@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=52163
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-18 09:40:09 +01:00
Chris Wilson 88cb1968b6 sna: Add more DBG for fallback processing
Hunting the lost box...

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-18 09:41:24 +01:00
Chris Wilson 36f2e46619 sna: Reuse the snoopable cache more frequently for upload buffers
Now that we are keeping a small cache of snoopable buffers, experiment
with using them for uploads more frequently.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-18 09:41:24 +01:00
Chris Wilson 73f07abbd2 sna: Maintain a short-lived cache of snoopable CPU bo for older gen
Once again, we find that frequent buffer creation and manipulation of the
GTT is a painful experience leading to noticeable and frequent application
stalls. So mitigate the need for fresh pages by keeping a small stash of
recently freed and inactive bo.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-18 09:41:24 +01:00
Chris Wilson 77520641a3 i810: Replace XAAGet.*ROP() with local tables
The XAAGetPatternROP() and XAAGetCopyROP() functions were removed along
with the rest of XAA so we need to implement those tables locally.

Reported-by: Knut Petersen <Knut_Petersen@t-online.de>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-17 22:12:49 +01:00
Chris Wilson caef63e026 i810: Split xaa routines from common acceleration methods
Some of the routines in i810_accel.c are specific to XAA whilst others
are used elsewhere, for example in i810_dri.c. Therefore we have to be
selective over which ones we compile out without xaa.

Reported-by: Knut Petersen <Knut_Petersen@t-online.de>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-17 21:37:28 +01:00
Chris Wilson 53ff19f45a sna: Allow wedged CopyPlane to operate inplace on the destination
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-17 19:49:43 +01:00
Chris Wilson d4fa4d5494 sna: Allow inplace copies for wedged CopyArea
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-17 19:49:37 +01:00
Chris Wilson 217eeadf81 sna: Allow operation inplace to scanout whilst wedged
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-17 19:40:51 +01:00
Chris Wilson 40ff29480a sna: Tweak fast blt path
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-17 18:45:28 +01:00
Chris Wilson fce69c79c4 sna: prefer fbBlt over pixman_blt
It is currently much better optimised through memcpy.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-17 18:43:31 +01:00
Chris Wilson c29f96d508 sna/gen7: Bump the number of pixel shader threads for IVB GT2
Spotted-by: "Kilarski, Bernard R" <bernard.r.kilarski@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-17 17:29:21 +01:00
Chris Wilson 799bae9e8f sna/dri: Do not allow an exchange to take place on invalid buffers
If the SwapBuffers is called after we have resized a Window but before
the client has processed the Invalidate notification, then the
SwapBuffers will be referring to a pair of stale buffers. As the buffers
are no longer attached to the Pixmap, we can not simply exchange them.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-17 17:29:21 +01:00
Chris Wilson 067aeaddb8 sna: Rebalance choice of GPU vs CPU bo
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-17 17:29:21 +01:00
Chris Wilson 7ebeea3f5c sna: Avoid the CPU bo readback for render paths
As we exclude using the CPU bo if there is overlapping GPU damage, we
can forgo the call to transfer the damage.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-17 17:29:11 +01:00
Chris Wilson ed8c729ed0 sna: Catch the short-circuit path for clearing clear on move-to-gpu as well
I thought the short-circuit path was only taken when already clear; I
was wrong.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-17 11:21:59 +01:00
Chris Wilson 359b9cc82d sna: Limit the use of snoopable buffers to read/write uploads
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-17 10:26:27 +01:00
Chris Wilson 4f21dba6ee sna: Only drop the clear flag when writing to the GPU pixmap
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-17 09:26:46 +01:00
Chris Wilson fbfbbee828 sna: Fix glyph DBG to include clip extents and actual glyph origin
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-17 09:20:21 +01:00