4697 Commits

Author SHA1 Message Date
Chris Wilson df148c9621 sna: Limit the tile size for uploading into large pixmaps
As we may have a constrained aperture, we need to be careful not to
exceed our resources limits when uploading the pixel data. (For example,
fitting two of the maximum bo into a single batch may fail due to
fragmentation of the GATT.) So be cautious and use more tiles to reduce
the size of each individual batch.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-29 20:12:39 +00:00
Chris Wilson e1e67e8f39 sna: Fix the "trivial" fix to improve error handling
The logic was just backwards and we tried to upload a shadowless GPU
pixmap.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-29 15:43:42 +00:00
Chris Wilson d3fb1e1e89 sna: Handle GPU creation failure when uploading subtexture
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-29 14:47:12 +00:00
Chris Wilson 518a99ea34 sna: Always create a GPU bo for copying from an existent source GPU bo
Make sure we prevent the readback of an active source GPU bo by always
preferring to do the copy on the GPU if the data is already resident.
This fixes the second regression from e583af9cc, (sna: Experiment with
creating large objects as CPU bo).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-29 14:47:12 +00:00
Chris Wilson 624d9843ab sna: Ignore map status and pick the first inactive bo for reuse
This fixes the performance regression introduced with e583af9cca,
(sna: Experiment with creating large objects as CPU bo), as we ended up
creating fresh bo and incurring setup and thrashing overhead, when we
already had plenty cached.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-29 14:47:12 +00:00
Chris Wilson 5c6255ba2f sna: Determine whether to use a partial proxy based on the pitch
On gen4+ devices the maximum render pitch is much larger than is simply
required for the maximum coordinates. This makes it possible to use
proxy textures as a subimage into the oversized texture without having
to blit into a temporary copy for virtually every single bo we use.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-29 14:47:12 +00:00
Chris Wilson 65466f8626 sna: Allow ridiculously large bo, up to half the total GATT
Such large bo place extreme stress on the system, for example trying to
mmap a 1GiB bo into the CPU domain currently fails due to a kernel bug. :(
So if you can avoid the swap thrashing during the upload, the ddx can now
handle 16k x 16k images on gen4+ on the GPU. That is fine until you want
two such images...

The real complication comes in uploading (and downloading) from such
large textures as they are too large for a single operation with
automatic detiling via either the BLT or the RENDER ring. We could do
manual tiling/switching or, as this patch does, tile the transfer in
chunks small enough to fit into either pipeline.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-29 14:47:12 +00:00
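
The chunked transfer described in 65466f8626 can be illustrated with a minimal sketch. It is not the driver's code: `tile_boxes` and `max_tile` are hypothetical names, and the single per-edge limit stands in for whatever size the BLT or RENDER pipeline can actually handle in one operation.

```python
def tile_boxes(width, height, max_tile):
    """Yield (x, y, w, h) tiles that cover a width x height region.

    max_tile is an assumed limit on each tile edge; the real limits
    depend on the hardware generation and are not given in the log.
    """
    if max_tile <= 0:
        raise ValueError("max_tile must be positive")
    for y in range(0, height, max_tile):
        h = min(max_tile, height - y)      # last row of tiles may be shorter
        for x in range(0, width, max_tile):
            w = min(max_tile, width - x)   # last column may be narrower
            yield (x, y, w, h)


# A 16k x 16k image split with an assumed 8192-pixel tile limit.
tiles = list(tile_boxes(16384, 16384, 8192))
```

Each tile can then be uploaded or downloaded as its own operation, so no single transfer exceeds the assumed limit.
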
Chris Wilson 03211f4b0b sna: Guard against the upload buffer growing past the maximum bo size
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-27 23:20:36 +00:00
Chris Wilson 2afd49a284 sna: Limit inplace upload buffers to maximum mappable size
References: https://bugs.freedesktop.org/show_bug.cgi?id=45323
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-27 23:11:07 +00:00
Chris Wilson 8f4bae01e3 sna/video: Ensure the video pixmap is on the GPU
The presumption that the pixmap is the scanout and so will always be
pinned is false if there is a shadow or under a compositor. In those
cases, the pixmap may be idle and so the GPU bo reaped. This was
compounded by the fact that the video path did not mark the pixmap as busy. So
whilst watching a video under xfce4 with compositing enabled (has to be
a non-GL compositor) the video would suddenly stall.

Reported-by: Paul Neumann <paul104x@yahoo.de>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45279
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-27 20:34:30 +00:00
Chris Wilson d02bd80b2f sna: Use a proxy rather than a temporary bo for too-tall but thin targets
If the render target is thin enough to fit within the 3D pipeline, but is
too tall, we can fudge the address of the origin and coordinates to fit
within the constraints of the pipeline.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-27 20:34:30 +00:00
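
A rough sketch of the origin fudging described in d02bd80b2f, assuming a linear (untiled) buffer with a fixed byte pitch. The function names and the `max_height` parameter are hypothetical, and real hardware imposes alignment rules on the base address that are ignored here.

```python
def proxy_for_rows(pitch, y0, y1, max_height):
    """Return (byte_offset, height) for a proxy covering rows [y0, y1).

    Assumes a linear layout where row y starts at y * pitch bytes.
    Returns None if the band is still too tall for the assumed limit.
    """
    height = y1 - y0
    if height <= 0:
        raise ValueError("empty row range")
    if height > max_height:
        return None
    return (y0 * pitch, height)


def to_proxy_coords(x, y, y0):
    """Translate a coordinate in the full target into proxy space."""
    return (x, y - y0)
```

Rendering then addresses the proxy at `byte_offset` and uses the translated coordinates, so the operation stays within the assumed height limit without a temporary copy.
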
Chris Wilson ea433995a3 sna: Experiment with a partial source
If the source is thin enough such that the pitch is within the sampler's
constraints and the sample size is small enough, just fudge the origin
of the bo such that it can be sampled.

This avoids having to create a temporary bo and use the BLT to extract
it and helps, for example, firefox-asteroids which uses a 64x11200
texture atlas.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-27 19:34:39 +00:00
Chris Wilson ad910949be sna: Mark diagonal lines as partial write
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-27 18:37:39 +00:00
Chris Wilson b9c83e0b2c sna/video: Add some DBG messages to track the error paths
References: https://bugs.freedesktop.org/show_bug.cgi?id=45279
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-27 18:25:01 +00:00
Chris Wilson 45d831c8b1 sna: Consolidate routines to choose destination bo
Combine the two very similar routines that decided if we should render
into the GPU bo, CPU bo or shadow pixmap into a single function.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-27 18:25:01 +00:00
Chris Wilson 6402e7f119 sna: Ensure that we have a source bo for tiled fills
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-27 18:25:00 +00:00
Chris Wilson 6c5fb84f4d sna/glyphs: Check that we attached to the cache pixmaps upon creation
If the hw is wedged, then the pixmap creation routines will return an
ordinary unattached pixmap. The code presumed that it would only return
a pixmap with an attached bo, and so would segfault as it chased the
invalid pointer after a GPU hang and the server was restarted.
Considering that we already checked that the GPU wasn't wedged before we
started, this is just mild paranoia, but on a run-once piece of code.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-27 14:11:50 +00:00
Chris Wilson 86f1ae9164 sna/video: Add some more DBG breadcrumbs to the textured PutImage
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-27 10:26:11 +00:00
Chris Wilson ce1cae7f47 sna/video: Simplify the gen2/915gm check
And make the later check in put image match.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-27 10:12:33 +00:00
Chris Wilson 541908524f sna: Remove extraneous clipping from GetImage
The spec says that the requested region must be wholly contained within
the valid BorderClip for a Window or within the Pixmap or else a BadMatch
is thrown. Rely on this behaviour and do not perform the clipping ourselves.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-26 23:25:46 +00:00
Chris Wilson 7ff40b572e sna: Avoid fbBlt for the easy GetImage cases
From (i5-2520m):
  60000 trep @   0.6145 msec (  1630.0/sec): GetImage 500x500 square
To:
  60000 trep @   0.4949 msec (  2020.0/sec): GetImage 500x500 square

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-26 23:24:57 +00:00
Chris Wilson adb1320bba sna/gen2+: Include being unattached in the list of source fallbacks
If the source is not attached to a buffer (be it a GPU bo or a CPU bo),
a temporary upload buffer would be required and so it is not worth
forcing the target to the destination in that case (should the target
not be on the GPU already).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-26 16:05:48 +00:00
Chris Wilson b1f9415bf3 sna: Allow gen4+ to use larger GPU bo
As the blitter on gen4+ does not require fence registers, it is not
restricted to operating on large objects within the mappable aperture.
As we do not need to operate on such large GPU bo in place, we can relax
the restriction on the maximum bo size for gen4+ to allocate for use
with the GPU.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-26 14:41:56 +00:00
Chris Wilson d35b6955db sna: Prevent mapping through the GTT for large bo
If the bo is larger than a quarter of the aperture, it is unlikely that
we will be able to evict enough contiguous space in the GATT to
accommodate that buffer. So don't attempt to map them and use the
indirect access instead.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-26 14:36:19 +00:00
Chris Wilson 7c81bcd0c4 sna: Add FORCE_FALLBACK debugging hook for PutImage
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-26 13:35:45 +00:00
Chris Wilson 35c0ef586b sna/gen3: Use cpu bo if already in use
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-26 13:01:24 +00:00
Chris Wilson b76a6da3fa sna: Search the buckets above the desired size in the bo cache
It is preferable to reuse a slightly larger bo, than it is to create a
fresh one and map it into the aperture. So search the bucket above us as
well.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-26 13:01:24 +00:00
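
The bucket search in b76a6da3fa might look roughly like the sketch below. The log does not describe the driver's bucket scheme, so the power-of-two page buckets, the 4096-byte page size, and the names `bucket_of` and `find_cached` are all assumptions made for illustration.

```python
def bucket_of(size, page=4096):
    """Map a byte size to an assumed power-of-two bucket of pages."""
    pages = max(1, -(-size // page))   # ceiling division
    return (pages - 1).bit_length()    # bucket b holds up to 2**b pages


def find_cached(cache, size, extra_buckets=1):
    """Take a cached bo of at least `size` bytes, or return None.

    cache maps bucket index -> list of cached bo sizes (in bytes).
    The search covers the natural bucket and `extra_buckets` above it.
    """
    first = bucket_of(size)
    for b in range(first, first + extra_buckets + 1):
        entries = cache.get(b, [])
        for i, bo_size in enumerate(entries):
            if bo_size >= size:
                return entries.pop(i)
    return None
```

Limiting the search to one bucket above keeps the waste bounded while still avoiding the cost of creating and binding a fresh buffer.
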
Chris Wilson e2b8b1c145 sna: Apply any previous transformation when downsampling
In order to handle rotations and fractional offsets produced by the act
of downsampling, we need to compute the full affine transformation and
apply it to the vertices rather than attempt to fudge it with an integer
offset.

References: https://bugs.freedesktop.org/show_bug.cgi?id=45086
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-26 13:01:23 +00:00
Chris Wilson 352828ee59 sna: Tweak aperture thresholds for batch flushing
In order to more easily accommodate operations on large source CPU bo.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-26 12:55:27 +00:00
Chris Wilson cff6a1a2e4 sna: Use the cpu bo where possible as the source for texture extraction
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-26 12:55:22 +00:00
Chris Wilson e583af9cca sna: Experiment with creating large objects as CPU bo
Even on non-LLC systems if we can prevent the migration of such
objects, we can still benefit immensely from being able to map them into
the GTT as required.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-26 12:48:08 +00:00
Chris Wilson 55569272f7 sna: Apply the same migration flags for the dst alphamap as for the dst pixmap
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-26 12:48:07 +00:00
Chris Wilson 4a132ddbf0 sna: Correct offset for moving drawable regions to the CPU
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-26 12:48:07 +00:00
Chris Wilson 65164d90b7 sna/gen2+: Do not force use of GPU if the target is simply cleared
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-26 12:48:02 +00:00
Chris Wilson 307f493d76 sna: Map freshly created, unbound bo through the CPU
Take advantage of the fact that we know we will have to clflush the unbound bo
before use by the GPU and populate it inplace.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-26 12:37:07 +00:00
Chris Wilson d785bb7df0 sna: GetImage is allowed to read a window's border
We need to adjust the clip to include the border pixels when migrating
damage from the backing pixmap. This also requires relaxing the
constraint that a read must be within the drawable.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-25 16:46:58 +00:00
Chris Wilson 36425ba49e sna: Round up buffer allocations when downsampling
The pathological case being nx1 or 1xm resulting in an illegal allocation
request of 0 bytes.

One such example is
  wolframalpha.com: x = (200 + x) / 100
which generates an approximately 8500x1 image and so needs downscaling
to fit in the render pipeline on all but IvyBridge. Bring on Ivy!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-25 13:52:27 +00:00
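
The zero-byte allocation in 36425ba49e comes from truncating division. A minimal sketch of the rounding fix follows, assuming an integer downsampling factor; `downsampled_size` is a hypothetical helper, not a function from the driver.

```python
def downsampled_size(width, height, factor):
    """Return the downsampled (width, height), rounding each edge up.

    Truncating division would turn a 1-pixel edge into 0 and make the
    allocation empty; ceiling division keeps every edge at least 1.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    return (-(-width // factor), -(-height // factor))


# The roughly 8500x1 image from the commit message, with an assumed factor of 2.
size = downsampled_size(8500, 1, 2)
```

With truncation the height would be `1 // 2 == 0`, which is the illegal request the commit guards against.
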
Chris Wilson a2e83c6dcb sna: Silence compiler warning for a potential uninitialised return on error
sna_accel.c: In function 'sna_copy_plane':
sna_accel.c:5022:21: warning: 'ret' may be used uninitialized in this
function [-Wuninitialized]

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-25 11:23:22 +00:00
Chris Wilson 8d22a76506 sna: Run the miHandleExposures for no-op CopyPlane
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-25 09:36:20 +00:00
Chris Wilson 338941eda3 sna: Handle self-copies for CopyPlane
Prepare the source first as this has the dual benefit of letting us
decide how best to proceed with the op (on the CPU or GPU) and prevents
modification of the damage after we have chosen our preferred path.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-25 02:42:25 +00:00
Chris Wilson 2e8b398ca3 sna: Only shrink partial buffers that are being written to
Ignore inactive and mmapped buffers.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-25 01:47:06 +00:00
Chris Wilson b79252efaa sna: Apply source clipping to sna_copy_plane()
Ensure that the migration region is within bounds for both the source
and destination pixmaps.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-25 01:36:27 +00:00
Chris Wilson 46252bc7bc sna: Set the source clip for CopyArea fallback correctly
The source window is (src->x, src->y)x(src->width, src->height) in
pixmap space. However, we then need to use this to clip against the
destination region, and so we need to translate from the source
coordinate to the destination coordinate.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-25 01:31:34 +00:00
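
The coordinate translation described in 46252bc7bc can be sketched as follows. Boxes are `(x1, y1, x2, y2)` tuples with exclusive upper bounds, and `dx`, `dy` are assumed to be the destination origin minus the source origin; all names are hypothetical and not taken from the driver.

```python
def translate(box, dx, dy):
    """Shift a box by (dx, dy)."""
    x1, y1, x2, y2 = box
    return (x1 + dx, y1 + dy, x2 + dx, y2 + dy)


def intersect(a, b):
    """Return the overlap of two boxes, or None if they do not overlap."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    if x1 >= x2 or y1 >= y2:
        return None
    return (x1, y1, x2, y2)


def clip_dst_to_source(dst_box, src_extents, dx, dy):
    """Clip a destination box against the source extents.

    The source extents are in source space, so they are moved into
    destination space before the intersection is taken.
    """
    return intersect(dst_box, translate(src_extents, dx, dy))
```

Intersecting without the translation would compare boxes in two different coordinate spaces, which is the mistake the commit message describes.
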
Chris Wilson ae6d3a3117 sna: Print source and destination regions for CopyArea fallback for DBG
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-25 01:18:11 +00:00
Chris Wilson dd5e90adfc sna: Clip GetImage to drawable so that damage migration is within bounds
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-25 01:17:49 +00:00
Chris Wilson b1fba5e853 sna: Clear GPU damage first when moving a clear pixmap to the CPU
This allows us to discard any busy GPU or CPU bo when we know we are
going to clear the shadow pixmap afterwards.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-25 01:17:34 +00:00
Chris Wilson 5ad95d6666 sna: Reduce number of reads required to inspect timers
By using the information provided by select at wakeup.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-24 22:11:20 +00:00
Chris Wilson aae19cbc5d sna: Only reset devPrivate.ptr if owned by the CPU bo when freeing
If the pixmap is mapped to the GPU bo, we should continue to use the
current mapping rather than revoke it. Otherwise if we write to the GPU
bo inplace, thereby discarding the CPU bo, we set the pointer we are
about to copy to, to NULL.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-24 20:01:27 +00:00
Chris Wilson 5312ee90ad sna: mark the pixmap as no longer clear after rendering video
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-24 19:02:56 +00:00
Chris Wilson 69d3fc91f4 sna: Set up GC for general FillArc to FillSpans callback
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-24 18:54:08 +00:00