4484 Commits

Author SHA1 Message Date
Chris Wilson 87e6dcb3b0 sna: Don't call RegionIntersect for the trivial PutImage
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-12 02:21:29 +00:00
Chris Wilson 1bd6665093 sna: Disable the min alignment workaround
Allow all generations to use the minimum alignment of 4 bytes again as
it appears to be working for me... Or at least what remains broken seems
to be irrespective of this alignment.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-12 02:21:29 +00:00
Chris Wilson 112b895926 sna: Prevent shrinking a partial buffer stolen for a read
If we reuse a partial buffer for a read, we cannot shrink it during
upload to the device as we do not track how many bytes we actually need
for the read operation.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-12 02:21:26 +00:00
Chris Wilson b09ae4c203 sna: Don't drop expired partial bo immediately, wait until dispatch
As the partial bo may be coupled into the execlist, we may as well hang
onto the memory to service the next partial buffer request until it
expires in the next dispatch.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-12 02:16:49 +00:00
Chris Wilson a3c42565a8 sna: Store damage-all in the low bit of the damage pointer
Avoid the function call overhead by inspecting the low bit to see if it
is all-damaged already.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-12 02:16:49 +00:00
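
The low-bit trick described in the commit above can be sketched as follows. This is only an illustration under stated assumptions, not the driver's actual code: `struct damage`, `DAMAGE_ALL_BIT` and the three helper names are hypothetical, and the sketch assumes the damage structure is at least 2-byte aligned so that the bottom bit of its address is always free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's damage structure. */
struct damage {
    int dirty_boxes;
};

/* Assumption: allocations are at least 2-byte aligned, so bit 0 is unused. */
#define DAMAGE_ALL_BIT ((uintptr_t)1)

/* Tag the pointer to record that the whole pixmap is damaged. */
static inline struct damage *damage_mark_all(struct damage *d)
{
    assert(((uintptr_t)d & DAMAGE_ALL_BIT) == 0);
    return (struct damage *)((uintptr_t)d | DAMAGE_ALL_BIT);
}

/* Test the flag by inspecting the pointer value, without dereferencing it. */
static inline bool damage_is_all(const struct damage *d)
{
    return ((uintptr_t)d & DAMAGE_ALL_BIT) != 0;
}

/* Strip the tag before dereferencing the pointer. */
static inline struct damage *damage_ptr(struct damage *d)
{
    return (struct damage *)((uintptr_t)d & ~DAMAGE_ALL_BIT);
}
```

The check becomes a single mask on a value already held in a register, which is the function-call saving the commit message refers to. Every dereference has to go through something like `damage_ptr()` first.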
Chris Wilson c64a9d0683 sna: Choose a stride for the indirect replacement
Don't blithely assume that the incoming bytes are appropriately aligned
for the destination buffer. Indeed we may be replacing the destination
bo with the shadow bytes out of another, larger, pixmap, in which case we
do need to create a stride that is appropriate for the upload and
perform the 2D copy.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-11 19:54:12 +00:00
Chris Wilson b82851e74d sna: Mark upload pixmaps as being wholly GPU damaged
So that subsequent code resists performing CPU operations with them
(after they have been populated.)

Marking both sides as wholly damaged breaks the rules, but should work
out so long as we check whether we can perform the operation within the
target damage first.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-11 15:54:16 +00:00
Chris Wilson 2a5ab05f16 sna: Use a minimum alignment of 64
We should be able to reduce this by disabling dual-stream mode of the
GPU (which we want to achieve anyway for 2D performance). Artefacts
in small uploads demonstrate that we fail to do so.

References: https://bugs.freedesktop.org/show_bug.cgi?id=44150
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-11 15:29:08 +00:00
Chris Wilson e94807759e sna/gen6: Special case spans with no transform
As the no transform is a special case of affine, we were attempting to
dereference the NULL transform in order to determine if it was a simple
no-rotation matrix. As the operation is extremely simple, add a special
case vertex program to speed it up.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-11 12:13:18 +00:00
Chris Wilson 0a5313900e sna: Explicitly retire the bo following a serialisation point
This is to keep the sanity checks in order, but conceptually should be
useful as well.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-11 12:10:18 +00:00
Chris Wilson 2add5991a7 sna: Mark the bo as no longer in the GPU domain after clearing needs_flush
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-11 11:35:42 +00:00
Chris Wilson fec7098571 sna: Add assertions to track requests
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-11 11:33:14 +00:00
Chris Wilson a93c93be76 sna/gen6: Add a vertex program for a simple (affine, no rotation) spans
I long for the day when this code is obsolete... Until then, this gives
a nice boost in the fishtank.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-11 00:44:27 +00:00
Chris Wilson 3cf5da1090 sna: Amalgamate small replacements into upload buffers
Similar to the standard io paths, try to reuse an upload buffer for a
small replacement pixmap.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-10 23:39:33 +00:00
Chris Wilson f0e3f6b5be sna: Check needs-flush status immediately upon destroy
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-10 22:12:41 +00:00
Chris Wilson b4ae6dbaed sna: Align the small upload buffers to 2 texels, and the pitch to dwords
References: https://bugs.freedesktop.org/show_bug.cgi?id=44150
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-10 21:25:28 +00:00
Chris Wilson 46f6c6917e sna: A partial read buffer is allowed to be in the GPU domain
As we can create the read buffer from an active cached bo, it may
already be in the GPU domain by the time we first finish it, so fix the
broken assertion.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-10 20:34:09 +00:00
Chris Wilson 3c26055639 sna: Shrink the partial upload buffers before compacting the batch
So that the relocation entries point into the contiguous surface/batch
and can be trivially fixed up.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-10 20:06:51 +00:00
Chris Wilson 7b077a4d3d sna: Make the check for a single fill-rectangle clearer before modifying damage
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-10 18:40:53 +00:00
Chris Wilson ca2a07adc4 sna: Release the stale GTT mapping after recreating the bo with new tiling
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-10 17:13:38 +00:00
Chris Wilson 8dd913fd3a sna: Add reminder about possible future tiling optimisations
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-10 17:08:16 +00:00
Chris Wilson 21948578d0 sna: Disable the inline xRGB to ARGB upload conversion
As we have to upload the dirty data anyway, setting the
alpha-channel to 0xff should be free. Not so for firefox-asteroids on
Atom at least.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-10 17:08:16 +00:00
Chris Wilson 87f73b0434 sna/gen[23]: Tile render fill to oversized bo
If we are forced to perform a render operation to a bo too large to fit
in the pipeline, copy to an intermediate and split the operation into
tiles rather than fallback.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-10 17:08:15 +00:00
Chris Wilson 2ccb31c5a4 sna: Shrink upload buffers
If we do not fill the whole upload buffer, we may be able to reuse a
smaller buffer that is currently bound in the GTT. Ideally, this will
keep our RSS trim.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-10 17:08:15 +00:00
Chris Wilson 572cc76be5 sna: Destroy the counter damage after a reduction-to-all
If, for instance, we reduce the GPU damage to all we know that there can
be no CPU damage even though it may still have a region with a list of
subtractions. Take advantage of this knowledge and cheaply discard that
damage without having to evaluate it.

This should prevent a paranoid assertion that there is no cpu damage
when discarding the CPU bo for an active pixmap.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-10 17:08:15 +00:00
Chris Wilson 4a255e1316 sna: Replace the free-inactive-gpu-bo with the generic code
The function was semantically equivalent to moving the pixmap to the CPU
for writing, so replace it with a call to the generic function.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-10 17:08:10 +00:00
Chris Wilson c1d403266a sna: Allow for xRGB uploads to replace their target bo
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-10 00:41:05 +00:00
Chris Wilson 406776cd95 sna: Rearrange buffer allocation to prefer llc bo over vmaps
If we can create snoopable bo, we prefer to use those as creating a vmap
forces a new bo creation increasing GTT pressure.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-10 00:25:14 +00:00
Chris Wilson b76865fa3d sna/gen2: Try to avoid creating a bo for solid colours
We try to use the diffuse/specular and only resort to using a texture
operation for convenience in the rare case of a solid mask.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-09 23:24:18 +00:00
Chris Wilson 981aae104a sna/gen2: Eliminate some switching between logic op and blend
If the new mode can be done either using a logic op or with the blend
unit, prefer the currently enabled unit.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-09 23:24:18 +00:00
Chris Wilson d65b7f9cf4 sna/blt: Rearrange to reduce an out-of-bounds copy to a clear
If we are asked to use the BLT, try to avoid triggering a context switch for
a trivial case where we sample outside of a NONE source and so can
reduce the operation to a clear.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-09 23:24:18 +00:00
Chris Wilson 09e54c5536 sna/gen2: Add poor-man's linear gradient support
Convert the linear gradient to a texture ramp and compute the texture
coordinates in the standard manner.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-09 23:24:18 +00:00
Chris Wilson 6c70558ae7 sna: mark the cpu bo used for the upload buffer as in CPU domain
For correctness we need to inform GEM of the change of domain for the
buffer so that it knows to invalidate any caches when it is next used by
the GPU.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-08 22:48:11 +00:00
Chris Wilson 9ec31af029 sna/io: Combine small uploads into single writes
For a small update, try and amalgamate the upload buffer.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-08 17:34:48 +00:00
Chris Wilson 4db1bb3fd8 Removed deprecated xf86PciInfo.h includes
The driver should and does provide its own PCI-IDs.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-08 17:34:48 +00:00
Chris Wilson 54232d1a5d sna: Add ricer stripes to memcpy_xor
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-08 17:34:48 +00:00
Chris Wilson c037b4f542 sna: Tune cache size for cpu bo cache
This helps SNB on cairo-traces that utilize lots of temporary uploads
(rasterised sources and masks for instance), but comes at a cost of
regressing others...

In order to counter the regression from increasing the GTT cache size,
the CPU/GTT vma cache are split and accounted separately.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-08 17:34:48 +00:00
Chris Wilson 26042b2660 sna: Bubble sort the partial buffer list back into order after trimming padding
After reducing the used size in the partial buffer, we need to resort
the list to maintain the list in decreasing amount of available space.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-08 14:34:01 +00:00
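
The re-sorting step described in the commit above might look roughly like this. The sketch makes several assumptions: it uses a plain array rather than whatever list type the driver uses, and `struct partial`, `partial_avail` and `partial_resort` are hypothetical names. It relies on only one entry being out of place, so a single bubbling pass is enough.

```c
#include <stddef.h>

/* Hypothetical partial buffer: total size and bytes already used. */
struct partial {
    unsigned size;
    unsigned used;
};

static unsigned partial_avail(const struct partial *p)
{
    return p->size - p->used;
}

/*
 * bufs[] is sorted by decreasing available space, except that bufs[i]
 * has just had its used size trimmed and so may now have more room
 * than its predecessors. Bubble it towards the front until order is
 * restored and return its new index.
 */
static size_t partial_resort(struct partial *bufs, size_t i)
{
    while (i > 0 && partial_avail(&bufs[i]) > partial_avail(&bufs[i - 1])) {
        struct partial tmp = bufs[i - 1];
        bufs[i - 1] = bufs[i];
        bufs[i] = tmp;
        i--;
    }
    return i;
}
```

Because trimming can only increase the available space of one entry, the entry can only move towards the front, which is why a one-directional bubble pass suffices here.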
Chris Wilson 3f7ea44bf1 sna/gen[67]: Hook into the clear operation for glyph masks
Allow SandyBridge to specialise its clear routine to reduce the number
of ring switches. It may be interesting to specialise the clear routines
even further and use the special render clear commands...

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-08 14:34:01 +00:00
Chris Wilson 803ac5c6b9 sna/trapezoids: Don't risk stalling for inplace SRC trapezoids
Optimistically we would replace the GPU damage with the new set of
trapezoids. However, if any partial damage remains then the next
operation which is often to composite another layer of trapezoids (for
complex clipmasks) using IN will then stall.

This fixes a regression in firefox-fishbowl (and lesser regressions
throughout the cairo-traces).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-08 01:48:40 +00:00
Chris Wilson 0229841c0d sna: Do not upload an untiled GPU bo if we already have a CPU bo
Continuing the tuning for sna_copy_boxes.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-08 01:48:20 +00:00
Chris Wilson 42eb9b7c4b sna: Trim usage of vmapping
The first, and likely only, goal is to support SHMPixmap efficiently
(and without compromising SHMImage!) which we want to preserve as vmaps
and never create a GPU bo. For all other use cases, we will want to
create snoopable CPU bo ala the LLC buffers on SandyBridge.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-08 01:08:48 +00:00
Chris Wilson e52f020493 sna: Do not move-to-gpu for sna_copy_boxes if we have a cpu bo
We trade-off the extra copy in the hope that as we haven't used the GPU
bo before then, we won't need it again.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-08 01:08:47 +00:00
Chris Wilson c05e90aa99 sna: Missing chunks from last commit
And update the check for reusing the blit!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-07 18:23:43 +00:00
Chris Wilson 292aebfcdc sna: Prevent reuse of scanlines after the buffer is destroyed
Once the buffer is destroyed, it may be reallocated with a new pitch. We
could track handle and pitch, but it is easier to simply restart the
blit after the buffer is freed.

References: https://bugs.freedesktop.org/show_bug.cgi?id=44277
References: https://bugs.freedesktop.org/show_bug.cgi?id=44555
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-07 18:13:56 +00:00
Chris Wilson d7d07d1df3 sna: Pad upload buffers to ensure there are an even number of rows
One restriction common to all generations is that samplers access pairs
of rows and so we need to pad the buffer to accommodate access to that
second row. Do so unconditionally along paths that may be used by the
render pipeline.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-07 18:11:35 +00:00
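
As a rough illustration of the padding rule in the commit above, the size of an upload buffer could be computed as below. The names `align_up` and `upload_buffer_size` are hypothetical, and the dword pitch alignment is an assumption borrowed from the "pitch to dwords" commit elsewhere in this log, not a statement of what this particular commit does.

```c
#include <stddef.h>
#include <stdint.h>

/* Round v up to a multiple of a; a must be a power of two. */
static inline uint32_t align_up(uint32_t v, uint32_t a)
{
    return (v + a - 1) & ~(a - 1);
}

/*
 * Bytes needed for a width x height upload at bpp bits per pixel.
 * The row count is rounded up to an even number so that a sampler
 * fetching pairs of rows never reads past the end of the buffer.
 */
static inline size_t upload_buffer_size(uint32_t width, uint32_t height,
                                        uint32_t bpp)
{
    uint32_t row_bytes = (width * bpp + 7) / 8;
    uint32_t pitch = align_up(row_bytes, 4); /* assumed dword pitch */
    uint32_t rows = align_up(height, 2);     /* even number of rows */

    return (size_t)pitch * rows;
}
```

For example, a 3x3 upload at 32 bits per pixel has a 12-byte pitch and is padded from 3 to 4 rows, giving 48 bytes.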
Chris Wilson e2ad0f6272 sna/blt: Amalgamate many PolyFillRect of single boxes
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-06 18:08:35 +00:00
Chris Wilson c085de905c sna: Also mark a bo created by force-to-gpu as being all-damaged
Similar to the action taken into move-to-gpu so that we forgo the
overhead of damage tracking when the initial act of creation is on the
render paths.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-06 17:50:01 +00:00
Chris Wilson 9f1935bb4e sna: Support performing alpha-fixup on the source
By inlining the swizzling of the alpha-channel we can support BLT copies
from an alpha-less pixmap to an alpha-destination.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-06 17:50:01 +00:00
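
A CPU-side sketch of the alpha fixup idea in the commit above: while copying 32-bit xRGB pixels, force the undefined top byte to fully opaque so the result is valid ARGB. The function name `copy_with_alpha_fixup` is hypothetical, and this loop only illustrates the per-pixel operation; the commit itself is about performing the fixup as part of a BLT copy.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Copy count xRGB pixels from src to dst, setting the alpha byte of
 * every destination pixel to 0xff. The colour channels are unchanged.
 */
static void copy_with_alpha_fixup(const uint32_t *src, uint32_t *dst,
                                  size_t count)
{
    for (size_t i = 0; i < count; i++)
        dst[i] = src[i] | 0xff000000u;
}
```

Folding the OR into the copy avoids a second pass over the data just to fill in the alpha channel.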
Chris Wilson 141001df6c sna: always skip active search when requested to find an inactive bo
References: https://bugs.freedesktop.org/show_bug.cgi?id=44504
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-06 13:48:14 +00:00