4574 Commits

Author SHA1 Message Date
Chris Wilson 05f9764a88 sna/damage: Fast path singular regions
Mainly for consistency, so that we treat it like the other damage
addition functions.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-19 00:45:08 +00:00
Chris Wilson 96529e345d sna: Make sure we create a mappable GPU bo when streaming writes
If we decide to do the CPU fallback inplace on the GPU bo through a WC
mapping (because it is a large write-only operation), make sure that
the new GPU bo we create is not active and so will not^W^W is less likely
to cause a stall when mapped.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-19 00:45:08 +00:00
Chris Wilson efce896e1d sna: Check number of boxes to migrate during move-to-cpu
When reducing the damage we may find that it is actually empty and so
sna_damage_get_boxes() returns 0, be prepared.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-18 20:53:55 +00:00
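The check described in efce896e1d can be sketched roughly as follows. This is only an illustration, not the driver's code: struct box, struct damage, damage_get_boxes() and migrate_to_cpu() are hypothetical stand-ins. The only point is that the caller treats a box count of 0 as "nothing to migrate".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's box and damage types. */
struct box { int x1, y1, x2, y2; };

struct damage {
	const struct box *boxes;
	int count;
};

/* Hand back the box array and its length. The length may be 0,
 * for example when reducing the damage leaves nothing behind. */
static int damage_get_boxes(const struct damage *d, const struct box **out)
{
	assert(out != NULL);

	if (d == NULL || d->count == 0) {
		*out = NULL;
		return 0;
	}

	*out = d->boxes;
	return d->count;
}

/* Return the number of pixels that would be copied back to the CPU. */
static long migrate_to_cpu(const struct damage *d)
{
	const struct box *boxes;
	int n = damage_get_boxes(d, &boxes);
	long pixels = 0;

	if (n == 0)
		return 0; /* empty damage: be prepared, nothing to copy */

	for (int i = 0; i < n; i++)
		pixels += (long)(boxes[i].x2 - boxes[i].x1) *
			  (boxes[i].y2 - boxes[i].y1);

	return pixels;
}
```

With two boxes of 4x4 and 2x3 pixels, migrate_to_cpu() would report 22 pixels, and an empty damage yields 0 without touching the box array.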
Chris Wilson 334f3f70a8 sna/gen3: Set the batch mode for emitting video state
The lack of kgem_set_mode() here is causing some recently added
assertions to fail.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-18 20:09:26 +00:00
Chris Wilson 76203b7070 sna: Amalgamate writes based on the total number of bytes written
Cachelines will only be dirtied for the bytes accessed so a better
metric would be based on the total number of pages brought into the TLB
and the total number of cachelines used. Base the decision on whether
to try and amalgamate the upload with others on the number of bytes
copied rather than the overall extents.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-18 18:49:42 +00:00
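As a rough illustration of the metric described in 76203b7070, the decision can be driven by the sum of bytes in each box rather than by the bounding extents. The names and the 4096-byte threshold below are assumptions made for this sketch and are not taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical box type: x2/y2 are exclusive. */
struct box { int x1, y1, x2, y2; };

/* Assumed threshold, chosen only for this sketch. */
#define UPLOAD_AMALGAMATE_MAX_BYTES 4096

/* Sum the bytes actually copied, box by box. */
static size_t boxes_total_bytes(const struct box *boxes, int n,
				int bytes_per_pixel)
{
	size_t total = 0;

	assert(bytes_per_pixel > 0);

	for (int i = 0; i < n; i++) {
		size_t w = (size_t)(boxes[i].x2 - boxes[i].x1);
		size_t h = (size_t)(boxes[i].y2 - boxes[i].y1);
		total += w * h * (size_t)bytes_per_pixel;
	}

	return total;
}

/* Small uploads are worth combining with others, however far
 * apart the individual boxes are. */
static bool should_amalgamate(const struct box *boxes, int n,
			      int bytes_per_pixel)
{
	return boxes_total_bytes(boxes, n, bytes_per_pixel) <=
	       UPLOAD_AMALGAMATE_MAX_BYTES;
}
```

Two 8x8 boxes at opposite corners of a pixmap have huge extents but only 512 bytes of payload at 4 bytes per pixel, so under this metric they would still be amalgamated.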
Chris Wilson 470741e84c sna: Debug uploads
All of the asserts and debug options that led me to believe that the
tiling was completely screwy for some writes.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-18 18:49:42 +00:00
Chris Wilson ab387a89cf sna: Update bo->tiling during search_linear_cache
search_linear_cache() was updated to track the first good match whilst it
continued to search for a better match. This resulted in the first good
bo being modified and a record of those modifications lost, in
particular the change in tiling.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-18 18:49:42 +00:00
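One reading of the problem in ab387a89cf is sketched below: the search remembers the first acceptable bo while it looks for a better one, and whatever change is made to the bo that is finally returned must be recorded on that bo. The struct, the tiling constants and search_linear() are hypothetical and much simpler than the real cache.

```c
#include <assert.h>
#include <stddef.h>

enum { TILING_NONE = 0, TILING_X = 1 };

/* Hypothetical, heavily simplified buffer object. */
struct bo {
	size_t size;
	int tiling;
};

/* Find a bo of at least `size` bytes for linear use. An already
 * untiled bo is the best match; failing that, fall back to the
 * first tiled bo that was large enough. */
static struct bo *search_linear(struct bo *cache, int n, size_t size)
{
	struct bo *first = NULL;

	assert(cache != NULL || n == 0);

	for (int i = 0; i < n; i++) {
		struct bo *bo = &cache[i];

		if (bo->size < size)
			continue;

		if (bo->tiling == TILING_NONE)
			return bo; /* perfect match, nothing to change */

		if (first == NULL)
			first = bo; /* remember it, but do not modify it yet */
	}

	/* Only the bo we hand back is changed, and the change is
	 * recorded in bo->tiling so that later users see it. */
	if (first != NULL)
		first->tiling = TILING_NONE;

	return first;
}
```

Candidates that were merely inspected keep their original tiling, so the cache's view of them stays accurate.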
Chris Wilson 4b893ab081 sna: Remove defunct debugging option
FORCE_GPU_ONLY now has no effect except for marking the initial pixmap
as all-damaged on the GPU, and so not testing the paths for which it was
originally introduced.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-18 18:49:42 +00:00
Chris Wilson 965586544a sna/gen6: Don't assume that a batch mode implies a non-empty batch
Just in case we set a mode then fail to emit any dwords. Sounds
inefficient and woe betide the culprit when I find it...

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-18 18:49:28 +00:00
Chris Wilson d2e0575036 sna: Fix some tracking of to-be-flushed dri pixmaps
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-18 18:39:29 +00:00
Chris Wilson 1ad5320fd4 sna: Add valgrind markup for tracking CPU mmaps
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-18 15:39:35 +00:00
Chris Wilson f3da610ead sna: Prevent switching rings with render disabled
We fudge forced use of the BLT ring unless we install a render backend
and so we must also prevent the ring from being reset when the GPU is
idle. Therefore we make handling the ring status a backend function.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-18 15:27:40 +00:00
Chris Wilson 6d31cb2d94 sna: Restore use of shadow pixmaps by default without RENDER support
If we do not have access to an accelerated render backend, only create
GPU buffers for the scanout and use an accelerated blitter for
upload/download and operating inplace on the scanout.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-18 13:43:20 +00:00
Chris Wilson 15a150579c intel: Trivially remove a piece of XAA dependency for shadow
The wolves are gathering at the door baying for the removal of XAA from
Xorg.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-18 10:27:17 +00:00
Chris Wilson 850495f956 sna: Fix increment of damage boxes after updating for rectangles
Found by valgrind:
==13639== Conditional jump or move depends on uninitialised value(s)
==13639==    at 0x5520B1E: pixman_region_init_rects (in /usr/lib/x86_64-linux-gnu/libpixman-1.so.0.24.0)
==13639==    by 0x89E6ED7: __sna_damage_reduce (sna_damage.c:489)
==13639==    by 0x89E7FEC: _sna_damage_contains_box (sna_damage.c:1161)
==13639==    by 0x89CFCD9: sna_drawable_use_gpu_bo (sna_damage.h:175)
==13639==    by 0x89D52DA: sna_poly_segment (sna_accel.c:6130)
==13639==    by 0x21F87E: damagePolySegment (damage.c:1096)
==13639==    by 0x1565A2: ProcPolySegment (dispatch.c:1771)
==13639==    by 0x159FB0: Dispatch (dispatch.c:437)
==13639==    by 0x1491D9: main (main.c:287)
==13639==  Uninitialised value was created by a heap allocation
==13639==    at 0x4028693: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==13639==    by 0x89E6BFB: _sna_damage_create_boxes (sna_damage.c:205)
==13639==    by 0x89E78F0: _sna_damage_add_rectangles (sna_damage.c:327)
==13639==    by 0x89CD32D: sna_poly_fill_rect_blt.isra.65 (sna_damage.h:68)
==13639==    by 0x89DE23F: sna_poly_fill_rect (sna_accel.c:8366)
==13639==    by 0x21E9C8: damagePolyFillRect (damage.c:1309)
==13639==    by 0x26DD3F: miPaintWindow (miexpose.c:674)
==13639==    by 0x18370A: ChangeWindowAttributes (window.c:1553)
==13639==    by 0x154500: ProcChangeWindowAttributes (dispatch.c:696)
==13639==    by 0x159FB0: Dispatch (dispatch.c:437)
==13639==    by 0x1491D9: main (main.c:287)
==13639==

Use 'count' everywhere for consistency.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-17 23:12:49 +00:00
Chris Wilson 4b5c9affd4 sna: Restore original shadow pointer before uploading CPU damage
Detected by valgrind:
==22012== Source and destination overlap in memcpy(0xd101000, 0xd101000, 783360)
==22012==    at 0x402A180: memcpy (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==22012==    by 0x89BD4ED: memcpy_blt (blt.c:209)
==22012==    by 0x89F2921: sna_write_boxes (sna_io.c:364)
==22012==    by 0x89CFABF: sna_pixmap_move_to_gpu (sna_accel.c:1900)
==22012==    by 0x89F49B0: sna_render_pixmap_bo (sna_render.c:571)
==22012==    by 0x8A268CE: gen5_composite_picture (gen5_render.c:1908)
==22012==    by 0x8A29B8A: gen5_render_composite (gen5_render.c:2252)
==22012==    by 0x89E6762: sna_composite (sna_composite.c:485)
==22012==    by 0x21D3C3: damageComposite (damage.c:569)
==22012==    by 0x215963: ProcRenderComposite (render.c:728)
==22012==    by 0x159FB0: Dispatch (dispatch.c:437)
==22012==    by 0x1491D9: main (main.c:287)
==22012==

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-17 23:12:49 +00:00
Eugeni Dodonov bbd6c81236 sna: check for LLC support
Instead of checking for CPU generation, use the libdrm-provided
I915_PARAM_HAS_LLC instead.

v2: use a define check to verify if we have I915_PARAM_HAS_LLC.

Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-17 18:40:04 +00:00
Chris Wilson e4efde920b sna: Track whether damage is a complete representation of the dirt
The previous commit undoes a premature optimisation that assumed that
the current damage captured all pixels written. However, it happens to
be a useful optimisation along that path (tracking upload of partial
images), so add the necessary bookkeeping that watches for when the union
of cpu and gpu damage is no longer the complete set of all pixels
written: that is, if we migrate from one pixmap to the other, the
undamaged region goes untracked. We also take advantage of whenever we
damage the whole pixmap to restore knowledge that our tracking of all
pixels written is complete.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-17 18:23:43 +00:00
Chris Wilson a9b705f9a7 sna: Mark GPU as all-damaged discarding the CPU bo to prevent stalls
If we discard the CPU bo, we lose knowledge of whatever regions had been
initialised but are no longer dirty on the GPU, and instead must assume
that the entirety of the GPU bo is dirty.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-17 12:58:43 +00:00
Chris Wilson 9d631e26d7 sna: Mark the freshly allocated CPU bo as in the CPU domain
As we immediately use it after creation, we need to inform GEM of the
domain transfer.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-17 12:58:03 +00:00
Chris Wilson dfbf02b877 sna: Add some DBG breadcrumbs to put_image upload paths
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-17 12:57:55 +00:00
Chris Wilson da90afc32f sna: Add DBG breadcrumbs to gradient initialisation
Put some markers into the debug log as those functions create many
proxies causing a lot of debug noise.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-17 11:51:25 +00:00
Chris Wilson d14341cb22 sna: Add a render ring detiling read path
For SNB, in case you really, really want to use GPU detiling and not
incur the ring switch. Tweaking when to just mmap the target seems to
gain most anyway...

The ulterior motive is that this provides fallback paths for avoiding
the use of TILING_Y with GTT mmaps which is broken on 855gm.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-17 08:22:22 +00:00
Chris Wilson 3620f9ca45 sna: Cap pwrite buffer alignment to 64k
We only want to create huge pwrite buffers when populating the inactive
cache for mmapped uploads. In the absence of using mmap for upload, be
more conservative with the alignment value so as not to simply waste
valuable aperture and memory.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-17 00:24:16 +00:00
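The alignment cap in 3620f9ca45 could look something like the sketch below. The helper names, the use_mmap flag and the way the preferred alignment is passed in are assumptions for illustration; only the 64k cap comes from the commit subject.

```c
#include <assert.h>
#include <stddef.h>

/* Cap taken from the commit subject: 64k. */
#define MAX_PWRITE_ALIGN ((size_t)64 * 1024)

/* Round v up to a multiple of a, where a is a power of two. */
static size_t align_up(size_t v, size_t a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (v + a - 1) & ~(a - 1);
}

/* Choose the allocation size for an upload buffer. Large
 * alignments are kept only when the buffer will be mmapped
 * and can later feed the inactive cache; plain pwrite buffers
 * are capped so they do not waste aperture and memory. */
static size_t pwrite_buffer_size(size_t requested, size_t preferred_align,
				 int use_mmap)
{
	size_t align = preferred_align;

	if (!use_mmap && align > MAX_PWRITE_ALIGN)
		align = MAX_PWRITE_ALIGN;

	return align_up(requested, align);
}
```

For a 100000-byte request with a preferred alignment of 1 MiB, the capped path allocates 131072 bytes instead of a full 1 MiB.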
Chris Wilson b9f59b1099 sna: Correct adjustment of a stolen 2d read buffer
If we steal a write buffer for creating a pixmap for read back, then we
need to be careful as we will have set the used amount to 0 and then try
to incorrectly decrease by the last row. Fortunately, we do not yet have
any code that attempts to create a 2d buffer for reading.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-17 00:22:25 +00:00
Chris Wilson 6fc4cdafeb sna: Correct assertion for a partial read buffer
The batch may legitimately be submitted prior to the attachment of the
read buffer, if, for example, we need to switch rings. Therefore update
the assertion to only check that the bo remains in existence via either
a reference from the exec or from the user.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-16 21:36:04 +00:00
Chris Wilson 377f5e16cd sna/gen[45]: clear the state tracker before setting the formats
When backporting the patches from gen6, I didn't notice the memset that
came later, and this wasn't along the paths checked by rendercheck.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-16 16:09:57 +00:00
Chris Wilson 6387f2fb8a sna/gen[4567]: x1r5g5b5 is only a render target, not sampler
Whilst we can render to and blend with a depth 15 target, we cannot use
it as a texture with the sampling engine.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-16 15:39:42 +00:00
Chris Wilson 8b2bb66666 sna/gen6: Restore the non-pipelined op after every WM binding table update
The hw wants it as demonstrated by the '>' in KDE's menus. Why is it
always KDE that demonstrates coherency problems...

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-16 13:37:45 +00:00
Chris Wilson a11b22d172 sna/gen[23]: Remark the destination bo as dirty after flushing
One of the side-effects of emitting the composite state is that it
tags the destination surface as dirty as a result of the *forthcoming*
operation. So emitting the flush after emitting the composite state
clears that tag, so we need to restore it for future coherency.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-16 13:37:45 +00:00
Zhigang Gong 2f09363a6e uxa/glamor: Create glamor pixmap by default.
Creating native glamor pixmaps gives much better performance than
using the textured-drm pixmap, so this commit makes that the
default behaviour when configured to use glamor. Another advantage
of this commit is that we reduce the risk of encountering the
"incompatible region exists for this name" and the associated
render corruption. And since we now never intentionally allocate
a reusable pixmap we could just make all (intel_glamor) allocations
non-reusable without incurring too great an overhead.

A side effect is that those glamor pixmaps do not have a
valid BO attached to them and thus it fails to get a DRI drawable. This
commit also fixes that problem by adjusting the fixup_shadow mechanism
to recreate a textured-drm pixmap from the native glamor pixmap. I tested
this with mutter, and it works fine.

The performance gain from applying this patch is about 10% to 20%,
depending on the workload.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-16 10:49:21 +00:00
Chris Wilson fd4c139a39 sna: On LLC systems quietly replace all linear mmappings using the CPU
If the GPU and CPU caches are shared and coherent, we can use a cached
mapping for linear bo in the CPU domain with no penalty and so avoid the
penalty of using WC/UC mappings through the GTT (and any aperture
pressure). We presume that the bo for such mappings are indeed LLC
cached...

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-16 01:30:13 +00:00
Chris Wilson c20a729d0a sna/gen6: Force a batch submission after allocation failure during composite
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-16 01:30:13 +00:00
Chris Wilson 380a2fca3c sna: Optimise call to composite with single box
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-16 01:30:13 +00:00
Chris Wilson 9f89250de1 sna: Use the prefer-GPU hint for forcing allocation for core drawing
Similar to the render paths and simpler than the current look up tiling
method.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-16 01:30:13 +00:00
Chris Wilson 8652bf7a19 sna: Don't track an unmatching tiled bo when searching the linear cache
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-15 19:56:35 +00:00
Chris Wilson cc4b616990 sna/video: Increase the level of paranoia
In how many different ways can we check that the scanout is allocated
before we start decoding video?

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-15 19:55:50 +00:00
Chris Wilson 7f480ba02c sna: Tidy search through active bo cache
Perform the assertions upon cache consistency upfront, and tidy the
indentation.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-15 19:53:39 +00:00
Chris Wilson 6f7bc35d7f sna: Use indirect uploads rather than teardown existing CPU maps
Allow the snoopable CPU mapping to be used in place of the GTT map for
untiled bo.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-15 18:14:24 +00:00
Chris Wilson 475fa67ed3 sna: Fast path move-area-to-cpu when the pixmap is already on the cpu
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-15 17:30:00 +00:00
Chris Wilson 37ced44a53 sna: Be a little more lenient wrt damage migration if we have CPU bo
The idea being that they facilitate copying to and from the CPU, but
also we want to avoid stalling on any pixels held by the CPU bo.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-15 15:35:57 +00:00
Chris Wilson e3732a6f7f sna: Defer ring switching until after a period of idleness
Similar to the desire to flush the next batch after an overflow, we do
not want to incur any lag in the midst of drawing, even if that lag is
mitigated by GPU semaphores.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-15 11:06:59 +00:00
Chris Wilson 5df7147b09 sna: Restore the kgem_create_map() symbol
As the stub is exported to the driver even in the absence of vmapping.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-15 10:28:00 +00:00
Chris Wilson be53740c6f sna: Various DBG typos
Fix some misspellings inside the DBG messages.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-15 10:16:13 +00:00
Chris Wilson 349e9a7b94 sna: Prefer read-boxes inplace again
Using the gpu to do the detiling just incurs extra latency and an extra
copy, so go back to using a fence and GTT mapping for the common path.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-15 10:06:01 +00:00
Chris Wilson 09dc8b1b35 sna/gen7: Check reused source for validity
Be sure the mask picture has a valid format even though it points to the
same pixels as the valid source. And also be wary if the source was
converted to a solid, but the mask is not.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-15 09:48:53 +00:00
Chris Wilson d9871f01d8 sna/gen6: Check reused source for validity
Be sure the mask picture has a valid format even though it points to the
same pixels as the valid source. And also be wary if the source was
converted to a solid, but the mask is not.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-15 09:48:52 +00:00
Chris Wilson 1d6030322e sna/gen5: Check reused source for validity
Be sure the mask picture has a valid format even though it points to the
same pixels as the valid source. And also be wary if the source was
converted to a solid, but the mask is not.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-15 09:48:52 +00:00
Chris Wilson 0e4a24ef6c sna/gen4: Check reused source for validity
Be sure the mask picture has a valid format even though it points to the
same pixels as the valid source. And also be wary if the source was
converted to a solid, but the mask is not.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-15 09:48:52 +00:00
Chris Wilson ea299f2523 sna/gen3: Check reused source for validity
Be sure the mask picture has a valid format even though it points to the
same pixels as the valid source. And also be wary if the source was
converted to a solid, but the mask is not.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-15 09:48:52 +00:00