This allows us to implement backend-specific workarounds and to use the
more appropriate device-specific flushing.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Fixes for resubmitting batches after running out of space for vertex
buffers, and also a couple of trivial spans functions.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The gen4+ spec is a little misleading as it states that all BLT pitches
for the XY commands are in dwords. Apparently not, as the
upload/download functions were already demonstrating. This only became
apparent when accelerating core text routines to offscreen pixmaps, such
as composited windows.
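As an illustration only (the field layout is recalled from memory and the
helper is hypothetical, not the driver's actual emit path), the pitch must
be programmed in bytes:

#include <stdint.h>

/* Hypothetical helper assembling the BR13 dword of an XY_COLOR_BLT-style
 * command: the low 16 bits carry the destination pitch, and on gen4+ that
 * pitch is in bytes, not dwords.  Passing pitch >> 2 corrupts every row
 * after the first. */
static uint32_t xy_blt_br13(uint32_t pitch_in_bytes, uint32_t rop, int bpp)
{
    uint32_t br13 = pitch_in_bytes & 0xffff; /* bytes, NOT pitch >> 2 */
    br13 |= rop << 16;                       /* raster operation */
    if (bpp == 32)
        br13 |= 3 << 24;                     /* 32bpp colour depth */
    else if (bpp == 16)
        br13 |= 1 << 24;                     /* 16bpp (565) */
    return br13;
}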
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If we test the area to be drawn against the existing CPU damage and find
it is already on the CPU, we may as well continue to utilize that
damaged region.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If the region is busy on the GPU, or if we need to read the
destination, then we would incur penalties for trying to perform the
operation through the GTT. However, if we are simply streaming pixels to
an unbusy bo then we can do so inplace faster than computing the
corresponding GPU commands and uploading them.
Note: currently it is universally slower to use the GPU here (the
computation of the spans is too slow). However, that is only according
to micro-benchmarks; avoiding the readback is likely to be more
efficient in practice.
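Purely as an illustration of the heuristic (the names below are
hypothetical and not the driver's actual code path):

#include <stdbool.h>

/* Stream the spans in place through a GTT mapping only when doing so
 * cannot stall or force a readback. */
static bool prefer_inplace_spans(bool bo_is_busy, bool needs_dst_read)
{
    if (needs_dst_read)
        return false;   /* reads through a WC mapping are very slow */
    if (bo_is_busy)
        return false;   /* mapping would stall waiting for the GPU */
    return true;        /* write-only to an idle bo: go inplace */
}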
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If the region is busy on the GPU, or if we need to read the
destination, then we would incur penalties for trying to perform the
operation through the GTT. However, if we are simply streaming pixels to
an unbusy bo then we can do so inplace faster than computing the
corresponding GPU commands and uploading them.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The damage layer was detecting that we were asking it to accumulate a
degenerate box emanating from PolySegment, as the unclipped paths made
the fatal assumption that they would not need to filter out degenerate
boxes. However, a degenerate line becomes a point; does the same apply
to a degenerate segment?
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
A few of the create_elts() routines missed marking the damage as dirty,
so that if only part of the embedded box was used (i.e. the damage
contained fewer than 8 rectangles that needed to be included in the
damage region) then those were being ignored during migration and testing.
Reported-by: Clemens Eisserer <linuxhippy@gmail.com>
References: https://bugs.freedesktop.org/show_bug.cgi?id=44682
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If the write operation fills the entire clip, then we can demote and
possibly avoid having to read back the clip from the GPU, provided that
we do not need the destination data for an arithmetic operation or mask.
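A rough sketch of the test, using pixman regions purely for illustration
(the helper below is hypothetical, not the driver's code):

#include <stdbool.h>
#include <pixman.h>

/* The readback can be skipped only when the write covers every pixel of
 * the clip and the raster-op/mask does not need the old destination. */
static bool can_skip_readback(pixman_region16_t *write,
                              pixman_region16_t *clip,
                              bool needs_dst) /* e.g. ROP or mask reads dst */
{
    pixman_region16_t uncovered;
    bool covered;

    if (needs_dst)
        return false;

    /* uncovered = clip - write; empty means the write fills the clip */
    pixman_region_init(&uncovered);
    pixman_region_subtract(&uncovered, clip, write);
    covered = !pixman_region_not_empty(&uncovered);
    pixman_region_fini(&uncovered);

    return covered;
}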
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The damage tracking code asserts that it only handles clip regions.
However, sna_copy_area() was failing to ensure that its damage region
was being clipped by the source drawable, leading to out of bounds reads
during forced fallback.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If we decide to do the CPU fallback inplace on the GPU bo through a WC
mapping (because it is a large write-only operation), make sure that
the new GPU bo we create is not active and so will not^W^W is less likely
to cause a stall when mapped.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
When reducing the damage we may find that it is actually empty and so
sna_damage_get_boxes() returns 0; be prepared.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cachelines will only be dirtied for the bytes accessed, so a better
metric would be based on the total number of pages brought into the TLB
and the total number of cachelines used. Base the decision on whether
to try and amalgamate the upload with others on the number of bytes
copied rather than the overall extents.
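For illustration, the metric amounts to something like the following
(hypothetical helper and types, not the driver code):

#include <stdint.h>

struct box { int16_t x1, y1, x2, y2; };

/* Sum the bytes actually touched by each box rather than measuring the
 * bounding extents of the whole upload. */
static uint64_t bytes_copied(const struct box *boxes, int n, int cpp)
{
    uint64_t bytes = 0;
    int i;

    for (i = 0; i < n; i++)
        bytes += (uint64_t)(boxes[i].x2 - boxes[i].x1) *
                 (boxes[i].y2 - boxes[i].y1) * cpp;

    return bytes;   /* compare against an amalgamation threshold */
}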
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
All of the asserts and debug options that led me to believe that the
tiling was completely screwy for some writes.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
search_linear_cache() was updated to track the first good match whilst it
continued to search for a better match. This resulted in the first good
bo being modified and the record of those modifications being lost, in
particular the change in tiling.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
FORCE_GPU_ONLY now has no effect beyond marking the initial pixmap
as all-damaged on the GPU, and so it no longer tests the paths for which
it was originally introduced.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Just in case we set a mode then fail to emit any dwords. Sounds
inefficient and woe betide the culprit when I find it...
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
We fudge forced use of the BLT ring unless we install a render backend,
and so we must also prevent the ring from being reset when the GPU is
idle. Therefore we make handling the ring status a backend function.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If we do not have access to an accelerated render backend, only create
GPU buffers for the scanout and use an accelerated blitter for
upload/download and operating inplace on the scanout.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Found by valgrind:
==13639== Conditional jump or move depends on uninitialised value(s)
==13639== at 0x5520B1E: pixman_region_init_rects (in
/usr/lib/x86_64-linux-gnu/libpixman-1.so.0.24.0)
==13639== by 0x89E6ED7: __sna_damage_reduce (sna_damage.c:489)
==13639== by 0x89E7FEC: _sna_damage_contains_box (sna_damage.c:1161)
==13639== by 0x89CFCD9: sna_drawable_use_gpu_bo (sna_damage.h:175)
==13639== by 0x89D52DA: sna_poly_segment (sna_accel.c:6130)
==13639== by 0x21F87E: damagePolySegment (damage.c:1096)
==13639== by 0x1565A2: ProcPolySegment (dispatch.c:1771)
==13639== by 0x159FB0: Dispatch (dispatch.c:437)
==13639== by 0x1491D9: main (main.c:287)
==13639== Uninitialised value was created by a heap allocation
==13639== at 0x4028693: malloc (in
/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==13639== by 0x89E6BFB: _sna_damage_create_boxes (sna_damage.c:205)
==13639== by 0x89E78F0: _sna_damage_add_rectangles (sna_damage.c:327)
==13639== by 0x89CD32D: sna_poly_fill_rect_blt.isra.65
(sna_damage.h:68)
==13639== by 0x89DE23F: sna_poly_fill_rect (sna_accel.c:8366)
==13639== by 0x21E9C8: damagePolyFillRect (damage.c:1309)
==13639== by 0x26DD3F: miPaintWindow (miexpose.c:674)
==13639== by 0x18370A: ChangeWindowAttributes (window.c:1553)
==13639== by 0x154500: ProcChangeWindowAttributes (dispatch.c:696)
==13639== by 0x159FB0: Dispatch (dispatch.c:437)
==13639== by 0x1491D9: main (main.c:287)
==13639==
Use 'count' everywhere for consistency.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Detected by valgrind:
==22012== Source and destination overlap in memcpy(0xd101000, 0xd101000,
783360)
==22012== at 0x402A180: memcpy (in
/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==22012== by 0x89BD4ED: memcpy_blt (blt.c:209)
==22012== by 0x89F2921: sna_write_boxes (sna_io.c:364)
==22012== by 0x89CFABF: sna_pixmap_move_to_gpu (sna_accel.c:1900)
==22012== by 0x89F49B0: sna_render_pixmap_bo (sna_render.c:571)
==22012== by 0x8A268CE: gen5_composite_picture (gen5_render.c:1908)
==22012== by 0x8A29B8A: gen5_render_composite (gen5_render.c:2252)
==22012== by 0x89E6762: sna_composite (sna_composite.c:485)
==22012== by 0x21D3C3: damageComposite (damage.c:569)
==22012== by 0x215963: ProcRenderComposite (render.c:728)
==22012== by 0x159FB0: Dispatch (dispatch.c:437)
==22012== by 0x1491D9: main (main.c:287)
==22012==
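Purely as an illustration (this is not the actual fix, which lies in the
caller), the degenerate case can be skipped before reaching memcpy():

#include <string.h>

/* Valgrind flagged a copy where src == dst, which is a pure no-op, so
 * skip it rather than hand overlapping ranges to memcpy(). */
static void copy_row(void *dst, const void *src, size_t len)
{
    if (dst == src || len == 0)
        return;     /* nothing to do; avoids undefined behaviour */
    memcpy(dst, src, len);
}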
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Instead of checking for CPU generation, use the libdrm-provided
I915_PARAM_HAS_LLC.
v2: use a define check to verify that we have I915_PARAM_HAS_LLC.
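A minimal sketch of the query, assuming libdrm's xf86drm.h and i915_drm.h
are on the include path (the helper name is hypothetical):

#include <string.h>
#include <xf86drm.h>    /* drmIoctl */
#include <i915_drm.h>   /* I915_PARAM_HAS_LLC, when new enough */

/* Ask the kernel whether the GPU shares the LLC, falling back to
 * "unknown" when libdrm predates the parameter. */
static int gpu_has_llc(int fd)
{
#ifdef I915_PARAM_HAS_LLC
    struct drm_i915_getparam gp;
    int has_llc = 0;

    memset(&gp, 0, sizeof(gp));
    gp.param = I915_PARAM_HAS_LLC;
    gp.value = &has_llc;
    if (drmIoctl(fd, DRM_IOCTL_I915_GETPARAM, &gp) == 0)
        return has_llc;
#endif
    return -1;  /* unknown: caller falls back to a generation check */
}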
Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The previous commit undoes a premature optimisation that assumed that
the current damage captured all pixels written. However, it happens to
be a useful optimisation along that path (tracking the upload of partial
images), so add the necessary bookkeeping to watch for when the union
of CPU and GPU damage is no longer the complete set of all pixels
written; that is, if we migrate from one pixmap to the other, the
undamaged region goes untracked. We also take advantage of whenever we
damage the whole pixmap to restore the knowledge that our tracking of
all pixels written is complete.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If we discard the CPU bo, we lose knowledge of whatever regions had been
initialised but were not dirty on the GPU, and must instead assume that
the entirety of the GPU bo is dirty.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Put some markers into the debug log as those functions create many
proxies causing a lot of debug noise.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
For SNB, in case you really, really want to use GPU detiling and not
incur the ring switch. Tweaking when to just mmap the target seems to
gain the most anyway...
The ulterior motive is that this provides fallback paths for avoiding
the use of TILING_Y with GTT mmaps, which is broken on 855gm.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
We only want to create huge pwrite buffers when populating the inactive
cache for mmapped uploads. In the absence of using mmap for upload, be
more conservative with the alignment value so as not to simply waste
valuable aperture and memory.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If we steal a write buffer for creating a pixmap for readback, then we
need to be careful as we will have set the used amount to 0 and would
then incorrectly try to decrease it by the last row. Fortunately, we do
not yet have any code that attempts to create a 2D buffer for reading.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The batch may legitimately be submitted prior to the attachment of the
read buffer if, for example, we need to switch rings. Therefore update
the assertion to only check that the bo remains in existence via a
reference from either the exec or the user.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
When backporting the patches from gen6, I didn't notice the memset that
came later, and this wasn't along the paths checked by rendercheck.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Whilst we can render to and blend with a depth-15 target, we cannot use
it as a texture with the sampling engine.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The hw wants it, as demonstrated by the '>' in KDE's menus. Why is it
always KDE that demonstrates coherency problems...
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
One of the side-effects of emitting the composite state is that it
tags the destination surface as dirty as a result of the *forthcoming*
operation. Emitting the flush after emitting the composite state
therefore clears that tag, so we need to restore it for future coherency.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Native glamor pixmaps give much better performance than textured-drm
pixmaps, so this commit makes them the default when configured to use
glamor. Another advantage of this commit is that we reduce the risk of
encountering the "incompatible region exists for this name" error and
the associated render corruption. And since we now never intentionally
allocate a reusable pixmap, we can make all (intel_glamor) allocations
non-reusable without incurring too great an overhead.
A side effect is that those glamor pixmaps do not have a
valid BO attached to them and thus fail to get a DRI drawable. This
commit also fixes that problem by adjusting the fixup_shadow mechanism
to recreate a textured-drm pixmap from the native glamor pixmap. I tested
this with mutter, and it works fine.
The performance gain from this patch is about 10% to 20%, depending on
the workload.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
If the GPU and CPU caches are shared and coherent, we can use a cached
mapping for a linear bo in the CPU domain and so avoid the penalty of
using WC/UC mappings through the GTT (and any aperture pressure). We
presume that the bos for such mappings are indeed LLC cached...
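For illustration, a hypothetical helper (not the driver's kgem path) that
maps a bo through the regular CPU mmap ioctl and keeps it in the CPU
domain rather than going through the GTT:

#include <stdint.h>
#include <string.h>
#include <xf86drm.h>
#include <i915_drm.h>

static void *map_bo_cached(int fd, uint32_t handle, uint64_t size)
{
    struct drm_i915_gem_mmap mmap_arg;
    struct drm_i915_gem_set_domain set_domain;

    memset(&mmap_arg, 0, sizeof(mmap_arg));
    mmap_arg.handle = handle;
    mmap_arg.size = size;
    if (drmIoctl(fd, DRM_IOCTL_I915_GEM_MMAP, &mmap_arg))
        return NULL;

    /* Keep reads/writes coherent by moving the bo to the CPU domain;
     * on LLC machines this costs nothing. */
    memset(&set_domain, 0, sizeof(set_domain));
    set_domain.handle = handle;
    set_domain.read_domains = I915_GEM_DOMAIN_CPU;
    set_domain.write_domain = I915_GEM_DOMAIN_CPU;
    drmIoctl(fd, DRM_IOCTL_I915_GEM_SET_DOMAIN, &set_domain);

    return (void *)(uintptr_t)mmap_arg.addr_ptr;
}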
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
In how many different ways can we check that the scanout is allocated
before we start decoding video?
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>