Slower for fills, but on the current stack faster for copies, both large
and small. Hopefully, when we write some good shaders for SNB, we will
not only improve performance for copies but also make fills faster on
the render ring than on the BLT.
As the BLT copy routine is GPU bound for copywinpix10, and the RENDER
copy routine is CPU bound and faster, I believe that we have reached the
potential of the BLT ring but have not yet saturated the GPU with the
render copy.
Note that we still do not casually switch rings, so the actual routine
chosen is still determined by the preceding operations and is unlikely
to have any effect in practice during, for example, cairo-traces.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The batch emission serves as a full stall, so we do not need to incur a
second stall before our first rendering.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Because one day we may actually start using VS! Copied from the addition
of the workaround to Mesa by Kenneth Graunke.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Upon reading, we encounter a serialisation point and so can retire all
requests. However, kgem_bo_retire() wasn't correctly detecting that
barrier, so we continued using GPU detiling in the belief that the
target was still busy.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If both the source and destination are on the CPU, the thinking was
that it would be quicker to operate on them on the CPU rather than copy
both to the GPU and then perform the operation. This turns out to be a
false assumption when a transformation is involved -- something to be
reconsidered if pixman should ever be improved.
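A minimal sketch of the revised heuristic (the function and parameter
names here are invented for illustration):

    #include <stdbool.h>

    /* Only stay on the CPU when no transformation is involved, since
     * pixman's transformed sampling is the slow path that invalidated
     * the original assumption. */
    static bool prefer_cpu_composite(bool src_is_cpu, bool dst_is_cpu,
                                     bool has_transform)
    {
        return src_is_cpu && dst_is_cpu && !has_transform;
    }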
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Copying between two objects that together consume more than the
available GATT space is a painful experience due to the forced use of an
intermediary and an eviction on every batch. The tiled upload paths are,
by comparison, remarkably efficient, so favour their use when handling
extremely large buffers.
This reverses the previous policy in that we now prefer large GPU bo
rather than large CPU bo, as the render pipeline is far more flexible
for handling large GPU bo than the blitter is for handling CPU bo (at
least on gen4+).
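The shape of the heuristic, as a sketch; the names are hypothetical and
the threshold is simply "both objects cannot be bound at once":

    #include <stdbool.h>
    #include <stdint.h>

    /* If binding both objects simultaneously would exhaust the GATT,
     * a direct GPU copy will thrash with evictions on every batch, so
     * route the transfer through the tiled upload path instead. */
    static bool prefer_tiled_upload(uint64_t src_size, uint64_t dst_size,
                                    uint64_t gatt_size)
    {
        return src_size + dst_size > gatt_size;
    }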
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If we go to the trouble of running retire before searching, we may as
well check that we retired something before proceeding to check all the
inactive lists.
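A sketch of the early-out, with simplified stand-ins for the driver's
structures:

    #include <stdbool.h>
    #include <stddef.h>

    struct bo { struct bo *next; int num_pages; };
    struct kgem { struct bo *inactive; };

    /* Assumed to return true only if at least one request was retired. */
    extern bool kgem_retire(struct kgem *kgem);

    /* If retiring moved nothing onto the inactive lists, scanning them
     * cannot find anything we did not already know about, so bail early. */
    static struct bo *search_inactive(struct kgem *kgem, int num_pages)
    {
        if (!kgem_retire(kgem))
            return NULL;

        for (struct bo *bo = kgem->inactive; bo; bo = bo->next)
            if (bo->num_pages >= num_pages)
                return bo;

        return NULL;
    }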
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The render pipeline is actually more flexible than the blitter for
dealing with large surfaces and so the BLT is no longer the limiting
factor on gen4+.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
This depends upon glamor commit b5f8d, just after the 0.3.0 tag.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The previous version called glamor_egl_close_screen and
glamor_egl_free_screen manually, which does not align with the standard
process. glamor has now changed to follow the standard method: the
glamor layer and the glamor EGL layer each have their own internal
CloseScreen. The correct sequence is to register glamor_egl_close_screen
after I830CloseScreen has been registered, with glamor_close_screen
registered last of all. So we move intel_glamor_init out of
intel_uxa_init and into I830ScreenInit, just after the registration of
I830CloseScreen.
As the glamor interfaces have changed, we need to check the glamor
version when loading the glamor EGL module to make sure we are loading
the right glamor module. If that check fails, we switch back to the UXA
path.
This depends upon glamor commit 1bc8bf tagged with version 0.3.0.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
We ended up trying to align the upper bound to zero, as the integer
division of the tile width by the pixel size was zero.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
We should modify the old pixmap's header, not the new one, which was
already destroyed.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Rather than the specialised routines that assumed pDrawable was
non-NULL, which was no longer true after f30be6f743.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reduce the need to create a new object when we only need the allocation
for a single operation.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
We treat any pixmap that is not attached to either a CPU or GPU bo as
requiring the pixel data to be uploaded to the GPU before we can
composite. Normally this is true, except for the solid cache.
References: https://bugs.freedesktop.org/show_bug.cgi?id=45672
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
When transferring damage to the GPU on SNB, it is not necessarily true
that we have a shadow pixmap; we may instead have drawn onto an unmapped
CPU bo and now simply need to copy from that bo onto the GPU. Move the
assertion onto the path where it truly matters.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45672
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Having made that optimisation for Composite, and then assumed in the
backends that it always holds, we failed to clear the unbounded area
outside of a trapezoid, since we passed in the original colour and the
operation was optimised as a continuation.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The advantage of PictOpSrc is that it writes its results directly to
memory, bypassing the blend unit.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
An assert exposed a situation where we had accumulated an unreduced
damage-all, and so we were taking the slow path only to discover later
that it was a damage-all and that we had performed needless checks.
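Roughly the idea, with invented names; the point is only that the
reduction happens before the path is chosen:

    #include <stdbool.h>

    enum damage_mode { DAMAGE_NONE, DAMAGE_BOXES, DAMAGE_ALL };

    struct damage {
        enum damage_mode mode;
        int n_pending;          /* unmerged boxes accumulated so far */
    };

    extern void damage_reduce(struct damage *d); /* may collapse to ALL */

    static bool use_fast_path(struct damage *d)
    {
        /* Reduce first: an accumulated damage-all should be recognised
         * here, not after the slow per-box checks have already run. */
        if (d->mode != DAMAGE_ALL && d->n_pending)
            damage_reduce(d);

        return d->mode == DAMAGE_ALL;
    }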
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If we need to enlarge the sampled tile due to tiling alignments, the
resulting sample can become larger than we can accommodate through the 3D
pipeline, resulting in FAIL.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If the pixmap is larger than the pipeline, but the operation extents fit
within the pipeline, we may be able to create a proxy target to
transform the operation into one that fits within the constraints of the
render pipeline.
This fixes the infinite recursion hit with partially displayed extremely
large images.
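A sketch of the redirection test; the limit and the names are
illustrative (the real maximum varies by generation):

    #include <stdbool.h>

    #define MAX_3D_SIZE 8192    /* illustrative pipeline limit */

    struct box { int x1, y1, x2, y2; };

    /* Returns true if the operation should be redirected through a
     * proxy target sized to its extents; false if the pixmap already
     * fits (no proxy needed) or the extents are also too large. */
    static bool maybe_redirect(const struct box *extents,
                               int pix_width, int pix_height,
                               struct box *proxy)
    {
        if (pix_width <= MAX_3D_SIZE && pix_height <= MAX_3D_SIZE)
            return false;

        if (extents->x2 - extents->x1 > MAX_3D_SIZE ||
            extents->y2 - extents->y1 > MAX_3D_SIZE)
            return false;

        *proxy = *extents;  /* render into a proxy covering the extents */
        return true;
    }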
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
At the moment, the jury is still out on whether freely switching rings
for fills is a Good Idea. So make it easier to turn it on and off for
testing.
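For example, a single knob of the following shape (both the macro and
the environment variable are invented here) makes A/B testing trivial:

    #include <stdbool.h>
    #include <stdlib.h>

    #define DEFAULT_PREFER_BLT_FILL 1   /* compile-time default */

    /* Overridable at runtime so benchmarks can flip the preference
     * without a rebuild. */
    static bool prefer_blt_fill(void)
    {
        const char *env = getenv("PREFER_BLT_FILL");
        return env ? atoi(env) != 0 : DEFAULT_PREFER_BLT_FILL;
    }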
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
After using the CPU, upload the damage and read back the pixels from the
GPU bo and verify that the two are equivalent.
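The check amounts to the following sketch, where the helpers stand in
for the driver's actual upload and readback paths:

    #include <assert.h>
    #include <stdlib.h>
    #include <string.h>

    struct bo;

    struct pixmap {
        void *cpu_ptr;          /* CPU shadow */
        struct bo *gpu_bo;
        size_t stride, height;
    };

    extern void upload_damage(struct pixmap *p);        /* CPU -> GPU */
    extern void read_pixels(struct bo *bo, void *dst);  /* GPU -> dst */

    static void verify_shadow(struct pixmap *p)
    {
        size_t len = p->stride * p->height;
        void *readback = malloc(len);

        upload_damage(p);
        read_pixels(p->gpu_bo, readback);
        assert(memcmp(p->cpu_ptr, readback, len) == 0);

        free(readback);
    }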
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The spec says to fill the character boxes, which is what the hardware
does. The implementation fills the extents instead. rxvt expects the
former, emacs the latter. Overdraw is a nuisance, but less of one than
leaving glyphs behind...
Reported-by: walch.martin@web.de
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45438
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The hardware routines only directly support solid fill, so fall back
for the interesting cases. An alternative would be to investigate using
the miPolyGlyph routine to convert the weird fills into spans in order
to fall back. It sounds cheaper to fall back, so wait for an actual use
case.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Using the BLT is substantially faster than the current shaders for solid
fill. The downside is that it invokes more ring switching.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
As the BLT is far, far faster than using a shader.
Improves cairo-demos/chart from 6 to 13 fps.
Reported-by: Michael Larabel <Michael@phoronix.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The kernel has a bug that prevents pwriting buffers larger than the
aperture. Whilst waiting for the fix, limit the upload where possible to
fit within that constraint.
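The workaround reduces to chunking the upload. A sketch against the DRM
pwrite ioctl, with max_chunk standing in for whatever fits within the
mappable aperture:

    #include <stdint.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <drm/i915_drm.h>

    static int pwrite_chunked(int fd, uint32_t handle,
                              const char *data, uint64_t size,
                              uint64_t max_chunk)
    {
        uint64_t offset = 0;

        while (offset < size) {
            struct drm_i915_gem_pwrite pw;
            uint64_t len = size - offset;

            if (len > max_chunk)
                len = max_chunk;  /* keep each call under the aperture */

            memset(&pw, 0, sizeof(pw));
            pw.handle = handle;
            pw.offset = offset;
            pw.size = len;
            pw.data_ptr = (uintptr_t)(data + offset);

            if (ioctl(fd, DRM_IOCTL_I915_GEM_PWRITE, &pw))
                return -1;

            offset += len;
        }

        return 0;
    }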
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The only strong requirement is that to utilize large pitches, the object
must be tiled. Having it as X tiling is a pure convenience to facilitate
use of the blitter. A DRI client may want to keep using Y tiling
instead.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Only apply the architectural limits to enable bo creation for DRI buffers.
Reported-by: Alban Browaeys <prahal@yahoo.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45414
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
As the GATT size is independent of the actual RAM size, we need to be
careful not to be too generous when allocating GPU bo and their shadows.
So first of all we limit default render targets to those small enough to
fit comfortably in RAM alongside others, and secondly we try to keep
only a single copy of large objects in memory.
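The policy, sketched with illustrative fractions rather than the
driver's actual tuning:

    #include <stdint.h>
    #include <unistd.h>

    /* Since the GATT can exceed physical memory, cap the size of
     * default render targets against RAM, not against the GATT, so a
     * bo and its shadow can coexist with everything else. */
    static uint64_t max_render_target_bytes(uint64_t gatt_size)
    {
        uint64_t ram = (uint64_t)sysconf(_SC_PHYS_PAGES)
                     * (uint64_t)sysconf(_SC_PAGE_SIZE);
        uint64_t limit = gatt_size / 2;

        if (limit > ram / 2)
            limit = ram / 2;

        return limit;
    }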
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Whilst the old mapping is guaranteed to be larger than the requested
allocation size, keeping track of the actual size allows for better
packing of future buffers. The code also performs a sanity check that
the buffer is the size we claim it to be...
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Given that we now handle uploads to and from bo that are larger than
the aperture, and that usage of such large bo is rare and so unlikely to
benefit from caching, allow them to be created as render targets and
destroy them as soon as they become inactive.
In principle, this finally enables GPU acceleration of ocitysmap on
gen4+, but due to the large cost of creating and destroying large bo it
is disabled on systems that require clflushing. It is, however, a
prerequisite for exploiting the enhanced capabilities of IvyBridge.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
We can juggle rendering into large bo on gen4 by redirecting the
rendering through a proxy that is tile aligned, and so the render target
may be slightly larger than the tiling step size. As that is then larger
than the maximum 3D pipeline size, the trick fails and we need to resort
to a temporary render target with copies in and out. In this case, check
that the tile is aligned to the most pessimistic tiling width and reduce
the step size to accommodate the enlargement.
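In outline (sizes illustrative), the step is shrunk so that even after
pessimistic tile alignment the proxy still fits down the pipeline:

    #define MAX_3D_SIZE    8192   /* illustrative pipeline limit */
    #define MAX_TILE_WIDTH 512    /* most pessimistic tile width */

    /* Leave room for the proxy to grow by up to one tile during
     * alignment, then round down so the step itself stays aligned. */
    static int copy_step_size(void)
    {
        int step = MAX_3D_SIZE - MAX_TILE_WIDTH;
        step -= step % MAX_TILE_WIDTH;
        return step;
    }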
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
As tiling increases the maximum usable pitch on gen4+, we can
accommodate wider pixmaps on the GPU.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Having noticed that eog was failing to perform an 8k x 8k copy with
compiz running on a 965gm, it was time to implement the checks for batch
overflow.
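The checks take the obvious form: reserve space before emitting and
flush if the command cannot fit (capacity and names here are
illustrative):

    #include <stdint.h>

    #define BATCH_DWORDS 1024   /* illustrative batch capacity */

    struct batch {
        uint32_t buf[BATCH_DWORDS];
        int used;
    };

    extern void batch_flush(struct batch *b);  /* submit and reset */

    /* Check before emitting, so a long run of copy commands can never
     * silently run off the end of the batch buffer. Assumes any single
     * command fits in an empty batch. */
    static uint32_t *batch_require(struct batch *b, int ndwords)
    {
        if (b->used + ndwords > BATCH_DWORDS)
            batch_flush(b);

        uint32_t *out = b->buf + b->used;
        b->used += ndwords;
        return out;
    }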
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>