xf86-video-intel

Chris Wilson 354dc3c65b sna: Avoid fallbacks for convolutions by rendering the convolved texture If we have no shader support for generic convolutions, we currently create the convolved texture using pixman. A multipass accumulation algorithm can be implemented on top of CompositePicture, so try it! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>		2011-12-22 11:51:58 +00:00
Makefile.am	sna: Enable hooking up of valgrind during debugging	2011-12-11 16:23:13 +00:00
README	sna: Introduce a new acceleration model.	2011-06-04 09:19:46 +01:00
blt.c	sna: Optimise single pixel transfers	2011-11-14 19:49:29 +00:00
compiler.h	sna: Protect against deferred malloc failures for pixel data	2011-12-14 10:35:04 +00:00
gen2_render.c	sna: Implement extended fallback handling for src == dst copies	2011-12-20 18:46:47 +00:00
gen2_render.h	sna/gen2: Use specular component for solid spans	2011-07-01 21:41:23 +01:00
gen3_render.c	sna: Implement extended fallback handling for src == dst copies	2011-12-20 18:46:47 +00:00
gen3_render.h	sna: Introduce a new acceleration model.	2011-06-04 09:19:46 +01:00
gen4_render.c	sna: Implement extended fallback handling for src == dst copies	2011-12-20 18:46:47 +00:00
gen4_render.h	sna: Introduce a new acceleration model.	2011-06-04 09:19:46 +01:00
gen5_render.c	sna: Implement extended fallback handling for src == dst copies	2011-12-20 18:46:47 +00:00
gen5_render.h	sna/gen5: Avoid bitfields for simple assignments	2011-09-12 19:25:08 +01:00
gen6_render.c	sna: Implement extended fallback handling for src == dst copies	2011-12-20 18:46:47 +00:00
gen6_render.h	sna: Introduce a new acceleration model.	2011-06-04 09:19:46 +01:00
gen7_render.c	sna: Implement extended fallback handling for src == dst copies	2011-12-20 18:46:47 +00:00
gen7_render.h	sna/gen7: Correct shifts for surface state	2011-11-19 08:34:59 +00:00
kgem.c	sna: Treat all exported bo as potentially active	2011-12-20 22:40:25 +00:00
kgem.h	sna: Tune the inplace cross-over point to be half-cache size	2011-12-19 20:10:47 +00:00
kgem_debug.c	sna: Begin debugging gen7	2011-11-11 00:15:44 +00:00
kgem_debug.h	sna: Begin debugging gen7	2011-11-11 00:15:44 +00:00
kgem_debug_gen2.c	sna/gen2: Improve batch decoder.	2011-09-04 12:46:32 +01:00
kgem_debug_gen3.c	sna: Implement a VMA cache	2011-12-11 00:52:54 +00:00
kgem_debug_gen4.c	sna: Implement a VMA cache	2011-12-11 00:52:54 +00:00
kgem_debug_gen5.c	sna: Implement a VMA cache	2011-12-11 00:52:54 +00:00
kgem_debug_gen6.c	sna: Implement a VMA cache	2011-12-11 00:52:54 +00:00
kgem_debug_gen7.c	sna: Implement a VMA cache	2011-12-11 00:52:54 +00:00
rop.h	sna: Reduce and clarify dependencies	2011-11-16 22:15:39 +00:00
sna.h	sna: Implement extended fallback handling for src == dst copies	2011-12-20 18:46:47 +00:00
sna_accel.c	sna: Avoid fallbacks for convolutions by rendering the convolved texture	2011-12-22 11:51:58 +00:00
sna_blt.c	sna: Implement extended fallback handling for src == dst copies	2011-12-20 18:46:47 +00:00
sna_composite.c	sna: Map the upload buffer using an LLC bo	2011-12-17 21:26:35 +00:00
sna_damage.c	sna/damage: Guard against malloc failures	2011-12-14 12:23:04 +00:00
sna_damage.h	sna: Defer allocation of memory for larger pixmap until first use	2011-12-12 16:03:11 +00:00
sna_display.c	sna: Simplify write domain tracking	2011-12-17 21:26:35 +00:00
sna_dri.c	sna: Simplify write domain tracking	2011-12-17 21:26:35 +00:00
sna_driver.c	sna: The block handler is passed an indirect pointer to the timeval	2011-11-16 22:15:39 +00:00
sna_glyphs.c	sna: Map the upload buffer using an LLC bo	2011-12-17 21:26:35 +00:00
sna_gradient.c	sna: Simplify write domain tracking	2011-12-17 21:26:35 +00:00
sna_io.c	sna: Always readback untiled bo in place	2011-12-20 22:40:25 +00:00
sna_module.h	sna: Add zaphod support	2011-06-07 16:54:57 +01:00
sna_reg.h	sna: Accelerate XYPixmap upload when using GXcopy	2011-11-01 21:12:02 +00:00
sna_render.c	sna: Avoid fallbacks for convolutions by rendering the convolved texture	2011-12-22 11:51:58 +00:00
sna_render.h	sna: Implement extended fallback handling for src == dst copies	2011-12-20 18:46:47 +00:00
sna_render_inline.h	sna/gen[23]: We need to check the batch before doing an inline flush	2011-12-19 00:37:43 +00:00
sna_stream.c	sna: Introduce a new acceleration model.	2011-06-04 09:19:46 +01:00
sna_tiling.c	sna: Map the upload buffer using an LLC bo	2011-12-17 21:26:35 +00:00
sna_transform.c	sna: Introduce a new acceleration model.	2011-06-04 09:19:46 +01:00
sna_trapezoids.c	sna: Map the upload buffer using an LLC bo	2011-12-17 21:26:35 +00:00
sna_video.c	sna: Implement a VMA cache	2011-12-11 00:52:54 +00:00
sna_video.h	sna/video: Use the normal bo cache for texture video streams	2011-11-09 14:00:16 +00:00
sna_video_hwmc.c	sna: Introduce a new acceleration model.	2011-06-04 09:19:46 +01:00
sna_video_hwmc.h	sna: Introduce a new acceleration model.	2011-06-04 09:19:46 +01:00
sna_video_overlay.c	sna/video: Constify a couple of attribute arrays	2011-11-13 13:13:03 +00:00
sna_video_textured.c	sna/gen3: Check for upload failure of video bo	2011-12-15 18:19:18 +00:00

README

SandyBridge's New Acceleration
------------------------------

The guiding principle behind the design is to avoid GPU context switches.
On SandyBridge (and beyond), these are especially pernicious because the
RENDER and BLT engines are now on different rings and require
synchronisation of the various execution units when switching contexts.
They were not cheap on earlier generations either, but with the increasing
complexity of the GPU, avoiding such serialisations is important.

Furthermore, we try very hard to avoid migrating between the CPU and GPU.
Every pixmap (apart from temporary "scratch" surfaces which we intend to
use on the GPU) is created in system memory. All operations are then done
upon this shadow copy until we are forced to move it onto the GPU. Such
migration is first triggered only by: setting the pixmap as the
scanout (we obviously need a GPU buffer here), using the pixmap as a DRI
buffer (the client expects to perform hardware acceleration and we do not
want to disappoint), and lastly using the pixmap as a RENDER target. This
last trigger is chosen because when we know we are going to perform
hardware acceleration and will continue to do so without fallbacks, using
the GPU is much, much faster than the CPU. The heuristic I chose therefore
was that if the application uses RENDER, e.g. via cairo, then it will only
be using those paths and not intermixing core drawing operations, and so
is unlikely to trigger a fallback.
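
The three migration triggers above can be sketched roughly as follows.
This is only an illustration of the heuristic, not the driver's code:
the names (struct pixmap, enum pixmap_use, pixmap_init, pixmap_prepare)
are hypothetical, and the real upload of the shadow copy as well as any
later fallback to the CPU are deliberately not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical classification of how a pixmap is about to be used. */
enum pixmap_use {
    USE_CORE_DRAWING,   /* core X drawing: stays on the CPU shadow copy */
    USE_SCANOUT,        /* the scanout needs a GPU buffer */
    USE_DRI_BUFFER,     /* the client expects hardware acceleration */
    USE_RENDER_TARGET,  /* RENDER users are assumed to stay on RENDER */
};

/* Hypothetical pixmap state: only tracks where the pixels live. */
struct pixmap {
    bool scratch;  /* temporary surface intended for GPU use */
    bool on_gpu;   /* false: pixels live in the system-memory shadow */
};

/* Every pixmap starts in system memory, apart from scratch surfaces. */
static void pixmap_init(struct pixmap *p, bool scratch)
{
    assert(p != NULL);
    p->scratch = scratch;
    p->on_gpu = scratch;
}

/* Only these three uses may trigger the first move onto the GPU. */
static bool use_triggers_migration(enum pixmap_use use)
{
    switch (use) {
    case USE_SCANOUT:
    case USE_DRI_BUFFER:
    case USE_RENDER_TARGET:
        return true;
    case USE_CORE_DRAWING:
        break;
    }
    return false;
}

/* Decide where the operation runs; returns true if it runs on the GPU. */
static bool pixmap_prepare(struct pixmap *p, enum pixmap_use use)
{
    assert(p != NULL);
    if (!p->on_gpu && use_triggers_migration(use))
        p->on_gpu = true;  /* a real driver would upload the shadow here */
    return p->on_gpu;
}
```

In this sketch a freshly created pixmap stays in system memory for any
number of core drawing operations, and the first use as a RENDER target
moves it to the GPU.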

The complicating case is front-buffer rendering. To accommodate using
RENDER in one application whilst running xterm without a composite
manager redirecting all the pixmaps to backing surfaces, we have to
perform damage tracking to avoid excess migration of portions of the
buffer.
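
As a minimal sketch of what damage tracking buys us, assume the damage
is kept as a single bounding box of the area where the GPU copy is newer
than the CPU shadow. That representation and all the names below (struct
box, struct damage, damage_add_gpu, damage_readback_area) are
assumptions made for illustration; a real implementation would track
finer-grained regions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Half-open rectangle: covers [x1, x2) horizontally, [y1, y2) vertically. */
struct box {
    int x1, y1, x2, y2;
};

static bool box_is_empty(const struct box *b)
{
    assert(b != NULL);
    return b->x1 >= b->x2 || b->y1 >= b->y2;
}

static struct box box_intersect(struct box a, struct box b)
{
    struct box r;
    r.x1 = a.x1 > b.x1 ? a.x1 : b.x1;
    r.y1 = a.y1 > b.y1 ? a.y1 : b.y1;
    r.x2 = a.x2 < b.x2 ? a.x2 : b.x2;
    r.y2 = a.y2 < b.y2 ? a.y2 : b.y2;
    return r;
}

static struct box box_union(struct box a, struct box b)
{
    struct box r;
    if (box_is_empty(&a))
        return b;
    if (box_is_empty(&b))
        return a;
    r.x1 = a.x1 < b.x1 ? a.x1 : b.x1;
    r.y1 = a.y1 < b.y1 ? a.y1 : b.y1;
    r.x2 = a.x2 > b.x2 ? a.x2 : b.x2;
    r.y2 = a.y2 > b.y2 ? a.y2 : b.y2;
    return r;
}

/* Hypothetical damage record: area where the GPU copy is the newer one. */
struct damage {
    struct box gpu;
};

/* Record that the GPU has drawn into `drawn`. */
static void damage_add_gpu(struct damage *d, struct box drawn)
{
    assert(d != NULL);
    d->gpu = box_union(d->gpu, drawn);
}

/*
 * Area that has to be read back before the CPU may touch `access`.
 * Only the overlap with the GPU damage is migrated, not the whole buffer.
 * The sketch does not shrink the damage afterwards, since one bounding
 * box cannot describe the remainder.
 */
static struct box damage_readback_area(const struct damage *d,
                                       struct box access)
{
    assert(d != NULL);
    return box_intersect(d->gpu, access);
}
```

A core drawing operation that only touches an undamaged corner of the
front buffer gets back an empty box and needs no migration at all.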