xf86-video-intel/src/sna
Chris Wilson ccf0fdd56d sna: Only flush after the BLT operation if we have more than 2 distinct bo
In order to preserve the optimisation of discarding incomplete batches,
we don't always want to immediately submit the batch after inserting the
first command. As we currently only cancel a batch if it only touches
the bo being discarded, we can skip the immediate flush if it only
accesses one bo and may then be able to use the undo optimisation later.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-07-09 20:41:51 +01:00
brw sna: Remove unused brw_eu_debug.c 2013-06-24 11:50:25 +01:00
fb
Makefile.am configure: test for librt (clock_gettime) 2013-06-19 11:30:24 +01:00
README
atomic.h
blt.c sna: Implement memcpy_from_tiled functions (for X-tiling only atm) 2013-06-27 15:26:41 +01:00
compiler.h sna: Add the Ofast option to the critical memcpy routines 2013-06-29 21:56:13 +01:00
gen2_render.c sna/gen2: Fix alpha replication in the copy pipeline 2013-07-02 23:17:32 +01:00
gen2_render.h
gen3_render.c sna/gen2+: Consider precision in render operation placement 2013-06-28 11:25:47 +01:00
gen3_render.h
gen4_render.c sna/gen4: Remove custom max flush vertices w/a from video path 2013-07-07 09:17:58 +01:00
gen4_render.h
gen4_source.c
gen4_source.h
gen4_vertex.c sna: Add DBG statements for choice of spans vertex emitter 2013-05-09 13:49:56 +01:00
gen4_vertex.h
gen5_render.c sna/gen2+: Consider precision in render operation placement 2013-06-28 11:25:47 +01:00
gen5_render.h
gen6_render.c sna/gen2+: Consider precision in render operation placement 2013-06-28 11:25:47 +01:00
gen6_render.h
gen7_render.c sna/gen2+: Consider precision in render operation placement 2013-06-28 11:25:47 +01:00
gen7_render.h sna/gen7: Fix MOCS for Haswell 2013-03-27 16:58:41 +00:00
kgem.c sna: Experiment with a new ioctl to create buffers from stolen memory 2013-07-07 09:31:27 +01:00
kgem.h sna: Experiment with a new ioctl to create buffers from stolen memory 2013-07-07 09:31:27 +01:00
kgem_debug.c
kgem_debug.h
kgem_debug_gen2.c
kgem_debug_gen3.c
kgem_debug_gen4.c
kgem_debug_gen5.c
kgem_debug_gen6.c
kgem_debug_gen7.c
rop.h
sna.h sna: Simplify validation of active CRTCs 2013-07-02 15:44:10 +01:00
sna_accel.c sna: Tune inplace hints for CPU operations with GPU targets 2013-07-06 22:54:02 +01:00
sna_blt.c sna: Only flush after the BLT operation if we have more than 2 distinct bo 2013-07-09 20:41:51 +01:00
sna_composite.c sna: Always create the clear Picture 2013-07-01 18:10:07 +01:00
sna_cpu.c sna: Always populate the CPU features string 2013-06-03 15:35:43 +01:00
sna_damage.c sna: Minor tweaks to make DBG compile again 2013-07-01 22:51:22 +01:00
sna_damage.h
sna_display.c sna: Use a stack allocated PixmapRec for the fbcon copy 2013-07-04 21:39:33 +01:00
sna_display_fake.c sna: Cleanup up error reporting after failure to init KMS interface 2013-05-30 13:08:10 +01:00
sna_dri.c sna: Allow scanouts to be untiled if need be 2013-07-01 18:11:03 +01:00
sna_driver.c intel: Retire Option "RelaxedFencing" 2013-07-03 12:45:10 +01:00
sna_glyphs.c sna/gen4: Remove the glyph mask hack and tune the flush w/a 2013-07-07 09:13:49 +01:00
sna_gradient.c sna: Markup when a gradient is opaque 2013-06-28 10:14:16 +01:00
sna_io.c sna: Minor tweaks to make DBG compile again 2013-07-01 22:51:22 +01:00
sna_module.h
sna_reg.h
sna_render.c sna: Markup when a gradient is opaque 2013-06-28 10:14:16 +01:00
sna_render.h sna: Markup when a gradient is opaque 2013-06-28 10:14:16 +01:00
sna_render_inline.h
sna_stream.c
sna_threads.c
sna_tiling.c sna: Avoid allocating a temporary if using rendercpy tiles 2013-06-26 11:27:25 +01:00
sna_transform.c
sna_trapezoids.c sna: Fix format specifier for mismatching int/long in DBG 2013-06-06 16:43:24 +01:00
sna_vertex.c
sna_video.c sna/video: Free the private video (adaptor/port) arrays upon CloseScreen 2013-06-12 14:34:05 +01:00
sna_video.h sna/video: Fixup formats to select visuals 2013-06-06 21:40:59 +01:00
sna_video_hwmc.c sna/video: Convert to a pure Xv backend 2013-05-21 11:14:52 +01:00
sna_video_hwmc.h sna/video: Convert to a pure Xv backend 2013-05-21 11:14:52 +01:00
sna_video_overlay.c sna/video: Catch allocation failure whilst setting up the TexturedAdaptor 2013-06-12 14:20:10 +01:00
sna_video_sprite.c sna/video: Fixup formats to select visuals 2013-06-06 21:40:59 +01:00
sna_video_textured.c sna/video: Catch allocation failure whilst setting up the TexturedAdaptor 2013-06-12 14:20:10 +01:00

README

SandyBridge's New Acceleration
------------------------------

The guiding principle behind the design is to avoid GPU context switches.
On SandyBridge (and beyond), these are especially pernicious because the
RENDER and BLT engines are now on different rings and require
synchronisation of the various execution units when switching contexts.
They were not cheap on earlier generations, but with the increasing
complexity of the GPU, avoiding such serialisations is important.

Furthermore, we try very hard to avoid migrating between the CPU and GPU.
Every pixmap (apart from temporary "scratch" surfaces which we intend to
use on the GPU) is created in system memory. All operations are then done
upon this shadow copy until we are forced to move it onto the GPU. Such a
migration is first triggered only by: setting the pixmap as the scanout
(we obviously need a GPU buffer here), using the pixmap as a DRI buffer
(the client expects to perform hardware acceleration and we do not want
to disappoint) and, lastly, using the pixmap as a RENDER target. This
last trigger is chosen because when we know we are going to perform
hardware acceleration and will continue to do so without fallbacks, using
the GPU is much, much faster than the CPU. The heuristic I chose therefore
was that if the application uses RENDER, e.g. via cairo, then it will
only be using those paths and not intermixing core drawing operations,
and so is unlikely to trigger a fallback.
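
As a rough illustration of the placement heuristic described above, the
sketch below keeps a pixmap in system memory until one of the three
triggers is seen. Every name in it (pixmap_sketch, pixmap_use,
pixmap_prepare and so on) is hypothetical and chosen for this example
only; it is not the driver's own data structure or API, and it does not
model migration back to the CPU on a fallback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout: an illustration, not the driver's code. */
enum pixmap_use {
    USE_CORE_DRAWING,   /* core X drawing: stays on the CPU shadow copy */
    USE_SCANOUT,        /* pixmap becomes the scanout */
    USE_DRI_BUFFER,     /* pixmap is handed to a DRI client */
    USE_RENDER_TARGET,  /* pixmap is the destination of a RENDER op */
};

struct pixmap_sketch {
    bool scratch;  /* temporary surface intended for use on the GPU */
    bool on_gpu;   /* true once a GPU buffer backs the pixmap */
};

/* The three uses that are allowed to trigger the first migration. */
static bool use_forces_gpu(enum pixmap_use use)
{
    switch (use) {
    case USE_SCANOUT:
    case USE_DRI_BUFFER:
    case USE_RENDER_TARGET:
        return true;
    default:
        return false;
    }
}

/* Everything except scratch surfaces starts life in system memory. */
static void pixmap_init(struct pixmap_sketch *p, bool scratch)
{
    assert(p != NULL);
    p->scratch = scratch;
    p->on_gpu = scratch;
}

/* Returns true if the operation should be placed on the GPU. */
static bool pixmap_prepare(struct pixmap_sketch *p, enum pixmap_use use)
{
    assert(p != NULL);
    if (!p->on_gpu && use_forces_gpu(use))
        p->on_gpu = true;  /* first migration onto the GPU */
    return p->on_gpu;
}
```

In this sketch a pixmap that has only ever seen core drawing never leaves
the CPU, which is the case the heuristic is meant to keep cheap.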

The complicating case is front-buffer rendering. To accommodate one
application using RENDER whilst xterm is also running, without a composite
manager redirecting all the pixmaps to backing surfaces, we have to
perform damage tracking to avoid excess migration of portions of the
buffer.
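
To make the idea of damage tracking concrete, here is a deliberately
simplified sketch that records only a single bounding box of the pixels
touched on one side, so that a later migration need copy just that box
rather than the whole buffer. The names (box, damage_sketch, damage_add,
damage_pixels, damage_clear) are hypothetical, and collapsing the damage
to one bounding box is an assumption made for brevity; it should not be
read as a description of how the driver actually stores damage.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified types: an illustration only. */
struct box { int x1, y1, x2, y2; };  /* half-open: [x1,x2) x [y1,y2) */

struct damage_sketch {
    bool dirty;          /* false while nothing needs migrating */
    struct box extents;  /* bounding box of everything touched so far */
};

/* Record that the pixels in b were modified. */
static void damage_add(struct damage_sketch *d, struct box b)
{
    assert(b.x1 < b.x2 && b.y1 < b.y2);
    if (!d->dirty) {
        d->extents = b;
        d->dirty = true;
        return;
    }
    if (b.x1 < d->extents.x1) d->extents.x1 = b.x1;
    if (b.y1 < d->extents.y1) d->extents.y1 = b.y1;
    if (b.x2 > d->extents.x2) d->extents.x2 = b.x2;
    if (b.y2 > d->extents.y2) d->extents.y2 = b.y2;
}

/* Number of pixels a migration would have to copy. */
static long damage_pixels(const struct damage_sketch *d)
{
    if (!d->dirty)
        return 0;
    return (long)(d->extents.x2 - d->extents.x1) *
           (long)(d->extents.y2 - d->extents.y1);
}

/* Call once the damaged area has been copied across. */
static void damage_clear(struct damage_sketch *d)
{
    d->dirty = false;
}
```

A bounding box over-estimates the damage when the touched areas are far
apart, so a finer-grained representation would copy less; the sketch only
shows why tracking damage at all avoids migrating the full buffer.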