Currently, we require that a batch containing a dirty bo be submitted
before we mark the device as requiring a flush. So if we never submit a
batch between block handlers, we can end up sleeping without ever
flushing either the partial batch or the rendering to the scanout.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=36776
Tested-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Rather than just creating and submitting a batch that simply contains a
flush in order to periodically ensure that rendering reaches the
scanout, we can simply ask the kernel whether the scanout is busy. The
kernel will then submit a flush on our behalf if it is dirty, which
takes advantage of the kernel's dirty state tracking.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
A modest boost to throughput and reduction in CPU overhead from
not flushing the batch on every transition from BLT to RENDER.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
gen6+ platform has a BLT engine with seperate
command streamer to support BLT commands.
Signed-off-by: Zou Nan hai <nanhai.zou@intel.com>
[ickle: merge trivial conflict]
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
An attempt to workaround the incoherency in gen2 chipsets, we avoid
using dynamic reallocation as much as possible.
The first step is to disable allocation of pixmaps using GEM and simply
create them in system memory without a backing buffer object. This
forces all rendering to use S/W fallbacks.
The second step is to allocate a shadow front buffer and assign that to
the Screen pixmap. This ensure that the front buffer remains in the GTT
and pinned for scanout. The shadow buffer will be rendered to in the
normal fashion via the Screen pixmap, and be marked dirty. In the block
handler, the dirty shadow buffer is then blitted (using the GPU) over
the front buffer. This should completely avoid having to move pages
around in the GTT and avoid incurring the wrath of those early chipsets.
Secondly, performance should be reasonable as we avoid the ping-pong
caused by the small aperture and weak GPU forcing software fallbacks.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
After splitting out the i810 driver into its own legacy directory, we
can identify the common routines not as i830 but as intel. This
clarifies the code which *is* i830 specific.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>