For long-interval events (such as expiring the caches), we do not
need precise timing and so can use a coarse timer to allow the system
to coalesce and reduce wakeup events.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
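The coalescing idea above can be illustrated with a small sketch. It is not the driver's timer code: the 1000 ms granularity and the names coarse_deadline_ms() and count_wakeups() are assumptions made for the example. It only shows that rounding long-interval deadlines up to a coarse boundary lets several events share one wakeup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical granularity: long-interval events tolerate ~1s of slack. */
#define COARSE_GRANULARITY_MS 1000u

/* Round a deadline up to the next multiple of the granularity, so that
 * nearby deadlines collapse onto a single wakeup. */
static uint64_t coarse_deadline_ms(uint64_t deadline_ms)
{
	uint64_t rem = deadline_ms % COARSE_GRANULARITY_MS;

	return rem ? deadline_ms + (COARSE_GRANULARITY_MS - rem) : deadline_ms;
}

/* Count the distinct wakeups needed to service deadlines given in
 * ascending order, with or without coarse rounding. */
static unsigned count_wakeups(const uint64_t *deadlines_ms, unsigned n, int coarse)
{
	unsigned wakeups = 0;
	uint64_t last = 0;

	assert(deadlines_ms != NULL || n == 0);
	for (unsigned i = 0; i < n; i++) {
		uint64_t t = coarse ? coarse_deadline_ms(deadlines_ms[i])
				    : deadlines_ms[i];

		/* A new wakeup is needed only when the time differs. */
		if (wakeups == 0 || t != last) {
			wakeups++;
			last = t;
		}
	}
	return wakeups;
}
```

With deadlines at 10100, 10400, 10950 and 12010 ms, precise timing needs four wakeups, while the coarse variant needs two (at 11000 and 13000 ms).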
If a pixmap lies around for a couple of minutes not being used, it is
unlikely to be used again in the near future. Reap the GPU buffers of
any of those idle pixmaps (copying to a more compact buffer in system
memory) in order to free up resources for use elsewhere. Any object
that is exposed via DRI is obviously exempt from this reaping.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
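A minimal sketch of the reaping decision described above. The struct, its field names and the 120 second limit are hypothetical stand-ins (the message only says "a couple of minutes"), and the actual copy to system memory is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed idle limit; the text only says "a couple of minutes". */
#define IDLE_LIMIT_S 120u

/* Hypothetical per-pixmap bookkeeping, for illustration only. */
struct pixmap_state {
	uint64_t last_used_s;	/* time of last use, in seconds */
	bool has_gpu_bo;	/* a GPU buffer is currently attached */
	bool exported_dri;	/* shared with a DRI client */
};

/* Decide whether the GPU buffer of an idle pixmap may be reaped. */
static bool should_reap(const struct pixmap_state *p, uint64_t now_s)
{
	assert(p != NULL);

	/* Nothing to free, or the buffer is visible to DRI clients. */
	if (!p->has_gpu_bo || p->exported_dri)
		return false;

	/* Guard against a clock reading older than the last use. */
	if (now_s < p->last_used_s)
		return false;

	return now_s - p->last_used_s >= IDLE_LIMIT_S;
}
```

A pixmap exported via DRI is never reaped by this check, however long it has been idle.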
In order to avoid inconsistent usage of coherency domains and to avoid
completely unnecessary clflushing during video playback, use the same
buffer allocation and upload functions as the rest of the driver.
Reported-by: Christophe Roland <roll68@gmail.com>
Bugzilla: http://bugs.debian.org/cgi-bin/bugreport.cgi?msg=60;bug=651316
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
A typo confused left and right, rejecting true vertical edges, and worse
might have incurred false positives.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The operations when setting dpms on should be in the order opposite
of what's done when setting dpms off.
This is because of potentially conflicting effects:
~ drmModeConnectorSetProperty() enables/disables the backlight driver.
Some backlight drivers such as intel_backlight set the backlight to 0
when disabled and to max when enabled.
~ intel_output_dpms_backlight() saves the backlight value when turning
DPMS off and restores it when turning DPMS on.
Here's the current order of operations:
xset dpms force off (backlight is nonzero)
  drmModeConnectorSetProperty(DPMSModeOff)
    kernel: disable backlight, backlight=0
  intel_output_dpms_backlight(DPMSModeOff)
    save backlight value (0) <-- it has been set to 0 by kernel
    set backlight to 0
xset dpms force on
  drmModeConnectorSetProperty(DPMSModeOn)
    kernel: enable backlight, backlight=max
  intel_output_dpms_backlight(DPMSModeOn)
    set backlight to saved value (0)
The correct way to do this would be to reverse the operations during
xset dpms force off:
  intel_output_dpms_backlight(DPMSModeOff)
    save backlight value (nonzero)
    set backlight to 0
  drmModeConnectorSetProperty(DPMSModeOff)
    kernel: disable backlight, backlight=0
This restores the saved nonzero backlight value during the force on.
Signed-off-by: Simon Que <sque@chromium.org>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
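The two orderings above can be replayed with a toy model. This is a sketch only: kernel_set_dpms() and ddx_dpms_backlight() are hypothetical stand-ins for the kernel side of the property call and for the driver's save/restore step, and BACKLIGHT_MAX is an arbitrary value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BACKLIGHT_MAX 100	/* arbitrary maximum for the model */

struct panel {
	int level;	/* current backlight level */
	int saved;	/* level saved by the driver before DPMS off */
};

/* Stand-in for the kernel backlight driver: 0 when disabled, max when
 * enabled, as described for intel_backlight above. */
static void kernel_set_dpms(struct panel *p, bool on)
{
	assert(p != NULL);
	p->level = on ? BACKLIGHT_MAX : 0;
}

/* Stand-in for the driver's save-on-off / restore-on-on step. */
static void ddx_dpms_backlight(struct panel *p, bool on)
{
	assert(p != NULL);
	if (on) {
		p->level = p->saved;
	} else {
		p->saved = p->level;
		p->level = 0;
	}
}

/* Old order: the kernel zeroes the level before the driver saves it. */
static int cycle_old_order(int initial)
{
	struct panel p = { initial, 0 };

	kernel_set_dpms(&p, false);
	ddx_dpms_backlight(&p, false);
	kernel_set_dpms(&p, true);
	ddx_dpms_backlight(&p, true);
	return p.level;
}

/* Fixed order: save first on the way off, restore last on the way on. */
static int cycle_fixed_order(int initial)
{
	struct panel p = { initial, 0 };

	ddx_dpms_backlight(&p, false);
	kernel_set_dpms(&p, false);
	kernel_set_dpms(&p, true);
	ddx_dpms_backlight(&p, true);
	return p.level;
}
```

Starting from a level of 60, the old order ends the off/on cycle at 0, while the fixed order ends it back at 60.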
A partial buffer is considered finished upon the next batch submission,
so one needs to be careful that it is completely written to before such
an event is triggered. move-to-cpu is such a trigger as demonstrated by
the picture fixup routine for handling convolution filters.
Reported-by: Victor Machado <machado.prx@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=43607
Signed-off-by: Chris Wilson <ickle@crestline.(none)>
Otherwise we may exhaust the per-process vma limit and cause
applications to stop rendering and eventually crash the X server.
Will only work in conjunction with a new libdrm (2.4.28) and commit
c549a77c (intel: Unmap buffers during drm_intel_gem_bo_unmap)
in particular.
Reported-by: nobled@dreamwidth.org
References: https://bugs.freedesktop.org/show_bug.cgi?id=43075
References: https://bugs.freedesktop.org/show_bug.cgi?id=40066
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
As we never use these with a depth buffer nor attach them to scanout, we can
safely relax the multiple-of-64 byte pitch restriction. In the unlikely
event that we do need A8 surfaces with depthbuffers, this is broken...
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The basis for the constraints is what we can map into the aperture for
direct writing with the CPU, so use the size of the mappable region as
opposed to the size of the total GTT.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Benchmarking on the current code base says this is now a win. This
reverses older benchmarks, so expect further tuning.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The goal of the optimisation is to discard the GPU bo early, so we
can skip the extra damage reduction if there is no GPU bo.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
We want to avoid reducing the tiling mode (when reusing an active untiled
buffer in preference to creating a new one) for a wide buffer, when doing
so will force a TLB miss on each sample.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
As we handle tiled spans indirectly, we need to avoid applying the
drawable offsets twice (once in the mi layer generating the spans, and
then once more in the tiled rect renderer).
Reported-by: Ulrich Müller <ulm@gentoo.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=43245
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The cost of the TLB miss on every sample far outweighs the impact of the
context (and ring) switch.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
There are many operations, usually the core drawing acceleration, where
using the BLT is far preferable to using the CPU. However, the BLT is
limited to only using X-tiling, so if we encounter a Y-tiled pixmap
target we need to recreate it as X-tiling before proceeding. Hopefully,
the pixmap is then kept around and rendered multiple times to amortize
the cost of the copy.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
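A rough sketch of the conversion step described above, assuming a hypothetical pixmap record with a tiling mode and a copy counter. The real driver allocates a new buffer and copies the contents; here the copy is only counted, to show that the cost is paid once and later BLT operations reuse the X-tiled buffer.

```c
#include <assert.h>
#include <stddef.h>

enum tiling { TILING_NONE, TILING_X, TILING_Y };

/* Hypothetical pixmap record, for illustration only. */
struct pixmap {
	enum tiling tiling;
	unsigned copies;	/* number of retiling copies performed */
};

/* Make a pixmap usable as a BLT target: a Y-tiled pixmap is recreated
 * as X-tiled (counted as one copy); anything else is left alone. */
static void prepare_blt_target(struct pixmap *p)
{
	assert(p != NULL);
	if (p->tiling == TILING_Y) {
		p->tiling = TILING_X;
		p->copies++;
	}
}
```

Calling prepare_blt_target() repeatedly on the same pixmap performs the copy only on the first call, which is the amortization the message hopes for.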
Y-tiling is slightly faster with RENDER operations, so attempt to
allocate source-only pixmaps using this tiling mode. Actually using
Y-tiling is a delicate balance because it then prevents the use of the
BLT. For instance, enabling Y-tiling by default gives a 30% performance
improvement on the fish-demo (compositing benchmark) at 2560x1440 on
Ironlake but regresses tiger-demo by 2x (spans benchmark).
So experiment with this compromise and allow for changing the default
tiling.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>