Commit Graph

4187 Commits

Author SHA1 Message Date
Chris Wilson c83fd4e24d sna: Add some more debug messages for VMA caching
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-11 16:14:38 +00:00
Chris Wilson 3ae7fb918a sna: Restrict pitch alignment on 945gm to 64 bytes
In theory we should be able to disable dual-stream mode and so be
subject to much looser restrictions (such as the pitch need only be
dword aligned). However, achieving single-stream mode seems quite
difficult!

Reported-by: Paul Neumann <paul104x@yahoo.de>
References: https://bugs.freedesktop.org/show_bug.cgi?id=43706
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-11 13:52:42 +00:00
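
The alignment rule in the commit above amounts to rounding each row's byte pitch up to a multiple of 64. A minimal sketch, assuming hypothetical helper names (`align_up`, `pixmap_pitch`) that are not taken from the driver:

```c
#include <stdint.h>

/* Round v up to the next multiple of a (a must be a power of two). */
static uint32_t align_up(uint32_t v, uint32_t a)
{
    return (v + a - 1) & ~(a - 1);
}

/* Hypothetical helper: byte pitch of one untiled pixmap row, rounded up
 * to `alignment` bytes. Pass 64 for the 945gm restriction described in
 * the commit, or 4 for the dword alignment that single-stream mode
 * would in theory permit. */
static uint32_t pixmap_pitch(uint32_t width, uint32_t bits_per_pixel,
                             uint32_t alignment)
{
    uint32_t bytes = (width * bits_per_pixel + 7) / 8;
    return align_up(bytes, alignment);
}
```

For a 100-pixel-wide 32bpp pixmap the row is 400 bytes, which rounds up to 448 under the 64-byte rule and stays at 400 under dword alignment.
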
Chris Wilson 2f35d77cd0 sna: Update computation of untiled pitch to cater for CREATE_SCANOUT
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-11 13:37:18 +00:00
Chris Wilson 5a0139487f sna/gen3: Ensure that depth read/writes are disabled before first use
Our goal is to achieve "single-stream" rendering where the entire
RenderCache is allocated to the colour buffer (rather than split between
colour and depth). In theory all that is required is for the pipeline
not to reference the depth buffer at all; however, it is not made clear
when that evaluation is made.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-11 12:37:22 +00:00
Chris Wilson a02bbd8700 sna: Only transfer the bo if the src/dst are of matching size
If the src replaces the dst, it could just be a much larger pixmap!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-11 10:34:37 +00:00
Chris Wilson 43a9964863 sna: Only transfer unpinned buffers
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-11 10:30:48 +00:00
Chris Wilson eb859f6446 uxa/video: Correct the offset of the binding table in the surface buffer
The binding table is intended to be after all the surface descriptions,
so make sure we write it with the appropriate offset into the buffer.

Fixes regression from 699888a64 (uxa/video: Use the common bo
allocations and upload)

Reported-by: Cyril Brulebois <kibi@debian.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=43704
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-11 01:38:51 +00:00
Chris Wilson 051a18063d sna: Implement a VMA cache
A VMA cache appears unavoidable thanks to compiz and an excruciatingly
slow GTT pagefault, though it does look like it will be ineffectual
during everyday usage. Compiz (and presumably other compositing
managers) appears to be undoing all the pagefault minimisation as
demonstrated on gen5 with large XPutImage. It also appears the CPU to
memory bandwidth ratio plays a crucial role in determining whether
going straight to GTT or through the CPU cache is a win - so no trivial
heuristic.

x11perf -putimage10 -putimage500 on i5-2467m:
Before:
  bare:  1,150,000   2,410
  compiz:  438,000   2,670
After:
  bare:  1,190,000   2,730
  compiz:  437,000   2,690
UXA:
  bare:    658,000   2,670
  compiz:  389,000   2,520

On i3-330m
Before:
  bare:    537,000   1,080
  compiz:  263,000     398
After:
  bare:    606,000   1,360
  compiz:  203,000     985
UXA:
  bare:    294,000   1,070
  compiz:  197,000     821

On pnv:
Before:
  bare:    179,000   213
  compiz:  106,000   123
After:
  bare:    181,000   246
  compiz:  103,000   197
UXA:
  bare:    114,000   312
  compiz:   75,700   191

Reported-by: Michael Larabel <Michael@phoronix.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-11 00:52:54 +00:00
Chris Wilson 735a15208d sna/gen5: Remove a redundant format check
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-10 23:52:25 +00:00
Chris Wilson c5584252c3 sna: Remember to assign a new unique id for the replaced bo
Missed from the previous patch.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-10 23:34:51 +00:00
Chris Wilson 9c764dc13b sna: Be more pessimistic with CPU sources
Try to avoid a few more unnecessary context switches.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-10 23:34:51 +00:00
Chris Wilson 358aaef6db sna/dri: Prefer using the BLT for DRICopyRegion on pre-SNB
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-10 23:34:51 +00:00
Chris Wilson c295ad8da9 sna: Transfer the whole bo for a replacement XCopyArea
If we are copying over the entire source onto the destination, just copy
across the GPU bo. This is often used for caching images as pixmaps.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-10 23:34:51 +00:00
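
The condition for handing over the whole bo can be sketched as a predicate. This is an illustration only: the struct and function names are hypothetical, and the size and pinning checks reflect the later follow-up commits in this log ("Only transfer the bo if the src/dst are of matching size", "Only transfer unpinned buffers") rather than this commit alone.

```c
#include <stdbool.h>

/* Hypothetical description of a pixmap for this sketch. */
struct pixmap_info {
    int width, height;
    bool pinned;   /* bo must stay attached to its current owner */
};

/* Hypothetical description of an XCopyArea request. */
struct copy_req {
    int src_x, src_y, dst_x, dst_y, width, height;
};

static bool can_transfer_bo(const struct pixmap_info *src,
                            const struct pixmap_info *dst,
                            const struct copy_req *c)
{
    /* The copy must start at the origin of both pixmaps ... */
    if (c->src_x || c->src_y || c->dst_x || c->dst_y)
        return false;
    /* ... cover the entire source ... */
    if (c->width != src->width || c->height != src->height)
        return false;
    /* ... and the destination must not be a larger (or smaller) pixmap. */
    if (src->width != dst->width || src->height != dst->height)
        return false;
    /* Pinned buffers are never handed over. */
    return !src->pinned && !dst->pinned;
}
```
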
Chris Wilson ece7fc8afe sna: Only use the 64-byte pitch alignment for scanout
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-10 23:34:51 +00:00
Chris Wilson b3816cf3a9 sna: Remove assertions that external bo are not busy
We have to be careful not to assume that externally exposed bo are under
our full control, and in particular not to assert their state. :(

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-10 23:34:51 +00:00
Chris Wilson b5a6bc9e33 sna/gen[23]: Fixup render targets with pitches below hw minimum
gen2/3 have a restriction that the 3D pipeline cannot render to a pixmap
with a pitch less than 8/16 respectively. Rather than mandating all
pixmaps to be created with a stride greater than 16, fixup the bo for
the rare occasions when it is necessary.

Reported-by: Paul Neumann <paul104x@yahoo.de>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=43688
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-10 13:18:44 +00:00
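
The check implied by the commit above can be sketched as follows. The minimum pitches (8 bytes on gen2, 16 on gen3) come from the commit message; the function names are hypothetical and the actual fixup (copying into a bo with a larger stride) is not shown.

```c
#include <stdbool.h>
#include <stdint.h>

/* Minimum render-target pitch in bytes as stated in the commit message.
 * Other generations are not modelled in this sketch and return 0. */
static uint32_t min_render_pitch(int gen)
{
    switch (gen) {
    case 2:  return 8;
    case 3:  return 16;
    default: return 0;
    }
}

/* True when the 3D pipeline cannot render to a target with this pitch,
 * so the bo would need to be fixed up before use. */
static bool needs_pitch_fixup(int gen, uint32_t pitch)
{
    return pitch < min_render_pitch(gen);
}
```
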
Chris Wilson c0dab7b1cf sna/trapezoids: Try to render traps onto a8 destinations in place
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-10 12:46:46 +00:00
Chris Wilson c73b14cabb sna/trapezoids: First try the scan converter for fallbacks
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-10 11:41:18 +00:00
Chris Wilson 22d9bc0bc1 sna: Use a single definition for the inactive cache timeout
And share it between the timer and the expiration function, just to
simplify the code.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-09 23:51:02 +00:00
Chris Wilson eb3e04d960 sna: Fallback to ordinary monotonic clock if coarse is not supported
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-09 23:51:02 +00:00
Chris Wilson 1c202cc074 sna: s/MONOTONICE/MONOTONIC/
A late addition to be flexible for compiling on different systems
heralded its doom.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-09 17:25:19 +00:00
Chris Wilson c51626ccb6 sna: Use the coarse monotonic clock to coalesce wakeup events
For the long interval events (such as expiring the caches), we do not
need precise timing and so can use a coarse timer to allow the system
to coalesce and reduce wakeup events.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-09 17:14:38 +00:00
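
A minimal sketch of reading the coarse clock, including the fallback to the ordinary monotonic clock that a later commit in this log adds. `coarse_now_ms` is a hypothetical name; `clock_gettime`, `CLOCK_MONOTONIC_COARSE` (Linux-specific) and `CLOCK_MONOTONIC` are the standard interfaces.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime() */
#endif
#include <stdint.h>
#include <time.h>

/* Current time in milliseconds for long-interval timers. Prefers the
 * coarse monotonic clock, which is cheaper to read but only has tick
 * resolution, and falls back to the ordinary monotonic clock when the
 * coarse one is not defined or not supported at runtime. */
static uint64_t coarse_now_ms(void)
{
    struct timespec ts;
#ifdef CLOCK_MONOTONIC_COARSE
    if (clock_gettime(CLOCK_MONOTONIC_COARSE, &ts) != 0)
#endif
        clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000 + (uint64_t)ts.tv_nsec / 1000000;
}
```

The reduced resolution is acceptable here because cache expiry works on intervals of seconds or more.
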
Chris Wilson c22197f25b sna: Discard bo for idle private pixmaps
If a pixmap lies around for a couple of minutes not being used, it is
unlikely to be used again in the near future. Reap the GPU buffers of
any of those idle pixmaps (copying to a more compact buffer in system
memory) in order to free up resources for use elsewhere. Any object
that is exposed via DRI is obviously exempt from this reaping.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-09 17:14:38 +00:00
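
The reaping policy described above can be sketched as a pass over the pixmaps that drops the GPU buffer of anything idle for too long. All names are hypothetical, the two-minute timeout is an assumed reading of "a couple of minutes", and the copy to system memory is only indicated by a comment.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define IDLE_TIMEOUT_MS (2 * 60 * 1000)   /* assumed value */

struct pixmap {
    uint64_t last_used_ms;   /* timestamp of the last GPU use */
    bool has_gpu_bo;
    bool exported;           /* exposed via DRI: exempt from reaping */
};

/* Drop the GPU bo of every private pixmap idle for at least the
 * timeout. Returns the number of pixmaps reaped. */
static size_t reap_idle(struct pixmap *p, size_t n, uint64_t now_ms)
{
    size_t reaped = 0;
    for (size_t i = 0; i < n; i++) {
        if (!p[i].has_gpu_bo || p[i].exported)
            continue;
        if (now_ms < p[i].last_used_ms + IDLE_TIMEOUT_MS)
            continue;
        /* A real driver would first copy the contents into a compact
         * system-memory buffer before releasing the bo. */
        p[i].has_gpu_bo = false;
        reaped++;
    }
    return reaped;
}
```
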
Chris Wilson 429a36f748 uxa: Fix clip processing for uxa_fill_spans()
Fixes regression from e0066e77e0
(uxa: Simplify Composite solid acceleration for spans by only clipping
once) [2.15.901]

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=43649
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-09 09:54:22 +00:00
Chris Wilson 699888a641 uxa/video: Use the common bo allocations and upload
In order to avoid inconsistent usage of coherency domains and to avoid
completely unnecessary clflushing during video playback, use the same
buffer allocation and upload functions as the rest of the driver.

Reported-by: Christophe Roland <roll68@gmail.com>
Bugzilla: http://bugs.debian.org/cgi-bin/bugreport.cgi?msg=60;bug=651316
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-09 00:33:12 +00:00
Chris Wilson 706d3a97bd sna/trapezoids: Fix detection of rectilinearity after projection
A typo confused left and right, rejecting true vertical edges, and worse
might have incurred false positives.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-08 18:53:59 +00:00
Simon Que bc081420a5 xf86-video-intel: change order of DPMS operations
The operations when setting dpms on should be in the order opposite
of what's done when setting dpms off.

This is because of potentially conflicting effects:
~ drmModeConnectorSetProperty() enables/disables the backlight driver.
Some backlight drivers such as intel_backlight set the backlight to 0
when disabled and to max when enabled.
~ intel_output_dpms_backlight() saves the backlight value when turning
DPMS off and restores it when turning DPMS on.

Here's the current order of operations:

xset dpms force off (backlight is nonzero)
   drmModeConnectorSetProperty(DPMSModeOff)
      kernel: disable backlight, backlight=0
   intel_output_dpms_backlight(DPMSModeOff)
      save backlight value (0) <-- it has been set to 0 by kernel
      set backlight to 0

xset dpms force on
   drmModeConnectorSetProperty(DPMSModeOn)
      kernel: enable backlight, backlight=max
   intel_output_dpms_backlight(DPMSModeOn)
      set backlight to saved value (0)

The correct way to do this would be to reverse the operations during
xset dpms force off:
   intel_output_dpms_backlight(DPMSModeOff)
      save backlight value (nonzero)
      set backlight to 0
   drmModeConnectorSetProperty(DPMSModeOff)
      kernel: disable backlight, backlight=0

This restores the saved nonzero backlight value during the force on.

Signed-off-by: Simon Que <sque@chromium.org>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-08 14:06:16 +00:00
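
The ordering problem in the commit above can be reproduced with a toy model. Everything here is a hypothetical stand-in: `kernel_set_dpms` plays the role of the property call and the backlight driver, and `ddx_dpms_backlight` plays the role of `intel_output_dpms_backlight()`; neither is the real function.

```c
enum { BACKLIGHT_MAX = 100 };

struct panel {
    int backlight;   /* current hardware backlight level */
    int saved;       /* level remembered by the DDX across DPMS off */
};

/* Kernel side: the backlight driver forces 0 on disable, max on enable. */
static void kernel_set_dpms(struct panel *p, int on)
{
    p->backlight = on ? BACKLIGHT_MAX : 0;
}

/* DDX side: save the level when turning off, restore it when turning on. */
static void ddx_dpms_backlight(struct panel *p, int on)
{
    if (on) {
        p->backlight = p->saved;
    } else {
        p->saved = p->backlight;
        p->backlight = 0;
    }
}

/* Old order: kernel first on both transitions. Returns the backlight
 * level after one off/on cycle starting from `level`. */
static int cycle_old_order(int level)
{
    struct panel p = { level, 0 };
    kernel_set_dpms(&p, 0);
    ddx_dpms_backlight(&p, 0);   /* saves 0, already zeroed by the kernel */
    kernel_set_dpms(&p, 1);
    ddx_dpms_backlight(&p, 1);   /* restores 0 */
    return p.backlight;
}

/* Fixed order: the off path is the reverse of the on path. */
static int cycle_new_order(int level)
{
    struct panel p = { level, 0 };
    ddx_dpms_backlight(&p, 0);   /* saves the nonzero level */
    kernel_set_dpms(&p, 0);
    kernel_set_dpms(&p, 1);
    ddx_dpms_backlight(&p, 1);   /* restores the saved level */
    return p.backlight;
}
```

Starting from any nonzero level, the old order ends the cycle with the backlight at 0, while the fixed order returns to the starting level.
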
Chris Wilson 84aaf1537c sna/gen7: Reduce dst readbacks for unsupported sources
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-08 12:36:08 +00:00
Chris Wilson 440ac68ec0 sna/gen6: Reduce dst readbacks for unsupported sources
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-08 12:35:54 +00:00
Chris Wilson bc68211d18 sna/gen5: Reduce dst readbacks for unsupported sources
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-08 12:35:39 +00:00
Chris Wilson a5df7c28e4 sna/gen4: Reduce dst readbacks for unsupported sources
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-08 12:35:22 +00:00
Chris Wilson cc8cab649c sna/gen3: Reduce readbacks on dst for unsupported sources
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-08 12:34:58 +00:00
Chris Wilson e5bc0c823b sna/gen2: Avoid readbacks for unsupported sources
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-08 12:34:52 +00:00
Chris Wilson 874c722c86 sna: Beware flushing partial buffers before they are written
A partial buffer is considered finished upon the next batch submission,
so one needs to be careful that it is completely written to before such
an event is triggered. move-to-cpu is such a trigger as demonstrated by
the picture fixup routine for handling convolution filters.

Reported-by: Victor Machado <machado.prx@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=43607
Signed-off-by: Chris Wilson <ickle@crestline.(none)>
2011-12-08 12:10:12 +00:00
Chris Wilson 6ccb114a7e sna: Prefer to use our pixmap upload paths
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-07 21:08:49 +00:00
Chris Wilson 101942d41d uxa: Unmap the buffer after swrast
Otherwise we may exhaust the per-process vma limit and cause
applications to stop rendering and eventually crash the X server.

Will only work in conjunction with a new libdrm (2.4.28) and commit
  c549a77c (intel: Unmap buffers during drm_intel_gem_bo_unmap)
in particular.

Reported-by: nobled@dreamwidth.org
References: https://bugs.freedesktop.org/show_bug.cgi?id=43075
References: https://bugs.freedesktop.org/show_bug.cgi?id=40066
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-05 10:30:06 +00:00
Chris Wilson b424b10e77 sna: use tight pitches for a8
As we never use these with a depth nor attach them to scanout, we can
safely relax the multiple-of-64 byte pitch restriction. In the unlikely
event that we do need A8 surfaces with depthbuffers, this is broken...

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-04 13:55:26 +00:00
Chris Wilson 46c7df8038 sna: Remove one redundant retire
There is no need to retire immediately after a batch and no indication
that it will be useful.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-03 22:59:28 +00:00
Chris Wilson b99c6b13eb sna: Pass the current value of the batch offset to the kernel relocator
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-03 22:59:28 +00:00
Chris Wilson 735219cd59 uxa: Ensure that we can fallback with all of (src, mask, dst) as GTT mappings
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-02 10:42:00 +00:00
Chris Wilson f6c82c73b6 uxa: Fix runtime linking of previous commit
So much for relying on compiler warnings.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-02 10:34:10 +00:00
Chris Wilson 85d3dc5910 uxa: Reset size limits based on AGP size
The basis for the constraints is what we can map into the aperture for
direct writing with the CPU, so use the size of the mappable region as
opposed to the size of the total GTT.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-02 10:22:51 +00:00
Chris Wilson e551987461 sna: Reuse the full size of an old handle for io
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-01 13:49:03 +00:00
Chris Wilson c5632369cb sna: Move the preservation of the io handle into the common destroy path
In order to capture and reuse all io buffers.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-01 13:23:56 +00:00
Chris Wilson 95f4da647a sna: Align pwrite to transfer whole cachelines
Daniel claims that this will be faster, or will be once he has
completed rewriting pwrite!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-11-30 12:08:39 +00:00
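
Aligning a write to whole cachelines means widening the requested byte range outwards to cacheline boundaries. A minimal sketch, assuming a 64-byte cacheline and hypothetical names (`struct span`, `align_to_cachelines`):

```c
#include <stddef.h>

#define CACHELINE 64   /* assumed cacheline size in bytes */

struct span {
    size_t offset;
    size_t length;
};

/* Expand [offset, offset + length) so that it starts and ends on a
 * cacheline boundary. */
static struct span align_to_cachelines(size_t offset, size_t length)
{
    struct span s;
    size_t end = offset + length;

    s.offset = offset & ~(size_t)(CACHELINE - 1);
    end = (end + CACHELINE - 1) & ~(size_t)(CACHELINE - 1);
    s.length = end - s.offset;
    return s;
}
```

A caller would still need to make sure the widened range stays inside the buffer and that the source data covers the extra bytes.
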
Chris Wilson ecd6cca617 sna/gen5: Handle cpu-bo for render targets
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-11-29 19:27:46 +00:00
Chris Wilson d8f2e87473 sna/render: Fix check for "migrate whole pixmap"
The whole pixmap means the sample covers the full width and height, not
just either!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-11-29 14:14:23 +00:00
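
The corrected test can be sketched as a predicate that requires both dimensions to be covered. The `struct box` layout and the function name are assumptions for illustration, not the driver's definitions.

```c
#include <stdbool.h>

/* Hypothetical sample rectangle; x2 and y2 are exclusive. */
struct box {
    int x1, y1, x2, y2;
};

/* True only when the sample spans the full width and the full height.
 * Accepting either dimension alone would wrongly match a full-width
 * strip or a full-height column. */
static bool covers_whole_pixmap(const struct box *b, int width, int height)
{
    return b->x1 <= 0 && b->y1 <= 0 && b->x2 >= width && b->y2 >= height;
}
```
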
Chris Wilson 20e5791408 sna: Fix assertion around flushing of mmap(PROT_READ)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-11-29 10:59:20 +00:00
Chris Wilson 56155c91af sna/gen6: Set the batch mode prior to checking limits and flushing
If we change contexts, then we will submit the batch obsoleting the
earlier resource checks.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-11-29 10:27:18 +00:00
Chris Wilson 5b1e9e1573 sna: Always reduce tiling for thin pixmaps
Benchmarking on the current code base says this is now a win. A
reversal of older benchmarks, so expect further tuning.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-11-28 22:01:00 +00:00