Commit Graph

91 Commits

Author SHA1 Message Date
Chris Wilson 1b1016624a uxa/i915: Remove broken CA pass, fallback to magic 2-pass composite helper
The backend failed to handle all the corner cases, so remove the
complication.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-03-15 12:43:12 +00:00
Chris Wilson 9c6f79440e uxa: Remove unused tracking of the current render target
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-03-15 12:43:12 +00:00
Chris Wilson 219467ac8b uxa: Simplify flush tracking
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-03-15 12:43:12 +00:00
Chris Wilson 3c4f29820b uxa/gen3: Remove special casing of solid pictures
Fixes use of alpha-groups and opacity masks in cairo.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-02-27 16:28:41 +00:00
Chris Wilson ad22003033 i965: Avoid transform overheads for vertex emit where possible
Minor improvement as the bottlenecks lie elsewhere. But it was annoying me.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-04-07 15:09:21 +01:00
Chris Wilson a5a1ab7bbc i915: Remove unused 'w' and 'h'
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-01-28 17:20:06 +00:00
Chris Wilson 2c9b3225d8 i915: Remove unused 'num_floats' variable
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-01-28 17:19:39 +00:00
Eric Anholt 5a22bc999d Quiet compiler warning about is_affine_src same way we do is_affine_mask. 2011-01-17 11:32:37 -08:00
Chris Wilson 3cc74044ce i965: Amalgamate surface binding tables
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-12-03 14:05:30 +00:00
Matthias Hopf b84925b9c0 Make driver compile for 1.6 Xserver series again.
Signed-off-by: Matthias Hopf <mhopf@suse.de>
2010-09-22 17:45:06 +02:00
Chris Wilson 5c663ce844 Rename common infrastructure to the intel namespace.
After splitting out the i810 driver into its own legacy directory, we
can identify the common routines not as i830 but as intel. This
clarifies the code which *is* i830 specific.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-06-25 13:18:01 +01:00
Chris Wilson 2ff7a2fc9d i915: Force the emission of BUF_INFO on every composite_setup
We should be able to eliminate these as the drawable remains unchanged.
However, the implicit flush of BUF_INFO fixes the rendering in KDE.
Alternatively, we need an MI_FLUSH | INHIBIT_RENDER_CACHE_FLUSH between
composites. (Note that it is not stale cache data causing the rendering
corruption and that a pipelined flush is not sufficient either.) Also,
having tried various points at which to flush, the only place where the
flush is effective seems to be between composite operations - that is a
flush after 2D is not sufficient.

Reported-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reported-by: Clemens Eisserer <linuxhippy@gmail.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-06-17 14:08:07 +01:00
Chris Wilson 8c1a8d2297 Revert "xp:trapezoids"
This reverts commit f429fb9d87.

An experimental patch I forgot was on my main branch as I was bugfixing.
ARGH!
2010-06-09 10:03:29 +01:00
Chris Wilson f429fb9d87 xp:trapezoids 2010-06-08 19:52:46 +01:00
Chris Wilson 0776a42b70 implicit-flush 2010-06-08 15:00:16 +01:00
Chris Wilson dc402334f4 i915: Centre sampling.
Use centre sampling of textures to match pixman, and remove numerous
off-by-one and visual artefacts when rendering. The classic example for
this is cairo/text/xcomposite-projection where the edge of the rotated
rectangle is jaggy due to the incorrect sample position.

Fixes:

  Bug 16917  - [i915] Blur on y-axis also when only x-axis is scaled
               bilinear
  https://bugs.freedesktop.org/show_bug.cgi?id=16917

And about 15 tests from the Cairo test suite.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-06-01 23:15:02 +01:00
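The centre-sampling idea in the commit above can be illustrated with a small sketch. The helper name and signature are hypothetical and not taken from the driver; it only shows the arithmetic of addressing a texel at its centre (x + 0.5) instead of at its corner, assuming normalized coordinates in [0, 1].

```c
#include <assert.h>

/* Hypothetical helper, not driver code: normalized coordinate of the
 * centre of texel `x` in a texture that is `size` texels wide. */
static float texel_centre(int x, int size)
{
    assert(size > 0);
    /* The +0.5 moves the sample point from the texel corner to its centre. */
    return ((float)x + 0.5f) / (float)size;
}
```

For a 4-texel-wide texture, texel 0 is sampled at 0.125 rather than 0.0, so the edge texel is not blended with whatever lies outside the texture.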
Chris Wilson f74b3f82ba i915: Avoid the implicit flush on changing BUF_INFO
3DSTATE_BUF_INFO is an implicit flush of the pipeline, so avoid emitting
that and associated state unless the destination pixmap has actually
changed. This is a win of around 3-5% for cairo-perf-trace, notably for
firefox.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-06-01 23:15:02 +01:00
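A minimal sketch of the "emit only when the destination changes" pattern described in the commit above. The structure and function names are hypothetical, and the counter merely stands in for emitting 3DSTATE_BUF_INFO and its associated state; the real driver code is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state tracker, not the driver's real structure. */
struct render_state {
    const void *last_dst;   /* destination used by the previous emit */
    unsigned emit_count;    /* times the expensive state was emitted */
};

/* Emit destination state only when the destination actually changes.
 * Returns 1 if state was (re-)emitted, 0 if the emit was skipped. */
static int set_destination(struct render_state *s, const void *dst)
{
    assert(s != NULL && dst != NULL);
    if (s->last_dst == dst)
        return 0;           /* same target: avoid the implicit flush */
    s->last_dst = dst;
    s->emit_count++;        /* stands in for emitting 3DSTATE_BUF_INFO */
    return 1;
}
```

Note that the later commit 2ff7a2fc9d in this log forces the emission on every composite_setup again, because the implicit flush turned out to be needed for correct rendering in KDE.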
Chris Wilson 90c74a4314 i915: Don't re-emit vertex size unless it has changed.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-28 21:50:04 +01:00
Chris Wilson ea07535240 i915: Emit CA over using OutReverse + Add passes
On PineView:
  578/621 -> 610/617 kglyphs/sec [rgb/aa]
2010-05-24 18:31:16 +01:00
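The two-pass decomposition named in the commit above can be sketched per colour channel. This assumes premultiplied colours as floats in [0, 1] and the usual component-alpha Over definition, dst' = src*mask + dst*(1 - src_alpha*mask); the function names are hypothetical and the hardware blend setup is not shown.

```c
#include <assert.h>

/* Pass 1, OutReverse: scale the destination by (1 - src_alpha * mask). */
static float ca_out_reverse(float dst, float src_alpha, float mask)
{
    return dst * (1.0f - src_alpha * mask);
}

/* Pass 2, Add: accumulate src * mask onto the destination, saturating. */
static float ca_add(float dst, float src, float mask)
{
    float v = dst + src * mask;
    return v > 1.0f ? 1.0f : v;
}

/* Component-alpha Over for one channel, expressed as the two passes. */
static float ca_over_two_pass(float dst, float src, float src_alpha, float mask)
{
    assert(mask >= 0.0f && mask <= 1.0f);
    dst = ca_out_reverse(dst, src_alpha, mask);
    return ca_add(dst, src, mask);
}
```

Two passes are needed because a single blend stage cannot use both src*mask as the source colour and src_alpha*mask as the blend factor at once.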
Chris Wilson 80a9e64f50 uxa: Use temporary dest when target is too large for compositor
If the destination cannot fit into the 3D pipeline when we need to
composite, we fallback to doing the operation on the CPU. This is very
slow, and quite easy to trigger on i915 by plugging in an external
display.

An alternative is to extract the extents of the operation from the
destination using the blitter which can usually handle much larger
operations. This gives us a temporary target that can fit into the 3D
pipeline and thus be accelerated, before copying back into the larger
real destination.

For x11perf this boosts glyph rendering on PineView from 38kglyphs/s to
480kglyphs/s, just a little shy of the native performance of 601kglyphs/s.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-24 18:31:16 +01:00
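The decision described in the commit above can be sketched as a three-way choice. The names, the enum, and the `max_3d` parameter are hypothetical; the actual limits and the blitter copies are not modelled.

```c
#include <assert.h>

/* Hypothetical classification of how a composite could be carried out. */
enum composite_path {
    PATH_DIRECT,     /* destination fits the 3D pipeline */
    PATH_TEMPORARY,  /* blit extents to a temporary, composite, copy back */
    PATH_FALLBACK    /* CPU fallback */
};

/* max_3d is an assumed maximum surface dimension for the 3D pipeline. */
static enum composite_path
choose_path(int dst_w, int dst_h, int op_w, int op_h, int max_3d)
{
    assert(max_3d > 0);
    if (dst_w <= max_3d && dst_h <= max_3d)
        return PATH_DIRECT;
    /* Destination too large, but the operation extents may still fit. */
    if (op_w <= max_3d && op_h <= max_3d)
        return PATH_TEMPORARY;
    return PATH_FALLBACK;
}
```

With an assumed limit of 2048, a 3200x1200 dual-head destination with a 300x20 glyph run would take the temporary path rather than falling back to the CPU.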
Chris Wilson e3ece83f57 i915: compute normalized texcoords using a scale factor.
500 -> 580kglyphs/s on i945.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-24 09:42:18 +01:00
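One plausible reading of "using a scale factor" in the commit above is to precompute 1/width and 1/height once per texture and multiply per vertex, instead of dividing per vertex. The sketch below illustrates that idea only; the type and function names are hypothetical.

```c
#include <assert.h>

/* Hypothetical per-texture scale factors, computed once at setup. */
struct tex_scale {
    float sx, sy;
};

static struct tex_scale tex_scale_init(int width, int height)
{
    struct tex_scale s;
    assert(width > 0 && height > 0);
    s.sx = 1.0f / (float)width;
    s.sy = 1.0f / (float)height;
    return s;
}

/* Per-vertex work is now a multiply rather than a divide. */
static float normalize_x(const struct tex_scale *s, float x)
{
    return x * s->sx;
}

static float normalize_y(const struct tex_scale *s, float y)
{
    return y * s->sy;
}
```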
Chris Wilson 2adf823b80 i915: Add special case primitive emitters for glyphs.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-24 09:40:26 +01:00
Chris Wilson f64ab9e0d9 i915: Move vertices into a vertex buffer object.
In theory this should allow us to pack far more operations into a single
batch buffer, and reduce our overheads.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-24 09:36:23 +01:00
Chris Wilson 2b050f330f Use pwrite to upload the batch buffer
By using pwrite() instead of dri_bo_map() we can write to the batch buffer
through the GTT and not be forced to map it back into the CPU domain and
out again, eliminating a double clflush.

Measuring x11perf text performance on PineView:

Before:
16000000 trep @   0.0020 msec (511000.0/sec): Char in 80-char aa line (Charter 10)
16000000 trep @   0.0021 msec (480000.0/sec): Char in 80-char rgb line (Charter 10)
After:
16000000 trep @   0.0019 msec (532000.0/sec): Char in 80-char aa line (Charter 10)
16000000 trep @   0.0020 msec (496000.0/sec): Char in 80-char rgb line (Charter 10)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-24 09:33:37 +01:00
Chris Wilson dcef703a7c Kill paranoid assertions on every write into the batchbuffer.
On my PineView box these represent ~5% overhead on x11perf text:

Before:
16000000 trep @   0.0020 msec (495000.0/sec): Char in 80-char aa line (Charter 10)
12000000 trep @   0.0022 msec (461000.0/sec): Char in 80-char rgb line (Charter 10)

After:
16000000 trep @   0.0020 msec (511000.0/sec): Char in 80-char aa line (Charter 10)
16000000 trep @   0.0021 msec (480000.0/sec): Char in 80-char rgb line (Charter 10)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-24 09:33:35 +01:00
Chris Wilson bc41f84e01 i915: Emit composite primitive with specialised functions.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-24 09:32:30 +01:00
Chris Wilson 4a3476ea09 i915: amalgamate composite into a single primitive list
Combine all the calls to composite between prepare_composite and
done_composite into a single primitive list, rather than a primitive
call per composite().

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-23 18:52:15 +01:00
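The batching described in the commit above can be sketched as follows. Only the names prepare_composite, composite, and done_composite come from the commit message; the structures, the capacity, and the counter standing in for an emitted primitive command are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RECTS 64   /* arbitrary capacity chosen for the sketch */

struct rect { int x, y, w, h; };

struct prim_list {
    struct rect rects[MAX_RECTS];
    int count;          /* rectangles accumulated since the last flush */
    int prims_emitted;  /* primitive commands issued so far */
};

static void prepare_composite(struct prim_list *p)
{
    p->count = 0;
}

/* Emit everything accumulated so far as one primitive. */
static void flush_list(struct prim_list *p)
{
    if (p->count == 0)
        return;
    p->prims_emitted++;  /* stands in for one primitive command */
    p->count = 0;
}

/* Append a rectangle instead of emitting a primitive per call. */
static void composite(struct prim_list *p, struct rect r)
{
    assert(p != NULL);
    if (p->count == MAX_RECTS)
        flush_list(p);
    p->rects[p->count++] = r;
}

static void done_composite(struct prim_list *p)
{
    flush_list(p);
}
```

Three composite() calls between prepare and done then cost one primitive command rather than three.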
Chris Wilson 213816c30b i915: Load texture directly into OC when possible.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-15 00:48:19 +01:00
Chris Wilson 271240fd47 i915: Remove a couple of unsupported 16bpp no-alpha tex formats
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-14 23:56:05 +01:00
Chris Wilson 4be8d7eb89 i915: Don't force alpha=1 for RGB drawables in the shader.
I was blindly fixing rendercheck without thinking. We need to force the
alpha value to be in the blend unit and not before -- otherwise we
generate the incorrect result whilst blending. D'oh.
2010-05-14 21:16:51 +01:00
Chris Wilson 25811dc7b7 i915: Force output alpha to 1. if dst has no alpha channel.
Ensure that garbage is not stored in the unused alpha channel so that
we can rely on it being currently initialised when used as a source or
returning via GetImage.

Partial fix for rendercheck -t blend
2010-05-13 17:17:10 +01:00
Chris Wilson 0e726b85ca i915: Add a2r10g10b10 format and friends
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-13 09:40:27 +01:00
Chris Wilson 3eded4202e i915: Fix pixmap based masks.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-10 23:38:17 +01:00
Chris Wilson 0d4dd00aea uxa,i915: Handle SourcePict through uxa_composite()
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-10 12:29:26 +01:00
Chris Wilson 21c1c3c7f6 i915: Use 1x1R pixmap for solid drawables
x11perf has a regression
  https://bugs.freedesktop.org/show_bug.cgi?id=25068

caused by

  commit e581ceb738
  i915: Use the color channels to pass along solid sources and masks.

Do not convert 1x1R pixmaps into a solid color as the readback from the
bo negates all the performance advantages of using a smaller vertex
buffer and fewer samplers.

Before (PineView):
  aa=66800 glyphs/s, rgb=28800 glyphs/s

Now:
  aa=96800 glyphs/s, rgb=48500 glyphs/s

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-10 10:36:15 +01:00
Chris Wilson f52b6e8322 uxa: Rearrange checking and preparing of composite textures.
x11perf regression caused by 2D driver
  https://bugs.freedesktop.org/show_bug.cgi?id=28047

caused by

  commit a7b800513f
  uxa: Extract sub-region from in-memory buffers.

The issue is that as we extract the region prior to checking whether the
composite can in fact be accelerated, we perform expensive surplus
operations. This is particularly noticeable for ComponentAlpha text,
such as rgb10text. The solution here is to rearrange the
check_composite() prior to acquiring the sources, and only extracting
the subregion if the render path can not actually handle the texture.

Performance (on PineView):
a7b800513^: aa=68600 glyphs/s, rgb=29900 glyphs/s
a7b800513: aa=65700 glyphs/s, rgb=13200 glyphs/s
now: aa=66800 glyphs/s, rgb=28800 glyphs/s

The residual lossage seems to be from the extra function call and
dixPrivate lookups. Hmm. More worrying is the extremely low performance;
however, the results are consistent so the improvement looks real...

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-10 10:36:14 +01:00
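The reordering described in the commit above, checking whether the composite can be accelerated before doing the expensive sub-region extraction, can be sketched as below. The structures and field names are hypothetical; the counter stands in for the extraction work.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of a source picture for the sketch. */
struct source {
    int accelerable;             /* would check_composite() accept it? */
    int handled_by_render_path;  /* can the texture be used as-is? */
};

struct stats {
    int extractions;             /* expensive sub-region extractions done */
};

/* Returns 1 if the operation is accelerated, 0 if it falls back. */
static int composite_sketch(const struct source *src, struct stats *st)
{
    assert(src != NULL && st != NULL);
    if (!src->accelerable)
        return 0;                /* check first: no wasted extraction */
    if (!src->handled_by_render_path)
        st->extractions++;       /* extract only when really needed */
    return 1;
}
```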
Daniel Vetter a619a78312 i915 render: use tiling bits where possible
This is in preparation to explicit fence allocation with execbuf2.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2010-04-13 08:34:20 +02:00
Chris Wilson 31d5f84bb4 i915: Correct preamble for emit_composite
Fixes:
http://bugs.freedesktop.org/show_bug.cgi?id=27123

Fatal server error:
i915_emit_composite_setup: ADVANCE_BATCH: under-used allocation 100/104

Introduced with commit d6b7f96fde.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-03-17 09:20:41 +00:00
Chris Wilson d6b7f96fde Fill alpha on xrgb images.
Do not try to fixup the alpha in the ff/shaders as this has the
side-effect of overriding the alpha value of the border color, causing
images to be padded with black rather than transparent. This can
generate large and obnoxious visual artefacts.

Fixes:

  Bug 17933 - x8r8g8b8 doesn't sample alpha=0 outside surface bounds
  http://bugs.freedesktop.org/show_bug.cgi?id=17933

and many related cairo test suite failures.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-03-16 10:53:29 +00:00
Chris Wilson cd475bad23 batch: Ensure we send a MI_FLUSH in the block handler for TFP
This should restore the previous level of synchronisation between
textures and pixmaps, but *does not* guarantee that a texture will be
flushed before use. tfp should be fixed so that the ddx can submit the
batch if required to flush the pixmap.

A side-effect of this patch is to rename intel_batch_flush() to
intel_batch_submit() to reduce the confusion of executing a batch buffer
with that of emitting a MI_FLUSH.

Should fix the remaining rendering corruption involving tfp [inc compiz]:

  Bug 25431 [i915 bisected] piglit/texturing_tfp regressed
  http://bugs.freedesktop.org/show_bug.cgi?id=25431

  Bug 25481 Wrong cursor format and cursor blink rate with compiz enabled
  http://bugs.freedesktop.org/show_bug.cgi?id=25481

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-12-07 11:15:42 +00:00
Chris Wilson cfcabc4514 i915: Disable centre-point sampling.
I still have no idea how this is triggering failures, but it is. So
revert until the problem is solved.

Should fix once again:

  Bug 23803 [bisected i915] gnome characters disappear
  http://bugs.freedesktop.org/show_bug.cgi?id=23803

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-11-30 14:04:25 +00:00
Chris Wilson 8f8b6bd03d i915: Whitespace
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-11-30 14:03:40 +00:00
Chris Wilson b118a52cd1 i915: Remove routing of alpha channel to green.
This modification is redundant since the routing is done in the blend
unit anyway.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-11-30 11:14:26 +00:00
Chris Wilson 5e04ded2bc i915: Fix missing texture offset for mask.
In commit e581ceb, I modified the shader generation to accommodate mixed
textures and solids but missed applying the new computed sampler for the
mask.

References:

  Bug 23803 [bisected i915] gnome characters disappear
  http://bugs.freedesktop.org/show_bug.cgi?id=23803

  Bug 25031 rendering and color corruption since 14109abf
  http://bugs.freedesktop.org/show_bug.cgi?id=25031

  Bug 25047 [945GM bisected] rendercheck/repeat/triangles regressed
  http://bugs.freedesktop.org/show_bug.cgi?id=25047

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-11-30 11:12:03 +00:00
Chris Wilson 9a2c18fb92 batch: Emit a 'pipelined' flush when using a dirty source.
Ensure that the render caches and texture caches are appropriately
flushed when switching a pixmap from a target to a source.

This should fix bug 24315,
  [855GM] Rendering corruption in text (usually)
  https://bugs.freedesktop.org/show_bug.cgi?id=24315

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-11-29 22:42:07 +00:00
Chris Wilson c180baf43b i915: Derive the correct target color from the pixmap by checking its format
Particularly noting to route alpha to the green channel when blending
with a8 destinations.

Fixes:

  rendercheck/repeat/triangles regressed
  http://bugs.freedesktop.org/show_bug.cgi?id=25047

introduced with commit 14109a.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-11-13 20:20:52 +00:00
Chris Wilson 14109abf28 i915: Fix texture sampling coordinates.
RENDER specifies that texels should be sampled from the pixel centre. This
corrects a number of failures in the cairo test suite and a few
off-by-one bug reports.

  Grey border around images
  https://bugs.freedesktop.org/show_bug.cgi?id=21523

Note that the earlier attempt to fix this was subverted by the buggy use
of 1x1R textures for solid sources -- which caused the majority of text
to disappear.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-11-10 15:30:49 +00:00
Chris Wilson e581ceb738 i915: Use the color channels to pass along solid sources and masks.
Instead of allocating and utilising the texture samplers for 1x1R
solid sources and masks we can simply use the default diffuse and
specular colour channels and adjust the fragment shader appropriately.
The big advantage is the reduction in size of batches which should give
a good boost to glyph performance, irrespective of the additional boost
from using simpler shaders.

However, the motivating factor behind the switch is that our use of 1x1
textures turns out to be buggy...

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-11-10 15:19:23 +00:00
Chris Wilson 67af5a9925 Check that batch buffers are atomic.
Since batch buffers are rarely emitted by themselves but as part of a
sequence of state and vertices, the whole sequence is emitted atomically.

Here we just enforce that batches are marked as being part of an atomic
sequence as appropriate.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-11-10 15:19:08 +00:00
Eric Anholt 8ff2a64964 Remove flow-control macros for fallbacks in the 2D driver.
It's poor style, and has confused new developers.
2009-11-05 14:22:56 -08:00