72 Commits

Author SHA1 Message Date
Chris Wilson 5c663ce844 Rename common infrastructure to the intel namespace.
After splitting out the i810 driver into its own legacy directory, we
can identify the common routines not as i830 but as intel. This
clarifies the code which *is* i830 specific.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-06-25 13:18:01 +01:00
Chris Wilson 995a4b2b1d i965: Sanity check ComponentAlpha status in prepare_composite
Fixes:

  Bug 28446 - Garbled Font with Mathematica 7
  https://bugs.freedesktop.org/show_bug.cgi?id=28446

Rewriting the glyphs to render to the destination directly and removing
the more expensive multiple invocations of CompositePicture per picture
was a great performance boost -- except that it needs special handling
in the backend in order to not fallback. Having done so for i915, I
neglected to ensure the sanity checking in i965_prepare_composite() was
sufficient. As it turns out, it was not and so we misrendered CA-glyphs
when rendering directly to the destination. This causes us to fallback
properly, but is a performance regression as we no longer try the 2-pass
magic helper before resorting to s/w. At the moment, I'd rather live
with the temporary regression and fix i965 to do the same magic as i915,
as it is critical to fixing the severe performance issues currently
crippling i965, as I believe that this regression only affects the
minority of applications (incorrect, as it turns out, as the glyphs are
overlapping) rendering directly to the destination.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-06-14 12:14:30 +01:00
Chris Wilson 8c1a8d2297 Revert "xp:trapezoids"
This reverts commit f429fb9d87.

An experimental patch I forgot was on my main branch as I was bugfixing.
ARGH!
2010-06-09 10:03:29 +01:00
Chris Wilson f429fb9d87 xp:trapezoids 2010-06-08 19:52:46 +01:00
Chris Wilson 80a9e64f50 uxa: Use temporary dest when target is too large for compositor
If the destination cannot fit into the 3D pipeline when we need to
composite, we fallback to doing the operation on the CPU. This is very
slow, and quite easy to trigger on i915 by plugging in an external
display.

An alternative is to extract the extents of the operation from the
destination using the blitter which can usually handle much larger
operations. This gives us a temporary target that can fit into the 3D
pipeline and thus be accelerated, before copying back into the larger
real destination.

For x11perf this boosts glyph rendering on PineView from 38kglyphs/s to
480kglyphs/s, just a little shy of the native performance of 601kglyphs/s.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-24 18:31:16 +01:00
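
The decision described in 80a9e64f50 can be sketched roughly as below. This is only an illustration under stated assumptions: the type and function names are hypothetical (they are not taken from the driver), and the 3D size limit is passed in as a parameter because the message does not give a figure.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical types and names, for illustration only. */
struct box { int x1, y1, x2, y2; };   /* half-open: [x1,x2) x [y1,y2) */

enum composite_path {
    PATH_DIRECT,      /* destination fits the 3D pipeline as-is */
    PATH_TEMPORARY,   /* blit the extents out, composite, blit back */
    PATH_SOFTWARE     /* even the extents are too large: CPU fallback */
};

static bool fits_3d(int width, int height, int max_3d)
{
    return width <= max_3d && height <= max_3d;
}

/* Pick a path from the destination size and the extents of the operation. */
static enum composite_path
choose_composite_path(int dst_w, int dst_h, struct box extents, int max_3d)
{
    assert(extents.x2 > extents.x1 && extents.y2 > extents.y1);

    if (fits_3d(dst_w, dst_h, max_3d))
        return PATH_DIRECT;
    if (fits_3d(extents.x2 - extents.x1, extents.y2 - extents.y1, max_3d))
        return PATH_TEMPORARY;
    return PATH_SOFTWARE;
}
```

With an assumed limit of 2048, a small run of glyphs drawn onto a 3200x1200 destination would take PATH_TEMPORARY rather than falling back to the CPU.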
Chris Wilson dcef703a7c Kill paranoid assertions on every write into the batchbuffer.
On my PineView box these represent ~5% overhead on x11perf text:

Before:
16000000 trep @   0.0020 msec (495000.0/sec): Char in 80-char aa line (Charter 10)
12000000 trep @   0.0022 msec (461000.0/sec): Char in 80-char rgb line (Charter 10)

After:
16000000 trep @   0.0020 msec (511000.0/sec): Char in 80-char aa line (Charter 10)
16000000 trep @   0.0021 msec (480000.0/sec): Char in 80-char rgb line (Charter 10)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-24 09:33:35 +01:00
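
As a quick check of the x11perf figures quoted in dcef703a7c, the relative gain can be computed from the before/after rates. The helper name below is hypothetical; only the four rates come from the message.

```c
#include <assert.h>

/* Relative throughput gain, in percent, going from `before` to `after`. */
static double percent_gain(double before, double after)
{
    assert(before > 0.0);
    return (after - before) / before * 100.0;
}
```

Applied to the quoted rates, percent_gain(495000, 511000) comes to a little over 3% for the aa line and percent_gain(461000, 480000) to a little over 4% for the rgb line, in the neighbourhood of the "~5%" stated above.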
Chris Wilson 2c69709d8a i830: Encode surface bpp into format
References:

  Bug 28135 - [855GM] Slowdown/High CPU-Usage after Git-Commit
              926fbc7d90
  https://bugs.freedesktop.org/show_bug.cgi?id=28135

The simple answer is that I had assumed that 0 was a reserved value.
However, without the bpp encoded into the format, 0 was used for a8r8g8b8
and r5g6b5, which are very common formats!

The other possibility for the slowdown is that gtkperf is using one of
the now verboten xrgb formats -- which would in fact be valid if the
source covers the clip and we could fixup the alpha value in the fixed
function combine.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-16 18:41:52 +01:00
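
The problem described in 2c69709d8a, where 0 was treated as a reserved value although two common formats mapped to it, can be illustrated with a small packing scheme. Everything here is an assumption made for the sketch: the HYPO_ constants, the bit layout and the function names are hypothetical and do not reproduce the driver's actual tables.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the hardware type field: both common
 * formats carry the value 0 there, so 0 cannot double as "invalid". */
#define HYPO_MT_ARGB8888 0u
#define HYPO_MT_RGB565   0u

/* Pack bits-per-pixel above the type field so no valid entry equals 0. */
#define HYPO_BPP_SHIFT 8

static uint32_t encode_format(unsigned bpp, uint32_t type)
{
    assert(bpp != 0);
    return ((uint32_t)bpp << HYPO_BPP_SHIFT) | type;
}

static unsigned format_bpp(uint32_t format)
{
    return format >> HYPO_BPP_SHIFT;
}

static uint32_t format_type(uint32_t format)
{
    return format & ((1u << HYPO_BPP_SHIFT) - 1u);
}
```

Once the bpp is part of the encoded value, the 32-bit and 16-bit entries are distinct and nonzero, so a lookup that returns 0 can safely mean "no matching format".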
Chris Wilson 926fbc7d90 i830: Remove incorrectly mapped tex formats.
We no longer workaround the lack of alpha expansion for xrgb textures as
this interferes with EXTEND_NONE, though we could if we know the source
covers the clip...

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-15 01:09:13 +01:00
Chris Wilson f52b6e8322 uxa: Rearrange checking and preparing of composite textures.
x11perf regression caused by 2D driver
  https://bugs.freedesktop.org/show_bug.cgi?id=28047

caused by

  commit a7b800513f
  uxa: Extract sub-region from in-memory buffers.

The issue is that as we extract the region prior to checking whether the
composite can in fact be accelerated, we perform expensive surplus
operations. This is particularly noticeable for ComponentAlpha text,
such as rgb10text. The solution here is to move check_composite()
ahead of acquiring the sources, and to extract the subregion only if
the render path cannot actually handle the texture.

Performance (on PineView):
a7b800513^: aa=68600 glyphs/s, rgb=29900 glyphs/s
a7b800513: aa=65700 glyphs/s, rgb=13200 glyphs/s
now: aa=66800 glyphs/s, rgb=28800 glyphs/s

The residual lossage seems to be from the extra function call and
dixPrivate lookups. Hmm. More worrying is the extremely low performance;
however, the results are consistent, so the improvement looks real...

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-10 10:36:14 +01:00
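
The reordering described in f52b6e8322 amounts to deciding whether the operation can be accelerated before paying for any sub-region extraction. A minimal sketch, with hypothetical names and a counter standing in for the expensive copy:

```c
#include <stdbool.h>

enum composite_plan {
    PLAN_FALLBACK,              /* operation cannot be accelerated */
    PLAN_ACCEL,                 /* texture usable as-is */
    PLAN_ACCEL_WITH_SUBREGION   /* texture needs a sub-region extracted */
};

/* `extractions` counts how often the costly sub-region copy would run. */
static enum composite_plan
plan_composite(bool op_accelerable, bool texture_usable, int *extractions)
{
    if (!op_accelerable)
        return PLAN_FALLBACK;   /* bail out before touching the sources */
    if (texture_usable)
        return PLAN_ACCEL;
    (*extractions)++;           /* only now pay for the extraction */
    return PLAN_ACCEL_WITH_SUBREGION;
}
```

In this ordering an operation that ends up falling back never triggers an extraction, which is the surplus work the message attributes to a7b800513f.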
Daniel Vetter 324a2810da i830 render: check aperture space requirements
No point not doing this.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2010-04-13 08:39:43 +02:00
Daniel Vetter 55cd36046e i830 render: use tiling bits where possible
This is in preparation for explicit fence allocation with execbuf2.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2010-04-13 08:34:20 +02:00
Chris Wilson d6b7f96fde Fill alpha on xrgb images.
Do not try to fixup the alpha in the ff/shaders as this has the
side-effect of overriding the alpha value of the border color, causing
images to be padded with black rather than transparent. This can
generate large and obnoxious visual artefacts.

Fixes:

  Bug 17933 - x8r8g8b8 doesn't sample alpha=0 outside surface bounds
  http://bugs.freedesktop.org/show_bug.cgi?id=17933

and many related cairo test suite failures.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-03-16 10:53:29 +00:00
Chris Wilson 910fd171a0 i830: Remove coord-adjust for nearest centre-sampling.
Fixes a number of cairo test suite failures.

Also affects:
  Bug 16917 - Blur on y-axis also when only x-axis is scaled bilinear
  http://bugs.freedesktop.org/show_bug.cgi?id=16917

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-03-16 10:53:29 +00:00
Eric Anholt ec5deb2bcb Remove dead assignments noticed by clang. 2010-02-20 12:55:13 -05:00
Chris Wilson cd475bad23 batch: Ensure we send a MI_FLUSH in the block handler for TFP
This should restore the previous level of synchronisation between
textures and pixmaps, but *does not* guarantee that a texture will be
flushed before use. tfp should be fixed so that the ddx can submit the
batch if required to flush the pixmap.

A side-effect of this patch is to rename intel_batch_flush() to
intel_batch_submit() to reduce the confusion of executing a batch buffer
with that of emitting a MI_FLUSH.

Should fix the remaining rendering corruption involving tfp [inc compiz]:

  Bug 25431 [i915 bisected] piglit/texturing_tfp regressed
  http://bugs.freedesktop.org/show_bug.cgi?id=25431

  Bug 25481 Wrong cursor format and cursor blink rate with compiz enabled
  http://bugs.freedesktop.org/show_bug.cgi?id=25481

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-12-07 11:15:42 +00:00
Chris Wilson 9a2c18fb92 batch: Emit a 'pipelined' flush when using a dirty source.
Ensure that the render caches and texture caches are appropriately
flushed when switching a pixmap from a target to a source.

This should fix bug 24315,
  [855GM] Rendering corruption in text (usually)
  https://bugs.freedesktop.org/show_bug.cgi?id=24315

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-11-29 22:42:07 +00:00
Chris Wilson 67af5a9925 Check that batch buffers are atomic.
Since batch buffers are rarely emitted by themselves but as part of a
sequence of state and vertices, the whole sequence is emitted atomically.

Here we just enforce that batches are marked as being part of an atomic
sequence as appropriate.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-11-10 15:19:08 +00:00
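
The kind of check described in 67af5a9925 can be sketched as a toy batch buffer that refuses writes made outside an atomic sequence. This is a simplified model under stated assumptions: the hypo_ names, the fixed 16-dword size and the submit-on-overflow policy are all hypothetical and are not the driver's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define HYPO_BATCH_DWORDS 16

struct hypo_batch {
    uint32_t data[HYPO_BATCH_DWORDS];
    unsigned used;          /* dwords written so far */
    unsigned atomic_limit;  /* end of the reserved atomic region */
    unsigned submits;       /* how many times the batch was submitted */
    bool in_atomic;
};

static void hypo_batch_submit(struct hypo_batch *b)
{
    assert(!b->in_atomic);  /* never split an atomic sequence */
    b->used = 0;
    b->submits++;
}

/* Reserve room for a whole sequence up front, submitting first if needed. */
static void hypo_batch_start_atomic(struct hypo_batch *b, unsigned dwords)
{
    assert(!b->in_atomic);
    assert(dwords <= HYPO_BATCH_DWORDS);
    if (b->used + dwords > HYPO_BATCH_DWORDS)
        hypo_batch_submit(b);
    b->atomic_limit = b->used + dwords;
    b->in_atomic = true;
}

static void hypo_batch_emit(struct hypo_batch *b, uint32_t dword)
{
    assert(b->in_atomic);               /* writes must be inside a sequence */
    assert(b->used < b->atomic_limit);  /* and within the reserved space */
    b->data[b->used++] = dword;
}

static void hypo_batch_end_atomic(struct hypo_batch *b)
{
    assert(b->in_atomic);
    b->in_atomic = false;
}
```

Because space is reserved when the sequence starts, state and vertices emitted between start and end can never straddle two batches.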
Eric Anholt 4c8e783d84 Fix "Remove flow-control macros for fallbacks in the 2D driver."
I guess this is the sort of failure due to rebase-happiness that makes
Linus yell at us for rebasing.
2009-11-05 16:01:32 -08:00
Eric Anholt 8ff2a64964 Remove flow-control macros for fallbacks in the 2D driver.
It's poor style, and has confused new developers.
2009-11-05 14:22:56 -08:00
Chris Wilson 3c0815abf2 conf: Add debugging flush options
Make the following options available via xorg.conf:
  Section "Driver"
    Option "DebugFlushBatches" "1" # Flush the batch buffer after every
                                   # single operation;

    Option "DebugFlushCaches" "1" # Include a MI_FLUSH at the end of every
                                  # batch buffer to force data to be
                                  # flushed out of cache and into memory
                                  # before the completion of the batch.

    Option "DebugWait" "1" # Wait for the completion of every batch buffer
                           # before continuing, i.e. perform synchronous
                           # rendering.
  EndSection

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-10-14 11:53:20 +01:00
Eric Anholt f309d47524 Call pPixmaps plain old pixmaps. 2009-10-08 15:34:09 -07:00
Eric Anholt da0f6616ad de-pCamelHungarian the Render pictures and pixmaps. 2009-10-08 15:34:09 -07:00
Eric Anholt 050a141b7b Share several render fields between render implementations.
Also, start settling on the cairo naming for things: source, mask, and dest.
2009-10-08 15:34:09 -07:00
Eric Anholt af27a3a0a5 Rename the xf86 screen private from pScrn to scrn. 2009-10-08 15:34:09 -07:00
Eric Anholt cc5d3ba3c3 Rename the screen private from I830Ptr pI830 to intel_screen_private *intel.
This is the beginning of the campaign to remove some of the absurd use of
Hungarian in the driver.  Not that I don't like Hungarian, but I don't need
to know that pI830 is a pPointer.
2009-10-08 15:34:09 -07:00
Eric Anholt 8ae0e44e42 Move to kernel coding style.
We've talked about doing this since the start of the project, putting it off
until "some convenient time".  Just after removing a third of the driver seems
like a convenient time, when backporting's probably not happening much anyway.
2009-10-06 17:10:31 -07:00
Chris Wilson 762e406d13 Revert "8xx: Fallback for any non-affine transformation."
This reverts commit 505025053d.

In theory, the non-affine paths work -- at least for the stated test case,
so re-enable them and avoid the slow work-around.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-09-22 01:36:09 +01:00
Keith Packard 2cc1f3cb60 i8xx: Format projective texture coordinates correctly.
Projective texture coordinates must be delivered as TEXCOORDFMT_3D
using TEXCOORDTYPE_HOMOGENOUS. This meant selecting the correct type
in i830_texture_setup, the correct format in i830_emit_composite_state
and sending only 3 coordinates in i830_emit_composite_primitive.

Signed-off-by: Keith Packard <keithp@keithp.com>
[ickle: tweaked to fix up a couple of use-before-initialised]
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-09-22 01:30:59 +01:00
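
To illustrate the coordinate counts mentioned in 2cc1f3cb60, the sketch below computes texture coordinates from a 3x3 transform and reports how many values would be sent. It assumes the three projective values are the homogeneous (s, t, w) triple, which the message does not spell out; the struct layout and function names are hypothetical.

```c
#include <stdbool.h>

/* Row-major 3x3 transform; hypothetical layout for illustration. */
struct hypo_transform { float m[3][3]; };

static bool is_affine(const struct hypo_transform *t)
{
    return t->m[2][0] == 0.0f && t->m[2][1] == 0.0f && t->m[2][2] == 1.0f;
}

/* Writes the texture coordinates for pixel (x, y) into out[] and returns
 * how many floats were written: 2 for affine (s, t), 3 for projective
 * (s, t, w), leaving the divide by w to the sampler. */
static int emit_texcoords(const struct hypo_transform *t,
                          float x, float y, float out[3])
{
    out[0] = t->m[0][0] * x + t->m[0][1] * y + t->m[0][2];
    out[1] = t->m[1][0] * x + t->m[1][1] * y + t->m[1][2];
    if (is_affine(t))
        return 2;
    out[2] = t->m[2][0] * x + t->m[2][1] * y + t->m[2][2];
    return 3;
}
```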
Keith Packard bd817e2d73 Split i915/i830 composite_emit_primitive into two functions.
The i915 and i830 take similar but different data when emitting the
primitives. Instead of trying to share code here, just split this
apart and avoid potentially breaking things later on.

Signed-off-by: Keith Packard <keithp@keithp.com>
2009-09-21 17:24:11 -07:00
Carl Worth 505025053d 8xx: Fallback for any non-affine transformation.
There are definitely bugs in the 8xx code dealing with non-affine
transformations. Disable that code for now to get things working.

Fixes bug #22947 ([855GM, xf86-video-intel-2.8.0] "Freeze" when RENDER extension is being used)
2009-09-21 15:46:51 -07:00
Chris Wilson c2abfa8e54 Avoid fallbacks for compositing gradient patterns
Currently when asked to composite using a gradient source or mask, we
fallback to using fbComposite().  This has the side-effect of causing a
readback on the destination surface, stalling the GPU pipeline.  Instead,
like uxa_trapezoids(), we can use pixman to fill a scratch pixmap and then
copy that to an offscreen pixmap for use with uxa_composite().

Speedups on i915:
firefox-talos-svg:  710378.14 -> 549262.96:  1.29x speedup

No slowdowns.

Thanks to Søeren Sandmann Pedersen for spotting the missing
ValidatePicture().

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-09-14 16:26:57 +01:00
Chris Wilson 1a77ca74bc i915: Restore nearest sampling
My recent commit [94fc93] to use the pixel centre for sampling with the i830
broke the i915. This restores the previous sampling coordinates for the
i915 whilst preserving the correct coordinates for i830.

Fixes: gnome characters disappear
       http://bugs.freedesktop.org/show_bug.cgi?id=23803

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-09-09 12:40:15 +01:00
Chris Wilson 94fc93d4e2 i830/i915: Set the sample position to the pixel center.
And in particular we apply the nearest sample bias separately for
src/mask.

Fixes cairo/test:
	device-offset-scale
	finer-grained-fallbacks
	mask-transformed-{similar,image}
	meta-surface-pattern
	pixman-rotate
	surface-pattern-big-scale-down
	text-transform

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-09-05 11:06:44 +01:00
Chris Wilson ced0cc8bb2 i830: Update comments
i830_composite() is no longer shared with i915 but
i830_emit_composite_primitive() is.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-09-05 11:06:44 +01:00
Chris Wilson 8863706e25 i830: Trim composite setup
Remove a couple of redundant NOOPs from the setup and correct the required
space checking for atomic batch operation.
2009-09-05 11:06:11 +01:00
Chris Wilson a9b12111f9 i830: remove padding NOOPs from composite
Bumps aa10text up from 249k to 260k!

These NOOPs have existed uncommented since
04d1584737.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-09-05 01:58:03 +01:00
Chris Wilson 9c1bf6d01c i830: do not use stale mask transform
Not only were we incorrectly falling back if we had non-affine
transformations, but we made the decision based on a stale transformation
matrix.

Related bug 22877:
   batch_start_atomic horribly breaks performance after a while
   https://bugs.freedesktop.org/show_bug.cgi?id=22877

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Maximilian Grothusmann <maxi@own-hero.net>
2009-09-05 01:34:59 +01:00
Eric Anholt 12c5aeca7a 8xx render: Add limited support for a8 dests.
This improves aa10text performance from 74k to 569k on my 855 laptop.
This also causes my 865 to hang on aa10text like it does on rgb10text,
thanks to actually hitting render accel.
2009-07-22 09:58:17 -07:00
Eric Anholt 8dd7ccf37e Fix 915-class Render after the 8xx-class Render fix.
The two shared i830_composite.c, so giving i830 atomic batch support
triggered anger about starting i830's atomic area while in i915's atomic
area.  Instead, split the emit-a-primitive stuff from the state emission.
2009-07-16 11:48:33 -07:00
Eric Anholt a1e6abb5ca Use batch_start_atomic to fix batchbuffer wrapping problems with 8xx render.
Bug #22483.
2009-07-15 15:11:21 -07:00
Alan Coopersmith 6d025e679a Harden i830 render in case check_composite didn't throw out bad formats.
Fixes a warning in a static analysis program, and the code's a little
clearer.

Bug #21667
2009-06-23 15:35:41 -07:00
Alan Coopersmith f16ee21884 Fix "Unkown" typo in two FatalError messages
Signed-off-by: Alan Coopersmith <alan.coopersmith@sun.com>
2009-05-10 16:25:24 -07:00
Eric Anholt 928a37041d Replace a bunch of #ifdef debug flushing/syncing with a single function.
This removes it from a callsite where it would have just resulted in a
fatalerror.
2009-04-21 15:28:55 -07:00
Eric Anholt 73b7190421 intel: Nuke shared-entity support (zaphod mode).
It's been broken for years now, and KMS offers a much better chance of getting
this working sensibly without making a mess of the 2D driver.
2009-03-06 13:26:10 -08:00
Eric Anholt 801f0eac4f Make I830FALLBACK debugging a runtime instead of compile-time option. 2008-11-05 17:22:00 -08:00
Eric Anholt 080d36648f Add support for RepeatPad and RepeatReflect to 915 and 830-class Render accel. 2008-10-06 17:00:28 -07:00
Keith Packard b2d058d80c Rename uxa using _ instead of caps 2008-08-05 15:41:52 -07:00
Keith Packard 12df8f40d2 Use dri_bo for all object allocations, including pixmaps under uxa 2008-08-05 15:40:14 -07:00
Eric Anholt ecf19e1cda Change most usage of pixmap offsets to using a reloc macro.
This is based on airlied's RING->BATCH commit.  The 965 code still needs to
be fixed up for relocations.
2008-06-10 11:37:03 -07:00
Zhenyu Wang 79fde3ad7a Check pitch for EXA operation
2D pitch limit applies to all chips. Pre-965 chips have an
8KB pitch limit for 3D. 965 supports the maximum pitch
currently used by EXA (128KB).
(cherry picked from commit 8187a5a16f8bd8f0ba5e7f5357f355928b3b8f07)
2008-05-07 13:42:38 +08:00
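
The 3D limits stated in 79fde3ad7a can be written as a small predicate. This is a sketch only: the function name is hypothetical, the limits are treated as inclusive (an assumption), and the 2D limit is left out because the message gives no figure for it.

```c
#include <assert.h>
#include <stdbool.h>

#define KB(x) ((x) * 1024)

/* True if a surface with this pitch can be used by the 3D pipeline. */
static bool pitch_ok_for_3d(int pitch_bytes, bool is_965)
{
    assert(pitch_bytes > 0);
    return pitch_bytes <= (is_965 ? KB(128) : KB(8));
}
```

For example, a 16KB pitch would pass on a 965 but force a fallback on earlier chips.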