Commit Graph

3496 Commits

Author SHA1 Message Date
Chris Wilson c8a2fa3a2e sna/gen2: Correct command length for CA LOAD_IMMEDIATE_STATE_1
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-06-06 08:43:34 +01:00
Chris Wilson a89fc7181b sna/gen2: Only emit the mask texcoord if there is a mask
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-06-06 08:39:17 +01:00
Chris Wilson 4fb7784e1e sna/gen3: Non-affine texcoords require space for 4 floats not 3.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-06-06 08:27:17 +01:00
Chris Wilson d9344ab8d0 sna/gen2: Set op->floats_per_vertex
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-06-06 08:25:53 +01:00
Chris Wilson 741c1101f1 sna/gen2: Enable selection of gen2 only
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-06-06 08:22:45 +01:00
Chris Wilson c76ec69660 sna/gen2: The inline primitive takes a length, not a vertex count
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-06-06 08:12:33 +01:00
Eric Anholt 91424d4937 uxa: Simplify uxa_poly_fill_rect by only clipping once.
Reviewed-by: Keith Packard <keithp@keithp.com>
2011-06-05 21:13:49 -07:00
Eric Anholt e0066e77e0 uxa: Simplify Composite solid acceleration for spans by only clipping once.
Unlike the previous commit removing this style of code, the code in
this one was originally wrong, and would fail to clip in the second
pass of clipping when y was > pbox->y2.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=37233
Reviewed-by: Keith Packard <keithp@keithp.com>
2011-06-05 21:13:36 -07:00
Eric Anholt ace324e4aa uxa: Simplify BLT solid acceleration for spans filling by only clipping once.
We were clipping each span against the bounds of the clip, throwing
out the span early if it was all clipped, and then walking the clip boxes,
clipping against each of the cliprects.  We would expect spans to
typically be clipped against one box, and not thrown out, so we were
not saving any work there.  For multiple cliprects, we were adding
work.  Only for many spans clipped entirely out of a complicated clip
region would it have saved work, and it clearly didn't save bugs as
evidenced by the many fix attempts here.
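
As a rough illustration of the "clip once" approach described above, the sketch below clips a single span directly against each clip box in one pass, with no separate pre-check against the overall clip bounds. It is not the driver's code: `Box` and `clip_span` are hypothetical names, and the boxes are assumed to follow the usual X convention of inclusive x1/y1 and exclusive x2/y2.

```python
from typing import List, NamedTuple, Tuple


class Box(NamedTuple):
    """Hypothetical clip box: x1/y1 inclusive, x2/y2 exclusive (assumed)."""
    x1: int
    y1: int
    x2: int
    y2: int


def clip_span(x: int, y: int, width: int,
              clip_boxes: List[Box]) -> List[Tuple[int, int, int]]:
    """Clip one horizontal span against every clip box in a single pass.

    Returns the surviving pieces as (x, y, width) tuples.
    """
    pieces = []
    for box in clip_boxes:
        # Skip boxes that do not cover the span's scanline.
        if y < box.y1 or y >= box.y2:
            continue
        left = max(x, box.x1)
        right = min(x + width, box.x2)
        # Keep the piece only if something is left after clipping.
        if left < right:
            pieces.append((left, y, right - left))
    return pieces
```

A span that falls entirely outside the clip simply yields an empty list, so the early-out against the clip bounds is not needed for correctness.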

Reviewed-by: Keith Packard <keithp@keithp.com>
2011-06-05 21:13:32 -07:00
Chris Wilson bdb396a44b sna: PutImage: Copy straight to GTT if the bo is idle
This saves a copy in the typical PutImage to frontbuffer favoured by
flash. And we also happen to fix a bug if we should be requested to
PutImage outside of the clip region...

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-06-05 20:06:44 +01:00
Chris Wilson 407257570f sna/gen6: Flush the pipeline before effecting a change of blend modes
... also make sure that we flush if we change the blend mode for the CA pass.

Reported-by: Ivan Bulatovic <combuster@archlinux.us>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=37946
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-06-05 15:40:00 +01:00
Chris Wilson 7316771122 sna: 915gm does not have 128-byte Y-tiling
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-06-05 14:56:32 +01:00
Chris Wilson 0260c4ce32 sna: Fallback if presented with mask under NO_COMPOSITE
Just making sure that the debug paths actually work...

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-06-05 14:39:34 +01:00
Chris Wilson fcbe2d9ee7 sna/gen4: Flush every vertex for the magic CA pass
gen4 dies hard if it has two rectangles in the pipeline, and despite the
stringent and crippling efforts to prevent us from efficiently using the
GPU, I missed a flush before submitting the CA rectangle.

Reported-and-tested-by: Fryderyk Dziarmagowski <fdziarmagowski@gmail.com>
References: https://bugs.freedesktop.org/show_bug.cgi?id=28768
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-06-04 19:36:09 +01:00
Chris Wilson bcef98af56 sna: Introduce a new acceleration model.
The premise is that switching between rings (i.e. the BLT and
RENDER rings) on SandyBridge imposes a large latency overhead whilst
rendering. The cause is that in order to switch rings, we need to split
the batch earlier than is desired and to add serialisation between the
rings. Both of which incur large overhead.

By switching to using a pure 3D blit engine (ok, not so pure as the BLT
engine still has uses for the core drawing model which can not be easily
represented without a combinatorial explosion of shaders) we can take
advantage of additional efficiencies, such as relative relocations, that
have been incorporated into recent hardware advances. However, even
older hardware performs better from avoiding the implicit context
switches and from the batching efficiency of the 3D pipeline...

But this is X, and PolyGlyphBlt still exists and remains in use. So for
the operations that are not worth accelerating in hardware, we introduce a
shadow buffer mechanism throughout and reintroduce pixmap migration.
Doing this efficiently is the cornerstone of ensuring that we do exploit
the increased potential of recent hardware for running old applications and
environments (i.e. so that the latest and greatest chip is actually faster
than gen2!)

For the curious, sna is SandyBridge's New Acceleration. If you are
running older chipsets and welcome the performance increase offered by
this patch, then you may choose to call it Snazzy instead.

Speedups
========
 gen3           firefox-fishtank  1203584.56 (1203842.75 0.01%) -> 85561.71 (125146.44 14.87%): 14.07x speedup
 gen5             grads-heat-map  3385.42 (3489.73 1.44%) -> 350.29 (350.75 0.18%):  9.66x speedup
 gen3          xfce4-terminal-a1  4179.02 (4180.09 0.06%) -> 503.90 (531.88 4.48%):  8.29x speedup
 gen4             grads-heat-map  2458.66 (2826.34 4.64%) -> 348.82 (349.20 0.29%):  7.05x speedup
 gen3             grads-heat-map  1443.33 (1445.32 0.09%) -> 298.55 (298.76 0.05%):  4.83x speedup
 gen3             swfdec-youtube  3836.14 (3894.14 0.95%) -> 889.84 (979.56 5.99%):  4.31x speedup
 gen6             grads-heat-map  742.11 (744.44 0.15%) -> 172.51 (172.93 0.20%):  4.30x speedup
 gen3          firefox-talos-svg  71740.44 (72370.13 0.59%) -> 21959.29 (21995.09 0.68%):  3.27x speedup
 gen5                       gvim  8045.51 (8071.47 0.17%) -> 2589.38 (3246.78 10.74%):  3.11x speedup
 gen6                    poppler  3800.78 (3817.92 0.24%) -> 1227.36 (1230.12 0.30%):  3.10x speedup
 gen6         gnome-terminal-vim  9106.84 (9111.56 0.03%) -> 3459.49 (3478.52 0.25%):  2.63x speedup
 gen5              midori-zoomed  9564.53 (9586.58 0.17%) -> 3677.73 (3837.02 2.02%):  2.60x speedup
 gen5         gnome-terminal-vim  38167.25 (38215.82 0.08%) -> 14901.09 (14902.28 0.01%):  2.56x speedup
 gen5                    poppler  13575.66 (13605.04 0.16%) -> 5554.27 (5555.84 0.01%):  2.44x speedup
 gen5         swfdec-giant-steps  8941.61 (8988.72 0.52%) -> 3851.98 (3871.01 0.93%):  2.32x speedup
 gen5          xfce4-terminal-a1  18956.60 (18986.90 0.07%) -> 8362.75 (8365.70 0.01%):  2.27x speedup
 gen5           firefox-fishtank  88750.31 (88858.23 0.14%) -> 39164.57 (39835.54 0.80%):  2.27x speedup
 gen3              midori-zoomed  2392.13 (2397.82 0.14%) -> 1109.96 (1303.10 30.35%):  2.16x speedup
 gen6                       gvim  2510.34 (2513.34 0.20%) -> 1200.76 (1204.30 0.22%):  2.09x speedup
 gen5       firefox-planet-gnome  40478.16 (40565.68 0.09%) -> 19606.22 (19648.79 0.16%):  2.06x speedup
 gen5       gnome-system-monitor  10344.47 (10385.62 0.29%) -> 5136.69 (5256.85 1.15%):  2.01x speedup
 gen3                    poppler  2595.23 (2603.10 0.17%) -> 1297.56 (1302.42 0.61%):  2.00x speedup
 gen6          firefox-talos-gfx  7184.03 (7194.97 0.13%) -> 3806.31 (3811.66 0.06%):  1.89x speedup
 gen5                  evolution  8739.25 (8766.12 0.27%) -> 4817.54 (5050.96 1.54%):  1.81x speedup
 gen3                  evolution  1684.06 (1696.88 0.35%) -> 1004.99 (1008.55 0.85%):  1.68x speedup
 gen3         gnome-terminal-vim  4285.13 (4287.68 0.04%) -> 2715.97 (3202.17 13.52%):  1.58x speedup
 gen5             swfdec-youtube  5843.94 (5951.07 0.91%) -> 3810.86 (3826.04 1.32%):  1.53x speedup
 gen4                    poppler  7496.72 (7558.83 0.58%) -> 5125.08 (5247.65 1.44%):  1.46x speedup
 gen4         gnome-terminal-vim  21126.24 (21292.08 0.85%) -> 14590.25 (15066.33 1.80%):  1.45x speedup
 gen5          firefox-talos-svg  99873.69 (100300.95 0.37%) -> 70745.66 (70818.86 0.05%):  1.41x speedup
 gen4       firefox-planet-gnome  28205.10 (28304.45 0.27%) -> 19996.11 (20081.44 0.56%):  1.41x speedup
 gen5          firefox-talos-gfx  93070.85 (93194.72 0.10%) -> 67687.93 (70374.37 1.30%):  1.37x speedup
 gen4                  evolution  6696.25 (6854.14 0.85%) -> 4958.62 (5027.73 0.85%):  1.35x speedup
 gen3         swfdec-giant-steps  2538.03 (2539.30 0.04%) -> 1895.71 (2050.62 62.43%):  1.34x speedup
 gen4                       gvim  4356.18 (4422.78 0.70%) -> 3276.31 (3281.69 0.13%):  1.33x speedup
 gen6                  evolution  1242.13 (1245.44 0.72%) -> 953.76 (954.54 0.07%):  1.30x speedup
 gen6       firefox-planet-gnome  4554.23 (4560.69 0.08%) -> 3758.76 (3768.97 0.28%):  1.21x speedup
 gen3          firefox-talos-gfx  6264.13 (6284.65 0.30%) -> 5261.56 (5370.87 1.28%):  1.19x speedup
 gen4              midori-zoomed  4771.13 (4809.90 0.73%) -> 4037.03 (4118.93 0.85%):  1.18x speedup
 gen6         swfdec-giant-steps  1557.06 (1560.13 0.12%) -> 1336.34 (1341.29 0.32%):  1.17x speedup
 gen4          firefox-talos-gfx  80767.28 (80986.31 0.17%) -> 69629.08 (69721.71 0.06%):  1.16x speedup
 gen6              midori-zoomed  1463.70 (1463.76 0.08%) -> 1331.45 (1336.56 0.22%):  1.10x speedup

Slowdowns
=========
 gen6          xfce4-terminal-a1  2030.25 (2036.23 0.25%) -> 2144.60 (2240.31 4.29%):  1.06x slowdown
 gen4             swfdec-youtube  3580.00 (3597.23 3.92%) -> 3826.90 (3862.24 0.91%):  1.07x slowdown
 gen4          firefox-talos-svg  66112.25 (66256.51 0.11%) -> 71433.40 (71584.31 0.14%):  1.08x slowdown
 gen4       gnome-system-monitor  5691.60 (5724.03 0.56%) -> 6707.56 (6747.83 0.33%):  1.18x slowdown
 gen3                  ocitysmap  3494.05 (3502.44 0.20%) -> 4321.99 (4524.42 2.78%):  1.24x slowdown
 gen4                  ocitysmap  3628.42 (3641.66 9.37%) -> 5177.16 (5828.74 8.38%):  1.43x slowdown
 gen5                  ocitysmap  4027.77 (4068.11 0.80%) -> 5748.26 (6282.25 7.38%):  1.43x slowdown
 gen6                  ocitysmap  1401.61 (1402.24 0.40%) -> 2365.74 (2379.14 4.12%):  1.69x slowdown
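
The factors in the tables above appear to be the ratio of the first number on each side of the arrow (assumed here to be the best-case time for the old and new driver). A minimal sketch of that arithmetic, using a hypothetical helper `describe_change`:

```python
def describe_change(old_time: float, new_time: float) -> str:
    """Format the ratio of two timings the way the tables above do.

    Assumes both values are times, so a smaller value is better.
    """
    if new_time <= old_time:
        return f"{old_time / new_time:.2f}x speedup"
    return f"{new_time / old_time:.2f}x slowdown"


# gen3 firefox-fishtank row from the Speedups table.
print(describe_change(1203584.56, 85561.71))
# gen6 ocitysmap row from the Slowdowns table.
print(describe_change(1401.61, 2365.74))
```

For these two rows the sketch reproduces the quoted 14.07x speedup and 1.69x slowdown.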

[Note the performance regression for ocitysmap comes from the fact that we
now attempt to support rendering to and (more importantly) from large
surfaces. Enabling such operations is the only way to one day be
faster than purely using the CPU; in the meantime we suffer a regression
due to the increased migration and aperture thrashing. The other couple
of regressions will be eliminated with improved span and shader support,
now that the framework for such is in place.]

The performance increase for Cairo completely overlooks the other
critical aspects of the architecture:

World of Padman:
gen3 (800x600):   57.5 ->  96.2
gen4 (800x600):   47.8 ->  74.6
gen6 (1366x768): 100.4 -> 140.3 [F15]
                 144.3 -> 146.4 [drm-intel-next]

x11perf (gen6):
aa10text:     3.47 -> 14.3 Mglyphs/s [unthrottled!]
copywinwin10: 1.66 -> 1.99 Mops/s
copywinpix10: 2.28 -> 2.98 Mops/s

And we do not have a good measure for how much improvement the reworking
of the fallback paths gives, except that xterm is now over 4x faster...

PS: This depends upon the Xorg patchset "Remove the cacheing of the last
scratch PixmapRec" for correct invalidations of scratch Pixmaps (used by
the dix to implement SHM operations, used by chromium and gtk+ pixbufs).

PPS: ./configure --enable-sna

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-06-04 09:19:46 +01:00
Diego Elio Pettenò 340cfb7f52 build: do not use AC_CHECK_FILE to find the header files.
Using AC_CHECK_FILE will cause cross-builds to fail picking the right file;
instead use compile/preprocessor checks properly, and check for
xf86driproto earlier.

Reviewed-by: Rémi Cardona <remi@gentoo.org>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-05-30 09:21:47 +01:00
Adam Jackson 9d6e02a135 Remove the memory of Option "AccelMethod"
Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-05-11 09:20:09 +01:00
Chris Wilson 895a46e8ff dri: Flush the batch after a DRI swap/copy event
To minimise lag in those ever so critical games, we want to ensure that
the copy happens as soon as it is received, so we need to flush the
batch after processing a swap event and before we go to sleep.

References: https://bugs.freedesktop.org/show_bug.cgi?id=37068
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-05-10 20:38:25 +01:00
Chris Wilson 0b4ca9313c video: Flush the batch on the next blockhandler after queuing
In order to avoid video lag and jerky playback we need to ensure that
any queued video is flushed before we go to sleep.

Fixes regression from 6f104189bb.

Reported-and-tested-by: Edward Sheldrake <ejsheldrake@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=37068
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-05-10 20:36:18 +01:00
Chris Wilson bb8bf2a28b Correct chipset detection for Q33, Q35, B43_G1
Every time we update these tables we trip over this bit of marketing
genius.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-05-10 14:35:02 +01:00
Chris Wilson fd1ebd44fb module: Adopt IVB's more detailed naming convention for SNB
This should fix the seven-fold repetition of "SandyBridge" in the list
of supported chipsets during start-up... And be more useful in bug
reports!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-05-10 07:30:58 +01:00
Chris Wilson e9811bb777 Whitespacing cleanup for intel_module.c
Bring intel_module.c into line with the kernel whitespacing rules abided
by everywhere else in the tree.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-05-10 07:28:23 +01:00
Eric Anholt 79e59fb2a0 Add support for Ivybridge chipset.
This gets display and 2D blit acceleration up and running.  No Render
acceleration is provided yet.
2011-05-09 22:56:42 -07:00
Eric Anholt 792738adfc Remove the static list of PciChipset and construct it from SymTabRec instead.
This is one less place the new hardware enabler has to spam the
chipset in.  The PciChipset is just a match structure from PciId to
the SymTabRec entry token, and our SymTabRec entry tokens are just the
PciId, so it's trivial to construct.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2011-05-09 22:56:42 -07:00
Eric Anholt 583e80dfa1 Use the existing deviceID -> name mapping in SymTabRec instead of duping it.
We need to have this array anyway for the xf86 interfaces, apparently,
so just store the name in one location.  This drops the i852/i855
subdevice distinction in the name printed, but I haven't seen us ever
care about that.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2011-05-09 22:56:42 -07:00
Eric Anholt adf7bbd3a8 Store the chipset info struct in the PCI match struct, instead of a switch().
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2011-05-09 22:56:42 -07:00
Chris Wilson 3145530fee Ensure that the partial batch is flushed upon the blockhandler
Currently, we require that a batch containing a dirty bo be submitted
before we mark the device as requiring a flush. So if we never submit a
batch between block handlers, we can end up sleeping without ever
flushing either the partial batch or the rendering to the scanout.
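
The failure mode described above can be modelled with a toy sketch. This is not the driver's code: `Device`, `block_handler_before` and `block_handler_after` are hypothetical names, and the "flush" is reduced to clearing a flag.

```python
class Device:
    """Toy model of a device with one partially filled batch."""

    def __init__(self):
        self.partial_batch = []   # commands queued but not yet submitted
        self.submitted = []       # batches handed over for execution
        self.needs_flush = False  # set only when a batch is submitted

    def emit(self, command):
        self.partial_batch.append(command)

    def submit(self):
        if self.partial_batch:
            self.submitted.append(self.partial_batch)
            self.partial_batch = []
            self.needs_flush = True


def block_handler_before(dev):
    """Old behaviour: flush only if some batch was already submitted."""
    if dev.needs_flush:
        dev.needs_flush = False  # stands in for the real flush


def block_handler_after(dev):
    """Fixed behaviour: submit the partial batch first, then flush."""
    dev.submit()
    if dev.needs_flush:
        dev.needs_flush = False  # stands in for the real flush
```

With the old handler, a command emitted since the last submission is still sitting in `partial_batch` when the server goes to sleep; with the fixed handler it has been submitted.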

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=36776
Tested-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-05-07 20:04:18 +01:00
Paul Menzel 67e5a74e99 NEWS: fix typo (s/2.14/2.15/) to match corresponding release
Signed-off-by: Paul Menzel <paulepanter@users.sourceforge.net>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-04-22 14:18:17 +01:00
Jesse Barnes 0944e2d574 Add basic 30 bit depth support
Still need to handle video and gamma correction, but this gets the
display up and running at 30 bit depth if the kernel and display support
it.

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-04-20 11:39:55 -07:00
Chris Wilson 1d102cc6ed Use SwapbuffersWait config option to control waiting on fullscreen swaps
As fullscreen swaps were going via a different path to the swapping of
ordinary windows, we were no longer honouring the xorg.conf option to
disable swapbuffer waiting.

This changes the code to only use pageflipping if the Option
"SwapbuffersWait" is set to "TRUE" (default).

Jesse's comment was that this should be superseded by actually
supporting asynchronous page flips. As we are missing kernel and dix level
support for that, in the meantime honour the config option.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Kristian Høgsberg <krh@bitplanet.net>
2011-04-20 08:51:50 +01:00
Chris Wilson c9fb69cb25 i965/video: We need 150 dwords of space for video state emission
(Actually around 131, with additional 10% just for safety.)
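
The arithmetic behind the figure, as a small sketch (the helper name `dwords_with_margin` is hypothetical): 131 dwords plus a 10% margin comes to just over 144, which the commit rounds up to a reservation of 150.

```python
import math


def dwords_with_margin(measured: int, margin: float = 0.10) -> int:
    """Return the measured dword count plus a safety margin, rounded up."""
    return math.ceil(measured * (1.0 + margin))


RESERVED_DWORDS = 150  # figure taken from the commit subject

needed = dwords_with_margin(131)
# The reservation should cover the measured cost plus the margin.
assert needed <= RESERVED_DWORDS
```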

Reported-by: Modestas Vainius <geromanas@mailas.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=36319
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-04-17 10:43:42 +01:00
Chris Wilson a51cd83d25 intel: Beware the unsigned promotion when checking for batch overflows
Reported-by: Modestas Vainius <geromanas@mailas.com>
References: https://bugs.freedesktop.org/show_bug.cgi?id=36319
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-04-17 10:43:35 +01:00
Chris Wilson 030aa3d136 NEWS: typo.
Spotted too late...

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-04-14 10:25:21 +01:00
Chris Wilson 0e425b30e1 configure,NEWS: 2.15.0 release
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-04-14 10:16:40 +01:00
Chris Wilson 686018f283 Turn relaxed-fencing off by default for older (pre-G33) chipsets
There are still too many unresolved bugs, typically GPU hangs, that are
related to using relaxed fencing (i.e. only allocating the minimal
amount of memory required for a buffer) on older hardware, so turn off
the feature by default for the release.
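
To give a sense of what relaxed fencing saves, the sketch below compares a buffer's actual size with the size of a fence region that covers it. It assumes, for illustration only, that older chipsets need power-of-two fence regions with a 1 MiB minimum; `fence_size` is a hypothetical helper, not the driver's function.

```python
MIB = 1024 * 1024


def fence_size(size: int, minimum: int = MIB) -> int:
    """Smallest power-of-two multiple of `minimum` that covers `size`.

    The 1 MiB default minimum is an assumption made for this example.
    """
    fence = minimum
    while fence < size:
        fence *= 2
    return fence


buffer_size = 5 * MIB
full = fence_size(buffer_size)
# Without relaxed fencing the whole fence region is allocated;
# with it, only the buffer's own size would be.
print(f"fence region: {full // MIB} MiB, buffer: {buffer_size // MIB} MiB")
```

Under these assumptions a 5 MiB buffer occupies an 8 MiB fence region, so relaxed fencing would avoid allocating the extra 3 MiB.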

Reported-and-tested-by: Knut Petersen <Knut_Petersen@t-online.de>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=36147
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2011-04-12 09:03:01 +01:00
Chris Wilson 3d2b79098c dri: Rearrange code to compile against xorg-server-1.7
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-04-11 15:23:56 +01:00
Chris Wilson 4fa35dd5e1 NEWS: version bump for 2.14.903 snapshot
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-04-11 10:44:55 +01:00
Chris Wilson 97e9557619 intel: Restore manual flush for old kernels
Daniel Vetter pointed out that the automagic flush by the kernel for the
busy-ioctl was only introduced upstream in 2.6.37. So we still need to
manually emit a flush on old kernels.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-04-08 13:38:48 +01:00
Daniel Vetter fb40bf2b33 Tell users to grab i915_error_state on gpu hangs
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2011-04-08 12:37:35 +02:00
Chris Wilson 59ed6b05db Revert "i965: Convert to relative relocations for state"
This reverts commit d2106384be.

Breaks compiz (but not mutter/gnome-shell) on gen6. Not sure if this is
some deep interaction issue with multiple clients sharing the GPU or
just with compiz, but for now we have to revert and suffer the inane
performance hit. It looks suspiciously like another deferred damage
issue...

Bugzilla: 51a27e88b073cff229fff4362cb6ac22835c4044
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-04-07 16:16:58 +01:00
Chris Wilson 25521900df gen6: Invalidate texture cache
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-04-07 15:09:30 +01:00
Chris Wilson ad22003033 i965: Avoid transform overheads for vertex emit where possible
Minor improvement as the bottlenecks lie elsewhere. But it was annoying me.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-04-07 15:09:21 +01:00
Chris Wilson 007c2f86cb i965: Refactor to use constant sampler_state offsets
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-04-07 10:53:14 +01:00
Chris Wilson 8dc99b305a i965: Reset vertex_id after every batch
So that we always remember to re-emit the initial vertex elements state.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-04-04 22:30:29 +01:00
Chris Wilson 5982ed4da1 i965: Always update last_floats_per_vertex
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-04-04 19:32:43 +01:00
Chris Wilson 6f104189bb Take advantage of the kernel flush for dirty bo in the busy ioctl
Rather than just creating and submitting a batch that simply contains a
flush in order to periodically ensure that rendering reaches the
scanout, we can simply ask the kernel whether the scanout is busy. The
kernel will then submit a flush on our behalf if it is dirty, which
takes advantage of the kernel's dirty state tracking.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-04-04 19:24:30 +01:00
Chris Wilson 314439860e Remove unused function 'intel_bo_alloc_for_data'
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-04-04 17:20:04 +01:00
Chris Wilson ced747cefb Remove the unnecessary MI_FLUSH from the flush handler
The kernel will emit any required flushes between the dri client and the
ddx, and we do not rely on the MI_FLUSH here for scanout.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-04-04 17:14:38 +01:00
Chris Wilson 79444291a3 i965: segregate each vertex element into its own buffer
Reduce the number of relocations emitted by only emitting one relocation
per vertex element per vertex buffer.

References: https://bugs.freedesktop.org/show_bug.cgi?id=35733
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-04-04 16:42:57 +01:00
Chris Wilson d2106384be i965: Convert to relative relocations for state
References: https://bugs.freedesktop.org/show_bug.cgi?id=35733
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-04-04 15:57:28 +01:00