xf86-video-intel

Commit Graph

Author	SHA1	Message	Date
Chris Wilson	030d56279b	drm: don't overwrite the old intel->front_buffer It's now handled in the common ExchangeBuffers() path. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-05-14 17:30:23 +01:00
Chris Wilson	5bd0227395	i830: Teardown batch entries on reset. By not cleaning up the batch entries when resetting the X server, we left the pointers in an inconsistent state and caused X to crash.	2010-05-14 15:50:05 +01:00
Chris Wilson	0d2392d44a	dri: Hold reference to buffers across swap As we schedule swaps for some time in the future and may process a detachment prior to receiving the vblank notification from the kernel, we need to hold a reference to the buffers for our swap event handler. Fixes: Bug 28080 - "glresize" causes X server segfault with indirect rendering. https://bugs.freedesktop.org/show_bug.cgi?id=28080 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-05-14 10:32:12 +01:00
Chris Wilson	8de09a0707	uxa: Convert 1x1R back to solid_fill In the change to prevent blitting between incompatible sources, we also prevented 1x1R pixmaps from being used for solid fills. Reorder the sequence of conditions to enable this fast path again.	2010-05-13 17:17:54 +01:00
Chris Wilson	92e9cf8af7	uxa: Only use solid_fill for SRC.	2010-05-13 17:17:54 +01:00
Chris Wilson	d1bd14e8b6	uxa: Replace source for CLEAR with a transparent solid This means that we will hit the faster try_solid_fill path instead.	2010-05-13 17:17:54 +01:00
Chris Wilson	cdab72c405	uxa: Fallback early if compositing with alphaMaps	2010-05-13 17:17:54 +01:00
Chris Wilson	25811dc7b7	i915: Force output alpha to 1. if dst has no alpha channel. Ensure that garbage is not stored in the unused alpha channel so that we can rely on it being currently initialised when used as a source or returning via GetImage. Partial fix for rendercheck -t blend	2010-05-13 17:17:10 +01:00
Chris Wilson	0e726b85ca	i915: Add a2r10g10b10 format and friends Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-05-13 09:40:27 +01:00
Chris Wilson	9f54107f86	dri2: Handle reference counting across page flipping 1. Instead of swapping bos, swap the entire private structure. 2. If we update the pixmap bo for the Screen, make sure we update the reference inside intel->front_buffer so that xrandr still functions. Fixes: Bug 27922 - i965: Rapidly resizing OpenGL window causes GPU to hang. https://bugs.freedesktop.org/show_bug.cgi?id=27922 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-05-12 21:37:49 +01:00
Chris Wilson	6c27f6e4f7	uxa: Avoid glyph ping-pong with !offscreen destination Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-05-12 12:50:31 +01:00
Chris Wilson	d5383c2073	uxa: Avoid ping-pong with !offscreen destination and traps If we are destined to target an !offscreen drawable, then uploading the trapezoid mask to a bo is the last thing we actually want to do... Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-05-12 12:50:31 +01:00
Chris Wilson	00664b8f9d	uxa: Fallback when compositing to a !offscreen destination Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-05-12 12:50:31 +01:00
Chris Wilson	0c6372a77f	i830: Prevent allocation of bo larger than half the aperture We need to prevent overcommitting the aperture, and in particular if we allocate a buffer larger than available space we will fail to mmap it in and rendering will fail. Trying to allocate multiple large buffers in the aperture, often the case when falling back, causes thrashes and eviction of useful buffers. So from the outset simply do not allocate a bo if the required size is more than half the available aperture space. Fixes allocation failure in ocitymap.trace for instance. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-05-12 12:50:31 +01:00
Chris Wilson	244b7cbfff	uxa: Use accelerated PutImage for uploading pixman images. Short-circuits the current use of PutImage from CopyArea, bypassing all the temporary allocations.	2010-05-12 12:50:31 +01:00
Chris Wilson	cb887cfc67	uxa: solid rects The cost of performing relocations outweigh the advantages of using the blitter for solids with lots of rectangles. References: Bug 22127 - [UXA] 50% performance regression for XRenderFillRectangles https://bugs.freedesktop.org/show_bug.cgi?id=22127 By using the 3D pipeline we improve our performance by around 4x on i945, measured by the jxbench microbenchmark, and a factor of 10x by short-cutting to the 3D pipeline for blended rectangles. Before, on a i945GME: 19982.412060 Ops/s; rects (!); 15x15 9599.131693 Ops/s; rects (!); 75x75 3803.654743 Ops/s; rects (!); 250x250 6836.743772 Ops/s; rects blended; 15x15 1443.750000 Ops/s; rects blended; 75x75 495.335821 Ops/s; rects blended; 250x250 23247.933884 Ops/s; rects composition (!); 15x15 10993.073048 Ops/s; rects composition (!); 75x75 3595.905172 Ops/s; rects composition (!); 250x250 After: 87271.145975 Ops/s; rects (!); 15x15 32347.744361 Ops/s; rects (!); 75x75 5884.177215 Ops/s; rects (!); 250x250 73500.000000 Ops/s; rects blended; 15x15 33580.882353 Ops/s; rects blended; 75x75 5858.811749 Ops/s; rects blended; 250x250 25582.317073 Ops/s; rects composition (!); 15x15 6664.728682 Ops/s; rects composition (!); 75x75 14965.909091 Ops/s; rects composition (!); 250x250 [suspicious] This has no impact on Cairo, but I have a suspicion from watching xtrace that Qt likes to blit thousands of 1x1 rectangles with the same colour. However, we are still around 2-3x slower than the reported figures for EXA! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-05-12 12:50:31 +01:00
Chris Wilson	c8e10f7791	debug: Add names for operators Most useful for confirming my worst fears: unwarranted use of OutReverse + Add. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-05-12 12:48:21 +01:00
Chris Wilson	6ea8ce640f	xvmc: Build fix with -pedantic Fixes: Bug 27352 - RPMLINT error causes build breakage https://bugs.freedesktop.org/show_bug.cgi?id=27352 Reported-by: Johannes Obermayr <johannesobermayr@gmx.de> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-05-11 19:39:01 +01:00
Chris Wilson	e1b7e8bf1d	drmmode: Reorder i830_set_pixmap_bo() so that the correct stride is used. The pitch needs to be set on the pixmap prior to the private intel_pixmap structure being created so that it can record the correct value from the pixmap. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-05-11 15:54:18 +01:00
Chris Wilson	dfbaf9aab8	i830: Never create a bo for depth=1 pixmaps. As we can not accelerate these either as a destination or a source, don't bother allocating a buffer object for them. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-05-11 15:01:00 +01:00
Chris Wilson	5b7efe375a	i830: Use set_pixmap_bo() instead of open-coding. The advantage is that this enables in-flight reuse of the old pixmap if possible. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-05-11 15:00:59 +01:00
Chris Wilson	ad8af95dd3	i830: Do not cache in-flight non-reusable buffers. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-05-11 15:00:59 +01:00
Chris Wilson	f1048e14d5	i965: Add texformats mapping for additional pixman formats Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-05-11 13:07:19 +01:00
Chris Wilson	a35afd4a2d	uxa: Recheck texture after acquiring pattern. As the first step to handling unsupported texture formats, double check that the converted pattern can be used as a texture by the card. Fixes: rendercheck -t repeat Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-05-11 13:07:03 +01:00
Keith Packard	d745cab6c4	Must call ValidateGC in i830_uxa_put_image for scratch GC Always need to call ValidateGC or the scratch GC will not get the right composite clip. Signed-off-by: Keith Packard <keithp@keithp.com>	2010-05-10 22:59:52 -07:00
Chris Wilson	3eded4202e	i915: Fix pixmap based masks. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-05-10 23:38:17 +01:00
Chris Wilson	1ecd89be03	uxa: Protect against valid SourcePict in uxa_acquire_mask() Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-05-10 23:33:52 +01:00
Chris Wilson	a8761585ef	i830: Minor cleanup Remove some extraneous prototypes and unused variables. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-05-10 19:38:24 +01:00
Chris Wilson	9e9b0d85da	i830: Update stride when swapping bo for PutImage Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-05-10 18:37:26 +01:00
Chris Wilson	0d4dd00aea	uxa,i915: Handle SourcePict through uxa_composite() Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-05-10 12:29:26 +01:00
Chris Wilson	21c1c3c7f6	i915: Use 1x1R pixmap for solid drawables x11perf has a regression https://bugs.freedesktop.org/show_bug.cgi?id=25068 caused by commit `e581ceb738` i915: Use the color channels to pass along solid sources and masks. Do not convert 1x1R pixmaps into a solid color as the readback from the bo negates all the performance advantages of using a smaller vertex buffer and fewer samplers. Before (PineView): aa=66800 glyphs/s, rgb=28800 glyphs/s Now: aa=96800 glyphs/s, rgb=48500 glyphs/s Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-05-10 10:36:15 +01:00
Chris Wilson	f52b6e8322	uxa: Rearrange checking and preparing of composite textures. x11perf regression caused by 2D driver https://bugs.freedesktop.org/show_bug.cgi?id=28047 caused by commit `a7b800513f` uxa: Extract sub-region from in-memory buffers. The issue is that as we extract the region prior to checking whether the composite can in fact be accelerated, we perform expensive surplus operations. This is particularly noticeable for ComponentAlpha text, such as rgb10text. The solution here is to rearrange the check_composite() prior to acquiring the sources, and only extracting the subregion if the render path can not actually handle the texture. Performance (on PineView): a7b800513^: aa=68600 glyphs/s, rgb=29900 glyphs/s a7b800513: aa=65700 glyphs/s, rgb=13200 glyphs/s now: aa=66800 glyph/s, rgb=28800 glyphs/s The residual lossage seems to be from the extra function call and dixPrivate lookups. Hmm. More warning is the extremely low performance, however the results are consistent so the improvement looks real... Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-05-10 10:36:14 +01:00
Chris Wilson	848ab66384	uxa: Transform composites with a simple translation into a blit We can also convert a composite with an integer translation into a blit, so long as the sample extents remains within the source. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-05-08 19:35:28 +01:00
Chris Wilson	a7b800513f	uxa: Extract sub-region from in-memory buffers. If the buffer is too large or not suitable for a GPU operation, we currently fallback and perform the composite on the CPU. An alternative is to extract the small region out of the source (as usually the sample extents are much smaller than the actual surface size) and try the composite with the new surface. The effect is particularly noticeable on pathological websites that use very large background images. For example, http://www.woodtv.com/ uses a 1299x15000 pattern that is obscured by another opaque pattern. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-05-08 19:35:07 +01:00
Chris Wilson	8562b7bc67	i830: prepare the uxa pixmap for fbCopyArea. Complete the prepare access for the PutImage fallback via fbCopyArea(), by remembering to set the private pointer to the GTT mapping. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-04-27 10:29:16 +01:00
Chris Wilson	9a5cd65b59	i830: if pixman_blt() fails fallback to fbCopyArea() On older versions of pixman, pixman_blt() can return false if the images are <= 8bpp. If we are being called from CopyArea, then we cannot return FALSE here as that will trigger an infinite recursion. Instead we must manually perform the fallback using fbCopyArea(). Reported-by: Peter Clifton <pcjc2@cam.ac.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-04-26 09:14:17 +01:00
Chris Wilson	86d349aa7b	i830: tidy in flight bo reuse. A left-over cleanup patch for `c374c94`. sigh Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-04-26 09:13:54 +01:00
Daniel Vetter	72fd7d191c	Fix "make dist" This is some fallout from my xvmc cleanup. Original-Patch-by: Rico Tzschichholz <ricotz@t-online.de> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2010-04-19 21:56:57 +02:00
Daniel Vetter	9494f4e91f	i810: adjust the pitch for DRI rendering Current code forgot to adjust the pitch of the frontbuffer. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=16729	2010-04-16 22:24:01 +02:00
Chris Wilson	c374c94e41	uxa: Reuse in-flight bo When we need to allocate a new bo for use as a gpu target, first check if we can reuse a pixmap that has already been relocated into the aperture as a temporary target, for instance a glyph mask or a clip mask. Before: backend test min(s) median(s) stddev. xlib firefox-planet-gnome 50.568 50.873 0.30% xcb firefox-planet-gnome 49.686 53.003 3.92% xlib evolution 40.115 40.131 0.86% xcb evolution 28.241 28.285 0.18% After: backend test min(s) median(s) stddev. xlib firefox-planet-gnome 47.759 48.233 0.80% xcb firefox-planet-gnome 48.611 48.657 0.87% xlib evolution 38.954 38.991 0.05% xcb evolution 26.561 26.654 0.19% And even more dramatic improvements when using a font size larger than the maximum size of the glyph cache: xcb firefox-36-20090611: 1.79x speedup xlib firefox-36-20090611: 1.74x speedup xcb firefox-36-20090609: 1.62x speedup xlib firefox-36-20090609: 1.59x speedup Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-04-15 21:37:32 +01:00
Chris Wilson	96aa7a236a	i830: Allocate bo's for glyphs larger than 32x32. As we only use the glyph cache for small glyphs, those larger than 32x32 will first be copied to a bo and used as a mask in a composite operation. We can avoid the allocation and upload per use by allocating a bo for the over-sized glyph from the start. As the glyph is large anyway, the excess memory allocation is less significant. Using normal font sizes, firefox shows no change - as expected. However, using the 36 font size traces, we see around a 10% improvement on g45. Before: xcb firefox-36-20090609 127.333 127.897 0.22% xcb firefox-36-20090611 87.456 88.624 0.66% xcb firefox-20090601 19.522 20.194 1.69% xlib firefox-36-20090609 201.054 201.780 0.18% xlib firefox-36-20090611 133.468 133.717 0.09% xlib firefox-20090601 23.740 23.975 0.49% With large glyphs in bo: xcb firefox-36-20090609 117.256 118.254 0.42% xcb firefox-36-20090611 79.462 79.962 0.31% xcb firefox-20090601 19.658 20.024 0.92% xlib firefox-36-20090609 185.645 188.202 0.68% xlib firefox-36-20090611 123.592 124.940 0.54% xlib firefox-20090601 23.917 24.098 0.38% Thanks to Owain G. Ainsworth for the suggestion! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-04-14 17:10:09 +01:00
Chris Wilson	2d17bd50af	Revert "Revert "uxa: Try using put_image when copying from a memory buffer."" This reverts commit `6d50553e8f`. Now we have taught the fallback path not to infinitely recurse, re-enable the accelerated path for ShmPutImage and friends.	2010-04-14 17:10:09 +01:00
Chris Wilson	1cc2c2c44a	i830: Use pixman_blt directly for performing the in-memory copy In order to avoid an infinite recursion after enabling CopyArea to use the put_image acceleration to either stream a blit or to copy in-place, we cannot call CopyArea from put_image for the fallback path. Instead, we can simply call pixman_blt directly, which coincidentally is a tiny bit faster. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-04-14 17:10:05 +01:00
Daniel Vetter	324a2810da	i830 render: check aperture space requirements No point not doing this. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2010-04-13 08:39:43 +02:00
Daniel Vetter	804263c10d	render: tell the kernel explicitly when fences are needed This slightly improves xrender performance on fence reg starved i8xx hw. I've also changed a few function calls to the new names from the compat ones while looking at the code. The i915 textured video path is not converted because atm the xv code does not use tiled surfaces. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2010-04-13 08:34:20 +02:00
Daniel Vetter	a619a78312	i915 render: use tiling bits where possible This is in preparation to explicit fence allocation with execbuf2. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2010-04-13 08:34:20 +02:00
Daniel Vetter	55cd36046e	i830 render: use tiling bits where possible This is in preparation to explicit fence allocation with execbuf2. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2010-04-13 08:34:20 +02:00
Eric Anholt	6d50553e8f	Revert "uxa: Try using put_image when copying from a memory buffer." This reverts commit `27195d7dba`. put_image often calls copy_area. Which calls put_image. Exhausting of the stack follows.	2010-04-12 13:46:24 -07:00
Chris Wilson	28024f6c5f	Revert "uxa: Add fallback warnings for PutImage." This reverts commit `299b0338d0`. A debugging patch, it was never intended to go into master	2010-04-12 13:44:01 +01:00
Chris Wilson	27195d7dba	uxa: Try using put_image when copying from a memory buffer. Often, for example in the fallback for ShmPutImage, we will attempt to use uxa_copy_area() copying to a normal pixmap from a memory buffer. This triggers a fallback, and maps the destination pixmap back into the GTT. The accelerated put_image path will attempt to stream a blit to the destination pixmap if it is currently active, avoiding the stall.	2010-04-10 18:50:26 +01:00

3080 Commits