To prepare for Xv on Sandybridge. It is easy to fill the binding
table without relocation and make sure that the pointer to binding
table only uses bits[15:0].
Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com>
After splitting out the i810 driver into its own legacy directory, we
can identify the common routines not as i830 but as intel. This
clarifies the code which *is* i830 specific.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The driver is still built but is no longer under active development so
move it and supporting files to a new directory.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
As the batch submit may not trigger further drawing through flushing the
vertices, pass the requirement to emit the flush down to the submission
routine so that the flush can be appended after the final commands.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
On my PineView box these represent ~5% overhead on x11perf text:
Before:
16000000 trep @ 0.0020 msec (495000.0/sec): Char in 80-char aa line (Charter 10)
12000000 trep @ 0.0022 msec (461000.0/sec): Char in 80-char rgb line (Charter 10)
After:
16000000 trep @ 0.0020 msec (511000.0/sec): Char in 80-char aa line (Charter 10)
16000000 trep @ 0.0021 msec (480000.0/sec): Char in 80-char rgb line (Charter 10)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
In my recent fix for the chroma pitch for i915 xvmc I've forgotten about
i965 class hw. For videos with a non-even sized stride (measured in dwords)
the chroma pitch was internally incosistent and one dword off.
Fix this by using pitch2 for the chroma pitch in i965 textured video like
everywhere else.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=27417
Tested-by: Nick Bowler <nbowler@draconx.ca>
Tested-by: Sven Arvidsson <sa@whiz.se>
The previous code made no sense, (multiplying an offset by 4 is
meaningless). It could have onlt worked with the offset being
fortuitously 0.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Carl Worth <cworth@cworth.org>
This is the first part of my small crusade to rip out x1, x2, y1, y2
from I830PutImage*. These variables have strange semantics (they
change from simple integers to fixed-point values somewhere in
the middle) and don't really seem to be what we actually need.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Eric Anholt <eric@anholt.net>
This should restore the previous level of synchronisation between
textures and pixmaps, but *does not* guarantee that a texture will be
flushed before use. tfp should be fixed so that the ddx can submit the
batch if required to flush the pixmap.
A side-effect of this patch is to rename intel_batch_flush() to
intel_batch_submit() to reduce the confusion of executing a batch buffer
with that of emitting a MI_FLUSH.
Should fix the remaining rendering corruption involving tfp [inc compiz]:
Bug 25431 [i915 bisected] piglit/texturing_tfp regressed
http://bugs.freedesktop.org/show_bug.cgi?id=25431
Bug 25481 Wrong cursor format and cursor blink rate with compiz enabled
http://bugs.freedesktop.org/show_bug.cgi?id=25481
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
There is only a single caller that wishes to forcibly append a flush
into the batch: intel_sync(). So move the logic there.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Since batch buffers are rarely emitted by themselves but as part of a
sequence of state and vertices, the whole sequence is emitted atomically.
Here we just enforce that batches are marked as being part of an atomic
sequence as appropriate.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Make the following options available via xorg.conf:
Section "Driver"
Option "DebugFlushBatches" "1" # Flush the batch buffer after every
# single operation;
Option "DebugFlushCaches" "1" # Include a MI_FLUSH at the end of every
# batch buffer to force data to be
# flushed out of cache and into memory
# before the completion of the batch.
Option "DebugWait" "1" # Wait for the completion of every batch buffer
# before continuing, i.e. perform synchronous
# rendering.
EndSection
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
This is the beginning of the campaign to remove some of the absurd use of
Hungarian in the driver. Not that I don't like Hungarian, but I don't need
to know that pI830 is a pPointer.
We've talked about doing this since the start of the project, putting it off
until "some convenient time". Just after removing a third of the driver seems
like a convenient time, when backporting's probably not happening much anyway.
The idea for the hw double buffering support is to program two fixed
buffers and then only switch buffers in the OCMD register. But the driver
as-is always programs the new buffer address (in both register sets
when double buffered). Therefore we gain nothing by using this hw
capability. Scrap the software support for it.
When double buffered, we now allocate just a buffer of size 2*size and
switch between the two parts purely in software.
To make reviewing this easier, I'll shortly explain the differences of how
double-buffering (i.e. tear-free video) is achieved before and after this
change:
- When double buffer, allocate a buffer twice the size (unchanged).
- Depending upon the currently shown buffer-half, copy the new frame into
the other buffer-half. In the old code this is done by using the right
set of buffer offsets, either *Buf0Offset or *Buf1Offset. The new code
simply programs the offset for the right buffer-half into the single set
of offsets. The end-result is unchanged.
Now the big difference in hw-programming:
Old: Programm new buffer offset into both sets of _hw_ buffer offset
registers. Depending upon the current _sw_ buffer, select the _hw_ buffer
and program this into the OCMD register. This just complicates matters
unnecessarly.
New: Just always use the hw buffer 0.
And then it's again the same story in both old and new code:
- Execute an overlay flip (MI_OVERLAY_FLIP) to read in the contents of the
hw registers into the shadow hw registers (which are actually being used
by the overlay, not the ones we write stuff into). This is synchronized
with the respective crtc vblank by the hw.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Eric Anholt <eric@anholt.net>
The optimization of unreferencing the binding table when the relocation is
posted causes the object to be dereferenced for each box in the clip list,
causing general chaos in the buffer manager. It's easier to just hold a
reference to the object until all of the boxes are painted and then drop it.
Signed-off-by: Keith Packard <keithp@keithp.com>
This increases the overhead for video in the presence of cliprects, but we
were already doing nasty things in that case and don't seem to care. This
could fix potential bad rendering or hangs with video, particularly with
DRI2.