We need one for each possible combination of src and dst
blend_factors. Again, as with recent changes, this eliminates
state updates from prepare_composite and allows that function
to instead simply reference an existing object initialized
within gen4_state_init.
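
A minimal sketch of the idea, not the driver's actual code: every (src, dst) blend factor pair gets its own state object, built once at init time, so the composite path only looks one up. The names cc_state, cc_table, state_init, lookup_cc_state and the factor count of 4 are all hypothetical.

```c
#include <stdint.h>

/* Hypothetical number of blend factors; the real count may differ. */
#define N_BLEND_FACTORS 4

/* Hypothetical stand-in for a hardware blend state object. */
struct cc_state {
    uint8_t src_factor;
    uint8_t dst_factor;
};

/* One object for each possible combination of src and dst factor. */
static struct cc_state cc_table[N_BLEND_FACTORS][N_BLEND_FACTORS];

/* Plays the role of gen4_state_init: fill every object exactly once. */
void state_init(void)
{
    for (int src = 0; src < N_BLEND_FACTORS; src++) {
        for (int dst = 0; dst < N_BLEND_FACTORS; dst++) {
            cc_table[src][dst].src_factor = (uint8_t)src;
            cc_table[src][dst].dst_factor = (uint8_t)dst;
        }
    }
}

/* Plays the role of prepare_composite: no state update, just a
 * reference to an object that already exists. Callers must pass
 * factors in the range 0..N_BLEND_FACTORS-1. */
const struct cc_state *lookup_cc_state(int src, int dst)
{
    return &cc_table[src][dst];
}
```

The table costs N*N small objects up front, in exchange for doing no per-composite state writes.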
Thanks to Dave Airlie (and git-bisect) for pointing out that with
gnome-terminal all text was appearing as solid black with an early
version of this commit. As expected, the bug was an alignment issue.
(cherry picked from 0c0ab52c2d100c47f38c7ef826ef585c8b9815e9 commit)
Performance is approximately equivalent on text tests, but may be
around +2%.
The YUV->RGB code was written to write directly to the dataport registers,
but that didn't work for the compositing functions (cause still unknown).
This change makes that code write RGB values to the src_sample registers as
with the other sample computation fragments.
This reverts commit 346cf57deabb4c336612df4c13650a87b5ef6775.
Mixing randr transforms and video caused screen corruption for Render
operations. No, I don't understand why.
The hardware has been marked as needing a sync, so the next video put will
block waiting for the previous one to complete. Adding a sync here just
stalls the video playback for no good reason.
Instead of leaving pixel values in src_sample registers, compute the pixel
values directly to the data port to save 8 moves. This cannot work when no
computation is done, both because there is no way to wait for the sampler
to finish and because the sampler returns data in a different order from
that required by the data port (sigh).
This reduces the CPU overhead of memcpying them in every time, for a speedup
in aa24text of around 30%. This is based on work by Carl Worth which is
in the intel-batchbuffer branch.
The Intel driver appears to be coded to only work with displays
expecting 18 bit pixels. However, I have an application using an LCD
display that expects pixel data in 24 bit format. The difference is
only 2 bits in a single GPU register. This patch implements that
change, controlled by a new driver option, "LVDS24Bit". The default
value is false, which is the previous behavior. When set to true,
then 24 bit panels should work (at least the one I'm testing here
does).
Fd.o bug #15201
Signed-off-by: Mike Isely <isely@pobox.com>
Using the updated factors even when BT709 conversion isn't available
(non-965) should still give us better color reproduction. Tested on a
945GM, examining the +/-5% of black bars of videotestsrc.
The 2-bit input_mask was actually an input count -- in0 is always there, and
in1 is optional.
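
To illustrate the difference between the two readings, here is a small sketch. The names INPUT_IN0, INPUT_IN1 and inputs_from_count are hypothetical, and treating any count of 2 or more as "in1 present" is an assumption.

```c
#include <stdint.h>

/* Hypothetical flags for the inputs a device exposes. */
#define INPUT_IN0 (1u << 0)
#define INPUT_IN1 (1u << 1)

/* Interpret the 2-bit field as a count of inputs, not as a bitmask.
 * in0 is always present; in1 is present only when the count says
 * there is a second input (assumed here to mean count >= 2). */
unsigned inputs_from_count(uint8_t field)
{
    unsigned count = field & 0x3u;   /* only the low 2 bits are used */
    unsigned present = INPUT_IN0;    /* in0 is always there */

    if (count >= 2)
        present |= INPUT_IN1;        /* in1 is optional */
    return present;
}
```

With a field value of 2, the old bitmask reading would select only in1, while the count reading gives in0 and in1.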
The output flags weren't being reported in the log, so I mistakenly took
controlled_output == RGB0 to mean that the device only reported an RGB0,
while it actually reported RGB0|SVID0|YPRPB0|misc|other. Move SVID0 up
in priority and remove the RGB-is-it-really-TV hack I had just come up with.
Finally, set the input/output mapping at mode set time. We're always
supposed to do this, but haven't had to so far as we've never handled
devices with more than one output.
sf_mask is the same as sf except that it must compute both src and mask uvw
coefficients, which are conveniently adjacent in the same registers, and so
need only an extended execution width.
This involves correctly computing u/v locations based on x/y vectors and
line constants computed in the new sf program.
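
As a rough CPU-side sketch of that computation, assuming each coordinate is an affine function of x and y (u = cx*x + cy*y + c0, and likewise for v): the struct line_consts and both function names are hypothetical, and the real work is done in the shader program, not in C.

```c
#include <stddef.h>

/* Hypothetical per-coordinate constants, as the sf stage might
 * supply them: coord = cx * x + cy * y + c0. */
struct line_consts {
    float cx;
    float cy;
    float c0;
};

/* Evaluate one coordinate at a single pixel position. */
float eval_coord(const struct line_consts *k, float x, float y)
{
    return k->cx * x + k->cy * y + k->c0;
}

/* Evaluate u and v for a vector of n pixel positions. */
void eval_uv(const struct line_consts *ku, const struct line_consts *kv,
             const float *x, const float *y,
             float *u, float *v, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        u[i] = eval_coord(ku, x[i], y[i]);
        v[i] = eval_coord(kv, x[i], y[i]);
    }
}
```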
Also, use fewer instructions to make this go a bit faster (2X for 500x500
composite).