A magic number is required for so many functions of the GPU. In this
particular case, the likely requirement is that the offset of a texture
in the GTT has a minimum alignment of 64 bytes.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=46415
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
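To illustrate what such an alignment constraint means in practice, the
sketch below rounds an offset up to a 64-byte boundary. The names
TEXTURE_ALIGN and align_up are hypothetical and not taken from the
driver; only the 64-byte figure comes from the message above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constant: the minimum alignment suggested above. */
#define TEXTURE_ALIGN 64u

/* Round offset up to the next multiple of align (a power of two). */
static inline uint32_t align_up(uint32_t offset, uint32_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (offset + align - 1) & ~(align - 1);
}
```

An offset that is already a multiple of 64 is returned unchanged, so the
helper can be applied unconditionally.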
I overlooked a GCC warning about an implicit function declaration, which
of course results in a silent death at runtime. Oops.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
We used to allow the backing pixmap to be created later in order to
accommodate ShmPixmaps and ShmPutImage. However, they are now correctly
handled upfront if we choose to accelerate those paths, and so all
choices over whether to attach to a pixmap are made during creation and
are invariant.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The sampler just dies if it encounters a snoopable page, for no apparent
reason. Whilst I encountered the bug on Crestline, disable it for the
rest of gen4 just to be safe.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
As we wish to immediately map the vertex buffers, it is beneficial to
search the linear cache for an existing mapping to reuse first.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
KGEM_BUFFER_WRITE_INPLACE is WRITE | INPLACE and so the typo prevented
uploading of partial data through the pwrite paths.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
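A minimal sketch of why a composite flag needs care, assuming
hypothetical bit values and the component names KGEM_BUFFER_WRITE and
KGEM_BUFFER_INPLACE (the real definitions live in the driver and may
differ). Because KGEM_BUFFER_WRITE_INPLACE is WRITE | INPLACE, a plain
bitwise AND against it is also true for a buffer that is only WRITE;
comparing against the full mask tells the two apart. This is not
necessarily the exact typo fixed here.

```c
#include <stdbool.h>

/* Bit values are assumptions for illustration only. */
#define KGEM_BUFFER_WRITE   0x1
#define KGEM_BUFFER_INPLACE 0x2
#define KGEM_BUFFER_WRITE_INPLACE (KGEM_BUFFER_WRITE | KGEM_BUFFER_INPLACE)

/* Loose test: true if either bit is set. */
static inline bool any_write_inplace_bit(unsigned flags)
{
	return (flags & KGEM_BUFFER_WRITE_INPLACE) != 0;
}

/* Strict test: true only if both bits are set. */
static inline bool is_write_inplace(unsigned flags)
{
	return (flags & KGEM_BUFFER_WRITE_INPLACE) == KGEM_BUFFER_WRITE_INPLACE;
}
```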
When moving only a region to the CPU and we detect a pending clear, we
transform the operation into a move of the whole pixmap. In such
situations, we only have a partial damage area and so need to OR in
MOVE_READ to prevent the pending clear of the whole pixmap from being
discarded.
References: https://bugs.freedesktop.org/show_bug.cgi?id=46792
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
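The flag handling described above can be sketched as follows. The flag
values, the MOVE_WRITE name and the promote_region_move helper are
assumptions made for illustration; only MOVE_READ is named in the
message.

```c
#include <stdbool.h>

/* Hypothetical flag values; the driver's real definitions may differ. */
#define MOVE_WRITE 0x1
#define MOVE_READ  0x2

/*
 * When a region move is promoted to a whole-pixmap move because a clear
 * is pending, add MOVE_READ so the clear is applied, not discarded.
 */
static inline unsigned promote_region_move(unsigned flags, bool clear_pending)
{
	if (clear_pending)
		flags |= MOVE_READ;
	return flags;
}
```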
Debug builds are excruciatingly slow as the compiler doesn't store the
temporary in a register but uses an uncached readback instead. Maybe
this will help...
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
As we now attempt to retain partial buffers after execution, we can
end up with lots of inactive buffers sitting on the partial buffer list.
In any one batch, we wish to minimise the number of buffers used, so
keep all the inactive buffers on a separate list and only pull from them
as required.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
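A simplified sketch of the two-list idea, under the assumption of a
plain singly linked list. The structures and the cache_get helper are
hypothetical and much simpler than anything in the driver: buffers
already in the batch are tried first, and an idle buffer is pulled
across only when none of them has room.

```c
#include <stddef.h>

/* Hypothetical, simplified buffer record; not the driver's structure. */
struct partial_buffer {
	struct partial_buffer *next;
	size_t size;   /* total capacity in bytes */
	size_t used;   /* bytes already handed out */
};

struct buffer_cache {
	struct partial_buffer *active;    /* buffers used by this batch */
	struct partial_buffer *inactive;  /* idle buffers kept for reuse */
};

/* Prefer a buffer already in the batch; else pull one idle buffer. */
static struct partial_buffer *
cache_get(struct buffer_cache *cache, size_t size)
{
	struct partial_buffer *bo, **prev;

	for (bo = cache->active; bo; bo = bo->next) {
		if (bo->size - bo->used >= size) {
			bo->used += size;
			return bo;
		}
	}

	for (prev = &cache->inactive; (bo = *prev); prev = &bo->next) {
		if (bo->size >= size) {
			*prev = bo->next;          /* unlink from inactive */
			bo->used = size;
			bo->next = cache->active;  /* move to the active list */
			cache->active = bo;
			return bo;
		}
	}

	return NULL; /* caller would have to allocate a new buffer */
}
```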
Dust off the kernel patches and update to reflect the changes made to
support LLC CPU bo, in particular to support the unsynchronized shadow
buffers.
However, due to the forced synchronisation required for strict client
coherency we prefer not to use the vmap for shared pixmaps unless we are
already busy (i.e. sync afterwards rather than before in the hope that
we can squash a few operations into one). Being able to block the reply
to the client until the request is actually complete and so avoid the
sync remains a dream.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
As the buffer is cache-coherent, we can read as well as write to any
partial buffer so the distinction is irrelevant.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
This reverts commit 4adb6967a8.
Oops, this debugging commit was not intended to be pushed along with the
bugfix. :(
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
An artefact of retaining the mmapped partial buffers is that it
magnified the effect of stealing those for readback, causing extra
writes on non-llc platforms.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Bos used for batch buffers are handled differently and not tracked
through the active cache, so we failed to notice when we might be able
to run retire and recover a suitable buffer for reuse. So simply always
run retire when we might need to create a new linear buffer.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If we change tiling on a bo, we are effectively discarding the cached
mmap so it is preferable to look for another.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
We use XF86DRI as a user-configurable option to control whether to
build DRI support for i810, but it is also used internally within xorg
and there exists a public define in xorg-server.h which overrides our
configure option. So rename our define to HAVE_DRI1 to avoid the
conflict.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=46590
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If you are suffering from regular X crashes and rendering corruption
with a flood of ENOSPC or even EFILE reported in the Xorg.log, try
adding this snippet to your xorg.conf:
  Section "Device"
    Option "BufferCache" "False"
  EndSection
References: https://bugs.freedesktop.org/show_bug.cgi?id=39552
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
As gen3 only uses the single state emission block, and uniformly calls
get_rectangles(), we can move that caller protocol into the callee.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
As we prematurely end the batch if we bail on extending the vbo for CA
glyphs, we need to force the flush.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The heuristic of using the mapping only before the first use in an
execbuffer was suboptimal and broken by the change in bo initialisation.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Due to the w/a for its buggy shaders, gen4 is sufficiently different
that backporting the simple patch from gen5 was prone to failure. We
need to check that the vertices have not already been flushed prior to
flushing again.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
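The guard described above might look roughly like this. The state
structure and the flush_vertices helper are hypothetical and stand in
for the real gen4 code, which is considerably more involved.

```c
#include <stdbool.h>

/* Hypothetical, simplified vertex state; field names are illustrative. */
struct vertex_state {
	unsigned pending;      /* vertices written but not yet flushed */
	unsigned flush_count;  /* number of flushes actually performed */
};

/* Flush pending vertices; do nothing if they were already flushed. */
static bool flush_vertices(struct vertex_state *state)
{
	if (state->pending == 0)
		return false;

	state->pending = 0;
	state->flush_count++;
	return true;
}
```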
Or upon actually closing the vertex buffer.
However, the underlying issue remains. That is, we are failing to re-emit
the first-pass for CA text after flushing the vertex buffer (and so
emitting the second-pass for the flushed vertices).
Reported-by: Clemens Eisserer <linuxhippy@gmail.com>
References: https://bugs.freedesktop.org/show_bug.cgi?id=42891
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
As we may wait upon the bo having finished rendering before we can
execute the flip, flushing the render cache as early as possible is
beneficial.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Before blocking and waiting for further input, we need to make sure that
we have not developed too large a queue of outstanding rendering. As we
render to the front-buffer with no natural throttling and allow X
clients to render as fast as they wish, it is entirely possible for a
large queue of outstanding rendering to develop. For such an example,
watch firefox rendering the fishietank demo and notice the delay that
can build up before the tooltips appear.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If we change the Screen pixmap due to a change of mode, we lose the
flag that we've attached a DRI2 buffer to it. So the next time we try to
copy from/to it, reassert its DRI2 status.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>