History log of /linux-master/drivers/gpu/drm/i915/i915_sw_fence.c
Revision Date Author Comments
# 292a089d 20-Dec-2022 Steven Rostedt (Google) <rostedt@goodmis.org>

treewide: Convert del_timer*() to timer_shutdown*()

Due to several bugs caused by timers being re-armed after they are
shutdown and just before they are freed, a new state of timers was added
called "shutdown". After a timer is set to this state, then it can no
longer be re-armed.

The following script was run to find all the trivial locations where
del_timer() or del_timer_sync() is called in the same function that the
object holding the timer is freed. It also ignores any locations where
the timer->function is modified between the del_timer*() and the free(),
as that is not considered a "trivial" case.

This was created by using a coccinelle script and the following
commands:

$ cat timer.cocci
@@
expression ptr, slab;
identifier timer, rfield;
@@
(
- del_timer(&ptr->timer);
+ timer_shutdown(&ptr->timer);
|
- del_timer_sync(&ptr->timer);
+ timer_shutdown_sync(&ptr->timer);
)
... when strict
when != ptr->timer
(
kfree_rcu(ptr, rfield);
|
kmem_cache_free(slab, ptr);
|
kfree(ptr);
)

$ spatch timer.cocci . > /tmp/t.patch
$ patch -p1 < /tmp/t.patch

Link: https://lore.kernel.org/lkml/20221123201306.823305113@linutronix.de/
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Pavel Machek <pavel@ucw.cz> [ LED ]
Acked-by: Kalle Valo <kvalo@kernel.org> [ wireless ]
Acked-by: Paolo Abeni <pabeni@redhat.com> [ networking ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 8146d588 22-Sep-2022 Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>

drm/i915: Remove unused function parameter

The function parameter 'exclude' in funciton
i915_sw_fence_await_reservation() is not used.
Remove it.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220922213916.12112-1-niranjana.vishwanathapura@intel.com


# 03e067bc 29-Aug-2022 Jani Nikula <jani.nikula@intel.com>

drm/i915/fence: replace BUG_ON() with BUILD_BUG_ON()

Avoid BUG_ON(). Since __i915_sw_fence_init() is always called via a
wrapper macro, we can replace it with a compile time BUILD_BUG_ON().

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220830093411.1511040-5-jani.nikula@intel.com


# 7bc80a54 09-Nov-2021 Christian König <christian.koenig@amd.com>

dma-buf: add enum dma_resv_usage v4

This change adds the dma_resv_usage enum and allows us to specify why a
dma_resv object is queried for its containing fences.

Additional to that a dma_resv_usage_rw() helper function is added to aid
retrieving the fences for a read or write userspace submission.

This is then deployed to the different query functions of the dma_resv
object and all of their users. When the write paratermer was previously
true we now use DMA_RESV_USAGE_WRITE and DMA_RESV_USAGE_READ otherwise.

v2: add KERNEL/OTHER in separate patch
v3: some kerneldoc suggestions by Daniel
v4: some more kerneldoc suggestions by Daniel, fix missing cases lost in
the rebase pointed out by Bas.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20220407085946.744568-2-christian.koenig@amd.com


# 44505168 16-Nov-2021 Matthew Brost <matthew.brost@intel.com>

drm/i915: Drop stealing of bits from i915_sw_fence function pointer

Rather than stealing bits from i915_sw_fence function pointer use
separate fields for function pointer and flags. If using two different
fields, the 4 byte alignment for the i915_sw_fence function pointer can
also be dropped.

v2:
(CI)
- Set new function field rather than flags in __i915_sw_fence_init
v3:
(Tvrtko)
- Remove BUG_ON(!fence->flags) in reinit as that will now blow up
- Only define fence->flags if CONFIG_DRM_I915_SW_FENCE_CHECK_DAG is
defined
v4:
- Rebase, resend for CI

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211116194929.10211-1-matthew.brost@intel.com


# 1b5bdf07 13-Sep-2021 Christian König <christian.koenig@amd.com>

drm/i915: use the new iterator in i915_sw_fence_await_reservation v3

Simplifying the code a bit.

v2: use dma_resv_for_each_fence instead, according to Tvrtko the lock is
held here anyway.
v3: back to using dma_resv_for_each_fence_unlocked.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20211116102431.198905-4-christian.koenig@amd.com


# d3fae3b3 02-Jun-2021 Christian König <christian.koenig@amd.com>

dma-buf: drop the _rcu postfix on function names v3

The functions can be called both in _rcu context as well
as while holding the lock.

v2: add some kerneldoc as suggested by Daniel
v3: fix indentation

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210602111714.212426-7-christian.koenig@amd.com


# 6b41323a 01-Jun-2021 Christian König <christian.koenig@amd.com>

dma-buf: rename dma_resv_get_excl_rcu to _unlocked

That describes much better what the function is doing here.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210602111714.212426-6-christian.koenig@amd.com


# 460d02ba 16-Dec-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Encode fence specific waitqueue behaviour into the wait.flags

Use the wait_queue_entry.flags to denote the special fence behaviour
(flattening continuations along fence chains, and for propagating
errors) rather than trying to detect ordinary waiters by their
functions.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201216165850.25030-1-chris@chris-wilson.co.uk


# f9e62f31 14-Aug-2020 Stephen Boyd <swboyd@chromium.org>

treewide: Make all debug_obj_descriptors const

This should make it harder for the kernel to corrupt the debug object
descriptor, used to call functions to fixup state and track debug objects,
by moving the structure to read-only memory.

Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20200815004027.2046113-3-swboyd@chromium.org


# 20612303 28-Jul-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Filter wake_flags passed to default_wake_function

(NOTE: This is the minimal backportable fix, a full fix is being
developed at https://patchwork.freedesktop.org/patch/388048/)

The flags passed to the wait_entry.func are passed onwards to
try_to_wake_up(), which has a very particular interpretation for its
wake_flags. In particular, beyond the published WF_SYNC, it has a few
internal flags as well. Since we passed the fence->error down the chain
via the flags argument, these ended up in the default_wake_function
confusing the kernel/sched.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2110
Fixes: ef4688497512 ("drm/i915: Propagate fence errors")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: <stable@vger.kernel.org> # v5.4+
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200728152144.1100-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Joonas: Rebased and reordered into drm-intel-gt-next branch]
[Joonas: Added a note and link about more complete fix]
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
(cherry picked from commit f4b3c395540aa3d4f5a6275c5bdd83ab89034806)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# f4b3c395 28-Jul-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Filter wake_flags passed to default_wake_function

(NOTE: This is the minimal backportable fix, a full fix is being
developed at https://patchwork.freedesktop.org/patch/388048/)

The flags passed to the wait_entry.func are passed onwards to
try_to_wake_up(), which has a very particular interpretation for its
wake_flags. In particular, beyond the published WF_SYNC, it has a few
internal flags as well. Since we passed the fence->error down the chain
via the flags argument, these ended up in the default_wake_function
confusing the kernel/sched.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2110
Fixes: ef4688497512 ("drm/i915: Propagate fence errors")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: <stable@vger.kernel.org> # v5.4+
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200728152144.1100-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Joonas: Rebased and reordered into drm-intel-gt-next branch]
[Joonas: Added a note and link about more complete fix]
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# a80d7367 11-May-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Tidy awaiting on dma-fences

Just tidy up the return handling for completed dma-fences. While it may
return errors for invalid fence, we already know that we have a good
fence and the only error will be an already signaled fence.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200511075722.13483-5-chris@chris-wilson.co.uk


# 2386b492 19-Mar-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Prefer '%ps' for printing function symbol names

%pS includes the offset, which is useful for return addresses but noise
when we are pretty printing a known (and expected) function entry point.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200319091943.7815-1-chris@chris-wilson.co.uk


# 42fb60de 11-Feb-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gem: Don't leak non-persistent requests on changing engines

If we have a set of active engines marked as being non-persistent, we
lose track of those if the user replaces those engines with
I915_CONTEXT_PARAM_ENGINES. As part of our uABI contract is that
non-persistent requests are terminated if they are no longer being
tracked by the user's context (in order to prevent a lost request
causing an untracked and so unstoppable GPU hang), we need to apply the
same context cancellation upon changing engines.

v2: Track stale engines[] so we only reap at context closure.
v3: Tvrtko spotted races with closing contexts and set-engines, so add a
veneer of kill-everything paranoia to clean up after losing a race.

Fixes: a0e047156cde ("drm/i915/gem: Make context persistence optional")
Testcase: igt/gem_ctx_peristence/replace
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200211144831.1011498-1-chris@chris-wilson.co.uk


# cbab8d87 06-Dec-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Propagate errors on awaiting already signaled dma-fences

If we see an already signaled dma-fence that we want to await on, we skip
adding to the i915_sw_fence. However, we should pay attention to whether
there was an error on that fence and if so propagate it for our future
request.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191206160428.1503343-3-chris@chris-wilson.co.uk


# 67a3acaa 22-Nov-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Use a ctor for TYPESAFE_BY_RCU i915_request

As we start peeking into requests for longer and longer, e.g.
incorporating use of spinlocks when only protected by an
rcu_read_lock(), we need to be careful in how we reset the request when
recycling and need to preserve any barriers that may still be in use as
the request is reset for reuse.

Quoting Linus Torvalds:

> If there is refcounting going on then why use SLAB_TYPESAFE_BY_RCU?

.. because the object can be accessed (by RCU) after the refcount has
gone down to zero, and the thing has been released.

That's the whole and only point of SLAB_TYPESAFE_BY_RCU.

That flag basically says:

"I may end up accessing this object *after* it has been free'd,
because there may be RCU lookups in flight"

This has nothing to do with constructors. It's ok if the object gets
reused as an object of the same type and does *not* get
re-initialized, because we're perfectly fine seeing old stale data.

What it guarantees is that the slab isn't shared with any other kind
of object, _and_ that the underlying pages are free'd after an RCU
quiescent period (so the pages aren't shared with another kind of
object either during an RCU walk).

And it doesn't necessarily have to have a constructor, because the
thing that a RCU walk will care about is

(a) guaranteed to be an object that *has* been on some RCU list (so
it's not a "new" object)

(b) the RCU walk needs to have logic to verify that it's still the
*same* object and hasn't been re-used as something else.

In contrast, a SLAB_TYPESAFE_BY_RCU memory gets free'd and re-used
immediately, but because it gets reused as the same kind of object,
the RCU walker can "know" what parts have meaning for re-use, in a way
it couidn't if the re-use was random.

That said, it *is* subtle, and people should be careful.

> So the re-use might initialize the fields lazily, not necessarily using a ctor.

If you have a well-defined refcount, and use "atomic_inc_not_zero()"
to guard the speculative RCU access section, and use
"atomic_dec_and_test()" in the freeing section, then you should be
safe wrt new allocations.

If you have a completely new allocation that has "random stale
content", you know that it cannot be on the RCU list, so there is no
speculative access that can ever see that random content.

So the only case you need to worry about is a re-use allocation, and
you know that the refcount will start out as zero even if you don't
have a constructor.

So you can think of the refcount itself as always having a zero
constructor, *BUT* you need to be careful with ordering.

In particular, whoever does the allocation needs to then set the
refcount to a non-zero value *after* it has initialized all the other
fields. And in particular, it needs to make sure that it uses the
proper memory ordering to do so.

NOTE! One thing to be very worried about is that re-initializing
whatever RCU lists means that now the RCU walker may be walking on the
wrong list so the walker may do the right thing for this particular
entry, but it may miss walking *other* entries. So then you can get
spurious lookup failures, because the RCU walker never walked all the
way to the end of the right list. That ends up being a much more
subtle bug.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191122094924.629690-1-chris@chris-wilson.co.uk


# ef468849 17-Aug-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Propagate fence errors

Errors spread like wildfire, and must eventually be returned to the
user. They need to be captured and passed along the flow of fences,
infecting each in turn with the existing error, until finally they fall
out of a user visible result.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190817232511.11391-1-chris@chris-wilson.co.uk


# 52791eee 11-Aug-2019 Christian König <christian.koenig@amd.com>

dma-buf: rename reservation_object to dma_resv

Be more consistent with the naming of the other DMA-buf objects.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/323401/


# ea593dbb 22-Mar-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Allow contexts to share a single timeline across all engines

Previously, our view has been always to run the engines independently
within a context. (Multiple engines happened before we had contexts and
timelines, so they always operated independently and that behaviour
persisted into contexts.) However, at the user level the context often
represents a single timeline (e.g. GL contexts) and userspace must
ensure that the individual engines are serialised to present that
ordering to the client (or forgot about this detail entirely and hope no
one notices - a fair ploy if the client can only directly control one
engine themselves ;)

In the next patch, we will want to construct a set of engines that
operate as one, that have a single timeline interwoven between them, to
present a single virtual engine to the user. (They submit to the virtual
engine, then we decide which engine to execute on based.)

To that end, we want to be able to create contexts which have a single
timeline (fence context) shared between all engines, rather than multiple
timelines.

v2: Move the specialised timeline ordering to its own function.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190322092325.5883-4-chris@chris-wilson.co.uk


# e8861964 01-Mar-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Use HW semaphores for inter-engine synchronisation on gen8+

Having introduced per-context seqno, we now have a means to identity
progress across the system without feel of rollback as befell the
global_seqno. That is we can program a MI_SEMAPHORE_WAIT operation in
advance of submission safe in the knowledge that our target seqno and
address is stable.

However, since we are telling the GPU to busy-spin on the target address
until it matches the signaling seqno, we only want to do so when we are
sure that busy-spin will be completed quickly. To achieve this we only
submit the request to HW once the signaler is itself executing (modulo
preemption causing us to wait longer), and we only do so for default and
above priority requests (so that idle priority tasks never themselves
hog the GPU waiting for others).

As might be reasonably expected, HW semaphores excel in inter-engine
synchronisation microbenchmarks (where the 3x reduced latency / increased
throughput more than offset the power cost of spinning on a second ring)
and have significant improvement (can be up to ~10%, most see no change)
for single clients that utilize multiple engines (typically media players
and transcoders), without regressing multiple clients that can saturate
the system or changing the power envelope dramatically.

v3: Drop the older NEQ branch, now we pin the signaler's HWSP anyway.
v4: Tell the world and include it as part of scheduler caps.

Testcase: igt/gem_exec_whisper
Testcase: igt/benchmarks/gem_wsim
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190301170901.8340-3-chris@chris-wilson.co.uk


# b312d8ca 14-Nov-2018 Christian König <christian.koenig@amd.com>

dma-buf: make fence sequence numbers 64 bit v2

For a lot of use cases we need 64bit sequence numbers. Currently drivers
overload the dma_fence structure to store the additional bits.

Stop doing that and make the sequence number in the dma_fence always
64bit.

For compatibility with hardware which can do only 32bit sequences the
comparisons in __dma_fence_is_later only takes the lower 32bits as significant
when the upper 32bits are all zero.

v2: change the logic in __dma_fence_is_later

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Link: https://patchwork.freedesktop.org/patch/266927/


# 635b3bc6 28-Nov-2018 Jonathan Gray <jsg@jsg.id.au>

drm/i915: change i915_sw_fence license to MIT

Change the license of the i915_sw_fence files to MIT matching
most of the other i915 files. This makes it possible to use them
in a new port of i915 to OpenBSD.

Besides some mechanical tree wide changes Chris Wilson is the sole
author of these files with Intel holding the copyright.

Intel's legal team have given permission to change the license according
to Joonas Lahtinen.

v2: expand commit message and note permission from Intel legal

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20181129013051.17525-1-jsg@jsg.id.au


# 5791bad4 14-Sep-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Include fence-hint for timeout warning

If an asynchronous wait on a foriegn fence, we print a warning
indicating which fence was not signaled. As i915_sw_fences become more
common, include the debug hint (the symbol-name of the target) to help
identify the waiter. E.g.

[ 31.968144] Asynchronous wait on fence sw_sync:gem_eio:1 timed out (hint:submit_notify [i915])

We also want to downgrade from a warning to a notice (normal but
significant condition) as the timeout is imposed and controlled by the
caller (i.e. it is deliberate) and can be provoked by userspace.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180914124007.18790-1-chris@chris-wilson.co.uk


# f255c1e9 15-Jan-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/fence: Separate timeout mechanism for awaiting on dma-fences

As the timeout mechanism has grown more and more complicated, using
multiple deferred tasks and more than doubling the size of our struct,
split the two implementations to streamline the simpler no-timeout
callback variant.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180115090643.26696-2-chris@chris-wilson.co.uk


# c32164b1 15-Jan-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Only defer freeing of fence callback when also using the timer

Without an accompanying timer (for internal fences), we can free the
fence callback immediately as we do not need to employ the RCU barrier
to serialise with the timer. By avoiding the RCU delay, we can avoid the
extra mempressure under heavy inter-engine request utilisation.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180115090643.26696-1-chris@chris-wilson.co.uk


# 2cf654db 13-Dec-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/fence: Use rcu to defer freeing of irq_work

It is illegal to perform an immediate free of the struct irq_work from
inside the irq_work callback (as irq_work_run_list modifies work->flags
after execution of the work->func()). As we use the irq_work to
coordinate the freeing of the callback from two different softirq paths,
we need to defer the kfree from inside our irq_work callback, for which
we can use kfree_rcu.

Fixes: 81c0ed21aa91 ("drm/i915/fence: Avoid del_timer_sync() from inside a timer")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171213094802.28243-1-chris@chris-wilson.co.uk
(cherry picked from commit 7d622351c94172a42bfe9b13bdb0fdc2be90ed3b)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# e30a7581 12-Dec-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Mark up potential allocation paths within i915_sw_fence as might_sleep

As kmalloc is allowed to block (if given the right flags), mark up the
two i915_sw_fence routines that may call kmalloc as potential sleeping
routines.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171212180652.22061-1-chris@chris-wilson.co.uk


# 7d622351 13-Dec-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/fence: Use rcu to defer freeing of irq_work

It is illegal to perform an immediate free of the struct irq_work from
inside the irq_work callback (as irq_work_run_list modifies work->flags
after execution of the work->func()). As we use the irq_work to
coordinate the freeing of the callback from two different softirq paths,
we need to defer the kfree from inside our irq_work callback, for which
we can use kfree_rcu.

Fixes: 81c0ed21aa91 ("drm/i915/fence: Avoid del_timer_sync() from inside a timer")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171213094802.28243-1-chris@chris-wilson.co.uk


# 39cbf2aa 17-Oct-2017 Kees Cook <keescook@chromium.org>

drm/i915: Convert timers to use timer_setup()

In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171017065304.3358-1-joonas.lahtinen@linux.intel.com


# 214707fc 12-Oct-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/selftests: Wrap a timer into a i915_sw_fence

For some selftests, we want to issue requests but delay them going to
hardware. Furthermore, we don't want those requests to block
indefinitely (or else we may hang the driver and block testing) so we
want to employ a timeout. So naturally we want a fence that is
automatically signaled by a timer.

v2: Add kselftests.
v3: Limit the API available to selftests; there isn't an overwhelming
reason to export it universally.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171012125726.14736-2-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>


# 81c0ed21 11-Sep-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/fence: Avoid del_timer_sync() from inside a timer

A fence may be signaled from any context, including from inside a timer.
One example is timer_i915_sw_fence_wake() which is used to provide a
safety-net when waiting on an external fence. If the external fence is
not signaled within a timely fashion, we signal our fence on its behalf,
and so we then may process subsequent fences in the chain from within
that timer context.

Given that dma_i915_sw_fence_wake() may be from inside a timer, we cannot
then use del_timer_sync() as that requires the timer lock for itself. To
circumvent this, while trying to keep the signal propagation as low
latency as possible, move the completion into a worker and use a bit of
atomic switheroo to serialise the timer-callback and the dma-callback.

Testcase: igt/gem_eio/in-flight-external
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170911084135.22903-3-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>


# 2055da97 19-Jun-2017 Ingo Molnar <mingo@kernel.org>

sched/wait: Disambiguate wq_entry->task_list and wq_head->task_list naming

So I've noticed a number of instances where it was not obvious from the
code whether ->task_list was for a wait-queue head or a wait-queue entry.

Furthermore, there's a number of wait-queue users where the lists are
not for 'tasks' but other entities (poll tables, etc.), in which case
the 'task_list' name is actively confusing.

To clear this all up, name the wait-queue head and entry list structure
fields unambiguously:

struct wait_queue_head::task_list => ::head
struct wait_queue_entry::task_list => ::entry

For example, this code:

rqw->wait.task_list.next != &wait->task_list

... is was pretty unclear (to me) what it's doing, while now it's written this way:

rqw->wait.head.next != &wait->entry

... which makes it pretty clear that we are iterating a list until we see the head.

Other examples are:

list_for_each_entry_safe(pos, next, &x->task_list, task_list) {
list_for_each_entry(wq, &fence->wait.task_list, task_list) {

... where it's unclear (to me) what we are iterating, and during review it's
hard to tell whether it's trying to walk a wait-queue entry (which would be
a bug), while now it's written as:

list_for_each_entry_safe(pos, next, &x->head, entry) {
list_for_each_entry(wq, &fence->wait.head, entry) {

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# ac6424b9 19-Jun-2017 Ingo Molnar <mingo@kernel.org>

sched/wait: Rename wait_queue_t => wait_queue_entry_t

Rename:

wait_queue_t => wait_queue_entry_t

'wait_queue_t' was always a slight misnomer: its name implies that it's a "queue",
but in reality it's a queue *entry*. The 'real' queue is the wait queue head,
which had to carry the name.

Start sorting this out by renaming it to 'wait_queue_entry_t'.

This also allows the real structure name 'struct __wait_queue' to
lose its double underscore and become 'struct wait_queue_entry',
which is the more canonical nomenclature for such data types.

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 47624cc3 17-May-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Import the kfence selftests for i915_sw_fence

A long time ago, I wrote some selftests for the struct kfence idea. Now
that we have infrastructure in i915/igt for running kselftests, include
some for i915_sw_fence.

v2: INIT_WORK_ONSTACK/destroy_work_on_stack (Mika)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-2-chris@chris-wilson.co.uk


# 9310cb7f 17-May-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove kref from i915_sw_fence

My original intention was for i915_sw_fence to be the base class and
provide the reference count for the container. This was from starting
with a design to handle async_work. In practice, for i915 we embed
fences into structs which have their own independent reference counting,
making the i915_sw_fence.kref duplicitous. If we remove the kref, we
remove the i915_sw_fence's ability to free itself and its independence,
it can only exist within a container and must be supplied with a
callback to handle its release.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-1-chris@chris-wilson.co.uk


# 8dfe162a 28-Feb-2017 Joe Perches <joe@perches.com>

gpu: drm: drivers: Convert printk(KERN_<LEVEL> to pr_<level>

Use a more common logging style.

Miscellanea:

o Coalesce formats and realign arguments
o Neaten a few macros now using pr_<level>

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Sinclair Yeh <syeh@vmware.com>
Acked-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Acked-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/76355db47b31668bb64d996865ceee53bd66b11f.1488285953.git.joe@perches.com


# 6f13f29f 13-Jan-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Flush the change in debugobject before reallocation

When marking the debugobject as freed, be sure that write is flushed
before another CPU may see it on a reallocation path.

Only seen once in CI:

[ 159.240873] WARNING: CPU: 3 PID: 6735 at lib/debugobjects.c:263 debug_print_object+0x87/0xb0
[ 159.240897] ODEBUG: init destroyed (active state 0) object type: i915_sw_fence hint: submit_notify+0x0/0x4c [i915]
[ 159.240902] Modules linked in: snd_hda_intel i915 x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul snd_hda_codec_realtek crc32_pclmul snd_hda_codec_generic snd_hda_codec_hdmi ghash_clmulni_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm mei_me lpc_ich mei e1000e ptp pps_core [last unloaded: i915]
[ 159.240913] CPU: 3 PID: 6735 Comm: gem_exec_nop Tainted: G U 4.10.0-rc3-CI-Trybot_479+ #1
[ 159.240913] Hardware name: LENOVO 10AGS00601/SHARKBAY, BIOS FBKT34AUS 04/24/2013
[ 159.240914] Call Trace:
[ 159.240916] dump_stack+0x67/0x92
[ 159.240919] __warn+0xc6/0xe0
[ 159.240920] warn_slowpath_fmt+0x4a/0x50
[ 159.240921] debug_print_object+0x87/0xb0
[ 159.240935] ? __i915_request_wait_for_execute+0x1d0/0x1d0 [i915]
[ 159.240936] __debug_object_init+0xb2/0x410
[ 159.240950] ? __i915_request_wait_for_execute+0x1d0/0x1d0 [i915]
[ 159.240951] debug_object_init+0x16/0x20
[ 159.240962] __i915_sw_fence_init+0x29/0x60 [i915]
[ 159.240975] i915_gem_request_alloc+0x1fb/0x450 [i915]
[ 159.240987] i915_gem_do_execbuffer.isra.15+0x798/0x1b20 [i915]
[ 159.241000] i915_gem_execbuffer2+0xc0/0x250 [i915]
[ 159.241003] drm_ioctl+0x200/0x450
[ 159.241016] ? i915_gem_execbuffer+0x330/0x330 [i915]
[ 159.241018] do_vfs_ioctl+0x90/0x6e0
[ 159.241020] ? trace_hardirqs_on_caller+0x122/0x1b0
[ 159.241021] SyS_ioctl+0x3c/0x70
[ 159.241023] entry_SYSCALL_64_fastpath+0x1c/0xb1
[ 159.241024] RIP: 0033:0x7f9bc4f41357
[ 159.241025] RSP: 002b:00007ffc6cd5c568 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 159.241026] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f9bc4f41357
[ 159.241026] RDX: 00007ffc6cd5c640 RSI: 0000000040406469 RDI: 0000000000000003
[ 159.241027] RBP: 00007ffc6cd5c640 R08: 0000000000047508 R09: 0000000000000001
[ 159.241027] R10: 000b58552d323c3d R11: 0000000000000246 R12: 0000000040406469
[ 159.241028] R13: 0000000000000003 R14: 0000000000000004 R15: 0000000000000001

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170113214335.5829-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# fc158405 25-Nov-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Integrate i915_sw_fence with debugobjects

Add the tracking required to enable debugobjects for fences to improve
error detection in BAT. The debugobject interface lets us track the
lifetime and phases of the fences even while being embedded into larger
structs, i.e. to check they are not used after they have been released.

v2: Don't populate the stubs, debugobjects checks for a NULL pointer and
treats it equivalently.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161125131718.20978-4-chris@chris-wilson.co.uk


# 556b7487 14-Nov-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Give each sw_fence its own lockclass

Localise the static struct lock_class_key to the caller of
i915_sw_fence_init() so that we create a lock_class instance for each
unique sw_fence rather than all sw_fences sharing the same
lock_class. This eliminate some lockdep false positive when using fences
from within fence callbacks.

For the relatively small number of fences currently in use [2], this adds
160 bytes of unused text/code when lockdep is disabled. This seems
quite high, but fully reducing it via ifdeffery is also quite ugly.
Removing the #fence strings saves 72 bytes with just a single #ifdef.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161114204105.29171-1-chris@chris-wilson.co.uk


# 7e941861 28-Oct-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Allow i915_sw_fence_await_sw_fence() to allocate

In forthcoming patches, we want to be able to dynamically allocate the
wait_queue_t used whilst awaiting. This is more convenient if we extend
the i915_sw_fence_await_sw_fence() to perform the allocation for us if
we pass in a gfp mask as an alternative than a preallocated struct.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-2-chris@chris-wilson.co.uk


# f54d1867 25-Oct-2016 Chris Wilson <chris@chris-wilson.co.uk>

dma-buf: Rename struct fence to dma_fence

I plan to usurp the short name of struct fence for a core kernel struct,
and so I need to rename the specialised fence/timeline for DMA
operations to make room.

A consensus was reached in
https://lists.freedesktop.org/archives/dri-devel/2016-July/113083.html
that making clear this fence applies to DMA operations was a good thing.
Since then the patch has grown a bit as usage increases, so hopefully it
remains a good thing!

(v2...: rebase, rerun spatch)
v3: Compile on msm, spotted a manual fixup that I broke.
v4: Try again for msm, sorry Daniel

coccinelle script:
@@

@@
- struct fence
+ struct dma_fence
@@

@@
- struct fence_ops
+ struct dma_fence_ops
@@

@@
- struct fence_cb
+ struct dma_fence_cb
@@

@@
- struct fence_array
+ struct dma_fence_array
@@

@@
- enum fence_flag_bits
+ enum dma_fence_flag_bits
@@

@@
(
- fence_init
+ dma_fence_init
|
- fence_release
+ dma_fence_release
|
- fence_free
+ dma_fence_free
|
- fence_get
+ dma_fence_get
|
- fence_get_rcu
+ dma_fence_get_rcu
|
- fence_put
+ dma_fence_put
|
- fence_signal
+ dma_fence_signal
|
- fence_signal_locked
+ dma_fence_signal_locked
|
- fence_default_wait
+ dma_fence_default_wait
|
- fence_add_callback
+ dma_fence_add_callback
|
- fence_remove_callback
+ dma_fence_remove_callback
|
- fence_enable_sw_signaling
+ dma_fence_enable_sw_signaling
|
- fence_is_signaled_locked
+ dma_fence_is_signaled_locked
|
- fence_is_signaled
+ dma_fence_is_signaled
|
- fence_is_later
+ dma_fence_is_later
|
- fence_later
+ dma_fence_later
|
- fence_wait_timeout
+ dma_fence_wait_timeout
|
- fence_wait_any_timeout
+ dma_fence_wait_any_timeout
|
- fence_wait
+ dma_fence_wait
|
- fence_context_alloc
+ dma_fence_context_alloc
|
- fence_array_create
+ dma_fence_array_create
|
- to_fence_array
+ to_dma_fence_array
|
- fence_is_array
+ dma_fence_is_array
|
- trace_fence_emit
+ trace_dma_fence_emit
|
- FENCE_TRACE
+ DMA_FENCE_TRACE
|
- FENCE_WARN
+ DMA_FENCE_WARN
|
- FENCE_ERR
+ DMA_FENCE_ERR
)
(
...
)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Acked-by: Sumit Semwal <sumit.semwal@linaro.org>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20161025120045.28839-1-chris@chris-wilson.co.uk


# e68a139f 09-Sep-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Add a sw fence for collecting up dma fences

This is really a core kernel struct in disguise until we can finally
place it in kernel/. There is an immediate need for a fence collection
mechanism that is more flexible than fence-array, in particular being
able to easily drive request submission via events (and not just
interrupt driven). The same mechanism would be useful for handling
nonblocking and asynchronous atomic modesets, parallel execution and
more, but for the time being just create a local sw fence for execbuf.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160909131201.16673-1-chris@chris-wilson.co.uk