#
8146d588 |
|
22-Sep-2022 |
Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> |
drm/i915: Remove unused function parameter The function parameter 'exclude' in funciton i915_sw_fence_await_reservation() is not used. Remove it. Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220922213916.12112-1-niranjana.vishwanathapura@intel.com
|
#
03e067bc |
|
29-Aug-2022 |
Jani Nikula <jani.nikula@intel.com> |
drm/i915/fence: replace BUG_ON() with BUILD_BUG_ON() Avoid BUG_ON(). Since __i915_sw_fence_init() is always called via a wrapper macro, we can replace it with a compile time BUILD_BUG_ON(). Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220830093411.1511040-5-jani.nikula@intel.com
|
#
44505168 |
|
16-Nov-2021 |
Matthew Brost <matthew.brost@intel.com> |
drm/i915: Drop stealing of bits from i915_sw_fence function pointer Rather than stealing bits from i915_sw_fence function pointer use separate fields for function pointer and flags. If using two different fields, the 4 byte alignment for the i915_sw_fence function pointer can also be dropped. v2: (CI) - Set new function field rather than flags in __i915_sw_fence_init v3: (Tvrtko) - Remove BUG_ON(!fence->flags) in reinit as that will now blow up - Only define fence->flags if CONFIG_DRM_I915_SW_FENCE_CHECK_DAG is defined v4: - Rebase, resend for CI Signed-off-by: Matthew Brost <matthew.brost@intel.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211116194929.10211-1-matthew.brost@intel.com
|
#
42fb60de |
|
11-Feb-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gem: Don't leak non-persistent requests on changing engines If we have a set of active engines marked as being non-persistent, we lose track of those if the user replaces those engines with I915_CONTEXT_PARAM_ENGINES. As part of our uABI contract is that non-persistent requests are terminated if they are no longer being tracked by the user's context (in order to prevent a lost request causing an untracked and so unstoppable GPU hang), we need to apply the same context cancellation upon changing engines. v2: Track stale engines[] so we only reap at context closure. v3: Tvrtko spotted races with closing contexts and set-engines, so add a veneer of kill-everything paranoia to clean up after losing a race. Fixes: a0e047156cde ("drm/i915/gem: Make context persistence optional") Testcase: igt/gem_ctx_peristence/replace Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200211144831.1011498-1-chris@chris-wilson.co.uk
|
#
5e6a9471 |
|
06-Dec-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Check for error before calling cmpxchg() Only do the locked compare of the existing fence->error if we actually need to set an error. As we tend to call i915_sw_fence_set_error_once() unconditionally, it saves on typing to put the common has-error check into the inline. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191206160428.1503343-1-chris@chris-wilson.co.uk
|
#
67a3acaa |
|
22-Nov-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Use a ctor for TYPESAFE_BY_RCU i915_request As we start peeking into requests for longer and longer, e.g. incorporating use of spinlocks when only protected by an rcu_read_lock(), we need to be careful in how we reset the request when recycling and need to preserve any barriers that may still be in use as the request is reset for reuse. Quoting Linus Torvalds: > If there is refcounting going on then why use SLAB_TYPESAFE_BY_RCU? .. because the object can be accessed (by RCU) after the refcount has gone down to zero, and the thing has been released. That's the whole and only point of SLAB_TYPESAFE_BY_RCU. That flag basically says: "I may end up accessing this object *after* it has been free'd, because there may be RCU lookups in flight" This has nothing to do with constructors. It's ok if the object gets reused as an object of the same type and does *not* get re-initialized, because we're perfectly fine seeing old stale data. What it guarantees is that the slab isn't shared with any other kind of object, _and_ that the underlying pages are free'd after an RCU quiescent period (so the pages aren't shared with another kind of object either during an RCU walk). And it doesn't necessarily have to have a constructor, because the thing that a RCU walk will care about is (a) guaranteed to be an object that *has* been on some RCU list (so it's not a "new" object) (b) the RCU walk needs to have logic to verify that it's still the *same* object and hasn't been re-used as something else. In contrast, a SLAB_TYPESAFE_BY_RCU memory gets free'd and re-used immediately, but because it gets reused as the same kind of object, the RCU walker can "know" what parts have meaning for re-use, in a way it couidn't if the re-use was random. That said, it *is* subtle, and people should be careful. > So the re-use might initialize the fields lazily, not necessarily using a ctor. If you have a well-defined refcount, and use "atomic_inc_not_zero()" to guard the speculative RCU access section, and use "atomic_dec_and_test()" in the freeing section, then you should be safe wrt new allocations. If you have a completely new allocation that has "random stale content", you know that it cannot be on the RCU list, so there is no speculative access that can ever see that random content. So the only case you need to worry about is a re-use allocation, and you know that the refcount will start out as zero even if you don't have a constructor. So you can think of the refcount itself as always having a zero constructor, *BUT* you need to be careful with ordering. In particular, whoever does the allocation needs to then set the refcount to a non-zero value *after* it has initialized all the other fields. And in particular, it needs to make sure that it uses the proper memory ordering to do so. NOTE! One thing to be very worried about is that re-initializing whatever RCU lists means that now the RCU walker may be walking on the wrong list so the walker may do the right thing for this particular entry, but it may miss walking *other* entries. So then you can get spurious lookup failures, because the RCU walker never walked all the way to the end of the right list. That ends up being a much more subtle bug. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191122094924.629690-1-chris@chris-wilson.co.uk
|
#
ef468849 |
|
17-Aug-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Propagate fence errors Errors spread like wildfire, and must eventually be returned to the user. They need to be captured and passed along the flow of fences, infecting each in turn with the existing error, until finally they fall out of a user visible result. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190817232511.11391-1-chris@chris-wilson.co.uk
|
#
52791eee |
|
11-Aug-2019 |
Christian König <christian.koenig@amd.com> |
dma-buf: rename reservation_object to dma_resv Be more consistent with the naming of the other DMA-buf objects. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/323401/
|
#
ea593dbb |
|
22-Mar-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Allow contexts to share a single timeline across all engines Previously, our view has been always to run the engines independently within a context. (Multiple engines happened before we had contexts and timelines, so they always operated independently and that behaviour persisted into contexts.) However, at the user level the context often represents a single timeline (e.g. GL contexts) and userspace must ensure that the individual engines are serialised to present that ordering to the client (or forgot about this detail entirely and hope no one notices - a fair ploy if the client can only directly control one engine themselves ;) In the next patch, we will want to construct a set of engines that operate as one, that have a single timeline interwoven between them, to present a single virtual engine to the user. (They submit to the virtual engine, then we decide which engine to execute on based.) To that end, we want to be able to create contexts which have a single timeline (fence context) shared between all engines, rather than multiple timelines. v2: Move the specialised timeline ordering to its own function. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190322092325.5883-4-chris@chris-wilson.co.uk
|
#
e8861964 |
|
01-Mar-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Use HW semaphores for inter-engine synchronisation on gen8+ Having introduced per-context seqno, we now have a means to identity progress across the system without feel of rollback as befell the global_seqno. That is we can program a MI_SEMAPHORE_WAIT operation in advance of submission safe in the knowledge that our target seqno and address is stable. However, since we are telling the GPU to busy-spin on the target address until it matches the signaling seqno, we only want to do so when we are sure that busy-spin will be completed quickly. To achieve this we only submit the request to HW once the signaler is itself executing (modulo preemption causing us to wait longer), and we only do so for default and above priority requests (so that idle priority tasks never themselves hog the GPU waiting for others). As might be reasonably expected, HW semaphores excel in inter-engine synchronisation microbenchmarks (where the 3x reduced latency / increased throughput more than offset the power cost of spinning on a second ring) and have significant improvement (can be up to ~10%, most see no change) for single clients that utilize multiple engines (typically media players and transcoders), without regressing multiple clients that can saturate the system or changing the power envelope dramatically. v3: Drop the older NEQ branch, now we pin the signaler's HWSP anyway. v4: Tell the world and include it as part of scheduler caps. Testcase: igt/gem_exec_whisper Testcase: igt/benchmarks/gem_wsim Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190301170901.8340-3-chris@chris-wilson.co.uk
|
#
635b3bc6 |
|
28-Nov-2018 |
Jonathan Gray <jsg@jsg.id.au> |
drm/i915: change i915_sw_fence license to MIT Change the license of the i915_sw_fence files to MIT matching most of the other i915 files. This makes it possible to use them in a new port of i915 to OpenBSD. Besides some mechanical tree wide changes Chris Wilson is the sole author of these files with Intel holding the copyright. Intel's legal team have given permission to change the license according to Joonas Lahtinen. v2: expand commit message and note permission from Intel legal Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20181129013051.17525-1-jsg@jsg.id.au
|
#
ac6424b9 |
|
19-Jun-2017 |
Ingo Molnar <mingo@kernel.org> |
sched/wait: Rename wait_queue_t => wait_queue_entry_t Rename: wait_queue_t => wait_queue_entry_t 'wait_queue_t' was always a slight misnomer: its name implies that it's a "queue", but in reality it's a queue *entry*. The 'real' queue is the wait queue head, which had to carry the name. Start sorting this out by renaming it to 'wait_queue_entry_t'. This also allows the real structure name 'struct __wait_queue' to lose its double underscore and become 'struct wait_queue_entry', which is the more canonical nomenclature for such data types. Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
9310cb7f |
|
17-May-2017 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Remove kref from i915_sw_fence My original intention was for i915_sw_fence to be the base class and provide the reference count for the container. This was from starting with a design to handle async_work. In practice, for i915 we embed fences into structs which have their own independent reference counting, making the i915_sw_fence.kref duplicitous. If we remove the kref, we remove the i915_sw_fence's ability to free itself and its independence, it can only exist within a container and must be supplied with a callback to handle its release. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-1-chris@chris-wilson.co.uk
|
#
0e932c08 |
|
25-Nov-2016 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Hold a reference on the request for its fence chain Currently, we have an active reference for the request until it is retired. Though it cannot be retired before it has been executed by hardware, the request may be completed before we have finished processing the execute fence, i.e. we may continue to process that fence as we free the request. Fixes: 5590af3e115a ("drm/i915: Drive request submission through fence callbacks") Fixes: 23902e49c999 ("drm/i915: Split request submit/execute phase into two") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161125131718.20978-3-chris@chris-wilson.co.uk (cherry picked from commit 48bc2a4a427ad81578f887d71d45794619a77211) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
#
fc158405 |
|
25-Nov-2016 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Integrate i915_sw_fence with debugobjects Add the tracking required to enable debugobjects for fences to improve error detection in BAT. The debugobject interface lets us track the lifetime and phases of the fences even while being embedded into larger structs, i.e. to check they are not used after they have been released. v2: Don't populate the stubs, debugobjects checks for a NULL pointer and treats it equivalently. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161125131718.20978-4-chris@chris-wilson.co.uk
|
#
48bc2a4a |
|
25-Nov-2016 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Hold a reference on the request for its fence chain Currently, we have an active reference for the request until it is retired. Though it cannot be retired before it has been executed by hardware, the request may be completed before we have finished processing the execute fence, i.e. we may continue to process that fence as we free the request. Fixes: 5590af3e115a ("drm/i915: Drive request submission through fence callbacks") Fixes: 23902e49c999 ("drm/i915: Split request submit/execute phase into two") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161125131718.20978-3-chris@chris-wilson.co.uk
|
#
556b7487 |
|
14-Nov-2016 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Give each sw_fence its own lockclass Localise the static struct lock_class_key to the caller of i915_sw_fence_init() so that we create a lock_class instance for each unique sw_fence rather than all sw_fences sharing the same lock_class. This eliminate some lockdep false positive when using fences from within fence callbacks. For the relatively small number of fences currently in use [2], this adds 160 bytes of unused text/code when lockdep is disabled. This seems quite high, but fully reducing it via ifdeffery is also quite ugly. Removing the #fence strings saves 72 bytes with just a single #ifdef. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161114204105.29171-1-chris@chris-wilson.co.uk
|
#
7e941861 |
|
28-Oct-2016 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Allow i915_sw_fence_await_sw_fence() to allocate In forthcoming patches, we want to be able to dynamically allocate the wait_queue_t used whilst awaiting. This is more convenient if we extend the i915_sw_fence_await_sw_fence() to perform the allocation for us if we pass in a gfp mask as an alternative than a preallocated struct. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-2-chris@chris-wilson.co.uk
|
#
f54d1867 |
|
25-Oct-2016 |
Chris Wilson <chris@chris-wilson.co.uk> |
dma-buf: Rename struct fence to dma_fence I plan to usurp the short name of struct fence for a core kernel struct, and so I need to rename the specialised fence/timeline for DMA operations to make room. A consensus was reached in https://lists.freedesktop.org/archives/dri-devel/2016-July/113083.html that making clear this fence applies to DMA operations was a good thing. Since then the patch has grown a bit as usage increases, so hopefully it remains a good thing! (v2...: rebase, rerun spatch) v3: Compile on msm, spotted a manual fixup that I broke. v4: Try again for msm, sorry Daniel coccinelle script: @@ @@ - struct fence + struct dma_fence @@ @@ - struct fence_ops + struct dma_fence_ops @@ @@ - struct fence_cb + struct dma_fence_cb @@ @@ - struct fence_array + struct dma_fence_array @@ @@ - enum fence_flag_bits + enum dma_fence_flag_bits @@ @@ ( - fence_init + dma_fence_init | - fence_release + dma_fence_release | - fence_free + dma_fence_free | - fence_get + dma_fence_get | - fence_get_rcu + dma_fence_get_rcu | - fence_put + dma_fence_put | - fence_signal + dma_fence_signal | - fence_signal_locked + dma_fence_signal_locked | - fence_default_wait + dma_fence_default_wait | - fence_add_callback + dma_fence_add_callback | - fence_remove_callback + dma_fence_remove_callback | - fence_enable_sw_signaling + dma_fence_enable_sw_signaling | - fence_is_signaled_locked + dma_fence_is_signaled_locked | - fence_is_signaled + dma_fence_is_signaled | - fence_is_later + dma_fence_is_later | - fence_later + dma_fence_later | - fence_wait_timeout + dma_fence_wait_timeout | - fence_wait_any_timeout + dma_fence_wait_any_timeout | - fence_wait + dma_fence_wait | - fence_context_alloc + dma_fence_context_alloc | - fence_array_create + dma_fence_array_create | - to_fence_array + to_dma_fence_array | - fence_is_array + dma_fence_is_array | - trace_fence_emit + trace_dma_fence_emit | - FENCE_TRACE + DMA_FENCE_TRACE | - FENCE_WARN + DMA_FENCE_WARN | - FENCE_ERR + DMA_FENCE_ERR ) ( ... ) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk> Acked-by: Sumit Semwal <sumit.semwal@linaro.org> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20161025120045.28839-1-chris@chris-wilson.co.uk
|
#
e68a139f |
|
09-Sep-2016 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Add a sw fence for collecting up dma fences This is really a core kernel struct in disguise until we can finally place it in kernel/. There is an immediate need for a fence collection mechanism that is more flexible than fence-array, in particular being able to easily drive request submission via events (and not just interrupt driven). The same mechanism would be useful for handling nonblocking and asynchronous atomic modesets, parallel execution and more, but for the time being just create a local sw fence for execbuf. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160909131201.16673-1-chris@chris-wilson.co.uk
|