#
8a922cf1 |
|
22-Sep-2023 |
Kees Cook <keescook@chromium.org> |
drm/i915/selftests: Annotate struct perf_series with __counted_by Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct perf_series. [1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: John Harrison <john.c.harrison@Intel.com> Cc: Andi Shyti <andi.shyti@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: intel-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230922173216.3823169-3-keescook@chromium.org
|
#
6e921328 |
|
07-Mar-2023 |
Tejas Upadhyay <tejas.upadhyay@intel.com> |
drm/i915/selftest: Remove avoidable init assignment We can skip the assignment and i915 variable altogether and use refernce directly. Also used at single place only. Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230307094643.532271-1-tejas.upadhyay@intel.com
|
#
4ce0c8e7 |
|
27-Feb-2023 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915/selftests: Fix live_requests for all engines After the abandonment of i915->kernel_context and since we have started to create per-gt engine->kernel_context, these tests need to be updated to instantiate the batch buffer VMA in the correct PPGTT for the context used to execute each spinner. v2(Tejas): - Clean commit message - Matt - Add BUG_ON to match vm v3(Tejas): - Fix dim checkpatch warnings Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230228044307.191639-1-tejas.upadhyay@intel.com
|
#
8e4ee5e8 |
|
30-Nov-2022 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Wrap all access to i915_vma.node.start|size We already wrap i915_vma.node.start for use with the GGTT, as there we can perform additional sanity checks that the node belongs to the GGTT and fits within the 32b registers. In the next couple of patches, we will introduce guard pages around the objects _inside_ the drm_mm_node allocation. That is we will offset the vma->pages so that the first page is at drm_mm_node.start + vma->guard (not 0 as is currently the case). All users must then not use i915_vma.node.start directly, but compute the guard offset, thus all users are converted to use a i915_vma_offset() wrapper. The notable exceptions are the selftests that are testing exact behaviour of i915_vma_pin/i915_vma_insert. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com> Co-developed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221130235805.221010-3-andi.shyti@linux.intel.com
|
#
2a76fc89 |
|
19-Oct-2022 |
Andrzej Hajda <andrzej.hajda@intel.com> |
drm/i915: call i915_request_await_object from _i915_vma_move_to_active Since almost all calls to i915_vma_move_to_active are prepended with i915_request_await_object, let's call the latter from _i915_vma_move_to_active by default and add flag allowing bypassing it. Adjust all callers accordingly. The patch should not introduce functional changes. Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221019215906.295296-2-andrzej.hajda@intel.com
|
#
6f85403e |
|
02-Nov-2022 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/selftests: Reduce oversaturation of request smoketesting The goal in launching the request smoketest is to have sufficient tasks running across the system such that we are likely to detect concurrency issues. We aim to have 2 tasks using the same engine, gt, device (each level of locking around submission and signaling) running at the same time. While tasks may not be running all the time as they synchronise with the gpu, they will be running most of the time, in which case having many more tasks than cores available is wasteful (and dramatically increases the workload causing excess runtime). Aim to limit the number of tasks such that there is at least 2 running per engine, spreading surplus cores around the engines (rather than running a task per core per engine.) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Tested-by: Nirmoy Das <nirmoy.das@intel.com> Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221102155709.31717-1-nirmoy.das@intel.com
|
#
6407cf53 |
|
20-Oct-2022 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915/selftests: Stop using kthread_stop() Since a7c01fa93aeb ("signal: break out of wait loops on kthread_stop()") kthread_stop() started asserting a pending signal which wreaks havoc with a few of our selftests. Mainly because they are not fully expecting to handle signals, but also cutting the intended test runtimes short due signal_pending() now returning true (via __igt_timeout), which therefore breaks both the patterns of: kthread_run() ..sleep for igt_timeout_ms to allow test to exercise stuff.. kthread_stop() And check for errors recorded in the thread. And also: Main thread | Test thread ---------------+------------------------------ kthread_run() | kthread_stop() | do stuff until __igt_timeout | -- exits early due signal -- Where this kthread_stop() was assume would have a "join" semantics, which it would have had if not the new signal assertion issue. To recap, threads are now likely to catch a previously impossible ERESTARTSYS or EINTR, marking the test as failed, or have a pointlessly short run time. To work around this start using kthread_work(er) API which provides an explicit way of waiting for threads to exit. And for cases where parent controls the test duration we add explicit signaling which threads will now use instead of relying on kthread_should_stop(). Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221020130841.3845791-1-tvrtko.ursulin@linux.intel.com
|
#
61faec5f |
|
16-Aug-2022 |
Matthew Brost <matthew.brost@intel.com> |
drm/i915/selftests: Use correct selfest calls for live tests This will help in an upcoming patch where the live selftest wrappers are extended to do more. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com> Reviewed-by: John Harrison <john.c.harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220817020511.2180747-2-alan.previn.teres.alexis@intel.com
|
#
b25c377a |
|
15-Jul-2022 |
Jason Wang <wangborong@cdjrlc.com> |
drm/i915/selftests: Fix comment typo Fix the double `wait' typo in comment. Signed-off-by: Jason Wang <wangborong@cdjrlc.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220716040520.31676-1-wangborong@cdjrlc.com
|
#
b508d01f |
|
10-Feb-2022 |
Jani Nikula <jani.nikula@intel.com> |
drm/i915: split out i915_gem_internal.h from i915_drv.h We already have the i915_gem_internal.c file. Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/6715d1f3232c445990630bb3aac00f279f516fee.1644507885.git.jani.nikula@intel.com
|
#
4e683546 |
|
13-Jan-2022 |
Matthew Brost <matthew.brost@intel.com> |
drm/i915/selftests: Add a cancel request selftest that triggers a reset Add a cancel request selftest that results in an engine reset to cancel the request as it is non-preemptable. Also insert a NOP request after the cancelled request and confirm that it completes successfully. v2: (Tvrtko) - Skip test if preemption timeout compiled out - Skip test if engine reset isn't supported - Update debug prints to be more descriptive v3: - Add comment explaining test v4: (John Harrison) - Fix typos in comment explaining test - goto out_rq is NOP creation fails Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220113181351.21296-2-matthew.brost@intel.com
|
#
17190a34 |
|
19-Dec-2021 |
Michał Winiarski <michal.winiarski@intel.com> |
drm/i915/selftests: Use to_gt() helper for GGTT accesses GGTT is currently available both through i915->ggtt and gt->ggtt, and we eventually want to get rid of the i915->ggtt one. Use to_gt() for all i915->ggtt accesses to help with the future refactoring. Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Sujaritha Sundaresan <sujaritha.sundaresan@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211219212500.61432-6-andi.shyti@linux.intel.com
|
#
8c2699fa |
|
14-Dec-2021 |
Andi Shyti <andi.shyti@linux.intel.com> |
drm/i915/selftests: Use to_gt() helper Use to_gt() helper consistently throughout the codebase. Pure mechanical s/i915->gt/to_gt(i915). No functional changes. Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211214193346.21231-8-andi.shyti@linux.intel.com
|
#
49a8bf50 |
|
30-Nov-2021 |
Zhou Qingyang <zhou1615@umn.edu> |
drm/i915/gem: Fix a NULL pointer dereference in igt_request_rewind() In igt_request_rewind(), mock_context(i915, "A") is assigned to ctx[0] and used in i915_gem_context_get_engine(). There is a dereference of ctx[0] in i915_gem_context_get_engine(), which could lead to a NULL pointer dereference on failure of mock_context(i915, "A") . So as mock_context(i915, "B"). Although this bug is not serious for it belongs to testing code, it is better to be fixed to avoid unexpected failure in testing. Fix this bugs by adding checks about ctx[0] and ctx[1]. This bug was found by a static analyzer. The analysis employs differential checking to identify inconsistent security operations (e.g., checks or kfrees) between two code paths and confirms that the inconsistent operations are not recovered in the current function or the callers, so they constitute bugs. Note that, as a bug found by static analysis, it can be a false positive or hard to trigger. Multiple researchers have cross-reviewed the bug. Builds with CONFIG_DRM_I915_SELFTEST=y show no new warnings, and our static analyzer no longer warns about this code. References: 591c0fb85d1c ("drm/i915: Exercise request cancellation using a mock selftest") [tursulin: Replaced fixes with references to avoid.] Signed-off-by: Zhou Qingyang <zhou1615@umn.edu> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211130141545.153899-1-zhou1615@umn.edu
|
#
7c287113 |
|
11-Oct-2021 |
Matthew Brost <matthew.brost@intel.com> |
drm/i915/selftests: Increase timeout in requests perf selftest perf_parallel_engines is micro benchmark to test i915 request scheduling. The test creates a thread per physical engine and submits NOP requests and waits the requests to complete in a loop. In execlists mode this works perfectly fine as powerful CPU has enough cores to feed each engine and process the CSBs. With GuC submission the uC gets overwhelmed as all threads feed into a single CTB channel and the GuC gets bombarded with CSBs as contexts are immediately switched in and out on the engines due to the zero runtime of the requests. When the GuC is overwhelmed scheduling of contexts is unfair due to the nature of the GuC scheduling algorithm. This behavior is understood and deemed acceptable as this micro benchmark isn't close to real world use case. Increasing the timeout of wait period for requests to complete. This makes the test understand that is ok for contexts to get starved in this scenario. A future patch / cleanup may just delete these micro benchmark tests as they basically mean nothing. We care about real workloads not made up ones. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211011175704.28509-1-matthew.brost@intel.com
|
#
716c61c8 |
|
26-Jul-2021 |
Matthew Brost <matthew.brost@intel.com> |
drm/i915/selftest: Increase some timeouts in live_requests Requests may take slightly longer with GuC submission, let's increase the timeouts in live_requests. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-30-matthew.brost@intel.com
|
#
651e7d48 |
|
05-Jun-2021 |
Lucas De Marchi <lucas.demarchi@intel.com> |
drm/i915: replace IS_GEN and friends with GRAPHICS_VER This was done by the following semantic patch: @@ expression i915; @@ - INTEL_GEN(i915) + GRAPHICS_VER(i915) @@ expression i915; expression E; @@ - INTEL_GEN(i915) >= E + GRAPHICS_VER(i915) >= E @@ expression dev_priv; expression E; @@ - !IS_GEN(dev_priv, E) + GRAPHICS_VER(dev_priv) != E @@ expression dev_priv; expression E; @@ - IS_GEN(dev_priv, E) + GRAPHICS_VER(dev_priv) == E @@ expression dev_priv; expression from, until; @@ - IS_GEN_RANGE(dev_priv, from, until) + IS_GRAPHICS_VER(dev_priv, from, until) @def@ expression E; identifier id =~ "^gen$"; @@ - id = GRAPHICS_VER(E) + ver = GRAPHICS_VER(E) @@ identifier def.id; @@ - id + ver It also takes care of renaming the variable we assign to GRAPHICS_VER() so to use "ver" rather than "gen". Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210606045050.103862-2-lucas.demarchi@intel.com
|
#
8f4caef8 |
|
01-Jun-2021 |
Zhihao Cheng <chengzhihao1@huawei.com> |
drm/i915/selftests: Fix return value check in live_breadcrumbs_smoketest() In case of error, the function live_context() returns ERR_PTR() and never returns NULL. The NULL test in the return value check should be replaced with IS_ERR(). Fixes: 52c0fdb25c7c ("drm/i915: Replace global breadcrumbs with per-context interrupt tracking") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/33c46ef24cd547d0ad21dc106441491a@intel.com [tursulin: Wrap commit text, fix Fixes: tag.] Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
#
10c1f0cb |
|
01-Jun-2021 |
Zhihao Cheng <chengzhihao1@huawei.com> |
drm/i915/selftests: Fix return value check in live_breadcrumbs_smoketest() In case of error, the function live_context() returns ERR_PTR() and never returns NULL. The NULL test in the return value check should be replaced with IS_ERR(). Fixes: 52c0fdb25c7c ("drm/i915: Replace global breadcrumbs with per-context interrupt tracking") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/33c46ef24cd547d0ad21dc106441491a@intel.com [tursulin: Wrap commit text, fix Fixes: tag.] Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> (cherry picked from commit 8f4caef8d5401b42c6367d46c23da5e0e8111516) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
#
38b237ea |
|
23-Mar-2021 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Individual request cancellation Currently, we cancel outstanding requests within a context when the context is closed. We may also want to cancel individual requests using the same graceful preemption mechanism. v2 (Tvrtko): * Cancel waiters carefully considering no timeline lock and RCU. * Fixed selftests. v3 (Tvrtko): * Remove error propagation to waiters for now. v4 (Tvrtko): * Rebase for extracted i915_request_active_engine. (Matt) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> [danvet: Resolve conflict because intel_engine_flush_scheduler is still called intel_engine_flush_submission] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210324121335.2307063-3-tvrtko.ursulin@linux.intel.com
|
#
aa8b70be |
|
23-Mar-2021 |
Maarten Lankhorst <maarten.lankhorst@linux.intel.com> |
drm/i915/selftests: Prepare i915_request tests for obj->mm.lock removal Straightforward conversion by using unlocked versions. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-58-maarten.lankhorst@linux.intel.com
|
#
16f2941a |
|
24-Dec-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Replace direct submit with direct call to tasklet Rather than having special case code for opportunistically calling process_csb() and performing a direct submit while holding the engine spinlock for submitting the request, simply call the tasklet directly. This allows us to retain the direct submission path, including the CS draining to allow fast/immediate submissions, without requiring any duplicated code paths, and most importantly greatly simplifying the control flow by removing reentrancy. This will enable us to close a few races in the virtual engines in the next few patches. The trickiest part here is to ensure that paired operations (such as schedule_in/schedule_out) remain under consistent locking domains, e.g. when pulled outside of the engine->active.lock v2: Use bh kicking, see commit 3c53776e29f8 ("Mark HI and TASKLET softirq synchronous"). v3: Update engine-reset to be tasklet aware Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201224135544.1713-1-chris@chris-wilson.co.uk
|
#
f170523a |
|
22-Dec-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Consolidate the CS timestamp clocks Pull the GT clock information [used to derive CS timestamps and PM interval] under the GT so that is it local to the users. In doing so, we consolidate the two references for the same information, of which the runtime-info took note of a potential clock source override and scaling factors. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201223122359.22562-2-chris@chris-wilson.co.uk
|
#
19384452 |
|
16-Nov-2020 |
Zhang Xiaoxu <zhangxiaoxu5@huawei.com> |
drm/i915/selftests: Fix wrong return value of perf_request_latency() If intel context create failed, the perf_request_latency() will return 0 rather than error, because we doesn't initialize the return value. Fixes: 25c26f18ea79 ("drm/i915/selftests: Measure dispatch latency") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20201116143540.3648870-1-zhangxiaoxu5@huawei.com
|
#
01d70884 |
|
16-Nov-2020 |
Zhang Xiaoxu <zhangxiaoxu5@huawei.com> |
drm/i915/selftests: Fix wrong return value of perf_series_engines() If intel context create failed, the perf_series_engines() will return 0 rather than error, because we doesn't initialize the return value. Fixes: cbfd3a0c5a55 ("drm/i915/selftests: Add request throughput measurement to perf") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20201116144112.3673011-1-zhangxiaoxu5@huawei.com
|
#
b5462cc3 |
|
16-Nov-2020 |
Zhang Xiaoxu <zhangxiaoxu5@huawei.com> |
drm/i915/selftests: Fix wrong return value of perf_request_latency() If intel context create failed, the perf_request_latency() will return 0 rather than error, because we doesn't initialize the return value. Fixes: 25c26f18ea79 ("drm/i915/selftests: Measure dispatch latency") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20201116143540.3648870-1-zhangxiaoxu5@huawei.com (cherry picked from commit 19384452052a1e0525e663bfbdd62ac1399bb647) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
#
2106edbd |
|
16-Nov-2020 |
Zhang Xiaoxu <zhangxiaoxu5@huawei.com> |
drm/i915/selftests: Fix wrong return value of perf_series_engines() If intel context create failed, the perf_series_engines() will return 0 rather than error, because we doesn't initialize the return value. Fixes: cbfd3a0c5a55 ("drm/i915/selftests: Add request throughput measurement to perf") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20201116144112.3673011-1-zhangxiaoxu5@huawei.com (cherry picked from commit 01d708840c26c9532579677eaca942363a009fd5) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
#
82be0d75 |
|
18-Sep-2020 |
Daniel Vetter <daniel.vetter@ffwll.ch> |
drm/i915/selftest: Create mock_destroy_device Just some prep work before we rework the lifetime handling, which requires replacing all the drm_dev_put in selftests by something else. v2: Don't go with a static inline, upsets the header tests and separation. Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200918132505.2316382-2-daniel.vetter@ffwll.ch
|
#
15b6c924 |
|
19-Aug-2020 |
Maarten Lankhorst <maarten.lankhorst@linux.intel.com> |
drm/i915: Move i915_vma_lock in the selftests to avoid lock inversion, v3. Make sure vma_lock is not used as inner lock when kernel context is used, and add ww handling where appropriate. Ensure that execbuf selftests keep passing by using ww handling. Changes since v2: - Fix i915_gem_context finally. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-22-maarten.lankhorst@linux.intel.com Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
#
e714977e |
|
01-Aug-2020 |
Tianjia Zhang <tianjia.zhang@linux.alibaba.com> |
drm/i915: Fix wrong return value In function i915_active_acquire_preallocate_barrier(), not all paths have the return value set correctly, and in case of memory allocation failure, a negative error code should be returned. Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200802115655.25568-1-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
#
792592e7 |
|
07-Jul-2020 |
Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> |
drm/i915: Move the engine mask to intel_gt_info Since the engines belong to the GT, move the runtime-updated list of available engines to the intel_gt struct. The original mask has been renamed to indicate it contains the maximum engine list that can be found on a matching device. In preparation for other info being moved to the gt in follow up patches (sseu), introduce an intel_gt_info structure to group all gt-related runtime info. v2: s/max_engine_mask/platform_engine_mask (tvrtko), fix selftest Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Andi Shyti <andi.shyti@intel.com> Cc: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> #v1 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200708003952.21831-5-daniele.ceraolospurio@intel.com
|
#
810b7ee3 |
|
17-Jun-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Always report the sample time for busy-stats Return the monotonic timestamp (ktime_get()) at the time of sampling the busy-time. This is used in preference to taking ktime_get() separately before or after the read seqlock as there can be some large variance in reported timestamps. For selftests trying to ascertain that we are reporting accurate to within a few microseconds, even a small delay leads to the test failing. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200617130916.15261-2-chris@chris-wilson.co.uk
|
#
1b90e4a4 |
|
17-Jun-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/selftests: Enable selftesting of busy-stats A couple of very simple tests to ensure that the basic properties of per-engine busyness accounting [0% and 100% busy] are faithful. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200617130916.15261-1-chris@chris-wilson.co.uk
|
#
25c26f18 |
|
19-May-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/selftests: Measure dispatch latency A useful metric of the system's health is how fast we can tell the GPU to do various actions, so measure our latency. v2: Refactor all the instruction building into emitters. v3: Mark the error handling if not perfect, at least consistent. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200519130802.4067-1-chris@chris-wilson.co.uk
|
#
9bad40a2 |
|
11-May-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/selftests: Always flush before unpining after writing Be consistent, and even when we know we had used a WC, flush the mapped object after writing into it. The flush understands the mapping type and will only clflush if !I915_MAP_WC, but will always insert a wmb [sfence] so that we can be sure that all writes are visible. v2: Add the unconditional wmb so we are know that we always flush the writes to memory/HW at that point. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200511141304.599-1-chris@chris-wilson.co.uk
|
#
b0a997ae |
|
10-May-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Emit await(batch) before MI_BB_START Be consistent and ensure that we always emit the asynchronous waits prior to issuing instructions that use the address. This ensures that if we do emit GPU commands to do the await, they are before our use! Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200510102431.21959-1-chris@chris-wilson.co.uk
|
#
426d0073 |
|
29-Apr-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Always enable busy-stats for execlists In the near future, we will utilize the busy-stats on each engine to approximate the C0 cycles of each, and use that as an input to a manual RPS mechanism. That entails having busy-stats always enabled and so we can remove the enable/disable routines and simplify the pmu setup. As a consequence of always having the stats enabled, we can also show the current active time via sysfs/engine/xcs/active_time_ns. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200429205446.3259-1-chris@chris-wilson.co.uk
|
#
cbfd3a0c |
|
22-Apr-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/selftests: Add request throughput measurement to perf Under ideal circumstances, the driver should be able to keep the GPU fully saturated with work. Measure how close to ideal we get under the harshest of conditions with no user payload. v2: Also measure throughput using only one thread. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422074203.9799-1-chris@chris-wilson.co.uk
|
#
73c8bfb7 |
|
25-Mar-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Drop final few uses of drm_i915_private.engine We've migrated all the heavy users over to the intel_gt, and can finally drop the last few users and with that the mirror in dev_priv->engine[]. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200325234803.6175-1-chris@chris-wilson.co.uk
|
#
2386b492 |
|
19-Mar-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Prefer '%ps' for printing function symbol names %pS includes the offset, which is useful for return addresses but noise when we are pretty printing a known (and expected) function entry point. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200319091943.7815-1-chris@chris-wilson.co.uk
|
#
e6ba7648 |
|
21-Dec-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Remove i915->kernel_context Allocate only an internal intel_context for the kernel_context, forgoing a global GEM context for internal use as we only require a separate address space (for our own protection). Now having weaned GT from requiring ce->gem_context, we can stop referencing it entirely. This also means we no longer have to create random and unnecessary GEM contexts for internal use. GEM contexts are now entirely for tracking GEM clients, and intel_context the execution environment on the GPU. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Acked-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191221160324.1073045-1-chris@chris-wilson.co.uk
|
#
86ca2bf2 |
|
13-Dec-2019 |
Dan Carpenter <dan.carpenter@oracle.com> |
drm/i915/selftests: remove a condition We know that "err" is non-zero so there is no need to check. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191213105050.y2v5nylsuxvc44jj@kili.mountain
|
#
de5825be |
|
25-Nov-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Serialise with engine-pm around requests on the kernel_context As the engine->kernel_context is used within the engine-pm barrier, we have to be careful when emitting requests outside of the barrier, as the strict timeline locking rules do not apply. Instead, we must ensure the engine_park() cannot be entered as we build the request, which is simplest by taking an explicit engine-pm wakeref around the request construction. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191125105858.1718307-1-chris@chris-wilson.co.uk
|
#
3b054a1c |
|
23-Nov-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/selftests: Include the subsubtest name for live_parallel_engines Include the name of the failing subsubtest, should it fails. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191123191547.925360-1-chris@chris-wilson.co.uk
|
#
a8c9a7f5 |
|
07-Nov-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/selftests: Complete transition to a real struct file mock Since drm provided us with a real struct file we can use for our anonymous internal clients (mock_file), complete our transition to using that as the primary interface (and not the mocked up struct drm_file we previous were using). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191107213929.23286-1-chris@chris-wilson.co.uk
|
#
85ca528e |
|
07-Nov-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/selftests: Replace mock_file hackery with drm's true fake As drm now exports a method to create an anonymous struct file around a drm_device for internal use, make use of it to avoid our horrible hacks. Danial suggested that the mock_file_put() wrapper was suitable for drm-core, along with the mock_drm_getfile() [and that the vestigal mock_drm_file() in this patch should perhaps be the drm interface itself]. However, the eventual goal is to remove the mock_drm_file() and use the struct file and fput() directly, in this patch we take a simple transition in that direction. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20191107180601.30815-3-chris@chris-wilson.co.uk
|
#
f05816cb |
|
01-Nov-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/selftests: Spin on all engines simultaneously Vanshidhar Konda asked for the simplest test "to verify that the kernel can submit and hardware can execute batch buffers on all the command streamers in parallel." We have a number of tests in userspace that submit load to each engine and verify that it is present, but strictly we have no selftest to prove that the kernel can _simultaneously_ execute on all known engines. (We have tests to demonstrate that we can submit to HW in parallel, but we don't insist that they execute in parallel.) v2: Improve the igt_spinner support for older gen. Suggested-by: Vanshidhar Konda <vanshidhar.r.konda@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Vanshidhar Konda <vanshidhar.r.konda@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Vanshidhar Konda <vanshidhar.r.konda@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191101101528.10553-1-chris@chris-wilson.co.uk
|
#
e5661c6a |
|
01-Nov-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/selftests: Start kthreads before stopping An interesting observation made with our parallel selftests was that on our small/single cpu systems we would call kthread_stop() before the kthreads were spawned. If this happens, the kthread is never run at all; completely bypassing the test. A simple yield() from the parent will ensure that all children have the opportunity to start before we reap them. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191101084940.31838-1-chris@chris-wilson.co.uk
|
#
e9768bfe |
|
16-Oct-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/selftests: Teach requests to use all available engines The request selftests straddle the boundary between checking the driver and the hardware. They are subject to the quirks of the underlying HW, but operate on top of the backend abstractions. The tests focus on the scheduler elements and so should check for interactions of the scheduler across all exposed engines. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191016125236.17960-1-chris@chris-wilson.co.uk
|
#
a4e7ccda |
|
04-Oct-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Move context management under GEM Keep track of the GEM contexts underneath i915->gem.contexts and assign them their own lock for the purposes of list management. v2: Focus on lock tracking; ctx->vm is protected by ctx->mutex v3: Correct split with removal of logical HW ID Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-15-chris@chris-wilson.co.uk
|
#
7e805762 |
|
04-Oct-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Drop struct_mutex from around i915_retire_requests() We don't need to hold struct_mutex now for retiring requests, so drop it from i915_retire_requests() and i915_gem_wait_for_idle(), finally removing I915_WAIT_LOCKED for good. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-8-chris@chris-wilson.co.uk
|
#
2850748e |
|
04-Oct-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Pull i915_vma_pin under the vm->mutex Replace the struct_mutex requirement for pinning the i915_vma with the local vm->mutex instead. Note that the vm->mutex is tainted by the shrinker (we require unbinding from inside fs-reclaim) and so we cannot allocate while holding that mutex. Instead we have to preallocate workers to do allocate and apply the PTE updates after we have we reserved their slot in the drm_mm (using fences to order the PTE writes with the GPU work and with later unbind). In adding the asynchronous vma binding, one subtle requirement is to avoid coupling the binding fence into the backing object->resv. That is the asynchronous binding only applies to the vma timeline itself and not to the pages as that is a more global timeline (the binding of one vma does not need to be ordered with another vma, nor does the implicit GEM fencing depend on a vma, only on writes to the backing store). Keeping the vma binding distinct from the backing store timelines is verified by a number of async gem_exec_fence and gem_exec_schedule tests. The way we do this is quite simple, we keep the fence for the vma binding separate and only wait on it as required, and never add it to the obj->resv itself. Another consequence in reducing the locking around the vma is the destruction of the vma is no longer globally serialised by struct_mutex. A natural solution would be to add a kref to i915_vma, but that requires decoupling the reference cycles, possibly by introducing a new i915_mm_pages object that is own by both obj->mm and vma->pages. However, we have not taken that route due to the overshadowing lmem/ttm discussions, and instead play a series of complicated games with trylocks to (hopefully) ensure that only one destruction path is called! v2: Add some commentary, and some helpers to reduce patch churn. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-4-chris@chris-wilson.co.uk
|
#
a3f56e7d |
|
25-Sep-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/selftests: Exercise concurrent submission to all engines The simplest and most maximal submission we can do, a thread to submit requests unto each engine. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190925193446.26007-1-chris@chris-wilson.co.uk
|
#
8e40983d |
|
21-Aug-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/selftests: Fixup a couple of missing serialisation with vma In commit 70d6894d1456 ("drm/i915: Serialize against vma moves") I managed to miss a couple of i915_vma_move_to_active() that had not serialised against an async vma pinning. Add the missing i915_request_await. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190821193851.18232-1-chris@chris-wilson.co.uk
|
#
70d6894d |
|
18-Aug-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Serialize against vma moves Make sure that when submitting requests, we always serialize against potential vma moves and clflushes. Time for a i915_request_await_vma() interface! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190819112033.30638-1-chris@chris-wilson.co.uk
|
#
cbb153c5 |
|
08-Aug-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/selftests: Fixup a missing legacy_idx Grr, missed one*. For using the legacy engine map, we should use engine->legacy_idx. Ideally, we should know the intel_context in the selftest and avoid all the fiddling around with unwanted GEM contexts. * In my defence, the conflict was added in another patch after it was tested by CI. v2: mock engines needs legacy love as well Fixes: f1c4d157ab9b ("drm/i915: Fix up the inverse mapping for default ctx->engines[]") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190808194525.9410-2-chris@chris-wilson.co.uk
|
#
ca883c30 |
|
07-Aug-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/selftests: Pass intel_context to mock_request Modernise the mock_request factory to take intel_context not a (GEM context, intel_engine_cs) tuple. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190808115640.20552-1-chris@chris-wilson.co.uk
|
#
cb823ed9 |
|
12-Jul-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Use intel_gt as the primary object for handling resets Having taken the first step in encapsulating the functionality by moving the related files under gt/, the next step is to start encapsulating by passing around the relevant structs rather than the global drm_i915_private. In this step, we pass intel_gt to intel_reset.c Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190712192953.9187-1-chris@chris-wilson.co.uk
|
#
c8d84778 |
|
25-Jun-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/selftests: Hold ref on request across waits As we wait upon the request, we should be sure to hold our own reference for our checks. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190625130128.11009-14-chris@chris-wilson.co.uk
|
#
baea429d |
|
21-Jun-2019 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915: Move i915_gem_chipset_flush to intel_gt This aligns better with the rest of restructuring. v2: * Move call out of line. (Chris) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-24-tvrtko.ursulin@linux.intel.com
|
#
22b7a426 |
|
20-Jun-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/execlists: Preempt-to-busy When using a global seqno, we required a precise stop-the-workd event to handle preemption and unwind the global seqno counter. To accomplish this, we would preempt to a special out-of-band context and wait for the machine to report that it was idle. Given an idle machine, we could very precisely see which requests had completed and which we needed to feed back into the run queue. However, now that we have scrapped the global seqno, we no longer need to precisely unwind the global counter and only track requests by their per-context seqno. This allows us to loosely unwind inflight requests while scheduling a preemption, with the enormous caveat that the requests we put back on the run queue are still _inflight_ (until the preemption request is complete). This makes request tracking much more messy, as at any point then we can see a completed request that we believe is not currently scheduled for execution. We also have to be careful not to rewind RING_TAIL past RING_HEAD on preempting to the running context, and for this we use a semaphore to prevent completion of the request before continuing. To accomplish this feat, we change how we track requests scheduled to the HW. Instead of appending our requests onto a single list as we submit, we track each submission to ELSP as its own block. Then upon receiving the CS preemption event, we promote the pending block to the inflight block (discarding what was previously being tracked). As normal CS completion events arrive, we then remove stale entries from the inflight tracker. v2: Be a tinge paranoid and ensure we flush the write into the HWS page for the GPU semaphore to pick in a timely fashion. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190620142052.19311-1-chris@chris-wilson.co.uk
|
#
2f530945 |
|
18-Jun-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Stop passing I915_WAIT_LOCKED to i915_request_wait() Since commit eb8d0f5af4ec ("drm/i915: Remove GPU reset dependence on struct_mutex"), the I915_WAIT_LOCKED flags passed to i915_request_wait() has been defunct. Now go ahead and remove it from all callers. References: eb8d0f5af4ec ("drm/i915: Remove GPU reset dependence on struct_mutex") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190618074153.16055-3-chris@chris-wilson.co.uk
|
#
c447ff7d |
|
13-Jun-2019 |
Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> |
drm/i915: update with_intel_runtime_pm to use the rpm structure Matching the underlying get/put functions. Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Imre Deak <imre.deak@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190613232156.34940-8-daniele.ceraolospurio@intel.com
|
#
d858d569 |
|
13-Jun-2019 |
Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> |
drm/i915: update rpm_get/put to use the rpm structure The functions where internally already only using the structure, so we need to just flip the interface. v2: rebase Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Imre Deak <imre.deak@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190613232156.34940-7-daniele.ceraolospurio@intel.com
|
#
e568ac38 |
|
11-Jun-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Pull kref into i915_address_space Make the kref common to both derived structs (i915_ggtt and i915_ppgtt) so that we can safely reference count an abstract ctx->vm address space. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190611091238.15808-1-chris@chris-wilson.co.uk
|
#
c017cf6b |
|
28-May-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Drop the deferred active reference An old optimisation to reduce the number of atomics per batch sadly relies on struct_mutex for coordination. In order to remove struct_mutex from serialising object/context closing, always taking and releasing an active reference on first use / last use greatly simplifies the locking. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-15-chris@chris-wilson.co.uk
|
#
6951e589 |
|
28-May-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Move GEM object domain management from struct_mutex to local Use the per-object local lock to control the cache domain of the individual GEM objects, not struct_mutex. This is a huge leap forward for us in terms of object-level synchronisation; execbuffers are coordinated using the ww_mutex and pread/pwrite is finally fully serialised again. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-10-chris@chris-wilson.co.uk
|
#
10be98a7 |
|
28-May-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Move more GEM objects under gem/ Continuing the theme of separating out the GEM clutter. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-8-chris@chris-wilson.co.uk
|
#
46472b3e |
|
26-Apr-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Move i915_request_alloc into selftests/ Having transitioned GEM over to using intel_context as its primary means of tracking the GEM context and engine combined and using i915_request_create(), we can move the older i915_request_alloc() helper function into selftests/ where the remaining users are confined. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190426163336.15906-9-chris@chris-wilson.co.uk
|
#
2ccdf6a1 |
|
24-Apr-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Pass intel_context to i915_request_create() Start acquiring the logical intel_context and using that as our primary means for request allocation. This is the initial step to allow us to avoid requiring struct_mutex for request allocation along the perma-pinned kernel context, but it also provides a foundation for breaking up the complex request allocation to handle different scenarios inside execbuf. For the purpose of emitting a request from inside retirement (see the next patch for engine power management), we also need to lift control over the timeline mutex to the caller. v2: Note that the request carries the active reference upon construction. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190424200717.1686-4-chris@chris-wilson.co.uk
|
#
a679f58d |
|
21-Mar-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Flush pages on acquisition When we return pages to the system, we ensure that they are marked as being in the CPU domain since any external access is uncontrolled and we must assume the worst. This means that we need to always flush the pages on acquisition if we need to use them on the GPU, and from the beginning have used set-domain. Set-domain is overkill for the purpose as it is a general synchronisation barrier, but our intent is to only flush the pages being swapped in. If we move that flush into the pages acquisition phase, we know then that when we have obj->mm.pages, they are coherent with the GPU and need only maintain that status without resorting to heavy handed use of set-domain. The principle knock-on effect for userspace is through mmap-gtt pagefaulting. Our uAPI has always implied that the GTT mmap was async (especially as when any pagefault occurs is unpredicatable to userspace) and so userspace had to apply explicit domain control itself (set-domain). However, swapping is transparent to the kernel, and so on first fault we need to acquire the pages and make them coherent for access through the GTT. Our use of set-domain here leaks into the uABI that the first pagefault was synchronous. This is unintentional and baring a few igt should be unoticed, nevertheless we bump the uABI version for mmap-gtt to reflect the change in behaviour. Another implication of the change is that gem_create() is presumed to create an object that is coherent with the CPU and is in the CPU write domain, so a set-domain(CPU) following a gem_create() would be a minor operation that merely checked whether we could allocate all pages for the object. On applying this change, a set-domain(CPU) causes a clflush as we acquire the pages. This will have a small impact on mesa as we move the clflush here on !llc from execbuf time to create, but that should have minimal performance impact as the same clflush exists but is now done early and because of the clflush issue, userspace recycles bo and so should resist allocating fresh objects. Internally, the presumption that objects are created in the CPU write-domain and remain so through writes to obj->mm.mapping is more prevalent than I expected; but easy enough to catch and apply a manual flush. For the future, we should push the page flush from the central set_pages() into the callers so that we can more finely control when it is applied, but for now doing it one location is easier to validate, at the cost of sometimes flushing when there is no need. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.william.auld@gmail.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Antonio Argenziano <antonio.argenziano@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190321161908.8007-1-chris@chris-wilson.co.uk
|
#
8a68d464 |
|
05-Mar-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Store the BIT(engine->id) as the engine's mask In the next patch, we are introducing a broad virtual engine to encompass multiple physical engines, losing the 1:1 nature of BIT(engine->id). To reflect the broader set of engines implied by the virtual instance, lets store the full bitmask. v2: Use intel_engine_mask_t (s/ring_mask/engine_mask/) v3: Tvrtko voted for moah churn so teach everyone to not mention ring and use $class$instance throughout. v4: Comment upon the disparity in bspec for using VCS1,VCS2 in gen8 and VCS[0-4] in later gen. We opt to keep the code consistent and use 0-index naming throughout. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190305180332.30900-1-chris@chris-wilson.co.uk
|
#
3ef71149 |
|
01-Mar-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Introduce i915_timeline.mutex A simple mutex used for guarding the flow of requests in and out of the timeline. In the short-term, it will be used only to guard the addition of requests into the timeline, taken on alloc and released on commit so that only one caller can construct a request into the timeline (important as the seqno and ring pointers must be serialised). This will be used by observers to ensure that the seqno/hwsp is stable. Later, when we have reduced retiring to only operate on a single timeline at a time, we can then use the mutex as the sole guard required for retiring. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190301110547.14758-2-chris@chris-wilson.co.uk
|
#
8892f477 |
|
26-Feb-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Remove access to global seqno in the HWSP Stop accessing the HWSP to read the global seqno, and stop tracking the mirror in the engine's execution timeline -- it is unused. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190226094922.31617-2-chris@chris-wilson.co.uk
|
#
c41166f9 |
|
20-Feb-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Beware temporary wedging when determining -EIO At a few points in our uABI, we check to see if the driver is wedged and report -EIO back to the user in that case. However, as we perform the check and reset asynchronously (where once before they were both serialised by the struct_mutex), we may instead see the temporary wedging used to cancel inflight rendering to avoid a deadlock during reset (caused by either us timing out in our reset handler, i915_wedge_on_timeout or with malice aforethought in intel_reset_prepare for a stuck modeset). If we suspect this is the case, that is we see a wedged driver *and* reset in progress, then wait until the reset is resolved before reporting upon the wedged status. v2: might_sleep() (Mika) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109580 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190220145637.23503-1-chris@chris-wilson.co.uk
|
#
52c0fdb2 |
|
29-Jan-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Replace global breadcrumbs with per-context interrupt tracking A few years ago, see commit 688e6c725816 ("drm/i915: Slaughter the thundering i915_wait_request herd"), the issue of handling multiple clients waiting in parallel was brought to our attention. The requirement was that every client should be woken immediately upon its request being signaled, without incurring any cpu overhead. To handle certain fragility of our hw meant that we could not do a simple check inside the irq handler (some generations required almost unbounded delays before we could be sure of seqno coherency) and so request completion checking required delegation. Before commit 688e6c725816, the solution was simple. Every client waiting on a request would be woken on every interrupt and each would do a heavyweight check to see if their request was complete. Commit 688e6c725816 introduced an rbtree so that only the earliest waiter on the global timeline would woken, and would wake the next and so on. (Along with various complications to handle requests being reordered along the global timeline, and also a requirement for kthread to provide a delegate for fence signaling that had no process context.) The global rbtree depends on knowing the execution timeline (and global seqno). Without knowing that order, we must instead check all contexts queued to the HW to see which may have advanced. We trim that list by only checking queued contexts that are being waited on, but still we keep a list of all active contexts and their active signalers that we inspect from inside the irq handler. By moving the waiters onto the fence signal list, we can combine the client wakeup with the dma_fence signaling (a dramatic reduction in complexity, but does require the HW being coherent, the seqno must be visible from the cpu before the interrupt is raised - we keep a timer backup just in case). Having previously fixed all the issues with irq-seqno serialisation (by inserting delays onto the GPU after each request instead of random delays on the CPU after each interrupt), we can rely on the seqno state to perfom direct wakeups from the interrupt handler. This allows us to preserve our single context switch behaviour of the current routine, with the only downside that we lose the RT priority sorting of wakeups. In general, direct wakeup latency of multiple clients is about the same (about 10% better in most cases) with a reduction in total CPU time spent in the waiter (about 20-50% depending on gen). Average herd behaviour is improved, but at the cost of not delegating wakeups on task_prio. v2: Capture fence signaling state for error state and add comments to warm even the most cold of hearts. v3: Check if the request is still active before busywaiting v4: Reduce the amount of pointer misdirection with list_for_each_safe and using a local i915_request variable inside the loops v5: Add a missing pluralisation to a purely informative selftest message. References: 688e6c725816 ("drm/i915: Slaughter the thundering i915_wait_request herd") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190129205230.19056-2-chris@chris-wilson.co.uk
|
#
e4a8c813 |
|
21-Jan-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/selftests: Refactor common live_test framework Before adding yet another copy of struct live_test and its handler, refactor the existing code into a common framework for live selftests. For many live selftests, we want to know if the GPU hung or otherwise misbehaved during the execution of the test (beyond any infraction in the behaviour under test), live_test provides this by comparing the GPU state before and after, alerting if it unexpectedly changed (e.g. the reset counter changed). It also ensures that the GPU is idle before and after the test, so that residual code running on the GPU is flushed before testing. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190121222117.23305-5-chris@chris-wilson.co.uk
|
#
d4225a53 |
|
14-Jan-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Syntatic sugar for using intel_runtime_pm Frequently, we use intel_runtime_pm_get/_put around a small block. Formalise that usage by providing a macro to define such a block with an automatic closure to scope the intel_runtime_pm wakeref to that block, i.e. macro abuse smelling of python. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-15-chris@chris-wilson.co.uk
|
#
c9d08cc3 |
|
14-Jan-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/selftests: Mark up rpm wakerefs Track the temporary wakerefs used within the selftests so that leaks are clear. v2: Add a couple of coarse annotations for mock selftests as we now loudly warn about the errors. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-14-chris@chris-wilson.co.uk
|
#
16e4dd03 |
|
14-Jan-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Markup paired operations on wakerefs The majority of runtime-pm operations are bounded and scoped within a function; these are easy to verify that the wakeref are handled correctly. We can employ the compiler to help us, and reduce the number of wakerefs tracked when debugging, by passing around cookies provided by the various rpm_get functions to their rpm_put counterpart. This makes the pairing explicit, and given the required wakeref cookie the compiler can verify that we pass an initialised value to the rpm_put (quite handy for double checking error paths). For regular builds, the compiler should be able to eliminate the unused local variables and the program growth should be minimal. Fwiw, it came out as a net improvement as gcc was able to refactor rpm_get and rpm_get_if_in_use together, v2: Just s/rpm_put/rpm_put_unchecked/ everywhere, leaving the manual mark up for smaller more targeted patches. v3: Mention the cookie in Returns Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-2-chris@chris-wilson.co.uk
|
#
b8bdd9cc |
|
20-Sep-2018 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/selftests: Live tests emit requests and so require rpm As we emit requests or touch HW directly for some of the live tests, the requirement is that we hold the rpm wakeref before doing so. We want a mix of granularity since we will want to test runtime suspend, so try to mark up only the critical sections where we need rpm for the live test. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108002 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180920144934.16611-1-chris@chris-wilson.co.uk
|
#
ec625fb9 |
|
09-Jul-2018 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Provide a timeout to i915_gem_wait_for_idle() Usually we have no idea about the upper bound we need to wait to catch up with userspace when idling the device, but in a few situations we know the system was idle beforehand and can provide a short timeout in order to very quickly catch a failure, long before hangcheck kicks in. In the following patches, we will use the timeout to curtain two overly long waits, where we know we can expect the GPU to complete within a reasonable time or declare it broken. In particular, with a broken GPU we expect it to fail during the initial GPU setup where do a couple of context switches to record the defaults. This is a task that takes a few milliseconds even on the slowest of devices, but we may have to wait 60s for hangcheck to give in and declare the machine inoperable. In this a case where any gpu hang is unacceptable, both from a timeliness and practical standpoint. The other improvement is that in selftests, we do not need to arm an independent timer to inject a wedge, as we can just limit the timeout on the wait directly. v2: Include the timeout parameter in the trace. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180709122044.7028-1-chris@chris-wilson.co.uk
|
#
a5236978 |
|
06-Jul-2018 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Start returning an error from i915_vma_move_to_active() Handling such a late error in request construction is tricky, but to accommodate future patches which may allocate here, we potentially could err. To handle the error after already adjusting global state to track the new request, we must finish and submit the request. But we don't want to use the request as not everything is being tracked by it, so we opt to cancel the commands inside the request. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180706103947.15919-3-chris@chris-wilson.co.uk
|
#
a9450e15 |
|
06-Jul-2018 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/selftests: Skip all request selftests when wedged If the GPU is irrecoverably wedge, we cannot submit any request and so all of the request selftests will expectedly fail. Skip over them. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180706065332.15214-5-chris@chris-wilson.co.uk
|
#
bb9e8755 |
|
05-Jul-2018 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/selftests: Fixup recursive MI_BB_START for gen3 There's no magic bit0 in MI_BB_START for gen3, it's the same dword length parameter as elsewhere and needs to be zero. v2: Same bug in both live_requests and live_hanghcheck. References: https://bugs.freedesktop.org/show_bug.cgi?id=107132 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180705154756.5533-1-chris@chris-wilson.co.uk Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
|
#
a24362ea |
|
18-Jun-2018 |
Thomas Zimmermann <tdz@users.sourceforge.net> |
drm/i915: Replace drm_dev_unref with drm_dev_put This patch unifies the naming of DRM functions for reference counting of struct drm_device. The resulting code is more aligned with the rest of the Linux kernel interfaces. Signed-off-by: Thomas Zimmermann <tdz@users.sourceforge.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20180618110154.30462-6-tdz@users.sourceforge.net
|
#
920d3fb1 |
|
14-Jun-2018 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/selftests: Initialise request to silence a compiler With an old (4.7.3 on 32bit) gcc, it emits a warning for In file included from drivers/gpu/drm/i915/i915_request.c:1425:0: drivers/gpu/drm/i915/selftests/i915_request.c: In function ‘live_nop_request’: drivers/gpu/drm/i915/selftests/i915_request.c:380:21: error: ‘request’ may be used uninitialized in this function [-Werror=maybe-uninitialized] Silence it by just setting it to NULL on initialisation. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180614124923.18071-1-chris@chris-wilson.co.uk
|
#
697b9a87 |
|
12-Jun-2018 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Make closing request flush mandatory For symmetry, simplicity and ensuring the request is always truly idle upon its completion, always emit the closing flush prior to emitting the request breadcrumb. Previously, we would only emit the flush if we had started a user batch, but this just leaves all the other paths open to speculation (do they affect the GPU caches or not?) With mm switching, a key requirement is that the GPU is flushed and invalidated before hand, so for absolute safety, we want that closing flush be mandatory. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180612105135.4459-1-chris@chris-wilson.co.uk
|
#
82ad6443 |
|
05-Jun-2018 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gtt: Rename i915_hw_ppgtt base member In the near future, I want to subclass gen6_hw_ppgtt as it contains a few specialised members and I wish to add more. To avoid the ugliness of using ppgtt->base.base, rename the i915_hw_ppgtt base member (i915_address_space) as vm, which is our common shorthand for an i915_address_space local. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Matthew Auld <matthew.william.auld@gmail.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180605153758.18422-1-chris@chris-wilson.co.uk
|
#
e61e0f51 |
|
21-Feb-2018 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Rename drm_i915_gem_request to i915_request We want to de-emphasize the link between the request (dependency, execution and fence tracking) from GEM and so rename the struct from drm_i915_gem_request to i915_request. That is we may implement the GEM user interface on top of requests, but they are an abstraction for tracking execution rather than an implementation detail of GEM. (Since they are not tied to HW, we keep the i915 prefix as opposed to intel.) In short, the spatch: @@ @@ - struct drm_i915_gem_request + struct i915_request A corollary to contracting the type name, we also harmonise on using 'rq' shorthand for local variables where space if of the essence and repetition makes 'request' unwieldy. For globals and struct members, 'request' is still much preferred for its clarity. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180221095636.6649-1-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|