History log of /linux-master/drivers/gpu/drm/i915/i915_vma.c
Revision Date Author Comments
# 0e45882c 05-Mar-2024 Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>

drm/i915/vma: Fix UAF on destroy against retire race

Object debugging tools were sporadically reporting illegal attempts to
free a still active i915 VMA object when parking a GT believed to be idle.

[161.359441] ODEBUG: free active (active state 0) object: ffff88811643b958 object type: i915_active hint: __i915_vma_active+0x0/0x50 [i915]
[161.360082] WARNING: CPU: 5 PID: 276 at lib/debugobjects.c:514 debug_print_object+0x80/0xb0
...
[161.360304] CPU: 5 PID: 276 Comm: kworker/5:2 Not tainted 6.5.0-rc1-CI_DRM_13375-g003f860e5577+ #1
[161.360314] Hardware name: Intel Corporation Rocket Lake Client Platform/RocketLake S UDIMM 6L RVP, BIOS RKLSFWI1.R00.3173.A03.2204210138 04/21/2022
[161.360322] Workqueue: i915-unordered __intel_wakeref_put_work [i915]
[161.360592] RIP: 0010:debug_print_object+0x80/0xb0
...
[161.361347] debug_object_free+0xeb/0x110
[161.361362] i915_active_fini+0x14/0x130 [i915]
[161.361866] release_references+0xfe/0x1f0 [i915]
[161.362543] i915_vma_parked+0x1db/0x380 [i915]
[161.363129] __gt_park+0x121/0x230 [i915]
[161.363515] ____intel_wakeref_put_last+0x1f/0x70 [i915]

That has been tracked down to be happening when another thread is
deactivating the VMA inside __active_retire() helper, after the VMA's
active counter has been already decremented to 0, but before deactivation
of the VMA's object is reported to the object debugging tool.

We could prevent from that race by serializing i915_active_fini() with
__active_retire() via ref->tree_lock, but that wouldn't stop the VMA from
being used, e.g. from __i915_vma_retire() called at the end of
__active_retire(), after that VMA has been already freed by a concurrent
i915_vma_destroy() on return from the i915_active_fini(). Then, we should
rather fix the issue at the VMA level, not in i915_active.

Since __i915_vma_parked() is called from __gt_park() on last put of the
GT's wakeref, the issue could be addressed by holding the GT wakeref long
enough for __active_retire() to complete before that wakeref is released
and the GT parked.

I believe the issue was introduced by commit d93939730347 ("drm/i915:
Remove the vma refcount") which moved a call to i915_active_fini() from
a dropped i915_vma_release(), called on last put of the removed VMA kref,
to i915_vma_parked() processing path called on last put of a GT wakeref.
However, its visibility to the object debugging tool was suppressed by a
bug in i915_active that was fixed two weeks later with commit e92eb246feb9
("drm/i915/active: Fix missing debug object activation").

A VMA associated with a request doesn't acquire a GT wakeref by itself.
Instead, it depends on a wakeref held directly by the request's active
intel_context for a GT associated with its VM, and indirectly on that
intel_context's engine wakeref if the engine belongs to the same GT as the
VMA's VM. Those wakerefs are released asynchronously to VMA deactivation.

Fix the issue by getting a wakeref for the VMA's GT when activating it,
and putting that wakeref only after the VMA is deactivated. However,
exclude global GTT from that processing path, otherwise the GPU never goes
idle. Since __i915_vma_retire() may be called from atomic contexts, use
async variant of wakeref put. Also, to avoid circular locking dependency,
take care of acquiring the wakeref before VM mutex when both are needed.

v7: Add inline comments with justifications for:
- using untracked variants of intel_gt_pm_get/put() (Nirmoy),
- using async variant of _put(),
- not getting the wakeref in case of a global GTT,
- always getting the first wakeref outside vm->mutex.
v6: Since __i915_vma_active/retire() callbacks are not serialized, storing
a wakeref tracking handle inside struct i915_vma is not safe, and
there is no other good place for that. Use untracked variants of
intel_gt_pm_get/put_async().
v5: Replace "tile" with "GT" across commit description (Rodrigo),
- avoid mentioning multi-GT case in commit description (Rodrigo),
- explain why we need to take a temporary wakeref unconditionally inside
i915_vma_pin_ww() (Rodrigo).
v4: Refresh on top of commit 5e4e06e4087e ("drm/i915: Track gt pm
wakerefs") (Andi),
- for more easy backporting, split out removal of former insufficient
workarounds and move them to separate patches (Nirmoy).
- clean up commit message and description a bit.
v3: Identify root cause more precisely, and a commit to blame,
- identify and drop former workarounds,
- update commit message and description.
v2: Get the wakeref before VM mutex to avoid circular locking dependency,
- drop questionable Fixes: tag.

Fixes: d93939730347 ("drm/i915: Remove the vma refcount")
Closes: https://gitlab.freedesktop.org/drm/intel/issues/8875
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Cc: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: stable@vger.kernel.org # v5.19+
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240305143747.335367-6-janusz.krzysztofik@linux.intel.com
(cherry picked from commit f3c71b2ded5c4367144a810ef25f998fd1d6c381)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# c1464a89 30-Aug-2023 Jani Nikula <jani.nikula@intel.com>

drm/i915: add minimal i915_gem_object_frontbuffer.h

Split out frontbuffer related declarations and static inlines from
gem/i915_gem_object.h into new gem/i915_gem_object_frontbuffer.h.

The main goal is to reduce header interdependencies. With
gem/i915_gem_object.h including display/intel_frontbuffer.h,
modification of the latter causes a whopping 300+ objects to be rebuilt,
while many of the source files actually needing it aren't explicitly
including it at all.

After the change, only 21 objects depend on display/intel_frontbuffer.h,
directly or indirectly.

Cc: Jouni Högander <jouni.hogander@intel.com>
Reviewed-by: Jouni Högander <jouni.hogander@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230830085127.2416842-1-jani.nikula@intel.com


# 90b8ad13 14-Aug-2023 Alan Previn <alan.previn.teres.alexis@intel.com>

drm/i915: Fix TLB-Invalidation seqno store

When getting the next gt's seqno to be stored into an
objects mm.tlb[gt_id] array, fix the retrieval code
to get it from the correct gt instead of the same one.

Fixes: d6c531ab4820 ("drm/i915: Invalidate the TLBs on each GT")
Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230814182449.1060747-1-alan.previn.teres.alexis@intel.com


# f2ac6402 14-Aug-2023 Alan Previn <alan.previn.teres.alexis@intel.com>

drm/i915: Fix TLB-Invalidation seqno store

When getting the next gt's seqno to be stored into an
objects mm.tlb[gt_id] array, fix the retrieval code
to get it from the correct gt instead of the same one.

Fixes: d6c531ab4820 ("drm/i915: Invalidate the TLBs on each GT")
Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230814182449.1060747-1-alan.previn.teres.alexis@intel.com
(cherry picked from commit 90b8ad13536e80b1b4d9ed1c9d527e64ee757c26)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# d6c531ab 01-Aug-2023 Chris Wilson <chris.p.wilson@linux.intel.com>

drm/i915: Invalidate the TLBs on each GT

With multi-GT devices, the object may have been bound on each GT.
Invalidate the TLBs across all GT before releasing the pages
back to the system.

Signed-off-by: Chris Wilson <chris.p.wilson@linux.intel.com>
Cc: Fei Yang <fei.yang@intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230801141955.383305-4-andi.shyti@linux.intel.com


# 568a2e6f 01-Aug-2023 Chris Wilson <chris.p.wilson@linux.intel.com>

drm/i915/gt: Move TLB invalidation to its own file

Prepare for supporting more TLB invalidation scenarios by moving
the current MMIO invalidation to its own file.

Signed-off-by: Chris Wilson <chris.p.wilson@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230801141955.383305-2-andi.shyti@linux.intel.com


# ddd33ff1 27-Jul-2023 Jouni Högander <jouni.hogander@intel.com>

drm/i915: Add function to clear scanout flag for vmas

Currently frontbuffer tracking code is directly iterating over object vmas
and clearing scanout flags for them. Add function to clear scanout flag for
vmas and use it from frontbuffer tracking code.

v2: describe function parameter.

Signed-off-by: Jouni Högander <jouni.hogander@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230727064142.751976-5-jouni.hogander@intel.com


# 7b574550 27-Jul-2023 Jouni Högander <jouni.hogander@intel.com>

drm/i915: Add getter/setter for i915_gem_object->frontbuffer

Add getter/setter for i915_gem_object->frontbuffer and use it instead of
directly touching i915_gem_object->frontbuffer frontbuffer pointer.

v3:
- Fix intel_frontbuffer_get return value
- s/front_ret/cur/
v2: Move getter/setter into i915_gem_object.h

Signed-off-by: Jouni Högander <jouni.hogander@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230727064142.751976-3-jouni.hogander@intel.com


# b364f3cd 21-Jul-2023 Uwe Kleine-König <u.kleine-koenig@pengutronix.de>

drm/i915: Simplify expression &to_i915(dev)->drm

to_i915 is defined as

container_of(dev, struct drm_i915_private, drm);

So for a struct drm_device *dev, to_i915(dev)->drm is just dev. Simplify
accordingly.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230721212133.271118-1-u.kleine-koenig@pengutronix.de


# 7a2280e8 22-May-2023 Nirmoy Das <nirmoy.das@intel.com>

drm/i915: Wait for active retire before i915_active_fini()

i915_active_fini() finalizes the debug object, which can occur before
the active retires and deactivates the debug object. Wait for one
final time before calling i915_active_fini();

Closes:: https://gitlab.freedesktop.org/drm/intel/-/issues/8311
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230522124205.368-1-nirmoy.das@intel.com


# 9275277d 09-May-2023 Fei Yang <fei.yang@intel.com>

drm/i915: use pat_index instead of cache_level

Currently the KMD is using enum i915_cache_level to set caching policy for
buffer objects. This is flaky because the PAT index which really controls
the caching behavior in PTE has far more levels than what's defined in the
enum. In addition, the PAT index is platform dependent, having to translate
between i915_cache_level and PAT index is not reliable, and makes the code
more complicated.

From UMD's perspective there is also a necessity to set caching policy for
performance fine tuning. It's much easier for the UMD to directly use PAT
index because the behavior of each PAT index is clearly defined in Bspec.
Having the abstracted i915_cache_level sitting in between would only cause
more ambiguity. PAT is expected to work much like MOCS already works today,
and by design userspace is expected to select the index that exactly
matches the desired behavior described in the hardware specification.

For these reasons this patch replaces i915_cache_level with PAT index. Also
note, the cache_level is not completely removed yet, because the KMD still
has the need of creating buffer objects with simple cache settings such as
cached, uncached, or writethrough. For kernel objects, cache_level is used
for simplicity and backward compatibility. For Pre-gen12 platforms PAT can
have 1:1 mapping to i915_cache_level, so these two are interchangeable. see
the use of LEGACY_CACHELEVEL.

One consequence of this change is that gen8_pte_encode is no longer working
for gen12 platforms due to the fact that gen12 platforms has different PAT
definitions. In the meantime the mtl_pte_encode introduced specfically for
MTL becomes generic for all gen12 platforms. This patch renames the MTL
PTE encode function into gen12_pte_encode and apply it to all gen12. Even
though this change looks unrelated, but separating them would temporarily
break gen12 PTE encoding, thus squash them in one patch.

Special note: this patch changes the way caching behavior is controlled in
the sense that some objects are left to be managed by userspace. For such
objects we need to be careful not to change the userspace settings.There
are kerneldoc and comments added around obj->cache_coherent, cache_dirty,
and how to bypass the checkings by i915_gem_object_has_cache_level. For
full understanding, these changes need to be looked at together with the
two follow-up patches, one disables the {set|get}_caching ioctl's and the
other adds set_pat extension to the GEM_CREATE uAPI.

Bspec: 63019

Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
Signed-off-by: Fei Yang <fei.yang@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230509165200.1740-3-fei.yang@intel.com


# a915450e 31-Mar-2023 Lee Jones <lee@kernel.org>

drm/i915/i915_vma: Provide one missing param and demote another non-kerneldoc header

Fixes the following W=1 kernel build warning(s):

drivers/gpu/drm/i915/i915_vma.c:756: warning: Function parameter or member 'ww' not described in 'i915_vma_insert'
drivers/gpu/drm/i915/i915_vma.c:1744: warning: Function parameter or member 'vma' not described in 'i915_vma_destroy_locked'

Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230331092607.700644-17-lee@kernel.org


# acc855d3 16-Jan-2023 Jani Nikula <jani.nikula@intel.com>

drm/i915/display: add intel_display_limits.h for key enums

Move a handful of key enums to a new file intel_display_limits.h. These
are the enum types, and the MAX/NUM enumerations within them, that are
used in other headers. Otherwise, there's no common theme between them.

Replace intel_display.h include with intel_display_limit.h where
relevant, and add the intel_display.h include directly in the .c files
where needed.

Since intel_display.h is used almost everywhere in display/, include it
from intel_display_types.h to avoid massive changes across the
board. There are very few files that would need intel_display_types.h
but not intel_display.h so this is neglible, and further cleanup between
these headers can be left for the future.

Overall this change drops the direct and indirect dependencies on
intel_display.h from about 300 to about 100 compilation units, because
we can drop the include from i915_drv.h.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230116164644.1752009-1-jani.nikula@intel.com


# 4f0755c2 23-Dec-2022 Nirmoy Das <nirmoy.das@intel.com>

drm/i915: Reserve enough fence slot for i915_vma_unbind_async

A nested dma_resv_reserve_fences(1) will not reserve slot from the
2nd call onwards and folowing dma_resv_add_fence() might hit the
"BUG_ON(fobj->num_fences >= fobj->max_fences)" check.

I915 hit above nested dma_resv case in ttm_bo_handle_move_mem() with
async unbind:

dma_resv_reserve_fences() from --> ttm_bo_handle_move_mem()
dma_resv_reserve_fences() from --> i915_vma_unbind_async()
dma_resv_add_fence() from --> i915_vma_unbind_async()
dma_resv_add_fence() from -->ttm_bo_move_accel_cleanup()

Resolve this by adding an extra fence in i915_vma_unbind_async().

Suggested-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Fixes: 2f6b90da9192 ("drm/i915: Use vma resources for async unbinding")
Cc: <stable@vger.kernel.org> # v5.18+
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221223092011.11657-1-nirmoy.das@intel.com


# f47e6306 28-Dec-2022 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gem: Typecheck page lookups

We need to check that we avoid integer overflows when looking up a page,
and so fix all the instances where we have mistakenly used a plain
integer instead of a more suitable long. Be pedantic and add integer
typechecking to the lookup so that we can be sure that we are safe.
And it also uses pgoff_t as our page lookups must remain compatible with
the page cache, pgoff_t is currently exactly unsigned long.

v2: Move added i915_utils's macro into drm_util header (Jani N)
v3: Make not use the same macro name on a function. (Mauro)
For kernel-doc, macros and functions are handled in the same namespace,
the same macro name on a function prevents ever adding documentation
for it.
v4: Add kernel-doc markups to the kAPI functions and macros (Mauoro)
v5: Fix an alignment to match open parenthesis
v6: Rebase
v10: Use assert_typable instead of exactly_pgoff_t() macro. (Kees)
v11: Change the use of assert_typable to assert_same_typable (G.G)
v12: Change to use static_assert(__castable_to_type(n ,T)) style since
the assert_same_typable() macro has been dropped. (G.G)
v13: Change the use of __castable_to_type() to castable_to_type()
Remove an unnecessary header include line. (G.G)
v16: Fix "ERROR:SPACING" Checkpatch report (G.G)

Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Co-developed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> (v2)
Reviewed-by: Mauro Carvalho Chehab <mchehab@kernel.org> (v3)
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> (v5)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221228192252.917299-2-gwan-gyeong.mun@intel.com


# 801fa7a8 16-Dec-2022 Matthew Auld <matthew.auld@intel.com>

drm/i915: improve the catch-all evict to handle lock contention

The catch-all evict can fail due to object lock contention, since it
only goes as far as trylocking the object, due to us already holding the
vm->mutex. Doing a full object lock here can deadlock, since the
vm->mutex is always our inner lock. Add another execbuf pass which drops
the vm->mutex and then tries to grab the object will the full lock,
before then retrying the eviction. This should be good enough for now to
fix the immediate regression with userspace seeing -ENOSPC from execbuf
due to contended object locks during GTT eviction.

v2 (Mani)
- Also revamp the docs for the different passes.

Testcase: igt@gem_ppgtt@shrink-vs-evict-*
Fixes: 7e00897be8bf ("drm/i915: Add object locking to i915_gem_evict_for_node and i915_gem_evict_something, v2.")
References: https://gitlab.freedesktop.org/drm/intel/-/issues/7627
References: https://gitlab.freedesktop.org/drm/intel/-/issues/7570
References: https://bugzilla.mozilla.org/show_bug.cgi?id=1779558
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Cc: Mani Milani <mani@chromium.org>
Cc: <stable@vger.kernel.org> # v5.18+
Reviewed-by: Mani Milani <mani@chromium.org>
Tested-by: Mani Milani <mani@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20221216113456.414183-1-matthew.auld@intel.com


# 61102251 01-Dec-2022 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Introduce guard pages to i915_vma

Introduce the concept of padding the i915_vma with guard pages before
and after. The major consequence is that all ordinary uses of i915_vma
must use i915_vma_offset/i915_vma_size and not i915_vma.node.start/size
directly, as the drm_mm_node will include the guard pages that surround
our object.

The biggest connundrum is how exactly to mix requesting a fixed address
with guard pages, particularly through the existing uABI. The user does
not know about guard pages, so such must be transparent to the user, and
so the execobj.offset must be that of the object itself excluding the
guard. So a PIN_OFFSET_FIXED must then be exclusive of the guard pages.
The caveat is that some placements will be impossible with guard pages,
as wrap arounds need to be avoided, and the vma itself will require a
larger node. We must not report EINVAL but ENOSPC as these are unavailable
locations within the GTT rather than conflicting user requirements.

In the next patch, we start using guard pages for scanout objects. While
these are limited to GGTT vma, on a few platforms these vma (or at least
an alias of the vma) is shared with userspace, so we may leak the
existence of such guards if we are not careful to ensure that the
execobj.offset is transparent and excludes the guards. (On such platforms
like ivb, without full-ppgtt, userspace has to use relocations so the
presence of more untouchable regions within its GTT such be of no further
issue.)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221201203912.346110-1-andi.shyti@linux.intel.com


# 8e4ee5e8 30-Nov-2022 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Wrap all access to i915_vma.node.start|size

We already wrap i915_vma.node.start for use with the GGTT, as there we
can perform additional sanity checks that the node belongs to the GGTT
and fits within the 32b registers. In the next couple of patches, we
will introduce guard pages around the objects _inside_ the drm_mm_node
allocation. That is we will offset the vma->pages so that the first page
is at drm_mm_node.start + vma->guard (not 0 as is currently the case).
All users must then not use i915_vma.node.start directly, but compute
the guard offset, thus all users are converted to use a
i915_vma_offset() wrapper.

The notable exceptions are the selftests that are testing exact
behaviour of i915_vma_pin/i915_vma_insert.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
Co-developed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221130235805.221010-3-andi.shyti@linux.intel.com


# 0f857158 21-Nov-2022 Aravind Iddamsetty <aravind.iddamsetty@intel.com>

drm/i915/mtl: Media GT and Render GT share common GGTT

On XE_LPM+ platforms the media engines are carved out into a separate
GT but have a common GGTMMADR address range which essentially makes
the GGTT address space to be shared between media and render GT. As a
result any updates in GGTT shall invalidate TLB of GTs sharing it and
similarly any operation on GGTT requiring an action on a GT will have to
involve all GTs sharing it. setup_private_pat was being done on a per
GGTT based as that doesn't touch any GGTT structures moved it to per GT
based.

BSPEC: 63834

v2:
1. Add details to commit msg
2. includes fix for failure to add item to ggtt->gt_list, as suggested
by Lucas
3. as ggtt_flush() is used only for ggtt drop i915_is_ggtt check within
it.
4. setup_private_pat moved out of intel_gt_tiles_init

v3:
1. Move out for_each_gt from i915_driver.c (Jani Nikula)

v4: drop using RCU primitives on ggtt->gt_list as it is not an RCU list
(Matt Roper)

Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221122070126.4813-1-aravind.iddamsetty@intel.com


# 476fdcda 23-Dec-2022 Nirmoy Das <nirmoy.das@intel.com>

drm/i915: Reserve enough fence slot for i915_vma_unbind_async

A nested dma_resv_reserve_fences(1) will not reserve slot from the
2nd call onwards and folowing dma_resv_add_fence() might hit the
"BUG_ON(fobj->num_fences >= fobj->max_fences)" check.

I915 hit above nested dma_resv case in ttm_bo_handle_move_mem() with
async unbind:

dma_resv_reserve_fences() from --> ttm_bo_handle_move_mem()
dma_resv_reserve_fences() from --> i915_vma_unbind_async()
dma_resv_add_fence() from --> i915_vma_unbind_async()
dma_resv_add_fence() from -->ttm_bo_move_accel_cleanup()

Resolve this by adding an extra fence in i915_vma_unbind_async().

Suggested-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Fixes: 2f6b90da9192 ("drm/i915: Use vma resources for async unbinding")
Cc: <stable@vger.kernel.org> # v5.18+
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221223092011.11657-1-nirmoy.das@intel.com
(cherry picked from commit 4f0755c2faf7388616109717facc5bbde6850e60)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# 3f882f2d 16-Dec-2022 Matthew Auld <matthew.auld@intel.com>

drm/i915: improve the catch-all evict to handle lock contention

The catch-all evict can fail due to object lock contention, since it
only goes as far as trylocking the object, due to us already holding the
vm->mutex. Doing a full object lock here can deadlock, since the
vm->mutex is always our inner lock. Add another execbuf pass which drops
the vm->mutex and then tries to grab the object will the full lock,
before then retrying the eviction. This should be good enough for now to
fix the immediate regression with userspace seeing -ENOSPC from execbuf
due to contended object locks during GTT eviction.

v2 (Mani)
- Also revamp the docs for the different passes.

Testcase: igt@gem_ppgtt@shrink-vs-evict-*
Fixes: 7e00897be8bf ("drm/i915: Add object locking to i915_gem_evict_for_node and i915_gem_evict_something, v2.")
References: https://gitlab.freedesktop.org/drm/intel/-/issues/7627
References: https://gitlab.freedesktop.org/drm/intel/-/issues/7570
References: https://bugzilla.mozilla.org/show_bug.cgi?id=1779558
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Cc: Mani Milani <mani@chromium.org>
Cc: <stable@vger.kernel.org> # v5.18+
Reviewed-by: Mani Milani <mani@chromium.org>
Tested-by: Mani Milani <mani@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20221216113456.414183-1-matthew.auld@intel.com
(cherry picked from commit 801fa7a81f6da533cc5442fc40e32c72b76cd42a)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# 2a76fc89 19-Oct-2022 Andrzej Hajda <andrzej.hajda@intel.com>

drm/i915: call i915_request_await_object from _i915_vma_move_to_active

Since almost all calls to i915_vma_move_to_active are prepended with
i915_request_await_object, let's call the latter from
_i915_vma_move_to_active by default and add flag allowing bypassing it.
Adjust all callers accordingly.
The patch should not introduce functional changes.

Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221019215906.295296-2-andrzej.hajda@intel.com


# 443a8fbc 15-Nov-2022 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Fix vma allocator debug

Add a missing colon which I accidentally removed in the recent logging
changes.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: a10234fda466 ("drm/i915: Partial abandonment of legacy DRM logging macros")
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221115101730.394880-1-tvrtko.ursulin@linux.intel.com


# a10234fd 09-Nov-2022 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Partial abandonment of legacy DRM logging macros

Convert some usages of legacy DRM logging macros into versions which tell
us on which device have the events occurred.

v2:
* Don't have struct drm_device as local. (Jani, Ville)

v3:
* Store gt, not i915, in workaround list. (John)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221109104633.2579245-1-tvrtko.ursulin@linux.intel.com


# 8133a6da 03-Oct-2022 Matthew Auld <matthew.auld@intel.com>

drm/i915: enable PS64 support for DG2

It turns out that on production DG2/ATS HW we should have support for
PS64. This feature allows to provide a 64K TLB hint at the PTE level,
which is a lot more flexible than the current method of enabling 64K GTT
pages for the entire page-table, since that leads to all kinds of
annoying restrictions, as documented in:

commit caa574ffc4aaf4f29b890223878c63e2e7772f62
Author: Matthew Auld <matthew.auld@intel.com>
Date: Sat Feb 19 00:17:49 2022 +0530

drm/i915/uapi: document behaviour for DG2 64K support

On discrete platforms like DG2, we need to support a minimum page size
of 64K when dealing with device local-memory. This is quite tricky for
various reasons, so try to document the new implicit uapi for this.

With PS64, we can now drop the 2M GTT alignment restriction, and instead
only require 64K or larger when dealing with lmem. We still use the
compact-pt layout when possible, but only when we are certain that this
doesn't interfere with userspace.

Note that this is a change in uAPI behaviour, but hopefully shouldn't be
a concern (IGT is at least able to autodetect the alignment), since we
are only making the GTT alignment constraint less restrictive.

Based on a patch from CQ Tang.

v2: update the comment wrt scratch page
v3: (Nirmoy)
- Fix the selftest to actually use the random size, plus some comment
improvements, also drop the rem stuff.

Reported-by: Michal Mrozek <michal.mrozek@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Stuart Summers <stuart.summers@intel.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Yang A Shi <yang.a.shi@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Acked-by: Michal Mrozek <michal.mrozek@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221004114915.221708-1-matthew.auld@intel.com


# 04f7eb3d 07-Sep-2022 Nirmoy Das <nirmoy.das@intel.com>

drm/i915: Set correct domains values at _i915_vma_move_to_active

Fix regression introduced by commit:
"drm/i915: Individualize fences before adding to dma_resv obj"
which sets obj->read_domains to 0 for both read and write paths.
Also set obj->write_domain to 0 on read path which was removed by
the commit.

References: https://gitlab.freedesktop.org/drm/intel/-/issues/6639
Fixes: 420a07b841d0 ("drm/i915: Individualize fences before adding to dma_resv obj")
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Cc: <stable@vger.kernel.org> # v5.16+
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220907172641.12555-1-nirmoy.das@intel.com


# 3bb6a442 01-Sep-2022 Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>

drm/i915: Rename ggtt_view as gtt_view

So far, different views (normal, partial, rotated and remapped)
into the same object are only supported for GGTT mappings.
But with the upcoming VM_BIND feature, PPGTT will also use the
partial view mapping. Hence rename ggtt_view to more generic
gtt_view.

Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220901183854.3446-1-niranjana.vishwanathapura@intel.com


# 3d037d99 04-Aug-2022 Mauro Carvalho Chehab <mchehab@kernel.org>

drm/i915: pass a pointer for tlb seqno at vma_invalidate_tlb()

WRITE_ONCE() should happen at the original var, not on a local
copy of it.

Cc: stable@vger.kernel.org
Fixes: 5d36acb7198b ("drm/i915/gt: Batch TLB invalidations")
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[added cc-stable while merging it]
Link: https://patchwork.freedesktop.org/patch/msgid/f9550e6bacea10131ff40dd8981b69eb9251cdcd.1659598090.git.mchehab@kernel.org


# 5d36acb7 27-Jul-2022 Chris Wilson <chris.p.wilson@intel.com>

drm/i915/gt: Batch TLB invalidations

Invalidate TLB in batches, in order to reduce performance regressions.

Currently, every caller performs a full barrier around a TLB
invalidation, ignoring all other invalidations that may have already
removed their PTEs from the cache. As this is a synchronous operation
and can be quite slow, we cause multiple threads to contend on the TLB
invalidate mutex blocking userspace.

We only need to invalidate the TLB once after replacing our PTE to
ensure that there is no possible continued access to the physical
address before releasing our pages. By tracking a seqno for each full
TLB invalidate we can quickly determine if one has been performed since
rewriting the PTE, and only if necessary trigger one for ourselves.

That helps to reduce the performance regression introduced by TLB
invalidate logic.

[mchehab: rebased to not require moving the code to a separate file]

Cc: stable@vger.kernel.org
Fixes: 7938d61591d3 ("drm/i915: Flush TLBs before releasing backing store")
Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris.p.wilson@intel.com>
Cc: Fei Yang <fei.yang@intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/4e97ef5deb6739cadaaf40aa45620547e9c4ec06.1658924372.git.mchehab@kernel.org


# 08b81298 07-Sep-2022 Nirmoy Das <nirmoy.das@intel.com>

drm/i915: Set correct domains values at _i915_vma_move_to_active

Fix regression introduced by commit:
"drm/i915: Individualize fences before adding to dma_resv obj"
which sets obj->read_domains to 0 for both read and write paths.
Also set obj->write_domain to 0 on read path which was removed by
the commit.

References: https://gitlab.freedesktop.org/drm/intel/-/issues/6639
Fixes: 420a07b841d0 ("drm/i915: Individualize fences before adding to dma_resv obj")
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Cc: <stable@vger.kernel.org> # v5.16+
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220907172641.12555-1-nirmoy.das@intel.com
(cherry picked from commit 04f7eb3d4582a0a4da67c86e55fda7de2df86d91)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# 9d50bff4 04-Aug-2022 Mauro Carvalho Chehab <mchehab@kernel.org>

drm/i915: pass a pointer for tlb seqno at vma_invalidate_tlb()

WRITE_ONCE() should happen at the original var, not on a local
copy of it.

Cc: stable@vger.kernel.org
Fixes: 59eda6ce824e ("drm/i915/gt: Batch TLB invalidations")
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[added cc-stable while merging it]
Link: https://patchwork.freedesktop.org/patch/msgid/f9550e6bacea10131ff40dd8981b69eb9251cdcd.1659598090.git.mchehab@kernel.org
(cherry picked from commit 3d037d99e61a1e7a3ae3d214146d88db349dd19f)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# 59eda6ce 27-Jul-2022 Chris Wilson <chris.p.wilson@intel.com>

drm/i915/gt: Batch TLB invalidations

Invalidate TLB in batches, in order to reduce performance regressions.

Currently, every caller performs a full barrier around a TLB
invalidation, ignoring all other invalidations that may have already
removed their PTEs from the cache. As this is a synchronous operation
and can be quite slow, we cause multiple threads to contend on the TLB
invalidate mutex blocking userspace.

We only need to invalidate the TLB once after replacing our PTE to
ensure that there is no possible continued access to the physical
address before releasing our pages. By tracking a seqno for each full
TLB invalidate we can quickly determine if one has been performed since
rewriting the PTE, and only if necessary trigger one for ourselves.

That helps to reduce the performance regression introduced by TLB
invalidate logic.

[mchehab: rebased to not require moving the code to a separate file]

Cc: stable@vger.kernel.org
Fixes: 7938d61591d3 ("drm/i915: Flush TLBs before releasing backing store")
Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris.p.wilson@intel.com>
Cc: Fei Yang <fei.yang@intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/4e97ef5deb6739cadaaf40aa45620547e9c4ec06.1658924372.git.mchehab@kernel.org
(cherry picked from commit 5d36acb7198b0e5eb88e6b701f9ad7b9448f8df9)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# 1926a6b7 20-Jun-2022 Thomas Hellström <thomas.hellstrom@linux.intel.com>

drm/i915: Fix vm use-after-free in vma destruction

In vma destruction, the following race may occur:

Thread 1: Thread 2:
i915_vma_destroy();

...
list_del_init(vma->vm_link);
...
mutex_unlock(vma->vm->mutex);
__i915_vm_release();
release_references();

And in release_reference() we dereference vma->vm to get to the
vm gt pointer, leading to a use-after free.

However, __i915_vm_release() grabs the vm->mutex so the vm won't be
destroyed before vma->vm->mutex is released, so extract the gt pointer
under the vm->mutex to avoid the vma->vm dereference in
release_references().

v2: Fix a typo in the commit message (Andi Shyti)

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5944
Fixes: e1a7ab4fca0c ("drm/i915: Remove the vm open count")

Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.con>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220620123659.381772-1-thomas.hellstrom@linux.intel.com


# bfe53be2 29-Jun-2022 Matthew Auld <matthew.auld@intel.com>

drm/i915/ttm: handle blitter failure on DG2

If the move or clear operation somehow fails, and the memory underneath
is not cleared, like when moving to lmem, then we currently fallback to
memcpy or memset. However with small-BAR systems this fallback might no
longer be possible. For now we use the set_wedged sledgehammer if we
ever encounter such a scenario, and mark the object as borked to plug
any holes where access to the memory underneath can happen. Add some
basic selftests to exercise this.

v2:
- In the selftests make sure we grab the runtime pm around the reset.
Also make sure we grab the reset lock before checking if the device
is wedged, since the wedge might still be in-progress and hence the
bit might not be set yet.
- Don't wedge or put the object into an unknown state, if the request
construction fails (or similar). Just returning an error and
skipping the fallback should be safe here.
- Make sure we wedge each gt. (Thomas)
- Peek at the unknown_state in io_reserve, that way we don't have to
export or hand roll the fault_wait_for_idle. (Thomas)
- Add the missing read-side barriers for the unknown_state. (Thomas)
- Some kernel-doc fixes. (Thomas)
v3:
- Tweak the ordering of the set_wedged, also add FIXME.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220629174350.384910-11-matthew.auld@intel.com


# d976521a 10-Jun-2022 CQ Tang <cq.tang@intel.com>

drm/i915: extend i915_vma_pin_iomap()

In the future display might try call this with a normal smem object,
which doesn't require PIN_MAPPABLE underneath in order to CPU map the
pages (unlike stolen). Extend i915_vma_pin_iomap() to directly use
i915_gem_object_pin_map() for such cases.

This change was suggested by Chris P Wilson, that we pin
the smem with i915_gem_object_pin_map_unlocked().

v2 (jheikkil): Change i915_gem_object_pin_map_unlocked to
i915_gem_object_pin_map

Signed-off-by: CQ Tang <cq.tang@intel.com>
Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Cc: Chris Wilson <chris.p.wilson@intel.com>
Cc: Jari Tahvanainen <jari.tahvanainen@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
[mauld: tweak commit message, plus minor checkpatch fix]
Link: https://patchwork.freedesktop.org/patch/msgid/20220610121205.29645-2-juhapekka.heikkila@gmail.com


# afd5cb39 10-Jun-2022 Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>

drm/i915: don't leak lmem mapping in vma_evict

Don't leak lmem mapping in vma_evict, move __i915_vma_iounmap outside
i915_vma_is_map_and_fenceable.

Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220610121205.29645-3-juhapekka.heikkila@gmail.com


# 420a07b8 25-May-2022 Nirmoy Das <nirmoy.das@intel.com>

drm/i915: Individualize fences before adding to dma_resv obj

_i915_vma_move_to_active() can receive > 1 fences for
multiple batch buffers submission. Because dma_resv_add_fence()
can only accept one fence at a time, change _i915_vma_move_to_active()
to be aware of multiple fences so that it can add individual
fences to the dma resv object.

v6: fix multi-line comment.
v5: remove double fence reservation for batch VMAs.
v4: Reserve fences for composite_fence on multi-batch contexts and
also reserve fence slots to composite_fence for each VMAs.
v3: dma_resv_reserve_fences is not cumulative so pass num_fences.
v2: make sure to reserve enough fence slots before adding.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5614
Fixes: 544460c33821 ("drm/i915: Multi-BB execbuf")
Cc: <stable@vger.kernel.org> # v5.16+
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220525095955.15371-1-nirmoy.das@intel.com


# 48da0f67 20-Jun-2022 Thomas Hellström <thomas.hellstrom@linux.intel.com>

drm/i915: Fix vm use-after-free in vma destruction

In vma destruction, the following race may occur:

Thread 1: Thread 2:
i915_vma_destroy();

...
list_del_init(vma->vm_link);
...
mutex_unlock(vma->vm->mutex);
__i915_vm_release();
release_references();

And in release_reference() we dereference vma->vm to get to the
vm gt pointer, leading to a use-after free.

However, __i915_vm_release() grabs the vm->mutex so the vm won't be
destroyed before vma->vm->mutex is released, so extract the gt pointer
under the vm->mutex to avoid the vma->vm dereference in
release_references().

v2: Fix a typo in the commit message (Andi Shyti)

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5944
Fixes: e1a7ab4fca0c ("drm/i915: Remove the vm open count")

Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.con>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220620123659.381772-1-thomas.hellstrom@linux.intel.com
(cherry picked from commit 1926a6b75954fc1a8b44d10bd0c67db957b78cf7)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# 12058077 20-Jun-2022 Thomas Hellström <thomas.hellstrom@linux.intel.com>

drm/i915: Fix vm use-after-free in vma destruction

In vma destruction, the following race may occur:

Thread 1: Thread 2:
i915_vma_destroy();

...
list_del_init(vma->vm_link);
...
mutex_unlock(vma->vm->mutex);
__i915_vm_release();
release_references();

And in release_reference() we dereference vma->vm to get to the
vm gt pointer, leading to a use-after free.

However, __i915_vm_release() grabs the vm->mutex so the vm won't be
destroyed before vma->vm->mutex is released, so extract the gt pointer
under the vm->mutex to avoid the vma->vm dereference in
release_references().

v2: Fix a typo in the commit message (Andi Shyti)

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5944
Fixes: e1a7ab4fca0c ("drm/i915: Remove the vm open count")

Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.con>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220620123659.381772-1-thomas.hellstrom@linux.intel.com
(cherry picked from commit 1926a6b75954fc1a8b44d10bd0c67db957b78cf7)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# 842d9346 25-May-2022 Nirmoy Das <nirmoy.das@intel.com>

drm/i915: Individualize fences before adding to dma_resv obj

_i915_vma_move_to_active() can receive > 1 fences for
multiple batch buffers submission. Because dma_resv_add_fence()
can only accept one fence at a time, change _i915_vma_move_to_active()
to be aware of multiple fences so that it can add individual
fences to the dma resv object.

v6: fix multi-line comment.
v5: remove double fence reservation for batch VMAs.
v4: Reserve fences for composite_fence on multi-batch contexts and
also reserve fence slots to composite_fence for each VMAs.
v3: dma_resv_reserve_fences is not cumulative so pass num_fences.
v2: make sure to reserve enough fence slots before adding.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5614
Fixes: 544460c33821 ("drm/i915: Multi-BB execbuf")
Cc: <stable@vger.kernel.org> # v5.16+
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220525095955.15371-1-nirmoy.das@intel.com
(cherry picked from commit 420a07b841d03f6a436d8c06571c69aa5c783897)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# 1df1c79c 20-Apr-2022 Karol Herbst <kherbst@redhat.com>

drm/i915: Fix race in __i915_vma_remove_closed

i915_vma_reopen checked if the vma is closed before without taking the
lock. So multiple threads could attempt removing the vma.

Instead the lock needs to be taken before actually checking.

v2: move struct declaration

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v5.3+
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5732
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Fixes: 155ab8836caa ("drm/i915: Move object close under its own lock")
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220420095720.3331609-1-kherbst@redhat.com


# ea3ce08c 03-May-2022 Kefeng Wang <wangkefeng.wang@huawei.com>

drm/i915: use IOMEM_ERR_PTR() directly

Use IOMEM_ERR_PTR() instead of self defined IO_ERR_PTR().

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220503144937.679424-1-tvrtko.ursulin@linux.intel.com


# 0de2cc0e 29-Apr-2022 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Fix assert in i915_ggtt_pin

Use lockdep_assert_not_held to simplify and correct the code. Otherwise
false positive are hit if lock state is uknown like after a previous
taint.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220429140757.651406-1-tvrtko.ursulin@linux.intel.com


# 1d7f5e6c 22-Dec-2021 Christian König <christian.koenig@amd.com>

drm/i915: drop bo->moving dependency

That should now be handled by the common dma_resv framework.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: intel-gfx@lists.freedesktop.org
Link: https://patchwork.freedesktop.org/patch/msgid/20220407085946.744568-13-christian.koenig@amd.com


# 73511edf 09-Nov-2021 Christian König <christian.koenig@amd.com>

dma-buf: specify usage while adding fences to dma_resv obj v7

Instead of distingting between shared and exclusive fences specify
the fence usage while adding fences.

Rework all drivers to use this interface instead and deprecate the old one.

v2: some kerneldoc comments suggested by Daniel
v3: fix a missing case in radeon
v4: rebase on nouveau changes, fix lockdep and temporary disable warning
v5: more documentation updates
v6: separate internal dma_resv changes from this patch, avoids to
disable warning temporary, rebase on upstream changes
v7: fix missed case in lima driver, minimize changes to i915_gem_busy_ioctl

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20220407085946.744568-3-christian.koenig@amd.com


# c8d4c18b 16-Nov-2021 Christian König <christian.koenig@amd.com>

dma-buf/drivers: make reserving a shared slot mandatory v4

Audit all the users of dma_resv_add_excl_fence() and make sure they
reserve a shared slot also when only trying to add an exclusive fence.

This is the next step towards handling the exclusive fence like a
shared one.

v2: fix missed case in amdgpu
v3: and two more radeon, rename function
v4: add one more case to TTM, fix i915 after rebase

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20220406075132.3263-2-christian.koenig@amd.com


# e4b3ee71 04-Mar-2022 Matthew Auld <matthew.auld@intel.com>

drm/i915: stop checking for NULL vma->obj

This is no longer possible since e6e1a304d759 ("drm/i915: vma is always
backed by an object.").

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220304174252.1000238-1-matthew.auld@intel.com


# 833124a0 04-Mar-2022 Matthew Auld <matthew.auld@intel.com>

drm/i915: limit the async bind to bind_async_flags

If the vm doesn't request async binding, like for example with the dpt,
then we should be able to skip the async path and avoid calling
i915_vm_lock_objects() altogether. Currently if we have a moving fence
set for the BO(even though it might have signalled), we still take the
async patch regardless of the bind_async setting, and then later still
end up just doing i915_gem_object_wait_moving_fence() anyway.

Alternatively we would need to add dummy scratch object which can be
locked, just for the dpt.

Suggested-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220304095934.925036-2-matthew.auld@intel.com


# d9393973 04-Mar-2022 Thomas Hellström <thomas.hellstrom@linux.intel.com>

drm/i915: Remove the vma refcount

Now that i915_vma_parked() is taking the object lock on vma destruction,
and the only user of the vma refcount, i915_gem_object_unbind()
also takes the object lock, remove the vma refcount.

v3: Documentation update.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220304082641.308069-3-thomas.hellstrom@linux.intel.com


# e1a7ab4f 04-Mar-2022 Thomas Hellström <thomas.hellstrom@linux.intel.com>

drm/i915: Remove the vm open count

vms are not getting properly closed. Rather than fixing that,
Remove the vm open count and instead rely on the vm refcount.

The vm open count existed solely to break the strong references the
vmas had on the vms. Now instead make those references weak and
ensure vmas are destroyed when the vm is destroyed.

Unfortunately if the vm destructor and the object destructor both
wants to destroy a vma, that may lead to a race in that the vm
destructor just unbinds the vma and leaves the actual vma destruction
to the object destructor. However in order for the object destructor
to ensure the vma is unbound it needs to grab the vm mutex. In order
to keep the vm mutex alive until the object destructor is done with
it, somewhat hackishly grab a vm_resv refcount that is released late
in the vma destruction process, when the vm mutex is no longer needed.

v2: Address review-comments from Niranjana
- Clarify that the struct i915_address_space::skip_pte_rewrite is a hack
and should ideally be replaced in an upcoming patch.
- Remove an unneeded continue in clear_vm_list and update comment.

v3:
- Documentation update
- Commit message formatting

Co-developed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220304082641.308069-2-thomas.hellstrom@linux.intel.com


# 3220c3b2 20-Apr-2022 Karol Herbst <kherbst@redhat.com>

drm/i915: Fix race in __i915_vma_remove_closed

i915_vma_reopen checked if the vma is closed before without taking the
lock. So multiple threads could attempt removing the vma.

Instead the lock needs to be taken before actually checking.

v2: move struct declaration

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v5.3+
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5732
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Fixes: 155ab8836caa ("drm/i915: Move object close under its own lock")
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220420095720.3331609-1-kherbst@redhat.com
(cherry picked from commit 1df1c79cbb7ac9bf148930be3418973c76ba8dde)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# c03d9826 22-Feb-2022 Thomas Hellström <thomas.hellstrom@linux.intel.com>

drm/i915: Clarify vma lifetime

It's unclear what reference the initial vma kref reference refers to.
A vma can have multiple weak references, the object vma list,
the vm's bound list and the GT's closed_list, and the initial vma
reference can be put from lookups of all these lists.

With the current implementation this means
that any holder of yet another vma refcount (currently only
i915_gem_object_unbind()) needs to be holding two of either
*) An object refcount,
*) A vm open count
*) A vma open count

in order for us to not risk leaking a reference by having the
initial vma reference being put twice.

Address this by re-introducing i915_vma_destroy() which removes all
weak references of the vma and *then* puts the initial vma refcount.
This makes a strong vma reference hold on to the vma unconditionally.

Perhaps a better name would be i915_vma_revoke() or i915_vma_zombify(),
since other callers may still hold a refcount, but with the prospect of
being able to replace the vma refcount with the object lock in the near
future, let's stick with i915_vma_destroy().

Finally this commit fixes a race in that previously i915_vma_release() and
now i915_vma_destroy() could destroy a vma without taking the vm->mutex
after an advisory check that the vma mm_node was not allocated.
This would race with the ungrab_vma() function creating a trace similar
to the below one. This was fixed in one of the __i915_vma_put() callsites
in
commit bc1922e5d349 ("drm/i915: Fix a race between vma / object destruction and unbinding")
but although not seemingly triggered by CI, that
is not sufficient. This patch is needed to fix that properly.

[823.012188] Console: switching to colour dummy device 80x25
[823.012422] [IGT] gem_ppgtt: executing
[823.016667] [IGT] gem_ppgtt: starting subtest blt-vs-render-ctx0
[852.436465] stack segment: 0000 [#1] PREEMPT SMP NOPTI
[852.436480] CPU: 0 PID: 3200 Comm: gem_ppgtt Not tainted 5.16.0-CI-CI_DRM_11115+ #1
[852.436489] Hardware name: Intel Corporation Alder Lake Client Platform/AlderLake-P DDR5 RVP, BIOS ADLPFWI1.R00.2422.A00.2110131104 10/13/2021
[852.436499] RIP: 0010:ungrab_vma+0x9/0x80 [i915]
[852.436711] Code: ef e8 4b 85 cf e0 e8 36 a3 d6 e0 8b 83 f8 9c 00 00 85 c0 75 e1 5b 5d 41 5c 41 5d c3 e9 d6 fd 14 00 55 53 48 8b af c0 00 00 00 <8b> 45 00 85 c0 75 03 5b 5d c3 48 8b 85 a0 02 00 00 48 89 fb 48 8b
[852.436727] RSP: 0018:ffffc90006db7880 EFLAGS: 00010246
[852.436734] RAX: 0000000000000000 RBX: ffffc90006db7598 RCX: 0000000000000000
[852.436742] RDX: ffff88815349e898 RSI: ffff88815349e858 RDI: ffff88810a284140
[852.436748] RBP: 6b6b6b6b6b6b6b6b R08: ffff88815349e898 R09: ffff88815349e8e8
[852.436754] R10: 0000000000000001 R11: 0000000051ef1141 R12: ffff88810a284140
[852.436762] R13: 0000000000000000 R14: ffff88815349e868 R15: ffff88810a284458
[852.436770] FS: 00007f5c04b04e40(0000) GS:ffff88849f000000(0000) knlGS:0000000000000000
[852.436781] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[852.436788] CR2: 00007f5c04b38fe0 CR3: 000000010a6e8001 CR4: 0000000000770ef0
[852.436797] PKRU: 55555554
[852.436801] Call Trace:
[852.436806] <TASK>
[852.436811] i915_gem_evict_for_node+0x33c/0x3c0 [i915]
[852.437014] i915_gem_gtt_reserve+0x106/0x130 [i915]
[852.437211] i915_vma_pin_ww+0x8f4/0xb60 [i915]
[852.437412] eb_validate_vmas+0x688/0x860 [i915]
[852.437596] i915_gem_do_execbuffer+0xc0e/0x25b0 [i915]
[852.437770] ? deactivate_slab+0x5f2/0x7d0
[852.437778] ? _raw_spin_unlock_irqrestore+0x50/0x60
[852.437789] ? i915_gem_execbuffer2_ioctl+0xc6/0x2c0 [i915]
[852.437944] ? init_object+0x49/0x80
[852.437950] ? __lock_acquire+0x5e6/0x2580
[852.437963] i915_gem_execbuffer2_ioctl+0x116/0x2c0 [i915]
[852.438129] ? i915_gem_do_execbuffer+0x25b0/0x25b0 [i915]
[852.438300] drm_ioctl_kernel+0xac/0x140
[852.438310] drm_ioctl+0x201/0x3d0
[852.438316] ? i915_gem_do_execbuffer+0x25b0/0x25b0 [i915]
[852.438490] __x64_sys_ioctl+0x6a/0xa0
[852.438498] do_syscall_64+0x37/0xb0
[852.438507] entry_SYSCALL_64_after_hwframe+0x44/0xae
[852.438515] RIP: 0033:0x7f5c0415b317
[852.438523] Code: b3 66 90 48 8b 05 71 4b 2d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 41 4b 2d 00 f7 d8 64 89 01 48
[852.438542] RSP: 002b:00007ffd765039a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[852.438553] RAX: ffffffffffffffda RBX: 000055e4d7829dd0 RCX: 00007f5c0415b317
[852.438562] RDX: 00007ffd76503a00 RSI: 00000000c0406469 RDI: 0000000000000017
[852.438571] RBP: 00007ffd76503a00 R08: 0000000000000000 R09: 0000000000000081
[852.438579] R10: 00000000ffffff7f R11: 0000000000000246 R12: 00000000c0406469
[852.438587] R13: 0000000000000017 R14: 00007ffd76503a00 R15: 0000000000000000
[852.438598] </TASK>
[852.438602] Modules linked in: snd_hda_codec_hdmi i915 mei_hdcp x86_pkg_temp_thermal snd_hda_intel snd_intel_dspcfg drm_buddy coretemp crct10dif_pclmul crc32_pclmul snd_hda_codec ttm ghash_clmulni_intel snd_hwdep snd_hda_core e1000e drm_dp_helper ptp snd_pcm mei_me drm_kms_helper pps_core mei syscopyarea sysfillrect sysimgblt fb_sys_fops prime_numbers intel_lpss_pci smsc75xx usbnet mii
[852.440310] ---[ end trace e52cdd2fe4fd911c ]---

v2: Fix typos in the commit message.

Fixes: 7e00897be8bf ("drm/i915: Add object locking to i915_gem_evict_for_node and i915_gem_evict_something, v2.")
Fixes: bc1922e5d349 ("drm/i915: Fix a race between vma / object destruction and unbinding")
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220222133209.587978-1-thomas.hellstrom@linux.intel.com


# 30b9d1b3 25-Feb-2022 Matthew Auld <matthew.auld@intel.com>

drm/i915: add I915_BO_ALLOC_GPU_ONLY

If the user doesn't require CPU access for the buffer, then
ALLOC_GPU_ONLY should be used, in order to prioritise allocating in the
non-mappable portion of LMEM, on devices with small BAR.

v2(Thomas):
- The BO_ALLOC_TOPDOWN naming here is poor, since this is pure lies on
systems that don't even have small BAR. A better name is GPU_ONLY,
which is accurate regardless of the configuration.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220225145502.331818-3-matthew.auld@intel.com


# 87bd701e 18-Feb-2022 Matthew Auld <matthew.auld@intel.com>

drm/i915: enforce min GTT alignment for discrete cards

For local-memory objects we need to align the GTT addresses
to 64K, both for the ppgtt and ggtt.

We need to support vm->min_alignment > 4K, depending
on the vm itself and the type of object we are inserting.
With this in mind update the GTT selftests to take this
into account.

For compact-pt we further align and pad lmem object GTT addresses
to 2MB to ensure PDEs contain consistent page sizes as
required by the HW.

v3:
* use needs_compact_pt flag to discriminate between
64K and 64K with compact-pt
* add i915_vm_obj_min_alignment
* use i915_vm_obj_min_alignment to round up vma reservation
if compact-pt instead of hard coding
v5:
* fix i915_vm_obj_min_alignment for internal objects which
have no memory region
v6:
* tiled_blits_create correctly pick largest required alignment
v8:
* i915_vm_min_alignment protect against array overflow for mock region

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Signed-off-by: Robert Beckett <bob.beckett@collabora.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220218184752.7524-7-ramalingam.c@intel.com


# a594525c 28-Jan-2022 Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

drm/i915: Allow dead vm to unbind vma's without lock.

i915_gem_vm_close may take the lock, and we currently have no better way
of handling this. At least for now, allow a path in which holding vm->mutex
is sufficient. This is the case, because the object destroy path will
forcefully take vm->mutex now.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220128085739.1464568-1-maarten.lankhorst@linux.intel.com
Reviewed-by: Thomas Hellstrom <thomas.hellstrom@linux.intel.com>


# c36846f3 27-Jan-2022 Dan Carpenter <dan.carpenter@oracle.com>

drm/i915: delete shadow "ret" variable

This "ret" declaration shadows an existing "ret" variable at the top of
the function. Delete it.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220127085115.GD25644@kili
Fixes: f6c466b84cfa ("drm/i915: Add support for moving fence waiting")


# 8f4f9a3b 19-Jan-2022 Thomas Hellström <thomas.hellstrom@linux.intel.com>

drm/i915: Fix vma resource freeing

In some cases we use leftover kfree() instead of i915_vma_resource_free().
Fix this.

Fixes: 2f6b90da9192 ("drm/i915: Use vma resources for async unbinding")
Reported-by: Robert Beckett <bob.beckett@collabora.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220119174734.213552-1-thomas.hellstrom@linux.intel.com


# b5cfe6f7 14-Jan-2022 Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

drm/i915: Remove short-term pins from execbuf, v6.

Add a flag PIN_VALIDATE, to indicate we don't need to pin and only
protected by the object lock.

This removes the need to unpin, which is done by just releasing the
lock.

eb_reserve is slightly reworked for readability, but the same steps
are still done:
- First pass pins with NONBLOCK.
- Second pass unbinds all objects first, then pins.
- Third pass is only called when not all objects are softpinned, and
unbinds all objects, then calls i915_gem_evict_vm(), then pins.

Changes since v1:
- Split out eb_reserve() into separate functions for readability.
Changes since v2:
- Make batch buffer mappable on platforms where only GGTT is available,
to prevent moving the batch buffer during relocations.
Changes since v3:
- Preserve current behavior for batch buffer, instead be cautious when
calling i915_gem_object_ggtt_pin_ww, and re-use the current batch vma
if it's inside ggtt and map-and-fenceable.
- Remove impossible condition check from eb_reserve. (Matt)
Changes since v5:
- Do not even temporarily pin, just call i915_gem_evict_vm() and mark
all vma's as unpinned.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220114132320.109030-7-maarten.lankhorst@linux.intel.com


# 294996a9 14-Jan-2022 Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

drm/i915: Remove support for unlocked i915_vma unbind

Now that we require the object lock for all ops, some code handling
race conditions can be removed.

This is required to not take short-term pins inside execbuf.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Acked-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220114132320.109030-6-maarten.lankhorst@linux.intel.com


# 0f341974 14-Jan-2022 Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

drm/i915: Add i915_vma_unbind_unlocked, and take obj lock for i915_vma_unbind, v2.

We want to remove more members of i915_vma, which requires the locking to
be held more often.

Start requiring gem object lock for i915_vma_unbind, as it's one of the
callers that may unpin pages.

Some special care is needed when evicting, because the last reference to
the object may be held by the VMA, so after __i915_vma_unbind, vma may be
garbage, and we need to cache vma->obj before unlocking.

Changes since v1:
- Make trylock failing a WARN. (Matt)
- Remove double i915_vma_wait_for_bind() (Matt)
- Move atomic_set to right before mutex_unlock(), to make it more clear
they belong together. (Matt)

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220114132320.109030-5-maarten.lankhorst@linux.intel.com


# 7e00897b 14-Jan-2022 Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

drm/i915: Add object locking to i915_gem_evict_for_node and i915_gem_evict_something, v2.

Because we will start to require the obj->resv lock for unbinding,
ensure these vma eviction utility functions also take the lock.

This requires some function signature changes, to ensure that the
ww context is passed around, but is mostly straightforward.

Previously this was split up into several patches, but reworking
should allow for easier bisection.

Changes since v1:
- Handle evicting dead objects better.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220114132320.109030-4-maarten.lankhorst@linux.intel.com


# 6945c53b 17-Jan-2022 Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

drm/i915: Add locking to i915_gem_evict_vm(), v3.

i915_gem_evict_vm will need to be able to evict objects that are
locked by the current ctx. By testing if the current context already
locked the object, we can do this correctly. This allows us to
evict the entire vm even if we already hold some objects' locks.

Previously, this was spread over several commits, but it makes
more sense to commit the changes to i915_gem_evict_vm separately
from the changes to i915_gem_evict_something() and
i915_gem_evict_for_node().

Changes since v1:
- Handle evicting dead objects better.
Changes since v2:
- Use for_i915_gem_ww in igt_evict_vm. (Thomas)

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
[mlankhorst: Fix up doc warning.]
Link: https://patchwork.freedesktop.org/patch/msgid/20220117075604.131477-1-maarten.lankhorst@linux.intel.com


# 60dc43d1 10-Jan-2022 Thomas Hellström <thomas.hellstrom@linux.intel.com>

drm/i915: Use struct vma_resource instead of struct vma_snapshot

There is always a struct vma_resource guaranteed to be alive when we
access a corresponding struct vma_snapshot.

So ditch the latter and instead of allocating vma_snapshots, reference
the already existning vma_resource.

This requires a couple of extra members in struct vma_resource but that's
a small price to pay for the simplification.

v2:
- Fix a missing include and declaration (kernel test robot <lkp@intel.com>)

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220110172219.107131-7-thomas.hellstrom@linux.intel.com


# 2f6b90da 10-Jan-2022 Thomas Hellström <thomas.hellstrom@linux.intel.com>

drm/i915: Use vma resources for async unbinding

Implement async (non-blocking) unbinding by not syncing the vma before
calling unbind on the vma_resource.
Add the resulting unbind fence to the object's dma_resv from where it is
picked up by the ttm migration code.
Ideally these unbind fences should be coalesced with the migration blit
fence to avoid stalling the migration blit waiting for unbind, as they
can certainly go on in parallel, but since we don't yet have a
reasonable data structure to use to coalesce fences and attach the
resulting fence to a timeline, we defer that for now.

Note that with async unbinding, even while the unbind waits for the
preceding bind to complete before unbinding, the vma itself might have been
destroyed in the process, clearing the vma pages. Therefore we can
only allow async unbinding if we have a refcounted sg-list and keep a
refcount on that for the vma resource pages to stay intact until
binding occurs. If this condition is not met, a request for an async
unbind is diverted to a sync unbind.

v2:
- Use a separate kmem_cache for vma resources for now to isolate their
memory allocation and aid debugging.
- Move the check for vm closed to the actual unbinding thread. Regardless
of whether the vm is closed, we need the unbind fence to properly wait
for capture.
- Clear vma_res::vm on unbind and update its documentation.
v4:
- Take cache coloring into account when searching for vma resources
pending unbind. (Matthew Auld)
v5:
- Fix timeout and error check in i915_vma_resource_bind_dep_await().
- Avoid taking a reference on the object for async binding if
async unbind capable.
- Fix braces around a single-line if statement.
v6:
- Fix up the cache coloring adjustment. (Kernel test robot <lkp@intel.com>)
- Don't allow async unbinding if the vma_res pages are not the same as
the object pages. (Matthew Auld)
v7:
- s/unsigned long/u64/ in a number of places (Matthew Auld)

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220110172219.107131-5-thomas.hellstrom@linux.intel.com


# ebf3c361 10-Jan-2022 Thomas Hellström <thomas.hellstrom@linux.intel.com>

drm/i915: Don't pin the object pages during pending vma binds

A pin-count is already held by vma->pages so taking an additional pin
during async binds is not necessary.

When we introduce async unbinding we have other means of keeping the
object pages alive.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220110172219.107131-4-thomas.hellstrom@linux.intel.com


# 39a2bd34 10-Jan-2022 Thomas Hellström <thomas.hellstrom@linux.intel.com>

drm/i915: Use the vma resource as argument for gtt binding / unbinding

When introducing asynchronous unbinding, the vma itself may no longer
be alive when the actual binding or unbinding takes place.

Update the gtt i915_vma_ops accordingly to take a struct i915_vma_resource
instead of a struct i915_vma for the bind_vma() and unbind_vma() ops.
Similarly change the insert_entries() op for struct i915_address_space.

Replace a couple of i915_vma_snapshot members with their newly introduced
i915_vma_resource counterparts, since they have the same lifetime.

Also make sure to avoid changing the struct i915_vma_flags (in particular
the bind flags) async. That should now only be done sync under the
vm mutex.

v2:
- Update the vma_res::bound_flags when binding to the aliased ggtt
v6:
- Remove I915_VMA_ALLOC_BIT (Matthew Auld)
- Change some members of struct i915_vma_resource from unsigned long to u64
(Matthew Auld)
v7:
- Fix vma resource size parameters to be u64 rather than unsigned long
(Matthew Auld)

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220110172219.107131-3-thomas.hellstrom@linux.intel.com


# e1a4bbb6 10-Jan-2022 Thomas Hellström <thomas.hellstrom@linux.intel.com>

drm/i915: Initial introduction of vma resources

Introduce vma resources, sort of similar to TTM resources, needed for
asynchronous bind management. Initially we will use them to hold
completion of unbinding when we capture data from a vma, but they will
be used extensively in upcoming patches for asynchronous vma unbinding.

v6:
- Some documentation updates

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220110172219.107131-2-thomas.hellstrom@linux.intel.com


# 386e75a4 07-Jan-2022 Jani Nikula <jani.nikula@intel.com>

drm/i915: split out gem/i915_gem_tiling.h from i915_drv.h

We already have the gem/i915_gem_tiling.c file.

Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/8073a429ed1f8ade9c0cc8a6ed1a0f82183100c5.1641561552.git.jani.nikula@intel.com


# 2ef97818 07-Jan-2022 Jani Nikula <jani.nikula@intel.com>

drm/i915: split out i915_gem_evict.h from i915_drv.h

We already have the i915_gem_evict.c file.

v2: Fixed commit message (Tvrtko)

Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ec666853171d04daeb21a93083940df36907c343.1641561552.git.jani.nikula@intel.com


# 7938d615 19-Oct-2021 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Flush TLBs before releasing backing store

We need to flush TLBs before releasing backing store otherwise userspace
is able to encounter stale entries if a) it is not declaring access to
certain buffers and b) it races with the backing store release from a
such undeclared execution already executing on the GPU in parallel.

The approach taken is to mark any buffer objects which were ever bound
to the GPU and to trigger a serialized TLB flush when their backing
store is released.

Alternatively the flushing could be done on VMA unbind, at which point
we would be able to ascertain whether there is potential a parallel GPU
execution (which could race), but essentially it boils down to paying
the cost of TLB flushes potentially needlessly at VMA unbind time (when
the backing store is not known to be going away so not needed for
safety), versus potentially needlessly at backing store relase time
(since we at that point cannot tell whether there is anything executing
on the GPU which uses that object).

Thereforce simplicity of implementation has been chosen for now with
scope to benchmark and refine later as required.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reported-by: Sushma Venkatesh Reddy <sushma.venkatesh.reddy@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Dave Airlie <airlied@redhat.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# c2ea703d 21-Dec-2021 Thomas Hellström <thomas.hellstrom@linux.intel.com>

drm/i915: Require the vm mutex for i915_vma_bind()

Protect updates of struct i915_vma flags and async binding / unbinding
with the vm::mutex. This means that i915_vma_bind() needs to assert
vm::mutex held. In order to make that possible drop the caching of
kmap_atomic() maps around i915_vma_bind().

An alternative would be to use kmap_local() but since we block cpu
unplugging during sleeps inside kmap_local() sections this may have
unwanted side-effects. Particularly since we might wait for gpu while
holding the vm mutex.

This change may theoretically increase execbuf cpu-usage on snb, but
at least on non-highmem systems that increase should be very small.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211221200050.436316-5-thomas.hellstrom@linux.intel.com


# 2abb6195 16-Dec-2021 Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

drm/i915: Take object lock in i915_ggtt_pin if ww is not set

i915_vma_wait_for_bind needs the vma lock held, fix the caller.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211216142749.1966107-5-maarten.lankhorst@linux.intel.com


# 0b4d1f0e 16-Dec-2021 Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

drm/i915: Remove pages_mutex and intel_gtt->vma_ops.set/clear_pages members, v3.

Big delta, but boils down to moving set_pages to i915_vma.c, and removing
the special handling, all callers use the defaults anyway. We only remap
in ggtt, so default case will fall through.

Because we still don't require locking in i915_vma_unpin(), handle this by
using xchg in get_pages(), as it's locked with obj->mutex, and cmpxchg in
unpin, which only fails if we race a against a new pin.

Changes since v1:
- aliasing gtt sets ZERO_SIZE_PTR, not -ENODEV, remove special case
from __i915_vma_get_pages(). (Matt)
Changes since v2:
- Free correct old pages in __i915_vma_get_pages(). (Matt)
Remove race of clearing vma->pages accidentally from put,
free it but leave it set, as only get has the lock.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211216142749.1966107-4-maarten.lankhorst@linux.intel.com
Reviewed-by: Matthew Auld <matthew.auld@intel.com>


# ad5c99e0 16-Dec-2021 Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

drm/i915: Remove unused bits of i915_vma/active api

When reworking the code to move the eviction fence to the object,
the best code is removed code.

Remove some functions that are unused, and change the function definition
if it's only used in 1 place.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
[mlankhorst: Remove new use of i915_active_has_exclusive]
Link: https://patchwork.freedesktop.org/patch/msgid/20211216142749.1966107-2-maarten.lankhorst@linux.intel.com


# f6c466b8 22-Nov-2021 Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

drm/i915: Add support for moving fence waiting

For now, we will only allow async migration when TTM is used,
so the paths we care about are related to TTM.

The mmap path is handled by having the fence in ttm_bo->moving,
when pinning, the binding only becomes available after the moving
fence is signaled, and pinning a cpu map will only work after
the moving fence signals.

This should close all holes where userspace can read a buffer
before it's fully migrated.

v2:
- Fix a couple of SPARSE warnings
v3:
- Fix a NULL pointer dereference
v4:
- Ditch the moving fence waiting for i915_vma_pin_iomap() and
replace with a verification that the vma is already bound.
(Matthew Auld)
- Squash with a previous patch introducing moving fence waiting and
accessing interfaces (Matthew Auld)
- Rename to indicated that we also add support for sync waiting.
v5:
- Fix check for NULL and unreferencing i915_vma_verify_bind_complete()
(Matthew Auld)
- Fix compilation failure if !CONFIG_DRM_I915_DEBUG_GEM
- Fix include ordering. (Matthew Auld)
v7:
- Fix yet another compilation failure with clang if
!CONFIG_DRM_I915_DEBUG_GEM

Co-developed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211122214554.371864-2-thomas.hellstrom@linux.intel.com


# 95c3d275 17-Nov-2021 Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

drm/i915: Remove resv from i915_vma

It's just an alias to vma->obj->base.resv, no need to duplicate it.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211117142024.1043017-5-matthew.auld@intel.com


# e6e1a304 17-Nov-2021 Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

drm/i915: vma is always backed by an object.

vma->obj and vma->resv are now never NULL, and some checks can be removed.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211117142024.1043017-4-matthew.auld@intel.com


# fbd4cf3b 02-Nov-2021 Matthew Auld <matthew.auld@intel.com>

drm/i915: fixup dma_fence_wait usage

dma_fence_wait expects a boolean for whether it should be interruptible,
not a timeout value.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211102155055.100138-1-matthew.auld@intel.com


# 0f68d45e 08-Nov-2021 Imran Khan <imran.f.khan@oracle.com>

lib, stackdepot: add helper to print stack entries into buffer

To print stack entries into a buffer, users of stackdepot, first get a
list of stack entries using stack_depot_fetch and then print this list
into a buffer using stack_trace_snprint. Provide a helper in stackdepot
for this purpose. Also change above mentioned users to use this helper.

[imran.f.khan@oracle.com: fix build error]
Link: https://lkml.kernel.org/r/20210915175321.3472770-4-imran.f.khan@oracle.com
[imran.f.khan@oracle.com: export stack_depot_snprint() to modules]
Link: https://lkml.kernel.org/r/20210916133535.3592491-4-imran.f.khan@oracle.com

Link: https://lkml.kernel.org/r/20210915014806.3206938-4-imran.f.khan@oracle.com
Signed-off-by: Imran Khan <imran.f.khan@oracle.com>
Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Jani Nikula <jani.nikula@intel.com> [i915]
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@linux.ie>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 544460c3 14-Oct-2021 Matthew Brost <matthew.brost@intel.com>

drm/i915: Multi-BB execbuf

Allow multiple batch buffers to be submitted in a single execbuf IOCTL
after a context has been configured with the 'set_parallel' extension.
The number batches is implicit based on the contexts configuration.

This is implemented with a series of loops. First a loop is used to find
all the batches, a loop to pin all the HW contexts, a loop to create all
the requests, a loop to submit (emit BB start, etc...) all the requests,
a loop to tie the requests to the VMAs they touch, and finally a loop to
commit the requests to the backend.

A composite fence is also created for the generated requests to return
to the user and to stick in dma resv slots.

No behavior from the existing IOCTL should be changed aside from when
throttling because the ring for a context is full. In this situation,
i915 will now wait while holding the object locks. This change was done
because the code is much simpler to wait while holding the locks and we
believe there isn't a huge benefit of dropping these locks. If this
proves false we can restructure the code to drop the locks during the
wait.

IGT: https://patchwork.freedesktop.org/patch/447008/?series=93071&rev=1
media UMD: https://github.com/intel/media-driver/pull/1252

v2:
(Matthew Brost)
- Return proper error value if i915_request_create fails
v3:
(John Harrison)
- Add comment explaining create / add order loops + locking
- Update commit message explaining different in IOCTL behavior
- Line wrap some comments
- eb_add_request returns void
- Return -EINVAL rather triggering BUG_ON if cmd parser used
(Checkpatch)
- Check eb->batch_len[*current_batch]
v4:
(CI)
- Set batch len if passed if via execbuf args
- Call __i915_request_skip after __i915_request_commit
(Kernel test robot)
- Initialize rq to NULL in eb_pin_timeline
v5:
(John Harrison)
- Fix typo in comments near bb order loops

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211014172005.27155-21-matthew.brost@intel.com


# 64fc7cc7 27-Jul-2021 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: move vma slab to direct module init/exit

With the global kmem_cache shrink infrastructure gone there's nothing
special and we can convert them over.

I'm doing this split up into each patch because there's quite a bit of
noise with removing the static global.slab_vmas to just a
slab_vmas.

We have to keep i915_drv.h include in i915_globals otherwise there's
nothing anymore that pulls in GEM_BUG_ON.

v2: Make slab static (Jason, 0day)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210727121037.2041102-9-daniel.vetter@ffwll.ch


# 4f62a7e0 21-Jul-2021 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Ditch i915 globals shrink infrastructure

This essentially reverts

commit 84a1074920523430f9dc30ff907f4801b4820072
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Wed Jan 24 11:36:08 2018 +0000

drm/i915: Shrink the GEM kmem_caches upon idling

mm/vmscan.c:do_shrink_slab() is a thing, if there's an issue with it
then we need to fix that there, not hand-roll our own slab shrinking
code in i915.

Also when this was added there was only one other caller of
kmem_cache_shrink (added 2005 to the acpi code). Now there's a 2nd one
outside of i915 code in a kunit test, which seems legit since that
wants to very carefully control what's in the kmem_cache. This out of
a total of over 500 calls to kmem_cache_create. This alone should have
been warning sign enough that we're doing something silly.

Noticed while reviewing a patch set from Jason to fix up some issues
in our i915_init() and i915_exit() module load/cleanup code. Now that
i915_globals.c isn't any different than normal init/exit functions, we
should convert them over to one unified table and remove
i915_globals.[hc] entirely.

v2: Improve commit message (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: David Airlie <airlied@linux.ie>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210721183229.4136488-1-daniel.vetter@ffwll.ch


# dc194184 14-Jul-2021 Jason Ekstrand <jason@jlekstrand.net>

drm/i915: Drop error handling from dma_fence_work

Asynchronous command parsing was the only thing which ever returned a
non-zero error. With that gone, we can drop the error handling from
dma_fence_work.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210714193419.1459723-5-jason@jlekstrand.net


# 0f4308d5 01-Jun-2021 Thomas Hellström <thomas.hellstrom@linux.intel.com>

drm/i915: Untangle the vma pages_mutex

Any sleeping dma_resv lock taken while the vma pages_mutex is held
will cause a lockdep splat.
Move the i915_gem_object_pin_pages() call out of the pages_mutex
critical section.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210601074654.3103-2-thomas.hellstrom@linux.intel.com


# 74862d4c 24-May-2021 Imre Deak <imre.deak@intel.com>

drm/i915/adlp: Fix GEM VM asserts for DPT VMs

An object mapped via DPT can have remapped and rotated VMA instances
besides the normal VMA instance, similarly to GGTT VMA instances.
Adjust the corresponding VMA lookup asserts.

While at it also check if a DPT VM is passed incorrectly to
i915_vm_to_ppgtt().

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210524172703.2113058-2-imre.deak@intel.com


# c3b14760 04-May-2021 Matthew Auld <matthew.auld@intel.com>

drm/i915: drop the __i915_active_call pointer packing

We use some of the lower bits of the retire function pointer for
potential flags, which is quite thorny, since the caller needs to
remember to give the function the correct alignment with
__i915_active_call, otherwise we might incorrectly unpack the pointer
and jump to some garbage address later. Instead of all this let's just
pass the flags along as a separate parameter.

Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Suggested-by: Daniel Vetter <daniel@ffwll.ch>
References: ca419f407b43 ("drm/i915: Fix crash in auto_retire")
References: d8e44e4dd221 ("drm/i915/overlay: Fix active retire callback alignment")
References: fd5f262db118 ("drm/i915/selftests: Fix active retire callback alignment")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210504164136.96456-1-matthew.auld@intel.com


# 4bc91dbd 27-Apr-2021 Anusha Srivatsa <anusha.srivatsa@intel.com>

drm/i915/lmem: Bypass aperture when lmem is available

In the scenario where local memory is available, we have
rely on CPU access via lmem directly instead of aperture.

v2:
gmch is only relevant for much older hw, therefore we can drop the
has_aperture check since it should always be present on such platforms.
(Chris)

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Chris P Wilson <chris.p.wilson@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: CQ Tang <cq.tang@intel.com>
Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210427085417.120246-6-matthew.auld@intel.com


# 529b9ec8 27-Apr-2021 Matthew Auld <matthew.auld@intel.com>

drm/i915/gtt: map the PD up front

We need to generalise our accessor for the page directories and tables from
using the simple kmap_atomic to support local memory, and this setup
must be done on acquisition of the backing storage prior to entering
fence execution contexts. Here we replace the kmap with the object
mapping code that for simple single page shmemfs object will return a
plain kmap, that is then kept for the lifetime of the page directory.

Note that keeping the mapping around is a potential concern here, since
while the vma is pinned the mapping remains there for the PDs
underneath, or at least until the used_count reaches zero, at which
point we can safely destroy the mapping. For 32b this will be even worse
since the address space is more limited, but since this change mostly
impacts full ppGTT platforms, the justification is that for modern
platforms we shouldn't care too much about 32b.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210427085417.120246-3-matthew.auld@intel.com


# 26ad4f8b 23-Mar-2021 Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

drm/i915: Use a single page table lock for each gtt.

We may create page table objects on the fly, but we may need to
wait with the ww lock held. Instead of waiting on a freed obj
lock, ensure we have the same lock for each object to keep
-EDEADLK working. This ensures that i915_vma_pin_ww can lock
the page tables when required.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-41-maarten.lankhorst@linux.intel.com


# 7d1c2618 23-Mar-2021 Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

drm/i915: Take reservation lock around i915_vma_pin.

We previously complained when ww == NULL.

This function is now only used in selftests to pin an object,
and ww locking is now fixed.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
[danvet: Resolve conflict because we don't have a set-domain refactor,
see
https://lore.kernel.org/intel-gfx/20210203090205.25818-8-chris@chris-wilson.co.uk/

The really worrying thing here is that the above patch had a change in
arguments for i915_gem_object_set_to_gtt_domain(), without any
explanation. I decided to just faithfully apply Maarten's change but
not the argument change which was in Maarten's context diff.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-26-maarten.lankhorst@linux.intel.com


# bfaae47d 23-Mar-2021 Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

drm/i915: make lockdep slightly happier about execbuf.

As soon as we install fences, we should stop allocating memory
in order to prevent any potential deadlocks.

This is required later on, when we start adding support for
dma-fence annotations.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-11-maarten.lankhorst@linux.intel.com


# 1eef0de1 23-Mar-2021 Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

drm/i915: Ensure we hold the object mutex in pin correctly.

Currently we have a lot of places where we hold the gem object lock,
but haven't yet been converted to the ww dance. Complain loudly about
those places.

i915_vma_pin shouldn't have the obj lock held, so we can do a ww dance,
while i915_vma_pin_ww should.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> #irc
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-6-maarten.lankhorst@linux.intel.com


# 547be6a4 23-Mar-2021 Liam Howlett <liam.howlett@oracle.com>

i915_vma: Rename vma_lookup to i915_vma_lookup

Use i915 prefix to avoid name collision with future vma_lookup() in mm.

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210323134208.3077275-1-Liam.Howlett@Oracle.com


# 537457a9 02-Nov-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Hold onto an explicit ref to i915_vma_work.pinned

Since __vma_release is run by a kworker after the fence has been
signaled, it is no longer protected by the active reference on the vma,
and so the alias of vw->pinned to vma->obj is also not protected by a
reference on the object. Add an explicit reference for vw->pinned so it
will always be safe.

Found by inspection.

Fixes: 54d7195f8c64 ("drm/i915: Unpin vma->obj on early error")
Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: <stable@vger.kernel.org> # v5.6+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201102161931.30031-1-chris@chris-wilson.co.uk
(cherry picked from commit bc73e5d33048b7ab5f12b11b5d923700467a8e1d)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# bc73e5d3 02-Nov-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Hold onto an explicit ref to i915_vma_work.pinned

Since __vma_release is run by a kworker after the fence has been
signaled, it is no longer protected by the active reference on the vma,
and so the alias of vw->pinned to vma->obj is also not protected by a
reference on the object. Add an explicit reference for vw->pinned so it
will always be safe.

Found by inspection.

Fixes: 54d7195f8c64 ("drm/i915: Unpin vma->obj on early error")
Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: <stable@vger.kernel.org> # v5.6+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201102161931.30031-1-chris@chris-wilson.co.uk


# cef8ce55 21-Sep-2020 Matthew Auld <matthew.auld@intel.com>

drm/i915: check i915_vm_alloc_pt_stash for errors

If we are really unlucky and encounter an error during
i915_vm_alloc_pt_stash, we end up passing an empty pt/pd stash all the
way down into the low-level ppgtt alloc code, leading to explosions,
since it expects at least the required number of pt/pd for the va range.

[ 211.981418] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 211.981421] #PF: supervisor read access in kernel mode
[ 211.981422] #PF: error_code(0x0000) - not-present page
[ 211.981424] PGD 80000008439cb067 P4D 80000008439cb067 PUD 84a37f067 PMD 0
[ 211.981427] Oops: 0000 [#1] SMP PTI
[ 211.981428] CPU: 1 PID: 1301 Comm: i915_selftest Tainted: G U I 5.9.0-rc5+ #3
[ 211.981430] Hardware name: /NUC6i7KYB, BIOS KYSKLi70.86A.0050.2017.0831.1924 08/31/2017
[ 211.981521] RIP: 0010:__gen8_ppgtt_alloc+0x1ed/0x3c0 [i915]
[ 211.981523] Code: c1 48 c7 c7 5d 5d fe c0 65 ff 0d ee 1d 03 3f e8 d9 91 1f e2 8b 55 c4 31 c0 48 8b 75 b8 85 d2 0f 95 c0 48 8b 1c c6 48 89 45 98 <48> 8b 03 48 8b 90 58 02 00 00 48 85 d2 0f 84 07 ea 15 00 48 81 fa
[ 211.981526] RSP: 0018:ffffba2cc0eb3970 EFLAGS: 00010202
[ 211.981527] RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000004
[ 211.981529] RDX: 0000000000000002 RSI: ffff9be998bdb8c0 RDI: ffff9be99c844300
[ 211.981530] RBP: ffffba2cc0eb39d8 R08: 0000000000000640 R09: ffff9be97cdfd000
[ 211.981531] R10: ffff9be97cdfd614 R11: 0000000000000000 R12: 0000000000000000
[ 211.981532] R13: ffff9be98607ba20 R14: ffff9be995a0b400 R15: ffffba2cc0eb39e8
[ 211.981534] FS: 00007f0f10b31000(0000) GS:ffff9be99fc40000(0000) knlGS:0000000000000000
[ 211.981536] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 211.981538] CR2: 0000000000000000 CR3: 000000084d74e006 CR4: 00000000003706e0
[ 211.981539] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 211.981541] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 211.981542] Call Trace:
[ 211.981609] gen8_ppgtt_alloc+0x79/0x90 [i915]
[ 211.981678] ppgtt_bind_vma+0x36/0x80 [i915]
[ 211.981756] __vma_bind+0x39/0x40 [i915]
[ 211.981818] fence_work+0x21/0x98 [i915]
[ 211.981879] fence_notify+0x8d/0x128 [i915]
[ 211.981939] __i915_sw_fence_complete+0x62/0x240 [i915]
[ 211.982018] i915_vma_pin_ww+0x1ee/0x9c0 [i915]

Fixes: cd0452aa2a0d ("drm/i915: Preallocate stashes for vma page-directories")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200921160844.73186-1-matthew.auld@intel.com
(cherry picked from commit 1604cb2aa7fafd83e11f9257f765a5f5dd7c19d3)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# 1604cb2a 21-Sep-2020 Matthew Auld <matthew.auld@intel.com>

drm/i915: check i915_vm_alloc_pt_stash for errors

If we are really unlucky and encounter an error during
i915_vm_alloc_pt_stash, we end up passing an empty pt/pd stash all the
way down into the low-level ppgtt alloc code, leading to explosions,
since it expects at least the required number of pt/pd for the va range.

[ 211.981418] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 211.981421] #PF: supervisor read access in kernel mode
[ 211.981422] #PF: error_code(0x0000) - not-present page
[ 211.981424] PGD 80000008439cb067 P4D 80000008439cb067 PUD 84a37f067 PMD 0
[ 211.981427] Oops: 0000 [#1] SMP PTI
[ 211.981428] CPU: 1 PID: 1301 Comm: i915_selftest Tainted: G U I 5.9.0-rc5+ #3
[ 211.981430] Hardware name: /NUC6i7KYB, BIOS KYSKLi70.86A.0050.2017.0831.1924 08/31/2017
[ 211.981521] RIP: 0010:__gen8_ppgtt_alloc+0x1ed/0x3c0 [i915]
[ 211.981523] Code: c1 48 c7 c7 5d 5d fe c0 65 ff 0d ee 1d 03 3f e8 d9 91 1f e2 8b 55 c4 31 c0 48 8b 75 b8 85 d2 0f 95 c0 48 8b 1c c6 48 89 45 98 <48> 8b 03 48 8b 90 58 02 00 00 48 85 d2 0f 84 07 ea 15 00 48 81 fa
[ 211.981526] RSP: 0018:ffffba2cc0eb3970 EFLAGS: 00010202
[ 211.981527] RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000004
[ 211.981529] RDX: 0000000000000002 RSI: ffff9be998bdb8c0 RDI: ffff9be99c844300
[ 211.981530] RBP: ffffba2cc0eb39d8 R08: 0000000000000640 R09: ffff9be97cdfd000
[ 211.981531] R10: ffff9be97cdfd614 R11: 0000000000000000 R12: 0000000000000000
[ 211.981532] R13: ffff9be98607ba20 R14: ffff9be995a0b400 R15: ffffba2cc0eb39e8
[ 211.981534] FS: 00007f0f10b31000(0000) GS:ffff9be99fc40000(0000) knlGS:0000000000000000
[ 211.981536] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 211.981538] CR2: 0000000000000000 CR3: 000000084d74e006 CR4: 00000000003706e0
[ 211.981539] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 211.981541] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 211.981542] Call Trace:
[ 211.981609] gen8_ppgtt_alloc+0x79/0x90 [i915]
[ 211.981678] ppgtt_bind_vma+0x36/0x80 [i915]
[ 211.981756] __vma_bind+0x39/0x40 [i915]
[ 211.981818] fence_work+0x21/0x98 [i915]
[ 211.981879] fence_notify+0x8d/0x128 [i915]
[ 211.981939] __i915_sw_fence_complete+0x62/0x240 [i915]
[ 211.982018] i915_vma_pin_ww+0x1ee/0x9c0 [i915]

Fixes: cd0452aa2a0d ("drm/i915: Preallocate stashes for vma page-directories")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200921160844.73186-1-matthew.auld@intel.com


# 47b08693 19-Aug-2020 Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

drm/i915: Make sure execbuffer always passes ww state to i915_vma_pin.

As a preparation step for full object locking and wait/wound handling
during pin and object mapping, ensure that we always pass the ww context
in i915_gem_execbuffer.c to i915_vma_pin, use lockdep to ensure this
happens.

This also requires changing the order of eb_parse slightly, to ensure
we pass ww at a point where we could still handle -EDEADLK safely.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-15-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# 89351925 29-Jul-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Switch to object allocations for page directories

The GEM object is grossly overweight for the practicality of tracking
large numbers of individual pages, yet it is currently our only
abstraction for tracking DMA allocations. Since those allocations need
to be reserved upfront before an operation, and that we need to break
away from simple system memory, we need to ditch using plain struct page
wrappers.

In the process, we drop the WC mapping as we ended up clflushing
everything anyway due to various issues across a wider range of
platforms. Though in a future step, we need to drop the kmap_atomic
approach which suggests we need to pre-map all the pages and keep them
mapped.

v2: Verify our large scratch page is suitably DMA aligned; and manually
clear the scratch since we are allocating plain struct pages full of
prior content.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200729164219.5737-2-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# cd0452aa 29-Jul-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Preallocate stashes for vma page-directories

We need to make the DMA allocations used for page directories to be
performed up front so that we can include those allocations in our
memory reservation pass. The downside is that we have to assume the
worst case, even before we know the final layout, and always allocate
enough page directories for this object, even when there will be overlap.
This unfortunately can be quite expensive, especially as we have to
clear/reset the page directories and DMA pages, but it should only be
required during early phases of a workload when new objects are being
discovered, or after memory/eviction pressure when we need to rebind.
Once we reach steady state, the objects should not be moved and we no
longer need to preallocating the pages tables.

It should be noted that the lifetime for the page directories DMA is
more or less decoupled from individual fences as they will be shared
across objects across timelines.

v2: Only allocate enough PD space for the PTE we may use, we do not need
to allocate PD that will be left as scratch.
v3: Store the shift unto the first PD level to encapsulate the different
PTE counts for gen6/gen8.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200729164219.5737-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# af5c6fcf 31-Jul-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Provide a fastpath for waiting on vma bindings

Before we can execute a request, we must wait for all of its vma to be
bound. This is a frequent operation for which we can optimise away a
few atomic operations (notably a cmpxchg) in lieu of the RCU protection.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200731085015.32368-7-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# cf1976b1 02-Jul-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Also drop vm.ref along error paths for vma construction

Not only do we need to release the vm.ref we acquired for the vma on the
duplicate insert branch, but also for the normal error paths, so roll
them all into one.

Reported-by: Andi Shyti <andi.shyti@intel.com>
Suggested-by: Andi Shyti <andi.shyti@intel.com>
Fixes: 2850748ef876 ("drm/i915: Pull i915_vma_pin under the vm->mutex")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: <stable@vger.kernel.org> # v5.5+
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200702211015.29604-1-chris@chris-wilson.co.uk
(cherry picked from commit 03fca66b7a36b52da8915341eee388267f6d5b73)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# 42723673 02-Jul-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Drop vm.ref for duplicate vma on construction

As we allow for parallel threads to create the same vma instance
concurrently, and we only filter out the duplicates upon reacquiring the
spinlock for the rbtree, we have to free the loser of the constructors'
race. When freeing, we should also drop any resource references acquired
for the redundant vma.

Fixes: 2850748ef876 ("drm/i915: Pull i915_vma_pin under the vm->mutex")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: <stable@vger.kernel.org> # v5.5+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200702083225.20044-1-chris@chris-wilson.co.uk
(cherry picked from commit 2377427cdd2b7514eb4c40241cf5c4dec63c1bec)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# 12b07256 03-Jul-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Export ppgtt_bind_vma

Reuse the ppgtt_bind_vma() for aliasing_ppgtt_bind_vma() so we can
reduce some code near-duplication. The catch is that we need to then
pass along the i915_address_space and not rely on vma->vm, as they
differ with the aliasing-ppgtt.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200703102519.26539-1-chris@chris-wilson.co.uk


# 03fca66b 02-Jul-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Also drop vm.ref along error paths for vma construction

Not only do we need to release the vm.ref we acquired for the vma on the
duplicate insert branch, but also for the normal error paths, so roll
them all into one.

Reported-by: Andi Shyti <andi.shyti@intel.com>
Suggested-by: Andi Shyti <andi.shyti@intel.com>
Fixes: 2850748ef876 ("drm/i915: Pull i915_vma_pin under the vm->mutex")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: <stable@vger.kernel.org> # v5.5+
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200702211015.29604-1-chris@chris-wilson.co.uk


# 2377427c 02-Jul-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Drop vm.ref for duplicate vma on construction

As we allow for parallel threads to create the same vma instance
concurrently, and we only filter out the duplicates upon reacquiring the
spinlock for the rbtree, we have to free the loser of the constructors'
race. When freeing, we should also drop any resource references acquired
for the redundant vma.

Fixes: 2850748ef876 ("drm/i915: Pull i915_vma_pin under the vm->mutex")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: <stable@vger.kernel.org> # v5.5+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200702083225.20044-1-chris@chris-wilson.co.uk


# 51dc276d 11-Jun-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Leave vma intact as they are discarded

If we find ourselves trying to reuse a misplaced but active vma, we
currently try to discard it to avoid having to wait to unbind it
(upsetting the current user fo the vma). An alternative to marking it as
a dicarded vma and keeping it in both the obj->vma.list and
obj->vma.tree, is to simply remove it from the lookup rbtree.

While it remains in the list of vma, it will be unbound under eviction
pressure and freed along with the object. We will never reuse it again
for new instances. As before, with no pruning, the list may continually
grow, but eventually we will have the most constrained version of the
ggtt view that meets all requirements -- so the list of vma should not
grow without bound.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2012
Fixes: 9bdcaa5e3a2f ("drm/i915: Discard a misplaced GGTT vma")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200611180421.23262-1-chris@chris-wilson.co.uk


# bffa18dd 28-May-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Remove local entries from GGTT on suspend

Across suspend/resume, we clear the entire GGTT and rebuild from
scratch. In particular, we want to only preserve the global entries for
use by the HW, and delay reinstating the local binds until required by
the user. This means that we can evict any local binds in the global GTT,
saving any time in preserving their state, as they will be rebound on
demand.

References: https://gitlab.freedesktop.org/drm/intel/-/issues/1947
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200528082427.21402-2-chris@chris-wilson.co.uk


# aedbe0a1 21-May-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove PIN_UPDATE for i915_vma_pin

As we no longer use PIN_UPDATE (since commit 7d0aa0db4375 ("drm/i915/gem:
Unbind all current vma on changing cache-level")) we can remove
PIN_UPDATE itself. The benefit is just in simplifing the vma bind.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200521144949.25357-1-chris@chris-wilson.co.uk


# af23facc 03-Apr-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Check current i915_vma.pin_count status first on unbind

Do an early rejection of a i915_vma_unbind() attempt if the i915_vma is
currently pinned, without waiting to see if the inflight operations may
unpin it. We see this problem with the shrinker trying to unbind the
active vma from inside its bind worker:

<6> [472.618968] Workqueue: events_unbound fence_work [i915]
<4> [472.618970] Call Trace:
<4> [472.618974] ? __schedule+0x2e5/0x810
<4> [472.618978] schedule+0x37/0xe0
<4> [472.618982] schedule_preempt_disabled+0xf/0x20
<4> [472.618984] __mutex_lock+0x281/0x9c0
<4> [472.618987] ? mark_held_locks+0x49/0x70
<4> [472.618989] ? _raw_spin_unlock_irqrestore+0x47/0x60
<4> [472.619038] ? i915_vma_unbind+0xae/0x110 [i915]
<4> [472.619084] ? i915_vma_unbind+0xae/0x110 [i915]
<4> [472.619122] i915_vma_unbind+0xae/0x110 [i915]
<4> [472.619165] i915_gem_object_unbind+0x1dc/0x400 [i915]
<4> [472.619208] i915_gem_shrink+0x328/0x660 [i915]
<4> [472.619250] ? i915_gem_shrink_all+0x38/0x60 [i915]
<4> [472.619282] i915_gem_shrink_all+0x38/0x60 [i915]
<4> [472.619325] vm_alloc_page.constprop.25+0x1aa/0x240 [i915]
<4> [472.619330] ? rcu_read_lock_sched_held+0x4d/0x80
<4> [472.619363] ? __alloc_pd+0xb/0x30 [i915]
<4> [472.619366] ? module_assert_mutex_or_preempt+0xf/0x30
<4> [472.619368] ? __module_address+0x23/0xe0
<4> [472.619371] ? is_module_address+0x26/0x40
<4> [472.619374] ? static_obj+0x34/0x50
<4> [472.619376] ? lockdep_init_map+0x4d/0x1e0
<4> [472.619407] setup_page_dma+0xd/0x90 [i915]
<4> [472.619437] alloc_pd+0x29/0x50 [i915]
<4> [472.619470] __gen8_ppgtt_alloc+0x443/0x6b0 [i915]
<4> [472.619503] gen8_ppgtt_alloc+0xd7/0x300 [i915]
<4> [472.619535] ppgtt_bind_vma+0x2a/0xe0 [i915]
<4> [472.619577] __vma_bind+0x26/0x40 [i915]
<4> [472.619611] fence_work+0x1c/0x90 [i915]
<4> [472.619617] process_one_work+0x26a/0x620

Fixes: 2850748ef876 ("drm/i915: Pull i915_vma_pin under the vm->mutex")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200403120150.17091-1-chris@chris-wilson.co.uk
(cherry picked from commit 614654abe847a42fc75d7eb5096e46f796a438b6)
(cherry picked from commit dd086cf516d9bea3878abb267f62ccc53acd764b)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# f524a774 22-Apr-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gem: Hold obj->vma.lock over for_each_ggtt_vma()

While the ggtt vma are protected by their object lifetime, the list
continues until it hits a non-ggtt vma, and that vma is not protected
and may be freed as we inspect it. Hence, we require the obj->vma.lock
to protect the list as we iterate.

An example of forgetting to hold the obj->vma.lock is

[1642834.464973] general protection fault, probably for non-canonical address 0xdead000000000122: 0000 [#1] SMP PTI
[1642834.464977] CPU: 3 PID: 1954 Comm: Xorg Not tainted 5.6.0-300.fc32.x86_64 #1
[1642834.464979] Hardware name: LENOVO 20ARS25701/20ARS25701, BIOS GJET94WW (2.44 ) 09/14/2017
[1642834.465021] RIP: 0010:i915_gem_object_set_tiling+0x2c0/0x3e0 [i915]
[1642834.465024] Code: 8b 84 24 18 01 00 00 f6 c4 80 74 59 49 8b 94 24 a0 00 00 00 49 8b 84 24 e0 00 00 00 49 8b 74 24 10 48 8b 92 30 01 00 00 89 c7 <80> ba 0a 06 00 00 03 0f 87 86 00 00 00 ba 00 00 08 00 b9 00 00 10
[1642834.465025] RSP: 0018:ffffa98780c77d60 EFLAGS: 00010282
[1642834.465028] RAX: ffff8d232bfb2578 RBX: 0000000000000002 RCX: ffff8d25873a0000
[1642834.465029] RDX: dead000000000122 RSI: fffff0af8ac6e408 RDI: 000000002bfb2578
[1642834.465030] RBP: ffff8d25873a0000 R08: ffff8d252bfb5638 R09: 0000000000000000
[1642834.465031] R10: 0000000000000000 R11: ffff8d252bfb5640 R12: ffffa987801cb8f8
[1642834.465032] R13: 0000000000001000 R14: ffff8d233e972e50 R15: ffff8d233e972d00
[1642834.465034] FS: 00007f6a3d327f00(0000) GS:ffff8d25926c0000(0000) knlGS:0000000000000000
[1642834.465036] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[1642834.465037] CR2: 00007f6a2064d000 CR3: 00000002fb57c001 CR4: 00000000001606e0
[1642834.465038] Call Trace:
[1642834.465083] i915_gem_set_tiling_ioctl+0x122/0x230 [i915]
[1642834.465121] ? i915_gem_object_set_tiling+0x3e0/0x3e0 [i915]
[1642834.465151] drm_ioctl_kernel+0x86/0xd0 [drm]
[1642834.465156] ? avc_has_perm+0x3b/0x160
[1642834.465178] drm_ioctl+0x206/0x390 [drm]
[1642834.465216] ? i915_gem_object_set_tiling+0x3e0/0x3e0 [i915]
[1642834.465221] ? selinux_file_ioctl+0x122/0x1c0
[1642834.465226] ? __do_munmap+0x24b/0x4d0
[1642834.465231] ksys_ioctl+0x82/0xc0
[1642834.465235] __x64_sys_ioctl+0x16/0x20
[1642834.465238] do_syscall_64+0x5b/0xf0
[1642834.465243] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[1642834.465245] RIP: 0033:0x7f6a3d7b047b
[1642834.465247] Code: 0f 1e fa 48 8b 05 1d aa 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ed a9 0c 00 f7 d8 64 89 01 48
[1642834.465249] RSP: 002b:00007ffe71adba28 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[1642834.465251] RAX: ffffffffffffffda RBX: 000055f99048fa40 RCX: 00007f6a3d7b047b
[1642834.465253] RDX: 00007ffe71adba30 RSI: 00000000c0106461 RDI: 000000000000000e
[1642834.465254] RBP: 0000000000000002 R08: 000055f98f3f1798 R09: 0000000000000002
[1642834.465255] R10: 0000000000001000 R11: 0000000000000246 R12: 0000000000000080
[1642834.465257] R13: 000055f98f3f1690 R14: 00000000c0106461 R15: 00007ffe71adba30

Now to take the spinlock during the list iteration, we need to break it
down into two phases. In the first phase under the lock, we cannot sleep
and so must defer the actual work to a second list, protected by the
ggtt->mutex.

We also need to hold the spinlock during creation of a new vma to
serialise with updates of the tiling on the object.

Reported-by: Dave Airlie <airlied@redhat.com>
Fixes: 2850748ef876 ("drm/i915: Pull i915_vma_pin under the vm->mutex")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: <stable@vger.kernel.org> # v5.5+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200422072805.17340-1-chris@chris-wilson.co.uk
(cherry picked from commit cb593e5d2b6d3ad489669914d9fd1c64c7a4a6af)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# 50689771 22-Apr-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Only close vma we open

The history of i915_vma_close() is confusing, as is its use. As the
lifetime of the i915_vma is currently bounded by the object it is
attached to, we needed a means of identify when a vma was no longer in
use by userspace (via the user's fd). This is further complicated by
that only ppgtt vma should be closed at the user's behest, as the ggtt
were always shared.

Now that we attach the vma to a lut on the user's context, the open
count does indicate how many unique and open context/vm are referencing
this vma from the user. As such, we can and should just use the
open_count to track when the vma is still in use by userspace.

It's a poor man's replacement for reference counting.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/1193
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200422190558.30509-1-chris@chris-wilson.co.uk


# cb593e5d 22-Apr-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gem: Hold obj->vma.lock over for_each_ggtt_vma()

While the ggtt vma are protected by their object lifetime, the list
continues until it hits a non-ggtt vma, and that vma is not protected
and may be freed as we inspect it. Hence, we require the obj->vma.lock
to protect the list as we iterate.

An example of forgetting to hold the obj->vma.lock is

[1642834.464973] general protection fault, probably for non-canonical address 0xdead000000000122: 0000 [#1] SMP PTI
[1642834.464977] CPU: 3 PID: 1954 Comm: Xorg Not tainted 5.6.0-300.fc32.x86_64 #1
[1642834.464979] Hardware name: LENOVO 20ARS25701/20ARS25701, BIOS GJET94WW (2.44 ) 09/14/2017
[1642834.465021] RIP: 0010:i915_gem_object_set_tiling+0x2c0/0x3e0 [i915]
[1642834.465024] Code: 8b 84 24 18 01 00 00 f6 c4 80 74 59 49 8b 94 24 a0 00 00 00 49 8b 84 24 e0 00 00 00 49 8b 74 24 10 48 8b 92 30 01 00 00 89 c7 <80> ba 0a 06 00 00 03 0f 87 86 00 00 00 ba 00 00 08 00 b9 00 00 10
[1642834.465025] RSP: 0018:ffffa98780c77d60 EFLAGS: 00010282
[1642834.465028] RAX: ffff8d232bfb2578 RBX: 0000000000000002 RCX: ffff8d25873a0000
[1642834.465029] RDX: dead000000000122 RSI: fffff0af8ac6e408 RDI: 000000002bfb2578
[1642834.465030] RBP: ffff8d25873a0000 R08: ffff8d252bfb5638 R09: 0000000000000000
[1642834.465031] R10: 0000000000000000 R11: ffff8d252bfb5640 R12: ffffa987801cb8f8
[1642834.465032] R13: 0000000000001000 R14: ffff8d233e972e50 R15: ffff8d233e972d00
[1642834.465034] FS: 00007f6a3d327f00(0000) GS:ffff8d25926c0000(0000) knlGS:0000000000000000
[1642834.465036] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[1642834.465037] CR2: 00007f6a2064d000 CR3: 00000002fb57c001 CR4: 00000000001606e0
[1642834.465038] Call Trace:
[1642834.465083] i915_gem_set_tiling_ioctl+0x122/0x230 [i915]
[1642834.465121] ? i915_gem_object_set_tiling+0x3e0/0x3e0 [i915]
[1642834.465151] drm_ioctl_kernel+0x86/0xd0 [drm]
[1642834.465156] ? avc_has_perm+0x3b/0x160
[1642834.465178] drm_ioctl+0x206/0x390 [drm]
[1642834.465216] ? i915_gem_object_set_tiling+0x3e0/0x3e0 [i915]
[1642834.465221] ? selinux_file_ioctl+0x122/0x1c0
[1642834.465226] ? __do_munmap+0x24b/0x4d0
[1642834.465231] ksys_ioctl+0x82/0xc0
[1642834.465235] __x64_sys_ioctl+0x16/0x20
[1642834.465238] do_syscall_64+0x5b/0xf0
[1642834.465243] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[1642834.465245] RIP: 0033:0x7f6a3d7b047b
[1642834.465247] Code: 0f 1e fa 48 8b 05 1d aa 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ed a9 0c 00 f7 d8 64 89 01 48
[1642834.465249] RSP: 002b:00007ffe71adba28 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[1642834.465251] RAX: ffffffffffffffda RBX: 000055f99048fa40 RCX: 00007f6a3d7b047b
[1642834.465253] RDX: 00007ffe71adba30 RSI: 00000000c0106461 RDI: 000000000000000e
[1642834.465254] RBP: 0000000000000002 R08: 000055f98f3f1798 R09: 0000000000000002
[1642834.465255] R10: 0000000000001000 R11: 0000000000000246 R12: 0000000000000080
[1642834.465257] R13: 000055f98f3f1690 R14: 00000000c0106461 R15: 00007ffe71adba30

Now to take the spinlock during the list iteration, we need to break it
down into two phases. In the first phase under the lock, we cannot sleep
and so must defer the actual work to a second list, protected by the
ggtt->mutex.

We also need to hold the spinlock during creation of a new vma to
serialise with updates of the tiling on the object.

Reported-by: Dave Airlie <airlied@redhat.com>
Fixes: 2850748ef876 ("drm/i915: Pull i915_vma_pin under the vm->mutex")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: <stable@vger.kernel.org> # v5.5+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200422072805.17340-1-chris@chris-wilson.co.uk


# 442dbc5c 06-Apr-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Make exclusive awaits on i915_active optional

Later use will require asynchronous waits on the active timelines, but
will not utilize an async wait on the exclusive channel. Make the await
on the exclusive fence explicit in the selection flags.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200406155840.1728-1-chris@chris-wilson.co.uk


# 614654ab 03-Apr-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Check current i915_vma.pin_count status first on unbind

Do an early rejection of a i915_vma_unbind() attempt if the i915_vma is
currently pinned, without waiting to see if the inflight operations may
unpin it. We see this problem with the shrinker trying to unbind the
active vma from inside its bind worker:

<6> [472.618968] Workqueue: events_unbound fence_work [i915]
<4> [472.618970] Call Trace:
<4> [472.618974] ? __schedule+0x2e5/0x810
<4> [472.618978] schedule+0x37/0xe0
<4> [472.618982] schedule_preempt_disabled+0xf/0x20
<4> [472.618984] __mutex_lock+0x281/0x9c0
<4> [472.618987] ? mark_held_locks+0x49/0x70
<4> [472.618989] ? _raw_spin_unlock_irqrestore+0x47/0x60
<4> [472.619038] ? i915_vma_unbind+0xae/0x110 [i915]
<4> [472.619084] ? i915_vma_unbind+0xae/0x110 [i915]
<4> [472.619122] i915_vma_unbind+0xae/0x110 [i915]
<4> [472.619165] i915_gem_object_unbind+0x1dc/0x400 [i915]
<4> [472.619208] i915_gem_shrink+0x328/0x660 [i915]
<4> [472.619250] ? i915_gem_shrink_all+0x38/0x60 [i915]
<4> [472.619282] i915_gem_shrink_all+0x38/0x60 [i915]
<4> [472.619325] vm_alloc_page.constprop.25+0x1aa/0x240 [i915]
<4> [472.619330] ? rcu_read_lock_sched_held+0x4d/0x80
<4> [472.619363] ? __alloc_pd+0xb/0x30 [i915]
<4> [472.619366] ? module_assert_mutex_or_preempt+0xf/0x30
<4> [472.619368] ? __module_address+0x23/0xe0
<4> [472.619371] ? is_module_address+0x26/0x40
<4> [472.619374] ? static_obj+0x34/0x50
<4> [472.619376] ? lockdep_init_map+0x4d/0x1e0
<4> [472.619407] setup_page_dma+0xd/0x90 [i915]
<4> [472.619437] alloc_pd+0x29/0x50 [i915]
<4> [472.619470] __gen8_ppgtt_alloc+0x443/0x6b0 [i915]
<4> [472.619503] gen8_ppgtt_alloc+0xd7/0x300 [i915]
<4> [472.619535] ppgtt_bind_vma+0x2a/0xe0 [i915]
<4> [472.619577] __vma_bind+0x26/0x40 [i915]
<4> [472.619611] fence_work+0x1c/0x90 [i915]
<4> [472.619617] process_one_work+0x26a/0x620

Fixes: 2850748ef876 ("drm/i915: Pull i915_vma_pin under the vm->mutex")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200403120150.17091-1-chris@chris-wilson.co.uk


# 9657aaa2 03-Apr-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Revoke mmap before fence

Make sure we revoke the user's mmaps of this vma to force them to take a
pagefault *before* we remove the associated aperture detiling register.

Fixes: 0d86ee35097a ("drm/i915/gt: Make fence revocation unequivocal")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200403160951.8271-1-chris@chris-wilson.co.uk


# 9da0ea09 01-Apr-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gem: Drop cached obj->bind_count

We cached the number of vma bound to the object in order to speed up
shrinker decisions. This has been superseded by being more proactive in
removing objects we cannot shrink from the shrinker lists, and so we can
drop the clumsy attempt at atomically counting the bind count and
comparing it to the number of pinned mappings of the object. This will
only get more clumsier with asynchronous binding and unbinding.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200401223924.16667-1-chris@chris-wilson.co.uk


# 0d86ee35 01-Apr-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Make fence revocation unequivocal

If we must revoke the fence because the VMA is no longer present, or
because the fence no longer applies, ensure that we do and convert it
into an error if we try but cannot.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200401210104.15907-3-chris@chris-wilson.co.uk


# 63baf4f3 01-Apr-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Only wait for GPU activity before unbinding a GGTT fence

Only GPU activity via the GGTT fence is asynchronous, we know that we
control the CPU access directly, so we only need to wait for the GPU to
stop using the fence before we relinquish it.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200401210104.15907-1-chris@chris-wilson.co.uk


# d0024911 26-Mar-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Differentiate between aliasing-ppgtt and ggtt pinning

Userptr causes lockdep to complain when we are using the aliasing-ppgtt
(and ggtt, but for that it is rightfully so to complain about) in that
when we revoke the userptr we take a mutex which we also use to revoke
the mmaps. However, we only revoke mmaps for GGTT bindings and we never
allow userptr to create a GGTT binding so the warning should be false
and is simply caused by our conflation of the aliasing-ppgtt with the
ggtt. So lets try treating the binding into the aliasing-ppgtt as a
separate lockclass from the ggtt. The downside is that we are
deliberately suppressing lockdep;s ability to warn us of cycles.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/478
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200326142727.31962-1-chris@chris-wilson.co.uk


# b0647a5e 23-Mar-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Avoid live-lock with i915_vma_parked()

Abuse^W Take advantage that we know we are inside the GT wakeref and
that prevents any client execbuf from reopening the i915_vma in order to
claim all the vma to close without having to drop the spinlock to free
each one individually. By keeping the spinlock, we do not have to
restart if we run concurrently with i915_gem_free_objects -- which
causes them both to restart continually and make very very slow
progress.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/1361
Fixes: 77853186e547 ("drm/i915: Claim vma while under closed_lock in i915_vma_parked()")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200323092841.22240-2-chris@chris-wilson.co.uk
(cherry picked from commit 3447c4c55d0edc95742fdcd91c3efb050546b907)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# 92581f9f 24-Mar-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Immediately execute the fenced work

If the caller allows and we do not have to wait for any signals,
immediately execute the work within the caller's process. By doing so we
avoid the overhead of scheduling a new task, and the latency in
executing it, at the cost of pulling that work back into the immediate
context. (Sometimes we still prefer to offload the task to another cpu,
especially if we plan on executing many such tasks in parallel for this
client.)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200325120227.8044-2-chris@chris-wilson.co.uk


# 3447c4c5 23-Mar-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Avoid live-lock with i915_vma_parked()

Abuse^W Take advantage that we know we are inside the GT wakeref and
that prevents any client execbuf from reopening the i915_vma in order to
claim all the vma to close without having to drop the spinlock to free
each one individually. By keeping the spinlock, we do not have to
restart if we run concurrently with i915_gem_free_objects -- which
causes them both to restart continually and make very very slow
progress.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/1361
Fixes: 77853186e547 ("drm/i915: Claim vma while under closed_lock in i915_vma_parked()")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200323092841.22240-2-chris@chris-wilson.co.uk


# 29e6ecf3 11-Mar-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Extend i915_request_await_active to use all timelines

Extend i915_request_await_active() to be able to asynchronously wait on
all the tracked timelines simultaneously.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200311092044.16353-1-chris@chris-wilson.co.uk


# 2f000308 03-Mar-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Drop vma is-closed assertion on insert

The is-closed flag may be added after we have acquired the vma under the
ctx->mutex, but will not take effect until after we release the
vm->mutex. i.e. the flag may be set on the vma as attempt to bind it and
that will cause the vma to be unbound later after we unpin it.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200303093157.1153887-1-chris@chris-wilson.co.uk


# 00de702c 20-Feb-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Check that the vma hasn't been closed before we insert it

As there is a delay before we pin a vma, there is an opportunity for
another thread to have closed the vm and its vma (including us).
Check as soon as we acquire the vm->mutex and know the vm/vma is stable.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/1291
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200221121940.2741563-1-chris@chris-wilson.co.uk


# e4edd4fc 23-Jan-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Check activity on i915_vma after confirming pin_count==0

Only assert that the i915_vma is now idle if and only if no other pins
are present. If another user has the i915_vma pinned, they may submit
more work to the i915_vma skipping the vm->mutex used to serialise the
unbind. We need to wait again, if we want to continue and unbind this
vma.

However, if we own the i915_vma (we hold the vm->mutex for the unbind
and the pin_count is 0), we can assert that the vma remains idle as we
unbind.

Fixes: 2850748ef876 ("drm/i915: Pull i915_vma_pin under the vm->mutex")
Closes: https://gitlab.freedesktop.org/drm/intel/issues/530
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200123224459.38128-1-chris@chris-wilson.co.uk
(cherry picked from commit 60e94557fff1f5514c7fc4da7ddc2c7a13ffff26)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# 30ca04e1 03-Feb-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Hold reference to previous active fence as we queue

Take a reference to the previous exclusive fence on the i915_active, as
we wish to add an await to it in the caller (and so must prevent it from
being freed until we have completed that task).

Fixes: e3793468b466 ("drm/i915: Use the async worker to avoid reclaim tainting the ggtt->mutex")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200203094152.4150550-1-chris@chris-wilson.co.uk


# e3793468 30-Jan-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Use the async worker to avoid reclaim tainting the ggtt->mutex

On Braswell and Broxton (also known as Valleyview and Apollolake), we
need to serialise updates of the GGTT using the big stop_machine()
hammer. This has the side effect of appearing to lockdep as a possible
reclaim (since it uses the cpuhp mutex and that is tainted by per-cpu
allocations). However, we want to use vm->mutex (including ggtt->mutex)
from within the shrinker and so must avoid such possible taints. For this
purpose, we introduced the asynchronous vma binding and we can apply it
to the PIN_GLOBAL so long as take care to add the necessary waits for
the worker afterwards.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/211
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200130181710.2030251-3-chris@chris-wilson.co.uk


# d62f416f 23-Jan-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Wait on vma activity before taking the mutex

Optimistically wait for the prior vma activity before taking the mutex
to minimise the mutex hold time while unbinding. We will then verify the
vma is idle with a second wait under the mutex to ensure it is safe to
unbind.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200123224459.38128-2-chris@chris-wilson.co.uk


# 60e94557 23-Jan-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Check activity on i915_vma after confirming pin_count==0

Only assert that the i915_vma is now idle if and only if no other pins
are present. If another user has the i915_vma pinned, they may submit
more work to the i915_vma skipping the vm->mutex used to serialise the
unbind. We need to wait again, if we want to continue and unbind this
vma.

However, if we own the i915_vma (we hold the vm->mutex for the unbind
and the pin_count is 0), we can assert that the vma remains idle as we
unbind.

Fixes: 2850748ef876 ("drm/i915: Pull i915_vma_pin under the vm->mutex")
Closes: https://gitlab.freedesktop.org/drm/intel/issues/530
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200123224459.38128-1-chris@chris-wilson.co.uk


# 5424f5d7 21-Jan-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Clear the GGTT_WRITE bit on unbinding the vma

While we do flush writes to the vma before unbinding (to make sure they
go through the right detiling register), we may also be concurrently
poking at the GGTT_WRITE bit from set-domain, as we mark all GGTT vma
associated with an object. We know this is for another vma, as we
are currently unbinding this one -- so if this vma will be reused, it
will be refaulted and have its dirty bit set before the next write.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/999
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200121222447.419489-1-chris@chris-wilson.co.uk


# c0e60347 10-Jan-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Hold rpm wakeref before taking ggtt->vm.mutex

We need to hold the runtime-pm wakeref to update the global PTEs (as
they exist behind a PCI BAR). However, some systems invoke ACPI during
runtime resume and so require allocations, which is verboten inside the
vm->mutex. Ergo, we must not use intel_runtime_pm_get() inside the
mutex, but lift the call outside.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/958
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200110144418.1415639-1-chris@chris-wilson.co.uk


# a5972e93 08-Jan-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Reduce warning for i915_vma_pin_iomap() without runtime-pm

Access through the GGTT (iomap) into the vma does require the device to
be awake. However, we often take the i915_vma_pin_iomap() as an early
preparatory step that is long before we use the iomap. Asserting that
the device is awake at pin time does not protect us, and is merely a
nuisance.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200108153550.3803446-2-chris@chris-wilson.co.uk


# 76f9764c 22-Dec-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Introduce a vma.kref

Start introducing a kref on i915_vma in order to protect the vma unbind
(i915_gem_object_unbind) from a parallel destruction (i915_vma_parked).
Later, we will use the refcount to manage all access and turn i915_vma
into a first class container.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Acked-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191222210256.2066451-2-chris@chris-wilson.co.uk


# e85ade1f 18-Dec-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Hold reference to intel_frontbuffer as we track activity

Since obj->frontbuffer is no longer protected by the struct_mutex, as we
are processing the execbuf, it may be removed. Mark the
intel_frontbuffer as rcu protected, and so acquire a reference to
the struct as we track activity upon it.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/827
Fixes: 8e7cb1799b4f ("drm/i915: Extract intel_frontbuffer active tracking")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: <stable@vger.kernel.org> # v5.4+
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191218104043.3539458-1-chris@chris-wilson.co.uk
(cherry picked from commit da42104f589d979bbe402703fd836cec60befae1)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# e6ba7648 21-Dec-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove i915->kernel_context

Allocate only an internal intel_context for the kernel_context, forgoing
a global GEM context for internal use as we only require a separate
address space (for our own protection).

Now having weaned GT from requiring ce->gem_context, we can stop
referencing it entirely. This also means we no longer have to create random
and unnecessary GEM contexts for internal use.

GEM contexts are now entirely for tracking GEM clients, and intel_context
the execution environment on the GPU.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Acked-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191221160324.1073045-1-chris@chris-wilson.co.uk


# da42104f 18-Dec-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Hold reference to intel_frontbuffer as we track activity

Since obj->frontbuffer is no longer protected by the struct_mutex, as we
are processing the execbuf, it may be removed. Mark the
intel_frontbuffer as rcu protected, and so acquire a reference to
the struct as we track activity upon it.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/827
Fixes: 8e7cb1799b4f ("drm/i915: Extract intel_frontbuffer active tracking")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: <stable@vger.kernel.org> # v5.4+
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191218104043.3539458-1-chris@chris-wilson.co.uk


# 54d7195f 16-Dec-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Unpin vma->obj on early error

If we inherit an error along the fence chain, we skip the main work
callback and go straight to the error. In the case of the vma bind
worker, we only dropped the pinned pages from the worker.

In the process, make sure we call the release earlier rather than wait
until the final reference to the fence is dropped (as a reference is
kept while being listened upon).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191216161717.2688274-1-chris@chris-wilson.co.uk


# d3e48352 08-Dec-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Change i915_vma_unbind() to report -EAGAIN on activity

If someone else acquires the i915_vma before we complete our wait and
unbind it, we currently error out with -EBUSY. Use -EAGAIN instead so
that if necessary the caller is prepared to try again.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/683
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191208161252.3015727-2-chris@chris-wilson.co.uk


# 77853186 05-Dec-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Claim vma while under closed_lock in i915_vma_parked()

Remove the vma we wish to destroy from the gt->closed_list to avoid
having two i915_vma_parked() try and free it.

Fixes: aa5e4453dc05 ("drm/i915/gem: Try to flush pending unbind events")
References: 2850748ef876 ("drm/i915: Pull i915_vma_pin under the vm->mutex")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191205214159.829727-1-chris@chris-wilson.co.uk


# ccd20945 05-Dec-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Try hard to bind the context

It is not acceptable for context pinning to fail with -ENOSPC as we
should always be able to make space in the GGTT. The only reason we may
fail is that other "temporary" context pins are reserving their space
and we need to wait for an available slot.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/676
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191205113726.413351-2-chris@chris-wilson.co.uk


# cc662126 03-Dec-2019 Abdiel Janulgue <abdiel.janulgue@linux.intel.com>

drm/i915: Introduce DRM_I915_GEM_MMAP_OFFSET

This is really just an alias of mmap_gtt. The 'mmap offset' nomenclature
comes from the value returned by this ioctl which is the offset into the
device fd which userpace uses with mmap(2).

mmap_gtt was our initial mmap_offset implementation, this extends
our CPU mmap support to allow additional fault handlers that depends on
the object's backing pages.

Note that we multiplex mmap_gtt and mmap_offset through the same ioctl,
and use the zero extending behaviour of drm to differentiate between
them, when we inspect the flags.

To support multiple mmap types on an object we need to support multiple
mmap_offsets for an object (each offset in the global device address
space corresponding to a unique instance of the object for a file + mmap
type). As we drop the simplified drm core idea of a single mmap_offset,
we need to provide replacement hooks for the dumb mmap interface as
well.

Link: https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1675
Testcase: igt/gem_mmap_offset
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191204120032.3682839-1-chris@chris-wilson.co.uk


# dde01d94 30-Oct-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Split detaching and removing the vma

In order to keep the assert_bind_count() valid, we need to hold the vma
page reference until after we drop the bind count. However, we must also
keep the drm_mm_remove_node() as the last action of i915_vma_unbind() so
that it serialises with the unlocked check inside i915_vma_destroy(). So
we need to split up i915_vma_remove() so that we order the detach, drop
pages and remove as required during unbind.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112067
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191030192159.18404-1-chris@chris-wilson.co.uk


# 71e51ca8 21-Oct-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Lift i915_vma_parked() onto the gt

Currently even though i915_vma_parked() operates on a per-gt struct, it
is called from a global pm notify. This oddity was only because the long
term plan is to decouple the vma cache from the pm notification, but
right now the oddity stands out like a sore thumb!

Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191021183236.21790-1-chris@chris-wilson.co.uk


# 454a325a 15-Oct-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove leftover vma->obj->pages_pin_count on insert/remove

We now do the page pin count upfront in vma_get_pages/vma_put_pages, so
that we do the allocations before we enter the vm->mutex. Our vma
page references we are tracked in vma->pages_count and the extra
obj->pages_pin_count being performed later in i915_vma_insert and
i915_vma_remove is redundant, and worse throws off the shrinker's logic
on when it can free an object by unbinding it.

Reported-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reported-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191015100155.10376-1-chris@chris-wilson.co.uk


# 56184a20 15-Oct-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Drop obj.page_pin_count after a failed vma->set_pages()

Before we attempt to set_pages on the vma, we claim a
obj.pages_pin_count for it. If we subsequently fail to set the pages on
the vma, we need to drop our pinning before returning the error.

Reported-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191015093915.3995-1-chris@chris-wilson.co.uk


# b1e3177b 04-Oct-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Coordinate i915_active with its own mutex

Forgo the struct_mutex serialisation for i915_active, and interpose its
own mutex handling for active/retire.

This is a multi-layered sleight-of-hand. First, we had to ensure that no
active/retire callbacks accidentally inverted the mutex ordering rules,
nor assumed that they were themselves serialised by struct_mutex. More
challenging though, is the rule over updating elements of the active
rbtree. Instead of the whole i915_active now being serialised by
struct_mutex, allocations/rotations of the tree are serialised by the
i915_active.mutex and individual nodes are serialised by the caller
using the i915_timeline.mutex (we need to use nested spinlocks to
interact with the dma_fence callback lists).

The pain point here is that instead of a single mutex around execbuf, we
now have to take a mutex for active tracker (one for each vma, context,
etc) and a couple of spinlocks for each fence update. The improvement in
fine grained locking allowing for multiple concurrent clients
(eventually!) should be worth it in typical loads.

v2: Add some comments that barely elucidate anything :(

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-6-chris@chris-wilson.co.uk


# 274cbf20 04-Oct-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Push the i915_active.retire into a worker

As we need to use a mutex to serialise i915_active activation
(because we want to allow the callback to sleep), we need to push the
i915_active.retire into a worker callback in case we get need to retire
from an atomic context.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-5-chris@chris-wilson.co.uk


# 2850748e 04-Oct-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Pull i915_vma_pin under the vm->mutex

Replace the struct_mutex requirement for pinning the i915_vma with the
local vm->mutex instead. Note that the vm->mutex is tainted by the
shrinker (we require unbinding from inside fs-reclaim) and so we cannot
allocate while holding that mutex. Instead we have to preallocate
workers to do allocate and apply the PTE updates after we have we
reserved their slot in the drm_mm (using fences to order the PTE writes
with the GPU work and with later unbind).

In adding the asynchronous vma binding, one subtle requirement is to
avoid coupling the binding fence into the backing object->resv. That is
the asynchronous binding only applies to the vma timeline itself and not
to the pages as that is a more global timeline (the binding of one vma
does not need to be ordered with another vma, nor does the implicit GEM
fencing depend on a vma, only on writes to the backing store). Keeping
the vma binding distinct from the backing store timelines is verified by
a number of async gem_exec_fence and gem_exec_schedule tests. The way we
do this is quite simple, we keep the fence for the vma binding separate
and only wait on it as required, and never add it to the obj->resv
itself.

Another consequence in reducing the locking around the vma is the
destruction of the vma is no longer globally serialised by struct_mutex.
A natural solution would be to add a kref to i915_vma, but that requires
decoupling the reference cycles, possibly by introducing a new
i915_mm_pages object that is own by both obj->mm and vma->pages.
However, we have not taken that route due to the overshadowing lmem/ttm
discussions, and instead play a series of complicated games with
trylocks to (hopefully) ensure that only one destruction path is called!

v2: Add some commentary, and some helpers to reduce patch churn.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-4-chris@chris-wilson.co.uk


# 5e053450 04-Oct-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Only track bound elements of the GTT

The premise here is to simply avoiding having to acquire the vm->mutex
inside vma create/destroy to update the vm->unbound_lists, to avoid some
nasty lock recursions later.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-2-chris@chris-wilson.co.uk


# b290a78b 03-Oct-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Use helpers for drm_mm_node booleans

A subset of 71724f708997 ("drm/mm: Use helpers for drm_mm_node booleans")
in order to prepare drm-intel-next-queued for subsequent patches before
we can backmerge 71724f708997 itself.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004142226.13711-1-chris@chris-wilson.co.uk


# 71724f70 03-Oct-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/mm: Use helpers for drm_mm_node booleans

In preparation for rearranging the booleans into a flags field, ensure
all the current users are using the inline helpers and not directly
accessing the members.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191003210100.22250-3-chris@chris-wilson.co.uk


# d19d71fc 18-Sep-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Mark i915_request.timeline as a volatile, rcu pointer

The request->timeline is only valid until the request is retired (i.e.
before it is completed). Upon retiring the request, the context may be
unpinned and freed, and along with it the timeline may be freed. We
therefore need to be very careful when chasing rq->timeline that the
pointer does not disappear beneath us. The vast majority of users are in
a protected context, either during request construction or retirement,
where the timeline->mutex is held and the timeline cannot disappear. It
is those few off the beaten path (where we access a second timeline) that
need extra scrutiny -- to be added in the next patch after first adding
the warnings about dangerous access.

One complication, where we cannot use the timeline->mutex itself, is
during request submission onto hardware (under spinlocks). Here, we want
to check on the timeline to finalize the breadcrumb, and so we need to
impose a second rule to ensure that the request->timeline is indeed
valid. As we are submitting the request, it's context and timeline must
be pinned, as it will be used by the hardware. Since it is pinned, we
know the request->timeline must still be valid, and we cannot submit the
idle barrier until after we release the engine->active.lock, ergo while
submitting and holding that spinlock, a second thread cannot release the
timeline.

v2: Don't be lazy inside selftests; hold the timeline->mutex for as long
as we need it, and tidy up acquiring the timeline with a bit of
refactoring (i915_active_add_request)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190919111912.21631-1-chris@chris-wilson.co.uk


# 4dd2fbbf 11-Sep-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Make i915_vma.flags atomic_t for mutex reduction

In preparation for reducing struct_mutex stranglehold around the vm,
make the vma.flags atomic so that we can acquire a pin on the vma
atomically before deciding if we need to take the mutex.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190911090243.16786-1-chris@chris-wilson.co.uk


# 33dd8899 09-Sep-2019 Matthew Auld <matthew.auld@intel.com>

drm/i915: cleanup cache-coloring

Try to tidy up the cache-coloring such that we rid the code of any
mm.color_adjust assumptions, this should hopefully make it more obvious
in the code when we need to actually use the cache-level as the color,
and as a bonus should make adding a different color-scheme simpler.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190909124052.22900-3-matthew.auld@intel.com


# 1e0a96e5 09-Sep-2019 Matthew Auld <matthew.auld@intel.com>

drm/i915: export color_differs

Export color_differs so that we can use it elsewhere.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190909124052.22900-1-matthew.auld@intel.com


# 1f7fd484 22-Aug-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Replace i915_vma_put_fence()

Avoid calling i915_vma_put_fence() by using our alternate paths that
bind a secondary vma avoiding the original fenced vma. For the few
instances where we need to release the fence (i.e. on binding when the
GGTT range becomes invalid), replace the put_fence with a revoke_fence.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190822061557.18402-1-chris@chris-wilson.co.uk


# b7d151ba 22-Aug-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Pull obj->userfault tracking under the ggtt->mutex

Since we want to revoke the ggtt vma from only under the ggtt->mutex, we
need to move protection of the userfault tracking from the struct_mutex
to the ggtt->mutex.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190822060914.2671-2-chris@chris-wilson.co.uk


# 2833ddcc 20-Aug-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Be defensive when starting vma activity

Before we acquire the vma for GPU activity, ensure that the underlying
object is not already in the process of being freed.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190820100531.8430-1-chris@chris-wilson.co.uk


# 25ffd4b1 16-Aug-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Markup expected timeline locks for i915_active

As every i915_active_request should be serialised by a dedicated lock,
i915_active consists of a tree of locks; one for each node. Markup up
the i915_active_request with what lock is supposed to be guarding it so
that we can verify that the serialised updated are indeed serialised.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190816121000.8507-2-chris@chris-wilson.co.uk


# 8e7cb179 16-Aug-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Extract intel_frontbuffer active tracking

Move the active tracking for the frontbuffer operations out of the
i915_gem_object and into its own first class (refcounted) object. In the
process of detangling, we switch from low level request tracking to the
easier i915_active -- with the plan that this avoids any potential
atomic callbacks as the frontbuffer tracking wishes to sleep as it
flushes.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190816074635.26062-1-chris@chris-wilson.co.uk


# 52791eee 11-Aug-2019 Christian König <christian.koenig@amd.com>

dma-buf: rename reservation_object to dma_resv

Be more consistent with the naming of the other DMA-buf objects.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/323401/


# 3d6792cf 12-Aug-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Forgo last_fence active request tracking

We were using the last_fence to track the last request that used this
vma that might be interpreted by a fence register and forced ourselves
to wait for this request before modifying any fence register that
overlapped our vma. Due to requirement that we need to track any XY_BLT
command, linear or tiled, this in effect meant that we have to track the
vma for its active lifespan anyway, so we can forgo the explicit
last_fence tracking and just use the whole vma->active.

Another solution would be to pipeline the register updates, and would
help resolve some long running stalls for gen3 (but only gen 2 and 3!)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190812174804.26180-1-chris@chris-wilson.co.uk


# a09d9a80 06-Aug-2019 Jani Nikula <jani.nikula@intel.com>

drm/i915: avoid including intel_drv.h via i915_drv.h->i915_trace.h

Disentangle i915_drv.h from intel_drv.h, which gets included via
i915_trace.h. This necessitates including i915_trace.h wherever it's
needed.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ed82bf259d3b725a1a1a3c3e9d6fb5c08bc4d489.1565085691.git.jani.nikula@intel.com


# 1aff1903 02-Aug-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Hide unshrinkable context objects from the shrinker

The shrinker cannot touch objects used by the contexts (logical state
and ring). Currently we mark those as "pin_global" to let the shrinker
skip over them, however, if we remove them from the shrinker lists
entirely, we don't event have to include them in our shrink accounting.

By keeping the unshrinkable objects in our shrinker tracking, we report
a large number of objects available to be shrunk, and leave the shrinker
deeply unsatisfied when we fail to reclaim those. The shrinker will
persist in trying to reclaim the unavailable objects, forcing the system
into a livelock (not even hitting the dread oomkiller).

v2: Extend unshrinkable protection for perma-pinned scratch and guc
allocations (Tvrtko)
v3: Notice that we should be pinned when marking unshrinkable and so the
link cannot be empty; merge duplicate paths.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190802212137.22207-1-chris@chris-wilson.co.uk


# cd2a4eaf 30-Jul-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Report resv_obj allocation failure

Since commit 64d6c500a384 ("drm/i915: Generalise GPU activity
tracking"), we have been prepared for i915_vma_move_to_active() to fail.
We can take advantage of this to report the failure for allocating the
shared-fence slot in the reservation_object.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190730205805.3733-1-chris@chris-wilson.co.uk


# c082afac 30-Jul-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Move aliasing_ppgtt underneath its i915_ggtt

The aliasing_ppgtt provides a PIN_USER alias for the global gtt, so move
it under the i915_ggtt to simplify later transformations to enable
intel_context.vm.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190730143209.4549-1-chris@chris-wilson.co.uk


# 09480072 03-Jul-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Mark up vma->active as safe for use inside shrinkers

Since a shrinker may be forced to wait on GPU activity,
i915_active_wait(&vma->active) must be safe for use inside a shrinker,
and so let's mark up the lock as being acquired by the shrinker to avoid
any nasty surprises creeping in.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190703091726.11690-8-chris@chris-wilson.co.uk


# 12c255b5 21-Jun-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Provide an i915_active.acquire callback

If we introduce a callback for i915_active that is only called the first
time we use the i915_active and is symmetrically paired with the
i915_active.retire callback, we can replace the open-coded and
non-atomic implementations -- which will be very fragile (i.e. broken)
upon removing the struct_mutex serialisation.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621183801.23252-4-chris@chris-wilson.co.uk


# a93615f9 21-Jun-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Throw away the active object retirement complexity

Remove the accumulated optimisations that we have for i915_vma_retire
and reduce it to the bare essential of tracking the active object
reference. This allows us to only use atomic operations, and so will be
able to avoid the struct_mutex requirement.

The principal loss here is the shrinker MRU bumping, so now if we have
to shrink, we will do so in much more random order and more likely to
try and shrink recently used objects. That is a nuisance, but shrinking
active objects is a second step we try to avoid and will always be a
system-wide performance issue.

The other loss is here is in the automatic pruning of the
reservation_object when idling. This is not as large an issue as upon
reservation_object introduction as now adding new fences into the object
replaces already signaled fences, keeping the array compact. But we do
lose the auto-expiration of stale fences and unused arrays. That may be
a noticeable problem for which we need to re-implement autopruning.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621183801.23252-3-chris@chris-wilson.co.uk


# a1c8a09e 21-Jun-2019 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Convert i915_gem_flush_ggtt_writes to intel_gt

Having introduced struct intel_gt (named the anonymous structure in i915)
we can start using it to compartmentalize our code better. It makes more
sense logically to have the code internally like this and it will also
help with future split between gt and display in i915.

v2:
* Keep ggtt flush before fb obj flush. (Chris)

v3:
* Fix refactoring fail.
* Always flush ggtt writes. (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-23-tvrtko.ursulin@linux.intel.com


# ef78f7b1 18-Jun-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Use drm_gem_object.resv

Since commit 1ba627148ef5 ("drm: Add reservation_object to
drm_gem_object"), struct drm_gem_object grew its own builtin
reservation_object rendering our own private one bloat. Remove our
redundant reservation_object and point into obj->base.resv instead.

References: 1ba627148ef5 ("drm: Add reservation_object to drm_gem_object")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190618125858.7295-1-chris@chris-wilson.co.uk


# df0566a6 13-Jun-2019 Jani Nikula <jani.nikula@intel.com>

drm/i915: move modesetting core code under display/

Now that we have a new subdirectory for display code, continue by moving
modesetting core code.

display/intel_frontbuffer.h sticks out like a sore thumb, otherwise this
is, again, a surprisingly clean operation.

v2:
- don't move intel_sideband.[ch] (Ville)
- use tabs for Makefile file lists and sort them

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190613084416.6794-3-jani.nikula@intel.com


# 87b391b9 13-Jun-2019 Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

drm/i915: Remove rpm asserts that use i915

Quite a few of the call points have already switched to the version
working directly on the runtime_pm structure, so let's switch over the
rest and kill the i915-based asserts.

v2: rebase

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190613232156.34940-3-daniele.ceraolospurio@intel.com


# ecab9be1 12-Jun-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Combine unbound/bound list tracking for objects

With async binding, we don't want to manage a bound/unbound list as we
may end up running before we even acquire the pages. All that is
required is keeping track of shrinkable objects, so reduce it to the
minimum list.

Fixes: 6951e5893b48 ("drm/i915: Move GEM object domain management from struct_mutex to local")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190612105720.30310-1-chris@chris-wilson.co.uk


# a8cff4c8 10-Jun-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Promote i915->mm.obj_lock to be irqsafe

The intent is to be able to update the mm.lists from inside an irqsoff
section (e.g. from a softirq rcu workqueue), ergo we need to make the
i915->mm.obj_lock irqsafe.

v2: can_discard_pages() ensures we are shrinkable
v3: Beware shadowing of 'flags'

Fixes: 3b4fa9640ccd ("drm/i915: Track the purgeable objects on a separate eviction list")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110869
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190610145430.17717-1-chris@chris-wilson.co.uk


# 155ab883 05-Jun-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Move object close under its own lock

Use i915_gem_object_lock() to guard the LUT and active reference to
allow us to break free of struct_mutex for handling GEM_CLOSE.

Testcase: igt/gem_close_race
Testcase: igt/gem_exec_parallel
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190606112320.9704-1-chris@chris-wilson.co.uk


# d82b4b26 30-May-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Report all objects with allocated pages to the shrinker

Currently, we try to report to the shrinker the precise number of
objects (pages) that are available to be reaped at this moment. This
requires searching all objects with allocated pages to see if they
fulfill the search criteria, and this count is performed quite
frequently. (The shrinker tries to free ~128 pages on each invocation,
before which we count all the objects; counting takes longer than
unbinding the objects!) If we take the pragmatic view that with
sufficient desire, all objects are eventually reapable (they become
inactive, or no longer used as framebuffer etc), we can simply return
the count of pinned pages maintained during get_pages/put_pages rather
than walk the lists every time.

The downside is that we may (slightly) over-report the number of
objects/pages we could shrink and so penalize ourselves by shrinking
more than required. This is mitigated by keeping the order in which we
shrink objects such that we avoid penalizing active and frequently used
objects, and if memory is so tight that we need to free them we would
need to anyway.

v2: Only expose shrinkable objects to the shrinker; a small reduction in
not considering stolen and foreign objects.
v3: Restore the tracking from a "backup" copy from before the gem/ split

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190530203500.26272-2-chris@chris-wilson.co.uk


# 3b4fa964 30-May-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Track the purgeable objects on a separate eviction list

Currently the purgeable objects, I915_MADV_DONTNEED, are mixed in the
normal bound/unbound lists. Every shrinker pass starts with an attempt
to purge from this set of unneeded objects, which entails us doing a
walk over both lists looking for any candidates. If there are none, and
since we are shrinking we can reasonably assume that the lists are
full!, this becomes a very slow futile walk.

If we separate out the purgeable objects into own list, this search then
becomes its own phase that is preferentially handled during shrinking.
Instead the cost becomes that we then need to filter the purgeable list
if we want to distinguish between bound and unbound objects.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190530203500.26272-1-chris@chris-wilson.co.uk


# c017cf6b 28-May-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Drop the deferred active reference

An old optimisation to reduce the number of atomics per batch sadly
relies on struct_mutex for coordination. In order to remove struct_mutex
from serialising object/context closing, always taking and releasing an
active reference on first use / last use greatly simplifies the locking.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-15-chris@chris-wilson.co.uk


# 6951e589 28-May-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Move GEM object domain management from struct_mutex to local

Use the per-object local lock to control the cache domain of the
individual GEM objects, not struct_mutex. This is a huge leap forward
for us in terms of object-level synchronisation; execbuffers are
coordinated using the ww_mutex and pread/pwrite is finally fully
serialised again.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-10-chris@chris-wilson.co.uk


# bb211c3d 09-May-2019 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915/selftests: Add live vma selftest

Add a live selftest to excercise rotated/remapped vmas. We simply
write through the rotated/remapped vma, and confirm that the data
appears in the right page when read through the normal vma.

Not sure what the fallout of making all rotated/remapped vmas
mappable/fenceable would be, hence I just hacked it in the test.

v2: Grab rpm reference (Chris)
GEM_BUG_ON(view.type not as expected) (Chris)
Allow CAN_FENCE for rotated/remapped vmas (Chris)
Update intel_plane_uses_fence() to ask for a fence
only for normal vmas on gen4+
v3: Deal with intel_wakeref_t
v4: Rebase

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190509122159.24376-4-ville.syrjala@linux.intel.com
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>


# 1a74fc0b 09-May-2019 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Add a new "remapped" gtt_view

To overcome display engine stride limits we'll want to remap the
pages in the GTT. To that end we need a new gtt_view type which
is just like the "rotated" type except not rotated.

v2: Use intel_remapped_plane_info base type
s/unused/unused_mbz/ (Chris)
Separate BUILD_BUG_ON()s (Chris)
Use I915_GTT_PAGE_SIZE (Chris)
v3: Use i915_gem_object_get_dma_address() (Chris)
Trim the sg (Tvrtko)
v4: Actually trim this time. Limit the max length
to one row of pages to keep things simple

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190509122159.24376-2-ville.syrjala@linux.intel.com
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>


# 487f3c7f 25-Apr-2019 Thomas Gleixner <tglx@linutronix.de>

drm: Simplify stacktrace handling

Replace the indirection through struct stack_trace by using the storage
array based interfaces.

The original code in all printing functions is really wrong. It allocates a
storage array on stack which is unused because depot_fetch_stack() does not
store anything in it. It overwrites the entries pointer in the stack_trace
struct so it points to the depot storage.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: intel-gfx@lists.freedesktop.org
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: dri-devel@lists.freedesktop.org
Cc: David Airlie <airlied@linux.ie>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: linux-mm@kvack.org
Cc: David Rientjes <rientjes@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: kasan-dev@googlegroups.com
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: iommu@lists.linux-foundation.org
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Cc: David Sterba <dsterba@suse.com>
Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: linux-btrfs@vger.kernel.org
Cc: dm-devel@redhat.com
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Alasdair Kergon <agk@redhat.com>
Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: linux-arch@vger.kernel.org
Link: https://lkml.kernel.org/r/20190425094802.622094226@linutronix.de


# 112ed2d3 24-Apr-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Move GraphicsTechnology files under gt/

Start partitioning off the code that talks to the hardware (GT) from the
uapi layers and move the device facing code under gt/

One casualty is s/intel_ringbuffer.h/intel_engine.h/ with the plan to
subdivide that header and body further (and split out the submission
code from the ringbuffer and logical context handling). This patch aims
to be simple motion so git can fixup inflight patches with little mess.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190424174839.7141-1-chris@chris-wilson.co.uk


# 103b76ee 05-Mar-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Use i915_global_register()

Rather than manually add every new global into each hook, use
i915_global_register() function and keep a list of registered globals to
invoke instead.

However, I haven't found a way for random drivers to add an .init table
to avoid having to manually add ourselves to i915_globals_init() each
time.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190305213830.18094-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>


# 13f1bfd3 28-Feb-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Make object/vma allocation caches global

As our allocations are not device specific, we can move our slab caches
to a global scope.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190228102035.5857-2-chris@chris-wilson.co.uk


# 21950ee7 05-Feb-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Pull i915_gem_active into the i915_active family

Looking forward, we need to break the struct_mutex dependency on
i915_gem_active. In the meantime, external use of i915_gem_active is
quite beguiling, little do new users suspect that it implies a barrier
as each request it tracks must be ordered wrt the previous one. As one
of many, it can be used to track activity across multiple timelines, a
shared fence, which fits our unordered request submission much better. We
need to steer external users away from the singular, exclusive fence
imposed by i915_gem_active to i915_active instead. As part of that
process, we move i915_gem_active out of i915_request.c into
i915_active.c to start separating the two concepts, and rename it to
i915_active_request (both to tie it to the concept of tracking just one
request, and to give it a longer, less appealing name).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190205130005.2807-5-chris@chris-wilson.co.uk


# 64d6c500 05-Feb-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Generalise GPU activity tracking

We currently track GPU memory usage inside VMA, such that we never
release memory used by the GPU until after it has finished accessing it.
However, we may want to track other resources aside from VMA, or we may
want to split a VMA into multiple independent regions and track each
separately. For this purpose, generalise our request tracking (akin to
struct reservation_object) so that we can embed it into other objects.

v2: Tweak error handling during selftest setup.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190205130005.2807-2-chris@chris-wilson.co.uk


# 528cbd17 28-Jan-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Move vma lookup to its own lock

Remove the struct_mutex requirement for looking up the vma for an
object.

v2: Highlight how the race for duplicate vma creation is resolved on
reacquiring the lock with a short comment.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-3-chris@chris-wilson.co.uk


# 09d7e46b 28-Jan-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Pull VM lists under the VM mutex.

A starting point to counter the pervasive struct_mutex. For the goal of
avoiding (or at least blocking under them!) global locks during user
request submission, a simple but important step is being able to manage
each clients GTT separately. For which, we want to replace using the
struct_mutex as the guard for all things GTT/VM and switch instead to a
specific mutex inside i915_address_space.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-2-chris@chris-wilson.co.uk


# 499197dc 28-Jan-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Stop tracking MRU activity on VMA

Our goal is to remove struct_mutex and replace it with fine grained
locking. One of the thorny issues is our eviction logic for reclaiming
space for an execbuffer (or GTT mmaping, among a few other examples).
While eviction itself is easy to move under a per-VM mutex, performing
the activity tracking is less agreeable. One solution is not to do any
MRU tracking and do a simple coarse evaluation during eviction of
active/inactive, with a loose temporal ordering of last
insertion/evaluation. That keeps all the locking constrained to when we
are manipulating the VM itself, neatly avoiding the tricky handling of
possible recursive locking during execbuf and elsewhere.

Note that discarding the MRU (currently implemented as a pair of lists,
to avoid scanning the active list for a NONBLOCKING search) is unlikely
to impact upon our efficiency to reclaim VM space (where we think a LRU
model is best) as our current strategy is to use random idle replacement
first before doing a search, and over time the use of softpinned 48b
per-ppGTT is growing (thereby eliminating any need to perform any eviction
searches, in theory at least) with the remaining users being found on
much older devices (gen2-gen6).

v2: Changelog and commentary rewritten to elaborate on the duality of a
single list being both an inactive and active list.
v3: Consolidate bool parameters into a single set of flags; don't
comment on the duality of a single variable being a multiplicity of
bits.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-1-chris@chris-wilson.co.uk


# ca05359f 19-Sep-2018 Christian König <christian.koenig@amd.com>

dma-buf: allow reserving more than one shared fence slot

Let's support simultaneous submissions to multiple engines.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Link: https://patchwork.kernel.org/patch/10626149/


# bbb8a9d7 12-Oct-2018 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: GEM_WARN_ON considered harmful

GEM_WARN_ON currently has dangerous semantics where it is completely
compiled out on !GEM_DEBUG builds. This can leave users who expect it to
be more like a WARN_ON, just without a warning in non-debug builds, in
complete ignorance.

Another gotcha with it is that it cannot be used as a statement. Which is
again different from a standard kernel WARN_ON.

This patch fixes both problems by making it behave as one would expect.

It can now be used both as an expression and as statement, and also the
condition evaluates properly in all builds - code under the conditional
will therefore not unexpectedly disappear.

To satisfy call sites which really want the code under the conditional to
completely disappear, we add GEM_DEBUG_WARN_ON and convert some of the
callers to it. This one can also be used as both expression and statement.

>From the above it follows GEM_DEBUG_WARN_ON should be used in situations
where we are certain the condition will be hit during development, but at
a place in code where error can be handled to the benefit of not crashing
the machine.

GEM_WARN_ON on the other hand should be used where condition may happen in
production and we just want to distinguish the level of debugging output
emitted between the production and debug build.

v2:
* Dropped BUG_ON hunk.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Tomasz Lis <tomasz.lis@intel.com>
Reviewed-by: Tomasz Lis <tomasz.lis@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181012063142.16080-1-tvrtko.ursulin@linux.intel.com


# f013027e 16-Aug-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Stop holding a ref to the ppgtt from each vma

The context owns both the ppgtt and the vma within it, and our activity
tracking on the context ensures that we do not release active ppgtt. As
the context fulfils our obligations for active memory tracking, we can
relinquish the reference from the vma.

This fixes a silly transient refleak from closed vma being kept alive
until the entire system was idle, keeping all vm alive as well.

Reported-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Testcase: igt/gem_ctx_create/files
Fixes: 3365e2268b6b ("drm/i915: Lazily unbind vma on close")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Tested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180816073448.19396-1-chris@chris-wilson.co.uk
(cherry picked from commit a4417b7b419a68540ad7945ac4efbb39d19afa63)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# a4417b7b 16-Aug-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Stop holding a ref to the ppgtt from each vma

The context owns both the ppgtt and the vma within it, and our activity
tracking on the context ensures that we do not release active ppgtt. As
the context fulfils our obligations for active memory tracking, we can
relinquish the reference from the vma.

This fixes a silly transient refleak from closed vma being kept alive
until the entire system was idle, keeping all vm alive as well.

Reported-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Testcase: igt/gem_ctx_create/files
Fixes: 3365e2268b6b ("drm/i915: Lazily unbind vma on close")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Tested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180816073448.19396-1-chris@chris-wilson.co.uk


# 6a2f59e4 21-Jul-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Pull unpin map into vma release

A reasonably common operation is to pin the map of the vma alongside the
vma itself for the lifetime of the vma, and so release both pins at the
same time as destroying the vma. It is common enough to pull into the
release function, making that central function more attractive to a
couple of other callsites.

The continual ulterior motive is to sweep over errors on module load
aborting...

Testcase: igt/drv_module_reload/basic-reload-inject
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180721125037.20127-1-chris@chris-wilson.co.uk


# 46b1063f 19-Jul-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Handle recursive shrinker for vma->last_active allocation

If we call into the shrinker for direct relcaim inside kmalloc, it will
retire the requests. If we retire the vma->last_active while processing a
new i915_vma_move_to_active() we can upset the delicate bookkeeping
required for the cache. After the possible invocation of the shrinker, we
need to double check the vma->last_active is still valid.

Fixes: 8b293eb53a7d ("drm/i915: Track the last-active inside the i915_vma")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105600#c39
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180719072206.16015-1-chris@chris-wilson.co.uk


# 8b293eb5 06-Jul-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Track the last-active inside the i915_vma

Using a VMA on more than one timeline concurrently is the exception
rather than the rule (using it concurrently on multiple engines). As we
expect to only use one active tracker, store the most recently used
tracker inside the i915_vma itself and only fallback to the rbtree if
we need a second or more concurrent active trackers.

v2: Comments on how we overwrite any existing last_active cache.
v3: __list_del_entry() before list_replace_init() is confusing and, much
more important, entirely redundant.
v4: Note that both last_active and the rbtree may be simultaneously
tracking this timeline, albeit with different requests, and so the vma
may be retired twice for the same timeline.
v5: No, that list_del is required!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180706123157.9645-1-chris@chris-wilson.co.uk


# 5c3f8c22 06-Jul-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Track vma activity per fence.context, not per engine

In the next patch, we will want to be able to use more flexible request
timelines that can hop between engines. From the vma pov, we can then
not rely on the binding of this request to an engine and so can not
ensure that different requests are ordered through a per-engine
timeline, and so we must track activity of all timelines. (We track
activity on the vma itself to prevent unbinding from HW before the HW
has finished accessing it.)

v2: Switch to a rbtree for 32b safety (since using u64 as a radixtree
index is fraught with aliasing of unsigned longs).
v3: s/lookup_active/active_instance/ because we can never agree on names

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180706103947.15919-5-chris@chris-wilson.co.uk


# e6bb1d7f 06-Jul-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Move i915_vma_move_to_active() to i915_vma.c

i915_vma_move_to_active() has grown beyond its execbuf origins, and
should take its rightful place in i915_vma.c as a method for i915_vma!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180706103947.15919-4-chris@chris-wilson.co.uk


# 1eca65d9 06-Jul-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Squelch very verbose error logging

Having found the error causing the IGT test to fail, downgrade the
verbose logging so that we stop flooding the syslogs as we deliberately
provoke it many thousands of time during selftests.

References: 10195b1e4411 ("drm/i915: Show vma allocator stack when in doubt")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180706065332.15214-1-chris@chris-wilson.co.uk


# d403397c 30-Jun-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Try GGTT mmapping whole object as partial

If the whole object is already pinned by HW for use as scanout, we will
fail to move it to the mappable region and so must resort to using a
partial VMA covering the whole object.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104513
Fixes: aa136d9d72c2 ("drm/i915: Convert partial ggtt vma to full ggtt if it spans the entire object")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180630090509.469-1-chris@chris-wilson.co.uk
(cherry picked from commit 7e7367d3bc6cf27dd7e007e7897fcebfeff1ee8b)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# 7e7367d3 30-Jun-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Try GGTT mmapping whole object as partial

If the whole object is already pinned by HW for use as scanout, we will
fail to move it to the mappable region and so must resort to using a
partial VMA covering the whole object.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104513
Fixes: aa136d9d72c2 ("drm/i915: Convert partial ggtt vma to full ggtt if it spans the entire object")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180630090509.469-1-chris@chris-wilson.co.uk


# 10195b1e 28-Jun-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Show vma allocator stack when in doubt

At the moment, gem_exec_gttfill fails with a sporadic EBUSY due to us
wanting to unbind a pinned batch. Let's dump who first bound that vma to
see if that helps us identify who still unexpectedly has it pinned.

v2: We cannot allocate inside the printer (as it may be on an fs-reclaim
path), so hope for the best and build the string on the stack
v3: stack depth of 16 routinely overflows a 512 character string, limit
it to 12 to avoid unsightly truncation.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180628132206.8329-1-chris@chris-wilson.co.uk


# 93f2cde2 07-Jun-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Decouple vma vfuncs from vm

To allow for future non-object backed vma, we need to be able to
specialise the callbacks for binding, et al, the vma. For example,
instead of calling vma->vm->bind_vma(), we now call
vma->ops->bind_vma(). This gives us the opportunity to later override the
operation for a custom vma.

v2: flip order of unbind/bind

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180607154047.9171-2-chris@chris-wilson.co.uk


# 520ea7c5 07-Jun-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Prepare for non-object vma

In order to allow ourselves to use VMA to wrap other entities other than
GEM objects, we need to allow for the vma->obj backpointer to be NULL.
In most cases, we know we are operating on a GEM object and its vma, but
we need the core code (such as i915_vma_pin/insert/bind/unbind) to work
regardless of the innards.

The remaining eyesore here is vma->obj->cache_level and related (but
less of an issue) vma->obj->gt_ro. With a bit of care we should mirror
those on the vma itself.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180607154047.9171-1-chris@chris-wilson.co.uk


# 82ad6443 05-Jun-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Rename i915_hw_ppgtt base member

In the near future, I want to subclass gen6_hw_ppgtt as it contains a
few specialised members and I wish to add more. To avoid the ugliness of
using ppgtt->base.base, rename the i915_hw_ppgtt base member
(i915_address_space) as vm, which is our common shorthand for an
i915_address_space local.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180605153758.18422-1-chris@chris-wilson.co.uk


# 83d317ad 05-Jun-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/vma: Move the bind_count vs pin_count assertion to a helper

To spare ourselves a long line later, refactor the repeated check of
bind_count vs pin_count to a helper.

v2: Fix up the commentary!

Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180605094107.31367-1-chris@chris-wilson.co.uk


# 3365e226 03-May-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Lazily unbind vma on close

When userspace is passing around swapbuffers using DRI, we frequently
have to open and close the same object in the foreign address space.
This shows itself as the same object being rebound at roughly 30fps
(with a second object also being rebound at 30fps), which involves us
having to rewrite the page tables and maintain the drm_mm range manager
every time.

However, since the object still exists and it is only the local handle
that disappears, if we are lazy and do not unbind the VMA immediately
when the local user closes the object but defer it until the GPU is
idle, then we can reuse the same VMA binding. We still have to be
careful to mark the handle and lookup tables as closed to maintain the
uABI, just allowing the underlying VMA to be resurrected if the user is
able to access the same object from the same context again.

If the object itself is destroyed (neither userspace keeping a handle to
it), the VMA will be reaped immediately as usual.

In the future, this will be even more useful as instantiating a new VMA
for use on the GPU will become heavier. A nuisance indeed, so nip it in
the bud.

v2: s/__i915_vma_final_close/i915_vma_destroy/ etc.
v3: Leave a hint as to why we deferred the unbind on close.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180503195115.22309-1-chris@chris-wilson.co.uk


# e61e0f51 21-Feb-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Rename drm_i915_gem_request to i915_request

We want to de-emphasize the link between the request (dependency,
execution and fence tracking) from GEM and so rename the struct from
drm_i915_gem_request to i915_request. That is we may implement the GEM
user interface on top of requests, but they are an abstraction for
tracking execution rather than an implementation detail of GEM. (Since
they are not tied to HW, we keep the i915 prefix as opposed to intel.)

In short, the spatch:
@@

@@
- struct drm_i915_gem_request
+ struct i915_request

A corollary to contracting the type name, we also harmonise on using
'rq' shorthand for local variables where space if of the essence and
repetition makes 'request' unwieldy. For globals and struct members,
'request' is still much preferred for its clarity.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180221095636.6649-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# 73ebd503 11-Dec-2017 Matthew Auld <matthew.auld@intel.com>

drm/i915: make mappable struct resource centric

Now that we are using struct resource to track the stolen region, it is
more convenient if we track the mappable region in a resource as well.

v2: prefer iomap and gmadr naming scheme
prefer DEFINE_RES_MEM

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171211151822.20953-8-matthew.auld@intel.com


# e2189dd0 07-Dec-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Refactor common list iteration over GGTT vma

In quite a few places, we have a list iteration over the vma on an
object that only want to inspect GGTT vma. By construction, these are
placed at the start of the list, so we have copied that knowledge into
many callsites. Pull that knowledge back to i915_vma.h and provide a
for_each_ggtt_vma() to tidy up the code.

v2: Add a backreference from vma_create() to remind ourselves why we put
ggtt vma at the head of the obj->vma_list (and ppgtt vma at the tail).
v3: Fixup s/vma/V/

Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20171207211407.31549-1-chris@chris-wilson.co.uk


# 7125397b 05-Dec-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Track GGTT writes on the vma

As writes through the GTT and GGTT PTE updates do not share the same
path, they are not strictly ordered and so we must explicitly flush the
indirect writes prior to modifying the PTE. We do track outstanding GGTT
writes on the object itself, but since the object may have multiple GGTT
vma, that is overly coarse as we can track and flush individual vma as
required.

Whilst here, update the GGTT flushing behaviour for Cannonlake.

v2: Hard-code ring offset to allow use during unload (after RCS may have
been freed, or never existed!)

References: https://bugs.freedesktop.org/show_bug.cgi?id=104002
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171206124914.19960-2-chris@chris-wilson.co.uk


# 010e3e68 05-Dec-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove vma from object on destroy, not close

Originally we translated from the object to the vma by walking
obj->vma_list to find the matching vm (for user lookups). Now we process
user lookups using the rbtree, and we only use obj->vma_list itself for
maintaining state (e.g. ensuring that all vma are flushed or rebound).
As such maintenance needs to go on beyond the user's awareness of the
vma, defer removal of the vma from the obj->vma_list from i915_vma_close()
to i915_vma_destroy()

Fixes: 5888fc9eac3c ("drm/i915: Flush pending GTT writes before unbinding")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104155
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171206124914.19960-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# 7f017b19 09-Nov-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Mark up i915_vma_unbind() as a potential sleeper

Whenever we want to unbind a vma, we must wait on all GPU activity to
complete first. (This is what gives us the ability to do fine grained
eviction and purging by only having to wait on the VMA that we need to
unbind to proceed; though if pushed we can make it a rule that we are
only allowed to unbind already idle VMA and move the burden of the work
and organising the sleep onto the caller.) Currently, we might only
sleep if the vma is still active on the GPU, but in principle
i915_vma_unbind() always implies a sleep, so mark it up with a
might_sleep().

Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
References: https://bugs.freedesktop.org/show_bug.cgi?id=103638
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171109213450.13875-2-chris@chris-wilson.co.uk


# 398c13b9 07-Nov-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Prune the reservation shared fence array

The shared fence array is not autopruning and may continue to grow as an
object is shared between new timelines. Take the opportunity when we
think the object is idle (we have to confirm that any external fence is
also signaled) to decouple all the fences.

We apply a similar trick after waiting on an object, see commit
e54ca9774777 ("drm/i915: Remove completed fences after a wait")

v2: No longer need to handle the batch pool as a special case.
v3: Need to trylock from within i915_vma_retire as this may be called
form the shrinker - and we may later try to allocate underneath the
reservation lock, so a deadlock is possible.

References: https://bugs.freedesktop.org/show_bug.cgi?id=102936
Fixes: d07f0e59b2c7 ("drm/i915: Move GEM activity tracking into a common struct reservation_object")
Fixes: 80b204bce8f2 ("drm/i915: Enable multiple timelines")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171107220656.5020-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
(cherry picked from commit 1ab22356b37ab08a391d6f007fda4c822bef9fb5)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# 1ab22356 07-Nov-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Prune the reservation shared fence array

The shared fence array is not autopruning and may continue to grow as an
object is shared between new timelines. Take the opportunity when we
think the object is idle (we have to confirm that any external fence is
also signaled) to decouple all the fences.

We apply a similar trick after waiting on an object, see commit
e54ca9774777 ("drm/i915: Remove completed fences after a wait")

v2: No longer need to handle the batch pool as a special case.
v3: Need to trylock from within i915_vma_retire as this may be called
form the shrinker - and we may later try to allocate underneath the
reservation lock, so a deadlock is possible.

References: https://bugs.freedesktop.org/show_bug.cgi?id=102936
Fixes: d07f0e59b2c7 ("drm/i915: Move GEM activity tracking into a common struct reservation_object")
Fixes: 80b204bce8f2 ("drm/i915: Enable multiple timelines")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171107220656.5020-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# d36caeea 04-Nov-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Assert vma->flags are updated correctly during binding

As we bind, and unbind on error, we want to be sure that the vma->flags
are updated to reflect the binding state so that on the next invocation
all is well.

v2: Take two.
v3: Take three; vma-misplaced is checking map-and-fenceable so keep it
last!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171105124550.32715-1-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>


# f2123818 15-Oct-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Move dev_priv->mm.[un]bound_list to its own lock

Remove the struct_mutex requirement around dev_priv->mm.bound_list and
dev_priv->mm.unbound_list by giving it its own spinlock. This reduces
one more requirement for struct_mutex and in the process gives us
slightly more accurate unbound_list tracking, which should improve the
shrinker - but the drawback is that we drop the retirement before
counting so i915_gem_object_is_active() may be stale and lead us to
underestimate the number of objects that may be shrunk (see commit
bed50aea61df ("drm/i915/shrinker: Flush active on objects before
counting")).

v2: Crosslink the spinlock to the lists it protects, and btw this
changes s/obj->global_link/obj->mm.link/
v3: Fix decoupling of old links in i915_gem_object_attach_phys()
v3.1: Fix the fix, only unlink if it was linked
v3.2: Use a local for to_i915(obj->base.dev)->mm.obj_lock

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171016114037.5556-1-chris@chris-wilson.co.uk


# a65adaf8 09-Oct-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Track user GTT faulting per-vma

We don't wish to refault the entire object (other vma) when unbinding
one partial vma. To do this track which vma have been faulted into the
user's address space.

v2: Use a local vma_offset to tidy up a multiline unmap_mapping_range().

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171009084401.29090-3-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# 3bd40735 09-Oct-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Consolidate get_fence with pin_fence

Following the pattern now used for obj->mm.pages, use just pin_fence and
unpin_fence to control access to the fence registers. I.e. instead of
calling get_fence(); pin_fence(), we now just need to call pin_fence().
This will make it easier to reduce the locking requirements around
fence registers.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171009084401.29090-2-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# b4563f59 09-Oct-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Pin fence for iomap

Acquire the fence register for the iomap in i915_vma_pin_iomap() on
behalf of the caller.

We probably want for the caller to specify whether the fence should be
pinned for their usage, but at the moment all callers do want the
associated fence, or none, so take it on their behalf.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171009084401.29090-1-chris@chris-wilson.co.uk


# bef27bdb 09-Oct-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Assert we do not try to expand VMA for hugepage inside GGTT

We only apply the hugepage PD redirection inside the ppGTT, so during
i915_vma_insert() we want to exclude the GGTT from the additional
alignment constraints (thereby avoiding the extra GTT pressure from
fragmentation). Add an assert to document that intention alongside the
comment.

v2: After discussion with Matthew, make it a blanket GGTT ban
(previously we allowed the expansion for appgtt, and so indirectly
ggtt). There are issues we need to fix before allowing the current
appgtt to be used with hugepages, and if we do, we probably want more
care over when to expand/align, as the mappable aperture inside the ggtt
is precious.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171009092019.20747-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# 855822be 06-Oct-2017 Matthew Auld <matthew.auld@intel.com>

drm/i915: align 64K objects to 2M

We can't mix 64K and 4K pte's in the same page-table, so for now we
align 64K objects to 2M to avoid any potential mixing. This is
potentially wasteful but in reality shouldn't be too bad since this only
applies to the virtual address space of a 48b PPGTT.

v2: don't separate logically connected ops

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-10-matthew.auld@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-9-chris@chris-wilson.co.uk


# 7464284b 06-Oct-2017 Matthew Auld <matthew.auld@intel.com>

drm/i915: align the vma start to the largest gtt page size

For the 48b PPGTT try to align the vma start address to the required
page size boundary to guarantee we use said page size in the gtt. If we
are dealing with multiple page sizes, we can't guarantee anything and
just align to the largest. For soft pinning and objects which need to be
tightly packed into the lower 32bits we don't force any alignment.

v2: various improvements suggested by Chris

v3: use set_pages and better placement of page_sizes

v4: prefer upper_32_bits()

v5: assign vma->page_sizes = vma->obj->page_sizes directly
prefer sizeof(vma->page_sizes)

v6: fixup checking of end to exclude GGTT (which are assumed to be
limited to 4G).

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-9-matthew.auld@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-8-chris@chris-wilson.co.uk


# fa3f46af 06-Oct-2017 Matthew Auld <matthew.auld@intel.com>

drm/i915: introduce vm set_pages/clear_pages

Move the setting/clearing of the vma->pages to a vm operation. Doing so
neatens things up a little, but more importantly gives us a sane place
to also set/clear the vma->pages_sizes, which we introduce later in
preparation for supporting huge-pages.

v2: remove redundant vma->pages check

v3: GEM_BUG_ON(vma->pages) following i915_vma_remove

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-8-matthew.auld@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-7-chris@chris-wilson.co.uk


# d1b48c1e 16-Aug-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Replace execbuf vma ht with an idr

This was the competing idea long ago, but it was only with the rewrite
of the idr as an radixtree and using the radixtree directly ourselves,
along with the realisation that we can store the vma directly in the
radixtree and only need a list for the reverse mapping, that made the
patch performant enough to displace using a hashtable. Though the vma ht
is fast and doesn't require any extra allocation (as we can embed the node
inside the vma), it does require a thread for resizing and serialization
and will have the occasional slow lookup. That is hairy enough to
investigate alternatives and favour them if equivalent in peak performance.
One advantage of allocating an indirection entry is that we can support a
single shared bo between many clients, something that was done on a
first-come first-serve basis for shared GGTT vma previously. To offset
the extra allocations, we create yet another kmem_cache for them.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170816085210.4199-5-chris@chris-wilson.co.uk


# d8462d0a 20-Jun-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Retire the VMA's fence tracker before unbinding

Since we may track unfenced access (GPU access to the vma that
explicitly requires no fence), vma->last_fence may be set without any
attached fence (vma->fence) and so will not be flushed when we call
i915_vma_put_fence(). Since we stopped doing a full retire of the
activity trackers for unbind, we need to explicitly retire each tracker.

Fixes: b0decaf75bd9 ("drm/i915: Track active vma requests")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170620124321.1108-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
(cherry picked from commit 760a898d8069111704e1bd43f00ebf369ae46e57)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 7a3bc034 20-Jun-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Assert the vma's active tracking is clear before free

In looking at a use-after-free on Baytrail, it looks like the VMA's
activity tracking is suspect. Add some asserts to catch freeing the VMA
before we have decoupled all of its i915_gem_active trackers.

References: https://bugs.freedesktop.org/show_bug.cgi?id=101511
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170620124321.1108-3-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>


# 760a898d 20-Jun-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Retire the VMA's fence tracker before unbinding

Since we may track unfenced access (GPU access to the vma that
explicitly requires no fence), vma->last_fence may be set without any
attached fence (vma->fence) and so will not be flushed when we call
i915_vma_put_fence(). Since we stopped doing a full retire of the
activity trackers for unbind, we need to explicitly retire each tracker.

Fixes: b0decaf75bd9 ("drm/i915: Track active vma requests")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170620124321.1108-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>


# 95ff7c7d 16-Jun-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Stash a pointer to the obj's resv in the vma

During execbuf, a mandatory step is that we add this request (this
fence) to each object's reservation_object. Inside execbuf, we track the
vma, and to add the fence to the reservation_object then means having to
first chase the obj, incurring another cache miss. We can reduce the
number of cache misses by stashing a pointer to the reservation_object
in the vma itself.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170616140525.6394-1-chris@chris-wilson.co.uk


# dade2a61 16-Jun-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Store a persistent reference for an object in the execbuffer cache

If we take a reference to the object/vma when it is first used in an
execbuf, we can keep that reference until the object's file-local handle
is closed. Thereby saving a frequent ref/unref pair.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# 2889caa9 16-Jun-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Eliminate lots of iterations over the execobjects array

The major scaling bottleneck in execbuffer is the processing of the
execobjects. Creating an auxiliary list is inefficient when compared to
using the execobject array we already have allocated.

Reservation is then split into phases. As we lookup up the VMA, we
try and bind it back into active location. Only if that fails, do we add
it to the unbound list for phase 2. In phase 2, we try and add all those
objects that could not fit into their previous location, with fallback
to retrying all objects and evicting the VM in case of severe
fragmentation. (This is the same as before, except that phase 1 is now
done inline with looking up the VMA to avoid an iteration over the
execobject array. In the ideal case, we eliminate the separate reservation
phase). During the reservation phase, we only evict from the VM between
passes (rather than currently as we try to fit every new VMA). In
testing with Unreal Engine's Atlantis demo which stresses the eviction
logic on gen7 class hardware, this speed up the framerate by a factor of
2.

The second loop amalgamation is between move_to_gpu and move_to_active.
As we always submit the request, even if incomplete, we can use the
current request to track active VMA as we perform the flushes and
synchronisation required.

The next big advancement is to avoid copying back to the user any
execobjects and relocations that are not changed.

v2: Add a Theory of Operation spiel.
v3: Fall back to slow relocations in preparation for flushing userptrs.
v4: Document struct members, factor out eb_validate_vma(), add a few
more comments to explain some magic and hide other magic behind macros.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# 4ff4b44c 16-Jun-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Store a direct lookup from object handle to vma

The advent of full-ppgtt lead to an extra indirection between the object
and its binding. That extra indirection has a noticeable impact on how
fast we can convert from the user handles to our internal vma for
execbuffer. In order to bypass the extra indirection, we use a
resizable hashtable to jump from the object to the per-ctx vma.
rhashtable was considered but we don't need the online resizing feature
and the extra complexity proved to undermine its usefulness. Instead, we
simply reallocate the hastable on demand in a background task and
serialize it before iterating.

In non-full-ppgtt modes, multiple files and multiple contexts can share
the same vma. This leads to having multiple possible handle->vma links,
so we only use the first to establish the fast path. The majority of
buffers are not shared and so we should still be able to realise
speedups with multiple clients.

v2: Prettier names, more magic.
v3: Many style tweaks, most notably hiding the misuse of execobj[].rsvd2

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# b8e5d2ef 16-Jun-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Make i915_vma_destroy() static

i915_vma_destroy() is now not used outside of i915_vma.c so we can
remove the export and make the function static.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170616123508.12673-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>


# d55495b4 15-Jun-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Use vma->exec_entry as our double-entry placeholder

This has the benefit of not requiring us to manipulate the
vma->exec_link list when tearing down the execbuffer, and is a
marginally cheaper test to detect the user error.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170615081435.17699-2-chris@chris-wilson.co.uk


# 8c992370 26-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove the vma from the drm_mm if binding fails

As we track whether a vma has been inserted into the drm_mm using the
vma->flags, if we fail to bind the vma into the GTT we do not update
those bits and will attempt to reinsert the vma into the drm_mm on
future passes. To prevent that, we want to unwind i915_vma_insert() if
we fail in our attempt to bind.

Fixes: 59bfa1248e22 ("drm/i915: Start passing around i915_vma from execbuffer")
Testcase: igt/drv_selftest/live_gtt
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <stable@vger.kernel.org> # v4.9+
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170227122654.27651-3-chris@chris-wilson.co.uk
(cherry picked from commit 31c7effa39f21f0fea1b3250ae9ff32b9c7e1ae5)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# 31c7effa 26-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove the vma from the drm_mm if binding fails

As we track whether a vma has been inserted into the drm_mm using the
vma->flags, if we fail to bind the vma into the GTT we do not update
those bits and will attempt to reinsert the vma into the drm_mm on
future passes. To prevent that, we want to unwind i915_vma_insert() if
we fail in our attempt to bind.

Fixes: 59bfa1248e22 ("drm/i915: Start passing around i915_vma from execbuffer")
Testcase: igt/drv_selftest/live_gtt
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <stable@vger.kernel.org> # v4.9+
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170227122654.27651-3-chris@chris-wilson.co.uk


# aa149431 25-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Sanity check the vma->node prior to binding into the GTT

We rely on the VMA being allocated inside the drm_mm and for its allotted
node being large enough to accommodate all the vma->pages.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170225181122.4788-3-chris@chris-wilson.co.uk


# ff685975 15-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Move allocate_va_range to GTT

In the future, we need to call allocate_va_range on the aliasing-ppgtt
which means moving the call down from the vma into the vm (which is
more appropriate for calling the vm function).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170215084357.19977-8-chris@chris-wilson.co.uk


# 782a3e9e 13-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Exercise i915_vma_pin/i915_vma_insert

High-level testing of the struct drm_mm by verifying our handling of
weird requests to i915_vma_pin.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-35-chris@chris-wilson.co.uk


# e3c7a1c5 13-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Test creation of VMA

Simple test to exercise creation and lookup of VMA within an object.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-34-chris@chris-wilson.co.uk


# e1cc3db0 09-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Assert that we never create a vma for the aliasing_ppgtt

The aliasing_ppgtt is just a container for the HW context that mirrors
the global gtt. It should never be used directly, so assert if we make
the mistake of trying to allocate a VMA for it.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170209111933.12420-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# 8adabb89 23-Jan-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Pevent copying uninitialised garbage into vma->ggtt_view

Since tweaking i915_vma_compare() we allowed constructors to skip
clearing the ggtt_view believing that we didn't access the unused
members. That, as it turns out, was not entirely true. In particular,
i915_gem_fault() uses

ret = remap_io_mapping(area,
area->vm_start + (vma->ggtt_view.partial.offset << PAGE_SHIFT),
(ggtt->mappable_base + vma->node.start) >> PAGE_SHIFT,
min_t(u64, vma->size, area->vm_end - area->vm_start),
&ggtt->mappable);

i.e. the ggtt_view.partial for both normal and partial views. If we
allowed garbage into the normal vma->ggtt_view and then try userspace
tried to mmap it, we could explode in an unobvious fashion.

Fixes: 7b92c047bae2 ("drm/i915: Eliminate superfluous i915_ggtt_view_rotated")
Fixes: 3bf4d5751943 ("drm/i915: Stop clearing i915_ggtt_view")
Reported-by: Matthew Auld <matthew.william.auld@gmail.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170123145245.3972-1-chris@chris-wilson.co.uk
Tested-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
(cherry picked from commit 7c518460303353084ebcfca99bc4b67ce33745a1)


# 2f5db26c 20-Jan-2017 Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

drm/i915: reinstate call to trace_i915_vma_bind

The call went away in:

commit 3b16525cc4c1a43e9053cfdc414356eea24bdfad
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Thu Aug 4 16:32:25 2016 +0100

drm/i915: Split insertion/binding of an object into the VM

It is useful to have this trace as it pairs nicely with the vma_unbind
one to track vma activity.
Added inside the i915_vma_bind function (was outside before) to keep a
similar placement as trace_i915_vma_unbind.

v2: print bind_flags instead of flags (Chris)

Fixes: 3b16525cc4c1 ("drm/i915: Split insertion/binding of an object into the VM")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1484949083-11430-1-git-send-email-daniele.ceraolospurio@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
(cherry picked from commit 6146e6da5c961735dacf9b6c0c8b5f1382193ee2)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# 7c518460 23-Jan-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Pevent copying uninitialised garbage into vma->ggtt_view

Since tweaking i915_vma_compare() we allowed constructors to skip
clearing the ggtt_view believing that we didn't access the unused
members. That, as it turns out, was not entirely true. In particular,
i915_gem_fault() uses

ret = remap_io_mapping(area,
area->vm_start + (vma->ggtt_view.partial.offset << PAGE_SHIFT),
(ggtt->mappable_base + vma->node.start) >> PAGE_SHIFT,
min_t(u64, vma->size, area->vm_end - area->vm_start),
&ggtt->mappable);

i.e. the ggtt_view.partial for both normal and partial views. If we
allowed garbage into the normal vma->ggtt_view and then try userspace
tried to mmap it, we could explode in an unobvious fashion.

Fixes: 7b92c047bae2 ("drm/i915: Eliminate superfluous i915_ggtt_view_rotated")
Fixes: 3bf4d5751943 ("drm/i915: Stop clearing i915_ggtt_view")
Reported-by: Matthew Auld <matthew.william.auld@gmail.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170123145245.3972-1-chris@chris-wilson.co.uk
Tested-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>


# 6146e6da 20-Jan-2017 Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

drm/i915: reinstate call to trace_i915_vma_bind

The call went away in:

commit 3b16525cc4c1a43e9053cfdc414356eea24bdfad
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Thu Aug 4 16:32:25 2016 +0100

drm/i915: Split insertion/binding of an object into the VM

It is useful to have this trace as it pairs nicely with the vma_unbind
one to track vma activity.
Added inside the i915_vma_bind function (was outside before) to keep a
similar placement as trace_i915_vma_unbind.

v2: print bind_flags instead of flags (Chris)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1484949083-11430-1-git-send-email-daniele.ceraolospurio@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>


# b00ddb27 19-Jan-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Assert that created vma has a whole number of pages

VMA (and their objects) are supposed to composed of whole pages. Add an
assert to catch any invalid construct when we create the VMA.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170119192659.31789-6-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# 44a0ec0d 19-Jan-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Assert the drm_mm_node is allocated when on the VM lists

Before moving the vma between the VM active/inactive lists, assert that
the node is still allocated.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170119192659.31789-5-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# 1fcdaa7e 19-Jan-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Reject vma creation larger than address space

Disallow creation of a vma that is larger than the available address
space, or triggers an overflow on fence expansion.

Testcase: igt/gem_exec_reloc/gtt-32
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170119192659.31789-3-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# 481a6f7d 16-Jan-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove i915_gem_object_to_ggtt()

With the last user of this convenience wrapper gone, we can kill the
wrapper and in the process make the lookup function static.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170116152131.18089-5-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# a01cb37a 16-Jan-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove i915_vma_create from VMA API

With the introduce of i915_vma_instance() for obtaining the VMA
singleton for a (obj, vm, view) tuple, we can remove the
i915_vma_create() in favour of a single entry point. We do incur a
lookup onto an empty tree, but the i915_vma_create() were being called
infrequently and during initialisation, so the small overhead is
negligible.

v2: Drop the i915_ prefix from the now static vma_create() function

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170116152131.18089-4-chris@chris-wilson.co.uk


# 4ea9527c 16-Jan-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Add a check that the VMA instance we lookup matches the request

Just as added paranoia against our future-selves add another check that
the lookup/created VMA instance matches the request.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170116152131.18089-3-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# 718659a6 16-Jan-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Rename some warts in the VMA API

Whilst writing testcases to exercise the VMA API, some oddities came to
light, such as i915_gem_obj_lookup_or_create(). Joonas suggested
i915_vma_instance() as a neat replacement, so rename them, move them to
i915_vma.c and add some kerneldoc as a sugary bonus.

s/i915_gem_obj_to_vma/i915_vma_lookup/
s/i915_gem_obj_lookup_or_create_vma/i915_vma_instance/

Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170116152131.18089-2-chris@chris-wilson.co.uk


# 8bab1193 13-Jan-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Convert i915_ggtt_view to use an anonymous union

Reading the ggtt_views is much more pleasant without the extra
characters from specifying the union (i.e. ggtt_view.partial rather than
ggtt_view.params.partial). To make this work inside i915_vma_compare()
with only a single memcmp requires us to ensure that there are no
uninitialised bytes within each branch of the union (we make sure the
structs are packed) and we need to store the size of each branch.

v4: Rewrite changelog and add comments explaining the assert.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20170114002827.31315-5-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>


# 0325701a 11-Jan-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Assert that we have allocated the drm_mm_node upon pinning

We currently check after the slow path that the vma is bound correctly,
but we don't currently check after the fast path. This is important in
case we accidentally take the fast path and leave the vma misplaced.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170111210937.29252-27-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# 625d988a 11-Jan-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Extract reserving space in the GTT to a helper

Extract drm_mm_reserve_node + calling i915_gem_evict_for_node into its
own routine so that it can be shared rather than duplicated.

v2: Kerneldoc

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: igvt-g-dev@lists.01.org
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170111112312.31493-2-chris@chris-wilson.co.uk


# e007b19d 11-Jan-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Use the MRU stack search after evicting

When we evict from the GTT to make room for an object, the hole we
create is put onto the MRU stack inside the drm_mm range manager. On the
next search pass, we can speed up a PIN_HIGH allocation by referencing
that stack for the new hole.

v2: Pull together the 3 identical implements (ahem, a couple were
outdated) into a common routine for allocating a node and evicting as
necessary.
v3: Detect invalid calls to i915_gem_gtt_insert()
v4: kerneldoc

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170111112312.31493-1-chris@chris-wilson.co.uk


# f51455d4 10-Jan-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Replace 4096 with PAGE_SIZE or I915_GTT_PAGE_SIZE

Start converting over from the byte count to its semantic macro, either
we want to allocate the size of a physical page in main memory or we
want the size of a virtual page in the GTT. 4096 could mean either, but
PAGE_SIZE and I915_GTT_PAGE_SIZE are explicit and should help improve
code comprehension and future changes. In the future, we may want to use
variable GTT page sizes and so have the challenge of knowing which
hardcoded values were used to represent a physical page vs the virtual
page.

v2: Look for a few more 4096s to convert, discover IS_ALIGNED().
v3: 4096ul paranoia, make fence alignment a distinct value of 4096, keep
bdw stolen w/a as 4096 until we know better.
v4: Add asserts that i915_vma_insert() start/end are aligned to GTT page
sizes.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170110144734.26052-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# 91d4e0aa 09-Jan-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Move ggtt fence/alignment to i915_gem_tiling.c

Rename i915_gem_get_ggtt_size() and i915_gem_get_ggtt_alignment() to
i915_gem_fence_size() and i915_gem_fence_alignment() respectively to
better match usage. Similarly move the pair of functions into
i915_gem_tiling.c next to the fence restrictions.

Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170109161613.11881-6-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# 944397f0 09-Jan-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Store required fence size/alignment for GGTT vma

The fence size/alignment is a combination of the vma size plus object
tiling parameters. Those parameters are rarely changed, making the fence
size/alignemnt roughly constant for the lifetime of the VMA. We can
simplify subsequent calculations by precalculating the size/alignment
required for GGTT vma taking fencing into account (with an update if we
do change the tiling or stride).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170109161613.11881-4-chris@chris-wilson.co.uk


# 5b30694b 09-Jan-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Align GGTT sizes to a fence tile row

Ensure the view occupies the full tile row so that reads/writes into the
VMA do not escape (via fenced detiling) into neighbouring objects - we
will pad the object with scratch pages to satisfy the fence. This
applies the lazy-tiling we employed on gen2/3 to gen4+.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170109161613.11881-2-chris@chris-wilson.co.uk


# e8f9ae9b 06-Jan-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Use range_overflows()

Replace a few more open-coded overflow checks with the macro.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170106152013.24684-4-chris@chris-wilson.co.uk


# 7a5580a2 31-Dec-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Move assert of page pin vs bind count into i915_vma_unbind

The read of the page pin count and the bind count are unordered,
presenting races in the assert and it firing off incorrectly. Prevent
this by restricting the assert to the vma bind/unbind routines where we
have local cpu ordering between the two.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20161231112012.29263-1-chris@chris-wilson.co.uk


# 3f85fb34 22-Dec-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm: Wrap drm_mm_node.hole_follows

Insulate users from changes to the internal hole tracking within
struct drm_mm_node by using an accessor for hole_follows.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
[danvet: resolve conflicts in i915_vma.c]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 07e19ea4 23-Dec-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Assert that the partial VMA fits within the object

When creating a partial VMA assert that it first fits with the parent
object, and that if it covers the whole of the parent a normal view was
created instead.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161223145804.6605-5-chris@chris-wilson.co.uk


# 966d5bf5 13-Dec-2016 Matthew Auld <matthew.auld@intel.com>

drm/i915: convert to using range_overflows

Convert some of the obvious hand-rolled ranged overflow sanity checks to
our shiny new range_overflows macro.

Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161213203222.32564-4-matthew.auld@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>


# 7a0499a4 13-Dec-2016 Matthew Auld <matthew.auld@intel.com>

drm/i915: move vma sanity checking into i915_vma_bind

If we move the sanity checking from gen8_alloc_va_range_3lvl and
gen6_alloc_va_range into i915_vma_bind, we will increase our coverage to
now both callbacks. We also convert each WARN_ON over to a GEM_WARN_ON.

Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161213203222.32564-2-matthew.auld@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>


# 7d1d9aea 05-Dec-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Tidy i915_gem_valid_gtt_space()

We can replace a couple of tests with an assertion that the passed in
node is already allocated (as matches the existing call convention) and
by a small bit of refactoring we can bring the line lengths to under
80cols.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161205142941.21965-3-chris@chris-wilson.co.uk


# 172ae5b4 05-Dec-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Fix i915_gem_evict_for_vma (soft-pinning)

Soft-pinning depends upon being able to check for availabilty of an
interval and evict overlapping object from a drm_mm range manager very
quickly. Currently it uses a linear list, and so performance is dire and
not suitable as a general replacement. Worse, the current code will oops
if it tries to evict an active buffer.

It also helps if the routine reports the correct error codes as expected
by its callers and emits a tracepoint upon use.

For posterity since the wrong patch was pushed (i.e. that missed these
key points and had known bugs), this is the changelog that should have
been on commit 506a8e87d8d2 ("drm/i915: Add soft-pinning API for
execbuffer"):

Userspace can pass in an offset that it presumes the object is located
at. The kernel will then do its utmost to fit the object into that
location. The assumption is that userspace is handling its own object
locations (for example along with full-ppgtt) and that the kernel will
rarely have to make space for the user's requests.

This extends the DRM_IOCTL_I915_GEM_EXECBUFFER2 to do the following:
* if the user supplies a virtual address via the execobject->offset
*and* sets the EXEC_OBJECT_PINNED flag in execobject->flags, then
that object is placed at that offset in the address space selected
by the context specifier in execbuffer.
* the location must be aligned to the GTT page size, 4096 bytes
* as the object is placed exactly as specified, it may be used by this
execbuffer call without relocations pointing to it

It may fail to do so if:
* EINVAL is returned if the object does not have a 4096 byte aligned
address
* the object conflicts with another pinned object (either pinned by
hardware in that address space, e.g. scanouts in the aliasing ppgtt)
or within the same batch.
EBUSY is returned if the location is pinned by hardware
EINVAL is returned if the location is already in use by the batch
* EINVAL is returned if the object conflicts with its own alignment (as meets
the hardware requirements) or if the placement of the object does not fit
within the address space

All other execbuffer errors apply.

Presence of this execbuf extension may be queried by passing
I915_PARAM_HAS_EXEC_SOFTPIN to DRM_IOCTL_I915_GETPARAM and checking for
a reported value of 1 (or greater).

v2: Combine the hole/adjusted-hole ENOSPC checks
v3: More color, more splitting, more blurb.

Fixes: 506a8e87d8d2 ("drm/i915: Add soft-pinning API for execbuffer")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161205142941.21965-2-chris@chris-wilson.co.uk


# 49d73912 29-Nov-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Convert vm->dev backpointer to vm->i915

99% of the time we access i915_address_space->dev we want the i915
device and not the drm device, so let's store the drm_i915_private
backpointer instead. The only real complication here are the inlines
in i915_vma.h where drm_i915_private is not yet defined and so we have
to choose an alternate path for our asserts.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161129095008.32622-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# 5b8c8aec 16-Nov-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Move frontbuffer CS write tracking from ggtt vma to object

I tried to avoid having to track the write for every VMA by only
tracking writes to the ggtt. However, for the purposes of frontbuffer
tracking this is insufficient as we need to invalidate around writes not
just to the the ggtt but all aliased ppgtt views of the framebuffer. By
moving the critical section to the object and only doing so for
framebuffer writes we can reduce the tracking even further by only
watching framebuffers and not vma.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161116190704.5293-1-chris@chris-wilson.co.uk
Tested-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# b42fe9ca 10-Nov-2016 Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

drm/i915: Split out i915_vma.c

As a side product, had to split two other files;
- i915_gem_fence_reg.h
- i915_gem_object.h (only parts that needed immediate untanglement)

I tried to move code in as big chunks as possible, to make review
easier. i915_vma_compare was moved to a header temporarily.

v2:
- Use i915_gem_fence_reg.{c,h}

v3:
- Rebased

v4:
- Fix building when DEBUG_GEM is enabled by reordering a bit.

Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1478861034-30643-1-git-send-email-joonas.lahtinen@linux.intel.com