History log of /linux-master/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
Revision Date Author Comments
# 94bf3e60 27-Mar-2024 Andi Shyti <andi.shyti@linux.intel.com>

drm/i915/gt: Limit the reserved VM space to only the platforms that need it

Commit 9bb66c179f50 ("drm/i915: Reserve some kernel space per
vm") reduces the available VM space by one page in order to apply
Wa_16018031267 and Wa_16018063123.

This page was reserved indiscriminately on all platforms even when
not needed. Limit it to DG2 onwards.

Fixes: 9bb66c179f50 ("drm/i915: Reserve some kernel space per vm")
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
Cc: Jonathan Cavitt <jonathan.cavitt@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Acked-by: Michal Mrozek <michal.mrozek@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240327200546.640108-1-andi.shyti@linux.intel.com
(cherry picked from commit 9721634441d5dedba7f9eebb2bf0c9411cbafc4e)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# 9bb66c17 26-Oct-2023 Andrzej Hajda <andrzej.hajda@intel.com>

drm/i915: Reserve some kernel space per vm

Reserve one page in each vm for kernel space to use for things
such as workarounds.

v2: use real memory, do not decrease vm.total
v4: reserve only one page and explain flag
v5: remove allocated object on ppgtt cleanup
v6: decrease vm->total by reservation size

Suggested-by: Chris Wilson <chris.p.wilson@linux.intel.com>
Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231026-wabb-v6-1-4aa7d55d0a8a@intel.com


# 6b711386 11-Sep-2023 Jani Nikula <jani.nikula@intel.com>

drm/i915/gt: rename DBG() to GTT_TRACE()

intel_gtt.h is indirectly included absolutely everywhere in the
driver. DBG() is too short a name. Rename it GTT_TRACE() after
GEM_TRACE().

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230911123305.1682554-1-jani.nikula@intel.com


# 08509377 07-Jul-2023 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Remove dead code from gen8_pte_encode

Commit 9275277d5324 ("drm/i915: use pat_index instead of cache_level")
added a dedicated gen12_pte_encode but forgot to remove the Gen12 specific
bit from gen8_pte_encode.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 9275277d5324 ("drm/i915: use pat_index instead of cache_level")
Cc: Fei Yang <fei.yang@intel.com>
Cc: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Fei Yang <fei.yang@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230707124644.3965281-1-tvrtko.ursulin@linux.intel.com


# 6bf0961a 07-Jul-2023 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Remove dead code from gen8_pte_encode

Commit 9275277d5324 ("drm/i915: use pat_index instead of cache_level")
added a dedicated gen12_pte_encode but forgot to remove the Gen12 specific
bit from gen8_pte_encode.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 9275277d5324 ("drm/i915: use pat_index instead of cache_level")
Cc: Fei Yang <fei.yang@intel.com>
Cc: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Fei Yang <fei.yang@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230707124644.3965281-1-tvrtko.ursulin@linux.intel.com
(cherry picked from commit 08509377dd82ead98429785509f6b52a4b5f09f5)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>


# 9275277d 09-May-2023 Fei Yang <fei.yang@intel.com>

drm/i915: use pat_index instead of cache_level

Currently the KMD is using enum i915_cache_level to set the caching policy for
buffer objects. This is flaky because the PAT index, which really controls
the caching behavior in the PTE, has far more levels than what's defined in the
enum. In addition, the PAT index is platform dependent; having to translate
between i915_cache_level and PAT index is not reliable and makes the code
more complicated.

From the UMD's perspective there is also a need to set the caching policy for
performance fine-tuning. It's much easier for the UMD to directly use the PAT
index because the behavior of each PAT index is clearly defined in the Bspec.
Having the abstracted i915_cache_level sitting in between would only cause
more ambiguity. PAT is expected to work much like MOCS already works today,
and by design userspace is expected to select the index that exactly
matches the desired behavior described in the hardware specification.

For these reasons this patch replaces i915_cache_level with PAT index. Also
note that cache_level is not completely removed yet, because the KMD still
needs to create buffer objects with simple cache settings such as
cached, uncached, or writethrough. For kernel objects, cache_level is used
for simplicity and backward compatibility. For pre-gen12 platforms PAT has
a 1:1 mapping to i915_cache_level, so the two are interchangeable; see
the use of LEGACY_CACHELEVEL.

One consequence of this change is that gen8_pte_encode no longer works
for gen12 platforms, due to the fact that gen12 platforms have different PAT
definitions. In the meantime the mtl_pte_encode introduced specifically for
MTL becomes generic for all gen12 platforms. This patch renames the MTL
PTE encode function to gen12_pte_encode and applies it to all gen12. Even
though this change looks unrelated, separating it would temporarily
break gen12 PTE encoding, thus the two are squashed into one patch.

Special note: this patch changes the way caching behavior is controlled, in
the sense that some objects are left to be managed by userspace. For such
objects we need to be careful not to change the userspace settings. There
are kerneldoc and comments added around obj->cache_coherent, cache_dirty,
and how to bypass the checks via i915_gem_object_has_cache_level. For a
full understanding, these changes need to be looked at together with the
two follow-up patches: one disables the {set|get}_caching ioctls and the
other adds the set_pat extension to the GEM_CREATE uAPI.

Bspec: 63019

Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
Signed-off-by: Fei Yang <fei.yang@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230509165200.1740-3-fei.yang@intel.com


# 5e352e32 09-May-2023 Fei Yang <fei.yang@intel.com>

drm/i915: preparation for using PAT index

This patch is a preparation for replacing enum i915_cache_level with the PAT
index. Caching policy for buffer objects is set through the PAT index in the
PTE; the old i915_cache_level is not sufficient to represent all caching
modes supported by the hardware.

Prepare the transition by adding some platform-dependent data structures
and helper functions to translate the cache_level to pat_index.

cachelevel_to_pat: a platform-dependent array mapping cache_level to
pat_index.

max_pat_index: the maximum PAT index recommended in the hardware
specification, needed for validating the PAT index passed in from
user space.

i915_gem_get_pat_index: function to convert cache_level to PAT index.

obj_to_i915(obj): macro moved to a header file for wider usage.

I915_MAX_CACHE_LEVEL: upper bound of i915_cache_level for coding
convenience.

Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Fei Yang <fei.yang@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230509165200.1740-2-fei.yang@intel.com
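
[Editor's sketch] A minimal illustration of the kind of translation helper
described above, with made-up enum values and placeholder PAT numbers; the
real per-platform tables and i915_gem_get_pat_index() live in the driver.

#include <linux/types.h>

/* Hypothetical cache levels and PAT indices for illustration only; the
 * real mapping is platform dependent and defined by the hardware spec. */
enum example_cache_level {
	EXAMPLE_CACHE_NONE,
	EXAMPLE_CACHE_LLC,
	EXAMPLE_CACHE_WT,
	EXAMPLE_MAX_CACHE_LEVEL,
};

static const unsigned int example_cachelevel_to_pat[EXAMPLE_MAX_CACHE_LEVEL] = {
	[EXAMPLE_CACHE_NONE] = 3,	/* placeholder: uncached */
	[EXAMPLE_CACHE_LLC]  = 0,	/* placeholder: fully cached */
	[EXAMPLE_CACHE_WT]   = 2,	/* placeholder: write-through */
};

/* Translate a kernel-internal cache level to the PAT index written into
 * the PTE, mirroring the role of i915_gem_get_pat_index(). */
static unsigned int example_get_pat_index(enum example_cache_level level)
{
	return example_cachelevel_to_pat[level];
}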


# eee44c29 26-Apr-2023 Jonathan Cavitt <jonathan.cavitt@intel.com>

drm/i915: Use correct huge page manager for MTL

MTL currently uses gen8_ppgtt_insert_huge when managing huge pages.
This is because MTL reports as not supporting 64K pages; more
accurately, the check that reports whether a platform has 64K pages
reports false for MTL. This is only half correct, as that check
only cares about 64K page support for LMEM, which MTL
doesn't have.

MTL should be using xehpsdv_ppgtt_insert_huge. However, simply changing
over to that manager doesn't resolve the issue, because MTL expects
the virtual address space for the page table to be flushed after
initialization, so we must also add a flush statement there.

Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230425-hugepage-migrate-v8-2-7868d54eaa27@intel.com


# 341ad0e8 24-Apr-2023 Fei Yang <fei.yang@intel.com>

drm/i915/mtl: Add PTE encode function

PTE encode functions are platform dependent. This patch implements
the PTE functions for MTL, and ensures the correct PTE encode function
is used by calling the pte_encode function pointer instead of the
hardcoded gen8 version.

Fixes: b76c0deef627 ("drm/i915/mtl: Define MOCS and PAT tables for MTL")
Signed-off-by: Fei Yang <fei.yang@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230424182902.3663500-2-fei.yang@intel.com


# 8133a6da 03-Oct-2022 Matthew Auld <matthew.auld@intel.com>

drm/i915: enable PS64 support for DG2

It turns out that on production DG2/ATS HW we should have support for
PS64. This feature allows providing a 64K TLB hint at the PTE level,
which is a lot more flexible than the current method of enabling 64K GTT
pages for the entire page-table, since that leads to all kinds of
annoying restrictions, as documented in:

commit caa574ffc4aaf4f29b890223878c63e2e7772f62
Author: Matthew Auld <matthew.auld@intel.com>
Date: Sat Feb 19 00:17:49 2022 +0530

drm/i915/uapi: document behaviour for DG2 64K support

On discrete platforms like DG2, we need to support a minimum page size
of 64K when dealing with device local-memory. This is quite tricky for
various reasons, so try to document the new implicit uapi for this.

With PS64, we can now drop the 2M GTT alignment restriction, and instead
only require 64K or larger when dealing with lmem. We still use the
compact-pt layout when possible, but only when we are certain that this
doesn't interfere with userspace.

Note that this is a change in uAPI behaviour, but hopefully shouldn't be
a concern (IGT is at least able to autodetect the alignment), since we
are only making the GTT alignment constraint less restrictive.

Based on a patch from CQ Tang.

v2: update the comment wrt scratch page
v3: (Nirmoy)
- Fix the selftest to actually use the random size, plus some comment
improvements, also drop the rem stuff.

Reported-by: Michal Mrozek <michal.mrozek@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Stuart Summers <stuart.summers@intel.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Yang A Shi <yang.a.shi@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Acked-by: Michal Mrozek <michal.mrozek@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221004114915.221708-1-matthew.auld@intel.com
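
[Editor's sketch] To give a feel for what a per-PTE 64K hint means in code,
a hedged illustration; the bit name and position below are assumptions for
the example, not the real definition from the GTT headers.

#include <linux/bits.h>
#include <linux/types.h>

/* Assumed hint bit for illustration; the actual PS64 PTE bit is defined
 * in the i915 GTT headers and may differ. */
#define EXAMPLE_PTE_PS64	BIT_ULL(8)

/* OR the 64K contiguity hint into an already-encoded PTE when the
 * backing 64K block is physically contiguous and 64K aligned. */
static u64 example_apply_ps64_hint(u64 encoded_pte, bool contiguous_64k)
{
	if (contiguous_64k)
		encoded_pte |= EXAMPLE_PTE_PS64;

	return encoded_pte;
}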


# f28d4266 26-Sep-2022 Chris Wilson <chris.p.wilson@intel.com>

drm/i915/gt: Move scratch page into system memory on all platforms

The scratch page should never be accessed, and is only assigned as a
filler page to redirect invalid userspace access. It is not a
performance concern and so we prefer to have a single consistent
configuration across all platforms, reducing the pressure on device
memory and avoiding the direct device access that would be required to
initialise the scratch page.

Signed-off-by: Chris Wilson <chris.p.wilson@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220926155018.109678-1-matthew.auld@intel.com


# c286558f 26-Sep-2022 Chris Wilson <chris.p.wilson@intel.com>

drm/i915/gt: Use i915_vm_put on ppgtt_create error paths

Now that the scratch page and page directories have a reference back to
the i915_address_space, we cannot do an immediate free of the ppgtt upon
error as those buffer objects will perform a later i915_vm_put in their
deferred frees.

The downside is that by replacing the onion unwind along the error
paths, the ppgtt cleanup must handle a partially constructed vm. This
includes ensuring that the vm->cleanup is set prior to the error path.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/6900
Signed-off-by: Chris Wilson <chris.p.wilson@intel.com>
Fixes: 4d8151ae5329 ("drm/i915: Don't free shared locks while shared")
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: <stable@vger.kernel.org> # v5.14+
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220926153333.102195-1-matthew.auld@intel.com


# 20e377e7 26-Sep-2022 Chris Wilson <chris.p.wilson@intel.com>

drm/i915/gt: Use i915_vm_put on ppgtt_create error paths

Now that the scratch page and page directories have a reference back to
the i915_address_space, we cannot do an immediate free of the ppgtt upon
error as those buffer objects will perform a later i915_vm_put in their
deferred frees.

The downside is that by replacing the onion unwind along the error
paths, the ppgtt cleanup must handle a partially constructed vm. This
includes ensuring that the vm->cleanup is set prior to the error path.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/6900
Signed-off-by: Chris Wilson <chris.p.wilson@intel.com>
Fixes: 4d8151ae5329 ("drm/i915: Don't free shared locks while shared")
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: <stable@vger.kernel.org> # v5.14+
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220926153333.102195-1-matthew.auld@intel.com
(cherry picked from commit c286558f58535cf97b717b946d6c96d774a09d17)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>


# 61c5ed94 21-Mar-2022 Michael Cheng <michael.cheng@intel.com>

drm/i915/gt: replace cache_clflush_range

Replace all occurrence of cache_clflush_range with drm_clflush_virt_range.
This will prevent compile errors on non-x86 platforms.

Signed-off-by: Michael Cheng <michael.cheng@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220321223819.72833-6-michael.cheng@intel.com
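
[Editor's sketch] For reference, the replacement call looks roughly like
this; the helper is invented for illustration, only drm_clflush_virt_range()
is a real API here.

#include <drm/drm_cache.h>
#include <linux/types.h>

/* Flush the CPU cache lines covering a freshly written run of PTEs so
 * the GPU observes the update; drm_clflush_virt_range() compiles on
 * non-x86 architectures, unlike the old x86-only clflush helpers. */
static void example_flush_ptes(u64 *ptes, unsigned int count)
{
	drm_clflush_virt_range(ptes, count * sizeof(*ptes));
}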


# 6f84aa1c 18-Feb-2022 Matthew Auld <matthew.auld@intel.com>

drm/i915/gtt: add xehpsdv_ppgtt_insert_entry

If this is LMEM then we get a 32 entry PT, with each PTE pointing to
some 64K block of memory, otherwise it's just the usual 512 entry PT.
This very much assumes the caller knows what they are doing.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Ramalingam C <ramalingam.c@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220218184752.7524-11-ramalingam.c@intel.com


# 5189e312 18-Feb-2022 Matthew Auld <matthew.auld@intel.com>

drm/i915: support 64K GTT pages for discrete cards

Discrete cards optimise 64K GTT pages for local-memory, since everything
should be allocated at 64K granularity. We say goodbye to sparse
entries, and instead get a compact 256B page-table for 64K pages,
which should be more cache friendly. 4K pages for local-memory
are no longer supported by the HW.

v4: don't return uninitialized err in igt_ppgtt_compact
Reported-by: kernel test robot <lkp@intel.com>

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Signed-off-by: Robert Beckett <bob.beckett@collabora.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220218184752.7524-8-ramalingam.c@intel.com


# 39a2bd34 10-Jan-2022 Thomas Hellström <thomas.hellstrom@linux.intel.com>

drm/i915: Use the vma resource as argument for gtt binding / unbinding

When introducing asynchronous unbinding, the vma itself may no longer
be alive when the actual binding or unbinding takes place.

Update the gtt i915_vma_ops accordingly to take a struct i915_vma_resource
instead of a struct i915_vma for the bind_vma() and unbind_vma() ops.
Similarly change the insert_entries() op for struct i915_address_space.

Replace a couple of i915_vma_snapshot members with their newly introduced
i915_vma_resource counterparts, since they have the same lifetime.

Also make sure to avoid changing the struct i915_vma_flags (in particular
the bind flags) async. That should now only be done sync under the
vm mutex.

v2:
- Update the vma_res::bound_flags when binding to the aliased ggtt
v6:
- Remove I915_VMA_ALLOC_BIT (Matthew Auld)
- Change some members of struct i915_vma_resource from unsigned long to u64
(Matthew Auld)
v7:
- Fix vma resource size parameters to be u64 rather than unsigned long
(Matthew Auld)

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220110172219.107131-3-thomas.hellstrom@linux.intel.com


# fef53be0 08-Dec-2021 Matthew Auld <matthew.auld@intel.com>

drm/i915/gtt/xehpsdv: move scratch page to system memory

On some platforms the hw has dropped support for 4K GTT pages when
dealing with LMEM, and due to the design of 64K GTT pages in the hw, we
can only mark the *entire* page-table as operating in 64K GTT mode,
since the enable bit is still on the pde, and not the pte. And since we
still need to allow 4K GTT pages for SMEM objects, we can't have a
"normal" 4K page-table with scratch pointing to LMEM, since that's
undefined from the hw pov. The simplest solution is to just move the 64K
scratch page to SMEM on such platforms and call it a day, since that
should work for all configurations.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Thomas Hellstrom <thomas.hellstrom@linux.intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211208141613.7251-4-ramalingam.c@intel.com


# 5f978167 06-Dec-2021 Michael Cheng <michael.cheng@intel.com>

drm/i915: Introduce new macros for i915 PTE

Certain functions within i915 use macros that are defined for
specific architectures by the mmu, such as _PAGE_RW and _PAGE_PRESENT
(some architectures, like ARM64, don't even have these macros defined).

Instead of re-using bits defined for the CPU, we should use bits
defined for i915. This patch introduces two new 64-bit macros,
GEN8_PAGE_PRESENT and GEN8_PAGE_RW, to check for bits 0 and 1 and to
replace all occurrences of _PAGE_RW and _PAGE_PRESENT within i915.

v2(Michael Cheng): Use GEN8_ instead of I915_

Signed-off-by: Michael Cheng <michael.cheng@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
[ Move defines together with other GEN8 defines ]
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211206215245.513677-2-michael.cheng@intel.com
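
[Editor's sketch] The gist of the change, sketched with the macro names and
bit positions from the message; the authoritative definitions are in the
i915 GTT headers, and the check below is an illustrative helper.

#include <linux/bits.h>
#include <linux/types.h>

/* i915-owned PTE bits, decoupled from the CPU MMU's _PAGE_PRESENT and
 * _PAGE_RW, which are not available on all architectures. */
#define GEN8_PAGE_PRESENT	BIT_ULL(0)
#define GEN8_PAGE_RW		BIT_ULL(1)

/* Example check mirroring how the driver tests a PTE for validity. */
static bool example_pte_present_and_writable(u64 pte)
{
	return (pte & (GEN8_PAGE_PRESENT | GEN8_PAGE_RW)) ==
	       (GEN8_PAGE_PRESENT | GEN8_PAGE_RW);
}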


# b0cc4dca 28-Oct-2021 Matthew Auld <matthew.auld@intel.com>

drm/i915/gtt: stop caching the scratch page

Normal users shouldn't be hitting this; it would likely indicate a
userspace bug. So don't bother caching, which should be safe now that we
manually flush the page.

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211028092638.3142258-2-matthew.auld@intel.com


# 5926ff80 18-Oct-2021 Matthew Auld <matthew.auld@intel.com>

drm/i915/gtt: drop unneeded make_unshrinkable

We already do this when mapping the pages.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211018091055.1998191-3-matthew.auld@intel.com


# a259cc14 22-Sep-2021 Thomas Hellström <thomas.hellstrom@linux.intel.com>

drm/i915: Reduce the number of objects subject to memcpy recover

We really only need memcpy restore for objects that affect the
operability of the migrate context. That is, primarily the page-table
objects of the migrate VM.

Add an object flag, I915_BO_ALLOC_PM_EARLY for objects that need early
restores using memcpy and a way to assign LMEM page-table object flags
to be used by the vms.

Restore objects without this flag with the gpu blitter and only objects
carrying the flag using TTM memcpy.

Initially mark the migrate, gt, gtt and vgpu vms to use this flag, and
defer to a later audit which vms actually need it. Most importantly, user-
allocated vms with pinned page-table objects can be restored using the
blitter.

Performance-wise memcpy restore is probably as fast as gpu restore, if not
faster, but using gpu restore will help tackle future restrictions in
mappable LMEM size.

v4:
- Don't mark the aliasing ppgtt page table flags for early resume, but
rather the ggtt page table flags as intended. (Matthew Auld)
- The check for user buffer objects during early resume is pointless, since
they are never marked I915_BO_ALLOC_PM_EARLY. (Matthew Auld)
v5:
- Mark GuC LMEM objects with I915_BO_ALLOC_PM_EARLY to have them restored
before we fire up the migrate context.

Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210922062527.865433-8-thomas.hellstrom@linux.intel.com


# 502d0609 03-Sep-2021 Matthew Auld <matthew.auld@intel.com>

drm/i915/gtt: add some flushing for the 64K GTT path

If we need to mark the PDE as operating in 64K GTT mode, we should be
paranoid and flush the extra writes, like we already do for the PTEs. On
some platforms the clflush can apparently add just the right amount
of magical delay to force the GPU to see the updated entry.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210903155317.1854012-1-matthew.auld@intel.com


# 8f88ca76 13-Jul-2021 Matthew Auld <matthew.auld@intel.com>

drm/i915/gtt: drop the page table optimisation

We skip filling out the pt with scratch entries if the va range covers
the entire pt, since we later have to fill it with the PTEs for the
object pages anyway. However this might leave open a small window where
the PTEs don't point to anything valid for the HW to consume.

When for example using 2M GTT pages this fill_px() showed up as being
quite significant in perf measurements, and ends up being completely
wasted since we ignore the pt and just use the pde directly.

Anyway, currently we have our PTE construction split between alloc and
insert, which is probably slightly iffy nowadays, since the alloc
doesn't actually allocate anything anymore; instead it just sets up the
page directories and points the PTEs at the scratch page. Later when we
do the insert step we re-program the PTEs again. Better might be to
squash the alloc and insert into a single step, after which bringing back
this optimisation (along with some others) should be possible.

Fixes: 14826673247e ("drm/i915: Only initialize partially filled pagetables")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Chris Wilson <chris.p.wilson@intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: <stable@vger.kernel.org> # v4.15+
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210713130431.2392740-1-matthew.auld@intel.com


# 3607e1e9 17-Jun-2021 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Add a routine to iterate over the pagetables of a GTT

In the next patch, we will want to look at the dma addresses of
individual page tables, so add a routine to iterate over them.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210617063018.92802-6-thomas.hellstrom@linux.intel.com


# 0dcd6fdf 17-Jun-2021 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Add an insert_entry for gen8_ppgtt

In the next patch, we will want to write a PTE for an explicit
dma address, outside of the usual vma.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210617063018.92802-5-thomas.hellstrom@linux.intel.com


# 0abb33bf 13-Jul-2021 Matthew Auld <matthew.auld@intel.com>

drm/i915/gtt: drop the page table optimisation

We skip filling out the pt with scratch entries if the va range covers
the entire pt, since we later have to fill it with the PTEs for the
object pages anyway. However this might leave open a small window where
the PTEs don't point to anything valid for the HW to consume.

When for example using 2M GTT pages this fill_px() showed up as being
quite significant in perf measurements, and ends up being completely
wasted since we ignore the pt and just use the pde directly.

Anyway, currently we have our PTE construction split between alloc and
insert, which is probably slightly iffy nowadays, since the alloc
doesn't actually allocate anything anymore; instead it just sets up the
page directories and points the PTEs at the scratch page. Later when we
do the insert step we re-program the PTEs again. Better might be to
squash the alloc and insert into a single step, after which bringing back
this optimisation (along with some others) should be possible.

Fixes: 14826673247e ("drm/i915: Only initialize partially filled pagetables")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Chris Wilson <chris.p.wilson@intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: <stable@vger.kernel.org> # v4.15+
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210713130431.2392740-1-matthew.auld@intel.com
(cherry picked from commit 8f88ca76b3942d82e2c1cea8735ec368d89ecc15)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# c816723b 05-Jun-2021 Lucas De Marchi <lucas.demarchi@intel.com>

drm/i915/gt: replace IS_GEN and friends with GRAPHICS_VER

This was done by the following semantic patch:

@@ expression i915; @@
- INTEL_GEN(i915)
+ GRAPHICS_VER(i915)

@@ expression i915; expression E; @@
- INTEL_GEN(i915) >= E
+ GRAPHICS_VER(i915) >= E

@@ expression dev_priv; expression E; @@
- !IS_GEN(dev_priv, E)
+ GRAPHICS_VER(dev_priv) != E

@@ expression dev_priv; expression E; @@
- IS_GEN(dev_priv, E)
+ GRAPHICS_VER(dev_priv) == E

@@
expression dev_priv;
expression from, until;
@@
- IS_GEN_RANGE(dev_priv, from, until)
+ IS_GRAPHICS_VER(dev_priv, from, until)

@def@
expression E;
identifier id =~ "^gen$";
@@
- id = GRAPHICS_VER(E)
+ ver = GRAPHICS_VER(E)

@@
identifier def.id;
@@
- id
+ ver

It also takes care of renaming the variable we assign to GRAPHICS_VER()
so to use "ver" rather than "gen".

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210605155356.4183026-2-lucas.demarchi@intel.com
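
[Editor's sketch] At a call site the conversion reads as below; the setup
helper is hypothetical and defined only to keep the fragment standalone,
while GRAPHICS_VER() and struct drm_i915_private come from the driver.

#include "i915_drv.h"	/* provides GRAPHICS_VER() */

/* Hypothetical helper, defined here only to make the example standalone. */
static void example_gen12_setup(struct drm_i915_private *i915) { }

static void example_configure(struct drm_i915_private *i915)
{
	/* Before the conversion this read: if (INTEL_GEN(i915) >= 12) */
	if (GRAPHICS_VER(i915) >= 12)
		example_gen12_setup(i915);
}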


# 6aed5673 27-Apr-2021 Matthew Auld <matthew.auld@intel.com>

drm/i915/gtt/dgfx: place the PD in LMEM

It's a requirement that for dgfx we place all the paging structures in
device local-memory.

v2: use i915_coherent_map_type()
v3: improve the shared dma-resv object comment

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210427085417.120246-4-matthew.auld@intel.com


# 529b9ec8 27-Apr-2021 Matthew Auld <matthew.auld@intel.com>

drm/i915/gtt: map the PD up front

We need to generalise our accessor for the page directories and tables from
using the simple kmap_atomic to supporting local memory, and this setup
must be done on acquisition of the backing storage prior to entering
fence execution contexts. Here we replace the kmap with the object
mapping code that, for a simple single-page shmemfs object, will return a
plain kmap which is then kept for the lifetime of the page directory.

Note that keeping the mapping around is a potential concern here, since
while the vma is pinned the mapping remains there for the PDs
underneath, or at least until the used_count reaches zero, at which
point we can safely destroy the mapping. For 32b this will be even worse
since the address space is more limited, but since this change mostly
impacts full ppGTT platforms, the justification is that for modern
platforms we shouldn't care too much about 32b.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210427085417.120246-3-matthew.auld@intel.com


# ac69496f 26-Apr-2021 Lv Yunlong <lyl2019@mail.ustc.edu.cn>

drm/i915/gt: Fix a double free in gen8_preallocate_top_level_pdp

Our code analyzer reported a double free bug.

In gen8_preallocate_top_level_pdp, pde and pde->pt.base are allocated
via alloc_pd(vm) with one reference. If pin_pt_dma() fails, pde->pt.base
is freed by i915_gem_object_put(), dropping a reference. Then free_pd
calls free_px(), defined in intel_ppgtt.c, which calls i915_gem_object_put()
to put pde->pt.base again.

As pde->pt.base is protected by a refcount, the second put will not
actually free pde->pt.base, but it is better to remove the first put.

Fixes: 82adf901138cc ("drm/i915/gt: Shrink i915_page_directory's slab bucket")
Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210426124340.4238-1-lyl2019@mail.ustc.edu.cn


# ea995218 26-Apr-2021 Lv Yunlong <lyl2019@mail.ustc.edu.cn>

drm/i915/gt: Fix a double free in gen8_preallocate_top_level_pdp

Our code analyzer reported a double free bug.

In gen8_preallocate_top_level_pdp, pde and pde->pt.base are allocated
via alloc_pd(vm) with one reference. If pin_pt_dma() fails, pde->pt.base
is freed by i915_gem_object_put(), dropping a reference. Then free_pd
calls free_px(), defined in intel_ppgtt.c, which calls i915_gem_object_put()
to put pde->pt.base again.

As pde->pt.base is protected by a refcount, the second put will not
actually free pde->pt.base, but it is better to remove the first put.

Fixes: 82adf901138cc ("drm/i915/gt: Shrink i915_page_directory's slab bucket")
Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210426124340.4238-1-lyl2019@mail.ustc.edu.cn
(cherry picked from commit ac69496fe65cca0611d5917b7d232730ff605bc7)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# 11724eea 03-Feb-2021 Matthew Auld <matthew.auld@intel.com>

drm/i915/gtt/dg1: add PTE_LM plumbing for ppGTT

For the PTEs we get an LM bit, to signal whether the page resides in
SMEM or LMEM.

BSpec: 45040

v2: just use gen8_pte_encode for dg1

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20210203171231.551338-2-matthew.auld@intel.com
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 2f8aa3b8 22-Jan-2021 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Add some missing blank lines after declaration

Trivial checkpatch cleanup.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210122192913.4518-2-chris@chris-wilson.co.uk
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 9834dfef 13-Jan-2021 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Prune inlines

Remove all the manual inlines from non-critical sections in gt/

add/remove: 2/0 grow/shrink: 0/3 up/down: 762/-1473 (-711)
Function old new delta
mi_set_context.isra - 602 +602
write_dma_entry - 160 +160
__set_pd_entry 214 69 -145
clear_pd_entry 190 42 -148
ring_request_alloc 2021 841 -1180
Total: Before=1605086, After=1604375, chg -0.04%

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210113152224.29794-1-chris@chris-wilson.co.uk


# b8cff311 19-Oct-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Onion unwind for scratch page allocation failure

In switching to using objects for our ppGTT scratch pages, care was not
taken to avoid trying to unref NULL objects on failure. And for gen6
ppGTT, it appears we forgot entirely to unwind after a partial allocation
failure.

Fixes: 89351925a477 ("drm/i915/gt: Switch to object allocations for page directories")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201019083444.1286-1-chris@chris-wilson.co.uk
(cherry picked from commit fa812ce96a46efc27cae4dcad866aaee9cb25d28)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# fa812ce9 19-Oct-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Onion unwind for scratch page allocation failure

In switching to using objects for our ppGTT scratch pages, care was not
taken to avoid trying to unref NULL objects on failure. And for gen6
ppGTT, it appears we forgot entirely to unwind after a partial allocation
failure.

Fixes: 89351925a477 ("drm/i915/gt: Switch to object allocations for page directories")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201019083444.1286-1-chris@chris-wilson.co.uk


# 8a473dba 06-Oct-2020 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Fix DMA mapped scatterlist walks

When walking DMA-mapped scatterlists, sg_dma_len has to be used, since it
can be different (coalesced) from the length of the backing store entry.

This also means we have to end the walk when encountering a zero-length
DMA entry and cannot rely on the normal sg list end marker.

Both issues were there in theory for some time but were hidden by the fact
that the Intel IOMMU driver was never coalescing entries. As there are
ongoing efforts to change this, we need to start handling it.

v2:
* Use unsigned int for local storing sg_dma_len. (Logan)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
References: 85d1225ec066 ("drm/i915: Introduce & use new lightweight SGL iterators")
References: b31144c0daa8 ("drm/i915: Micro-optimise gen6_ppgtt_insert_entries()")
Reported-by: Tom Murphy <murphyt7@tcd.ie>
Suggested-by: Tom Murphy <murphyt7@tcd.ie> # __sgt_iter
Suggested-by: Logan Gunthorpe <logang@deltatee.com> # __sgt_iter
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201006092508.1064287-1-tvrtko.ursulin@linux.intel.com
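
[Editor's sketch] A hedged sketch of the corrected walk; the callback-based
helper is invented for illustration, while sg_next(), sg_dma_len() and
sg_dma_address() are the standard scatterlist accessors.

#include <linux/scatterlist.h>

/* Walk a DMA-mapped scatterlist using the DMA accessors rather than the
 * backing-store length. A zero sg_dma_len() terminates the walk, since
 * coalescing means the normal end marker cannot be relied upon. */
static void example_for_each_dma_segment(struct scatterlist *sgl,
					 void (*cb)(dma_addr_t addr,
						    unsigned int len,
						    void *data),
					 void *data)
{
	struct scatterlist *sg;

	for (sg = sgl; sg; sg = sg_next(sg)) {
		unsigned int len = sg_dma_len(sg);

		if (!len)
			break;

		cb(sg_dma_address(sg), len, data);
	}
}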


# 82adf901 29-Jul-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Shrink i915_page_directory's slab bucket

kmalloc uses power-of-two slab buckets for small allocations (up to a
few pages). Since i915_page_directory is a page of pointers, plus a
couple more, this is rounded up to 8K, and we waste nearly 50% of that
allocation. Long terms this leads to poor memory utilisation, bloating
the kernel footprint, but the problem is exacerbated by our conservative
preallocation scheme for binding VMA. As we are required to allocate all
levels for each vma just in case we need to insert them upon binding,
this leads to a large multiplication factor for a single page vma. By
halving the allocation we need for the page directory structure, we
halve the impact of that factor, bringing workloads that once fitted into
memory, hopefully back to fitting into memory.

We maintain the split between i915_page_directory and i915_page_table as
we only need half the allocation for the lowest, most populous, level.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200729164219.5737-3-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# 89351925 29-Jul-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Switch to object allocations for page directories

The GEM object is grossly overweight for the practicality of tracking
large numbers of individual pages, yet it is currently our only
abstraction for tracking DMA allocations. Since those allocations need
to be reserved upfront before an operation, and that we need to break
away from simple system memory, we need to ditch using plain struct page
wrappers.

In the process, we drop the WC mapping as we ended up clflushing
everything anyway due to various issues across a wider range of
platforms. Though in a future step, we need to drop the kmap_atomic
approach which suggests we need to pre-map all the pages and keep them
mapped.

v2: Verify our large scratch page is suitably DMA aligned; and manually
clear the scratch since we are allocating plain struct pages full of
prior content.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200729164219.5737-2-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# cd0452aa 29-Jul-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Preallocate stashes for vma page-directories

We need to make the DMA allocations used for page directories to be
performed up front so that we can include those allocations in our
memory reservation pass. The downside is that we have to assume the
worst case, even before we know the final layout, and always allocate
enough page directories for this object, even when there will be overlap.
This unfortunately can be quite expensive, especially as we have to
clear/reset the page directories and DMA pages, but it should only be
required during early phases of a workload when new objects are being
discovered, or after memory/eviction pressure when we need to rebind.
Once we reach steady state, the objects should not be moved and we no
longer need to preallocate the page tables.

It should be noted that the lifetime for the page directories DMA is
more or less decoupled from individual fences as they will be shared
across objects across timelines.

v2: Only allocate enough PD space for the PTE we may use, we do not need
to allocate PD that will be left as scratch.
v3: Store the shift unto the first PD level to encapsulate the different
PTE counts for gen6/gen8.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200729164219.5737-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# b2379ba2 13-May-2020 Nathan Chancellor <nathan@kernel.org>

drm/i915: Remove duplicate inline specifier on write_pte

When building with clang:

drivers/gpu/drm/i915/gt/gen8_ppgtt.c:392:24: warning: duplicate 'inline'
declaration specifier [-Wduplicate-decl-specifier]
static __always_inline inline void
^
include/linux/compiler_types.h:138:16: note: expanded from macro
'inline'
#define inline inline __gnu_inline __inline_maybe_unused notrace
^
1 warning generated.

__always_inline is defined as 'inline __attribute__((__always_inline__))'
so we do not need to specify it twice.

Fixes: 84eac0c65940 ("drm/i915/gt: Force pte cacheline to main memory")
Link: https://github.com/ClangBuiltLinux/linux/issues/1024
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200513182340.3968668-1-natechancellor@gmail.com
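
[Editor's sketch] The fix amounts to dropping the redundant specifier,
sketched below with an illustrative type and body; the real write_pte()
in gen8_ppgtt.c may differ slightly.

#include <linux/compiler.h>
#include <linux/types.h>

typedef u64 example_pte_t;

/* 'inline' is already part of __always_inline, so spelling it out again
 * only triggers clang's -Wduplicate-decl-specifier. */
static __always_inline void example_write_pte(example_pte_t *pte,
					      example_pte_t val)
{
	*pte = val;
}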


# 84eac0c6 11-May-2020 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/gt: Force pte cacheline to main memory

We have problems with tgl not seeing a valid pte entry when the iommu is
enabled. Add heavy-handed flushing of entry modification by flushing the
cpu, cacheline and then wcb. This forces the pte out to main memory past
this point regardless of promises of coherency.

This is an evolution of an experimental patch from Chris Wilson that added
a wmb for coherent partners, extended with a clflush to force the
cache->memory step.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1840
Testcase: igt/gem_exec_fence/parallel
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200511160803.15407-1-mika.kuoppala@linux.intel.com
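
[Editor's sketch] Conceptually the write path described above becomes
something like the following; a hedged sketch of the sequence, not the
literal patch. drm_clflush_virt_range() and wmb() are real kernel APIs.

#include <drm/drm_cache.h>
#include <asm/barrier.h>
#include <linux/types.h>

/* Write the PTE, flush its cacheline, then drain the write-combine
 * buffer so the entry reaches main memory before the GPU/IOMMU reads it,
 * regardless of what coherency the platform promises. */
static void example_write_pte_flushed(u64 *pte, u64 val)
{
	*pte = val;
	drm_clflush_virt_range(pte, sizeof(*pte));
	wmb();
}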


# 9e859eb9 27-Feb-2020 Jani Nikula <jani.nikula@intel.com>

drm/i915/vgpu: improve vgpu abstractions

Add intel_vgpu_register() abstraction, rename i915_detect_vgpu() to
intel_vgpu_detect() to match other function naming, un-inline
intel_vgpu_active(), intel_vgpu_has_full_ppgtt() and
intel_vgpu_has_huge_gtt() to reduce header interdependencies.

The i915_vgpu.[ch] filename and intel_vgpu_ prefix discrepancy remains.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200227144408.24345-1-jani.nikula@intel.com


# 69edc390 26-Feb-2020 Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

drm/i915/ggtt: do not set bits 1-11 in gen12 ptes

On TGL, bits 2-4 in the GGTT PTE are not ignored anymore and are
instead used for some extra VT-d capabilities. We don't (yet?) have
support for those capabilities, but, given that we share the pte_encode
function between GGTT and PPGTT, we still set those bits to the PPGTT
PPAT values. The DMA engine gets very confused when those bits are
set while the iommu is enabled, leading to errors. E.g. when loading
the GuC we get:

[ 9.796218] DMAR: DRHD: handling fault status reg 2
[ 9.796235] DMAR: [DMA Write] Request device [00:02.0] PASID ffffffff fault addr 0 [fault reason 02] Present bit in context entry is clear
[ 9.899215] [drm:intel_guc_fw_upload [i915]] *ERROR* GuC firmware signature verification failed

To fix this, just have a dedicated gen8_pte_encode function per type of
gtt. Also, explicitly set vm->pte_encode for gen8_ppgtt, even if we
don't use it, to make sure we don't accidentally assign it to the GGTT
one, like we do for gen6_ppgtt, in case we need it in the future.

Reported-by: "Sodhi, Vunny" <vunny.sodhi@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200226185657.26445-1-daniele.ceraolospurio@intel.com


# 8e78871b 17-Jan-2020 Matthew Auld <matthew.auld@intel.com>

drm/i915/userptr: fix size calculation

If we create a rather large userptr object (e.g. 1ULL << 32) we might
shift past the type width of num_pages: (int)num_pages << PAGE_SHIFT,
resulting in a totally bogus sg_table, which fortunately will eventually
manifest as:

gen8_ppgtt_insert_huge:463 GEM_BUG_ON(iter->sg->length < page_size)
kernel BUG at drivers/gpu/drm/i915/gt/gen8_ppgtt.c:463!

v2: more unsigned long
prefer I915_GTT_PAGE_SIZE

Fixes: 5cc9ed4b9a7a ("drm/i915: Introduce mapping of user pages into video memory (userptr) ioctl")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200117132413.1170563-2-matthew.auld@intel.com
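
[Editor's sketch] The arithmetic bug is the classic narrow-shift overflow;
a standalone illustration with an example function, not the driver's code.

#include <linux/mm.h>	/* PAGE_SHIFT */
#include <linux/types.h>

/* For a 4GiB userptr object num_pages is 1 << 20 (with 4K pages); the
 * int-width shift wraps, while the widened shift keeps the full size. */
static void example_size_calc(unsigned long num_pages)
{
	u32 bad  = (int)num_pages << PAGE_SHIFT;	/* truncates/overflows */
	u64 good = (u64)num_pages << PAGE_SHIFT;	/* correct */

	(void)bad;
	(void)good;
}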


# 2c86e55d 07-Jan-2020 Matthew Auld <matthew.auld@intel.com>

drm/i915/gtt: split up i915_gem_gtt

Attempt to split i915_gem_gtt.[ch] into more manageable chunks.

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200107134009.3255354-1-chris@chris-wilson.co.uk