History log of /linux-master/drivers/gpu/drm/i915/i915_gem_gtt.c
Revision Date Author Comments
# a251c17a 05-Oct-2022 Jason A. Donenfeld <Jason@zx2c4.com>

treewide: use get_random_u32() when possible

The prandom_u32() function has been a deprecated inline wrapper around
get_random_u32() for several releases now, and compiles down to the
exact same code. Replace the deprecated wrapper with a direct call to
the real function. The same also applies to get_random_int(), which is
just a wrapper around get_random_u32(). This was done as a basic find
and replace.

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Yury Norov <yury.norov@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz> # for ext4
Acked-by: Toke Høiland-Jørgensen <toke@toke.dk> # for sch_cake
Acked-by: Chuck Lever <chuck.lever@oracle.com> # for nfsd
Acked-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> # for thunderbolt
Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs
Acked-by: Helge Deller <deller@gmx.de> # for parisc
Acked-by: Heiko Carstens <hca@linux.ibm.com> # for s390
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>


# 7e00897b 14-Jan-2022 Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

drm/i915: Add object locking to i915_gem_evict_for_node and i915_gem_evict_something, v2.

Because we will start to require the obj->resv lock for unbinding,
ensure these vma eviction utility functions also take the lock.

This requires some function signature changes, to ensure that the
ww context is passed around, but is mostly straightforward.

Previously this was split up into several patches, but reworking
should allow for easier bisection.

Changes since v1:
- Handle evicting dead objects better.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220114132320.109030-4-maarten.lankhorst@linux.intel.com


# 2ef97818 07-Jan-2022 Jani Nikula <jani.nikula@intel.com>

drm/i915: split out i915_gem_evict.h from i915_drv.h

We already have the i915_gem_evict.c file.

v2: Fixed commit message (Tvrtko)

Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ec666853171d04daeb21a93083940df36907c343.1641561552.git.jani.nikula@intel.com


# 204129a2 04-Jan-2022 Michał Winiarski <michal.winiarski@intel.com>

drm/i915: Use to_gt() helper for GGTT accesses

GGTT is currently available both through i915->ggtt and gt->ggtt, and we
eventually want to get rid of the i915->ggtt one.
Use to_gt() for all i915->ggtt accesses to help with the future
refactoring.

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220104223550.56135-1-andi.shyti@linux.intel.com


# c4f61203 25-Sep-2021 Cai Huoqing <caihuoqing@baidu.com>

drm/i915: Use direction definition DMA_BIDIRECTIONAL instead of PCI_DMA_BIDIRECTIONAL

Replace direction definition PCI_DMA_BIDIRECTIONAL
with DMA_BIDIRECTIONAL, because it helps to enhance readability
and avoid possible inconsistency.

Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210925124613.144-1-caihuoqing@baidu.com


# cf41a8f1 23-Mar-2021 Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

drm/i915: Finally remove obj->mm.lock.

With all callers and selftests fixed to use ww locking, we can now
finally remove this lock.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-62-maarten.lankhorst@linux.intel.com


# 8ff5446a 28-Jan-2021 Thomas Zimmermann <tzimmermann@suse.de>

drm/i915: Remove references to struct drm_device.pdev

Using struct drm_device.pdev is deprecated. Convert i915 to struct
drm_device.dev. No functional changes.

v6:
* also remove assignment in selftests/ in a later patch (Chris)
v5:
* remove assignment in later patch (Chris)
v3:
* rebased
v2:
* move gt/ and gvt/ changes into separate patches

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210128133127.2311-2-tzimmermann@suse.de


# 63de1da1 09-Dec-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove livelock from "do_idle_maps" vtd w/a

A call to wait for the GT to idle from inside the put_pages fallback is
prone to cause an uninterruptible livelock. As it does not provide
adequate serialisation with new requests, simply fallback to a trivial
sleep.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201209164008.5487-1-chris@chris-wilson.co.uk


# a3a40284 06-Jul-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Update dma-attributes for our sg DMA

Looking through the attributes for DMA mappings, it appears that by
default dma_map_sg will try and create a kernel accessible map of the
page. We never access this, as we either have a struct page already or
an iomap, so we can request that the dma mapper does not create one.
Without a kernel map in place, one presumes the rest of the memory
control attributes do not apply. We also explicitly control the caches
around the mappings, so we can ask it not to bother synchronising itself.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200706224308.22636-1-chris@chris-wilson.co.uk


# 83d2bdb6 25-Feb-2020 Jani Nikula <jani.nikula@intel.com>

drm/i915: significantly reduce the use of <drm/i915_drm.h>

The #include has been splattered all over the place, but there are
precious few places, all .c files, that actually need it.

v2: remove leftover double newlines

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200225133131.3301-1-jani.nikula@intel.com


# 00376ccf 30-Jan-2020 Wambui Karuga <wambui.karugax@gmail.com>

drm/i915: conversion to drm_device logging macros when drm_i915_private is present.

Converts various instances of the printk drm logging macros to the
struct drm_device based logging macros in the drm/i915 folder using the
following coccinelle script that transforms based on the existence of
the struct drm_i915_private device pointer:
@@
identifier fn, T;
@@

fn(...) {
...
struct drm_i915_private *T = ...;
<+...
(
-DRM_INFO(
+drm_info(&T->drm,
...)
|
-DRM_ERROR(
+drm_err(&T->drm,
...)
|
-DRM_WARN(
+drm_warn(&T->drm,
...)
|
-DRM_DEBUG_KMS(
+drm_dbg_kms(&T->drm,
...)
|
-DRM_DEBUG_DRIVER(
+drm_dbg(&T->drm,
...)
|
-DRM_DEBUG_ATOMIC(
+drm_dbg_atomic(&T->drm,
...)
)
...+>
}

@@
identifier fn, T;
@@

fn(...,struct drm_i915_private *T,...) {
<+...
(
-DRM_INFO(
+drm_info(&T->drm,
...)
|
-DRM_ERROR(
+drm_err(&T->drm,
...)
|
-DRM_WARN(
+drm_warn(&T->drm,
...)
|
-DRM_DEBUG_DRIVER(
+drm_dbg(&T->drm,
...)
|
-DRM_DEBUG_KMS(
+drm_dbg_kms(&T->drm,
...)
|
-DRM_DEBUG_ATOMIC(
+drm_dbg_atomic(&T->drm,
...)
)
...+>
}

Checkpatch warnings were fixed manually.

Instances of the DRM_DEBUG macro were not converted due to lack of a
consensus of an analogous struct drm_device based macro.

References: https://lists.freedesktop.org/archives/dri-devel/2020-January/253381.html
Signed-off-by: Wambui Karuga <wambui.karugax@gmail.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200131093416.28431-2-wambui.karugax@gmail.com


# ecc4d2a5 17-Jan-2020 Matthew Auld <matthew.auld@intel.com>

drm/i915/userptr: fix size calculation

If we create a rather large userptr object(e.g 1ULL << 32) we might
shift past the type-width of num_pages: (int)num_pages << PAGE_SHIFT,
resulting in a totally bogus sg_table, which fortunately will eventually
manifest as:

gen8_ppgtt_insert_huge:463 GEM_BUG_ON(iter->sg->length < page_size)
kernel BUG at drivers/gpu/drm/i915/gt/gen8_ppgtt.c:463!

v2: more unsigned long
prefer I915_GTT_PAGE_SIZE

Fixes: 5cc9ed4b9a7a ("drm/i915: Introduce mapping of user pages into video memory (userptr) ioctl")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200117132413.1170563-2-matthew.auld@intel.com
(cherry picked from commit 8e78871bc1e5efec22c950d3fd24ddb63d4ff28a)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# c3866f54 10-Jan-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Skip trying to unbind in restore_ggtt_mappings

Currently we first to try to unbind the VMA (and lazily rebind on next
use) as an optimisation during restore_ggtt_mappings. Ideally, the only
objects in the GGTT upon resume are the pinned kernel objects which
can't be unbound and need to be restored. As the unbind interferes with
the plan to mark those objects as active for error capture, forgo the
optimisation.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200110110402.1231745-1-chris@chris-wilson.co.uk
(cherry picked from commit 80e5351df13a5c4e9ecc14a58fa60c84d356ee87)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# 2c86e55d 07-Jan-2020 Matthew Auld <matthew.auld@intel.com>

drm/i915/gtt: split up i915_gem_gtt

Attempt to split i915_gem_gtt.[ch] into more manageable chunks.

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200107134009.3255354-1-chris@chris-wilson.co.uk


# 4bdc0d67 06-Jan-2020 Christoph Hellwig <hch@lst.de>

remove ioremap_nocache and devm_ioremap_nocache

ioremap has provided non-cached semantics by default since the Linux 2.6
days, so remove the additional ioremap_nocache interface.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>


# 76f9764c 22-Dec-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Introduce a vma.kref

Start introducing a kref on i915_vma in order to protect the vma unbind
(i915_gem_object_unbind) from a parallel destruction (i915_vma_parked).
Later, we will use the refcount to manage all access and turn i915_vma
into a first class container.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Acked-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191222210256.2066451-2-chris@chris-wilson.co.uk


# e6ba7648 21-Dec-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove i915->kernel_context

Allocate only an internal intel_context for the kernel_context, forgoing
a global GEM context for internal use as we only require a separate
address space (for our own protection).

Now having weaned GT from requiring ce->gem_context, we can stop
referencing it entirely. This also means we no longer have to create random
and unnecessary GEM contexts for internal use.

GEM contexts are now entirely for tracking GEM clients, and intel_context
the execution environment on the GPU.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Acked-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191221160324.1073045-1-chris@chris-wilson.co.uk


# 902eb748 16-Dec-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Tidy up full-ppgtt on Ivybridge

With a couple more memory barriers dotted around the place we can
significantly reduce the MTBF on Ivybridge. Still doesn't really help
Haswell though.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191216142409.2605211-1-chris@chris-wilson.co.uk


# ca5930b1 07-Dec-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Account for preallocation in asserts

Our asserts allow for the PDEs to be allocated concurrently, but we did
not account for the aliasing-ppgtt to be preallocated on top.

Testcase: igt/gem_ppgtt #bsw
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191207221453.2802627-1-chris@chris-wilson.co.uk


# d315fe8b 05-Dec-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Trim gen6 ppgtt updates to PD cachelines

It appears now that we have the ring TLB invalidation in place, we need
only update the page directory cachelines that we have altered. A great
reduction from rewriting the whole 2MiB ppgtt on every update.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191205234059.1010030-1-chris@chris-wilson.co.uk


# ccd20945 05-Dec-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Try hard to bind the context

It is not acceptable for context pinning to fail with -ENOSPC as we
should always be able to make space in the GGTT. The only reason we may
fail is that other "temporary" context pins are reserving their space
and we need to wait for an available slot.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/676
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191205113726.413351-2-chris@chris-wilson.co.uk


# 1bbdd241 01-Dec-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Refactor gen6_flush_pd()

As the gen6 page directory is written on binding and after every update,
the code ended up duplicated. Refactor the code into a single routine to
share the locking and serialisation.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191201140916.2128905-1-chris@chris-wilson.co.uk


# 19b6304a 30-Nov-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Serialise access to GFX_FLSH_CNTL

Now that many threads may try to use the same mmio to flush the global
buffers after updating the PTE, serialise access to the mmio to prevent
concurrent access on gen7.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191130162320.1683424-1-chris@chris-wilson.co.uk


# 3cd6e886 29-Nov-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gen7: Re-enable full-ppgtt for ivb & hsw

After much hair pulling, resort to preallocating the ppGTT entries on
init to circumvent the apparent lack of PD invalidate following the
write to PP_DCLV upon switching mm between contexts (and here the same
context after binding new objects). However, the details of that PP_DCLV
invalidate are still unknown, and it appears we need to reload the mm
twice to cover over a timing issue. Worrying.

Fixes: 3dc007fe9b2b ("drm/i915/gtt: Downgrade gen7 (ivb, byt, hsw) back to aliasing-ppgtt")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191129201328.1398583-1-chris@chris-wilson.co.uk


# 3cac1958 06-Nov-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Leave the aliasing-ppgtt size alone

The hidden aliasing-ppgtt's size is never revealed, as we only inspect
the front GTT when engaged. However, we were "fixing" the hidden ppgtt
to match, with the net result that we ended up leaking the unused
portion on Braswell were we preallocated the entire set of top level
PDP, see gen8_preallocate_top_level_pdp().

[ 26.025364] DMA-API: pci 0000:00:02.0: device driver has pending DMA allocations while released from device [count=2]
[ 26.025364] One of leaked entries details: [device address=0x0000000230778000] [size=4096 bytes] [mapped with DMA_BIDIRECTIONAL] [mapped as single]
[ 26.025683] WARNING: CPU: 0 PID: 415 at kernel/dma/debug.c:894 dma_debug_device_change+0x1a4/0x1f0
[ 26.025905] Modules linked in: i915(E-) intel_powerclamp(E) nls_ascii(E) nls_cp437(E) crct10dif_pclmul(E) crc32_pclmul(E) vfat(E) crc32c_intel(E) fat(E) ghash_clmulni_intel(E) prime_numbers(E) intel_gtt(E) i2c_algo_bit(E) efi_pstore(E) drm_kms_helper(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) fb_sys_fops(E) evdev(E) drm(E) aesni_intel(E) glue_helper(E) crypto_simd(E) cryptd(E) intel_cstate(E) sg(E) efivars(E) pcspkr(E) video(E) button(E) efivarfs(E) ip_tables(E) x_tables(E) autofs4(E) sd_mod(E) lpc_ich(E) ahci(E) mfd_core(E) i2c_i801(E) libahci(E) i2c_designware_pci(E) i2c_designware_core(E)
[ 26.026613] CPU: 0 PID: 415 Comm: rmmod Tainted: G E 5.4.0-rc6+ #25
[ 26.026837] Hardware name: /, BIOS PYBSWCEL.86A.0027.2015.0507.1758 05/07/2015
[ 26.027080] RIP: 0010:dma_debug_device_change+0x1a4/0x1f0
[ 26.027319] Code: 89 54 24 08 e8 ad 60 62 00 48 8b 54 24 08 48 89 c6 41 57 4d 89 e9 49 89 d8 44 89 f1 41 54 48 c7 c7 e0 61 06 82 e8 c1 aa f5 ff <0f> 0b 5a 59 48 83 3c 24 00 0f 85 97 26 00 00 8b 05 77 47 92 01 85
[ 26.027600] RSP: 0018:ffff888228d2fcc8 EFLAGS: 00010282
[ 26.027831] RAX: 0000000000000000 RBX: 0000000230778000 RCX: 0000000000000000
[ 26.028053] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffffed10451a5f8f
[ 26.028279] RBP: ffff88823480c0b0 R08: 0000000000000001 R09: ffffed1046e83eb1
[ 26.028500] R10: ffffed1046e83eb0 R11: ffff88823741f587 R12: ffffffff82067340
[ 26.028725] R13: 0000000000001000 R14: 0000000000000002 R15: ffffffff82067480
[ 26.028952] FS: 00007fdf3ed174c0(0000) GS:ffff888237400000(0000) knlGS:0000000000000000
[ 26.029185] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 26.029405] CR2: 000055e211109030 CR3: 0000000230139000 CR4: 00000000001006f0
[ 26.029622] Call Trace:
[ 26.029846] notifier_call_chain+0x67/0xa0
[ 26.030076] blocking_notifier_call_chain+0x5a/0x80
[ 26.030305] device_release_driver_internal+0x20d/0x260
[ 26.030535] driver_detach+0x7b/0xe1
[ 26.030761] bus_remove_driver+0x8c/0x153
[ 26.030993] pci_unregister_driver+0x2d/0xf0
[ 26.032603] i915_exit+0x16/0x1c [i915]

Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Fixes: 1eda701eace2 ("drm/i915/gtt: Recursive cleanup for gen8")
References: c082afac86cb ("drm/i915: Move aliasing_ppgtt underneath its i915_ggtt")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191106221223.7437-1-chris@chris-wilson.co.uk
(cherry picked from commit 2b0a4fc25ad8e3da4a156995a513dca6abf247de)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# 2b0a4fc2 06-Nov-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Leave the aliasing-ppgtt size alone

The hidden aliasing-ppgtt's size is never revealed, as we only inspect
the front GTT when engaged. However, we were "fixing" the hidden ppgtt
to match, with the net result that we ended up leaking the unused
portion on Braswell were we preallocated the entire set of top level
PDP, see gen8_preallocate_top_level_pdp().

[ 26.025364] DMA-API: pci 0000:00:02.0: device driver has pending DMA allocations while released from device [count=2]
[ 26.025364] One of leaked entries details: [device address=0x0000000230778000] [size=4096 bytes] [mapped with DMA_BIDIRECTIONAL] [mapped as single]
[ 26.025683] WARNING: CPU: 0 PID: 415 at kernel/dma/debug.c:894 dma_debug_device_change+0x1a4/0x1f0
[ 26.025905] Modules linked in: i915(E-) intel_powerclamp(E) nls_ascii(E) nls_cp437(E) crct10dif_pclmul(E) crc32_pclmul(E) vfat(E) crc32c_intel(E) fat(E) ghash_clmulni_intel(E) prime_numbers(E) intel_gtt(E) i2c_algo_bit(E) efi_pstore(E) drm_kms_helper(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) fb_sys_fops(E) evdev(E) drm(E) aesni_intel(E) glue_helper(E) crypto_simd(E) cryptd(E) intel_cstate(E) sg(E) efivars(E) pcspkr(E) video(E) button(E) efivarfs(E) ip_tables(E) x_tables(E) autofs4(E) sd_mod(E) lpc_ich(E) ahci(E) mfd_core(E) i2c_i801(E) libahci(E) i2c_designware_pci(E) i2c_designware_core(E)
[ 26.026613] CPU: 0 PID: 415 Comm: rmmod Tainted: G E 5.4.0-rc6+ #25
[ 26.026837] Hardware name: /, BIOS PYBSWCEL.86A.0027.2015.0507.1758 05/07/2015
[ 26.027080] RIP: 0010:dma_debug_device_change+0x1a4/0x1f0
[ 26.027319] Code: 89 54 24 08 e8 ad 60 62 00 48 8b 54 24 08 48 89 c6 41 57 4d 89 e9 49 89 d8 44 89 f1 41 54 48 c7 c7 e0 61 06 82 e8 c1 aa f5 ff <0f> 0b 5a 59 48 83 3c 24 00 0f 85 97 26 00 00 8b 05 77 47 92 01 85
[ 26.027600] RSP: 0018:ffff888228d2fcc8 EFLAGS: 00010282
[ 26.027831] RAX: 0000000000000000 RBX: 0000000230778000 RCX: 0000000000000000
[ 26.028053] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffffed10451a5f8f
[ 26.028279] RBP: ffff88823480c0b0 R08: 0000000000000001 R09: ffffed1046e83eb1
[ 26.028500] R10: ffffed1046e83eb0 R11: ffff88823741f587 R12: ffffffff82067340
[ 26.028725] R13: 0000000000001000 R14: 0000000000000002 R15: ffffffff82067480
[ 26.028952] FS: 00007fdf3ed174c0(0000) GS:ffff888237400000(0000) knlGS:0000000000000000
[ 26.029185] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 26.029405] CR2: 000055e211109030 CR3: 0000000230139000 CR4: 00000000001006f0
[ 26.029622] Call Trace:
[ 26.029846] notifier_call_chain+0x67/0xa0
[ 26.030076] blocking_notifier_call_chain+0x5a/0x80
[ 26.030305] device_release_driver_internal+0x20d/0x260
[ 26.030535] driver_detach+0x7b/0xe1
[ 26.030761] bus_remove_driver+0x8c/0x153
[ 26.030993] pci_unregister_driver+0x2d/0xf0
[ 26.032603] i915_exit+0x16/0x1c [i915]

Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Fixes: 1eda701eace2 ("drm/i915/gtt: Recursive cleanup for gen8")
References: c082afac86cb ("drm/i915: Move aliasing_ppgtt underneath its i915_ggtt")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191106221223.7437-1-chris@chris-wilson.co.uk


# 895d8ebe 29-Oct-2019 Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

drm/i915: error capture with no ggtt slot

If the aperture is not available in HW we can't use a ggtt slot and wc
copy, so fall back to regular kmap.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191029095856.25431-4-matthew.auld@intel.com


# 54b512cd 29-Oct-2019 Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

drm/i915: do not map aperture if it is not available.

Skip both setup and cleanup of the aperture mapping if the HW doesn't
have an aperture bar.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191029095856.25431-2-matthew.auld@intel.com


# 3fc794f2 26-Oct-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Split memory_region initialisation into its own file

Pull the memory region bookkeeping into its file. Let's start clean and
see how long it lasts!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191026202032.4371-1-chris@chris-wilson.co.uk


# 2c9a4915 24-Oct-2019 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Convert PAT setup to uncore mmio

One more thing which relied on implicit dev_priv can be covnerted to use
the new mmio accessors.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191024093440.32280-1-tvrtko.ursulin@linux.intel.com


# c6e07ada 17-Oct-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Convert the leftover for_each_engine(gt)

Use the local gt for iterating over the available set of engines.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191018115331.8980-1-chris@chris-wilson.co.uk


# 72405c3d 18-Oct-2019 Matthew Auld <matthew.auld@intel.com>

drm/i915: treat stolen as a region

Convert stolen memory over to a region object. Still leaves open the
question with what to do with pre-allocated objects...

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191018090751.28295-3-matthew.auld@intel.com


# da1184cd 18-Oct-2019 Matthew Auld <matthew.auld@intel.com>

drm/i915: treat shmem as a region

Convert shmem to an intel_memory_region.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191018090751.28295-2-matthew.auld@intel.com


# 3aae9d08 18-Oct-2019 Abdiel Janulgue <abdiel.janulgue@linux.intel.com>

drm/i915: enumerate and init each supported region

Nothing to enumerate yet...

Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191018090751.28295-1-matthew.auld@intel.com


# a4e7ccda 04-Oct-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Move context management under GEM

Keep track of the GEM contexts underneath i915->gem.contexts and assign
them their own lock for the purposes of list management.

v2: Focus on lock tracking; ctx->vm is protected by ctx->mutex
v3: Correct split with removal of logical HW ID

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-15-chris@chris-wilson.co.uk


# 66101975 04-Oct-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Move request runtime management onto gt

Requests are run from the gt and are tided into the gt runtime power
management, so pull the runtime request management under gt/

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-12-chris@chris-wilson.co.uk


# f33a8a51 04-Oct-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Merge wait_for_timelines with retire_request

wait_for_timelines is essentially the same loop as retiring requests
(with an extra timeout), so merge the two into one routine.

v2: i915_retire_requests_timeout and keep VT'd w/a as !interruptible

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-10-chris@chris-wilson.co.uk


# b1e3177b 04-Oct-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Coordinate i915_active with its own mutex

Forgo the struct_mutex serialisation for i915_active, and interpose its
own mutex handling for active/retire.

This is a multi-layered sleight-of-hand. First, we had to ensure that no
active/retire callbacks accidentally inverted the mutex ordering rules,
nor assumed that they were themselves serialised by struct_mutex. More
challenging though, is the rule over updating elements of the active
rbtree. Instead of the whole i915_active now being serialised by
struct_mutex, allocations/rotations of the tree are serialised by the
i915_active.mutex and individual nodes are serialised by the caller
using the i915_timeline.mutex (we need to use nested spinlocks to
interact with the dma_fence callback lists).

The pain point here is that instead of a single mutex around execbuf, we
now have to take a mutex for active tracker (one for each vma, context,
etc) and a couple of spinlocks for each fence update. The improvement in
fine grained locking allowing for multiple concurrent clients
(eventually!) should be worth it in typical loads.

v2: Add some comments that barely elucidate anything :(

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-6-chris@chris-wilson.co.uk


# 2850748e 04-Oct-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Pull i915_vma_pin under the vm->mutex

Replace the struct_mutex requirement for pinning the i915_vma with the
local vm->mutex instead. Note that the vm->mutex is tainted by the
shrinker (we require unbinding from inside fs-reclaim) and so we cannot
allocate while holding that mutex. Instead we have to preallocate
workers to do allocate and apply the PTE updates after we have we
reserved their slot in the drm_mm (using fences to order the PTE writes
with the GPU work and with later unbind).

In adding the asynchronous vma binding, one subtle requirement is to
avoid coupling the binding fence into the backing object->resv. That is
the asynchronous binding only applies to the vma timeline itself and not
to the pages as that is a more global timeline (the binding of one vma
does not need to be ordered with another vma, nor does the implicit GEM
fencing depend on a vma, only on writes to the backing store). Keeping
the vma binding distinct from the backing store timelines is verified by
a number of async gem_exec_fence and gem_exec_schedule tests. The way we
do this is quite simple, we keep the fence for the vma binding separate
and only wait on it as required, and never add it to the obj->resv
itself.

Another consequence in reducing the locking around the vma is the
destruction of the vma is no longer globally serialised by struct_mutex.
A natural solution would be to add a kref to i915_vma, but that requires
decoupling the reference cycles, possibly by introducing a new
i915_mm_pages object that is own by both obj->mm and vma->pages.
However, we have not taken that route due to the overshadowing lmem/ttm
discussions, and instead play a series of complicated games with
trylocks to (hopefully) ensure that only one destruction path is called!

v2: Add some commentary, and some helpers to reduce patch churn.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-4-chris@chris-wilson.co.uk


# 11331125 04-Oct-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Mark up address spaces that may need to allocate

Since we cannot allocate underneath the vm->mutex (it is used in the
direct-reclaim paths), we need to shift the allocations off into a
mutexless worker with fence recursion prevention. To know when we need
this protection, we mark up the address spaces that do allocate before
insertion. In the future, we may wish to extend the async bind scheme to
more than just allocations.

v2: s/vm->bind_alloc/vm->bind_async_flags/

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-3-chris@chris-wilson.co.uk


# 5e053450 04-Oct-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Only track bound elements of the GTT

The premise here is to simply avoiding having to acquire the vm->mutex
inside vma create/destroy to update the vm->unbound_lists, to avoid some
nasty lock recursions later.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-2-chris@chris-wilson.co.uk


# 71724f70 03-Oct-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/mm: Use helpers for drm_mm_node booleans

In preparation for rearranging the booleans into a flags field, ensure
all the current users are using the inline helpers and not directly
accessing the members.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191003210100.22250-3-chris@chris-wilson.co.uk


# c8185520 13-Sep-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Make sure the gen6 ppgtt is bound before first use

As we remove the struct_mutex protection from around the vma pinning,
counters need to be atomic and aware that there may be multiple threads
simultaneously active.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190913064200.24297-1-chris@chris-wilson.co.uk


# 7cb8468b 11-Sep-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/tgl: Disable read-only ppgtt support

The same read-only affliction as befell Icelake is affecting Tigerlake.
Disable the read-only support as clearly it was not fixed.

Testcase: igt/i915_selftests/live_gem_context
References: 3936867dbc1e ("drm/i915: Disable read only ppgtt support for gen11")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190911125717.28997-1-chris@chris-wilson.co.uk


# 4dd2fbbf 11-Sep-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Make i915_vma.flags atomic_t for mutex reduction

In preparation for reducing struct_mutex stranglehold around the vm,
make the vma.flags atomic so that we can acquire a pin on the vma
atomically before deciding if we need to take the mutex.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190911090243.16786-1-chris@chris-wilson.co.uk


# e9ceb751 09-Sep-2019 Matthew Auld <matthew.auld@intel.com>

drm/i915: s/i915_gtt_color_adjust/i915_ggtt_color_adjust

Make it clear that the color adjust callback applies to the ggtt.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190909124052.22900-2-matthew.auld@intel.com


# 1e0a96e5 09-Sep-2019 Matthew Auld <matthew.auld@intel.com>

drm/i915: export color_differs

Export color_differs so that we can use it elsewhere.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190909124052.22900-1-matthew.auld@intel.com


# 9e77f500 23-Aug-2019 Xiaolin Zhang <xiaolin.zhang@intel.com>

drm/i915: to make vgpu ppgtt notificaiton as atomic operation

vgpu ppgtt notification was split into 2 steps, the first step is to
update PVINFO's pdp register and then write PVINFO's g2v_notify register
with action code to tirgger ppgtt notification to GVT side.

currently these steps were not atomic operations due to no any protection,
so it is easy to enter race condition state during the MTBF, stress and
IGT test to cause GPU hang.

the solution is to add a lock to make vgpu ppgtt notication as atomic
operation.

Cc: stable@vger.kernel.org
Signed-off-by: Xiaolin Zhang <xiaolin.zhang@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/1566543451-13955-1-git-send-email-xiaolin.zhang@intel.com
(cherry picked from commit 52988009843160c5b366b4082ed6df48041c655c)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# 8f5e2b30 01-Sep-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Restrict the aliasing-ppgtt to the size of the ggtt

The aliasing-ppgtt is not allowed to be smaller than the ggtt, nor
should we advertise it as being any bigger, or else we may get sued for
false advertisement.

Testcase: igt/gem_exec_big
Fixes: 0b718ba1e884 ("drm/i915/gtt: Downgrade Cherryview back to aliasing-ppgtt")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190902040303.14195-1-chris@chris-wilson.co.uk


# c1d143dd 30-Aug-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove ppgtt->dirty_engines

This is no longer used anywhere and so can be removed. However, tracking
the dirty status on the ppgtt doesn't work very well if the ppgtt is
shared, so perhaps for the best that it is no longer required.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190830180000.24608-3-chris@chris-wilson.co.uk


# 31444afb 29-Aug-2019 Matthew Auld <matthew.auld@intel.com>

drm/i915: s/for_each_sgt_dma/for_each_sgt_daddr/

The sg_table for our backing store might contain addresses from
stolen-memory or in the future local-memory, at which point this is no
longer a dma-iterator. As a consequence we should now break on NULL
iter.sgp, instead of dmap == 0 which is considered an invalid dma
address.

As a bonus, gcc much prefers this construct,

Function old new delta
gen8_ggtt_insert_entries 211 192 -19
gen6_ggtt_insert_entries 292 262 -30
i915_error_object_create 996 954 -42

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190829201919.21493-1-matthew.auld@intel.com


# a8ff5d40 23-Aug-2019 Michel Thierry <michel.thierry@intel.com>

drm/i915/tgl: Do not apply WaIncreaseDefaultTLBEntries from GEN12 onwards

Workaround no longer needed (plus L3_LRA_1_GPGPU doesn't exist).

Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823082055.5992-17-lucas.demarchi@intel.com


# 52988009 23-Aug-2019 Xiaolin Zhang <xiaolin.zhang@intel.com>

drm/i915: to make vgpu ppgtt notificaiton as atomic operation

vgpu ppgtt notification was split into 2 steps, the first step is to
update PVINFO's pdp register and then write PVINFO's g2v_notify register
with action code to tirgger ppgtt notification to GVT side.

currently these steps were not atomic operations due to no any protection,
so it is easy to enter race condition state during the MTBF, stress and
IGT test to cause GPU hang.

the solution is to add a lock to make vgpu ppgtt notication as atomic
operation.

Cc: stable@vger.kernel.org
Signed-off-by: Xiaolin Zhang <xiaolin.zhang@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/1566543451-13955-1-git-send-email-xiaolin.zhang@intel.com


# 191797a8 23-Aug-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Preallocate Braswell top-level page directory

In order for the Braswell top-level PD to remain the same from the time
of request construction to its submission onto HW, as we may be
asynchronously rewriting the page tables (thus changing the expected
register state after having already stored the old addresses in the
request), the top level PD must be preallocated.

So wave goodbye to our lazy allocation of those 4x2 pages.

v2: A little bit of write-flushing required (presumably it always has
been required, but now we are more susceptible and it is showing up!)

v3: Put back the forced-PD-reload on every batch, we can't survive
without it and explicitly marking the context for PD reload makes
Braswell turn nasty.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823141421.2398-1-chris@chris-wilson.co.uk


# b3c0692f 23-Aug-2019 Michel Thierry <michel.thierry@intel.com>

drm/i915/tgl: Move GTCR register to cope with GAM MMIO address remap

GAM registers located in the 0x4xxx range have been relocated to 0xCxxx;
this is to make space for global MOCS registers.

v2: Rename register and bitfield to its new name (suggested by Mika)

HSD: 399379
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823082055.5992-2-lucas.demarchi@intel.com


# c71ccbe2 21-Aug-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Add some range asserts

These should have been validated in the upper layers, but for sanity's
sake, repeat them.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190821155728.2839-2-chris@chris-wilson.co.uk


# 6846895f 21-Aug-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Replace PIN_NONFAULT with calls to PIN_NOEVICT

When under severe stress for GTT mappable space, the LRU eviction model
falls off a cliff. We spend all our time scanning the much larger
non-mappable area searching for something within the mappable zone we can
evict. Turn this on its head by only using the full vma for the object if
it is already pinned in the mappable zone or there is sufficient *free*
space to accommodate it (prioritizing speedy reuse). If there is not,
immediately fall back to using small chunks (tilerow for GTT mmap, single
pages for pwrite/relocation) and using random eviction before doing a full
search.

Testcase: igt/gem_concurrent_blt
References: https://bugs.freedesktop.org/show_bug.cgi?id=110848
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190821123234.19194-1-chris@chris-wilson.co.uk


# 78387745 21-Aug-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Include asm/smp.h

We need asm/smp.h for wbinvd_on_all_cpus()

Reported-by: kbuild-all@01.org
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190821093905.7693-1-chris@chris-wilson.co.uk


# ff175010 20-Aug-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Relax assertion for pt_used

When inserting the final level PTE, we check that we are not overflowing
the page table (checking that pt_used does not exceed the size of the
table). However, we have to allow for every other PTE to be pinned by a
simultaneous removal thread (as on remove we bump the pt_used counter
before adjusting the table).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190821042044.7354-1-chris@chris-wilson.co.uk


# 6acbe9f6 20-Aug-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Relax pd_used assertion

The current assertion tries to make sure that we do not over count the
number of used PDE inside a page directory -- that is with an array of
512 pde, we do not expect more than 512 elements used! However, our
assertion has to take into account that as we pin an element into the
page directory, the caller first pins the page directory so the usage
count is one higher. However, this should be one extra pin per thread,
and the upper bound is that we may have one thread for each entry.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190820141218.14714-1-chris@chris-wilson.co.uk


# b41e63d8 17-Aug-2019 Michel Thierry <michel.thierry@intel.com>

drm/i915/tgl: Updated Private PAT programming

Gen12 removes the target-cache and age fields from the private PAT
because MOCS now have the capability to set these itself. Only memory-type
field should be programmed in the ppat, the reminded bits are reserved.

Since now there are only 4 possible combinations, we could set only 4
PPAT and leave the reminded 4 as UC, but I left them as WB as we used
to have before.

Also these registers have been relocated to the 0x4800-0x481c range.

HSDES: 1406402661
BSpec: 31654
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190817093902.2171-33-lucas.demarchi@intel.com


# 64b95df9 19-Aug-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Assume exclusive access to objects inside resume

Inside gtt_restore_mappings() we currently take the obj->resv->lock, but
in the future we need to avoid taking this fs-reclaim tainted lock as we
need to extend the coverage of the vm->mutex. Take advantage of the
single-threaded nature of the early resume phase, and do a single
wbinvd() to flush all the GTT objects en masse.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190819200705.3631-1-chris@chris-wilson.co.uk


# eb7c022d 15-Aug-2019 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/gtt: Fold gen8 insertions into one

As we give page directory pointer (lvl 3) structure
for pte insertion, we can fold both versions into
one function by teaching it to get pdp regardless
of top level.

v2: naming and asserts (Chris)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190816094754.26492-1-mika.kuoppala@linux.intel.com


# 6ac689d2 16-Aug-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Use the associated uncore for the vm

We store the gt&uncore to use in the i915_address_space, so use it!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190816083143.23558-1-chris@chris-wilson.co.uk


# 88f8065c 15-Aug-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Convert a few more bland dmesg info to be device specific

Looking around the GT initialisation, we have a few log messages we
think are interesting enough present to the user (such as the amount of L4
cache) and a few to inform them of the result of actions or conflicting
HW restrictions (i.e. quirks). These are device specific messages, so
use the dev family of printk.

v2: shave off a few bytes of .rodata!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190815093604.3618-1-chris@chris-wilson.co.uk


# 3d6792cf 12-Aug-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Forgo last_fence active request tracking

We were using the last_fence to track the last request that used this
vma that might be interpreted by a fence register and forced ourselves
to wait for this request before modifying any fence register that
overlapped our vma. Due to requirement that we need to track any XY_BLT
command, linear or tiled, this in effect meant that we have to track the
vma for its active lifespan anyway, so we can forgo the explicit
last_fence tracking and just use the whole vma->active.

Another solution would be to pipeline the register updates, and would
help resolve some long running stalls for gen3 (but only gen 2 and 3!)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190812174804.26180-1-chris@chris-wilson.co.uk


# 1feb7864 09-Aug-2019 Matthew Auld <matthew.auld@intel.com>

drm/i915/gtt: enable GTT cache by default

For some platforms the GTT cache is by default not enabled, and
currently where we explicitly enable it, we make it conditional on 2M GTT
page support, since the BSpec states that we must disable it if we
enable 2M/1G pages. To make this more consistent opt for blanket
enabling the GTT cache for all relevant gens in a single place, while
still keeping the same behaviour of checking for 2M support.

BSpec: 9314
BSpec: 423
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190809193456.3836-1-matthew.auld@intel.com


# 6da4a2c4 06-Aug-2019 Jani Nikula <jani.nikula@intel.com>

drm/i915: remove unnecessary includes of intel_display_types.h header

With its original name intel_drv.h the intel_display_types.h header was
superfluously cargo-cult included all over the place, while it's really
mostly about display internals. Remove the unnecessary includes.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/e3d737f0ab87c55969e62c1e077e15c04c238297.1565085692.git.jani.nikula@intel.com


# 1d455f8d 06-Aug-2019 Jani Nikula <jani.nikula@intel.com>

drm/i915: rename intel_drv.h to display/intel_display_types.h

Everything about the file is about display, and mostly about types
related to display. Move under display/ as intel_display_types.h to
reflect the facts.

There's still plenty to clean up, but start off with moving the file
where it logically belongs and naming according to contents.

v2: fix the include guard name in the renamed file

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190806113933.11799-1-jani.nikula@intel.com


# c082afac 30-Jul-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Move aliasing_ppgtt underneath its i915_ggtt

The aliasing_ppgtt provides a PIN_USER alias for the global gtt, so move
it under the i915_ggtt to simplify later transformations to enable
intel_context.vm.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190730143209.4549-1-chris@chris-wilson.co.uk


# 60a4233a 29-Jul-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Flush the i915_vm_release before ggtt shutdown

As the gen6_ppgtt may refer back to the GGTT for their page-directory
slots, make sure those __i915_vm_release are completed prior to shutting
down the GGTT.

Fixes: b32fa8111563 ("drm/i915/gtt: Defer address space cleanup to an RCU worker")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190729132412.23380-1-chris@chris-wilson.co.uk


# 5f4c82c8 04-Jul-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Mark the freed page table entries with scratch

On unwinding the allocation error path and having freed the page table
entry, it is imperative that we mark it as scratch.

<4> [416.075569] general protection fault: 0000 [#1] PREEMPT SMP PTI
<4> [416.075801] CPU: 0 PID: 2385 Comm: kworker/u2:11 Tainted: G U 5.2.0-rc7-CI-Patchwork_13534+ #1
<4> [416.076162] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.1-0-g8891697-prebuilt.qemu-project.org 04/01/2014
<4> [416.076522] Workqueue: i915 __i915_vm_release [i915]
<4> [416.076754] RIP: 0010:gen8_ppgtt_cleanup_3lvl+0x58/0xb0 [i915]
<4> [416.077023] Code: 81 e2 04 fe ff ff 81 c2 ff 01 00 00 4c 8d 74 d6 58 4d 8b 65 00 4d 3b a7 28 02 00 00 74 40 49 8d 5c 24 50 49 81 c4 50 10 00 00 <48> 8b 2b 49 3b af 20 02 00 00 74 13 4c 89 ff 48 89 ee e8 01 fb ff
<4> [416.077445] RSP: 0018:ffffc9000046bd98 EFLAGS: 00010206
<4> [416.077625] RAX: 0001000000000000 RBX: 6b6b6b6b6b6b6bbb RCX: 8b4b56d500000000
<4> [416.077838] RDX: 00000000000001ff RSI: ffff88805a578008 RDI: ffff88805bd0efc8
<4> [416.078167] RBP: ffff88805bd0efc8 R08: 0000000004e42b93 R09: 0000000000000001
<4> [416.078381] R10: 0000000000000000 R11: ffff888077a1b0b8 R12: 6b6b6b6b6b6b7bbb
<4> [416.078594] R13: ffff88805a578058 R14: ffff88805a579058 R15: ffff88805bd0efc8
<4> [416.078815] FS: 0000000000000000(0000) GS:ffff88807da00000(0000) knlGS:0000000000000000
<4> [416.079395] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [416.079851] CR2: 000056160fec2b14 CR3: 0000000071bbc003 CR4: 00000000003606f0
<4> [416.080388] Call Trace:
<4> [416.080828] gen8_ppgtt_cleanup+0x64/0x100 [i915]
<4> [416.081399] __i915_vm_release+0xfc/0x1d0 [i915]

Fixes: 1d1b5490b91c ("drm/i915/gtt: Replace struct_mutex serialisation for allocation")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190704201656.15775-1-chris@chris-wilson.co.uk
(cherry picked from commit e7539b79f703a6b533385088fc15cb5c9ab3f56f)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# f691eaa4 03-Jul-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Defer the free for alloc error paths

If we hit an error while allocating the page tables, we have to unwind
the incomplete updates, and wish to free the unused pd. However, we are
not allowed to be hoding the spinlock at that point, and so must use the
later free to defer it until after we drop the lock.

<3> [414.363795] BUG: sleeping function called from invalid context at drivers/gpu/drm/i915/i915_gem_gtt.c:472
<3> [414.364167] in_atomic(): 1, irqs_disabled(): 0, pid: 3905, name: i915_selftest
<4> [414.364406] 3 locks held by i915_selftest/3905:
<4> [414.364408] #0: 0000000034fe8aa8 (&dev->mutex){....}, at: device_driver_attach+0x18/0x50
<4> [414.364415] #1: 000000006bd8a560 (&dev->struct_mutex){+.+.}, at: igt_ctx_exec+0xb7/0x410 [i915]
<4> [414.364476] #2: 000000003dfdc766 (&(&pd->lock)->rlock){+.+.}, at: gen8_ppgtt_alloc_pdp+0x448/0x540 [i915]
<3> [414.364529] Preemption disabled at:
<4> [414.364530] [<0000000000000000>] 0x0
<4> [414.364696] CPU: 0 PID: 3905 Comm: i915_selftest Tainted: G U 5.2.0-rc7-CI-CI_DRM_6403+ #1
<4> [414.364698] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.1-0-g8891697-prebuilt.qemu-project.org 04/01/2014
<4> [414.364699] Call Trace:
<4> [414.364704] dump_stack+0x67/0x9b
<4> [414.364708] ___might_sleep+0x167/0x250
<4> [414.364777] vm_free_page+0x24/0xc0 [i915]
<4> [414.364852] free_pd+0xf/0x20 [i915]
<4> [414.364897] gen8_ppgtt_alloc_pdp+0x489/0x540 [i915]
<4> [414.364946] gen8_ppgtt_alloc_4lvl+0x8e/0x2e0 [i915]
<4> [414.364992] ppgtt_bind_vma+0x2e/0x60 [i915]
<4> [414.365039] i915_vma_bind+0xe8/0x2c0 [i915]
<4> [414.365088] __i915_vma_do_pin+0xa1/0xd20 [i915]

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111050
Fixes: 1d1b5490b91c ("drm/i915/gtt: Replace struct_mutex serialisation for allocation")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190703171913.16585-3-chris@chris-wilson.co.uk
(cherry picked from commit 068610895ebd4bd86f496f01eb7b97e56d7269b2)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# 3bdd4f84 22-Jul-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Rely on spinlock protection for GPU error capture

Trust that we now have adequate protection over the low level structures
via the engine->active.lock to allow ourselves to capture the GPU error
state without the heavy hammer of stop_machine(). Sadly this does mean
that we have to forgo some of the lesser used information (not derived
from the active state) that is not controlled by the active locks. This
includes the list of buffers in the ppGTT and pinned globally in the
GGTT. Originally this was used to manually verify relocations, but
hasn't been required for sometime and modern mesa now has the habit of
ensuring that all interesting buffers within a batch are captured in their
entirety (that are the auxiliary state buffers, but not the textures).

A useful side-effect is that this allows us to restore error capturing
for Braswell and Broxton.

v2: Use pagevec for a typical arbitrary number of preallocated pages

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190722222847.24178-1-chris@chris-wilson.co.uk


# 04364138 19-Jul-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Fix rounding for 36b

The top-level page directory for 36b is a single entry, not multiple
like 32b. Fix up the rounding on the calculation of the size of the top
level so that we populate the 4th level correctly for 36b.

Reported-by: Jose Souza <jose.souza@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Fixes: 1eda701eace2 ("drm/i915/gtt: Recursive cleanup for gen8")
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Cc: Jose Souza <jose.souza@intel.com>
Tested-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190719130737.5835-1-chris@chris-wilson.co.uk


# 5cad0ddf 19-Jul-2019 Michal Wajdeczko <michal.wajdeczko@intel.com>

drm/i915/gtt: Don't try to clear failed empty pd allocation

When __gen8_ppgtt_alloc fails without allocating anything
we should not try to call __gen8_ppgtt_clear as there is
nothing to clear and underlying code will complain with:

[ 157.861645] gen8_pd_range:881 GEM_BUG_ON(start >= end)

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190719153322.10464-1-michal.wajdeczko@intel.com


# 6b5f3cb1 19-Jul-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Correct unshifted 'from' for gen8_ppgtt_alloc errors

Since the underlying __gen8_ppgtt_clear takes the shifted address, we
must remember to provide it with the shifted original start address.

Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Tested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190719131524.827-1-chris@chris-wilson.co.uk


# bea5faf7 11-Jul-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Tidy up ppgtt insertion for gen8

Apply the new radix shift helpers to extract the multi-level indices
cleanly when inserting pte into the gtt tree.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190712112725.2892-5-chris@chris-wilson.co.uk


# 8a98e839 11-Jul-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Recursive ppgtt alloc for gen8

Refactor the separate allocation routines into a single recursive
function.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190712112725.2892-4-chris@chris-wilson.co.uk


# 84b1ca2f 13-Jul-2019 Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

drm/i915/uc: prefer intel_gt over i915 in GuC/HuC paths

With our HW interface logic moving from i915 to gt and with GuC and HuC
being part of the gt HW, it makes sense to use the intel_gt structure
instead of i915 as our reference object in GuC/HuC paths.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190713100016.8026-9-chris@chris-wilson.co.uk
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>


# 4c2be3c5 11-Jul-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Recursive ppgtt clear for gen8

With an explicit level, we can refactor the separate clear functions
as a simple recursive function. The additional knowledge of the level
allows us to spot when we can free an entire subtree at once.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190712112725.2892-3-chris@chris-wilson.co.uk


# 1eda701e 11-Jul-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Recursive cleanup for gen8

With an explicit level, we can refactor the separate cleanup functions
as a simple recursive function. We take the opportunity to pass down the
size of each level so that we can deal with the different sizes of
top-level and avoid over allocating for 32/36-bit vm.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190712112725.2892-2-chris@chris-wilson.co.uk


# 3b58a945 12-Jul-2019 Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>

drm/i915: Propagate "_release" function name suffix down

Replace mixed "_fini"/"_cleanup"/"_cleanup_hw" suffixes found in names
of functions called from i915_driver_release() with "_release" suffix
consistently. This provides better code readability, especially
helpful when trying to work out which phase the code is in.

Functions names starting with "i915_driver_", i.e., those defined in
drivers/gpu/dri/i915/i915_drv.c, just have their "cleanup" or "fini"
parts of their names replaced with the "_release" suffix, while names
of functions coming from other source files have been suffixed with
"_driver_release" to avoid ambiguity with other possible .release entry
points.

v2: early_probe pairs better with late_release (Chris)
v3: fix typo in commit message (Joonas)

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190712112429.740-5-janusz.krzysztofik@linux.intel.com


# 6239901c 12-Jul-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Use NULL to encode scratch shadow entries

We can simplify our gtt walking code by comparing against NULL for
scratch entries as opposed to looking up the distinct per-level scratch
pointer.

The only caveat is to remember to protect external parties and map the
NULL to the scratch top pd.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190712094327.24437-6-chris@chris-wilson.co.uk


# c03cbe4c 12-Jul-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Convert vm->scratch into an array

Each level has its own scratch. Make the levels more obvious by forgoing
the fancy similarly names and replace them with a number. 0 is the bottom
most level, the physical page used for actual data; 1+ are the page
directories.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190712094327.24437-5-chris@chris-wilson.co.uk


# 27763264 12-Jul-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Compute the radix for gen8 page table levels

The radix levels of each page directory are easily determined so replace
the numerous hardcoded constants with precomputed derived constants.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190712094327.24437-4-chris@chris-wilson.co.uk


# 18c7962b 12-Jul-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Markup i915_ppgtt height

This will be useful to consolidate recursive code.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190712094327.24437-3-chris@chris-wilson.co.uk


# a9abea97 12-Jul-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Reorder gen8 ppgtt free/clear/alloc

In preparation for refactoring the free/clear/alloc, first move the code
around so that we can avoid forward declarations in the next set of
patches.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190712094327.24437-2-chris@chris-wilson.co.uk


# 57a7e305 12-Jul-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Wrap page_table with page_directory

The page directory extends the page table with the shadow entries. Make
the page directory struct embed the page table for easier code reuse.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190712094327.24437-1-chris@chris-wilson.co.uk


# 6eebfe8a 12-Jul-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Use shallow dma pages for scratch

We only use the dma pages for scratch, and so do not need to allocate
the extra storage for the shadow page directory.

v2: Refrain from reintroducing I915_PDES

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190712075818.20616-1-chris@chris-wilson.co.uk


# 50b38bc4 05-Jul-2019 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/gtt: Introduce release_pd_entry

By encapsulating the locking upper level and used check for entry
into a helper function, we can use it in all callsites.

v2: get rid of atomic_reads on lower level clears (Chris)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190705215204.4559-4-chris@chris-wilson.co.uk


# 73a8fdef 05-Jul-2019 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/gtt: Setup phys pages for 3lvl pdps

If we setup backing phys page for 3lvl pdps, as they
are not used, we will lose 5 pages per ppgtt.

Trading this memory on bsw, we gain more common code paths for all
gen8+ directory manipulation. And those paths are now void of checks
for page directory type, making the hot paths faster.

v2: don't shortcut vm (Chris)

Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190705215204.4559-3-chris@chris-wilson.co.uk


# 72230b87 05-Jul-2019 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/gtt: Tear down setup and cleanup macros for page dma

We don't use common codepaths to setup and cleanup page
directories vs page tables. So their setup and cleanup macros
are of no use and can be removed.

Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190705215204.4559-2-chris@chris-wilson.co.uk


# f20f272f 05-Jul-2019 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/gtt: pde entry encoding is identical

For all page directory entries, the pde encoding is
identical. Don't complicate call sites with different
versions of doing the same thing, so we always check the
existence of physical page before writing the entry into
it. This further generalizes the pd so that manipulation in
callsites will be identical, removing the need to handle
pdps differently for gen8.

v2: squash
v3: inc/dec with set/clear (Chris)
v4: inlines, warn, stray set_pd (Chris)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190705215204.4559-1-chris@chris-wilson.co.uk


# e7539b79 04-Jul-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Mark the freed page table entries with scratch

On unwinding the allocation error path and having freed the page table
entry, it is imperative that we mark it as scratch.

<4> [416.075569] general protection fault: 0000 [#1] PREEMPT SMP PTI
<4> [416.075801] CPU: 0 PID: 2385 Comm: kworker/u2:11 Tainted: G U 5.2.0-rc7-CI-Patchwork_13534+ #1
<4> [416.076162] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.1-0-g8891697-prebuilt.qemu-project.org 04/01/2014
<4> [416.076522] Workqueue: i915 __i915_vm_release [i915]
<4> [416.076754] RIP: 0010:gen8_ppgtt_cleanup_3lvl+0x58/0xb0 [i915]
<4> [416.077023] Code: 81 e2 04 fe ff ff 81 c2 ff 01 00 00 4c 8d 74 d6 58 4d 8b 65 00 4d 3b a7 28 02 00 00 74 40 49 8d 5c 24 50 49 81 c4 50 10 00 00 <48> 8b 2b 49 3b af 20 02 00 00 74 13 4c 89 ff 48 89 ee e8 01 fb ff
<4> [416.077445] RSP: 0018:ffffc9000046bd98 EFLAGS: 00010206
<4> [416.077625] RAX: 0001000000000000 RBX: 6b6b6b6b6b6b6bbb RCX: 8b4b56d500000000
<4> [416.077838] RDX: 00000000000001ff RSI: ffff88805a578008 RDI: ffff88805bd0efc8
<4> [416.078167] RBP: ffff88805bd0efc8 R08: 0000000004e42b93 R09: 0000000000000001
<4> [416.078381] R10: 0000000000000000 R11: ffff888077a1b0b8 R12: 6b6b6b6b6b6b7bbb
<4> [416.078594] R13: ffff88805a578058 R14: ffff88805a579058 R15: ffff88805bd0efc8
<4> [416.078815] FS: 0000000000000000(0000) GS:ffff88807da00000(0000) knlGS:0000000000000000
<4> [416.079395] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [416.079851] CR2: 000056160fec2b14 CR3: 0000000071bbc003 CR4: 00000000003606f0
<4> [416.080388] Call Trace:
<4> [416.080828] gen8_ppgtt_cleanup+0x64/0x100 [i915]
<4> [416.081399] __i915_vm_release+0xfc/0x1d0 [i915]

Fixes: 1d1b5490b91c ("drm/i915/gtt: Replace struct_mutex serialisation for allocation")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190704201656.15775-1-chris@chris-wilson.co.uk


# ae1c5fd7 04-Jul-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Handle double alloc failures

Matthew pointed out that we could face a double failure with concurrent
allocations/frees, and so the assumption that the local var alloc was
NULL was fraught with danger. Rather than complicate the error paths too
much to add a second local for a second free, just do the second free
earlier on the unwind path.

Reported-by: Matthew Auld <matthew.william.auld@gmail.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190704104345.6603-1-chris@chris-wilson.co.uk


# 06861089 03-Jul-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Defer the free for alloc error paths

If we hit an error while allocating the page tables, we have to unwind
the incomplete updates, and wish to free the unused pd. However, we are
not allowed to be hoding the spinlock at that point, and so must use the
later free to defer it until after we drop the lock.

<3> [414.363795] BUG: sleeping function called from invalid context at drivers/gpu/drm/i915/i915_gem_gtt.c:472
<3> [414.364167] in_atomic(): 1, irqs_disabled(): 0, pid: 3905, name: i915_selftest
<4> [414.364406] 3 locks held by i915_selftest/3905:
<4> [414.364408] #0: 0000000034fe8aa8 (&dev->mutex){....}, at: device_driver_attach+0x18/0x50
<4> [414.364415] #1: 000000006bd8a560 (&dev->struct_mutex){+.+.}, at: igt_ctx_exec+0xb7/0x410 [i915]
<4> [414.364476] #2: 000000003dfdc766 (&(&pd->lock)->rlock){+.+.}, at: gen8_ppgtt_alloc_pdp+0x448/0x540 [i915]
<3> [414.364529] Preemption disabled at:
<4> [414.364530] [<0000000000000000>] 0x0
<4> [414.364696] CPU: 0 PID: 3905 Comm: i915_selftest Tainted: G U 5.2.0-rc7-CI-CI_DRM_6403+ #1
<4> [414.364698] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.1-0-g8891697-prebuilt.qemu-project.org 04/01/2014
<4> [414.364699] Call Trace:
<4> [414.364704] dump_stack+0x67/0x9b
<4> [414.364708] ___might_sleep+0x167/0x250
<4> [414.364777] vm_free_page+0x24/0xc0 [i915]
<4> [414.364852] free_pd+0xf/0x20 [i915]
<4> [414.364897] gen8_ppgtt_alloc_pdp+0x489/0x540 [i915]
<4> [414.364946] gen8_ppgtt_alloc_4lvl+0x8e/0x2e0 [i915]
<4> [414.364992] ppgtt_bind_vma+0x2e/0x60 [i915]
<4> [414.365039] i915_vma_bind+0xe8/0x2c0 [i915]
<4> [414.365088] __i915_vma_do_pin+0xa1/0xd20 [i915]

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111050
Fixes: 1d1b5490b91c ("drm/i915/gtt: Replace struct_mutex serialisation for allocation")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190703171913.16585-3-chris@chris-wilson.co.uk


# 096a9394 02-Jul-2019 Michał Winiarski <michal.winiarski@intel.com>

drm/i915/gtt: Don't check PPGTT presence on PPGTT-only platforms

We missed one place where we check PPGTT-only platform for PPGTT
presence. Let's remove it.
While I'm here let's assert that this particular code is never called on
pre-gen8 platforms.

References: 4bdafb9ddfa4 ("drm/i915: Remove i915.enable_ppgtt override")
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190702113149.21200-2-michal.winiarski@intel.com


# a3389c14 02-Jul-2019 Michał Winiarski <michal.winiarski@intel.com>

Revert "drm/i915: Introduce private PAT management"

This reverts commit 4395890a48551982549d222d1923e2833dac47cf.

It's been over a year since this was merged, and the actual users of
intel_ppat_get / intel_ppat_put never materialized.

Time to remove it!

v2: Unbreak suspend (Chris)
v3: Rebase, drop fixes tag to avoid confusion

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190702113149.21200-1-michal.winiarski@intel.com


# 12c255b5 21-Jun-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Provide an i915_active.acquire callback

If we introduce a callback for i915_active that is only called the first
time we use the i915_active and is symmetrically paired with the
i915_active.retire callback, we can replace the open-coded and
non-atomic implementations -- which will be very fragile (i.e. broken)
upon removing the struct_mutex serialisation.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621183801.23252-4-chris@chris-wilson.co.uk


# 80fc1c19 21-Jun-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Fixup kerneldoc parameters

drivers/gpu/drm/i915/gt/intel_mocs.c:513: warning: Function parameter or member 'gt' not described in 'intel_mocs_init_l3cc_table'
drivers/gpu/drm/i915/gt/intel_mocs.c:513: warning: Excess function parameter 'dev_priv' description in 'intel_mocs_init_l3cc_table'

intel_vgt_balloon/deballoon, i915_ggtt_probe_hw intel_wopcm_init_hw need
similar treatment

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621131640.28864-2-chris@chris-wilson.co.uk


# ba4134a4 21-Jun-2019 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Save trip via top-level i915 in a few more places

For gt related operations it makes more logical sense to stay in the realm
of gt instead of dereferencing via driver i915.

This patch handles a few of the easy ones with work requiring more
refactoring still outstanding.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-30-tvrtko.ursulin@linux.intel.com


# 1d66377a 21-Jun-2019 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Compartmentalize i915_gem_init_ggtt

Continuing on the theme of better logical organization of our code, make
the first step towards making the ggtt code better isolated from wider
struct drm_i915_private.

v2:
* Bring the ickle onion unwind back. (Chris)
* Rename to i915_init_ggtt. (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-27-tvrtko.ursulin@linux.intel.com


# 3b896628 21-Jun-2019 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Compartmentalize i915_ggtt_cleanup_hw

Continuing on the theme of better logical organization of our code, make
the first step towards making the ggtt code better isolated from wider
struct drm_i915_private.

v2:
* Cleanup of mm.wc_stash does not need struct_mutex. (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-26-tvrtko.ursulin@linux.intel.com


# 68c754b8 21-Jun-2019 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Compartmentalize i915_gem_suspend/restore_gtt_mappings

Having made start to better code compartmentalization by introducing
struct intel_gt, continue the theme elsewhere in code by making functions
take parameters take what logically makes most sense for them instead of
the global struct drm_i915_private.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-22-tvrtko.ursulin@linux.intel.com


# 763c1e63 21-Jun-2019 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Store intel_gt backpointer in vm

This will come useful in the following patch.

v2:
* Handle mock ggtt.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-21-tvrtko.ursulin@linux.intel.com


# 759e4a74 21-Jun-2019 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Make ggtt invalidation work on ggtt

It is more logical for ggtt invalidation to take ggtt as input parameter.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-20-tvrtko.ursulin@linux.intel.com


# 8b5342f5 21-Jun-2019 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Compartmentalize i915_ggtt_init_hw

Having made start to better code compartmentalization by introducing
struct intel_gt, continue the theme elsewhere in code by making functions
take parameters take what logically makes most sense for them instead of
the global struct drm_i915_private.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-19-tvrtko.ursulin@linux.intel.com


# ee1de7dd 21-Jun-2019 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Compartmentalize i915_ggtt_probe_hw

Having made start to better code compartmentalization by introducing
struct intel_gt, continue the theme elsewhere in code by making functions
take parameters take what logically makes most sense for them instead of
the global struct drm_i915_private.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-18-tvrtko.ursulin@linux.intel.com


# 28a1f789 21-Jun-2019 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Consolidate some open coded mmio rmw

Replace some gen6/7 open coded rmw with intel_uncore_rmw.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-14-tvrtko.ursulin@linux.intel.com


# acb56d97 21-Jun-2019 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Convert i915_ppgtt_init_hw to intel_gt

More removal of implicit dev_priv from using old mmio accessors.

v2:
* Rebase for uncore_to_i915 removal.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-13-tvrtko.ursulin@linux.intel.com


# eaf522f6 21-Jun-2019 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Make i915_check_and_clear_faults take intel_gt

Continuing the conversion and elimination of implicit dev_priv.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Suggested-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-6-tvrtko.ursulin@linux.intel.com


# 3cb4ce00 21-Jun-2019 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Convert intel_vgt_(de)balloon to uncore

More removal of implicit dev_priv from using old mmio accessors.

Furthermore these calls really operate on ggtt so it logically makes sense
if they take it as parameter.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-1-tvrtko.ursulin@linux.intel.com


# b32fa811 20-Jun-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Defer address space cleanup to an RCU worker

Enable RCU protection of i915_address_space and its ppgtt superclasses,
and defer its cleanup into a worker executed after an RCU grace period.

In the future we will be able to use the RCU protection to reduce the
locking around VM lookups, but the immediate benefit is being able to
defer the release into a kworker (process context). This is required as
we may need to sleep to reap the WC pages stashed away inside the ppgtt.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110934
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190620183705.31006-1-chris@chris-wilson.co.uk


# 32a19631 17-Jun-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Serialise both updates to PDE and our shadow

Currently, we perform a locked update of the shadow entry when
allocating a page directory entry such that if two clients are
concurrently allocating neighbouring ranges we only insert one new entry
for the pair of them. However, we also need to serialise both clients
wrt to the actual entry in the HW table, or else we may allow one client
or even a third client to proceed ahead of the HW write. My handwave
before was that under the _pathological_ condition we would see the
scratch entry instead of the expected entry, causing a temporary
glitch. That starvation condition will eventually show up in practice, so
fix it.

The reason for the previous cheat was to avoid having to free the extra
allocation while under the spinlock. Now, we keep the extra entry
allocated until the end instead.

v2: Fix error paths for gen6

Fixes: 1d1b5490b91c ("drm/i915/gtt: Replace struct_mutex serialisation for allocation")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190617140426.7203-1-chris@chris-wilson.co.uk


# 9ee72503 14-Jun-2019 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/gtt: Generalize alloc_pd

Allocate all page directory variants with alloc_pd. As
the lvl3 and lvl4 variants differ in manipulation, we
need to check for existence of backing phys page before accessing
it.

v2: use err in returns

Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190614164350.30415-5-mika.kuoppala@linux.intel.com


# 56ab6741 14-Jun-2019 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/gtt: Introduce init_pd

All page directories, excluding last level, are initialized with
pointer to next level page directories. Make common function for it.

Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190614164350.30415-4-mika.kuoppala@linux.intel.com


# 4fba8764 14-Jun-2019 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/gtt: Introduce init_pd_with_page

We set the page directory entries to point into a page table.
There is no gen specifics in here so make it simple and
obvious.

Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190614164350.30415-3-mika.kuoppala@linux.intel.com


# b5b7bef9 14-Jun-2019 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/gtt: Use a common type for page directories

All page directories are identical in function, only the position in the
hierarchy differ. Use same base type for directory functionality.

v2: cleanup, size always 512, init to null

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190614164350.30415-2-mika.kuoppala@linux.intel.com


# 7d82cc35 14-Jun-2019 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/gtt: No need to zero the table for page dirs

We set them to scratch right after allocation so prevent
useless zeroing before.

v2: atomic_t
v3: allow pdp alloc fail

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190614164350.30415-1-mika.kuoppala@linux.intel.com


# df0566a6 13-Jun-2019 Jani Nikula <jani.nikula@intel.com>

drm/i915: move modesetting core code under display/

Now that we have a new subdirectory for display code, continue by moving
modesetting core code.

display/intel_frontbuffer.h sticks out like a sore thumb, otherwise this
is, again, a surprisingly clean operation.

v2:
- don't move intel_sideband.[ch] (Ville)
- use tabs for Makefile file lists and sort them

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190613084416.6794-3-jani.nikula@intel.com


# c447ff7d 13-Jun-2019 Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

drm/i915: update with_intel_runtime_pm to use the rpm structure

Matching the underlying get/put functions.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190613232156.34940-8-daniele.ceraolospurio@intel.com


# d858d569 13-Jun-2019 Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

drm/i915: update rpm_get/put to use the rpm structure

The functions where internally already only using the structure, so we
need to just flip the interface.

v2: rebase

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190613232156.34940-7-daniele.ceraolospurio@intel.com


# 0cf289bd 13-Jun-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Move fence register tracking from i915->mm to ggtt

As the fence registers only apply to regions inside the GGTT is makes
more sense that we track these as part of the i915_ggtt and not the
general mm. In the next patch, we will then pull the register locking
underneath the i915_ggtt.mutex.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190613073254.24048-1-chris@chris-wilson.co.uk


# 09a32cb7 11-Jun-2019 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Make GuC GGTT reservation work on ggtt

These functions operate on ggtt so make them take that directly as
parameter.

At the same time move the USES_GUC conditional down to
intel_guc_reserve_ggtt_top for symmetry with
intel_guc_reserved_gtt_size.

v2:
* Rename and move functions to be static in i915_gem_gtt.c (Michal)

v3:
* Add comment explaining reason for reservation, add assert and fix
error message. (Michal)

v4:
* Fix checkpatch error.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190611122350.15060-1-tvrtko.ursulin@linux.intel.com


# 9937e16b 10-Jun-2019 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915/guc: Move intel_guc_reserved_gtt_size to intel_wopcm_guc_size

Reduces pointer chasing and gets more to the point.

v2:
* Tidy whitespace.
* Tidy comment. (Michal)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190611110044.7742-1-tvrtko.ursulin@linux.intel.com


# ab53497b 11-Jun-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Rename i915_hw_ppgtt to i915_ppgtt

Keeping the _hw_ in there does not help to distinguish it from its
only brethren i915_ggtt, so drop it.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190611091238.15808-2-chris@chris-wilson.co.uk


# e568ac38 11-Jun-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Pull kref into i915_address_space

Make the kref common to both derived structs (i915_ggtt and i915_ppgtt)
so that we can safely reference count an abstract ctx->vm address space.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190611091238.15808-1-chris@chris-wilson.co.uk


# 6a8cc66f 06-Jun-2019 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Move i915_check_and_clear_faults to intel_reset.c

The code is logically about reset so it makes sense.

It also enables making i915_clear_error_registers static.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190607115932.20271-1-tvrtko.ursulin@linux.intel.com


# dbc65183 07-Jun-2019 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Convert some more bits to use engine mmio accessors

Remove a couple dev_priv locals as a consequence.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190607084521.16845-1-tvrtko.ursulin@linux.intel.com


# bcc726be 07-Jun-2019 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Unexport i915_gem_init/fini_aliasing_ppgtt

These two are only used from within i915_gem_gtt.c and can trivially be
made static.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190607082557.31670-5-tvrtko.ursulin@linux.intel.com


# 77a302e0 07-Jun-2019 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Make Gen6/7 RING_FAULT_REG access engine centric

Similar to earlier conversions, eliminate the implicit dev_priv by
introducing some helpers which take the engine parameter (since the
register itself is per engine).

v2:
* Always use parentheses in macro arguments.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190607101535.767-1-tvrtko.ursulin@linux.intel.com


# b61ea001 07-Jun-2019 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Reset only affected engines when handling error capture

Pass down the engine mask to i915_clear_error_registers so only affected
engines can be reset on the Gen6/7 path.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190607082557.31670-1-tvrtko.ursulin@linux.intel.com


# 155ab883 05-Jun-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Move object close under its own lock

Use i915_gem_object_lock() to guard the LUT and active reference to
allow us to break free of struct_mutex for handling GEM_CLOSE.

Testcase: igt/gem_close_race
Testcase: igt/gem_exec_parallel
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190606112320.9704-1-chris@chris-wilson.co.uk


# 1d1b5490 04-Jun-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Replace struct_mutex serialisation for allocation

Instead of relying on the caller holding struct_mutex across the
allocation, push the allocation under a tree of spinlocks stored inside
the page tables. Not only should this allow us to avoid struct_mutex
here, but it will allow multiple users to lock independent ranges for
concurrent allocations, and operate independently. This is vital for
pushing the GTT manipulation into a background thread where dependency
on struct_mutex is verboten, and for allowing other callers to avoid
struct_mutex altogether.

v2: Restore lost GEM_BUG_ON for removing too many PTE from
gen6_ppgtt_clear_range.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190604153830.19096-1-chris@chris-wilson.co.uk


# 59ec84ec 04-Jun-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Use unchecked uncore writes to flush the GTT

As the GTT is outside of the powerwell, we can simplify flushing the
GGTT writes by using an unchecked mmio write and post.

v2: s/unc/uncore/

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190604120022.20472-3-chris@chris-wilson.co.uk


# 0a4a6e74 29-May-2019 Matthew Auld <matthew.auld@intel.com>

drm/i915/gtt: grab wakeref in gen6_alloc_va_range

Some steps in gen6_alloc_va_range require the HW to be awake, so ideally
we should be grabbing the wakeref ourselves and not relying on the
caller already holding it for us.

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190529123108.24422-1-matthew.auld@intel.com


# 7f5f2280 29-May-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Avoid overflowing the WC stash

An interesting issue cropped with making the pagetables be allocated and
freed concurrently (i.e. removing their grandeous struct_mutex guard)
was that we would overflow the page stash. This happens when we have
multiple allocators grabbing WC pages such that we fill the vm's local
page stash and then when we free another page, the page stash is already
full and we overflow.

The fix is quite simple: to check for a full page stash before adding
another. This results in us keeping a vm local page stash around for
much longer, which is both a blessing and a curse.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190529093407.31697-1-chris@chris-wilson.co.uk


# 6951e589 28-May-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Move GEM object domain management from struct_mutex to local

Use the per-object local lock to control the cache domain of the
individual GEM objects, not struct_mutex. This is a huge leap forward
for us in terms of object-level synchronisation; execbuffers are
coordinated using the ww_mutex and pread/pwrite is finally fully
serialised again.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-10-chris@chris-wilson.co.uk


# 37d63f8f 28-May-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Pull scatterlist utils out of i915_gem.h

Out scatterlist utility routines can be pulled out of i915_gem.h for a
bit more decluttering.

v2: Push I915_GTT_PAGE_SIZE out of i915_scatterlist itself and into the
caller.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-9-chris@chris-wilson.co.uk


# c2df2201 24-May-2019 Colin Ian King <colin.king@canonical.com>

drm/i915/gtt: set err to -ENOMEM on memory allocation failure

Currently when the allocation of ppgtt->work fails the error return
path via err_free returns an uninitialized value in err. Fix this
by setting err to the appropriate error return of -ENOMEM.

Addresses-Coverity: ("Uninitialized scalar variable")
Fixes: d3622099c76f ("drm/i915/gtt: Always acquire struct_mutex for gen6_ppgtt_cleanup")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190524212627.24256-1-colin.king@canonical.com


# 63e8dcdb 24-May-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Neuter the deferred unbind callback from gen6_ppgtt_cleanup

Having deferred the vma destruction to a worker where we can acquire the
struct_mutex, we have to avoid chasing back into the now destroyed
ppgtt. The pd_vma is special in having a custom unbind function to scan
for unused pages despite the VMA itself being notionally part of the
GGTT. As such, we need to disable that callback to avoid a
use-after-free.

This unfortunately blew up so early during boot that CI declared the
machine unreachable as opposed to being the major failure it was. Oops.

Fixes: d3622099c76f ("drm/i915/gtt: Always acquire struct_mutex for gen6_ppgtt_cleanup")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Tomi Sarvela <tomi.p.sarvela@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190524064529.20514-1-chris@chris-wilson.co.uk


# d3622099 23-May-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Always acquire struct_mutex for gen6_ppgtt_cleanup

We rearranged the vm_destroy_ioctl to avoid taking struct_mutex, little
realising that buried underneath the gen6 ppgtt release path was a
struct_mutex requirement (to remove its GGTT vma). Until that
struct_mutex is vanquished, take a detour in gen6_ppgtt_cleanup to do
the i915_vma_destroy from inside a worker under the struct_mutex.

<4> [257.740160] WARN_ON(debug_locks && !lock_is_held(&(&vma->vm->i915->drm.struct_mutex)->dep_map))
<4> [257.740213] WARNING: CPU: 3 PID: 1507 at drivers/gpu/drm/i915/i915_vma.c:841 i915_vma_destroy+0x1ae/0x3a0 [i915]
<4> [257.740214] Modules linked in: snd_hda_codec_hdmi i915 x86_pkg_temp_thermal mei_hdcp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core r8169 realtek snd_pcm mei_me mei prime_numbers lpc_ich
<4> [257.740224] CPU: 3 PID: 1507 Comm: gem_vm_create Tainted: G U 5.2.0-rc1-CI-CI_DRM_6118+ #1
<4> [257.740225] Hardware name: MSI MS-7924/Z97M-G43(MS-7924), BIOS V1.12 02/15/2016
<4> [257.740249] RIP: 0010:i915_vma_destroy+0x1ae/0x3a0 [i915]
<4> [257.740250] Code: 00 00 00 48 81 c7 c8 00 00 00 e8 ed 08 f0 e0 85 c0 0f 85 78 fe ff ff 48 c7 c6 e8 ec 30 a0 48 c7 c7 da 55 33 a0 e8 42 8c e9 e0 <0f> 0b 8b 83 40 01 00 00 85 c0 0f 84 63 fe ff ff 48 c7 c1 c1 58 33
<4> [257.740251] RSP: 0018:ffffc90000aafc68 EFLAGS: 00010282
<4> [257.740252] RAX: 0000000000000000 RBX: ffff8883f7957840 RCX: 0000000000000003
<4> [257.740253] RDX: 0000000000000046 RSI: 0000000000000006 RDI: ffffffff8212d1b9
<4> [257.740254] RBP: ffffc90000aafcc8 R08: 0000000000000000 R09: 0000000000000000
<4> [257.740255] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8883f4d5c2a8
<4> [257.740256] R13: ffff8883f4d5d680 R14: ffff8883f4d5c668 R15: ffff8883f4d5c2f0
<4> [257.740257] FS: 00007f777fa8fe40(0000) GS:ffff88840f780000(0000) knlGS:0000000000000000
<4> [257.740258] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [257.740259] CR2: 00007f777f6522b0 CR3: 00000003c612a006 CR4: 00000000001606e0
<4> [257.740260] Call Trace:
<4> [257.740283] gen6_ppgtt_cleanup+0x25/0x60 [i915]
<4> [257.740306] i915_ppgtt_release+0x102/0x290 [i915]
<4> [257.740330] i915_gem_vm_destroy_ioctl+0x7c/0xa0 [i915]
<4> [257.740376] ? i915_gem_vm_create_ioctl+0x160/0x160 [i915]
<4> [257.740379] drm_ioctl_kernel+0x83/0xf0
<4> [257.740382] drm_ioctl+0x2f3/0x3b0
<4> [257.740422] ? i915_gem_vm_create_ioctl+0x160/0x160 [i915]
<4> [257.740426] ? _raw_spin_unlock_irqrestore+0x39/0x60
<4> [257.740430] do_vfs_ioctl+0xa0/0x6e0
<4> [257.740433] ? lock_acquire+0xa6/0x1c0
<4> [257.740436] ? __task_pid_nr_ns+0xb9/0x1f0
<4> [257.740439] ksys_ioctl+0x35/0x60
<4> [257.740441] __x64_sys_ioctl+0x11/0x20
<4> [257.740443] do_syscall_64+0x55/0x1c0
<4> [257.740445] entry_SYSCALL_64_after_hwframe+0x49/0xbe

References: e0695db7298e ("drm/i915: Create/destroy VM (ppGTT) for use with contexts")
Fixes: 7f3f317a66ca ("drm/i915: Restore control over ppgtt for context creation ABI")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190523064933.23604-1-chris@chris-wilson.co.uk


# 1a74fc0b 09-May-2019 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Add a new "remapped" gtt_view

To overcome display engine stride limits we'll want to remap the
pages in the GTT. To that end we need a new gtt_view type which
is just like the "rotated" type except not rotated.

v2: Use intel_remapped_plane_info base type
s/unused/unused_mbz/ (Chris)
Separate BUILD_BUG_ON()s (Chris)
Use I915_GTT_PAGE_SIZE (Chris)
v3: Use i915_gem_object_get_dma_address() (Chris)
Trim the sg (Tvrtko)
v4: Actually trim this time. Limit the max length
to one row of pages to keep things simple

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190509122159.24376-2-ville.syrjala@linux.intel.com
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>


# 112ed2d3 24-Apr-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Move GraphicsTechnology files under gt/

Start partitioning off the code that talks to the hardware (GT) from the
uapi layers and move the device facing code under gt/

One casualty is s/intel_ringbuffer.h/intel_engine.h/ with the plan to
subdivide that header and body further (and split out the submission
code from the ringbuffer and logical context handling). This patch aims
to be simple motion so git can fixup inflight patches with little mess.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190424174839.7141-1-chris@chris-wilson.co.uk


# 91180076 19-Apr-2019 Fernando Pacheco <fernando.pacheco@intel.com>

drm/i915/uc: Reserve upper range of GGTT

GuC and HuC depend on struct_mutex for device reinitialization. Moving
away from this dependency requires perma-pinning the firmware images in
GGTT. The upper portion of the GuC address space has a sizeable hole
(several MB) that is inaccessible by GuC. Reserve this range within GGTT
as it can comfortably hold GuC/HuC firmware images.

v2: Reserve node rather than insert (Chris)
Simpler determination of node start/size (Daniele)
Move reserve/release out to intel_guc.* files

v3: Reserve starting at GUC_GGTT_TOP only and bail if this
fails (Chris)

Signed-off-by: Fernando Pacheco <fernando.pacheco@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190419230015.18121-3-fernando.pacheco@intel.com


# 267e80ee 19-Apr-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Skip clearing the GGTT under gen6+ full-ppgtt

If we know that the user cannot access the GGTT, by virtue of having a
segregated memory area, we can skip clearing the unused entries as they
cannot be accessed.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190419201207.5477-1-chris@chris-wilson.co.uk


# 3936867d 11-Apr-2019 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915: Disable read only ppgtt support for gen11

On gen11 writing to read only ppgtt page causes a gpu hang.
This behaviour is different than with previous gen where
read only ppgtt access is supported. On those, the write
is just dropped without visible side effects.

Disable ro ppgtt support on gen11 until a solution can
be found to bring it into line with its predecessors.

References: HSDES#1807136187
References: https://bugzilla.freedesktop.org/show_bug.cgi?id=108569
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190411083034.28311-1-mika.kuoppala@linux.intel.com


# e0695db7 22-Mar-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Create/destroy VM (ppGTT) for use with contexts

In preparation to making the ppGTT binding for a context explicit (to
facilitate reusing the same ppGTT between different contexts), allow the
user to create and destroy named ppGTT.

v2: Replace global barrier for swapping over the ppgtt and tlbs with a
local context barrier (Tvrtko)
v3: serialise with struct_mutex; it's lazy but required dammit
v4: Rewrite igt_ctx_shared_exec to be more different (aimed to be more
similarly, turned out different!)

v5: Fix up test unwind for aliasing-ppgtt (snb)
v6: Tighten language for uapi struct drm_i915_gem_vm_control.
v7: Patch the context image for runtime ppgtt switching!

Testcase: igt/gem_vm_create
Testcase: igt/gem_ctx_param/vm
Testcase: igt/gem_ctx_clone/vm
Testcase: igt/gem_ctx_shared
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190322092325.5883-2-chris@chris-wilson.co.uk


# 3aa9945a 21-Mar-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Separate GEM context construction and registration to userspace

In later patches, it became apparent that userspace can see a partially
constructed GEM context and begin using it before it was ready, to much
hilarity. Close this window of opportunity by lifting the registration of
the context with userspace (the insertion of the context into the filp's
idr) to the very end of the CONTEXT_CREATE ioctl.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190321140711.11190-1-chris@chris-wilson.co.uk


# 2ebd000a 14-Mar-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Refactor common ppgtt initialisation

The basic setup of the i915_hw_ppgtt is the same between gen6 and gen8,
so refactor that into a common routine.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Bob Paauwe <bob.j.paauwe@intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190314223839.28258-5-chris@chris-wilson.co.uk


# a9fe9ca4 14-Mar-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Rename i915_vm_is_48b to i915_vm_is_4lvl

Large ppGTT are differentiated by the requirement to go to four levels
to address more than 32b. Given the introduction of more 4 level ppGTT
with different sizes of addressable bits, rename i915_vm_is_48b() to
better reflect the commonality of using 4 levels.

Based on a patch by Bob Paauwe.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Bob Paauwe <bob.j.paauwe@intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190314223839.28258-4-chris@chris-wilson.co.uk


# cbecbcca 14-Mar-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Record platform specific ppGTT size in intel_device_info

As the maximum addressable bits is determined by platform, record that
information in our static chipset tables. This has the advantage of
being clearly recorded in our capability dumps for dmesg, debugfs and
error states.

Based on a patch by Bob Paauwe.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Bob Paauwe <bob.j.paauwe@intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190314223839.28258-2-chris@chris-wilson.co.uk


# fb251a72 05-Mar-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Mark ALL_ENGINES as dirty on ppGTT modification

Small simplification to set all bits in the dirty mask rather than
lookup the exact mask of populated engines. The bits for the engines
that do not exist are unused and so can safely set and then ignored.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190305180332.30900-2-chris@chris-wilson.co.uk


# 8a68d464 05-Mar-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Store the BIT(engine->id) as the engine's mask

In the next patch, we are introducing a broad virtual engine to encompass
multiple physical engines, losing the 1:1 nature of BIT(engine->id). To
reflect the broader set of engines implied by the virtual instance, lets
store the full bitmask.

v2: Use intel_engine_mask_t (s/ring_mask/engine_mask/)
v3: Tvrtko voted for moah churn so teach everyone to not mention ring
and use $class$instance throughout.
v4: Comment upon the disparity in bspec for using VCS1,VCS2 in gen8 and
VCS[0-4] in later gen. We opt to keep the code consistent and use
0-index naming throughout.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190305180332.30900-1-chris@chris-wilson.co.uk


# a2ac437b 05-Mar-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Store scratch page size alongside not in the common struct

As the scratch page is the only one to be allocated with variable size,
rather than keep an unused slot in all i915_page_table structs, store it
alongside the vm->scratch_page.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190305135430.4948-1-chris@chris-wilson.co.uk


# 4f183645 04-Mar-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Use optimised memset32/64 for clearing PTE

Replace the open-coded memset loops with the memset32/64 routines that
reduce to a single instruction or two:

add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-83 (-83)
Function old new delta
gen6_ppgtt_clear_range 371 344 -27
gen8_ppgtt_clear_pd 575 519 -56

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190304230646.23714-1-chris@chris-wilson.co.uk


# 13f1bfd3 28-Feb-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Make object/vma allocation caches global

As our allocations are not device specific, we can move our slab caches
to a global scope.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190228102035.5857-2-chris@chris-wilson.co.uk


# 772b5408 20-Feb-2019 Chengguang Xu <cgxu519@gmx.com>

drm/i915: remove redundant likely/unlikely annotation

unlikely has already included in IS_ERR(), so just
remove redundant likely/unlikely annotation.

Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190221020819.21832-1-cgxu519@gmx.com


# 21950ee7 05-Feb-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Pull i915_gem_active into the i915_active family

Looking forward, we need to break the struct_mutex dependency on
i915_gem_active. In the meantime, external use of i915_gem_active is
quite beguiling, little do new users suspect that it implies a barrier
as each request it tracks must be ordered wrt the previous one. As one
of many, it can be used to track activity across multiple timelines, a
shared fence, which fits our unordered request submission much better. We
need to steer external users away from the singular, exclusive fence
imposed by i915_gem_active to i915_active instead. As part of that
process, we move i915_gem_active out of i915_request.c into
i915_active.c to start separating the two concepts, and rename it to
i915_active_request (both to tie it to the concept of tracking just one
request, and to give it a longer, less appealing name).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190205130005.2807-5-chris@chris-wilson.co.uk


# 64d6c500 05-Feb-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Generalise GPU activity tracking

We currently track GPU memory usage inside VMA, such that we never
release memory used by the GPU until after it has finished accessing it.
However, we may want to track other resources aside from VMA, or we may
want to split a VMA into multiple independent regions and track each
separately. For this purpose, generalise our request tracking (akin to
struct reservation_object) so that we can embed it into other objects.

v2: Tweak error handling during selftest setup.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190205130005.2807-2-chris@chris-wilson.co.uk


# 09d7e46b 28-Jan-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Pull VM lists under the VM mutex.

A starting point to counter the pervasive struct_mutex. For the goal of
avoiding (or at least blocking under them!) global locks during user
request submission, a simple but important step is being able to manage
each clients GTT separately. For which, we want to replace using the
struct_mutex as the guard for all things GTT/VM and switch instead to a
specific mutex inside i915_address_space.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-2-chris@chris-wilson.co.uk


# 499197dc 28-Jan-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Stop tracking MRU activity on VMA

Our goal is to remove struct_mutex and replace it with fine grained
locking. One of the thorny issues is our eviction logic for reclaiming
space for an execbuffer (or GTT mmaping, among a few other examples).
While eviction itself is easy to move under a per-VM mutex, performing
the activity tracking is less agreeable. One solution is not to do any
MRU tracking and do a simple coarse evaluation during eviction of
active/inactive, with a loose temporal ordering of last
insertion/evaluation. That keeps all the locking constrained to when we
are manipulating the VM itself, neatly avoiding the tricky handling of
possible recursive locking during execbuf and elsewhere.

Note that discarding the MRU (currently implemented as a pair of lists,
to avoid scanning the active list for a NONBLOCKING search) is unlikely
to impact upon our efficiency to reclaim VM space (where we think a LRU
model is best) as our current strategy is to use random idle replacement
first before doing a search, and over time the use of softpinned 48b
per-ppGTT is growing (thereby eliminating any need to perform any eviction
searches, in theory at least) with the remaining users being found on
much older devices (gen2-gen6).

v2: Changelog and commentary rewritten to elaborate on the duality of a
single list being both an inactive and active list.
v3: Consolidate bool parameters into a single set of flags; don't
comment on the duality of a single variable being a multiplicity of
bits.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-1-chris@chris-wilson.co.uk


# 9f58892e 16-Jan-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Pull all the reset functionality together into i915_reset.c

Currently the code to reset the GPU and our state is spread widely
across a few files. Pull the logic together into a common file.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190116153304.787-1-chris@chris-wilson.co.uk


# 8cd99918 14-Jan-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Prevent concurrent GGTT update and use on Braswell (again)

On Braswell, under heavy stress, if we update the GGTT while
simultaneously accessing another region inside the GTT, we are returned
the wrong values. To prevent this we stop the machine to update the GGTT
entries so that no memory traffic can occur at the same time.

This was first spotted in

commit 5bab6f60cb4d1417ad7c599166bcfec87529c1a2
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Fri Oct 23 18:43:32 2015 +0100

drm/i915: Serialise updates to GGTT with access through GGTT on Braswell

but removed again in forlorn hope with

commit 4509276ee824bb967885c095c610767e42345c36
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Mon Feb 20 12:47:18 2017 +0000

drm/i915: Remove Braswell GGTT update w/a

However, gem_concurrent_blit is once again only stable with the patch
applied and CI is detecting the odd failure in forked gem_mmap_gtt tests
(which smell like the same issue). Fwiw, a wide variety of CPU memory
barriers (around GGTT flushing, fence updates, PTE updates) and GPU
flushes/invalidates (between requests, after PTE updates) were tried as
part of the investigation to find an alternate cause, nothing comes
close to serialised GGTT updates.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105591
Testcase: igt/gem_concurrent_blit
Testcase: igt/gem_mmap_gtt/*forked*
References: 5bab6f60cb4d ("drm/i915: Serialise updates to GGTT with access through GGTT on Braswell")
References: 4509276ee824 ("drm/i915: Remove Braswell GGTT update w/a")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190114211729.30352-1-chris@chris-wilson.co.uk


# 305dc3f9 14-Jan-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Differentiate between ggtt->mutex and ppgtt->mutex

We have two classes of VM, global GTT and per-process GTT. In order to
allow ourselves the freedom to mix both along call chains, distinguish
the two classes with regards to their mutex and lockdep maps.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190114215956.32266-1-chris@chris-wilson.co.uk


# d4225a53 14-Jan-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Syntatic sugar for using intel_runtime_pm

Frequently, we use intel_runtime_pm_get/_put around a small block.
Formalise that usage by providing a macro to define such a block with an
automatic closure to scope the intel_runtime_pm wakeref to that block,
i.e. macro abuse smelling of python.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-15-chris@chris-wilson.co.uk


# 538ef96b 14-Jan-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gem: Track the rpm wakerefs

Keep track of the temporary rpm wakerefs used for user access to the
device, so that we can cancel them upon release and clearly identify any
leaks.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-10-chris@chris-wilson.co.uk


# 16e4dd03 14-Jan-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Markup paired operations on wakerefs

The majority of runtime-pm operations are bounded and scoped within a
function; these are easy to verify that the wakeref are handled
correctly. We can employ the compiler to help us, and reduce the number
of wakerefs tracked when debugging, by passing around cookies provided
by the various rpm_get functions to their rpm_put counterpart. This
makes the pairing explicit, and given the required wakeref cookie the
compiler can verify that we pass an initialised value to the rpm_put
(quite handy for double checking error paths).

For regular builds, the compiler should be able to eliminate the unused
local variables and the program growth should be minimal. Fwiw, it came
out as a net improvement as gcc was able to refactor rpm_get and
rpm_get_if_in_use together,

v2: Just s/rpm_put/rpm_put_unchecked/ everywhere, leaving the manual
mark up for smaller more targeted patches.
v3: Mention the cookie in Returns

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-2-chris@chris-wilson.co.uk


# 280d479b 21-Dec-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Unwind failure on pinning the gen7 ppgtt

If we fail to pin the ggtt vma slot for the ppgtt page tables, we need
to unwind the locals before reporting the error. Or else on subsequent
attempts to bind the page tables into the ggtt, we will already believe
that the vma has been pinned and continue on blithely. If something else
should happen to be at that location, choas ensues.

Fixes: a2bbf7148342 ("drm/i915/gtt: Only keep gen6 page directories pinned while active")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: <stable@vger.kernel.org> # v4.19+
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181222030623.21710-1-chris@chris-wilson.co.uk
(cherry picked from commit d4de753526f4d99f541f1b6ed1d963005c09700c)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# 2f80d7bd 08-Jan-2019 Jani Nikula <jani.nikula@intel.com>

drm/i915: drop all drmP.h includes

Needs just a few additional includes here and there.

Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190108082709.3748-1-jani.nikula@intel.com


# d25f71a1 07-Jan-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Return immediately if trylock fails for direct-reclaim

Ignore trying to shrink from i915 if we fail to acquire the struct_mutex
in the shrinker while performing direct-reclaim. The trade-off being
(much) lower latency for non-i915 clients at an increased risk of being
unable to obtain a page from direct-reclaim without hitting the
oom-notifier. The proviso being that we still keep trying to hard
obtain the lock for kswapd so that we can reap under heavy memory
pressure.

v2: Taint all mutexes taken within the shrinker with the struct_mutex
subclass as an early warning system, and drop I915_SHRINK_ACTIVE from
vmap to reduce the number of dangerous paths. We also have to drop
I915_SHRINK_ACTIVE from oom-notifier to be able to make the same claim
that ACTIVE is only used from outside context, which fits in with a
longer strategy of avoiding stalls due to scanning active during
shrinking.

The danger in using the subclass struct_mutex is that we declare
ourselves more knowledgable than lockdep and deprive ourselves of
automatic coverage. Instead, we require ourselves to mark up any mutex
taken inside the shrinker in order to detect lock-inversion, and if we
miss any we are doomed to a deadlock at the worst possible moment.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190107115509.12523-1-chris@chris-wilson.co.uk


# 28e52b98 26-Dec-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove debugfs/i915_ppgtt_info

The information presented here is not relevant to current development.
We can either use the context information, but more often we want to
inspect the active gpu state.

The ulterior motive is to eradicate dev->filelist.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181227121549.29139-1-chris@chris-wilson.co.uk


# d4de7535 21-Dec-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Unwind failure on pinning the gen7 ppgtt

If we fail to pin the ggtt vma slot for the ppgtt page tables, we need
to unwind the locals before reporting the error. Or else on subsequent
attempts to bind the page tables into the ggtt, we will already believe
that the vma has been pinned and continue on blithely. If something else
should happen to be at that location, choas ensues.

Fixes: a2bbf7148342 ("drm/i915/gtt: Only keep gen6 page directories pinned while active")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: <stable@vger.kernel.org> # v4.19+
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181222030623.21710-1-chris@chris-wilson.co.uk


# cf819eff 12-Dec-2018 Lucas De Marchi <lucas.demarchi@intel.com>

drm/i915: replace IS_GEN<N> with IS_GEN(..., N)

Define IS_GEN() similarly to our IS_GEN_RANGE(). but use gen instead of
gen_mask to do the comparison. Now callers can pass then gen as a parameter,
so we don't require one macro for each gen.

The following spatch was used to convert the users of these macros:

@@
expression e;
@@
(
- IS_GEN2(e)
+ IS_GEN(e, 2)
|
- IS_GEN3(e)
+ IS_GEN(e, 3)
|
- IS_GEN4(e)
+ IS_GEN(e, 4)
|
- IS_GEN5(e)
+ IS_GEN(e, 5)
|
- IS_GEN6(e)
+ IS_GEN(e, 6)
|
- IS_GEN7(e)
+ IS_GEN(e, 7)
|
- IS_GEN8(e)
+ IS_GEN(e, 8)
|
- IS_GEN9(e)
+ IS_GEN(e, 9)
|
- IS_GEN10(e)
+ IS_GEN(e, 10)
|
- IS_GEN11(e)
+ IS_GEN(e, 11)
)

v2: use IS_GEN rather than GT_GEN and compare to info.gen rather than
using the bitmask

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181212181044.15886-2-lucas.demarchi@intel.com


# e8894267 07-Dec-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Pipeline PDP updates for Braswell

Currently we face a severe problem on Braswell that manifests as invalid
ppGTT accesses. The code tries to maintain the PDP (page directory
pointers) inside the context in two ways, direct write into the context
and a pipelined LRI update. The direct write into the context is
fundamentally racy as it is unserialised with any access (read or write)
the GPU is doing. By asserting that Braswell is not used with vGPU
(currently an unsupported platform) we can eliminate the dangerous
direct write into the context image and solely use the pipelined update.

However, the LRI of the PDP fouls up the GPU, causing it to freeze and
take out the machine with "forcewake ack timeouts". This seems possible
to workaround by preventing the GPU from sleeping (via means of
disabling the power-state management interface, i.e. forcing each ring
to remain awake) around the update. Equally, it seems an EMIT_INVALIDATE
before the LRI is sufficient to prevent the forcewake errors.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108656
References: https://bugs.freedesktop.org/show_bug.cgi?id=108714
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181207090213.14352-3-chris@chris-wilson.co.uk


# 8830f26b 02-Nov-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture

Since capturing the error state requires fiddling around with the GGTT
to read arbitrary buffers and is itself run under stop_machine(), it
deadlocks the machine (effectively a hard hang) when run in conjunction
with Broxton's VTd workaround to serialize GGTT access.

v2: Store the ERR_PTR in first_error so that the error can be reported
to the user via sysfs.
v3: Mention the quirk in dmesg (using info as per usual)

Fixes: 0ef34ad6222a ("drm/i915: Serialize GTT/Aperture accesses on BXT")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: John Harrison <john.C.Harrison@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181102161232.17742-5-chris@chris-wilson.co.uk
(cherry picked from commit fb6f0b64e455b207a636346588e65bf9598d30eb)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# fb6f0b64 02-Nov-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture

Since capturing the error state requires fiddling around with the GGTT
to read arbitrary buffers and is itself run under stop_machine(), it
deadlocks the machine (effectively a hard hang) when run in conjunction
with Broxton's VTd workaround to serialize GGTT access.

v2: Store the ERR_PTR in first_error so that the error can be reported
to the user via sysfs.
v3: Mention the quirk in dmesg (using info as per usual)

Fixes: 0ef34ad6222a ("drm/i915: Serialize GTT/Aperture accesses on BXT")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: John Harrison <john.C.Harrison@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181102161232.17742-5-chris@chris-wilson.co.uk


# c5828105 25-Oct-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Mark up GTT sizes as u64

Since we use a 64b virtual GTT irrespective of the system, we want to
ensure that the GTT computations remains 64b even on 32b systems,
including treatment of huge virtual pages.

No code generation changes on 64b:

Reported-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108282
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181025091823.20571-1-chris@chris-wilson.co.uk
(cherry picked from commit 9125963a9494253fa5a29cc1b4169885d2be7042)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# e5ee4956 30-Oct-2018 Hang Yuan <hang.yuan@linux.intel.com>

drm/i915/gtt: Revert "Disable read-only support under GVT"

This reverts commit c9e666880de5a1fed04dc412b046916d542b72dd.

Checked GVT codes that guest PPGTT PTE flag bits are propagated
to shadow PTE. Read/write bit is not changed. Further tested by
i915 self-test case "igt_ctx_readonly". No error or GPU hang was
detected. So enable read-only support under GVT.

Signed-off-by: Hang Yuan <hang.yuan@linux.intel.com>
Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1540883281-11359-1-git-send-email-hang.yuan@linux.intel.com
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>


# b379e306 29-Oct-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Reuse the read-only 64KiB scratch page and directories

If we can prevent stray writes from landing in the scratch page, we can
reuse the same page and same scratch PT for all contexts without fear of
information leaks and side-channels.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181029182721.29568-2-chris@chris-wilson.co.uk


# daf3dc0f 29-Oct-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Record the scratch pte

Record the scratch PTE encoding upon creation rather than recomputing
the bits everytime. This is important for the next patch where we forgo
having a valid scratch page with which we may compute the bits and so
require keeping the PTE value instead.

v2: Fix up scrub_64K to use scratch_pte as well.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181029182721.29568-1-chris@chris-wilson.co.uk


# 9125963a 25-Oct-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Mark up GTT sizes as u64

Since we use a 64b virtual GTT irrespective of the system, we want to
ensure that the GTT computations remains 64b even on 32b systems,
including treatment of huge virtual pages.

No code generation changes on 64b:

Reported-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108282
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181025091823.20571-1-chris@chris-wilson.co.uk


# 73f522ba 16-Oct-2018 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Use i915_gem_object_get_dma_address() to populate rotated vmas

Replace the kvmalloc_array() with i915_gem_object_get_dma_address() when
populating rotated vmas. One random access mechanism ought to be enough
for everyone?

To calculate the size of the radix tree I think we can do
something like this (assuming 64bit pointers):
num_pages = obj_size / 4096
tree_height = ceil(log64(num_pages))
num_nodes = sum(64^n, n, 0, tree_height-1)
tree_size = num_nodes * 576

If we compare that with the object size we should get a relative
overhead of around .2% to 1% for reasonable sized objects,
which framebuffers tend to be.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181016150413.11577-1-ville.syrjala@linux.intel.com
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>


# 4bdafb9d 26-Sep-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove i915.enable_ppgtt override

Now that we are confident in providing full-ppgtt where supported,
remove the ability to override the context isolation.

v2: Remove faked aliasing-ppgtt for testing as it no longer is accepted.
v3: s/USES/HAS/ to match usage and reject attempts to load the module on
old GVT-g setups that do not provide support for full-ppgtt.
v4: Insulate ABI ppGTT values from our internal enum (later plans
involve moving ppGTT depth out of the enum, thus potentially breaking
ABI unless we document the current values).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Zhi Wang <zhi.a.wang@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180926201222.5643-1-chris@chris-wilson.co.uk


# f8e57863 26-Sep-2018 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Trim partial view sg lists

Partial views are small but there can be many of them, and since the sg
list space for them is allocated pessimistically, we can save some slab by
trimming the unused tail entries.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180926080353.20867-1-tvrtko.ursulin@linux.intel.com


# 4a3d3f67 22-Sep-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Match code to comment and enforce ppgtt for execlists

Our execlist dispatch code requires a ppGTT so make sure we enforce that
option in intel_sanitize_enable_ppgtt(). The comment already tries to
explain that execlists requires ppgtt, but was written when gen8 may
have also taken the legacy path; so rewrite the code to match the
comment by using HAS_EXECLISTS() feature instead of the gen.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180922141804.21183-1-chris@chris-wilson.co.uk


# 8c01903c 20-Sep-2018 Matthew Auld <matthew.auld@intel.com>

drm/i915: pass dev_priv to i915_gem_cleanup_stolen

It really wants dev_priv anyway, also now matches i915_gem_init_stolen.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180920142707.19659-2-matthew.auld@intel.com


# 21c62a9d 17-Sep-2018 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Replace some PAGE_SHIFTs with I915_GTT_PAGE_SIZE

Clean up some cases where we're dealing with GTT pages instead of
system pages to use I915_GTT_PAGE_SIZE instead of PAGE_SHIT. So
just replace the the shifts with mul/div as appropriate. These
are the easy ones, the rest probably need some actual thought.

No real changes in the generated asm. Only gen8_ppgtt_insert_4lvl()
was affected as gcc decided to do the following change:
- be9: 89 d9 mov %ebx,%ecx
- beb: c1 e1 0c shl $0xc,%ecx
- bee: 48 63 c9 movslq %ecx,%rcx
+ be9: 48 63 cb movslq %ebx,%rcx
+ bec: 48 c1 e1 0c shl $0xc,%rcx
and that then shifted a bunch of the offset by one byte. I presume
the sign extensions in the asm are due to integer promotions from
u16 etc. Hopefully someone has confirmed that those don't end up
doing the wrong thing for us.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180917171414.19220-1-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>


# f6e35cda 13-Sep-2018 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Replace some PAGE_SIZE with I915_GTT_PAGE_SIZE

Use I915_GTT_PAGE_SIZE when talking about GTT pages rather than
physical pages.

There are some PAGE_SHIFTs left though. Not sure if we want to
introduce I915_GTT_PAGE_SHIFT or what?

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> # at least some of it :)
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180913150405.706-1-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>


# 48e90504 31-Aug-2018 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Explicitly mark Global GTT address spaces

So far we have been relying on vm->file pointer being NULL to declare
something GGTT.

This has the unfortunate consequence that the default kernel context is
also declared GGTT and interferes with the following patch which wants to
instantiate VMA's and execute requests against the kernel context.

Change the is_ggtt test to use an explicit flag in struct address_space to
solve this issue.

Note that the bit used is free since there is an alignment hole in the
struct.

v2:
* Mark mock ggtt.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180831143643.12366-1-tvrtko.ursulin@linux.intel.com


# 09605548 30-Aug-2018 Lionel Landwerlin <lionel.g.landwerlin@intel.com>

drm/i915: clear error registers after error capture

We need to clear the register in order to get correct value after the
next potential hang.

v2: Centralize error register clearing in i915_irq.c (Chris)

v3: Don't read gen8 register on < gen6 (Chris)

v4: Don't swap gen8+ & gen6+ code... (Chris)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180830132424.21940-1-lionel.g.landwerlin@intel.com


# 3d94361a 30-Jul-2018 Matthew Auld <matthew.auld@intel.com>

drm/i915/gtt: remove px_page

Entries will either be pointing to scratch or real PD, making the
px_page(pd) check pointless. Also since there are no other users of
px_page, just remove it.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180730120544.20784-1-matthew.auld@intel.com


# dd18cedf 27-Jul-2018 Jakub Bartmiński <jakub.bartminski@intel.com>

drm/i915/guc: Move the pin bias value from GuC to GGTT

Removing the pin bias from GuC allows us to not check for GuC every time
we pin a context, which fixes the assertion error on unresolved GuC
platform default in mock contexts selftest.

It also seems that we were using uninitialized WOPCM variables when
setting the GuC pin bias. The pin bias has to be set after the WOPCM,
but before the call to i915_gem_contexts_init where the first contexts
are pinned.

v2:
This also makes it so that there's no need to set GuC variables from
within the WOPCM init function or to move the WOPCM init, while keeping
the correct initialization order. Also for mock tests the pin bias is
left at 0 and we make sure that the pin bias with GuC will not be
smaller than without GuC.

v3:
Avoid unused i915 in intel_guc_ggtt_offset if debug is disabled.

v4:
Squash with WOPCM init reordering.
Moved the i915_ggtt_pin_bias helper to this patch, and made some
functions use it instead of directly dereferencing i915->ggtt.

v5:
Since we now don't use wopcm.guc.base for the pin bias there's no need to
validate it. It also has already been verified in WOPCM init.

v6:
Deleted the now unnecessarily introduced includes from previous versions.
Dropped naming changes from dev_priv to i915 for better patch readability.

v7:
Changed some comments to make more sense in the context they're in.

v8:
Moved and renamed the function which now returns the wopcm.guc.size to
intel_guc.c:intel_guc_reserved_gtt_size to avoid any possible confusion
with the pin_bias in ggtt, which should be used for pinning.
Fixed patch not applying or the most recent upstream.

Fixes: f7dc0157e4b5 ("drm/i915/uc: Fetch GuC/HuC firmwares from guc/huc specific init")
Testcase: igt/drv_selftest/mock_contexts #GuC
Signed-off-by: Jakub Bartmiński <jakub.bartminski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180727141148.30874-3-jakub.bartminski@intel.com


# 35e90081 20-Jul-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Suppress assertion for i915_ggtt_disable_guc

Another step in the drv_module_reload fault-injection saga, is that we
try to disable the guc twice. Probably. It's a little unclear exactly
what is going on in the unload sequence that catches us out, so for the
time being suppress the assertion to get the test re-enabled.

Testcase: igt/drv_module_reload/basic-reload-inject
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Acked-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180720095144.5885-1-chris@chris-wilson.co.uk


# 5f9c4f95 17-Jul-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Full ppgtt everywhere, no excuses

We believe we have all the kinks worked out, even for the early
Valleyview devices, for whom we currently disable all ppgtt.

References: 62942ed7279d ("drm/i915/vlv: disable PPGTT on early revs v3")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180717095751.1034-2-chris@chris-wilson.co.uk


# 79556df2 17-Jul-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Enable full-ppgtt by default everywhere

We should we have all the kinks worked out and full-ppgtt now works
reliably on gen7 (Ivybridge, Valleyview/Baytrail and Haswell). If we can
let userspace have full control over their own ppgtt, it makes softpinning
far more effective, in turn making GPU dispatch far more efficient by
virtue of better mm segregation. On the other hand, switching over to a
different GTT for every client does incur noticeable overhead, but only
for very lightweight tasks.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20180717095751.1034-1-chris@chris-wilson.co.uk


# 3e977ac6 12-Jul-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Prevent writing into a read-only object via a GGTT mmap

If the user has created a read-only object, they should not be allowed
to circumvent the write protection by using a GGTT mmapping. Deny it.

Also most machines do not support read-only GGTT PTEs, so again we have
to reject attempted writes. Fortunately, this is known a priori, so we
can at least reject in the call to create the mmap (with a sanity check
in the fault handler).

v2: Check the vma->vm_flags during mmap() to allow readonly access.
v3: Remove VM_MAYWRITE to curtail mprotect()

Testcase: igt/gem_userptr_blits/readonly_mmap*
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: David Herrmann <dh.herrmann@gmail.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com> #v1
Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180712185315.3288-4-chris@chris-wilson.co.uk


# c9e66688 12-Jul-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Disable read-only support under GVT

GVT is not propagating the PTE bits, and is always setting the
read-write bit, thus breaking read-only support.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180712185315.3288-3-chris@chris-wilson.co.uk


# 250f8c81 12-Jul-2018 Jon Bloomfield <jon.bloomfield@intel.com>

drm/i915/gtt: Read-only pages for insert_entries on bdw+

Hook up the flags to allow read-only ppGTT mappings for gen8+

v2: Include a selftest to check that writes to a readonly PTE are
dropped
v3: Don't duplicate cpu_check() as we can just reuse it, and even worse
don't wholesale copy the theory-of-operation comment from igt_ctx_exec
without changing it to explain the intention behind the new test!
v4: Joonas really likes magic mystery values

Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180712185315.3288-2-chris@chris-wilson.co.uk


# 25dda4da 12-Jul-2018 Jon Bloomfield <jon.bloomfield@intel.com>

drm/i915/gtt: Add read only pages to gen8_pte_encode

We can set a bit inside the ppGTT PTE to indicate a page is read-only;
writes from the GPU will be discarded. We can use this to protect pages
and in particular support read-only userptr mappings (necessary for
importing PROT_READ vma).

Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180712185315.3288-1-chris@chris-wilson.co.uk


# 19bb33c7 11-Jul-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Introduce i915_address_space.mutex

Add a mutex into struct i915_address_space to be used while operating on
the vma and their lists for a particular vm. As this may be called from
the shrinker, we taint the mutex with fs_reclaim so that from the start
lockdep warns us if we are caught holding the mutex across an
allocation. (With such small steps we will eventually rid ourselves of
struct_mutex recursion!)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20180711073608.20286-2-chris@chris-wilson.co.uk


# ec625fb9 09-Jul-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Provide a timeout to i915_gem_wait_for_idle()

Usually we have no idea about the upper bound we need to wait to catch
up with userspace when idling the device, but in a few situations we
know the system was idle beforehand and can provide a short timeout in
order to very quickly catch a failure, long before hangcheck kicks in.

In the following patches, we will use the timeout to curtain two overly
long waits, where we know we can expect the GPU to complete within a
reasonable time or declare it broken.

In particular, with a broken GPU we expect it to fail during the initial
GPU setup where do a couple of context switches to record the defaults.
This is a task that takes a few milliseconds even on the slowest of
devices, but we may have to wait 60s for hangcheck to give in and
declare the machine inoperable. In this a case where any gpu hang is
unacceptable, both from a timeliness and practical standpoint.

The other improvement is that in selftests, we do not need to arm an
independent timer to inject a wedge, as we can just limit the timeout on
the wait directly.

v2: Include the timeout parameter in the trace.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180709122044.7028-1-chris@chris-wilson.co.uk


# 5c3f8c22 06-Jul-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Track vma activity per fence.context, not per engine

In the next patch, we will want to be able to use more flexible request
timelines that can hop between engines. From the vma pov, we can then
not rely on the binding of this request to an engine and so can not
ensure that different requests are ordered through a per-engine
timeline, and so we must track activity of all timelines. (We track
activity on the vma itself to prevent unbinding from HW before the HW
has finished accessing it.)

v2: Switch to a rbtree for 32b safety (since using u64 as a radixtree
index is fraught with aliasing of unsigned longs).
v3: s/lookup_active/active_instance/ because we can never agree on names

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180706103947.15919-5-chris@chris-wilson.co.uk


# 66daec6b 06-Jul-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Control cache domain of dma_map_page() directly

We already maually control the CPU cache for our page table directories,
so we can tell the dma mapper to skip doing it as well.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180706122611.4142-2-chris@chris-wilson.co.uk


# 58174eac 06-Jul-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Suppress warnings for dma_map_page

As we propagate back the error to the caller for them to handle, we do
not need the lowest level spitting out a redundant warning upon an
allocation failure inside dma_map_page().

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180706122611.4142-1-chris@chris-wilson.co.uk


# cef08fdc 05-Jul-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove defunct i915->vm_list

No longer used and can be removed. One less global that currently
demands struct_mutex protection.

References: e9e7dc4144cd ("drm/i915/gtt: Make gen6 page directories evictable")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180705065653.20449-1-chris@chris-wilson.co.uk


# 63fd659f 04-Jul-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Pull global wc page stash under its own locking

Currently, the wc-stash used for providing flushed WC pages ready for
constructing the page directories is assumed to be protected by the
struct_mutex. However, we want to remove this global lock and so must
install a replacement global lock for accessing the global wc-stash (the
per-vm stash continues to be guarded by the vm).

We need to push ahead on this patch due to an oversight in hastily
removing the struct_mutex guard around the igt_ppgtt_alloc selftest. No
matter, it will prove very useful (i.e. will be required) in the near
future.

v2: Restore the onstack stash so that we can drop the vm->mutex in
future across the allocation.
v3: Restore the lost pagevec_init of the onstack allocation, and repaint
function names.
v4: Reorder init so that we don't try and use i915_address_space before
it is ininitialised.

Fixes: 1f6f00238abf ("drm/i915/selftests: Drop struct_mutex around lowlevel pggtt allocation")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180704185518.4193-1-chris@chris-wilson.co.uk


# a0fbacb5 14-Jun-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Reduce a pair of runtime asserts

We can stop asserting using WARN_ON as given sufficient CI coverage, we
can rely on using GEM_BUG_ON() to catch problems before merging.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180614184218.1606-2-chris@chris-wilson.co.uk


# 986dbac4 14-Jun-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Cache the PTE encoding of the scratch page

As the most frequent PTE encoding is for the scratch page, cache it upon
creation.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180614184218.1606-1-chris@chris-wilson.co.uk


# 4a192c7e 14-Jun-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Free unused page tables on unbind the context

As we cannot reliably change used page tables while the context is
active, the earliest opportunity we have to recover excess pages is when
the context becomes idle. So whenever we unbind the context (it must be
idle, and indeed being evicted) free the unused ptes.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180614134315.5900-2-chris@chris-wilson.co.uk


# 549fe88b 14-Jun-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Lazily allocate page directories for gen7

As we were only supporting aliasing_ppgtt on gen7 for some time, we
saved a few checks by preallocating the page directories on creation.
However, since we need 2MiB of page directories for each ppgtt, to
support arbitrary numbers of user contexts, we need to be more prudent
in our allocations, and defer the page allocation until it is used. We
don't recover unused pages yet as we found that doing so on the fly
(i.e. altering TLB entries) would confuse the GPU.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180614134315.5900-1-chris@chris-wilson.co.uk


# a2bbf714 14-Jun-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Only keep gen6 page directories pinned while active

In order to be able to evict the gen6 ppgtt, we have to unpin it at some
point. We can simply use our context activity tracking to know when the
ppgtt is no longer in use by hardware, and so only keep it pinned while
being used a request.

For the kernel_context (and thus aliasing_ppgtt), it remains pinned at
all times, as the kernel_context itself is pinned at all times.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180614094103.18025-5-chris@chris-wilson.co.uk


# e9e7dc41 12-Jun-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Make gen6 page directories evictable

Currently all page directories are bound at creation using an
unevictable node in the GGTT. This severely limits us as we cannot
remove any inactive ppgtt for new contexts, or under aperture pressure.
To fix this we need to make the page directory into a first class and
unbindable vma. Hence, the creation of a custom vma to wrap the page
directory as opposed to a GEM object.

In this patch, we leave the page directories pinned upon creation.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180612120446.13901-1-chris@chris-wilson.co.uk


# a9ded785 12-Jun-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Onionify error handling for gen6_ppgtt_create

Pull the empty stubs together into the top level gen6_ppgtt_create, and
tear each one down on error in proper onion order (rather than use
Joonas' pet hate of calling the cleanup function in indeterminable
state).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180612081815.3585-2-chris@chris-wilson.co.uk


# 35ac40d8 12-Jun-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Subclass gen6_hw_ppgtt

The legacy gen6 ppgtt needs a little more hand holding than gen8+, and
so requires a larger structure. As I intend to make this slightly more
complicated in the future, separate the gen6 from the core gen8 hw
struct by subclassing. This patch moves the gen6 only features out to
gen6_hw_ppgtt and pipes the new type everywhere that needs it.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180612081815.3585-1-chris@chris-wilson.co.uk


# 68a85703 11-Jun-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Invalidate GGTT caches after writing the gen6 page directories

When we update the gen6 ppgtt page directories, we do so by writing the
new address into a reserved slot in the GGTT. It appears that when the
GPU reads that entry from the gsm, it uses its small cache and that we
need to invalidate that cache after writing. We don't see an issue
currently as we prefill the ppgtt page directories on creation; and only
create the single aliasing_ppgtt long before we start using the GGTT
(and so before the cache may have a conflicting entry).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180611171825.13678-1-chris@chris-wilson.co.uk


# b3ee09a4 10-Jun-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/ringbuffer: Fix context restore upon reset

The discovery with trying to enable full-ppgtt was that we were
completely failing to the load both the mm and context following the
reset. Although we were performing mmio to set the PP_DIR (per-process
GTT) and CCID (context), these were taking no effect (the assumption was
that this would trigger reload of the context and restore the page
tables). It was not until we performed the LRI + MI_SET_CONTEXT in a
following context switch would anything occur.

Since we are then required to reset the context image and PP_DIR using
CS commands, we place those commands into every batch. The hardware
should recognise the no-ops and eliminate the expensive context loads,
but we still have to pay the cost of using cross-powerwell register
writes. In practice, this has no effect on actual context switch times,
and only adds a few hundred nanoseconds to no-op switches. We can improve
the latter by eliminating the w/a around known no-op switches, but there
is an ulterior motive to keeping them.

Always emitting the context switch at the beginning of the request (and
relying on HW to skip unneeded switches) does have one key advantage.
Should we implement request reordering on Haswell, we will not know in
advance what the previous executing context was on the GPU and so we
would not be able to elide the MI_SET_CONTEXT commands ourselves and
always have to emit them. Having our hand forced now actually prepares
us for later.

Now since that context and mm follow the request, we no longer (and not
for a long time since requests took over!) require a trace point to tell
when we write the switch into the ring, since it is always. (This is
even more important when you remember that simply writing into the ring
bears no relation to the current mm.)

v2: Sandybridge has to agree to use LRI as well.

Testcase: igt/drv_selftests/live_hangcheck
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180611110845.31890-1-chris@chris-wilson.co.uk


# eed28903 09-Jun-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Reorder aliasing_ppgtt fini

To allow ourselves to use a first class vma for the aliasing_ppgtt page
directory, we have to reorder the shutdown on module unload to remove
and unpin the aliasing_ppgtt before complaining about any objects left
in the GGTT.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180609090151.22007-1-chris@chris-wilson.co.uk


# e1f87898 08-Jun-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Remove vgpu check for gen6

Since vgpu is not supported on Haswell or any other gen6/7, we do not
need to check and act upon it's enablement.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180608150435.15010-2-chris@chris-wilson.co.uk


# f6b1e35f 08-Jun-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Remove redundant hsw_mm_switch()

hsw_mm_switch() and gen7_mm_switch() are identical, so let's remove the
redundant specialism.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180608150435.15010-1-chris@chris-wilson.co.uk


# b4e2727d 08-Jun-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Fix unwind length passed to gen6_ppgtt_clear_range

When we want to unwind an error when allocating the PD for gen6, we call
gen6_ppgtt_clear_range() telling to clear upto the PD we've previously
allocated. However, we passed it an incorrect length, passing it the
endpoint instead. Fortunately, as the start was always 0, this has no
impact today, but tomorrow we want to start using non-zero origins.

Reported-by: Matthew Auld <matthew.william.auld@gmail.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180608173221.10455-1-chris@chris-wilson.co.uk


# 17f297b4 07-Jun-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Push allocation to hw ppgtt constructor

In the next patch, we will subclass the gen6 hw_ppgtt. In order, for the
two different generations of hw ppgtt stucts to be of different size,
push the allocation down to the constructor.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180607163040.9781-1-chris@chris-wilson.co.uk


# 93f2cde2 07-Jun-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Decouple vma vfuncs from vm

To allow for future non-object backed vma, we need to be able to
specialise the callbacks for binding, et al, the vma. For example,
instead of calling vma->vm->bind_vma(), we now call
vma->ops->bind_vma(). This gives us the opportunity to later override the
operation for a custom vma.

v2: flip order of unbind/bind

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180607154047.9171-2-chris@chris-wilson.co.uk


# 520ea7c5 07-Jun-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Prepare for non-object vma

In order to allow ourselves to use VMA to wrap other entities other than
GEM objects, we need to allow for the vma->obj backpointer to be NULL.
In most cases, we know we are operating on a GEM object and its vma, but
we need the core code (such as i915_vma_pin/insert/bind/unbind) to work
regardless of the innards.

The remaining eyesore here is vma->obj->cache_level and related (but
less of an issue) vma->obj->gt_ro. With a bit of care we should mirror
those on the vma itself.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180607154047.9171-1-chris@chris-wilson.co.uk


# 64b3c936 06-Jun-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Fix typo in fill_px() macro

The macro declared the ppgtt parameter but implicitly used the local vm
instead.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180606205128.25952-1-chris@chris-wilson.co.uk


# 82ad6443 05-Jun-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Rename i915_hw_ppgtt base member

In the near future, I want to subclass gen6_hw_ppgtt as it contains a
few specialised members and I wish to add more. To avoid the ugliness of
using ppgtt->base.base, rename the i915_hw_ppgtt base member
(i915_address_space) as vm, which is our common shorthand for an
i915_address_space local.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180605153758.18422-1-chris@chris-wilson.co.uk


# 74479985 05-Jun-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Teach restore-gtt to walk the ggtt vma list not the object list

In preparation, for having non-vma objects stored inside the ggtt, to
handle restoration of the GGTT following resume, we need to walk over
the ggtt address space rebinding vma, as opposed to walking over bound
objects looking for ggtt entries.

v2: Skip objects only bound for the aliasing_ppgtt

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> #v1
Link: https://patchwork.freedesktop.org/patch/msgid/20180605082856.19221-1-chris@chris-wilson.co.uk


# 30aacd3f 04-Jun-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Remove obsolete switch_mm hooks for gen8+

As the ppgtt for execlists is tightly coupled to the executing context,
and not switch separately, we no longer use the ppgtt->switch_mm hooks
on gen8+. Remove them.

References: 79e6770cb1f5 ("drm/i915: Remove obsolete ringbuffer emission for gen8+")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180604131552.29370-1-chris@chris-wilson.co.uk


# 3df845e7 01-Jun-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Don't restore the non-existent PDE for GGTT

On resume, we have to rewrite all the PDE entries for gen7 ppgtts. If we
switch on full-ppgtt, there is then one address space with no PDE, the
GGTT. Currently under aliasing-ppgtt, the GGTT address space does have
an associated ppgtt and so the restore works just fine. We would have a
similar problem if we tried disabling aliasing-ppgtt
(i915.enable_ppgtt=0). So skip the empty ppgtt, as being non-existent it
doesn't need restoring.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180601093554.13083-2-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# 37800ca8 01-Jun-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Avoid calling non-existent allocate_va_range

On hsw and older, we do not need to allocate the ppgtt on the fly and so
ppgtt->allocate_va_range() is NULL. Fixup ppgtt_bind_vma not to call it,
in that case!

v2: PIN_UPDATE still exists.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180601093554.13083-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180601093554.13083-2-chris@chris-wilson.co.uk


# eb479f86 21-May-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Limit searching for PIN_HIGH

To no surprise (since we've flip-flopped over the use of PIN_HIGH a few
times), doing a search by address over a pathologically fragmented
address space is exceeding slow. To protect ourselves from nearly
unbounded latency (think searching a million holes while under
struct_mutex), limit the search for the highest available hole and
fallback to best-fit if it fails.

In the pathologically fragmented case, such as igt/gem_ctx_thrash, the
effect is dramatic, bringing the runtime down from hours to seconds
(depending on how many other slow searches you hit, e.g. alloc_iova()
and alloc_vmap_area() both degrade to a slow rbtree walk after their
small cache is exhausted). For the real world, the number of search
steps is unlikely to be significant as we should only need to search
once per new context.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180521082131.13744-3-chris@chris-wilson.co.uk


# 1abb70f5 22-May-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Allow pagedirectory allocations to fail

As we handle the allocation failure of the page directory and tables by
propagating the failure back to userspace, allow it to fail if direct
reclaim is unable to satisfy the request (i.e. disable the oomkiller).
The premise being that if we are unable to allocate a single page for
the pagetable, we will not be able to handle the multitude of pages
required for the gfx operation and we should back off to allow the
system to recover.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106609
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180522083643.29601-1-chris@chris-wilson.co.uk


# f79401b4 11-May-2018 Matthew Auld <matthew.auld@intel.com>

drm/i915/selftests: scrub 64K

We write all 4K page entries, even when using 64K pages. In order to
verify that the HW isn't cheating by using the 4K PTE instead of the 64K
PTE, we want to remove all the surplus entries. If the HW skipped the
64K PTE, it will read/write into the scratch page instead - which we
detect as missing results during selftests.

v2: much improved commentary (Chris)

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Changbin Du <changbin.du@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180511095140.25590-1-matthew.auld@intel.com


# cc38cae7 08-May-2018 Oscar Mateo <oscar.mateo@intel.com>

drm/i915/icl: Introduce initial Icelake Workarounds

Inherit workarounds from previous platforms that are still valid for
Icelake.

v2: GEN7_ROW_CHICKEN2 is masked
v3:
- Since it has been fixed already in upstream, removed the TODO
comment about WA_SET_BIT for WaInPlaceDecompressionHang.
- Squashed with this patch:
drm/i915/icl: add icelake_init_clock_gating()
from Paulo Zanoni <paulo.r.zanoni@intel.com>
- Squashed with this patch:
drm/i915/icl: WaForceEnableNonCoherent
from Oscar Mateo <oscar.mateo@intel.com>
- WaPushConstantDereferenceHoldDisable is now Wa_1604370585 and
applies to B0 as well.
- WaPipeControlBefore3DStateSamplePattern WABB was being applied
to ICL incorrectly.
v4:
- Wrap the commit message
- s/dev_priv/p to please checkpatch
v5: Rebased on top of the WA refactoring
v6: Rebased on top of further whitelist registers refactoring (Michel)
v7: Added WaRsForcewakeAddDelayForAck
v8: s/ICL_HDC_CHICKEN0/ICL_HDC_MODE (Mika)
v9:
- C, not lisp (Chris)
- WaIncreaseDefaultTLBEntries is the same for GEN > 9_LP (Tvrtko)

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Tomasz Lis <tomasz.lis@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1525814984-20039-2-git-send-email-oscar.mateo@intel.com


# ca6acc25 08-May-2018 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/gtt: Trust the uncached store to flush wcb

Not all architectures guarantee that uncached read will
flush the write combining buffer. So marking it explicitly
is recommended [1].

However we know the architecture we are operating on
and can avoid wmb as the UC store will flush the wcb [2].

Omit the wmb() before invalidate as redudant.

v2: squash combining and removal (Chris)
v3: remove obsolete comments about posting reads (Chris)

References: http://yarchive.net/comp/linux/write_combining.html [1]
References: http://download.intel.com/design/PentiumII/applnots/24442201.pdf [2]
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180508124154.14586-1-mika.kuoppala@linux.intel.com


# c258f91d 03-May-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Tidy up duplicate branches in gen8_gmch_probe()

Following commit f773568b6ff8 ("drm/i915: nuke the duplicated stolen
discovery"), the if-else-chain for determining the GTT size is redundant
with the !chv branches all being the same.

Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
References: f773568b6ff8 ("drm/i915: nuke the duplicated stolen discovery")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180503212956.3948-1-chris@chris-wilson.co.uk


# 3365e226 03-May-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Lazily unbind vma on close

When userspace is passing around swapbuffers using DRI, we frequently
have to open and close the same object in the foreign address space.
This shows itself as the same object being rebound at roughly 30fps
(with a second object also being rebound at 30fps), which involves us
having to rewrite the page tables and maintain the drm_mm range manager
every time.

However, since the object still exists and it is only the local handle
that disappears, if we are lazy and do not unbind the VMA immediately
when the local user closes the object but defer it until the GPU is
idle, then we can reuse the same VMA binding. We still have to be
careful to mark the handle and lookup tables as closed to maintain the
uABI, just allowing the underlying VMA to be resurrected if the user is
able to access the same object from the same context again.

If the object itself is destroyed (neither userspace keeping a handle to
it), the VMA will be reaped immediately as usual.

In the future, this will be even more useful as instantiating a new VMA
for use on the GPU will become heavier. A nuisance indeed, so nip it in
the bud.

v2: s/__i915_vma_final_close/i915_vma_destroy/ etc.
v3: Leave a hint as to why we deferred the unbind on close.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180503195115.22309-1-chris@chris-wilson.co.uk


# 65fcb806 02-May-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Move timeline from GTT to ring

In the future, we want to move a request between engines. To achieve
this, we first realise that we have two timelines in effect here. The
first runs through the GTT is required for ordering vma access, which is
tracked currently by engine. The second is implied by sequential
execution of commands inside the ringbuffer. This timeline is one that
maps to userspace's expectations when submitting requests (i.e. given the
same context, batch A is executed before batch B). As the rings's
timelines map to userspace and the GTT timeline an implementation
detail, move the timeline from the GTT into the ring itself (per-context
in logical-ring-contexts/execlists, or a global per-engine timeline for
the shared ringbuffers in legacy submission.

The two timelines are still assumed to be equivalent at the moment (no
migrating requests between engines yet) and so we can simply move from
one to the other without adding extra ordering.

v2: Reinforce that one isn't allowed to mix the engine execution
timeline with the client timeline from userspace (on the ring).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180502163839.3248-1-chris@chris-wilson.co.uk


# e61e0f51 21-Feb-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Rename drm_i915_gem_request to i915_request

We want to de-emphasize the link between the request (dependency,
execution and fence tracking) from GEM and so rename the struct from
drm_i915_gem_request to i915_request. That is we may implement the GEM
user interface on top of requests, but they are an abstraction for
tracking execution rather than an implementation detail of GEM. (Since
they are not tied to HW, we keep the i915 prefix as opposed to intel.)

In short, the spatch:
@@

@@
- struct drm_i915_gem_request
+ struct i915_request

A corollary to contracting the type name, we also harmonise on using
'rq' shorthand for local variables where space if of the essence and
repetition makes 'request' unwieldy. For globals and struct members,
'request' is still much preferred for its clarity.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180221095636.6649-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# 62d4028f 15-Feb-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Convert WARN_ON to GEM debugging

As we presume that we have sufficient coverage of CI for new machines
and new code paths, we do not need to have user impacting WARN_ON for
programming errors inside i915_gem_gtt.c, so convert those over to
GEM_BUG_ON. This leaves the memory debugging WARN_ON in place as they
are not so easy to exercise with CI.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180215110759.28603-1-chris@chris-wilson.co.uk


# e53792f4 12-Feb-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Replace open-coded memset_p()

When initialising the page directories, we set the GTT entries and the
tree to the scratch page. We have already replaced the DMA fill with
memset64(), but we can similarly use memset_p() to set the pointer array.

References: 4dd504f7d98a ("drm/i915: Use memset64() to prefill the GTT page")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180212133118.16443-1-chris@chris-wilson.co.uk


# c56b89f1 09-Feb-2018 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Use INTEL_GEN everywhere

Coccinelle patch:

@@
identifier p;
@@
-INTEL_INFO(p)->gen
+INTEL_GEN(p)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180208130606.15556-12-tvrtko.ursulin@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180209215847.6660-1-chris@chris-wilson.co.uk


# 751b01cb 31-Jan-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/ppgtt: Pin page directories before allocation

Commit e2b763caa6eb ("drm/i915: Remove bitmap tracking for used-pdpes")
believed that because it did not insert its freshly allocated page
directory into the pd tree, it was safe from the shrinker. I failed to
heed the lesson learnt from commit dd19674bacba ("drm/i915: Remove bitmap
tracking for used-ptes") that we need to pin all the levels in the tree
before hitting the shrinker or else the shrinker may free an upper layer
as we proceed to allocate the tree. Thus leaving dangling pointers
everywhere and a GPF should we hit direct reclaim at just the wrong
moment.

CPU: 0 PID: 7374 Comm: chromium Tainted: P O 4.14.13-1-ARCH #1
Hardware name: Apple Inc. MacBookPro12,1/Mac-E43C1C25D4880AD6, BIOS MBP121.88Z.0167.B33.1706181928 06/18/2017
task: ffff994f696c2c40 task.stack: ffffb1a789d4c000
RIP: 0010:gen8_ppgtt_set_pde.isra.40+0x48/0x70 [i915]
RSP: 0018:ffffb1a789d4f940 EFLAGS: 00010206
RAX: 81c1788cc4f68138 RBX: ffff994f54db8000 RCX: ffff994f696c2c40
RDX: 000000023bc73003 RSI: ffff994d598b6b80 RDI: ffff994f54db8000
RBP: ffff994d598b6b80 R08: 0000000000000000 R09: 0000000000000000
R10: ffffb1a789d4f550 R11: ffff994eaf3c3208 R12: 0000000000000027
R13: 0000000000005000 R14: 0000000004e8f000 R15: ffff994f54dba000
FS: 00007f585886aa00(0000) GS:ffff994faec00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000004ac8e8 CR3: 00000002552c8004 CR4: 00000000003606f0
Call Trace:
gen8_ppgtt_alloc_pdp+0x178/0x320 [i915]
gen8_ppgtt_alloc_4lvl+0x5f/0x150 [i915]
ppgtt_bind_vma+0x30/0x70 [i915]
i915_vma_bind+0x68/0xd0 [i915]
__i915_vma_do_pin+0x2d6/0x3a0 [i915]
eb_lookup_vmas+0x7a2/0xb50 [i915]
i915_gem_do_execbuffer+0x4d7/0x10e0 [i915]
? sock_wfree+0x34/0x60
? unix_stream_read_generic+0x1f9/0x7e0
? import_iovec+0x37/0xd0
? i915_gem_execbuffer2+0x5d/0x390 [i915]
i915_gem_execbuffer2+0x1b7/0x390 [i915]
? i915_gem_execbuffer+0x2d0/0x2d0 [i915]
drm_ioctl_kernel+0x59/0xb0 [drm]
drm_ioctl+0x2d5/0x370 [drm]
? i915_gem_execbuffer+0x2d0/0x2d0 [i915]
? __seccomp_filter+0x3b/0x260
do_vfs_ioctl+0xa1/0x610
? syscall_trace_enter+0xdb/0x2b0
SyS_ioctl+0x74/0x80
do_syscall_64+0x55/0x110
entry_SYSCALL64_slow_path+0x25/0x25
RIP: 0033:0x7f584fa82d27
RSP: 002b:00007ffee14a7828 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 000003b0126a1030 RCX: 00007f584fa82d27
RDX: 00007ffee14a7870 RSI: 0000000040406469 RDI: 0000000000000080
RBP: 00007ffee14a7870 R08: 0000000000000002 R09: 0000000000000077
R10: 00007f5839f2b780 R11: 0000000000000246 R12: 0000000040406469
R13: 0000000000000080 R14: 00007f5842b00040 R15: 0000000000000000
Code: 01 00 83 81 58 0a 00 00 01 48 2b 05 13 9d fd c9 48 c1 f8 06 48 c1 e0 0c 48 8d 04 d0 48 8b 56 08 48 03 05 0c 9d fd c9 48 83 ca 03 <48> 89 10 83 a9 58 0a 00 00 01 65 ff 0d 37 03 fb 3e 74 02 f3 c3
RIP: gen8_ppgtt_set_pde.isra.40+0x48/0x70 [i915] RSP: ffffb1a789d4f940

Reported-by: Eric Blau <eblau@eblau.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104773
Fixes: e2b763caa6eb ("drm/i915: Remove bitmap tracking for used-pdpes")
References: dd19674bacba ("drm/i915: Remove bitmap tracking for used-ptes")
Testcase: igt/drv_selftest/live_gtt (igt_ppgtt_shrink_boom)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180131214440.7141-1-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
(cherry picked from commit b715a2f0c7714a399e7f8e951cc8dea9cd4eeb4b)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# 124804c4 21-Jan-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Protect WC stash allocation against direct reclaim

As we attempt to allocate pages for use in a new WC stash, direct
reclaim may run underneath us and fill up the WC stash. We have to be
careful then not to overflow the pvec.

Fixes: 66df1014efba ("drm/i915: Keep a small stash of preallocated WC pages")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103109
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180121173143.17090-1-chris@chris-wilson.co.uk
(cherry picked from commit 073cd7816685ac77c6d8b4d321a5586c9177b76a)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# 25da77f8 22-Dec-2017 Oscar Mateo <oscar.mateo@intel.com>

drm/i915: Stop getting the fault address from RING_FAULT_REG

This register does not contain it. Instead, we have to look into FAULT_TLB_DATA0 & 1
(where, by the way, we can also get the address space).

v2: Right formatting
v3:
- Use 12 (as per the register format) instead of PAGE_SIZE (Chris)
- s/BITS_44_TO_47/HIGHBITS (Chris)
- Right formatting, this time for real

Fixes: b03ec3d67ab8 ("drm/i915: There is only one fault register from GEN8 onwards")
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/1513982329-32191-1-git-send-email-oscar.mateo@intel.com
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
(cherry picked from commit 5a3f58dfd1420347be838546e00a4a9e847bda30)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# b715a2f0 31-Jan-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/ppgtt: Pin page directories before allocation

Commit e2b763caa6eb ("drm/i915: Remove bitmap tracking for used-pdpes")
believed that because it did not insert its freshly allocated page
directory into the pd tree, it was safe from the shrinker. I failed to
heed the lesson learnt from commit dd19674bacba ("drm/i915: Remove bitmap
tracking for used-ptes") that we need to pin all the levels in the tree
before hitting the shrinker or else the shrinker may free an upper layer
as we proceed to allocate the tree. Thus leaving dangling pointers
everywhere and a GPF should we hit direct reclaim at just the wrong
moment.

CPU: 0 PID: 7374 Comm: chromium Tainted: P O 4.14.13-1-ARCH #1
Hardware name: Apple Inc. MacBookPro12,1/Mac-E43C1C25D4880AD6, BIOS MBP121.88Z.0167.B33.1706181928 06/18/2017
task: ffff994f696c2c40 task.stack: ffffb1a789d4c000
RIP: 0010:gen8_ppgtt_set_pde.isra.40+0x48/0x70 [i915]
RSP: 0018:ffffb1a789d4f940 EFLAGS: 00010206
RAX: 81c1788cc4f68138 RBX: ffff994f54db8000 RCX: ffff994f696c2c40
RDX: 000000023bc73003 RSI: ffff994d598b6b80 RDI: ffff994f54db8000
RBP: ffff994d598b6b80 R08: 0000000000000000 R09: 0000000000000000
R10: ffffb1a789d4f550 R11: ffff994eaf3c3208 R12: 0000000000000027
R13: 0000000000005000 R14: 0000000004e8f000 R15: ffff994f54dba000
FS: 00007f585886aa00(0000) GS:ffff994faec00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000004ac8e8 CR3: 00000002552c8004 CR4: 00000000003606f0
Call Trace:
gen8_ppgtt_alloc_pdp+0x178/0x320 [i915]
gen8_ppgtt_alloc_4lvl+0x5f/0x150 [i915]
ppgtt_bind_vma+0x30/0x70 [i915]
i915_vma_bind+0x68/0xd0 [i915]
__i915_vma_do_pin+0x2d6/0x3a0 [i915]
eb_lookup_vmas+0x7a2/0xb50 [i915]
i915_gem_do_execbuffer+0x4d7/0x10e0 [i915]
? sock_wfree+0x34/0x60
? unix_stream_read_generic+0x1f9/0x7e0
? import_iovec+0x37/0xd0
? i915_gem_execbuffer2+0x5d/0x390 [i915]
i915_gem_execbuffer2+0x1b7/0x390 [i915]
? i915_gem_execbuffer+0x2d0/0x2d0 [i915]
drm_ioctl_kernel+0x59/0xb0 [drm]
drm_ioctl+0x2d5/0x370 [drm]
? i915_gem_execbuffer+0x2d0/0x2d0 [i915]
? __seccomp_filter+0x3b/0x260
do_vfs_ioctl+0xa1/0x610
? syscall_trace_enter+0xdb/0x2b0
SyS_ioctl+0x74/0x80
do_syscall_64+0x55/0x110
entry_SYSCALL64_slow_path+0x25/0x25
RIP: 0033:0x7f584fa82d27
RSP: 002b:00007ffee14a7828 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 000003b0126a1030 RCX: 00007f584fa82d27
RDX: 00007ffee14a7870 RSI: 0000000040406469 RDI: 0000000000000080
RBP: 00007ffee14a7870 R08: 0000000000000002 R09: 0000000000000077
R10: 00007f5839f2b780 R11: 0000000000000246 R12: 0000000040406469
R13: 0000000000000080 R14: 00007f5842b00040 R15: 0000000000000000
Code: 01 00 83 81 58 0a 00 00 01 48 2b 05 13 9d fd c9 48 c1 f8 06 48 c1 e0 0c 48 8d 04 d0 48 8b 56 08 48 03 05 0c 9d fd c9 48 83 ca 03 <48> 89 10 83 a9 58 0a 00 00 01 65 ff 0d 37 03 fb 3e 74 02 f3 c3
RIP: gen8_ppgtt_set_pde.isra.40+0x48/0x70 [i915] RSP: ffffb1a789d4f940

Reported-by: Eric Blau <eblau@eblau.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104773
Fixes: e2b763caa6eb ("drm/i915: Remove bitmap tracking for used-pdpes")
References: dd19674bacba ("drm/i915: Remove bitmap tracking for used-ptes")
Testcase: igt/drv_selftest/live_gtt (igt_ppgtt_shrink_boom)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180131214440.7141-1-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.auld@intel.com>


# 7fb9ee5d 29-Jan-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Simplify guard logic for setup_scratch_page()

Older gcc is complaining it can't follow the guards and thinks that
addr may be used uninitialised

In the process, we can simplify down to one loop,
add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-131 (-131)
Function old new delta
setup_scratch_page 545 414 -131

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180129102840.19901-1-chris@chris-wilson.co.uk


# 073cd781 21-Jan-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Protect WC stash allocation against direct reclaim

As we attempt to allocate pages for use in a new WC stash, direct
reclaim may run underneath us and fill up the WC stash. We have to be
careful then not to overflow the pvec.

Fixes: 66df1014efba ("drm/i915: Keep a small stash of preallocated WC pages")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103109
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180121173143.17090-1-chris@chris-wilson.co.uk


# 5a3f58df 22-Dec-2017 Oscar Mateo <oscar.mateo@intel.com>

drm/i915: Stop getting the fault address from RING_FAULT_REG

This register does not contain it. Instead, we have to look into FAULT_TLB_DATA0 & 1
(where, by the way, we can also get the address space).

v2: Right formatting
v3:
- Use 12 (as per the register format) instead of PAGE_SIZE (Chris)
- s/BITS_44_TO_47/HIGHBITS (Chris)
- Right formatting, this time for real

Fixes: b03ec3d67ab8 ("drm/i915: There is only one fault register from GEN8 onwards")
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/1513982329-32191-1-git-send-email-oscar.mateo@intel.com
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>


# 82e07602 04-Jan-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Pass DMA_ATTR_NO_WARN to dma_map_sg()

In some iommu, e.g. swiotlb, the available space can be quite limited.
So we employ a trial-and-error approach to seeing if our large
contiguous chunks can fit, and if that fails we try again with smaller
chunks after trying to free our own lazily allocated blobs. As we use a
trial-and-error approach, we do not want dma_map_sg() to emit a WARN of
its own accord, we want to gracefully report the error back to the caller
instead.

Note that our noisy culprit, swiotlb, doesn't honour the flag, yet.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180104163842.11635-1-chris@chris-wilson.co.uk


# aeb950bd 13-Dec-2017 Michał Winiarski <michal.winiarski@intel.com>

drm/i915/guc: Call invalidate after changing the vfunc

To make this operation a bit cleaner, we should make sure that the HW
can catch up by calling the new implementation right away.
Note that currently we're only touching the vfunc at module load time
(before GuC is even loaded), so this shouldn't cause any functional
changes.

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171213221352.7173-4-michal.winiarski@intel.com


# 1875fe7b 12-Dec-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Downgrade misleading "Memory usable" message

It never meant what it said, as it was always the total size of the
Global GTT and not a limit upon memory usage. Originally it served as a
quick guide to the largest batch that could be submitted by userspace,
an approximation to its maximum RSS, but was phrased badly. Today with
the 48b ppgtt, it is even more meaningless. Replace with a more specific
debug message; those wanting to know how much "video ram" they have
should consult the userspace libraries for the relevant approximation.

v2: Rebase

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171212113532.22574-1-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# b7128ef1 11-Dec-2017 Matthew Auld <matthew.auld@intel.com>

drm/i915: prefer resource_size_t for everything stolen

Keeps things consistent now that we make use of struct resource. This
should keep us covered in case we ever get huge amounts of stolen
memory.

v2: bunch of missing conversions (Chris)

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171211151822.20953-10-matthew.auld@intel.com


# 73ebd503 11-Dec-2017 Matthew Auld <matthew.auld@intel.com>

drm/i915: make mappable struct resource centric

Now that we are using struct resource to track the stolen region, it is
more convenient if we track the mappable region in a resource as well.

v2: prefer iomap and gmadr naming scheme
prefer DEFINE_RES_MEM

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171211151822.20953-8-matthew.auld@intel.com


# 77894226 11-Dec-2017 Matthew Auld <matthew.auld@intel.com>

drm/i915: make dsm struct resource centric

Now that we are using struct resource to track the stolen region, it is
more convenient if we track dsm in a resource as well.

v2: check range_overflow when writing to 32b registers (Chris)
pepper in some comments (Chris)
v3: refit i915_stolen_to_dma()
v4: kill ggtt->stolen_size
v5: some more polish

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171211151822.20953-6-matthew.auld@intel.com


# f773568b 11-Dec-2017 Matthew Auld <matthew.auld@intel.com>

drm/i915: nuke the duplicated stolen discovery

We duplicate the stolen discovery code in early-quirks and in i915,
however now that the stolen region is exported as a resource from
early-quirks we can nuke the duplication.

v2: check overflows_type

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171211151822.20953-5-matthew.auld@intel.com


# e2189dd0 07-Dec-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Refactor common list iteration over GGTT vma

In quite a few places, we have a list iteration over the vma on an
object that only want to inspect GGTT vma. By construction, these are
placed at the start of the list, so we have copied that knowledge into
many callsites. Pull that knowledge back to i915_vma.h and provide a
for_each_ggtt_vma() to tidy up the code.

v2: Add a backreference from vma_create() to remind ourselves why we put
ggtt vma at the head of the obj->vma_list (and ppgtt vma at the tail).
v3: Fixup s/vma/V/

Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20171207211407.31549-1-chris@chris-wilson.co.uk


# 93ffbe8e 06-Dec-2017 Michal Wajdeczko <michal.wajdeczko@intel.com>

drm/i915/guc: Introduce USES_GUC_xxx helper macros

In the upcoming patch we will change the way how to recognize
when GuC is in use. Using helper macros will minimize scope
of that changes. While here, update dev_info message.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171206135316.32556-3-michal.wajdeczko@intel.com


# 62d0fe45 22-Nov-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove success dmesg noise for intel_rotate_pages()

During selftesting intel_rotate_pages() is very, very verbose without
giving us any information. Suppress the noise.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171122145646.1859-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>


# fb5c551a 20-Nov-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove i915.enable_execlists module parameter

Execlists and legacy ringbuffer submission are no longer feature
comparable (execlists now offer greater functionality that should
overcome their performance hit) and obsoletes the unsafe module
parameter, i.e. comparing the two modes of execution is no longer
useful, so remove the debug tool.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> #i915_perf.c
Link: https://patchwork.freedesktop.org/patch/msgid/20171120205504.21892-1-chris@chris-wilson.co.uk


# 86679820 15-Nov-2017 Mel Gorman <mgorman@techsingularity.net>

mm, pagevec: remove cold parameter for pagevecs

Every pagevec_init user claims the pages being released are hot even in
cases where it is unlikely the pages are hot. As no one cares about the
hotness of pages being released to the allocator, just ditch the
parameter.

No performance impact is expected as the overhead is marginal. The
parameter is removed simply because it is a bit stupid to have a useless
parameter copied everywhere.

Link: http://lkml.kernel.org/r/20171018075952.10627-6-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 4667c2d5 15-Nov-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Initialise entry in intel_ppat_get() for older compilers

gcc-4.7.3 is confused by the guards inside intel_ppat_get() and reports:

drivers/gpu/drm/i915/i915_gem_gtt.c: In function ‘intel_ppat_get’:
drivers/gpu/drm/i915/i915_gem_gtt.c:3044:27: warning: ‘entry’ may be used uninitialized in this function [-Wmaybe-uninitialized]

Forgive the compiler this once, and rearrange the code so that entry is
always initialised.

v2: Flavour with a bit of NULL (instead of ERR_PTR(-ENOSPC))

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171115131705.16341-1-chris@chris-wilson.co.uk
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>


# b03ec3d6 13-Nov-2017 Michel Thierry <michel.thierry@intel.com>

drm/i915: There is only one fault register from GEN8 onwards

Until Haswell/Baytrail, the hardware used to have a per engine fault
register (e.g. 0x4094 - render fault register, 0x4194 - media fault
register and so on). But since Broadwell, all these registers were
combined into a singe one and the engine id stored in bits 14:12.

Not only we should not been reading (and writing to) registers that do
not exist, in platforms with VCS2 (SKL), the address that would belong
this engine (0x4494, VCS2_HW = 4) is already assigned to other register.

v2: use less controversial function names (Chris).
v3: make non-exported functions static, remove now obsolete check for
engine presence before posting_read (Chris).

References: IHD-OS-BDW-Vol 2c-11.15, page 75.
References: IHD-OS-SKL-Vol 2c-05.16, page 350.
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171113173628.11689-1-michel.thierry@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>


# 15e4cda9 09-Nov-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Mark vm_free_page() as a potential sleeper agent

vm_free_page() may call down into set_pages_array_wb() (which itself
sleeps, on x86 at least) but only if on !llc and the caches overflow.
Since this is unlikely, we only rarely trigger the error in practice,
and so to make CI detection of this sleeping bug possible we want to
mark the common vm_free_page() as a potential sleep.

Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
References: https://bugs.freedesktop.org/show_bug.cgi?id=103638
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20171109213450.13875-1-chris@chris-wilson.co.uk


# b58d4bef 23-Oct-2017 Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

drm/i915: Disable lazy PPGTT page table optimization for vGPU

When running under virtualization (vGPU active), we must disable
the lazy PPGTT page table initialization optimization introduced by
commit 14826673247e ("drm/i915: Only initialize partially filled
pagetables").

We must do this because GVT-g makes unduly assumptions about guest
behaviour, which this optimization breaks. This results in following
looking errors in the host:

ERROR gvt: guest page write error -22, gfn 0x7ada8, pa 0x7ada89a8, var 0x6, len 1

The real fix is to not to depend on i915 driver behaviour, but instead
either rely on only the contracts that i915 has with the hardware, or
add some paravirtualization. While the real fix is en route, it won't
be finished in time for 4.15, so the best option is to disable the
optimization for now when vGPU is active to avoid breaking 4.15 guests
in existing VM environments.

Fixes: 14826673247e ("drm/i915: Only initialize partially filled pagetables")
Suggested-by: Xiaolin Zhang <xiaolin.zhang@intel.com>
Signed-off-by: Xiaolin Zhang <xiaolin.zhang@intel.com>
[Joonas: Rewrote the commit message and added tags.]
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171023153209.10527-1-joonas.lahtinen@linux.intel.com
(cherry picked from commit 22a8a4fc93b14b5a8cfc785edbdc6f7bd98bffb6)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# 423a8a94 06-Nov-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Deconstruct struct sgt_dma initialiser

gcc-4.4 complains about:

struct sgt_dma iter = {
.sg = vma->pages->sgl,
.dma = sg_dma_address(iter.sg),
.max = iter.dma + iter.sg->length,
};

drivers/gpu/drm/i915/i915_gem_gtt.c: In function ‘gen8_ppgtt_insert_4lvl’:
drivers/gpu/drm/i915/i915_gem_gtt.c:938: error: ‘iter.sg’ is used uninitialized in this function
drivers/gpu/drm/i915/i915_gem_gtt.c:939: error: ‘iter.dma’ is used uninitialized in this function

and worse generates invalid code that triggers a GPF:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
IP: gen8_ppgtt_insert_4lvl+0x1b/0x1e0 [i915]
PGD 0
Oops: 0000 [#1] SMP
Modules linked in: snd_aloop nf_conntrack_ipv6 nf_defrag_ipv6 nf_log_ipv6 ip6table_filter ip6_tables ctr ccm xt_state nf_log_ipv4
nf_log_common xt_LOG xt_limit xt_recent xt_owner xt_addrtype iptable_filter ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat
nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c ip_tables dm_mod vhost_net macvtap macvlan vhost tun kvm_intel kvm
irqbypass uas usb_storage hid_multitouch btusb btrtl uvcvideo videobuf2_v4l2 videobuf2_core videodev media videobuf2_vmalloc videobuf2_memops
sg ppdev dell_wmi sparse_keymap mei_wdt sd_mod iTCO_wdt iTCO_vendor_support rtsx_pci_ms memstick rtsx_pci_sdmmc mmc_core dell_smm_hwmon hwmon
dell_laptop dell_smbios dcdbas joydev input_leds hci_uart btintel btqca btbcm bluetooth parport_pc parport i2c_hid
intel_lpss_acpi intel_lpss pcspkr wmi int3400_thermal acpi_thermal_rel dell_rbtn mei_me mei snd_hda_codec_hdmi snd_hda_codec_realtek
snd_hda_codec_generic ahci libahci acpi_pad xhci_pci xhci_hcd snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device
snd_pcm snd_timer snd soundcore int3403_thermal arc4 e1000e ptp pps_core i2c_i801 iwlmvm mac80211 rtsx_pci iwlwifi cfg80211 rfkill
intel_pch_thermal processor_thermal_device int340x_thermal_zone intel_soc_dts_iosf i915 video fjes
CPU: 2 PID: 2408 Comm: X Not tainted 4.10.0-rc5+ #1
Hardware name: Dell Inc. Latitude E7470/0T6HHJ, BIOS 1.11.3 11/09/2016
task: ffff880219fe4740 task.stack: ffffc90005f98000
RIP: 0010:gen8_ppgtt_insert_4lvl+0x1b/0x1e0 [i915]
RSP: 0018:ffffc90005f9b8c8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8802167d8000 RCX: 0000000000000001
RDX: 00000000ffff7000 RSI: ffff880219f94140 RDI: ffff880228444000
RBP: ffffc90005f9b948 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000080
R13: 0000000000000001 R14: ffffc90005f9bcd7 R15: ffff88020c9a83c0
FS: 00007fb53e1ee920(0000) GS:ffff88024dd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000010 CR3: 000000022ef95000 CR4: 00000000003406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
ppgtt_bind_vma+0x40/0x50 [i915]
i915_vma_bind+0xcb/0x1c0 [i915]
__i915_vma_do_pin+0x6e/0xd0 [i915]
i915_gem_execbuffer_reserve_vma+0x162/0x1d0 [i915]
i915_gem_execbuffer_reserve+0x4fc/0x510 [i915]
? __kmalloc+0x134/0x250
? i915_gem_wait_for_error+0x25/0x100 [i915]
? i915_gem_wait_for_error+0x25/0x100 [i915]
i915_gem_do_execbuffer+0x2df/0xa00 [i915]
? drm_malloc_gfp.clone.0+0x42/0x80 [i915]
? path_put+0x22/0x30
? __check_object_size+0x62/0x1f0
? terminate_walk+0x44/0x90
i915_gem_execbuffer2+0x95/0x1e0 [i915]
drm_ioctl+0x243/0x490
? handle_pte_fault+0x1d7/0x220
? i915_gem_do_execbuffer+0xa00/0xa00 [i915]
? handle_mm_fault+0x10d/0x2a0
vfs_ioctl+0x18/0x30
do_vfs_ioctl+0x14b/0x3f0
SyS_ioctl+0x92/0xa0
entry_SYSCALL_64_fastpath+0x1a/0xa9
RIP: 0033:0x7fb53b4fcb77
RSP: 002b:00007ffe0c572898 EFLAGS: 00003246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007fb53e17c038 RCX: 00007fb53b4fcb77
RDX: 00007ffe0c572900 RSI: 0000000040406469 RDI: 000000000000000b
RBP: 00007fb5376d67e0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000028 R11: 0000000000003246 R12: 0000000000000000
R13: 0000000000000000 R14: 000055eecb314d00 R15: 000055eecb315460
Code: 0f 84 5d ff ff ff eb a2 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 ec 58 0f 1f 44 00 00 31 c0 89 4d b0 <4c>
8b 60 10 44 8b 70 0c 48 89 d0 4c 8b 2e 48 c1 e8 27 25 ff 01
RIP: gen8_ppgtt_insert_4lvl+0x1b/0x1e0 [i915] RSP: ffffc90005f9b8c8
CR2: 0000000000000010

Recent gccs, such as 4.9, 6.3 or 7.2, do not generate the warning nor do
they explode on use. If we manually create the struct using locals from
the stack, this should eliminate this issue, and does not alter code
generation with gcc-7.2.

Fixes: 894ccebee2b0 ("drm/i915: Micro-optimise gen8_ppgtt_insert_entries()")
Reported-by: Kelly French <kfrench@federalhill.net>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Kelly French <kfrench@federalhill.net>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171106211128.12538-1-chris@chris-wilson.co.uk
Tested-by: Kelly French <kfrench@federalhill.net>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
(cherry picked from commit 5684514ba9dc6d7aa932cc53d97d866b2386221f)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# 5684514b 06-Nov-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Deconstruct struct sgt_dma initialiser

gcc-4.4 complains about:

struct sgt_dma iter = {
.sg = vma->pages->sgl,
.dma = sg_dma_address(iter.sg),
.max = iter.dma + iter.sg->length,
};

drivers/gpu/drm/i915/i915_gem_gtt.c: In function ‘gen8_ppgtt_insert_4lvl’:
drivers/gpu/drm/i915/i915_gem_gtt.c:938: error: ‘iter.sg’ is used uninitialized in this function
drivers/gpu/drm/i915/i915_gem_gtt.c:939: error: ‘iter.dma’ is used uninitialized in this function

and worse generates invalid code that triggers a GPF:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
IP: gen8_ppgtt_insert_4lvl+0x1b/0x1e0 [i915]
PGD 0
Oops: 0000 [#1] SMP
Modules linked in: snd_aloop nf_conntrack_ipv6 nf_defrag_ipv6 nf_log_ipv6 ip6table_filter ip6_tables ctr ccm xt_state nf_log_ipv4
nf_log_common xt_LOG xt_limit xt_recent xt_owner xt_addrtype iptable_filter ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat
nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c ip_tables dm_mod vhost_net macvtap macvlan vhost tun kvm_intel kvm
irqbypass uas usb_storage hid_multitouch btusb btrtl uvcvideo videobuf2_v4l2 videobuf2_core videodev media videobuf2_vmalloc videobuf2_memops
sg ppdev dell_wmi sparse_keymap mei_wdt sd_mod iTCO_wdt iTCO_vendor_support rtsx_pci_ms memstick rtsx_pci_sdmmc mmc_core dell_smm_hwmon hwmon
dell_laptop dell_smbios dcdbas joydev input_leds hci_uart btintel btqca btbcm bluetooth parport_pc parport i2c_hid
intel_lpss_acpi intel_lpss pcspkr wmi int3400_thermal acpi_thermal_rel dell_rbtn mei_me mei snd_hda_codec_hdmi snd_hda_codec_realtek
snd_hda_codec_generic ahci libahci acpi_pad xhci_pci xhci_hcd snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device
snd_pcm snd_timer snd soundcore int3403_thermal arc4 e1000e ptp pps_core i2c_i801 iwlmvm mac80211 rtsx_pci iwlwifi cfg80211 rfkill
intel_pch_thermal processor_thermal_device int340x_thermal_zone intel_soc_dts_iosf i915 video fjes
CPU: 2 PID: 2408 Comm: X Not tainted 4.10.0-rc5+ #1
Hardware name: Dell Inc. Latitude E7470/0T6HHJ, BIOS 1.11.3 11/09/2016
task: ffff880219fe4740 task.stack: ffffc90005f98000
RIP: 0010:gen8_ppgtt_insert_4lvl+0x1b/0x1e0 [i915]
RSP: 0018:ffffc90005f9b8c8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8802167d8000 RCX: 0000000000000001
RDX: 00000000ffff7000 RSI: ffff880219f94140 RDI: ffff880228444000
RBP: ffffc90005f9b948 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000080
R13: 0000000000000001 R14: ffffc90005f9bcd7 R15: ffff88020c9a83c0
FS: 00007fb53e1ee920(0000) GS:ffff88024dd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000010 CR3: 000000022ef95000 CR4: 00000000003406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
ppgtt_bind_vma+0x40/0x50 [i915]
i915_vma_bind+0xcb/0x1c0 [i915]
__i915_vma_do_pin+0x6e/0xd0 [i915]
i915_gem_execbuffer_reserve_vma+0x162/0x1d0 [i915]
i915_gem_execbuffer_reserve+0x4fc/0x510 [i915]
? __kmalloc+0x134/0x250
? i915_gem_wait_for_error+0x25/0x100 [i915]
? i915_gem_wait_for_error+0x25/0x100 [i915]
i915_gem_do_execbuffer+0x2df/0xa00 [i915]
? drm_malloc_gfp.clone.0+0x42/0x80 [i915]
? path_put+0x22/0x30
? __check_object_size+0x62/0x1f0
? terminate_walk+0x44/0x90
i915_gem_execbuffer2+0x95/0x1e0 [i915]
drm_ioctl+0x243/0x490
? handle_pte_fault+0x1d7/0x220
? i915_gem_do_execbuffer+0xa00/0xa00 [i915]
? handle_mm_fault+0x10d/0x2a0
vfs_ioctl+0x18/0x30
do_vfs_ioctl+0x14b/0x3f0
SyS_ioctl+0x92/0xa0
entry_SYSCALL_64_fastpath+0x1a/0xa9
RIP: 0033:0x7fb53b4fcb77
RSP: 002b:00007ffe0c572898 EFLAGS: 00003246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007fb53e17c038 RCX: 00007fb53b4fcb77
RDX: 00007ffe0c572900 RSI: 0000000040406469 RDI: 000000000000000b
RBP: 00007fb5376d67e0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000028 R11: 0000000000003246 R12: 0000000000000000
R13: 0000000000000000 R14: 000055eecb314d00 R15: 000055eecb315460
Code: 0f 84 5d ff ff ff eb a2 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 ec 58 0f 1f 44 00 00 31 c0 89 4d b0 <4c>
8b 60 10 44 8b 70 0c 48 89 d0 4c 8b 2e 48 c1 e8 27 25 ff 01
RIP: gen8_ppgtt_insert_4lvl+0x1b/0x1e0 [i915] RSP: ffffc90005f9b8c8
CR2: 0000000000000010

Recent gccs, such as 4.9, 6.3 or 7.2, do not generate the warning nor do
they explode on use. If we manually create the struct using locals from
the stack, this should eliminate this issue, and does not alter code
generation with gcc-7.2.

Fixes: 894ccebee2b0 ("drm/i915: Micro-optimise gen8_ppgtt_insert_entries()")
Reported-by: Kelly French <kfrench@federalhill.net>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Kelly French <kfrench@federalhill.net>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171106211128.12538-1-chris@chris-wilson.co.uk
Tested-by: Kelly French <kfrench@federalhill.net>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>


# d9b99ffc 27-Oct-2017 Michel Thierry <michel.thierry@intel.com>

drm/i915/cnl: Remove unnecessary check in cnl_setup_private_ppat

There is no need check if PPGTT is disabled because that not possible
in CNL. Execlists and GuC submission modes rely on at least aliasing
PPGTT and even intel_sanitize_enable_ppgtt says: "We don't allow disabling
PPGTT for gen9+ as it's a requirement for execlists, the sole mechanism
available to submit work."

Suggested-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171027223207.7869-1-michel.thierry@intel.com


# 22a8a4fc 23-Oct-2017 Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

drm/i915: Disable lazy PPGTT page table optimization for vGPU

When running under virtualization (vGPU active), we must disable
the lazy PPGTT page table initialization optimization introduced by
commit 14826673247e ("drm/i915: Only initialize partially filled
pagetables").

We must do this because GVT-g makes unduly assumptions about guest
behaviour, which this optimization breaks. This results in following
looking errors in the host:

ERROR gvt: guest page write error -22, gfn 0x7ada8, pa 0x7ada89a8, var 0x6, len 1

The real fix is to not to depend on i915 driver behaviour, but instead
either rely on only the contracts that i915 has with the hardware, or
add some paravirtualization. While the real fix is en route, it won't
be finished in time for 4.15, so the best option is to disable the
optimization for now when vGPU is active to avoid breaking 4.15 guests
in existing VM environments.

Fixes: 14826673247e ("drm/i915: Only initialize partially filled pagetables")
Suggested-by: Xiaolin Zhang <xiaolin.zhang@intel.com>
Signed-off-by: Xiaolin Zhang <xiaolin.zhang@intel.com>
[Joonas: Rewrote the commit message and added tags.]
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171023153209.10527-1-joonas.lahtinen@linux.intel.com


# f2123818 15-Oct-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Move dev_priv->mm.[un]bound_list to its own lock

Remove the struct_mutex requirement around dev_priv->mm.bound_list and
dev_priv->mm.unbound_list by giving it its own spinlock. This reduces
one more requirement for struct_mutex and in the process gives us
slightly more accurate unbound_list tracking, which should improve the
shrinker - but the drawback is that we drop the retirement before
counting so i915_gem_object_is_active() may be stale and lead us to
underestimate the number of objects that may be shrunk (see commit
bed50aea61df ("drm/i915/shrinker: Flush active on objects before
counting")).

v2: Crosslink the spinlock to the lists it protects, and btw this
changes s/obj->global_link/obj->mm.link/
v3: Fix decoupling of old links in i915_gem_object_attach_phys()
v3.1: Fix the fix, only unlink if it was linked
v3.2: Use a local for to_i915(obj->base.dev)->mm.obj_lock

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171016114037.5556-1-chris@chris-wilson.co.uk


# 612dde7e 10-Oct-2017 Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

drm/i915: Simplify intel_sanitize_enable_ppgtt

Remove dead code around has_aliasing_ppgtt condition.

Suggested-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171010143355.16577-1-joonas.lahtinen@linux.intel.com


# 06ea8c53 09-Oct-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Silently fallback to 4k scratch

If we fail to allocate a 64k hugepage for scratch, we try again with a
normal 4k page (with some loss of efficiency at runtime). As we handle
this gracefully, we do not need a noisy allocation failure warning.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171010111005.13625-1-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# d9ec12f8 06-Oct-2017 Matthew Auld <matthew.auld@intel.com>

drm/i915: accurate page size tracking for the ppgtt

Now that we support multiple page sizes for the ppgtt, it would be
useful to track the real usage for debugging purposes.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-16-matthew.auld@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-15-chris@chris-wilson.co.uk


# 17a00cf7 06-Oct-2017 Matthew Auld <matthew.auld@intel.com>

drm/i915: support 64K pages for the 48b PPGTT

Support inserting 64K pages into the 48b PPGTT.

v2: check for 64K scratch

v3: we should only have to re-adjust maybe_64K at every sg interval

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-15-matthew.auld@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-14-chris@chris-wilson.co.uk


# aa095871 06-Oct-2017 Matthew Auld <matthew.auld@intel.com>

drm/i915: add support for 64K scratch page

Before we can fully enable 64K pages, we need to first support a 64K
scratch page if we intend to support the case where we have object sizes
< 2M, since any scratch PTE must also point to a 64K region. Without
this our 64K usage is limited to objects which completely fill the
page-table, and therefore don't need any scratch.

v2: add reminder about why 48b PPGTT

Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-14-matthew.auld@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-13-chris@chris-wilson.co.uk


# 0a03852e 06-Oct-2017 Matthew Auld <matthew.auld@intel.com>

drm/i915: support 2M pages for the 48b PPGTT

Support inserting 2M gtt pages into the 48b PPGTT.

v2: sanity check sg->length against page_size

v3: don't recalculate rem on each loop
whitespace breakup

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-13-matthew.auld@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-12-chris@chris-wilson.co.uk


# 9a6330cf 06-Oct-2017 Matthew Auld <matthew.auld@intel.com>

drm/i915: enable IPS bit for 64K pages

Before we can enable 64K pages through the IPS bit, we must first enable
it through MMIO, otherwise the page-walker will simply ignore it.

v2: add comment mentioning that 64K is BDW+

v3: move to more suitable home

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-11-matthew.auld@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-10-chris@chris-wilson.co.uk


# 7464284b 06-Oct-2017 Matthew Auld <matthew.auld@intel.com>

drm/i915: align the vma start to the largest gtt page size

For the 48b PPGTT try to align the vma start address to the required
page size boundary to guarantee we use said page size in the gtt. If we
are dealing with multiple page sizes, we can't guarantee anything and
just align to the largest. For soft pinning and objects which need to be
tightly packed into the lower 32bits we don't force any alignment.

v2: various improvements suggested by Chris

v3: use set_pages and better placement of page_sizes

v4: prefer upper_32_bits()

v5: assign vma->page_sizes = vma->obj->page_sizes directly
prefer sizeof(vma->page_sizes)

v6: fixup checking of end to exclude GGTT (which are assumed to be
limited to 4G).

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-9-matthew.auld@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-8-chris@chris-wilson.co.uk


# fa3f46af 06-Oct-2017 Matthew Auld <matthew.auld@intel.com>

drm/i915: introduce vm set_pages/clear_pages

Move the setting/clearing of the vma->pages to a vm operation. Doing so
neatens things up a little, but more importantly gives us a sane place
to also set/clear the vma->pages_sizes, which we introduce later in
preparation for supporting huge-pages.

v2: remove redundant vma->pages check

v3: GEM_BUG_ON(vma->pages) following i915_vma_remove

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-8-matthew.auld@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-7-chris@chris-wilson.co.uk


# 4dd504f7 26-Sep-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Use memset64() to prefill the GTT page

Take advantage of optimised memset64() instead of open coding it to
prefill the GTT pages.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20170926095353.11036-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>


# 4f044a88 19-Sep-2017 Michal Wajdeczko <michal.wajdeczko@intel.com>

drm/i915: Rename global i915 to i915_modparams

Our global struct with params is named exactly the same way
as new preferred name for the drm_i915_private function parameter.
To avoid such name reuse lets use different name for the global.

v5: pure rename
v6: fix

Credits-to: Coccinelle

@@
identifier n;
@@
(
- i915.n
+ i915_modparams.n
)

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Ville Syrjala <ville.syrjala@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170919193846.38060-1-michal.wajdeczko@intel.com


# 1298d51c 18-Sep-2017 Zhi Wang <zhi.a.wang@intel.com>

drm/i915: Return the correct score in i915_ppat_get()

The cache attribute of the required entry has to be the same with the
existing value. After this requirement is met, the futher comparison
should be performed. After this fix, the refined test case can pass.

v2:

- Refine the tittle and comments. (Rodrigo)

Fixes: 4395890a4855 ("drm/i915: Introduce private PAT management")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1505741794-10593-1-git-send-email-zhi.a.wang@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>


# c095b97c 14-Sep-2017 Zhi Wang <zhi.a.wang@intel.com>

drm/i915: Remove the "INDEX" suffix from PPAT marcos

Remove the "INDEX" suffix from PPAT marcos as they are bits actually, not
indexes.

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1505392783-4084-2-git-send-email-zhi.a.wang@intel.com


# 4395890a 14-Sep-2017 Zhi Wang <zhi.a.wang@intel.com>

drm/i915: Introduce private PAT management

The private PAT management is to support PPAT entry manipulation. Two
APIs are introduced for dynamically managing PPAT entries: intel_ppat_get
and intel_ppat_put.

intel_ppat_get will search for an existing PPAT entry which perfectly
matches the required PPAT value. If not, it will try to allocate a new
entry if there is any available PPAT indexs, or return a partially
matched PPAT entry if there is no available PPAT indexes.

intel_ppat_put will put back the PPAT entry which comes from
intel_ppat_get. If it's dynamically allocated, the reference count will
be decreased. If the reference count turns into zero, the PPAT index is
freed again.

Besides, another two callbacks are introduced to support the private PAT
management framework. One is ppat->update_hw(), which writes the PPAT
configurations in ppat->entries into HW. Another one is ppat->match, which
will return a score to show how two PPAT values match with each other.

v17:

- Refine the comparision of score of BDW. (Joonas)

v16:

- Fix a bug in PPAT match function of BDW. (Joonas)

v15:

- Refine some code flow. (Joonas)

v12:

- Fix a problem "not returning the entry of best score". (Zhenyu)

v7:

- Keep all the register writes unchanged in this patch. (Joonas)

v6:

- Address all comments from Chris:
http://www.spinics.net/lists/intel-gfx/msg136850.html

- Address all comments from Joonas:
http://www.spinics.net/lists/intel-gfx/msg136845.html

v5:

- Add check and warnnings for those platforms which don't have PPAT.

v3:

- Introduce dirty bitmap for PPAT registers. (Chris)
- Change the name of the pointer "dev_priv" to "i915". (Chris)
- intel_ppat_{get, put} returns/takes a const intel_ppat_entry *. (Chris)

v2:

- API re-design. (Chris)

Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> #v7
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
[Joonas: Use BIT() in the enum in bdw_private_pat_match]
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1505392783-4084-1-git-send-email-zhi.a.wang@intel.com


# 0ee931c4 13-Sep-2017 Michal Hocko <mhocko@suse.com>

mm: treewide: remove GFP_TEMPORARY allocation flag

GFP_TEMPORARY was introduced by commit e12ba74d8ff3 ("Group short-lived
and reclaimable kernel allocations") along with __GFP_RECLAIMABLE. It's
primary motivation was to allow users to tell that an allocation is
short lived and so the allocator can try to place such allocations close
together and prevent long term fragmentation. As much as this sounds
like a reasonable semantic it becomes much less clear when to use the
highlevel GFP_TEMPORARY allocation flag. How long is temporary? Can the
context holding that memory sleep? Can it take locks? It seems there is
no good answer for those questions.

The current implementation of GFP_TEMPORARY is basically GFP_KERNEL |
__GFP_RECLAIMABLE which in itself is tricky because basically none of
the existing caller provide a way to reclaim the allocated memory. So
this is rather misleading and hard to evaluate for any benefits.

I have checked some random users and none of them has added the flag
with a specific justification. I suspect most of them just copied from
other existing users and others just thought it might be a good idea to
use without any measuring. This suggests that GFP_TEMPORARY just
motivates for cargo cult usage without any reasoning.

I believe that our gfp flags are quite complex already and especially
those with highlevel semantic should be clearly defined to prevent from
confusion and abuse. Therefore I propose dropping GFP_TEMPORARY and
replace all existing users to simply use GFP_KERNEL. Please note that
SLAB users with shrinkers will still get __GFP_RECLAIMABLE heuristic and
so they will be placed properly for memory fragmentation prevention.

I can see reasons we might want some gfp flag to reflect shorterm
allocations but I propose starting from a clear semantic definition and
only then add users with proper justification.

This was been brought up before LSF this year by Matthew [1] and it
turned out that GFP_TEMPORARY really doesn't have a clear semantic. It
seems to be a heuristic without any measured advantage for most (if not
all) its current users. The follow up discussion has revealed that
opinions on what might be temporary allocation differ a lot between
developers. So rather than trying to tweak existing users into a
semantic which they haven't expected I propose to simply remove the flag
and start from scratch if we really need a semantic for short term
allocations.

[1] http://lkml.kernel.org/r/20170118054945.GD18349@bombadil.infradead.org

[akpm@linux-foundation.org: fix typo]
[akpm@linux-foundation.org: coding-style fixes]
[sfr@canb.auug.org.au: drm/i915: fix up]
Link: http://lkml.kernel.org/r/20170816144703.378d4f4d@canb.auug.org.au
Link: http://lkml.kernel.org/r/20170728091904.14627-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Neil Brown <neilb@suse.de>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 36e16c49 12-Sep-2017 Zhi Wang <zhi.a.wang@intel.com>

drm/i915: Factor out setup_private_pat()

Factor out setup_private_pat() for introducing the following patches.

Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1505202148-22959-1-git-send-email-zhi.a.wang@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>


# 14826673 08-Sep-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Only initialize partially filled pagetables

If we know that we will completely fill a pagetable (i.e. we are
inserting a complete set of 512 pages), we can skip prefilling that PT
with scratch entries. If we have to abort the insertion prior to writing
the real entries, we will teardown the pagetable and remove it from the
page directory (so that we will restart the allocation next time).

We could do similar tricks for the PD and PDP, but the likelihood of a
single insertion covering the entire 512 entries diminishes, as do the
cycle savings. The saving are even greater (relatively) when we are
preallocating page tables for huge pages, as then we never need to fill
the page table.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170908181622.17791-1-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.auld@intel.com>


# 912d572d 06-Sep-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: wire up shrinkctl->nr_scanned

shrink_slab() allows us to report back the number of objects we
successfully scanned (out of the target shrinkctl->nr_to_scan). As
report the number of pages owned by each GEM object as a separate item
to the shrinker, we cannot precisely control the number of shrinker
objects we scan on each pass; and indeed may free more than requested.
If we fail to tell the shrinker about the number of objects we process,
it will continue to hold a grudge against us as any objects left
unscanned are added to the next reclaim -- and so we will keep on
"unfairly" shrinking our own slab in comparison to other slabs.

Link: http://lkml.kernel.org/r/20170822135325.9191-2-chris@chris-wilson.co.uk
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Shaohua Li <shli@fb.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# a338d5f8 31-Aug-2017 Zhi Wang <zhi.a.wang@intel.com>

drm/i915: Fix the missing PPAT cache attributes on CNL

Add back the GEN8_PPAT_WB cache attributes in cnl_setup_private_ppat(),
which are missed on CNL.

Fixes: 4e34935fcf69 ("drm/i915/cnl: Setup PAT Index.")
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1504208177-27784-1-git-send-email-zhi.a.wang@intel.com
(cherry picked from commit 6e31cdcfe17d8a25530924183d4a843602baebb1)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# 6e31cdcf 31-Aug-2017 Zhi Wang <zhi.a.wang@intel.com>

drm/i915: Fix the missing PPAT cache attributes on CNL

Add back the GEN8_PPAT_WB cache attributes in cnl_setup_private_ppat(),
which are missed on CNL.

Fixes: 4e34935fcf69 ("drm/i915/cnl: Setup PAT Index.")
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1504208177-27784-1-git-send-email-zhi.a.wang@intel.com


# 385db982 29-Aug-2017 Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/i915/cnl: Avoid ioremap_wc on Cannonlake as well.

Driver’s CPU access to GTT is via the GTTMMADR BAR.

The current HW implementation of that BAR is to only
support <= DW (and maybe QW) writes—not 16/32/64B writes
that could occur with WC and/or SSE/AVX moves.

GTTMMADR must be marked uncacheable (UC).
Accesses to GTTMMADR(GTT), must be 64 bits or less (ie. 1 GTT entry).

v2: Get clarification on the reasons and spec is getting
updated to reflect it now.

Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Suggested-by: Ben Widawsky <benjamin.widawsky@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170829230907.21363-1-rodrigo.vivi@intel.com


# 66df1014 22-Aug-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Keep a small stash of preallocated WC pages

We use WC pages for coherent writes into the ppGTT on !llc
architectures. However, to create a WC page requires a stop_machine(),
i.e. is very slow. To compensate we currently keep a per-vm cache of
recently freed pages, but we still see the slow startup of new contexts.
We can amoritize that cost slightly by allocating WC pages in small
batches (PAGEVEC_SIZE == 14) and since creating a WC page implies a
stop_machine() there is no penalty for keeping that stash global.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170822173828.5932-1-chris@chris-wilson.co.uk


# 90007bca 15-Aug-2017 Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/i915/cnl: Introduce initial Cannonlake Workarounds.

Let's inherit workarounds from previous platforms that
according to wa_database and BSpec are still valid for
Cannonlake.

v2: Add missed workarounds.
v3: Rebase
v4: Remove bad chunk that was added to rc6 disable. (Ander)
Also remove A0 W/a that are not needed anymore.
v5: Rebase on top of CFL.
v6: Remove empty gen9_init_perctx_bb and gen9_init_indirectctx_bb
since they don't carry any gen10 related W/a. (by Oscar).
Also Remove A0 exclusive workaround.
v7: Remove more A0 exclusive workarounds. As pointed out by Oscar
many workarounds were changed to be A0 only so let's remove
them.

Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Oscar Mateo <oscar.mateo@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170815231651.975-1-rodrigo.vivi@intel.com


# 4e34935f 15-Aug-2017 Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/i915/cnl: Setup PAT Index.

Different from previous platforms, on CNL+ there's separated
registers for separated indexes.

v2: Remove comments regarding uncertainty around the table.
v3: Remove extra line (by Ben)

Cc: Clint Taylor <clinton.a.taylor@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170815232539.3562-1-rodrigo.vivi@intel.com


# 8a4ab66f 14-Aug-2017 Tina Zhang <tina.zhang@intel.com>

drm/i915: Enable guest i915 full ppgtt functionality

Enable the guest i915 full ppgtt functionality when host can provide this
capability. vgt_caps is introduced to guest i915 driver to get the vgpu
capabilities from the device model. VGT_CPAS_FULL_PPGTT is one of the
capabilities type to let guest i915 dirver know that the guest i915 full
ppgtt is supported by device model.

Notice that the minor version of pvinfo isn't bumped because of this
vgt_caps introduction, due to older guest would be broken by simply
increasing the pvinfo version. Although the pvinfo minor version doesn't
increase, the compatibility won't be blocked. The compatibility is ensured
by checking the value of caps field in pvinfo. Zero means no full ppgtt
support and BIT(2) means this feature is provided.

Changes since v1:
- Use u32 instead of uint32_t (Joonas)
- Move VGT_CAPS_FULL_PPGTT introduction to this patch and use #define
instead of enum (Joonas)
- Rewrite the vgpu full ppgtt capability checking logic. (Joonas)
- Some coding style refine. (Joonas)

Changes since v2:
- Divide the whole patch set into two separate patch series, with one
patch in i915 side to check guest i915 full ppgtt capability and enable
it when this capability is supported by the device model, and the other
one in gvt side which fixs the blocking issue and enables the device
model to provide the capability to guest. And this patch focuses on guest
i915 side. (Joonas)
- Change the title from "introduce vgt_caps to pvinfo" to
"Enable guest i915 full ppgtt functionality". (Tina)

Change since v3:
- Add some comments about pvinfo caps and version. (Joonas)

Change since v4:
- Tested by Tina Zhang.

Change since v5:
- Add limitation about supporting 32bit full ppgtt.

Change since v6:
- Change the fallback to 48bit full ppgtt if i915.ppgtt_enable=2. (Zhenyu)

Change in v9:
- Remove the fixme comment due to no plan for 32bit full ppgtt
support. (Zhenyu)
- Reorder the patch-set to fix compiling issue with git-bisect. (Zhenyu)
- Add print log when forcing guest 48bit full ppgtt. (Zhenyu)

v10:
- Update against Joonas's has_full_ppgtt and has_full_48bit_ppgtt disconnect
change. (Zhenyu)

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> # in v2
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tina Zhang <tina.zhang@intel.com>
Signed-off-by: Tina Zhang <tina.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>


# 4fc05063 10-Aug-2017 Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

drm/i915: Disconnect 32 and 48 bit ppGTT support

Configurations like virtualized environments may support only 48 bit
ppGTT without supporting 32 bit ppGTT. Support this by disconnecting
the relationship of the two feature bits.

Cc: Tina Zhang <tina.zhang@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>


# 17369ba0 07-Jul-2017 Chuanxiao Dong <chuanxiao.dong@intel.com>

drm/i915: Fix the kernel panic when using aliasing ppgtt

The ppgtt should be get directly from i915_address_space *vm instead of
vma->vm.

v2:
- add one more fix for bxt. (Chris)

Fixes: 4a234c5fae16 ("drm/i915: pass the vma to insert_entries")
Bugzilla:https://bugs.freedesktop.org/show_bug.cgi?id=101713
Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com> v1
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1499421059-18262-1-git-send-email-chuanxiao.dong@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>


# 4a234c5f 22-Jun-2017 Matthew Auld <matthew.auld@intel.com>

drm/i915: pass the vma to insert_entries

The vma already contains most of the information we need for insertion.
But also in preparation for supporting huge gtt pages, it would be
useful to know the details of the vma, such that we can we can easily
determine the page sizes we are allowed to use when inserting into the
48b PPGTT. This is especially true for 64K where we can't just
arbitrarily use it, since we require aligning/padding the vm space to
2M, which sometimes we can't enforce in the upper levels.

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170622095836.6800-1-matthew.auld@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>


# 46c26662 16-Jun-2017 Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/i915/cfl: Introduce Coffee Lake workarounds.

Coffee Lake inherit most of Kabylake production
workarounds.

v2: Fix typo on commit message and remove
WaDisableKillLogic and GEN9_DISABLE_OCL_OOB_SUPPRESS_LOGIC,
since as Mika pointed out they shouldn't be here for cfl
according to BSpec.

Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1497653398-15722-1-git-send-email-rodrigo.vivi@intel.com


# 616d9cee 16-Jun-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: First try the previous execbuffer location

When choosing a slot for an execbuffer, we ideally want to use the same
address as last time (so that we don't have to rebind it) and the same
address as expected by the user (so that we don't have to fixup any
relocations pointing to it). If we first try to bind the incoming
execbuffer->offset from the user, or the currently bound offset that
should hopefully achieve the goal of avoiding the rebind cost and the
relocation penalty. However, if the object is not currently bound there
we don't want to arbitrarily unbind an object in our chosen position and
so choose to rebind/relocate the incoming object instead. After we
report the new position back to the user, on the next pass the
relocations should have settled down.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtien@linux.intel.com>


# d90c9890 31-May-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Guard against i915_ggtt_disable_guc() being invoked unconditionally

Commit 7c3f86b6dc51 ("drm/i915: Invalidate the guc ggtt TLB upon
insertion") added the restoration of the invalidation routine after the
GuC was disabled, but missed that the GuC was unconditionally disabled
when not used. This then overwrites the invalidate routine for the older
chipsets, causing havoc and breaking resume as the most obvious victim.

We place the guard inside i915_ggtt_disable_guc() to be backport
friendly (the bug was introduced into v4.11) but it would be preferred
to be in more control over when this was guard (i.e. do not try and
teardown the data structures before we have enabled them). That should
be true with the reorganisation of the guc loaders.

Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Fixes: 7c3f86b6dc51 ("drm/i915: Invalidate the guc ggtt TLB upon insertion")
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Cc: <stable@vger.kernel.org> # v4.11+
Link: http://patchwork.freedesktop.org/patch/msgid/20170531190514.3691-1-chris@chris-wilson.co.uk
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
(cherry picked from commit cb60606d835ca8b2f744835116bcabe64ce88849)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# d86b18a0 24-May-2017 Jon Bloomfield <jon.bloomfield@intel.com>

drm/i915: Serialize GTT/Aperture accesses on BXT

BXT has a H/W issue with IOMMU which can lead to system hangs when
Aperture accesses are queued within the GAM behind GTT Accesses.

This patch avoids the condition by wrapping all GTT updates in stop_machine
and using a flushing read prior to restarting the machine.

The stop_machine guarantees no new Aperture accesses can begin while
the PTE writes are being emmitted. The flushing read ensures that
any following Aperture accesses cannot begin until the PTE writes
have been cleared out of the GAM's fifo.

Only FOLLOWING Aperture accesses need to be separated from in flight
PTE updates. PTE Writes may follow tightly behind already in flight
Aperture accesses, so no flushing read is required at the start of
a PTE update sequence.

This issue was reproduced by running
igt/gem_readwrite and
igt/gem_render_copy
simultaneously from different processes, each in a tight loop,
with INTEL_IOMMU enabled.

This patch was originally published as:
drm/i915: Serialize GTT Updates on BXT

[Note: This will cause a performance penalty for some use cases, but
avoiding hangs trumps performance hits. This may need to be worked
around in Mesa to recover the lost performance.]

v2: Move bxt/iommu detection into static function
Remove #ifdef CONFIG_INTEL_IOMMU protection
Make function names more reflective of purpose
Move flushing read into static function

v3: Tidy up for checkpatch.pl

Testcase: igt/gem_concurrent_blit
Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: John Harrison <john.C.Harrison@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: stable@vger.kernel.org
Link: http://patchwork.freedesktop.org/patch/msgid/1495641251-30022-1-git-send-email-jon.bloomfield@intel.com
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
(cherry picked from commit 0ef34ad6222abfa513117515fec720c33a58f105)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# 04f7b24e 01-Jun-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/guc: Assert that we switch between known ggtt->invalidate functions

When we enable the GuC, we enable an alternative mechanism for doing
post-GGTT update invalidation. Likewise, when we disable the GuC, we
restore the previous method. Assert that we change between known
endpoints, so that we can catch if we accidentally clobber some other
gen and if we change the invalidate routine without updating guc.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170601090446.1334-1-chris@chris-wilson.co.uk
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# cb60606d 31-May-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Guard against i915_ggtt_disable_guc() being invoked unconditionally

Commit 7c3f86b6dc51 ("drm/i915: Invalidate the guc ggtt TLB upon
insertion") added the restoration of the invalidation routine after the
GuC was disabled, but missed that the GuC was unconditionally disabled
when not used. This then overwrites the invalidate routine for the older
chipsets, causing havoc and breaking resume as the most obvious victim.

We place the guard inside i915_ggtt_disable_guc() to be backport
friendly (the bug was introduced into v4.11) but it would be preferred
to be in more control over when this was guard (i.e. do not try and
teardown the data structures before we have enabled them). That should
be true with the reorganisation of the guc loaders.

Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Fixes: 7c3f86b6dc51 ("drm/i915: Invalidate the guc ggtt TLB upon insertion")
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Cc: <stable@vger.kernel.org> # v4.11+
Link: http://patchwork.freedesktop.org/patch/msgid/20170531190514.3691-1-chris@chris-wilson.co.uk
Reviewed-by: Michel Thierry <michel.thierry@intel.com>


# 80debff8 25-May-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Consolidate #ifdef CONFIG_INTEL_IOMMU

We depend on intel_iommu_gfx_mapped for various workarounds, but that is
only available under an #ifdef CONFIG_INTEL_IOMMU. Refactor all the
cut-and-paste ifdefs to a common routine.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170525121612.2190-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>


# 0ef34ad6 24-May-2017 Jon Bloomfield <jon.bloomfield@intel.com>

drm/i915: Serialize GTT/Aperture accesses on BXT

BXT has a H/W issue with IOMMU which can lead to system hangs when
Aperture accesses are queued within the GAM behind GTT Accesses.

This patch avoids the condition by wrapping all GTT updates in stop_machine
and using a flushing read prior to restarting the machine.

The stop_machine guarantees no new Aperture accesses can begin while
the PTE writes are being emmitted. The flushing read ensures that
any following Aperture accesses cannot begin until the PTE writes
have been cleared out of the GAM's fifo.

Only FOLLOWING Aperture accesses need to be separated from in flight
PTE updates. PTE Writes may follow tightly behind already in flight
Aperture accesses, so no flushing read is required at the start of
a PTE update sequence.

This issue was reproduced by running
igt/gem_readwrite and
igt/gem_render_copy
simultaneously from different processes, each in a tight loop,
with INTEL_IOMMU enabled.

This patch was originally published as:
drm/i915: Serialize GTT Updates on BXT

v2: Move bxt/iommu detection into static function
Remove #ifdef CONFIG_INTEL_IOMMU protection
Make function names more reflective of purpose
Move flushing read into static function

v3: Tidy up for checkpatch.pl

Testcase: igt/gem_concurrent_blit
Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: John Harrison <john.C.Harrison@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1495641251-30022-1-git-send-email-jon.bloomfield@intel.com
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>


# 2098105e 17-May-2017 Michal Hocko <mhocko@kernel.org>

drm: drop drm_[cm]alloc* helpers

Now that drm_[cm]alloc* helpers are simple one line wrappers around
kvmalloc_array and drm_free_large is just kvfree alias we can drop
them and replace by their native forms.

This shouldn't introduce any functional change.

Changes since v1
- fix typo in drivers/gpu//drm/etnaviv/etnaviv_gem.c - noticed by 0day
build robot

Suggested-by: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Michal Hocko <mhocko@suse.com>drm: drop drm_[cm]alloc* helpers
[danvet: Fixup vgem which grew another user very recently.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Christian König <christian.koenig@amd.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170517122312.GK18247@dhcp22.suse.cz


# 171d8b93 16-May-2017 Matthew Auld <matthew.auld@intel.com>

drm/i915: use vma->size for appgtt allocate_va_range

For the aliasing ppgtt we clear the va range up to vma->size, but seem
to allocate up to vma->node.size, which is a little inconsistent given
that vma->node.size >= vma->size. Not that is really matters all that
much since we preallocate anyway, but for consistency just use
vma->size.

Fixes: ff685975d97f ("drm/i915: Move allocate_va_range to GTT")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170516085514.5853-1-matthew.auld@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
(cherry picked from commit d567232cbd9ec2a289ddffea4013b7265bbcc3d5)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# d567232c 16-May-2017 Matthew Auld <matthew.auld@intel.com>

drm/i915: use vma->size for appgtt allocate_va_range

For the aliasing ppgtt we clear the va range up to vma->size, but seem
to allocate up to vma->node.size, which is a little inconsistent given
that vma->node.size >= vma->size. Not that is really matters all that
much since we preallocate anyway, but for consistency just use
vma->size.

Fixes: ff685975d97f ("drm/i915: Move allocate_va_range to GTT")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170516085514.5853-1-matthew.auld@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>


# 2f720aac 12-May-2017 Matthew Auld <matthew.auld@intel.com>

drm/i915: don't do allocate_va_range again on PIN_UPDATE

If a vma is already bound to a ppgtt, we incorrectly call
allocate_va_range again when doing a PIN_UPDATE, which will result in
over accounting within our paging structures, such that when we do
unbind something we don't actually destroy the structures and end up
inadvertently recycling them. In reality this probably isn't too bad,
but once we start touching PDEs and PDPEs for 64K/2M/1G pages this
apparent recycling will manifest into lots of really, really subtle
bugs.

v2: Fix the testing of vma->flags for aliasing_ppgtt_bind_vma

Fixes: ff685975d97f ("drm/i915: Move allocate_va_range to GTT")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170512091423.26085-1-chris@chris-wilson.co.uk
(cherry picked from commit 1f23475c893a85c934143cd64865ebb9b6af383f)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# 1f23475c 12-May-2017 Matthew Auld <matthew.auld@intel.com>

drm/i915: don't do allocate_va_range again on PIN_UPDATE

If a vma is already bound to a ppgtt, we incorrectly call
allocate_va_range again when doing a PIN_UPDATE, which will result in
over accounting within our paging structures, such that when we do
unbind something we don't actually destroy the structures and end up
inadvertently recycling them. In reality this probably isn't too bad,
but once we start touching PDEs and PDPEs for 64K/2M/1G pages this
apparent recycling will manifest into lots of really, really subtle
bugs.

v2: Fix the testing of vma->flags for aliasing_ppgtt_bind_vma

Fixes: ff685975d97f ("drm/i915: Move allocate_va_range to GTT")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170512091423.26085-1-chris@chris-wilson.co.uk


# a92d1a91 09-May-2017 Imre Deak <imre.deak@intel.com>

drm/i915: Sanitize stolen memory size calculation

On GEN8+ (not counting CHV) the calculation can in theory result in an
incorrect sign extension with all upper bits set. In practice this is
unlikely to happen since it would require 4GB of stolen memory set
aside. For consistency still prevent the sign extension explicitly
everywhere.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1494408113-379-6-git-send-email-imre.deak@intel.com


# 4519290a 09-May-2017 Imre Deak <imre.deak@intel.com>

drm/i915: Check error return when setting DMA mask

Even though an error from these functions isn't fatal we still want to
have a diagnostic message about it.

v2:
- Don't do assignments in if statements. (Jani)

Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1494408113-379-4-git-send-email-imre.deak@intel.com


# ed3ba079 08-May-2017 Laura Abbott <labbott@redhat.com>

drm: use set_memory.h header

set_memory_* functions have moved to set_memory.h. Switch to this
explicitly.

[akpm@linux-foundation.org: track drivers/gpu/drm/i915/i915_gem_gtt.c linux-next changes]
Link: http://lkml.kernel.org/r/1488920133-27229-8-git-send-email-labbott@redhat.com
Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 1383aeca 30-Mar-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Ironlake do_idle_maps w/a may be called w/o struct_mutex

Since commit 1233e2db199d ("drm/i915: Move object backing storage
manipulation to its own locking"), i915_gem_object_put_pages() and
specifically the i915_gem_gtt_finish_pages() may be called from outside
of the struct_mutex and so we can no longer pass I915_WAIT_LOCKED to
i915_gem_wait_for_idle.

Fixes: 1233e2db199d ("drm/i915: Move object backing storage manipulation to its own locking")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: intel-gfx@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v4.10+
Link: http://patchwork.freedesktop.org/patch/msgid/20170330085341.20311-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
(cherry picked from commit 228ec87ccd040b620c467cd61d594bfaa4f8a12e)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# 228ec87c 30-Mar-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Ironlake do_idle_maps w/a may be called w/o struct_mutex

Since commit 1233e2db199d ("drm/i915: Move object backing storage
manipulation to its own locking"), i915_gem_object_put_pages() and
specifically the i915_gem_gtt_finish_pages() may be called from outside
of the struct_mutex and so we can no longer pass I915_WAIT_LOCKED to
i915_gem_wait_for_idle.

Fixes: 1233e2db199d ("drm/i915: Move object backing storage manipulation to its own locking")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: intel-gfx@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v4.10+
Link: http://patchwork.freedesktop.org/patch/msgid/20170330085341.20311-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# 054b9acd 28-Feb-2017 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/gtt: Setup vm callbacks late

If we manage to tangle errorpaths and get call to callbacks,
it is better to defensively keep them as null until object init is
finished so that we get clean null deref on callsite,
instead of more cryptic wreckage with partly initialized vm objects.

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1488295691-9404-5-git-send-email-mika.kuoppala@intel.com


# e7167769 28-Feb-2017 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915: Avoid using word legacy with ppgtt

The term legacy is subjective. Use 3lvl and 4lvl
where appropriate.

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1488295691-9404-4-git-send-email-mika.kuoppala@intel.com


# 1e6437b0 28-Feb-2017 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/gtt: Prefer i915_vm_is_48bit() over macro

If we setup the vm size early, we can use the newly introduced
i915_vm_is_48bit() in majority of callsites wanting to know the vm size.

As we operate either with 3lvl or 4lvl page table structure,
wrap the vm size query inside a function which tells us if
4lvl setup is needed for particular vm, as the following
code uses the function names where level is noted.

v2: use_4lvl (Chris)

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1488295691-9404-3-git-send-email-mika.kuoppala@intel.com


# 3e490042 28-Feb-2017 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/gtt: Make I915_PDPES_PER_PDP inline function

The macro takes a vm pointer at some sites, and dev_priv on others
We were saved as the internal macro never deferences the pointer
given.

As the number of pdpes depend on vm configuration, make it
as a inline function that accepts vm pointer.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wsilon.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1488295691-9404-1-git-send-email-mika.kuoppala@intel.com


# 12946ece 27-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove redundant TLB invalidate on switching ppgtt

We are required to reload the TLBs around ppgtt switches. However, we
already do an unconditional TLB invalidate before every batch and a flush
afterwards, so this condition is already satisfied without extra flushes
around the LRI instructions.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170227135913.8056-2-chris@chris-wilson.co.uk


# 2f7399af 26-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Unwind vma->pages allocation upon failure

If we fail to allocate the ppgtt range after allocating the pages for
the vma, we should unwind the local allocation before reporting back the
failure.

Fixes: ff685975d97f ("drm/i915: Move allocate_va_range to GTT")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170227122654.27651-2-chris@chris-wilson.co.uk


# bf75d59e 26-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Only unwind the local pgtable layer if empty

Only if we allocated the layer and the lower level failed should we
remove this layer when unwinding. Otherwise we ignore the overlapping
entries by overwriting the old layer with scratch.

Fixes: c5d092a4293f ("drm/i915: Remove bitmap tracking for used-pml4")
Fixes: e2b763caa6eb ("drm/i915: Remove bitmap tracking for used-pdpes")
Reported-by: Matthew Auld <matthew.william.auld@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99947
Testcase: igt/drv_selftest/live_gtt
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Tested-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170227122654.27651-1-chris@chris-wilson.co.uk


# 9e89f9ee 25-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Advance start address on crossing PML (48b ppgtt) boundary

When advancing onto the next 4th level page table entry, we need to
reset our indices to 0. Currently we restart from the original address
which means we start with an offset into the next PML table.

Fixes: 894ccebee2b0 ("drm/i915: Micro-optimise gen8_ppgtt_insert_entries()")
Reported-by: Matthew Auld <matthew.william.auld@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99948
Testcase: igt/drv_selftest/live_gtt
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Tested-by: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170225181122.4788-4-chris@chris-wilson.co.uk


# 357480ce 25-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Assert we do not overflow 4lvl page directories

Before looking up the page directory entry, check we are still within
bounds.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170225181122.4788-2-chris@chris-wilson.co.uk


# 4509276e 19-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove Braswell GGTT update w/a

Testing with concurrent GGTT accesses no longer show the coherency
problems from yonder, commit 5bab6f60cb4d ("drm/i915: Serialise updates
to GGTT with access through GGTT on Braswell"). My presumption is that
the root cause was more likely fixed by commit 3b5724d702ef ("drm/i915:
Wait for writes through the GTT to land before reading back"), along
with the use of WC updates to the global gTT in commit 8448661d65f6
("drm/i915: Convert clflushed pagetables over to WC maps". Given
that the original symptoms can no longer be reproduced, time to remove
the workaround.

Testcase: igt/gem_concurrent_blit
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20170220124718.14796-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# 5a55b527 17-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Only apply legacy PDE overflow detection to 3lvl machines

Prevent the overflow check from firing on machines with the full 4lvl
page tables, that are not restricted to GEN8_LEGACY_PDES.

v2: Also fix the off-by-one in the compare

Fixes: 894ccebee2b0 ("drm/i915: Micro-optimise gen8_ppgtt_insert_entries()")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170217141455.19877-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>


# 3e52d71e 08-Feb-2017 Zhi Wang <zhi.a.wang@intel.com>

drm/i915: A hotfix for making aliasing PPGTT work for GVT-g

This patch makes PPGTT page table non-shrinkable when using aliasing PPGTT
mode. It's just a temporary solution for making GVT-g work.

Fixes: 2ce5179fe826 ("drm/i915/gtt: Free unused lower-level page tables")
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Zhiyuan Lv <zhiyuan.lv@intel.com>
Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1486559013-25251-2-git-send-email-zhi.a.wang@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <stable@vger.kernel.org> # v4.10
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
(cherry picked from commit e81ecb5e31db6c2a259d694738cf620d9fa70861)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# 75c7b0b8 15-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Use preferred kernel types in i915_gem_gtt.c

Make checkpatch happy and make the use of u32/u64 consistent throughout
i915_gem_gtt.[ch]

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170215084357.19977-23-chris@chris-wilson.co.uk


# 57202f47 15-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Differentiate the aliasing_ppgtt with an invalid filp

Use an invalid filp so that the aliasing_ppgtt can be clearly
identified.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170215084357.19977-22-chris@chris-wilson.co.uk


# e565ceb0 15-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Only preallocate the aliasing GTT to the extents of the global GTT

As the aliasing GTT is only accessed via the global GTT, we will never
use more of it than we expose via the Global GTT and so we only need to
preallocate sufficient space within the ppgtt for the full GTT. Equally,
if the aliasing GTT is smaller than the global GTT, we have a serious
issue and must bail.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170215084357.19977-21-chris@chris-wilson.co.uk


# 381b943b 15-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove i915_address_space.start

Once upon a time, back in the UMS days, we supported userspace
initialising the GTT and sharing portions of the GTT with other users.
Now, we own the GTT (both global and per-process) and the tables always
start at 0 - so we can remove i915_address_space.start and forget about
this old complication.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170215084357.19977-20-chris@chris-wilson.co.uk


# 3dc523ea 15-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove defunct GTT tracepoints

The tracepoints are now entirely synonymous with binding and unbinding the
VMA (and the tracepoints there).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170215084357.19977-18-chris@chris-wilson.co.uk


# 75afcf72 15-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Always mark the PDP as dirty when altered

We want to reload the PDP (and flush the TLB) when the addresses are
changed.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170215084357.19977-17-chris@chris-wilson.co.uk


# ec151f31 15-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove superfluous posting reads after clear GGTT

The barrier here is not required - we apply the barrier before the range
is ever reused by the GPU instead.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170215084357.19977-16-chris@chris-wilson.co.uk


# c5d092a4 15-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove bitmap tracking for used-pml4

We only operate on known extents (both for alloc/clear) and so we can use
both the knowledge of the bind/unbind range along with the knowledge of
the existing pagetable to avoid having to allocate temporary and
auxiliary bitmaps.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170215084357.19977-15-chris@chris-wilson.co.uk


# e2b763ca 15-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove bitmap tracking for used-pdpes

We only operate on known extents (both for alloc/clear) and so we can use
both the knowledge of the bind/unbind range along with the knowledge of
the existing pagetable to avoid having to allocate temporary and
auxiliary bitmaps.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170215084357.19977-14-chris@chris-wilson.co.uk


# fe52e37f 15-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove bitmap tracking for used-pdes

We only operate on known extents (both for alloc/clear) and so we can use
both the knowledge of the bind/unbind range along with the knowledge of
the existing pagetable to avoid having to allocate temporary and
auxiliary bitmaps.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170215084357.19977-13-chris@chris-wilson.co.uk


# dd19674b 15-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove bitmap tracking for used-ptes

We only operate on known extents (both for alloc/clear) and so we can use
both the knowledge of the bind/unbind range along with the knowledge of
the existing pagetable to avoid having to allocate temporary and
auxiliary bitmaps.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99295
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170215084357.19977-12-chris@chris-wilson.co.uk


# 16a011c8 15-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Tidy gen6_write_pde()

Stop passing around unused parameters makes the code more compact.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170215084357.19977-11-chris@chris-wilson.co.uk


# f0a22974 15-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove redundant clear of appgtt

Upon creation of the va range, it is initialised to point at scratch.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170215084357.19977-10-chris@chris-wilson.co.uk


# 52c126ee 15-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Always preallocate gen6/7 ppgtt

The hardware does not cope very well with us changing the PD within an
active context (the context must be idle for it to re-read the PD). As
we only check whether the page is idle before changing the entry (and on
through the PD tree), we cannot reliably replace PD entries on
gen6/gen7. To fully avoid changing the tree at runtime, preallocate it
on init.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170215084357.19977-9-chris@chris-wilson.co.uk


# ff685975 15-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Move allocate_va_range to GTT

In the future, we need to call allocate_va_range on the aliasing-ppgtt
which means moving the call down from the vma into the vm (which is
more appropriate for calling the vm function).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170215084357.19977-8-chris@chris-wilson.co.uk


# 9231da70 15-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove kmap/kunmap wrappers

As these are now both plain and simple kmap_atomic/kunmap_atomic pairs,
we can remove the wrappers for a small gain of clarity (in particular,
not hiding the atomic critical sections!).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170215084357.19977-7-chris@chris-wilson.co.uk


# 8448661d 15-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Convert clflushed pagetables over to WC maps

We flush the entire page every time we update a few bytes, making the
update of a page table many, many times slower than is required. If we
create a WC map of the page for our updates, we can avoid the clflush
but incur additional cost for creating the pagetable. We amoritize that
cost by reusing page vmappings, and only changing the page protection in
batches.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>


# cbc4e9e6 15-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Split ggtt/alasing_gtt unbind_vma

Similar to how we already split the bind_vma for ggtt/aliasing_gtt, also
split up the unbind for symmetry.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170215084357.19977-5-chris@chris-wilson.co.uk


# 1188bc66 15-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Don't special case teardown of aliasing_ppgtt

The aliasing_ppgtt is a regular ppgtt, and we can use the regular
i915_ppgtt_put() to properly tear it down.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170215084357.19977-4-chris@chris-wilson.co.uk


# 894ccebe 15-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Micro-optimise gen8_ppgtt_insert_entries()

Improve the sg iteration and in hte process eliminate a bug in
miscomputing the pml4 length as orig_nents<<PAGE_SHIFT is no longer the
full length of the sg table.

v2: Check for the end of the fourth level page table (the final pdpe)
and move onto the next.
v3: Assert that 3lvl insert_pte_entries doesn't overflow its smaller set
of PDP.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170215084357.19977-3-chris@chris-wilson.co.uk


# b31144c0 15-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Micro-optimise gen6_ppgtt_insert_entries()

Inline the address computation to avoid the vfunc call for every page.
We still have to pay the high overhead of sg_page_iter_next(), but now
at least GCC can optimise the inner most loop, giving a significant
boost to some thrashing Unreal Engine workloads.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170215084357.19977-2-chris@chris-wilson.co.uk


# ba7a5741 15-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Micro-optimise i915_get_ggtt_vma_pages()

The predominant VMA class is normal GTT, so allow gcc to emphasize that
path and avoid unnecessary stack movement.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170215084357.19977-1-chris@chris-wilson.co.uk


# 73dec95e 14-Feb-2017 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Emit to ringbuffer directly

This removes the usage of intel_ring_emit in favour of
directly writing to the ring buffer.

intel_ring_emit was preventing the compiler for optimising
fetch and increment of the current ring buffer pointer and
therefore generating very verbose code for every write.

It had no useful purpose since all ringbuffer operations
are started and ended with intel_ring_begin and
intel_ring_advance respectively, with no bail out in the
middle possible, so it is fine to increment the tail in
intel_ring_begin and let the code manage the pointer
itself.

Useless instruction removal amounts to approximately
two and half kilobytes of saved text on my build.

Not sure if this has any measurable performance
implications but executing a ton of useless instructions
on fast paths cannot be good.

v2:
* Change return from intel_ring_begin to error pointer by
popular demand.
* Move tail increment to intel_ring_advance to enable some
error checking.

v3:
* Move tail advance back into intel_ring_begin.
* Rebase and tidy.

v4:
* Complete rebase after a few months since v3.

v5:
* Remove unecessary cast and fix !debug compile. (Chris Wilson)

v6:
* Make intel_ring_offset take request as well.
* Fix recording of request postfix plus a sprinkle of asserts.
(Chris Wilson)

v7:
* Use intel_ring_offset to get the postfix. (Chris Wilson)
* Convert GVT code as well.

v8:
* Rename *out++ to *cs++.

v9:
* Fix GVT out to cs conversion in GVT.

v10:
* Rebase for new intel_ring_begin in selftests.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170214113242.29241-1-tvrtko.ursulin@linux.intel.com


# 6cde9a02 13-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Extract aliasing ppgtt setup

In order to force testing of the aliasing ppgtt, extract its
initialisation function.

v2: Also extract the cleanup function for symmetry.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-39-chris@chris-wilson.co.uk


# aae4a3d8 13-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Use fault-injection to force the shrinker to run in live GTT tests

It is possible whilst allocating the page-directory tree for a ppgtt
bind that the shrinker may run and reap unused parts of the tree. If the
shrinker happens to remove a chunk of the tree that the
allocate_va_range has already processed, we may then try to insert into
the dangling tree. This test uses the fault-injection framework to force
the shrinker to be invoked before we allocate new pages, i.e. new chunks
of the PD tree.

References: https://bugs.freedesktop.org/show_bug.cgi?id=99295
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>


# 1c42819a 13-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Add initial selftests for i915_gem_gtt

Simple starting point for adding selftests for i915_gem_gtt, first
try creating a ppGTT and filling it.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-27-chris@chris-wilson.co.uk


# 3b5bb0a3 13-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Mock a GGTT for self-testing

A very simple mockery, just a random manager and timeline. Useful for
inserting objects and ordering retirement; and not much else.

v2: mock_fini_ggtt() to complement mock_init_ggtt().

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-7-chris@chris-wilson.co.uk


# 94d4a2a9 10-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Unbind any residual objects/vma from the Global GTT on shutdown

We may unload the PCI device before all users (such as dma-buf) are
completely shutdown. This may leave VMA in the global GTT which we want
to revoke, whilst keeping the objects themselves around to service the
dma-buf.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170210163523.17533-2-chris@chris-wilson.co.uk


# e81ecb5e 08-Feb-2017 Zhi Wang <zhi.a.wang@intel.com>

drm/i915: A hotfix for making aliasing PPGTT work for GVT-g

This patch makes PPGTT page table non-shrinkable when using aliasing PPGTT
mode. It's just a temporary solution for making GVT-g work.

Fixes: 2ce5179fe826 ("drm/i915/gtt: Free unused lower-level page tables")
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Zhiyuan Lv <zhiyuan.lv@intel.com>
Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1486559013-25251-2-git-send-email-zhi.a.wang@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <drm-intel-fixes@lists.freedesktop.org> # v4.10-rc1+
Cc: stable@vger.kernel.org
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>


# a6508ded 06-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Use page coloring to provide the guard page at the end of the GTT

As we now mark the reserved hole (drm_mm.head_node) with the special
UNEVICTABLE color, we can use the page coloring to avoid prefetching of
the CS beyond the end of the GTT.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170206084547.27921-3-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.auld@intel.com>


# 47db922f 06-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Assign I915_COLOR_UNEVICTABLE to the address space head_node

The drm_mm range manager (within i915_address_space) uses a special
drm_mm_node that excludes the unavailable range (beyond the end of the
drm_mm). However, we play games with the global GTT to use the head_node
to exclude the tail page but tell ourselves that the whole range is
available. This causes an issue when we try to evict using the full
range of the global GTT which is wider than the drm_mm, resulting in
complete confusion and catastrophe. One way to resolve this would be to
use a reserved node to exclude the guard page, or we can treat the
drm_mm's head_node as our guard page and assign it the appropriate
colour.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170206084547.27921-2-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.auld@intel.com>


# b196fbc7 06-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Manipulate the Global GTT size using I915_GTT_PAGE_SIZE

I incorrectly converted the exclusion of the last 4096 bytes (that avoids
any potential prefetching past the end of the GTT) to PAGE_SIZE and not
to I915_GTT_PAGE_SIZE as it should be.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170206084547.27921-1-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.auld@intel.com>


# e32e836a 02-Feb-2017 Matthew Auld <matthew.auld@intel.com>

drm/i915: remove 512GB allocation warning

Now that we have selftests in place exercising truly huge allocations
we will start to hit the 512GB warning, so now seems like a good time to
remove this user-triggerable WARN.

Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1486047300-13198-1-git-send-email-matthew.auld@intel.com
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>


# 4e64e553 02-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm: Improve drm_mm search (and fix topdown allocation) with rbtrees

The drm_mm range manager claimed to support top-down insertion, but it
was neither searching for the top-most hole that could fit the
allocation request nor fitting the request to the hole correctly.

In order to search the range efficiently, we create a secondary index
for the holes using either their size or their address. This index
allows us to find the smallest hole or the hole at the bottom or top of
the range efficiently, whilst keeping the hole stack to rapidly service
evictions.

v2: Search for holes both high and low. Rename flags to mode.
v3: Discover rb_entry_safe() and use it!
v4: Kerneldoc for enum drm_mm_insert_mode.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Russell King <rmk+kernel@armlinux.org.uk>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Sean Paul <seanpaul@chromium.org>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Thierry Reding <thierry.reding@gmail.com>
Cc: Stephen Warren <swarren@wwwdotorg.org>
Cc: Alexandre Courbot <gnurou@gmail.com>
Cc: Eric Anholt <eric@anholt.net>
Cc: Sinclair Yeh <syeh@vmware.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com> # vmwgfx
Reviewed-by: Lucas Stach <l.stach@pengutronix.de> #etnaviv
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20170202210438.28702-1-chris@chris-wilson.co.uk


# 9fb5026f 26-Jan-2017 Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>

drm/i915/glk: Turn on workarounds that apply to Geminilake too

Apply workarounds to Geminilake, and annotate those that are applied
unconditionally when they apply to GLK based on the workaround database.

v2: Fix commit message typos. (David)
v3: Rebase.
Cc: David Weinehall <david.weinehall@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1485422218-9102-1-git-send-email-ander.conselvan.de.oliveira@intel.com


# b976dc53 23-Jan-2017 Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/i915: Introduce IS_GEN9_BC for Skylake and Kabylake.

Along with GLK it was introduced the .is_lp and IS_GEN9_LP.
So, following the same simplification standard we can
put Skylake and Kabylake under the same bucket for most
of the things.

So let's add the IS_GEN9_BC for "Big Core" (non Atom based
platforms).

The i915_drv.c was let out of this patch on purpose
because that is really a decision per platform, just like
other cases where IS_KABYLAKE is different from IS_SKYLAKE.

v2: fix conflict with IS_LP and 3 new cases for this
big core bucket:
- intel_ddi.c: intel_ddi_get_link_dpll
- intel_fbc.c: find_compression_threshold
- i915_gem_gtt.c: gtt_write_workarounds

Cc: Anusha Srivatsa <anusha.srivatsa@intel.com>
Cc: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1485196357-30599-2-git-send-email-rodrigo.vivi@intel.com


# 8da53efa 23-Jan-2017 Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/i915/kbl: Apply WaIncreaseDefaultTLBEntries on KBL.

According to wa_database this Wa persist on KBL as it was on SKL.

Cc: Tim Gore <tim.gore@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1485196357-30599-1-git-send-email-rodrigo.vivi@intel.com


# 718659a6 16-Jan-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Rename some warts in the VMA API

Whilst writing testcases to exercise the VMA API, some oddities came to
light, such as i915_gem_obj_lookup_or_create(). Joonas suggested
i915_vma_instance() as a neat replacement, so rename them, move them to
i915_vma.c and add some kerneldoc as a sugary bonus.

s/i915_gem_obj_to_vma/i915_vma_lookup/
s/i915_gem_obj_lookup_or_create_vma/i915_vma_instance/

Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170116152131.18089-2-chris@chris-wilson.co.uk


# 9734ad13 15-Jan-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Assert we do not attempt to reuse an allocated node

i915_gem_gtt_reserve() and i915_gem_gtt_insert() can only work on
unallocated nodes. Check that the callers complies.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170115172740.28995-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# 3fec7ec4 15-Jan-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Catch attempting to use the aliasing_gtt's drm_mm

The aliasing_gtt is just that, an alias of the global GTT. We do not
populate it directly, instead we always use the global GTT. Catch any
attempt to incorrectly allocate ranges from the aliasing_gtt.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170115134746.29325-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# 47a8e3f6 13-Jan-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Eliminate superfluous i915_ggtt_view_normal

Since commit 058d88c4330f ("drm/i915: Track pinned VMA"), there is only
one user of i915_ggtt_view_normal rodate. Just treat NULL as no special
view in pin_to_display() like everywhere else.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170114002827.31315-7-chris@chris-wilson.co.uk


# 7b92c047 13-Jan-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Eliminate superfluous i915_ggtt_view_rotated

It is only being used to clear a struct and set the type, after which it
is overwritten. Since we no longer check the unset bits of the union,
skipping the clear is permissible.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170114002827.31315-6-chris@chris-wilson.co.uk


# 8bab1193 13-Jan-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Convert i915_ggtt_view to use an anonymous union

Reading the ggtt_views is much more pleasant without the extra
characters from specifying the union (i.e. ggtt_view.partial rather than
ggtt_view.params.partial). To make this work inside i915_vma_compare()
with only a single memcmp requires us to ensure that there are no
uninitialised bytes within each branch of the union (we make sure the
structs are packed) and we need to store the size of each branch.

v4: Rewrite changelog and add comments explaining the assert.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20170114002827.31315-5-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>


# 0c7eeda1 11-Jan-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Move i915_ppgtt_close() into i915_gem_gtt.c

Move it alongside its ppgtt counterparts, in order to make it available
for the ppgtt selftests.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170111210937.29252-26-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# a4dbf7cf 12-Jan-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Fix up kerneldoc parameters for i915_gem_gtt_*()

Parameter: good.
Parameter - bad.

One day I'll learn the syntax.

Fixes: 625d988acc28 ("drm/i915: Extract reserving space in the GTT to a helper")
Fixes: e007b19d7ba7 ("drm/i915: Use the MRU stack search after evicting")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170112164559.27232-1-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.auld@intel.com>


# 7c3f86b6 12-Jan-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Invalidate the guc ggtt TLB upon insertion

Move the GuC invalidation of its ggtt TLB to where we perform the ggtt
modification rather than proliferate it into all the callers of the
insert (which may or may not in fact have to do the insertion).

v2: Just do the guc invalidate unconditionally, (afaict) it has no impact
without the guc loaded on gen8+
v3: Conditionally invalidate the guc - just in case that register has
not been validated for other modes.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170112110050.25333-1-chris@chris-wilson.co.uk


# 606fec95 11-Jan-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Prefer random replacement before eviction search

Performing an eviction search can be very, very slow especially for a
range restricted replacement. For example, a workload like
gem_concurrent_blit will populate the entire GTT and then cause aperture
thrashing. Since the GTT is a mix of active and inactive tiny objects,
we have to search through almost 400k objects before finding anything
inside the mappable region, and as this search is required before every
operation performance falls off a cliff.

Instead of performing the full search, we do a trial replacement of the
node at a random location fitting the specified restrictions. We lose
the strict LRU property of the GTT in exchange for avoiding the slow
search (several orders of runtime improvement for gem_concurrent_blit
4KiB-global-gtt, e.g. from 5000s to 20s). The loss of LRU replacement is
(later) mitigated firstly by only doing replacement if we find no
freespace and secondly by execbuf doing a PIN_NONBLOCK search first before
it starts thrashing (i.e. the random replacement will only occur from the
already inactive set of objects).

v2: Ascii-art, and check preconditionst
v3: Rephrase final sentence in comment to explain why we don't bother
with if (i915_is_ggtt(vm)) for preferring random replacement.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170111112312.31493-3-chris@chris-wilson.co.uk


# 625d988a 11-Jan-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Extract reserving space in the GTT to a helper

Extract drm_mm_reserve_node + calling i915_gem_evict_for_node into its
own routine so that it can be shared rather than duplicated.

v2: Kerneldoc

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: igvt-g-dev@lists.01.org
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170111112312.31493-2-chris@chris-wilson.co.uk


# e007b19d 11-Jan-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Use the MRU stack search after evicting

When we evict from the GTT to make room for an object, the hole we
create is put onto the MRU stack inside the drm_mm range manager. On the
next search pass, we can speed up a PIN_HIGH allocation by referencing
that stack for the new hole.

v2: Pull together the 3 identical implements (ahem, a couple were
outdated) into a common routine for allocating a node and evicting as
necessary.
v3: Detect invalid calls to i915_gem_gtt_insert()
v4: kerneldoc

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170111112312.31493-1-chris@chris-wilson.co.uk


# f51455d4 10-Jan-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Replace 4096 with PAGE_SIZE or I915_GTT_PAGE_SIZE

Start converting over from the byte count to its semantic macro, either
we want to allocate the size of a physical page in main memory or we
want the size of a virtual page in the GTT. 4096 could mean either, but
PAGE_SIZE and I915_GTT_PAGE_SIZE are explicit and should help improve
code comprehension and future changes. In the future, we may want to use
variable GTT page sizes and so have the challenge of knowing which
hardcoded values were used to represent a physical page vs the virtual
page.

v2: Look for a few more 4096s to convert, discover IS_ALIGNED().
v3: 4096ul paranoia, make fence alignment a distinct value of 4096, keep
bdw stolen w/a as 4096 until we know better.
v4: Add asserts that i915_vma_insert() start/end are aligned to GTT page
sizes.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170110144734.26052-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# 9e65a378 13-Dec-2016 Matthew Auld <matthew.auld@intel.com>

drm/i915: don't open code the pdpe/pml4e clearing

Now that it's obvious what the helpers do, we can simplify the code
somewhat by using them when clearing the pdpe/pml4e with the relevant
scratch entry.

Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161213160512.7008-1-matthew.auld@intel.com


# 56843107 13-Dec-2016 Matthew Auld <matthew.auld@intel.com>

drm/i915: s/gen8_setup_page_directory_pointer/gen8_setup_pml4e/

The function name gen8_setup_page_directory_pointer is misleading, and
only serves to confuse the reader, it's not setting up a pdp, but
rather encoding a specific pml4e with a given pdp.

Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# 5c693b2b 13-Dec-2016 Matthew Auld <matthew.auld@intel.com>

drm/i915: s/gen8_setup_page_directory/gen8_setup_pdpe/

The function name gen8_setup_page_directory is misleading, and only
serves to confuse the reader, it's not setting up a pd, but rather
encoding a specific pdpe with a given pd.

Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# 1a292fa5 06-Jan-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Purge loose pages if we run out of DMA remap space

If the DMA remap fails, one cause can be that we have too many objects
pinned in a small remapping table, such as swiotlb. (DMA remapping does
not trigger the shrinker by itself on its normal failure paths.) So try
purging all other objects (using i915_gem_shrink_all(), sparing our own
pages as we have yet to assign them to the obj->pages) and try again. If
there are no pages to reclaim (and consequently no pages to unmap), the
shrinker will report 0 and we fail with -ENOSPC as before.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170106152240.5793-2-chris@chris-wilson.co.uk


# edd1f2fe 06-Jan-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Use fixed-sized types for stolen

Stolen memory is a hardware resource of known size, so use an accurate
fixed integer type rather than the ambiguous variable size_t. This was
motivated by the next patch spotting inconsistencies in our types.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170106152013.24684-3-chris@chris-wilson.co.uk


# db9309a5 05-Jan-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/guc: Exclude the upper end of the Global GTT for the GuC

The GuC uses a special mapping for the upper end of the Global GTT,
similar to the way it uses a special mapping for the lower end, so
exclude it from our drm_mm to prevent us using it.

v2: Rename to reflect that it is unmappable similar to the region at the
bottom of the GGTT, and couple it into the assertion that we don't feed
unmappable addresses to the GuC.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170105153023.30575-5-chris@chris-wilson.co.uk


# 7a0499a4 13-Dec-2016 Matthew Auld <matthew.auld@intel.com>

drm/i915: move vma sanity checking into i915_vma_bind

If we move the sanity checking from gen8_alloc_va_range_3lvl and
gen6_alloc_va_range into i915_vma_bind, we will increase our coverage to
now both callbacks. We also convert each WARN_ON over to a GEM_WARN_ON.

Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161213203222.32564-2-matthew.auld@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>


# b44f97fd 16-Dec-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Simplify i915_gtt_color_adjust()

If we remember that node_list is a circular list containing the fake
head_node, we can use a simple list_next_entry() and skip the NULL check
for the allocated check against the head_node.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20161216074718.32500-3-chris@chris-wilson.co.uk


# 45b186f1 16-Dec-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm: Constify the drm_mm API

Mark up the pointers as constant through the API where appropriate.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20161216074718.32500-5-chris@chris-wilson.co.uk


# 9e1d0e60 05-Dec-2016 Michel Thierry <michel.thierry@intel.com>

drm/i915: Advertise ppgtt support type in platform definition

Instead of being hidden in sanitize_enable_ppgtt.
It also seems to be the place to do so nowadays.

Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>


# 85fd4f58 05-Dec-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Mark all non-vma being inserted into the address spaces

We need to distinguish between full i915_vma structs and simple
drm_mm_nodes when considering eviction (i.e. we must be careful not to
treat a mere drm_mm_node as a much larger i915_vma causing memory
corruption, if we are lucky). To do this, color these not-a-vma with -1
(I915_COLOR_UNEVICTABLE).

v2...v200: New name for -1.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161205142941.21965-1-chris@chris-wilson.co.uk


# cc3f90f0 02-Dec-2016 Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>

drm/i915/glk: Reuse broxton code for geminilake

Geminilake is mostly backwards compatible with broxton, so change most
of the IS_BROXTON() checks to IS_GEN9_LP(). Differences between the
platforms will be implemented in follow-up patches.

v2: Don't reuse broxton's path in intel_update_max_cdclk().
Don't set plane count as in broxton.

v3: Rebase

v4: Include the check intel_bios_is_port_hpd_inverted().
Commit message.

v5: Leave i915_dmc_info() out; glk's csr version != bxt's. (Rodrigo)

v6: Rebase.

v7: Convert a few mode IS_BROXTON() occurances in pps, ddi, dsi and pll
code. (Rodrigo)

v8: Squash a couple of DDI patches with more conversions. (Rodrigo)

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1480667037-11215-2-git-send-email-ander.conselvan.de.oliveira@intel.com


# c6385c94 28-Nov-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Fix tracepoint compilation

drivers/gpu/drm/i915/./i915_trace.h: In function ‘trace_event_raw_event_i915_gem_evict’:
drivers/gpu/drm/i915/./i915_trace.h:409:24: error: ‘struct i915_address_space’ has no member named ‘dev’
__entry->dev = vm->dev->primary->index;

A couple of macros missed in the s/vm->dev/vm->i915/ conversion.

Fixes: 49d73912cbfc ("drm/i915: Convert vm->dev backpointer to vm->i915")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161129124205.19351-1-chris@chris-wilson.co.uk
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>


# 49d73912 29-Nov-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Convert vm->dev backpointer to vm->i915

99% of the time we access i915_address_space->dev we want the i915
device and not the drm device, so let's store the drm_i915_private
backpointer instead. The only real complication here are the inlines
in i915_vma.h where drm_i915_private is not yet defined and so we have
to choose an alternate path for our asserts.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161129095008.32622-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# a18dbba8 28-Nov-2016 Zhi Wang <zhi.a.wang@intel.com>

drm/i915: Move the release of PT page to the upper caller

a PT page will be released if it doesn't contain any meaningful mappings
during PPGTT page table shrinking. The PT entry in the upper level will
be set to a scratch entry.

Normally this works nicely, but in virtualization world, the PPGTT page
table is tracked by hypervisor. Releasing the PT page before modifying
the upper level PT entry would cause extra efforts.

As the tracked page has been returned to OS before losing track from
hypervisor, it could be written in any pattern. Hypervisor has to recognize
if a page is still being used as a PT page by validating these writing
patterns. It's complicated. Better let the guest modify the PT entry in
upper level PT first, then release the PT page.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Zhiyuan Lv <zhiyuan.lv@intel.com>
Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Link: https://patchwork.freedesktop.org/patch/122697/msgid/1479728666-25333-1-git-send-email-zhi.a.wang@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1480402516-22275-1-git-send-email-zhi.a.wang@intel.com


# ed9724dd 17-Nov-2016 Matthew Auld <matthew.auld@intel.com>

drm/i915: add i915_address_space_fini

We already have an i915_address_space_init, so for symmetry we should
also have a _fini, plus we already open code it twice. This then also
fixes a bug where we leak the timeline for the ggtt vm.

v2: don't forget about the struct_mutex for the ggtt path.

Fixes: 80b204bce8f2 ("drm/i915: Enable multiple timelines")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20161117210411.14044-1-chris@chris-wilson.co.uk


# 7ace3d30 16-Nov-2016 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: dev_priv cleanup in i915_gem_stolen.c

And a little bit of cascaded function prototype changes.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>


# 275a991c 16-Nov-2016 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: dev_priv cleanup in i915_gem_gtt.c

Started with removing INTEL_INFO(dev) and cascaded into a quite
big trickle of function prototype changes. Still, I think it is
for the better.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>


# c6be607a 16-Nov-2016 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: dev_priv and a small cascade of cleanups in i915_gem.c

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>


# b7f05d4a 09-Nov-2016 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Pass dev_priv to INTEL_INFO everywhere apart from the gen use

After this patch only conversion of INTEL_INFO(p)->gen to
INTEL_GEN(dev_priv) remains before the __I915__ macro can
be removed.

v2: Tidy vlv_compute_wm. (David Weinehall)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: David Weinehall <david.weinehall@linux.intel.com>
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>


# b42fe9ca 10-Nov-2016 Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

drm/i915: Split out i915_vma.c

As a side product, had to split two other files;
- i915_gem_fence_reg.h
- i915_gem_object.h (only parts that needed immediate untanglement)

I tried to move code in as big chunks as possible, to make review
easier. i915_vma_compare was moved to a header temporarily.

v2:
- Use i915_gem_fence_reg.{c,h}

v3:
- Rebased

v4:
- Fix building when DEBUG_GEM is enabled by reordering a bit.

Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1478861034-30643-1-git-send-email-joonas.lahtinen@linux.intel.com


# dfd2812e 04-Nov-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove the vma from the object list upon close

Currently, the vma is being unlink from the object lookup on destroy.
However, we are meant to be decoupling it upon close so that the user
cannot access the closed vma whilst it remains active on the GPU.

[ 34.074858] kernel BUG at drivers/gpu/drm/i915/i915_gem_gtt.c:3561!
[ 34.074875] invalid opcode: 0000 [#1] PREEMPT SMP
[ 34.074888] Modules linked in: snd_hda_intel i915 x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel lpc_ich mei_me mei snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_codec snd_hwdep snd_hda_core i2c_designware_platform i2c_designware_core snd_pcm e1000e ptp pps_core sdhci_acpi sdhci mmc_core i2c_hid [last unloaded: i915]
[ 34.075010] CPU: 1 PID: 6224 Comm: gem_close_race Tainted: G U 4.9.0-rc3-CI-CI_DRM_1800+ #1
[ 34.075034] Hardware name: /NUC5i7RYB, BIOS RYBDWi35.86A.0355.2016.0224.1501 02/24/2016
[ 34.075057] task: ffff8802459a8040 task.stack: ffffc90000524000
[ 34.075074] RIP: 0010:[<ffffffffa0392cbc>] [<ffffffffa0392cbc>] i915_gem_obj_lookup_or_create_vma+0x8c/0xc0 [i915]
[ 34.075118] RSP: 0018:ffffc90000527b68 EFLAGS: 00010202
[ 34.075135] RAX: ffff8802426c5e40 RBX: 0000000000000000 RCX: ffff8802447fc2a8
[ 34.075158] RDX: 0000000000000000 RSI: ffff8802447fc2a8 RDI: ffff880248a4a880
[ 34.075181] RBP: ffffc90000527b88 R08: 0000000000000008 R09: 0000000000000000
[ 34.075203] R10: 0000000000000001 R11: 0000000000000000 R12: ffff880248a4a880
[ 34.075225] R13: ffff8802447fc2a8 R14: ffff880243e9afa8 R15: ffff880248a4a9c8
[ 34.075248] FS: 00007f9b43e59740(0000) GS:ffff880256c80000(0000) knlGS:0000000000000000
[ 34.075273] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 34.075292] CR2: 00007f9b43419140 CR3: 000000024455d000 CR4: 00000000003406e0
[ 34.075314] Stack:
[ 34.075323] 0000000000000000 ffffc90000527bd0 ffff880243cb8008 ffff880243e9afa8
[ 34.075353] ffffc90000527c08 ffffffffa03874c7 ffffc90000527bb8 ffff880243e9afa8
[ 34.075383] ffff880243e9afb0 ffffc90000527e10 ffff8802447fc2a8 ffff880243cb8040
[ 34.075414] Call Trace:
[ 34.075435] [<ffffffffa03874c7>] eb_lookup_vmas.isra.7+0x247/0x330 [i915]
[ 34.075468] [<ffffffffa0388c34>] i915_gem_do_execbuffer.isra.15+0x604/0x1a10 [i915]
[ 34.075507] [<ffffffffa039c957>] ? i915_gem_object_get_sg+0x347/0x380 [i915]
[ 34.075532] [<ffffffff811a69ce>] ? __might_fault+0x3e/0x90
[ 34.075562] [<ffffffffa038a430>] i915_gem_execbuffer2+0xc0/0x250 [i915]
[ 34.075585] [<ffffffff81552926>] drm_ioctl+0x1f6/0x480
[ 34.075604] [<ffffffff8100107a>] ? trace_hardirqs_on_thunk+0x1a/0x1c
[ 34.075635] [<ffffffffa038a370>] ? i915_gem_execbuffer+0x330/0x330 [i915]
[ 34.075658] [<ffffffff81202d2e>] do_vfs_ioctl+0x8e/0x690
[ 34.075677] [<ffffffff8181582d>] ? _raw_spin_unlock_irqrestore+0x3d/0x60
[ 34.075700] [<ffffffff810fcd51>] ? SyS_timer_settime+0x141/0x1e0
[ 34.075721] [<ffffffff810d6de2>] ? trace_hardirqs_on_caller+0x122/0x1b0
[ 34.075742] [<ffffffff8120336c>] SyS_ioctl+0x3c/0x70
[ 34.075760] [<ffffffff8181602e>] entry_SYSCALL_64_fastpath+0x1c/0xb1
[ 34.075781] Code: 44 a0 48 c7 c2 9a 7e 43 a0 be e0 0d 00 00 48 c7 c7 a0 45 44 a0 e8 55 b8 ce e0 48 85 db 74 a3 49 83 bd f8 03 00 00 00 74 99 0f 0b <0f> 0b 48 89 da 4c 89 ee 4c 89 e7 e8 04 a9 ff ff 48 89 da 49 89
[ 34.075955] RIP [<ffffffffa0392cbc>] i915_gem_obj_lookup_or_create_vma+0x8c/0xc0 [i915]
[ 34.075994] RSP <ffffc90000527b68>

Testcase: igt/gem_close_race/basic-threads
Fixes: db6c2b4151f2 ("drm/i915: Store the vma in an rbtree...")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161104161241.25871-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>


# 2c3a3f44 04-Nov-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Fix pages pin counting around swizzle quirk

commit bc0629a76726 ("drm/i915: Track pages pinned due to swizzling
quirk") fixed one problem, but revealed a whole lot more. The root cause
of the pin count mismatch for the swizzle quirk (for L-shaped memory on
gen3/4) was that we were incrementing the pages_pin_count upon getting
the backing pages but then overwriting the pages_pin_count to set it to
1 afterwards. With a little bit of adjustment to satisfy the GEM_BUG_ON
sanitychecks, the fix is to replace the explicit atomic_set with an
atomic_inc.

v2: Consistently use atomics (not mix atomics and helpers) within the
lowlevel get_pages routines. This makes the atomic operations much
clearer.

Fixes: 1233e2db199d ("drm/i915: Move object backing storage manipulation")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161104103001.27643-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# a44342ac 03-Nov-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Fix test on inputs for vma_compare()

When supplying a view to vma_compare() it is required that the supplied
i915_address_space is the global GTT. I tested the VMA instead (which is
the current position in the rbtree and maybe from any address space).

Reported-by: Matthew Auld <matthew.auld@intel.com>
Tested-by: Matthew Auld <matthew.auld@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98579
Fixes: db6c2b4151f2 ("drm/i915: Store the vma in an rbtree...")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161103200852.23431-1-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.auld@intel.com>


# 56cea323 01-Nov-2016 Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

drm/i915: Unify global_list into global_link

$ sed -i -r 's/\bglobal_list\b/global_link/g' *.c *.h

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1478081764-8058-1-git-send-email-joonas.lahtinen@linux.intel.com


# fce93755 31-Oct-2016 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/gtt: Mark tlbs dirty on clear

Now when clearing ptes can modify upper level pdp's,
we need to mark them dirty so that they will be flushed
correctly.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1478006856-8313-1-git-send-email-mika.kuoppala@intel.com


# 37c63934 01-Nov-2016 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/gtt: Fix pte clear range

Comparing pte index to a number of entries is wrong
when clearing a range of pte entries. Use end marker
of 'one past' to correctly point adequate number of
ptes to the scratch page.

v2: assert early instead of warning late (Chris)
v3: removed consts (Joonas)

Fixes: d209b9c3cd28 ("drm/i915/gtt: Split gen8_ppgtt_clear_pte_range")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98282
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Reported-by: Mike Lothian <mike@fireburn.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>


# db6c2b41 01-Nov-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Store the vma in an rbtree under the object

With full-ppgtt one of the main bottlenecks is the lookup of the VMA
underneath the object. For execbuf there is merit in having a very fast
direct lookup of ctx:handle to the vma using a hashtree, but that still
leaves a large number of other lookups. One way to speed up the lookup
would be to use a rhashtable, but that requires extra allocations and
may exhibit poor worse case behaviour. An alternative is to use an
embedded rbtree, i.e. no extra allocations and deterministic behaviour,
but at the slight cost of O(lgN) lookups (instead of O(1) for
rhashtable). The major of such tree will be very shallow and so not much
slower, and still scales much, much better than the current unsorted
list.

v2: Bump vma_compare() to return a long, as we return the result of
comparing two pointers.

References: https://bugs.freedesktop.org/show_bug.cgi?id=87726
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161101115400.15647-1-chris@chris-wilson.co.uk


# 80b204bc 28-Oct-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Enable multiple timelines

With the infrastructure converted over to tracking multiple timelines in
the GEM API whilst preserving the efficiency of using a single execution
timeline internally, we can now assign a separate timeline to every
context with full-ppgtt.

v2: Add a comment to indicate the xfer between timelines upon submission.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-35-chris@chris-wilson.co.uk


# d07f0e59 28-Oct-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Move GEM activity tracking into a common struct reservation_object

In preparation to support many distinct timelines, we need to expand the
activity tracking on the GEM object to handle more than just a request
per engine. We already use the struct reservation_object on the dma-buf
to handle many fence contexts, so integrating that into the GEM object
itself is the preferred solution. (For example, we can now share the same
reservation_object between every consumer/producer using this buffer and
skip the manual import/export via dma-buf.)

v2: Reimplement busy-ioctl (by walking the reservation object), postpone
the ABI change for another day. Similarly use the reservation object to
find the last_write request (if active and from i915) for choosing
display CS flips.

Caveats:

* busy-ioctl: busy-ioctl only reports on the native fences, it will not
warn of stalls (in set-domain-ioctl, pread/pwrite etc) if the object is
being rendered to by external fences. It also will not report the same
busy state as wait-ioctl (or polling on the dma-buf) in the same
circumstances. On the plus side, it does retain reporting of which
*i915* engines are engaged with this object.

* non-blocking atomic modesets take a step backwards as the wait for
render completion blocks the ioctl. This is fixed in a subsequent
patch to use a fence instead for awaiting on the rendering, see
"drm/i915: Restore nonblocking awaits for modesetting"

* dynamic array manipulation for shared-fences in reservation is slower
than the previous lockless static assignment (e.g. gem_exec_lut_handle
runtime on ivb goes from 42s to 66s), mainly due to atomic operations
(maintaining the fence refcounts).

* loss of object-level retirement callbacks, emulated by VMA retirement
tracking.

* minor loss of object-level last activity information from debugfs,
could be replaced with per-vma information if desired

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-21-chris@chris-wilson.co.uk


# 03ac84f1 28-Oct-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Pass around sg_table to get_pages/put_pages backend

The plan is to move obj->pages out from under the struct_mutex into its
own per-object lock. We need to prune any assumption of the struct_mutex
from the get_pages/put_pages backends, and to make it easier we pass
around the sg_table to operate on rather than indirectly via the obj.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-13-chris@chris-wilson.co.uk


# a4f5ea64 28-Oct-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Refactor object page API

The plan is to make obtaining the backing storage for the object avoid
struct_mutex (i.e. use its own locking). The first step is to update the
API so that normal users only call pin/unpin whilst working on the
backing storage.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-12-chris@chris-wilson.co.uk


# d2a84a76 28-Oct-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Use radixtree to jump start intel_partial_pages()

We can use the radixtree index of the obj->pages to find the start
position of the desired partial range.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-11-chris@chris-wilson.co.uk


# 4c7d62c6 28-Oct-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Markup GEM API with lockdep asserts

Add lockdep_assert_held(struct_mutex) to the API preamble of the
internal GEM interfaces.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-9-chris@chris-wilson.co.uk


# f8a7fde4 28-Oct-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Defer active reference until required

We only need the active reference to keep the object alive after the
handle has been deleted (so as to prevent a synchronous gem_close). Why
then pay the price of a kref on every execbuf when we can insert that
final active ref just in time for the handle deletion?

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-6-chris@chris-wilson.co.uk


# 2eedfc7d 24-Oct-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove RPM sequence checking

We only used the RPM sequence checking inside the lowlevel GTT
accessors, when we had to rely on callers taking the wakeref on our
behalf. Now that we take the RPM wakeref inside the GTT management
routines themselves, we can forgo the sanitycheck of the callers.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161024124218.18252-4-chris@chris-wilson.co.uk


# 9c870d03 24-Oct-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Use RPM as the barrier for controlling user mmap access

We can remove the false coupling between RPM and struct mutex by the
observation that we can use the RPM wakeref as the barrier around user
mmap access. That is as we tear down the user's PTE atomically from
within rpm suspend and then to fault in new PTE requires the rpm
wakeref, means that no user access is possible through those PTE without
RPM being awake. Having made that observation, we can then remove the
presumption of having to take rpm outside of struct_mutex and so allow
fine grained acquisition of a wakeref around hw access rather than
having to remember to acquire the wakeref early on.

v2: Rejig placement of the new intel_runtime_pm_get() to be as tight
as possible around the GTT pread/pwrite.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20161024124218.18252-2-chris@chris-wilson.co.uk


# 2ce5179f 13-Oct-2016 Michał Winiarski <michal.winiarski@intel.com>

drm/i915/gtt: Free unused lower-level page tables

Since "Dynamic page table allocations" were introduced, our page tables
can grow (being dynamically allocated) with address space range usage.
Unfortunately, their lifetime is bound to vm. This is not a huge problem
when we're not using softpin - drm_mm is creating an upper bound on used
range by causing addresses for our VMAs to eventually be reused.

With softpin, long lived contexts can drain the system out of memory
even with a single "small" object. For example:

bo = bo_alloc(size);
while(true)
offset += size;
exec(bo, offset);

Will cause us to create new allocations until all memory in the system
is used for tracking GPU pages (even though almost all PTEs in this vm
are pointing to scratch).

Let's free unused page tables in clear_range to prevent this - if no
entries are used, we can safely free it and return this information to
the caller (so that higher-level entry is pointing to scratch).

v2: Document return value and free semantics (Joonas)
v3: No newlines in vars block (Joonas)
v4: Drop redundant local 'reduce' variable
v5: Handle CI fail with enable_ppgtt=2

Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1476360162-24062-3-git-send-email-michal.winiarski@intel.com


# d209b9c3 13-Oct-2016 Michał Winiarski <michal.winiarski@intel.com>

drm/i915/gtt: Split gen8_ppgtt_clear_pte_range

Let's use more top-down approach, where each gen8_ppgtt_clear_* function
is responsible for clearing the struct passed as an argument and calling
relevant clear_range functions on lower-level tables.
Doing this rather than operating on PTE ranges makes the implementation
of shrinking page tables quite simple.

v2: Drop min when calculating num_entries, no negation in 48b ppgtt
check, no newlines in vars block (Joonas)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1476360162-24062-2-git-send-email-michal.winiarski@intel.com


# 4fb84d99 13-Oct-2016 Michał Winiarski <michal.winiarski@intel.com>

drm/i915: Remove unused "valid" parameter from pte_encode

We never used any invalid ptes, those were put in place for
a possibility of doing gpu faults. However our batchbuffers are not
restricted in length, so everything needs to be pointing to something
and thus out-of-bounds is pointing to scratch.

Remove the valid flag as it is always true.

v2: Expand commit msg, patch reorder (Mika)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1476360162-24062-1-git-send-email-michal.winiarski@intel.com


# 5db94019 13-Oct-2016 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Make IS_GEN macros only take dev_priv

Saves 1416 bytes of .rodata strings.

v2: Add parantheses around dev_priv. (Ville Syrjala)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Jani Nikula <jani.nikula@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1476352990-2504-1-git-send-email-tvrtko.ursulin@linux.intel.com


# 920a14b2 14-Oct-2016 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Make IS_CHERRYVIEW only take dev_priv

Saves 864 bytes of .rodata strings and ~100 of .text.

v2: Add parantheses around dev_priv. (Ville Syrjala)
v3: Rebase.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Jani Nikula <jani.nikula@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>


# e2d214ae 13-Oct-2016 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Make IS_BROXTON only take dev_priv

Saves 1392 bytes of .rodata strings.

Also change a few function/macro prototypes in i915_gem_gtt.c
from dev to dev_priv where it made more sense to do so.

v2: Add parantheses around dev_priv. (Ville Syrjala)
v3: Mention function prototype changes. (David Weinehall)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: David Weinehall <david.weinehall@linux.intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Jani Nikula <jani.nikula@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>


# d9486e65 13-Oct-2016 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Make IS_SKYLAKE only take dev_priv

Saves 1016 bytes of .rodata strings and couple hundred of .text.

v2: Add parantheses around dev_priv. (Ville Syrjala)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Jani Nikula <jani.nikula@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>


# 772c2a51 13-Oct-2016 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Make IS_HASWELL only take dev_priv

Saves 2432 bytes of .rodata strings.

v2: Add parantheses around dev_priv. (Ville Syrjala)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Jani Nikula <jani.nikula@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>


# 8652744b 13-Oct-2016 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Make IS_BROADWELL only take dev_priv

Saves 1808 bytes of .rodata strings.

v2: Add parantheses around dev_priv. (Ville Syrjala)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Jani Nikula <jani.nikula@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>


# 3b3f1650 13-Oct-2016 Akash Goel <akash.goel@intel.com>

drm/i915: Allocate intel_engine_cs structure only for the enabled engines

With the possibility of addition of many more number of rings in future,
the drm_i915_private structure could bloat as an array, of type
intel_engine_cs, is embedded inside it.
struct intel_engine_cs engine[I915_NUM_ENGINES];
Though this is still fine as generally there is only a single instance of
drm_i915_private structure used, but not all of the possible rings would be
enabled or active on most of the platforms. Some memory can be saved by
allocating intel_engine_cs structure only for the enabled/active engines.
Currently the engine/ring ID is kept static and dev_priv->engine[] is simply
indexed using the enums defined in intel_engine_id.
To save memory and continue using the static engine/ring IDs, 'engine' is
defined as an array of pointers.
struct intel_engine_cs *engine[I915_NUM_ENGINES];
dev_priv->engine[engine_ID] will be NULL for disabled engine instances.

There is a text size reduction of 928 bytes, from 1028200 to 1027272, for
i915.o file (but for i915.ko file text size remain same as 1193131 bytes).

v2:
- Remove the engine iterator field added in drm_i915_private structure,
instead pass a local iterator variable to the for_each_engine**
macros. (Chris)
- Do away with intel_engine_initialized() and instead directly use the
NULL pointer check on engine pointer. (Chris)

v3:
- Remove for_each_engine_id() macro, as the updated macro for_each_engine()
can be used in place of it. (Chris)
- Protect the access to Render engine Fault register with a NULL check, as
engine specific init is done later in Driver load sequence.

v4:
- Use !!dev_priv->engine[VCS] style for the engine check in getparam. (Chris)
- Kill the superfluous init_engine_lists().

v5:
- Cleanup the intel_engines_init() & intel_engines_setup(), with respect to
allocation of intel_engine_cs structure. (Chris)

v6:
- Rebase.

v7:
- Optimize the for_each_engine_masked() macro. (Chris)
- Change the type of 'iter' local variable to enum intel_engine_id. (Chris)
- Rebase.

v8: Rebase.

v9: Rebase.

v10:
- For index calculation use engine ID instead of pointer based arithmetic in
intel_engine_sync_index() as engine pointers are not contiguous now (Chris)
- For appropriateness, rename local enum variable 'iter' to 'id'. (Joonas)
- Use for_each_engine macro for cleanup in intel_engines_init() and remove
check for NULL engine pointer in cleanup() routines. (Joonas)

v11: Rebase.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1476378888-7372-1-git-send-email-akash.goel@intel.com


# 95374d75 12-Oct-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Always use the GTT for error capture

Since the GTT provides universal access to any GPU page, we can use it
to reduce our plethora of read methods to just one. It also has the
important characteristic of being exactly what the GPU sees - if there
are incoherency problems, seeing the batch as executed (rather than as
trapped inside the cpu cache) is important.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161012090522.367-4-chris@chris-wilson.co.uk


# 82daabae 09-Sep-2016 Matthew Auld <matthew.auld@intel.com>

drm/i915: remove writeq ifdeffery

drm already provides fallback versions of readq and writeq.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1473451373-9852-1-git-send-email-matthew.auld@intel.com


# fbb30a5c 09-Sep-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Flush to GTT domain all GGTT bound objects after hibernation

Recently I have been applying an optimisation to avoid stalling and
clflushing GGTT objects based on their current binding. That is we only
set-to-gtt-domain upon first bind. However, on hibernation the objects
remain bound, but they are in the CPU domain. Currently (since commit
975f7ff42edf ("drm/i915: Lazily migrate the objects after hibernation"))
we only flush scanout objects as all other objects are expected to be
flushed prior to use. That breaks down in the face of the runtime
optimisation above - and we need to flush all GGTT pinned objects
(essentially ringbuffers).

To reduce the burden of extra clflushes, we only flush those objects we
cannot discard from the GGTT. Everything pinned to the scanout, or
current contexts or ringbuffers will be flushed and rebound. Other
objects, such as inactive contexts, will be left unbound and in the CPU
domain until first use after resuming.

Fixes: 7abc98fadfdd ("drm/i915: Only change the context object's domain...")
Fixes: 57e885318119 ("drm/i915: Use VMA for ringbuffer tracking")
References: https://bugs.freedesktop.org/show_bug.cgi?id=94722
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: David Weinehall <david.weinehall@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160909201957.2499-1-chris@chris-wilson.co.uk


# 22dd3bb9 09-Sep-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Mark up all locked waiters

In the next patch we want to handle reset directly by a locked waiter in
order to avoid issues with returning before the reset is handled. To
handle the reset, we must first know whether we hold the struct_mutex.
If we do not hold the struct_mtuex we can not perform the reset, but we do
not block the reset worker either (and so we can just continue to wait for
request completion) - otherwise we must relinquish the mutex.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160909131201.16673-10-chris@chris-wilson.co.uk


# ea746f36 09-Sep-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Expand bool interruptible to pass flags to i915_wait_request()

We need finer control over wakeup behaviour during i915_wait_request(),
so expand the current bool interruptible to a bitmask.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160909131201.16673-9-chris@chris-wilson.co.uk


# 557b1a8c 05-Sep-2016 Zhi Wang <zhi.a.wang@intel.com>

drm/i915: disable 48bit full PPGTT when vGPU is active

Disable 48bit full PPGTT on vGPU too for now.

Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: drm-intel-fixes@lists.freedesktop.org
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160906040412.1274-3-zhenyuw@linux.intel.com
(cherry picked from commit e320d40022128845dfff900422ea9fd69f576c98)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# e320d400 05-Sep-2016 Zhi Wang <zhi.a.wang@intel.com>

drm/i915: disable 48bit full PPGTT when vGPU is active

Disable 48bit full PPGTT on vGPU too for now.

Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: drm-intel-fixes@lists.freedesktop.org
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160906040412.1274-3-zhenyuw@linux.intel.com


# bb8f9cff 22-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Allow DMA pagetables to use highmem

As we never need to directly access the pages we allocate for scratch and
the pagetables, and always remap them into the GTT through the dma
remapper, we do not need to limit the allocations to lowmem i.e. we can
pass in the __GFP_HIGHMEM flag to the page allocation.

For backwards compatibility, e.g. certain old GPUs not liking highmem
for certain functions that may be accidentally mapped to the scratch
page by userspace, keep the GMCH probe as only allocating from DMA32.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20160822074431.26872-3-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# 8bcdd0f7 22-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Embed the scratch page struct into each VM

As the scratch page is no longer shared between all VM, and each has
their own, forgo the small allocation and simply embed the scratch page
struct into the i915_address_space.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20160822074431.26872-2-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Mika Kuoppala <mika.kuoppala@intel.com>


# 52a05c30 22-Aug-2016 David Weinehall <david.weinehall@linux.intel.com>

drm/i915: pdev cleanup

In an effort to simplify things for a future push of dev_priv instead
of dev wherever possible, always take pdev via dev_priv where
feasible, eliminating the direct access from dev. Right now this
only eliminates a few cases of dev, but it also obviates that we pass
dev into a lot of functions where dev_priv would be the more obvious
choice.

v2: Fixed one more place missing in the previous patch set

Signed-off-by: David Weinehall <david.weinehall@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160822103245.24069-5-david.weinehall@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>


# c49d13ee 22-Aug-2016 David Weinehall <david.weinehall@linux.intel.com>

drm/i915: consistent struct device naming

We currently have a mix of struct device *device, struct device *kdev,
and struct device *dev (the latter forcing us to refer to
struct drm_device as something else than the normal dev).

To simplify things, always use kdev when referring to struct device.

v2: Replace the dev_to_drm_minor() macro with the inline function
kdev_to_drm_minor().

Signed-off-by: David Weinehall <david.weinehall@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160822103245.24069-3-david.weinehall@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>


# 14daa63b 22-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Stop marking the unaccessible scratch page as UC

Since by design, if not entirely by practice, nothing is allowed to
access the scratch page we use to background fill the VM, then we do not
need to ensure that it is coherent between the CPU and GPU.
set_pages_uc() does a stop_machine() after changing the PAT, and that
significantly impacts upon context creation throughput.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20160822074431.26872-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# f7bbe788 19-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Embed the io-mapping struct inside drm_i915_private

As io_mapping.h now always allocates the struct, we can avoid that
allocation and extra pointer dance by embedding the struct inside
drm_i915_private

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160819155428.1670-5-chris@chris-wilson.co.uk


# 49ef5294 18-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Move fence tracking from object to vma

In order to handle tiled partial GTT mmappings, we need to associate the
fence with an individual vma.

v2: A couple of silly drops replaced spotted by Joonas

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-21-chris@chris-wilson.co.uk


# 05a20d09 18-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Move map-and-fenceable tracking to the VMA

By moving map-and-fenceable tracking from the object to the VMA, we gain
fine-grained tracking and the ability to track individual fences on the VMA
(subsequent patch).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-16-chris@chris-wilson.co.uk


# 058d88c4 15-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Track pinned VMA

Treat the VMA as the primary struct responsible for tracking bindings
into the GPU's VM. That is we want to treat the VMA returned after we
pin an object into the VM as the cookie we hold and eventually release
when unpinning. Doing so eliminates the ambiguity in pinning the object
and then searching for the relevant pin later.

v2: Joonas' stylistic nitpicks, a fun rebase.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-27-git-send-email-chris@chris-wilson.co.uk


# 19880c4a 15-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Consolidate i915_vma_unpin_and_release()

In a few places, we repeat a call to clear a pointer to a vma whilst
unpinning and releasing a reference to its owner. Refactor those into a
common function.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-26-git-send-email-chris@chris-wilson.co.uk


# e5cdb22b 15-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Move assertion for iomap access to i915_vma_pin_iomap

Access through the GTT requires the device to be awake. Ideally
i915_vma_pin_iomap() is short-lived and the pinning demarcates the
access through the iomap. This is not entirely true, we have a mixture
of long lived pins that exceed the wakelock (such as legacy ringbuffers)
and short lived pin that do live within the wakelock (such as execlist
ringbuffers).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-17-git-send-email-chris@chris-wilson.co.uk


# 81a8aa4a 15-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Create a VMA for an object

In many places, we wish to store the VMA in preference to the object
itself and so being able to create the persistent VMA is useful.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-9-git-send-email-chris@chris-wilson.co.uk


# 247177dd 15-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Always set the vma->pages

Previously, we would only set the vma->pages pointer for GGTT entries.
However, if we always set it, we can use it to prettify some code that
may want to access the backing store associated with the VMA (as
assigned to the VMA).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-8-git-send-email-chris@chris-wilson.co.uk


# 6687c906 15-Sep-2015 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Rewrite fb rotation GTT handling

Redo the fb rotation handling in order to:
- eliminate the NV12 special casing
- handle fb->offsets[] properly
- make the rotation handling easier for the plane code

To achieve these goals we reduce intel_rotation_info to only contain
(for each plane) the rotated view width,height,stride in tile units,
and the page offset into the object where the plane starts. Each plane
is handled exactly the same way, no special casing for NV12 or other
formats. We then store the computed rotation_info under
intel_framebuffer so that we don't have to recompute it again.

To handle fb->offsets[] we treat them as a linear offsets and convert
them to x/y offsets from the start of the relevant GTT mapping (either
normal or rotated). We store the x/y offsets under intel_framebuffer,
and for some extra convenience we also store the rotated pitch (ie.
tile aligned plane height). So for each plane we have the normal
x/y offsets, rotated x/y offsets, and the rotated pitch. The normal
pitch is available already in fb->pitches[].

While we're gathering up all that extra information, we can also easily
compute the storage requirements for the framebuffer, so that we can
check that the object is big enough to hold it.

When it comes time to deal with the plane source coordinates, we first
rotate the clipped src coordinates to match the relevant GTT view
orientation, then add to them the fb x/y offsets. Next we compute
the aligned surface page offset, and as a result we're left with some
residual x/y offsets. Finally, if required by the hardware, we convert
the remaining x/y offsets into a linear offset.

For gen2/3 we simply skip computing the final page offset, and just
convert the src+fb x/y offsets directly into a linear offset since
that's what the hardware wants.

After this all platforms, incluing SKL+, compute these things in exactly
the same way (excluding alignemnt differences).

v2: Use BIT(DRM_ROTATE_270) instead of ROTATE_270 when rotating
plane src coordinates
Drop some spurious changes that got left behind during
development
v3: Split out more changes to prep patches (Daniel)
s/intel_fb->plane[].foo.bar/intel_fb->foo[].bar/ for brevity
Rename intel_surf_gtt_offset to intel_fb_gtt_offset
Kill the pointless 'plane' parameter from intel_fb_gtt_offset()
v4: Fix alignment vs. alignment-1 when calling
_intel_compute_tile_offset() from intel_fill_fb_info()
Pass the pitch in tiles in
stad of pixels to intel_adjust_tile_offset() from intel_fill_fb_info()
Pass the full width/height of the rotated area to
drm_rect_rotate() for clarity
Use u32 for more offsets
v5: Preserve the upper_32_bits()/lower_32_bits() handling for the
fb ggtt offset (Sivakumar)
v6: Rebase due to drm_plane_state src/dst rects

Cc: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470821001-25272-2-git-send-email-ville.syrjala@linux.intel.com
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 3871f42a 05-Aug-2016 Matthew Auld <matthew.auld@intel.com>

drm/i915: fix aliasing_ppgtt leak

In i915_ggtt_cleanup_hw we need to remember to free aliasing_ppgtt. This
fixes the following kmemleak message:

unreferenced object 0xffff880213cca000 (size 8192):
comm "modprobe", pid 1298, jiffies 4294745402 (age 703.930s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<ffffffff817c808e>] kmemleak_alloc+0x4e/0xb0
[<ffffffff8121f9c2>] kmem_cache_alloc_trace+0x142/0x1d0
[<ffffffffa06d11ef>] i915_gem_init_ggtt+0x10f/0x210 [i915]
[<ffffffffa06d71bb>] i915_gem_init+0x5b/0xd0 [i915]
[<ffffffffa069749a>] i915_driver_load+0x97a/0x1460 [i915]
[<ffffffffa06a26ef>] i915_pci_probe+0x4f/0x70 [i915]
[<ffffffff81423015>] local_pci_probe+0x45/0xa0
[<ffffffff81424463>] pci_device_probe+0x103/0x150
[<ffffffff81515e6c>] driver_probe_device+0x22c/0x440
[<ffffffff81516151>] __driver_attach+0xd1/0xf0
[<ffffffff8151379c>] bus_for_each_dev+0x6c/0xc0
[<ffffffff8151555e>] driver_attach+0x1e/0x20
[<ffffffff81514fa3>] bus_add_driver+0x1c3/0x280
[<ffffffff81516aa0>] driver_register+0x60/0xe0
[<ffffffff8142297c>] __pci_register_driver+0x4c/0x50
[<ffffffffa013605b>] 0xffffffffa013605b

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Fixes: b18b6bde300e ("drm/i915/bdw: Free PPGTT struct")
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1470420280-21417-1-git-send-email-matthew.auld@intel.com
(cherry picked from commit cb7f27601c81a1e0454e9461e96f65b31fafbea0)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# cb7f2760 05-Aug-2016 Matthew Auld <matthew.auld@intel.com>

drm/i915: fix aliasing_ppgtt leak

In i915_ggtt_cleanup_hw we need to remember to free aliasing_ppgtt. This
fixes the following kmemleak message:

unreferenced object 0xffff880213cca000 (size 8192):
comm "modprobe", pid 1298, jiffies 4294745402 (age 703.930s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<ffffffff817c808e>] kmemleak_alloc+0x4e/0xb0
[<ffffffff8121f9c2>] kmem_cache_alloc_trace+0x142/0x1d0
[<ffffffffa06d11ef>] i915_gem_init_ggtt+0x10f/0x210 [i915]
[<ffffffffa06d71bb>] i915_gem_init+0x5b/0xd0 [i915]
[<ffffffffa069749a>] i915_driver_load+0x97a/0x1460 [i915]
[<ffffffffa06a26ef>] i915_pci_probe+0x4f/0x70 [i915]
[<ffffffff81423015>] local_pci_probe+0x45/0xa0
[<ffffffff81424463>] pci_device_probe+0x103/0x150
[<ffffffff81515e6c>] driver_probe_device+0x22c/0x440
[<ffffffff81516151>] __driver_attach+0xd1/0xf0
[<ffffffff8151379c>] bus_for_each_dev+0x6c/0xc0
[<ffffffff8151555e>] driver_attach+0x1e/0x20
[<ffffffff81514fa3>] bus_add_driver+0x1c3/0x280
[<ffffffff81516aa0>] driver_register+0x60/0xe0
[<ffffffff8142297c>] __pci_register_driver+0x4c/0x50
[<ffffffffa013605b>] 0xffffffffa013605b

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Fixes: b18b6bde300e ("drm/i915/bdw: Free PPGTT struct")
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1470420280-21417-1-git-send-email-matthew.auld@intel.com


# 307dc25b 05-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Simplify do_idling() (Ironlake vt-d w/a)

Now that we pass along the expected interruptible nature for the
wait-for-idle, we do not need to modify the global
i915->mm.interruptible for this single call. (Only the immediate call to
i915_gem_wait_for_idle() takes the interruptible status as the other
action, dma_map_sg(), is independent of i915.ko)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-7-git-send-email-chris@chris-wilson.co.uk


# dcff85c8 05-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Enable i915_gem_wait_for_idle() without holding struct_mutex

The principal motivation for this was to try and eliminate the
struct_mutex from i915_gem_suspend - but we still need to hold the mutex
current for the i915_gem_context_lost(). (The issue there is that there
may be an indirect lockdep cycle between cpu_hotplug (i.e. suspend) and
struct_mutex via the stop_machine().) For the moment, enabling last
request tracking for the engine, allows us to do busyness checking and
waiting without requiring the struct_mutex - which is useful in its own
right.

As a side-effect of having a robust means for tracking engine busyness,
we can replace our other busyness heuristic, that of comparing against
the last submitted seqno. For paranoid reasons, we have a semi-ordered
check of that seqno inside the hangchecker, which we can now improve to
an ordered check of the engine's busyness (removing a locked xchg in the
process).

v2: Pass along "bool interruptible" as being unlocked we cannot rely on
i915->mm.interruptible being stable or even under our control.
v3: Replace check Ironlake i915_gpu_busy() with the common precalculated value

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-6-git-send-email-chris@chris-wilson.co.uk


# de895082 04-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove highly confusing i915_gem_obj_ggtt_pin()

Since i915_gem_obj_ggtt_pin() is an idiom breaking curry function for
i915_gem_object_ggtt_pin(), spare us the confusion and remove it.
Removing it now simplifies later patches to change the i915_vma_pin()
(and friends) interface.

v2: Add a redundant GEM_BUG_ON(!view) to
i915_gem_obj_lookup_or_create_ggtt_vma()

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-18-git-send-email-chris@chris-wilson.co.uk


# 3272db53 04-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Combine all i915_vma bitfields into a single set of flags

In preparation to perform some magic to speed up i915_vma_pin(), which
is among the hottest of hot paths in execbuf, refactor all the bitfields
accessed by i915_vma_pin() into a single unified set of flags.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-16-git-send-email-chris@chris-wilson.co.uk


# 59bfa124 04-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Start passing around i915_vma from execbuffer

During execbuffer we look up the i915_vma in order to reserve them in
the VM. However, we then do a double lookup of the vma in order to then
pin them, all because we lack the necessary interfaces to operate on
i915_vma - so introduce i915_vma_pin()!

v2: Tidy parameter lists to remove one level of redirection in the hot
path.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-15-git-send-email-chris@chris-wilson.co.uk


# 20dfbde4 04-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Wrap vma->pin_count accessors with small inline helpers

In the next few patches, the VMA pinning API is overhauled and to reduce
the churn we pull out the update to the accessors into a prep patch.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-14-git-send-email-chris@chris-wilson.co.uk


# de180033 04-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Record allocated vma size

Tracking the size of the VMA as allocated allows us to dramatically
reduce the complexity of later functions (like inserting the VMA in to
the drm_mm range manager).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-13-git-send-email-chris@chris-wilson.co.uk


# e522ac23 04-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove surplus drm_device parameter to i915_gem_evict_something()

Eviction is VM local, so we can ignore the significance of the
drm_device in the caller, and leave it to i915_gem_evict_something() to
manage itself.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-2-git-send-email-chris@chris-wilson.co.uk


# df0e9a28 04-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

Revert "drm/i915: Clean up associated VMAs on context destruction"

This reverts commit e9f24d5fb7cf3628b195b18ff3ac4e37937ceeae.

The patch was only a stop-gap measure that fixed half the problem - the
leak of the fbcon when restarting X. A complete solution required
releasing the VMA when the object itself was closed rather than rely on
file/process exit. The previous patches add the VMA tracking necessary
to do close them along with the object, context or file, and so the time
has come to remove the partial fix.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-28-git-send-email-chris@chris-wilson.co.uk


# 50e046b6 04-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Mark the context and address space as closed

When the user closes the context mark it and the dependent address space
as closed. As we use an asynchronous destruct method, this has two
purposes. First it allows us to flag the closed context and detect
internal errors if we to create any new objects for it (as it is removed
from the user's namespace, these should be internal bugs only). And
secondly, it allows us to immediately reap stale vma.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-27-git-send-email-chris@chris-wilson.co.uk


# b1f788c6 04-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Release vma when the handle is closed

In order to prevent a leak of the vma on shared objects, we need to
hook into the object_close callback to destroy the vma on the object for
this file. However, if we destroyed that vma immediately we may cause
unexpected application stalls as we try to unbind a busy vma - hence we
defer the unbind to when we retire the vma.

v2: Keep vma allocated until closed. This is useful for a later
optimisation, but it is required now in order to handle potential
recursion of i915_vma_unbind() by retiring itself.
v3: Comments are important.

Testcase: igt/gem_ppggtt/flink-and-close-vma-leak
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-26-git-send-email-chris@chris-wilson.co.uk


# b0decaf7 04-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Track active vma requests

Hook the vma itself into the i915_gem_request_retire() so that we can
accurately track when a solitary vma is inactive (as opposed to having
to wait for the entire object to be idle). This improves the interaction
when using multiple contexts (with full-ppgtt) and eliminates some
frequent list walking when retiring objects after a completed request.

A side-effect is that we get an active vma reference for free. The
consequence of this is shown in the next patch...

v2: Update inline names to be consistent with
i915_gem_object_get_active()

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-25-git-send-email-chris@chris-wilson.co.uk


# 2bfa996e 04-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Store owning file on the i915_address_space

For the global GTT (and aliasing GTT), the address space is owned by the
device (it is a global resource) and so the per-file owner field is
NULL. For per-process GTT (where we create an address space per
context), each is owned by the opening file. We can use this ownership
information to both distinguish GGTT and ppGTT address spaces, as well
as occasionally inspect the owner.

v2: Whitespace, tells us who owns i915_address_space

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-6-git-send-email-chris@chris-wilson.co.uk


# 34c998b4 04-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Rearrange GGTT probing to avoid needing a vfunc

Since we have a static if-else-chain for device probing of the global
GTT, we do not need to use a function pointer, let alone store it when
we never use it again. So use the if-else-chain to call down into the
device specific probe.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-5-git-send-email-chris@chris-wilson.co.uk


# f6b9d5ca 04-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Split early global GTT initialisation

Initialising the global GTT is tricky as we wish to use the drm_mm range
manager during the modesetting initialisation (to capture stolen
allocations from the BIOS) before we actually enable GEM. To overcome
this, we currently setup the drm_mm first and then carefully rebind
them.

v2: Fixup after rebasing
v3: GGTT initialisation needs to be split around kicking out conflicts
v4: Restore an old UMS BUG_ON(mappable > total) as a DRM_ERROR plus
fixup of probe results.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-4-git-send-email-chris@chris-wilson.co.uk


# 97d6d7ab 04-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Update GGTT initialisation functions to take drm_i915_private

Since these are internal functions they operate on drm_i915_private and
not the drm_device being passed in. So pass in the drm_i915_private
instead, and remove one layer of dancing. No space wins here, just
conforming to the norm in function parameters.

v2: Include all the probe functions

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-3-git-send-email-chris@chris-wilson.co.uk


# 0088e522 04-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Split GGTT initialisation between probing and setup

In order to handle conflicting drivers (i.e. vgacon) having a different
setup of hardware, we have to remove those other drivers before we try
to setup our own mappings. This requires us to split GGTT initialisation
between probing for the hardware location (part of the PCI BAR) and
later establishing the kernel resources for it.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-2-git-send-email-chris@chris-wilson.co.uk


# 7c9cf4e3 02-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Reduce engine->emit_flush() to a single mode parameter

Rather than passing a complete set of GPU cache domains for either
invalidation or for flushing, or even both, just pass a single parameter
to the engine->emit_flush to determine the required operations.

engine->emit_flush(GPU, 0) -> engine->emit_flush(EMIT_INVALIDATE)
engine->emit_flush(0, GPU) -> engine->emit_flush(EMIT_FLUSH)
engine->emit_flush(GPU, GPU) -> engine->emit_flush(EMIT_FLUSH | EMIT_INVALIDATE)

This allows us to extend the behaviour easily in future, for example if
we want just a command barrier without the overhead of flushing.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dave Gordon <david.s.gordon@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-8-git-send-email-chris@chris-wilson.co.uk


# c7fe7d25 02-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove obsolete engine->gpu_caches_dirty

Space for flushing the GPU cache prior to completing the request is
preallocated and so cannot fail - the GPU caches will always be flushed
along with the completed request. This means we no longer have to track
whether the GPU cache is dirty between batches like we had to with the
outstanding_lazy_seqno.

With the removal of the duplication in the per-backend entry points for
emitting the obsolete lazy flush, we can then further unify the
engine->emit_flush.

v2: Expand a bit on the legacy of gpu_caches_dirty

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-18-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-7-git-send-email-chris@chris-wilson.co.uk


# 7e37f889 02-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Rename struct intel_ringbuffer to struct intel_ring

The state stored in this struct is not only the information about the
buffer object, but the ring used to communicate with the hardware. Using
buffer here is overly specific and, for me at least, conflates with the
notion of buffer objects themselves.

s/struct intel_ringbuffer/struct intel_ring/
s/enum intel_ring_hangcheck/enum intel_engine_hangcheck/
s/describe_ctx_ringbuf()/describe_ctx_ring()/
s/intel_ring_get_active_head()/intel_engine_get_active_head()/
s/intel_ring_sync_index()/intel_engine_sync_index()/
s/intel_ring_init_seqno()/intel_engine_init_seqno()/
s/ring_stuck()/engine_stuck()/
s/intel_cleanup_engine()/intel_engine_cleanup()/
s/intel_stop_engine()/intel_engine_stop()/
s/intel_pin_and_map_ringbuffer_obj()/intel_pin_and_map_ring()/
s/intel_unpin_ringbuffer()/intel_unpin_ring()/
s/intel_engine_create_ringbuffer()/intel_engine_create_ring()/
s/intel_ring_flush_all_caches()/intel_engine_flush_all_caches()/
s/intel_ring_invalidate_all_caches()/intel_engine_invalidate_all_caches()/
s/intel_ringbuffer_free()/intel_ring_free()/

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-15-git-send-email-chris@chris-wilson.co.uk
Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-4-git-send-email-chris@chris-wilson.co.uk


# 1dae2dfb 02-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Rename request->ringbuf to request->ring

Now that we have disambuigated ring and engine, we can use the clearer
and more consistent name for the intel_ringbuffer pointer in the
request.

@@
struct drm_i915_gem_request *r;
@@
- r->ringbuf
+ r->ring

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-12-git-send-email-chris@chris-wilson.co.uk
Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-2-git-send-email-chris@chris-wilson.co.uk


# b5321f30 02-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Unify intel_logical_ring_emit and intel_ring_emit

Both perform the same actions with more or less indirection, so just
unify the code.

v2: Add back a few intel_engine_cs locals

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-11-git-send-email-chris@chris-wilson.co.uk
Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-1-git-send-email-chris@chris-wilson.co.uk


# 2a1d7752 25-Jul-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Prefer list_first_entry_or_null

list_first_entry_or_null() can generate better code than using
if (!list_empty()) {ptr = list_first_entry()) ..., so put it to use.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-3-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469530913-17180-2-git-send-email-chris@chris-wilson.co.uk


# 406ea8d2 20-Jul-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Treat ringbuffer writes as write to normal memory

Ringbuffers are now being written to either through LLC or WC paths, so
treating them as simply iomem is no longer adequate. However, for the
older !llc hardware, the hardware is documentated as treating the TAIL
register update as serialising, so we can relax the barriers when filling
the rings (but even if it were not, it is still an uncached register write
and so serialising anyway.).

For simplicity, let's ignore the iomem annotation.

v2: Remove iomem from ringbuffer->virtual_address
v3: And for good measure add iomem elsewhere to keep sparse happy

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> #v2
Link: http://patchwork.freedesktop.org/patch/msgid/1469005202-9659-8-git-send-email-chris@chris-wilson.co.uk
Link: http://patchwork.freedesktop.org/patch/msgid/1469017917-15134-7-git-send-email-chris@chris-wilson.co.uk


# 48f112fe 24-Jun-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Fill unused GGTT with scratch pages for VT-d

One of the numerous VT-d workarounds we require is that the display
hardware reads past the end of the buffer triggering VT-d faults. This
is acknowledged in the code as being safe "since we fill the unused
portions of the GGTT with the scratch page". Alas, that is no longer
always true and so we trigger DMAR read faults.

Skylake also requires another workaround to avoid mixing VT-d and
unpopulated PTE, and so there we also need to ensure we fill unused
entries with the scratch page.

Reported-by: Mike Lothian <mike@fireburn.co.uk>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96584
Fixes: f7770bfd9fd2 ("drm/i915: Skip clearing the GGTT on full-ppgtt systems")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: David Weinehall <david.weinehall@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1466773634-8106-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: David Weinehall <david.weinehall@intel.com>


# 91c8a326 05-Jul-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Convert dev_priv->dev backpointers to dev_priv->drm

Since drm_i915_private is now a subclass of drm_device we do not need to
chase the drm_i915_private->dev backpointer and can instead simply
access drm_i915_private->drm directly.

text data bss dec hex filename
1068757 4565 416 1073738 10624a drivers/gpu/drm/i915/i915.ko
1066949 4565 416 1071930 105b3a drivers/gpu/drm/i915/i915.ko

Created by the coccinelle script:
@@
struct drm_i915_private *d;
identifier i;
@@
(
- d->dev->i
+ d->drm.i
|
- d->dev
+ &d->drm
)

and for good measure the dev_priv->dev backpointer was removed entirely.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467711623-2905-4-git-send-email-chris@chris-wilson.co.uk


# 8eb95204 04-Jul-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Amalgamate gen6_mm_switch() and vgpu_mm_switch()

These are identical, so let's just use the same vfunc.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1467618513-4966-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>


# fac5e23e 04-Jul-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Mass convert dev->dev_private to to_i915(dev)

Since we now subclass struct drm_device, we can save pointer dances by
noting the equivalence of struct drm_device and struct drm_i915_private,
i.e. by using to_i915().

text data bss dec hex filename
1073824 4562 416 1078802 107612 drivers/gpu/drm/i915/i915.ko
1068976 4562 416 1073954 106322 drivers/gpu/drm/i915/i915.ko

Created by the coccinelle script:

@@
expression E;
identifier p;
@@
- struct drm_i915_private *p = E->dev_private;
+ struct drm_i915_private *p = to_i915(E);

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467628477-25379-1-git-send-email-chris@chris-wilson.co.uk


# 731f74c5 24-Jun-2016 Dave Gordon <david.s.gordon@intel.com>

drm/i915: tweak gen6_for_{each_pde, all_pdes} macros

Gen8 versions of these macros were updated a few months ago
(e8ebd8e drm/i915: eliminate 'temp' in gen8_for_each macros)
originally because at least one iterator could generate an
out of bounds access, but also because eliminating the 'temp'
parameter generated smaller and faster code.

Matthew Auld recently noticed the same problem with the gen6
versions and provided a patch
https://lists.freedesktop.org/archives/intel-gfx/2016-June/099334.html
but while we're changing these, we might as well make them as
much like the gen8 versions as possible, including the style
of using "&& (..., true)" rather than ": (..., 1) : 0", and
of course eliminating the redundant 'temp'.

Furthermore, the "all_pdes" version is only used in one place,
so we can improve code efficiency by changing both the macro
parameters and the calling code to reduce extra dereferences.

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1466793466-23500-1-git-send-email-david.s.gordon@intel.com


# 6e5a5beb 24-Jun-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Split idling from forcing context switch

We only need to force a switch to the kernel context placeholder during
eviction. All other uses of i915_gpu_idle() just want to wait until
existing work on the GPU is idle. Rename i915_gpu_idle() to
i915_gem_wait_for_idle() to avoid any implications about "parking" the
context first.

v2: Tweak an error message if the wait fails for the ilk vtd w/a

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1466776558-21516-6-git-send-email-chris@chris-wilson.co.uk


# b02d22a3 16-Jun-2016 Zhi Wang <zhi.a.wang@intel.com>

drm/i915: Fold vGPU active check into inner functions

v5:
- Let functions take struct drm_i915_private *. (Tvrtko)

- Fold vGPU related active check into the inner functions. (Kevin)

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Suggested-by: Kevin Tian <kevin.tian@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1466078825-6662-4-git-send-email-zhi.a.wang@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>


# d6473f56 10-Jun-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Add support for mapping an object page by page

Introduced a new vm specfic callback insert_page() to program a single pte in
ggtt or ppgtt. This allows us to map a single page in to the mappable aperture
space. This can be iterated over to access the whole object by using space as
meagre as page size.

v2: Added low level rpm assertions to insert_page routines (Chris)

v3: Added POSTING_READ post register write (Tvrtko)

v4: Rebase (Ankit)

v5: Removed wmb() and FLUSH_CTL from insert_page, caller to take care
of it (Chris)

v6: insert_page not working correctly without FLSH_CNTL write, added the
write again.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>


# 5fbd0418 06-May-2016 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Re-enable GGTT earlier during resume on pre-gen6 platforms

Move the intel_enable_gtt() call to happen before we touch the GTT
during resume. Right now it's done way too late. Before
commit ebb7c78d358b ("agp/intel-gtt: Only register fake agp driver for gen1")
it was actually done earlier on account of also getting called from
the resume hook of the fake agp driver. With the fake agp driver
no longer getting registered we must move the call up.

The symptoms I've seen on my 830 machine include lowmem corruption,
other kinds of memory corruption, and straight up hung machine during
or just after resume. Not really sure what causes the memory corruption,
but so far I've not seen any with this fix.

I think we shouldn't really need to call this during init, but we have
been doing that so I've decided to keep the call. However moving that
call earlier could be prudent as well. Doing it right after the
intel-gtt probe seems appropriate.

Also tested this on 946gz,elk,ilk and all seemed quite happy with
this change.

v2: Reorder init_hw vs. enable_hw functions (Chris)

Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: drm-intel-fixes@lists.freedesktop.org
Fixes: ebb7c78d358b ("agp/intel-gtt: Only register fake agp driver for gen1")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1462559755-353-1-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
(cherry picked from commit ac840ae53573d9f435c88c131f6707a79aecb466)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# 85d1225e 20-May-2016 Dave Gordon <david.s.gordon@intel.com>

drm/i915: Introduce & use new lightweight SGL iterators

The existing for_each_sg_page() iterator is somewhat heavyweight, and is
limiting i915 driver performance in a few benchmarks. So here we
introduce somewhat lighter weight iterators, primarily for use with GEM
objects or other case where we need only deal with whole aligned pages.

Unlike the old iterator, the new iterators use an internal state
structure which is not intended to be accessed by the caller; instead
each takes as a parameter an output variable which is set before each
iteration. This makes them particularly simple to use :)

One of the new iterators provides the caller with the DMA address of
each page in turn; the other provides the 'struct page' pointer required
by many memory management operations.

Various uses of for_each_sg_page() are then converted to the new macros.

v2: Force inlining of the sg_iter constructor and make the union
anonymous.

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1463741647-15666-4-git-send-email-chris@chris-wilson.co.uk


# f7770bfd 14-May-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Skip clearing the GGTT on full-ppgtt systems

Under full-ppgtt, access to the global GTT is carefully regulated
through hardware functions (i.e. userspace cannot read and write to
arbitrary locations in the GGTT via the GPU). With this restriction in
place, we can forgo clearing stale entries from the GGTT as they will
not be accessed.

For aliasing-ppgtt, we could almost do the same except that we do allow
userspace access to the global-GTT via execbuf in order to workraound
some quirks of certain instructions. (This execbuf path is filtered out
with EINVAL on full-ppgtt.)

The most dramatic effect this will have will be during resume, as with
full-ppgtt the GGTT is only used sparingly.

References: https://bugs.freedesktop.org/show_bug.cgi?id=94722
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: David Weinehall <david.weinehall@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Tested-by: David Weinehall <david.weinehall@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463207195-22076-4-git-send-email-chris@chris-wilson.co.uk


# 975f7ff4 14-May-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Lazily migrate the objects after hibernation

Now that we mark the object domains for having been restored from the
hibernation image, we not need to flush everything during resume and
can instead rely on the normal domain tracking to flush only when
required. The only caveat here are objects that are pinned for use by
the hardware, whose contents must be coherent for when the device
resumes reading from then (shortly afterwards with the driver assuming
the objects are in the correct domain).

References: https://bugs.freedesktop.org/show_bug.cgi?id=94722
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Cc: David Weinehall <david.weinehall@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Tested-by: David Weinehall <david.weinehall@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463207195-22076-3-git-send-email-chris@chris-wilson.co.uk


# dc97997a 10-May-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Use drm_i915_private as the native pointer for intel_uncore.c

Pass drm_i915_private to the uncore init/fini routines and their
subservients as it is their native type.

text data bss dec hex filename
6309978 3578778 696320 10585076 a183f4 vmlinux
6309530 3578778 696320 10584628 a18234 vmlinux

a modest 400 bytes of saving, but 60 lines of code deleted!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1462885804-26750-1-git-send-email-chris@chris-wilson.co.uk


# ac840ae5 06-May-2016 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Re-enable GGTT earlier during resume on pre-gen6 platforms

Move the intel_enable_gtt() call to happen before we touch the GTT
during resume. Right now it's done way too late. Before
commit ebb7c78d358b ("agp/intel-gtt: Only register fake agp driver for gen1")
it was actually done earlier on account of also getting called from
the resume hook of the fake agp driver. With the fake agp driver
no longer getting registered we must move the call up.

The symptoms I've seen on my 830 machine include lowmem corruption,
other kinds of memory corruption, and straight up hung machine during
or just after resume. Not really sure what causes the memory corruption,
but so far I've not seen any with this fix.

I think we shouldn't really need to call this during init, but we have
been doing that so I've decided to keep the call. However moving that
call earlier could be prudent as well. Doing it right after the
intel-gtt probe seems appropriate.

Also tested this on 946gz,elk,ilk and all seemed quite happy with
this change.

v2: Reorder init_hw vs. enable_hw functions (Chris)

Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: drm-intel-fixes@lists.freedesktop.org
Fixes: ebb7c78d358b ("agp/intel-gtt: Only register fake agp driver for gen1")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1462559755-353-1-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>


# c033666a 06-May-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Store a i915 backpointer from engine, and use it

text data bss dec hex filename
6309351 3578714 696320 10584385 a18141 vmlinux
6308391 3578714 696320 10583425 a17d81 vmlinux

Almost 1KiB of code reduction.

v2: More s/INTEL_INFO()->gen/INTEL_GEN()/ and IS_GENx() conversions

text data bss dec hex filename
6304579 3578778 696320 10579677 a16edd vmlinux
6303427 3578778 696320 10578525 a16a5d vmlinux

Now over 1KiB!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1462545621-30125-3-git-send-email-chris@chris-wilson.co.uk


# cba6dba4 05-May-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Unexport i915_ppgtt_init()

As i915_ppgtt_init() is not used outside of i915_gem_gtt.c we can make
it static.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1462443767-5194-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.auld@intel.com>


# 0e4ca100 29-Apr-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Fix ordering of sanitize ppgtt and sanitize execlists

The i915.enable_ppgtt option depends upon the state of
i915.enable_execlists option - so we need to sanitize execlists first.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461932305-14637-2-git-send-email-chris@chris-wilson.co.uk


# f9326be5 28-Apr-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Rearrange switch_context to load the aliasing ppgtt on first use

The code to switch_mm() is already handled by i915_switch_context(), the
only difference required to setup the aliasing ppgtt is that we need to
emit te switch_mm() on the first context, i.e. when transitioning from
engine->last_context == NULL. This allows us to defer the
initialisation of the GPU from early device initialisation to first use,
which should marginally speed up both. The caveat is that we then defer
the context initialisation until first use - i.e. we cannot assume that
the GPU engines are initialised. For example, this means that power
contexts for rc6 (Ironlake) need to explicitly loaded, as they are.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-11-git-send-email-chris@chris-wilson.co.uk


# 8ef8561f 28-Apr-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Move ioremap_wc tracking onto VMA

By tracking the iomapping on the VMA itself, we can share that area
between multiple users. Also by only revoking the iomapping upon
unbinding from the mappable portion of the GGTT, we can keep that iomap
across multiple invocations (e.g. execlists context pinning).

Note that by moving the iounnmap tracking to the VMA, we actually end up
fixing a leak of the iomapping in intel_fbdev.

v1.5: Rebase prompted by Tvrtko
v2: Drop dev_priv parameter, we can recover the i915_ggtt from the vma.
v3: Move handling of ioremap space exhaustion to vmap_purge and also
allow vmallocs to recover old iomaps. Add Tvrtko's kerneldoc.
v4: Fix a use-after-free in shrinker and rearrange i915_vma_iomap
v5: Back to i915_vm_to_ggtt
v6: Use i915_vma_pin_iomap and i915_vma_unpin_iomap to mark critical
sections and ensure the VMA cannot be reaped whilst mapped.
v7: Move i915_vma_iounmap so that consumers of the API are not tempted,
and add iomem annotations

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-5-git-send-email-chris@chris-wilson.co.uk


# ce7fda2e 28-Apr-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Introduce i915_vm_to_ggtt()

In a couple of places, we have an i915_address_space that we know is
really an i915_ggtt that we want to use. Create an inline helper to
convert from the i915_address_space subclass into its container.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-4-git-send-email-chris@chris-wilson.co.uk


# 64c050db 27-Apr-2016 Matthew Auld <matthew.auld@intel.com>

drm/i915: tidy up gen8_init_scratch

Prefer a goto teardown path to do all the required cleanup.

v2:
(Joonas Lahtinen)
- remove NULL assignments
- rename goto labels

Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Dave Gordon <david.s.gordon@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461759565-12086-1-git-send-email-matthew.auld@intel.com


# df28564d 21-Apr-2016 Matthew Auld <matthew.auld@intel.com>

drm/i915: use dev_priv directly in gen8_ppgtt_notify_vgt

Remove dev local and use to_i915() in gen8_ppgtt_notify_vgt.

v2: use dev_priv directly for QUESTION_MACROS (Joonas Lahtinen)

Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461323365-21256-1-git-send-email-matthew.auld@intel.com


# 44a71024 12-Apr-2016 Matthew Auld <matthew.auld@intel.com>

drm/i915: call kunmap_px on pt_vaddr

We need to kunmap pt_vaddr and not pt itself, otherwise we end up
mapping a bunch of pages without ever unmapping them.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Fixes: d1c54acd67dc ("drm/i915/gtt: Introduce kmap|kunmap for dma page")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460476663-24890-4-git-send-email-matthew.auld@intel.com


# 3accaf7e 13-Apr-2016 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915: Store and use edram capabilities

Store the edram capabilities instead of only the size of
edram. This is preparatory patch to allow edram size calculation
based on edram capability bits for gen9+. With gen9 the
edram is behind llc and is a separate entity. With hsw/bdw
it was more of a victim cache for LLC so the name 'eLLC' might
be warranted. Regardless, rename all mentions of eLLC to EDRAM to
clear the confusion.

v2: return bytes for edram size (Chris)
s/eLLC/eDRAM in output if we are gen > 8

v3: rebase, INTEL_GEN (Chris)

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>


# f2a85e19 07-Apr-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm,i915: Introduce drm_malloc_gfp()

I have instances where I want to use drm_malloc_ab() but with a custom
gfp mask. And with those, where I want a temporary allocation, I want to
try a high-order kmalloc() before using a vmalloc().

So refactor my usage into drm_malloc_gfp().

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: dri-devel@lists.freedesktop.org
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460113874-17366-6-git-send-email-chris@chris-wilson.co.uk


# 2d1fe073 07-Apr-2016 Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

drm/i915: Do not use {HAS_*, IS_*, INTEL_INFO}(dev_priv->dev)

dev_priv is what the macro works hard to extract, pass it directly.

> sed 's/\([A-Z].*(dev_priv\)->dev)/\1)/g'

v2:
- Include all wrapper macros too (Chris)

v3:
- Include sed cmdline (Chris)

v4:
- Break long line
- Rebase

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460016485-8089-1-git-send-email-joonas.lahtinen@linux.intel.com


# e5716f55 07-Apr-2016 Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

drm/i915: Use i915_vm_to_ppgtt instead of manual container_of

Looks much better without container_of everywhere.

v2:
- In i915_gem_restore_gtt_mappings too (Chris)

v3:
- Do not cause WARN by calling on non PPGTT object (Chris)

Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>


# 72e96d64 30-Mar-2016 Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

drm/i915: Refer to GGTT {,VM} consistently

Refer to the GGTT VM consistently as "ggtt->base" instead of just "ggtt",
"vm" or indirectly through other variables like "dev_priv->ggtt.base"
to avoid confusion with the i915_ggtt object itself and PPGTT VMs.

Refer to the GGTT as "ggtt" instead of indirectly through chaining.

As a bonus gets rid of the long-standing i915_obj_to_ggtt vs.
i915_gem_obj_to_ggtt conflict, due to removal of i915_obj_to_ggtt!

v2:
- Added some more after grepping sources with Chris

v3:
- Refer to GGTT VM through ggtt->base consistently instead of ggtt_vm
(Chris)

v4:
- Convert all dev_priv->ggtt->foo accesses to ggtt->foo.

v5:
- Make patch checker happy

Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>


# d85489d3 24-Mar-2016 Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

drm/i915: Rename GGTT init functions

Rename and document the GGTT init functions to give a better
idea of the context where they are called from.

i915_gem_gtt_init => i915_ggtt_init_hw
i915_gem_init_global_gtt => i915_gem_init_ggtt
i915_global_gtt_cleanup => i915_ggtt_cleanup_hw

Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1458830866-12578-1-git-send-email-joonas.lahtinen@linux.intel.com


# ade7daa1 24-Mar-2016 Matthew Auld <matthew.auld@intel.com>

drm/i915: BUG_ON when ggtt_view is NULL

Lets BUG_ON and don't bother with a WARN and returning an error, so we can
remove the need to pollute the code with error handling, after all it is
a programmer error to provide NULL view. Also while we're here remove
redundant NULL ggtt_view check.

Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1458834860-7898-1-git-send-email-matthew.auld@intel.com


# b4ac5afc 24-Mar-2016 Dave Gordon <david.s.gordon@intel.com>

drm/i915: replace for_each_engine()

Having provided for_each_engine_id() for cases where the third (id)
argument is useful, we can now replace all the remaining instances with
a simpler version that takes only two parameters. In many cases, this
also allows the elimination of the local variable used in the iterator
(usually 'i').

v2:
s/dev_priv/(dev_priv__)/ in body of for_each_engine_masked() [Chris Wilson]

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1458757194-17783-2-git-send-email-david.s.gordon@intel.com


# 321d178e 20-Nov-2015 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Tidy aliasing_gtt_bind_vma()

In commit 0a878716265e9af9f697264dc2e858fcc060d833
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Thu Oct 15 14:23:01 2015 +0200

drm/i915: restore ggtt double-bind avoidance

we wrote the ggtt_bind_vma() observing a number of cleanups we could do
over the template of aliasing_gtt_bind_vma(). Now let's apply the
cleanups we made there back to the original. The essence is to avoid
redundant variables and assignements, and by doing so make the code
easier to read.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1448015238-24639-1-git-send-email-chris@chris-wilson.co.uk


# c890e2d5 18-Mar-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Codify our assumption that the Global GTT is <= 4GiB

Throughout the code base, we use u32 for offsets into the global GTT. If
we ever see any hardware with a larger GGTT, then we run the real risk
of silent corruption. So test for our assumption up front so that we
have a nice reminder should the time come when it fails.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
[Rebased and changed 1ull -> 1ULL, cut 80 char line]
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1458290579-27783-1-git-send-email-joonas.lahtinen@linux.intel.com


# d507d735 18-Mar-2016 Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

drm/i915/gtt: Clean up GGTT probing code

Use less pointers with the probing code, making it much less confusing
to read.

Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# 62106b4f 18-Mar-2016 Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

drm/i915: Rename dev_priv->gtt to dev_priv->ggtt

Refer to Global GTT consistently as GGTT, thus rename dev_priv->gtt
to dev_priv->ggtt and struct i915_gtt to struct i915_ggtt.

Fix a couple of whitespace problems while at it.

v2:
- Fix a typo in commit message.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# dc3b04fb 18-Mar-2016 Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

drm/i915/gtt: Reference mappable_end variable from pointer

Reference variable value from pointer, not assumed pointer destination.

Since:

commit c44ef60e437019b8ca1dab8b4d2e8761fd4ce1e9
Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Date: Thu Jun 25 18:35:05 2015 +0300

drm/i915/gtt: Allow >= 4GB sizes for vm.

Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>


# 39dabecd 17-Mar-2016 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Use shorter route to dev_private where possible

Where we have a request we can use req->i915 directly instead
of going through the engine and device. Coccinelle script:

@@
function f;
identifier r;
@@
f(..., struct drm_i915_gem_request *r, ...)
{
...
- engine->dev->dev_private
+ r->i915
...
}
@@
struct drm_i915_gem_request *req;
@@
(
req->
- engine->dev->dev_private
+ i915
)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1458219850-21007-1-git-send-email-tvrtko.ursulin@linux.intel.com


# 666796da 16-Mar-2016 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: More intel_engine_cs renaming

Some trivial ones, first pass done with Coccinelle:

@@
@@
(
- I915_NUM_RINGS
+ I915_NUM_ENGINES
|
- intel_ring_flag
+ intel_engine_flag
|
- for_each_ring
+ for_each_engine
|
- i915_gem_request_get_ring
+ i915_gem_request_get_engine
|
- intel_ring_idle
+ intel_engine_idle
|
- i915_gem_reset_ring_status
+ i915_gem_reset_engine_status
|
- i915_gem_reset_ring_cleanup
+ i915_gem_reset_engine_cleanup
|
- init_ring_lists
+ init_engine_lists
)

But that didn't fully work so I cleaned it up with:

for f in *.[hc]; do sed -i -e s/I915_NUM_RINGS/I915_NUM_ENGINES/ $f; done
for f in *.[hc]; do sed -i -e s/i915_gem_request_get_ring/i915_gem_request_get_engine/ $f; done
for f in *.[hc]; do sed -i -e s/intel_ring_flag/intel_engine_flag/ $f; done
for f in *.[hc]; do sed -i -e s/intel_ring_idle/intel_engine_idle/ $f; done
for f in *.[hc]; do sed -i -e s/init_ring_lists/init_engine_lists/ $f; done
for f in *.[hc]; do sed -i -e s/i915_gem_reset_ring_cleanup/i915_gem_reset_engine_cleanup/ $f; done
for f in *.[hc]; do sed -i -e s/i915_gem_reset_ring_status/i915_gem_reset_engine_status/ $f; done

v2: Rebase.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>


# 4a570db5 16-Mar-2016 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Rename intel_engine_cs struct members

below and a couple manual fixups.

@@
identifier I, J;
@@
struct I {
...
- struct intel_engine_cs *J;
+ struct intel_engine_cs *engine;
...
}
@@
identifier I, J;
@@
struct I {
...
- struct intel_engine_cs J;
+ struct intel_engine_cs engine;
...
}
@@
struct drm_i915_private *d;
@@
(
- d->ring
+ d->engine
)
@@
struct i915_execbuffer_params *p;
@@
(
- p->ring
+ p->engine
)
@@
struct intel_ringbuffer *r;
@@
(
- r->ring
+ r->engine
)
@@
struct drm_i915_gem_request *req;
@@
(
- req->ring
+ req->engine
)

v2: Script missed the tracepoint code - fixed up by hand.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>


# e2f80391 16-Mar-2016 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Rename local struct intel_engine_cs variables

Done by the Coccinelle script below plus a manual
intervention to GEN8_RING_SEMAPHORE_INIT.

@@
expression E;
@@
- struct intel_engine_cs *ring = E;
+ struct intel_engine_cs *engine = E;
<+...
- ring
+ engine
...+>
@@
@@
- struct intel_engine_cs *ring;
+ struct intel_engine_cs *engine;
<+...
- ring
+ engine
...+>

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>


# 11f20322 15-Feb-2016 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Move the NULL sg handling out from rotate_pages()

rotate_pages() checks to see if it got called with a NULL sg, and then
goes to extract it from sg->sgl. It always gets called with a NULL sg
for the first plane, so moving the initial 'sg=st->sgl' assignment out
into intel_rotate_fb_obj_pages() seems less special-casey.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1455569699-27905-9-git-send-email-ville.syrjala@linux.intel.com


# 1663b9d6 15-Feb-2016 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Reorganize intel_rotation_info

Throw out a bunch of unnecessary stuff from struct intel_rotation_info,
and pull most of the remaining stuff to live under an array of
per-color plane sub-structures.

What still remains outside the sub-structure will be reorgranized later
as well, but that requires more work elsewhere so leave it be for now.

v2: Split the vma size == luma+chroma size fix to prep patch (Daniel)

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v1)
Link: http://patchwork.freedesktop.org/patch/msgid/1455569699-27905-8-git-send-email-ville.syrjala@linux.intel.com


# 9106cf17 15-Feb-2016 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Account for the size of the chroma plane for the rotated gtt view

The size of the rotated ggtt mapping ought to include the size of the
chroma plane as well. Not a huge deal since we don't expose NV12 (or any
pother planar format for that matter) yet.

Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Fixes: 89e3e1427629 ("drm/i915: Support NV12 in rotated GGTT mapping")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1455569699-27905-2-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# 596c5923 26-Feb-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Reduce the pointer dance of i915_is_ggtt()

The multiple levels of indirect do nothing but hinder the compiler and
the pointer chasing turns to be quite painful but painless to fix.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1456484600-11477-1-git-send-email-tvrtko.ursulin@linux.intel.com


# 1c7f4bca 26-Feb-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Rename vma->*_list to *_link for consistency

Elsewhere we have adopted the convention of using '_link' to denote
elements in the list (and '_list' for the actual list_head itself), and
that the name should indicate which list the link belongs to (and
preferrably not just where the link is being stored).

s/vma_link/obj_link/ (we iterate over obj->vma_list)
s/mm_list/vm_link/ (we iterate over vm->[in]active_list)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>


# d5165ebd 04-Feb-2016 Tim Gore <tim.gore@intel.com>

drm/i915: implement WaIncreaseDefaultTLBEntries

WaIncreaseDefaultTLBEntries increases the number of TLB
entries available for GPGPU workloads and gives significant
( > 10% ) performance gain for some OCL benchmarks.
Put this in a new function that can be a place for
workarounds that are GT related but not required per ring.
This function is called on driver load and also after a
reset and on resume, so it is safe for workarounds that get
clobbered in these situations. This function currently has
just this one workaround.

v2: This was originally split into 3 patches but following
review feedback was squashed into 1.
I have not incorporated some style comments from Chris
Wilson as I felt that after defining and intialising a
temporary variable and then adding an additional if block
to only write the register if the temporary variable had
been set, this didn't really give a net gain.

v3: Resending in the hope that BAT will run

v4: Change subject line to trigger BAT (please!)

Signed-off-by: Tim Gore <tim.gore@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1454586574-2343-1-git-send-email-tim.gore@intel.com


# 11d23e6f 20-Jan-2016 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Pass rotation_info to intel_rotate_fb_obj_pages()

intel_rotate_fb_obj_pages() doens't need the entire gtt view, just the
rotation info suffices.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1453316739-13296-4-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 87130255 20-Jan-2016 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Pass stride to rotate_pages()

Pass stride in addition to width and height to rotate_pages(). For now
width and stride are the same, but once framebuffer offsets enter the
scene that may no longer be the case.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1453316739-13296-3-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 7723f47d 20-Jan-2016 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Rename the rotated gtt view member to 'rotated'

Also rename 'rotation_info' to 'rotated' to match the view type exactly,
this should avoid confusion which union members is valid for each view
type.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1453316739-13296-2-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# a4eba47b 19-Jan-2016 Imre Deak <imre.deak@intel.com>

drm/i915: Move stolen memory initialization earlier during loading

The only device specific dependency of the stolen memory setup is the
MMIO mapping and the stolen memory size. Both are already available in
i915_gtt_init(), so move the stolen initialization to there. The
clean-up code for i915_gtt_init() is in i915_global_gtt_cleanup(), so
move the stolen memory clean-up code there too.

This will be needed by an upcoming patch that needs the details of the
memory we reserve, but the change is also part of our generic goal to
move the initialization of resources with no or little dependencies on
other device specific resources towards the beginning of the init
sequence.

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: David Weinehall <david.weinehall@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1453209992-25995-8-git-send-email-imre.deak@intel.com


# 2d7f3bdb 14-Jan-2016 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Pass the dma_addr_t array as const to rotate_pages()

rotate_pages() doesn't modify the passed in dma addresses, so make
them const.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1452777736-4909-4-git-send-email-ville.syrjala@linux.intel.com


# b5e16987 14-Jan-2016 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Set i915_ggtt_view_normal type explicitly

Just for clarity set the type for i915_ggtt_view_normal explicitly.

While at it fix the indentation fail for i915_ggtt_view_rotated.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1452777736-4909-3-git-send-email-ville.syrjala@linux.intel.com


# d5f384de 18-Nov-2015 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Move Braswell stop_machine GGTT insertion workaround

There was a silent conflict between

commit 0a878716265e9af9f697264dc2e858fcc060d833
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Thu Oct 15 14:23:01 2015 +0200

drm/i915: restore ggtt double-bind avoidance

and

commit 5bab6f60cb4d1417ad7c599166bcfec87529c1a2
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Fri Oct 23 18:43:32 2015 +0100

drm/i915: Serialise updates to GGTT with access through GGTT on Braswell

thankfully caught by the extra WARN safegaurd in 0a878716. Since we now
override the GGTT insert_pages callback when installing the aliasing
ppgtt, we assert that the callback is the original ggtt routine.
However, on Braswell we now use a different insertion routine to
serialise access through the GGTT with updating the PTE and hence the
conflict. To avoid the conflict, move the custom insertion routine for
Braswell down a level.

Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1447859979-20107-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: drm-intel-fixes@lists.freedesktop.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
(cherry picked from commit c140330b5e6b5bc2262ffb2f50bfeea06a482699)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# 62d622c1 20-Nov-2015 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Set the map-and-fenceable flag for preallocated objects

As we mark the preallocated objects as bound, we should also flag them
correctly as being map-and-fenceable (if appropriate!) so that later
users do not get confused and try and rebind the pinned vma in order to
get a map-and-fenceable binding.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: "Goel, Akash" <akash.goel@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: drm-intel-fixes@lists.freedesktop.org
Link: http://patchwork.freedesktop.org/patch/msgid/1448029000-10616-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
(cherry picked from commit d0710abbcd88b1ff17760e97d74a673e67b49ea1)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# c140330b 18-Nov-2015 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Move Braswell stop_machine GGTT insertion workaround

There was a silent conflict between

commit 0a878716265e9af9f697264dc2e858fcc060d833
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Thu Oct 15 14:23:01 2015 +0200

drm/i915: restore ggtt double-bind avoidance

and

commit 5bab6f60cb4d1417ad7c599166bcfec87529c1a2
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Fri Oct 23 18:43:32 2015 +0100

drm/i915: Serialise updates to GGTT with access through GGTT on Braswell

thankfully caught by the extra WARN safegaurd in 0a878716. Since we now
override the GGTT insert_pages callback when installing the aliasing
ppgtt, we assert that the callback is the original ggtt routine.
However, on Braswell we now use a different insertion routine to
serialise access through the GGTT with updating the PTE and hence the
conflict. To avoid the conflict, move the custom insertion routine for
Braswell down a level.

Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1447859979-20107-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: drm-intel-fixes@lists.freedesktop.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# d0710abb 20-Nov-2015 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Set the map-and-fenceable flag for preallocated objects

As we mark the preallocated objects as bound, we should also flag them
correctly as being map-and-fenceable (if appropriate!) so that later
users do not get confused and try and rebind the pinned vma in order to
get a map-and-fenceable binding.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: "Goel, Akash" <akash.goel@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: drm-intel-fixes@lists.freedesktop.org
Link: http://patchwork.freedesktop.org/patch/msgid/1448029000-10616-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# be69459a 15-Dec-2015 Imre Deak <imre.deak@intel.com>

drm/i915: check that we are in an RPM atomic section in GGTT PTE updaters

The device should be on for the whole duration of the update, so check
for this.

v2:
- use the existing dev_priv directly everywhere (Ville)
v3:
- check also that we are in an RPM atomic section (Chris)
- add the assert to i915_ggtt_insert_entries/clear_range too (Chris)

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1450203038-5150-11-git-send-email-imre.deak@intel.com


# 666a4537 09-Dec-2015 Wayne Boyer <wayne.boyer@intel.com>

drm/i915: Separate cherryview from valleyview

The cherryview device shares many characteristics with the valleyview
device. When support was added to the driver for cherryview, the
corresponding device info structure included .is_valleyview = 1.
This is not correct and leads to some confusion.

This patch changes .is_valleyview to .is_cherryview in the cherryview
device info structure and simplifies the IS_CHERRYVIEW macro.
Then where appropriate, instances of IS_VALLEYVIEW are replaced with
IS_VALLEYVIEW || IS_CHERRYVIEW or equivalent.

v2: Use IS_VALLEYVIEW || IS_CHERRYVIEW instead of defining a new macro.
Also add followup patches to fix issues discovered during the first
review. (Ville)
v3: Fix some style issues and one gen check. Remove CRT related changes
as CRT is not supported on CHV. (Imre, Ville)
v4: Make a few more optimizations. (Ville)

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Wayne Boyer <wayne.boyer@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1449692975-14803-1-git-send-email-wayne.boyer@intel.com
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>


# e8ebd8e2 08-Dec-2015 Dave Gordon <david.s.gordon@intel.com>

drm/i915: eliminate 'temp' in gen8_for_each_{pdd, pdpe, pml4e} macros

All of these iterator macros require a 'temp' argument, used merely to
hold internal partial results. We can instead declare the temporary
variable inside the macro, so the caller need not provide it.

Some of the old code contained nested iterators that actually reused the
same 'temp' variable for both inner and outer instances. It's quite
surprising that this didn't introduce bugs! But it does show that the
value of 'temp' isn't required to persist during the iterated body.

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1449581451-11848-2-git-send-email-david.s.gordon@intel.com


# a6d09186 14-Oct-2015 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Stuff rotation params into view union

We don't need 2 separate unions.

Note that this was done intentinoally

Author: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Date: Wed May 6 14:35:38 2015 +0300

drm/i915: Add a partial GGTT view type

on Tvrtko's request, but without a clear justification. Rotated views
are also not checking for matching paramters in i915_ggtt_view_equal,
which seems like a bug. But this patch here doesn't change that.

Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1444834266-12689-2-git-send-email-daniel.vetter@ffwll.ch
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# ab75bb5d 04-Nov-2015 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Turn vgpu pdps into an array

We'll want to avoid performing arithmetic with register offsets, so
instead calculating the vgpu PDP as pdp0_lo+offset, make the PDPs
into an array. This way we can simply loop through them.

Cc: Eddie Dong <eddie.dong@intel.com>
Cc: Jike Song <jike.song@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Yu Zhang <yu.c.zhang@linux.intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Cc: Zhiyuan Lv <zhiyuan.lv@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1446672017-24497-25-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Zhiyuan Lv <zhiyuan.lv@intel.com>


# f92a9162 04-Nov-2015 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Add functions to emit register offsets to the ring

When register type safety happens, we can't just try to emit the
register itself to the ring. Instead we'll need to extract the
offset from it first. Add some convenience functions that will do
that.

v2: Convert MOCS setup too

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1446672017-24497-20-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>


# 5bab6f60 23-Oct-2015 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Serialise updates to GGTT with access through GGTT on Braswell

When accessing through the GTT from one CPU whilst concurrently updating
the GGTT PTEs in another thread, the hardware likes to return random
data. As we have strong serialisation prevent us from modifying the PTE
of an active GTT mmapping, we have to conclude that it whilst modifying
other PTE's that error occurs. (I have not looked for any pattern such
as modifying PTE within the same page or cacheline as active PTE -
though checking whether revoking neighbouring objects should be enough
to test that theory.) The corruption also seems restricted to Braswell
and disappears with maxcpus=0. This patch stops all access through the
GTT by other CPUs when we update any PTE by stopping the machine around
the GGTT update.

Note that splitting up the 64 bit write into two 32 bit writes was
tried and found to fail too.

Testcase: igt/gem_concurrent_blit
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89079
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Add note about 2x 32bits failing too.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 1f9a99e0 30-Sep-2015 Michel Thierry <michel.thierry@intel.com>

drm/i915/gen8: Flip the 48b switch

Use 48b addresses if hw supports it (i915.enable_ppgtt=3).
Update the sanitize_enable_ppgtt for 48 bit PPGTT mode.

Note, aliasing PPGTT remains 32b only.

v2: s/full_64b/full_48b/. (Akash)
v3: Add sanitize_enable_ppgtt changes until here. (Akash)
v4: Update param description (Chris)

Cc: Akash Goel <akash.goel@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 0a878716 15-Oct-2015 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: restore ggtt double-bind avoidance

This was accidentally lost in

commit 75d04a3773ecee617847de963ae4195d6aa74c28
Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Date: Tue Apr 28 17:56:17 2015 +0300

drm/i915/gtt: Allocate va range only if vma is not bound

While at it implement an improved version suggested by Chris which
avoids the double-bind irrespective of what type of bind is done
first.

Note that this exact bug was already addressed in

commit d0e30adc42d979e4adc36b6c112b57337423b70c
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Wed Jul 29 20:02:48 2015 +0100

drm/i915: Mark PIN_USER binding as GLOBAL_BIND without the aliasing ppgtt

but the problem is still that originally in

commit 0875546c5318c85c13d07014af5350e9000bc9e9
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Mon Apr 20 09:04:05 2015 -0700

drm/i915: Fix up the vma aliasing ppgtt binding

if forgotten to take into account there case where we have a
GLOBAL_BIND before a LOCAL_BIND. This patch here fixes that.

v2: Pimp commit message and revert the partial fix.

v3: Split into two functions to specialize on aliasing_ppgtt y/n.

v4: WARN_ON for paranoia in the init sequence, since the ggtt probe
and aliasing ppgtt setup are far apart.

v5: Style nits.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://mid.gmane.org/1444911781-32607-1-git-send-email-daniel.vetter@ffwll.ch
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 7e435ad2 18-Sep-2015 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Add LO/HI PRIVATE_PAT registers

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 7c4a7d60 24-Sep-2015 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Defer adding preallocated stolen objects to the VM list

When preallocating a stolen object during early initialisation, we may
be running before we have setup the the global GTT VM state, in
particular before we have initialised the range manager and associated
lists. As this is the case, we defer binding the stolen object until we
call i915_gem_setup_global_gtt(). Not only should we defer the binding,
but we should also defer the VM list manipulation.

Fixes regression uncovered by commit a2cad9dff4dd44d0244b966d980de9d602d87593
Author: Michał Winiarski <michal.winiarski@intel.com>
Date: Wed Sep 16 11:49:00 2015 +0200

drm/i915/gtt: Do not initialize drm_mm twice.

Whilst I am here remove the duplicate work leaving dangling pointers
from the error path...

v2: Typos galore before coffee.

Reported-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92099
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Tested-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# dedf278c 21-Sep-2015 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Enable querying offset of UV plane with intel_plane_obj_offset

v2: Rebase.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 89e3e142 21-Sep-2015 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Support NV12 in rotated GGTT mapping

Just adding the rotated UV plane at the end of the rotated Y plane.

v2: Rebase.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 804beb4b 21-Sep-2015 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Support appending to the rotated pages mapping

By providing a start offset into the source array of pages, and returning the
end position in the scatter-gather table, we will be able to append the UV
plane to the rotated mapping in later patches.

v2: Rebase.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# a2cad9df 16-Sep-2015 Michał Winiarski <michal.winiarski@intel.com>

drm/i915/gtt: Do not initialize drm_mm twice.

It would be initialized just moments later by i915_init_vm.

Rearrange the code such that i915_init_vm() is next to its callers
inside i915_gem_gtt (and so we can make it static). After removing the
dance around the files, it is clear that we are repeating some work
inside the initializers (such as calling drm_mm_init() multiple times),
so take advantage of the refactor to also remove some redundant code and
clean up the interface.

v2: Commit msg update,
s/i915_init_vm/i915_address_space_init, move to i915_gem_gtt.c,
init address_space during i915_gem_setup_global_gtt for ggtt.
v3: Do not init global_link - we are adding it to vm_list moments later,
make i915_address_space_init static, use OOP style parameter order.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 3a41a05d 03-Sep-2015 Michał Winiarski <michal.winiarski@intel.com>

drm/i915/gtt: Avoid calling kcalloc in a loop when allocating temp bitmaps

On each call to gen8_alloc_va_range_3lvl we're allocating temporary
bitmaps needed for error handling. Unfortunately, when we increase
address space size (48b ppgtt) we do additional (512 - 4) calls to
kcalloc, increasing latency between exec and actual start of execution
on the GPU. Let's just do a single kcalloc, we can also drop the size
from free_gen8_temp_bitmaps since it's no longer used.

v2: Use GFP_TEMPORARY to make the allocations reclaimable.
v3: Drop the 2D array, just allocate a single block.
v4: Rebase to handle gen8_preallocate_top_level_pdps.
v5: Align misaligned bracket.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Correct kcalloc arguments as suggested by Chris.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 650da34c 28-Aug-2015 Zhiyuan Lv <zhiyuan.lv@intel.com>

drm/i915: guest i915 notification for Intel GVT-g

When i915 drivers run inside a VM with Intel GVT-g, some explicit
notifications are needed from guest to host device model through PV
INFO page write. The notifications include:

PPGTT create
PPGTT destroy

They are used for the shadow implementation of PPGTT. Intel GVT-g
needs to write-protect the guest pages of PPGTT, and clear the write
protection when they end their life cycle.

v2:
- Use lower_32_bits()/upper_32_bits() for qword operations;
- Remove the notification of guest context creation/destroy;

Signed-off-by: Zhiyuan Lv <zhiyuan.lv@intel.com>
Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 331f38e7 28-Aug-2015 Zhiyuan Lv <zhiyuan.lv@intel.com>

drm/i915: preallocate pdps for 32 bit vgpu

This is based on Mika Kuoppala's patch below:
http://article.gmane.org/gmane.comp.freedesktop.xorg.drivers.intel/61104/match=workaround+hw+preload

The patch will preallocate the page directories for 32-bit PPGTT when
i915 runs inside a virtual machine with Intel GVT-g. With this change,
the root pointers in EXECLIST context will always keep the same.

The change is needed for vGPU because Intel GVT-g will do page table
shadowing, and needs to track all the page table changes from guest
i915 driver. However, if guest PPGTT is modified through GPU commands
like LRI, it is not possible to trap the operations in the right time,
so it will be hard to make shadow PPGTT to work correctly.

Shadow PPGTT could be much simpler with this change. Meanwhile
hypervisor could simply prohibit any attempt of PPGTT modification
through GPU command for security.

The function gen8_preallocate_top_level_pdps() in the patch is from
Mika, with only one change to set "used_pdpes" to avoid duplicated
allocation later.

Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Dave Gordon <david.s.gordon@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Zhiyuan Lv <zhiyuan.lv@intel.com>
Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 25f50337 07-Aug-2015 Michel Thierry <michel.thierry@intel.com>

drm/i915: Always pass dev pointer in pdp_init

And fix 0-DAY kernel test infrastructure warning.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# f365f911 07-Aug-2015 Michel Thierry <michel.thierry@intel.com>

drm/i915: Use complete virtual address range on 32-bit platforms

With the offset length being taken care of in ("drm/i915/gtt: Allow >=
4GB offsets in X86_32"), the code should be finally safe in 32-bit
kernels.

This reverts commit 501fd70fcaebc911b6b96a7b331e6960e5af67e7
Author: Michel Thierry <michel.thierry@intel.com>
Date: Fri May 29 14:15:05 2015 +0100

drm/i915: limit PPGTT size to 2GB in 32-bit platforms

Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 088e0df4 07-Aug-2015 Michel Thierry <michel.thierry@intel.com>

drm/i915/gtt: Allow >= 4GB offsets in X86_32

Similar to commit c44ef60e437019b8ca1dab8b4d2e8761fd4ce1e9 ("drm/i915/gtt:
Allow >= 4GB sizes for vm"), i915_gem_obj_offset and i915_gem_obj_ggtt_offset
return an unsigned long, which in only 4-bytes long in 32-bit kernels.

Change return type (and other related offset variables) to u64.

Since Global GTT is always limited to 4GB, this change would not be required
in i915_gem_obj_ggtt_offset, but this is done for consistency.

v2: Remove unnecessary offset variable in do_pin, as we already have
vma->node.start (Chris).
Update GGTT offset too (Tvrtko).

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# ea91e401 29-Jul-2015 Michel Thierry <michel.thierry@intel.com>

drm/i915/gen8: Add ppgtt info and debug_dump

v2: Clean up patch after rebases.
v3: gen8_dump_ppgtt for 32b and 48b PPGTT.
v4: Use used_pml4es/pdpes (Akash).
v5: Rebase after Mika's ppgtt cleanup / scratch merge patch series.
v6: Rely on used_px bits instead of null checking (Akash)

Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 69ab76fd 29-Jul-2015 Michel Thierry <michel.thierry@intel.com>

drm/i915/gen8: Initialize PDPs and PML4

Similar to PDs, while setting up a page directory pointer, make all entries
of the pdp point to the scratch pd before mapping (and make all its entries
point to the scratch page); this is to be safe in case of out of bound
access or proactive prefetch.

Also add a scratch pdp, which the PML4 entries point to.

v2: Handle scratch_pdp allocation failure correctly, and keep
initialize_px functions together (Akash)
v3: Rebase after Mika's ppgtt cleanup / scratch merge patch series. Rely on
the added macros to initialize the pdps.
v4: Rebase after final merged version of Mika's ppgtt/scratch patches
(and removed commit message part related to v3).
v5: Update commit message to also mention PML4 table initialization and
the new scratch pdp (Akash).

Suggested-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# de5ba8eb 03-Aug-2015 Michel Thierry <michel.thierry@intel.com>

drm/i915/gen8: Add 4 level support in insert_entries and clear_range

When 48b is enabled, gen8_ppgtt_insert_entries needs to read the Page Map
Level 4 (PML4), before it selects which Page Directory Pointer (PDP)
it will write to.

Similarly, gen8_ppgtt_clear_range needs to get the correct PDP/PD range.

This patch was inspired by Ben's "Depend exclusively on map and
unmap_vma".

v2: Rebase after s/page_tables/page_table/.
v3: Remove unnecessary pdpe loop in gen8_ppgtt_clear_range_4lvl and use
clamp_pdp in gen8_ppgtt_insert_entries (Akash).
v4: Merge gen8_ppgtt_clear_range_4lvl into gen8_ppgtt_clear_range to
maintain symmetry with gen8_ppgtt_insert_entries (Akash).
v5: Do not mix pages and bytes in insert_entries (Akash).
v6: Prevent overflow in sg_nents << PAGE_SHIFT, when inserting 4GB at
once.
v7: Rebase after Mika's ppgtt cleanup / scratch merge patch series.
Use gen8_px_index functions, and remove unnecessary number of pages
parameter in insert_pte_entries.
v8: Change gen8_ppgtt_clear_pte_range to stop at PDP boundary, instead of
adding and extra clamp function; remove unnecessary pdp_start/pdp_len
variables (Akash).
v9: pages->orig_nents instead of sg_nents(pages->sgl) to get the
length (Akash).
v10: Remove pdp warning check ingen8_ppgtt_insert_pte_entries until this
commit (Akash).

Reviewed-by: Akash Goel <akash.goel@intel.com> (v9)
Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 3387d433 03-Aug-2015 Michel Thierry <michel.thierry@intel.com>

drm/i915/gen8: Pass sg_iter through pte inserts

As a step towards implementing 4 levels, while not discarding the
existing pte insert functions, we need to pass the sg_iter through.
The current function understands to the page directory granularity.
An object's pages may span the page directory, and so using the iter
directly as we write the PTEs allows the iterator to stay coherent
through a VMA insert operation spanning multiple page table levels.

v2: Rebase after s/page_tables/page_table/.
v3: Rebase after Mika's ppgtt cleanup / scratch merge patch series;
updated commit message (s/map/insert).
v4: Rebase.

Reviewed-by: Akash Goel <akash.goel@intel.com> (v3)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 2dba3239 30-Jul-2015 Michel Thierry <michel.thierry@intel.com>

drm/i915/gen8: Add 4 level switching infrastructure and lrc support

In 64b (48bit canonical) PPGTT addressing, the PDP0 register contains
the base address to PML4, while the other PDP registers are ignored.

In LRC, the addressing mode must be specified in every context
descriptor, and the base address to PML4 is stored in the reg state.

v2: PML4 update in legacy context switch is left for historic reasons,
the preferred mode of operation is with lrc context based submission.
v3: s/gen8_map_page_directory/gen8_setup_page_directory and
s/gen8_map_page_directory_pointer/gen8_setup_page_directory_pointer.
Also, clflush will be needed for bxt. (Akash)
v4: Squashed lrc-specific code and use a macro to set PML4 register.
v5: Rebase after Mika's ppgtt cleanup / scratch merge patch series.
PDP update in bb_start is only for legacy 32b mode.
v6: Rebase after final merged version of Mika's ppgtt/scratch
patches.
v7: There is no need to update the pml4 register value in
execlists_update_context. (Akash)
v8: Move pd and pdp setup functions to a previous patch, they do not
belong here. (Akash)
v9: Check USES_FULL_48BIT_PPGTT instead of GEN8_CTX_ADDRESSING_MODE in
gen8_emit_bb_start to check if emit pdps is needed. (Akash)

Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 762d9936 30-Jul-2015 Michel Thierry <michel.thierry@intel.com>

drm/i915/gen8: implement alloc/free for 4lvl

PML4 has no special attributes, and there will always be a PML4.
So simply initialize it at creation, and destroy it at the end.

The code for 4lvl is able to call into the existing 3lvl page table code
to handle all of the lower levels.

v2: Return something at the end of gen8_alloc_va_range_4lvl to keep the
compiler happy. And define ret only in one place.
Updated gen8_ppgtt_unmap_pages and gen8_ppgtt_free to handle 4lvl.
v3: Use i915_dma_unmap_single instead of pci API. Fix a
couple of incorrect checks when unmapping pdp and pd pages (Akash).
v4: Call __pdp_fini also for 32b PPGTT. Clean up alloc_pdp param list.
v5: Prevent (harmless) out of range access in gen8_for_each_pml4e.
v6: Simplify alloc_vma_range_4lvl and gen8_ppgtt_init_common error
paths. (Akash)
v7: Rebase, s/gen8_ppgtt_free_*/gen8_ppgtt_cleanup_*/.
v8: Change location of pml4_init/fini. It will make next patches
cleaner.
v9: Rebase after Mika's ppgtt cleanup / scratch merge patch series, while
trying to reuse as much as possible for pdp alloc. pml4_init/fini
replaced by setup/cleanup_px macros.
v10: Rebase after Mika's merged ppgtt cleanup patch series.
v11: Rebase after final merged version of Mika's ppgtt/scratch
patches.
v12: Fix pdpe start value in trace (Akash)
v13: Define all 4lvl functions in this patch directly, instead of
previous patches, add i915_page_directory_pointer_entry_alloc here,
use test_bit to detect when pdp is already allocated (Akash).
v14: Move pdp allocation into a new gen8_ppgtt_alloc_page_dirpointers
funtion, as we do for pds and pts; move pd and pdp setup functions to
this patch (Akash).
v15: Added kfree(pdp) from previous patch to this (Akash).

Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 81ba8aef 03-Aug-2015 Michel Thierry <michel.thierry@intel.com>

drm/i915/gen8: Add PML4 structure

Introduces the Page Map Level 4 (PML4), ie. the new top level structure
of the page tables.

To facilitate testing, 48b mode will be available on Broadwell and
GEN9+, when i915.enable_ppgtt = 3.

v2: Remove unnecessary CONFIG_X86_64 checks, ppgtt code is already
32/64-bit safe (Chris).
v3: Add goto free_scratch in temp 48-bit mode init code (Akash).
v4: kfree the pdp until the 4lvl alloc/free patch (Akash).
v5: Postpone 48-bit code in sanitize_enable_ppgtt (Akash).
v6: Keep _insert_pte_entries changes outside this patch (Akash).

Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 4c06ec8d 29-Jul-2015 Michel Thierry <michel.thierry@intel.com>

drm/i915/gen8: Add dynamic page trace events

The dynamic page allocation patch series added it for GEN6, this patch
adds them for GEN8.

v2: Consolidate pagetable/page_directory events
v3: Multiple rebases.
v4: Rebase after s/page_tables/page_table/.
v5: Rebase after Mika's ppgtt cleanup / scratch merge patch series.
v6: Rebase after gen8_map_pagetable_range removal.
v7: Use generic page name (px) in DECLARE_EVENT_CLASS (Akash)
v8: Defer define of i915_page_directory_pointer_entry_alloc (Akash)

Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v3+)
Reviewed-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# f9b5b782 30-Jul-2015 Michel Thierry <michel.thierry@intel.com>

drm/i915/gen8: Generalize PTE writing for GEN8 PPGTT

The insert_entries function was the function used to write PTEs. For the
PPGTT it was "hardcoded" to only understand two level page tables, which
was the case for GEN7. We can reuse this for 4 level page tables, and
remove the concept of insert_entries, which was never viable past 2
level page tables anyway, but it requires a bit of rework to make the
function a bit more generic.

v2: Rebase after Mika's ppgtt cleanup / scratch merge patch series.
v3: Rebase after final merged version of Mika's ppgtt/scratch patches.
v4: Check and warn for NULL value of pdp pointer (Akash).

Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2)
Reviewed-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# d4ec9da0 30-Jul-2015 Michel Thierry <michel.thierry@intel.com>

drm/i915/gen8: Abstract PDP usage

Up until now, ppgtt->pdp has always been the root of our page tables.
Legacy 32b addresses acted like it had 1 PDP with 4 PDPEs.

In preparation for 4 level page tables, we need to stop using ppgtt->pdp
directly unless we know it's what we want. The future structure will use
ppgtt->pml4 for the top level, and the pdp is just one of the entries
being pointed to by a pml4e. The temporal pdp local variable will be
removed once the rest of the 4-level code lands.

Also, start passing the vm pointer to the alloc functions, instead of
ppgtt.

v2: Updated after dynamic page allocation changes.
v3: Rebase after s/page_tables/page_table/.
v4: Rebase after changes in "Dynamic page table allocations" patch.
v5: Rebase after Mika's ppgtt cleanup / scratch merge patch series.
v6: Rebase after final merged version of Mika's ppgtt/scratch patches.
v7: Keep pagetable map in-line (and avoid unnecessary for_each_pde
loops), remove redundant ppgtt pointer in _alloc_pagetabs (Akash)
v8: Fix text indentation in _alloc_pagetabs/page_directories (Chris)
v9: Defer gen8_alloc_va_range_4lvl definition until 4lvl is implemented,
clean-up gen8_ppgtt_cleanup [pun intended] (Akash).
v10: Clean-up commit message (Akash).

Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: "Akash Goel" <akash.goel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 6ac18502 29-Jul-2015 Michel Thierry <michel.thierry@intel.com>

drm/i915/gen8: Make pdp allocation more dynamic

This transitional patch doesn't do much for the existing code. However,
it should make upcoming patches to use the full 48b address space a bit
easier.

32-bit ppgtt uses just 4 PDPs, while 48-bit ppgtt will have up-to 512;
this patch prepares the existing functions to query the right number of pdps
at run-time. This also means that used_pdpes should also be allocated during
ppgtt_init, as the bitmap size will depend on the ppgtt address range
selected.

v2: Renamed pdp_free to be similar to pd/pt (unmap_and_free_pdp).
v3: To facilitate testing, 48b mode will be available on Broadwell and
GEN9+, when i915.enable_ppgtt = 3.
v4: Rebase after s/page_tables/page_table/, added extra information
about 4-level page table formats and use IS_ENABLED macro.
v5: Check CONFIG_X86_64 instead of CONFIG_64BIT.
v6: Rebase after Mika's ppgtt cleanup / scratch merge patch series, and
follow
his nomenclature in pdp functions (there is no alloc_pdp yet).
v7: Rebase after merged version of Mika's ppgtt cleanup patch series.
v8: Rebase after final merged version of Mika's ppgtt/scratch patches.
v9: Introduce PML4 (and 48-bit checks) until next patch (Akash).
v10: Also use test_bit to detect when pd/pt are already allocated (Akash)

Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Akash Goel <akash.goel@intel.com>
[danvet: Amend commit message as suggested by Michel.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 09120d4e 29-Jul-2015 Michel Thierry <michel.thierry@intel.com>

drm/i915: Remove unnecessary gen8_clamp_pd

gen8_clamp_pd clamps to the next page directory boundary, but the macro
gen8_for_each_pde already has a check to stop at the page directory
boundary.

Furthermore, i915_pte_count also restricts to the next page table
boundary.

v2: Rebase after Mika's ppgtt cleanup / scratch merge patch series.

Suggested-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: "Akash Goel" <akash.goel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# d0e30adc 29-Jul-2015 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Mark PIN_USER binding as GLOBAL_BIND without the aliasing ppgtt

If the device does not support the aliasing ppgtt, we must translate
user bind requests (PIN_USER) from LOCAL_BIND to a GLOBAL_BIND. However,
since this is device specific we cannot do this conveniently in the
upper layers and so must manage the vma->bound flags in the backend.

Partial revert of commit 75d04a3773ecee617847de963ae4195d6aa74c28 [4.2-rc1]
Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Date: Tue Apr 28 17:56:17 2015 +0300

drm/i915/gtt: Allocate va range only if vma is not bound

Note this was spotted by Daniel originally, but we dropped the ball in
getting the fix in before the bug going wild. Sorry all.

Reported-by: Vincent Legoll vincent.legoll@gmail.com
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91133
References: https://bugs.freedesktop.org/show_bug.cgi?id=90224
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 5ec5b516 08-Jul-2015 Imre Deak <imre.deak@intel.com>

drm/i915: remove unused has_dma_mapping flag

After the previous patch this flag will check always clear, as it's
never set for shmem backed and userptr objects, so we can remove it.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Yeah this isn't really fixes but it's a nice cleanup to
clarify the code but not really worth the hassle of backmerging. So
just add to -fixes, we're still early in -rc.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 2c3d9984 06-Jul-2015 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Restore all GGTT VMAs on resume

When rotated and partial views were added no one spotted the resume
path which assumes only one GGTT VMA per object and hence is now
skipping rebind of alternative views.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 8776f02b 30-Jun-2015 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/gtt: Per ppgtt scratch page

Previously we have pointed the page where the individual ppgtt
scratch structures refer to, to be the instance which GGTT setup have
allocated. So it has been shared.

To achieve full isolation between ppgtts also in this regard,
allocate per ppgtt scratch page.

Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 4ad2af1e 30-Jun-2015 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/gtt: Return struct i915_scratch_page from alloc_scratch

Every other alloc_* function return the pointer to the page
they alloc. Follow the convention with scratch page also.

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 2e906bea 30-Jun-2015 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/gtt: Reorder page alloc/free/init functions

Maintain base page handling functions in order of
alloc, free, init. No functional changes.

v2: s/Introduce/Maintain (Michel)
v3: Rebase

Cc: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# f37c0505 10-Jun-2015 Michel Thierry <michel.thierry@intel.com>

drm/i915/gtt: Switch gen8_free_page_tables params

After Mika's ppgtt cleanup series, all the other free functions have
drm_device as the first parameter, except this one.

No functional changes.

Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 00245266 24-Jun-2015 Michel Thierry <michel.thierry@intel.com>

drm/i915/ppgtt: Break loop in gen8_ppgtt_clear_range failure path

If for some reason [1], the page directory/table does not exist, clear_range
would end up in an infinite while loop.

Introduced by commit 06fda602dbca ("drm/i915: Create page table allocators").

[1] This is already being addressed in one of Mika's patches:
http://mid.gmane.org/1432314314-23530-17-git-send-email-mika.kuoppala@intel.com

Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: stable@vger.kernel.org
Reported-by: John Harrison <john.c.harrison@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# 966082c9 25-Jun-2015 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/gtt: Use nonatomic bitmap ops

There is no need for atomicity here. Convert all bitmap
operations to nonatomic variants.

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 79ab9370 25-Jun-2015 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/gtt: Move scratch_pd and scratch_pt into vm struct

Scratch page is part of struct i915_address_space. Move other
scratch entities into the same struct. This is a preparatory patch
for having only one instance of each scratch_pt/pd.

v2: make commit msg more readable

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com> (v1)
[danvet: Bikeshed summary to avoid confusion with vmas.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# fe36f55d 25-Jun-2015 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/gtt: Cleanup page directory encoding

Write page directory entry without using superfluous
indirect function. Also remove unused device parameter
from the encode function.

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# b2dd4511 25-Jun-2015 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/gtt: Pin vma during virtual address allocation

Dynamic page table allocation might wake the shrinker
when memory is requested for page table structures.
As this happens when we try to allocate the virtual address
during binding, our vma might be among the targets for eviction.
We should do i915_vma_pin() and do pin early in there like Chris
suggests but this is interim solution.

Shield our vma from shrinker by incrementing pin count before
the virtual address is allocated.

The proper place to fix this would be in gem, inside of
i915_vma_pin(). But we don't have that yet so take the short
cut as a intermediate solution.

Testcase: igt/gem_ctx_thrash
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# c114f76a 25-Jun-2015 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/gtt: Make scratch page i915_page_dma compatible

Lay out scratch page structure in similar manner than other
paging structures. This allows us to use the same tools for
setup and teardown.

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 567047be 25-Jun-2015 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/gtt: Use macros to access dma mapped pages

Make paging structure type agnostic *_px macros to access
page dma struct, the backing page and the dma address.

This makes the code less cluttered on internals of
i915_page_dma.

v2: Superfluous const -> nonconst removed
v3: Rebased

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com> (v2)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# d1c54acd 25-Jun-2015 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/gtt: Introduce kmap|kunmap for dma page

As there is flushing involved when we have done the cpu
write, make functions for mapping for cpu space. Make macros
to map any type of paging structure.

v2: Make it clear tha flushing kunmap is only for ppgtt (Ville)
v3: Flushing fixed (Ville, Michel). Removed superfluous semicolon

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 73eeea53 25-Jun-2015 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/gtt: Introduce fill_page_dma()

When we setup page directories and tables, we point the entries
to a to the next level scratch structure. Make this generic
by introducing a fill_page_dma which maps and flushes. We also
need 32 bit variant for legacy gens.

v2: Fix flushes and handle valleyview (Ville)
v3: Now really fix flushes (Michel, Ville)

Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# cee30c54 25-Jun-2015 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/gtt: Remove superfluous free_pd with gen6/7

This has slipped in somewhere but it was harmless
as we check the page pointer before teardown.

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# a08e111a 25-Jun-2015 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/gtt: Rename unmap_and_free_px to free_px

All the paging structures are now similar and mapped for
dma. The unmapping is taken care of by common accessors, so
don't overload the reader with such details.

v2: Be consistent with goto labels (Michel)

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 44159ddb 25-Jun-2015 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/gtt: Introduce struct i915_page_dma

All our paging structures have struct page and dma address
for that page.

Add struct for page/dma address pairs and use it to make
the setup and teardown for different paging structures
identical.

Include the page directory offset also in the struct for legacy
gens. Rename it to clearly point out that it is offset into the
ggtt.

v2: Add comment about ggtt_offset (Michel)

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# d852c7bf 25-Jun-2015 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/gtt: Introduce i915_page_dir_dma_addr

The legacy mode mm switch and the execlist context assignment
needs dma address for the page directories.

Introduce a function that encapsulates the scratch_pd dma
fallback if no pd is found.

v2: Rebase, s/ring/req

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# c44ef60e 25-Jun-2015 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/gtt: Allow >= 4GB sizes for vm.

We can have exactly 4GB sized ppgtt with 32bit system.
size_t is inadequate for this.

v2: Convert a lot more places (Daniel)

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# a05d80ee 25-Jun-2015 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/gtt: Check va range against vm size

Check the allocation area against the known end
of address space instead of against fixed value.

v2: Return ENODEV on internal bugs (Chris)

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 5b7e4c9c 25-Jun-2015 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/gtt: Mark TLBS dirty for gen8+

When we touch gen8+ page maps, mark them dirty like we
do with previous gens.

v2: Update comment (Joonas)

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 9e759ff1 22-Jun-2015 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Return correct size for rotated views

Currently object size is returned for the rotated VMA size which can be
bigger than the rotated view itself. Since the binding code pads all
excess size with scratch pages the only minor issue with this is wasting
some GGTT space, but still feels nicer to fix and report the real size.

v2: Rebase for tracking size in bytes instead of pages.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 84fe03f7 23-Jun-2015 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Move rotated geometry calculations into the fill helper

This way data is available as soon as the view is passed into the call chain.

v2: Store size in bytes instead of pages under the appropriate name. (Chris Wilson)

v3: Use uint64_t instead of size_t. (Daniel Vetter)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (v2)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# c9f8fd2d 24-Jun-2015 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Remove mostly unused variable in intel_rotate_fb_obj_pages

It is only used in logging and it doesn't need to exist on its own.

Also it was misleading to log view size as object size.

v2: Improve commit message. (Joonas Lahtinen)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
[danvet: s/%lu/%zu/ where needed, reported by 0-day.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 5fb9de1a 29-May-2015 John Harrison <John.C.Harrison@Intel.com>

drm/i915: Update intel_ring_begin() to take a request structure

Now that everything above has been converted to use requests, intel_ring_begin()
can be updated to take a request instead of a ring. This also means that it no
longer needs to lazily allocate a request if no-one happens to have done it
earlier.

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# a84c3ae1 29-May-2015 John Harrison <John.C.Harrison@Intel.com>

drm/i915: Update ring->flush() to take a requests structure

Updated the various ring->flush() functions to take a request instead of a ring.
Also updated the tracer to include the request id.

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
[danvet: Rebase since I didn't merge the addition of req->uniq.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# e85b26dc 29-May-2015 John Harrison <John.C.Harrison@Intel.com>

drm/i915: Update switch_mm() to take a request structure

Updated the switch_mm() code paths to take a request instead of a ring. This
includes the myriad *_mm_switch functions themselves and a bunch of PDP related
helper functions.

v2: Rebased to newer tree.

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# b3dd6b96 29-May-2015 John Harrison <John.C.Harrison@Intel.com>

drm/i915: Update ppgtt_init_ring() & context_enable() to take requests

The final step in removing the OLR from i915_gem_init_hw() is to pass the newly
allocated request structure in to each step rather than passing a ring
structure. This patch updates both i915_ppgtt_init_ring() and
i915_gem_context_enable() to take request pointers.

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 4ad2fd88 18-Jun-2015 John Harrison <John.C.Harrison@Intel.com>

drm/i915: Split i915_ppgtt_init_hw() in half - generic and per ring

The i915_gem_init_hw() function calls a bunch of smaller initialisation
functions. Multiple of which have generic sections and per ring sections. This
means multiple passes are done over the rings. Each pass writes data to the ring
which floats around in that ring's OLR until some random point in the future
when an add_request() is done by some random other piece of code.

This patch breaks i915_ppgtt_init_hw() in two with the per ring initialisation
now being done in i915_ppgtt_init_ring(). The ring looping is now done at the
top level in i915_gem_init_hw().

v2: Fix dumb loop variable re-use.

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 8a1ebd74 22-May-2015 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/gtt: Remove _single from page table allocator

We are always allocating a single page. No need to be verbose so
remove the suffix.

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# ea3f5d26 22-May-2015 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/gtt: Don't leak scratch page on mapping error

Free the scratch page if dma mapping fails.

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 501fd70f 29-May-2015 Michel Thierry <michel.thierry@intel.com>

drm/i915: limit PPGTT size to 2GB in 32-bit platforms

We already set this limit for the GGTT.

This is a temporary patch until a full replacement of size_t variables
(inadequate in 32-bit kernel) is in place.

Regression from:
commit a4e0bedca678c81eea4cd79a4bd502335639f73a
Author: Michel Thierry <michel.thierry@intel.com>
Date: Wed Apr 8 12:13:35 2015 +0100

drm/i915: Use complete address space in true PPGTT

v2: Prettify code and explain why this is needed. (Chris)
v3: Don't hide the compilation warning in 32-bit. (Chris)

Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# f3e06f11 12-May-2015 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/gtt: Fix the boundary check for vm area

The check for start + length >= total_vm_size is
wrong since start + length can be exactly the size of
the vm.

Fix the check to allow allocation to boundary.

Fixes a regression in commit 4dd738e9cd79
("drm/i915: Fix 32b overflow check in gen8_ppgtt_alloc_page_directories")

Testcase: igt/gem_evict_everything/swapping-interruptible
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90399
Tested-by: Lu Hua <huax.lu@intel.com>
Cc: Chris Wilson <chris@chris.wilson.co.uk>
Cc: Dave Gordon <david.s.gordon@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 8bd7ef16 06-May-2015 Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

drm/i915: Add a partial GGTT view type

Partial view type allows manipulating parts of huge BOs through the GGTT,
which was not previously possible due to constraint that whole object had
to be mapped for any access to it through GGTT.

v2:
- Retain error value from sg_alloc_table (Tvrtko Ursulin)
- Do not zero already zeroed variable (Tvrtko Ursulin)
- Use more common variable types for page size/offset (Tvrtko Ursulin)
v3:
- Only compare additional view parameters when need to (Tvrtko Ursulin)
v4:
- Do zero out the variable that needs to be (bug introduced in v2).

Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 91e6711e 06-May-2015 Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

drm/i915: Do not make assumptions on GGTT VMA sizes

GGTT VMA sizes might be smaller than the whole object size due to
different GGTT views.

v2:
- Separate GGTT view constraint calculations from normal view
constraint calculations (Chris Wilson)
v3:
- Do not bother with debug wording. (Tvrtko Ursulin)
v4:
- Clearer logic for calculating map_and_fenceable (Tvrtko Ursulin)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
[danvet: Drop BUG_ON, it's redudant.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 4dd738e9 30-Apr-2015 Michel Thierry <michel.thierry@intel.com>

drm/i915: Fix 32b overflow check in gen8_ppgtt_alloc_page_directories

The patch 69876bed7e008f5fe01538a2d47c09f2862129d0: "drm/i915/gen8:
page directories rework allocation" added an overflow warning, but the
mask had an extra 0. Use less typo-prone option suggested by Dave
instead, to check for (start + length) >= 0x100000000ULL.

This check will be unnecessary after gen8_alloc_va_range handles more
than 4 PDPs (48b addressing).

v2: Really check for 32b overflow (Ville)

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Dave Gordon <david.s.gordon@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# d4dc5e92 28-Apr-2015 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove incorrect restriction on 32bit offsets in ppGTT backend

This is the wrong layer to apply an arbitrary restriction and the wrong
error code (object too large!). If we do want to prevent large offsets
being return to the user on 32bit systems (to hide bugs in userspace),
you want to restrict the drm_mm range manager instead. This first tells
userspace about the correct size of the GTT they can use (so they don't
try and overallocate object or batches), and fixes the eviction logic to
avoid the eventual and *guaranteed* error.

Fixes regression in
commit d7b2633dba04ef0fd7385f02a7b552abc5f1062f
Author: Michel Thierry <michel.thierry@intel.com>
Date: Wed Apr 8 12:13:34 2015 +0100

drm/i915/gen8: Dynamic page table allocations

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 06615ee5 24-Apr-2015 Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

drm/i915: Do not clear mappings beyond VMA size

Do not to clear mappings outside the allocated VMA under any
circumstances. Only clear the smaller of VMA or object page count.

This is required to allow creating partial object VMAs which in
turn are needed for partial GGTT views.

Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 75d04a37 28-Apr-2015 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/gtt: Allocate va range only if vma is not bound

When we have bound vma into an address space, the layout
of page table structures is immutable. So we can be absolutely
certain that if vma is already bound, there is no need to
(re)allocate a virtual address range for it.

v2: - add sanity checks and remove superfluous GLOBAL_BIND set
- we might do update for an unbound vma (Chris)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90224
Testcase: igt/gem_exec_big #bdw
Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# 70b9f6f8 14-Apr-2015 Daniel Vetter <daniel.vetter@ffwll.ch>

rm/i915: Move i915_get_ggtt_vma_pages into ggtt_bind_vma

We have this neat abstraction between ppgtt and ggtt for (un)bind_vma
and didn't end up using it really. What a shame, so fix this and make
the ->bind_vma hook a bit more useful.

Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>


# 2c642b07 14-Apr-2015 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Don't try to outsmart gcc in i915_gem_gtt.c

Sprinkling static inline all over the place is carg-culting. Remove
it.

Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# d369d2d9 14-Apr-2015 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Unduplicate i915_ggtt_unbind/bind_vma

ggtt_bind/unbind_vma already has checks for aliasing ppgtt or not,
there's nothing else magic they do. Resurrect i915_ggtt_insert_entries
to make the reuse possibel.

Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 47552659 14-Apr-2015 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Move ppgtt_bind/unbind around

Again avoids some forward declarations.

Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# fa42331b 14-Apr-2015 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: move i915_gem_restore_gtt_mappings around

Avoids 2 forward declarations.

Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 0875546c 20-Apr-2015 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Fix up the vma aliasing ppgtt binding

Currently we have the problem that the decision whether ptes need to
be (re)written is splattered all over the codebase. Move all that into
i915_vma_bind. This needs a few changes:
- Just reuse the PIN_* flags for i915_vma_bind and do the conversion
to vma->bound in there to avoid duplicating the conversion code all
over.
- We need to make binding for EXECBUF (i.e. pick aliasing ppgtt if
around) explicit, add PIN_USER for that.
- Two callers want to update ptes, give them a PIN_UPDATE for that.

Of course we still want to avoid double-binding, but that should be
taken care of:
- A ppgtt vma will only ever see PIN_USER, so no issue with
double-binding.
- A ggtt vma with aliasing ppgtt needs both types of binding, and we
track that properly now.
- A ggtt vma without aliasing ppgtt could be bound twice. In the
lower-level ->bind_vma functions hence unconditionally set
GLOBAL_BIND when writing the ggtt ptes.

There's still a bit room for cleanup, but that's for follow-up
patches.

v2: Fixup fumbles.

v3: s/PIN_EXECBUF/PIN_USER/ for clearer meaning, suggested by Chris.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>


# f329f5f6 14-Apr-2015 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Move PTE_READ_ONLY to ->pte_encode vfunc

It's only used as a flag there, so unconfuse things a bit.
Also separate the bind_vma flag space from the pte_encode flag
space in the code.

Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 5c5f6457 14-Apr-2015 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Unify aliasing ppgtt handling

With the dynamic pagetable alloc code aliasing ppgtt special-cases
where again mixed in all over the place with the low-level init code.

Extract the va preallocation and clearing again into the common code
where aliasing ppgtt gets set up.

Note that with this we don't set the size of the aliasing ppgtt to the
size of the parent ggtt address space. Which isn't required at all
since except for the ppgtt setup/cleanup code no one ever looks at
this.

Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>


# 061dd493 14-Apr-2015 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Clean up aliasing ppgtt correctly on error paths

While at it inline the free functions - they don't actually free the
ppgtt, just clean up the allocations done for it.

Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 777dc5bb 14-Apr-2015 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Move vma vfuns to adddress_space

They change with the address space and not with each vma, so move them
into the right pile of vfuncs. Save 2 pointers per vma and clarifies
the code.

Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>


# c7e16f22 14-Apr-2015 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Move gen8 clear_range vfunc setup into common code

Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>


# 1d335d1b 10-Apr-2015 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915: Move vm page allocation in proper place

Move to i915_vma_bind as it is part of the binding.

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# e20d2ab7 07-Apr-2015 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Use a separate slab for vmas

vma are more frequently allocated than objects and so should equally
benefit from having a dedicated slab.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# a4e0bedc 07-Apr-2015 Michel Thierry <michel.thierry@intel.com>

drm/i915: Use complete address space in true PPGTT

True PPGTT is capable of having a full address space, even if the system
has less allocated memory.

Note that aliasing PPGTT always aliases the GGTT and thus should remain
of the same size.

Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# d7b2633d 07-Apr-2015 Michel Thierry <michel.thierry@intel.com>

drm/i915/gen8: Dynamic page table allocations

This finishes off the dynamic page tables allocations, in the legacy 3
level style that already exists. Most everything has already been setup
to this point, the patch finishes off the enabling by setting the
appropriate function pointers.

In LRC mode, contexts need to know the PDPs when they are populated. With
dynamic page table allocations, these PDPs may not exist yet. Check if
PDPs have been allocated and use the scratch page if they do not exist yet.

Before submission, update the PDPs in the logic ring context as PDPs
have been allocated.

v2: Update aliasing/true ppgtt allocate/teardown/clear functions for
gen 6 & 7.

v3: Rebase.

v4: Remove BUG() from ppgtt_unbind_vma, but keep checking that either
teardown_va_range or clear_range functions exist (Daniel).

v5: Similar to gen6, in init, gen8_ppgtt_clear_range call is only needed
for aliasing ppgtt. Zombie tracking was originally added for teardown
function and is no longer required.

v6: Update err_out case in gen8_alloc_va_range (missed from lastest
rebase).

v7: Rebase after s/page_tables/page_table/.

v8: Updated scratch_pt check after scratch flag was removed in previous
patch.

v9: Note that lrc mode needs to be updated to support init state without
any PDP.

v10: Unmap correct page_table in gen8_alloc_va_range's error case, clean-up
gen8_aliasing_ppgtt_init (remove duplicated map), and initialize PTs
during page table allocation.

v11: Squashed LRC enabling commit, otherwise LRC mode would be left broken
until it was updated to handle the init case without any PDP.

v12: Do not overallocate new_pts bitmap, make alloc_gen8_temp_bitmaps
static and don't abuse of inline functions. (Mika)

Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 33c8819f 07-Apr-2015 Michel Thierry <michel.thierry@intel.com>

drm/i915/gen8: begin bitmap tracking

Like with gen6/7, we can enable bitmap tracking with all the
preallocations to make sure things actually don't blow up.

v2: Rebased to match changes from previous patches.
v3: Without teardown logic, rely on used_pdpes and used_pdes when
freeing page tables.
v4: Rebased after s/page_tables/page_table/.
v5: Rebased after page table generalizations.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# e5815a2e 07-Apr-2015 Michel Thierry <michel.thierry@intel.com>

drm/i915/gen8: Split out mappings

When we do dynamic page table allocations for gen8, we'll need to have
more control over how and when we map page tables, similar to gen6.
In particular, DMA mappings for page directories/tables occur at allocation
time.

This patch adds the functionality and calls it at init, which should
have no functional change.

The PDPEs are still a special case for now. We'll need a function for
that in the future as well.

v2: Handle renamed unmap_and_free_page functions.
v3: Updated after teardown_va logic was removed.
v4: Rebase after s/page_tables/page_table/.
v5: No longer allocate all PDPs in GEN8+ systems with less than 4GB of
memory, and update populate_lr_context to handle this new case (proper
tracking will be added later in the patch series).
v6: Assign lrc page directory pointer addresses using a macro. (Mika)

Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# c488dbba 07-Apr-2015 Michel Thierry <michel.thierry@intel.com>

drm/i915: Extract PPGTT param from page_directory alloc

This will be useful for when we move to 48b addressing, and the PDP isn't
the root of the page table structure.

v2: Rebase after changes for Gen8+ systems with less than 4GB of memory.
v3: Rebase after Mika's code review.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2)
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 09942c65 07-Apr-2015 Michel Thierry <michel.thierry@intel.com>

drm/i915: num_pd_pages/num_pd_entries isn't useful

These values are never quite useful for dynamic allocations of the page
tables. Getting rid of them will help prevent later confusion.

v2: Updated to use unmap_and_free_pd functions.
v3: Updated gen8_ppgtt_free after teardown logic was removed.
v4: Rebase after s/page_tables/page_table/.
v5: Keep allocating all page directories in GEN8+ systems with less
than 4GB of memory. Updated gen6_for_all_pdes.
v6: Prevent (harmless) out of range access in gen6_for_all_pdes.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 7cb6d7ac 07-Apr-2015 Michel Thierry <michel.thierry@intel.com>

drm/i915/gen8: Update pdp switch and point unused PDPs to scratch page

One important part of this patch is we now write a scratch page
directory into any unused PDP descriptors. This matters for 2 reasons,
first, we're not allowed to just use 0, or an invalid pointer, and second,
we must wipe out any previous contents from the last context.

The latter point only matters with full PPGTT. The former point only
effect platforms with less than 4GB memory.

v2: Updated commit message to point that we must set unused PDPs to the
scratch page.

v3: Unmap scratch_pd in gen8_ppgtt_free.

v4: Initialize scratch_pd. (Mika)

Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 5441f0cb 07-Apr-2015 Michel Thierry <michel.thierry@intel.com>

drm/i915/gen8: pagetable allocation rework

Start using gen8_for_each_pde macro to allocate page tables.

v2: teardown_va_range references removed.
v3: Rebase after s/page_tables/page_table/.
v4: Keep setting up page tables for all page directories in systems with
less than 4GB of memory.
v5: Also initialize the page tables. (Mika)
v6: Initialize all page tables, including the extra ones from systems
with less than 4GB of memory. (Mika)

Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 69876bed 07-Apr-2015 Michel Thierry <michel.thierry@intel.com>

drm/i915/gen8: page directories rework allocation

Start using gen8_for_each_pdpe macro to allocate the page directories.

Similar to PTs, while setting up a page directory, make all entries of
the pd point to the scratch pd before mapping (and make all its entries
point to the scratch page); this is to be safe in case of out of bound
access or proactive prefetch. Systems without LLC require an explicit
flush.

v2: Rebased after s/free_pt_*/unmap_and_free_pt/ change.
v3: Rebased after teardown va range logic was removed.
v4: Keep setting up all page directories for systems with less than 4GB
of memory.
v5: Initialize PDs. (Mika)
v6: Initialize also the extra PDs from systems with less than 4GB of
memory. (Mika)

Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 5a8e9943 07-Apr-2015 Michel Thierry <michel.thierry@intel.com>

drm/i915/gen8: Initialize page tables

Similar to gen6, while setting up a page table, make all entries of the
pt point to the scratch page before mapping; this is to be safe in case
of out of bound access or proactive prefetch.

Systems without LLC require an explicit flush.

v2: Expanded commit text and fixed indentation (Mika)

Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 9c57f070 07-Apr-2015 Michel Thierry <michel.thierry@intel.com>

drm/i915: Remove unnecessary gen8_ppgtt_unmap_pages

We are already unmapping them in gen8_ppgtt_free. This function became
redundant since commit 06fda602dbca9c59d87db7da71192e4b54c9f5ff
("drm/i915: Create page table allocators").

Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# ec565b3c 07-Apr-2015 Michel Thierry <michel.thierry@intel.com>

drm/i915: Remove _entry from PPGTT page structures

Lets try to keep this consistent:

Page Directory Pointer (PDP).
Page Directory (PD), also known as page directory pointer entries.
Page Table (PT), also known as page directory entries.

s/struct i915_page_table_entry/struct i915_page_table/
s/struct i915_page_directory_entry/struct i915_page_directory/
s/struct i915_page_directory_pointer_entry/struct
i915_page_directory_pointer/

Suggested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 2a073f89 27-Mar-2015 Imre Deak <imre.deak@intel.com>

drm/i915/bxt: map GTT as uncached

On Broxton per specification the GTT has to be mapped as uncached.
This was caught by the PTE write readback warning, which showed a
corrupted PTE value with using the current write-combine mapping.

v2:
- add comment explaining how the problem with WC mapping manifests
(Daniel)

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Antti Koskipää <antti.koskipaa@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 5a4e33a3 17-Mar-2015 Sumit Singh <sumit.k.singh@intel.com>

drm/i915/bxt: Enable PTE encoding

The caching options for page table entries have remained the same as
Cherryview. This patch fixes it so the right code path is taken on BXT.

v2: Fix up commit message (Mike)

Signed-off-by: Sumit Singh <sumit.k.singh@intel.com>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 9abc4648 27-Mar-2015 Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

drm/i915: Compare GGTT view structs instead of types

To allow for views where the view type is not defined by the view type only,
like it is in stereo or rotated 90 degree view, change the semantic to require
the whole view structure for comparison when we match a GGTT view.

This allows including parameters like offset to be included in the view which
is useful for eg. partial views.

v3:
- Rely on ggtt_view type being 0 for non-GGTT vma's, which equals to
I915_GGTT_VIEW_NORMAL. (Daniel Vetter)
- Do not use potentially slower comparison when we only want to know if
something is or is not a normal view.
- Rebase on top of rotated view patches. Add rotated view singleton.
- If one view is missing in comparison they're equal only if both are missing.

v4:
- Use comparison helper in obj_to_ggtt_view too. (Tvrtko Ursulin)
- Do WARN_ON if one view is NULL. (Tvrtko Ursulin)

Cc: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 2f2cf682 27-Mar-2015 kbuild test robot <fengguang.wu@intel.com>

drm/i915: fix simple_return.cocci warnings

drivers/gpu/drm/i915/i915_gem_gtt.c:1349:1-4: WARNING: end returns can be simpified and declaration on line 1347 can be dropped

Simplify a trivial if-return sequence. Possibly combine with a
preceding function call.
Generated by: scripts/coccinelle/misc/simple_return.cocci

CC: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 72744cb1 24-Mar-2015 Michel Thierry <michel.thierry@intel.com>

drm/i915: Add dynamic page trace events

Traces for page directories and tables allocation and map.

v2: Removed references to teardown.
v3: bitmap_scnprintf has been deprecated.
v4: Replace bitmap_scnprintf with scnprintf correctly, and get right
range lengths. (Mika)

Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 4933d519 24-Mar-2015 Michel Thierry <michel.thierry@intel.com>

drm/i915: Finish gen6/7 dynamic page table allocation

This patch continues on the idea from "Track GEN6 page table usage".
From here on, in the steady state, PDEs are all pointing to the scratch
page table (as recommended in the spec). When an object is allocated in
the VA range, the code will determine if we need to allocate a page for
the page table. Similarly when the object is destroyed, we will remove,
and free the page table pointing the PDE back to the scratch page.

Following patches will work to unify the code a bit as we bring in GEN8
support. GEN6 and GEN8 are different enough that I had a hard time to
get to this point with as much common code as I do.

The aliasing PPGTT must pre-allocate all of the page tables. There are a
few reasons for this. Two trivial ones: aliasing ppgtt goes through the
ggtt paths, so it's hard to maintain, we currently do not restore the
default context (assuming the previous force reload is indeed
necessary). Most importantly though, the only way (it seems from
empirical evidence) to invalidate the CS TLBs on non-render ring is to
either use ring sync (which requires actually stopping the rings in
order to synchronize when the sync completes vs. where you are in
execution), or to reload DCLV. Since without full PPGTT we do not ever
reload the DCLV register, there is no good way to achieve this. The
simplest solution is just to not support dynamic page table
creation/destruction in the aliasing PPGTT.

We could always reload DCLV, but this seems like quite a bit of excess
overhead only to save at most 2MB-4k of memory for the aliasing PPGTT
page tables.

v2: Make the page table bitmap declared inside the function (Chris)
Simplify the way scratching address space works.
Move the alloc/teardown tracepoints up a level in the call stack so that
both all implementations get the trace.

v3: Updated trace event to spit out a name

v4: Aliasing ppgtt is now initialized differently (in setup global gtt)

v5: Rebase to latest code. Also removed unnecessary aliasing ppgtt check
for trace, as it is no longer possible after the PPGTT cleanup patch series
of a couple of months ago (Daniel).

v6: Implement changes from code review (Daniel):
- allocate/teardown_va_range calls added.
- Add a scratch page allocation helper (only need the address).
- Move trace events to a new patch.
- Use updated mark_tlbs_dirty.
- Moved pt preallocation for aliasing ppgtt into gen6_ppgtt_init.

v7: teardown_va_range removed (Daniel).
In init, gen6_ppgtt_clear_range call is only needed for aliasing ppgtt.

v8: Rebase after s/page_tables/page_table/.

v9: Remove unnecessary scratch flag in page_table struct, future patches
can just compare against ppgtt->scratch_pt, and alloc_pt_scratch becomes
redundant. Initialize scratch_pt and pt. (Mika)

v10: Clean up aliasing ppgtt init error path and prevent leaking the
ppgtt obj when init fails. (Mika)
Updated commit author. (Daniel)

Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v4+)
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 59568eb5 24-Mar-2015 Michel Thierry <michel.thierry@intel.com>

drm/i915: Remove unnecessary gen6_ppgtt_unmap_pages

We are already unmapping them in gen6_ppgtt_free. This function became
redundant since commit 06fda602dbca9c59d87db7da71192e4b54c9f5ff
("drm/i915: Create page table allocators").

Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 1266cdb1 24-Mar-2015 Michel Thierry <michel.thierry@intel.com>

drm/i915: Fix i915_dma_map_single positive error code

i915_dma_map_single relies on dma_mapping_error, which returns positive
error codes. Found by static checker.

Introduced by commit 678d96fbb3b5995a2fdff2bca5e1ab4a40b7e968
("drm/i915: Track GEN6 page table usage").

v2: Return negative error code and renamed commit title. (Dan)
v3: Missing reported-by tag (Daniel)

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 1d00dad5 25-Mar-2015 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915/skl: Fix up positive error code

It should have been negative since it is returned with ERR_PTR().

Introduced in new code commit:

commit 50470bb011c4be278097670bea92462f4e8c8945
Author: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Date: Mon Mar 23 11:10:36 2015 +0000

drm/i915/skl: Support secondary (rotated) frame buffer mapping

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 50470bb0 23-Mar-2015 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915/skl: Support secondary (rotated) frame buffer mapping

90/270 rotated scanout needs a rotated GTT view of the framebuffer.

This is put in a separate VMA with a dedicated ggtt view and wired such that
it is created when a framebuffer is pinned to a 90/270 rotated plane.

Rotation is only possible with Yb/Yf buffers and error is propagated to
user space in case of a mismatch.

Special rotated page view is constructed at the VMA creation time by
borrowing the DMA addresses from obj->pages.

v2:
* Do not bother with pages for rotated sg list, just populate the DMA
addresses. (Daniel Vetter)
* Checkpatch cleanup.

v3:
* Rebased on top of new plane handling (create rotated mapping when
setting the rotation property).
* Unpin rotated VMA on unpinning from display plane.
* Simplify rotation check using bitwise AND. (Chris Wilson)

v4:
* Fix unpinning of optional rotated mapping so it is really considered
to be optional.

v5:
* Rebased for fb modifier changes.
* Rebased for atomic commit.
* Only pin needed view for display. (Ville Syrjälä, Daniel Vetter)

v6:
* Rebased after preparatory work has been extracted out. (Daniel Vetter)

v7:
* Slightly simplified tiling geometry calculation.
* Moved rotated GGTT view implementation into i915_gem_gtt.c (Daniel Vetter)

v8:
* Do not use i915_gem_obj_size to get object size since that actually
returns the size of an VMA which may not exist.
* Rebased for ggtt view changes.

v9:
* Rebased after code review changes on the preceding patches.
* Tidy function definitions. (Joonas Lahtinen)

For: VIZ-4726
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com> (v4)
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 563222a7 18-Mar-2015 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915: Track page table reload need

This patch was formerly known as, "Force pd restore when PDEs change,
gen6-7." I had to change the name because it is needed for GEN8 too.

The real issue this is trying to solve is when a new object is mapped
into the current address space. The GPU does not snoop the new mapping
so we must do the gen specific action to reload the page tables.

GEN8 and GEN7 do differ in the way they load page tables for the RCS.
GEN8 does so with the context restore, while GEN7 requires the proper
load commands in the command streamer. Non-render is similar for both.

Caveat for GEN7
The docs say you cannot change the PDEs of a currently running context.
We never map new PDEs of a running context, and expect them to be
present - so I think this is okay. (We can unmap, but this should also
be okay since we only unmap unreferenced objects that the GPU shouldn't
be tryingto va->pa xlate.) The MI_SET_CONTEXT command does have a flag
to signal that even if the context is the same, force a reload. It's
unclear exactly what this does, but I have a hunch it's the right thing
to do.

The logic assumes that we always emit a context switch after mapping new
PDEs, and before we submit a batch. This is the case today, and has been
the case since the inception of hardware contexts. A note in the comment
let's the user know.

It's not just for gen8. If the current context has mappings change, we
need a context reload to switch

v2: Rebased after ppgtt clean up patches. Split the warning for aliasing
and true ppgtt options. And do not break aliasing ppgtt, where to->ppgtt
is always null.

v3: Invalidate PPGTT TLBs inside alloc_va_range.

v4: Rename ppgtt_invalidate_tlbs to mark_tlbs_dirty and move
pd_dirty_rings from i915_address_space to i915_hw_ppgtt. Fixes when
neither ctx->ppgtt and aliasing_ppgtt exist.

v5: Removed references to teardown_va_range.

v6: Updated needs_pd_load_pre/post.

v7: Fix pd_dirty_rings check in needs_pd_load_post, and update/move
comment about updated PDEs to object_pin/bind (Mika).

Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 678d96fb 16-Mar-2015 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915: Track GEN6 page table usage

Instead of implementing the full tracking + dynamic allocation, this
patch does a bit less than half of the work, by tracking and warning on
unexpected conditions. The tracking itself follows which PTEs within a
page table are currently being used for objects. The next patch will
modify this to actually allocate the page tables only when necessary.

With the current patch there isn't much in the way of making a gen
agnostic range allocation function. However, in the next patch we'll add
more specificity which makes having separate functions a bit easier to
manage.

One important change introduced here is that DMA mappings are
created/destroyed at the same page directories/tables are
allocated/deallocated.

Notice that aliasing PPGTT is not managed here. The patch which actually
begins dynamic allocation/teardown explains the reasoning for this.

v2: s/pdp.page_directory/pdp.page_directories
Make a scratch page allocation helper

v3: Rebase and expand commit message.

v4: Allocate required pagetables only when it is needed, _bind_to_vm
instead of bind_vma (Daniel).

v5: Rebased to remove the unnecessary noise in the diff, also:
- PDE mask is GEN agnostic, renamed GEN6_PDE_MASK to I915_PDE_MASK.
- Removed unnecessary checks in gen6_alloc_va_range.
- Changed map/unmap_px_single macros to use dma functions directly and
be part of a static inline function instead.
- Moved drm_device plumbing through page tables operation to its own
patch.
- Moved allocate/teardown_va_range calls until they are fully
implemented (in subsequent patch).
- Merged pt and scratch_pt unmap_and_free path.
- Moved scratch page allocator helper to the patch that will use it.

v6: Reduce complexity by not tearing down pagetables dynamically, the
same can be achieved while freeing empty vms. (Daniel)

v7: s/i915_dma_map_px_single/i915_dma_map_single
s/gen6_write_pdes/gen6_write_pde
Prevent a NULL case when only GGTT is available. (Mika)

v8: Rebased after s/page_tables/page_table/.

v9: Reworked i915_pte_index and i915_pte_count.
Also exercise bitmap allocation here (gen6_alloc_va_range) and fix
incorrect write_page_range in i915_gem_restore_gtt_mappings (Mika).

Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v3+)
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 07749ef3 16-Mar-2015 Michel Thierry <michel.thierry@intel.com>

drm/i915: page table generalizations

No functional changes, but will improve code clarity and removed some
duplicated defines.

Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# eb0b44ad 18-Mar-2015 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: kerneldoc for i915_gem_shrinker.c

And remove one bogus * from i915_gem_gtt.c since that's not a
kerneldoc there.

v2: Review from Chris:
- Clarify memory space to better distinguish from address space.
- Add note that shrink doesn't guarantee the freed memory and that
users must fall back to shrink_all.
- Explain how pinning ties in with eviction/shrinker.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# dabde5c7 18-Mar-2015 Dan Carpenter <dan.carpenter@oracle.com>

drm/i915: memory leak in __i915_gem_vma_create()

In the original code then if WARN_ON(i915_is_ggtt(vm) != !!ggtt_view)
was true then we leak "vma". Presumably that doesn't happen often but
static checkers complain and this bug is easy to fix.

Fixes: c3bbb6f2825d ('drm/i915: Do not use ggtt_view with (aliasing) PPGTT')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# ec7adb6e 16-Mar-2015 Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

drm/i915: Do not use ggtt_view with (aliasing) PPGTT

GGTT views are only applicable when dealing with GGTT. Change the code to
reject ggtt_view where it should not be used and require it when it should
be.

v2:
- Dropped _ppgtt_ infixes, allow both types to be passed
- Disregard other but normal views when no view is specified
- More checks that valid parameters are passed
- More readable error checking

v3:
- Prefer WARN_ONCE over BUG_ON when there is code path for failure

Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
[danvet: Drop unecessary forward decl from earlier patch iterations.]
[danvet: Remove unused variable spotted by Tvrtko.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 2934368e 04-Mar-2015 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915: Setup all page directories for gen8

If the requested size is less than what the full range
of pdps can address, we end up setting pdps for only the
requested area.

The logical context however needs all pdp entries to be valid.
Prior to commit 06fda602dbca ("drm/i915: Create page table allocators")
we have been writing pdp entries with dma address of zero instead
of valid pdps. This is supposedly bad even if those pdps are not
addressed.

As commit 06fda602dbca ("drm/i915: Create page table allocators")
introduced more dynamic structure for pdps, we ended up oopsing
when we populated the lrc context. Analyzing this oops revealed
the fact that we have not been writing valid pdps with bsw, as
it is doing the ppgtt init with 2GB limit in some cases.

We should do the right thing and setup the non addressable part
pdps/pde/pte to scratch page through the minimal structure by
having just pdp with pde entries pointing to same page with
pte entries pointing to scratch page.

But instead of going through that trouble, setup all the pdps
through individual pd pages and pt entries, even for non
addressable parts. And let the clear range point them to scratch
page. This way we populate the lrc with valid pdps and wait
for dynamic page allocation work to land, and do the heavy lifting
for truncating page table tree according to usage.

The regression of oopsing in init was introduced by
commit 06fda602dbca ("drm/i915: Create page table allocators")

v2: Clear the range for the unused part also (Ville)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89350
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Tested-by: Valtteri Rantala <valtteri.rantala@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 440fd528 23-Jan-2015 Thierry Reding <treding@nvidia.com>

drm/mm: Support 4 GiB and larger ranges

The current implementation is limited by the number of addresses that
fit into an unsigned long. This causes problems on 32-bit Tegra where
unsigned long is 32-bit but drm_mm is used to manage an IOVA space of
4 GiB. Given the 32-bit limitation, the range is limited to 4 GiB - 1
(or 4 GiB - 4 KiB for page granularity).

This commit changes the start and size of the range to be an unsigned
64-bit integer, thus allowing much larger ranges to be supported.

[airlied: fix i915 warnings and coloring callback]

Signed-off-by: Thierry Reding <treding@nvidia.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Dave Airlie <airlied@redhat.com>

fixupo


# 686135da 26-Feb-2015 Dan Carpenter <dan.carpenter@oracle.com>

drm/i915: fix a printk format

This printk leads to the following Smatch warning:

drivers/gpu/drm/i915/i915_gem_gtt.c:336 alloc_pt_range()
error: '%pa' expects argument of type 'phys_addr_t*',
argument 5 has type 'struct i915_page_table_entry*'

It looks like a simple typo to me where "%p" was intended instead of
"%pa".

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 719cd21c 26-Feb-2015 Michel Thierry <michel.thierry@intel.com>

drm/i915: Add missing description to parameter in alloc_pt_range

The patch "drm/i915: Plumb drm_device through page tables operations"
added an extra parameter, but it didn't update the function description.
Also remove unnecessary blank line added by the same patch.

Found by kbuild test robot.

Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 06dc68d6 24-Feb-2015 Michel Thierry <michel.thierry@intel.com>

drm/i915: Plumb drm_device through page tables operations

The next patch in the series will require it for alloc_pt_single.

v2: Rebased after s/page_tables/page_table/.

Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 06fda602 24-Feb-2015 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915: Create page table allocators

As we move toward dynamic page table allocation, it becomes much easier
to manage our data structures if break do things less coarsely by
breaking up all of our actions into individual tasks. This makes the
code easier to write, read, and verify.

Aside from the dissection of the allocation functions, the patch
statically allocates the page table structures without a page directory.
This remains the same for all platforms,

The patch itself should not have much functional difference. The primary
noticeable difference is the fact that page tables are no longer
allocated, but rather statically declared as part of the page directory.
This has non-zero overhead, but things gain additional complexity as a
result.

This patch exists for a few reasons:
1. Splitting out the functions allows easily combining GEN6 and GEN8
code. Page tables have no difference based on GEN8. As we'll see in a
future patch when we add the DMA mappings to the allocations, it
requires only one small change to make work, and error handling should
just fall into place.

2. Unless we always want to allocate all page tables under a given PDE,
we'll have to eventually break this up into an array of pointers (or
pointer to pointer).

3. Having the discrete functions is easier to review, and understand.
All allocations and frees now take place in just a couple of locations.
Reviewing, and catching leaks should be easy.

4. Less important: the GFP flags are confined to one location, which
makes playing around with such things trivial.

v2: Updated commit message to explain why this patch exists

v3: For lrc, s/pdp.page_directory[i].daddr/pdp.page_directory[i]->daddr/

v4: Renamed free_pt/pd_single functions to unmap_and_free_pt/pd (Daniel)

v5: Added additional safety checks in gen8 clear/free/unmap.

v6: Use WARN_ON and return -EINVAL in alloc_pt_range (Mika).

v7: Make err_out loop symmetrical to the way we allocate in
alloc_pt_range. Also s/page_tables/page_table and correct commit
message (Mika)

Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v3+)
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 7324cc04 24-Feb-2015 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915: Complete page table structures

Move the remaining members over to the new page table structures.

This can be squashed with the previous commit if desire. The reasoning
is the same as that patch. I simply felt it is easier to review if split.

v2: In lrc: s/ppgtt->pd_dma_addr[i]/ppgtt->pdp.page_directory[i].daddr/
v3: Rebase.
v4: Rebased after s/page_tables/page_table/.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# d7b3de91 24-Feb-2015 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915: page table abstractions

When we move to dynamic page allocation, keeping page_directory and pagetabs as
separate structures will help to break actions into simpler tasks.

To help transition the code nicely there is some wasted space in gen6/7.
This will be ameliorated shortly.

Following the x86 pagetable terminology:
PDPE = struct i915_page_directory_pointer_entry.
PDE = struct i915_page_directory_entry [page_directory].
PTE = struct i915_page_table_entry [page_tables].

v2: fixed mismatches after clean-up/rebase.

v3: Clarify the names of the multiple levels of page tables (Daniel)

v4: Addressing Mika's review comments.
s/gen8_free_page_directories/gen8_free_page_directory and free the
page tables for the directory there.
In gen8_ppgtt_allocate_page_directories, do not leak previously allocated
pt in case the page_directory alloc fails.
Update error return handling in gen8_ppgtt_alloc.

v5: Do not leak pt on error in gen6_ppgtt_allocate_page_tables. (Mika)

v6: s/page_tables/page_table/. (Mika)

Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 71ba2d64 10-Feb-2015 Yu Zhang <yu.c.zhang@linux.intel.com>

drm/i915: Support alias ppgtt in VM if ppgtt is enabled

The current Intel GVT-g only supports alias ppgtt. And the
emulation is done in the host by first trapping PP_DIR_BASE
mmio accesses. Updating PP_DIR_BASE by using instructions such
as MI_LOAD_REGISTER_IMM are hard to detect and are not supported
in current code. Therefore this patch also adds a new callback
routine - vgpu_mm_switch() to set the PP_DIR_BASE by mmio writes.

v2:
take Chris' comments:
- move the code into sanitize_enable_ppgtt()
v4:
take Tvrtko's comments:
- fix the parenthesis alignment warning

Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
Signed-off-by: Jike Song <jike.song@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 5dda8fa3 10-Feb-2015 Yu Zhang <yu.c.zhang@linux.intel.com>

drm/i915: Adds graphic address space ballooning logic

With Intel GVT-g, the global graphic memory space is partitioned by
multiple vGPU instances in different VMs. The ballooning code is called
in i915_gem_setup_global_gtt(), utilizing the drm mm allocator APIs to
mark the graphic address space which are partitioned out to other vGPUs
as reserved. With ballooning, host side does not need to translate a
grahpic address from guest view to host view. By now, current implementation
only support the static ballooning, but in the future, with more cooperation
from guest driver, the same interfaces can be extended to grow/shrink the
guest graphic memory dynamically.

v2:
take Chris and Daniel's comments:
- no guard page between different VMs
- use drm_mm_reserve_node() to do the reservation for ballooning,
instead of the previous drm_mm_insert_node_in_range_generic()

v3:
take Daniel's comments:
- move ballooning functions into i915_vgpu.c
- add kerneldoc to ballooning functions

v4:
take Tvrtko's comments:
- more accurate comments and commit message

Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Eddie Dong <eddie.dong@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# c8c26622 22-Jan-2015 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915: Setup less PPGTT on failed page_directory

The current code will both potentially print a WARN, and setup part of
the PPGTT structure. Neither of these harm the current code, it is
simply for clarity, and to perhaps prevent later bugs, or weird
debug messages.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 76643600 22-Jan-2015 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915: Rename to GEN8_LEGACY_PDPES

In gen8, 32b PPGTT has always had one "pdp" (it doesn't actually have
one, but it resembles having one). The #define was confusing as is, and
using "PDPE" is a much better description.

sed -i 's/GEN8_LEGACY_PDPS/GEN8_LEGACY_PDPES/' drivers/gpu/drm/i915/*.[ch]

It also matches the x86 pagetable terminology:
PTE = Page Table Entry - pagetable level 1 page
PDE = Page Directory Entry - pagetable level 2 page
PDPE = Page Directory Pointer Entry - pagetable level 3 page

And in the near future (for 48b addressing):
PML4E = Page Map Level 4 Entry

v2: Expanded information about Page Directory/Table nomenclature.

Cc: Daniel Vetter <daniel@ffwll.ch>
CC: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2)
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# b1252bcf 03-Dec-2014 Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/i915: Organize bind_vma funcs

Let's be optimistic that for future platforms this will remain the same
and reorg a bit.
This reorg in if blocks instead of switch make life easier for future
platform support addition.

Cc: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 1eb0f006 03-Dec-2014 Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/i915: Organize PPGTT init

Let's be optimistic that for future platforms memory management doesn't change
that much and reuse gen8 function for PPGTT init.

Cc: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 2f82bbdf 15-Dec-2014 Michel Thierry <michel.thierry@intel.com>

drm/i915: Use true PPGTT in Gen8+ when execlists are enabled

In Gen8+, full ppgtt needs execlist, otherwise the ctx switch can hang.

Also remove the current restriction, a user should be able to explicitly set
ppgtt=2.

Note, this patch considers that execlist support has been enabled by
default on Gen8.

v2: Remove non-default restriction and clarify commit message (Daniel)

Cc: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
[danvet: s/comment/commit message/ in the commit message since that's
what Michel meant as per our irc discussion.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 45f8f69a 10-Dec-2014 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Documentation for multiple GGTT views

A short section describing background, implementation and intended usage.

v2:
* Align section name between template and DOC comment. (Michel Thierry)

For: VIZ-4544
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# fe14d5f4 10-Dec-2014 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Infrastructure for supporting different GGTT views per object

Things like reliable GGTT mappings and mirrored 2d-on-3d display will need
to map objects into the same address space multiple times.

Added a GGTT view concept and linked it with the VMA to distinguish between
multiple instances per address space.

New objects and GEM functions which do not take this new view as a parameter
assume the default of zero (I915_GGTT_VIEW_NORMAL) which preserves the
previous behaviour.

This now means that objects can have multiple VMA entries so the code which
assumed there will only be one also had to be modified.

Alternative GGTT views are supposed to borrow DMA addresses from obj->pages
which is DMA mapped on first VMA instantiation and unmapped on the last one
going away.

v2:
* Removed per view special casing in i915_gem_ggtt_prepare /
finish_object in favour of creating and destroying DMA mappings
on first VMA instantiation and last VMA destruction. (Daniel Vetter)
* Simplified i915_vma_unbind which does not need to count the GGTT views.
(Daniel Vetter)
* Also moved obj->map_and_fenceable reset under the same check.
* Checkpatch cleanups.

v3:
* Only retire objects once the last VMA is unbound.

v4:
* Keep scatter-gather table for alternative views persistent for the
lifetime of the VMA.
* Propagate binding errors to callers and handle appropriately.

v5:
* Explicitly look for normal GGTT view in i915_gem_obj_bound to align
usage in i915_gem_object_ggtt_unpin. (Michel Thierry)
* Change to single if statement in i915_gem_obj_to_ggtt. (Michel Thierry)
* Removed stray semi-colon in i915_gem_object_set_cache_level.

For: VIZ-4544
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
[danvet: Drop hunk from i915_gem_shrink since it's just prettification
but upsets a __must_check warning.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 5f77eeb0 08-Dec-2014 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Use BUILD_BUG if possible in the i915 WARN_ON

Faster feedback to errors is always better. This is inspired by the
addition to WARN_ONs to mask/enable helpers for registers to make sure
callers have the arguments ordered correctly: Pretty much always the
arguments are static.

We use WARN_ON(1) a lot in default switch statements though where we
should always handle all cases. So add a new macro specifically for
that.

The idea to use __builtin_constant_p is from Chris Wilson.

v2: Use the ({}) gcc-ism to avoid the static inline, suggested by
Dave. My first attempt used __cond as the temp var, which is the same
used by BUILD_BUG_ON, but with inverted sense. Hilarity ensued, so
sprinkle i915 into the name.

Also use a temporary variable to only evaluate the condition once,
suggested by Damien.

v3: It's crazy but apparently 32bit gcc can't compile out the
BUILD_BUG_ON in a lot of cases and just falls over. I have no idea
why, but until clue grows just disable this nifty idea on 32bit
builds. Reported by 0-day builder.

v4: Got it all wrong, apparently its the gcc version. We need 4.9+.
Now reported by Imre.

v5: Chris suggested to add the case to MISSING_CASE for speedier
debug.

v6: Even some gcc 4.9 versions don't see through the maze, so give up
for now. Keep the skeleton and MISSING_CASE stuff though.

Cc: Imre Deak <imre.deak@intel.com>
Cc: Damien Lespiau <damien.lespiau@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>


# f7635669 03-Dec-2014 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Stop putting GGTT VMA at the head of the list

Multiple GGTT VMAs per object will be introduced in the near future which will
make it impossible to guarantee normal GGTT view is at the head of the list.

Purpose of this patch is to break this assumption straight away so any
potential hidden assumptions in the code base can be bisected to this
simple patch.

For: VIZ-4544
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Suggested-by: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# f548c0e9 19-Nov-2014 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Can i915_gem_init_ioctl

Found one more!

With this we can clear up the ggtt init code a bit, yay!

Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>


# cf3d262e 14-Nov-2014 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Fix comments about CHV snoop behaviour

Replace the misinformed notes about CHV snoop behaviour with something
that's hopefully closer to reality.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 70ee45e1 14-Nov-2014 Damien Lespiau <damien.lespiau@intel.com>

drm/i915/skl: Don't allow disabling ppgtt and execlists on gen9+

Running the driver without execlists and hence PPGTT (either aliasing or
full) isn't a supported configuration on gen9+.

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 3581f309 12-Nov-2014 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Delete outdated comment in byt_pte_encode

This has been invalidated in

commit 24f3a8cf7766e52a087904b4346794c7b410f957
Author: Akash Goel <akash.goel@intel.com>
Date: Tue Jun 17 10:59:42 2014 +0530

drm/i915: Added write-enable pte bit supportt

But despite that it's in the diff context no one noticed :(

Cc: Akash Goel <akash.goel@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>


# 198c974d 10-Nov-2014 Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

drm/i915: Add tracepoints to track a vm during its lifetime

- ppgtt init/release: these tracepoints are useful for observing the
creation and destruction of Full PPGTTs.

- ctx create/free: we can use the ctx_free trace in combination with the
ppgtt_release one to be sure that the ppgtt doesn't stay alive for too
long after the ctx is destroyed. ctx_create is there for simmetry

- switch_mm: important point in the lifetime of the vm

v4: add DOC information
v5: pull the DOC in drm.tmpl
v6: clean ppgtt init/release traces + add ctx create/free and switch_mm
tracepoints (Chris)
v7: drop execlist_submit_context tracepoint

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 59a5d290 30-Oct-2014 Paulo Zanoni <paulo.r.zanoni@intel.com>

drm/i915: fix "Unexpected fault" error message line break

Fix the message, not the fault :)

This is what I see:
[ 282.108597] [drm:i915_check_and_clear_faults] Unexpected fault
[ 282.108597] Addr: 0x00000000\n Address space: PPGTT
[ 282.108597] Source ID: 24
[ 282.108597] Type: 0

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# d6a8b72e 05-Nov-2014 Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/i915: Disable caches for Global GTT.

Global GTT doesn't have pat_sel[2:0] so it always point to pat_sel = 000;
So the only way to avoid screen corruptions is setting PAT 0 to Uncached.

MOCS can still be used though. But if userspace is trusting PTE for
cache selection the safest thing to do is to let caches disabled.

BSpec: "For GGTT, there is NO pat_sel[2:0] from the entry,
so RTL will always use the value corresponding to pat_sel = 000"

- System agent ggtt writes (i.e. cpu gtt mmaps) already work before
this patch, i.e. the same uncached + snooping access like on gen6/7
seems to be in effect.
- So this just fixes blitter/render access. Again it looks like it's
not just uncached access, but uncached + snooping. So we can still
hold onto all our assumptions wrt cpu clflushing on LLC machines.

v2: Cleaner patch as suggested by Chris.
v3: Add Daniel's comment

Reference: https://bugs.freedesktop.org/show_bug.cgi?id=85576
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: James Ausmus <james.ausmus@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Stable@vger.kernel.org
Tested-by: James Ausmus <james.ausmus@intel.com>
Reviewed-by: James Ausmus <james.ausmus@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# aff43766 23-Oct-2014 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Move flags describing VMA mappings into the VMA

If these flags are on the object level it will be more difficult to allow
for multiple VMAs per object.

v2: Simplification and cleanup after code review comments (Chris Wilson).

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# cacc6c83 22-Oct-2014 Daniel Vetter <daniel.vetter@ffwll.ch>

Revert "drm/i915: Enable full PPGTT on gen7"

This reverts commit 8c50f10d73b50139dcfe48bc22f2c8c7822c1983.

It's not yet solid and Dave objected to pulling the tree in its
current state.

Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Dave Airlie <airlied@gmail.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
References: http://mid.mail-archive.com/CAPM=9ty2r1MLE=wzC-_vNSUzXVqAyXiGgocpSV9qOp0gzpK3xA@mail.gmail.com
References: http://lists.freedesktop.org/archives/intel-gfx/2014-October/053926.html
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>


# 91e56499 25-Sep-2014 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Flush the PTEs after updating them before suspend

As we use WC updates of the PTE, we are responsible for notifying the
hardware when to flush its TLBs. Do so after we zap all the PTEs before
suspend (and the BIOS tries to read our GTT).

Fixes a regression from

commit 828c79087cec61eaf4c76bb32c222fbe35ac3930
Author: Ben Widawsky <benjamin.widawsky@intel.com>
Date: Wed Oct 16 09:21:30 2013 -0700

drm/i915: Disable GGTT PTEs on GEN6+ suspend

that survived and continue to cause harm even after

commit e568af1c626031925465a5caaab7cca1303d55c7
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Wed Mar 26 20:08:20 2014 +0100

drm/i915: Undo gtt scratch pte unmapping again

v2: Trivial rebase.
v3: Fixes requires pointer dances.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82340
Tested-by: ming.yao@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Todd Previte <tprevite@gmail.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# 3fdcf80f 23-Jan-2014 Damien Lespiau <damien.lespiau@intel.com>

drm/i915/skl: Initialize PPGTT like gen8

gen9 uses very similar memory management to what gen8 has. Just follow
the flow.

v2: Fix trivial conflict (Damien)

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# fb8aad4b 16-Jan-2014 Damien Lespiau <damien.lespiau@intel.com>

drm/i915/skl: gen9 uses the same bind_vma() vfuncs as gen6+

Temporary plug a BUG() while waiting for a better solution. See:

http://lists.freedesktop.org/archives/intel-gfx/2014-January/038132.html

However Chris was looking at cleaning-up this as well, so went for the
easy intermediate solution instead.

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 66375014 09-Jan-2014 Damien Lespiau <damien.lespiau@intel.com>

drm/i915/skl: Add the additional graphics stolen sizes

Skylake introduces new stolen memory sizes starting at 0xf0 (4MB) and
growing by 4MB increments from there.

v2: Rebase on top of the early-quirk changes from Ville.

v3: Rebase on top of the PCI_IDS/IDS macro rename

Reviewed-by: Thomas Wood <thomas.wood@intel.com>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 1893a71b 19-Sep-2014 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Inline feature detection into sanitize_enable_ppgtt

Rather than splitting and hiding away critical parts of
sanitize_enable_ppgtt() into single use macros in the headers, inline
them into the function for clarity.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# c04d0161 12-Sep-2014 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Drop get/put_pages for scratch page

While discussing/reviewing __GFP_MOVEABLE behaviour and interactions
with our various page allocations on irc Chris brought up that the
scratch page isn't allocated as moveable, but we still grab/put a
reference to lock it in place. Which is unecessary.

So drop that.

Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>


# 8c50f10d 05-Sep-2014 Michel Thierry <michel.thierry@intel.com>

drm/i915: Enable full PPGTT on gen7

Use full PPGTT as the default option in gen7.

Note that aliasing PPGTT is the default option for gen8 (see
HAS_PPGTT) since we're still fighting troubles around context
switching and execlists.

This may well come back to bite me later.

Signed-off-by: Michel Thierry <michel.thierry@intel.com>
[danvet: Explain that gen8 full ppgtt is blocked on execlists for
now.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 671b5013 20-Aug-2014 Thomas Daniel <thomas.daniel@intel.com>

drm/i915/bdw: Do not initialize PPGTT in the legacy way for execlists

A pending commit removes synchronous mode from switch_mm. This breaks
execlists because switch_mm will always try to write to the legacy ring
buffer.

Return immediately from i915_ppgtt_init_gw in execlists mode.
No longer check for execlists mode in gen8_ppgtt_enable() because this
will no longer be called in execlists mode.

Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# e07f0552 19-Aug-2014 Michel Thierry <michel.thierry@intel.com>

drm/i915: Handle i915_ppgtt_put correctly

Unfortunately, the gem_obj/vma relationship is not symmetrical; a gem_obj
can look up for the same vma more than once (where the ppgtt refcount is
incremented), but will free the vma only once (i915_gem_free_object).

This difference in refcount get/put means that the ppgtt is not removed
after the context and vma are destroyed, because sometimes the refcount
will never go back to zero.

v2: Just move the ppgtt refcount into vma_create.

OTC-Jira: VIZ-3719
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 6689c167 15-Aug-2014 McAulay, Alistair <alistair.mcaulay@intel.com>

drm/i915: Rework GPU reset sequence to match driver load & thaw

This patch is to address Daniels concerns over different code during reset:

http://lists.freedesktop.org/archives/intel-gfx/2014-June/047758.html

"The reason for aiming as hard as possible to use the exact same code for
driver load, gpu reset and runtime pm/system resume is that we've simply
seen too many bugs due to slight variations and unintended omissions."

Tested using igt drv_hangman.

V2: Cleaner way of preventing check_wedge returning -EAGAIN
V3: Clean the last_context during reset, to ensure do_switch() does the MI_SET_CONTEXT. As per review.
Signed-off-by: McAulay, Alistair <alistair.mcaulay@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
[danvet: Rebase over ctx->ppgtt rework and extend the comment in
check_wedge a bit.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# b7c71823 14-Aug-2014 Oscar Mateo <oscar.mateo@intel.com>

drm/i915/bdw: Don't write PDP in the legacy way when using LRCs

This is mostly for correctness so that we know we are running the LR
context correctly (this is, the PDPs are contained inside the context
object).

v2: Move the check to inside the enable PPGTT function. The switch
happens in two places: the legacy context switch (that we won't hit
when Execlists are enabled) and the PPGTT enable, which unfortunately
we need. This would look much nicer if the ppgtt->enable was part of
the ring init, where it logically belongs.

v3: Move the check to the start of the enable PPGTT function. None
of the legacy PPGTT enabling is required when using LRCs as the
PPGTT is enabled in the context descriptor and the PDPs are written
in the LRC.

v4: Clarify comment based on review feedback.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
[danvet: Resolve conflicts with ppgtt_enable rework.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 70e32544 06-Aug-2014 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Cleanup aliasging ppgtt alongside the global gtt

Also remove related WARN_ONs which seem to have been hit since a rather
long time. But apperently no one noticed since our module reload is
already WARNING-infested :(

Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 90d0a0e8 06-Aug-2014 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Extract commmon global gtt cleanup code

We want to move the aliasing ppgtt cleanup back into the global
gtt cleanup code for symmetry, but first we need to create such
a place.

Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 19dd120c 06-Aug-2014 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Extract common cleanup into i915_ppgtt_release

Address space cleanup isn't really a job for the low-level cleanup
callbacks. Without this change we can't reuse the low-level cleanup
callback for the aliasing ppgtt cleanup.

Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# fa76da34 06-Aug-2014 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Initialize the aliasing ppgtt as part of global gtt

Stuffing this into the context setup code doesn't make a lot of sense.
Also reusing the real ppgtt setup code makes even less sense since the
aliasing ppgtt isn't a real address space. Leaving all that stuff
unitialized will make sure that we catch any abusers promptly.

This is also a prep work to clean up the context->ppgtt link.

v2: Fix up the logic fail, I've fumbled it so badly to completely
disable ppgtt on gen6. Spotted by Ville and Michel. Also move around
the pde write into the gen6 init function, since otherwise it won't
work at all.

v3: Only initialize the aliasing ppgtt when we actually enable it.

Cc: "Thierry, Michel" <michel.thierry@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
[danvet: Squash in fixup from Fengguang Wu.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 82460d97 06-Aug-2014 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Rework ppgtt init to no require an aliasing ppgtt

Currently we abuse the aliasing ppgtt to set up the ppgtt support in
general. Which is a bit backwards since with full ppgtt we don't ever
need the aliasing ppgtt.

So untangle this and separate the ppgtt init from the aliasing
ppgtt. While at it drag it out of the context enabling (which just
does a switch to the default context).

Note that we still have the differentiation between synchronous and
asynchronous ppgtt setup, but that will soon vanish. So also correctly
wire up the return value handling to be prepared for when ->switch_mm
drops the synchronous parameter and could start to fail.

Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 6c5566a8 06-Aug-2014 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Allow i915_gem_setup_global_gtt to fail

We already needs this just as a safety check in case the preallocation
reservation dance fails. But we definitely need this to be able to
move tha aliasing ppgtt setup back out of the context code to this
place, where it belongs.

Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 841cd773 06-Aug-2014 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Only refcount ppgtt if it actually is one

This essentially unbreaks non-ppgtt operation where we'd scribble over
random memory.

While at it give the vm_to_ppgtt function a proper prefix and make it
a bit more paranoid.

Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 4d884705 06-Aug-2014 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Track file_priv, not ctx in the ppgtt structure

Hardware contexts reference a ppgtt, not the other way round. And the
only user of this (in debugfs) actually only cares about which file
the ppgtt is associated with. So give it what it wants.

While at it give the ppgtt create function a proper name&place.

Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# ee960be7 06-Aug-2014 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Some cleanups for the ppgtt lifetime handling

So when reviewing Michel's patch I've noticed a few things and cleaned
them up:
- The early checks in ppgtt_release are now redundant: The inactive
list should always be empty now, so we can ditch these checks. Even
for the aliasing ppgtt (though that's a different confusion) since
we tear that down after all the objects are gone.
- The ppgtt handling functions are splattered all over. Consolidate
them in i915_gem_gtt.c, give them OCD prefixes and add wrappers for
get/put.
- There was a bit a confusion in ppgtt_release about whether it cares
about the active or inactive list. It should care about them both,
so augment the WARNINGs to check for both.

There's still create_vm_for_ctx left to do, put that is blocked on the
removal of ppgtt->ctx. Once that's done we can rename it to
i915_ppgtt_create and move it to its siblings for handling ppgtts.

v2: Move the ppgtt checks into the inline get/put functions as
suggested by Chris.

v3: Inline the now redundant ppgtt local variable.

Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# b9d06dd9 06-Aug-2014 Michel Thierry <michel.thierry@intel.com>

drm/i915: vma/ppgtt lifetime rules

VMAs should take a reference of the address space they use.

Now, when the fd is closed, it will release the ref that the context was
holding, but it will still be referenced by any vmas that are still
active.

ppgtt_release() should then only be called when the last thing referencing
it releases the ref, and it can just call the base cleanup and free the
ppgtt.

Note that with this we will extend the lifetime of ppgtts which
contain shared objects. But all the non-shared objects will get
removed as soon as they drop of the active list and for the shared
ones the shrinker can eventually reap them. Since we currently can't
evict ppgtt pagetables either I don't think that temporary leak is
important.

Signed-off-by: Michel Thierry <michel.thierry@intel.com>
[danvet: Add note about potential ppgtt leak with this approach.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 692ef70c 05-Aug-2014 Jesse Barnes <jbarnes@virtuousgeek.org>

drm/i915: clean up PPGTT checking logic

sanitize_enable_ppgtt is the function that checks all the conditions,
honoring a forced ppgtt status or doing auto-detect as necessary. Just
make sure it returns the right value in all cases and use that in the
macros instead of the confusing intel_enable_ppgtt() function.

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[danvet: Don't reenable full ppgtt through the backdoor.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 57007df7 28-Jul-2014 Pavel Machek <pavel@ucw.cz>

drm/i915: work around warning in i915_gem_gtt

Gcc warns that addr might be used uninitialized. It may not, but I see
why gcc gets confused.

Additionally, hiding code with side-effects inside WARN_ON() argument
seems uncool, so I moved it outside.

Signed-off-by: Pavel Machek <pavel@ucw.cz>
[danvet: Add obligatory /* shuts up gcc */ comment.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# ca2aed6c 27-Jun-2014 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Don't disable PPGTT for CHV based in PCI rev

In
commit 62942ed7279d3e06dc15ae3d47665eff3b373327
Author: Jesse Barnes <jbarnes@virtuousgeek.org>
Date: Fri Jun 13 09:28:33 2014 -0700

drm/i915/vlv: disable PPGTT on early revs v3

we forgot about CHV. IS_VALLEYVIEW() is true for CHV, so we need to
explicitly avoid disabling PPGTT on CHV.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 24f3a8cf 16-Jun-2014 Akash Goel <akash.goel@intel.com>

drm/i915: Added write-enable pte bit supportt

This adds support for a write-enable bit in the entry of GTT.
This is handled via a read-only flag in the GEM buffer object which
is then used to see how to set the bit when writing the GTT entries.
Currently by default the Batch buffer & Ring buffers are marked as read only.

v2: Moved the pte override code for read-only bit to 'byt_pte_encode'. (Chris)
Fixed the issue of leaving 'gt_old_ro' as unused. (Chris)

v3: Removed the 'gt_old_ro' field, now setting RO bit only for Ring Buffers(Daniel).

v4: Added a new 'flags' parameter to all the pte(gen6) encode & insert_entries functions,
in lieu of overloading the cache_level enum (Daniel).

v5: Removed the superfluous VLV check & changed the definition location of PTE_READ_ONLY flag (Imre)

Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 62942ed7 13-Jun-2014 Jesse Barnes <jbarnes@virtuousgeek.org>

drm/i915/vlv: disable PPGTT on early revs v3

Early revs didn't have PPGTT support, so disable there.

v2: add debug msg when disabling on early stepping
v3: enable on other B3 packages as well (untested) (Ville)

References: https://bugs.freedesktop.org/show_bug.cgi?id=79669
References: https://bugs.freedesktop.org/show_bug.cgi?id=79670
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 4c2e0990 05-Jun-2014 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Fixup global gtt cleanup

The global gtt is setup up in 2 parts, so we need to be careful
with the cleanup. For consistency shovel it all into the ->cleanup
callback, like with ppgtt.

Noticed because it blew up in the out_gtt: cleanup code while
fiddling with the vgacon code.

Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 562d55d9 27-May-2014 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915/bdw: Only use 2g GGTT for 32b platforms

Daniel requested in the bug that I use a 3GB fallback size. Since this
is not in the spec as a valid size, I decided against it. We could
potentially add a patch to bump it to 3GB on top of this one.

This probably should be CC: stable - but I'll let the powers that be
decide that one.

Regression from a revert of the revert:
commit 7907f45bf9f67a1c5e5d4ae05bab428d7c2f43b2
Author: Ben Widawsky <benjamin.widawsky@intel.com>
Date: Wed Feb 19 22:05:46 2014 -0800

Revert "drm/i915/bdw: Limit GTT to 2GB"

v2: Change ifdef to 32b, instead of ifndef
update comment

v3. Update comment to not wrap (Daniel).
Update commit message

v4: s/CONFIG_32/CONFIG_X86_32 (Jani).

v5: s/CONFIG_x86_32BIT/CONFIG_x86_32, as meant in v4
s/32B/32b (chris)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76619
Cc: stable@vger.kernel.org
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Tested-by: "Yang, Guang A" <guang.a.yang@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# d23db88c 23-May-2014 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Prevent negative relocation deltas from wrapping

This is pure evil. Userspace, I'm looking at you SNA, repacks batch
buffers on the fly after generation as they are being passed to the
kernel for execution. These batches also contain self-referenced
relocations as a single buffer encompasses the state commands, kernels,
vertices and sampler. During generation the buffers are placed at known
offsets within the full batch, and then the relocation deltas (as passed
to the kernel) are tweaked as the batch is repacked into a smaller buffer.
This means that userspace is passing negative relocations deltas, which
subsequently wrap to large values if the batch is at a low address. The
GPU hangs when it then tries to use the large value as a base for its
address offsets, rather than wrapping back to the real value (as one
would hope). As the GPU uses positive offsets from the base, we can
treat the relocation address as the minimum address read by the GPU.
For the upper bound, we trust that userspace will not read beyond the
end of the buffer.

So, how do we fix negative relocations from wrapping? We can either
check that every relocation looks valid when we write it, and then
position each object such that we prevent the offset wraparound, or we
just special-case the self-referential behaviour of SNA and force all
batches to be above 256k. Daniel prefers the latter approach.

This fixes a GPU hang when it tries to use an address (relocation +
offset) greater than the GTT size. The issue would occur quite easily
with full-ppgtt as each fd gets its own VM space, so low offsets would
often be handed out. However, with the rearrangement of the low GTT due
to capturing the BIOS framebuffer, it is already affecting kernels 3.15
onwards. I think only IVB+ is susceptible to this bug, but the workaround
should only kick in rarely, so it seems sensible to always apply it.

v3: Use a bias for batch buffers to prevent small negative delta relocations
from wrapping.

v4 from Daniel:
- s/BIAS/BATCH_OFFSET_BIAS/
- Extract eb_vma_misplaced/i915_vma_misplaced since the conditions
were growing rather cumbersome.
- Add a comment to eb_get_batch explaining why we do this.
- Apply the batch offset bias everywhere but mention that we've only
observed it on gen7 gpus.
- Drop PIN_OFFSET_FIX for now, that slipped in from a feature patch.

v5: Add static to eb_get_batch, spotted by 0-day tester.

Testcase: igt/gem_bad_reloc
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78533
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v3)
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# a4872ba6 22-May-2014 Oscar Mateo <oscar.mateo@intel.com>

drm/i915: s/intel_ring_buffer/intel_engine_cs

In the upcoming patches we plan to break the correlation between
engine command streamers (a.k.a. rings) and ringbuffers, so it
makes sense to refactor the code and make the change obvious.

No functional changes.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# d7f25f23 08-May-2014 Damien Lespiau <damien.lespiau@intel.com>

drm/i915/chv: Implement stolen memory size detection

CHV uses the same bits as SNB/VLV to code the Graphics Mode Select field
(GFX stolen memory size) with the addition of finer granularity modes:
4MB increments from 0x11 (8MB) to 0x1d.

Values strictly above 0x1d are either reserved or not supported.

v2: 4MB increments, not 8MB. 32MB has been omitted from the list of new
values (Ville Syrjälä)

v3: Also correctly interpret GGMS (GTT Graphics Memory Size) (Ville
Syrjälä)

v4: Don't assign a value that needs 20bits or more to a u16 (Rafael
Barbalho)

[vsyrjala: v5: Split the early quirks to another patch]

Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Rafael Barbalho <rafael.barbalho@intel.com>
Tested-by: Rafael Barbalho <rafael.barbalho@intel.com>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 3e8b5ae9 06-May-2014 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915: Use topdown allocation for PPGTT PDEs on gen6/7

It was always the intention to do the topdown allocation for context
objects (Chris' idea originally). Unfortunately, I never managed to land
the patch, but someone else did, so now we can use it.

As a reminder, hardware contexts never need to be in the precious GTT
aperture space - which is what is what happens with the normal bottom up
allocation we do today. Doing a top down allocation increases the odds
that the HW contexts can get out of the way, especially with per FD
contexts as is done in full PPGTT

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# fd1ab8f4 09-Apr-2014 Rafael Barbalho <rafael.barbalho@intel.com>

drm/i915/chv: Flush caches when programming page tables

Page table updates were getting stuck in the CPU cache on chv causing
spurious page faults and strange behaviour.

Signed-off-by: Rafael Barbalho <rafael.barbalho@intel.com>
[vsyrjala: Add !HAS_LLC checks]
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# ee0ce478 09-Apr-2014 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915/chv: PPAT setup for Cherryview

Ignore the cache bits in PPAT and just set the snoop bit where
appropriate. BDW WB is mapped to snooped access, while all other
modes are mapped to non-snooped access.

The hardware supposedly ignores everything except the snoop bit
in the PPAT entries.

Additionally the hardware actually enforces snooping for all
page table accesses, and thus the snoop bit is ignored for PDEs.

v2: Rebased on top of the bdw resume fix to reload the ppat entries.

v3: Rebase on top of the i915_gem_gtt.h header extraction.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> (v1)
Acked-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Rafael Barbalho <rafael.barbalho@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 63c42e56 18-Apr-2014 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915/bdw: Add WT caching ability

I don't have any insight on what parts can do what. The docs do seem to
suggest WT caching works in at least the same manner as it does on
Haswell.

The addr = 0 is to shut up GCC:
drivers/gpu/drm/i915/i915_gem_gtt.c:80:7: warning: 'addr' may be used
uninitialized in this function [-Wmaybe-uninitialized]

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# cfa7c862 29-Apr-2014 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Sanitize the enable_ppgtt module option once

Otherwise we'll end up spamming dmesg on every context creation on snb
with vt-d enabled. This regression was introduced in

commit 246cbfb5fb9a1ca0997fbb135464c1ff5bb9c549
Author: Ben Widawsky <benjamin.widawsky@intel.com>
Date: Fri Dec 6 14:11:14 2013 -0800

drm/i915: Reorganize intel_enable_ppgtt

As the i915.enable_ppgtt is read-only it cannot be changed after the
module is loaded and so we can perform an early sanitization of the
values.

v2:
- Add comment and pimp commit message (Chris)
- Use the param consistently (Jani)

v3:
- Fix init sequence on pre-gen6 by moving the sanitize_ppgtt call to
gtt_init. Fixes boot hangs on pre-gen6.
- Add a debug output for the sanitize ppgtt mode.

References: https://lkml.org/lkml/2014/4/17/599
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77916
Cc: Alessandro Suardi <alessandro.suardi@gmail.com>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# 0f9dc59d 24-Mar-2014 Ben Widawsky <benjamin.widawsky@linux.intel.com>

drm/i915: Allow full PPGTT with param override

When PPGTT was disabled by default, the patch also prevented the user
from overriding this behavior via module parameter. Being able to test
this on arbitrary kernels is extremely beneficial to track down the
remaining bugs. The patch that prevented this was:

commit 93a25a9e2d67765c3092bfaac9b855d95e39df97
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Thu Mar 6 09:40:43 2014 +0100

drm/i915: Disable full ppgtt by default

By default PPGTT is set to -1. 0 means off, 1 means aliasing only, 2
means full, all other values are reserved.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# 62347f9e 02-Apr-2014 Lauri Kasanen <cand@gmx.com>

drm: Add support for two-ended allocation, v3

Clients like i915 need to segregate cache domains within the GTT which
can lead to small amounts of fragmentation. By allocating the uncached
buffers from the bottom and the cacheable buffers from the top, we can
reduce the amount of wasted space and also optimize allocation of the
mappable portion of the GTT to only those buffers that require CPU
access through the GTT.

For other drivers, allocating small bos from one end and large ones
from the other helps improve the quality of fragmentation.

Based on drm_mm work by Chris Wilson.

v3: Changed to use a TTM placement flag
v2: Updated kerneldoc

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Christian König <deathsimple@vodafone.de>
Signed-off-by: Lauri Kasanen <cand@gmx.com>
Signed-off-by: David Airlie <airlied@redhat.com>


# 5db6c735 31-Mar-2014 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: dmesg output for VT-d testing

Our validation guys want to have a positive proof that the gfx driver
is indeed using VT-d, since setting up a gfx stack, especially in
early bring-up and by people not versed in linux gfx is a bit tricky.
So provide just that.

Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 8d214b7d 24-Mar-2014 Ben Widawsky <benjamin.widawsky@linux.intel.com>

drm/i915: Allow full PPGTT with param override

When PPGTT was disabled by default, the patch also prevented the user
from overriding this behavior via module parameter. Being able to test
this on arbitrary kernels is extremely beneficial to track down the
remaining bugs. The patch that prevented this was:

commit 93a25a9e2d67765c3092bfaac9b855d95e39df97
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Thu Mar 6 09:40:43 2014 +0100

drm/i915: Disable full ppgtt by default

By default PPGTT is set to -1. 0 means off, 1 means aliasing only, 2
means full, all other values are reserved.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 0260c420 22-Mar-2014 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915: Split out GTT specific header file

This file contains all necessary defines, prototypes and typesdefs for
manipulating GEN graphics address translation (this does not include the
legacy AGP driver)

Reiterating the comment in the header,
"Please try to maintain the following order within this file unless it
makes sense to do otherwise. From top to bottom:
1. typedefs
2. #defines, and macros
3. structure definitions
4. function prototypes

Within each section, please try to order by generation in ascending
order, from top to bottom (ie. GEN6 on the top, GEN8 on the bottom)."

I've made some minor cleanups, and fixed a couple of typos while here -
but there should be no functional changes.

The purpose of the patch is to reduce clutter in our main header file,
making room for new growth, and make documentation of our interfaces
easier by splitting things out.

With a little more work, like making i915_gtt a pointer, we could
potentially completely isolate this header from i915_drv.h. At the
moment however, I don't think it's worth the effort.

Personally, I would have liked to put the PTE encoding functions in this
file too, but I didn't want to rock the boat too much.

A similar patch has been in use on my machine for some time. This exact
patch though has only been compile tested.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 50227e1c 31-Mar-2014 Jani Nikula <jani.nikula@intel.com>

drm/i915: prefer struct drm_i915_private to drm_i915_private_t

Remove the rest of the references to drm_i915_private_t. No functional
changes.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
[danvet: Drop hunk in i915_cmd_parser.c]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# e568af1c 26-Mar-2014 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Undo gtt scratch pte unmapping again

It apparently blows up on some machines. This functionally reverts

commit 828c79087cec61eaf4c76bb32c222fbe35ac3930
Author: Ben Widawsky <benjamin.widawsky@intel.com>
Date: Wed Oct 16 09:21:30 2013 -0700

drm/i915: Disable GGTT PTEs on GEN6+ suspend

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=64841
Reported-and-Tested-by: Brad Jackson <bjackson0971@gmail.com>
Cc: stable@vger.kernel.org
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Todd Previte <tprevite@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 8ee661b5 26-Mar-2014 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Undo gtt scratch pte unmapping again

It apparently blows up on some machines. This functionally reverts

commit 828c79087cec61eaf4c76bb32c222fbe35ac3930
Author: Ben Widawsky <benjamin.widawsky@intel.com>
Date: Wed Oct 16 09:21:30 2013 -0700

drm/i915: Disable GGTT PTEs on GEN6+ suspend

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=64841
Reported-and-Tested-by: Brad Jackson <bjackson0971@gmail.com>
Cc: stable@vger.kernel.org
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Todd Previte <tprevite@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>


# a2319c08 18-Mar-2014 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915/bdw: Restore PPAT on thaw

Apparently it is wiped out from under us, and we get some really fun
caching artifacts upon resume (it seems to be WB for all types by
default).

Reported-by: James Ausmus <james.ausmus@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Tested-by: James Ausmus <james.ausmus@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76113
Tested-by: Timo Aaltonen <timo.aaltonen@canonical.com>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# c5139450 12-Mar-2014 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Drop WARN_ON(flags) from ppgtt_bind_vma()

We will call ppgtt_bind_vma() with flags != 0, so the WARN_ON(flags)
is bogus. Kill it.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 5a6c93fe 08-Mar-2014 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915: Correct PPGTT total size

Our code allows have a PPGTT that is smaller than the maximum size for
GEN6-GEN7. Though I don't think this actually ever occurs, the code may
as well work properly and more importantly look correct by using the
variable size instead of the HW max.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 8407bb91 08-Mar-2014 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915/bdw: Use scratch page table for GEN8 PPGTT

I'm not clear if the hardware is still subject to the same prefetching
issues that made us use a scratch page in the first place. In either
case, we're using garbage with the current code (we will end up using
offset 0).

This may be the cause of our current gem_cpu_reloc regression with
PPGTT. I cannot test it at the moment.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 93a25a9e 06-Mar-2014 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Disable full ppgtt by default

There are too many oustanding issues:

- Fence handling in the current code is broken. There's a patch series
from me, but it's blocked on and extended review (which includes
writing the testcases).

- IOMMU mapping handling is broken, we need to properly refcount it -
currently it gets destroyed when the first vma is unbound, so way
too early.

- There's a pending reset issue on snb. Since Mika's reset work and
full ppgtt have been pulled in in separate branches and ended up
intermittingly breaking each another it's unclear who's the exact
culprit here.

- We still have persistent evidince of crazy recursion bugs through
vma_unbind and ppgtt_relase, e.g.

https://bugs.freedesktop.org/show_bug.cgi?id=73383

This issue (and a few others meanwhile resolved) have blocked our
performance measuring/tuning group since 3 months.

- Secure batch dispatching is broken. This is blocking Brad Volkin's
command checker work since 3 months.

All these issues are confirmed to only happen when full ppgtt is
enabled, falling back to aliasing ppgtt resolves them. But even
aliasing ppgtt itself still has a regression:

- We currently unconditionally bind objects into the aliasing ppgtt,
which means all priviledged objects like ringbuffers are visible to
unpriviledged access again. On top of that this also breaks the
command checker for aliasing ppgtt, since it can't hide the
validated batch any more.

Furthermore topic/full-ppgtt has never been reviewed:

- Lifetime rules around vma unbinding/release are unclear, resulting
into this awesome hack called ppgtt_release. Which seems to take the
blame for most of the recursion fallout.

- Context/ring init works different on gpu reset than anywhere else.
Such differeneces have in the past always lead to really hard to
track down bugs.

- Aliasing ppgtt is treated in a bunch of places as a real address
space, but it isn't - the real address space is always the global
gtt in that case. This results in a bit a mess between contexts and
ppgtt object, further complication the context/ppgtt/vma lifetime
rules.

- We don't have any docs describing the overall concepts introduced
with full ppgtt. A short, concise overview describing vmas and some
of the strange bits around them (like the unbound vmas used by
execbuf, or the new binding rules) really is needed.

Note that a lot of the post topic/full-ppgtt merge fallout has already
been addressed, this entire list here of 10 issues really only contains
the still outstanding issues.

Finally the 3.15 merge window is approaching and I think we need to
use the remaining time to ensure that our fallback option of using
aliasing ppgtt is in solid shape. Hence I think it's time to throw the
switch. While at it demote the helper from static inline status
because really.

Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Dave Airlie <airlied@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 5abbcca3 21-Feb-2014 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915/bdw: Kill ppgtt->num_pt_pages

With the original PPGTT implementation if the number of PDPs was not a
power of two, the number of pages for the page tables would end up being
rounded up. The code actually had a bug here afaict, but this is a
theoretical bug as I don't believe this can actually occur with the
current code/HW..

With the rework of the page table allocations, there is no longer a
distinction between number of page table pages, and number of page
directory entries. To avoid confusion, kill the redundant (and newer)
struct member.

Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# b146520f 19-Feb-2014 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915: Split GEN6 PPGTT initialization up

Simply to match the GEN8 style of PPGTT initialization, split up the
allocations and mappings. Unlike GEN8, we skip a separate dma_addr_t
allocation function, as it is much simpler pre-gen8.

With this code it would be easy to make a more general PPGTT
initialization function with per GEN alloc/map/etc. or use a common
helper, similar to the ringbuffer code. I don't see a benefit to doing
this just yet, but who knows...

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# a00d825d 19-Feb-2014 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915: Split GEN6 PPGTT cleanup

This cleanup is similar to the GEN8 cleanup (though less necessary).
Having everything split will make cleaning the initialization path error
paths easier to understand.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# c4ac524c 19-Feb-2014 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915: Update i915_gem_gtt.c copyright

I keep meaning to do this... by now almost the entire file has been
written by an Intel employee (including Daniel post-2010).

Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 7907f45b 19-Feb-2014 Ben Widawsky <benjamin.widawsky@intel.com>

Revert "drm/i915/bdw: Limit GTT to 2GB"

This reverts commit 3a2ffb65eec6dbda2fd8151894f51c18b42c8d41.

Now that the code is fixed to use smaller allocations, it should be safe
to let the full GGTT be used on BDW.

The testcase for this is anything which uses more than half of the GTT,
thus eclipsing the old limit.

Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 7ad47cf2 20-Feb-2014 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915/bdw: Reorganize PT allocations

The previous allocation mechanism would get 2 contiguous allocations,
one for the page directories, and one for the page tables. As each page
table is 1 page, and there are 512 of these per page directory, this
goes to 2MB. An unfriendly request at best. Worse still, our HW now
supports 4 page directories, and a 2MB allocation is not allowed.

In order to fix this, this patch attempts to split up each page table
allocation into a single, discrete allocation. There is nothing really
fancy about the patch itself, it just has to manage an extra pointer
indirection, and have a fancier bit of logic to free up the pages.

To accommodate some of the added complexity, two new helpers are
introduced to allocate, and free the page table pages.

NOTE: I really wanted to split the way we do allocations, and the way in
which we identify the page table/page directory being used. I found
splitting this functionality up to be too unwieldy. I apologize in
advance to the reviewer. I'd recommend looking at the result, rather
than the diff.

v2/NOTE2: This patch predated commit:
6f1cc993518462ccf039e195fabd47e7aa5bfd13
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Tue Dec 31 15:50:31 2013 +0000

drm/i915: Avoid dereference past end of page arr

It fixed the same issue as that patch, but because of the limbo state of
PPGTT, Chris patch was merged instead. The excess churn is a result of
my using my original patch, which has my preferred naming. Primarily
act_* is changed to which_*, but it's mostly the same otherwise. I've
kept the convention Chris used for the pte wrap (I had something
slightly different, and broken - but fixable)

v3: Rename which_p[..]e to drop which_ (Chris)
Remove BUG_ON in inner loop (Chris)
Redo the pde/pdpe wrap logic (Chris)

v4: s/1MB/2MB in commit message (Imre)
Plug leaking gen8_pt_pages in both the error path, as well as general
free case (Imre)

v5: Rename leftover "which_" variables (Imre)
Add the pde = 0 wrap that was missed from v3 (Imre)

Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Squash in fixup from Ben.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 782f1495 20-Feb-2014 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915: Make clear/insert vfuncs args absolute

This patch converts insert_entries and clear_range, both functions which
are specific to the VM. These functions tend to encapsulate the gen
specific PTE writes. Passing absolute addresses to the insert_entries,
and clear_range will help make the logic clearer within the functions as
to what's going on. Currently, all callers simply do the appropriate
page shift, which IMO, ends up looking weird with an upcoming change for
the gen8 page table allocations.

Up until now, the PPGTT was a funky 2 level page table. GEN8 changes
this to look more like a 3 level page table, and to that extent we need
a significant amount more memory simply for the page tables. To address
this, the allocations will be split up in finer amounts.

v2: Replace size_t with uint64_t (Chris, Imre)

v3: Fix size in gen8_ppgtt_init (Ben)
Fix Size in i915_gem_suspend_gtt_mappings/restore (Imre)

Reviewed-by: Imre Deak <imre.deak@intel.com> (v2)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# bf2b4ed2 19-Feb-2014 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915/bdw: Split ppgtt initialization up

Like cleanup in an earlier patch, the code becomes much more readable,
and easier to extend if we extract out helper functions for the various
stages of init.

Note that with this patch it becomes really simple, and tempting to begin
using the 'goto out' idiom with explicit free/fini semantics. I've
kept the error path as similar as possible to the cleanup() function to
make sure cleanup is as robust as possible

v2: Remove comment "NB:From here on, ppgtt->base.cleanup() should
function properly"
Update commit message to reflect above

v3: Rebased on top of bugfixes found in the previous patch by Imre
Moved number of pd pages assertion to the proper place (Imre)

v4:
Allocate dma address space for num_pd_pages, not num_pd_entries (Ben)
Don't use gen8_pt_dma_addr after free on error path (Imre)
With new fix from v4 of the previous patch.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# f3a964b9 19-Feb-2014 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915/bdw: Reorganize PPGTT init

Create 3 clear stages in PPGTT init. This will help with upcoming
changes be more readable. The 3 stages are, allocation, dma mapping, and
writing the P[DT]Es

One nice benefit to the patches is that it makes 2 very clear error
points, allocation, and mapping, and avoids having to do any handling
after writing PTEs (something which was likely buggy before). This
simplified error handling I suspect will be helpful when we move to
deferred/dynamic page table allocation and mapping.

The patches also attempts to break up some of the steps into more
logical reviewable chunks, particularly when we free.

v2: Don't call cleanup on the error path since that takes down the
drm_mm and list entry, which aren't setup at this point.

v3: Fixes addressing Imre's comments from:
<1392821989.19792.13.camel@intelbox>

Don't do dynamic allocation for the page table DMA addresses. I can't
remember why I did it in the first place. This addresses one of Imre's
other issues.

Fix error path leak of page tables.

v4: Fix the fix of the error path leak. Original fix still leaked page
tables. (Imre)

Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# b18b6bde 20-Feb-2014 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915/bdw: Free PPGTT struct

GEN8 never freed the PPGTT struct. As GEN8 doesn't use full PPGTT, the
leak is small and only found on a module reload. ie. I don't think this
needs to go to stable.

v2: The very naive, kfree in gen8 ppgtt cleanup, is subject to a double
free on PPGTT initialization failure. (Spotted by Imre). Instead this
patch pulls the ppgtt struct freeing out of the cleanup and leaves it to
the allocators/callers or the one doing the last kref_put as in standard
convention

Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# d47c3ea2 14-Feb-2014 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Allow blocking in the PDE alloc when running low on gtt space

There's no need not to, really.

Split out from Chris vma-bind rework.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 1ec9e26d 14-Feb-2014 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Consolidate binding parameters into flags

Anything more than just one bool parameter is just a pain to read,
symbolic constants are much better.

Split out from Chris' vma-binding rework patch.

v2: Undo the behaviour change in object_pin that Chris spotted.

v3: Split out misplaced hunk to handle set_cache_level errors,
spotted by Jani.

v4: Keep the current over-zealous binding logic in the execbuffer code
working with a quick hack while the overall binding code gets shuffled
around.

v5: Reorder the PIN_ flags for more natural patch splitup.

v6: Pull out the PIN_GLOBAL split-up again.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# b45a6715 12-Feb-2014 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915/bdw: Split up PPGTT cleanup

This will make the code more readable, and extensible which is needed
for upcoming feature work. Eventually, we'll do the same for init.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# d330a953 21-Jan-2014 Jani Nikula <jani.nikula@intel.com>

drm/i915: move module parameters into a struct, in a new file

With 20+ module parameters, I think referring to them via a struct
improves clarity over just having a bunch of globals. While at it, move
the parameter initialization and definitions into a new file
i915_params.c to reduce clutter in i915_drv.c.

Apart from the ill-named i915_enable_rc6, i915_enable_fbc and
i915_enable_ppgtt parameters, for which we lose the "i915_" prefix
internally, the module parameters now look the same both on the kernel
command line and in code. For example, "i915.modeset".

The downsides of the change are losing static on a couple of variables
and not having the initialization and module_param_named() right next to
each other. On the other hand, all module parameters are now defined in
one place at i915_params.c. Plus you can do this to find all module
parameter references:

$ git grep "i915\." -- drivers/gpu/drm/i915

v2:
- move the definitions into a new file
- s/i915_params/i915/
- make i915_try_reset i915.reset, for consistency

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 0e46ce2e 08-Jan-2014 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: fix ppgtt dump code for DEBUG_FS=n

A regression in the topic/ppgtt branch introduce in

commit 87d60b63e0371529faaed0667d457e5022964010
Author: Ben Widawsky <ben@bwidawsk.net>
Date: Fri Dec 6 14:11:29 2013 -0800

drm/i915: Add PPGTT dumper

The issue is that we're missing the definitions for the seq_file
functions and hence compilation fails.

v2: Just include the right header instead of splattering #ifdefs all
over the place (Chris).

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Reported-by: Antti Koskipaa <antti.koskipaa@linux.intel.com>
Cc: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 21c34607 21-Dec-2013 Bjorn Helgaas <bhelgaas@google.com>

drm/i915: Rename gtt_bus_addr to gtt_phys_addr

We're dealing with CPU physical addresses here, which may be different from
bus addresses, so rename gtt_bus_addr to gtt_phys_addr to avoid confusion.

No functional change.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 6f1cc993 31-Dec-2013 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Avoid dereference past end of page array in gen8_ppgtt_insert_entries()

The bug from gen6_ppgtt_insert_entries() was replicated into
gen8_ppgtt_insert_entries(). This applies the fix for the OOPS from the
previous patch to the gen8 routine.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# cc79714f 31-Dec-2013 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Avoid dereference past end of page array in gen6_ppgtt_insert_entries()

[ 89.237347] BUG: unable to handle kernel paging request at ffff880096326000
[ 89.237369] IP: [<ffffffff81347227>] gen6_ppgtt_insert_entries+0x117/0x170
[ 89.237382] PGD 2272067 PUD 25df0e067 PMD 25de5c067 PTE 8000000096326060
[ 89.237394] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
[ 89.237404] CPU: 1 PID: 1981 Comm: gem_concurrent_ Not tainted 3.13.0-rc4+ #639
[ 89.237411] Hardware name: Intel Corporation 2012 Client Platform/Emerald Lake 2, BIOS ACRVMBY1.86C.0078.P00.1201161002 01/16/2012
[ 89.237420] task: ffff88024c038030 ti: ffff88024b130000 task.ti: ffff88024b130000
[ 89.237425] RIP: 0010:[<ffffffff81347227>] [<ffffffff81347227>] gen6_ppgtt_insert_entries+0x117/0x170
[ 89.237435] RSP: 0018:ffff88024b131ae0 EFLAGS: 00010286
[ 89.237440] RAX: ffff880096325000 RBX: 0000000000000400 RCX: 0000000000001000
[ 89.237445] RDX: 0000000000000200 RSI: 0000000000000001 RDI: 0000000000000010
[ 89.237451] RBP: ffff88024b131b30 R08: ffff88024cc3aef0 R09: 0000000000000000
[ 89.237456] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88024cc3ae00
[ 89.237462] R13: ffff88024a578000 R14: 0000000000000001 R15: ffff88024a578ffc
[ 89.237469] FS: 00007ff5475d8900(0000) GS:ffff88025d020000(0000) knlGS:0000000000000000
[ 89.237475] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 89.237480] CR2: ffff880096326000 CR3: 000000024d531000 CR4: 00000000001407e0
[ 89.237485] Stack:
[ 89.237488] ffff880000000000 0000020000000000 ffff88024b23f2c0 0000000100000000
[ 89.237499] 0000000000000001 000000000007ffff ffff8801e7bf5ac0 ffff8801e7bf5ac0
[ 89.237510] ffff88024cc3ae00 ffff880248a2ee40 ffff88024b131b58 ffffffff813455ed
[ 89.237521] Call Trace:
[ 89.237528] [<ffffffff813455ed>] ppgtt_bind_vma+0x3d/0x60
[ 89.237534] [<ffffffff8133d8dc>] i915_gem_object_pin+0x55c/0x6a0
[ 89.237541] [<ffffffff8134275b>] i915_gem_execbuffer_reserve_vma.isra.14+0x5b/0x110
[ 89.237548] [<ffffffff81342a88>] i915_gem_execbuffer_reserve+0x278/0x2c0
[ 89.237555] [<ffffffff81343d29>] i915_gem_do_execbuffer.isra.22+0x699/0x1250
[ 89.237562] [<ffffffff81344d91>] ? i915_gem_execbuffer2+0x51/0x290
[ 89.237569] [<ffffffff81344de6>] i915_gem_execbuffer2+0xa6/0x290
[ 89.237575] [<ffffffff813014f2>] drm_ioctl+0x4d2/0x610
[ 89.237582] [<ffffffff81080bf1>] ? cpuacct_account_field+0xa1/0xc0
[ 89.237588] [<ffffffff81080b55>] ? cpuacct_account_field+0x5/0xc0
[ 89.237597] [<ffffffff811371c0>] do_vfs_ioctl+0x300/0x520
[ 89.237603] [<ffffffff810757a1>] ? vtime_account_user+0x91/0xa0
[ 89.237610] [<ffffffff810e40eb>] ? context_tracking_user_exit+0x9b/0xe0
[ 89.237617] [<ffffffff81083d7d>] ? trace_hardirqs_on+0xd/0x10
[ 89.237623] [<ffffffff81137425>] SyS_ioctl+0x45/0x80
[ 89.237630] [<ffffffff815afffa>] tracesys+0xd4/0xd9
[ 89.237634] Code: 5b 41 5c 41 5d 41 5e 41 5f 5d c3 66 0f 1f 84 00 00 00 00 00 83 45 bc 01 49 8b 84 24 78 01 00 00 65 ff 0c 25 e0 b8 00 00 8b 55 bc <4c> 8b 2c d0 65 ff 04 25 e0 b8 00 00 49 8b 45 00 48 c1 e8 2d 48
[ 89.237741] RIP [<ffffffff81347227>] gen6_ppgtt_insert_entries+0x117/0x170
[ 89.237749] RSP <ffff88024b131ae0>
[ 89.237753] CR2: ffff880096326000
[ 89.237758] ---[ end trace 27416ba8b18d496c ]---

This bug dates back to the original introduction of the
gen6_ppgtt_insert_entries()

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Dropped cc: stable since without full ppgtt there's no way
we'll access the last page directory with this function since that
range is occupied (only in the allocator) with the ppgtt pdes. Without
aliasing we can start to use that range and blow up.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# c0a7f818 29-Dec-2013 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Mention when we enable the Ironlake iommu workarounds

The iommu and gfx on Ironlake do not like each other and require a
big hammer to prevent hard machine hangs. In

commit 5c0422878fcdc279ae9a8e8b66972a15b5efb67f
Author: Ben Widawsky <ben@bwidawsk.net>
Date: Mon Oct 17 15:51:55 2011 -0700

drm/i915: ILK + VT-d workaround

we added the workaround, but never emitted any debug message that it was
active. Doing so should help identify known performance regressions.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 058840c7 24-Dec-2013 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915/bdw: Flush system agent on gen8 also

gem_gtt_cpu_tlb seems to indicate that it is needed.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72869

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 87d60b63 06-Dec-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Add PPGTT dumper

Dump the aliasing PPGTT with it. The aliasing PPGTT should actually
always be empty.

TODO: Broadwell. Since we don't yet use full PPGTT on Broadwell, not
having the dumper is okay.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# d2ff7192 06-Dec-2013 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915: Remove extraneous mm_switch in ppgtt enable

Originally this commit message said:
Now that do_switch does the mm switch, and we always enable the aliasing
PPGTT, and contexts at the same time, there is no need to continue doing
this during PPGTT enabling.

Since originally writing the patch however, I introduced the concept of
synchronous mm switching (using MMIO). Since this is generally not
recommended in the spec (for reasons unknown), I've isolated its usage
as much as possible. As such the "extraneous" switch only ever will
occur when we have full PPGTT.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 7e0d96bc 06-Dec-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Use multiple VMs -- the point of no return

As with processes which run on the CPU, the goal of multiple VMs is to
provide process isolation. Specific to GEN, there is also the ability to
map more objects per process (2GB each instead of 2Gb-2k total).

For the most part, all the pipes have been laid, and all we need to do
is remove asserts and actually start changing address spaces with the
context switch. Since prior to this we've converted the setting of the
page tables to a streamed version, this is quite easy.

One important thing to point out (since it'd been hotly contested) is
that with this patch, every context created will have it's own address
space (provided the HW can do it).

v2: Disable BDW on rebase

NOTE: I tried to make this commit as small as possible. I needed one
place where I could "turn everything on" and that is here. It could be
split into finer commits, but I didn't really see much point.

Cc: Eric Anholt <eric@anholt.net>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# bdf4fd7e 06-Dec-2013 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915: Do aliasing PPGTT init with contexts

We have a default context which suits the aliasing PPGTT well. Tie them
together so it looks like any other context/PPGTT pair. This makes the
code cleaner as it won't have to special case aliasing as often.

The patch has one slightly tricky part in the default context creation
function. In the future (and on aliased setup) we create a new VM for a
context (potentially). However, if we have aliasing PPGTT, which occurs
at this point in time for all platforms GEN6+, we can simply manage the
refcounting to allow things to behave as normal. Now is a good time to
recall that the aliasing_ppgtt doesn't have a real VM, it uses the GGTT
drm_mm.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 80da2161 06-Dec-2013 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915: Restore PDEs for all VMs

In following with the old restore code, we must now restore ever PPGTT's
PDEs, since they aren't proper GEM ojbects.

v2: Rebased on BDW. Only do restore pdes for gen6 & 7

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 9f273d48 06-Dec-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Write PDEs at init instead of enable

We won't be calling enable() for all PPGTTs. We do need to write PDEs
for all PPGTTs however. By moving the writing to init (which is called
for all PPGTTs) we should accomplish this.

ADD NOTE ABOUT PDE restore

TODO: Eventually, we should allocate the page tables on demand.

v2: Rebased on BDW. Only do PDEs for pre-gen8

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# c7c48dfd 06-Dec-2013 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915: Add VM to context

Pretty straightforward so far except for the bit about the refcounting.
The PPGTT will potentially be shared amongst multiple contexts. Because
contexts themselves have a refcounted lifecycle, the easiest way to
manage this will be to refcount the PPGTT. To acheive this, we piggy
back off of the existing context refcount, and will increment and
decrement the PPGTT refcount with context creation, and destruction.

To put it more clearly, if context A, and context B both use PPGTT 0, we
can't free the PPGTT until both A, and B are destroyed.

Note that because the PPGTT is permanently pinned (for now), it really
just matters for the PPGTT destruction, as opposed to making space under
memory pressure.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 246cbfb5 06-Dec-2013 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915: Reorganize intel_enable_ppgtt

This patch consolidates the way in which we handle the various supported
PPGTT by module parameter in addition to what the hardware supports. It
strives to make doing the right thing in the code as simple as possible,
with the USES_ macros.

I've opted to add the full PPGTT argument simply so one can see how I
intend to use this function. It will not/cannot be used until later.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# d6660add 06-Dec-2013 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915: Generalize PPGTT init

Rearrange the initialization code to try to special case the aliasing
PPGTT less, and provide usable interfaces for the general case later.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 90252e5c 06-Dec-2013 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915: Flush TLBs after !RCS PP_DIR_BASE

I've found this by accident. The docs don't really come out and say you
need to do this. What the docs do tell you is you need to flush the TLBs
before you set the PP_DIR_BASE, and that the RCS will invalidate its
TLBs upon setting the new PP_DIR_BASE. It makes no such comment about
any of the other rings.

Empirically, this indeed fixes a really obvious bug whereby the batches
being sent to the blitter were not executing (we were executing the
HSWP somehow instead).

NOTE: This should make no difference with the current code. It only
applies when we start using multiple VMs.

NOTE2: HSW appears to be immune to this.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 48a10389 06-Dec-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Use LRI for switching PP_DIR_BASE

The docs seem to suggest this is the appropriate method (though it
doesn't say so outright). In other words, we probably should have done
this before. We certainly must do this for switching VMs on the fly,
since synchronizing the rings to MMIO updates isn't acceptable.

v2:
Make the reset code actually work for all rings. Note that this was
fixed in subsequent commits, but was indeed broken for this commit.

Add a posting read to the reset case. It probably should have existed
before hand, but since we have no failures; there is no reason to make
it a separate commit.

Make IS_GEN6 not use the ring because I am seeing crashes when using it.
It is a bit of a hack in this patch, it will get fixed up in a couple of
patches.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# eeb9488e 06-Dec-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Extract mm switching to function

In order to do the full context switch with address space, it's
convenient to have a way to switch the address space. We already have
this in our code - just pull it out to be called by the context switch
code later.

v2: Rebased on BDW support. Required adding BDW.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# b4a74e3a 06-Dec-2013 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915: Use platform specific ppgtt enable

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# e3cc1995 06-Dec-2013 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915: One hopeful eviction on PPGTT alloc

The patch before this changed the way in which we allocate space for the
PPGTT PDEs. It began carving out the PPGTT PDEs (which live in the
Global GTT) from the GGTT's drm_mm. Prior to that patch, the PDEs were
hidden from the drm_mm, and therefore could never fail to be allocated.

In unfortunate cases, the drm_mm may be full when we want to allocate
the space. This can technically occur whenever we try to allocate, which
happens in two places currently. Practically, it can only really ever
happen at GPU reset.

Later, when we allocate more PDEs for multiple PPGTTs this will
potentially even more useful.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# c8d4c0d6 06-Dec-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Use drm_mm for PPGTT PDEs

When PPGTT support was originally enabled, it was only designed to
support 1 PPGTT. It therefore made sense to simply hide the GGTT space
required to enable this from the drm_mm allocator.

Since we intend to support full PPGTT, which means more than 1, and they
can be created and destroyed ad hoc it will be required to use the
proper allocation techniques we already have.

The first step here is to make the existing single PPGTT use the
allocator.

The astute observer will notice that we are reserving space in the GGTT
for the PDEs for the lifetime of the address space, and would be right
to question whether or not this is a good idea. It does not make a
difference with this current patch only the aliasing PPGTT (indeed the
PDEs should still be hidden from the shrinker). For the future, we are
allocating from top to bottom to avoid using the precious "gtt
space" The GGTT space at that point should only be used for scanout, HW
contexts, ringbuffers, HWSP, PDEs, and a couple of other small buffers
(potentially) used by the kernel. Everything else should be mapped into
a PPGTT. To put the consumption in more tangible terms, it takes
approximately 4 sets of PDEs to equal one 19x10 framebuffer (with no
fancy stride or alignment constraints). 3/4 of the total [average] GGTT
can be used for PDEs, and hopefully never touch the 1/4 that the
framebuffer needs.

The astute, and persistent observer might ask about the page tables
which are also pinned for the address space. This waste is unfortunate.
We use 2MB of memory per address space. We leave wrapping the PDEs as a
real GEM object as a TODO.

v2: Align PDEs to 64b in GTT
Allocate the node dynamically so we can use drm_mm_put_block
Now tested on IGT
Allocate node at the top to avoid fragmentation (Chris)

v3: Use Chris' top down allocator

v4: Embed drm_mm_node into ppgtt struct (Jesse)
Remove hunks which didn't belong (Jesse)

v5: Don't subtract guard page since we now killed the guard page prior
to this patch. (Ben)

v6: Rebased and removed guard page stuff.
Added a chunk to the commit message
Allow adding a context to mappable region

v7: Undo v3, so we can make the drm patch last in the series

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> (v4)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

squash: drm/i915: allow PPGTT to use mappable
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# a3d67d23 06-Dec-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915: PPGTT vfuncs should take a ppgtt argument

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 6f65e29a 06-Dec-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Create bind/unbind abstraction for VMAs

To sum up what goes on here, we abstract the vma binding, similarly to
the previous object binding. This helps for distinguishing legacy
binding, versus modern binding. To keep the code churn as minimal as
possible, I am leaving in insert_entries(). It serves as the per
platform pte writing basically. bind_vma and insert_entries do share a
lot of similarities, and I did have designs to combine the two, but as
mentioned already... too much churn in an already massive patchset.

What follows are the 3 commits which existed discretely in the original
submissions. Upon rebasing on Broadwell support, it became clear that
separation was not good, and only made for more error prone code. Below
are the 3 commit messages with all their history.

drm/i915: Add bind/unbind object functions to VMA
drm/i915: Use the new vm [un]bind functions
drm/i915: reduce vm->insert_entries() usage

drm/i915: Add bind/unbind object functions to VMA

As we plumb the code with more VM information, it has become more
obvious that the easiest way to deal with bind and unbind is to simply
put the function pointers in the vm, and let those choose the correct
way to handle the page table updates. This change allows many places in
the code to simply be vm->bind, and not have to worry about
distinguishing PPGTT vs GGTT.

Notice that this patch has no impact on functionality. I've decided to
save the actual change until the next patch because I think it's easier
to review that way. I'm happy to squash the two, or let Daniel do it on
merge.

v2:
Make ggtt handle the quirky aliasing ppgtt
Add flags to bind object to support above
Don't ever call bind/unbind directly for PPGTT until we have real, full
PPGTT (use NULLs to assert this)
Make sure we rebind the ggtt if there already is a ggtt binding. This
happens on set cache levels.
Use VMA for bind/unbind (Daniel, Ben)

v3: Reorganize ggtt_vma_bind to be more concise and easier to read
(Ville). Change logic in unbind to only unbind ggtt when there is a
global mapping, and to remove a redundant check if the aliasing ppgtt
exists.

v4: Make the bind function a bit smarter about the cache levels to avoid
unnecessary multiple remaps. "I accept it is a wart, I think unifying
the pin_vma / bind_vma could be unified later" (Chris)
Removed the git notes, and put version info here. (Daniel)

v5: Update the comment to not suck (Chris)

v6:
Move bind/unbind to the VMA. It makes more sense in the VMA structure
(always has, but I was previously lazy). With this change, it will allow
us to keep a distinct insert_entries.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

drm/i915: Use the new vm [un]bind functions

Building on the last patch which created the new function pointers in
the VM for bind/unbind, here we actually put those new function pointers
to use.

Split out as a separate patch to aid in review. I'm fine with squashing
into the previous patch if people request it.

v2: Updated to address the smart ggtt which can do aliasing as needed
Make sure we bind to global gtt when mappable and fenceable. I thought
we could get away without this initialy, but we cannot.

v3: Make the global GTT binding explicitly use the ggtt VM for
bind_vma(). While at it, use the new ggtt_vma helper (Chris)

At this point the original mailing list thread diverges. ie.

v4^:
use target_obj instead of obj for gen6 relocate_entry
vma->bind_vma() can be called safely during pin. So simply do that
instead of the complicated conditionals.
Don't restore PPGTT bound objects on resume path
Bug fix in resume path for globally bound Bos
Properly handle secure dispatch
Rebased on vma bind/unbind conversion

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

drm/i915: reduce vm->insert_entries() usage

FKA: drm/i915: eliminate vm->insert_entries()

With bind/unbind function pointers in place, we no longer need
insert_entries. We could, and want, to remove clear_range, however it's
not totally easy at this point. Since it's used in a couple of place
still that don't only deal in objects: setup, ppgtt init, and restore
gtt mappings.

v2: Don't actually remove insert_entries, just limit its usage. It will
be useful when we introduce gen8. It will always be called from the vma
bind/unbind.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# e178f705 06-Dec-2013 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915: Provide PDP updates via MMIO

The initial implementation of this function used MMIO to write the PDPs.
Upon review it was determined (correctly) that the docs say to use LRI.
The issue is there are times where we want to do a synchronous write
(GPU reset).

I've tested this, and it works. I've verified with as many people as
possible that it should work.

This should fix the failing reset problems.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 5ed16782 25-Nov-2013 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915: Move the gtt mm takedown to cleanup

Our VM code already has a cleanup function, and this is a nice place to
put the drm_mm_takedown. This should have no functional impact, it just
leaves the unload function a bit cleaer, and is more logical IMO

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 686e1f6f 25-Nov-2013 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915: Add a few missed bits to the mm

This should really have been added in BDW integration, as well as:

commit 93bd8649dba3155d1a0ba2a902d9c49f1c75a1da
Author: Ben Widawsky <ben@bwidawsk.net>
Date: Tue Jul 16 16:50:06 2013 -0700

drm/i915: Put the mm in the parent address space

It didn't really matter before, but it will in the future.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# d595bd4b 25-Nov-2013 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915: Fix BDW PPGTT error path

When we fail for some reason on loading the PDPs, it would be wise to
disable the PPGTT in the ring registers. If we do not do this, we have
undefined results.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# c51e9701 22-Nov-2013 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Prefer setting PTE cache age to 3

We have conflicting benchmark data that suggest either age 0 or age 3 is
better. However, the earlier benchmark on which we based the switch to
age 0

(commit 0d8ff15e9a15f2b393e53337a107b7a1e5919b6d
Author: Ben Widawsky <benjamin.widawsky@intel.com>
Date: Thu Jul 4 11:02:03 2013 -0700

drm/i915/hsw: Set correct Haswell PTE encodings)

actually seems to prefer the default PTE encoding as age 3. Presumably,
this is in part due to the use of MOCS to override the PTE encodings
when appropriate.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69870
Tested-by: mengmeng.meng@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Eric Anholt <eric@anholt.net
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 3a2ffb65 07-Nov-2013 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915/bdw: Limit GTT to 2GB

Because of the way in which we're allocating the pages for the Aliasing
PPGTT, we cannot actually successfully alloc enough space for anything
greater than 2GB.

Instead of a quick hack to fix this, we should defer until we have the
real solution in place (allocating much less contiguous space).

This wasn't found sooner because we didn't not have any systems
supporting more than a 2GB GTT.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 230f955f 07-Nov-2013 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915/bdw: Free correct number of ppgtt pages

I am unclear how this got messed up in the shuffle, but it did.

Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# b53c8c35 12-Nov-2013 Jesse Barnes <jbarnes@virtuousgeek.org>

drm/i915: drop duplicate ggtt vma list add in setup_global_gtt

Preallocated objects will already have been added to the vma_list when
creating their ggtt vma entry, and coincidentally also marked as holding
a ggtt mapping. Repeating the vma_list manipulation when setting up the
ggtt after preallocation is a recipe for an unhappy kernel.

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Use the improve commit message suggest by Chris.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# b42218c1 02-Nov-2013 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915/bdw: Don't muck with gtt_size on Gen8 when PPGTT setup fails

v2: Resolve rebase conflicts and switch to gen < 8 color for GenX
checking.

v3: Rebase on top of the address space refactoring.

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 28cf5415 02-Nov-2013 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915/bdw: unleash PPGTT

v2: Squash in fix from Ben: Set PPGTT batches as necessary

This fixes the regression in the last couple of days when we enabled
PPGTT.

v3: Squash in fixup to still use GTT for secure batches from Ville:

BDW doesn't have a separate secure vs. non-secure bit in
MI_BATCH_BUFFER_START. So for secure batches we have to simply
leave the PPGTT bit unset. Fortunately older generations (except
HSW) had similar limitations so execbuffer already creates a GTT
mapping for all secure batches.

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 94e409c1 04-Nov-2013 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915/bdw: Implement PPGTT enable

Legacy PPGTT on GEN8 requires programming 4 PDP registers per ring.
Since all rings are using the same address space with the current code
the logic is simply to program all the tables we've setup for the PPGTT.

v2: Turn on PPGTT in GFX_MODE

v3: v2 was the wrong patch

v4: Resolve conflicts due to patch series reordering.

v5: Squash in fixup from Ben: Use LRI to write PDPs

The docs (and simulator seems to back up) suggest that we can only
program legacy PPGTT PDPs with LRI commands.

v6: Rebase around context differences conflicts.

v7: Use #defines for per ring PDPs. (Damien)

v8: Don't use typede'f private_t.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (up to v3 and v7)
Reviewed-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 9df15b49 02-Nov-2013 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915/bdw: Implement PPGTT insert

GEN8 insertion is very similar to GEN6.

v2: Rebase on top of Imre's for_each_sg_page helpers.

v3: Fixup my conversion (spotted by Ville).

v4: Rebase on top of the address space refactoring.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 459108b8 02-Nov-2013 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915/bdw: Implement PPGTT clear range

GEN8 PPGTT range clearing is very similar to GEN6 if we assume that our
PDEs are all valid, which they should be.

v2: Rebase on top of the address space refactoring.

v3: Rebase on top of the bool use_scratch addition to the clear_range interface.

Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# b1fe6673 04-Nov-2013 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915/bdw: Initialize the PDEs

The upcoming clear and insert routines will expect that PDEs all point
to valid Page Directories. Doing that lazily doesn't really buy us
anything.

The page allocation is done regardless earlier in init so it shouldn't
hurt set the PDEs.

v2: Squash in patches to implement fixed PDE write function:

- If I had done this in the first place, the bug that's going to be
fixed in an upcoming patch would have been much easier to find.

- Use WB for PDEs.

The PAT bit is used for page size. 2ME PDEs aren't even supported in
BDW, so this was completely invalid. The solution is to make our
PDEs WB+LLC instead of the pervious WB+eLLC. As far as I can guess,
this change won't matter for performance.

Thanks to Ville for the quick correction when discussing on IRC.

v3: Return the pde type for pde encoding (Damien)

Reviewed-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 37aca44a 04-Nov-2013 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915/bdw: PPGTT init & cleanup

Aside from the potential size increase of the PPGTT, the primary
difference from previous hardware is the Page Directories are no longer
carved out of the Global GTT.

Note that the PDE allocation is done as a 8MB contiguous allocation,
this needs to be eventually fixed (since driver reloading will be a
pain otherwise). Also, this will be a no-go for real PPGTT support.

v2: Move vtable initialization

v3: Resolve conflicts due to patch series reordering.

v4: Rebase on top of the address space refactoring of the PPGTT
support. Drop Imre's r-b tag for v2, too outdated by now.

v5: Free the correct amount of memory, "get_order takes size not a page
count." (Imre)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# fbe5d36e 04-Nov-2013 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915/bdw: Support BDW caching

BDW caching works differently than the previous generations. Instead of
having bits in the PTE which directly control how the page is cached,
the 3 PTE bits PWT PCD and PAT provide an index into a PAT defined by
register 0x40e0. This style of caching is functionally equivalent to how
it works on HSW and before.

v2: Tiny bikeshed as discussed on internal irc.

v3: Squash in patch from Ville to mirror the x86 PAT setup more like
in arch/x86/mm/pat.c. Primarily, the 0th index will be WB, and not
uncached.

v4: Comment for reason to not use a 64b write on the PPAT.

v5: Add a FIXME comment that the caching bits in the PAT registers
might be wrong due to doc confusion.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 94ec8f61 02-Nov-2013 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915/bdw: Add GTT functions

With the PTE clarifications, the bind and clear functions can now be
added for gen8.

v2: Use for_each_sg_pages in gen8_ggtt_insert_entries.

v3: Drop dev argument to pte encode functions, upstream lost it. Also
rebase on top of the scratch page movement.

v4: Rebase on top of the new address space vfuncs.

v5: Add the bool use_scratch argument to clear_range and the bool valid argument
to the PTE encode function to follow upstream changes.

v6: Add a FIXME(BDW) about the size mismatch of the readback check
that Jon Bloomfield spotted.

v7: Squash in fixup patch from Ben for the posting read to match the
64bit ptes and so shut up the WARN.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# d31eb10e 02-Nov-2013 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915/bdw: Create gen8_gtt_pte_t

With gen6 PTE type in place, pave the way for the new gen8 type.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 63340133 04-Nov-2013 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915/bdw: Make gen8_gmch_probe

Probing gen8 is similar to gen6. To make the code cleaner and more
maintainable however we can use the probe functions to split it out.

v2: Rebased on top of update gtt probe infrastructure.

v3: Rebased on top of Kenneth' Graunke's ->pte_encode refactoring.

V4: Resolve conflicts with Ben's latest ppgtt patches, also switch to
gen < 8 testing instead of gen <= 7.

v5: Resolve conflicts with address space vfunc changes in upstream.

v6: Use 39b DMA mask. At least, for this mode, it is the correct mask.
(Imre)

Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 9459d252 03-Nov-2013 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915/bdw: support GMS and GGMS changes

All the BARs have the ability to grow.

v2: Pulled out the simulator workaround to a separate patch.
Rebased.

v3: Rebase onto latest vlv patches from Jesse.

v4: Rebased on top of the early stolen quirk patch from Jesse.

v5: Use the new macro names.
s/INTEL_BDW_PCI_IDS_D/INTEL_BDW_D_IDS
s/INTEL_BDW_PCI_IDS_M/INTEL_BDW_M_IDS
It's Jesse's fault for not following the convention I originally set.

Cc: Ingo Molnar <mingo@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 8fe6bd23 02-Nov-2013 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/bdw: Disable PPGTT for now

This will be changed once the gen8 code is fully implemented.

v2: Use ENOSYS instead of ENXIO as suggested by Chris.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 828c7908 16-Oct-2013 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915: Disable GGTT PTEs on GEN6+ suspend

Once the machine gets to a certain point in the suspend process, we
expect the GPU to be idle. If it is not, we might corrupt memory.
Empirically (with an early version of this patch) we have seen this is
not the case. We cannot currently explain why the latent GPU writes
occur.

In the technical sense, this patch is a workaround in that we have an
issue we can't explain, and the patch indirectly solves the issue.
However, it's really better than a workaround because we understand why
it works, and it really should be a safe thing to do in all cases.

The noticeable effect other than the debug messages would be an increase
in the suspend time. I have not measure how expensive it actually is.

I think it would be good to spend further time to root cause why we're
seeing these latent writes, but it shouldn't preclude preventing the
fallout.

NOTE: It should be safe (and makes some sense IMO) to also keep the
VALID bit unset on resume when we clear_range(). I've opted not to do
this as properly clearing those bits at some later point would be extra
work.

v2: Fix bugzilla link

Bugzilla: http://bugs.freedesktop.org/show_bug.cgi?id=65496
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=59321
Tested-by: Takashi Iwai <tiwai@suse.de>
Tested-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Tested-By: Todd Previte <tprevite@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# b35b380e 16-Oct-2013 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915: Make PTE valid encoding optional

We need this to work around a corruption when the boot kernel image
loads the hibernated kernel image from swap on Haswell systems -
somehow not everything is properly shut off.

This is just the prep work, the next patch will implement the actual
workaround.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Add a commit message suitable for -fixes and add cc: stable]
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# a1e22653 20-Sep-2013 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Use kcalloc more

No buffer overflows here, but better safe than sorry.

v2:
- Fixup the sizeof conversion, I've missed the pointer deref (Jani).
- Drop the redundant GFP_ZERO, kcalloc alreads memsets (Jani).
- Use kmalloc_array for the execbuf fastpath to avoid the memset
(Chris). I've opted to leave all other conversions as-is since they
aren't in a fastpath and dealing with cleared memory instead of
random garbage is just generally nicer.

Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
[danvet: Drop the contentious kmalloc_array hunk in execbuf.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 651d794f 08-Aug-2013 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Use Write-Through cacheing for the display plane on Iris

Haswell GT3e has the unique feature of supporting Write-Through cacheing
of objects within the eLLC/LLC. The purpose of this is to enable the display
plane to remain coherent whilst objects lie resident in the eLLC/LLC - so
that we, in theory, get the best of both worlds, perfect display and fast
access.

However, we still need to be careful as the CPU does not see the WT when
accessing the cache. In particular, this means that we need to flush the
cache lines after writing to an object through the CPU, and on
transitioning from a cached state to WT.

v2: Actually do the clflush on transition to WT, nagging by Ville.
v3: Flush the CPU cache after writes into WT objects.
v4: Rease onto LLC updates and report WT as "uncached" for
get_cache_level_ioctl to remain symmetric with set_cache_level_ioctl.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 2c22569b 08-Aug-2013 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Update rules for writing through the LLC with the cpu

As mentioned in the previous commit, reads and writes from both the CPU
and GPU go through the LLC. This gives us coherency between the CPU and
GPU irrespective of the attribute settings either device sets. We can
use to avoid having to clflush even uncached memory.

Except for the scanout.

The scanout resides within another functional block that does not use
the LLC but reads directly from main memory. So in order to maintain
coherency with the scanout, writes to uncached memory must be flushed.
In order to optimize writes elsewhere, we start tracking whether an
framebuffer is attached to an object.

v2: Use pin_display tracking rather than fb_count (to ensure we flush
cursors as well etc) and only force the clflush along explicit writes to
the scanout paths (i.e. pin_to_display_plane and pwrite into scanout).

v3: Force the flush after hitting the slowpath in pwrite, as after
dropping the lock the object's cache domain may be invalidated. (Ville)

Based on a patch by Ville Syrjälä.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 350ec881 06-Aug-2013 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Rename I915_CACHE_MLC_LLC to L3_LLC for Ivybridge

MLC_LLC was never validated for Sandybridge and was superseded by a new
level of cacheing for the GPU in Ivybridge. Update our names to be
consistent with usage, and in the process stop setting the unwanted bit
on Sandybridge.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
[danvet: s/BUG/WARN_ON(1) bikeshed.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 40d74980 31-Jul-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Use ggtt_vm to save some typing

Just some small cleanups, and a rename of vm->ggtt_vm requested by
Daniel.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# a70a3148 31-Jul-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Make proper functions for VMs

Earlier in the conversion sequence we attempted to quickly wedge in the
transitional interface as static inlines.

Now that we're sure these interfaces are sane, for easier debug and to
decrease code size (since many of these functions may be called quite a
bit), make them real functions

While at it, kill off the set_color interface. We'll always have the
VMA, or easily get to it.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 87a6b688 05-Aug-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915/hsw: Change default LLC age to 3

The default LLC age was changed:
commit 0d8ff15e9a15f2b393e53337a107b7a1e5919b6d
Author: Ben Widawsky <benjamin.widawsky@intel.com>
Date: Thu Jul 4 11:02:03 2013 -0700

drm/i915/hsw: Set correct Haswell PTE encodings.

On the surface it would seem setting a default age wouldn't matter
because all GEM BOs are aged similarly, so the order in which objects
are evicted would not be subject to aging. The current working theory as
to why this caused a regression though is that LLC is a bit special in
that it is shared with the CPU. Presumably (not verified) the CPU
fetches cachelines with age 3, and therefore recently cached GPU objects
would be evicted before similar CPU object first when the LLC is full.
It stands to reason therefore that this would negatively impact CPU
bound benchmarks - but those seem to be low on the priority list.

eLLC OTOH does not have this same property as LLC. It should be used
entirely for the GPU, and so the age really shouldn't matter.
Furthermore, we have no evidence to suggest one is better than another
on eLLC. Since we've never properly supported eLLC before no, there
should be no regression. If the GPU client really wants "younger"
objects, they should use MOCS.

v2: Drop the extra #define (Chad)

v3: Actually git add

v4: Pimped commit message

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67062
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 08c45263 30-Jul-2013 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Use the same pte_encoding for ppgtt as for gtt

The PTE layouts are the same for both ppgtt and gtt, so we can simplify
the setup for ppgtt by copying the encoding function pointer from gtt.
This prevents bugs where we update one function pointer, but forget the
other.

For instance,

commit 4d15c145a6234d999c0452eec0d275c1fbf0688c
Author: Ben Widawsky <ben@bwidawsk.net>
Date: Thu Jul 4 11:02:06 2013 -0700

drm/i915: Use eLLC/LLC by default when available

only extends the gtt to use eLLC/LLC cacheing and forgets to also update
the ppgtt function pointer.

v2: Actually mention the bug being fixed (Kenneth)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 2f633156 17-Jul-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Create VMAs

Formerly: "drm/i915: Create VMAs (part 1)"

In a previous patch, the notion of a VM was introduced. A VMA describes
an area of part of the VM address space. A VMA is similar to the concept
in the linux mm. However, instead of representing regular memory, a VMA
is backed by a GEM BO. There may be many VMAs for a given object, one
for each VM the object is to be used in. This may occur through flink,
dma-buf, or a number of other transient states.

Currently the code depends on only 1 VMA per object, for the global GTT
(and aliasing PPGTT). The following patches will address this and make
the rest of the infrastructure more suited

v2: s/i915_obj/i915_gem_obj (Chris)

v3: Only move an object to the now global unbound list if there are no
more VMAs for the object which are bound into a VM (ie. the list is
empty).

v4: killed obj->gtt_space
some reworks due to rebase

v5: Free vma on error path (Imre)

v6: Another missed vma free in i915_gem_object_bind_to_gtt error path
(Imre)
Fixed vma freeing in stolen preallocation (Imre)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
[danvet: Squash in fixup from Ben to not deref a non-existing vma in
set_cache_level, reported by Chris.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 93bd8649 16-Jul-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Put the mm in the parent address space

Every address space should support object allocation. It therefore makes
sense to have the allocator be part of the "superclass" which GGTT and
PPGTT will derive.

Since our maximum address space size is only 2GB we're not yet able to
avoid doing allocation/eviction; but we'd hope one day this becomes
almost irrelvant.

v2: Rebased

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 853ba5d2 16-Jul-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Move gtt and ppgtt under address space umbrella

The GTT and PPGTT can be thought of more generally as GPU address
spaces. Many of their actions (insert entries), state (LRU lists), and
many of their characteristics (size) can be shared. Do that.

The change itself doesn't actually impact most of the VMA/VM rework
coming up, it just fits in with the grand scheme of abstracting the GPU
VM operations. GGTT will usually be a special case where we either know
an object must be in the GGTT (dislay engine, workarounds, etc.).

The scratch page is left as part of the VM (even though it's currently
shared with the ppgtt code) because in the future when we have Full
PPGTT, I intend to create a separate scratch page for each.

v2: Drop usage of i915_gtt_vm (Daniel)
Make cleanup also part of the parent class (Ben)
Modified commit msg
Rebased

v3: Properly share scratch page (Imre)
Finish commit message (Daniel, Imre)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 4d15c145 04-Jul-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Use eLLC/LLC by default when available

DRI clients really should be using MOCS to get fine grained streaming
cache controls. With that note, I *hope* that this patch doesn't improve
performance overwhelmingly, because if it does - it means there is a
problem elsewhere.

In any case, the kernel, and old userspace should get some benefit from
this, so let's do it. eLLC is always a good default, and really not
using it is the special case for MOCS.

References: http://www.intel.com/newsroom/kits/restricted/ha$well!/pdfs/4th_Gen_Intel_Core_PressBriefing_5-29.pdf (page 57)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 0d8ff15e 04-Jul-2013 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915/hsw: Set correct Haswell PTE encodings.

The cacheability controls have changed, and the bits have been
rearranged in general.

Note that age 0 is the oldest (most likely to get evicted) and age 3
is the youngest (most likely to stick around for a bit). We've picked
0 for no reason, but atm it shouldn't matter anyway (since we don't
yet try to differentiate between different objects).

v2: Remove comments for snb/ivb cache leves, that's a separate change.

v3: Resolve conflicts due to patch series reordering.

v4: Rebased on top of Kenneth Graunke's ->pte_encode refactoring.

v5: Removed eLLC bits for separate patch.

In the internal repository this was:
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
[danvet: Add comment about cache ages as requested by Ben provoked due
to a question from Damien.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# c6cfb325 05-Jul-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Embed drm_mm_node in i915 gem obj

Embedding the node in the obj is more natural in the transition to VMAs
which will also have embedded nodes. This change also helps transition
away from put_block to remove node.

Though it's quite an uncommon occurrence, it's somewhat convenient to not
fail at bind time because we cannot allocate the node. Though in
practice there are other allocations (like the request structure) which
would probably make this point not terribly useful.

Quoting Daniel:
Note that the only difference between put_block and remove_node is
that the former fills up the preallocation cache. Which we don't need
anyway and hence is just wasted space.

v2: Clean up the stolen preallocation code.
Rebased on the reserve_node patches
renames ggtt_ stuff to gtt_ stuff
WARN_ON if the object is already bound (which doesn't mean it's in the
bound list, tricky)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# edd41a87 05-Jul-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Kill obj->gtt_offset

With the getters in place from the previous patch this members serves no
purpose other than saving one spare pointer chase, which will be killed
in the next patch anyway.

Moving to VMAs, this members adds unnecessary confusion since an object
may exist at different offsets in different VMs.

v2: Properly preserve the stolen offset. This code is a bit hacky but it
all goes away when we embed the drm_mm_node and removes the need for the
incorrect patch I submitted previously: "Use gtt_space->start for stolen
reservation"

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# f343c5f6 05-Jul-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Getter/setter for object attributes

Soon we want to gut a lot of our existing assumptions how many address
spaces an object can live in, and in doing so, embed the drm_mm_node in
the object (and later the VMA).

It's possible in the future we'll want to add more getter/setter
methods, but for now this is enough to enable the VMAs.

v2: Reworked commit message (Ben)
Added comments to the main functions (Ben)
sed -i "s/i915_gem_obj_set_color/i915_gem_obj_ggtt_set_color/" drivers/gpu/drm/i915/*.[ch]
sed -i "s/i915_gem_obj_bound/i915_gem_obj_ggtt_bound/" drivers/gpu/drm/i915/*.[ch]
sed -i "s/i915_gem_obj_size/i915_gem_obj_ggtt_size/" drivers/gpu/drm/i915/*.[ch]
sed -i "s/i915_gem_obj_offset/i915_gem_obj_ggtt_offset/" drivers/gpu/drm/i915/*.[ch]
(Daniel)

v3: Rebased on new reserve_node patch
Changed DRM_DEBUG_KMS to actually work (will need fixing later)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 338710e7 05-Jul-2013 Ben Widawsky <ben@bwidawsk.net>

drm: Change create block to reserve node

With the previous patch we no longer actually create a node, we simply
find the correct hole and occupy it. This very well could have been
squashed with the last patch, but since I already had David's review, I
figured it's easiest to keep it distinct.

Also update the users in i915. Conveniently this is the only user of the
interface.

CC: David Airlie <airlied@linux.ie>
CC: <dri-devel@lists.freedesktop.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Acked-by: David Airlie <airlied@linux.ie>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# b3a070cc 05-Jul-2013 Ben Widawsky <ben@bwidawsk.net>

drm: pre allocate node for create_block

For an upcoming patch where we introduce the i915 VMA, it's ideal to
have the drm_mm_node as part of the VMA struct (ie. it's pre-allocated).
Part of the conversion to VMAs is to kill off obj->gtt_space. Doing this
will break a bunch of code, but amongst them are 2 callers of
drm_mm_create_block(), both related to stolen memory.

It also allows us to embed the drm_mm_node into the object currently
which provides a nice transition over to the new code.

v2: Reordered to do before ripping out obj->gtt_offset.
Some minor cleanups made available because of reordering.

v3: s/continue/break on failed stolen node allocation (David)
Set obj->gtt_space on failed node allocation (David)
Only unref stolen (fix double free) on failed create_stolen (David)
Free node, and NULL it in failed create_stolen (David)
Add back accidentally removed newline (David)

CC: <dri-devel@lists.freedesktop.org>
Reviewed-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Acked-by: David Airlie <airlied@linux.ie>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# b2f21b4d 27-Jun-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Use gtt shortform where possible

Just for compactness.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 80a74f7f 27-Jun-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Drop dev from pte_encode

The original pte_encode function needed the dev argument so we could do
platform specific handling via IS_GENX, etc. With the merging of a pte
encoding function there should never been a need to quirk away gen
specific details.

The patch doesn't do much but makes the upcoming reworks in gtt/ppgtt/mm
slightly (albeit, ever so) easier.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 67167240 27-Jun-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Combine scratch members into a struct

There isn't any special reason to do this other than it makes it obvious
that the two members are connected.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 84f13560 27-Jun-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Really share scratch page

A previous patch had set up the ppgtt and ggtt to use the same scratch
page, but still kept around both pointers. Kill it, it's not needed and
gets in our way for upcoming cleanups.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 6670a5a5 27-Jun-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915: make PDE|PTE platform specific

Nothing outside of i915_gem_gtt.c and more specifically, the relevant
gen specific init function should need to know about number of PDEs, or
PTEs per PD. Exposing this will only lead to circumventing using the
upcoming VM abstraction.

To accomplish this, move the defines into the .c file, rename the PDE
define to be GEN6, and make the PTE count less of a magic number.

The remaining code in the global gtt setup is a bit messy, but an
upcoming patch will clean that one up.

v2: Don't hardcode number of PDEs (Daniel + Jesse)
Reworded commit message to reflect change.

Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 35c20a60 31-May-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Rename the gtt_list to global_list

Since it will be used for the global bound/unbound list with full PPGTT,
this helps clarify things for upcoming code rework.

Recommended-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# c4ae25ec 01-May-2013 Ben Widawsky <ben@bwidawsk.net>

Revert "drm/i915: Calculate correct stolen size for GEN7+"

This reverts commit 03752f5b7b77b95d83479885040950fba1250850.

This revert requires a bit of explanation on how I understand things
work. Internally the architects/designers decide how the stolen encoding
works. We put it in a doc. BIOS writers take these docs and implement
it. Driver writers read the doc too, and read the value left by the BIOS
writers, and then we make magic.

The failing here is that in the docs we had[1] contained two different
definitions for this register for Gen7. (We have both a PCI register,
and an MMIO, and each of these were different). At the time [2] of
03752f5, we asked the architects what the correct value should be; but
that doesn't match the reality (BIOS) unfortunately.

So on all machines I can get my hands on, this revert is the right thing
to do. I've also worked with the product group to confirm that they
agree this revert is what we should do. People using HW made my "people"
who both write their own BIOS, and have access to our docs (Apple?).
Investigations are still ongoing about whether we need to add a list
of machines needing special handling, but this patch should be the
right thing for pretty much everyone.

[1] The docs are still wrong on this one. Now instead of two registers with
two definitions, we have one register with BOTH definitions, progress?
[2] The open source PRMs have the "wrong" definitions in chapter Volume
1 part6, section 1.1.12.

This digging was inspired by Paulo.

Cc: Paulo Zanoni <przanoni@gmail.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
[danvet: Augment the patch saying that it's still a bit unclear
whether there are any machines out there with "wrong" firmware and
whether we need to add a list to handle them specially.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 3e302542 24-Apr-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Extract PDE writes

It also makes some sense IMO to have these two functions separate
irrespective of the number of callers.

Only the single caller for now, but that will change as we add more
PPGTTs.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
[danvet: Resolve conflict.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 0a732870 24-Apr-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915: BUG_ON bad PPGTT offset

Because PPGTT PDEs within the GTT are calculated in cachelines
(HW guys consistency ftw) we do a divide which will wreak havoc if this
is wrong, and I know that from experience).

If/when we move to multiple PPGTTs this will have to become a WARN, and
return an error. For now however it should always be considered fatal,
and only a developer could hit it.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
[danvet: s/BUG/WARN]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 43b27290 27-Apr-2013 Zhang, Xiong Y <xiong.y.zhang@intel.com>

drm/i915: correct the calculation of first_pd_entry_in_global_pt

When ppgtt is enabled, dev_priv->gtt.total has excluded the gtt space
occupied by ppgtt table in i915_gem_init_global_gtt() function. So the
calculation of first_pd_entry_in_global_pt doesn't need to subtract
I915_PPGTT_PD_ENTRIES again. Or else PPGTT directory table will be
destroyed by global gtt allocation.

This regression has been introduced in

commit a54c0c279f3864171fe53c66e769d5a137c5c651
Author: Ben Widawsky <ben@bwidawsk.net>
Date: Thu Jan 24 14:45:00 2013 -0800

drm/i915: remove intel_gtt structure

The breakage is pretty subtile since the old gtt_total_entries
included the pde range, whereas the new on did not.

Cc: stable@vger.kernel.org
Signed-off-by: Xiong Zhang<xiong.y.zhang@intel.com>
[danvet: Add regression citation and cc: stable. Thanks to Chris for
correcting my wrong guess about which commit broke things.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 9119708c 22-Apr-2013 Kenneth Graunke <kenneth@whitecape.org>

drm/i915: Split out Haswell code from gen6_pte_encode.

Now that we have function pointers, it's cleaner to just create a new
per-platform PTE encoding function.

This should be identical in behavior to the previous code.

v2: Drop accidental inline keyword on hsw_pte_encode.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Tested-by: Daniel Leung <daniel.leung@linux.intel.com> [v1]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 93c34e70 22-Apr-2013 Kenneth Graunke <kenneth@whitecape.org>

drm/i915: Fix page table entries for Bay Trail.

On Bay Trail, bit 1 means "writeable by the GPU." Failing to set that
means basically anything using the GPU will cause hangs.

v2: Drop accidental inline keyword on byt_pte_encode.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Tested-by: Daniel Leung <daniel.leung@linux.intel.com> [v1]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 2d04befb 22-Apr-2013 Kenneth Graunke <kenneth@whitecape.org>

drm/i915: Add PTE encoding function to the gtt/ppgtt vtables.

Sandybridge/Ivybridge, Bay Trail, and Haswell all have slightly
different page table entry formats. Rather than polluting one function
with generation checks, simply use a function pointer and set up the
correct PTE encoding function at startup.

v2: Move the gen6_gtt_pte_t typedef to i915_drv.h so that the function
pointers and implementations have identical signatures. Also remove
inline keyword on gen6_pte_encode. Both suggested by Jani Nikula.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jani Nikula <jani.nikula@linux.intel.com>
Tested-by: Daniel Leung <daniel.leung@linux.intel.com> [v1]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 6a994761 09-Apr-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Remove stale code

Looks like a some remnant from a rebase.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# a6f429a5 04-Apr-2013 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Configure GAM_ECOCHK appropriatly for Gen7

IVB and HSW use different encodings for the PPGTT cacheability bits in
the GAM_ECOCHK register.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# a65c2fcd 04-Apr-2013 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Set GAC_ECO_BITS register on Gen7+

According to BSpec GAC_ECO_BITS register exists on Gen7 platforms as
well. Configure it accordingly.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 3b9d7888 04-Apr-2013 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Add ECOBITS_SNB_BIT

GAC_ECO_BITS has a bit similar to GAM_ECOCHK's ECOCHK_SNB_BIT. Add
the define, and enable it on SNB.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# b7c36d25 08-Apr-2013 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915: Allow PPGTT enable to fail

I'm really not happy that we have to support this, but this will be the
simplest way to handle cases where PPGTT init can fail, which I promise
will be coming in the future.

v2: Resolve conflicts due to patch series reordering.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 5963cf04 08-Apr-2013 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915: NULL aliasing_ppgtt on cleanup

This will allow us to carry on if we've cleaned up the PPGTT. The usage
for this is coming up - it simplifies handling a failed PPGTT init.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Spill the secrets about failing ppgtt init.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 6197349b 08-Apr-2013 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915: Abstract PPGTT enabling

Since we've already set up a nice vtable to abstract other PPGTT
functions, also abstract the actual register programming to enable
things.

This function will probably need to change a bit as we implement real
processes.

v2: Resolve conflicts due to patch series reordering.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 3ed124b2 08-Apr-2013 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915: Rework PPGTT init code

This rework will help if future platforms choose to be a bit different.
Should have no functional impact.

v2: Don't move around the vtable setup (Daniel)

v3: Squash in the disable-by-default patch.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 3eb1c005 08-Apr-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Conditionally carve out GGTT PDE

It only works that way on GEN6 and GEN7. Let's not assume GENn will be
the same.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 1e7d12d4 08-Apr-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915/ppgtt: Set scratch page "globally"

The PPGTT scratch page is used for all gens, and doing it in the global
part of our PPGTT setup makes the code a bit nicer.

This was in a patch submitted earlier as part of the PPGTT cleanups.
Grumpy maintainer must have missed it, and I didn't yell when
appropriate. Apologies for everyone :-)

v2: Update commit message

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# c81dbe05 08-Apr-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915: random checkpatch fixes

There used to be other fixes in this patch but they've slowly disappeared as
other parts have been fixed.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# e7c2b58b 08-Apr-2013 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915: Call out GEN6 PTE specificity

We can assume that the PTE layout, and size changes for future
generations. To avoid confusion with the existing GEN6 PTE typedef, give
it a GEN6_ prefix.

v2: Fixup checkpatch warning and bikeshed commit message slightly.

v3: Rebase on top of Imre's for_each_sg_pages rework.

v4: Fixup conflicts in patch series reordering.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# a93e4161 08-Apr-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915: generalize pte vs. register BAR allocation

All gen6+ parts so far have 1 BAR which holds both the register space
and the GTT PTEs. Up until now, that was a 4MB BAR with half allocated
to each.

I have a strong hunch (wink, nod, wink) that future gens will also keep
a similar 50-50 split though the sizes may change. To help this along
change the code to obey the rule of half the total size instead of a
hard-coded 2MB.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 2db76d7c 26-Mar-2013 Imre Deak <imre.deak@intel.com>

lib/scatterlist: sg_page_iter: support sg lists w/o backing pages

The i915 driver uses sg lists for memory without backing 'struct page'
pages, similarly to other IO memory regions, setting only the DMA
address for these. It does this, so that it can program the HW MMU
tables in a uniform way both for sg lists with and without backing pages.

Without a valid page pointer we can't call nth_page to get the current
page in __sg_page_iter_next, so add a helper that relevant users can
call separately. Also add a helper to get the DMA address of the current
page (idea from Daniel).

Convert all places in i915, to use the new API.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# a15326a5 19-Mar-2013 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: fixup pd vs pt confusion in gen6 ppgtt code

The index variable points at a page table, not a page directory or a
pde. Ben Widawsky fix this up correctly in his ppgtt cleanup, but I've
botched the job and copy&pasted the old confusion from the original
gen6 ppgtt code in

commit def886c3768d24c4e0aa56ff98b5a468c2b5c9bf
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Thu Jan 24 14:44:56 2013 -0800

drm/i915: vfuncs for ppgtt

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 6ddc4fc7 19-Mar-2013 Daniel Vetter <daniel.vetter@ffwll.ch>

style nit: Align function parameter continuation properly.


# 6e995e23 18-Feb-2013 Imre Deak <imre.deak@intel.com>

drm/i915: use for_each_sg_page for setting up the gtt ptes

The existing gtt setup code is correct - and so doesn't need to be fixed to
handle compact dma scatter lists similarly to the previous patches. Still,
take the for_each_sg_page macro into use, to get somewhat simpler code.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 086ddcce 01-Mar-2013 Jesse Barnes <jbarnes@virtuousgeek.org>

drm/i915: use gen6 stolen check on VLV

It uses the same bit definitions.

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 41907ddc 08-Feb-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Fix gen2 mappable calculations

When I refactored the code initially, I forgot that gen2 uses a
different bar for the CPU mappable aperture. The agp-less code knows
nothing of generations less than 5, so we have to expand the gtt_probe
function to include the mappable base and end.

It was originally broken by me:
commit baa09f5fd8a6d033ec075355dda99a65b7f6a0f3
Author: Ben Widawsky <ben@bwidawsk.net>
Date: Thu Jan 24 13:49:57 2013 -0800

drm/i915: Add probe and remove to the gtt ops

Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# d93c6233 30-Jan-2013 Changlong Xie <changlongx.xie@intel.com>

drm/i915: gen6_gmch_remove can be static

Signed-off-by: Changlong Xie <changlongx.xie@intel.com>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# e78891ca 25-Jan-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Reclaim GTT space for failed PPGTT

When the PPGTT init fails, we may as well reuse the space that we were
reserving for the PPGTT PDEs.

This also fixes an extraneous mutex_unlock.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# a54c0c27 24-Jan-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915: remove intel_gtt structure

With the probe call in our dispatch table, we can now cut away the
last three remaining members in the intel_gtt shared struct and so
remove it completely.

v2: Rebased on top of Daniel's series

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
[danvet: bikeshed commit message a bit.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# baa09f5f 24-Jan-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Add probe and remove to the gtt ops

The idea, and much of the code came originally from:

commit 0712f0249c3148d8cf42a3703403c278590d4de5
Author: Ben Widawsky <ben@bwidawsk.net>
Date: Fri Jan 18 17:23:16 2013 -0800

drm/i915: Create a vtable for i915 gtt

Daniel didn't like the color of that patch series, and so I asked him to
start something which appealed to his sense of color. The preceding
patches are those, and now this is going on top of that.

[extracted from the original commit message]

One immediately obvious thing to implement is our gmch probing. The init
function was getting massively bloated. Fundamentally, all that's needed
from GMCH probing is the GTT size, and the stolen size. It makes design
sense to put the mappable calculation in there as well, but the code
turns out a bit nicer without it (IMO)

The intel_gtt bridge thing is still here, but the subsequent patches
will finish ripping that out.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Bikeshedded one comment (GMADR is just the PCI aperture, we
use it for other things than just accessing tiled surfaces through a
linear view) and cut the newly added long lines a bit. Also one
checkpatch error.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 3440d265 24-Jan-2013 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: extract hw ppgtt setup/cleanup code

At the moment only cosmetics, but being able to initialize/cleanup
arbitrary ppgtt address spaces paves the way to have more than one of
them ... Just in case we ever get around to implementing real
per-process address spaces. Note that in that case another vfunc for
ppgtt would be beneficial though. But that can wait until the code
grows a second place which initializes ppgtts.

Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 960e3e42 24-Jan-2013 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: pte_encode is gen6+

All the other gen6+ hw code has the gen6_ prefix, so be consistent
about it.

Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# def886c3 24-Jan-2013 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: vfuncs for ppgtt

Like for the global gtt we want a notch more flexibility here. Only
big change (besides a few tiny function parameter adjustments) was to
move gen6_ppgtt_insert_entries up (and remove _sg_ from its name, we
only have one kind of insert_entries since the last gtt cleanup).

We could also extract the platform ppgtt setup/teardown code a bit
better, but I don't care that much.

With this we have the hw details of pte writing nicely hidden away
behind a bit of abstraction. Which should pave the way for
different/multiple ppgtts (e.g. what we need for real ppgtt support).

Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 7faf1ab2 24-Jan-2013 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: vfuncs for gtt_clear_range/insert_entries

We have a few too many differences here, so finally take the prepared
abstraction and run with it. A few smaller changes are required to get
things into shape:

- move i915_cache_level up since we need it in the gt funcs
- split up i915_ggtt_clear_range and move the two functions down to
where the relevant insert_entries functions are
- adjustments to a few function parameter lists

Now we have 2 functions which deal with the gen6+ global gtt
(gen6_ggtt_ prefix) and 2 functions which deal with the legacy gtt
code in the intel-gtt.c fake agp driver (i915_ggtt_ prefix).

Init is still a bit a mess, but honestly I don't care about that.

One thing I've thought about while deciding on the exact interfaces is
a flag parameter for ->clear_range: We could use that to decide
between writing invalid pte entries or scratch pte entries. In case we
ever get around to fixing all our bugs which currently prevent us from
filling the gtt with empty ptes for the truly unused ranges ...

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
[bwidawsk: Moved functions to the gtt struct]
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 8d2e6308 18-Jan-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Needs_dmar, not

The reasoning behind our code taking two paths depending upon whether or
not we may have been configured for IOMMU isn't clear to me. It should
always be safe to use the pci mapping functions as they are designed to
abstract the decision we were handling in i915.

Aside from simpler code, removing another member for the intel_gtt
struct is a nice motivation.

I ran this by Chris, and he wasn't concerned about the extra kzalloc,
and memory references vs. page_to_phys calculation in the case without
IOMMU.

v2: Update commit message

v3: Remove needs_dmar addition from Zhenyu upstream

This reverts (and then other stuff)
commit 20652097dadd9a7fb4d652f25466299974bc78f9
Author: Zhenyu Wang <zhenyuw@linux.intel.com>
Date: Thu Dec 13 23:47:47 2012 +0800

drm/i915: Fix missed needs_dmar setting

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> (v2)
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Squash in follow-up fix to remove the bogus hunk which
deleted the dma_mask configuration for gen6+.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 9c61a32d 18-Jan-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Remove scratch page from shared

We already had a mapping in both (minus the phys_addr in AGP).

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# a81cc00c 18-Jan-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Cut out the infamous ILK w/a from AGP layer

And, move it to where the rest of the logic is.

There is some slight functionality changes. There was extra paranoid
checks in AGP code making sure we never do idle maps on gen2 parts. That
was not duplicated as the simple PCI id check should do the right thing.

v2: use IS_GEN5 && IS_MOBILE check instead. For now, this is the same as
IS_IRONLAKE_M but is more future proof. The workaround docs hint that
more than one platform may be effected, but we've never seen such a
platform in the wild. (Rodrigo, Daniel)

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> (v1)
Cc: Dave Airlie <airlied@redhat.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 93d18799 17-Jan-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Remove use of gtt_mappable_entries

Mappable_end, ie. size is almost always what you want as opposed to the
number of entries. Since we already have that information, we can scrap
the number of entries and only calculate it when needed.

If gtt_start is !0, this will have slightly different behavior. This
difference can only occur in DRI1, and exists when we try to kick out
the firmware fb. The new code seems like a bugfix to me.

The other case where we've changed the behavior is during init we check
the mappable region against our current known upper and lower limits
(64MB, and 512MB). This now matches the comment, and makes things more
convenient after removing gtt_mappable_entries.

Also worth noting is the setting of mappable_end is taken out of setup
because we do it earlier now in the DRI2 case and therefore need to add
that tiny hunk to support the DRI1 IOCTL.

v2: Move up mappable end to before legacy AGP init

v3: Add the dev_priv inclusion here from previous rebase error in patch
5

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> (v2)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: squash in fix for a printk format flag mismatch warning.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# dabb7a91 17-Jan-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Remove use on gma_bus_addr on gen6+

We have enough info to not use the intel_gtt bridge stuff.

v2: Move setup of mappable_base above the legacy init stuff because we
still need that on older platforms. (Daniel)

v3: Remove the dev_priv hunk which was rebased in by accident

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> (v2)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 5d4545ae 17-Jan-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Create a gtt structure

The purpose of the gtt structure is to help isolate our gtt specific
properties from the rest of the code (in doing so it help us finish the
isolation from the AGP connection).

The following members are pulled out (and renamed):
gtt_start
gtt_total
gtt_mappable_end
gtt_mappable
gtt_base_addr
gsm

The gtt structure will serve as a nice place to put gen specific gtt
routines in upcoming patches. As far as what else I feel belongs in this
structure: it is meant to encapsulate the GTT's physical properties.
This is why I've not added fields which track various drm_mm properties,
or things like gtt_mtrr (which is itself a pretty transient field).

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
[Ben modified commit messages]
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 00fc2c3c 17-Jan-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Remove gtt_mappable_total

With the assertion from the previous patch in place, it should be safe
to get rid gtt_mappable_total. Keeps things saner to not have to track
the same info in two places.

In order to keep the diff as simple as possible and keep with the
existing gtt_setup semantics we opt to keep gtt_mappable_end. It's not
as consistent with the 'total' used in the previous patch, but that can
be fixed later.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
[Ben modified commit message]
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 35451cb6 17-Jan-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Mappable_end can't ever be > end

Both DRI1 and DRI2 can never specify a mappable size which goes past the
GTT size. Don't pretend otherwise.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# c1fc6521 17-Jan-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Kill gtt_end

It's duplicated in the more useful gtt_total.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 1c45140d 18-Dec-2012 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915: Make GSM void

The iomapping of the register region has historically been a uint32_t
for the obvious reason that our PTE size was always 4b. In the future
however, we cannot make this assumption.

By making the type void, it makes the upcoming pointer math we will do
much easier, and hopefully gives the compiler opportunities to warn us
when we do stupid things.

v2: Cast to __iomem, caught by Ville

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Fixup __iomem issue for real.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 06e5598f 18-Dec-2012 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915: Move GSM mapping into dev_priv

This removes an unused field from the AGP structure and moves it into
the dev_priv structure (with a slightly better name). This builds upon
the kill-agp series already merged.

GSM is a well defined term in the bspec:
GSM: Graphics Stolen Memory

GTT stolen space is defined for storage of the GFX GTT entries in
physical memory. IA can not access GSM directly , it can only access via
GTTMMADR. GT can access GSM directly or through GTTMMADR.

This is not the entire stolen space.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# d7e5008f 18-Dec-2012 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915: Move even more gtt code to i915_gem_gtt

This really should have been part of the kill agp series.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 079a43f6 18-Dec-2012 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915: Missed conversion to gtt_pte_t

commit f61c0609073133375ace61f74b0e4e7cf631406b
Author: Ben Widawsky <ben@bwidawsk.net>
Date: Mon Oct 22 11:44:43 2012 -0700

drm/i915: introduce gtt_pte_t

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 20652097 13-Dec-2012 Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915: Fix missed needs_dmar setting

From Ben's AGP dependence removal change, "needs_dmar" flag has not
been properly setup for new chips using new GTT init function. This
one adds missed setting of that flag to make sure we do pci mappings
with IOMMU enabled.

Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# ed2f3452 15-Nov-2012 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Avoid clearing preallocated regions from the GTT

As yet we do not do any preallocation (chicken-and-egg problem), but we
may like to preserve anything already allocated by the BIOS or grub and
reuse for own purposes after initialising the driver.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 2ff4aeac 26-Nov-2012 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Fix pte updates in ggtt clear range

This bug was introduced by me:
commit e76e9aebcdbfebae8f4cd147e3c0f800d36e97f3
Author: Ben Widawsky <ben@bwidawsk.net>
Date: Sun Nov 4 09:21:27 2012 -0800

drm/i915: Stop using AGP layer for GEN6+

The existing code uses memset_io which follows memset semantics in only
guaranteeing a write of individual bytes. Since a PTE entry is 4 bytes,
this can only be correct if the scratch page address is 0.

This caused unsightly errors when we clear the range at load time,
though I'm not really sure what the heck is referencing that memory
anyway. I caught this is because I believe we have some other bug where
the display is doing reads of memory we feel should be cleared (or we
are relying on scratch pages to be a specific value).

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# b5c62158 19-Nov-2012 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Use pci_resource functions for BARs.

This was leftover crap from kill-agp. The current code is theoretically
broken for 64b bars. (I resist removing theoretically because I am too
lazy to test).

We still need to ioremap things ourselves because we want to ioremap_wc
the PTEs.

v2: Forgot to kill the tmp variable in v1

CC: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# d640c4b0 11-Nov-2012 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Report amount of usable graphics memory in MiB

...rather than kilo-PTE.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Apply s/Usabel/usable/ bikeshed suggested by Ben Widawsky.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# ccdf56cd 06-Nov-2012 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Fix sparse warnings in from AGP kill code

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 26b1ff35 04-Nov-2012 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Move the remaining gtt code

It's pretty much all consolidated now that we've killed AGP. We can move
the one outlier, and defines too.

(Kill some unused defines in the process)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 0f9b91c7 04-Nov-2012 Ben Widawsky <ben@bwidawsk.net>

drm/i915: flush system agent TLBs on SNB

This allows us to map the PTEs WC. I've not done thorough testing or
performance measurements with this patch, but it should be decent.

This is based on a patch from Jesse with the original commit message
> I've only lightly tested this so far, but the corruption seems to be
> gone if I write the GFX_FLSH_CNTL reg after binding an object. This
> register should control the TLB for the system agent, which is what CPU
> mapped objects will go through.

It has been updated for the new AGP-less code by me, and included with
it is feedback from the original patch.

v2: Updated to reflect paranoia on pte updates/register posting reads.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by [v1]: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 03752f5b 04-Nov-2012 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Calculate correct stolen size for GEN7+

This bug existed in the old code, but was easier to fix here in the
rework. Unfortunately gen7 doesn't have a nice way to figure out the
size and we must use a lookup table.

As Jesse pointed out, there is some confusion in the docs about these
definitions. We're picking the one which seems more accurate, but we
really aren't certain.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# e76e9aeb 04-Nov-2012 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Stop using AGP layer for GEN6+

As a quick hack we make the old intel_gtt structure mutable so we can
fool a bunch of the existing code which depends on elements in that data
structure. We can/should try to remove this in a subsequent patch.

This should preserve the old gtt init behavior which upon writing these
patches seems incorrect. The next patch will fix these things.

The one exception is VLV which doesn't have the preserved flush control
write behavior. Since we want to do that for all GEN6+ stuff, we'll
handle that in a later patch. Mainstream VLV support doesn't actually
exist yet anyway.

v2: Update the comment to remove the "voodoo"
Check that the last pte written matches what we readback

v3: actually kill cache_level_to_agp_type since most of the flags will
disappear in an upcoming patch

v4: v3 was actually not what we wanted (Daniel)
Make the ggtt bind assertions better and stricter (Chris)
Fix some uncaught errors at gtt init (Chris)
Some other random stuff that Chris wanted

v5: check for i==0 in gen6_ggtt_bind_object to shut up gcc (Ben)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by [v4]: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Make the cache_level -> agp_flags conversion for pre-gen6 a
tad more robust by mapping everything != CACHE_NONE to the cached agp
flag - we have a 1:1 uncached mapping, but different modes of
cacheable (at least on later generations). Suggested by Chris Wilson.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# e7210c3c 19-Oct-2012 Ben Widawsky <ben@bwidawsk.net>

drm/i915: move more pte encoding to pte encode

In order to handle differences in pte encoding between architectures it
is desirable to have one helper function, pte_encode, do it all for us.
As such, this commit moves the code around so we're in good shape to do
that.

Luckily the ppgtt pte and the ggtt pte look very similar.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 54d12527 24-Sep-2012 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Extract PPGTT pte encoding

HSW will change the PTE encoding, and laying this out now will be
helpful when we're ready to implement that. More importantly, GGTT and
PPGTT PTE encoding is quite similar, so moving this out into a helper
function will enable us to lance the AGP layer.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# f61c0609 22-Oct-2012 Ben Widawsky <ben@bwidawsk.net>

drm/i915: introduce gtt_pte_t

This will make the calculations of size easier to read instead of just
assuming uint32_t everywhere.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 8f2c59f0 24-Sep-2012 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Add dev to ppgtt

Some subsequent commits will need to know what generation we're running
on to do different pte encoding for the ppgtt. Since it's not much
hassle or overhead to store it in the ppgtt structure, do that.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 8693607a 21-Sep-2012 Ben Widawsky <ben@bwidawsk.net>

drm/i915: No LLC_MLC for HSW.

The mid-level cache or as it's more commonly referred to now as L3, is
not setup this way on HSW.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 760285e7 02-Oct-2012 David Howells <dhowells@redhat.com>

UAPI: (Scripted) Convert #include "..." to #include <path/...> in drivers/gpu/

Convert #include "..." to #include <path/...> in drivers/gpu/.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Dave Jones <davej@redhat.com>


# 4126d5d6 02-Oct-2012 David Howells <dhowells@redhat.com>

UAPI: (Scripted) Remove redundant DRM UAPI header #inclusions from drivers/gpu/.

Remove redundant DRM UAPI header #inclusions from drivers/gpu/.

Remove redundant #inclusions of core DRM UAPI headers (drm.h, drm_mode.h and
drm_sarea.h). They are now #included via drmP.h and drm_crtc.h via a preceding
patch.

Without this patch and the patch to make include the UAPI headers from the core
headers, after the UAPI split, the DRM C sources cannot find these UAPI headers
because the DRM code relies on specific -I flags to make #include "..." work
on headers in include/drm/ - but that does not work after the UAPI split without
adding more -I flags.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Dave Jones <davej@redhat.com>


# 2f745ad3 04-Sep-2012 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Convert the dmabuf object to use the new i915_gem_object_ops

By providing a callback for when we need to bind the pages, and then
release them again later, we can shorten the amount of time we hold the
foreign pages mapped and pinned, and importantly the dmabuf objects then
behave as any other normal object with respect to the shrinker and
memory management.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 9da3da66 01-Jun-2012 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Replace the array of pages with a scatterlist

Rather than have multiple data structures for describing our page layout
in conjunction with the array of pages, we can migrate all users over to
a scatterlist.

One major advantage, other than unifying the page tracking structures,
this offers is that we replace the vmalloc'ed array (which can be up to
a megabyte in size) with a chain of individual pages which helps reduce
memory pressure.

The disadvantage is that we then do not have a simple array to iterate,
or to access randomly. The common case for this is in the relocation
processing, which will typically fit within a single scatterlist page
and so be almost the same cost as the simple array. For iterating over
the array, the extra function call could be optimised away, but in
reality is an insignificant cost of either binding the pages, or
performing the pwrite/pread.

v2: Fix drm_clflush_sg() to not invoke wbinvd as well! And fix the
trivial compile error from rebasing.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 9a0f938b 24-Aug-2012 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Use the correct size of the GTT for placing the per-process entries

The current layout is to place the per-process tables at the end of the
GTT. However, this is currently using a hardcoded maximum size for the GTT
and not taking in account limitations imposed by the BIOS. Use the value
for the total number of entries allocated in the table as provided by
the configuration registers.

Reported-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Matthew Garret <mjg@redhat.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 6c085a72 20-Aug-2012 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Track unbound pages

When dealing with a working set larger than the GATT, or even the
mappable aperture when touching through the GTT, we end up with evicting
objects only to rebind them at a new offset again later. Moving an
object into and out of the GTT requires clflushing the pages, thus
causing a double-clflush penalty for rebinding.

To avoid having to clflush on rebinding, we can track the pages as they
are evicted from the GTT and only relinquish those pages on memory
pressure.

As usual, if it were not for the handling of out-of-memory condition and
having to manually shrink our own bo caches, it would be a net reduction
of code. Alas.

Note: The patch also contains a few changes to the last-hope
evict_everything logic in i916_gem_execbuffer.c - we no longer try to
only evict the purgeable stuff in a first try (since that's superflous
and only helps in OOM corner-cases, not fragmented-gtt trashing
situations).

Also, the extraction of the get_pages retry loop from bind_to_gtt (and
other callsites) to get_pages should imo have been a separate patch.

v2: Ditch the newly added put_pages (for unbound objects only) in
i915_gem_reset. A quick irc discussion hasn't revealed any important
reason for this, so if we need this, I'd like to have a git blame'able
explanation for it.

v3: Undo the s/drm_malloc_ab/kmalloc/ in get_pages that Chris noticed.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Split out code movements and rant a bit in the commit message
with a few Notes. Done v2]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# a843af18 14-Aug-2012 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: fix hsw uncached pte

They've changed it ... for no apparent reason. Meh.

V2: remove unused 'is_hsw' field.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# f00f9791 30-Jul-2012 Dave Airlie <airlied@redhat.com>

i915: don't map imported dma-bufs for dmar.

The exporter should have given us pages in the correct place, avoid
the prepare object mapping phase on dmar systems.

This fixes an oops on a GM45/R600 machine, when running the intel/radeon
tests.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 42d6ab48 26-Jul-2012 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Segregate memory domains in the GTT using coloring

Several functions of the GPU have the restriction that differing memory
domains cannot be placed next to each other (as the GPU may prefetch
beyond the end of one domain and hang as it crosses into the other
domain). We use the facility of the drm_mm to mark ranges with a
particular color that corresponds to the cache attributes of those pages
in order to prevent allocating adjacent blocks of differing memory
types.

v2: Rebase ontop of drm_mm coloring v2.
v3: Fix rebinding existing gtt_space and add a verification routine.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 1286ff73 10-May-2012 Daniel Vetter <daniel.vetter@ffwll.ch>

i915: add dmabuf/prime buffer sharing support.

This adds handle->fd and fd->handle support to i915, this is to allow
for offloading of rendering in one direction and outputs in the other.

v2 from Daniel Vetter:
- fixup conflicts with the prepare/finish gtt prep work.
- implement ppgtt binding support.

Note that we have squat i-g-t testcoverage for any of the lifetime and
access rules dma_buf/prime support brings along. And there are quite a
few intricate situations here.

Also note that the integration with the existing code is a bit
hackish, especially around get_gtt_pages and put_gtt_pages. It imo
would be easier with the prep code from Chris Wilson's unbound series,
but that is for 3.6.

Also note that I didn't bother to put the new prepare/finish gtt hooks
to good use by moving the dma_buf_map/unmap_attachment calls in there
(like we've originally planned for).

Last but not least this patch is only compile-tested, but I've changed
very little compared to Dave Airlie's version. So there's a decent
chance v2 on drm-next works as well as v1 on 3.4-rc.

v3: Right when I've hit sent I've noticed that I've screwed up one
obj->sg_list (for dmar support) and obj->sg_table (for prime support)
disdinction. We should be able to merge these 2 paths, but that's
material for another patch.

v4: fix the error reporting bugs pointed out by ickle.

v5: fix another error, and stop non-gtt mmaps on shared objects
stop pread/pwrite on imported objects, add fake kmap

Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# b2da9fe5 26-Apr-2012 Ben Widawsky <ben@bwidawsk.net>

drm/i915: remove do_retire from i915_wait_request

This originates from a hack by me to quickly fix a bug in an earlier
patch where we needed control over whether or not waiting on a seqno
actually did any retire list processing. Since the two operations aren't
clearly related, we should pull the parameter out of the wait function,
and make the caller responsible for retiring if the action is desired.

The only function call site which did not get an explicit retire_request call
(on purpose) is i915_gem_inactive_shrink(). That code was already calling
retire_request a second time.

v2: don't modify any behavior excepit i915_gem_inactive_shrink(Daniel)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 211c568b 10-Apr-2012 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: simplify ppgtt setup

We don't need the pt_addr for the !dmar case, so drop the else and
move the if (dmar) condition out of the loop.

v2: Fixup whitespace damage noticed by Chris Wilson.

v3: Collapse the two identical if blocks. Chris Wilson makes me look
like a moron right now ...

Noticed-by: Konstantin Belousov <kostikbel@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wislon.co.uk>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 55a254ac 21-Mar-2012 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: properly restore the ppgtt page directory on resume

The ppgtt page directory lives in a snatched part of the gtt pte
range. Which naturally gets cleared on hibernate when we pull the
power. Suspend to ram (which is what I've tested) works because
despite the fact that this is a mmio region, it is actually back by
system ram.

Fix this by moving the page directory setup code to the ppgtt init
code (which gets called on resume).

This fixes hibernate on my ivb and snb.

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# d1dd20a9 26-Mar-2012 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: clear the entire gtt when using gem

We've lost our guard page somewhere in the gtt rewrite, this patch
here will restore it.

Exercised by i-g-t/tests/gem_cs_prefetch.

v2: Substract the guard page from the range we're supposed to manage
with gem. Suggested by Chris Wilson to increase the odds of old ums +
gem userspace not blowing up. To compensate for the loss of a page,
don't substract the guard page in the modeset init code any longer.

Tested-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44748
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 644ec02b 26-Mar-2012 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: s/i915_gem_do_init/i915_gem_init_global_gtt

... because this is what it actually doesn now that we have the global
gtt vs. ppgtt split.

Also move it to the other global gtt functions in i915_gem_gtt.c

Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 74898d7e 15-Feb-2012 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: bind objects to the global gtt only when needed

And track the existence of such a binding similar to the aliasing
ppgtt case. Speeds up binding/unbinding in the common case where we
only need a ppgtt binding (which is accessed in a cpu coherent fashion
by the gpu) and no gloabl gtt binding (which needs uc writes for the
ptes).

This patch just puts the required tracking in place.

v2: Check that global gtt mappings exist in the error_state capture
code (with Chris Wilson's llc reloc patches batchbuffers are no longer
relocated as mappable in all situations, so this matters). Suggested
by Chris Wilson.

v3: Adapted to Chris' latest llc-reloc patches.

v4: Fix a bug in the i915 error state capture code noticed by Chris
Wilson.

Reviewed-and-tested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 74163907 15-Feb-2012 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: split out dma mapping from global gtt bind/unbind functions

Note that there's a functional change buried in this patch wrt the ilk
dmar workaround: We now only idle the gpu while tearing down the dmar
mappings, not while clearing the gtt. Keeping the current semantics
would have made for some really ugly code and afaik the issue is only
with the dmar unmapping that needs a fully idle gpu.

Reviewed-and-tested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 7bddb01f 09-Feb-2012 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: ppgtt binding/unbinding support

This adds support to bind/unbind objects and wires it up. Objects are
only put into the ppgtt when necessary, i.e. at execbuf time.

Objects are still unconditionally put into the global gtt.

v2: Kill the quick hack and explicitly pass cache_level to ppgtt_bind
like for the global gtt function. Noticed by Chris Wilson.

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Tested-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 1d2a314c 09-Feb-2012 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: initialization/teardown for the aliasing ppgtt

This just adds the setup and teardown code for the ppgtt PDE and the
last-level pagetables, which are fixed for the entire lifetime, at
least for the moment.

v2: Kill the stray debug printk noted by and improve the pte
definitions as suggested by Chris Wilson.

v3: Clean up the aperture stealing code as noted by Ben Widawsky.

v4: Paint the init code in a more pleasing colour as suggest by Chris
Wilson.

v5: Explain the magic numbers noticed by Ben Widawsky.

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Tested-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 8436473a 24-Jan-2012 Ben Widawsky <ben@bwidawsk.net>

drm/i915: drm/i915: Fix recursive calls to unmap

After the ILK vt-d workaround patches it became clear that we had
introduced a bug. Chris Wilson tracked down the issue to recursive
calls to unmap. This happens because we try to optimize waiting on
requests by calling retire requests after the wait, which may drop the
last reference on an object and end up freeing the object (and then
unmap the object from the gtt).

After the last patch we can now choose to defer processing the retire
list.

Kudos to Chris Wilson for tracking this one down.

This patch fixes gem_unref_active_buffers from i-g-t. It was tested by
forcing do_idle_maps to true.

This also fixes tests/gem_linear_blits in intel-gpu-tools.

Reported-by: guang.a.yang@intel.com
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=42180
Reviewed-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# b93f9cf1 25-Jan-2012 Ben Widawsky <ben@bwidawsk.net>

drm/i915: argument to control retiring behavior

Sometimes it may be the case when we idle the gpu or wait on something
we don't actually want to process the retiring list. This patch allows
callers to choose the behavior.

Reviewed-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 5c042287 17-Oct-2011 Ben Widawsky <ben@bwidawsk.net>

drm/i915: ILK + VT-d workaround

Idle the GPU before doing any unmaps. We know if VT-d is in use through
an exported variable from iommu code.

This should avoid a known HW issue.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Keith Packard <keithp@keithp.com>


# e4ffd173 04-Apr-2011 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Add an interface to dynamically change the cache level

[anholt v2: Don't forget that when going from cached to uncached, we
haven't been tracking the write domain from the CPU perspective, since
we haven't needed it for GPU coherency.]

[ickle v3: We also need to make sure we relinquish any fences on older
chipsets and clear the GTT for sane domain tracking.]

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# d5bd1449 13-Apr-2011 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Split out i915_gem_gtt_rebind_object()

... in preparation for changing the cache level (and thus the flags upon
the PTEs) dynamically.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 93dfb40c 29-Mar-2011 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Rename agp_type to cache_level

... to clarify just how we use it inside the driver and remove the
confusion of the poorly matching agp_type names. We still need to
translate through agp_type for interface into the fake AGP driver.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Keith Packard <keithp@keithp.com>


# bee4a186 21-Jan-2011 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915,agp/intel: Do not clear stolen entries

We can only utilize the stolen portion of the GTT if we are in sole
charge of the hardware. This is only true if using GEM and KMS,
otherwise VESA continues to access stolen memory.

Reported-by: Arnd Bergmann <arnd@arndb.de>
Reported-by: Frederic Weisbecker <fweisbec@gmail.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>


# d9126400 11-Jan-2011 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Unmap the PCI pages after unbinding them from the GTT

Dave Airlie spotted that his ILK laptop with DMAR enabled was generating
the occasional DMAR warning.

"The ordering in the previous code was to rewrite the GTT table before
unmapping the pages and that makes sense to me."

This is his stable patch ported to d-i-n.

Reported-by: Dave Airlie <airlied@redhat.com>
Original-patch-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>


# a8e93126 08-Dec-2010 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Clear the cachelines upon resume

Required for my pineview system to not barf after resuming.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>


# 05394f39 08-Nov-2010 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Use drm_i915_gem_object as the preferred type

A glorified s/obj_priv/obj/ with a net reduction of over a 100 lines and
many characters!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>


# 185cbcb3 05-Nov-2010 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: no more agp for gem

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>


# 7c2e6fdf 06-Nov-2010 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: move gtt handling to i915_gem_gtt.c

No more drm_*_agp in i915_gem.c!

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>


# 76aaf220 05-Nov-2010 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: restore gtt on resume in the drm instead of in intel-gtt.ko

This still uses the agp functions to actually reinstate the mappings
(with a gross hack to make agp cooperate), but it wires everything
up correctly for the switchover.

The call to agp_rebind_memory can be dropped because all non-kms drivers
do all their rebinding on EnterVT.

v2: Be more paranoid and flush the chipset cache after restoring gtt
mappings.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>