#
583cc9e4 |
|
11-Sep-2023 |
Qi Zheng <zhengqi.arch@bytedance.com> |
drm/i915: dynamically allocate the i915_gem_mm shrinker In preparation for implementing lockless slab shrink, use new APIs to dynamically allocate the i915_gem_mm shrinker, so that it can be freed asynchronously via RCU. Then it doesn't need to wait for RCU read-side critical section when releasing the struct drm_i915_private. Link: https://lkml.kernel.org/r/20230911094444.68966-21-zhengqi.arch@bytedance.com Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: David Airlie <airlied@gmail.com> Cc: Abhinav Kumar <quic_abhinavk@quicinc.com> Cc: Alasdair Kergon <agk@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: Andreas Gruenbacher <agruenba@redhat.com> Cc: Anna Schumaker <anna@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bob Peterson <rpeterso@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Carlos Llamas <cmllamas@google.com> Cc: Chandan Babu R <chandan.babu@oracle.com> Cc: Chao Yu <chao@kernel.org> Cc: Chris Mason <clm@fb.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Christian Koenig <christian.koenig@amd.com> Cc: Chuck Lever <cel@kernel.org> Cc: Coly Li <colyli@suse.de> Cc: Dai Ngo <Dai.Ngo@oracle.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: "Darrick J. Wong" <djwong@kernel.org> Cc: Dave Chinner <david@fromorbit.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Sterba <dsterba@suse.com> Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Cc: Gao Xiang <hsiangkao@linux.alibaba.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Huang Rui <ray.huang@amd.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Jason Wang <jasowang@redhat.com> Cc: Jeff Layton <jlayton@kernel.org> Cc: Jeffle Xu <jefflexu@linux.alibaba.com> Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kent Overstreet <kent.overstreet@gmail.com> Cc: Kirill Tkhai <tkhai@ya.ru> Cc: Marijn Suijten <marijn.suijten@somainline.org> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Mike Snitzer <snitzer@kernel.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nadav Amit <namit@vmware.com> Cc: Neil Brown <neilb@suse.de> Cc: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> Cc: Olga Kornievskaia <kolga@netapp.com> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Richard Weinberger <richard@nod.at> Cc: Rob Clark <robdclark@gmail.com> Cc: Rob Herring <robh@kernel.org> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Sean Paul <sean@poorly.run> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Song Liu <song@kernel.org> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Steven Price <steven.price@arm.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> Cc: Tom Talpey <tom@talpey.com> Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Cc: Yue Hu <huyue2@coolpad.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
0951dce6 |
|
26-Sep-2023 |
Jonathan Cavitt <jonathan.cavitt@intel.com> |
drm/i915/gem: Make i915_gem_shrinker multi-gt aware Where applicable, use for_each_gt instead of to_gt in the i915_gem_shrinker functions to make them apply to more than just the primary GT. Specifically, this ensure i915_gem_shrink_all retires all requests across all GTs, and this makes i915_gem_shrinker_vmap unmap VMAs from all GTs. v2: Pass correct GT to intel_gt_retire_requests(Andrzej). v3: Remove unnecessary braces(Andi) v4: Undo v3 to fix build failure. Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230926093028.23614-1-nirmoy.das@intel.com
|
#
5e352e32 |
|
09-May-2023 |
Fei Yang <fei.yang@intel.com> |
drm/i915: preparation for using PAT index This patch is a preparation for replacing enum i915_cache_level with PAT index. Caching policy for buffer objects is set through the PAT index in PTE, the old i915_cache_level is not sufficient to represent all caching modes supported by the hardware. Preparing the transition by adding some platform dependent data structures and helper functions to translate the cache_level to pat_index. cachelevel_to_pat: a platform dependent array mapping cache_level to pat_index. max_pat_index: the maximum PAT index recommended in hardware specification Needed for validating the PAT index passed in from user space. i915_gem_get_pat_index: function to convert cache_level to PAT index. obj_to_i915(obj): macro moved to header file for wider usage. I915_MAX_CACHE_LEVEL: upper bound of i915_cache_level for the convenience of coding. Cc: Chris Wilson <chris.p.wilson@linux.intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Fei Yang <fei.yang@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230509165200.1740-2-fei.yang@intel.com
|
#
8e4ee5e8 |
|
30-Nov-2022 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Wrap all access to i915_vma.node.start|size We already wrap i915_vma.node.start for use with the GGTT, as there we can perform additional sanity checks that the node belongs to the GGTT and fits within the 32b registers. In the next couple of patches, we will introduce guard pages around the objects _inside_ the drm_mm_node allocation. That is we will offset the vma->pages so that the first page is at drm_mm_node.start + vma->guard (not 0 as is currently the case). All users must then not use i915_vma.node.start directly, but compute the guard offset, thus all users are converted to use a i915_vma_offset() wrapper. The notable exceptions are the selftests that are testing exact behaviour of i915_vma_pin/i915_vma_insert. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com> Co-developed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221130235805.221010-3-andi.shyti@linux.intel.com
|
#
e33c267a |
|
31-May-2022 |
Roman Gushchin <roman.gushchin@linux.dev> |
mm: shrinkers: provide shrinkers with names Currently shrinkers are anonymous objects. For debugging purposes they can be identified by count/scan function names, but it's not always useful: e.g. for superblock's shrinkers it's nice to have at least an idea of to which superblock the shrinker belongs. This commit adds names to shrinkers. register_shrinker() and prealloc_shrinker() functions are extended to take a format and arguments to master a name. In some cases it's not possible to determine a good name at the time when a shrinker is allocated. For such cases shrinker_debugfs_rename() is provided. The expected format is: <subsystem>-<shrinker_type>[:<instance>]-<id> For some shrinkers an instance can be encoded as (MAJOR:MINOR) pair. After this change the shrinker debugfs directory looks like: $ cd /sys/kernel/debug/shrinker/ $ ls dquota-cache-16 sb-devpts-28 sb-proc-47 sb-tmpfs-42 mm-shadow-18 sb-devtmpfs-5 sb-proc-48 sb-tmpfs-43 mm-zspool:zram0-34 sb-hugetlbfs-17 sb-pstore-31 sb-tmpfs-44 rcu-kfree-0 sb-hugetlbfs-33 sb-rootfs-2 sb-tmpfs-49 sb-aio-20 sb-iomem-12 sb-securityfs-6 sb-tracefs-13 sb-anon_inodefs-15 sb-mqueue-21 sb-selinuxfs-22 sb-xfs:vda1-36 sb-bdev-3 sb-nsfs-4 sb-sockfs-8 sb-zsmalloc-19 sb-bpf-32 sb-pipefs-14 sb-sysfs-26 thp-deferred_split-10 sb-btrfs:vda2-24 sb-proc-25 sb-tmpfs-1 thp-zero-9 sb-cgroup2-30 sb-proc-39 sb-tmpfs-27 xfs-buf:vda1-37 sb-configfs-23 sb-proc-41 sb-tmpfs-29 xfs-inodegc:vda1-38 sb-dax-11 sb-proc-45 sb-tmpfs-35 sb-debugfs-7 sb-proc-46 sb-tmpfs-40 [roman.gushchin@linux.dev: fix build warnings] Link: https://lkml.kernel.org/r/Yr+ZTnLb9lJk6fJO@castle Reported-by: kernel test robot <lkp@intel.com> Link: https://lkml.kernel.org/r/20220601032227.4076670-4-roman.gushchin@linux.dev Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev> Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: Dave Chinner <dchinner@redhat.com> Cc: Hillf Danton <hdanton@sina.com> Cc: Kent Overstreet <kent.overstreet@gmail.com> Cc: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
429e1fc1 |
|
03-May-2022 |
Lucas De Marchi <lucas.demarchi@intel.com> |
drm/i915/gem: Make drop_pages() return bool Commit e4e806253003 ("drm/i915: Change shrink ordering to use locking around unbinding.") changed the return type to int without changing the return values or their meaning to "0 is success". Move it back to boolean. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220503061556.513175-1-lucas.demarchi@intel.com
|
#
ffa3fe08 |
|
15-Dec-2021 |
Matthew Auld <matthew.auld@intel.com> |
drm/i915: clean up shrinker_release_pages Add some proper flags for the different modes, and shorten the name to something more snappy. Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211215110746.865-2-matthew.auld@intel.com
|
#
93544177 |
|
15-Dec-2021 |
Matthew Auld <matthew.auld@intel.com> |
drm/i915: remove writeback hook Ditch the writeback hook and drop i915_gem_object_writeback(). We already support the shrinker_release_pages hook which can just call shmem_writeback directly. Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211215110746.865-1-matthew.auld@intel.com
|
#
5c24c9d2 |
|
19-Dec-2021 |
Michał Winiarski <michal.winiarski@intel.com> |
drm/i915/gem: Use to_gt() helper for GGTT accesses GGTT is currently available both through i915->ggtt and gt->ggtt, and we eventually want to get rid of the i915->ggtt one. Use to_gt() for all i915->ggtt accesses to help with the future refactoring. Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Sujaritha Sundaresan <sujaritha.sundaresan@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211219212500.61432-4-andi.shyti@linux.intel.com
|
#
d8be1357 |
|
16-Dec-2021 |
Maarten Lankhorst <maarten.lankhorst@linux.intel.com> |
drm/i915: Add ww ctx to i915_gem_object_trylock This is required for i915_gem_evict_vm, to be able to evict the entire VM, including objects that are already locked to the current ww ctx. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211216142749.1966107-12-maarten.lankhorst@linux.intel.com
|
#
2c3849ba |
|
16-Dec-2021 |
Maarten Lankhorst <maarten.lankhorst@linux.intel.com> |
drm/i915: Trylock the object when shrinking We're working on requiring the obj->resv lock during unbind, fix the shrinker to take the object lock. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211216142749.1966107-10-maarten.lankhorst@linux.intel.com
|
#
e4e80625 |
|
16-Dec-2021 |
Maarten Lankhorst <maarten.lankhorst@linux.intel.com> |
drm/i915: Change shrink ordering to use locking around unbinding. Call drop_pages with the gem object lock held, instead of the other way around. This will allow us to drop the vma bindings with the gem object lock held. We plan to require the object lock for unpinning in the future, and this is an easy target. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211216142749.1966107-3-maarten.lankhorst@linux.intel.com
|
#
1a9c4db4 |
|
14-Dec-2021 |
Michał Winiarski <michal.winiarski@intel.com> |
drm/i915/gem: Use to_gt() helper Use to_gt() helper consistently throughout the codebase. Pure mechanical s/i915->gt/to_gt(i915). No functional changes. Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211214193346.21231-6-andi.shyti@linux.intel.com
|
#
f7fd7814 |
|
20-Oct-2021 |
Maarten Lankhorst <maarten.lankhorst@linux.intel.com> |
drm/i915: Remove dma_resv_prune The signaled bit is already used for quick testing if a fence is signaled. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Acked-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/460722/ Signed-off-by: Christian König <christian.koenig@amd.com>
|
#
5c2625c4 |
|
20-Oct-2021 |
Maarten Lankhorst <maarten.lankhorst@linux.intel.com> |
drm/i915: Remove dma_resv_prune The signaled bit is already used for quick testing if a fence is signaled. On top of that, it's a terrible abuse of dma-fence api, and in the common case where the object is already locked by the caller, the trylock will fail. If it were useful, the core dma-api would have exposed the same functionality. The fact that i915 has a dma_resv_utils.c file should be a warning that the functionality either belongs in core, or is not very useful at all. In this case the latter. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> [mlankhorst: Improve commit message] Acked-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211021103605.735002-3-maarten.lankhorst@linux.intel.com Reviewed-by: Matthew Auld <matthew.auld@intel.com> #irc
|
#
004746e4 |
|
22-Nov-2021 |
Thomas Hellström <thomas.hellstrom@linux.intel.com> |
drm/i915/ttm: Correctly handle waiting for gpu when shrinking With async migration, the shrinker may end up wanting to release the pages of an object while the migration blit is still running, since the GT migration code doesn't set up VMAs and the shrinker is thus oblivious to the fact that the GPU is still using the pages. Add waiting for gpu in the shrinker_release_pages() op and an argument to that function indicating whether the shrinker expects it to not wait for gpu. In the latter case the shrinker_release_pages() op will return -EBUSY if the object is not idle. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211122214554.371864-5-thomas.hellstrom@linux.intel.com
|
#
ebd4a8ec |
|
18-Oct-2021 |
Matthew Auld <matthew.auld@intel.com> |
drm/i915/ttm: move shrinker management into adjust_lru We currently just evict lmem objects to system memory when under memory pressure. For this case we might lack the usual object mm.pages, which effectively hides the pages from the i915-gem shrinker, until we actually "attach" the TT to the object, or in the case of lmem-only objects it just gets migrated back to lmem when touched again. For all cases we can just adjust the i915 shrinker LRU each time we also adjust the TTM LRU. The two cases we care about are: 1) When something is moved by TTM, including when initially populating an object. Importantly this covers the case where TTM moves something from lmem <-> smem, outside of the normal get_pages() interface, which should still ensure the shmem pages underneath are reclaimable. 2) When calling into i915_gem_object_unlock(). The unlock should ensure the object is removed from the shinker LRU, if it was indeed swapped out, or just purged, when the shrinker drops the object lock. v2(Thomas): - Handle managing the shrinker LRU in adjust_lru, where it is always safe to touch the object. v3(Thomas): - Pretty much a re-write. This time piggy back off the shrink_pin stuff, which actually seems to fit quite well for what we want here. v4(Thomas): - Just use a simple boolean for tracking ttm_shrinkable. v5: - Ensure we call adjust_lru when faulting the object, to ensure the pages are visible to the shrinker, if needed. - Add back the adjust_lru when in i915_ttm_move (Thomas) v6(Reported-by: kernel test robot <lkp@intel.com>): - Remove unused i915_tt Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> #v4 Link: https://patchwork.freedesktop.org/patch/msgid/20211018091055.1998191-6-matthew.auld@intel.com
|
#
e25d1ea4 |
|
18-Oct-2021 |
Matthew Auld <matthew.auld@intel.com> |
drm/i915: add some kernel-doc for shrink_pin and friends Attempt to document shrink_pin and the other relevant interfaces that interact with it, before we start messing with it. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211018091055.1998191-5-matthew.auld@intel.com
|
#
7ae03459 |
|
18-Oct-2021 |
Matthew Auld <matthew.auld@intel.com> |
drm/i915/ttm: add tt shmem backend For cached objects we can allocate our pages directly in shmem. This should make it possible(in a later patch) to utilise the existing i915-gem shrinker code for such objects. For now this is still disabled. v2(Thomas): - Add optional try_to_writeback hook for objects. Importantly we need to check if the object is even still shrinkable; in between us dropping the shrinker LRU lock and acquiring the object lock it could for example have been moved. Also we need to differentiate between "lazy" shrinking and the immediate writeback mode. Also later we need to handle objects which don't even have mm.pages, so bundling this into put_pages() would require somehow handling that edge case, hence just letting the ttm backend handle everything in try_to_writeback doesn't seem too bad. v3(Thomas): - Likely a bad idea to touch the object from the unpopulate hook, since it's not possible to hold a reference, without also creating circular dependency, so likely this is too fragile. For now just ensure we at least mark the pages as dirty/accessed when called from the shrinker on WILLNEED objects. - s/try_to_writeback/shrinker_release_pages, since this can do more than just writeback. - Get rid of do_backup boolean and just set the SWAPPED flag prior to calling unpopulate. - Keep shmem_tt as lowest priority for the TTM LRU bo_swapout walk, since these just get skipped anyway. We can try to come up with something better later. v4(Thomas): - s/PCI_DMA/DMA/. Also drop NO_KERNEL_MAPPING and NO_WARN, which apparently doesn't do anything with streaming mappings. - Just pass along the error for ->truncate, and assume nothing. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Christian König <christian.koenig@amd.com> Cc: Oak Zeng <oak.zeng@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Acked-by: Oak Zeng <oak.zeng@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211018091055.1998191-2-matthew.auld@intel.com
|
#
239f3c2e |
|
28-Jul-2021 |
Maarten Lankhorst <maarten.lankhorst@linux.intel.com> |
drm/i915: Fix runtime pm handling in i915_gem_shrink We forgot to call intel_runtime_pm_put on error, fix it! Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Fixes: cf41a8f1dc1e ("drm/i915: Finally remove obj->mm.lock.") Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: <stable@vger.kernel.org> # v5.13+ Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210830121006.2978297-9-maarten.lankhorst@linux.intel.com
|
#
0c947773 |
|
28-Jul-2021 |
Maarten Lankhorst <maarten.lankhorst@linux.intel.com> |
drm/i915: Fix runtime pm handling in i915_gem_shrink We forgot to call intel_runtime_pm_put on error, fix it! Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Fixes: cf41a8f1dc1e ("drm/i915: Finally remove obj->mm.lock.") Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: <stable@vger.kernel.org> # v5.13+ Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210830121006.2978297-9-maarten.lankhorst@linux.intel.com (cherry picked from commit 239f3c2ee18376587026efecaea5250fa5926d20) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
#
81eb1d17 |
|
11-Jul-2021 |
Gustavo A. R. Silva <gustavoars@kernel.org> |
drm/i915: Fix fall-through warning for Clang In preparation to enable -Wimplicit-fallthrough for Clang, fix a warning by explicitly adding a return; statement: drivers/gpu/drm/i915/gem/i915_gem_shrinker.c:65:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough] Link: https://github.com/KSPP/linux/issues/115 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
|
#
bc6f80cc |
|
25-Apr-2021 |
Maarten Lankhorst <maarten.lankhorst@linux.intel.com> |
drm/i915: Use trylock in shrinker for ggtt on bsw vt-d and bxt, v2. The stop_machine() lock may allocate memory, but is called inside vm->mutex, which is taken in the shrinker. This will cause a lockdep splat, as can be seen below: <4>[ 462.585762] ====================================================== <4>[ 462.585768] WARNING: possible circular locking dependency detected <4>[ 462.585773] 5.12.0-rc5-CI-Trybot_7644+ #1 Tainted: G U <4>[ 462.585779] ------------------------------------------------------ <4>[ 462.585783] i915_selftest/5540 is trying to acquire lock: <4>[ 462.585788] ffffffff826440b0 (cpu_hotplug_lock){++++}-{0:0}, at: stop_machine+0x12/0x30 <4>[ 462.585814] but task is already holding lock: <4>[ 462.585818] ffff888125369c70 (&vm->mutex/1){+.+.}-{3:3}, at: i915_vma_pin_ww+0x38e/0xb40 [i915] <4>[ 462.586301] which lock already depends on the new lock. <4>[ 462.586305] the existing dependency chain (in reverse order) is: <4>[ 462.586309] -> #2 (&vm->mutex/1){+.+.}-{3:3}: <4>[ 462.586323] i915_gem_shrinker_taints_mutex+0x2d/0x50 [i915] <4>[ 462.586719] i915_address_space_init+0x12d/0x130 [i915] <4>[ 462.587092] ppgtt_init+0x4e/0x80 [i915] <4>[ 462.587467] gen8_ppgtt_create+0x3e/0x5c0 [i915] <4>[ 462.587828] i915_ppgtt_create+0x28/0xf0 [i915] <4>[ 462.588203] intel_gt_init+0x123/0x370 [i915] <4>[ 462.588572] i915_gem_init+0x129/0x1f0 [i915] <4>[ 462.588971] i915_driver_probe+0x753/0xd80 [i915] <4>[ 462.589320] i915_pci_probe+0x43/0x1d0 [i915] <4>[ 462.589671] pci_device_probe+0x9e/0x110 <4>[ 462.589680] really_probe+0xea/0x410 <4>[ 462.589690] driver_probe_device+0xd9/0x140 <4>[ 462.589697] device_driver_attach+0x4a/0x50 <4>[ 462.589704] __driver_attach+0x83/0x140 <4>[ 462.589711] bus_for_each_dev+0x75/0xc0 <4>[ 462.589718] bus_add_driver+0x14b/0x1f0 <4>[ 462.589724] driver_register+0x66/0xb0 <4>[ 462.589731] i915_init+0x70/0x87 [i915] <4>[ 462.590053] do_one_initcall+0x56/0x2e0 <4>[ 462.590061] do_init_module+0x55/0x200 <4>[ 462.590068] load_module+0x2703/0x2990 <4>[ 462.590074] __do_sys_finit_module+0xad/0x110 <4>[ 462.590080] do_syscall_64+0x33/0x80 <4>[ 462.590089] entry_SYSCALL_64_after_hwframe+0x44/0xae <4>[ 462.590096] -> #1 (fs_reclaim){+.+.}-{0:0}: <4>[ 462.590109] fs_reclaim_acquire+0x9f/0xd0 <4>[ 462.590118] kmem_cache_alloc_trace+0x3d/0x430 <4>[ 462.590126] intel_cpuc_prepare+0x3b/0x1b0 <4>[ 462.590133] cpuhp_invoke_callback+0x9e/0x890 <4>[ 462.590141] _cpu_up+0xa4/0x130 <4>[ 462.590147] cpu_up+0x82/0x90 <4>[ 462.590153] bringup_nonboot_cpus+0x4a/0x60 <4>[ 462.590159] smp_init+0x21/0x5c <4>[ 462.590167] kernel_init_freeable+0x8a/0x1b7 <4>[ 462.590175] kernel_init+0x5/0xff <4>[ 462.590181] ret_from_fork+0x22/0x30 <4>[ 462.590187] -> #0 (cpu_hotplug_lock){++++}-{0:0}: <4>[ 462.590199] __lock_acquire+0x1520/0x2590 <4>[ 462.590207] lock_acquire+0xd1/0x3d0 <4>[ 462.590213] cpus_read_lock+0x39/0xc0 <4>[ 462.590219] stop_machine+0x12/0x30 <4>[ 462.590226] bxt_vtd_ggtt_insert_entries__BKL+0x36/0x50 [i915] <4>[ 462.590601] ggtt_bind_vma+0x5d/0x80 [i915] <4>[ 462.590970] i915_vma_bind+0xdc/0x1c0 [i915] <4>[ 462.591374] i915_vma_pin_ww+0x435/0xb40 [i915] <4>[ 462.591779] make_obj_busy+0xcb/0x330 [i915] <4>[ 462.592170] igt_mmap_offset_exhaustion+0x45f/0x4c0 [i915] <4>[ 462.592562] __i915_subtests.cold.7+0x42/0x92 [i915] <4>[ 462.592995] __run_selftests.part.3+0x10d/0x172 [i915] <4>[ 462.593428] i915_live_selftests.cold.5+0x1f/0x47 [i915] <4>[ 462.593860] i915_pci_probe+0x93/0x1d0 [i915] <4>[ 462.594210] pci_device_probe+0x9e/0x110 <4>[ 462.594217] really_probe+0xea/0x410 <4>[ 462.594226] driver_probe_device+0xd9/0x140 <4>[ 462.594233] device_driver_attach+0x4a/0x50 <4>[ 462.594240] __driver_attach+0x83/0x140 <4>[ 462.594247] bus_for_each_dev+0x75/0xc0 <4>[ 462.594254] bus_add_driver+0x14b/0x1f0 <4>[ 462.594260] driver_register+0x66/0xb0 <4>[ 462.594267] i915_init+0x70/0x87 [i915] <4>[ 462.594586] do_one_initcall+0x56/0x2e0 <4>[ 462.594592] do_init_module+0x55/0x200 <4>[ 462.594599] load_module+0x2703/0x2990 <4>[ 462.594605] __do_sys_finit_module+0xad/0x110 <4>[ 462.594612] do_syscall_64+0x33/0x80 <4>[ 462.594618] entry_SYSCALL_64_after_hwframe+0x44/0xae <4>[ 462.594625] other info that might help us debug this: <4>[ 462.594629] Chain exists of: cpu_hotplug_lock --> fs_reclaim --> &vm->mutex/1 <4>[ 462.594645] Possible unsafe locking scenario: <4>[ 462.594648] CPU0 CPU1 <4>[ 462.594652] ---- ---- <4>[ 462.594655] lock(&vm->mutex/1); <4>[ 462.594664] lock(fs_reclaim); <4>[ 462.594671] lock(&vm->mutex/1); <4>[ 462.594679] lock(cpu_hotplug_lock); <4>[ 462.594686] *** DEADLOCK *** <4>[ 462.594690] 4 locks held by i915_selftest/5540: <4>[ 462.594696] #0: ffff888100fbc240 (&dev->mutex){....}-{3:3}, at: device_driver_attach+0x18/0x50 <4>[ 462.594715] #1: ffffc900006cb9a0 (reservation_ww_class_acquire){+.+.}-{0:0}, at: make_obj_busy+0x81/0x330 [i915] <4>[ 462.595118] #2: ffff88812a6081e8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: make_obj_busy+0x21f/0x330 [i915] <4>[ 462.595519] #3: ffff888125369c70 (&vm->mutex/1){+.+.}-{3:3}, at: i915_vma_pin_ww+0x38e/0xb40 [i915] <4>[ 462.595934] stack backtrace: <4>[ 462.595939] CPU: 0 PID: 5540 Comm: i915_selftest Tainted: G U 5.12.0-rc5-CI-Trybot_7644+ #1 <4>[ 462.595947] Hardware name: GOOGLE Kefka/Kefka, BIOS MrChromebox 02/04/2018 <4>[ 462.595952] Call Trace: <4>[ 462.595961] dump_stack+0x7f/0xad <4>[ 462.595974] check_noncircular+0x12e/0x150 <4>[ 462.595982] ? save_stack.isra.17+0x3f/0x70 <4>[ 462.595991] ? drm_mm_insert_node_in_range+0x34a/0x5b0 <4>[ 462.596000] ? i915_vma_pin_ww+0x9ec/0xb40 [i915] <4>[ 462.596410] __lock_acquire+0x1520/0x2590 <4>[ 462.596419] ? do_init_module+0x55/0x200 <4>[ 462.596429] lock_acquire+0xd1/0x3d0 <4>[ 462.596435] ? stop_machine+0x12/0x30 <4>[ 462.596445] ? gen8_ggtt_insert_entries+0xf0/0xf0 [i915] <4>[ 462.596816] cpus_read_lock+0x39/0xc0 <4>[ 462.596824] ? stop_machine+0x12/0x30 <4>[ 462.596831] stop_machine+0x12/0x30 <4>[ 462.596839] bxt_vtd_ggtt_insert_entries__BKL+0x36/0x50 [i915] <4>[ 462.597210] ggtt_bind_vma+0x5d/0x80 [i915] <4>[ 462.597580] i915_vma_bind+0xdc/0x1c0 [i915] <4>[ 462.597986] i915_vma_pin_ww+0x435/0xb40 [i915] <4>[ 462.598395] ? make_obj_busy+0xcb/0x330 [i915] <4>[ 462.598786] make_obj_busy+0xcb/0x330 [i915] <4>[ 462.599180] ? 0xffffffff81000000 <4>[ 462.599187] ? debug_mutex_unlock+0x50/0xa0 <4>[ 462.599198] igt_mmap_offset_exhaustion+0x45f/0x4c0 [i915] <4>[ 462.599592] __i915_subtests.cold.7+0x42/0x92 [i915] <4>[ 462.600026] ? i915_perf_selftests+0x20/0x20 [i915] <4>[ 462.600422] ? __i915_nop_setup+0x10/0x10 [i915] <4>[ 462.600820] __run_selftests.part.3+0x10d/0x172 [i915] <4>[ 462.601253] i915_live_selftests.cold.5+0x1f/0x47 [i915] <4>[ 462.601686] i915_pci_probe+0x93/0x1d0 [i915] <4>[ 462.602037] ? _raw_spin_unlock_irqrestore+0x3d/0x60 <4>[ 462.602047] pci_device_probe+0x9e/0x110 <4>[ 462.602057] really_probe+0xea/0x410 <4>[ 462.602067] driver_probe_device+0xd9/0x140 <4>[ 462.602075] device_driver_attach+0x4a/0x50 <4>[ 462.602084] __driver_attach+0x83/0x140 <4>[ 462.602091] ? device_driver_attach+0x50/0x50 <4>[ 462.602099] ? device_driver_attach+0x50/0x50 <4>[ 462.602107] bus_for_each_dev+0x75/0xc0 <4>[ 462.602116] bus_add_driver+0x14b/0x1f0 <4>[ 462.602124] driver_register+0x66/0xb0 <4>[ 462.602133] i915_init+0x70/0x87 [i915] <4>[ 462.602453] ? 0xffffffffa0606000 <4>[ 462.602458] do_one_initcall+0x56/0x2e0 <4>[ 462.602466] ? kmem_cache_alloc_trace+0x374/0x430 <4>[ 462.602476] do_init_module+0x55/0x200 <4>[ 462.602484] load_module+0x2703/0x2990 <4>[ 462.602500] ? __do_sys_finit_module+0xad/0x110 <4>[ 462.602507] __do_sys_finit_module+0xad/0x110 <4>[ 462.602519] do_syscall_64+0x33/0x80 <4>[ 462.602527] entry_SYSCALL_64_after_hwframe+0x44/0xae <4>[ 462.602535] RIP: 0033:0x7fab69d8d89d Changes since v1: - Add lockdep annotations during init, to ensure that lockdep is primed. This also fixes a false positive when reading /proc/lockdep_stats during module reload. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210426102351.921874-1-maarten.lankhorst@linux.intel.com Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
|
#
772f7bb7 |
|
21-Apr-2021 |
Maarten Lankhorst <maarten.lankhorst@linux.intel.com> |
drm/i915: Fix docbook descriptions for i915_gem_shrinker Fixes the following htmldocs warning: drivers/gpu/drm/i915/gem/i915_gem_shrinker.c:102: warning: Function parameter or member 'ww' not described in 'i915_gem_shrink' Fixes: cf41a8f1dc1e ("drm/i915: Finally remove obj->mm.lock.") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210421120938.546076-1-maarten.lankhorst@linux.intel.com Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
270e3cc5 |
|
21-Apr-2021 |
Maarten Lankhorst <maarten.lankhorst@linux.intel.com> |
drm/i915: Fix docbook descriptions for i915_gem_shrinker Fixes the following htmldocs warning: drivers/gpu/drm/i915/gem/i915_gem_shrinker.c:102: warning: Function parameter or member 'ww' not described in 'i915_gem_shrink' Fixes: cf41a8f1dc1e ("drm/i915: Finally remove obj->mm.lock.") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210421120938.546076-1-maarten.lankhorst@linux.intel.com Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> (cherry picked from commit 772f7bb75dffd4ec90eaf411f9e09dc2429f5c81) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
#
cf41a8f1 |
|
23-Mar-2021 |
Maarten Lankhorst <maarten.lankhorst@linux.intel.com> |
drm/i915: Finally remove obj->mm.lock. With all callers and selftests fixed to use ww locking, we can now finally remove this lock. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-62-maarten.lankhorst@linux.intel.com
|
#
abd2f577 |
|
23-Mar-2021 |
Maarten Lankhorst <maarten.lankhorst@linux.intel.com> |
drm/i915: Flatten obj->mm.lock With userptr fixed, there is no need for all separate lockdep classes now, and we can remove all lockdep tricks used. A trylock in the shrinker is all we need now to flatten the locking hierarchy. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> [danvet: Resolve conflict because we don't have the patch from Chris to rebrand i915_gem_shrinker_taints_mutex to fs_reclaim_taints_mutex. It's not a bad idea, but if we do it, it should be moved to the right header. See https://lore.kernel.org/intel-gfx/20210202154318.19246-1-chris@chris-wilson.co.uk/] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-18-maarten.lankhorst@linux.intel.com
|
#
6d393ef5 |
|
22-Dec-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gem: Optimistically prune dma-resv from the shrinker. As we shrink an object, also see if we can prune the dma-resv of idle fences it is maintaining a reference to. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201223122051.4624-2-chris@chris-wilson.co.uk
|
#
09137e94 |
|
08-Jul-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gem: Unpin idle contexts from kswapd reclaim We removed retiring requests from the shrinker in order to decouple the mutexes from reclaim in preparation for unravelling the struct_mutex. The impact of not retiring is that we are much less agressive in making global objects available for shrinking, as such objects remain pinned until they are flushed by a heartbeat pulse following the last retired request along their timeline. In order to ensure that pulse occurs in time for memory reclamation, we should kick it from kswapd. The catch is that we have added some flush_work() into the retirement phase (to ensure that we reach a global idle in a timely manner), but these flush_work() are not eligible (i.e do not belong to WQ_MEM_RELCAIM) for use from inside kswapd. To avoid flushing those workqueues, we teach the retirer not to do so unless we are actually waiting, and only do the plain retire from inside the shrinker. Note that for execlists, we already retire completed contexts as they are scheduled out, so it should not be keeping global state unnecessarily pinned. The legacy ringbuffer however... References: 9e9539800dd4 ("drm/i915: Remove waiting & retiring from shrinker paths") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200708173748.32734-1-chris@chris-wilson.co.uk
|
#
a85f2228 |
|
02-Jul-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gem: Drop forced struct_mutex from shrinker_taints_mutex Since we no longer always take struct_mutex around everything, and want the freedom to create GEM objects, actually taking struct_mutex inside the lock creation ends up pulling the mutex inside other looks. Since we don't use generally use struct_mutex, we can relax the tainting. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200702083225.20044-3-chris@chris-wilson.co.uk
|
#
9da0ea09 |
|
01-Apr-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gem: Drop cached obj->bind_count We cached the number of vma bound to the object in order to speed up shrinker decisions. This has been superseded by being more proactive in removing objects we cannot shrink from the shrinker lists, and so we can drop the clumsy attempt at atomically counting the bind count and comparing it to the number of pinned mappings of the object. This will only get more clumsier with asynchronous binding and unbinding. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200401223924.16667-1-chris@chris-wilson.co.uk
|
#
83d2bdb6 |
|
25-Feb-2020 |
Jani Nikula <jani.nikula@intel.com> |
drm/i915: significantly reduce the use of <drm/i915_drm.h> The #include has been splattered all over the place, but there are precious few places, all .c files, that actually need it. v2: remove leftover double newlines Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200225133131.3301-1-jani.nikula@intel.com
|
#
23873426 |
|
21-Feb-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Avoid recursing onto active vma from the shrinker We mark the vma as active while binding it in order to protect outselves from being shrunk under mempressure. This only works if we are strict in not attempting to shrink active objects. <6> [472.618968] Workqueue: events_unbound fence_work [i915] <4> [472.618970] Call Trace: <4> [472.618974] ? __schedule+0x2e5/0x810 <4> [472.618978] schedule+0x37/0xe0 <4> [472.618982] schedule_preempt_disabled+0xf/0x20 <4> [472.618984] __mutex_lock+0x281/0x9c0 <4> [472.618987] ? mark_held_locks+0x49/0x70 <4> [472.618989] ? _raw_spin_unlock_irqrestore+0x47/0x60 <4> [472.619038] ? i915_vma_unbind+0xae/0x110 [i915] <4> [472.619084] ? i915_vma_unbind+0xae/0x110 [i915] <4> [472.619122] i915_vma_unbind+0xae/0x110 [i915] <4> [472.619165] i915_gem_object_unbind+0x1dc/0x400 [i915] <4> [472.619208] i915_gem_shrink+0x328/0x660 [i915] <4> [472.619250] ? i915_gem_shrink_all+0x38/0x60 [i915] <4> [472.619282] i915_gem_shrink_all+0x38/0x60 [i915] <4> [472.619325] vm_alloc_page.constprop.25+0x1aa/0x240 [i915] <4> [472.619330] ? rcu_read_lock_sched_held+0x4d/0x80 <4> [472.619363] ? __alloc_pd+0xb/0x30 [i915] <4> [472.619366] ? module_assert_mutex_or_preempt+0xf/0x30 <4> [472.619368] ? __module_address+0x23/0xe0 <4> [472.619371] ? is_module_address+0x26/0x40 <4> [472.619374] ? static_obj+0x34/0x50 <4> [472.619376] ? lockdep_init_map+0x4d/0x1e0 <4> [472.619407] setup_page_dma+0xd/0x90 [i915] <4> [472.619437] alloc_pd+0x29/0x50 [i915] <4> [472.619470] __gen8_ppgtt_alloc+0x443/0x6b0 [i915] <4> [472.619503] gen8_ppgtt_alloc+0xd7/0x300 [i915] <4> [472.619535] ppgtt_bind_vma+0x2a/0xe0 [i915] <4> [472.619577] __vma_bind+0x26/0x40 [i915] <4> [472.619611] fence_work+0x1c/0x90 [i915] <4> [472.619617] process_one_work+0x26a/0x620 Fixes: 2850748ef876 ("drm/i915: Pull i915_vma_pin under the vm->mutex") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200221221818.2861432-1-chris@chris-wilson.co.uk (cherry picked from commit 6f24e41022f28061368776ea1514db0a6e67a9b1) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
#
6f24e410 |
|
21-Feb-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Avoid recursing onto active vma from the shrinker We mark the vma as active while binding it in order to protect outselves from being shrunk under mempressure. This only works if we are strict in not attempting to shrink active objects. <6> [472.618968] Workqueue: events_unbound fence_work [i915] <4> [472.618970] Call Trace: <4> [472.618974] ? __schedule+0x2e5/0x810 <4> [472.618978] schedule+0x37/0xe0 <4> [472.618982] schedule_preempt_disabled+0xf/0x20 <4> [472.618984] __mutex_lock+0x281/0x9c0 <4> [472.618987] ? mark_held_locks+0x49/0x70 <4> [472.618989] ? _raw_spin_unlock_irqrestore+0x47/0x60 <4> [472.619038] ? i915_vma_unbind+0xae/0x110 [i915] <4> [472.619084] ? i915_vma_unbind+0xae/0x110 [i915] <4> [472.619122] i915_vma_unbind+0xae/0x110 [i915] <4> [472.619165] i915_gem_object_unbind+0x1dc/0x400 [i915] <4> [472.619208] i915_gem_shrink+0x328/0x660 [i915] <4> [472.619250] ? i915_gem_shrink_all+0x38/0x60 [i915] <4> [472.619282] i915_gem_shrink_all+0x38/0x60 [i915] <4> [472.619325] vm_alloc_page.constprop.25+0x1aa/0x240 [i915] <4> [472.619330] ? rcu_read_lock_sched_held+0x4d/0x80 <4> [472.619363] ? __alloc_pd+0xb/0x30 [i915] <4> [472.619366] ? module_assert_mutex_or_preempt+0xf/0x30 <4> [472.619368] ? __module_address+0x23/0xe0 <4> [472.619371] ? is_module_address+0x26/0x40 <4> [472.619374] ? static_obj+0x34/0x50 <4> [472.619376] ? lockdep_init_map+0x4d/0x1e0 <4> [472.619407] setup_page_dma+0xd/0x90 [i915] <4> [472.619437] alloc_pd+0x29/0x50 [i915] <4> [472.619470] __gen8_ppgtt_alloc+0x443/0x6b0 [i915] <4> [472.619503] gen8_ppgtt_alloc+0xd7/0x300 [i915] <4> [472.619535] ppgtt_bind_vma+0x2a/0xe0 [i915] <4> [472.619577] __vma_bind+0x26/0x40 [i915] <4> [472.619611] fence_work+0x1c/0x90 [i915] <4> [472.619617] process_one_work+0x26a/0x620 Fixes: 2850748ef876 ("drm/i915: Pull i915_vma_pin under the vm->mutex") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200221221818.2861432-1-chris@chris-wilson.co.uk
|
#
85c823ac |
|
14-Jan-2020 |
Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com> |
drm/i915/gem: Make WARN* drm specific where drm_priv ptr is available drm specific WARN* calls include device information in the backtrace, so we know what device the warnings originate from. Covert all the calls of WARN* with device specific drm_WARN* variants in functions where drm_i915_private struct pointer is readily available. The conversion was done automatically with below coccinelle semantic patch. checkpatch errors/warnings are fixed manually. @rule1@ identifier func, T; @@ func(...) { ... struct drm_i915_private *T = ...; <+... ( -WARN( +drm_WARN(&T->drm, ...) | -WARN_ON( +drm_WARN_ON(&T->drm, ...) | -WARN_ONCE( +drm_WARN_ONCE(&T->drm, ...) | -WARN_ON_ONCE( +drm_WARN_ON_ONCE(&T->drm, ...) ) ...+> } @rule2@ identifier func, T; @@ func(struct drm_i915_private *T,...) { <+... ( -WARN( +drm_WARN(&T->drm, ...) | -WARN_ON( +drm_WARN_ON(&T->drm, ...) | -WARN_ONCE( +drm_WARN_ONCE(&T->drm, ...) | -WARN_ON_ONCE( +drm_WARN_ON_ONCE(&T->drm, ...) ) ...+> } command: spatch --sp-file <script> --dir drivers/gpu/drm/i915/gem \ --linux-spacing --in-place Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200115034455.17658-6-pankaj.laxminarayan.bharadiya@intel.com
|
#
f86dbacb |
|
05-Nov-2019 |
Daniel Vetter <daniel.vetter@ffwll.ch> |
drm/i915: Switch obj->mm.lock lockdep annotations on its head The trouble with having a plain nesting flag for locks which do not naturally nest (unlike block devices and their partitions, which is the original motivation for nesting levels) is that lockdep will never spot a true deadlock if you screw up. This patch is an attempt at trying better, by highlighting a bit more of the actual nature of the nesting that's going on. Essentially we have two kinds of objects: - objects without pages allocated, which cannot be on any lru and are hence inaccessible to the shrinker. - objects which have pages allocated, which are on an lru, and which the shrinker can decide to throw out. For the former type of object, memory allocations while holding obj->mm.lock are permissible. For the latter they are not. And get/put_pages transitions between the two types of objects. This is still not entirely fool-proof since the rules might change. But as long as we run such a code ever at runtime lockdep should be able to observe the inconsistency and complain (like with any other lockdep class that we've split up in multiple classes). But there are a few clear benefits: - We can drop the nesting flag parameter from __i915_gem_object_put_pages, because that function by definition is never going allocate memory, and calling it on an object which doesn't have its pages allocated would be a bug. - We strictly catch more bugs, since there's not only one place in the entire tree which is annotated with the special class. All the other places that had explicit lockdep nesting annotations we're now going to leave up to lockdep again. - Specifically this catches stuff like calling get_pages from put_pages (which isn't really a good idea, if we can call get_pages so could the shrinker). I've seen patches do exactly that. Of course I fully expect CI will show me for the fool I am with this one here :-) v2: There can only be one (lockdep only has a cache for the first subclass, not for deeper ones, and we don't want to make these locks even slower). Still separate enums for better documentation. Real fix: don't forget about phys objs and pin_map(), and fix the shrinker to have the right annotations ... silly me. v3: Forgot usertptr too ... v4: Improve comment for pages_pin_count, drop the IMPORTANT comment and instead prime lockdep (Chris). v5: Appease checkpatch, no double empty lines (Chris) v6: More rebasing over selftest changes. Also somehow I forgot to push this patch :-/ Also format comments consistently while at it. v7: Fix typo in commit message (Joonas) Also drop the priming, with the lmem merge we now have allocations while holding the lmem lock, which wreaks the generic priming I've done in earlier patches. Should probably be resurrected when lmem is fixed. See commit 232a6ebae419193f5b8da4fa869ae5089ab105c2 Author: Matthew Auld <matthew.auld@intel.com> Date: Tue Oct 8 17:01:14 2019 +0100 drm/i915: introduce intel_memory_region I'm keeping the priming patch locally so it wont get lost. Cc: Matthew Auld <matthew.auld@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: "Tang, CQ" <cq.tang@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v5) Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (v6) Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191105090148.30269-1-daniel.vetter@ffwll.ch [mlankhorst: Fix commit typos pointed out by Michael Ruhl]
|
#
5facae4f |
|
18-Sep-2019 |
Qian Cai <cai@lca.pw> |
locking/lockdep: Remove unused @nested argument from lock_release() Since the following commit: b4adfe8e05f1 ("locking/lockdep: Remove unused argument in __lock_release") @nested is no longer used in lock_release(), so remove it from all lock_release() calls and friends. Signed-off-by: Qian Cai <cai@lca.pw> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will@kernel.org> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: airlied@linux.ie Cc: akpm@linux-foundation.org Cc: alexander.levin@microsoft.com Cc: daniel@iogearbox.net Cc: davem@davemloft.net Cc: dri-devel@lists.freedesktop.org Cc: duyuyang@gmail.com Cc: gregkh@linuxfoundation.org Cc: hannes@cmpxchg.org Cc: intel-gfx@lists.freedesktop.org Cc: jack@suse.com Cc: jlbec@evilplan.or Cc: joonas.lahtinen@linux.intel.com Cc: joseph.qi@linux.alibaba.com Cc: jslaby@suse.com Cc: juri.lelli@redhat.com Cc: maarten.lankhorst@linux.intel.com Cc: mark@fasheh.com Cc: mhocko@kernel.org Cc: mripard@kernel.org Cc: ocfs2-devel@oss.oracle.com Cc: rodrigo.vivi@intel.com Cc: sean@poorly.run Cc: st@kernel.org Cc: tj@kernel.org Cc: tytso@mit.edu Cc: vdavydov.dev@gmail.com Cc: vincent.guittot@linaro.org Cc: viro@zeniv.linux.org.uk Link: https://lkml.kernel.org/r/1568909380-32199-1-git-send-email-cai@lca.pw Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
2850748e |
|
04-Oct-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Pull i915_vma_pin under the vm->mutex Replace the struct_mutex requirement for pinning the i915_vma with the local vm->mutex instead. Note that the vm->mutex is tainted by the shrinker (we require unbinding from inside fs-reclaim) and so we cannot allocate while holding that mutex. Instead we have to preallocate workers to do allocate and apply the PTE updates after we have we reserved their slot in the drm_mm (using fences to order the PTE writes with the GPU work and with later unbind). In adding the asynchronous vma binding, one subtle requirement is to avoid coupling the binding fence into the backing object->resv. That is the asynchronous binding only applies to the vma timeline itself and not to the pages as that is a more global timeline (the binding of one vma does not need to be ordered with another vma, nor does the implicit GEM fencing depend on a vma, only on writes to the backing store). Keeping the vma binding distinct from the backing store timelines is verified by a number of async gem_exec_fence and gem_exec_schedule tests. The way we do this is quite simple, we keep the fence for the vma binding separate and only wait on it as required, and never add it to the obj->resv itself. Another consequence in reducing the locking around the vma is the destruction of the vma is no longer globally serialised by struct_mutex. A natural solution would be to add a kref to i915_vma, but that requires decoupling the reference cycles, possibly by introducing a new i915_mm_pages object that is own by both obj->mm and vma->pages. However, we have not taken that route due to the overshadowing lmem/ttm discussions, and instead play a series of complicated games with trylocks to (hopefully) ensure that only one destruction path is called! v2: Add some commentary, and some helpers to reduce patch churn. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-4-chris@chris-wilson.co.uk
|
#
99013b10 |
|
10-Sep-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Make shrink/unshrink be atomic Add an atomic counter and always take the spinlock around the pin/unpin events, so that we can perform the list manipulation concurrently. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190910212204.17190-1-chris@chris-wilson.co.uk
|
#
5a90606d |
|
01-Sep-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Replace obj->pin_global with obj->frontbuffer obj->pin_global was originally used as a means to keep the shrinker off the active scanout, but we use the vma->pin_count itself for that and the obj->frontbuffer to delay shrinking active framebuffers. The other role that obj->pin_global gained was for spotting display objects inside GEM and working harder to keep those coherent; for which we can again simply inspect obj->frontbuffer directly. Coming up next, we will want to manipulate the pin_global counter outside of the principle locks, so would need to make pin_global atomic. However, since obj->frontbuffer is already managed atomically, it makes sense to use that the primary key for display objects instead of having pin_global. Ville pointed out the principle difference is that obj->frontbuffer is set for as long as an intel_framebuffer is attached to an object, but obj->pin_global was only raised for as long as the object was active. In practice, this means that we consider the object as being on the scanout for longer than is strictly required, causing us to be more proactive in flushing -- though it should be true that we would have flushed eventually when the back became the front, except that on the flip path that flush is async but when hit from another ioctl it will be synchronous. v2: i915_gem_object_is_framebuffer() Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190902040303.14195-5-chris@chris-wilson.co.uk
|
#
c29579d2 |
|
06-Aug-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gem: Make caps.scheduler static We do not notify userspace when the scheduler capabilities are changed (due to wedging the driver) and as such userspace will expect the caps to be static and unchanging. Make it so, and so we only need to compute our caps once during driver registration. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190806124300.24945-1-chris@chris-wilson.co.uk
|
#
1aff1903 |
|
02-Aug-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Hide unshrinkable context objects from the shrinker The shrinker cannot touch objects used by the contexts (logical state and ring). Currently we mark those as "pin_global" to let the shrinker skip over them, however, if we remove them from the shrinker lists entirely, we don't event have to include them in our shrink accounting. By keeping the unshrinkable objects in our shrinker tracking, we report a large number of objects available to be shrunk, and leave the shrinker deeply unsatisfied when we fail to reclaim those. The shrinker will persist in trying to reclaim the unavailable objects, forcing the system into a livelock (not even hitting the dread oomkiller). v2: Extend unshrinkable protection for perma-pinned scratch and guc allocations (Tvrtko) v3: Notice that we should be pinned when marking unshrinkable and so the link cannot be empty; merge duplicate paths. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190802212137.22207-1-chris@chris-wilson.co.uk
|
#
c03467ba |
|
03-Jul-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gem: Free pages before rcu-freeing the object As we have dropped the final reference to the object, we do not need to wait until after the rcu grace period to drop its pages. We still require struct_mutex to completely unbind the object to release the pages, so we still need a free-worker to manage that from process context. By scheduling the release of pages before waiting for the rcu should mean that we are not trapping those pages from beyond the reach of the shrinker. v2: Pass along the request to skip if the vma is busy to the underlying unbind routine, to avoid checking the reservation underneath the i915->mm.obj_lock which may be used from inside irq context. v3: Flip the bit for unbinding while active, for later convenience. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111035 Fixes: a93615f900bd ("drm/i915: Throw away the active object retirement complexity") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190703091726.11690-6-chris@chris-wilson.co.uk
|
#
a93615f9 |
|
21-Jun-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Throw away the active object retirement complexity Remove the accumulated optimisations that we have for i915_vma_retire and reduce it to the bare essential of tracking the active object reference. This allows us to only use atomic operations, and so will be able to avoid the struct_mutex requirement. The principal loss here is the shrinker MRU bumping, so now if we have to shrink, we will do so in much more random order and more likely to try and shrink recently used objects. That is a nuisance, but shrinking active objects is a second step we try to avoid and will always be a system-wide performance issue. The other loss is here is in the automatic pruning of the reservation_object when idling. This is not as large an issue as upon reservation_object introduction as now adding new fences into the object replaces already signaled fences, keeping the array compact. But we do lose the auto-expiration of stale fences and unused arrays. That may be a noticeable problem for which we need to re-implement autopruning. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190621183801.23252-3-chris@chris-wilson.co.uk
|
#
9e953980 |
|
21-Jun-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Remove waiting & retiring from shrinker paths i915_gem_wait_for_idle() and i915_retire_requests() introduce a dependency on the timeline->mutex. This is problematic as we want to later perform allocations underneath i915_active.mutex, forming a link between the shrinker, the timeline and active mutexes. Nip this cycle in the bud by removing the acquisition of the timeline mutex (i.e. retiring) from inside the shrinker. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190621183801.23252-1-chris@chris-wilson.co.uk
|
#
0bd6cb6b |
|
18-Jun-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Skip shrinking already freed pages Previously, we wanted to shrink the pages of freed objects before they were finally RCU collected. However, by removing the struct_mutex serialisation around the active reference, we need to acquire an extra reference around the wait. Unfortunately this means that we have to skip objects that are waiting RCU collection. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110937 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190618074153.16055-2-chris@chris-wilson.co.uk
|
#
ce476c80 |
|
14-Jun-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Keep contexts pinned until after the next kernel context switch We need to keep the context image pinned in memory until after the GPU has finished writing into it. Since it continues to write as we signal the final breadcrumb, we need to keep it pinned until the request after it is complete. Currently we know the order in which requests execute on each engine, and so to remove that presumption we need to identify a request/context-switch we know must occur after our completion. Any request queued after the signal must imply a context switch, for simplicity we use a fresh request from the kernel context. The sequence of operations for keeping the context pinned until saved is: - On context activation, we preallocate a node for each physical engine the context may operate on. This is to avoid allocations during unpinning, which may be from inside FS_RECLAIM context (aka the shrinker) - On context deactivation on retirement of the last active request (which is before we know the context has been saved), we add the preallocated node onto a barrier list on each engine - On engine idling, we emit a switch to kernel context. When this switch completes, we know that all previous contexts must have been saved, and so on retiring this request we can finally unpin all the contexts that were marked as deactivated prior to the switch. We can enhance this in future by flushing all the idle contexts on a regular heartbeat pulse of a switch to kernel context, which will also be used to check for hung engines. v2: intel_context_active_acquire/_release Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190614164606.15633-1-chris@chris-wilson.co.uk
|
#
c447ff7d |
|
13-Jun-2019 |
Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> |
drm/i915: update with_intel_runtime_pm to use the rpm structure Matching the underlying get/put functions. Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Imre Deak <imre.deak@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190613232156.34940-8-daniele.ceraolospurio@intel.com
|
#
d858d569 |
|
13-Jun-2019 |
Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> |
drm/i915: update rpm_get/put to use the rpm structure The functions where internally already only using the structure, so we need to just flip the interface. v2: rebase Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Imre Deak <imre.deak@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190613232156.34940-7-daniele.ceraolospurio@intel.com
|
#
70972f51 |
|
12-Jun-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: kerneldoc warnings squelched drivers/gpu/drm/i915//gem/i915_gem_shrinker.c:142: warning: Function parameter or member 'shrink' not described in 'i915_gem_shrink' drivers/gpu/drm/i915//gem/i915_gem_shrinker.c:142: warning: Excess function parameter 'flags' description in 'i915_gem_shrink' drivers/gpu/drm/i915//intel_display.c:13443: warning: Function parameter or member '_state' not described in 'intel_atomic_check' drivers/gpu/drm/i915//intel_display.c:13443: warning: Excess function parameter 'state' description in 'intel_atomic_check' Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190612151311.30295-1-chris@chris-wilson.co.uk
|
#
ecab9be1 |
|
12-Jun-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Combine unbound/bound list tracking for objects With async binding, we don't want to manage a bound/unbound list as we may end up running before we even acquire the pages. All that is required is keeping track of shrinkable objects, so reduce it to the minimum list. Fixes: 6951e5893b48 ("drm/i915: Move GEM object domain management from struct_mutex to local") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190612105720.30310-1-chris@chris-wilson.co.uk
|
#
a8cff4c8 |
|
10-Jun-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Promote i915->mm.obj_lock to be irqsafe The intent is to be able to update the mm.lists from inside an irqsoff section (e.g. from a softirq rcu workqueue), ergo we need to make the i915->mm.obj_lock irqsafe. v2: can_discard_pages() ensures we are shrinkable v3: Beware shadowing of 'flags' Fixes: 3b4fa9640ccd ("drm/i915: Track the purgeable objects on a separate eviction list") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110869 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matthew Auld <matthew.william.auld@gmail.com> Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190610145430.17717-1-chris@chris-wilson.co.uk
|
#
d82b4b26 |
|
30-May-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Report all objects with allocated pages to the shrinker Currently, we try to report to the shrinker the precise number of objects (pages) that are available to be reaped at this moment. This requires searching all objects with allocated pages to see if they fulfill the search criteria, and this count is performed quite frequently. (The shrinker tries to free ~128 pages on each invocation, before which we count all the objects; counting takes longer than unbinding the objects!) If we take the pragmatic view that with sufficient desire, all objects are eventually reapable (they become inactive, or no longer used as framebuffer etc), we can simply return the count of pinned pages maintained during get_pages/put_pages rather than walk the lists every time. The downside is that we may (slightly) over-report the number of objects/pages we could shrink and so penalize ourselves by shrinking more than required. This is mitigated by keeping the order in which we shrink objects such that we avoid penalizing active and frequently used objects, and if memory is so tight that we need to free them we would need to anyway. v2: Only expose shrinkable objects to the shrinker; a small reduction in not considering stolen and foreign objects. v3: Restore the tracking from a "backup" copy from before the gem/ split Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190530203500.26272-2-chris@chris-wilson.co.uk
|
#
3b4fa964 |
|
30-May-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Track the purgeable objects on a separate eviction list Currently the purgeable objects, I915_MADV_DONTNEED, are mixed in the normal bound/unbound lists. Every shrinker pass starts with an attempt to purge from this set of unneeded objects, which entails us doing a walk over both lists looking for any candidates. If there are none, and since we are shrinking we can reasonably assume that the lists are full!, this becomes a very slow futile walk. If we separate out the purgeable objects into own list, this search then becomes its own phase that is preferentially handled during shrinking. Instead the cost becomes that we then need to filter the purgeable list if we want to distinguish between bound and unbound objects. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matthew Auld <matthew.william.auld@gmail.com> Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190530203500.26272-1-chris@chris-wilson.co.uk
|
#
10be98a7 |
|
28-May-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Move more GEM objects under gem/ Continuing the theme of separating out the GEM clutter. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-8-chris@chris-wilson.co.uk
|