History log of /linux-master/drivers/gpu/drm/msm/msm_ringbuffer.c
Revision Date Author Comments
# 917e9b7c 09-Jan-2024 Rob Clark <robdclark@chromium.org>

Revert "drm/msm/gpu: Push gpu lock down past runpm"

This reverts commit abe2023b4cea192ab266b351fd38dc9dbd846df0.

Changing the locking order means that scheduler/msm_job_run() can race
with the recovery kthread worker, with the result that the GPU gets an
extra runpm get when we are trying to power it off, leaving the GPU in
an unrecovered state.

I'll need to come up with a different scheme for appeasing lockdep.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/573835/


# 2d7d2c4e 20-Nov-2023 Rob Clark <robdclark@chromium.org>

drm/msm/gem: Split out submit_unpin_objects() helper

Untangle unpinning from the unlock/unref loop. The unpin only happens in
error paths, so it is easier to decouple from the normal unlock path.

Since we never have an intermediate state where a subset of buffers
are pinned (ie. we never bail out of the pin or unpin loops) we can
replace the bo state flag bit with a global flag in the submit.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/568335/


# a6149f03 30-Oct-2023 Matthew Brost <matthew.brost@intel.com>

drm/sched: Convert drm scheduler to use a work queue rather than kthread

In Xe, the new Intel GPU driver, a choice was made to have a 1 to 1
mapping between a drm_gpu_scheduler and drm_sched_entity. At first this
seems a bit odd, but the reasoning is explained below.

1. In Xe the submission order from multiple drm_sched_entity instances is
not guaranteed to match completion order, even when targeting the same
hardware engine. This is because Xe has a firmware scheduler, the GuC,
which is allowed to reorder, timeslice, and preempt submissions. If a
shared drm_gpu_scheduler is used across multiple drm_sched_entity
instances, the TDR falls apart, as the TDR expects submission order ==
completion order. Using a dedicated drm_gpu_scheduler per
drm_sched_entity solves this problem.

2. In Xe, submissions are done by programming a ring buffer (circular
buffer), and a drm_gpu_scheduler provides a limit on the number of
in-flight jobs. If that limit is set to RING_SIZE / MAX_SIZE_PER_JOB, we
get flow control on the ring for free.

A problem with this design is that currently a drm_gpu_scheduler uses a
kthread for submission / job cleanup. This doesn't scale if a large
number of drm_gpu_schedulers are used. To work around the scaling issue,
use a worker rather than a kthread for submission / job cleanup.
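
A rough sketch of the two ideas together (hypothetical names and
structure, not the actual Xe or drm/sched code):

    #include <linux/workqueue.h>

    /* Flow control falls out of the job limit: */
    #define RING_SIZE         0x4000
    #define MAX_SIZE_PER_JOB  0x100
    #define JOB_LIMIT         (RING_SIZE / MAX_SIZE_PER_JOB)

    /* Replace the per-scheduler kthread with a work item, so a large
     * number of schedulers share the kernel's workqueue threads: */
    struct sched_sketch {
        struct workqueue_struct *submit_wq; /* ordered: keeps submit order */
        struct work_struct run_job_work;
    };

    static void run_job_worker(struct work_struct *w)
    {
        struct sched_sketch *s =
            container_of(w, struct sched_sketch, run_job_work);
        /* pop at most one ready job and queue it to the hw ring */
    }

    static int sched_sketch_init(struct sched_sketch *s)
    {
        s->submit_wq = alloc_ordered_workqueue("sched-submit", 0);
        if (!s->submit_wq)
            return -ENOMEM;
        INIT_WORK(&s->run_job_work, run_job_worker);
        return 0;
    }

    /* wake-up path, called whenever a job becomes ready: */
    static void sched_sketch_kick(struct sched_sketch *s)
    {
        queue_work(s->submit_wq, &s->run_job_work);
    }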

v2:
- (Rob Clark) Fix msm build
- Pass in run work queue
v3:
- (Boris) don't have loop in worker
v4:
- (Tvrtko) break out submit ready, stop, start helpers into own patch
v5:
- (Boris) default to ordered work queue
v6:
- (Luben / checkpatch) fix alignment in msm_ringbuffer.c
- (Luben) s/drm_sched_submit_queue/drm_sched_wqueue_enqueue
- (Luben) Update comment for drm_sched_wqueue_enqueue
- (Luben) Positive check for submit_wq in drm_sched_init
- (Luben) s/alloc_submit_wq/own_submit_wq
v7:
- (Luben) s/drm_sched_wqueue_enqueue/drm_sched_run_job_queue
v8:
- (Luben) Adjust var names / comments

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Link: https://lore.kernel.org/r/20231031032439.1558703-3-matthew.brost@intel.com
Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>


# 56e44960 14-Oct-2023 Luben Tuikov <luben.tuikov@amd.com>

drm/sched: Convert the GPU scheduler to variable number of run-queues

The GPU scheduler now has a variable number of run-queues, which are set up at
drm_sched_init() time. This way, each driver announces how many run-queues it
requires (supports) for each GPU scheduler it creates. Note that run-queues
correspond to scheduler "priorities", thus if the number of run-queues is set
to 1 at drm_sched_init(), then that scheduler supports a single run-queue,
i.e. a single "priority". If a driver further sets a single entity per
run-queue, then this creates a 1-to-1 correspondence between a scheduler and
a scheduled entity.
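
A sketch of what the conversion implies (simplified; actual field and
function names in drm/sched may differ slightly):

    /* The fixed-size run-queue array becomes dynamically sized, one
     * run-queue per priority level requested by the driver: */
    struct gpu_scheduler_sketch {
        struct drm_sched_rq **sched_rq;  /* [num_rqs] */
        u32 num_rqs;                     /* requested at init time */
    };

    /* inside drm_sched_init(), roughly: */
    sched->sched_rq = kmalloc_array(num_rqs, sizeof(*sched->sched_rq),
                                    GFP_KERNEL | __GFP_ZERO);
    if (!sched->sched_rq)
        return -ENOMEM;
    sched->num_rqs = num_rqs;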

Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Russell King <linux+etnaviv@armlinux.org.uk>
Cc: Qiang Yu <yuq825@gmail.com>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Abhinav Kumar <quic_abhinavk@quicinc.com>
Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Cc: Danilo Krummrich <dakr@redhat.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Boris Brezillon <boris.brezillon@collabora.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Emma Anholt <emma@anholt.net>
Cc: etnaviv@lists.freedesktop.org
Cc: lima@lists.freedesktop.org
Cc: linux-arm-msm@vger.kernel.org
Cc: freedreno@lists.freedesktop.org
Cc: nouveau@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/r/20231023032251.164775-1-luben.tuikov@amd.com


# abe2023b 10-Aug-2023 Rob Clark <robdclark@chromium.org>

drm/msm/gpu: Push gpu lock down past runpm

Avoid holding the gpu lock when calling runpm, to prevent this lockdep splat:

======================================================
WARNING: possible circular locking dependency detected
6.4.3-debug+ #14 Not tainted
------------------------------------------------------
ring0/373 is trying to acquire lock:
ffffffead86efb98 (prepare_lock){+.+.}-{3:3}, at: clk_prepare_lock+0x70/0x98

but task is already holding lock:
ffffff809cd19170 (&gpu->lock){+.+.}-{3:3}, at: msm_job_run+0x7c/0x128 [msm]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #4 (&gpu->lock){+.+.}-{3:3}:
__mutex_lock+0xc8/0x388
mutex_lock_nested+0x2c/0x38
msm_job_run+0x7c/0x128 [msm]
drm_sched_main+0x264/0x354 [gpu_sched]
kthread+0xf0/0x100
ret_from_fork+0x10/0x20

-> #3 (dma_fence_map){++++}-{0:0}:
__dma_fence_might_wait+0x74/0xc0
dma_resv_lockdep+0x1f0/0x2e8
do_one_initcall+0xb4/0x214
kernel_init_freeable+0x338/0x33c
kernel_init+0x30/0x134
ret_from_fork+0x10/0x20

-> #2 (mmu_notifier_invalidate_range_start){+.+.}-{0:0}:
fs_reclaim_acquire+0x7c/0x9c
slab_pre_alloc_hook.constprop.0+0x40/0x250
__kmem_cache_alloc_node+0x60/0x18c
kmalloc_node_trace+0x40/0x84
alloc_worker+0x2c/0x64
init_rescuer+0x34/0xe0
workqueue_init+0x168/0x1fc
kernel_init_freeable+0x15c/0x33c
kernel_init+0x30/0x134
ret_from_fork+0x10/0x20

-> #1 (fs_reclaim){+.+.}-{0:0}:
__fs_reclaim_acquire+0x3c/0x48
fs_reclaim_acquire+0x50/0x9c
slab_pre_alloc_hook.constprop.0+0x40/0x250
__kmem_cache_alloc_node+0x60/0x18c
kmalloc_trace+0x44/0x88
clk_rcg2_dfs_determine_rate+0x60/0x214
clk_core_determine_round_nolock+0xb8/0xf0
clk_core_round_rate_nolock+0x84/0x118
clk_core_round_rate_nolock+0xd8/0x118
clk_round_rate+0x6c/0xd0
geni_se_clk_tbl_get+0x78/0xc0
geni_se_clk_freq_match+0x44/0xe4
get_spi_clk_cfg+0x50/0xf4
geni_spi_set_clock_and_bw+0x54/0x104
spi_geni_prepare_message+0x130/0x174
__spi_pump_transfer_message+0x200/0x4d8
__spi_sync+0x13c/0x23c
spi_sync_locked+0x18/0x24
do_cros_ec_pkt_xfer_spi+0x124/0x3f0
cros_ec_xfer_high_pri_work+0x28/0x3c
kthread_worker_fn+0x14c/0x27c
kthread+0xf0/0x100
ret_from_fork+0x10/0x20

-> #0 (prepare_lock){+.+.}-{3:3}:
__lock_acquire+0xdf8/0x109c
lock_acquire+0x234/0x284
__mutex_lock+0xc8/0x388
mutex_lock_nested+0x2c/0x38
clk_prepare_lock+0x70/0x98
clk_prepare+0x24/0x50
clk_bulk_prepare+0x50/0x9c
a6xx_gmu_resume+0x94/0x800 [msm]
a6xx_gmu_pm_resume+0x38/0x158 [msm]
adreno_runtime_resume+0x2c/0x38 [msm]
pm_generic_runtime_resume+0x30/0x44
__rpm_callback+0x4c/0x134
rpm_callback+0x78/0x7c
rpm_resume+0x3a4/0x46c
__pm_runtime_resume+0x78/0xbc
pm_runtime_get_sync.isra.0+0x14/0x20 [msm]
msm_gpu_submit+0x4c/0x12c [msm]
msm_job_run+0x88/0x128 [msm]
drm_sched_main+0x264/0x354 [gpu_sched]
kthread+0xf0/0x100
ret_from_fork+0x10/0x20

other info that might help us debug this:
Chain exists of:
prepare_lock --> dma_fence_map --> &gpu->lock
Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(&gpu->lock);
                               lock(dma_fence_map);
                               lock(&gpu->lock);
  lock(prepare_lock);

*** DEADLOCK ***
2 locks held by ring0/373:
#0: ffffffead875ae50 (dma_fence_map){++++}-{0:0}, at: drm_sched_main+0x54/0x354 [gpu_sched]
#1: ffffff809cd19170 (&gpu->lock){+.+.}-{3:3}, at: msm_job_run+0x7c/0x128 [msm]

stack backtrace:
CPU: 2 PID: 373 Comm: ring0 Not tainted 6.4.3-debug+ #14
Hardware name: Google Villager (rev1+) with LTE (DT)
Call trace:
dump_backtrace+0xb4/0xf0
show_stack+0x20/0x30
dump_stack_lvl+0x60/0x84
dump_stack+0x18/0x24
print_circular_bug+0x1cc/0x234
check_noncircular+0x78/0xac
__lock_acquire+0xdf8/0x109c
lock_acquire+0x234/0x284
__mutex_lock+0xc8/0x388
mutex_lock_nested+0x2c/0x38
clk_prepare_lock+0x70/0x98
clk_prepare+0x24/0x50
clk_bulk_prepare+0x50/0x9c
a6xx_gmu_resume+0x94/0x800 [msm]
a6xx_gmu_pm_resume+0x38/0x158 [msm]
adreno_runtime_resume+0x2c/0x38 [msm]
pm_generic_runtime_resume+0x30/0x44
__rpm_callback+0x4c/0x134
rpm_callback+0x78/0x7c
rpm_resume+0x3a4/0x46c
__pm_runtime_resume+0x78/0xbc
pm_runtime_get_sync.isra.0+0x14/0x20 [msm]
msm_gpu_submit+0x4c/0x12c [msm]
msm_job_run+0x88/0x128 [msm]
drm_sched_main+0x264/0x354 [gpu_sched]
kthread+0xf0/0x100
ret_from_fork+0x10/0x20
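
The change itself is small; a hypothetical simplification of the reordering
in msm_job_run() (not the literal patch):

    /* before: gpu->lock was held across the runpm get, so clk
     * prepare_lock could end up taken under gpu->lock.
     * after: take the runpm reference first, then the gpu lock: */
    pm_runtime_get_sync(&gpu->pdev->dev); /* may take clk prepare_lock */
    mutex_lock(&gpu->lock);               /* no runpm calls under here */
    msm_gpu_submit(gpu, submit);
    mutex_unlock(&gpu->lock);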

Signed-off-by: Rob Clark <robdclark@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/552298/


# 7391c282 02-Aug-2023 Rob Clark <robdclark@chromium.org>

drm/msm: Remove vma use tracking

This was not strictly necessary, as page unpinning (ie. shrinker) only
cares about the resv. It did give us some extra sanity checking for
userspace controlled iova, and was useful to catch issues on the kernel
and userspace side when enabling userspace iova. But if userspace screws
this up, it just corrupts its own gpu buffers and/or gets iova faults.
So we can just let userspace shoot itself in the foot and drop the extra
per-buffer SUBMIT overhead.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Patchwork: https://patchwork.freedesktop.org/patch/551023/


# 6ba5daa5 02-Aug-2023 Rob Clark <robdclark@chromium.org>

drm/msm: Use drm_gem_object in submit bos table

Basically everywhere wants the base ptr type. So store that instead of
msm_gem_object.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/551021/


# 1a8b612e 02-Aug-2023 Rob Clark <robdclark@chromium.org>

drm/msm: Take lru lock once per job_run

Rather than acquiring it and dropping it for each individual obj.
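
A sketch of the pattern (names assumed from the msm driver; not the
literal diff):

    /* Hoist the LRU lock out of the per-object loop: */
    mutex_lock(&priv->lru.lock);
    for (i = 0; i < submit->nr_bos; i++)
        msm_gem_unpin_active(submit->bos[i].obj); /* caller-locked variant */
    mutex_unlock(&priv->lru.lock);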

Signed-off-by: Rob Clark <robdclark@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/551019/


# 17b704f1 20-Mar-2023 Rob Clark <robdclark@chromium.org>

drm/msm/gem: Avoid obj lock in job_run()

Now that everything that controls which LRU an obj lives in *except* the
backing pages is protected by the LRU lock, add a special path to unpin
in the job_run() path, where we are assured that we already have backing
pages and will not be racing against eviction (because the GEM object's
dma_resv contains the fence that will be signaled when the submit/job
completes).

Signed-off-by: Rob Clark <robdclark@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/527845/
Link: https://lore.kernel.org/r/20230320144356.803762-10-robdclark@gmail.com


# b14b8c5f 20-Mar-2023 Rob Clark <robdclark@chromium.org>

drm/msm: Decouple vma tracking from obj lock

We need to use the inuse count to track that a BO is pinned until
we have the hw_fence. But we want to remove the obj lock from the
job_run() path as this could deadlock against reclaim/shrinker
(because it is blocking the hw_fence from eventually being signaled).
So split that tracking out into a per-vma lock with narrower scope.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/527839/
Link: https://lore.kernel.org/r/20230320144356.803762-5-robdclark@gmail.com


# fc2f0756 20-Mar-2023 Rob Clark <robdclark@chromium.org>

drm/msm/gem: Tidy up VMA API

Stop open coding VMA construction, which will be needed in the next
commit. And since the VMA already has a ptr to the address space, stop
passing that around everywhere. (Also, an aspace always has an mmu, so
we can drop a couple pointless NULL checks.)

Signed-off-by: Rob Clark <robdclark@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/527833/
Link: https://lore.kernel.org/r/20230320144356.803762-4-robdclark@gmail.com


# 769fec1e 20-Mar-2023 Rob Clark <robdclark@chromium.org>

drm/msm: Move submit bo flags update from obj lock

The flags are only accessed (1) when the submit is constructed, before
enqueuing to the gpu sched (ie. when still visible only to the task
calling the submit ioctl), (2) here, where we own a reference to the
submit and are serialized on the gpu sched thread, and (3) after the
submit is retired and the last reference is dropped, which is serialized
on the submit's reference count. Hence locking is unneeded here.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/527830/
Link: https://lore.kernel.org/r/20230320144356.803762-3-robdclark@gmail.com


# f94e6a51 20-Mar-2023 Rob Clark <robdclark@chromium.org>

drm/msm: Pre-allocate hw_fence

Avoid allocating memory in job_run() by pre-allocating the hw_fence.
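
Roughly, the allocation moves from the scheduler path to submit creation
(function names as I understand the patch; treat as a sketch):

    /* at submit-ioctl time, where allocation may sleep or fail cleanly: */
    submit->hw_fence = msm_fence_alloc();

    /* in msm_job_run(), which must not allocate (it can block reclaim,
     * and reclaim may be waiting on this job's fence): */
    msm_fence_init(submit->hw_fence, fctx);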

Signed-off-by: Rob Clark <robdclark@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/527832/
Link: https://lore.kernel.org/r/20230320144356.803762-2-robdclark@gmail.com


# 084b9e17 23-Sep-2022 Rob Clark <robdclark@chromium.org>

drm/msm/gem: Unpin objects slightly later

The introduction of "drm/msm/gem: Evict active GEM objects when necessary"
exposes a problem with "drm/msm/gem: Unpin buffers earlier", in that we
need to keep the object pinned for the time the submit is queued up in the
gpu scheduler. Otherwise the shrinker will see it as a thing that can be
evicted if we wait for it to be signaled. But if the shrinker path is
waiting on it with the obj lock held, the job cannot be scheduled, as that
also requires briefly grabbing the obj lock, leading to deadlock. (Not to
mention, we don't want the shrinker to evict an obj queued up in the gpu
scheduler.)

Fixes: f371bcc0c2ac ("drm/msm/gem: Unpin buffers earlier")
Fixes: 025d27239a2f ("drm/msm/gem: Evict active GEM objects when necessary")
Closes: https://gitlab.freedesktop.org/drm/msm/-/issues/19
Signed-off-by: Rob Clark <robdclark@chromium.org>
Tested-by: Chia-I Wu <olvaffe@gmail.com>
Patchwork: https://patchwork.freedesktop.org/patch/504528/
Link: https://lore.kernel.org/r/20220923224043.2449152-1-robdclark@gmail.com


# 125e03b2 18-Aug-2022 Akhil P Oommen <quic_akhilpo@quicinc.com>

drm/msm: Remove unnecessary pm_runtime_get/put

We already enable gpu power from msm_gpu_submit(), so avoid a duplicate
pm_runtime_get/put from msm_job_run().

Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
Patchwork: https://patchwork.freedesktop.org/patch/498390/
Link: https://lore.kernel.org/r/20220819015030.v5.1.Icf1e8f0c9b3e7e9933c3b48c70477d0582f3243f@changeid
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 311e03c2 27-May-2022 Rob Clark <robdclark@chromium.org>

drm/msm/gem: Separate object and vma unpin

Previously the BO_PINNED state in the submit was tracking two related
but different things: (1) that the buffer object was pinned, and (2)
that the vma (mapping within a set of pagetables) was pinned. But with
fenced vma unpin (needed so that userspace couldn't race with retire
path for releasing a vma) these two were decoupled. The fact that the
BO_PINNED flag was already cleared meant that we leaked the bo pin count
which should have been dropped when the submit was retired.

So split this state into BO_OBJ_PINNED and BO_VMA_PINNED, so they can be
dropped independently.

Fixes: 95d1deb02a9c ("drm/msm/gem: Add fenced vma unpin")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/487559/
Link: https://lore.kernel.org/r/20220527172341.2151005-1-robdclark@gmail.com


# 500ca2a1 21-Apr-2022 Tom Rix <trix@redhat.com>

drm/msm: change msm_sched_ops from global to static

Smatch reports this issue
msm_ringbuffer.c:43:36: warning: symbol 'msm_sched_ops' was not declared. Should it be static?

msm_sched_ops is only used in msm_ringbuffer.c so change its
storage-class specifier to static.
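
The resulting definition in msm_ringbuffer.c looks roughly like this (ops
entries recalled from the driver, so treat them as illustrative):

    static const struct drm_sched_backend_ops msm_sched_ops = {
        .run_job = msm_job_run,
        .timedout_job = msm_job_timedout,
        .free_job = msm_job_free,
    };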

Signed-off-by: Tom Rix <trix@redhat.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/482883/
Link: https://lore.kernel.org/r/20220421131507.1557667-1-trix@redhat.com
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>


# 95d1deb0 11-Apr-2022 Rob Clark <robdclark@chromium.org>

drm/msm/gem: Add fenced vma unpin

With userspace allocated iova (next patch), we can have a race condition
where userspace observes the fence completion and deletes the vma before
retire_submit() gets around to unpinning the vma. To handle this, add a
fenced unpin which drops the refcount but tracks the fence, and update
msm_gem_vma_inuse() to check any previously unsignaled fences.

v2: Fix inuse underflow (duplicate unpin)
v3: Fix msm_job_run() vs submit_cleanup() race condition

Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20220411215849.297838-10-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 8ab62eda 22-Feb-2022 Jiawei Gu <Jiawei.Gu@amd.com>

drm/sched: Add device pointer to drm_gpu_scheduler

Add a device pointer so the scheduler's printing can use
DRM_DEV_ERROR() instead, which makes life easier in multi-GPU
scenarios.

v2: amend all calls of drm_sched_init()
v3: fill dev pointer for all drm_sched_init() calls

Signed-off-by: Jiawei Gu <Jiawei.Gu@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220221095705.5290-1-Jiawei.Gu@amd.com


# c28e2f2b 09-Nov-2021 Rob Clark <robdclark@chromium.org>

drm/msm: Remove struct_mutex usage

The remaining struct_mutex usage is just to serialize various gpu
related things (submit/retire/recover/fault/etc), so replace
struct_mutex with gpu->lock.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20211109181117.591148-4-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 80bcfbd3 04-Aug-2021 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/msm: Use scheduler dependency handling

drm_sched_job_init is already at the right place, so this boils down
to deleting code.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Rob Clark <robdclark@gmail.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Cc: Sean Paul <sean@poorly.run>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: linux-arm-msm@vger.kernel.org
Cc: freedreno@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Link: https://patchwork.freedesktop.org/patch/msgid/20210805104705.862416-13-daniel.vetter@ffwll.ch


# 1d8a5ca4 27-Jul-2021 Rob Clark <robdclark@chromium.org>

drm/msm: Conversion to drm scheduler

For existing adrenos, there are one or more ringbuffers, depending on
whether preemption is supported. When preemption is supported, each
ringbuffer has its own priority. A submitqueue (which maps to a
gl context or vk queue in userspace) is mapped to a specific ring-
buffer at creation time, based on the submitqueue's priority.

Each ringbuffer has its own drm_gpu_scheduler. Each submitqueue
maps to a drm_sched_entity. And each submit maps to a drm_sched_job.
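
Illustrated as embedded fields (a sketch of the relationships; exact
member names may differ):

    /* one scheduler per ring: */
    struct msm_ringbuffer      { struct drm_gpu_scheduler sched; };
    /* one entity per submitqueue: */
    struct msm_gpu_submitqueue { struct drm_sched_entity entity; };
    /* one job per submit: */
    struct msm_gem_submit      { struct drm_sched_job base; };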

Closes: https://gitlab.freedesktop.org/drm/msm/-/issues/4
Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/r/20210728010632.2633470-10-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 030af2b0 27-Jul-2021 Rob Clark <robdclark@chromium.org>

drm/msm: drop drm_gem_object_put_locked()

No idea why we were still using this. It certainly hasn't been needed
for some time. So drop the pointless twin codepaths.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/r/20210728010632.2633470-4-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 375f9a63 27-Jul-2021 Rob Clark <robdclark@chromium.org>

drm/msm: Docs and misc cleanup

Fix a couple of incorrect or misspelt comments, and add a submitqueue
doc comment.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/r/20210728010632.2633470-2-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>


# da3d378d 26-Jul-2021 Rob Clark <robdclark@chromium.org>

drm/msm: Let fences read directly from memptrs

Let dma_fence::signaled, etc, read directly from the address where the hw
writes the updated completed fence seqno, so we can potentially
notice that the fence is signaled sooner.
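
A minimal sketch of the idea, assuming a fence context that keeps a
pointer into the memptrs buffer the hw writes:

    static bool fence_completed(struct msm_fence_context *fctx, uint32_t fence)
    {
        /* fenceptr points at the seqno the GPU writes back on completion;
         * the signed compare handles seqno wraparound: */
        return (int32_t)(*fctx->fenceptr - fence) >= 0;
    }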

Plus add some docs.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20210726144359.2179302-2-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 77d20529 23-Oct-2020 Rob Clark <robdclark@chromium.org>

drm/msm: Protect ring->submits with its own lock

One less place to rely on dev->struct_mutex.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 77c40603 23-Oct-2020 Rob Clark <robdclark@chromium.org>

drm/msm: Document and rename preempt_lock

Before adding another lock, give ring->lock a more descriptive name.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 604234f3 03-Sep-2020 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Enable expanded apriv support for a650

a650 supports expanded apriv, which allows us to map critical buffers
(ringbuffer and memstore) as privileged to protect them from corruption.
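
In the driver this ends up as a small helper applied wherever kernel
buffers are allocated (helper and field names as added by this patch, to
the best of my understanding):

    static inline u32 check_apriv(struct msm_gpu *gpu, u32 flags)
    {
        return gpu->hw_apriv ? flags | MSM_BO_MAP_PRIV : flags;
    }

    /* e.g. the ringbuffer allocation: */
    ring->start = msm_gem_kernel_new(gpu->dev, MSM_GPU_RINGBUFFER_SZ,
            check_apriv(gpu, MSM_BO_WC | MSM_BO_GPU_READONLY),
            gpu->aspace, &ring->bo, &ring->iova);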

Cc: stable@vger.kernel.org
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 352c83fb 17-Aug-2020 Rob Clark <robdclark@chromium.org>

drm/msm/gpu: make ringbuffer readonly

The GPU has no business writing into the ringbuffer, let's make it
readonly to the GPU.

Fixes: 7198e6b03155 ("drm/msm: add a3xx gpu support")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# caab277b 02-Jun-2019 Thomas Gleixner <tglx@linutronix.de>

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234

Based on 1 normalized pattern(s):

this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation this program is
distributed in the hope that it will be useful but without any
warranty without even the implied warranty of merchantability or
fitness for a particular purpose see the gnu general public license
for more details you should have received a copy of the gnu general
public license along with this program if not see http www gnu org
licenses

extracted by the scancode license scanner the SPDX license identifier

GPL-2.0-only

has been chosen to replace the boilerplate/reference in 503 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Enrico Weigelt <info@metux.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190602204653.811534538@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 84c61275 07-Nov-2018 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm/gpu: Map the ringbuffer in the iova at create time

For reasons that I'm sure made perfect sense at the time, we were
opting to defer the iova alloc / pin on the ringbuffer until HW
init time, so when we moved to iova reference counting we ended
up adding a reference count every time the hardware started.
Not that it mattered (because the ring is always around), but
it did make the debug output look odd. Allocate and pin the iova
at create time instead.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 0815d774 07-Nov-2018 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Add a name field for gem objects

For debugging purposes it is useful to assign descriptions
to buffers so that we know what they are used for. Add
a field to the buffer object and use that to name the various
kernel side allocations, which ends up looking like this
in /d/dri/X/gem:

flags id ref offset kaddr size madv name
00040000: I 0 ( 1) 00000000 0000000070b79eca 00004096 memptrs
vmas: [gpu: 01000000,mapped,inuse=1]
00020000: I 0 ( 1) 00000000 0000000031ed4074 00032768 ring0

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 1e29dff0 07-Nov-2018 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Add a common function to free kernel buffer objects

Buffer objects allocated with msm_gem_kernel_new() are mostly
freed the same way, so we can save a few lines of code with a
common function.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# dc9a9b32 25-Jan-2018 Steve Kowalik <steven@wedontsleep.org>

drm/msm: Replace gem_object deprecated functions

drm_gem_object_{reference,unreference,unreference_unlocked} are
deprecated functions, and merely alias to the get/put functions.
Switch to the new names.

Signed-off-by: Steve Kowalik <steven@wedontsleep.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# b1fc2839 20-Oct-2017 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Implement preemption for A5XX targets

Implement preemption for A5XX targets - this allows multiple
ringbuffers for different priorities with automatic preemption
of a lower priority ringbuffer if a higher one is ready.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 4c7085a5 20-Oct-2017 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Shadow current pointer in the ring until command is complete

Add a shadow pointer to track the current command being written into
the ring. Don't commit it as 'cur' until the command is submitted.
Because 'cur' is used to construct the software copy of the wptr, this
ensures that somebody peeking in on the ring doesn't assume that a
command is in flight while it is being written. This isn't a huge deal
with a single ring (though technically the hangcheck could assume
the system is prematurely busy when it isn't), but it will be rather
important for preemption, where the decision to preempt is based
on a non-empty ringbuffer. Without a shadow, an aggressive preemption
scheme could assume that the ringbuffer is non-empty and switch to it
before the CPU is done writing the command, and boom.
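
A sketch of the resulting pattern (simplified; where 'cur' gets committed
at flush time is an assumption based on the description above):

    /* writes land via the shadow pointer: */
    static inline void OUT_RING(struct msm_ringbuffer *ring, uint32_t data)
    {
        *(ring->next++) = data;
    }

    /* only at flush time is the write "committed" and the hw kicked: */
    ring->cur = ring->next;
    gpu->funcs->flush(gpu, ring);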

Even though preemption won't be supported for all targets, because of
the way the code is organized it is simpler to make this generic for
all targets. The extra load for non-preemption targets should be
minimal.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# f97decac 20-Oct-2017 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Support multiple ringbuffers

Add the infrastructure to support the idea of multiple ringbuffers.
Assign each ringbuffer an id and use that as an index for the various
ring specific operations.

The biggest delta is to support legacy fences. Each fence gets its own
sequence number but the legacy functions expect to use a unique integer.
To handle this we return a unique identifier for each submission but
map it to a specific ring/sequence under the covers. Newer users use
a dma_fence pointer anyway so they don't care about the actual sequence
ID or ring.
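
For example, a purely hypothetical encoding of the idea (illustrative
only; not how the patch actually implements the mapping):

    /* Pack ring id and per-ring seqno into one globally unique integer,
     * so legacy userspace keeps seeing a single fence id: */
    static uint32_t fence_to_id(uint32_t ring_id, uint32_t seqno)
    {
        return (ring_id << 28) | (seqno & 0x0fffffff);
    }

    static void id_to_fence(uint32_t id, uint32_t *ring_id, uint32_t *seqno)
    {
        *ring_id = id >> 28;
        *seqno = id & 0x0fffffff;
    }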

The actual mechanics for multiple ringbuffers are very target specific
so this code just allows for the possibility but still only defines
one ringbuffer for each target family.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 8223286d 27-Jul-2017 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Add a helper function for in-kernel buffer allocations

Nearly all of the in-kernel buffer allocations allocate a buffer object,
virtual address and GPU iova at the same time. Make a helper function to
handle the details.
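
The helper bundles the three steps into one call; its shape is roughly
as follows (signature from memory, so verify against msm_gem.h):

    /* allocate a GEM object, kernel mapping, and GPU iova in one go: */
    void *msm_gem_kernel_new(struct drm_device *dev, uint32_t size,
                             uint32_t flags,
                             struct msm_gem_address_space *aspace,
                             struct drm_gem_object **bo, uint64_t *iova);

    /* e.g.: */
    ring->start = msm_gem_kernel_new(dev, MSM_GPU_RINGBUFFER_SZ, MSM_BO_WC,
                                     aspace, &ring->bo, &ring->iova);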

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
[dropped msm_fbdev conversion to new helper, since it interferes with
display-handover work, where we want to separate allocation and mapping]
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 0e08270a 13-Jun-2017 Sushmita Susheelendra <ssusheel@codeaurora.org>

drm/msm: Separate locking of buffer resources from struct_mutex

Buffer object specific resources like pages, domains, sg list
need not be protected with struct_mutex. They can be protected
with a buffer object level lock. This simplifies locking and
makes it easier to avoid potential recursive locking scenarios
for SVM involving mmap_sem and struct_mutex. This also removes
unnecessary serialization when creating buffer objects, as well as
between buffer object creation and GPU command submission.

Signed-off-by: Sushmita Susheelendra <ssusheel@codeaurora.org>
[robclark: squash in handling new locking for shrinker]
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 88b333b0 20-Dec-2016 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Ensure that the hardware write pointer is valid

Currently the value written to CP_RB_WPTR is calculated on the fly as
(rb->next - rb->start). But as the code is designed, rb->next is wrapped
before writing the commands, so if a series of commands happened to
fit perfectly in the ringbuffer, rb->next would end up equal to
rb->size / 4 and thus result in an out of bounds address to CP_RB_WPTR.

The easiest way to fix this is to mask WPTR when writing it to the
hardware; it makes the hardware happy, the rest of the ringbuffer
math appears to work, and there isn't any point in upsetting anything.
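
A sketch of the fix (names approximate; the ring size is in bytes and
the ring holds 32-bit dwords, hence the divide by 4):

    static uint32_t get_wptr(struct msm_ringbuffer *ring)
    {
        /* wrap the dword offset so a perfectly full ring yields 0,
         * not size/4; size is a power of two, so this is a cheap mask */
        return (ring->cur - ring->start) % (ring->size / 4);
    }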

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
[squash in is_power_of_2() check]
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 18f23049 26-May-2016 Rob Clark <robdclark@gmail.com>

drm/msm: change gem->vmap() to get/put

Before we can add vmap shrinking, we really need to know which vmap'ings
are currently being used. So switch to a get/put interface. The put
functions are stubbed for now.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 69a834c2 24-May-2016 Rob Clark <robdclark@gmail.com>

drm/msm: deal with exhausted vmap space better

Some, but not all, callers of obj->vmap() would check the return with
IS_ERR(). So let's actually return an error if vmap() fails, and fix up
the call-sites that were not handling this properly.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 774449eb 15-May-2015 Rob Clark <robdclark@gmail.com>

drm/msm: fix locking inconsistencies in gpu->destroy()

In error paths, this was being called without struct_mutex held,
leading to panics like:

msm 1a00000.qcom,mdss_mdp: No memory protection without IOMMU
Kernel panic - not syncing: BUG!
CPU: 0 PID: 1409 Comm: cat Not tainted 4.0.0-dirty #4
Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
Call trace:
[<ffffffc000089c78>] dump_backtrace+0x0/0x118
[<ffffffc000089da0>] show_stack+0x10/0x20
[<ffffffc0006686d4>] dump_stack+0x84/0xc4
[<ffffffc0006678b4>] panic+0xd0/0x210
[<ffffffc0003e1ce4>] drm_gem_object_free+0x5c/0x60
[<ffffffc000402870>] adreno_gpu_cleanup+0x60/0x80
[<ffffffc0004035a0>] a3xx_destroy+0x20/0x70
[<ffffffc0004036f4>] a3xx_gpu_init+0x84/0x108
[<ffffffc0004018b8>] adreno_load_gpu+0x58/0x190
[<ffffffc000419dac>] msm_open+0x74/0x88
[<ffffffc0003e0a48>] drm_open+0x168/0x400
[<ffffffc0003e7210>] drm_stub_open+0xa8/0x118
[<ffffffc0001a0e84>] chrdev_open+0x94/0x198
[<ffffffc000199f88>] do_dentry_open+0x208/0x310
[<ffffffc00019a4c4>] vfs_open+0x44/0x50
[<ffffffc0001aa26c>] do_last.isra.14+0x2c4/0xc10
[<ffffffc0001aac38>] path_openat+0x80/0x5e8
[<ffffffc0001ac354>] do_filp_open+0x2c/0x98
[<ffffffc00019b60c>] do_sys_open+0x13c/0x228
[<ffffffc00019b72c>] SyS_openat+0xc/0x18
CPU1: stopping

But there isn't any particularly good reason to hold struct_mutex for
teardown, so just standardize on calling it without the mutex held,
and use the _unlocked() versions for GEM obj unref'ing.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 7198e6b0 18-Jul-2013 Rob Clark <robdclark@gmail.com>

drm/msm: add a3xx gpu support

Add initial support for a3xx 3d core.

So far, with hardware that I've seen to date, we can have:
+ zero, one, or two z180 2d cores
+ a3xx or a2xx 3d core, which share a common CP (the firmware
for the CP seems to implement some different PM4 packet types
but the basics of cmdstream submission are the same)

Which means that the eventual complete "class" hierarchy, once
support for all past and present hw is in place, becomes:
+ msm_gpu
  + adreno_gpu
    + a3xx_gpu
    + a2xx_gpu
  + z180_gpu

This commit splits out the parts that will eventually be common
between a2xx/a3xx into adreno_gpu, and the parts that are even
common to z180 into msm_gpu.

Note that there is no cmdstream validation required. All memory access
from the GPU is via IOMMU/MMU. So as long as you don't map silly things
to the GPU, there isn't much damage that the GPU can do.

Signed-off-by: Rob Clark <robdclark@gmail.com>