History log of /linux-master/drivers/gpu/drm/msm/msm_gpu.c
Revision Date Author Comments
# 917e9b7c 09-Jan-2024 Rob Clark <robdclark@chromium.org>

Revert "drm/msm/gpu: Push gpu lock down past runpm"

This reverts commit abe2023b4cea192ab266b351fd38dc9dbd846df0.

Changing the locking order means that scheduler/msm_job_run() can race
with the recovery kthread worker, with the result that the GPU gets an
extra runpm get when we are trying to power it off. Leaving the GPU in
an unrecovered state.

I'll need to come up with a different scheme for appeasing lockdep.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/573835/


# 12578c07 17-Nov-2023 Rob Clark <robdclark@chromium.org>

drm/msm/gpu: Skip retired submits in recover worker

If we somehow raced with submit retiring, either while waiting for
worker to have a chance to run or acquiring the gpu lock, then the
recover worker should just bail.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/568034/


# 548b61a8 15-Nov-2023 Rob Clark <robdclark@chromium.org>

drm/msm/gpu: Move gpu devcore's to gpu device

The dpu devcore's are already associated with the dpu device. So we
should associate the gpu devcore's with the gpu device, for easier
classification.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/567738/


# abe2023b 10-Aug-2023 Rob Clark <robdclark@chromium.org>

drm/msm/gpu: Push gpu lock down past runpm

Avoid holding gpu lock when calling runpm, to avoid this lockdep splat:

======================================================
WARNING: possible circular locking dependency detected
6.4.3-debug+ #14 Not tainted
------------------------------------------------------
ring0/373 is trying to acquire lock:
ffffffead86efb98 (prepare_lock){+.+.}-{3:3}, at: clk_prepare_lock+0x70/0x98

but task is already holding lock:
ffffff809cd19170 (&gpu->lock){+.+.}-{3:3}, at: msm_job_run+0x7c/0x128 [msm]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #4 (&gpu->lock){+.+.}-{3:3}:
__mutex_lock+0xc8/0x388
mutex_lock_nested+0x2c/0x38
msm_job_run+0x7c/0x128 [msm]
drm_sched_main+0x264/0x354 [gpu_sched]
kthread+0xf0/0x100
ret_from_fork+0x10/0x20

-> #3 (dma_fence_map){++++}-{0:0}:
__dma_fence_might_wait+0x74/0xc0
dma_resv_lockdep+0x1f0/0x2e8
do_one_initcall+0xb4/0x214
kernel_init_freeable+0x338/0x33c
kernel_init+0x30/0x134
ret_from_fork+0x10/0x20

-> #2 (mmu_notifier_invalidate_range_start){+.+.}-{0:0}:
fs_reclaim_acquire+0x7c/0x9c
slab_pre_alloc_hook.constprop.0+0x40/0x250
__kmem_cache_alloc_node+0x60/0x18c
kmalloc_node_trace+0x40/0x84
alloc_worker+0x2c/0x64
init_rescuer+0x34/0xe0
workqueue_init+0x168/0x1fc
kernel_init_freeable+0x15c/0x33c
kernel_init+0x30/0x134
ret_from_fork+0x10/0x20

-> #1 (fs_reclaim){+.+.}-{0:0}:
__fs_reclaim_acquire+0x3c/0x48
fs_reclaim_acquire+0x50/0x9c
slab_pre_alloc_hook.constprop.0+0x40/0x250
__kmem_cache_alloc_node+0x60/0x18c
kmalloc_trace+0x44/0x88
clk_rcg2_dfs_determine_rate+0x60/0x214
clk_core_determine_round_nolock+0xb8/0xf0
clk_core_round_rate_nolock+0x84/0x118
clk_core_round_rate_nolock+0xd8/0x118
clk_round_rate+0x6c/0xd0
geni_se_clk_tbl_get+0x78/0xc0
geni_se_clk_freq_match+0x44/0xe4
get_spi_clk_cfg+0x50/0xf4
geni_spi_set_clock_and_bw+0x54/0x104
spi_geni_prepare_message+0x130/0x174
__spi_pump_transfer_message+0x200/0x4d8
__spi_sync+0x13c/0x23c
spi_sync_locked+0x18/0x24
do_cros_ec_pkt_xfer_spi+0x124/0x3f0
cros_ec_xfer_high_pri_work+0x28/0x3c
kthread_worker_fn+0x14c/0x27c
kthread+0xf0/0x100
ret_from_fork+0x10/0x20

-> #0 (prepare_lock){+.+.}-{3:3}:
__lock_acquire+0xdf8/0x109c
lock_acquire+0x234/0x284
__mutex_lock+0xc8/0x388
mutex_lock_nested+0x2c/0x38
clk_prepare_lock+0x70/0x98
clk_prepare+0x24/0x50
clk_bulk_prepare+0x50/0x9c
a6xx_gmu_resume+0x94/0x800 [msm]
a6xx_gmu_pm_resume+0x38/0x158 [msm]
adreno_runtime_resume+0x2c/0x38 [msm]
pm_generic_runtime_resume+0x30/0x44
__rpm_callback+0x4c/0x134
rpm_callback+0x78/0x7c
rpm_resume+0x3a4/0x46c
__pm_runtime_resume+0x78/0xbc
pm_runtime_get_sync.isra.0+0x14/0x20 [msm]
msm_gpu_submit+0x4c/0x12c [msm]
msm_job_run+0x88/0x128 [msm]
drm_sched_main+0x264/0x354 [gpu_sched]
kthread+0xf0/0x100
ret_from_fork+0x10/0x20

other info that might help us debug this:
Chain exists of:
prepare_lock --> dma_fence_map --> &gpu->lock
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&gpu->lock);
lock(dma_fence_map);
lock(&gpu->lock);
lock(prepare_lock);

*** DEADLOCK ***
2 locks held by ring0/373:
#0: ffffffead875ae50 (dma_fence_map){++++}-{0:0}, at: drm_sched_main+0x54/0x354 [gpu_sched]
#1: ffffff809cd19170 (&gpu->lock){+.+.}-{3:3}, at: msm_job_run+0x7c/0x128 [msm]

stack backtrace:
CPU: 2 PID: 373 Comm: ring0 Not tainted 6.4.3-debug+ #14
Hardware name: Google Villager (rev1+) with LTE (DT)
Call trace:
dump_backtrace+0xb4/0xf0
show_stack+0x20/0x30
dump_stack_lvl+0x60/0x84
dump_stack+0x18/0x24
print_circular_bug+0x1cc/0x234
check_noncircular+0x78/0xac
__lock_acquire+0xdf8/0x109c
lock_acquire+0x234/0x284
__mutex_lock+0xc8/0x388
mutex_lock_nested+0x2c/0x38
clk_prepare_lock+0x70/0x98
clk_prepare+0x24/0x50
clk_bulk_prepare+0x50/0x9c
a6xx_gmu_resume+0x94/0x800 [msm]
a6xx_gmu_pm_resume+0x38/0x158 [msm]
adreno_runtime_resume+0x2c/0x38 [msm]
pm_generic_runtime_resume+0x30/0x44
__rpm_callback+0x4c/0x134
rpm_callback+0x78/0x7c
rpm_resume+0x3a4/0x46c
__pm_runtime_resume+0x78/0xbc
pm_runtime_get_sync.isra.0+0x14/0x20 [msm]
msm_gpu_submit+0x4c/0x12c [msm]
msm_job_run+0x88/0x128 [msm]
drm_sched_main+0x264/0x354 [gpu_sched]
kthread+0xf0/0x100
ret_from_fork+0x10/0x20

Signed-off-by: Rob Clark <robdclark@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/552298/


# 6ba5daa5 02-Aug-2023 Rob Clark <robdclark@chromium.org>

drm/msm: Use drm_gem_object in submit bos table

Basically everywhere wants the base ptr type. So store that instead of
msm_gem_object.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/551021/


# f09f5459 27-Jul-2023 Ruan Jinjie <ruanjinjie@huawei.com>

drm/msm: Remove redundant DRM_DEV_ERROR()

There is no need to call the DRM_DEV_ERROR() function directly to print
a custom message when handling an error from platform_get_irq() function
as it is going to display an appropriate error message
in case of a failure.

Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/549499/
Link: https://lore.kernel.org/r/20230727112407.2916029-1-ruanjinjie@huawei.com
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>


# 171f580e 17-Apr-2023 Rob Clark <robdclark@chromium.org>

drm/msm: Move cmdstream dumping out of sched kthread

This is something that can block for arbitrary amounts of time as
userspace consumes from the FIFO. So we don't really want this to
be in the fence signaling path.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/532617/


# 51d86ee5 24-May-2023 Rob Clark <robdclark@chromium.org>

drm/msm: Switch to fdinfo helper

Now that we have a common helper, use it.

v2: Rebase on drm-misc-next

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Acked-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20230524155956.382440-4-robdclark@gmail.com


# 9f251f93 23-Feb-2023 Konrad Dybcio <konrad.dybcio@linaro.org>

drm/msm/adreno: Use OPP for every GPU generation

Some older GPUs (namely a2xx with no opp tables at all and a320 with
downstream-remnants gpu pwrlevels) used not to have OPP tables. They
both however had just one frequency defined, making it extremely easy
to construct such an OPP table from within the driver if need be.

Do so and switch all clk_set_rate calls on core_clk to their OPP
counterparts.

Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/523784/
Link: https://lore.kernel.org/r/20230223-topic-opp-v3-3-5f22163cd1df@linaro.org
Signed-off-by: Rob Clark <robdclark@chromium.org>


# d4843012 02-Jan-2023 Akhil P Oommen <quic_akhilpo@quicinc.com>

drm/msm/a6xx: Remove cx gdsc polling using 'reset'

Remove the unused 'reset' interface which was supposed to help to ensure
that cx gdsc has collapsed during gpu recovery. This is was not enabled
so far due to missing gpucc driver support. Similar functionality using
genpd framework will be implemented in the upcoming patch.

This effectively reverts commit 1f6cca404918
("drm/msm/a6xx: Ensure CX collapse during gpu recovery").

Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Patchwork: https://patchwork.freedesktop.org/patch/516470/
Link: https://lore.kernel.org/r/20230102161757.v5.4.I96e0bf9eaf96dd866111c1eec8a4c9b70fd7cbcb@changeid
Signed-off-by: Rob Clark <robdclark@chromium.org>


# a66f1efc 10-Jan-2023 Rob Clark <robdclark@chromium.org>

drm/msm/gpu: Fix potential double-free

If userspace was calling the MSM_SET_PARAM ioctl on multiple threads to
set the COMM or CMDLINE param, it could trigger a race causing the
previous value to be kfree'd multiple times. Fix this by serializing on
the gpu lock.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Fixes: d4726d770068 ("drm/msm: Add a way to override processes comm/cmdline")
Patchwork: https://patchwork.freedesktop.org/patch/517778/
Link: https://lore.kernel.org/r/20230110212903.1925878-1-robdclark@gmail.com


# d73b1d02 14-Nov-2022 Rob Clark <robdclark@chromium.org>

drm/msm: Hangcheck progress detection

If the hangcheck timer expires, check if the fw's position in the
cmdstream has advanced (changed) since last timer expiration, and
allow it up to three additional "extensions" to it's alotted time.
The intention is to continue to catch "shader stuck in a loop" type
hangs quickly, but allow more time for things that are actually
making forward progress.

Because we need to sample the CP state twice to detect if there has
not been progress, this also cuts the the timer's duration in half.

v2: Fix typo (REG_A6XX_CP_CSQ_IB2_STAT), add comment
v3: Only halve hangcheck timer duration for generations which
support progress detection (hdanton); removed unused a5xx
progress (without knowing how to adjust for data buffered
in ROQ it is too likely to report a false negative)
v4: Comment updates to better describe the total hangcheck
duration when progress detection is applied

Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Tested-by: Chia-I Wu <olvaffe@gmail.com> # dEQP-GLES2.functional.flush_finish.wait
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
Patchwork: https://patchwork.freedesktop.org/patch/511584/
Link: https://lore.kernel.org/r/20221114193049.1533391-3-robdclark@gmail.com


# 76efc245 27-Sep-2022 Akhil P Oommen <quic_akhilpo@quicinc.com>

drm/msm/gpu: Fix crash during system suspend after unbind

In adreno_unbind, we should clean up gpu device's drvdata to avoid
accessing a stale pointer during system suspend. Also, check for NULL
ptr in both system suspend/resume callbacks.

Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
Patchwork: https://patchwork.freedesktop.org/patch/505075/
Link: https://lore.kernel.org/r/20220928124830.2.I5ee0ac073ccdeb81961e5ec0cce5f741a7207a71@changeid
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 1f6cca40 18-Aug-2022 Akhil P Oommen <quic_akhilpo@quicinc.com>

drm/msm/a6xx: Ensure CX collapse during gpu recovery

Because there could be transient votes from other drivers/tz/hyp which
may keep the cx gdsc enabled, we should poll until cx gdsc collapses.
We can use the reset framework to poll for cx gdsc collapse from gpucc
clk driver.

This feature requires support from the platform's gpucc driver.

Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Patchwork: https://patchwork.freedesktop.org/patch/498397/
Link: https://lore.kernel.org/r/20220819015030.v5.5.I176567525af2b9439a7e485d0ca130528666a55c@changeid
Signed-off-by: Rob Clark <robdclark@chromium.org>


# f350bfb9 18-Aug-2022 Akhil P Oommen <quic_akhilpo@quicinc.com>

drm/msm: Fix cx collapse issue during recovery

There are some hardware logic under CX domain. For a successful
recovery, we should ensure cx headswitch collapses to ensure all the
stale states are cleard out. This is especially true to for a6xx family
where we can GMU co-processor.

Currently, cx doesn't collapse due to a devlink between gpu and its
smmu. So the *struct gpu device* needs to be runtime suspended to ensure
that the iommu driver removes its vote on cx gdsc.

Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
Patchwork: https://patchwork.freedesktop.org/patch/498398/
Link: https://lore.kernel.org/r/20220819015030.v5.4.I4ac27a0b34ea796ce0f938bb509e257516bc6f57@changeid
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 06097e37 18-Aug-2022 Akhil P Oommen <quic_akhilpo@quicinc.com>

drm/msm: Correct pm_runtime votes in recover worker

In the scenario where there is one a single submit which is hung, gpu is
power collapsed when it is retired. Because of this, by the time we call
reover(), gpu state would be already clear. Fix this by correctly
managing the pm runtime votes.

Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
Patchwork: https://patchwork.freedesktop.org/patch/498391/
Link: https://lore.kernel.org/r/20220819015030.v5.3.Ib07ecec3d5c17cb0e1efa6fcddaaa019ec2fb556@changeid
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 5b26f37d 18-Aug-2022 Akhil P Oommen <quic_akhilpo@quicinc.com>

drm/msm: Take single rpm refcount on behalf of all submits

Instead of separate refcount for each submit, take single rpm refcount
on behalf of all the submits. This makes it easier to drop the rpm
refcount during recovery in an upcoming patch.

Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
Patchwork: https://patchwork.freedesktop.org/patch/498392/
Link: https://lore.kernel.org/r/20220819015030.v5.2.Ifee853f6d8217a0fdacc459092bbc9e81a8a7ac7@changeid
Signed-off-by: Rob Clark <robdclark@chromium.org>


# b352ba54 02-Aug-2022 Rob Clark <robdclark@chromium.org>

drm/msm/gem: Convert to using drm_gem_lru

This converts over to use the shared GEM LRU/shrinker helpers. Note
that it means we are no longer tracking purgeable or willneed buffers
that are active separately. But the most recently pinned buffers should
be at the tail of the various LRUs, and the shrinker is already prepared
to encounter objects which are still active.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/496131/
Link: https://lore.kernel.org/r/20220802155152.1727594-11-robdclark@gmail.com


# 38a523a2 27-Jun-2022 Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Revert "devcoredump: remove the useless gfp_t parameter in dev_coredumpv and dev_coredumpm"

This reverts commit 77515ebaf01920e2db49e04672ef669a7c2907f2 as it
causes build problems in linux-next. It needs to be reintroduced in a
way that can allow the api to evolve and not require a "flag day" to
catch all users.

Link: https://lore.kernel.org/r/20220623160723.7a44b573@canb.auug.org.au
Cc: Duoming Zhou <duoming@zju.edu.cn>
Cc: Brian Norris <briannorris@chromium.org>
Cc: Johannes Berg <johannes@sipsolutions.net>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 77515eba 06-Jun-2022 Duoming Zhou <duoming@zju.edu.cn>

devcoredump: remove the useless gfp_t parameter in dev_coredumpv and dev_coredumpm

The dev_coredumpv() and dev_coredumpm() could not be used in atomic
context, because they call kvasprintf_const() and kstrdup() with
GFP_KERNEL parameter. The process is shown below:

dev_coredumpv(.., gfp_t gfp)
dev_coredumpm(.., gfp_t gfp)
dev_set_name
kobject_set_name_vargs
kvasprintf_const(GFP_KERNEL, ...); //may sleep
kstrdup(s, GFP_KERNEL); //may sleep

This patch removes gfp_t parameter of dev_coredumpv() and dev_coredumpm()
and changes the gfp_t parameter of kzalloc() in dev_coredumpm() to
GFP_KERNEL in order to show they could not be used in atomic context.

Fixes: 833c95456a70 ("device coredump: add new device coredump class")
Reviewed-by: Brian Norris <briannorris@chromium.org>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Link: https://lore.kernel.org/r/df72af3b1862bac7d8e793d1f3931857d3779dfd.1654569290.git.duoming@zju.edu.cn
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 8b5de735 13-Jun-2022 Rob Clark <robdclark@chromium.org>

drm/msm: Deprecate MSM_BO_UNCACHED harder

Handle the demotion to MSM_BO_WC at the userspace ABI level, and fix
the remaining internal MSM_BO_UNCACHED user.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/489339/
Link: https://lore.kernel.org/r/20220613194623.2588353-1-robdclark@gmail.com


# 18514c38 29-Jun-2022 Rob Clark <robdclark@chromium.org>

drm/msm/gpu: Add GEM debug label to devcore

When trying to understand an iova fault devcore, once you figure out
which buffer we accessed beyond the end of, it is useful to see the
buffer's debug label.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/491910/
Link: https://lore.kernel.org/r/20220629211919.563585-3-robdclark@gmail.com


# cc66a42c 29-Jun-2022 Rob Clark <robdclark@chromium.org>

drm/msm/gpu: Capture all BO addr+size in devcore

It is useful to know what buffers userspace thinks are associated with
the submit, even if we don't care to capture their content. This brings
things more inline with $debugfs/rd cmdstream dumping.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/491908/
Link: https://lore.kernel.org/r/20220629211919.563585-2-robdclark@gmail.com


# cfebe3fd 09-Jun-2022 Rob Clark <robdclark@chromium.org>

drm/msm: Expose client engine utilization via fdinfo

Similar to AMD commit
874442541133 ("drm/amdgpu: Add show_fdinfo() interface"), using the
infrastructure added in previous patches, we add basic client info
and GPU engine utilisation for msm.

Example output:

# cat /proc/`pgrep glmark2`/fdinfo/6
pos: 0
flags: 02400002
mnt_id: 21
ino: 162
drm-driver: msm
drm-client-id: 7
drm-engine-gpu: 1734371319 ns
drm-cycles-gpu: 1153645024
drm-maxfreq-gpu: 800000000 Hz

See also: https://patchwork.freedesktop.org/patch/468505/

v2: Add dev-maxfreq-$engine and update drm-usage-stats.rst
v3: spelling and compiler warning

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Patchwork: https://patchwork.freedesktop.org/patch/488906/
Link: https://lore.kernel.org/r/20220609174213.2265938-2-robdclark@gmail.com
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>


# c8af219d 18-Jun-2022 Rob Clark <robdclark@chromium.org>

drm/msm: Don't overwrite hw fence in hw_init

Prior to the last commit, this could result in setting the GPU
written fence value back to an older value, if we had missed
updating completed_fence prior to suspend. This was mostly
harmless as the GPU would eventually overwrite it again with
the correct value. But we should just not do this. Instead
just leave a sanity check that the fence looks plausible (in
case the GPU scribbled on memory).

Reported-by: Steev Klimaszewski <steev@kali.org>
Fixes: 95d1deb02a9c ("drm/msm/gem: Add fenced vma unpin")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Tested-by: Steev Klimaszewski <steev@kali.org>
Patchwork: https://patchwork.freedesktop.org/patch/490138/
Link: https://lore.kernel.org/r/20220618161120.3451993-2-robdclark@gmail.com


# 3c7a5221 18-Jun-2022 Rob Clark <robdclark@chromium.org>

drm/msm: Drop update_fences()

I noticed while looking at some traces, that we could miss calls to
msm_update_fence(), as the irq could have raced with retire_submits()
which could have already popped the last submit on a ring out of the
queue of in-flight submits. But walking the list of submits in the
irq handler isn't really needed, as dma_fence_is_signaled() will dtrt.
So lets just drop it entirely.

v2: use spin_lock_irqsave/restore as we are no longer protected by the
spin_lock_irqsave/restore() in update_fences()

Reported-by: Steev Klimaszewski <steev@kali.org>
Fixes: 95d1deb02a9c ("drm/msm/gem: Add fenced vma unpin")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Tested-by: Steev Klimaszewski <steev@kali.org>
Patchwork: https://patchwork.freedesktop.org/patch/490136/
Link: https://lore.kernel.org/r/20220618161120.3451993-1-robdclark@gmail.com


# 49e47761 08-Jun-2022 Rob Clark <robdclark@chromium.org>

drm/msm: Switch ordering of runpm put vs devfreq_idle

In msm_devfreq_suspend() we cancel idle_work synchronously so that it
doesn't run after we power of the hw or in the resume path. But this
means that we want to ensure that idle_work is not scheduled *after* we
no longer hold a runpm ref. So switch the ordering of pm_runtime_put()
vs msm_devfreq_idle().

v2. Only move the runpm _put_autosuspend, and not the _mark_last_busy()

Fixes: 9bc95570175a ("drm/msm: Devfreq tuning")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20210927152928.831245-1-robdclark@gmail.com
Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20220608161334.2140611-1-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 36a1d1bd 21-Apr-2022 Luca Weiss <luca@z3ntu.xyz>

drm/msm: Fix null pointer dereferences without iommu

Check if 'aspace' is set before using it as it will stay null without
IOMMU, such as on msm8974.

Fixes: bc2112583a0b ("drm/msm/gpu: Track global faults per address-space")
Signed-off-by: Luca Weiss <luca@z3ntu.xyz>
Link: https://lore.kernel.org/r/20220421203455.313523-1-luca@z3ntu.xyz
Signed-off-by: Rob Clark <robdclark@chromium.org>


# f9d5355f 11-Apr-2022 Rob Clark <robdclark@chromium.org>

drm/msm/gpu: Drop duplicate fence counter

The ring seqno counter duplicates the fence-context last_fence counter.
They end up getting incremented in lock-step, on the same scheduler
thread, but the split just makes things less obvious.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20220411215849.297838-3-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>


# d4726d77 17-Mar-2022 Rob Clark <robdclark@chromium.org>

drm/msm: Add a way to override processes comm/cmdline

In the cause of using the GPU via virtgpu, the host side process is
really a sort of proxy, and not terribly interesting from the PoV of
crash/fault logging. Add a way to override these per process so that
we can see the guest process's name.

v2: Handle kmalloc failure, add comment to explain kstrdup returns
NULL if passed NULL [Dan Carpenter]

Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20220317165144.222101-4-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 39ba0c0d 17-Mar-2022 Rob Clark <robdclark@chromium.org>

drm/msm: Split out helper to get comm/cmdline

Deduplicate this from fault_worker and recover_worker.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20220317165144.222101-3-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 90f45c42 03-Mar-2022 Rob Clark <robdclark@chromium.org>

drm/msm: Add SYSPROF param (v2)

Add a SYSPROF param for system profiling tools like Mesa's pps-producer
(perfetto) to control behavior related to system-wide performance
counter collection. In particular, for profiling, one wants to ensure
that GPU context switches do not effect perfcounter state, and might
want to suppress suspend (which would cause counters to lose state).

v2: Swap the order in msm_file_private_set_sysprof() [sboyd] and
initialize the sysprof_active refcount to one (because the under/
overflow checking in refcount_t doesn't expect a 0->1 transition)
meaning that values greater than 1 means sysprof is active.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20220304005317.776110-4-robdclark@gmail.com


# 0737ab95 25-Feb-2022 Akhil P Oommen <quic_akhilpo@quicinc.com>

drm/msm: Use generic name for gpu resources

Use generic name for resources like irq and kthread instead of hardware
specific name to make it easier to grep.

Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
Link: https://lore.kernel.org/r/20220226005021.v2.1.Id3d2e7391192c86d0783aeb307d3f9fb61f9efee@changeid
Signed-off-by: Rob Clark <robdclark@chromium.org>


# bc211258 01-Feb-2022 Rob Clark <robdclark@chromium.org>

drm/msm/gpu: Track global faults per address-space

Other processes don't need to know about faults that they are isolated
from by virtue of address space isolation. They are only interested in
whether some of their state might have been corrupted.

But to be safe, also track unattributed faults. This case should really
never happen unless there is a kernel bug (and that would never happen,
right?)

v2: Instead of adding a new param, just change the behavior of the
existing param to match what userspace actually wants [anholt]

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5934
Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20220201161618.778455-3-robdclark@gmail.com
Reviewed-by: Emma Anholt <emma@anholt.net>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# c0e745d7 05-Jan-2022 Dmitry Baryshkov <dmitry.baryshkov@linaro.org>

drm/msm: drop dbgname argument from msm_ioremap*()

msm_ioremap() functions take additional argument dbgname which is now
unused.

Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Link: https://lore.kernel.org/r/20220105232700.444170-2-dmitry.baryshkov@linaro.org
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>


# 167a668a 08-Jan-2022 Rob Clark <robdclark@chromium.org>

drm/msm/gpu: Wait for idle before suspending

System suspend uses pm_runtime_force_suspend(), which cheekily bypasses
the runpm reference counts. This doesn't actually work so well when the
GPU is active. So add a reasonable delay waiting for the GPU to become
idle.

Alternatively we could just return -EBUSY in this case, but that has the
disadvantage of causing system suspend to fail.

v2: s/ret/remaining [sboyd], and switch to using active_submits count
to ensure we aren't racing with submit cleanup (and devfreq idle
work getting scheduled, etc)
v3: fix inverted logic

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://lore.kernel.org/r/20220108180913.814448-2-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 5f3aee4c 09-Nov-2021 Rob Clark <robdclark@chromium.org>

drm/msm: Handle fence rollover

Add some helpers for fence comparision, which handle rollover properly,
and stop open coding fence seqno comparisions.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Akhil P Oommen <akhilpo@codeaurora.org>
Link: https://lore.kernel.org/r/20211109181117.591148-5-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>


# c28e2f2b 09-Nov-2021 Rob Clark <robdclark@chromium.org>

drm/msm: Remove struct_mutex usage

The remaining struct_mutex usage is just to serialize various gpu
related things (submit/retire/recover/fault/etc), so replace
struct_mutex with gpu->lock.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20211109181117.591148-4-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 1d054c9b 09-Nov-2021 Rob Clark <robdclark@chromium.org>

drm/msm: Drop priv->lastctx

cur_ctx_seqno already does the same thing, but handles the edge cases
where a refcnt'd context can live after lastclose. So let's not have
two ways to do the same thing.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Akhil P Oommen <akhilpo@codeaurora.org>
Link: https://lore.kernel.org/r/20211109181117.591148-3-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>


# b220c154 29-Sep-2021 Tim Gardner <tim.gardner@canonical.com>

drm/msm: prevent NULL dereference in msm_gpu_crashstate_capture()

Coverity complains of a possible NULL dereference:

CID 120718 (#1 of 1): Dereference null return value (NULL_RETURNS)
23. dereference: Dereferencing a pointer that might be NULL state->bos when
calling msm_gpu_crashstate_get_bo. [show details]
301 msm_gpu_crashstate_get_bo(state, submit->bos[i].obj,
302 submit->bos[i].iova, submit->bos[i].flags);

Fix this by employing the same state->bos NULL check as is used in the next
for loop.

Cc: Rob Clark <robdclark@gmail.com>
Cc: Sean Paul <sean@poorly.run>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: linux-arm-msm@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: freedreno@lists.freedesktop.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20210929162554.14295-1-tim.gardner@canonical.com
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 1d8a5ca4 27-Jul-2021 Rob Clark <robdclark@chromium.org>

drm/msm: Conversion to drm scheduler

For existing adrenos, there is one or more ringbuffer, depending on
whether preemption is supported. When preemption is supported, each
ringbuffer has it's own priority. A submitqueue (which maps to a
gl context or vk queue in userspace) is mapped to a specific ring-
buffer at creation time, based on the submitqueue's priority.

Each ringbuffer has it's own drm_gpu_scheduler. Each submitqueue
maps to a drm_sched_entity. And each submit maps to a drm_sched_job.

Closes: https://gitlab.freedesktop.org/drm/msm/-/issues/4
Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/r/20210728010632.2633470-10-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>


# be40596b 27-Jul-2021 Rob Clark <robdclark@chromium.org>

drm/msm: Consolidate submit bo state

Move all the locked/active/pinned state handling to msm_gem_submit.c.
In particular, for drm/scheduler, we'll need to do all this before
pushing the submit job to the scheduler. But while we're at it we can
get rid of the dupicate pin and refcnt.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/r/20210728010632.2633470-7-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 030af2b0 27-Jul-2021 Rob Clark <robdclark@chromium.org>

drm/msm: drop drm_gem_object_put_locked()

No idea why we were still using this. It certainly hasn't been needed
for some time. So drop the pointless twin codepaths.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/r/20210728010632.2633470-4-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 9bc95570 26-Jul-2021 Rob Clark <robdclark@chromium.org>

drm/msm: Devfreq tuning

This adds a few things to try and make frequency scaling better match
the workload:

1) Longer polling interval to avoid whip-lashing between too-high and
too-low frequencies in certain workloads, like mobile games which
throttle themselves to 30fps.

Previously our polling interval was short enough to let things
ramp down to minimum freq in the "off" frame, but long enough to
not react quickly enough when rendering started on the next frame,
leading to uneven frame times. (Ie. rather than a consistent 33ms
it would alternate between 16/33/48ms.)

2) Awareness of when the GPU is active vs idle. Since we know when
the GPU is active vs idle, we can clamp the frequency down to the
minimum while it is idle. (If it is idle for long enough, then
the autosuspend delay will eventually kick in and power down the
GPU.)

Since devfreq has no knowledge of powered-but-idle, this takes a
small bit of trickery to maintain a "fake" frequency while idle.
This, combined with the longer polling period allows devfreq to
arrive at a reasonable "active" frequency, while still clamping
to minimum freq when idle to reduce power draw.

3) Boost. Because simple_ondemand needs to see a certain threshold
of busyness to ramp up, we could end up needing multiple polling
cycles before it reacts appropriately on interactive workloads
(ex. scrolling a web page after reading for some time), on top
of the already lengthened polling interval, when we see a idle
to active transition after a period of idle time we boost the
frequency that we return to.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20210726144653.2180096-4-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>


# af5b4fff 26-Jul-2021 Rob Clark <robdclark@chromium.org>

drm/msm: Split out devfreq handling

Before we start adding more cleverness, split it into it's own file.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20210726144653.2180096-2-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 298287f6 26-Jul-2021 Rob Clark <robdclark@chromium.org>

drm/msm: Signal fences sooner

Nothing we do to in update_fences() can't be done in an atomic context,
so move this into the GPU's irq context to reduce latency (and call
dma_fence_signal() so we aren't relying on dma_fence_is_signaled() which
would defeat the purpose).

Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20210726144359.2179302-3-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>


# e25e92e0 10-Jun-2021 Rob Clark <robdclark@chromium.org>

drm/msm: devcoredump iommu fault support

Wire up support to stall the SMMU on iova fault, and collect a devcore-
dump snapshot for easier debugging of faults.

Currently this is a6xx-only, but mostly only because so far it is the
only one using adreno-smmu-priv.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Jordan Crouse <jordan@cosmicpenguin.net>
Link: https://lore.kernel.org/r/20210610214431.539029-6-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 1d2fa58e 06-Jun-2021 Samuel Iglesias Gonsalvez <siglesias@igalia.com>

drm/msm: export hangcheck_period in debugfs

While keeping the previous default value for hangcheck period,
we allow now the possibility of configuring its value via
debugfs.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Link: https://lore.kernel.org/r/20210607104441.184700-1-siglesias@igalia.com
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 3ab1c5cc 24-Mar-2021 Rob Clark <robdclark@chromium.org>

drm/msm: Add param for userspace to query suspend count

Performance counts, and ALWAYS_ON counters used for capturing GPU
timestamps, lose their state across suspend/resume cycles. Userspace
tooling for performance monitoring needs to be aware of this. For
example, after a suspend userspace needs to recalibrate it's offset
between CPU and GPU time.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Jordan Crouse <jordan@cosmicpenguin.net>
Link: https://lore.kernel.org/r/20210325012358.1759770-3-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>


# ab5c54cb 16-Nov-2020 Rob Clark <robdclark@chromium.org>

drm/msm: Protect obj->active_count under obj lock

Previously we only held obj lock in the _active_get() path, and relied
on atomic_dec_return() to not be racy in the _active_put() path where
obj lock was not held.

But this is a false sense of security. Unlike obj lifetime refcnt,
where you do not expect to *increase* the refcnt after the last put
(which would mean that something has gone horribly wrong with the
object liveness reference counting), the active_count can increase
again from zero. Racing _active_put()s and _active_get()s could leave
the obj on the wrong mm list.

But in the retire path, immediately after the _active_put(), the
_unpin_iova() would acquire obj lock. So just move the locking earlier
and rely on that to protect obj->active_count.

Fixes: c5c1643cef7a ("drm/msm: Drop struct_mutex from the retire path")
Signed-off-by: Rob Clark <robdclark@chromium.org>


# ec793cf0 30-Oct-2020 Akhil P Oommen <akhilpo@codeaurora.org>

drm/msm: Add support for GPU cooling

Register GPU as a devfreq cooling device so that it can be passively
cooled by the thermal framework.

Signed-off-by: Akhil P Oommen <akhilpo@codeaurora.org>
Tested-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# c5c1643c 23-Oct-2020 Rob Clark <robdclark@chromium.org>

drm/msm: Drop struct_mutex from the retire path

Now that we are not relying on dev->struct_mutex to protect the
ring->submits lists, drop the struct_mutex lock.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# fb1a1fcb 23-Oct-2020 Rob Clark <robdclark@chromium.org>

drm/msm: Remove obj->gpu

It cannot be atomically updated with obj->active_count, and the only
purpose is a useless WARN_ON() (which becomes a buggy WARN_ON() once
retire_submits() is not serialized with incoming submits via
struct_mutex)

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 964d2f97 23-Oct-2020 Rob Clark <robdclark@chromium.org>

drm/msm: Refcount submits

Before we remove dev->struct_mutex from the retire path, we have to deal
with the situation of a submit retiring before the submit ioctl returns.

To deal with this, ring->submits will hold a reference to the submit,
which is dropped when the submit is retired. And the submit ioctl path
holds it's own ref, which it drops when it is done with the submit.

Also, add to submit list *after* getting/pinning bo's, to prevent badness
in case the completed fence is corrupted, and retire_worker mistakenly
believes the submit is done too early.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 77d20529 23-Oct-2020 Rob Clark <robdclark@chromium.org>

drm/msm: Protect ring->submits with it's own lock

One less place to rely on dev->struct_mutex.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 2a86efb1 23-Oct-2020 Rob Clark <robdclark@chromium.org>

drm/msm: Move update_fences()

Small cleanup, update_fences() is used in the hangcheck path, but also
in the normal retire path.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 07ddf4c3 23-Oct-2020 Rob Clark <robdclark@chromium.org>

drm/msm: Drop chatty trace

It is somewhat redundant with the gpu tracepoints, and anyways not too
useful to justify spamming the log when debug traces are enabled.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 6c0e3ea2 23-Oct-2020 Rob Clark <robdclark@chromium.org>

drm/msm/gem: Switch over to obj->resv for locking

This also converts the special msm_gem_get_vaddr_active() to expect the
lock to already be held. There are two call-sites for this, one already
has the lock held, so it is more straightforward to just open-code the
locking for the other caller.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# dd29bd41 19-Oct-2020 Tian Tao <tiantao6@hisilicon.com>

drm/msm: Remove redundant null check

clk_prepare_enable() and clk_disable_unprepare() will check
NULL clock parameter, so It is not necessary to add additional checks.

Signed-off-by: Tian Tao <tiantao6@hisilicon.com>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 7e688294 19-Oct-2020 Rob Clark <robdclark@chromium.org>

drm/msm/gpu: Convert retire/recover work to kthread_worker

Signed-off-by: Rob Clark <robdclark@chromium.org>


# 9d8baa2b 22-Sep-2020 Akhil P Oommen <akhilpo@codeaurora.org>

drm/msm: Fix premature purging of BO

In the case where we have a back-to-back submission that shares the same
BO, this BO will be prematurely moved to inactive_list while retiring the
first submit. But it will be still part of the second submit which is
being processed by the GPU. Now, if the shrinker happens to be triggered at
this point, it will result in a premature purging of this BO.

To fix this, we need to refcount BO while doing submit and retire. Then,
it should be moved to inactive list when this refcount becomes 0.

Signed-off-by: Akhil P Oommen <akhilpo@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 25faf2f2 17-Aug-2020 Rob Clark <robdclark@chromium.org>

drm/msm: Show process names in gem_describe

In $debugfs/gem we already show any vma(s) associated with an object.
Also show process names if the vma's address space is a per-process
address space.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>


# 933415e2 17-Aug-2020 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Add support for private address space instances

Add support for allocating private address space instances. Targets that
support per-context pagetables should implement their own function to
allocate private address spaces.

The default will return a pointer to the global address space.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>


# 15eb9ad0 17-Aug-2020 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Drop context arg to gpu->submit()

Now that we can get the ctx from the submitqueue, the extra arg is
redundant.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
[split out of previous patch to reduce churny noise]
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>


# 9cba4056 17-Aug-2020 Rob Clark <robdclark@chromium.org>

drm/msm: Set adreno_smmu as gpu's drvdata

This will be populated by adreno-smmu, to provide a way for coordinating
enabling/disabling TTBR0 translation.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>


# 69a9313b 17-Aug-2020 Rob Clark <robdclark@chromium.org>

drm/msm/gpu: Add dev_to_gpu() helper

In a later patch, the drvdata will not directly be 'struct msm_gpu *',
so add a helper to reduce the churn.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>


# ec1cb6e4 01-Sep-2020 Rob Clark <robdclark@chromium.org>

drm/msm/gpu: Add suspend/resume tracepoints

Signed-off-by: Rob Clark <robdclark@chromium.org>


# 74c0a69c 01-Sep-2020 Rob Clark <robdclark@chromium.org>

drm/msm/gpu: Add GPU freq_change traces

Technically the GMU specific one is a bit redundant, but it was useful
to track down a bug.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>


# 604234f3 03-Sep-2020 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Enable expanded apriv support for a650

a650 supports expanded apriv support that allows us to map critical buffers
(ringbuffer and memstore) as as privileged to protect them from corruption.

Cc: stable@vger.kernel.org
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 1f60d114 13-Jul-2020 Sharat Masetty <smasetty@codeaurora.org>

drm: msm: a6xx: send opp instead of a frequency

This patch changes the plumbing to send the devfreq recommended opp rather
than the frequency. Also consolidate and rearrange the code in a6xx to set
the GPU frequency and the icc vote in preparation for the upcoming
changes for GPU->DDR scaling votes.

Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>
Signed-off-by: Akhil P Oommen <akhilpo@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 0ded520b 13-Jul-2020 Jonathan Marek <jonathan@marek.ca>

drm/msm: reset devfreq freq_table/max_state before devfreq_add_device

These never get set back to 0 when probing fails, so an attempt to probe
again results in broken behavior. Fix the problem by setting thse to zero
before they are used.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# ccac7ce3 22-May-2020 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Refactor address space initialization

Refactor how address space initialization works. Instead of having the
address space function create the MMU object (and thus require separate but
equal functions for gpummu and iommu) use a single function and pass the
MMU struct in. Make the generic code cleaner by using target specific
functions to create the address space so a2xx can do its own thing in its
own space. For all the other targets use a generic helper to initialize
IOMMU but leave the door open for newer targets to use customization
if they need it.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Tested-by: Shawn Guo <shawn.guo@linaro.org>
[squash in rebase fixups]
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 52da6d51 22-May-2020 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Attach the IOMMU device during initialization

Everywhere an IOMMU object is created by msm_gpu_create_address_space
the IOMMU device is attached immediately after. Instead of carrying around
the infrastructure to do the attach from the device specific code do it
directly in the msm_iommu_init() function. This gets it out of the way for
more aggressive cleanups that follow.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Tested-by: Shawn Guo <shawn.guo@linaro.org>
[squash in rebase fixups and fix for unused fxn]
Signed-off-by: Rob Clark <robdclark@chromium.org>


# eecd7fd8 15-May-2020 Emil Velikov <emil.velikov@collabora.com>

drm/gem: add _locked suffix to drm_gem_object_put

Vast majority of DRM (core and drivers) are struct_mutex free.

As such we have only a handful of cases where the locked helper should
be used. Make that stand out a little bit better.

Done via the following script:

__from=drm_gem_object_put
__to=drm_gem_object_put_locked

for __file in $(git grep --name-only --word-regexp $__from); do
sed -i "s/\<$__from\>/$__to/g" $__file;
done

Cc: Rob Clark <robdclark@gmail.com>
Cc: Sean Paul <sean@poorly.run>
Cc: linux-arm-msm@vger.kernel.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Reviewed-by: Steven Price <steven.price@arm.com>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20200515095118.2743122-12-emil.l.velikov@gmail.com


# e515af8d 18-Feb-2020 Rob Clark <robdclark@chromium.org>

drm/msm: devcoredump should dump MSM_SUBMIT_BO_DUMP buffers

Also log buffers with the DUMP flag set, to ensure we capture all useful
cmdstream in crashdump state with modern mesa.

Otherwise we miss out on the contents of "state object" cmdstream
buffers.

v2: add missing 'inline'

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>


# 70082a52 18-Sep-2019 Arnd Bergmann <arnd@arndb.de>

drm/msm: include linux/sched/task.h

Without this header file, compile-testing may run into a missing
declaration:

drivers/gpu/drm/msm/msm_gpu.c:444:4: error: implicit declaration of function 'put_task_struct' [-Werror,-Wimplicit-function-declaration]

Fixes: 482f96324a4e ("drm/msm: Fix task dump in gpu recovery")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 53bf7f7a 16-Sep-2019 Drew Davenport <ddavenport@chromium.org>

drm/msm: Remove unused function arguments

The arguments related to IOMMU port name have been unused since
commit 944fc36c31ed ("drm/msm: use upstream iommu") and can be removed.

Signed-off-by: Drew Davenport <ddavenport@chromium.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 8e3e791d 25-Jul-2019 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Use generic bulk clock function

Remove the homebrewed bulk clock get function and replace it with
devm_clk_bulk_get_all().

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 67fe62dc 24-Jul-2019 Yue Hu <huyue2@yulong.com>

drm: Switch to use DEVFREQ_GOV_SIMPLE_ONDEMAND constant

Since governor name is defined by DEVFREQ framework internally, use the
macro definition instead of using the name directly.

Signed-off-by: Yue Hu <huyue2@yulong.com>
Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com>
Acked-by: Jordan Crouse <jcrouse@codeaurora.org> for the msm part.
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20190725035239.1192-1-zbestahu@gmail.com


# caab277b 02-Jun-2019 Thomas Gleixner <tglx@linutronix.de>

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234

Based on 1 normalized pattern(s):

this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation this program is
distributed in the hope that it will be useful but without any
warranty without even the implied warranty of merchantability or
fitness for a particular purpose see the gnu general public license
for more details you should have received a copy of the gnu general
public license along with this program if not see http www gnu org
licenses

extracted by the scancode license scanner the SPDX license identifier

GPL-2.0-only

has been chosen to replace the boilerplate/reference in 503 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Enrico Weigelt <info@metux.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190602204653.811534538@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 295b22ae 07-May-2019 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Pass the MMU domain index in struct msm_file_private

Pass the index of the MMU domain in struct msm_file_private instead
of assuming gpu->id throughout the submit path. This clears the way
to change ctx->aspace to a per-instance pagetable.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 8ea274ac 20-Mar-2019 Kristian H. Kristensen <hoegsberg@gmail.com>

drm/msm: Stop dropping struct_mutex in recover_worker()

Now that we don't have the mmap_sem lock inversion, we don't need to
jump through this particular hoop anymore.

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# b0fb6604 22-Mar-2019 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm/gpu: Add submit queue queries

Add the capability to query information from a submit queue.
The first available parameter is for querying the number of GPU faults
(hangs) that can be attributed to the queue.

This is useful for implementing context robustness. A user context can
regularly query the number of faults to see if it is responsible for any
and if so it can invalidate itself.

This is also helpful for testing by confirming to the user driver if a
particular command stream caused a fault (or not as the case may be).

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 48dc4241 16-Apr-2019 Rob Clark <robdclark@chromium.org>

drm/msm: add param to retrieve # of GPU faults (global)

For KHR_robustness, userspace wants to know two things, the count of GPU
faults globally, and the count of faults attributed to a given context.
This patch providees the former, and the next patch provides the latter.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>


# 2255f244 18-Dec-2018 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm/gpu: Remove hardcoded interrupt name

Every GPU core only has one interrupt so there isn't any
value in looking up the interrupt by name. Remove the name (which
is legacy anyway) and use platform_get_irq() instead.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 878411ae 18-Dec-2018 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm/gpu: Remove hardcoded interrupt name

Every GPU core only has one interrupt so there isn't any
value in looking up the interrupt by name. Remove the name (which
is legacy anyway) and use platform_get_irq() instead.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# c2052a4e 14-Nov-2018 Jonathan Marek <jonathan@marek.ca>

drm/msm: implement a2xx mmu

A2XX has its own very simple MMU.

Added a msm_use_mmu() function because we can't rely on iommu_present to
decide to use MMU or not.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 0815d774 07-Nov-2018 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Add a name field for gem objects

For debugging purposes it is useful to assign descriptions
to buffers so that we know what they are used for. Add
a field to the buffer object and use that to name the various
kernel side allocations which ends up looking like like this
in /d/dri/X/gem:

flags id ref offset kaddr size madv name
00040000: I 0 ( 1) 00000000 0000000070b79eca 00004096 memptrs
vmas: [gpu: 01000000,mapped,inuse=1]
00020000: I 0 ( 1) 00000000 0000000031ed4074 00032768 ring0

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 7ad0e8cf 07-Nov-2018 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Count how many times iova memory is pinned

Add a reference count to track how many times a particular
chunk of iova memory is pinned (mapped) in the iomu and
add msm_gem_unpin_iova to give up references.

It is important to note that msm_gem_unpin_iova replaces
msm_gem_put_iova because the new implicit behavior
that an assigned iova in a given vma is now valid for the
life of the buffer and what we are really focusing on is
the use of that iova.

For now the unmappings are lazy; once the reference counts
go to zero they *COULD* be unmapped dynamically but that
will require an outside force such as a shrinker or
mm_notifiers. For now, we're just focusing on getting
the counting right and setting ourselves up to be ready
for the future.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 9fe041f6 07-Nov-2018 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Add msm_gem_get_and_pin_iova()

Add a new function to get and pin the iova memory in one
step (basically renaming the old msm_gem_get_iova function)
and switch msm_gem_get_iova() to only allocate an iova but
not map it in the IOMMU. This is only currently used by
msm_ioctl_gem_info() since all other users of of the iova
expect that the memory be immediately available.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 1e29dff0 07-Nov-2018 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Add a common function to free kernel buffer objects

Buffer objects allocated with msm_gem_kernel_new() are mostly
freed the same way so we can save a few lines of code with a
common function.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 896a248a 02-Nov-2018 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm/gpu: Only store local command buffers in the GPU state

Instead of trying to store all the tagged buffers from a hanging
submit only store the command buffers that were not imported.
This cuts down on the amount of data stored in the GPU state to
the base minimum of useful information.

The downside is that this will make it more difficult to
successfully replay a hang with just the GPU state but there
isn't any reason why that functionality can't be added back
in later once we've figured out how to better communicate
such massive amounts of data.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 4241db42 02-Nov-2018 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm/gpu: Add trace events for tracking GPU submissions

Add trace events to track the progress of a GPU submission
msm_gpu_submit occurs at the beginning of the submissions,
msm_gpu_submit_flush happens when the submission is put on
the ringbuffer and msm_submit_flush_retired is sent when
the operation is retired.

To make it easier to track the operations a unique sequence
number is assigned to each submission and displayed in each
event output so a human or a script can easily associate
the events related to a specific submission.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 546ec7b4 02-Nov-2018 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm/gpu: Allocate the correct size for the GPU memptrs

Allocate the correct buffer size for the GPU memptrs. The incorrect
size hasn't affected us thus far since the incorrect size was larger
than the intended size and we're still stuck on page sized
granularity anyway but technically correct is the best kind of
correct.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 6a41da17 20-Oct-2018 Mamta Shukla <mamtashukla555@gmail.com>

drm: msm: Use DRM_DEV_* instead of dev_*

Use DRM_DEV_INFO/ERROR/WARN instead of dev_info/err/debug to generate
drm-formatted specific log messages so that it will be easy to
differentiate in case of multiple instances of driver.

Signed-off-by: Mamta Shukla <mamtashukla555@gmail.com>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 482f9632 12-Oct-2018 Sharat Masetty <smasetty@codeaurora.org>

drm/msm: Fix task dump in gpu recovery

The current recovery code gets a pointer to the task struct and does a
few things all within the rcu_read_lock. This puts constraints on the
types of gfp flags that can be used within the rcu lock. This patch
instead gets a reference to the task within the rcu lock and releases
the lock immediately, this way the task stays afloat until we need it and
we also get to use the desired gfp flags.

Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Sean Paul <seanpaul@chromium.org>


# 4f3a31a8 12-Oct-2018 Sharat Masetty <smasetty@codeaurora.org>

drm/msm: Check if target supports crash dump capture

This patch simply checks first to see if the target can support crash dump
capture before proceeding.

Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Sean Paul <seanpaul@chromium.org>


# de0a3d09 04-Oct-2018 Sharat Masetty <smasetty@codeaurora.org>

drm/msm: re-factor devfreq code

The devfreq framework requires the drivers to provide busy time estimations.
The GPU driver relies on the hardware performance counteres for the busy time
estimations, but different hardware revisions have counters which can be
sourced from different clocks. So the busy time estimation will be target
dependent. Additionally on targets where the clocks are completely controlled
by the on chip microcontroller, fetching and setting the current GPU frequency
will be different. This patch aims to embrace these differences by re-factoring
the devfreq code a bit.

Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# d3fa91c9 04-Oct-2018 Sharat Masetty <smasetty@codeaurora.org>

drm/msm: suspend devfreq on init

Devfreq turns on and starts recommending power level as soon as it is
initialized. The GPU is still not powered on by the time the devfreq
init happens and this leads to problems on GPU's where register access
is needed to get/set power levels. So we start suspended and only restart
devfreq when GPU is powered on.

Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 6969019f 31-Jul-2018 Anders Roxell <anders.roxell@linaro.org>

drm/msm/gpu: fix parameters in function msm_gpu_crashstate_capture

When CONFIG_DEV_COREDUMP isn't defined msm_gpu_crashstate_capture
doesn't pass the correct parameters.
drivers/gpu/drm/msm/msm_gpu.c: In function ‘recover_worker’:
drivers/gpu/drm/msm/msm_gpu.c:479:34: error: passing argument 2 of ‘msm_gpu_crashstate_capture’ from incompatible pointer type [-Werror=incompatible-pointer-types]
msm_gpu_crashstate_capture(gpu, submit, comm, cmd);
^~~~~~
drivers/gpu/drm/msm/msm_gpu.c:388:13: note: expected ‘char *’ but argument is of type ‘struct msm_gem_submit *’
static void msm_gpu_crashstate_capture(struct msm_gpu *gpu, char *comm,
^~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/msm/msm_gpu.c:479:2: error: too many arguments to function ‘msm_gpu_crashstate_capture’
msm_gpu_crashstate_capture(gpu, submit, comm, cmd);
^~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/msm/msm_gpu.c:388:13: note: declared here
static void msm_gpu_crashstate_capture(struct msm_gpu *gpu, char *comm,

In current code the function msm_gpu_crashstate_capture parameters.

Fixes: cdb95931dea3 ("drm/msm/gpu: Add the buffer objects from the submit to the crash dump")
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Reviewed-By: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 4b565ca5 06-Aug-2018 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Add A6XX device support

Add support for the A6XX family of Adreno GPUs. The biggest addition
is the GMU (Graphics Management Unit) which takes over most of the
power management of the GPU itself but in a ironic twist of fate
needs a goodly amount of management itself. Add support for the
A6XX core code, the GMU and the HFI (hardware firmware interface)
queue that the CPU uses to communicate with the GMU.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 8e54eea5 06-Aug-2018 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Add a helper function to parse clock names

Add a helper function to parse the clock names and set up
the bulk data so we can take advantage of the bulk clock
functions instead of rolling our own. This is added
as a helper function so the upcoming a6xx GMU code can
also take advantage of it.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 3530a17f 26-Jul-2018 Arnd Bergmann <arnd@arndb.de>

drm/msm/gpu: avoid deprecated do_gettimeofday

All users of do_gettimeofday() have been removed, but this one recently
crept in, along with an incorrect printing of the microseconds portion.

This converts it to using ktime_get_real_timespec64() as a direct
replacement, and adds the leading zeroes. I considered using monotonic
times (ktime_get()) instead, but as this timestamp appears to only
be used for humans rather than compared with other timestamps, the
real time domain is probably good enough.

Fixes: e43b045e2c82 ("drm/msm/gpu: Capture the state of the GPU")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# cdb95931 24-Jul-2018 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm/gpu: Add the buffer objects from the submit to the crash dump

For hangs, dump copy out the contents of the buffer objects attached to the
guilty submission and print them in the crash dump report.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# c0fec7f5 24-Jul-2018 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm/gpu: Capture the GPU state on a GPU hang

Capture the GPU state on a GPU hang and store it for later playback
via the devcoredump facility. Only one crash state is stored at a
time on the assumption that the first hang is usually the most
interesting. The existing crash state can be cleared after capturing
it and then a new one will be captured on the next hang.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 65a3c274 24-Jul-2018 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm/gpu: Rearrange the code that collects the task during a hang

Do a bit of cleanup to prepare for upcoming changes to pass the
hanging task comm and cmdline to the crash dump function.

v2: Use GFP_ATOMIC while holding the rcu lock per Chris Wilson

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 9d20a0e6 22-Jan-2018 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm/gpu: Set number of clocks to 0 if the list allocation fails

If we fail to allocate gpu->grp_clks reset the number of available
clocks to zero to avoid referencing the missing array later.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# dc9a9b32 25-Jan-2018 Steve Kowalik <steven@wedontsleep.org>

drm/msm: Replace gem_object deprecated functions

drm_gem_object_{reference,unreference,unreference_unlocked} are
deprecated functions, and merely alias to the get/put functions.
Switch to the new names.

Signed-off-by: Steve Kowalik <steven@wedontsleep.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# f91c14ab 10-Jan-2018 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Add devfreq support for the GPU

Add support for devfreq to dynamically control the GPU frequency.
By default try to use the 'simple_ondemand' governor which can
adjust the frequency based on GPU load.

v2: Fix __aeabi_uldivmod issue from the 0 day bot and use
devfreq_recommended_opp() as suggested by Rob.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 1babd706 21-Nov-2017 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm/gpu: Remove unused bus scaling code

Remove the downstream bus scaling code. It isn't needed for for
compatibility with a downstream or vendor kernel. Get it out of the
way to clear space for devfreq support.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 7ddae82e 13-Dec-2017 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: gpu: Only sync fences on rings that exist

The fault recovery code tries to sync fences on all possible rings
instead of only the rings that actually exist which will fault the
kernel when the number of rings are less than the maximum amount.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 2d2bccef 12-Nov-2017 Rob Clark <robdclark@gmail.com>

drm/msm: free kstrdup'd cmdline

Fixes: 18bb8a6 'drm/msm: show task cmdline in gpu recovery messages'
Signed-off-by: Rob Clark <robdclark@gmail.com>


# e99e88a9 16-Oct-2017 Kees Cook <keescook@chromium.org>

treewide: setup_timer() -> timer_setup()

This converts all remaining cases of the old setup_timer() API into using
timer_setup(), where the callback argument is the structure already
holding the struct timer_list. These should have no behavioral changes,
since they just change which pointer is passed into the callback with
the same available pointers after conversion. It handles the following
examples, in addition to some other variations.

Casting from unsigned long:

void my_callback(unsigned long data)
{
struct something *ptr = (struct something *)data;
...
}
...
setup_timer(&ptr->my_timer, my_callback, ptr);

and forced object casts:

void my_callback(struct something *ptr)
{
...
}
...
setup_timer(&ptr->my_timer, my_callback, (unsigned long)ptr);

become:

void my_callback(struct timer_list *t)
{
struct something *ptr = from_timer(ptr, t, my_timer);
...
}
...
timer_setup(&ptr->my_timer, my_callback, 0);

Direct function assignments:

void my_callback(unsigned long data)
{
struct something *ptr = (struct something *)data;
...
}
...
ptr->my_timer.function = my_callback;

have a temporary cast added, along with converting the args:

void my_callback(struct timer_list *t)
{
struct something *ptr = from_timer(ptr, t, my_timer);
...
}
...
ptr->my_timer.function = (TIMER_FUNC_TYPE)my_callback;

And finally, callbacks without a data assignment:

void my_callback(unsigned long data)
{
...
}
...
setup_timer(&ptr->my_timer, my_callback, 0);

have their argument renamed to verify they're unused during conversion:

void my_callback(struct timer_list *unused)
{
...
}
...
timer_setup(&ptr->my_timer, my_callback, 0);

The conversion is done with the following Coccinelle script:

spatch --very-quiet --all-includes --include-headers \
-I ./arch/x86/include -I ./arch/x86/include/generated \
-I ./include -I ./arch/x86/include/uapi \
-I ./arch/x86/include/generated/uapi -I ./include/uapi \
-I ./include/generated/uapi --include ./include/linux/kconfig.h \
--dir . \
--cocci-file ~/src/data/timer_setup.cocci

@fix_address_of@
expression e;
@@

setup_timer(
-&(e)
+&e
, ...)

// Update any raw setup_timer() usages that have a NULL callback, but
// would otherwise match change_timer_function_usage, since the latter
// will update all function assignments done in the face of a NULL
// function initialization in setup_timer().
@change_timer_function_usage_NULL@
expression _E;
identifier _timer;
type _cast_data;
@@

(
-setup_timer(&_E->_timer, NULL, _E);
+timer_setup(&_E->_timer, NULL, 0);
|
-setup_timer(&_E->_timer, NULL, (_cast_data)_E);
+timer_setup(&_E->_timer, NULL, 0);
|
-setup_timer(&_E._timer, NULL, &_E);
+timer_setup(&_E._timer, NULL, 0);
|
-setup_timer(&_E._timer, NULL, (_cast_data)&_E);
+timer_setup(&_E._timer, NULL, 0);
)

@change_timer_function_usage@
expression _E;
identifier _timer;
struct timer_list _stl;
identifier _callback;
type _cast_func, _cast_data;
@@

(
-setup_timer(&_E->_timer, _callback, _E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, &_callback, _E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, _callback, (_cast_data)_E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, &_callback, (_cast_data)_E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, (_cast_func)_callback, _E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, (_cast_func)&_callback, _E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, (_cast_func)_callback, (_cast_data)_E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, (_cast_func)&_callback, (_cast_data)_E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, (_cast_data)_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, (_cast_data)&_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, &_callback, (_cast_data)_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, &_callback, (_cast_data)&_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, (_cast_func)_callback, (_cast_data)_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, (_cast_func)_callback, (_cast_data)&_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, (_cast_func)&_callback, (_cast_data)_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, (_cast_func)&_callback, (_cast_data)&_E);
+timer_setup(&_E._timer, _callback, 0);
|
_E->_timer@_stl.function = _callback;
|
_E->_timer@_stl.function = &_callback;
|
_E->_timer@_stl.function = (_cast_func)_callback;
|
_E->_timer@_stl.function = (_cast_func)&_callback;
|
_E._timer@_stl.function = _callback;
|
_E._timer@_stl.function = &_callback;
|
_E._timer@_stl.function = (_cast_func)_callback;
|
_E._timer@_stl.function = (_cast_func)&_callback;
)

// callback(unsigned long arg)
@change_callback_handle_cast
depends on change_timer_function_usage@
identifier change_timer_function_usage._callback;
identifier change_timer_function_usage._timer;
type _origtype;
identifier _origarg;
type _handletype;
identifier _handle;
@@

void _callback(
-_origtype _origarg
+struct timer_list *t
)
{
(
... when != _origarg
_handletype *_handle =
-(_handletype *)_origarg;
+from_timer(_handle, t, _timer);
... when != _origarg
|
... when != _origarg
_handletype *_handle =
-(void *)_origarg;
+from_timer(_handle, t, _timer);
... when != _origarg
|
... when != _origarg
_handletype *_handle;
... when != _handle
_handle =
-(_handletype *)_origarg;
+from_timer(_handle, t, _timer);
... when != _origarg
|
... when != _origarg
_handletype *_handle;
... when != _handle
_handle =
-(void *)_origarg;
+from_timer(_handle, t, _timer);
... when != _origarg
)
}

// callback(unsigned long arg) without existing variable
@change_callback_handle_cast_no_arg
depends on change_timer_function_usage &&
!change_callback_handle_cast@
identifier change_timer_function_usage._callback;
identifier change_timer_function_usage._timer;
type _origtype;
identifier _origarg;
type _handletype;
@@

void _callback(
-_origtype _origarg
+struct timer_list *t
)
{
+ _handletype *_origarg = from_timer(_origarg, t, _timer);
+
... when != _origarg
- (_handletype *)_origarg
+ _origarg
... when != _origarg
}

// Avoid already converted callbacks.
@match_callback_converted
depends on change_timer_function_usage &&
!change_callback_handle_cast &&
!change_callback_handle_cast_no_arg@
identifier change_timer_function_usage._callback;
identifier t;
@@

void _callback(struct timer_list *t)
{ ... }

// callback(struct something *handle)
@change_callback_handle_arg
depends on change_timer_function_usage &&
!match_callback_converted &&
!change_callback_handle_cast &&
!change_callback_handle_cast_no_arg@
identifier change_timer_function_usage._callback;
identifier change_timer_function_usage._timer;
type _handletype;
identifier _handle;
@@

void _callback(
-_handletype *_handle
+struct timer_list *t
)
{
+ _handletype *_handle = from_timer(_handle, t, _timer);
...
}

// If change_callback_handle_arg ran on an empty function, remove
// the added handler.
@unchange_callback_handle_arg
depends on change_timer_function_usage &&
change_callback_handle_arg@
identifier change_timer_function_usage._callback;
identifier change_timer_function_usage._timer;
type _handletype;
identifier _handle;
identifier t;
@@

void _callback(struct timer_list *t)
{
- _handletype *_handle = from_timer(_handle, t, _timer);
}

// We only want to refactor the setup_timer() data argument if we've found
// the matching callback. This undoes changes in change_timer_function_usage.
@unchange_timer_function_usage
depends on change_timer_function_usage &&
!change_callback_handle_cast &&
!change_callback_handle_cast_no_arg &&
!change_callback_handle_arg@
expression change_timer_function_usage._E;
identifier change_timer_function_usage._timer;
identifier change_timer_function_usage._callback;
type change_timer_function_usage._cast_data;
@@

(
-timer_setup(&_E->_timer, _callback, 0);
+setup_timer(&_E->_timer, _callback, (_cast_data)_E);
|
-timer_setup(&_E._timer, _callback, 0);
+setup_timer(&_E._timer, _callback, (_cast_data)&_E);
)

// If we fixed a callback from a .function assignment, fix the
// assignment cast now.
@change_timer_function_assignment
depends on change_timer_function_usage &&
(change_callback_handle_cast ||
change_callback_handle_cast_no_arg ||
change_callback_handle_arg)@
expression change_timer_function_usage._E;
identifier change_timer_function_usage._timer;
identifier change_timer_function_usage._callback;
type _cast_func;
typedef TIMER_FUNC_TYPE;
@@

(
_E->_timer.function =
-_callback
+(TIMER_FUNC_TYPE)_callback
;
|
_E->_timer.function =
-&_callback
+(TIMER_FUNC_TYPE)_callback
;
|
_E->_timer.function =
-(_cast_func)_callback;
+(TIMER_FUNC_TYPE)_callback
;
|
_E->_timer.function =
-(_cast_func)&_callback
+(TIMER_FUNC_TYPE)_callback
;
|
_E._timer.function =
-_callback
+(TIMER_FUNC_TYPE)_callback
;
|
_E._timer.function =
-&_callback;
+(TIMER_FUNC_TYPE)_callback
;
|
_E._timer.function =
-(_cast_func)_callback
+(TIMER_FUNC_TYPE)_callback
;
|
_E._timer.function =
-(_cast_func)&_callback
+(TIMER_FUNC_TYPE)_callback
;
)

// Sometimes timer functions are called directly. Replace matched args.
@change_timer_function_calls
depends on change_timer_function_usage &&
(change_callback_handle_cast ||
change_callback_handle_cast_no_arg ||
change_callback_handle_arg)@
expression _E;
identifier change_timer_function_usage._timer;
identifier change_timer_function_usage._callback;
type _cast_data;
@@

_callback(
(
-(_cast_data)_E
+&_E->_timer
|
-(_cast_data)&_E
+&_E._timer
|
-_E
+&_E->_timer
)
)

// If a timer has been configured without a data argument, it can be
// converted without regard to the callback argument, since it is unused.
@match_timer_function_unused_data@
expression _E;
identifier _timer;
identifier _callback;
@@

(
-setup_timer(&_E->_timer, _callback, 0);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, _callback, 0L);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, _callback, 0UL);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, 0);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, 0L);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, 0UL);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_timer, _callback, 0);
+timer_setup(&_timer, _callback, 0);
|
-setup_timer(&_timer, _callback, 0L);
+timer_setup(&_timer, _callback, 0);
|
-setup_timer(&_timer, _callback, 0UL);
+timer_setup(&_timer, _callback, 0);
|
-setup_timer(_timer, _callback, 0);
+timer_setup(_timer, _callback, 0);
|
-setup_timer(_timer, _callback, 0L);
+timer_setup(_timer, _callback, 0);
|
-setup_timer(_timer, _callback, 0UL);
+timer_setup(_timer, _callback, 0);
)

@change_callback_unused_data
depends on match_timer_function_unused_data@
identifier match_timer_function_unused_data._callback;
type _origtype;
identifier _origarg;
@@

void _callback(
-_origtype _origarg
+struct timer_list *unused
)
{
... when != _origarg
}

Signed-off-by: Kees Cook <keescook@chromium.org>


# 39ae0d3e 03-Aug-2017 Arnd Bergmann <arnd@arndb.de>

drm/msm: use %z format modifier for printing size_t

The return type of ARRAY_SIZE() is size_t, so we have to use
%zu instead of %lu to avoid this warning:

drivers/gpu/drm/msm/msm_gpu.c: In function 'msm_gpu_init':
drivers/gpu/drm/msm/msm_gpu.c:742:31: error: format '%lu' expects argument of type 'long unsigned int', but argument 7 has type 'unsigned int' [-Werror=format=]

The warning it otherwise harmless as size_t is always the
same size as unsigned long in all supported architectures,
but gcc doesn't know that.

Fixes: c2fceabca6d5 ("drm/msm: Support multiple ringbuffers")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 96169f4e 15-Sep-2017 Rob Clark <robdclark@gmail.com>

drm/msm: dump submits which triggered gpu hang

Note we need to move update_fences() to after msm_rd_dump_submit(),
otherwise the bo's referenced by the submit may no longer be valid.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 998b9a58 15-Sep-2017 Rob Clark <robdclark@gmail.com>

drm/msm/rd: allow adding addition msg to top of dump

For faults or hangs, it is nice to be able to include a bit more
information.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 2165e2b9 15-Sep-2017 Rob Clark <robdclark@gmail.com>

drm/msm: split rd debugfs file

Split into two instances, the existing $debugfs/rd which continues to
dump all submits, and $debugfs/hangrd which will be used to dump just
submits that cause gpu hangs (and eventually faults, but that will
require some iommu framework enhancements).

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 18bb8a6c 13-Sep-2017 Rob Clark <robdclark@gmail.com>

drm/msm: show task cmdline in gpu recovery messages

Now that freedreno gallium driver defaults to using submit_queue task
(render reordering), just showing task->comm is not so useful (ie. it is
always "flush_queue:0"), so also dump the cmdline. This should also be
more useful for piglit/shader_runner.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# b1fc2839 20-Oct-2017 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Implement preemption for A5XX targets

Implement preemption for A5XX targets - this allows multiple
ringbuffers for different priorities with automatic preemption
of a lower priority ringbuffer if a higher one is ready.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# f97decac 20-Oct-2017 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Support multiple ringbuffers

Add the infrastructure to support the idea of multiple ringbuffers.
Assign each ringbuffer an id and use that as an index for the various
ring specific operations.

The biggest delta is to support legacy fences. Each fence gets its own
sequence number but the legacy functions expect to use a unique integer.
To handle this we return a unique identifier for each submission but
map it to a specific ring/sequence under the covers. Newer users use
a dma_fence pointer anyway so they don't care about the actual sequence
ID or ring.

The actual mechanics for multiple ringbuffers are very target specific
so this code just allows for the possibility but still only defines
one ringbuffer for each target family.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# cd414f3d 20-Oct-2017 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Move memptrs to msm_gpu

When we move to multiple ringbuffers we're going to store the data
in the memptrs on a per-ring basis. In order to prepare for that
move the current memptrs from the adreno namespace into msm_gpu.
This is way cleaner and immediately lets us kill off some sub
functions so there is much less cost later when we do move to
per-ring structs.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 6bd6ae2d 24-Aug-2017 Rob Clark <robdclark@gmail.com>

drm/msm: fix error path cleanup

If we fail to attach iommu, gpu->aspace could be IS_ERR()..

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 1267a4df 27-Jul-2017 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Attach the GPU MMU when it is created

Currently the GPU MMU is attached in the adreno_gpu code but as
more and more of the GPU initialization moves to the generic
GPU path we have a need to map and use GPU memory earlier and
earlier. There isn't any reason to defer attaching the MMU
until later so attach it right after the address space is
created so it can be used immediately.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 0e08270a 13-Jun-2017 Sushmita Susheelendra <ssusheel@codeaurora.org>

drm/msm: Separate locking of buffer resources from struct_mutex

Buffer object specific resources like pages, domains, sg list
need not be protected with struct_mutex. They can be protected
with a buffer object level lock. This simplifies locking and
makes it easier to avoid potential recursive locking scenarios
for SVM involving mmap_sem and struct_mutex. This also removes
unnecessary serialization when creating buffer objects, and also
between buffer object creation and GPU command submission.

Signed-off-by: Sushmita Susheelendra <ssusheel@codeaurora.org>
[robclark: squash in handling new locking for shrinker]
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 8432a903 13-Jun-2017 Rob Clark <robdclark@gmail.com>

drm/msm: remove address-space id

Now that the msm_gem supports an arbitrary number of vma's, we no longer
need to assign an id (index) to each address space. So rip out the
associated code.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 8bdcd949 13-Jun-2017 Rob Clark <robdclark@gmail.com>

drm/msm: pass address-space to _get_iova() and friends

No functional change, that will come later. But this will make it
easier to deal with dynamically created address spaces (ie. per-
process pagetables for gpu).

Signed-off-by: Rob Clark <robdclark@gmail.com>


# cb1e3818 13-Jun-2017 Rob Clark <robdclark@gmail.com>

drm/msm: fix locking inconsistency for gpu->hw_init()

Most, but not all, paths where calling the with struct_mutex held. The
fast-path in msm_gem_get_iova() (plus some sub-code-paths that only run
the first time) was masking this issue.

So lets just always hold struct_mutex for hw_init(). And sprinkle some
WARN_ON()'s and might_lock() to avoid this sort of problem in the
future.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 5770fc7a 08-May-2017 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Add a struct to pass configuration to msm_gpu_init()

The amount of information that we need to pass into msm_gpu_init()
is steadily increasing, so add a new struct to stabilize the function
call and make it easier to add new configuration down the line.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 134ccada 03-May-2017 Rob Clark <robdclark@gmail.com>

drm/msm/gpu: check legacy clk names in get_clocks()

Otherwise if someone was using old bindings with "core_clk" instead of
"core" as the clock name, we'd never find it and gpu would be stuck at
27MHz (or whatever it's slowest rate is).

Fixes: 98db803 ("msm/drm: gpu: Dynamically locate the clocks from the device tree")
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 98db803f 07-Mar-2017 Jordan Crouse <jcrouse@codeaurora.org>

msm/drm: gpu: Dynamically locate the clocks from the device tree

Instead of using a fixed list of clock names use the clock-names
list in the device tree to discover and get the list of clocks
that we need.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# bf5af4ae 07-Mar-2017 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Hard code the GPU "slow frequency"

Some A3XX and A4XX GPU targets required that the GPU clock be
programmed to a non zero value when it was disabled so
27Mhz was chosen as the "invalid" frequency.

Even though newer targets do not have the same clock restrictions
we still write 27Mhz on clock disable and expect the clock subsystem
to round down to zero.

For unknown reasons even though the slow clock speed is always
27Mhz and it isn't actually a functional level the legacy device tree
frequency tables always defined it and then did gymnastics to work
around it.

Instead of playing the same silly games just hard code the "slow" clock
speed in the code as 27MHz and save ourselves a bit of infrastructure.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 9873ef07 06-Feb-2017 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Make sure to detach the MMU during GPU cleanup

We should be detaching the MMU before destroying the address
space. To do this cleanly, the detach has to happen in
adreno_gpu_cleanup() because it needs access to structs
in adreno_gpu.c. Plus it is better symmetry to have
the attach and detach at the same code level.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# eeb75474 10-Feb-2017 Rob Clark <robdclark@gmail.com>

drm/msm/gpu: use pm-runtime

We need to use pm-runtime properly when IOMMU is using device_link() to
control it's own clocks.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 028402d4 06-Feb-2017 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Make sure to detach the MMU during GPU cleanup

We should be detaching the MMU before destroying the address
space. To do this cleanly, the detach has to happen in
adreno_gpu_cleanup() because it needs access to structs
in adreno_gpu.c. Plus it is better symmetry to have
the attach and detach at the same code level.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 720c3bb8 30-Jan-2017 Rob Clark <robdclark@gmail.com>

drm/msm: drop _clk suffix from clk names

Suggested by Rob Herring. We still support the old names for
compatibility with downstream android dt files.

Cc: Rob Herring <robh@kernel.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Rob Herring <robh@kernel.org>


# b5f103ab 28-Nov-2016 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: gpu: Add A5XX target support

Add support for the A5XX family of Adreno GPUs.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 89d777a5 28-Nov-2016 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Remove 'src_clk' from adreno configuration

The adreno code inherited a silly workaround from downstream
from the bad old days before decent clock control. grp_clk[0]
(named 'src_clk') doesn't actually exist - it was used as a proxy
for whatever the core clock actually was (usually 'core_clk').

All targets should be able to correctly request 'core_clk' and
get the right thing back so zap the anachronism and directly
use grp_clk[0] to control the clock rate.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 78babc16 10-Nov-2016 Rob Clark <robdclark@gmail.com>

drm/msm: convert iova to 64b

For a5xx the gpu is 64b so we need to change iova to 64b everywhere. On
the display side, iova is still 32b so it can ignore the upper bits.
(Although all the armv8 devices have an iommu that can map 64b pa to 32b
iova.)

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 667ce33e 28-Sep-2016 Rob Clark <robdclark@gmail.com>

drm/msm: support multiple address spaces

We can have various combinations of 64b and 32b address space, ie. 64b
CPU but 32b display and gpu, or 64b CPU and GPU but 32b display. So
best to decouple the device iova's from mmap offset.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# f54d1867 25-Oct-2016 Chris Wilson <chris@chris-wilson.co.uk>

dma-buf: Rename struct fence to dma_fence

I plan to usurp the short name of struct fence for a core kernel struct,
and so I need to rename the specialised fence/timeline for DMA
operations to make room.

A consensus was reached in
https://lists.freedesktop.org/archives/dri-devel/2016-July/113083.html
that making clear this fence applies to DMA operations was a good thing.
Since then the patch has grown a bit as usage increases, so hopefully it
remains a good thing!

(v2...: rebase, rerun spatch)
v3: Compile on msm, spotted a manual fixup that I broke.
v4: Try again for msm, sorry Daniel

coccinelle script:
@@

@@
- struct fence
+ struct dma_fence
@@

@@
- struct fence_ops
+ struct dma_fence_ops
@@

@@
- struct fence_cb
+ struct dma_fence_cb
@@

@@
- struct fence_array
+ struct dma_fence_array
@@

@@
- enum fence_flag_bits
+ enum dma_fence_flag_bits
@@

@@
(
- fence_init
+ dma_fence_init
|
- fence_release
+ dma_fence_release
|
- fence_free
+ dma_fence_free
|
- fence_get
+ dma_fence_get
|
- fence_get_rcu
+ dma_fence_get_rcu
|
- fence_put
+ dma_fence_put
|
- fence_signal
+ dma_fence_signal
|
- fence_signal_locked
+ dma_fence_signal_locked
|
- fence_default_wait
+ dma_fence_default_wait
|
- fence_add_callback
+ dma_fence_add_callback
|
- fence_remove_callback
+ dma_fence_remove_callback
|
- fence_enable_sw_signaling
+ dma_fence_enable_sw_signaling
|
- fence_is_signaled_locked
+ dma_fence_is_signaled_locked
|
- fence_is_signaled
+ dma_fence_is_signaled
|
- fence_is_later
+ dma_fence_is_later
|
- fence_later
+ dma_fence_later
|
- fence_wait_timeout
+ dma_fence_wait_timeout
|
- fence_wait_any_timeout
+ dma_fence_wait_any_timeout
|
- fence_wait
+ dma_fence_wait
|
- fence_context_alloc
+ dma_fence_context_alloc
|
- fence_array_create
+ dma_fence_array_create
|
- to_fence_array
+ to_dma_fence_array
|
- fence_is_array
+ dma_fence_is_array
|
- trace_fence_emit
+ trace_dma_fence_emit
|
- FENCE_TRACE
+ DMA_FENCE_TRACE
|
- FENCE_WARN
+ DMA_FENCE_WARN
|
- FENCE_ERR
+ DMA_FENCE_ERR
)
(
...
)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Acked-by: Sumit Semwal <sumit.semwal@linaro.org>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20161025120045.28839-1-chris@chris-wilson.co.uk


# f44d32c7 16-Jun-2016 Rob Clark <robdclark@gmail.com>

drm/msm: move fence allocation out of msm_gpu_submit()

Prep work for next patch.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 4816b626 03-May-2016 Rob Clark <robdclark@gmail.com>

drm/msm: print offender task name on hangcheck recovery

Track the pid per submit, so we can print the name of the task which
submitted the batch that caused the gpu to hang.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 40e6815b 03-May-2016 Rob Clark <robdclark@gmail.com>

drm/msm: fix leak in failed submit path

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 1193c3bc 03-May-2016 Rob Clark <robdclark@gmail.com>

drm/msm: drop return from gpu->submit()

At this point, there is nothing left to fail. And submit already has a
fence assigned and is added to the submit_list. Any problems from here
on out are asynchronous (ie. hangcheck/recovery).

Signed-off-by: Rob Clark <robdclark@gmail.com>


# b6295f9a 15-Mar-2016 Rob Clark <robdclark@gmail.com>

drm/msm: 'struct fence' conversion

Signed-off-by: Rob Clark <robdclark@gmail.com>


# ca762a8a 15-Mar-2016 Rob Clark <robdclark@gmail.com>

drm/msm: introduce msm_fence_context

Better encapsulate the per-timeline stuff into fence-context. For now
there is just a single fence-context, but eventually we'll also have one
per-CRTC to enable fully explicit fencing.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 7d12a279 16-Mar-2016 Rob Clark <robdclark@gmail.com>

drm/msm/gpu: simplify tracking in-flight bo's

Since we already track the array of bo's in the submit object, just
unconditionally take and drop ref's per submit (rather than only taking
ref's if bo is not already active). This simplifies later patches.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# fde5de6c 15-Mar-2016 Rob Clark <robdclark@gmail.com>

drm/msm: move fence code to it's own file

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 5e921b19 15-Sep-2015 Stephane Viau <sviau@codeaurora.org>

drm/msm: Fix IOMMU clean up path in case msm_iommu_new() fails

msm_iommu_new() can fail and this change makes sure that we
detect the failure and free the allocated domain before going
any further.

Signed-off-by: Stephane Viau <sviau@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 1a370be9 07-Jun-2015 Rob Clark <robdclark@gmail.com>

drm/msm: restart queued submits after hang

Track the list of in-flight submits. If the gpu hangs, retire up to an
including the offending submit, and then re-submit the remainder. This
way, for concurrently running piglit tests (for example), one failing
test doesn't cause unrelated tests to fail simply because it's submit
was queued up after one that triggered a hang.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# de558cd2 06-May-2015 Rob Clark <robdclark@gmail.com>

drm/msm: adreno a306 support

As found in apq8016 (used in DragonBoard 410c) and msm8916.

Note that numerically a306 is actually 307 (since a305c already claimed
306). Nice and confusing.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 6490ad47 04-Jun-2015 Rob Clark <robdclark@gmail.com>

drm/msm: clarify downstream bus scaling

A few spots in the driver have support for downstream android
CONFIG_MSM_BUS_SCALING. This is mainly to simplify backporting the
driver for various devices which do not have sufficient upstream
kernel support. But the intentionally dead code seems to cause
some confusion. Rename the #define to make this more clear.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# a1ad3523 11-Jul-2014 Rob Clark <robdclark@gmail.com>

drm/msm: fix potential deadlock in gpu init

Somewhere along the way, the firmware loader sprouted another lock
dependency, resulting in possible deadlock scenario:

&dev->struct_mutex --> &sb->s_type->i_mutex_key#2 --> &mm->mmap_sem

which is problematic vs things like gem mmap.

So introduce a separate mutex to synchronize gpu init.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 944fc36c 09-Jul-2014 Rob Clark <robdclark@gmail.com>

drm/msm: use upstream iommu

Downstream kernel IOMMU had a non-standard way of dealing with multiple
devices and multiple ports/contexts. We don't need that on upstream
kernel, so rip out the crazy.

Note that we have to move the pinning of the ringbuffer to after the
IOMMU is attached. No idea how that managed to work properly on the
downstream kernel.

For now, I am leaving the IOMMU port name stuff in place, to simplify
things for folks trying to backport latest drm/msm to device kernels.
Once we no longer have to care about pre-DT kernels, we can drop this
and instead backport upstream IOMMU driver.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 70c70f09 30-May-2014 Rob Clark <robdclark@gmail.com>

drm/msm: add perf logging debugfs

Signed-off-by: Rob Clark <robdclark@gmail.com>


# a7d3c950 30-May-2014 Rob Clark <robdclark@gmail.com>

drm/msm: add rd logging debugfs

To ease debugging, add debugfs file which can be cat/tail'd to log
submits, along with fence #. If GPU hangs, you can look at 'gpu'
debugfs file to find last completed fence and current register state,
and compare with logged rd file to narrow down the DRAW_INDX which
triggered the GPU hang.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 37d77c3a 11-Jan-2014 Rob Clark <robdclark@gmail.com>

drm/msm: crank down gpu when inactive

Shut down the clks when the gpu has nothing to do. A short inactivity
timer is used to provide a low pass filter for power transitions.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# c2703b13 06-Feb-2014 Rob Clark <robdclark@gmail.com>

drm/msm: bigger synchronization hammer

Because we use a list_head in the bo to track it's position in a submit,
we need to serialize at a higher layer. Otherwise there are problems
when multiple contexts are SUBMIT'ing in parallel cmdstreams referencing
a shared bo.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 871d812a 15-Nov-2013 Rob Clark <robdclark@gmail.com>

drm/msm: add support for non-IOMMU systems

Add a VRAM carveout that is used for systems which do not have an IOMMU.

The VRAM carveout uses CMA. The arch code must setup a CMA pool for the
device (preferrably in highmem.. a 256m-512m VRAM pool in lowmem is not
cool). The user can configure the VRAM pool size using msm.vram module
param.

Technically, the abstraction of IOMMU behind msm_mmu is not strictly
needed, but it simplifies the GEM code a bit, and will be useful later
when I add support for a2xx devices with GPUMMU, so I decided to keep
this part.

It appears to be possible to configure the GPU to restrict access to
addresses within the VRAM pool, but this is not done yet. So for now
the GPU will refuse to load if there is no sort of mmu. Once address
based limits are supported and tested to confirm that we aren't giving
the GPU access to arbitrary memory, this restriction can be lifted

Signed-off-by: Rob Clark <robdclark@gmail.com>


# bf2b33af 15-Nov-2013 Rob Clark <robdclark@gmail.com>

drm/msm: fix bus scaling

This got a bit broken with original patches when re-arranging things to
move dependencies on mach-msm inside #ifndef OF.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# edd4fc63 14-Sep-2013 Rob Clark <robdclark@gmail.com>

drm/msm: rework inactive-work

Re-arrange things a bit so that we can get work requested after a bo
fence passes, like pageflip, done before retiring bo's. Without any
sort of bo cache in userspace, some games can trigger hundred's of
transient bo's, which can cause retire to take a long time (5-10ms).
Obviously we want a bo cache.. but this cleanup will make things a
bit easier for atomic as well and makes things a bit cleaner.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Acked-by: David Brown <davidb@codeaurora.org>


# aea6a64c 11-Sep-2013 Wei Yongjun <yongjun_wei@trendmicro.com.cn>

drm/msm: fix potential NULL pointer dereference

The dereference to 'pdata' should be moved below the NULL test.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>


# 6b8819c8 11-Sep-2013 Rob Clark <robdclark@gmail.com>

drm/msm: workaround for missing irq

Occasionally we seem to miss an IRQ from the ME (microengine). I'm not
entirely sure the root cause, but for now we can unwedge things by
retiring from the hangcheck timer.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 26791c48 03-Sep-2013 Rob Clark <robdclark@gmail.com>

drm/msm: hangcheck harder

If gpu locks up with the rptr shortly beyond the wrap-around point in
the ringbuffer, because the rptr was not reset (but wptr is, by virtue
of resetting rb->cur), we could end up in a scenario where we think
there is not enough space in the ringbuffer for the next cmds. And
since the CP won't reset rptr until after processing an IB, this leaves
things in a sort of deadlock.

So reset rptr too. And a bit more spiffing up of hangcheck to make
things easier to debug.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# bf6811f3 01-Sep-2013 Rob Clark <robdclark@gmail.com>

drm/msm: handle read vs write fences

The userspace API already had everything needed to handle read vs write
synchronization. This patch actually bothers to hook it up properly, so
that we don't need to (for example) stall on userspace read access to a
buffer that gpu is also still reading.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# bd6f82d8 24-Aug-2013 Rob Clark <robdclark@gmail.com>

drm/msm: add basic hangcheck/recovery mechanism

A basic, no-frills recovery mechanism in case the gpu gets wedged. We
could try to be a bit more fancy and restart the next submit after the
one that got wedged, but for now keep it simple. This is enough to
recover things if, for example, the gpu hangs mid way through a piglit
run.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 7198e6b0 18-Jul-2013 Rob Clark <robdclark@gmail.com>

drm/msm: add a3xx gpu support

Add initial support for a3xx 3d core.

So far, with hardware that I've seen to date, we can have:
+ zero, one, or two z180 2d cores
+ a3xx or a2xx 3d core, which share a common CP (the firmware
for the CP seems to implement some different PM4 packet types
but the basics of cmdstream submission are the same)

Which means that the eventual complete "class" hierarchy, once
support for all past and present hw is in place, becomes:
+ msm_gpu
+ adreno_gpu
+ a3xx_gpu
+ a2xx_gpu
+ z180_gpu

This commit splits out the parts that will eventually be common
between a2xx/a3xx into adreno_gpu, and the parts that are even
common to z180 into msm_gpu.

Note that there is no cmdstream validation required. All memory access
from the GPU is via IOMMU/MMU. So as long as you don't map silly things
to the GPU, there isn't much damage that the GPU can do.

Signed-off-by: Rob Clark <robdclark@gmail.com>