History log of /linux-master/drivers/gpu/drm/msm/adreno/adreno_gpu.c
Revision Date Author Comments
# 44a88fa4 07-Dec-2023 Connor Abbott <cwabbott0@gmail.com>

drm/msm: Add param for the highest bank bit

This parameter is programmed by the kernel and influences the tiling
layout of images. Exposing it to userspace will allow it to tile/untile
images correctly without guessing what value the kernel programmed, and
allow us to change it in the future without breaking userspace.

Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Patchwork: https://patchwork.freedesktop.org/patch/571181/
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 1f8c29e8 25-Sep-2023 Konrad Dybcio <konrad.dybcio@linaro.org>

drm/msm/a6xx: Add A740 support

A740 builds upon the A730 IP, shuffling some values and registers
around. More differences will appear when things like BCL are
implemented.

adreno_is_a740_family is added in preparation for more A7xx GPUs,
the logic checks will be valid resulting in smaller diffs.

Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD
Tested-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> # sm8450
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/559291/
Signed-off-by: Rob Clark <robdclark@chromium.org>


# af66706a 25-Sep-2023 Konrad Dybcio <konrad.dybcio@linaro.org>

drm/msm/a6xx: Add skeleton A7xx support

A7xx GPUs are - from kernel's POV anyway - basically another generation
of A6xx. They build upon the A650/A660_family advancements, skipping some
writes (presumably more values are preset correctly on reset), adding
some new ones and changing others.

One notable difference is the introduction of a second shadow, called BV.
To handle this with the current code, allocate it right after the current
RPTR shadow.

BV handling and .submit are mostly based on Jonathan Marek's work.

All A7xx GPUs are assumed to have a GMU.
A702 is not an A7xx-class GPU, it's a weird forked A610.

Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD
Tested-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> # sm8450
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/559285/
Signed-off-by: Rob Clark <robdclark@chromium.org>


# a895037e 10-Aug-2023 Ruan Jinjie <ruanjinjie@huawei.com>

drm/msm/adreno: adreno_gpu: Switch to memdup_user_nul() helper

Use memdup_user_nul() helper instead of open-coding to simplify the code.

Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com>
Reviewed-by: Bjorn Andersson <quic_bjorande@quicinc.com>
Patchwork: https://patchwork.freedesktop.org/patch/552130/
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 42854f8d 03-Aug-2023 Rob Clark <robdclark@chromium.org>

drm/msm: Disallow relocs on a6xx+

Mesa stopped using these pretty early in a6xx bringup[1]. Take advantage
of this to disallow some legacy UABI.

[1] https://gitlab.freedesktop.org/mesa/mesa/-/commit/7ef722861b691ce99be3827ed05f8c0ddf2cd66e

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Patchwork: https://patchwork.freedesktop.org/patch/551175/


# 90b593ce 27-Jul-2023 Rob Clark <robdclark@chromium.org>

drm/msm/adreno: Switch to chip-id for identifying GPU

Since the revision becomes an opaque identifier with future GPUs, move
away from treating different ranges of bits as having a given meaning.
This means that we need to explicitly list different patch revisions in
the device table.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/549782/


# 47bd37f9 27-Jul-2023 Rob Clark <robdclark@chromium.org>

drm/msm/adreno: Move adreno info to config

Let's just stash it in adreno_platform_config rather than looking it up
in N different places.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/549777/


# 8825f596 27-Jul-2023 Rob Clark <robdclark@chromium.org>

drm/msm/adreno: Add helper for formating chip-id

This is used in a few places, including one that is parsed by userspace
tools. So let's standardize it a bit better.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/549774/


# 67133dc0 27-Jul-2023 Rob Clark <robdclark@chromium.org>

drm/msm/adreno: Add adreno family

Sometimes it is useful to know the sub-generation (or "family"). And in
any case, this helps us get away from infering the generation from the
numerical chip-id.

v2: Fix is_a2xx() typo

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/549773/


# f4f1c707 27-Jul-2023 Rob Clark <robdclark@chromium.org>

drm/msm/adreno: Remove redundant revn param

This just duplicates what is in adreno_info, and can cause confusion if
used before it is set.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/549761/


# 6391030d 27-Jul-2023 Rob Clark <robdclark@chromium.org>

drm/msm/adreno: Remove redundant gmem size param

Even in the ocmem case, the allocated ocmem buffer size should match the
requested size.

v2: Move stray hunk to previous patch, make OCMEM size mismatch an error
condition.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/549759/


# 832ee64d 27-Jul-2023 Rob Clark <robdclark@chromium.org>

drm/msm/adreno: Remove GPU name

No real need to have marketing names in the kernel.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/549757/


# 5a903a44 15-Jun-2023 Konrad Dybcio <konrad.dybcio@linaro.org>

drm/msm/a6xx: Introduce GMU wrapper support

Some (particularly SMD_RPM, a.k.a non-RPMh) SoCs implement A6XX GPUs
but don't implement the associated GMUs. This is due to the fact that
the GMU directly pokes at RPMh. Sadly, this means we have to take care
of enabling & scaling power rails, clocks and bandwidth ourselves.

Reuse existing Adreno-common code and modify the deeply-GMU-infused
A6XX code to facilitate these GPUs. This involves if-ing out lots
of GMU callbacks and introducing a new type of GMU - GMU wrapper (it's
the actual name that Qualcomm uses in their downstream kernels).

This is essentially a register region which is convenient to model
as a device. We'll use it for managing the GDSCs. The register
layout matches the actual GMU_CX/GX regions on the "real GMU" devices
and lets us reuse quite a bit of gmu_read/write/rmw calls.

Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/542766/
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 0332bd04 01-Apr-2023 Konrad Dybcio <konrad.dybcio@linaro.org>

drm/msm/adreno: adreno_gpu: Don't set OPP scaling clock w/ GMU

Recently I contributed the switch to OPP API for all Adreno generations.
I did however also skip over the fact that GPUs with a GMU don't specify
a core clock of any kind in the GPU node. While that didn't break
anything, it did introduce unwanted spam in the dmesg:

adreno 5000000.gpu: error -ENOENT: _opp_set_clknames: Couldn't find clock with name: core_clk

Guard the entire logic so that it's not used with GMU-equipped GPUs.

Fixes: 9f251f934012 ("drm/msm/adreno: Use OPP for every GPU generation")
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/530347/
Link: https://lore.kernel.org/r/20230223-topic-gmuwrapper-v6-1-2034115bb60c@linaro.org
Signed-off-by: Rob Clark <robdclark@chromium.org>


# f62ad0f6 14-Feb-2023 Dmitry Baryshkov <dmitry.baryshkov@linaro.org>

drm/msm/adreno: split a6xx fault handler into generic and a6xx parts

Split the a6xx_fault_handler() into the generic adreno_fault_handler()
and platform-specific parts. The adreno_fault_handler() can further be
used by a5xx and hopefully by a4xx (at some point).

Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/522722/
Link: https://lore.kernel.org/r/20230214123504.3729522-3-dmitry.baryshkov@linaro.org
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 8cceb773 14-Feb-2023 Dmitry Baryshkov <dmitry.baryshkov@linaro.org>

drm/msm/adreno: stall translation on fault for all GPU families

The commit e25e92e08e32 ("drm/msm: devcoredump iommu fault support")
enabled SMMU stalling to collect GPU state, but only for a6xx. It tied
enabling the stall with tha per-instance pagetables creation.

Since that commit SoCs with a5xx also gained support for
adreno-smmu-priv. Move stalling into generic code and add corresponding
resume_translation calls.

Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/522720/
Link: https://lore.kernel.org/r/20230214123504.3729522-2-dmitry.baryshkov@linaro.org
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 7fa5047a 10-Mar-2023 Rob Herring <robh@kernel.org>

drm: Use of_property_present() for testing DT property presence

It is preferred to use typed property access functions (i.e.
of_property_read_<type> functions) rather than low-level
of_get_property/of_find_property functions for reading properties. As
part of this, convert of_get_property/of_find_property calls to the
recently added of_property_present() helper when we just want to test
for presence of a property and nothing more.

Reviewed-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Reviewed-by: Liu Ying <victor.liu@nxp.com> # i.MX bridge
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Link: https://lore.kernel.org/r/20230310144705.1542207-1-robh@kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>


# 624831b3 20-Mar-2023 Rob Clark <robdclark@chromium.org>

drm/msm/gpu: Move fw loading out of hw_init() path

It is already a no-op, since we've already loaded the fw from
adreno_load_gpu(), so drop the redundant call.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/527849/
Link: https://lore.kernel.org/r/20230320144356.803762-13-robdclark@gmail.com


# 9f251f93 23-Feb-2023 Konrad Dybcio <konrad.dybcio@linaro.org>

drm/msm/adreno: Use OPP for every GPU generation

Some older GPUs (namely a2xx with no opp tables at all and a320 with
downstream-remnants gpu pwrlevels) used not to have OPP tables. They
both however had just one frequency defined, making it extremely easy
to construct such an OPP table from within the driver if need be.

Do so and switch all clk_set_rate calls on core_clk to their OPP
counterparts.

Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/523784/
Link: https://lore.kernel.org/r/20230223-topic-opp-v3-3-5f22163cd1df@linaro.org
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 3bf90eca 03-Feb-2023 Elliot Berman <quic_eberman@quicinc.com>

firmware: qcom_scm: Move qcom_scm.h to include/linux/firmware/qcom/

Move include/linux/qcom_scm.h to include/linux/firmware/qcom/qcom_scm.h.
This removes 1 of a few remaining Qualcomm-specific headers into a more
approciate subdirectory under include/.

Suggested-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>
Reviewed-by: Guru Das Srinagesh <quic_gurus@quicinc.com>
Acked-by: Mukesh Ojha <quic_mojha@quicinc.com>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20230203210956.3580811-1-quic_eberman@quicinc.com


# dbeedbcb 21-Dec-2022 Akhil P Oommen <quic_akhilpo@quicinc.com>

drm/msm/adreno: Fix null ptr access in adreno_gpu_cleanup()

Fix the below kernel panic due to null pointer access:
[ 18.504431] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000048
[ 18.513464] Mem abort info:
[ 18.516346] ESR = 0x0000000096000005
[ 18.520204] EC = 0x25: DABT (current EL), IL = 32 bits
[ 18.525706] SET = 0, FnV = 0
[ 18.528878] EA = 0, S1PTW = 0
[ 18.532117] FSC = 0x05: level 1 translation fault
[ 18.537138] Data abort info:
[ 18.540110] ISV = 0, ISS = 0x00000005
[ 18.544060] CM = 0, WnR = 0
[ 18.547109] user pgtable: 4k pages, 39-bit VAs, pgdp=0000000112826000
[ 18.553738] [0000000000000048] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
[ 18.562690] Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
**Snip**
[ 18.696758] Call trace:
[ 18.699278] adreno_gpu_cleanup+0x30/0x88
[ 18.703396] a6xx_destroy+0xc0/0x130
[ 18.707066] a6xx_gpu_init+0x308/0x424
[ 18.710921] adreno_bind+0x178/0x288
[ 18.714590] component_bind_all+0xe0/0x214
[ 18.718797] msm_drm_bind+0x1d4/0x614
[ 18.722566] try_to_bring_up_aggregate_device+0x16c/0x1b8
[ 18.728105] __component_add+0xa0/0x158
[ 18.732048] component_add+0x20/0x2c
[ 18.735719] adreno_probe+0x40/0xc0
[ 18.739300] platform_probe+0xb4/0xd4
[ 18.743068] really_probe+0xfc/0x284
[ 18.746738] __driver_probe_device+0xc0/0xec
[ 18.751129] driver_probe_device+0x48/0x110
[ 18.755421] __device_attach_driver+0xa8/0xd0
[ 18.759900] bus_for_each_drv+0x90/0xdc
[ 18.763843] __device_attach+0xfc/0x174
[ 18.767786] device_initial_probe+0x20/0x2c
[ 18.772090] bus_probe_device+0x40/0xa0
[ 18.776032] deferred_probe_work_func+0x94/0xd0
[ 18.780686] process_one_work+0x190/0x3d0
[ 18.784805] worker_thread+0x280/0x3d4
[ 18.788659] kthread+0x104/0x1c0
[ 18.791981] ret_from_fork+0x10/0x20
[ 18.795654] Code: f9400408 aa0003f3 aa1f03f4 91142015 (f9402516)
[ 18.801913] ---[ end trace 0000000000000000 ]---
[ 18.809039] Kernel panic - not syncing: Oops: Fatal exception

Fixes: 17e822f7591f ("drm/msm: fix unbalanced pm_runtime_enable in adreno_gpu_{init, cleanup}")
Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
Patchwork: https://patchwork.freedesktop.org/patch/515605/
Link: https://lore.kernel.org/r/20221221203925.v2.1.Ib978de92c4bd000b515486aad72e96c2481f84d0@changeid
Signed-off-by: Rob Clark <robdclark@chromium.org>


# a66f1efc 10-Jan-2023 Rob Clark <robdclark@chromium.org>

drm/msm/gpu: Fix potential double-free

If userspace was calling the MSM_SET_PARAM ioctl on multiple threads to
set the COMM or CMDLINE param, it could trigger a race causing the
previous value to be kfree'd multiple times. Fix this by serializing on
the gpu lock.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Fixes: d4726d770068 ("drm/msm: Add a way to override processes comm/cmdline")
Patchwork: https://patchwork.freedesktop.org/patch/517778/
Link: https://lore.kernel.org/r/20230110212903.1925878-1-robdclark@gmail.com


# 822ff993 02-Nov-2022 Dmitry Baryshkov <dmitry.baryshkov@linaro.org>

drm/msm: remove duplicated code from a6xx_create_address_space

The function a6xx_create_address_space() is mostly a copy of
adreno_iommu_create_address_space() with added quirk setting. Rework
these two functions to be a thin wrappers around a common helper.

Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Patchwork: https://patchwork.freedesktop.org/patch/509614/
Link: https://lore.kernel.org/r/20221102175449.452283-3-dmitry.baryshkov@linaro.org
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>


# 3236130b 02-Nov-2022 Dmitry Baryshkov <dmitry.baryshkov@linaro.org>

drm/msm: move domain allocation into msm_iommu_new()

After the msm_iommu instance is created, the IOMMU domain is completely
handled inside the msm_iommu code. Move the iommu_domain_alloc() call
into the msm_iommu_new() to simplify callers code.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Patchwork: https://patchwork.freedesktop.org/patch/509615/
Link: https://lore.kernel.org/r/20221102175449.452283-2-dmitry.baryshkov@linaro.org
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>


# 83d18e9d 13-Oct-2022 Rob Clark <robdclark@chromium.org>

drm/msm/a6xx: Fix kvzalloc vs state_kcalloc usage

adreno_show_object() is a trap! It will re-allocate the pointer it is
passed on first call, when the data is ascii85 encoded, using kvmalloc/
kvfree(). Which means the data *passed* to it must be kvmalloc'd, ie.
we cannot use the state_kcalloc() helper.

This partially reverts commit ec8f1813bf8d ("drm/msm/a6xx: Replace
kcalloc() with kvzalloc()"), but adds the missing kvfree() to fix the
memory leak that was present previously. And adds a warning comment.

Fixes: ec8f1813bf8d ("drm/msm/a6xx: Replace kcalloc() with kvzalloc()")
Closes: https://gitlab.freedesktop.org/drm/msm/-/issues/20
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
Patchwork: https://patchwork.freedesktop.org/patch/507014/
Link: https://lore.kernel.org/r/20221013225520.371226-2-robdclark@gmail.com


# 4b18299b 13-Jun-2022 Rob Clark <robdclark@chromium.org>

drm/msm/adreno: Defer enabling runpm until hw_init()

To avoid preventing the display from coming up before the rootfs is
mounted, without resorting to packing fw in the initrd, the GPU has
this limbo state where the device is probed, but we aren't ready to
start sending commands to it. This is particularly problematic for
a6xx, since the GMU (which requires fw to be loaded) is the one that
is controlling the power/clk/icc votes.

So defer enabling runpm until we are ready to call gpu->hw_init(),
as that is a point where we know we have all the needed fw and are
ready to start sending commands to the coproc's.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/489337/
Link: https://lore.kernel.org/r/20220613182036.2567963-1-robdclark@gmail.com


# 18514c38 29-Jun-2022 Rob Clark <robdclark@chromium.org>

drm/msm/gpu: Add GEM debug label to devcore

When trying to understand an iova fault devcore, once you figure out
which buffer we accessed beyond the end of, it is useful to see the
buffer's debug label.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/491910/
Link: https://lore.kernel.org/r/20220629211919.563585-3-robdclark@gmail.com


# ba0386a9 15-Jun-2022 Rob Clark <robdclark@chromium.org>

drm/msm: Fix %d vs %u

In debugging fence rollover, I noticed that GPU state capture and
devcore dumps were showing me negative fence numbers. Let's fix that
and some related signed vs unsigned confusion.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/489621/
Link: https://lore.kernel.org/r/20220615163532.3013035-1-robdclark@gmail.com


# 36bbfdb8 29-May-2022 Rob Clark <robdclark@chromium.org>

drm/msm/adreno: Allow larger address space size

The restriction to 4G was strictly to work around 64b math bug in some
versions of SQE firmware. This appears to be fixed in a650+ SQE fw, so
allow a larger address space size on these devices.

Also, add a modparam override for debugging and igt.

v2: Send the right version of the patch (ie. the one that actually
compiles)

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Patchwork: https://patchwork.freedesktop.org/patch/487601/
Link: https://lore.kernel.org/r/20220529180428.2577832-1-robdclark@gmail.com


# c8af219d 18-Jun-2022 Rob Clark <robdclark@chromium.org>

drm/msm: Don't overwrite hw fence in hw_init

Prior to the last commit, this could result in setting the GPU
written fence value back to an older value, if we had missed
updating completed_fence prior to suspend. This was mostly
harmless as the GPU would eventually overwrite it again with
the correct value. But we should just not do this. Instead
just leave a sanity check that the fence looks plausible (in
case the GPU scribbled on memory).

Reported-by: Steev Klimaszewski <steev@kali.org>
Fixes: 95d1deb02a9c ("drm/msm/gem: Add fenced vma unpin")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Tested-by: Steev Klimaszewski <steev@kali.org>
Patchwork: https://patchwork.freedesktop.org/patch/490138/
Link: https://lore.kernel.org/r/20220618161120.3451993-2-robdclark@gmail.com


# ce0db505 06-Jun-2022 Maximilian Luz <luzmaximilian@gmail.com>

drm/msm: Fix double pm_runtime_disable() call

Following commit 17e822f7591f ("drm/msm: fix unbalanced
pm_runtime_enable in adreno_gpu_{init, cleanup}"), any call to
adreno_unbind() will disable runtime PM twice, as indicated by the call
trees below:

adreno_unbind()
-> pm_runtime_force_suspend()
-> pm_runtime_disable()

adreno_unbind()
-> gpu->funcs->destroy() [= aNxx_destroy()]
-> adreno_gpu_cleanup()
-> pm_runtime_disable()

Note that pm_runtime_force_suspend() is called right before
gpu->funcs->destroy() and both functions are called unconditionally.

With recent addition of the eDP AUX bus code, this problem manifests
itself when the eDP panel cannot be found yet and probing is deferred.
On the first probe attempt, we disable runtime PM twice as described
above. This then causes any later probe attempt to fail with

[drm:adreno_load_gpu [msm]] *ERROR* Couldn't power up the GPU: -13

preventing the driver from loading.

As there seem to be scenarios where the aNxx_destroy() functions are not
called from adreno_unbind(), simply removing pm_runtime_disable() from
inside adreno_unbind() does not seem to be the proper fix. This is what
commit 17e822f7591f ("drm/msm: fix unbalanced pm_runtime_enable in
adreno_gpu_{init, cleanup}") intended to fix. Therefore, instead check
whether runtime PM is still enabled, and only disable it in that case.

Fixes: 17e822f7591f ("drm/msm: fix unbalanced pm_runtime_enable in adreno_gpu_{init, cleanup}")
Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com>
Tested-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Link: https://lore.kernel.org/r/20220606211305.189585-1-luzmaximilian@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 36a1d1bd 21-Apr-2022 Luca Weiss <luca@z3ntu.xyz>

drm/msm: Fix null pointer dereferences without iommu

Check if 'aspace' is set before using it as it will stay null without
IOMMU, such as on msm8974.

Fixes: bc2112583a0b ("drm/msm/gpu: Track global faults per address-space")
Signed-off-by: Luca Weiss <luca@z3ntu.xyz>
Link: https://lore.kernel.org/r/20220421203455.313523-1-luca@z3ntu.xyz
Signed-off-by: Rob Clark <robdclark@chromium.org>


# a636a0ff 11-Apr-2022 Rob Clark <robdclark@chromium.org>

drm/msm: Add a way for userspace to allocate GPU iova

The motivation at this point is mainly native userspace mesa driver in a
VM guest. The one remaining synchronous "hotpath" is buffer allocation,
because guest needs to wait to know the bo's iova before it can start
emitting cmdstream/state that references the new bo. By allocating the
iova in the guest userspace, we no longer need to wait for a response
from the host, but can just rely on the allocation request being
processed before the cmdstream submission. Allocation failures (OoM,
etc) would just be treated as context-lost (ie. GL_GUILTY_CONTEXT_RESET)
or subsequent allocations (or readpix, etc) can raise GL_OUT_OF_MEMORY.

v2: Fix inuse check
v3: Change mismatched iova case to -EBUSY

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Link: https://lore.kernel.org/r/20220411215849.297838-11-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>


# f9d5355f 11-Apr-2022 Rob Clark <robdclark@chromium.org>

drm/msm/gpu: Drop duplicate fence counter

The ring seqno counter duplicates the fence-context last_fence counter.
They end up getting incremented in lock-step, on the same scheduler
thread, but the split just makes things less obvious.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20220411215849.297838-3-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>


# d4726d77 17-Mar-2022 Rob Clark <robdclark@chromium.org>

drm/msm: Add a way to override processes comm/cmdline

In the cause of using the GPU via virtgpu, the host side process is
really a sort of proxy, and not terribly interesting from the PoV of
crash/fault logging. Add a way to override these per process so that
we can see the guest process's name.

v2: Handle kmalloc failure, add comment to explain kstrdup returns
NULL if passed NULL [Dan Carpenter]

Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20220317165144.222101-4-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 4bfba716 17-Mar-2022 Rob Clark <robdclark@chromium.org>

drm/msm: Add support for pointer params

The 64b value field is already suffient to hold a pointer instead of
immediate, but we also need a length field.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20220317165144.222101-2-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>


# aaa743d8 07-Mar-2022 Dan Carpenter <dan.carpenter@oracle.com>

drm/msm/adreno: fix cast in adreno_get_param()

These casts need to happen before the shift. The only time it would
matter would be if "rev.core" is >= 128. In that case the sign bit
would be extended and we do not want that.

Fixes: afab9d91d872 ("drm/msm/adreno: Expose speedbin to userspace")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
Link: https://lore.kernel.org/r/20220307133105.GA17534@kili
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 90f45c42 03-Mar-2022 Rob Clark <robdclark@chromium.org>

drm/msm: Add SYSPROF param (v2)

Add a SYSPROF param for system profiling tools like Mesa's pps-producer
(perfetto) to control behavior related to system-wide performance
counter collection. In particular, for profiling, one wants to ensure
that GPU context switches do not effect perfcounter state, and might
want to suppress suspend (which would cause counters to lose state).

v2: Swap the order in msm_file_private_set_sysprof() [sboyd] and
initialize the sysprof_active refcount to one (because the under/
overflow checking in refcount_t doesn't expect a 0->1 transition)
meaning that values greater than 1 means sysprof is active.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20220304005317.776110-4-robdclark@gmail.com


# f7ddbf55 03-Mar-2022 Rob Clark <robdclark@chromium.org>

drm/msm: Add SET_PARAM ioctl

It was always expected to have a use for this some day, so we left a
placeholder. Now we do. (And I expect another use in the not too
distant future when we start allowing userspace to allocate GPU iova.)

Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20220304005317.776110-3-robdclark@gmail.com


# afab9d91 25-Feb-2022 Akhil P Oommen <quic_akhilpo@quicinc.com>

drm/msm/adreno: Expose speedbin to userspace

Expose speedbin through MSM_PARAM_CHIP_ID parameter to help userspace
identify the sku.

Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
Link: https://lore.kernel.org/r/20220226005021.v2.4.I86c32730e08cba9e5c83f02ec17885124d45fa56@changeid
Signed-off-by: Rob Clark <robdclark@chromium.org>


# e2f76193 25-Feb-2022 Akhil P Oommen <quic_akhilpo@quicinc.com>

drm/msm/adreno: Generate name from chipid for 7c3

Use a gpu name which is sprintf'ed from the chipid for 7c3 gpu instead of
hardcoding one. This helps to avoid code churn in case of a gpu rename.

Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
Link: https://lore.kernel.org/r/20220226005021.v2.2.I9436e0e300f76b2e6c34136a0b902e8cfd73e0d6@changeid
Signed-off-by: Rob Clark <robdclark@chromium.org>


# bc211258 01-Feb-2022 Rob Clark <robdclark@chromium.org>

drm/msm/gpu: Track global faults per address-space

Other processes don't need to know about faults that they are isolated
from by virtue of address space isolation. They are only interested in
whether some of their state might have been corrupted.

But to be safe, also track unattributed faults. This case should really
never happen unless there is a kernel bug (and that would never happen,
right?)

v2: Instead of adding a new param, just change the behavior of the
existing param to match what userspace actually wants [anholt]

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5934
Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20220201161618.778455-3-robdclark@gmail.com
Reviewed-by: Emma Anholt <emma@anholt.net>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# f98f915b 01-Feb-2022 Rob Clark <robdclark@chromium.org>

drm/msm/gpu: Add ctx to get_param()

Prep work for next patch.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20220201161618.778455-2-robdclark@gmail.com
Reviewed-by: Emma Anholt <emma@anholt.net>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 08c4aa3e 09-Dec-2021 Rob Clark <robdclark@chromium.org>

drm/msm/a6xx: Skip crashdumper state if GPU needs_hw_init

I am seeing some crash logs which imply that we are trying to use
crashdumper hw to read back GPU state when the GPU isn't initialized.
This doesn't go well (for example, GPU could be in 32b address mode
and ignoring the upper bits of buffer that it is trying to dump state
to).

I'm not *quite* sure how we get into this state in the first place,
but lets not make a bad situation worse by triggering iova fault
crashes.

While we're at it, also add the information about whether the GPU is
initialized to the devcore dump to make this easier to see in the
logs (which makes the WARN_ON() redundant and even harmful because
it fills up the small bit of dmesg we get with the crash report).

Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20211209193118.1163248-1-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>


# eaa55ead 24-Nov-2021 Rob Clark <robdclark@chromium.org>

drm/msm/gpu: Add some WARN_ON()s

We don't expect either of these conditions to ever be true, so let's get
shouty if they are.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20211124214151.1427022-6-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 518380cb 24-Nov-2021 Akhil P Oommen <quic_akhilpo@quicinc.com>

drm/msm/a6xx: Capture gmu log in devcoredump

Capture gmu log in coredump to enhance debugging.

Signed-off-by: Akhil P Oommen <akhilpo@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20211124214151.1427022-2-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>


# fc40e5e1 27-Jul-2021 Rob Clark <robdclark@chromium.org>

drm/msm: Utilize gpu scheduler priorities

The drm/scheduler provides additional prioritization on top of that
provided by however many number of ringbuffers (each with their own
priority level) is supported on a given generation. Expose the
additional levels of priority to userspace and map the userspace
priority back to ring (first level of priority) and schedular priority
(additional priority levels within the ring).

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/r/20210728010632.2633470-13-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 030af2b0 27-Jul-2021 Rob Clark <robdclark@chromium.org>

drm/msm: drop drm_gem_object_put_locked()

No idea why we were still using this. It certainly hasn't been needed
for some time. So drop the pointless twin codepaths.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/r/20210728010632.2633470-4-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>


# e25e92e0 10-Jun-2021 Rob Clark <robdclark@chromium.org>

drm/msm: devcoredump iommu fault support

Wire up support to stall the SMMU on iova fault, and collect a devcore-
dump snapshot for easier debugging of faults.

Currently this is a6xx-only, but mostly only because so far it is the
only one using adreno-smmu-priv.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Jordan Crouse <jordan@cosmicpenguin.net>
Link: https://lore.kernel.org/r/20210610214431.539029-6-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>


# f6d62d09 08-Jun-2021 Jonathan Marek <jonathan@marek.ca>

drm/msm/a6xx: add support for Adreno 660 GPU

Add adreno_is_{a660,a650_family} helpers and convert update existing
adreno_is_a650 usage based on downstream driver's logic (changing into
adreno_is_a650_family or adding adreno_is_a660).

And add the remaining changes required for A660, again based on
the downstream driver: missing GMU allocations, additional register init,
dummy hfi BW table, cp protect list, entry in gpulist table, hwcg table,
updated a6xx_ucode_check_version check.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Link: https://lore.kernel.org/r/20210608172808.11803-6-jonathan@marek.ca
Signed-off-by: Rob Clark <robdclark@chromium.org>


# bda1d6e5 08-Jun-2021 Jonathan Marek <jonathan@marek.ca>

drm/msm: remove unused icc_path/ocmem_icc_path

These aren't used by anything anymore.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Akhil P Oommen <akhilpo@codeaurora.org>
Link: https://lore.kernel.org/r/20210608172808.11803-2-jonathan@marek.ca
Signed-off-by: Rob Clark <robdclark@chromium.org>


# bce98bf7 30-Apr-2021 Stephen Boyd <swboyd@chromium.org>

drm/msm: Use VERB() for extra verbose logging

These messages are useful for bringup/early development but in
production they don't provide much value. We know what sort of GPU we
have and interrupt information can be gathered other ways. This cuts
down on lines in the drm debug logs that happen too often, making the
debug logs practically useless.

Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Cc: Abhinav Kumar <abhinavk@codeaurora.org>
Cc: Kuogee Hsieh <khsieh@codeaurora.org>
Cc: aravindh@codeaurora.org
Cc: Sean Paul <sean@poorly.run>
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Link: https://lore.kernel.org/r/20210430193104.1770538-3-swboyd@chromium.org
[resolve merge conflicts with dpu irq refactor]
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 11120e93 14-Mar-2021 Yangtao Li <tiny.windzz@gmail.com>

drm/msm: Convert to use resource-managed OPP API

Use resource-managed OPP API to simplify code.

Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Link: https://lore.kernel.org/r/20210314163408.22292-12-digetx@gmail.com
Reviewed-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# a5fc7aa9 23-Apr-2021 Jonathan Marek <jonathan@marek.ca>

drm/msm: replace MSM_BO_UNCACHED with MSM_BO_WC for internal objects

msm_gem_get_vaddr() currently always maps as writecombine, so use the right
flag instead of relying on broken behavior (things don't actually work if
they are mapped as uncached).

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Acked-by: Jordan Crouse <jordan@cosmicpenguin.net>
Link: https://lore.kernel.org/r/20210423190833.25319-3-jonathan@marek.ca
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 4fc52b81 01-Apr-2021 Christoph Hellwig <hch@lst.de>

iommu: remove DOMAIN_ATTR_IO_PGTABLE_CFG

Use an explicit set_pgtable_quirks method instead that just passes
the actual quirk bitmask instead.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Li Yang <leoyang.li@nxp.com>
Link: https://lore.kernel.org/r/20210401155256.298656-20-hch@lst.de
Signed-off-by: Joerg Roedel <jroedel@suse.de>


# 3ab1c5cc 24-Mar-2021 Rob Clark <robdclark@chromium.org>

drm/msm: Add param for userspace to query suspend count

Performance counts, and ALWAYS_ON counters used for capturing GPU
timestamps, lose their state across suspend/resume cycles. Userspace
tooling for performance monitoring needs to be aware of this. For
example, after a suspend userspace needs to recalibrate it's offset
between CPU and GPU time.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Jordan Crouse <jordan@cosmicpenguin.net>
Link: https://lore.kernel.org/r/20210325012358.1759770-3-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 45596f25 11-Jan-2021 Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>

drm/msm/a6xx: Create an A6XX GPU specific address space

A6XX GPUs have support for last level cache(LLC) also known
as system cache and need to set the bus attributes to
use it. Currently we use a generic adreno iommu address space
implementation which are also used by older GPU generations
which do not have LLC and might introduce issues accidentally
and is not clean in a way that anymore additions of GPUs
supporting LLC would have to be guarded under ifdefs. So keep
the generic code separate and make the address space creation
A6XX specific. We also have a helper to set the llc attributes
so that if the newer GPU generations do support them, we can
use it instead of open coding domain attribute setting for each
GPU.

Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 276619c0 11-Jan-2021 Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>

drm/msm: Add proper checks for GPU LLCC support

Domain attribute setting for LLCC is guarded by !IS_ERR
check which works fine only when CONFIG_QCOM_LLCC=y but
when it is disabled, the LLCC apis return NULL and that
is not handled by IS_ERR check. Due to this, domain attribute
for LLCC will be set even on GPUs which do not support it
and cause issues, so correct this by using IS_ERR_OR_NULL
checks appropriately. Meanwhile also cleanup comment block
and remove unwanted blank line.

Fixes: 00fd44a1a470 ("drm/msm: Only enable A6xx LLCC code on A6xx")
Fixes: 474dadb8b0d5 ("drm/msm/a6xx: Add support for using system cache(LLC)")
Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 00fd44a1 04-Jan-2021 Konrad Dybcio <konrad.dybcio@somainline.org>

drm/msm: Only enable A6xx LLCC code on A6xx

Using this code on A5xx (and probably older too) causes a
smmu bug.

Fixes: 474dadb8b0d5 ("drm/msm/a6xx: Add support for using system cache(LLC)")
Signed-off-by: Konrad Dybcio <konrad.dybcio@somainline.org>
Tested-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@somainline.org>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Reviewed-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 474dadb8 24-Nov-2020 Sharat Masetty <smasetty@codeaurora.org>

drm/msm/a6xx: Add support for using system cache(LLC)

The last level system cache can be partitioned to 32 different
slices of which GPU has two slices preallocated. One slice is
used for caching GPU buffers and the other slice is used for
caching the GPU SMMU pagetables. This talks to the core system
cache driver to acquire the slice handles, configure the SCID's
to those slices and activates and deactivates the slices upon
GPU power collapse and restore.

Some support from the IOMMU driver is also needed to make use
of the system cache to set the right TCR attributes. GPU then
has the ability to override a few cacheability parameters which
it does to override write-allocate to write-no-allocate as the
GPU hardware does not benefit much from it.

DOMAIN_ATTR_IO_PGTABLE_CFG is another domain level attribute used
by the IOMMU driver for pagetable configuration which will be used
to set a quirk initially to set the right attributes to cache the
hardware pagetables into the system cache.

Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>
[saiprakash.ranjan: fix to set attr before device attach to iommu and rebase]
Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 5785dd7a 28-Oct-2020 Akhil P Oommen <akhilpo@codeaurora.org>

drm/msm: Fix duplicate gpu node in icc summary

The dev_pm_opp_of_add_table() api initializes the icc nodes for gpu
indirectly. So we can avoid using of_icc_get() api in the common
probe path. To improve this, move of_icc_get() to target specific code
where it is required.

This patch helps to fix duplicate gpu node listed in the interconnect
summary from the debugfs.

Signed-off-by: Akhil P Oommen <akhilpo@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 200a2186 28-Sep-2020 Rob Clark <robdclark@chromium.org>

drm/msm: fix 32b build warns

Neither of these code-paths apply to older 32b devices, but it is rude
to introduce warnings.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200929001925.2916984-1-robdclark@gmail.com


# 2fb7487a 14-Sep-2020 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Get rid of the REG_ADRENO offsets

As newer GPU families are added it makes less sense to maintain a
"generic" version functions for older families. Move adreno_submit()
and get_rptr() into the target specific code for a2xx, a3xx and a4xx.
Add a parameter to adreno_flush to pass the target specific WPTR register
instead of relying on the generic register.

All of this gets rid of the last of the REG_ADRENO offsets so remove all
all the register definitions and infrastructure.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 8907afb4 14-Sep-2020 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Allow a5xx to mark the RPTR shadow as privileged

Newer microcode versions have support for the CP_WHERE_AM_I opcode which
allows the RPTR shadow memory to be marked as privileged to protect it
from corruption. Move the RPTR shadow into its own buffer and protect it
it if the current microcode version supports the new feature.

We can also re-enable preemption for those targets that support
CP_WHERE_AM_I. Start out by preemptively assuming that we can enable
preemption and disable it in a5xx_hw_init if the microcode version comes
back as too old.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# e3c64c72 17-Aug-2020 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Set the global virtual address range from the IOMMU domain

Use the aperture settings from the IOMMU domain to set up the virtual
address range for the GPU. This allows us to transparently deal with
IOMMU side features (like split pagetables).

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>


# 15eb9ad0 17-Aug-2020 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Drop context arg to gpu->submit()

Now that we can get the ctx from the submitqueue, the extra arg is
redundant.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
[split out of previous patch to reduce churny noise]
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>


# 0a48db56 11-Sep-2020 Luca Weiss <luca@z3ntu.xyz>

drm/msm/adreno: fix probe without iommu

The function iommu_domain_alloc returns NULL on platforms without IOMMU
such as msm8974. This resulted in PTR_ERR(-ENODEV) being assigned to
gpu->aspace so the correct code path wasn't taken.

Fixes: ccac7ce373c1 ("drm/msm: Refactor address space initialization")
Signed-off-by: Luca Weiss <luca@z3ntu.xyz>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# f6828e0c 03-Sep-2020 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Disable the RPTR shadow

Disable the RPTR shadow across all targets. It will be selectively
re-enabled later for targets that need it.

Cc: stable@vger.kernel.org
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# df561f66 23-Aug-2020 Gustavo A. R. Silva <gustavoars@kernel.org>

treewide: Use fallthrough pseudo-keyword

Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>


# f228af11 12-Aug-2020 Rob Clark <robdclark@chromium.org>

drm/msm/adreno: fix updating ring fence

We need to set it to the most recent completed fence, not the most
recent submitted. Otherwise we have races where we think we can retire
submits that the GPU is not finished with, if the GPU doesn't manage to
overwrite the seqno before we look at it.

This can show up with hang recovery if one of the submits after the
crashing submit also hangs after it is replayed.

Fixes: f97decac5f4c ("drm/msm: Support multiple ringbuffers")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 520c651f 15-Jul-2020 Rob Clark <robdclark@chromium.org>

drm/msm/adreno: fix gpu probe if no interconnect-names

If there is no interconnect-names, but there is an interconnects
property, then of_icc_get(dev, "gfx-mem"); would return an error
rather than NULL.

Also, if there is no interconnect-names property, there will never
be a ocmem path. But of_icc_get(dev, "ocmem") would return -EINVAL
instead of -ENODATA. Just don't bother trying in this case.

v2: explicity check for interconnect-names property

Fixes: 08af4769c7d2 ("drm/msm: handle for EPROBE_DEFER for of_icc_get")
Fixes: 00bb9243d346 ("drm/msm/gpu: add support for ocmem interconnect path")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 08af4769 13-Jul-2020 Jonathan Marek <jonathan@marek.ca>

drm/msm: handle for EPROBE_DEFER for of_icc_get

Check for errors instead of silently not using icc if the msm driver
probes before the interconnect driver.

Allow ENODATA for ocmem path, as it is optional and this error
is returned when "gfx-mem" path is provided but not "ocmem".

Because msm_gpu_cleanup assumes msm_gpu_init has been called, the icc path
init needs to be after msm_gpu_init for the error path to work.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 17e822f7 13-Jul-2020 Jonathan Marek <jonathan@marek.ca>

drm/msm: fix unbalanced pm_runtime_enable in adreno_gpu_{init, cleanup}

adreno_gpu_init calls pm_runtime_enable, so adreno_gpu_cleanup needs to
call pm_runtime_disable.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# f167989c 17-Jun-2020 Eric Anholt <eric@anholt.net>

drm/msm: Fix address space size after refactor.

Previously the address space went from 16M to ~0u, but with the
refactor one of the 'f's was dropped, limiting us to 256MB.
Additionally, the new interface takes a start and size, not start and
end, so we can't just copy and paste.

Fixes regressions in dEQP-VK.memory.allocation.random.*

Fixes: ccac7ce373c1 ("drm/msm: Refactor address space initialization")
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# bc8bd54f 12-Jun-2020 John Stultz <john.stultz@linaro.org>

drm/msm: Fix 0xfffflub in "Refactor address space initialization"

This week I started seeing GPU crashes on my DragonBoard 845c
which I narrowed down to being caused by commit ccac7ce373c1
("drm/msm: Refactor address space initialization").

Looking through the patch, Jordan and I couldn't find anything
obviously wrong, so I ended up breaking that change up into a
number of smaller logical steps so I could figure out which part
was causing the trouble.

Ends up, visually counting 'f's is hard, esp across a number
of lines:
0xfffffff != 0xffffffff

This patch corrects the end value we pass in to
msm_gem_address_space_create() in
adreno_iommu_create_address_space() so that it matches the value
used before the problematic patch landed.

With this change, I no longer see the GPU crashes that were
affecting me.

Cc: Shawn Guo <shawn.guo@linaro.org>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Sean Paul <sean@poorly.run>
Cc: Jordan Crouse <jcrouse@codeaurora.org>
Cc: freedreno@lists.freedesktop.org
Fixes: ccac7ce373c1 ("drm/msm: Refactor address space initialization")
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# ccac7ce3 22-May-2020 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Refactor address space initialization

Refactor how address space initialization works. Instead of having the
address space function create the MMU object (and thus require separate but
equal functions for gpummu and iommu) use a single function and pass the
MMU struct in. Make the generic code cleaner by using target specific
functions to create the address space so a2xx can do its own thing in its
own space. For all the other targets use a generic helper to initialize
IOMMU but leave the door open for newer targets to use customization
if they need it.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Tested-by: Shawn Guo <shawn.guo@linaro.org>
[squash in rebase fixups]
Signed-off-by: Rob Clark <robdclark@chromium.org>


# dc0fa5eb 09-May-2020 Shawn Guo <shawn.guo@linaro.org>

drm/msm/a4xx: add adreno a405 support

It adds support for adreno a405 found on MSM8939. The adreno_is_a430()
check in adreno_submit() needs an extension to cover a405. The
downstream driver suggests it should cover the whole a4xx generation.
That's why it gets changed to adreno_is_a4xx(), while a420 is not
tested though.

Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# a83366ef 23-Apr-2020 Jonathan Marek <jonathan@marek.ca>

drm/msm/a6xx: add A640/A650 to gpulist

Add Adreno 640 and 650 GPU info to the gpulist.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# b83caf42 11-Mar-2020 Takashi Iwai <tiwai@suse.de>

drm/msm: Use scnprintf() for avoiding potential buffer overflow

Since snprintf() returns the would-be-output size instead of the
actual output size, the succeeding calls may go beyond the given
buffer limit. Fix it by replacing with scnprintf().

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 5f9935f5 13-Jan-2020 Douglas Anderson <dianders@chromium.org>

drm/msm: Fix error about comments within a comment block

My compiler yells:
.../drivers/gpu/drm/msm/adreno/adreno_gpu.c:69:27:
error: '/*' within block comment [-Werror,-Wcomment]

Let's fix.

Fixes: 6a0dea02c2c4 ("drm/msm: support firmware-name for zap fw (v2)")
Link: https://patchwork.freedesktop.org/patch/348519/
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 3522b4b2 12-Jan-2020 Rob Clark <robdclark@chromium.org>

drm/msm: allow zapfw to not be specified in gpulist

For newer devices we want to require the path to come from the
firmware-name property in the zap-shader dt node.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>


# 6a0dea02 07-Jan-2020 Rob Clark <robdclark@chromium.org>

drm/msm: support firmware-name for zap fw (v2)

Since zap firmware can be device specific, allow for a firmware-name
property in the zap node to specify which firmware to load, similarly to
the scheme used for dsp/wifi/etc.

v2: only need a single error msg when we can't load from firmware-name
specified path, and fix comment [Bjorn A.]

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>


# 89048dd0 10-Dec-2019 Fabio Estevam <festevam@gmail.com>

drm/msm/adreno: Do not print error on "qcom, gpu-pwrlevels" absence

Booting the adreno driver on a imx53 board leads to the following
error message:

adreno 30000000.gpu: [drm:adreno_gpu_init] *ERROR* Could not find the GPU powerlevels

As the "qcom,gpu-pwrlevels" property is optional and never present on
i.MX5, turn the message into debug level instead.

Signed-off-by: Fabio Estevam <festevam@gmail.com>
Reviewed-by: Jeffrey Hugo <jeffrey.l.hugo@gmail.com>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 00bb9243 21-Nov-2019 Brian Masney <masneyb@onstation.org>

drm/msm/gpu: add support for ocmem interconnect path

Some A3xx and all A4xx Adreno GPUs do not have GMEM inside the GPU core
and must use the On Chip MEMory (OCMEM) in order to be functional.
There's a separate interconnect path that needs to be setup to OCMEM.
Add support for this second path to the GPU core.

In the downstream MSM 3.4 sources, the two interconnect paths for the
GPU are between:

- MSM_BUS_MASTER_GRAPHICS_3D and MSM_BUS_SLAVE_EBI_CH0
- MSM_BUS_MASTER_V_OCMEM_GFX3D and MSM_BUS_SLAVE_OCMEM

Signed-off-by: Brian Masney <masneyb@onstation.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 26c0b26d 23-Aug-2019 Brian Masney <masneyb@onstation.org>

drm/msm/gpu: add ocmem init/cleanup functions

The files a3xx_gpu.c and a4xx_gpu.c have ifdefs for the OCMEM support
that was missing upstream. Add two new functions (adreno_gpu_ocmem_init
and adreno_gpu_ocmem_cleanup) that removes some duplicated code.

Signed-off-by: Brian Masney <masneyb@onstation.org>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Tested-by: Gabriel Francisco <frc.gabrielgmail.com>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# c14b5dce 25-Jul-2019 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Annotate intentional switch statement fall throughs

Explicitly mark intentional fall throughs in switch statements to keep
-Wimplicit-fallthrough from complaining.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/1564073588-27386-1-git-send-email-jcrouse@codeaurora.org


# caab277b 02-Jun-2019 Thomas Gleixner <tglx@linutronix.de>

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234

Based on 1 normalized pattern(s):

this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation this program is
distributed in the hope that it will be useful but without any
warranty without even the implied warranty of merchantability or
fitness for a particular purpose see the gnu general public license
for more details you should have received a copy of the gnu general
public license along with this program if not see http www gnu org
licenses

extracted by the scancode license scanner the SPDX license identifier

GPL-2.0-only

has been chosen to replace the boilerplate/reference in 503 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Enrico Weigelt <info@metux.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190602204653.811534538@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 6672e11c 31-May-2019 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm/adreno: Ensure that the zap shader region is big enough

Before loading the zap shader we should ensure that the reserved memory
region is big enough to hold the loaded file.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Jeffrey Hugo <jeffrey.l.hugo@gmail.com>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# a9e2559c 19-Apr-2019 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm/gpu: Move zap shader loading to adreno

a5xx and a6xx both share (mostly) the same code to load the zap shader and
bring the GPU out of secure mode. Move the formerly 5xx specific code to
adreno to make it available for a6xx too.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>


# 48dc4241 16-Apr-2019 Rob Clark <robdclark@chromium.org>

drm/msm: add param to retrieve # of GPU faults (global)

For KHR_robustness, userspace wants to know two things, the count of GPU
faults globally, and the count of faults attributed to a given context.
This patch providees the former, and the next patch provides the latter.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>


# d674c963 15-Apr-2019 Rob Clark <robdclark@chromium.org>

drm/msm/gpu: add per-process pagetables param

For now it always returns '0' (false), but once the iommu work is in
place to enable per-process pagetables we can update the value returned.

Userspace needs to know this to make an informed decision about exposing
KHR_robustness.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>


# fcf9d0b7 12-Feb-2019 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm/a6xx: Add support for an interconnect path

Try to get the interconnect path for the GPU and vote for the maximum
bandwidth to support all frequencies. This is needed for performance.
Later we will want to scale the bandwidth based on the frequency to
also optimize for power but that will require some device tree
infrastructure that does not yet exist.

v6: use icc_set_bw() instead of icc_set()
v5: Remove hardcoded interconnect name and just use the default
v4: Don't use a port string at all to skip the need for names in the DT
v3: Use macros and change port string per Georgi Djakov

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Acked-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Evan Green <evgreen@chromium.org>
Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 2255f244 18-Dec-2018 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm/gpu: Remove hardcoded interrupt name

Every GPU core only has one interrupt so there isn't any
value in looking up the interrupt by name. Remove the name (which
is legacy anyway) and use platform_get_irq() instead.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 878411ae 18-Dec-2018 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm/gpu: Remove hardcoded interrupt name

Every GPU core only has one interrupt so there isn't any
value in looking up the interrupt by name. Remove the name (which
is legacy anyway) and use platform_get_irq() instead.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# c2052a4e 14-Nov-2018 Jonathan Marek <jonathan@marek.ca>

drm/msm: implement a2xx mmu

A2XX has its own very simple MMU.

Added a msm_use_mmu() function because we can't rely on iommu_present to
decide to use MMU or not.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 21af872c 21-Nov-2018 Jonathan Marek <jonathan@marek.ca>

drm/msm/adreno: add a2xx

derived from the a3xx driver and tested on the following hardware:
imx51-zii-rdu1 (a200 with 128kb gmem)
imx53-qsrb (a200)
msm8060-tenderloin (a220)

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 1df4289d 01-Nov-2018 Sharat Masetty <smasetty@codeaurora.org>

drm/msm: Optimize adreno_show_object()

When the userspace tries to read the crashstate dump, the read side
implementation in the driver currently ascii85 encodes all the binary
buffers and it does this each time the read system call is called.
A userspace tool like cat typically does a page by page read and the
number of read calls depends on the size of the data captured by the
driver. This is certainly not desirable and does not scale well with
large captures.

This patch encodes the buffer only once in the read path. With this there
is an immediate >10X speed improvement in crashstate save time.

Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 84c61275 07-Nov-2018 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm/gpu: Map the ringbuffer in the iova at create time

For reasons that I'm sure made perfect sense at the time we were
opting to defer the iova alloc / pin on the ringbuffer until HW
init time so when we moved to iova reference counting we ended
up adding a reference count every time the hardware started.
Not that it mattered (because the ring is always around) but
it did make the debug output look odd. Allocate and pin the iova
at create time instead.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 9fe041f6 07-Nov-2018 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Add msm_gem_get_and_pin_iova()

Add a new function to get and pin the iova memory in one
step (basically renaming the old msm_gem_get_iova function)
and switch msm_gem_get_iova() to only allocate an iova but
not map it in the IOMMU. This is only currently used by
msm_ioctl_gem_info() since all other users of of the iova
expect that the memory be immediately available.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# b9fc2302 02-Nov-2018 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm/adreno: Don't capture register values if target doesn't define them

If the GPU target doesn't define a list of registers then gracefully skip
capturing and/or printing them. This is used by more complex targets like
6xx that have other means of capturing register values.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 6a41da17 20-Oct-2018 Mamta Shukla <mamtashukla555@gmail.com>

drm: msm: Use DRM_DEV_* instead of dev_*

Use DRM_DEV_INFO/ERROR/WARN instead of dev_info/err/debug to generate
drm-formatted specific log messages so that it will be easy to
differentiate in case of multiple instances of driver.

Signed-off-by: Mamta Shukla <mamtashukla555@gmail.com>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# f9a70823 27-Aug-2018 Johan Hovold <johan@kernel.org>

drm/msm: fix OF child-node lookup

Use the new of_get_compatible_child() helper to lookup the legacy
pwrlevels child node instead of using of_find_compatible_node(), which
searches the entire tree from a given start node and thus can return an
unrelated (i.e. non-child) node.

This also addresses a potential use-after-free (e.g. after probe
deferral) as the tree-wide helper drops a reference to its first
argument (i.e. the probed device's node).

While at it, also fix the related child-node reference leak.

Fixes: e2af8b6b0ca1 ("drm/msm: gpu: Use OPP tables if we can")
Cc: stable <stable@vger.kernel.org> # 4.12
Cc: Jordan Crouse <jcrouse@codeaurora.org>
Cc: Rob Clark <robdclark@gmail.com>
Cc: David Airlie <airlied@linux.ie>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Rob Herring <robh@kernel.org>


# 2c087a33 06-Aug-2018 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm/adreno: Load the firmware before bringing up the hardware

Failure to load firmware is the primary reason to fail adreno_load_gpu().
Try to load it first before going into the hardware initialization code and
unwinding it. This is important for a6xx because the GMU gets loaded from
the runtime power code and it is more costly to fail in that path because
of missing firmware.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# bec2dd69 29-Jun-2018 Kees Cook <keescook@chromium.org>

drm/msm/adreno: Remove VLA usage

In the quest to remove all stack VLA usage from the kernel[1], this
switches to using a kasprintf()ed buffer. Return paths are updated
to free the allocation.

[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 3530a17f 26-Jul-2018 Arnd Bergmann <arnd@arndb.de>

drm/msm/gpu: avoid deprecated do_gettimeofday

All users of do_gettimeofday() have been removed, but this one recently
crept in, along with an incorrect printing of the microseconds portion.

This converts it to using ktime_get_real_timespec64() as a direct
replacement, and adds the leading zeroes. I considered using monotonic
times (ktime_get()) instead, but as this timestamp appears to only
be used for humans rather than compared with other timestamps, the
real time domain is probably good enough.

Fixes: e43b045e2c82 ("drm/msm/gpu: Capture the state of the GPU")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# cdb95931 24-Jul-2018 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm/gpu: Add the buffer objects from the submit to the crash dump

For hangs, dump copy out the contents of the buffer objects attached to the
guilty submission and print them in the crash dump report.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 50f8d218 24-Jul-2018 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm/adreno: Add a5xx specific registers for the GPU state

HLSQ, SP and TP registers are only accessible from a special
aperture and to make matters worse the aperture is blocked from
the CPU on targets that can support secure rendering. Luckily the
GPU hardware has its own purpose built register dumper that can
access the registers from the aperture. Add a5xx specific code
to program the crashdumper and retrieve the wayward registers
and dump them for the crash state.

Also, remove a block of registers the regular CPU accessible
list that aren't useful for debug which helps reduce the size
of the crash state file by a goodly amount.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 43a56687 24-Jul-2018 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm/adreno: Add ringbuffer data to the GPU state

Add the contents of each ringbuffer to the GPU state and dump the
data in the crash file encoded with ascii85. To save space only
the used portions of the ringbuffer are dumped.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# bcf1d9fa 24-Jul-2018 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm/adreno: Convert the show/crash file format

Convert the format of the 'show' debugfs file and the crash
dump to a format resembling YAML. This should be easier to
parse and be more flexible for future changes and expansions.

v2: Use a standard .rst for the msm crashdump documentation

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# c0fec7f5 24-Jul-2018 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm/gpu: Capture the GPU state on a GPU hang

Capture the GPU state on a GPU hang and store it for later playback
via the devcoredump facility. Only one crash state is stored at a
time on the assumption that the first hang is usually the most
interesting. The existing crash state can be cleared after capturing
it and then a new one will be captured on the next hang.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 4f776f45 24-Jul-2018 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm/gpu: Convert the GPU show function to use the GPU state

Convert the existing GPU show function to use the GPU state to
dump the information rather than reading it directly from the hardware.
This will require an additional step to capture the state before
dumping it for the existing nodes but it will greatly facilitate reusing
the same code for dumping a previously captured state from a GPU hang.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# e00e473d 24-Jul-2018 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm/gpu: Capture the state of the GPU

Add the infrastructure to capture the current state of the GPU and
store it in memory so that it can be dumped later.

For now grab the same basic ringbuffer information and registers
that are provided by the debugfs 'gpu' node but obviously this should
be extended to capture a much larger set of GPU information.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 64709686 07-May-2018 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm/gpu: Increase the pm runtime autosuspend for 5xx

Experimentation shows that resuming power quickly after suspending
ends up forcing a system hang for unknown reasons on 5xx targets.
To avoid cycling the power too much (especially during init)
turn up the autosuspend time for a5xx to 250ms and use
pm_runtime_put_autosuspend() when applicable.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 79d57bf6 13-Feb-2018 Bjorn Andersson <bjorn.andersson@linaro.org>

drm/msm: Trigger fence completion from GPU

Interrupt commands causes the CP to trigger an interrupt as the command
is processed, regardless of the GPU being done processing previous
commands. This is seen by the interrupt being delivered before the
fence is written on 8974 and is likely the cause of the additional
CP_WAIT_FOR_IDLE workaround found for a306, which would cause the CP to
wait for the GPU to go idle before triggering the interrupt.

Instead we can set the (undocumented) BIT(31) of the CACHE_FLUSH_TS
which will cause a special CACHE_FLUSH_TS interrupt to be triggered from
the GPU as the write event is processed.

Add CACHE_FLUSH_TS to the IRQ masks of A3xx and A4xx and remove the
workaround for A306.

Suggested-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 9de43e79 01-Feb-2018 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm/adreno: Use generic function to load firmware to a buffer object

Move a5xx specific code to load firmware into a buffer object to
the generic Adreno code. This will come in useful for future targets.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# c5e3548c 01-Feb-2018 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm/adreno: Define a list of firmware files to load per target

The number and type of firmware files required differs for each
target. Instead of using a fixed struct member for each possible
firmware file use a generic list of files that should be loaded
on boot. Use some semi-target specific enums to help each target
find the appropriate firmware(s) that it needs to load.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# f91c14ab 10-Jan-2018 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Add devfreq support for the GPU

Add support for devfreq to dynamically control the GPU frequency.
By default try to use the 'simple_ondemand' governor which can
adjust the frequency based on GPU load.

v2: Fix __aeabi_uldivmod issue from the 0 day bot and use
devfreq_recommended_opp() as suggested by Rob.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 999ae6ed 21-Nov-2017 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm/adreno: Move clock parsing to adreno_gpu_init()

Move the clock parsing to adreno_gpu_init() to allow for target
specific probing and manipulation of the clock tables.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 1babd706 21-Nov-2017 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm/gpu: Remove unused bus scaling code

Remove the downstream bus scaling code. It isn't needed for for
compatibility with a downstream or vendor kernel. Get it out of the
way to clear space for devfreq support.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 3a9016ba 02-Nov-2017 Colin Ian King <colin.king@canonical.com>

drm/msm: fix spelling mistake: "ringubffer" -> "ringbuffer"

Trivial fix to spelling mistake in DRM_DEV_ERROR error message

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# b1fc2839 20-Oct-2017 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Implement preemption for A5XX targets

Implement preemption for A5XX targets - this allows multiple
ringbuffers for different priorities with automatic preemption
of a lower priority ringbuffer if a higher one is ready.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 4d87fc32 20-Oct-2017 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Make the value of RB_CNTL (almost) generic

We use a global ringbuffer size and block size for all targets and
at least for 5XX preemption we need to know the value the RB_CNTL
in several locations so it makes sense to calculate it once and use
it everywhere.

The only monkey wrench is that we need to disable the RPTR shadow
for A430 targets but that only needs to be done once and doesn't
affect A5XX so we can or in the value at init time.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 4c7085a5 20-Oct-2017 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Shadow current pointer in the ring until command is complete

Add a shadow pointer to track the current command being written into
the ring. Don't commit it as 'cur' until the command is submitted.
Because 'cur' is used to construct the software copy of the wptr this
ensures that somebody peeking in on the ring doesn't assume that a
command is inflight while it is being written. This isn't a huge deal
with a single ring (though technically the hangcheck could assume
the system is prematurely busy when it isn't) but it will be rather
important for preemption where the decision to preempt is based
on a non-empty ringbuffer. Without a shadow an aggressive preemption
scheme could assume that the ringbuffer is non empty and switch to it
before the CPU is done writing the command and boom.

Even though preemption won't be supported for all targets because of
the way the code is organized it is simpler to make this generic for
all targets. The extra load for non-preemption targets should be
minimal.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# a6e29a0e 20-Oct-2017 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Add a parameter query for the number of ringbuffers

In order to manage ringbuffer priority to its fullest userspace
should know how many ringbuffers it has to work with. Add a
parameter to return the number of active rings.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# f97decac 20-Oct-2017 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Support multiple ringbuffers

Add the infrastructure to support the idea of multiple ringbuffers.
Assign each ringbuffer an id and use that as an index for the various
ring specific operations.

The biggest delta is to support legacy fences. Each fence gets its own
sequence number but the legacy functions expect to use a unique integer.
To handle this we return a unique identifier for each submission but
map it to a specific ring/sequence under the covers. Newer users use
a dma_fence pointer anyway so they don't care about the actual sequence
ID or ring.

The actual mechanics for multiple ringbuffers are very target specific
so this code just allows for the possibility but still only defines
one ringbuffer for each target family.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# cd414f3d 20-Oct-2017 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Move memptrs to msm_gpu

When we move to multiple ringbuffers we're going to store the data
in the memptrs on a per-ring basis. In order to prepare for that
move the current memptrs from the adreno namespace into msm_gpu.
This is way cleaner and immediately lets us kill off some sub
functions so there is much less cost later when we do move to
per-ring structs.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 2c41ef1b 16-Oct-2017 Rob Clark <robdclark@gmail.com>

drm/msm/adreno: deal with linux-firmware fw paths

When firmware was added to linux-firmware, it was put in a qcom sub-
directory, unlike what we'd been using before. For a300_pfp.fw and
a300_pm4.fw symlinks were created, but we'd prefer not to have to do
this in the future. So add support to look in both places when
loading firmware.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# e8f3de96 16-Oct-2017 Rob Clark <robdclark@gmail.com>

drm/msm/adreno: split out helper to load fw

Prep work for the next patch.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# eec874ce 16-Oct-2017 Rob Clark <robdclark@gmail.com>

drm/msm/adreno: load gpu at probe/bind time

Previously, in an effort to defer initializing the gpu until firmware
was available (ie. rootfs mounted), the gpu was not loaded at when the
subdevice was bound. Which resulted that clks/etc were requested in a
place that devm couldn't really help unwind if something failed.

Instead move request_firmware() to gpu->hw_init() and construct the gpu
earlier in adreno_bind(). To avoid the rest of the driver needing to
be aware of a gpu that hasn't managed to load firmware and hw_init()
yet, stash the gpu ptr in the adreno device's drvdata, and don't set
priv->gpu() until hw_init() succeeds.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 8223286d 27-Jul-2017 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Add a helper function for in-kernel buffer allocations

Nearly all of the buffer allocations for kernel allocate an buffer object,
virtual address and GPU iova at the same time. Make a helper function to
handle the details.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
[dropped msm_fbdev conversion to new helper, since it interferes with
display-handover work, where we want to separate allocation and mapping]
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 1267a4df 27-Jul-2017 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Attach the GPU MMU when it is created

Currently the GPU MMU is attached in the adreno_gpu code but as
more and more of the GPU initialization moves to the generic
GPU path we have a need to map and use GPU memory earlier and
earlier. There isn't any reason to defer attaching the MMU
until later so attach it right after the address space is
created so it can be used immediately.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 541de4c9 28-Jul-2017 Archit Taneja <architt@codeaurora.org>

drm/msm/adreno: Prevent unclocked access when retrieving timestamps

msm_gpu's get_timestamp() op (called by the MSM_GET_PARAM ioctl) can
result in register accesses. We need our power domain and clocks to
be active for that. Make sure they are enabled here.

Signed-off-by: Archit Taneja <architt@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 0e08270a 13-Jun-2017 Sushmita Susheelendra <ssusheel@codeaurora.org>

drm/msm: Separate locking of buffer resources from struct_mutex

Buffer object specific resources like pages, domains, sg list
need not be protected with struct_mutex. They can be protected
with a buffer object level lock. This simplifies locking and
makes it easier to avoid potential recursive locking scenarios
for SVM involving mmap_sem and struct_mutex. This also removes
unnecessary serialization when creating buffer objects, and also
between buffer object creation and GPU command submission.

Signed-off-by: Sushmita Susheelendra <ssusheel@codeaurora.org>
[robclark: squash in handling new locking for shrinker]
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 8bdcd949 13-Jun-2017 Rob Clark <robdclark@gmail.com>

drm/msm: pass address-space to _get_iova() and friends

No functional change, that will come later. But this will make it
easier to deal with dynamically created address spaces (ie. per-
process pagetables for gpu).

Signed-off-by: Rob Clark <robdclark@gmail.com>


# cb1e3818 13-Jun-2017 Rob Clark <robdclark@gmail.com>

drm/msm: fix locking inconsistency for gpu->hw_init()

Most, but not all, paths where calling the with struct_mutex held. The
fast-path in msm_gem_get_iova() (plus some sub-code-paths that only run
the first time) was masking this issue.

So lets just always hold struct_mutex for hw_init(). And sprinkle some
WARN_ON()'s and might_lock() to avoid this sort of problem in the
future.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 42a105e9 08-May-2017 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Remove memptrs->wptr

memptrs->wptr seems to be unused. Remove it to avoid
confusing the upcoming preemption code.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 5770fc7a 08-May-2017 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Add a struct to pass configuration to msm_gpu_init()

The amount of information that we need to pass into msm_gpu_init()
is steadily increasing, so add a new struct to stabilize the function
call and make it easier to add new configuration down the line.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# bf5af4ae 07-Mar-2017 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Hard code the GPU "slow frequency"

Some A3XX and A4XX GPU targets required that the GPU clock be
programmed to a non zero value when it was disabled so
27Mhz was chosen as the "invalid" frequency.

Even though newer targets do not have the same clock restrictions
we still write 27Mhz on clock disable and expect the clock subsystem
to round down to zero.

For unknown reasons even though the slow clock speed is always
27Mhz and it isn't actually a functional level the legacy device tree
frequency tables always defined it and then did gymnastics to work
around it.

Instead of playing the same silly games just hard code the "slow" clock
speed in the code as 27MHz and save ourselves a bit of infrastructure.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# e3689e47 07-Mar-2017 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Add MSM_PARAM_GMEM_BASE

User space needs to know where the GMEM whole starts so that they
can set up the addressing correctly.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# ee546cd3 07-Mar-2017 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Reference count address spaces

There are reasons for a memory object to outlive the file descriptor
that created it and so the address space that a buffer object is
attached to must also outlive the file descriptor. Reference count
the address space so that it can remain viable until all the objects
have released their addresses.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 9873ef07 06-Feb-2017 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Make sure to detach the MMU during GPU cleanup

We should be detaching the MMU before destroying the address
space. To do this cleanly, the detach has to happen in
adreno_gpu_cleanup() because it needs access to structs
in adreno_gpu.c. Plus it is better symmetry to have
the attach and detach at the same code level.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# de098e5f 12-Feb-2017 Rob Clark <robdclark@gmail.com>

drm/msm/adreno: reset ringbuffer in hw_init

We need to do this also in resume path when we need to re-hw_init().

Signed-off-by: Rob Clark <robdclark@gmail.com>


# eeb75474 10-Feb-2017 Rob Clark <robdclark@gmail.com>

drm/msm/gpu: use pm-runtime

We need to use pm-runtime properly when IOMMU is using device_link() to
control it's own clocks.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# c3c3ab19 10-Feb-2017 Rob Clark <robdclark@gmail.com>

drm/msm/gpu: move suspend/resume into debugfs->show

Each of the per-generation callbacks was doing this. Lets just simplify
and move it into toplevel show() fxn.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 028402d4 06-Feb-2017 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Make sure to detach the MMU during GPU cleanup

We should be detaching the MMU before destroying the address
space. To do this cleanly, the detach has to happen in
adreno_gpu_cleanup() because it needs access to structs
in adreno_gpu.c. Plus it is better symmetry to have
the attach and detach at the same code level.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 4e09b95d 30-Jan-2017 Rob Clark <robdclark@gmail.com>

drm/msm: drop quirks binding

This was never documented or used in upstream dtb. It is used by
downstream bindings from android device kernels. But the quirks are
a property of the gpu revision, and as such are redundant to be listed
separately in dt. Instead, move the quirks to the device table.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>


# de85d2b3 12-Jan-2017 Rob Clark <robdclark@gmail.com>

drm/msm: fix potential null ptr issue in non-iommu case

Fixes: 9cb07b099fb ("drm/msm: support multiple address spaces")
Reported-by: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 88b333b0 20-Dec-2016 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Ensure that the hardware write pointer is valid

Currently the value written to CP_RB_WPTR is calculated on the fly as
(rb->next - rb->start). But as the code is designed rb->next is wrapped
before writing the commands so if a series of commands happened to
fit perfectly in the ringbuffer, rb->next would end up being equal to
rb->size / 4 and thus result in an out of bounds address to CP_RB_WPTR.

The easiest way to fix this is to mask WPTR when writing it to the
hardware; it makes the hardware happy and the rest of the ringbuffer
math appears to work and there isn't any point in upsetting anything.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
[squash in is_power_of_2() check]
Signed-off-by: Rob Clark <robdclark@gmail.com>


# b5f103ab 28-Nov-2016 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: gpu: Add A5XX target support

Add support for the A5XX family of Adreno GPUs.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 4ac277cd 28-Nov-2016 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Disable interrupts during init

Disable the interrupt during the init sequence to avoid having
interrupts fired for errors and other things that we are not
ready to handle while initializing.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# fb039981 28-Nov-2016 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: Add adreno_gpu_write64()

Add a new generic function to write a "64" bit value. This isn't
actually a 64 bit operation, it just writes the upper and lower
32 bit of a 64 bit value to a specified LO and HI register. If
a particular target doesn't support one of the registers it can
mark that register as SKIP and writes/reads from that register
will be quietly dropped.

This can be immediately put in place for the ringbuffer base and
the RPTR address. Both writes are converted to use
adreno_gpu_write64() with their respective high and low registers
and the high register appropriately marked as SKIP for both 32 bit
targets (a3xx and a4xx). When a5xx comes it will define valid target
registers for the 'hi' option and everything else will just work.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# c4a8d475 28-Nov-2016 Jordan Crouse <jcrouse@codeaurora.org>

drm/msm: gpu: Return error on hw_init failure

When the GPU hardware init function fails (like say, ME_INIT timed
out) return error instead of blindly continuing on. This gives us
a small chance of saving the system before it goes boom.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 398efc46 11-Nov-2016 Rob Clark <robdclark@gmail.com>

drm/msm/adreno: move scratch register dumping to per-gen code

Scratch registers move, annoyingly enough, in a5xx. Move to
per-generation aNxx_recover() fxn.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 667ce33e 28-Sep-2016 Rob Clark <robdclark@gmail.com>

drm/msm: support multiple address spaces

We can have various combinations of 64b and 32b address space, ie. 64b
CPU but 32b display and gpu, or 64b CPU and GPU but 32b display. So
best to decouple the device iova's from mmap offset.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 6b597ce2 01-Jun-2016 Rob Clark <robdclark@gmail.com>

drm/msm: deal with arbitrary # of cmd buffers

For some optimizations coming on the userspace side, splitting larger
draw or gmem cmds into multiple cmdstream buffers, we need to support
much more than the previous small/arbitrary limit.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 18f23049 26-May-2016 Rob Clark <robdclark@gmail.com>

drm/msm: change gem->vmap() to get/put

Before we can add vmap shrinking, we really need to know which vmap'ings
are currently being used. So switch to get/put interface. Stubbed put
fxns for now.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 69a834c2 24-May-2016 Rob Clark <robdclark@gmail.com>

drm/msm: deal with exhausted vmap space better

Some, but not all, callers of obj->vmap() would check if return
IS_ERR(). So let's actually return an error if vmap() fails. And fixup
the call-sites that were not handling this properly.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 1193c3bc 03-May-2016 Rob Clark <robdclark@gmail.com>

drm/msm: drop return from gpu->submit()

At this point, there is nothing left to fail. And submit already has a
fence assigned and is added to the submit_list. Any problems from here
on out are asynchronous (ie. hangcheck/recovery).

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 27557343 17-Mar-2016 Rob Clark <robdclark@gmail.com>

drm/msm: fix ->last_fence() after recover

It is no longer true that we discard all in-flight submits on recover
(these days we only discard the first one that hung). After the first
re-submitted batch completes it would overwrite the fence with a correct
value, but there would be a window of time which showed all re-submitted
batches as already complete.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# b6295f9a 15-Mar-2016 Rob Clark <robdclark@gmail.com>

drm/msm: 'struct fence' conversion

Signed-off-by: Rob Clark <robdclark@gmail.com>


# ca762a8a 15-Mar-2016 Rob Clark <robdclark@gmail.com>

drm/msm: introduce msm_fence_context

Better encapsulate the per-timeline stuff into fence-context. For now
there is just a single fence-context, but eventually we'll also have one
per-CRTC to enable fully explicit fencing.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 6c77d1ab 22-Feb-2016 Rob Clark <robdclark@gmail.com>

drm/msm: add timestamp param

We need this for GL_TIMESTAMP queries.

Note: currently only supported on a4xx.. a3xx doesn't have this
always-on counter. I think we could emulate it with the one CP
counter that is available, but for now it is of limited usefulness
on a3xx (since we can't seem to do time-elapsed queries in any sane
way with the existing firmware on a3xx, and if you are trying to do
profiling on a tiler you want time-elapsed). We can add that later
if it becomes useful.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 7d0c5ee9 18-Feb-2016 Craig Stout <cstout@chromium.org>

drm/msm/adreno: get CP_RPTR from register instead of shadow memory

As described in the downstream/kgsl driver:
Sometimes the RPTR shadow memory is unreliable causing timeouts
in adreno_idle(). Read it directly from the register instead.

Signed-off-by: Craig Stout <cstout@chromium.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 357ff00b 18-Feb-2016 Craig Stout <cstout@chromium.org>

drm/msm/adreno: support for adreno 430.

Signed-off-by: Craig Stout <cstout@chromium.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 4102a9e5 08-Feb-2016 Rob Clark <robdclark@gmail.com>

drm/msm: add max-freq gpu param to uapi

We need this in userspace for interpreting some of the perf ctrs.

Note possibly not quite sufficient if we had some frequency mgmt
approach other than race-to-idle. Not really sure what the best
thing to do if we did. Although displaying results as a percentage
of max frequence seems sensible(ish) if we did.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>


# d735fdc3 12-May-2015 Rob Clark <robdclark@gmail.com>

drm/msm: workaround for missing irq on a306/8x16

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 6490ad47 04-Jun-2015 Rob Clark <robdclark@gmail.com>

drm/msm: clarify downstream bus scaling

A few spots in the driver have support for downstream android
CONFIG_MSM_BUS_SCALING. This is mainly to simplify backporting the
driver for various devices which do not have sufficient upstream
kernel support. But the intentionally dead code seems to cause
some confusion. Rename the #define to make this more clear.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 26716185 19-Apr-2015 Rob Clark <robdclark@gmail.com>

drm/msm/adreno: dump scratch regs and other info on hang

Dump a bit more info when the GPU hangs, without having hang_debug
enabled (which dumps a *lot* of registers). Also dump the scratch
registers, as they are useful for determining where in the cmdstream
the GPU hung (and they seem always safe to read when GPU has hung).

Note that the freedreno gallium driver emits increasing counter values
to SCRATCH6 (to identify tile #) and SCRATCH7 (to identify draw #), so
these two in particular can be used to "triangulate" where in the
cmdstream the GPU hung.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 774449eb 15-May-2015 Rob Clark <robdclark@gmail.com>

drm/msm: fix locking inconsistencies in gpu->destroy()

In error paths, this was being called without struct_mutex held.
Leading to panics like:

msm 1a00000.qcom,mdss_mdp: No memory protection without IOMMU
Kernel panic - not syncing: BUG!
CPU: 0 PID: 1409 Comm: cat Not tainted 4.0.0-dirty #4
Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
Call trace:
[<ffffffc000089c78>] dump_backtrace+0x0/0x118
[<ffffffc000089da0>] show_stack+0x10/0x20
[<ffffffc0006686d4>] dump_stack+0x84/0xc4
[<ffffffc0006678b4>] panic+0xd0/0x210
[<ffffffc0003e1ce4>] drm_gem_object_free+0x5c/0x60
[<ffffffc000402870>] adreno_gpu_cleanup+0x60/0x80
[<ffffffc0004035a0>] a3xx_destroy+0x20/0x70
[<ffffffc0004036f4>] a3xx_gpu_init+0x84/0x108
[<ffffffc0004018b8>] adreno_load_gpu+0x58/0x190
[<ffffffc000419dac>] msm_open+0x74/0x88
[<ffffffc0003e0a48>] drm_open+0x168/0x400
[<ffffffc0003e7210>] drm_stub_open+0xa8/0x118
[<ffffffc0001a0e84>] chrdev_open+0x94/0x198
[<ffffffc000199f88>] do_dentry_open+0x208/0x310
[<ffffffc00019a4c4>] vfs_open+0x44/0x50
[<ffffffc0001aa26c>] do_last.isra.14+0x2c4/0xc10
[<ffffffc0001aac38>] path_openat+0x80/0x5e8
[<ffffffc0001ac354>] do_filp_open+0x2c/0x98
[<ffffffc00019b60c>] do_sys_open+0x13c/0x228
[<ffffffc00019b72c>] SyS_openat+0xc/0x18
CPU1: stopping

But there isn't any particularly good reason to hold struct_mutex for
teardown, so just standardize on calling it without the mutex held and
use the _unlocked() versions for GEM obj unref'ing

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 5acb07ea 25-Nov-2014 Markus Elfring <elfring@users.sourceforge.net>

drm/msm: Deletion of unnecessary checks before the function call "release_firmware"

The release_firmware() function tests whether its argument is NULL
and then returns immediately. Thus the test around the call is not needed.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Reviewed-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 23bd62fd 08-Sep-2014 Aravind Ganesan <aravindg@codeaurora.org>

drm/msm: a4xx support for msm-drm

Added a4xx GPU support.

Signed-off-by: Aravind Ganesan <aravindg@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 91b74e97 08-Sep-2014 Aravind Ganesan <aravindg@codeaurora.org>

drm/msm: Handle register offset differences between a3xx and a4xx

Register offsets have changed between a3xx and a4xx GPUs.
To be able access these registers in common code, we create
a lookup table, and set of read-write APIs to access the
register through the lookup table.

Signed-off-by: Aravind Ganesan <aravindg@codeaurora.org>
[robclark: remove REG_ADRENO_UNDEFINED, just use zero, and minor
tweaks for latest generated headers]
Signed-off-by: Rob Clark <robdclark@gmail.com>


# 0122f96f 31-Oct-2014 Rob Clark <robdclark@gmail.com>

drm/msm/adreno: slight init order cleanup

Move anything that can fail after call to base class msm_gpu_init().
This way, if we fail, active_list has already been initialized so we
don't trip 'WARN_ON(!list_empty(&gpu->active_list))' in
msm_gpu_cleanup().

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 3bcefb04 05-Sep-2014 Rob Clark <robdclark@gmail.com>

drm/msm/adreno: push dump/show stuff to base class

Add ptr to list of interesting registers to 'struct adreno_gpu' and use
that to move most of the debugfs show and register dump bits down into
adreno_gpu. This will avoid duplication as support for additional
adreno generations is added.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 3526e9fb 05-Sep-2014 Rob Clark <robdclark@gmail.com>

drm/msm/adreno: bit of init refactoring

Push a few bits down into adreno_gpu so they won't have to be duplicated
as support for additional adreno generations is added.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# e2550b7a 05-Sep-2014 Rob Clark <robdclark@gmail.com>

drm/msm/adreno: move decision about what gpu to to load

Move this into into adreno_device, and decide based on gpu revision
rather than just assuming a3xx.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# a1ad3523 11-Jul-2014 Rob Clark <robdclark@gmail.com>

drm/msm: fix potential deadlock in gpu init

Somewhere along the way, the firmware loader sprouted another lock
dependency, resulting in possible deadlock scenario:

&dev->struct_mutex --> &sb->s_type->i_mutex_key#2 --> &mm->mmap_sem

which is problematic vs things like gem mmap.

So introduce a separate mutex to synchronize gpu init.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 944fc36c 09-Jul-2014 Rob Clark <robdclark@gmail.com>

drm/msm: use upstream iommu

Downstream kernel IOMMU had a non-standard way of dealing with multiple
devices and multiple ports/contexts. We don't need that on upstream
kernel, so rip out the crazy.

Note that we have to move the pinning of the ringbuffer to after the
IOMMU is attached. No idea how that managed to work properly on the
downstream kernel.

For now, I am leaving the IOMMU port name stuff in place, to simplify
things for folks trying to backport latest drm/msm to device kernels.
Once we no longer have to care about pre-DT kernels, we can drop this
and instead backport upstream IOMMU driver.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 4e1cbaa3 04-Feb-2014 Rob Clark <robdclark@gmail.com>

drm/msm: add chip-id param

Some of the w/a or different behavior of userspace blob driver seem to
be keyed to gpu patch revision, rather than gpu-id. So expose the full
chip-id to userspace so it can DTRT.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 0963756f 11-Jan-2014 Rob Clark <robdclark@gmail.com>

drm/msm: spin helper

Helper macro to simplify places where we need to poll with timeout
waiting for gpu.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 5b6ef08e 22-Dec-2013 Rob Clark <robdclark@gmail.com>

drm/msm: add hang_debug module param

msm.hang_debug=y will dump out current register values if the gpu locks
up, for easier debugging.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 55459968 05-Dec-2013 Rob Clark <robdclark@gmail.com>

drm/msm: add a330/apq8x74

Add support for adreno 330. Not too much different, just a few
differences in initial configuration plus setting OCMEM base.
Userspace support is already in upstream mesa.

Note that the existing DT code is simply using the bindings from
downstream android kernel, to simplify porting of this driver to
existing devices. These do not constitute any committed/stable
DT ABI. The addition of proper DT bindings will be a subsequent
patch, at which point (as best as possible) I will try to support
either upstream bindings or what is found in downstream android
kernel, so that existing device DT files can be used.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 871d812a 15-Nov-2013 Rob Clark <robdclark@gmail.com>

drm/msm: add support for non-IOMMU systems

Add a VRAM carveout that is used for systems which do not have an IOMMU.

The VRAM carveout uses CMA. The arch code must setup a CMA pool for the
device (preferrably in highmem.. a 256m-512m VRAM pool in lowmem is not
cool). The user can configure the VRAM pool size using msm.vram module
param.

Technically, the abstraction of IOMMU behind msm_mmu is not strictly
needed, but it simplifies the GEM code a bit, and will be useful later
when I add support for a2xx devices with GPUMMU, so I decided to keep
this part.

It appears to be possible to configure the GPU to restrict access to
addresses within the VRAM pool, but this is not done yet. So for now
the GPU will refuse to load if there is no sort of mmu. Once address
based limits are supported and tested to confirm that we aren't giving
the GPU access to arbitrary memory, this restriction can be lifted

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 3b57f23b 15-Nov-2013 Rob Clark <robdclark@gmail.com>

drm/msm: add missing MODULE_FIRMWARE()s

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 26791c48 03-Sep-2013 Rob Clark <robdclark@gmail.com>

drm/msm: hangcheck harder

If gpu locks up with the rptr shortly beyond the wrap-around point in
the ringbuffer, because the rptr was not reset (but wptr is, by virtue
of resetting rb->cur), we could end up in a scenario where we think
there is not enough space in the ringbuffer for the next cmds. And
since the CP won't reset rptr until after processing an IB, this leaves
things in a sort of deadlock.

So reset rptr too. And a bit more spiffing up of hangcheck to make
things easier to debug.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# bd6f82d8 24-Aug-2013 Rob Clark <robdclark@gmail.com>

drm/msm: add basic hangcheck/recovery mechanism

A basic, no-frills recovery mechanism in case the gpu gets wedged. We
could try to be a bit more fancy and restart the next submit after the
one that got wedged, but for now keep it simple. This is enough to
recover things if, for example, the gpu hangs mid way through a piglit
run.

Signed-off-by: Rob Clark <robdclark@gmail.com>


# 7198e6b0 18-Jul-2013 Rob Clark <robdclark@gmail.com>

drm/msm: add a3xx gpu support

Add initial support for a3xx 3d core.

So far, with hardware that I've seen to date, we can have:
+ zero, one, or two z180 2d cores
+ a3xx or a2xx 3d core, which share a common CP (the firmware
for the CP seems to implement some different PM4 packet types
but the basics of cmdstream submission are the same)

Which means that the eventual complete "class" hierarchy, once
support for all past and present hw is in place, becomes:
+ msm_gpu
+ adreno_gpu
+ a3xx_gpu
+ a2xx_gpu
+ z180_gpu

This commit splits out the parts that will eventually be common
between a2xx/a3xx into adreno_gpu, and the parts that are even
common to z180 into msm_gpu.

Note that there is no cmdstream validation required. All memory access
from the GPU is via IOMMU/MMU. So as long as you don't map silly things
to the GPU, there isn't much damage that the GPU can do.

Signed-off-by: Rob Clark <robdclark@gmail.com>