History log of /linux-master/drivers/gpu/drm/v3d/v3d_gem.c
Revision Date Author Comments
# 9032d5f6 30-Nov-2023 Melissa Wen <mwen@igalia.com>

drm/v3d: Detach job submissions IOCTLs to a new specific file

We will include a new job submission type, the CPU job submission. For
readability and maintainability, separate the job submission IOCTLs and
related operations from v3d_gem.c.

Minor fix in the CSD submission kernel doc:
CSD (texture formatting) -> CSD (compute shader).

Signed-off-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-5-mcanal@igalia.com


# a8ad9d63 30-Nov-2023 Melissa Wen <mwen@igalia.com>

drm/v3d: Move wait BO ioctl to the v3d_bo file

IOCTLs related to BO operations reside on the file v3d_bo.c. The wait BO
ioctl is the only IOCTL regarding BOs that is placed in a different file.
So, move it to the v3d_bo.c file.

Signed-off-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-4-mcanal@igalia.com


# a78422e9 09-Nov-2023 Danilo Krummrich <dakr@redhat.com>

drm/sched: implement dynamic job-flow control

Currently, job flow control is implemented simply by limiting the number
of jobs in flight. Therefore, a scheduler is initialized with a credit
limit that corresponds to the number of jobs which can be sent to the
hardware.

This implies that for each job, drivers need to account for the maximum
job size possible in order to not overflow the ring buffer.

However, there are drivers, such as Nouveau, where the job size has a
rather large range. For such drivers it can easily happen that job
submissions not even filling the ring by 1% can block subsequent
submissions, which, in the worst case, can lead to the ring running dry.

In order to overcome this issue, allow for tracking the actual job size
instead of the number of jobs. Therefore, add a field to track a job's
credit count, which represents the number of credits a job contributes
to the scheduler's credit limit.
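
The difference between limiting the number of jobs and limiting credits can be sketched in plain C. This is a userspace model under stated assumptions, not the drm_sched API: `struct sched_model`, `struct job_model`, `sched_can_queue()`, `sched_push()` and `sched_complete()` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model; not the real drm_sched structures. */
struct sched_model {
	uint32_t credit_limit;  /* total ring capacity, in credits */
	uint32_t credit_count;  /* credits held by jobs in flight */
};

struct job_model {
	uint32_t credits;       /* ring space this job actually needs */
};

/* A job may be queued only if its credits fit in the remaining budget. */
static bool sched_can_queue(const struct sched_model *s,
			    const struct job_model *j)
{
	assert(s->credit_count <= s->credit_limit);
	return j->credits <= s->credit_limit - s->credit_count;
}

/* Account for a job sent to the hardware; returns false if it must wait. */
static bool sched_push(struct sched_model *s, const struct job_model *j)
{
	if (!sched_can_queue(s, j))
		return false;
	s->credit_count += j->credits;
	return true;
}

/* Return the job's credits once the hardware has finished it. */
static void sched_complete(struct sched_model *s, const struct job_model *j)
{
	assert(s->credit_count >= j->credits);
	s->credit_count -= j->credits;
}
```

With a limit of 100 credits, a 1-credit job leaves room for 99 more credits of work, whereas a pure job-count limit would have to reserve the maximum job size for it.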

Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Reviewed-by: Luben Tuikov <ltuikov89@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231110001638.71750-1-dakr@redhat.com


# 509433d8 05-Sep-2023 Maíra Canal <mcanal@igalia.com>

drm/v3d: Expose the total GPU usage stats on sysfs

The previous patch exposed the accumulated amount of active time per
client for each V3D queue. But this doesn't provide a global notion of
the GPU usage.

Therefore, provide the accumulated amount of active time for each V3D
queue (BIN, RENDER, CSD, TFU and CACHE_CLEAN), considering all the jobs
submitted to the queue, independent of the client.

This data is exposed through the sysfs interface, so that if the
interface is queried at two different points of time the usage percentage
of each of the queues can be calculated.
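
A minimal sketch of that calculation, assuming each query yields a timestamp and the accumulated active time of one queue. The struct and its field names are hypothetical and do not describe the actual sysfs file layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sample of one queue; field names are illustrative. */
struct queue_sample {
	uint64_t timestamp_ns; /* when the sample was taken */
	uint64_t runtime_ns;   /* accumulated active time of the queue */
};

/* Usage of the queue between two samples, as a percentage (0.0 to 100.0). */
static double queue_usage_percent(struct queue_sample prev,
				  struct queue_sample cur)
{
	assert(cur.timestamp_ns >= prev.timestamp_ns);
	assert(cur.runtime_ns >= prev.runtime_ns);

	uint64_t elapsed = cur.timestamp_ns - prev.timestamp_ns;
	uint64_t active = cur.runtime_ns - prev.runtime_ns;

	if (elapsed == 0)
		return 0.0;
	return 100.0 * (double)active / (double)elapsed;
}
```

Only the differences between the two samples matter, so the counters never need to be reset.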

Co-developed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Acked-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230905213416.1290219-3-mcanal@igalia.com


# 09a93cc4 05-Sep-2023 Maíra Canal <mcanal@igalia.com>

drm/v3d: Implement show_fdinfo() callback for GPU usage stats

This patch exposes the accumulated amount of active time per client
through the fdinfo infrastructure. The amount of active time is exposed
for each V3D queue: BIN, RENDER, CSD, TFU and CACHE_CLEAN.

In order to calculate the amount of active time per client, a CPU clock
is used through the function local_clock(). The point where the job
started is marked and finally compared with the time at which the job
finished.

Moreover, the number of jobs submitted to each queue is also exposed on
fdinfo through the identifier "v3d-jobs-<queue>".
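
The bookkeeping described above can be modelled roughly as follows. This is a userspace sketch, not the driver code: `struct queue_stats` and both helpers are hypothetical, and the `now_ns` argument stands in for a local_clock() reading.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-client, per-queue statistics. */
struct queue_stats {
	uint64_t start_ns;    /* timestamp of the running job, 0 if idle */
	uint64_t enabled_ns;  /* accumulated active time */
	uint64_t jobs_sent;   /* number of jobs submitted to the queue */
};

/* Called when a job starts running; `now_ns` stands in for local_clock(). */
static void stats_job_start(struct queue_stats *st, uint64_t now_ns)
{
	st->start_ns = now_ns;
	st->jobs_sent++;
}

/* Called when the job finishes: add the elapsed time to the total. */
static void stats_job_end(struct queue_stats *st, uint64_t now_ns)
{
	assert(now_ns >= st->start_ns);
	st->enabled_ns += now_ns - st->start_ns;
	st->start_ns = 0;
}
```

Passing the timestamp in as a parameter keeps the sketch deterministic; a real implementation would read the clock at each call site.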

Co-developed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Acked-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230905213416.1290219-3-mcanal@igalia.com


# 0ad5bc1c 31-Oct-2023 Iago Toral Quiroga <itoral@igalia.com>

drm/v3d: fix up register addresses for V3D 7.x

This patch updates a number of register addresses that have
been changed in Raspberry Pi 5 (V3D 7.1) and updates the
code to use the corresponding registers and addresses based
on the actual V3D version.

Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231031073859.25298-3-itoral@igalia.com


# 79d94360 23-Oct-2023 Maíra Canal <mcanal@igalia.com>

drm/v3d: wait for all jobs to finish before unregistering

Currently, we are only warning the user if the BIN or RENDER jobs don't
finish before we unregister V3D. We must wait for all jobs to finish
before unregistering. Therefore, warn the user if TFU or CSD jobs
are not done by the time the driver is unregistered.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Maíra Canal <mairacanal@riseup.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20231023105927.101502-1-mcanal@igalia.com


# 25c0e406 09-Feb-2023 Maíra Canal <mcanal@igalia.com>

drm/v3d: Use drm_sched_job_add_syncobj_dependency()

As v3d_job_add_deps() performs the same steps as
drm_sched_job_add_syncobj_dependency(), replace the open-coded
implementation in v3d in order to simply use the DRM function.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Maíra Canal <mairacanal@riseup.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20230209124447.467867-6-mcanal@igalia.com


# a53be8da 27-Dec-2022 Maíra Canal <mcanal@igalia.com>

drm/v3d: replace open-coded implementation of drm_gem_object_lookup

As v3d_submit_tfu_ioctl() performs the same steps as
drm_gem_object_lookup(), replace the open-coded implementation in v3d
with its DRM core equivalent.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221227200010.191351-1-mcanal@igalia.com


# 5d930605 04-Dec-2022 Melissa Wen <mwen@igalia.com>

drm/v3d: replace obj lookup steps with drm_gem_objects_lookup

As v3d_lookup_bos() performs the same steps as drm_gem_objects_lookup(),
replace the explicit code in v3d to simply use the DRM function.

Signed-off-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221205135538.3545051-3-mwen@igalia.com


# f98c5ec2 04-Dec-2022 Melissa Wen <mwen@igalia.com>

drm/v3d: cleanup BOs properly when lookup_bos fails

When v3d_lookup_bos fails to `allocate validated BO pointers`,
job->bo_count was already set to args->bo_count, but job->bo points to
NULL. In this scenario, we must verify that job->bo is not NULL before
iterating on it to properly clean up a job. Also, drm_gem_object_put
already checks that the object passed is not NULL, making the job->bo[i]
check redundant.
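
The cleanup rule can be illustrated with a small userspace model. The types and helpers below (`struct obj_model`, `struct job_model`, `obj_put()`, `job_free_bos()`) are hypothetical stand-ins, not the v3d or DRM definitions.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for a GEM object and a job. */
struct obj_model { int refcount; };

struct job_model {
	struct obj_model **bo;  /* may be NULL if the array allocation failed */
	unsigned int bo_count;  /* set before the array is allocated */
};

/* Like drm_gem_object_put() as described above: a NULL object is a no-op. */
static void obj_put(struct obj_model *obj)
{
	if (obj)
		obj->refcount--;
}

/* Release every BO reference held by the job, tolerating a failed lookup. */
static void job_free_bos(struct job_model *job)
{
	if (job->bo) {
		for (unsigned int i = 0; i < job->bo_count; i++)
			obj_put(job->bo[i]); /* no per-entry NULL check needed */
		free(job->bo);
		job->bo = NULL;
	}
	job->bo_count = 0;
}
```

The single check on the array pointer covers the case where the count is nonzero but nothing was allocated.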

Signed-off-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221205135538.3545051-2-mwen@igalia.com


# 91d502f6 08-Nov-2022 Maíra Canal <mcanal@igalia.com>

drm/v3d: switch to drmm_mutex_init

mutex_init is supposed to be balanced by a call to mutex_destroy, but
this is not currently happening on the v3d driver.

Considering the introduction of a DRM-managed mutex_init variant, switch
to the drmm_mutex_init.
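
The idea behind a managed init helper can be sketched as a list of release actions that run when the device goes away. Everything below is a hypothetical userspace model, not the DRM managed-resource API, and the reverse release order is an assumption of the sketch.

```c
#include <stddef.h>

#define MAX_ACTIONS 8

typedef void (*release_fn)(void *data);

/* Hypothetical device that remembers cleanup actions. */
struct managed_dev {
	release_fn fn[MAX_ACTIONS];
	void *data[MAX_ACTIONS];
	size_t count;
};

static int managed_add_action(struct managed_dev *dev, release_fn fn,
			      void *data)
{
	if (dev->count == MAX_ACTIONS)
		return -1;
	dev->fn[dev->count] = fn;
	dev->data[dev->count] = data;
	dev->count++;
	return 0;
}

/* Run all registered actions in reverse order of registration. */
static void managed_release(struct managed_dev *dev)
{
	while (dev->count > 0) {
		dev->count--;
		dev->fn[dev->count](dev->data[dev->count]);
	}
}

/* Stand-in for a mutex: only tracks whether it is initialized. */
struct lock_model { int initialized; };

static void lock_destroy_action(void *data)
{
	((struct lock_model *)data)->initialized = 0;
}

/* Init and registration of the matching destroy happen together. */
static int managed_lock_init(struct managed_dev *dev, struct lock_model *l)
{
	int ret;

	l->initialized = 1;
	ret = managed_add_action(dev, lock_destroy_action, l);
	if (ret)
		lock_destroy_action(l); /* undo immediately on failure */
	return ret;
}
```

Because the destroy call is registered at init time, the driver's teardown path cannot forget it.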

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221108175425.39819-2-mcanal@igalia.com


# 4a83c26a 01-Aug-2022 Danilo Krummrich <dakr@redhat.com>

drm/gem: rename GEM CMA helpers to GEM DMA helpers

Rename "GEM CMA" helpers to "GEM DMA" helpers - considering the
hierarchy of APIs (mm/cma -> dma -> gem dma) calling them "GEM
DMA" seems to be more applicable.

Besides that, commit e57924d4ae80 ("drm/doc: Task to rename CMA helpers")
requests to rename the CMA helpers and implies that people seem to be
confused about the naming.

In order to do this renaming the following script was used:

```
#!/bin/bash

DIRS="drivers/gpu include/drm Documentation/gpu"

REGEX_SYM_UPPER="[0-9A-Z_\-]"
REGEX_SYM_LOWER="[0-9a-z_\-]"

REGEX_GREP_UPPER="(${REGEX_SYM_UPPER}*)(GEM)_CMA_(${REGEX_SYM_UPPER}*)"
REGEX_GREP_LOWER="(${REGEX_SYM_LOWER}*)(gem)_cma_(${REGEX_SYM_LOWER}*)"

REGEX_SED_UPPER="s/${REGEX_GREP_UPPER}/\1\2_DMA_\3/g"
REGEX_SED_LOWER="s/${REGEX_GREP_LOWER}/\1\2_dma_\3/g"

# Find all upper case 'CMA' symbols and replace them with 'DMA'.
for ff in $(grep -REHl "${REGEX_GREP_UPPER}" $DIRS)
do
	sed -i -E "$REGEX_SED_UPPER" $ff
done

# Find all lower case 'cma' symbols and replace them with 'dma'.
for ff in $(grep -REHl "${REGEX_GREP_LOWER}" $DIRS)
do
	sed -i -E "$REGEX_SED_LOWER" $ff
done

# Replace all occurrences of 'CMA' / 'cma' in comments and
# documentation files with 'DMA' / 'dma'.
for ff in $(grep -RiHl " cma " $DIRS)
do
	sed -i -E "s/ cma / dma /g" $ff
	sed -i -E "s/ CMA / DMA /g" $ff
done

# Rename all 'cma_obj's to 'dma_obj'.
for ff in $(grep -RiHl "cma_obj" $DIRS)
do
	sed -i -E "s/cma_obj/dma_obj/g" $ff
done
```

Only a few more manual modifications were needed, e.g. reverting the
following modifications in some DRM Kconfig files

- select CMA if HAVE_DMA_CONTIGUOUS
+ select DMA if HAVE_DMA_CONTIGUOUS

as well as manually picking the occurrences of 'CMA'/'cma' in comments and
documentation which relate to "GEM CMA", but not "FB CMA".

Also drivers/gpu/drm/Makefile was fixed up manually after renaming
drm_gem_cma_helper.c to drm_gem_dma_helper.c.

This patch is compile-time tested building a x86_64 kernel with
`make allyesconfig && make drivers/gpu/drm`.

Acked-by: Sam Ravnborg <sam@ravnborg.org>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> #drivers/gpu/drm/arm
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20220802000405.949236-4-dakr@redhat.com


# 90a64adb 03-Jun-2022 Peter Robinson <pbrobinson@gmail.com>

drm/v3d: Get rid of pm code

Runtime PM doesn't seem to work correctly on this driver. On top of
that, commit 8b6864e3e138 ("drm/v3d/v3d_drv: Remove unused static
variable 'v3d_v3d_pm_ops'") hints that it most likely never did as the
driver's PM ops were not hooked-up.

So, in order to support regular operation with V3D on BCM2711 (Raspberry
Pi 4), get rid of the PM code. PM will be reinstated once we figure out
the underlying issues.

Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Signed-off-by: Peter Robinson <pbrobinson@gmail.com>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Acked-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220603092610.1909675-3-pbrobinson@gmail.com


# 73511edf 09-Nov-2021 Christian König <christian.koenig@amd.com>

dma-buf: specify usage while adding fences to dma_resv obj v7

Instead of distinguishing between shared and exclusive fences, specify
the fence usage while adding fences.

Rework all drivers to use this interface instead and deprecate the old one.

v2: some kerneldoc comments suggested by Daniel
v3: fix a missing case in radeon
v4: rebase on nouveau changes, fix lockdep and temporarily disable warning
v5: more documentation updates
v6: separate internal dma_resv changes from this patch, which avoids
temporarily disabling the warning, rebase on upstream changes
v7: fix missed case in lima driver, minimize changes to i915_gem_busy_ioctl

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20220407085946.744568-3-christian.koenig@amd.com


# c8d4c18b 16-Nov-2021 Christian König <christian.koenig@amd.com>

dma-buf/drivers: make reserving a shared slot mandatory v4

Audit all the users of dma_resv_add_excl_fence() and make sure they
reserve a shared slot also when only trying to add an exclusive fence.

This is the next step towards handling the exclusive fence like a
shared one.

v2: fix missed case in amdgpu
v3: and two more radeon, rename function
v4: add one more case to TTM, fix i915 after rebase

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20220406075132.3263-2-christian.koenig@amd.com


# e57c1a3b 28-Jan-2022 Yongzhi Liu <lyz_cs@pku.edu.cn>

drm/v3d: fix missing unlock

[why]
Unlock is needed on the error handling path to prevent deadlock.
v3d_submit_cl_ioctl and v3d_submit_csd_ioctl are missing the unlock.

[how]
Fix this by changing the goto target on the error handling path to one
that includes drm_gem_unlock_reservations().
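
The pattern behind this fix can be shown with a small userspace model. The lock, the function and the label names are hypothetical; only the shape of the goto-based error handling is the point.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical lock model: counts how often it is held. */
struct resv_model { int held; };

static void resv_lock(struct resv_model *r)   { r->held++; }
static void resv_unlock(struct resv_model *r) { r->held--; }

/*
 * Sketch of a submit path. Any failure after resv_lock() must jump to
 * a label that unlocks; jumping straight to `fail` would leak the lock.
 */
static int submit_model(struct resv_model *r, bool lock_ok, bool perfmon_ok)
{
	int ret = 0;

	if (!lock_ok) {
		ret = -EINVAL;       /* nothing locked yet */
		goto fail;
	}
	resv_lock(r);

	if (!perfmon_ok) {
		ret = -ENOENT;
		goto fail_unreserve; /* lock is held: must unlock */
	}

	/* ... push the job to the scheduler here ... */

	/* The success path also falls through and releases the lock. */
fail_unreserve:
	resv_unlock(r);
fail:
	return ret;
}
```

Ordering the labels so that later failures unwind more state keeps every exit path balanced.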

Signed-off-by: Yongzhi Liu <lyz_cs@pku.edu.cn>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1643377262-109975-1-git-send-email-lyz_cs@pku.edu.cn


# 75ad021f 15-Oct-2021 Yang Li <yang.lee@linux.alibaba.com>

drm/v3d: nullify pointer se with a NULL

Currently a plain integer is being used to nullify the pointer
struct v3d_submit_ext *se. Use NULL instead. Cleans up sparse
warnings:
drivers/gpu/drm/v3d/v3d_gem.c:777:53: warning: Using plain integer as
NULL pointer
drivers/gpu/drm/v3d/v3d_gem.c:1010:45: warning: Using plain integer as
NULL pointer

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1634282081-72255-1-git-send-email-yang.lee@linux.alibaba.com


# ee30840b 11-Oct-2021 Dan Carpenter <dan.carpenter@oracle.com>

drm/v3d: fix copy_from_user() error codes

The copy_to/from_user() functions return the number of bytes remaining
to be copied, but we want to return -EFAULT on error.
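
A short userspace sketch of the distinction, assuming a stand-in copy routine with the same "bytes not copied" return convention. `copy_model()` and `get_args()` are hypothetical names; the real functions only exist in the kernel.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical userspace stand-in for copy_from_user(): copies up to
 * `readable` bytes and returns the number of bytes NOT copied.
 */
static size_t copy_model(void *dst, const void *src, size_t n, size_t readable)
{
	size_t ok = n < readable ? n : readable;

	memcpy(dst, src, ok);
	return n - ok;
}

/* Translate "bytes remaining" into the error code callers expect. */
static int get_args(void *dst, const void *src, size_t n, size_t readable)
{
	if (copy_model(dst, src, n, readable))
		return -EFAULT; /* not the raw remaining-byte count */
	return 0;
}
```

Returning the raw count would hand the caller a positive number, which ioctl callers would not treat as an error.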

Fixes: e4165ae8304e ("drm/v3d: add multiple syncobjs support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211011123303.GA14314@kili


# e4165ae8 30-Sep-2021 Melissa Wen <mwen@igalia.com>

drm/v3d: add multiple syncobjs support

Using the generic extension from the previous patch, a specific multisync
extension enables more than one in/out binary syncobj per job submission.
Arrays of syncobjs are set in struct drm_v3d_multisync, which also takes
care of determining the stage for sync (wait deps) according to the job
queue.

v2:
- subclass the generic extension struct (Daniel)
- simplify adding dependency conditions to make understandable (Iago)

v3:
- fix conditions to consider single or multiples in/out_syncs (Iago)
- remove irrelevant comment (Iago)

Signed-off-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ffd8b2e3dd2e0c686db441a0c0a4a0181ff85328.1633016479.git.mwen@igalia.com


# bb3425ef 30-Sep-2021 Melissa Wen <mwen@igalia.com>

drm/v3d: add generic ioctl extension

Add support to attach generic extensions on job submission. This patch
is the third piece of prep work to enable multiple syncobjs on job
submission. With this work, when the job submission interface needs to be
extended to accommodate a new feature, we will use a generic extension
struct where an id determines the data type to be pointed to. The first
application is to enable multiple in/out syncobjs (next patch), but the base is
already done for future features. Therefore, to attach a new feature,
a specific extension struct should subclass drm_v3d_extension and
update the list of extensions in a job submission.
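
The subclassing scheme can be sketched as a linked chain of headers. This is a simplified userspace model with hypothetical names and plain C pointers; it does not reproduce the layout of struct drm_v3d_extension or the multisync struct.

```c
#include <stddef.h>
#include <stdint.h>

#define EXT_ID_MULTI_SYNC 0x01 /* illustrative id value */

/* Generic header every extension starts with (hypothetical model). */
struct ext_base {
	struct ext_base *next; /* next extension in the chain, or NULL */
	uint32_t id;           /* selects the concrete extension type */
};

/* A specific extension "subclasses" the header by embedding it first. */
struct ext_multi_sync {
	struct ext_base base;
	uint32_t in_sync_count;
	uint32_t out_sync_count;
};

/* Walk the chain and return the first extension with the given id. */
static struct ext_base *ext_find(struct ext_base *head, uint32_t id)
{
	for (struct ext_base *e = head; e; e = e->next)
		if (e->id == id)
			return e;
	return NULL;
}
```

Once the id has been checked, the header pointer can be cast to the concrete type, so new features only need a new id and a new struct.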

v2:
- remove redundant elements to subclass struct (Daniel)

v3:
- add comment for v3d_get_extensions

Signed-off-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ed53b1cd7e3125b76f18fe3fb995a04393639bc6.1633016479.git.mwen@igalia.com


# 07c2a416 30-Sep-2021 Melissa Wen <mwen@igalia.com>

drm/v3d: alloc and init job in one shot

Move job memory allocation to the v3d_job_init function. This aims to
facilitate error handling in job initialization, since cleanup steps are
similar for all (struct v3d_job)-based types of job involved in a command
submission. To generalize v3d_job_init(), this change relies on the fact
that all job structs either have a struct v3d_job as their first element
(bin, render, tfu, csd) or are a v3d_job themselves (clean_job), which
allows pointer casting.
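
A minimal userspace sketch of this "first member" trick follows. The structs and `job_alloc_init()` are hypothetical and much smaller than the driver's; the sketch also returns the pointer directly, which is a simplification chosen here rather than the driver's calling convention.

```c
#include <stdlib.h>

/* Hypothetical base job; every job type embeds it as its first member. */
struct base_job {
	int refcount;
	void (*free)(struct base_job *job);
};

struct render_job {
	struct base_job base; /* must stay first for the cast below */
	unsigned int start, end;
};

static void base_job_free(struct base_job *job)
{
	free(job);
}

/*
 * Allocate `size` bytes for any job type and initialise the embedded
 * base in one step. Returns NULL on failure, so callers have a single
 * failure mode to handle.
 */
static void *job_alloc_init(size_t size)
{
	struct base_job *job;

	if (size < sizeof(*job))
		return NULL;

	job = calloc(1, size);
	if (!job)
		return NULL;

	job->refcount = 1;
	job->free = base_job_free;
	return job; /* valid for any type whose first member is base_job */
}
```

Because the base sits at offset zero, a pointer to the container and a pointer to its base are the same address.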

v3:
- explicitly init job as NULL (Iago)
- fix pm failure handling on v3d_job_init (Iago)

Suggested-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/4d12e07bd334d2cddb51cabd359e99edde595619.1633016479.git.mwen@igalia.com


# 223583dd 30-Sep-2021 Melissa Wen <mwen@igalia.com>

drm/v3d: decouple adding job dependencies steps from job init

Prep work to enable a job to wait for more than one syncobj before
starting. Also get rid of old checkpatch warnings in the v3d_gem file.
No functional changes.

Signed-off-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/825f5fdd540b0aa2eb57bd5ff93c0777808b108c.1633016479.git.mwen@igalia.com


# 9fcb4a8f 16-Sep-2021 Melissa Wen <mwen@igalia.com>

drm/v3d: fix sched job resources cleanup when a job is aborted

In a cl submission, when bin job initialization fails, sched job resources
were already allocated for the render job. At this point,
drm_sched_job_init(render) was done in v3d_job_init but the render job is
aborted before drm_sched_job_arm (in v3d_job_push) happens; therefore, not
only v3d_job_put but also drm_sched_job_cleanup should be called (by
v3d_job_cleanup). A similar issue is addressed for csd and tfu submissions.

The issue was noticed from a review by Iago Toral in a patch that touches
the same part of the code.

Fixes: 916044fac8623 ("drm/v3d: Move drm_sched_job_init to v3d_job_init")
Signed-off-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210916212726.2u2psq2egwy2mdva@mail.igalia.com


# e4f86819 14-Sep-2021 Iago Toral Quiroga <itoral@igalia.com>

drm/v3d: fix wait for TMU write combiner flush

The hardware sets the TMUWCF bit back to 0 when the TMU write
combiner flush completes so we should be checking for that instead
of the L2TFLS bit.
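
The wait described above amounts to setting a bit and polling until the hardware clears that same bit. The sketch below models this in userspace; the register model, the bit position and the poll limit are all hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define FLUSH_BIT (1u << 0) /* illustrative bit position, not the real TMUWCF */

/* Hypothetical register model: the bit clears itself after some reads. */
struct reg_model {
	uint32_t value;
	unsigned int reads_until_clear; /* 0 means the bit never clears */
};

static uint32_t reg_read(struct reg_model *r)
{
	if (r->reads_until_clear > 0 && --r->reads_until_clear == 0)
		r->value &= ~FLUSH_BIT;
	return r->value;
}

/* Request a flush, then poll until the hardware clears the same bit. */
static bool flush_and_wait(struct reg_model *r, unsigned int max_polls)
{
	r->value |= FLUSH_BIT; /* writing 1 starts the flush */

	for (unsigned int i = 0; i < max_polls; i++)
		if (!(reg_read(r) & FLUSH_BIT))
			return true; /* flush completed */
	return false; /* timed out */
}
```

Polling an unrelated bit could report completion while the flush is still in progress, which is the kind of bug the commit fixes.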

v2 (Melissa Wen):
- Add Signed-off-by and Fixes tags.
- Change the error message for the timeout to be more clear.

Fixes spurious Vulkan CTS failures in:
dEQP-VK.binding_model.descriptorset_random.*

Fixes: d223f98f02099 ("drm/v3d: Add support for compute shader dispatch.")
Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210915100507.3945-1-itoral@igalia.com


# da3208e8 04-Aug-2021 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/v3d: Use scheduler dependency handling

With the prep work out of the way this isn't tricky anymore.

Aside: The chaining of the various jobs is a bit awkward, with the
possibility of failure in bad places. I think with the
drm_sched_job_init/arm split and maybe preloading the
job->dependencies xarray this should be fixable.

v2: Rebase over renamed function names for adding dependencies.

Reviewed-by: Melissa Wen <mwen@igalia.com> (v1)
Acked-by: Emma Anholt <emma@anholt.net>
Cc: Melissa Wen <melissa.srw@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Emma Anholt <emma@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20210805104705.862416-11-daniel.vetter@ffwll.ch


# 916044fa 04-Aug-2021 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/v3d: Move drm_sched_job_init to v3d_job_init

Prep work for using the scheduler dependency handling. We need to call
drm_sched_job_init earlier so we can use the new drm_sched_job_await*
functions for dependency handling here.

v2: Slightly better commit message and rebase to include the
drm_sched_job_arm() call (Emma).

v3: Cleanup jobs under construction correctly (Emma)

v4: Rebase over perfmon patch

Reviewed-by: Melissa Wen <mwen@igalia.com> (v3)
Acked-by: Emma Anholt <emma@anholt.net>
Cc: Melissa Wen <melissa.srw@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Emma Anholt <emma@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20210805104705.862416-10-daniel.vetter@ffwll.ch


# 0e10e9a1 04-Aug-2021 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/sched: drop entity parameter from drm_sched_push_job

Originally a job was only bound to the queue when we pushed this, but
now that's done in drm_sched_job_init, making that parameter entirely
redundant.

Remove it.

The same applies to the context parameter in
lima_sched_context_queue_task, simplify that too.

v2:
Rebase on top of msm adopting drm/sched

Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Emma Anholt <emma@anholt.net>
Acked-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Steven Price <steven.price@arm.com> (v1)
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Russell King <linux+etnaviv@armlinux.org.uk>
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: Qiang Yu <yuq825@gmail.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Cc: Emma Anholt <emma@anholt.net>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Nirmoy Das <nirmoy.das@amd.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Chen Li <chenli@uniontech.com>
Cc: Lee Jones <lee.jones@linaro.org>
Cc: Deepak R Varma <mh12gx2825@gmail.com>
Cc: Kevin Wang <kevin1.wang@amd.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: "Marek Olšák" <marek.olsak@amd.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Cc: Dennis Li <Dennis.Li@amd.com>
Cc: Boris Brezillon <boris.brezillon@collabora.com>
Cc: etnaviv@lists.freedesktop.org
Cc: lima@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: Rob Clark <robdclark@gmail.com>
Cc: Sean Paul <sean@poorly.run>
Cc: Melissa Wen <mwen@igalia.com>
Cc: linux-arm-msm@vger.kernel.org
Cc: freedreno@lists.freedesktop.org
Link: https://patchwork.freedesktop.org/patch/msgid/20210805104705.862416-6-daniel.vetter@ffwll.ch


# dbe48d03 17-Aug-2021 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/sched: Split drm_sched_job_init

This is a very confusingly named function, because not just does it
init an object, it arms it and provides a point of no return for
pushing a job into the scheduler. It would be nice if that's a bit
clearer in the interface.

But the real reason is that I want to push the dependency tracking
helpers into the scheduler code, and that means drm_sched_job_init
must be called a lot earlier, without arming the job.

v2:
- don't change .gitignore (Steven)
- don't forget v3d (Emma)

v3: Emma noticed that I leak the memory allocated in
drm_sched_job_init if we bail out before the point of no return in
subsequent driver patches. To be able to fix this change
drm_sched_job_cleanup() so it can handle being called both before and
after drm_sched_job_arm().

Also improve the kerneldoc for this.

v4:
- Fix the drm_sched_job_cleanup logic, I inverted the booleans, as
usual (Melissa)

- Christian pointed out that drm_sched_entity_select_rq() also needs
to be moved into drm_sched_job_arm, which made me realize that the
job->id definitely needs to be moved too.

Shuffle things to fit between job_init and job_arm.

v5:
Reshuffle the split between init/arm once more, amdgpu abuses
drm_sched.ready to signal gpu reset failures. Also document this
somewhat. (Christian)

v6:
Rebase on top of the msm drm/sched support. Note that the
drm_sched_job_init() call is completely misplaced, and hence also the
split-out drm_sched_entity_push_job(). I've put in a FIXME which the next
patch will address.

v7: Drop the FIXME in msm, after discussions with Rob I agree it shouldn't
be a problem where it is now.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Melissa Wen <mwen@igalia.com>
Cc: Melissa Wen <melissa.srw@gmail.com>
Acked-by: Emma Anholt <emma@anholt.net>
Acked-by: Steven Price <steven.price@arm.com> (v2)
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> (v5)
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Russell King <linux+etnaviv@armlinux.org.uk>
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: Qiang Yu <yuq825@gmail.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Adam Borowski <kilobyte@angband.pl>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: Paul Menzel <pmenzel@molgen.mpg.de>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Nirmoy Das <nirmoy.das@amd.com>
Cc: Deepak R Varma <mh12gx2825@gmail.com>
Cc: Lee Jones <lee.jones@linaro.org>
Cc: Kevin Wang <kevin1.wang@amd.com>
Cc: Chen Li <chenli@uniontech.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: "Marek Olšák" <marek.olsak@amd.com>
Cc: Dennis Li <Dennis.Li@amd.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Cc: Sonny Jiang <sonny.jiang@amd.com>
Cc: Boris Brezillon <boris.brezillon@collabora.com>
Cc: Tian Tao <tiantao6@hisilicon.com>
Cc: etnaviv@lists.freedesktop.org
Cc: lima@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: Emma Anholt <emma@anholt.net>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Sean Paul <sean@poorly.run>
Cc: linux-arm-msm@vger.kernel.org
Cc: freedreno@lists.freedesktop.org
Link: https://patchwork.freedesktop.org/patch/msgid/20210817084917.3555822-1-daniel.vetter@ffwll.ch


# 26a4dc29 08-Jun-2021 Juan A. Suarez Romero <jasuarez@igalia.com>

drm/v3d: Expose performance counters to userspace

The V3D engine has several hardware performance counters that can be of
interest for userspace performance analysis tools.

This exposes new ioctls to create and destroy performance monitor
objects, as well as to query the counter values.

Each created performance monitor object has an ID that can be attached
to CL/CSD submissions, so the driver enables the requested counters when
the job is submitted, and updates the performance monitor values when
the job is done.

It is up to the user to ensure all the jobs have been finished before
getting the performance monitor values. It is also up to the user to
properly synchronize BCL jobs when submitting jobs with different
performance monitors attached.

Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@linux.ie>
Cc: Emma Anholt <emma@anholt.net>
To: dri-devel@lists.freedesktop.org
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Acked-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210608111541.461991-1-jasuarez@igalia.com


# e226878e 16-Nov-2020 Lee Jones <lee.jones@linaro.org>

drm/v3d/v3d_gem: Provide descriptions for 'v3d_lookup_bos's params

Fixes the following W=1 kernel build warning(s):

drivers/gpu/drm/v3d/v3d_gem.c:292: warning: Function parameter or member 'bo_handles' not described in 'v3d_lookup_bos'
drivers/gpu/drm/v3d/v3d_gem.c:292: warning: Function parameter or member 'bo_count' not described in 'v3d_lookup_bos'

Cc: Eric Anholt <eric@anholt.net>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Philipp Zabel <p.zabel@pengutronix.de>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: dri-devel@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20201116174112.1833368-36-lee.jones@linaro.org


# 897dbea6 25-Oct-2020 Dan Carpenter <dan.carpenter@oracle.com>

drm/v3d: Fix double free in v3d_submit_cl_ioctl()

Originally this error path used to leak "bin" but then we accidentally
applied two separate commits to fix it and ended up with a double free.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20201026094905.GA1634423@mwanda


# 2b86189e 15-May-2020 Emil Velikov <emil.velikov@collabora.com>

drm/v3d: remove _unlocked suffix in drm_gem_object_put_unlocked

Spelling out _unlocked for each and every driver is annoying,
especially if we consider how many drivers do not know (or need to know)
about the horror stories involving struct_mutex.

Just drop the suffix. It makes the API cleaner.

Done via the following script:

__from=drm_gem_object_put_unlocked
__to=drm_gem_object_put
for __file in $(git grep --name-only $__from); do
	sed -i "s/$__from/$__to/g" $__file;
done

Cc: Eric Anholt <eric@anholt.net>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20200515095118.2743122-33-emil.l.velikov@gmail.com


# bc662528 15-Apr-2020 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/v3d: Delete v3d_dev->dev

We already have it in v3d_dev->drm.dev with zero additional pointer
chasing. Personally I don't like duplicated pointers like this
because:
- reviewers need to check whether the pointer is for the same or
different objects if there are multiple
- compilers have an easier time too

But this is also a bit of a bikeshed, so feel free to ignore.

Acked-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20200415074034.175360-10-daniel.vetter@ffwll.ch


# 29cd13cf 21-Oct-2019 Navid Emamdoost <navid.emamdoost@gmail.com>

drm/v3d: Fix memory leak in v3d_submit_cl_ioctl

In the implementation of v3d_submit_cl_ioctl() there are two memory
leaks. One is when allocation for bin fails, and the other is when bin
initialization fails. If kcalloc fails to allocate memory for bin then
render->base should be put. Also, if v3d_job_init() fails to initialize
bin->base then allocated memory for bin should be released.

Fixes: a783a09ee76d ("drm/v3d: Refactor job management.")
Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20191021185250.26130-1-navid.emamdoost@gmail.com


# 455d56ce 19-Sep-2019 Iago Toral Quiroga <itoral@igalia.com>

drm/v3d: clean caches at the end of render jobs on request from user space

Extends the user space ioctl for CL submissions so it can include a request
to flush the cache once the CL execution has completed. Fixes memory
write violation messages reported by the kernel in workloads involving
shader memory writes (SSBOs, shader images, scratch, etc) which sometimes
also lead to GPU resets during Piglit and CTS workloads.

v2: if v3d_job_init() fails we need to kfree() the job instead of
v3d_job_put() it (Eric Anholt).

v3 (Eric Anholt):
- Drop _FLAG suffix from the new flag name.
- Add a new param so userspace can tell whether cache flushing is
implemented in the kernel.

Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20190919071016.4578-1-itoral@igalia.com


# 0d352a3a 16-Sep-2019 Iago Toral Quiroga <itoral@igalia.com>

drm/v3d: don't leak bin job if v3d_job_init fails.

If the initialization of the job fails we need to kfree() it
before returning.

Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20190916071125.5255-1-itoral@igalia.com
Fixes: a783a09ee76d ("drm/v3d: Refactor job management.")
Reviewed-by: Eric Anholt <eric@anholt.net>
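
The error path described in this commit can be sketched in user-space C. This is only an illustration under stated assumptions: `struct job`, `job_init()`, `job_create()`, `job_put()` and the `live_jobs` counter are hypothetical stand-ins, not the driver's real types or functions, and `free()` plays the role of kfree().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a driver job; not the real struct v3d_job. */
struct job {
	int refcount;
};

/* Counts allocations that have not been freed yet. */
static int live_jobs;

/* Hypothetical init: returns 0 on success, a negative value on failure. */
static int job_init(struct job *job, int should_fail)
{
	if (should_fail)
		return -1;
	job->refcount = 1;
	return 0;
}

/* Drop one reference and free the job when the last one goes away. */
static void job_put(struct job *job)
{
	if (--job->refcount == 0) {
		free(job);
		live_jobs--;
	}
}

/* Allocate and initialize a job; free it directly if init fails. */
static struct job *job_create(int should_fail)
{
	struct job *job = calloc(1, sizeof(*job));

	if (!job)
		return NULL;
	live_jobs++;

	if (job_init(job, should_fail) < 0) {
		/* Init failed, so there is no reference to put: free it. */
		free(job);
		live_jobs--;
		return NULL;
	}
	return job;
}
```

Calling `job_create(1)` returns NULL and leaves `live_jobs` at 0, which is the "no leak" property the fix is after; a successfully created job is released with `job_put()`.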


# 52791eee 11-Aug-2019 Christian König <christian.koenig@amd.com>

dma-buf: rename reservation_object to dma_resv

Be more consistent with the naming of the other DMA-buf objects.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/323401/


# 220989e7 16-Jul-2019 Sam Ravnborg <sam@ravnborg.org>

drm/v3d: drop use of drmP.h

Drop use of the deprecated drmP.h header file.
Made v3d_drv.h self-contained with only sufficient
include files.
Fixed fallout in remaining files.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20190716064220.18157-3-sam@ravnborg.org


# 1ba9d7cb 18-Apr-2019 Eric Anholt <eric@anholt.net>

drm/v3d: Dump V3D error debug registers in debugfs, and one at reset.

Looking at a hang recently, I noticed these registers that might tell
me if something obvious was wrong. They didn't help in this case, but
keep them around for the future.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20190419001014.23579-3-eric@anholt.net
Reviewed-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>


# dffa9b7a 16-Apr-2019 Eric Anholt <eric@anholt.net>

drm/v3d: Add missing implicit synchronization.

It is the expectation of existing userspace (X11 + Mesa, in
particular) that jobs submitted to the kernel against a shared BO will
get implicitly synchronized by their submission order. If we want to
allow clever userspace to disable implicit synchronization, we should
do that under its own submit flag (as amdgpu and lima do).

Note that we currently only implicitly sync for the rendering pass,
not binning -- if you texture-from-pixmap in the binning vertex shader
(vertex coordinate generation), you'll miss out on synchronization.

Fixes flickering when multiple clients are running in parallel,
particularly GL apps and compositors.

v2: Fix a missing refcount on the CSD done fence for L2 cleaning.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20190416225856.20264-6-eric@anholt.net
Acked-by: Rob Clark <robdclark@gmail.com>


# 07fbbd66 16-Apr-2019 Eric Anholt <eric@anholt.net>

drm/v3d: Drop reservation of a shared slot in the dma-buf reservations.

We only set the excl (possible-writing) fence pointer and never add a
shared (read-only) fence.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20190416225856.20264-5-eric@anholt.net
Acked-by: Rob Clark <robdclark@gmail.com>


# d223f98f 16-Apr-2019 Eric Anholt <eric@anholt.net>

drm/v3d: Add support for compute shader dispatch.

The compute shader dispatch interface is pretty simple -- just pass in
the regs that userspace has passed us, with no CLs to run. However,
with no CL to run it means that we need to do manual cache flushing of
the L2 after the HW execution completes (for SSBO, atomic, and
image_load_store writes that are the output of compute shaders).

This doesn't yet expose the L2 cache's ability to have a region of the
address space not write back to memory (which could be used for
shared_var storage).

So far, the Mesa side has been tested on V3D v4.2 simpenrose (passing
the ES31 tests), and on the kernel side on 7278 (failing atomic
compswap tests in a way that doesn't reproduce on simpenrose).

v2: Fix excessive allocation for the clean_job (reported by Dan
Carpenter). Keep refs on jobs until clean_job is finished, to
avoid spurious MMU errors if the output BOs are freed by userspace
before L2 cleaning is finished.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20190416225856.20264-4-eric@anholt.net
Acked-by: Rob Clark <robdclark@gmail.com>


# a783a09e 16-Apr-2019 Eric Anholt <eric@anholt.net>

drm/v3d: Refactor job management.

The CL submission had two jobs embedded in an exec struct. When I
added TFU support, I had to replicate some of the exec stuff and some
of the job stuff. As I went to add CSD, it became clear that actually
what was in exec should just be in the two CL jobs, and it would let
us share a lot more code between the 4 queues.

v2: Fix missing error path in TFU ioctl's bo[] allocation.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20190416225856.20264-3-eric@anholt.net
Acked-by: Rob Clark <robdclark@gmail.com>


# d4c3022a 16-Apr-2019 Eric Anholt <eric@anholt.net>

drm/v3d: Switch the type of job->bo[] to reduce casting.

All consumers wanted drm_gem_object * now.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20190416225856.20264-2-eric@anholt.net
Acked-by: Rob Clark <robdclark@gmail.com>


# 3f0b646e 13-Mar-2019 Eric Anholt <eric@anholt.net>

drm/v3d: Rename the fence signaled from IRQs to "irq_fence".

We have another thing called the "done fence" that tracks when the
scheduler considers the job done, and having the shared name was
confusing.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20190313235211.28995-2-eric@anholt.net
Reviewed-by: Dave Emett <david.emett@broadcom.com>


# 40609d48 14-Mar-2019 Eric Anholt <eric@anholt.net>

drm/v3d: Use the new shmem helpers to reduce driver boilerplate.

The new shmem helpers from Noralf and Rob abstract out a bunch of our
BO creation and mapping code.

v2: Use the new sgt getter, and flag pages as dirty before freeing.
v3: Remove the mismatched put_pages.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20190314163451.13431-1-eric@anholt.net
Reviewed-by: Rob Herring <robh@kernel.org> (v2)


# c2b3e61a 08-Mar-2019 Eric Anholt <eric@anholt.net>

drm/v3d: Use drm_gem_lock_reservations()/drm_gem_unlock_reservations()

Now that we have core helpers, this gets rid of a lot of boilerplate.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20190308161716.2466-3-eric@anholt.net
Acked-by: Rob Herring <robh@kernel.org>


# eea9b97b 08-Mar-2019 Eric Anholt <eric@anholt.net>

drm/v3d: Add support for V3D v4.2.

No compatible string for it yet, just the version-dependent changes.
They've now tied the hub and the core interrupt lines into a single
interrupt line coming out of the block. It also turns out I made a
mistake in modeling the V3D v3.3 and v4.1 bridge as a part of V3D
itself -- the bridge is going away in favor of an external reset
controller in a larger HW module.

v2: Use consistent checks for whether we're on 4.2, and fix a leak in
an error path.
v3: Use more general means of determining if the current 4.2 changes
are in place, as apparently other platforms may switch back (noted
by Dave). Update the binding doc.
v4: Improve error handling for IRQ init.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20190308174336.7866-2-eric@anholt.net
Reviewed-by: Dave Emett <david.emett@broadcom.com>


# a7dde1b7 20-Feb-2019 Eric Anholt <eric@anholt.net>

drm/v3d: Don't try to set OVRTMUOUT on V3D 4.x.

The old field is gone and the register now has a different field,
QRMAXCNT for how many TMU requests get serviced before thread switch.
We were accidentally reducing it from its default of 0x3 (4 requests)
to 0x0 (1).

v2: Skip setting the reg at all on 4.x, instead of trying to update
only the old field.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20190220233658.986-2-eric@anholt.net
Reviewed-by: Dave Emett <david.emett@broadcom.com>


# 8d668309 02-Feb-2019 Rob Herring <robh@kernel.org>

drm: v3d: Switch to use drm_gem_object reservation_object

Now that the base struct drm_gem_object has a reservation_object, use it
and remove the private BO one.

Cc: Eric Anholt <eric@anholt.net>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@linux.ie>
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Rob Herring <robh@kernel.org>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20190202154158.10443-5-robh@kernel.org
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>


# aa5beec3 03-Dec-2018 Eric Anholt <eric@anholt.net>

drm/v3d: Invalidate the caches from the outside in.

This would be a fairly obscure race, but let's make sure we don't ever
lose it.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20181203222438.25417-6-eric@anholt.net
Reviewed-by: Dave Emett <david.emett@broadcom.com>


# 7b9d2fe4 03-Dec-2018 Eric Anholt <eric@anholt.net>

drm/v3d: Stop trying to flush L2C on V3D 3.3+

This cache was replaced with the slice accessing the L2T in the newer
generations. Noted by Dave during review.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20181203222438.25417-5-eric@anholt.net
Reviewed-by: Dave Emett <david.emett@broadcom.com>


# 51c1b6f9 03-Dec-2018 Eric Anholt <eric@anholt.net>

drm/v3d: Drop the wait for L2T flush to complete.

According to Dave, once you've started an L2T flush, all L2T accesses
will be blocked until the flush completes. This fixes a consistent
3-4ms stall between the ioctl and running the job, and 3DMMES Taiji
goes from 27fps to 110fps.

v2: Leave a note about why we don't need to wait for completion.

Signed-off-by: Eric Anholt <eric@anholt.net>
Fixes: 57692c94dcbe ("drm/v3d: Introduce a new DRM driver for Broadcom V3D V3.x+")
Reviewed-by: Dave Emett <david.emett@broadcom.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181203222438.25417-4-eric@anholt.net


# 2e6dc3bd 03-Dec-2018 Eric Anholt <eric@anholt.net>

drm/v3d: Don't bother flushing L1TD at job start.

This is the write combiner for TMU writes. You're supposed to flush
that at job end if you had dirtied any cachelines. Flushing it at job
start then doesn't make any sense.

Signed-off-by: Eric Anholt <eric@anholt.net>
Fixes: 57692c94dcbe ("drm/v3d: Introduce a new DRM driver for Broadcom V3D V3.x+")
Reviewed-by: Dave Emett <david.emett@broadcom.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181203222438.25417-3-eric@anholt.net


# 2aa34fd5 03-Dec-2018 Eric Anholt <eric@anholt.net>

drm/v3d: Drop unused v3d_flush_caches().

Now that I've specified how the end-of-pipeline flushing should work,
we're never going to use this function.

Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Dave Emett <david.emett@broadcom.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181203222438.25417-2-eric@anholt.net


# 2312f984 05-Dec-2018 Christian König <christian.koenig@amd.com>

drm/v3d: fix broken build

I missed one case during the recent revert of the replace_fence
interface change.

Fixes: 0b258ed1a219 drm: revert "expand replace_fence to support timeline point v2"

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/266134/


# 0b258ed1 14-Nov-2018 Christian König <christian.koenig@amd.com>

drm: revert "expand replace_fence to support timeline point v2"

This reverts commit 9a09a42369a4a37a959c051d8e1a1f948c1529a4.

The whole interface isn't thought through. Since this function can't
fail we actually can't allocate an object to store the sync point.

Sorry, I should have taken the lead on this from the very beginning and
reviewed it more thoughtfully. Going to propose a new interface as a
follow up change.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Link: https://patchwork.freedesktop.org/patch/265580/


# 55a9b748 30-Nov-2018 Eric Anholt <eric@anholt.net>

drm/v3d: Add more tracepoints for V3D GPU rendering.

The core scheduler tells us when the job is pushed to the scheduler's
queue, and I had the job_run functions saying when they actually queue
the job to the hardware. By adding tracepoints for the very top of
the ioctls and the IRQs signaling job completion, "perf record -a -e
v3d:.\* -e gpu_scheduler:.\* <job>; perf script" gets you a pretty
decent timeline.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20181201005759.28093-5-eric@anholt.net
Reviewed-by: Dave Emett <david.emett@broadcom.com>


# e14a07fc 28-Nov-2018 Eric Anholt <eric@anholt.net>

drm/v3d: Drop the "dev" argument to lock/unlock of BO reservations.

They were unused, as Dave Emett noticed in TFU review.

Signed-off-by: Eric Anholt <eric@anholt.net>
Cc: Dave Emett <david.emett@broadcom.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181128230927.10951-2-eric@anholt.net
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 1584f16c 28-Nov-2018 Eric Anholt <eric@anholt.net>

drm/v3d: Add support for submitting jobs to the TFU.

The TFU can copy from raster, UIF, and SAND input images to UIF output
images, with optional mipmap generation. This will certainly be
useful for media EGL image input, but is also useful immediately for
mipmap generation without bogging the V3D core down.

For now we only run the queue 1 job deep, and don't have any hang
recovery (though I don't think we should need it, with TFU). Queuing
multiple jobs in the HW will require synchronizing the YUV coefficient
regs updates since they don't get FIFOed with the job.

v2: Change the ioctl to IOW instead of IOWR, always set COEF0, explain
why TFU is AUTH, clarify the syncing docs, drop the unused TFU
interrupt regs (you're expected to use the hub's), don't take
&bo->base for NULL bos.
v3: Fix a little whitespace alignment (noticed by checkpatch), rebase
on drm_sched_job_cleanup() changes.

Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Dave Emett <david.emett@broadcom.com> (v2)
Link: https://patchwork.freedesktop.org/patch/264607/


# 8f1cd826 08-Nov-2018 Eric Anholt <eric@anholt.net>

drm/v3d: Clean up the reservation object setup.

The extra to_v3d_bo() calls came from copying this from the vc4
driver, which stored the cma gem object in the structs.

v2: Fix an unused var warning

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20181108161654.19888-4-eric@anholt.net
Reviewed-by: Boris Brezillon <boris.brezillon@bootlin.com> (v1)


# ca05359f 19-Sep-2018 Christian König <christian.koenig@amd.com>

dma-buf: allow reserving more than one shared fence slot

Let's support simultaneous submissions to multiple engines.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Link: https://patchwork.kernel.org/patch/10626149/


# 649fdce2 15-Oct-2018 Chunming Zhou <david1.zhou@amd.com>

drm: add flags to drm_syncobj_find_fence

The flags can be used by the driver to decide whether it needs to block waiting for submission.

Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.kernel.org/patch/10641339/


# 34c2c4f6 28-Sep-2018 Eric Anholt <eric@anholt.net>

drm/v3d: Fix a use-after-free race accessing the scheduler's fences.

Once we push the job, the scheduler could run it and free it. So, if
we want to reference their fences, we need to grab them before then.
I haven't seen this happen in many days of conformance test runtime,
but let's still close the race.

Signed-off-by: Eric Anholt <eric@anholt.net>
Fixes: 57692c94dcbe ("drm/v3d: Introduce a new DRM driver for Broadcom V3D V3.x+")
Link: https://patchwork.freedesktop.org/patch/254119/
Reviewed-by: Boris Brezillon <boris.brezillon@bootlin.com>
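
The race closed by this commit can be sketched in user-space C. All names here (`struct fence`, `struct job`, `fence_get()`, `fence_put()`, `job_alloc()`, `push_job()`, `submit()`) are hypothetical stand-ins, not the scheduler's API, and `push_job()` deliberately models the worst case where the job runs and is freed before the push returns.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical refcounted fence; not the kernel's struct dma_fence. */
struct fence {
	int refcount;
	int signaled;
};

/* Hypothetical job that owns one reference to its "done" fence. */
struct job {
	struct fence *done;
};

static struct fence *fence_get(struct fence *f)
{
	f->refcount++;
	return f;
}

static void fence_put(struct fence *f)
{
	if (--f->refcount == 0)
		free(f);
}

static struct job *job_alloc(void)
{
	struct job *job = malloc(sizeof(*job));

	if (!job)
		return NULL;
	job->done = calloc(1, sizeof(*job->done));
	if (!job->done) {
		free(job);
		return NULL;
	}
	job->done->refcount = 1;
	return job;
}

/* Worst case: the job runs and is freed before this function returns. */
static void push_job(struct job *job)
{
	job->done->signaled = 1;
	fence_put(job->done);
	free(job);
}

/* Returns a fence reference owned by the caller. */
static struct fence *submit(struct job *job)
{
	/* Grab the fence before pushing; the job may be gone afterwards. */
	struct fence *done = fence_get(job->done);

	push_job(job);
	/* 'job' must not be dereferenced from here on. */
	return done;
}
```

Because the reference is taken before `push_job()`, the fence returned by `submit()` stays valid even though the job itself has already been freed; the caller drops it with `fence_put()` when done.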


# 9a09a423 30-Aug-2018 Chunming Zhou <david1.zhou@amd.com>

drm: expand replace_fence to support timeline point v2

After the expansion, we can place a fence at a timeline point.
v2: change func parameter order

Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/246543/


# 0a6730ea 30-Aug-2018 Chunming Zhou <david1.zhou@amd.com>

drm: expand drm_syncobj_find_fence to support timeline point v2

After the expansion, we can fetch the fence of a timeline point.
v2: The parameter fence is the result of the function and should come last.

Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/246541/


# cdc50176 20-Jul-2018 Nayan Deshmukh <nayan26deshmukh@gmail.com>

drm/scheduler: modify API to avoid redundancy

entity has a scheduler field and we don't need the sched argument
in any of the functions where entity is provided.

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 14d1d190 05-Jun-2018 Eric Anholt <eric@anholt.net>

drm/v3d: Remove the bad signaled() implementation.

Since our seqno value comes from a counter associated with the GPU
ring, not the entity (aka client), they'll be completed out of order.
There's actually no need for this code at all, since we don't have
enable_signaling() and thus DMA_FENCE_SIGNALED_BIT will be set before
we could be called.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20180605190302.18279-2-eric@anholt.net
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>


# 7122b68b 06-Jun-2018 Eric Anholt <eric@anholt.net>

drm/v3d: Take a lock across GPU scheduler job creation and queuing.

Between creation and queueing of a job, you need to prevent any other
job from being created and queued. Otherwise the scheduler's fences
may be signaled out of seqno order.

v2: move mutex unlock to the error label.

Signed-off-by: Eric Anholt <eric@anholt.net>
Fixes: 57692c94dcbe ("drm/v3d: Introduce a new DRM driver for Broadcom V3D V3.x+")
Link: https://patchwork.freedesktop.org/patch/msgid/20180606174851.12433-1-eric@anholt.net
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
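
The ordering requirement in this commit can be sketched in user-space C with a pthread mutex. This is an illustration only: `sched_lock`, `create_and_queue()`, `queue_is_ordered()` and the fixed-size `queue` are hypothetical names, not the driver's code. The point is that the seqno is taken (creation) and the job is queued under one lock, so no other submitter can slip in between the two steps.

```c
#include <assert.h>
#include <pthread.h>

#define QUEUE_LEN 16

static pthread_mutex_t sched_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned int next_seqno;        /* last seqno handed out */
static unsigned int queue[QUEUE_LEN];  /* seqnos in queue order */
static unsigned int queued;            /* number of queued jobs */

/*
 * Create a job (take a seqno) and queue it without dropping the lock
 * in between. Returns the seqno, or 0 if the queue is full.
 */
static unsigned int create_and_queue(void)
{
	unsigned int seqno = 0;

	pthread_mutex_lock(&sched_lock);
	if (queued < QUEUE_LEN) {
		seqno = ++next_seqno;     /* creation */
		queue[queued++] = seqno;  /* queuing, still under the lock */
	}
	pthread_mutex_unlock(&sched_lock);

	return seqno;
}

/* Returns 1 if the seqnos are strictly increasing in queue order. */
static int queue_is_ordered(void)
{
	unsigned int i;
	int ordered = 1;

	pthread_mutex_lock(&sched_lock);
	for (i = 1; i < queued; i++) {
		if (queue[i] <= queue[i - 1])
			ordered = 0;
	}
	pthread_mutex_unlock(&sched_lock);

	return ordered;
}
```

If the seqno were taken outside the lock, two submitters could take seqnos 1 and 2 and then queue them as 2, 1, which is the out-of-order signaling the commit message warns about.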


# 57692c94 30-Apr-2018 Eric Anholt <eric@anholt.net>

drm/v3d: Introduce a new DRM driver for Broadcom V3D V3.x+

This driver will be used to support Mesa on the Broadcom 7268 and 7278
platforms.

V3D 3.3 introduces an MMU, which means we no longer need CMA or vc4's
complicated CL/shader validation scheme. This massively changes the
GEM behavior, so I've forked off to a new driver.

v2: Mark SUBMIT_CL as needing DRM_AUTH. coccinelle fixes from kbuild
test robot. Drop personal git link from MAINTAINERS. Don't
double-map dma-buf imported BOs. Add kerneldoc about needing MMU
eviction. Drop prime vmap/unmap stubs. Delay mmap offset setup
to mmap time. Use drm_dev_init instead of _alloc. Use
ktime_get() for wait_bo timeouts. Drop drm_can_sleep() usage,
since we don't modeset. Switch page tables back to WC (debug
change to coherent had slipped in). Switch
drm_gem_object_unreference_unlocked() to
drm_gem_object_put_unlocked(). Simplify overflow mem handling by
not sharing overflow mem between jobs.
v3: no changes
v4: align submit_cl to 64 bits (review by airlied), check zero flags in
other ioctls.

Signed-off-by: Eric Anholt <eric@anholt.net>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v4)
Acked-by: Dave Airlie <airlied@linux.ie> (v3, requested submit_cl change)
Link: https://patchwork.freedesktop.org/patch/msgid/20180430181058.30181-3-eric@anholt.net