History log of /linux-master/drivers/gpu/drm/v3d/v3d_sched.c
Revision Date Author Comments
# 209e8d26 30-Nov-2023 Maíra Canal <mcanal@igalia.com>

drm/v3d: Create a CPU job extension for the copy performance query job

A CPU job is a type of job that performs operations that requires CPU
intervention. A copy performance query job is a job that copy the complete
or partial result of a query to a buffer. In order to copy the result of
a performance query to a buffer, we need to get the values from the
performance monitors.

So, create a user extension for the CPU job that enables the creation
of a copy performance query job. This user extension will allow the creation
of a CPU job that copy the results of a performance query to a BO with the
possibility to indicate the availability with a availability bit.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-19-mcanal@igalia.com


# bae7cb5d 30-Nov-2023 Maíra Canal <mcanal@igalia.com>

drm/v3d: Create a CPU job extension for the reset performance query job

A CPU job is a type of job that performs operations that requires CPU
intervention. A reset performance query job is a job that resets the
performance queries by resetting the values of the perfmons. Moreover,
we also reset the syncobjs related to the availability of the query.

So, create a user extension for the CPU job that enables the creation
of a reset performance job. This user extension will allow the creation of
a CPU job that resets the perfmons values and resets the availability syncobj.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-18-mcanal@igalia.com


# 6745f3e4 30-Nov-2023 Maíra Canal <mcanal@igalia.com>

drm/v3d: Create a CPU job extension to copy timestamp query to a buffer

A CPU job is a type of job that performs operations that requires CPU
intervention. A copy timestamp query job is a job that copy the complete
or partial result of a query to a buffer. As V3D doesn't provide any
mechanism to obtain a timestamp from the GPU, it is a job that needs
CPU intervention.

So, create a user extension for the CPU job that enables the creation
of a copy timestamp query job. This user extension will allow the creation
of a CPU job that copy the results of a timestamp query to a BO with the
possibility to indicate the timestamp availability with a availability bit.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-17-mcanal@igalia.com


# 34a101e6 30-Nov-2023 Maíra Canal <mcanal@igalia.com>

drm/v3d: Create a CPU job extension for the reset timestamp job

A CPU job is a type of job that performs operations that requires CPU
intervention. A reset timestamp job is a job that resets the timestamp
queries based on the value offset of the first query. As V3D doesn't
provide any mechanism to obtain a timestamp from the GPU, it is a job
that needs CPU intervention.

So, create a user extension for the CPU job that enables the creation
of a reset timestamp job. This user extension will allow the creation of
a CPU job that resets the timestamp value in the timestamp BO and resets
the availability syncobj.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-16-mcanal@igalia.com


# 9ba0ff3e 30-Nov-2023 Maíra Canal <mcanal@igalia.com>

drm/v3d: Create a CPU job extension for the timestamp query job

A CPU job is a type of job that performs operations that requires CPU
intervention. A timestamp query job is a job that calculates the
query timestamp and updates the query availability by signaling a
syncobj. As V3D doesn't provide any mechanism to obtain a timestamp
from the GPU, it is a job that needs CPU intervention.

So, create a user extension for the CPU job that enables the creation
of a timestamp query job. This user extension will allow the creation of
a CPU job that performs the timestamp query calculation and updates the
timestamp BO with the proper value.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-15-mcanal@igalia.com


# 18b8413b 30-Nov-2023 Maíra Canal <mcanal@igalia.com>

drm/v3d: Create a CPU job extension for a indirect CSD job

A CPU job is a type of job that performs operations that requires CPU
intervention. An indirect CSD job is a job that, when executed in the
queue, will map the indirect buffer, read the dispatch parameters, and
submit a regular dispatch. Therefore, it is a job that needs CPU
intervention.

So, create a user extension for the CPU job that enables the creation
of an indirect CSD. This user extension will allow the creation of a CSD
job linked to a CPU job. The CPU job will wait for the indirect CSD job
dependencies and, once they are signaled, it will update the CSD job
parameters.

Co-developed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-14-mcanal@igalia.com


# 1fe0879e 30-Nov-2023 Maíra Canal <mcanal@igalia.com>

drm/v3d: Create tracepoints to track the CPU job

Create tracepoints to track the three major events of a CPU job
lifetime:
1. Submission of a `v3d_submit_cpu` IOCTL
2. Beginning of the execution of a CPU job
3. Ending of the execution of a CPU job

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-11-mcanal@igalia.com


# aafc1a2b 30-Nov-2023 Melissa Wen <mwen@igalia.com>

drm/v3d: Add a CPU job submission

Create a new type of job, a CPU job. A CPU job is a type of job that
performs operations that requires CPU intervention. The overall idea is
to use user extensions to enable different types of CPU job, allowing the
CPU job to perform different operations according to the type of user
extension. The user extension ID identify the type of CPU job that must
be dealt.

Having a CPU job is interesting for synchronization purposes as a CPU
job has a queue like any other V3D job and can be synchoronized by the
multisync extension.

Signed-off-by: Melissa Wen <mwen@igalia.com>
Co-developed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-9-mcanal@igalia.com


# 509433d8 05-Sep-2023 Maíra Canal <mcanal@igalia.com>

drm/v3d: Expose the total GPU usage stats on sysfs

The previous patch exposed the accumulated amount of active time per
client for each V3D queue. But this doesn't provide a global notion of
the GPU usage.

Therefore, provide the accumulated amount of active time for each V3D
queue (BIN, RENDER, CSD, TFU and CACHE_CLEAN), considering all the jobs
submitted to the queue, independent of the client.

This data is exposed through the sysfs interface, so that if the
interface is queried at two different points of time the usage percentage
of each of the queues can be calculated.

Co-developed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Acked-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230905213416.1290219-3-mcanal@igalia.com


# 09a93cc4 05-Sep-2023 Maíra Canal <mcanal@igalia.com>

drm/v3d: Implement show_fdinfo() callback for GPU usage stats

This patch exposes the accumulated amount of active time per client
through the fdinfo infrastructure. The amount of active time is exposed
for each V3D queue: BIN, RENDER, CSD, TFU and CACHE_CLEAN.

In order to calculate the amount of active time per client, a CPU clock
is used through the function local_clock(). The point where the jobs has
started is marked and is finally compared with the time that the job had
finished.

Moreover, the number of jobs submitted to each queue is also exposed on
fdinfo through the identifier "v3d-jobs-<queue>".

Co-developed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Acked-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230905213416.1290219-3-mcanal@igalia.com


# 0ad5bc1c 31-Oct-2023 Iago Toral Quiroga <itoral@igalia.com>

drm/v3d: fix up register addresses for V3D 7.x

This patch updates a number of register addresses that have
been changed in Raspberry Pi 5 (V3D 7.1) and updates the
code to use the corresponding registers and addresses based
on the actual V3D version.

Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231031073859.25298-3-itoral@igalia.com


# a6149f03 30-Oct-2023 Matthew Brost <matthew.brost@intel.com>

drm/sched: Convert drm scheduler to use a work queue rather than kthread

In Xe, the new Intel GPU driver, a choice has made to have a 1 to 1
mapping between a drm_gpu_scheduler and drm_sched_entity. At first this
seems a bit odd but let us explain the reasoning below.

1. In Xe the submission order from multiple drm_sched_entity is not
guaranteed to be the same completion even if targeting the same hardware
engine. This is because in Xe we have a firmware scheduler, the GuC,
which allowed to reorder, timeslice, and preempt submissions. If a using
shared drm_gpu_scheduler across multiple drm_sched_entity, the TDR falls
apart as the TDR expects submission order == completion order. Using a
dedicated drm_gpu_scheduler per drm_sched_entity solve this problem.

2. In Xe submissions are done via programming a ring buffer (circular
buffer), a drm_gpu_scheduler provides a limit on number of jobs, if the
limit of number jobs is set to RING_SIZE / MAX_SIZE_PER_JOB we get flow
control on the ring for free.

A problem with this design is currently a drm_gpu_scheduler uses a
kthread for submission / job cleanup. This doesn't scale if a large
number of drm_gpu_scheduler are used. To work around the scaling issue,
use a worker rather than kthread for submission / job cleanup.

v2:
- (Rob Clark) Fix msm build
- Pass in run work queue
v3:
- (Boris) don't have loop in worker
v4:
- (Tvrtko) break out submit ready, stop, start helpers into own patch
v5:
- (Boris) default to ordered work queue
v6:
- (Luben / checkpatch) fix alignment in msm_ringbuffer.c
- (Luben) s/drm_sched_submit_queue/drm_sched_wqueue_enqueue
- (Luben) Update comment for drm_sched_wqueue_enqueue
- (Luben) Positive check for submit_wq in drm_sched_init
- (Luben) s/alloc_submit_wq/own_submit_wq
v7:
- (Luben) s/drm_sched_wqueue_enqueue/drm_sched_run_job_queue
v8:
- (Luben) Adjust var names / comments

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Link: https://lore.kernel.org/r/20231031032439.1558703-3-matthew.brost@intel.com
Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>


# 56e44960 14-Oct-2023 Luben Tuikov <luben.tuikov@amd.com>

drm/sched: Convert the GPU scheduler to variable number of run-queues

The GPU scheduler has now a variable number of run-queues, which are set up at
drm_sched_init() time. This way, each driver announces how many run-queues it
requires (supports) per each GPU scheduler it creates. Note, that run-queues
correspond to scheduler "priorities", thus if the number of run-queues is set
to 1 at drm_sched_init(), then that scheduler supports a single run-queue,
i.e. single "priority". If a driver further sets a single entity per
run-queue, then this creates a 1-to-1 correspondence between a scheduler and
a scheduled entity.

Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Russell King <linux+etnaviv@armlinux.org.uk>
Cc: Qiang Yu <yuq825@gmail.com>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Abhinav Kumar <quic_abhinavk@quicinc.com>
Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Cc: Danilo Krummrich <dakr@redhat.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Boris Brezillon <boris.brezillon@collabora.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Emma Anholt <emma@anholt.net>
Cc: etnaviv@lists.freedesktop.org
Cc: lima@lists.freedesktop.org
Cc: linux-arm-msm@vger.kernel.org
Cc: freedreno@lists.freedesktop.org
Cc: nouveau@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/r/20231023032251.164775-1-luben.tuikov@amd.com


# 7e302637 28-Feb-2022 Melissa Wen <mwen@igalia.com>

drm/v3d: centralize error handling when init scheduler fails

Remove redundant error message (since now it is very similar to what
we do in drm_sched_init) and centralize all error handling in a
unique place, as we follow the same steps in any case of failure.

Signed-off-by: Melissa Wen <mwen@igalia.com>
Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220228181647.3794298-1-mwen@igalia.com


# 8ab62eda 22-Feb-2022 Jiawei Gu <Jiawei.Gu@amd.com>

drm/sched: Add device pointer to drm_gpu_scheduler

Add device pointer so scheduler's printing can use
DRM_DEV_ERROR() instead, which makes life easier under multiple GPU
scenario.

v2: amend all calls of drm_sched_init()
v3: fill dev pointer for all drm_sched_init() calls

Signed-off-by: Jiawei Gu <Jiawei.Gu@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220221095705.5290-1-Jiawei.Gu@amd.com


# da3208e8 04-Aug-2021 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/v3d: Use scheduler dependency handling

With the prep work out of the way this isn't tricky anymore.

Aside: The chaining of the various jobs is a bit awkward, with the
possibility of failure in bad places. I think with the
drm_sched_job_init/arm split and maybe preloading the
job->dependencies xarray this should be fixable.

v2: Rebase over renamed function names for adding dependencies.

Reviewed-by: Melissa Wen <mwen@igalia.com> (v1)
Acked-by: Emma Anholt <emma@anholt.net>
Cc: Melissa Wen <melissa.srw@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Emma Anholt <emma@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20210805104705.862416-11-daniel.vetter@ffwll.ch


# 916044fa 04-Aug-2021 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/v3d: Move drm_sched_job_init to v3d_job_init

Prep work for using the scheduler dependency handling. We need to call
drm_sched_job_init earlier so we can use the new drm_sched_job_await*
functions for dependency handling here.

v2: Slightly better commit message and rebase to include the
drm_sched_job_arm() call (Emma).

v3: Cleanup jobs under construction correctly (Emma)

v4: Rebase over perfmon patch

Reviewed-by: Melissa Wen <mwen@igalia.com> (v3)
Acked-by: Emma Anholt <emma@anholt.net>
Cc: Melissa Wen <melissa.srw@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Emma Anholt <emma@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20210805104705.862416-10-daniel.vetter@ffwll.ch


# 26a4dc29 08-Jun-2021 Juan A. Suarez Romero <jasuarez@igalia.com>

drm/v3d: Expose performance counters to userspace

The V3D engine has several hardware performance counters that can of
interest for userspace performance analysis tools.

This exposes new ioctls to create and destroy performance monitor
objects, as well as to query the counter values.

Each created performance monitor object has an ID that can be attached
to CL/CSD submissions, so the driver enables the requested counters when
the job is submitted, and updates the performance monitor values when
the job is done.

It is up to the user to ensure all the jobs have been finished before
getting the performance monitor values. It is also up to the user to
properly synchronize BCL jobs when submitting jobs with different
performance monitors attached.

Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@linux.ie>
Cc: Emma Anholt <emma@anholt.net>
To: dri-devel@lists.freedesktop.org
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Acked-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210608111541.461991-1-jasuarez@igalia.com


# 78efe21b 30-Jun-2021 Boris Brezillon <boris.brezillon@collabora.com>

drm/sched: Allow using a dedicated workqueue for the timeout/fault tdr

Mali Midgard/Bifrost GPUs have 3 hardware queues but only a global GPU
reset. This leads to extra complexity when we need to synchronize timeout
works with the reset work. One solution to address that is to have an
ordered workqueue at the driver level that will be used by the different
schedulers to queue their timeout work. Thanks to the serialization
provided by the ordered workqueue we are guaranteed that timeout
handlers are executed sequentially, and can thus easily reset the GPU
from the timeout handler without extra synchronization.

v5:
* Add a new paragraph to the timedout_job() method

v3:
* New patch

v4:
* Actually use the timeout_wq to queue the timeout work

Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Christian König <christian.koenig@amd.com>
Cc: Qiang Yu <yuq825@gmail.com>
Cc: Emma Anholt <emma@anholt.net>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-3-boris.brezillon@collabora.com


# f2f12eb9 01-Feb-2021 Christian König <christian.koenig@amd.com>

drm/scheduler: provide scheduler score externally

Allow multiple schedulers to share the load balancing score.

This is useful when one engine has different hw rings.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-and-Tested-by: Leo Liu <leo.liu@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210204144405.2737-1-christian.koenig@amd.com


# 576a08e0 01-Feb-2021 Christian König <christian.koenig@amd.com>

drm/v3d/v3d_sched: fix scheduler callbacks return status

Looks like this was not correctly adjusted.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Fixes: a6a1f036c74e ("drm/scheduler: Job timeout handler returns status (v3)")
Link: https://patchwork.freedesktop.org/patch/msgid/20210201091159.177853-1-christian.koenig@amd.com


# a6a1f036 20-Jan-2021 Luben Tuikov <luben.tuikov@amd.com>

drm/scheduler: Job timeout handler returns status (v3)

This patch does not change current behaviour.

The driver's job timeout handler now returns
status indicating back to the DRM layer whether
the device (GPU) is no longer available, such as
after it's been unplugged, or whether all is
normal, i.e. current behaviour.

All drivers which make use of the
drm_sched_backend_ops' .timedout_job() callback
have been accordingly renamed and return the
would've-been default value of
DRM_GPU_SCHED_STAT_NOMINAL to restart the task's
timeout timer--this is the old behaviour, and is
preserved by this patch.

v2: Use enum as the status of a driver's job
timeout callback method.

v3: Return scheduler/device information, rather
than task information.

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Russell King <linux+etnaviv@armlinux.org.uk>
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: Qiang Yu <yuq825@gmail.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Cc: Eric Anholt <eric@anholt.net>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Steven Price <steven.price@arm.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/415095/


# e2183fb1 10-Feb-2021 Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

Revert "drm/scheduler: Job timeout handler returns status (v3)"

This reverts commit c10983e14e8f5d7c8dab0415e0cb7fe8d10aa9e3.

This commit is not meant for drm-misc-next-fixes, and was accidentally
cherry picked over.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>


# c10983e1 20-Jan-2021 Luben Tuikov <luben.tuikov@amd.com>

drm/scheduler: Job timeout handler returns status (v3)

This patch does not change current behaviour.

The driver's job timeout handler now returns
status indicating back to the DRM layer whether
the device (GPU) is no longer available, such as
after it's been unplugged, or whether all is
normal, i.e. current behaviour.

All drivers which make use of the
drm_sched_backend_ops' .timedout_job() callback
have been accordingly renamed and return the
would've-been default value of
DRM_GPU_SCHED_STAT_NOMINAL to restart the task's
timeout timer--this is the old behaviour, and is
preserved by this patch.

v2: Use enum as the status of a driver's job
timeout callback method.

v3: Return scheduler/device information, rather
than task information.

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Russell King <linux+etnaviv@armlinux.org.uk>
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: Qiang Yu <yuq825@gmail.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Cc: Eric Anholt <eric@anholt.net>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Steven Price <steven.price@arm.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/415095/
(cherry picked from commit a6a1f036c74e3d2a3a56b3140492c7c3ecb879f3)
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>


# d49c4b2c 16-Nov-2020 Lee Jones <lee.jones@linaro.org>

drm/v3d/v3d_sched: Demote non-conformant kernel-doc header

Fixes the following W=1 kernel build warning(s):

drivers/gpu/drm/v3d/v3d_sched.c:75: warning: Function parameter or member 'sched_job' not described in 'v3d_job_dependency'
drivers/gpu/drm/v3d/v3d_sched.c:75: warning: Function parameter or member 's_entity' not described in 'v3d_job_dependency'

Cc: Eric Anholt <eric@anholt.net>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: dri-devel@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20201116174112.1833368-37-lee.jones@linaro.org


# e9d2871f 16-Nov-2020 Mauro Carvalho Chehab <mchehab+huawei@kernel.org>

drm: fix some kernel-doc markups

Some identifiers have different names between their prototypes
and the kernel-doc markup.

Others need to be fixed, as kernel-doc markups should use this format:
identifier - description

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/12d4ca26f6843618200529ce5445063734d38c04.1605521731.git.mchehab+huawei@kernel.org


# bc662528 15-Apr-2020 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/v3d: Delete v3d_dev->dev

We already have it in v3d_dev->drm.dev with zero additional pointer
chasing. Personally I don't like duplicated pointers like this
because:
- reviewers need to check whether the pointer is for the same or
different objects if there's multiple
- compilers have an easier time too

But also a bit a bikeshed, so feel free to ignore.

Acked-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20200415074034.175360-10-daniel.vetter@ffwll.ch


# 5918045c 18-Apr-2019 Christian König <christian.koenig@amd.com>

drm/scheduler: rework job destruction

We now destroy finished jobs from the worker thread to make sure that
we never destroy a job currently in timeout processing.
By this we avoid holding lock around ring mirror list in drm_sched_stop
which should solve a deadlock reported by a user.

v2: Remove unused variable.
v4: Move guilty job free into sched code.
v5:
Move sched->hw_rq_count to drm_sched_start to account for counter
decrement in drm_sched_stop even when we don't call resubmit jobs
if guily job did signal.
v6: remove unused variable

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109692

Acked-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1555599624-12285-3-git-send-email-andrey.grodzovsky@amd.com


# dffa9b7a 16-Apr-2019 Eric Anholt <eric@anholt.net>

drm/v3d: Add missing implicit synchronization.

It is the expectation of existing userspace (X11 + Mesa, in
particular) that jobs submitted to the kernel against a shared BO will
get implicitly synchronized by their submission order. If we want to
allow clever userspace to disable implicit synchronization, we should
do that under its own submit flag (as amdgpu and lima do).

Note that we currently only implicitly sync for the rendering pass,
not binning -- if you texture-from-pixmap in the binning vertex shader
(vertex coordinate generation), you'll miss out on synchronization.

Fixes flickering when multiple clients are running in parallel,
particularly GL apps and compositors.

v2: Fix a missing refcount on the CSD done fence for L2 cleaning.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20190416225856.20264-6-eric@anholt.net
Acked-by: Rob Clark <robdclark@gmail.com>


# d223f98f 16-Apr-2019 Eric Anholt <eric@anholt.net>

drm/v3d: Add support for compute shader dispatch.

The compute shader dispatch interface is pretty simple -- just pass in
the regs that userspace has passed us, with no CLs to run. However,
with no CL to run it means that we need to do manual cache flushing of
the L2 after the HW execution completes (for SSBO, atomic, and
image_load_store writes that are the output of compute shaders).

This doesn't yet expose the L2 cache's ability to have a region of the
address space not write back to memory (which could be used for
shared_var storage).

So far, the Mesa side has been tested on V3D v4.2 simpenrose (passing
the ES31 tests), and on the kernel side on 7278 (failing atomic
compswap tests in a way that doesn't reproduce on simpenrose).

v2: Fix excessive allocation for the clean_job (reported by Dan
Carpenter). Keep refs on jobs until clean_job is finished, to
avoid spurious MMU errors if the output BOs are freed by userspace
before L2 cleaning is finished.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20190416225856.20264-4-eric@anholt.net
Acked-by: Rob Clark <robdclark@gmail.com>


# a783a09e 16-Apr-2019 Eric Anholt <eric@anholt.net>

drm/v3d: Refactor job management.

The CL submission had two jobs embedded in an exec struct. When I
added TFU support, I had to replicate some of the exec stuff and some
of the job stuff. As I went to add CSD, it became clear that actually
what was in exec should just be in the two CL jobs, and it would let
us share a lot more code between the 4 queues.

v2: Fix missing error path in TFU ioctl's bo[] allocation.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20190416225856.20264-3-eric@anholt.net
Acked-by: Rob Clark <robdclark@gmail.com>


# 3f0b646e 13-Mar-2019 Eric Anholt <eric@anholt.net>

drm/v3d: Rename the fence signaled from IRQs to "irq_fence".

We have another thing called the "done fence" that tracks when the
scheduler considers the job done, and having the shared name was
confusing.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20190313235211.28995-2-eric@anholt.net
Reviewed-by: Dave Emett <david.emett@broadcom.com>


# e8074f75 11-Mar-2019 Andrey Grodzovsky <andrey.grodzovsky@amd.com>

drm/v3d: Fix calling drm_sched_resubmit_jobs for same sched.

Also stop calling drm_sched_increase_karma multiple times.

v2: Fix whitespace in the code we're moving (by anholt)

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/1552409822-17230-1-git-send-email-andrey.grodzovsky@amd.com
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Fixes: 222b5f044159 ("drm/sched: Refactor ring mirror list handling.")


# 222b5f04 04-Dec-2018 Andrey Grodzovsky <andrey.grodzovsky@amd.com>

drm/sched: Refactor ring mirror list handling.

Decauple sched threads stop and start and ring mirror
list handling from the policy of what to do about the
guilty jobs.
When stoppping the sched thread and detaching sched fences
from non signaled HW fenes wait for all signaled HW fences
to complete before rerunning the jobs.

v2: Fix resubmission of guilty job into HW after refactoring.

v4:
Full restart for all the jobs, not only from guilty ring.
Extract karma increase into standalone function.

v5:
Rework waiting for signaled jobs without relying on the job
struct itself as those might already be freed for non 'guilty'
job's schedulers.
Expose karma increase to drivers.

v6:
Use list_for_each_entry_safe_continue and drm_sched_process_job
in case fence already signaled.
Call drm_sched_increase_karma only once for amdgpu and add documentation.

v7:
Wait only for the latest job's fence.

Suggested-by: Christian Koenig <Christian.Koenig@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 1584f16c 28-Nov-2018 Eric Anholt <eric@anholt.net>

drm/v3d: Add support for submitting jobs to the TFU.

The TFU can copy from raster, UIF, and SAND input images to UIF output
images, with optional mipmap generation. This will certainly be
useful for media EGL image input, but is also useful immediately for
mipmap generation without bogging the V3D core down.

For now we only run the queue 1 job deep, and don't have any hang
recovery (though I don't think we should need it, with TFU). Queuing
multiple jobs in the HW will require synchronizing the YUV coefficient
regs updates since they don't get FIFOed with the job.

v2: Change the ioctl to IOW instead of IOWR, always set COEF0, explain
why TFU is AUTH, clarify the syncing docs, drop the unused TFU
interrupt regs (you're expected to use the hub's), don't take
&bo->base for NULL bos.
v3: Fix a little whitespace alignment (noticed by checkpatch), rebase
on drm_sched_job_cleanup() changes.

Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Dave Emett <david.emett@broadcom.com> (v2)
Link: https://patchwork.freedesktop.org/patch/264607/


# e90e45f6 08-Nov-2018 Eric Anholt <eric@anholt.net>

drm/v3d: Update a comment about what uses v3d_job_dependency().

I merged bin and render's paths in a late refactoring.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20181108161654.19888-3-eric@anholt.net
Reviewed-by: Boris Brezillon <boris.brezillon@bootlin.com>


# 26efecf9 29-Oct-2018 Sharat Masetty <smasetty@codeaurora.org>

drm/scheduler: Add drm_sched_job_cleanup

This patch adds a new API to clean up the scheduler job resources. This
is primarliy needed in cases the job was created but was not queued to
the scheduler queue. Additionally with this change, the layer which
creates the scheduler job also gets to free up the job's resources and
this entails moving the dma_fence_put(finished_fence) to the drivers
ops free handler routines.

Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 19067e52 12-Oct-2018 Christian König <christian.koenig@amd.com>

drm/sched: make sure timer is restarted

Make sure we always restart the timer after a timeout and remove the
device specific workarounds.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 6a962430 25-Sep-2018 Nayan Deshmukh <nayan26deshmukh@gmail.com>

drm/scheduler: remove timeout work_struct from drm_sched_job (v3)

having a delayed work item per job is redundant as we only need one
per scheduler to track the time out the currently executing job.

v2: the first element of the ring mirror list is the currently
executing job so we don't need a additional variable for it

v3: squash in fixes for v3d and etnaviv

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# a65020d0 03-Jul-2018 Eric Anholt <eric@anholt.net>

drm/v3d: Fix a grammar nit in the scheduler docs.

Signed-off-by: Eric Anholt <eric@anholt.net>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20180703170515.6298-4-eric@anholt.net
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>


# 624bb0c0 03-Jul-2018 Eric Anholt <eric@anholt.net>

drm/v3d: Delay the scheduler timeout if we're still making progress.

GTF-GLES2.gtf.GL.acos.acos_float_vert_xvary submits jobs that take 4
seconds at maximum resolution, but we still want to reset quickly if a
job is really hung. Sample the CL's current address and the return
address (since we call into tile lists repeatedly) and if either has
changed then assume we've made progress.

Signed-off-by: Eric Anholt <eric@anholt.net>
Cc: Lucas Stach <l.stach@pengutronix.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20180703170515.6298-1-eric@anholt.net
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>


# 17e23993 18-May-2018 Dan Carpenter <dan.carpenter@oracle.com>

drm/v3d: Checking for NULL vs IS_ERR()

The v3d_fence_create() only returns error pointers on error. It never
returns NULL.

Fixes: 57692c94dcbe ("drm/v3d: Introduce a new DRM driver for Broadcom V3D V3.x+")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20180518081041.GC28335@mwanda


# 57692c94 30-Apr-2018 Eric Anholt <eric@anholt.net>

drm/v3d: Introduce a new DRM driver for Broadcom V3D V3.x+

This driver will be used to support Mesa on the Broadcom 7268 and 7278
platforms.

V3D 3.3 introduces an MMU, which means we no longer need CMA or vc4's
complicated CL/shader validation scheme. This massively changes the
GEM behavior, so I've forked off to a new driver.

v2: Mark SUBMIT_CL as needing DRM_AUTH. coccinelle fixes from kbuild
test robot. Drop personal git link from MAINTAINERS. Don't
double-map dma-buf imported BOs. Add kerneldoc about needing MMU
eviction. Drop prime vmap/unmap stubs. Delay mmap offset setup
to mmap time. Use drm_dev_init instead of _alloc. Use
ktime_get() for wait_bo timeouts. Drop drm_can_sleep() usage,
since we don't modeset. Switch page tables back to WC (debug
change to coherent had slipped in). Switch
drm_gem_object_unreference_unlocked() to
drm_gem_object_put_unlocked(). Simplify overflow mem handling by
not sharing overflow mem between jobs.
v3: no changes
v4: align submit_cl to 64 bits (review by airlied), check zero flags in
other ioctls.

Signed-off-by: Eric Anholt <eric@anholt.net>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v4)
Acked-by: Dave Airlie <airlied@linux.ie> (v3, requested submit_cl change)
Link: https://patchwork.freedesktop.org/patch/msgid/20180430181058.30181-3-eric@anholt.net