History log of /linux-master/include/linux/host1x.h
Revision Date Author Comments
# 1d83d1a2 22-Mar-2023 Uwe Kleine-König <u.kleine-koenig@pengutronix.de>

gpu: host1x: Make host1x_client_unregister() return void

This function returned zero unconditionally. Make it return no value and
simplify all callers accordingly.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Thierry Reding <treding@nvidia.com>


# d5179020 19-Jan-2023 Mikko Perttunen <mperttunen@nvidia.com>

gpu: host1x: External timeout/cancellation for fences

Currently all fences have a 30 second timeout to ensure they are
cleaned up if the fence never completes otherwise. However, this
one size fits all solution doesn't actually fit in every case,
such as syncpoint waiting where we want to be able to have timeouts
longer than 30 seconds. As such, we want to be able to give control
over fence cancellation to the caller (and maybe eventually get rid
of the internal timeout altogether).

Here we add this cancellation mechanism by essentially adding a
function for entering the timeout path by function call, and changing
the syncpoint wait function to use it.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>


# c24973ed 19-Jan-2023 Mikko Perttunen <mperttunen@nvidia.com>

gpu: host1x: Implement job tracking using DMA fences

In anticipation of removal of the intr API, implement job tracking
using DMA fences instead. The main two things about this are
making cdma_update schedule the work since fence completion can
now be called from interrupt context, and some complication in
ensuring the callback is not running when we free the fence.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>


# 8935002f 07-Sep-2022 Mikko Perttunen <mperttunen@nvidia.com>

gpu: host1x: Select context device based on attached IOMMU

On Tegra234, engines that are programmed through Host1x channels can
be attached to either the NISO0 or NISO1 SMMU. Because of that, when
selecting a context device to use with an engine, we need to select
one that is also attached to the same SMMU.

Add a parameter to host1x_memory_context_alloc to specify which device
we are allocating a context for, and use it to pick an appropriate
context device.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
[treding@nvidia.com: update !IOMMU_API stub signature]
Signed-off-by: Thierry Reding <treding@nvidia.com>


# 24862547 27-Jun-2022 Mikko Perttunen <mperttunen@nvidia.com>

gpu: host1x: Program context stream ID on submission

Add code to do stream ID switching at the beginning of a job. The
stream ID is switched to the stream ID specified by the context
passed in the job structure.

Before switching the stream ID, an OP_DONE wait is done on the
channel's engine to ensure that there is no residual ongoing
work that might do DMA using the new stream ID.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>


# 8aa5bcb6 27-Jun-2022 Mikko Perttunen <mperttunen@nvidia.com>

gpu: host1x: Add context device management code

Add code to register context devices from device tree, allocate them
out and manage their refcounts.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>


# 3e9c4584 24-Mar-2022 Thierry Reding <treding@nvidia.com>

gpu: host1x: Do not use mapping cache for job submissions

Buffer mappings used in job submissions are usually small and not
rapidly reused as opposed to framebuffers (which are usually large and
rapidly reused, for example when page-flipping between double-buffered
framebuffers). Avoid going through the mapping cache for these buffers
since the cache would also lead to leaks if nobody is ever releasing
the cache's last reference. For DRM/KMS these last references are
dropped when the framebuffers are removed and therefore no longer
needed.

While at it, also add a note about the need to explicitly remove the
final reference to the mapping in the cache.

Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>


# fe696ccb 03-Apr-2022 Randy Dunlap <rdunlap@infradead.org>

gpu: host1x: Fix a kernel-doc warning

Add @cache description to eliminate a kernel-doc warning.

include/linux/host1x.h:104: warning: Function parameter or member 'cache' not described in 'host1x_client'

Fixes: 1f39b1dfa53c ("drm/tegra: Implement buffer object cache")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Thierry Reding <treding@nvidia.com>
Cc: linux-tegra@vger.kernel.org
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Thierry Reding <treding@nvidia.com>


# 9ca790f4 30-Nov-2021 Dmitry Osipenko <digetx@gmail.com>

gpu: host1x: Add host1x_channel_stop()

Add host1x_channel_stop() which waits till channel becomes idle and then
stops the channel hardware. This is needed for supporting suspend/resume
by host1x drivers since the hardware state is lost after power-gating,
thus the channel needs to be stopped before client enters into suspend.

Tested-by: Peter Geis <pgwipeout@gmail.com> # Ouya T30
Tested-by: Paul Fertser <fercerpav@gmail.com> # PAZ00 T20
Tested-by: Nicolas Chauvet <kwizart@gmail.com> # PAZ00 T20 and TK1 T124
Tested-by: Matt Merhar <mattmerhar@protonmail.com> # Ouya T30
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>


# 46f226c9 16-Sep-2021 Mikko Perttunen <mperttunen@nvidia.com>

drm/tegra: Add NVDEC driver

Add support for booting and using NVDEC on Tegra210, Tegra186
and Tegra194 to the Host1x and TegraDRM drivers. Booting in
secure mode is not currently supported.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>


# 1f39b1df 07-Feb-2020 Thierry Reding <treding@nvidia.com>

drm/tegra: Implement buffer object cache

This cache is used to avoid mapping and unmapping buffer objects
unnecessarily. Mappings are cached per client and stay hot until
the buffer object is destroyed.

Signed-off-by: Thierry Reding <treding@nvidia.com>


# c6aeaf56 09-Sep-2021 Thierry Reding <treding@nvidia.com>

drm/tegra: Implement correct DMA-BUF semantics

DMA-BUF requires that each device that accesses a DMA-BUF attaches to it
separately. To do so the host1x_bo_pin() and host1x_bo_unpin() functions
need to be reimplemented so that they can return a mapping, which either
represents an attachment or a map of the driver's own GEM object.

Signed-off-by: Thierry Reding <treding@nvidia.com>


# 0fddaa85 10-Jun-2021 Mikko Perttunen <mperttunen@nvidia.com>

gpu: host1x: Add option to skip firewall for a job

The new UAPI will have its own firewall, and we don't want to run
the firewall in the Host1x driver for those jobs. As such, add a
parameter to host1x_job_alloc to specify if we want to skip the
firewall in the Host1x driver.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>


# e902585f 10-Jun-2021 Mikko Perttunen <mperttunen@nvidia.com>

gpu: host1x: Add support for syncpoint waits in CDMA pushbuffer

Add support for inserting syncpoint waits in the CDMA pushbuffer.
These waits need to be done in HOST1X class, while gather submitted
by the application execute in engine class.

Support is added by converting the gather list of job into a command
list that can include both gathers and waits. When the job is
submitted, these commands are pushed as the appropriate opcodes
on the CDMA pushbuffer.

Also supported are waits relative to the start of the job,
which are useful for jobs doing multiple things with an engine
that doesn't natively support pipelining.

While at it, use 32-bit waits on chips that support them.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>


# 17a298e9 10-Jun-2021 Mikko Perttunen <mperttunen@nvidia.com>

gpu: host1x: Add job release callback

Add a callback field to the job structure, to be called just before
the job is to be freed. This allows the job's submitter to clean
up any of its own state, like decrement runtime PM refcounts.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>


# c78f837a 10-Jun-2021 Mikko Perttunen <mperttunen@nvidia.com>

gpu: host1x: Add no-recovery mode

Add a new property for jobs to enable or disable recovery i.e.
CPU increments of syncpoints to max value on job timeout. This
allows for a more solid model for hanged jobs, where userspace
doesn't need to guess if a syncpoint increment happened because
the job completed, or because job timeout was triggered.

On job timeout, we stop the channel, NOP all future jobs on the
channel using the same syncpoint, mark the syncpoint as locked
and resume the channel from the next job, if any.

The future jobs are NOPed, since because we don't do the CPU
increments, the value of the syncpoint is no longer synchronized,
and any waiters would become confused if a future job incremented
the syncpoint. The syncpoint is marked locked to ensure that any
future jobs cannot increment the syncpoint either, until the
application has recognized the situation and reallocated the
syncpoint.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>


# 687db220 10-Jun-2021 Mikko Perttunen <mperttunen@nvidia.com>

gpu: host1x: Add DMA fence implementation

Add an implementation of dma_fences based on syncpoints. Syncpoint
interrupts are used to signal fences. Additionally, after
software signaling has been enabled, a 30 second timeout is started.
If the syncpoint threshold is not reached within this period,
the fence is signalled with an -ETIMEDOUT error code. This is to
allow fences that would never reach their syncpoint threshold to
be cleaned up. The timeout can potentially be removed in the future
after job tracking code has been refactored.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>


# 0cfe5a6e 01-Apr-2021 Thierry Reding <treding@nvidia.com>

gpu: host1x: Split up client initalization and registration

In some cases we may need to initialize the host1x client first before
registering it. This commit adds a new helper that will do nothing but
the initialization of the data structure.

At the same time, the initialization is removed from the registration
function. Note, however, that for simplicity we explicitly initialize
the client when the host1x_client_register() function is called, as
opposed to the low-level __host1x_client_register() function. This
allows existing callers to remain unchanged.

Signed-off-by: Thierry Reding <treding@nvidia.com>


# 933deb8c 26-Mar-2021 Thierry Reding <treding@nvidia.com>

gpu: host1x: Add early init and late exit callbacks

These callbacks can be used by client drivers to run code during early
init and during late exit. Early init callbacks are run prior to the
regular init callbacks while late exit callbacks run after the regular
exit callbacks.

Signed-off-by: Thierry Reding <treding@nvidia.com>


# f5ba33fb 29-Mar-2021 Mikko Perttunen <mperttunen@nvidia.com>

gpu: host1x: Reserve VBLANK syncpoints at initialization

On T20-T148 chips, the bootloader can set up a boot splash
screen with DC configured to increment syncpoint 26/27
at VBLANK. Because of this we shouldn't allow these syncpoints
to be allocated until DC has been reset and will no longer
increment them in the background.

As such, on these chips, reserve those two syncpoints at
initialization, and only mark them free once the DC
driver has indicated it's safe to do so.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>


# 2aed4f5a 29-Mar-2021 Mikko Perttunen <mperttunen@nvidia.com>

gpu: host1x: Cleanup and refcounting for syncpoints

Add reference counting for allocated syncpoints to allow keeping
them allocated while jobs are referencing them. Additionally,
clean up various places using syncpoint IDs to use host1x_syncpt
pointers instead.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>


# 86cec7ec 29-Mar-2021 Mikko Perttunen <mperttunen@nvidia.com>

gpu: host1x: Allow syncpoints without associated client

Syncpoints don't need to be associated with any client,
so remove the property, and expose host1x_syncpt_alloc.
This will allow allocating syncpoints without prior knowledge
of the engine that it will be used with.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>


# a24f9817 29-Mar-2021 Mikko Perttunen <mperttunen@nvidia.com>

gpu: host1x: Use different lock classes for each client

To avoid false lockdep warnings, give each client lock a different
lock class, passed from the initialization site by macro.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>


# cf5153e4 11-Aug-2020 Sowjanya Komatineni <skomatineni@nvidia.com>

media: gpu: host1x: mipi: Keep MIPI clock enabled and mutex locked till calibration done

With the split of MIPI calibration into tegra_mipi_calibrate() and
tegra_mipi_wait(), MIPI clock is not kept enabled and mutex is not locked
till the calibration is done.

So, this patch keeps MIPI clock enabled and mutex locked after triggering
start of calibration till its done.

To let calibration process go through its finite sequence codes before
calibration logic waiting for pads idle state added wait time of 75usec
to make sure it sees idle state to apply the results.

This patch renames tegra_mipi_calibrate() as tegra_mipi_start_calibration()
and tegra_mipi_wait() as tegra_mipi_finish_calibration() to be inline
with their usage.

Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Sowjanya Komatineni <skomatineni@nvidia.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>


# b3f1b760 14-Jul-2020 Sowjanya Komatineni <skomatineni@nvidia.com>

gpu: host1x: mipi: Split tegra_mipi_calibrate() and tegra_mipi_wait()

SW can trigger MIPI pads calibration any time after power on
but calibration results will be latched and applied to the pads
by MIPI CAL unit only when the link is in LP-11 state and then
status register will be updated.

For CSI, trigger of pads calibration happen during CSI stream
enable where CSI receiver is kept ready prior to sensor or CSI
transmitter stream start.

So, pads may not be in LP-11 at this time and waiting for the
calibration to be done immediate after calibration start will
result in timeout.

This patch splits tegra_mipi_calibrate() and tegra_mipi_wait()
so triggering for calibration and waiting for it to complete can
happen at different stages.

Signed-off-by: Sowjanya Komatineni <skomatineni@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>


# 767598d4 14-Jul-2020 Sowjanya Komatineni <skomatineni@nvidia.com>

gpu: host1x: mipi: Update tegra_mipi_request() to be node based

Tegra CSI driver need a separate MIPI device for each channel as
calibration of corresponding MIPI pads for each channel should
happen independently.

So, this patch updates tegra_mipi_request() API to add a device_node
pointer argument to allow creating mipi device for specific device
node rather than a device.

Signed-off-by: Sowjanya Komatineni <skomatineni@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>


# 2fd2bc7f 13-Jun-2020 Colton Lewis <colton.w.lewis@protonmail.com>

gpu: host1x: Correct trivial kernel-doc inconsistencies

Silence documentation build warnings by adding kernel-doc fields.

./include/linux/host1x.h:69: warning: Function parameter or member 'parent' not described in 'host1x_client'
./include/linux/host1x.h:69: warning: Function parameter or member 'usecount' not described in 'host1x_client'
./include/linux/host1x.h:69: warning: Function parameter or member 'lock' not described in 'host1x_client'

Signed-off-by: Colton Lewis <colton.w.lewis@protonmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>


# 501be6c1 25-Mar-2020 Thierry Reding <treding@nvidia.com>

drm/tegra: Fix SMMU support on Tegra124 and Tegra210

When testing whether or not to enable the use of the SMMU, consult the
supported DMA mask rather than the actually configured DMA mask, since
the latter might already have been restricted.

Fixes: 2d9384ff9177 ("drm/tegra: Relax IOMMU usage criteria on old Tegra")
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>


# fd67e9c6 02-Dec-2019 Thierry Reding <treding@nvidia.com>

drm/tegra: Do not implement runtime PM

The Tegra DRM driver heavily relies on the implementations for runtime
suspend/resume to be called at specific times. Unfortunately, there are
some cases where that doesn't work. One example is if the user disables
runtime PM for a given subdevice. Another example is that the PM core
acquires a reference to runtime PM during system sleep, effectively
preventing devices from going into low power modes. This is intentional
to avoid nasty race conditions, but it also causes system sleep to not
function properly on all Tegra systems.

Fix this by not implementing runtime PM at all. Instead, a minimal,
reference-counted suspend/resume infrastructure is added to the host1x
bus. This has the benefit that it can be used regardless of the system
power state (or any transitions we might be in), or whether or not the
user allows runtime PM.

Atomic modesetting guarantees that these functions will end up being
called at the right point in time, so the pitfalls for the more generic
runtime PM do not apply here.

Signed-off-by: Thierry Reding <treding@nvidia.com>


# 608f43ad 02-Dec-2019 Thierry Reding <treding@nvidia.com>

gpu: host1x: Rename "parent" to "host"

Rename the host1x clients' parent to "host" because that more closely
describes what it is. The parent can be confused with the parent device
in terms of the device hierarchy. Subsequent patches will add a new
member that refers to the parent in that hierarchy.

Signed-off-by: Thierry Reding <treding@nvidia.com>


# 35bd71dd 18-Nov-2019 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/tegra: Delete host1x_bo_ops->k(un)map

It doesn't have any callers anymore.

Aside: The ->mmap/munmap hooks have a bit a confusing name, they don't
do userspace mmaps, but a kernel vmap. I think most places use vmap
for this, except ttm, which uses kmap for vmap for added confusion.
mmap seems entirely for userspace mappings set up through mmap(2)
syscall.

Reviewed-by: Thierry Reding <treding@nvidia.com>
Tested-by: Thierry Reding <treding@nvidia.com>
Acked-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Thierry Reding <thierry.reding@gmail.com>
Cc: Jonathan Hunter <jonathanh@nvidia.com>
Cc: linux-tegra@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20191118103536.17675-3-daniel.vetter@ffwll.ch


# ab4f81bf 28-Oct-2019 Thierry Reding <treding@nvidia.com>

gpu: host1x: Add direction flags to relocations

Add direction flags to host1x relocations performed during job pinning.
These flags indicate the kinds of accesses that hardware is allowed to
perform on the relocated buffers.

Signed-off-by: Thierry Reding <treding@nvidia.com>


# 80327ce3 28-Oct-2019 Thierry Reding <treding@nvidia.com>

gpu: host1x: Overhaul host1x_bo_{pin,unpin}() API

The host1x_bo_pin() and host1x_bo_unpin() APIs are used to pin and unpin
buffers during host1x job submission. Pinning currently returns the SG
table and the DMA address (an IOVA if an IOMMU is used or a physical
address if no IOMMU is used) of the buffer. The DMA address is only used
for buffers that are relocated, whereas the host1x driver will map
gather buffers into its own IOVA space so that they can be processed by
the CDMA engine.

This approach has a couple of issues. On one hand it's not very useful
to return a DMA address for the buffer if host1x doesn't need it. On the
other hand, returning the SG table of the buffer is suboptimal because a
single SG table cannot be shared for multiple mappings, because the DMA
address is stored within the SG table, and the DMA address may be
different for different devices.

Subsequent patches will move the host1x driver over to the DMA API which
doesn't work with a single shared SG table. Fix this by returning a new
SG table each time a buffer is pinned. This allows the buffer to be
referenced by multiple jobs for different engines.

Change the prototypes of host1x_bo_pin() and host1x_bo_unpin() to take a
struct device *, specifying the device for which the buffer should be
pinned. This is required in order to be able to properly construct the
SG table. While at it, make host1x_bo_pin() return the SG table because
that allows us to return an ERR_PTR()-encoded error code if we need to,
or return NULL to signal that we don't need the SG table to be remapped
and can simply use the DMA address as-is. At the same time, returning
the DMA address is made optional because in the example of command
buffers, host1x doesn't need to know the DMA address since it will have
to create its own mapping anyway.

Signed-off-by: Thierry Reding <treding@nvidia.com>


# aacdf198 08-Feb-2019 Thierry Reding <treding@nvidia.com>

drm/tegra: Move IOMMU group into host1x client

Handling of the IOMMU group attachment is common to all clients, so move
the group into the client to simplify code.

Signed-off-by: Thierry Reding <treding@nvidia.com>


# caccddcf 18-Jun-2018 Thierry Reding <treding@nvidia.com>

gpu: host1x: Request channels for clients, not devices

A struct device doesn't carry much information that a channel might be
interested in, but the client very much does. Request channels for the
clients rather than their parent devices and store a pointer to them
in order to have that information available when needed.

Signed-off-by: Thierry Reding <treding@nvidia.com>


# 1e390478 05-Jun-2019 Thierry Reding <treding@nvidia.com>

gpu: host1x: Increase maximum DMA segment size

Recent versions of the DMA API debug code have started to warn about
violations of the maximum DMA segment size. This is because the segment
size defaults to 64 KiB, which can easily be exceeded in large buffer
allocations such as used in DRM/KMS for framebuffers.

Technically the Tegra SMMU and ARM SMMU don't have a maximum segment
size (they map individual pages irrespective of whether they are
contiguous or not), so the choice of 4 MiB is a bit arbitrary here. The
maximum segment size is a 32-bit unsigned integer, though, so we can't
set it to the correct maximum size, which would be the size of the
aperture.

Signed-off-by: Thierry Reding <treding@nvidia.com>


# 16216333 19-May-2019 Thomas Gleixner <tglx@linutronix.de>

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 1

Based on 2 normalized pattern(s):

this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license as published by
the free software foundation either version 2 of the license or at
your option any later version this program is distributed in the
hope that it will be useful but without any warranty without even
the implied warranty of merchantability or fitness for a particular
purpose see the gnu general public license for more details you
should have received a copy of the gnu general public license along
with this program if not write to the free software foundation inc
51 franklin street fifth floor boston ma 02110 1301 usa

this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license as published by
the free software foundation either version 2 of the license or at
your option [no]_[pad]_[ctrl] any later version this program is
distributed in the hope that it will be useful but without any
warranty without even the implied warranty of merchantability or
fitness for a particular purpose see the gnu general public license
for more details you should have received a copy of the gnu general
public license along with this program if not write to the free
software foundation inc 51 franklin street fifth floor boston ma
02110 1301 usa

extracted by the scancode license scanner the SPDX license identifier

GPL-2.0-or-later

has been chosen to replace the boilerplate/reference in 176 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jilayne Lovejoy <opensource@jilayne.com>
Reviewed-by: Steve Winslow <swinslow@gmail.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190519154040.652910950@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 326bbd79 16-May-2018 Thierry Reding <treding@nvidia.com>

gpu: host1x: Use not explicitly sized types

The number of words and the offset in a gather don't need to be
explicitly sized, so make them unsigned int instead.

Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>


# 06490bb9 16-May-2018 Thierry Reding <treding@nvidia.com>

gpu: host1x: Rename relocarray -> relocs for consistency

All other array variables use a plural, and this is the only one using
the *array suffix. This is confusing, so rename it for consistency.

Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>


# bf3d41cc 16-May-2018 Thierry Reding <treding@nvidia.com>

gpu: host1x: Store pointer to client in jobs

Rather than storing some identifier derived from the application
context that can't be used concretely anywhere, store a pointer to the
client directly so that accesses can be made directly through that
client object.

Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>


# 24c94e16 05-May-2018 Thierry Reding <treding@nvidia.com>

gpu: host1x: Remove wait check support

The job submission userspace ABI doesn't support this and there are no
plans to implement it, so all of this code is dead and can be removed.

Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>


# 617dd7cc 29-Aug-2017 Thierry Reding <treding@nvidia.com>

gpu: host1x: syncpt: Request syncpoints per client

Rather than request syncpoints for a struct device *, request them for a
struct host1x_client *. This is important because subsequent patches are
going to break the assumption that host1x will always be the parent for
devices requesting a syncpoint. It's also a more natural choice because
host1x clients are really the only ones that will know how to deal with
syncpoints.

Note that host1x clients are always guaranteed to be children of host1x,
regardless of their location in the device tree.

Signed-off-by: Thierry Reding <treding@nvidia.com>


# 8474b025 14-Jun-2017 Mikko Perttunen <mperttunen@nvidia.com>

gpu: host1x: Refactor channel allocation code

This is largely a rewrite of the Host1x channel allocation code, bringing
several changes:

- The previous code could deadlock due to an interaction
between the 'reflock' mutex and CDMA timeout handling.
This gets rid of the mutex.
- Support for more than 32 channels, required for Tegra186
- General refactoring, including better encapsulation
of channel ownership handling into channel.c

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>


# a2b78b0d 14-Jun-2017 Dmitry Osipenko <digetx@gmail.com>

gpu: host1x: Correct swapped arguments in the is_addr_reg() definition

Arguments of the .is_addr_reg() are swapped in the definition of the
function, that is quite confusing.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>


# 0f563a4b 14-Jun-2017 Dmitry Osipenko <digetx@gmail.com>

gpu: host1x: Forbid unrelated SETCLASS opcode in the firewall

Several channels could be made to write the same unit concurrently via
the SETCLASS opcode, trusting userspace is a bad idea. It should be
possible to drop the per-client channel reservation and add a per-unit
locking by inserting MLOCK's to the command stream to re-allow the
SETCLASS opcode, but it will be much more work. Let's forbid the
unit-unrelated class changes for now.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>


# d0fbbdff 14-Jun-2017 Dmitry Osipenko <digetx@gmail.com>

drm/tegra: Correct copying of waitchecks and disable them in the 'submit' IOCTL

The waitchecks along with multiple syncpoints per submit are not ready
for use yet, let's forbid them for now.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>


# 466749f1 09-Apr-2017 Thierry Reding <treding@nvidia.com>

gpu: host1x: Flesh out kerneldoc

Improve kerneldoc for the public parts of the host1x infrastructure in
preparation for adding driver-specific part to the GPU documentation.

Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Thierry Reding <treding@nvidia.com>


# 0ae797a8 14-Dec-2016 Arto Merilainen <amerilainen@nvidia.com>

drm/tegra: Add VIC support

This patch adds support for Video Image Compositor engine which
can be used for 2d operations.

Signed-off-by: Andrew Chew <achew@nvidia.com>
Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>


# 87904c3e 12-Aug-2016 Thierry Reding <treding@nvidia.com>

drm/tegra: dsi: Enhance runtime power management

The MIPI DSI output on Tegra SoCs requires some external logic to
calibrate the MIPI pads before a video signal can be transmitted. This
MIPI calibration logic requires to be powered on while the MIPI pads are
being used, which is currently done as part of the DSI driver's probe
implementation.

This is suboptimal because it will leave the MIPI calibration logic
powered up even if the DSI output is never used.

On Tegra114 and earlier this behaviour also causes the driver to hang
while trying to power up the MIPI calibration logic because the power
partition that contains the MIPI calibration logic will be powered on
by the display controller at output pipeline configuration time. Thus
the power up sequence for the MIPI calibration logic happens before
it's power partition is guaranteed to be enabled.

Fix this by splitting up the API into a request/free pair of functions
that manage the runtime dependency between the DSI and the calibration
modules (no registers are accessed) and a set of enable, calibrate and
disable functions that program the MIPI calibration logic at points in
time where the power partition is really enabled.

While at it, make sure that the runtime power management also works in
ganged mode, which is currently also broken.

Reported-by: Jonathan Hunter <jonathanh@nvidia.com>
Tested-by: Jonathan Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>


# b4a20144 28-Jan-2015 Thierry Reding <treding@nvidia.com>

gpu: host1x: Export host1x_syncpt_read()

This function is used to read the current value of the syncpt and is
useful in situations where drivers don't schedule work and wait for the
syncpoint to increment. One particular use-case is using the syncpoint
as a VBLANK counter.

Signed-off-by: Thierry Reding <treding@nvidia.com>


# f4c5cf88 18-Dec-2014 Thierry Reding <treding@nvidia.com>

gpu: host1x: Provide a proper struct bus_type

Previously the struct bus_type exported by the host1x infrastructure was
only a very basic skeleton. Turn that implementation into a more full-
fledged bus to support proper probe ordering and power management.

Note that the bus infrastructure needs to be available before any of the
drivers can be registered. This is automatically ensured if all drivers
are built as loadable modules (via symbol dependencies). If all drivers
are built-in there are no such guarantees and the link order determines
the initcall ordering. Adjust drivers/gpu/Makefile to make sure that the
host1x bus infrastructure is initialized prior to any of its users (only
drm/tegra currently).

v2: Fix building host1x and tegra-drm as modules
Reported-by: Dave Airlie <airlied@gmail.com>

Reviewed-by: Sean Paul <seanpaul@chromium.org>
Reviewed-by: Mark Zhang <markz@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>


# 536e1715 05-Nov-2014 Thierry Reding <treding@nvidia.com>

gpu: host1x: Call ->remove() only when a device is bound

When a driver's ->probe() function fails, the host1x bus must not call
its ->remove() function because the driver will already have cleaned up
in the error handling path in ->probe().

Signed-off-by: Thierry Reding <treding@nvidia.com>


# 961e3bea 10-Jun-2014 Thierry Reding <treding@nvidia.com>

drm/tegra: Make job submission 64-bit safe

Job submission currently relies on the fact that struct drm_tegra_reloc
and struct host1x_reloc are the same size and uses a simple call to the
copy_from_user() function to copy them to kernel space. This causes the
handle to be stored in the buffer object field, which then needs a cast
to a 32 bit integer to resolve it to a proper buffer object pointer and
store it back in the buffer object field.

On 64-bit architectures that will no longer work, since pointers are 64
bits wide whereas handles will remain 32 bits. This causes the sizes of
both structures to because different and copying will no longer work.

Fix this by adding a new function, host1x_reloc_get_user(), that copies
the structures field by field.

While at it, use substructures for the command and target buffers in
struct host1x_reloc for better readability. Also use unsized types to
make it more obvious that this isn't part of userspace ABI.

Signed-off-by: Thierry Reding <treding@nvidia.com>


# 64400c37 19-Feb-2014 Bryan Wu <pengw@nvidia.com>

gpu: host1x: export host1x_syncpt_incr_max() function

Tegra V4L2 camera driver needs this function to do frame capture.

Signed-off-by: Bryan Wu <pengw@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>


# 4de6a2d6 02-Sep-2013 Thierry Reding <treding@nvidia.com>

gpu: host1x: Add MIPI pad calibration support

This driver adds support to perform calibration of the MIPI pads for CSI
and DSI.

Signed-off-by: Thierry Reding <treding@nvidia.com>


# f5a954fe 14-Oct-2013 Arto Merilainen <amerilainen@nvidia.com>

gpu: host1x: Add syncpoint base support

This patch adds support for hardware syncpoint bases. This creates
a simple mechanism to stall the command FIFO until an operation is
completed.

Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>


# 8736fe81 14-Oct-2013 Arto Merilainen <amerilainen@nvidia.com>

gpu: host1x: Add 'flags' field to syncpt request

Functions host1x_syncpt_request() and _host1x_syncpt_alloc() have
been taking a separate boolean flag ('client_managed') for indicating
if the syncpoint value should be tracked by the host1x driver.

This patch converts the field into generic 'flags' field so that
we can easily add more information while requesting a syncpoint.
Clients are adapted to use the new interface accordingly.

Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>


# 5f60ed0d 28-Feb-2013 Thierry Reding <thierry.reding@avionic-design.de>

drm/tegra: Add 3D support

Initialize and power the 3D unit on Tegra20, Tegra30 and Tegra114 and
register a channel with the Tegra DRM driver so that the unit can be
used from userspace.

Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
Signed-off-by: Thierry Reding <treding@nvidia.com>


# 776dc384 14-Oct-2013 Thierry Reding <treding@nvidia.com>

drm/tegra: Move subdevice infrastructure to host1x

The Tegra DRM driver currently uses some infrastructure to defer the DRM
core initialization until all required devices have registered. The same
infrastructure can potentially be used by any other driver that requires
more than a single sub-device of the host1x module.

Make the infrastructure more generic and keep only the DRM specific code
in the DRM part of the driver. Eventually this will make it easy to move
the DRM driver part back to the DRM subsystem.

Signed-off-by: Thierry Reding <treding@nvidia.com>


# 35d747a8 24-Sep-2013 Thierry Reding <treding@nvidia.com>

gpu: host1x: Expose syncpt and channel functionality

Expose the buffer objects, syncpoint and channel functionality in the
public public header so that drivers can use them.

Signed-off-by: Thierry Reding <treding@nvidia.com>


# 53fa7f72 24-Sep-2013 Thierry Reding <treding@nvidia.com>

drm/tegra: Introduce tegra_drm_client structure

This structure derives from host1x_client. DRM-specific fields are moved
from host1x_client to this structure, so that host1x_client can remain
agnostic of DRM.

Signed-off-by: Thierry Reding <treding@nvidia.com>


# e1e90644 24-Sep-2013 Thierry Reding <treding@nvidia.com>

gpu: host1x: Make host1x header file public

In preparation to support host1x clients other than DRM, move this
header into a public location.

Signed-off-by: Thierry Reding <treding@nvidia.com>