History log of /linux-master/drivers/gpu/drm/nouveau/nvkm/subdev/instmem/gk20a.c
Revision Date Author Comments
# 46dec616 08-Dec-2023 Thierry Reding <treding@nvidia.com>

drm/nouveau: Fixup gk20a instobj hierarchy

Commit 12c9b05da918 ("drm/nouveau/imem: support allocations not
preserved across suspend") uses container_of() to cast from struct
nvkm_memory to struct nvkm_instobj, assuming that all instance objects
are derived from struct nvkm_instobj. For the gk20a family that's not
the case and they are derived from struct nvkm_memory instead. This
causes subtle data corruption (nvkm_instobj.preserve ends up mapping
onto gk20a_instobj.vaddr), which leads to a NULL pointer dereference in
gk20a_instobj_acquire_iommu() (and possibly elsewhere) and also prevents
suspend/resume from working.

Fix this by making struct gk20a_instobj derive from struct nvkm_instobj
instead.
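
For reference, a simplified sketch of the hierarchy problem (struct
layouts reduced to the members mentioned above, not the actual nouveau
definitions):

    /* container_of() from the embedded nvkm_memory back to an
     * nvkm_instobj is only valid if the object really embeds one. */
    struct nvkm_instobj {
            struct nvkm_memory memory;
            bool preserve;
    };

    /* Before the fix, gk20a objects embedded nvkm_memory directly... */
    struct gk20a_instobj {
            struct nvkm_memory memory;
            void *vaddr;
    };

    /* ...so this cast in the generic instmem code aliased 'preserve'
     * onto whatever followed 'memory' in gk20a_instobj (here, vaddr): */
    struct nvkm_instobj *iobj =
            container_of(memory, struct nvkm_instobj, memory);
    iobj->preserve = true;  /* silently corrupts gk20a_instobj.vaddr */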

Fixes: 12c9b05da918 ("drm/nouveau/imem: support allocations not preserved across suspend")
Reported-by: Jonathan Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231208104653.1917055-1-thierry.reding@gmail.com


# 624c6f78 18-Sep-2023 Ben Skeggs <bskeggs@redhat.com>

drm/nouveau/imem/tu102-: prepare for GSP-RM

- move suspend/resume paths to HW-specific code
- allow (future) RM paths to be based on nv50_instmem

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230918202149.4343-15-skeggsb@gmail.com


# 1369459b 23-Jan-2023 Jason Gunthorpe <jgg@ziepe.ca>

iommu: Add a gfp parameter to iommu_map()

The internal mechanisms support this, but instead of exposing the gfp to
the caller it is wrapped up inside iommu_map() and iommu_map_atomic().

Fix this instead of adding more variants for GFP_KERNEL_ACCOUNT.
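
As a minimal illustration of the resulting interface (example_map_one()
and its arguments are placeholders, not code from this series):

    #include <linux/iommu.h>

    static int example_map_one(struct iommu_domain *domain,
                               unsigned long iova, phys_addr_t paddr)
    {
            /* Sleepable callers pass GFP_KERNEL, atomic callers GFP_ATOMIC,
             * accounted ones GFP_KERNEL_ACCOUNT -- no extra variants needed. */
            return iommu_map(domain, iova, paddr, PAGE_SIZE,
                             IOMMU_READ | IOMMU_WRITE, GFP_KERNEL);
    }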

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: https://lore.kernel.org/r/1-v3-76b587fe28df+6e3-iommu_map_gfp_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>


# d9691a22 03-Dec-2020 Ben Skeggs <bskeggs@redhat.com>

drm/nouveau/instmem: switch to instanced constructor

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>


# e0ec8a4d 17-Aug-2020 Christoph Hellwig <hch@lst.de>

drm/nouveau/gk20a: stop setting DMA_ATTR_NON_CONSISTENT

DMA_ATTR_NON_CONSISTENT is a no-op except on PA-RISC and a few MIPS
configs, so don't set it in this ARM specific driver part.

Signed-off-by: Christoph Hellwig <hch@lst.de>


# 632b740c 31-Oct-2017 Ben Skeggs <bskeggs@redhat.com>

drm/nouveau/mmu: remove old vmm frontend

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>


# 9202d732 31-Oct-2017 Ben Skeggs <bskeggs@redhat.com>

drm/nouveau/imem/nv50-: use new interfaces for vmm operations

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>


# f9463a4b 31-Oct-2017 Ben Skeggs <bskeggs@redhat.com>

drm/nouveau/mmu: implement new vmm frontend

These are the new privileged interfaces to the VMM backends, and they expose
some functionality that wasn't previously available.

It's now possible to allocate a chunk of address-space (even all of it),
without causing page tables to be allocated up-front, and then map into
it at arbitrary locations. This is the basic primitive used to support
features such as sparse mapping, or to allow userspace control over its
own address-space, or HMM (where the GPU driver isn't in control of the
address-space layout).

Rather than being tied to a subtle combination of memory object and VMA
properties, arguments that control map flags (ro, kind, etc) are passed
explicitly at map time.

The compatibility hacks that implemented the old frontend on top of the
new driver backends have been replaced with a similar layer that
implements the old frontend's interfaces on top of the new frontend.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>


# bd275f1d 31-Oct-2017 Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: wrap nvkm_mem objects in nvkm_memory interfaces

This is a transition step, to enable finer-grained commits while
transitioning to new MMU interfaces.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>


# 19a82e49 31-Oct-2017 Ben Skeggs <bskeggs@redhat.com>

drm/nouveau/core/memory: change map interface to support upcoming mmu changes

Map flags (access, kind, etc) are currently defined in either the VMA,
or the memory object, which turns out to not be ideal for things like
suballocated buffers, etc.

These will become per-map flags instead, so we need to support passing
these arguments in nvkm_memory_map().

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>


# 9ce523cc 31-Oct-2017 Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: separate buffer object backing memory from nvkm structures

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>


# 71370e62 31-Oct-2017 Ben Skeggs <bskeggs@redhat.com>

drm/nouveau/imem: remove now-unused wrapper for backend objects

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>


# 07bbc1c5 31-Oct-2017 Ben Skeggs <bskeggs@redhat.com>

drm/nouveau/core/memory: split info pointers from accessor pointers

The accessor functions can change as a result of acquire()/release() calls,
and are protected by any refcounting done there.

Other functions must remain constant, as they can be called any time.
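
Schematically, the split looks roughly like this (struct and member
names approximate, not copied from the tree):

    struct nvkm_memory {
            const struct nvkm_memory_func *func; /* constant: size, target, map, ... */
            const struct nvkm_memory_ptrs *ptrs; /* accessors; may be swapped
                                                  * by acquire()/release() */
    };

    struct nvkm_memory_ptrs {
            u32  (*rd32)(struct nvkm_memory *, u64 offset);
            void (*wr32)(struct nvkm_memory *, u64 offset, u32 data);
    };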

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>


# e5ffa727 30-Jan-2017 Thierry Reding <treding@nvidia.com>

drm/nouveau/imem/gk20a: Turn instmem lock into mutex

The gk20a implementation of instance memory uses vmap()/vunmap() to map
memory regions into the kernel's virtual address space. These functions
may sleep, so protecting them with a spinlock is not safe and triggers a
warning if the DEBUG_ATOMIC_SLEEP Kconfig option is enabled. Fix this by
using a mutex instead.
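
The shape of the fix, as a sketch (the lock name, surrounding fields and
page protection are illustrative):

    mutex_lock(&imem->lock);                 /* was: a spinlock */
    vaddr = vmap(pages, npages, VM_MAP,
                 pgprot_writecombine(PAGE_KERNEL));  /* may sleep */
    mutex_unlock(&imem->lock);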

Signed-off-by: Thierry Reding <treding@nvidia.com>
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Tested-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>


# d2ee3605 09-May-2016 Ben Skeggs <bskeggs@redhat.com>

drm/nouveau/core/memory: distinguish between coherent/non-coherent targets

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>


# 134fdc1a 03-Oct-2015 Ben Skeggs <bskeggs@redhat.com>

drm/nouveau/core/mm: replace region list with next pointer

We never need a doubly-linked list here, and as there are generally a
large number of these objects, replace it with a singly-linked list in
order to save some memory.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>


# 00085f1e 03-Aug-2016 Krzysztof Kozlowski <krzk@kernel.org>

dma-mapping: use unsigned long for dma_attrs

The dma-mapping core and the implementations do not change the DMA
attributes passed by pointer. Thus the pointer can point to const data.
However, the attributes do not have to be a bitfield; an unsigned long
will do just fine:

1. It is simpler, both when reading the code and when setting
attributes. Instead of initializing local attributes on the stack and
passing a pointer to them to dma_set_attr(), just set the bits.

2. It is safer and effectively const-correct, because the attributes
are passed by value.

Semantic patches for this change (at least most of them):

virtual patch
virtual context

@r@
identifier f, attrs;

@@
f(...,
- struct dma_attrs *attrs
+ unsigned long attrs
, ...)
{
...
}

@@
identifier r.f;
@@
f(...,
- NULL
+ 0
)

and

// Options: --all-includes
virtual patch
virtual context

@r@
identifier f, attrs;
type t;

@@
t f(..., struct dma_attrs *attrs);

@@
identifier r.f;
@@
f(...,
- NULL
+ 0
)
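
For a typical caller the conversion therefore looks like this (dev, size
and handle are placeholders):

    /* Before: attributes carried in a struct dma_attrs on the stack. */
    DEFINE_DMA_ATTRS(attrs);
    dma_set_attr(DMA_ATTR_WRITE_COMBINE, &attrs);
    buf = dma_alloc_attrs(dev, size, &handle, GFP_KERNEL, &attrs);

    /* After: attributes are just bits in an unsigned long. */
    buf = dma_alloc_attrs(dev, size, &handle, GFP_KERNEL,
                          DMA_ATTR_WRITE_COMBINE);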

Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.com
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
Acked-by: Mark Salter <msalter@redhat.com> [c6x]
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> [cris]
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> [drm]
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
Acked-by: Fabien Dessenne <fabien.dessenne@st.com> [bdisp]
Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com> [vb2-core]
Acked-by: David Vrabel <david.vrabel@citrix.com> [xen]
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [xen swiotlb]
Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon]
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k]
Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390]
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> [avr32]
Acked-by: Vineet Gupta <vgupta@synopsys.com> [arc]
Acked-by: Robin Murphy <robin.murphy@arm.com> [arm64 and dma-iommu]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# e02d586d 03-Mar-2016 Alexandre Courbot <acourbot@nvidia.com>

drm/nouveau/instmem/gk20a: add write barrier when releasing DMA object

When using the DMA-API for instmem, we may obtain a write-combined
mapping. For such cases, add a write barrier in
gk20a_instobj_release_dma() to make sure that all writes have reached
memory at this time.
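
The change amounts to a single barrier; roughly (placement paraphrased,
not the verbatim diff):

    /* in gk20a_instobj_release_dma(): drain the CPU's write-combining
     * buffers so every write made through the mapping is visible to the
     * GPU before the object is handed back. */
    wmb();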

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>


# b306712d 11-Nov-2015 Alexandre Courbot <acourbot@nvidia.com>

drm/nouveau/instmem/gk20a: use DMA API CPU mapping

Commit 69c4938249fb ("drm/nouveau/instmem/gk20a: use direct CPU access")
tried to be smart while using the DMA-API by managing the CPU mappings of
buffers allocated with the DMA-API by itself. In doing so, it relied
on dma_to_phys() which is an architecture-private function not
available everywhere. This broke the build on several architectures.

Since there is no reliable and portable way to obtain the physical
address of a DMA-API buffer, stop trying to be smart and just use the
CPU mapping that the DMA-API can provide. This means that buffers will
be CPU-mapped for all their life as opposed to when we need them, but
anyway using the DMA-API here is a fallback for when no IOMMU is
available so we should not expect optimal behavior.

This makes the IOMMU and DMA-API implementations of instmem diverge
enough that we should maybe put them into separate files...

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>


# 338840ee 09-Nov-2015 Alexandre Courbot <acourbot@nvidia.com>

drm/nouveau/instmem/gk20a: fix race conditions

The LRU list used for recycling CPU mappings handled concurrency very
poorly. For instance, if an instobj was acquired twice before being
released once, it would end up on the LRU list even though a client was
still accessing it.

This patch fixes this by properly counting how many clients are
currently using a given instobj.

While at it, we also raise errors when inconsistencies are detected, and
factor out some code.
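
A sketch of the accounting this introduces (field names illustrative,
locking and error handling trimmed):

    /* acquire(): an instobj in active use must never sit on the LRU. */
    if (node->use_count++ == 0)
            list_del_init(&node->lru_node);   /* off the LRU, if queued */

    /* release(): only the last user may offer the mapping for recycling. */
    if (WARN_ON(node->use_count == 0))
            return;
    if (--node->use_count == 0)
            list_add_tail(&node->lru_node, &imem->lru);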

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>


# 10855aeb 09-Nov-2015 Dave Airlie <airlied@redhat.com>

drm/nouveau: fix build failures on all non ARM.

gk20a is an ARM-only GPU, so we can just do the correct thing on ARM
and fail on other architectures. The other option was to key off the
SWIOTLB define, which implies that phys_to_page exists, but this seems
clearer.

Signed-off-by: Dave Airlie <airlied@redhat.com>


# 68b56653 04-Sep-2015 Alexandre Courbot <acourbot@nvidia.com>

drm/nouveau/instmem/gk20a: make use of the IOMMU bit

Use the IOMMU bit specified in platform data instead of hardcoding it to
the bit used by current Tegra GPUs.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>


# 69c49382 04-Sep-2015 Alexandre Courbot <acourbot@nvidia.com>

drm/nouveau/instmem/gk20a: use direct CPU access

The Great Nouveau Refactoring Take II brought us a lot of goodness,
including acquire/release methods that are called before and after an
instobj is modified. These functions can be used as synchronization
points to manage CPU/GPU coherency if we modify an instobj using the
CPU.

This patch replaces the legacy and slow PRAMIN access for gk20a instmem
with CPU mappings and writes. An LRU list is used to unmap unused
mappings after a certain threshold (currently 1MB) of mapped instobjs is
reached. This allows mappings to be reused most of the time.

Accessing instobjs using the CPU requires maintaining the GPU L2 cache,
which we do in the acquire/release functions. This triggers a lot of L2
flushes/invalidates, but most of them are performed on an empty cache
(and thus return immediately), and overall context setup performance
greatly benefits from this (from 250ms to 160ms on Jetson TK1 for a
simple libdrm program).

Making L2 management more explicit should allow us to grab some more
performance in the future.
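
A rough sketch of the access pattern described above (all names below
are illustrative rather than lifted from the patch):

    /* acquire(): flush the GPU L2 so CPU reads see up-to-date data,
     * then reuse (or create) a kernel mapping for the object. */
    gpu_l2_flush();                        /* illustrative; an LTC flush in nvkm */
    vaddr = get_or_create_vmap(node);

    /* ... CPU reads/writes through vaddr ... */

    /* release(): invalidate the GPU L2 so it refetches what the CPU wrote,
     * and let cold mappings age out once ~1MB worth of instobjs is mapped. */
    gpu_l2_invalidate();                   /* illustrative */
    lru_recycle_if_over_threshold(imem);   /* illustrative */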

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>


# 43a70661 19-Aug-2015 Ben Skeggs <bskeggs@redhat.com>

drm/nouveau/tegra: merge platform setup from nouveau drm

The copyright header in nvkm/engine/device/platform.c has been replaced
with the NVIDIA one from drm/nouveau_platform.c, as most of the actual
code is now theirs.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>


# 26c9e8ef 19-Aug-2015 Ben Skeggs <bskeggs@redhat.com>

drm/nouveau/device: remove pci/platform_device from common struct

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>


# b7a2bc18 19-Aug-2015 Ben Skeggs <bskeggs@redhat.com>

drm/nouveau/imem: convert to new-style nvkm_subdev

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>


# d8e83994 19-Aug-2015 Ben Skeggs <bskeggs@redhat.com>

drm/nouveau/imem: improve management of instance memory

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>


# 47b2505e 19-Aug-2015 Ben Skeggs <bskeggs@redhat.com>

drm/nouveau/platform: remove subclassing of nvkm_device

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>


# 00c55507 19-Aug-2015 Ben Skeggs <bskeggs@redhat.com>

drm/nouveau/imem: switch to subdev printk macros

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>


# d5c5bcf6 19-Aug-2015 Ben Skeggs <bskeggs@redhat.com>

drm/nouveau/imem: switch to device pri macros

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>


# c44c06ae 19-Aug-2015 Ben Skeggs <bskeggs@redhat.com>

drm/nouveau/imem: cosmetic changes

This is purely preparation for upcoming commits, there should be no
code changes here.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>


# 9ace404b 19-Aug-2015 Ben Skeggs <bskeggs@redhat.com>

drm/nouveau/device: include core/device.h automatically for subdevs/engines

Pretty much every subdev/engine is going to need access to nvkm_device
shortly to touch registers and/or output messages.

The odd placement of the includes is necessary to work around some
inter-dependencies that currently exist. This will be fixed later.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>


# df16896b 10-Mar-2015 Alexandre Courbot <acourbot@nvidia.com>

drm/nouveau/instmem/gk20a: fix crash during error path

If a memory allocation fails when using the DMA allocator,
gk20a_instobj_dtor_dma() will be called on the failed instmem object.
At this time, node->handle might not be NULL despite the call to
dma_alloc_attrs() having failed. node->cpuaddr is the right member to
check for such a failure, so use it instead.
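
The gist of the resulting check (simplified; dev, size and attrs stand
in for the object's real bookkeeping):

    /* dma_alloc_attrs() signals failure through the returned CPU address,
     * so that is what the destructor has to test before freeing. */
    if (unlikely(!node->cpuaddr))
            return;      /* allocation never succeeded; nothing to free */

    dma_free_attrs(dev, size, node->cpuaddr, node->handle, attrs);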

Reported-by: Vince Hsu <vinceh@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>


# a7f6da6e 20-Feb-2015 Alexandre Courbot <acourbot@nvidia.com>

drm/nouveau/instmem/gk20a: add IOMMU support

Let GK20A's instmem take advantage of the IOMMU if it is present. Having
an IOMMU means that instmem is no longer allocated using the DMA API,
but instead obtained through page_alloc and made contiguous to the GPU
by IOMMU mappings.
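
Schematically, the allocation then looks something like this
(simplified, error handling omitted; the modern gfp-taking iommu_map()
signature is assumed):

    /* Grab ordinary, possibly scattered pages... */
    for (i = 0; i < npages; i++)
            pages[i] = alloc_page(GFP_KERNEL);

    /* ...and stitch them into one GPU-contiguous range via the IOMMU. */
    for (i = 0; i < npages; i++)
            iommu_map(domain, base + i * PAGE_SIZE, page_to_phys(pages[i]),
                      PAGE_SIZE, IOMMU_READ | IOMMU_WRITE, GFP_KERNEL);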

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>


# 5dc240bc 20-Feb-2015 Alexandre Courbot <acourbot@nvidia.com>

drm/nouveau/instmem/gk20a: use DMA attributes

instmem for GK20A is allocated using dma_alloc_coherent(), which
provides us with a coherent CPU mapping that we never use because
instmem objects are accessed through PRAMIN. Switch to
dma_alloc_attrs() which gives us the option to dismiss that CPU mapping
and free up some CPU virtual space.
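
The attribute that requests exactly this behaviour is
DMA_ATTR_NO_KERNEL_MAPPING; a minimal sketch using the struct dma_attrs
interface of that era (dev, size and handle are placeholders):

    DEFINE_DMA_ATTRS(attrs);
    dma_set_attr(DMA_ATTR_NO_KERNEL_MAPPING, &attrs);

    /* No kernel virtual mapping is created; the return value is an opaque
     * cookie for dma_free_attrs()/dma_mmap_attrs(), which is fine here
     * because the object is reached through PRAMIN, not the CPU. */
    cookie = dma_alloc_attrs(dev, size, &handle, GFP_KERNEL, &attrs);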

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>


# a6ff85d3 20-Feb-2015 Alexandre Courbot <acourbot@nvidia.com>

drm/nouveau/instmem/gk20a: move memory allocation to instmem

GK20A does not have dedicated RAM, so having a RAM device for it does
not make sense. Move the contiguous physical memory allocation to
instmem.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>