History log of /linux-master/arch/arm64/mm/dma-mapping.c
Revision Date Author Comments
# 4720287c 07-Dec-2023 Jason Gunthorpe <jgg@ziepe.ca>

iommu: Remove struct iommu_ops *iommu from arch_setup_dma_ops()

This is not being used to pass ops, it is just a way to tell if an
iommu driver was probed. These days this can be detected directly via
device_iommu_mapped(). Call device_iommu_mapped() in the two places that
need to check it and remove the iommu parameter everywhere.

Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Moritz Fischer <mdf@kernel.org>
Acked-by: Christoph Hellwig <hch@lst.de>
Acked-by: Rob Herring <robh@kernel.org>
Tested-by: Hector Martin <marcan@marcan.st>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/1-v2-16e4def25ebb+820-iommu_fwspec_p1_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>


# 7bd6680b 30-Mar-2023 Will Deacon <will@kernel.org>

Revert "Revert "arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()""

This reverts commit b7d9aae404841d9999b7476170867ae441e948d2.

With the Qualcomm remoteproc driver now modified to use a carveout
memory region in 57f72170a2b2 ("remoteproc: qcom_q6v5_mss: Use a
carveout to authenticate modem headers"), we can reinstate c44094eee32f
("arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()")
which relaxes the arm64 implementation of arch_dma_prep_coherent() to
perform only a data cache clean operation, rather than a
clean-and-invalidate.

Signed-off-by: Will Deacon <will@kernel.org>


# b7d9aae4 06-Dec-2022 Will Deacon <will@kernel.org>

Revert "arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()"

This reverts commit c44094eee32f32f175aadc0efcac449d99b1bbf7.

Although the semantics of the DMA API require only a clean operation
here, it turns out that the Qualcomm 'qcom_q6v5_mss' remoteproc driver
(ab)uses the DMA API for transferring the modem firmware to the secure
world via calls to Trustzone [1].

Once the firmware buffer has changed hands, _any_ access from the
non-secure side (i.e. Linux) will be detected on the bus and result in a
full system reset [2]. Although this is possible even with this revert
in place (due to speculative reads via the cacheable linear alias of
memory), anecdotally the problem occurs considerably more frequently
when the lines have not been invalidated, assumedly due to some
micro-architectural interactions with the cache hierarchy.

Revert the offending change for now, along with a comment, so that the
Qualcomm developers have time to fix the driver [3] to use a firmware
buffer which does not have a cacheable alias in the linear map.

Link: https://lore.kernel.org/r/20221114110329.68413-1-manivannan.sadhasivam@linaro.org [1]
Link: https://lore.kernel.org/r/CAMi1Hd3H2k1J8hJ6e-Miy5+nVDNzv6qQ3nN-9929B0GbHJkXEg@mail.gmail.com/ [2]
Link: https://lore.kernel.org/r/20221206092152.GD15486@thinkpad [2]
Reported-by: Amit Pundir <amit.pundir@linaro.org>
Reported-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Thorsten Leemhuis <regressions@leemhuis.info>
Cc: Sibi Sankar <quic_sibis@quicinc.com>
Signed-off-by: Will Deacon <will@kernel.org>
Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20221206103403.646-1-will@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# fa49364c 16-Aug-2022 Robin Murphy <robin.murphy@arm.com>

iommu/dma: Move public interfaces to linux/iommu.h

The iommu-dma layer is now mostly encapsulated by iommu_dma_ops, with
only a couple more public interfaces left pertaining to MSI integration.
Since these depend on the main IOMMU API header anyway, move their
declarations there, taking the opportunity to update the half-baked
comments to proper kerneldoc along the way.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/9cd99738f52094e6bed44bfee03fa4f288d20695.1660668998.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>


# c44094ee 23-Aug-2022 Will Deacon <will@kernel.org>

arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()

arch_dma_prep_coherent() is called when preparing a non-cacheable region
for a consistent DMA buffer allocation. Since the buffer pages may
previously have been written via a cacheable mapping and consequently
allocated as dirty cachelines, the purpose of this function is to remove
these dirty lines from the cache, writing them back so that the
non-coherent device is able to see them.

On arm64, this operation can be achieved with a clean to the point of
coherency; a subsequent invalidation is not required and serves little
purpose in the presence of a cacheable alias (e.g. the linear map),
since clean lines can be speculatively fetched back into the cache after
the invalidation operation has completed.

Relax the cache maintenance in arch_dma_prep_coherent() so that only a
clean, and not a clean-and-invalidate operation is performed.

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20220823122111.17439-1-will@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 7eacf185 10-Jun-2022 Will Deacon <will@kernel.org>

arm64: mm: Remove assembly DMA cache maintenance wrappers

Remove the __dma_{flush,map,unmap}_area assembly wrappers and call the
appropriate cache maintenance functions directly from the DMA mapping
callbacks.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220610151228.4562-3-will@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>


# 9bf22421 02-Jun-2022 Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

arm/xen: Introduce xen_setup_dma_ops()

This patch introduces new helper and places it in new header.
The helper's purpose is to assign any Xen specific DMA ops in
a single place. For now, we deal with xen-swiotlb DMA ops only.
The one of the subsequent commits in current series will add
xen-grant DMA ops case.

Also re-use the xen_swiotlb_detect() check on Arm32.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
[For arm64]
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/1654197833-25362-2-git-send-email-olekstysh@gmail.com
Signed-off-by: Juergen Gross <jgross@suse.com>


# ac6d7046 18-Jun-2021 Jean-Philippe Brucker <jean-philippe@linaro.org>

iommu/dma: Pass address limit rather than size to iommu_setup_dma_ops()

Passing a 64-bit address width to iommu_setup_dma_ops() is valid on
virtual platforms, but isn't currently possible. The overflow check in
iommu_dma_init_domain() prevents this even when @dma_base isn't 0. Pass
a limit address instead of a size, so callers don't have to fake a size
to work around the check.

The base and limit parameters are being phased out, because:
* they are redundant for x86 callers. dma-iommu already reserves the
first page, and the upper limit is already in domain->geometry.
* they can now be obtained from dev->dma_range_map on Arm.
But removing them on Arm isn't completely straightforward so is left for
future work. As an intermediate step, simplify the x86 callers by
passing dummy limits.

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20210618152059.1194210-5-jean-philippe@linaro.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>


# f5079a9a 19-Mar-2021 Stefano Stabellini <stefano.stabellini@xilinx.com>

xen/arm: introduce XENFEAT_direct_mapped and XENFEAT_not_direct_mapped

Newer Xen versions expose two Xen feature flags to tell us if the domain
is directly mapped or not. Only when a domain is directly mapped it
makes sense to enable swiotlb-xen on ARM.

Introduce a function on ARM to check the new Xen feature flags and also
to deal with the legacy case. Call the function xen_swiotlb_detect.

Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20210319200140.12512-1-sstabellini@kernel.org
Signed-off-by: Juergen Gross <jgross@suse.com>


# 9f4df96b 22-Sep-2020 Christoph Hellwig <hch@lst.de>

dma-mapping: merge <linux/dma-noncoherent.h> into <linux/dma-map-ops.h>

Move more nitty gritty DMA implementation details into the common
internal header.

Signed-off-by: Christoph Hellwig <hch@lst.de>


# 0a0f0d8b 22-Sep-2020 Christoph Hellwig <hch@lst.de>

dma-mapping: split <linux/dma-mapping.h>

Split out all the bits that are purely for dma_map_ops implementations
and related code into a new <linux/dma-map-ops.h> header so that they
don't get pulled into all the drivers. That also means the architecture
specific <asm/dma-mapping.h> is not pulled in by <linux/dma-mapping.h>
any more, which leads to a missing includes that were pulled in by the
x86 or arm versions in a few not overly portable drivers.

Signed-off-by: Christoph Hellwig <hch@lst.de>


# 56e35f9c 07-Nov-2019 Christoph Hellwig <hch@lst.de>

dma-mapping: drop the dev argument to arch_sync_dma_for_*

These are pure cache maintainance routines, so drop the unused
struct device argument.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 5489c8e0 24-Jul-2019 Christoph Hellwig <hch@lst.de>

arm64: use asm-generic/dma-mapping.h

Now that the Xen special cases are gone nothing worth mentioning is
left in the arm64 <asm/dma-mapping.h> file, so switch to use the
asm-generic version instead.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Will Deacon <will@kernel.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


# 0e0d26e7 24-Jul-2019 Christoph Hellwig <hch@lst.de>

xen/arm: remove xen_dma_ops

arm and arm64 can just use xen_swiotlb_dma_ops directly like x86, no
need for a pointer indirection.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


# 8e3a68fb 02-Aug-2019 Christoph Hellwig <hch@lst.de>

dma-mapping: make dma_atomic_pool_init self-contained

The memory allocated for the atomic pool needs to have the same
mapping attributes that we use for remapping, so use
pgprot_dmacoherent instead of open coding it. Also deduct a
suitable zone to allocate the memory from based on the presence
of the DMA zones.

Signed-off-by: Christoph Hellwig <hch@lst.de>


# 419e2f18 26-Aug-2019 Christoph Hellwig <hch@lst.de>

dma-mapping: remove arch_dma_mmap_pgprot

arch_dma_mmap_pgprot is used for two things:

1) to override the "normal" uncached page attributes for mapping
memory coherent to devices that can't snoop the CPU caches
2) to provide the special DMA_ATTR_WRITE_COMBINE semantics on older
arm systems and some mips platforms

Replace one with the pgprot_dmacoherent macro that is already provided
by arm and much simpler to use, and lift the DMA_ATTR_WRITE_COMBINE
handling to common code with an explicit arch opt-in.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k
Acked-by: Paul Burton <paul.burton@mips.com> # mips


# 33dcb37c 26-Jul-2019 Christoph Hellwig <hch@lst.de>

dma-mapping: fix page attributes for dma_mmap_*

All the way back to introducing dma_common_mmap we've defaulted to mark
the pages as uncached. But this is wrong for DMA coherent devices.
Later on DMA_ATTR_WRITE_COMBINE also got incorrect treatment as that
flag is only treated special on the alloc side for non-coherent devices.

Introduce a new dma_pgprot helper that deals with the check for coherent
devices so that only the remapping cases ever reach arch_dma_mmap_pgprot
and we thus ensure no aliasing of page attributes happens, which makes
the powerpc version of arch_dma_mmap_pgprot obsolete and simplifies the
remaining ones.

Note that this means arch_dma_mmap_pgprot is a bit misnamed now, but
we'll phase it out soon.

Fixes: 64ccc9c033c6 ("common: dma-mapping: add support for generic dma_mmap_* calls")
Reported-by: Shawn Anastasio <shawn@anastas.io>
Reported-by: Gavin Li <git@thegavinli.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Catalin Marinas <catalin.marinas@arm.com> # arm64


# caab277b 02-Jun-2019 Thomas Gleixner <tglx@linutronix.de>

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234

Based on 1 normalized pattern(s):

this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation this program is
distributed in the hope that it will be useful but without any
warranty without even the implied warranty of merchantability or
fitness for a particular purpose see the gnu general public license
for more details you should have received a copy of the gnu general
public license along with this program if not see http www gnu org
licenses

extracted by the scancode license scanner the SPDX license identifier

GPL-2.0-only

has been chosen to replace the boilerplate/reference in 503 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Enrico Weigelt <info@metux.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190602204653.811534538@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 8f5c9037 14-Jun-2019 Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>

arm64/mm: Correct the cache line size warning with non coherent device

If the cache line size is greater than ARCH_DMA_MINALIGN (128),
the warning shows and it's tainted as TAINT_CPU_OUT_OF_SPEC.

However, it's not good because as discussed in the thread [1], the cpu
cache line size will be problem only on non-coherent devices.

Since the coherent flag is already introduced to struct device,
show the warning only if the device is non-coherent device and
ARCH_DMA_MINALIGN is smaller than the cpu cache size.

[1] https://lore.kernel.org/linux-arm-kernel/20180514145703.celnlobzn3uh5tc2@localhost/

Signed-off-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Tested-by: Zhang Lei <zhang.lei@jp.fujitsu.com>
[catalin.marinas@arm.com: removed 'if' block for WARN_TAINT]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# a84cc69e 20-May-2019 Christoph Hellwig <hch@lst.de>

arm64: trim includes in dma-mapping.c

With most of the previous functionality now elsewhere a lot of the
headers included in this file are not needed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>


# b5f75a36 20-May-2019 Christoph Hellwig <hch@lst.de>

arm64: switch copyright boilerplace to SPDX in dma-mapping.c

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>


# 06d60728 20-May-2019 Christoph Hellwig <hch@lst.de>

iommu/dma: move the arm64 wrappers to common code

There is nothing really arm64 specific in the iommu_dma_ops
implementation, so move it to dma-iommu.c and keep a lot of symbols
self-contained. Note the implementation does depend on the
DMA_DIRECT_REMAP infrastructure for now, so we'll have to make the
DMA_IOMMU support depend on it, but this will be relaxed soon.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>


# af751d43 20-May-2019 Christoph Hellwig <hch@lst.de>

iommu/dma: Remove the flush_page callback

We now have a arch_dma_prep_coherent architecture hook that is used
for the generic DMA remap allocator, and we should use the same
interface for the dma-iommu code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>


# a98d9ae9 30-Apr-2019 Christoph Hellwig <hch@lst.de>

arm64/iommu: handle non-remapped addresses in ->mmap and ->get_sgtable

DMA allocations that can't sleep may return non-remapped addresses, but
we do not properly handle them in the mmap and get_sgtable methods.
Resolve non-vmalloc addresses using virt_to_page to handle this corner
case.

Cc: <stable@vger.kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# 60d8cd57 16-Jan-2019 Christoph Hellwig <hch@lst.de>

arm64/xen: fix xen-swiotlb cache flushing

Xen-swiotlb hooks into the arm/arm64 arch code through a copy of the DMA
DMA mapping operations stored in the struct device arch data.

Switching arm64 to use the direct calls for the merged DMA direct /
swiotlb code broke this scheme. Replace the indirect calls with
direct-calls in xen-swiotlb as well to fix this problem.

Fixes: 356da6d0cde3 ("dma-mapping: bypass indirect calls for dma-direct")
Reported-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


# 356da6d0 06-Dec-2018 Christoph Hellwig <hch@lst.de>

dma-mapping: bypass indirect calls for dma-direct

Avoid expensive indirect calls in the fast path DMA mapping
operations by directly calling the dma_direct_* ops if we are using
the directly mapped DMA operations.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Tested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Tested-by: Tony Luck <tony.luck@intel.com>


# 55897af6 03-Dec-2018 Christoph Hellwig <hch@lst.de>

dma-direct: merge swiotlb_dma_ops into the dma_direct code

While the dma-direct code is (relatively) clean and simple we actually
have to use the swiotlb ops for the mapping on many architectures due
to devices with addressing limits. Instead of keeping two
implementations around this commit allows the dma-direct
implementation to call the swiotlb bounce buffering functions and
thus share the guts of the mapping implementation. This also
simplified the dma-mapping setup on a few architectures where we
don't have to differenciate which implementation to use.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Tested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Tested-by: Tony Luck <tony.luck@intel.com>


# 90ac706e 06-Dec-2018 Robin Murphy <robin.murphy@arm.com>

dma-mapping: factor out dummy DMA ops

The dummy DMA ops are currently used by arm64 for any device which has
an invalid ACPI description and is thus barred from using DMA due to not
knowing whether is is cache-coherent or not. Factor these out into
general dma-mapping code so that they can be referenced from other
common code paths. In the process, we can prune all the optional
callbacks which just do the same thing as the default behaviour, and
fill in .map_resource for completeness.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
[hch: moved to a separate source file]
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Tested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Tested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>


# 3238c359 10-Dec-2018 Robin Murphy <robin.murphy@arm.com>

arm64: dma-mapping: Fix FORCE_CONTIGUOUS buffer clearing

We need to invalidate the caches *before* clearing the buffer via the
non-cacheable alias, else in the worst case __dma_flush_area() may
write back dirty lines over the top of our nice new zeros.

Fixes: dd65a941f6ba ("arm64: dma-mapping: clear buffers allocated with FORCE_CONTIGUOUS flag")
Cc: <stable@vger.kernel.org> # 4.18.x-
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# cad34be7 21-Nov-2018 Christoph Hellwig <hch@lst.de>

iommu/dma-iommu: remove the mapping_error dma_map_ops method

Return DMA_MAPPING_ERROR instead of 0 on a dma mapping failure and let
the core dma-mapping code handle the rest.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>


# 52f0b3ee 21-Nov-2018 Christoph Hellwig <hch@lst.de>

arm64: remove the dummy_dma_ops mapping_error method

Just return DMA_MAPPING_ERROR from __dummy_map_page and let the core
dma-mapping code handle the rest.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>


# 0c3b3171 04-Nov-2018 Christoph Hellwig <hch@lst.de>

dma-mapping: move the arm64 noncoherent alloc/free support to common code

The arm64 codebase to implement coherent dma allocation for architectures
with non-coherent DMA is a good start for a generic implementation, given
that is uses the generic remap helpers, provides the atomic pool for
allocations that can't sleep and still is realtively simple and well
tested. Move it to kernel/dma and allow architectures to opt into it
using a config symbol. Architectures just need to provide a new
arch_dma_prep_coherent helper to writeback an invalidate the caches
for any memory that gets remapped for uncached access.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>


# f6271755 30-Oct-2018 Christoph Hellwig <hch@lst.de>

arm64: fix warnings without CONFIG_IOMMU_DMA

__swiotlb_get_sgtable_page and __swiotlb_mmap_pfn are not only misnamed
but also only used if CONFIG_IOMMU_DMA is set. Just add a simple ifdef
for now, given that we plan to remove them entirely for the next merge
window.

Reported-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Will Deacon <will.deacon@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>


# 57c8a661 30-Oct-2018 Mike Rapoport <rppt@linux.vnet.ibm.com>

mm: remove include/linux/bootmem.h

Move remaining definitions and declarations from include/linux/bootmem.h
into include/linux/memblock.h and remove the redundant header.

The includes were replaced with the semantic patch below and then
semi-automated removal of duplicated '#include <linux/memblock.h>

@@
@@
- #include <linux/bootmem.h>
+ #include <linux/memblock.h>

[sfr@canb.auug.org.au: dma-direct: fix up for the removal of linux/bootmem.h]
Link: http://lkml.kernel.org/r/20181002185342.133d1680@canb.auug.org.au
[sfr@canb.auug.org.au: powerpc: fix up for removal of linux/bootmem.h]
Link: http://lkml.kernel.org/r/20181005161406.73ef8727@canb.auug.org.au
[sfr@canb.auug.org.au: x86/kaslr, ACPI/NUMA: fix for linux/bootmem.h removal]
Link: http://lkml.kernel.org/r/20181008190341.5e396491@canb.auug.org.au
Link: http://lkml.kernel.org/r/1536927045-23536-30-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Serge Semin <fancer.lancer@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 886643b7 08-Oct-2018 Christoph Hellwig <hch@lst.de>

arm64: use the generic swiotlb_dma_ops

Now that the generic swiotlb code supports non-coherent DMA we can switch
to it for arm64. For that we need to refactor the existing
alloc/free/mmap/pgprot helpers to be used as the architecture hooks,
and implement the standard arch_sync_dma_for_{device,cpu} hooks for
cache maintaincance in the streaming dma hooks, which also implies
using the generic dma_coherent flag in struct device.

Note that we need to keep the old is_device_dma_coherent function around
for now, so that the shared arm/arm64 Xen code keeps working.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>


# fafadcd1 30-Sep-2018 Christoph Hellwig <hch@lst.de>

swiotlb: don't dip into swiotlb pool for coherent allocations

All architectures that support swiotlb also have a zone that backs up
these less than full addressing allocations (usually ZONE_DMA32).

Because of that it is rather pointless to fall back to the global swiotlb
buffer if the normal dma direct allocation failed - the only thing this
will do is to eat up bounce buffers that would be more useful to serve
streaming mappings.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>


# dff8d6c1 16-Aug-2018 Christoph Hellwig <hch@lst.de>

swiotlb: remove the overflow buffer

Like all other dma mapping drivers just return an error code instead
of an actual memory buffer. The reason for the overflow buffer was
that at the time swiotlb was invented there was no way to check for
dma mapping errors, but this has long been fixed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>


# 1a0afc14 25-Sep-2018 Christoph Hellwig <hch@lst.de>

Revert "dma-mapping: clear dev->dma_ops in arch_teardown_dma_ops"

This reverts commit 46053c73685411915d3de50c5a0045beef32806b.

This change breaks architectures setting up dma_ops in their own magic
way and not using arch_setup_dma_ops, so revert it.

Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Christoph Hellwig <hch@lst.de>


# 7adb562c 12-Sep-2018 Robin Murphy <robin.murphy@arm.com>

arm64/dma-mapping: Mildly optimise non-coherent IOMMU ops

Whilst the symmetry of deferring to the existing sync callback in
__iommu_map_page() is nice, taking a round-trip through
iommu_iova_to_phys() is a pretty heavyweight way to get an address we
can trivially compute from the page we already have. Tweaking it to just
perform the cache maintenance directly when appropriate doesn't really
make the code any more complicated, and the runtime efficiency gain can
only be a benefit.

Furthermore, the sync operations themselves know they can only be
invoked on a managed DMA ops domain, so can use the fast specific domain
lookup to avoid excessive manipulation of the group refcount
(particularly in the scatterlist cases).

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>


# 46053c73 24-Aug-2018 Christoph Hellwig <hch@lst.de>

dma-mapping: clear dev->dma_ops in arch_teardown_dma_ops

There is no reason to leave the per-device dma_ops around when
deconfiguring a device, so move this code from arm64 into the
common code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>


# d834c5ab 17-Aug-2018 Marek Szyprowski <m.szyprowski@samsung.com>

kernel/dma: remove unsupported gfp_mask parameter from dma_alloc_from_contiguous()

The CMA memory allocator doesn't support standard gfp flags for memory
allocation, so there is no point having it as a parameter for
dma_alloc_from_contiguous() function. Replace it by a boolean no_warn
argument, which covers all the underlaying cma_alloc() function
supports.

This will help to avoid giving false feeling that this function supports
standard gfp flags and callers can pass __GFP_ZERO to get zeroed buffer,
what has already been an issue: see commit dd65a941f6ba ("arm64:
dma-mapping: clear buffers allocated with FORCE_CONTIGUOUS flag").

Link: http://lkml.kernel.org/r/20180709122020eucas1p21a71b092975cb4a3b9954ffc63f699d1~-sqUFoa-h2939329393eucas1p2Y@eucas1p2.samsung.com
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Michał Nazarewicz <mina86@mina86.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# dd65a941 12-Jun-2018 Marek Szyprowski <m.szyprowski@samsung.com>

arm64: dma-mapping: clear buffers allocated with FORCE_CONTIGUOUS flag

dma_alloc_*() buffers might be exposed to userspace via mmap() call, so
they should be cleared on allocation. In case of IOMMU-based dma-mapping
implementation such buffer clearing was missing in the code path for
DMA_ATTR_FORCE_CONTIGUOUS flag handling, because dma_alloc_from_contiguous()
doesn't honor __GFP_ZERO flag. This patch fixes this issue. For more
information on clearing buffers allocated by dma_alloc_* functions,
see commit 6829e274a623 ("arm64: dma-mapping: always clear allocated
buffers").

Fixes: 44176bb38fa4 ("arm64: Add support for DMA_ATTR_FORCE_CONTIGUOUS to IOMMU")
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# ebc7e21e 11-May-2018 Catalin Marinas <catalin.marinas@arm.com>

arm64: Increase ARCH_DMA_MINALIGN to 128

This patch increases the ARCH_DMA_MINALIGN to 128 so that it covers the
currently known Cache Writeback Granule (CTR_EL0.CWG) on arm64 and moves
the fallback in cache_line_size() from L1_CACHE_BYTES to this constant.
In addition, it warns (and taints) if the CWG is larger than
ARCH_DMA_MINALIGN as this is not safe with non-coherent DMA.

Cc: Will Deacon <will.deacon@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 15b28bbc 16-Apr-2018 Christoph Hellwig <hch@lst.de>

dma-debug: move initialization to common code

Most mainstream architectures are using 65536 entries, so lets stick to
that. If someone is really desperate to override it that can still be
done through <asm/dma-mapping.h>, but I'd rather see a really good
rationale for that.

dma_debug_init is now called as a core_initcall, which for many
architectures means much earlier, and provides dma-debug functionality
earlier in the boot process. This should be safe as it only relies
on the memory allocator already being available.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>


# 3f251cf0 26-Mar-2018 Will Deacon <will@kernel.org>

Revert "arm64: Revert L1_CACHE_SHIFT back to 6 (64-byte cache line size)"

This reverts commit 1f85b42a691cd8329ba82dbcaeec80ac1231b32a.

The internal dma-direct.h API has changed in -next, which collides with
us trying to use it to manage non-coherent DMA devices on systems with
unreasonably large cache writeback granules.

This isn't at all trivial to resolve, so revert our changes for now and
we can revisit this after the merge window. Effectively, this just
restores our behaviour back to that of 4.16.

Signed-off-by: Will Deacon <will.deacon@arm.com>


# 1f85b42a 28-Feb-2018 Catalin Marinas <catalin.marinas@arm.com>

arm64: Revert L1_CACHE_SHIFT back to 6 (64-byte cache line size)

Commit 97303480753e ("arm64: Increase the max granular size") increased
the cache line size to 128 to match Cavium ThunderX, apparently for some
performance benefit which could not be confirmed. This change, however,
has an impact on the network packets allocation in certain
circumstances, requiring slightly over a 4K page with a significant
performance degradation.

This patch reverts L1_CACHE_SHIFT back to 6 (64-byte cache line) while
keeping ARCH_DMA_MINALIGN at 128. The cache_line_size() function was
changed to default to ARCH_DMA_MINALIGN in the absence of a meaningful
CTR_EL0.CWG bit field.

In addition, if a system with ARCH_DMA_MINALIGN < CTR_EL0.CWG is
detected, the kernel will force swiotlb bounce buffering for all
non-coherent devices since DMA cache maintenance on sub-CWG ranges is
not safe, leading to data corruption.

Cc: Tirumalesh Chalamarla <tchalamarla@cavium.com>
Cc: Timur Tabi <timur@codeaurora.org>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# 0d8488ac 24-Dec-2017 Christoph Hellwig <hch@lst.de>

arm64: use swiotlb_alloc and swiotlb_free

The generic swiotlb_alloc and swiotlb_free routines already take care
of CMA allocations and adding GFP_DMA32 where needed, so use them
instead of the arm specific helpers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>


# ad67f5a6 24-Dec-2017 Christoph Hellwig <hch@lst.de>

arm64: replace ZONE_DMA with ZONE_DMA32

arm64 uses ZONE_DMA for allocations below 32-bits. These days we
name the zone for that ZONE_DMA32, which will allow to use the
dma-direct and generic swiotlb code as-is, so rename it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>


# b7da4094 23-Dec-2017 Christoph Hellwig <hch@lst.de>

arm64: rename swiotlb_dma_ops

We'll need that name for a generic implementation soon.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>


# ea8c64ac 10-Jan-2018 Christoph Hellwig <hch@lst.de>

dma-mapping: move swiotlb arch helpers to a new header

phys_to_dma, dma_to_phys and dma_capable are helpers published by
architecture code for use of swiotlb and xen-swiotlb only. Drivers are
not supposed to use these directly, but use the DMA API instead.

Move these to a new asm/dma-direct.h helper, included by a
linux/dma-direct.h wrapper that provides the default linear mapping
unless the architecture wants to override it.

In the MIPS case the existing dma-coherent.h is reused for now as
untangling it will take a bit of work.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Robin Murphy <robin.murphy@arm.com>


# 359be678 02-Oct-2017 Matthieu CASTET <matthieu.castet@parrot.com>

dma mapping : export caller to vmallocinfo

For example on arm64 board, this add info to "user" entries in vmallocinfo

Before :
[...]
0xffffff8008997000 0xffffff80089d8000 266240 user
[...]

Afer :
[...]
0xffffff8008997000 0xffffff80089d8000 266240 atomic_pool_init+0x0/0x1d8 user
[...]

This help to debug mapping issues, and is consistent with others entries
(ioremap, vmalloc, ...) that already provide caller.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Matthieu CASTET <matthieu.castet@parrot.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# b4f4a275 20-Sep-2017 Thomas Meyer <thomas@m3y3r.de>

arm64: dma-mapping: Cocci spatch "vma_pages"

Use vma_pages function on vma object instead of explicit computation.
Found by coccinelle spatch "api/vma_pages.cocci"

Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# 8165f706 14-Aug-2017 Vladimir Murzin <vladimir.murzin@arm.com>

arm64: dma-mapping: Mark atomic_pool as __ro_after_init

atomic_pool is setup once while init stage and never changed after
that, so it is good candidate for __ro_after_init

Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 2fa59ec8 14-Aug-2017 Vladimir Murzin <vladimir.murzin@arm.com>

arm64: dma-mapping: Do not pass data to gen_pool_set_algo()

gen_pool_first_fit_order_align() does not make use of additional data,
so pass plain NULL there.

Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 43fc509c 20-Jul-2017 Vladimir Murzin <vladimir.murzin@arm.com>

dma-coherent: introduce interface for default DMA pool

Christoph noticed [1] that default DMA pool in current form overload
the DMA coherent infrastructure. In reply, Robin suggested [2] to
split the per-device vs. global pool interfaces, so allocation/release
from default DMA pool is driven by dma ops implementation.

This patch implements Robin's idea and provide interface to
allocate/release/mmap the default (aka global) DMA pool.

To make it clear that existing *_from_coherent routines work on
per-device pool rename them to *_from_dev_coherent.

[1] https://lkml.org/lkml/2017/7/7/370
[2] https://lkml.org/lkml/2017/7/7/431

Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Suggested-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Andras Szemzo <sza@esh.hu>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>


# e0d60ac1 20-May-2017 Christoph Hellwig <hch@lst.de>

arm64: remove DMA_ERROR_CODE

The dma alloc interface returns an error by return NULL, and the
mapping interfaces rely on the mapping_error method, which the dummy
ops already implement correctly.

Thus remove the DMA_ERROR_CODE define.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>


# 577dfe16 13-Jun-2017 Olav Haugan <ohaugan@codeaurora.org>

arm64/dma-mapping: Remove extraneous null-pointer checks

The current null-pointer check in __dma_alloc_coherent and
__dma_free_coherent is not needed anymore since the
__dma_alloc/__dma_free functions won't be called if !dev (dummy ops will
be called instead).

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# 92f66f84 25-Apr-2017 Catalin Marinas <catalin.marinas@arm.com>

arm64: Fix the DMA mmap and get_sgtable API with DMA_ATTR_FORCE_CONTIGUOUS

While honouring the DMA_ATTR_FORCE_CONTIGUOUS on arm64 (commit
44176bb38fa4: "arm64: Add support for DMA_ATTR_FORCE_CONTIGUOUS to
IOMMU"), the existing uses of dma_mmap_attrs() and dma_get_sgtable()
have been broken by passing a physically contiguous vm_struct with an
invalid pages pointer through the common iommu API.

Since the coherent allocation with DMA_ATTR_FORCE_CONTIGUOUS uses CMA,
this patch simply reuses the existing swiotlb logic for mmap and
get_sgtable.

Note that the current implementation of get_sgtable (both swiotlb and
iommu) is broken if dma_declare_coherent_memory() is used since such
memory does not have a corresponding struct page. To be addressed in a
subsequent patch.

Fixes: 44176bb38fa4 ("arm64: Add support for DMA_ATTR_FORCE_CONTIGUOUS to IOMMU")
Reported-by: Andrzej Hajda <a.hajda@samsung.com>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Andrzej Hajda <a.hajda@samsung.com>
Reviewed-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# e0586326 13-Apr-2017 Stefano Stabellini <sstabellini@kernel.org>

xen/arm,arm64: fix xen_dma_ops after 815dd18 "Consolidate get_dma_ops..."

The following commit:

commit 815dd18788fe0d41899f51b91d0560279cf16b0d
Author: Bart Van Assche <bart.vanassche@sandisk.com>
Date: Fri Jan 20 13:04:04 2017 -0800

treewide: Consolidate get_dma_ops() implementations

rearranges get_dma_ops in a way that xen_dma_ops are not returned when
running on Xen anymore, dev->dma_ops is returned instead (see
arch/arm/include/asm/dma-mapping.h:get_arch_dma_ops and
include/linux/dma-mapping.h:get_dma_ops).

Fix the problem by storing dev->dma_ops in dev_archdata, and setting
dev->dma_ops to xen_dma_ops. This way, xen_dma_ops is returned naturally
by get_dma_ops. The Xen code can retrieve the original dev->dma_ops from
dev_archdata when needed. It also allows us to remove __generic_dma_ops
from common headers.

Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Tested-by: Julien Grall <julien.grall@arm.com>
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: <stable@vger.kernel.org> [4.11+]
CC: linux@armlinux.org.uk
CC: catalin.marinas@arm.com
CC: will.deacon@arm.com
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
CC: Julien Grall <julien.grall@arm.com>


# 461a6946 26-Apr-2017 Joerg Roedel <jroedel@suse.de>

iommu: Remove pci.h include from trace/events/iommu.h

The include file does not need any PCI specifics, so remove
that include. Also fix the places that relied on it.

Signed-off-by: Joerg Roedel <jroedel@suse.de>


# b913efe7 10-Apr-2017 Sricharan R <sricharan@codeaurora.org>

arm64: dma-mapping: Remove the notifier trick to handle early setting of dma_ops

With arch_setup_dma_ops now being called late during device's probe after
the device's iommu is probed, the notifier trick required to handle the
early setup of dma_ops before the iommu group gets created is not
required. So removing the notifier's here.

Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Sricharan R <sricharan@codeaurora.org>
[rm: clean up even more]
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>


# 44176bb3 07-Mar-2017 Geert Uytterhoeven <geert+renesas@glider.be>

arm64: Add support for DMA_ATTR_FORCE_CONTIGUOUS to IOMMU

Add support for allocating physically contiguous DMA buffers on arm64
systems with an IOMMU. This can be useful when two or more devices
with different memory requirements are involved in buffer sharing.

Note that as this uses the CMA allocator, setting the
DMA_ATTR_FORCE_CONTIGUOUS attribute has a runtime-dependency on
CONFIG_DMA_CMA, just like on arm32.

For arm64 systems using swiotlb, no changes are needed to support the
allocation of physically contiguous DMA buffers:
- swiotlb always uses physically contiguous buffers (up to
IO_TLB_SEGSIZE = 128 pages),
- arm64's __dma_alloc_coherent() already calls
dma_alloc_from_contiguous() when CMA is available.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 712c604d 24-Feb-2017 Lucas Stach <l.stach@pengutronix.de>

mm: wire up GFP flag passing in dma_alloc_from_contiguous

The callers of the DMA alloc functions already provide the proper
context GFP flags. Make sure to pass them through to the CMA allocator,
to make the CMA compaction context aware.

Link: http://lkml.kernel.org/r/20170127172328.18574-3-l.stach@pengutronix.de
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Radim Krcmar <rkrcmar@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Alexander Graf <agraf@suse.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# a1831bb9 01-Feb-2017 Robin Murphy <robin.murphy@arm.com>

iommu/dma: Remove bogus dma_supported() implementation

Back when this was first written, dma_supported() was somewhat of a
murky mess, with subtly different interpretations being relied upon in
various places. The "does device X support DMA to address range Y?"
uses assuming Y to be physical addresses, which motivated the current
iommu_dma_supported() implementation and are alluded to in the comment
therein, have since been cleaned up, leaving only the far less ambiguous
"can device X drive address bits Y" usage internal to DMA API mask
setting. As such, there is no reason to keep a slightly misleading
callback which does nothing but duplicate the current default behaviour;
we already constrain IOVA allocations to the iommu_domain aperture where
necessary, so let's leave DMA mask business to architecture-specific
code where it belongs.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>


# adbe7e26 25-Jan-2017 Robin Murphy <robin.murphy@arm.com>

arm64: dma-mapping: Fix dma_mapping_error() when bypassing SWIOTLB

When bypassing SWIOTLB on small-memory systems, we need to avoid calling
into swiotlb_dma_mapping_error() in exactly the same way as we avoid
swiotlb_dma_supported(), because the former also relies on SWIOTLB state
being initialised.

Under the assumptions for which we skip SWIOTLB, dma_map_{single,page}()
will only ever return the DMA-offset-adjusted physical address of the
page passed in, thus we can report success unconditionally.

Fixes: b67a8b29df7e ("arm64: mm: only initialize swiotlb when necessary")
CC: stable@vger.kernel.org
CC: Jisheng Zhang <jszhang@marvell.com>
Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# 5657933d 20-Jan-2017 Bart Van Assche <bvanassche@acm.org>

treewide: Move dma_ops from struct dev_archdata into struct device

Some but not all architectures provide set_dma_ops(). Move dma_ops
from struct dev_archdata into struct device such that it becomes
possible on all architectures to configure dma_ops per device.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-arch@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Russell King <linux@armlinux.org.uk>
Cc: x86@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 5299709d 20-Jan-2017 Bart Van Assche <bvanassche@acm.org>

treewide: Constify most dma_map_ops structures

Most dma_map_ops structures are never modified. Constify these
structures such that these can be write-protected. This patch
has been generated as follows:

git grep -l 'struct dma_map_ops' |
xargs -d\\n sed -i \
-e 's/struct dma_map_ops/const struct dma_map_ops/g' \
-e 's/const struct dma_map_ops {/struct dma_map_ops {/g' \
-e 's/^const struct dma_map_ops;$/struct dma_map_ops;/' \
-e 's/const const struct dma_map_ops /const struct dma_map_ops /g';
sed -i -e 's/const \(struct dma_map_ops intel_dma_ops\)/\1/' \
$(git grep -l 'struct dma_map_ops intel_dma_ops');
sed -i -e 's/const \(struct dma_map_ops dma_iommu_ops\)/\1/' \
$(git grep -l 'struct dma_map_ops' | grep ^arch/powerpc);
sed -i -e '/^struct vmd_dev {$/,/^};$/ s/const \(struct dma_map_ops[[:blank:]]dma_ops;\)/\1/' \
-e '/^static void vmd_setup_dma_ops/,/^}$/ s/const \(struct dma_map_ops \*dest\)/\1/' \
-e 's/const \(struct dma_map_ops \*dest = \&vmd->dma_ops\)/\1/' \
drivers/pci/host/*.c
sed -i -e '/^void __init pci_iommu_alloc(void)$/,/^}$/ s/dma_ops->/intel_dma_ops./' arch/ia64/kernel/pci-dma.c
sed -i -e 's/static const struct dma_map_ops sn_dma_ops/static struct dma_map_ops sn_dma_ops/' arch/ia64/sn/pci/pci_dma.c
sed -i -e 's/(const struct dma_map_ops \*)//' drivers/misc/mic/bus/vop_bus.c

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-arch@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Russell King <linux@armlinux.org.uk>
Cc: x86@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>


# 4a8d8a14 06-Jan-2017 Will Deacon <will@kernel.org>

arm64: dma-mapping: Only swizzle DMA ops for IOMMU_DOMAIN_DMA

The arm64 DMA-mapping implementation sets the DMA ops to the IOMMU DMA
ops if we detect that an IOMMU is present for the master and the DMA
ranges are valid.

In the case when the IOMMU domain for the device is not of type
IOMMU_DOMAIN_DMA, then we have no business swizzling the ops, since
we're not in control of the underlying address space. This patch leaves
the DMA ops alone for masters attached to non-DMA IOMMU domains.

Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# 737c85ca 06-Jan-2017 Mitchel Humpherys <mitchelh@codeaurora.org>

arm64/dma-mapping: Implement DMA_ATTR_PRIVILEGED

The newly added DMA_ATTR_PRIVILEGED is useful for creating mappings that
are only accessible to privileged DMA engines. Implement it in
dma-iommu.c so that the ARM64 DMA IOMMU mapper can make use of it.

Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# 7f332fc1 11-Jan-2017 Takeshi Kihara <takeshi.kihara.df@renesas.com>

arm64: Add support for DMA_ATTR_SKIP_CPU_SYNC attribute to swiotlb

This patch adds support for DMA_ATTR_SKIP_CPU_SYNC attribute for
dma_{un}map_{page,sg} functions family to swiotlb.

DMA_ATTR_SKIP_CPU_SYNC allows platform code to skip synchronization of
the CPU cache for the given buffer assuming that it has been already
transferred to 'device' domain.

Ported from IOMMU .{un}map_{sg,page} ops.

Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Takeshi Kihara <takeshi.kihara.df@renesas.com>
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# ae7871be 16-Dec-2016 Geert Uytterhoeven <geert+renesas@glider.be>

swiotlb: Convert swiotlb_force from int to enum

Convert the flag swiotlb_force from an int to an enum, to prepare for
the advent of more possible values.

Suggested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>


# 60c4e804 13-Nov-2016 Robin Murphy <robin.murphy@arm.com>

arm64: Wire up iommu_dma_{map, unmap}_resource()

With no coherency to worry about, just plug'em straight in.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>


# b7b941af 26-Oct-2016 Robin Murphy <robin.murphy@arm.com>

arm64: Remove pointless WARN_ON in DMA teardown

We expect arch_teardown_dma_ops() to be called very late in a device's
life, after it has been removed from its bus, and thus after the IOMMU
bus notifier has run. As such, even if this funny little check did make
sense, it's unlikely to achieve what it thinks it's trying to do anyway.
It's a residual trace of an earlier implementation which didn't belong
here from the start; belatedly snuff it out.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# fade1ec0 12-Sep-2016 Robin Murphy <robin.murphy@arm.com>

iommu/dma: Avoid PCI host bridge windows

With our DMA ops enabled for PCI devices, we should avoid allocating
IOVAs which a host bridge might misinterpret as peer-to-peer DMA and
lead to faults, corruption or other badness. To be safe, punch out holes
for all of the relevant host bridge's windows when initialising a DMA
domain for a PCI device.

CC: Marek Szyprowski <m.szyprowski@samsung.com>
CC: Inki Dae <inki.dae@samsung.com>
Reported-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# 5a9e3e15 15-Aug-2016 Jisheng Zhang <jszhang@marvell.com>

arm64: apply __ro_after_init to some objects

These objects are set during initialization, thereafter are read only.

Previously I only want to mark vdso_pages, vdso_spec, vectors_page and
cpu_ops as __read_mostly from performance point of view. Then inspired
by Kees's patch[1] to apply more __ro_after_init for arm, I think it's
better to mark them as __ro_after_init. What's more, I find some more
objects are also read only after init. So apply __ro_after_init to all
of them.

This patch also removes global vdso_pagelist and tries to clean up
vdso_spec[] assignment code.

[1] http://www.spinics.net/lists/arm-kernel/msg523188.html

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# d34fdb70 01-Aug-2016 Kwangwoo Lee <kwangwoo.lee@sk.com>

arm64: mm: convert __dma_* routines to use start, size

__dma_* routines have been converted to use start and size instread of
start and end addresses. The patch was origianlly for adding
__clean_dcache_area_poc() which will be used in pmem driver to clean
dcache to the PoC(Point of Coherency) in arch_wb_cache_pmem().

The functionality of __clean_dcache_area_poc() was equivalent to
__dma_clean_range(). The difference was __dma_clean_range() uses the end
address, but __clean_dcache_area_poc() uses the size to clean.

Thus, __clean_dcache_area_poc() has been revised with a fallthrough
function of __dma_clean_range() after the change that __dma_* routines
use start and size instead of using start and end.

As a consequence of using start and size, the name of __dma_* routines
has also been altered following the terminology below:
area: takes a start and size
range: takes a start and end

Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Kwangwoo Lee <kwangwoo.lee@sk.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# 00085f1e 03-Aug-2016 Krzysztof Kozlowski <krzk@kernel.org>

dma-mapping: use unsigned long for dma_attrs

The dma-mapping core and the implementations do not change the DMA
attributes passed by pointer. Thus the pointer can point to const data.
However the attributes do not have to be a bitfield. Instead unsigned
long will do fine:

1. This is just simpler. Both in terms of reading the code and setting
attributes. Instead of initializing local attributes on the stack
and passing pointer to it to dma_set_attr(), just set the bits.

2. It brings safeness and checking for const correctness because the
attributes are passed by value.

Semantic patches for this change (at least most of them):

virtual patch
virtual context

@r@
identifier f, attrs;

@@
f(...,
- struct dma_attrs *attrs
+ unsigned long attrs
, ...)
{
...
}

@@
identifier r.f;
@@
f(...,
- NULL
+ 0
)

and

// Options: --all-includes
virtual patch
virtual context

@r@
identifier f, attrs;
type t;

@@
t f(..., struct dma_attrs *attrs);

@@
identifier r.f;
@@
f(...,
- NULL
+ 0
)

Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.com
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
Acked-by: Mark Salter <msalter@redhat.com> [c6x]
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> [cris]
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> [drm]
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
Acked-by: Fabien Dessenne <fabien.dessenne@st.com> [bdisp]
Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com> [vb2-core]
Acked-by: David Vrabel <david.vrabel@citrix.com> [xen]
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [xen swiotlb]
Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon]
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k]
Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390]
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> [avr32]
Acked-by: Vineet Gupta <vgupta@synopsys.com> [arc]
Acked-by: Robin Murphy <robin.murphy@arm.com> [arm64 and dma-iommu]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 16c11325 01-Jul-2016 Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>

arm64: mm: change IOMMU notifier action to attach DMA ops

Current bus notifier in ARM64 (__iommu_attach_notifier)
attempts to attach dma_ops to a device on BUS_NOTIFY_ADD_DEVICE
action notification.

This will cause issues on ACPI based systems, where PCI devices
can be added before the IOMMUs the devices are attached to
had a chance to be probed, causing failures on attempts to
attach dma_ops in that the domain for the respective IOMMU
may not be set-up yet by the time the bus notifier is run.

Devices dma_ops do not require to be set-up till the matching
device drivers are probed. This means that instead of running
the notifier attaching dma_ops to devices (__iommu_attach_notifier)
on BUS_NOTIFY_ADD_DEVICE action, it can be run just before the
device driver is bound to the device in question (on action
BUS_NOTIFY_BIND_DRIVER) so that it is certain that its IOMMU
group and domain are set-up accordingly at the time the
notifier is triggered.

This patch changes the notifier action upon which dma_ops
are attached to devices and defer it to driver binding time,
so that IOMMU devices have a chance to be probed and to register
their bus notifiers before the dma_ops attach sequence for a
device is actually carried out.

As a result we also no longer need worry about racing with
iommu_bus_notifier(), or about retrying the queue in case devices
were added too early on DT-based systems, so clean up the notifier
itself plus the additional workaround from 722ec35f7fae ("arm64:
dma-mapping: fix handling of devices registered before arch_initcall")

Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
[rm: get rid of other now-redundant bits]
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# b67a8b29 08-Jun-2016 Jisheng Zhang <jszhang@marvell.com>

arm64: mm: only initialize swiotlb when necessary

we only initialize swiotlb when swiotlb_force is true or not all system
memory is DMA-able, this trivial optimization saves us 64MB when
swiotlb is not necessary.

Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 3b6b7e19 13-Apr-2016 Robin Murphy <robin.murphy@arm.com>

iommu/dma: Finish optimising higher-order allocations

Now that we know exactly which page sizes our caller wants to use in the
given domain, we can restrict higher-order allocation attempts to just
those sizes, if any, and avoid wasting any time or effort on other sizes
which offer no benefit. In the same vein, this also lets us accommodate
a minimum order greater than 0 for special cases.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Tested-by: Yong Wu <yong.wu@mediatek.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>


# 53c92d79 07-Apr-2016 Robin Murphy <Robin.Murphy@arm.com>

iommu: of: enforce const-ness of struct iommu_ops

As a set of driver-provided callbacks and static data, there is no
compelling reason for struct iommu_ops to be mutable in core code, so
enforce const-ness throughout.

Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>


# 921b1f52 19-Apr-2016 Robin Murphy <robin.murphy@arm.com>

arm64/dma-mapping: Remove default domain workaround

With the IOMMU core now taking care of default domains for groups
regardless of bus type, we can gleefully rip out this stop-gap, as
slight recompense for having to expand the other one.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# 226d89cb 19-Apr-2016 Robin Murphy <robin.murphy@arm.com>

arm64/dma-mapping: Extend DMA ops workaround to PCI devices

PCI devices now suffer the same hiccup as platform devices, in that they
get their DMA ops configured before they have been added to their bus,
and thus before we know whether they have successfully registered with
an IOMMU or not. Until the necessary driver core changes to reorder
calls during device creation have been worked out, extend our delayed
notifier trick onto the PCI bus so as to avoid broken DMA ops once
IOMMUs get plugged into the PCI code.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# 722ec35f 16-Feb-2016 Marek Szyprowski <m.szyprowski@samsung.com>

arm64: dma-mapping: fix handling of devices registered before arch_initcall

This patch ensures that devices, which got registered before arch_initcall
will be handled correctly by IOMMU-based DMA-mapping code.

Cc: <stable@vger.kernel.org>
Fixes: 13b8629f6511 ("arm64: Add IOMMU dma_ops")
Acked-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# a7c61a34 20-Nov-2015 Jisheng Zhang <jszhang@marvell.com>

arm64: add __init/__initdata section marker to some functions/variables

These functions/variables are not needed after booting, so mark them
as __init or __initdata.

Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# 1dccb598 16-Nov-2015 Arnd Bergmann <arnd@arndb.de>

arm64: simplify dma_get_ops

Including linux/acpi.h from asm/dma-mapping.h causes tons of compile-time
warnings, e.g.

drivers/isdn/mISDN/dsp_ecdis.h:43:0: warning: "FALSE" redefined
drivers/isdn/mISDN/dsp_ecdis.h:44:0: warning: "TRUE" redefined
drivers/net/fddi/skfp/h/targetos.h:62:0: warning: "TRUE" redefined
drivers/net/fddi/skfp/h/targetos.h:63:0: warning: "FALSE" redefined

However, it looks like the dependency should not even there as
I do not see why __generic_dma_ops() cares about whether we have
an ACPI based system or not.

The current behavior is to fall back to the global dma_ops when
a device has not set its own dma_ops, but only for DT based systems.
This seems dangerous, as a random device might have different
requirements regarding IOMMU or coherency, so we should really
never have that fallback and just forbid DMA when we have not
initialized DMA for a device.

This removes the global dma_ops variable and the special-casing
for ACPI, and just returns the dma ops that got set for the
device, or the dummy_dma_ops if none were present.

The original code has apparently been copied from arm32 where we
rely on it for ISA devices things like the floppy controller, but
we should have no such devices on ARM64.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
[catalin.marinas@arm.com: removed acpi_disabled check in arch_setup_dma_ops()]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# bd1c6ff7 04-Nov-2015 Robin Murphy <robin.murphy@arm.com>

arm64/dma-mapping: Fix sizes in __iommu_{alloc,free}_attrs

The iommu-dma layer does its own size-alignment for coherent DMA
allocations based on IOMMU page sizes, but we still need to consider
CPU page sizes for the cases where a non-cacheable CPU mapping is
created. Whilst everything on the alloc/map path seems to implicitly
align things enough to make it work, some functions used by the
corresponding unmap/free path do not, which leads to problems freeing
odd-sized allocations. Either way it's something we really should be
handling explicitly, so do that to make both paths suitably robust.

Reported-by: Yong Wu <yong.wu@mediatek.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# ce5c2d2c 07-Nov-2015 Andrew Morton <akpm@linux-foundation.org>

arm64: fixup for mm renames

__GFP_WAIT was renamed for __GFP_RECLAIM and the gfpflags_allow_blocking()
helper was added.

Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# d0164adc 06-Nov-2015 Mel Gorman <mgorman@techsingularity.net>

mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd

__GFP_WAIT has been used to identify atomic context in callers that hold
spinlocks or are in interrupts. They are expected to be high priority and
have access one of two watermarks lower than "min" which can be referred
to as the "atomic reserve". __GFP_HIGH users get access to the first
lower watermark and can be called the "high priority reserve".

Over time, callers had a requirement to not block when fallback options
were available. Some have abused __GFP_WAIT leading to a situation where
an optimisitic allocation with a fallback option can access atomic
reserves.

This patch uses __GFP_ATOMIC to identify callers that are truely atomic,
cannot sleep and have no alternative. High priority users continue to use
__GFP_HIGH. __GFP_DIRECT_RECLAIM identifies callers that can sleep and
are willing to enter direct reclaim. __GFP_KSWAPD_RECLAIM to identify
callers that want to wake kswapd for background reclaim. __GFP_WAIT is
redefined as a caller that is willing to enter direct reclaim and wake
kswapd for background reclaim.

This patch then converts a number of sites

o __GFP_ATOMIC is used by callers that are high priority and have memory
pools for those requests. GFP_ATOMIC uses this flag.

o Callers that have a limited mempool to guarantee forward progress clear
__GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall
into this category where kswapd will still be woken but atomic reserves
are not used as there is a one-entry mempool to guarantee progress.

o Callers that are checking if they are non-blocking should use the
helper gfpflags_allow_blocking() where possible. This is because
checking for __GFP_WAIT as was done historically now can trigger false
positives. Some exceptions like dm-crypt.c exist where the code intent
is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to
flag manipulations.

o Callers that built their own GFP flags instead of starting with GFP_KERNEL
and friends now also need to specify __GFP_KSWAPD_RECLAIM.

The first key hazard to watch out for is callers that removed __GFP_WAIT
and was depending on access to atomic reserves for inconspicuous reasons.
In some cases it may be appropriate for them to use __GFP_HIGH.

The second key hazard is callers that assembled their own combination of
GFP flags instead of starting with something like GFP_KERNEL. They may
now wish to specify __GFP_KSWAPD_RECLAIM. It's almost certainly harmless
if it's missed in most cases as other activity will wake kswapd.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Vitaly Wool <vitalywool@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 876945db 01-Oct-2015 Robin Murphy <robin.murphy@arm.com>

arm64: Hook up IOMMU dma_ops

With iommu_dma_ops in place, hook them up to the configuration code, so
IOMMU-fronted devices will get them automatically.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>


# 13b8629f 01-Oct-2015 Robin Murphy <robin.murphy@arm.com>

arm64: Add IOMMU dma_ops

Taking some inspiration from the arch/arm code, implement the
arch-specific side of the DMA mapping ops using the new IOMMU-DMA layer.

Since there is still work to do elsewhere to make DMA configuration happen
in a more appropriate order and properly support platform devices in the
IOMMU core, the device setup code unfortunately starts out carrying some
workarounds to ensure it works correctly in the current state of things.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>


# ba9cc453 11-Sep-2015 Jisheng Zhang <jszhang@marvell.com>

arm64: dma-mapping: check whether cma area is initialized or not

If CMA is turned on and CMA size is set to zero, kernel should
behave as if CMA was not enabled at compile time.
Every dma allocation should check existence of cma area
before requesting memory.

Arm has done this by commit e464ef16c4f0 ("arm: dma-mapping: add
checking cma area initialized"), also do this for arm64.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# 97942c28 31-Jul-2015 Robin Murphy <Robin.Murphy@arm.com>

arm64: dma-mapping: Simplify pgprot handling

Since __get_dma_pgprot() does The Right Thing(TM) in the non-coherent
case, and the non-cacheable alias for DMA buffers is private to the
kernel anyway, we can simplify things slightly and make the code more
readable by just using PAGE_KERNEL as the base pgprot.

Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# 1d1ddf67 17-Jul-2015 Robin Murphy <Robin.Murphy@arm.com>

arm64: dma-mapping: implement dma_get_sgtable()

The default dma_common_get_sgtable() implementation relies on the CPU
address of the buffer being a regular lowmem address. This is not always
the case on arm64, since allocations from the various DMA pools may have
remapped vmalloc addresses, rendering the use of virt_to_page() invalid.

Fix this by providing our own implementation based on the fact that we
can safely derive a physical address from the DMA address in both cases.

CC: Jon Medhurst <tixy@linaro.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
[will: made static]
Signed-off-by: Will Deacon <will.deacon@arm.com>


# aaf6f2f0 10-Jul-2015 Robin Murphy <Robin.Murphy@arm.com>

arm64: consolidate __swiotlb_mmap

Since commit 9d3bfbb4df58 ("arm64: Combine coherent and non-coherent
swiotlb dma_ops"), __dma_common_mmap is no longer shared between two
callers, so roll it into the remaining one.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# b6197b93 10-Jun-2015 Suthikulpanit, Suravee <Suravee.Suthikulpanit@amd.com>

arm64 : Introduce support for ACPI _CCA object

section 6.2.17 _CCA states that ARM platforms require ACPI _CCA
object to be specified for DMA-cabpable devices. Therefore, this patch
specifies ACPI_CCA_REQUIRED in arm64 Kconfig.

In addition, to handle the case when _CCA is missing, arm64 would assign
dummy_dma_ops to disable DMA capability of the device.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Salter <msalter@redhat.com>
Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>


# 2cff98b9 29-Apr-2015 Dean Nelson <dnelson@redhat.com>

arm64: add missing PAGE_ALIGN() to __dma_free()

__dma_alloc() does a PAGE_ALIGN() on the passed in size argument before
doing anything else. __dma_free() does not. And because it doesn't, it is
possible to leak memory should size not be an integer multiple of PAGE_SIZE.

The solution is to add a PAGE_ALIGN() to __dma_free() like is done in
__dma_alloc().

Additionally, this patch removes a redundant PAGE_ALIGN() from
__dma_alloc_coherent(), since __dma_alloc_coherent() can only be called
from __dma_alloc(), which already does a PAGE_ALIGN() before the call.

Cc: stable@vger.kernel.org
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dean Nelson <dnelson@redhat.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# 6829e274 22-Apr-2015 Marek Szyprowski <m.szyprowski@samsung.com>

arm64: dma-mapping: always clear allocated buffers

Buffers allocated by dma_alloc_coherent() are always zeroed on Alpha,
ARM (32bit), MIPS, PowerPC, x86/x86_64 and probably other architectures.
It turned out that some drivers rely on this 'feature'. Allocated buffer
might be also exposed to userspace with dma_mmap() call, so clearing it
is desired from security point of view to avoid exposing random memory
to userspace. This patch unifies dma_alloc_coherent() behavior on ARM64
architecture with other implementations by unconditionally zeroing
allocated buffer.

Cc: <stable@vger.kernel.org> # v3.14+
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# 7132813c 19-Mar-2015 Suzuki K. Poulose <suzuki.poulose@arm.com>

arm64: Honor __GFP_ZERO in dma allocations

Current implementation doesn't zero out the pages allocated.
Honor the __GFP_ZERO flag and zero out if set.

Cc: <stable@vger.kernel.org> # v3.14+
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# a1e50a82 05-Feb-2015 Catalin Marinas <catalin.marinas@arm.com>

arm64: Increase the swiotlb buffer size 64MB

With commit 3690951fc6d4 (arm64: Use swiotlb late initialisation), the
swiotlb buffer size is limited to MAX_ORDER_NR_PAGES. However, there are
platforms with 32-bit only devices that require bounce buffering via
swiotlb. This patch changes the swiotlb initialisation to an early 64MB
memblock allocation. In order to get the swiotlb buffer correctly
allocated (via memblock_virt_alloc_low_nopanic), this patch also defines
ARCH_LOW_ADDRESS_LIMIT to the maximum physical address capable of 32-bit
DMA.

Reported-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Tested-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 9d3bfbb4 12-Jan-2015 Catalin Marinas <catalin.marinas@arm.com>

arm64: Combine coherent and non-coherent swiotlb dma_ops

Since dev_archdata now has a dma_coherent state, combine the two
coherent and non-coherent operations and remove their declaration,
together with set_dma_ops, from the arch dma-mapping.h file.

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# d4932f9e 09-Oct-2014 Laura Abbott <lauraa@codeaurora.org>

arm64: add atomic pool for non-coherent and CMA allocations

Neither CMA nor noncoherent allocations support atomic allocations.
Add a dedicated atomic pool to support this.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Riley <davidriley@chromium.org>
Cc: Olof Johansson <olof@lixom.net>
Cc: Ritesh Harjain <ritesh.harjani@gmail.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Thierry Reding <thierry.reding@gmail.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# a52ce121 01-Oct-2014 Sean Paul <seanpaul@chromium.org>

arm64: Use DMA_ERROR_CODE to denote failed allocation

This patch replaces the static assignment of ~0 to dma_handle with
DMA_ERROR_CODE to be consistent with other platforms.

Signed-off-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 21890647 22-Sep-2014 Catalin Marinas <catalin.marinas@arm.com>

arm64: Implement set_arch_dma_coherent_ops() to replace bus notifiers

Commit 6ecba8eb51b7 (arm64: Use bus notifiers to set per-device coherent
DMA ops) introduced bus notifiers to set the coherent dma ops based on
the 'dma-coherent' DT property. Since the generic of_dma_configure()
handles this property for platform and AMBA devices, replace the
notifiers with set_arch_dma_coherent_ops().

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# a501e324 03-Apr-2014 Catalin Marinas <catalin.marinas@arm.com>

arm64: Clean up the default pgprot setting

The primary aim of this patchset is to remove the pgprot_default and
prot_sect_default global variables and rely strictly on predefined
values. The original goal was to be able to run SMP kernels on UP
hardware by not setting the Shareability bit. However, it is unlikely to
see UP ARMv8 hardware and even if we do, the Shareability bit is no
longer assumed to disable cacheable accesses.

A side effect is that the device mappings now have the Shareability
attribute set. The hardware, however, should ignore it since Device
accesses are always Outer Shareable.

Following the removal of the two global variables, there is some PROT_*
macro reshuffling and cleanup, including the __PAGE_* macros (replaced
by PAGE_*).

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>


# 6ecba8eb 25-Apr-2014 Catalin Marinas <catalin.marinas@arm.com>

arm64: Use bus notifiers to set per-device coherent DMA ops

Recently, the default DMA ops have been changed to non-coherent for
alignment with 32-bit ARM platforms (and DT files). This patch adds bus
notifiers to be able to set the coherent DMA ops (with no cache
maintenance) for devices explicitly marked as coherent via the
"dma-coherent" DT property.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# c7a4a765 22-Apr-2014 Ritesh Harjani <ritesh.harjani@gmail.com>

arm64: Make default dma_ops to be noncoherent

Currently arm64 dma_ops is by default made coherent which makes it
opposite in default policy from arm.

Make default dma_ops to be noncoherent (same as arm), as currently there
aren't any dma-capable drivers which assumes coherent ops

Signed-off-by: Ritesh Harjani <ritesh.harjani@gmail.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 196adf2f 24-Mar-2014 Catalin Marinas <catalin.marinas@arm.com>

arm64: Remove pgprot_dmacoherent()

Since this macro is identical to pgprot_writecombine() and is only used
in a single place, remove it completely to avoid confusion. On ARMv7+
processors, the coherent DMA mapping must be Normal NonCacheable (a.k.a.
writecombine) to avoid mismatched hardware attribute aliases (with the
kernel linear mapping as Normal Cacheable).

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 214fdbe7 14-Mar-2014 Laura Abbott <lauraa@codeaurora.org>

arm64: Support DMA_ATTR_WRITE_COMBINE

DMA_ATTR_WRITE_COMBINE is currently ignored. Set the pgprot
appropriately for non coherent opperations.

Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 6e8d7968 14-Mar-2014 Laura Abbott <lauraa@codeaurora.org>

arm64: Implement custom mmap functions for dma mapping

The current dma_ops do not specify an mmap function so maping
falls back to the default implementation. There are at least
two issues with using the default implementation:

1) The pgprot is always pgprot_noncached (strongly ordered)
memory even with coherent operations
2) dma_common_mmap calls virt_to_page on the remapped non-coherent
address which leads to invalid memory being mapped.

Fix both these issue by implementing a custom mmap function which
correctly accounts for remapped addresses and sets vm_pg_prot
appropriately.

Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
[catalin.marinas@arm.com: replaced "arm64_" with "__" prefix for consistency]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 7363590d 21-May-2013 Catalin Marinas <catalin.marinas@arm.com>

arm64: Implement coherent DMA API based on swiotlb

This patch adds support for DMA API cache maintenance on SoCs without
hardware device cache coherency.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 3690951f 26-Feb-2014 Catalin Marinas <catalin.marinas@arm.com>

arm64: Use swiotlb late initialisation

Since arm64 does not support ISA, there is no need for early swiotlb
initialisation. This patch switches the DMA mapping code to
swiotlb_tlb_late_init_with_default_size(). A side effect of this is that
GFP_DMA is used for the swiotlb buffer and devices with a 32-bit
coherent mask are correctly supported.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 19e7640d 26-Feb-2014 Catalin Marinas <catalin.marinas@arm.com>

arm64: Replace ZONE_DMA32 with ZONE_DMA

On arm64 we do not have two DMA zones, so it does not make sense to
implement ZONE_DMA32. This patch changes ZONE_DMA32 with ZONE_DMA, the
latter covering 32-bit dma address space to honour GFP_DMA allocations.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# bb10eb7b 06-Feb-2014 Ritesh Harjani <ritesh.harjani@gmail.com>

arm64: Change misleading function names in dma-mapping

arm64_swiotlb_alloc/free_coherent name can be misleading
somtimes with CMA support being enabled after this
patch (c2104debc235b745265b64d610237a6833fd53)

Change this name to be more generic:
__dma_alloc/free_coherent

Signed-off-by: Ritesh Harjani <ritesh.harjani@gmail.com>
[catalin.marinas@arm.com: renamed arm64_swiotlb_dma_ops to coherent_swiotlb_dma_ops]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# ccc9e244 04-Feb-2014 Laura Abbott <lauraa@codeaurora.org>

arm64: Align CMA sizes to PAGE_SIZE

dma_alloc_from_contiguous takes number of pages for a size.
Align up the dma size passed in to page size to avoid truncation
and allocation failures on sizes less than PAGE_SIZE.

Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 6ac2104d 12-Dec-2013 Laura Abbott <lauraa@codeaurora.org>

arm64: Enable CMA

arm64 bit targets need the features CMA provides. Add the appropriate
hooks, header files, and Kconfig to allow this to happen.

Cc: Will Deacon <will.deacon@arm.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# c666e8d5 12-Dec-2013 Laura Abbott <lauraa@codeaurora.org>

arm64: Warn on NULL device structure for dma APIs

Although parts of the DMA apis may properly check for NULL devices,
there may be some places that don't. Rather than fix up all the
possible locations, just require a non-NULL device structure to be
used for allocating/freeing.

Cc: Will Deacon <will.deacon@arm.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
[catalin.marinas@arm.com: s/WARN/WARN_ONCE/]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 27222a3d 03-Oct-2012 Catalin Marinas <catalin.marinas@arm.com>

arm64: Call swiotlb_init() instead of swiotlb_init_with_default_size()

Following commit 74838b7 (swiotlb: add the late swiotlb initialization
function with iotlb memory) the swiotlb_init_with_default_size() is a
static function. This patch changes the arm64 code to call
swiotlb_init() instead and use the default size of 64MB. It is assumed
that AArch64 platforms have enough RAM to afford the pre-allocated
swiotlb memory. It also removes the #ifdef around this call since
CONFIG_SWIOTLB is always enabled.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 09b55412 05-Mar-2012 Catalin Marinas <catalin.marinas@arm.com>

arm64: DMA mapping API

This patch adds support for the DMA mapping API. It uses dma_map_ops for
flexibility and it currently supports swiotlb. This patch could be
simplified further if the DMA accesses are coherent (not mandated by the
architecture) or if corresponding hooks are placed in the generic
swiotlb code to deal with cache maintenance.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Olof Johansson <olof@lixom.net>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>