Cross Reference: /linux-master/drivers/iommu/io-pgtable-arm.c

History log of /linux-master/drivers/iommu/io-pgtable-arm.c
Revision	Date	Author	Comments
# 87639e01	24-Nov-2023	Boris Brezillon <boris.brezillon@collabora.com>	iommu: Extend LPAE page table format to support custom allocators We need that in order to implement the VM_BIND ioctl in the GPU driver targeting new Mali GPUs. VM_BIND is about executing MMU map/unmap requests asynchronously, possibly after waiting for external dependencies encoded as dma_fences. We intend to use the drm_sched framework to automate the dependency tracking and VM job dequeuing logic, but this comes with its own set of constraints, one of them being the fact we are not allowed to allocate memory in the drm_gpu_scheduler_ops::run_job() to avoid this sort of deadlocks: - VM_BIND map job needs to allocate a page table to map some memory to the VM. No memory available, so kswapd is kicked - GPU driver shrinker backend ends up waiting on the fence attached to the VM map job or any other job fence depending on this VM operation. With custom allocators, we will be able to pre-reserve enough pages to guarantee the map/unmap operations we queued will take place without going through the system allocator. But we can also optimize allocation/reservation by not free-ing pages immediately, so any upcoming page table allocation requests can be serviced by some free page table pool kept at the driver level. I might also be valuable for other aspects of GPU and similar use-cases, like fine-grained memory accounting and resource limiting. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20231124142434.1577550-3-boris.brezillon@collabora.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
# 99cbb8e4	15-Nov-2022	Robin Murphy <robin.murphy@arm.com>	iommu/io-pgtable-arm: Remove map/unmap With all users now calling {map,unmap}_pages, remove the wrappers. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/162e58e83ed42f78c3fbefe78c9b5410dd1dc412.1668100209.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
# 745ef109	16-Sep-2022	Janne Grunau <j@jannau.net>	iommu/io-pgtable: Move Apple DART support to its own file The pte format used by the DARTs found in the Apple M1 (t8103) is not fully compatible with io-pgtable-arm. The 24 MSB are used for subpage protection (mapping only parts of page) and conflict with the address mask. In addition bit 1 is not available for tagging entries but disables subpage protection. Subpage protection could be useful to support a CPU granule of 4k with the fixed IOMMU page size of 16k. The DARTs found on Apple M1 Pro/Max/Ultra use another different pte format which is even less compatible. To support an output address size of 42 bit the address is shifted down by 4. Subpage protection is mandatory and bit 1 signifies uncached mappings used by the display controller. It would be advantageous to share code for all known Apple DART variants to support common features. The page table allocator for DARTs is less complex since it uses a two levels of translation table without support for huge pages. Signed-off-by: Janne Grunau <j@jannau.net> Acked-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Sven Peter <sven@svenpeter.dev> Acked-by: Hector Martin <marcan@marcan.st> Link: https://lore.kernel.org/r/20220916094152.87137-3-j@jannau.net [ joro: Fix compile warning in __dart_alloc_pages()] Signed-off-by: Joerg Roedel <jroedel@suse.de>
# ca25ec24	15-Aug-2022	Robin Murphy <robin.murphy@arm.com>	iommu/io-pgtable-arm: Remove iommu_dev==NULL special case The special case to allow iommu_dev==NULL in __arm_lpae_alloc_pages() is confusing to static checkers (and possibly readers in general), since it's not obvious that that is only intended for the selftests. However it only serves to get around the dev_to_node() call, and we can easily fake up enough to make that work anyway, so let's simply remove this consideration from the normal flow and punt the responsibility over to the test harness itself. Reported-by: Rustam Subkhankulov <subkhankulov@ispras.ru> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/e2095eeda305071cb56c2cb8ac8a82dc3bd4dcab.1660580155.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
# 9abe2ac8	19-Nov-2021	Hector Martin <marcan@marcan.st>	iommu/io-pgtable-arm: Fix table descriptor paddr formatting Table descriptors were being installed without properly formatting the address using paddr_to_iopte, which does not match up with the iopte_deref in __arm_lpae_map. This is incorrect for the LPAE pte format, as it does not handle the high bits properly. This was found on Apple T6000 DARTs, which require a new pte format (different shift); adding support for that to paddr_to_iopte/iopte_to_paddr caused it to break badly, as even <48-bit addresses would end up incorrect in that case. Fixes: 6c89928ff7a0 ("iommu/io-pgtable-arm: Support 52-bit physical address") Acked-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Hector Martin <marcan@marcan.st> Link: https://lore.kernel.org/r/20211120031343.88034-1-marcan@marcan.st Signed-off-by: Joerg Roedel <jroedel@suse.de>
# f7403abf	20-Aug-2021	Robin Murphy <robin.murphy@arm.com>	iommu/io-pgtable: Abstract iommu_iotlb_gather access Previously io-pgtable merely passed the iommu_iotlb_gather pointer through to helpers, but now it has grown its own direct dereference. This turns out to break the build for !IOMMU_API configs where the structure only has a dummy definition. It will probably also crash drivers who don't use the gather mechanism and simply pass in NULL. Wrap this dereference in a suitable helper which can both be stubbed out for !IOMMU_API and encapsulate a NULL check otherwise. Fixes: 7a7c5badf858 ("iommu: Indicate queued flushes via gather data") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/83672ee76f6405c82845a55c148fa836f56fbbc1.1629465282.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
# a8e5f044	11-Aug-2021	Robin Murphy <robin.murphy@arm.com>	iommu/io-pgtable: Remove non-strict quirk IO_PGTABLE_QUIRK_NON_STRICT was never a very comfortable fit, since it's not a quirk of the pagetable format itself. Now that we have a more appropriate way to convey non-strict unmaps, though, this last of the non-quirk quirks can also go, and with the flush queue code also now enforcing its own ordering we can have a lovely cleanup all round. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/155b5c621cd8936472e273a8b07a182f62c6c20d.1628682049.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
# 892384cd	03-Aug-2021	Sven Peter <sven@svenpeter.dev>	iommu/io-pgtable: Add DART pagetable format Apple's DART iommu uses a pagetable format that shares some similarities with the ones already implemented by io-pgtable.c. Add a new format variant to support the required differences so that we don't have to duplicate the pagetable handling code. Reviewed-by: Alexander Graf <graf@amazon.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Sven Peter <sven@svenpeter.dev> Link: https://lore.kernel.org/r/20210803121651.61594-2-sven@svenpeter.dev Signed-off-by: Joerg Roedel <jroedel@suse.de>
# 4a77b12d	16-Jun-2021	Isaac J. Manjarres <isaacm@codeaurora.org>	iommu/io-pgtable-arm: Implement arm_lpae_map_pages() Implement the map_pages() callback for the ARM LPAE io-pgtable format. Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org> Signed-off-by: Georgi Djakov <quic_c_gdjako@quicinc.com> Link: https://lore.kernel.org/r/1623850736-389584-12-git-send-email-quic_c_gdjako@quicinc.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
# 1fe27be5	16-Jun-2021	Isaac J. Manjarres <isaacm@codeaurora.org>	iommu/io-pgtable-arm: Implement arm_lpae_unmap_pages() Implement the unmap_pages() callback for the ARM LPAE io-pgtable format. Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org> Suggested-by: Will Deacon <will@kernel.org> Signed-off-by: Georgi Djakov <quic_c_gdjako@quicinc.com> Link: https://lore.kernel.org/r/1623850736-389584-11-git-send-email-quic_c_gdjako@quicinc.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
# 41e1eb25	16-Jun-2021	Isaac J. Manjarres <isaacm@codeaurora.org>	iommu/io-pgtable-arm: Prepare PTE methods for handling multiple entries The PTE methods currently operate on a single entry. In preparation for manipulating multiple PTEs in one map or unmap call, allow them to handle multiple PTEs. Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org> Suggested-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Georgi Djakov <quic_c_gdjako@quicinc.com> Link: https://lore.kernel.org/r/1623850736-389584-10-git-send-email-quic_c_gdjako@quicinc.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
# fefe8527	25-Nov-2020	Robin Murphy <robin.murphy@arm.com>	iommu/io-pgtable: Remove tlb_flush_leaf The only user of tlb_flush_leaf is a particularly hairy corner of the Arm short-descriptor code, which wants a synchronous invalidation to minimise the races inherent in trying to split a large page mapping. This is already far enough into "here be dragons" territory that no sensible caller should ever hit it, and thus it really doesn't need optimising. Although using tlb_flush_walk there may technically be more heavyweight than needed, it does the job and saves everyone else having to carry around useless baggage. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://lore.kernel.org/r/9844ab0c5cb3da8b2f89c6c2da16941910702b41.1606324115.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
# f37eb484	07-Dec-2020	Kunkun Jiang <jiangkunkun@huawei.com>	iommu/io-pgtable-arm: Remove unused 'level' parameter from iopte_type() macro The 'level' parameter to the iopte_type() macro is unused, so remove it. Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com> Link: https://lore.kernel.org/r/20201207120150.1891-1-jiangkunkun@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
# f12e0d22	07-Dec-2020	Keqian Zhu <zhukeqian1@huawei.com>	iommu: Defer the early return in arm_(v7s/lpae)_map Although handling a mapping request with no permissions is a trivial no-op, defer the early return until after the size/range checks so that we are consistent with other mapping requests. Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com> Link: https://lore.kernel.org/r/20201207115758.9400-1-zhukeqian1@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
# e67890c9	24-Nov-2020	Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>	iommu/io-pgtable-arm: Add support to use system cache Add a quirk IO_PGTABLE_QUIRK_ARM_OUTER_WBWA to override the outer-cacheability attributes set in the TCR for a non-coherent page table walker when using system cache. Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org> Link: https://lore.kernel.org/r/f818676b4a2a9ad1edb92721947d47db41ed6a7c.1606287059.git.saiprakash.ranjan@codeaurora.org Signed-off-by: Will Deacon <will@kernel.org>
# 728da60d	22-Sep-2020	Robin Murphy <robin.murphy@arm.com>	iommu/io-pgtable-arm: Support coherency for Mali LPAE Midgard GPUs have ACE-Lite master interfaces which allows systems to integrate them in an I/O-coherent manner. It seems that from the GPU's viewpoint, the rest of the system is its outer shareable domain, and so even when snoop signals are wired up, they are only emitted for outer shareable accesses. As such, setting the TTBR_SHARE_OUTER bit does indeed get coherent pagetable walks working nicely for the coherent T620 in the Arm Juno SoC. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Tested-by: Neil Armstrong <narmstrong@baylibre.com> Reviewed-by: Steven Price <steven.price@arm.com> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Link: https://patchwork.freedesktop.org/patch/msgid/8df778355378127ea7eccc9521d6427e3e48d4f2.1600780574.git.robin.murphy@arm.com
# 7cef39dd	17-Sep-2020	Jean-Philippe Brucker <jean-philippe@linaro.org>	iommu/io-pgtable-arm: Move some definitions to a header Extract some of the most generic TCR defines, so they can be reused by the page table sharing code. Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20200918101852.582559-6-jean-philippe@linaro.org Signed-off-by: Will Deacon <will@kernel.org>
# b9bb694b	21-Sep-2020	Robin Murphy <robin.murphy@arm.com>	iommu/io-pgtable-arm: Clean up faulty sanity check Checking for a nonzero dma_pfn_offset was a quick shortcut to validate whether the DMA == phys assumption could hold at all. Checking for a non-NULL dma_range_map is not quite equivalent, since a map may be present to describe a limited DMA window even without an offset, and thus this check can now yield false positives. However, it only ever served to short-circuit going all the way through to __arm_lpae_alloc_pages(), failing the canonical test there, and having a bit more to clean up. As such, we can simply remove it without loss of correctness. Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
# e0d07278	17-Sep-2020	Jim Quinlan <james.quinlan@broadcom.com>	dma-mapping: introduce DMA range map, supplanting dma_pfn_offset The new field 'dma_range_map' in struct device is used to facilitate the use of single or multiple offsets between mapping regions of cpu addrs and dma addrs. It subsumes the role of "dev->dma_pfn_offset" which was only capable of holding a single uniform offset and had no region bounds checking. The function of_dma_get_range() has been modified so that it takes a single argument -- the device node -- and returns a map, NULL, or an error code. The map is an array that holds the information regarding the DMA regions. Each range entry contains the address offset, the cpu_start address, the dma_start address, and the size of the region. of_dma_configure() is the typical manner to set range offsets but there are a number of ad hoc assignments to "dev->dma_pfn_offset" in the kernel driver code. These cases now invoke the function dma_direct_set_offset(dev, cpu_addr, dma_addr, size). Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com> [hch: various interface cleanups] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Tested-by: Mathieu Poirier <mathieu.poirier@linaro.org> Tested-by: Nathan Chancellor <natechancellor@gmail.com>
# f34ce7a7	11-Jun-2020	Baolin Wang <baolin.wang@linux.alibaba.com>	iommu: Add gfp parameter to io_pgtable_ops->map() Now the ARM page tables are always allocated by GFP_ATOMIC parameter, but the iommu_ops->map() function has been added a gfp_t parameter by commit 781ca2de89ba ("iommu: Add gfp parameter to iommu_ops::map"), thus io_pgtable_ops->map() should use the gfp parameter passed from iommu_ops->map() to allocate page pages, which can avoid wasting the memory allocators atomic pools for some non-atomic contexts. Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/3093df4cb95497aaf713fca623ce4ecebb197c2e.1591930156.git.baolin.wang@linux.alibaba.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
# ecd7274f	04-Jun-2020	Will Deacon <will@kernel.org>	iommu: Remove unused IOMMU_SYS_CACHE_ONLY flag The IOMMU_SYS_CACHE_ONLY flag was never exposed via the DMA API and has no in-tree users. Remove it. Cc: Robin Murphy <robin.murphy@arm.com> Cc: "Isaac J. Manjarres" <isaacm@codeaurora.org> Cc: Joerg Roedel <joro@8bytes.org> Cc: Rob Clark <robdclark@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org> Signed-off-by: Will Deacon <will@kernel.org>
# 08090744	28-Feb-2020	Robin Murphy <robin.murphy@arm.com>	iommu/io-pgtable-arm: Fix IOVA validation for 32-bit Since we ony support the TTB1 quirk for AArch64 contexts, and consequently only for 64-bit builds, the sign-extension aspect of the "are all bits above IAS consistent?" check should implicitly only apply to 64-bit IOVAs. Change the type of the cast to ensure that 32-bit longs don't inadvertently get sign-extended, and thus considered invalid, if they happen to be above 2GB in the TTB0 region. Reported-by: Stephan Gerhold <stephan@gerhold.net> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Acked-by: Will Deacon <will@kernel.org> Fixes: db6903010aa5 ("iommu/io-pgtable-arm: Prepare for TTBR1 usage") Signed-off-by: Joerg Roedel <jroedel@suse.de>
# db690301	25-Oct-2019	Robin Murphy <robin.murphy@arm.com>	iommu/io-pgtable-arm: Prepare for TTBR1 usage Now that we can correctly extract top-level indices without relying on the remaining upper bits being zero, the only remaining impediments to using a given table for TTBR1 are the address validation on map/unmap and the awkward TCR translation granule format. Add a quirk so that we can do the right thing at those points. Tested-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
# ac4b80e5	10-Jan-2020	Will Deacon <will@kernel.org>	iommu/io-pgtable-arm: Rationalise VTCR handling Commit 05a648cd2dd7 ("iommu/io-pgtable-arm: Rationalise TCR handling") reworked the way in which the TCR register value is returned from the io-pgtable code when targetting the Arm long-descriptor format, in preparation for allowing page-tables to target TTBR1. As it turns out, the new interface is a lot nicer to use, so do the same conversion for the VTCR register even though there is only a single base register for stage-2 translation. Cc: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
# fb485eb1	25-Oct-2019	Robin Murphy <robin.murphy@arm.com>	iommu/io-pgtable-arm: Rationalise TCR handling Although it's conceptually nice for the io_pgtable_cfg to provide a standard VMSA TCR value, the reality is that no VMSA-compliant IOMMU looks exactly like an Arm CPU, and they all have various other TCR controls which io-pgtable can't be expected to understand. Thus since there is an expectation that drivers will have to add to the given TCR value anyway, let's strip it down to just the essentials that are directly relevant to io-pgtable's inner workings - namely the various sizes and the walk attributes. Tested-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> [will: Add missing include of bitfield.h] Signed-off-by: Will Deacon <will@kernel.org>
# 6f932ad3	09-Jan-2020	Will Deacon <will@kernel.org>	iommu/io-pgtable-arm: Ensure ARM_64_LPAE_S2_TCR_RES1 is unsigned ARM_64_LPAE_S2_TCR_RES1 is intended to map to bit 31 of the VTCR register, which is required to be set to 1 by the architecture. Unfortunately, we accidentally treat this as a signed quantity which means we also set the upper 32 bits of the VTCR to one, and they are required to be zero. Treat ARM_64_LPAE_S2_TCR_RES1 as unsigned to avoid the unwanted sign-extension up to 64 bits. Cc: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
# 7618e479	10-Jan-2020	Robin Murphy <robin.murphy@arm.com>	iommu/io-pgtable-arm: Improve attribute handling By VMSA rules, using Normal Non-Cacheable type with a shareability attribute of anything other than Outer Shareable is liable to lead into unpredictable territory: \| Overlaying the shareability attribute (B3-1377, ARM DDI 0406C.c) \| \| A memory region with a resultant memory type attribute of Normal, and \| a resultant cacheability attribute of Inner Non-cacheable, Outer \| Non-cacheable, must have a resultant shareability attribute of Outer \| Shareable, otherwise shareability is UNPREDICTABLE Although the SMMU architectures seem to give some slightly stronger guarantees of Non-Cacheable output types becoming implicitly Outer Shareable in most cases, we may as well be explicit and not take any chances. It's also weird that LPAE attribute handling is currently split between prot_to_pte() and init_pte() given that it can all be statically determined up-front. Thus, collect all the LPAE attributes into prot_to_pte() in order to logically pick the shareability based on the incoming IOMMU API prot value, and tweak the short-descriptor code to stop setting TTBR0.NOS for Non-Cacheable walks. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
# 30d2acb6	10-Jan-2020	Will Deacon <will@kernel.org>	iommu/io-pgtable-arm: Support non-coherent stage-2 page tables Commit 9e6ea59f3ff3 ("iommu/io-pgtable: Support non-coherent page tables") added support for non-coherent page-table walks to the Arm IOMMU page-table backends. Unfortunately, it left the stage-2 allocator unchanged, so let's hook that up in the same way. Cc: Bjorn Andersson <bjorn.andersson@linaro.org> Cc: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
# d1e5f26f	25-Oct-2019	Robin Murphy <robin.murphy@arm.com>	iommu/io-pgtable-arm: Rationalise TTBRn handling TTBR1 values have so far been redundant since no users implement any support for split address spaces. Crucially, though, one of the main reasons for wanting to do so is to be able to manage each half entirely independently, e.g. context-switching one set of mappings without disturbing the other. Thus it seems unlikely that tying two tables together in a single io_pgtable_cfg would ever be particularly desirable or useful. Streamline the configs to just a single conceptual TTBR value representing the allocated table. This paves the way for future users to support split address spaces by simply allocating a table and dealing with the detailed TTBRn logistics themselves. Tested-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> [will: Drop change to ttbr value] Signed-off-by: Will Deacon <will@kernel.org>
# dd5ddd3c	24-Oct-2019	Will Deacon <will@kernel.org>	iommu/io-pgtable-arm: Rename IOMMU_QCOM_SYS_CACHE and improve doc The 'IOMMU_QCOM_SYS_CACHE' IOMMU protection flag is exposed to all users of the IOMMU API. Despite its name, the idea behind it isn't especially tied to Qualcomm implementations and could conceivably be used by other systems. Rename it to 'IOMMU_SYS_CACHE_ONLY' and update the comment to describe a bit better the idea behind it. Cc: Robin Murphy <robin.murphy@arm.com> Cc: "Isaac J. Manjarres" <isaacm@codeaurora.org> Signed-off-by: Will Deacon <will@kernel.org>
# 205577ab	25-Oct-2019	Robin Murphy <robin.murphy@arm.com>	iommu/io-pgtable-arm: Rationalise MAIR handling Between VMSAv8-64 and the various 32-bit formats, there is either one 64-bit MAIR or a pair of 32-bit MAIR0/MAIR1 or NMRR/PMRR registers. As such, keeping two 64-bit values in io_pgtable_cfg has always been overkill. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
# 5fb190b0	25-Oct-2019	Robin Murphy <robin.murphy@arm.com>	iommu/io-pgtable-arm: Simplify level indexing The nature of the LPAE format means that data->pg_shift is always redundant with data->bits_per_level, since they represent the size of a page and the number of PTEs per page respectively, and the size of a PTE is constant. Thus it works out more efficient to only store the latter, and derive the former via a trivial addition where necessary. Signed-off-by: Robin Murphy <robin.murphy@arm.com> [will: Reworked granule check in iopte_to_paddr()] Signed-off-by: Will Deacon <will@kernel.org>
# c79278c1	25-Oct-2019	Robin Murphy <robin.murphy@arm.com>	iommu/io-pgtable-arm: Simplify PGD size handling We use data->pgd_size directly for the one-off allocation and freeing of the top-level table, but otherwise it serves for ARM_LPAE_PGD_IDX() to repeatedly re-calculate the effective number of top-level address bits it represents. Flip this around so we store the form we most commonly need, and derive the lesser-used one instead. This cuts a whole bunch of code out of the map/unmap/iova_to_phys fast-paths. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
# 594ab90f	25-Oct-2019	Robin Murphy <robin.murphy@arm.com>	iommu/io-pgtable-arm: Simplify start level lookup Beyond a couple of allocation-time calculations, data->levels is only ever used to derive the start level. Storing the start level directly leads to a small reduction in object code, which should help eke out a little more efficiency, and slightly more readable source to boot. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
# 67f3e53d	25-Oct-2019	Robin Murphy <robin.murphy@arm.com>	iommu/io-pgtable-arm: Simplify bounds checks We're merely checking that the relevant upper bits of each address are all zero, so there are cheaper ways to achieve that. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
# f7b90d2c	25-Oct-2019	Robin Murphy <robin.murphy@arm.com>	iommu/io-pgtable-arm: Rationalise size check It makes little sense to only validate the requested size after we think we've found a matching block size - making the check up-front is simple, and far more logical than waiting to walk off the bottom of the table to infer that we must have been passed a bogus size to start with. We're missing an equivalent check on the unmap path, so add that as well for consistency. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
# b5813c16	25-Oct-2019	Robin Murphy <robin.murphy@arm.com>	iommu/io-pgtable: Make selftest gubbins consistently __init The selftests run as an initcall, but the annotation of the various callbacks and data seems to be somewhat arbitrary. Add it consistently for everything related to the selftests. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
# 9062c1d0	09-Sep-2019	Christophe JAILLET <christophe.jaillet@wanadoo.fr>	iommu/io-pgtable: Move some initialization data to .init.rodata The memory used by '__init' functions can be freed once the initialization phase has been performed. Mark some 'static const' array defined and used within some '__init' functions as '__initconst', so that the corresponding data can also be discarded. Without '__initconst', the data are put in the .rodata section. With the qualifier, they are put in the .init.rodata section. With gcc 8.3.0, the following changes have been measured: Without '__initconst': section size .rodata 00000720 .init.rodata 00000018 With '__initconst': section size .rodata 00000660 .init.rodata 00000058 Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Will Deacon <will@kernel.org>
# 1be08f45	30-Sep-2019	Robin Murphy <robin.murphy@arm.com>	iommu/io-pgtable-arm: Support all Mali configurations In principle, Midgard GPUs supporting smaller VA sizes should only require 3-level pagetables, since level 0 only resolves bits 48:40 of the address. However, the kbase driver does not appear to have any notion of a variable start level, and empirically T720 and T820 rapidly blow up with translation faults unless given a full 4-level table, despite only supporting a 33-bit VA size. The 'real' IAS value is still valuable in terms of validating addresses on map/unmap, so tweak the allocator to allow smaller values while still forcing the resultant tables to the full 4 levels. As far as I can test, this should make all known Midgard variants happy. Fixes: d08d42de6432 ("iommu: io-pgtable: Add ARM Mali midgard MMU page table format") Tested-by: Neil Armstrong <narmstrong@baylibre.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
# 52f325f4	30-Sep-2019	Robin Murphy <robin.murphy@arm.com>	iommu/io-pgtable-arm: Correct Mali attributes Whilst Midgard's MEMATTR follows a similar principle to the VMSA MAIR, the actual attribute values differ, so although it currently appears to work to some degree, we probably shouldn't be using our standard stage 1 MAIR for that. Instead, generate a reasonable MEMATTR with attribute values borrowed from the kbase driver; at this point we'll be overriding or ignoring pretty much all of the LPAE config, so just implement these Mali details in a dedicated allocator instead of pretending to subclass the standard VMSA format. Fixes: d08d42de6432 ("iommu: io-pgtable: Add ARM Mali midgard MMU page table format") Tested-by: Neil Armstrong <narmstrong@baylibre.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will@kernel.org>