Cross Reference: /linux-master/arch/arm/mm/fault-armv.c

History log of /linux-master/arch/arm/mm/fault-armv.c
Revision	Date	Author	Comments
# 8b5989f3	02-Aug-2023	Matthew Wilcox (Oracle) <willy@infradead.org>	arm: implement the new page table range API Add set_ptes(), update_mmu_cache_range(), flush_dcache_folio() and flush_icache_pages(). Change the PG_dcache_clear flag from being per-page to per-folio which makes __dma_page_dev_to_cpu() a bit more exciting. Also add flush_cache_pages(), even though this isn't used by generic code (yet?) [m.szyprowski@samsung.com: fix potential endless loop in __dma_page_dev_to_cpu()] Link: https://lkml.kernel.org/r/20230809172737.3574190-1-m.szyprowski@samsung.com [willy@infradead.org: fix folio conversion in __dma_page_dev_to_cpu()] Link: https://lkml.kernel.org/r/20230823191852.1556561-1-willy@infradead.org Link: https://lkml.kernel.org/r/20230802151406.3735276-10-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
# de2e4626	11-Jul-2023	Hugh Dickins <hughd@google.com>	arm: adjust_pte() use pte_offset_map_nolock() Instead of pte_lockptr(), use the recently added pte_offset_map_nolock() in adjust_pte(): because it gives the not-locked ptl for precisely that pte, which the caller can then safely lock; whereas pte_lockptr() is not so tightly coupled, because it dereferences the pmd pointer again. Link: https://lkml.kernel.org/r/4d5258bd-ffa0-018-253a-25f2c9b783f7@google.com Signed-off-by: Hugh Dickins <hughd@google.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Christoph Hellwig <hch@infradead.org> Cc: Claudio Imbrenda <imbrenda@linux.ibm.com> Cc: David Hildenbrand <david@redhat.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Huang, Ying <ying.huang@intel.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Jann Horn <jannh@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Lorenzo Stoakes <lstoakes@gmail.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Naoya Horiguchi <naoya.horiguchi@nec.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Peter Xu <peterx@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Ralph Campbell <rcampbell@nvidia.com> Cc: Russell King <linux@armlinux.org.uk> Cc: SeongJae Park <sj@kernel.org> Cc: Song Liu <song@kernel.org> Cc: Steven Price <steven.price@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Cc: Yang Shi <shy828301@gmail.com> Cc: Yu Zhao <yuzhao@google.com> Cc: Zack Rusin <zackr@vmware.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
# 766b59e8	08-Jun-2023	Hugh Dickins <hughd@google.com>	arm: allow pte_offset_map[_lock]() to fail Patch series "arch: allow pte_offset_map[_lock]() to fail", v2. What is it all about? Some mmap_lock avoidance i.e. latency reduction. Initially just for the case of collapsing shmem or file pages to THPs; but likely to be relied upon later in other contexts e.g. freeing of empty page tables (but that's not work I'm doing). mmap_write_lock avoidance when collapsing to anon THPs? Perhaps, but again that's not work I've done: a quick attempt was not as easy as the shmem/file case. I would much prefer not to have to make these small but wide-ranging changes for such a niche case; but failed to find another way, and have heard that shmem MADV_COLLAPSE's usefulness is being limited by that mmap_write_lock it currently requires. These changes (though of course not these exact patches, and not all of these architectures!) have been in Google's data centre kernel for three years now: we do rely upon them. What are the per-arch changes about? Generally, two things. One: the current mmap locking may not be enough to guard against that tricky transition between pmd entry pointing to page table, and empty pmd entry, and pmd entry pointing to huge page: pte_offset_map() will have to validate the pmd entry for itself, returning NULL if no page table is there. What to do about that varies: often the nearby error handling indicates just to skip it; but in some cases a "goto again" looks appropriate (and if that risks an infinite loop, then there must have been an oops, or pfn 0 mistaken for page table, before). Deeper study of each site might show that 90% of them here in arch code could only fail if there's corruption e.g. a transition to THP would be surprising on an arch without HAVE_ARCH_TRANSPARENT_HUGEPAGE. But given the likely extension to freeing empty page tables, I have not limited this set of changes to THP; and it has been easier, and sets a better example, if each site is given appropriate handling. Two: pte_offset_map() will need to do an rcu_read_lock(), with the corresponding rcu_read_unlock() in pte_unmap(). But most architectures never supported CONFIG_HIGHPTE, so some don't always call pte_unmap() after pte_offset_map(), or have used userspace pte_offset_map() where pte_offset_kernel() is more correct. No problem in the current tree, but a problem once an rcu_read_unlock() will be needed to keep balance. A common special case of that comes in arch/*/mm/hugetlbpage.c, if the architecture supports hugetlb pages down at the lowest PTE level. huge_pte_alloc() uses pte_alloc_map(), but generic hugetlb code does no corresponding pte_unmap(); similarly for huge_pte_offset(). In rare transient cases, not yet made possible, pte_offset_map() and pte_offset_map_lock() may not find a page table: handle appropriately. Link: https://lkml.kernel.org/r/a4963be9-7aa6-350-66d0-2ba843e1af44@google.com Link: https://lkml.kernel.org/r/813429a1-204a-1844-eeae-7fd72826c28@google.com Signed-off-by: Hugh Dickins <hughd@google.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Chris Zankel <chris@zankel.net> Cc: Claudio Imbrenda <imbrenda@linux.ibm.com> Cc: David Hildenbrand <david@redhat.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: John David Anglin <dave.anglin@bell.net> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Will Deacon <will@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
# e31cf2f4	08-Jun-2020	Mike Rapoport <rppt@kernel.org>	mm: don't include asm/pgtable.h if linux/mm.h is already included Patch series "mm: consolidate definitions of page table accessors", v2. The low level page table accessors (pXY_index(), pXY_offset()) are duplicated across all architectures and sometimes more than once. For instance, we have 31 definition of pgd_offset() for 25 supported architectures. Most of these definitions are actually identical and typically it boils down to, e.g. static inline unsigned long pmd_index(unsigned long address) { return (address >> PMD_SHIFT) & (PTRS_PER_PMD - 1); } static inline pmd_t pmd_offset(pud_t pud, unsigned long address) { return (pmd_t )pud_page_vaddr(pud) + pmd_index(address); } These definitions can be shared among 90% of the arches provided XYZ_SHIFT, PTRS_PER_XYZ and xyz_page_vaddr() are defined. For architectures that really need a custom version there is always possibility to override the generic version with the usual ifdefs magic. These patches introduce include/linux/pgtable.h that replaces include/asm-generic/pgtable.h and add the definitions of the page table accessors to the new header. This patch (of 12): The linux/mm.h header includes <asm/pgtable.h> to allow inlining of the functions involving page table manipulations, e.g. pte_alloc() and pmd_alloc(). So, there is no point to explicitly include <asm/pgtable.h> in the files that include <linux/mm.h>. The include statements in such cases are remove with a simple loop: for f in $(git grep -l "include <linux/mm.h>") ; do sed -i -e '/include <asm\/pgtable.h>/ d' $f done Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Cain <bcain@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Mark Salter <msalter@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nick Hu <nickhu@andestech.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200514170327.31389-1-rppt@kernel.org Link: http://lkml.kernel.org/r/20200514170327.31389-2-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 84e6ffb2	04-Jun-2020	Mike Rapoport <rppt@kernel.org>	arm: add support for folded p4d page tables Implement primitives necessary for the 4th level folding, add walks of p4d level where appropriate, and remove __ARCH_USE_5LEVEL_HACK. [rppt@linux.ibm.com: fix kexec] Link: http://lkml.kernel.org/r/20200508174232.GA759899@linux.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Brian Cain <bcain@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christophe Leroy <christophe.leroy@c-s.fr> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Geert Uytterhoeven <geert+renesas@glider.be> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: James Morse <james.morse@arm.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Julien Thierry <julien.thierry.kdev@gmail.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200414153455.21744-3-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# d2912cb1	04-Jun-2019	Thomas Gleixner <tglx@linutronix.de>	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 Based on 2 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation # extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 4122 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Enrico Weigelt <info@metux.net> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
# cb9f753a	05-Apr-2018	Huang Ying <ying.huang@intel.com>	mm: fix races between swapoff and flush dcache Thanks to commit 4b3ef9daa4fc ("mm/swap: split swap cache into 64MB trunks"), after swapoff the address_space associated with the swap device will be freed. So page_mapping() users which may touch the address_space need some kind of mechanism to prevent the address_space from being freed during accessing. The dcache flushing functions (flush_dcache_page(), etc) in architecture specific code may access the address_space of swap device for anonymous pages in swap cache via page_mapping() function. But in some cases there are no mechanisms to prevent the swap device from being swapoff, for example, CPU1 CPU2 __get_user_pages() swapoff() flush_dcache_page() mapping = page_mapping() ... exit_swap_address_space() ... kvfree(spaces) mapping_mapped(mapping) The address space may be accessed after being freed. But from cachetlb.txt and Russell King, flush_dcache_page() only care about file cache pages, for anonymous pages, flush_anon_page() should be used. The implementation of flush_dcache_page() in all architectures follows this too. They will check whether page_mapping() is NULL and whether mapping_mapped() is true to determine whether to flush the dcache immediately. And they will use interval tree (mapping->i_mmap) to find all user space mappings. While mapping_mapped() and mapping->i_mmap isn't used by anonymous pages in swap cache at all. So, to fix the race between swapoff and flush dcache, __page_mapping() is add to return the address_space for file cache pages and NULL otherwise. All page_mapping() invoking in flush dcache functions are replaced with page_mapping_file(). [akpm@linux-foundation.org: simplify page_mapping_file(), per Mike] Link: http://lkml.kernel.org/r/20180305083634.15174-1-ying.huang@intel.com Signed-off-by: "Huang, Ying" <ying.huang@intel.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Chen Liqin <liqin.linux@gmail.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: "David S. Miller" <davem@davemloft.net> Cc: Chris Zankel <chris@zankel.net> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Ley Foon Tan <lftan@altera.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 4ed89f22	28-Oct-2014	Russell King <rmk+kernel@arm.linux.org.uk>	ARM: convert printk(KERN_* to pr_* Convert many (but not all) printk(KERN_* to pr_* to simplify the code. We take the opportunity to join some printk lines together so we don't split the message across several lines, and we also add a few levels to some messages which were previously missing them. Tested-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
# 57c1ffce	14-Nov-2013	Kirill A. Shutemov <kirill.shutemov@linux.intel.com>	mm: rename USE_SPLIT_PTLOCKS to USE_SPLIT_PTE_PTLOCKS We're going to introduce split page table lock for PMD level. Let's rename existing split ptlock for PTE level to avoid confusion. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Tested-by: Alex Thorlton <athorlton@sgi.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Jones <davej@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kees Cook <keescook@chromium.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Robin Holt <robinmholt@gmail.com> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 6b2dbba8	08-Oct-2012	Michel Lespinasse <walken@google.com>	mm: replace vma prio_tree with an interval tree Implement an interval tree as a replacement for the VMA prio_tree. The algorithms are similar to lib/interval_tree.c; however that code can't be directly reused as the interval endpoints are not explicitly stored in the VMA. So instead, the common algorithm is moved into a template and the details (node type, how to get interval endpoints from the node, etc) are filled in using the C preprocessor. Once the interval tree functions are available, using them as a replacement to the VMA prio tree is a relatively simple, mechanical job. Signed-off-by: Michel Lespinasse <walken@google.com> Cc: Rik van Riel <riel@redhat.com> Cc: Hillf Danton <dhillf@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# d91ef63b	22-Jul-2011	Paul Gortmaker <paul.gortmaker@windriver.com>	arm: remove several unnecessary module.h include instances Building these files does not reveal a hidden need for any of these. Since module.h brings in the whole kitchen sink, it just needlessly adds 30k+ lines to the cpp burden. There are probably lots more, but ARM files of mach-* and plat-* don't get coverage via a simple yesconfig build. They will have to be cleaned up and tested via using their respective configs. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
# 516295e5	21-Nov-2010	Russell King <rmk+kernel@arm.linux.org.uk>	ARM: pgtable: add pud-level code Add pud_offset() et.al. between the pgd and pmd code in preparation of using pgtable-nopud.h rather than 4level-fixup.h. This incorporates a fix from Jamie Iles <jamie@jamieiles.com> for uaccess_with_memcpy.c. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
# f6e3354d	15-Nov-2010	Russell King <rmk+kernel@arm.linux.org.uk>	ARM: pgtable: introduce pteval_t to represent a pte value This makes everywhere dealing with pte values use the same type. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
# 4e54d93d	28-Oct-2010	Mika Westerberg <mika.westerberg@iki.fi>	ARM: 6464/2: fix spinlock recursion in adjust_pte() When running following code in a machine which has VIVT caches and USE_SPLIT_PTLOCKS is not defined: fd = open("/etc/passwd", O_RDONLY); addr = mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, 0); addr2 = mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, 0); v = ((int )addr); we will hang in spinlock recursion in the page fault handler: BUG: spinlock recursion on CPU#0, mmap_test/717 lock: c5e295d8, .magic: dead4ead, .owner: mmap_test/717, .owner_cpu: 0 [<c0026604>] (unwind_backtrace+0x0/0xec) [<c014ee48>] (do_raw_spin_lock+0x40/0x140) [<c0027f68>] (update_mmu_cache+0x208/0x250) [<c0079db4>] (__do_fault+0x320/0x3ec) [<c007af7c>] (handle_mm_fault+0x2f0/0x6d8) [<c0027834>] (do_page_fault+0xdc/0x1cc) [<c00202d0>] (do_DataAbort+0x34/0x94) This comes from the fact that when USE_SPLIT_PTLOCKS is not defined, the only lock protecting the page tables is mm->page_table_lock which is already locked before update_mmu_cache() is called. Signed-off-by: Mika Westerberg <mika.westerberg@iki.fi> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
# ece0e2b6	26-Oct-2010	Peter Zijlstra <a.p.zijlstra@chello.nl>	mm: remove pte_map_nested() Since we no longer need to provide KM_type, the whole pte_map_nested() API is now redundant, remove it. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Chris Metcalf <cmetcalf@tilera.com> Cc: David Howells <dhowells@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Rik van Riel <riel@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: David Miller <davem@davemloft.net> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 6012191a	13-Sep-2010	Catalin Marinas <catalin.marinas@arm.com>	ARM: 6380/1: Introduce __sync_icache_dcache() for VIPT caches On SMP systems, there is a small chance of a PTE becoming visible to a different CPU before the current cache maintenance operations in update_mmu_cache(). To avoid this, cache maintenance must be handled in set_pte_at() (similar to IA-64 and PowerPC). This patch provides a unified VIPT cache handling mechanism and implements the __sync_icache_dcache() function for ARMv6 onwards architectures. It is called from set_pte_at() and replaces the update_mmu_cache(). The latter is still used on VIVT hardware where a vm_area_struct is required. Tested-by: Rabin Vincent <rabin.vincent@stericsson.com> Cc: Nicolas Pitre <nicolas.pitre@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
# c0177800	13-Sep-2010	Catalin Marinas <catalin.marinas@arm.com>	ARM: 6379/1: Assume new page cache pages have dirty D-cache There are places in Linux where writes to newly allocated page cache pages happen without a subsequent call to flush_dcache_page() (several PIO drivers including USB HCD). This patch changes the meaning of PG_arch_1 to be PG_dcache_clean and always flush the D-cache for a newly mapped page in update_mmu_cache(). The patch also sets the PG_arch_1 bit in the DMA cache maintenance function to avoid additional cache flushing in update_mmu_cache(). Tested-by: Rabin Vincent <rabin.vincent@stericsson.com> Cc: Nicolas Pitre <nicolas.pitre@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
# f76348a3	23-Dec-2009	Russell King <rmk+kernel@arm.linux.org.uk>	ARM: remove unnecessary cache flush This cache flush occurs when we first insert a page into the page tables, where a page did not exist previously. There can be no cache lines associated with this virtual mapping, so this cache flush is redundant. Tested-by: Mike Rapoport <mike@compulab.co.il> Tested-by: Mikael Pettersson <mikpe at it.uu.se> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
# 5a0e3ad6	24-Mar-2010	Tejun Heo <tj@kernel.org>	include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
# ae140202	18-Dec-2009	Russell King <rmk+kernel@arm.linux.org.uk>	ARM: make_coherent(): fix problems with highpte, part 2 update_mmu_cache() is called with the page table for the faulted-in page still mapped. We need to modify the PTE for this page to ensure coherency with other shared mappings when multiple shared mappings exist within a MM. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
# 4b3073e1	18-Dec-2009	Russell King <rmk+kernel@arm.linux.org.uk>	MM: Pass a PTE pointer to update_mmu_cache() rather than the PTE itself On VIVT ARM, when we have multiple shared mappings of the same file in the same MM, we need to ensure that we have coherency across all copies. We do this via make_coherent() by making the pages uncacheable. This used to work fine, until we allowed highmem with highpte - we now have a page table which is mapped as required, and is not available for modification via update_mmu_cache(). Ralf Beache suggested getting rid of the PTE value passed to update_mmu_cache(): On MIPS update_mmu_cache() calls __update_tlb() which walks pagetables to construct a pointer to the pte again. Passing a pte_t * is much more elegant. Maybe we might even replace the pte argument with the pte_t? Ben Herrenschmidt would also like the pte pointer for PowerPC: Passing the ptep in there is exactly what I want. I want that -instead- of the PTE value, because I have issue on some ppc cases, for I$/D$ coherency, where set_pte_at() may decide to mask out the _PAGE_EXEC. So, pass in the mapped page table pointer into update_mmu_cache(), and remove the PTE value, updating all implementations and call sites to suit. Includes a fix from Stephen Rothwell: sparc: fix fallout from update_mmu_cache API change Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
# ed42acae	18-Dec-2009	Russell King <rmk+kernel@arm.linux.org.uk>	ARM: make_coherent: avoid recalculating the pfn for the modified page We already know the pfn for the page to be modified in make_coherent, so let's stop recalculating it unnecessarily. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
# 56dd4709	18-Dec-2009	Russell King <rmk+kernel@arm.linux.org.uk>	ARM: make_coherent: fix problems with highpte, part 1 update_mmu_cache() is called with a page table already mapped. We call make_coherent(), which then calls adjust_pte() which wants to map other page tables. This causes kmap_atomic() to BUG() because the slot its trying to use is already taken. Since do_adjust_pte() modifies the page tables, we are also missing any form of locking, so we're risking corrupting the page tables. Fix this by using pte_offset_map_nested(), and taking the pte page table lock around do_adjust_pte(). Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
# f8a85f11	18-Dec-2009	Russell King <rmk+kernel@arm.linux.org.uk>	ARM: make_coherent: convert adjust_pte() to use p*d_none_or_clear_bad() Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
# c26c20b8	18-Dec-2009	Russell King <rmk+kernel@arm.linux.org.uk>	ARM: make_coherent: split adjust_pte() in two adjust_pte() walks the page tables, and do_adjust_pte() does the page table manipulation. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
# 52e8bfd8	23-Dec-2009	Russell King <rmk+kernel@arm.linux.org.uk>	ARM: Fix wrong shared bit for CPU write buffer bug test It is unpredictable to have the same memory mapped using different shared bit settings for ARMv6 and ARMv7 CPUs. Fix this for the CPU write buffer bug test. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
# 7b0a1003	24-Oct-2009	Russell King <rmk+kernel@arm.linux.org.uk>	ARM: Reduce __flush_dcache_page() visibility Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
# 421fe93c	25-Oct-2009	Russell King <rmk+kernel@arm.linux.org.uk>	ARM: ZERO_PAGE: Avoid flush_dcache_page() for zero page The zero page is read-only, and has its cache state cleared during boot. No further maintanence for this page is required. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
# 787b2faa	12-Oct-2009	Nitin Gupta <ngupta@vflare.org>	ARM: force dcache flush if dcache_dirty bit set On ARM, update_mmu_cache() does dcache flush for a page only if it has a kernel mapping (page_mapping(page) != NULL). The correct behavior would be to force the flush based on dcache_dirty bit only. One of the cases where present logic would be a problem is when a RAM based block device[1] is used as a swap disk. In this case, we would have in-memory data corruption as shown in steps below: do_swap_page() { - Allocate a new page (if not already in swap cache) - Issue read from swap disk - Block driver issues flush_dcache_page() - flush_dcache_page() simply sets PG_dcache_dirty bit and does not actually issue a flush since this page has no user space mapping yet. - Now, if swap disk is almost full, this newly read page is removed from swap cache and corrsponding swap slot is freed. - Map this page anonymously in user space. - update_mmu_cache() - Since this page does not have kernel mapping (its not in page/swap cache and is mapped anonymously), it does not issue dcache flush even if dcache_dirty bit is set by flush_dcache_page() above. <user now gets stale data since dcache was never flushed> } Same problem exists on mips too. [1] example: - brd (RAM based block device) - ramzswap (RAM based compressed swap device) Signed-off-by: Nitin Gupta <ngupta@vflare.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
# 08e445bd	16-Jan-2009	Nicolas Pitre <nico@cam.org>	[ARM] 5366/1: fix shared memory coherency with VIVT L1 + L2 caches When there are multiple L1-aliasing userland mappings of the same physical page, we currently remap each of them uncached, to prevent VIVT cache aliasing issues. (E.g. writes to one of the mappings not being immediately visible via another mapping.) However, when we do this remapping, there could still be stale data in the L2 cache, and an uncached mapping might bypass L2 and go straight to RAM. This would cause reads from such mappings to see old data (until the dirty L2 line is eventually evicted.) This issue is solved by forcing a L2 cache flush whenever the shared page is made L1 uncacheable. Ideally, we would make L1 uncacheable and L2 cacheable as L2 is PIPT. But Feroceon does not support that combination, and the TEX=5 C=0 B=0 encoding for XSc3 doesn't appear to work in practice. Signed-off-by: Nicolas Pitre <nico@marvell.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
# bb30f36f	06-Sep-2008	Russell King <rmk@dyn-67.arm.linux.org.uk>	[ARM] Introduce new PTE memory type bits Provide L_PTE_MT_xxx definitions to describe the memory types that we use in Linux/ARM. These definitions are carefully picked such that: 1. their LSBs match what is required for pre-ARMv6 CPUs. 2. they all have a unique encoding, including after modification by build_mem_type_table() (the result being that some have more than one combination.) Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
# 09d9bae0	05-Sep-2008	Russell King <rmk@dyn-67.arm.linux.org.uk>	[ARM] sparse: fix several warnings arch/arm/kernel/process.c:270:6: warning: symbol 'show_fpregs' was not declared. Should it be static? This function isn't used, so can be removed. arch/arm/kernel/setup.c:532:9: warning: symbol 'len' shadows an earlier one arch/arm/kernel/setup.c:524:6: originally declared here A function containing two 'len's. arch/arm/mm/fault-armv.c:188:13: warning: symbol 'check_writebuffer_bugs' was not declared. Should it be static? arch/arm/mm/mmap.c:122:5: warning: symbol 'valid_phys_addr_range' was not declared. Should it be static? arch/arm/mm/mmap.c:137:5: warning: symbol 'valid_mmap_phys_addr_range' was not declared. Should it be static? Missing includes. arch/arm/kernel/traps.c:71:77: warning: Using plain integer as NULL pointer arch/arm/mm/ioremap.c:355:46: error: incompatible types in comparison expression (different address spaces) Sillies. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
# 46097c7d	10-Aug-2008	Russell King <rmk@dyn-67.arm.linux.org.uk>	[ARM] cachetype: move definitions to separate header Rather than pollute asm/cacheflush.h with the cache type definitions, move them to asm/cachetype.h, and include this new header where necessary. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
# 53cdb27a	27-Jul-2008	Russell King <rmk@dyn-67.arm.linux.org.uk>	[ARM] Fix shared mmap when more than two maps of the same file exist The shared mmap code works fine for the test case, which only checked for two shared maps of the same file. However, three shared maps result in one mapping remaining cached, resulting in stale data being visible via that mapping. Fix this. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
# 826cbdaf	13-Jun-2008	Catalin Marinas <catalin.marinas@arm.com>	[ARM] 5092/1: Fix the I-cache invalidation on ARMv6 and later CPUs This patch adds the I-cache invalidation in update_mmu_cache if the corresponding vma is marked as executable. It also invalidates the I-cache if a thread migrates to a CPU it never ran on. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
# cb36bb75	14-Feb-2007	George G. Davis <gdavis@mvista.com>	[ARM] 4191/1: Remove redundant __flush_dcache_page() function prototype Commit 1c9d3df5e88ad7db23f5b22f4341c39722a904a4 added function prototype __flush_dcache_page() in include/asm-arm/cacheflush.h. So we can remove the prototype for same in arch/arm/mm/fault-armv.c since it is now redundant to have it there. Signed-off-by: George G. Davis <gdavis@mvista.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
# ad1ae2fe	13-Dec-2006	Russell King <rmk@dyn-67.arm.linux.org.uk>	[ARM] Unuse another Linux PTE bit L_PTE_ASID is not really required to be stored in every PTE, since we can identify it via the address passed to set_pte_at(). So, create set_pte_ext() which takes the address of the PTE to set, the Linux PTE value, and the additional CPU PTE bits which aren't encoded in the Linux PTE value. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
# 69b04754	29-Oct-2005	Hugh Dickins <hugh@veritas.com>	[PATCH] mm: arm ready for split ptlock Prepare arm for the split page_table_lock: three issues. Signal handling's preserve and restore of iwmmxt context currently involves reading and writing that context to and from user space, while holding page_table_lock to secure the user page(s) against kswapd. If we split the lock, then the structure might span two pages, secured by to read into and write from a kernel stack buffer, copying that out and in without locking (the structure is 160 bytes in size, and here we're near the top of the kernel stack). Or would the overhead be noticeable? arm_syscall's cmpxchg emulation use pte_offset_map_lock, instead of pte_offset_map and mm-wide page_table_lock; and strictly, it should now also take mmap_sem before descending to pmd, to guard against another thread munmapping, and the page table pulled out beneath this thread. Updated two comments in fault-armv.c. adjust_pte is interesting, since its modification of a pte in one part of the mm depends on the lock held when calling update_mmu_cache for a pte in some other part of that mm. This can't be done with a split page_table_lock (and we've already taken the lowest lock in the hierarchy here): so we'll have to disable split on arm, unless CONFIG_CPU_CACHE_VIPT to ensures adjust_pte never used. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
# 8830f04a	20-Jun-2005	Russell King <rmk@dyn-67.arm.linux.org.uk>	[PATCH] ARM: Fix delayed dcache flush for ARMv6 non-aliasing caches flush_dcache_page() did nothing for these caches, but since they suffer from I/D cache coherency issues, we need to ensure that data is written back to RAM. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
# 1da177e4	16-Apr-2005	Linus Torvalds <torvalds@ppc970.osdl.org>	Linux-2.6.12-rc2 Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!