History log of /linux-master/arch/powerpc/mm/Makefile
Revision Date Author Comments
# 36e5f9ee 09-Apr-2022 Christophe Leroy <christophe.leroy@csgroup.eu>

powerpc/mm: Convert to default topdown mmap layout

Select CONFIG_ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT and
remove arch/powerpc/mm/mmap.c

This change reuses the generic framework added by
commit 67f3977f805b ("arm64, mm: move generic mmap layout
functions to mm") without any functional change.

Comparison between powerpc implementation and the generic one:
- mmap_is_legacy() is identical.
- arch_mmap_rnd() does exactly the same allthough it's written
slightly differently.
- MIN_GAP and MAX_GAP are identical.
- mmap_base() does the same but uses STACK_RND_MASK which provides
the same values as stack_maxrandom_size().
- arch_pick_mmap_layout() is identical.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/518f9def87d3c889d5958103e7463cf45a2f673d.1649523076.git.christophe.leroy@csgroup.eu


# 1408fca0 09-Apr-2022 Christophe Leroy <christophe.leroy@csgroup.eu>

powerpc/mm: Make slice specific to book3s/64

Since commit 555904d07eef ("powerpc/8xx: MM_SLICE is not needed
anymore") only book3s/64 selects CONFIG_PPC_MM_SLICES.

Move slice.c into mm/book3s64/

Move necessary stuff in asm/book3s/64/slice.h and
remove asm/slice.h

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/4a0d74ef1966a5902b5fd4ac4b513a760a6d675a.1649523076.git.christophe.leroy@csgroup.eu


# e0847283 08-Jul-2021 Christophe Leroy <christophe.leroy@csgroup.eu>

powerpc/ptdump: Convert powerpc to GENERIC_PTDUMP

This patch converts powerpc to the generic PTDUMP implementation.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/03166d569526be70214fe9370a7bad219d2f41c8.1625762907.git.christophe.leroy@csgroup.eu


# 1f9ad21c 08-Jun-2021 Russell Currey <ruscur@russell.cc>

powerpc/mm: Implement set_memory() routines

The set_memory_{ro/rw/nx/x}() functions are required for
STRICT_MODULE_RWX, and are generally useful primitives to have. This
implementation is designed to be generic across powerpc's many MMUs.
It's possible that this could be optimised to be faster for specific
MMUs.

This implementation does not handle cases where the caller is attempting
to change the mapping of the page it is executing from, or if another
CPU is concurrently using the page being altered. These cases likely
shouldn't happen, but a more complex implementation with MMU-specific code
could safely handle them.

On hash, the linear mapping is not kept in the linux pagetable, so this
will not change the protection if used on that range. Currently these
functions are not used on the linear map so just WARN for now.

apply_to_existing_page_range() does not work on huge pages so for now
disallow changing the protection of huge pages.

[jpn: - Allow set memory functions to be used without Strict RWX
- Hash: Disallow certain regions
- Have change_page_attr() take function pointers to manipulate ptes
- Radix: Add ptesync after set_pte_at()]

Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Reviewed-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210609013431.9805-2-jniethe5@gmail.com


# a9ee6cf5 28-Jun-2021 Mike Rapoport <rppt@kernel.org>

mm: replace CONFIG_NEED_MULTIPLE_NODES with CONFIG_NUMA

After removal of DISCINTIGMEM the NEED_MULTIPLE_NODES and NUMA
configuration options are equivalent.

Drop CONFIG_NEED_MULTIPLE_NODES and use CONFIG_NUMA instead.

Done with

$ sed -i 's/CONFIG_NEED_MULTIPLE_NODES/CONFIG_NUMA/' \
$(git grep -wl CONFIG_NEED_MULTIPLE_NODES)
$ sed -i 's/NEED_MULTIPLE_NODES/NUMA/' \
$(git grep -wl NEED_MULTIPLE_NODES)

with manual tweaks afterwards.

[rppt@linux.ibm.com: fix arm boot crash]
Link: https://lkml.kernel.org/r/YMj9vHhHOiCVN4BF@linux.ibm.com

Link: https://lkml.kernel.org/r/20210608091316.3622-9-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# b26e8f27 08-Apr-2021 Christophe Leroy <christophe.leroy@csgroup.eu>

powerpc/mem: Move cache flushing functions into mm/cacheflush.c

Cache flushing functions are in the middle of completely
unrelated stuff in mm/mem.c

Create a dedicated mm/cacheflush.c for those functions.

Also cleanup the list of included headers.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/7bf6f1600acad146e541a4e220940062f2e5b03d.1617895813.git.christophe.leroy@csgroup.eu


# 5eedf9fe 07-Dec-2020 Christophe Leroy <christophe.leroy@csgroup.eu>

powerpc/mm: Fix KUAP warning by providing copy_from_kernel_nofault_allowed()

Since commit c33165253492 ("powerpc: use non-set_fs based maccess
routines"), userspace access is not granted anymore when using
copy_from_kernel_nofault()

However, kthread_probe_data() uses copy_from_kernel_nofault()
to check validity of pointers. When the pointer is NULL,
it points to userspace, leading to a KUAP fault and triggering
the following big hammer warning many times when you request
a sysrq "show task":

[ 1117.202054] ------------[ cut here ]------------
[ 1117.202102] Bug: fault blocked by AP register !
[ 1117.202261] WARNING: CPU: 0 PID: 377 at arch/powerpc/include/asm/nohash/32/kup-8xx.h:66 do_page_fault+0x4a8/0x5ec
[ 1117.202310] Modules linked in:
[ 1117.202428] CPU: 0 PID: 377 Comm: sh Tainted: G W 5.10.0-rc5-01340-g83f53be2de31-dirty #4175
[ 1117.202499] NIP: c0012048 LR: c0012048 CTR: 00000000
[ 1117.202573] REGS: cacdbb88 TRAP: 0700 Tainted: G W (5.10.0-rc5-01340-g83f53be2de31-dirty)
[ 1117.202625] MSR: 00021032 <ME,IR,DR,RI> CR: 24082222 XER: 20000000
[ 1117.202899]
[ 1117.202899] GPR00: c0012048 cacdbc40 c2929290 00000023 c092e554 00000001 c09865e8 c092e640
[ 1117.202899] GPR08: 00001032 00000000 00000000 00014efc 28082224 100d166a 100a0920 00000000
[ 1117.202899] GPR16: 100cac0c 100b0000 1080c3fc 1080d685 100d0000 100d0000 00000000 100a0900
[ 1117.202899] GPR24: 100d0000 c07892ec 00000000 c0921510 c21f4440 0000005c c0000000 cacdbc80
[ 1117.204362] NIP [c0012048] do_page_fault+0x4a8/0x5ec
[ 1117.204461] LR [c0012048] do_page_fault+0x4a8/0x5ec
[ 1117.204509] Call Trace:
[ 1117.204609] [cacdbc40] [c0012048] do_page_fault+0x4a8/0x5ec (unreliable)
[ 1117.204771] [cacdbc70] [c00112f0] handle_page_fault+0x8/0x34
[ 1117.204911] --- interrupt: 301 at copy_from_kernel_nofault+0x70/0x1c0
[ 1117.204979] NIP: c010dbec LR: c010dbac CTR: 00000001
[ 1117.205053] REGS: cacdbc80 TRAP: 0301 Tainted: G W (5.10.0-rc5-01340-g83f53be2de31-dirty)
[ 1117.205104] MSR: 00009032 <EE,ME,IR,DR,RI> CR: 28082224 XER: 00000000
[ 1117.205416] DAR: 0000005c DSISR: c0000000
[ 1117.205416] GPR00: c0045948 cacdbd38 c2929290 00000001 00000017 00000017 00000027 0000000f
[ 1117.205416] GPR08: c09926ec 00000000 00000000 3ffff000 24082224
[ 1117.206106] NIP [c010dbec] copy_from_kernel_nofault+0x70/0x1c0
[ 1117.206202] LR [c010dbac] copy_from_kernel_nofault+0x30/0x1c0
[ 1117.206258] --- interrupt: 301
[ 1117.206372] [cacdbd38] [c004bbb0] kthread_probe_data+0x44/0x70 (unreliable)
[ 1117.206561] [cacdbd58] [c0045948] print_worker_info+0xe0/0x194
[ 1117.206717] [cacdbdb8] [c00548ac] sched_show_task+0x134/0x168
[ 1117.206851] [cacdbdd8] [c005a268] show_state_filter+0x70/0x100
[ 1117.206989] [cacdbe08] [c039baa0] sysrq_handle_showstate+0x14/0x24
[ 1117.207122] [cacdbe18] [c039bf18] __handle_sysrq+0xac/0x1d0
[ 1117.207257] [cacdbe48] [c039c0c0] write_sysrq_trigger+0x4c/0x74
[ 1117.207407] [cacdbe68] [c01fba48] proc_reg_write+0xb4/0x114
[ 1117.207550] [cacdbe88] [c0179968] vfs_write+0x12c/0x478
[ 1117.207686] [cacdbf08] [c0179e60] ksys_write+0x78/0x128
[ 1117.207826] [cacdbf38] [c00110d0] ret_from_syscall+0x0/0x34
[ 1117.207938] --- interrupt: c01 at 0xfd4e784
[ 1117.208008] NIP: 0fd4e784 LR: 0fe0f244 CTR: 10048d38
[ 1117.208083] REGS: cacdbf48 TRAP: 0c01 Tainted: G W (5.10.0-rc5-01340-g83f53be2de31-dirty)
[ 1117.208134] MSR: 0000d032 <EE,PR,ME,IR,DR,RI> CR: 44002222 XER: 00000000
[ 1117.208470]
[ 1117.208470] GPR00: 00000004 7fc34090 77bfb4e0 00000001 1080fa40 00000002 7400000f fefefeff
[ 1117.208470] GPR08: 7f7f7f7f 10048d38 1080c414 7fc343c0 00000000
[ 1117.209104] NIP [0fd4e784] 0xfd4e784
[ 1117.209180] LR [0fe0f244] 0xfe0f244
[ 1117.209236] --- interrupt: c01
[ 1117.209274] Instruction dump:
[ 1117.209353] 714a4000 418200f0 73ca0001 40820084 73ca0032 408200f8 73c90040 4082ff60
[ 1117.209727] 0fe00000 3c60c082 386399f4 48013b65 <0fe00000> 80010034 3860000b 7c0803a6
[ 1117.210102] ---[ end trace 1927c0323393af3e ]---

To avoid that, copy_from_kernel_nofault_allowed() is used to check
whether the address is a valid kernel address. But the default
version of it returns true for any address.

Provide a powerpc version of copy_from_kernel_nofault_allowed()
that returns false when the address is below TASK_USER_MAX,
so that copy_from_kernel_nofault() will return -ERANGE.

Fixes: c33165253492 ("powerpc: use non-set_fs based maccess routines")
Reported-by: Qian Cai <qcai@redhat.com>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/18bcb456d32a3e74f5ae241fd6f1580c092d07f5.1607360230.git.christophe.leroy@csgroup.eu


# 47da42b2 03-Nov-2020 Thomas Gleixner <tglx@linutronix.de>

powerpc/mm/highmem: Switch to generic kmap atomic

No reason having the same code in every architecture

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20201103095858.087635810@linutronix.de


# f381d571 20-Aug-2019 Christophe Leroy <christophe.leroy@c-s.fr>

powerpc/mm: Move ioremap functions out of pgtable_32/64.c

Create ioremap_32.c and ioremap_64.c and move respective ioremap
functions out of pgtable_32.c and pgtable_64.c

In the meantime, fix a few comments and changes a printk() to
pr_warn(). Also fix a few oversplitted lines.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/b5c8b02ccefd4ede64c61b53cf64fb5dacb35740.1566309263.git.christophe.leroy@c-s.fr


# 4634c375 20-Aug-2019 Christophe Leroy <christophe.leroy@c-s.fr>

powerpc/mm: move common 32/64 bits ioremap functions into ioremap.c

ioremap(), ioremap_wc() and ioremap_coherent() are now identical on
PPC32 and PPC64 as iowa_is_active() will always return false on
PPC32. Move them into a new common location called ioremap.c

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/6223803ce024d6ab4dfaa919f44098aed5b4bc33.1566309262.git.christophe.leroy@c-s.fr


# c4e31847 06-May-2019 Christophe Leroy <christophe.leroy@c-s.fr>

powerpc/mm: fix redundant inclusion of pgtable-frag.o in Makefile

The patch identified below added pgtable-frag.o to obj-y
but some merge witchery kept it also for obj-CONFIG_PPC_BOOK3S_64

This patch clears the duplication.

Fixes: 737b434d3d55 ("powerpc/mm: convert Book3E 64 to pte_fragment")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 471e475c 06-May-2019 Christophe Leroy <christophe.leroy@c-s.fr>

powerpc/mm: Fix makefile for KASAN

In commit 17312f258cf6 ("powerpc/mm: Move book3s32 specifics in
subdirectory mm/book3s64"), ppc_mmu_32.c was moved and renamed.

This patch fixes Makefiles to disable KASAN instrumentation on
the new name and location.

Fixes: f072015c7b74 ("powerpc: disable KASAN instrumentation on early/critical files.")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 2edb16ef 26-Apr-2019 Christophe Leroy <christophe.leroy@c-s.fr>

powerpc/32: Add KASAN support

This patch adds KASAN support for PPC32. The following patch
will add an early activation of hash table for book3s. Until
then, a warning will be raised if trying to use KASAN on an
hash 6xx.

To support KASAN, this patch initialises that MMU mapings for
accessing to the KASAN shadow area defined in a previous patch.

An early mapping is set as soon as the kernel code has been
relocated at its definitive place.

Then the definitive mapping is set once paging is initialised.

For modules, the shadow area is allocated at module_alloc().

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# f072015c 26-Apr-2019 Christophe Leroy <christophe.leroy@c-s.fr>

powerpc: disable KASAN instrumentation on early/critical files.

All files containing functions run before kasan_early_init() is called
must have KASAN instrumentation disabled.

For those file, branch profiling also have to be disabled otherwise
each if () generates a call to ftrace_likely_update().

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 737b434d 26-Apr-2019 Christophe Leroy <christophe.leroy@c-s.fr>

powerpc/mm: convert Book3E 64 to pte_fragment

Book3E 64 is the only subarch not using pte_fragment. In order
to allow refactorisation, this patch converts it to pte_fragment.

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# b7dcf96c 25-Apr-2019 Christophe Leroy <christophe.leroy@c-s.fr>

powerpc/mm: make hugetlbpage.c depend on CONFIG_HUGETLB_PAGE

The only function in hugetlbpage.c which doesn't depend on
CONFIG_HUGETLB_PAGE is gup_hugepte(), and this function is
only called from gup_huge_pd() which depends on
CONFIG_HUGETLB_PAGE so all the content of hugetlbpage.c
depends on CONFIG_HUGETLB_PAGE.

This patch modifies Makefile to only compile hugetlbpage.c
when CONFIG_HUGETLB_PAGE is set.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 27e23b5f 29-Mar-2019 Christophe Leroy <christophe.leroy@c-s.fr>

powerpc/mm: Move nohash specifics in subdirectory mm/nohash

Many files in arch/powerpc/mm are only for nohash. This patch
creates a subdirectory for them.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
[mpe: Shorten new filenames]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 17312f25 29-Mar-2019 Christophe Leroy <christophe.leroy@c-s.fr>

powerpc/mm: Move book3s32 specifics in subdirectory mm/book3s64

Several files in arch/powerpc/mm are only for book3S32. This patch
creates a subdirectory for them.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
[mpe: Shorten new filenames]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 47d99948 29-Mar-2019 Christophe Leroy <christophe.leroy@c-s.fr>

powerpc/mm: Move book3s64 specifics in subdirectory mm/book3s64

Many files in arch/powerpc/mm are only for book3S64. This patch
creates a subdirectory for them.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
[mpe: Update the selftest sym links, shorten new filenames, cleanup some
whitespace and formatting in the new files.]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 19d69075 04-Mar-2019 Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>

powerpc/mm: Disable kcov for SLB routines

The kcov instrumentation inside SLB routines causes duplicate SLB
entries to be added resulting into SLB multihit machine checks.
Disable kcov instrumentation on slb.o

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# fb0b0a73 21-Feb-2019 Andrew Donnellan <andrew.donnellan@au1.ibm.com>

powerpc: Enable kcov

kcov provides kernel coverage data that's useful for fuzzing tools like
syzkaller.

Wire up kcov support on powerpc. Disable kcov instrumentation on the same
files where we currently disable gcov and UBSan instrumentation, plus some
additional exclusions which appear necessary to boot on book3e machines.

Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Daniel Axtens <dja@axtens.net> # e6500
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# e66c3209 17-Feb-2019 Christophe Leroy <christophe.leroy@c-s.fr>

powerpc: Move page table dump files in a dedicated subdirectory

This patch moves the files related to page table dump in a
dedicated subdirectory.

The purpose is to clean a bit arch/powerpc/mm by regrouping
multiple files handling a dedicated function.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
[mpe: Shorten the file names while we're at it]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 7c91efce 03-Dec-2018 Christophe Leroy <christophe.leroy@c-s.fr>

powerpc/mm: dump block address translation on book3s/32

This patch adds a debugfs file to dump block address translation:

~# cat /sys/kernel/debug/powerpc/block_address_translation
---[ Instruction Block Address Translations ]---
0: -
1: -
2: 0xc0000000-0xcfffffff 0x00000000 Kernel EXEC coherent
3: 0xd0000000-0xdfffffff 0x10000000 Kernel EXEC coherent
4: -
5: -
6: -
7: -

---[ Data Block Address Translations ]---
0: -
1: -
2: 0xc0000000-0xcfffffff 0x00000000 Kernel RW coherent
3: 0xd0000000-0xdfffffff 0x10000000 Kernel RW coherent
4: -
5: -
6: -
7: -

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 0261a508 16-Nov-2018 Christophe Leroy <christophe.leroy@c-s.fr>

powerpc/mm: dump segment registers on book3s/32

This patch creates a debugfs file to see content of
segment registers

# cat /sys/kernel/debug/segment_registers
---[ User Segments ]---
0x00000000-0x0fffffff Kern key 1 User key 1 VSID 0xade2b0
0x10000000-0x1fffffff Kern key 1 User key 1 VSID 0xade3c1
0x20000000-0x2fffffff Kern key 1 User key 1 VSID 0xade4d2
0x30000000-0x3fffffff Kern key 1 User key 1 VSID 0xade5e3
0x40000000-0x4fffffff Kern key 1 User key 1 VSID 0xade6f4
0x50000000-0x5fffffff Kern key 1 User key 1 VSID 0xade805
0x60000000-0x6fffffff Kern key 1 User key 1 VSID 0xade916
0x70000000-0x7fffffff Kern key 1 User key 1 VSID 0xadea27
0x80000000-0x8fffffff Kern key 1 User key 1 VSID 0xadeb38
0x90000000-0x9fffffff Kern key 1 User key 1 VSID 0xadec49
0xa0000000-0xafffffff Kern key 1 User key 1 VSID 0xaded5a
0xb0000000-0xbfffffff Kern key 1 User key 1 VSID 0xadee6b

---[ Kernel Segments ]---
0xc0000000-0xcfffffff Kern key 0 User key 1 VSID 0x000ccc
0xd0000000-0xdfffffff Kern key 0 User key 1 VSID 0x000ddd
0xe0000000-0xefffffff Kern key 0 User key 1 VSID 0x000eee
0xf0000000-0xffffffff Kern key 0 User key 1 VSID 0x000fff

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
[mpe: Move it under /sys/kernel/debug/powerpc, make sr_init() __init]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 32ea4c14 29-Nov-2018 Christophe Leroy <christophe.leroy@c-s.fr>

powerpc/mm: Extend pte_fragment functionality to PPC32

In order to allow the 8xx to handle pte_fragments, this patch
extends the use of pte_fragments to PPC32 platforms.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# a95d133c 29-Nov-2018 Christophe Leroy <christophe.leroy@c-s.fr>

powerpc/mm: Move pte_fragment_alloc() to a common location

In preparation of next patch which generalises the use of
pte_fragment_alloc() for all, this patch moves the related functions
in a place that is common to all subarches.

The 8xx will need that for supporting 16k pages, as in that mode
page tables still have a size of 4k.

Since pte_fragment with only once fragment is not different
from what is done in the general case, we can easily migrate all
subarchs to pte fragments.

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 5b3e84fc 17-Nov-2018 Christophe Leroy <christophe.leroy@c-s.fr>

powerpc: change CONFIG_PPC_STD_MMU to CONFIG_PPC_BOOK3S

Today we have:

config PPC_BOOK3S
def_bool y
depends on PPC_BOOK3S_32 || PPC_BOOK3S_64

config PPC_STD_MMU
def_bool y
depends on PPC_BOOK3S

PPC_STD_MMU is therefore redundant with PPC_BOOK3S. Lets remove it.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 68289ae9 17-Nov-2018 Christophe Leroy <christophe.leroy@c-s.fr>

powerpc: change CONFIG_PPC_STD_MMU_32 to CONFIG_PPC_BOOK3S_32

Today we have:

config PPC_BOOK3S_32
bool "512x/52xx/6xx/7xx/74xx/82xx/83xx/86xx"
[depends on PPC32 within a choice]

config PPC_BOOK3S
def_bool y
depends on PPC_BOOK3S_32 || PPC_BOOK3S_64

config PPC_STD_MMU
def_bool y
depends on PPC_BOOK3S

config PPC_STD_MMU_32
def_bool y
depends on PPC_STD_MMU && PPC32

PPC_STD_MMU_32 is therefore redundant with PPC_BOOK3S_32.

In order to make the code clearer, lets use preferably PPC_BOOK3S_32.
This will allow to remove CONFIG_PPC_STD_MMU_32 in a later patch.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 23ad1a27 09-Oct-2018 Michael Ellerman <mpe@ellerman.id.au>

powerpc: Add -Werror at arch/powerpc level

Back when I added -Werror in commit ba55bd74360e ("powerpc: Add
configurable -Werror for arch/powerpc") I did it by adding it to most
of the arch Makefiles.

At the time we excluded math-emu, because apparently it didn't build
cleanly. But that seems to have been fixed somewhere in the interim.

So move the -Werror addition to the top-level of the arch, this saves
us from repeating it in every Makefile and means we won't forget to
add it to any new sub-dirs.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 48e7b769 14-Sep-2018 Nicholas Piggin <npiggin@gmail.com>

powerpc/64s/hash: Convert SLB miss handlers to C

This patch moves SLB miss handlers completely to C, using the standard
exception handler macros to set up the stack and branch to C.

This can be done because the segment containing the kernel stack is
always bolted, so accessing it with relocation on will not cause an
SLB exception.

Arbitrary kernel memory must not be accessed when handling kernel
space SLB misses, so care should be taken there. However user SLB
misses can access any kernel memory, which can be used to move some
fields out of the paca (in later patches).

User SLB misses could quite easily reconcile IRQs and set up a first
class kernel environment and exit via ret_from_except, however that
doesn't seem to be necessary at the moment, so we only do that if a
bad fault is encountered.

[ Credit to Aneesh for bug fixes, error checks, and improvements to
bad address handling, etc ]

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Disallow tracing for all of slb.c for now.]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 97026b5a 09-Oct-2018 Christophe Leroy <christophe.leroy@c-s.fr>

powerpc/mm: Split dump_pagelinuxtables flag_array table

To reduce the complexity of flag_array, and allow the removal of
default 0 value of non existing flags, lets have one flag_array
table for each platform family with only the really existing flags.

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 54be0b9c 02-Oct-2018 Michael Ellerman <mpe@ellerman.id.au>

Revert "convert SLB miss handlers to C" and subsequent commits

This reverts commits:
5e46e29e6a97 ("powerpc/64s/hash: convert SLB miss handlers to C")
8fed04d0f6ae ("powerpc/64s/hash: remove user SLB data from the paca")
655deecf67b2 ("powerpc/64s/hash: SLB allocation status bitmaps")
2e1626744e8d ("powerpc/64s/hash: provide arch_setup_exec hooks for hash slice setup")
89ca4e126a3f ("powerpc/64s/hash: Add a SLB preload cache")

This series had a few bugs, and the fixes are not all trivial. So
revert most of it for now.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 5e46e29e 14-Sep-2018 Nicholas Piggin <npiggin@gmail.com>

powerpc/64s/hash: convert SLB miss handlers to C

This patch moves SLB miss handlers completely to C, using the standard
exception handler macros to set up the stack and branch to C.

This can be done because the segment containing the kernel stack is
always bolted, so accessing it with relocation on will not cause an
SLB exception.

Arbitrary kernel memory may not be accessed when handling kernel space
SLB misses, so care should be taken there. However user SLB misses can
access any kernel memory, which can be used to move some fields out of
the paca (in later patches).

User SLB misses could quite easily reconcile IRQs and set up a first
class kernel environment and exit via ret_from_except, however that
doesn't seem to be necessary at the moment, so we only do that if a
bad fault is encountered.

[ Credit to Aneesh for bug fixes, error checks, and improvements to bad
address handling, etc ]

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

Since RFC:
- Added MSR[RI] handling
- Fixed up a register loss bug exposed by irq tracing (Aneesh)
- Reject misses outside the defined kernel regions (Aneesh)
- Added several more sanity checks and error handling (Aneesh), we may
look at consolidating these tests and tightenig up the code but for
a first pass we decided it's better to check carefully.

Since v1:
- Fixed SLB cache corruption (Aneesh)
- Fixed untidy SLBE allocation "leak" in get_vsid error case
- Now survives some stress testing on real hardware

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# badf436f 06-Aug-2018 Rodrigo R. Galvao <rosattig@linux.vnet.ibm.com>

powerpc/Makefiles: Convert ifeq to ifdef where possible

In Makefiles if we're testing a CONFIG_FOO symbol for equality with 'y'
we can instead just use ifdef. The latter reads easily, so convert to
it where possible.

Signed-off-by: Rodrigo R. Galvao <rosattig@linux.vnet.ibm.com>
Reviewed-by: Mauro S. M. Rodrigues <maurosr@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 92e3da3c 18-Jan-2018 Ram Pai <linuxram@us.ibm.com>

powerpc: initial pkey plumbing

Basic plumbing to initialize the pkey system.
Nothing is enabled yet. A later patch will enable it
once all the infrastructure is in place.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
[mpe: Rework copyrights to use SPDX tags]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 6c6ea537 01-Dec-2017 Nathan Fontenot <nfont@linux.vnet.ibm.com>

powerpc/mm: Separate ibm, dynamic-memory data from DT format

We currently have code to parse the dynamic reconfiguration LMB
information from the ibm,dynamic-meory device tree property in
multiple locations; numa.c, prom.c, and pseries/hotplug-memory.c.
In anticipation of adding support for a version 2 of the
ibm,dynamic-memory property this patch aims to separate the device
tree information from the device tree format.

Doing this requires a two step process to avoid a possibly very large
bootmem allocation early in boot. During initial boot, new routines
are provided to walk the device tree property and make a call-back
for each LMB.

The second step (introduced in later patches) will allocate an
array of LMB information that can be used directly without needing
to know the DT format.

This approach provides the benefit of consolidating the device tree
property parsing to a single location and (eventually) providing
a common data structure for retrieving LMB information.

This patch introduces a routine to walk the ibm,dynamic-memory
property in the flattened device tree and updates the prom.c code
to use this to initialize memory.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 4e003747 18-Oct-2017 Michael Ellerman <mpe@ellerman.id.au>

powerpc/64s: Replace CONFIG_PPC_STD_MMU_64 with CONFIG_PPC_BOOK3S_64

CONFIG_PPC_STD_MMU_64 indicates support for the "standard" powerpc MMU
on 64-bit CPUs. The "standard" MMU refers to the hash page table MMU
found in "server" processors, from IBM mainly.

Currently CONFIG_PPC_STD_MMU_64 is == CONFIG_PPC_BOOK3S_64. While it's
annoying to have two symbols that always have the same value, it's not
quite annoying enough to bother removing one.

However with the arrival of Power9, we now have the situation where
CONFIG_PPC_STD_MMU_64 is enabled, but the kernel is running using the
Radix MMU - *not* the "standard" MMU. So it is now actively confusing
to use it, because it implies that code is disabled or inactive when
the Radix MMU is in use, however that is not necessarily true.

So s/CONFIG_PPC_STD_MMU_64/CONFIG_PPC_BOOK3S_64/, and do some minor
formatting updates of some of the affected lines.

This will be a pain for backports, but c'est la vie.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# b2441318 01-Nov-2017 Greg Kroah-Hartman <gregkh@linuxfoundation.org>

License cleanup: add SPDX GPL-2.0 license identifier to files with no license

Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.

For non */uapi/* files that summary was:

SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 11139

and resulted in the first patch in this series.

If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:

SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note 930

and resulted in the second patch in this series.

- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:

SPDX license identifier # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note 270
GPL-2.0+ WITH Linux-syscall-note 169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
LGPL-2.1+ WITH Linux-syscall-note 15
GPL-1.0+ WITH Linux-syscall-note 14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
LGPL-2.0+ WITH Linux-syscall-note 4
LGPL-2.1 WITH Linux-syscall-note 3
((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1

and that resulted in the third patch in this series.

- when the two scanners agreed on the detected license(s), that became
the concluded license(s).

- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.

- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).

- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.

- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct

This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.

Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 3a2df379 23-Jul-2017 Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/mm: Make switch_mm_irqs_off() out of line

It's too big to be inline, there is no reason to keep it
that way.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[mpe: Rework to incorporate the comment changes via fixes branch]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 6ff4d3e9 18-Jul-2017 Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc: Remove old unused icswx based coprocessor support

We have a whole pile of unused code to maintain the ACOP register,
allocate coprocessor PIDs and handle ACOP faults. This mechanism
was used for the HFI adapter on POWER7 which is dead and gone and
whose driver never went upstream. It was used on some A2 core based
stuff that also never saw the light of day.

Take out all that code.

There is still some POWER8 coprocessor code that uses icswx but it's
kernel only and thus doesn't use any of that infrastructure.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 9b081e10 07-Dec-2016 Christophe Leroy <christophe.leroy@c-s.fr>

powerpc: port 64 bits pgtable_cache to 32 bits

Today powerpc64 uses a set of pgtable_caches while powerpc32 uses
standard pages when using 4k pages and a single pgtable_cache
if using other size pages.

In preparation of implementing huge pages on the 8xx, this patch
replaces the specific powerpc32 handling by the 64 bits approach.

This is done by:
* moving 64 bits pgtable_cache_add() and pgtable_cache_init()
in a new file called init-common.c
* modifying pgtable_cache_init() to also handle the case
without PMD
* removing the 32 bits version of pgtable_cache_add() and
pgtable_cache_init()
* copying related header contents from 64 bits into both the
book3s/32 and nohash/32 header files

On the 8xx, the following cache sizes will be used:
* 4k pages mode:
- PGT_CACHE(10) for PGD
- PGT_CACHE(3) for 512k hugepage tables
* 16k pages mode:
- PGT_CACHE(6) for PGD
- PGT_CACHE(7) for 512k hugepage tables
- PGT_CACHE(3) for 8M hugepage tables

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Scott Wood <oss@buserror.net>


# dd5ac03e 30-Nov-2016 Michael Ellerman <mpe@ellerman.id.au>

powerpc/mm: Fix page table dump build on non-Book3S

In the recent commit 1515ab932156 ("powerpc/mm: Dump hash table") we
added code to dump the hage page table. Currently this can be selected
to build on any platform. However it breaks the build if we're building
for a non-Book3S platform, because none of the hash page table related
defines and so on exist. So restrict it to building only on Book3S.

Similarly in commit 8eb07b187000 ("powerpc/mm: Dump linux pagetables")
we added code to dump the Linux page tables, which uses some constants
which are only defined on Book3S - so guard those with an #ifdef.

Fixes: 1515ab932156 ("powerpc/mm: Dump hash table")
Fixes: 8eb07b187000 ("powerpc/mm: Dump linux pagetables")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 1515ab93 26-May-2016 Rashmica Gupta <rashmicy@gmail.com>

powerpc/mm: Dump hash table

Useful to be able to dump the kernel hash page table to check
which pages are hashed along with their sizes and other details.

Add a debugfs file to check the hash page table. If radix is enabled
(and so there is no hash page table) then this file doesn't exist. To
use this the PPC_PTDUMP config option must be selected.

Signed-off-by: Rashmica Gupta <rashmicy@gmail.com>
[mpe: Fix build with SPARSEMEM_VMEMMAP=n & PSERIES=n]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 8eb07b18 26-May-2016 Rashmica Gupta <rashmicy@gmail.com>

powerpc/mm: Dump linux pagetables

Useful to be able to dump the kernels page tables to check permissions
and memory types - derived from arm64's implementation.

Add a debugfs file to check the page tables. To use this the PPC_PTDUMP
config option must be selected.

Signed-off-by: Rashmica Gupta <rashmicy@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 68201fbb 11-Aug-2016 Michael Ellerman <mpe@ellerman.id.au>

powerpc/Makefile: Drop CONFIG_WORD_SIZE for BITS

Commit 2578bfae84a7 ("[POWERPC] Create and use CONFIG_WORD_SIZE") added
CONFIG_WORD_SIZE, and suggests that other arches were going to do
likewise.

But that never happened, powerpc is the only architecture which uses it.

So switch to using a simple make variable, BITS, like x86, sh, sparc and
tile. It is also easier to spell and simpler, avoiding any confusion
about whether it's defined due to ordering of make vs kconfig.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 3df33f12 29-Apr-2016 Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

powerpc/mm/thp: Abstraction for THP functions

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 48483760 29-Apr-2016 Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

powerpc/mm: Add radix support for hugetlb

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 1a472c9d 29-Apr-2016 Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

powerpc/mm/radix: Add tlbflush routines

Core kernel doesn't track the page size of the VA range that we are
invalidating. Hence we end up flushing TLB for the entire mm here. Later
patches will improve this.

We also don't flush page walk cache separetly instead use RIC=2 when
flushing TLB, because we do a MMU gather flush after freeing page table.

MMU_NO_CONTEXT is updated for hash.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# f5df4b4b 29-Apr-2016 Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

powerpc/mm: Rename mmu_context_hash64.c to mmu_context_book3s64.c

This file now contains both hash and radix specific code. Rename it to
indicate this better.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 2bfd65e4 29-Apr-2016 Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

powerpc/mm/radix: Add radix callbacks for early init routines

This adds routines for early setup for radix. We use device tree
property "ibm,processor-radix-AP-encodings" to find supported page
sizes. If we don't find the above we consider 64K and 4K as supported
page sizes.

We do map vmemap using 2M page size if we can. The linear mapping is
done such that we use required page size for that range. For example
memory of 3.5G is mapped such that we use 1G mapping till 3G range and
use 2M mapping for the rest.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# eee24b5a 29-Apr-2016 Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

powerpc/mm: Move hash and no hash code to separate files

This patch reduces the number of #ifdefs in C code and will also help in
adding radix changes later. Only code movement in this patch.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
[mpe: Propagate copyrights and update GPL text]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# a372acfa 09-Feb-2016 Christophe Leroy <christophe.leroy@c-s.fr>

powerpc/8xx: Map linear kernel RAM with 8M pages

On a live running system (VoIP gateway for Air Trafic Control), over
a 10 minutes period (with 277s idle), we get 87 millions DTLB misses
and approximatly 35 secondes are spent in DTLB handler.
This represents 5.8% of the overall time and even 10.8% of the
non-idle time.
Among those 87 millions DTLB misses, 15% are on user addresses and
85% are on kernel addresses. And within the kernel addresses, 93%
are on addresses from the linear address space and only 7% are on
addresses from the virtual address space.

MPC8xx has no BATs but it has 8Mb page size. This patch implements
mapping of kernel RAM using 8Mb pages, on the same model as what is
done on the 40x.

In 4k pages mode, each PGD entry maps a 4Mb area: we map every two
entries to the same 8Mb physical page. In each second entry, we add
4Mb to the page physical address to ease life of the FixupDAR
routine. This is just ignored by HW.

In 16k pages mode, each PGD entry maps a 64Mb area: each PGD entry
will point to the first page of the area. The DTLB handler adds
the 3 bits from EPN to map the correct page.

With this patch applied, we now get only 13 millions TLB misses
during the 10 minutes period. The idle time has increased to 313s
and the overall time spent in DTLB miss handler is 6.3s, which
represents 1% of the overall time and 2.2% of non-idle time.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>


# a43c0eb8 30-Nov-2015 Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

powerpc/mm: Convert 4k insert from asm to C

This is similar to 64K insert. May be we want to consolidate

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 91f1da99 30-Nov-2015 Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

powerpc/mm: Convert 4k hash insert to C

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 15b244a8 05-Jun-2015 Alexey Kardashevskiy <aik@ozlabs.ru>

powerpc/mmu: Add userspace-to-physical addresses translation cache

We are adding support for DMA memory pre-registration to be used in
conjunction with VFIO. The idea is that the userspace which is going to
run a guest may want to pre-register a user space memory region so
it all gets pinned once and never goes away. Having this done,
a hypervisor will not have to pin/unpin pages on every DMA map/unmap
request. This is going to help with multiple pinning of the same memory.

Another use of it is in-kernel real mode (mmu off) acceleration of
DMA requests where real time translation of guest physical to host
physical addresses is non-trivial and may fail as linux ptes may be
temporarily invalid. Also, having cached host physical addresses
(compared to just pinning at the start and then walking the page table
again on every H_PUT_TCE), we can be sure that the addresses which we put
into TCE table are the ones we already pinned.

This adds a list of memory regions to mm_context_t. Each region consists
of a header and a list of physical addresses. This adds API to:
1. register/unregister memory regions;
2. do final cleanup (which puts all pre-registered pages);
3. do userspace to physical address translation;
4. manage usage counters; multiple registration of the same memory
is allowed (once per container).

This implements 2 counters per registered memory region:
- @mapped: incremented on every DMA mapping; decremented on unmapping;
initialized to 1 when a region is just registered; once it becomes zero,
no more mappings allowe;
- @used: incremented on every "register" ioctl; decremented on
"unregister"; unregistration is allowed for DMA mapped regions unless
it is the very last reference. For the very last reference this checks
that the region is still mapped and returns -EBUSY so the userspace
gets to know that memory is still pinned and unregistration needs to
be retried; @used remains 1.

Host physical addresses are stored in vmalloc'ed array. In order to
access these in the real mode (mmu off), there is a real_vmalloc_addr()
helper. In-kernel acceleration patchset will move it from KVM to MMU code.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 4b6cfb2a 23-Feb-2015 Greg Kurz <groug@kaod.org>

powerpc/vphn: move VPHN parsing logic to a separate file

The goal behind this patch is to be able to write userland tests for the
VPHN parsing code.

Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# b30e7590 05-Nov-2014 Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

powerpc/mm: Switch to generic RCU get_user_pages_fast

This patch switch the ppc arch to use the generic RCU based
gup implementation.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# e83d0169 08-Oct-2014 Ian Munsie <imunsie@au1.ibm.com>

powerpc/cell: Move spu_handle_mm_fault() out of cell platform

Currently spu_handle_mm_fault() is in the cell platform.

This code is generically useful for other non-cell co-processors on powerpc.

This patch moves this function out of the cell platform into arch/powerpc/mm so
that others may use it.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 376af594 09-Jul-2014 Michael Ellerman <mpe@ellerman.id.au>

powerpc: Remove STAB code

Old cpus didn't have a Segment Lookaside Buffer (SLB), instead they had
a Segment Table (STAB). Now that we've dropped support for those cpus,
we can remove the STAB support entirely.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>


# 6d492ecc 20-Jun-2013 Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

powerpc/THP: Add code to handle HPTE faults for hugepages

The deposted PTE page in the second half of the PMD table is used to
track the state on hash PTEs. After updating the HPTE, we mark the
coresponding slot in the deposted PTE page valid.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>


# 29409997 20-Jun-2013 Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

powerpc: move find_linux_pte_or_hugepte and gup_hugepte to common code

We will use this in the later patch for handling THP pages

Reviewed-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>


# d5d8ec89 23-Apr-2013 Daniel Walker <dwalker@fifo99.com>

powerpc/mm: Make mmap_64.c compile on 32bit powerpc

There appears to be no good reason to keep this as 64bit only. It works
on 32bit also, and has checks so that it can work correctly with 32bit
binaries on 64bit hardware which is why I think this works.

I tested this on qemu using the virtex-ml507 machine type.

Before,

/bin2 # ./test & cat /proc/${!}/maps
00100000-00103000 r-xp 00000000 00:00 0 [vdso]
10000000-10007000 r-xp 00000000 00:01 454 /bin2/test
10017000-10018000 rw-p 00007000 00:01 454 /bin2/test
48000000-48020000 r-xp 00000000 00:01 224 /lib/ld-2.11.3.so
48021000-48023000 rw-p 00021000 00:01 224 /lib/ld-2.11.3.so
bfd03000-bfd24000 rw-p 00000000 00:00 0 [stack]
/bin2 # ./test & cat /proc/${!}/maps
00100000-00103000 r-xp 00000000 00:00 0 [vdso]
0fe6e000-0ffd8000 r-xp 00000000 00:01 214 /lib/libc-2.11.3.so
0ffd8000-0ffe8000 ---p 0016a000 00:01 214 /lib/libc-2.11.3.so
0ffe8000-0ffed000 rw-p 0016a000 00:01 214 /lib/libc-2.11.3.so
0ffed000-0fff0000 rw-p 00000000 00:00 0
10000000-10007000 r-xp 00000000 00:01 454 /bin2/test
10017000-10018000 rw-p 00007000 00:01 454 /bin2/test
48000000-48020000 r-xp 00000000 00:01 224 /lib/ld-2.11.3.so
48020000-48021000 rw-p 00000000 00:00 0
48021000-48023000 rw-p 00021000 00:01 224 /lib/ld-2.11.3.so
bf98a000-bf9ab000 rw-p 00000000 00:00 0 [stack]
/bin2 # ./test & cat /proc/${!}/maps
00100000-00103000 r-xp 00000000 00:00 0 [vdso]
0fe6e000-0ffd8000 r-xp 00000000 00:01 214 /lib/libc-2.11.3.so
0ffd8000-0ffe8000 ---p 0016a000 00:01 214 /lib/libc-2.11.3.so
0ffe8000-0ffed000 rw-p 0016a000 00:01 214 /lib/libc-2.11.3.so
0ffed000-0fff0000 rw-p 00000000 00:00 0
10000000-10007000 r-xp 00000000 00:01 454 /bin2/test
10017000-10018000 rw-p 00007000 00:01 454 /bin2/test
48000000-48020000 r-xp 00000000 00:01 224 /lib/ld-2.11.3.so
48020000-48021000 rw-p 00000000 00:00 0
48021000-48023000 rw-p 00021000 00:01 224 /lib/ld-2.11.3.so
bfa54000-bfa75000 rw-p 00000000 00:00 0 [stack]

After,

bash-4.1# ./test & cat /proc/${!}/maps
[7] 803
00100000-00103000 r-xp 00000000 00:00 0 [vdso]
10000000-10007000 r-xp 00000000 00:01 454 /bin2/test
10017000-10018000 rw-p 00007000 00:01 454 /bin2/test
b7eb0000-b7ed0000 r-xp 00000000 00:01 224 /lib/ld-2.11.3.so
b7ed1000-b7ed3000 rw-p 00021000 00:01 224 /lib/ld-2.11.3.so
bfbc0000-bfbe1000 rw-p 00000000 00:00 0 [stack]
bash-4.1# ./test & cat /proc/${!}/maps
[8] 805
00100000-00103000 r-xp 00000000 00:00 0 [vdso]
10000000-10007000 r-xp 00000000 00:01 454 /bin2/test
10017000-10018000 rw-p 00007000 00:01 454 /bin2/test
b7b03000-b7b23000 r-xp 00000000 00:01 224 /lib/ld-2.11.3.so
b7b24000-b7b26000 rw-p 00021000 00:01 224 /lib/ld-2.11.3.so
bfc27000-bfc48000 rw-p 00000000 00:00 0 [stack]
bash-4.1# ./test & cat /proc/${!}/maps
[9] 807
00100000-00103000 r-xp 00000000 00:00 0 [vdso]
10000000-10007000 r-xp 00000000 00:01 454 /bin2/test
10017000-10018000 rw-p 00007000 00:01 454 /bin2/test
b7f37000-b7f57000 r-xp 00000000 00:01 224 /lib/ld-2.11.3.so
b7f58000-b7f5a000 rw-p 00021000 00:01 224 /lib/ld-2.11.3.so
bff96000-bffb7000 rw-p 00000000 00:00 0 [stack]

Signed-off-by: Daniel Walker <dwalker@fifo90.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>


# 1fbe9cf2 26-Nov-2012 Anton Blanchard <anton@samba.org>

powerpc: Build kernel with -mcmodel=medium

Finally remove the two level TOC and build with -mcmodel=medium.

Unfortunately we can't build modules with -mcmodel=medium due to
the tricks the kernel module loader plays with percpu data:

# -mcmodel=medium breaks modules because it uses 32bit offsets from
# the TOC pointer to create pointers where possible. Pointers into the
# percpu data area are created by this method.
#
# The kernel module loader relocates the percpu data section from the
# original location (starting with 0xd...) to somewhere in the base
# kernel percpu data space (starting with 0xc...). We need a full
# 64bit relocation for this to work, hence -mcmodel=large.

On older kernels we fall back to the two level TOC (-mminimal-toc)

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>


# 9d670280 29-Sep-2011 Jimi Xenidis <jimix@pobox.com>

powerpc: Split ICSWX ACOP and PID processing

Some processors, like embedded, that already have a PID register that
is managed by the system. This patch separates the ACOP and PID
processing into separate files so that the ACOP code can be shared.

Signed-off-by: Jimi Xenidis <jimix@pobox.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>


# 41151e77 28-Jun-2011 Becky Bruce <beckyb@kernel.crashing.org>

powerpc: Hugetlb for BookE

Enable hugepages on Freescale BookE processors. This allows the kernel to
use huge TLB entries to map pages, which can greatly reduce the number of
TLB misses and the amount of TLB thrashing experienced by applications with
large memory footprints. Care should be taken when using this on FSL
processors, as the number of large TLB entries supported by the core is low
(16-64) on current processors.

The supported set of hugepage sizes include 4m, 16m, 64m, 256m, and 1g.
Page sizes larger than the max zone size are called "gigantic" pages and
must be allocated on the command line (and cannot be deallocated).

This is currently only fully implemented for Freescale 32-bit BookE
processors, but there is some infrastructure in the code for
64-bit BooKE.

Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>


# 55fd766b 16-Oct-2009 Kumar Gala <galak@kernel.crashing.org>

powerpc/fsl-booke64: Use TLB CAMs to cover linear mapping on FSL 64-bit chips

On Freescale parts typically have TLB array for large mappings that we can
bolt the linear mapping into. We utilize the code that already exists
on PPC32 on the 64-bit side to setup the linear mapping to be cover by
bolted TLB entries. We utilize a quarter of the variable size TLB array
for this purpose.

Additionally, we limit the amount of memory to what we can cover via
bolted entries so we don't get secondary faults in the TLB miss
handlers. We should fix this limitation in the future.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>


# 4108d9ba 22-Sep-2010 matt mooney <mfm@muteddisk.com>

powerpc/Makefiles: Change to new flag variables

Replace EXTRA_CFLAGS with ccflags-y and EXTRA_AFLAGS with asflags-y.

Signed-off-by: matt mooney <mfm@muteddisk.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>


# 883a3e52 26-Oct-2009 David Gibson <david@gibson.dropbear.id.au>

powerpc/mm: Split hash MMU specific hugepage code into a new file

This patch separates the parts of hugetlbpage.c which are inherently
specific to the hash MMU into a new hugelbpage-hash64.c file.

Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>


# 2d27cfd3 23-Jul-2009 Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc: Remaining 64-bit Book3E support

This contains all the bits that didn't fit in previous patches :-) This
includes the actual exception handlers assembly, the changes to the
kernel entry, other misc bits and wiring it all up in Kconfig.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>


# 850f6ac3 18-Jun-2009 Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/mm: Make k(un)map_atomic out of line

Those functions are way too big to be inline, besides, kmap_atomic()
wants to call debug_kmap_atomic() which isn't exported for modules
and causes module link failures.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>


# ba55bd74 09-Jun-2009 Michael Ellerman <michael@ellerman.id.au>

powerpc: Add configurable -Werror for arch/powerpc

Add the option to build the code under arch/powerpc with -Werror.

The intention is to make it harder for people to inadvertantly introduce
warnings in the arch/powerpc code. It needs to be configurable so that
if a warning is introduced, people can easily work around it while it's
being fixed.

The option is a negative, ie. don't enable -Werror, so that it will be
turned on for allyes and allmodconfig builds.

The default is n, in the hope that developers will build with -Werror,
that will probably lead to some build breaks, I am prepared to be flamed.

It's not enabled for math-emu, which is a steaming pile of warnings.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>


# 94491685 02-Jun-2009 Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc: Shield code specific to 64-bit server processors

This is a random collection of added ifdef's around portions of
code that only mak sense on server processors. Using either
CONFIG_PPC_STD_MMU_64 or CONFIG_PPC_BOOK3S as seems appropriate.

This is meant to make the future merging of Book3E 64-bit support
easier.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>


# b16e7766 26-May-2009 Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc: Move dma-noncoherent.c from arch/powerpc/lib to arch/powerpc/mm

(pre-requisite to make the next patches more palatable)

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>


# d62cbf45 19-Mar-2009 Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/mm: Rename arch/powerpc/kernel/mmap.c to mmap_64.c

This file is only useful on 64-bit, so we name it accordingly.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>


# 9e5efaa9 10-Mar-2009 Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/mm: Properly wire up get_user_pages_fast() on 32-bit

While we did add support for _PAGE_SPECIAL on some 32-bit platforms,
we never actually built get_user_pages_fast() on them. This fixes
it which requires a little bit of ifdef'ing around.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>


# 2a4aca11 18-Dec-2008 Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/mm: Split low level tlb invalidate for nohash processors

Currently, the various forms of low level TLB invalidations are all
implemented in misc_32.S for 32-bit processors, in a fairly scary
mess of #ifdef's and with interesting duplication such as a whole
bunch of code for FSL _tlbie and _tlbia which are no longer used.

This moves things around such that _tlbie is now defined in
hash_low_32.S and is only used by the 32-bit hash code, and all
nohash CPUs use the various _tlbil_* forms that are now moved to
a new file, tlb_nohash_low.S.

I moved all the definitions for that stuff out of
include/asm/tlbflush.h as they are really internal mm stuff, into
mm/mmu_decl.h

The code should have no functional changes. I kept some variants
inline for trivial forms on things like 40x and 8xx.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>


# f048aace 18-Dec-2008 Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/mm: Add SMP support to no-hash TLB handling

This commit moves the whole no-hash TLB handling out of line into a
new tlb_nohash.c file, and implements some basic SMP support using
IPIs and/or broadcast tlbivax instructions.

Note that I'm using local invalidations for D->I cache coherency.

At worst, if another processor is trying to execute the same and
has the old entry in its TLB, it will just take a fault and re-do
the TLB flush locally (it won't re-do the cache flush in any case).

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>


# 5e696617 18-Dec-2008 Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/mm: Split mmu_context handling

This splits the mmu_context handling between 32-bit hash based
processors, 64-bit hash based processors and everybody else. This is
preliminary work for adding SMP support for BookE processors.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>


# e41e811a 14-Dec-2008 Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/mm: Rename tlb_32.c and tlb_64.c to tlb_hash32.c and tlb_hash64.c

This renames the files to clarify the fact that they are used by
the hash based family of CPUs (the 603 being an exception in that
family but is still handled by that code).

This paves the way for the new tlb_nohash.c coming via a subsequent
commit.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>


# 0186f47e 18-Nov-2008 Kumar Gala <galak@kernel.crashing.org>

powerpc: Use RCU based pte freeing mechanism for all powerpc

Refactor the RCU based pte free code that was used on ppc64 to be used
on all powerpc.

Additionally refactor pte_free() & pte_free_kernel() into common code
between ppc32 & ppc64.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>


# ce0ad7f0 29-Jul-2008 Nick Piggin <npiggin@suse.de>

powerpc/mm: Lockless get_user_pages_fast() for 64-bit (v3)

Implement lockless get_user_pages_fast for 64-bit powerpc.

Page table existence is guaranteed with RCU, and speculative page references
are used to take a reference to the pages without having a prior existence
guarantee on them.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>


# d9b2b2a2 13-Feb-2008 David S. Miller <davem@davemloft.net>

[LIB]: Make PowerPC LMB code generic so sparc64 can use it too.

Signed-off-by: David S. Miller <davem@davemloft.net>


# fa28237c 23-Jan-2008 Paul Mackerras <paulus@samba.org>

[POWERPC] Provide a way to protect 4k subpages when using 64k pages

Using 64k pages on 64-bit PowerPC systems makes life difficult for
emulators that are trying to emulate an ISA, such as x86, which use a
smaller page size, since the emulator can no longer use the MMU and
the normal system calls for controlling page protections. Of course,
the emulator can emulate the MMU by checking and possibly remapping
the address for each memory access in software, but that is pretty
slow.

This provides a facility for such programs to control the access
permissions on individual 4k sub-pages of 64k pages. The idea is
that the emulator supplies an array of protection masks to apply to a
specified range of virtual addresses. These masks are applied at the
level where hardware PTEs are inserted into the hardware page table
based on the Linux PTEs, so the Linux PTEs are not affected. Note
that this new mechanism does not allow any access that would otherwise
be prohibited; it can only prohibit accesses that would otherwise be
allowed. This new facility is only available on 64-bit PowerPC and
only when the kernel is configured for 64k pages.

The masks are supplied using a new subpage_prot system call, which
takes a starting virtual address and length, and a pointer to an array
of protection masks in memory. The array has a 32-bit word per 64k
page to be protected; each 32-bit word consists of 16 2-bit fields,
for which 0 allows any access (that is otherwise allowed), 1 prevents
write accesses, and 2 or 3 prevent any access.

Implicit in this is that the regions of the address space that are
protected are switched to use 4k hardware pages rather than 64k
hardware pages (on machines with hardware 64k page support). In fact
the whole process is switched to use 4k hardware pages when the
subpage_prot system call is used, but this could be improved in future
to switch only the affected segments.

The subpage protection bits are stored in a 3 level tree akin to the
page table tree. The top level of this tree is stored in a structure
that is appended to the top level of the page table tree, i.e., the
pgd array. Since it will often only be 32-bit addresses (below 4GB)
that are protected, the pointers to the first four bottom level pages
are also stored in this structure (each bottom level page contains the
protection bits for 1GB of address space), so the protection bits for
addresses below 4GB can be accessed with one fewer loads than those
for higher addresses.

Signed-off-by: Paul Mackerras <paulus@samba.org>


# 2578bfae 20-Sep-2007 Stephen Rothwell <sfr@canb.auug.org.au>

[POWERPC] Create and use CONFIG_WORD_SIZE

Linus made this suggestion for the x86 merge and this starts the process
for powerpc. We assume that CONFIG_PPC64 implies CONFIG_PPC_MERGE and
CONFIG_PPC_STD_MMU_32 implies CONFIG_PPC_STD_MMU.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>


# 15f6527e 20-Aug-2007 Josh Boyer <jwboyer@linux.vnet.ibm.com>

[POWERPC] Rename 4xx paths to 40x

4xx is a bit of a misnomer for certain things, as they really apply to PowerPC
40x only. Rename some of the files to clean this up.

Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>


# 3d5134ee 03-Jun-2007 Benjamin Herrenschmidt <benh@kernel.crashing.org>

[POWERPC] Rewrite IO allocation & mapping on powerpc64

This rewrites pretty much from scratch the handling of MMIO and PIO
space allocations on powerpc64. The main goals are:

- Get rid of imalloc and use more common code where possible
- Simplify the current mess so that PIO space is allocated and
mapped in a single place for PCI bridges
- Handle allocation constraints of PIO for all bridges including
hot plugged ones within the 2GB space reserved for IO ports,
so that devices on hotplugged busses will now work with drivers
that assume IO ports fit in an int.
- Cleanup and separate tracking of the ISA space in the reserved
low 64K of IO space. No ISA -> Nothing mapped there.

I booted a cell blade with IDE on PIO and MMIO and a dual G5 so
far, that's it :-)

With this patch, all allocations are done using the code in
mm/vmalloc.c, though we use the low level __get_vm_area with
explicit start/stop constraints in order to manage separate
areas for vmalloc/vmap, ioremap, and PCI IOs.

This greatly simplifies a lot of things, as you can see in the
diffstat of that patch :-)

A new pair of functions pcibios_map/unmap_io_space() now replace
all of the previous code that used to manipulate PCI IOs space.
The allocation is done at mapping time, which is now called from
scan_phb's, just before the devices are probed (instead of after,
which is by itself a bug fix). The only other caller is the PCI
hotplug code for hot adding PCI-PCI bridges (slots).

imalloc is gone, as is the "sub-allocation" thing, but I do beleive
that hotplug should still work in the sense that the space allocation
is always done by the PHB, but if you unmap a child bus of this PHB
(which seems to be possible), then the code should properly tear
down all the HPTE mappings for that area of the PHB allocated IO space.

I now always reserve the first 64K of IO space for the bridge with
the ISA bus on it. I have moved the code for tracking ISA in a separate
file which should also make it smarter if we ever are capable of
hot unplugging or re-plugging an ISA bridge.

This should have a side effect on platforms like powermac where VGA IOs
will no longer work. This is done on purpose though as they would have
worked semi-randomly before. The idea at this point is to isolate drivers
that might need to access those and fix them by providing a proper
function to obtain an offset to the legacy IOs of a given bus.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>


# d0f13e3c 08-May-2007 Benjamin Herrenschmidt <benh@kernel.crashing.org>

[POWERPC] Introduce address space "slices"

The basic issue is to be able to do what hugetlbfs does but with
different page sizes for some other special filesystems; more
specifically, my need is:

- Huge pages

- SPE local store mappings using 64K pages on a 4K base page size
kernel on Cell

- Some special 4K segments in 64K-page kernels for mapping a dodgy
type of powerpc-specific infiniband hardware that requires 4K MMU
mappings for various reasons I won't explain here.

The main issues are:

- To maintain/keep track of the page size per "segment" (as we can
only have one page size per segment on powerpc, which are 256MB
divisions of the address space).

- To make sure special mappings stay within their allotted
"segments" (including MAP_FIXED crap)

- To make sure everybody else doesn't mmap/brk/grow_stack into a
"segment" that is used for a special mapping

Some of the necessary mechanisms to handle that were present in the
hugetlbfs code, but mostly in ways not suitable for anything else.

The patch relies on some changes to the generic get_unmapped_area()
that just got merged. It still hijacks hugetlb callbacks here or
there as the generic code hasn't been entirely cleaned up yet but
that shouldn't be a problem.

So what is a slice ? Well, I re-used the mechanism used formerly by our
hugetlbfs implementation which divides the address space in
"meta-segments" which I called "slices". The division is done using
256MB slices below 4G, and 1T slices above. Thus the address space is
divided currently into 16 "low" slices and 16 "high" slices. (Special
case: high slice 0 is the area between 4G and 1T).

Doing so simplifies significantly the tracking of segments and avoids
having to keep track of all the 256MB segments in the address space.

While I used the "concepts" of hugetlbfs, I mostly re-implemented
everything in a more generic way and "ported" hugetlbfs to it.

Slices can have an associated page size, which is encoded in the mmu
context and used by the SLB miss handler to set the segment sizes. The
hash code currently doesn't care, it has a specific check for hugepages,
though I might add a mechanism to provide per-slice hash mapping
functions in the future.

The slice code provide a pair of "generic" get_unmapped_area() (bottomup
and topdown) functions that should work with any slice size. There is
some trickiness here so I would appreciate people to have a look at the
implementation of these and let me know if I got something wrong.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>


# e22ba7e3 27-Nov-2006 Arnd Bergmann <arnd@arndb.de>

[POWERPC] ps3: multiplatform build fixes

A few code paths need to check whether or not they are running
on the PS3's LV1 hypervisor before making hcalls. This introduces
a new firmware feature bit for this, FW_FEATURE_PS3_LV1.

Now when both PS3 and IBM_CELL_BLADE are enabled, but not PSERIES,
FW_FEATURE_PS3_LV1 and FW_FEATURE_LPAR get enabled at compile time,
which is a bug. The same problem can also happen for (PPC_ISERIES &&
!PPC_PSERIES && PPC_SOMETHING_ELSE). In order to solve this, I
introduce a new CONFIG_PPC_NATIVE option that is set when at least
one platform is selected that can run without a hypervisor and then
turns the firmware feature check into a run-time option.

The new cell oprofile support that was recently merged does not
work on hypervisor based platforms like the PS3, therefore make
it depend on PPC_CELL_NATIVE instead of PPC_CELL. This may change
if we get oprofile support for PS3.

Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>


# b15125fa 18-Oct-2005 Kumar Gala <galak@freescale.com>

[PATCH] powerpc: Some more fixes to allow building for a Book-E processor

Some minor fixes that are needed if we are building for a book-e
processor.

Signed-off-by: Kumar K. Gala <kumar.gala@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>


# 3a5f8c5f 11-Oct-2005 Stephen Rothwell <sfr@canb.auug.org.au>

powerpc: make iSeries boot again

On ARCH=ppc64 we were getting htab_hash_mask recalculated
to the correct value for our particular machine by accident.
In the merge tree, that code was commented out, so htab_hash_mask
was being corrupted.

We now set ppc64_pft_size instead which gets htab_has_mask
calculated correctly for us later. We should put an
ibm,pft-size property in the device tree at some point.

Also set -mno-minimal-toc in some makefiles.
Allow iSeries to configure PROC_DEVICETREE.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>


# ab1f9dac 10-Oct-2005 Paul Mackerras <paulus@samba.org>

powerpc: Merge arch/ppc64/mm to arch/powerpc/mm

This moves the remaining files in arch/ppc64/mm to arch/powerpc/mm,
and arranges that we use them when compiling with ARCH=ppc64.

Signed-off-by: Paul Mackerras <paulus@samba.org>


# 70d64cea 10-Oct-2005 Paul Mackerras <paulus@samba.org>

powerpc: Rename files to have consistent _32/_64 suffixes

This doesn't change any code, just renames things so we consistently
have foo_32.c and foo_64.c where we have separate 32- and 64-bit
versions.

Signed-off-by: Paul Mackerras <paulus@samba.org>


# 7c8c6b97 05-Oct-2005 Paul Mackerras <paulus@samba.org>

powerpc: Merge lmb.c and make MM initialization use it.

This also creates merged versions of do_init_bootmem, paging_init
and mem_init and moves them to arch/powerpc/mm/mem.c. It gets rid
of the mem_pieces stuff.

I made memory_limit a parameter to lmb_enforce_memory_limit rather
than a global referenced by that function. This will require some
small changes to ppc64 if we want to continue building ARCH=ppc64
using the merged lmb.c.

Signed-off-by: Paul Mackerras <paulus@samba.org>


# 14cf11af 26-Sep-2005 Paul Mackerras <paulus@samba.org>

powerpc: Merge enough to start building in arch/powerpc.

This creates the directory structure under arch/powerpc and a bunch
of Kconfig files. It does a first-cut merge of arch/powerpc/mm,
arch/powerpc/lib and arch/powerpc/platforms/powermac. This is enough
to build a 32-bit powermac kernel with ARCH=powerpc.

For now we are getting some unmerged files from arch/ppc/kernel and
arch/ppc/syslib, or arch/ppc64/kernel. This makes some minor changes
to files in those directories and files outside arch/powerpc.

The boot directory is still not merged. That's going to be interesting.

Signed-off-by: Paul Mackerras <paulus@samba.org>