#
a9e1a3d8 |
|
13-Sep-2023 |
Baoquan He <bhe@redhat.com> |
crash_core: change the prototype of function parse_crashkernel() Add two parameters 'low_size' and 'high' to function parse_crashkernel(), later crashkernel=,high|low parsing will be added. Make adjustments in all call sites of parse_crashkernel() in arch. Link: https://lkml.kernel.org/r/20230914033142.676708-3-bhe@redhat.com Signed-off-by: Baoquan He <bhe@redhat.com> Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chen Jiahao <chenjiahao16@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
d1eb75e0 |
|
03-Jul-2023 |
Sourabh Jain <sourabhjain@linux.ibm.com> |
powerpc/fadump: reset dump area size if fadump memory reserve fails In case fadump_reserve_mem() fails to reserve memory, the reserve_dump_area_size variable will retain the reserve area size. This will lead to /sys/kernel/fadump/mem_reserved node displaying an incorrect memory reserved by fadump. To fix this problem, reserve dump area size variable is set to 0 if fadump failed to reserve memory. Fixes: 8255da95e545 ("powerpc/fadump: release all the memory above boot memory size") Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com> Acked-by: Mahesh Salgaonkar <mahesh@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230704050715.203581-1-sourabhjain@linux.ibm.com
|
#
5b89492c |
|
08-May-2022 |
Christophe Leroy <christophe.leroy@csgroup.eu> |
powerpc: Finalise cleanup around ABI use Now that we have CONFIG_PPC64_ELF_ABI_V1 and CONFIG_PPC64_ELF_ABI_V2, get rid of all indirect detection of ABI version. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/709d9d69523c14c8a9fba4486395dca0f2d675b1.1652074503.git.christophe.leroy@csgroup.eu
|
#
e6f6390a |
|
08-Mar-2022 |
Christophe Leroy <christophe.leroy@csgroup.eu> |
powerpc: Add missing headers Don't inherit headers "by chances" from asm/prom.h, asm/mpc52xx.h, asm/pci.h etc... Include the needed headers, and remove asm/prom.h when it was needed exclusively for pulling necessary headers. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/be8bdc934d152a7d8ee8d1a840d5596e2f7d85e0.1646767214.git.christophe.leroy@csgroup.eu
|
#
1fd02f66 |
|
30-Apr-2022 |
Julia Lawall <Julia.Lawall@inria.fr> |
powerpc: fix typos in comments Various spelling mistakes in comments. Detected with the help of Coccinelle. Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Reviewed-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220430185654.5855-1-Julia.Lawall@inria.fr
|
#
20776319 |
|
28-Jan-2022 |
Jiapeng Chong <jiapeng.chong@linux.alibaba.com> |
powerpc/fadump: Use swap() instead of open coding it Clean the following coccicheck warning: ./arch/powerpc/kernel/fadump.c:1291:34-35: WARNING opportunity for swap(). Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220129034847.76902-1-jiapeng.chong@linux.alibaba.com
|
#
887f56a0 |
|
29-Oct-2021 |
Randy Dunlap <rdunlap@infradead.org> |
powerpc/fadump: Correct two typos in a comment Fix typos of 'remaining' and 'those'. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Suggested-by: Matthew Wilcox <willy@infradead.org> # 'remaining' Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211030002619.2063-1-rdunlap@infradead.org
|
#
9cf3b3a3 |
|
06-Apr-2022 |
Hari Bathini <hbathini@linux.ibm.com> |
powerpc/fadump: align destination address to pagesize On crash, boot memory area is copied to a destination address by f/w. This region is setup as separate PT_LOAD segment with appropriate offset to handle the different physical address and offset in vmcore. If this destination address is not page aligned, reading the vmcore with mmap is likely to fail forcing tools like makedumpfile to fall back to regular read. Avoid mmap read failure by ensuring that the destination address is always page aligned. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220406093839.206608-3-hbathini@linux.ibm.com
|
#
15eb77f8 |
|
06-Apr-2022 |
Hari Bathini <hbathini@linux.ibm.com> |
powerpc/fadump: fix PT_LOAD segment for boot memory area Boot memory area is setup as separate PT_LOAD segment in the vmcore as it is moved by f/w, on crash, to a destination address provided by the kernel. Having separate PT_LOAD segment helps in handling the different physical address and offset for boot memory area in the vmcore. Commit ced1bf52f477 ("powerpc/fadump: merge adjacent memory ranges to reduce PT_LOAD segements") inadvertly broke this pre-condition for cases where some of the first kernel memory is available adjacent to boot memory area. This scenario is rare but possible when memory for fadump could not be reserved adjacent to boot memory area owing to memory hole or such. Reading memory from a vmcore exported in such scenario provides incorrect data. Fix it by ensuring no other region is folded into boot memory area. Fixes: ced1bf52f477 ("powerpc/fadump: merge adjacent memory ranges to reduce PT_LOAD segements") Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220406093839.206608-2-hbathini@linux.ibm.com
|
#
6584cec0 |
|
04-Apr-2022 |
Hari Bathini <hbathini@linux.ibm.com> |
powerpc/fadump: save CPU reg data in vmcore when PHYP terminates LPAR An LPAR can be terminated by the POWER Hypervisor (PHYP) for various reasons. If FADump was configured when PHYP terminates the LPAR, platform-assisted dump is initiated to save the kernel dump. But CPU register data would not be processed/saved in the vmcore in such case because CPU mask is set in crash_fadump() at the time of kernel crash and it remains unset in this case with LPAR being terminated by PHYP abruptly. To get around the problem, initialize cpu_mask to cpu_possible_mask so as to ensure all possible CPUs' register data is processed for the vmcore generated on PHYP terminated LPAR. Also, rename the crash info member variable from online_mask to cpu_mask as it doesn't necessarily have to be online CPU mask always. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220404182137.59231-1-hbathini@linux.ibm.com
|
#
9554e908 |
|
25-Mar-2022 |
Brian Gerst <brgerst@gmail.com> |
ELF: Remove elf_core_copy_kernel_regs() x86-32 was the last architecture that implemented separate user and kernel registers. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Andy Lutomirski <luto@kernel.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220325153953.162643-3-brgerst@gmail.com
|
#
607451ce |
|
01-Feb-2022 |
Hari Bathini <hbathini@linux.ibm.com> |
powerpc/fadump: register for fadump as early as possible Crash recovery (fadump) is setup in the userspace by some service. This service rebuilds initrd with dump capture capability, if it is not already dump capture capable before proceeding to register for firmware assisted dump (echo 1 > /sys/kernel/fadump/registered). But arming the kernel with crash recovery support does not have to wait for userspace configuration. So, register for fadump while setting it up itself. This can at worst lead to a scenario, where /proc/vmcore is ready afer crash but the initrd does not know how/where to offload it, which is always better than not having a /proc/vmcore at all due to incomplete configuration in the userspace at the time of crash. Commit 0823c68b054b ("powerpc/fadump: re-register firmware-assisted dump if already registered") ensures this change does not break userspace. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> [mpe: Reword comment] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220201105305.155511-1-hbathini@linux.ibm.com
|
#
ee97347f |
|
22-Mar-2022 |
Hari Bathini <hbathini@linux.ibm.com> |
powerpc/fadump: opt out from freeing pages on cma activation failure With commit a4e92ce8e4c8 ("powerpc/fadump: Reservationless firmware assisted dump"), Linux kernel's Contiguous Memory Allocator (CMA) based reservation was introduced in fadump. That change was aimed at using CMA to let applications utilize the memory reserved for fadump while blocking it from being used for kernel pages. The assumption was, even if CMA activation fails for whatever reason, the memory still remains reserved to avoid it from being used for kernel pages. But commit 072355c1cf2d ("mm/cma: expose all pages to the buddy if activation of an area fails") breaks this assumption as it started exposing all pages to buddy allocator on CMA activation failure. It led to warning messages like below while running crash-utility on vmcore of a kernel having above two commits: crash: seek error: kernel virtual address: <from reserved region> To fix this problem, opt out from exposing pages to buddy allocator on CMA activation failure for fadump reserved memory. Link: https://lkml.kernel.org/r/20220117075246.36072-3-hbathini@linux.ibm.com Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Mahesh Salgaonkar <mahesh@linux.ibm.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Sourabh Jain <sourabhjain@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
e16faf26 |
|
22-Mar-2022 |
David Hildenbrand <david@redhat.com> |
cma: factor out minimum alignment requirement Patch series "mm: enforce pageblock_order < MAX_ORDER". Having pageblock_order >= MAX_ORDER seems to be able to happen in corner cases and some parts of the kernel are not prepared for it. For example, Aneesh has shown [1] that such kernels can be compiled on ppc64 with 64k base pages by setting FORCE_MAX_ZONEORDER=8, which will run into a WARN_ON_ONCE(order >= MAX_ORDER) in comapction code right during boot. We can get pageblock_order >= MAX_ORDER when the default hugetlb size is bigger than the maximum allocation granularity of the buddy, in which case we are no longer talking about huge pages but instead gigantic pages. Having pageblock_order >= MAX_ORDER can only make alloc_contig_range() of such gigantic pages more likely to succeed. Reliable use of gigantic pages either requires boot time allcoation or CMA, no need to overcomplicate some places in the kernel to optimize for corner cases that are broken in other areas of the kernel. This patch (of 2): Let's enforce pageblock_order < MAX_ORDER and simplify. Especially patch #1 can be regarded a cleanup before: [PATCH v5 0/6] Use pageblock_order for cma and alloc_contig_range alignment. [2] [1] https://lkml.kernel.org/r/87r189a2ks.fsf@linux.ibm.com [2] https://lkml.kernel.org/r/20220211164135.1803616-1-zi.yan@sent.com Link: https://lkml.kernel.org/r/20220214174132.219303-2-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Zi Yan <ziy@nvidia.com> Acked-by: Rob Herring <robh@kernel.org> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Frank Rowand <frowand.list@gmail.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: John Garry via iommu <iommu@lists.linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
d276960d |
|
16-Dec-2021 |
Nick Child <nick.child@ibm.com> |
powerpc/kernel: Add __init attribute to eligible functions Some functions defined in `arch/powerpc/kernel` (and one in `arch/powerpc/ kexec`) are deserving of an `__init` macro attribute. These functions are only called by other initialization functions and therefore should inherit the attribute. Also, change function declarations in header files to include `__init`. Signed-off-by: Nick Child <nick.child@ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211216220035.605465-2-nick.child@ibm.com
|
#
06e629c2 |
|
07-Dec-2021 |
Hari Bathini <hbathini@linux.ibm.com> |
powerpc/fadump: Fix inaccurate CPU state info in vmcore generated with panic In panic path, fadump is triggered via a panic notifier function. Before calling panic notifier functions, smp_send_stop() gets called, which stops all CPUs except the panic'ing CPU. Commit 8389b37dffdc ("powerpc: stop_this_cpu: remove the cpu from the online map.") and again commit bab26238bbd4 ("powerpc: Offline CPU in stop_this_cpu()") started marking CPUs as offline while stopping them. So, if a kernel has either of the above commits, vmcore captured with fadump via panic path would not process register data for all CPUs except the panic'ing CPU. Sample output of crash-utility with such vmcore: # crash vmlinux vmcore ... KERNEL: vmlinux DUMPFILE: vmcore [PARTIAL DUMP] CPUS: 1 DATE: Wed Nov 10 09:56:34 EST 2021 UPTIME: 00:00:42 LOAD AVERAGE: 2.27, 0.69, 0.24 TASKS: 183 NODENAME: XXXXXXXXX RELEASE: 5.15.0+ VERSION: #974 SMP Wed Nov 10 04:18:19 CST 2021 MACHINE: ppc64le (2500 Mhz) MEMORY: 8 GB PANIC: "Kernel panic - not syncing: sysrq triggered crash" PID: 3394 COMMAND: "bash" TASK: c0000000150a5f80 [THREAD_INFO: c0000000150a5f80] CPU: 1 STATE: TASK_RUNNING (PANIC) crash> p -x __cpu_online_mask __cpu_online_mask = $1 = { bits = {0x2, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0} } crash> crash> crash> p -x __cpu_active_mask __cpu_active_mask = $2 = { bits = {0xff, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0} } crash> While this has been the case since fadump was introduced, the issue was not identified for two probable reasons: - In general, the bulk of the vmcores analyzed were from crash due to exception. - The above did change since commit 8341f2f222d7 ("sysrq: Use panic() to force a crash") started using panic() instead of deferencing NULL pointer to force a kernel crash. But then commit de6e5d38417e ("powerpc: smp_send_stop do not offline stopped CPUs") stopped marking CPUs as offline till kernel commit bab26238bbd4 ("powerpc: Offline CPU in stop_this_cpu()") reverted that change. To ensure post processing register data of all other CPUs happens as intended, let panic() function take the crash friendly path (read crash_smp_send_stop()) with the help of crash_kexec_post_notifiers option. Also, as register data for all CPUs is captured by f/w, skip IPI callbacks here for fadump, to avoid any complications in finding the right backtraces. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211207103719.91117-2-hbathini@linux.ibm.com
|
#
dbf77fed |
|
12-Aug-2021 |
Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> |
powerpc: rename powerpc_debugfs_root to arch_debugfs_dir No functional change in this patch. arch_debugfs_dir is the generic kernel name declared in linux/debugfs.h for arch-specific debugfs directory. Architectures like x86/s390 already use the name. Rename powerpc specific powerpc_debugfs_root to arch_debugfs_dir. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210812132831.233794-2-aneesh.kumar@linux.ibm.com
|
#
2e341f56 |
|
21-Apr-2021 |
Michael Ellerman <mpe@ellerman.id.au> |
powerpc/fadump: Fix sparse warnings Sparse says: arch/powerpc/kernel/fadump.c:48:16: warning: symbol 'fadump_kobj' was not declared. Should it be static? arch/powerpc/kernel/fadump.c:55:27: warning: symbol 'crash_mrange_info' was not declared. Should it be static? arch/powerpc/kernel/fadump.c:61:27: warning: symbol 'reserved_mrange_info' was not declared. Should it be static? arch/powerpc/kernel/fadump.c:83:12: warning: symbol 'fadump_cma_init' was not declared. Should it be static? And indeed none of them are used outside this file, they can all be made static. Also fadump_kobj needs to be moved inside the ifdef where it's used. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210421125402.1955013-1-mpe@ellerman.id.au
|
#
cbd3d5ba |
|
19-Apr-2021 |
Michael Ellerman <mpe@ellerman.id.au> |
powerpc/fadump: Fix compile error since trap type change sfr reports that the allyesconfig build fails with: arch/powerpc/kernel/fadump.c: In function 'crash_fadump': arch/powerpc/kernel/fadump.c:731:28: error: 'INTERRUPT_SYSTEM_RESET' undeclared 731 | if (TRAP(&(fdh->regs)) == INTERRUPT_SYSTEM_RESET) { Add an include of interrupt.h to fix it. Fixes: 7153d4bf0b37 ("powerpc/traps: Enhance readability for trap types") Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> [mpe: Reformat change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210419191425.281dc58a@canb.auug.org.au
|
#
7153d4bf |
|
14-Apr-2021 |
Xiongwei Song <sxwjean@gmail.com> |
powerpc/traps: Enhance readability for trap types Define macros to list ppc interrupt types in interttupt.h, replace the reference of the trap hex values with these macros. Referred the hex numbers in arch/powerpc/kernel/exceptions-64e.S, arch/powerpc/kernel/exceptions-64s.S, arch/powerpc/kernel/head_*.S, arch/powerpc/kernel/head_booke.h and arch/powerpc/include/asm/kvm_asm.h. Signed-off-by: Xiongwei Song <sxwjean@gmail.com> [mpe: Resolve conflicts in nmi_disables_ftrace(), fix 40x build] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1618398033-13025-1-git-send-email-sxwjean@me.com
|
#
fbced154 |
|
02-Mar-2021 |
Nathan Chancellor <nathan@kernel.org> |
powerpc/fadump: Mark fadump_calculate_reserve_size as __init If fadump_calculate_reserve_size() is not inlined, there is a modpost warning: WARNING: modpost: vmlinux.o(.text+0x5196c): Section mismatch in reference from the function fadump_calculate_reserve_size() to the function .init.text:parse_crashkernel() The function fadump_calculate_reserve_size() references the function __init parse_crashkernel(). This is often because fadump_calculate_reserve_size lacks a __init annotation or the annotation of parse_crashkernel is wrong. fadump_calculate_reserve_size() calls parse_crashkernel(), which is marked as __init and fadump_calculate_reserve_size() is called from within fadump_reserve_mem(), which is also marked as __init. Mark fadump_calculate_reserve_size() as __init to fix the section mismatch. Additionally, remove the inline keyword as it is not necessary to inline this function; the compiler is still free to do so if it feels it is worthwhile since commit 889b3c1245de ("compiler: remove CONFIG_OPTIMIZE_INLINING entirely"). Fixes: 11550dc0a00b ("powerpc/fadump: reuse crashkernel parameter for fadump memory reservation") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://github.com/ClangBuiltLinux/linux/issues/1300 Link: https://lore.kernel.org/r/20210302195013.2626335-1-nathan@kernel.org
|
#
b10d6bca |
|
13-Oct-2020 |
Mike Rapoport <rppt@kernel.org> |
arch, drivers: replace for_each_membock() with for_each_mem_range() There are several occurrences of the following pattern: for_each_memblock(memory, reg) { start = __pfn_to_phys(memblock_region_memory_base_pfn(reg); end = __pfn_to_phys(memblock_region_memory_end_pfn(reg)); /* do something with start and end */ } Using for_each_mem_range() iterator is more appropriate in such cases and allows simpler and cleaner code. [akpm@linux-foundation.org: fix arch/arm/mm/pmsa-v7.c build] [rppt@linux.ibm.com: mips: fix cavium-octeon build caused by memblock refactoring] Link: http://lkml.kernel.org/r/20200827124549.GD167163@linux.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Baoquan He <bhe@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Daniel Axtens <dja@axtens.net> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Emil Renner Berthing <kernel@esmil.dk> Cc: Hari Bathini <hbathini@linux.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: https://lkml.kernel.org/r/20200818151634.14343-13-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
c9118e6c |
|
13-Oct-2020 |
Mike Rapoport <rppt@kernel.org> |
arch, mm: replace for_each_memblock() with for_each_mem_pfn_range() There are several occurrences of the following pattern: for_each_memblock(memory, reg) { start_pfn = memblock_region_memory_base_pfn(reg); end_pfn = memblock_region_memory_end_pfn(reg); /* do something with start_pfn and end_pfn */ } Rather than iterate over all memblock.memory regions and each time query for their start and end PFNs, use for_each_mem_pfn_range() iterator to get simpler and clearer code. Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Baoquan He <bhe@redhat.com> Acked-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> [.clang-format] Cc: Andy Lutomirski <luto@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Daniel Axtens <dja@axtens.net> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Emil Renner Berthing <kernel@esmil.dk> Cc: Hari Bathini <hbathini@linux.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: https://lkml.kernel.org/r/20200818151634.14343-12-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
738e6cad |
|
18-Nov-2019 |
zhengbin <zhengbin13@huawei.com> |
powerpc/fadump: Remove set but not used variable 'elf' Fix gcc '-Wunused-but-set-variable' warning: arch/powerpc/kernel/fadump.c: In function fadump_update_elfcore_header: arch/powerpc/kernel/fadump.c:790:17: warning: variable elf set but not used [-Wunused-but-set-variable] It is introduced by commit ebaeb5ae2437 ("fadump: Convert firmware-assisted cpu state dump data into elf notes."), but never used, so remove it. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1574144074-142032-2-git-send-email-zhengbin13@huawei.com
|
#
5f987cae |
|
26-Jul-2020 |
Michael Ellerman <mpe@ellerman.id.au> |
powerpc/fadump: Fix build error with CONFIG_PRESERVE_FA_DUMP=y skiroot_defconfig fails: arch/powerpc/kernel/fadump.c:48:17: error: ‘cpus_in_fadump’ defined but not used 48 | static atomic_t cpus_in_fadump; Fix it by moving the definition into the #ifdef where it's used. Fixes: ba608c4fa12c ("powerpc/fadump: fix race between pstore write and fadump crash trigger") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200727070341.595634-1-mpe@ellerman.id.au
|
#
ba608c4f |
|
12-Jul-2020 |
Sourabh Jain <sourabhjain@linux.ibm.com> |
powerpc/fadump: fix race between pstore write and fadump crash trigger When we enter into fadump crash path via system reset we fail to update the pstore. On the system reset path we first update the pstore then we go for fadump crash. But the problem here is when all the CPUs try to get the pstore lock to initiate the pstore write, only one CPUs will acquire the lock and proceed with the pstore write. Since it in NMI context CPUs that fail to get lock do not wait for their turn to write to the pstore and simply proceed with the next operation which is fadump crash. One of the CPU who proceeded with fadump crash path triggers the crash and does not wait for the CPU who gets the pstore lock to complete the pstore update. Timeline diagram to depicts the sequence of events that leads to an unsuccessful pstore update when we hit fadump crash path via system reset. 1 2 3 ... n CPU Threads | | | | | | | | Reached to -->|--->|---->| ----------->| system reset | | | | path | | | | | | | | Try to -->|--->|---->|------------>| acquire the | | | | pstore lock | | | | | | | | | | | | Got the -->| +->| | |<-+ pstore lock | | | | | |--> Didn't get the | --------------------------+ lock and moving | | | | ahead on fadump | | | | crash path | | | | Begins the -->| | | | process to | | | |<-- Got the chance to update the | | | | trigger the crash pstore | -> | | ... <- | | | | | | | | | | | | |<-- Triggers the | | | | | | crash | | | | | | ^ | | | | | | | Writing to -->| | | | | | | pstore | | | | | | | | | | ^ |__________________| | | CPU Relax | | | +-----------------------------------------+ | v Race: crash triggered before pstore update completes To avoid this race condition a barrier is added on crash_fadump path, it prevents the CPU to trigger the crash until all the online CPUs completes their task. A barrier is added to make sure all the secondary CPUs hit the crash_fadump function before we initiates the crash. A timeout is kept to ensure the primary CPU (one who initiates the crash) do not wait for secondary CPUs indefinitely. Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200713052435.183750-1-sourabhjain@linux.ibm.com
|
#
9a2921e5 |
|
27-May-2020 |
Hari Bathini <hbathini@linux.ibm.com> |
powerpc/fadump: Account for memory_limit while reserving memory If the memory chunk found for reserving memory overshoots the memory limit imposed, do not proceed with reserving memory. Default behavior was this until commit 140777a3d8df ("powerpc/fadump: consider reserved ranges while reserving memory") changed it unwittingly. Fixes: 140777a3d8df ("powerpc/fadump: consider reserved ranges while reserving memory") Cc: stable@vger.kernel.org Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/159057266320.22331.6571453892066907320.stgit@hbathini.in.ibm.com
|
#
140777a3 |
|
20-Apr-2020 |
Hari Bathini <hbathini@linux.ibm.com> |
powerpc/fadump: consider reserved ranges while reserving memory Commit 0962e8004e97 ("powerpc/prom: Scan reserved-ranges node for memory reservations") enabled support to parse reserved-ranges DT node and reserve kernel memory falling in these ranges for F/W purposes. Memory reserved for FADump should not overlap with these ranges as it could corrupt memory meant for F/W or crash'ed kernel memory to be exported as vmcore. But since commit 579ca1a27675 ("powerpc/fadump: make use of memblock's bottom up allocation mode"), memblock_find_in_range() is being used to find the appropriate area to reserve memory for FADump, which can't account for reserved-ranges as these ranges are reserved only after FADump memory reservation. With reserved-ranges now being populated during early boot, look out for these memory ranges while reserving memory for FADump. Without this change, MPIPL on PowerNV systems aborts with hostboot failure, when memory reserved for FADump is less than 4096MB. Fixes: 579ca1a27675 ("powerpc/fadump: make use of memblock's bottom up allocation mode") Cc: stable@vger.kernel.org Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/158737297693.26700.16193820746269425424.stgit@hbathini.in.ibm.com
|
#
02c04e37 |
|
20-Apr-2020 |
Hari Bathini <hbathini@linux.ibm.com> |
powerpc/fadump: use static allocation for reserved memory ranges At times, memory ranges have to be looked up during early boot, when kernel couldn't be initialized for dynamic memory allocation. In fact, reserved-ranges look up is needed during FADump memory reservation. Without accounting for reserved-ranges in reserving memory for FADump, MPIPL boot fails with memory corruption issues. So, extend memory ranges handling to support static allocation and populate reserved memory ranges during early boot. Fixes: dda9dbfeeb7a ("powerpc/fadump: consider reserved ranges while releasing memory") Cc: stable@vger.kernel.org Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/158737294432.26700.4830263187856221314.stgit@hbathini.in.ibm.com
|
#
860286cf |
|
09-Feb-2020 |
Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
powerpc/kernel: no need to check return value of debugfs_create functions When calling debugfs functions, there is no need to ever check the return value. The function can work or not, but the code logic should never do something different based on this. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200209105901.1620958-1-gregkh@linuxfoundation.org
|
#
d8e73458 |
|
11-Dec-2019 |
Sourabh Jain <sourabhjain@linux.ibm.com> |
powerpc/fadump: sysfs for fadump memory reservation Add a sys interface to allow querying the memory reserved by FADump for saving the crash dump. Also added Documentation/ABI for the new sysfs file. Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191211160910.21656-7-sourabhjain@linux.ibm.com
|
#
d418b19f |
|
11-Dec-2019 |
Sourabh Jain <sourabhjain@linux.ibm.com> |
powerpc/fadump: Reorganize /sys/kernel/fadump_* sysfs files As the number of FADump sysfs files increases it is hard to manage all of them inside /sys/kernel directory. It's better to have all the FADump related sysfs files in a dedicated directory /sys/kernel/fadump. But in order to maintain backward compatibility a symlink has been added for every sysfs that has moved to new location. As the FADump sysfs files are now part of a dedicated directory there is no need to prefix their name with fadump_, hence sysfs file names are also updated. For example fadump_enabled sysfs file is now referred as enabled. Also consolidate ABI documentation for all the FADump sysfs files in a single file Documentation/ABI/testing/sysfs-kernel-fadump. Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com> Tested-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191211160910.21656-4-sourabhjain@linux.ibm.com
|
#
565f9bc0 |
|
07-Nov-2019 |
Michal Suchanek <msuchanek@suse.de> |
powerpc/fadump: when fadump is supported register the fadump sysfs files. Currently it is not possible to distinguish the case when fadump is supported by firmware and disabled in kernel and completely unsupported using the kernel sysfs interface. User can investigate the devicetree but it is more reasonable to provide sysfs files in case we get some fadumpv2 in the future. With this patch sysfs files are available whenever fadump is supported by firmware. There is duplicate message about lack of support by firmware in fadump_reserve_mem and setup_fadump. Remove the duplicate message in setup_fadump. Signed-off-by: Michal Suchanek <msuchanek@suse.de> Reviewed-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191107164757.15140-1-msuchanek@suse.de
|
#
7dee93a9 |
|
11-Sep-2019 |
Hari Bathini <hbathini@linux.ibm.com> |
powerpc/fadump: support holes in kernel boot memory area With support to copy multiple kernel boot memory regions owing to copy size limitation, also handle holes in the memory area to be preserved. Support as many as 128 kernel boot memory regions. This allows having an adequate FADump capture kernel size for different scenarios. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/156821385448.5656.6124791213910877759.stgit@hbathini.in.ibm.com
|
#
becd91d9 |
|
11-Sep-2019 |
Hari Bathini <hbathini@linux.ibm.com> |
powerpc/fadump: remove RMA_START and RMA_END macros RMA_START is defined as '0' and there is even a BUILD_BUG_ON() to make sure it is never anything else. Remove this macro and use '0' instead as code change is needed anyway when it has to be something else. Also, remove unused RMA_END macro. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/156821384096.5656.15026984053970204652.stgit@hbathini.in.ibm.com
|
#
7b1b3b48 |
|
11-Sep-2019 |
Hari Bathini <hbathini@linux.ibm.com> |
powerpc/fadump: consider f/w load area OPAL loads kernel & initrd at 512MB offset (256MB size), also exported as ibm,opal/dump/fw-load-area. So, if boot memory size of FADump is less than 768MB, kernel memory to be exported as '/proc/vmcore' would be overwritten by f/w while loading kernel & initrd. To avoid such a scenario, enforce a minimum boot memory size of 768MB on OPAL platform and skip using FADump if a newer F/W version loads kernel & initrd above 768MB. Also, irrespective of RMA size, set the minimum boot memory size expected on pseries platform at 320MB. This is to avoid inflating the minimum memory requirements on systems with 512M/1024M RMA size. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/156821381414.5656.1592867278535469652.stgit@hbathini.in.ibm.com
|
#
bec53196 |
|
11-Sep-2019 |
Hari Bathini <hbathini@linux.ibm.com> |
powerpc/fadump: add support to preserve crash data on FADUMP disabled kernel Add a new kernel config option, CONFIG_PRESERVE_FA_DUMP that ensures that crash data, from previously crash'ed kernel, is preserved. This helps in cases where FADump is not enabled but the subsequent memory preserving kernel boot is likely to process this crash data. One typical usecase for this config option is petitboot kernel. As OPAL allows registering address with it in the first kernel and retrieving it after MPIPL, use it to store the top of boot memory. A kernel that intends to preserve crash data retrieves it and avoids using memory beyond this address. Move arch_reserved_kernel_pages() function as it is needed for both FA_DUMP and PRESERVE_FA_DUMP configurations. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/156821375751.5656.11459483669542541602.stgit@hbathini.in.ibm.com
|
#
b2a815a5 |
|
11-Sep-2019 |
Hari Bathini <hbathini@linux.ibm.com> |
powerpc/fadump: improve how crashed kernel's memory is reserved The size parameter to fadump_reserve_crash_area() function is not needed as all the memory above boot memory size must be preserved anyway. Update the function by dropping this redundant parameter. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/156821374440.5656.2945512543806951766.stgit@hbathini.in.ibm.com
|
#
dda9dbfe |
|
11-Sep-2019 |
Hari Bathini <hbathini@linux.ibm.com> |
powerpc/fadump: consider reserved ranges while releasing memory Commit 0962e8004e97 ("powerpc/prom: Scan reserved-ranges node for memory reservations") enabled support to parse 'reserved-ranges' DT node to reserve kernel memory falling in these ranges for firmware purposes. Along with the preserved area memory, ensure memory in reserved ranges is not overlapped with memory released by capture kernel aftering saving vmcore. Also, fix the off-by-one error in fadump_release_reserved_area function while releasing memory. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/156821371358.5656.6061214942558818661.stgit@hbathini.in.ibm.com
|
#
e4fc48fb |
|
11-Sep-2019 |
Hari Bathini <hbathini@linux.ibm.com> |
powerpc/fadump: make crash memory ranges array allocation generic Make allocate_crash_memory_ranges() and free_crash_memory_ranges() functions generic to reuse them for memory management of all types of dynamic memory range arrays. This change helps in memory management of reserved ranges array to be added later. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/156821369863.5656.4375667005352155892.stgit@hbathini.in.ibm.com
|
#
579ca1a2 |
|
11-Sep-2019 |
Hari Bathini <hbathini@linux.ibm.com> |
powerpc/fadump: make use of memblock's bottom up allocation mode Earlier, memblock_find_in_range() was not used to find the memory to be reserved for FADump as bottom up allocation mode was not supported. But since commit 79442ed189acb8b ("mm/memblock.c: introduce bottom-up allocation mode") bottom up allocation mode is supported for memblock. So, use it to find the memory to be reserved for FADump. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/156821364211.5656.14336025460336135194.stgit@hbathini.in.ibm.com
|
#
a4e2e2ca |
|
11-Sep-2019 |
Hari Bathini <hbathini@linux.ibm.com> |
powerpc/fadump: handle invalidation of crashdump and re-registraion Make OPAL call to indicate that the dump is processed and the metadata area in OPAL can be cleared/released. Also, setup/initialize FADump for re-registration. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/156821356046.5656.12270927048195494911.stgit@hbathini.in.ibm.com
|
#
2790d01d |
|
11-Sep-2019 |
Hari Bathini <hbathini@linux.ibm.com> |
powerpc/fadump: reset metadata address during clean up During kexec boot, metadata address needs to be reset to avoid running into errors interpreting stale metadata address, in case the kexec'ed kernel crashes before metadata address could be setup again. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/156821346629.5656.10783321582005237813.stgit@hbathini.in.ibm.com
|
#
742a265a |
|
11-Sep-2019 |
Hari Bathini <hbathini@linux.ibm.com> |
powerpc/fadump: register kernel metadata address with opal OPAL allows registering address with it in the first kernel and retrieving it after MPIPL. Setup kernel metadata and register its address with OPAL to use it for processing the crash dump. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/156821345011.5656.13567765019032928471.stgit@hbathini.in.ibm.com
|
#
6abec12c |
|
11-Sep-2019 |
Hari Bathini <hbathini@linux.ibm.com> |
powerpc/fadump: improve fadump_reserve_mem() Some code clean-up like using minimal assignments and updating printk messages. Also, add an 'error_out' label for handling error cleanup at one place. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/156821343485.5656.10202857091553646948.stgit@hbathini.in.ibm.com
|
#
41df5928 |
|
11-Sep-2019 |
Hari Bathini <hbathini@linux.ibm.com> |
powerpc/fadump: add fadump support on powernv Add basic callback functions for FADump on PowerNV platform. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/156821342072.5656.4346362203141486452.stgit@hbathini.in.ibm.com
|
#
f3512011 |
|
11-Sep-2019 |
Hari Bathini <hbathini@linux.ibm.com> |
pseries/fadump: move out platform specific support from generic code Move code that supports processing the crash'ed kernel's memory preserved by firmware to platform specific callback functions. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/156821337690.5656.13050665924800177744.stgit@hbathini.in.ibm.com
|
#
8255da95 |
|
11-Sep-2019 |
Hari Bathini <hbathini@linux.ibm.com> |
powerpc/fadump: release all the memory above boot memory size Except for Reserved dump area (see Documentation/powerpc/firmware- assisted-dump.rst) which is permanent reserved, all memory above boot memory size, where boot memory size is the memory required for the kernel to boot successfully when booted with restricted memory (memory for capture kernel), is released when the dump is invalidated. Make this a bit more explicit in the code. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/156821336092.5656.1079046285366041687.stgit@hbathini.in.ibm.com
|
#
41a65d16 |
|
11-Sep-2019 |
Hari Bathini <hbathini@linux.ibm.com> |
pseries/fadump: define RTAS register/un-register callback functions Move platform specific register/un-register code, the RTAS calls, to register/un-register callback functions. This would also mean moving code that initializes and prints the platform specific FADump data. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/156821332856.5656.16380417702046411631.stgit@hbathini.in.ibm.com
|
#
d3833a70 |
|
11-Sep-2019 |
Hari Bathini <hbathini@linux.ibm.com> |
powerpc/fadump: introduce callbacks for platform specific operations Introduce callback functions for platform specific operations like register, unregister, invalidate & such. Also, define place-holders for the same on pSeries platform. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/156821330286.5656.15538934400074110770.stgit@hbathini.in.ibm.com
|
#
0226e552 |
|
11-Sep-2019 |
Hari Bathini <hbathini@linux.ibm.com> |
powerpc/fadump: move rtas specific definitions to platform code Currently, FADump is only supported on pSeries but that is going to change soon with FADump support being added on PowerNV platform. So, move rtas specific definitions to platform code to allow FADump to have multiple platforms support. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/156821328494.5656.16219929140866195511.stgit@hbathini.in.ibm.com
|
#
72aa6517 |
|
11-Sep-2019 |
Hari Bathini <hbathini@linux.ibm.com> |
powerpc/fadump: use helper functions to reserve/release cpu notes buffer Use helper functions to simplify memory allocation, pinning down and freeing the memory used for CPU notes buffer. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/156821323555.5656.2486038022572739622.stgit@hbathini.in.ibm.com
|
#
7f0ad11d |
|
11-Sep-2019 |
Hari Bathini <hbathini@linux.ibm.com> |
powerpc/fadump: declare helper functions in internal header file Declare helper functions, that can be reused by multiple platforms, in the internal header file. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/156821320487.5656.2660730464212209984.stgit@hbathini.in.ibm.com
|
#
961cf26a |
|
11-Sep-2019 |
Hari Bathini <hbathini@linux.ibm.com> |
powerpc/fadump: add helper functions Add helper functions to setup & free CPU notes buffer and to find if a given memory area is contiguous. Also, use boolean as return type for the function that finds if boot memory area is contiguous. While at it, save the virtual address of CPU notes buffer instead of physical address as virtual address is used often. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/156821318971.5656.9281936950510635858.stgit@hbathini.in.ibm.com
|
#
ca986d7f |
|
11-Sep-2019 |
Hari Bathini <hbathini@linux.ibm.com> |
powerpc/fadump: move internal macros/definitions to a new header Though asm/fadump.h is meant to be used by other components dealing with FADump, it also has macros/definitions internal to FADump code. Move them to a new header file used within FADump code. This also makes way for refactoring platform specific FADump code. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/156821313134.5656.6597770626574392140.stgit@hbathini.in.ibm.com
|
#
1a59d1b8 |
|
27-May-2019 |
Thomas Gleixner <tglx@linutronix.de> |
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 156 Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with this program if not write to the free software foundation inc 59 temple place suite 330 boston ma 02111 1307 usa extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 1334 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Richard Fontana <rfontana@redhat.com> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190527070033.113240726@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
45d0ba52 |
|
25-Apr-2019 |
Christophe Leroy <christophe.leroy@c-s.fr> |
powerpc/mm: move hugetlb_disabled into asm/hugetlb.h No need to have this in asm/page.h, move it into asm/hugetlb.h Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
0db6896f |
|
20-Aug-2018 |
Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> |
powerpc/fadump: Do not allow hot-remove memory from fadump reserved area. For fadump to work successfully there should not be any holes in reserved memory ranges where kernel has asked firmware to move the content of old kernel memory in event of crash. Now that fadump uses CMA for reserved area, this memory area is now not protected from hot-remove operations unless it is cma allocated. Hence, fadump service can fail to re-register after the hot-remove operation, if hot-removed memory belongs to fadump reserved region. To avoid this make sure that memory from fadump reserved area is not hot-removable if fadump is registered. However, if user still wants to remove that memory, he can do so by manually stopping fadump service before hot-remove operation. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
f86593be |
|
20-Aug-2018 |
Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> |
powerpc/fadump: Throw proper error message on fadump registration failure fadump fails to register when there are holes in reserved memory area. This can happen if user has hot-removed a memory that falls in the fadump reserved memory area. Throw a meaningful error message to the user in such case. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> [mpe: is_reserved_memory_area_contiguous() returns bool, unsplit string] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
a4e92ce8 |
|
20-Aug-2018 |
Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> |
powerpc/fadump: Reservationless firmware assisted dump One of the primary issues with Firmware Assisted Dump (fadump) on Power is that it needs a large amount of memory to be reserved. On large systems with TeraBytes of memory, this reservation can be quite significant. In some cases, fadump fails if the memory reserved is insufficient, or if the reserved memory was DLPAR hot-removed. In the normal case, post reboot, the preserved memory is filtered to extract only relevant areas of interest using the makedumpfile tool. While the tool provides flexibility to determine what needs to be part of the dump and what memory to filter out, all supported distributions default this to "Capture only kernel data and nothing else". We take advantage of this default and the Linux kernel's Contiguous Memory Allocator (CMA) to fundamentally change the memory reservation model for fadump. Instead of setting aside a significant chunk of memory nobody can use, this patch uses CMA instead, to reserve a significant chunk of memory that the kernel is prevented from using (due to MIGRATE_CMA), but applications are free to use it. With this fadump will still be able to capture all of the kernel memory and most of the user space memory except the user pages that were present in CMA region. Essentially, on a P9 LPAR with 2 cores, 8GB RAM and current upstream: [root@zzxx-yy10 ~]# free -m total used free shared buff/cache available Mem: 7557 193 6822 12 541 6725 Swap: 4095 0 4095 With this patch: [root@zzxx-yy10 ~]# free -m total used free shared buff/cache available Mem: 8133 194 7464 12 475 7338 Swap: 4095 0 4095 Changes made here are completely transparent to how fadump has traditionally worked. Thanks to Aneesh Kumar and Anshuman Khandual for helping us understand CMA and its usage. TODO: - Handle case where CMA reservation spans nodes. Signed-off-by: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
f6cee260 |
|
05-Nov-2018 |
Yangtao Li <tiny.windzz@gmail.com> |
powerpc/fadump: Change to use DEFINE_SHOW_ATTRIBUTE macro Use DEFINE_SHOW_ATTRIBUTE macro to simplify the code. Signed-off-by: Yangtao Li <tiny.windzz@gmail.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
0823c68b |
|
14-Sep-2018 |
Hari Bathini <hbathini@linux.ibm.com> |
powerpc/fadump: re-register firmware-assisted dump if already registered Firmware-Assisted Dump (FADump) needs to be registered again after any memory hot add/remove operation to update the crash memory ranges. But currently, the kernel returns '-EEXIST' if we try to register without uregistering it first. This could expose the system to racing issues while unregistering and registering FADump from userspace during udev events. Spare the userspace of this and let it be taken care of in the kernel space for a simpler interface. Since this change, running 'echo 1 > /sys/kernel/fadump_registered' would result in re-regisering (unregistering and registering) FADump, if it was already registered. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Acked-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
a5818313 |
|
17-Aug-2018 |
Hari Bathini <hbathini@linux.ibm.com> |
powerpc/fadump: cleanup crash memory ranges support Commit 1bd6a1c4b80a ("powerpc/fadump: handle crash memory ranges array index overflow") changed crash memory ranges to a dynamic array that is reallocated on-demand with krealloc(). The relevant header for this call was not included. The kernel compiles though. But be cautious and add the header anyway. Also, memory allocation logic in fadump_add_crash_memory() takes care of memory allocation for crash memory ranges in all scenarios. Drop unnecessary memory allocation in fadump_setup_crash_memory_ranges(). Fixes: 1bd6a1c4b80a ("powerpc/fadump: handle crash memory ranges array index overflow") Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
ced1bf52 |
|
06-Aug-2018 |
Hari Bathini <hbathini@linux.ibm.com> |
powerpc/fadump: merge adjacent memory ranges to reduce PT_LOAD segements With dynamic memory allocation support for crash memory ranges array, there is no hard limit on the no. of crash memory ranges kernel could export, but program headers count could overflow in the /proc/vmcore ELF file while exporting each memory range as PT_LOAD segment. Reduce the likelihood of a such scenario, by folding adjacent crash memory ranges which minimizes the total number of PT_LOAD segments. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
1bd6a1c4 |
|
06-Aug-2018 |
Hari Bathini <hbathini@linux.ibm.com> |
powerpc/fadump: handle crash memory ranges array index overflow Crash memory ranges is an array of memory ranges of the crashing kernel to be exported as a dump via /proc/vmcore file. The size of the array is set based on INIT_MEMBLOCK_REGIONS, which works alright in most cases where memblock memory regions count is less than INIT_MEMBLOCK_REGIONS value. But this count can grow beyond INIT_MEMBLOCK_REGIONS value since commit 142b45a72e22 ("memblock: Add array resizing support"). On large memory systems with a few DLPAR operations, the memblock memory regions count could be larger than INIT_MEMBLOCK_REGIONS value. On such systems, registering fadump results in crash or other system failures like below: task: c00007f39a290010 ti: c00000000b738000 task.ti: c00000000b738000 NIP: c000000000047df4 LR: c0000000000f9e58 CTR: c00000000010f180 REGS: c00000000b73b570 TRAP: 0300 Tainted: G L X (4.4.140+) MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 22004484 XER: 20000000 CFAR: c000000000008500 DAR: 000007a450000000 DSISR: 40000000 SOFTE: 0 ... NIP [c000000000047df4] smp_send_reschedule+0x24/0x80 LR [c0000000000f9e58] resched_curr+0x138/0x160 Call Trace: resched_curr+0x138/0x160 (unreliable) check_preempt_curr+0xc8/0xf0 ttwu_do_wakeup+0x38/0x150 try_to_wake_up+0x224/0x4d0 __wake_up_common+0x94/0x100 ep_poll_callback+0xac/0x1c0 __wake_up_common+0x94/0x100 __wake_up_sync_key+0x70/0xa0 sock_def_readable+0x58/0xa0 unix_stream_sendmsg+0x2dc/0x4c0 sock_sendmsg+0x68/0xa0 ___sys_sendmsg+0x2cc/0x2e0 __sys_sendmsg+0x5c/0xc0 SyS_socketcall+0x36c/0x3f0 system_call+0x3c/0x100 as array index overflow is not checked for while setting up crash memory ranges causing memory corruption. To resolve this issue, dynamically allocate memory for crash memory ranges and resize it incrementally, in units of pagesize, on hitting array size limit. Fixes: 2df173d9e85d ("fadump: Initialize elfcore header and add PT_LOAD program headers.") Cc: stable@vger.kernel.org # v3.4+ Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> [mpe: Just use PAGE_SIZE directly, fixup variable placement] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
722cde76 |
|
27-Apr-2018 |
Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> |
powerpc/fadump: Unregister fadump on kexec down path. Unregister fadump on kexec down path otherwise the fadump registration in new kexec-ed kernel complains that fadump is already registered. This makes new kernel to continue using fadump registered by previous kernel which may lead to invalid vmcore generation. Hence this patch fixes this issue by un-registering fadump in fadump_cleanup() which is called during kexec path so that new kernel can register fadump with new valid values. Fixes: b500afff11f6 ("fadump: Invalidate registration and release reserved memory for general use.") Cc: stable@vger.kernel.org # v3.4+ Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
85975387 |
|
10-Apr-2018 |
Hari Bathini <hbathini@linux.vnet.ibm.com> |
powerpc/fadump: Do not use hugepages when fadump is active FADump capture kernel boots in restricted memory environment preserving the context of previous kernel to save vmcore. Supporting hugepages in such environment makes things unnecessarily complicated, as hugepages need memory set aside for them. This means most of the capture kernel's memory is used in supporting hugepages. In most cases, this results in out-of-memory issues while booting FADump capture kernel. But hugepages are not of much use in capture kernel whose only job is to save vmcore. So, disabling hugepages support, when fadump is active, is a reliable solution for the out of memory issues. Introducing a flag variable to disable HugeTLB support when fadump is active. Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
b71a693d |
|
10-Apr-2018 |
Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> |
powerpc/fadump: exclude memory holes while reserving memory in second kernel The second kernel, during early boot after the crash, reserves rest of the memory above boot memory size to make sure it does not touch any of the dump memory area. It uses memblock_reserve() that reserves the specified memory region irrespective of memory holes present within that region. There are chances where previous kernel would have hot removed some of its memory leaving memory holes behind. In such cases fadump kernel reports incorrect number of reserved pages through arch_reserved_kernel_pages() hook causing kernel to hang or panic. Fix this by excluding memory holes while reserving rest of the memory above boot memory size during second kernel boot after crash. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
ab9dbf77 |
|
03-Dec-2017 |
David Gibson <david@gibson.dropbear.id.au> |
Revert "powerpc: Do not call ppc_md.panic in fadump panic notifier" This reverts commit a3b2cb30f252b21a6f962e0dd107c8b897ca65e4. That commit tried to fix problems with panic on powerpc in certain circumstances, where some output from the generic panic code was being dropped. Unfortunately, it breaks things worse in other circumstances. In particular when running a PAPR guest, it will now attempt to reboot instead of informing the hypervisor (KVM or PowerVM) that the guest has crashed. The crash notification is important to some virtualization management layers. Revert it for now until we can come up with a better solution. Fixes: a3b2cb30f252 ("powerpc: Do not call ppc_md.panic in fadump panic notifier") Cc: stable@vger.kernel.org # v4.14+ Signed-off-by: David Gibson <david@gibson.dropbear.id.au> [mpe: Tweak change log a bit] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
dcdc4679 |
|
26-Jun-2017 |
Michal Suchanek <msuchanek@suse.de> |
powerpc/fadump: use kstrtoint to handle sysfs store Currently sysfs store handlers in fadump use if buf[0] == 'char'. This means input "100foo" is interpreted as '1' and "01" as '0'. Change to kstrtoint so leading zeroes and the like is handled in expected way. Signed-off-by: Michal Suchanek <msuchanek@suse.de> Acked-by: Hari Bathini <hbathini@linux.vnet.ibm.com> Signed-off-by: Michal Suchanek <a class="moz-txt-link-rfc2396E" href="mailto:msuchanek@suse.de"><msuchanek@suse.de></a></pre> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
6fcd6baa |
|
19-Jul-2017 |
Nicholas Piggin <npiggin@gmail.com> |
powerpc/powernv: Use kernel crash path for machine checks There are quite a few machine check exceptions that can be caused by kernel bugs. To make debugging easier, use the kernel crash path in cases of synchronous machine checks that occur in kernel mode, if that would not result in the machine going straight to panic or crash dump. There is a downside here that die()ing the process in kernel mode can still leave the system unstable. panic_on_oops will always force the system to fail-stop, so systems where that behaviour is important will still do the right thing. As a test, when triggering an i-side 0111b error (ifetch from foreign address) in kernel mode process context on POWER9, the kernel currently dies quickly like this: Severe Machine check interrupt [Not recovered] NIP [ffff000000000000]: 0xffff000000000000 Initiator: CPU Error type: Real address [Instruction fetch (foreign)] [ 127.426651616,0] OPAL: Reboot requested due to Platform error. Effective[ 127.426693712,3] OPAL: Reboot requested due to Platform error. address: ffff000000000000 opal: Reboot type 1 not supported Kernel panic - not syncing: PowerNV Unrecovered Machine Check CPU: 56 PID: 4425 Comm: syscall Tainted: G M 4.12.0-rc1-13857-ga4700a261072-dirty #35 Call Trace: [ 128.017988928,4] IPMI: BUG: Dropping ESEL on the floor due to buggy/mising code in OPAL for this BMC Rebooting in 10 seconds.. Trying to free IRQ 496 from IRQ context! After this patch, the process is killed and the kernel continues with this message, which gives enough information to identify the offending branch (i.e., with CFAR): Severe Machine check interrupt [Not recovered] NIP [ffff000000000000]: 0xffff000000000000 Initiator: CPU Error type: Real address [Instruction fetch (foreign)] Effective address: ffff000000000000 Oops: Machine check, sig: 7 [#1] SMP NR_CPUS=2048 NUMA PowerNV Modules linked in: iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 ... CPU: 22 PID: 4436 Comm: syscall Tainted: G M 4.12.0-rc1-13857-ga4700a261072-dirty #36 task: c000000932300000 task.stack: c000000932380000 NIP: ffff000000000000 LR: 00000000217706a4 CTR: ffff000000000000 REGS: c00000000fc8fd80 TRAP: 0200 Tainted: G M (4.12.0-rc1-13857-ga4700a261072-dirty) MSR: 90000000001c1003 <SF,HV,ME,RI,LE> CR: 24000484 XER: 20000000 CFAR: c000000000004c80 DAR: 0000000021770a90 DSISR: 0a000000 SOFTE: 1 GPR00: 0000000000001ebe 00007fffce4818b0 0000000021797f00 0000000000000000 GPR04: 00007fff8007ac24 0000000044000484 0000000000004000 00007fff801405e8 GPR08: 900000000280f033 0000000024000484 0000000000000000 0000000000000030 GPR12: 9000000000001003 00007fff801bc370 0000000000000000 0000000000000000 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR24: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR28: 00007fff801b0000 0000000000000000 00000000217707a0 00007fffce481918 NIP [ffff000000000000] 0xffff000000000000 LR [00000000217706a4] 0x217706a4 Call Trace: Instruction dump: XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
a3b2cb30 |
|
04-Jul-2017 |
Nicholas Piggin <npiggin@gmail.com> |
powerpc: Do not call ppc_md.panic in fadump panic notifier If fadump is not registered, and no other crash or debug handlers are registered, the powerpc panic handler stops the guest before the generic panic code can push out debug information to the console. Currently, system reset injection causes the guest to silently stop. Stop calling ppc_md.panic in the panic notifier. crash_fadump already does rtas_os_term() to terminate the guest if fadump is registered. Remove ppc_md.panic. Move fadump panic notifier into fadump code. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
5203f499 |
|
12-Jul-2017 |
Xunlei Pang <xlpang@redhat.com> |
powerpc/fadump: use the correct VMCOREINFO_NOTE_SIZE for phdr vmcoreinfo_max_size stands for the vmcoreinfo_data, the correct one we should use is vmcoreinfo_note whose total size is VMCOREINFO_NOTE_SIZE. Like explained in commit 77019967f06b ("kdump: fix exported size of vmcoreinfo note"), it should not affect the actual function, but we better fix it, also this change should be safe and backward compatible. After this, we can get rid of variable vmcoreinfo_max_size, let's use the corresponding macros directly, fewer variables means more safety for vmcoreinfo operation. [xlpang@redhat.com: fix build warning] Link: http://lkml.kernel.org/r/1494830606-27736-1-git-send-email-xlpang@redhat.com Link: http://lkml.kernel.org/r/1493281021-20737-2-git-send-email-xlpang@redhat.com Signed-off-by: Xunlei Pang <xlpang@redhat.com> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Reviewed-by: Dave Young <dyoung@redhat.com> Cc: Hari Bathini <hbathini@linux.vnet.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Juergen Gross <jgross@suse.com> Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
68fa6478 |
|
01-Jun-2017 |
Hari Bathini <hbathini@linux.vnet.ibm.com> |
powerpc/fadump: add reschedule point while releasing memory Around 95% of memory is reserved by fadump/capture kernel. All this memory is freed, one page at a time, on writing '1' to the node /sys/kernel/fadump_release_mem. On systems with large memory, this can take a long time to complete, leading to soft lockup warning messages. To avoid this, add reschedule points at regular intervals. Also, while memblock_reserve() implicitly takes care of holes in the given memory range while reserving memory, those holes need to be taken care of while releasing memory as memory is freed one page at a time. Add support to skip holes while releasing memory. Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
a5a05b91 |
|
01-Jun-2017 |
Hari Bathini <hbathini@linux.vnet.ibm.com> |
powerpc/fadump: provide a helpful error message fadump fails to register when there are holes in boot memory area. Provide a helpful error message to the user in such case. Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
eae0dfcc |
|
01-Jun-2017 |
Hari Bathini <hbathini@linux.vnet.ibm.com> |
powerpc/fadump: avoid holes in boot memory area when fadump is registered To register fadump, boot memory area - the size of low memory chunk that is required for a kernel to boot successfully when booted with restricted memory, is assumed to have no holes. But this memory area is currently not protected from hot-remove operations. So, fadump could fail to re-register after a memory hot-remove operation, if memory is removed from boot memory area. To avoid this, ensure that memory from boot memory area is not hot-removed when fadump is registered. Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> Reviewed-by: Mahesh J Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
a77af552 |
|
01-Jun-2017 |
Hari Bathini <hbathini@linux.vnet.ibm.com> |
powerpc/fadump: avoid duplicates in crash memory ranges fadump sets up crash memory ranges to be used for creating PT_LOAD program headers in elfcore header. Memory chunk RMA_START through boot memory area size is added as the first memory range because firmware, at the time of crash, moves this memory chunk to different location specified during fadump registration making it necessary to create a separate program header for it with the correct offset. This memory chunk is skipped while setting up the remaining memory ranges. But currently, there is possibility that some of this memory may have duplicate entries like when it is hot-removed and added again. Ensure that no two memory ranges represent the same memory. When 5 lmbs are hot-removed and then hot-plugged before registering fadump, here is how the program headers in /proc/vmcore exported by fadump look like without this change: Program Headers: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flags Align NOTE 0x0000000000010000 0x0000000000000000 0x0000000000000000 0x0000000000001894 0x0000000000001894 0 LOAD 0x0000000000021020 0xc000000000000000 0x0000000000000000 0x0000000040000000 0x0000000040000000 RWE 0 LOAD 0x0000000040031020 0xc000000000000000 0x0000000000000000 0x0000000010000000 0x0000000010000000 RWE 0 LOAD 0x0000000050040000 0xc000000010000000 0x0000000010000000 0x0000000050000000 0x0000000050000000 RWE 0 LOAD 0x00000000a0040000 0xc000000060000000 0x0000000060000000 0x000000019ffe0000 0x000000019ffe0000 RWE 0 and with this change: Program Headers: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flags Align NOTE 0x0000000000010000 0x0000000000000000 0x0000000000000000 0x0000000000001894 0x0000000000001894 0 LOAD 0x0000000000021020 0xc000000000000000 0x0000000000000000 0x0000000040000000 0x0000000040000000 RWE 0 LOAD 0x0000000040030000 0xc000000040000000 0x0000000040000000 0x0000000020000000 0x0000000020000000 RWE 0 LOAD 0x0000000060030000 0xc000000060000000 0x0000000060000000 0x000000019ffe0000 0x000000019ffe0000 RWE 0 Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> Reviewed-by: Mahesh J Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
48a316e3 |
|
02-Jun-2017 |
Hari Bathini <hbathini@linux.vnet.ibm.com> |
powerpc/fadump: Set an upper limit for boot memory size By default, 5% of system RAM is reserved for preserving boot memory. Alternatively, a user can specify the amount of memory to reserve. See Documentation/powerpc/firmware-assisted-dump.txt for details. In addition to the memory reserved for preserving boot memory, some more memory is reserved, to save HPTE region, CPU state data and ELF core headers. Memory Reservation during first kernel looks like below: Low memory Top of memory 0 boot memory size | | | |<--Reserved dump area -->| V V | Permanent Reservation V +-----------+----------/ /----------+---+----+-----------+----+ | | |CPU|HPTE| DUMP |ELF | +-----------+----------/ /----------+---+----+-----------+----+ | ^ | | \ / ------------------------------------------- Boot memory content gets transferred to reserved area by firmware at the time of crash This implicitly means that the sum of the sizes of boot memory, CPU state data, HPTE region, DUMP preserving area and ELF core headers can't be greater than the total memory size. But currently, a user is allowed to specify any value as boot memory size. So, the above rule is violated when a boot memory size around 50% of the total available memory is specified. As the kernel is not handling this currently, it may lead to undefined behavior. Fix it by setting an upper limit for boot memory size to 25% of the total available memory. Also, instead of using memblock_end_of_DRAM(), which doesn't take the holes, if any, in the memory layout into account, use memblock_phys_mem_size() to calculate the percentage of total available memory. Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
e7467dc6 |
|
22-May-2017 |
Hari Bathini <hbathini@linux.vnet.ibm.com> |
powerpc/fadump: Update comment about offset where fadump is reserved With commit f6e6bedb7731 ("powerpc/fadump: Reserve memory at an offset closer to bottom of RAM"), memory for fadump is no longer reserved at the top of RAM. But there are still a few places which say so. Change them appropriately. Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
81d9eca5 |
|
22-May-2017 |
Hari Bathini <hbathini@linux.vnet.ibm.com> |
powerpc/fadump: Add a warning when 'fadump_reserve_mem=' is used With commit 11550dc0a00b ("powerpc/fadump: reuse crashkernel parameter for fadump memory reservation"), 'fadump_reserve_mem=' parameter is deprecated in favor of 'crashkernel=' parameter. Add a warning if 'fadump_reserve_mem=' is still used. Fixes: 11550dc0a00b ("powerpc/fadump: reuse crashkernel parameter for fadump memory reservation") Suggested-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> [mpe: Unsplit long printk strings] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
98b8cd7f |
|
27-May-2017 |
Michal Suchanek <msuchanek@suse.de> |
powerpc/fadump: Return error when fadump registration fails - log an error message when registration fails and no error code listed in the switch is returned - translate the hv error code to posix error code and return it from fw_register - return the posix error code from fw_register to the process writing to sysfs - return EEXIST on re-registration - return success on deregistration when fadump is not registered - return ENODEV when no memory is reserved for fadump Signed-off-by: Michal Suchanek <msuchanek@suse.de> Tested-by: Hari Bathini <hbathini@linux.vnet.ibm.com> [mpe: Use pr_err() to shrink the error print] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
11550dc0 |
|
08-May-2017 |
Hari Bathini <hbathini@linux.vnet.ibm.com> |
powerpc/fadump: reuse crashkernel parameter for fadump memory reservation fadump supports specifying memory to reserve for fadump's crash kernel with fadump_reserve_mem kernel parameter. This parameter currently supports passing a fixed memory size, like fadump_reserve_mem=<size> only. This patch aims to add support for other syntaxes like range-based memory size <range1>:<size1>[,<range2>:<size2>,<range3>:<size3>,...] which allows using the same parameter to boot the kernel with different system RAM sizes. As crashkernel parameter already supports the above mentioned syntaxes, this patch deprecates fadump_reserve_mem parameter and reuses crashkernel parameter instead, to specify memory for fadump's crash kernel memory reservation as well. If any offset is provided in crashkernel parameter, it will be ignored in case of fadump, as fadump reserves memory at end of RAM. Advantages using crashkernel parameter instead of fadump_reserve_mem parameter are one less kernel parameter overall, code reuse and support for multiple syntaxes to specify memory. Suggested-by: Dave Young <dyoung@redhat.com> Link: http://lkml.kernel.org/r/149035346749.6881.911095631212975718.stgit@hbathini.in.ibm.com Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Dave Young <dyoung@redhat.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
22bd0177 |
|
08-May-2017 |
Hari Bathini <hbathini@linux.vnet.ibm.com> |
powerpc/fadump: remove dependency with CONFIG_KEXEC Now that crashkernel parameter parsing and vmcoreinfo related code is moved under CONFIG_CRASH_CORE instead of CONFIG_KEXEC_CORE, remove dependency with CONFIG_KEXEC for CONFIG_FA_DUMP. While here, get rid of definitions of fadump_append_elf_note() & fadump_final_note() functions to reuse similar functions compiled under CONFIG_CRASH_CORE. Link: http://lkml.kernel.org/r/149035343956.6881.1536459326017709354.stgit@hbathini.in.ibm.com Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Dave Young <dyoung@redhat.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
3ae05fb3 |
|
09-Feb-2017 |
Michael Ellerman <mpe@ellerman.id.au> |
powerpc: Remove unnecessary includes of asm/debug.h These files don't seem to have any need for asm/debug.h, now that all it includes are the debugger hooks and breakpoint definitions. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
7644d581 |
|
09-Feb-2017 |
Michael Ellerman <mpe@ellerman.id.au> |
powerpc: Create asm/debugfs.h and move powerpc_debugfs_root there powerpc_debugfs_root is the dentry representing the root of the "powerpc" directory tree in debugfs. Currently it sits in asm/debug.h, a long with some other things that have "debug" in the name, but are otherwise unrelated. Pull it out into a separate header, which also includes linux/debugfs.h, and convert all the users to include debugfs.h instead of debug.h. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
f6e6bedb |
|
16-Mar-2017 |
Hari Bathini <hbathini@linux.vnet.ibm.com> |
powerpc/fadump: Reserve memory at an offset closer to bottom of RAM Currently, the area to preserve boot memory is reserved at the top of RAM. This leaves fadump vulnerable to memory hot-remove operations. As memory for fadump has to be reserved early in the boot process, fadump can't be registered after a memory hot-remove operation. Though this problem can't be eleminated completely, the impact can be minimized by reserving memory at an offset closer to bottom of the RAM. The offset for fadump memory reservation can be any value greater than fadump boot memory size. Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
f2a5e8f0 |
|
24-Oct-2016 |
Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> |
powerpc/fadump: Fix the race in crash_fadump(). There are chances that multiple CPUs can call crash_fadump() simultaneously and would start duplicating same info to vmcoreinfo ELF note section. This causes makedumpfile to fail during kdump capture. One example is, triggering dumprestart from HMC which sends system reset to all the CPUs at once. makedumpfile --dump-dmesg /proc/vmcore read_vmcoreinfo_basic_info: Invalid data in /tmp/vmcoreinfoyjgxlL: CRASHTIME=1475605971CRASHTIME=1475605971CRASHTIME=1475605971CRASHTIME=1475605971CRASHTIME=1475605971CRASHTIME=1475605971CRASHTIME=1475605971CRASHTIME=1475605971 makedumpfile Failed. Running makedumpfile --dump-dmesg /proc/vmcore failed (1). makedumpfile -d 31 -l /proc/vmcore read_vmcoreinfo_basic_info: Invalid data in /tmp/vmcoreinfo1mmVdO: CRASHTIME=1475605971CRASHTIME=1475605971CRASHTIME=1475605971CRASHTIME=1475605971CRASHTIME=1475605971CRASHTIME=1475605971CRASHTIME=1475605971CRASHTIME=1475605971 makedumpfile Failed. Running makedumpfile -d 31 -l /proc/vmcore failed (1). Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
1e76609c |
|
07-Oct-2016 |
Srikar Dronamraju <srikar@linux.vnet.ibm.com> |
powerpc: implement arch_reserved_kernel_pages Currently significant amount of memory is reserved only in kernel booted to capture kernel dump using the fa_dump method. Kernels compiled with CONFIG_DEFERRED_STRUCT_PAGE_INIT will initialize only certain size memory per node. The certain size takes into account the dentry and inode cache sizes. Currently the cache sizes are calculated based on the total system memory including the reserved memory. However such a kernel when booting the same kernel as fadump kernel will not be able to allocate the required amount of memory to suffice for the dentry and inode caches. This results in crashes like Hence only implement arch_reserved_kernel_pages() for CONFIG_FA_DUMP configurations. The amount reserved will be reduced while calculating the large caches and will avoid crashes like the below on large systems such as 32 TB systems. Dentry cache hash table entries: 536870912 (order: 16, 4294967296 bytes) vmalloc: allocation failure, allocated 4097114112 of 17179934720 bytes swapper/0: page allocation failure: order:0, mode:0x2080020(GFP_ATOMIC) CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.6-master+ #3 Call Trace: dump_stack+0xb0/0xf0 (unreliable) warn_alloc_failed+0x114/0x160 __vmalloc_node_range+0x304/0x340 __vmalloc+0x6c/0x90 alloc_large_system_hash+0x1b8/0x2c0 inode_init+0x94/0xe4 vfs_caches_init+0x8c/0x13c start_kernel+0x50c/0x578 start_here_common+0x20/0xa8 Link: http://lkml.kernel.org/r/1472476010-4709-4-git-send-email-srikar@linux.vnet.ibm.com Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Suggested-by: Mel Gorman <mgorman@techsingularity.net> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Michal Hocko <mhocko@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Cc: Hari Bathini <hbathini@linux.vnet.ibm.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
2685f826 |
|
29-Sep-2016 |
Michael Ellerman <mpe@ellerman.id.au> |
powerpc/fadump: Fix build break when CONFIG_PROC_VMCORE=n The fadump code calls vmcore_cleanup() which only exists if CONFIG_PROC_VMCORE=y. We don't want to depend on CONFIG_PROC_VMCORE, because it's user selectable, so just wrap the call in an #ifdef. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
d8bced27 |
|
05-Sep-2016 |
Daniel Axtens <dja@axtens.net> |
powerpc/fadump: Set core e_flags using kernel's ELF ABI version Firmware Assisted Dump is a facility to dump kernel core with assistance from firmware. As part of this process the kernel ELF ABI version is stored in the core file. Currently fadump.h defines this to 0 if it is not already defined. This clashes with a define in elf.h which sets it based on the current task - not based on the kernel's ELF ABI version. Use the compiler-provided #define _CALL_ELF which tells us the ELF ABI version of the kernel to set e_flags, this matches what binutils does. Remove the definition in fadump.h, which becomes unused. Signed-off-by: Daniel Axtens <dja@axtens.net> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
b5b1cfc5 |
|
05-Jul-2016 |
Michael Ellerman <mpe@ellerman.id.au> |
powerpc/fadump: Fix build error introduced by recent cleanup We spent so much time bike-shedding the printk() we missed that the next line was missing a semi-colon. And it seems none of our defconfigs turn on CONFIG_FA_DUMP. Fixes: 4a03749f140c ("powerpc/fadump: Trivial fix of spelling mistake, clean up message") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
4a03749f |
|
26-Jun-2016 |
Colin Ian King <colin.king@canonical.com> |
powerpc/fadump: Trivial fix of spelling mistake, clean up message Fix trivial spelling mistake "rgistration". Also use pr_err() instead of printk() and unsplit the string to keep it all on one line. Signed-off-by: Colin Ian King <colin.king@canonical.com> [mpe: Keep rc on the same line, splitting it doesn't help] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
a0512164 |
|
20-Jan-2016 |
Rasmus Villemoes <linux@rasmusvillemoes.dk> |
powerpc/fadump: rename cpu_online_mask member of struct fadump_crash_info_header The four cpumasks cpu_{possible,online,present,active}_bits are exposed readonly via the corresponding const variables cpu_xyz_mask. But they are also accessible for arbitrary writing via the exposed functions set_cpu_xyz. There's quite a bit of code throughout the kernel which iterates over or otherwise accesses these bitmaps, and having the access go via the cpu_xyz_mask variables is nowadays [1] simply a useless indirection. It may be that any problem in CS can be solved by an extra level of indirection, but that doesn't mean every extra indirection solves a problem. In this case, it even necessitates some minor ugliness (see 4/6). Patch 1/6 is new in v2, and fixes a build failure on ppc by renaming a struct member, to avoid problems when the identifier cpu_online_mask becomes a macro later in the series. The next four patches eliminate the cpu_xyz_mask variables by simply exposing the actual bitmaps, after renaming them to discourage direct access - that still happens through cpu_xyz_mask, which are now simply macros with the same type and value as they used to have. After that, there's no longer any reason to have the setter functions be out-of-line: The boolean parameter is almost always a literal true or false, so by making them static inlines they will usually compile to one or two instructions. For a defconfig build on x86_64, bloat-o-meter says we save ~3000 bytes. We also save a little stack (stackdelta says 127 functions have a 16 byte smaller stack frame, while two grow by that amount). Mostly because, when iterating over the mask, gcc typically loads the value of cpu_xyz_mask into a callee-saved register and from there into %rdi before each find_next_bit call - now it can just load the appropriate immediate address into %rdi before each call. [1] See Rusty's kind explanation http://thread.gmane.org/gmane.linux.kernel/2047078/focus=2047722 for some historic context. This patch (of 6): As preparation for eliminating the indirect access to the various global cpu_*_bits bitmaps via the pointer variables cpu_*_mask, rename the cpu_online_mask member of struct fadump_crash_info_header to simply online_mask, thus allowing cpu_online_mask to become a macro. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
408cddd9 |
|
30-Sep-2014 |
Hari Bathini <hbathini@linux.vnet.ibm.com> |
powerpc/fadump: Fix endianess issues in firmware assisted dump handling Firmware-assisted dump (fadump) kernel code is not endian safe. The below patch fixes this issue. Tested this patch with upstream kernel. Below output shows crash tool successfully opening LE fadump vmcore. # crash vmlinux vmcore GNU gdb (GDB) 7.6 This GDB was configured as "powerpc64le-unknown-linux-gnu"... KERNEL: vmlinux DUMPFILE: vmcore CPUS: 16 DATE: Wed Dec 31 19:00:00 1969 UPTIME: 00:03:28 LOAD AVERAGE: 0.46, 0.86, 0.41 TASKS: 268 NODENAME: linux-dhr2 RELEASE: 3.17.0-rc5-7-default VERSION: #6 SMP Tue Sep 30 01:06:34 EDT 2014 MACHINE: ppc64le (4116 Mhz) MEMORY: 40 GB PANIC: "Oops: Kernel access of bad area, sig: 11 [#1]" (check log for details) PID: 6223 COMMAND: "bash" TASK: c0000009661b2500 [THREAD_INFO: c000000967ac0000] CPU: 2 STATE: TASK_RUNNING (PANIC) Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> [mpe: Make the comment in pSeries_lpar_hptab_clear() clearer] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
b717d985 |
|
22-May-2014 |
Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se> |
arch: powerpc/fadump: Cleaning up inconsistent NULL checks Cleaning up inconsistent NULL checks. There is otherwise a risk of a possible null pointer dereference. Was largely found by using a static code analysis program called cppcheck. Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
#
9d0c4dfe |
|
01-Apr-2014 |
Rob Herring <robh@kernel.org> |
of/fdt: update of_get_flat_dt_prop in prep for libfdt Make of_get_flat_dt_prop arguments compatible with libfdt fdt_getprop call in preparation to convert FDT code to use libfdt. Make the return value const and the property length ptr type an int. Signed-off-by: Rob Herring <robh@kernel.org> Tested-by: Michal Simek <michal.simek@xilinx.com> Tested-by: Grant Likely <grant.likely@linaro.org> Tested-by: Stephen Chivers <schivers@csc.com>
|
#
a7d04317 |
|
24-Apr-2014 |
Gavin Shan <gwshan@linux.vnet.ibm.com> |
powerpc/prom: Stop scanning dev-tree for fdump early Function early_init_dt_scan_fw_dump() is called to scan the device tree for fdump properties under node "rtas". Any one of them is invalid, we can stop scanning the device tree early by returning "1". It would save a bit time during boot. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
#
5d585e5c |
|
29-Apr-2013 |
Jiang Liu <liuj97@gmail.com> |
mm/ppc: use common help functions to free reserved pages Use common help functions to free reserved pages. Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Anatolij Gustschin <agust@denx.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
a84fcd46 |
|
20-Aug-2012 |
Suzuki Poulose <suzuki@in.ibm.com> |
powerpc: Change memory_limit from phys_addr_t to unsigned long long There are some device-tree nodes, whose values are of type phys_addr_t. The phys_addr_t is variable sized based on the CONFIG_PHSY_T_64BIT. Change these to a fixed unsigned long long for consistency. This patch does the change only for memory_limit. The following is a list of such variables which need the change: 1) kernel_end, crashk_size - in arch/powerpc/kernel/machine_kexec.c 2) (struct resource *)crashk_res.start - We could export a local static variable from machine_kexec.c. Changing the above values might break the kexec-tools. So, I will fix kexec-tools first to handle the different sized values and then change the above. Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Suzuki K. Poulose <suzuki@in.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
#
cad3c834 |
|
30-Mar-2012 |
Stephen Rothwell <sfr@canb.auug.org.au> |
powerpc: Fix fallout from system.h split up Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
#
b500afff |
|
15-Feb-2012 |
Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> |
fadump: Invalidate registration and release reserved memory for general use. This patch introduces an sysfs interface '/sys/kernel/fadump_release_mem' to invalidate the last fadump registration, invalidate '/proc/vmcore', release the reserved memory for general use and re-register for future kernel dump. Once the dump is copied to the disk, unlike phyp dump, the userspace tool can release all the memory reserved for dump with one single operation of echo 1 to '/sys/kernel/fadump_release_mem'. Release the reserved memory region excluding the size of the memory required for future kernel dump registration. And therefore, unlike kdump, Fadump doesn't need a 2nd reboot to get back the system to the production configuration. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
#
d34c5f26 |
|
15-Feb-2012 |
Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> |
fadump: Add PT_NOTE program header for vmcoreinfo Introduce a PT_NOTE program header that points to physical address of vmcoreinfo_note buffer declared in kernel/kexec.c. The vmcoreinfo note buffer is populated during crash_fadump() at the time of system crash. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
#
ebaeb5ae |
|
15-Feb-2012 |
Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> |
fadump: Convert firmware-assisted cpu state dump data into elf notes. When registered for firmware assisted dump on powerpc, firmware preserves the registers for the active CPUs during a system crash. This patch reads the cpu register data stored in Firmware-assisted dump format (except for crashing cpu) and converts it into elf notes and updates the PT_NOTE program header accordingly. The exact register state for crashing cpu is saved to fadump crash info structure in scratch area during crash_fadump() and read during second kernel boot. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
#
2df173d9 |
|
15-Feb-2012 |
Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> |
fadump: Initialize elfcore header and add PT_LOAD program headers. Build the crash memory range list by traversing through system memory during the first kernel before we register for firmware-assisted dump. After the successful dump registration, initialize the elfcore header and populate PT_LOAD program headers with crash memory ranges. The elfcore header is saved in the scratch area within the reserved memory. The scratch area starts at the end of the memory reserved for saving RMR region contents. The scratch area contains fadump crash info structure that contains magic number for fadump validation and physical address where the eflcore header can be found. This structure will also be used to pass some important crash info data to the second kernel which will help second kernel to populate ELF core header with correct data before it gets exported through /proc/vmcore. Since the firmware preserves the entire partition memory at the time of crash the contents of the scratch area will be preserved till second kernel boot. Since the memory dump exported through /proc/vmcore is in ELF format similar to kdump, it will help us to reuse the kdump infrastructure for dump capture and filtering. Unlike phyp dump, userspace tool does not need to refer any sysfs interface while reading /proc/vmcore. NOTE: The current design implementation does not address a possibility of introducing additional fields (in future) to this structure without affecting compatibility. It's on TODO list to come up with better approach to address this. Reserved dump area start => +-------------------------------------+ | CPU state dump data | +-------------------------------------+ | HPTE region data | +-------------------------------------+ | RMR region data | Scratch area start => +-------------------------------------+ | fadump crash info structure { | | magic nummber | +------|---- elfcorehdr_addr | | | } | +----> +-------------------------------------+ | ELF core header | Reserved dump area end => +-------------------------------------+ Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
#
3ccc00a7 |
|
19-Feb-2012 |
Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> |
fadump: Register for firmware assisted dump. On 2012-02-20 11:02:51 Mon, Paul Mackerras wrote: > On Thu, Feb 16, 2012 at 04:44:30PM +0530, Mahesh J Salgaonkar wrote: > > If I have read the code correctly, we are going to get this printk on > non-pSeries machines or on older pSeries machines, even if the user > has not put the fadump=on option on the kernel command line. The > printk will be annoying since there is no actual error condition. It > seems to me that the condition for the printk should include > fw_dump.fadump_enabled. In other words you should probably add > > if (!fw_dump.fadump_enabled) > return 0; > > at the beginning of the function. Hi Paul, Thanks for pointing it out. Please find the updated patch below. The existing patches above this (4/10 through 10/10) cleanly applies on this update. Thanks, -Mahesh. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
#
eb39c880 |
|
15-Feb-2012 |
Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> |
fadump: Reserve the memory for firmware assisted dump. Reserve the memory during early boot to preserve CPU state data, HPTE region and RMA (real mode area) region data in case of kernel crash. At the time of crash, powerpc firmware will store CPU state data, HPTE region data and move RMA region data to the reserved memory area. If the firmware-assisted dump fails to reserve the memory, then fallback to existing kexec-based kdump. Most of the code implementation to reserve memory has been adapted from phyp assisted dump implementation written by Linas Vepstas and Manish Ahuja This patch also introduces a config option CONFIG_FA_DUMP for firmware assisted dump feature on Powerpc (ppc64) architecture. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|