#
c239c83e |
|
20-Feb-2024 |
Sven Schnelle <svens@linux.ibm.com> |
s390/entry: add CIF_SIE flag and remove sie64a() address check When a program check, interrupt or machine check is triggered, the PSW address is compared to a certain range of the sie64a() function to figure out whether SIE was interrupted and a cleanup of SIE is needed. This doesn't work with kprobes: If kprobes probes an instruction, it copies the instruction to the kprobes instruction page and overwrites the original instruction with an undefind instruction (Opcode 00). When this instruction is hit later, kprobes single-steps the instruction on the kprobes_instruction page. However, if this instruction is a relative branch instruction it will now point to a different location in memory due to being moved to the kprobes instruction page. If the new branch target points into sie64a() the kernel assumes it interrupted SIE when processing the breakpoint and will crash trying to access the SIE control block. Instead of comparing the address, introduce a new CIF_SIE flag which indicates whether SIE was interrupted. Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Suggested-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
#
8c09871a |
|
03-Feb-2024 |
Heiko Carstens <hca@linux.ibm.com> |
s390/fpu: limit save and restore to used registers The first invocation of kernel_fpu_begin() after switching from user to kernel context will save all vector registers, even if only parts of the vector registers are used within the kernel fpu context. Given that save and restore of all vector registers is quite expensive change the current approach in several ways: - Instead of saving and restoring all user registers limit this to those registers which are actually used within an kernel fpu context. - On context switch save all remaining user fpu registers, so they can be restored when the task is rescheduled. - Saving user registers within kernel_fpu_begin() is done without disabling and enabling interrupts - which also slightly reduces runtime. In worst case (e.g. interrupt context uses the same registers) this may lead to the situation that registers are saved several times, however the assumption is that this will not happen frequently, so that the new method is faster in nearly all cases. - save_user_fpu_regs() can still be called from all contexts and saves all (or all remaining) user registers to a tasks ufpu user fpu save area. Overall this reduces the time required to save and restore the user fpu context for nearly all cases. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
#
9cbff7f2 |
|
03-Feb-2024 |
Heiko Carstens <hca@linux.ibm.com> |
s390/fpu: remove regs member from struct fpu KVM was the only user which modified the regs pointer in struct fpu. Remove the pointer and convert the rest of the core fpu code to directly access the save area embedded within struct fpu. Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
#
4eed43de |
|
03-Feb-2024 |
Heiko Carstens <hca@linux.ibm.com> |
s390/fpu: make kernel fpu context preemptible Make the kernel fpu context preemptible. Add another fpu structure to the thread_struct, and use it to save and restore the kernel fpu context if its task uses fpu registers when it is preempted. Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
#
87c5c700 |
|
03-Feb-2024 |
Heiko Carstens <hca@linux.ibm.com> |
s390/fpu: rename save_fpu_regs() to save_user_fpu_regs(), etc Rename save_fpu_regs(), load_fpu_regs(), and struct thread_struct's fpu member to save_user_fpu_regs(), load_user_fpu_regs(), and ufpu. This way the function and variable names reflect for which context they are supposed to be used. This large and trivial conversion is a prerequisite for making the kernel fpu usage preemptible. Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
#
419abc4d |
|
03-Feb-2024 |
Heiko Carstens <hca@linux.ibm.com> |
s390/fpu: convert FPU CIF flag to regular TIF flag The FPU state, as represented by the CIF_FPU flag reflects the FPU state of a task, not the CPU it is running on. Therefore convert the flag to a regular TIF flag. This removes the magic in switch_to() where a save_fpu_regs() call for the currently (previous) running task sets the per-cpu CIF_FPU flag, which is required to restore FPU register contents of the next task, when it returns to user space. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
#
fd2527f2 |
|
03-Feb-2024 |
Heiko Carstens <hca@linux.ibm.com> |
s390/fpu: move, rename, and merge header files Move, rename, and merge the fpu and vx header files. This way fpu header files have a consistent naming scheme (fpu*.h). Also get rid of the fpu subdirectory and move header files to asm directory, so that all fpu and vx header files can be found at the same location. Merge internal.h header file into other header files, since the internal helpers are used at many locations. so those helper functions are really not internal. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
#
d7f679ec |
|
01-Dec-2023 |
Heiko Carstens <hca@linux.ibm.com> |
s390/fpu: remove ARCH_WANTS_DYNAMIC_TASK_STRUCT support s390 selects ARCH_WANTS_DYNAMIC_TASK_STRUCT in order to make the size of the task structure dependent on the availability of the vector facility. This doesn't make sense anymore because since many years all machines provide the vector facility. Therefore simplify the code a bit and remove s390 support for ARCH_WANTS_DYNAMIC_TASK_STRUCT. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
|
#
1c8b8cf2 |
|
01-Dec-2023 |
Heiko Carstens <hca@linux.ibm.com> |
s390/nmi: implement and use local_mcck_save() / local_mcck_restore() Instead of using local_mcck_disable() / local_mcck_enable() implement and use local_mcck_save() / local_mcck_restore() to disable machine checks, and restoring the previous state. The problem with using local_mcck_disable() / local_mcck_enable() is that there is an assumption that machine checks are always enabled. While this is currently the case the code still looks quite odd, readers need to double check if the code is correct. In order to increase readability save and then restore the old machine check mask bit, instead of assuming that it must have been enabled. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
|
#
0a9ace11 |
|
15-Nov-2023 |
Heiko Carstens <hca@linux.ibm.com> |
s390: remove odd comment In the meantime hopefully most people got used to forward declarations, therefore remove the explanation. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
|
#
2a405f6b |
|
05-Apr-2023 |
Heiko Carstens <hca@linux.ibm.com> |
s390/stackleak: provide fast __stackleak_poison() implementation Provide an s390 specific __stackleak_poison() implementation which is faster than the generic variant. For the original implementation with an enforced 4kb stackframe for the getpid() system call the system call overhead increases by a factor of 3 if the stackleak feature is enabled. Using the s390 mvc based variant this is reduced to an increase of 25% instead. This is within the expected area, since the mvc based implementation is more or less a memset64() variant which comes with similar results. See commit 0b77d6701cf8 ("s390: implement memset16, memset32 & memset64"). Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Link: https://lore.kernel.org/r/20230405130841.1350565-3-hca@linux.ibm.com Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
#
22ca1e77 |
|
27-Mar-2023 |
Heiko Carstens <hca@linux.ibm.com> |
s390: move on_thread_stack() to processor.h As preparation for the stackleak feature move on_thread_stack() to processor.h like x86. Also make it __always_inline, and slightly optimize it by reading current task's kernel stack pointer from lowcore. Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
#
69a407bf |
|
28-Feb-2023 |
Heiko Carstens <hca@linux.ibm.com> |
s390/bp: remove __bpon() There is no point in changing branch prediction state of a cpu shortly before it enters stop state. Therefore remove __bpon(). Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
#
9b63fd2f |
|
28-Feb-2023 |
Heiko Carstens <hca@linux.ibm.com> |
s390/bp: remove s390_isolate_bp_guest() s390_isolate_bp_guest() is unused. Remove it. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
#
f33f2d4c |
|
28-Feb-2023 |
Heiko Carstens <hca@linux.ibm.com> |
s390/bp: remove TIF_ISOLATE_BP TIF_ISOLATE_BP is unused since it was introduced with commit 6b73044b2b00 ("s390: run user space and KVM guests with modified branch prediction"). Given that there is no use case remove it again. Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
#
f96f41aa |
|
12-Feb-2023 |
Heiko Carstens <hca@linux.ibm.com> |
s390/processor: add test_and_set_cpu_flag() and test_and_clear_cpu_flag() Add test_and_set_cpu_flag() and test_and_clear_cpu_flag() helper functions. Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
#
b977f03e |
|
12-Feb-2023 |
Heiko Carstens <hca@linux.ibm.com> |
s390/processor: let cpu helper functions return boolean values Let cpu helper functions return boolean values. This also allows to make the code a bit simpler by getting rid of the "!!" construct. Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
#
87f79d88 |
|
06-Feb-2023 |
Heiko Carstens <hca@linux.ibm.com> |
s390/processor: always inline cpu flag helper functions arch_cpu_idle() is marked noinstr and therefore must only call functions which are also not instrumented. Make sure that cpu flag helper functions are always inlined to avoid that the compiler generates an out-of-line function for e.g. the call within arch_cpu_idle(). Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
#
e3c11025 |
|
03-Nov-2022 |
Vasily Gorbik <gor@linux.ibm.com> |
s390: avoid using global register for current_stack_pointer Commit 30de14b1884b ("s390: current_stack_pointer shouldn't be a function") made current_stack_pointer a global register variable like on many other architectures. Unfortunately on s390 it uncovers old gcc bug which is fixed only since gcc-9.1 [gcc commit 3ad7fed1cc87 ("S/390: Fix PR89775. Stackpointer save/restore instructions removed")] and backported to gcc-8.4 and later. Due to this bug gcc versions prior to 8.4 generate broken code which leads to stack corruptions. Current minimal gcc version required to build the kernel is declared as 5.1. It is not possible to fix all old gcc versions, so work around this problem by avoiding using global register variable for current_stack_pointer. Fixes: 30de14b1884b ("s390: current_stack_pointer shouldn't be a function") Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
|
#
2be9880d |
|
18-Aug-2022 |
Kefeng Wang <wangkefeng.wang@huawei.com> |
kernel: exit: cleanup release_thread() Only x86 has own release_thread(), introduce a new weak release_thread() function to clean empty definitions in other ARCHs. Link: https://lkml.kernel.org/r/20220819014406.32266-1-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Acked-by: Guo Ren <guoren@kernel.org> [csky] Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Brian Cain <bcain@quicinc.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc] Acked-by: Stafford Horne <shorne@gmail.com> [openrisc] Acked-by: Catalin Marinas <catalin.marinas@arm.com> [arm64] Acked-by: Huacai Chen <chenhuacai@kernel.org> [LoongArch] Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Chris Zankel <chris@zankel.net> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Guo Ren <guoren@kernel.org> [csky] Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Jonas Bonn <jonas@southpole.se> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Henderson <richard.henderson@linaro.org> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Xuerui Wang <kernel@xen0n.name> Cc: Yoshinori Sato <ysato@users.osdn.me> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
2f0e8aae |
|
24-Jul-2022 |
Alexander Gordeev <agordeev@linux.ibm.com> |
s390/mm: rework memcpy_real() to avoid DAT-off mode Function memcpy_real() is an univeral data mover that does not require DAT mode to be able reading from a physical address. Its advantage is an ability to read from any address, even those for which no kernel virtual mapping exists. Although memcpy_real() is interrupt-safe, there are no handlers that make use of this function. The compiler instrumentation have to be disabled and separate no-DAT stack used to allow execution of the function once DAT mode is disabled. Rework memcpy_real() to overcome these shortcomings. As result, data copying (which is primarily reading out a crashed system memory by a user process) is executed on a regular stack with enabled interrupts. Also, use of memcpy_real_buf swap buffer becomes unnecessary and the swapping is eliminated. The above is achieved by using a fixed virtual address range that spans a single page and remaps that page repeatedly when memcpy_real() is called for a particular physical address. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
#
4df29d2b |
|
20-Jul-2022 |
Alexander Gordeev <agordeev@linux.ibm.com> |
s390/smp: rework absolute lowcore access Temporary unsetting of the prefix page in memcpy_absolute() routine poses a risk of executing code path with unexpectedly disabled prefix page. This rework avoids the prefix page uninstalling and disabling of normal and machine check interrupts when accessing the absolute zero memory. Although memcpy_absolute() routine can access the whole memory, it is only used to update the absolute zero lowcore. This rework therefore introduces a new mechanism for the absolute zero lowcore access and scraps memcpy_absolute() routine for good. Instead, an area is reserved in the virtual memory that is used for the absolute lowcore access only. That area holds an array of 8KB virtual mappings - one per CPU. Whenever a CPU is brought online, the corresponding item is mapped to the real address of the previously installed prefix page. The absolute zero lowcore access works like this: a CPU calls the new primitive get_abs_lowcore() to obtain its 8KB mapping as a pointer to the struct lowcore. Virtual address references to that pointer get translated to the real addresses of the prefix page, which in turn gets swapped with the absolute zero memory addresses due to prefixing. Once the pointer is not needed it must be released with put_abs_lowcore() primitive: struct lowcore *abs_lc; unsigned long flags; abs_lc = get_abs_lowcore(&flags); abs_lc->... = ...; put_abs_lowcore(abs_lc, flags); To ensure the described mechanism works large segment- and region- table entries must be avoided for the 8KB mappings. Failure to do so results in usage of Region-Frame Absolute Address (RFAA) or Segment-Frame Absolute Address (SFAA) large page fields. In that case absolute addresses would be used to address the prefix page instead of the real ones and the prefixing would get bypassed. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
#
5e441f61 |
|
06-Aug-2022 |
Alexander Gordeev <agordeev@linux.ibm.com> |
Revert "s390/smp: rework absolute lowcore access" This reverts commit 7d06fed77b7d8fc9f6cc41b4e3f2823d32532ad8. This introduced vmem_mutex locking from vmem_map_4k_page() function called from smp_reinit_ipl_cpu() with interrupts disabled. While it is a pre-SMP early initcall no other CPUs running in parallel nor other code taking vmem_mutex on this boot stage - it still needs to be fixed. Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
|
#
7d06fed7 |
|
20-Jul-2022 |
Alexander Gordeev <agordeev@linux.ibm.com> |
s390/smp: rework absolute lowcore access Temporary unsetting of the prefix page in memcpy_absolute() routine poses a risk of executing code path with unexpectedly disabled prefix page. This rework avoids the prefix page uninstalling and disabling of normal and machine check interrupts when accessing the absolute zero memory. Although memcpy_absolute() routine can access the whole memory, it is only used to update the absolute zero lowcore. This rework therefore introduces a new mechanism for the absolute zero lowcore access and scraps memcpy_absolute() routine for good. Instead, an area is reserved in the virtual memory that is used for the absolute lowcore access only. That area holds an array of 8KB virtual mappings - one per CPU. Whenever a CPU is brought online, the corresponding item is mapped to the real address of the previously installed prefix page. The absolute zero lowcore access works like this: a CPU calls the new primitive get_abs_lowcore() to obtain its 8KB mapping as a pointer to the struct lowcore. Virtual address references to that pointer get translated to the real addresses of the prefix page, which in turn gets swapped with the absolute zero memory addresses due to prefixing. Once the pointer is not needed it must be released with put_abs_lowcore() primitive: struct lowcore *abs_lc; unsigned long flags; abs_lc = get_abs_lowcore(&flags); abs_lc->... = ...; put_abs_lowcore(abs_lc, flags); To ensure the described mechanism works large segment- and region- table entries must be avoided for the 8KB mappings. Failure to do so results in usage of Region-Frame Absolute Address (RFAA) or Segment-Frame Absolute Address (SFAA) large page fields. In that case absolute addresses would be used to address the prefix page instead of the real ones and the prefixing would get bypassed. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
|
#
85806016 |
|
20-May-2022 |
Heiko Carstens <hca@linux.ibm.com> |
s390: simplify early program check handler Due to historic reasons the base program check handler calls a configurable function. Given that there is only the early program check handler left, simplify the code by directly calling that function. The only other user was removed with commit d485235b0054 ("s390: assume diag308 set always works"). Also rename all functions and the asm file to reflect this. Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
#
9e37a2e8 |
|
06-Apr-2022 |
Sven Schnelle <svens@linux.ibm.com> |
s390/vdso: map vdso above stack In the current code vdso is mapped below the stack. This is problematic when programs mapped to the top of the address space are allocating a lot of memory, because the heap will clash with the vdso. To avoid this map the vdso above the stack and move STACK_TOP so that it all fits into three level paging. Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
#
57761da4 |
|
06-Apr-2022 |
Sven Schnelle <svens@linux.ibm.com> |
s390/vdso: move vdso mapping to its own function This is a preparation patch for adding vdso randomization to s390. It adds a function vdso_size(), which will be used later in calculating the STACK_TOP value. It also moves the vdso mapping into a new function vdso_map(), to keep the code similar to other architectures. Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
#
30de14b1 |
|
08-Apr-2022 |
Sven Schnelle <svens@linux.ibm.com> |
s390: current_stack_pointer shouldn't be a function s390 defines current_stack_pointer as function while all other architectures use 'register unsigned long asm("<stackptr reg>"). This make codes like the following from check_stack_object() fail: if (IS_ENABLED(CONFIG_STACK_GROWSUP)) { if ((void *)current_stack_pointer < obj + len) return BAD_STACK; } else { if (obj < (void *)current_stack_pointer) return BAD_STACK; } because this would compare the address of current_stack_pointer() and not the stackpointer value. Reported-by: Karsten Graul <kgraul@linux.ibm.com> Fixes: 2792d84e6da5 ("usercopy: Check valid lifetime via stack depth") Cc: Kees Cook <keescook@chromium.org> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
#
ed0192bc |
|
17-Mar-2022 |
Alexander Gordeev <agordeev@linux.ibm.com> |
s390/maccess: rework absolute lowcore accessors Macro mem_assign_absolute() is able to access the whole memory, but is only used and makes sense when updating the absolute lowcore. Instead, introduce get_abs_lowcore() and put_abs_lowcore() macros that limit access to absolute lowcore addresses only. Suggested-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
#
731efc96 |
|
25-Feb-2022 |
Vasily Gorbik <gor@linux.ibm.com> |
s390: convert ".insn" encoding to instruction names With z10 as minimum supported machine generation many ".insn" encodings could be now converted to instruction names. There are couple of exceptions - stfle is used from the als code built for z900 and cannot be converted - few ".insn" directives encode unsupported instruction formats The generated code is identical before/after this change. Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
#
cfa45c5e |
|
28-Feb-2022 |
Heiko Carstens <hca@linux.ibm.com> |
s390/base: pass pt_regs to early program check handler Pass pt_regs to early program check handler like it is done for every other interrupt and exception handler. Also the passed pt_regs can be changed by the called function and the changes register contents and psw contents will be taken into account when returning. In addition the return psw will not be copied to the program check old psw in lowcore, but to the usual return psw location, like it is also done by the regular program check handler. This allows also to get rid of the code that disabled lowcore protection when changing the return address. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
#
303fd988 |
|
29-Jan-2022 |
Alexander Gordeev <agordeev@linux.ibm.com> |
s390/maccess: fix semantics of memcpy_real() and its callers There is a confusion with regard to the source address of memcpy_real() and calling functions. While the declared type for a source assumes a virtual address, in fact it always called with physical address of the source. This confusion led to bugs in copy_oldmem_kernel() and copy_oldmem_user() functions, where __pa() macro applied mistakenly to physical addresses. It does not lead to a real issue, since virtual and physical addresses are currently the same. Fix both the bugs and memcpy_real() prototype by making type of source address consistent to the function name and the way it actually used. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
#
c4538d0f |
|
27-Dec-2021 |
Guo Ren <guoren@kernel.org> |
s390: remove unused TASK_SIZE_OF This macro isn't used in Linux sched, now. Delete in include/linux/sched.h and arch's include/asm. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20211228064730.2882351-6-guoren@kernel.org Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
#
42a20f86 |
|
29-Sep-2021 |
Kees Cook <keescook@chromium.org> |
sched: Add wrapper for get_wchan() to keep task blocked Having a stable wchan means the process must be blocked and for it to stay that way while performing stack unwinding. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> [arm] Tested-by: Mark Rutland <mark.rutland@arm.com> [arm64] Link: https://lkml.kernel.org/r/20211008111626.332092234@infradead.org
|
#
915fea04 |
|
24-Aug-2021 |
Alexander Gordeev <agordeev@linux.ibm.com> |
s390/smp: enable DAT before CPU restart callback is called The restart interrupt is triggered whenever a secondary CPU is brought online, a remote function call dispatched from another CPU or a manual PSW restart is initiated and causes the system to kdump. The handling routine is always called with DAT turned off. It then initializes the stack frame and invokes a callback. The existing callbacks handle DAT as follows: * __do_restart() and __machine_kexec() turn in on upon entry; * __ipl_run(), __reipl_run() and __dump_run() do not turn it right away, but all of them call diag308() - which turns DAT on, but only if kasan is enabled; In addition to the described complexity all callbacks (and the functions they call) should avoid kasan instrumentation while DAT is off. This update enables DAT in the assembler restart handler and relieves any callbacks (which are mostly C functions) from dealing with DAT altogether. There are four types of CPU restart that initialize control registers in different ways: 1. Start of secondary CPU on boot - control registers are inherited from the IPL CPU; 2. Restart of online CPU - control registers of the CPU being restarted are kept; 3. Hotplug of offline CPU - control registers are inherited from the starting CPU; 4. Start of offline CPU triggered by manual PSW restart - the control registers are read from the absolute lowcore and contain the boot time IPL CPU values updated with all follow-up calls of smp_ctl_set_bit() and smp_ctl_clear_bit() routines; In first three cases contents of the control registers is the most recent. In the latter case control registers are good enough to facilitate successful completion of kdump operation. Suggested-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
#
9c9a915a |
|
10-Jun-2021 |
Heiko Carstens <hca@linux.ibm.com> |
s390/processor: always inline stap() and __load_psw_mask() s390 is the only architecture which makes use of the __no_kasan_or_inline attribute for two functions. Given that both stap() and __load_psw_mask() are very small functions they can and should be always inlined anyway. Therefore get rid of __no_kasan_or_inline and always inline these functions. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
#
755112b3 |
|
05-May-2021 |
Sven Schnelle <svens@linux.ibm.com> |
s390/traps: add struct to access transactional diagnostic block gcc-11 warns: arch/s390/kernel/traps.c: In function __do_pgm_check: arch/s390/kernel/traps.c:319:17: warning: memcpy reading 256 bytes from a region of size 0 [-Wstringop-overread] 319 | memcpy(¤t->thread.trap_tdb, &S390_lowcore.pgm_tdb, 256); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Fix this by adding a struct pgm_tdb to struct lowcore and copy that. Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
#
3a790cc1 |
|
18-Jan-2021 |
Sven Schnelle <svens@linux.ibm.com> |
s390: pass struct pt_regs instead of registers to syscalls Instead of fetching all registers from struct pt_regs and passing them to the syscall wrappers, let the system call wrappers only fetch the values really required. Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
#
56e62a73 |
|
21-Nov-2020 |
Sven Schnelle <svens@linux.ibm.com> |
s390: convert to generic entry This patch converts s390 to use the generic entry infrastructure from kernel/entry/*. There are a few special things on s390: - PIF_PER_TRAP is moved to TIF_PER_TRAP as the generic code doesn't know about our PIF flags in exit_to_user_mode_loop(). - The old code had several ways to restart syscalls: a) PIF_SYSCALL_RESTART, which was only set during execve to force a restart after upgrading a process (usually qemu-kvm) to pgste page table extensions. b) PIF_SYSCALL, which is set by do_signal() to indicate that the current syscall should be restarted. This is changed so that do_signal() now also uses PIF_SYSCALL_RESTART. Continuing to use PIF_SYSCALL doesn't work with the generic code, and changing it to PIF_SYSCALL_RESTART makes PIF_SYSCALL and PIF_SYSCALL_RESTART more unique. - On s390 calling sys_sigreturn or sys_rt_sigreturn is implemented by executing a svc instruction on the process stack which causes a fault. While handling that fault the fault code sets PIF_SYSCALL to hand over processing to the syscall code on exit to usermode. The patch introduces PIF_SYSCALL_RET_SET, which is set if ptrace sets a return value for a syscall. The s390x ptrace ABI uses r2 both for the syscall number and return value, so ptrace cannot set the syscall number + return value at the same time. The flag makes handling that a bit easier. do_syscall() will just skip executing the syscall if PIF_SYSCALL_RET_SET is set. CONFIG_DEBUG_ASCE was removd in favour of the generic CONFIG_DEBUG_ENTRY. CR1/7/13 will be checked both on kernel entry and exit to contain the correct asces. Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
#
44292c86 |
|
14-Dec-2020 |
Heiko Carstens <hca@linux.ibm.com> |
s390/idle: merge enabled_wait() and arch_cpu_idle() The only caller of enabled_wait() besides arch_cpu_idle() was udelay(). Since that call doesn't exist anymore, merge enabled_wait() and arch_cpu_idle(). Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
#
dd6cfe55 |
|
03-Dec-2020 |
Heiko Carstens <hca@linux.ibm.com> |
s390/delay: simplify udelay udelay is implemented by using quite subtle details to make it possible to load an idle psw and waiting for an interrupt even in irq context or when interrupts are disabled. Also handling (or better: no handling) of softirqs is taken into account. All this is done to optimize for something which should in normal circumstances never happen: calling udelay to busy wait. Therefore get rid of the whole complexity and just busy loop like other architectures are doing it also. It could have been possible to use diag 0x44 instead of cpu_relax() in the busy loop, however we have seen too many bad things happen with diag 0x44 that it seems to be better to simply busy loop. Also note that with this new implementation kernel preemption does work when within the udelay loop. This did not work before. To get a feeling what the former code optimizes for: IPL'ing a kernel with 'defconfig' and afterwards compiling a kernel ends with a total of zero udelay calls. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
#
87d59863 |
|
16-Nov-2020 |
Heiko Carstens <hca@linux.ibm.com> |
s390/mm: remove set_fs / rework address space handling Remove set_fs support from s390. With doing this rework address space handling and simplify it. As a result address spaces are now setup like this: CPU running in | %cr1 ASCE | %cr7 ASCE | %cr13 ASCE ----------------------------|-----------|-----------|----------- user space | user | user | kernel kernel, normal execution | kernel | user | kernel kernel, kvm guest execution | gmap | user | kernel To achieve this the getcpu vdso syscall is removed in order to avoid secondary address mode and a separate vdso address space in for user space. The getcpu vdso syscall will be implemented differently with a subsequent patch. The kernel accesses user space always via secondary address space. This happens in different ways: - with mvcos in home space mode and directly read/write to secondary address space - with mvcs/mvcp in primary space mode and copy from primary space to secondary space or vice versa - with e.g. cs in secondary space mode and access secondary space Switching translation modes happens with sacf before and after instructions which access user space, like before. Lazy handling of control register reloading is removed in the hope to make everything simpler, but at the cost of making kernel entry and exit a bit slower. That is: on kernel entry the primary asce is always changed to contain the kernel asce, and on kernel exit the primary asce is changed again so it contains the user asce. In kernel mode there is only one exception to the primary asce: when kvm guests are executed the primary asce contains the gmap asce (which describes the guest address space). The primary asce is reset to kernel asce whenever kvm guest execution is interrupted, so that this doesn't has to be taken into account for any user space accesses. Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
#
f38b0a74 |
|
24-Sep-2020 |
Vasily Gorbik <gor@linux.ibm.com> |
s390: remove unused s390_base_ext_handler s390_base_ext_handler_fn haven't been used since its introduction in commit ab14de6c37fa ("[S390] Convert memory detection into C code."). s390_base_ext_handler itself is currently falsely storing 16 registers at __LC_SAVE_AREA_ASYNC rewriting several following lowcore values: cpu_flags, return_psw, return_mcck_psw, sync_enter_timer and async_enter_timer. Besides that s390_base_ext_handler itself is only potentially hiding EXT interrupts which should not have happen in the first place. Any piece of code which requires EXT interrupts before fully functional ext_int_handler is enabled has to do it on its own, like this is done by sclp_early_cmd() which is doing EXT interrupts handling synchronously in sclp_early_wait_irq(). With s390_base_ext_handler removed unexpected EXT interrupt leads to disabled wait with the address 0x1b0 (__LC_EXT_NEW_PSW), which is currently setup in the decompressor. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
#
0b0ed657 |
|
19-Feb-2020 |
Sven Schnelle <svens@linux.ibm.com> |
s390: remove critical section cleanup from entry.S The current code is rather complex and caused a lot of subtle and hard to debug bugs in the past. Simplify the code by calling the system_call handler with interrupts disabled, save machine state, and re-enable them later. This requires significant changes to the machine check handling code as well. When the machine check interrupt arrived while being in kernel mode the new code will signal pending machine checks with a SIGP external call. When userspace was interrupted, the handler will switch to the kernel stack and directly execute s390_handle_mcck(). Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
#
f7555608 |
|
19-Mar-2020 |
Alexander Gordeev <agordeev@linux.ibm.com> |
s390/mm: cleanup virtual memory constants usage Remove duplicate definitions and consolidate usage of virutal and address translation constants. Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
#
6a3eb35e |
|
28-Feb-2020 |
Alexander Gordeev <agordeev@linux.ibm.com> |
s390/mm: remove page table downgrade support This update consolidates page table handling code. Because there are hardly any 31-bit binaries left we do not need to optimize for that. No extra efforts are needed to ensure that a compat task does not map anything above 2GB. The TASK_SIZE limit for 31-bit tasks is 2GB already and the generic code does check that a resulting map address would not surpass that limit. Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
#
0b38b5e1 |
|
22-Jan-2020 |
Sven Schnelle <svens@linux.ibm.com> |
s390: prevent leaking kernel address in BEAR When userspace executes a syscall or gets interrupted, BEAR contains a kernel address when returning to userspace. This make it pretty easy to figure out where the kernel is mapped even with KASLR enabled. To fix this, add lpswe to lowcore and always execute it there, so userspace sees only the lowcore address of lpswe. For this we have to extend both critical_cleanup and the SWITCH_ASYNC macro to also check for lpswe addresses in lowcore. Fixes: b2d24b97b2a9 ("s390/kernel: add support for kernel address space layout randomization (KASLR)") Cc: <stable@vger.kernel.org> # v5.2+ Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
#
140588bf |
|
14-Feb-2020 |
Stephen Kitt <steve@sk2.org> |
s390: remove obsolete ieee_emulation_warnings s390 math emulation was removed with commit 5a79859ae0f3 ("s390: remove 31 bit support"), rendering ieee_emulation_warnings useless. The code still built because it was protected by CONFIG_MATHEMU, which was no longer selectable. This patch removes the sysctl_ieee_emulation_warnings declaration and the sysctl entry declaration. Link: https://lkml.kernel.org/r/20200214172628.3598516-1-steve@sk2.org Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Stephen Kitt <steve@sk2.org> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
#
c2e06e15 |
|
21-Nov-2019 |
Vasily Gorbik <gor@linux.ibm.com> |
s390: always inline disabled_wait disabled_wait uses _THIS_IP_ and assumes that compiler would inline it. Make sure this assumption is always correct by utilizing __always_inline. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
#
265f79dc |
|
30-Oct-2019 |
Heiko Carstens <hca@linux.ibm.com> |
s390: always inline current_stack_pointer() This function must be inlined since any caller expects the current stack pointer; which wouldn't be true if the function isn't inlined. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
#
51222278 |
|
02-Sep-2019 |
Vasily Gorbik <gor@linux.ibm.com> |
s390/base: remove unused s390_base_mcck_handler s390_base_mcck_handler was used during system reset if diag308 set was not available. But after commit d485235b0054 ("s390: assume diag308 set always works") is a dead code and could be removed. Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
#
fe6ba88b |
|
16-Jul-2019 |
Masahiro Yamada <yamada.masahiro@socionext.com> |
arch: replace _BITUL() in kernel-space headers with BIT() Now that BIT() can be used from assembly code, we can safely replace _BITUL() with equivalent BIT(). UAPI headers are still required to use _BITUL(), but there is no more reason to use it in kernel headers. BIT() is shorter. Link: http://lkml.kernel.org/r/20190609153941.17249-2-yamada.masahiro@socionext.com Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
4ecf0a43 |
|
07-Jun-2019 |
Heiko Carstens <hca@linux.ibm.com> |
processor: get rid of cpu_relax_yield stop_machine is the only user left of cpu_relax_yield. Given that it now has special semantics which are tied to stop_machine introduce a weak stop_machine_yield function which architectures can override, and get rid of the generic cpu_relax_yield implementation. Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
#
38f2c691 |
|
16-May-2019 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
s390: improve wait logic of stop_machine The stop_machine loop to advance the state machine and to wait for all affected CPUs to check-in calls cpu_relax_yield in a tight loop until the last missing CPUs acknowledged the state transition. On a virtual system where not all logical CPUs are backed by real CPUs all the time it can take a while for all CPUs to check-in. With the current definition of cpu_relax_yield a diagnose 0x44 is done which tells the hypervisor to schedule *some* other CPU. That can be any CPU and not necessarily one of the CPUs that need to run in order to advance the state machine. This can lead to a pretty bad diagnose 0x44 storm until the last missing CPU finally checked-in. Replace the undirected cpu_relax_yield based on diagnose 0x44 with a directed yield. Each CPU in the wait loop will pick up the next CPU in the cpumask of stop_machine. The diagnose 0x9c is used to tell the hypervisor to run this next CPU instead of the current one. If there is only a limited number of real CPUs backing the virtual CPUs we end up with the real CPUs passed around in a round-robin fashion. [heiko.carstens@de.ibm.com]: Use cpumask_next_wrap as suggested by Peter Zijlstra. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
#
98587c2d |
|
29-Apr-2019 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
s390: simplify disabled_wait The disabled_wait() function uses its argument as the PSW address when it stops the CPU with a wait PSW that is disabled for interrupts. The different callers sometimes use a specific number like 0xdeadbeef to indicate a specific failure, the early boot code uses 0 and some other calls sites use __builtin_return_address(0). At the time a dump is created the current PSW and the registers of a CPU are written to lowcore to make them avaiable to the dump analysis tool. For a CPU stopped with disabled_wait the PSW and the registers do not really make sense together, the PSW address does not point to the function the registers belong to. Simplify disabled_wait() by using _THIS_IP_ for the PSW address and drop the argument to the function. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
78c98f90 |
|
28-Jan-2019 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
s390/unwind: introduce stack unwind API Rework the dump_trace() stack unwinder interface to support different unwinding algorithms. The new interface looks like this: struct unwind_state state; unwind_for_each_frame(&state, task, regs, start_stack) do_something(state.sp, state.ip, state.reliable); The unwind_bc.c file contains the implementation for the classic back-chain unwinder. One positive side effect of the new code is it now handles ftraced functions gracefully. It prints the real name of the return function instead of 'return_to_handler'. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
0a113efc |
|
08-Apr-2019 |
Arnd Bergmann <arnd@arndb.de> |
s390: make __load_psw_mask work with clang clang fails to use the %O and %R inline assembly modifiers the same way as gcc, leading to build failures with every use of __load_psw_mask(): /tmp/nmi-4a9f80.s: Assembler messages: /tmp/nmi-4a9f80.s:571: Error: junk at end of line: `+8(160(%r11))' /tmp/nmi-4a9f80.s:626: Error: junk at end of line: `+8(160(%r11))' Replace these with a more conventional way of passing the addresses that should work with both clang and gcc. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
7aa0055e |
|
10-Apr-2019 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
s390: fine-tune stack switch helper The CALL_ON_STACK helper currently does not work with clang and for calls without arguments. It does not initialize r2 although the constraint is "+&d". Rework the CALL_FMT_x and the CALL_ON_STACK macros to work with clang and produce optimal code in all cases. Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
163c8d54 |
|
04-Nov-2018 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
compiler: remove __no_sanitize_address_or_inline again The __no_sanitize_address_or_inline and __no_kasan_or_inline defines are almost identical. The only difference is that __no_kasan_or_inline does not have the 'notrace' attribute. To be able to replace __no_sanitize_address_or_inline with the older definition, add 'notrace' to __no_kasan_or_inline and change to two users of __no_sanitize_address_or_inline in the s390 code. The 'notrace' option is necessary for e.g. the __load_psw_mask function in arch/s390/include/asm/processor.h. Without the option it is possible to trace __load_psw_mask which leads to kernel stack overflow. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Pointed-out-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
de0d22e5 |
|
30-Oct-2018 |
Nick Desaulniers <ndesaulniers@google.com> |
treewide: remove current_text_addr Prefer _THIS_IP_ defined in linux/kernel.h. Most definitions of current_text_addr were the same as _THIS_IP_, but a few archs had inline assembly instead. This patch removes the final call site of current_text_addr, making all of the definitions dead code. [akpm@linux-foundation.org: fix arch/csky/include/asm/processor.h] Link: http://lkml.kernel.org/r/20180911182413.180715-1-ndesaulniers@google.com Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
ac1256f8 |
|
19-Nov-2017 |
Vasily Gorbik <gor@linux.ibm.com> |
s390/kasan: reipl and kexec support Some functions from both arch/s390/kernel/ipl.c and arch/s390/kernel/machine_kexec.c are called without DAT enabled (or with and without DAT enabled code paths). There is no easy way to partially disable kasan for those files without a substantial rework. Disable kasan for both files for now. To avoid disabling kasan for arch/s390/kernel/diag.c DAT flag is enabled in diag308 call. pcpu_delegate which disables DAT is marked with __no_sanitize_address to disable instrumentation for that one function. Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
9e8df6da |
|
19-Nov-2017 |
Vasily Gorbik <gor@linux.ibm.com> |
s390/smp: kasan stack instrumentation support smp_start_secondary function is called without DAT enabled. To avoid disabling kasan instrumentation for entire arch/s390/kernel/smp.c smp_start_secondary has been split in 2 parts. smp_start_secondary has instrumentation disabled, it does minimal setup and enables DAT. Then instrumentated __smp_start_secondary is called to do the rest. __load_psw_mask function instrumentation has been disabled as well to be able to call it from smp_start_secondary. Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
32ce55a6 |
|
18-Sep-2018 |
Vasily Gorbik <gor@linux.ibm.com> |
s390: unify stack size definitions Remove STACK_ORDER and STACK_SIZE in favour of identical THREAD_SIZE_ORDER and THREAD_SIZE definitions. THREAD_SIZE and THREAD_SIZE_ORDER naming is misleading since it is used as general kernel stack size information. But both those definitions are used in the common code and throughout architectures specific code, so changing the naming is problematic. Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
ce3dc447 |
|
12-Sep-2017 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
s390: add support for virtually mapped kernel stacks With virtually mapped kernel stacks the kernel stack overflow detection is now fault based, every stack has a guard page in the vmalloc space. The panic_stack is renamed to nodat_stack and is used for all function that need to run without DAT, e.g. memcpy_real or do_start_kdump. The main effect is a reduction in the kernel image size as with vmap stacks the old style overflow checking that adds two instructions per function is not needed anymore. Result from bloat-o-meter: add/remove: 20/1 grow/shrink: 13/26854 up/down: 2198/-216240 (-214042) In regard to performance the micro-benchmark for fork has a hit of a few microseconds, allocating 4 pages in vmalloc space is more expensive compare to an order-2 page allocation. But with real workload I could not find a noticeable difference. Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
ff340d24 |
|
07-Sep-2017 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
s390: add stack switch helper Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
6b73044b |
|
15-Jan-2018 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
s390: run user space and KVM guests with modified branch prediction Define TIF_ISOLATE_BP and TIF_ISOLATE_BP_GUEST and add the necessary plumbing in entry.S to be able to run user space and KVM guests with limited branch prediction. To switch a user space process to limited branch prediction the s390_isolate_bp() function has to be call, and to run a vCPU of a KVM guest associated with the current task with limited branch prediction call s390_isolate_bp_guest(). Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
d768bd89 |
|
15-Jan-2018 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
s390: add options to change branch prediction behaviour for the kernel Add the PPA instruction to the system entry and exit path to switch the kernel to a different branch prediction behaviour. The instructions are added via CPU alternatives and can be disabled with the "nospec" or the "nobp=0" kernel parameter. If the default behaviour selected with CONFIG_KERNEL_NOBP is set to "n" then the "nobp=1" parameter can be used to enable the changed kernel branch prediction. Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
11776eaa |
|
13-Nov-2017 |
Vasily Gorbik <gor@linux.vnet.ibm.com> |
s390: correct some inline assembly constraints Inline assembly code changed in this patch should really use "Q" constraint "Memory reference without index register and with short displacement". The kernel does not compile with kasan support enabled otherwise (due to stack instrumentation). Signed-off-by: Vasily Gorbik <gor@linux.vnet.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
#
0aaba41b |
|
21-Aug-2017 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
s390: remove all code using the access register mode The vdso code for the getcpu() and the clock_gettime() call use the access register mode to access the per-CPU vdso data page with the current code. An alternative to the complicated AR mode is to use the secondary space mode. This makes the vdso faster and quite a bit simpler. The downside is that the uaccess code has to be changed quite a bit. Which instructions are used depends on the machine and what kind of uaccess operation is requested. The instruction dictates which ASCE value needs to be loaded into %cr1 and %cr7. The different cases: * User copy with MVCOS for z10 and newer machines The MVCOS instruction can copy between the primary space (aka user) and the home space (aka kernel) directly. For set_fs(KERNEL_DS) the kernel ASCE is loaded into %cr1. For set_fs(USER_DS) the user space is already loaded in %cr1. * User copy with MVCP/MVCS for older machines To be able to execute the MVCP/MVCS instructions the kernel needs to switch to primary mode. The control register %cr1 has to be set to the kernel ASCE and %cr7 to either the kernel ASCE or the user ASCE dependent on set_fs(KERNEL_DS) vs set_fs(USER_DS). * Data access in the user address space for strnlen / futex To use "normal" instruction with data from the user address space the secondary space mode is used. The kernel needs to switch to primary mode, %cr1 has to contain the kernel ASCE and %cr7 either the user ASCE or the kernel ASCE, dependent on set_fs. To load a new value into %cr1 or %cr7 is an expensive operation, the kernel tries to be lazy about it. E.g. for multiple user copies in a row with MVCP/MVCS the replacement of the vdso ASCE in %cr7 with the user ASCE is done only once. On return to user space a CPU bit is checked that loads the vdso ASCE again. To enable and disable the data access via the secondary space two new functions are added, enable_sacf_uaccess and disable_sacf_uaccess. The fact that a context is in secondary space uaccess mode is stored in the mm_segment_t value for the task. The code of an interrupt may use set_fs as long as it returns to the previous state it got with get_fs with another call to set_fs. The code in finish_arch_post_lock_switch simply has to do a set_fs with the current mm_segment_t value for the task. For CPUs with MVCOS: CPU running in | %cr1 ASCE | %cr7 ASCE | --------------------------------------|-----------|-----------| user space | user | vdso | kernel, USER_DS, normal-mode | user | vdso | kernel, USER_DS, normal-mode, lazy | user | user | kernel, USER_DS, sacf-mode | kernel | user | kernel, KERNEL_DS, normal-mode | kernel | vdso | kernel, KERNEL_DS, normal-mode, lazy | kernel | kernel | kernel, KERNEL_DS, sacf-mode | kernel | kernel | For CPUs without MVCOS: CPU running in | %cr1 ASCE | %cr7 ASCE | --------------------------------------|-----------|-----------| user space | user | vdso | kernel, USER_DS, normal-mode | user | vdso | kernel, USER_DS, normal-mode lazy | kernel | user | kernel, USER_DS, sacf-mode | kernel | user | kernel, KERNEL_DS, normal-mode | kernel | vdso | kernel, KERNEL_DS, normal-mode, lazy | kernel | kernel | kernel, KERNEL_DS, sacf-mode | kernel | kernel | The lines with "lazy" refer to the state after a copy via the secondary space with a delayed reload of %cr1 and %cr7. There are three hardware address spaces that can cause a DAT exception, primary, secondary and home space. The exception can be related to four different fault types: user space fault, vdso fault, kernel fault, and the gmap faults. Dependent on the set_fs state and normal vs. sacf mode there are a number of fault combinations: 1) user address space fault via the primary ASCE 2) gmap address space fault via the primary ASCE 3) kernel address space fault via the primary ASCE for machines with MVCOS and set_fs(KERNEL_DS) 4) vdso address space faults via the secondary ASCE with an invalid address while running in secondary space in problem state 5) user address space fault via the secondary ASCE for user-copy based on the secondary space mode, e.g. futex_ops or strnlen_user 6) kernel address space fault via the secondary ASCE for user-copy with secondary space mode with set_fs(KERNEL_DS) 7) kernel address space fault via the primary ASCE for user-copy with secondary space mode with set_fs(USER_DS) on machines without MVCOS. 8) kernel address space fault via the home space ASCE Replace user_space_fault() with a new function get_fault_type() that can distinguish all four different fault types. With these changes the futex atomic ops from the kernel and the strnlen_user will get a little bit slower, as well as the old style uaccess with MVCP/MVCS. All user accesses based on MVCOS will be as fast as before. On the positive side, the user space vdso code is a lot faster and Linux ceases to use the complicated AR mode. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
#
b2441318 |
|
01-Nov-2017 |
Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
License cleanup: add SPDX GPL-2.0 license identifier to files with no license Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
1887aa07 |
|
22-Sep-2017 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
s390/topology: add detection of dedicated vs shared CPUs The topology information returned by STSI 15.x.x contains a flag if the CPUs of a topology-list are dedicated or shared. Make this information available if the machine provides topology information. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
7b83c629 |
|
11-Sep-2017 |
Heiko Carstens <hca@linux.ibm.com> |
s390/guarded storage: simplify task exit handling Free data structures required for guarded storage from arch_release_task_struct(). This allows to simplify the code a bit, and also makes the semantics a bit easier: arch_release_task_struct() is never called from the task that is being removed. In addition this allows to get rid of exit_thread() in a later patch. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
8076428f |
|
11-Sep-2017 |
Heiko Carstens <hca@linux.ibm.com> |
s390: convert release_thread() into a static inline function release_thread() is an empty function that gets called on every task exit. Move the function to a header file and force inlining of it, so that the compiler can optimize it away instead of generating a pointless function call. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
6474924e |
|
28-Jun-2017 |
Tobias Klauser <tklauser@distanz.ch> |
arch: remove unused macro/function thread_saved_pc() The only user of thread_saved_pc() in non-arch-specific code was removed in commit 8243d5597793 ("sched/core: Remove pointless printout in sched_show_task()"). Remove the implementations as well. Some architectures use thread_saved_pc() in their arch-specific code. Leave their thread_saved_pc() intact. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
c929500d |
|
07-Jun-2017 |
QingFeng Hao <haoqf@linux.vnet.ibm.com> |
s390/nmi: s390: New low level handling for machine check happening in guest Add the logic to check if the machine check happens when the guest is running. If yes, set the exit reason -EINTR in the machine check's interrupt handler. Refactor s390_do_machine_check to avoid panicing the host for some kinds of machine checks which happen when guest is running. Reinject the instruction processing damage's machine checks including Delayed Access Exception instead of damaging the host if it happens in the guest because it could be caused by improper update on TLB entry or other software case and impacts the guest only. Signed-off-by: QingFeng Hao <haoqf@linux.vnet.ibm.com> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
|
#
1aea9b3f |
|
24-Apr-2017 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
s390/mm: implement 5 level pages tables Add the logic to upgrade the page table for a 64-bit process to five levels. This increases the TASK_SIZE from 8PB to 16EB-4K. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
ee71d16d |
|
20-Apr-2017 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
s390/mm: make TASK_SIZE independent from the number of page table levels The TASK_SIZE for a process should be maximum possible size of the address space, 2GB for a 31-bit process and 8PB for a 64-bit process. The number of page table levels required for a given memory layout is a consequence of the mapped memory areas and their location. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
916cda1a |
|
26-Jan-2016 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
s390: add a system call for guarded storage This adds a new system call to enable the use of guarded storage for user space processes. The system call takes two arguments, a command and pointer to a guarded storage control block: s390_guarded_storage(int command, struct gs_cb *gs_cb); The second argument is relevant only for the GS_SET_BC_CB command. The commands in detail: 0 - GS_ENABLE Enable the guarded storage facility for the current task. The initial content of the guarded storage control block will be all zeros. After the enablement the user space code can use load-guarded-storage-controls instruction (LGSC) to load an arbitrary control block. While a task is enabled the kernel will save and restore the current content of the guarded storage registers on context switch. 1 - GS_DISABLE Disables the use of the guarded storage facility for the current task. The kernel will cease to save and restore the content of the guarded storage registers, the task specific content of these registers is lost. 2 - GS_SET_BC_CB Set a broadcast guarded storage control block. This is called per thread and stores a specific guarded storage control block in the task struct of the current task. This control block will be used for the broadcast event GS_BROADCAST. 3 - GS_CLEAR_BC_CB Clears the broadcast guarded storage control block. The guarded- storage control block is removed from the task struct that was established by GS_SET_BC_CB. 4 - GS_BROADCAST Sends a broadcast to all thread siblings of the current task. Every sibling that has established a broadcast guarded storage control block will load this control block and will be enabled for guarded storage. The broadcast guarded storage control block is used up, a second broadcast without a refresh of the stored control block with GS_SET_BC_CB will not have any effect. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
fb94a687 |
|
23-Feb-2017 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
s390: TASK_SIZE for kernel threads Return a sensible value if TASK_SIZE if called from a kernel thread. This gets us around an issue with copy_mount_options that does a magic size calculation "TASK_SIZE - (unsigned long)data" while in a kernel thread and data pointing to kernel space. Cc: <stable@vger.kernel.org> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
b5a882fc |
|
17-Feb-2017 |
Heiko Carstens <hca@linux.ibm.com> |
s390: restore address space when returning to user space Unbalanced set_fs usages (e.g. early exit from a function and a forgotten set_fs(USER_DS) call) may lead to a situation where the secondary asce is the kernel space asce when returning to user space. This would allow user space to modify kernel space at will. This would only be possible with the above mentioned kernel bug, however we can detect this and fix the secondary asce before returning to user space. Therefore a new TIF_ASCE_SECONDARY which is used within set_fs. When returning to user space check if TIF_ASCE_SECONDARY is set, which would indicate a bug. If it is set print a message to the console, fixup the secondary asce, and then return to user space. This is similar to what is being discussed for x86 and arm: "[RFC] syscalls: Restore address limit after a syscall". Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
606aa4aa |
|
17-Feb-2017 |
Heiko Carstens <hca@linux.ibm.com> |
s390: rename CIF_ASCE to CIF_ASCE_PRIMARY This is just a preparation patch in order to keep the "restore address space after syscall" patch small. Rename CIF_ASCE to CIF_ASCE_PRIMARY to be unique and specific when introducing a second CIF_ASCE_SECONDARY CIF flag. Suggested-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
1228f7be |
|
16-Feb-2017 |
Heiko Carstens <hca@linux.ibm.com> |
s390: add missing "do {} while (0)" loop constructs to multiline macros Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
b7394a5f |
|
05-Jan-2017 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
sched/cputime, s390: Implement delayed accounting of system time The account_system_time() function is called with a cputime that occurred while running in the kernel. The function detects which context the CPU is currently running in and accounts the time to the correct bucket. This forces the arch code to account the cputime for hardirq and softirq immediately. Such accounting function can be costly and perform unwelcome divisions and multiplications, among others. The arch code can delay the accounting for system time. For s390 the accounting is done once per timer tick and for each task switch. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> [ Rebase against latest linus tree and move account_system_index_scaled(). ] Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Stanislaw Gruszka <sgruszka@redhat.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Wanpeng Li <wanpeng.li@hotmail.com> Link: http://lkml.kernel.org/r/1483636310-6557-10-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
6d0d2878 |
|
16-Nov-2016 |
Christian Borntraeger <borntraeger@de.ibm.com> |
locking/core: Provide common cpu_relax_yield() definition No need to duplicate the same define everywhere. Since the only user is stop-machine and the only provider is s390, we can use a default implementation of cpu_relax_yield() in sched.h. Suggested-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Acked-by: Russell King <rmk+kernel@armlinux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Noam Camus <noamc@ezchip.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: kvm@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: linux-s390 <linux-s390@vger.kernel.org> Cc: linuxppc-dev@lists.ozlabs.org Cc: sparclinux@vger.kernel.org Cc: virtualization@lists.linux-foundation.org Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/1479298985-191589-1-git-send-email-borntraeger@de.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
5bd0b85b |
|
25-Oct-2016 |
Christian Borntraeger <borntraeger@de.ibm.com> |
locking/core, arch: Remove cpu_relax_lowlatency() As there are no users left, we can remove cpu_relax_lowlatency() implementations from every architecture. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Noam Camus <noamc@ezchip.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: linuxppc-dev@lists.ozlabs.org Cc: virtualization@lists.linux-foundation.org Cc: xen-devel@lists.xenproject.org Cc: <linux-arch@vger.kernel.org> Link: http://lkml.kernel.org/r/1477386195-32736-6-git-send-email-borntraeger@de.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
22b6430d |
|
25-Oct-2016 |
Christian Borntraeger <borntraeger@de.ibm.com> |
locking/core, s390: Make cpu_relax() a barrier again stop_machine() seemed to be the only important place for yielding during cpu_relax(). This was fixed by using cpu_relax_yield(). Therefore, we can now redefine cpu_relax() to be a barrier instead on s390, making s390 identical to all other architectures. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Noam Camus <noamc@ezchip.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: linuxppc-dev@lists.ozlabs.org Cc: virtualization@lists.linux-foundation.org Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/1477386195-32736-4-git-send-email-borntraeger@de.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
79ab11cd |
|
25-Oct-2016 |
Christian Borntraeger <borntraeger@de.ibm.com> |
locking/core: Introduce cpu_relax_yield() For spinning loops people do often use barrier() or cpu_relax(). For most architectures cpu_relax and barrier are the same, but on some architectures cpu_relax can add some latency. For example on power,sparc64 and arc, cpu_relax can shift the CPU towards other hardware threads in an SMT environment. On s390 cpu_relax does even more, it uses an hypercall to the hypervisor to give up the timeslice. In contrast to the SMT yielding this can result in larger latencies. In some places this latency is unwanted, so another variant "cpu_relax_lowlatency" was introduced. Before this is used in more and more places, lets revert the logic and provide a cpu_relax_yield that can be called in places where yielding is more important than latency. By default this is the same as cpu_relax on all architectures. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Noam Camus <noamc@ezchip.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: linuxppc-dev@lists.ozlabs.org Cc: virtualization@lists.linux-foundation.org Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/1477386195-32736-2-git-send-email-borntraeger@de.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
ef280c85 |
|
07-Nov-2016 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
s390: move sys_call_table and last_break from thread_info to thread_struct Move the last two architecture specific fields from the thread_info structure to the thread_struct. All that is left in thread_info is the flags field. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
90c53e65 |
|
07-Nov-2016 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
s390: move cputime accounting fields from thread_info to thread_struct The user_timer and system_timer fields are used for the per-thread cputime accounting code. The access to these values is simpler if they are moved to the thread_struct as the task_thread_info(tsk) indirection is not needed anymore. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
f8fc82b4 |
|
08-Nov-2016 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
s390: move system_call field from thread_info to thread_struct The system_call field in thread_info structure is used by the signal code to store the number of the current system call while the debugger interacts with its inferior. A better location for the system_call field is with the other debugger related information in the thread_struct. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
d0208639 |
|
17-Oct-2016 |
Heiko Carstens <hca@linux.ibm.com> |
s390/dumpstack: restore reliable indicator for call traces Before merging all different stack tracers the call traces printed had an indicator if an entry can be considered reliable or not. Unreliable entries were put in braces, reliable not. Currently all lines contain these extra braces. This patch restores the old behaviour by adding an extra "reliable" parameter to the callback functions. Only show_trace makes currently use of it. Before: [ 0.804751] Call Trace: [ 0.804753] ([<000000000017d0e0>] try_to_wake_up+0x318/0x5e0) [ 0.804756] ([<0000000000161d64>] create_worker+0x174/0x1c0) After: [ 0.804751] Call Trace: [ 0.804753] ([<000000000017d0e0>] try_to_wake_up+0x318/0x5e0) [ 0.804756] [<0000000000161d64>] create_worker+0x174/0x1c0 Fixes: 758d39ebd3d5 ("s390/dumpstack: merge all four stack tracers") Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
4a494439 |
|
07-Mar-2016 |
David Hildenbrand <dahi@linux.vnet.ibm.com> |
s390/mm: remember the int code for the last gmap fault For nested virtualization, we want to know if we are handling a protection exception, because these can directly be forwarded to the guest without additional checks. Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
|
#
4be130a0 |
|
07-Mar-2016 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
s390/mm: add shadow gmap support For a nested KVM guest the outer KVM host needs to create shadow page tables for the nested guest. This patch adds the basic support to the guest address space (gmap) code. For each guest address space the inner KVM host creates, the first outer KVM host needs to create shadow page tables. The address space is identified by the ASCE loaded into the control register 1 at the time the inner SIE instruction for the second nested KVM guest is executed. The outer KVM host creates the shadow tables starting with the table identified by the ASCE on a on-demand basis. The outer KVM host will get repeated faults for all the shadow tables needed to run the second KVM guest. While a shadow page table for the second KVM guest is active the access to the origin region, segment and page tables needs to be restricted for the first KVM guest. For region and segment and page tables the first KVM guest may read the memory, but write attempt has to lead to an unshadow. This is done using the page invalid and read-only bits in the page table of the first KVM guest. If the first guest re-accesses one of the origin pages of a shadow, it gets a fault and the affected parts of the shadow page table hierarchy needs to be removed again. PGSTE tables don't have to be shadowed, as all interpretation assist can't deal with the invalid bits in the shadow pte being set differently than the original ones provided by the first KVM guest. Many bug fixes and improvements by David Hildenbrand. Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
|
#
097a116c |
|
13-Apr-2016 |
Heiko Carstens <hca@linux.ibm.com> |
s390/cpuinfo: show dynamic and static cpu mhz Show the dynamic and static cpu mhz of each cpu. Since these values are per cpu this requires a fundamental extension of the format of /proc/cpuinfo. Historically we had only a single line per cpu and a summary at the top of the file. This format is hardly extendible if we want to add more per cpu information. Therefore this patch adds per cpu blocks at the end of /proc/cpuinfo: cpu : 0 cpu Mhz dynamic : 5504 cpu Mhz static : 5504 cpu : 1 cpu Mhz dynamic : 5504 cpu Mhz static : 5504 cpu : 2 cpu Mhz dynamic : 5504 cpu Mhz static : 5504 cpu : 3 cpu Mhz dynamic : 5504 cpu Mhz static : 5504 Right now each block contains only the dynamic and static cpu mhz, but it can be easily extended like on every other architecture. This extension is supposed to be compatible with the old format. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: Sascha Silbe <silbe@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
3f6813b9 |
|
01-Apr-2016 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
s390/fpu: allocate 'struct fpu' with the task_struct Analog to git commit 0c8c0f03e3a292e031596484275c14cf39c0ab7a "x86/fpu, sched: Dynamically allocate 'struct fpu'" move the struct fpu to the end of the struct thread_struct, set CONFIG_ARCH_WANTS_DYNAMIC_TASK_STRUCT and add the setup_task_size() function to calculate the correct size fo the task struct. For the performance_defconfig this increases the size of struct task_struct from 7424 bytes to 7936 bytes (MACHINE_HAS_VX==1) or 7552 bytes (MACHINE_HAS_VX==0). The dynamic allocation of the struct fpu is removed. The slab cache uses an 8KB block for the task struct in all cases, there is enough room for the struct fpu. For MACHINE_HAS_VX==1 each task now needs 512 bytes less memory. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
723cacbd |
|
15-Apr-2016 |
Gerald Schaefer <gerald.schaefer@linux.ibm.com> |
s390/mm: fix asce_bits handling with dynamic pagetable levels There is a race with multi-threaded applications between context switch and pagetable upgrade. In switch_mm() a new user_asce is built from mm->pgd and mm->context.asce_bits, w/o holding any locks. A concurrent mmap with a pagetable upgrade on another thread in crst_table_upgrade() could already have set new asce_bits, but not yet the new mm->pgd. This would result in a corrupt user_asce in switch_mm(), and eventually in a kernel panic from a translation exception. Fix this by storing the complete asce instead of just the asce_bits, which can then be read atomically from switch_mm(), so that it either sees the old value or the new value, but no mixture. Both cases are OK. Having the old value would result in a page fault on access to the higher level memory, but the fault handler would see the new mm->pgd, if it was a valid access after the mmap on the other thread has completed. So as worst-case scenario we would have a page fault loop for the racing thread until the next time slice. Also remove dead code and simplify the upgrade/downgrade path, there are no upgrades from 2 levels, and only downgrades from 3 levels for compat tasks. There are also no concurrent upgrades, because the mmap_sem is held with down_write() in do_mmap, so the flush and table checks during upgrade can be removed. Reported-by: Michael Munday <munday@ca.ibm.com> Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
758d39eb |
|
08-Feb-2016 |
Heiko Carstens <hca@linux.ibm.com> |
s390/dumpstack: merge all four stack tracers We have four different stack tracers of which three had bugs. So it's time to merge them to a single stack tracer which allows to specify a call back function which will be called for each step. This patch changes behavior a bit: - the "nosched" and "in_sched_functions" check within save_stack_trace_tsk did work only for the last stack frame within a context. Now it considers the check for each stack frame like it should. - both the oprofile variant and the perf_events variant did save a return address twice if a zero back chain was detected, which indicates an interrupt frame. The new dump_trace function will call the oprofile and perf_events backends with the psw address that is contained within the corresponding pt_regs structure instead. - the original show_trace and save_context_stack functions did already use the psw address of the pt_regs structure if a zero back chain was detected. However now we ignore the psw address if it is a user space address. After all we trace the kernel stack and not the user space stack. This way we also get rid of the garbage user space address in case of warnings and / or panic call traces. So this should make life easier since now there is only one stack tracer left which we can break. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
76737ce1 |
|
31-Jan-2016 |
Heiko Carstens <hca@linux.ibm.com> |
s390: add current_stack_pointer() helper function Implement current_stack_pointer() helper function and use it everywhere, instead of having several different inline assembly variants. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Tested-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
fecc868a |
|
17-Jan-2016 |
Heiko Carstens <hca@linux.ibm.com> |
s390: remove all usages of PSW_ADDR_AMODE This is a leftover from the 31 bit area. For CONFIG_64BIT the usual operation "y = x | PSW_ADDR_AMODE" is a nop. Therefore remove all usages of PSW_ADDR_AMODE and make the code a bit less confusing. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
|
#
c667aeac |
|
31-Dec-2015 |
Heiko Carstens <hca@linux.ibm.com> |
s390: rename struct _lowcore to struct lowcore Finally get rid of the leading underscore. I tried this already two or three years ago, however Michael Holzheu objected since this would break the crash utility (again). However Michael integrated support for the new name into the crash utility back then, so it doesn't break if the name will be changed now. So finally get rid of the ever confusing leading underscore. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
419123f9 |
|
19-Nov-2015 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
s390/spinlock: do not yield to a CPU in udelay/mdelay It does not make sense to try to relinquish the time slice with diag 0x9c to a CPU in a state that does not allow to schedule the CPU. The scenario where this can happen is a CPU waiting in udelay/mdelay while holding a spin-lock. Add a CIF bit to tag a CPU in enabled wait and use it to detect that the yield of a CPU will not be successful and skip the diagnose call. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
b38feccd |
|
02-Nov-2015 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
s390: remove runtime instrumentation interrupts The external interrupts for runtime instrumentation buffer-full and runtime instrumentation halted are unused and have no current user. Remove the support and ignore the second parameter of the s390_runtime_instr system call from now on. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
f9e6edfb |
|
11-Oct-2015 |
Heiko Carstens <hca@linux.ibm.com> |
s390: don't store registers on disabled wait anymore The current disabled wait code stores register contents into their save areas, however it is (at least) missing the new vector registers. Given the fact that the whole exercise seems to be rather pointless simply don't save any registers anymore. In a "live" system it is always possible to inspect register contents, and in case of a dump the register contents will be stored by the dump mechanism. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
ecbafda8 |
|
12-Oct-2015 |
Heiko Carstens <hca@linux.ibm.com> |
s390: get rid of __set_psw_mask() With the removal of 31 bit code we can always assume that the epsw instruction is available. Therefore use the __extract_psw() function to disable and enable machine checks. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
b0753902 |
|
05-Oct-2015 |
Hendrik Brueckner <brueckner@linux.vnet.ibm.com> |
s390/fpu: split fpu-internal.h into fpu internals, api, and type headers Split the API and FPU type definitions into separate header files similar to "x86/fpu: Rename fpu-internal.h to fpu/internal.h" (78f7f1e54b). The new header files and their meaning are: asm/fpu/types.h: FPU related data types, needed for 'struct thread_struct' and 'struct task_struct'. asm/fpu/api.h: FPU related 'public' functions for other subsystems and device drivers. asm/fpu/internal.h: FPU internal functions mainly used to convert FPU register contents in signal handling. Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
92778b99 |
|
06-Oct-2015 |
Heiko Carstens <hca@linux.ibm.com> |
s390/flags: use _BITUL macro The defines that are used in entry.S have been partially converted to use the _BITUL macro (setup.h). This patch converts the rest. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
ac25e790 |
|
06-Oct-2015 |
Heiko Carstens <hca@linux.ibm.com> |
s390/flags: fix flag handling The cpu flags and pt_regs flags fields are each 64 bits in size. A flag can be set with helper functions like set_cpu_flags(). These functions create a mask using "1U << flag". This doesn't work if flag is larger than 31, since 1U << 32 == 0. So fix this in case we ever will have such flag numbers. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
db7e007f |
|
15-Aug-2015 |
Heiko Carstens <hca@linux.ibm.com> |
s390/udelay: make udelay have busy loop semantics When using systemtap it was observed that our udelay implementation is rather suboptimal if being called from a kprobe handler installed by systemtap. The problem observed when a kprobe was installed on lock_acquired(). When the probe was hit the kprobe handler did call udelay, which set up an (internal) timer and reenabled interrupts (only the clock comparator interrupt) and waited for the interrupt. This is an optimization to avoid that the cpu is busy looping while waiting that enough time passes. The problem is that the interrupt handler still does call irq_enter()/irq_exit() which then again can lead to a deadlock, since some accounting functions may take locks as well. If one of these locks is the same, which caused lock_acquired() to be called, we have a nice deadlock. This patch reworks the udelay code for the interrupts disabled case to immediately leave the low level interrupt handler when the clock comparator interrupt happens. That way no C code is being called and the deadlock cannot happen anymore. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
0ac27779 |
|
29-Sep-2015 |
Hendrik Brueckner <brueckner@linux.vnet.ibm.com> |
s390/fpu: add static FPU save area for init_task Previously, the init task did not have an allocated FPU save area and saving an FPU state was not possible. Now if the vector extension is always enabled, provide a static FPU save area to save FPU states of vector instructions that can be executed quite early. Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
22362a0e |
|
08-Jul-2015 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
s390/sclp: convert early sclp console code to C The 31-bit assembler code for the early sclp console is error prone as git commit fde24b54d976cc123506695c17db01438a11b673 "s390/sclp: clear upper register halves in _sclp_print_early" has shown. Convert the assembler code to C. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
9977e886 |
|
09-Jun-2015 |
Hendrik Brueckner <brueckner@linux.vnet.ibm.com> |
s390/kernel: lazy restore fpu registers Improve the save and restore behavior of FPU register contents to use the vector extension within the kernel. The kernel does not use floating-point or vector registers and, therefore, saving and restoring the FPU register contents are performed for handling signals or switching processes only. To prepare for using vector instructions and vector registers within the kernel, enhance the save behavior and implement a lazy restore at return to user space from a system call or interrupt. To implement the lazy restore, the save_fpu_regs() sets a CPU information flag, CIF_FPU, to indicate that the FPU registers must be restored. Saving and setting CIF_FPU is performed in an atomic fashion to be interrupt-safe. When the kernel wants to use the vector extension or wants to change the FPU register state for a task during signal handling, the save_fpu_regs() must be called first. The CIF_FPU flag is also set at process switch. At return to user space, the FPU state is restored. In particular, the FPU state includes the floating-point or vector register contents, as well as, vector-enablement and floating-point control. The FPU state restore and clearing CIF_FPU is also performed in an atomic fashion. For KVM, the restore of the FPU register state is performed when restoring the general-purpose guest registers before the SIE instructions is started. Because the path towards the SIE instruction is interruptible, the CIF_FPU flag must be checked again right before going into SIE. If set, the guest registers must be reloaded again by re-entering the outer SIE loop. This is the same behavior as if the SIE critical section is interrupted. Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
904818e2 |
|
11-Jun-2015 |
Hendrik Brueckner <brueckner@linux.vnet.ibm.com> |
s390/kernel: introduce fpu-internal.h with fpu helper functions Introduce a new structure to manage FP and VX registers. Refactor the save and restore of floating point and vector registers with a set of helper functions in fpu-internal.h. Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
2e54dc3c |
|
12-Jun-2015 |
Hendrik Brueckner <brueckner@linux.vnet.ibm.com> |
s390/kernel: move EX_TABLE macros to linkage.h header file Move the EX_TABLE macro definitions from the processor.h to the linkage.h header file. It helps to reduce circular header file dependencies. Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
5a79859a |
|
12-Feb-2015 |
Heiko Carstens <hca@linux.ibm.com> |
s390: remove 31 bit support Remove the 31 bit support in order to reduce maintenance cost and effectively remove dead code. Since a couple of years there is no distribution left that comes with a 31 bit kernel. The 31 bit kernel also has been broken since more than a year before anybody noticed. In addition I added a removal warning to the kernel shown at ipl for 5 minutes: a960062e5826 ("s390: add 31 bit warning message") which let everybody know about the plan to remove 31 bit code. We didn't get any response. Given that the last 31 bit only machine was introduced in 1999 let's remove the code. Anybody with 31 bit user space code can still use the compat mode. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
4d92f502 |
|
27-Jan-2015 |
Heiko Carstens <hca@linux.ibm.com> |
s390: reintroduce diag 44 calls for cpu_relax() Christian Borntraeger reported that the now missing diag 44 calls (voluntary time slice end) does cause a performance regression for stop_machine() calls if a machine has more virtual cpus than the host has physical cpus. This patch mainly reverts 57f2ffe14fd125c2 ("s390: remove diag 44 calls from cpu_relax()") with the exception that we still do not issue diag 44 calls if running with smt enabled. Due to group scheduling algorithms when running in LPAR this would lead to significant latencies. However, when running in LPAR we do not have more virtual than physical cpus. Reported-and-tested-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
57f2ffe1 |
|
15-Sep-2014 |
Heiko Carstens <hca@linux.ibm.com> |
s390: remove diag 44 calls from cpu_relax() Simplify cpu_relax() to a simple barrier(). Performance wise this doesn't seem to make any big difference anymore, since nearly all lock variants have directed yield semantics in the meantime. Also this makes s390 behave like all other architectures. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
80703617 |
|
06-Oct-2014 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
s390: add support for vector extension The vector extension introduces 32 128-bit vector registers and a set of instruction to operate on the vector registers. The kernel can control the use of vector registers for the problem state program with a bit in control register 0. Once enabled for a process the kernel needs to retain the content of the vector registers on context switch. The signal frame is extended to include the vector registers. Two new register sets NT_S390_VXRS_LOW and NT_S390_VXRS_HIGH are added to the regset interface for the debugger and core dumps. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
b5f87f15 |
|
01-Oct-2014 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
s390/idle: consolidate idle functions and definitions Move the C functions and definitions related to the idle state handling to arch/s390/include/asm/idle.h and arch/s390/kernel/idle.c. The function s390_get_idle_time is renamed to arch_cpu_idle_time and vtime_stop_cpu to enabled_wait. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
fe0f4976 |
|
30-Sep-2014 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
s390/nohz: use a per-cpu flag for arch_needs_cpu Move the nohz_delay bit from the s390_idle data structure to the per-cpu flags. Clear the nohz delay flag in __cpu_disable and remove the cpu hotplug notifier that used to do this. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
3a6bfbc9 |
|
29-Jun-2014 |
Davidlohr Bueso <davidlohr@hp.com> |
arch, locking: Ciao arch_mutex_cpu_relax() The arch_mutex_cpu_relax() function, introduced by 34b133f, is hacky and ugly. It was added a few years ago to address the fact that common cpu_relax() calls include yielding on s390, and thus impact the optimistic spinning functionality of mutexes. Nowadays we use this function well beyond mutexes: rwsem, qrwlock, mcs and lockref. Since the macro that defines the call is in the mutex header, any users must include mutex.h and the naming is misleading as well. This patch (i) renames the call to cpu_relax_lowlatency ("relax, but only if you can do it with very low latency") and (ii) defines it in each arch's asm/processor.h local header, just like for regular cpu_relax functions. On all archs, except s390, cpu_relax_lowlatency is simply cpu_relax, and thus we can take it out of mutex.h. While this can seem redundant, I believe it is a good choice as it allows us to move out arch specific logic from generic locking primitives and enables future(?) archs to transparently define it, similarly to System Z. Signed-off-by: Davidlohr Bueso <davidlohr@hp.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Anton Blanchard <anton@samba.org> Cc: Aurelien Jacquiot <a-jacquiot@ti.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Bharat Bhushan <r65777@freescale.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chen Liqin <liqin.linux@gmail.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Chris Zankel <chris@zankel.net> Cc: David Howells <dhowells@redhat.com> Cc: David S. Miller <davem@davemloft.net> Cc: Deepthi Dharwar <deepthi@linux.vnet.ibm.com> Cc: Dominik Dingel <dingel@linux.vnet.ibm.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: James E.J. Bottomley <jejb@parisc-linux.org> Cc: James Hogan <james.hogan@imgtec.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Joe Perches <joe@perches.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Joseph Myers <joseph@codesourcery.com> Cc: Kees Cook <keescook@chromium.org> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Cc: Lennox Wu <lennox.wu@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Salter <msalter@redhat.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Neuling <mikey@neuling.org> Cc: Michal Simek <monstr@monstr.eu> Cc: Mikael Starvik <starvik@axis.com> Cc: Nicolas Pitre <nico@linaro.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Qais Yousef <qais.yousef@imgtec.com> Cc: Qiaowei Ren <qiaowei.ren@intel.com> Cc: Rafael Wysocki <rafael.j.wysocki@intel.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Steven Miao <realmz6@gmail.com> Cc: Steven Rostedt <srostedt@redhat.com> Cc: Stratos Karafotis <stratosk@semaphore.gr> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Vasily Kulikov <segoon@openwall.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Vineet Gupta <Vineet.Gupta1@synopsys.com> Cc: Waiman Long <Waiman.Long@hp.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Wolfram Sang <wsa@the-dreams.de> Cc: adi-buildroot-devel@lists.sourceforge.net Cc: linux390@de.ibm.com Cc: linux-alpha@vger.kernel.org Cc: linux-am33-list@redhat.com Cc: linux-arm-kernel@lists.infradead.org Cc: linux-c6x-dev@linux-c6x.org Cc: linux-cris-kernel@axis.com Cc: linux-hexagon@vger.kernel.org Cc: linux-ia64@vger.kernel.org Cc: linux@lists.openrisc.net Cc: linux-m32r-ja@ml.linux-m32r.org Cc: linux-m32r@ml.linux-m32r.org Cc: linux-m68k@lists.linux-m68k.org Cc: linux-metag@vger.kernel.org Cc: linux-mips@linux-mips.org Cc: linux-parisc@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-s390@vger.kernel.org Cc: linux-sh@vger.kernel.org Cc: linux-xtensa@linux-xtensa.org Cc: sparclinux@vger.kernel.org Link: http://lkml.kernel.org/r/1404079773.2619.4.camel@buesod1.americas.hpqcorp.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
d3a73acb |
|
14-Apr-2014 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
s390: split TIF bits into CIF, PIF and TIF bits The oi and ni instructions used in entry[64].S to set and clear bits in the thread-flags are not guaranteed to be atomic in regard to other CPUs. Split the TIF bits into CPU, pt_regs and thread-info specific bits. Updates on the TIF bits are done with atomic instructions, updates on CPU and pt_regs bits are done with non-atomic instructions. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
24eb3a82 |
|
17-Jun-2013 |
Dominik Dingel <dingel@linux.vnet.ibm.com> |
KVM: s390: Add FAULT_FLAG_RETRY_NOWAIT for guest fault In the case of a fault, we will retry to exit sie64 but with gmap fault indication for this thread set. This makes it possible to handle async page faults. Based on a patch from Martin Schwidefsky. Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
|
#
10607864 |
|
28-Oct-2013 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
s390/mm,tlb: correct tlb flush on page table upgrade The IDTE instruction used to flush TLB entries for a specific address space uses the address-space-control element (ASCE) to identify affected TLB entries. The upgrade of a page table adds a new top level page table which changes the ASCE. The TLB entries associated with the old ASCE need to be flushed and the ASCE for the address space needs to be replaced synchronously on all CPUs which currently use it. The concept of a lazy ASCE update with an exception handler is broken. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
5ebf250d |
|
16-Oct-2013 |
Heiko Carstens <hca@linux.ibm.com> |
s390: fix handling of runtime instrumentation psw bit Fix the following bugs: - When returning from a signal the signal handler copies the saved psw mask from user space and uses parts of it. Especially it restores the RI bit unconditionally. If however the machine doesn't support RI, or RI is disabled for the task, the last lpswe instruction which returns to user space will generate a specification exception. To fix this check if the RI bit is allowed to be set and kill the task if not. - In the compat mode signal handler code the RI bit of the psw mask gets propagated to the mask of the return psw: if user space enables RI in the signal handler, RI will also be enabled after the signal handler is finished. This is a different behaviour than with 64 bit tasks. So change this to match the 64 bit semantics, which restores the original RI bit value. - Fix similar oddities within the ptrace code as well. Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
e258d719 |
|
24-Sep-2013 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
s390/uaccess: always run the kernel in home space Simplify the uaccess code by removing the user_mode=home option. The kernel will now always run in the home space mode. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
0f20822a |
|
13-Sep-2013 |
Heiko Carstens <hca@linux.ibm.com> |
s390/dis: move disassembler function prototypes to proper header file Now that the in-kernel disassembler has an own header file move the disassembler related function prototypes to that header file. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
083986e8 |
|
28-Sep-2013 |
Heiko Carstens <hca@linux.ibm.com> |
mutex: replace CONFIG_HAVE_ARCH_MUTEX_CPU_RELAX with simple ifdef Linus suggested to replace #ifndef CONFIG_HAVE_ARCH_MUTEX_CPU_RELAX #define arch_mutex_cpu_relax() cpu_relax() #endif with just a simple #ifndef arch_mutex_cpu_relax # define arch_mutex_cpu_relax() cpu_relax() #endif to get rid of CONFIG_HAVE_CPU_RELAX_SIMPLE. So architectures can simply define arch_mutex_cpu_relax if they want an architecture specific function instead of having to add a select statement in their Kconfig in addition. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
#
ee6ee55b |
|
26-Jul-2013 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
KVM: s390: fix task size check The gmap_map_segment function uses PGDIR_SIZE in the check for the maximum address in the tasks address space. This incorrectly limits the amount of memory usable for a kvm guest to 4TB. The correct limit is (1UL << 53). As the TASK_SIZE has different values (4TB vs 8PB) dependent on the existance of the fourth page table level, create a new define 'TASK_MAX_SIZE' for (1UL << 53). Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
64597f9d |
|
02-Jul-2013 |
Michael Mueller <mimu@linux.vnet.ibm.com> |
s390/ptrace: PTRACE_TE_ABORT_RAND The patch implements a s390 specific ptrace request PTRACE_TE_ABORT_RAND to modify the randomness of spontaneous aborts of memory transactions of the transaction execution facility. The data argument of the ptrace request is used to specify the levels of randomness, 0 for normal operation, 1 to abort every transaction at a random instruction, and 2 to abort a random transaction at a random instruction. The default is 0 for normal operation. Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Michael Mueller <mimu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
0bcc94ba |
|
05-Mar-2013 |
Stefan Raspl <raspl@linux.vnet.ibm.com> |
s390/dis: use explicit buf len Pass buffer length in extra parameter. Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
f9a7e025 |
|
21-Sep-2012 |
Al Viro <viro@zeniv.linux.org.uk> |
s390: switch to generic kernel_thread() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
#
65f22a90 |
|
06-Sep-2012 |
Al Viro <viro@zeniv.linux.org.uk> |
s390: fold execve_tail() into start_thread(), convert to generic sys_execve() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
#
eb608fb3 |
|
05-Sep-2012 |
Heiko Carstens <hca@linux.ibm.com> |
s390/exceptions: switch to relative exception table entries This is the s390 port of 70627654 "x86, extable: Switch to relative exception table entries". Reduces the size of our exception tables by 50% on 64 bit builds. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
6668022c |
|
29-Aug-2012 |
Heiko Carstens <hca@linux.ibm.com> |
s390/cache: add cpu cache information to /proc/cpuinfo Add a line for each cpu cache to /proc/cpuinfo. Since we only have information of private cpu caches in sysfs we add a line for each cpu cache in /proc/cpuinfo which will also contain information about shared caches. For a z196 machine /proc/cpuinfo now looks like: vendor_id : IBM/S390 bogomips per cpu: 14367.00 features : esan3 zarch stfle msa ldisp eimm dfp etf3eh highgprs cache0 : level=1 type=Data scope=Private size=64K line_size=256 associativity=4 cache1 : level=1 type=Instruction scope=Private size=128K line_size=256 associativity=8 cache2 : level=2 type=Unified scope=Private size=1536K line_size=256 associativity=12 cache3 : level=3 type=Unified scope=Shared size=24576K line_size=256 associativity=12 cache4 : level=4 type=Unified scope=Shared size=196608K line_size=256 associativity=24 processor 0: version = FF, identification = 000123, machine = 2817 processor 1: version = FF, identification = 100123, machine = 2817 processor 2: version = FF, identification = 200123, machine = 2817 processor 3: version = FF, identification = 200123, machine = 2817 Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
d35339a4 |
|
31-Jul-2012 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
s390: add support for transactional memory Allow user-space processes to use transactional execution (TX). If the TX facility is available user space programs can use transactions for fine-grained serialization based on the data objects that are referenced during a transaction. This is useful for lockless data structures and speculative compiler optimizations. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
e4b8b3f3 |
|
31-Jul-2012 |
Jan Glauber <jan.glauber@gmail.com> |
s390: add support for runtime instrumentation Allow user-space threads to use runtime instrumentation (RI). To enable RI for a thread there is a new s390 specific system call, sys_s390_runtime_instr, that takes as parameter a realtime signal number. If the RI facility is available the system call sets up a control block for the calling thread with the appropriate permissions for the thread to modify the control block. The user-space thread can then use the store and modify RI instructions to alter the control block and start/stop the instrumentation via RION/RIOFF. If the user specified program buffer runs full RI triggers an external interrupt. The external interrupt is translated to a real-time signal that is delivered to the thread that enabled RI on that CPU. The number of the real-time signal is the number specified in the RI system call. So, user-space can select any available real-time signal number in case the application itself uses real-time signals for other purposes. The kernel saves the RI control blocks on task switch only if the running thread was enabled for RI. Therefore, the performance impact on task switch should be negligible if RI is not used. RI is only enabled for user-space mode and is disabled for the supervisor state. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
c10302ef |
|
31-Jul-2012 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
s390/bpf,jit: BPF Just In Time compiler for s390 The s390 implementation of the JIT compiler for packet filter speedup. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
0f6f281b |
|
26-Jul-2012 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
s390/mm: downgrade page table after fork of a 31 bit process The downgrade of the 4 level page table created by init_new_context is currently done only in start_thread31. If a 31 bit process forks the new mm uses a 4 level page table, including the task size of 2<<42 that goes along with it. This is incorrect as now a 31 bit process can map memory beyond 2GB. Define arch_dup_mmap to do the downgrade after fork. Cc: stable@vger.kernel.org Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
9b7fb990 |
|
23-Jul-2012 |
Cornelia Huck <cornelia.huck@de.ibm.com> |
s390/dis: Instruction decoding interface Provide a new function, insn_to_mnemonic, by which e.g. kvm can obtain a human-readable decoding of an instruction's opcode. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
a53c8fab |
|
20-Jul-2012 |
Heiko Carstens <hca@linux.ibm.com> |
s390/comments: unify copyright messages and remove file names Remove the file name from the comment at top of many files. In most cases the file name was wrong anyway, so it's rather pointless. Also unify the IBM copyright statement. We did have a lot of sightly different statements and wanted to change them one after another whenever a file gets touched. However that never happened. Instead people start to take the old/"wrong" statements to use as a template for new files. So unify all of them in one go. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
#
fbe76568 |
|
05-Jun-2012 |
Heiko Carstens <hca@linux.ibm.com> |
s390/smp: make absolute lowcore / cpu restart parameter accesses more robust Setting the cpu restart parameters is done in three different fashions: - directly setting the four parameters individually - copying the four parameters with memcpy (using 4 * sizeof(long)) - copying the four parameters using a private structure In addition code in entry*.S relies on a certain order of the restart members of struct _lowcore. Make all of this more robust to future changes by adding a mem_absolute_assign(dest, val) define, which assigns val to dest using absolute addressing mode. Also the load multiple instructions in entry*.S have been split into separate load instruction so the order of the struct _lowcore members doesn't matter anymore. In addition move the prototypes of memcpy_real/absolute from uaccess.h to processor.h. These memcpy* variants are not related to uaccess at all. string.h doesn't seem to match as well, so lets use processor.h. Also replace the eight byte array in struct _lowcore which represents a misaliged u64 with a u64. The compiler will always create code that handles the misaligned u64 correctly. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
b9e3f776 |
|
25-May-2012 |
Heiko Carstens <hca@linux.ibm.com> |
s390/cpu: remove cpu "capabilities" sysfs attribute It has been a big mistage to add the capabilities attribute to the cpus in sysfs: First the attribute only contains the cpu capability of primary cpus, which however is not necessarily (or better: unlikely) the type of cpu the kernel runs on, which is typically an IFL. In addition all information that is necessary is available in /proc/sysinfo already. So this attribute partially duplicated informations. So programs should look into the sysinfo file to retrieve all informations they are interested in. Since with this kernel release also the powersavings cpu attributes are removed this seems to be a good opportunity to remove another broken interface. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
f4815ac6 |
|
23-May-2012 |
Heiko Carstens <hca@linux.ibm.com> |
s390/headers: replace __s390x__ with CONFIG_64BIT where possible Replace __s390x__ with CONFIG_64BIT in all places that are not exported to userspace or guarded with #ifdef __KERNEL__. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
da477737 |
|
23-May-2012 |
Heiko Carstens <hca@linux.ibm.com> |
s390/headers: remove #ifdef __KERNEL__ from not exported headers Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
55ccf3fe |
|
16-May-2012 |
Suresh Siddha <suresh.b.siddha@intel.com> |
fork: move the real prepare_to_copy() users to arch_dup_task_struct() Historical prepare_to_copy() is mostly a no-op, duplicated for majority of the architectures and the rest following the x86 model of flushing the extended register state like fpu there. Remove it and use the arch_dup_task_struct() instead. Suggested-by: Oleg Nesterov <oleg@redhat.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1336692811-30576-1-git-send-email-suresh.b.siddha@intel.com Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David Howells <dhowells@redhat.com> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Chris Zankel <chris@zankel.net> Cc: Richard Henderson <rth@twiddle.net> Cc: Russell King <linux@arm.linux.org.uk> Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Mike Frysinger <vapier@gentoo.org> Cc: Mark Salter <msalter@redhat.com> Cc: Aurelien Jacquiot <a-jacquiot@ti.com> Cc: Mikael Starvik <starvik@axis.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Jonas Bonn <jonas@southpole.se> Cc: James E.J. Bottomley <jejb@parisc-linux.org> Cc: Helge Deller <deller@gmx.de> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Chen Liqin <liqin.chen@sunplusct.com> Cc: Lennox Wu <lennox.wu@gmail.com> Cc: David S. Miller <davem@davemloft.net> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
|
#
a0616cde |
|
28-Mar-2012 |
David Howells <dhowells@redhat.com> |
Disintegrate asm/system.h for S390 Disintegrate asm/system.h for S390. Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-s390@vger.kernel.org
|
#
ff2d8b19 |
|
12-Jan-2012 |
Joe Perches <joe@perches.com> |
treewide: convert uses of ATTRIB_NORETURN to __noreturn Use the more commonly used __noreturn instead of ATTRIB_NORETURN. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Joe Perches <joe@perches.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
aa33c8cb |
|
27-Dec-2011 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
[S390] cleanup trap handling Move the program interruption code and the translation exception identifier to the pt_regs structure as 'int_code' and 'int_parm_long' and make the first level interrupt handler in entry[64].S store the two values. That makes it possible to drop 'prot_addr' and 'trap_no' from the thread_struct and to reduce the number of arguments to a lot of functions. Finally un-inline do_trap. Overall this saves 5812 bytes in the .text section of the 64 bit kernel. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
638ad34a |
|
30-Oct-2011 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
[S390] sparse: fix sparse warnings about missing prototypes Add prototypes and includes for functions used in different modules. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
b50511e4 |
|
30-Oct-2011 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
[S390] cleanup psw related bits and pieces Split out addressing mode bits from PSW_BASE_BITS, rename PSW_BASE_BITS to PSW_MASK_BASE, get rid of psw_user32_bits, remove unused function enabled_wait(), introduce PSW_MASK_USER, and drop PSW_MASK_MERGE macros. Change psw_kernel_bits / psw_user_bits to contain only the bits that are always set in the respective mode. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
ccf45caf |
|
30-Oct-2011 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
[S390] addressing mode limits and psw address wrapping An instruction with an address right below the adress limit for the current addressing mode will wrap. The instruction restart logic in the protection fault handler and the signal code need to follow the wrapping rules to find the correct instruction address. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
ada5ed54 |
|
03-Aug-2011 |
Mathias Krause <minipli@googlemail.com> |
[S390] exec: remove redundant set_fs(USER_DS) The address limit is already set in flush_old_exec() so those calls to set_fs(USER_DS) are redundant. Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
#
e5992f2e |
|
24-Jul-2011 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
[S390] kvm guest address space mapping Add code that allows KVM to control the virtual memory layout that is seen by a guest. The guest address space uses a second page table that shares the last level pte-tables with the process page table. If a page is unmapped from the process page table it is automatically unmapped from the guest page table as well. The guest address space mapping starts out empty, KVM can map any individual 1MB segments from the process virtual memory to any 1MB aligned location in the guest virtual memory. If a target segment in the process virtual memory does not exist or is unmapped while a guest mapping exists the desired target address is stored as an invalid segment table entry in the guest page table. The population of the guest page table is fault driven. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
f2db2e6c |
|
23-May-2011 |
Heiko Carstens <hca@linux.ibm.com> |
[S390] pfault: cpu hotplug vs missing completion interrupts On cpu hot remove a PFAULT CANCEL command is sent to the hypervisor which in turn will cancel all outstanding pfault requests that have been issued on that cpu (the same happens with a SIGP cpu reset). The result is that we end up with uninterruptible processes where the interrupt that would wake up these processes never arrives. In order to solve this all processes which wait for a pfault completion interrupt get woken up after a cpu hot remove. The worst case that could happen is that they fault again and in turn need to wait again. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
261cd298 |
|
15-Feb-2011 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
s390: remove task_show_regs task_show_regs used to be a debugging aid in the early bringup days of Linux on s390. /proc/<pid>/status is a world readable file, it is not a good idea to show the registers of a process. The only correct fix is to remove task_show_regs. Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
974de4d7 |
|
04-Jan-2011 |
Heiko Carstens <hca@linux.ibm.com> |
[S390] smp: remove cpu hotplug messages Get rid of messages that indicate if a cpu went online or offline. There is nothing special about this anymore and these messages might flood the kernel log buffer which makes debugging harder since more important messages might be overwritten. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
5e9a2692 |
|
04-Jan-2011 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
[S390] ptrace cleanup Overhaul program event recording and the code dealing with the ptrace user space interface. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
ba6cadfe |
|
25-Oct-2010 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
[S390] remove ieee_instruction_pointer from thread_struct The ieee_instruction_pointer can not be read from user space anymore since git commit 613e1def6b52c399a8b72a5e11bc2e57d2546fb8, the ptrace interface always returns zero. Remove it from the thread_struct. It is still present in the user_regs_struct for compatability reasons. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
987bcdac |
|
26-Feb-2010 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
[S390] use inline assembly contraints available with gcc 3.3.3 Drop support to compile the kernel with gcc versions older than 3.3.3. This allows us to use the "Q" inline assembly contraint on some more inline assemblies without duplicating a lot of complex code (e.g. __xchg and __cmpxchg). The distinction for older gcc versions can be removed which saves a few lines and simplifies the code. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
bdd42b28 |
|
22-Sep-2009 |
Heiko Carstens <hca@linux.ibm.com> |
[S390] fix disabled_wait inline assembly clobber list The disabled_wait inline assmembly also clobbers register r1, but it is missing in the clobber list. Fixes recursive Oops on panic. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
e86a6ed6 |
|
11-Sep-2009 |
Heiko Carstens <hca@linux.ibm.com> |
[S390] Get rid of cpuid.h header file. Merge cpuid.h header file into cpu.h. While at it convert from typedef to struct declaration and also convert cio code to use proper lowcore structure instead of casts. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
25097bf1 |
|
14-Apr-2009 |
Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> |
[S390] s390: move machine flags to lowcore Currently the storage of the machine flags is a globally exported unsigned long long variable. By moving the storage location into the lowcore struct we allow assembler code to check machine_flags directly even without needing a register. Addtionally the lowcore and therefore the machine flags too will be in cache most of the time. Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
7b468488 |
|
26-Mar-2009 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
[S390] eliminate cpuinfo_S390 structure Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
f481bfaf |
|
18-Mar-2009 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
[S390] make page table walking more robust Make page table walking on s390 more robust. The current code requires that the pgd/pud/pmd/pte loop is only done for address ranges that are below the end address of the last vma of the address space. But this is not always true, e.g. the generic page table walker does not guarantee this. Change TASK_SIZE/TASK_SIZE_OF to reflect the current size of the address space. This makes the generic page table walker happy but it breaks the upgrade of a 3 level page table to a 4 level page table. To make the upgrade work again another fix is required. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
edd53787 |
|
25-Dec-2008 |
Heiko Carstens <hca@linux.ibm.com> |
[S390] mark disabled_wait as noreturn function disabled_wait() won't return, so add an __attribute__((noreturn)). This will remove a false positive finding which our internal code checker reports. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
c6557e7f |
|
01-Aug-2008 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
[S390] move include/asm-s390 to arch/s390/include/asm Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|