#
8c09871a |
|
03-Feb-2024 |
Heiko Carstens <hca@linux.ibm.com> |
s390/fpu: limit save and restore to used registers The first invocation of kernel_fpu_begin() after switching from user to kernel context will save all vector registers, even if only parts of the vector registers are used within the kernel fpu context. Given that save and restore of all vector registers is quite expensive change the current approach in several ways: - Instead of saving and restoring all user registers limit this to those registers which are actually used within an kernel fpu context. - On context switch save all remaining user fpu registers, so they can be restored when the task is rescheduled. - Saving user registers within kernel_fpu_begin() is done without disabling and enabling interrupts - which also slightly reduces runtime. In worst case (e.g. interrupt context uses the same registers) this may lead to the situation that registers are saved several times, however the assumption is that this will not happen frequently, so that the new method is faster in nearly all cases. - save_user_fpu_regs() can still be called from all contexts and saves all (or all remaining) user registers to a tasks ufpu user fpu save area. Overall this reduces the time required to save and restore the user fpu context for nearly all cases. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
#
066c4091 |
|
03-Feb-2024 |
Heiko Carstens <hca@linux.ibm.com> |
s390/fpu: decrease stack usage for some cases The kernel_fpu structure has a quite large size of 520 bytes. In order to reduce stack footprint introduce several kernel fpu structures with different and also smaller sizes. This way every kernel fpu user must use the correct variant. A compile time check verifies that the correct variant is used. There are several users which use only 16 instead of all 32 vector registers. For those users the new kernel_fpu_16 structure with a size of only 266 bytes can be used. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
#
bdbd3acb |
|
03-Feb-2024 |
Heiko Carstens <hca@linux.ibm.com> |
s390/fpu: remove anonymous union from struct fpu The anonymous union within struct fpu contains a floating point register array and a vector register array. Given that the vector register is always present remove the floating point register array. For configurations without vector registers save the floating point register contents within their corresponding vector register location. This allows to remove the union, and also to simplify ptrace and perf code. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
#
9cbff7f2 |
|
03-Feb-2024 |
Heiko Carstens <hca@linux.ibm.com> |
s390/fpu: remove regs member from struct fpu KVM was the only user which modified the regs pointer in struct fpu. Remove the pointer and convert the rest of the core fpu code to directly access the save area embedded within struct fpu. Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
#
c038b984 |
|
03-Feb-2024 |
Heiko Carstens <hca@linux.ibm.com> |
s390/fpu: change type of fpu mask from u32 to int Change type of fpu mask consistently from u32 to int. This is a prerequisite to make the kernel fpu usage preemptible. Upcoming code uses __atomic* ops which work with int pointers. Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
#
87c5c700 |
|
03-Feb-2024 |
Heiko Carstens <hca@linux.ibm.com> |
s390/fpu: rename save_fpu_regs() to save_user_fpu_regs(), etc Rename save_fpu_regs(), load_fpu_regs(), and struct thread_struct's fpu member to save_user_fpu_regs(), load_user_fpu_regs(), and ufpu. This way the function and variable names reflect for which context they are supposed to be used. This large and trivial conversion is a prerequisite for making the kernel fpu usage preemptible. Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
#
419abc4d |
|
03-Feb-2024 |
Heiko Carstens <hca@linux.ibm.com> |
s390/fpu: convert FPU CIF flag to regular TIF flag The FPU state, as represented by the CIF_FPU flag reflects the FPU state of a task, not the CPU it is running on. Therefore convert the flag to a regular TIF flag. This removes the magic in switch_to() where a save_fpu_regs() call for the currently (previous) running task sets the per-cpu CIF_FPU flag, which is required to restore FPU register contents of the next task, when it returns to user space. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
#
918c7cad |
|
03-Feb-2024 |
Heiko Carstens <hca@linux.ibm.com> |
s390/fpu: convert __kernel_fpu_begin()/__kernel_fpu_end() to C Convert the rather large __kernel_fpu_begin()/__kernel_fpu_end() inline assemblies to C. The C variant is much more readable, and this also allows to get rid of the non-obvious usage of KERNEL_VXR_* constants within the inline assemblies. E.g. "tmll %[m],6" correlates with the two bits set in KERNEL_VXR_LOW. If the corresponding defines would be changed, the inline assembles would break in a subtle way. Therefore convert to C, use the proper defines, and allow the compiler to generate code using the (hopefully) most efficient instructions. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
#
3a5866a0 |
|
03-Feb-2024 |
Heiko Carstens <hca@linux.ibm.com> |
s390/fpu: provide and use vlm and vstm inline assemblies Instead of open-coding vlm and vstm inline assemblies at several locations, provide an fpu_* function for each instruction, and use them in the new save_vx_regs() and load_vx_regs() helper functions. Note that "O" and "R" inline assembly operand modifiers are used in order to pass the displacement and base register of the memory operands to the existing VLM and VSTM macros. The two operand modifiers are not available for clang. Therefore provide two variants of each inline assembly. The clang variant always uses and clobbers general purpose register 1, like in the previous inline assemblies, so it can be used as base register with a zero displacement. This generates slightly less efficient code, but can be removed as soon as clang has support for the used operand modifiers. Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
#
f4e3de75 |
|
03-Feb-2024 |
Heiko Carstens <hca@linux.ibm.com> |
s390/fpu: provide and use lfpc, sfpc, and stfpc inline assemblies Instead of open-coding lfpc, sfpc, and stfpc inline assemblies at several locations, provide an fpu_* function for each instruction and use the function instead. Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
#
88d8136a |
|
03-Feb-2024 |
Heiko Carstens <hca@linux.ibm.com> |
s390/fpu: provide and use ld and std inline assemblies Deduplicate the 64 ld and std inline assemblies. Provide an fpu inline assembly for both instructions, and use them in the new save_fp_regs() and load_fp_regs() helper functions. Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
#
13a8a519 |
|
03-Feb-2024 |
Heiko Carstens <hca@linux.ibm.com> |
s390/fpu: use lfpc instead of sfpc instruction The only user of sfpc_safe() needs to read the new fpc register value from memory before it is set with sfpc. Avoid this indirection and use lfpc, which reads the new value from memory. Also add the "fpu_" prefix to have a common name space for fpu related inline assemblies, and provide memory access instrumentation. Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
#
fd2527f2 |
|
03-Feb-2024 |
Heiko Carstens <hca@linux.ibm.com> |
s390/fpu: move, rename, and merge header files Move, rename, and merge the fpu and vx header files. This way fpu header files have a consistent naming scheme (fpu*.h). Also get rid of the fpu subdirectory and move header files to asm directory, so that all fpu and vx header files can be found at the same location. Merge internal.h header file into other header files, since the internal helpers are used at many locations. so those helper functions are really not internal. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
#
31d3ec15 |
|
03-Feb-2024 |
Heiko Carstens <hca@linux.ibm.com> |
s390/fpu: various coding style changes Address various checkpatch warnings, adjust whitespace, and try to increase readability. This is just preparation, in order to avoid that subsequent patches contain any distracting drive-by coding style changes. Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
#
b6b842be |
|
03-Feb-2024 |
Heiko Carstens <hca@linux.ibm.com> |
s390/fpu: use KERNEL_VXR_LOW instead of KERNEL_VXR_V0V7 Use KERNEL_VXR_LOW instead of KERNEL_VXR_V0V7 for configurations without vector registers in order to decide if floating point registers need to be saved and restored. Kernel FPU areas which use floating point registers are supposed to use the KERNEL_FPR mask, however users may also open-code this and specify KERNEL_VXR_V0V7 and/or KERNEL_VXR_V8V15. If only KERNEL_VXR_V8V15 is specified floating point registers wouldn't be saved and restored. Improve this and check for both bits. There are currently no users where this would fix a bug. Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
#
74ca8961 |
|
05-Jan-2024 |
Heiko Carstens <hca@linux.ibm.com> |
s390/fpu: remove __load_fpu_regs() export __load_fpu_regs() is only called from core kernel code. Therefore remove the not needed export. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
|
#
18564756 |
|
01-Dec-2023 |
Heiko Carstens <hca@linux.ibm.com> |
s390/fpu: get rid of MACHINE_HAS_VX Get rid of MACHINE_HAS_VX and replace it with cpu_has_vx() which is a short readable wrapper for "test_facility(129)". Facility bit 129 is set if the vector facility is present. test_facility() returns also true for all bits which are set in the architecture level set of the cpu that the kernel is compiled for. This means that test_facility(129) is a compile time constant which returns true for z13 and later, since the vector facility bit is part of the z13 kernel ALS. In result the compiled code will have less runtime checks, and less code. Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
|
#
70264424 |
|
30-Nov-2023 |
Heiko Carstens <hca@linux.ibm.com> |
s390/fpu: get rid of test_fp_ctl() It is quite subtle to use test_fp_ctl() correctly. Therefore remove it - instead copy whatever new floating point control (fpc) register values are supposed to be used into its save area. Test the validity of the new value when loading it. If the new value is invalid, load the fpc register with zero. This seems to be a the best way to approach this problem. Even though this changes behavior: - sigreturn with an invalid fpc value on the stack will succeed, and continue with zero value, instead of returning with SIGSEGV - ptraced processes will also use a zero value instead of letting the request fail with -EINVAL However all of this seems to acceptable. After all testing of the value was only implemented to avoid that user space can crash the kernel. It is not there to test values for validity; and the assumption is that there is no existing user space which is doing this. Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
|
#
706f2ada |
|
01-Dec-2022 |
Heiko Carstens <hca@linux.ibm.com> |
s390/vx: add vx-insn.h wrapper include file The vector instruction macros can also be used in inline assemblies. For this the magic asm(".include \"asm/vx-insn.h\"\n"); must be added to C files in order to avoid that the pre-processor eliminates the __ASSEMBLY__ guarded macros. This however comes with the problem that changes to asm/vx-insn.h do not cause a recompile of C files which have only this magic statement instead of a proper include statement. This can be observed with the arch/s390/kernel/fpu.c file. In order to fix this problem and also to avoid that the include must be specified twice, add a wrapper include header file which will do all necessary steps. This way only the vx-insn.h header file needs to be included and changes to the new vx-insn-asm.h header file cause a recompile of all dependent files like it should. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
|
#
56e62a73 |
|
21-Nov-2020 |
Sven Schnelle <svens@linux.ibm.com> |
s390: convert to generic entry This patch converts s390 to use the generic entry infrastructure from kernel/entry/*. There are a few special things on s390: - PIF_PER_TRAP is moved to TIF_PER_TRAP as the generic code doesn't know about our PIF flags in exit_to_user_mode_loop(). - The old code had several ways to restart syscalls: a) PIF_SYSCALL_RESTART, which was only set during execve to force a restart after upgrading a process (usually qemu-kvm) to pgste page table extensions. b) PIF_SYSCALL, which is set by do_signal() to indicate that the current syscall should be restarted. This is changed so that do_signal() now also uses PIF_SYSCALL_RESTART. Continuing to use PIF_SYSCALL doesn't work with the generic code, and changing it to PIF_SYSCALL_RESTART makes PIF_SYSCALL and PIF_SYSCALL_RESTART more unique. - On s390 calling sys_sigreturn or sys_rt_sigreturn is implemented by executing a svc instruction on the process stack which causes a fault. While handling that fault the fault code sets PIF_SYSCALL to hand over processing to the syscall code on exit to usermode. The patch introduces PIF_SYSCALL_RET_SET, which is set if ptrace sets a return value for a syscall. The s390x ptrace ABI uses r2 both for the syscall number and return value, so ptrace cannot set the syscall number + return value at the same time. The flag makes handling that a bit easier. do_syscall() will just skip executing the syscall if PIF_SYSCALL_RET_SET is set. CONFIG_DEBUG_ASCE was removd in favour of the generic CONFIG_DEBUG_ENTRY. CR1/7/13 will be checked both on kernel entry and exit to contain the correct asces. Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
#
35af0d46 |
|
14-Apr-2019 |
Vasily Gorbik <gor@linux.ibm.com> |
s390: correct some inline assembly constraints Inline assembly code changed in this patch should really use "Q" constraint "Memory reference without index register and with short displacement". The kernel build with kasan instrumentation enabled might occasionally break otherwise (due to stack instrumentation). Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
b2441318 |
|
01-Nov-2017 |
Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
License cleanup: add SPDX GPL-2.0 license identifier to files with no license Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
7f79695c |
|
21-Aug-2016 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
s390/fpu: improve kernel_fpu_[begin|end] In case of nested user of the FPU or vector registers in the kernel the current code uses the mask of the FPU/vector registers of the previous contexts to decide which registers to save and restore. E.g. if the previous context used KERNEL_VXR_V0V7 and the next context wants to use KERNEL_VXR_V24V31 the first 8 vector registers are stored to the FPU state structure. But this is not necessary as the next context does not use these registers. Rework the FPU/vector register save and restore code. The new code does a few things differently: 1) A lowcore field is used instead of a per-cpu variable. 2) The kernel_fpu_end function now has two parameters just like kernel_fpu_begin. The register flags are required by both functions to save / restore the minimal register set. 3) The inline functions kernel_fpu_begin/kernel_fpu_end now do the update of the register masks. If the user space FPU registers have already been stored neither save_fpu_regs nor the __kernel_fpu_begin/__kernel_fpu_end functions have to be called for the first context. In this case kernel_fpu_begin adds 7 instructions and kernel_fpu_end adds 4 instructions. 3) The inline assemblies in __kernel_fpu_begin / __kernel_fpu_end to save / restore the vector registers are simplified a bit. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
#
04864808 |
|
18-Feb-2015 |
Hendrik Brueckner <brueckner@linux.vnet.ibm.com> |
s390/vx: add support functions for in-kernel FPU use Introduce the kernel_fpu_begin() and kernel_fpu_end() function to enclose any in-kernel use of FPU instructions and registers. In enclosed sections, you can perform floating-point or vector (SIMD) computations. The functions take care of saving and restoring FPU register contents and controls. For usage details, see the guidelines in arch/s390/include/asm/fpu/api.h Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|