#
10e4b516 |
|
22-Mar-2024 |
Adamos Ttofari <attofari@amazon.de> |
x86/fpu: Keep xfd_state in sync with MSR_IA32_XFD Commit 672365477ae8 ("x86/fpu: Update XFD state where required") and commit 8bf26758ca96 ("x86/fpu: Add XFD state to fpstate") introduced a per CPU variable xfd_state to keep the MSR_IA32_XFD value cached, in order to avoid unnecessary writes to the MSR. On CPU hotplug MSR_IA32_XFD is reset to the init_fpstate.xfd, which wipes out any stale state. But the per CPU cached xfd value is not reset, which brings them out of sync. As a consequence a subsequent xfd_update_state() might fail to update the MSR which in turn can result in XRSTOR raising a #NM in kernel space, which crashes the kernel. To fix this, introduce xfd_set_state() to write xfd_state together with MSR_IA32_XFD, and use it in all places that set MSR_IA32_XFD. Fixes: 672365477ae8 ("x86/fpu: Update XFD state where required") Signed-off-by: Adamos Ttofari <attofari@amazon.de> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20240322230439.456571-1-chang.seok.bae@intel.com Closes: https://lore.kernel.org/lkml/20230511152818.13839-1-attofari@amazon.de
|
#
18164f66 |
|
27-Sep-2023 |
Sean Christopherson <seanjc@google.com> |
x86/fpu: Allow caller to constrain xfeatures when copying to uabi buffer Plumb an xfeatures mask into __copy_xstate_to_uabi_buf() so that KVM can constrain which xfeatures are saved into the userspace buffer without having to modify the user_xfeatures field in KVM's guest_fpu state. KVM's ABI for KVM_GET_XSAVE{2} is that features that are not exposed to guest must not show up in the effective xstate_bv field of the buffer. Saving only the guest-supported xfeatures allows userspace to load the saved state on a different host with a fewer xfeatures, so long as the target host supports the xfeatures that are exposed to the guest. KVM currently sets user_xfeatures directly to restrict KVM_GET_XSAVE{2} to the set of guest-supported xfeatures, but doing so broke KVM's historical ABI for KVM_SET_XSAVE, which allows userspace to load any xfeatures that are supported by the *host*. Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20230928001956.924301-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
1c813ce0 |
|
15-Nov-2022 |
Kyle Huey <me@kylehuey.com> |
x86/fpu: Add a pkru argument to copy_uabi_from_kernel_to_xstate(). Both KVM (through KVM_SET_XSTATE) and ptrace (through PTRACE_SETREGSET with NT_X86_XSTATE) ultimately call copy_uabi_from_kernel_to_xstate(), but the canonical locations for the current PKRU value for KVM guests and processes in a ptrace stop are different (in the kvm_vcpu_arch and the thread_state structs respectively). In preparation for eventually handling PKRU in copy_uabi_to_xstate, pass in a pointer to the PKRU location. Signed-off-by: Kyle Huey <me@kylehuey.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/all/20221115230932.7126-3-khuey%40kylehuey.com
|
#
6a877d24 |
|
15-Nov-2022 |
Kyle Huey <me@kylehuey.com> |
x86/fpu: Take task_struct* in copy_sigframe_from_user_to_xstate() This will allow copy_sigframe_from_user_to_xstate() to grab the address of thread_struct's pkru value in a later patch. Signed-off-by: Kyle Huey <me@kylehuey.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/all/20221115230932.7126-2-khuey%40kylehuey.com
|
#
8ad7e8f6 |
|
04-Apr-2022 |
Thomas Gleixner <tglx@linutronix.de> |
x86/fpu/xsave: Support XSAVEC in the kernel XSAVEC is the user space counterpart of XSAVES which cannot save supervisor state. In virtualization scenarios the hypervisor does not expose XSAVES but XSAVEC to the guest, though the kernel does not make use of it. That's unfortunate because XSAVEC uses the compacted format of saving the XSTATE. This is more efficient in terms of storage space vs. XSAVE[OPT] as it does not create holes for XSTATE components which are not supported or enabled by the kernel but are available in hardware. There is room for further optimizations when XSAVEC/S and XGETBV1 are supported. In order to support XSAVEC: - Define the XSAVEC ASM macro as it's not yet supported by the required minimal toolchain. - Create a software defined X86_FEATURE_XCOMPACTED to select the compacted XSTATE buffer format for both XSAVEC and XSAVES. - Make XSAVEC an option in the 'XSAVE' ASM alternatives Requested-by: Andrew Cooper <Andrew.Cooper3@citrix.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20220404104820.598704095@linutronix.de
|
#
c270ce39 |
|
05-Jan-2022 |
Thomas Gleixner <tglx@linutronix.de> |
x86/fpu: Add guest support to xfd_enable_feature() Guest support for dynamically enabled FPU features requires a few modifications to the enablement function which is currently invoked from the #NM handler: 1) Use guest permissions and sizes for the update 2) Update fpu_guest state accordingly 3) Take into account that the enabling can be triggered either from a running guest via XSETBV and MSR_IA32_XFD write emulation or from a guest restore. In the latter case the guests fpstate is not the current tasks active fpstate. Split the function and implement the guest mechanics throughout the callchain. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Jing Liu <jing2.liu@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20220105123532.12586-7-yang.zhong@intel.com> [Add 32-bit stub for __xfd_enable_feature. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
980fe2fd |
|
05-Jan-2022 |
Thomas Gleixner <tglx@linutronix.de> |
x86/fpu: Extend fpu_xstate_prctl() with guest permissions KVM requires a clear separation of host user space and guest permissions for dynamic XSTATE components. Add a guest permissions member to struct fpu and a separate set of prctl() arguments: ARCH_GET_XCOMP_GUEST_PERM and ARCH_REQ_XCOMP_GUEST_PERM. The semantics are equivalent to the host user space permission control except for the following constraints: 1) Permissions have to be requested before the first vCPU is created 2) Permissions are frozen when the first vCPU is created to ensure consistency. Any attempt to expand permissions via the prctl() after that point is rejected. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Jing Liu <jing2.liu@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20220105123532.12586-2-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
1c3b9091 |
|
10-Nov-2021 |
Peter Zijlstra <peterz@infradead.org> |
x86/fpu: Remove .fixup usage Employ EX_TYPE_EFAULT_REG to store '-EFAULT' into the %[err] register on exception. All the callers only ever test for 0, so the change from -1 to -EFAULT is immaterial. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/20211110101325.604494664@infradead.org
|
#
30d02551 |
|
02-Nov-2021 |
Dave Hansen <dave.hansen@linux.intel.com> |
x86/fpu: Optimize out sigframe xfeatures when in init state tl;dr: AMX state is ~8k. Signal frames can have space for this ~8k and each signal entry writes out all 8k even if it is zeros. Skip writing zeros for AMX to speed up signal delivery by about 4% overall when AMX is in its init state. This is a user-visible change to the sigframe ABI. == Hardware XSAVE Background == XSAVE state components may be tracked by the processor as being in their initial configuration. Software can detect which features are in this configuration by looking at the XSTATE_BV field in an XSAVE buffer or with the XGETBV(1) instruction. Both the XSAVE and XSAVEOPT instructions enumerate features s being in the initial configuration via the XSTATE_BV field in the XSAVE header, However, XSAVEOPT declines to actually write features in their initial configuration to the buffer. XSAVE writes the feature unconditionally, regardless of whether it is in the initial configuration or not. Basically, XSAVE users never need to inspect XSTATE_BV to determine if the feature has been written to the buffer. XSAVEOPT users *do* need to inspect XSTATE_BV. They might also need to clear out the buffer if they want to make an isolated change to the state, like modifying one register. == Software Signal / XSAVE Background == Signal frames have historically been written with XSAVE itself. Each state is written in its entirety, regardless of being in its initial configuration. In other words, the signal frame ABI uses the XSAVE behavior, not the XSAVEOPT behavior. == Problem == This means that any application which has acquired permission to use AMX via ARCH_REQ_XCOMP_PERM will write 8k of state to the signal frame. This 8k write will occur even when AMX was in its initial configuration and software *knows* this because of XSTATE_BV. This problem also exists to a lesser degree with AVX-512 and its 2k of state. However, AVX-512 use does not require ARCH_REQ_XCOMP_PERM and is more likely to have existing users which would be impacted by any change in behavior. == Solution == Stop writing out AMX xfeatures which are in their initial state to the signal frame. This effectively makes the signal frame XSAVE buffer look as if it were written with a combination of XSAVEOPT and XSAVE behavior. Userspace which handles XSAVEOPT- style buffers should be able to handle this naturally. For now, include only the AMX xfeatures: XTILE and XTILEDATA in this new behavior. These require new ABI to use anyway, which makes their users very unlikely to be broken. This XSAVEOPT-like behavior should be expected for all future dynamic xfeatures. It may also be extended to legacy features like AVX-512 in the future. Only attempt this optimization on systems with dynamic features. Disable dynamic feature support (XFD) if XGETBV1 is unavailable by adding a CPUID dependency. This has been measured to reduce the *overall* cycle cost of signal delivery by about 4%. Fixes: 2308ee57d93d ("x86/fpu/amx: Enable the AMX feature in 64-bit mode") Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: "Chang S. Bae" <chang.seok.bae@intel.com> Link: https://lore.kernel.org/r/20211102224750.FA412E26@davehans-spike.ostc.intel.com
|
#
67236547 |
|
21-Oct-2021 |
Chang S. Bae <chang.seok.bae@intel.com> |
x86/fpu: Update XFD state where required The IA32_XFD_MSR allows to arm #NM traps for XSTATE components which are enabled in XCR0. The register has to be restored before the tasks XSTATE is restored. The life time rules are the same as for FPU state. XFD is updated on return to userspace only when the FPU state of the task is not up to date in the registers. It's updated before the XRSTORS so that eventually enabled dynamic features are restored as well and not brought into init state. Also in signal handling for restoring FPU state from user space the correctness of the XFD state has to be ensured. Add it to CPU initialization and resume as well. Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20211021225527.10184-17-chang.seok.bae@intel.com
|
#
5529acf4 |
|
21-Oct-2021 |
Thomas Gleixner <tglx@linutronix.de> |
x86/fpu: Add sanity checks for XFD Add debug functionality to ensure that the XFD MSR is up to date for XSAVE* and XRSTOR* operations. [ tglx: Improve comment. ] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211021225527.10184-16-chang.seok.bae@intel.com
|
#
8bf26758 |
|
21-Oct-2021 |
Chang S. Bae <chang.seok.bae@intel.com> |
x86/fpu: Add XFD state to fpstate Add storage for XFD register state to struct fpstate. This will be used to store the XFD MSR state. This will be used for switching the XFD MSR when FPU content is restored. Add a per-CPU variable to cache the current MSR value so the MSR has only to be written when the values are different. Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211021225527.10184-15-chang.seok.bae@intel.com
|
#
db8268df |
|
21-Oct-2021 |
Chang S. Bae <chang.seok.bae@intel.com> |
x86/arch_prctl: Add controls for dynamic XSTATE components Dynamically enabled XSTATE features are by default disabled for all processes. A process has to request permission to use such a feature. To support this implement a architecture specific prctl() with the options: - ARCH_GET_XCOMP_SUPP Copies the supported feature bitmap into the user space provided u64 storage. The pointer is handed in via arg2 - ARCH_GET_XCOMP_PERM Copies the process wide permitted feature bitmap into the user space provided u64 storage. The pointer is handed in via arg2 - ARCH_REQ_XCOMP_PERM Request permission for a feature set. A feature set can be mapped to a facility, e.g. AMX, and can require one or more XSTATE components to be enabled. The feature argument is the number of the highest XSTATE component which is required for a facility to work. The request argument is not a user supplied bitmap because that makes filtering harder (think seccomp) and even impossible because to support 32bit tasks the argument would have to be a pointer. The permission mechanism works this way: Task asks for permission for a facility and kernel checks whether that's supported. If supported it does: 1) Check whether permission has already been granted 2) Compute the size of the required kernel and user space buffer (sigframe) size. 3) Validate that no task has a sigaltstack installed which is smaller than the resulting sigframe size 4) Add the requested feature bit(s) to the permission bitmap of current->group_leader->fpu and store the sizes in the group leaders fpu struct as well. If that is successful then the feature is still not enabled for any of the tasks. The first usage of a related instruction will result in a #NM trap. The trap handler validates the permission bit of the tasks group leader and if permitted it installs a larger kernel buffer and transfers the permission and size info to the new fpstate container which makes all the FPU functions which require per task information aware of the extended feature set. [ tglx: Adopted to new base code, added missing serialization, massaged namings, comments and changelog ] Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211021225527.10184-7-chang.seok.bae@intel.com
|
#
d72c8701 |
|
14-Oct-2021 |
Thomas Gleixner <tglx@linutronix.de> |
x86/fpu/xstate: Move remaining xfeature helpers to core Now that everything is mopped up, move all the helpers and prototypes into the core header. They are not required by the outside. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211014230739.514095101@linutronix.de
|
#
2bd264bc |
|
14-Oct-2021 |
Thomas Gleixner <tglx@linutronix.de> |
x86/fpu: Move xstate size to fpu_*_cfg Use the new kernel and user space config storage to store and retrieve the XSTATE buffer sizes. The default and the maximum size are the same for now, but will change when support for dynamically enabled features is added. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211014230739.296830097@linutronix.de
|
#
49e4eb41 |
|
13-Oct-2021 |
Thomas Gleixner <tglx@linutronix.de> |
x86/fpu/xstate: Use fpstate for copy_uabi_to_xstate() Prepare for dynamically enabled states per task. The function needs to retrieve the features and sizes which are valid in a fpstate context. Retrieve them from fpstate. Move the function declarations to the core header as they are not required anywhere else. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211013145323.233529986@linutronix.de
|
#
3ac8d757 |
|
13-Oct-2021 |
Thomas Gleixner <tglx@linutronix.de> |
x86/fpu: Use fpstate in __copy_xstate_to_uabi_buf() With dynamically enabled features the copy function must know the features and the size which is valid for the task. Retrieve them from fpstate. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211013145323.181495492@linutronix.de
|
#
0b2d39aa |
|
13-Oct-2021 |
Thomas Gleixner <tglx@linutronix.de> |
x86/fpu/xstate: Use fpstate for xsave_to_user_sigframe() With dynamically enabled features the sigframe code must know the features which are enabled for the task. Get them from fpstate. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211013145323.077781448@linutronix.de
|
#
073e627a |
|
13-Oct-2021 |
Thomas Gleixner <tglx@linutronix.de> |
x86/fpu/xstate: Use fpstate for os_xsave() With variable feature sets XSAVE[S] requires to know the feature set for which the buffer is valid. Retrieve it from fpstate. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211013145323.025695590@linutronix.de
|
#
087df48c |
|
13-Oct-2021 |
Thomas Gleixner <tglx@linutronix.de> |
x86/fpu: Replace KVMs xstate component clearing In order to prepare for the support of dynamically enabled FPU features, move the clearing of xstate components to the FPU core code. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: kvm@vger.kernel.org Link: https://lkml.kernel.org/r/20211013145322.399567049@linutronix.de
|
#
6415bb80 |
|
14-Oct-2021 |
Thomas Gleixner <tglx@linutronix.de> |
x86/fpu: Mop up the internal.h leftovers Move the global interfaces to api.h and the rest into the core. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211015011539.948837194@linutronix.de
|
#
df95b0f1 |
|
14-Oct-2021 |
Thomas Gleixner <tglx@linutronix.de> |
x86/fpu: Move os_xsave() and os_xrstor() to core Nothing outside the core code needs these. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211015011539.513368075@linutronix.de
|
#
ca834def |
|
14-Oct-2021 |
Thomas Gleixner <tglx@linutronix.de> |
x86/fpu: Rework copy_xstate_to_uabi_buf() Prepare for replacing the KVM copy xstate to user function by extending copy_xstate_to_uabi_buf() with a pkru argument which allows the caller to hand in the pkru value, which is required for KVM because the guest PKRU is not accessible via current. Fixup all callsites accordingly. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211015011539.191902137@linutronix.de
|
#
126fe040 |
|
14-Oct-2021 |
Thomas Gleixner <tglx@linutronix.de> |
x86/fpu: Cleanup xstate xcomp_bv initialization No point in having this duplicated all over the place with needlessly different defines. Provide a proper initialization function which initializes user buffers properly and make KVM use it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211015011538.897664678@linutronix.de
|