History log of /openbsd-current/sys/arch/amd64/include/fpu.h
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# 1.20 14-Apr-2024 kettenis

Implement support for AVX-512. This required some fixes to the so-far
unused Skylake AVX-512 MDS handler and increases the ci_mds_tmp array to
64 bytes. With help from guenther@

ok deraadt@, guenther@


Revision tags: OPENBSD_7_4_BASE OPENBSD_7_5_BASE
# 1.19 10-Jul-2023 guenther

Enable Indirect Branch Tracking for amd64 userland, using XSAVES/XRSTORS
to save/restore the state and enabling it at exec-time (and for
signal handling) if the PS_NOBTCFI flag isn't set.

Note: this changes the format of the sc_fpstate data in the signal
context to possibly be in compressed format: starting now we just
guarantee that that state is in a format understood by the XRSTOR
instruction of the system that is being executed on.

At this time, passing sigreturn a corrupt sc_fpstate now results
in the process exiting with no attempt to fix it up or send a
T_PROTFLT trap. That may change.

prodding by deraadt@
issues with my original signal handling design identified by kettenis@

lots of base and ports preparation for this by deraadt@ and the
libressl and ports teams

ok deraadt@ kettenis@


# 1.18 22-May-2023 guenther

The fp_ex_[st]w struct savefpu members were inherited from NetBSD where
they're used in the 32bit-compat support, which we dropped years ago.
Bye bye!

ok deraadt@


Revision tags: OPENBSD_6_7_BASE OPENBSD_6_8_BASE OPENBSD_6_9_BASE OPENBSD_7_0_BASE OPENBSD_7_1_BASE OPENBSD_7_2_BASE OPENBSD_7_3_BASE
# 1.17 29-Nov-2019 mortimer

Fix size of reserved bytes section in xsave header.
ok guenther@ kettenis@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.16 07-Oct-2018 guenther

In vmm, handle xsetbv like xrstor: instead of trying to prevalidate
the values, just try it and handle the #GP if it faults.

Problem reported by Maxime Villard (max(at)m00nbsd.net)
ok mlarkin@


# 1.15 24-Jun-2018 guenther

Move signal generation from fputrap() to where it's called in trap()


# 1.14 05-Jun-2018 guenther

Switch from lazy FPU switching to semi-eager FPU switching: track whether
curproc's xstate ("extended state") is loaded in the CPU or not.
- context switch, sendsig(), vmm, and doing CPU crypto in the kernel all
check the flag and, if set, save the old thread's state to the PCB,
clear the flag, and then load the _blank_ state
- when returning to userspace, if the flag is clear then set it and restore
the thread's state

This simpler tracking also fixes the restoring of FPU state after nested
signal handlers.

With this, %cr0's TS flag is never set, the FPU #DNA trap can no
longer happen, and IPIs are no longer necessary for flushing or
syncing FPU state; on the other hand, restoring xstate while returning
to userspace means we have to handle xrstor faulting if we could
be loading an altered state. If that happens, reset the state,
fake a #GP fault (SIGBUS), and recheck for ASTs.

While here, regularize fxsave/fxrstor vs xsave/xrstor handling, by
using codepatching to switch to xsave/xrstor when present in the
CPU. In addition, code patch in use of xsaveopt in most places
when the CPU supports that. Use the 64bit-wide variants of the
instructions in all cases so that x87 instruction fault IPs are
reported correctly.

This change has three motivations:
1) with modern clang, SSE registers are used even in rcrt0.o, making
lazy FPU switching a smaller benefit vs trap costs
2) the Intel SDM warns that lazy FPU switching may increase power costs
3) post-Spectre rumors suggest that the %cr0 TS flag might not block
speculation, permitting leaking of information about FPU state
(AES keys?) across protection boundaries.

tested by many in snaps; prodding from deraadt@


# 1.13 26-May-2018 guenther

Update comment to reflect xsave


Revision tags: OPENBSD_6_2_BASE OPENBSD_6_3_BASE
# 1.12 27-Apr-2017 mlarkin

branches: 1.12.2; 1.12.4;
vmm(4): proper save/restore of FPU context during entry/exit.

tested by reyk, dcoppa, and a few others.

ok kettenis@ on the fpu bits
ok deraadt@ on the vmm bits


Revision tags: OPENBSD_5_8_BASE OPENBSD_5_9_BASE OPENBSD_6_0_BASE OPENBSD_6_1_BASE
# 1.11 25-Mar-2015 kettenis

branches: 1.11.10;
Save/restore AVX registers and other XSAVE-managed state information when
entering/leaving a signal handler like we already do the the FPU and SSE
state. This should make it possible to use AVX instructions in signal
handlers.

ok mlarkin@


# 1.10 21-Mar-2015 kettenis

Add support for saving/restoring FPU state using the XSAVE/XRSTOR. Limit
support to the X87, SSE and AVX state.

This gives us (almost) full AVX support. The AVX state isn't saved by
signal handlers yet, and ptrace(2) support is still missing.

ok guenther@


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE OPENBSD_5_4_BASE OPENBSD_5_5_BASE OPENBSD_5_6_BASE OPENBSD_5_7_BASE
# 1.9 23-Mar-2011 pirofti

Normalize sentinel. Use _MACHINE_*_H_ and _<ARCH>_*_H_ properly and consitently.

Discussed and okay drahn@. Okay deraadt@.


# 1.8 20-Mar-2011 guenther

When reading MXCSR from userland sigcontext or a ptrace request,
mask out invalid bits to prevent a protect fault.

Original diff by joshe@; further feedback and ok kettenis@


Revision tags: OPENBSD_4_9_BASE
# 1.7 20-Nov-2010 miod

__attribute__((packed)) -> __packed. The ioprbs.c chunk was commented out, and
uncommenting it is intentional.
ok deraadt@


# 1.6 29-Sep-2010 joshe

Back out previous, it appears to be broken.


# 1.5 29-Sep-2010 joshe

When reading MXCSR from userland sigcontext, mask out invalid bits.

This prevents a protection fault if a userland signal handler
scribbles all over it's struct sigcontext

Help from and ok guenther@ kettenis@


Revision tags: OPENBSD_4_8_BASE
# 1.4 29-Jun-2010 thib

fpu_kernel_{enter,exit}; Functions to allow the use of
the FPU in the kernel.

From Mike Belopuhov; Little bits by myself.

Comments/OK kettenis@


Revision tags: OPENBSD_4_1_BASE OPENBSD_4_2_BASE OPENBSD_4_3_BASE OPENBSD_4_4_BASE OPENBSD_4_5_BASE OPENBSD_4_6_BASE OPENBSD_4_7_BASE
# 1.3 01-Oct-2006 kettenis

Switch fpu control word to the hardware default. This makes us use 64-bit
precision instead of 53-bit precision, giving us proper support for
"long double".

ok deraadt@


Revision tags: OPENBSD_3_5_BASE OPENBSD_3_6_BASE OPENBSD_3_7_BASE OPENBSD_3_8_BASE OPENBSD_3_9_BASE OPENBSD_4_0_BASE SMP_SYNC_A SMP_SYNC_B
# 1.2 28-Feb-2004 deraadt

rename our NPXCW setting


# 1.1 28-Jan-2004 mickey

branches: 1.1.2;
an amd64 arch support.
hacked by art@ from netbsd sources and then later debugged
by me into the shape where it can host itself.
no bootloader yet as needs redoing from the
recent advanced i386 sources (anyone? ;)


# 1.19 10-Jul-2023 guenther

Enable Indirect Branch Tracking for amd64 userland, using XSAVES/XRSTORS
to save/restore the state and enabling it at exec-time (and for
signal handling) if the PS_NOBTCFI flag isn't set.

Note: this changes the format of the sc_fpstate data in the signal
context to possibly be in compressed format: starting now we just
guarantee that that state is in a format understood by the XRSTOR
instruction of the system that is being executed on.

At this time, passing sigreturn a corrupt sc_fpstate now results
in the process exiting with no attempt to fix it up or send a
T_PROTFLT trap. That may change.

prodding by deraadt@
issues with my original signal handling design identified by kettenis@

lots of base and ports preparation for this by deraadt@ and the
libressl and ports teams

ok deraadt@ kettenis@


# 1.18 22-May-2023 guenther

The fp_ex_[st]w struct savefpu members were inherited from NetBSD where
they're used in the 32bit-compat support, which we dropped years ago.
Bye bye!

ok deraadt@


Revision tags: OPENBSD_6_7_BASE OPENBSD_6_8_BASE OPENBSD_6_9_BASE OPENBSD_7_0_BASE OPENBSD_7_1_BASE OPENBSD_7_2_BASE OPENBSD_7_3_BASE
# 1.17 29-Nov-2019 mortimer

Fix size of reserved bytes section in xsave header.
ok guenther@ kettenis@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.16 07-Oct-2018 guenther

In vmm, handle xsetbv like xrstor: instead of trying to prevalidate
the values, just try it and handle the #GP if it faults.

Problem reported by Maxime Villard (max(at)m00nbsd.net)
ok mlarkin@


# 1.15 24-Jun-2018 guenther

Move signal generation from fputrap() to where it's called in trap()


# 1.14 05-Jun-2018 guenther

Switch from lazy FPU switching to semi-eager FPU switching: track whether
curproc's xstate ("extended state") is loaded in the CPU or not.
- context switch, sendsig(), vmm, and doing CPU crypto in the kernel all
check the flag and, if set, save the old thread's state to the PCB,
clear the flag, and then load the _blank_ state
- when returning to userspace, if the flag is clear then set it and restore
the thread's state

This simpler tracking also fixes the restoring of FPU state after nested
signal handlers.

With this, %cr0's TS flag is never set, the FPU #DNA trap can no
longer happen, and IPIs are no longer necessary for flushing or
syncing FPU state; on the other hand, restoring xstate while returning
to userspace means we have to handle xrstor faulting if we could
be loading an altered state. If that happens, reset the state,
fake a #GP fault (SIGBUS), and recheck for ASTs.

While here, regularize fxsave/fxrstor vs xsave/xrstor handling, by
using codepatching to switch to xsave/xrstor when present in the
CPU. In addition, code patch in use of xsaveopt in most places
when the CPU supports that. Use the 64bit-wide variants of the
instructions in all cases so that x87 instruction fault IPs are
reported correctly.

This change has three motivations:
1) with modern clang, SSE registers are used even in rcrt0.o, making
lazy FPU switching a smaller benefit vs trap costs
2) the Intel SDM warns that lazy FPU switching may increase power costs
3) post-Spectre rumors suggest that the %cr0 TS flag might not block
speculation, permitting leaking of information about FPU state
(AES keys?) across protection boundaries.

tested by many in snaps; prodding from deraadt@


# 1.13 26-May-2018 guenther

Update comment to reflect xsave


Revision tags: OPENBSD_6_2_BASE OPENBSD_6_3_BASE
# 1.12 27-Apr-2017 mlarkin

branches: 1.12.2; 1.12.4;
vmm(4): proper save/restore of FPU context during entry/exit.

tested by reyk, dcoppa, and a few others.

ok kettenis@ on the fpu bits
ok deraadt@ on the vmm bits


Revision tags: OPENBSD_5_8_BASE OPENBSD_5_9_BASE OPENBSD_6_0_BASE OPENBSD_6_1_BASE
# 1.11 25-Mar-2015 kettenis

branches: 1.11.10;
Save/restore AVX registers and other XSAVE-managed state information when
entering/leaving a signal handler like we already do the the FPU and SSE
state. This should make it possible to use AVX instructions in signal
handlers.

ok mlarkin@


# 1.10 21-Mar-2015 kettenis

Add support for saving/restoring FPU state using the XSAVE/XRSTOR. Limit
support to the X87, SSE and AVX state.

This gives us (almost) full AVX support. The AVX state isn't saved by
signal handlers yet, and ptrace(2) support is still missing.

ok guenther@


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE OPENBSD_5_4_BASE OPENBSD_5_5_BASE OPENBSD_5_6_BASE OPENBSD_5_7_BASE
# 1.9 23-Mar-2011 pirofti

Normalize sentinel. Use _MACHINE_*_H_ and _<ARCH>_*_H_ properly and consitently.

Discussed and okay drahn@. Okay deraadt@.


# 1.8 20-Mar-2011 guenther

When reading MXCSR from userland sigcontext or a ptrace request,
mask out invalid bits to prevent a protect fault.

Original diff by joshe@; further feedback and ok kettenis@


Revision tags: OPENBSD_4_9_BASE
# 1.7 20-Nov-2010 miod

__attribute__((packed)) -> __packed. The ioprbs.c chunk was commented out, and
uncommenting it is intentional.
ok deraadt@


# 1.6 29-Sep-2010 joshe

Back out previous, it appears to be broken.


# 1.5 29-Sep-2010 joshe

When reading MXCSR from userland sigcontext, mask out invalid bits.

This prevents a protection fault if a userland signal handler
scribbles all over it's struct sigcontext

Help from and ok guenther@ kettenis@


Revision tags: OPENBSD_4_8_BASE
# 1.4 29-Jun-2010 thib

fpu_kernel_{enter,exit}; Functions to allow the use of
the FPU in the kernel.

From Mike Belopuhov; Little bits by myself.

Comments/OK kettenis@


Revision tags: OPENBSD_4_1_BASE OPENBSD_4_2_BASE OPENBSD_4_3_BASE OPENBSD_4_4_BASE OPENBSD_4_5_BASE OPENBSD_4_6_BASE OPENBSD_4_7_BASE
# 1.3 01-Oct-2006 kettenis

Switch fpu control word to the hardware default. This makes us use 64-bit
precision instead of 53-bit precision, giving us proper support for
"long double".

ok deraadt@


Revision tags: OPENBSD_3_5_BASE OPENBSD_3_6_BASE OPENBSD_3_7_BASE OPENBSD_3_8_BASE OPENBSD_3_9_BASE OPENBSD_4_0_BASE SMP_SYNC_A SMP_SYNC_B
# 1.2 28-Feb-2004 deraadt

rename our NPXCW setting


# 1.1 28-Jan-2004 mickey

branches: 1.1.2;
an amd64 arch support.
hacked by art@ from netbsd sources and then later debugged
by me into the shape where it can host itself.
no bootloader yet as needs redoing from the
recent advanced i386 sources (anyone? ;)


# 1.18 22-May-2023 guenther

The fp_ex_[st]w struct savefpu members were inherited from NetBSD where
they're used in the 32bit-compat support, which we dropped years ago.
Bye bye!

ok deraadt@


Revision tags: OPENBSD_6_7_BASE OPENBSD_6_8_BASE OPENBSD_6_9_BASE OPENBSD_7_0_BASE OPENBSD_7_1_BASE OPENBSD_7_2_BASE OPENBSD_7_3_BASE
# 1.17 29-Nov-2019 mortimer

Fix size of reserved bytes section in xsave header.
ok guenther@ kettenis@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.16 07-Oct-2018 guenther

In vmm, handle xsetbv like xrstor: instead of trying to prevalidate
the values, just try it and handle the #GP if it faults.

Problem reported by Maxime Villard (max(at)m00nbsd.net)
ok mlarkin@


# 1.15 24-Jun-2018 guenther

Move signal generation from fputrap() to where it's called in trap()


# 1.14 05-Jun-2018 guenther

Switch from lazy FPU switching to semi-eager FPU switching: track whether
curproc's xstate ("extended state") is loaded in the CPU or not.
- context switch, sendsig(), vmm, and doing CPU crypto in the kernel all
check the flag and, if set, save the old thread's state to the PCB,
clear the flag, and then load the _blank_ state
- when returning to userspace, if the flag is clear then set it and restore
the thread's state

This simpler tracking also fixes the restoring of FPU state after nested
signal handlers.

With this, %cr0's TS flag is never set, the FPU #DNA trap can no
longer happen, and IPIs are no longer necessary for flushing or
syncing FPU state; on the other hand, restoring xstate while returning
to userspace means we have to handle xrstor faulting if we could
be loading an altered state. If that happens, reset the state,
fake a #GP fault (SIGBUS), and recheck for ASTs.

While here, regularize fxsave/fxrstor vs xsave/xrstor handling, by
using codepatching to switch to xsave/xrstor when present in the
CPU. In addition, code patch in use of xsaveopt in most places
when the CPU supports that. Use the 64bit-wide variants of the
instructions in all cases so that x87 instruction fault IPs are
reported correctly.

This change has three motivations:
1) with modern clang, SSE registers are used even in rcrt0.o, making
lazy FPU switching a smaller benefit vs trap costs
2) the Intel SDM warns that lazy FPU switching may increase power costs
3) post-Spectre rumors suggest that the %cr0 TS flag might not block
speculation, permitting leaking of information about FPU state
(AES keys?) across protection boundaries.

tested by many in snaps; prodding from deraadt@


# 1.13 26-May-2018 guenther

Update comment to reflect xsave


Revision tags: OPENBSD_6_2_BASE OPENBSD_6_3_BASE
# 1.12 27-Apr-2017 mlarkin

branches: 1.12.2; 1.12.4;
vmm(4): proper save/restore of FPU context during entry/exit.

tested by reyk, dcoppa, and a few others.

ok kettenis@ on the fpu bits
ok deraadt@ on the vmm bits


Revision tags: OPENBSD_5_8_BASE OPENBSD_5_9_BASE OPENBSD_6_0_BASE OPENBSD_6_1_BASE
# 1.11 25-Mar-2015 kettenis

branches: 1.11.10;
Save/restore AVX registers and other XSAVE-managed state information when
entering/leaving a signal handler like we already do the the FPU and SSE
state. This should make it possible to use AVX instructions in signal
handlers.

ok mlarkin@


# 1.10 21-Mar-2015 kettenis

Add support for saving/restoring FPU state using the XSAVE/XRSTOR. Limit
support to the X87, SSE and AVX state.

This gives us (almost) full AVX support. The AVX state isn't saved by
signal handlers yet, and ptrace(2) support is still missing.

ok guenther@


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE OPENBSD_5_4_BASE OPENBSD_5_5_BASE OPENBSD_5_6_BASE OPENBSD_5_7_BASE
# 1.9 23-Mar-2011 pirofti

Normalize sentinel. Use _MACHINE_*_H_ and _<ARCH>_*_H_ properly and consitently.

Discussed and okay drahn@. Okay deraadt@.


# 1.8 20-Mar-2011 guenther

When reading MXCSR from userland sigcontext or a ptrace request,
mask out invalid bits to prevent a protect fault.

Original diff by joshe@; further feedback and ok kettenis@


Revision tags: OPENBSD_4_9_BASE
# 1.7 20-Nov-2010 miod

__attribute__((packed)) -> __packed. The ioprbs.c chunk was commented out, and
uncommenting it is intentional.
ok deraadt@


# 1.6 29-Sep-2010 joshe

Back out previous, it appears to be broken.


# 1.5 29-Sep-2010 joshe

When reading MXCSR from userland sigcontext, mask out invalid bits.

This prevents a protection fault if a userland signal handler
scribbles all over it's struct sigcontext

Help from and ok guenther@ kettenis@


Revision tags: OPENBSD_4_8_BASE
# 1.4 29-Jun-2010 thib

fpu_kernel_{enter,exit}; Functions to allow the use of
the FPU in the kernel.

From Mike Belopuhov; Little bits by myself.

Comments/OK kettenis@


Revision tags: OPENBSD_4_1_BASE OPENBSD_4_2_BASE OPENBSD_4_3_BASE OPENBSD_4_4_BASE OPENBSD_4_5_BASE OPENBSD_4_6_BASE OPENBSD_4_7_BASE
# 1.3 01-Oct-2006 kettenis

Switch fpu control word to the hardware default. This makes us use 64-bit
precision instead of 53-bit precision, giving us proper support for
"long double".

ok deraadt@


Revision tags: OPENBSD_3_5_BASE OPENBSD_3_6_BASE OPENBSD_3_7_BASE OPENBSD_3_8_BASE OPENBSD_3_9_BASE OPENBSD_4_0_BASE SMP_SYNC_A SMP_SYNC_B
# 1.2 28-Feb-2004 deraadt

rename our NPXCW setting


# 1.1 28-Jan-2004 mickey

branches: 1.1.2;
an amd64 arch support.
hacked by art@ from netbsd sources and then later debugged
by me into the shape where it can host itself.
no bootloader yet as needs redoing from the
recent advanced i386 sources (anyone? ;)


# 1.17 29-Nov-2019 mortimer

Fix size of reserved bytes section in xsave header.
ok guenther@ kettenis@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.16 07-Oct-2018 guenther

In vmm, handle xsetbv like xrstor: instead of trying to prevalidate
the values, just try it and handle the #GP if it faults.

Problem reported by Maxime Villard (max(at)m00nbsd.net)
ok mlarkin@


# 1.15 24-Jun-2018 guenther

Move signal generation from fputrap() to where it's called in trap()


# 1.14 05-Jun-2018 guenther

Switch from lazy FPU switching to semi-eager FPU switching: track whether
curproc's xstate ("extended state") is loaded in the CPU or not.
- context switch, sendsig(), vmm, and doing CPU crypto in the kernel all
check the flag and, if set, save the old thread's state to the PCB,
clear the flag, and then load the _blank_ state
- when returning to userspace, if the flag is clear then set it and restore
the thread's state

This simpler tracking also fixes the restoring of FPU state after nested
signal handlers.

With this, %cr0's TS flag is never set, the FPU #DNA trap can no
longer happen, and IPIs are no longer necessary for flushing or
syncing FPU state; on the other hand, restoring xstate while returning
to userspace means we have to handle xrstor faulting if we could
be loading an altered state. If that happens, reset the state,
fake a #GP fault (SIGBUS), and recheck for ASTs.

While here, regularize fxsave/fxrstor vs xsave/xrstor handling, by
using codepatching to switch to xsave/xrstor when present in the
CPU. In addition, code patch in use of xsaveopt in most places
when the CPU supports that. Use the 64bit-wide variants of the
instructions in all cases so that x87 instruction fault IPs are
reported correctly.

This change has three motivations:
1) with modern clang, SSE registers are used even in rcrt0.o, making
lazy FPU switching a smaller benefit vs trap costs
2) the Intel SDM warns that lazy FPU switching may increase power costs
3) post-Spectre rumors suggest that the %cr0 TS flag might not block
speculation, permitting leaking of information about FPU state
(AES keys?) across protection boundaries.

tested by many in snaps; prodding from deraadt@


# 1.13 26-May-2018 guenther

Update comment to reflect xsave


Revision tags: OPENBSD_6_2_BASE OPENBSD_6_3_BASE
# 1.12 27-Apr-2017 mlarkin

branches: 1.12.2; 1.12.4;
vmm(4): proper save/restore of FPU context during entry/exit.

tested by reyk, dcoppa, and a few others.

ok kettenis@ on the fpu bits
ok deraadt@ on the vmm bits


Revision tags: OPENBSD_5_8_BASE OPENBSD_5_9_BASE OPENBSD_6_0_BASE OPENBSD_6_1_BASE
# 1.11 25-Mar-2015 kettenis

branches: 1.11.10;
Save/restore AVX registers and other XSAVE-managed state information when
entering/leaving a signal handler like we already do the the FPU and SSE
state. This should make it possible to use AVX instructions in signal
handlers.

ok mlarkin@


# 1.10 21-Mar-2015 kettenis

Add support for saving/restoring FPU state using the XSAVE/XRSTOR. Limit
support to the X87, SSE and AVX state.

This gives us (almost) full AVX support. The AVX state isn't saved by
signal handlers yet, and ptrace(2) support is still missing.

ok guenther@


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE OPENBSD_5_4_BASE OPENBSD_5_5_BASE OPENBSD_5_6_BASE OPENBSD_5_7_BASE
# 1.9 23-Mar-2011 pirofti

Normalize sentinel. Use _MACHINE_*_H_ and _<ARCH>_*_H_ properly and consitently.

Discussed and okay drahn@. Okay deraadt@.


# 1.8 20-Mar-2011 guenther

When reading MXCSR from userland sigcontext or a ptrace request,
mask out invalid bits to prevent a protect fault.

Original diff by joshe@; further feedback and ok kettenis@


Revision tags: OPENBSD_4_9_BASE
# 1.7 20-Nov-2010 miod

__attribute__((packed)) -> __packed. The ioprbs.c chunk was commented out, and
uncommenting it is intentional.
ok deraadt@


# 1.6 29-Sep-2010 joshe

Back out previous, it appears to be broken.


# 1.5 29-Sep-2010 joshe

When reading MXCSR from userland sigcontext, mask out invalid bits.

This prevents a protection fault if a userland signal handler
scribbles all over it's struct sigcontext

Help from and ok guenther@ kettenis@


Revision tags: OPENBSD_4_8_BASE
# 1.4 29-Jun-2010 thib

fpu_kernel_{enter,exit}; Functions to allow the use of
the FPU in the kernel.

From Mike Belopuhov; Little bits by myself.

Comments/OK kettenis@


Revision tags: OPENBSD_4_1_BASE OPENBSD_4_2_BASE OPENBSD_4_3_BASE OPENBSD_4_4_BASE OPENBSD_4_5_BASE OPENBSD_4_6_BASE OPENBSD_4_7_BASE
# 1.3 01-Oct-2006 kettenis

Switch fpu control word to the hardware default. This makes us use 64-bit
precision instead of 53-bit precision, giving us proper support for
"long double".

ok deraadt@


Revision tags: OPENBSD_3_5_BASE OPENBSD_3_6_BASE OPENBSD_3_7_BASE OPENBSD_3_8_BASE OPENBSD_3_9_BASE OPENBSD_4_0_BASE SMP_SYNC_A SMP_SYNC_B
# 1.2 28-Feb-2004 deraadt

rename our NPXCW setting


# 1.1 28-Jan-2004 mickey

branches: 1.1.2;
an amd64 arch support.
hacked by art@ from netbsd sources and then later debugged
by me into the shape where it can host itself.
no bootloader yet as needs redoing from the
recent advanced i386 sources (anyone? ;)


Revision tags: OPENBSD_6_4_BASE
# 1.16 07-Oct-2018 guenther

In vmm, handle xsetbv like xrstor: instead of trying to prevalidate
the values, just try it and handle the #GP if it faults.

Problem reported by Maxime Villard (max(at)m00nbsd.net)
ok mlarkin@


# 1.15 24-Jun-2018 guenther

Move signal generation from fputrap() to where it's called in trap()


# 1.14 05-Jun-2018 guenther

Switch from lazy FPU switching to semi-eager FPU switching: track whether
curproc's xstate ("extended state") is loaded in the CPU or not.
- context switch, sendsig(), vmm, and doing CPU crypto in the kernel all
check the flag and, if set, save the old thread's state to the PCB,
clear the flag, and then load the _blank_ state
- when returning to userspace, if the flag is clear then set it and restore
the thread's state

This simpler tracking also fixes the restoring of FPU state after nested
signal handlers.

With this, %cr0's TS flag is never set, the FPU #DNA trap can no
longer happen, and IPIs are no longer necessary for flushing or
syncing FPU state; on the other hand, restoring xstate while returning
to userspace means we have to handle xrstor faulting if we could
be loading an altered state. If that happens, reset the state,
fake a #GP fault (SIGBUS), and recheck for ASTs.

While here, regularize fxsave/fxrstor vs xsave/xrstor handling, by
using codepatching to switch to xsave/xrstor when present in the
CPU. In addition, code patch in use of xsaveopt in most places
when the CPU supports that. Use the 64bit-wide variants of the
instructions in all cases so that x87 instruction fault IPs are
reported correctly.

This change has three motivations:
1) with modern clang, SSE registers are used even in rcrt0.o, making
lazy FPU switching a smaller benefit vs trap costs
2) the Intel SDM warns that lazy FPU switching may increase power costs
3) post-Spectre rumors suggest that the %cr0 TS flag might not block
speculation, permitting leaking of information about FPU state
(AES keys?) across protection boundaries.

tested by many in snaps; prodding from deraadt@


# 1.13 26-May-2018 guenther

Update comment to reflect xsave


Revision tags: OPENBSD_6_2_BASE OPENBSD_6_3_BASE
# 1.12 27-Apr-2017 mlarkin

branches: 1.12.2; 1.12.4;
vmm(4): proper save/restore of FPU context during entry/exit.

tested by reyk, dcoppa, and a few others.

ok kettenis@ on the fpu bits
ok deraadt@ on the vmm bits


Revision tags: OPENBSD_5_8_BASE OPENBSD_5_9_BASE OPENBSD_6_0_BASE OPENBSD_6_1_BASE
# 1.11 25-Mar-2015 kettenis

branches: 1.11.10;
Save/restore AVX registers and other XSAVE-managed state information when
entering/leaving a signal handler like we already do the the FPU and SSE
state. This should make it possible to use AVX instructions in signal
handlers.

ok mlarkin@


# 1.10 21-Mar-2015 kettenis

Add support for saving/restoring FPU state using the XSAVE/XRSTOR. Limit
support to the X87, SSE and AVX state.

This gives us (almost) full AVX support. The AVX state isn't saved by
signal handlers yet, and ptrace(2) support is still missing.

ok guenther@


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE OPENBSD_5_4_BASE OPENBSD_5_5_BASE OPENBSD_5_6_BASE OPENBSD_5_7_BASE
# 1.9 23-Mar-2011 pirofti

Normalize sentinel. Use _MACHINE_*_H_ and _<ARCH>_*_H_ properly and consitently.

Discussed and okay drahn@. Okay deraadt@.


# 1.8 20-Mar-2011 guenther

When reading MXCSR from userland sigcontext or a ptrace request,
mask out invalid bits to prevent a protect fault.

Original diff by joshe@; further feedback and ok kettenis@


Revision tags: OPENBSD_4_9_BASE
# 1.7 20-Nov-2010 miod

__attribute__((packed)) -> __packed. The ioprbs.c chunk was commented out, and
uncommenting it is intentional.
ok deraadt@


# 1.6 29-Sep-2010 joshe

Back out previous, it appears to be broken.


# 1.5 29-Sep-2010 joshe

When reading MXCSR from userland sigcontext, mask out invalid bits.

This prevents a protection fault if a userland signal handler
scribbles all over it's struct sigcontext

Help from and ok guenther@ kettenis@


Revision tags: OPENBSD_4_8_BASE
# 1.4 29-Jun-2010 thib

fpu_kernel_{enter,exit}; Functions to allow the use of
the FPU in the kernel.

From Mike Belopuhov; Little bits by myself.

Comments/OK kettenis@


Revision tags: OPENBSD_4_1_BASE OPENBSD_4_2_BASE OPENBSD_4_3_BASE OPENBSD_4_4_BASE OPENBSD_4_5_BASE OPENBSD_4_6_BASE OPENBSD_4_7_BASE
# 1.3 01-Oct-2006 kettenis

Switch fpu control word to the hardware default. This makes us use 64-bit
precision instead of 53-bit precision, giving us proper support for
"long double".

ok deraadt@


Revision tags: OPENBSD_3_5_BASE OPENBSD_3_6_BASE OPENBSD_3_7_BASE OPENBSD_3_8_BASE OPENBSD_3_9_BASE OPENBSD_4_0_BASE SMP_SYNC_A SMP_SYNC_B
# 1.2 28-Feb-2004 deraadt

rename our NPXCW setting


# 1.1 28-Jan-2004 mickey

branches: 1.1.2;
an amd64 arch support.
hacked by art@ from netbsd sources and then later debugged
by me into the shape where it can host itself.
no bootloader yet as needs redoing from the
recent advanced i386 sources (anyone? ;)


# 1.15 24-Jun-2018 guenther

Move signal generation from fputrap() to where it's called in trap()


# 1.14 05-Jun-2018 guenther

Switch from lazy FPU switching to semi-eager FPU switching: track whether
curproc's xstate ("extended state") is loaded in the CPU or not.
- context switch, sendsig(), vmm, and doing CPU crypto in the kernel all
check the flag and, if set, save the old thread's state to the PCB,
clear the flag, and then load the _blank_ state
- when returning to userspace, if the flag is clear then set it and restore
the thread's state

This simpler tracking also fixes the restoring of FPU state after nested
signal handlers.

With this, %cr0's TS flag is never set, the FPU #DNA trap can no
longer happen, and IPIs are no longer necessary for flushing or
syncing FPU state; on the other hand, restoring xstate while returning
to userspace means we have to handle xrstor faulting if we could
be loading an altered state. If that happens, reset the state,
fake a #GP fault (SIGBUS), and recheck for ASTs.

While here, regularize fxsave/fxrstor vs xsave/xrstor handling, by
using codepatching to switch to xsave/xrstor when present in the
CPU. In addition, code patch in use of xsaveopt in most places
when the CPU supports that. Use the 64bit-wide variants of the
instructions in all cases so that x87 instruction fault IPs are
reported correctly.

This change has three motivations:
1) with modern clang, SSE registers are used even in rcrt0.o, making
lazy FPU switching a smaller benefit vs trap costs
2) the Intel SDM warns that lazy FPU switching may increase power costs
3) post-Spectre rumors suggest that the %cr0 TS flag might not block
speculation, permitting leaking of information about FPU state
(AES keys?) across protection boundaries.

tested by many in snaps; prodding from deraadt@


# 1.13 26-May-2018 guenther

Update comment to reflect xsave


Revision tags: OPENBSD_6_2_BASE OPENBSD_6_3_BASE
# 1.12 27-Apr-2017 mlarkin

branches: 1.12.2; 1.12.4;
vmm(4): proper save/restore of FPU context during entry/exit.

tested by reyk, dcoppa, and a few others.

ok kettenis@ on the fpu bits
ok deraadt@ on the vmm bits


Revision tags: OPENBSD_5_8_BASE OPENBSD_5_9_BASE OPENBSD_6_0_BASE OPENBSD_6_1_BASE
# 1.11 25-Mar-2015 kettenis

branches: 1.11.10;
Save/restore AVX registers and other XSAVE-managed state information when
entering/leaving a signal handler like we already do the the FPU and SSE
state. This should make it possible to use AVX instructions in signal
handlers.

ok mlarkin@


# 1.10 21-Mar-2015 kettenis

Add support for saving/restoring FPU state using the XSAVE/XRSTOR. Limit
support to the X87, SSE and AVX state.

This gives us (almost) full AVX support. The AVX state isn't saved by
signal handlers yet, and ptrace(2) support is still missing.

ok guenther@


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE OPENBSD_5_4_BASE OPENBSD_5_5_BASE OPENBSD_5_6_BASE OPENBSD_5_7_BASE
# 1.9 23-Mar-2011 pirofti

Normalize sentinel. Use _MACHINE_*_H_ and _<ARCH>_*_H_ properly and consitently.

Discussed and okay drahn@. Okay deraadt@.


# 1.8 20-Mar-2011 guenther

When reading MXCSR from userland sigcontext or a ptrace request,
mask out invalid bits to prevent a protect fault.

Original diff by joshe@; further feedback and ok kettenis@


Revision tags: OPENBSD_4_9_BASE
# 1.7 20-Nov-2010 miod

__attribute__((packed)) -> __packed. The ioprbs.c chunk was commented out, and
uncommenting it is intentional.
ok deraadt@


# 1.6 29-Sep-2010 joshe

Back out previous, it appears to be broken.


# 1.5 29-Sep-2010 joshe

When reading MXCSR from userland sigcontext, mask out invalid bits.

This prevents a protection fault if a userland signal handler
scribbles all over it's struct sigcontext

Help from and ok guenther@ kettenis@


Revision tags: OPENBSD_4_8_BASE
# 1.4 29-Jun-2010 thib

fpu_kernel_{enter,exit}; Functions to allow the use of
the FPU in the kernel.

From Mike Belopuhov; Little bits by myself.

Comments/OK kettenis@


Revision tags: OPENBSD_4_1_BASE OPENBSD_4_2_BASE OPENBSD_4_3_BASE OPENBSD_4_4_BASE OPENBSD_4_5_BASE OPENBSD_4_6_BASE OPENBSD_4_7_BASE
# 1.3 01-Oct-2006 kettenis

Switch fpu control word to the hardware default. This makes us use 64-bit
precision instead of 53-bit precision, giving us proper support for
"long double".

ok deraadt@


Revision tags: OPENBSD_3_5_BASE OPENBSD_3_6_BASE OPENBSD_3_7_BASE OPENBSD_3_8_BASE OPENBSD_3_9_BASE OPENBSD_4_0_BASE SMP_SYNC_A SMP_SYNC_B
# 1.2 28-Feb-2004 deraadt

rename our NPXCW setting


# 1.1 28-Jan-2004 mickey

branches: 1.1.2;
an amd64 arch support.
hacked by art@ from netbsd sources and then later debugged
by me into the shape where it can host itself.
no bootloader yet as needs redoing from the
recent advanced i386 sources (anyone? ;)


Revision tags: OPENBSD_6_2_BASE
# 1.12 27-Apr-2017 mlarkin

vmm(4): proper save/restore of FPU context during entry/exit.

tested by reyk, dcoppa, and a few others.

ok kettenis@ on the fpu bits
ok deraadt@ on the vmm bits


Revision tags: OPENBSD_5_8_BASE OPENBSD_5_9_BASE OPENBSD_6_0_BASE OPENBSD_6_1_BASE
# 1.11 25-Mar-2015 kettenis

branches: 1.11.10;
Save/restore AVX registers and other XSAVE-managed state information when
entering/leaving a signal handler like we already do the the FPU and SSE
state. This should make it possible to use AVX instructions in signal
handlers.

ok mlarkin@


# 1.10 21-Mar-2015 kettenis

Add support for saving/restoring FPU state using the XSAVE/XRSTOR. Limit
support to the X87, SSE and AVX state.

This gives us (almost) full AVX support. The AVX state isn't saved by
signal handlers yet, and ptrace(2) support is still missing.

ok guenther@


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE OPENBSD_5_4_BASE OPENBSD_5_5_BASE OPENBSD_5_6_BASE OPENBSD_5_7_BASE
# 1.9 23-Mar-2011 pirofti

Normalize sentinel. Use _MACHINE_*_H_ and _<ARCH>_*_H_ properly and consitently.

Discussed and okay drahn@. Okay deraadt@.


# 1.8 20-Mar-2011 guenther

When reading MXCSR from userland sigcontext or a ptrace request,
mask out invalid bits to prevent a protect fault.

Original diff by joshe@; further feedback and ok kettenis@


Revision tags: OPENBSD_4_9_BASE
# 1.7 20-Nov-2010 miod

__attribute__((packed)) -> __packed. The ioprbs.c chunk was commented out, and
uncommenting it is intentional.
ok deraadt@


# 1.6 29-Sep-2010 joshe

Back out previous, it appears to be broken.


# 1.5 29-Sep-2010 joshe

When reading MXCSR from userland sigcontext, mask out invalid bits.

This prevents a protection fault if a userland signal handler
scribbles all over it's struct sigcontext

Help from and ok guenther@ kettenis@


Revision tags: OPENBSD_4_8_BASE
# 1.4 29-Jun-2010 thib

fpu_kernel_{enter,exit}; Functions to allow the use of
the FPU in the kernel.

From Mike Belopuhov; Little bits by myself.

Comments/OK kettenis@


Revision tags: OPENBSD_4_1_BASE OPENBSD_4_2_BASE OPENBSD_4_3_BASE OPENBSD_4_4_BASE OPENBSD_4_5_BASE OPENBSD_4_6_BASE OPENBSD_4_7_BASE
# 1.3 01-Oct-2006 kettenis

Switch fpu control word to the hardware default. This makes us use 64-bit
precision instead of 53-bit precision, giving us proper support for
"long double".

ok deraadt@


Revision tags: OPENBSD_3_5_BASE OPENBSD_3_6_BASE OPENBSD_3_7_BASE OPENBSD_3_8_BASE OPENBSD_3_9_BASE OPENBSD_4_0_BASE SMP_SYNC_A SMP_SYNC_B
# 1.2 28-Feb-2004 deraadt

rename our NPXCW setting


# 1.1 28-Jan-2004 mickey

branches: 1.1.2;
an amd64 arch support.
hacked by art@ from netbsd sources and then later debugged
by me into the shape where it can host itself.
no bootloader yet as needs redoing from the
recent advanced i386 sources (anyone? ;)