History log of /linux-master/arch/x86/include/asm/paravirt_types.h
Revision Date Author Comments
# a5a0abfd 15-Dec-2023 Kent Overstreet <kent.overstreet@linux.dev>

x86: fix missing includes/forward declarations

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# f7af6977 09-Dec-2023 Juergen Gross <jgross@suse.com>

x86/paravirt: Remove no longer needed paravirt patching code

Now that paravirt is using the alternatives patching infrastructure,
remove the paravirt patching code.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20231210062138.2417-6-jgross@suse.com


# 60bc276b 09-Dec-2023 Juergen Gross <jgross@suse.com>

x86/paravirt: Switch mixed paravirt/alternative calls to alternatives

Instead of stacking alternative and paravirt patching, use the new
ALT_FLAG_CALL flag to switch those mixed calls to pure alternative
handling.

Eliminate the need to be careful regarding the sequence of alternative
and paravirt patching.

[ bp: Touch up commit message. ]

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20231210062138.2417-5-jgross@suse.com


# 9824b00c 29-Nov-2023 Juergen Gross <jgross@suse.com>

x86/paravirt: Move some functions and defines to alternative.c

As a preparation for replacing paravirt patching completely by
alternative patching, move some backend functions and #defines to
the alternatives code and header.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20231129133332.31043-3-jgross@suse.com


# 772ca413 29-Nov-2023 Juergen Gross <jgross@suse.com>

x86/paravirt: Introduce ALT_NOT_XEN

Introduce the macro ALT_NOT_XEN as a short form of
ALT_NOT(X86_FEATURE_XENPV).

No functional changes.

Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20231129133332.31043-2-jgross@suse.com


# fe22bc43 09-Jun-2023 Hou Wenlong <houwenlong.hwl@antgroup.com>

x86/paravirt: Make the struct paravirt_patch_site packed

Similar to struct alt_instr, make the struct paravirt_patch_site packed
and get rid of all the .align directives and save 2 bytes for one
PARA_SITE entry on X86_64.

[ bp: Massage commit message. ]

Suggested-by: Nadav Amit <namit@vmware.com>
Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/6dcb20159ded36586c5f7f2ae159e4e030256627.1686301237.git.houwenlong.hwl@antgroup.com


# 5c22c472 09-Jun-2023 Hou Wenlong <houwenlong.hwl@antgroup.com>

x86/paravirt: Use relative reference for the original instruction offset

Similar to the alternative patching, use a relative reference for original
instruction offset rather than absolute one, which saves 8 bytes for one
PARA_SITE entry on x86_64. As a result, a R_X86_64_PC32 relocation is
generated instead of an R_X86_64_64 one, which also reduces relocation
metadata on relocatable builds. Hardcode the alignment to 4 now.

[ bp: Massage commit message. ]

Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/9e6053107fbaabc0d33e5d2865c5af2c67ec9925.1686301237.git.houwenlong.hwl@antgroup.com


# a4a7644c 13-Sep-2023 Juergen Gross <jgross@suse.com>

x86/xen: move paravirt lazy code

Only Xen is using the paravirt lazy mode code, so it can be moved to
Xen specific sources.

This allows to make some of the functions static or to merge them into
their only call sites.

While at it do a rename from "paravirt" to "xen" for all moved
specifiers.

No functional change.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20230913113828.18421-3-jgross@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>


# 11af36cb 17-Mar-2023 Juergen Gross <jgross@suse.com>

x86/paravirt: Convert simple paravirt functions to asm

All functions referenced via __PV_IS_CALLEE_SAVE() need to be assembler
functions, as those functions calls are hidden from the compiler.

In case the kernel is compiled with "-fzero-call-used-regs" the compiler
will clobber caller-saved registers at the end of C functions, which
will result in unexpectedly zeroed registers at the call site of the
related paravirt functions.

Replace the C functions with DEFINE_PARAVIRT_ASM() constructs using
the same instructions as the related paravirt calls in the
PVOP_ALT_[V]CALLEE*() macros. And since they're not C functions visible
to the compiler anymore, latter won't do the callee-clobbered zeroing
invoked by -fzero-call-used-regs and thus won't corrupt registers.

[ bp: Extend commit message. ]

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230317063325.361-1-jgross@suse.com


# c9ae1b10 07-Feb-2023 Juergen Gross <jgross@suse.com>

x86/paravirt: Merge activate_mm() and dup_mmap() callbacks

The two paravirt callbacks .mmu.activate_mm() and .mmu.dup_mmap() are
sharing the same implementations in all cases: for Xen PV guests they
are pinning the PGD of the new mm_struct, and for all other cases they
are a NOP.

In the end, both callbacks are meant to register an address space with
the underlying hypervisor, so there needs to be only a single callback
for that purpose.

So merge them to a common callback .mmu.enter_mmap() (in contrast to the
corresponding already existing .mmu.exit_mmap()).

As the first parameter of the old callbacks isn't used, drop it from the
replacement one.

[ bp: Remove last occurrence of paravirt_activate_mm() in
asm/mmu_context.h ]

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Srivatsa S. Bhat (VMware) <srivatsa@csail.mit.edu>
Link: https://lore.kernel.org/r/20230207075902.7539-1-jgross@suse.com


# 5736b1b7 03-Sep-2022 Kees Cook <keescook@chromium.org>

x86/paravirt: Remove clobber bitmask from .parainstructions

The u16 "clobber" value is not used in .parainstructions since commit
27876f3882fd ("x86/paravirt: Remove clobbers from struct paravirt_patch_site")

Remove the u16 from the section macro, the argument from all macros, and
all now-unused CLBR_* macros.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20220903073706.3193746-1-keescook@chromium.org


# dbf53a29 19-Oct-2022 Borislav Petkov <bp@alien8.de>

x86/paravirt: Fix a !PARAVIRT build warning

Fix

./include/trace/events/xen.h:28:31: warning: ‘enum paravirt_lazy_mode’ \
declared inside parameter list will not be visible outside of this definition or declaration

which turns into a build error:

./include/trace/events/xen.h:28:50: error: parameter 1 (‘mode’) has incomplete type
28 | TP_PROTO(enum paravirt_lazy_mode mode), \

due to enum paravirt_lazy_mode being visible only under CONFIG_PARAVIRT.
Just pull it up where it is unconditionally visible.

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/Y1AtAXM8YjtBm2cj@zn.tnic


# 239f2e24 15-Sep-2022 Thomas Gleixner <tglx@linutronix.de>

x86/paravirt: Make struct paravirt_call_site unconditionally available

For the upcoming call thunk patching it's less ifdeffery when the data
structure is unconditionally available. The code can then be trivially
fenced off with IS_ENABLED().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220915111147.367853167@infradead.org


# 00e8f715 03-Sep-2022 Kees Cook <keescook@chromium.org>

x86/paravirt: Remove clobber bitmask from .parainstructions

The u16 "clobber" value is not used in .parainstructions since commit
27876f3882fd ("x86/paravirt: Remove clobbers from struct paravirt_patch_site")

Remove the u16 from the section macro, the argument from all macros, and
all now-unused CLBR_* macros.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20220903073706.3193746-1-keescook@chromium.org


# f92ff8f5 15-Sep-2022 Thomas Gleixner <tglx@linutronix.de>

x86/paravirt: Ensure proper alignment

The entries in the .parainstructions sections are 8 byte aligned and the
corresponding C struct paravirt_patch_site makes the array offset 16
bytes.

Though the pushed entries are only using 12 bytes, __parainstructions_end
is therefore 4 bytes short.

That works by chance because it's only used in a loop:

for (p = start; p < end; p++)

But this falls flat when calculating the number of elements:

n = end - start

That's obviously off by one.

Ensure that the gap is filled and the last entry is occupying 16 bytes.

[ bp: Add the proper struct and section names. ]

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20220915111142.992398801@infradead.org


# 8c86f29b 02-Sep-2022 Bill Wendling <morbo@google.com>

x86/paravirt: add extra clobbers with ZERO_CALL_USED_REGS enabled

The ZERO_CALL_USED_REGS feature may zero out caller-saved registers
before returning.

In spurious_kernel_fault(), the "pte_offset_kernel()" call results in
this assembly code:

.Ltmp151:
#APP
# ALT: oldnstr
.Ltmp152:
.Ltmp153:
.Ltmp154:
.section .discard.retpoline_safe,"",@progbits
.quad .Ltmp154
.text

callq *pv_ops+536(%rip)

.Ltmp155:
.section .parainstructions,"a",@progbits
.p2align 3, 0x0
.quad .Ltmp153
.byte 67
.byte .Ltmp155-.Ltmp153
.short 1
.text
.Ltmp156:
# ALT: padding
.zero (-(((.Ltmp157-.Ltmp158)-(.Ltmp156-.Ltmp152))>0))*((.Ltmp157-.Ltmp158)-(.Ltmp156-.Ltmp152)),144
.Ltmp159:
.section .altinstructions,"a",@progbits
.Ltmp160:
.long .Ltmp152-.Ltmp160
.Ltmp161:
.long .Ltmp158-.Ltmp161
.short 33040
.byte .Ltmp159-.Ltmp152
.byte .Ltmp157-.Ltmp158
.text

.section .altinstr_replacement,"ax",@progbits
# ALT: replacement 1
.Ltmp158:
movq %rdi, %rax
.Ltmp157:
.text
#NO_APP
.Ltmp162:
testb $-128, %dil

The "testb" here is using %dil, but the %rdi register was cleared before
returning from "callq *pv_ops+536(%rip)". Adding the proper constraints
results in the use of a different register:

movq %r11, %rdi

# Similar to above.

testb $-128, %r11b

Link: https://github.com/KSPP/linux/issues/192
Signed-off-by: Bill Wendling <morbo@google.com>
Reported-and-tested-by: Nathan Chancellor <nathan@kernel.org>
Fixes: 035f7f87b729 ("randstruct: Enable Clang support")
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/lkml/fa6df43b-8a1a-8ad1-0236-94d2a0b588fa@suse.com/
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220902213750.1124421-3-morbo@google.com


# f67b90be 02-Sep-2022 Bill Wendling <morbo@google.com>

x86/paravirt: clean up typos and grammaros

Drive-by clean up of the comment.

[ Impact: cleanup]

Signed-off-by: Bill Wendling <morbo@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220902213750.1124421-2-morbo@google.com


# 8b87d8ce 08-Mar-2022 Peter Zijlstra <peterz@infradead.org>

x86/entry,xen: Early rewrite of restore_regs_and_return_to_kernel()

By doing an early rewrite of 'jmp native_iret` in
restore_regs_and_return_to_kernel() we can get rid of the last
INTERRUPT_RETURN user and paravirt_iret.

Suggested-by: Andrew Cooper <Andrew.Cooper3@citrix.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.815039833@infradead.org


# 2e1f8e55 30-Sep-2021 Jan Beulich <jbeulich@suse.com>

x86/paravirt: Use %rip-relative addressing in hook calls

While using a plain (constant) address works, its use needlessly invokes
a SIB addressing mode, making every call site one byte larger than
necessary:

ff 14 25 98 89 42 82 call *0xffffffff82428998

Instead of using an "i" constraint with address-of operator and a 'c'
operand modifier, simply use an ordinary "m" constraint, which the
64-bit compiler will translate to %rip-relative addressing:

ff 15 62 fb d2 00 call *0xd2fb62(%rip) # ffffffff82428998 <pv_ops+0x18>

This way the compiler is also told the truth about operand usage - the
memory location gets actually read, after all.

32-bit code generation is unaffected by the change.

[ bp: Remove "we", add examples. ]

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/b8192e8a-13ef-6ac6-6364-8ba58992cd1d@suse.com


# 064ce6c5 24-Aug-2021 Brijesh Singh <brijesh.singh@amd.com>

mm: x86: Invoke hypercall when page encryption status is changed

Invoke a hypercall when a memory region is changed from encrypted ->
decrypted and vice versa. Hypervisor needs to know the page encryption
status during the guest migration.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: x86@kernel.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Steve Rutherford <srutherford@google.com>
Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Message-Id: <0a237d5bb08793916c7790a3e653a2cbe7485761.1629726117.git.ashish.kalra@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>


# e453f872 28-Oct-2021 Juergen Gross <jgross@suse.com>

x86/xen: switch initial pvops IRQ functions to dummy ones

The initial pvops functions handling irq flags will only ever be called
before interrupts are being enabled.

So switch them to be dummy functions:
- xen_save_fl() can always return 0
- xen_irq_disable() is a nop
- xen_irq_enable() can BUG()

Add some generic paravirt functions for that purpose.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20211028072748.29862-3-jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>


# 4ce94eab 20-Feb-2021 Nadav Amit <namit@vmware.com>

x86/mm/tlb: Flush remote and local TLBs concurrently

To improve TLB shootdown performance, flush the remote and local TLBs
concurrently. Introduce flush_tlb_multi() that does so. Introduce
paravirtual versions of flush_tlb_multi() for KVM, Xen and hyper-v (Xen
and hyper-v are only compile-tested).

While the updated smp infrastructure is capable of running a function on
a single local core, it is not optimized for this case. The multiple
function calls and the indirect branch introduce some overhead, and
might make local TLB flushes slower than they were before the recent
changes.

Before calling the SMP infrastructure, check if only a local TLB flush
is needed to restore the lost performance in this common case. This
requires to check mm_cpumask() one more time, but unless this mask is
updated very frequently, this should impact performance negatively.

Signed-off-by: Nadav Amit <namit@vmware.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Michael Kelley <mikelley@microsoft.com> # Hyper-v parts
Reviewed-by: Juergen Gross <jgross@suse.com> # Xen and paravirt parts
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/r/20210220231712.2475218-5-namit@vmware.com


# d9f6e12f 18-Mar-2021 Ingo Molnar <mingo@kernel.org>

x86: Fix various typos in comments

Fix ~144 single-word typos in arch/x86/ code comments.

Doing this in a single commit should reduce the churn.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: linux-kernel@vger.kernel.org


# 054ac8ad 11-Mar-2021 Juergen Gross <jgross@suse.com>

x86/paravirt: Have only one paravirt patch function

There is no need any longer to have different paravirt patch functions
for native and Xen. Eliminate native_patch() and rename
paravirt_patch_default() to paravirt_patch().

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210311142319.4723-15-jgross@suse.com


# fafe5e74 11-Mar-2021 Juergen Gross <jgross@suse.com>

x86/paravirt: Switch functions with custom code to ALTERNATIVE

Instead of using paravirt patching for custom code sequences use
ALTERNATIVE for the functions with custom code replacements.

Instead of patching an ud2 instruction for unpopulated vector entries
into the caller site, use a simple function just calling BUG() as a
replacement.

Simplify the register defines for assembler paravirt calling, as there
isn't much usage left.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210311142319.4723-14-jgross@suse.com


# 00aa3193 11-Mar-2021 Juergen Gross <jgross@suse.com>

x86/paravirt: Add new PVOP_ALT* macros to support pvops in ALTERNATIVEs

Instead of using paravirt patching for custom code sequences add
support for using ALTERNATIVE handling combined with paravirt call
patching.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210311142319.4723-13-jgross@suse.com


# ae755b5a 11-Mar-2021 Juergen Gross <jgross@suse.com>

x86/paravirt: Switch iret pvops to ALTERNATIVE

The iret paravirt op is rather special as it is using a jmp instead
of a call instruction. Switch it to ALTERNATIVE.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210311142319.4723-12-jgross@suse.com


# 0b8d366a 11-Mar-2021 Juergen Gross <jgross@suse.com>

x86/paravirt: Simplify paravirt macros

The central pvops call macros ____PVOP_CALL() and ____PVOP_VCALL() are
looking very similar now.

The main differences are using PVOP_VCALL_ARGS or PVOP_CALL_ARGS, which
are identical, and the return value handling.

So drop PVOP_VCALL_ARGS and instead of ____PVOP_VCALL() just use
(void)____PVOP_CALL(long, ...).

Note that it isn't easily possible to just redefine ____PVOP_VCALL()
to use ____PVOP_CALL() instead, as this would require further hiding of
commas in macro parameters.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210311142319.4723-11-jgross@suse.com


# 33634e42 11-Mar-2021 Juergen Gross <jgross@suse.com>

x86/paravirt: Remove no longer needed 32-bit pvops cruft

PVOP_VCALL4() is only used for Xen PV, while PVOP_CALL4() isn't used
at all. Keep PVOP_CALL4() for 64 bits due to symmetry reasons.

This allows to remove the 32-bit definitions of those macros leading
to a substantial simplification of the paravirt macros, as those were
the only ones needing non-empty "pre" and "post" parameters.

PVOP_CALLEE2() and PVOP_VCALLEE2() are used nowhere, so remove them.

Another no longer needed case is special handling of return types
larger than unsigned long. Replace that with a BUILD_BUG_ON().

DISABLE_INTERRUPTS() is used in 32-bit code only, so it can just be
replaced by cli.

INTERRUPT_RETURN in 32-bit code can be replaced by iret.

ENABLE_INTERRUPTS is used nowhere, so it can be removed.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210311142319.4723-10-jgross@suse.com


# a0e2bf7c 11-Mar-2021 Juergen Gross <jgross@suse.com>

x86/paravirt: Switch time pvops functions to use static_call()

The time pvops functions are the only ones left which might be
used in 32-bit mode and which return a 64-bit value.

Switch them to use the static_call() mechanism instead of pvops, as
this allows quite some simplification of the pvops implementation.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210311142319.4723-5-jgross@suse.com


# ab234a26 20-Jan-2021 Juergen Gross <jgross@suse.com>

x86/pv: Rework arch_local_irq_restore() to not use popf

POPF is a rather expensive operation, so don't use it for restoring
irq flags. Instead, test whether interrupts are enabled in the flags
parameter and enable interrupts via STI in that case.

This results in the restore_fl paravirt op to be no longer needed.

Suggested-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210120135555.32594-7-jgross@suse.com


# afd30525 20-Jan-2021 Juergen Gross <jgross@suse.com>

x86/xen: Drop USERGS_SYSRET64 paravirt call

USERGS_SYSRET64 is used to return from a syscall via SYSRET, but
a Xen PV guest will nevertheless use the IRET hypercall, as there
is no sysret PV hypercall defined.

So instead of testing all the prerequisites for doing a sysret and
then mangling the stack for Xen PV again for doing an iret just use
the iret exit from the beginning.

This can easily be done via an ALTERNATIVE like it is done for the
sysenter compat case already.

It should be noted that this drops the optimization in Xen for not
restoring a few registers when returning to user mode, but it seems
as if the saved instructions in the kernel more than compensate for
this drop (a kernel build in a Xen PV guest was slightly faster with
this patch applied).

While at it remove the stale sysret32 remnants.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210120135555.32594-6-jgross@suse.com


# 53c9d924 20-Jan-2021 Juergen Gross <jgross@suse.com>

x86/pv: Switch SWAPGS to ALTERNATIVE

SWAPGS is used only for interrupts coming from user mode or for
returning to user mode. So there is no reason to use the PARAVIRT
framework, as it can easily be replaced by an ALTERNATIVE depending
on X86_FEATURE_XENPV.

There are several instances using the PV-aware SWAPGS macro in paths
which are never executed in a Xen PV guest. Replace those with the
plain swapgs instruction. For SWAPGS_UNSAFE_STACK the same applies.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210120135555.32594-5-jgross@suse.com


# 157e118b 03-Nov-2020 Thomas Gleixner <tglx@linutronix.de>

x86/mm/highmem: Use generic kmap atomic implementation

Convert X86 to the generic kmap atomic implementation and make the
iomap_atomic() naming convention consistent while at it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201103095857.375127260@linutronix.de


# e1ac3e66 14-Aug-2020 Juergen Gross <jgross@suse.com>

x86/paravirt: Remove set_pte_at() pv-op

On x86 set_pte_at() is now always falling back to set_pte(). So instead
of having this fallback after the paravirt maze just drop the
set_pte_at paravirt operation and let set_pte_at() use the set_pte()
function directly.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200815100641.26362-6-jgross@suse.com


# 0cabf991 14-Aug-2020 Juergen Gross <jgross@suse.com>

x86/paravirt: Remove 32-bit support from CONFIG_PARAVIRT_XXL

The last 32-bit user of stuff under CONFIG_PARAVIRT_XXL is gone.

Remove 32-bit specific parts.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200815100641.26362-2-jgross@suse.com


# cadfad87 17-Jul-2020 Andy Lutomirski <luto@kernel.org>

x86/ioperm: Fix io bitmap invalidation on Xen PV

tss_invalidate_io_bitmap() wasn't wired up properly through the pvop
machinery, so the TSS and Xen's io bitmap would get out of sync
whenever disabling a valid io bitmap.

Add a new pvop for tss_invalidate_io_bitmap() to fix it.

This is XSA-329.

Fixes: 22fe5b0439dd ("x86/ioperm: Move TSS bitmap update to exit to user work")
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/d53075590e1f91c19f8af705059d3ff99424c020.1595030016.git.luto@kernel.org


# 99bcd4a6 18-Feb-2020 Juergen Gross <jgross@suse.com>

x86/ioperm: Add new paravirt function update_io_bitmap()

Commit 111e7b15cf10f6 ("x86/ioperm: Extend IOPL config to control ioperm()
as well") reworked the iopl syscall to use I/O bitmaps.

Unfortunately this broke Xen PV domains using that syscall as there is
currently no I/O bitmap support in PV domains.

Add I/O bitmap support via a new paravirt function update_io_bitmap which
Xen PV domains can use to update their I/O bitmaps via a hypercall.

Fixes: 111e7b15cf10f6 ("x86/ioperm: Extend IOPL config to control ioperm() as well")
Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Cc: <stable@vger.kernel.org> # 5.5
Link: https://lkml.kernel.org/r/20200218154712.25490-1-jgross@suse.com


# a24ca997 11-Nov-2019 Thomas Gleixner <tglx@linutronix.de>

x86/iopl: Remove legacy IOPL option

The IOPL emulation via the I/O bitmap is sufficient. Remove the legacy
cruft dealing with the (e)flags based IOPL mechanism.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Juergen Gross <jgross@suse.com> (Paravirt and Xen parts)
Acked-by: Andy Lutomirski <luto@kernel.org>


# 83b584d9 15-Jul-2019 Andrew Cooper <andrew.cooper3@citrix.com>

x86/paravirt: Drop {read,write}_cr8() hooks

There is a lot of infrastructure for functionality which is used
exclusively in __{save,restore}_processor_state() on the suspend/resume
path.

cr8 is an alias of APIC_TASKPRI, and APIC_TASKPRI is saved/restored by
lapic_{suspend,resume}(). Saving and restoring cr8 independently of the
rest of the Local APIC state isn't a clever thing to be doing.

Delete the suspend/resume cr8 handling, which shrinks the size of struct
saved_context, and allows for the removal of both PVOPS.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lkml.kernel.org/r/20190715151641.29210-1-andrew.cooper3@citrix.com


# 55aedddb 11-Jul-2019 Peter Zijlstra <peterz@infradead.org>

x86/paravirt: Make read_cr2() CALLEE_SAVE

The one paravirt read_cr2() implementation (Xen) is actually quite trivial
and doesn't need to clobber anything other than the return register.

Making read_cr2() CALLEE_SAVE avoids all the PUSH/POP nonsense and allows
more convenient use from assembly.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: bp@alien8.de
Cc: rostedt@goodmis.org
Cc: luto@kernel.org
Cc: torvalds@linux-foundation.org
Cc: hpa@zytor.com
Cc: dave.hansen@linux.intel.com
Cc: zhe.he@windriver.com
Cc: joel@joelfernandes.org
Cc: devel@etsukata.com
Link: https://lkml.kernel.org/r/20190711114335.887392493@infradead.org


# 46938cc8 25-Apr-2019 Ingo Molnar <mingo@kernel.org>

x86/paravirt: Rename paravirt_patch_site::instrtype to paravirt_patch_site::type

It's used as 'type' in almost every paravirt patching function, so standardize
the field name from the somewhat weird 'instrtype' name to 'type'.

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 1fc654cf 25-Apr-2019 Ingo Molnar <mingo@kernel.org>

x86/paravirt: Standardize 'insn_buff' variable names

We currently have 6 (!) separate naming variants to name temporary instruction
buffers that are used for code patching:

- insnbuf
- insnbuff
- insn_buff
- insn_buffer
- ibuf
- ibuffer

These are used as local variables, percpu fields and function parameters.

Standardize all the names to a single variant: 'insn_buff'.

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 0b9d2fc1 24-Apr-2019 Thomas Gleixner <tglx@linutronix.de>

x86/paravirt: Replace the paravirt patch asm magic

The magic macro DEF_NATIVE() in the paravirt patching code uses inline
assembly to generate a data table for patching in the native instructions.

While clever this is falling apart with LTO and even aside of LTO the
construct is just working by chance according to GCC folks.

Aside of that the tables are constant data and not some form of magic
text.

As these constructs are not subject to frequent changes it is not a
maintenance issue to convert them to regular data tables which are
initialized with hex bytes.

Create a new set of macros and data structures to store the instruction
sequences and convert the code over.

Reported-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Link: http://lkml.kernel.org/r/20190424134223.690835713@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 0cbe3e26 05-Mar-2019 Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>

mm: update ptep_modify_prot_start/commit to take vm_area_struct as arg

Patch series "NestMMU pte upgrade workaround for mprotect", v5.

We can upgrade pte access (R -> RW transition) via mprotect. We need to
make sure we follow the recommended pte update sequence as outlined in
commit bd5050e38aec ("powerpc/mm/radix: Change pte relax sequence to
handle nest MMU hang") for such updates. This patch series does that.

This patch (of 5):

Some architectures may want to call flush_tlb_range from these helpers.

Link: http://lkml.kernel.org/r/20190116085035.29729-2-aneesh.kumar@linux.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# a4da3d86 19-Dec-2018 Ingo Molnar <mingo@kernel.org>

Revert "x86/paravirt: Work around GCC inlining bugs when compiling paravirt ops"

This reverts commit 494b5168f2de009eb80f198f668da374295098dd.

See this commit for details about the revert:

e769742d3584 ("Revert "x86/jump-labels: Macrofy inline assembly code to work around GCC inlining bugs"")

Reported-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Borislav Petkov <bp@alien8.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Juergen Gross <jgross@suse.com>
Cc: Richard Biener <rguenther@suse.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: Segher Boessenkool <segher@kernel.crashing.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Nadav Amit <namit@vmware.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 7847c7be 30-Oct-2018 Juergen Gross <jgross@suse.com>

x86/paravirt: Remove unused _paravirt_ident_32

There is no user of _paravirt_ident_32 left in the tree. Remove it
together with the related paravirt_patch_ident_32().

paravirt_patch_ident_64() can be moved inside CONFIG_PARAVIRT_XXL=y.

Signed-off-by: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: akataria@vmware.com
Cc: boris.ostrovsky@oracle.com
Cc: rusty@rustcorp.com.au
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/20181030063301.15054-1-jgross@suse.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 494b5168 03-Oct-2018 Nadav Amit <namit@vmware.com>

x86/paravirt: Work around GCC inlining bugs when compiling paravirt ops

As described in:

77b0bf55bc67: ("kbuild/Makefile: Prepare for using macros in inline assembly code to work around asm() related GCC inlining bugs")

GCC's inlining heuristics are broken with common asm() patterns used in
kernel code, resulting in the effective disabling of inlining.

The workaround is to set an assembly macro and call it from the inline
assembly block. As a result GCC considers the inline assembly block as
a single instruction. (Which it isn't, but that's the best we can get.)

In this patch we wrap the paravirt call section tricks in a macro,
to hide it from GCC.

The effect of the patch is a more aggressive inlining, which also
causes a size increase of kernel.

text data bss dec hex filename
18147336 10226688 2957312 31331336 1de1408 ./vmlinux before
18162555 10226288 2957312 31346155 1de4deb ./vmlinux after (+14819)

The number of static text symbols (non-inlined functions) goes down:

Before: 40053
After: 39942 (-111)

[ mingo: Rewrote the changelog. ]

Tested-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Nadav Amit <namit@vmware.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: virtualization@lists.linux-foundation.org
Link: http://lkml.kernel.org/r/20181003213100.189959-8-namit@vmware.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# fdc0269e 28-Aug-2018 Juergen Gross <jgross@suse.com>

x86/paravirt: Move the Xen-only pv_mmu_ops under the PARAVIRT_XXL umbrella

Most of the paravirt ops defined in pv_mmu_ops are for Xen PV guests
only. Define them only if CONFIG_PARAVIRT_XXL is set.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: xen-devel@lists.xenproject.org
Cc: virtualization@lists.linux-foundation.org
Cc: akataria@vmware.com
Cc: rusty@rustcorp.com.au
Cc: boris.ostrovsky@oracle.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/20180828074026.820-15-jgross@suse.com


# 6da63eb2 28-Aug-2018 Juergen Gross <jgross@suse.com>

x86/paravirt: Move the pv_irq_ops under the PARAVIRT_XXL umbrella

All of the paravirt ops defined in pv_irq_ops are for Xen PV guests
or VSMP only. Define them only if CONFIG_PARAVIRT_XXL is set.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: xen-devel@lists.xenproject.org
Cc: virtualization@lists.linux-foundation.org
Cc: akataria@vmware.com
Cc: rusty@rustcorp.com.au
Cc: boris.ostrovsky@oracle.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/20180828074026.820-14-jgross@suse.com


# 9bad5658 28-Aug-2018 Juergen Gross <jgross@suse.com>

x86/paravirt: Move the Xen-only pv_cpu_ops under the PARAVIRT_XXL umbrella

Most of the paravirt ops defined in pv_cpu_ops are for Xen PV guests
only. Define them only if CONFIG_PARAVIRT_XXL is set.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: xen-devel@lists.xenproject.org
Cc: virtualization@lists.linux-foundation.org
Cc: akataria@vmware.com
Cc: rusty@rustcorp.com.au
Cc: boris.ostrovsky@oracle.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/20180828074026.820-13-jgross@suse.com


# 40181646 28-Aug-2018 Juergen Gross <jgross@suse.com>

x86/paravirt: Move items in pv_info under PARAVIRT_XXL umbrella

All items but name in pv_info are needed by Xen PV only. Define them
with CONFIG_PARAVIRT_XXL set only.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: xen-devel@lists.xenproject.org
Cc: virtualization@lists.linux-foundation.org
Cc: akataria@vmware.com
Cc: rusty@rustcorp.com.au
Cc: boris.ostrovsky@oracle.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/20180828074026.820-12-jgross@suse.com


# 5c83511b 28-Aug-2018 Juergen Gross <jgross@suse.com>

x86/paravirt: Use a single ops structure

Instead of using six globally visible paravirt ops structures combine
them in a single structure, keeping the original structures as
sub-structures.

This avoids the need to assemble struct paravirt_patch_template at
runtime on the stack each time apply_paravirt() is being called (i.e.
when loading a module).

[ tglx: Made the struct and the initializer tabular for readability sake ]

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: xen-devel@lists.xenproject.org
Cc: virtualization@lists.linux-foundation.org
Cc: akataria@vmware.com
Cc: rusty@rustcorp.com.au
Cc: boris.ostrovsky@oracle.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/20180828074026.820-9-jgross@suse.com


# 27876f38 28-Aug-2018 Juergen Gross <jgross@suse.com>

x86/paravirt: Remove clobbers from struct paravirt_patch_site

There is no need any longer to store the clobbers in struct
paravirt_patch_site. Remove clobbers from the struct and from the
related macros.

While at it fix some lines longer than 80 characters.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: xen-devel@lists.xenproject.org
Cc: virtualization@lists.linux-foundation.org
Cc: akataria@vmware.com
Cc: rusty@rustcorp.com.au
Cc: boris.ostrovsky@oracle.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/20180828074026.820-8-jgross@suse.com


# abc745f8 28-Aug-2018 Juergen Gross <jgross@suse.com>

x86/paravirt: Remove clobbers parameter from paravirt patch functions

The clobbers parameter from paravirt_patch_default() et al isn't used
any longer. Remove it.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: xen-devel@lists.xenproject.org
Cc: virtualization@lists.linux-foundation.org
Cc: akataria@vmware.com
Cc: rusty@rustcorp.com.au
Cc: boris.ostrovsky@oracle.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/20180828074026.820-7-jgross@suse.com


# 7e437202 28-Aug-2018 Juergen Gross <jgross@suse.com>

x86/paravirt: Make paravirt_patch_call() and paravirt_patch_jmp() static

paravirt_patch_call() and paravirt_patch_jmp() are used in paravirt.c
only. Convert them to static.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: xen-devel@lists.xenproject.org
Cc: virtualization@lists.linux-foundation.org
Cc: akataria@vmware.com
Cc: rusty@rustcorp.com.au
Cc: boris.ostrovsky@oracle.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/20180828074026.820-6-jgross@suse.com


# 48a8b97c 22-Aug-2018 Peter Zijlstra <peterz@infradead.org>

x86/mm: Only use tlb_remove_table() for paravirt

If we don't use paravirt; don't play unnecessary and complicated games
to free page-tables.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 3010a066 17-Jan-2018 Peter Zijlstra <peterz@infradead.org>

x86/paravirt, objtool: Annotate indirect calls

Paravirt emits indirect calls which get flagged by objtool retpoline
checks, annotate it away because all these indirect calls will be
patched out before we start userspace.

This patching happens through alternative_instructions() ->
apply_paravirt() -> pv_init_ops.patch() which will eventually end up
in paravirt_patch_default(). This function _will_ write direct
alternatives.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 1299ef1d 31-Jan-2018 Andy Lutomirski <luto@kernel.org>

x86/mm: Rename flush_tlb_single() and flush_tlb_one() to __flush_tlb_one_[user|kernel]()

flush_tlb_single() and flush_tlb_one() sound almost identical, but
they really mean "flush one user translation" and "flush one kernel
translation". Rename them to flush_tlb_one_user() and
flush_tlb_one_kernel() to make the semantics more obvious.

[ I was looking at some PTI-related code, and the flush-one-address code
is unnecessarily hard to understand because the names of the helpers are
uninformative. This came up during PTI review, but no one got around to
doing it. ]

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kees Cook <keescook@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linux-MM <linux-mm@kvack.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/3303b02e3c3d049dc5235d5651e0ae6d29a34354.1517414378.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# b2441318 01-Nov-2017 Greg Kroah-Hartman <gregkh@linuxfoundation.org>

License cleanup: add SPDX GPL-2.0 license identifier to files with no license

Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.

For non */uapi/* files that summary was:

SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 11139

and resulted in the first patch in this series.

If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:

SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note 930

and resulted in the second patch in this series.

- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:

SPDX license identifier # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note 270
GPL-2.0+ WITH Linux-syscall-note 169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
LGPL-2.1+ WITH Linux-syscall-note 15
GPL-1.0+ WITH Linux-syscall-note 14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
LGPL-2.0+ WITH Linux-syscall-note 4
LGPL-2.1 WITH Linux-syscall-note 3
((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1

and that resulted in the third patch in this series.

- when the two scanners agreed on the detected license(s), that became
the concluded license(s).

- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.

- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).

- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.

- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct

This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.

Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# da51da18 02-Nov-2017 Andy Lutomirski <luto@kernel.org>

x86/entry/64: Pass SP0 directly to load_sp0()

load_sp0() had an odd signature:

void load_sp0(struct tss_struct *tss, struct thread_struct *thread);

Simplify it to:

void load_sp0(unsigned long sp0);

Also simplify a few get_cpu()/put_cpu() sequences to
preempt_disable()/preempt_enable().

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/2655d8b42ed940aa384fe18ee1129bbbcf730a08.1509609304.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# f5caf621 20-Sep-2017 Josh Poimboeuf <jpoimboe@redhat.com>

x86/asm: Fix inline asm call constraints for Clang

For inline asm statements which have a CALL instruction, we list the
stack pointer as a constraint to convince GCC to ensure the frame
pointer is set up first:

static inline void foo()
{
register void *__sp asm(_ASM_SP);
asm("call bar" : "+r" (__sp))
}

Unfortunately, that pattern causes Clang to corrupt the stack pointer.

The fix is easy: convert the stack pointer register variable to a global
variable.

It should be noted that the end result is different based on the GCC
version. With GCC 6.4, this patch has exactly the same result as
before:

defconfig defconfig-nofp distro distro-nofp
before 9820389 9491555 8816046 8516940
after 9820389 9491555 8816046 8516940

With GCC 7.2, however, GCC's behavior has changed. It now changes its
behavior based on the conversion of the register variable to a global.
That somehow convinces it to *always* set up the frame pointer before
inserting *any* inline asm. (Therefore, listing the variable as an
output constraint is a no-op and is no longer necessary.) It's a bit
overkill, but the performance impact should be negligible. And in fact,
there's a nice improvement with frame pointers disabled:

defconfig defconfig-nofp distro distro-nofp
before 9796316 9468236 9076191 8790305
after 9796957 9464267 9076381 8785949

So in summary, while listing the stack pointer as an output constraint
is no longer necessary for newer versions of GCC, it's still needed for
older versions.

Suggested-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reported-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/3db862e970c432ae823cf515c52b54fec8270e0e.1505942196.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 87930019 03-Sep-2017 Juergen Gross <jgross@suse.com>

x86/paravirt: Remove no longer used paravirt functions

With removal of lguest some of the paravirt functions are no longer
needed:

->read_cr4()
->store_idt()
->set_pmd_at()
->set_pud_at()
->pte_update()

Remove them.

Signed-off-by: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: akataria@vmware.com
Cc: boris.ostrovsky@oracle.com
Cc: chrisw@sous-sol.org
Cc: jeremy@goop.org
Cc: rusty@rustcorp.com.au
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/20170904102527.25409-1-jgross@suse.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 5878d5d6 31-Aug-2017 Juergen Gross <jgross@suse.com>

x86/xen: Get rid of paravirt op adjust_exception_frame

When running as Xen pv-guest the exception frame on the stack contains
%r11 and %rcx additional to the other data pushed by the processor.

Instead of having a paravirt op being called for each exception type
prepend the Xen specific code to each exception entry. When running as
Xen pv-guest just use the exception entry with prepended instructions,
otherwise use the entry without the Xen specific code.

[ tglx: Merged through tip to avoid ugly merge conflict ]

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: xen-devel@lists.xenproject.org
Cc: boris.ostrovsky@oracle.com
Cc: luto@amacapital.net
Link: http://lkml.kernel.org/r/20170831174249.26853-1-jg@pfupf.net


# 8acdf505 28-Oct-2016 Kees Cook <keescook@chromium.org>

randstruct: opt-out externally exposed function pointer structs

Some function pointer structures are used externally to the kernel, like
the paravirt structures. These should never be randomized, so mark them
as such, in preparation for enabling randstruct's automatic selection
of all-function-pointer structures.

These markings are verbatim from Brad Spengler/PaX Team's code in the
last public patch of grsecurity/PaX based on my understanding of the
code. Changes or omissions from the original code are mine and don't
reflect the original grsecurity/PaX code.

Signed-off-by: Kees Cook <keescook@chromium.org>


# a2055abe 28-May-2017 Andy Lutomirski <luto@kernel.org>

x86/mm: Pass flush_tlb_info to flush_tlb_others() etc

Rather than passing all the contents of flush_tlb_info to
flush_tlb_others(), pass a pointer to the structure directly. For
consistency, this also removes the unnecessary cpu parameter from
uv_flush_tlb_others() to make its signature match the other
*flush_tlb_others() functions.

This serves two purposes:

- It will dramatically simplify future patches that change struct
flush_tlb_info, which I'm planning to do.

- struct flush_tlb_info is an adequate description of what to do
for a local flush, too, so by reusing it we can remove duplicated
code between local and remove flushes in a future patch.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Nadav Amit <namit@vmware.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mm@kvack.org
[ Fix build warning. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 335437fb 30-Mar-2017 Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

x86/paravirt: Add 5-level support to the paravirt code

Add operations to allocate/release p4ds.

Xen requires more work. We will need to come back to it.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arch@vger.kernel.org
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/20170330080731.65421-5-kirill.shutemov@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# f2a6a705 17-Mar-2017 Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

x86: Convert the rest of the code to support p4d_t

This patch converts x86 to use proper folding of a new (fifth) page table level
with <asm-generic/pgtable-nop4d.h>.

That's a bit of a kitchen sink patch, but I don't see how to split it further
without hurting bisectability.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-arch@vger.kernel.org
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/20170317185515.8636-7-kirill.shutemov@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# a00cc7d9 24-Feb-2017 Matthew Wilcox <willy@infradead.org>

mm, x86: add support for PUD-sized transparent hugepages

The current transparent hugepage code only supports PMDs. This patch
adds support for transparent use of PUDs with DAX. It does not include
support for anonymous pages. x86 support code also added.

Most of this patch simply parallels the work that was done for huge
PMDs. The only major difference is how the new ->pud_entry method in
mm_walk works. The ->pmd_entry method replaces the ->pte_entry method,
whereas the ->pud_entry method works along with either ->pmd_entry or
->pte_entry. The pagewalk code takes care of locking the PUD before
calling ->pud_walk, so handlers do not need to worry whether the PUD is
stable.

[dave.jiang@intel.com: fix SMP x86 32bit build for native_pud_clear()]
Link: http://lkml.kernel.org/r/148719066814.31111.3239231168815337012.stgit@djiang5-desk3.ch.intel.com
[dave.jiang@intel.com: native_pud_clear missing on i386 build]
Link: http://lkml.kernel.org/r/148640375195.69754.3315433724330910314.stgit@djiang5-desk3.ch.intel.com
Link: http://lkml.kernel.org/r/148545059381.17912.8602162635537598445.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Tested-by: Alexander Kapshuk <alexander.kapshuk@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Jan Kara <jack@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Nilesh Choudhury <nilesh.choudhury@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 11f254db 08-Dec-2016 Peter Zijlstra <peterz@infradead.org>

x86/paravirt: Fix bool return type for PVOP_CALL()

Commit:

3cded4179481 ("x86/paravirt: Optimize native pv_lock_ops.vcpu_is_preempted()")

introduced a paravirt op with bool return type [*]

It turns out that the PVOP_CALL*() macros miscompile when rettype is
bool. Code that looked like:

83 ef 01 sub $0x1,%edi
ff 15 32 a0 d8 00 callq *0xd8a032(%rip) # ffffffff81e28120 <pv_lock_ops+0x20>
84 c0 test %al,%al

ended up looking like so after PVOP_CALL1() was applied:

83 ef 01 sub $0x1,%edi
48 63 ff movslq %edi,%rdi
ff 14 25 20 81 e2 81 callq *0xffffffff81e28120
48 85 c0 test %rax,%rax

Note how it tests the whole of %rax, even though a typical bool return
function only sets %al, like:

0f 95 c0 setne %al
c3 retq

This is because ____PVOP_CALL() does:

__ret = (rettype)__eax;

and while regular integer type casts truncate the result, a cast to
bool tests for any !0 value. Fix this by explicitly truncating to
sizeof(rettype) before casting.

[*] The actual bug should've been exposed in commit:
446f3dc8cc0a ("locking/core, x86/paravirt: Implement vcpu_is_preempted(cpu) for KVM and Xen guests")
but that didn't properly implement the paravirt call.

Reported-by: kernel test robot <xiaolong.ye@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Anvin <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 3cded4179481 ("x86/paravirt: Optimize native pv_lock_ops.vcpu_is_preempted()")
Link: http://lkml.kernel.org/r/20161208154349.346057680@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 3cded417 15-Nov-2016 Peter Zijlstra <peterz@infradead.org>

x86/paravirt: Optimize native pv_lock_ops.vcpu_is_preempted()

Avoid the pointless function call to pv_lock_ops.vcpu_is_preempted()
when a paravirt spinlock enabled kernel is ran on native hardware.

Do this by patching out the CALL instruction with "XOR %RAX,%RAX"
which has the same effect (0 return value).

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: David.Laight@ACULAB.COM
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: benh@kernel.crashing.org
Cc: boqun.feng@gmail.com
Cc: borntraeger@de.ibm.com
Cc: bsingharora@gmail.com
Cc: dave@stgolabs.net
Cc: jgross@suse.com
Cc: kernellwp@gmail.com
Cc: konrad.wilk@oracle.com
Cc: mpe@ellerman.id.au
Cc: paulmck@linux.vnet.ibm.com
Cc: paulus@samba.org
Cc: pbonzini@redhat.com
Cc: rkrcmar@redhat.com
Cc: will.deacon@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 446f3dc8 02-Nov-2016 Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>

locking/core, x86/paravirt: Implement vcpu_is_preempted(cpu) for KVM and Xen guests

Optimize spinlock and mutex busy-loops by providing a vcpu_is_preempted(cpu)
function on KVM and Xen platforms.

Extend the pv_lock_ops interface accordingly and implement the callbacks
on KVM and Xen.

Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
[ Translated to English. ]
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: David.Laight@ACULAB.COM
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: benh@kernel.crashing.org
Cc: boqun.feng@gmail.com
Cc: borntraeger@de.ibm.com
Cc: bsingharora@gmail.com
Cc: dave@stgolabs.net
Cc: jgross@suse.com
Cc: kernellwp@gmail.com
Cc: konrad.wilk@oracle.com
Cc: linuxppc-dev@lists.ozlabs.org
Cc: mpe@ellerman.id.au
Cc: paulmck@linux.vnet.ibm.com
Cc: paulus@samba.org
Cc: rkrcmar@redhat.com
Cc: virtualization@lists.linux-foundation.org
Cc: will.deacon@arm.com
Cc: xen-devel-request@lists.xenproject.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1478077718-37424-7-git-send-email-xinhui.pan@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# af25ed59 31-Oct-2016 Andy Lutomirski <luto@kernel.org>

x86/fpu: Remove clts()

The kernel doesn't use clts() any more. Remove it and all of its
paravirt infrastructure.

A careful reader may notice that xen_clts() appears to have been
buggy -- it didn't update xen_cr0_value.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm list <kvm@vger.kernel.org>
Link: http://lkml.kernel.org/r/3d3c8ca62f17579b9849a013d71e59a4d5d1b079.1477951965.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 1ef55be1 29-Sep-2016 Andy Lutomirski <luto@kernel.org>

x86/asm: Get rid of __read_cr4_safe()

We use __read_cr4() vs __read_cr4_safe() inconsistently. On
CR4-less CPUs, all CR4 bits are effectively clear, so we can make
the code simpler and more robust by making __read_cr4() always fix
up faults on 32-bit kernels.

This may fix some bugs on old 486-like CPUs, but I don't have any
easy way to test that.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: david@saggiorato.net
Link: http://lkml.kernel.org/r/ea647033d357d9ce2ad2bbde5a631045f5052fb6.1475178370.git.luto@kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>


# cfd8983f 18-May-2016 Peter Zijlstra <peterz@infradead.org>

x86, locking/spinlocks: Remove ticket (spin)lock implementation

We've unconditionally used the queued spinlock for many releases now.

Its time to remove the old ticket lock code.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Waiman Long <waiman.long@hpe.com>
Cc: Waiman.Long@hpe.com
Cc: david.vrabel@citrix.com
Cc: dhowells@redhat.com
Cc: pbonzini@redhat.com
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/20160518184302.GO3193@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 867fe800 13-Apr-2016 Luis R. Rodriguez <mcgrof@kernel.org>

x86/paravirt: Remove paravirt_enabled()

Now that all previous paravirt_enabled() uses were replaced with proper
x86 semantics by the previous patches we can remove the unused
paravirt_enabled() mechanism.

Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Acked-by: Juergen Gross <jgross@suse.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: andrew.cooper3@citrix.com
Cc: andriy.shevchenko@linux.intel.com
Cc: bigeasy@linutronix.de
Cc: boris.ostrovsky@oracle.com
Cc: david.vrabel@citrix.com
Cc: ffainelli@freebox.fr
Cc: george.dunlap@citrix.com
Cc: glin@suse.com
Cc: jlee@suse.com
Cc: josh@joshtriplett.org
Cc: julien.grall@linaro.org
Cc: konrad.wilk@oracle.com
Cc: kozerkov@parallels.com
Cc: lenb@kernel.org
Cc: lguest@lists.ozlabs.org
Cc: linux-acpi@vger.kernel.org
Cc: lv.zheng@intel.com
Cc: matt@codeblueprint.co.uk
Cc: mbizon@freebox.fr
Cc: rjw@rjwysocki.net
Cc: robert.moore@intel.com
Cc: rusty@rustcorp.com.au
Cc: tiwai@suse.de
Cc: toshi.kani@hp.com
Cc: xen-devel@lists.xensource.com
Link: http://lkml.kernel.org/r/1460592286-300-15-git-send-email-mcgrof@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 8d152e7a 13-Apr-2016 Luis R. Rodriguez <mcgrof@kernel.org>

x86/rtc: Replace paravirt rtc check with platform legacy quirk

We have 4 types of x86 platforms that disable RTC:

* Intel MID
* Lguest - uses paravirt
* Xen dom-U - uses paravirt
* x86 on legacy systems annotated with an ACPI legacy flag

We can consolidate all of these into a platform specific legacy
quirk set early in boot through i386_start_kernel() and through
x86_64_start_reservations(). This deals with the RTC quirks which
we can rely on through the hardware subarch, the ACPI check can
be dealt with separately.

For Xen things are bit more complex given that the @X86_SUBARCH_XEN
x86_hardware_subarch is shared on for Xen which uses the PV path for
both domU and dom0. Since the semantics for differentiating between
the two are Xen specific we provide a platform helper to help override
default legacy features -- x86_platform.set_legacy_features(). Use
of this helper is highly discouraged, its only purpose should be
to account for the lack of semantics available within your given
x86_hardware_subarch.

As per 0-day, this bumps the vmlinux size using i386-tinyconfig as
follows:

TOTAL TEXT init.text x86_early_init_platform_quirks()
+70 +62 +62 +43

Only 8 bytes overhead total, as the main increase in size is
all removed via __init.

Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: andrew.cooper3@citrix.com
Cc: andriy.shevchenko@linux.intel.com
Cc: bigeasy@linutronix.de
Cc: boris.ostrovsky@oracle.com
Cc: david.vrabel@citrix.com
Cc: ffainelli@freebox.fr
Cc: george.dunlap@citrix.com
Cc: glin@suse.com
Cc: jlee@suse.com
Cc: josh@joshtriplett.org
Cc: julien.grall@linaro.org
Cc: konrad.wilk@oracle.com
Cc: kozerkov@parallels.com
Cc: lenb@kernel.org
Cc: lguest@lists.ozlabs.org
Cc: linux-acpi@vger.kernel.org
Cc: lv.zheng@intel.com
Cc: matt@codeblueprint.co.uk
Cc: mbizon@freebox.fr
Cc: rjw@rjwysocki.net
Cc: robert.moore@intel.com
Cc: rusty@rustcorp.com.au
Cc: tiwai@suse.de
Cc: toshi.kani@hp.com
Cc: xen-devel@lists.xensource.com
Link: http://lkml.kernel.org/r/1460592286-300-5-git-send-email-mcgrof@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# dd2f4a00 02-Apr-2016 Andy Lutomirski <luto@kernel.org>

x86/paravirt: Add paravirt_{read,write}_msr()

This adds paravirt callbacks for unsafe MSR access. On native, they
call native_{read,write}_msr(). On Xen, they use xen_{read,write}_msr_safe().

Nothing uses them yet for ease of bisection. The next patch will
use them in rdmsrl(), wrmsrl(), etc.

I intentionally didn't make them warn on #GP on Xen. I think that
should be done separately by the Xen maintainers.

Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: KVM list <kvm@vger.kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: xen-devel <Xen-devel@lists.xen.org>
Link: http://lkml.kernel.org/r/880eebc5dcd2ad9f310d41345f82061ea500e9fa.1459605520.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# c2ee03b2 02-Apr-2016 Andy Lutomirski <luto@kernel.org>

x86/paravirt: Add _safe to the read_ms()r and write_msr() PV callbacks

These callbacks match the _safe variants, so name them accordingly.
This will make room for unsafe PV callbacks.

Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: KVM list <kvm@vger.kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: xen-devel <Xen-devel@lists.xen.org>
Link: http://lkml.kernel.org/r/9ee3fb6a196a514c93325bdfa15594beecf04876.1459605520.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# bb93eb4c 21-Jan-2016 Josh Poimboeuf <jpoimboe@redhat.com>

x86/paravirt: Add stack frame dependency to PVOP inline asm calls

If a PVOP call macro is inlined at the beginning of a function, gcc can
insert the call instruction before setting up a stack frame, which
breaks frame pointer convention if CONFIG_FRAME_POINTER is enabled and
can result in a bad stack trace.

Force a stack frame to be created if CONFIG_FRAME_POINTER is enabled by
listing the stack pointer as an output operand for the PVOP inline asm
statements.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/6a13e48c5a8cf2de1aa112ae2d4c0ac194096282.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# d8c98a1d 11-Dec-2015 David Vrabel <david.vrabel@citrix.com>

x86/paravirt: Prevent rtc_cmos platform device init on PV guests

Adding the rtc platform device in non-privileged Xen PV guests causes
an IRQ conflict because these guests do not have legacy PIC and may
allocate irqs in the legacy range.

In a single VCPU Xen PV guest we should have:

/proc/interrupts:
CPU0
0: 4934 xen-percpu-virq timer0
1: 0 xen-percpu-ipi spinlock0
2: 0 xen-percpu-ipi resched0
3: 0 xen-percpu-ipi callfunc0
4: 0 xen-percpu-virq debug0
5: 0 xen-percpu-ipi callfuncsingle0
6: 0 xen-percpu-ipi irqwork0
7: 321 xen-dyn-event xenbus
8: 90 xen-dyn-event hvc_console
...

But hvc_console cannot get its interrupt because it is already in use
by rtc0 and the console does not work.

genirq: Flags mismatch irq 8. 00000000 (hvc_console) vs. 00000000 (rtc0)

We can avoid this problem by realizing that unprivileged PV guests (both
Xen and lguests) are not supposed to have rtc_cmos device and so
adding it is not necessary.

Privileged guests (i.e. Xen's dom0) do use it but they should not have
irq conflicts since they allocate irqs above legacy range (above
gsi_top, in fact).

Instead of explicitly testing whether the guest is privileged we can
extend pv_info structure to include information about guest's RTC
support.

Reported-and-tested-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: vkuznets@redhat.com
Cc: xen-devel@lists.xenproject.org
Cc: konrad.wilk@oracle.com
Cc: stable@vger.kernel.org # 4.2+
Link: http://lkml.kernel.org/r/1449842873-2613-1-git-send-email-boris.ostrovsky@oracle.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>


# d6ccc3ec 17-Nov-2015 Juergen Gross <jgross@suse.com>

x86/paravirt: Remove paravirt ops pmd_update[_defer] and pte_update_defer

pte_update_defer can be removed as it is always set to the same
function as pte_update. So any usage of pte_update_defer() can be
replaced by pte_update().

pmd_update and pmd_update_defer are always set to paravirt_nop, so they
can just be nuked.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: jeremy@goop.org
Cc: chrisw@sous-sol.org
Cc: akataria@vmware.com
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xen.org
Cc: konrad.wilk@oracle.com
Cc: david.vrabel@citrix.com
Cc: boris.ostrovsky@oracle.com
Link: http://lkml.kernel.org/r/1447771879-1806-1-git-send-email-jgross@suse.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>


# 75ef8219 19-Nov-2015 Boris Ostrovsky <boris.ostrovsky@oracle.com>

x86/entry, x86/paravirt: Remove the unused usergs_sysret32 PV op

As result of commit "x86/xen: Avoid fast syscall path for Xen PV
guests", usergs_sysret32 pv op is not called by Xen PV guests
anymore and since they were the only ones who used it we can
safely remove it.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: david.vrabel@citrix.com
Cc: konrad.wilk@oracle.com
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1447970147-1733-4-git-send-email-boris.ostrovsky@oracle.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 88c15ec9 19-Nov-2015 Boris Ostrovsky <boris.ostrovsky@oracle.com>

x86/paravirt: Remove the unused irq_enable_sysexit pv op

As result of commit "x86/xen: Avoid fast syscall path for Xen PV
guests", the irq_enable_sysexit pv op is not called by Xen PV guests
anymore and since they were the only ones who used it we can
safely remove it.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: david.vrabel@citrix.com
Cc: konrad.wilk@oracle.com
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1447970147-1733-3-git-send-email-boris.ostrovsky@oracle.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 46095865 17-Nov-2015 Juergen Gross <jgross@suse.com>

x86/paravirt: Remove unused pv_apic_ops structure

The only member of that structure is startup_ipi_hook which is always
set to paravirt_nop.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Cc: jeremy@goop.org
Cc: chrisw@sous-sol.org
Cc: akataria@vmware.com
Cc: rusty@rustcorp.com.au
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xen.org
Cc: konrad.wilk@oracle.com
Cc: boris.ostrovsky@oracle.com
Link: http://lkml.kernel.org/r/1447767872-16730-1-git-send-email-jgross@suse.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>


# 79f1d836 03-Nov-2015 Borislav Petkov <bp@suse.de>

x86/paravirt: Kill some unused patching functions

paravirt_patch_ignore() is completely unused and paravirt_patch_nop()
doesn't do a whole lot. Remove them both.

Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1446542329-32037-1-git-send-email-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>


# cda34fc7 13-Sep-2015 Juergen Gross <jgross@suse.com>

x86/paravirt: Remove the unused pv_time_ops::get_tsc_khz method

It's not used anywhere.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: akataria@vmware.com
Cc: chrisw@sous-sol.org
Cc: jeremy@goop.org
Cc: virtualization@lists.linux-foundation.org
Link: http://lkml.kernel.org/r/1442227343-403-1-git-send-email-jgross@suse.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 9261e050 25-Jun-2015 Andy Lutomirski <luto@kernel.org>

x86/asm/tsc, x86/paravirt: Remove read_tsc() and read_tscp() paravirt hooks

We've had ->read_tsc() and ->read_tscp() paravirt hooks since
the very beginning of paravirt, i.e.,

d3561b7fa0fb ("[PATCH] paravirt: header and stubs for paravirtualisation").

AFAICT, the only paravirt guest implementation that ever
replaced these calls was vmware, and it's gone. Arguably even
vmware shouldn't have hooked RDTSC -- we fully support systems
that don't have a TSC at all, so there's no point for a paravirt
implementation to pretend that we have a TSC but to replace it.

I also doubt that these hooks actually worked. Calls to rdtscl()
and rdtscll(), which respected the hooks, were used seemingly
interchangeably with native_read_tsc(), which did not.

Just remove them. If anyone ever needs them again, they can try
to make a case for why they need them.

Before, on a paravirt config:
text data bss dec hex filename
12618257 1816384 1093632 15528273 ecf151 vmlinux

After:
text data bss dec hex filename
12617207 1816384 1093632 15527223 eced37 vmlinux

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: virtualization@lists.linux-foundation.org
Link: http://lkml.kernel.org/r/d08a2600fb298af163681e5efd8e599d889a5b97.1434501121.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 62c7a1e9 11-May-2015 Ingo Molnar <mingo@kernel.org>

locking/pvqspinlock: Rename QUEUED_SPINLOCK to QUEUED_SPINLOCKS

Valentin Rothberg reported that we use CONFIG_QUEUED_SPINLOCKS
in arch/x86/kernel/paravirt_patch_32.c, while the symbol is
called CONFIG_QUEUED_SPINLOCK. (Note the extra 'S')

But the typo was natural: the proper English term for such
a generic object would be 'queued spinlocks' - so rename
this and related symbols accordingly to the plural form.

Reported-by: Valentin Rothberg <valentinrothberg@gmail.com>
Cc: Douglas Hatch <doug.hatch@hp.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Scott J Norton <scott.norton@hp.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Waiman Long <Waiman.Long@hp.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# f233f7f1 24-Apr-2015 Peter Zijlstra (Intel) <peterz@infradead.org>

locking/pvqspinlock, x86: Implement the paravirt qspinlock call patching

We use the regular paravirt call patching to switch between:

native_queued_spin_lock_slowpath() __pv_queued_spin_lock_slowpath()
native_queued_spin_unlock() __pv_queued_spin_unlock()

We use a callee saved call for the unlock function which reduces the
i-cache footprint and allows 'inlining' of SPIN_UNLOCK functions
again.

We further optimize the unlock path by patching the direct call with a
"movb $0,%arg1" if we are indeed using the native unlock code. This
makes the unlock code almost as fast as the !PARAVIRT case.

This significantly lowers the overhead of having
CONFIG_PARAVIRT_SPINLOCKS enabled, even for native code.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Waiman Long <Waiman.Long@hp.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Daniel J Blueman <daniel@numascale.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Douglas Hatch <doug.hatch@hp.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paolo Bonzini <paolo.bonzini@gmail.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Scott J Norton <scott.norton@hp.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1429901803-29771-10-git-send-email-Waiman.Long@hp.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# aac82d31 03-Apr-2015 Andy Lutomirski <luto@kernel.org>

x86, paravirt, xen: Remove the 64-bit ->irq_enable_sysexit() pvop

We don't use irq_enable_sysexit on 64-bit kernels any more.
Remove all the paravirt and Xen machinery to support it on
64-bit kernels.

Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/8a03355698fe5b94194e9e7360f19f91c1b2cf1f.1428100853.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 98233368 14-Apr-2015 Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

x86: expose number of page table levels on Kconfig level

We would want to use number of page table level to define mm_struct.
Let's expose it as CONFIG_PGTABLE_LEVELS.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 824a2870 22-Oct-2013 Andi Kleen <ak@linux.intel.com>

x86, asmlinkage, paravirt: Don't rely on local assembler labels

The paravirt patching code assumes that it can reference a
local assembler label between two different top level assembler
statements. This does not work with LTO
where the assembler code may end up in different assembler files.

Replace it with extern / global /asm linkage labels.

This also removes one redundant copy of the macro.

Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1382458079-24450-4-git-send-email-andi@firstfloor.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>


# 354714dd 09-Aug-2013 Jeremy Fitzhardinge <jeremy@goop.org>

x86, pvticketlock: Use callee-save for lock_spinning

Although the lock_spinning calls in the spinlock code are on the
uncommon path, their presence can cause the compiler to generate many
more register save/restores in the function pre/postamble, which is in
the fast path. To avoid this, convert it to using the pvops callee-save
calling convention, which defers all the save/restores until the actual
function is called, keeping the fastpath clean.

Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
Link: http://lkml.kernel.org/r/1376058122-8248-8-git-send-email-raghavendra.kt@linux.vnet.ibm.com
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Tested-by: Attilio Rao <attilio.rao@citrix.com>
Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>


# 545ac138 09-Aug-2013 Jeremy Fitzhardinge <jeremy@goop.org>

x86, spinlock: Replace pv spinlocks with pv ticketlocks

Rather than outright replacing the entire spinlock implementation in
order to paravirtualize it, keep the ticket lock implementation but add
a couple of pvops hooks on the slow patch (long spin on lock, unlocking
a contended lock).

Ticket locks have a number of nice properties, but they also have some
surprising behaviours in virtual environments. They enforce a strict
FIFO ordering on cpus trying to take a lock; however, if the hypervisor
scheduler does not schedule the cpus in the correct order, the system can
waste a huge amount of time spinning until the next cpu can take the lock.

(See Thomas Friebel's talk "Prevent Guests from Spinning Around"
http://www.xen.org/files/xensummitboston08/LHP.pdf for more details.)

To address this, we add two hooks:
- __ticket_spin_lock which is called after the cpu has been
spinning on the lock for a significant number of iterations but has
failed to take the lock (presumably because the cpu holding the lock
has been descheduled). The lock_spinning pvop is expected to block
the cpu until it has been kicked by the current lock holder.
- __ticket_spin_unlock, which on releasing a contended lock
(there are more cpus with tail tickets), it looks to see if the next
cpu is blocked and wakes it if so.

When compiled with CONFIG_PARAVIRT_SPINLOCKS disabled, a set of stub
functions causes all the extra code to go away.

Results:
=======
setup: 32 core machine with 32 vcpu KVM guest (HT off) with 8GB RAM
base = 3.11-rc
patched = base + pvspinlock V12

+-----------------+----------------+--------+
dbench (Throughput in MB/sec. Higher is better)
+-----------------+----------------+--------+
| base (stdev %)|patched(stdev%) | %gain |
+-----------------+----------------+--------+
| 15035.3 (0.3) |15150.0 (0.6) | 0.8 |
| 1470.0 (2.2) | 1713.7 (1.9) | 16.6 |
| 848.6 (4.3) | 967.8 (4.3) | 14.0 |
| 652.9 (3.5) | 685.3 (3.7) | 5.0 |
+-----------------+----------------+--------+

pvspinlock shows benefits for overcommit ratio > 1 for PLE enabled cases,
and undercommits results are flat

Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
Link: http://lkml.kernel.org/r/1376058122-8248-2-git-send-email-raghavendra.kt@linux.vnet.ibm.com
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Tested-by: Attilio Rao <attilio.rao@citrix.com>
[ Raghavendra: Changed SPIN_THRESHOLD, fixed redefinition of arch_spinlock_t]
Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>


# 9a55fdbe 05-Aug-2013 Andi Kleen <ak@linux.intel.com>

x86, asmlinkage, paravirt: Add __visible/asmlinkage to xen paravirt ops

Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1375740170-7446-13-git-send-email-andi@firstfloor.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>


# 357d1226 05-Apr-2013 Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

x86, xen, gdt: Remove the pvops variant of store_gdt.

The two use-cases where we needed to store the GDT were during ACPI S3 suspend
and resume. As the patches:
x86/gdt/i386: store/load GDT for ACPI S3 or hibernation/resume path is not needed
x86/gdt/64-bit: store/load GDT for ACPI S3 or hibernate/resume path is not needed.

have demonstrated - there are other mechanism by which the GDT is
saved and reloaded during early resume path.

Hence we do not need to worry about the pvops call-chain for saving the
GDT and can and can eliminate it. The other areas where the store_gdt is
used are never going to be hit when running under the pvops platforms.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Link: http://lkml.kernel.org/r/1365194544-14648-4-git-send-email-konrad.wilk@oracle.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>


# 511ba86e 23-Mar-2013 Boris Ostrovsky <boris.ostrovsky@oracle.com>

x86, mm: Patch out arch_flush_lazy_mmu_mode() when running on bare metal

Invoking arch_flush_lazy_mmu_mode() results in calls to
preempt_enable()/disable() which may have performance impact.

Since lazy MMU is not used on bare metal we can patch away
arch_flush_lazy_mmu_mode() so that it is never called in such
environment.

[ hpa: the previous patch "Fix vmalloc_fault oops during lazy MMU
updates" may cause a minor performance regression on
bare metal. This patch resolves that performance regression. It is
somewhat unclear to me if this is a good -stable candidate. ]

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: http://lkml.kernel.org/r/1364045796-10720-2-git-send-email-konrad.wilk@oracle.com
Tested-by: Josh Boyer <jwboyer@redhat.com>
Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: <stable@vger.kernel.org> SEE NOTE ABOVE


# e7b52ffd 27-Jun-2012 Alex Shi <alex.shi@linux.alibaba.com>

x86/flush_tlb: try flush_tlb_single one by one in flush_tlb_range

x86 has no flush_tlb_range support in instruction level. Currently the
flush_tlb_range just implemented by flushing all page table. That is not
the best solution for all scenarios. In fact, if we just use 'invlpg' to
flush few lines from TLB, we can get the performance gain from later
remain TLB lines accessing.

But the 'invlpg' instruction costs much of time. Its execution time can
compete with cr3 rewriting, and even a bit more on SNB CPU.

So, on a 512 4KB TLB entries CPU, the balance points is at:
(512 - X) * 100ns(assumed TLB refill cost) =
X(TLB flush entries) * 100ns(assumed invlpg cost)

Here, X is 256, that is 1/2 of 512 entries.

But with the mysterious CPU pre-fetcher and page miss handler Unit, the
assumed TLB refill cost is far lower then 100ns in sequential access. And
2 HT siblings in one core makes the memory access more faster if they are
accessing the same memory. So, in the patch, I just do the change when
the target entries is less than 1/16 of whole active tlb entries.
Actually, I have no data support for the percentage '1/16', so any
suggestions are welcomed.

As to hugetlb, guess due to smaller page table, and smaller active TLB
entries, I didn't see benefit via my benchmark, so no optimizing now.

My micro benchmark show in ideal scenarios, the performance improves 70
percent in reading. And in worst scenario, the reading/writing
performance is similar with unpatched 3.4-rc4 kernel.

Here is the reading data on my 2P * 4cores *HT NHM EP machine, with THP
'always':

multi thread testing, '-t' paramter is thread number:
with patch unpatched 3.4-rc4
./mprotect -t 1 14ns 24ns
./mprotect -t 2 13ns 22ns
./mprotect -t 4 12ns 19ns
./mprotect -t 8 14ns 16ns
./mprotect -t 16 28ns 26ns
./mprotect -t 32 54ns 51ns
./mprotect -t 128 200ns 199ns

Single process with sequencial flushing and memory accessing:

with patch unpatched 3.4-rc4
./mprotect 7ns 11ns
./mprotect -p 4096 -l 8 -n 10240
21ns 21ns

[ hpa: http://lkml.kernel.org/r/1B4B44D9196EFF41AE41FDA404FC0A100BFF94@SHSMSX101.ccr.corp.intel.com
has additional performance numbers. ]

Signed-off-by: Alex Shi <alex.shi@intel.com>
Link: http://lkml.kernel.org/r/1340845344-27557-3-git-send-email-alex.shi@intel.com
Signed-off-by: H. Peter Anvin <hpa@zytor.com>


# 1f975f78 01-Jun-2012 Andre Przywara <andre.przywara@amd.com>

x86, pvops: Remove hooks for {rd,wr}msr_safe_regs

There were paravirt_ops hooks for the full register set variant of
{rd,wr}msr_safe which are actually not used by anyone anymore. Remove
them to make the code cleaner and avoid silent breakages when the pvops
members were uninitialized. This has been boot-tested natively and under
Xen with PVOPS enabled and disabled on one machine.

Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Link: http://lkml.kernel.org/r/1338562358-28182-2-git-send-email-bp@amd64.org
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>


# 318f5a2a 03-Aug-2011 Andy Lutomirski <luto@MIT.EDU>

x86-64: Add user_64bit_mode paravirt op

Three places in the kernel assume that the only long mode CPL 3
selector is __USER_CS. This is not true on Xen -- Xen's sysretq
changes cs to the magic value 0xe033.

Two of the places are corner cases, but as of "x86-64: Improve
vsyscall emulation CS and RIP handling"
(c9712944b2a12373cb6ff8059afcfb7e826a6c54), vsyscalls will segfault
if called with Xen's extra CS selector. This causes a panic when
older init builds die.

It seems impossible to make Xen use __USER_CS reliably without
taking a performance hit on every system call, so this fixes the
tests instead with a new paravirt op. It's a little ugly because
ptrace.h can't include paravirt.h.

Signed-off-by: Andy Lutomirski <luto@mit.edu>
Link: http://lkml.kernel.org/r/f4fcb3947340d9e96ce1054a432f183f9da9db83.1312378163.git.luto@mit.edu
Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>


# 3c404b57 11-Jul-2011 Glauber Costa <glommer@redhat.com>

KVM guest: Add a pv_ops stub for steal time

This patch adds a function pointer in one of the many paravirt_ops
structs, to allow guests to register a steal time function. Besides
a steal time function, we also declare two jump_labels. They will be
used to allow the steal time code to be easily bypassed when not
in use.

Signed-off-by: Glauber Costa <glommer@redhat.com>
Acked-by: Rik van Riel <riel@redhat.com>
Tested-by: Eric B Munson <emunson@mgebm.net>
CC: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>


# 331127f7 13-Jan-2011 Andrea Arcangeli <aarcange@redhat.com>

thp: add pmd paravirt ops

Paravirt ops pmd_update/pmd_update_defer/pmd_set_at. Not all might be
necessary (vmware needs pmd_update, Xen needs set_pmd_at, nobody needs
pmd_update_defer), but this is to keep full simmetry with pte paravirt
ops, which looks cleaner and simpler from a common code POV.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# b0f4c062 23-Aug-2010 Alok Kataria <akataria@vmware.com>

x86, paravirt: Remove alloc_pmd_clone hook, only used by VMI

VMI was the only user of the alloc_pmd_clone hook, given that VMI
is now removed we can also remove this hook.

Signed-off-by: Alok N Kataria <akataria@vmware.com>
LKML-Reference: <1282608357.19396.36.camel@ank32.eng.vmware.com>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>


# dad52fc0 26-Feb-2010 Ian Campbell <ian.campbell@citrix.com>

x86, paravirt: Remove kmap_atomic_pte paravirt op.

Now that both Xen and VMI disable allocations of PTE pages from high
memory this paravirt op serves no further purpose.

This effectively reverts ce6234b5 "add kmap_atomic_pte for mapping
highpte pages".

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
LKML-Reference: <1267204562-11844-3-git-send-email-ian.campbell@citrix.com>
Acked-by: Alok Kataria <akataria@vmware.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>


# 445c8951 02-Dec-2009 Thomas Gleixner <tglx@linutronix.de>

locking: Convert raw_spinlock to arch_spinlock

The raw_spin* namespace was taken by lockdep for the architecture
specific implementations. raw_spin_* would be the ideal name space for
the spinlocks which are not converted to sleeping locks in preempt-rt.

Linus suggested to convert the raw_ to arch_ locks and cleanup the
name space instead of using an artifical name like core_spin,
atomic_spin or whatever

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: linux-arch@vger.kernel.org


# 71999d98 12-Oct-2009 Jeremy Fitzhardinge <jeremy@goop.org>

x86/paravirt: Use normal calling sequences for irq enable/disable

Bastian Blank reported a boot crash with stackprotector enabled,
and debugged it back to edx register corruption.

For historical reasons irq enable/disable/save/restore had special
calling sequences to make them more efficient. With the more
recent introduction of higher-level and more general optimisations
this is no longer necessary so we can just use the normal PVOP_
macros.

This fixes some residual bugs in the old implementations which left
edx liable to inadvertent clobbering. Also, fix some bugs in
__PVOP_VCALLEESAVE which were revealed by actual use.

Reported-by: Bastian Blank <bastian@waldi.eu.org>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stable Kernel <stable@kernel.org>
Cc: Xen-devel <xen-devel@lists.xensource.com>
LKML-Reference: <4AD3BC9B.7040501@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 7bd867df 09-Sep-2009 Feng Tang <feng.tang@intel.com>

x86: Move get/set_wallclock to x86_platform_ops

get/set_wallclock() have already a set of platform dependent
implementations (default, EFI, paravirt). MRST will add another
variant.

Moving them to platform ops simplifies the existing code and minimizes
the effort to integrate new variants.

Signed-off-by: Feng Tang <feng.tang@intel.com>
LKML-Reference: <new-submission>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>


# 845b3944 19-Aug-2009 Thomas Gleixner <tglx@linutronix.de>

x86: Add timer_init to x86_init_ops

The timer init code is convoluted with several quirks and the paravirt
timer chooser. Figuring out which code path is actually taken is not
for the faint hearted.

Move the numaq TSC quirk to tsc_pre_init x86_init_ops function and
replace the paravirt time chooser and the remaining x86 quirk with a
simple x86_init_ops function.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>


# 736decac 18-Aug-2009 Thomas Gleixner <tglx@linutronix.de>

x86: Move percpu clockevents setup to x86_init_ops

paravirt overrides the setup of the default apic timers as per cpu
timers. Moorestown needs to override that as well.

Move it to x86_init_ops setup and create a separate x86_cpuinit struct
which holds the function for the secondary evtl. hotplugabble CPUs.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>


# f1d7062a 20-Aug-2009 Thomas Gleixner <tglx@linutronix.de>

x86: Move xen_post_allocator_init into xen_pagetable_setup_done

We really do not need two paravirt/x86_init_ops functions which are
called in two consecutive source lines. Move the only user of
post_allocator_init into the already existing pagetable_setup_done
function.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>


# 030cb6c0 20-Aug-2009 Thomas Gleixner <tglx@linutronix.de>

x86: Move paravirt pagetable_setup to x86_init_ops

Replace more paravirt hackery by proper x86_init_ops.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>


# 6f30c1ac 20-Aug-2009 Thomas Gleixner <tglx@linutronix.de>

x86: Move paravirt banner printout to x86_init_ops

Replace another obscure paravirt magic and move it to
x86_init_ops. Such a hook is also useful for embedded and special
hardware.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>


# 42bbdb43 20-Aug-2009 Thomas Gleixner <tglx@linutronix.de>

x86: Replace ARCH_SETUP by a proper x86_init_ops

ARCH_SETUP is a horrible leftover from the old arch/i386 mach support
code. It still has a lonely user in xen. Move it to x86_init_ops.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>


# 66bcaf0b 20-Aug-2009 Thomas Gleixner <tglx@linutronix.de>

x86: Move irq_init to x86_init_ops

irq_init is overridden by x86_quirks and by paravirts. Unify the whole
mess and make it an unconditional x86_init_ops function which defaults
to the standard function and can be overridden by the early platform
code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>


# 6b18ae3e 20-Aug-2009 Thomas Gleixner <tglx@linutronix.de>

x86: Move memory_setup to x86_init_ops

memory_setup is overridden by x86_quirks and by paravirts with weak
functions and quirks. Unify the whole mess and make it an
unconditional x86_init_ops function which defaults to the standard
function and can be overridden by the early platform code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>


# ac5672f8 14-Apr-2009 Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

x86/paravirt: split paravirt definitions into paravirt_types.h

Split the monolithic asm/paravirt.h into separate paravirt.h (inlines and other
"active" definitions), and paravirt_types.h (types, constants and other "passive"
definitions). This makes it easier to use the type/constant definitions without
pulling in everything else and causing circular dependency problems.

[ Impact: cleanup ]

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>