History log of /linux-master/arch/powerpc/kvm/book3s_hv_p9_entry.c
Revision Date Author Comments
# 7028ac8d 13-Sep-2023 Jordan Niethe <jniethe5@gmail.com>

KVM: PPC: Use accessors for VCPU registers

Introduce accessor generator macros for VCPU registers. Use the accessor
functions to replace direct accesses to this registers.

This will be important later for Nested APIv2 support which requires
additional functionality for accessing and modifying VCPU state.

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230914030600.16993-5-jniethe5@gmail.com


# db536084 10-Jul-2022 Kajol Jain <kjain@linux.ibm.com>

powerpc/kvm: Move pmu code in kvm folder to separate file for power9 and later platforms

File book3s_hv_p9_entry.c in powerpc/kvm folder consists of functions
like freeze_pmu, switch_pmu_to_guest and switch_pmu_to_host which are
specific to Performance Monitoring Unit(PMU) for power9 and later
platforms.

For better maintenance, moving pmu related code from
book3s_hv_p9_entry.c to a new file called book3s_hv_p9_perf.c,
without any logic change.
Also make corresponding changes in the Makefile to include
book3s_hv_p9_perf.c during compilation.

Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220711034927.213192-1-kjain@linux.ibm.com


# b44bb1b7 25-May-2022 Fabiano Rosas <farosas@linux.ibm.com>

KVM: PPC: Book3S HV: Provide more detailed timings for P9 entry path

Alter the data collection points for the debug timing code in the P9
path to be more in line with what the code does. The points where we
accumulate time are now the following:

vcpu_entry: From vcpu_run_hv entry until the start of the inner loop;

guest_entry: From the start of the inner loop until the guest entry in
asm;

in_guest: From the guest entry in asm until the return to KVM C code;

guest_exit: From the return into KVM C code until the corresponding
hypercall/page fault handling or re-entry into the guest;

hypercall: Time spent handling hcalls in the kernel (hcalls can go to
QEMU, not accounted here);

page_fault: Time spent handling page faults;

vcpu_exit: vcpu_run_hv exit (almost no code here currently).

Like before, these are exposed in debugfs in a file called
"timings". There are four values:

- number of occurrences of the accumulation point;
- total time the vcpu spent in the phase in ns;
- shortest time the vcpu spent in the phase in ns;
- longest time the vcpu spent in the phase in ns;

===
Before:

rm_entry: 53132 16793518 256 4060
rm_intr: 53132 2125914 22 340
rm_exit: 53132 24108344 374 2180
guest: 53132 40980507996 404 9997650
cede: 0 0 0 0

After:

vcpu_entry: 34637 7716108 178 4416
guest_entry: 52414 49365608 324 747542
in_guest: 52411 40828715840 258 9997480
guest_exit: 52410 19681717182 826 102496674
vcpu_exit: 34636 1744462 38 182
hypercall: 45712 22878288 38 1307962
page_fault: 992 111104034 568 168688

With just one instruction (hcall):

vcpu_entry: 1 942 942 942
guest_entry: 1 4044 4044 4044
in_guest: 1 1540 1540 1540
guest_exit: 1 3542 3542 3542
vcpu_exit: 1 80 80 80
hypercall: 0 0 0 0
page_fault: 0 0 0 0
===

Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220525130554.2614394-6-farosas@linux.ibm.com


# 2861c827 25-May-2022 Fabiano Rosas <farosas@linux.ibm.com>

KVM: PPC: Book3S HV: Expose timing functions to module code

The next patch adds new timing points to the P9 entry path, some of
which are in the module code, so we need to export the timing
functions.

Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220525130554.2614394-5-farosas@linux.ibm.com


# c3fa64c9 25-May-2022 Fabiano Rosas <farosas@linux.ibm.com>

KVM: PPC: Book3S HV: Decouple the debug timing from the P8 entry path

We are currently doing the timing for debug purposes of the P9 entry
path using the accumulators and terminology defined by the old entry
path for P8 machines.

Not only the "real-mode" and "napping" mentions are out of place for
the P9 Radix entry path but also we cannot change them because the
timing code is coupled to the structures defined in struct
kvm_vcpu_arch.

Add a new CONFIG_KVM_BOOK3S_HV_P9_TIMING to enable the timing code for
the P9 entry path. For now, just add the new CONFIG and duplicate the
structures. A subsequent patch will add the P9 changes.

Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220525130554.2614394-4-farosas@linux.ibm.com


# 9981bace 25-May-2022 Fabiano Rosas <farosas@linux.ibm.com>

KVM: PPC: Book3S HV: Fix "rm_exit" entry in debugfs timings

At debugfs/kvm/<pid>/vcpu0/timings we show how long each part of the
code takes to run:

$ cat /sys/kernel/debug/kvm/*-*/vcpu0/timings
rm_entry: 123785 49398892 118 4898
rm_intr: 123780 6075890 22 390
rm_exit: 0 0 0 0 <-- NOK
guest: 123780 46732919988 402 9997638
cede: 0 0 0 0 <-- OK, no cede napping in P9

The "rm_exit" is always showing zero because it is the last one and
end_timing does not increment the counter of the previous entry.

We can fix it by calling accumulate_time again instead of
end_timing. That way the counter gets incremented. The rest of the
arithmetic can be ignored because there are no timing points after
this and the accumulators are reset before the next round.

Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220525130554.2614394-2-farosas@linux.ibm.com


# 361234d7 23-Jan-2022 Nicholas Piggin <npiggin@gmail.com>

KVM: PPC: Book3S HV P9: Optimise loads around context switch

It is better to get all loads for the register values in flight
before starting to switch LPID, PID, and LPCR because those
mtSPRs are expensive and serialising.

This also just tidies up the code for a potential future change
to the context switching sequence.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220123114725.3549202-1-npiggin@gmail.com


# 1fd02f66 30-Apr-2022 Julia Lawall <Julia.Lawall@inria.fr>

powerpc: fix typos in comments

Various spelling mistakes in comments.
Detected with the help of Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220430185654.5855-1-Julia.Lawall@inria.fr


# f6a19877 30-Nov-2021 Nicholas Piggin <npiggin@gmail.com>

KVM: PPC: Book3S HV P9: Remove unused ri_set local variable

ri_set is set and never used.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211201052112.2137167-1-npiggin@gmail.com


# 9c5a432a 23-Nov-2021 Nicholas Piggin <npiggin@gmail.com>

KVM: PPC: Book3S HV P9: Remove subcore HMI handling

On POWER9 and newer, rather than the complex HMI synchronisation and
subcore state, have each thread un-apply the guest TB offset before
calling into the early HMI handler.

This allows the subcore state to be avoided, including subcore enter
/ exit guest, which includes an expensive divide that shows up
slightly in profiles.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211123095231.1036501-54-npiggin@gmail.com


# 6398326b 23-Nov-2021 Nicholas Piggin <npiggin@gmail.com>

KVM: PPC: Book3S HV P9: Stop using vc->dpdes

The P9 path uses vc->dpdes only for msgsndp / SMT emulation. This adds
an ordering requirement between vcpu->doorbell_request and vc->dpdes for
no real benefit. Use vcpu->doorbell_request directly.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211123095231.1036501-53-npiggin@gmail.com


# f08cbf5c 23-Nov-2021 Nicholas Piggin <npiggin@gmail.com>

KVM: PPC: Book3S HV P9: Avoid changing MSR[RI] in entry and exit

kvm_hstate.in_guest provides the equivalent of MSR[RI]=0 protection,
and it covers the existing MSR[RI]=0 section in late entry and early
exit, so clearing and setting MSR[RI] in those cases does not
actually do anything useful.

Remove the RI manipulation and replace it with comments. Make the
in_guest memory accesses a bit closer to a proper critical section
pattern. This speeds up guest entry/exit performance.

This also removes the MSR[RI] warnings which aren't very interesting
and would cause crashes if they hit due to causing an interrupt in
non-recoverable code.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211123095231.1036501-48-npiggin@gmail.com


# 241d1f19 23-Nov-2021 Nicholas Piggin <npiggin@gmail.com>

KVM: PPC: Book3S HV P9: Optimise hash guest SLB saving

slbmfee/slbmfev instructions are very expensive, moreso than a regular
mfspr instruction, so minimising them significantly improves hash guest
exit performance. The slbmfev is only required if slbmfee found a valid
SLB entry.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211123095231.1036501-47-npiggin@gmail.com


# b49c65c5 23-Nov-2021 Nicholas Piggin <npiggin@gmail.com>

KVM: PPC: Book3S HV P9: Improve mfmsr performance on entry

Rearrange the MSR saving on entry so it does not follow the mtmsrd to
disable interrupts, avoiding a possible RAW scoreboard stall.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211123095231.1036501-46-npiggin@gmail.com


# d5c0e833 23-Nov-2021 Nicholas Piggin <npiggin@gmail.com>

KVM: PPC: Book3S HV P9: Avoid tlbsync sequence on radix guest exit

Use the existing TLB flushing logic to IPI the previous CPU and run the
necessary barriers before running a guest vCPU on a new physical CPU,
to do the necessary radix GTSE barriers for handling the case of an
interrupted guest tlbie sequence.

This requires the vCPU TLB flush sequence that is currently just done
on one thread, to be expanded to ensure the other threads execute a
ptesync, because causing them to exit the guest will no longer cause a
ptesync by itself.

This results in more IPIs than the TLB flush logic requires, but it's
a significant win for common case scheduling when the vCPU remains on
the same physical CPU.

This saves about 520 cycles (nearly 10%) on a guest entry+exit micro
benchmark on a POWER9.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211123095231.1036501-44-npiggin@gmail.com


# 0ba0e5d5 23-Nov-2021 Nicholas Piggin <npiggin@gmail.com>

KVM: PPC: Book3S HV: Split P8 from P9 path guest vCPU TLB flushing

This creates separate functions for old and new paths for vCPU TLB
flushing, which will reduce complexity of the next change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211123095231.1036501-43-npiggin@gmail.com


# a089a686 23-Nov-2021 Nicholas Piggin <npiggin@gmail.com>

KVM: PPC: Book3S HV P9: Don't restore PSSCR if not needed

This also moves the PSSCR update in nested entry to avoid a SPR
scoreboard stall.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211123095231.1036501-42-npiggin@gmail.com


# 9c75f65f 23-Nov-2021 Nicholas Piggin <npiggin@gmail.com>

KVM: PPC: Book3S HV P9: Test dawr_enabled() before saving host DAWR SPRs

Some of the DAWR SPR access is already predicated on dawr_enabled(),
apply this to the remainder of the accesses.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211123095231.1036501-41-npiggin@gmail.com


# cf3b16cf 23-Nov-2021 Nicholas Piggin <npiggin@gmail.com>

KVM: PPC: Book3S HV P9: Comment and fix MMU context switching code

Tighten up partition switching code synchronisation and comments.

In particular, hwsync ; isync is required after the last access that is
performed in the context of a partition, before the partition is
switched away from.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211123095231.1036501-40-npiggin@gmail.com


# 5236756d 23-Nov-2021 Nicholas Piggin <npiggin@gmail.com>

KVM: PPC: Book3S HV P9: Use Linux SPR save/restore to manage some host SPRs

Linux implements SPR save/restore including storage space for registers
in the task struct for process context switching. Make use of this
similarly to the way we make use of the context switching fp/vec save
restore.

This improves code reuse, allows some stack space to be saved, and helps
with avoiding VRSAVE updates if they are not required.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211123095231.1036501-39-npiggin@gmail.com


# 022ecb96 23-Nov-2021 Nicholas Piggin <npiggin@gmail.com>

KVM: PPC: Book3S HV P9: Demand fault TM facility registers

Use HFSCR facility disabling to implement demand faulting for TM, with
a hysteresis counter similar to the load_fp etc counters in context
switching that implement the equivalent demand faulting for userspace
facilities.

This speeds up guest entry/exit by avoiding the register save/restore
when a guest is not frequently using them. When a guest does use them
often, there will be some additional demand fault overhead, but these
are not commonly used facilities.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211123095231.1036501-38-npiggin@gmail.com


# a3e18ca8 23-Nov-2021 Nicholas Piggin <npiggin@gmail.com>

KVM: PPC: Book3S HV P9: Demand fault EBB facility registers

Use HFSCR facility disabling to implement demand faulting for EBB, with
a hysteresis counter similar to the load_fp etc counters in context
switching that implement the equivalent demand faulting for userspace
facilities.

This speeds up guest entry/exit by avoiding the register save/restore
when a guest is not frequently using them. When a guest does use them
often, there will be some additional demand fault overhead, but these
are not commonly used facilities.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211123095231.1036501-37-npiggin@gmail.com


# 34e02d55 23-Nov-2021 Nicholas Piggin <npiggin@gmail.com>

KVM: PPC: Book3S HV P9: More SPR speed improvements

This avoids more scoreboard stalls and reduces mtSPRs.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211123095231.1036501-36-npiggin@gmail.com


# d55b1ecc 23-Nov-2021 Nicholas Piggin <npiggin@gmail.com>

KVM: PPC: Book3S HV P9: Restrict DSISR canary workaround to processors that require it

Use CPU_FTR_P9_RADIX_PREFETCH_BUG to apply the workaround, to test for
DD2.1 and below processors. This saves a mtSPR in guest entry.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211123095231.1036501-35-npiggin@gmail.com


# 3e7b3379 23-Nov-2021 Nicholas Piggin <npiggin@gmail.com>

KVM: PPC: Book3S HV P9: Switch PMU to guest as late as possible

This moves PMU switch to guest as late as possible in entry, and switch
back to host as early as possible at exit. This helps the host get the
most perf coverage of KVM entry/exit code as possible.

This is slightly suboptimal for SPR scheduling point of view when the
PMU is enabled, but when perf is disabled there is no real difference.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211123095231.1036501-34-npiggin@gmail.com


# 3f9e2966 23-Nov-2021 Nicholas Piggin <npiggin@gmail.com>

KVM: PPC: Book3S HV P9: Implement TM fastpath for guest entry/exit

If TM is not active, only TM register state needs to be saved and
restored, avoiding several mfmsr/mtmsrd instructions and improving
performance.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211123095231.1036501-33-npiggin@gmail.com


# d5f48019 23-Nov-2021 Nicholas Piggin <npiggin@gmail.com>

KVM: PPC: Book3S HV P9: Move remaining SPR and MSR access into low level entry

Move register saving and loading from kvmhv_p9_guest_entry() into the HV
and nested entry handlers.

Accesses are scheduled to reduce mtSPR / mfSPR interleaving which
reduces SPR scoreboard stalls.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211123095231.1036501-32-npiggin@gmail.com


# aabcaf6a 23-Nov-2021 Nicholas Piggin <npiggin@gmail.com>

KVM: PPC: Book3S HV P9: Move host OS save/restore functions to built-in

Move the P9 guest/host register switching functions to the built-in
P9 entry code, and export it for nested to use as well.

This allows more flexibility in scheduling these supervisor privileged
SPR accesses with the HV privileged and PR SPR accesses in the low level
entry code.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211123095231.1036501-30-npiggin@gmail.com


# 9a1e530b 23-Nov-2021 Nicholas Piggin <npiggin@gmail.com>

KVM: PPC: Book3S HV P9: Avoid SPR scoreboard stalls

Avoid interleaving mfSPR and mtSPR to reduce SPR scoreboard stalls.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211123095231.1036501-26-npiggin@gmail.com


# cb2553a0 23-Nov-2021 Nicholas Piggin <npiggin@gmail.com>

KVM: PPC: Book3S HV P9: Optimise timebase reads

Reduce the number of mfTB executed by passing the current timebase
around entry and exit code rather than read it multiple times.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211123095231.1036501-25-npiggin@gmail.com


# 6547af3e 23-Nov-2021 Nicholas Piggin <npiggin@gmail.com>

KVM: PPC: Book3S HV P9: Move TB updates

Move the TB updates between saving and loading guest and host SPRs,
to improve scheduling by keeping issue-NTC operations together as
much as possible.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211123095231.1036501-24-npiggin@gmail.com


# 3c1a4322 23-Nov-2021 Nicholas Piggin <npiggin@gmail.com>

KVM: PPC: Book3S HV: Change dec_expires to be relative to guest timebase

Change dec_expires to be relative to the guest timebase, and allow
it to be moved into low level P9 guest entry functions, to improve
SPR access scheduling.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211123095231.1036501-23-npiggin@gmail.com


# 34e119c9 23-Nov-2021 Nicholas Piggin <npiggin@gmail.com>

KVM: PPC: Book3S HV P9: Reduce mtmsrd instructions required to save host SPRs

This reduces the number of mtmsrd required to enable facility bits when
saving/restoring registers, by having the KVM code set all bits up front
rather than using individual facility functions that set their particular
MSR bits.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211123095231.1036501-20-npiggin@gmail.com


# 46f9caf1 23-Nov-2021 Nicholas Piggin <npiggin@gmail.com>

powerpc/64s: Keep AMOR SPR a constant ~0 at runtime

This register controls supervisor SPR modifications, and as such is only
relevant for KVM. KVM always sets AMOR to ~0 on guest entry, and never
restores it coming back out to the host, so it can be kept constant and
avoid the mtSPR in KVM guest entry.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211123095231.1036501-10-npiggin@gmail.com


# 34bf08a2 23-Nov-2021 Nicholas Piggin <npiggin@gmail.com>

KVM: PPC: Book3S HV P9: Reduce mftb per guest entry/exit

mftb is serialising (dispatch next-to-complete) so it is heavy weight
for a mfspr. Avoid reading it multiple times in the entry or exit paths.
A small number of cycles delay to timers is tolerable.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211123095231.1036501-7-npiggin@gmail.com


# 9581991a 23-Nov-2021 Nicholas Piggin <npiggin@gmail.com>

KVM: PPC: Book3S HV P9: Use large decrementer for HDEC

On processors that don't suppress the HDEC exceptions when LPCR[HDICE]=0,
this could help reduce needless guest exits due to leftover exceptions on
entering the guest.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211123095231.1036501-6-npiggin@gmail.com


# e44fbdb6 11-Jul-2021 Nicholas Piggin <npiggin@gmail.com>

KVM: PPC: Book3S HV P9: Fix guest TM support

The conversion to C introduced several bugs in TM handling that can
cause host crashes with TM bad thing interrupts. Mostly just simple
typos or missed logic in the conversion that got through due to my
not testing TM in the guest sufficiently.

- Early TM emulation for the softpatch interrupt should be done if fake
suspend mode is _not_ active.

- Early TM emulation wants to return immediately to the guest so as to
not doom transactions unnecessarily.

- And if exiting from the guest, the host MSR should include the TM[S]
bit if the guest was T/S, before it is treclaimed.

After this fix, all the TM selftests pass when running on a P9 processor
that implements TM with softpatch interrupt.

Fixes: 89d35b2391015 ("KVM: PPC: Book3S HV P9: Implement the rest of the P9 path in C")
Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210712013650.376325-1-npiggin@gmail.com


# 0bf7e1b2 28-May-2021 Nicholas Piggin <npiggin@gmail.com>

KVM: PPC: Book3S HV P9: implement hash host / hash guest support

Implement support for hash guests under hash host. This has to save and
restore the host SLB, and ensure that the MMU is off while switching
into the guest SLB.

POWER9 and later CPUs now always go via the P9 path. The "fast" guest
mode is now renamed to the P9 mode, which is consistent with its
functionality and the rest of the naming.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210528090752.3542186-32-npiggin@gmail.com


# 079a09a5 28-May-2021 Nicholas Piggin <npiggin@gmail.com>

KVM: PPC: Book3S HV P9: implement hash guest support

Implement hash guest support. Guest entry/exit has to restore and
save/clear the SLB, plus several other bits to accommodate hash guests
in the P9 path. Radix host, hash guest support is removed from the P7/8
path.

The HPT hcalls and faults are not handled in real mode, which is a
performance regression. A worst-case fork/exit microbenchmark takes 3x
longer after this patch. kbuild benchmark performance is in the noise,
but the slowdown is likely to be noticed somewhere.

For now, accept this penalty for the benefit of simplifying the P7/8
paths and unifying P9 hash with the new code, because hash is a less
important configuration than radix on processors that support it. Hash
will benefit from future optimisations to this path, including possibly
a faster path to handle such hcalls and interrupts without doing a full
exit.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210528090752.3542186-31-npiggin@gmail.com


# 2e1ae9cd 28-May-2021 Nicholas Piggin <npiggin@gmail.com>

KVM: PPC: Book3S HV: Implement radix prefetch workaround by disabling MMU

Rather than partition the guest PID space + flush a rogue guest PID to
work around this problem, instead fix it by always disabling the MMU when
switching in or out of guest MMU context in HV mode.

This may be a bit less efficient, but it is a lot less complicated and
allows the P9 path to trivally implement the workaround too. Newer CPUs
are not subject to this issue.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210528090752.3542186-22-npiggin@gmail.com


# 41f77991 28-May-2021 Nicholas Piggin <npiggin@gmail.com>

KVM: PPC: Book3S HV P9: Switch to guest MMU context as late as possible

Move MMU context switch as late as reasonably possible to minimise code
running with guest context switched in. This becomes more important when
this code may run in real-mode, with later changes.

Move WARN_ON as early as possible so program check interrupts are less
likely to tangle everything up.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210528090752.3542186-21-npiggin@gmail.com


# 68e3baac 28-May-2021 Nicholas Piggin <npiggin@gmail.com>

KVM: PPC: Book3S HV P9: Move SPR loading after expiry time check

This is wasted work if the time limit is exceeded.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210528090752.3542186-19-npiggin@gmail.com


# a32ed1bb 28-May-2021 Nicholas Piggin <npiggin@gmail.com>

KVM: PPC: Book3S HV P9: Improve exit timing accounting coverage

The C conversion caused exit timing to become a bit cramped. Expand it
to cover more of the entry and exit code.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210528090752.3542186-18-npiggin@gmail.com


# 6d770e3f 28-May-2021 Nicholas Piggin <npiggin@gmail.com>

KVM: PPC: Book3S HV P9: Read machine check registers while MSR[RI] is 0

SRR0/1, DAR, DSISR must all be protected from machine check which can
clobber them. Ensure MSR[RI] is clear while they are live.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210528090752.3542186-17-npiggin@gmail.com


# c00366e2 28-May-2021 Nicholas Piggin <npiggin@gmail.com>

KVM: PPC: Book3S HV P9: inline kvmhv_load_hv_regs_and_go into __kvmhv_vcpu_entry_p9

Now the initial C implementation is done, inline more HV code to make
rearranging things easier.

And rename __kvmhv_vcpu_entry_p9 to drop the leading underscores as it's
now C, and is now a more complete vcpu entry.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210528090752.3542186-16-npiggin@gmail.com


# 89d35b23 28-May-2021 Nicholas Piggin <npiggin@gmail.com>

KVM: PPC: Book3S HV P9: Implement the rest of the P9 path in C

Almost all logic is moved to C, by introducing a new in_guest mode for
the P9 path that branches very early in the KVM interrupt handler to P9
exit code.

The main P9 entry and exit assembly is now only about 160 lines of low
level stack setup and register save/restore, plus a bad-interrupt
handler.

There are two motivations for this, the first is just make the code more
maintainable being in C. The second is to reduce the amount of code
running in a special KVM mode, "realmode". In quotes because with radix
it is no longer necessarily real-mode in the MMU, but it still has to be
treated specially because it may be in real-mode, and has various
important registers like PID, DEC, TB, etc set to guest. This is hostile
to the rest of Linux and can't use arbitrary kernel functionality or be
instrumented well.

This initial patch is a reasonably faithful conversion of the asm code,
but it does lack any loop to return quickly back into the guest without
switching out of realmode in the case of unimportant or easily handled
interrupts. As explained in previous changes, handling HV interrupts
very quickly in this low level realmode is not so important for P9
performance, and are important to avoid for security, observability,
debugability reasons.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210528090752.3542186-15-npiggin@gmail.com