#
e12d8e26 |
|
29-Nov-2023 |
Zhao Ke <ke.zhao@shingroup.cn> |
powerpc: Add PVN support for HeXin C2000 processor HeXin Tech Co. has applied for a new PVN from the OpenPower Community for its new processor C2000. The OpenPower has assigned a new PVN and this newly assigned PVN is 0x0066, add pvr register related support for this PVN. Signed-off-by: Zhao Ke <ke.zhao@shingroup.cn> Link: https://discuss.openpower.foundation/t/how-to-get-a-new-pvr-for-processors-follow-power-isa/477/10 Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20231129075845.57976-1-ke.zhao@shingroup.cn
|
#
67c48662e |
|
08-Feb-2023 |
Thomas Huth <thuth@redhat.com> |
KVM: PPC: Standardize on "int" return types in the powerpc KVM code Most functions that are related to kvm_arch_vm_ioctl() already use "int" as return type to pass error values back to the caller. Some outlier functions use "long" instead for no good reason (they do not really require long values here). Let's standardize on "int" here to avoid casting the values back and forth between the two types. Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <20230208140105.655814-2-thuth@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
a3800ef9 |
|
07-Mar-2023 |
Paul Mackerras <paulus@ozlabs.org> |
KVM: PPC: Enable prefixed instructions for HV KVM and disable for PR KVM Now that we can read prefixed instructions from a HV KVM guest and emulate prefixed load/store instructions to emulated MMIO locations, we can add HFSCR_PREFIXED into the set of bits that are set in the HFSCR for a HV KVM guest on POWER10, allowing the guest to use prefixed instructions. PR KVM has not yet been extended to handle prefixed instructions in all situations where we might need to emulate them, so prevent the guest from enabling prefixed instructions in the FSCR for now. Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Tested-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/ZAgs25dCmLrVkBdU@cleo
|
#
acf17878 |
|
07-Mar-2023 |
Paul Mackerras <paulus@ozlabs.org> |
KVM: PPC: Make kvmppc_get_last_inst() produce a ppc_inst_t This changes kvmppc_get_last_inst() so that the instruction it fetches is returned in a ppc_inst_t variable rather than a u32. This will allow us to return a 64-bit prefixed instruction on those 64-bit machines that implement Power ISA v3.1 or later, such as POWER10. On 32-bit platforms, ppc_inst_t is 32 bits wide, and is turned back into a u32 by ppc_inst_val, which is an identity operation on those platforms. Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Tested-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/ZAgsiPlL9O7KnlZZ@cleo
|
#
460ba21d |
|
30-Mar-2023 |
Nicholas Piggin <npiggin@gmail.com> |
KVM: PPC: Permit SRR1 flags in more injected interrupt types The prefix architecture in ISA v3.1 introduces a prefixed bit in SRR1 for many types of synchronous interrupts which is set when the interrupt is caused by a prefixed instruction. This requires KVM to be able to set this bit when injecting interrupts into a guest. Plumb through the SRR1 "flags" argument to the core_queue APIs where it's missing for this. For now they are set to 0, which is no change. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Fixup kvmppc_core_queue_alignment() in booke.c] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230330103224.3589928-2-npiggin@gmail.com
|
#
c59fb127 |
|
20-Sep-2022 |
Paolo Bonzini <pbonzini@redhat.com> |
KVM: remove KVM_REQ_UNHALT KVM_REQ_UNHALT is now unnecessary because it is replaced by the return value of kvm_vcpu_block/kvm_vcpu_halt. Remove it. No functional change intended. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Acked-by: Marc Zyngier <maz@kernel.org> Message-Id: <20220921003201.1441511-13-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
1fd02f66 |
|
30-Apr-2022 |
Julia Lawall <Julia.Lawall@inria.fr> |
powerpc: fix typos in comments Various spelling mistakes in comments. Detected with the help of Coccinelle. Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Reviewed-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220430185654.5855-1-Julia.Lawall@inria.fr
|
#
b5149e22 |
|
21-Feb-2022 |
Nicholas Piggin <npiggin@gmail.com> |
KVM: PPC: Book3S PR: Disable SCV when AIL could be disabled PR KVM does not support running with AIL enabled, and SCV does is not supported with AIL disabled. Fix this by ensuring the SCV facility is disabled with FSCR while a CPU could be running with AIL=0. The PowerNV host supports disabling AIL on a per-CPU basis, so SCV just needs to be disabled when a vCPU is being run. The pSeries machine can only switch AIL on a system-wide basis, so it must disable SCV support at boot if the configuration can potentially run a PR KVM guest. Also ensure a the FSCR[SCV] bit can not be enabled when emulating mtFSCR for the guest. SCV is not emulated for the PR guest at the moment, this just fixes the host crashes. Alternatives considered and rejected: - SCV support can not be disabled by PR KVM after boot, because it is advertised to userspace with HWCAP. - AIL can not be disabled on a per-CPU basis. At least when running on pseries it is a per-LPAR setting. - Support for real-mode SCV vectors will not be added because they are at 0x17000 so making such a large fixed head space causes immediate value limits to be exceeded, requiring a lot rework and more code. - Disabling SCV for any PR KVM possible kernel will cause a slowdown when not using PR KVM. - A boot time option to disable SCV to use PR KVM is user-hostile. - System call instruction emulation for SCV facility unavailable instructions is too complex and old emulation code was subtly broken and removed. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Link: https://lore.kernel.org/r/20220222064727.2314380-2-npiggin@gmail.com
|
#
91b99ea7 |
|
08-Oct-2021 |
Sean Christopherson <seanjc@google.com> |
KVM: Rename kvm_vcpu_block() => kvm_vcpu_halt() Rename kvm_vcpu_block() to kvm_vcpu_halt() in preparation for splitting the actual "block" sequences into a separate helper (to be named kvm_vcpu_block()). x86 will use the standalone block-only path to handle non-halt cases where the vCPU is not runnable. Rename block_ns to halt_ns to match the new function name. No functional change intended. Reviewed-by: David Matlack <dmatlack@google.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211009021236.4122790-14-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
eaaaed13 |
|
06-Dec-2021 |
Sean Christopherson <seanjc@google.com> |
KVM: PPC: Avoid referencing userspace memory region in memslot updates For PPC HV, get the number of pages directly from the new memslot instead of computing the same from the userspace memory region, and explicitly check for !DELETE instead of inferring the same when toggling mmio_update. The motivation for these changes is to avoid referencing the @mem param so that it can be dropped in a future commit. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Message-Id: <1e97fb5198be25f98ef82e63a8d770c682264cc9.1638817639.git.maciej.szmigiero@oracle.com>
|
#
537a17b3 |
|
06-Dec-2021 |
Sean Christopherson <seanjc@google.com> |
KVM: Let/force architectures to deal with arch specific memslot data Pass the "old" slot to kvm_arch_prepare_memory_region() and force arch code to handle propagating arch specific data from "new" to "old" when necessary. This is a baby step towards dynamically allocating "new" from the get go, and is a (very) minor performance boost on x86 due to not unnecessarily copying arch data. For PPC HV, copy the rmap in the !CREATE and !DELETE paths, i.e. for MOVE and FLAGS_ONLY. This is functionally a nop as the previous behavior would overwrite the pointer for CREATE, and eventually discard/ignore it for DELETE. For x86, copy the arch data only for FLAGS_ONLY changes. Unlike PPC HV, x86 needs to reallocate arch data in the MOVE case as the size of x86's allocations depend on the alignment of the memslot's gfn. Opportunistically tweak kvm_arch_prepare_memory_region()'s param order to match the "commit" prototype. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> [mss: add missing RISCV kvm_arch_prepare_memory_region() change] Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Message-Id: <67dea5f11bbcfd71e3da5986f11e87f5dd4013f9.1638817639.git.maciej.szmigiero@oracle.com>
|
#
46808a4c |
|
16-Nov-2021 |
Marc Zyngier <maz@kernel.org> |
KVM: Use 'unsigned long' as kvm_for_each_vcpu()'s index Everywhere we use kvm_for_each_vpcu(), we use an int as the vcpu index. Unfortunately, we're about to move rework the iterator, which requires this to be upgrade to an unsigned long. Let's bite the bullet and repaint all of it in one go. Signed-off-by: Marc Zyngier <maz@kernel.org> Message-Id: <20211116160403.4074052-7-maz@kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
59dc5bfc |
|
17-Jun-2021 |
Nicholas Piggin <npiggin@gmail.com> |
powerpc/64s: avoid reloading (H)SRR registers if they are still valid When an interrupt is taken, the SRR registers are set to return to where it left off. Unless they are modified in the meantime, or the return address or MSR are modified, there is no need to reload these registers when returning from interrupt. Introduce per-CPU flags that track the validity of SRR and HSRR registers. These are cleared when returning from interrupt, when using the registers for something else (e.g., OPAL calls), when adjusting the return address or MSR of a context, and when context switching (which changes the return address and MSR). This improves the performance of interrupt returns. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Fold in fixup patch from Nick] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210617155116.2167984-5-npiggin@gmail.com
|
#
0193cc90 |
|
18-Jun-2021 |
Jing Zhang <jingzhangos@google.com> |
KVM: stats: Separate generic stats from architecture specific ones Generic KVM stats are those collected in architecture independent code or those supported by all architectures; put all generic statistics in a separate structure. This ensures that they are defined the same way in the statistics API which is being added, removing duplication among different architectures in the declaration of the descriptors. No functional change intended. Reviewed-by: David Matlack <dmatlack@google.com> Reviewed-by: Ricardo Koller <ricarkol@google.com> Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> Signed-off-by: Jing Zhang <jingzhangos@google.com> Message-Id: <20210618222709.1858088-2-jingzhangos@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
b1c5356e |
|
01-Apr-2021 |
Sean Christopherson <seanjc@google.com> |
KVM: PPC: Convert to the gfn-based MMU notifier callbacks Move PPC to the gfn-base MMU notifier APIs, and update all 15 bajillion PPC-internal hooks to work with gfns instead of hvas. No meaningful functional change intended, though the exact order of operations is slightly different since the memslot lookups occur before calling into arch code. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210402005658.3024832-6-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
e89a8ca9 |
|
05-Nov-2020 |
Nicholas Piggin <npiggin@gmail.com> |
powerpc/64s: Remove MSR[ISF] bit No supported processor implements this mode. Setting the bit in MSR values can be a bit confusing (and would prevent the bit from ever being reused). Remove it. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201106045340.1935841-1-npiggin@gmail.com
|
#
cf59eb13 |
|
21-Sep-2020 |
Wang Wensheng <wangwensheng4@huawei.com> |
KVM: PPC: Book3S: Fix symbol undeclared warnings Build the kernel with `C=2`: arch/powerpc/kvm/book3s_hv_nested.c:572:25: warning: symbol 'kvmhv_alloc_nested' was not declared. Should it be static? arch/powerpc/kvm/book3s_64_mmu_radix.c:350:6: warning: symbol 'kvmppc_radix_set_pte_at' was not declared. Should it be static? arch/powerpc/kvm/book3s_hv.c:3568:5: warning: symbol 'kvmhv_p9_guest_entry' was not declared. Should it be static? arch/powerpc/kvm/book3s_hv_rm_xics.c:767:15: warning: symbol 'eoi_rc' was not declared. Should it be static? arch/powerpc/kvm/book3s_64_vio_hv.c:240:13: warning: symbol 'iommu_tce_kill_rm' was not declared. Should it be static? arch/powerpc/kvm/book3s_64_vio.c:492:6: warning: symbol 'kvmppc_tce_iommu_do_map' was not declared. Should it be static? arch/powerpc/kvm/book3s_pr.c:572:6: warning: symbol 'kvmppc_set_pvr_pr' was not declared. Should it be static? Those symbols are used only in the files that define them so make them static to fix the warnings. Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
7ec21d9d |
|
23-Jun-2020 |
Tianjia Zhang <tianjia.zhang@linux.alibaba.com> |
KVM: PPC: Clean up redundant kvm_run parameters in assembly In the current kvm version, 'kvm_run' has been included in the 'kvm_vcpu' structure. For historical reasons, many kvm-related function parameters retain the 'kvm_run' and 'kvm_vcpu' parameters at the same time. This patch does a unified cleanup of these remaining redundant parameters. [paulus@ozlabs.org - Fixed places that were missed in book3s_interrupts.S] Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
1ef79040 |
|
03-Jun-2020 |
Kees Cook <keescook@chromium.org> |
KVM: PPC: Book3S PR: Remove uninitialized_var() usage Using uninitialized_var() is dangerous as it papers over real bugs[1] (or can in the future), and suppresses unrelated compiler warnings (e.g. "unused variable"). If the compiler thinks it is uninitialized, either simply initialize the variable or make compiler changes. As a precursor to removing[2] this[3] macro[4], just remove this variable since it was actually unused: arch/powerpc/kvm/book3s_pr.c:1832:16: warning: unused variable 'vrsave' [-Wunused-variable] unsigned long vrsave; ^ [1] https://lore.kernel.org/lkml/20200603174714.192027-1-glider@google.com/ [2] https://lore.kernel.org/lkml/CA+55aFw+Vbj0i=1TGqCR5vQkCzWJ0QxK6CernOU6eedsudAixw@mail.gmail.com/ [3] https://lore.kernel.org/lkml/CA+55aFwgbgqhbp1fkxvRKEpzyR5J8n1vKT1VZdz9knmPuXhOeg@mail.gmail.com/ [4] https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yVJu65TpLgN_ybYNv0VEOKA@mail.gmail.com/ Suggested-by: Nathan Chancellor <natechancellor@gmail.com> Fixes: f05ed4d56e9c ("KVM: PPC: Split out code from book3s.c into book3s_pr.c") Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Kees Cook <keescook@chromium.org>
|
#
8c99d345 |
|
26-Apr-2020 |
Tianjia Zhang <tianjia.zhang@linux.alibaba.com> |
KVM: PPC: Clean up redundant 'kvm_run' parameters In the current kvm version, 'kvm_run' has been included in the 'kvm_vcpu' structure. For historical reasons, many kvm-related function parameters retain the 'kvm_run' and 'kvm_vcpu' parameters at the same time. This patch does a unified cleanup of these remaining redundant parameters. Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
1d0c32ec |
|
18-Mar-2020 |
Greg Kurz <groug@kaod.org> |
KVM: PPC: Fix kernel crash with PR KVM With PR KVM, shutting down a VM causes the host kernel to crash: [ 314.219284] BUG: Unable to handle kernel data access on read at 0xc00800000176c638 [ 314.219299] Faulting instruction address: 0xc008000000d4ddb0 cpu 0x0: Vector: 300 (Data Access) at [c00000036da077a0] pc: c008000000d4ddb0: kvmppc_mmu_pte_flush_all+0x68/0xd0 [kvm_pr] lr: c008000000d4dd94: kvmppc_mmu_pte_flush_all+0x4c/0xd0 [kvm_pr] sp: c00000036da07a30 msr: 900000010280b033 dar: c00800000176c638 dsisr: 40000000 current = 0xc00000036d4c0000 paca = 0xc000000001a00000 irqmask: 0x03 irq_happened: 0x01 pid = 1992, comm = qemu-system-ppc Linux version 5.6.0-master-gku+ (greg@palmb) (gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)) #17 SMP Wed Mar 18 13:49:29 CET 2020 enter ? for help [c00000036da07ab0] c008000000d4fbe0 kvmppc_mmu_destroy_pr+0x28/0x60 [kvm_pr] [c00000036da07ae0] c0080000009eab8c kvmppc_mmu_destroy+0x34/0x50 [kvm] [c00000036da07b00] c0080000009e50c0 kvm_arch_vcpu_destroy+0x108/0x140 [kvm] [c00000036da07b30] c0080000009d1b50 kvm_vcpu_destroy+0x28/0x80 [kvm] [c00000036da07b60] c0080000009e4434 kvm_arch_destroy_vm+0xbc/0x190 [kvm] [c00000036da07ba0] c0080000009d9c2c kvm_put_kvm+0x1d4/0x3f0 [kvm] [c00000036da07c00] c0080000009da760 kvm_vm_release+0x38/0x60 [kvm] [c00000036da07c30] c000000000420be0 __fput+0xe0/0x310 [c00000036da07c90] c0000000001747a0 task_work_run+0x150/0x1c0 [c00000036da07cf0] c00000000014896c do_exit+0x44c/0xd00 [c00000036da07dc0] c0000000001492f4 do_group_exit+0x64/0xd0 [c00000036da07e00] c000000000149384 sys_exit_group+0x24/0x30 [c00000036da07e20] c00000000000b9d0 system_call+0x5c/0x68 This is caused by a use-after-free in kvmppc_mmu_pte_flush_all() which dereferences vcpu->arch.book3s which was previously freed by kvmppc_core_vcpu_free_pr(). This happens because kvmppc_mmu_destroy() is called after kvmppc_core_vcpu_free() since commit ff030fdf5573 ("KVM: PPC: Move kvm_vcpu_init() invocation to common code"). The kvmppc_mmu_destroy() helper calls one of the following depending on the KVM backend: - kvmppc_mmu_destroy_hv() which does nothing (Book3s HV) - kvmppc_mmu_destroy_pr() which undoes the effects of kvmppc_mmu_init() (Book3s PR 32-bit) - kvmppc_mmu_destroy_pr() which undoes the effects of kvmppc_mmu_init() (Book3s PR 64-bit) - kvmppc_mmu_destroy_e500() which does nothing (BookE e500/e500mc) It turns out that this is only relevant to PR KVM actually. And both 32 and 64 backends need vcpu->arch.book3s to be valid when calling kvmppc_mmu_destroy_pr(). So instead of calling kvmppc_mmu_destroy() from kvm_arch_vcpu_destroy(), call kvmppc_mmu_destroy_pr() at the beginning of kvmppc_core_vcpu_free_pr(). This is consistent with kvmppc_mmu_init() being the last call in kvmppc_core_vcpu_create_pr(). For the same reason, if kvmppc_core_vcpu_create_pr() returns an error then this means that kvmppc_mmu_init() was either not called or failed, in which case kvmppc_mmu_destroy() should not be called. Drop the line in the error path of kvm_arch_vcpu_create(). Fixes: ff030fdf5573 ("KVM: PPC: Move kvm_vcpu_init() invocation to common code") Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/158455341029.178873.15248663726399374882.stgit@bahia.lan
|
#
6fef0c6b |
|
18-Mar-2020 |
Greg Kurz <groug@kaod.org> |
KVM: PPC: Kill kvmppc_ops::mmu_destroy() and kvmppc_mmu_destroy() These are only used by HV KVM and BookE, and in both cases they are nops. Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
3f1268dd |
|
18-Mar-2020 |
Greg Kurz <groug@kaod.org> |
KVM: PPC: Book3S PR: Move kvmppc_mmu_init() into PR KVM This is only relevant to PR KVM. Make it obvious by moving the function declaration to the Book3s header and rename it with a _pr suffix. Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
b2fa4f90 |
|
18-Mar-2020 |
Greg Kurz <groug@kaod.org> |
KVM: PPC: Book3S PR: Fix kernel crash with PR KVM With PR KVM, shutting down a VM causes the host kernel to crash: [ 314.219284] BUG: Unable to handle kernel data access on read at 0xc00800000176c638 [ 314.219299] Faulting instruction address: 0xc008000000d4ddb0 cpu 0x0: Vector: 300 (Data Access) at [c00000036da077a0] pc: c008000000d4ddb0: kvmppc_mmu_pte_flush_all+0x68/0xd0 [kvm_pr] lr: c008000000d4dd94: kvmppc_mmu_pte_flush_all+0x4c/0xd0 [kvm_pr] sp: c00000036da07a30 msr: 900000010280b033 dar: c00800000176c638 dsisr: 40000000 current = 0xc00000036d4c0000 paca = 0xc000000001a00000 irqmask: 0x03 irq_happened: 0x01 pid = 1992, comm = qemu-system-ppc Linux version 5.6.0-master-gku+ (greg@palmb) (gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)) #17 SMP Wed Mar 18 13:49:29 CET 2020 enter ? for help [c00000036da07ab0] c008000000d4fbe0 kvmppc_mmu_destroy_pr+0x28/0x60 [kvm_pr] [c00000036da07ae0] c0080000009eab8c kvmppc_mmu_destroy+0x34/0x50 [kvm] [c00000036da07b00] c0080000009e50c0 kvm_arch_vcpu_destroy+0x108/0x140 [kvm] [c00000036da07b30] c0080000009d1b50 kvm_vcpu_destroy+0x28/0x80 [kvm] [c00000036da07b60] c0080000009e4434 kvm_arch_destroy_vm+0xbc/0x190 [kvm] [c00000036da07ba0] c0080000009d9c2c kvm_put_kvm+0x1d4/0x3f0 [kvm] [c00000036da07c00] c0080000009da760 kvm_vm_release+0x38/0x60 [kvm] [c00000036da07c30] c000000000420be0 __fput+0xe0/0x310 [c00000036da07c90] c0000000001747a0 task_work_run+0x150/0x1c0 [c00000036da07cf0] c00000000014896c do_exit+0x44c/0xd00 [c00000036da07dc0] c0000000001492f4 do_group_exit+0x64/0xd0 [c00000036da07e00] c000000000149384 sys_exit_group+0x24/0x30 [c00000036da07e20] c00000000000b9d0 system_call+0x5c/0x68 This is caused by a use-after-free in kvmppc_mmu_pte_flush_all() which dereferences vcpu->arch.book3s which was previously freed by kvmppc_core_vcpu_free_pr(). This happens because kvmppc_mmu_destroy() is called after kvmppc_core_vcpu_free() since commit ff030fdf5573 ("KVM: PPC: Move kvm_vcpu_init() invocation to common code"). The kvmppc_mmu_destroy() helper calls one of the following depending on the KVM backend: - kvmppc_mmu_destroy_hv() which does nothing (Book3s HV) - kvmppc_mmu_destroy_pr() which undoes the effects of kvmppc_mmu_init() (Book3s PR 32-bit) - kvmppc_mmu_destroy_pr() which undoes the effects of kvmppc_mmu_init() (Book3s PR 64-bit) - kvmppc_mmu_destroy_e500() which does nothing (BookE e500/e500mc) It turns out that this is only relevant to PR KVM actually. And both 32 and 64 backends need vcpu->arch.book3s to be valid when calling kvmppc_mmu_destroy_pr(). So instead of calling kvmppc_mmu_destroy() from kvm_arch_vcpu_destroy(), call kvmppc_mmu_destroy_pr() at the beginning of kvmppc_core_vcpu_free_pr(). This is consistent with kvmppc_mmu_init() being the last call in kvmppc_core_vcpu_create_pr(). For the same reason, if kvmppc_core_vcpu_create_pr() returns an error then this means that kvmppc_mmu_init() was either not called or failed, in which case kvmppc_mmu_destroy() should not be called. Drop the line in the error path of kvm_arch_vcpu_create(). Fixes: ff030fdf5573 ("KVM: PPC: Move kvm_vcpu_init() invocation to common code") Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
8fc6ba0a |
|
10-Mar-2020 |
Joe Perches <joe@perches.com> |
KVM: PPC: Use fallthrough; Convert the various uses of fallthrough comments to fallthrough; Done via script Link: https://lore.kernel.org/lkml/b56602fcf79f849e733e7b521bb0e17895d390fa.1582230379.git.joe.com/ Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
2a49f61d |
|
18-Feb-2020 |
Sean Christopherson <seanjc@google.com> |
KVM: Ensure validity of memslot with respect to kvm_get_dirty_log() Rework kvm_get_dirty_log() so that it "returns" the associated memslot on success. A future patch will rework memslot handling such that id_to_memslot() can return NULL, returning the memslot makes it more obvious that the validity of the memslot has been verified, i.e. precludes the need to add validity checks in the arch code that are technically unnecessary. To maintain ordering in s390, move the call to kvm_arch_sync_dirty_log() from s390's kvm_vm_ioctl_get_dirty_log() to the new kvm_get_dirty_log(). This is a nop for PPC, the only other arch that doesn't select KVM_GENERIC_DIRTYLOG_READ_PROTECT, as its sync_dirty_log() is empty. Ideally, moving the sync_dirty_log() call would be done in a separate patch, but it can't be done in a follow-on patch because that would temporarily break s390's ordering. Making the move in a preparatory patch would be functionally correct, but would create an odd scenario where the moved sync_dirty_log() would operate on a "different" memslot due to consuming the result of a different id_to_memslot(). The memslot couldn't actually be different as slots_lock is held, but the code is confusing enough as it is, i.e. moving sync_dirty_log() in this patch is the lesser of all evils. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
e96c81ee |
|
18-Feb-2020 |
Sean Christopherson <seanjc@google.com> |
KVM: Simplify kvm_free_memslot() and all its descendents Now that all callers of kvm_free_memslot() pass NULL for @dont, remove the param from the top-level routine and all arch's implementations. No functional change intended. Tested-by: Christoffer Dall <christoffer.dall@arm.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
82307e67 |
|
18-Feb-2020 |
Sean Christopherson <seanjc@google.com> |
KVM: PPC: Move memslot memory allocation into prepare_memory_region() Allocate the rmap array during kvm_arch_prepare_memory_region() to pave the way for removing kvm_arch_create_memslot() altogether. Moving PPC's memory allocation only changes the order of kernel memory allocations between PPC and common KVM code. No functional change intended. Acked-by: Paul Mackerras <paulus@ozlabs.org> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
fd24a862 |
|
26-Jan-2020 |
David Michael <fedora.dm0@gmail.com> |
KVM: PPC: Book3S PR: Fix -Werror=return-type build failure Fixes: 3a167beac07c ("kvm: powerpc: Add kvmppc_ops callback") Signed-off-by: David Michael <fedora.dm0@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
ff030fdf |
|
18-Dec-2019 |
Sean Christopherson <seanjc@google.com> |
KVM: PPC: Move kvm_vcpu_init() invocation to common code Move the kvm_cpu_{un}init() calls to common PPC code as an intermediate step towards removing kvm_cpu_{un}init() altogether. No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
d3076952 |
|
18-Dec-2019 |
Sean Christopherson <seanjc@google.com> |
KVM: PPC: Book3S PR: Allocate book3s and shadow vcpu after common init Call kvm_vcpu_init() in kvmppc_core_vcpu_create_pr() prior to allocating the book3s and shadow_vcpu objects in preparation of moving said call to common PPC code. Although kvm_vcpu_init() has an arch callback, the callback is empty for Book3S PR, i.e. barring unseen black magic, moving the allocation has no real functional impact. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
c50bfbdc |
|
18-Dec-2019 |
Sean Christopherson <seanjc@google.com> |
KVM: PPC: Allocate vcpu struct in common PPC code Move allocation of all flavors of PPC vCPUs to common PPC code. All variants either allocate 'struct kvm_vcpu' directly, or require that the embedded 'struct kvm_vcpu' member be located at offset 0, i.e. guarantee that the allocation can be directly interpreted as a 'struct kvm_vcpu' object. Remove the message from the build-time assertion regarding placement of the struct, as compatibility with the arch usercopy region is no longer the sole dependent on 'struct kvm_vcpu' being at offset zero. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
cb10bf91 |
|
18-Dec-2019 |
Sean Christopherson <seanjc@google.com> |
KVM: PPC: Book3S PR: Free shared page if mmu initialization fails Explicitly free the shared page if kvmppc_mmu_init() fails during kvmppc_core_vcpu_create(), as the page is freed only in kvmppc_core_vcpu_free(), which is not reached via kvm_vcpu_uninit(). Fixes: 96bc451a15329 ("KVM: PPC: Introduce shared page") Cc: stable@vger.kernel.org Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Acked-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
87a45e07 |
|
02-Oct-2019 |
Nicholas Piggin <npiggin@gmail.com> |
KVM: PPC: Book3S: Replace reset_msr mmu op with inject_interrupt arch op reset_msr sets the MSR for interrupt injection, but it's cleaner and more flexible to provide a single op to set both MSR and PC for the interrupt. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
d2912cb1 |
|
04-Jun-2019 |
Thomas Gleixner <tglx@linutronix.de> |
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 Based on 2 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation # extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 4122 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Enrico Weigelt <info@metux.net> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
f032b734 |
|
11-Dec-2018 |
Bharata B Rao <bharata@linux.ibm.com> |
KVM: PPC: Pass change type down to memslot commit function Currently, kvm_arch_commit_memory_region() gets called with a parameter indicating what type of change is being made to the memslot, but it doesn't pass it down to the platform-specific memslot commit functions. This adds the `change' parameter to the lower-level functions so that they can use it in future. [paulus@ozlabs.org - fix book E also.] Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
6142236c |
|
06-Dec-2018 |
Suraj Jitindar Singh <sjitindarsingh@gmail.com> |
KVM: PPC: Book3S PR: Set hflag to indicate that POWER9 supports 1T segments When booting a kvm-pr guest on a POWER9 machine the following message is observed: "qemu-system-ppc64: KVM does not support 1TiB segments which guest expects" This is because the guest is expecting to be able to use 1T segments however we don't indicate support for it. This is because we don't set the BOOK3S_HFLAG_MULTI_PGSIZE flag in the hflags in kvmppc_set_pvr_pr() on POWER9. POWER9 does indeed have support for 1T segments, so add a case for POWER9 to the switch statement to ensure it is set. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
fd0944ba |
|
07-Oct-2018 |
Paul Mackerras <paulus@ozlabs.org> |
KVM: PPC: Use ccr field in pt_regs struct embedded in vcpu struct When the 'regs' field was added to struct kvm_vcpu_arch, the code was changed to use several of the fields inside regs (e.g., gpr, lr, etc.) but not the ccr field, because the ccr field in struct pt_regs is 64 bits on 64-bit platforms, but the cr field in kvm_vcpu_arch is only 32 bits. This changes the code to use the regs.ccr field instead of cr, and changes the assembly code on 64-bit platforms to use 64-bit loads and stores instead of 32-bit ones. Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
d24ea8a7 |
|
07-Oct-2018 |
Paul Mackerras <paulus@ozlabs.org> |
KVM: PPC: Book3S: Simplify external interrupt handling Currently we use two bits in the vcpu pending_exceptions bitmap to indicate that an external interrupt is pending for the guest, one for "one-shot" interrupts that are cleared when delivered, and one for interrupts that persist until cleared by an explicit action of the OS (e.g. an acknowledge to an interrupt controller). The BOOK3S_IRQPRIO_EXTERNAL bit is used for one-shot interrupt requests and BOOK3S_IRQPRIO_EXTERNAL_LEVEL is used for persisting interrupts. In practice BOOK3S_IRQPRIO_EXTERNAL never gets used, because our Book3S platforms generally, and pseries in particular, expect external interrupt requests to persist until they are acknowledged at the interrupt controller. That combined with the confusion introduced by having two bits for what is essentially the same thing makes it attractive to simplify things by only using one bit. This patch does that. With this patch there is only BOOK3S_IRQPRIO_EXTERNAL, and by default it has the semantics of a persisting interrupt. In order to avoid breaking the ABI, we introduce a new "external_oneshot" flag which preserves the behaviour of the KVM_INTERRUPT ioctl with the KVM_INTERRUPT_SET argument. Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
3cc97bea |
|
23-Aug-2018 |
Finn Thain <fthain@telegraphics.com.au> |
treewide: correct "differenciate" and "instanciate" typos Also add these typos to spelling.txt so checkpatch.pl will look for them. Link: http://lkml.kernel.org/r/88af06b9de34d870cb0afc46cfd24e0458be2575.1529471371.git.fthain@telegraphics.com.au Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
45ef5992 |
|
05-Jul-2018 |
Christophe Leroy <christophe.leroy@c-s.fr> |
powerpc: remove unnecessary inclusion of asm/tlbflush.h asm/tlbflush.h is only needed for: - using functions xxx_flush_tlb_xxx() - using MMU_NO_CONTEXT - including asm-generic/pgtable.h Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
db96a04a |
|
07-Jun-2018 |
Paul Mackerras <paulus@ozlabs.org> |
KVM: PPC: Book3S PR: Enable use on POWER9 bare-metal hosts in HPT mode It turns out that PR KVM has no dependency on the format of HPTEs, because it uses functions pointed to by mmu_hash_ops which do all the formatting and interpretation of HPTEs. Thus we can allow PR KVM to load on POWER9 bare-metal hosts as long as they are running in HPT mode. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
4f169d21 |
|
07-Jun-2018 |
Paul Mackerras <paulus@ozlabs.org> |
KVM: PPC: Book3S PR: Don't let PAPR guest set MSR hypervisor bit PAPR guests run in supervisor mode and should not be able to set the MSR HV (hypervisor mode) bit or clear the ME (machine check enable) bit by mtmsrd or any other means. To enforce this, we force MSR_HV off and MSR_ME on in kvmppc_set_msr_pr. Without this, the guest can appear to be in hypervisor mode to itself and to userspace. This has been observed to cause a crash in QEMU when it tries to deliver a system reset interrupt to the guest. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
916ccadc |
|
07-Jun-2018 |
Paul Mackerras <paulus@ozlabs.org> |
KVM: PPC: Book3S PR: Fix MSR setting when delivering interrupts This makes sure that MSR "partial-function" bits are not transferred to SRR1 when delivering an interrupt. This was causing failures in guests running kernels that include commit f3d96e698ed0 ("powerpc/mm: Overhaul handling of bad page faults", 2017-07-19), which added code to check bits of SRR1 on instruction storage interrupts (ISIs) that indicate a bad page fault. The symptom was that a guest user program that handled a signal and attempted to return from the signal handler would get a SIGBUS signal and die. The code that generated ISIs and some other interrupts would previously set bits in the guest MSR to indicate the interrupt status and then call kvmppc_book3s_queue_irqprio(). This technique no longer works now that kvmppc_inject_interrupt() is masking off those bits. Instead we make kvmppc_core_queue_data_storage() and kvmppc_core_queue_inst_storage() call kvmppc_inject_interrupt() directly, and make sure that all the places that generate ISIs or DSIs call kvmppc_core_queue_{data,inst}_storage instead of kvmppc_book3s_queue_irqprio(). Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
b71dc519 |
|
05-Jun-2018 |
Cameron Kaiser <spectre@floodgap.com> |
KVM: PPC: Book3S PR: Handle additional interrupt types This adds trivial handling for additional interrupt types that KVM-PR must support for proper virtualization on a POWER9 host in HPT mode, as a further prerequisite to enabling KVM-PR on that configuration. Signed-off-by: Cameron Kaiser <spectre@floodgap.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
deeb879d |
|
23-May-2018 |
Simon Guo <wei.guo.simon@gmail.com> |
KVM: PPC: Book3S PR: Enable kvmppc_get/set_one_reg_pr() for HTM registers We need to migrate PR KVM during transaction and userspace will use kvmppc_get_one_reg_pr()/kvmppc_set_one_reg_pr() APIs to get/set transaction checkpoint state. This patch adds support for that. So far, QEMU on PR KVM doesn't fully function for migration but the savevm/loadvm can be done against a RHEL72 guest. During savevm/ loadvm procedure, the kvm ioctls will be invoked as well. Test has been performed to savevm/loadvm for a guest running a HTM test program: https://github.com/justdoitqd/publicFiles/blob/master/test-tm-mig.c Signed-off-by: Simon Guo <wei.guo.simon@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
7284ca8a |
|
23-May-2018 |
Simon Guo <wei.guo.simon@gmail.com> |
KVM: PPC: Book3S PR: Support TAR handling for PR KVM HTM Currently guest kernel doesn't handle TAR facility unavailable and it always runs with TAR bit on. PR KVM will lazily enable TAR. TAR is not a frequent-use register and it is not included in SVCPU struct. Due to the above, the checkpointed TAR val might be a bogus TAR val. To solve this issue, we will make vcpu->arch.fscr tar bit consistent with shadow_fscr when TM is enabled. At the end of emulating treclaim., the correct TAR val need to be loaded into the register if FSCR_TAR bit is on. At the beginning of emulating trechkpt., TAR needs to be flushed so that the right tar val can be copied into tar_tm. Tested with: tools/testing/selftests/powerpc/tm/tm-tar tools/testing/selftests/powerpc/ptrace/ptrace-tm-tar (remove DSCR/PPR related testing). Signed-off-by: Simon Guo <wei.guo.simon@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
68ab07b9 |
|
23-May-2018 |
Simon Guo <wei.guo.simon@gmail.com> |
KVM: PPC: Book3S PR: Add guard code to prevent returning to guest with PR=0 and Transactional state Currently PR KVM doesn't support transaction memory in guest privileged state. This patch adds a check at setting guest msr, so that we can never return to guest with PR=0 and TS=0b10. A tabort will be emulated to indicate this and fail transaction immediately. [paulus@ozlabs.org - don't change the TM_CAUSE_MISC definition, instead use TM_CAUSE_KVM_FAC_UNAV.] Signed-off-by: Simon Guo <wei.guo.simon@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
e32c53d1 |
|
23-May-2018 |
Simon Guo <wei.guo.simon@gmail.com> |
KVM: PPC: Book3S PR: Add emulation for trechkpt. This patch adds host emulation when guest PR KVM executes "trechkpt.", which is a privileged instruction and will trap into host. We firstly copy vcpu ongoing content into vcpu tm checkpoint content, then perform kvmppc_restore_tm_pr() to do trechkpt. with updated vcpu tm checkpoint values. Signed-off-by: Simon Guo <wei.guo.simon@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
19c585eb |
|
23-May-2018 |
Simon Guo <wei.guo.simon@gmail.com> |
KVM: PPC: Book3S PR: Restore NV regs after emulating mfspr from TM SPRs Currently kvmppc_handle_fac() will not update NV GPRs and thus it can return with GUEST_RESUME. However PR KVM guest always disables MSR_TM bit in privileged state. If PR privileged-state guest is trying to read TM SPRs, it will trigger TM facility unavailable exception and fall into kvmppc_handle_fac(). Then the emulation will be done by kvmppc_core_emulate_mfspr_pr(). The mfspr instruction can include a RT with NV reg. So it is necessary to restore NV GPRs at this case, to reflect the update to NV RT. This patch make kvmppc_handle_fac() return GUEST_RESUME_NV for TM facility unavailable exceptions in guest privileged state. Signed-off-by: Simon Guo <wei.guo.simon@gmail.com> Reviewed-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
5706340a |
|
23-May-2018 |
Simon Guo <wei.guo.simon@gmail.com> |
KVM: PPC: Book3S PR: Always fail transactions in guest privileged state Currently the kernel doesn't use transaction memory. And there is an issue for privileged state in the guest that: tbegin/tsuspend/tresume/tabort TM instructions can impact MSR TM bits without trapping into the PR host. So following code will lead to a false mfmsr result: tbegin <- MSR bits update to Transaction active. beq <- failover handler branch mfmsr <- still read MSR bits from magic page with transaction inactive. It is not an issue for non-privileged guest state since its mfmsr is not patched with magic page and will always trap into the PR host. This patch will always fail tbegin attempt for privileged state in the guest, so that the above issue is prevented. It is benign since currently (guest) kernel doesn't initiate a transaction. Test case: https://github.com/justdoitqd/publicFiles/blob/master/test_tbegin_pr.c Signed-off-by: Simon Guo <wei.guo.simon@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
533082ae |
|
23-May-2018 |
Simon Guo <wei.guo.simon@gmail.com> |
KVM: PPC: Book3S PR: Emulate mtspr/mfspr using active TM SPRs The mfspr/mtspr on TM SPRs(TEXASR/TFIAR/TFHAR) are non-privileged instructions and can be executed by PR KVM guest in problem state without trapping into the host. We only emulate mtspr/mfspr texasr/tfiar/tfhar in guest PR=0 state. When we are emulating mtspr tm sprs in guest PR=0 state, the emulation result needs to be visible to guest PR=1 state. That is, the actual TM SPR val should be loaded into actual registers. We already flush TM SPRs into vcpu when switching out of CPU, and load TM SPRs when switching back. This patch corrects mfspr()/mtspr() emulation for TM SPRs to make the actual source/dest be the actual TM SPRs. Signed-off-by: Simon Guo <wei.guo.simon@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
13989b65 |
|
23-May-2018 |
Simon Guo <wei.guo.simon@gmail.com> |
KVM: PPC: Book3S PR: Add math support for PR KVM HTM The math registers will be saved into vcpu->arch.fp/vr and corresponding vcpu->arch.fp_tm/vr_tm area. We flush or giveup the math regs into vcpu->arch.fp/vr before saving transaction. After transaction is restored, the math regs will be loaded back into regs. If there is a FP/VEC/VSX unavailable exception during transaction active state, the math checkpoint content might be incorrect and we need to do treclaim./load the correct checkpoint val/trechkpt. sequence to retry the transaction. That will make our solution complicated. To solve this issue, we always make the hardware guest MSR math bits (shadow_msr) consistent with the MSR val which guest sees (kvmppc_get_msr()) when guest msr is with tm enabled. Then all FP/VEC/VSX unavailable exception can be delivered to guest and guest handles the exception by itself. Signed-off-by: Simon Guo <wei.guo.simon@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
8d2e2fc5 |
|
23-May-2018 |
Simon Guo <wei.guo.simon@gmail.com> |
KVM: PPC: Book3S PR: Add transaction memory save/restore skeleton The transaction memory checkpoint area save/restore behavior is triggered when VCPU qemu process is switching out/into CPU, i.e. at kvmppc_core_vcpu_put_pr() and kvmppc_core_vcpu_load_pr(). MSR TM active state is determined by TS bits: active: 10(transactional) or 01 (suspended) inactive: 00 (non-transactional) We don't "fake" TM functionality for guest. We "sync" guest virtual MSR TM active state(10 or 01) with shadow MSR. That is to say, we don't emulate a transactional guest with a TM inactive MSR. TM SPR support(TFIAR/TFAR/TEXASR) has already been supported by commit 9916d57e64a4 ("KVM: PPC: Book3S PR: Expose TM registers"). Math register support (FPR/VMX/VSX) will be done at subsequent patch. Whether TM context need to be saved/restored can be determined by kvmppc_get_msr() TM active state: * TM active - save/restore TM context * TM inactive - no need to do so and only save/restore TM SPRs. Signed-off-by: Simon Guo <wei.guo.simon@gmail.com> Suggested-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
66c33e79 |
|
23-May-2018 |
Simon Guo <wei.guo.simon@gmail.com> |
KVM: PPC: Book3S PR: Add kvmppc_save/restore_tm_sprs() APIs This patch adds 2 new APIs, kvmppc_save_tm_sprs() and kvmppc_restore_tm_sprs(), for the purpose of TEXASR/TFIAR/TFHAR save/restore. Signed-off-by: Simon Guo <wei.guo.simon@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
95757bfc |
|
23-May-2018 |
Simon Guo <wei.guo.simon@gmail.com> |
KVM: PPC: Book3S PR: Sync TM bits to shadow msr for problem state guest MSR TS bits can be modified with non-privileged instruction such as tbegin./tend. That means guest can change MSR value "silently" without notifying host. It is necessary to sync the TM bits to host so that host can calculate shadow msr correctly. Note, privileged mode in the guest will always fail transactions so we only take care of problem state mode in the guest. The logic is put into kvmppc_copy_from_svcpu() so that kvmppc_handle_exit_pr() can use correct MSR TM bits even when preemption occurs. Signed-off-by: Simon Guo <wei.guo.simon@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
901938ad |
|
23-May-2018 |
Simon Guo <wei.guo.simon@gmail.com> |
KVM: PPC: Book3S PR: Pass through MSR TM and TS bits to shadow_msr PowerPC TM functionality needs MSR TM/TS bits support in hardware level. Guest TM functionality can not be emulated with "fake" MSR (msr in magic page) TS bits. This patch syncs TM/TS bits in shadow_msr with the MSR value in magic page, so that the MSR TS value which guest sees is consistent with actual MSR bits running in guest. Signed-off-by: Simon Guo <wei.guo.simon@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
9617a0b3 |
|
29-May-2018 |
Paul Mackerras <paulus@ozlabs.org> |
KVM: PPC: Book3S PR: Allow KVM_PPC_CONFIGURE_V3_MMU to succeed Currently, PR KVM does not implement the configure_mmu operation, and so the KVM_PPC_CONFIGURE_V3_MMU ioctl always fails with an EINVAL error. This causes recent kernels to fail to boot as a PR KVM guest on POWER9, since recent kernels booted in HPT mode do the H_REGISTER_PROC_TBL hypercall, which causes userspace (QEMU) to do KVM_PPC_CONFIGURE_V3_MMU, which fails. This implements a minimal configure_mmu operation for PR KVM. It succeeds only if the MMU is being configured for HPT mode and no process table is being registered. This is enough to get recent kernels to boot as a PR KVM guest. Reviewed-by: Greg Kurz <groug@kaod.org> Tested-by: Greg Kurz <groug@kaod.org> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
2e6baa46 |
|
20-May-2018 |
Simon Guo <wei.guo.simon@gmail.com> |
KVM: PPC: Add giveup_ext() hook to PPC KVM ops Currently HV will save math regs(FP/VEC/VSX) when trap into host. But PR KVM will only save math regs when qemu task switch out of CPU, or when returning from qemu code. To emulate FP/VEC/VSX mmio load, PR KVM need to make sure that math regs were flushed firstly and then be able to update saved VCPU FPR/VEC/VSX area reasonably. This patch adds giveup_ext() field to KVM ops. Only PR KVM has non-NULL giveup_ext() ops. kvmppc_complete_mmio_load() can invoke that hook (when not NULL) to flush math regs accordingly, before updating saved register vals. Math regs flush is also necessary for STORE, which will be covered in later patch within this patch series. Signed-off-by: Simon Guo <wei.guo.simon@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
ec531d02 |
|
18-May-2018 |
Paul Mackerras <paulus@ozlabs.org> |
KVM: PPC: Book3S PR: Enable use on POWER9 inside HPT-mode guests This relaxes the restriction on using PR KVM on POWER9. The existing code does work inside a guest partition running in HPT mode, because hypercalls such as H_ENTER use the old HPTE format, not the new format used by POWER9, and so no change to PR KVM's HPT manipulation code is required. PR KVM will still refuse to run if the kernel is using radix translation or if it is running bare-metal. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
173c520a |
|
07-May-2018 |
Simon Guo <wei.guo.simon@gmail.com> |
KVM: PPC: Move nip/ctr/lr/xer registers to pt_regs in kvm_vcpu_arch This patch moves nip/ctr/lr/xer registers from scattered places in kvm_vcpu_arch to pt_regs structure. cr register is "unsigned long" in pt_regs and u32 in vcpu->arch. It will need more consideration and may move in later patches. Signed-off-by: Simon Guo <wei.guo.simon@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
1143a706 |
|
07-May-2018 |
Simon Guo <wei.guo.simon@gmail.com> |
KVM: PPC: Add pt_regs into kvm_vcpu_arch and move vcpu->arch.gpr[] into it Current regs are scattered at kvm_vcpu_arch structure and it will be more neat to organize them into pt_regs structure. Also it will enable reimplementation of MMIO emulation code with analyse_instr() later. Signed-off-by: Simon Guo <wei.guo.simon@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
39c983ea |
|
21-Feb-2018 |
Paul Mackerras <paulus@ozlabs.org> |
KVM: PPC: Remove unused kvm_unmap_hva callback Since commit fb1522e099f0 ("KVM: update to new mmu_notifier semantic v2", 2017-08-31), the MMU notifier code in KVM no longer calls the kvm_unmap_hva callback. This removes the PPC implementations of kvm_unmap_hva(). Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
07ae5389 |
|
31-Jan-2018 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Book3S PR: Fix svcpu copying with preemption enabled When copying between the vcpu and svcpu, we may get scheduled away onto a different host CPU which in turn means our svcpu pointer may change. That means we need to atomically copy to and from the svcpu with preemption disabled, so that all code around it always sees a coherent state. Reported-by: Simon Guo <wei.guo.simon@gmail.com> Fixes: 3d3319b45eea ("KVM: PPC: Book3S: PR: Enable interrupts earlier") Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
6c7d47c3 |
|
21-Nov-2017 |
Alexey Kardashevskiy <aik@ozlabs.ru> |
KVM: PPC: Book3S PR: Fix WIMG handling under pHyp Commit 96df226 ("KVM: PPC: Book3S PR: Preserve storage control bits") added code to preserve WIMG bits but it missed 2 special cases: - a magic page in kvmppc_mmu_book3s_64_xlate() and - guest real mode in kvmppc_handle_pagefault(). For these ptes, WIMG was 0 and pHyp failed on these causing a guest to stop in the very beginning at NIP=0x100 (due to bd9166ffe "KVM: PPC: Book3S PR: Exit KVM on failed mapping"). According to LoPAPR v1.1 14.5.4.1.2 H_ENTER: The hypervisor checks that the WIMG bits within the PTE are appropriate for the physical page number else H_Parameter return. (For System Memory pages WIMG=0010, or, 1110 if the SAO option is enabled, and for IO pages WIMG=01**.) This hence initializes WIMG to non-zero value HPTE_R_M (0x10), as expected by pHyp. [paulus@ozlabs.org - fix compile for 32-bit] Cc: stable@vger.kernel.org # v4.11+ Fixes: 96df226 "KVM: PPC: Book3S PR: Preserve storage control bits" Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Tested-by: Ruediger Oertel <ro@suse.de> Reviewed-by: Greg Kurz <groug@kaod.org> Tested-by: Greg Kurz <groug@kaod.org> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
f4093ee9 |
|
15-Oct-2017 |
Greg Kurz <groug@kaod.org> |
KVM: PPC: Book3S PR: Only install valid SLBs during KVM_SET_SREGS Userland passes an array of 64 SLB descriptors to KVM_SET_SREGS, some of which are valid (ie, SLB_ESID_V is set) and the rest are likely all-zeroes (with QEMU at least). Each of them is then passed to kvmppc_mmu_book3s_64_slbmte(), which assumes to find the SLB index in the 3 lower bits of its rb argument. When passed zeroed arguments, it happily overwrites the 0th SLB entry with zeroes. This is exactly what happens while doing live migration with QEMU when the destination pushes the incoming SLB descriptors to KVM PR. When reloading the SLBs at the next synchronization, QEMU first clears its SLB array and only restore valid ones, but the 0th one is now gone and we cannot access the corresponding memory anymore: (qemu) x/x $pc c0000000000b742c: Cannot access memory To avoid this, let's filter out non-valid SLB entries. While here, we also force a full SLB flush before installing new entries. Since SLB is for 64-bit only, we now build this path conditionally to avoid a build break on 32-bit, which doesn't define SLB_ESID_V. Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
72875d8a |
|
26-Apr-2017 |
Radim Krčmář <rkrcmar@redhat.com> |
KVM: add kvm_{test,clear}_request to replace {test,clear}_bit Users were expected to use kvm_check_request() for testing and clearing, but request have expanded their use since then and some users want to only test or do a faster clear. Make sure that requests are not directly accessed with bit operations. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
96df2267 |
|
24-Mar-2017 |
Alexey Kardashevskiy <aik@ozlabs.ru> |
KVM: PPC: Book3S PR: Preserve storage control bits PR KVM page fault handler performs eaddr to pte translation for a guest, however kvmppc_mmu_book3s_64_xlate() does not preserve WIMG bits (storage control) in the kvmppc_pte struct. If PR KVM is running as a second level guest under HV KVM, and PR KVM tries inserting HPT entry, this fails in HV KVM if it already has this mapping. This preserves WIMG bits between kvmppc_mmu_book3s_64_xlate() and kvmppc_mmu_map_page(). Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
bd9166ff |
|
24-Mar-2017 |
Alexey Kardashevskiy <aik@ozlabs.ru> |
KVM: PPC: Book3S PR: Exit KVM on failed mapping At the moment kvmppc_mmu_map_page() returns -1 if mmu_hash_ops.hpte_insert() fails for any reason so the page fault handler resumes the guest and it faults on the same address again. This adds distinction to kvmppc_mmu_map_page() to return -EIO if mmu_hash_ops.hpte_insert() failed for a reason other than full pteg. At the moment only pSeries_lpar_hpte_insert() returns -2 if plpar_pte_enter() failed with a code other than H_PTEG_FULL. Other mmu_hash_ops.hpte_insert() instances can only fail with -1 "full pteg". With this change, if PR KVM fails to update HPT, it can signal the userspace about this instead of returning to guest and having the very same page fault over and over again. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
9eecec12 |
|
24-Mar-2017 |
Alexey Kardashevskiy <aik@ozlabs.ru> |
KVM: PPC: Book3S PR: Get rid of unused local variable @is_mmio has never been used since introduction in commit 2f4cf5e42d13 ("Add book3s.c") from 2009. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
fcd4f3c6 |
|
25-Jan-2017 |
Thomas Huth <thuth@redhat.com> |
KVM: PPC: Book3S PR: Refactor program interrupt related code into separate function The function kvmppc_handle_exit_pr() is quite huge and thus hard to read, and even contains a "spaghetti-code"-like goto between the different case labels of the big switch statement. This can be made much more readable by moving the code related to injecting program interrupts / instruction emulation into a separate function instead. Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
7c0f6ba6 |
|
24-Dec-2016 |
Linus Torvalds <torvalds@linux-foundation.org> |
Replace <asm/uaccess.h> with <linux/uaccess.h> globally This was entirely automated, using the script by Al: PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>' sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \ $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h) to do the replacement at the end of the merge window. Requested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
2365f6b6 |
|
21-Sep-2016 |
Thomas Huth <thuth@redhat.com> |
KVM: PPC: Book3S PR: Support 64kB page size on POWER8E and POWER8NVL On POWER8E and POWER8NVL, KVM-PR does not announce support for 64kB page sizes and 1TB segments yet. Looks like this has just been forgotton so far, since there is no reason why this should be different to the normal POWER8 CPUs. Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
88b02cf9 |
|
14-Sep-2016 |
Paul Mackerras <paulus@ozlabs.org> |
KVM: PPC: Book3S: Treat VTB as a per-subcore register, not per-thread POWER8 has one virtual timebase (VTB) register per subcore, not one per CPU thread. The HV KVM code currently treats VTB as a per-thread register, which can lead to spurious soft lockup messages from guests which use the VTB as the time source for the soft lockup detector. (CPUs before POWER8 did not have the VTB register.) For HV KVM, this fixes the problem by making only the primary thread in each virtual core save and restore the VTB value. With this, the VTB state becomes part of the kvmppc_vcore structure. This also means that "piggybacking" of multiple virtual cores onto one subcore is not possible on POWER8, because then the virtual cores would share a single VTB register. PR KVM emulates a VTB register, which is per-vcpu because PR KVM has no notion of CPU threads or SMT. For PR KVM we move the VTB state into the kvmppc_vcpu_book3s struct. Cc: stable@vger.kernel.org # v3.14+ Reported-by: Thomas Huth <thuth@redhat.com> Tested-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
d3cbff1b |
|
04-Jul-2016 |
Benjamin Herrenschmidt <benh@kernel.crashing.org> |
powerpc: Put exception configuration in a common place The various calls to establish exception endianness and AIL are now done from a single point using already established CPU and FW feature bits to decide what to do. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
6edaa530 |
|
15-Jun-2016 |
Paolo Bonzini <pbonzini@redhat.com> |
KVM: remove kvm_guest_enter/exit wrappers Use the functions from context_tracking.h directly. Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Rik van Riel <riel@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
b69890d1 |
|
19-May-2016 |
Thomas Huth <thuth@redhat.com> |
KVM: PPC: Book3S PR: Fix contents of SRR1 when injecting a program exception vcpu->arch.shadow_srr1 only contains usable values for injecting a program exception into the guest if we entered the function kvmppc_handle_exit_pr() with exit_nr == BOOK3S_INTERRUPT_PROGRAM. In other cases, the shadow_srr1 bits are zero. Since we want to pass an illegal-instruction program check to the guest, set "flags" to SRR1_PROGILL for these other cases. Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
11dd6ac0 |
|
08-Apr-2016 |
Laurent Vivier <lvivier@redhat.com> |
KVM: PPC: Book3S PR: Manage single-step mode Until now, when we connect gdb to the QEMU gdb-server, the single-step mode is not managed. This patch adds this, only for kvm-pr: If KVM_GUESTDBG_SINGLESTEP is set, we enable single-step trace bit in the MSR (MSR_SE) just before the __kvmppc_vcpu_run(), and disable it just after. In kvmppc_handle_exit_pr, instead of routing the interrupt to the guest, we return to host, with KVM_EXIT_DEBUG reason. Signed-off-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
|
#
50de596d |
|
29-Apr-2016 |
Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> |
powerpc/mm/hash: Add support for Power9 Hash PowerISA 3.0 adds a parition table indexed by LPID. Parition table allows us to specify the MMU model that will be used for guest and host translation. This patch adds support with SLB based hash model (UPRT = 0). What is required with this model is to support the new hash page table entry format and also setup partition table such that we use hash table for address translation. We don't have segment table support yet. In order to make sure we don't load KVM module on Power9 (since we don't have kvm support yet) this patch also disables KVM on Power9. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
c2085059 |
|
28-Oct-2015 |
Anton Blanchard <anton@samba.org> |
powerpc: create giveup_all() Create a single function that gives everything up (FP, VMX, VSX, SPE). Doing this all at once means we only do one MSR write. A context switch microbenchmark using yield(): http://ozlabs.org/~anton/junkcode/context_switch2.c ./context_switch2 --test=yield --fp --altivec --vector 0 0 shows an improvement of 3% on POWER8. Signed-off-by: Anton Blanchard <anton@samba.org> [mpe: giveup_all() needs to be EXPORT_SYMBOL'ed] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
dc4fbba1 |
|
28-Oct-2015 |
Anton Blanchard <anton@samba.org> |
powerpc: Create disable_kernel_{fp,altivec,vsx,spe}() The enable_kernel_*() functions leave the relevant MSR bits enabled until we exit the kernel sometime later. Create disable versions that wrap the kernel use of FP, Altivec VSX or SPE. While we don't want to disable it normally for performance reasons (MSR writes are slow), it will be used for a debug boot option that does this and catches bad uses in other areas of the kernel. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
378b417d |
|
15-Nov-2015 |
Yaowei Bai <baiyaowei@cmss.chinamobile.com> |
KVM: powerpc: kvmppc_visible_gpa can be boolean In another patch kvm_is_visible_gfn is maken return bool due to this function only returns zero or one as its return value, let's also make kvmppc_visible_gpa return bool to keep consistent. No functional change. Signed-off-by: Yaowei Bai <baiyaowei@cmss.chinamobile.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
f36f3f28 |
|
18-May-2015 |
Paolo Bonzini <pbonzini@redhat.com> |
KVM: add "new" argument to kvm_arch_commit_memory_region This lets the function access the new memory slot without going through kvm_memslots and id_to_memslot. It will simplify the code when more than one address space will be supported. Unfortunately, the "const"ness of the new argument must be casted away in two places. Fixing KVM to accept const struct kvm_memory_slot pointers would require modifications in pretty much all architectures, and is left for later. Reviewed-by: Radim Krcmar <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
09170a49 |
|
18-May-2015 |
Paolo Bonzini <pbonzini@redhat.com> |
KVM: const-ify uses of struct kvm_userspace_memory_region Architecture-specific helpers are not supposed to muck with struct kvm_userspace_memory_region contents. Add const to enforce this. In order to eliminate the only write in __kvm_set_memory_region, the cleaning of deleted slots is pulled up from update_memslots to __kvm_set_memory_region. Reviewed-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp> Reviewed-by: Radim Krcmar <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
9f6b8029 |
|
17-May-2015 |
Paolo Bonzini <pbonzini@redhat.com> |
KVM: use kvm_memslots whenever possible kvm_memslots provides lockdep checking. Use it consistently instead of explicit dereferencing of kvm->memslots. Reviewed-by: Radim Krcmar <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
6178839b |
|
07-Dec-2014 |
Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se> |
arch: powerpc: kvm: book3s_pr.c: Remove unused function Remove the function get_fpr_index() that is not used anywhere. This was partially found by using a static code analysis program called cppcheck. Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
57128468 |
|
22-Sep-2014 |
Andres Lagar-Cavilla <andreslc@google.com> |
kvm: Fix page ageing bugs 1. We were calling clear_flush_young_notify in unmap_one, but we are within an mmu notifier invalidate range scope. The spte exists no more (due to range_start) and the accessed bit info has already been propagated (due to kvm_pfn_set_accessed). Simply call clear_flush_young. 2. We clear_flush_young on a primary MMU PMD, but this may be mapped as a collection of PTEs by the secondary MMU (e.g. during log-dirty). This required expanding the interface of the clear_flush_young mmu notifier, so a lot of code has been trivially touched. 3. In the absence of shadow_accessed_mask (e.g. EPT A bit), we emulate the access bit by blowing the spte. This requires proper synchronizing with MMU notifier consumers, like every other removal of spte's does. Signed-off-by: Andres Lagar-Cavilla <andreslc@google.com> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
a59c1d9e |
|
09-Sep-2014 |
Madhavan Srinivasan <maddy@linux.vnet.ibm.com> |
powerpc/kvm: support to handle sw breakpoint This patch adds kernel side support for software breakpoint. Design is that, by using an illegal instruction, we trap to hypervisor via Emulation Assistance interrupt, where we check for the illegal instruction and accordingly we return to Host or Guest. Patch also adds support for software breakpoint in PR KVM. Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
8e6afa36 |
|
31-Jul-2014 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: PR: Handle FSCR feature deselects We handle FSCR feature bits (well, TAR only really today) lazily when the guest starts using them. So when a guest activates the bit and later uses that feature we enable it for real in hardware. However, when the guest stops using that bit we don't stop setting it in hardware. That means we can potentially lose a trap that the guest expects to happen because it thinks a feature is not active. This patch adds support to drop TAR when then guest turns it off in FSCR. While at it it also restricts FSCR access to 64bit systems - 32bit ones don't have it. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
a0840240 |
|
19-Jul-2014 |
Alexey Kardashevskiy <aik@ozlabs.ru> |
KVM: PPC: Book3S: Fix LPCR one_reg interface Unfortunately, the LPCR got defined as a 32-bit register in the one_reg interface. This is unfortunate because KVM allows userspace to control the DPFD (default prefetch depth) field, which is in the upper 32 bits. The result is that DPFD always get set to 0, which reduces performance in the guest. We can't just change KVM_REG_PPC_LPCR to be a 64-bit register ID, since that would break existing userspace binaries. Instead we define a new KVM_REG_PPC_LPCR_64 id which is 64-bit. Userspace can still use the old KVM_REG_PPC_LPCR id, but it now only modifies those fields in the bottom 32 bits that userspace can modify (ILE, TC and AIL). If userspace uses the new KVM_REG_PPC_LPCR_64 id, it can modify DPFD as well. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Paul Mackerras <paulus@samba.org> Cc: stable@vger.kernel.org Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
51f04726 |
|
23-Jul-2014 |
Mihai Caraman <mihai.caraman@freescale.com> |
KVM: PPC: Allow kvmppc_get_last_inst() to fail On book3e, guest last instruction is read on the exit path using load external pid (lwepx) dedicated instruction. This load operation may fail due to TLB eviction and execute-but-not-read entries. This patch lay down the path for an alternative solution to read the guest last instruction, by allowing kvmppc_get_lat_inst() function to fail. Architecture specific implmentations of kvmppc_load_last_inst() may read last guest instruction and instruct the emulation layer to re-execute the guest in case of failure. Make kvmppc_get_last_inst() definition common between architectures. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
9a26af64 |
|
23-Jul-2014 |
Mihai Caraman <mihai.caraman@freescale.com> |
KVM: PPC: Book3s: Remove kvmppc_read_inst() function In the context of replacing kvmppc_ld() function calls with a version of kvmppc_get_last_inst() which allow to fail, Alex Graf suggested this: "If we get EMULATE_AGAIN, we just have to make sure we go back into the guest. No need to inject an ISI into the guest - it'll do that all by itself. With an error returning kvmppc_get_last_inst we can just use completely get rid of kvmppc_read_inst() and only use kvmppc_get_last_inst() instead." As a intermediate step get rid of kvmppc_read_inst() and only use kvmppc_ld() instead. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
89b68c96 |
|
13-Jul-2014 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Book3S: Make magic page properly 4k mappable The magic page is defined as a 4k page of per-vCPU data that is shared between the guest and the host to accelerate accesses to privileged registers. However, when the host is using 64k page size granularity we weren't quite as strict about that rule anymore. Instead, we partially treated all of the upper 64k as magic page and mapped only the uppermost 4k with the actual magic contents. This works well enough for Linux which doesn't use any memory in kernel space in the upper 64k, but Mac OS X got upset. So this patch makes magic page actually stay in a 4k range even on 64k page size hosts. This patch fixes magic page usage with Mac OS X (using MOL) on 64k PAGE_SIZE hosts for me. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
c01e3f66 |
|
10-Jul-2014 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Book3S: Add hack for split real mode Today we handle split real mode by mapping both instruction and data faults into a special virtual address space that only exists during the split mode phase. This is good enough to catch 32bit Linux guests that use split real mode for copy_from/to_user. In this case we're always prefixed with 0xc0000000 for our instruction pointer and can map the user space process freely below there. However, that approach fails when we're running KVM inside of KVM. Here the 1st level last_inst reader may well be in the same virtual page as a 2nd level interrupt handler. It also fails when running Mac OS X guests. Here we have a 4G/4G split, so a kernel copy_from/to_user implementation can easily overlap with user space addresses. The architecturally correct way to fix this would be to implement an instruction interpreter in KVM that kicks in whenever we go into split real mode. This interpreter however would not receive a great amount of testing and be a lot of bloat for a reasonably isolated corner case. So I went back to the drawing board and tried to come up with a way to make split real mode work with a single flat address space. And then I realized that we could get away with the same trick that makes it work for Linux: Whenever we see an instruction address during split real mode that may collide, we just move it higher up the virtual address space to a place that hopefully does not collide (keep your fingers crossed!). That approach does work surprisingly well. I am able to successfully run Mac OS X guests with KVM and QEMU (no split real mode hacks like MOL) when I apply a tiny timing probe hack to QEMU. I'd say this is a win over even more broken split real mode :). Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
ae2113a4 |
|
01-Jun-2014 |
Paul Mackerras <paulus@samba.org> |
KVM: PPC: Book3S: Allow only implemented hcalls to be enabled or disabled This adds code to check that when the KVM_CAP_PPC_ENABLE_HCALL capability is used to enable or disable in-kernel handling of an hcall, that the hcall is actually implemented by the kernel. If not an EINVAL error is returned. This also checks the default-enabled list of hcalls and prints a warning if any hcall there is not actually implemented. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
699a0ea0 |
|
01-Jun-2014 |
Paul Mackerras <paulus@samba.org> |
KVM: PPC: Book3S: Controls for in-kernel sPAPR hypercall handling This provides a way for userspace controls which sPAPR hcalls get handled in the kernel. Each hcall can be individually enabled or disabled for in-kernel handling, except for H_RTAS. The exception for H_RTAS is because userspace can already control whether individual RTAS functions are handled in-kernel or not via the KVM_PPC_RTAS_DEFINE_TOKEN ioctl, and because the numeric value for H_RTAS is out of the normal sequence of hcall numbers. Hcalls are enabled or disabled using the KVM_ENABLE_CAP ioctl for the KVM_CAP_PPC_ENABLE_HCALL capability on the file descriptor for the VM. The args field of the struct kvm_enable_cap specifies the hcall number in args[0] and the enable/disable flag in args[1]; 0 means disable in-kernel handling (so that the hcall will always cause an exit to userspace) and 1 means enable. Enabling or disabling in-kernel handling of an hcall is effective across the whole VM. The ability for KVM_ENABLE_CAP to be used on a VM file descriptor on PowerPC is new, added by this commit. The KVM_CAP_ENABLE_CAP_VM capability advertises that this ability exists. When a VM is created, an initial set of hcalls are enabled for in-kernel handling. The set that is enabled is the set that have an in-kernel implementation at this point. Any new hcall implementations from this point onwards should not be added to the default set without a good reason. No distinction is made between real-mode and virtual-mode hcall implementations; the one setting controls them both. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
568fccc4 |
|
16-Jun-2014 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Book3S PR: Handle hyp doorbell exits If we're running PR KVM in HV mode, we may get hypervisor doorbell interrupts. Handle those the same way we treat normal doorbells. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
fb4188ba |
|
08-Jun-2014 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Book3s PR: Disable AIL mode with OPAL When we're using PR KVM we must not allow the CPU to take interrupts in virtual mode, as the SLB does not contain host kernel mappings when running inside the guest context. To make sure we get good performance for non-KVM tasks but still properly functioning PR KVM, let's just disable AIL whenever a vcpu is scheduled in. This is fundamentally different from how we deal with AIL on pSeries type machines where we disable AIL for the whole machine as soon as a single KVM VM is up. The reason for that is easy - on pSeries we do not have control over per-cpu configuration of AIL. We also don't want to mess with CPU hotplug races and AIL configuration, so setting it per CPU is easier and more flexible. This patch fixes running PR KVM on POWER8 bare metal for me. Signed-off-by: Alexander Graf <agraf@suse.de> Acked-by: Paul Mackerras <paulus@samba.org>
|
#
06da28e7 |
|
05-Jun-2014 |
Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> |
KVM: PPC: BOOK3S: PR: Emulate instruction counter Writing to IC is not allowed in the privileged mode. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
8f42ab27 |
|
05-Jun-2014 |
Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> |
KVM: PPC: BOOK3S: PR: Emulate virtual timebase register virtual time base register is a per VM, per cpu register that needs to be saved and restored on vm exit and entry. Writing to VTB is not allowed in the privileged mode. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> [agraf: fix compile error] Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
3cd60e31 |
|
04-Jun-2014 |
Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> |
KVM: PPC: BOOK3S: PR: Fix PURR and SPURR emulation We use time base for PURR and SPURR emulation with PR KVM since we are emulating a single threaded core. When using time base we need to make sure that we don't accumulate time spent in the host in PURR and SPURR value. Also we don't need to emulate mtspr because both the registers are hypervisor resource. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
9916d57e |
|
29-Apr-2014 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Book3S PR: Expose TM registers POWER8 introduces transactional memory which brings along a number of new registers and MSR bits. Implementing all of those is a pretty big headache, so for now let's at least emulate enough to make Linux's context switching code happy. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
e14e7a1e |
|
21-Apr-2014 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Book3S PR: Expose TAR facility to guest POWER8 implements a new register called TAR. This register has to be enabled in FSCR and then from KVM's point of view is mere storage. This patch enables the guest to use TAR. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
616dff86 |
|
29-Apr-2014 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Book3S PR: Handle Facility interrupt and FSCR POWER8 introduced a new interrupt type called "Facility unavailable interrupt" which contains its status message in a new register called FSCR. Handle these exits and try to emulate instructions for unhandled facilities. Follow-on patches enable KVM to expose specific facilities into the guest. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
cd087eef |
|
24-Apr-2014 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Book3S PR: Do dcbz32 patching with big endian instructions When the host CPU we're running on doesn't support dcbz32 itself, but the guest wants to have dcbz only clear 32 bytes of data, we loop through every executable mapped page to search for dcbz instructions and patch them with a special privileged instruction that we emulate as dcbz32. The only guests that want to see dcbz act as 32byte are book3s_32 guests, so we don't have to worry about little endian instruction ordering. So let's just always search for big endian dcbz instructions, also when we're on a little endian host. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
5deb8e7a |
|
24-Apr-2014 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Make shared struct aka magic page guest endian The shared (magic) page is a data structure that contains often used supervisor privileged SPRs accessible via memory to the user to reduce the number of exits we have to take to read/write them. When we actually share this structure with the guest we have to maintain it in guest endianness, because some of the patch tricks only work with native endian load/store operations. Since we only share the structure with either host or guest in little endian on book3s_64 pr mode, we don't have to worry about booke or book3s hv. For booke, the shared struct stays big endian. For book3s_64 hv we maintain the struct in host native endian, since it never gets shared with the guest. For book3s_64 pr we introduce a variable that tells us which endianness the shared struct is in and route every access to it through helper inline functions that evaluate this variable. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
94810ba4 |
|
24-Apr-2014 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Book3S PR: Default to big endian guest The default MSR when user space does not define anything should be identical on little and big endian hosts, so remove MSR_LE from it. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
7562c4fd |
|
04-May-2014 |
Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> |
KVM: PPC: BOOK3S: PR: Fix WARN_ON with debug options on With debug option "sleep inside atomic section checking" enabled we get the below WARN_ON during a PR KVM boot. This is because upstream now have PREEMPT_COUNT enabled even if we have preempt disabled. Fix the warning by adding preempt_disable/enable around floating point and altivec enable. WARNING: at arch/powerpc/kernel/process.c:156 Modules linked in: kvm_pr kvm CPU: 1 PID: 3990 Comm: qemu-system-ppc Tainted: G W 3.15.0-rc1+ #4 task: c0000000eb85b3a0 ti: c0000000ec59c000 task.ti: c0000000ec59c000 NIP: c000000000015c84 LR: d000000003334644 CTR: c000000000015c00 REGS: c0000000ec59f140 TRAP: 0700 Tainted: G W (3.15.0-rc1+) MSR: 8000000000029032 <SF,EE,ME,IR,DR,RI> CR: 42000024 XER: 20000000 CFAR: c000000000015c24 SOFTE: 1 GPR00: d000000003334644 c0000000ec59f3c0 c000000000e2fa40 c0000000e2f80000 GPR04: 0000000000000800 0000000000002000 0000000000000001 8000000000000000 GPR08: 0000000000000001 0000000000000001 0000000000002000 c000000000015c00 GPR12: d00000000333da18 c00000000fb80900 0000000000000000 0000000000000000 GPR16: 0000000000000000 0000000000000000 0000000000000000 00003fffce4e0fa1 GPR20: 0000000000000010 0000000000000001 0000000000000002 00000000100b9a38 GPR24: 0000000000000002 0000000000000000 0000000000000000 0000000000000013 GPR28: 0000000000000000 c0000000eb85b3a0 0000000000002000 c0000000e2f80000 NIP [c000000000015c84] .enable_kernel_fp+0x84/0x90 LR [d000000003334644] .kvmppc_handle_ext+0x134/0x190 [kvm_pr] Call Trace: [c0000000ec59f3c0] [0000000000000010] 0x10 (unreliable) [c0000000ec59f430] [d000000003334644] .kvmppc_handle_ext+0x134/0x190 [kvm_pr] [c0000000ec59f4c0] [d00000000324b380] .kvmppc_set_msr+0x30/0x50 [kvm] [c0000000ec59f530] [d000000003337cac] .kvmppc_core_emulate_op_pr+0x16c/0x5e0 [kvm_pr] [c0000000ec59f5f0] [d00000000324a944] .kvmppc_emulate_instruction+0x284/0xa80 [kvm] [c0000000ec59f6c0] [d000000003336888] .kvmppc_handle_exit_pr+0x488/0xb70 [kvm_pr] [c0000000ec59f790] [d000000003338d34] kvm_start_lightweight+0xcc/0xdc [kvm_pr] [c0000000ec59f960] [d000000003336288] .kvmppc_vcpu_run_pr+0xc8/0x190 [kvm_pr] [c0000000ec59f9f0] [d00000000324c880] .kvmppc_vcpu_run+0x30/0x50 [kvm] [c0000000ec59fa60] [d000000003249e74] .kvm_arch_vcpu_ioctl_run+0x54/0x1b0 [kvm] [c0000000ec59faf0] [d000000003244948] .kvm_vcpu_ioctl+0x478/0x760 [kvm] [c0000000ec59fcb0] [c000000000224e34] .do_vfs_ioctl+0x4d4/0x790 [c0000000ec59fd90] [c000000000225148] .SyS_ioctl+0x58/0xb0 [c0000000ec59fe30] [c00000000000a1e4] syscall_exit+0x0/0x98 Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
e5ee5422 |
|
04-May-2014 |
Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> |
KVM: PPC: BOOK3S: PR: Enable Little Endian PR guest This patch make sure we inherit the LE bit correctly in different case so that we can run Little Endian distro in PR mode Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
ab78475c |
|
06-Apr-2014 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Book3S: ifdef on CONFIG_KVM_BOOK3S_32_HANDLER for 32bit The book3s_32 target can get built as module which means we don't see the config define for it in code. Instead, check on the bool define CONFIG_KVM_BOOK3S_32_HANDLER whenever we want to know whether we're building for a book3s_32 host. This fixes running book3s_32 kvm as a module for me. Signed-off-by: Alexander Graf <agraf@suse.de> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
|
#
40688909 |
|
08-Jan-2014 |
Paul Mackerras <paulus@samba.org> |
KVM: PPC: Book3S PR: Cope with doorbell interrupts When the PR host is running on a POWER8 machine in POWER8 mode, it will use doorbell interrupts for IPIs. If one of them arrives while we are in the guest, we pop out of the guest with trap number 0xA00, which isn't handled by kvmppc_handle_exit_pr, leading to the following BUG_ON: [ 331.436215] exit_nr=0xa00 | pc=0x1d2c | msr=0x800000000000d032 [ 331.437522] ------------[ cut here ]------------ [ 331.438296] kernel BUG at arch/powerpc/kvm/book3s_pr.c:982! [ 331.439063] Oops: Exception in kernel mode, sig: 5 [#2] [ 331.439819] SMP NR_CPUS=1024 NUMA pSeries [ 331.440552] Modules linked in: tun nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6t_REJECT xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw virtio_net kvm binfmt_misc ibmvscsi scsi_transport_srp scsi_tgt virtio_blk [ 331.447614] CPU: 11 PID: 1296 Comm: qemu-system-ppc Tainted: G D 3.11.7-200.2.fc19.ppc64p7 #1 [ 331.448920] task: c0000003bdc8c000 ti: c0000003bd32c000 task.ti: c0000003bd32c000 [ 331.450088] NIP: d0000000025d6b9c LR: d0000000025d6b98 CTR: c0000000004cfdd0 [ 331.451042] REGS: c0000003bd32f420 TRAP: 0700 Tainted: G D (3.11.7-200.2.fc19.ppc64p7) [ 331.452331] MSR: 800000000282b032 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI> CR: 28004824 XER: 20000000 [ 331.454616] SOFTE: 1 [ 331.455106] CFAR: c000000000848bb8 [ 331.455726] GPR00: d0000000025d6b98 c0000003bd32f6a0 d0000000026017b8 0000000000000032 GPR04: c0000000018627f8 c000000001873208 320d0a3030303030 3030303030643033 GPR08: c000000000c490a8 0000000000000000 0000000000000000 0000000000000002 GPR12: 0000000028004822 c00000000fdc6300 0000000000000000 00000100076ec310 GPR16: 000000002ae343b8 00003ffffd397398 0000000000000000 0000000000000000 GPR20: 00000100076f16f4 00000100076ebe60 0000000000000008 ffffffffffffffff GPR24: 0000000000000000 0000008001041e60 0000000000000000 0000008001040ce8 GPR28: c0000003a2d80000 0000000000000a00 0000000000000001 c0000003a2681810 [ 331.466504] NIP [d0000000025d6b9c] .kvmppc_handle_exit_pr+0x75c/0xa80 [kvm] [ 331.466999] LR [d0000000025d6b98] .kvmppc_handle_exit_pr+0x758/0xa80 [kvm] [ 331.467517] Call Trace: [ 331.467909] [c0000003bd32f6a0] [d0000000025d6b98] .kvmppc_handle_exit_pr+0x758/0xa80 [kvm] (unreliable) [ 331.468553] [c0000003bd32f750] [d0000000025d98f0] kvm_start_lightweight+0xb4/0xc4 [kvm] [ 331.469189] [c0000003bd32f920] [d0000000025d7648] .kvmppc_vcpu_run_pr+0xd8/0x270 [kvm] [ 331.469838] [c0000003bd32f9c0] [d0000000025cf748] .kvmppc_vcpu_run+0xc8/0xf0 [kvm] [ 331.470790] [c0000003bd32fa50] [d0000000025cc19c] .kvm_arch_vcpu_ioctl_run+0x5c/0x1b0 [kvm] [ 331.471401] [c0000003bd32fae0] [d0000000025c4888] .kvm_vcpu_ioctl+0x478/0x730 [kvm] [ 331.472026] [c0000003bd32fc90] [c00000000026192c] .do_vfs_ioctl+0x4dc/0x7a0 [ 331.472561] [c0000003bd32fd80] [c000000000261cc4] .SyS_ioctl+0xd4/0xf0 [ 331.473095] [c0000003bd32fe30] [c000000000009ed8] syscall_exit+0x0/0x98 [ 331.473633] Instruction dump: [ 331.473766] 4bfff9b4 2b9d0800 419efc18 60000000 60420000 3d220000 e8bf11a0 e8df12a8 [ 331.474733] 7fa4eb78 e8698660 48015165 e8410028 <0fe00000> 813f00e4 3ba00000 39290001 [ 331.475386] ---[ end trace 49fc47d994c1f8f2 ]--- [ 331.479817] This fixes the problem by making kvmppc_handle_exit_pr() recognize the interrupt. We also need to jump to the doorbell interrupt handler in book3s_segment.S to handle the interrupt on the way out of the guest. Having done that, there's nothing further to be done in kvmppc_handle_exit_pr(). Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
6c85f52b |
|
09-Jan-2014 |
Scott Wood <scottwood@freescale.com> |
kvm/ppc: IRQ disabling cleanup Simplify the handling of lazy EE by going directly from fully-enabled to hard-disabled. This replaces the lazy_irq_pending() check (including its misplaced kvm_guest_exit() call). As suggested by Tiejun Chen, move the interrupt disabling into kvmppc_prepare_to_enter() rather than have each caller do it. Also move the IRQ enabling on heavyweight exit into kvmppc_prepare_to_enter(). Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
99dae3ba |
|
15-Oct-2013 |
Paul Mackerras <paulus@samba.org> |
KVM: PPC: Load/save FP/VMX/VSX state directly to/from vcpu struct Now that we have the vcpu floating-point and vector state stored in the same type of struct as the main kernel uses, we can load that state directly from the vcpu struct instead of having extra copies to/from the thread_struct. Similarly, when the guest state needs to be saved, we can have it saved it directly to the vcpu struct by setting the current->thread.fp_save_area and current->thread.vr_save_area pointers. That also means that we don't need to back up and restore userspace's FP/vector state. This all makes the code simpler and faster. Note that it's not necessary to save or modify current->thread.fpexc_mode, since nothing in KVM uses or is affected by its value. Nor is it necessary to touch used_vr or used_vsr. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
efff1912 |
|
15-Oct-2013 |
Paul Mackerras <paulus@samba.org> |
KVM: PPC: Store FP/VSX/VMX state in thread_fp/vr_state structures This uses struct thread_fp_state and struct thread_vr_state to store the floating-point, VMX/Altivec and VSX state, rather than flat arrays. This makes transferring the state to/from the thread_struct simpler and allows us to unify the get/set_one_reg implementations for the VSX registers. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
09548fda |
|
15-Oct-2013 |
Paul Mackerras <paulus@samba.org> |
KVM: PPC: Use load_fp/vr_state rather than load_up_fpu/altivec The load_up_fpu and load_up_altivec functions were never intended to be called from C, and do things like modifying the MSR value in their callers' stack frames, which are assumed to be interrupt frames. In addition, on 32-bit Book S they require the MMU to be off. This makes KVM use the new load_fp_state() and load_vr_state() functions instead of load_up_fpu/altivec. This means we can remove the assembler glue in book3s_rmhandlers.S, and potentially fixes a bug on Book E, where load_up_fpu was called directly from C. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
398a76c6 |
|
09-Dec-2013 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Add devname:kvm aliases for modules Systems that support automatic loading of kernel modules through device aliases should try and automatically load kvm when /dev/kvm gets opened. Add code to support that magic for all PPC kvm targets, even the ones that don't support modules yet. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
40fdd8c8 |
|
28-Nov-2013 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Book3S: PR: Make svcpu -> vcpu store preempt savvy As soon as we get back to our "highmem" handler in virtual address space we may get preempted. Today the reason we can get preempted is that we replay interrupts and all the lazy logic thinks we have interrupts enabled. However, it's not hard to make the code interruptible and that way we can enable and handle interrupts even earlier. This fixes random guest crashes that happened with CONFIG_PREEMPT=y for me. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
a78b55d1 |
|
07-Oct-2013 |
Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> |
kvm: powerpc: book3s: drop is_hv_enabled drop is_hv_enabled, because that should not be a callback property Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
cbbc58d4 |
|
07-Oct-2013 |
Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> |
kvm: powerpc: book3s: Allow the HV and PR selection per virtual machine This moves the kvmppc_ops callbacks to be a per VM entity. This enables us to select HV and PR mode when creating a VM. We also allow both kvm-hv and kvm-pr kernel module to be loaded. To achieve this we move /dev/kvm ownership to kvm.ko module. Depending on which KVM mode we select during VM creation we take a reference count on respective module Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> [agraf: fix coding style] Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
2ba9f0d8 |
|
07-Oct-2013 |
Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> |
kvm: powerpc: book3s: Support building HV and PR KVM as module Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> [agraf: squash in compile fix] Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
72c12535 |
|
07-Oct-2013 |
Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> |
kvm: powerpc: book3s: pr: move PR related tracepoints to a separate header This patch moves PR related tracepoints to a separate header. This enables in converting PR to a kernel module which will be done in later patches Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
699cc876 |
|
07-Oct-2013 |
Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> |
kvm: powerpc: book3s: Add is_hv_enabled to kvmppc_ops This help us to identify whether we are running with hypervisor mode KVM enabled. The change is needed so that we can have both HV and PR kvm enabled in the same kernel. If both HV and PR KVM are included, interrupts come in to the HV version of the kvmppc_interrupt code, which then jumps to the PR handler, renamed to kvmppc_interrupt_pr, if the guest is a PR guest. Allowing both PR and HV in the same kernel required some changes to kvm_dev_ioctl_check_extension(), since the values returned now can't be selected with #ifdefs as much as previously. We look at is_hv_enabled to return the right value when checking for capabilities.For capabilities that are only provided by HV KVM, we return the HV value only if is_hv_enabled is true. For capabilities provided by PR KVM but not HV, we return the PR value only if is_hv_enabled is false. NOTE: in later patch we replace is_hv_enabled with a static inline function comparing kvm_ppc_ops Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
3a167bea |
|
07-Oct-2013 |
Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> |
kvm: powerpc: Add kvmppc_ops callback This patch add a new callback kvmppc_ops. This will help us in enabling both HV and PR KVM together in the same kernel. The actual change to enable them together is done in the later patch in the series. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> [agraf: squash in booke changes] Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
491d6ecc |
|
19-Sep-2013 |
Paul Mackerras <paulus@samba.org> |
KVM: PPC: Book3S PR: Reduce number of shadow PTEs invalidated by MMU notifiers Currently, whenever any of the MMU notifier callbacks get called, we invalidate all the shadow PTEs. This is inefficient because it means that we typically then get a lot of DSIs and ISIs in the guest to fault the shadow PTEs back in. We do this even if the address range being notified doesn't correspond to guest memory. This commit adds code to scan the memslot array to find out what range(s) of guest physical addresses corresponds to the host virtual address range being affected. For each such range we flush only the shadow PTEs for the range, on all cpus. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
93b159b4 |
|
19-Sep-2013 |
Paul Mackerras <paulus@samba.org> |
KVM: PPC: Book3S PR: Better handling of host-side read-only pages Currently we request write access to all pages that get mapped into the guest, even if the guest is only loading from the page. This reduces the effectiveness of KSM because it means that we unshare every page we access. Also, we always set the changed (C) bit in the guest HPTE if it allows writing, even for a guest load. This fixes both these problems. We pass an 'iswrite' flag to the mmu.xlate() functions and to kvmppc_mmu_map_page() to indicate whether the access is a load or a store. The mmu.xlate() functions now only set C for stores. kvmppc_gfn_to_pfn() now calls gfn_to_pfn_prot() instead of gfn_to_pfn() so that it can indicate whether we need write access to the page, and get back a 'writable' flag to indicate whether the page is writable or not. If that 'writable' flag is clear, we then make the host HPTE read-only even if the guest HPTE allowed writing. This means that we can get a protection fault when the guest writes to a page that it has mapped read-write but which is read-only on the host side (perhaps due to KSM having merged the page). Thus we now call kvmppc_handle_pagefault() for protection faults as well as HPTE not found faults. In kvmppc_handle_pagefault(), if the access was allowed by the guest HPTE and we thus need to install a new host HPTE, we then need to remove the old host HPTE if there is one. This is done with a new function, kvmppc_mmu_unmap_page(), which uses kvmppc_mmu_pte_vflush() to find and remove the old host HPTE. Since the memslot-related functions require the KVM SRCU read lock to be held, this adds srcu_read_lock/unlock pairs around the calls to kvmppc_handle_pagefault(). Finally, this changes kvmppc_mmu_book3s_32_xlate_pte() to not ignore guest HPTEs that don't permit access, and to return -EPERM for accesses that are not permitted by the page protections. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
3ff95502 |
|
19-Sep-2013 |
Paul Mackerras <paulus@samba.org> |
KVM: PPC: Book3S PR: Allocate kvm_vcpu structs from kvm_vcpu_cache This makes PR KVM allocate its kvm_vcpu structs from the kvm_vcpu_cache rather than having them embedded in the kvmppc_vcpu_book3s struct, which is allocated with vzalloc. The reason is to reduce the differences between PR and HV KVM in order to make is easier to have them coexist in one kernel binary. With this, the kvm_vcpu struct has a pointer to the kvmppc_vcpu_book3s struct. The pointer to the kvmppc_book3s_shadow_vcpu struct has moved from the kvmppc_vcpu_book3s struct to the kvm_vcpu struct, and is only present for 32-bit, since it is only used for 32-bit. Signed-off-by: Paul Mackerras <paulus@samba.org> [agraf: squash in compile fix from Aneesh] Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
9308ab8e |
|
19-Sep-2013 |
Paul Mackerras <paulus@samba.org> |
KVM: PPC: Book3S PR: Make HPT accesses and updates SMP-safe This adds a per-VM mutex to provide mutual exclusion between vcpus for accesses to and updates of the guest hashed page table (HPT). This also makes the code use single-byte writes to the HPT entry when updating of the reference (R) and change (C) bits. The reason for doing this, rather than writing back the whole HPTE, is that on non-PAPR virtual machines, the guest OS might be writing to the HPTE concurrently, and writing back the whole HPTE might conflict with that. Also, real hardware does single-byte writes to update R and C. The new mutex is taken in kvmppc_mmu_book3s_64_xlate() when reading the HPT and updating R and/or C, and in the PAPR HPT update hcalls (H_ENTER, H_REMOVE, etc.). Having the mutex means that we don't need to use a hypervisor lock bit in the HPT update hcalls, and we don't need to be careful about the order in which the bytes of the HPTE are updated by those hcalls. The other change here is to make emulated TLB invalidations (tlbie) effective across all vcpus. To do this we call kvmppc_mmu_pte_vflush for all vcpus in kvmppc_ppc_book3s_64_tlbie(). For 32-bit, this makes the setting of the accessed and dirty bits use single-byte writes, and makes tlbie invalidate shadow HPTEs for all vcpus. With this, PR KVM can successfully run SMP guests. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
c9029c34 |
|
19-Sep-2013 |
Paul Mackerras <paulus@samba.org> |
KVM: PPC: Book3S PR: Use 64k host pages where possible Currently, PR KVM uses 4k pages for the host-side mappings of guest memory, regardless of the host page size. When the host page size is 64kB, we might as well use 64k host page mappings for guest mappings of 64kB and larger pages and for guest real-mode mappings. However, the magic page has to remain a 4k page. To implement this, we first add another flag bit to the guest VSID values we use, to indicate that this segment is one where host pages should be mapped using 64k pages. For segments with this bit set we set the bits in the shadow SLB entry to indicate a 64k base page size. When faulting in host HPTEs for this segment, we make them 64k HPTEs instead of 4k. We record the pagesize in struct hpte_cache for use when invalidating the HPTE. For now we restrict the segment containing the magic page (if any) to 4k pages. It should be possible to lift this restriction in future by ensuring that the magic 4k page is appropriately positioned within a host 64k page. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
a4a0f252 |
|
19-Sep-2013 |
Paul Mackerras <paulus@samba.org> |
KVM: PPC: Book3S PR: Allow guest to use 64k pages This adds the code to interpret 64k HPTEs in the guest hashed page table (HPT), 64k SLB entries, and to tell the guest about 64k pages in kvm_vm_ioctl_get_smmu_info(). Guest 64k pages are still shadowed by 4k pages. This also adds another hash table to the four we have already in book3s_mmu_hpte.c to allow us to find all the PTEs that we have instantiated that match a given 64k guest page. The tlbie instruction changed starting with POWER6 to use a bit in the RB operand to indicate large page invalidations, and to use other RB bits to indicate the base and actual page sizes and the segment size. 64k pages came in slightly earlier, with POWER5++. We use one bit in vcpu->arch.hflags to indicate that the emulated cpu supports 64k pages, and another to indicate that it has the new tlbie definition. The KVM_PPC_GET_SMMU_INFO ioctl presents a bit of a problem, because the MMU capabilities depend on which CPU model we're emulating, but it is a VM ioctl not a VCPU ioctl and therefore doesn't get passed a VCPU fd. In addition, commonly-used userspace (QEMU) calls it before setting the PVR for any VCPU. Therefore, as a best effort we look at the first vcpu in the VM and return 64k pages or not depending on its capabilities. We also make the PVR default to the host PVR on recent CPUs that support 1TB segments (and therefore multiple page sizes as well) so that KVM_PPC_GET_SMMU_INFO will include 64k page and 1TB segment support on those CPUs. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
a2d56020 |
|
19-Sep-2013 |
Paul Mackerras <paulus@samba.org> |
KVM: PPC: Book3S PR: Keep volatile reg values in vcpu rather than shadow_vcpu Currently PR-style KVM keeps the volatile guest register values (R0 - R13, CR, LR, CTR, XER, PC) in a shadow_vcpu struct rather than the main kvm_vcpu struct. For 64-bit, the shadow_vcpu exists in two places, a kmalloc'd struct and in the PACA, and it gets copied back and forth in kvmppc_core_vcpu_load/put(), because the real-mode code can't rely on being able to access the kmalloc'd struct. This changes the code to copy the volatile values into the shadow_vcpu as one of the last things done before entering the guest. Similarly the values are copied back out of the shadow_vcpu to the kvm_vcpu immediately after exiting the guest. We arrange for interrupts to be still disabled at this point so that we can't get preempted on 64-bit and end up copying values from the wrong PACA. This means that the accessor functions in kvm_book3s.h for these registers are greatly simplified, and are same between PR and HV KVM. In places where accesses to shadow_vcpu fields are now replaced by accesses to the kvm_vcpu, we can also remove the svcpu_get/put pairs. Finally, on 64-bit, we don't need the kmalloc'd struct at all any more. With this, the time to read the PVR one million times in a loop went from 567.7ms to 575.5ms (averages of 6 values), an increase of about 1.4% for this worse-case test for guest entries and exits. The standard deviation of the measurements is about 11ms, so the difference is only marginally significant statistically. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
f2481771 |
|
19-Sep-2013 |
Paul Mackerras <paulus@samba.org> |
KVM: PPC: Book3S PR: Fix compilation without CONFIG_ALTIVEC Commit 9d1ffdd8f3 ("KVM: PPC: Book3S PR: Don't corrupt guest state when kernel uses VMX") added a call to kvmppc_load_up_altivec() that isn't guarded by CONFIG_ALTIVEC, causing a link failure when building a kernel without CONFIG_ALTIVEC set. This adds an #ifdef to fix this. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
de79f7b9 |
|
10-Sep-2013 |
Paul Mackerras <paulus@samba.org> |
powerpc: Put FP/VSX and VR state into structures This creates new 'thread_fp_state' and 'thread_vr_state' structures to store FP/VSX state (including FPSCR) and Altivec/VSX state (including VSCR), and uses them in the thread_struct. In the thread_fp_state, the FPRs and VSRs are represented as u64 rather than double, since we rarely perform floating-point computations on the values, and this will enable the structures to be used in KVM code as well. Similarly FPSCR is now a u64 rather than a structure of two 32-bit values. This takes the offsets out of the macros such as SAVE_32FPRS, REST_32FPRS, etc. This enables the same macros to be used for normal and transactional state, enabling us to delete the transactional versions of the macros. This also removes the unused do_load_up_fpu and do_load_up_altivec, which were in fact buggy since they didn't create large enough stack frames to account for the fact that load_up_fpu and load_up_altivec are not designed to be called from C and assume that their caller's stack frame is an interrupt frame. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
#
8b23de29 |
|
05-Aug-2013 |
Paul Mackerras <paulus@samba.org> |
KVM: PPC: Book3S PR: Make instruction fetch fallback work for system calls It turns out that if we exit the guest due to a hcall instruction (sc 1), and the loading of the instruction in the guest exit path fails for any reason, the call to kvmppc_ld() in kvmppc_get_last_inst() fetches the instruction after the hcall instruction rather than the hcall itself. This in turn means that the instruction doesn't get recognized as an hcall in kvmppc_handle_exit_pr() but gets passed to the guest kernel as a sc instruction. That usually results in the guest kernel getting a return code of 38 (ENOSYS) from an hcall, which often triggers a BUG_ON() or other failure. This fixes the problem by adding a new variant of kvmppc_get_last_inst() called kvmppc_get_last_sc(), which fetches the instruction if necessary from pc - 4 rather than pc. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
9d1ffdd8 |
|
05-Aug-2013 |
Paul Mackerras <paulus@samba.org> |
KVM: PPC: Book3S PR: Don't corrupt guest state when kernel uses VMX Currently the code assumes that once we load up guest FP/VSX or VMX state into the CPU, it stays valid in the CPU registers until we explicitly flush it to the thread_struct. However, on POWER7, copy_page() and memcpy() can use VMX. These functions do flush the VMX state to the thread_struct before using VMX instructions, but if this happens while we have guest state in the VMX registers, and we then re-enter the guest, we don't reload the VMX state from the thread_struct, leading to guest corruption. This has been observed to cause guest processes to segfault. To fix this, we check before re-entering the guest that all of the bits corresponding to facilities owned by the guest, as expressed in vcpu->arch.guest_owned_ext, are set in current->thread.regs->msr. Any bits that have been cleared correspond to facilities that have been used by kernel code and thus flushed to the thread_struct, so for them we reload the state from the thread_struct. We also need to check current->thread.regs->msr before calling giveup_fpu() or giveup_altivec(), since if the relevant bit is clear, the state has already been flushed to the thread_struct and to flush it again would corrupt it. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
7c7b406e |
|
16-Jul-2013 |
Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com> |
KVM: PPC: Book3S PR: return appropriate error when allocation fails err was overwritten by a previous function call, and checked to be 0. If the following page allocation fails, 0 is going to be returned instead of -ENOMEM. Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
e0e13614 |
|
16-Jul-2013 |
Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com> |
powerpc/kvm/book3s_pr: Return appropriate error when allocation fails err was overwritten by a previous function call, and checked to be 0. If the following page allocation fails, 0 is going to be returned instead of -ENOMEM. Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
#
5f1c248f |
|
10-Jul-2013 |
Scott Wood <scottwood@freescale.com> |
kvm/ppc: Call trace_hardirqs_on before entry Currently this is only being done on 64-bit. Rather than just move it out of the 64-bit ifdef, move it to kvm_lazy_ee_enable() so that it is consistent with lazy ee state, and so that we don't track more host code as interrupts-enabled than necessary. Rename kvm_lazy_ee_enable() to kvm_fix_ee_before_entry() to reflect that this function now has a role on 32-bit as well. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
0f296829 |
|
22-Jun-2013 |
Paul Mackerras <paulus@samba.org> |
KVM: PPC: Book3S PR: Allow guest to use 1TB segments With this, the guest can use 1TB segments as well as 256MB segments. Since we now have the situation where a single emulated guest segment could correspond to multiple shadow segments (as the shadow segments are still 256MB segments), this adds a new kvmppc_mmu_flush_segment() to scan for all shadow segments that need to be removed. This restructures the guest HPT (hashed page table) lookup code to use the correct hashing and matching functions for HPTEs within a 1TB segment. We use the standard hpt_hash() function instead of open-coding the hash calculation, and we use HPTE_V_COMPARE() with an AVPN value that has the B (segment size) field included. The calculation of avpn is done a little earlier since it doesn't change in the loop starting at the do_second label. The computation in kvmppc_mmu_book3s_64_esid_to_vsid() changes so that it returns a 256MB VSID even if the guest SLB entry is a 1TB entry. This is because the users of this function are creating 256MB SLB entries. We set a new VSID_1T flag so that entries created from 1T segments don't collide with entries from 256MB segments. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
8e591cb7 |
|
17-Apr-2013 |
Michael Ellerman <michael@ellerman.id.au> |
KVM: PPC: Book3S: Add infrastructure to implement kernel-side RTAS calls For pseries machine emulation, in order to move the interrupt controller code to the kernel, we need to intercept some RTAS calls in the kernel itself. This adds an infrastructure to allow in-kernel handlers to be registered for RTAS services by name. A new ioctl, KVM_PPC_RTAS_DEFINE_TOKEN, then allows userspace to associate token values with those service names. Then, when the guest requests an RTAS service with one of those token values, it will be handled by the relevant in-kernel handler rather than being passed up to userspace as at present. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org> [agraf: fix warning] Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
0f47f9b5 |
|
07-Apr-2013 |
Bharat Bhushan <bharat.bhushan@freescale.com> |
KVM: extend EMULATE_EXIT_USER to support different exit reasons Currently the instruction emulator code returns EMULATE_EXIT_USER and common code initializes the "run->exit_reason = .." and "vcpu->arch.hcall_needed = .." with one fixed reason. But there can be different reasons when emulator need to exit to user space. To support that the "run->exit_reason = .." and "vcpu->arch.hcall_needed = .." initialization is moved a level up to emulator. Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
c402a3f4 |
|
07-Apr-2013 |
Bharat Bhushan <r65777@freescale.com> |
Rename EMULATE_DO_PAPR to EMULATE_EXIT_USER Instruction emulation return EMULATE_DO_PAPR when it requires exit to userspace on book3s. Similar return is required for booke. EMULATE_DO_PAPR reads out to be confusing so it is renamed to EMULATE_EXIT_USER. Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
c843be8a |
|
11-Mar-2013 |
Zhang Yanfei <zhangyanfei@cn.fujitsu.com> |
powerpc: remove cast for kmalloc/kzalloc return value remove cast for kmalloc/kzalloc return value. Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
|
#
6e51c9ff |
|
11-Mar-2013 |
Zhang Yanfei <zhangyanfei@cn.fujitsu.com> |
powerpc: remove cast for kmalloc/kzalloc return value remove cast for kmalloc/kzalloc return value. Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
#
8482644a |
|
27-Feb-2013 |
Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp> |
KVM: set_memory_region: Refactor commit_memory_region() This patch makes the parameter old a const pointer to the old memory slot and adds a new parameter named change to know the change being requested: the former is for removing extra copying and the latter is for cleaning up the code. Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
#
deb26c27 |
|
04-Feb-2013 |
Paul Mackerras <paulus@samba.org> |
powerpc/kvm/book3s_pr: Fix compilation on 32-bit machines Commit a413f474a0 ("powerpc: Disable relocation on exceptions whenever PR KVM is active") added calls to pSeries_disable_reloc_on_exc() and pSeries_enable_reloc_on_exc() to book3s_pr.c, and added declarations of those functions to <asm/hvcall.h>, but didn't add an include of <asm/hvcall.h> to book3s_pr.c. 64-bit kernels seem to get hvcall.h included via some other path, but 32-bit kernels fail to compile with: arch/powerpc/kvm/book3s_pr.c: In function ‘kvmppc_core_init_vm’: arch/powerpc/kvm/book3s_pr.c:1300:4: error: implicit declaration of function ‘pSeries_disable_reloc_on_exc’ [-Werror=implicit-function-declaration] arch/powerpc/kvm/book3s_pr.c: In function ‘kvmppc_core_destroy_vm’: arch/powerpc/kvm/book3s_pr.c:1316:4: error: implicit declaration of function ‘pSeries_enable_reloc_on_exc’ [-Werror=implicit-function-declaration] cc1: all warnings being treated as errors make[2]: *** [arch/powerpc/kvm/book3s_pr.o] Error 1 make[1]: *** [arch/powerpc/kvm] Error 2 make: *** [sub-make] Error 2 This fixes it by adding an include of hvcall.h. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
#
50c7bb80 |
|
14-Dec-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Book3S: PR: Enable alternative instruction for SC 1 When running on top of pHyp, the hypercall instruction "sc 1" goes straight into pHyp without trapping in supervisor mode. So if we want to support PAPR guest in this configuration we need to add a second way of accessing PAPR hypercalls, preferably with the exact same semantics except for the instruction. So let's overlay an officially reserved instruction and emulate PAPR hypercalls whenever we hit that one. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
a413f474 |
|
03-Dec-2012 |
Ian Munsie <imunsie@au1.ibm.com> |
powerpc: Disable relocation on exceptions whenever PR KVM is active For PR KVM we allow userspace to map 0xc000000000000000. Because transitioning from userspace to the guest kernel may use the relocated exception vectors we have to disable relocation on exceptions whenever PR KVM is active as we cannot trust that address. This issue does not apply to HV KVM, since changing from a guest to the hypervisor will never use the relocated exception vectors. Currently the hypervisor interface only allows us to toggle relocation on exceptions on a partition wide scope, so we need to globally disable relocation on exceptions when the first PR KVM instance is started and only re-enable them when all PR KVM instances have been destroyed. It's a bit heavy handed, but until the hypervisor gives us a lightweight way to toggle relocation on exceptions on a single thread it's only real option. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
#
3a2e7b0d |
|
04-Nov-2012 |
Paul Mackerras <paulus@samba.org> |
KVM: PPC: Book3S PR: MSR_DE doesn't exist on Book 3S The mask of MSR bits that get transferred from the guest MSR to the shadow MSR included MSR_DE. In fact that bit only exists on Book 3E processors, and it is assigned the same bit used for MSR_BE on Book 3S processors. Since we already had MSR_BE in the mask, this just removes MSR_DE. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
28c483b6 |
|
04-Nov-2012 |
Paul Mackerras <paulus@samba.org> |
KVM: PPC: Book3S PR: Fix VSX handling This fixes various issues in how we were handling the VSX registers that exist on POWER7 machines. First, we were running off the end of the current->thread.fpr[] array. Ultimately this was because the vcpu->arch.vsr[] array is sized to be able to store both the FP registers and the extra VSX registers (i.e. 64 entries), but PR KVM only uses it for the extra VSX registers (i.e. 32 entries). Secondly, calling load_up_vsx() from C code is a really bad idea, because it jumps to fast_exception_return at the end, rather than returning with a blr instruction. This was causing it to jump off to a random location with random register contents, since it was using the largely uninitialized stack frame created by kvmppc_load_up_vsx. In fact, it isn't necessary to call either __giveup_vsx or load_up_vsx, since giveup_fpu and load_up_fpu handle the extra VSX registers as well as the standard FP registers on machines with VSX. Also, since VSX instructions can access the VMX registers and the FP registers as well as the extra VSX registers, we have to load up the FP and VMX registers before we can turn on the MSR_VSX bit for the guest. Conversely, if we save away any of the VSX or FP registers, we have to turn off MSR_VSX for the guest. To handle all this, it is more convenient for a single call to kvmppc_giveup_ext() to handle all the state saving that needs to be done, so we make it take a set of MSR bits rather than just one, and the switch statement becomes a series of if statements. Similarly kvmppc_handle_ext needs to be able to load up more than one set of registers. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
a8bd19ef |
|
25-Sep-2012 |
Paul Mackerras <paulus@samba.org> |
KVM: PPC: Book3S: Get/set guest FP regs using the GET/SET_ONE_REG interface This enables userspace to get and set all the guest floating-point state using the KVM_[GS]ET_ONE_REG ioctls. The floating-point state includes all of the traditional floating-point registers and the FPSCR (floating point status/control register), all the VMX/Altivec vector registers and the VSCR (vector status/control register), and on POWER7, the vector-scalar registers (note that each FP register is the high-order half of the corresponding VSR). Most of these are implemented in common Book 3S code, except for VSX on POWER7. Because HV and PR differ in how they store the FP and VSX registers on POWER7, the code for these cases is not common. On POWER7, the FP registers are the upper halves of the VSX registers vsr0 - vsr31. PR KVM stores vsr0 - vsr31 in two halves, with the upper halves in the arch.fpr[] array and the lower halves in the arch.vsr[] array, whereas HV KVM on POWER7 stores the whole VSX register in arch.vsr[]. Signed-off-by: Paul Mackerras <paulus@samba.org> [agraf: fix whitespace, vsx compilation] Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
a136a8bd |
|
25-Sep-2012 |
Paul Mackerras <paulus@samba.org> |
KVM: PPC: Book3S: Get/set guest SPRs using the GET/SET_ONE_REG interface This enables userspace to get and set various SPRs (special-purpose registers) using the KVM_[GS]ET_ONE_REG ioctls. With this, userspace can get and set all the SPRs that are part of the guest state, either through the KVM_[GS]ET_REGS ioctls, the KVM_[GS]ET_SREGS ioctls, or the KVM_[GS]ET_ONE_REG ioctls. The SPRs that are added here are: - DABR: Data address breakpoint register - DSCR: Data stream control register - PURR: Processor utilization of resources register - SPURR: Scaled PURR - DAR: Data address register - DSISR: Data storage interrupt status register - AMR: Authority mask register - UAMOR: User authority mask override register - MMCR0, MMCR1, MMCRA: Performance monitor unit control registers - PMC1..PMC8: Performance monitor unit counter registers In order to reduce code duplication between PR and HV KVM code, this moves the kvm_vcpu_ioctl_[gs]et_one_reg functions into book3s.c and centralizes the copying between user and kernel space there. The registers that are handled differently between PR and HV, and those that exist only in one flavor, are handled in kvmppc_[gs]et_one_reg() functions that are specific to each flavor. Signed-off-by: Paul Mackerras <paulus@samba.org> [agraf: minimal style fixes] Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
a47d72f3 |
|
20-Sep-2012 |
Paul Mackerras <paulus@samba.org> |
KVM: PPC: Book3S HV: Fix updates of vcpu->cpu This removes the powerpc "generic" updates of vcpu->cpu in load and put, and moves them to the various backends. The reason is that "HV" KVM does its own sauce with that field and the generic updates might corrupt it. The field contains the CPU# of the -first- HW CPU of the core always for all the VCPU threads of a core (the one that's online from a host Linux perspective). However, the preempt notifiers are going to be called on the threads VCPUs when they are running (due to them sleeping on our private waitqueue) causing unload to be called, potentially clobbering the value. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
dfe49dbd |
|
11-Sep-2012 |
Paul Mackerras <paulus@samba.org> |
KVM: PPC: Book3S HV: Handle memory slot deletion and modification correctly This adds an implementation of kvm_arch_flush_shadow_memslot for Book3S HV, and arranges for kvmppc_core_commit_memory_region to flush the dirty log when modifying an existing slot. With this, we can handle deletion and modification of memory slots. kvm_arch_flush_shadow_memslot calls kvmppc_core_flush_memslot, which on Book3S HV now traverses the reverse map chains to remove any HPT (hashed page table) entries referring to pages in the memslot. This gets called by generic code whenever deleting a memslot or changing the guest physical address for a memslot. We flush the dirty log in kvmppc_core_commit_memory_region for consistency with what x86 does. We only need to flush when an existing memslot is being modified, because for a new memslot the rmap array (which stores the dirty bits) is all zero, meaning that every page is considered clean already, and when deleting a memslot we obviously don't care about the dirty bits any more. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
a66b48c3 |
|
11-Sep-2012 |
Paul Mackerras <paulus@samba.org> |
KVM: PPC: Move kvm->arch.slot_phys into memslot.arch Now that we have an architecture-specific field in the kvm_memory_slot structure, we can use it to store the array of page physical addresses that we need for Book3S HV KVM on PPC970 processors. This reduces the size of struct kvm_arch for Book3S HV, and also reduces the size of struct kvm_arch_memory_slot for other PPC KVM variants since the fields in it are now only compiled in for Book3S HV. This necessitates making the kvm_arch_create_memslot and kvm_arch_free_memslot operations specific to each PPC KVM variant. That in turn means that we now don't allocate the rmap arrays on Book3S PR and Book E. Since we now unpin pages and free the slot_phys array in kvmppc_core_free_memslot, we no longer need to do it in kvmppc_core_destroy_vm, since the generic code takes care to free all the memslots when destroying a VM. We now need the new memslot to be passed in to kvmppc_core_prepare_memory_region, since we need to initialize its arch.slot_phys member on Book3S HV. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
7c973a2e |
|
12-Aug-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Add return value to core_check_requests Requests may want to tell us that we need to go back into host state, so add a return value for the checks. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
7ee78855 |
|
12-Aug-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Add return value in prepare_to_enter Our prepare_to_enter helper wants to be able to return in more circumstances to the host than only when an interrupt is pending. Broaden the interface a bit and move even more generic code to the generic helper. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
3766a4c6 |
|
12-Aug-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Move kvm_guest_enter call into generic code We need to call kvm_guest_enter in booke and book3s, so move its call to generic code. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
bd2be683 |
|
12-Aug-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Book3S: PR: Rework irq disabling Today, we disable preemption while inside guest context, because we need to expose to the world that we are not in a preemptible context. However, during that time we already have interrupts disabled, which would indicate that we are in a non-preemptible context. The reason the checks for irqs_disabled() fail for us though is that we manually control hard IRQs and ignore all the lazy EE framework. Let's stop doing that. Instead, let's always use lazy EE to indicate when we want to disable IRQs, but do a special final switch that gets us into EE disabled, but soft enabled state. That way when we get back out of guest state, we are immediately ready to process interrupts. This simplifies the code drastically and reduces the time that we appear as preempt disabled. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
24afa37b |
|
11-Aug-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Consistentify vcpu exit path When getting out of __vcpu_run, let's be consistent about the state we return in. We want to always * have IRQs enabled * have called kvm_guest_exit before Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
0652eaae |
|
12-Aug-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Book3S: PR: Indicate we're out of guest mode When going out of guest mode, indicate that we are in vcpu->mode. That way requests from other CPUs don't needlessly need to kick us to process them, because it'll just happen next time we enter the guest. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
706fb730 |
|
12-Aug-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Exit guest context while handling exit The x86 implementation of KVM accounts for host time while processing guest exits. Do the same for us. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
c63ddcb4 |
|
12-Aug-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Book3S: PR: Only do resched check once per exit Now that we use our generic exit helper, we can safely drop our previous kvm_resched that we used to trigger at the beginning of the exit handler function. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
9b0cb3c8 |
|
10-Aug-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Book3s: PR: Add (dumb) MMU Notifier support Now that we have very simple MMU Notifier support for e500 in place, also add the same simple support to book3s. It gets us one step closer to actual fast support. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
03d25c5b |
|
09-Aug-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Use same kvmppc_prepare_to_enter code for booke and book3s_pr We need to do the same things when preparing to enter a guest for booke and book3s_pr cores. Fold the generic code into a generic function that both call. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
97c95059 |
|
02-Aug-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: PR: Use generic tracepoint for guest exit We want to have tracing information on guest exits for booke as well as book3s. Since most information is identical, use a common trace point. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
32cad84f |
|
03-Aug-2012 |
Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> |
KVM: do not release the error page After commit a2766325cf9f9, the error page is replaced by the error code, it need not be released anymore [ The patch has been compiling tested for powerpc ] Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
5b74716e |
|
26-Apr-2012 |
Benjamin Herrenschmidt <benh@kernel.crashing.org> |
kvm/powerpc: Add new ioctl to retreive server MMU infos This is necessary for qemu to be able to pass the right information to the guest, such as the supported page sizes and corresponding encodings in the SLB and hash table, which can vary depending on the processor type, the type of KVM used (PR vs HV) and the version of KVM Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [agraf: fix compilation on hv, adjust for newer ioctl numbers] Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
f31e65e1 |
|
15-Mar-2012 |
Benjamin Herrenschmidt <benh@kernel.crashing.org> |
kvm/book3s: Make kernel emulated H_PUT_TCE available for "PR" KVM There is nothing in the code for emulating TCE tables in the kernel that prevents it from working on "PR" KVM... other than ifdef's and location of the code. This and moves the bulk of the code there to a new file called book3s_64_vio.c. This speeds things up a bit on my G5. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [agraf: fix for hv kvm, 32bit, whitespace] Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
3b1d9d7d |
|
30-Apr-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Book3S: Enable IRQs during exit handling While handling an exit, we should listen for interrupts and make sure to receive them when they arrive, to keep our latencies low. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
bbcc9c06 |
|
13-Mar-2012 |
Benjamin Herrenschmidt <benh@kernel.crashing.org> |
powerpc/kvm: Fix magic page vs. 32-bit RTAS on ppc64 When the kernel calls into RTAS, it switches to 32-bit mode. The magic page was is longer accessible in that case, causing the patched instructions in the RTAS call wrapper to crash. This fixes it by making available a 32-bit mapping of the magic page in that case. This mapping is flushed whenever we switch the kernel back to 64-bit mode. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [agraf: add a check if the magic page is mapped] Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
966cd0f3 |
|
14-Mar-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Ignore unhalt request from kvm_vcpu_block When running kvm_vcpu_block and it realizes that the CPU is actually good to run, we get a request bit set for KVM_REQ_UNHALT. Right now, there's nothing we can do with that bit, so let's unset it right after the call again so we don't get confused in our later checks for pending work. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
4f225ae0 |
|
13-Mar-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Book3s: PR: Add HV traps so we can run in HV=1 mode on p7 When running PR KVM on a p7 system in bare metal, we get HV exits instead of normal supervisor traps. Semantically they are identical though and the HSRR vs SRR difference is already taken care of in the exit code. So all we need to do is handle them in addition to our normal exits. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
6020c0f6 |
|
11-Mar-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Pass EA to updating emulation ops When emulating updating load/store instructions (lwzu, stwu, ...) we need to write the effective address of the load/store into a register. Currently, we write the physical address in there, which is very wrong. So instead let's save off where the virtual fault was on MMIO and use that information as value to put into the register. While at it, also move the XOP variants of the above instructions to the new scheme of using the already known vaddr instead of calculating it themselves. Reported-by: Jörg Sommer <joerg@alea.gnuu.de> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
592f5d87 |
|
13-Mar-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Book3S: PR: Fix preemption We were leaking preemption counters. Fix the code to always toggle between preempt and non-preempt properly. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Paul Mackerras <paulus@samba.org>
|
#
b8e6f8ae |
|
13-Mar-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Book3S: Compile fix for ppc32 in HIOR access code We were failing to compile on book3s_32 with the following errors: arch/powerpc/kvm/book3s_pr.c:883:45: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast] arch/powerpc/kvm/book3s_pr.c:898:79: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast] Fix this by explicity casting the u64 to long before we use it as a pointer. Also, on PPC32 we can not use get_user/put_user for 64bit wide variables, as there is no single instruction that could load or store variables that big. So instead, we have to use copy_from/to_user which works everywhere. Reported-by: Jörg Sommer <joerg@alea.gnuu.de> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Paul Mackerras <paulus@samba.org>
|
#
95327d08 |
|
01-Apr-2012 |
Benjamin Herrenschmidt <benh@kernel.crashing.org> |
powerpc/kvm: Fallout from system.h disintegration Add a missing include to fix build Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
#
2480b208 |
|
25-Nov-2011 |
Cong Wang <amwang@redhat.com> |
powerpc: remove the second argument of k[un]map_atomic() Signed-off-by: Cong Wang <amwang@redhat.com>
|
#
31f3438e |
|
11-Dec-2011 |
Paul Mackerras <paulus@samba.org> |
KVM: PPC: Move kvm_vcpu_ioctl_[gs]et_one_reg down to platform-specific code This moves the get/set_one_reg implementation down from powerpc.c into booke.c, book3s_pr.c and book3s_hv.c. This avoids #ifdefs in C code, but more importantly, it fixes a bug on Book3s HV where we were accessing beyond the end of the kvm_vcpu struct (via the to_book3s() macro) and corrupting memory, causing random crashes and file corruption. On Book3s HV we only accept setting the HIOR to zero, since the guest runs in supervisor mode and its vectors are never offset from zero. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de> [agraf update to apply on top of changed ONE_REG patches] Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
1022fc3d |
|
14-Sep-2011 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Add support for explicit HIOR setting Until now, we always set HIOR based on the PVR, but this is just wrong. Instead, we should be setting HIOR explicitly, so user space can decide what the initial HIOR value is - just like on real hardware. We keep the old PVR based way around for backwards compatibility, but once user space uses the SET_ONE_REG based method, we drop the PVR logic. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
82ed3616 |
|
14-Dec-2011 |
Paul Mackerras <paulus@samba.org> |
KVM: PPC: Book3s HV: Implement get_dirty_log using hardware changed bit This changes the implementation of kvm_vm_ioctl_get_dirty_log() for Book3s HV guests to use the hardware C (changed) bits in the guest hashed page table. Since this makes the implementation quite different from the Book3s PR case, this moves the existing implementation from book3s.c to book3s_pr.c and creates a new implementation in book3s_hv.c. That implementation calls kvmppc_hv_get_dirty_log() to do the actual work by calling kvm_test_clear_dirty on each page. It iterates over the HPTEs, clearing the C bit if set, and returns 1 if any C bit was set (including the saved C bit in the rmap entry). Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
e371f713 |
|
19-Dec-2011 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Book3S: PR: Fix signal check race As Scott put it: > If we get a signal after the check, we want to be sure that we don't > receive the reschedule IPI until after we're in the guest, so that it > will cause another signal check. we need to have interrupts disabled from the point we do signal_check() all the way until we actually enter the guest. This patch fixes potential signal loss races. Reported-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
468a12c2 |
|
09-Dec-2011 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Use get/set for to_svcpu to help preemption When running the 64-bit Book3s PR code without CONFIG_PREEMPT_NONE, we were doing a few things wrong, most notably access to PACA fields without making sure that the pointers stay stable accross the access (preempt_disable()). This patch moves to_svcpu towards a get/put model which allows us to disable preemption while accessing the shadow vcpu fields in the PACA. That way we can run preemptible and everyone's happy! Reported-by: Jörg Sommer <joerg@alea.gnuu.de> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
d33ad328 |
|
09-Dec-2011 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Book3s: PR: No irq_disable in vcpu_run Somewhere during merges we ended up from local_irq_enable() foo(); local_irq_disable() to always keeping irqs enabled during that part. However, we now have the following code: foo(); local_irq_disable() which disables interrupts without the surrounding code enabling them again! So let's remove that disable and be happy. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
7d82714d |
|
09-Dec-2011 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Book3s: PR: Disable preemption in vcpu_run When entering the guest, we want to make sure we're not getting preempted away, so let's disable preemption on entry, but enable it again while handling guest exits. Reported-by: Jörg Sommer <joerg@alea.gnuu.de> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
25051b5a |
|
08-Nov-2011 |
Scott Wood <scottwood@freescale.com> |
KVM: PPC: Move prepare_to_enter call site into subarch code This function should be called with interrupts disabled, to avoid a race where an exception is delivered after we check, but the resched kick is received before we disable interrupts (and thus doesn't actually trigger the exit code that would recheck exceptions). booke already does this properly in the lightweight exit case, but not on initial entry. For now, move the call of prepare_to_enter into subarch-specific code so that booke can do the right thing here. Ideally book3s would do the same thing, but I'm having a hard time seeing where it does any interrupt disabling of this sort (plus it has several additional call sites), so I'm deferring the book3s fix to someone more familiar with that code. book3s behavior should be unchanged by this patch. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
7e28e60e |
|
08-Nov-2011 |
Scott Wood <scottwood@freescale.com> |
KVM: PPC: Rename deliver_interrupts to prepare_to_enter This function also updates paravirt int_pending, so rename it to be more obvious that this is a collection of checks run prior to (re)entering a guest. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
96f38d72 |
|
08-Nov-2011 |
Andreas Schwab <schwab@linux-m68k.org> |
KVM: PPC: protect use of kvmppc_h_pr kvmppc_h_pr is only available if CONFIG_KVM_BOOK3S_64_PR. Signed-off-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
bb75c627 |
|
17-Nov-2011 |
Alexander Graf <agraf@suse.de> |
Revert "KVM: PPC: Add support for explicit HIOR setting" This reverts commit a15bd354f083f20f257db450488db52ac27df439. It exceeded the padding on the SREGS struct, rendering the ABI backwards-incompatible. Conflicts: arch/powerpc/kvm/powerpc.c include/linux/kvm.h Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
93087948 |
|
29-Jul-2011 |
Paul Gortmaker <paul.gortmaker@windriver.com> |
powerpc: include export.h for files using EXPORT_SYMBOL/THIS_MODULE Fix failures in powerpc associated with the previously allowed implicit module.h presence that now lead to things like this: arch/powerpc/mm/mmu_context_hash32.c:76:1: error: type defaults to 'int' in declaration of 'EXPORT_SYMBOL_GPL' arch/powerpc/mm/tlb_hash32.c:48:1: error: type defaults to 'int' in declaration of 'EXPORT_SYMBOL' arch/powerpc/kernel/pci_32.c:51:1: error: type defaults to 'int' in declaration of 'EXPORT_SYMBOL_GPL' arch/powerpc/kernel/iomap.c:36:1: error: type defaults to 'int' in declaration of 'EXPORT_SYMBOL' arch/powerpc/platforms/44x/canyonlands.c:126:1: error: type defaults to 'int' in declaration of 'EXPORT_SYMBOL' arch/powerpc/kvm/44x.c:168:59: error: 'THIS_MODULE' undeclared (first use in this function) [with several contibutions from Stephen Rothwell <sfr@canb.auug.org.au>] Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
|
#
02143947 |
|
23-Jul-2011 |
Paul Mackerras <paulus@samba.org> |
KVM: PPC: book3s_pr: Simplify transitions between virtual and real mode This simplifies the way that the book3s_pr makes the transition to real mode when entering the guest. We now call kvmppc_entry_trampoline (renamed from kvmppc_rmcall) in the base kernel using a normal function call instead of doing an indirect call through a pointer in the vcpu. If kvm is a module, the module loader takes care of generating a trampoline as it does for other calls to functions outside the module. kvmppc_entry_trampoline then disables interrupts and jumps to kvmppc_handler_trampoline_enter in real mode using an rfi[d]. That then uses the link register as the address to return to (potentially in module space) when the guest exits. This also simplifies the way that we call the Linux interrupt handler when we exit the guest due to an external, decrementer or performance monitor interrupt. Instead of turning on the MMU, then deciding that we need to call the Linux handler and turning the MMU back off again, we now go straight to the handler at the point where we would turn the MMU on. The handler will then return to the virtual-mode code (potentially in the module). Along the way, this moves the setting and clearing of the HID5 DCBZ32 bit into real-mode interrupts-off code, and also makes sure that we clear the MSR[RI] bit before loading values into SRR0/1. The net result is that we no longer need any code addresses to be stored in vcpu->arch. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
af8f38b3 |
|
10-Aug-2011 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Add sanity checking to vcpu_run There are multiple features in PowerPC KVM that can now be enabled depending on the user's wishes. Some of the combinations don't make sense or don't work though. So this patch adds a way to check if the executing environment would actually be able to run the guest properly. It also adds sanity checks if PVR is set (should always be true given the current code flow), if PAPR is only used with book3s_64 where it works and that HV KVM is only used in PAPR mode. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
a668f2bd |
|
08-Aug-2011 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Support SC1 hypercalls for PAPR in PR mode PAPR defines hypercalls as SC1 instructions. Using these, the guest modifies page tables and does other privileged operations that it wouldn't be allowed to do in supervisor mode. This patch adds support for PR KVM to trap these instructions and route them through the same PAPR hypercall interface that we already use for HV style KVM. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
a15bd354 |
|
08-Aug-2011 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Add support for explicit HIOR setting Until now, we always set HIOR based on the PVR, but this is just wrong. Instead, we should be setting HIOR explicitly, so user space can decide what the initial HIOR value is - just like on real hardware. We keep the old PVR based way around for backwards compatibility, but once user space uses the SREGS based method, we drop the PVR logic. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
df6909e5 |
|
28-Jun-2011 |
Paul Mackerras <paulus@samba.org> |
KVM: PPC: Move guest enter/exit down into subarch-specific code Instead of doing the kvm_guest_enter/exit() and local_irq_dis/enable() calls in powerpc.c, this moves them down into the subarch-specific book3s_pr.c and booke.c. This eliminates an extra local_irq_enable() call in book3s_pr.c, and will be needed for when we do SMT4 guest support in the book3s hypervisor mode code. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
f9e0554d |
|
28-Jun-2011 |
Paul Mackerras <paulus@samba.org> |
KVM: PPC: Pass init/destroy vm and prepare/commit memory region ops down This arranges for the top-level arch/powerpc/kvm/powerpc.c file to pass down some of the calls it gets to the lower-level subarchitecture specific code. The lower-level implementations (in booke.c and book3s.c) are no-ops. The coming book3s_hv.c will need this. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
f05ed4d5 |
|
28-Jun-2011 |
Paul Mackerras <paulus@samba.org> |
KVM: PPC: Split out code from book3s.c into book3s_pr.c In preparation for adding code to enable KVM to use hypervisor mode on 64-bit Book 3S processors, this splits book3s.c into two files, book3s.c and book3s_pr.c, where book3s_pr.c contains the code that is specific to running the guest in problem state (user mode) and book3s.c contains code which should apply to all Book 3S processors. In doing this, we abstract some details, namely the interrupt offset, updating the interrupt pending flag, and detecting if the guest is in a critical section. These are all things that will be different when we use hypervisor mode. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
|