#
953e3739 |
|
07-Mar-2023 |
Paul Mackerras <paulus@ozlabs.org> |
KVM: PPC: Fetch prefixed instructions from the guest In order to handle emulation of prefixed instructions in the guest, this first makes vcpu->arch.last_inst be an unsigned long, i.e. 64 bits on 64-bit platforms. For prefixed instructions, the upper 32 bits are used for the prefix and the lower 32 bits for the suffix, and both halves are byte-swapped if the guest endianness differs from the host. Next, vcpu->arch.emul_inst is now 64 bits wide, to match the HEIR register on POWER10. Like HEIR, for a prefixed instruction it is defined to have the prefix is in the top 32 bits and the suffix in the bottom 32 bits, with both halves in the correct byte order. kvmppc_get_last_inst is extended on 64-bit machines to put the prefix and suffix in the right places in the ppc_inst_t being returned. kvmppc_load_last_inst now returns the instruction in an unsigned long in the same format as vcpu->arch.last_inst. It makes the decision about whether to fetch a suffix based on the SRR1_PREFIXED bit in the MSR image stored in the vcpu struct, which generally comes from SRR1 or HSRR1 on an interrupt. This bit is defined in Power ISA v3.1B to be set if the interrupt occurred due to a prefixed instruction and cleared otherwise for all interrupts except for instruction storage interrupt, which does not come to the hypervisor. It is set to zero for asynchronous interrupts such as external interrupts. In previous ISA versions it was always set to 0 for all interrupts except instruction storage interrupt. The code in book3s_hv_rmhandlers.S that loads the faulting instruction on a HDSI is only used on POWER8 and therefore doesn't ever need to load a suffix. [npiggin@gmail.com - check that the is-prefixed bit in SRR1 matches the type of instruction that was fetched.] Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Tested-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/ZAgsq9h1CCzouQuV@cleo
|
#
acf17878 |
|
07-Mar-2023 |
Paul Mackerras <paulus@ozlabs.org> |
KVM: PPC: Make kvmppc_get_last_inst() produce a ppc_inst_t This changes kvmppc_get_last_inst() so that the instruction it fetches is returned in a ppc_inst_t variable rather than a u32. This will allow us to return a 64-bit prefixed instruction on those 64-bit machines that implement Power ISA v3.1 or later, such as POWER10. On 32-bit platforms, ppc_inst_t is 32 bits wide, and is turned back into a u32 by ppc_inst_val, which is an identity operation on those platforms. Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Tested-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/ZAgsiPlL9O7KnlZZ@cleo
|
#
460ba21d |
|
30-Mar-2023 |
Nicholas Piggin <npiggin@gmail.com> |
KVM: PPC: Permit SRR1 flags in more injected interrupt types The prefix architecture in ISA v3.1 introduces a prefixed bit in SRR1 for many types of synchronous interrupts which is set when the interrupt is caused by a prefixed instruction. This requires KVM to be able to set this bit when injecting interrupts into a guest. Plumb through the SRR1 "flags" argument to the core_queue APIs where it's missing for this. For now they are set to 0, which is no change. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Fixup kvmppc_core_queue_alignment() in booke.c] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230330103224.3589928-2-npiggin@gmail.com
|
#
43d05c61 |
|
02-Apr-2023 |
Michael Ellerman <mpe@ellerman.id.au> |
KVM: PPC: BookE: Fix W=1 warnings Fix various W=1 warnings in booke.c: arch/powerpc/kvm/booke.c:1008:5: error: no previous prototype for ‘kvmppc_handle_exit’ [-Werror=missing-prototypes] 1008 | int kvmppc_handle_exit(struct kvm_vcpu *vcpu, unsigned int exit_nr) | ^~~~~~~~~~~~~~~~~~ arch/powerpc/kvm/booke.c:1009: warning: Function parameter or member 'vcpu' not described in 'kvmppc_handle_exit' arch/powerpc/kvm/booke.c:1009: warning: Function parameter or member 'exit_nr' not described in 'kvmppc_handle_exit' Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/r/202304020827.3LEZ86WB-lkp@intel.com/ Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230403045314.3095410-1-mpe@ellerman.id.au
|
#
e83ca8cf |
|
08-Mar-2023 |
Sean Christopherson <seanjc@google.com> |
KVM: PPC: booke: Mark three local functions "static" Tag a few functions that are local and don't have a previous prototype as "static". No functional change intended. Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/oe-kbuild-all/202303031630.ntvIuYqo-lkp@intel.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230308232437.500031-1-seanjc@google.com
|
#
7cb79f43 |
|
19-Jan-2023 |
Sean Christopherson <seanjc@google.com> |
KVM: PPC: Fix refactoring goof in kvmppc_e500mc_init() Fix a build error due to a mixup during a recent refactoring. The error was reported during code review, but the fixed up patch didn't make it into the final commit. Fixes: 474856bad921 ("KVM: PPC: Move processor compatibility check to module init") Link: https://lore.kernel.org/all/87cz93snqc.fsf@mpe.ellerman.id.au Cc: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20230119182158.4026656-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
fe6de81b |
|
28-Jan-2023 |
Sathvika Vasireddy <sv@linux.ibm.com> |
powerpc/kvm: Fix unannotated intra-function call warning objtool throws the following warning: arch/powerpc/kvm/booke.o: warning: objtool: kvmppc_fill_pt_regs+0x30: unannotated intra-function call Fix the warning by setting the value of 'nip' using the _THIS_IP_ macro, without using an assembly bl/mflr sequence to save the instruction pointer. Reported-by: kernel test robot <lkp@intel.com> Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sathvika Vasireddy <sv@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230128124158.1066251-1-sv@linux.ibm.com
|
#
6c645b01 |
|
27-Nov-2022 |
Nicholas Piggin <npiggin@gmail.com> |
KVM: PPC: Book3E: Fix CONFIG_TRACE_IRQFLAGS support 32-bit does not trace_irqs_off() to match the trace_irqs_on() call in kvmppc_fix_ee_before_entry(). This can lead to irqs being enabled twice in the trace, and the irqs-off region between guest exit and the host enabling local irqs again is not properly traced. 64-bit code does call this, but from asm code where volatiles are live and so incorrectly get clobbered. Move the irq reconcile into C to fix both problems. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221127124942.1665522-2-npiggin@gmail.com
|
#
c59fb127 |
|
20-Sep-2022 |
Paolo Bonzini <pbonzini@redhat.com> |
KVM: remove KVM_REQ_UNHALT KVM_REQ_UNHALT is now unnecessary because it is replaced by the return value of kvm_vcpu_block/kvm_vcpu_halt. Remove it. No functional change intended. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Acked-by: Marc Zyngier <maz@kernel.org> Message-Id: <20220921003201.1441511-13-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
91b99ea7 |
|
08-Oct-2021 |
Sean Christopherson <seanjc@google.com> |
KVM: Rename kvm_vcpu_block() => kvm_vcpu_halt() Rename kvm_vcpu_block() to kvm_vcpu_halt() in preparation for splitting the actual "block" sequences into a separate helper (to be named kvm_vcpu_block()). x86 will use the standalone block-only path to handle non-halt cases where the vCPU is not runnable. Rename block_ns to halt_ns to match the new function name. No functional change intended. Reviewed-by: David Matlack <dmatlack@google.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211009021236.4122790-14-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
eaaaed13 |
|
06-Dec-2021 |
Sean Christopherson <seanjc@google.com> |
KVM: PPC: Avoid referencing userspace memory region in memslot updates For PPC HV, get the number of pages directly from the new memslot instead of computing the same from the userspace memory region, and explicitly check for !DELETE instead of inferring the same when toggling mmio_update. The motivation for these changes is to avoid referencing the @mem param so that it can be dropped in a future commit. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Message-Id: <1e97fb5198be25f98ef82e63a8d770c682264cc9.1638817639.git.maciej.szmigiero@oracle.com>
|
#
537a17b3 |
|
06-Dec-2021 |
Sean Christopherson <seanjc@google.com> |
KVM: Let/force architectures to deal with arch specific memslot data Pass the "old" slot to kvm_arch_prepare_memory_region() and force arch code to handle propagating arch specific data from "new" to "old" when necessary. This is a baby step towards dynamically allocating "new" from the get go, and is a (very) minor performance boost on x86 due to not unnecessarily copying arch data. For PPC HV, copy the rmap in the !CREATE and !DELETE paths, i.e. for MOVE and FLAGS_ONLY. This is functionally a nop as the previous behavior would overwrite the pointer for CREATE, and eventually discard/ignore it for DELETE. For x86, copy the arch data only for FLAGS_ONLY changes. Unlike PPC HV, x86 needs to reallocate arch data in the MOVE case as the size of x86's allocations depend on the alignment of the memslot's gfn. Opportunistically tweak kvm_arch_prepare_memory_region()'s param order to match the "commit" prototype. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> [mss: add missing RISCV kvm_arch_prepare_memory_region() change] Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Message-Id: <67dea5f11bbcfd71e3da5986f11e87f5dd4013f9.1638817639.git.maciej.szmigiero@oracle.com>
|
#
235cee16 |
|
27-Oct-2021 |
Laurent Vivier <lvivier@redhat.com> |
KVM: PPC: Tick accounting should defer vtime accounting 'til after IRQ handling Commit 112665286d08 ("KVM: PPC: Book3S HV: Context tracking exit guest context before enabling irqs") moved guest_exit() into the interrupt protected area to avoid wrong context warning (or worse). The problem is that tick-based time accounting has not yet been updated at this point (because it depends on the timer interrupt firing), so the guest time gets incorrectly accounted to system time. To fix the problem, follow the x86 fix in commit 160457140187 ("Defer vtime accounting 'til after IRQ handling"), and allow host IRQs to run before accounting the guest exit time. In the case vtime accounting is enabled, this is not required because TB is used directly for accounting. Before this patch, with CONFIG_TICK_CPU_ACCOUNTING=y in the host and a guest running a kernel compile, the 'guest' fields of /proc/stat are stuck at zero. With the patch they can be observed increasing roughly as expected. Fixes: e233d54d4d97 ("KVM: booke: use __kvm_guest_exit") Fixes: 112665286d08 ("KVM: PPC: Book3S HV: Context tracking exit guest context before enabling irqs") Cc: stable@vger.kernel.org # 5.12+ Signed-off-by: Laurent Vivier <lvivier@redhat.com> [np: only required for tick accounting, add Book3E fix, tweak changelog] Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211027142150.3711582-1-npiggin@gmail.com
|
#
87bcc5fa |
|
02-Aug-2021 |
Jing Zhang <jingzhangos@google.com> |
KVM: stats: Add halt_wait_ns stats for all architectures Add simple stats halt_wait_ns to record the time a VCPU has spent on waiting for all architectures (not just powerpc). Signed-off-by: Jing Zhang <jingzhangos@google.com> Message-Id: <20210802165633.1866976-5-jingzhangos@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
f95937cc |
|
02-Aug-2021 |
Jing Zhang <jingzhangos@google.com> |
KVM: stats: Support linear and logarithmic histogram statistics Add new types of KVM stats, linear and logarithmic histogram. Histogram are very useful for observing the value distribution of time or size related stats. Signed-off-by: Jing Zhang <jingzhangos@google.com> Message-Id: <20210802165633.1866976-2-jingzhangos@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
bc9e9e67 |
|
23-Jun-2021 |
Jing Zhang <jingzhangos@google.com> |
KVM: debugfs: Reuse binary stats descriptors To remove code duplication, use the binary stats descriptors in the implementation of the debugfs interface for statistics. This unifies the definition of statistics for the binary and debugfs interfaces. Signed-off-by: Jing Zhang <jingzhangos@google.com> Message-Id: <20210618222709.1858088-8-jingzhangos@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
ce55c049 |
|
18-Jun-2021 |
Jing Zhang <jingzhangos@google.com> |
KVM: stats: Support binary stats retrieval for a VCPU Add a VCPU ioctl to get a statistics file descriptor by which a read functionality is provided for userspace to read out VCPU stats header, descriptors and data. Define VCPU statistics descriptors and header for all architectures. Reviewed-by: David Matlack <dmatlack@google.com> Reviewed-by: Ricardo Koller <ricarkol@google.com> Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> Reviewed-by: Fuad Tabba <tabba@google.com> Tested-by: Fuad Tabba <tabba@google.com> #arm64 Signed-off-by: Jing Zhang <jingzhangos@google.com> Message-Id: <20210618222709.1858088-5-jingzhangos@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
fcfe1bae |
|
18-Jun-2021 |
Jing Zhang <jingzhangos@google.com> |
KVM: stats: Support binary stats retrieval for a VM Add a VM ioctl to get a statistics file descriptor by which a read functionality is provided for userspace to read out VM stats header, descriptors and data. Define VM statistics descriptors and header for all architectures. Reviewed-by: David Matlack <dmatlack@google.com> Reviewed-by: Ricardo Koller <ricarkol@google.com> Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> Reviewed-by: Fuad Tabba <tabba@google.com> Tested-by: Fuad Tabba <tabba@google.com> #arm64 Signed-off-by: Jing Zhang <jingzhangos@google.com> Message-Id: <20210618222709.1858088-4-jingzhangos@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
0193cc90 |
|
18-Jun-2021 |
Jing Zhang <jingzhangos@google.com> |
KVM: stats: Separate generic stats from architecture specific ones Generic KVM stats are those collected in architecture independent code or those supported by all architectures; put all generic statistics in a separate structure. This ensures that they are defined the same way in the statistics API which is being added, removing duplication among different architectures in the declaration of the descriptors. No functional change intended. Reviewed-by: David Matlack <dmatlack@google.com> Reviewed-by: Ricardo Koller <ricarkol@google.com> Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> Signed-off-by: Jing Zhang <jingzhangos@google.com> Message-Id: <20210618222709.1858088-2-jingzhangos@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
3a96570f |
|
30-Jan-2021 |
Nicholas Piggin <npiggin@gmail.com> |
powerpc: convert interrupt handlers to use wrappers Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210130130852.2952424-29-npiggin@gmail.com
|
#
63e9f235 |
|
01-Feb-2021 |
Yang Li <yang.lee@linux.alibaba.com> |
KVM: PPC: remove unneeded semicolon Eliminate the following coccicheck warning: ./arch/powerpc/kvm/booke.c:701:2-3: Unneeded semicolon Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
a300bf8c |
|
07-Nov-2020 |
Kaixu Xia <kaixuxia@tencent.com> |
KVM: PPC: fix comparison to bool warning Fix the following coccicheck warning: ./arch/powerpc/kvm/booke.c:503:6-16: WARNING: Comparison to bool ./arch/powerpc/kvm/booke.c:505:6-17: WARNING: Comparison to bool ./arch/powerpc/kvm/booke.c:507:6-16: WARNING: Comparison to bool Reported-by: Tosk Robot <tencent_os_robot@tencent.com> Signed-off-by: Kaixu Xia <kaixuxia@tencent.com> Acked-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1604764178-8087-1-git-send-email-kaixuxia@tencent.com
|
#
4e1b2ab7 |
|
10-Sep-2020 |
Greg Kurz <groug@kaod.org> |
KVM: PPC: Don't return -ENOTSUPP to userspace in ioctls ENOTSUPP is a linux only thingy, the value of which is unknown to userspace, not to be confused with ENOTSUP which linux maps to EOPNOTSUPP, as permitted by POSIX [1]: [EOPNOTSUPP] Operation not supported on socket. The type of socket (address family or protocol) does not support the requested operation. A conforming implementation may assign the same values for [EOPNOTSUPP] and [ENOTSUP]. Return -EOPNOTSUPP instead of -ENOTSUPP for the following ioctls: - KVM_GET_FPU for Book3s and BookE - KVM_SET_FPU for Book3s and BookE - KVM_GET_DIRTY_LOG for BookE This doesn't affect QEMU which doesn't call the KVM_GET_FPU and KVM_SET_FPU ioctls on POWER anyway since they are not supported, and _buggily_ ignores anything but -EPERM for KVM_GET_DIRTY_LOG. [1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html Signed-off-by: Greg Kurz <groug@kaod.org> Acked-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
7ec21d9d |
|
23-Jun-2020 |
Tianjia Zhang <tianjia.zhang@linux.alibaba.com> |
KVM: PPC: Clean up redundant kvm_run parameters in assembly In the current kvm version, 'kvm_run' has been included in the 'kvm_vcpu' structure. For historical reasons, many kvm-related function parameters retain the 'kvm_run' and 'kvm_vcpu' parameters at the same time. This patch does a unified cleanup of these remaining redundant parameters. [paulus@ozlabs.org - Fixed places that were missed in book3s_interrupts.S] Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
8c99d345 |
|
26-Apr-2020 |
Tianjia Zhang <tianjia.zhang@linux.alibaba.com> |
KVM: PPC: Clean up redundant 'kvm_run' parameters In the current kvm version, 'kvm_run' has been included in the 'kvm_vcpu' structure. For historical reasons, many kvm-related function parameters retain the 'kvm_run' and 'kvm_vcpu' parameters at the same time. This patch does a unified cleanup of these remaining redundant parameters. Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
cb953129 |
|
08-May-2020 |
David Matlack <dmatlack@google.com> |
kvm: add halt-polling cpu usage stats Two new stats for exposing halt-polling cpu usage: halt_poll_success_ns halt_poll_fail_ns Thus sum of these 2 stats is the total cpu time spent polling. "success" means the VCPU polled until a virtual interrupt was delivered. "fail" means the VCPU had to schedule out (either because the maximum poll time was reached or it needed to yield the CPU). To avoid touching every arch's kvm_vcpu_stat struct, only update and export halt-polling cpu usage stats if we're on x86. Exporting cpu usage as a u64 and in nanoseconds means we will overflow at ~500 years, which seems reasonably large. Signed-off-by: David Matlack <dmatlack@google.com> Signed-off-by: Jon Cargille <jcargill@google.com> Reviewed-by: Jim Mattson <jmattson@google.com> Message-Id: <20200508182240.68440-1-jcargill@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
812756a8 |
|
14-Apr-2020 |
Emanuele Giuseppe Esposito <eesposit@redhat.com> |
kvm_host: unify VM_STAT and VCPU_STAT definitions in a single place The macros VM_STAT and VCPU_STAT are redundantly implemented in multiple files, each used by a different architecure to initialize the debugfs entries for statistics. Since they all have the same purpose, they can be unified in a single common definition in include/linux/kvm_host.h Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20200414155625.20559-1-eesposit@redhat.com> Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
6fef0c6b |
|
18-Mar-2020 |
Greg Kurz <groug@kaod.org> |
KVM: PPC: Kill kvmppc_ops::mmu_destroy() and kvmppc_mmu_destroy() These are only used by HV KVM and BookE, and in both cases they are nops. Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
8fc6ba0a |
|
10-Mar-2020 |
Joe Perches <joe@perches.com> |
KVM: PPC: Use fallthrough; Convert the various uses of fallthrough comments to fallthrough; Done via script Link: https://lore.kernel.org/lkml/b56602fcf79f849e733e7b521bb0e17895d390fa.1582230379.git.joe.com/ Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
0dff0846 |
|
18-Feb-2020 |
Sean Christopherson <seanjc@google.com> |
KVM: Provide common implementation for generic dirty log functions Move the implementations of KVM_GET_DIRTY_LOG and KVM_CLEAR_DIRTY_LOG for CONFIG_KVM_GENERIC_DIRTYLOG_READ_PROTECT into common KVM code. The arch specific implemenations are extremely similar, differing only in whether the dirty log needs to be sync'd from hardware (x86) and how the TLBs are flushed. Add new arch hooks to handle sync and TLB flush; the sync will also be used for non-generic dirty log support in a future patch (s390). The ulterior motive for providing a common implementation is to eliminate the dependency between arch and common code with respect to the memslot referenced by the dirty log, i.e. to make it obvious in the code that the validity of the memslot is guaranteed, as a future patch will rework memslot handling such that id_to_memslot() can return NULL. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
e96c81ee |
|
18-Feb-2020 |
Sean Christopherson <seanjc@google.com> |
KVM: Simplify kvm_free_memslot() and all its descendents Now that all callers of kvm_free_memslot() pass NULL for @dont, remove the param from the top-level routine and all arch's implementations. No functional change intended. Tested-by: Christoffer Dall <christoffer.dall@arm.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
82307e67 |
|
18-Feb-2020 |
Sean Christopherson <seanjc@google.com> |
KVM: PPC: Move memslot memory allocation into prepare_memory_region() Allocate the rmap array during kvm_arch_prepare_memory_region() to pave the way for removing kvm_arch_create_memslot() altogether. Moving PPC's memory allocation only changes the order of kernel memory allocations between PPC and common KVM code. No functional change intended. Acked-by: Paul Mackerras <paulus@ozlabs.org> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
afede96d |
|
18-Dec-2019 |
Sean Christopherson <seanjc@google.com> |
KVM: Drop kvm_arch_vcpu_setup() Remove kvm_arch_vcpu_setup() now that all arch specific implementations are nops. Acked-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
b3d42c98 |
|
18-Dec-2019 |
Sean Christopherson <seanjc@google.com> |
KVM: PPC: BookE: Setup vcpu during kvmppc_core_vcpu_create() Fold setup() into create() now that the two are called back-to-back by common KVM code. This paves the way for removing kvm_arch_vcpu_setup(). Note, BookE directly implements kvm_arch_vcpu_setup() and PPC's common kvm_arch_vcpu_create() is responsible for its own cleanup, thus the only cleanup required when directly invoking kvmppc_core_vcpu_setup() is to call .vcpu_free(), which is the BookE specific portion of PPC's kvm_arch_vcpu_destroy() by way of kvmppc_core_vcpu_free(). No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
ff030fdf |
|
18-Dec-2019 |
Sean Christopherson <seanjc@google.com> |
KVM: PPC: Move kvm_vcpu_init() invocation to common code Move the kvm_cpu_{un}init() calls to common PPC code as an intermediate step towards removing kvm_cpu_{un}init() altogether. No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
c50bfbdc |
|
18-Dec-2019 |
Sean Christopherson <seanjc@google.com> |
KVM: PPC: Allocate vcpu struct in common PPC code Move allocation of all flavors of PPC vCPUs to common PPC code. All variants either allocate 'struct kvm_vcpu' directly, or require that the embedded 'struct kvm_vcpu' member be located at offset 0, i.e. guarantee that the allocation can be directly interpreted as a 'struct kvm_vcpu' object. Remove the message from the build-time assertion regarding placement of the struct, as compatibility with the arch usercopy region is no longer the sole dependent on 'struct kvm_vcpu' being at offset zero. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
e1bd0a7e |
|
26-Nov-2019 |
Leonardo Bras <leobras.c@gmail.com> |
KVM: PPC: Book3E: Replace current->mm by kvm->mm Given that in kvm_create_vm() there is: kvm->mm = current->mm; And that on every kvm_*_ioctl we have: if (kvm->mm != current->mm) return -EIO; I see no reason to keep using current->mm instead of kvm->mm. By doing so, we would reduce the use of 'global' variables on code, relying more in the contents of kvm struct. Signed-off-by: Leonardo Bras <leonardo@linux.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
d94d71cb |
|
29-May-2019 |
Thomas Gleixner <tglx@linutronix.de> |
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 266 Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with this program if not write to the free software foundation 51 franklin street fifth floor boston ma 02110 1301 usa extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 67 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Richard Fontana <rfontana@redhat.com> Reviewed-by: Alexios Zavras <alexios.zavras@intel.com> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190529141333.953658117@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
f032b734 |
|
11-Dec-2018 |
Bharata B Rao <bharata@linux.ibm.com> |
KVM: PPC: Pass change type down to memslot commit function Currently, kvm_arch_commit_memory_region() gets called with a parameter indicating what type of change is being made to the memslot, but it doesn't pass it down to the platform-specific memslot commit functions. This adds the `change' parameter to the lower-level functions so that they can use it in future. [paulus@ozlabs.org - fix book E also.] Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
173c520a |
|
07-May-2018 |
Simon Guo <wei.guo.simon@gmail.com> |
KVM: PPC: Move nip/ctr/lr/xer registers to pt_regs in kvm_vcpu_arch This patch moves nip/ctr/lr/xer registers from scattered places in kvm_vcpu_arch to pt_regs structure. cr register is "unsigned long" in pt_regs and u32 in vcpu->arch. It will need more consideration and may move in later patches. Signed-off-by: Simon Guo <wei.guo.simon@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
b2d7ecbe |
|
26-Apr-2018 |
Laurentiu Tudor <laurentiu.tudor@nxp.com> |
powerpc/kvm/booke: Fix altivec related build break Add missing "altivec unavailable" interrupt injection helper thus fixing the linker error below: arch/powerpc/kvm/emulate_loadstore.o: In function `kvmppc_check_altivec_disabled': arch/powerpc/kvm/emulate_loadstore.c: undefined reference to `.kvmppc_core_queue_vec_unavail' Fixes: 09f984961c137c4b ("KVM: PPC: Book3S: Add MMIO emulation for VMX instructions") Signed-off-by: Laurentiu Tudor <laurentiu.tudor@nxp.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
66b56562 |
|
04-Dec-2017 |
Christoffer Dall <christoffer.dall@linaro.org> |
KVM: Move vcpu_load to arch-specific kvm_arch_vcpu_ioctl_set_guest_debug Move vcpu_load() and vcpu_put() into the architecture specific implementations of kvm_arch_vcpu_ioctl_set_guest_debug(). Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
1da5b61d |
|
04-Dec-2017 |
Christoffer Dall <christoffer.dall@linaro.org> |
KVM: Move vcpu_load to arch-specific kvm_arch_vcpu_ioctl_translate Move vcpu_load() and vcpu_put() into the architecture specific implementations of kvm_arch_vcpu_ioctl_translate(). Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
b4ef9d4e |
|
04-Dec-2017 |
Christoffer Dall <christoffer.dall@linaro.org> |
KVM: Move vcpu_load to arch-specific kvm_arch_vcpu_ioctl_set_sregs Move vcpu_load() and vcpu_put() into the architecture specific implementations of kvm_arch_vcpu_ioctl_set_sregs(). Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
bcdec41c |
|
04-Dec-2017 |
Christoffer Dall <christoffer.dall@linaro.org> |
KVM: Move vcpu_load to arch-specific kvm_arch_vcpu_ioctl_get_sregs Move vcpu_load() and vcpu_put() into the architecture specific implementations of kvm_arch_vcpu_ioctl_get_sregs(). Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
875656fe |
|
04-Dec-2017 |
Christoffer Dall <christoffer.dall@linaro.org> |
KVM: Move vcpu_load to arch-specific kvm_arch_vcpu_ioctl_set_regs Move vcpu_load() and vcpu_put() into the architecture specific implementations of kvm_arch_vcpu_ioctl_set_regs(). Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
1fc9b76b |
|
04-Dec-2017 |
Christoffer Dall <christoffer.dall@linaro.org> |
KVM: Move vcpu_load to arch-specific kvm_arch_vcpu_ioctl_get_regs Move vcpu_load() and vcpu_put() into the architecture specific implementations of kvm_arch_vcpu_ioctl_get_regs(). Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
86cb30ec |
|
17-Oct-2017 |
Kees Cook <keescook@chromium.org> |
treewide: setup_timer() -> timer_setup() (2 field) This converts all remaining setup_timer() calls that use a nested field to reach a struct timer_list. Coccinelle does not have an easy way to match multiple fields, so a new script is needed to change the matches of "&_E->_timer" into "&_E->_field1._timer" in all the rules. spatch --very-quiet --all-includes --include-headers \ -I ./arch/x86/include -I ./arch/x86/include/generated \ -I ./include -I ./arch/x86/include/uapi \ -I ./arch/x86/include/generated/uapi -I ./include/uapi \ -I ./include/generated/uapi --include ./include/linux/kconfig.h \ --dir . \ --cocci-file ~/src/data/timer_setup-2fields.cocci @fix_address_of depends@ expression e; @@ setup_timer( -&(e) +&e , ...) // Update any raw setup_timer() usages that have a NULL callback, but // would otherwise match change_timer_function_usage, since the latter // will update all function assignments done in the face of a NULL // function initialization in setup_timer(). @change_timer_function_usage_NULL@ expression _E; identifier _field1; identifier _timer; type _cast_data; @@ ( -setup_timer(&_E->_field1._timer, NULL, _E); +timer_setup(&_E->_field1._timer, NULL, 0); | -setup_timer(&_E->_field1._timer, NULL, (_cast_data)_E); +timer_setup(&_E->_field1._timer, NULL, 0); | -setup_timer(&_E._field1._timer, NULL, &_E); +timer_setup(&_E._field1._timer, NULL, 0); | -setup_timer(&_E._field1._timer, NULL, (_cast_data)&_E); +timer_setup(&_E._field1._timer, NULL, 0); ) @change_timer_function_usage@ expression _E; identifier _field1; identifier _timer; struct timer_list _stl; identifier _callback; type _cast_func, _cast_data; @@ ( -setup_timer(&_E->_field1._timer, _callback, _E); +timer_setup(&_E->_field1._timer, _callback, 0); | -setup_timer(&_E->_field1._timer, &_callback, _E); +timer_setup(&_E->_field1._timer, _callback, 0); | -setup_timer(&_E->_field1._timer, _callback, (_cast_data)_E); +timer_setup(&_E->_field1._timer, _callback, 0); | -setup_timer(&_E->_field1._timer, &_callback, (_cast_data)_E); +timer_setup(&_E->_field1._timer, _callback, 0); | -setup_timer(&_E->_field1._timer, (_cast_func)_callback, _E); +timer_setup(&_E->_field1._timer, _callback, 0); | -setup_timer(&_E->_field1._timer, (_cast_func)&_callback, _E); +timer_setup(&_E->_field1._timer, _callback, 0); | -setup_timer(&_E->_field1._timer, (_cast_func)_callback, (_cast_data)_E); +timer_setup(&_E->_field1._timer, _callback, 0); | -setup_timer(&_E->_field1._timer, (_cast_func)&_callback, (_cast_data)_E); +timer_setup(&_E->_field1._timer, _callback, 0); | -setup_timer(&_E._field1._timer, _callback, (_cast_data)_E); +timer_setup(&_E._field1._timer, _callback, 0); | -setup_timer(&_E._field1._timer, _callback, (_cast_data)&_E); +timer_setup(&_E._field1._timer, _callback, 0); | -setup_timer(&_E._field1._timer, &_callback, (_cast_data)_E); +timer_setup(&_E._field1._timer, _callback, 0); | -setup_timer(&_E._field1._timer, &_callback, (_cast_data)&_E); +timer_setup(&_E._field1._timer, _callback, 0); | -setup_timer(&_E._field1._timer, (_cast_func)_callback, (_cast_data)_E); +timer_setup(&_E._field1._timer, _callback, 0); | -setup_timer(&_E._field1._timer, (_cast_func)_callback, (_cast_data)&_E); +timer_setup(&_E._field1._timer, _callback, 0); | -setup_timer(&_E._field1._timer, (_cast_func)&_callback, (_cast_data)_E); +timer_setup(&_E._field1._timer, _callback, 0); | -setup_timer(&_E._field1._timer, (_cast_func)&_callback, (_cast_data)&_E); +timer_setup(&_E._field1._timer, _callback, 0); | _E->_field1._timer@_stl.function = _callback; | _E->_field1._timer@_stl.function = &_callback; | _E->_field1._timer@_stl.function = (_cast_func)_callback; | _E->_field1._timer@_stl.function = (_cast_func)&_callback; | _E._field1._timer@_stl.function = _callback; | _E._field1._timer@_stl.function = &_callback; | _E._field1._timer@_stl.function = (_cast_func)_callback; | _E._field1._timer@_stl.function = (_cast_func)&_callback; ) // callback(unsigned long arg) @change_callback_handle_cast depends on change_timer_function_usage@ identifier change_timer_function_usage._callback; identifier change_timer_function_usage._field1; identifier change_timer_function_usage._timer; type _origtype; identifier _origarg; type _handletype; identifier _handle; @@ void _callback( -_origtype _origarg +struct timer_list *t ) { ( ... when != _origarg _handletype *_handle = -(_handletype *)_origarg; +from_timer(_handle, t, _field1._timer); ... when != _origarg | ... when != _origarg _handletype *_handle = -(void *)_origarg; +from_timer(_handle, t, _field1._timer); ... when != _origarg | ... when != _origarg _handletype *_handle; ... when != _handle _handle = -(_handletype *)_origarg; +from_timer(_handle, t, _field1._timer); ... when != _origarg | ... when != _origarg _handletype *_handle; ... when != _handle _handle = -(void *)_origarg; +from_timer(_handle, t, _field1._timer); ... when != _origarg ) } // callback(unsigned long arg) without existing variable @change_callback_handle_cast_no_arg depends on change_timer_function_usage && !change_callback_handle_cast@ identifier change_timer_function_usage._callback; identifier change_timer_function_usage._field1; identifier change_timer_function_usage._timer; type _origtype; identifier _origarg; type _handletype; @@ void _callback( -_origtype _origarg +struct timer_list *t ) { + _handletype *_origarg = from_timer(_origarg, t, _field1._timer); + ... when != _origarg - (_handletype *)_origarg + _origarg ... when != _origarg } // Avoid already converted callbacks. @match_callback_converted depends on change_timer_function_usage && !change_callback_handle_cast && !change_callback_handle_cast_no_arg@ identifier change_timer_function_usage._callback; identifier t; @@ void _callback(struct timer_list *t) { ... } // callback(struct something *handle) @change_callback_handle_arg depends on change_timer_function_usage && !match_callback_converted && !change_callback_handle_cast && !change_callback_handle_cast_no_arg@ identifier change_timer_function_usage._callback; identifier change_timer_function_usage._field1; identifier change_timer_function_usage._timer; type _handletype; identifier _handle; @@ void _callback( -_handletype *_handle +struct timer_list *t ) { + _handletype *_handle = from_timer(_handle, t, _field1._timer); ... } // If change_callback_handle_arg ran on an empty function, remove // the added handler. @unchange_callback_handle_arg depends on change_timer_function_usage && change_callback_handle_arg@ identifier change_timer_function_usage._callback; identifier change_timer_function_usage._field1; identifier change_timer_function_usage._timer; type _handletype; identifier _handle; identifier t; @@ void _callback(struct timer_list *t) { - _handletype *_handle = from_timer(_handle, t, _field1._timer); } // We only want to refactor the setup_timer() data argument if we've found // the matching callback. This undoes changes in change_timer_function_usage. @unchange_timer_function_usage depends on change_timer_function_usage && !change_callback_handle_cast && !change_callback_handle_cast_no_arg && !change_callback_handle_arg@ expression change_timer_function_usage._E; identifier change_timer_function_usage._field1; identifier change_timer_function_usage._timer; identifier change_timer_function_usage._callback; type change_timer_function_usage._cast_data; @@ ( -timer_setup(&_E->_field1._timer, _callback, 0); +setup_timer(&_E->_field1._timer, _callback, (_cast_data)_E); | -timer_setup(&_E._field1._timer, _callback, 0); +setup_timer(&_E._field1._timer, _callback, (_cast_data)&_E); ) // If we fixed a callback from a .function assignment, fix the // assignment cast now. @change_timer_function_assignment depends on change_timer_function_usage && (change_callback_handle_cast || change_callback_handle_cast_no_arg || change_callback_handle_arg)@ expression change_timer_function_usage._E; identifier change_timer_function_usage._field1; identifier change_timer_function_usage._timer; identifier change_timer_function_usage._callback; type _cast_func; typedef TIMER_FUNC_TYPE; @@ ( _E->_field1._timer.function = -_callback +(TIMER_FUNC_TYPE)_callback ; | _E->_field1._timer.function = -&_callback +(TIMER_FUNC_TYPE)_callback ; | _E->_field1._timer.function = -(_cast_func)_callback; +(TIMER_FUNC_TYPE)_callback ; | _E->_field1._timer.function = -(_cast_func)&_callback +(TIMER_FUNC_TYPE)_callback ; | _E._field1._timer.function = -_callback +(TIMER_FUNC_TYPE)_callback ; | _E._field1._timer.function = -&_callback; +(TIMER_FUNC_TYPE)_callback ; | _E._field1._timer.function = -(_cast_func)_callback +(TIMER_FUNC_TYPE)_callback ; | _E._field1._timer.function = -(_cast_func)&_callback +(TIMER_FUNC_TYPE)_callback ; ) // Sometimes timer functions are called directly. Replace matched args. @change_timer_function_calls depends on change_timer_function_usage && (change_callback_handle_cast || change_callback_handle_cast_no_arg || change_callback_handle_arg)@ expression _E; identifier change_timer_function_usage._field1; identifier change_timer_function_usage._timer; identifier change_timer_function_usage._callback; type _cast_data; @@ _callback( ( -(_cast_data)_E +&_E->_field1._timer | -(_cast_data)&_E +&_E._field1._timer | -_E +&_E->_field1._timer ) ) // If a timer has been configured without a data argument, it can be // converted without regard to the callback argument, since it is unused. @match_timer_function_unused_data@ expression _E; identifier _field1; identifier _timer; identifier _callback; @@ ( -setup_timer(&_E->_field1._timer, _callback, 0); +timer_setup(&_E->_field1._timer, _callback, 0); | -setup_timer(&_E->_field1._timer, _callback, 0L); +timer_setup(&_E->_field1._timer, _callback, 0); | -setup_timer(&_E->_field1._timer, _callback, 0UL); +timer_setup(&_E->_field1._timer, _callback, 0); | -setup_timer(&_E._field1._timer, _callback, 0); +timer_setup(&_E._field1._timer, _callback, 0); | -setup_timer(&_E._field1._timer, _callback, 0L); +timer_setup(&_E._field1._timer, _callback, 0); | -setup_timer(&_E._field1._timer, _callback, 0UL); +timer_setup(&_E._field1._timer, _callback, 0); | -setup_timer(&_field1._timer, _callback, 0); +timer_setup(&_field1._timer, _callback, 0); | -setup_timer(&_field1._timer, _callback, 0L); +timer_setup(&_field1._timer, _callback, 0); | -setup_timer(&_field1._timer, _callback, 0UL); +timer_setup(&_field1._timer, _callback, 0); | -setup_timer(_field1._timer, _callback, 0); +timer_setup(_field1._timer, _callback, 0); | -setup_timer(_field1._timer, _callback, 0L); +timer_setup(_field1._timer, _callback, 0); | -setup_timer(_field1._timer, _callback, 0UL); +timer_setup(_field1._timer, _callback, 0); ) @change_callback_unused_data depends on match_timer_function_unused_data@ identifier match_timer_function_unused_data._callback; type _origtype; identifier _origarg; @@ void _callback( -_origtype _origarg +struct timer_list *unused ) { ... when != _origarg } Signed-off-by: Kees Cook <keescook@chromium.org>
|
#
2fa6e1e1 |
|
04-Jun-2017 |
Radim Krčmář <rkrcmar@redhat.com> |
KVM: add kvm_request_pending A first step in vcpu->requests encapsulation. Additionally, we now use READ_ONCE() when accessing vcpu->requests, which ensures we always load vcpu->requests when it's accessed. This is important as other threads can change it any time. Also, READ_ONCE() documents that vcpu->requests is used with other threads, likely requiring memory barriers, which it does. Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> [ Documented the new use of READ_ONCE() and converted another check in arch/mips/kvm/vz.c ] Signed-off-by: Andrew Jones <drjones@redhat.com> Acked-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Christoffer Dall <cdall@linaro.org>
|
#
72875d8a |
|
26-Apr-2017 |
Radim Krčmář <rkrcmar@redhat.com> |
KVM: add kvm_{test,clear}_request to replace {test,clear}_bit Users were expected to use kvm_check_request() for testing and clearing, but request have expanded their use since then and some users want to only test or do a faster clear. Make sure that requests are not directly accessed with bit operations. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
307d9279 |
|
22-Mar-2017 |
Paul Mackerras <paulus@ozlabs.org> |
KVM: PPC: Provide functions for queueing up FP/VEC/VSX unavailable interrupts This provides functions that can be used for generating interrupts indicating that a given functional unit (floating point, vector, or VSX) is unavailable. These functions will be used in instruction emulation code. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
7c0f6ba6 |
|
24-Dec-2016 |
Linus Torvalds <torvalds@linux-foundation.org> |
Replace <asm/uaccess.h> with <linux/uaccess.h> globally This was entirely automated, using the script by Al: PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>' sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \ $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h) to do the replacement at the end of the merge window. Requested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
ac0e89bb |
|
14-Jul-2016 |
Dan Carpenter <dan.carpenter@oracle.com> |
KVM: PPC: BookE: Fix a sanity check We use logical negate where bitwise negate was intended. It means that we never return -EINVAL here. Fixes: ce11e48b7fdd ('KVM: PPC: E500: Add userspace debug stub support') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Alexander Graf <agraf@suse.de> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
#
6edaa530 |
|
15-Jun-2016 |
Paolo Bonzini <pbonzini@redhat.com> |
KVM: remove kvm_guest_enter/exit wrappers Use the functions from context_tracking.h directly. Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Rik van Riel <riel@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
3491caf2 |
|
12-May-2016 |
Christian Borntraeger <borntraeger@de.ibm.com> |
KVM: halt_polling: provide a way to qualify wakeups during poll Some wakeups should not be considered a sucessful poll. For example on s390 I/O interrupts are usually floating, which means that _ALL_ CPUs would be considered runnable - letting all vCPUs poll all the time for transactional like workload, even if one vCPU would be enough. This can result in huge CPU usage for large guests. This patch lets architectures provide a way to qualify wakeups if they should be considered a good/bad wakeups in regard to polls. For s390 the implementation will fence of halt polling for anything but known good, single vCPU events. The s390 implementation for floating interrupts does a wakeup for one vCPU, but the interrupt will be delivered by whatever CPU checks first for a pending interrupt. We prefer the woken up CPU by marking the poll of this CPU as "good" poll. This code will also mark several other wakeup reasons like IPI or expired timers as "good". This will of course also mark some events as not sucessful. As KVM on z runs always as a 2nd level hypervisor, we prefer to not poll, unless we are really sure, though. This patch successfully limits the CPU usage for cases like uperf 1byte transactional ping pong workload or wakeup heavy workload like OLTP while still providing a proper speedup. This also introduced a new vcpu stat "halt_poll_no_tuning" that marks wakeups that are considered not good for polling. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Radim Krčmář <rkrcmar@redhat.com> (for an earlier version) Cc: David Matlack <dmatlack@google.com> Cc: Wanpeng Li <kernellwp@gmail.com> [Rename config symbol. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
446957ba |
|
24-Feb-2016 |
Adam Buchbinder <adam.buchbinder@gmail.com> |
powerpc: Fix misspellings in comments. Signed-off-by: Adam Buchbinder <adam.buchbinder@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
dc4fbba1 |
|
28-Oct-2015 |
Anton Blanchard <anton@samba.org> |
powerpc: Create disable_kernel_{fp,altivec,vsx,spe}() The enable_kernel_*() functions leave the relevant MSR bits enabled until we exit the kernel sometime later. Create disable versions that wrap the kernel use of FP, Altivec VSX or SPE. While we don't want to disable it normally for performance reasons (MSR writes are slow), it will be used for a debug boot option that does this and catches bad uses in other areas of the kernel. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
62bea5bf |
|
15-Sep-2015 |
Paolo Bonzini <pbonzini@redhat.com> |
KVM: add halt_attempted_poll to VCPU stats This new statistic can help diagnosing VCPUs that, for any reason, trigger bad behavior of halt_poll_ns autotuning. For example, say halt_poll_ns = 480000, and wakeups are spaced exactly like 479us, 481us, 479us, 481us. Then KVM always fails polling and wastes 10+20+40+80+160+320+480 = 1110 microseconds out of every 479+481+479+481+479+481+479 = 3359 microseconds. The VCPU then is consuming about 30% more CPU than it would use without polling. This would show as an abnormally high number of attempted polling compared to the successful polls. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com< Reviewed-by: David Matlack <dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
845ac985 |
|
18-May-2015 |
Tudor Laurentiu <b10716@freescale.com> |
KVM: PPC: add missing pt_regs initialization On this switch branch the regs initialization doesn't happen so add it. This was found with the help of a static code analysis tool. Signed-off-by: Laurentiu Tudor <Laurentiu.Tudor@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
f36f3f28 |
|
18-May-2015 |
Paolo Bonzini <pbonzini@redhat.com> |
KVM: add "new" argument to kvm_arch_commit_memory_region This lets the function access the new memory slot without going through kvm_memslots and id_to_memslot. It will simplify the code when more than one address space will be supported. Unfortunately, the "const"ness of the new argument must be casted away in two places. Fixing KVM to accept const struct kvm_memory_slot pointers would require modifications in pretty much all architectures, and is left for later. Reviewed-by: Radim Krcmar <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
09170a49 |
|
18-May-2015 |
Paolo Bonzini <pbonzini@redhat.com> |
KVM: const-ify uses of struct kvm_userspace_memory_region Architecture-specific helpers are not supposed to muck with struct kvm_userspace_memory_region contents. Add const to enforce this. In order to eliminate the only write in __kvm_set_memory_region, the cleaning of deleted slots is pulled up from update_memslots to __kvm_set_memory_region. Reviewed-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp> Reviewed-by: Radim Krcmar <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
e233d54d |
|
30-Apr-2015 |
Paolo Bonzini <pbonzini@redhat.com> |
KVM: booke: use __kvm_guest_exit Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
f7819512 |
|
04-Feb-2015 |
Paolo Bonzini <pbonzini@redhat.com> |
kvm: add halt_poll_ns module parameter This patch introduces a new module parameter for the KVM module; when it is present, KVM attempts a bit of polling on every HLT before scheduling itself out via kvm_vcpu_block. This parameter helps a lot for latency-bound workloads---in particular I tested it with O_DSYNC writes with a battery-backed disk in the host. In this case, writes are fast (because the data doesn't have to go all the way to the platters) but they cannot be merged by either the host or the guest. KVM's performance here is usually around 30% of bare metal, or 50% if you use cache=directsync or cache=writethrough (these parameters avoid that the guest sends pointless flush requests, and at the same time they are not slow because of the battery-backed cache). The bad performance happens because on every halt the host CPU decides to halt itself too. When the interrupt comes, the vCPU thread is then migrated to a new physical CPU, and in general the latency is horrible because the vCPU thread has to be scheduled back in. With this patch performance reaches 60-65% of bare metal and, more important, 99% of what you get if you use idle=poll in the guest. This means that the tunable gets rid of this particular bottleneck, and more work can be done to improve performance in the kernel or QEMU. Of course there is some price to pay; every time an otherwise idle vCPUs is interrupted by an interrupt, it will poll unnecessarily and thus impose a little load on the host. The above results were obtained with a mostly random value of the parameter (500000), and the load was around 1.5-2.5% CPU usage on one of the host's core for each idle guest vCPU. The patch also adds a new stat, /sys/kernel/debug/kvm/halt_successful_poll, that can be used to tune the parameter. It counts how many HLT instructions received an interrupt during the polling period; each successful poll avoids that Linux schedules the VCPU thread out and back in, and may also avoid a likely trip to C1 and back for the physical CPU. While the VM is idle, a Linux 4 VCPU VM halts around 10 times per second. Of these halts, almost all are failed polls. During the benchmark, instead, basically all halts end within the polling period, except a more or less constant stream of 50 per second coming from vCPUs that are not running the benchmark. The wasted time is thus very low. Things may be slightly different for Windows VMs, which have a ~10 ms timer tick. The effect is also visible on Marcelo's recently-introduced latency test for the TSC deadline timer. Though of course a non-RT kernel has awful latency bounds, the latency of the timer is around 8000-10000 clock cycles compared to 20000-120000 without setting halt_poll_ns. For the TSC deadline timer, thus, the effect is both a smaller average latency and a smaller variance. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
8d0eff63 |
|
10-Sep-2014 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Pass enum to kvmppc_get_last_inst The kvmppc_get_last_inst function recently received a facelift that allowed us to pass an enum of the type of instruction we want to read into it rather than an unreadable boolean. Unfortunately, not all callers ended up passing the enum. This wasn't really an issue as "true" and "false" happen to match the two enum values we have, but it's still hard to read. Update all callers of kvmppc_get_last_inst() to follow the new calling convention. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
033aaa14 |
|
09-Sep-2014 |
Madhavan Srinivasan <maddy@linux.vnet.ibm.com> |
powerpc/kvm: common sw breakpoint instr across ppc This patch extends the use of illegal instruction as software breakpoint instruction across the ppc platform. Patch extends booke program interrupt code to support software breakpoint. Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> [agraf: Fix bookehv] Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
d02d4d15 |
|
01-Sep-2014 |
Mihai Caraman <mihai.caraman@freescale.com> |
KVM: PPC: Remove the tasklet used by the hrtimer Powerpc timer implementation is a copycat version of s390. Now that they removed the tasklet with commit ea74c0ea1b24a6978a6ebc80ba4dbc7b7848b32d follow this optimization. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Bogdan Purcareata <bogdan.purcareata@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
2f699a59 |
|
13-Aug-2014 |
Bharat Bhushan <Bharat.Bhushan@freescale.com> |
KVM: PPC: BOOKE: Emulate debug registers and exception This patch emulates debug registers and debug exception to support guest using debug resource. This enables running gdb/kgdb etc in guest. On BOOKE architecture we cannot share debug resources between QEMU and guest because: When QEMU is using debug resources then debug exception must be always enabled. To achieve this we set MSR_DE and also set MSRP_DEP so guest cannot change MSR_DE. When emulating debug resource for guest we want guest to control MSR_DE (enable/disable debug interrupt on need). So above mentioned two configuration cannot be supported at the same time. So the result is that we cannot share debug resources between QEMU and Guest on BOOKE architecture. In the current design QEMU gets priority over guest, this means that if QEMU is using debug resources then guest cannot use them and if guest is using debug resource then QEMU can overwrite them. Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
8a41ea53 |
|
20-Aug-2014 |
Mihai Caraman <mihai.caraman@freescale.com> |
KVM: PPC: Make ONE_REG powerpc generic Make ONE_REG generic for server and embedded architectures by moving kvm_vcpu_ioctl_get_one_reg() and kvm_vcpu_ioctl_set_one_reg() functions to powerpc layer. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
95d80a29 |
|
20-Aug-2014 |
Mihai Caraman <mihai.caraman@freescale.com> |
KVM: PPC: Book3e: Add AltiVec support Add AltiVec support in KVM for Book3e. FPU support gracefully reuse host infrastructure so follow the same approach for AltiVec. Book3e specification defines shared interrupt numbers for SPE and AltiVec units. Still SPE is present in e200/e500v2 cores while AltiVec is present in e6500 core. So we can currently decide at compile-time which of the SPE or AltiVec units to support exclusively by using CONFIG_SPE_POSSIBLE and CONFIG_PPC_E500MC defines. As Alexander Graf suggested, keep SPE and AltiVec exception handlers distinct to improve code readability. Guests have the privilege to enable AltiVec, so we always need to support AltiVec in KVM and implicitly in host to reflect interrupts and to save/restore the unit context. KVM will be loaded on cores with AltiVec unit only if CONFIG_ALTIVEC is defined. Use this define to guard KVM AltiVec logic. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
3efc7da6 |
|
20-Aug-2014 |
Mihai Caraman <mihai.caraman@freescale.com> |
KVM: PPC: Book3E: Increase FPU laziness Increase FPU laziness by loading the guest state into the unit before entering the guest instead of doing it on each vcpu schedule. Without this improvement an interrupt may claim floating point corrupting guest state. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
2c509672 |
|
05-Aug-2014 |
Bharat Bhushan <Bharat.Bhushan@freescale.com> |
KVM: PPC: BOOKE: Add one reg interface for DBSR Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
348ba710 |
|
05-Aug-2014 |
Bharat Bhushan <Bharat.Bhushan@freescale.com> |
KVM: PPC: BOOKE: Guest and hardware visible debug registers are same Guest visible debug register and hardware visible debug registers are same, so ther is no need to have arch->shadow_dbg_reg, instead use arch->dbg_reg. Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
2190991e |
|
05-Aug-2014 |
Bharat Bhushan <Bharat.Bhushan@freescale.com> |
KVM: PPC: BOOKE: Clear guest dbsr in userspace exit KVM_EXIT_DEBUG Dbsr is not visible to userspace and we do not think any need to expose this to userspace because: Userspace cannot inject debug interrupt to guest (as this does not know guest ability to handle debug interrupt), so userspace will always clear DBSR. Now if userspace has to always clear DBSR in KVM_EXIT_DEBUG handling then clearing dbsr in kernel looks simple as this avoid doing SET_SREGS/set_one_reg() to clear DBSR Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
9fee7563 |
|
05-Aug-2014 |
Bharat Bhushan <Bharat.Bhushan@freescale.com> |
KVM: PPC: BOOKE: allow debug interrupt at "debug level" Debug interrupt can be either "critical level" or "debug level". There are separate set of save/restore registers used for different level. Example: DSRR0/DSRR1 are used for "debug level" and CSRR0/CSRR1 are used for critical level debug interrupt. Using CPU_FTR_DEBUG_LVL_EXC to decide which interrupt level to be used. Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
ce91ddc4 |
|
28-Jul-2014 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Remove DCR handling DCR handling was only needed for 440 KVM. Since we removed it, we can also remove handling of DCR accesses. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
8de12015 |
|
18-Jun-2014 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Expose helper functions for data/inst faults We're going to implement guest code interpretation in KVM for some rare corner cases. This code needs to be able to inject data and instruction faults into the guest when it encounters them. Expose generic APIs to do this in a reasonably subarch agnostic fashion. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
7d15c06f |
|
20-Jun-2014 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Implement kvmppc_xlate for all targets We have a nice API to find the translated GPAs of a GVA including protection flags. So far we only use it on Book3S, but there's no reason the same shouldn't be used on BookE as well. Implement a kvmppc_xlate() version for BookE and clean it up to make it more readable in general. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
f5250471 |
|
23-Jul-2014 |
Mihai Caraman <mihai.caraman@freescale.com> |
KVM: PPC: Bookehv: Get vcpu's last instruction for emulation On book3e, KVM uses load external pid (lwepx) dedicated instruction to read guest last instruction on the exit path. lwepx exceptions (DTLB_MISS, DSI and LRAT), generated by loading a guest address, needs to be handled by KVM. These exceptions are generated in a substituted guest translation context (EPLC[EGS] = 1) from host context (MSR[GS] = 0). Currently, KVM hooks only interrupts generated from guest context (MSR[GS] = 1), doing minimal checks on the fast path to avoid host performance degradation. lwepx exceptions originate from host state (MSR[GS] = 0) which implies additional checks in DO_KVM macro (beside the current MSR[GS] = 1) by looking at the Exception Syndrome Register (ESR[EPID]) and the External PID Load Context Register (EPLC[EGS]). Doing this on each Data TLB miss exception is obvious too intrusive for the host. Read guest last instruction from kvmppc_load_last_inst() by searching for the physical address and kmap it. This address the TODO for TLB eviction and execute-but-not-read entries, and allow us to get rid of lwepx until we are able to handle failures. A simple stress benchmark shows a 1% sys performance degradation compared with previous approach (lwepx without failure handling): time for i in `seq 1 10000`; do /bin/echo > /dev/null; done real 0m 8.85s user 0m 4.34s sys 0m 4.48s vs real 0m 8.84s user 0m 4.36s sys 0m 4.44s A solution to use lwepx and to handle its exceptions in KVM would be to temporary highjack the interrupt vector from host. This imposes additional synchronizations for cores like FSL e6500 that shares host IVOR registers between hardware threads. This optimized solution can be later developed on top of this patch. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
51f04726 |
|
23-Jul-2014 |
Mihai Caraman <mihai.caraman@freescale.com> |
KVM: PPC: Allow kvmppc_get_last_inst() to fail On book3e, guest last instruction is read on the exit path using load external pid (lwepx) dedicated instruction. This load operation may fail due to TLB eviction and execute-but-not-read entries. This patch lay down the path for an alternative solution to read the guest last instruction, by allowing kvmppc_get_lat_inst() function to fail. Architecture specific implmentations of kvmppc_load_last_inst() may read last guest instruction and instruct the emulation layer to re-execute the guest in case of failure. Make kvmppc_get_last_inst() definition common between architectures. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
34f754b9 |
|
17-Jul-2014 |
Bharat Bhushan <Bharat.Bhushan@freescale.com> |
kvm: ppc: Add SPRN_EPR get helper function kvmppc_set_epr() is already defined in asm/kvm_ppc.h, So rename and move get_epr helper function to same file. Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> [agraf: remove duplicate return] Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
c1b8a01b |
|
17-Jul-2014 |
Bharat Bhushan <Bharat.Bhushan@freescale.com> |
kvm: ppc: booke: Use the shared struct helpers for SPRN_SPRG0-7 Use kvmppc_set_sprg[0-7]() and kvmppc_get_sprg[0-7]() helper functions Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
dc168549 |
|
17-Jul-2014 |
Bharat Bhushan <Bharat.Bhushan@freescale.com> |
kvm: ppc: booke: Add shared struct helpers of SPRN_ESR Add and use kvmppc_set_esr() and kvmppc_get_esr() helper functions Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
a5414d4b |
|
17-Jul-2014 |
Bharat Bhushan <Bharat.Bhushan@freescale.com> |
kvm: ppc: booke: Use the shared struct helpers of SPRN_DEAR Uses kvmppc_set_dar() and kvmppc_get_dar() helper functions Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
31579eea |
|
17-Jul-2014 |
Bharat Bhushan <Bharat.Bhushan@freescale.com> |
kvm: ppc: booke: Use the shared struct helpers of SRR0 and SRR1 Use kvmppc_set_srr0/srr1() and kvmppc_get_srr0/srr1() helper functions Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
6c85f52b |
|
09-Jan-2014 |
Scott Wood <scottwood@freescale.com> |
kvm/ppc: IRQ disabling cleanup Simplify the handling of lazy EE by going directly from fully-enabled to hard-disabled. This replaces the lazy_irq_pending() check (including its misplaced kvm_guest_exit() call). As suggested by Tiejun Chen, move the interrupt disabling into kvmppc_prepare_to_enter() rather than have each caller do it. Also move the IRQ enabling on heavyweight exit into kvmppc_prepare_to_enter(). Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
9bd880a2 |
|
22-Oct-2013 |
Tiejun Chen <tiejun.chen@windriver.com> |
KVM: PPC: Book3E HV: call RECONCILE_IRQ_STATE to sync the software state Rather than calling hard_irq_disable() when we're back in C code we can just call RECONCILE_IRQ_STATE to soft disable IRQs while we're already in hard disabled state. This should be functionally equivalent to the code before, but cleaner and faster. Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com> [agraf: fix comment, commit message] Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
08c9a188 |
|
17-Nov-2013 |
Bharat Bhushan <r65777@freescale.com> |
kvm: powerpc: use caching attributes as per linux pte KVM uses same WIM tlb attributes as the corresponding qemu pte. For this we now search the linux pte for the requested page and get these cache caching/coherency attributes from pte. Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Reviewed-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
99dae3ba |
|
15-Oct-2013 |
Paul Mackerras <paulus@samba.org> |
KVM: PPC: Load/save FP/VMX/VSX state directly to/from vcpu struct Now that we have the vcpu floating-point and vector state stored in the same type of struct as the main kernel uses, we can load that state directly from the vcpu struct instead of having extra copies to/from the thread_struct. Similarly, when the guest state needs to be saved, we can have it saved it directly to the vcpu struct by setting the current->thread.fp_save_area and current->thread.vr_save_area pointers. That also means that we don't need to back up and restore userspace's FP/vector state. This all makes the code simpler and faster. Note that it's not necessary to save or modify current->thread.fpexc_mode, since nothing in KVM uses or is affected by its value. Nor is it necessary to touch used_vr or used_vsr. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
efff1912 |
|
15-Oct-2013 |
Paul Mackerras <paulus@samba.org> |
KVM: PPC: Store FP/VSX/VMX state in thread_fp/vr_state structures This uses struct thread_fp_state and struct thread_vr_state to store the floating-point, VMX/Altivec and VSX state, rather than flat arrays. This makes transferring the state to/from the thread_struct simpler and allows us to unify the get/set_one_reg implementations for the VSX registers. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
f5f97210 |
|
22-Nov-2013 |
Scott Wood <scottwood@freescale.com> |
powerpc/kvm/booke: Fix build break due to stack frame size warning Commit ce11e48b7fdd256ec68b932a89b397a790566031 ("KVM: PPC: E500: Add userspace debug stub support") added "struct thread_struct" to the stack of kvmppc_vcpu_run(). thread_struct is 1152 bytes on my build, compared to 48 bytes for the recently-introduced "struct debug_reg". Use the latter instead. This fixes the following error: cc1: warnings being treated as errors arch/powerpc/kvm/booke.c: In function 'kvmppc_vcpu_run': arch/powerpc/kvm/booke.c:760:1: error: the frame size of 1424 bytes is larger than 1024 bytes make[2]: *** [arch/powerpc/kvm/booke.o] Error 1 make[1]: *** [arch/powerpc/kvm] Error 2 make[1]: *** Waiting for unfinished jobs.... Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
cbbc58d4 |
|
07-Oct-2013 |
Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> |
kvm: powerpc: book3s: Allow the HV and PR selection per virtual machine This moves the kvmppc_ops callbacks to be a per VM entity. This enables us to select HV and PR mode when creating a VM. We also allow both kvm-hv and kvm-pr kernel module to be loaded. To achieve this we move /dev/kvm ownership to kvm.ko module. Depending on which KVM mode we select during VM creation we take a reference count on respective module Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> [agraf: fix coding style] Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
5587027c |
|
07-Oct-2013 |
Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> |
kvm: Add struct kvm arg to memslot APIs We will use that in the later patch to find the kvm ops handler Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
dba291f2 |
|
07-Oct-2013 |
Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> |
kvm: powerpc: booke: Move booke related tracepoints to separate header Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
3a167bea |
|
07-Oct-2013 |
Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> |
kvm: powerpc: Add kvmppc_ops callback This patch add a new callback kvmppc_ops. This will help us in enabling both HV and PR KVM together in the same kernel. The actual change to enable them together is done in the later patch in the series. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> [agraf: squash in booke changes] Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
ce11e48b |
|
03-Jul-2013 |
Bharat Bhushan <r65777@freescale.com> |
KVM: PPC: E500: Add userspace debug stub support This patch adds the debug stub support on booke/bookehv. Now QEMU debug stub can use hw breakpoint, watchpoint and software breakpoint to debug guest. This is how we save/restore debug register context when switching between guest, userspace and kernel user-process: When QEMU is running -> thread->debug_reg == QEMU debug register context. -> Kernel will handle switching the debug register on context switch. -> no vcpu_load() called QEMU makes ioctls (except RUN) -> This will call vcpu_load() -> should not change context. -> Some ioctls can change vcpu debug register, context saved in vcpu->debug_regs QEMU Makes RUN ioctl -> Save thread->debug_reg on STACK -> Store thread->debug_reg == vcpu->debug_reg -> load thread->debug_reg -> RUN VCPU ( So thread points to vcpu context ) Context switch happens When VCPU running -> makes vcpu_load() should not load any context -> kernel loads the vcpu context as thread->debug_regs points to vcpu context. On heavyweight_exit -> Load the context saved on stack in thread->debug_reg Currently we do not support debug resource emulation to guest, On debug exception, always exit to user space irrespective of user space is expecting the debug exception or not. If this is unexpected exception (breakpoint/watchpoint event not set by userspace) then let us leave the action on user space. This is similar to what it was before, only thing is that now we have proper exit state available to user space. Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
547465ef |
|
03-Jul-2013 |
Bharat Bhushan <r65777@freescale.com> |
KVM: PPC: E500: Using "struct debug_reg" For KVM also use the "struct debug_reg" defined in asm/processor.h Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
b12c7841 |
|
03-Jul-2013 |
Bharat Bhushan <r65777@freescale.com> |
KVM: PPC: E500: exit to user space on "ehpriv 1" instruction "ehpriv 1" instruction is used for setting software breakpoints by user space. This patch adds support to exit to user space with "run->debug" have relevant information. As this is the first point we are using run->debug, also defined the run->debug structure. Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
8b75cbbe |
|
19-Sep-2013 |
Paul Mackerras <paulus@samba.org> |
KVM: PPC: BookE: Add GET/SET_ONE_REG interface for VRSAVE This makes the VRSAVE register value for a vcpu accessible through the GET/SET_ONE_REG interface on Book E systems (in addition to the existing GET/SET_SREGS interface), for consistency with Book 3S. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
de79f7b9 |
|
10-Sep-2013 |
Paul Mackerras <paulus@samba.org> |
powerpc: Put FP/VSX and VR state into structures This creates new 'thread_fp_state' and 'thread_vr_state' structures to store FP/VSX state (including FPSCR) and Altivec/VSX state (including VSCR), and uses them in the thread_struct. In the thread_fp_state, the FPRs and VSRs are represented as u64 rather than double, since we rarely perform floating-point computations on the values, and this will enable the structures to be used in KVM code as well. Similarly FPSCR is now a u64 rather than a structure of two 32-bit values. This takes the offsets out of the macros such as SAVE_32FPRS, REST_32FPRS, etc. This enables the same macros to be used for normal and transactional state, enabling us to delete the transactional versions of the macros. This also removes the unused do_load_up_fpu and do_load_up_altivec, which were in fact buggy since they didn't create large enough stack frames to account for the fact that load_up_fpu and load_up_altivec are not designed to be called from C and assume that their caller's stack frame is an interrupt frame. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
#
c09ec198 |
|
10-Jul-2013 |
Scott Wood <scottwood@freescale.com> |
kvm/ppc/booke: Don't call kvm_guest_enter twice kvm_guest_enter() was already called by kvmppc_prepare_to_enter(). Don't call it again. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
5f1c248f |
|
10-Jul-2013 |
Scott Wood <scottwood@freescale.com> |
kvm/ppc: Call trace_hardirqs_on before entry Currently this is only being done on 64-bit. Rather than just move it out of the 64-bit ifdef, move it to kvm_lazy_ee_enable() so that it is consistent with lazy ee state, and so that we don't track more host code as interrupts-enabled than necessary. Rename kvm_lazy_ee_enable() to kvm_fix_ee_before_entry() to reflect that this function now has a role on 32-bit as well. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
5f17ce8b |
|
12-May-2013 |
Tiejun Chen <tiejun.chen@windriver.com> |
KVM: PPC: Guard doorbell exception with CONFIG_PPC_DOORBELL Availablity of the doorbell_exception function is guarded by CONFIG_PPC_DOORBELL. Use the same define to guard our caller of it. Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com> [agraf: improve patch description] Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
f8941fbe |
|
11-Jun-2013 |
Scott Wood <scottwood@freescale.com> |
kvm/ppc/booke: Delay kvmppc_lazy_ee_enable kwmppc_lazy_ee_enable() should be called as late as possible, or else we get things like WARN_ON(preemptible()) in enable_kernel_fp() in configurations where preemptible() works. Note that book3s_pr already waits until just before __kvmppc_vcpu_run to call kvmppc_lazy_ee_enable(). Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
7c11c0cc |
|
06-Jun-2013 |
Scott Wood <scottwood@freescale.com> |
kvm/ppc/booke64: Fix lazy ee handling in kvmppc_handle_exit() EE is hard-disabled on entry to kvmppc_handle_exit(), so call hard_irq_disable() so that PACA_IRQ_HARD_DIS is set, and soft_enabled is unset. Without this, we get warnings such as arch/powerpc/kernel/time.c:300, and sometimes host kernel hangs. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
|
#
f1e89028 |
|
06-Jun-2013 |
Scott Wood <scottwood@freescale.com> |
kvm/ppc/booke: Hold srcu lock when calling gfn functions KVM core expects arch code to acquire the srcu lock when calling gfn_to_memslot and similar functions. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
|
#
eb1e4f43 |
|
12-Apr-2013 |
Scott Wood <scottwood@freescale.com> |
kvm/ppc/mpic: add KVM_CAP_IRQ_MPIC Enabling this capability connects the vcpu to the designated in-kernel MPIC. Using explicit connections between vcpus and irqchips allows for flexibility, but the main benefit at the moment is that it simplifies the code -- KVM doesn't need vm-global state to remember which MPIC object is associated with this vm, and it doesn't need to care about ordering between irqchip creation and vcpu creation. Signed-off-by: Scott Wood <scottwood@freescale.com> [agraf: add stub functions for kvmppc_mpic_{dis,}connect_vcpu] Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
5df554ad |
|
12-Apr-2013 |
Scott Wood <scottwood@freescale.com> |
kvm/ppc/mpic: in-kernel MPIC emulation Hook the MPIC code up to the KVM interfaces, add locking, etc. Signed-off-by: Scott Wood <scottwood@freescale.com> [agraf: add stub function for kvmppc_mpic_set_epr, non-booke, 64bit] Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
35b299e2 |
|
10-Apr-2013 |
Mihai Caraman <mihai.caraman@freescale.com> |
KVM: PPC: Book3E: Refactor ONE_REG ioctl implementation Refactor Book3E ONE_REG ioctl implementation to use kvmppc_get_one_reg/ kvmppc_set_one_reg delegation interface introduced by Book3S. This is necessary for MMU SPRs which are platform specifics. Get rid of useless case braces in the process. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
9b4f5308 |
|
07-Apr-2013 |
Bharat Bhushan <bharat.bhushan@freescale.com> |
booke: exit to user space if emulator request This allows the exit to user space if emulator request by returning EMULATE_EXIT_USER. This will be used in subsequent patches in list Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
092d62ee |
|
07-Apr-2013 |
Bharat Bhushan <r65777@freescale.com> |
KVM: PPC: debug stub interface parameter defined This patch defines the interface parameter for KVM_SET_GUEST_DEBUG ioctl support. Follow up patches will use this for setting up hardware breakpoints, watchpoints and software breakpoints. Also kvm_arch_vcpu_ioctl_set_guest_debug() is brought one level below. This is because I am not sure what is required for book3s. So this ioctl behaviour will not change for book3s. Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
8c32a2ea |
|
20-Mar-2013 |
Bharat Bhushan <r65777@freescale.com> |
Added ONE_REG interface for debug instruction This patch adds the one_reg interface to get the special instruction to be used for setting software breakpoint from userspace. Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
4fe27d2a |
|
14-Feb-2013 |
Paul Mackerras <paulus@samba.org> |
KVM: PPC: Remove unused argument to kvmppc_core_dequeue_external Currently kvmppc_core_dequeue_external() takes a struct kvm_interrupt * argument and does nothing with it, in any of its implementations. This removes it in order to make things easier for forthcoming in-kernel interrupt controller emulation code. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
78accda4 |
|
24-Feb-2013 |
Bharat Bhushan <r65777@freescale.com> |
KVM: PPC: Added one_reg interface for timer registers If userspace wants to change some specific bits of TSR (timer status register) then it uses GET/SET_SREGS ioctl interface. So the steps will be: i) user-space will make get ioctl, ii) change TSR in userspace iii) then make set ioctl. It can happen that TSR gets changed by kernel after step i) and before step iii). To avoid this we have added below one_reg ioctls for oring and clearing specific bits in TSR. This patch adds one registerface for: 1) setting specific bit in TSR (timer status register) 2) clearing specific bit in TSR (timer status register) 3) setting/getting the TCR register. There are cases where we want to only change TCR and not TSR. Although we can uses SREGS without KVM_SREGS_E_UPDATE_TSR flag but I think one reg is better. I am open if someone feels we should use SREGS only here. 4) getting/setting TSR register Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
d26f22c9 |
|
24-Feb-2013 |
Bharat Bhushan <r65777@freescale.com> |
KVM: PPC: move tsr update in a separate function This is done so that same function can be called from SREGS and ONE_REG interface (follow up patch). Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
8482644a |
|
27-Feb-2013 |
Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp> |
KVM: set_memory_region: Refactor commit_memory_region() This patch makes the parameter old a const pointer to the old memory slot and adds a new parameter named change to know the change being requested: the former is for removing extra copying and the latter is for cleaning up the code. Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
#
011da899 |
|
31-Jan-2013 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: BookE: Handle alignment interrupts When the guest triggers an alignment interrupt, we don't handle it properly today and instead BUG_ON(). This really shouldn't happen. Instead, we should just pass the interrupt back into the guest so it can deal with it. Reported-by: Gao Guanhua-B22826 <B22826@freescale.com> Tested-by: Gao Guanhua-B22826 <B22826@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
1d542d9c |
|
15-Jan-2013 |
Bharat Bhushan <Bharat.Bhushan@freescale.com> |
KVM: PPC: booke: Allow multiple exception types Current kvmppc_booke_handlers uses the same macro (KVM_HANDLER) and all handlers are considered to be the same size. This will not be the case if we want to use different macros for different handlers. This patch improves the kvmppc_booke_handler so that it can support different macros for different handlers. Signed-off-by: Liu Yu <yu.liu@freescale.com> [bharat.bhushan@freescale.com: Substantial changes] Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
324b3e63 |
|
04-Jan-2013 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: BookE: Add EPR ONE_REG sync We need to be able to read and write the contents of the EPR register from user space. This patch implements that logic through the ONE_REG API and declares its (never implemented) SREGS counterpart as deprecated. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
1c810636 |
|
04-Jan-2013 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: BookE: Implement EPR exit The External Proxy Facility in FSL BookE chips allows the interrupt controller to automatically acknowledge an interrupt as soon as a core gets its pending external interrupt delivered. Today, user space implements the interrupt controller, so we need to check on it during such a cycle. This patch implements logic for user space to enable EPR exiting, disable EPR exiting and EPR exiting itself, so that user space can acknowledge an interrupt when an external interrupt has successfully been delivered into the guest vcpu. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
b8c649a9 |
|
19-Dec-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: BookE: Allow irq deliveries to inject requests When injecting an interrupt into guest context, we usually don't need to check for requests anymore. At least not until today. With the introduction of EPR, we will have to create a request when the guest has successfully accepted an external interrupt though. So we need to prepare the interrupt delivery to abort guest entry gracefully. Otherwise we'd delay the EPR request. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
352df1de |
|
11-Oct-2012 |
Mihai Caraman <mihai.caraman@freescale.com> |
KVM: PPC: booke: Get/set guest EPCR register using ONE_REG interface Implement ONE_REG interface for EPCR register adding KVM_REG_PPC_EPCR to the list of ONE_REG PPC supported registers. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> [agraf: remove HV dependency, use get/put_user] Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
38f98824 |
|
11-Oct-2012 |
Mihai Caraman <mihai.caraman@freescale.com> |
KVM: PPC: bookehv: Add EPCR support in mtspr/mfspr emulation Add EPCR support in booke mtspr/mfspr emulation. EPCR register is defined only for 64-bit and HV categories, we will expose it at this point only to 64-bit virtual processors running on 64-bit HV hosts. Define a reusable setter function for vcpu's EPCR. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> [agraf: move HV dependency in the code] Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
95e90b43 |
|
11-Oct-2012 |
Mihai Caraman <mihai.caraman@freescale.com> |
KVM: PPC: bookehv: Add guest computation mode for irq delivery When delivering guest IRQs, update MSR computation mode according to guest interrupt computation mode found in EPCR. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> [agraf: remove HV dependency in the code] Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
b50df19c |
|
11-Oct-2012 |
Mihai Caraman <mihai.caraman@freescale.com> |
KVM: PPC: booke: Fix get_tb() compile error on 64-bit Include header file for get_tb() declaration. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
5bd1cf11 |
|
22-Aug-2012 |
Scott Wood <scottwood@freescale.com> |
KVM: PPC: set IN_GUEST_MODE before checking requests Avoid a race as described in the code comment. Also remove a related smp_wmb() from booke's kvmppc_prepare_to_enter(). I can't see any reason for it, and the book3s_pr version doesn't have it. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
a47d72f3 |
|
20-Sep-2012 |
Paul Mackerras <paulus@samba.org> |
KVM: PPC: Book3S HV: Fix updates of vcpu->cpu This removes the powerpc "generic" updates of vcpu->cpu in load and put, and moves them to the various backends. The reason is that "HV" KVM does its own sauce with that field and the generic updates might corrupt it. The field contains the CPU# of the -first- HW CPU of the core always for all the VCPU threads of a core (the one that's online from a host Linux perspective). However, the preempt notifiers are going to be called on the threads VCPUs when they are running (due to them sleeping on our private waitqueue) causing unload to be called, potentially clobbering the value. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
dfe49dbd |
|
11-Sep-2012 |
Paul Mackerras <paulus@samba.org> |
KVM: PPC: Book3S HV: Handle memory slot deletion and modification correctly This adds an implementation of kvm_arch_flush_shadow_memslot for Book3S HV, and arranges for kvmppc_core_commit_memory_region to flush the dirty log when modifying an existing slot. With this, we can handle deletion and modification of memory slots. kvm_arch_flush_shadow_memslot calls kvmppc_core_flush_memslot, which on Book3S HV now traverses the reverse map chains to remove any HPT (hashed page table) entries referring to pages in the memslot. This gets called by generic code whenever deleting a memslot or changing the guest physical address for a memslot. We flush the dirty log in kvmppc_core_commit_memory_region for consistency with what x86 does. We only need to flush when an existing memslot is being modified, because for a new memslot the rmap array (which stores the dirty bits) is all zero, meaning that every page is considered clean already, and when deleting a memslot we obviously don't care about the dirty bits any more. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
a66b48c3 |
|
11-Sep-2012 |
Paul Mackerras <paulus@samba.org> |
KVM: PPC: Move kvm->arch.slot_phys into memslot.arch Now that we have an architecture-specific field in the kvm_memory_slot structure, we can use it to store the array of page physical addresses that we need for Book3S HV KVM on PPC970 processors. This reduces the size of struct kvm_arch for Book3S HV, and also reduces the size of struct kvm_arch_memory_slot for other PPC KVM variants since the fields in it are now only compiled in for Book3S HV. This necessitates making the kvm_arch_create_memslot and kvm_arch_free_memslot operations specific to each PPC KVM variant. That in turn means that we now don't allocate the rmap arrays on Book3S PR and Book E. Since we now unpin pages and free the slot_phys array in kvmppc_core_free_memslot, we no longer need to do it in kvmppc_core_destroy_vm, since the generic code takes care to free all the memslots when destroying a VM. We now need the new memslot to be passed in to kvmppc_core_prepare_memory_region, since we need to initialize its arch.slot_phys member on Book3S HV. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
7a08c274 |
|
16-Aug-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: BookE: Support FPU on non-hv systems When running on HV aware hosts, we can not trap when the guest sets the FP bit, so we just let it do so when it wants to, because it has full access to MSR. For non-HV aware hosts with an FPU (like 440), we need to also adjust the shadow MSR though. Otherwise the guest gets an FP unavailable trap even when it really enabled the FP bit in MSR. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
6df8d3fc |
|
08-Aug-2012 |
Bharat Bhushan <r65777@freescale.com> |
booke: Added ONE_REG interface for IAC/DAC debug registers IAC/DAC are defined as 32 bit while they are 64 bit wide. So ONE_REG interface is added to set/get them. Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
f61c94bb |
|
08-Aug-2012 |
Bharat Bhushan <r65777@freescale.com> |
KVM: PPC: booke: Add watchdog emulation This patch adds the watchdog emulation in KVM. The watchdog emulation is enabled by KVM_ENABLE_CAP(KVM_CAP_PPC_BOOKE_WATCHDOG) ioctl. The kernel timer are used for watchdog emulation and emulates h/w watchdog state machine. On watchdog timer expiry, it exit to QEMU if TCR.WRC is non ZERO. QEMU can reset/shutdown etc depending upon how it is configured. Signed-off-by: Liu Yu <yu.liu@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com> [bharat.bhushan@freescale.com: reworked patch] Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> [agraf: adjust to new request framework] Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
7c973a2e |
|
12-Aug-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Add return value to core_check_requests Requests may want to tell us that we need to go back into host state, so add a return value for the checks. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
7ee78855 |
|
12-Aug-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Add return value in prepare_to_enter Our prepare_to_enter helper wants to be able to return in more circumstances to the host than only when an interrupt is pending. Broaden the interface a bit and move even more generic code to the generic helper. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
3766a4c6 |
|
12-Aug-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Move kvm_guest_enter call into generic code We need to call kvm_guest_enter in booke and book3s, so move its call to generic code. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
bd2be683 |
|
12-Aug-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Book3S: PR: Rework irq disabling Today, we disable preemption while inside guest context, because we need to expose to the world that we are not in a preemptible context. However, during that time we already have interrupts disabled, which would indicate that we are in a non-preemptible context. The reason the checks for irqs_disabled() fail for us though is that we manually control hard IRQs and ignore all the lazy EE framework. Let's stop doing that. Instead, let's always use lazy EE to indicate when we want to disable IRQs, but do a special final switch that gets us into EE disabled, but soft enabled state. That way when we get back out of guest state, we are immediately ready to process interrupts. This simplifies the code drastically and reduces the time that we appear as preempt disabled. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
24afa37b |
|
11-Aug-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Consistentify vcpu exit path When getting out of __vcpu_run, let's be consistent about the state we return in. We want to always * have IRQs enabled * have called kvm_guest_exit before Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
706fb730 |
|
12-Aug-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Exit guest context while handling exit The x86 implementation of KVM accounts for host time while processing guest exits. Do the same for us. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
e85ad380 |
|
12-Aug-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: BookE: Drop redundant vcpu->mode set We only need to set vcpu->mode to outside once. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
03d25c5b |
|
09-Aug-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Use same kvmppc_prepare_to_enter code for booke and book3s_pr We need to do the same things when preparing to enter a guest for booke and book3s_pr cores. Fold the generic code into a generic function that both call. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
2d8185d4 |
|
09-Aug-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: BookE: No duplicate request != 0 check We only call kvmppc_check_requests() when vcpu->requests != 0, so drop the redundant check in the function itself Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
6346046c |
|
07-Aug-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: BookE: Add some more trace points Without trace points, debugging what exactly is going on inside guest code can be very tricky. Add a few more trace points at places that hopefully tell us more when things go wrong. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
862d31f7 |
|
30-Jul-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: E500: Implement MMU notifiers The e500 target has lived without mmu notifiers ever since it got introduced, but fails for the user space check on them with hugetlbfs. So in order to get that one working, implement mmu notifiers in a reasonably dumb fashion and be happy. On embedded hardware, we almost never end up with mmu notifier calls, since most people don't overcommit. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
d69c6436 |
|
08-Aug-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: BookE: Add support for vcpu->mode Generic KVM code might want to know whether we are inside guest context or outside. It also wants to be able to push us out of guest context. Add support to the BookE code for the generic vcpu->mode field that describes the above states. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
4ffc6356 |
|
08-Aug-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: BookE: Add check_requests helper function We need a central place to check for pending requests in. Add one that only does the timer check we already do in a different place. Later, this central function can be extended by more checks. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
cf1c5ca4 |
|
31-Jul-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: BookE: Expose remote TLB flushes in debugfs We're already counting remote TLB flushes in a variable, but don't export it to user space yet. Do so, so we know what's going on. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
97c95059 |
|
02-Aug-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: PR: Use generic tracepoint for guest exit We want to have tracing information on guest exits for booke as well as book3s. Since most information is identical, use a common trace point. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
6328e593 |
|
19-Jun-2012 |
Bharat Bhushan <r65777@freescale.com> |
booke/bookehv: Add host crit-watchdog exception support Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
21bd000a |
|
20-May-2012 |
Bharat Bhushan <r65777@freescale.com> |
KVM: PPC: booke: Added DECAR support Added the decrementer auto-reload support. DECAR is readable on e500v2/e500mc and later cpus. Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
966cd0f3 |
|
14-Mar-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Ignore unhalt request from kvm_vcpu_block When running kvm_vcpu_block and it realizes that the CPU is actually good to run, we get a request bit set for KVM_REQ_UNHALT. Right now, there's nothing we can do with that bit, so let's unset it right after the call again so we don't get confused in our later checks for pending work. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
6020c0f6 |
|
11-Mar-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Pass EA to updating emulation ops When emulating updating load/store instructions (lwzu, stwu, ...) we need to write the effective address of the load/store into a register. Currently, we write the physical address in there, which is very wrong. So instead let's save off where the virtual fault was on MMIO and use that information as value to put into the register. While at it, also move the XOP variants of the above instructions to the new scheme of using the already known vaddr instead of calculating it themselves. Reported-by: Jörg Sommer <joerg@alea.gnuu.de> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
03660ba2 |
|
27-Feb-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Booke: only prepare to enter when we enter So far, we've always called prepare_to_enter even when all we did was return to the host. This patch changes that semantic to only call prepare_to_enter when we actually want to get back into the guest. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
7cc1e8ee |
|
22-Feb-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: booke: Reinject performance monitor interrupts When we get a performance monitor interrupt, we need to make sure that the host receives it. So reinject it like we reinject the other host destined interrupts. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
4e642ccb |
|
20-Feb-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: booke: expose good state on irq reinject When reinjecting an interrupt into the host interrupt handler after we're back in host kernel land, we need to tell the kernel where the interrupt happened. We can't tell it that we were in guest state, because that might lead to random code walking host addresses. So instead, we tell it that we came from the interrupt reinject code. This helps getting reasonable numbers out of perf. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
95f2e921 |
|
20-Feb-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: booke: Support perfmon interrupts When during guest context we get a performance monitor interrupt, we currently bail out and oops. Let's route it to its correct handler instead. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
0268597c |
|
19-Feb-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: booke: add GS documentation for program interrupt The comment for program interrupts triggered when using bookehv was misleading. Update it to mention why MSR_GS indicates that we have to inject an interrupt into the guest again, not emulate it. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
c35c9d84 |
|
19-Feb-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: booke: Readd debug abort code for machine check When during guest execution we get a machine check interrupt, we don't know how to handle it yet. So let's add the error printing code back again that we dropped accidently earlier and tell user space that something went really wrong. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
8b3a00fc |
|
16-Feb-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: booke: BOOKE_IRQPRIO_MAX is n+1 The semantics of BOOKE_IRQPRIO_MAX changed to denote the highest available irqprio + 1, so let's reflect that in the code too. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
a8e4ef84 |
|
16-Feb-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: booke: rework rescheduling checks Instead of checking whether we should reschedule only when we exited due to an interrupt, let's always check before entering the guest back again. This gets the target more in line with the other archs. Also while at it, generalize the whole thing so that eventually we could have a single kvmppc_prepare_to_enter function for all ppc targets that does signal and reschedule checking for us. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
d1ff5499 |
|
16-Feb-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: booke: deliver program int on emulation failure When we fail to emulate an instruction for the guest, we better go in and tell it that we failed to emulate it, by throwing an illegal instruction exception. Please beware that we basically never get around to telling the guest that we failed thanks to the debugging code right above it. If user space however decides that it wants to ignore the debug, we would at least do "the right thing" afterwards. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
acab0529 |
|
16-Feb-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: booke: remove leftover debugging The e500mc patches left some debug code in that we don't need. Remove it. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
bf7ca4bd |
|
15-Feb-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: rename CONFIG_KVM_E500 -> CONFIG_KVM_E500V2 The CONFIG_KVM_E500 option really indicates that we're running on a V2 machine, not on a machine of the generic E500 class. So indicate that properly and change the config name accordingly. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
79300f8c |
|
15-Feb-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: e500mc: implicitly set MSR_GS When setting MSR for an e500mc guest, we implicitly always set MSR_GS to make sure the guest is in guest state. Since we have this implicit rule there, we don't need to explicitly pass MSR_GS to set_msr(). Remove all explicit setters of MSR_GS. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
4ab96919 |
|
15-Feb-2012 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: e500mc: Add doorbell emulation support When one vcpu wants to kick another, it can issue a special IPI instruction called msgsnd. This patch emulates this instruction, its clearing counterpart and the infrastructure required to actually trigger that interrupt inside a guest vcpu. With this patch, SMP guests on e500mc work. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
8fae845f |
|
20-Dec-2011 |
Scott Wood <scottwood@freescale.com> |
KVM: PPC: booke: standard PPC floating point support e500mc has a normal PPC FPU, rather than SPE which is found on e500v1/v2. Based on code from Liu Yu <yu.liu@freescale.com>. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
d30f6e48 |
|
20-Dec-2011 |
Scott Wood <scottwood@freescale.com> |
KVM: PPC: booke: category E.HV (GS-mode) support Chips such as e500mc that implement category E.HV in Power ISA 2.06 provide hardware virtualization features, including a new MSR mode for guest state. The guest OS can perform many operations without trapping into the hypervisor, including transitions to and from guest userspace. Since we can use SRR1[GS] to reliably tell whether an exception came from guest state, instead of messing around with IVPR, we use DO_KVM similarly to book3s. Current issues include: - Machine checks from guest state are not routed to the host handler. - The guest can cause a host oops by executing an emulated instruction in a page that lacks read permission. Existing e500/4xx support has the same problem. Includes work by Ashish Kalra <Ashish.Kalra@freescale.com>, Varun Sethi <Varun.Sethi@freescale.com>, and Liu Yu <yu.liu@freescale.com>. Signed-off-by: Scott Wood <scottwood@freescale.com> [agraf: remove pt_regs usage] Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
fafd6832 |
|
20-Dec-2011 |
Scott Wood <scottwood@freescale.com> |
KVM: PPC: booke: Move vm core init/destroy out of booke.c e500mc will want to do lpid allocation/deallocation here. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
94fa9d99 |
|
20-Dec-2011 |
Scott Wood <scottwood@freescale.com> |
KVM: PPC: booke: add booke-level vcpu load/put This gives us a place to put load/put actions that correspond to code that is booke-specific but not specific to a particular core. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
31f3438e |
|
11-Dec-2011 |
Paul Mackerras <paulus@samba.org> |
KVM: PPC: Move kvm_vcpu_ioctl_[gs]et_one_reg down to platform-specific code This moves the get/set_one_reg implementation down from powerpc.c into booke.c, book3s_pr.c and book3s_hv.c. This avoids #ifdefs in C code, but more importantly, it fixes a bug on Book3s HV where we were accessing beyond the end of the kvm_vcpu struct (via the to_book3s() macro) and corrupting memory, causing random crashes and file corruption. On Book3s HV we only accept setting the HIOR to zero, since the guest runs in supervisor mode and its vectors are never offset from zero. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de> [agraf update to apply on top of changed ONE_REG patches] Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
dfd4d47e |
|
16-Nov-2011 |
Scott Wood <scottwood@freescale.com> |
KVM: PPC: booke: Improve timer register emulation Decrementers are now properly driven by TCR/TSR, and the guest has full read/write access to these registers. The decrementer keeps ticking (and setting the TSR bit) regardless of whether the interrupts are enabled with TCR. The decrementer stops at zero, rather than going negative. Decrementers (and FITs, once implemented) are delivered as level-triggered interrupts -- dequeued when the TSR bit is cleared, not on delivery. Signed-off-by: Liu Yu <yu.liu@freescale.com> [scottwood@freescale.com: significant changes] Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
b5904972 |
|
08-Nov-2011 |
Scott Wood <scottwood@freescale.com> |
KVM: PPC: Paravirtualize SPRG4-7, ESR, PIR, MASn This allows additional registers to be accessed by the guest in PR-mode KVM without trapping. SPRG4-7 are readable from userspace. On booke, KVM will sync these registers when it enters the guest, so that accesses from guest userspace will work. The guest kernel, OTOH, must consistently use either the real registers or the shared area between exits. This also applies to the already-paravirted SPRG3. On non-booke, it's not clear to what extent SPRG4-7 are supported (they're not architected for book3s, but exist on at least some classic chips). They are copied in the get/set regs ioctls, but I do not see any non-booke emulation. I also do not see any syncing with real registers (in PR-mode) including the user-readable SPRG3. This patch should not make that situation any worse. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
29ac26ef |
|
08-Nov-2011 |
Scott Wood <scottwood@freescale.com> |
KVM: PPC: booke: Fix int_pending calculation for MSR[EE] paravirt int_pending was only being lowered if a bit in pending_exceptions was cleared during exception delivery -- but for interrupts, we clear it during IACK/TSR emulation. This caused paravirt for enabling MSR[EE] to be ineffective. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
c59a6a3e |
|
08-Nov-2011 |
Scott Wood <scottwood@freescale.com> |
KVM: PPC: booke: Check for MSR[WE] in prepare_to_enter This prevents us from inappropriately blocking in a KVM_SET_REGS ioctl -- the MSR[WE] will take effect when the guest is next entered. It also causes SRR1[WE] to be set when we enter the guest's interrupt handler, which is what e500 hardware is documented to do. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
25051b5a |
|
08-Nov-2011 |
Scott Wood <scottwood@freescale.com> |
KVM: PPC: Move prepare_to_enter call site into subarch code This function should be called with interrupts disabled, to avoid a race where an exception is delivered after we check, but the resched kick is received before we disable interrupts (and thus doesn't actually trigger the exit code that would recheck exceptions). booke already does this properly in the lightweight exit case, but not on initial entry. For now, move the call of prepare_to_enter into subarch-specific code so that booke can do the right thing here. Ideally book3s would do the same thing, but I'm having a hard time seeing where it does any interrupt disabling of this sort (plus it has several additional call sites), so I'm deferring the book3s fix to someone more familiar with that code. book3s behavior should be unchanged by this patch. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
7e28e60e |
|
08-Nov-2011 |
Scott Wood <scottwood@freescale.com> |
KVM: PPC: Rename deliver_interrupts to prepare_to_enter This function also updates paravirt int_pending, so rename it to be more obvious that this is a collection of checks run prior to (re)entering a guest. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
1d1ef222 |
|
08-Nov-2011 |
Scott Wood <scottwood@freescale.com> |
KVM: PPC: booke: check for signals in kvmppc_vcpu_run Currently we check prior to returning from a lightweight exit, but not prior to initial entry. book3s already does a similar test. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
841741f2 |
|
02-Sep-2011 |
Scott Wood <scottwood@freescale.com> |
KVM: PPC: e500: Don't hardcode PIR=0 The hardcoded behavior prevents proper SMP support. user space shall specify the vcpu's PIR as the vcpu id. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
af8f38b3 |
|
10-Aug-2011 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Add sanity checking to vcpu_run There are multiple features in PowerPC KVM that can now be enabled depending on the user's wishes. Some of the combinations don't make sense or don't work though. So this patch adds a way to check if the executing environment would actually be able to run the guest properly. It also adds sanity checks if PVR is set (should always be true given the current code flow), if PAPR is only used with book3s_64 where it works and that HV KVM is only used in PAPR mode. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
df6909e5 |
|
28-Jun-2011 |
Paul Mackerras <paulus@samba.org> |
KVM: PPC: Move guest enter/exit down into subarch-specific code Instead of doing the kvm_guest_enter/exit() and local_irq_dis/enable() calls in powerpc.c, this moves them down into the subarch-specific book3s_pr.c and booke.c. This eliminates an extra local_irq_enable() call in book3s_pr.c, and will be needed for when we do SMT4 guest support in the book3s hypervisor mode code. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
f9e0554d |
|
28-Jun-2011 |
Paul Mackerras <paulus@samba.org> |
KVM: PPC: Pass init/destroy vm and prepare/commit memory region ops down This arranges for the top-level arch/powerpc/kvm/powerpc.c file to pass down some of the calls it gets to the lower-level subarchitecture specific code. The lower-level implementations (in booke.c and book3s.c) are no-ops. The coming book3s_hv.c will need this. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
dd9ebf1f |
|
14-Jun-2011 |
Liu Yu <yu.liu@freescale.com> |
KVM: PPC: e500: Add shadow PID support Dynamically assign host PIDs to guest PIDs, splitting each guest PID into multiple host (shadow) PIDs based on kernel/user and MSR[IS/DS]. Use both PID0 and PID1 so that the shadow PIDs for the right mode can be selected, that correspond both to guest TID = zero and guest TID = guest PID. This allows us to significantly reduce the frequency of needing to invalidate the entire TLB. When the guest mode or PID changes, we just update the host PID0/PID1. And since the allocation of shadow PIDs is global, multiple guests can share the TLB without conflict. Note that KVM does not yet support the guest setting PID1 or PID2 to a value other than zero. This will need to be fixed for nested KVM to work. Until then, we enforce the requirement for guest PID1/PID2 to stay zero by failing the emulation if the guest tries to set them to something else. Signed-off-by: Liu Yu <yu.liu@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
a4cd8b23 |
|
14-Jun-2011 |
Scott Wood <scottwood@freescale.com> |
KVM: PPC: e500: enable magic page This is a shared page used for paravirtualization. It is always present in the guest kernel's effective address space at the address indicated by the hypercall that enables it. The physical address specified by the hypercall is not used, as e500 does not have real mode. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
4cd35f67 |
|
14-Jun-2011 |
Scott Wood <scottwood@freescale.com> |
KVM: PPC: e500: Save/restore SPE state This is done lazily. The SPE save will be done only if the guest has used SPE since the last preemption or heavyweight exit. Restore will be done only on demand, when enabling MSR_SPE in the shadow MSR, in response to an SPE fault or mtmsr emulation. For SPEFSCR, Linux already switches it on context switch (non-lazily), so the only remaining bit is to save it between qemu and the guest. Signed-off-by: Liu Yu <yu.liu@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
ecee273f |
|
14-Jun-2011 |
Scott Wood <scottwood@freescale.com> |
KVM: PPC: booke: use shadow_msr Keep the guest MSR and the guest-mode true MSR separate, rather than modifying the guest MSR on each guest entry to produce a true MSR. Any bits which should be modified based on guest MSR must be explicitly propagated from vcpu->arch.shared->msr to vcpu->arch.shadow_msr in kvmppc_set_msr(). While we're modifying the guest entry code, reorder a few instructions to bury some load latencies. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
5ce941ee |
|
27-Apr-2011 |
Scott Wood <scottwood@freescale.com> |
KVM: PPC: booke: add sregs support Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
bc9c1933 |
|
29-Dec-2010 |
Peter Tyser <ptyser@xes-inc.com> |
KVM: PPC: Fix SPRG get/set for Book3S and BookE Previously SPRGs 4-7 were improperly read and written in kvm_arch_vcpu_ioctl_get_regs() and kvm_arch_vcpu_ioctl_set_regs(); Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Peter Tyser <ptyser@xes-inc.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
#
c5335f17 |
|
30-Aug-2010 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Implement level interrupts for BookE BookE also wants to support level based interrupts, so let's implement all the necessary logic there. We need to trick a bit here because the irqprios are 1:1 assigned to architecture defined values. But since there is some space left there, we can just pick a random one and move it later on - it's internal anyways. Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
082decf2 |
|
07-Aug-2010 |
Hollis Blanchard <hollis_blanchard@mentor.com> |
KVM: PPC: initialize IVORs in addition to IVPR Developers can now tell at a glace the exact type of the premature interrupt, instead of just knowing that there was some premature interrupt. Signed-off-by: Hollis Blanchard <hollis_blanchard@mentor.com> Signed-off-by: Alexander Graf <agraf@suse.de>
|
#
90bba358 |
|
29-Jul-2010 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Tell guest about pending interrupts When the guest turns on interrupts again, it needs to know if we have an interrupt pending for it. Because if so, it should rather get out of guest context and get the interrupt. So we introduce a new field in the shared page that we use to tell the guest that there's a pending interrupt lying around. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
5c6cedf4 |
|
29-Jul-2010 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Add PV guest critical sections When running in hooked code we need a way to disable interrupts without clobbering any interrupts or exiting out to the hypervisor. To achieve this, we have an additional critical field in the shared page. If that field is equal to the r1 register of the guest, it tells the hypervisor that we're in such a critical section and thus may not receive any interrupts. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
2a342ed5 |
|
29-Jul-2010 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Implement hypervisor interface To communicate with KVM directly we need to plumb some sort of interface between the guest and KVM. Usually those interfaces use hypercalls. This hypercall implementation is described in the last patch of the series in a special documentation file. Please read that for further information. This patch implements stubs to handle KVM PPC hypercalls on the host and guest side alike. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
a73a9599 |
|
29-Jul-2010 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Convert SPRG[0-4] to shared page When in kernel mode there are 4 additional registers available that are simple data storage. Instead of exiting to the hypervisor to read and write those, we can just share them with the guest using the page. This patch converts all users of the current field to the shared page. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
de7906c3 |
|
29-Jul-2010 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Convert SRR0 and SRR1 to shared page The SRR0 and SRR1 registers contain cached values of the PC and MSR respectively. They get written to by the hypervisor when an interrupt occurs or directly by the kernel. They are also used to tell the rfi(d) instruction where to jump to. Because it only gets touched on defined events that, it's very simple to share with the guest. Hypervisor and guest both have full r/w access. This patch converts all users of the current field to the shared page. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
5e030186 |
|
29-Jul-2010 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Convert DAR to shared page. The DAR register contains the address a data page fault occured at. This register behaves pretty much like a simple data storage register that gets written to on data faults. There is no hypervisor interaction required on read or write. This patch converts all users of the current field to the shared page. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
666e7252 |
|
29-Jul-2010 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Convert MSR to shared page One of the most obvious registers to share with the guest directly is the MSR. The MSR contains the "interrupts enabled" flag which the guest has to toggle in critical sections. So in order to bring the overhead of interrupt en- and disabling down, let's put msr into the shared page. Keep in mind that even though you can fully read its contents, writing to it doesn't always update all state. There are a few safe fields that don't require hypervisor interaction. See the documentation for a list of MSR bits that are safe to be set from inside the guest. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
6045be5d |
|
19-Jun-2010 |
Asias He <asias.hejun@gmail.com> |
KVM: PPC: fix uninitialized variable warning in kvm_ppc_core_deliver_interrupts Fixes: arch/powerpc/kvm/booke.c: In function 'kvmppc_core_deliver_interrupts': arch/powerpc/kvm/booke.c:147: warning: 'msr_mask' may be used uninitialized in this function Signed-off-by: Asias He <asias.hejun@gmail.com> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
2122ff5e |
|
13-May-2010 |
Avi Kivity <avi@redhat.com> |
KVM: move vcpu locking to dispatcher for generic vcpu ioctls All vcpu ioctls need to be locked, so instead of locking each one specifically we lock at the generic dispatcher. This patch only updates generic ioctls and leaves arch specific ioctls alone. Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
98001d8d |
|
13-May-2010 |
Avi Kivity <avi@redhat.com> |
KVM: PPC: Add missing vcpu_load()/vcpu_put() in vcpu ioctls Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
4496f974 |
|
07-Apr-2010 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Add dequeue for external on BookE Commit a0abee86af2d1f048dbe99d2bcc4a2cefe685617 introduced unsetting of the IRQ line from userspace. This added a new core specific callback that I apparently forgot to add for BookE. So let's add the callback for BookE as well, making it build again. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
#
5a0e3ad6 |
|
24-Mar-2010 |
Tejun Heo <tj@kernel.org> |
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
|
#
daf5e271 |
|
02-Feb-2010 |
Liu Yu <yu.liu@freescale.com> |
KVM: ppc/booke: Set ESR and DEAR when inject interrupt to guest Old method prematurely sets ESR and DEAR. Move this part after we decide to inject interrupt, which is more like hardware behave. Signed-off-by: Liu Yu <yu.liu@freescale.com> Acked-by: Hollis Blanchard <hollis@penguinppc.org> Acked-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
25a8a02d |
|
07-Jan-2010 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Emulate trap SRR1 flags properly Book3S needs some flags in SRR1 to get to know details about an interrupt. One such example is the trap instruction. It tells the guest kernel that a program interrupt is due to a trap using a bit in SRR1. This patch implements above behavior, making WARN_ON behave like WARN_ON. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
992b5b29 |
|
07-Jan-2010 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Add helpers for CR, XER We now have helpers for the GPRs, so let's also add some for CR and XER. Having them in the PACA simplifies code a lot, as we don't need to care about where to store CC or not to overflow any integers. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
8e5b26b5 |
|
07-Jan-2010 |
Alexander Graf <agraf@suse.de> |
KVM: PPC: Use accessor functions for GPR access All code in PPC KVM currently accesses gprs in the vcpu struct directly. While there's nothing wrong with that wrt the current way gprs are stored and loaded, it doesn't suffice for the PACA acceleration that will follow in this patchset. So let's just create little wrapper inline functions that we call whenever a GPR needs to be read from or written to. The compiled code shouldn't really change at all for now. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
7706664d |
|
21-Dec-2009 |
Alexander Graf <agraf@suse.de> |
KVM: powerpc: Improve DEC handling We treated the DEC interrupt like an edge based one. This is not true for Book3s. The DEC keeps firing until mtdec is issued again and thus clears the interrupt line. So let's implement this logic in KVM too. This patch moves the line clearing from the firing of the interrupt to the mtdec emulation. This makes PPC64 guests work without AGGRESSIVE_DEC defined. Signed-off-by: Alexander Graf <agraf@suse.de> Acked-by: Acked-by: Hollis Blanchard <hollis@penguinppc.org> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
4e755758 |
|
29-Oct-2009 |
Alexander Graf <agraf@suse.de> |
Move dirty logging code to sub-arch PowerPC code handles dirty logging in the generic parts atm. While this is great for "return -ENOTSUPP", we need to be rather target specific when actually implementing it. So let's split it to implementation specific code, so we can implement it for book3s. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
#
2986b8c7 |
|
01-Jun-2009 |
Stephen Rothwell <sfr@canb.auug.org.au> |
KVM: powerpc: fix some init/exit annotations Fixes a couple of warnings like this one: WARNING: arch/powerpc/kvm/kvm-440.o(.text+0x1e8c): Section mismatch in reference from the function kvmppc_44x_exit() to the function .exit.text:kvmppc_booke_exit() The function kvmppc_44x_exit() references a function in an exit section. Often the function kvmppc_booke_exit() has valid usage outside the exit section and the fix is to remove the __exit annotation of kvmppc_booke_exit. Also add some __init annotations on obvious routines. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
bb3a8a17 |
|
03-Jan-2009 |
Hollis Blanchard <hollisb@us.ibm.com> |
KVM: ppc: Add extra E500 exceptions e500 has additional interrupt vectors (and corresponding IVORs) for SPE and performance monitoring interrupts. Signed-off-by: Liu Yu <yu.liu@freescale.com> Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
bdc89f13 |
|
03-Jan-2009 |
Hollis Blanchard <hollisb@us.ibm.com> |
KVM: ppc: distinguish between interrupts and priorities Although BOOKE_MAX_INTERRUPT has the right value, the meaning is not match. Signed-off-by: Liu Yu <yu.liu@freescale.com> Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
b52a638c |
|
03-Jan-2009 |
Hollis Blanchard <hollisb@us.ibm.com> |
KVM: ppc: Add kvmppc_mmu_dtlb/itlb_miss for booke When itlb or dtlb miss happens, E500 needs to update some mmu registers. So that the auto-load mechanism can work on E500 when write a tlb entry. Signed-off-by: Liu Yu <yu.liu@freescale.com> Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
f4435361 |
|
03-Jan-2009 |
Hollis Blanchard <hollisb@us.ibm.com> |
KVM: ppc: remove last 44x-specific bits from booke.c Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
fa86b8dd |
|
03-Jan-2009 |
Hollis Blanchard <hollisb@us.ibm.com> |
KVM: ppc: rename 44x MMU functions used in booke.c e500 will provide its own implementation of these. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
be8d1cae |
|
03-Jan-2009 |
Hollis Blanchard <hollisb@us.ibm.com> |
KVM: ppc: turn tlb_xlate() into a per-core hook (and give it a better name) Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
58a96214 |
|
03-Jan-2009 |
Hollis Blanchard <hollisb@us.ibm.com> |
KVM: ppc: change kvmppc_mmu_map() parameters Passing just the TLB index will ease an e500 implementation. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
475e7cdd |
|
03-Jan-2009 |
Hollis Blanchard <hollisb@us.ibm.com> |
KVM: ppc: small cosmetic changes to Book E DTLB miss handler Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
7b701591 |
|
02-Dec-2008 |
Hollis Blanchard <hollisb@us.ibm.com> |
KVM: ppc: mostly cosmetic updates to the exit timing accounting code The only significant changes were to kvmppc_exit_timing_write() and kvmppc_exit_timing_show(), both of which were dramatically simplified. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
73e75b41 |
|
02-Dec-2008 |
Hollis Blanchard <hollisb@us.ibm.com> |
KVM: ppc: Implement in-kernel exit timing statistics Existing KVM statistics are either just counters (kvm_stat) reported for KVM generally or trace based aproaches like kvm_trace. For KVM on powerpc we had the need to track the timings of the different exit types. While this could be achieved parsing data created with a kvm_trace extension this adds too much overhead (at least on embedded PowerPC) slowing down the workloads we wanted to measure. Therefore this patch adds a in-kernel exit timing statistic to the powerpc kvm code. These statistic is available per vm&vcpu under the kvm debugfs directory. As this statistic is low, but still some overhead it can be enabled via a .config entry and should be off by default. Since this patch touched all powerpc kvm_stat code anyway this code is now merged and simplified together with the exit timing statistic code (still working with exit timing disabled in .config). Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
7924bd41 |
|
02-Dec-2008 |
Hollis Blanchard <hollisb@us.ibm.com> |
KVM: ppc: directly insert shadow mappings into the hardware TLB Formerly, we used to maintain a per-vcpu shadow TLB and on every entry to the guest would load this array into the hardware TLB. This consumed 1280 bytes of memory (64 entries of 16 bytes plus a struct page pointer each), and also required some assembly to loop over the array on every entry. Instead of saving a copy in memory, we can just store shadow mappings directly into the hardware TLB, accepting that the host kernel will clobber these as part of the normal 440 TLB round robin. When we do that we need less than half the memory, and we have decreased the exit handling time for all guest exits, at the cost of increased number of TLB misses because the host overwrites some guest entries. These savings will be increased on processors with larger TLBs or which implement intelligent flush instructions like tlbivax (which will avoid the need to walk arrays in software). In addition to that and to the code simplification, we have a greater chance of leaving other host userspace mappings in the TLB, instead of forcing all subsequent tasks to re-fault all their mappings. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
89168618 |
|
02-Dec-2008 |
Hollis Blanchard <hollisb@us.ibm.com> |
KVM: ppc: support large host pages KVM on 440 has always been able to handle large guest mappings with 4K host pages -- we must, since the guest kernel uses 256MB mappings. This patch makes KVM work when the host has large pages too (tested with 64K). Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
d4cf3892 |
|
05-Nov-2008 |
Hollis Blanchard <hollisb@us.ibm.com> |
KVM: ppc: optimize irq delivery path In kvmppc_deliver_interrupt is just one case left in the switch and it is a rare one (less than 8%) when looking at the exit numbers. Therefore we can at least drop the switch/case and if an if. I inserted an unlikely too, but that's open for discussion. In kvmppc_can_deliver_interrupt all frequent cases are in the default case. I know compilers are smart but we can make it easier for them. By writing down all options and removing the default case combined with the fact that ithe values are constants 0..15 should allow the compiler to write an easy jump table. Modifying kvmppc_can_deliver_interrupt pointed me to the fact that gcc seems to be unable to reduce priority_exception[x] to a build time constant. Therefore I changed the usage of the translation arrays in the interrupt delivery path completely. It is now using priority without translation to irq on the full irq delivery path. To be able to do that ivpr regs are stored by their priority now. Additionally the decision made in kvmppc_can_deliver_interrupt is already sufficient to get the value of interrupt_msr_mask[x]. Therefore we can replace the 16x4byte array used here with a single 4byte variable (might still be one miss, but the chance to find this in cache should be better than the right entry of the whole array). Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
9ab80843 |
|
05-Nov-2008 |
Hollis Blanchard <hollisb@us.ibm.com> |
KVM: ppc: optimize find first bit Since we use a unsigned long here anyway we can use the optimized __ffs. Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
1b6766c7 |
|
05-Nov-2008 |
Hollis Blanchard <hollisb@us.ibm.com> |
KVM: ppc: optimize kvm stat handling Currently we use an unnecessary if&switch to detect some cases. To be honest we don't need the ligh_exits counter anyway, because we can calculate it out of others. Sum_exits can also be calculated, so we can remove that too. MMIO, DCR and INTR can be counted on other places without these additional control structures (The INTR case was never hit anyway). The handling of BOOKE_INTERRUPT_EXTERNAL/BOOKE_INTERRUPT_DECREMENTER is similar, but we can avoid the additional if when copying 3 lines of code. I thought about a goto there to prevent duplicate lines, but rewriting three lines should be better style than a goto cross switch/case statements (its also not enough code to justify a new inline function). Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
b8fd68ac |
|
05-Nov-2008 |
Hollis Blanchard <hollisb@us.ibm.com> |
KVM: ppc: fix set regs to take care of msr change When changing some msr bits e.g. problem state we need to take special care of that. We call the function in our mtmsr emulation (not needed for wrtee[i]), but we don't call kvmppc_set_msr if we change msr via set_regs ioctl. It's a corner case we never hit so far, but I assume it should be kvmppc_set_msr in our arch set regs function (I found it because it is also a corner case when using pv support which would miss the update otherwise). Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
5cf8ca22 |
|
05-Nov-2008 |
Hollis Blanchard <hollisb@us.ibm.com> |
KVM: ppc: adjust vcpu types to support 64-bit cores However, some of these fields could be split into separate per-core structures in the future. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
db93f574 |
|
05-Nov-2008 |
Hollis Blanchard <hollisb@us.ibm.com> |
KVM: ppc: create struct kvm_vcpu_44x and introduce container_of() accessor This patch doesn't yet move all 44x-specific data into the new structure, but is the first step down that path. In the future we may also want to create a struct kvm_vcpu_booke. Based on patch from Liu Yu <yu.liu@freescale.com>. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
5cbb5106 |
|
05-Nov-2008 |
Hollis Blanchard <hollisb@us.ibm.com> |
KVM: ppc: Move the last bits of 44x code out of booke.c Needed to port to other Book E processors. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
75f74f0d |
|
05-Nov-2008 |
Hollis Blanchard <hollisb@us.ibm.com> |
KVM: ppc: refactor instruction emulation into generic and core-specific pieces Cores provide 3 emulation hooks, implemented for example in the new 4xx_emulate.c: kvmppc_core_emulate_op kvmppc_core_emulate_mtspr kvmppc_core_emulate_mfspr Strictly speaking the last two aren't necessary, but provide for more informative error reporting ("unknown SPR"). Long term I'd like to have instruction decoding autogenerated from tables of opcodes, and that way we could aggregate universal, Book E, and core-specific instructions more easily and without redundant switch statements. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
9dd921cf |
|
05-Nov-2008 |
Hollis Blanchard <hollisb@us.ibm.com> |
KVM: ppc: Refactor powerpc.c to relocate 440-specific code This introduces a set of core-provided hooks. For 440, some of these are implemented by booke.c, with the rest in (the new) 44x.c. Note that these hooks are link-time, not run-time. Since it is not possible to build a single kernel for both e500 and 440 (for example), using function pointers would only add overhead. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
d9fbd03d |
|
05-Nov-2008 |
Hollis Blanchard <hollisb@us.ibm.com> |
KVM: ppc: combine booke_guest.c and booke_host.c The division was somewhat artificial and cumbersome, and had no functional benefit anyways: we can only guests built for the real host processor. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
|