Hollis Blanchard <hollisb@us.ibm.com>
15 Apr 2008

Various notes on the implementation of KVM for PowerPC 440:

To enforce isolation, host userspace, guest kernel, and guest userspace all
run at user privilege level. Only the host kernel runs in supervisor mode.
Executing privileged instructions in the guest traps into KVM (in the host
kernel), where we decode and emulate them. Through this technique, unmodified
440 Linux kernels can be run (slowly) as guests. Future performance work will
focus on reducing the overhead and frequency of these traps.

The usual code flow starts with userspace invoking a "run" ioctl, which
causes KVM to switch into guest context. We use IVPR to hijack the host
interrupt vectors while running the guest, which allows us to direct all
interrupts to kvmppc_handle_interrupt(). At this point, we can do one of
three things:
- handle the interrupt completely (e.g. emulate "mtspr SPRG0"), or
- let the host interrupt handler run (e.g. when the decrementer fires), or
- return to host userspace (e.g. when the guest performs device MMIO)

Address spaces: We take advantage of the fact that Linux doesn't use the AS=1
address space (in host or guest), which gives us virtual address space to use
for guest mappings. While the guest is running, the host kernel remains mapped
in AS=0, but the guest can only use AS=1 mappings.

TLB entries: The TLB entries covering the host linear mapping remain
present while running the guest. This reduces the overhead of lightweight
exits, which are handled by KVM running in the host kernel.
We keep three copies of the TLB:
 - guest TLB: contents of the TLB as the guest sees it
 - shadow TLB: the TLB that is actually in hardware while the guest is running
 - host TLB: to restore TLB state when context switching guest -> host
When a TLB miss occurs because a mapping was not present in the shadow TLB,
but was present in the guest TLB, KVM handles the fault without invoking the
guest. Large guest pages are backed by multiple 4KB shadow pages through this
mechanism.

IO: MMIO and DCR accesses are emulated by userspace. We use virtio for network
and block IO, so those drivers must be enabled in the guest. It's possible
that some qemu device emulation (e.g. e1000 or rtl8139) may also work with
little effort.
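
Example (exit dispatch): The three outcomes listed for
kvmppc_handle_interrupt() in the code flow notes can be pictured as a small
dispatch function. This is only an illustrative sketch, not the KVM source:
the enum names and dispatch_exit() are hypothetical, and the real handler
distinguishes many more interrupt types.

```c
#include <assert.h>

/* Hypothetical exit reasons, one per example given in the notes. */
enum exit_reason {
	EXIT_PRIV_INSN,    /* guest ran a privileged insn, e.g. "mtspr SPRG0" */
	EXIT_DECREMENTER,  /* host decrementer fired while the guest was running */
	EXIT_MMIO,         /* guest touched emulated device memory */
};

/* Hypothetical outcomes, matching the three cases in the notes. */
enum resume_action {
	RESUME_GUEST,      /* handled completely inside the host kernel */
	RESUME_HOST_IRQ,   /* let the host interrupt handler run */
	RESUME_USERSPACE,  /* return from the "run" ioctl to host userspace */
};

/* Map an exit reason to what happens next. */
enum resume_action dispatch_exit(enum exit_reason why)
{
	switch (why) {
	case EXIT_PRIV_INSN:
		return RESUME_GUEST;
	case EXIT_DECREMENTER:
		return RESUME_HOST_IRQ;
	case EXIT_MMIO:
		return RESUME_USERSPACE;
	}
	assert(!"unknown exit reason");
	return RESUME_USERSPACE;
}
```

Only the last case leaves the kernel; the first two are the lightweight exits
that benefit from keeping the host linear mapping in the TLB.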
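
Example (address spaces): The AS=0/AS=1 split described under "Address
spaces" relies on each TLB entry carrying an address-space bit that must
match the current translation space. The sketch below models only that
comparison. struct tlbe and tlbe_matches() are hypothetical names, and real
entries also match on fields that are left out here, such as the process ID
and page size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified TLB entry: only the fields needed to show the AS check. */
struct tlbe {
	bool valid;
	unsigned ts;    /* translation space of the entry: 0 or 1 */
	uint32_t epn;   /* effective page number */
};

/* An entry hits only if its space equals the space currently in use. */
bool tlbe_matches(const struct tlbe *e, unsigned as, uint32_t epn)
{
	assert(as <= 1);
	return e->valid && e->ts == as && e->epn == epn;
}
```

With the same page number present in both spaces, a lookup made in AS=1 sees
only the guest mapping, while the host kernel's AS=0 entry stays resident but
unreachable from the guest.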
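
Example (shadow TLB miss): The shadow TLB miss path described under "TLB
entries" can be sketched as follows. All structure and function names are
hypothetical. The sketch assumes that a miss with no matching guest TLB
entry is delivered to the guest, and it stops at the guest physical address:
the further translation to a host physical page is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SHADOW_PAGE_SIZE 4096u  /* shadow mappings are always 4KB */

struct guest_tlbe {
	bool valid;
	uint32_t eaddr;   /* effective base address, aligned to size */
	uint32_t size;    /* page size in bytes (may be larger than 4KB) */
	uint32_t gpaddr;  /* guest physical base address */
};

struct shadow_tlbe {
	bool valid;
	uint32_t eaddr;   /* 4KB-aligned effective address */
	uint32_t gpaddr;  /* guest physical address of that 4KB page */
};

enum miss_result { MISS_FIXED_BY_KVM, MISS_REFLECT_TO_GUEST };

/* Find the guest TLB entry covering ea, or NULL if the guest has none. */
static const struct guest_tlbe *guest_lookup(const struct guest_tlbe *gtlb,
					     size_t n, uint32_t ea)
{
	for (size_t i = 0; i < n; i++)
		if (gtlb[i].valid && ea - gtlb[i].eaddr < gtlb[i].size)
			return &gtlb[i];
	return NULL;
}

/* On a shadow miss, build one 4KB shadow entry from the guest's mapping. */
enum miss_result handle_shadow_miss(const struct guest_tlbe *gtlb, size_t n,
				    uint32_t ea, struct shadow_tlbe *out)
{
	const struct guest_tlbe *g = guest_lookup(gtlb, n, ea);
	uint32_t page_ea = ea & ~(SHADOW_PAGE_SIZE - 1);

	assert(out != NULL);
	if (!g)
		return MISS_REFLECT_TO_GUEST;  /* assumed: guest handles it */

	out->valid = true;
	out->eaddr = page_ea;
	out->gpaddr = g->gpaddr + (page_ea - g->eaddr);
	return MISS_FIXED_BY_KVM;
}
```

Because each miss installs only the 4KB page that was touched, a single 16KB
guest page is filled in by up to four separate shadow misses.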