History log of /linux-master/arch/x86/kvm/i8259.c
Revision Date Author Comments
# 0d42522b 18-Apr-2023 Jinliang Zheng <alexjlzheng@tencent.com>

KVM: x86: Fix poll command

According to the hardware manual, when the Poll command is issued, the
byte returned by the I/O read has Bit 7 set when there is an interrupt
and carries the binary code of the highest-priority request in Bits 2:0.
The current PIC emulation code does not strictly follow this
description.

Fix the implementation of pic_poll_read(), set Bit 7 when there is an
interrupt.
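
The byte layout described above can be sketched in plain C. This is a
minimal illustration only: poll_byte() is a hypothetical helper, not the
kernel's pic_poll_read(), and the value returned when nothing is pending
is an assumption of this sketch.

```c
#include <stdint.h>

/*
 * Hypothetical helper: build the byte a read returns after a Poll command.
 * `irq` is the highest-priority pending request (0..7), or negative when
 * nothing is pending.
 */
static uint8_t poll_byte(int irq)
{
	if (irq < 0)
		return 0x00;	/* Bit 7 clear: no interrupt (low bits are this sketch's choice) */

	/* Bit 7 set to flag an interrupt, Bits 2:0 carry the request level. */
	return (uint8_t)(0x80 | (irq & 0x07));
}
```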

Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
Link: https://lore.kernel.org/r/20230419021924.1342184-1-alexjlzheng@tencent.com
Signed-off-by: Sean Christopherson <seanjc@google.com>


# 8d20bd63 30-Nov-2022 Sean Christopherson <seanjc@google.com>

KVM: x86: Unify pr_fmt to use module name for all KVM modules

Define pr_fmt using KBUILD_MODNAME for all KVM x86 code so that printks
use consistent formatting across common x86, Intel, and AMD code. In
addition to providing consistent print formatting, using KBUILD_MODNAME,
e.g. kvm_amd and kvm_intel, allows referencing SVM and VMX (and SEV and
SGX and ...) as technologies without generating weird messages, and
without causing naming conflicts with other kernel code, e.g. "SEV: ",
"tdx: ", "sgx: " etc.. are all used by the kernel for non-KVM subsystems.

Opportunistically move away from printk() for prints that need to be
modified anyways, e.g. to drop a manual "kvm: " prefix.

Opportunistically convert a few SGX WARNs that are similarly modified to
WARN_ONCE; in the very unlikely event that the WARNs fire, odds are good
that they would fire repeatedly and spam the kernel log without providing
unique information in each print.

Note, defining pr_fmt yields undesirable results for code that uses KVM's
printk wrappers, e.g. vcpu_unimpl(). But, that's a pre-existing problem
as SVM/kvm_amd already defines a pr_fmt, and thankfully use of KVM's
wrappers is relatively limited in KVM x86 code.
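
The pr_fmt definition described above can be tried out in userspace with
a small sketch. KBUILD_MODNAME is normally supplied by kbuild; it is
hard-coded here as an assumption, and format_unsupported() is a
hypothetical stand-in for a pr_*() call that writes to a buffer instead
of the kernel log.

```c
#include <stdio.h>

/* Assumption: kbuild would normally define this; hard-coded for the sketch. */
#define KBUILD_MODNAME "kvm_amd"

/* Prefix every format string with the module name via literal concatenation. */
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

/* Hypothetical stand-in for a pr_*() call: formats into buf. */
static int format_unsupported(char *buf, size_t size, int feature)
{
	return snprintf(buf, size, pr_fmt("feature %d unsupported\n"), feature);
}
```

Because the prefix comes from the module name, a message about a
technology such as SEV needs no hand-written "SEV: " tag of its own.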

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Message-Id: <20221130230934.1014142-35-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>


# fe3787a0 01-Mar-2022 Like Xu <likexu@tencent.com>

KVM: x86/i8259: Remove a dead store of irq in a conditional block

The [clang-analyzer-deadcode.DeadStores] helper reports
that the value stored to 'irq' is never read.

Signed-off-by: Like Xu <likexu@tencent.com>
Message-Id: <20220301120217.38092-1-likexu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>


# 1f2e66f0 25-Jan-2022 Jinrong Liang <cloudliang@tencent.com>

KVM: x86/i8259: Remove unused "addr" of elcr_ioport_{read,write}()

The "u32 addr" parameter of elcr_ioport_write() and elcr_ioport_read()
is not used, so remove it. No functional change intended.

Signed-off-by: Jinrong Liang <cloudliang@tencent.com>
Message-Id: <20220125095909.38122-13-cloudliang@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>


# 46808a4c 16-Nov-2021 Marc Zyngier <maz@kernel.org>

KVM: Use 'unsigned long' as kvm_for_each_vcpu()'s index

Everywhere we use kvm_for_each_vcpu(), we use an int as the vcpu
index. Unfortunately, we're about to rework the iterator,
which requires this to be upgraded to an unsigned long.

Let's bite the bullet and repaint all of it in one go.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Message-Id: <20211116160403.4074052-7-maz@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>


# 34739a28 19-Jul-2021 Maciej W. Rozycki <macro@orcam.me.uk>

x86: Fix typo s/ECLR/ELCR/ for the PIC register

The proper spelling for the acronym referring to the Edge/Level Control
Register is ELCR rather than ECLR. Adjust references accordingly. No
functional change.

Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/alpine.DEB.2.21.2107200251080.9461@angie.orcam.me.uk


# 14e32321 11-Dec-2019 Marios Pomonis <pomonis@google.com>

KVM: x86: Refactor picdev_write() to prevent Spectre-v1/L1TF attacks

This fixes a Spectre-v1/L1TF vulnerability in picdev_write().
It replaces index computations based on the (attacker-controlled) port
number with constants through a minor refactoring.
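
A minimal sketch of the idea, assuming the conventional 8259 port
numbers (0x20/0x21 for the master, 0xa0/0xa1 for the slave).
pic_index_for_port() is a hypothetical name, not the kernel function;
the point is that the array index is a constant chosen by the switch
rather than a value computed from the guest-supplied port.

```c
/*
 * Hypothetical sketch: map a guest-chosen port to a PIC index using
 * constants only. Returns 0 (master), 1 (slave), or -1 for a port that
 * this sketch does not handle.
 */
static int pic_index_for_port(unsigned int port)
{
	switch (port) {
	case 0x20:
	case 0x21:
		return 0;	/* constant index, nothing derived from port */
	case 0xa0:
	case 0xa1:
		return 1;
	default:
		return -1;
	}
}
```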

Fixes: 85f455f7ddbe ("KVM: Add support for in-kernel PIC emulation")

Signed-off-by: Nick Finco <nifi@google.com>
Signed-off-by: Marios Pomonis <pomonis@google.com>
Reviewed-by: Andrew Honig <ahonig@google.com>
Cc: stable@vger.kernel.org
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>


# 254272ce 11-Feb-2019 Ben Gardon <bgardon@google.com>

kvm: x86: Add memcg accounting to KVM allocations

There are many KVM kernel memory allocations which are tied to the life of
the VM process and should be charged to the VM process's cgroup. If the
allocations aren't tied to the process, the OOM killer will not know
that killing the process will free the associated kernel memory.
Add __GFP_ACCOUNT flags to many of the allocations which are not yet being
charged to the VM process's cgroup.

Tested:
Ran all kvm-unit-tests on a 64 bit Haswell machine, the patch
introduced no new failures.
Ran a kernel memory accounting test which creates a VM to touch
memory and then checks that the kernel memory allocated for the
process is within certain bounds.
With this patch we account for much more of the vmalloc and slab memory
allocated for the VM.

There remain a few allocations which should be charged to the VM's
cgroup but are not. In x86, they include:
vcpu->arch.pio_data
These allocations are unaccounted in this patch because they are mapped
to userspace, and accounting them to a cgroup causes problems. This
should be addressed in a future patch.

Signed-off-by: Ben Gardon <bgardon@google.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>


# b5e7cf52 07-Apr-2017 David Hildenbrand <david@redhat.com>

KVM: x86: simplify pic_ioport_read()

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>


# 84a5c79e 07-Apr-2017 David Hildenbrand <david@redhat.com>

KVM: x86: set data directly in picdev_read()

Now it looks almost like picdev_write().

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>


# 9fecaa9e 07-Apr-2017 David Hildenbrand <david@redhat.com>

KVM: x86: drop picdev_in_range()

We already have the exact same checks a couple of lines below.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>


# dc24d1d2 07-Apr-2017 David Hildenbrand <david@redhat.com>

KVM: x86: make kvm_pic_reset() static

Not used outside of i8259.c, so let's make it static.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>


# e21d1758 07-Apr-2017 David Hildenbrand <david@redhat.com>

KVM: x86: simplify pic_unlock()

We can easily compact this code and get rid of one local variable.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>


# 49f520b9 07-Apr-2017 David Hildenbrand <david@redhat.com>

KVM: x86: push usage of slots_lock down

Let's just move it to the place where it is actually needed.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>


# 90bca052 07-Apr-2017 David Hildenbrand <david@redhat.com>

KVM: x86: get rid of pic_irqchip()

It seemed like a nice idea to encapsulate access to kvm->arch.vpic. But
as the usage is already mixed, internal locks are taken outside of i8259.c
and grepping for "vpic" only is much easier, let's just get rid of
pic_irqchip().

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>


# 950712eb 15-Mar-2017 Peter Xu <peterx@redhat.com>

KVM: x86: check existence before destroy

Mostly used for split irqchip mode. In that case, these two things are
not inited at all, so no need to release.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>


# 09941366 16-Dec-2016 Radim Krčmář <rkrcmar@redhat.com>

KVM: x86: make pic setup code look like ioapic setup

We don't treat kvm->arch.vpic specially anymore, so the setup can look
like ioapic. This gets a bit more information out of return values.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>


# 71ba994c 28-Jul-2015 Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: clean/fix memory barriers in irqchip_in_kernel

The memory barriers are trying to protect against concurrent RCU-based
interrupt injection, but the IRQ routing table is not valid at the time
kvm->arch.vpic is written. Fix this by writing kvm->arch.vpic last.
kvm_destroy_pic then need not set kvm->arch.vpic to NULL; modify it
to take a struct kvm_pic* and reuse it if the IOAPIC creation fails.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>


# 2dccb4cd 10-Mar-2015 Petr Matousek <pmatouse@redhat.com>

kvm: x86: i8259: return initialized data on invalid-size read

If data is read from PIC with invalid access size, the return data stays
uninitialized even though success is returned.

Fix this by always initializing the data.
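
The shape of such a fix can be sketched as follows. pic_read_sketch()
and its parameters are hypothetical and only illustrate the pattern of
zeroing the output buffer before reporting success on an unsupported
access size.

```c
#include <string.h>

/*
 * Hypothetical read handler: `val` receives `len` bytes, `reg_value` is
 * the register content a valid one-byte read would return.
 */
static int pic_read_sketch(unsigned int len, void *val, unsigned char reg_value)
{
	if (len != 1) {
		memset(val, 0, len);	/* never hand back uninitialized bytes */
		return 0;		/* the access is still reported as handled */
	}

	*(unsigned char *)val = reg_value;
	return 0;
}
```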

Signed-off-by: Petr Matousek <pmatouse@redhat.com>
Reported-by: Nadav Amit <nadav.amit@gmail.com>
Message-Id: <20150311111609.GG8544@dhcp-25-225.brq.redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>


# e32edf4f 26-Mar-2015 Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>

KVM: Redesign kvm_io_bus_ API to pass VCPU structure to the callbacks.

This is needed in e.g. ARM vGIC emulation, where the MMIO handling
depends on the VCPU that does the access.

Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>


# c1a6bff2 10-Mar-2015 Petr Matousek <pmatouse@redhat.com>

kvm: x86: i8259: return initialized data on invalid-size read

If data is read from PIC with invalid access size, the return data stays
uninitialized even though success is returned.

Fix this by always initializing the data.

Signed-off-by: Petr Matousek <pmatouse@redhat.com>
Reported-by: Nadav Amit <nadav.amit@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>


# f3200d00 10-Dec-2012 Gleb Natapov <gleb@redhat.com>

KVM: inject ExtINT interrupt before APIC interrupts

According to Intel SDM Volume 3 Section 10.8.1 "Interrupt Handling with
the Pentium 4 and Intel Xeon Processors" and Section 10.8.2 "Interrupt
Handling with the P6 Family and Pentium Processors" ExtINT interrupts are
sent directly to the processor core for handling. Currently KVM checks
APIC before it considers ExtINT interrupts for injection which is
backwards from the spec. Make code behave according to the SDM.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Acked-by: "Zhang, Yang Z" <yang.z.zhang@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>


# 749c59fd 30-Aug-2012 Jamie Iles <jamie@jamieiles.com>

KVM: PIC: fix use of uninitialised variable.

Commit aea218f3cbbc (KVM: PIC: call ack notifiers for irqs that are
dropped form irr) used an uninitialised variable to track whether an
appropriate apic had been found. This could result in calling the ack
notifier incorrectly.

Cc: Gleb Natapov <gleb@redhat.com>
Cc: Avi Kivity <avi@redhat.com>
Signed-off-by: Jamie Iles <jamie@jamieiles.com>
Signed-off-by: Avi Kivity <avi@redhat.com>


# ec798660 03-Sep-2012 Gleb Natapov <gleb@redhat.com>

KVM: cleanup pic reset

kvm_pic_reset() is not used anywhere. Move reset logic from
pic_ioport_write() there.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>


# 28a6fdab 14-Aug-2012 Michael S. Tsirkin <mst@redhat.com>

KVM: x86: drop parameter validation in ioapic/pic

We validate irq pin number when routing is setup, so
code handling illegal irq # in pic and ioapic on each injection
is never called.
Drop it, replace with BUG_ON to catch out of bounds access bugs.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>


# aea218f3 26-Jul-2012 Gleb Natapov <gleb@redhat.com>

KVM: PIC: call ack notifiers for irqs that are dropped form irr

After commit 242ec97c358256 PIT interrupts are no longer delivered after
PIC reset. It happens because PIT injects interrupt only if previous one
was acked, but since on PIC reset it is dropped from irr it will never
be delivered and hence acknowledged. Fix that by calling ack notifier on
PIC reset.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>


# 1a577b72 19-Jul-2012 Michael S. Tsirkin <mst@redhat.com>

KVM: fix race with level interrupts

When more than 1 source id is in use for the same GSI, we have the
following race in irq_states handling:

CPU 0 clears bit 0. CPU 0 reads irq_state as 0. CPU 1 sets level to 1.
CPU 1 calls kvm_ioapic_set_irq(1). CPU 0 calls kvm_ioapic_set_irq(0).
Now ioapic thinks the level is 0 but irq_state is not 0.

Fix by performing all irq_states bitmap handling under pic/ioapic lock.
This also removes the need for atomics with irq_states handling.
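
As a rough illustration, the per-GSI state can be modelled as a bitmap
with one bit per source id, where the line level is the logical OR of
all bits. irq_line_state() is a hypothetical userspace helper; the
sketch assumes the caller already holds the pic/ioapic lock, which is
why plain (non-atomic) bit operations are enough.

```c
/*
 * Hypothetical sketch: update the bit for `source_id` and return the
 * resulting line level. Caller is assumed to hold the pic/ioapic lock.
 */
static int irq_line_state(unsigned long *irq_state, int source_id, int level)
{
	if (level)
		*irq_state |= 1UL << source_id;
	else
		*irq_state &= ~(1UL << source_id);

	/* The line stays asserted while any source still drives it. */
	return *irq_state != 0;
}
```

Since the bitmap update and the resulting call into the irqchip happen
under the same lock, the interleaving described above cannot leave the
irqchip with a level that disagrees with irq_state.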

Reported-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>


# 242ec97c 24-Jan-2012 Gleb Natapov <gleb@redhat.com>

KVM: x86: reset edge sense circuit of i8259 on init

The spec says that during initialization "The edge sense circuit is
reset which means that following initialization an interrupt request
(IR) input must make a low-to-high transition to generate an interrupt",
but currently if edge triggered interrupt is in IRR it is delivered
after i8259 initialization.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>


# d546cb40 14-Dec-2011 Gleb Natapov <gleb@redhat.com>

KVM: drop bsp_vcpu pointer from kvm struct

Drop bsp_vcpu pointer from kvm struct since its only use is incorrect
anyway.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>


# bd80158a 12-Sep-2011 Jan Kiszka <jan.kiszka@siemens.com>

KVM: Clean up and extend rate-limited output

The use of printk_ratelimit is discouraged, replace it with
pr*_ratelimited or __ratelimit. While at it, convert remaining
guest-triggerable printks to rate-limited variants.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>


# 743eeb0b 27-Jul-2011 Sasha Levin <levinsasha928@gmail.com>

KVM: Intelligent device lookup on I/O bus

Currently the method of dealing with an IO operation on a bus (PIO/MMIO)
is to call the read or write callback for each device registered
on the bus until we find a device which handles it.

Since the number of devices on a bus can be significant due to ioeventfds
and coalesced MMIO zones, this leads to a lot of overhead on each IO
operation.

Instead of registering devices, we now register ranges which points to
a device. Lookup is done using an efficient bsearch instead of a linear
search.
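
A minimal userspace sketch of range lookup with the C library's
bsearch(). struct io_range, range_cmp() and find_dev() are hypothetical
names; the sketch assumes the ranges are sorted by address and do not
overlap.

```c
#include <stdlib.h>

/* Hypothetical range record: [addr, addr + len) handled by dev_id. */
struct io_range {
	unsigned long addr;
	unsigned long len;
	int dev_id;
};

/* Returns 0 when the key access lies entirely inside the range. */
static int range_cmp(const void *key, const void *elem)
{
	const struct io_range *k = key;
	const struct io_range *r = elem;

	if (k->addr < r->addr)
		return -1;
	if (k->addr + k->len > r->addr + r->len)
		return 1;
	return 0;
}

/* Look up the device for an access; -1 if no registered range covers it. */
static int find_dev(const struct io_range *ranges, size_t n,
		    unsigned long addr, unsigned long len)
{
	struct io_range key = { addr, len, -1 };
	const struct io_range *hit;

	hit = bsearch(&key, ranges, n, sizeof(*ranges), range_cmp);
	return hit ? hit->dev_id : -1;
}
```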

Performance test was conducted by comparing exit count per second with
200 ioeventfds created on one byte and the guest is trying to access a
different byte continuously (triggering usermode exits).
Before the patch the guest has achieved 259k exits per second, after the
patch the guest does 274k exits per second.

Cc: Avi Kivity <avi@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>


# 7049467b 08-Feb-2011 Gleb Natapov <gleb@redhat.com>

KVM: remove isr_ack logic from PIC

isr_ack logic was added by e48258009d to avoid unnecessary IPIs. Back
then it made sense, but now the code checks that vcpu is ready to accept
interrupt before sending IPI, so this logic is no longer needed. The
patch removes it.

Fixes a regression with Debian/Hurd.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Reported-and-tested-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>


# d0dfc6b7 31-Dec-2010 Avi Kivity <avi@redhat.com>

KVM: i8259: initialize isr_ack

isr_ack is never initialized. So, until the first PIC reset, interrupts
may fail to be injected. This can cause Windows XP to fail to boot, as
reported in the fallout from the fix to
https://bugzilla.kernel.org/show_bug.cgi?id=21962.

Reported-and-tested-by: Nicolas Prochazka <prochazka.nicolas@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>


# 9611c187 06-Oct-2010 Nicolas Kaiser <nikai@nikai.net>

KVM: fix typo in copyright notice

Fix typo in copyright notice.

Signed-off-by: Nicolas Kaiser <nikai@nikai.net>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>


# f4f51050 19-Sep-2010 Avi Kivity <avi@redhat.com>

KVM: Convert PIC lock from raw spinlock to ordinary spinlock

The PIC code used to be called from preempt_disable() context, which
wasn't very good for PREEMPT_RT. That is no longer the case, so move
back from raw_spinlock_t to spinlock_t.

Signed-off-by: Avi Kivity <avi@redhat.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>


# 3842d135 26-Jul-2010 Avi Kivity <avi@redhat.com>

KVM: Check for pending events before attempting injection

Instead of blindly attempting to inject an event before each guest entry,
check for a possible event first in vcpu->requests. Sites that can trigger
event injection are modified to set KVM_REQ_EVENT:

- interrupt, nmi window opening
- ppr updates
- i8259 output changes
- local apic irr changes
- rflags updates
- gif flag set
- event set on exit

This improves non-injecting entry performance, and sets the stage for
non-atomic injection.

Signed-off-by: Avi Kivity <avi@redhat.com>


# 9ed049c3 29-Aug-2010 Avi Kivity <avi@redhat.com>

KVM: i8259: Make ICW1 conform to spec

ICW is not a full reset, instead it resets a limited number of registers
in the PIC. Change ICW1 emulation to only reset those registers.

Signed-off-by: Avi Kivity <avi@redhat.com>


# ae0635b3 27-Jul-2010 Avi Kivity <avi@redhat.com>

KVM: fix i8259 oops when no vcpus are online

If there are no vcpus, found will be NULL. Check before doing anything with
it.

Signed-off-by: Avi Kivity <avi@redhat.com>


# 9195c4da 14-Jul-2010 Gleb Natapov <gleb@redhat.com>

KVM: x86: Call mask notifiers from pic

If the PIT delivers an interrupt while the PIC is masking it, the OS will
never do an EOI and the ack notifier will not be called, so once the PIT
is unmasked no more PIT interrupts will be delivered. Calling mask
notifiers solves this issue.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>


# 529df65e 21-Jun-2010 Chris Lalancette <clalance@redhat.com>

KVM: Search the LAPIC's for one that will accept a PIC interrupt

Older versions of 32-bit linux have a "Checking 'hlt' instruction"
test where they repeatedly call the 'hlt' instruction, and then
expect a timer interrupt to kick the CPU out of halt. This happens
before any LAPIC or IOAPIC setup happens, which means that all of
the APIC's are in virtual wire mode at this point. Unfortunately,
the current implementation of virtual wire mode is hardcoded to
only kick the BSP, so if a crash+kexec occurs on a different
vcpu, it will never get kicked.

This patch makes pic_unlock() do the equivalent of
kvm_irq_delivery_to_apic() for the IOAPIC code. That is, it runs
through all of the vcpus looking for one that is in virtual wire
mode. In the normal case where LAPICs and IOAPICs are configured,
this won't be used at all. In the bootstrap phase of a modern
OS, before the LAPICs and IOAPICs are configured, this will have
exactly the same behavior as today; VCPU0 is always looked at
first, so it will always get out of the loop after the first
iteration. This will only go through the loop more than once
during a kexec/kdump, in which case it will only do it a few times
until the kexec'ed kernel programs the LAPIC and IOAPIC.
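
The search described above can be sketched as a simple first-match
loop. struct vcpu_sketch and its accepts_pic_intr flag are hypothetical
stand-ins for the real vcpu state and the virtual wire mode check.

```c
#include <stddef.h>

/* Hypothetical vcpu record; the flag models "LAPIC accepts PIC interrupts". */
struct vcpu_sketch {
	int id;
	int accepts_pic_intr;
};

/* Return the first vcpu willing to take a PIC interrupt, or NULL if none. */
static const struct vcpu_sketch *find_pic_target(const struct vcpu_sketch *vcpus,
						 size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (vcpus[i].accepts_pic_intr)
			return &vcpus[i];	/* vcpu 0 is checked first */
	}
	return NULL;	/* callers must handle the no-match case */
}
```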

Signed-off-by: Chris Lalancette <clalance@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>


# 36633f32 03-May-2010 Avi Kivity <avi@redhat.com>

KVM: i8259: simplify pic_irq_request() calling sequence

Signed-off-by: Avi Kivity <avi@redhat.com>


# 073d4613 03-May-2010 Avi Kivity <avi@redhat.com>

KVM: i8259: reduce excessive abstraction for pic_irq_request()

Part of the i8259 code pretends it isn't part of kvm, but we know better.
Reduce excessive abstraction, eliminating callbacks and void pointers.

Signed-off-by: Avi Kivity <avi@redhat.com>


# 221d059d 23-May-2010 Avi Kivity <avi@redhat.com>

KVM: Update Red Hat copyrights

Signed-off-by: Avi Kivity <avi@redhat.com>


# 50a085bd 24-Feb-2010 Jan Kiszka <jan.kiszka@siemens.com>

KVM: x86: Kick VCPU outside PIC lock again

This restores the deferred VCPU kicking before 956f97cf. We need this
over -rt as wake_up* requires non-atomic context in this configuration.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>


# 5a0e3ad6 24-Mar-2010 Tejun Heo <tj@kernel.org>

include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities to include those
headers directly instead of assuming availability. As this conversion
needs to touch a large number of source files, the following script is
used as the basis of conversion.

http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the following.

* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
blocks and tries to put the new include such that its order conforms
to its surroundings. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.

2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
widely available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.

6. percpu.h was updated not to include slab.h.

7. Build tests were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).

* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>


# fa8273e9 17-Feb-2010 Thomas Gleixner <tglx@linutronix.de>

KVM: Convert i8254/i8259 locks to raw_spinlocks

The i8254/i8259 locks need to be real spinlocks on preempt-rt. Convert
them to raw_spinlock. No change for !RT kernels.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Avi Kivity <avi@redhat.com>


# 72bb2fcd 08-Feb-2010 Wei Yongjun <yjwei@cn.fujitsu.com>

KVM: cleanup the failure path of KVM_CREATE_IRQCHIP ioctrl

If we fail to init the ioapic device or fail to set up the default irq
routing, the devices registered by kvm_create_pic() and kvm_ioapic_init()
remain registered. This patch unregisters them on the failure path.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>


# 79fac95e 23-Dec-2009 Marcelo Tosatti <mtosatti@redhat.com>

KVM: convert slots_lock to a mutex

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>


# e93f8a0f 23-Dec-2009 Marcelo Tosatti <mtosatti@redhat.com>

KVM: convert io_bus to SRCU

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>


# eba0226b 24-Aug-2009 Gleb Natapov <gleb@redhat.com>

KVM: Move IO APIC to its own lock

This allows removal of irq_lock from the injection path.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>


# 79c727d4 24-Aug-2009 Gleb Natapov <gleb@redhat.com>

KVM: Call pic_clear_isr() on pic reset to reuse logic there

Also move call of ack notifiers after pic state change.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>


# 88ba63c2 04-Aug-2009 Gleb Natapov <gleb@redhat.com>

KVM: Replace pic_lock()/pic_unlock() with direct call to spinlock functions

They are not doing anything else now.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>


# 938396a2 04-Aug-2009 Gleb Natapov <gleb@redhat.com>

KVM: Call ack notifiers from PIC when guest OS acks an IRQ.

Currently they are called when the irq vector is being delivered. Calling
ack notifiers at this point is wrong. The device assignment ack notifier
enables host interrupts, but the guest has not yet had a chance to clear
the interrupt condition in the device.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>


# 956f97cf 04-Aug-2009 Gleb Natapov <gleb@redhat.com>

KVM: Call kvm_vcpu_kick() inside pic spinlock

d5ecfdd25 moved it out because back then it was impossible to
call it inside a spinlock. This restriction no longer exists.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>


# 090b7aff 07-Jul-2009 Gregory Haskins <ghaskins@novell.com>

KVM: make io_bus interface more robust

Today kvm_io_bus_register_dev() returns void and will internally BUG_ON
if it fails. We want to create dynamic MMIO/PIO entries driven from
userspace later in the series, so we need to enhance the code to be more
robust with the following changes:

1) Add a return value to the registration function
2) Fix up all the callsites to check the return code, handle any
failures, and percolate the error up to the caller.
3) Add an unregister function that collapses holes in the array

Signed-off-by: Gregory Haskins <ghaskins@novell.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>


# 1000ff8d 07-Jul-2009 Gleb Natapov <gleb@redhat.com>

KVM: Add trace points in irqchip code

Add tracepoint in msi/ioapic/pic set_irq() functions,
in IPI sending and in the point where IRQ is placed into
apic's IRR.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>


# bda9020e 29-Jun-2009 Michael S. Tsirkin <mst@redhat.com>

KVM: remove in_range from io devices

This changes bus accesses to use high-level kvm_io_bus_read/kvm_io_bus_write
functions. in_range now becomes unused so it is removed from device ops in
favor of read/write callbacks performing range checks internally.

This allows aliasing (mostly for in-kernel virtio), as well as better error
handling by making it possible to pass errors up to userspace.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>


# 6c474694 29-Jun-2009 Michael S. Tsirkin <mst@redhat.com>

KVM: convert bus to slots_lock

Use slots_lock to protect device list on the bus. slots_lock is already
taken for read everywhere, so we only need to take it for write when
registering devices. This is in preparation to removing in_range and
kvm->lock around it.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>


# c5af89b6 09-Jun-2009 Gleb Natapov <gleb@redhat.com>

KVM: Introduce kvm_vcpu_is_bsp() function.

Use it instead of open-coding the "vcpu_id zero is BSP" assumption.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>


# 9f4cc127 04-Jun-2009 Marcelo Tosatti <mtosatti@redhat.com>

KVM: Grab pic lock in kvm_pic_clear_isr_ack

isr_ack is protected by kvm_pic->lock.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>


# d76685c4 31-May-2009 Gregory Haskins <ghaskins@novell.com>

KVM: cleanup io_device code

We modernize the io_device code so that we use container_of() instead of
dev->private, and move the vtable to a separate ops structure
(theoretically allows better caching for multiple instances of the same
ops structure)
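
For readers unfamiliar with the pattern, here is a userspace sketch of
container_of(). The macro below is a simplified version built on
offsetof(), and struct io_device and struct pic_sketch are hypothetical
types used only for illustration.

```c
#include <stddef.h>

/* Simplified container_of(): recover the enclosing struct from a member pointer. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical generic device handle embedded in a device-specific struct. */
struct io_device {
	int id;
};

struct pic_sketch {
	int irq_base;
	struct io_device dev;	/* embedded, so no private pointer is needed */
};

/* Convert the generic handle passed to a callback back to its owner. */
static struct pic_sketch *to_pic(struct io_device *dev)
{
	return container_of(dev, struct pic_sketch, dev);
}
```

Embedding the generic struct avoids both the extra pointer field and
the untyped cast that a private pointer requires.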

Signed-off-by: Gregory Haskins <ghaskins@novell.com>
Acked-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Avi Kivity <avi@redhat.com>


# d7364a29 20-Feb-2009 Hannes Eder <hannes@hanneseder.net>

KVM: fix sparse warnings: context imbalance

Impact: Attribute function with __acquires(...) resp. __releases(...).

Fix these sparse warnings:
arch/x86/kvm/i8259.c:34:13: warning: context imbalance in 'pic_lock' - wrong count at exit
arch/x86/kvm/i8259.c:39:13: warning: context imbalance in 'pic_unlock' - unexpected unlock

Signed-off-by: Hannes Eder <hannes@hanneseder.net>
Signed-off-by: Avi Kivity <avi@redhat.com>


# 4925663a 04-Feb-2009 Gleb Natapov <gleb@redhat.com>

KVM: Report IRQ injection status to userspace.

IRQ injection status is either -1 (if no CPU was found that should
accept the interrupt, because the IRQ was masked or the ioapic was
misconfigured or ...) or >= 0, in which case the number indicates
how many CPUs the interrupt was injected into.
If the value is 0 it means that the interrupt was coalesced
and probably should be reinjected.
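
A caller could interpret the status along these lines. The enum and
classify_injection() are hypothetical names for this sketch and are not
part of the KVM userspace API.

```c
/* Hypothetical classification of the injection status described above. */
enum inject_result {
	INJ_NOT_DELIVERED,	/* status < 0: masked, misconfigured, ... */
	INJ_COALESCED,		/* status == 0: probably should be reinjected */
	INJ_DELIVERED		/* status > 0: number of CPUs that got it */
};

static enum inject_result classify_injection(int status)
{
	if (status < 0)
		return INJ_NOT_DELIVERED;
	if (status == 0)
		return INJ_COALESCED;
	return INJ_DELIVERED;
}
```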

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>


# 44882eed 27-Jan-2009 Marcelo Tosatti <mtosatti@redhat.com>

KVM: make irq ack notifications aware of routing table

IRQ ack notifications assume an identity mapping between pin->gsi,
which might not be the case with, for example, HPET.

Translate before acking.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Acked-by: Gleb Natapov <gleb@redhat.com>


# 3f353858 21-Dec-2008 Avi Kivity <avi@qumranet.com>

KVM: Add locking to virtual i8259 interrupt controller

While most accesses to the i8259 are with the kvm mutex taken, the call
to kvm_pic_read_irq() is not. We can't easily take the kvm mutex there
since the function is called with interrupts disabled.

Fix by adding a spinlock to the virtual interrupt controller. Since we
can't send an IPI under the spinlock (we also take the same spinlock in
an irq disabled context), we defer the IPI until the spinlock is released.
Similarly, we defer irq ack notifications until after spinlock release to
avoid lock recursion.
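
The deferral described above can be sketched in userspace as follows.
This is a single-threaded model under stated assumptions: a bool stands
in for the spinlock, demo_kick_vcpu() stands in for sending the IPI, and
every demo_* name is hypothetical.

```c
#include <stdbool.h>

struct demo_locked_pic {
	bool locked;		/* stands in for the spinlock */
	bool wakeup_needed;	/* set under the lock, consumed at unlock */
	int kicks;		/* wakeups sent in total */
	int kicks_under_lock;	/* wakeups sent while "locked"; should stay 0 */
};

/* Stand-in for the IPI; records whether it ran with the lock held. */
static void demo_kick_vcpu(struct demo_locked_pic *s)
{
	if (s->locked)
		s->kicks_under_lock++;
	s->kicks++;
}

static void demo_pic_lock(struct demo_locked_pic *s)
{
	s->locked = true;
}

static void demo_pic_unlock(struct demo_locked_pic *s)
{
	/* Snapshot and clear the request while still holding the lock. */
	bool wakeup = s->wakeup_needed;

	s->wakeup_needed = false;
	s->locked = false;

	/* Only now, with the lock dropped, perform the deferred work. */
	if (wakeup)
		demo_kick_vcpu(s);
}

static void demo_pic_raise_irq(struct demo_locked_pic *s)
{
	demo_pic_lock(s);
	s->wakeup_needed = true;	/* request the kick, do not send it yet */
	demo_pic_unlock(s);
}
```

The same shape would apply to deferred ack notifications: record the
pending work under the lock and run it after the lock is released.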

Signed-off-by: Avi Kivity <avi@redhat.com>


# e4825800 24-Sep-2008 Marcelo Tosatti <mtosatti@redhat.com>

KVM: PIC: enhance IPI avoidance

The PIC code makes little effort to avoid kvm_vcpu_kick(), resulting in
unnecessary guest exits in some conditions.

For example, if the timer interrupt is routed through the IOAPIC, IRR
for IRQ 0 will get set but not cleared, since the APIC is handling the
acks.

This means that every time an interrupt < 16 is triggered, the priority
logic will find IRQ0 pending and send an IPI to vcpu0 (in case IRQ0 is
not masked, which is Linux's case).

Introduce a new variable isr_ack to represent the IRQs for which the
guest has been signalled / cleared the ISR. Use it to avoid more than
one IPI per trigger-ack cycle, in addition to the avoidance when ISR is
set in get_priority().
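
A rough sketch of the one-IPI-per-cycle idea follows. It is an
assumption-laden model rather than the patch itself: the bit polarity
(a set bit means "acked, a new kick is allowed"), the initial value, and
all demo_* names are chosen for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

struct demo_ack_pic {
	uint8_t isr_ack;	/* assumed polarity: bit set = kick allowed */
	int kicks;		/* wakeups that would have been sent */
};

static void demo_ack_init(struct demo_ack_pic *p)
{
	p->isr_ack = 0xff;	/* every line starts out "acked" */
	p->kicks = 0;
}

/* Returns true if this trigger would wake the vcpu. */
static bool demo_ack_trigger(struct demo_ack_pic *p, unsigned int irq)
{
	uint8_t bit = (uint8_t)(1u << (irq & 7));

	if (!(p->isr_ack & bit))
		return false;	/* already signalled, not yet acked */

	p->isr_ack &= (uint8_t)~bit;
	p->kicks++;
	return true;
}

/* Called when the guest's ack clears the ISR bit for this line. */
static void demo_ack_clear_isr(struct demo_ack_pic *p, unsigned int irq)
{
	p->isr_ack |= (uint8_t)(1u << (irq & 7));
}
```

Repeated triggers of the same line between acks produce a single kick in
this model; the next kick only becomes possible after demo_ack_clear_isr().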

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>


# 85428ac7 14-Aug-2008 Marcelo Tosatti <mtosatti@redhat.com>

KVM: fix i8259 reset irq acking

The irq ack during pic reset has three problems:

- Ignores slave/master PIC, using gsi 0-8 for both.
- Generates an ACK even if the APIC is in control.
- Depends upon IMR being clear, which is broken if the irq was masked
at the time it was generated.

The last one causes the BIOS to hang after the first reboot of
Windows installation, since PIT interrupts stop.

[avi: fix check whether pic interrupts are seen by cpu]

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>


# f5244726 26-Jul-2008 Marcelo Tosatti <mtosatti@redhat.com>

KVM: irq ack notification

Based on a patch from: Ben-Ami Yassour <benami@il.ibm.com>
which was based on a patch from: Amit Shah <amit.shah@qumranet.com>

Notify IRQ acking on PIC/APIC emulation. The previous patch missed two things:

- Edge triggered interrupts on IOAPIC
- PIC reset with IRR/ISR set should be equivalent to ack (LAPIC probably
needs something similar).

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
CC: Amit Shah <amit.shah@qumranet.com>
CC: Ben-Ami Yassour <benami@il.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>


# 7edd0ce0 07-Jul-2008 Avi Kivity <avi@qumranet.com>

KVM: Consolidate PIC isr clearing into a function

Signed-off-by: Avi Kivity <avi@qumranet.com>


# c65bbfa1 06-Jul-2008 Ben-Ami Yassour <benami@il.ibm.com>

KVM: check injected pic irq within valid pic irqs

Check that an injected pic irq is between 0 and 15.
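
The check amounts to a simple bounds test. A minimal sketch, assuming 16
input lines across the master and slave PICs; the macro and function
names are hypothetical.

```c
#include <stdbool.h>

#define DEMO_PIC_NUM_PINS 16	/* 8 lines on each of the two cascaded PICs */

/* True only for irq numbers 0 through 15 inclusive. */
static bool demo_pic_irq_valid(int irq)
{
	return irq >= 0 && irq < DEMO_PIC_NUM_PINS;
}
```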

Signed-off-by: Ben-Ami Yassour <benami@il.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>


# 92760499 30-May-2008 Laurent Vivier <Laurent.Vivier@bull.net>

KVM: kvm_io_device: extend in_range() to manage len and write attribute

Modify member in_range() of structure kvm_io_device to pass length and the type
of the I/O (write or read).

This modification allows kvm_io_device to be used with coalesced MMIO.

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>


# edf88417 16-Dec-2007 Avi Kivity <avi@qumranet.com>

KVM: Move arch dependent files to new directory arch/x86/kvm/

This paves the way for multiple architecture support. Note that while
ioapic.c could potentially be shared with ia64, it is also moved.

Signed-off-by: Avi Kivity <avi@qumranet.com>