History log of /linux-master/arch/x86/xen/xen-ops.h
Revision Date Author Comments
# 38620fc4 20-Feb-2024 Roger Pau Monne <roger.pau@citrix.com>

x86/xen: attempt to inflate the memory balloon on PVH

When running as PVH or HVM Linux will use holes in the memory map as scratch
space to map grants, foreign domain pages and possibly miscellaneous other
stuff. However the usage of such memory map holes for Xen purposes can be
problematic. The request of holesby Xen happen quite early in the kernel boot
process (grant table setup already uses scratch map space), and it's possible
that by then not all devices have reclaimed their MMIO space. It's not
unlikely for chunks of Xen scratch map space to end up using PCI bridge MMIO
window memory, which (as expected) causes quite a lot of issues in the system.

At least for PVH dom0 we have the possibility of using regions marked as
UNUSABLE in the e820 memory map. Either if the region is UNUSABLE in the
native memory map, or it has been converted into UNUSABLE in order to hide RAM
regions from dom0, the second stage translation page-tables can populate those
areas without issues.

PV already has this kind of logic, where the balloon driver is inflated at
boot. Re-use the current logic in order to also inflate it when running as
PVH. onvert UNUSABLE regions up to the ratio specified in EXTRA_MEM_RATIO to
RAM, while reserving them using xen_add_extra_mem() (which is also moved so
it's no longer tied to CONFIG_PV).

[jgross: fixed build for CONFIG_PVH without CONFIG_XEN_PVH]

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20240220174341.56131-1-roger.pau@citrix.com
Signed-off-by: Juergen Gross <jgross@suse.com>


# db283230 24-Nov-2023 Juergen Gross <jgross@suse.com>

x86/xen: fix percpu vcpu_info allocation

Today the percpu struct vcpu_info is allocated via DEFINE_PER_CPU(),
meaning that it could cross a page boundary. In this case registering
it with the hypervisor will fail, resulting in a panic().

This can easily be fixed by using DEFINE_PER_CPU_ALIGNED() instead,
as struct vcpu_info is guaranteed to have a size of 64 bytes, matching
the cache line size of x86 64-bit processors (Xen doesn't support
32-bit processors).

Fixes: 5ead97c84fa7 ("xen: Core Xen implementation")
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.con>
Link: https://lore.kernel.org/r/20231124074852.25161-1-jgross@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>


# fb9b7b4b 14-Jun-2023 Arnd Bergmann <arnd@arndb.de>

x86: xen: add missing prototypes

These function are all called from assembler files, or from inline
assembler, so there is no immediate need for a prototype in a header,
but if -Wmissing-prototypes is enabled, the compiler warns about them:

arch/x86/xen/efi.c:130:13: error: no previous prototype for 'xen_efi_init' [-Werror=missing-prototypes]
arch/x86/platform/pvh/enlighten.c:120:13: error: no previous prototype for 'xen_prepare_pvh' [-Werror=missing-prototypes]
arch/x86/xen/enlighten_pv.c:1233:34: error: no previous prototype for 'xen_start_kernel' [-Werror=missing-prototypes]
arch/x86/xen/irq.c:22:14: error: no previous prototype for 'xen_force_evtchn_callback' [-Werror=missing-prototypes]
arch/x86/entry/common.c:302:24: error: no previous prototype for 'xen_pv_evtchn_do_upcall' [-Werror=missing-prototypes]

Declare all of them in an appropriate header file to avoid the warnings.
For consistency, also move the asm_cpu_bringup_and_idle() declaration
out of smp_pv.c.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20230614073501.10101-3-jgross@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>


# 04d68487 17-May-2023 Arnd Bergmann <arnd@arndb.de>

xen: xen_debug_interrupt prototype to global header

The xen_debug_interrupt() function is only called on x86, which has a
prototype in an architecture specific header, but the definition also
exists on others, where the lack of a prototype causes a W=1 warning:

drivers/xen/events/events_2l.c:264:13: error: no previous prototype for 'xen_debug_interrupt' [-Werror=missing-prototypes]

Move the prototype into a global header instead to avoid this warning.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Link: https://lore.kernel.org/r/20230517124525.929201-1-arnd@kernel.org
Signed-off-by: Juergen Gross <jgross@suse.com>


# 934ef33ee 13-Mar-2023 Jan Beulich <jbeulich@suse.com>

x86/PVH: obtain VGA console info in Dom0

A new platform-op was added to Xen to allow obtaining the same VGA
console information PV Dom0 is handed. Invoke the new function and have
the output data processed by xen_init_vga().

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>

Link: https://lore.kernel.org/r/8f315e92-7bda-c124-71cc-478ab9c5e610@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>


# b75b7f8e 14-Jun-2022 Peter Zijlstra <peterz@infradead.org>

x86/xen: Rename SYS* entry points

Native SYS{CALL,ENTER} entry points are called
entry_SYS{CALL,ENTER}_{64,compat}, make sure the Xen versions are
named consistently.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>


# 12ad6cfc 28-Oct-2021 Juergen Gross <jgross@suse.com>

x86/xen: remove xen_have_vcpu_info_placement flag

The flag xen_have_vcpu_info_placement was needed to support Xen
hypervisors older than version 3.4, which didn't support the
VCPUOP_register_vcpu_info hypercall. Today the Linux kernel requires
at least Xen 4.0 to be able to run, so xen_have_vcpu_info_placement
can be dropped (in theory the flag was used to ensure a working kernel
even in case of the VCPUOP_register_vcpu_info hypercall failing for
other reasons than the hypercall not being supported, but the only
cases covered by the flag would be parameter errors, which ought not
to be made anyway).

This allows to let some functions return void now, as they can never
fail.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20211028072748.29862-2-jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>


# 079c4baa 30-Sep-2021 Jan Beulich <jbeulich@suse.com>

xen/x86: hook up xen_banner() also for PVH

This was effectively lost while dropping PVHv1 code. Move the function
and arrange for it to be called the same way as done in PV mode. Clearly
this then needs re-introducing the XENFEAT_mmu_pt_update_preserve_ad
check that was recently removed, as that's a PV-only feature.

Since the string pointed at by pv_info.name describes the mode, drop
"paravirtualized" from the log message while moving the code.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/de03054d-a20d-2114-bb86-eec28e17b3b8@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>


# 4d1ab432 30-Sep-2021 Jan Beulich <jbeulich@suse.com>

xen/x86: generalize preferred console model from PV to PVH Dom0

Without announcing hvc0 as preferred it won't get used as long as tty0
gets registered earlier. This is particularly problematic with there not
being any screen output for PVH Dom0 when the screen is in graphics
mode, as the necessary information doesn't get conveyed yet from the
hypervisor.

Follow PV's model, but be conservative and do this for Dom0 only for
now.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/582328b6-c86c-37f3-d802-5539b7a86736@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>


# cae7d81a 30-Sep-2021 Jan Beulich <jbeulich@suse.com>

xen/x86: allow PVH Dom0 without XEN_PV=y

Decouple XEN_DOM0 from XEN_PV, converting some existing uses of XEN_DOM0
to a new XEN_PV_DOM0. (I'm not convinced all are really / should really
be PV-specific, but for starters I've tried to be conservative.)

For PVH Dom0 the hypervisor populates MADT with only x2APIC entries, so
without x2APIC support enabled in the kernel things aren't going to work
very well. (As opposed, DomU-s would only ever see LAPIC entries in MADT
as of now.) Note that this then requires PVH Dom0 to be 64-bit, as
X86_X2APIC depends on X86_64.

In the course of this xen_running_on_version_or_later() needs to be
available more broadly. Move it from a PV-specific to a generic file,
considering that what it does isn't really PV-specific at all anyway.

Note that xen/interface/version.h cannot be included on its own; in
enlighten.c, which uses SCHEDOP_* anyway, include xen/interface/sched.h
first to resolve the apparently sole missing type (xen_ulong_t).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>

Link: https://lore.kernel.org/r/983bb72f-53df-b6af-14bd-5e088bd06a08@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>


# ab234a26 20-Jan-2021 Juergen Gross <jgross@suse.com>

x86/pv: Rework arch_local_irq_restore() to not use popf

POPF is a rather expensive operation, so don't use it for restoring
irq flags. Instead, test whether interrupts are enabled in the flags
parameter and enable interrupts via STI in that case.

This results in the restore_fl paravirt op to be no longer needed.

Suggested-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210120135555.32594-7-jgross@suse.com


# afd30525 20-Jan-2021 Juergen Gross <jgross@suse.com>

x86/xen: Drop USERGS_SYSRET64 paravirt call

USERGS_SYSRET64 is used to return from a syscall via SYSRET, but
a Xen PV guest will nevertheless use the IRET hypercall, as there
is no sysret PV hypercall defined.

So instead of testing all the prerequisites for doing a sysret and
then mangling the stack for Xen PV again for doing an iret just use
the iret exit from the beginning.

This can easily be done via an ALTERNATIVE like it is done for the
sysenter compat case already.

It should be noted that this drops the optimization in Xen for not
restoring a few registers when returning to user mode, but it seems
as if the saved instructions in the kernel more than compensate for
this drop (a kernel build in a Xen PV guest was slightly faster with
this patch applied).

While at it remove the stale sysret32 remnants.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210120135555.32594-6-jgross@suse.com


# d04b1ae5 22-Oct-2020 Juergen Gross <jgross@suse.com>

xen/events: only register debug interrupt for 2-level events

xen_debug_interrupt() is specific to 2-level event handling. So don't
register it with fifo event handling being active.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Link: https://lore.kernel.org/r/20201022094907.28560-4-jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>


# a13f2ef1 29-Jun-2020 Juergen Gross <jgross@suse.com>

x86/xen: remove 32-bit Xen PV guest support

Xen is requiring 64-bit machines today and since Xen 4.14 it can be
built without 32-bit PV guest support. There is no need to carry the
burden of 32-bit PV guest support in the kernel any longer, as new
guests can be either HVM or PVH, or they can use a 64 bit kernel.

Remove the 32-bit Xen PV support from the kernel.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>


# 998c2034 20-May-2020 Vitaly Kuznetsov <vkuznets@redhat.com>

xen: Move xen_setup_callback_vector() definition to include/xen/hvm.h

Kbuild test robot reports the following problem on ARM:

for 'xen_setup_callback_vector' [-Wmissing-prototypes]
1664 | void xen_setup_callback_vector(void) {}
| ^~~~~~~~~~~~~~~~~~~~~~~~~

The problem is that xen_setup_callback_vector is a x86 only thing, its
definition is present in arch/x86/xen/xen-ops.h but not on ARM. In
events_base.c there is a stub for !CONFIG_XEN_PVHVM but it is not declared
as 'static'.

On x86 the situation is hardly better: drivers/xen/events/events_base.c
doesn't include 'xen-ops.h' from arch/x86/xen/, it includes its namesake
from include/xen/ which also results in a 'no previous prototype' warning.

Currently, xen_setup_callback_vector() has two call sites: one in
drivers/xen/events_base.c and another in arch/x86/xen/suspend_hvm.c. The
former is placed under #ifdef CONFIG_X86 and the later is only compiled
in when CONFIG_XEN_PVHVM.

Resolve the issue by moving xen_setup_callback_vector() declaration to
arch neutral 'include/xen/hvm.h' as the implementation lives in arch
neutral drivers/xen/events/events_base.c.

Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lkml.kernel.org/r/20200520161600.361895-1-vkuznets@redhat.com


# 2f6474e4 21-May-2020 Thomas Gleixner <tglx@linutronix.de>

x86/entry: Switch XEN/PV hypercall entry to IDTENTRY

Convert the XEN/PV hypercall to IDTENTRY:

- Emit the ASM stub with DECLARE_IDTENTRY
- Remove the ASM idtentry in 64-bit
- Remove the open coded ASM entry code in 32-bit
- Remove the old prototypes

The handler stubs need to stay in ASM code as they need corner case handling
and adjustment of the stack pointer.

Provide a new C function which invokes the entry/exit handling and calls
into the XEN handler on the interrupt stack if required.

The exit code is slightly different from the regular idtentry_exit() on
non-preemptible kernels. If the hypercall is preemptible and need_resched()
is set then XEN provides a preempt hypercall scheduling function.

Move this functionality into the entry code so it can use the existing
idtentry functionality.

[ mingo: Build fixes. ]

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Juergen Gross <jgross@suse.com>
Tested-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20200521202118.055270078@linutronix.de


# a0bb51f2 28-Apr-2020 Vitaly Kuznetsov <vkuznets@redhat.com>

x86/xen: Split HVM vector callback setup and interrupt gate allocation

As a preparatory change for making alloc_intr_gate() __init split
xen_callback_vector() into callback vector setup via hypercall
(xen_setup_callback_vector()) and interrupt gate allocation
(xen_alloc_callback_vector()).

xen_setup_callback_vector() is being called twice: on init and upon
system resume from xen_hvm_post_suspend(). alloc_intr_gate() only
needs to be called once.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200428093824.1451532-2-vkuznets@redhat.com


# 55aedddb 11-Jul-2019 Peter Zijlstra <peterz@infradead.org>

x86/paravirt: Make read_cr2() CALLEE_SAVE

The one paravirt read_cr2() implementation (Xen) is actually quite trivial
and doesn't need to clobber anything other than the return register.

Making read_cr2() CALLEE_SAVE avoids all the PUSH/POP nonsense and allows
more convenient use from assembly.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: bp@alien8.de
Cc: rostedt@goodmis.org
Cc: luto@kernel.org
Cc: torvalds@linux-foundation.org
Cc: hpa@zytor.com
Cc: dave.hansen@linux.intel.com
Cc: zhe.he@windriver.com
Cc: joel@joelfernandes.org
Cc: devel@etsukata.com
Link: https://lkml.kernel.org/r/20190711114335.887392493@infradead.org


# 72813bfb 23-Apr-2019 Roger Pau Monne <roger.pau@citrix.com>

xen/pvh: correctly setup the PV EFI interface for dom0

This involves initializing the boot params EFI related fields and the
efi global variable.

Without this fix a PVH dom0 doesn't detect when booted from EFI, and
thus doesn't support accessing any of the EFI related data.

Reported-by: PGNet Dev <pgnet.dev@gmail.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: stable@vger.kernel.org # 4.19+


# df11b691 20-Aug-2018 Juergen Gross <jgross@suse.com>

x86/xen: remove unused function xen_auto_xlated_memory_setup()

xen_auto_xlated_memory_setup() is a leftover from PVH V1. Remove it.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>


# 7b25b9cb 19-Jul-2018 Pavel Tatashin <pasha.tatashin@oracle.com>

x86/xen/time: Initialize pv xen time in init_hypervisor_platform()

In every hypervisor except for xen pv time ops are initialized in
init_hypervisor_platform().

Xen PV domains initialize time ops in x86_init.paging.pagetable_init(),
by calling xen_setup_shared_info() which is a poor design, as time is
needed prior to memory allocator.

xen_setup_shared_info() is called from two places: during boot, and
after suspend. Split the content of xen_setup_shared_info() into
three places:

1. add the clock relavent data into new xen pv init_platform vector, and
set clock ops in there.

2. move xen_setup_vcpu_info_placement() to new xen_pv_guest_late_init()
call.

3. Re-initializing parts of shared info copy to xen_pv_post_suspend() to
be symmetric to xen_pv_pre_suspend

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: steven.sistare@oracle.com
Cc: daniel.m.jordan@oracle.com
Cc: linux@armlinux.org.uk
Cc: schwidefsky@de.ibm.com
Cc: heiko.carstens@de.ibm.com
Cc: john.stultz@linaro.org
Cc: sboyd@codeaurora.org
Cc: hpa@zytor.com
Cc: douly.fnst@cn.fujitsu.com
Cc: peterz@infradead.org
Cc: prarit@redhat.com
Cc: feng.tang@intel.com
Cc: pmladek@suse.com
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: linux-s390@vger.kernel.org
Cc: boris.ostrovsky@oracle.com
Cc: jgross@suse.com
Cc: pbonzini@redhat.com
Link: https://lkml.kernel.org/r/20180719205545.16512-13-pasha.tatashin@oracle.com


# 0dd6d272 23-Dec-2017 Nick Desaulniers <nick.desaulniers@gmail.com>

x86/xen/time: fix section mismatch for xen_init_time_ops()

The header declares this function as __init but is defined in __ref
section.

Signed-off-by: Nick Desaulniers <nick.desaulniers@gmail.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>


# 2229f70b 08-Nov-2017 Joao Martins <joao.m.martins@oracle.com>

x86/xen/time: setup vcpu 0 time info page

In order to support pvclock vdso on xen we need to setup the time
info page for vcpu 0 and register the page with Xen using the
VCPUOP_register_vcpu_time_memory_area hypercall. This hypercall
will also forcefully update the pvti which will set some of the
necessary flags for vdso. Afterwards we check if it supports the
PVCLOCK_TSC_STABLE_BIT flag which is mandatory for having
vdso/vsyscall support. And if so, it will set the cpu 0 pvti that
will be later on used when mapping the vdso image.

The xen headers are also updated to include the new hypercall for
registering the secondary vcpu_time_info struct.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>


# b2441318 01-Nov-2017 Greg Kroah-Hartman <gregkh@linuxfoundation.org>

License cleanup: add SPDX GPL-2.0 license identifier to files with no license

Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.

For non */uapi/* files that summary was:

SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 11139

and resulted in the first patch in this series.

If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:

SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note 930

and resulted in the second patch in this series.

- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:

SPDX license identifier # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note 270
GPL-2.0+ WITH Linux-syscall-note 169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
LGPL-2.1+ WITH Linux-syscall-note 15
GPL-1.0+ WITH Linux-syscall-note 14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
LGPL-2.0+ WITH Linux-syscall-note 4
LGPL-2.1 WITH Linux-syscall-note 3
((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1

and that resulted in the third patch in this series.

- when the two scanners agreed on the detected license(s), that became
the concluded license(s).

- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.

- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).

- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.

- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct

This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.

Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 5878d5d6 31-Aug-2017 Juergen Gross <jgross@suse.com>

x86/xen: Get rid of paravirt op adjust_exception_frame

When running as Xen pv-guest the exception frame on the stack contains
%r11 and %rcx additional to the other data pushed by the processor.

Instead of having a paravirt op being called for each exception type
prepend the Xen specific code to each exception entry. When running as
Xen pv-guest just use the exception entry with prepended instructions,
otherwise use the entry without the Xen specific code.

[ tglx: Merged through tip to avoid ugly merge conflict ]

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: xen-devel@lists.xenproject.org
Cc: boris.ostrovsky@oracle.com
Cc: luto@amacapital.net
Link: http://lkml.kernel.org/r/20170831174249.26853-1-jg@pfupf.net


# edcb5cf8 16-Aug-2017 Juergen Gross <jgross@suse.com>

x86/paravirt/xen: Remove xen_patch()

Xen's paravirt patch function xen_patch() does some special casing for
irq_ops functions to apply relocations when those functions can be
patched inline instead of calls.

Unfortunately none of the special case function replacements is small
enough to be patched inline, so the special case never applies.

As xen_patch() will call paravirt_patch_default() in all cases it can
be just dropped. xen-asm.h doesn't seem necessary without xen_patch()
as the only thing left in it would be the definition of XEN_EFLAGS_NMI
used only once. So move that definition and remove xen-asm.h.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: boris.ostrovsky@oracle.com
Cc: lguest@lists.ozlabs.org
Cc: rusty@rustcorp.com.au
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/20170816173157.8633-2-jgross@suse.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# c9b5d98b 02-Jun-2017 Ankur Arora <ankur.a.arora@oracle.com>

xen/vcpu: Handle xen_vcpu_setup() failure in hotplug

The hypercall VCPUOP_register_vcpu_info can fail. This failure is
handled by making per_cpu(xen_vcpu, cpu) point to its shared_info
slot and those without one (cpu >= MAX_VIRT_CPUS) be NULL.

For PVH/PVHVM, this is not enough, because we also need to pull
these VCPUs out of circulation.

Fix for PVH/PVHVM: on registration failure in the cpuhp prepare
callback (xen_cpu_up_prepare_hvm()), return an error to the cpuhp
state-machine so it can fail the CPU init.

Fix for PV: the registration happens before smp_init(), so, in the
failure case we clamp setup_max_cpus and limit the number of VCPUs
that smp_init() will bring-up to MAX_VIRT_CPUS.
This is functionally correct but it makes the code a bit simpler
if we get rid of this explicit clamping: for VCPUs that don't have
valid xen_vcpu, fail the CPU init in the cpuhp prepare callback
(xen_cpu_up_prepare_pv()).

Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>


# ad73fd59 02-Jun-2017 Ankur Arora <ankur.a.arora@oracle.com>

xen/vcpu: Simplify xen_vcpu related code

Largely mechanical changes to aid unification of xen_vcpu_restore()
logic for PV, PVH and PVHVM.

xen_vcpu_setup(): the only change in logic is that clamp_max_cpus()
is now handled inside the "if (!xen_have_vcpu_info_placement)" block.

xen_vcpu_restore(): code movement from enlighten_pv.c to enlighten.c.

xen_vcpu_info_reset(): pulls together all the code where xen_vcpu
is set to default.

Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>


# 5d9404e1 24-Apr-2017 Julien Grall <julien.grall@arm.com>

xen: Export xen_reboot

The helper xen_reboot will be called by the EFI code in a later patch.

Note that the ARM version does not yet exist and will be added in a
later patch too.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Juergen Gross <jgross@suse.com>


# 9963236d 14-Mar-2017 Vitaly Kuznetsov <vkuznets@redhat.com>

x86/xen: split suspend.c for PV and PVHVM guests

Slit the code in suspend.c into suspend_pv.c and suspend_hvm.c.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>


# 98f2a47a 14-Mar-2017 Vitaly Kuznetsov <vkuznets@redhat.com>

x86/xen: split off enlighten_hvm.c

Move PVHVM related code to enlighten_hvm.c. Three functions:
xen_cpuhp_setup(), xen_reboot(), xen_emergency_restart() are shared, drop
static qualifier from them. These functions will go to common code once
it is split from enlighten.c.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>


# 52519f2a 14-Mar-2017 Vitaly Kuznetsov <vkuznets@redhat.com>

x86/xen: globalize have_vcpu_info_placement

have_vcpu_info_placement applies to both PV and HVM and as we're going
to split the code we need to make it global.

Rename to xen_have_vcpu_info_placement.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>


# 063334f3 03-Feb-2017 Boris Ostrovsky <boris.ostrovsky@oracle.com>

xen/x86: Remove PVH support

We are replacing existing PVH guests with new implementation.

We are keeping xen_pvh_domain() macro (for now set to zero) because
when we introduce new PVH implementation later in this series we will
reuse current PVH-specific code (xen_pvh_gnttab_setup()), and that
code is conditioned by 'if (xen_pvh_domain())'. (We will also need
a noop xen_pvh_domain() for !CONFIG_XEN_PVH).

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>


# a5a1d1c2 21-Dec-2016 Thomas Gleixner <tglx@linutronix.de>

clocksource: Use a plain u64 instead of cycle_t

There is no point in having an extra type for extra confusion. u64 is
unambiguous.

Conversion was done with the following coccinelle script:

@rem@
@@
-typedef u64 cycle_t;

@fix@
typedef cycle_t;
@@
-cycle_t
+u64

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>


# ee42d665 30-Jun-2016 Vitaly Kuznetsov <vkuznets@redhat.com>

xen/pvhvm: run xen_vcpu_setup() for the boot CPU

Historically we didn't call VCPUOP_register_vcpu_info for CPU0 for
PVHVM guests (while we had it for PV and ARM guests). This is usually
fine as we can use vcpu info in the shared_info page but when we try
booting on a vCPU with Xen's vCPU id > 31 (e.g. when we try to kdump
after crashing on this CPU) we're not able to boot.

Switch to always doing VCPUOP_register_vcpu_info for the boot CPU.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>


# 88c15ec9 19-Nov-2015 Boris Ostrovsky <boris.ostrovsky@oracle.com>

x86/paravirt: Remove the unused irq_enable_sysexit pv op

As result of commit "x86/xen: Avoid fast syscall path for Xen PV
guests", the irq_enable_sysexit pv op is not called by Xen PV guests
anymore and since they were the only ones who used it we can
safely remove it.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: david.vrabel@citrix.com
Cc: konrad.wilk@oracle.com
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1447970147-1733-3-git-send-email-boris.ostrovsky@oracle.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 70e61199 16-Jul-2015 Juergen Gross <jgross@suse.com>

xen: move p2m list if conflicting with e820 map

Check whether the hypervisor supplied p2m list is placed at a location
which is conflicting with the target E820 map. If this is the case
relocate it to a new area unused up to now and compliant to the E820
map.

As the p2m list might by huge (up to several GB) and is required to be
mapped virtually, set up a temporary mapping for the copied list.

For pvh domains just delete the p2m related information from start
info instead of reserving the p2m memory, as we don't need it at all.

For 32 bit kernels adjust the memblock_reserve() parameters in order
to cover the page tables only. This requires to memblock_reserve() the
start_info page on it's own.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>


# 6c2681c8 16-Jul-2015 Juergen Gross <jgross@suse.com>

xen: add explicit memblock_reserve() calls for special pages

Some special pages containing interfaces to xen are being reserved
implicitly only today. The memblock_reserve() call to reserve them is
meant to reserve the p2m list supplied by xen. It is just reserving
not only the p2m list itself, but some more pages up to the start of
the xen built page tables.

To be able to move the p2m list to another pfn range, which is needed
for support of huge RAM, this memblock_reserve() must be split up to
cover all affected reserved pages explicitly.

The affected pages are:
- start_info page
- xenstore ring (might be missing, mfn is 0 in this case)
- console ring (not for initial domain)

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>


# 04414baa 16-Jul-2015 Juergen Gross <jgross@suse.com>

xen: check pre-allocated page tables for conflict with memory map

Check whether the page tables built by the domain builder are at
memory addresses which are in conflict with the target memory map.
If this is the case just panic instead of running into problems
later.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>


# 9ddac5b7 16-Jul-2015 Juergen Gross <jgross@suse.com>

xen: find unused contiguous memory area

For being able to relocate pre-allocated data areas like initrd or
p2m list it is mandatory to find a contiguous memory area which is
not yet in use and doesn't conflict with the memory map we want to
be in effect.

In case such an area is found reserve it at once as this will be
required to be done in any case.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>


# e612b4a7 16-Jul-2015 Juergen Gross <jgross@suse.com>

xen: check memory area against e820 map

Provide a service routine to check a physical memory area against the
E820 map. The routine will return false if the complete area is RAM
according to the E820 map and true otherwise.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>


# fc5fee86 10-Aug-2015 Jason A. Donenfeld <Jason@zx2c4.com>

x86/xen: build "Xen PV" APIC driver for domU as well

It turns out that a PV domU also requires the "Xen PV" APIC
driver. Otherwise, the flat driver is used and we get stuck in busy
loops that never exit, such as in this stack trace:

(gdb) target remote localhost:9999
Remote debugging using localhost:9999
__xapic_wait_icr_idle () at ./arch/x86/include/asm/ipi.h:56
56 while (native_apic_mem_read(APIC_ICR) & APIC_ICR_BUSY)
(gdb) bt
#0 __xapic_wait_icr_idle () at ./arch/x86/include/asm/ipi.h:56
#1 __default_send_IPI_shortcut (shortcut=<optimized out>,
dest=<optimized out>, vector=<optimized out>) at
./arch/x86/include/asm/ipi.h:75
#2 apic_send_IPI_self (vector=246) at arch/x86/kernel/apic/probe_64.c:54
#3 0xffffffff81011336 in arch_irq_work_raise () at
arch/x86/kernel/irq_work.c:47
#4 0xffffffff8114990c in irq_work_queue (work=0xffff88000fc0e400) at
kernel/irq_work.c:100
#5 0xffffffff8110c29d in wake_up_klogd () at kernel/printk/printk.c:2633
#6 0xffffffff8110ca60 in vprintk_emit (facility=0, level=<optimized
out>, dict=0x0 <irq_stack_union>, dictlen=<optimized out>,
fmt=<optimized out>, args=<optimized out>)
at kernel/printk/printk.c:1778
#7 0xffffffff816010c8 in printk (fmt=<optimized out>) at
kernel/printk/printk.c:1868
#8 0xffffffffc00013ea in ?? ()
#9 0x0000000000000000 in ?? ()

Mailing-list-thread: https://lkml.org/lkml/2015/8/4/755
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>


# aac82d31 03-Apr-2015 Andy Lutomirski <luto@kernel.org>

x86, paravirt, xen: Remove the 64-bit ->irq_enable_sysexit() pvop

We don't use irq_enable_sysexit on 64-bit kernels any more.
Remove all the paravirt and Xen machinery to support it on
64-bit kernels.

Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/8a03355698fe5b94194e9e7360f19f91c1b2cf1f.1428100853.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# f0feed10 27-Jan-2015 Juergen Gross <jgross@suse.com>

x86/xen: cleanup arch/x86/xen/setup.c

Remove extern declarations in arch/x86/xen/setup.c which are either
not used or redundant. Move needed other extern declarations to
xen-ops.h

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>


# 054954eb 28-Nov-2014 Juergen Gross <jgross@suse.com>

xen: switch to linear virtual mapped sparse p2m list

At start of the day the Xen hypervisor presents a contiguous mfn list
to a pv-domain. In order to support sparse memory this mfn list is
accessed via a three level p2m tree built early in the boot process.
Whenever the system needs the mfn associated with a pfn this tree is
used to find the mfn.

Instead of using a software walked tree for accessing a specific mfn
list entry this patch is creating a virtual address area for the
entire possible mfn list including memory holes. The holes are
covered by mapping a pre-defined page consisting only of "invalid
mfn" entries. Access to a mfn entry is possible by just using the
virtual base address of the mfn list and the pfn as index into that
list. This speeds up the (hot) path of determining the mfn of a
pfn.

Kernel build on a Dell Latitude E6440 (2 cores, HT) in 64 bit Dom0
showed following improvements:

Elapsed time: 32:50 -> 32:35
System: 18:07 -> 17:47
User: 104:00 -> 103:30

Tested with following configurations:
- 64 bit dom0, 8GB RAM
- 64 bit dom0, 128 GB RAM, PCI-area above 4 GB
- 32 bit domU, 512 MB, 8 GB, 43 GB (more wouldn't work even without
the patch)
- 32 bit domU, ballooning up and down
- 32 bit domU, save and restore
- 32 bit domU with PCI passthrough
- 64 bit domU, 8 GB, 2049 MB, 5000 MB
- 64 bit domU, ballooning up and down
- 64 bit domU, save and restore
- 64 bit domU with PCI passthrough

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>


# 5b8e7d80 28-Nov-2014 Juergen Gross <jgross@suse.com>

xen: Delay invalidating extra memory

When the physical memory configuration is initialized the p2m entries
for not pouplated memory pages are set to "invalid". As those pages
are beyond the hypervisor built p2m list the p2m tree has to be
extended.

This patch delays processing the extra memory related p2m entries
during the boot process until some more basic memory management
functions are callable. This removes the need to create new p2m
entries until virtual memory management is available.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>


# 1f3ac86b 28-Nov-2014 Juergen Gross <jgross@suse.com>

xen: Delay remapping memory of pv-domain

Early in the boot process the memory layout of a pv-domain is changed
to match the E820 map (either the host one for Dom0 or the Xen one)
regarding placement of RAM and PCI holes. This requires removing memory
pages initially located at positions not suitable for RAM and adding
them later at higher addresses where no restrictions apply.

To be able to operate on the hypervisor supported p2m list until a
virtual mapped linear p2m list can be constructed, remapping must
be delayed until virtual memory management is initialized, as the
initial p2m list can't be extended unlimited at physical memory
initialization time due to it's fixed structure.

A further advantage is the reduction in complexity and code volume as
we don't have to be careful regarding memory restrictions during p2m
updates.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>


# 47591df5 03-Nov-2014 Juergen Gross <jgross@suse.com>

xen: Support Xen pv-domains using PAT

With the dynamical mapping between cache modes and pgprot values it is
now possible to use all cache modes via the Xen hypervisor PAT settings
in a pv domain.

All to be done is to read the PAT configuration MSR and set up the
translation tables accordingly.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stefan.bader@canonical.com
Cc: xen-devel@lists.xensource.com
Cc: ville.syrjala@linux.intel.com
Cc: jbeulich@suse.com
Cc: toshi.kani@hp.com
Cc: plagnioj@jcrosoft.com
Cc: tomi.valkeinen@ti.com
Cc: bhelgaas@google.com
Link: http://lkml.kernel.org/r/1415019724-4317-19-git-send-email-jgross@suse.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>


# c7341d6a 12-Jul-2014 Daniel Kiper <daniel.kiper@oracle.com>

arch/x86/xen: Silence compiler warnings

Compiler complains in the following way when x86 32-bit kernel
with Xen support is build:

CC arch/x86/xen/enlighten.o
arch/x86/xen/enlighten.c: In function ‘xen_start_kernel’:
arch/x86/xen/enlighten.c:1726:3: warning: right shift count >= width of type [enabled by default]

Such line contains following EFI initialization code:

boot_params.efi_info.efi_systab_hi = (__u32)(__pa(efi_systab_xen) >> 32);

There is no issue if x86 64-bit kernel is build. However, 32-bit case
generate warning (even if that code will not be executed because Xen
does not work on 32-bit EFI platforms) due to __pa() returning unsigned long
type which has 32-bits width. So move whole EFI initialization stuff
to separate function and build it conditionally to avoid above mentioned
warning on x86 32-bit architecture.

Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>


# abacaadc 02-Jun-2014 David Vrabel <david.vrabel@citrix.com>

x86/xen: fix memory setup for PVH dom0

Since af06d66ee32b (x86: fix setup of PVH Dom0 memory map) in Xen, PVH
dom0 need only use the memory memory provided by Xen which has already
setup all the correct holes.

xen_memory_setup() then ends up being trivial for a PVH guest so
introduce a new function (xen_auto_xlated_memory_setup()).

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Tested-by: Roger Pau Monné <roger.pau@citrix.com>


# aa8532c3 08-May-2014 David Vrabel <david.vrabel@citrix.com>

xen: refactor suspend pre/post hooks

New architectures currently have to provide implementations of 5 different
functions: xen_arch_pre_suspend(), xen_arch_post_suspend(),
xen_arch_hvm_post_suspend(), xen_mm_pin_all(), and xen_mm_unpin_all().

Refactor the suspend code to only require xen_arch_pre_suspend() and
xen_arch_post_suspend().

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>


# c9f6e997 20-Jan-2014 Roger Pau Monne <roger.pau@citrix.com>

xen/pvh: Set X86_CR0_WP and others in CR0 (v2)

otherwise we will get for some user-space applications
that use 'clone' with CLONE_CHILD_SETTID | CLONE_CHILD_CLEARTID
end up hitting an assert in glibc manifested by:

general protection ip:7f80720d364c sp:7fff98fd8a80 error:0 in
libc-2.13.so[7f807209e000+180000]

This is due to the nature of said operations which sets and clears
the PID. "In the successful one I can see that the page table of
the parent process has been updated successfully to use a
different physical page, so the write of the tid on
that page only affects the child...

On the other hand, in the failed case, the write seems to happen before
the copy of the original page is done, so both the parent and the child
end up with the same value (because the parent copies the page after
the write of the child tid has already happened)."
(Roger's analysis). The nature of this is due to the Xen's commit
of 51e2cac257ec8b4080d89f0855c498cbbd76a5e5
"x86/pvh: set only minimal cr0 and cr4 flags in order to use paging"
the CR0_WP was removed so COW features of the Linux kernel were not
operating properly.

While doing that also update the rest of the CR0 flags to be inline
with what a baremetal Linux kernel would set them to.

In 'secondary_startup_64' (baremetal Linux) sets:

X86_CR0_PE | X86_CR0_MP | X86_CR0_ET | X86_CR0_NE | X86_CR0_WP |
X86_CR0_AM | X86_CR0_PG

The hypervisor for HVM type guests (which PVH is a bit) sets:
X86_CR0_PE | X86_CR0_ET | X86_CR0_TS
For PVH it specifically sets:
X86_CR0_PG

Which means we need to set the rest: X86_CR0_MP | X86_CR0_NE |
X86_CR0_WP | X86_CR0_AM to have full parity.

Signed-off-by: Roger Pau Monne <roger.pau@citrix.com>
Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[v1: Took out the cr4 writes to be a seperate patch]
[v2: 0-DAY kernel found xen_setup_gdt to be missing a static]


# 5840c84b 13-Dec-2013 Mukesh Rathor <mukesh.rathor@oracle.com>

xen/pvh: Secondary VCPU bringup (non-bootup CPUs)

The VCPU bringup protocol follows the PV with certain twists.
From xen/include/public/arch-x86/xen.h:

Also note that when calling DOMCTL_setvcpucontext and VCPU_initialise
for HVM and PVH guests, not all information in this structure is updated:

- For HVM guests, the structures read include: fpu_ctxt (if
VGCT_I387_VALID is set), flags, user_regs, debugreg[*]

- PVH guests are the same as HVM guests, but additionally use ctrlreg[3] to
set cr3. All other fields not used should be set to 0.

This is what we do. We piggyback on the 'xen_setup_gdt' - but modify
a bit - we need to call 'load_percpu_segment' so that 'switch_to_new_gdt'
can load per-cpu data-structures. It has no effect on the VCPU0.

We also piggyback on the %rdi register to pass in the CPU number - so
that when we bootup a new CPU, the cpu_bringup_and_idle will have
passed as the first parameter the CPU number (via %rdi for 64-bit).

Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>


# eb86b5fd 21-Aug-2013 Andi Kleen <ak@linux.intel.com>

x86/asmlinkage: Fix warning in xen asmlinkage change

Current code uses asmlinkage for functions without arguments.
This adds an implicit regparm(0) which creates a warning
when assigning the function to pointers.

Use __visible for the functions without arguments.
This avoids having to add regparm(0) to function pointers.
Since they have no arguments it does not make any difference.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1377115662-4865-1-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 9a55fdbe 05-Aug-2013 Andi Kleen <ak@linux.intel.com>

x86, asmlinkage, paravirt: Add __visible/asmlinkage to xen paravirt ops

Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1375740170-7446-13-git-send-email-andi@firstfloor.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>


# 148f9bb8 18-Jun-2013 Paul Gortmaker <paul.gortmaker@windriver.com>

x86: delete __cpuinit usage from all x86 files

The __cpuinit type of throwaway sections might have made sense
some time ago when RAM was more constrained, but now the savings
do not offset the cost and complications. For example, the fix in
commit 5e427ec2d0 ("x86: Fix bit corruption at CPU resume time")
is a good example of the nasty type of bugs that can be created
with improper use of the various __init prefixes.

After a discussion on LKML[1] it was decided that cpuinit should go
the way of devinit and be phased out. Once all the users are gone,
we can then finally remove the macros themselves from linux/init.h.

Note that some harmless section mismatch warnings may result, since
notify_cpu_starting() and cpu_up() are arch independent (kernel/cpu.c)
are flagged as __cpuinit -- so if we remove the __cpuinit from
arch specific callers, we will also get section mismatch warnings.
As an intermediate step, we intend to turn the linux/init.h cpuinit
content into no-ops as early as possible, since that will get rid
of these warnings. In any case, they are temporary and harmless.

This removes all the arch/x86 uses of the __cpuinit macros from
all C files. x86 only had the one __CPUINIT used in assembly files,
and it wasn't paired off with a .previous or a __FINIT, so we can
delete it directly w/o any corresponding additional change there.

[1] https://lkml.org/lkml/2013/5/20/589

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>


# e9daff24 14-Feb-2013 Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

Revert "xen PVonHVM: use E820_Reserved area for shared_info"

This reverts commit 9d02b43dee0d7fb18dfb13a00915550b1a3daa9f.

We are doing this b/c on 32-bit PVonHVM with older hypervisors
(Xen 4.1) it ends up bothing up the start_info. This is bad b/c
we use it for the time keeping, and the timekeeping code loops
forever - as the version field never changes. Olaf says to
revert it, so lets do that.

Acked-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>


# 9d02b43d 01-Nov-2012 Olaf Hering <olaf@aepfle.de>

xen PVonHVM: use E820_Reserved area for shared_info

This is a respin of 00e37bdb0113a98408de42db85be002f21dbffd3
("xen PVonHVM: move shared_info to MMIO before kexec").

Currently kexec in a PVonHVM guest fails with a triple fault because the
new kernel overwrites the shared info page. The exact failure depends on
the size of the kernel image. This patch moves the pfn from RAM into an
E820 reserved memory area.

The pfn containing the shared_info is located somewhere in RAM. This will
cause trouble if the current kernel is doing a kexec boot into a new
kernel. The new kernel (and its startup code) can not know where the pfn
is, so it can not reserve the page. The hypervisor will continue to update
the pfn, and as a result memory corruption occours in the new kernel.

The toolstack marks the memory area FC000000-FFFFFFFF as reserved in the
E820 map. Within that range newer toolstacks (4.3+) will keep 1MB
starting from FE700000 as reserved for guest use. Older Xen4 toolstacks
will usually not allocate areas up to FE700000, so FE700000 is expected
to work also with older toolstacks.

In Xen3 there is no reserved area at a fixed location. If the guest is
started on such old hosts the shared_info page will be placed in RAM. As
a result kexec can not be used.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>


# 357a3cfb 19-Jul-2012 Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

xen/p2m: Add logic to revector a P2M tree to use __va leafs.

During bootup Xen supplies us with a P2M array. It sticks
it right after the ramdisk, as can be seen with a 128GB PV guest:

(certain parts removed for clarity):
xc_dom_build_image: called
xc_dom_alloc_segment: kernel : 0xffffffff81000000 -> 0xffffffff81e43000 (pfn 0x1000 + 0xe43 pages)
xc_dom_pfn_to_ptr: domU mapping: pfn 0x1000+0xe43 at 0x7f097d8bf000
xc_dom_alloc_segment: ramdisk : 0xffffffff81e43000 -> 0xffffffff925c7000 (pfn 0x1e43 + 0x10784 pages)
xc_dom_pfn_to_ptr: domU mapping: pfn 0x1e43+0x10784 at 0x7f0952dd2000
xc_dom_alloc_segment: phys2mach : 0xffffffff925c7000 -> 0xffffffffa25c7000 (pfn 0x125c7 + 0x10000 pages)
xc_dom_pfn_to_ptr: domU mapping: pfn 0x125c7+0x10000 at 0x7f0942dd2000
xc_dom_alloc_page : start info : 0xffffffffa25c7000 (pfn 0x225c7)
xc_dom_alloc_page : xenstore : 0xffffffffa25c8000 (pfn 0x225c8)
xc_dom_alloc_page : console : 0xffffffffa25c9000 (pfn 0x225c9)
nr_page_tables: 0x0000ffffffffffff/48: 0xffff000000000000 -> 0xffffffffffffffff, 1 table(s)
nr_page_tables: 0x0000007fffffffff/39: 0xffffff8000000000 -> 0xffffffffffffffff, 1 table(s)
nr_page_tables: 0x000000003fffffff/30: 0xffffffff80000000 -> 0xffffffffbfffffff, 1 table(s)
nr_page_tables: 0x00000000001fffff/21: 0xffffffff80000000 -> 0xffffffffa27fffff, 276 table(s)
xc_dom_alloc_segment: page tables : 0xffffffffa25ca000 -> 0xffffffffa26e1000 (pfn 0x225ca + 0x117 pages)
xc_dom_pfn_to_ptr: domU mapping: pfn 0x225ca+0x117 at 0x7f097d7a8000
xc_dom_alloc_page : boot stack : 0xffffffffa26e1000 (pfn 0x226e1)
xc_dom_build_image : virt_alloc_end : 0xffffffffa26e2000
xc_dom_build_image : virt_pgtab_end : 0xffffffffa2800000

So the physical memory and virtual (using __START_KERNEL_map addresses)
layout looks as so:

phys __ka
/------------\ /-------------------\
| 0 | empty | 0xffffffff80000000|
| .. | | .. |
| 16MB | <= kernel starts | 0xffffffff81000000|
| .. | | |
| 30MB | <= kernel ends => | 0xffffffff81e43000|
| .. | & ramdisk starts | .. |
| 293MB | <= ramdisk ends=> | 0xffffffff925c7000|
| .. | & P2M starts | .. |
| .. | | .. |
| 549MB | <= P2M ends => | 0xffffffffa25c7000|
| .. | start_info | 0xffffffffa25c7000|
| .. | xenstore | 0xffffffffa25c8000|
| .. | cosole | 0xffffffffa25c9000|
| 549MB | <= page tables => | 0xffffffffa25ca000|
| .. | | |
| 550MB | <= PGT end => | 0xffffffffa26e1000|
| .. | boot stack | |
\------------/ \-------------------/

As can be seen, the ramdisk, P2M and pagetables are taking
a bit of __ka addresses space. Which is a problem since the
MODULES_VADDR starts at 0xffffffffa0000000 - and P2M sits
right in there! This results during bootup with the inability to
load modules, with this error:

------------[ cut here ]------------
WARNING: at /home/konrad/ssd/linux/mm/vmalloc.c:106 vmap_page_range_noflush+0x2d9/0x370()
Call Trace:
[<ffffffff810719fa>] warn_slowpath_common+0x7a/0xb0
[<ffffffff81030279>] ? __raw_callee_save_xen_pmd_val+0x11/0x1e
[<ffffffff81071a45>] warn_slowpath_null+0x15/0x20
[<ffffffff81130b89>] vmap_page_range_noflush+0x2d9/0x370
[<ffffffff81130c4d>] map_vm_area+0x2d/0x50
[<ffffffff811326d0>] __vmalloc_node_range+0x160/0x250
[<ffffffff810c5369>] ? module_alloc_update_bounds+0x19/0x80
[<ffffffff810c6186>] ? load_module+0x66/0x19c0
[<ffffffff8105cadc>] module_alloc+0x5c/0x60
[<ffffffff810c5369>] ? module_alloc_update_bounds+0x19/0x80
[<ffffffff810c5369>] module_alloc_update_bounds+0x19/0x80
[<ffffffff810c70c3>] load_module+0xfa3/0x19c0
[<ffffffff812491f6>] ? security_file_permission+0x86/0x90
[<ffffffff810c7b3a>] sys_init_module+0x5a/0x220
[<ffffffff815ce339>] system_call_fastpath+0x16/0x1b
---[ end trace fd8f7704fdea0291 ]---
vmalloc: allocation failure, allocated 16384 of 20480 bytes
modprobe: page allocation failure: order:0, mode:0xd2

Since the __va and __ka are 1:1 up to MODULES_VADDR and
cleanup_highmap rids __ka of the ramdisk mapping, what
we want to do is similar - get rid of the P2M in the __ka
address space. There are two ways of fixing this:

1) All P2M lookups instead of using the __ka address would
use the __va address. This means we can safely erase from
__ka space the PMD pointers that point to the PFNs for
P2M array and be OK.
2). Allocate a new array, copy the existing P2M into it,
revector the P2M tree to use that, and return the old
P2M to the memory allocate. This has the advantage that
it sets the stage for using XEN_ELF_NOTE_INIT_P2M
feature. That feature allows us to set the exact virtual
address space we want for the P2M - and allows us to
boot as initial domain on large machines.

So we pick option 2).

This patch only lays the groundwork in the P2M code. The patch
that modifies the MMU is called "xen/mmu: Copy and revector the P2M tree."

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>


# 3699aad0 28-Jun-2012 Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

xen/mmu: The xen_setup_kernel_pagetable doesn't need to return anything.

We don't need to return the new PGD - as we do not use it.

Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>


# ca08649e 16-Aug-2012 Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

Revert "xen PVonHVM: move shared_info to MMIO before kexec"

This reverts commit 00e37bdb0113a98408de42db85be002f21dbffd3.

During shutdown of PVHVM guests with more than 2VCPUs on certain
machines we can hit the race where the replaced shared_info is not
replaced fast enough and the PV time clock retries reading the same
area over and over without any any success and is stuck in an
infinite loop.

Acked-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>


# 0ec53ecf 14-Sep-2012 Stefano Stabellini <stefano.stabellini@eu.citrix.com>

xen/arm: receive Xen events on ARM

Compile events.c on ARM.
Parse, map and enable the IRQ to get event notifications from the device
tree (node "/xen").

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>


# 00e37bdb 17-Jul-2012 Olaf Hering <olaf@aepfle.de>

xen PVonHVM: move shared_info to MMIO before kexec

Currently kexec in a PVonHVM guest fails with a triple fault because the
new kernel overwrites the shared info page. The exact failure depends on
the size of the kernel image. This patch moves the pfn from RAM into
MMIO space before the kexec boot.

The pfn containing the shared_info is located somewhere in RAM. This
will cause trouble if the current kernel is doing a kexec boot into a
new kernel. The new kernel (and its startup code) can not know where the
pfn is, so it can not reserve the page. The hypervisor will continue to
update the pfn, and as a result memory corruption occours in the new
kernel.

One way to work around this issue is to allocate a page in the
xen-platform pci device's BAR memory range. But pci init is done very
late and the shared_info page is already in use very early to read the
pvclock. So moving the pfn from RAM to MMIO is racy because some code
paths on other vcpus could access the pfn during the small window when
the old pfn is moved to the new pfn. There is even a small window were
the old pfn is not backed by a mfn, and during that time all reads
return -1.

Because it is not known upfront where the MMIO region is located it can
not be used right from the start in xen_hvm_init_shared_info.

To minimise trouble the move of the pfn is done shortly before kexec.
This does not eliminate the race because all vcpus are still online when
the syscore_ops will be called. But hopefully there is no work pending
at this point in time. Also the syscore_op is run last which reduces the
risk further.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>


# 83d51ab4 03-May-2012 David Vrabel <dvrabel@cantab.net>

xen/setup: update VA mapping when releasing memory during setup

In xen_memory_setup(), if a page that is being released has a VA
mapping this must also be updated. Otherwise, the page will be not
released completely -- it will still be referenced in Xen and won't be
freed util the mapping is removed and this prevents it from being
reallocated at a different PFN.

This was already being done for the ISA memory region in
xen_ident_map_ISA() but on many systems this was omitting a few pages
as many systems marked a few pages below the ISA memory region as
reserved in the e820 map.

This fixes errors such as:

(XEN) page_alloc.c:1148:d0 Over-allocation for domain 0: 2097153 > 2097152
(XEN) memory.c:133:d0 Could not allocate order=0 extent: id=0 memflags=0 (0 of 17)

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>


# 31b3c9d7 20-Mar-2012 Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

xen/x86: Implement x86_apic_ops

Or rather just implement one different function as opposed
to the native one : the read function.

We synthesize the values.

Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
[v1: Rebased on top of tip/x86/urgent]
[v2: Return 0xfd instead of 0xff in the default case]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>


# c2419b4a 31-May-2011 Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

xen: allow enable use of VGA console on dom0

Get the information about the VGA console hardware from Xen, and put
it into the form the bootloader normally generates, so that the rest
of the kernel can deal with VGA as usual.

[ Impact: make VGA console work in dom0 ]

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
[v1: Rebased on 2.6.39]
[v2: Removed incorrect comments and fixed compile warnings]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>


# ad7ba09e 04-May-2011 Daniel Kiper <dkiper@net-space.pl>

arch/x86/xen/xen-ops: Cleanup code/data sections definitions

Cleanup code/data sections definitions
accordingly to include/linux/init.h.

Signed-off-by: Daniel Kiper <dkiper@net-space.pl>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>


# 99bbb3a8 02-Dec-2010 Stefano Stabellini <stefano.stabellini@eu.citrix.com>

xen: PV on HVM: support PV spinlocks and IPIs

Initialize PV spinlocks on boot CPU right after native_smp_prepare_cpus
(that switch to APIC mode and initialize APIC routing); on secondary
CPUs on CPU_UP_PREPARE.

Enable the usage of event channels to send and receive IPIs when
running as a PV on HVM guest.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>


# 512b109e 01-Dec-2010 Stefano Stabellini <stefano.stabellini@eu.citrix.com>

xen: unplug the emulated devices at resume time

Early after being resumed we need to unplug again the emulated devices.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>


# 41f2e477 30-Mar-2010 Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

xen: add support for PAT

Convert Linux PAT entries into Xen ones when constructing ptes. Linux
doesn't use _PAGE_PAT for ptes, so the only difference in the first 4
entries is that Linux uses _PAGE_PWT for WC, whereas Xen (and default)
use it for WT.

xen_pte_val does the inverse conversion.

We hard-code assumptions about Linux's current PAT layout, but a
warning on the wrmsr to MSR_IA32_CR_PAT should point out any problems.
If necessary we could go to a more general table-based conversion between
Linux and Xen PAT entries.

hugetlbfs poses a problem at the moment, the x86 architecture uses the
same flag for _PAGE_PAT and _PAGE_PSE, which changes meaning depending
on which pagetable level we're using. At the moment this should be OK
so long as nobody tries to do a pte_val on a hugetlbfs pte.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>


# 2f7acb20 15-Sep-2010 Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

xen: make sure xen_max_p2m_pfn is up to date

Keep xen_max_p2m_pfn up to date with the end of the extra memory
we're adding. It is possible that it will be too high since memory
may be truncated by a "mem=" option on the kernel command line, but
that won't matter.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>


# f09f6d19 15-Jul-2010 Donald Dutile <ddutile@redhat.com>

Xen: register panic notifier to take crashes of xen guests on panic

Register a panic notifier so that when the guest crashes it can shut
down the domain and indicate it was a crash to the host.

Signed-off-by: Donald Dutile <ddutile@redhat.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>


# c1c5413a 13-May-2010 Stefano Stabellini <stefano.stabellini@eu.citrix.com>

x86: Unplug emulated disks and nics.

Add a xen_emul_unplug command line option to the kernel to unplug
xen emulated disks and nics.

Set the default value of xen_emul_unplug depending on whether or
not the Xen PV frontends and the Xen platform PCI driver have
been compiled for this kernel (modules or built-in are both OK).

The user can specify xen_emul_unplug=ignore to enable PV drivers on HVM
even if the host platform doesn't support unplug.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>


# 409771d2 13-May-2010 Stefano Stabellini <stefano.stabellini@eu.citrix.com>

x86: Use xen_vcpuop_clockevent, xen_clocksource and xen wallclock.

Use xen_vcpuop_clockevent instead of hpet and APIC timers as main
clockevent device on all vcpus, use the xen wallclock time as wallclock
instead of rtc and use xen_clocksource as clocksource.
The pv clock algorithm needs to work correctly for the xen_clocksource
and xen wallclock to be usable, only modern Xen versions offer a
reliable pv clock in HVM guests (XENFEAT_hvm_safe_pvclock).

Using the hpet as clocksource means a VMEXIT every time we read/write to
the hpet mmio addresses, pvclock give us a better rating without
VMEXITs. Same goes for the xen wallclock and xen_vcpuop_clockevent

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Don Dutile <ddutile@redhat.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>


# 016b6f5f 13-May-2010 Stefano Stabellini <stefano.stabellini@eu.citrix.com>

xen: Add suspend/resume support for PV on HVM guests.

Suspend/resume requires few different things on HVM: the suspend
hypercall is different; we don't need to save/restore memory related
settings; except the shared info page and the callback mechanism.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>


# 38e20b07 13-May-2010 Sheng Yang <sheng@linux.intel.com>

x86/xen: event channels delivery on HVM.

Set the callback to receive evtchns from Xen, using the
callback vector delivery mechanism.

The traditional way for receiving event channel notifications from Xen
is via the interrupts from the platform PCI device.
The callback vector is a newer alternative that allow us to receive
notifications on any vcpu and doesn't need any PCI support: we allocate
a vector exclusively to receive events, in the vector handler we don't
need to interact with the vlapic, therefore we avoid a VMEXIT.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>


# fa24ba62 21-Nov-2009 Ian Campbell <ian.campbell@citrix.com>

xen: correctly restore pfn_to_mfn_list_list after resume

pvops kernels >= 2.6.30 can currently only be saved and restored once. The
second attempt to save results in:

ERROR Internal error: Frame# in pfn-to-mfn frame list is not in pseudophys
ERROR Internal error: entry 0: p2m_frame_list[0] is 0xf2c2c2c2, max 0x120000
ERROR Internal error: Failed to map/save the p2m frame list

I finally narrowed it down to:

commit cdaead6b4e657f960d6d6f9f380e7dfeedc6a09b
Author: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Date: Fri Feb 27 15:34:59 2009 -0800

xen: split construction of p2m mfn tables from registration

Build the p2m_mfn_list_list early with the rest of the p2m table, but
register it later when the real shared_info structure is in place.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

The unforeseen side-effect of this change was to cause the mfn list list to not
be rebuilt on resume. Prior to this change it would have been rebuilt via
xen_post_suspend() -> xen_setup_shared_info() -> xen_setup_mfn_list_list().

Fix by explicitly calling xen_build_mfn_list_list() from xen_post_suspend().

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stable Kernel <stable@kernel.org>


# be012920 20-Nov-2009 Ian Campbell <Ian.Campbell@citrix.com>

xen: re-register runstate area earlier on resume.

This is necessary to ensure the runstate area is available to
xen_sched_clock before any calls to printk which will require it in
order to provide a timestamp.

I chose to pull the xen_setup_runstate_info out of xen_time_init into
the caller in order to maintain parity with calling
xen_setup_runstate_info separately from calling xen_time_resume.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stable Kernel <stable@kernel.org>


# f1d7062a 20-Aug-2009 Thomas Gleixner <tglx@linutronix.de>

x86: Move xen_post_allocator_init into xen_pagetable_setup_done

We really do not need two paravirt/x86_init_ops functions which are
called in two consecutive source lines. Move the only user of
post_allocator_init into the already existing pagetable_setup_done
function.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>


# b4ecc126 13-May-2009 Jeremy Fitzhardinge <jeremy@goop.org>

x86: Fix performance regression caused by paravirt_ops on native kernels

Xiaohui Xin and some other folks at Intel have been looking into what's
behind the performance hit of paravirt_ops when running native.

It appears that the hit is entirely due to the paravirtualized
spinlocks introduced by:

| commit 8efcbab674de2bee45a2e4cdf97de16b8e609ac8
| Date: Mon Jul 7 12:07:51 2008 -0700
|
| paravirt: introduce a "lock-byte" spinlock implementation

The extra call/return in the spinlock path is somehow
causing an increase in the cycles/instruction of somewhere around 2-7%
(seems to vary quite a lot from test to test). The working theory is
that the CPU's pipeline is getting upset about the
call->call->locked-op->return->return, and seems to be failing to
speculate (though I haven't seen anything definitive about the precise
reasons). This doesn't entirely make sense, because the performance
hit is also visible on unlock and other operations which don't involve
locked instructions. But spinlock operations clearly swamp all the
other pvops operations, even though I can't imagine that they're
nearly as common (there's only a .05% increase in instructions
executed).

If I disable just the pv-spinlock calls, my tests show that pvops is
identical to non-pvops performance on native (my measurements show that
it is actually about .1% faster, but Xiaohui shows a .05% slowdown).

Summary of results, averaging 10 runs of the "mmperf" test, using a
no-pvops build as baseline:

nopv Pv-nospin Pv-spin
CPU cycles 100.00% 99.89% 102.18%
instructions 100.00% 100.10% 100.15%
CPI 100.00% 99.79% 102.03%
cache ref 100.00% 100.84% 100.28%
cache miss 100.00% 90.47% 88.56%
cache miss rate 100.00% 89.72% 88.31%
branches 100.00% 99.93% 100.04%
branch miss 100.00% 103.66% 107.72%
branch miss rt 100.00% 103.73% 107.67%
wallclock 100.00% 99.90% 102.20%

The clear effect here is that the 2% increase in CPI is
directly reflected in the final wallclock time.

(The other interesting effect is that the more ops are
out of line calls via pvops, the lower the cache access
and miss rates. Not too surprising, but it suggests that
the non-pvops kernel is over-inlined. On the flipside,
the branch misses go up correspondingly...)

So, what's the fix?

Paravirt patching turns all the pvops calls into direct calls, so
_spin_lock etc do end up having direct calls. For example, the compiler
generated code for paravirtualized _spin_lock is:

<_spin_lock+0>: mov %gs:0xb4c8,%rax
<_spin_lock+9>: incl 0xffffffffffffe044(%rax)
<_spin_lock+15>: callq *0xffffffff805a5b30
<_spin_lock+22>: retq

The indirect call will get patched to:
<_spin_lock+0>: mov %gs:0xb4c8,%rax
<_spin_lock+9>: incl 0xffffffffffffe044(%rax)
<_spin_lock+15>: callq <__ticket_spin_lock>
<_spin_lock+20>: nop; nop /* or whatever 2-byte nop */
<_spin_lock+22>: retq

One possibility is to inline _spin_lock, etc, when building an
optimised kernel (ie, when there's no spinlock/preempt
instrumentation/debugging enabled). That will remove the outer
call/return pair, returning the instruction stream to a single
call/return, which will presumably execute the same as the non-pvops
case. The downsides arel 1) it will replicate the
preempt_disable/enable code at eack lock/unlock callsite; this code is
fairly small, but not nothing; and 2) the spinlock definitions are
already a very heavily tangled mass of #ifdefs and other preprocessor
magic, and making any changes will be non-trivial.

The other obvious answer is to disable pv-spinlocks. Making them a
separate config option is fairly easy, and it would be trivial to
enable them only when Xen is enabled (as the only non-default user).
But it doesn't really address the common case of a distro build which
is going to have Xen support enabled, and leaves the open question of
whether the native performance cost of pv-spinlocks is worth the
performance improvement on a loaded Xen system (10% saving of overall
system CPU when guests block rather than spin). Still it is a
reasonable short-term workaround.

[ Impact: fix pvops performance regression when running native ]

Analysed-by: "Xin Xiaohui" <xiaohui.xin@intel.com>
Analysed-by: "Li Xin" <xin.li@intel.com>
Analysed-by: "Nakajima Jun" <jun.nakajima@intel.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Xen-devel <xen-devel@lists.xensource.com>
LKML-Reference: <4A0B62F7.5030802@goop.org>
[ fixed the help text ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# b96229b5 17-Mar-2009 Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

xen/mmu: some early pagetable cleanups

1. make sure early-allocated ptes are pinned, so they can be later
unpinned
2. don't pin pmd+pud, just make them RO
3. scatter some __inits around

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>


# 4185f354 17-Mar-2009 Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

xen/mmu: some early pagetable cleanups

1. make sure early-allocated ptes are pinned, so they can be later
unpinned
2. don't pin pmd+pud, just make them RO
3. scatter some __inits around

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>


# b407fc57 18-Feb-2009 Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

x86/paravirt: flush pending mmu updates on context switch

Impact: allow preemption during lazy mmu updates

If we're in lazy mmu mode when context switching, leave
lazy mmu mode, but remember the task's state in
TIF_LAZY_MMU_UPDATES. When we resume the task, check this
flag and re-enter lazy mmu mode if its set.

This sets things up for allowing lazy mmu mode while preemptible,
though that won't actually be active until the next change.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>


# 38341432 02-Feb-2009 Jeremy Fitzhardinge <jeremy@goop.org>

xen: setup percpu data pointers

We need to access percpu data fairly early, so set up the percpu
registers as soon as possible. We only need to load the appropriate
segment register. We already have a GDT, but its hard to change it
early because we need to manipulate the pagetable to do so, and that
hasn't been set up yet.

Also, set the kernel stack when bringing up secondary CPUs. If we
don't they all end up sharing the same stack...

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>


# 319f3ba5 28-Jan-2009 Jeremy Fitzhardinge <jeremy@goop.org>

xen: move remaining mmu-related stuff into mmu.c

Impact: Cleanup

Move remaining mmu-related stuff into mmu.c.
A general cleanup, and lay the groundwork for later patches.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>


# b78936e1 16-Dec-2008 Mike Travis <travis@sgi.com>

xen: convert to cpumask_var_t and new cpumask primitives.

Simple change, and eventual space saving when NR_CPUS >> nr_cpu_ids.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>


# 37af46ef 22-Nov-2008 Al Viro <viro@ftp.linux.org.uk>

xen_setup_vcpu_info_placement() is not init on x86

... so get xen-ops.h in agreement with xen/smp.c

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 26fd1051 08-Sep-2008 Alex Nixon <alex.nixon@citrix.com>

xen: make CPU hotplug functions static

There's no need for these functions to be accessed from outside of xen/smp.c

Signed-off-by: Alex Nixon <alex.nixon@citrix.com>
Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# d68d82af 22-Aug-2008 Alex Nixon <alex.nixon@citrix.com>

xen: implement CPU hotplugging

Note the changes from 2.6.18-xen CPU hotplugging:

A vcpu_down request from the remote admin via Xenbus both hotunplugs the
CPU, and disables it by removing it from the cpu_present map, and removing
its entry in /sys.

A vcpu_up request from the remote admin only re-enables the CPU, and does
not immediately bring the CPU up. A udev event is emitted, which can be
caught by the user if he wishes to automatically re-up CPUs when available,
or implement a more complex policy.

Signed-off-by: Alex Nixon <alex.nixon@citrix.com>
Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# ee7686bc 21-Aug-2008 Jeremy Fitzhardinge <jeremy@goop.org>

x86: build fix in "xen spinlock updates and performance measurements"

Ingo Molnar wrote:
> -tip testing found this build failure:
>
> arch/x86/xen/spinlock.c: In function ‘spin_time_start’:
> arch/x86/xen/spinlock.c:60: error: implicit declaration of function ‘xen_clocksource_read’
>
> i've excluded these new commits for now from tip/master - could you
> please send a delta fix against tip/x86/xen?

Make xen_clocksource_read non-static.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 0d1edf46 28-Jul-2008 Jeremy Fitzhardinge <jeremy@goop.org>

xen: compile irq functions without -pg for ftrace

For some reason I managed to miss a bunch of irq-related functions
which also need to be compiled without -pg when using ftrace. This
patch moves them into their own file, and starts a cleanup process
I've been meaning to do anyway.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: "Alex Nixon (Intern)" <Alex.Nixon@eu.citrix.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# d5de8841 23-Jul-2008 Jeremy Fitzhardinge <jeremy@goop.org>

x86: split spinlock implementations out into their own files

ftrace requires certain low-level code, like spinlocks and timestamps,
to be compiled without -pg in order to avoid infinite recursion. This
patch splits out the core paravirt spinlocks and the Xen spinlocks
into separate files which can be compiled without -pg.

Also do xen/time.c while we're about it. As a result, we can now use
ftrace within a Xen domain.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 6fcac6d3 08-Jul-2008 Jeremy Fitzhardinge <jeremy@goop.org>

xen64: set up syscall and sysenter entrypoints for 64-bit

We set up entrypoints for syscall and sysenter. sysenter is only used
for 32-bit compat processes, whereas syscall can be used in by both 32
and 64-bit processes.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 997409d3 08-Jul-2008 Jeremy Fitzhardinge <jeremy@goop.org>

xen64: deal with extra words Xen pushes onto exception frames

Xen pushes two extra words containing the values of rcx and r11. This
pvop hook copies the words back into their appropriate registers, and
cleans them off the stack. This leaves the stack in native form, so
the normal handler can run unchanged.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# c7b75947 08-Jul-2008 Jeremy Fitzhardinge <jeremy@goop.org>

xen64: smp.c compile hacking

A number of random changes to make xen/smp.c compile in 64-bit mode.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>a
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# a9e7062d 08-Jul-2008 Jeremy Fitzhardinge <jeremy@goop.org>

xen: move smp setup into smp.c

Move all the smp_ops setup into smp.c, allowing a lot of things to
become static.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# ad55db9f 08-Jul-2008 Isaku Yamahata <yamahata@valinux.co.jp>

xen: add xen_arch_resume()/xen_timer_resume hook for ia64 support

add xen_timer_resume() hook.

Timer resume should be done after event channel is resumed.
add xen_arch_resume() hook when ipi becomes usable after resume.
After resume, some cpu specific resource must be reinitialized
on ia64 that can't be set by another cpu.

However available hooks is run once on only one cpu so that ipi has
to be used.

During stop_machine_run() ipi can't be used because interrupt is masked.
So add another hook after stop_machine_run().
Another approach might be use resume hook which is run by
device_resume(). However device_resume() may be executed on
suspend error recovery path.

So it is necessary to determine whether it is executed on real resume path
or error recovery path.

Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# e93ef949 01-Jul-2008 Alok Kataria <akataria@vmware.com>

x86: rename paravirtualized TSC functions

Rename the paravirtualized calculate_cpu_khz to calibrate_tsc.
In all cases, we actually calibrate_tsc and use that as the cpu_khz value.

Signed-off-by: Alok N Kataria <akataria@vmware.com>
Signed-off-by: Dan Hecht <dhecht@vmware.com>
Cc: Dan Hecht <dhecht@vmware.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 3b16cf87 26-Jun-2008 Jens Axboe <jens.axboe@oracle.com>

x86: convert to generic helpers for IPI function calls

This converts x86, x86-64, and xen to use the new helpers for
smp_call_function() and friends, and adds support for
smp_call_function_single().

Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>


# d07af1f0 30-May-2008 Jeremy Fitzhardinge <jeremy@goop.org>

xen: resume timers on all vcpus

On resume, the vcpu timer modes will not be restored. The timer
infrastructure doesn't do this for us, since it assumes the cpus
are offline. We can just poke the other vcpus into the right mode
directly though.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 9c7a7942 30-May-2008 Jeremy Fitzhardinge <jeremy@goop.org>

xen: restore vcpu_info mapping

If we're using vcpu_info mapping, then make sure its restored on all
processors before relasing them from stop_machine.

The only complication is that if this fails, we can't continue because
we've already made assumptions that the mapping is available (baked in
calls to the _direct versions of the functions, for example).

Fortunately this can only happen with a 32-bit hypervisor, which may
possibly run out of mapping space. On a 64-bit hypervisor, this is a
non-issue.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 0e91398f 26-May-2008 Jeremy Fitzhardinge <jeremy@goop.org>

xen: implement save/restore

This patch implements Xen save/restore and migration.

Saving is triggered via xenbus, which is polled in
drivers/xen/manage.c. When a suspend request comes in, the kernel
prepares itself for saving by:

1 - Freeze all processes. This is primarily to prevent any
partially-completed pagetable updates from confusing the suspend
process. If CONFIG_PREEMPT isn't defined, then this isn't necessary.

2 - Suspend xenbus and other devices

3 - Stop_machine, to make sure all the other vcpus are quiescent. The
Xen tools require the domain to run its save off vcpu0.

4 - Within the stop_machine state, it pins any unpinned pgds (under
construction or destruction), performs canonicalizes various other
pieces of state (mostly converting mfns to pfns), and finally

5 - Suspend the domain

Restore reverses the steps used to save the domain, ending when all
the frozen processes are thawed.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>


# d5edbc1f 26-May-2008 Jeremy Fitzhardinge <jeremy@goop.org>

xen: add p2m mfn_list_list

When saving a domain, the Xen tools need to remap all our mfns to
portable pfns. In order to remap our p2m table, it needs to know
where all its pages are, so maintain the references to the p2m table
for it to use.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>


# a0d695c8 26-May-2008 Jeremy Fitzhardinge <jeremy@goop.org>

xen: make dummy_shared_info non-static

Rename dummy_shared_info to xen_dummy_shared_info and make it
non-static, in anticipation of users outside of enlighten.c

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>


# d451bb7a 26-May-2008 Jeremy Fitzhardinge <jeremy@goop.org>

xen: make phys_to_machine structure dynamic

We now support the use of memory hotplug, so the physical to machine
page mapping structure must be dynamic. This is implemented as a
two-level radix tree structure, which allows us to efficiently
incrementally allocate memory for the p2m table as new pages are
added.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>


# e04d0d07 02-Apr-2008 Isaku Yamahata <yamahata@valinux.co.jp>

xen: move events.c to drivers/xen for IA64/Xen support

move arch/x86/xen/events.c undedr drivers/xen to share codes
with x86 and ia64. And minor adjustment to compile.
ia64/xen also uses events.c

Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>


# ee523ca1 17-Mar-2008 Jeremy Fitzhardinge <jeremy@goop.org>

xen: implement a debug-interrupt handler

Xen supports the notion of a debug interrupt which can be triggered
from the console. For now this is implemented to show pending events,
masks and each CPU's pending event set.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>


# e2a81baf 17-Mar-2008 Jeremy Fitzhardinge <jeremy@goop.org>

xen: support sysenter/sysexit if hypervisor does

64-bit Xen supports sysenter for 32-bit guests, so support its
use. (sysenter is faster than int $0x80 in 32-on-64.)

sysexit is still not supported, so we fake it up using iret.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>


# 81e103f1 17-Apr-2008 Jeremy Fitzhardinge <jeremy@xensource.com>

xen: use iret instruction all the time

Change iret implementation to not be dependent on direct-access vcpu
structure.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 9f79991d 16-Oct-2007 Jeremy Fitzhardinge <jeremy@xensource.com>

xen: deal with stale cr3 values when unpinning pagetables

When a pagetable is no longer in use, it must be unpinned so that its
pages can be freed. However, this is only possible if there are no
stray uses of the pagetable. The code currently deals with all the
usual cases, but there's a rare case where a vcpu is changing cr3, but
is doing so lazily, and the change hasn't actually happened by the time
the pagetable is unpinned, even though it appears to have been completed.

This change adds a second per-cpu cr3 variable - xen_current_cr3 -
which tracks the actual state of the vcpu cr3. It is only updated once
the actual hypercall to set cr3 has been completed. Other processors
wishing to unpin a pagetable can check other vcpu's xen_current_cr3
values to see if any cross-cpu IPIs are needed to clean things up.

[ Stable folks: 2.6.23 bugfix ]

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Stable Kernel <stable@kernel.org>


# f0d73394 16-Oct-2007 Jeremy Fitzhardinge <jeremy@xensource.com>

xen: yield to IPI target if necessary

When sending a call-function IPI to a vcpu, yield if the vcpu isn't
running.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>


# 8965c1c0 16-Oct-2007 Jeremy Fitzhardinge <jeremy@xensource.com>

paravirt: clean up lazy mode handling

Currently, the set_lazy_mode pv_op is overloaded with 5 functions:
1. enter lazy cpu mode
2. leave lazy cpu mode
3. enter lazy mmu mode
4. leave lazy mmu mode
5. flush pending batched operations

This complicates each paravirt backend, since it needs to deal with
all the possible state transitions, handling flushing, etc. In
particular, flushing is quite distinct from the other 4 functions, and
seems to just cause complication.

This patch removes the set_lazy_mode operation, and adds "enter" and
"leave" lazy mode operations on mmu_ops and cpu_ops. All the logic
associated with enter and leaving lazy states is now in common code
(basically BUG_ONs to make sure that no mode is current when entering
a lazy mode, and make sure that the mode is current when leaving).
Also, flush is handled in a common way, by simply leaving and
re-entering the lazy mode.

The result is that the Xen, lguest and VMI lazy mode implementations
are much simpler.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Zach Amsden <zach@vmware.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Avi Kivity <avi@qumranet.com>
Cc: Anthony Liguory <aliguori@us.ibm.com>
Cc: "Glauber de Oliveira Costa" <glommer@gmail.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>


# 9702785a 11-Oct-2007 Thomas Gleixner <tglx@linutronix.de>

i386: move xen

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>