History log of /linux-master/arch/um/include/shared/os.h
Revision Date Author Comments
# 1ca14435 09-Nov-2023 Benjamin Berg <benjamin@sipsolutions.net>

um: Rely on PTRACE_SETREGSET to set FS/GS base registers

These registers are saved/restored together with the other general
registers using ptrace. In arch_set_tls we then just need to set the
register and it will be synced back normally.

Most of this logic was introduced in commit f355559cf7845 ("[PATCH] uml:
x86_64 thread fixes"). However, at least today we can rely on ptrace to
restore the base registers for us. As such, only the part of the patch
that tracks the FS register for use as thread local storage is actually
needed.

Signed-off-by: Benjamin Berg <benjamin@sipsolutions.net>
Signed-off-by: Richard Weinberger <richard@nod.at>


# 0b9ba613 12-Jul-2022 Jason A. Donenfeld <Jason@zx2c4.com>

um: seed rng using host OS rng

UML generally does not provide access to special CPU instructions like
RDRAND, and execution tends to be rather deterministic, with no real
hardware interrupts, making good randomness really very hard, if not
all together impossible. Not only is this a security eyebrow raiser, but
it's also quite annoying when trying to do various pieces of UML-based
automation that takes a long time to boot, if ever.

Fix this by trivially calling getrandom() in the host and using that
seed as "bootloader randomness", which initializes the rng immediately
at UML boot.

The old behavior can be restored the same way as on any other arch, by
way of CONFIG_TRUST_BOOTLOADER_RANDOMNESS=n or
random.trust_bootloader=0. So seen from that perspective, this just
makes UML act like other archs, which is positive in its own right.

Additionally, wire up arch_get_random_{int,long}() in the same way, so
that reseeds can also make use of the host RNG, controllable by
CONFIG_TRUST_CPU_RANDOMNESS and random.trust_cpu, per usual.

Cc: stable@vger.kernel.org
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>


# d2a0a616 25-Jan-2022 Frédéric Danis <frederic.danis@collabora.com>

um: Fix WRITE_ZEROES in the UBD Driver

Call to fallocate with FALLOC_FL_PUNCH_HOLE on a device backed by a sparse
file can end up by missing data, zeroes data range, if the underlying file
is used with a tool like bmaptool which will referenced only used spaces.

Signed-off-by: Frédéric Danis <frederic.danis@collabora.com>
Acked-by: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>


# 021fdaef 20-Sep-2021 Al Viro <viro@zeniv.linux.org.uk>

um: header debriding - os.h

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>


# bbe33504 31-Aug-2021 Johannes Berg <johannes.berg@intel.com>

um: rename set_signals() to um_set_signals()

Rename set_signals() as there's at least one driver that
uses the same name and can now be built on UM due to PCI
support, and thus we can get symbol conflicts.

Also rename set_signals_trace() to be consistent.

Reported-by: kernel test robot <lkp@intel.com>
Fixes: 68f5d3f3b654 ("um: add PCI over virtio emulation driver")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>


# 39f75da7 02-Aug-2021 Alexey Dobriyan <adobriyan@gmail.com>

isystem: trim/fixup stdarg.h and other headers

Delete/fixup few includes in anticipation of global -isystem compile
option removal.

Note: crypto/aegis128-neon-inner.c keeps <stddef.h> due to redefinition
of uintptr_t error (one definition comes from <stddef.h>, another from
<linux/types.h>).

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>


# 80f97331 30-May-2021 Shaokun Zhang <zhangshaokun@hisilicon.com>

um: Remove the repeated declaration

Function 'os_flush_stdout' is declared twice, so remove the
repeated declaration.

Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Richard Weinberger <richard@nod.at>


# d8fb32f4 12-Mar-2021 Anton Ivanov <anton.ivanov@cambridgegreys.com>

um: Add support for host CPU flags and alignment

1. Reflect host cpu flags into the UML instance so they can
be used to select the correct implementations for xor, crypto, etc.

2. Reflect host cache alignment into UML instance. This is
important when running 32 bit on a 64 bit host as 32 bit by
default aligns to 32 while the actual alignment should be 64.
Ditto for some Xeons which align at 128.

Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>


# d6b399a0 05-Mar-2021 Johannes Berg <johannes.berg@intel.com>

um: time-travel/signals: fix ndelay() in interrupt

We should be able to ndelay() from any context, even from an
interrupt context! However, this is broken (not functionally,
but locking-wise) in time-travel because we'll get into the
time-travel code and enable interrupts to handle messages on
other time-travel aware subsystems (only virtio for now).

Luckily, I've already reworked the time-travel aware signal
(interrupt) delivery for suspend/resume to have a time travel
handler, which runs directly in the context of the signal and
not from the Linux interrupt.

In order to fix this time-travel issue then, we need to do a
few things:

1) rework the signal handling code to call time-travel handlers
(only) if interrupts are disabled but signals aren't blocked,
instead of marking it only pending there. This is needed to
not deadlock other communication.
2) rework time-travel to not enable interrupts while it's
waiting for a message;
3) rework time-travel to not (just) disable interrupts but
rather block signals at a lower level while it needs them
disabled for communicating with the controller.

Finally, since now we can actually spend even virtual time
in interrupts-disabled sections, the delay warning when we
deliver a time-travel delayed interrupt is no longer valid,
things can (and should) now get delayed.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>


# fbb42e7f 05-Mar-2021 Johannes Berg <johannes.berg@intel.com>

um: export signals_enabled directly

Use signals_enabled instead of always jumping through
a function call to read it, there's not much point in
that.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>


# 11385539 13-Dec-2020 Johannes Berg <johannes.berg@intel.com>

um: time-travel: Correct time event IRQ delivery

Lockdep (on 5.10-rc) points out that we're delivering IRQs while IRQs
are not even enabled, which clearly shouldn't happen. Defer the time
event IRQ delivery until they actually are enabled.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>


# cae20ba0 11-Dec-2020 Johannes Berg <johannes.berg@intel.com>

um: irq/sigio: Support suspend/resume handling of workaround IRQs

If the sigio workaround needed to be applied to a file descriptor,
set_irq_wake() wouldn't work for it since it would get polled by
the thread instead of causing SIGIO, and thus could never really
cause a wakeup, since the thread notification FD wasn't marked as
being able to wake up the system.

Fix this by marking the thread's notification FD explicitly as a
wake source FD, i.e. not suppressing SIGIO for it in suspend. In
order to not cause spurious wakeups, we then need to remove all
FDs that shouldn't wake up the system from the polling thread. In
order to do this, add unlocked versions of ignore_sigio_fd() and
add_sigio_fd() (nothing else is happening in suspend, so this is
fine), and also modify ignore_sigio_fd() to return -ENOENT if the
FD wasn't originally in there. This doesn't matter because nothing
else currently checks the return value, but the irq code needs to
know which ones to restore the workaround for.

All told, this lets us use a timerfd for the RTC clock in the next
patch, which doesn't send SIGIO.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>


# a374b7cb 02-Dec-2020 Johannes Berg <johannes.berg@intel.com>

um: Support suspend to RAM

With all the previous bits in place, we can now also support
suspend to RAM, in the sense that everything is suspended,
not just most, including userspace, processes like in s2idle.

Since um_idle_sleep() now waits forever, we can simply call
that to "suspend" the system.

As before, you can wake it up using SIGUSR1 since we're just
in a pause() call that only needs to return.

In order to implement selective resume from certain devices,
and not have any arbitrary device interrupt wake up, suspend
interrupts by removing SIGIO notification (O_ASYNC) from all
the FDs that are not supposed to wake up the system. However,
swap out the handler so we don't actually handle the SIGIO as
an interrupt.

Since we're in pause(), the mere act of receiving SIGIO wakes
us up, and then after things have been restored enough, re-set
O_ASYNC for all previously suspended FDs, reinstall the proper
SIGIO handler, and send SIGIO to self to process anything that
might now be pending.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>


# 92dcd3d3 02-Dec-2020 Johannes Berg <johannes.berg@intel.com>

um: Allow PM with suspend-to-idle

In order to be able to experiment with suspend in UML, add the
minimal work to be able to suspend (s2idle) an instance of UML,
and be able to wake it back up from that state with the USR1
signal sent to the main UML process.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>


# 49da38a3 02-Dec-2020 Johannes Berg <johannes.berg@intel.com>

um: Simplify os_idle_sleep() and sleep longer

There really is no reason to pass the amount of time we should
sleep, especially since it's just hard-coded to one second.

Additionally, one second isn't really all that long, and as we
are expecting to be woken up by a signal, we can sleep longer
and avoid doing some work every second, so replace the current
clock_nanosleep() with just an empty select() that can _only_
be woken up by a signal.

We can also remove the deliver_alarm() since we don't need to
do that when we got e.g. SIGIO that woke us up, and if we got
SIGALRM the signal handler will actually (have) run, so it's
just unnecessary extra work.

Similarly, in time-travel mode, just program the wakeup event
from idle to be S64_MAX, which is basically the most you could
ever simulate to. Of course, you should already have an event
in the list that's earlier and will cause a wakeup, normally
that's the regular timer interrupt, though in suspend it may
(later) also be an RTC event. Since actually getting to this
point would be a bug and you can't ever get out again, panic()
on it in the time control code.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>


# 2fccfcc0 01-Dec-2020 Johannes Berg <johannes.berg@intel.com>

um: Remove IRQ_NONE type

We don't actually use this in um_request_irq(), so it can
never be assigned. It's also not clear what that would be
useful for, so just remove it.

This results in quite a number of cleanups, all the way to
removing the "SIGIO on close" startup check, since the data
it assigns (pty_close_sigio) is not used anymore.

While at it, also make this an enum so we get a minimum of
type checking, and remove the IRQ_NONE hack in virtio since
we now no longer have the name twice.

Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>


# 88ce6424 13-Feb-2020 Johannes Berg <johannes.berg@intel.com>

um: Implement time-travel=ext

This implements synchronized time-travel mode which - using a special
application on a unix socket - lets multiple machines take part in a
time-travelling simulation together.

The protocol for the unix domain socket is defined in the new file
include/uapi/linux/um_timetravel.h.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>


# 853bc0ab 05-Nov-2019 Arnd Bergmann <arnd@arndb.de>

um: ubd: use 64-bit time_t where possible

The ubd code suffers from a possible y2038 overflow on 32-bit
architectures, both for the cow header and the os_file_modtime()
function.

Replace time_t with time64_t to extend the ubd_kern side as much
as possible.

Whether this makes a difference for the user side depends on
the host libc implementation that may use either 32-bit or 64-bit
time_t.

For the cow file format, the header contains an unsigned 32-bit
timestamp, which is good until y2106, passing this through a
'long long' gives us a consistent interpretation between 32-bit
and 64-bit um kernels.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>


# f2f4bf5a 25-Aug-2019 Alex Dewar <alex.dewar@gmx.co.uk>

um: Add SPDX headers for files in arch/um/include

Convert files to use SPDX header. All files are licensed under the GPLv2.

Signed-off-by: Alex Dewar <alex.dewar@gmx.co.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>


# 5d38f324 11-Sep-2019 Erel Geron <erelx.geron@intel.com>

um: drivers: Add virtio vhost-user driver

This module allows virtio devices to be used over a vhost-user socket.

Signed-off-by: Erel Geron <erelx.geron@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>


# 0dafcbe1 23-Aug-2019 Johannes Berg <johannes.berg@intel.com>

um: Implement TRACE_IRQFLAGS_SUPPORT

UML enables TRACE_IRQFLAGS_SUPPORT but doesn't actually implement
it. It seems to have been added for lockdep support, but that can't
actually really work well without IRQ flags tracing, as is also
very noisily reported when enabling CONFIG_DEBUG_LOCKDEP.

Implement it now.

Fixes: 711553efa5b8 ("[PATCH] uml: declare in Kconfig our partial LOCKDEP support")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>


# c7c6f3b9 27-May-2019 Johannes Berg <johannes.berg@intel.com>

um: Pass nsecs to os timer functions

This makes the code clearer and lets the time travel patch have
the actual time used for these functions in just one place.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>


# 56fc1870 06-May-2019 Johannes Berg <johannes.berg@intel.com>

um: Timer code cleanup

There are some unused functions, and some others that have
unused arguments; clean up the timer code a bit.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>


# fcd242c6 06-May-2019 Johannes Berg <johannes.berg@intel.com>

um: fix os_timer_one_shot()

os_timer_one_shot() gets passed a value "unsigned long delta",
so must not have an "int ticks" as that actually ends up being
-1, and thus triggering a timer over and over again.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>


# 50109b5a 14-Nov-2018 Anton Ivanov <anton.ivanov@cambridgegreys.com>

um: Add support for DISCARD in the UBD Driver

Support for DISCARD and WRITE_ZEROES in the ubd driver using
fallocate.

DISCARD is enabled by default and can be disabled using a new
UBD command line flag.

If the underlying fs on which the UBD image is stored does not
support DISCARD the support for both DISCARD and WRITE_ZEROES
is turned off.

Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>


# ff6a1798 20-Nov-2017 Anton Ivanov <anton.ivanov@cambridgegreys.com>

Epoll based IRQ controller

1. Removes the need to walk the IRQ/Device list to determine
who triggered the IRQ.
2. Improves scalability (up to several times performance
improvement for cases with 10s of devices).
3. Improves UML baseline IO performance for one disk + one NIC
use case by up to 10%.
4. Introduces write poll triggered IRQs.
5. Prerequisite for introducing high performance mmesg family
of functions in network IO.
6. Fixes RNG shutdown which was leaking a file descriptor

Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>


# 6f602afd 29-Jul-2017 Thomas Meyer <thomas@m3y3r.de>

um: Fix FP register size for XSTATE/XSAVE

Hard code max size. Taken from
https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gdb/common/x86-xstate.h

Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
Signed-off-by: Richard Weinberger <richard@nod.at>


# 721ccae8 17-May-2017 Masami Hiramatsu <mhiramat@kernel.org>

um: Add os_warn() for pre-boot warning/error messages

Add os_warn() for printing out pre-boot warning/error
messages in stderr. The messages via os_warn() are not
suppressed by quiet option.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Richard Weinberger <richard@nod.at>


# f7887ee1 17-May-2017 Masami Hiramatsu <mhiramat@kernel.org>

um: Add os_info() for pre-boot information messages

Add os_info() for printing out pre-boot information
level messages in stderr. The messages via os_info()
are suppressed by "quiet" kernel command line.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Richard Weinberger <richard@nod.at>


# 17a6e1b8 20-Mar-2017 Kyle Huey <me@kylehuey.com>

x86/arch_prctl/64: Rename do_arch_prctl() to do_arch_prctl_64()

In order to introduce new arch_prctls that are not 64 bit only, rename the
existing 64 bit implementation to do_arch_prctl_64(). Also rename the
second argument of that function from 'addr' to 'arg2', because it will no
longer always be an address.

Signed-off-by: Kyle Huey <khuey@kylehuey.com>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Cc: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com>
Cc: kvm@vger.kernel.org
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: linux-kselftest@vger.kernel.org
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Robert O'Callahan <robert@ocallahan.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Len Brown <len.brown@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: user-mode-linux-devel@lists.sourceforge.net
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: user-mode-linux-user@lists.sourceforge.net
Cc: David Matlack <dmatlack@google.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Dmitry Safonov <dsafonov@virtuozzo.com>
Cc: linux-fsdevel@vger.kernel.org
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: http://lkml.kernel.org/r/20170320081628.18952-5-khuey@kylehuey.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>


# dd93938a 20-Mar-2017 Kyle Huey <me@kylehuey.com>

x86/arch_prctl: Rename 'code' argument to 'option'

The x86 specific arch_prctl() arbitrarily changed prctl's 'option' to
'code'. Before adding new options, rename it.

Signed-off-by: Kyle Huey <khuey@kylehuey.com>
Cc: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com>
Cc: kvm@vger.kernel.org
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: linux-kselftest@vger.kernel.org
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Robert O'Callahan <robert@ocallahan.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Len Brown <len.brown@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: user-mode-linux-devel@lists.sourceforge.net
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: user-mode-linux-user@lists.sourceforge.net
Cc: David Matlack <dmatlack@google.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Dmitry Safonov <dsafonov@virtuozzo.com>
Cc: linux-fsdevel@vger.kernel.org
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: http://lkml.kernel.org/r/20170320081628.18952-3-khuey@kylehuey.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>


# e04c989e 29-Dec-2015 Mickaël Salaün <mic@digikod.net>

um: Fix ptrace GETREGS/SETREGS bugs

This fix two related bugs:
* PTRACE_GETREGS doesn't get the right orig_ax (syscall) value
* PTRACE_SETREGS can't set the orig_ax value (erased by initial value)

Get rid of the now useless and error-prone get_syscall().

Fix inconsistent behavior in the ptrace implementation for i386 when
updating orig_eax automatically update the syscall number as well. This
is now updated in handle_syscall().

Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Will Drewry <wad@chromium.org>
Cc: Thomas Meyer <thomas@m3y3r.de>
Cc: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Cc: Anton Ivanov <aivanov@brocade.com>
Cc: Meredydd Luff <meredydd@senatehouse.org>
Cc: David Drysdale <drysdale@google.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Acked-by: Kees Cook <keescook@chromium.org>


# 8c6157b6 21-Dec-2015 Anton Ivanov <aivanov@brocade.com>

um: Update UBD to use pread/pwrite family of functions

This decreases the number of syscalls per read/write by half.

Signed-off-by: Anton Ivanov <aivanov@brocade.com>
Signed-off-by: Richard Weinberger <richard@nod.at>


# 2eb5f31b 02-Nov-2015 Anton Ivanov <aivanov@brocade.com>

um: Switch clocksource to hrtimers

UML is using an obsolete itimer call for
all timers and "polls" for kernel space timer firing
in its userspace portion resulting in a long list
of bugs and incorrect behaviour(s). It also uses
ITIMER_VIRTUAL for its timer which results in the
timer being dependent on it running and the cpu
load.

This patch fixes this by moving to posix high resolution
timers firing off CLOCK_MONOTONIC and relaying the timer
correctly to the UML userspace.

Fixes:
- crashes when hosts suspends/resumes
- broken userspace timers - effecive ~40Hz instead
of what they should be. Note - this modifies skas behavior
by no longer setting an itimer per clone(). Timer events
are relayed instead.
- kernel network packet scheduling disciplines
- tcp behaviour especially under load
- various timer related corner cases

Finally, overall responsiveness of userspace is better.

Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
Signed-off-by: Anton Ivanov <aivanov@brocade.com>
[rw: massaged commit message]
Signed-off-by: Richard Weinberger <richard@nod.at>


# 1d80f0cd 25-Oct-2015 Richard Weinberger <richard@nod.at>

um: Store syscall number after syscall_trace_enter()

To support changing syscall numbers we have to store
it after syscall_trace_enter().

Signed-off-by: Richard Weinberger <richard@nod.at>


# 89520d99 31-May-2015 Richard Weinberger <richard@nod.at>

um: Move syscall() declaration into os.h

Signed-off-by: Richard Weinberger <richard@nod.at>


# d0b5e15f 18-Mar-2015 Richard Weinberger <richard@nod.at>

um: Remove SKAS3/4 support

Before we had SKAS0 UML had two modes of operation
TT (tracing thread) and SKAS3/4 (separated kernel address space).
TT was known to be insecure and got removed a long time ago.
SKAS3/4 required a few (3 or 4) patches on the host side which never went
mainline. The last host patch is 10 years old.

With SKAS0 mode (separated kernel address space using 0 host patches),
default since 2005, SKAS3/4 is obsolete and can be removed.

Signed-off-by: Richard Weinberger <richard@nod.at>


# 0565103d 07-Mar-2014 Anton Ivanov <antivano@cisco.com>

um: Memory corruption on startup

The reverse case of this race (you must msync before read) is
well known. This is the not so common one.

It can be triggered only on systems which do a lot of task
switching and only at UML startup. If you are starting 200+ UMLs
~ 0.5% will always die without this fix.

Signed-off-by: Anton Ivanov <antivano@cisco.com>
[rw: minor whitespace fixes]
Signed-off-by: Richard Weinberger <richard@nod.at>


# f72c22e4 23-Sep-2013 Richard Weinberger <richard@nod.at>

um: Make stack trace reliable against kernel mode faults

As UML uses an alternative signal stack we cannot use
the current stack pointer for stack dumping if UML itself
dies by SIGSEGV. To bypass this issue we save regs taken
from mcontext in our segv handler into thread_struct and
use these regs to obtain the stack pointer in show_stack().

Signed-off-by: Richard Weinberger <richard@nod.at>


# 91d44ff8 18-Aug-2013 Richard Weinberger <richard@nod.at>

um: Cleanup SIGTERM handling

Richard reported that some UML processes survive if the UML
main process receives a SIGTERM.
This issue was caused by a wrongly placed signal(SIGTERM, SIG_DFL)
in init_new_thread_signals().
It disabled the UML exit handler accidently for some processes.
The correct solution is to disable the fatal handler for all
UML helper threads/processes.
Such that last_ditch_exit() does not get called multiple times
and all processes can exit due to SIGTERM.

Reported-and-tested-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Richard Weinberger <richard@nod.at>


# 805f11a0 18-Aug-2013 Richard Weinberger <richard@nod.at>

um: ubd: Add REQ_FLUSH suppport

UML's block device driver does not support write barriers,
to support this this patch adds REQ_FLUSH suppport.
Every time the block layer sends a REQ_FLUSH we fsync() now
our backing file to guarantee data consistency.

Reported-and-tested-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Richard Weinberger <richard@nod.at>


# f75b1b1b 17-Aug-2013 Richard Weinberger <richard@nod.at>

um: Implement probe_kernel_read()

UML needs it's own probe_kernel_read() to handle kernel
mode faults correctly.
The implementation uses mincore() on the host side to detect
whether a page is owned by the UML kernel process.

This fixes also a possible crash when sysrq-t is used.
Starting with 3.10 sysrq-t calls probe_kernel_read() to
read details from the kernel workers. As kernel worker are
completely async pointers may turn NULL while reading them.

Cc: <stian@nixia.no>
Cc: <tj@kernel.org>
Cc: <stable@vger.kernel.org> # 3.10.x
Signed-off-by: Richard Weinberger <richard@nod.at>


# 22e2430d 10-Oct-2012 Al Viro <viro@zeniv.linux.org.uk>

x86, um: convert to saner kernel_execve() semantics

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# 37185b33 07-Oct-2012 Al Viro <viro@ZenIV.linux.org.uk>

um: get rid of pointless include "..." where include <...> will do

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>


# 1bbd5f21 18-Aug-2011 Al Viro <viro@ftp.linux.org.uk>

um: merge os-Linux/tls.c into arch/x86/um/os-Linux/tls.c

it's i386-specific; moreover, analogs on other targets have
incompatible interface - PTRACE_GET_THREAD_AREA does exist
elsewhere, but struct user_desc does *not*

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>


# 00361683 18-Aug-2011 Al Viro <viro@ftp.linux.org.uk>

um: fill the handlers array at build time

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>


# e87df986 18-Aug-2011 Al Viro <viro@ftp.linux.org.uk>

um: simplify set_handler()

For one thing, we always block the same signals (IRQ ones - IO, WINCH, VTALRM),
so there's no need to pass sa_mask elements in arguments. For another, the
flags depend only on whether it's an IRQ signal or not (we add SA_RESTART
for them).

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>


# 5d40de0f 18-Aug-2011 Al Viro <viro@ftp.linux.org.uk>

um: kill dead code around uaccess

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>


# d634f194 24-May-2011 Richard Weinberger <richard@nod.at>

um: add earlyprintk support

User Mode Linux can also benefit from earlyprintk. UML's earlyprintk
writes kernel messages directly to stdout.

Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 0ce451ac 24-May-2011 Richard Weinberger <richard@nod.at>

um: fix UML_LIB_PATH

UML_LIB_PATH is hardcoded to /usr/lib/uml/, on 64bit systems UML_LIB_PATH
needs to be /usr/lib64/uml/.

Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 005a59ec 20-Apr-2009 Al Viro <viro@zeniv.linux.org.uk>

Deal with missing exports for hostfs

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# ec82c32d 25-Aug-2008 Al Viro <viro@zeniv.linux.org.uk>

x86, um: get rid of arch/um/os symlink

we can get DEV_NULL defined for arch/um/drivers/null.c in less
convoluted ways, TYVM...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>


# 8569c914 17-Aug-2008 Al Viro <viro@zeniv.linux.org.uk>

x86, um: take arch/um/include/* out of the way

We can't just plop asm/* into it - userland helpers are built with it
in search path and seeing asm/* show up there suddenly would be a bad
idea.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>