History log of /linux-master/tools/testing/selftests/rcutorture/bin/kvm-remote.sh
Revision Date Author Comments
# 80021ffb 15-Jun-2023 Paul E. McKenney <paulmck@kernel.org>

torture: Make kvm-remote print diagnostics on initial ssh failure

Currently, if the initial ssh fails, kvm-remote.sh gives up, printing a
message saying so. But it would be nice to get a better idea as to why
ssh failed. This commit therefore dumps out ssh's exit code, stdout,
and stderr upon ssh failure for diagnostic purposes.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
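
The diagnostic dump described above can be sketched as follows. This is
an illustration only, not the code from the commit: the helper name
run_with_diagnostics, its variable names, and the message formats are
all made up for the example.

```shell
# Hypothetical helper: run a command, keeping its stdout and stderr in
# temporary files so that they can be dumped if the command fails.
run_with_diagnostics () {
	diagdir="$(mktemp -d "${TMPDIR:-/tmp}/ssh-diag.XXXXXX")" || return 1
	"$@" > "$diagdir/stdout" 2> "$diagdir/stderr"
	diagret=$?	# Collect the exit code before running anything else.
	if test "$diagret" -ne 0
	then
		echo "Command failed with exit code $diagret: $*"
		echo "----- stdout:"
		cat "$diagdir/stdout"
		echo "----- stderr:"
		cat "$diagdir/stderr"
	fi
	rm -rf "$diagdir"
	return "$diagret"
}

# Intended use (the hostname is a placeholder):
# run_with_diagnostics ssh -o BatchMode=yes remote-host true
```

Capturing both streams in files rather than in variables keeps them
separate and avoids any quoting surprises when they are printed.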


# c211ae9c 27-Aug-2022 Paul E. McKenney <paulmck@kernel.org>

torture: Use mktemp instead of guessing at unique names

This commit drags the rcutorture scripting kicking and screaming into the
twenty-first century by making use of the BSD-derived mktemp command to
create temporary files and directories. In happy contrast to many of its
ill-behaved predecessors, mktemp seems to actually work reasonably reliably!

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
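
For reference, a minimal sketch of mktemp use for both directories and
files. The "kvm-remote-demo" prefix and the make_scratch_dir helper are
invented for this example and do not appear in the scripts.

```shell
# Create a uniquely named scratch directory, printing its pathname.
# The "kvm-remote-demo" prefix is made up for this example.
make_scratch_dir () {
	mktemp -d "${TMPDIR:-/tmp}/kvm-remote-demo.XXXXXX"
}

T="$(make_scratch_dir)" || exit 1
scratchfile="$(mktemp "$T/results.XXXXXX")" || exit 1
echo "Scratch directory: $T"
echo "Scratch file: $scratchfile"
rm -rf "$T"		# Clean up once the scratch space is no longer needed.
```

The trailing run of X characters is replaced by mktemp with a unique
suffix, and the file or directory is created before its name is printed,
so two concurrent runs cannot end up sharing a pathname.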


# ab69d3c8 12-Apr-2022 Paul E. McKenney <paulmck@kernel.org>

torture: Make kvm-remote.sh announce which system is being waited on

If a remote system fails in certain ways, for example, if it is rebooted
without removing the contents of the /tmp directory, its remote.run file
never will be removed and the kvm-remote.sh script will loop waiting
forever. The manual workaround for this (hopefully!) rare event is to
manually remove the file, which will cause the results up to the reboot
to be collected and evaluated.

Unfortunately, to work out which system is holding things up, the user
must refer to the name of the last system whose results were collected,
then look up the name of the next system in sequence, then manually
remove the remote.run file. Even more unfortunately, this procedure can
be fooled in runs where each system handles more than one batch should
a given system take longer than expected, causing the systems to be
handled out of order.

This commit therefore causes kvm-remote.sh to print out the name of
the system it will wait on next, allowing the user to refer directly
to that name. Making the kvm-remote.sh script automatically handle
unscheduled termination of the qemu processes is left as future work.
Quite possibly deep future work.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>


# b20842ba 15-Feb-2022 Paul E. McKenney <paulmck@kernel.org>

torture: Use "-o Batchmode=yes" to disable ssh password requests

The torture.sh script normally runs unattended, so there is not much
point in the "ssh" command asking for a password. This commit therefore
adds the "-o Batchmode=yes" argument to each "ssh" command to cause it
to fail rather than ask for a password.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
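
A minimal sketch of the resulting invocation pattern. The ssh_batch
wrapper and the SSH override variable are hypothetical, introduced here
only so that the example can be exercised without a network.

```shell
# Hypothetical wrapper: never prompt for a password, fail instead.
# SSH may be set to substitute another command for testing.
ssh_batch () {
	${SSH:-ssh} -o BatchMode=yes "$@"
}

# Intended use (the hostname is a placeholder):
# ssh_batch remote-host true || echo "ssh to remote-host failed"
```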


# ab3ecd0b 15-Feb-2022 Paul E. McKenney <paulmck@kernel.org>

torture: Reposition so that $? collects ssh code in torture.sh

An "echo" slipped in between an "ssh" and the "ret=$?" that was intended
to collect its exit code, which prevents torture.sh from detecting
"ssh" failure. This commit therefore reassociates the two.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
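
The failure mode is easy to reproduce in isolation. In this sketch,
"false" stands in for a failing ssh command, and the variable names are
invented for the example.

```shell
# Broken: $? reflects the "echo", which succeeds, so the failure is lost.
false
echo "command finished"
ret_broken=$?

# Fixed: collect the exit code immediately after the command.
false
ret_fixed=$?
echo "command finished"

echo "broken: $ret_broken, fixed: $ret_fixed"
```

The final line prints "broken: 0, fixed: 1", because $? always holds the
exit code of the most recently executed command.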


# a7d89cfb 25-Jan-2022 Paul E. McKenney <paulmck@kernel.org>

torture: Change KVM environment variable to RCUTORTURE

The torture-test scripting's long-standing use of KVM as the environment
variable tracking the pathname of the rcutorture directory now conflicts
with allmodconfig builds due to the virt/kvm/Makefile.kvm file's use
of this as a makefile variable. This commit therefore changes the
torture-test scripting from KVM to RCUTORTURE, avoiding the name conflict.

Reported-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
Tested-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>


# 2bc9062e 20-Dec-2021 Paul E. McKenney <paulmck@kernel.org>

torture: Make kvm-remote.sh try multiple times to download tarball

This commit ups the retries for downloading the build-product tarball
to a given remote system from once to five times, the better to handle
transient network failures.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
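
One way to structure such a bounded retry is sketched below. The retry
helper, its argument order, and its messages are assumptions made for
this example, not the loop used by kvm-remote.sh.

```shell
# Hypothetical helper: retry a command up to $1 times, sleeping $2
# seconds between attempts.  Returns 0 on the first success.
retry () {
	tries="$1"
	delay="$2"
	shift 2
	attempt=1
	while :
	do
		if "$@"
		then
			return 0
		fi
		if test "$attempt" -ge "$tries"
		then
			echo "Giving up after $attempt attempts: $*" 1>&2
			return 1
		fi
		echo "Attempt $attempt failed, retrying in $delay seconds: $*" 1>&2
		sleep "$delay"
		attempt=$((attempt + 1))
	done
}

# Intended use (the pathname and hostname are placeholders):
# retry 5 60 scp /tmp/build.tgz remote-host:/tmp/
```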


# 90b21bcf 30-Nov-2021 Paul E. McKenney <paulmck@kernel.org>

torture: Properly redirect kvm-remote.sh "echo" commands

The echo commands following initialization of the "oldrun" variable need
to be "tee"d to $oldrun/remote-log. This commit fixes several stragglers.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
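
The logging pattern in question looks roughly like this. The message
text is invented, and a temporary directory stands in for the real
$oldrun results directory.

```shell
# Stand-in for the results directory; the real script computes $oldrun.
oldrun="$(mktemp -d "${TMPDIR:-/tmp}/oldrun-demo.XXXXXX")" || exit 1

# Each message goes both to the terminal and to the log file.
echo "Build-products tarball prepared: $(date)" | tee -a "$oldrun/remote-log"
echo "Starting remote runs: $(date)" | tee -a "$oldrun/remote-log"

cat "$oldrun/remote-log"	# Both messages are now on file.
```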


# b6c9dbf0 30-Nov-2021 Paul E. McKenney <paulmck@kernel.org>

torture: Fix incorrectly redirected "exit" in kvm-remote.sh

The "exit 4" in kvm-remote.sh is pointlessly redirected, so this commit
removes the redirection.

Fixes: 0092eae4cb4e ("torture: Add kvm-remote.sh script for distributed rcutorture test runs")
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>


# f6153700 22-Nov-2021 Paul E. McKenney <paulmck@kernel.org>

torture: Retry download once before giving up

Currently, a transient network error can kill a run if it happens while
downloading the tarball to one of the target systems. This commit
therefore does a 60-second wait and then a retry. If further experience
indicates, a more elaborate mechanism might be used later.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>


# faaaf2ac 05-Aug-2021 Paul E. McKenney <paulmck@kernel.org>

torture: Make kvm-remote.sh print size of downloaded tarball

This commit causes kvm-remote.sh to print the size of the tarball that
is downloaded to each of the remote systems. This size can help with
performance projections and analysis.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>


# a5202e17 08-Jul-2021 Paul E. McKenney <paulmck@kernel.org>

torture: Make kvm-test-1-run-batch.sh select per-scenario affinity masks

This commit causes kvm-test-1-run-batch.sh to use the new
kvm-assign-cpus.sh and kvm-get-cpus-script.sh scripts to create a
TORTURE_AFFINITY environment variable containing either an empty string
(for no affinity) or a list of CPUs to pin the scenario's vCPUs to.
The additional change to kvm-test-1-run.sh places the per-scenario
number-of-CPUs information where it can easily be found.

If there is some reason why affinity cannot be supplied, this commit
prints and logs the reason via changes to kvm-again.sh.

Finally, this commit updates the kvm-remote.sh script to copy the
qemu-affinity output files back to the host system.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>


# 9e528a84 08-Jul-2021 Paul E. McKenney <paulmck@kernel.org>

torture: Consistently name "qemu*" test output files

There is "qemu-affinity", "qemu-cmd", "qemu-retval", but also "qemu_pid".
This is hard to remember, not so good for bash tab completion, and just
plain inconsistent. This commit therefore renames the "qemu_pid" file to
"qemu-pid". A couple of the scripts must deal with old runs, and thus
must handle both "qemu_pid" and "qemu-pid", but new runs will produce
"qemu-pid".

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>


# 5a2898f1 17-Jun-2021 Paul E. McKenney <paulmck@kernel.org>

torture: Protect kvm-remote.sh directory trees from /tmp reaping

The kvm-remote.sh script places the datestamped directory containing
all the build artifacts in the destination systems' /tmp directories,
where they accumulate runtime artifacts such as console.log. This works,
but some systems have a habit of removing files in /tmp that have not
been recently accessed. This commit therefore runs a simple script that
periodically accesses all files in the datestamped directory.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
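
A rough sketch of such an anti-reaping loop follows. The function names,
the 30-second default interval, and the use of "touch -a" to refresh
access times are assumptions made for this example rather than a copy of
the script added by the commit.

```shell
# Hypothetical helper: refresh the access time of every file under $1.
# "touch -a -c" updates only the access time and never creates files.
refresh_tree_once () {
	find "$1" -type f -exec touch -a -c {} +
}

# Keep refreshing every $2 seconds (default 30) until the tree goes away.
refresh_tree_forever () {
	while test -d "$1"
	do
		refresh_tree_once "$1" > /dev/null 2>&1
		sleep "${2:-30}"
	done
}

# Intended use, in the background (the pathname is a placeholder):
# refresh_tree_forever /tmp/2021.06.17-12.00.00-remote &
```

Ending the loop when the directory disappears means that the background
process cleans itself up once the run's directory tree has been removed.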


# 442f99af 15-Jun-2021 Paul E. McKenney <paulmck@kernel.org>

torture: Log more kvm-remote.sh information

This commit logs additional information to help track down set up and
networking issues.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>


# 3d78668e 27-Apr-2021 Paul E. McKenney <paulmck@kernel.org>

torture: Don't cap remote runs by build-system number of CPUs

Currently, if a torture scenario requires more CPUs than are present
on the build system, kvm.sh and friends limit the CPUs available to
that scenario. This makes total sense when the build system and the
system running the scenarios are one and the same, but not so much when
remote systems might well have more CPUs.

This commit therefore introduces a --remote flag to kvm.sh that suppresses
this CPU-limiting behavior, and causes kvm-remote.sh to use this flag.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>


# c43d3b00 27-Apr-2021 Paul E. McKenney <paulmck@kernel.org>

torture: Make kvm-remote.sh account for network failure in pathname checks

In a long-duration kvm-remote.sh run, almost all of the remote accesses will
be simple file-existence checks. These are thus the most likely to be caught
out by network failures, which do happen from time to time.

This commit therefore takes a first step towards tolerating temporary
network outages by making the file-existence checks repeat in the face of
such an outage. They also print a message every minute during an outage,
allowing the user to take appropriate action.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
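
The distinction between "file absent" and "network down" can be drawn
from ssh's exit code, which is 255 when ssh itself fails and otherwise
is that of the remote command. A minimal sketch under that assumption
follows; the remote_file_exists helper and the SSH and RETRY_DELAY
override variables are hypothetical.

```shell
# Hypothetical helper: return 0 if file $2 exists on system $1 and 1 if
# it does not, retrying for as long as ssh itself fails (exit code 255).
remote_file_exists () {
	rfe_system="$1"
	rfe_path="$2"
	while :
	do
		${SSH:-ssh} -o BatchMode=yes "$rfe_system" "test -f \"$rfe_path\""
		rfe_ret=$?
		if test "$rfe_ret" -ne 255
		then
			return "$rfe_ret"
		fi
		echo "$rfe_system: ssh failure, retrying in ${RETRY_DELAY:-60} seconds: $(date)"
		sleep "${RETRY_DELAY:-60}"
	done
}

# Intended use (the hostname and pathname are placeholders):
# while remote_file_exists remote-host /tmp/run/remote.run; do sleep 15; done
```

With the default 60-second delay, the retry message doubles as the
once-per-minute outage notification.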


# f8c8484d 01-Apr-2021 Frederic Weisbecker <frederic@kernel.org>

torture: Correctly fetch number of CPUs for non-English languages

Grepping for "CPU" on lscpu output isn't always successful, depending
on the local language setting. As a result, the build can be aborted
early with:

"make: the '-j' option requires a positive integer argument"

This commit therefore uses the human-language-independent approach
available via the getconf command, both in kvm-build.sh and in
kvm-remote.sh.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
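
A minimal sketch of the getconf approach. The _NPROCESSORS_ONLN
configuration variable is assumed here; it is widely supported but not
required by POSIX, and the "make" line is printed rather than executed.

```shell
# Language-independent CPU count: getconf prints a bare number, so
# there is no translated label to grep for.
ncpus="$(getconf _NPROCESSORS_ONLN)"
echo "Would build with: make -j$ncpus"
```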


# 0092eae4 05-Mar-2021 Paul E. McKenney <paulmck@kernel.org>

torture: Add kvm-remote.sh script for distributed rcutorture test runs

This commit adds a kvm-remote.sh script that prepares a tarball that
is then downloaded to the remote system(s) and executed. The user is
responsible for having set up the remote systems to run qemu, but all the
kernel builds are done on the system running the kvm-remote.sh script.
The user is also responsible for setting up the remote systems so that
ssh can be run non-interactively, given that ssh is used to poll the
remote systems in order to detect completion of each batch.

See the script's header comment for usage information.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>