Cross Reference: /linux-master/tools/testing/selftests/rcutorture/bin/kvm-remote.sh

Revision	Date	Author	Comments
# 80021ffb	15-Jun-2023	Paul E. McKenney <paulmck@kernel.org>	torture: Make kvm-remote print diagnostics on initial ssh failure Currently, if the initial ssh fails, kvm-remote.sh gives up, printing a message saying so. But it would be nice to get a better idea as to why ssh failed. This commit therefore dumps out ssh's exit code, stdout, and stderr upon ssh failure for diagnostic purposes. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
# c211ae9c	27-Aug-2022	Paul E. McKenney <paulmck@kernel.org>	torture: Use mktemp instead of guessing at unique names This commit drags the rcutorture scripting kicking and screaming into the twenty-first century by making use of the BSD-derived mktemp command to create temporary files and directories. In happy contrast to many of its ill-behaved predecessors, mktemp seems to actually work reasonably reliably! Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
# ab69d3c8	12-Apr-2022	Paul E. McKenney <paulmck@kernel.org>	torture: Make kvm-remote.sh announce which system is being waited on If a remote system fails in certain ways, for example, if it is rebooted without removing the contents of the /tmp directory, its remote.run file never will be removed and the kvm-remote.sh script will loop waiting forever. The manual workaround for this (hopefully!) rare event is to manually remove the file, which will cause the results up to the reboot to be collected and evaluated. Unfortunately, to work out which system is holding things up, the user must refer to the name of the last system whose results were collected, then look up the name of the next system in sequence, then manually remove the remote.run file. Even more unfortunately, this procedure can be fooled in runs where each system handles more than one batch should a given system take longer than expected, causing the systems to be handled out of order. This commit therefore causes kvm-remote.sh to print out the name of the system it will wait on next, allowing the user to refer directly to that name. Making the kvm-remote.sh script automatically handle unscheduled termination of the qemu processes is left as future work. Quite possibly deep future work. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
# b20842ba	15-Feb-2022	Paul E. McKenney <paulmck@kernel.org>	torture: Use "-o Batchmode=yes" to disable ssh password requests The torture.sh script normally runs unattended, so there is not much point in the "ssh" command asking for a password. This commit therefore adds the "-o Batchmode=yes" argument to each "ssh" command to cause it to fail rather than ask for a password. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
# ab3ecd0b	15-Feb-2022	Paul E. McKenney <paulmck@kernel.org>	torture: Reposition so that $? collects ssh code in torture.sh An "echo" slipped in between an "ssh" and the "ret=$?" that was intended to collect its exit code, which prevents torture.sh from detecting "ssh" failure. This commit therefore reassociates the two. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
# a7d89cfb	25-Jan-2022	Paul E. McKenney <paulmck@kernel.org>	torture: Change KVM environment variable to RCUTORTURE The torture-test scripting's long-standing use of KVM as the environment variable tracking the pathname of the rcutorture directory now conflicts with allmodconfig builds due to the virt/kvm/Makefile.kvm file's use of this as a makefile variable. This commit therefore changes the torture-test scripting from KVM to RCUTORTURE, avoiding the name conflict. Reported-by: Zhouyi Zhou <zhouzhouyi@gmail.com> Tested-by: Zhouyi Zhou <zhouzhouyi@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
# 2bc9062e	20-Dec-2021	Paul E. McKenney <paulmck@kernel.org>	torture: Make kvm-remote.sh try multiple times to download tarball This commit ups the retries for downloading the build-product tarball to a given remote system from once to five times, the better to handle transient network failures. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
# 90b21bcf	30-Nov-2021	Paul E. McKenney <paulmck@kernel.org>	torture: Properly redirect kvm-remote.sh "echo" commands The echo commands following initialization of the "oldrun" variable need to be "tee"d to $oldrun/remote-log. This commit fixes several stragglers. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
# b6c9dbf0	30-Nov-2021	Paul E. McKenney <paulmck@kernel.org>	torture: Fix incorrectly redirected "exit" in kvm-remote.sh The "exit 4" in kvm-remote.sh is pointlessly redirected, so this commit removes the redirection. Fixes: 0092eae4cb4e ("torture: Add kvm-remote.sh script for distributed rcutorture test runs") Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
# f6153700	22-Nov-2021	Paul E. McKenney <paulmck@kernel.org>	torture: Retry download once before giving up Currently, a transient network error can kill a run if it happens while downloading the tarball to one of the target systems. This commit therefore does a 60-second wait and then a retry. If further experience indicates, a more elaborate mechanism might be used later. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
# faaaf2ac	05-Aug-2021	Paul E. McKenney <paulmck@kernel.org>	torture: Make kvm-remote.sh print size of downloaded tarball This commit causes kvm-remote.sh to print the size of the tarball that is downloaded to each of the remote systems. This size can help with performance projections and analysis. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
# a5202e17	08-Jul-2021	Paul E. McKenney <paulmck@kernel.org>	torture: Make kvm-test-1-run-batch.sh select per-scenario affinity masks This commit causes kvm-test-1-run-batch.sh to use the new kvm-assign-cpus.sh and kvm-get-cpus-script.sh scripts to create a TORTURE_AFFINITY environment variable containing either an empty string (for no affinity) or a list of CPUs to pin the scenario's vCPUs to. The additional change to kvm-test-1-run.sh places the per-scenario number-of-CPUs information where it can easily be found. If there is some reason why affinity cannot be supplied, this commit prints and logs the reason via changes to kvm-again.sh. Finally, this commit updates the kvm-remote.sh script to copy the qemu-affinity output files back to the host system. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
# 9e528a84	08-Jul-2021	Paul E. McKenney <paulmck@kernel.org>	torture: Consistently name "qemu*" test output files There is "qemu-affinity", "qemu-cmd", "qemu-retval", but also "qemu_pid". This is hard to remember, not so good for bash tab completion, and just plain inconsistent. This commit therefore renames the "qemu_pid" file to "qemu-pid". A couple of the scripts must deal with old runs, and thus must handle both "qemu_pid" and "qemu-pid", but new runs will produce "qemu-pid". Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
# 5a2898f1	17-Jun-2021	Paul E. McKenney <paulmck@kernel.org>	torture: Protect kvm-remote.sh directory trees from /tmp reaping The kvm-remote.sh script places the datestamped directory containing all the build artifacts in the destination systems' /tmp directories, where they accumulate runtime artifacts such as console.log. This works, but some systems have a habit of removing files in /tmp that have not been recently accessed. This commit therefore runs a simple script that periodically accesses all files in the datestamped directory. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
# 442f99af	15-Jun-2021	Paul E. McKenney <paulmck@kernel.org>	torture: Log more kvm-remote.sh information This commit logs additional information to help track down set up and networking issues. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
# 3d78668e	27-Apr-2021	Paul E. McKenney <paulmck@kernel.org>	torture: Don't cap remote runs by build-system number of CPUs Currently, if a torture scenario requires more CPUs than are present on the build system, kvm.sh and friends limit the CPUs available to that scenario. This makes total sense when the build system and the system running the scenarios are one and the same, but not so much when remote systems might well have more CPUs. This commit therefore introduces a --remote flag to kvm.sh that suppresses this CPU-limiting behavior, and causes kvm-remote.sh to use this flag. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
# c43d3b00	27-Apr-2021	Paul E. McKenney <paulmck@kernel.org>	torture: Make kvm-remote.sh account for network failure in pathname checks In a long-duration kvm-remote.sh run, almost all of the remote accesses will be simple file-existence checks. These are thus the most likely to be caught out by network failures, which do happen from time to time. This commit therefore takes a first step towards tolerating temporary network outages by making the file-existence checks repeat in the face of such an outage. They also print a message every minute during a outage, allowing the user to take appropriate action. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
# f8c8484d	01-Apr-2021	Frederic Weisbecker <frederic@kernel.org>	torture: Correctly fetch number of CPUs for non-English languages Grepping for "CPU" on lscpu output isn't always successful, depending on the local language setting. As a result, the build can be aborted early with: "make: the '-j' option requires a positive integer argument" This commit therefore uses the human-language-independent approach available via the getconf command, both in kvm-build.sh and in kvm-remote.sh. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
# 0092eae4	05-Mar-2021	Paul E. McKenney <paulmck@kernel.org>	torture: Add kvm-remote.sh script for distributed rcutorture test runs This commit adds a kvm-remote.sh script that prepares a tarball that is then downloaded to the remote system(s) and executed. The user is responsible for having set up the remote systems to run qemu, but all the kernel builds are done on the system running the kvm-remote.sh script. The user is also responsible for setting up the remote systems so that ssh can be run non-interactively, given that ssh is used to poll the remote systems in order to detect completion of each batch. See the script's header comment for usage information. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>