History log of /linux-master/tools/testing/selftests/net/so_incoming_cpu.c
Revision Date Author Comments
# 97de5a15 19-Jan-2024 Kuniyuki Iwashima <kuniyu@amazon.com>

selftest: Don't reuse port for SO_INCOMING_CPU test.

Jakub reported that ASSERT_EQ(cpu, i) in so_incoming_cpu.c seems to
fire somewhat randomly.

# # RUN so_incoming_cpu.before_reuseport.test3 ...
# # so_incoming_cpu.c:191:test3:Expected cpu (32) == i (0)
# # test3: Test terminated by assertion
# # FAIL so_incoming_cpu.before_reuseport.test3
# not ok 3 so_incoming_cpu.before_reuseport.test3

When the test failed, not-yet-accepted CLOSE_WAIT sockets received
SYN with a "challenging" SEQ number, which was sent from an unexpected
CPU that did not create the receiver.

The test basically does:

1. for each cpu:
1-1. create a server
1-2. set SO_INCOMING_CPU

2. for each cpu:
2-1. set cpu affinity
2-2. create some clients
2-3. let clients connect() to the server on the same cpu
2-4. close() clients

3. for each server:
3-1. accept() all child sockets
3-2. check if all children have the same SO_INCOMING_CPU with the server

The root cause was the close() in 2-4. and net.ipv4.tcp_tw_reuse.

In a loop of 2., close() changed the client state to FIN_WAIT_2, and
the peer transitioned to CLOSE_WAIT.

In another loop of 2., connect() happened to select the same port of
the FIN_WAIT_2 socket, and it was reused as the default value of
net.ipv4.tcp_tw_reuse is 2.

As a result, the new client sent SYN to the CLOSE_WAIT socket from
a different CPU, and the receiver's sk_incoming_cpu was overwritten
with unexpected CPU ID.

Also, the SYN had a different SEQ number, so the CLOSE_WAIT socket
responded with Challenge ACK. The new client properly returned RST
and effectively killed the CLOSE_WAIT socket.

This way, all clients were created successfully, but the error was
detected later by 3-2., ASSERT_EQ(cpu, i).

To avoid the failure, let's make sure that (i) the number of clients
is less than the number of available ports and (ii) such reuse never
happens.

Fixes: 6df96146b202 ("selftest: Add test for SO_INCOMING_CPU.")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Tested-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20240120031642.67014-1-kuniyu@amazon.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>


# 3ff16174 31-Jul-2023 Kuniyuki Iwashima <kuniyu@amazon.com>

selftest: net: Assert on a proper value in so_incoming_cpu.c.

Dan Carpenter reported an error spotted by Smatch.

./tools/testing/selftests/net/so_incoming_cpu.c:163 create_clients()
error: uninitialized symbol 'ret'.

The returned value of sched_setaffinity() should be checked with
ASSERT_EQ(), but the value was not saved in a proper variable,
resulting in an error above.

Let's save the returned value of with sched_setaffinity().

Fixes: 6df96146b202 ("selftest: Add test for SO_INCOMING_CPU.")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/linux-kselftest/fe376760-33b6-4fc9-88e8-178e809af1ac@moroto.mountain/
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20230731181553.5392-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# 6df96146 21-Oct-2022 Kuniyuki Iwashima <kuniyu@amazon.com>

selftest: Add test for SO_INCOMING_CPU.

Some highly optimised applications use SO_INCOMING_CPU to make them
efficient, but they didn't test if it's working correctly by getsockopt()
to avoid slowing down. As a result, no one noticed it had been broken
for years, so it's a good time to add a test to catch future regression.

The test does

1) Create $(nproc) TCP listeners associated with each CPU.

2) Create 32 child sockets for each listener by calling
sched_setaffinity() for each CPU.

3) Check if accept()ed sockets' sk_incoming_cpu matches
listener's one.

If we see -EAGAIN, SO_INCOMING_CPU is broken. However, we might not see
any error even if broken; the kernel could miraculously distribute all SYN
to correct listeners. Not to let that happen, we must increase the number
of clients and CPUs to some extent, so the test requires $(nproc) >= 2 and
creates 64 sockets at least.

Test:
$ nproc
96
$ ./so_incoming_cpu

Before the previous patch:

# Starting 12 tests from 5 test cases.
# RUN so_incoming_cpu.before_reuseport.test1 ...
# so_incoming_cpu.c:191:test1:Expected cpu (5) == i (0)
# test1: Test terminated by assertion
# FAIL so_incoming_cpu.before_reuseport.test1
not ok 1 so_incoming_cpu.before_reuseport.test1
...
# FAILED: 0 / 12 tests passed.
# Totals: pass:0 fail:12 xfail:0 xpass:0 skip:0 error:0

After:

# Starting 12 tests from 5 test cases.
# RUN so_incoming_cpu.before_reuseport.test1 ...
# so_incoming_cpu.c:199:test1:SO_INCOMING_CPU is very likely to be working correctly with 3072 sockets.
# OK so_incoming_cpu.before_reuseport.test1
ok 1 so_incoming_cpu.before_reuseport.test1
...
# PASSED: 12 / 12 tests passed.
# Totals: pass:12 fail:0 xfail:0 xpass:0 skip:0 error:0

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>