History log of /linux-master/arch/s390/include/asm/checksum.h
Revision Date Author Comments
# dcd3e1de 03-Feb-2024 Heiko Carstens <hca@linux.ibm.com>

s390/checksum: provide csum_partial_copy_nocheck()

With csum_partial(), which reads all bytes into registers, it is easy to
also implement csum_partial_copy_nocheck() which copies the buffer while
calculating its checksum.

For a 512 byte buffer this reduces the runtime by 19%. Compared to the old
generic variant (memcpy() + cksm instruction) runtime is reduced by 42%.
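
The idea of copying a buffer while calculating its checksum can be sketched in portable C. This is not the vector register code from the patch: csum_copy_sketch() is a hypothetical name, and summing big-endian 32-bit words with a zero-padded tail is an assumption made for illustration. It returns the unfolded 32-bit sum with end-around carry.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical scalar sketch: copy src to dst and return the unfolded
 * 32-bit ones' complement sum of the data in a single pass. The data is
 * summed as big-endian 32-bit words; a short tail is padded with zeros.
 */
static uint32_t csum_copy_sketch(void *dst, const void *src, size_t len,
				 uint32_t sum)
{
	const uint8_t *s = src;
	uint8_t *d = dst;
	uint64_t acc = sum;
	uint8_t tail[4] = { 0, 0, 0, 0 };

	while (len >= 4) {
		/* The same four bytes feed both the running sum and the copy. */
		acc += ((uint32_t)s[0] << 24) | ((uint32_t)s[1] << 16) |
		       ((uint32_t)s[2] << 8) | (uint32_t)s[3];
		memcpy(d, s, 4);
		s += 4;
		d += 4;
		len -= 4;
	}
	if (len) {
		/* Copy the 1-3 remaining bytes and sum them zero padded. */
		memcpy(tail, s, len);
		memcpy(d, s, len);
		acc += ((uint32_t)tail[0] << 24) | ((uint32_t)tail[1] << 16) |
		       ((uint32_t)tail[2] << 8) | (uint32_t)tail[3];
	}
	/* Fold the 64-bit accumulator to 32 bits with end-around carry. */
	acc = (acc & 0xffffffffu) + (acc >> 32);
	acc = (acc & 0xffffffffu) + (acc >> 32);
	return (uint32_t)acc;
}
```

Compared to memcpy() followed by a separate checksum pass, the source is walked only once, which is where a fused variant can save time.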

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>


# cb2a1dd5 03-Feb-2024 Heiko Carstens <hca@linux.ibm.com>

s390/checksum: provide vector register variant of csum_partial()

Provide a faster variant of csum_partial() which uses vector registers
instead of the cksm instruction.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>


# 3a74f44d 03-Feb-2024 Heiko Carstens <hca@linux.ibm.com>

s390/checksum: provide and use cksm() inline assembly

Convert those callers of csum_partial() which are either very early or in
critical paths, like panic/dump, to use the cksm instruction, so they
don't have to rely on a working kernel infrastructure. Such a dependency
will be introduced with a subsequent patch.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>


# 4ce69fcf 03-Feb-2024 Heiko Carstens <hca@linux.ibm.com>

s390/checksum: call instrument_read() instead of kasan_check_read()

Call instrument_read() from csum_partial() instead of kasan_check_read().
instrument_read() covers all memory access instrumentation methods.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>


# 11018ef9 29-Mar-2023 Heiko Carstens <hca@linux.ibm.com>

s390/checksum: remove not needed uaccess.h include

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>


# e42ac778 29-Mar-2023 Heiko Carstens <hca@linux.ibm.com>

s390/checksum: always use cksm instruction

Commit dfe843dce775 ("s390/checksum: support GENERIC_CSUM, enable it for
KASAN") switched s390 to use the generic checksum functions, so that KASAN
instrumentation also works for checksum functions by avoiding architecture
specific inline assemblies.

There is however the problem that the generic csum_partial() function
returns a 32 bit value with a 16 bit folded checksum, while the original
s390 variant does not fold to 16 bit. As a result the ipib_checksum in
lowcore contains different values depending on kernel config options.

The ipib_checksum is used by system dumpers to verify if pointers in
lowcore point to valid data. Verification is done by comparing checksum
values. The system dumpers still use 32 bit checksum values which are not
folded, and therefore the checksum verification fails (incorrectly).

Symptom is that reboot after dump does not work anymore when a KASAN
instrumented kernel is dumped.

Fix this by not using the generic checksum implementation. Instead add an
explicit kasan_check_read() so that KASAN knows about the read access from
within the inline assembly.
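
The mismatch between folded and unfolded 32 bit values can be illustrated with a small userspace sketch. The helper names are hypothetical and the summing is a simplified model (big-endian words, whole words only), not the kernel code: the same data yields different 32 bit values depending on whether the sum was folded, and the two only agree after both are folded.

```c
#include <stddef.h>
#include <stdint.h>

/* Unfolded variant: sum big-endian 32-bit words with end-around carry. */
static uint32_t sum32_unfolded(const uint8_t *p, size_t nwords)
{
	uint64_t acc = 0;
	size_t i;

	for (i = 0; i < nwords; i++, p += 4)
		acc += ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
		       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
	acc = (acc & 0xffffffffu) + (acc >> 32);
	acc = (acc & 0xffffffffu) + (acc >> 32);
	return (uint32_t)acc;
}

/* Folded variant: reduce a 32-bit sum to 16 bits, kept in a 32-bit value. */
static uint32_t fold_to_16(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return sum;
}
```

A consumer that compares raw 32 bit values, as the system dumpers are described to do, sees a mismatch as soon as one side folds and the other does not.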

Reported-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Fixes: dfe843dce775 ("s390/checksum: support GENERIC_CSUM, enable it for KASAN")
Tested-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>


# dfe843dc 30-Nov-2022 Heiko Carstens <hca@linux.ibm.com>

s390/checksum: support GENERIC_CSUM, enable it for KASAN

This is the s390 variant of commit d911c67e10b4 ("x86: kasan: kmsan:
support CONFIG_GENERIC_CSUM on x86, enable it for KASAN/KMSAN"). Even
though most of the s390 specific checksum code is written in C there is
still the csum_partial() inline assembly which could prevent KASAN and
KMSAN from seeing all memory accesses.

Therefore switch to GENERIC_CSUM if KASAN is enabled just like x86.

Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>


# a29a6b5a 09-Jun-2021 Heiko Carstens <hca@linux.ibm.com>

s390/checksum: use register pair instead of register asm

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>


# 98ad45fb 11-Aug-2020 Heiko Carstens <hca@linux.ibm.com>

s390/checksum: coding style changes

Add some coding style changes which hopefully make the code
look a bit less odd.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>


# 612ad078 11-Aug-2020 Heiko Carstens <hca@linux.ibm.com>

s390/checksum: have consistent calculations

Use "|" instead of "+" within csum_fold() for consistency reasons,
like in the rest of the file.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>


# 614b4f5d 11-Aug-2020 Heiko Carstens <hca@linux.ibm.com>

s390/checksum: make ip_fast_csum() faster

Convert ip_fast_csum() so it doesn't call csum_partial(), but instead
open code the checksum calculation. The problem with csum_partial() is
that it makes use of the cksm instruction, which has high startup
costs and therefore is only very fast if used on larger memory
regions.

IPv4 headers however are small in size (5-16 32-bit words). The open
coded variant calculates the checksum in ~30% of the time compared to
the old variant (z14, march=z196).
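
An open coded header checksum along these lines might look as follows. This is a portable sketch, not the s390 implementation: ip_hdr_csum_sketch() and load_be32() are hypothetical names, and the header is read bytewise so that the example does not depend on host endianness.

```c
#include <stddef.h>
#include <stdint.h>

/* Load one big-endian 32-bit word from a byte buffer. */
static uint32_t load_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Hypothetical open coded IPv4 header checksum. ihl is the header length
 * in 32-bit words (5-16). Returns the checksum as a host-order number.
 */
static uint16_t ip_hdr_csum_sketch(const uint8_t *iph, unsigned int ihl)
{
	uint64_t sum = 0;
	unsigned int i;

	/* At most 16 additions, so a 64-bit accumulator cannot overflow. */
	for (i = 0; i < ihl; i++)
		sum += load_be32(iph + 4 * i);
	/* 64 -> 32 bits; the second step adds back a carry of the first. */
	sum = (sum & 0xffffffffu) + (sum >> 32);
	sum = (sum & 0xffffffffu) + (sum >> 32);
	/* 32 -> 16 bits, again with end-around carry. */
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}
```

For such a short input a handful of additions is all that is needed, so there is no instruction startup cost to amortize.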

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>


# bb4644b1 11-Aug-2020 Heiko Carstens <hca@linux.ibm.com>

s390/checksum: rewrite csum_tcpudp_nofold()

Rewrite csum_tcpudp_nofold() so that the generated code will not
contain branches. The old implementation was also optimized for
machines which came with "add logical with carry" instructions,
however the compiler doesn't generate them anymore. This is most
likely because those instructions are slower.

However with the old code the compiler generates a lot of branches,
which isn't too helpful usually. Therefore rewrite the code.

In a tight loop this doesn't make any difference since the branch
prediction unit does its job.
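
A branch-free formulation in the spirit of this rewrite can be sketched in plain C. The function name is hypothetical and plain integer types stand in for the kernel's __be32/__wsum types; on a big-endian machine such as s390, len and proto can be added as-is. All inputs are accumulated in a 64-bit value, and the final fold adds the value rotated by 32 bits instead of testing for a carry.

```c
#include <stdint.h>

/* Hypothetical sketch: unfolded pseudo-header sum without branches. */
static uint32_t tcpudp_nofold_sketch(uint32_t saddr, uint32_t daddr,
				     uint32_t len, uint8_t proto,
				     uint32_t sum)
{
	uint64_t csum = sum;

	csum += saddr;
	csum += daddr;
	csum += len;
	csum += proto;
	/*
	 * Adding the 32-bit rotated value leaves low + high plus the carry
	 * of that addition in the upper half: an end-around carry fold.
	 */
	csum += (csum >> 32) | (csum << 32);
	return (uint32_t)(csum >> 32);
}
```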

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>


# b064904c 11-Aug-2020 Heiko Carstens <hca@linux.ibm.com>

s390/checksum: provide csum_ipv6_magic()

This implementation needs only ~30% of the time to calculate the
checksum compared to the generic variant. In addition the compiler
also generates only ~30% of the instructions compared to the generic
variant (on z14, compiled with march=z196).

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>


# 6e41c585 22-Jul-2020 Al Viro <viro@zeniv.linux.org.uk>

unify generic instances of csum_partial_copy_nocheck()

quite a few architectures have the same csum_partial_copy_nocheck() -
simply memcpy() the data and then return the csum of the copy.

hexagon, parisc, ia64, s390, um: explicitly spelled out that way.

arc, arm64, csky, h8300, m68k/nommu, microblaze, mips/GENERIC_CSUM, nds32,
nios2, openrisc, riscv, unicore32: end up picking the same thing spelled
out in lib/checksum.h (with varying amounts of perversions along the way).

everybody else (alpha, arm, c6x, m68k/mmu, mips/!GENERIC_CSUM, powerpc,
sh, sparc, x86, xtensa) have non-generic variants. For all except c6x
the declaration is in their asm/checksum.h. c6x uses the wrapper
from asm-generic/checksum.h that would normally lead to the lib/checksum.h
instance, but in case of c6x we end up using an asm function from arch/c6x
instead.

Screw that mess - have architectures with private instances define
_HAVE_ARCH_CSUM_AND_COPY in their asm/checksum.h and have the default
one right in net/checksum.h conditional on _HAVE_ARCH_CSUM_AND_COPY
*not* defined.
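
The resulting pattern can be modelled in userspace as below. Only the macro name _HAVE_ARCH_CSUM_AND_COPY and the memcpy()-then-checksum behaviour come from the text above; the function names and the simple 16-bit summing helper are hypothetical stand-ins, not the kernel's csum_partial().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for csum_partial(): sum big-endian 16-bit words, unfolded. */
static uint32_t csum_partial_sketch(const void *buf, size_t len, uint32_t sum)
{
	const uint8_t *p = buf;
	uint64_t acc = sum;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		acc += ((uint32_t)p[i] << 8) | p[i + 1];
	if (len & 1)
		acc += (uint32_t)p[len - 1] << 8;	/* odd tail, zero padded */
	acc = (acc & 0xffffffffu) + (acc >> 32);
	acc = (acc & 0xffffffffu) + (acc >> 32);
	return (uint32_t)acc;
}

/*
 * An architecture with a private instance would define the macro in its
 * asm/checksum.h; everybody else gets the default below.
 */
#ifndef _HAVE_ARCH_CSUM_AND_COPY
static inline uint32_t csum_partial_copy_nocheck_sketch(const void *src,
							void *dst, size_t len)
{
	memcpy(dst, src, len);
	return csum_partial_sketch(dst, len, 0);
}
#endif
```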

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# 5904122c 18-Feb-2020 Al Viro <viro@zeniv.linux.org.uk>

take the dummy csum_and_copy_from_user() into net/checksum.h

now that can be done conveniently - all non-trivial cases have
_HAVE_ARCH_COPY_AND_CSUM_FROM_USER defined, so the fallback in
net/checksum.h is used only for dummy (copy_from_user, then
csum_partial) implementation. Allowing us to get rid of all
dummy instances, both of csum_and_copy_from_user() and
csum_partial_copy_from_user().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# b2441318 01-Nov-2017 Greg Kroah-Hartman <gregkh@linuxfoundation.org>

License cleanup: add SPDX GPL-2.0 license identifier to files with no license

Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information in it,
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.

For non */uapi/* files that summary was:

SPDX license identifier                             # files
---------------------------------------------------|-------
GPL-2.0                                               11139

and resulted in the first patch in this series.

If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:

SPDX license identifier                             # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note                         930

and resulted in the second patch in this series.

- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:

SPDX license identifier                            # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note                        270
GPL-2.0+ WITH Linux-syscall-note                       169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)     21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)     17
LGPL-2.1+ WITH Linux-syscall-note                       15
GPL-1.0+ WITH Linux-syscall-note                        14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)     5
LGPL-2.0+ WITH Linux-syscall-note                        4
LGPL-2.1 WITH Linux-syscall-note                         3
((GPL-2.0 WITH Linux-syscall-note) OR MIT)               3
((GPL-2.0 WITH Linux-syscall-note) AND MIT)              1

and that resulted in the third patch in this series.

- when the two scanners agreed on the detected license(s), that became
the concluded license(s).

- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.

- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).

- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.

- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there were new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In the initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct

This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.

Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 7c0f6ba6 24-Dec-2016 Linus Torvalds <torvalds@linux-foundation.org>

Replace <asm/uaccess.h> with <linux/uaccess.h> globally

This was entirely automated, using the script by Al:

PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
$(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

to do the replacement at the end of the merge window.

Requested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 01cfbad7 11-Mar-2016 Alexander Duyck <aduyck@mirantis.com>

ipv4: Update parameters for csum_tcpudp_magic to their original types

This patch updates all instances of csum_tcpudp_magic and
csum_tcpudp_nofold to reflect the types that are usually used as the source
inputs. For example the protocol field is populated based on nexthdr which
is actually an unsigned 8 bit value. The length is usually populated based
on skb->len which is an unsigned integer.

This addresses an issue in which the IPv6 function csum_ipv6_magic was
generating a checksum using the full 32b of skb->len while
csum_tcpudp_magic was only using the lower 16 bits. As a result we could
run into issues when attempting to adjust the checksum as there was no
protocol agnostic way to update it.

With this change the value is still truncated as many architectures use
"(len + proto) << 8", however this truncation only occurs for values
greater than 16776960 in length and as such is unlikely to occur as we stop
the inner headers at ~64K in size.
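
The 16776960 limit follows from 32-bit arithmetic: it equals 2^24 - 256, so for any 8-bit protocol value the sum len + proto still fits in 24 bits and the shift by 8 loses nothing. A short sketch with hypothetical helper names (assuming the "(len + proto) << 8" expression is evaluated in 32 bits) shows the boundary:

```c
#include <stdint.h>

/* The combined term as a 32-bit expression. */
static uint32_t len_proto_term(uint32_t len, uint8_t proto)
{
	return (len + proto) << 8;
}

/* Non-zero if the 32-bit term lost bits compared to exact arithmetic. */
static int len_proto_truncated(uint32_t len, uint8_t proto)
{
	uint64_t exact = ((uint64_t)len + proto) << 8;

	return exact != len_proto_term(len, proto);
}
```

With proto = 255 the term is still exact at len = 16776960 and wraps to 0 at len = 16776961; smaller protocol values move the boundary up slightly.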

I did have to make a few minor changes in the arm, mn10300, nios2, and
score versions of the function in order to support these changes as they
were either using things such as an OR to combine the protocol and length,
or were using ntohs to convert the length which would have truncated the
value.

I also updated a few spots in terms of whitespace and type differences for
the addresses. Most of this was just to make sure all of the definitions
were in sync going forward.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# d72d2bb5 24-Feb-2014 Heiko Carstens <hca@linux.ibm.com>

s390/checksum: remove memset() within csum_partial_copy_from_user()

The memset() within csum_partial_copy_from_user() is rather pointless since
copy_from_user() already cleared the rest of the destination buffer if an
exception happened.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>


# a53c8fab 20-Jul-2012 Heiko Carstens <hca@linux.ibm.com>

s390/comments: unify copyright messages and remove file names

Remove the file name from the comment at top of many files. In most
cases the file name was wrong anyway, so it's rather pointless.

Also unify the IBM copyright statement. We did have a lot of slightly
different statements and wanted to change them one after another
whenever a file gets touched. However that never happened. Instead
people start to take the old/"wrong" statements to use as a template
for new files.
So unify all of them in one go.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>


# 04efc3be 11-Sep-2009 Heiko Carstens <hca@linux.ibm.com>

[S390] convert/optimize csum_fold() to C

In the meantime gcc generates better code than the old inline
assemblies do. Original inline assembly results in:

lr %r1,%r2
sr %r3,%r3
lr %r2,%r1
srdl %r2,16
alr %r2,%r3
alr %r1,%r2
srl %r1,16
xilf %r1,65535
llghr %r2,%r1
br %r14

Out of the C code gcc generates this:

rll %r1,%r2,16
ar %r1,%r2
srl %r1,16
xilf %r1,65535
llghr %r2,%r1
br %r14

In addition we don't have any static register allocations anymore and
gcc is free to shuffle instructions around for better pipeline usage.
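
C code of roughly the following shape would map onto the shorter instruction sequence: the rotate corresponds to rll, the addition to ar, the shift to srl and the final complement of the low 16 bits to xilf. This is a userspace sketch with a hypothetical function name and plain integer types instead of the kernel's __wsum/__sum16.

```c
#include <stdint.h>

/* Fold a 32-bit ones' complement sum to 16 bits and complement it. */
static uint16_t csum_fold_sketch(uint32_t sum)
{
	/* Add the halves: the upper half becomes high + low + carry. */
	sum += (sum >> 16) | (sum << 16);
	/* Keep the upper half and invert it. */
	return (uint16_t)~(sum >> 16);
}
```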

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>


# c6557e7f 01-Aug-2008 Martin Schwidefsky <schwidefsky@de.ibm.com>

[S390] move include/asm-s390 to arch/s390/include/asm

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>