History log of /linux-master/arch/mips/lib/csum_partial.S
Revision Date Author Comments
# 9259e15b 07-Aug-2023 Masahiro Yamada <masahiroy@kernel.org>

mips: replace #include <asm/export.h> with #include <linux/export.h>

Commit ddb5cdbafaaa ("kbuild: generate KSYMTAB entries by modpost")
deprecated <asm/export.h>, which is now a wrapper of <linux/export.h>.

Replace #include <asm/export.h> with #include <linux/export.h>.

After all the <asm/export.h> lines are converted, <asm/export.h> and
<asm-generic/export.h> will be removed.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>


# fa62f39d 25-Jan-2022 Thomas Bogendoerfer <tsbogend@alpha.franken.de>

MIPS: Fix build error due to PTR used in more places

Use PTR_WD instead of PTR to avoid clashes with other parts.

Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>


# 1cd95ab8 19-Jul-2020 Al Viro <viro@zeniv.linux.org.uk>

mips: propagate the calling convention change down into __csum_partial_copy_..._user()

and turn the exception handlers into simply returning 0, which
simplifies the hell out of things in csum_partial.S

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# f863c65c 19-Jul-2020 Al Viro <viro@zeniv.linux.org.uk>

mips: __csum_partial_copy_kernel() has no users left

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# cc44c17b 10-Jul-2020 Al Viro <viro@zeniv.linux.org.uk>

csum_partial_copy_nocheck(): drop the last argument

It's always 0. Note that we theoretically could use ~0U as well -
result will be the same modulo 0xffff, _if_ the damn thing did the
right thing for any value of initial sum; later we'll make use of
that when convenient.

However, unlike csum_and_copy_..._user(), there are instances that
did not work for arbitrary initial sums; c6x is one such.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
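
Illustrative note on the "same modulo 0xffff" remark above: a minimal C
sketch, assuming a plain ones'-complement accumulator with end-around carry.
The helper names (csum_add32, csum_fold16, csum_words) are hypothetical and
are not the kernel's functions; the sketch only shows why starting from 0 or
from ~0U gives folded results that agree modulo 0xffff.

```c
#include <stdint.h>
#include <stddef.h>

/* Ones'-complement 32-bit add: wrap the carry back into the low bit. */
static uint32_t csum_add32(uint32_t sum, uint32_t v)
{
    sum += v;
    return sum + (sum < v);
}

/* Fold a 32-bit partial sum down to 16 bits (two rounds absorb the carry). */
static uint16_t csum_fold16(uint32_t sum)
{
    sum = (sum & 0xffff) + (sum >> 16);
    sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* Hypothetical helper: sum n 32-bit words starting from `init`. */
static uint32_t csum_words(const uint32_t *p, size_t n, uint32_t init)
{
    uint32_t sum = init;

    for (size_t i = 0; i < n; i++)
        sum = csum_add32(sum, p[i]);
    return sum;
}
```

Since 0xffff divides 2^32 - 1, an initial value of ~0U acts as a second
representation of zero; the two folded results can differ only as 0 versus
0xffff (all-zero data).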


# ab7c01fd 21-May-2020 Serge Semin <Sergey.Semin@baikalelectronics.ru>

mips: Add MIPS Release 5 support

There are five MIPS32/64 architecture releases currently available:
1 to 6, except the fourth one, which was intentionally skipped.
Three of them can be called major: the 1st, 2nd and 6th, which not only
have some system level alterations, but also introduced significant
core/ISA level updates. The rest of the MIPS architecture releases are
minor.

Even though they don't have as many ISA/system/core level changes
as the major ones with respect to the previous releases, they still
provide a set of updates (I'd say they were intended to be the
intermediate releases before a major one) that might be useful for the
kernel and user-level code, when activated by the kernel or compiler.
In particular the following features were introduced or ended up being
available at/after MIPS32/64 Release 5 architecture:
+ the last release of the misaligned memory access instructions,
+ virtualisation - VZ ASE - is an optional component of the arch,
+ SIMD - MSA ASE - is an optional component of the arch,
+ DSP ASE is an optional component of the arch,
+ CP0.Status.FR=1 for CP1.FIR.F64=1 (pure 64-bit FPU general registers)
must be available if FPU is implemented,
+ CP1.FIR.Has2008 support is required so CP1.FCSR.{ABS2008,NAN2008} bits
are available.
+ UFR/UNFR aliases to access CP0.Status.FR from user-space by means of
ctc1/cfc1 instructions (enabled by CP0.Config5.UFR),
+ CP0.Config5.LLB=1 and the eretnc instruction are implemented so that the
LL-bit is not accidentally cleared when returning from an interrupt,
exception, or error trap,
+ XPA feature together with extended versions of CPx registers is
introduced, which needs to have mfhc0/mthc0 instructions available.

So due to these changes GNU GCC provides extended instruction set
support for MIPS32/64 Release 5 by default, like eretnc/mfhc0/mthc0. Even
though the architecture alteration isn't that big, it is still worth
taking into account in the kernel software. Finally we can't deny that
some optimizations/limitations might be found in the future and implemented
on some level in the kernel or compiler. In this case having support for
even intermediate MIPS architecture releases would be more than
useful.

So most of the changes provided by this commit can be split into
either compile-time or runtime config related ones. The compile-time related
changes are caused by adding the new CONFIG_CPU_MIPS32_R5/CONFIG_CPU_MIPSR5
configs and concern the code activating MIPSR2 or MIPSR6 already
implemented features (like eretnc/LLbit, mthc0/mfhc0). In addition
CPU_HAS_MSA can now be freely enabled for MIPS32/64 release 5 based
platforms as this is done for CPU_MIPS32_R6 CPUs. The runtime changes
concern the features which are handled with respect to the MIPS ISA
revision detected at run-time by means of CP0.Config.{AT,AR} bits. Alas
these fields can only be used to detect the r1, r2 or r6 releases.
But since we know which CPUs in fact support the R5 arch, we can manually
set MIPS_CPU_ISA_M32R5/MIPS_CPU_ISA_M64R5 bit of c->isa_level and then
use cpu_has_mips32r5/cpu_has_mips64r5 where it's appropriate.

Since XPA/EVA introduce overly complex alterations, and so that they can
still be used with MIPS32 Release 2 based kernels (for compatibility with
current platform configs), they are left to be set up as separate kernel
configs.

Co-developed-by: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru>
Signed-off-by: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru>
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Paul Burton <paulburton@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: devicetree@vger.kernel.org
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>


# 268a2d60 20-Oct-2019 Jiaxun Yang <jiaxun.yang@flygoat.com>

MIPS: Loongson64: Rename CPU TYPES

CPU_LOONGSON2 -> CPU_LOONGSON2EF
CPU_LOONGSON3 -> CPU_LOONGSON64

This is because newer Loongson-2 products (2G/2H/2K1000) can share a kernel
implementation with Loongson-3, while 2E/2F are less similar to
other LOONGSON64 products.

Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Paul Burton <paulburton@kernel.org>
Cc: linux-mips@vger.kernel.org
Cc: chenhc@lemote.com
Cc: paul.burton@mips.com


# 23130042 07-Nov-2016 Paul Burton <paulburton@kernel.org>

MIPS: Export csum functions alongside their definitions

Now that EXPORT_SYMBOL can be used from assembly source, move the
EXPORT_SYMBOL invocations for the csum_partial_* functions to be
alongside their definitions.

Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14512/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>


# 615eb603 26-Mar-2015 Chen Jie <chenj@lemote.com>

MIPS: csum_partial: Improve instruction parallelism.

Computing the sum introduces true data dependencies. This patch removes some
true data dependencies, and hence increases instruction level parallelism.

This patch brings up to 50% csum performance gain on Loongson 3a.

One example about how this patch works is in CSUM_BIGCHUNK1:
    // ** original **      vs      ** patch applied **
    ADDC(sum, t0)                  ADDC(t0, t1)
    ADDC(sum, t1)                  ADDC(t2, t3)
    ADDC(sum, t2)                  ADDC(sum, t0)
    ADDC(sum, t3)                  ADDC(sum, t2)

In the original implementation, each ADDC(sum, ...) depends on the sum
value updated by the previous ADDC (as a source operand).

With this patch applied, the first two ADDC operations are independent,
hence can be executed simultaneously if possible.

Another example is in the "copy and sum calculating chunk":
    // ** original **      vs      ** patch applied **
    STORE(t0, UNIT(0) ...          STORE(t0, UNIT(0) ...
    ADDC(sum, t0)                  ADDC(t0, t1)
    STORE(t1, UNIT(1) ...          STORE(t1, UNIT(1) ...
    ADDC(sum, t1)                  ADDC(sum, t0)
    STORE(t2, UNIT(2) ...          STORE(t2, UNIT(2) ...
    ADDC(sum, t2)                  ADDC(t2, t3)
    STORE(t3, UNIT(3) ...          STORE(t3, UNIT(3) ...
    ADDC(sum, t3)                  ADDC(sum, t2)

With this patch applied, ADDC and the **next next** ADDC are independent.

Signed-off-by: chenj <chenj@lemote.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/9608/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
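
Illustrative note on the reordering described above: a minimal C sketch,
assuming ADDC behaves as an add followed by adding the carry back in. The
function names (addc, sum_serial, sum_paired) are hypothetical stand-ins for
the assembly macros, not the kernel code. The sketch only demonstrates that
pairing the temporaries first leaves the result unchanged while shortening
the dependency chain through sum.

```c
#include <stdint.h>

/* Models the ADDC(sum, reg) idea: add, then fold the carry back in. */
static uint32_t addc(uint32_t sum, uint32_t reg)
{
    sum += reg;
    return sum + (sum < reg);
}

/* Original order: every step waits for the previous value of sum. */
static uint32_t sum_serial(uint32_t sum, const uint32_t t[4])
{
    sum = addc(sum, t[0]);
    sum = addc(sum, t[1]);
    sum = addc(sum, t[2]);
    sum = addc(sum, t[3]);
    return sum;
}

/* Patched order: the first two adds do not touch sum and are independent. */
static uint32_t sum_paired(uint32_t sum, const uint32_t t[4])
{
    uint32_t t0 = addc(t[0], t[1]);
    uint32_t t2 = addc(t[2], t[3]);

    sum = addc(sum, t0);
    sum = addc(sum, t2);
    return sum;
}
```

The reordering is valid because end-around-carry addition is commutative and
associative; whether the independent adds actually issue in parallel depends
on the CPU.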


# 3c09bae4 15-Aug-2014 Chen Jie <chenj@lemote.com>

MIPS: Use WSBH/DSBH/DSHD on Loongson 3A

Signed-off-by: chenj <chenj@lemote.com>
Cc: linux-mips@linux-mips.org
Cc: chenhc@lemote.com
Patchwork: https://patchwork.linux-mips.org/patch/7542/
Patchwork: https://patchwork.linux-mips.org/patch/7550/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>


# 44ba138f 03-Apr-2014 Maciej W. Rozycki <macro@linux-mips.org>

MIPS: csum_partial.S CPU_DADDI_WORKAROUNDS bug fix

This change reverts most of commit
60724ca59eda766a30be57aec6b49bc3e2bead91 [MIPS: IP checksums: Remove
unncessary .set pseudos] that introduced warnings with the
CPU_DADDI_WORKAROUNDS option set:

arch/mips/lib/csum_partial.S: Assembler messages:
arch/mips/lib/csum_partial.S:467: Warning: used $3 with ".set at=$3"
arch/mips/lib/csum_partial.S:467: Warning: used $3 with ".set at=$3"
arch/mips/lib/csum_partial.S:467: Warning: used $3 with ".set at=$3"
arch/mips/lib/csum_partial.S:467: Warning: used $3 with ".set at=$3"
arch/mips/lib/csum_partial.S:467: Warning: used $3 with ".set at=$3"
arch/mips/lib/csum_partial.S:467: Warning: used $3 with ".set at=$3"
arch/mips/lib/csum_partial.S:467: Warning: used $3 with ".set at=$3"
arch/mips/lib/csum_partial.S:467: Warning: used $3 with ".set at=$3"
arch/mips/lib/csum_partial.S:467: Warning: used $3 with ".set at=$3"
arch/mips/lib/csum_partial.S:467: Warning: used $3 with ".set at=$3"
[...]
arch/mips/lib/csum_partial.S:577: Warning: used $3 with ".set at=$3"
arch/mips/lib/csum_partial.S:577: Warning: used $3 with ".set at=$3"
arch/mips/lib/csum_partial.S:577: Warning: used $3 with ".set at=$3"
arch/mips/lib/csum_partial.S:601: Warning: used $3 with ".set at=$3"
arch/mips/lib/csum_partial.S:601: Warning: used $3 with ".set at=$3"
arch/mips/lib/csum_partial.S:601: Warning: used $3 with ".set at=$3"
arch/mips/lib/csum_partial.S:601: Warning: used $3 with ".set at=$3"
[and so on, and so on...]

The warnings are benign and good code is produced regardless because no
macros that'd use the assembler's temporary register are involved, however
the `.set noat' directives removed by the commit referred to are crucial to
guarantee this is still going to be the case after any changes in the
future. Therefore they need to be brought back into place, which this
change does.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/6686/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>


# 6f85cebe 17-Jan-2014 Markos Chandras <markos.chandras@imgtec.com>

MIPS: lib: csum_partial: Add EVA support

Use EVA specific functions to read and write data to
user address space.

Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>


# e89fb56c 17-Jan-2014 Markos Chandras <markos.chandras@imgtec.com>

MIPS: lib: csum_partial: Add macro to build csum_partial symbols

In preparation for EVA support, we use a macro to build the
__csum_partial_copy_user main code so it can be shared across
multiple implementations. EVA uses the same code but it replaces
the load/store/prefetch instructions with the EVA specific ones
therefore using a macro avoids unnecessary code duplications.

Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>


# 2ab82e66 16-Jan-2014 Markos Chandras <markos.chandras@imgtec.com>

MIPS: lib: csum_partial: Merge EXC and load/store macros

Each load/store macro always adds an entry to the __ex_table
using the EXC macro. There are cases where a load instruction may
never fail such as when we are sure the load happens in the kernel
address space. Therefore, we merge the EXC and LOADX/STOREX
macros into a single one. We also expand the argument list in the EXC
macro to make the macro more flexible. The extra 'type' argument is not
used by this commit, but it will be used when EVA support is added to
memcpy.

Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>


# ac85227f 12-Dec-2013 Markos Chandras <markos.chandras@imgtec.com>

MIPS: checksum: Split the 'copy_user' symbol

The 'copy_user' symbol can be used to copy from or to
userland so we will use two different symbols for these
operations. This makes no difference in the existing code,
but when the core is operating in EVA mode, different instructions
need to be used to read and write to userland address space.
The old function has also been renamed to 'copy_kernel' to denote
that it is suitable for copying data to and from kernel space.

Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>


# e744109f 03-Mar-2013 Gabor Juhos <juhosg@openwrt.org>

MIPS: Use CONFIG_CPU_MIPSR2 in csum_partial.S

The csum_partial implementation contains optimizations for the MIPS R2
instruction set. These optimizations are never enabled however because the
if directive uses the CPU_MIPSR2 constant which is not defined anywhere.

Use the CONFIG_CPU_MIPSR2 constant instead.

Signed-off-by: Gabor Juhos <juhosg@openwrt.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/4971/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
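
Illustrative note on the failure mode described above: a minimal C
preprocessor sketch, assuming the guard was an #ifdef-style test. The
function names are hypothetical, and the #define of CONFIG_CPU_MIPSR2 stands
in for what the kernel configuration would normally provide. A conditional
on a macro that is never defined silently selects the fallback branch, with
no diagnostic from the compiler.

```c
/* Assumption: stands in for the symbol generated from the kernel config. */
#define CONFIG_CPU_MIPSR2 1

/* CPU_MIPSR2 is never defined, so the optimized branch is always skipped. */
static int uses_r2_path_old(void)
{
#ifdef CPU_MIPSR2
    return 1;   /* optimized path, unreachable in practice */
#else
    return 0;   /* generic fallback */
#endif
}

/* Testing the CONFIG_-prefixed symbol selects the optimized branch. */
static int uses_r2_path_new(void)
{
#ifdef CONFIG_CPU_MIPSR2
    return 1;
#else
    return 0;
#endif
}
```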


# 70342287 21-Jan-2013 Ralf Baechle <ralf@linux-mips.org>

MIPS: Whitespace cleanup.

Having received another series of whitespace patches I decided to do this
once and for all rather than dealing with this kind of patches trickling
in forever.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>


# b65a75b8 11-Oct-2008 Ralf Baechle <ralf@linux-mips.org>

MIPS: IP checksums: Optimize adjust of sum on buffers of odd alignment.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>


# 60724ca5 11-Oct-2008 Ralf Baechle <ralf@linux-mips.org>

MIPS: IP checksums: Remove unncessary .set pseudos

They possibly silence meaningful warnings ...

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>


# d86a8123 11-Oct-2008 Ralf Baechle <ralf@linux-mips.org>

MIPS: IP checksums: Remove unncessary folding of sum to 16 bit.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>


# b80a1b80 20-Sep-2008 Atsushi Nemoto <anemo@mba.ocn.ne.jp>

[MIPS] Fix 64-bit IP checksum code

Use unsigned loads to avoid possible miscalculation of IP checksums. This
bug was introduced in f761106cd728bcf65b7fe161b10221ee00cf7132 (lmo) /
ed99e2bc1dc5dc54eb5a019f4975562dbef20103 (kernel.org).

[Original fix by Atsushi. Improved instruction scheduling and fix for
unaligned unsigned load by me -- Ralf]

Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>


# c5ec1983 29-Jan-2008 Ralf Baechle <ralf@linux-mips.org>

[MIPS] Eleminate local symbols from the symbol table.

These symbols appear in oprofile output, stacktraces and similar but only
make the output harder to read. Many identical symbol names such as
"both_aligned" were also being used in multiple source files making it
impossible to see which file actually was meant. So let's get rid of them.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>


# 619b6e18 22-Oct-2007 Maciej W. Rozycki <macro@linux-mips.org>

[MIPS] R4000/R4400 daddiu erratum workaround

This complements the generic R4000/R4400 errata workaround code and adds
bits for the daddiu problem. In most places it just modifies handwritten
assembly code so that the assembler is allowed to use a temporary register
as daddiu may now be treated as a macro that expands to a sequence of li
and daddu. It is the AT register or, where AT is unavailable or used
explicitly for another purpose, an explicitly-named register is selected,
using the .set at=<reg> feature added recently to gas. This feature is
only used if CONFIG_CPU_DADDI_WORKAROUNDS has been set, so if the
workaround remains disabled, the required version of binutils stays
unchanged.

Similarly, daddiu instructions put in branch delay slots in noreorder
fragments are now taken out of them and the assembler is allowed to
reorder them itself as possible (which it does making the whole idea of
scheduling them into delay slots manually questionable).

Also in the very few places where such a simple conversion was not
possible, a handcoded longer sequence is implemented.

Other than that there are changes to code responsible for building the
TLB fault and page clear/copy handlers to avoid daddiu as appropriate.
These are only effective if the erratum is verified to be present at the
run time.

Finally there is a trivial update to __delay(), because it uses daddiu in
a branch delay slot.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>


# f860c90b 12-Dec-2006 Atsushi Nemoto <anemo@mba.ocn.ne.jp>

[MIPS] csum_partial and copy in parallel

Implement optimized asm versions of csum_partial_copy_nocheck,
csum_partial_copy_from_user and csum_and_copy_to_user which can
calculate and copy in parallel, based on memcpy.S.

Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
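
Illustrative note on "calculate and copy in parallel": a minimal C sketch,
assuming word-aligned buffers and a whole number of 32-bit words. The names
addc and copy_and_sum are hypothetical; the real routines are written in
assembly and also handle alignment, tails and faults, none of which is
modelled here. The point is only that each word is loaded once and then both
stored and accumulated in the same pass.

```c
#include <stdint.h>
#include <stddef.h>

/* Ones'-complement add with end-around carry. */
static uint32_t addc(uint32_t sum, uint32_t v)
{
    sum += v;
    return sum + (sum < v);
}

/* Hypothetical helper: copy nwords words and return the updated sum. */
static uint32_t copy_and_sum(uint32_t *dst, const uint32_t *src,
                             size_t nwords, uint32_t sum)
{
    for (size_t i = 0; i < nwords; i++) {
        uint32_t w = src[i];    /* single load per word */

        dst[i] = w;             /* store to the destination ... */
        sum = addc(sum, w);     /* ... and accumulate in the same pass */
    }
    return sum;
}
```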


# ed99e2bc 07-Dec-2006 Atsushi Nemoto <anemo@mba.ocn.ne.jp>

[MIPS] Optimize csum_partial for 64bit kernel

Make csum_partial 64-bit powered.

Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>


# 773ff788 07-Dec-2006 Atsushi Nemoto <anemo@mba.ocn.ne.jp>

[MIPS] Optimize flow of csum_partial

Delete dead code at the end of the function and move small_csumcopy
there. This makes some labels (maybe_end_cruft, small_memcpy,
end_bytes, out) unnecessary and eliminates some branches.

Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>


# 52ffe760 07-Dec-2006 Atsushi Nemoto <anemo@mba.ocn.ne.jp>

[MIPS] Make csum_partial more readable

Use standard o32 register names instead of T0, T1, etc, like memcpy.S.

Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>


# 0bcdda0f 03-Dec-2006 Atsushi Nemoto <anemo@mba.ocn.ne.jp>

[MIPS] Unify csum_partial.S

The 32-bit version and 64-bit version are almost identical. Unify them. This
makes further improvements (for example, copying in parallel, supporting
PREFETCH, etc.) easier.

Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>