History log of /linux-master/arch/arc/include/asm/cache.h
Revision Date Author Comments
# 47910ca3 13-Sep-2019 Vineet Gupta <vgupta@kernel.org>

ARC: mm: move mmu/cache externs out to setup.h

Don't pollute mmu.h and cache.h with ARC internal bootlog/setup
related functions. Move them aside to setup.h

Signed-off-by: Vineet Gupta <vgupta@kernel.org>


# f091d5a4 08-Nov-2019 Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>

ARC: ARCv2: jump label: implement jump label patching

Implement jump label patching for ARC. Jump labels provide
an interface to generate dynamic branches using
self-modifying code.

This allows us to implement conditional branches where
changing the branch direction is expensive but branch selection
is basically 'free'.

This implementation uses 32-bit NOP and BRANCH instructions
which are forced to be 4-byte aligned to guarantee that they don't
cross an L1 cache line boundary and can be updated atomically.
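
To illustrate the alignment argument with a quick user-space check (a
sketch, not kernel code; the 64-byte L1_LINE_SZ is an assumed example,
and the guarantee holds for any power-of-two line size that is a
multiple of 4):

  #include <assert.h>
  #include <stdint.h>

  #define L1_LINE_SZ 64u   /* assumed example line length */

  /* does a len-byte object at addr straddle a line boundary? */
  static int crosses_line(uintptr_t addr, unsigned len)
  {
      return (addr / L1_LINE_SZ) != ((addr + len - 1) / L1_LINE_SZ);
  }

  int main(void)
  {
      /* every 4-byte-aligned 4-byte instruction fits in one line */
      for (uintptr_t a = 0; a < 4096; a += 4)
          assert(!crosses_line(a, 4));
      return 0;
  }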

Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>


# d2912cb1 04-Jun-2019 Thomas Gleixner <tglx@linutronix.de>

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500

Based on 2 normalized pattern(s):

this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation

this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation #

extracted by the scancode license scanner the SPDX license identifier

GPL-2.0-only

has been chosen to replace the boilerplate/reference in 4122 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Enrico Weigelt <info@metux.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# b6835ea7 08-Feb-2019 Alexey Brodkin <abrodkin@synopsys.com>

ARC: define ARCH_SLAB_MINALIGN = 8

The default value of ARCH_SLAB_MINALIGN in "include/linux/slab.h" is
"__alignof__(unsigned long long)", which for ARC unexpectedly turns out
to be 4. This is not a compiler bug; it is as defined by the ARC ABI [1].

Thus the slab allocator would allocate a struct which is 32-bit aligned,
which is generally OK even if the struct has long long members.
There was however a potential problem when it had any atomic64_t, which
uses LLOCKD/SCONDD instructions that are required by the ISA to take
64-bit aligned addresses. This is the problem we ran into:

[ 4.015732] EXT4-fs (mmcblk0p2): re-mounted. Opts: (null)
[ 4.167881] Misaligned Access
[ 4.172356] Path: /bin/busybox.nosuid
[ 4.176004] CPU: 2 PID: 171 Comm: rm Not tainted 4.19.14-yocto-standard #1
[ 4.182851]
[ 4.182851] [ECR ]: 0x000d0000 => Check Programmer's Manual
[ 4.190061] [EFA ]: 0xbeaec3fc
[ 4.190061] [BLINK ]: ext4_delete_entry+0x210/0x234
[ 4.190061] [ERET ]: ext4_delete_entry+0x13e/0x234
[ 4.202985] [STAT32]: 0x80080002 : IE K
[ 4.207236] BTA: 0x9009329c SP: 0xbe5b1ec4 FP: 0x00000000
[ 4.212790] LPS: 0x9074b118 LPE: 0x9074b120 LPC: 0x00000000
[ 4.218348] r00: 0x00000040 r01: 0x00000021 r02: 0x00000001
...
...
[ 4.270510] Stack Trace:
[ 4.274510] ext4_delete_entry+0x13e/0x234
[ 4.278695] ext4_rmdir+0xe0/0x238
[ 4.282187] vfs_rmdir+0x50/0xf0
[ 4.285492] do_rmdir+0x9e/0x154
[ 4.288802] EV_Trap+0x110/0x114

The fix is to make sure slab allocations are 64-bit aligned.

Do note that atomic64_t is __attribute__((aligned(8))), which means gcc
does generate 64-bit aligned references, relative to the beginning of
the container struct. However, if the container itself is not 64-bit
aligned, the atomic64_t ends up unaligned; 64-bit alignment of the
container is what this patch ensures.

[1] https://github.com/foss-for-synopsys-dwc-arc-processors/toolchain/wiki/files/ARCv2_ABI.pdf
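
To make the failure mode concrete, a minimal user-space sketch (the
struct and its fields are illustrative, not kernel code):

  #include <stdio.h>
  #include <stdint.h>

  typedef struct { long long counter; } __attribute__((aligned(8))) atomic64_t;

  struct container {
      int        refcnt;
      atomic64_t v;   /* 8-aligned offset *within* the struct */
  };

  int main(void)
  {
      char buf[64];
      /* simulate a slab object that is 4-aligned but not 8-aligned */
      uintptr_t base = (((uintptr_t)buf + 7) & ~(uintptr_t)7) + 4;
      struct container *c = (struct container *)base;

      /* 8-aligned offset + 4-aligned base = 4-aligned member, so
       * LLOCKD/SCONDD on it would raise a misaligned access */
      printf("&c->v = %p, 8-aligned: %d\n",
             (void *)&c->v, (int)(((uintptr_t)&c->v & 7) == 0));
      return 0;
  }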

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: <stable@vger.kernel.org> # 4.8+
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
[vgupta: reworked changelog, added dependency on LL64+LLSC]


# 3624379d 04-Oct-2018 Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>

ARC: IOC: panic if kernel was started with previously enabled IOC

If IOC was already enabled (due to the bootloader) it technically needs to
be reconfigured with an aperture base/size corresponding to the Linux memory
map, which will certainly be different from U-Boot's. But disabling and
re-enabling IOC while DMA might potentially be active is tricky business.
To avoid random memory issues later, just panic here and ask the user to
upgrade the bootloader to one which doesn't enable IOC.

This was actually seen as an issue on some HSDK boards with a version
of U-Boot which enabled IOC. There were random issues later with starting
X, peripherals, etc.

Also, while at it, replace hardcoded bits in the ARC_REG_IO_COH_PARTIAL
and ARC_REG_IO_COH_ENABLE registers with proper definitions.
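
The shape of the fix, sketched in plain C (the register and bit names
follow the commit; their values and the read_aux_reg()/panic() bodies
here are user-space stand-ins, not the kernel's implementations):

  #include <stdio.h>
  #include <stdlib.h>

  #define ARC_REG_IO_COH_ENABLE  0x500       /* aux reg number, assumed */
  #define ARC_IO_COH_ENABLE_BIT  (1u << 0)   /* enable bit, assumed     */

  static unsigned int read_aux_reg(unsigned int reg) { (void)reg; return 0; }
  static void panic(const char *msg) { fputs(msg, stderr); exit(1); }

  static void arc_ioc_setup(void)
  {
      /* reconfiguring a live IOC while DMA may be in flight is unsafe,
       * so refuse to boot rather than corrupt memory later */
      if (read_aux_reg(ARC_REG_IO_COH_ENABLE) & ARC_IO_COH_ENABLE_BIT)
          panic("IOC already enabled, please upgrade bootloader\n");

      /* ...normal aperture base/size programming would follow... */
  }

  int main(void) { arc_ioc_setup(); return 0; }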

Inspired by: https://lkml.org/lkml/2018/1/19/557
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>


# eb277739 26-Jul-2018 Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>

ARC: dma [non-IOC] setup SMP_CACHE_BYTES and cache_line_size

As of today we don't set up SMP_CACHE_BYTES and cache_line_size for
ARC, so they default to L1_CACHE_BYTES. The L1 line length
(L1_CACHE_BYTES) might easily be smaller than the L2 line (which is
usually the case, BTW). This breaks code.

For example, this breaks the ethernet infrastructure on HSDK/AXS103 boards
with IOC disabled, which involves manual cache flushes.
Functions which allocate and manage the sk_buff packet data area rely on
the SMP_CACHE_BYTES define. As a result we can share the last L2 cache
line of the sk_buff linear packet data area between the DMA buffer and
some useful data in another structure, so we can lose that data when
we invalidate the DMA buffer.

   sk_buff linear packet data area
     |
     |       skb->tail          skb->end
     V           |                  |
                 V                  V
   ------------------------------------------------------------------.
   | packet data    | <tail padding>   | <useful data in other struct>
   ------------------------------------------------------------------.

   ----------------.--------------------------------------------------.
   |    SLC line   |           SLC (L2 cache) line (128B)              |
   ----------------.--------------------------------------------------.
            ^                               ^
            |                               |
     These cache lines will be invalidated when we invalidate the skb
     linear packet data area before the DMA transaction starts.

This leads to issues painful to debug, as the problem reproduces only if
(sk_buff->end - sk_buff->tail) < SLC_LINE_SIZE and
we have some useful data right after sk_buff->end.

Fix that by hardcoding SMP_CACHE_BYTES to the maximum line length we may have.
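
The effect of the fix, sketched (the rounding macro mirrors how sk_buff
helpers pad with SMP_CACHE_BYTES; ARC_MAX_CACHE_LINE is this commit's
assumption that no line exceeds 128 bytes):

  /* with SMP_CACHE_BYTES raised to the largest possible line, the tail
   * padding spans whole SLC lines, so no SLC line is shared between the
   * DMA buffer and a neighbouring structure */
  #define ARC_MAX_CACHE_LINE  128
  #define SMP_CACHE_BYTES     ARC_MAX_CACHE_LINE
  #define SKB_DATA_ALIGN(x)   (((x) + SMP_CACHE_BYTES - 1) & ~(SMP_CACHE_BYTES - 1))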

Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>


# ae0b63d9 31-Jul-2017 Vineet Gupta <vgupta@synopsys.com>

ARCv2: SLC: provide a line based flush routine for debugging

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>


# 9f82e90a 18-Jul-2017 Alexey Brodkin <alexey.brodkin@gmail.com>

ARC: Hardcode ARCH_DMA_MINALIGN to max line length we may have

The current implementation relies on the L1 line length, which might easily
be smaller than the L2 line (which is usually the case, BTW).

Imagine this typical case: the L2 line is 128 bytes while the L1 line is
64 bytes. Now we want to allocate a small buffer and later use it for DMA
(consider IOC is not available).

kmalloc() allocates a small KMALLOC_MIN_SIZE-sized, KMALLOC_MIN_SIZE-aligned
buffer. That way, if the buffer happens to be aligned to an L1 line but not
an L2 line, we'll be flushing and invalidating extra portions of data from
L2, which will cause cache coherency issues.

And since KMALLOC_MIN_SIZE is bound to ARCH_DMA_MINALIGN, the fix can
be simple: set ARCH_DMA_MINALIGN to the largest cache line we may ever
get. As of today, neither the L1 of ARC700 and ARC HS38 nor the SLC may be
longer than 128 bytes.
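
A worked example of the hazard (the addresses are illustrative),
assuming a 64-byte L1 line and a 128-byte SLC line:

  /* kmalloc'ed 64B buffer, 64B-aligned (the old minimum):
   *
   *   buffer   = [0x1040, 0x1080)    64B object
   *   SLC line = [0x1000, 0x1080)    128B line containing it
   *
   * invalidating the buffer after DMA invalidates the whole SLC line,
   * discarding bytes 0x1000..0x103F that belong to someone else; with
   * the buffer 128B-aligned it occupies whole lines and nothing
   * unrelated is hit */
  #define ARCH_DMA_MINALIGN  128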

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>


# 7d79cee2 31-Jul-2017 Alexey Brodkin <Alexey.Brodkin@synopsys.com>

ARCv2: PAE40: Explicitly set MSB counterpart of SLC region ops addresses

It is necessary to explicitly set both SLC_AUX_RGN_START1 and SLC_AUX_RGN_END1,
which hold the MSB bits of the physical addresses of the region start
and end respectively; otherwise the SLC region operation is executed in an
unpredictable manner.

Without this patch, SLC flushes on HSDK (IOC disabled) were taking
seconds.
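
A sketch of the fix (aux register names per the commit; the enum values
and the write_aux_reg() stub are user-space stand-ins):

  #include <stdint.h>

  enum { SLC_AUX_RGN_START, SLC_AUX_RGN_START1,
         SLC_AUX_RGN_END,   SLC_AUX_RGN_END1 };    /* assumed numbering */

  static void write_aux_reg(unsigned int reg, uint32_t val)
  {
      (void)reg; (void)val;   /* real accessor writes the ARC aux reg */
  }

  void slc_rgn_op(uint64_t start, uint64_t end)
  {
      write_aux_reg(SLC_AUX_RGN_END,    (uint32_t)end);
      write_aux_reg(SLC_AUX_RGN_END1,   (uint32_t)(end >> 32));   /* MSB: the fix */
      write_aux_reg(SLC_AUX_RGN_START,  (uint32_t)start);
      write_aux_reg(SLC_AUX_RGN_START1, (uint32_t)(start >> 32)); /* MSB: the fix */
  }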

Cc: stable@vger.kernel.org #4.4+
Reported-by: Vladimir Kondratiev <vladimir.kondratiev@intel.com>
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
[vgupta: PAR40 regs only written if PAE40 exist]


# f734a310 02-May-2017 Vineet Gupta <vgupta@synopsys.com>

ARCv2: mm: micro-optimize region flush generated code

DC_CTRL.RGN_OP is 3 bits wide, however only 1 bit is used in the current
programming model (0: flush, 1: invalidate).

The current code targeting all 3 bits leads to an additional 8-byte AND
operation, which can be elided given that only 1 bit is ever set by
software and/or looked at by hardware.

before
------

| 80b63324 <__dma_cache_wback_inv_l1>:
| 80b63324: clri r3
| 80b63328: lr r2,[dc_ctrl]
| 80b6332c: and r2,r2,0xfffff1ff <--- 8 bytes insn
| 80b63334: or r2,r2,576
| 80b63338: sr r2,[dc_ctrl]
| ...
| ...
| 80b63360 <__dma_cache_inv_l1>:
| 80b63360: clri r3
| 80b63364: lr r2,[dc_ctrl]
| 80b63368: and r2,r2,0xfffff1ff <--- 8 bytes insn
| 80b63370: bset_s r2,r2,0x9
| 80b63372: sr r2,[dc_ctrl]
| ...
| ...
| 80b6338c <__dma_cache_wback_l1>:
| 80b6338c: clri r3
| 80b63390: lr r2,[dc_ctrl]
| 80b63394: and r2,r2,0xfffff1ff <--- 8 bytes insn
| 80b6339c: sr r2,[dc_ctrl]

after (AND elided totally in 2 cases, replaced with 2-byte BCLR in 3rd)
-----

| 80b63324 <__dma_cache_wback_inv_l1>:
| 80b63324: clri r3
| 80b63328: lr r2,[dc_ctrl]
| 80b6332c: or r2,r2,576
| 80b63330: sr r2,[dc_ctrl]
| ...
| ...
| 80b63358 <__dma_cache_inv_l1>:
| 80b63358: clri r3
| 80b6335c: lr r2,[dc_ctrl]
| 80b63360: bset_s r2,r2,0x9
| 80b63362: sr r2,[dc_ctrl]
| ...
| ...
| 80b6337c <__dma_cache_wback_l1>:
| 80b6337c: clri r3
| 80b63380: lr r2,[dc_ctrl]
| 80b63384: bclr_s r2,r2,0x9
| 80b63386: sr r2,[dc_ctrl]
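
The same insight in C terms (a sketch; the field mask matches the
0xfffff1ff AND above, and bit 9 is the flush-vs-invalidate selector):

  #define DC_CTRL_RGN_OP_MSK  (0x7u << 9)   /* 3-bit field, bits [11:9] */
  #define DC_CTRL_RGN_OP_INV  (0x1u << 9)   /* only bit ever set/tested */

  /* before: clear the whole field, then set the op (needs the wide AND) */
  static unsigned int rgn_op_before(unsigned int ctrl, unsigned int op)
  {
      return (ctrl & ~DC_CTRL_RGN_OP_MSK) | op;
  }

  /* after: flip just bit 9, which maps to short bset_s/bclr_s insns */
  static unsigned int rgn_op_after(unsigned int ctrl, int inv)
  {
      return inv ? (ctrl | DC_CTRL_RGN_OP_INV)
                 : (ctrl & ~DC_CTRL_RGN_OP_INV);
  }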

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>


# 0d77117f 28-Aug-2014 Vineet Gupta <vgupta@synopsys.com>

ARCv2: mm: Implement cache region flush operations

These are more efficient than the per-line ops

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>


# 8c47f83b 22-Jun-2016 Vineet Gupta <vgupta@synopsys.com>

ARCv2: IOC: Adhere to programming model guidelines to avoid DMA corruption

On AXS103 release bitfiles, DMA data corruption was seen because the IOC
setup was not following the way recommended in the documentation.

Flipping IOC on when caches are enabled or coherency transactions are in
flight might cause some of the memory operations to not observe
coherency as expected.

So strictly follow the programming model recommendations as documented
in the comment header above arc_ioc_setup().

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>


# d4911cdd 22-Jun-2016 Vineet Gupta <vgupta@synopsys.com>

ARCv2: IOC: refactor the IOC and SLC operations into own functions

- Move IOC setup into arc_ioc_setup()
- Move SLC disabling into arc_slc_disable()

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>


# cf986d47 13-Oct-2016 Vineet Gupta <vgupta@synopsys.com>

ARCv2: IOC: use @ioc_enable not @ioc_exist where intended

If the user disables IOC from the debugger at startup (by clearing
@ioc_enable), @ioc_exists is cleared too. This means boot prints don't
capture the fact that IOC was present but disabled, which could be
misleading.

So invert how we use @ioc_enable and @ioc_exists and make it more
canonical. @ioc_exists represents whether the hardware is present or not
and stays the same whether enabled or not. @ioc_enable is still user
driven, but will be auto-disabled if the IOC hardware is not present,
i.e. if @ioc_exists=0. This is the opposite of what we were doing before,
but much clearer.

This means @ioc_enable is now the "exported" toggle in the rest of the
code, such as the DMA mapping API.
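
The new semantics, sketched in C (variable names from the commit; the
probe function is illustrative):

  int ioc_exists;       /* hardware present; probed once, never changes */
  int ioc_enable = 1;   /* user/debugger-driven toggle                  */

  void arc_ioc_probe(int hw_present)
  {
      ioc_exists = hw_present;
      if (!ioc_exists)
          ioc_enable = 0;   /* can't enable absent hardware */
      /* boot log can still report "IOC present but disabled" */
  }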

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>


# 26c01c49 26-Aug-2016 Vineet Gupta <vgupta@synopsys.com>

ARCv2: Support dynamic peripheral address space in HS38 rel 3.0 cores

HS release 3.0 provides for even more flexibility in specifying the
volatile address space for mapping peripherals.

With HS 2.1 @start was made flexible / programmable; with HS 3.0 even
@end can be set up (vs. fixed to 0xFFFF_FFFF before).

So add code to reflect that, and while at it remove an unused struct
definition.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>


# deaf7565 24-Oct-2015 Vineet Gupta <vgupta@synopsys.com>

ARCv2: ioremap: Support dynamic peripheral address space

The peripheral address space is an architectural address window which is
uncached and typically used to wire up peripherals.

For ARC700 cores (ARCompact ISA based) this was fixed to the 1GB region
0xC000_0000 - 0xFFFF_FFFF.

For ARCv2 based HS38 cores the start address is flexible and can be
0xC000_0000, 0xD000_0000, 0xE000_0000 or 0xF000_0000, by programming the
AUX_NON_VOLATILE_LIMIT reg (typically done in the bootloader).

Further, in case of PAE, physical addresses can extend beyond 4GB, so we
need to confine this check; otherwise all pages beyond 4GB would be
treated as uncached.
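
A sketch of the confined check (the window bounds shown are assumed
defaults; on HS38 the start really comes from AUX_NON_VOLATILE_LIMIT):

  #include <stdbool.h>
  #include <stdint.h>

  static uint64_t perip_base = 0xC0000000ull;   /* programmable on HS38 */

  bool paddr_is_uncached(uint64_t paddr)
  {
      /* confine to the 32-bit architectural window: PAE40 addresses
       * beyond 4GB are ordinary memory and must stay cacheable */
      return paddr >= perip_base && paddr <= 0xFFFFFFFFull;
  }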

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>


# 4b32e89a 18-Dec-2015 Alexey Brodkin <Alexey.Brodkin@synopsys.com>

ARC: mm: fix building for MMU v2

ARC700 cores with MMU v2 don't have the IC_PTAG AUX register, and so we only
define ARC_REG_IC_PTAG for MMU versions >= 3.

But the current implementation of the cache_line_loop_vX() routines assumes
availability of all of them (v2, v3 and v4) simultaneously.

And with ARC_REG_IC_PTAG undefined if CONFIG_MMU_VER=2, we're seeing a
compilation problem:
---------------------------------->8-------------------------------
CC arch/arc/mm/cache.o
arch/arc/mm/cache.c: In function '__cache_line_loop_v3':
arch/arc/mm/cache.c:270:13: error: 'ARC_REG_IC_PTAG' undeclared (first use in this function)
aux_tag = ARC_REG_IC_PTAG;
^
arch/arc/mm/cache.c:270:13: note: each undeclared identifier is reported only once for each function it appears in
scripts/Makefile.build:258: recipe for target 'arch/arc/mm/cache.o' failed
---------------------------------->8-------------------------------

The simplest fix is to have ARC_REG_IC_PTAG defined regardless of the MMU
version being used.

We don't use it in cache_line_loop_v2() anyway, so who cares.

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>


# 5a364c2a 06-Feb-2015 Vineet Gupta <vgupta@synopsys.com>

ARC: mm: PAE40 support

This is the first working implementation of 40-bit physical address
extension on ARCv2.

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>


# f2b0b25a 25-May-2015 Alexey Brodkin <abrodkin@synopsys.com>

ARCv2: Support IO Coherency and permutations involving L1 and L2 caches

In case of ARCv2 CPUs there could be the following configurations
that affect cache handling for data exchanged with peripherals
via DMA:
[1] Only L1 cache exists
[2] Both L1 and L2 exist, but no IO coherency unit
[3] L1, L2 caches and IO coherency unit exist

The current implementation takes care of [1] and [2].
Moreover, support of [2] is implemented with a run-time check
for SLC existence, which is not optimal.

This patch introduces support of [3] and reworks DMA ops
usage. Instead of doing a run-time check every time a particular
DMA op is executed, we'll have 3 different implementations of
DMA ops and select the appropriate one during init (see the
sketch after the list below).

As for IOC support, we need to:
[a] Implement empty DMA ops, because the IOC takes care of cache
coherency with DMAed data
[b] Route dma_alloc_coherent() via dma_alloc_noncoherent();
this is required to make IOC work in the first place and also
serves as an optimization, as LD/ST to coherent buffers can be
serviced from caches without going all the way to memory
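
The init-time selection, sketched (the function names are illustrative;
the bodies are stubs standing in for the real flush routines):

  #include <stddef.h>

  static void dma_cache_nop(void *buf, size_t sz)       { (void)buf; (void)sz; }
  static void dma_cache_wback_l1(void *buf, size_t sz)  { (void)buf; (void)sz; /* flush L1 */ }
  static void dma_cache_wback_slc(void *buf, size_t sz) { (void)buf; (void)sz; /* flush L1+SLC */ }

  static void (*__dma_cache_wback)(void *, size_t);

  void arc_dma_init(int ioc_on, int slc_exists)
  {
      if (ioc_on)                                    /* [3] IOC snoops DMA     */
          __dma_cache_wback = dma_cache_nop;
      else if (slc_exists)                           /* [2] two levels to sync */
          __dma_cache_wback = dma_cache_wback_slc;
      else                                           /* [1] only L1 to sync    */
          __dma_cache_wback = dma_cache_wback_l1;
  }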

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
[vgupta:
-Added some comments about IOC gains
-Marked dma ops as static,
-Massaged changelog a bit]
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>


# 795f4558 02-Apr-2015 Vineet Gupta <vgupta@synopsys.com>

ARCv2: SLC: Handle explicit flush for DMA ops (w/o IO-coherency)

The L2 cache on ARCHS processors is called SLC (System Level Cache).
For working DMA (in the absence of hardware assisted IO Coherency) we need
to manage the SLC explicitly when buffers transition between the CPU and
peripheral controllers.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>


# bcc4d65a 04-Jun-2015 Vineet Gupta <vgupta@synopsys.com>

ARCv2: MMUv4: support aliasing icache config

This is also the default for the AXS103 release.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>


# d1f317d8 06-Apr-2015 Vineet Gupta <vgupta@synopsys.com>

ARCv2: MMUv4: cache programming model changes

Caveats about cache flush on ARCv2 based cores:

- dcache is PIPT, so paddr is sufficient for cache maintenance ops (no
need to set up the PTAG reg)

- icache is still VIPT, but only aliasing configs need PTAG setup

So basically this is a departure from MMU-v3, which always needs vaddr in
the line ops registers (DC_IVDL, DC_FLDL, IC_IVIL) but paddr in DC_PTAG,
IC_PTAG respectively.
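
The difference, sketched (register mnemonics per the commit; the enum
values, the MMU_V3 switch and the write_aux_reg() stub are assumptions):

  enum { ARC_REG_DC_IVDL, ARC_REG_DC_PTAG };   /* assumed reg numbers */

  static void write_aux_reg(unsigned int reg, unsigned long val)
  {
      (void)reg; (void)val;   /* stand-in for the ARC aux accessor */
  }

  void dc_line_inv(unsigned long paddr, unsigned long vaddr)
  {
  #ifdef MMU_V3
      write_aux_reg(ARC_REG_DC_PTAG, paddr);   /* v3: paddr via PTAG...    */
      write_aux_reg(ARC_REG_DC_IVDL, vaddr);   /* ...vaddr in the line reg */
  #else
      write_aux_reg(ARC_REG_DC_IVDL, paddr);   /* v4: PIPT, paddr suffices */
      (void)vaddr;                             /* unused in the v4 path    */
  #endif
  }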

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>


# c4aa49df 18-Sep-2014 Vineet Gupta <vgupta@synopsys.com>

ARC: Update comments about uncached address space

Suggested-by: Noam Camus <noamc@ezchip.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>


# 230a15f1 11-Jun-2014 Paul Bolle <pebolle@tiscali.nl>

ARC: remove checks for CONFIG_ARC_MMU_V4

There's no Kconfig symbol ARC_MMU_V4 so the checks for CONFIG_ARC_MMU_V4
will always evaluate to false. Remove them.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>


# ef680cdc 07-Mar-2014 Vineet Gupta <vgupta@synopsys.com>

ARC: Disable caches in early boot if so configured

Requested-by: Noam Camus <noamc@ezchip.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>


# 63d2dfdb 05-Sep-2013 Vineet Gupta <vgupta@synopsys.com>

ARC: cacheflush refactor #2: I and D cache lines to have same size

Having them be different seems an obscure configuration.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>


# 07b9b651 05-Sep-2013 Vineet Gupta <vgupta@synopsys.com>

ARC: fix new Section mismatches in build (post __cpuinit cleanup)

--------------->8--------------------
WARNING: vmlinux.o(.text+0x708): Section mismatch in reference from the
function read_arc_build_cfg_regs() to the function
.init.text:read_decode_cache_bcr()

WARNING: vmlinux.o(.text+0x702): Section mismatch in reference from the
function read_arc_build_cfg_regs() to the function
.init.text:read_decode_mmu_bcr()
--------------->8--------------------

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>


# 30499186 14-Jun-2013 Vineet Gupta <vgupta@synopsys.com>

ARC: cache detection code bitrot

* Number of (i|d)cache ways can be retrieved from BCRs, hence no need
to cross-check with built-in constants
* Use of IS_ENABLED() to check for a Kconfig option
* is_not_cache_aligned() not used anymore

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>


# da1677b0 14-May-2013 Vineet Gupta <vgupta@synopsys.com>

ARC: Disintegrate arcregs.h

* Move the various sub-system defines/types into relevant files/functions
(reduces compilation time)

* move CPU specific stuff out of asm/tlb.h into asm/mmu.h

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>


# 8235703e 31-May-2013 Vineet Gupta <vgupta@synopsys.com>

ARC: Use kconfig helper IS_ENABLED() to get rid of defines.h

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>


# 5bba49f5 09-May-2013 Vineet Gupta <vgupta@synopsys.com>

ARC: [mm] Aliasing VIPT dcache support 4/4

Enforce congruency of userspace shared mappings

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>


# 95d6976d 18-Jan-2013 Vineet Gupta <vgupta@synopsys.com>

ARC: Cache Flush Management

* ARC700 has VIPT L1 Caches
* Caches don't snoop and are not coherent
* Given the PAGE_SIZE and Cache associativity, we don't support aliasing
D$ configurations (yet), but do allow aliasing I$ configs

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>


# 3be80aae 18-Jan-2013 Vineet Gupta <vgupta@synopsys.com>

ARC: Fundamental ARCH data-types/defines

* L1_CACHE_SHIFT
* PAGE_SIZE, PAGE_OFFSET
* struct pt_regs, struct user_regs_struct
* struct thread_struct, cpu_relax(), task_pt_regs(), start_thread(), ...
* struct thread_info, THREAD_SIZE, INIT_THREAD_INFO(), TIF_*, ...
* BUG()
* ELF_*
* Elf_*

To disallow user-space visibility into some of the core kernel data-types,
such as struct pt_regs, wrap them in #ifdef __KERNEL__, which also makes the
UAPI header split (further patch in the series) NOT export them to
asm/uapi/ptrace.h.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Jonas Bonn <jonas.bonn@gmail.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Acked-by: Arnd Bergmann <arnd@arndb.de>