#
dd7c7ab0 |
|
25-Aug-2020 |
Vineet Gupta <vgupta@synopsys.com> |
ARC: [plat-eznps]: Drop support for EZChip NPS platform NPS customers are no longer doing active development, as evident from rand config build failures reported in recent times, so drop support for NPS platform. Tested-by: kernel test robot <lkp@intel.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
#
d2912cb1 |
|
04-Jun-2019 |
Thomas Gleixner <tglx@linutronix.de> |
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 Based on 2 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation # extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 4122 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Enrico Weigelt <info@metux.net> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
3032f0c9 |
|
07-Mar-2019 |
Vineet Gupta <vgupta@synopsys.com> |
ARCv2: spinlock: remove the extra smp_mb before lock, after unlock - ARCv2 LLSC spinlocks have smp_mb() both before and after the LLSC instructions, which is not required per lkmm ACQ/REL semantics. smp_mb() is only needed _after_ lock and _before_ unlock. So remove the extra barriers. The reason they were there was mainly historical. At the time of initial SMP Linux bringup on HS38 cores, I was too conservative, given the fluidity of both hw and sw. The last attempt to ditch the extra barrier showed some hackbench regression which is apparently not the case now (atleast for LLSC case, read on...) - EX based spinlocks (!CONFIG_ARC_HAS_LLSC) still needs the extra smp_mb(), not due to lkmm, but due to some hardware shenanigans. W/o that, hackbench triggers RCU stall splat so extra DMB is retained !LLSC based systems are not realistic Linux sstem anyways so they can afford to be a nit suboptimal ;-) | [ARCLinux]# for i in (seq 1 1 5) ; do hackbench; done | Running with 10 groups 400 process | INFO: task hackbench:158 blocked for more than 10 seconds. | Not tainted 4.20.0-00005-g96b18288a88e-dirty #117 | "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. | hackbench D 0 158 135 0x00000000 | | Stack Trace: | watchdog: BUG: soft lockup - CPU#3 stuck for 59s! [hackbench:469] | Modules linked in: | Path: (null) | CPU: 3 PID: 469 Comm: hackbench Not tainted 4.20.0-00005-g96b18288a88e-dirty | | [ECR ]: 0x00000000 => Check Programmer's Manual | [EFA ]: 0x00000000 | [BLINK ]: do_exit+0x4a6/0x7d0 | [ERET ]: _raw_write_unlock_irq+0x44/0x5c - And while at it, remove the extar smp_mb() from EX based arch_read_trylock() since the spin lock there guarantees a full barrier anyways - For LLSC case, hackbench threads improves with this patch (HAPS @ 50MHz) ---- before ---- | | [ARCLinux]# for i in 1 2 3 4 5; do hackbench 10 thread; done | Running with 10 groups 400 threads | Time: 16.253 | Time: 16.445 | Time: 16.590 | Time: 16.721 | Time: 16.544 ---- after ---- | | [ARCLinux]# for i in 1 2 3 4 5; do hackbench 10 thread; done | Running with 10 groups 400 threads | Time: 15.638 | Time: 15.730 | Time: 15.870 | Time: 15.842 | Time: 15.729 Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
#
a4c1887d |
|
03-Oct-2017 |
Will Deacon <will@kernel.org> |
locking/arch: Remove dummy arch_{read,spin,write}_lock_flags() implementations The arch_{read,spin,write}_lock_flags() macros are simply mapped to the non-flags versions by the majority of architectures, so do this in core code and remove the dummy implementations. Also remove the implementation in spinlock_up.h, since all callers of do_raw_spin_lock_flags() call local_irq_save(flags) anyway. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: paulmck@linux.vnet.ibm.com Link: http://lkml.kernel.org/r/1507055129-12300-4-git-send-email-will.deacon@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
0160fb17 |
|
03-Oct-2017 |
Will Deacon <will@kernel.org> |
locking/arch: Remove dummy arch_{read,spin,write}_relax() implementations arch_{read,spin,write}_relax() are defined as cpu_relax() by the core code, so architectures that can't do better (i.e. most of them) don't need to bother with the dummy definitions. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: paulmck@linux.vnet.ibm.com Link: http://lkml.kernel.org/r/1507055129-12300-3-git-send-email-will.deacon@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
a8a217c2 |
|
03-Oct-2017 |
Will Deacon <will@kernel.org> |
locking/core: Remove {read,spin,write}_can_lock() Outside of the locking code itself, {read,spin,write}_can_lock() have no users in tree. Apparmor (the last remaining user of write_can_lock()) got moved over to lockdep by the previous patch. This patch removes the use of {read,spin,write}_can_lock() from the BUILD_LOCK_OPS macro, deferring to the trylock operation for testing the lock status, and subsequently removes the unused macros altogether. They aren't guaranteed to work in a concurrent environment and can give incorrect results in the case of qrwlock. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: paulmck@linux.vnet.ibm.com Link: http://lkml.kernel.org/r/1507055129-12300-2-git-send-email-will.deacon@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
1112c3b2 |
|
28-May-2017 |
Noam Camus <noamca@mellanox.com> |
ARC: [plat-eznps] spinlock aware for MTM This way when we execute "ex" during trying to hold lock we can switch to other HW thread and utilize the core intead of just spinning on a lock. We noticed about 10% improvement of execution time with hackbench test. Signed-off-by: Noam Camus <noamca@mellanox.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
#
c2bdac14 |
|
17-Oct-2015 |
Vineet Gupta <vgupta@synopsys.com> |
ARC: spinlock: Document the EX based spin_unlock Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
#
952111d7 |
|
29-Jun-2017 |
Paul E. McKenney <paulmck@kernel.org> |
arch: Remove spin_unlock_wait() arch-specific definitions There is no agreed-upon definition of spin_unlock_wait()'s semantics, and it appears that all callers could do just as well with a lock/unlock pair. This commit therefore removes the underlying arch-specific arch_spin_unlock_wait() for all architectures providing them. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: <linux-arch@vger.kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Andrea Parri <parri.andrea@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Will Deacon <will.deacon@arm.com> Acked-by: Boqun Feng <boqun.feng@gmail.com>
|
#
726328d9 |
|
26-May-2016 |
Peter Zijlstra <peterz@infradead.org> |
locking/spinlock, arch: Update and fix spin_unlock_wait() implementations This patch updates/fixes all spin_unlock_wait() implementations. The update is in semantics; where it previously was only a control dependency, we now upgrade to a full load-acquire to match the store-release from the spin_unlock() we waited on. This ensures that when spin_unlock_wait() returns, we're guaranteed to observe the full critical section we waited on. This fixes a number of spin_unlock_wait() users that (not unreasonably) rely on this. I also fixed a number of ticket lock versions to only wait on the current lock holder, instead of for a full unlock, as this is sufficient. Furthermore; again for ticket locks; I added an smp_rmb() in between the initial ticket load and the spin loop testing the current value because I could not convince myself the address dependency is sufficient, esp. if the loads are of different sizes. I'm more than happy to remove this smp_rmb() again if people are certain the address dependency does indeed work as expected. Note: PPC32 will be fixed independently Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: chris@zankel.net Cc: cmetcalf@mellanox.com Cc: davem@davemloft.net Cc: dhowells@redhat.com Cc: james.hogan@imgtec.com Cc: jejb@parisc-linux.org Cc: linux@armlinux.org.uk Cc: mpe@ellerman.id.au Cc: ralf@linux-mips.org Cc: realmz6@gmail.com Cc: rkuo@codeaurora.org Cc: rth@twiddle.net Cc: schwidefsky@de.ibm.com Cc: tony.luck@intel.com Cc: vgupta@synopsys.com Cc: ysato@users.sourceforge.jp Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
ed6aefed |
|
31-May-2016 |
Vineet Gupta <vgupta@synopsys.com> |
Revert "ARCv2: spinlock/rwlock/atomics: Delayed retry of failed SCOND with exponential backoff" This reverts commit e78fdfef84be13a5c2b8276e12203cdf24778596. The issue was fixed in hardware in HS2.1C release and there are no known external users of affected RTL so revert the whole delayed retry series ! Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
#
819f3602 |
|
31-May-2016 |
Vineet Gupta <vgupta@synopsys.com> |
Revert "ARCv2: spinlock/rwlock: Reset retry delay when starting a new spin-wait cycle" This reverts commit b89aa12c177477e34caa722818536fb5d0bffd76. The issue was fixed in hardware in HS2.1C release and there are no known external users of affected RTL so revert the whole delayed retry series ! Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
#
42316a20 |
|
31-May-2016 |
Vineet Gupta <vgupta@synopsys.com> |
Revert "ARCv2: spinlock/rwlock/atomics: reduce 1 instruction in exponential backoff" This reverts commit 10971638701dedadb58c88ce4d31c9375b224ed6. The issue was fixed in hardware in HS2.1C release and there are no known external users of affected RTL - so revert thw whole delayed retry series ! Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
#
2a1021fc |
|
09-Jun-2015 |
Noam Camus <noamc@ezchip.com> |
ARC: rwlock: disable interrupts in !LLSC variant If we hold rwlock and interrupt occures we may end up spinning on it for ever during softirq. Note that this lock is an internal lock and since the lock is free to be used from any context, the lock needs to be IRQ-safe. Below you may see an example for interrupt we get while nl_table_lock is holding its rw->lock_mutex and we spinned on it for ever. The concept for the fix was taken from SPARC. [2015-05-12 19:16:12] Stack Trace: [2015-05-12 19:16:12] arc_unwind_core+0xb8/0x11c [2015-05-12 19:16:12] dump_stack+0x68/0xac [2015-05-12 19:16:12] _raw_read_lock+0xa8/0xac [2015-05-12 19:16:12] netlink_broadcast_filtered+0x56/0x35c [2015-05-12 19:16:12] nlmsg_notify+0x42/0xa4 [2015-05-12 19:16:13] neigh_update+0x1fe/0x44c [2015-05-12 19:16:13] neigh_event_ns+0x40/0xa4 [2015-05-12 19:16:13] arp_process+0x46e/0x5a8 [2015-05-12 19:16:13] __netif_receive_skb_core+0x358/0x500 [2015-05-12 19:16:13] process_backlog+0x92/0x154 [2015-05-12 19:16:13] net_rx_action+0xb8/0x188 [2015-05-12 19:16:13] __do_softirq+0xda/0x1d8 [2015-05-12 19:16:14] irq_exit+0x8a/0x8c [2015-05-12 19:16:14] arch_do_IRQ+0x6c/0xa8 [2015-05-12 19:16:14] handle_interrupt_level1+0xe4/0xf0 Signed-off-by: Noam Camus <noamc@ezchip.com> Acked-by: Peter Zijlstra <peterz@infradead.org>
|
#
10971638 |
|
07-Aug-2015 |
Vineet Gupta <vgupta@synopsys.com> |
ARCv2: spinlock/rwlock/atomics: reduce 1 instruction in exponential backoff The increment of delay counter was 2 instructions: Arithmatic Shfit Left (ASL) + set to 1 on overflow This can be done in 1 using ROtate Left (ROL) Suggested-by: Nigel Topham <ntopham@synopsys.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
#
b89aa12c |
|
21-Jul-2015 |
Vineet Gupta <vgupta@synopsys.com> |
ARCv2: spinlock/rwlock: Reset retry delay when starting a new spin-wait cycle The previous commit for delayed retry of SCOND needs some fine tuning for spin locks. The backoff from delayed retry in conjunction with spin looping of lock itself can potentially cause the delay counter to reach high values. So to provide fairness to any lock operation, after a lock "seems" available (i.e. just before first SCOND try0, reset the delay counter back to starting value of 1 Essentially reset delay to 1 for a new spin-wait-loop-acquire cycle. Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
#
e78fdfef |
|
14-Jul-2015 |
Vineet Gupta <vgupta@synopsys.com> |
ARCv2: spinlock/rwlock/atomics: Delayed retry of failed SCOND with exponential backoff This is to workaround the llock/scond livelock HS38x4 could get into a LLOCK/SCOND livelock in case of multiple overlapping coherency transactions in the SCU. The exclusive line state keeps rotating among contenting cores leading to a never ending cycle. So break the cycle by deferring the retry of failed exclusive access (SCOND). The actual delay needed is function of number of contending cores as well as the unrelated coherency traffic from other cores. To keep the code simple, start off with small delay of 1 which would suffice most cases and in case of contention double the delay. Eventually the delay is sufficient such that the coherency pipeline is drained, thus a subsequent exclusive access would succeed. Link: http://lkml.kernel.org/r/1438612568-28265-1-git-send-email-vgupta@synopsys.com Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
#
69cbe630 |
|
15-Jul-2015 |
Vineet Gupta <vgupta@synopsys.com> |
ARC: LLOCK/SCOND based rwlock With LLOCK/SCOND, the rwlock counter can be atomically updated w/o need for a guarding spin lock. This in turn elides the EXchange instruction based spinning which causes the cacheline transition to exclusive state and concurrent spinning across cores would cause the line to keep bouncing around. LLOCK/SCOND based implementation is superior as spinning on LLOCK keeps the cacheline in shared state. Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
#
ae7eae9e |
|
14-Jul-2015 |
Vineet Gupta <vgupta@synopsys.com> |
ARC: LLOCK/SCOND based spin_lock Current spin_lock uses EXchange instruction to implement the atomic test and set of lock location (reads orig value and ST 1). This however forces the cacheline into exclusive state (because of the ST) and concurrent loops in multiple cores will bounce the line around between cores. Instead, use LLOCK/SCOND to implement the atomic test and set which is better as line is in shared state while lock is spinning on LLOCK The real motivation of this change however is to make way for future changes in atomics to implement delayed retry (with backoff). Initial experiment with delayed retry in atomics combined with orig EX based spinlock was a total disaster (broke even LMBench) as struct sock has a cache line sharing an atomic_t and spinlock. The tight spinning on lock, caused the atomic retry to keep backing off such that it would never finish. Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
#
2576c28e |
|
20-Nov-2014 |
Vineet Gupta <vgupta@synopsys.com> |
ARC: add smp barriers around atomics per Documentation/atomic_ops.txt - arch_spin_lock/unlock were lacking the ACQUIRE/RELEASE barriers Since ARCv2 only provides load/load, store/store and all/all, we need the full barrier - LLOCK/SCOND based atomics, bitops, cmpxchg, which return modified values were lacking the explicit smp barriers. - Non LLOCK/SCOND varaints don't need the explicit barriers since that is implicity provided by the spin locks used to implement the critical section (the spin lock barriers in turn are also fixed in this commit as explained above Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: stable@vger.kernel.org Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
#
6c00350b |
|
25-Sep-2013 |
Vineet Gupta <vgupta@synopsys.com> |
ARC: Workaround spinlock livelock in SMP SystemC simulation Some ARC SMP systems lack native atomic R-M-W (LLOCK/SCOND) insns and can only use atomic EX insn (reg with mem) to build higher level R-M-W primitives. This includes a SystemC based SMP simulation model. So rwlocks need to use a protecting spinlock for atomic cmp-n-exchange operation to update reader(s)/writer count. The spinlock operation itself looks as follows: mov reg, 1 ; 1=locked, 0=unlocked retry: EX reg, [lock] ; load existing, store 1, atomically BREQ reg, 1, rety ; if already locked, retry In single-threaded simulation, SystemC alternates between the 2 cores with "N" insn each based scheduling. Additionally for insn with global side effect, such as EX writing to shared mem, a core switch is enforced too. Given that, 2 cores doing a repeated EX on same location, Linux often got into a livelock e.g. when both cores were fiddling with tasklist lock (gdbserver / hackbench) for read/write respectively as the sequence diagram below shows: core1 core2 -------- -------- 1. spin lock [EX r=0, w=1] - LOCKED 2. rwlock(Read) - LOCKED 3. spin unlock [ST 0] - UNLOCKED spin lock [EX r=0,w=1] - LOCKED -- resched core 1---- 5. spin lock [EX r=1] - ALREADY-LOCKED -- resched core 2---- 6. rwlock(Write) - READER-LOCKED 7. spin unlock [ST 0] 8. rwlock failed, retry again 9. spin lock [EX r=0, w=1] -- resched core 1---- 10 spinlock locked in #9, retry #5 11. spin lock [EX gets 1] -- resched core 2---- ... ... The fix was to unlock using the EX insn too (step 7), to trigger another SystemC scheduling pass which would let core1 proceed, eliding the livelock. Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
#
6e35fa2d |
|
18-Jan-2013 |
Vineet Gupta <vgupta@synopsys.com> |
ARC: Spinlock/rwlock/mutex primitives Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Acked-by: Arnd Bergmann <arnd@arndb.de>
|