History log of /linux-master/arch/powerpc/include/asm/cputhreads.h
Revision Date Author Comments
# b350111b 04-Nov-2021 Nicholas Piggin <npiggin@gmail.com>

powerpc: remove cpu_online_cores_map function

This function builds the cores online map with on-stack cpumasks which
can cause high stack usage with large NR_CPUS.

It is not used in any performance sensitive paths, so instead just check
for first thread sibling.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Tested-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211105035042.1398309-1-npiggin@gmail.com


# 77bbbc0c 01-Jun-2021 Suraj Jitindar Singh <sjitindarsingh@gmail.com>

KVM: PPC: Book3S HV: Fix TLB management on SMT8 POWER9 and POWER10 processors

The POWER9 vCPU TLB management code assumes all threads in a core share
a TLB, and that TLBIEL execued by one thread will invalidate TLBs for
all threads. This is not the case for SMT8 capable POWER9 and POWER10
(big core) processors, where the TLB is split between groups of threads.
This results in TLB multi-hits, random data corruption, etc.

Fix this by introducing cpu_first_tlb_thread_sibling etc., to determine
which siblings share TLBs, and use that in the guest TLB flushing code.

[npiggin@gmail.com: add changelog and comment]

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210602040441.3984352-1-npiggin@gmail.com


# f3232321 07-Aug-2020 Srikar Dronamraju <srikar@linux.vnet.ibm.com>

powerpc/topology: Override cpu_smt_mask

On Power9, a pair of SMT4 cores can be presented by the firmware as a SMT8
core for backward compatibility reasons, with the fusion of two SMT4 cores.
Powerpc allows LPARs to be live migrated from Power8 to Power9. Existing
software developed/configured for Power8, expects to see a SMT8 core.

In order to maintain userspace backward compatibility (with Power8 chips in
case of Power9) in enterprise Linux systems, the topology_sibling_cpumask
has to be set to SMT8 core.

cpu_smt_mask() should generally point to the cpu mask of the SMT4 core.
Hence override the default cpu_smt_mask() to be powerpc specific
allowing for better scheduling behaviour on Power.

schbench
(latency measured in usecs, so lesser is better)
Without patch With patch
Latency percentiles (usec) Latency percentiles (usec)
50.0000th: 34 50.0000th: 38
75.0000th: 47 75.0000th: 52
90.0000th: 54 90.0000th: 60
95.0000th: 57 95.0000th: 64
*99.0000th: 62 *99.0000th: 72
99.5000th: 65 99.5000th: 75
99.9000th: 76 99.9000th: 3452
min=0, max=9205 min=0, max=9344

schbench (With Cede disabled)
Without patch With patch
Latency percentiles (usec) Latency percentiles (usec)
50.0000th: 20 50.0000th: 21
75.0000th: 28 75.0000th: 29
90.0000th: 33 90.0000th: 34
95.0000th: 35 95.0000th: 37
*99.0000th: 40 *99.0000th: 40
99.5000th: 48 99.5000th: 42
99.9000th: 94 99.9000th: 79
min=0, max=791 min=0, max=791

perf bench sched pipe
usec/ops : lesser is better
Without patch
N Min Max Median Avg Stddev
101 5.095113 5.595269 5.204842 5.2298776 0.10762713

5.10 - 5.15 : ################################################## 23% (24)
5.15 - 5.20 : ############################################# 21% (22)
5.20 - 5.25 : ################################################## 23% (24)
5.25 - 5.30 : ######################### 11% (12)
5.30 - 5.35 : ########## 4% (5)
5.35 - 5.40 : ######## 3% (4)
5.40 - 5.45 : ######## 3% (4)
5.45 - 5.50 : #### 1% (2)
5.50 - 5.55 : ## 0% (1)
5.55 - 5.60 : #### 1% (2)

With patch
N Min Max Median Avg Stddev
101 5.134675 8.524719 5.207658 5.2780985 0.34911969

5.1 - 5.5 : ################################################## 94% (95)
5.5 - 5.8 : ## 3% (4)
5.8 - 6.2 : 0% (1)
6.2 - 6.5 :
6.5 - 6.8 :
6.8 - 7.2 :
7.2 - 7.5 :
7.5 - 7.8 :
7.8 - 8.2 :
8.2 - 8.5 :

perf bench sched pipe (cede disabled)
usec/ops : lesser is better
Without patch
N Min Max Median Avg Stddev
101 7.884227 12.576538 7.956474 8.0170722 0.46159054

7.9 - 8.4 : ################################################## 99% (100)
8.4 - 8.8 :
8.8 - 9.3 :
9.3 - 9.8 :
9.8 - 10.2 :
10.2 - 10.7 :
10.7 - 11.2 :
11.2 - 11.6 :
11.6 - 12.1 :
12.1 - 12.6 :

With patch
N Min Max Median Avg Stddev
101 7.956021 8.217284 8.015615 8.0283866 0.049844967

7.96 - 7.98 : ###################### 12% (13)
7.98 - 8.01 : ################################################## 28% (29)
8.01 - 8.03 : #################################### 20% (21)
8.03 - 8.06 : ######################### 14% (15)
8.06 - 8.09 : ###################### 12% (13)
8.09 - 8.11 : ###### 3% (4)
8.11 - 8.14 : ### 1% (2)
8.14 - 8.17 : ### 1% (2)
8.17 - 8.19 :
8.19 - 8.22 : # 0% (1)

Observations: With the patch, the initial run/iteration takes a slight
longer time. This can be attributed to the fact that now we pick a CPU
from a idle core which could be sleep mode. Once we remove the cede,
state the numbers improve in favour of the patch.

ebizzy:
transactions per second (higher is better)
without patch
N Min Max Median Avg Stddev
100 1018433 1304470 1193208 1182315.7 60018.733

1018433 - 1047037 : ###### 3% (3)
1047037 - 1075640 : ######## 4% (4)
1075640 - 1104244 : ######## 4% (4)
1104244 - 1132848 : ############### 7% (7)
1132848 - 1161452 : #################################### 17% (17)
1161452 - 1190055 : ########################## 12% (12)
1190055 - 1218659 : ############################################# 21% (21)
1218659 - 1247263 : ################################################## 23% (23)
1247263 - 1275866 : ######## 4% (4)
1275866 - 1304470 : ######## 4% (4)

with patch
N Min Max Median Avg Stddev
100 967014 1292938 1208819 1185281.8 69815.851

967014 - 999606 : ## 1% (1)
999606 - 1032199 : ## 1% (1)
1032199 - 1064791 : ############ 6% (6)
1064791 - 1097384 : ########## 5% (5)
1097384 - 1129976 : ################## 9% (9)
1129976 - 1162568 : #################### 10% (10)
1162568 - 1195161 : ########################## 13% (13)
1195161 - 1227753 : ############################################ 22% (22)
1227753 - 1260346 : ################################################## 25% (25)
1260346 - 1292938 : ############## 7% (7)

Observations: Not much changes, ebizzy is not much impacted.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200807074517.27957-2-srikar@linux.vnet.ibm.com


# 425752c6 10-Oct-2018 Gautham R. Shenoy <ego@linux.vnet.ibm.com>

powerpc: Detect the presence of big-cores via "ibm, thread-groups"

On IBM POWER9, the device tree exposes a property array identifed by
"ibm,thread-groups" which will indicate which groups of threads share
a particular set of resources.

As of today we only have one form of grouping identifying the group of
threads in the core that share the L1 cache, translation cache and
instruction data flow.

This patch adds helper functions to parse the contents of
"ibm,thread-groups" and populate a per-cpu variable to cache
information about siblings of each CPU that share the L1, traslation
cache and instruction data-flow.

It also defines a new global variable named "has_big_cores" which
indicates if the cores on this configuration have multiple groups of
threads that share L1 cache.

For each online CPU, it maintains a cpu_smallcore_mask, which
indicates the online siblings which share the L1-cache with it.

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# b2441318 01-Nov-2017 Greg Kroah-Hartman <gregkh@linuxfoundation.org>

License cleanup: add SPDX GPL-2.0 license identifier to files with no license

Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.

For non */uapi/* files that summary was:

SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 11139

and resulted in the first patch in this series.

If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:

SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note 930

and resulted in the second patch in this series.

- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:

SPDX license identifier # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note 270
GPL-2.0+ WITH Linux-syscall-note 169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
LGPL-2.1+ WITH Linux-syscall-note 15
GPL-1.0+ WITH Linux-syscall-note 14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
LGPL-2.0+ WITH Linux-syscall-note 4
LGPL-2.1 WITH Linux-syscall-note 3
((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1

and that resulted in the third patch in this series.

- when the two scanners agreed on the detected license(s), that became
the concluded license(s).

- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.

- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).

- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.

- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct

This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.

Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# e340eca9 14-Aug-2016 Guenter Roeck <linux@roeck-us.net>

powerpc: cputhreads: Add missing include file

Powerpc builds may fail with the following build error.

Error log:
In file included from ./arch/powerpc/include/asm/mmu_context.h:11:0,
from ./include/linux/mmu_context.h:4,
from mm/mmu_context.c:8:
./arch/powerpc/include/asm/cputhreads.h: In function 'get_tensr':
./arch/powerpc/include/asm/cputhreads.h:101:2: error:
implicit declaration of function 'cpu_has_feature'

The problem can be triggered by configuring ppc64e_defconfig and selecting
CONFIG_TICK_CPU_ACCOUNTING instead of CONFIG_VIRT_CPU_ACCOUNTING_NATIVE.

Fixes: b92a226e5284 ("powerpc: Move cpu_has_feature() to a separate file")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>


# 6becef7e 20-Nov-2015 chenhui zhao <chenhui.zhao@freescale.com>

powerpc/mpc85xx: Add CPU hotplug support for E6500

Support Freescale E6500 core-based platforms, like t4240.
Support disabling/enabling individual CPU thread dynamically.

Signed-off-by: Chenhui Zhao <chenhui.zhao@freescale.com>


# d17799f9 20-Nov-2015 chenhui zhao <chenhui.zhao@freescale.com>

powerpc/rcpm: add RCPM driver

There is a RCPM (Run Control/Power Management) in Freescale QorIQ
series processors. The device performs tasks associated with device
run control and power management.

The driver implements some features: mask/unmask irq, enter/exit low
power states, freeze time base, etc.

Signed-off-by: Chenhui Zhao <chenhui.zhao@freescale.com>
Signed-off-by: Tang Yuantian <Yuantian.Tang@freescale.com>
[scottwood: remove __KERNEL__ ifdef]
Signed-off-by: Scott Wood <oss@buserror.net>


# ebb9d30a 23-Dec-2015 chenhui zhao <chenhui.zhao@freescale.com>

powerpc/mm: any thread in one core can be the first to setup TLB1

On e6500, in the case of cpu hotplug, either thread in one core
may be the first thread initilzing the TLB1. The subsequent threads
must not setup it again.

The code is derived from the comment of Scott Wood.

Signed-off-by: Chenhui Zhao <chenhui.zhao@freescale.com>
Signed-off-by: Scott Wood <oss@buserror.net>


# e602ffb2 19-Apr-2015 Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com>

powerpc: Fix cpu_online_cores_map to return only online threads mask

Currently, cpu_online_cores_map returns a mask, which for every core with
at least one online thread, has the bit for thread 0 of the core set to 1,
and the bits for all other threads of the core set to 0. But thread 0 of
the core itself may not be online always. In such cases, if the returned
mask is used for IPI, then it'll cause IPIs to be skipped on cores where
the first thread is offline, because the IPI code refuses to send IPIs to
offline threads.

Fix this by setting the bit of the first online thread in the core.
This is done by fixing this in the underlying function
cpu_thread_mask_to_cores.

The result has the property that for all cores with online threads, there
is one bit set in the returned map. And further, all bits that are set in
the returned map correspond to online threads.

Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
[ Changelog from Michael Ellerman <mpe@ellerman.id.au> ]
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# d52356e7 31-Mar-2015 Jan Stancek <jstancek@redhat.com>

powerpc: fix memory corruption by pnv_alloc_idle_core_states

Space allocated for paca is based off nr_cpu_ids,
but pnv_alloc_idle_core_states() iterates paca with
cpu_nr_cores()*threads_per_core, which is using NR_CPUS.

This causes pnv_alloc_idle_core_states() to write over memory,
which is outside of paca array and may later lead to various panics.

Fixes: 7cba160ad789 (powernv/cpuidle: Redesign idle states management)
Signed-off-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


# 87313df7 09-Mar-2015 Rusty Russell <rusty@rustcorp.com.au>

powerpc: fix deprecated CPU_MASK_CPU0 usage.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>


# 5853aef1 23-May-2014 Michael Ellerman <mpe@ellerman.id.au>

powerpc: Add threads_per_subcore

On POWER8 we have a new concept of a subcore. This is what happens when
you take a regular core and split it. A subcore is a grouping of two or
four SMT threads, as well as a handfull of SPRs which allows the subcore
to appear as if it were a core from the point of view of a guest.

Unlike threads_per_core which is fixed at boot, threads_per_subcore can
change while the system is running. Most code will not want to use
threads_per_subcore.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>


# 933b90a9 13-May-2012 Anshuman Khandual <khandual@linux.vnet.ibm.com>

powerpc: Fixing a cputhread code documentation

--
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>


# 104699c0 27-Apr-2011 KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>

powerpc: Convert old cpumask API into new one

Adapt new API.

Almost change is trivial. Most important change is the below line
because we plan to change task->cpus_allowed implementation.

- ctx->cpus_allowed = current->cpus_allowed;

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>


# 99d86705 06-Oct-2010 Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>

powerpc: Cleanup APIs for cpu/thread/core mappings

These APIs take logical cpu number as input
Change cpu_first_thread_in_core() to cpu_first_thread_sibling()
Change cpu_last_thread_in_core() to cpu_last_thread_sibling()

These APIs convert core number (index) to logical cpu/thread numbers
Add cpu_first_thread_of_core(int core)
Changed cpu_thread_to_core() to cpu_core_index_of_thread(int cpu)

The goal is to make 'threads_per_core' accessible to the
pseries_energy module. Instead of making an API to read
threads_per_core, this is a higher level wrapper function to
convert from logical cpu number to core number.

The current APIs cpu_first_thread_in_core() and
cpu_last_thread_in_core() returns logical CPU number while
cpu_thread_to_core() returns core number or index which is
not a logical CPU number. The new APIs are now clearly named to
distinguish 'core number' versus first and last 'logical cpu
number' in that core.

The new APIs cpu_{first,last}_thread_sibling() work on
logical cpu numbers. While cpu_first_thread_of_core() and
cpu_core_index_of_thread() work on core index.

Example usage: (4 threads per core system)

cpu_first_thread_sibling(5) = 4
cpu_last_thread_sibling(5) = 7
cpu_core_index_of_thread(5) = 1
cpu_first_thread_of_core(1) = 4

cpu_core_index_of_thread() is used in cpu_to_drc_index() in the
module and cpu_first_thread_of_core() is used in
drc_index_to_cpu() in the module.

Make API changes to few callers. Export symbols for use in modules.

Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>


# fcce8109 23-Jul-2009 Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/mm: Add HW threads support to no_hash TLB management

The current "no hash" MMU context management code is written with
the assumption that one CPU == one TLB. This is not the case on
implementations that support HW multithreading, where several
linux CPUs can share the same TLB.

This adds some basic support for this to our context management
and our TLB flushing code.

It also cleans up the optional debugging output a bit

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>


# b8b572e1 31-Jul-2008 Stephen Rothwell <sfr@canb.auug.org.au>

powerpc: Move include files to arch/powerpc/include/asm

from include/asm-powerpc. This is the result of a

mkdir arch/powerpc/include/asm
git mv include/asm-powerpc/* arch/powerpc/include/asm

Followed by a few documentation/comment fixups and a couple of places
where <asm-powepc/...> was being used explicitly. Of the latter only
one was outside the arch code and it is a driver only built for powerpc.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>