Searched hist:15354 (Results 1 - 7 of 7) sorted by relevance

/linux-master/arch/powerpc/include/asm/book3s/64/
radix-4k.h diff fb4e5dbd Thu Mar 22 02:43:50 MDT 2018 Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> powerpc/mm/radix: Update pte fragment count from 16 to 256 on radix

With the split PTL (page table lock) config, we allocate the level
4 (leaf) page table using the pte fragment framework instead of the slab
cache used for the other levels. This was done to enable a split page
table lock at level 4 of the page table. For the lock we use the
page->ptl of the page backing all the level 4 pte fragments.

Currently with radix, we use only 16 fragments out of the allocated
page. In radix each fragment is 256 bytes, which means we use only 4K
out of the allocated 64K page, wasting 60K of the allocated memory.
This was done earlier to keep it closer to hash.

This patch updates the pte fragment count to 256, thereby using the
full 64K page and reducing memory usage. Performance tests show
really low impact even with THP disabled. With THP enabled we will
contend even less on the level 4 ptl, and hence the impact should be
lower still.
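
The fragment arithmetic above can be checked with a short sketch. It only
restates the numbers given in the commit message (64K page, 256-byte
fragments, 16 versus 256 fragments); the constant and function names are
illustrative and are not taken from the kernel headers.

```python
# Illustrative names only; these are not the kernel's macros.
PAGE_SIZE = 64 * 1024      # 64K page backing the level 4 pte fragments
FRAGMENT_SIZE = 256        # bytes per radix pte fragment


def fragment_usage(fragment_count):
    """Return (bytes used, bytes wasted) for one backing page."""
    used = fragment_count * FRAGMENT_SIZE
    return used, PAGE_SIZE - used


if __name__ == "__main__":
    for count in (16, 256):
        used, wasted = fragment_usage(count)
        print(f"{count} fragments: {used // 1024}K used, {wasted // 1024}K wasted")
```

With 16 fragments only 4K of the page is used; with 256 fragments the whole
64K page is used.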

256 threads:
without patch (10 runs of ./ebizzy -m -n 1000 -s 131072 -S 100)
median = 15678.5
stdev = 42.1209

with patch:
median = 15354
stdev = 194.743

This is with THP disabled. With THP enabled the impact of the patch
will be less.
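
To put the two medians in proportion, the relative change can be computed
as below. This is a sketch that assumes the ebizzy figures are throughput
numbers where higher is better; the commit message does not state the unit.

```python
# Medians quoted above (10 runs each, 256 threads, THP disabled).
MEDIAN_WITHOUT_PATCH = 15678.5
MEDIAN_WITH_PATCH = 15354.0


def relative_change(before, after):
    """Percentage change going from `before` to `after`."""
    return (after - before) / before * 100.0


if __name__ == "__main__":
    change = relative_change(MEDIAN_WITHOUT_PATCH, MEDIAN_WITH_PATCH)
    print(f"{change:.2f}%")  # about a 2% drop in the median
```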

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
radix-64k.h diff fb4e5dbd Thu Mar 22 02:43:50 MDT 2018 Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> powerpc/mm/radix: Update pte fragment count from 16 to 256 on radix

hash-64k.h diff fb4e5dbd Thu Mar 22 02:43:50 MDT 2018 Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> powerpc/mm/radix: Update pte fragment count from 16 to 256 on radix

/linux-master/tools/perf/util/
stat.h diff 29734550 Tue Dec 05 07:03:11 MST 2017 Jin Yao <yao.jin@linux.intel.com> perf stat: Resort '--per-thread' result

Many threads are reported if we enable '--per-thread'
globally.

1. Most of the threads are either not counted or have a count of 0.
This patch removes these threads from the output.

2. We also re-sort the displayed threads by count value, so that
users can easily see the hottest threads.
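
The two steps above can be sketched as a filter followed by a descending
sort. This is only an illustration of the behaviour described, not perf's
actual C implementation; the `ThreadCount` type, the `counted` flag and the
two idle threads in the sample data are hypothetical.

```python
from typing import NamedTuple


class ThreadCount(NamedTuple):
    name: str
    count: int
    counted: bool = True   # hypothetical flag: False if the event never ran


def visible_rows(rows):
    """Drop uncounted or zero-count threads, then sort hottest first."""
    kept = [r for r in rows if r.counted and r.count != 0]
    return sorted(kept, key=lambda r: r.count, reverse=True)


SAMPLE = [
    ThreadCount("sshd-23058", 2),
    ThreadCount("idle-thread-a", 0),                  # hypothetical, count 0
    ThreadCount("rcu_sched-8", 10),
    ThreadCount("kworker/u16:1-18249", 7),
    ThreadCount("idle-thread-b", 0, counted=False),   # hypothetical, not counted
]

if __name__ == "__main__":
    for row in visible_rows(SAMPLE):
        print(row.name, row.count)
```

Python's `sorted` is stable, so threads with equal counts keep their input
order; the commit message does not say how ties are ordered in perf itself.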

For example, the new results would be:

root@skl:/tmp# perf stat --per-thread
^C
Performance counter stats for 'system wide':

perf-24165 4.302433 cpu-clock (msec) # 0.001 CPUs utilized
vmstat-23127 1.562215 cpu-clock (msec) # 0.000 CPUs utilized
irqbalance-2780 0.827851 cpu-clock (msec) # 0.000 CPUs utilized
sshd-23111 0.278308 cpu-clock (msec) # 0.000 CPUs utilized
thermald-2841 0.230880 cpu-clock (msec) # 0.000 CPUs utilized
sshd-23058 0.207306 cpu-clock (msec) # 0.000 CPUs utilized
kworker/0:2-19991 0.133983 cpu-clock (msec) # 0.000 CPUs utilized
kworker/u16:1-18249 0.125636 cpu-clock (msec) # 0.000 CPUs utilized
rcu_sched-8 0.085533 cpu-clock (msec) # 0.000 CPUs utilized
kworker/u16:2-23146 0.077139 cpu-clock (msec) # 0.000 CPUs utilized
gmain-2700 0.041789 cpu-clock (msec) # 0.000 CPUs utilized
kworker/4:1-15354 0.028370 cpu-clock (msec) # 0.000 CPUs utilized
kworker/6:0-17528 0.023895 cpu-clock (msec) # 0.000 CPUs utilized
kworker/4:1H-1887 0.013209 cpu-clock (msec) # 0.000 CPUs utilized
kworker/5:2-31362 0.011627 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/0-11 0.010892 cpu-clock (msec) # 0.000 CPUs utilized
kworker/3:2-12870 0.010220 cpu-clock (msec) # 0.000 CPUs utilized
ksoftirqd/0-7 0.008869 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/1-14 0.008476 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/7-50 0.002944 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/3-26 0.002893 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/4-32 0.002759 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/2-20 0.002429 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/6-44 0.001491 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/5-38 0.001477 cpu-clock (msec) # 0.000 CPUs utilized
rcu_sched-8 10 context-switches # 0.117 M/sec
kworker/u16:1-18249 7 context-switches # 0.056 M/sec
sshd-23111 4 context-switches # 0.014 M/sec
vmstat-23127 4 context-switches # 0.003 M/sec
perf-24165 4 context-switches # 0.930 K/sec
kworker/0:2-19991 3 context-switches # 0.022 M/sec
kworker/u16:2-23146 3 context-switches # 0.039 M/sec
kworker/4:1-15354 2 context-switches # 0.070 M/sec
kworker/6:0-17528 2 context-switches # 0.084 M/sec
sshd-23058 2 context-switches # 0.010 M/sec
ksoftirqd/0-7 1 context-switches # 0.113 M/sec
watchdog/0-11 1 context-switches # 0.092 M/sec
watchdog/1-14 1 context-switches # 0.118 M/sec
watchdog/2-20 1 context-switches # 0.412 M/sec
watchdog/3-26 1 context-switches # 0.346 M/sec
watchdog/4-32 1 context-switches # 0.362 M/sec
watchdog/5-38 1 context-switches # 0.677 M/sec
watchdog/6-44 1 context-switches # 0.671 M/sec
watchdog/7-50 1 context-switches # 0.340 M/sec
kworker/4:1H-1887 1 context-switches # 0.076 M/sec
thermald-2841 1 context-switches # 0.004 M/sec
gmain-2700 1 context-switches # 0.024 M/sec
irqbalance-2780 1 context-switches # 0.001 M/sec
kworker/3:2-12870 1 context-switches # 0.098 M/sec
kworker/5:2-31362 1 context-switches # 0.086 M/sec
kworker/u16:1-18249 2 cpu-migrations # 0.016 M/sec
kworker/u16:2-23146 2 cpu-migrations # 0.026 M/sec
rcu_sched-8 1 cpu-migrations # 0.012 M/sec
sshd-23058 1 cpu-migrations # 0.005 M/sec
perf-24165 8,833,385 cycles # 2.053 GHz
vmstat-23127 1,702,699 cycles # 1.090 GHz
irqbalance-2780 739,847 cycles # 0.894 GHz
sshd-23111 269,506 cycles # 0.968 GHz
thermald-2841 204,556 cycles # 0.886 GHz
sshd-23058 158,780 cycles # 0.766 GHz
kworker/0:2-19991 112,981 cycles # 0.843 GHz
kworker/u16:1-18249 100,926 cycles # 0.803 GHz
rcu_sched-8 74,024 cycles # 0.865 GHz
kworker/u16:2-23146 55,984 cycles # 0.726 GHz
gmain-2700 34,278 cycles # 0.820 GHz
kworker/4:1-15354 20,665 cycles # 0.728 GHz
kworker/6:0-17528 16,445 cycles # 0.688 GHz
kworker/5:2-31362 9,492 cycles # 0.816 GHz
watchdog/3-26 8,695 cycles # 3.006 GHz
kworker/4:1H-1887 8,238 cycles # 0.624 GHz
watchdog/4-32 7,580 cycles # 2.747 GHz
kworker/3:2-12870 7,306 cycles # 0.715 GHz
watchdog/2-20 7,274 cycles # 2.995 GHz
watchdog/0-11 6,988 cycles # 0.642 GHz
ksoftirqd/0-7 6,376 cycles # 0.719 GHz
watchdog/1-14 5,340 cycles # 0.630 GHz
watchdog/5-38 4,061 cycles # 2.749 GHz
watchdog/6-44 3,976 cycles # 2.667 GHz
watchdog/7-50 3,418 cycles # 1.161 GHz
vmstat-23127 2,511,699 instructions # 1.48 insn per cycle
perf-24165 1,829,908 instructions # 0.21 insn per cycle
irqbalance-2780 1,190,204 instructions # 1.61 insn per cycle
thermald-2841 143,544 instructions # 0.70 insn per cycle
sshd-23111 128,138 instructions # 0.48 insn per cycle
sshd-23058 57,654 instructions # 0.36 insn per cycle
rcu_sched-8 44,063 instructions # 0.60 insn per cycle
kworker/u16:1-18249 42,551 instructions # 0.42 insn per cycle
kworker/0:2-19991 25,873 instructions # 0.23 insn per cycle
kworker/u16:2-23146 21,407 instructions # 0.38 insn per cycle
gmain-2700 13,691 instructions # 0.40 insn per cycle
kworker/4:1-15354 12,964 instructions # 0.63 insn per cycle
kworker/6:0-17528 10,034 instructions # 0.61 insn per cycle
kworker/5:2-31362 5,203 instructions # 0.55 insn per cycle
kworker/3:2-12870 4,866 instructions # 0.67 insn per cycle
kworker/4:1H-1887 3,586 instructions # 0.44 insn per cycle
ksoftirqd/0-7 3,463 instructions # 0.54 insn per cycle
watchdog/0-11 3,135 instructions # 0.45 insn per cycle
watchdog/1-14 3,135 instructions # 0.59 insn per cycle
watchdog/2-20 3,135 instructions # 0.43 insn per cycle
watchdog/3-26 3,135 instructions # 0.36 insn per cycle
watchdog/4-32 3,135 instructions # 0.41 insn per cycle
watchdog/5-38 3,135 instructions # 0.77 insn per cycle
watchdog/6-44 3,135 instructions # 0.79 insn per cycle
watchdog/7-50 3,135 instructions # 0.92 insn per cycle
vmstat-23127 539,181 branches # 345.139 M/sec
perf-24165 375,364 branches # 87.245 M/sec
irqbalance-2780 262,092 branches # 316.593 M/sec
thermald-2841 31,611 branches # 136.915 M/sec
sshd-23111 21,874 branches # 78.596 M/sec
sshd-23058 10,682 branches # 51.528 M/sec
rcu_sched-8 8,693 branches # 101.633 M/sec
kworker/u16:1-18249 7,891 branches # 62.808 M/sec
kworker/0:2-19991 5,761 branches # 42.998 M/sec
kworker/u16:2-23146 4,099 branches # 53.138 M/sec
kworker/4:1-15354 2,755 branches # 97.110 M/sec
gmain-2700 2,638 branches # 63.127 M/sec
kworker/6:0-17528 2,216 branches # 92.739 M/sec
kworker/5:2-31362 1,132 branches # 97.360 M/sec
kworker/3:2-12870 1,081 branches # 105.773 M/sec
kworker/4:1H-1887 725 branches # 54.887 M/sec
ksoftirqd/0-7 707 branches # 79.716 M/sec
watchdog/0-11 652 branches # 59.860 M/sec
watchdog/1-14 652 branches # 76.923 M/sec
watchdog/2-20 652 branches # 268.423 M/sec
watchdog/3-26 652 branches # 225.372 M/sec
watchdog/4-32 652 branches # 236.318 M/sec
watchdog/5-38 652 branches # 441.435 M/sec
watchdog/6-44 652 branches # 437.290 M/sec
watchdog/7-50 652 branches # 221.467 M/sec
vmstat-23127 8,960 branch-misses # 1.66% of all branches
irqbalance-2780 3,047 branch-misses # 1.16% of all branches
perf-24165 2,876 branch-misses # 0.77% of all branches
sshd-23111 1,843 branch-misses # 8.43% of all branches
thermald-2841 1,444 branch-misses # 4.57% of all branches
sshd-23058 1,379 branch-misses # 12.91% of all branches
kworker/u16:1-18249 982 branch-misses # 12.44% of all branches
rcu_sched-8 893 branch-misses # 10.27% of all branches
kworker/u16:2-23146 578 branch-misses # 14.10% of all branches
kworker/0:2-19991 376 branch-misses # 6.53% of all branches
gmain-2700 280 branch-misses # 10.61% of all branches
kworker/6:0-17528 196 branch-misses # 8.84% of all branches
kworker/4:1-15354 187 branch-misses # 6.79% of all branches
kworker/5:2-31362 123 branch-misses # 10.87% of all branches
watchdog/0-11 95 branch-misses # 14.57% of all branches
watchdog/4-32 89 branch-misses # 13.65% of all branches
kworker/3:2-12870 80 branch-misses # 7.40% of all branches
watchdog/3-26 61 branch-misses # 9.36% of all branches
kworker/4:1H-1887 60 branch-misses # 8.28% of all branches
watchdog/2-20 52 branch-misses # 7.98% of all branches
ksoftirqd/0-7 47 branch-misses # 6.65% of all branches
watchdog/1-14 46 branch-misses # 7.06% of all branches
watchdog/7-50 13 branch-misses # 1.99% of all branches
watchdog/5-38 8 branch-misses # 1.23% of all branches
watchdog/6-44 7 branch-misses # 1.07% of all branches

3.695150786 seconds time elapsed
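
The figures after each '#' in the output above appear to be derived from
the raw counts of the same thread. A small sketch that reproduces two of
them from the printed values, assuming GHz is cycles divided by cpu-clock
time and "insn per cycle" is instructions divided by cycles (the function
names are illustrative):

```python
def ghz(cycles, cpu_clock_msec):
    """Cycles per nanosecond: cycles / (msec * 1e6 ns per msec)."""
    return cycles / (cpu_clock_msec * 1e6)


def insn_per_cycle(instructions, cycles):
    return instructions / cycles


if __name__ == "__main__":
    # perf-24165: 8,833,385 cycles over 4.302433 msec of cpu-clock
    print(f"perf-24165 {ghz(8_833_385, 4.302433):.3f} GHz")
    # vmstat-23127: 2,511,699 instructions over 1,702,699 cycles
    print(f"vmstat-23127 {insn_per_cycle(2_511_699, 1_702_699):.2f} insn per cycle")
```

Both results agree with the 2.053 GHz and 1.48 insn per cycle shown for
those threads.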

root@skl:/tmp# perf stat --per-thread -M IPC,CPI
^C

Performance counter stats for 'system wide':

vmstat-23127 2,000,783 inst_retired.any # 1.5 IPC
thermald-2841 1,472,670 inst_retired.any # 1.3 IPC
sshd-23111 977,374 inst_retired.any # 1.2 IPC
perf-24163 483,779 inst_retired.any # 0.2 IPC
gmain-2700 341,213 inst_retired.any # 0.9 IPC
sshd-23058 148,891 inst_retired.any # 0.8 IPC
rtkit-daemon-3288 71,210 inst_retired.any # 0.7 IPC
kworker/u16:1-18249 39,562 inst_retired.any # 0.3 IPC
rcu_sched-8 14,474 inst_retired.any # 0.8 IPC
kworker/0:2-19991 7,659 inst_retired.any # 0.2 IPC
kworker/4:1-15354 6,714 inst_retired.any # 0.8 IPC
rtkit-daemon-3289 4,839 inst_retired.any # 0.3 IPC
kworker/6:0-17528 3,321 inst_retired.any # 0.6 IPC
kworker/5:2-31362 3,215 inst_retired.any # 0.5 IPC
kworker/7:2-23145 3,173 inst_retired.any # 0.7 IPC
kworker/4:1H-1887 1,719 inst_retired.any # 0.3 IPC
watchdog/0-11 1,479 inst_retired.any # 0.3 IPC
watchdog/1-14 1,479 inst_retired.any # 0.3 IPC
watchdog/2-20 1,479 inst_retired.any # 0.4 IPC
watchdog/3-26 1,479 inst_retired.any # 0.4 IPC
watchdog/4-32 1,479 inst_retired.any # 0.3 IPC
watchdog/5-38 1,479 inst_retired.any # 0.3 IPC
watchdog/6-44 1,479 inst_retired.any # 0.7 IPC
watchdog/7-50 1,479 inst_retired.any # 0.7 IPC
kworker/u16:2-23146 1,408 inst_retired.any # 0.5 IPC
perf-24163 2,249,872 cpu_clk_unhalted.thread
vmstat-23127 1,352,455 cpu_clk_unhalted.thread
thermald-2841 1,161,140 cpu_clk_unhalted.thread
sshd-23111 807,827 cpu_clk_unhalted.thread
gmain-2700 375,535 cpu_clk_unhalted.thread
sshd-23058 194,071 cpu_clk_unhalted.thread
kworker/u16:1-18249 114,306 cpu_clk_unhalted.thread
rtkit-daemon-3288 103,547 cpu_clk_unhalted.thread
kworker/0:2-19991 46,550 cpu_clk_unhalted.thread
rcu_sched-8 18,855 cpu_clk_unhalted.thread
rtkit-daemon-3289 17,549 cpu_clk_unhalted.thread
kworker/4:1-15354 8,812 cpu_clk_unhalted.thread
kworker/5:2-31362 6,812 cpu_clk_unhalted.thread
kworker/4:1H-1887 5,270 cpu_clk_unhalted.thread
kworker/6:0-17528 5,111 cpu_clk_unhalted.thread
kworker/7:2-23145 4,667 cpu_clk_unhalted.thread
watchdog/0-11 4,663 cpu_clk_unhalted.thread
watchdog/1-14 4,663 cpu_clk_unhalted.thread
watchdog/4-32 4,626 cpu_clk_unhalted.thread
watchdog/5-38 4,403 cpu_clk_unhalted.thread
watchdog/3-26 3,936 cpu_clk_unhalted.thread
watchdog/2-20 3,850 cpu_clk_unhalted.thread
kworker/u16:2-23146 2,654 cpu_clk_unhalted.thread
watchdog/6-44 2,017 cpu_clk_unhalted.thread
watchdog/7-50 2,017 cpu_clk_unhalted.thread
vmstat-23127 2,000,783 inst_retired.any # 0.7 CPI
thermald-2841 1,472,670 inst_retired.any # 0.8 CPI
sshd-23111 977,374 inst_retired.any # 0.8 CPI
perf-24163 495,037 inst_retired.any # 4.7 CPI
gmain-2700 341,213 inst_retired.any # 1.1 CPI
sshd-23058 148,891 inst_retired.any # 1.3 CPI
rtkit-daemon-3288 71,210 inst_retired.any # 1.5 CPI
kworker/u16:1-18249 39,562 inst_retired.any # 2.9 CPI
rcu_sched-8 14,474 inst_retired.any # 1.3 CPI
kworker/0:2-19991 7,659 inst_retired.any # 6.1 CPI
kworker/4:1-15354 6,714 inst_retired.any # 1.3 CPI
rtkit-daemon-3289 4,839 inst_retired.any # 3.6 CPI
kworker/6:0-17528 3,321 inst_retired.any # 1.5 CPI
kworker/5:2-31362 3,215 inst_retired.any # 2.1 CPI
kworker/7:2-23145 3,173 inst_retired.any # 1.5 CPI
kworker/4:1H-1887 1,719 inst_retired.any # 3.1 CPI
watchdog/0-11 1,479 inst_retired.any # 3.2 CPI
watchdog/1-14 1,479 inst_retired.any # 3.2 CPI
watchdog/2-20 1,479 inst_retired.any # 2.6 CPI
watchdog/3-26 1,479 inst_retired.any # 2.7 CPI
watchdog/4-32 1,479 inst_retired.any # 3.1 CPI
watchdog/5-38 1,479 inst_retired.any # 3.0 CPI
watchdog/6-44 1,479 inst_retired.any # 1.4 CPI
watchdog/7-50 1,479 inst_retired.any # 1.4 CPI
kworker/u16:2-23146 1,408 inst_retired.any # 1.9 CPI
perf-24163 2,302,323 cycles
vmstat-23127 1,352,455 cycles
thermald-2841 1,161,140 cycles
sshd-23111 807,827 cycles
gmain-2700 375,535 cycles
sshd-23058 194,071 cycles
kworker/u16:1-18249 114,306 cycles
rtkit-daemon-3288 103,547 cycles
kworker/0:2-19991 46,550 cycles
rcu_sched-8 18,855 cycles
rtkit-daemon-3289 17,549 cycles
kworker/4:1-15354 8,812 cycles
kworker/5:2-31362 6,812 cycles
kworker/4:1H-1887 5,270 cycles
kworker/6:0-17528 5,111 cycles
kworker/7:2-23145 4,667 cycles
watchdog/0-11 4,663 cycles
watchdog/1-14 4,663 cycles
watchdog/4-32 4,626 cycles
watchdog/5-38 4,403 cycles
watchdog/3-26 3,936 cycles
watchdog/2-20 3,850 cycles
kworker/u16:2-23146 2,654 cycles
watchdog/6-44 2,017 cycles
watchdog/7-50 2,017 cycles

2.175726600 seconds time elapsed
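
The IPC and CPI columns above are consistent with simple ratios of the two
events listed for each metric. The sketch below reproduces a few of the
printed values; it assumes IPC = inst_retired.any / cpu_clk_unhalted.thread
and CPI = cycles / inst_retired.any, which matches the numbers shown but is
not taken from the metric definitions themselves.

```python
def ipc(inst_retired, cpu_clk_unhalted):
    """Instructions per cycle."""
    return inst_retired / cpu_clk_unhalted


def cpi(cycles, inst_retired):
    """Cycles per instruction."""
    return cycles / inst_retired


if __name__ == "__main__":
    # vmstat-23127: 2,000,783 instructions, 1,352,455 cycles
    print(f"vmstat-23127 {ipc(2_000_783, 1_352_455):.1f} IPC")
    print(f"vmstat-23127 {cpi(1_352_455, 2_000_783):.1f} CPI")
    # perf-24163 (CPI group): 495,037 instructions, 2,302,323 cycles
    print(f"perf-24163 {cpi(2_302_323, 495_037):.1f} CPI")
```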

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512482591-4646-12-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
watchdog/5-38 1 context-switches # 0.677 M/sec
watchdog/6-44 1 context-switches # 0.671 M/sec
watchdog/7-50 1 context-switches # 0.340 M/sec
kworker/4:1H-1887 1 context-switches # 0.076 M/sec
thermald-2841 1 context-switches # 0.004 M/sec
gmain-2700 1 context-switches # 0.024 M/sec
irqbalance-2780 1 context-switches # 0.001 M/sec
kworker/3:2-12870 1 context-switches # 0.098 M/sec
kworker/5:2-31362 1 context-switches # 0.086 M/sec
kworker/u16:1-18249 2 cpu-migrations # 0.016 M/sec
kworker/u16:2-23146 2 cpu-migrations # 0.026 M/sec
rcu_sched-8 1 cpu-migrations # 0.012 M/sec
sshd-23058 1 cpu-migrations # 0.005 M/sec
perf-24165 8,833,385 cycles # 2.053 GHz
vmstat-23127 1,702,699 cycles # 1.090 GHz
irqbalance-2780 739,847 cycles # 0.894 GHz
sshd-23111 269,506 cycles # 0.968 GHz
thermald-2841 204,556 cycles # 0.886 GHz
sshd-23058 158,780 cycles # 0.766 GHz
kworker/0:2-19991 112,981 cycles # 0.843 GHz
kworker/u16:1-18249 100,926 cycles # 0.803 GHz
rcu_sched-8 74,024 cycles # 0.865 GHz
kworker/u16:2-23146 55,984 cycles # 0.726 GHz
gmain-2700 34,278 cycles # 0.820 GHz
kworker/4:1-15354 20,665 cycles # 0.728 GHz
kworker/6:0-17528 16,445 cycles # 0.688 GHz
kworker/5:2-31362 9,492 cycles # 0.816 GHz
watchdog/3-26 8,695 cycles # 3.006 GHz
kworker/4:1H-1887 8,238 cycles # 0.624 GHz
watchdog/4-32 7,580 cycles # 2.747 GHz
kworker/3:2-12870 7,306 cycles # 0.715 GHz
watchdog/2-20 7,274 cycles # 2.995 GHz
watchdog/0-11 6,988 cycles # 0.642 GHz
ksoftirqd/0-7 6,376 cycles # 0.719 GHz
watchdog/1-14 5,340 cycles # 0.630 GHz
watchdog/5-38 4,061 cycles # 2.749 GHz
watchdog/6-44 3,976 cycles # 2.667 GHz
watchdog/7-50 3,418 cycles # 1.161 GHz
vmstat-23127 2,511,699 instructions # 1.48 insn per cycle
perf-24165 1,829,908 instructions # 0.21 insn per cycle
irqbalance-2780 1,190,204 instructions # 1.61 insn per cycle
thermald-2841 143,544 instructions # 0.70 insn per cycle
sshd-23111 128,138 instructions # 0.48 insn per cycle
sshd-23058 57,654 instructions # 0.36 insn per cycle
rcu_sched-8 44,063 instructions # 0.60 insn per cycle
kworker/u16:1-18249 42,551 instructions # 0.42 insn per cycle
kworker/0:2-19991 25,873 instructions # 0.23 insn per cycle
kworker/u16:2-23146 21,407 instructions # 0.38 insn per cycle
gmain-2700 13,691 instructions # 0.40 insn per cycle
kworker/4:1-15354 12,964 instructions # 0.63 insn per cycle
kworker/6:0-17528 10,034 instructions # 0.61 insn per cycle
kworker/5:2-31362 5,203 instructions # 0.55 insn per cycle
kworker/3:2-12870 4,866 instructions # 0.67 insn per cycle
kworker/4:1H-1887 3,586 instructions # 0.44 insn per cycle
ksoftirqd/0-7 3,463 instructions # 0.54 insn per cycle
watchdog/0-11 3,135 instructions # 0.45 insn per cycle
watchdog/1-14 3,135 instructions # 0.59 insn per cycle
watchdog/2-20 3,135 instructions # 0.43 insn per cycle
watchdog/3-26 3,135 instructions # 0.36 insn per cycle
watchdog/4-32 3,135 instructions # 0.41 insn per cycle
watchdog/5-38 3,135 instructions # 0.77 insn per cycle
watchdog/6-44 3,135 instructions # 0.79 insn per cycle
watchdog/7-50 3,135 instructions # 0.92 insn per cycle
vmstat-23127 539,181 branches # 345.139 M/sec
perf-24165 375,364 branches # 87.245 M/sec
irqbalance-2780 262,092 branches # 316.593 M/sec
thermald-2841 31,611 branches # 136.915 M/sec
sshd-23111 21,874 branches # 78.596 M/sec
sshd-23058 10,682 branches # 51.528 M/sec
rcu_sched-8 8,693 branches # 101.633 M/sec
kworker/u16:1-18249 7,891 branches # 62.808 M/sec
kworker/0:2-19991 5,761 branches # 42.998 M/sec
kworker/u16:2-23146 4,099 branches # 53.138 M/sec
kworker/4:1-15354 2,755 branches # 97.110 M/sec
gmain-2700 2,638 branches # 63.127 M/sec
kworker/6:0-17528 2,216 branches # 92.739 M/sec
kworker/5:2-31362 1,132 branches # 97.360 M/sec
kworker/3:2-12870 1,081 branches # 105.773 M/sec
kworker/4:1H-1887 725 branches # 54.887 M/sec
ksoftirqd/0-7 707 branches # 79.716 M/sec
watchdog/0-11 652 branches # 59.860 M/sec
watchdog/1-14 652 branches # 76.923 M/sec
watchdog/2-20 652 branches # 268.423 M/sec
watchdog/3-26 652 branches # 225.372 M/sec
watchdog/4-32 652 branches # 236.318 M/sec
watchdog/5-38 652 branches # 441.435 M/sec
watchdog/6-44 652 branches # 437.290 M/sec
watchdog/7-50 652 branches # 221.467 M/sec
vmstat-23127 8,960 branch-misses # 1.66% of all branches
irqbalance-2780 3,047 branch-misses # 1.16% of all branches
perf-24165 2,876 branch-misses # 0.77% of all branches
sshd-23111 1,843 branch-misses # 8.43% of all branches
thermald-2841 1,444 branch-misses # 4.57% of all branches
sshd-23058 1,379 branch-misses # 12.91% of all branches
kworker/u16:1-18249 982 branch-misses # 12.44% of all branches
rcu_sched-8 893 branch-misses # 10.27% of all branches
kworker/u16:2-23146 578 branch-misses # 14.10% of all branches
kworker/0:2-19991 376 branch-misses # 6.53% of all branches
gmain-2700 280 branch-misses # 10.61% of all branches
kworker/6:0-17528 196 branch-misses # 8.84% of all branches
kworker/4:1-15354 187 branch-misses # 6.79% of all branches
kworker/5:2-31362 123 branch-misses # 10.87% of all branches
watchdog/0-11 95 branch-misses # 14.57% of all branches
watchdog/4-32 89 branch-misses # 13.65% of all branches
kworker/3:2-12870 80 branch-misses # 7.40% of all branches
watchdog/3-26 61 branch-misses # 9.36% of all branches
kworker/4:1H-1887 60 branch-misses # 8.28% of all branches
watchdog/2-20 52 branch-misses # 7.98% of all branches
ksoftirqd/0-7 47 branch-misses # 6.65% of all branches
watchdog/1-14 46 branch-misses # 7.06% of all branches
watchdog/7-50 13 branch-misses # 1.99% of all branches
watchdog/5-38 8 branch-misses # 1.23% of all branches
watchdog/6-44 7 branch-misses # 1.07% of all branches

3.695150786 seconds time elapsed

root@skl:/tmp# perf stat --per-thread -M IPC,CPI
^C

Performance counter stats for 'system wide':

vmstat-23127 2,000,783 inst_retired.any # 1.5 IPC
thermald-2841 1,472,670 inst_retired.any # 1.3 IPC
sshd-23111 977,374 inst_retired.any # 1.2 IPC
perf-24163 483,779 inst_retired.any # 0.2 IPC
gmain-2700 341,213 inst_retired.any # 0.9 IPC
sshd-23058 148,891 inst_retired.any # 0.8 IPC
rtkit-daemon-3288 71,210 inst_retired.any # 0.7 IPC
kworker/u16:1-18249 39,562 inst_retired.any # 0.3 IPC
rcu_sched-8 14,474 inst_retired.any # 0.8 IPC
kworker/0:2-19991 7,659 inst_retired.any # 0.2 IPC
kworker/4:1-15354 6,714 inst_retired.any # 0.8 IPC
rtkit-daemon-3289 4,839 inst_retired.any # 0.3 IPC
kworker/6:0-17528 3,321 inst_retired.any # 0.6 IPC
kworker/5:2-31362 3,215 inst_retired.any # 0.5 IPC
kworker/7:2-23145 3,173 inst_retired.any # 0.7 IPC
kworker/4:1H-1887 1,719 inst_retired.any # 0.3 IPC
watchdog/0-11 1,479 inst_retired.any # 0.3 IPC
watchdog/1-14 1,479 inst_retired.any # 0.3 IPC
watchdog/2-20 1,479 inst_retired.any # 0.4 IPC
watchdog/3-26 1,479 inst_retired.any # 0.4 IPC
watchdog/4-32 1,479 inst_retired.any # 0.3 IPC
watchdog/5-38 1,479 inst_retired.any # 0.3 IPC
watchdog/6-44 1,479 inst_retired.any # 0.7 IPC
watchdog/7-50 1,479 inst_retired.any # 0.7 IPC
kworker/u16:2-23146 1,408 inst_retired.any # 0.5 IPC
perf-24163 2,249,872 cpu_clk_unhalted.thread
vmstat-23127 1,352,455 cpu_clk_unhalted.thread
thermald-2841 1,161,140 cpu_clk_unhalted.thread
sshd-23111 807,827 cpu_clk_unhalted.thread
gmain-2700 375,535 cpu_clk_unhalted.thread
sshd-23058 194,071 cpu_clk_unhalted.thread
kworker/u16:1-18249 114,306 cpu_clk_unhalted.thread
rtkit-daemon-3288 103,547 cpu_clk_unhalted.thread
kworker/0:2-19991 46,550 cpu_clk_unhalted.thread
rcu_sched-8 18,855 cpu_clk_unhalted.thread
rtkit-daemon-3289 17,549 cpu_clk_unhalted.thread
kworker/4:1-15354 8,812 cpu_clk_unhalted.thread
kworker/5:2-31362 6,812 cpu_clk_unhalted.thread
kworker/4:1H-1887 5,270 cpu_clk_unhalted.thread
kworker/6:0-17528 5,111 cpu_clk_unhalted.thread
kworker/7:2-23145 4,667 cpu_clk_unhalted.thread
watchdog/0-11 4,663 cpu_clk_unhalted.thread
watchdog/1-14 4,663 cpu_clk_unhalted.thread
watchdog/4-32 4,626 cpu_clk_unhalted.thread
watchdog/5-38 4,403 cpu_clk_unhalted.thread
watchdog/3-26 3,936 cpu_clk_unhalted.thread
watchdog/2-20 3,850 cpu_clk_unhalted.thread
kworker/u16:2-23146 2,654 cpu_clk_unhalted.thread
watchdog/6-44 2,017 cpu_clk_unhalted.thread
watchdog/7-50 2,017 cpu_clk_unhalted.thread
vmstat-23127 2,000,783 inst_retired.any # 0.7 CPI
thermald-2841 1,472,670 inst_retired.any # 0.8 CPI
sshd-23111 977,374 inst_retired.any # 0.8 CPI
perf-24163 495,037 inst_retired.any # 4.7 CPI
gmain-2700 341,213 inst_retired.any # 1.1 CPI
sshd-23058 148,891 inst_retired.any # 1.3 CPI
rtkit-daemon-3288 71,210 inst_retired.any # 1.5 CPI
kworker/u16:1-18249 39,562 inst_retired.any # 2.9 CPI
rcu_sched-8 14,474 inst_retired.any # 1.3 CPI
kworker/0:2-19991 7,659 inst_retired.any # 6.1 CPI
kworker/4:1-15354 6,714 inst_retired.any # 1.3 CPI
rtkit-daemon-3289 4,839 inst_retired.any # 3.6 CPI
kworker/6:0-17528 3,321 inst_retired.any # 1.5 CPI
kworker/5:2-31362 3,215 inst_retired.any # 2.1 CPI
kworker/7:2-23145 3,173 inst_retired.any # 1.5 CPI
kworker/4:1H-1887 1,719 inst_retired.any # 3.1 CPI
watchdog/0-11 1,479 inst_retired.any # 3.2 CPI
watchdog/1-14 1,479 inst_retired.any # 3.2 CPI
watchdog/2-20 1,479 inst_retired.any # 2.6 CPI
watchdog/3-26 1,479 inst_retired.any # 2.7 CPI
watchdog/4-32 1,479 inst_retired.any # 3.1 CPI
watchdog/5-38 1,479 inst_retired.any # 3.0 CPI
watchdog/6-44 1,479 inst_retired.any # 1.4 CPI
watchdog/7-50 1,479 inst_retired.any # 1.4 CPI
kworker/u16:2-23146 1,408 inst_retired.any # 1.9 CPI
perf-24163 2,302,323 cycles
vmstat-23127 1,352,455 cycles
thermald-2841 1,161,140 cycles
sshd-23111 807,827 cycles
gmain-2700 375,535 cycles
sshd-23058 194,071 cycles
kworker/u16:1-18249 114,306 cycles
rtkit-daemon-3288 103,547 cycles
kworker/0:2-19991 46,550 cycles
rcu_sched-8 18,855 cycles
rtkit-daemon-3289 17,549 cycles
kworker/4:1-15354 8,812 cycles
kworker/5:2-31362 6,812 cycles
kworker/4:1H-1887 5,270 cycles
kworker/6:0-17528 5,111 cycles
kworker/7:2-23145 4,667 cycles
watchdog/0-11 4,663 cycles
watchdog/1-14 4,663 cycles
watchdog/4-32 4,626 cycles
watchdog/5-38 4,403 cycles
watchdog/3-26 3,936 cycles
watchdog/2-20 3,850 cycles
kworker/u16:2-23146 2,654 cycles
watchdog/6-44 2,017 cycles
watchdog/7-50 2,017 cycles

2.175726600 seconds time elapsed
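
As a rough check on the derived columns above, the sketch below recomputes IPC and CPI for three threads from the listed counts. It assumes that IPC is `inst_retired.any` divided by the cycle count and that CPI is the inverse ratio, which is what the printed values suggest; the function names are made up for this illustration and are not part of perf.

```python
def ipc(instructions: int, cycles: int) -> float:
    """Instructions per cycle (assumed definition: instructions / cycles)."""
    return instructions / cycles


def cpi(instructions: int, cycles: int) -> float:
    """Cycles per instruction (assumed definition: cycles / instructions)."""
    return cycles / instructions


# (inst_retired.any, cycles) pairs copied from the output above.
rows = {
    "vmstat-23127": (2_000_783, 1_352_455),
    "thermald-2841": (1_472_670, 1_161_140),
    "sshd-23111": (977_374, 807_827),
}

for thread, (insts, cycles) in rows.items():
    print(f"{thread} {ipc(insts, cycles):.1f} IPC {cpi(insts, cycles):.1f} CPI")
```

For these three threads the `cpu_clk_unhalted.thread` and `cycles` readings are identical, so the same cycle figure serves both ratios.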

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512482591-4646-12-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
diff 29734550 Tue Dec 05 07:03:11 MST 2017 Jin Yao <yao.jin@linux.intel.com> perf stat: Resort '--per-thread' result

There are many threads reported if we enable '--per-thread'
globally.

1. Most of the threads are either not counted or have a counting
value of 0. This patch removes these threads from the output.

2. We also re-sort the displayed threads by counting value, in
descending order, so that the user can easily see the hottest
threads.

For example, the new results would be:

root@skl:/tmp# perf stat --per-thread
^C
Performance counter stats for 'system wide':

perf-24165 4.302433 cpu-clock (msec) # 0.001 CPUs utilized
vmstat-23127 1.562215 cpu-clock (msec) # 0.000 CPUs utilized
irqbalance-2780 0.827851 cpu-clock (msec) # 0.000 CPUs utilized
sshd-23111 0.278308 cpu-clock (msec) # 0.000 CPUs utilized
thermald-2841 0.230880 cpu-clock (msec) # 0.000 CPUs utilized
sshd-23058 0.207306 cpu-clock (msec) # 0.000 CPUs utilized
kworker/0:2-19991 0.133983 cpu-clock (msec) # 0.000 CPUs utilized
kworker/u16:1-18249 0.125636 cpu-clock (msec) # 0.000 CPUs utilized
rcu_sched-8 0.085533 cpu-clock (msec) # 0.000 CPUs utilized
kworker/u16:2-23146 0.077139 cpu-clock (msec) # 0.000 CPUs utilized
gmain-2700 0.041789 cpu-clock (msec) # 0.000 CPUs utilized
kworker/4:1-15354 0.028370 cpu-clock (msec) # 0.000 CPUs utilized
kworker/6:0-17528 0.023895 cpu-clock (msec) # 0.000 CPUs utilized
kworker/4:1H-1887 0.013209 cpu-clock (msec) # 0.000 CPUs utilized
kworker/5:2-31362 0.011627 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/0-11 0.010892 cpu-clock (msec) # 0.000 CPUs utilized
kworker/3:2-12870 0.010220 cpu-clock (msec) # 0.000 CPUs utilized
ksoftirqd/0-7 0.008869 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/1-14 0.008476 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/7-50 0.002944 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/3-26 0.002893 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/4-32 0.002759 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/2-20 0.002429 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/6-44 0.001491 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/5-38 0.001477 cpu-clock (msec) # 0.000 CPUs utilized
rcu_sched-8 10 context-switches # 0.117 M/sec
kworker/u16:1-18249 7 context-switches # 0.056 M/sec
sshd-23111 4 context-switches # 0.014 M/sec
vmstat-23127 4 context-switches # 0.003 M/sec
perf-24165 4 context-switches # 0.930 K/sec
kworker/0:2-19991 3 context-switches # 0.022 M/sec
kworker/u16:2-23146 3 context-switches # 0.039 M/sec
kworker/4:1-15354 2 context-switches # 0.070 M/sec
kworker/6:0-17528 2 context-switches # 0.084 M/sec
sshd-23058 2 context-switches # 0.010 M/sec
ksoftirqd/0-7 1 context-switches # 0.113 M/sec
watchdog/0-11 1 context-switches # 0.092 M/sec
watchdog/1-14 1 context-switches # 0.118 M/sec
watchdog/2-20 1 context-switches # 0.412 M/sec
watchdog/3-26 1 context-switches # 0.346 M/sec
watchdog/4-32 1 context-switches # 0.362 M/sec
watchdog/5-38 1 context-switches # 0.677 M/sec
watchdog/6-44 1 context-switches # 0.671 M/sec
watchdog/7-50 1 context-switches # 0.340 M/sec
kworker/4:1H-1887 1 context-switches # 0.076 M/sec
thermald-2841 1 context-switches # 0.004 M/sec
gmain-2700 1 context-switches # 0.024 M/sec
irqbalance-2780 1 context-switches # 0.001 M/sec
kworker/3:2-12870 1 context-switches # 0.098 M/sec
kworker/5:2-31362 1 context-switches # 0.086 M/sec
kworker/u16:1-18249 2 cpu-migrations # 0.016 M/sec
kworker/u16:2-23146 2 cpu-migrations # 0.026 M/sec
rcu_sched-8 1 cpu-migrations # 0.012 M/sec
sshd-23058 1 cpu-migrations # 0.005 M/sec
perf-24165 8,833,385 cycles # 2.053 GHz
vmstat-23127 1,702,699 cycles # 1.090 GHz
irqbalance-2780 739,847 cycles # 0.894 GHz
sshd-23111 269,506 cycles # 0.968 GHz
thermald-2841 204,556 cycles # 0.886 GHz
sshd-23058 158,780 cycles # 0.766 GHz
kworker/0:2-19991 112,981 cycles # 0.843 GHz
kworker/u16:1-18249 100,926 cycles # 0.803 GHz
rcu_sched-8 74,024 cycles # 0.865 GHz
kworker/u16:2-23146 55,984 cycles # 0.726 GHz
gmain-2700 34,278 cycles # 0.820 GHz
kworker/4:1-15354 20,665 cycles # 0.728 GHz
kworker/6:0-17528 16,445 cycles # 0.688 GHz
kworker/5:2-31362 9,492 cycles # 0.816 GHz
watchdog/3-26 8,695 cycles # 3.006 GHz
kworker/4:1H-1887 8,238 cycles # 0.624 GHz
watchdog/4-32 7,580 cycles # 2.747 GHz
kworker/3:2-12870 7,306 cycles # 0.715 GHz
watchdog/2-20 7,274 cycles # 2.995 GHz
watchdog/0-11 6,988 cycles # 0.642 GHz
ksoftirqd/0-7 6,376 cycles # 0.719 GHz
watchdog/1-14 5,340 cycles # 0.630 GHz
watchdog/5-38 4,061 cycles # 2.749 GHz
watchdog/6-44 3,976 cycles # 2.667 GHz
watchdog/7-50 3,418 cycles # 1.161 GHz
vmstat-23127 2,511,699 instructions # 1.48 insn per cycle
perf-24165 1,829,908 instructions # 0.21 insn per cycle
irqbalance-2780 1,190,204 instructions # 1.61 insn per cycle
thermald-2841 143,544 instructions # 0.70 insn per cycle
sshd-23111 128,138 instructions # 0.48 insn per cycle
sshd-23058 57,654 instructions # 0.36 insn per cycle
rcu_sched-8 44,063 instructions # 0.60 insn per cycle
kworker/u16:1-18249 42,551 instructions # 0.42 insn per cycle
kworker/0:2-19991 25,873 instructions # 0.23 insn per cycle
kworker/u16:2-23146 21,407 instructions # 0.38 insn per cycle
gmain-2700 13,691 instructions # 0.40 insn per cycle
kworker/4:1-15354 12,964 instructions # 0.63 insn per cycle
kworker/6:0-17528 10,034 instructions # 0.61 insn per cycle
kworker/5:2-31362 5,203 instructions # 0.55 insn per cycle
kworker/3:2-12870 4,866 instructions # 0.67 insn per cycle
kworker/4:1H-1887 3,586 instructions # 0.44 insn per cycle
ksoftirqd/0-7 3,463 instructions # 0.54 insn per cycle
watchdog/0-11 3,135 instructions # 0.45 insn per cycle
watchdog/1-14 3,135 instructions # 0.59 insn per cycle
watchdog/2-20 3,135 instructions # 0.43 insn per cycle
watchdog/3-26 3,135 instructions # 0.36 insn per cycle
watchdog/4-32 3,135 instructions # 0.41 insn per cycle
watchdog/5-38 3,135 instructions # 0.77 insn per cycle
watchdog/6-44 3,135 instructions # 0.79 insn per cycle
watchdog/7-50 3,135 instructions # 0.92 insn per cycle
vmstat-23127 539,181 branches # 345.139 M/sec
perf-24165 375,364 branches # 87.245 M/sec
irqbalance-2780 262,092 branches # 316.593 M/sec
thermald-2841 31,611 branches # 136.915 M/sec
sshd-23111 21,874 branches # 78.596 M/sec
sshd-23058 10,682 branches # 51.528 M/sec
rcu_sched-8 8,693 branches # 101.633 M/sec
kworker/u16:1-18249 7,891 branches # 62.808 M/sec
kworker/0:2-19991 5,761 branches # 42.998 M/sec
kworker/u16:2-23146 4,099 branches # 53.138 M/sec
kworker/4:1-15354 2,755 branches # 97.110 M/sec
gmain-2700 2,638 branches # 63.127 M/sec
kworker/6:0-17528 2,216 branches # 92.739 M/sec
kworker/5:2-31362 1,132 branches # 97.360 M/sec
kworker/3:2-12870 1,081 branches # 105.773 M/sec
kworker/4:1H-1887 725 branches # 54.887 M/sec
ksoftirqd/0-7 707 branches # 79.716 M/sec
watchdog/0-11 652 branches # 59.860 M/sec
watchdog/1-14 652 branches # 76.923 M/sec
watchdog/2-20 652 branches # 268.423 M/sec
watchdog/3-26 652 branches # 225.372 M/sec
watchdog/4-32 652 branches # 236.318 M/sec
watchdog/5-38 652 branches # 441.435 M/sec
watchdog/6-44 652 branches # 437.290 M/sec
watchdog/7-50 652 branches # 221.467 M/sec
vmstat-23127 8,960 branch-misses # 1.66% of all branches
irqbalance-2780 3,047 branch-misses # 1.16% of all branches
perf-24165 2,876 branch-misses # 0.77% of all branches
sshd-23111 1,843 branch-misses # 8.43% of all branches
thermald-2841 1,444 branch-misses # 4.57% of all branches
sshd-23058 1,379 branch-misses # 12.91% of all branches
kworker/u16:1-18249 982 branch-misses # 12.44% of all branches
rcu_sched-8 893 branch-misses # 10.27% of all branches
kworker/u16:2-23146 578 branch-misses # 14.10% of all branches
kworker/0:2-19991 376 branch-misses # 6.53% of all branches
gmain-2700 280 branch-misses # 10.61% of all branches
kworker/6:0-17528 196 branch-misses # 8.84% of all branches
kworker/4:1-15354 187 branch-misses # 6.79% of all branches
kworker/5:2-31362 123 branch-misses # 10.87% of all branches
watchdog/0-11 95 branch-misses # 14.57% of all branches
watchdog/4-32 89 branch-misses # 13.65% of all branches
kworker/3:2-12870 80 branch-misses # 7.40% of all branches
watchdog/3-26 61 branch-misses # 9.36% of all branches
kworker/4:1H-1887 60 branch-misses # 8.28% of all branches
watchdog/2-20 52 branch-misses # 7.98% of all branches
ksoftirqd/0-7 47 branch-misses # 6.65% of all branches
watchdog/1-14 46 branch-misses # 7.06% of all branches
watchdog/7-50 13 branch-misses # 1.99% of all branches
watchdog/5-38 8 branch-misses # 1.23% of all branches
watchdog/6-44 7 branch-misses # 1.07% of all branches

3.695150786 seconds time elapsed

root@skl:/tmp# perf stat --per-thread -M IPC,CPI
^C

Performance counter stats for 'system wide':

vmstat-23127 2,000,783 inst_retired.any # 1.5 IPC
thermald-2841 1,472,670 inst_retired.any # 1.3 IPC
sshd-23111 977,374 inst_retired.any # 1.2 IPC
perf-24163 483,779 inst_retired.any # 0.2 IPC
gmain-2700 341,213 inst_retired.any # 0.9 IPC
sshd-23058 148,891 inst_retired.any # 0.8 IPC
rtkit-daemon-3288 71,210 inst_retired.any # 0.7 IPC
kworker/u16:1-18249 39,562 inst_retired.any # 0.3 IPC
rcu_sched-8 14,474 inst_retired.any # 0.8 IPC
kworker/0:2-19991 7,659 inst_retired.any # 0.2 IPC
kworker/4:1-15354 6,714 inst_retired.any # 0.8 IPC
rtkit-daemon-3289 4,839 inst_retired.any # 0.3 IPC
kworker/6:0-17528 3,321 inst_retired.any # 0.6 IPC
kworker/5:2-31362 3,215 inst_retired.any # 0.5 IPC
kworker/7:2-23145 3,173 inst_retired.any # 0.7 IPC
kworker/4:1H-1887 1,719 inst_retired.any # 0.3 IPC
watchdog/0-11 1,479 inst_retired.any # 0.3 IPC
watchdog/1-14 1,479 inst_retired.any # 0.3 IPC
watchdog/2-20 1,479 inst_retired.any # 0.4 IPC
watchdog/3-26 1,479 inst_retired.any # 0.4 IPC
watchdog/4-32 1,479 inst_retired.any # 0.3 IPC
watchdog/5-38 1,479 inst_retired.any # 0.3 IPC
watchdog/6-44 1,479 inst_retired.any # 0.7 IPC
watchdog/7-50 1,479 inst_retired.any # 0.7 IPC
kworker/u16:2-23146 1,408 inst_retired.any # 0.5 IPC
perf-24163 2,249,872 cpu_clk_unhalted.thread
vmstat-23127 1,352,455 cpu_clk_unhalted.thread
thermald-2841 1,161,140 cpu_clk_unhalted.thread
sshd-23111 807,827 cpu_clk_unhalted.thread
gmain-2700 375,535 cpu_clk_unhalted.thread
sshd-23058 194,071 cpu_clk_unhalted.thread
kworker/u16:1-18249 114,306 cpu_clk_unhalted.thread
rtkit-daemon-3288 103,547 cpu_clk_unhalted.thread
kworker/0:2-19991 46,550 cpu_clk_unhalted.thread
rcu_sched-8 18,855 cpu_clk_unhalted.thread
rtkit-daemon-3289 17,549 cpu_clk_unhalted.thread
kworker/4:1-15354 8,812 cpu_clk_unhalted.thread
kworker/5:2-31362 6,812 cpu_clk_unhalted.thread
kworker/4:1H-1887 5,270 cpu_clk_unhalted.thread
kworker/6:0-17528 5,111 cpu_clk_unhalted.thread
kworker/7:2-23145 4,667 cpu_clk_unhalted.thread
watchdog/0-11 4,663 cpu_clk_unhalted.thread
watchdog/1-14 4,663 cpu_clk_unhalted.thread
watchdog/4-32 4,626 cpu_clk_unhalted.thread
watchdog/5-38 4,403 cpu_clk_unhalted.thread
watchdog/3-26 3,936 cpu_clk_unhalted.thread
watchdog/2-20 3,850 cpu_clk_unhalted.thread
kworker/u16:2-23146 2,654 cpu_clk_unhalted.thread
watchdog/6-44 2,017 cpu_clk_unhalted.thread
watchdog/7-50 2,017 cpu_clk_unhalted.thread
vmstat-23127 2,000,783 inst_retired.any # 0.7 CPI
thermald-2841 1,472,670 inst_retired.any # 0.8 CPI
sshd-23111 977,374 inst_retired.any # 0.8 CPI
perf-24163 495,037 inst_retired.any # 4.7 CPI
gmain-2700 341,213 inst_retired.any # 1.1 CPI
sshd-23058 148,891 inst_retired.any # 1.3 CPI
rtkit-daemon-3288 71,210 inst_retired.any # 1.5 CPI
kworker/u16:1-18249 39,562 inst_retired.any # 2.9 CPI
rcu_sched-8 14,474 inst_retired.any # 1.3 CPI
kworker/0:2-19991 7,659 inst_retired.any # 6.1 CPI
kworker/4:1-15354 6,714 inst_retired.any # 1.3 CPI
rtkit-daemon-3289 4,839 inst_retired.any # 3.6 CPI
kworker/6:0-17528 3,321 inst_retired.any # 1.5 CPI
kworker/5:2-31362 3,215 inst_retired.any # 2.1 CPI
kworker/7:2-23145 3,173 inst_retired.any # 1.5 CPI
kworker/4:1H-1887 1,719 inst_retired.any # 3.1 CPI
watchdog/0-11 1,479 inst_retired.any # 3.2 CPI
watchdog/1-14 1,479 inst_retired.any # 3.2 CPI
watchdog/2-20 1,479 inst_retired.any # 2.6 CPI
watchdog/3-26 1,479 inst_retired.any # 2.7 CPI
watchdog/4-32 1,479 inst_retired.any # 3.1 CPI
watchdog/5-38 1,479 inst_retired.any # 3.0 CPI
watchdog/6-44 1,479 inst_retired.any # 1.4 CPI
watchdog/7-50 1,479 inst_retired.any # 1.4 CPI
kworker/u16:2-23146 1,408 inst_retired.any # 1.9 CPI
perf-24163 2,302,323 cycles
vmstat-23127 1,352,455 cycles
thermald-2841 1,161,140 cycles
sshd-23111 807,827 cycles
gmain-2700 375,535 cycles
sshd-23058 194,071 cycles
kworker/u16:1-18249 114,306 cycles
rtkit-daemon-3288 103,547 cycles
kworker/0:2-19991 46,550 cycles
rcu_sched-8 18,855 cycles
rtkit-daemon-3289 17,549 cycles
kworker/4:1-15354 8,812 cycles
kworker/5:2-31362 6,812 cycles
kworker/4:1H-1887 5,270 cycles
kworker/6:0-17528 5,111 cycles
kworker/7:2-23145 4,667 cycles
watchdog/0-11 4,663 cycles
watchdog/1-14 4,663 cycles
watchdog/4-32 4,626 cycles
watchdog/5-38 4,403 cycles
watchdog/3-26 3,936 cycles
watchdog/2-20 3,850 cycles
kworker/u16:2-23146 2,654 cycles
watchdog/6-44 2,017 cycles
watchdog/7-50 2,017 cycles

2.175726600 seconds time elapsed

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512482591-4646-12-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
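
The IPC and CPI columns in the '-M IPC,CPI' output above can be cross-checked by hand. The following is a minimal sketch, assuming IPC is inst_retired.any divided by cpu_clk_unhalted.thread and CPI is its inverse; the helper names `ipc` and `cpi` are illustrative and not part of perf. The three sample pairs are copied from the listing above (for these threads the 'cycles' rows carry the same values as cpu_clk_unhalted.thread).

```python
def ipc(instructions: int, cycles: int) -> float:
    """Instructions per cycle (assumed definition: instructions / cycles)."""
    return instructions / cycles


def cpi(instructions: int, cycles: int) -> float:
    """Cycles per instruction (assumed definition: cycles / instructions)."""
    return cycles / instructions


# (inst_retired.any, cpu_clk_unhalted.thread) pairs copied from the output above.
samples = {
    "vmstat-23127": (2_000_783, 1_352_455),
    "thermald-2841": (1_472_670, 1_161_140),
    "sshd-23111": (977_374, 807_827),
}

for name, (insts, cycles) in samples.items():
    # One decimal place, matching the precision of the perf columns.
    print(f"{name}: {ipc(insts, cycles):.1f} IPC, {cpi(insts, cycles):.1f} CPI")
```

Under these assumptions the rounded values agree with the listing: 1.5 IPC and 0.7 CPI for vmstat-23127, 1.3 and 0.8 for thermald-2841, 1.2 and 0.8 for sshd-23111.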
diff 29734550 Tue Dec 05 07:03:11 MST 2017 Jin Yao <yao.jin@linux.intel.com> perf stat: Resort '--per-thread' result

Many threads are reported when '--per-thread' is enabled
system-wide.

1. Most of the threads are either not counted or have a counter
value of 0. This patch removes these threads from the output.

2. The remaining threads are re-sorted for display by counter
value, so the user can easily see the hottest threads.
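
The two changes listed above can be sketched as a filter followed by a descending sort. This is an illustration only, not the perf implementation (perf itself is written in C); the `ThreadCount` type, its `counted` flag and the `hottest_first` helper are hypothetical names, and the two idle-thread rows are made-up inputs added to show the filtering.

```python
from typing import NamedTuple


class ThreadCount(NamedTuple):
    """One per-thread reading for a single event (hypothetical type)."""
    name: str      # "comm-tid" label, in the style printed by perf stat
    value: int     # raw counter value
    counted: bool  # False if the event was never counted for this thread


def hottest_first(rows: list[ThreadCount]) -> list[ThreadCount]:
    """Drop uncounted or zero-valued threads, then sort by value, largest first."""
    kept = [r for r in rows if r.counted and r.value > 0]
    return sorted(kept, key=lambda r: r.value, reverse=True)


rows = [
    ThreadCount("watchdog/0-11", 6_988, True),
    ThreadCount("perf-24165", 8_833_385, True),
    ThreadCount("idle-thread-1", 0, True),    # made-up entry: counter value is 0
    ThreadCount("idle-thread-2", 0, False),   # made-up entry: not counted at all
    ThreadCount("vmstat-23127", 1_702_699, True),
]

for r in hottest_first(rows):
    print(f"{r.name:>20} {r.value:>12,} cycles")
```

Because Python's `sorted` is stable, threads with equal counts keep their input order in this sketch; the order perf uses for ties is not stated in the commit message.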

For example, the new results would be:

root@skl:/tmp# perf stat --per-thread
^C
Performance counter stats for 'system wide':

perf-24165 4.302433 cpu-clock (msec) # 0.001 CPUs utilized
vmstat-23127 1.562215 cpu-clock (msec) # 0.000 CPUs utilized
irqbalance-2780 0.827851 cpu-clock (msec) # 0.000 CPUs utilized
sshd-23111 0.278308 cpu-clock (msec) # 0.000 CPUs utilized
thermald-2841 0.230880 cpu-clock (msec) # 0.000 CPUs utilized
sshd-23058 0.207306 cpu-clock (msec) # 0.000 CPUs utilized
kworker/0:2-19991 0.133983 cpu-clock (msec) # 0.000 CPUs utilized
kworker/u16:1-18249 0.125636 cpu-clock (msec) # 0.000 CPUs utilized
rcu_sched-8 0.085533 cpu-clock (msec) # 0.000 CPUs utilized
kworker/u16:2-23146 0.077139 cpu-clock (msec) # 0.000 CPUs utilized
gmain-2700 0.041789 cpu-clock (msec) # 0.000 CPUs utilized
kworker/4:1-15354 0.028370 cpu-clock (msec) # 0.000 CPUs utilized
kworker/6:0-17528 0.023895 cpu-clock (msec) # 0.000 CPUs utilized
kworker/4:1H-1887 0.013209 cpu-clock (msec) # 0.000 CPUs utilized
kworker/5:2-31362 0.011627 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/0-11 0.010892 cpu-clock (msec) # 0.000 CPUs utilized
kworker/3:2-12870 0.010220 cpu-clock (msec) # 0.000 CPUs utilized
ksoftirqd/0-7 0.008869 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/1-14 0.008476 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/7-50 0.002944 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/3-26 0.002893 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/4-32 0.002759 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/2-20 0.002429 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/6-44 0.001491 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/5-38 0.001477 cpu-clock (msec) # 0.000 CPUs utilized
rcu_sched-8 10 context-switches # 0.117 M/sec
kworker/u16:1-18249 7 context-switches # 0.056 M/sec
sshd-23111 4 context-switches # 0.014 M/sec
vmstat-23127 4 context-switches # 0.003 M/sec
perf-24165 4 context-switches # 0.930 K/sec
kworker/0:2-19991 3 context-switches # 0.022 M/sec
kworker/u16:2-23146 3 context-switches # 0.039 M/sec
kworker/4:1-15354 2 context-switches # 0.070 M/sec
kworker/6:0-17528 2 context-switches # 0.084 M/sec
sshd-23058 2 context-switches # 0.010 M/sec
ksoftirqd/0-7 1 context-switches # 0.113 M/sec
watchdog/0-11 1 context-switches # 0.092 M/sec
watchdog/1-14 1 context-switches # 0.118 M/sec
watchdog/2-20 1 context-switches # 0.412 M/sec
watchdog/3-26 1 context-switches # 0.346 M/sec
watchdog/4-32 1 context-switches # 0.362 M/sec
watchdog/5-38 1 context-switches # 0.677 M/sec
watchdog/6-44 1 context-switches # 0.671 M/sec
watchdog/7-50 1 context-switches # 0.340 M/sec
kworker/4:1H-1887 1 context-switches # 0.076 M/sec
thermald-2841 1 context-switches # 0.004 M/sec
gmain-2700 1 context-switches # 0.024 M/sec
irqbalance-2780 1 context-switches # 0.001 M/sec
kworker/3:2-12870 1 context-switches # 0.098 M/sec
kworker/5:2-31362 1 context-switches # 0.086 M/sec
kworker/u16:1-18249 2 cpu-migrations # 0.016 M/sec
kworker/u16:2-23146 2 cpu-migrations # 0.026 M/sec
rcu_sched-8 1 cpu-migrations # 0.012 M/sec
sshd-23058 1 cpu-migrations # 0.005 M/sec
perf-24165 8,833,385 cycles # 2.053 GHz
vmstat-23127 1,702,699 cycles # 1.090 GHz
irqbalance-2780 739,847 cycles # 0.894 GHz
sshd-23111 269,506 cycles # 0.968 GHz
thermald-2841 204,556 cycles # 0.886 GHz
sshd-23058 158,780 cycles # 0.766 GHz
kworker/0:2-19991 112,981 cycles # 0.843 GHz
kworker/u16:1-18249 100,926 cycles # 0.803 GHz
rcu_sched-8 74,024 cycles # 0.865 GHz
kworker/u16:2-23146 55,984 cycles # 0.726 GHz
gmain-2700 34,278 cycles # 0.820 GHz
kworker/4:1-15354 20,665 cycles # 0.728 GHz
kworker/6:0-17528 16,445 cycles # 0.688 GHz
kworker/5:2-31362 9,492 cycles # 0.816 GHz
watchdog/3-26 8,695 cycles # 3.006 GHz
kworker/4:1H-1887 8,238 cycles # 0.624 GHz
watchdog/4-32 7,580 cycles # 2.747 GHz
kworker/3:2-12870 7,306 cycles # 0.715 GHz
watchdog/2-20 7,274 cycles # 2.995 GHz
watchdog/0-11 6,988 cycles # 0.642 GHz
ksoftirqd/0-7 6,376 cycles # 0.719 GHz
watchdog/1-14 5,340 cycles # 0.630 GHz
watchdog/5-38 4,061 cycles # 2.749 GHz
watchdog/6-44 3,976 cycles # 2.667 GHz
watchdog/7-50 3,418 cycles # 1.161 GHz
vmstat-23127 2,511,699 instructions # 1.48 insn per cycle
perf-24165 1,829,908 instructions # 0.21 insn per cycle
irqbalance-2780 1,190,204 instructions # 1.61 insn per cycle
thermald-2841 143,544 instructions # 0.70 insn per cycle
sshd-23111 128,138 instructions # 0.48 insn per cycle
sshd-23058 57,654 instructions # 0.36 insn per cycle
rcu_sched-8 44,063 instructions # 0.60 insn per cycle
kworker/u16:1-18249 42,551 instructions # 0.42 insn per cycle
kworker/0:2-19991 25,873 instructions # 0.23 insn per cycle
kworker/u16:2-23146 21,407 instructions # 0.38 insn per cycle
gmain-2700 13,691 instructions # 0.40 insn per cycle
kworker/4:1-15354 12,964 instructions # 0.63 insn per cycle
kworker/6:0-17528 10,034 instructions # 0.61 insn per cycle
kworker/5:2-31362 5,203 instructions # 0.55 insn per cycle
kworker/3:2-12870 4,866 instructions # 0.67 insn per cycle
kworker/4:1H-1887 3,586 instructions # 0.44 insn per cycle
ksoftirqd/0-7 3,463 instructions # 0.54 insn per cycle
watchdog/0-11 3,135 instructions # 0.45 insn per cycle
watchdog/1-14 3,135 instructions # 0.59 insn per cycle
watchdog/2-20 3,135 instructions # 0.43 insn per cycle
watchdog/3-26 3,135 instructions # 0.36 insn per cycle
watchdog/4-32 3,135 instructions # 0.41 insn per cycle
watchdog/5-38 3,135 instructions # 0.77 insn per cycle
watchdog/6-44 3,135 instructions # 0.79 insn per cycle
watchdog/7-50 3,135 instructions # 0.92 insn per cycle
vmstat-23127 539,181 branches # 345.139 M/sec
perf-24165 375,364 branches # 87.245 M/sec
irqbalance-2780 262,092 branches # 316.593 M/sec
thermald-2841 31,611 branches # 136.915 M/sec
sshd-23111 21,874 branches # 78.596 M/sec
sshd-23058 10,682 branches # 51.528 M/sec
rcu_sched-8 8,693 branches # 101.633 M/sec
kworker/u16:1-18249 7,891 branches # 62.808 M/sec
kworker/0:2-19991 5,761 branches # 42.998 M/sec
kworker/u16:2-23146 4,099 branches # 53.138 M/sec
kworker/4:1-15354 2,755 branches # 97.110 M/sec
gmain-2700 2,638 branches # 63.127 M/sec
kworker/6:0-17528 2,216 branches # 92.739 M/sec
kworker/5:2-31362 1,132 branches # 97.360 M/sec
kworker/3:2-12870 1,081 branches # 105.773 M/sec
kworker/4:1H-1887 725 branches # 54.887 M/sec
ksoftirqd/0-7 707 branches # 79.716 M/sec
watchdog/0-11 652 branches # 59.860 M/sec
watchdog/1-14 652 branches # 76.923 M/sec
watchdog/2-20 652 branches # 268.423 M/sec
watchdog/3-26 652 branches # 225.372 M/sec
watchdog/4-32 652 branches # 236.318 M/sec
watchdog/5-38 652 branches # 441.435 M/sec
watchdog/6-44 652 branches # 437.290 M/sec
watchdog/7-50 652 branches # 221.467 M/sec
vmstat-23127 8,960 branch-misses # 1.66% of all branches
irqbalance-2780 3,047 branch-misses # 1.16% of all branches
perf-24165 2,876 branch-misses # 0.77% of all branches
sshd-23111 1,843 branch-misses # 8.43% of all branches
thermald-2841 1,444 branch-misses # 4.57% of all branches
sshd-23058 1,379 branch-misses # 12.91% of all branches
kworker/u16:1-18249 982 branch-misses # 12.44% of all branches
rcu_sched-8 893 branch-misses # 10.27% of all branches
kworker/u16:2-23146 578 branch-misses # 14.10% of all branches
kworker/0:2-19991 376 branch-misses # 6.53% of all branches
gmain-2700 280 branch-misses # 10.61% of all branches
kworker/6:0-17528 196 branch-misses # 8.84% of all branches
kworker/4:1-15354 187 branch-misses # 6.79% of all branches
kworker/5:2-31362 123 branch-misses # 10.87% of all branches
watchdog/0-11 95 branch-misses # 14.57% of all branches
watchdog/4-32 89 branch-misses # 13.65% of all branches
kworker/3:2-12870 80 branch-misses # 7.40% of all branches
watchdog/3-26 61 branch-misses # 9.36% of all branches
kworker/4:1H-1887 60 branch-misses # 8.28% of all branches
watchdog/2-20 52 branch-misses # 7.98% of all branches
ksoftirqd/0-7 47 branch-misses # 6.65% of all branches
watchdog/1-14 46 branch-misses # 7.06% of all branches
watchdog/7-50 13 branch-misses # 1.99% of all branches
watchdog/5-38 8 branch-misses # 1.23% of all branches
watchdog/6-44 7 branch-misses # 1.07% of all branches

3.695150786 seconds time elapsed

root@skl:/tmp# perf stat --per-thread -M IPC,CPI
^C

Performance counter stats for 'system wide':

vmstat-23127 2,000,783 inst_retired.any # 1.5 IPC
thermald-2841 1,472,670 inst_retired.any # 1.3 IPC
sshd-23111 977,374 inst_retired.any # 1.2 IPC
perf-24163 483,779 inst_retired.any # 0.2 IPC
gmain-2700 341,213 inst_retired.any # 0.9 IPC
sshd-23058 148,891 inst_retired.any # 0.8 IPC
rtkit-daemon-3288 71,210 inst_retired.any # 0.7 IPC
kworker/u16:1-18249 39,562 inst_retired.any # 0.3 IPC
rcu_sched-8 14,474 inst_retired.any # 0.8 IPC
kworker/0:2-19991 7,659 inst_retired.any # 0.2 IPC
kworker/4:1-15354 6,714 inst_retired.any # 0.8 IPC
rtkit-daemon-3289 4,839 inst_retired.any # 0.3 IPC
kworker/6:0-17528 3,321 inst_retired.any # 0.6 IPC
kworker/5:2-31362 3,215 inst_retired.any # 0.5 IPC
kworker/7:2-23145 3,173 inst_retired.any # 0.7 IPC
kworker/4:1H-1887 1,719 inst_retired.any # 0.3 IPC
watchdog/0-11 1,479 inst_retired.any # 0.3 IPC
watchdog/1-14 1,479 inst_retired.any # 0.3 IPC
watchdog/2-20 1,479 inst_retired.any # 0.4 IPC
watchdog/3-26 1,479 inst_retired.any # 0.4 IPC
watchdog/4-32 1,479 inst_retired.any # 0.3 IPC
watchdog/5-38 1,479 inst_retired.any # 0.3 IPC
watchdog/6-44 1,479 inst_retired.any # 0.7 IPC
watchdog/7-50 1,479 inst_retired.any # 0.7 IPC
kworker/u16:2-23146 1,408 inst_retired.any # 0.5 IPC
perf-24163 2,249,872 cpu_clk_unhalted.thread
vmstat-23127 1,352,455 cpu_clk_unhalted.thread
thermald-2841 1,161,140 cpu_clk_unhalted.thread
sshd-23111 807,827 cpu_clk_unhalted.thread
gmain-2700 375,535 cpu_clk_unhalted.thread
sshd-23058 194,071 cpu_clk_unhalted.thread
kworker/u16:1-18249 114,306 cpu_clk_unhalted.thread
rtkit-daemon-3288 103,547 cpu_clk_unhalted.thread
kworker/0:2-19991 46,550 cpu_clk_unhalted.thread
rcu_sched-8 18,855 cpu_clk_unhalted.thread
rtkit-daemon-3289 17,549 cpu_clk_unhalted.thread
kworker/4:1-15354 8,812 cpu_clk_unhalted.thread
kworker/5:2-31362 6,812 cpu_clk_unhalted.thread
kworker/4:1H-1887 5,270 cpu_clk_unhalted.thread
kworker/6:0-17528 5,111 cpu_clk_unhalted.thread
kworker/7:2-23145 4,667 cpu_clk_unhalted.thread
watchdog/0-11 4,663 cpu_clk_unhalted.thread
watchdog/1-14 4,663 cpu_clk_unhalted.thread
watchdog/4-32 4,626 cpu_clk_unhalted.thread
watchdog/5-38 4,403 cpu_clk_unhalted.thread
watchdog/3-26 3,936 cpu_clk_unhalted.thread
watchdog/2-20 3,850 cpu_clk_unhalted.thread
kworker/u16:2-23146 2,654 cpu_clk_unhalted.thread
watchdog/6-44 2,017 cpu_clk_unhalted.thread
watchdog/7-50 2,017 cpu_clk_unhalted.thread
vmstat-23127 2,000,783 inst_retired.any # 0.7 CPI
thermald-2841 1,472,670 inst_retired.any # 0.8 CPI
sshd-23111 977,374 inst_retired.any # 0.8 CPI
perf-24163 495,037 inst_retired.any # 4.7 CPI
gmain-2700 341,213 inst_retired.any # 1.1 CPI
sshd-23058 148,891 inst_retired.any # 1.3 CPI
rtkit-daemon-3288 71,210 inst_retired.any # 1.5 CPI
kworker/u16:1-18249 39,562 inst_retired.any # 2.9 CPI
rcu_sched-8 14,474 inst_retired.any # 1.3 CPI
kworker/0:2-19991 7,659 inst_retired.any # 6.1 CPI
kworker/4:1-15354 6,714 inst_retired.any # 1.3 CPI
rtkit-daemon-3289 4,839 inst_retired.any # 3.6 CPI
kworker/6:0-17528 3,321 inst_retired.any # 1.5 CPI
kworker/5:2-31362 3,215 inst_retired.any # 2.1 CPI
kworker/7:2-23145 3,173 inst_retired.any # 1.5 CPI
kworker/4:1H-1887 1,719 inst_retired.any # 3.1 CPI
watchdog/0-11 1,479 inst_retired.any # 3.2 CPI
watchdog/1-14 1,479 inst_retired.any # 3.2 CPI
watchdog/2-20 1,479 inst_retired.any # 2.6 CPI
watchdog/3-26 1,479 inst_retired.any # 2.7 CPI
watchdog/4-32 1,479 inst_retired.any # 3.1 CPI
watchdog/5-38 1,479 inst_retired.any # 3.0 CPI
watchdog/6-44 1,479 inst_retired.any # 1.4 CPI
watchdog/7-50 1,479 inst_retired.any # 1.4 CPI
kworker/u16:2-23146 1,408 inst_retired.any # 1.9 CPI
perf-24163 2,302,323 cycles
vmstat-23127 1,352,455 cycles
thermald-2841 1,161,140 cycles
sshd-23111 807,827 cycles
gmain-2700 375,535 cycles
sshd-23058 194,071 cycles
kworker/u16:1-18249 114,306 cycles
rtkit-daemon-3288 103,547 cycles
kworker/0:2-19991 46,550 cycles
rcu_sched-8 18,855 cycles
rtkit-daemon-3289 17,549 cycles
kworker/4:1-15354 8,812 cycles
kworker/5:2-31362 6,812 cycles
kworker/4:1H-1887 5,270 cycles
kworker/6:0-17528 5,111 cycles
kworker/7:2-23145 4,667 cycles
watchdog/0-11 4,663 cycles
watchdog/1-14 4,663 cycles
watchdog/4-32 4,626 cycles
watchdog/5-38 4,403 cycles
watchdog/3-26 3,936 cycles
watchdog/2-20 3,850 cycles
kworker/u16:2-23146 2,654 cycles
watchdog/6-44 2,017 cycles
watchdog/7-50 2,017 cycles

2.175726600 seconds time elapsed

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512482591-4646-12-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
diff 29734550 Tue Dec 05 07:03:11 MST 2017 Jin Yao <yao.jin@linux.intel.com> perf stat: Resort '--per-thread' result

Many threads are reported when '--per-thread' is enabled
globally.

1. Most of these threads are either not counted or have a
counting value of 0. This patch removes them from the output.

2. The remaining threads are re-sorted for display by counting
value, so the user can easily see the hottest threads.
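
The two steps above can be sketched in a few lines. This is an illustrative Python sketch, not perf's actual C implementation: the function name `resort_threads` and the two uncounted entries are hypothetical, while the non-zero values are taken from the cycles column of the sample output in this message. Descending order is assumed because that is how the sample output is ordered.

```python
def resort_threads(counts):
    """Drop threads with no count, then order the rest hottest-first."""
    # Step 1: skip threads that were not counted (None) or counted 0.
    counted = [(name, value) for name, value in counts if value]
    # Step 2: sort by counting value, largest first.
    return sorted(counted, key=lambda item: item[1], reverse=True)


samples = [
    ("sshd-23111", 269506),
    ("idle-thread-99", 0),      # hypothetical thread with a zero count
    ("perf-24165", 8833385),
    ("not-counted-100", None),  # hypothetical thread that was never counted
    ("vmstat-23127", 1702699),
]

for name, value in resort_threads(samples):
    print(f"{name:>16} {value:>12,} cycles")
```

Filtering before sorting keeps the sort limited to the threads that will actually be displayed.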

For example, the new results would be:

root@skl:/tmp# perf stat --per-thread
^C
Performance counter stats for 'system wide':

perf-24165 4.302433 cpu-clock (msec) # 0.001 CPUs utilized
vmstat-23127 1.562215 cpu-clock (msec) # 0.000 CPUs utilized
irqbalance-2780 0.827851 cpu-clock (msec) # 0.000 CPUs utilized
sshd-23111 0.278308 cpu-clock (msec) # 0.000 CPUs utilized
thermald-2841 0.230880 cpu-clock (msec) # 0.000 CPUs utilized
sshd-23058 0.207306 cpu-clock (msec) # 0.000 CPUs utilized
kworker/0:2-19991 0.133983 cpu-clock (msec) # 0.000 CPUs utilized
kworker/u16:1-18249 0.125636 cpu-clock (msec) # 0.000 CPUs utilized
rcu_sched-8 0.085533 cpu-clock (msec) # 0.000 CPUs utilized
kworker/u16:2-23146 0.077139 cpu-clock (msec) # 0.000 CPUs utilized
gmain-2700 0.041789 cpu-clock (msec) # 0.000 CPUs utilized
kworker/4:1-15354 0.028370 cpu-clock (msec) # 0.000 CPUs utilized
kworker/6:0-17528 0.023895 cpu-clock (msec) # 0.000 CPUs utilized
kworker/4:1H-1887 0.013209 cpu-clock (msec) # 0.000 CPUs utilized
kworker/5:2-31362 0.011627 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/0-11 0.010892 cpu-clock (msec) # 0.000 CPUs utilized
kworker/3:2-12870 0.010220 cpu-clock (msec) # 0.000 CPUs utilized
ksoftirqd/0-7 0.008869 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/1-14 0.008476 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/7-50 0.002944 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/3-26 0.002893 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/4-32 0.002759 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/2-20 0.002429 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/6-44 0.001491 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/5-38 0.001477 cpu-clock (msec) # 0.000 CPUs utilized
rcu_sched-8 10 context-switches # 0.117 M/sec
kworker/u16:1-18249 7 context-switches # 0.056 M/sec
sshd-23111 4 context-switches # 0.014 M/sec
vmstat-23127 4 context-switches # 0.003 M/sec
perf-24165 4 context-switches # 0.930 K/sec
kworker/0:2-19991 3 context-switches # 0.022 M/sec
kworker/u16:2-23146 3 context-switches # 0.039 M/sec
kworker/4:1-15354 2 context-switches # 0.070 M/sec
kworker/6:0-17528 2 context-switches # 0.084 M/sec
sshd-23058 2 context-switches # 0.010 M/sec
ksoftirqd/0-7 1 context-switches # 0.113 M/sec
watchdog/0-11 1 context-switches # 0.092 M/sec
watchdog/1-14 1 context-switches # 0.118 M/sec
watchdog/2-20 1 context-switches # 0.412 M/sec
watchdog/3-26 1 context-switches # 0.346 M/sec
watchdog/4-32 1 context-switches # 0.362 M/sec
watchdog/5-38 1 context-switches # 0.677 M/sec
watchdog/6-44 1 context-switches # 0.671 M/sec
watchdog/7-50 1 context-switches # 0.340 M/sec
kworker/4:1H-1887 1 context-switches # 0.076 M/sec
thermald-2841 1 context-switches # 0.004 M/sec
gmain-2700 1 context-switches # 0.024 M/sec
irqbalance-2780 1 context-switches # 0.001 M/sec
kworker/3:2-12870 1 context-switches # 0.098 M/sec
kworker/5:2-31362 1 context-switches # 0.086 M/sec
kworker/u16:1-18249 2 cpu-migrations # 0.016 M/sec
kworker/u16:2-23146 2 cpu-migrations # 0.026 M/sec
rcu_sched-8 1 cpu-migrations # 0.012 M/sec
sshd-23058 1 cpu-migrations # 0.005 M/sec
perf-24165 8,833,385 cycles # 2.053 GHz
vmstat-23127 1,702,699 cycles # 1.090 GHz
irqbalance-2780 739,847 cycles # 0.894 GHz
sshd-23111 269,506 cycles # 0.968 GHz
thermald-2841 204,556 cycles # 0.886 GHz
sshd-23058 158,780 cycles # 0.766 GHz
kworker/0:2-19991 112,981 cycles # 0.843 GHz
kworker/u16:1-18249 100,926 cycles # 0.803 GHz
rcu_sched-8 74,024 cycles # 0.865 GHz
kworker/u16:2-23146 55,984 cycles # 0.726 GHz
gmain-2700 34,278 cycles # 0.820 GHz
kworker/4:1-15354 20,665 cycles # 0.728 GHz
kworker/6:0-17528 16,445 cycles # 0.688 GHz
kworker/5:2-31362 9,492 cycles # 0.816 GHz
watchdog/3-26 8,695 cycles # 3.006 GHz
kworker/4:1H-1887 8,238 cycles # 0.624 GHz
watchdog/4-32 7,580 cycles # 2.747 GHz
kworker/3:2-12870 7,306 cycles # 0.715 GHz
watchdog/2-20 7,274 cycles # 2.995 GHz
watchdog/0-11 6,988 cycles # 0.642 GHz
ksoftirqd/0-7 6,376 cycles # 0.719 GHz
watchdog/1-14 5,340 cycles # 0.630 GHz
watchdog/5-38 4,061 cycles # 2.749 GHz
watchdog/6-44 3,976 cycles # 2.667 GHz
watchdog/7-50 3,418 cycles # 1.161 GHz
vmstat-23127 2,511,699 instructions # 1.48 insn per cycle
perf-24165 1,829,908 instructions # 0.21 insn per cycle
irqbalance-2780 1,190,204 instructions # 1.61 insn per cycle
thermald-2841 143,544 instructions # 0.70 insn per cycle
sshd-23111 128,138 instructions # 0.48 insn per cycle
sshd-23058 57,654 instructions # 0.36 insn per cycle
rcu_sched-8 44,063 instructions # 0.60 insn per cycle
kworker/u16:1-18249 42,551 instructions # 0.42 insn per cycle
kworker/0:2-19991 25,873 instructions # 0.23 insn per cycle
kworker/u16:2-23146 21,407 instructions # 0.38 insn per cycle
gmain-2700 13,691 instructions # 0.40 insn per cycle
kworker/4:1-15354 12,964 instructions # 0.63 insn per cycle
kworker/6:0-17528 10,034 instructions # 0.61 insn per cycle
kworker/5:2-31362 5,203 instructions # 0.55 insn per cycle
kworker/3:2-12870 4,866 instructions # 0.67 insn per cycle
kworker/4:1H-1887 3,586 instructions # 0.44 insn per cycle
ksoftirqd/0-7 3,463 instructions # 0.54 insn per cycle
watchdog/0-11 3,135 instructions # 0.45 insn per cycle
watchdog/1-14 3,135 instructions # 0.59 insn per cycle
watchdog/2-20 3,135 instructions # 0.43 insn per cycle
watchdog/3-26 3,135 instructions # 0.36 insn per cycle
watchdog/4-32 3,135 instructions # 0.41 insn per cycle
watchdog/5-38 3,135 instructions # 0.77 insn per cycle
watchdog/6-44 3,135 instructions # 0.79 insn per cycle
watchdog/7-50 3,135 instructions # 0.92 insn per cycle
vmstat-23127 539,181 branches # 345.139 M/sec
perf-24165 375,364 branches # 87.245 M/sec
irqbalance-2780 262,092 branches # 316.593 M/sec
thermald-2841 31,611 branches # 136.915 M/sec
sshd-23111 21,874 branches # 78.596 M/sec
sshd-23058 10,682 branches # 51.528 M/sec
rcu_sched-8 8,693 branches # 101.633 M/sec
kworker/u16:1-18249 7,891 branches # 62.808 M/sec
kworker/0:2-19991 5,761 branches # 42.998 M/sec
kworker/u16:2-23146 4,099 branches # 53.138 M/sec
kworker/4:1-15354 2,755 branches # 97.110 M/sec
gmain-2700 2,638 branches # 63.127 M/sec
kworker/6:0-17528 2,216 branches # 92.739 M/sec
kworker/5:2-31362 1,132 branches # 97.360 M/sec
kworker/3:2-12870 1,081 branches # 105.773 M/sec
kworker/4:1H-1887 725 branches # 54.887 M/sec
ksoftirqd/0-7 707 branches # 79.716 M/sec
watchdog/0-11 652 branches # 59.860 M/sec
watchdog/1-14 652 branches # 76.923 M/sec
watchdog/2-20 652 branches # 268.423 M/sec
watchdog/3-26 652 branches # 225.372 M/sec
watchdog/4-32 652 branches # 236.318 M/sec
watchdog/5-38 652 branches # 441.435 M/sec
watchdog/6-44 652 branches # 437.290 M/sec
watchdog/7-50 652 branches # 221.467 M/sec
vmstat-23127 8,960 branch-misses # 1.66% of all branches
irqbalance-2780 3,047 branch-misses # 1.16% of all branches
perf-24165 2,876 branch-misses # 0.77% of all branches
sshd-23111 1,843 branch-misses # 8.43% of all branches
thermald-2841 1,444 branch-misses # 4.57% of all branches
sshd-23058 1,379 branch-misses # 12.91% of all branches
kworker/u16:1-18249 982 branch-misses # 12.44% of all branches
rcu_sched-8 893 branch-misses # 10.27% of all branches
kworker/u16:2-23146 578 branch-misses # 14.10% of all branches
kworker/0:2-19991 376 branch-misses # 6.53% of all branches
gmain-2700 280 branch-misses # 10.61% of all branches
kworker/6:0-17528 196 branch-misses # 8.84% of all branches
kworker/4:1-15354 187 branch-misses # 6.79% of all branches
kworker/5:2-31362 123 branch-misses # 10.87% of all branches
watchdog/0-11 95 branch-misses # 14.57% of all branches
watchdog/4-32 89 branch-misses # 13.65% of all branches
kworker/3:2-12870 80 branch-misses # 7.40% of all branches
watchdog/3-26 61 branch-misses # 9.36% of all branches
kworker/4:1H-1887 60 branch-misses # 8.28% of all branches
watchdog/2-20 52 branch-misses # 7.98% of all branches
ksoftirqd/0-7 47 branch-misses # 6.65% of all branches
watchdog/1-14 46 branch-misses # 7.06% of all branches
watchdog/7-50 13 branch-misses # 1.99% of all branches
watchdog/5-38 8 branch-misses # 1.23% of all branches
watchdog/6-44 7 branch-misses # 1.07% of all branches

3.695150786 seconds time elapsed
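
The values printed after '#' in the output above can be recomputed from the raw per-thread counts. The sketch below is a Python illustration, assuming the usual ratios (cycles over cpu-clock time, instructions over cycles, branch-misses over branches). The function name `derived_metrics` is hypothetical, and the inputs are the vmstat-23127 rows from the sample output.

```python
def derived_metrics(cpu_clock_msec, cycles, instructions, branches, branch_misses):
    """Recompute the values perf prints after '#' from the raw counts."""
    return {
        # cycles per millisecond / 1e6 == cycles per nanosecond == GHz
        "ghz": cycles / cpu_clock_msec / 1e6,
        "insn_per_cycle": instructions / cycles,
        # events per millisecond / 1e3 == millions of events per second
        "branches_m_per_sec": branches / cpu_clock_msec / 1e3,
        "branch_miss_pct": 100.0 * branch_misses / branches,
    }


# vmstat-23127 rows from the sample output
m = derived_metrics(
    cpu_clock_msec=1.562215,
    cycles=1702699,
    instructions=2511699,
    branches=539181,
    branch_misses=8960,
)
print(f"{m['ghz']:.3f} GHz")
print(f"{m['insn_per_cycle']:.2f} insn per cycle")
print(f"{m['branches_m_per_sec']:.3f} M/sec")
print(f"{m['branch_miss_pct']:.2f}% of all branches")
```

For this thread the recomputed values agree with the printed 1.090 GHz, 1.48 insn per cycle, 345.139 M/sec and 1.66% of all branches.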

root@skl:/tmp# perf stat --per-thread -M IPC,CPI
^C

Performance counter stats for 'system wide':

vmstat-23127 2,000,783 inst_retired.any # 1.5 IPC
thermald-2841 1,472,670 inst_retired.any # 1.3 IPC
sshd-23111 977,374 inst_retired.any # 1.2 IPC
perf-24163 483,779 inst_retired.any # 0.2 IPC
gmain-2700 341,213 inst_retired.any # 0.9 IPC
sshd-23058 148,891 inst_retired.any # 0.8 IPC
rtkit-daemon-3288 71,210 inst_retired.any # 0.7 IPC
kworker/u16:1-18249 39,562 inst_retired.any # 0.3 IPC
rcu_sched-8 14,474 inst_retired.any # 0.8 IPC
kworker/0:2-19991 7,659 inst_retired.any # 0.2 IPC
kworker/4:1-15354 6,714 inst_retired.any # 0.8 IPC
rtkit-daemon-3289 4,839 inst_retired.any # 0.3 IPC
kworker/6:0-17528 3,321 inst_retired.any # 0.6 IPC
kworker/5:2-31362 3,215 inst_retired.any # 0.5 IPC
kworker/7:2-23145 3,173 inst_retired.any # 0.7 IPC
kworker/4:1H-1887 1,719 inst_retired.any # 0.3 IPC
watchdog/0-11 1,479 inst_retired.any # 0.3 IPC
watchdog/1-14 1,479 inst_retired.any # 0.3 IPC
watchdog/2-20 1,479 inst_retired.any # 0.4 IPC
watchdog/3-26 1,479 inst_retired.any # 0.4 IPC
watchdog/4-32 1,479 inst_retired.any # 0.3 IPC
watchdog/5-38 1,479 inst_retired.any # 0.3 IPC
watchdog/6-44 1,479 inst_retired.any # 0.7 IPC
watchdog/7-50 1,479 inst_retired.any # 0.7 IPC
kworker/u16:2-23146 1,408 inst_retired.any # 0.5 IPC
perf-24163 2,249,872 cpu_clk_unhalted.thread
vmstat-23127 1,352,455 cpu_clk_unhalted.thread
thermald-2841 1,161,140 cpu_clk_unhalted.thread
sshd-23111 807,827 cpu_clk_unhalted.thread
gmain-2700 375,535 cpu_clk_unhalted.thread
sshd-23058 194,071 cpu_clk_unhalted.thread
kworker/u16:1-18249 114,306 cpu_clk_unhalted.thread
rtkit-daemon-3288 103,547 cpu_clk_unhalted.thread
kworker/0:2-19991 46,550 cpu_clk_unhalted.thread
rcu_sched-8 18,855 cpu_clk_unhalted.thread
rtkit-daemon-3289 17,549 cpu_clk_unhalted.thread
kworker/4:1-15354 8,812 cpu_clk_unhalted.thread
kworker/5:2-31362 6,812 cpu_clk_unhalted.thread
kworker/4:1H-1887 5,270 cpu_clk_unhalted.thread
kworker/6:0-17528 5,111 cpu_clk_unhalted.thread
kworker/7:2-23145 4,667 cpu_clk_unhalted.thread
watchdog/0-11 4,663 cpu_clk_unhalted.thread
watchdog/1-14 4,663 cpu_clk_unhalted.thread
watchdog/4-32 4,626 cpu_clk_unhalted.thread
watchdog/5-38 4,403 cpu_clk_unhalted.thread
watchdog/3-26 3,936 cpu_clk_unhalted.thread
watchdog/2-20 3,850 cpu_clk_unhalted.thread
kworker/u16:2-23146 2,654 cpu_clk_unhalted.thread
watchdog/6-44 2,017 cpu_clk_unhalted.thread
watchdog/7-50 2,017 cpu_clk_unhalted.thread
vmstat-23127 2,000,783 inst_retired.any # 0.7 CPI
thermald-2841 1,472,670 inst_retired.any # 0.8 CPI
sshd-23111 977,374 inst_retired.any # 0.8 CPI
perf-24163 495,037 inst_retired.any # 4.7 CPI
gmain-2700 341,213 inst_retired.any # 1.1 CPI
sshd-23058 148,891 inst_retired.any # 1.3 CPI
rtkit-daemon-3288 71,210 inst_retired.any # 1.5 CPI
kworker/u16:1-18249 39,562 inst_retired.any # 2.9 CPI
rcu_sched-8 14,474 inst_retired.any # 1.3 CPI
kworker/0:2-19991 7,659 inst_retired.any # 6.1 CPI
kworker/4:1-15354 6,714 inst_retired.any # 1.3 CPI
rtkit-daemon-3289 4,839 inst_retired.any # 3.6 CPI
kworker/6:0-17528 3,321 inst_retired.any # 1.5 CPI
kworker/5:2-31362 3,215 inst_retired.any # 2.1 CPI
kworker/7:2-23145 3,173 inst_retired.any # 1.5 CPI
kworker/4:1H-1887 1,719 inst_retired.any # 3.1 CPI
watchdog/0-11 1,479 inst_retired.any # 3.2 CPI
watchdog/1-14 1,479 inst_retired.any # 3.2 CPI
watchdog/2-20 1,479 inst_retired.any # 2.6 CPI
watchdog/3-26 1,479 inst_retired.any # 2.7 CPI
watchdog/4-32 1,479 inst_retired.any # 3.1 CPI
watchdog/5-38 1,479 inst_retired.any # 3.0 CPI
watchdog/6-44 1,479 inst_retired.any # 1.4 CPI
watchdog/7-50 1,479 inst_retired.any # 1.4 CPI
kworker/u16:2-23146 1,408 inst_retired.any # 1.9 CPI
perf-24163 2,302,323 cycles
vmstat-23127 1,352,455 cycles
thermald-2841 1,161,140 cycles
sshd-23111 807,827 cycles
gmain-2700 375,535 cycles
sshd-23058 194,071 cycles
kworker/u16:1-18249 114,306 cycles
rtkit-daemon-3288 103,547 cycles
kworker/0:2-19991 46,550 cycles
rcu_sched-8 18,855 cycles
rtkit-daemon-3289 17,549 cycles
kworker/4:1-15354 8,812 cycles
kworker/5:2-31362 6,812 cycles
kworker/4:1H-1887 5,270 cycles
kworker/6:0-17528 5,111 cycles
kworker/7:2-23145 4,667 cycles
watchdog/0-11 4,663 cycles
watchdog/1-14 4,663 cycles
watchdog/4-32 4,626 cycles
watchdog/5-38 4,403 cycles
watchdog/3-26 3,936 cycles
watchdog/2-20 3,850 cycles
kworker/u16:2-23146 2,654 cycles
watchdog/6-44 2,017 cycles
watchdog/7-50 2,017 cycles

2.175726600 seconds time elapsed
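
The IPC and CPI columns above can be checked against the raw counts. The sketch below is a Python illustration that assumes IPC = inst_retired.any / cpu_clk_unhalted.thread and CPI = cycles / inst_retired.any, as inferred from how the events are grouped in the output. The helper names `ipc` and `cpi` are hypothetical and are not perf's own metric code.

```python
def ipc(inst_retired_any, cpu_clk_unhalted_thread):
    """Instructions per cycle (assumed definition, see lead-in)."""
    return inst_retired_any / cpu_clk_unhalted_thread


def cpi(cycles, inst_retired_any):
    """Cycles per instruction (assumed definition, see lead-in)."""
    return cycles / inst_retired_any


# Raw counts copied from the sample output. Each metric uses its own
# counter group, so the perf-24163 instruction counts differ between
# the IPC and CPI rows.
rows = {
    "vmstat-23127": {"ipc": (2000783, 1352455), "cpi": (1352455, 2000783)},
    "thermald-2841": {"ipc": (1472670, 1161140), "cpi": (1161140, 1472670)},
    "perf-24163": {"ipc": (483779, 2249872), "cpi": (2302323, 495037)},
}

for thread, counts in rows.items():
    print(f"{thread:>14}  {ipc(*counts['ipc']):.1f} IPC  {cpi(*counts['cpi']):.1f} CPI")
```

CPI is the reciprocal of IPC only when both are computed from the same counts, which is why the perf-24163 values (0.2 IPC, 4.7 CPI) are not exact inverses.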

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512482591-4646-12-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
diff 29734550 Tue Dec 05 07:03:11 MST 2017 Jin Yao <yao.jin@linux.intel.com> perf stat: Resort '--per-thread' result

Many threads are reported when '--per-thread' is enabled
globally.

1. Most of these threads are either not counted or have a
counting value of 0. This patch removes them from the output.

2. The remaining threads are re-sorted for display by counting
value, so the user can easily see the hottest threads.

For example, the new results would be:

root@skl:/tmp# perf stat --per-thread
^C
Performance counter stats for 'system wide':

perf-24165 4.302433 cpu-clock (msec) # 0.001 CPUs utilized
vmstat-23127 1.562215 cpu-clock (msec) # 0.000 CPUs utilized
irqbalance-2780 0.827851 cpu-clock (msec) # 0.000 CPUs utilized
sshd-23111 0.278308 cpu-clock (msec) # 0.000 CPUs utilized
thermald-2841 0.230880 cpu-clock (msec) # 0.000 CPUs utilized
sshd-23058 0.207306 cpu-clock (msec) # 0.000 CPUs utilized
kworker/0:2-19991 0.133983 cpu-clock (msec) # 0.000 CPUs utilized
kworker/u16:1-18249 0.125636 cpu-clock (msec) # 0.000 CPUs utilized
rcu_sched-8 0.085533 cpu-clock (msec) # 0.000 CPUs utilized
kworker/u16:2-23146 0.077139 cpu-clock (msec) # 0.000 CPUs utilized
gmain-2700 0.041789 cpu-clock (msec) # 0.000 CPUs utilized
kworker/4:1-15354 0.028370 cpu-clock (msec) # 0.000 CPUs utilized
kworker/6:0-17528 0.023895 cpu-clock (msec) # 0.000 CPUs utilized
kworker/4:1H-1887 0.013209 cpu-clock (msec) # 0.000 CPUs utilized
kworker/5:2-31362 0.011627 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/0-11 0.010892 cpu-clock (msec) # 0.000 CPUs utilized
kworker/3:2-12870 0.010220 cpu-clock (msec) # 0.000 CPUs utilized
ksoftirqd/0-7 0.008869 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/1-14 0.008476 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/7-50 0.002944 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/3-26 0.002893 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/4-32 0.002759 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/2-20 0.002429 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/6-44 0.001491 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/5-38 0.001477 cpu-clock (msec) # 0.000 CPUs utilized
rcu_sched-8 10 context-switches # 0.117 M/sec
kworker/u16:1-18249 7 context-switches # 0.056 M/sec
sshd-23111 4 context-switches # 0.014 M/sec
vmstat-23127 4 context-switches # 0.003 M/sec
perf-24165 4 context-switches # 0.930 K/sec
kworker/0:2-19991 3 context-switches # 0.022 M/sec
kworker/u16:2-23146 3 context-switches # 0.039 M/sec
kworker/4:1-15354 2 context-switches # 0.070 M/sec
kworker/6:0-17528 2 context-switches # 0.084 M/sec
sshd-23058 2 context-switches # 0.010 M/sec
ksoftirqd/0-7 1 context-switches # 0.113 M/sec
watchdog/0-11 1 context-switches # 0.092 M/sec
watchdog/1-14 1 context-switches # 0.118 M/sec
watchdog/2-20 1 context-switches # 0.412 M/sec
watchdog/3-26 1 context-switches # 0.346 M/sec
watchdog/4-32 1 context-switches # 0.362 M/sec
watchdog/5-38 1 context-switches # 0.677 M/sec
watchdog/6-44 1 context-switches # 0.671 M/sec
watchdog/7-50 1 context-switches # 0.340 M/sec
kworker/4:1H-1887 1 context-switches # 0.076 M/sec
thermald-2841 1 context-switches # 0.004 M/sec
gmain-2700 1 context-switches # 0.024 M/sec
irqbalance-2780 1 context-switches # 0.001 M/sec
kworker/3:2-12870 1 context-switches # 0.098 M/sec
kworker/5:2-31362 1 context-switches # 0.086 M/sec
kworker/u16:1-18249 2 cpu-migrations # 0.016 M/sec
kworker/u16:2-23146 2 cpu-migrations # 0.026 M/sec
rcu_sched-8 1 cpu-migrations # 0.012 M/sec
sshd-23058 1 cpu-migrations # 0.005 M/sec
perf-24165 8,833,385 cycles # 2.053 GHz
vmstat-23127 1,702,699 cycles # 1.090 GHz
irqbalance-2780 739,847 cycles # 0.894 GHz
sshd-23111 269,506 cycles # 0.968 GHz
thermald-2841 204,556 cycles # 0.886 GHz
sshd-23058 158,780 cycles # 0.766 GHz
kworker/0:2-19991 112,981 cycles # 0.843 GHz
kworker/u16:1-18249 100,926 cycles # 0.803 GHz
rcu_sched-8 74,024 cycles # 0.865 GHz
kworker/u16:2-23146 55,984 cycles # 0.726 GHz
gmain-2700 34,278 cycles # 0.820 GHz
kworker/4:1-15354 20,665 cycles # 0.728 GHz
kworker/6:0-17528 16,445 cycles # 0.688 GHz
kworker/5:2-31362 9,492 cycles # 0.816 GHz
watchdog/3-26 8,695 cycles # 3.006 GHz
kworker/4:1H-1887 8,238 cycles # 0.624 GHz
watchdog/4-32 7,580 cycles # 2.747 GHz
kworker/3:2-12870 7,306 cycles # 0.715 GHz
watchdog/2-20 7,274 cycles # 2.995 GHz
watchdog/0-11 6,988 cycles # 0.642 GHz
ksoftirqd/0-7 6,376 cycles # 0.719 GHz
watchdog/1-14 5,340 cycles # 0.630 GHz
watchdog/5-38 4,061 cycles # 2.749 GHz
watchdog/6-44 3,976 cycles # 2.667 GHz
watchdog/7-50 3,418 cycles # 1.161 GHz
vmstat-23127 2,511,699 instructions # 1.48 insn per cycle
perf-24165 1,829,908 instructions # 0.21 insn per cycle
irqbalance-2780 1,190,204 instructions # 1.61 insn per cycle
thermald-2841 143,544 instructions # 0.70 insn per cycle
sshd-23111 128,138 instructions # 0.48 insn per cycle
sshd-23058 57,654 instructions # 0.36 insn per cycle
rcu_sched-8 44,063 instructions # 0.60 insn per cycle
kworker/u16:1-18249 42,551 instructions # 0.42 insn per cycle
kworker/0:2-19991 25,873 instructions # 0.23 insn per cycle
kworker/u16:2-23146 21,407 instructions # 0.38 insn per cycle
gmain-2700 13,691 instructions # 0.40 insn per cycle
kworker/4:1-15354 12,964 instructions # 0.63 insn per cycle
kworker/6:0-17528 10,034 instructions # 0.61 insn per cycle
kworker/5:2-31362 5,203 instructions # 0.55 insn per cycle
kworker/3:2-12870 4,866 instructions # 0.67 insn per cycle
kworker/4:1H-1887 3,586 instructions # 0.44 insn per cycle
ksoftirqd/0-7 3,463 instructions # 0.54 insn per cycle
watchdog/0-11 3,135 instructions # 0.45 insn per cycle
watchdog/1-14 3,135 instructions # 0.59 insn per cycle
watchdog/2-20 3,135 instructions # 0.43 insn per cycle
watchdog/3-26 3,135 instructions # 0.36 insn per cycle
watchdog/4-32 3,135 instructions # 0.41 insn per cycle
watchdog/5-38 3,135 instructions # 0.77 insn per cycle
watchdog/6-44 3,135 instructions # 0.79 insn per cycle
watchdog/7-50 3,135 instructions # 0.92 insn per cycle
vmstat-23127 539,181 branches # 345.139 M/sec
perf-24165 375,364 branches # 87.245 M/sec
irqbalance-2780 262,092 branches # 316.593 M/sec
thermald-2841 31,611 branches # 136.915 M/sec
sshd-23111 21,874 branches # 78.596 M/sec
sshd-23058 10,682 branches # 51.528 M/sec
rcu_sched-8 8,693 branches # 101.633 M/sec
kworker/u16:1-18249 7,891 branches # 62.808 M/sec
kworker/0:2-19991 5,761 branches # 42.998 M/sec
kworker/u16:2-23146 4,099 branches # 53.138 M/sec
kworker/4:1-15354 2,755 branches # 97.110 M/sec
gmain-2700 2,638 branches # 63.127 M/sec
kworker/6:0-17528 2,216 branches # 92.739 M/sec
kworker/5:2-31362 1,132 branches # 97.360 M/sec
kworker/3:2-12870 1,081 branches # 105.773 M/sec
kworker/4:1H-1887 725 branches # 54.887 M/sec
ksoftirqd/0-7 707 branches # 79.716 M/sec
watchdog/0-11 652 branches # 59.860 M/sec
watchdog/1-14 652 branches # 76.923 M/sec
watchdog/2-20 652 branches # 268.423 M/sec
watchdog/3-26 652 branches # 225.372 M/sec
watchdog/4-32 652 branches # 236.318 M/sec
watchdog/5-38 652 branches # 441.435 M/sec
watchdog/6-44 652 branches # 437.290 M/sec
watchdog/7-50 652 branches # 221.467 M/sec
vmstat-23127 8,960 branch-misses # 1.66% of all branches
irqbalance-2780 3,047 branch-misses # 1.16% of all branches
perf-24165 2,876 branch-misses # 0.77% of all branches
sshd-23111 1,843 branch-misses # 8.43% of all branches
thermald-2841 1,444 branch-misses # 4.57% of all branches
sshd-23058 1,379 branch-misses # 12.91% of all branches
kworker/u16:1-18249 982 branch-misses # 12.44% of all branches
rcu_sched-8 893 branch-misses # 10.27% of all branches
kworker/u16:2-23146 578 branch-misses # 14.10% of all branches
kworker/0:2-19991 376 branch-misses # 6.53% of all branches
gmain-2700 280 branch-misses # 10.61% of all branches
kworker/6:0-17528 196 branch-misses # 8.84% of all branches
kworker/4:1-15354 187 branch-misses # 6.79% of all branches
kworker/5:2-31362 123 branch-misses # 10.87% of all branches
watchdog/0-11 95 branch-misses # 14.57% of all branches
watchdog/4-32 89 branch-misses # 13.65% of all branches
kworker/3:2-12870 80 branch-misses # 7.40% of all branches
watchdog/3-26 61 branch-misses # 9.36% of all branches
kworker/4:1H-1887 60 branch-misses # 8.28% of all branches
watchdog/2-20 52 branch-misses # 7.98% of all branches
ksoftirqd/0-7 47 branch-misses # 6.65% of all branches
watchdog/1-14 46 branch-misses # 7.06% of all branches
watchdog/7-50 13 branch-misses # 1.99% of all branches
watchdog/5-38 8 branch-misses # 1.23% of all branches
watchdog/6-44 7 branch-misses # 1.07% of all branches

3.695150786 seconds time elapsed

root@skl:/tmp# perf stat --per-thread -M IPC,CPI
^C

Performance counter stats for 'system wide':

vmstat-23127 2,000,783 inst_retired.any # 1.5 IPC
thermald-2841 1,472,670 inst_retired.any # 1.3 IPC
sshd-23111 977,374 inst_retired.any # 1.2 IPC
perf-24163 483,779 inst_retired.any # 0.2 IPC
gmain-2700 341,213 inst_retired.any # 0.9 IPC
sshd-23058 148,891 inst_retired.any # 0.8 IPC
rtkit-daemon-3288 71,210 inst_retired.any # 0.7 IPC
kworker/u16:1-18249 39,562 inst_retired.any # 0.3 IPC
rcu_sched-8 14,474 inst_retired.any # 0.8 IPC
kworker/0:2-19991 7,659 inst_retired.any # 0.2 IPC
kworker/4:1-15354 6,714 inst_retired.any # 0.8 IPC
rtkit-daemon-3289 4,839 inst_retired.any # 0.3 IPC
kworker/6:0-17528 3,321 inst_retired.any # 0.6 IPC
kworker/5:2-31362 3,215 inst_retired.any # 0.5 IPC
kworker/7:2-23145 3,173 inst_retired.any # 0.7 IPC
kworker/4:1H-1887 1,719 inst_retired.any # 0.3 IPC
watchdog/0-11 1,479 inst_retired.any # 0.3 IPC
watchdog/1-14 1,479 inst_retired.any # 0.3 IPC
watchdog/2-20 1,479 inst_retired.any # 0.4 IPC
watchdog/3-26 1,479 inst_retired.any # 0.4 IPC
watchdog/4-32 1,479 inst_retired.any # 0.3 IPC
watchdog/5-38 1,479 inst_retired.any # 0.3 IPC
watchdog/6-44 1,479 inst_retired.any # 0.7 IPC
watchdog/7-50 1,479 inst_retired.any # 0.7 IPC
kworker/u16:2-23146 1,408 inst_retired.any # 0.5 IPC
perf-24163 2,249,872 cpu_clk_unhalted.thread
vmstat-23127 1,352,455 cpu_clk_unhalted.thread
thermald-2841 1,161,140 cpu_clk_unhalted.thread
sshd-23111 807,827 cpu_clk_unhalted.thread
gmain-2700 375,535 cpu_clk_unhalted.thread
sshd-23058 194,071 cpu_clk_unhalted.thread
kworker/u16:1-18249 114,306 cpu_clk_unhalted.thread
rtkit-daemon-3288 103,547 cpu_clk_unhalted.thread
kworker/0:2-19991 46,550 cpu_clk_unhalted.thread
rcu_sched-8 18,855 cpu_clk_unhalted.thread
rtkit-daemon-3289 17,549 cpu_clk_unhalted.thread
kworker/4:1-15354 8,812 cpu_clk_unhalted.thread
kworker/5:2-31362 6,812 cpu_clk_unhalted.thread
kworker/4:1H-1887 5,270 cpu_clk_unhalted.thread
kworker/6:0-17528 5,111 cpu_clk_unhalted.thread
kworker/7:2-23145 4,667 cpu_clk_unhalted.thread
watchdog/0-11 4,663 cpu_clk_unhalted.thread
watchdog/1-14 4,663 cpu_clk_unhalted.thread
watchdog/4-32 4,626 cpu_clk_unhalted.thread
watchdog/5-38 4,403 cpu_clk_unhalted.thread
watchdog/3-26 3,936 cpu_clk_unhalted.thread
watchdog/2-20 3,850 cpu_clk_unhalted.thread
kworker/u16:2-23146 2,654 cpu_clk_unhalted.thread
watchdog/6-44 2,017 cpu_clk_unhalted.thread
watchdog/7-50 2,017 cpu_clk_unhalted.thread
vmstat-23127 2,000,783 inst_retired.any # 0.7 CPI
thermald-2841 1,472,670 inst_retired.any # 0.8 CPI
sshd-23111 977,374 inst_retired.any # 0.8 CPI
perf-24163 495,037 inst_retired.any # 4.7 CPI
gmain-2700 341,213 inst_retired.any # 1.1 CPI
sshd-23058 148,891 inst_retired.any # 1.3 CPI
rtkit-daemon-3288 71,210 inst_retired.any # 1.5 CPI
kworker/u16:1-18249 39,562 inst_retired.any # 2.9 CPI
rcu_sched-8 14,474 inst_retired.any # 1.3 CPI
kworker/0:2-19991 7,659 inst_retired.any # 6.1 CPI
kworker/4:1-15354 6,714 inst_retired.any # 1.3 CPI
rtkit-daemon-3289 4,839 inst_retired.any # 3.6 CPI
kworker/6:0-17528 3,321 inst_retired.any # 1.5 CPI
kworker/5:2-31362 3,215 inst_retired.any # 2.1 CPI
kworker/7:2-23145 3,173 inst_retired.any # 1.5 CPI
kworker/4:1H-1887 1,719 inst_retired.any # 3.1 CPI
watchdog/0-11 1,479 inst_retired.any # 3.2 CPI
watchdog/1-14 1,479 inst_retired.any # 3.2 CPI
watchdog/2-20 1,479 inst_retired.any # 2.6 CPI
watchdog/3-26 1,479 inst_retired.any # 2.7 CPI
watchdog/4-32 1,479 inst_retired.any # 3.1 CPI
watchdog/5-38 1,479 inst_retired.any # 3.0 CPI
watchdog/6-44 1,479 inst_retired.any # 1.4 CPI
watchdog/7-50 1,479 inst_retired.any # 1.4 CPI
kworker/u16:2-23146 1,408 inst_retired.any # 1.9 CPI
perf-24163 2,302,323 cycles
vmstat-23127 1,352,455 cycles
thermald-2841 1,161,140 cycles
sshd-23111 807,827 cycles
gmain-2700 375,535 cycles
sshd-23058 194,071 cycles
kworker/u16:1-18249 114,306 cycles
rtkit-daemon-3288 103,547 cycles
kworker/0:2-19991 46,550 cycles
rcu_sched-8 18,855 cycles
rtkit-daemon-3289 17,549 cycles
kworker/4:1-15354 8,812 cycles
kworker/5:2-31362 6,812 cycles
kworker/4:1H-1887 5,270 cycles
kworker/6:0-17528 5,111 cycles
kworker/7:2-23145 4,667 cycles
watchdog/0-11 4,663 cycles
watchdog/1-14 4,663 cycles
watchdog/4-32 4,626 cycles
watchdog/5-38 4,403 cycles
watchdog/3-26 3,936 cycles
watchdog/2-20 3,850 cycles
kworker/u16:2-23146 2,654 cycles
watchdog/6-44 2,017 cycles
watchdog/7-50 2,017 cycles

2.175726600 seconds time elapsed
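
As a rough check on the metric columns above, the sketch below recomputes IPC and CPI from the raw counts. It assumes IPC is inst_retired.any divided by cpu_clk_unhalted.thread and CPI is cycles divided by inst_retired.any. That is inferred from how the events are grouped in the listing, not taken from perf's metric definitions.

```python
def ipc(instructions: int, cycles: int) -> float:
    """Instructions per cycle (assumed definition, see the note above)."""
    return instructions / cycles


def cpi(instructions: int, cycles: int) -> float:
    """Cycles per instruction, the reciprocal of IPC."""
    return cycles / instructions


# Counts copied from the vmstat-23127 rows of the listing above.
vmstat_inst = 2_000_783    # inst_retired.any
vmstat_cycles = 1_352_455  # cpu_clk_unhalted.thread (same value as cycles)

print(f"vmstat-23127: {ipc(vmstat_inst, vmstat_cycles):.1f} IPC, "
      f"{cpi(vmstat_inst, vmstat_cycles):.1f} CPI")
```

With these inputs the values round to 1.5 IPC and 0.7 CPI, which agrees with the vmstat-23127 rows.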

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512482591-4646-12-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
probe-event.c diff 4bf6dcaa Sat Aug 06 08:51:26 MDT 2022 Christophe JAILLET <christophe.jaillet@wanadoo.fr> perf probe: Fix an error handling path in 'parse_perf_probe_command()'

If a memory allocation fails, we should branch to the error handling path
in order to free some resources allocated a few lines above.

Fixes: 15354d54698648e2 ("perf probe: Generate event name with line number")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: kernel-janitors@vger.kernel.org
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/b71bcb01fa0c7b9778647235c3ab490f699ba278.1659797452.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
diff 15354d54 Mon Nov 18 01:12:20 MST 2019 Masami Hiramatsu <mhiramat@kernel.org> perf probe: Generate event name with line number

Generate the event name from the function name and line number, as
<function>_L<line_number>. Note that this applies only to a new event
that is defined by a line number within a function (except for line 0).

If there is already an event on the same line, you have to use the
"-f" option. In that case, the new event gets a "_1" suffix.

e.g.
# perf probe -a kernel_read:2
Added new event:
probe:kernel_read_L2 (on kernel_read:2)

You can now use it in all perf tools, such as:

perf record -e probe:kernel_read_L2 -aR sleep 1

But if we omit the line number or give line 0, the event name
has no suffix.

# perf probe -a kernel_read:0
Added new event:
probe:kernel_read (on kernel_read)

You can now use it in all perf tools, such as:

perf record -e probe:kernel_read -aR sleep 1

probe:kernel_read (on kernel_read@linux-5.0.0/fs/read_write.c)
probe:kernel_read_L2 (on kernel_read:2@linux-5.0.0/fs/read_write.c)
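
The naming rule described above can be sketched as follows. This is a Python illustration only, not perf's C implementation. The function name `probe_event_name`, the `existing` parameter, and the behaviour for a second or later collision (counting up past "_1") are assumptions made for the example.

```python
def probe_event_name(function, line=0, existing=()):
    """Return an event name following the <function>_L<line_number> rule.

    `existing` holds event names already defined; it models the case
    where "-f" is used to add another event on the same line.
    """
    # Line 0 (or no line number) yields the bare function name.
    base = function if line == 0 else f"{function}_L{line}"
    if base not in existing:
        return base
    # Collision: append a numeric suffix, starting at "_1".
    # Continuing past "_1" is an assumption of this sketch.
    suffix = 1
    while f"{base}_{suffix}" in existing:
        suffix += 1
    return f"{base}_{suffix}"


print(probe_event_name("kernel_read", 2))
print(probe_event_name("kernel_read", 0))
print(probe_event_name("kernel_read", 2, existing={"kernel_read_L2"}))
```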

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
Link: http://lore.kernel.org/lkml/157406474026.24476.2828897745502059569.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
/linux-master/drivers/md/
dm-thin.c diff e8088073 Fri Dec 21 13:23:31 MST 2012 Joe Thornber <ejt@redhat.com> dm thin: fix race between simultaneous io and discards to same block

There is a race when discard bios and non-discard bios are issued
simultaneously to the same block.

Discard support is expensive for all thin devices precisely because you
have to be careful to quiesce the area you're discarding. DM thin must
handle this conflicting IO pattern (simultaneous non-discard vs discard)
even though a sane application shouldn't be issuing such IO.

The race manifests as follows:

1. A non-discard bio is mapped in thin_bio_map.
This doesn't lock out parallel activity to the same block.

2. A discard bio is issued to the same block as the non-discard bio.

3. The discard bio is locked in a dm_bio_prison_cell in process_discard
to lock out parallel activity against the same block.

4. The non-discard bio's mapping continues and its all_io_entry is
incremented so the bio is accounted for in the thin pool's all_io_ds
which is a dm_deferred_set used to track time locality of non-discard IO.

5. The non-discard bio is finally locked in a dm_bio_prison_cell in
process_bio.

The race can result in deadlock, leaving the block layer hanging waiting
for completion of a discard bio that never completes, e.g.:

INFO: task ruby:15354 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
ruby D ffffffff8160f0e0 0 15354 15314 0x00000000
ffff8802fb08bc58 0000000000000082 ffff8802fb08bfd8 0000000000012900
ffff8802fb08a010 0000000000012900 0000000000012900 0000000000012900
ffff8802fb08bfd8 0000000000012900 ffff8803324b9480 ffff88032c6f14c0
Call Trace:
[<ffffffff814e5a19>] schedule+0x29/0x70
[<ffffffff814e3d85>] schedule_timeout+0x195/0x220
[<ffffffffa06b9bc1>] ? _dm_request+0x111/0x160 [dm_mod]
[<ffffffff814e589e>] wait_for_common+0x11e/0x190
[<ffffffff8107a170>] ? try_to_wake_up+0x2b0/0x2b0
[<ffffffff814e59ed>] wait_for_completion+0x1d/0x20
[<ffffffff81233289>] blkdev_issue_discard+0x219/0x260
[<ffffffff81233e79>] blkdev_ioctl+0x6e9/0x7b0
[<ffffffff8119a65c>] block_ioctl+0x3c/0x40
[<ffffffff8117539c>] do_vfs_ioctl+0x8c/0x340
[<ffffffff8119a547>] ? block_llseek+0x67/0xb0
[<ffffffff811756f1>] sys_ioctl+0xa1/0xb0
[<ffffffff810561f6>] ? sys_rt_sigprocmask+0x86/0xd0
[<ffffffff814ef099>] system_call_fastpath+0x16/0x1b

The thinp-test-suite's test_discard_random_sectors reliably hits this
deadlock on fast SSD storage.

The fix for this race is that the all_io_entry for a bio must be
incremented whilst the dm_bio_prison_cell is held for the bio's
associated virtual and physical blocks. That cell locking wasn't
occurring early enough in thin_bio_map. This patch fixes this.

Care is taken to always call the new function inc_all_io_entry() with
the relevant cells locked, but they are generally unlocked before
calling issue() to try to avoid holding the cells locked across
generic_submit_request.
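
The ordering rule described above (increment while the cell is held, release the cell, then issue) can be illustrated with a toy model. This is not the dm-thin code: `ToyBlock`, `map_non_discard`, and the use of a plain `threading.Lock` in place of a dm_bio_prison_cell are all assumptions made for the sketch, and `inc_all_io_entry` here is a stand-in that only borrows the name.

```python
import threading


class ToyBlock:
    """Toy stand-in for one block; not the dm-thin data structures."""

    def __init__(self):
        self.cell = threading.Lock()  # models the per-block cell lock
        self.in_flight = 0            # models the all_io_entry accounting
        self.events = []


def inc_all_io_entry(block):
    # The caller must already hold the cell, mirroring the rule above.
    assert block.cell.locked(), "cell must be held"
    block.in_flight += 1
    block.events.append("inc")


def map_non_discard(block):
    with block.cell:
        inc_all_io_entry(block)
    # The cell is released before the bio is issued.
    block.events.append("issue")


block = ToyBlock()
threads = [threading.Thread(target=map_non_discard, args=(block,))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(block.in_flight)
```

Because every increment happens under the lock, anything else that takes the same lock sees a consistent count. That is the property the fix relies on in this simplified picture.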

Also, now that thin_bio_map may lock bios in a cell, process_bio() is no
longer the only thread that will do so. Because of this we must be sure
to use cell_defer_except() to release all non-holder entries, that
were added by the other thread, because they must be deferred.

This patch depends on "dm thin: replace dm_cell_release_singleton with
cell_defer_except".

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Cc: stable@vger.kernel.org
/linux-master/tools/perf/
builtin-stat.c diff 29734550 Tue Dec 05 07:03:11 MST 2017 Jin Yao <yao.jin@linux.intel.com> perf stat: Resort '--per-thread' result

There are many threads reported if we enable '--per-thread'
globally.

1. Most of the threads are either not counted or have a counting
value of 0. This patch removes those threads from the output.

2. We also re-sort the displayed threads by counting value, so that
the user can easily see the hottest threads.
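
The two steps above amount to a filter followed by a descending sort. The sketch below shows that idea in Python; it is not the builtin-stat.c implementation. The function name `resort_per_thread` and the `idle-example-1` entry are hypothetical, and the other counts are cycle values taken from the example output in this commit message.

```python
def resort_per_thread(counts):
    """Drop zero-valued threads, then order the rest by count, highest first."""
    counted = [(thread, value) for thread, value in counts.items() if value > 0]
    return sorted(counted, key=lambda item: item[1], reverse=True)


cycles = {
    "sshd-23111": 269_506,
    "idle-example-1": 0,        # hypothetical thread with nothing counted
    "perf-24165": 8_833_385,
    "vmstat-23127": 1_702_699,
}

for thread, value in resort_per_thread(cycles):
    print(f"{thread:<16}{value:>12,} cycles")
```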

For example, the new results would be:

root@skl:/tmp# perf stat --per-thread
^C
Performance counter stats for 'system wide':

perf-24165 4.302433 cpu-clock (msec) # 0.001 CPUs utilized
vmstat-23127 1.562215 cpu-clock (msec) # 0.000 CPUs utilized
irqbalance-2780 0.827851 cpu-clock (msec) # 0.000 CPUs utilized
sshd-23111 0.278308 cpu-clock (msec) # 0.000 CPUs utilized
thermald-2841 0.230880 cpu-clock (msec) # 0.000 CPUs utilized
sshd-23058 0.207306 cpu-clock (msec) # 0.000 CPUs utilized
kworker/0:2-19991 0.133983 cpu-clock (msec) # 0.000 CPUs utilized
kworker/u16:1-18249 0.125636 cpu-clock (msec) # 0.000 CPUs utilized
rcu_sched-8 0.085533 cpu-clock (msec) # 0.000 CPUs utilized
kworker/u16:2-23146 0.077139 cpu-clock (msec) # 0.000 CPUs utilized
gmain-2700 0.041789 cpu-clock (msec) # 0.000 CPUs utilized
kworker/4:1-15354 0.028370 cpu-clock (msec) # 0.000 CPUs utilized
kworker/6:0-17528 0.023895 cpu-clock (msec) # 0.000 CPUs utilized
kworker/4:1H-1887 0.013209 cpu-clock (msec) # 0.000 CPUs utilized
kworker/5:2-31362 0.011627 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/0-11 0.010892 cpu-clock (msec) # 0.000 CPUs utilized
kworker/3:2-12870 0.010220 cpu-clock (msec) # 0.000 CPUs utilized
ksoftirqd/0-7 0.008869 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/1-14 0.008476 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/7-50 0.002944 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/3-26 0.002893 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/4-32 0.002759 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/2-20 0.002429 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/6-44 0.001491 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/5-38 0.001477 cpu-clock (msec) # 0.000 CPUs utilized
rcu_sched-8 10 context-switches # 0.117 M/sec
kworker/u16:1-18249 7 context-switches # 0.056 M/sec
sshd-23111 4 context-switches # 0.014 M/sec
vmstat-23127 4 context-switches # 0.003 M/sec
perf-24165 4 context-switches # 0.930 K/sec
kworker/0:2-19991 3 context-switches # 0.022 M/sec
kworker/u16:2-23146 3 context-switches # 0.039 M/sec
kworker/4:1-15354 2 context-switches # 0.070 M/sec
kworker/6:0-17528 2 context-switches # 0.084 M/sec
sshd-23058 2 context-switches # 0.010 M/sec
ksoftirqd/0-7 1 context-switches # 0.113 M/sec
watchdog/0-11 1 context-switches # 0.092 M/sec
watchdog/1-14 1 context-switches # 0.118 M/sec
watchdog/2-20 1 context-switches # 0.412 M/sec
watchdog/3-26 1 context-switches # 0.346 M/sec
watchdog/4-32 1 context-switches # 0.362 M/sec
watchdog/5-38 1 context-switches # 0.677 M/sec
watchdog/6-44 1 context-switches # 0.671 M/sec
watchdog/7-50 1 context-switches # 0.340 M/sec
kworker/4:1H-1887 1 context-switches # 0.076 M/sec
thermald-2841 1 context-switches # 0.004 M/sec
gmain-2700 1 context-switches # 0.024 M/sec
irqbalance-2780 1 context-switches # 0.001 M/sec
kworker/3:2-12870 1 context-switches # 0.098 M/sec
kworker/5:2-31362 1 context-switches # 0.086 M/sec
kworker/u16:1-18249 2 cpu-migrations # 0.016 M/sec
kworker/u16:2-23146 2 cpu-migrations # 0.026 M/sec
rcu_sched-8 1 cpu-migrations # 0.012 M/sec
sshd-23058 1 cpu-migrations # 0.005 M/sec
perf-24165 8,833,385 cycles # 2.053 GHz
vmstat-23127 1,702,699 cycles # 1.090 GHz
irqbalance-2780 739,847 cycles # 0.894 GHz
sshd-23111 269,506 cycles # 0.968 GHz
thermald-2841 204,556 cycles # 0.886 GHz
sshd-23058 158,780 cycles # 0.766 GHz
kworker/0:2-19991 112,981 cycles # 0.843 GHz
kworker/u16:1-18249 100,926 cycles # 0.803 GHz
rcu_sched-8 74,024 cycles # 0.865 GHz
kworker/u16:2-23146 55,984 cycles # 0.726 GHz
gmain-2700 34,278 cycles # 0.820 GHz
kworker/4:1-15354 20,665 cycles # 0.728 GHz
kworker/6:0-17528 16,445 cycles # 0.688 GHz
kworker/5:2-31362 9,492 cycles # 0.816 GHz
watchdog/3-26 8,695 cycles # 3.006 GHz
kworker/4:1H-1887 8,238 cycles # 0.624 GHz
watchdog/4-32 7,580 cycles # 2.747 GHz
kworker/3:2-12870 7,306 cycles # 0.715 GHz
watchdog/2-20 7,274 cycles # 2.995 GHz
watchdog/0-11 6,988 cycles # 0.642 GHz
ksoftirqd/0-7 6,376 cycles # 0.719 GHz
watchdog/1-14 5,340 cycles # 0.630 GHz
watchdog/5-38 4,061 cycles # 2.749 GHz
watchdog/6-44 3,976 cycles # 2.667 GHz
watchdog/7-50 3,418 cycles # 1.161 GHz
vmstat-23127 2,511,699 instructions # 1.48 insn per cycle
perf-24165 1,829,908 instructions # 0.21 insn per cycle
irqbalance-2780 1,190,204 instructions # 1.61 insn per cycle
thermald-2841 143,544 instructions # 0.70 insn per cycle
sshd-23111 128,138 instructions # 0.48 insn per cycle
sshd-23058 57,654 instructions # 0.36 insn per cycle
rcu_sched-8 44,063 instructions # 0.60 insn per cycle
kworker/u16:1-18249 42,551 instructions # 0.42 insn per cycle
kworker/0:2-19991 25,873 instructions # 0.23 insn per cycle
kworker/u16:2-23146 21,407 instructions # 0.38 insn per cycle
gmain-2700 13,691 instructions # 0.40 insn per cycle
kworker/4:1-15354 12,964 instructions # 0.63 insn per cycle
kworker/6:0-17528 10,034 instructions # 0.61 insn per cycle
kworker/5:2-31362 5,203 instructions # 0.55 insn per cycle
kworker/3:2-12870 4,866 instructions # 0.67 insn per cycle
kworker/4:1H-1887 3,586 instructions # 0.44 insn per cycle
ksoftirqd/0-7 3,463 instructions # 0.54 insn per cycle
watchdog/0-11 3,135 instructions # 0.45 insn per cycle
watchdog/1-14 3,135 instructions # 0.59 insn per cycle
watchdog/2-20 3,135 instructions # 0.43 insn per cycle
watchdog/3-26 3,135 instructions # 0.36 insn per cycle
watchdog/4-32 3,135 instructions # 0.41 insn per cycle
watchdog/5-38 3,135 instructions # 0.77 insn per cycle
watchdog/6-44 3,135 instructions # 0.79 insn per cycle
watchdog/7-50 3,135 instructions # 0.92 insn per cycle
vmstat-23127 539,181 branches # 345.139 M/sec
perf-24165 375,364 branches # 87.245 M/sec
irqbalance-2780 262,092 branches # 316.593 M/sec
thermald-2841 31,611 branches # 136.915 M/sec
sshd-23111 21,874 branches # 78.596 M/sec
sshd-23058 10,682 branches # 51.528 M/sec
rcu_sched-8 8,693 branches # 101.633 M/sec
kworker/u16:1-18249 7,891 branches # 62.808 M/sec
kworker/0:2-19991 5,761 branches # 42.998 M/sec
kworker/u16:2-23146 4,099 branches # 53.138 M/sec
kworker/4:1-15354 2,755 branches # 97.110 M/sec
gmain-2700 2,638 branches # 63.127 M/sec
kworker/6:0-17528 2,216 branches # 92.739 M/sec
kworker/5:2-31362 1,132 branches # 97.360 M/sec
kworker/3:2-12870 1,081 branches # 105.773 M/sec
kworker/4:1H-1887 725 branches # 54.887 M/sec
ksoftirqd/0-7 707 branches # 79.716 M/sec
watchdog/0-11 652 branches # 59.860 M/sec
watchdog/1-14 652 branches # 76.923 M/sec
watchdog/2-20 652 branches # 268.423 M/sec
watchdog/3-26 652 branches # 225.372 M/sec
watchdog/4-32 652 branches # 236.318 M/sec
watchdog/5-38 652 branches # 441.435 M/sec
watchdog/6-44 652 branches # 437.290 M/sec
watchdog/7-50 652 branches # 221.467 M/sec
vmstat-23127 8,960 branch-misses # 1.66% of all branches
irqbalance-2780 3,047 branch-misses # 1.16% of all branches
perf-24165 2,876 branch-misses # 0.77% of all branches
sshd-23111 1,843 branch-misses # 8.43% of all branches
thermald-2841 1,444 branch-misses # 4.57% of all branches
sshd-23058 1,379 branch-misses # 12.91% of all branches
kworker/u16:1-18249 982 branch-misses # 12.44% of all branches
rcu_sched-8 893 branch-misses # 10.27% of all branches
kworker/u16:2-23146 578 branch-misses # 14.10% of all branches
kworker/0:2-19991 376 branch-misses # 6.53% of all branches
gmain-2700 280 branch-misses # 10.61% of all branches
kworker/6:0-17528 196 branch-misses # 8.84% of all branches
kworker/4:1-15354 187 branch-misses # 6.79% of all branches
kworker/5:2-31362 123 branch-misses # 10.87% of all branches
watchdog/0-11 95 branch-misses # 14.57% of all branches
watchdog/4-32 89 branch-misses # 13.65% of all branches
kworker/3:2-12870 80 branch-misses # 7.40% of all branches
watchdog/3-26 61 branch-misses # 9.36% of all branches
kworker/4:1H-1887 60 branch-misses # 8.28% of all branches
watchdog/2-20 52 branch-misses # 7.98% of all branches
ksoftirqd/0-7 47 branch-misses # 6.65% of all branches
watchdog/1-14 46 branch-misses # 7.06% of all branches
watchdog/7-50 13 branch-misses # 1.99% of all branches
watchdog/5-38 8 branch-misses # 1.23% of all branches
watchdog/6-44 7 branch-misses # 1.07% of all branches

3.695150786 seconds time elapsed

root@skl:/tmp# perf stat --per-thread -M IPC,CPI
^C

Performance counter stats for 'system wide':

vmstat-23127 2,000,783 inst_retired.any # 1.5 IPC
thermald-2841 1,472,670 inst_retired.any # 1.3 IPC
sshd-23111 977,374 inst_retired.any # 1.2 IPC
perf-24163 483,779 inst_retired.any # 0.2 IPC
gmain-2700 341,213 inst_retired.any # 0.9 IPC
sshd-23058 148,891 inst_retired.any # 0.8 IPC
rtkit-daemon-3288 71,210 inst_retired.any # 0.7 IPC
kworker/u16:1-18249 39,562 inst_retired.any # 0.3 IPC
rcu_sched-8 14,474 inst_retired.any # 0.8 IPC
kworker/0:2-19991 7,659 inst_retired.any # 0.2 IPC
kworker/4:1-15354 6,714 inst_retired.any # 0.8 IPC
rtkit-daemon-3289 4,839 inst_retired.any # 0.3 IPC
kworker/6:0-17528 3,321 inst_retired.any # 0.6 IPC
kworker/5:2-31362 3,215 inst_retired.any # 0.5 IPC
kworker/7:2-23145 3,173 inst_retired.any # 0.7 IPC
kworker/4:1H-1887 1,719 inst_retired.any # 0.3 IPC
watchdog/0-11 1,479 inst_retired.any # 0.3 IPC
watchdog/1-14 1,479 inst_retired.any # 0.3 IPC
watchdog/2-20 1,479 inst_retired.any # 0.4 IPC
watchdog/3-26 1,479 inst_retired.any # 0.4 IPC
watchdog/4-32 1,479 inst_retired.any # 0.3 IPC
watchdog/5-38 1,479 inst_retired.any # 0.3 IPC
watchdog/6-44 1,479 inst_retired.any # 0.7 IPC
watchdog/7-50 1,479 inst_retired.any # 0.7 IPC
kworker/u16:2-23146 1,408 inst_retired.any # 0.5 IPC
perf-24163 2,249,872 cpu_clk_unhalted.thread
vmstat-23127 1,352,455 cpu_clk_unhalted.thread
thermald-2841 1,161,140 cpu_clk_unhalted.thread
sshd-23111 807,827 cpu_clk_unhalted.thread
gmain-2700 375,535 cpu_clk_unhalted.thread
sshd-23058 194,071 cpu_clk_unhalted.thread
kworker/u16:1-18249 114,306 cpu_clk_unhalted.thread
rtkit-daemon-3288 103,547 cpu_clk_unhalted.thread
kworker/0:2-19991 46,550 cpu_clk_unhalted.thread
rcu_sched-8 18,855 cpu_clk_unhalted.thread
rtkit-daemon-3289 17,549 cpu_clk_unhalted.thread
kworker/4:1-15354 8,812 cpu_clk_unhalted.thread
kworker/5:2-31362 6,812 cpu_clk_unhalted.thread
kworker/4:1H-1887 5,270 cpu_clk_unhalted.thread
kworker/6:0-17528 5,111 cpu_clk_unhalted.thread
kworker/7:2-23145 4,667 cpu_clk_unhalted.thread
watchdog/0-11 4,663 cpu_clk_unhalted.thread
watchdog/1-14 4,663 cpu_clk_unhalted.thread
watchdog/4-32 4,626 cpu_clk_unhalted.thread
watchdog/5-38 4,403 cpu_clk_unhalted.thread
watchdog/3-26 3,936 cpu_clk_unhalted.thread
watchdog/2-20 3,850 cpu_clk_unhalted.thread
kworker/u16:2-23146 2,654 cpu_clk_unhalted.thread
watchdog/6-44 2,017 cpu_clk_unhalted.thread
watchdog/7-50 2,017 cpu_clk_unhalted.thread
vmstat-23127 2,000,783 inst_retired.any # 0.7 CPI
thermald-2841 1,472,670 inst_retired.any # 0.8 CPI
sshd-23111 977,374 inst_retired.any # 0.8 CPI
perf-24163 495,037 inst_retired.any # 4.7 CPI
gmain-2700 341,213 inst_retired.any # 1.1 CPI
sshd-23058 148,891 inst_retired.any # 1.3 CPI
rtkit-daemon-3288 71,210 inst_retired.any # 1.5 CPI
kworker/u16:1-18249 39,562 inst_retired.any # 2.9 CPI
rcu_sched-8 14,474 inst_retired.any # 1.3 CPI
kworker/0:2-19991 7,659 inst_retired.any # 6.1 CPI
kworker/4:1-15354 6,714 inst_retired.any # 1.3 CPI
rtkit-daemon-3289 4,839 inst_retired.any # 3.6 CPI
kworker/6:0-17528 3,321 inst_retired.any # 1.5 CPI
kworker/5:2-31362 3,215 inst_retired.any # 2.1 CPI
kworker/7:2-23145 3,173 inst_retired.any # 1.5 CPI
kworker/4:1H-1887 1,719 inst_retired.any # 3.1 CPI
watchdog/0-11 1,479 inst_retired.any # 3.2 CPI
watchdog/1-14 1,479 inst_retired.any # 3.2 CPI
watchdog/2-20 1,479 inst_retired.any # 2.6 CPI
watchdog/3-26 1,479 inst_retired.any # 2.7 CPI
watchdog/4-32 1,479 inst_retired.any # 3.1 CPI
watchdog/5-38 1,479 inst_retired.any # 3.0 CPI
watchdog/6-44 1,479 inst_retired.any # 1.4 CPI
watchdog/7-50 1,479 inst_retired.any # 1.4 CPI
kworker/u16:2-23146 1,408 inst_retired.any # 1.9 CPI
perf-24163 2,302,323 cycles
vmstat-23127 1,352,455 cycles
thermald-2841 1,161,140 cycles
sshd-23111 807,827 cycles
gmain-2700 375,535 cycles
sshd-23058 194,071 cycles
kworker/u16:1-18249 114,306 cycles
rtkit-daemon-3288 103,547 cycles
kworker/0:2-19991 46,550 cycles
rcu_sched-8 18,855 cycles
rtkit-daemon-3289 17,549 cycles
kworker/4:1-15354 8,812 cycles
kworker/5:2-31362 6,812 cycles
kworker/4:1H-1887 5,270 cycles
kworker/6:0-17528 5,111 cycles
kworker/7:2-23145 4,667 cycles
watchdog/0-11 4,663 cycles
watchdog/1-14 4,663 cycles
watchdog/4-32 4,626 cycles
watchdog/5-38 4,403 cycles
watchdog/3-26 3,936 cycles
watchdog/2-20 3,850 cycles
kworker/u16:2-23146 2,654 cycles
watchdog/6-44 2,017 cycles
watchdog/7-50 2,017 cycles

2.175726600 seconds time elapsed

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512482591-4646-12-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
diff 29734550 Tue Dec 05 07:03:11 MST 2017 Jin Yao <yao.jin@linux.intel.com> perf stat: Resort '--per-thread' result

Many threads are reported when '--per-thread' is enabled
system-wide.

1. Most of the threads are either not counted or have a
counter value of 0. This patch removes these threads from
the output.

2. The remaining threads are re-sorted for display in
descending order of counter value, so the user can easily
see the hottest threads.
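
The two steps above (drop threads with no counts, then order the rest by counter value) can be sketched as follows. This is only an illustration in Python, not the C code the patch adds to perf; the `ThreadCount` type, the `resort` function, and the `idle-thread-1` entry are hypothetical names introduced here, while the other sample values are taken from the `cycles` rows of the example output.

```python
from typing import List, NamedTuple


class ThreadCount(NamedTuple):
    """One per-thread counter reading (hypothetical structure, not perf's)."""
    thread: str   # "comm-tid" label as printed by perf stat
    value: float  # counter value; 0 means nothing was counted


def resort(readings: List[ThreadCount]) -> List[ThreadCount]:
    """Drop threads with a zero count, then order the rest hottest first."""
    counted = [r for r in readings if r.value > 0]
    # sorted() is stable, so threads with equal counts keep their input order.
    return sorted(counted, key=lambda r: r.value, reverse=True)


readings = [
    ThreadCount("watchdog/0-11", 6_988),
    ThreadCount("idle-thread-1", 0),  # made-up entry with no counts
    ThreadCount("perf-24165", 8_833_385),
    ThreadCount("vmstat-23127", 1_702_699),
]

for r in resort(readings):
    print(f"{r.thread:>20} {r.value:>12,} cycles")
```

Filtering before sorting keeps the sort input small, which matters when most threads on the system report zero.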

For example, the new results would be:

root@skl:/tmp# perf stat --per-thread
^C
Performance counter stats for 'system wide':

perf-24165 4.302433 cpu-clock (msec) # 0.001 CPUs utilized
vmstat-23127 1.562215 cpu-clock (msec) # 0.000 CPUs utilized
irqbalance-2780 0.827851 cpu-clock (msec) # 0.000 CPUs utilized
sshd-23111 0.278308 cpu-clock (msec) # 0.000 CPUs utilized
thermald-2841 0.230880 cpu-clock (msec) # 0.000 CPUs utilized
sshd-23058 0.207306 cpu-clock (msec) # 0.000 CPUs utilized
kworker/0:2-19991 0.133983 cpu-clock (msec) # 0.000 CPUs utilized
kworker/u16:1-18249 0.125636 cpu-clock (msec) # 0.000 CPUs utilized
rcu_sched-8 0.085533 cpu-clock (msec) # 0.000 CPUs utilized
kworker/u16:2-23146 0.077139 cpu-clock (msec) # 0.000 CPUs utilized
gmain-2700 0.041789 cpu-clock (msec) # 0.000 CPUs utilized
kworker/4:1-15354 0.028370 cpu-clock (msec) # 0.000 CPUs utilized
kworker/6:0-17528 0.023895 cpu-clock (msec) # 0.000 CPUs utilized
kworker/4:1H-1887 0.013209 cpu-clock (msec) # 0.000 CPUs utilized
kworker/5:2-31362 0.011627 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/0-11 0.010892 cpu-clock (msec) # 0.000 CPUs utilized
kworker/3:2-12870 0.010220 cpu-clock (msec) # 0.000 CPUs utilized
ksoftirqd/0-7 0.008869 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/1-14 0.008476 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/7-50 0.002944 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/3-26 0.002893 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/4-32 0.002759 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/2-20 0.002429 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/6-44 0.001491 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/5-38 0.001477 cpu-clock (msec) # 0.000 CPUs utilized
rcu_sched-8 10 context-switches # 0.117 M/sec
kworker/u16:1-18249 7 context-switches # 0.056 M/sec
sshd-23111 4 context-switches # 0.014 M/sec
vmstat-23127 4 context-switches # 0.003 M/sec
perf-24165 4 context-switches # 0.930 K/sec
kworker/0:2-19991 3 context-switches # 0.022 M/sec
kworker/u16:2-23146 3 context-switches # 0.039 M/sec
kworker/4:1-15354 2 context-switches # 0.070 M/sec
kworker/6:0-17528 2 context-switches # 0.084 M/sec
sshd-23058 2 context-switches # 0.010 M/sec
ksoftirqd/0-7 1 context-switches # 0.113 M/sec
watchdog/0-11 1 context-switches # 0.092 M/sec
watchdog/1-14 1 context-switches # 0.118 M/sec
watchdog/2-20 1 context-switches # 0.412 M/sec
watchdog/3-26 1 context-switches # 0.346 M/sec
watchdog/4-32 1 context-switches # 0.362 M/sec
watchdog/5-38 1 context-switches # 0.677 M/sec
watchdog/6-44 1 context-switches # 0.671 M/sec
watchdog/7-50 1 context-switches # 0.340 M/sec
kworker/4:1H-1887 1 context-switches # 0.076 M/sec
thermald-2841 1 context-switches # 0.004 M/sec
gmain-2700 1 context-switches # 0.024 M/sec
irqbalance-2780 1 context-switches # 0.001 M/sec
kworker/3:2-12870 1 context-switches # 0.098 M/sec
kworker/5:2-31362 1 context-switches # 0.086 M/sec
kworker/u16:1-18249 2 cpu-migrations # 0.016 M/sec
kworker/u16:2-23146 2 cpu-migrations # 0.026 M/sec
rcu_sched-8 1 cpu-migrations # 0.012 M/sec
sshd-23058 1 cpu-migrations # 0.005 M/sec
perf-24165 8,833,385 cycles # 2.053 GHz
vmstat-23127 1,702,699 cycles # 1.090 GHz
irqbalance-2780 739,847 cycles # 0.894 GHz
sshd-23111 269,506 cycles # 0.968 GHz
thermald-2841 204,556 cycles # 0.886 GHz
sshd-23058 158,780 cycles # 0.766 GHz
kworker/0:2-19991 112,981 cycles # 0.843 GHz
kworker/u16:1-18249 100,926 cycles # 0.803 GHz
rcu_sched-8 74,024 cycles # 0.865 GHz
kworker/u16:2-23146 55,984 cycles # 0.726 GHz
gmain-2700 34,278 cycles # 0.820 GHz
kworker/4:1-15354 20,665 cycles # 0.728 GHz
kworker/6:0-17528 16,445 cycles # 0.688 GHz
kworker/5:2-31362 9,492 cycles # 0.816 GHz
watchdog/3-26 8,695 cycles # 3.006 GHz
kworker/4:1H-1887 8,238 cycles # 0.624 GHz
watchdog/4-32 7,580 cycles # 2.747 GHz
kworker/3:2-12870 7,306 cycles # 0.715 GHz
watchdog/2-20 7,274 cycles # 2.995 GHz
watchdog/0-11 6,988 cycles # 0.642 GHz
ksoftirqd/0-7 6,376 cycles # 0.719 GHz
watchdog/1-14 5,340 cycles # 0.630 GHz
watchdog/5-38 4,061 cycles # 2.749 GHz
watchdog/6-44 3,976 cycles # 2.667 GHz
watchdog/7-50 3,418 cycles # 1.161 GHz
vmstat-23127 2,511,699 instructions # 1.48 insn per cycle
perf-24165 1,829,908 instructions # 0.21 insn per cycle
irqbalance-2780 1,190,204 instructions # 1.61 insn per cycle
thermald-2841 143,544 instructions # 0.70 insn per cycle
sshd-23111 128,138 instructions # 0.48 insn per cycle
sshd-23058 57,654 instructions # 0.36 insn per cycle
rcu_sched-8 44,063 instructions # 0.60 insn per cycle
kworker/u16:1-18249 42,551 instructions # 0.42 insn per cycle
kworker/0:2-19991 25,873 instructions # 0.23 insn per cycle
kworker/u16:2-23146 21,407 instructions # 0.38 insn per cycle
gmain-2700 13,691 instructions # 0.40 insn per cycle
kworker/4:1-15354 12,964 instructions # 0.63 insn per cycle
kworker/6:0-17528 10,034 instructions # 0.61 insn per cycle
kworker/5:2-31362 5,203 instructions # 0.55 insn per cycle
kworker/3:2-12870 4,866 instructions # 0.67 insn per cycle
kworker/4:1H-1887 3,586 instructions # 0.44 insn per cycle
ksoftirqd/0-7 3,463 instructions # 0.54 insn per cycle
watchdog/0-11 3,135 instructions # 0.45 insn per cycle
watchdog/1-14 3,135 instructions # 0.59 insn per cycle
watchdog/2-20 3,135 instructions # 0.43 insn per cycle
watchdog/3-26 3,135 instructions # 0.36 insn per cycle
watchdog/4-32 3,135 instructions # 0.41 insn per cycle
watchdog/5-38 3,135 instructions # 0.77 insn per cycle
watchdog/6-44 3,135 instructions # 0.79 insn per cycle
watchdog/7-50 3,135 instructions # 0.92 insn per cycle
vmstat-23127 539,181 branches # 345.139 M/sec
perf-24165 375,364 branches # 87.245 M/sec
irqbalance-2780 262,092 branches # 316.593 M/sec
thermald-2841 31,611 branches # 136.915 M/sec
sshd-23111 21,874 branches # 78.596 M/sec
sshd-23058 10,682 branches # 51.528 M/sec
rcu_sched-8 8,693 branches # 101.633 M/sec
kworker/u16:1-18249 7,891 branches # 62.808 M/sec
kworker/0:2-19991 5,761 branches # 42.998 M/sec
kworker/u16:2-23146 4,099 branches # 53.138 M/sec
kworker/4:1-15354 2,755 branches # 97.110 M/sec
gmain-2700 2,638 branches # 63.127 M/sec
kworker/6:0-17528 2,216 branches # 92.739 M/sec
kworker/5:2-31362 1,132 branches # 97.360 M/sec
kworker/3:2-12870 1,081 branches # 105.773 M/sec
kworker/4:1H-1887 725 branches # 54.887 M/sec
ksoftirqd/0-7 707 branches # 79.716 M/sec
watchdog/0-11 652 branches # 59.860 M/sec
watchdog/1-14 652 branches # 76.923 M/sec
watchdog/2-20 652 branches # 268.423 M/sec
watchdog/3-26 652 branches # 225.372 M/sec
watchdog/4-32 652 branches # 236.318 M/sec
watchdog/5-38 652 branches # 441.435 M/sec
watchdog/6-44 652 branches # 437.290 M/sec
watchdog/7-50 652 branches # 221.467 M/sec
vmstat-23127 8,960 branch-misses # 1.66% of all branches
irqbalance-2780 3,047 branch-misses # 1.16% of all branches
perf-24165 2,876 branch-misses # 0.77% of all branches
sshd-23111 1,843 branch-misses # 8.43% of all branches
thermald-2841 1,444 branch-misses # 4.57% of all branches
sshd-23058 1,379 branch-misses # 12.91% of all branches
kworker/u16:1-18249 982 branch-misses # 12.44% of all branches
rcu_sched-8 893 branch-misses # 10.27% of all branches
kworker/u16:2-23146 578 branch-misses # 14.10% of all branches
kworker/0:2-19991 376 branch-misses # 6.53% of all branches
gmain-2700 280 branch-misses # 10.61% of all branches
kworker/6:0-17528 196 branch-misses # 8.84% of all branches
kworker/4:1-15354 187 branch-misses # 6.79% of all branches
kworker/5:2-31362 123 branch-misses # 10.87% of all branches
watchdog/0-11 95 branch-misses # 14.57% of all branches
watchdog/4-32 89 branch-misses # 13.65% of all branches
kworker/3:2-12870 80 branch-misses # 7.40% of all branches
watchdog/3-26 61 branch-misses # 9.36% of all branches
kworker/4:1H-1887 60 branch-misses # 8.28% of all branches
watchdog/2-20 52 branch-misses # 7.98% of all branches
ksoftirqd/0-7 47 branch-misses # 6.65% of all branches
watchdog/1-14 46 branch-misses # 7.06% of all branches
watchdog/7-50 13 branch-misses # 1.99% of all branches
watchdog/5-38 8 branch-misses # 1.23% of all branches
watchdog/6-44 7 branch-misses # 1.07% of all branches

3.695150786 seconds time elapsed
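
The values after `#` in the output above appear to be simple ratios of the raw counters in the same thread's rows. The sketch below recomputes them for `vmstat-23127`. The formulas are inferred from the printed numbers rather than taken from perf's source, and the `derived` helper and its dictionary keys are hypothetical names.

```python
def derived(cpu_clock_msec, cycles, instructions, branches, branch_misses):
    """Recompute the '#' annotations from one thread's raw counter values."""
    seconds = cpu_clock_msec / 1000.0  # cpu-clock is reported in milliseconds
    return {
        "GHz": cycles / seconds / 1e9,
        "insn per cycle": instructions / cycles,
        "branches M/sec": branches / seconds / 1e6,
        "branch-miss %": 100.0 * branch_misses / branches,
    }


# Raw values copied from the vmstat-23127 rows of the example output.
vmstat = derived(
    cpu_clock_msec=1.562215,
    cycles=1_702_699,
    instructions=2_511_699,
    branches=539_181,
    branch_misses=8_960,
)

for name, value in vmstat.items():
    print(f"{name:>16}: {value:.3f}")
```

After rounding, the results agree with the annotations printed for that thread (1.090 GHz, 1.48 insn per cycle, 345.139 M/sec, 1.66% of all branches).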

root@skl:/tmp# perf stat --per-thread -M IPC,CPI
^C

Performance counter stats for 'system wide':

vmstat-23127 2,000,783 inst_retired.any # 1.5 IPC
thermald-2841 1,472,670 inst_retired.any # 1.3 IPC
sshd-23111 977,374 inst_retired.any # 1.2 IPC
perf-24163 483,779 inst_retired.any # 0.2 IPC
gmain-2700 341,213 inst_retired.any # 0.9 IPC
sshd-23058 148,891 inst_retired.any # 0.8 IPC
rtkit-daemon-3288 71,210 inst_retired.any # 0.7 IPC
kworker/u16:1-18249 39,562 inst_retired.any # 0.3 IPC
rcu_sched-8 14,474 inst_retired.any # 0.8 IPC
kworker/0:2-19991 7,659 inst_retired.any # 0.2 IPC
kworker/4:1-15354 6,714 inst_retired.any # 0.8 IPC
rtkit-daemon-3289 4,839 inst_retired.any # 0.3 IPC
kworker/6:0-17528 3,321 inst_retired.any # 0.6 IPC
kworker/5:2-31362 3,215 inst_retired.any # 0.5 IPC
kworker/7:2-23145 3,173 inst_retired.any # 0.7 IPC
kworker/4:1H-1887 1,719 inst_retired.any # 0.3 IPC
watchdog/0-11 1,479 inst_retired.any # 0.3 IPC
watchdog/1-14 1,479 inst_retired.any # 0.3 IPC
watchdog/2-20 1,479 inst_retired.any # 0.4 IPC
watchdog/3-26 1,479 inst_retired.any # 0.4 IPC
watchdog/4-32 1,479 inst_retired.any # 0.3 IPC
watchdog/5-38 1,479 inst_retired.any # 0.3 IPC
watchdog/6-44 1,479 inst_retired.any # 0.7 IPC
watchdog/7-50 1,479 inst_retired.any # 0.7 IPC
kworker/u16:2-23146 1,408 inst_retired.any # 0.5 IPC
perf-24163 2,249,872 cpu_clk_unhalted.thread
vmstat-23127 1,352,455 cpu_clk_unhalted.thread
thermald-2841 1,161,140 cpu_clk_unhalted.thread
sshd-23111 807,827 cpu_clk_unhalted.thread
gmain-2700 375,535 cpu_clk_unhalted.thread
sshd-23058 194,071 cpu_clk_unhalted.thread
kworker/u16:1-18249 114,306 cpu_clk_unhalted.thread
rtkit-daemon-3288 103,547 cpu_clk_unhalted.thread
kworker/0:2-19991 46,550 cpu_clk_unhalted.thread
rcu_sched-8 18,855 cpu_clk_unhalted.thread
rtkit-daemon-3289 17,549 cpu_clk_unhalted.thread
kworker/4:1-15354 8,812 cpu_clk_unhalted.thread
kworker/5:2-31362 6,812 cpu_clk_unhalted.thread
kworker/4:1H-1887 5,270 cpu_clk_unhalted.thread
kworker/6:0-17528 5,111 cpu_clk_unhalted.thread
kworker/7:2-23145 4,667 cpu_clk_unhalted.thread
watchdog/0-11 4,663 cpu_clk_unhalted.thread
watchdog/1-14 4,663 cpu_clk_unhalted.thread
watchdog/4-32 4,626 cpu_clk_unhalted.thread
watchdog/5-38 4,403 cpu_clk_unhalted.thread
watchdog/3-26 3,936 cpu_clk_unhalted.thread
watchdog/2-20 3,850 cpu_clk_unhalted.thread
kworker/u16:2-23146 2,654 cpu_clk_unhalted.thread
watchdog/6-44 2,017 cpu_clk_unhalted.thread
watchdog/7-50 2,017 cpu_clk_unhalted.thread
vmstat-23127 2,000,783 inst_retired.any # 0.7 CPI
thermald-2841 1,472,670 inst_retired.any # 0.8 CPI
sshd-23111 977,374 inst_retired.any # 0.8 CPI
perf-24163 495,037 inst_retired.any # 4.7 CPI
gmain-2700 341,213 inst_retired.any # 1.1 CPI
sshd-23058 148,891 inst_retired.any # 1.3 CPI
rtkit-daemon-3288 71,210 inst_retired.any # 1.5 CPI
kworker/u16:1-18249 39,562 inst_retired.any # 2.9 CPI
rcu_sched-8 14,474 inst_retired.any # 1.3 CPI
kworker/0:2-19991 7,659 inst_retired.any # 6.1 CPI
kworker/4:1-15354 6,714 inst_retired.any # 1.3 CPI
rtkit-daemon-3289 4,839 inst_retired.any # 3.6 CPI
kworker/6:0-17528 3,321 inst_retired.any # 1.5 CPI
kworker/5:2-31362 3,215 inst_retired.any # 2.1 CPI
kworker/7:2-23145 3,173 inst_retired.any # 1.5 CPI
kworker/4:1H-1887 1,719 inst_retired.any # 3.1 CPI
watchdog/0-11 1,479 inst_retired.any # 3.2 CPI
watchdog/1-14 1,479 inst_retired.any # 3.2 CPI
watchdog/2-20 1,479 inst_retired.any # 2.6 CPI
watchdog/3-26 1,479 inst_retired.any # 2.7 CPI
watchdog/4-32 1,479 inst_retired.any # 3.1 CPI
watchdog/5-38 1,479 inst_retired.any # 3.0 CPI
watchdog/6-44 1,479 inst_retired.any # 1.4 CPI
watchdog/7-50 1,479 inst_retired.any # 1.4 CPI
kworker/u16:2-23146 1,408 inst_retired.any # 1.9 CPI
perf-24163 2,302,323 cycles
vmstat-23127 1,352,455 cycles
thermald-2841 1,161,140 cycles
sshd-23111 807,827 cycles
gmain-2700 375,535 cycles
sshd-23058 194,071 cycles
kworker/u16:1-18249 114,306 cycles
rtkit-daemon-3288 103,547 cycles
kworker/0:2-19991 46,550 cycles
rcu_sched-8 18,855 cycles
rtkit-daemon-3289 17,549 cycles
kworker/4:1-15354 8,812 cycles
kworker/5:2-31362 6,812 cycles
kworker/4:1H-1887 5,270 cycles
kworker/6:0-17528 5,111 cycles
kworker/7:2-23145 4,667 cycles
watchdog/0-11 4,663 cycles
watchdog/1-14 4,663 cycles
watchdog/4-32 4,626 cycles
watchdog/5-38 4,403 cycles
watchdog/3-26 3,936 cycles
watchdog/2-20 3,850 cycles
kworker/u16:2-23146 2,654 cycles
watchdog/6-44 2,017 cycles
watchdog/7-50 2,017 cycles

2.175726600 seconds time elapsed
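
The IPC and CPI annotations in the `-M IPC,CPI` output are consistent with plain ratios of the listed events: IPC as `inst_retired.any / cpu_clk_unhalted.thread`, and CPI as `cycles / inst_retired.any`. The sketch below checks that reading against three threads from the output. The pairing of events is an assumption drawn from the printed numbers, not from the metric definitions themselves, and the `ipc` and `cpi` function names are hypothetical.

```python
def ipc(inst_retired_any: int, cpu_clk_unhalted_thread: int) -> float:
    """Instructions per cycle, using the events shown in the IPC group."""
    return inst_retired_any / cpu_clk_unhalted_thread


def cpi(cycles: int, inst_retired_any: int) -> float:
    """Cycles per instruction, using the events shown in the CPI group."""
    return cycles / inst_retired_any


# (thread, IPC-group instructions, cpu_clk_unhalted.thread,
#  CPI-group instructions, cycles), copied from the example output.
samples = [
    ("vmstat-23127", 2_000_783, 1_352_455, 2_000_783, 1_352_455),
    ("perf-24163", 483_779, 2_249_872, 495_037, 2_302_323),
    ("kworker/0:2-19991", 7_659, 46_550, 7_659, 46_550),
]

for thread, ipc_inst, clk, cpi_inst, cyc in samples:
    print(f"{thread:>20}  IPC {ipc(ipc_inst, clk):.1f}  CPI {cpi(cyc, cpi_inst):.1f}")
```

The `perf-24163` rows show different instruction and cycle counts in the two groups, so its CPI is not exactly the reciprocal of its IPC.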

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512482591-4646-12-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
diff 29734550 Tue Dec 05 07:03:11 MST 2017 Jin Yao <yao.jin@linux.intel.com> perf stat: Resort '--per-thread' result

Many threads are reported when '--per-thread' is enabled
system-wide.

1. Most of the threads are either not counted or have a
counter value of 0. This patch removes these threads from
the output.

2. The remaining threads are re-sorted for display in
descending order of counter value, so the user can easily
see the hottest threads.

For example, the new results would be:

root@skl:/tmp# perf stat --per-thread
^C
Performance counter stats for 'system wide':

perf-24165 4.302433 cpu-clock (msec) # 0.001 CPUs utilized
vmstat-23127 1.562215 cpu-clock (msec) # 0.000 CPUs utilized
irqbalance-2780 0.827851 cpu-clock (msec) # 0.000 CPUs utilized
sshd-23111 0.278308 cpu-clock (msec) # 0.000 CPUs utilized
thermald-2841 0.230880 cpu-clock (msec) # 0.000 CPUs utilized
sshd-23058 0.207306 cpu-clock (msec) # 0.000 CPUs utilized
kworker/0:2-19991 0.133983 cpu-clock (msec) # 0.000 CPUs utilized
kworker/u16:1-18249 0.125636 cpu-clock (msec) # 0.000 CPUs utilized
rcu_sched-8 0.085533 cpu-clock (msec) # 0.000 CPUs utilized
kworker/u16:2-23146 0.077139 cpu-clock (msec) # 0.000 CPUs utilized
gmain-2700 0.041789 cpu-clock (msec) # 0.000 CPUs utilized
kworker/4:1-15354 0.028370 cpu-clock (msec) # 0.000 CPUs utilized
kworker/6:0-17528 0.023895 cpu-clock (msec) # 0.000 CPUs utilized
kworker/4:1H-1887 0.013209 cpu-clock (msec) # 0.000 CPUs utilized
kworker/5:2-31362 0.011627 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/0-11 0.010892 cpu-clock (msec) # 0.000 CPUs utilized
kworker/3:2-12870 0.010220 cpu-clock (msec) # 0.000 CPUs utilized
ksoftirqd/0-7 0.008869 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/1-14 0.008476 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/7-50 0.002944 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/3-26 0.002893 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/4-32 0.002759 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/2-20 0.002429 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/6-44 0.001491 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/5-38 0.001477 cpu-clock (msec) # 0.000 CPUs utilized
rcu_sched-8 10 context-switches # 0.117 M/sec
kworker/u16:1-18249 7 context-switches # 0.056 M/sec
sshd-23111 4 context-switches # 0.014 M/sec
vmstat-23127 4 context-switches # 0.003 M/sec
perf-24165 4 context-switches # 0.930 K/sec
kworker/0:2-19991 3 context-switches # 0.022 M/sec
kworker/u16:2-23146 3 context-switches # 0.039 M/sec
kworker/4:1-15354 2 context-switches # 0.070 M/sec
kworker/6:0-17528 2 context-switches # 0.084 M/sec
sshd-23058 2 context-switches # 0.010 M/sec
ksoftirqd/0-7 1 context-switches # 0.113 M/sec
watchdog/0-11 1 context-switches # 0.092 M/sec
watchdog/1-14 1 context-switches # 0.118 M/sec
watchdog/2-20 1 context-switches # 0.412 M/sec
watchdog/3-26 1 context-switches # 0.346 M/sec
watchdog/4-32 1 context-switches # 0.362 M/sec
watchdog/5-38 1 context-switches # 0.677 M/sec
watchdog/6-44 1 context-switches # 0.671 M/sec
watchdog/7-50 1 context-switches # 0.340 M/sec
kworker/4:1H-1887 1 context-switches # 0.076 M/sec
thermald-2841 1 context-switches # 0.004 M/sec
gmain-2700 1 context-switches # 0.024 M/sec
irqbalance-2780 1 context-switches # 0.001 M/sec
kworker/3:2-12870 1 context-switches # 0.098 M/sec
kworker/5:2-31362 1 context-switches # 0.086 M/sec
kworker/u16:1-18249 2 cpu-migrations # 0.016 M/sec
kworker/u16:2-23146 2 cpu-migrations # 0.026 M/sec
rcu_sched-8 1 cpu-migrations # 0.012 M/sec
sshd-23058 1 cpu-migrations # 0.005 M/sec
perf-24165 8,833,385 cycles # 2.053 GHz
vmstat-23127 1,702,699 cycles # 1.090 GHz
irqbalance-2780 739,847 cycles # 0.894 GHz
sshd-23111 269,506 cycles # 0.968 GHz
thermald-2841 204,556 cycles # 0.886 GHz
sshd-23058 158,780 cycles # 0.766 GHz
kworker/0:2-19991 112,981 cycles # 0.843 GHz
kworker/u16:1-18249 100,926 cycles # 0.803 GHz
rcu_sched-8 74,024 cycles # 0.865 GHz
kworker/u16:2-23146 55,984 cycles # 0.726 GHz
gmain-2700 34,278 cycles # 0.820 GHz
kworker/4:1-15354 20,665 cycles # 0.728 GHz
kworker/6:0-17528 16,445 cycles # 0.688 GHz
kworker/5:2-31362 9,492 cycles # 0.816 GHz
watchdog/3-26 8,695 cycles # 3.006 GHz
kworker/4:1H-1887 8,238 cycles # 0.624 GHz
watchdog/4-32 7,580 cycles # 2.747 GHz
kworker/3:2-12870 7,306 cycles # 0.715 GHz
watchdog/2-20 7,274 cycles # 2.995 GHz
watchdog/0-11 6,988 cycles # 0.642 GHz
ksoftirqd/0-7 6,376 cycles # 0.719 GHz
watchdog/1-14 5,340 cycles # 0.630 GHz
watchdog/5-38 4,061 cycles # 2.749 GHz
watchdog/6-44 3,976 cycles # 2.667 GHz
watchdog/7-50 3,418 cycles # 1.161 GHz
vmstat-23127 2,511,699 instructions # 1.48 insn per cycle
perf-24165 1,829,908 instructions # 0.21 insn per cycle
irqbalance-2780 1,190,204 instructions # 1.61 insn per cycle
thermald-2841 143,544 instructions # 0.70 insn per cycle
sshd-23111 128,138 instructions # 0.48 insn per cycle
sshd-23058 57,654 instructions # 0.36 insn per cycle
rcu_sched-8 44,063 instructions # 0.60 insn per cycle
kworker/u16:1-18249 42,551 instructions # 0.42 insn per cycle
kworker/0:2-19991 25,873 instructions # 0.23 insn per cycle
kworker/u16:2-23146 21,407 instructions # 0.38 insn per cycle
gmain-2700 13,691 instructions # 0.40 insn per cycle
kworker/4:1-15354 12,964 instructions # 0.63 insn per cycle
kworker/6:0-17528 10,034 instructions # 0.61 insn per cycle
kworker/5:2-31362 5,203 instructions # 0.55 insn per cycle
kworker/3:2-12870 4,866 instructions # 0.67 insn per cycle
kworker/4:1H-1887 3,586 instructions # 0.44 insn per cycle
ksoftirqd/0-7 3,463 instructions # 0.54 insn per cycle
watchdog/0-11 3,135 instructions # 0.45 insn per cycle
watchdog/1-14 3,135 instructions # 0.59 insn per cycle
watchdog/2-20 3,135 instructions # 0.43 insn per cycle
watchdog/3-26 3,135 instructions # 0.36 insn per cycle
watchdog/4-32 3,135 instructions # 0.41 insn per cycle
watchdog/5-38 3,135 instructions # 0.77 insn per cycle
watchdog/6-44 3,135 instructions # 0.79 insn per cycle
watchdog/7-50 3,135 instructions # 0.92 insn per cycle
vmstat-23127 539,181 branches # 345.139 M/sec
perf-24165 375,364 branches # 87.245 M/sec
irqbalance-2780 262,092 branches # 316.593 M/sec
thermald-2841 31,611 branches # 136.915 M/sec
sshd-23111 21,874 branches # 78.596 M/sec
sshd-23058 10,682 branches # 51.528 M/sec
rcu_sched-8 8,693 branches # 101.633 M/sec
kworker/u16:1-18249 7,891 branches # 62.808 M/sec
kworker/0:2-19991 5,761 branches # 42.998 M/sec
kworker/u16:2-23146 4,099 branches # 53.138 M/sec
kworker/4:1-15354 2,755 branches # 97.110 M/sec
gmain-2700 2,638 branches # 63.127 M/sec
kworker/6:0-17528 2,216 branches # 92.739 M/sec
kworker/5:2-31362 1,132 branches # 97.360 M/sec
kworker/3:2-12870 1,081 branches # 105.773 M/sec
kworker/4:1H-1887 725 branches # 54.887 M/sec
ksoftirqd/0-7 707 branches # 79.716 M/sec
watchdog/0-11 652 branches # 59.860 M/sec
watchdog/1-14 652 branches # 76.923 M/sec
watchdog/2-20 652 branches # 268.423 M/sec
watchdog/3-26 652 branches # 225.372 M/sec
watchdog/4-32 652 branches # 236.318 M/sec
watchdog/5-38 652 branches # 441.435 M/sec
watchdog/6-44 652 branches # 437.290 M/sec
watchdog/7-50 652 branches # 221.467 M/sec
vmstat-23127 8,960 branch-misses # 1.66% of all branches
irqbalance-2780 3,047 branch-misses # 1.16% of all branches
perf-24165 2,876 branch-misses # 0.77% of all branches
sshd-23111 1,843 branch-misses # 8.43% of all branches
thermald-2841 1,444 branch-misses # 4.57% of all branches
sshd-23058 1,379 branch-misses # 12.91% of all branches
kworker/u16:1-18249 982 branch-misses # 12.44% of all branches
rcu_sched-8 893 branch-misses # 10.27% of all branches
kworker/u16:2-23146 578 branch-misses # 14.10% of all branches
kworker/0:2-19991 376 branch-misses # 6.53% of all branches
gmain-2700 280 branch-misses # 10.61% of all branches
kworker/6:0-17528 196 branch-misses # 8.84% of all branches
kworker/4:1-15354 187 branch-misses # 6.79% of all branches
kworker/5:2-31362 123 branch-misses # 10.87% of all branches
watchdog/0-11 95 branch-misses # 14.57% of all branches
watchdog/4-32 89 branch-misses # 13.65% of all branches
kworker/3:2-12870 80 branch-misses # 7.40% of all branches
watchdog/3-26 61 branch-misses # 9.36% of all branches
kworker/4:1H-1887 60 branch-misses # 8.28% of all branches
watchdog/2-20 52 branch-misses # 7.98% of all branches
ksoftirqd/0-7 47 branch-misses # 6.65% of all branches
watchdog/1-14 46 branch-misses # 7.06% of all branches
watchdog/7-50 13 branch-misses # 1.99% of all branches
watchdog/5-38 8 branch-misses # 1.23% of all branches
watchdog/6-44 7 branch-misses # 1.07% of all branches

3.695150786 seconds time elapsed

root@skl:/tmp# perf stat --per-thread -M IPC,CPI
^C

Performance counter stats for 'system wide':

vmstat-23127 2,000,783 inst_retired.any # 1.5 IPC
thermald-2841 1,472,670 inst_retired.any # 1.3 IPC
sshd-23111 977,374 inst_retired.any # 1.2 IPC
perf-24163 483,779 inst_retired.any # 0.2 IPC
gmain-2700 341,213 inst_retired.any # 0.9 IPC
sshd-23058 148,891 inst_retired.any # 0.8 IPC
rtkit-daemon-3288 71,210 inst_retired.any # 0.7 IPC
kworker/u16:1-18249 39,562 inst_retired.any # 0.3 IPC
rcu_sched-8 14,474 inst_retired.any # 0.8 IPC
kworker/0:2-19991 7,659 inst_retired.any # 0.2 IPC
kworker/4:1-15354 6,714 inst_retired.any # 0.8 IPC
rtkit-daemon-3289 4,839 inst_retired.any # 0.3 IPC
kworker/6:0-17528 3,321 inst_retired.any # 0.6 IPC
kworker/5:2-31362 3,215 inst_retired.any # 0.5 IPC
kworker/7:2-23145 3,173 inst_retired.any # 0.7 IPC
kworker/4:1H-1887 1,719 inst_retired.any # 0.3 IPC
watchdog/0-11 1,479 inst_retired.any # 0.3 IPC
watchdog/1-14 1,479 inst_retired.any # 0.3 IPC
watchdog/2-20 1,479 inst_retired.any # 0.4 IPC
watchdog/3-26 1,479 inst_retired.any # 0.4 IPC
watchdog/4-32 1,479 inst_retired.any # 0.3 IPC
watchdog/5-38 1,479 inst_retired.any # 0.3 IPC
watchdog/6-44 1,479 inst_retired.any # 0.7 IPC
watchdog/7-50 1,479 inst_retired.any # 0.7 IPC
kworker/u16:2-23146 1,408 inst_retired.any # 0.5 IPC
perf-24163 2,249,872 cpu_clk_unhalted.thread
vmstat-23127 1,352,455 cpu_clk_unhalted.thread
thermald-2841 1,161,140 cpu_clk_unhalted.thread
sshd-23111 807,827 cpu_clk_unhalted.thread
gmain-2700 375,535 cpu_clk_unhalted.thread
sshd-23058 194,071 cpu_clk_unhalted.thread
kworker/u16:1-18249 114,306 cpu_clk_unhalted.thread
rtkit-daemon-3288 103,547 cpu_clk_unhalted.thread
kworker/0:2-19991 46,550 cpu_clk_unhalted.thread
rcu_sched-8 18,855 cpu_clk_unhalted.thread
rtkit-daemon-3289 17,549 cpu_clk_unhalted.thread
kworker/4:1-15354 8,812 cpu_clk_unhalted.thread
kworker/5:2-31362 6,812 cpu_clk_unhalted.thread
kworker/4:1H-1887 5,270 cpu_clk_unhalted.thread
kworker/6:0-17528 5,111 cpu_clk_unhalted.thread
kworker/7:2-23145 4,667 cpu_clk_unhalted.thread
watchdog/0-11 4,663 cpu_clk_unhalted.thread
watchdog/1-14 4,663 cpu_clk_unhalted.thread
watchdog/4-32 4,626 cpu_clk_unhalted.thread
watchdog/5-38 4,403 cpu_clk_unhalted.thread
watchdog/3-26 3,936 cpu_clk_unhalted.thread
watchdog/2-20 3,850 cpu_clk_unhalted.thread
kworker/u16:2-23146 2,654 cpu_clk_unhalted.thread
watchdog/6-44 2,017 cpu_clk_unhalted.thread
watchdog/7-50 2,017 cpu_clk_unhalted.thread
vmstat-23127 2,000,783 inst_retired.any # 0.7 CPI
thermald-2841 1,472,670 inst_retired.any # 0.8 CPI
sshd-23111 977,374 inst_retired.any # 0.8 CPI
perf-24163 495,037 inst_retired.any # 4.7 CPI
gmain-2700 341,213 inst_retired.any # 1.1 CPI
sshd-23058 148,891 inst_retired.any # 1.3 CPI
rtkit-daemon-3288 71,210 inst_retired.any # 1.5 CPI
kworker/u16:1-18249 39,562 inst_retired.any # 2.9 CPI
rcu_sched-8 14,474 inst_retired.any # 1.3 CPI
kworker/0:2-19991 7,659 inst_retired.any # 6.1 CPI
kworker/4:1-15354 6,714 inst_retired.any # 1.3 CPI
rtkit-daemon-3289 4,839 inst_retired.any # 3.6 CPI
kworker/6:0-17528 3,321 inst_retired.any # 1.5 CPI
kworker/5:2-31362 3,215 inst_retired.any # 2.1 CPI
kworker/7:2-23145 3,173 inst_retired.any # 1.5 CPI
kworker/4:1H-1887 1,719 inst_retired.any # 3.1 CPI
watchdog/0-11 1,479 inst_retired.any # 3.2 CPI
watchdog/1-14 1,479 inst_retired.any # 3.2 CPI
watchdog/2-20 1,479 inst_retired.any # 2.6 CPI
watchdog/3-26 1,479 inst_retired.any # 2.7 CPI
watchdog/4-32 1,479 inst_retired.any # 3.1 CPI
watchdog/5-38 1,479 inst_retired.any # 3.0 CPI
watchdog/6-44 1,479 inst_retired.any # 1.4 CPI
watchdog/7-50 1,479 inst_retired.any # 1.4 CPI
kworker/u16:2-23146 1,408 inst_retired.any # 1.9 CPI
perf-24163 2,302,323 cycles
vmstat-23127 1,352,455 cycles
thermald-2841 1,161,140 cycles
sshd-23111 807,827 cycles
gmain-2700 375,535 cycles
sshd-23058 194,071 cycles
kworker/u16:1-18249 114,306 cycles
rtkit-daemon-3288 103,547 cycles
kworker/0:2-19991 46,550 cycles
rcu_sched-8 18,855 cycles
rtkit-daemon-3289 17,549 cycles
kworker/4:1-15354 8,812 cycles
kworker/5:2-31362 6,812 cycles
kworker/4:1H-1887 5,270 cycles
kworker/6:0-17528 5,111 cycles
kworker/7:2-23145 4,667 cycles
watchdog/0-11 4,663 cycles
watchdog/1-14 4,663 cycles
watchdog/4-32 4,626 cycles
watchdog/5-38 4,403 cycles
watchdog/3-26 3,936 cycles
watchdog/2-20 3,850 cycles
kworker/u16:2-23146 2,654 cycles
watchdog/6-44 2,017 cycles
watchdog/7-50 2,017 cycles

2.175726600 seconds time elapsed

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512482591-4646-12-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
diff 29734550 Tue Dec 05 07:03:11 MST 2017 Jin Yao <yao.jin@linux.intel.com> perf stat: Resort '--per-thread' result

Many threads are reported when '--per-thread' is enabled
system-wide.

1. Most of the threads are either not counted or have a
counting value of 0. This patch removes these threads from
the output.

2. The remaining threads are re-sorted for display by
counting value, so the user can easily see the hottest
threads.
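
The two steps above (drop uncounted or zero-valued threads, then sort by counting value) can be sketched as follows. This is only an illustration in Python, not the C code of the patch; the `ThreadCount` type, the `resort_per_thread` helper, and the two placeholder thread names are assumptions made for the example. The remaining values are taken from the `cycles` rows of the sample output.

```python
from typing import List, NamedTuple


class ThreadCount(NamedTuple):
    """One per-thread counter reading (hypothetical type for this sketch)."""
    name: str      # "comm-tid" label as printed by perf stat
    value: int     # counter value for a single event
    counted: bool  # False if the event was never counted for this thread


def resort_per_thread(rows: List[ThreadCount]) -> List[ThreadCount]:
    """Drop uncounted or zero-valued threads, then sort hottest first."""
    kept = [r for r in rows if r.counted and r.value != 0]
    return sorted(kept, key=lambda r: r.value, reverse=True)


rows = [
    ThreadCount("kworker/4:1-15354", 20_665, True),
    ThreadCount("idle-thread-1", 0, True),       # placeholder: zero count
    ThreadCount("perf-24165", 8_833_385, True),
    ThreadCount("not-counted-2", 0, False),      # placeholder: not counted
    ThreadCount("vmstat-23127", 1_702_699, True),
]

for r in resort_per_thread(rows):
    print(f"{r.name:>20} {r.value:>12,} cycles")
```

Sorting in descending order matches the sample output, where each event block lists the thread with the largest value first.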

For example, the new results would be:

root@skl:/tmp# perf stat --per-thread
^C
Performance counter stats for 'system wide':

perf-24165 4.302433 cpu-clock (msec) # 0.001 CPUs utilized
vmstat-23127 1.562215 cpu-clock (msec) # 0.000 CPUs utilized
irqbalance-2780 0.827851 cpu-clock (msec) # 0.000 CPUs utilized
sshd-23111 0.278308 cpu-clock (msec) # 0.000 CPUs utilized
thermald-2841 0.230880 cpu-clock (msec) # 0.000 CPUs utilized
sshd-23058 0.207306 cpu-clock (msec) # 0.000 CPUs utilized
kworker/0:2-19991 0.133983 cpu-clock (msec) # 0.000 CPUs utilized
kworker/u16:1-18249 0.125636 cpu-clock (msec) # 0.000 CPUs utilized
rcu_sched-8 0.085533 cpu-clock (msec) # 0.000 CPUs utilized
kworker/u16:2-23146 0.077139 cpu-clock (msec) # 0.000 CPUs utilized
gmain-2700 0.041789 cpu-clock (msec) # 0.000 CPUs utilized
kworker/4:1-15354 0.028370 cpu-clock (msec) # 0.000 CPUs utilized
kworker/6:0-17528 0.023895 cpu-clock (msec) # 0.000 CPUs utilized
kworker/4:1H-1887 0.013209 cpu-clock (msec) # 0.000 CPUs utilized
kworker/5:2-31362 0.011627 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/0-11 0.010892 cpu-clock (msec) # 0.000 CPUs utilized
kworker/3:2-12870 0.010220 cpu-clock (msec) # 0.000 CPUs utilized
ksoftirqd/0-7 0.008869 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/1-14 0.008476 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/7-50 0.002944 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/3-26 0.002893 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/4-32 0.002759 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/2-20 0.002429 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/6-44 0.001491 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/5-38 0.001477 cpu-clock (msec) # 0.000 CPUs utilized
rcu_sched-8 10 context-switches # 0.117 M/sec
kworker/u16:1-18249 7 context-switches # 0.056 M/sec
sshd-23111 4 context-switches # 0.014 M/sec
vmstat-23127 4 context-switches # 0.003 M/sec
perf-24165 4 context-switches # 0.930 K/sec
kworker/0:2-19991 3 context-switches # 0.022 M/sec
kworker/u16:2-23146 3 context-switches # 0.039 M/sec
kworker/4:1-15354 2 context-switches # 0.070 M/sec
kworker/6:0-17528 2 context-switches # 0.084 M/sec
sshd-23058 2 context-switches # 0.010 M/sec
ksoftirqd/0-7 1 context-switches # 0.113 M/sec
watchdog/0-11 1 context-switches # 0.092 M/sec
watchdog/1-14 1 context-switches # 0.118 M/sec
watchdog/2-20 1 context-switches # 0.412 M/sec
watchdog/3-26 1 context-switches # 0.346 M/sec
watchdog/4-32 1 context-switches # 0.362 M/sec
watchdog/5-38 1 context-switches # 0.677 M/sec
watchdog/6-44 1 context-switches # 0.671 M/sec
watchdog/7-50 1 context-switches # 0.340 M/sec
kworker/4:1H-1887 1 context-switches # 0.076 M/sec
thermald-2841 1 context-switches # 0.004 M/sec
gmain-2700 1 context-switches # 0.024 M/sec
irqbalance-2780 1 context-switches # 0.001 M/sec
kworker/3:2-12870 1 context-switches # 0.098 M/sec
kworker/5:2-31362 1 context-switches # 0.086 M/sec
kworker/u16:1-18249 2 cpu-migrations # 0.016 M/sec
kworker/u16:2-23146 2 cpu-migrations # 0.026 M/sec
rcu_sched-8 1 cpu-migrations # 0.012 M/sec
sshd-23058 1 cpu-migrations # 0.005 M/sec
perf-24165 8,833,385 cycles # 2.053 GHz
vmstat-23127 1,702,699 cycles # 1.090 GHz
irqbalance-2780 739,847 cycles # 0.894 GHz
sshd-23111 269,506 cycles # 0.968 GHz
thermald-2841 204,556 cycles # 0.886 GHz
sshd-23058 158,780 cycles # 0.766 GHz
kworker/0:2-19991 112,981 cycles # 0.843 GHz
kworker/u16:1-18249 100,926 cycles # 0.803 GHz
rcu_sched-8 74,024 cycles # 0.865 GHz
kworker/u16:2-23146 55,984 cycles # 0.726 GHz
gmain-2700 34,278 cycles # 0.820 GHz
kworker/4:1-15354 20,665 cycles # 0.728 GHz
kworker/6:0-17528 16,445 cycles # 0.688 GHz
kworker/5:2-31362 9,492 cycles # 0.816 GHz
watchdog/3-26 8,695 cycles # 3.006 GHz
kworker/4:1H-1887 8,238 cycles # 0.624 GHz
watchdog/4-32 7,580 cycles # 2.747 GHz
kworker/3:2-12870 7,306 cycles # 0.715 GHz
watchdog/2-20 7,274 cycles # 2.995 GHz
watchdog/0-11 6,988 cycles # 0.642 GHz
ksoftirqd/0-7 6,376 cycles # 0.719 GHz
watchdog/1-14 5,340 cycles # 0.630 GHz
watchdog/5-38 4,061 cycles # 2.749 GHz
watchdog/6-44 3,976 cycles # 2.667 GHz
watchdog/7-50 3,418 cycles # 1.161 GHz
vmstat-23127 2,511,699 instructions # 1.48 insn per cycle
perf-24165 1,829,908 instructions # 0.21 insn per cycle
irqbalance-2780 1,190,204 instructions # 1.61 insn per cycle
thermald-2841 143,544 instructions # 0.70 insn per cycle
sshd-23111 128,138 instructions # 0.48 insn per cycle
sshd-23058 57,654 instructions # 0.36 insn per cycle
rcu_sched-8 44,063 instructions # 0.60 insn per cycle
kworker/u16:1-18249 42,551 instructions # 0.42 insn per cycle
kworker/0:2-19991 25,873 instructions # 0.23 insn per cycle
kworker/u16:2-23146 21,407 instructions # 0.38 insn per cycle
gmain-2700 13,691 instructions # 0.40 insn per cycle
kworker/4:1-15354 12,964 instructions # 0.63 insn per cycle
kworker/6:0-17528 10,034 instructions # 0.61 insn per cycle
kworker/5:2-31362 5,203 instructions # 0.55 insn per cycle
kworker/3:2-12870 4,866 instructions # 0.67 insn per cycle
kworker/4:1H-1887 3,586 instructions # 0.44 insn per cycle
ksoftirqd/0-7 3,463 instructions # 0.54 insn per cycle
watchdog/0-11 3,135 instructions # 0.45 insn per cycle
watchdog/1-14 3,135 instructions # 0.59 insn per cycle
watchdog/2-20 3,135 instructions # 0.43 insn per cycle
watchdog/3-26 3,135 instructions # 0.36 insn per cycle
watchdog/4-32 3,135 instructions # 0.41 insn per cycle
watchdog/5-38 3,135 instructions # 0.77 insn per cycle
watchdog/6-44 3,135 instructions # 0.79 insn per cycle
watchdog/7-50 3,135 instructions # 0.92 insn per cycle
vmstat-23127 539,181 branches # 345.139 M/sec
perf-24165 375,364 branches # 87.245 M/sec
irqbalance-2780 262,092 branches # 316.593 M/sec
thermald-2841 31,611 branches # 136.915 M/sec
sshd-23111 21,874 branches # 78.596 M/sec
sshd-23058 10,682 branches # 51.528 M/sec
rcu_sched-8 8,693 branches # 101.633 M/sec
kworker/u16:1-18249 7,891 branches # 62.808 M/sec
kworker/0:2-19991 5,761 branches # 42.998 M/sec
kworker/u16:2-23146 4,099 branches # 53.138 M/sec
kworker/4:1-15354 2,755 branches # 97.110 M/sec
gmain-2700 2,638 branches # 63.127 M/sec
kworker/6:0-17528 2,216 branches # 92.739 M/sec
kworker/5:2-31362 1,132 branches # 97.360 M/sec
kworker/3:2-12870 1,081 branches # 105.773 M/sec
kworker/4:1H-1887 725 branches # 54.887 M/sec
ksoftirqd/0-7 707 branches # 79.716 M/sec
watchdog/0-11 652 branches # 59.860 M/sec
watchdog/1-14 652 branches # 76.923 M/sec
watchdog/2-20 652 branches # 268.423 M/sec
watchdog/3-26 652 branches # 225.372 M/sec
watchdog/4-32 652 branches # 236.318 M/sec
watchdog/5-38 652 branches # 441.435 M/sec
watchdog/6-44 652 branches # 437.290 M/sec
watchdog/7-50 652 branches # 221.467 M/sec
vmstat-23127 8,960 branch-misses # 1.66% of all branches
irqbalance-2780 3,047 branch-misses # 1.16% of all branches
perf-24165 2,876 branch-misses # 0.77% of all branches
sshd-23111 1,843 branch-misses # 8.43% of all branches
thermald-2841 1,444 branch-misses # 4.57% of all branches
sshd-23058 1,379 branch-misses # 12.91% of all branches
kworker/u16:1-18249 982 branch-misses # 12.44% of all branches
rcu_sched-8 893 branch-misses # 10.27% of all branches
kworker/u16:2-23146 578 branch-misses # 14.10% of all branches
kworker/0:2-19991 376 branch-misses # 6.53% of all branches
gmain-2700 280 branch-misses # 10.61% of all branches
kworker/6:0-17528 196 branch-misses # 8.84% of all branches
kworker/4:1-15354 187 branch-misses # 6.79% of all branches
kworker/5:2-31362 123 branch-misses # 10.87% of all branches
watchdog/0-11 95 branch-misses # 14.57% of all branches
watchdog/4-32 89 branch-misses # 13.65% of all branches
kworker/3:2-12870 80 branch-misses # 7.40% of all branches
watchdog/3-26 61 branch-misses # 9.36% of all branches
kworker/4:1H-1887 60 branch-misses # 8.28% of all branches
watchdog/2-20 52 branch-misses # 7.98% of all branches
ksoftirqd/0-7 47 branch-misses # 6.65% of all branches
watchdog/1-14 46 branch-misses # 7.06% of all branches
watchdog/7-50 13 branch-misses # 1.99% of all branches
watchdog/5-38 8 branch-misses # 1.23% of all branches
watchdog/6-44 7 branch-misses # 1.07% of all branches

3.695150786 seconds time elapsed

root@skl:/tmp# perf stat --per-thread -M IPC,CPI
^C

Performance counter stats for 'system wide':

vmstat-23127 2,000,783 inst_retired.any # 1.5 IPC
thermald-2841 1,472,670 inst_retired.any # 1.3 IPC
sshd-23111 977,374 inst_retired.any # 1.2 IPC
perf-24163 483,779 inst_retired.any # 0.2 IPC
gmain-2700 341,213 inst_retired.any # 0.9 IPC
sshd-23058 148,891 inst_retired.any # 0.8 IPC
rtkit-daemon-3288 71,210 inst_retired.any # 0.7 IPC
kworker/u16:1-18249 39,562 inst_retired.any # 0.3 IPC
rcu_sched-8 14,474 inst_retired.any # 0.8 IPC
kworker/0:2-19991 7,659 inst_retired.any # 0.2 IPC
kworker/4:1-15354 6,714 inst_retired.any # 0.8 IPC
rtkit-daemon-3289 4,839 inst_retired.any # 0.3 IPC
kworker/6:0-17528 3,321 inst_retired.any # 0.6 IPC
kworker/5:2-31362 3,215 inst_retired.any # 0.5 IPC
kworker/7:2-23145 3,173 inst_retired.any # 0.7 IPC
kworker/4:1H-1887 1,719 inst_retired.any # 0.3 IPC
watchdog/0-11 1,479 inst_retired.any # 0.3 IPC
watchdog/1-14 1,479 inst_retired.any # 0.3 IPC
watchdog/2-20 1,479 inst_retired.any # 0.4 IPC
watchdog/3-26 1,479 inst_retired.any # 0.4 IPC
watchdog/4-32 1,479 inst_retired.any # 0.3 IPC
watchdog/5-38 1,479 inst_retired.any # 0.3 IPC
watchdog/6-44 1,479 inst_retired.any # 0.7 IPC
watchdog/7-50 1,479 inst_retired.any # 0.7 IPC
kworker/u16:2-23146 1,408 inst_retired.any # 0.5 IPC
perf-24163 2,249,872 cpu_clk_unhalted.thread
vmstat-23127 1,352,455 cpu_clk_unhalted.thread
thermald-2841 1,161,140 cpu_clk_unhalted.thread
sshd-23111 807,827 cpu_clk_unhalted.thread
gmain-2700 375,535 cpu_clk_unhalted.thread
sshd-23058 194,071 cpu_clk_unhalted.thread
kworker/u16:1-18249 114,306 cpu_clk_unhalted.thread
rtkit-daemon-3288 103,547 cpu_clk_unhalted.thread
kworker/0:2-19991 46,550 cpu_clk_unhalted.thread
rcu_sched-8 18,855 cpu_clk_unhalted.thread
rtkit-daemon-3289 17,549 cpu_clk_unhalted.thread
kworker/4:1-15354 8,812 cpu_clk_unhalted.thread
kworker/5:2-31362 6,812 cpu_clk_unhalted.thread
kworker/4:1H-1887 5,270 cpu_clk_unhalted.thread
kworker/6:0-17528 5,111 cpu_clk_unhalted.thread
kworker/7:2-23145 4,667 cpu_clk_unhalted.thread
watchdog/0-11 4,663 cpu_clk_unhalted.thread
watchdog/1-14 4,663 cpu_clk_unhalted.thread
watchdog/4-32 4,626 cpu_clk_unhalted.thread
watchdog/5-38 4,403 cpu_clk_unhalted.thread
watchdog/3-26 3,936 cpu_clk_unhalted.thread
watchdog/2-20 3,850 cpu_clk_unhalted.thread
kworker/u16:2-23146 2,654 cpu_clk_unhalted.thread
watchdog/6-44 2,017 cpu_clk_unhalted.thread
watchdog/7-50 2,017 cpu_clk_unhalted.thread
vmstat-23127 2,000,783 inst_retired.any # 0.7 CPI
thermald-2841 1,472,670 inst_retired.any # 0.8 CPI
sshd-23111 977,374 inst_retired.any # 0.8 CPI
perf-24163 495,037 inst_retired.any # 4.7 CPI
gmain-2700 341,213 inst_retired.any # 1.1 CPI
sshd-23058 148,891 inst_retired.any # 1.3 CPI
rtkit-daemon-3288 71,210 inst_retired.any # 1.5 CPI
kworker/u16:1-18249 39,562 inst_retired.any # 2.9 CPI
rcu_sched-8 14,474 inst_retired.any # 1.3 CPI
kworker/0:2-19991 7,659 inst_retired.any # 6.1 CPI
kworker/4:1-15354 6,714 inst_retired.any # 1.3 CPI
rtkit-daemon-3289 4,839 inst_retired.any # 3.6 CPI
kworker/6:0-17528 3,321 inst_retired.any # 1.5 CPI
kworker/5:2-31362 3,215 inst_retired.any # 2.1 CPI
kworker/7:2-23145 3,173 inst_retired.any # 1.5 CPI
kworker/4:1H-1887 1,719 inst_retired.any # 3.1 CPI
watchdog/0-11 1,479 inst_retired.any # 3.2 CPI
watchdog/1-14 1,479 inst_retired.any # 3.2 CPI
watchdog/2-20 1,479 inst_retired.any # 2.6 CPI
watchdog/3-26 1,479 inst_retired.any # 2.7 CPI
watchdog/4-32 1,479 inst_retired.any # 3.1 CPI
watchdog/5-38 1,479 inst_retired.any # 3.0 CPI
watchdog/6-44 1,479 inst_retired.any # 1.4 CPI
watchdog/7-50 1,479 inst_retired.any # 1.4 CPI
kworker/u16:2-23146 1,408 inst_retired.any # 1.9 CPI
perf-24163 2,302,323 cycles
vmstat-23127 1,352,455 cycles
thermald-2841 1,161,140 cycles
sshd-23111 807,827 cycles
gmain-2700 375,535 cycles
sshd-23058 194,071 cycles
kworker/u16:1-18249 114,306 cycles
rtkit-daemon-3288 103,547 cycles
kworker/0:2-19991 46,550 cycles
rcu_sched-8 18,855 cycles
rtkit-daemon-3289 17,549 cycles
kworker/4:1-15354 8,812 cycles
kworker/5:2-31362 6,812 cycles
kworker/4:1H-1887 5,270 cycles
kworker/6:0-17528 5,111 cycles
kworker/7:2-23145 4,667 cycles
watchdog/0-11 4,663 cycles
watchdog/1-14 4,663 cycles
watchdog/4-32 4,626 cycles
watchdog/5-38 4,403 cycles
watchdog/3-26 3,936 cycles
watchdog/2-20 3,850 cycles
kworker/u16:2-23146 2,654 cycles
watchdog/6-44 2,017 cycles
watchdog/7-50 2,017 cycles

2.175726600 seconds time elapsed

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512482591-4646-12-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
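
The IPC and CPI columns in the '-M IPC,CPI' output above can be cross-checked by hand. The sketch below assumes that IPC is `inst_retired.any` divided by `cpu_clk_unhalted.thread` and that CPI is cycles divided by instructions; that reading is inferred from the sample rows, not taken from the metric definitions themselves. The helper names `ipc` and `cpi` are illustrative only.

```python
def ipc(instructions: int, cycles: int) -> float:
    """Instructions per cycle (assumed: inst_retired.any / cpu_clk_unhalted.thread)."""
    return instructions / cycles


def cpi(instructions: int, cycles: int) -> float:
    """Cycles per instruction, the reciprocal of IPC."""
    return cycles / instructions


# (instructions, cycles) pairs copied from the sample output above.
samples = {
    "vmstat-23127": (2_000_783, 1_352_455),
    "thermald-2841": (1_472_670, 1_161_140),
    "sshd-23111": (977_374, 807_827),
}

for name, (inst, cyc) in samples.items():
    print(f"{name}: {ipc(inst, cyc):.1f} IPC, {cpi(inst, cyc):.1f} CPI")
```

For these three threads the computed values round to the figures shown in the listing (1.5/0.7, 1.3/0.8 and 1.2/0.8).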
rtkit-daemon-3288 71,210 inst_retired.any # 1.5 CPI
kworker/u16:1-18249 39,562 inst_retired.any # 2.9 CPI
rcu_sched-8 14,474 inst_retired.any # 1.3 CPI
kworker/0:2-19991 7,659 inst_retired.any # 6.1 CPI
kworker/4:1-15354 6,714 inst_retired.any # 1.3 CPI
rtkit-daemon-3289 4,839 inst_retired.any # 3.6 CPI
kworker/6:0-17528 3,321 inst_retired.any # 1.5 CPI
kworker/5:2-31362 3,215 inst_retired.any # 2.1 CPI
kworker/7:2-23145 3,173 inst_retired.any # 1.5 CPI
kworker/4:1H-1887 1,719 inst_retired.any # 3.1 CPI
watchdog/0-11 1,479 inst_retired.any # 3.2 CPI
watchdog/1-14 1,479 inst_retired.any # 3.2 CPI
watchdog/2-20 1,479 inst_retired.any # 2.6 CPI
watchdog/3-26 1,479 inst_retired.any # 2.7 CPI
watchdog/4-32 1,479 inst_retired.any # 3.1 CPI
watchdog/5-38 1,479 inst_retired.any # 3.0 CPI
watchdog/6-44 1,479 inst_retired.any # 1.4 CPI
watchdog/7-50 1,479 inst_retired.any # 1.4 CPI
kworker/u16:2-23146 1,408 inst_retired.any # 1.9 CPI
perf-24163 2,302,323 cycles
vmstat-23127 1,352,455 cycles
thermald-2841 1,161,140 cycles
sshd-23111 807,827 cycles
gmain-2700 375,535 cycles
sshd-23058 194,071 cycles
kworker/u16:1-18249 114,306 cycles
rtkit-daemon-3288 103,547 cycles
kworker/0:2-19991 46,550 cycles
rcu_sched-8 18,855 cycles
rtkit-daemon-3289 17,549 cycles
kworker/4:1-15354 8,812 cycles
kworker/5:2-31362 6,812 cycles
kworker/4:1H-1887 5,270 cycles
kworker/6:0-17528 5,111 cycles
kworker/7:2-23145 4,667 cycles
watchdog/0-11 4,663 cycles
watchdog/1-14 4,663 cycles
watchdog/4-32 4,626 cycles
watchdog/5-38 4,403 cycles
watchdog/3-26 3,936 cycles
watchdog/2-20 3,850 cycles
kworker/u16:2-23146 2,654 cycles
watchdog/6-44 2,017 cycles
watchdog/7-50 2,017 cycles

2.175726600 seconds time elapsed

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512482591-4646-12-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
diff 29734550 Tue Dec 05 07:03:11 MST 2017 Jin Yao <yao.jin@linux.intel.com> perf stat: Resort '--per-thread' result

Many threads are reported if we enable '--per-thread'
globally.

1. Most of the threads are not counted or have a counting value of 0.
This patch removes these threads.

2. We also re-sort the displayed threads by counting value,
which makes it easy for the user to see the hottest
threads.
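
The two steps above (drop threads with a zero count, then order the rest by count) can be sketched as follows. This is only an illustration in Python, not the code from the patch; the function name `resort_per_thread` and the `idle-thread-0` entry are hypothetical, while the other thread names and cycle counts are copied from the example output.

```python
def resort_per_thread(counts):
    """Drop threads whose count is 0, then order the rest from highest to lowest."""
    nonzero = [(thread, value) for thread, value in counts.items() if value > 0]
    return sorted(nonzero, key=lambda item: item[1], reverse=True)


# Cycle counts per thread; "idle-thread-0" is a made-up uncounted thread.
cycles = {
    "watchdog/7-50": 3418,
    "idle-thread-0": 0,
    "perf-24165": 8833385,
    "vmstat-23127": 1702699,
}

for thread, value in resort_per_thread(cycles):
    print(f"{thread:>20} {value:>12,} cycles")
```

The zero-count entry is filtered out before sorting, so the hottest thread ends up on the first line.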

For example, the new results would be:

root@skl:/tmp# perf stat --per-thread
^C
Performance counter stats for 'system wide':

perf-24165 4.302433 cpu-clock (msec) # 0.001 CPUs utilized
vmstat-23127 1.562215 cpu-clock (msec) # 0.000 CPUs utilized
irqbalance-2780 0.827851 cpu-clock (msec) # 0.000 CPUs utilized
sshd-23111 0.278308 cpu-clock (msec) # 0.000 CPUs utilized
thermald-2841 0.230880 cpu-clock (msec) # 0.000 CPUs utilized
sshd-23058 0.207306 cpu-clock (msec) # 0.000 CPUs utilized
kworker/0:2-19991 0.133983 cpu-clock (msec) # 0.000 CPUs utilized
kworker/u16:1-18249 0.125636 cpu-clock (msec) # 0.000 CPUs utilized
rcu_sched-8 0.085533 cpu-clock (msec) # 0.000 CPUs utilized
kworker/u16:2-23146 0.077139 cpu-clock (msec) # 0.000 CPUs utilized
gmain-2700 0.041789 cpu-clock (msec) # 0.000 CPUs utilized
kworker/4:1-15354 0.028370 cpu-clock (msec) # 0.000 CPUs utilized
kworker/6:0-17528 0.023895 cpu-clock (msec) # 0.000 CPUs utilized
kworker/4:1H-1887 0.013209 cpu-clock (msec) # 0.000 CPUs utilized
kworker/5:2-31362 0.011627 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/0-11 0.010892 cpu-clock (msec) # 0.000 CPUs utilized
kworker/3:2-12870 0.010220 cpu-clock (msec) # 0.000 CPUs utilized
ksoftirqd/0-7 0.008869 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/1-14 0.008476 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/7-50 0.002944 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/3-26 0.002893 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/4-32 0.002759 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/2-20 0.002429 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/6-44 0.001491 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/5-38 0.001477 cpu-clock (msec) # 0.000 CPUs utilized
rcu_sched-8 10 context-switches # 0.117 M/sec
kworker/u16:1-18249 7 context-switches # 0.056 M/sec
sshd-23111 4 context-switches # 0.014 M/sec
vmstat-23127 4 context-switches # 0.003 M/sec
perf-24165 4 context-switches # 0.930 K/sec
kworker/0:2-19991 3 context-switches # 0.022 M/sec
kworker/u16:2-23146 3 context-switches # 0.039 M/sec
kworker/4:1-15354 2 context-switches # 0.070 M/sec
kworker/6:0-17528 2 context-switches # 0.084 M/sec
sshd-23058 2 context-switches # 0.010 M/sec
ksoftirqd/0-7 1 context-switches # 0.113 M/sec
watchdog/0-11 1 context-switches # 0.092 M/sec
watchdog/1-14 1 context-switches # 0.118 M/sec
watchdog/2-20 1 context-switches # 0.412 M/sec
watchdog/3-26 1 context-switches # 0.346 M/sec
watchdog/4-32 1 context-switches # 0.362 M/sec
watchdog/5-38 1 context-switches # 0.677 M/sec
watchdog/6-44 1 context-switches # 0.671 M/sec
watchdog/7-50 1 context-switches # 0.340 M/sec
kworker/4:1H-1887 1 context-switches # 0.076 M/sec
thermald-2841 1 context-switches # 0.004 M/sec
gmain-2700 1 context-switches # 0.024 M/sec
irqbalance-2780 1 context-switches # 0.001 M/sec
kworker/3:2-12870 1 context-switches # 0.098 M/sec
kworker/5:2-31362 1 context-switches # 0.086 M/sec
kworker/u16:1-18249 2 cpu-migrations # 0.016 M/sec
kworker/u16:2-23146 2 cpu-migrations # 0.026 M/sec
rcu_sched-8 1 cpu-migrations # 0.012 M/sec
sshd-23058 1 cpu-migrations # 0.005 M/sec
perf-24165 8,833,385 cycles # 2.053 GHz
vmstat-23127 1,702,699 cycles # 1.090 GHz
irqbalance-2780 739,847 cycles # 0.894 GHz
sshd-23111 269,506 cycles # 0.968 GHz
thermald-2841 204,556 cycles # 0.886 GHz
sshd-23058 158,780 cycles # 0.766 GHz
kworker/0:2-19991 112,981 cycles # 0.843 GHz
kworker/u16:1-18249 100,926 cycles # 0.803 GHz
rcu_sched-8 74,024 cycles # 0.865 GHz
kworker/u16:2-23146 55,984 cycles # 0.726 GHz
gmain-2700 34,278 cycles # 0.820 GHz
kworker/4:1-15354 20,665 cycles # 0.728 GHz
kworker/6:0-17528 16,445 cycles # 0.688 GHz
kworker/5:2-31362 9,492 cycles # 0.816 GHz
watchdog/3-26 8,695 cycles # 3.006 GHz
kworker/4:1H-1887 8,238 cycles # 0.624 GHz
watchdog/4-32 7,580 cycles # 2.747 GHz
kworker/3:2-12870 7,306 cycles # 0.715 GHz
watchdog/2-20 7,274 cycles # 2.995 GHz
watchdog/0-11 6,988 cycles # 0.642 GHz
ksoftirqd/0-7 6,376 cycles # 0.719 GHz
watchdog/1-14 5,340 cycles # 0.630 GHz
watchdog/5-38 4,061 cycles # 2.749 GHz
watchdog/6-44 3,976 cycles # 2.667 GHz
watchdog/7-50 3,418 cycles # 1.161 GHz
vmstat-23127 2,511,699 instructions # 1.48 insn per cycle
perf-24165 1,829,908 instructions # 0.21 insn per cycle
irqbalance-2780 1,190,204 instructions # 1.61 insn per cycle
thermald-2841 143,544 instructions # 0.70 insn per cycle
sshd-23111 128,138 instructions # 0.48 insn per cycle
sshd-23058 57,654 instructions # 0.36 insn per cycle
rcu_sched-8 44,063 instructions # 0.60 insn per cycle
kworker/u16:1-18249 42,551 instructions # 0.42 insn per cycle
kworker/0:2-19991 25,873 instructions # 0.23 insn per cycle
kworker/u16:2-23146 21,407 instructions # 0.38 insn per cycle
gmain-2700 13,691 instructions # 0.40 insn per cycle
kworker/4:1-15354 12,964 instructions # 0.63 insn per cycle
kworker/6:0-17528 10,034 instructions # 0.61 insn per cycle
kworker/5:2-31362 5,203 instructions # 0.55 insn per cycle
kworker/3:2-12870 4,866 instructions # 0.67 insn per cycle
kworker/4:1H-1887 3,586 instructions # 0.44 insn per cycle
ksoftirqd/0-7 3,463 instructions # 0.54 insn per cycle
watchdog/0-11 3,135 instructions # 0.45 insn per cycle
watchdog/1-14 3,135 instructions # 0.59 insn per cycle
watchdog/2-20 3,135 instructions # 0.43 insn per cycle
watchdog/3-26 3,135 instructions # 0.36 insn per cycle
watchdog/4-32 3,135 instructions # 0.41 insn per cycle
watchdog/5-38 3,135 instructions # 0.77 insn per cycle
watchdog/6-44 3,135 instructions # 0.79 insn per cycle
watchdog/7-50 3,135 instructions # 0.92 insn per cycle
vmstat-23127 539,181 branches # 345.139 M/sec
perf-24165 375,364 branches # 87.245 M/sec
irqbalance-2780 262,092 branches # 316.593 M/sec
thermald-2841 31,611 branches # 136.915 M/sec
sshd-23111 21,874 branches # 78.596 M/sec
sshd-23058 10,682 branches # 51.528 M/sec
rcu_sched-8 8,693 branches # 101.633 M/sec
kworker/u16:1-18249 7,891 branches # 62.808 M/sec
kworker/0:2-19991 5,761 branches # 42.998 M/sec
kworker/u16:2-23146 4,099 branches # 53.138 M/sec
kworker/4:1-15354 2,755 branches # 97.110 M/sec
gmain-2700 2,638 branches # 63.127 M/sec
kworker/6:0-17528 2,216 branches # 92.739 M/sec
kworker/5:2-31362 1,132 branches # 97.360 M/sec
kworker/3:2-12870 1,081 branches # 105.773 M/sec
kworker/4:1H-1887 725 branches # 54.887 M/sec
ksoftirqd/0-7 707 branches # 79.716 M/sec
watchdog/0-11 652 branches # 59.860 M/sec
watchdog/1-14 652 branches # 76.923 M/sec
watchdog/2-20 652 branches # 268.423 M/sec
watchdog/3-26 652 branches # 225.372 M/sec
watchdog/4-32 652 branches # 236.318 M/sec
watchdog/5-38 652 branches # 441.435 M/sec
watchdog/6-44 652 branches # 437.290 M/sec
watchdog/7-50 652 branches # 221.467 M/sec
vmstat-23127 8,960 branch-misses # 1.66% of all branches
irqbalance-2780 3,047 branch-misses # 1.16% of all branches
perf-24165 2,876 branch-misses # 0.77% of all branches
sshd-23111 1,843 branch-misses # 8.43% of all branches
thermald-2841 1,444 branch-misses # 4.57% of all branches
sshd-23058 1,379 branch-misses # 12.91% of all branches
kworker/u16:1-18249 982 branch-misses # 12.44% of all branches
rcu_sched-8 893 branch-misses # 10.27% of all branches
kworker/u16:2-23146 578 branch-misses # 14.10% of all branches
kworker/0:2-19991 376 branch-misses # 6.53% of all branches
gmain-2700 280 branch-misses # 10.61% of all branches
kworker/6:0-17528 196 branch-misses # 8.84% of all branches
kworker/4:1-15354 187 branch-misses # 6.79% of all branches
kworker/5:2-31362 123 branch-misses # 10.87% of all branches
watchdog/0-11 95 branch-misses # 14.57% of all branches
watchdog/4-32 89 branch-misses # 13.65% of all branches
kworker/3:2-12870 80 branch-misses # 7.40% of all branches
watchdog/3-26 61 branch-misses # 9.36% of all branches
kworker/4:1H-1887 60 branch-misses # 8.28% of all branches
watchdog/2-20 52 branch-misses # 7.98% of all branches
ksoftirqd/0-7 47 branch-misses # 6.65% of all branches
watchdog/1-14 46 branch-misses # 7.06% of all branches
watchdog/7-50 13 branch-misses # 1.99% of all branches
watchdog/5-38 8 branch-misses # 1.23% of all branches
watchdog/6-44 7 branch-misses # 1.07% of all branches

3.695150786 seconds time elapsed

root@skl:/tmp# perf stat --per-thread -M IPC,CPI
^C

Performance counter stats for 'system wide':

vmstat-23127 2,000,783 inst_retired.any # 1.5 IPC
thermald-2841 1,472,670 inst_retired.any # 1.3 IPC
sshd-23111 977,374 inst_retired.any # 1.2 IPC
perf-24163 483,779 inst_retired.any # 0.2 IPC
gmain-2700 341,213 inst_retired.any # 0.9 IPC
sshd-23058 148,891 inst_retired.any # 0.8 IPC
rtkit-daemon-3288 71,210 inst_retired.any # 0.7 IPC
kworker/u16:1-18249 39,562 inst_retired.any # 0.3 IPC
rcu_sched-8 14,474 inst_retired.any # 0.8 IPC
kworker/0:2-19991 7,659 inst_retired.any # 0.2 IPC
kworker/4:1-15354 6,714 inst_retired.any # 0.8 IPC
rtkit-daemon-3289 4,839 inst_retired.any # 0.3 IPC
kworker/6:0-17528 3,321 inst_retired.any # 0.6 IPC
kworker/5:2-31362 3,215 inst_retired.any # 0.5 IPC
kworker/7:2-23145 3,173 inst_retired.any # 0.7 IPC
kworker/4:1H-1887 1,719 inst_retired.any # 0.3 IPC
watchdog/0-11 1,479 inst_retired.any # 0.3 IPC
watchdog/1-14 1,479 inst_retired.any # 0.3 IPC
watchdog/2-20 1,479 inst_retired.any # 0.4 IPC
watchdog/3-26 1,479 inst_retired.any # 0.4 IPC
watchdog/4-32 1,479 inst_retired.any # 0.3 IPC
watchdog/5-38 1,479 inst_retired.any # 0.3 IPC
watchdog/6-44 1,479 inst_retired.any # 0.7 IPC
watchdog/7-50 1,479 inst_retired.any # 0.7 IPC
kworker/u16:2-23146 1,408 inst_retired.any # 0.5 IPC
perf-24163 2,249,872 cpu_clk_unhalted.thread
vmstat-23127 1,352,455 cpu_clk_unhalted.thread
thermald-2841 1,161,140 cpu_clk_unhalted.thread
sshd-23111 807,827 cpu_clk_unhalted.thread
gmain-2700 375,535 cpu_clk_unhalted.thread
sshd-23058 194,071 cpu_clk_unhalted.thread
kworker/u16:1-18249 114,306 cpu_clk_unhalted.thread
rtkit-daemon-3288 103,547 cpu_clk_unhalted.thread
kworker/0:2-19991 46,550 cpu_clk_unhalted.thread
rcu_sched-8 18,855 cpu_clk_unhalted.thread
rtkit-daemon-3289 17,549 cpu_clk_unhalted.thread
kworker/4:1-15354 8,812 cpu_clk_unhalted.thread
kworker/5:2-31362 6,812 cpu_clk_unhalted.thread
kworker/4:1H-1887 5,270 cpu_clk_unhalted.thread
kworker/6:0-17528 5,111 cpu_clk_unhalted.thread
kworker/7:2-23145 4,667 cpu_clk_unhalted.thread
watchdog/0-11 4,663 cpu_clk_unhalted.thread
watchdog/1-14 4,663 cpu_clk_unhalted.thread
watchdog/4-32 4,626 cpu_clk_unhalted.thread
watchdog/5-38 4,403 cpu_clk_unhalted.thread
watchdog/3-26 3,936 cpu_clk_unhalted.thread
watchdog/2-20 3,850 cpu_clk_unhalted.thread
kworker/u16:2-23146 2,654 cpu_clk_unhalted.thread
watchdog/6-44 2,017 cpu_clk_unhalted.thread
watchdog/7-50 2,017 cpu_clk_unhalted.thread
vmstat-23127 2,000,783 inst_retired.any # 0.7 CPI
thermald-2841 1,472,670 inst_retired.any # 0.8 CPI
sshd-23111 977,374 inst_retired.any # 0.8 CPI
perf-24163 495,037 inst_retired.any # 4.7 CPI
gmain-2700 341,213 inst_retired.any # 1.1 CPI
sshd-23058 148,891 inst_retired.any # 1.3 CPI
rtkit-daemon-3288 71,210 inst_retired.any # 1.5 CPI
kworker/u16:1-18249 39,562 inst_retired.any # 2.9 CPI
rcu_sched-8 14,474 inst_retired.any # 1.3 CPI
kworker/0:2-19991 7,659 inst_retired.any # 6.1 CPI
kworker/4:1-15354 6,714 inst_retired.any # 1.3 CPI
rtkit-daemon-3289 4,839 inst_retired.any # 3.6 CPI
kworker/6:0-17528 3,321 inst_retired.any # 1.5 CPI
kworker/5:2-31362 3,215 inst_retired.any # 2.1 CPI
kworker/7:2-23145 3,173 inst_retired.any # 1.5 CPI
kworker/4:1H-1887 1,719 inst_retired.any # 3.1 CPI
watchdog/0-11 1,479 inst_retired.any # 3.2 CPI
watchdog/1-14 1,479 inst_retired.any # 3.2 CPI
watchdog/2-20 1,479 inst_retired.any # 2.6 CPI
watchdog/3-26 1,479 inst_retired.any # 2.7 CPI
watchdog/4-32 1,479 inst_retired.any # 3.1 CPI
watchdog/5-38 1,479 inst_retired.any # 3.0 CPI
watchdog/6-44 1,479 inst_retired.any # 1.4 CPI
watchdog/7-50 1,479 inst_retired.any # 1.4 CPI
kworker/u16:2-23146 1,408 inst_retired.any # 1.9 CPI
perf-24163 2,302,323 cycles
vmstat-23127 1,352,455 cycles
thermald-2841 1,161,140 cycles
sshd-23111 807,827 cycles
gmain-2700 375,535 cycles
sshd-23058 194,071 cycles
kworker/u16:1-18249 114,306 cycles
rtkit-daemon-3288 103,547 cycles
kworker/0:2-19991 46,550 cycles
rcu_sched-8 18,855 cycles
rtkit-daemon-3289 17,549 cycles
kworker/4:1-15354 8,812 cycles
kworker/5:2-31362 6,812 cycles
kworker/4:1H-1887 5,270 cycles
kworker/6:0-17528 5,111 cycles
kworker/7:2-23145 4,667 cycles
watchdog/0-11 4,663 cycles
watchdog/1-14 4,663 cycles
watchdog/4-32 4,626 cycles
watchdog/5-38 4,403 cycles
watchdog/3-26 3,936 cycles
watchdog/2-20 3,850 cycles
kworker/u16:2-23146 2,654 cycles
watchdog/6-44 2,017 cycles
watchdog/7-50 2,017 cycles

2.175726600 seconds time elapsed

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512482591-4646-12-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
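
The IPC and CPI columns in the '-M IPC,CPI' output above appear to be simple ratios of the two counters printed for each thread. A minimal check in Python, assuming IPC is inst_retired.any divided by cpu_clk_unhalted.thread and CPI is cycles divided by inst_retired.any (the helper names `ipc` and `cpi` are made up for this sketch; the numbers are the vmstat-23127 rows):

```python
def ipc(instructions, cycles):
    """Instructions per cycle."""
    return instructions / cycles


def cpi(instructions, cycles):
    """Cycles per instruction, the reciprocal of IPC."""
    return cycles / instructions


# vmstat-23127: 2,000,783 inst_retired.any and 1,352,455 cycles
print(round(ipc(2_000_783, 1_352_455), 1))  # 1.5
print(round(cpi(2_000_783, 1_352_455), 1))  # 0.7
```

Both values agree with the 1.5 IPC and 0.7 CPI shown for that thread.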
diff 29734550 Tue Dec 05 07:03:11 MST 2017 Jin Yao <yao.jin@linux.intel.com> perf stat: Resort '--per-thread' result

Many threads are reported if we enable '--per-thread'
globally.

1. Most of the threads are not counted or have a counting value of 0.
This patch removes these threads.

2. We also re-sort the displayed threads by counting value,
which makes it easy for the user to see the hottest
threads.

For example, the new results would be:

root@skl:/tmp# perf stat --per-thread
^C
Performance counter stats for 'system wide':

perf-24165 4.302433 cpu-clock (msec) # 0.001 CPUs utilized
vmstat-23127 1.562215 cpu-clock (msec) # 0.000 CPUs utilized
irqbalance-2780 0.827851 cpu-clock (msec) # 0.000 CPUs utilized
sshd-23111 0.278308 cpu-clock (msec) # 0.000 CPUs utilized
thermald-2841 0.230880 cpu-clock (msec) # 0.000 CPUs utilized
sshd-23058 0.207306 cpu-clock (msec) # 0.000 CPUs utilized
kworker/0:2-19991 0.133983 cpu-clock (msec) # 0.000 CPUs utilized
kworker/u16:1-18249 0.125636 cpu-clock (msec) # 0.000 CPUs utilized
rcu_sched-8 0.085533 cpu-clock (msec) # 0.000 CPUs utilized
kworker/u16:2-23146 0.077139 cpu-clock (msec) # 0.000 CPUs utilized
gmain-2700 0.041789 cpu-clock (msec) # 0.000 CPUs utilized
kworker/4:1-15354 0.028370 cpu-clock (msec) # 0.000 CPUs utilized
kworker/6:0-17528 0.023895 cpu-clock (msec) # 0.000 CPUs utilized
kworker/4:1H-1887 0.013209 cpu-clock (msec) # 0.000 CPUs utilized
kworker/5:2-31362 0.011627 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/0-11 0.010892 cpu-clock (msec) # 0.000 CPUs utilized
kworker/3:2-12870 0.010220 cpu-clock (msec) # 0.000 CPUs utilized
ksoftirqd/0-7 0.008869 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/1-14 0.008476 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/7-50 0.002944 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/3-26 0.002893 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/4-32 0.002759 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/2-20 0.002429 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/6-44 0.001491 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/5-38 0.001477 cpu-clock (msec) # 0.000 CPUs utilized
rcu_sched-8 10 context-switches # 0.117 M/sec
kworker/u16:1-18249 7 context-switches # 0.056 M/sec
sshd-23111 4 context-switches # 0.014 M/sec
vmstat-23127 4 context-switches # 0.003 M/sec
perf-24165 4 context-switches # 0.930 K/sec
kworker/0:2-19991 3 context-switches # 0.022 M/sec
kworker/u16:2-23146 3 context-switches # 0.039 M/sec
kworker/4:1-15354 2 context-switches # 0.070 M/sec
kworker/6:0-17528 2 context-switches # 0.084 M/sec
sshd-23058 2 context-switches # 0.010 M/sec
ksoftirqd/0-7 1 context-switches # 0.113 M/sec
watchdog/0-11 1 context-switches # 0.092 M/sec
watchdog/1-14 1 context-switches # 0.118 M/sec
watchdog/2-20 1 context-switches # 0.412 M/sec
watchdog/3-26 1 context-switches # 0.346 M/sec
watchdog/4-32 1 context-switches # 0.362 M/sec
watchdog/5-38 1 context-switches # 0.677 M/sec
watchdog/6-44 1 context-switches # 0.671 M/sec
watchdog/7-50 1 context-switches # 0.340 M/sec
kworker/4:1H-1887 1 context-switches # 0.076 M/sec
thermald-2841 1 context-switches # 0.004 M/sec
gmain-2700 1 context-switches # 0.024 M/sec
irqbalance-2780 1 context-switches # 0.001 M/sec
kworker/3:2-12870 1 context-switches # 0.098 M/sec
kworker/5:2-31362 1 context-switches # 0.086 M/sec
kworker/u16:1-18249 2 cpu-migrations # 0.016 M/sec
kworker/u16:2-23146 2 cpu-migrations # 0.026 M/sec
rcu_sched-8 1 cpu-migrations # 0.012 M/sec
sshd-23058 1 cpu-migrations # 0.005 M/sec
perf-24165 8,833,385 cycles # 2.053 GHz
vmstat-23127 1,702,699 cycles # 1.090 GHz
irqbalance-2780 739,847 cycles # 0.894 GHz
sshd-23111 269,506 cycles # 0.968 GHz
thermald-2841 204,556 cycles # 0.886 GHz
sshd-23058 158,780 cycles # 0.766 GHz
kworker/0:2-19991 112,981 cycles # 0.843 GHz
kworker/u16:1-18249 100,926 cycles # 0.803 GHz
rcu_sched-8 74,024 cycles # 0.865 GHz
kworker/u16:2-23146 55,984 cycles # 0.726 GHz
gmain-2700 34,278 cycles # 0.820 GHz
kworker/4:1-15354 20,665 cycles # 0.728 GHz
kworker/6:0-17528 16,445 cycles # 0.688 GHz
kworker/5:2-31362 9,492 cycles # 0.816 GHz
watchdog/3-26 8,695 cycles # 3.006 GHz
kworker/4:1H-1887 8,238 cycles # 0.624 GHz
watchdog/4-32 7,580 cycles # 2.747 GHz
kworker/3:2-12870 7,306 cycles # 0.715 GHz
watchdog/2-20 7,274 cycles # 2.995 GHz
watchdog/0-11 6,988 cycles # 0.642 GHz
ksoftirqd/0-7 6,376 cycles # 0.719 GHz
watchdog/1-14 5,340 cycles # 0.630 GHz
watchdog/5-38 4,061 cycles # 2.749 GHz
watchdog/6-44 3,976 cycles # 2.667 GHz
watchdog/7-50 3,418 cycles # 1.161 GHz
vmstat-23127 2,511,699 instructions # 1.48 insn per cycle
perf-24165 1,829,908 instructions # 0.21 insn per cycle
irqbalance-2780 1,190,204 instructions # 1.61 insn per cycle
thermald-2841 143,544 instructions # 0.70 insn per cycle
sshd-23111 128,138 instructions # 0.48 insn per cycle
sshd-23058 57,654 instructions # 0.36 insn per cycle
rcu_sched-8 44,063 instructions # 0.60 insn per cycle
kworker/u16:1-18249 42,551 instructions # 0.42 insn per cycle
kworker/0:2-19991 25,873 instructions # 0.23 insn per cycle
kworker/u16:2-23146 21,407 instructions # 0.38 insn per cycle
gmain-2700 13,691 instructions # 0.40 insn per cycle
kworker/4:1-15354 12,964 instructions # 0.63 insn per cycle
kworker/6:0-17528 10,034 instructions # 0.61 insn per cycle
kworker/5:2-31362 5,203 instructions # 0.55 insn per cycle
kworker/3:2-12870 4,866 instructions # 0.67 insn per cycle
kworker/4:1H-1887 3,586 instructions # 0.44 insn per cycle
ksoftirqd/0-7 3,463 instructions # 0.54 insn per cycle
watchdog/0-11 3,135 instructions # 0.45 insn per cycle
watchdog/1-14 3,135 instructions # 0.59 insn per cycle
watchdog/2-20 3,135 instructions # 0.43 insn per cycle
watchdog/3-26 3,135 instructions # 0.36 insn per cycle
watchdog/4-32 3,135 instructions # 0.41 insn per cycle
watchdog/5-38 3,135 instructions # 0.77 insn per cycle
watchdog/6-44 3,135 instructions # 0.79 insn per cycle
watchdog/7-50 3,135 instructions # 0.92 insn per cycle
vmstat-23127 539,181 branches # 345.139 M/sec
perf-24165 375,364 branches # 87.245 M/sec
irqbalance-2780 262,092 branches # 316.593 M/sec
thermald-2841 31,611 branches # 136.915 M/sec
sshd-23111 21,874 branches # 78.596 M/sec
sshd-23058 10,682 branches # 51.528 M/sec
rcu_sched-8 8,693 branches # 101.633 M/sec
kworker/u16:1-18249 7,891 branches # 62.808 M/sec
kworker/0:2-19991 5,761 branches # 42.998 M/sec
kworker/u16:2-23146 4,099 branches # 53.138 M/sec
kworker/4:1-15354 2,755 branches # 97.110 M/sec
gmain-2700 2,638 branches # 63.127 M/sec
kworker/6:0-17528 2,216 branches # 92.739 M/sec
kworker/5:2-31362 1,132 branches # 97.360 M/sec
kworker/3:2-12870 1,081 branches # 105.773 M/sec
kworker/4:1H-1887 725 branches # 54.887 M/sec
ksoftirqd/0-7 707 branches # 79.716 M/sec
watchdog/0-11 652 branches # 59.860 M/sec
watchdog/1-14 652 branches # 76.923 M/sec
watchdog/2-20 652 branches # 268.423 M/sec
watchdog/3-26 652 branches # 225.372 M/sec
watchdog/4-32 652 branches # 236.318 M/sec
watchdog/5-38 652 branches # 441.435 M/sec
watchdog/6-44 652 branches # 437.290 M/sec
watchdog/7-50 652 branches # 221.467 M/sec
vmstat-23127 8,960 branch-misses # 1.66% of all branches
irqbalance-2780 3,047 branch-misses # 1.16% of all branches
perf-24165 2,876 branch-misses # 0.77% of all branches
sshd-23111 1,843 branch-misses # 8.43% of all branches
thermald-2841 1,444 branch-misses # 4.57% of all branches
sshd-23058 1,379 branch-misses # 12.91% of all branches
kworker/u16:1-18249 982 branch-misses # 12.44% of all branches
rcu_sched-8 893 branch-misses # 10.27% of all branches
kworker/u16:2-23146 578 branch-misses # 14.10% of all branches
kworker/0:2-19991 376 branch-misses # 6.53% of all branches
gmain-2700 280 branch-misses # 10.61% of all branches
kworker/6:0-17528 196 branch-misses # 8.84% of all branches
kworker/4:1-15354 187 branch-misses # 6.79% of all branches
kworker/5:2-31362 123 branch-misses # 10.87% of all branches
watchdog/0-11 95 branch-misses # 14.57% of all branches
watchdog/4-32 89 branch-misses # 13.65% of all branches
kworker/3:2-12870 80 branch-misses # 7.40% of all branches
watchdog/3-26 61 branch-misses # 9.36% of all branches
kworker/4:1H-1887 60 branch-misses # 8.28% of all branches
watchdog/2-20 52 branch-misses # 7.98% of all branches
ksoftirqd/0-7 47 branch-misses # 6.65% of all branches
watchdog/1-14 46 branch-misses # 7.06% of all branches
watchdog/7-50 13 branch-misses # 1.99% of all branches
watchdog/5-38 8 branch-misses # 1.23% of all branches
watchdog/6-44 7 branch-misses # 1.07% of all branches

3.695150786 seconds time elapsed
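
The values after '#' in the output above appear to be simple ratios of the raw counters on the same thread. The sketch below recomputes three of them for vmstat-23127, assuming that GHz is cycles over cpu-clock time, "insn per cycle" is instructions over cycles, and the branch-miss percentage is branch-misses over branches. The `derived` helper is hypothetical and is not part of perf.

```python
def derived(cpu_clock_msec: float, cycles: int, instructions: int,
            branches: int, branch_misses: int) -> dict:
    """Recompute the derived columns from the raw per-thread counters."""
    return {
        # cycles per second, expressed in GHz (msec * 1e6 = nanoseconds)
        "ghz": cycles / (cpu_clock_msec * 1e6),
        # instructions retired per cycle
        "insn_per_cycle": instructions / cycles,
        # share of branches that were mispredicted, in percent
        "branch_miss_pct": 100.0 * branch_misses / branches,
    }


# Raw counters for vmstat-23127, copied from the output above.
m = derived(cpu_clock_msec=1.562215, cycles=1_702_699,
            instructions=2_511_699, branches=539_181, branch_misses=8_960)

print(f"{m['ghz']:.3f} GHz")
print(f"{m['insn_per_cycle']:.2f} insn per cycle")
print(f"{m['branch_miss_pct']:.2f}% of all branches")
```

Comparing these values with the vmstat-23127 rows is a quick way to confirm that each derived column is computed per thread rather than system-wide.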

root@skl:/tmp# perf stat --per-thread -M IPC,CPI
^C

Performance counter stats for 'system wide':

vmstat-23127 2,000,783 inst_retired.any # 1.5 IPC
thermald-2841 1,472,670 inst_retired.any # 1.3 IPC
sshd-23111 977,374 inst_retired.any # 1.2 IPC
perf-24163 483,779 inst_retired.any # 0.2 IPC
gmain-2700 341,213 inst_retired.any # 0.9 IPC
sshd-23058 148,891 inst_retired.any # 0.8 IPC
rtkit-daemon-3288 71,210 inst_retired.any # 0.7 IPC
kworker/u16:1-18249 39,562 inst_retired.any # 0.3 IPC
rcu_sched-8 14,474 inst_retired.any # 0.8 IPC
kworker/0:2-19991 7,659 inst_retired.any # 0.2 IPC
kworker/4:1-15354 6,714 inst_retired.any # 0.8 IPC
rtkit-daemon-3289 4,839 inst_retired.any # 0.3 IPC
kworker/6:0-17528 3,321 inst_retired.any # 0.6 IPC
kworker/5:2-31362 3,215 inst_retired.any # 0.5 IPC
kworker/7:2-23145 3,173 inst_retired.any # 0.7 IPC
kworker/4:1H-1887 1,719 inst_retired.any # 0.3 IPC
watchdog/0-11 1,479 inst_retired.any # 0.3 IPC
watchdog/1-14 1,479 inst_retired.any # 0.3 IPC
watchdog/2-20 1,479 inst_retired.any # 0.4 IPC
watchdog/3-26 1,479 inst_retired.any # 0.4 IPC
watchdog/4-32 1,479 inst_retired.any # 0.3 IPC
watchdog/5-38 1,479 inst_retired.any # 0.3 IPC
watchdog/6-44 1,479 inst_retired.any # 0.7 IPC
watchdog/7-50 1,479 inst_retired.any # 0.7 IPC
kworker/u16:2-23146 1,408 inst_retired.any # 0.5 IPC
perf-24163 2,249,872 cpu_clk_unhalted.thread
vmstat-23127 1,352,455 cpu_clk_unhalted.thread
thermald-2841 1,161,140 cpu_clk_unhalted.thread
sshd-23111 807,827 cpu_clk_unhalted.thread
gmain-2700 375,535 cpu_clk_unhalted.thread
sshd-23058 194,071 cpu_clk_unhalted.thread
kworker/u16:1-18249 114,306 cpu_clk_unhalted.thread
rtkit-daemon-3288 103,547 cpu_clk_unhalted.thread
kworker/0:2-19991 46,550 cpu_clk_unhalted.thread
rcu_sched-8 18,855 cpu_clk_unhalted.thread
rtkit-daemon-3289 17,549 cpu_clk_unhalted.thread
kworker/4:1-15354 8,812 cpu_clk_unhalted.thread
kworker/5:2-31362 6,812 cpu_clk_unhalted.thread
kworker/4:1H-1887 5,270 cpu_clk_unhalted.thread
kworker/6:0-17528 5,111 cpu_clk_unhalted.thread
kworker/7:2-23145 4,667 cpu_clk_unhalted.thread
watchdog/0-11 4,663 cpu_clk_unhalted.thread
watchdog/1-14 4,663 cpu_clk_unhalted.thread
watchdog/4-32 4,626 cpu_clk_unhalted.thread
watchdog/5-38 4,403 cpu_clk_unhalted.thread
watchdog/3-26 3,936 cpu_clk_unhalted.thread
watchdog/2-20 3,850 cpu_clk_unhalted.thread
kworker/u16:2-23146 2,654 cpu_clk_unhalted.thread
watchdog/6-44 2,017 cpu_clk_unhalted.thread
watchdog/7-50 2,017 cpu_clk_unhalted.thread
vmstat-23127 2,000,783 inst_retired.any # 0.7 CPI
thermald-2841 1,472,670 inst_retired.any # 0.8 CPI
sshd-23111 977,374 inst_retired.any # 0.8 CPI
perf-24163 495,037 inst_retired.any # 4.7 CPI
gmain-2700 341,213 inst_retired.any # 1.1 CPI
sshd-23058 148,891 inst_retired.any # 1.3 CPI
rtkit-daemon-3288 71,210 inst_retired.any # 1.5 CPI
kworker/u16:1-18249 39,562 inst_retired.any # 2.9 CPI
rcu_sched-8 14,474 inst_retired.any # 1.3 CPI
kworker/0:2-19991 7,659 inst_retired.any # 6.1 CPI
kworker/4:1-15354 6,714 inst_retired.any # 1.3 CPI
rtkit-daemon-3289 4,839 inst_retired.any # 3.6 CPI
kworker/6:0-17528 3,321 inst_retired.any # 1.5 CPI
kworker/5:2-31362 3,215 inst_retired.any # 2.1 CPI
kworker/7:2-23145 3,173 inst_retired.any # 1.5 CPI
kworker/4:1H-1887 1,719 inst_retired.any # 3.1 CPI
watchdog/0-11 1,479 inst_retired.any # 3.2 CPI
watchdog/1-14 1,479 inst_retired.any # 3.2 CPI
watchdog/2-20 1,479 inst_retired.any # 2.6 CPI
watchdog/3-26 1,479 inst_retired.any # 2.7 CPI
watchdog/4-32 1,479 inst_retired.any # 3.1 CPI
watchdog/5-38 1,479 inst_retired.any # 3.0 CPI
watchdog/6-44 1,479 inst_retired.any # 1.4 CPI
watchdog/7-50 1,479 inst_retired.any # 1.4 CPI
kworker/u16:2-23146 1,408 inst_retired.any # 1.9 CPI
perf-24163 2,302,323 cycles
vmstat-23127 1,352,455 cycles
thermald-2841 1,161,140 cycles
sshd-23111 807,827 cycles
gmain-2700 375,535 cycles
sshd-23058 194,071 cycles
kworker/u16:1-18249 114,306 cycles
rtkit-daemon-3288 103,547 cycles
kworker/0:2-19991 46,550 cycles
rcu_sched-8 18,855 cycles
rtkit-daemon-3289 17,549 cycles
kworker/4:1-15354 8,812 cycles
kworker/5:2-31362 6,812 cycles
kworker/4:1H-1887 5,270 cycles
kworker/6:0-17528 5,111 cycles
kworker/7:2-23145 4,667 cycles
watchdog/0-11 4,663 cycles
watchdog/1-14 4,663 cycles
watchdog/4-32 4,626 cycles
watchdog/5-38 4,403 cycles
watchdog/3-26 3,936 cycles
watchdog/2-20 3,850 cycles
kworker/u16:2-23146 2,654 cycles
watchdog/6-44 2,017 cycles
watchdog/7-50 2,017 cycles

2.175726600 seconds time elapsed
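
The IPC and CPI columns above can be cross-checked against the raw counts. The sketch assumes IPC is inst_retired.any divided by cpu_clk_unhalted.thread and CPI is cycles divided by inst_retired.any; the exact metric expressions live in perf's per-CPU metric definitions and may differ, so treat the helper names `ipc` and `cpi` as hypothetical.

```python
def ipc(inst_retired_any: int, cpu_clk_unhalted_thread: int) -> float:
    """Instructions per cycle (assumed definition of the IPC metric)."""
    return inst_retired_any / cpu_clk_unhalted_thread


def cpi(cycles: int, inst_retired_any: int) -> float:
    """Cycles per instruction (assumed definition of the CPI metric)."""
    return cycles / inst_retired_any


# (instructions, unhalted cycles) pairs copied from the output above.
rows = {
    "vmstat-23127": (2_000_783, 1_352_455),
    "kworker/4:1-15354": (6_714, 8_812),
}

for name, (insts, cycles) in rows.items():
    print(f"{name:>20} {ipc(insts, cycles):.1f} IPC {cpi(cycles, insts):.1f} CPI")
```

Under these assumptions the two metrics are reciprocals of each other whenever both groups observe the same counts, which is the case for every thread above except perf itself.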

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512482591-4646-12-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
