History log of /linux-master/tools/perf/builtin-kmem.c
Revision Date Author Comments
# 8ab12a20 08-Jun-2023 Ian Rogers <irogers@google.com>

perf callchain: Use pthread keys for tls callchain_cursor

Pthread keys are more portable than __thread and allow the association
of a destructor with the key. Use the destructor to clean up TLS
callchain cursors to aid understanding memory leaks.

Committer notes:

Had to fixup a series of unconverted places and also check for the
return of get_tls_callchain_cursor() as it may fail and return NULL.

In that unlikely case we now either print something to a file, if the
caller was expecting to print a callchain, or return an error code to
state that resolving the callchain isn't possible.

In some cases this was made easier because thread__resolve_callchain()
already can fail for other reasons, so this new one (cursor == NULL) can
be added and the callers don't have to explicitly check for this new
condition.
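
As an illustration only (this is not the code from the patch), the
pthread key pattern described above can be sketched as follows. The
names 'struct cursor' and get_tls_cursor() are hypothetical stand-ins;
only the pthread_* calls are real API. The getter may return NULL, which
is why callers have to check its result.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for perf's per-thread callchain cursor. */
struct cursor {
	unsigned long nr;
};

static pthread_key_t cursor_key;
static pthread_once_t cursor_once = PTHREAD_ONCE_INIT;
static int cursor_key_ok;

/* Called at thread exit for each thread that stored a non-NULL value. */
static void cursor_destroy(void *ptr)
{
	free(ptr);
}

/* Create the key exactly once and remember whether that worked. */
static void cursor_key_init(void)
{
	cursor_key_ok = (pthread_key_create(&cursor_key, cursor_destroy) == 0);
}

/* Return the calling thread's cursor, or NULL if it cannot be set up. */
struct cursor *get_tls_cursor(void)
{
	struct cursor *c;

	pthread_once(&cursor_once, cursor_key_init);
	if (!cursor_key_ok)
		return NULL;

	c = pthread_getspecific(cursor_key);
	if (c == NULL) {
		c = calloc(1, sizeof(*c));
		if (c == NULL)
			return NULL;
		if (pthread_setspecific(cursor_key, c) != 0) {
			free(c);
			return NULL;
		}
	}
	return c;
}
```

Unlike a __thread variable, the key carries a destructor, so the
per-thread allocation is released when the thread exits.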

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ali Saidi <alisaidi@amazon.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Brian Robbins <brianrob@linux.microsoft.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
Cc: Fangrui Song <maskray@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Babrou <ivan@cloudflare.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Wenyu Liu <liuwenyu7@huawei.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Ye Xingchen <ye.xingchen@zte.com.cn>
Cc: Yuan Can <yuancan@huawei.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230608232823.4027869-25-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 0dd5041c 08-Jun-2023 Ian Rogers <irogers@google.com>

perf addr_location: Add init/exit/copy functions

struct addr_location holds references to multiple reference counted
objects. Add init/exit functions to make maintenance of those more
consistent with the rest of the code and to try to avoid
leaks. Modification of thread reference counts isn't included in this
change.
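
A minimal sketch of the init/exit/copy idea, under the assumption of a
toy reference-counted type. 'struct obj', 'struct location' and the
location__*() functions are hypothetical names made up for this example
and are much simpler than the real 'struct addr_location'.

```c
#include <stddef.h>

/* Hypothetical reference-counted object (stands in for a map, etc.). */
struct obj {
	int refcnt;
};

static struct obj *obj__get(struct obj *o)
{
	if (o)
		o->refcnt++;
	return o;
}

static void obj__put(struct obj *o)
{
	if (o)
		o->refcnt--;	/* a real implementation would free at zero */
}

/* Hypothetical container that owns references to other objects. */
struct location {
	struct obj *map;
	struct obj *owner;
};

/* Start from a known state so that exit is always safe to call. */
void location__init(struct location *loc)
{
	loc->map = NULL;
	loc->owner = NULL;
}

/* Drop every reference held and return to the initial state. */
void location__exit(struct location *loc)
{
	obj__put(loc->map);
	obj__put(loc->owner);
	location__init(loc);
}

/* Take new references before dropping old ones, so self-copy is safe. */
void location__copy(struct location *dst, const struct location *src)
{
	struct obj *map = obj__get(src->map);
	struct obj *owner = obj__get(src->owner);

	location__exit(dst);
	dst->map = map;
	dst->owner = owner;
}
```

Pairing every init with an exit keeps the get/put calls in one place,
which is what makes leaks easier to spot.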

Committer notes:

I needed to initialize result to sample->ip to make sure it is set to
something, fixing a compile time error, mostly keeping the previous
logic as build_alloc_func_list() already does debugging/error prints
about what went wrong if it takes the 'goto out'.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ali Saidi <alisaidi@amazon.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Brian Robbins <brianrob@linux.microsoft.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
Cc: Fangrui Song <maskray@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Babrou <ivan@cloudflare.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Wenyu Liu <liuwenyu7@huawei.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Ye Xingchen <ye.xingchen@zte.com.cn>
Cc: Yuan Can <yuancan@huawei.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230608232823.4027869-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# ee84a303 08-Jun-2023 Ian Rogers <irogers@google.com>

perf thread: Add accessor functions for thread

Using accessors will make it easier to add reference count checking in
later patches.

Committer notes:

thread->nsinfo wasn't wrapped as it is used together with
nsinfo__zput(), which does a trick to set the field with a refcount
being dropped to NULL, and that doesn't work well with using
thread__nsinfo(thread), that loses the &thread->nsinfo pointer.

When refcount checking is added to 'struct thread', later in this
series, nsinfo__zput(RC_CHK_ACCESS(thread)->nsinfo) will be used to
check the thread pointer.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ali Saidi <alisaidi@amazon.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Brian Robbins <brianrob@linux.microsoft.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
Cc: Fangrui Song <maskray@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Babrou <ivan@cloudflare.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Wenyu Liu <liuwenyu7@huawei.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Ye Xingchen <ye.xingchen@zte.com.cn>
Cc: Yuan Can <yuancan@huawei.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230608232823.4027869-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# f12ad272 10-Apr-2023 Ian Rogers <irogers@google.com>

perf util: Move input_name to util

'input_name' is the name of the input perf.data file; it is used by data
convert and ui code. Move it to util to make it more consistent with
other global state.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Chengdong Li <chengdongli@tencent.com>
Cc: Denis Nikitin <denik@chromium.org>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Raul Silvera <rsilvera@google.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230410162511.3055900-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 78a1f7cd 04-Apr-2023 Ian Rogers <irogers@google.com>

perf map: Add helper for ->map_ip() and ->unmap_ip()

Later changes will add reference count checking for struct map. Add a
helper function to invoke the map_ip and unmap_ip function pointers. The
helper allows the reference count check to be in fewer places.
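
A rough sketch of what such a helper looks like. The struct layout and
the simple_*() callbacks are assumptions made for this example; the
helper names follow perf's usual 'map__' prefix but the code is not
copied from the patch.

```c
#include <stdint.h>

struct map;
typedef uint64_t (*map_ip_fn)(const struct map *map, uint64_t ip);

/* Hypothetical, heavily simplified layout. */
struct map {
	uint64_t start;
	uint64_t pgoff;
	map_ip_fn map_ip;
	map_ip_fn unmap_ip;
};

/* Example callbacks: translate between memory and file-relative addresses. */
static uint64_t simple_map_ip(const struct map *map, uint64_t ip)
{
	return ip - map->start + map->pgoff;
}

static uint64_t simple_unmap_ip(const struct map *map, uint64_t ip)
{
	return ip + map->start - map->pgoff;
}

/* The helpers are the only place that touches the function pointers. */
static inline uint64_t map__map_ip(const struct map *map, uint64_t ip)
{
	return map->map_ip(map, ip);
}

static inline uint64_t map__unmap_ip(const struct map *map, uint64_t ip)
{
	return map->unmap_ip(map, ip);
}
```

Callers write map__map_ip(map, ip) instead of map->map_ip(map, ip), so a
later reference count check only has to be added inside the two helpers.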

Committer notes:

Add missing conversions to:

tools/perf/util/map.c
tools/perf/util/cs-etm.c
tools/perf/util/annotate.c
tools/perf/arch/powerpc/util/sym-handling.c
tools/perf/arch/s390/annotate/instructions.c

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: https://lore.kernel.org/r/20230404205954.2245628-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 0e6aa013 04-Apr-2023 Ian Rogers <irogers@google.com>

perf map: Rename map_ip() and unmap_ip()

Add dso to match comment. This avoids a naming conflict with later
added accessor functions for variables in struct map.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: https://lore.kernel.org/r/20230404205954.2245628-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 2973d822 13-Jan-2023 NeilBrown <neilb@suse.de>

mm: discard __GFP_ATOMIC

__GFP_ATOMIC serves little purpose. Its main effect is to set
ALLOC_HARDER which adds a few little boosts to increase the chance of an
allocation succeeding, one of which is to lower the water-mark at which it
will succeed.

It is *always* paired with __GFP_HIGH which sets ALLOC_HIGH which also
adjusts this watermark. It is probable that other users of __GFP_HIGH
should benefit from the other little bonuses that __GFP_ATOMIC gets.

__GFP_ATOMIC also gives a warning if used with __GFP_DIRECT_RECLAIM.
There is little point to this. We already get a might_sleep() warning if
__GFP_DIRECT_RECLAIM is set.

__GFP_ATOMIC allows the "watermark_boost" to be side-stepped. It is
probable that testing ALLOC_HARDER is a better fit here.

__GFP_ATOMIC is used by tegra-smmu.c to check if the allocation might
sleep. This should test __GFP_DIRECT_RECLAIM instead.

This patch:
- removes __GFP_ATOMIC
- allows __GFP_HIGH allocations to ignore watermark boosting as well
as GFP_ATOMIC requests.
- makes other adjustments as suggested by the above.

The net result is no change to GFP_ATOMIC allocations. Other
allocations that use __GFP_HIGH will benefit from a few different extra
privileges. This affects:
xen, dm, md, ntfs3
the vermillion frame buffer
hibernation
ksm
swap
all of which likely produce more benefit than cost if these selected
allocations are more likely to succeed quickly.

[mgorman: Minor adjustments to rework on top of a series]
Link: https://lkml.kernel.org/r/163712397076.13692.4727608274002939094@noble.neil.brown.name
Link: https://lkml.kernel.org/r/20230113111217.14134-7-mgorman@techsingularity.net
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


# dce088ab 07-Jan-2023 Leo Yan <leo.yan@linaro.org>

perf kmem: Support field "node" in evsel__process_alloc_event() coping with recent tracepoint restructuring

Commit 11e9734bcb6a7361 ("mm/slab_common: unify NUMA and UMA version of
tracepoints") adds the field "node" into the tracepoints 'kmalloc' and
'kmem_cache_alloc', so this patch modifies the event process function to
support the field "node".

If the field "node" is detected with evsel__field(), it is used to
account cross-node allocations.

When the "node" value is NUMA_NO_NODE (-1), it means the memory can be
allocated from any memory node, in this case, we don't account it as a
cross allocation.
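
The accounting rule can be sketched as below. The function name and its
parameters are hypothetical; only NUMA_NO_NODE and its value (-1) come
from the description above.

```c
#include <stdbool.h>

#define NUMA_NO_NODE (-1)

/*
 * Hypothetical helper: decide whether one allocation event counts as a
 * cross-node allocation.
 *
 * cpu_node       - node of the CPU that performed the allocation
 * has_node_field - whether the tracepoint carries a "node" field at all
 * alloc_node     - value of that field (ignored if the field is absent)
 */
bool is_cross_node_alloc(int cpu_node, bool has_node_field, int alloc_node)
{
	if (!has_node_field)
		return false;
	/* "Any node" requests are never counted as cross allocations. */
	if (alloc_node == NUMA_NO_NODE)
		return false;
	return alloc_node != cpu_node;
}
```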

Fixes: 11e9734bcb6a7361 ("mm/slab_common: unify NUMA and UMA version of tracepoints")
Reported-by: Ravi Bangoria <ravi.bangoria@amd.com>
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Link: https://lore.kernel.org/r/20230108062400.250690-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# b3719108 07-Jan-2023 Leo Yan <leo.yan@linaro.org>

perf kmem: Support legacy tracepoints

Commit 11e9734bcb6a7361 ("mm/slab_common: unify NUMA and UMA version of
tracepoints") removed tracepoints 'kmalloc_node' and
'kmem_cache_alloc_node', but the tool should stay backward
compatible.

If it detects the tracepoint "kmem:kmalloc_node", this patch enables the
legacy tracepoints, otherwise, it will ignore them.

Fixes: 11e9734bcb6a7361 ("mm/slab_common: unify NUMA and UMA version of tracepoints")
Reported-by: Ravi Bangoria <ravi.bangoria@amd.com>
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Link: https://lore.kernel.org/r/20230108062400.250690-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 378ef0f5 05-Dec-2022 Ian Rogers <irogers@google.com>

perf build: Use libtraceevent from the system

Remove the LIBTRACEEVENT_DYNAMIC and LIBTRACEFS_DYNAMIC make command
line variables.

If libtraceevent isn't installed or NO_LIBTRACEEVENT=1 is passed to the
build, don't compile in libtraceevent and libtracefs support.

This also disables CONFIG_TRACE that controls "perf trace".

CONFIG_LIBTRACEEVENT is used to control enablement in Build/Makefiles,
HAVE_LIBTRACEEVENT is used in C code.

Without HAVE_LIBTRACEEVENT tracepoints are disabled and as such the
commands kmem, kwork, lock, sched and timechart are removed. The
majority of commands continue to work including "perf test".

Committer notes:

Fixed up a tools/perf/util/Build reject and added:

#include <traceevent/event-parse.h>

to tools/perf/util/scripting-engines/trace-event-perl.c.

Committer testing:

$ rpm -qi libtraceevent-devel
Name : libtraceevent-devel
Version : 1.5.3
Release : 2.fc36
Architecture: x86_64
Install Date: Mon 25 Jul 2022 03:20:19 PM -03
Group : Unspecified
Size : 27728
License : LGPLv2+ and GPLv2+
Signature : RSA/SHA256, Fri 15 Apr 2022 02:11:58 PM -03, Key ID 999f7cbf38ab71f4
Source RPM : libtraceevent-1.5.3-2.fc36.src.rpm
Build Date : Fri 15 Apr 2022 10:57:01 AM -03
Build Host : buildvm-x86-05.iad2.fedoraproject.org
Packager : Fedora Project
Vendor : Fedora Project
URL : https://git.kernel.org/pub/scm/libs/libtrace/libtraceevent.git/
Bug URL : https://bugz.fedoraproject.org/libtraceevent
Summary : Development headers of libtraceevent
Description :
Development headers of libtraceevent-libs
$

Default build:

$ ldd ~/bin/perf | grep tracee
libtraceevent.so.1 => /lib64/libtraceevent.so.1 (0x00007f1dcaf8f000)
$

# perf trace -e sched:* --max-events 10
0.000 migration/0/17 sched:sched_migrate_task(comm: "", pid: 1603763 (perf), prio: 120, dest_cpu: 1)
0.005 migration/0/17 sched:sched_wake_idle_without_ipi(cpu: 1)
0.011 migration/0/17 sched:sched_switch(prev_comm: "", prev_pid: 17 (migration/0), prev_state: 1, next_comm: "", next_prio: 120)
1.173 :0/0 sched:sched_wakeup(comm: "", pid: 3138 (gnome-terminal-), prio: 120)
1.180 :0/0 sched:sched_switch(prev_comm: "", prev_prio: 120, next_comm: "", next_pid: 3138 (gnome-terminal-), next_prio: 120)
0.156 migration/1/21 sched:sched_migrate_task(comm: "", pid: 1603763 (perf), prio: 120, orig_cpu: 1, dest_cpu: 2)
0.160 migration/1/21 sched:sched_wake_idle_without_ipi(cpu: 2)
0.166 migration/1/21 sched:sched_switch(prev_comm: "", prev_pid: 21 (migration/1), prev_state: 1, next_comm: "", next_prio: 120)
1.183 :0/0 sched:sched_wakeup(comm: "", pid: 1602985 (kworker/u16:0-f), prio: 120, target_cpu: 1)
1.186 :0/0 sched:sched_switch(prev_comm: "", prev_prio: 120, next_comm: "", next_pid: 1602985 (kworker/u16:0-f), next_prio: 120)
#

Had to tweak tools/perf/util/setup.py to make sure the python binding
shared object links with libtraceevent if -DHAVE_LIBTRACEEVENT is
present in CFLAGS.

Building with NO_LIBTRACEEVENT=1 uncovered some more build failures:

- Make building of data-convert-bt.c depend on CONFIG_LIBTRACEEVENT=y

- perf-$(CONFIG_LIBTRACEEVENT) += scripts/

- bpf_kwork.o needs also to be dependent on CONFIG_LIBTRACEEVENT=y

- The python binding needed some fixups and util/trace-event.c can't be
built and linked with the python binding shared object, so remove it
in tools/perf/util/setup.py and exclude it from the list of
dependencies in the python/perf.so Makefile.perf target.

Building without libtraceevent-devel installed uncovered more build
failures:

- The python binding tools/perf/util/python.c was assuming that
traceevent/parse-events.h was always available, which was the case
when we defaulted to using the in-kernel tools/lib/traceevent/ files,
now we need to enclose it under ifdef HAVE_LIBTRACEEVENT, just like
the other parts of it that deal with tracepoints.

- We have to ifdef the rules in the Build files with
CONFIG_LIBTRACEEVENT=y to build builtin-trace.c and
tools/perf/trace/beauty/ as we only ifdef setting CONFIG_TRACE=y when
setting NO_LIBTRACEEVENT=1 in the make command line, not when we don't
detect libtraceevent-devel installed in the system. Simplification here
to avoid these two ways of disabling builtin-trace.c and not having
CONFIG_TRACE=y when libtraceevent-devel isn't installed is the clean
way.

From Athira:

<quote>
tools/perf/arch/powerpc/util/Build
-perf-y += kvm-stat.o
+perf-$(CONFIG_LIBTRACEEVENT) += kvm-stat.o
</quote>

Then, ditto for arm64 and s390, detected by container cross build tests.

- s/390 uses test__checkevent_tracepoint() that is now only available if
HAVE_LIBTRACEEVENT is defined, enclose the callsite with ifdef HAVE_LIBTRACEEVENT.

Also from Athira:

<quote>
With this change, I could successfully compile in these environment:
- Without libtraceevent-devel installed
- With libtraceevent-devel installed
- With “make NO_LIBTRACEEVENT=1”
</quote>

Then, finally rename CONFIG_TRACEEVENT to CONFIG_LIBTRACEEVENT for
consistency with other libraries detected in tools/perf/.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: bpf@vger.kernel.org
Link: http://lore.kernel.org/lkml/20221205225940.3079667-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# ae0f4eb3 25-Mar-2022 Wei Li <liwei391@huawei.com>

perf tools: Enhance the matching of sub-commands abbreviations

We support short command 'rec*' for 'record' and 'rep*' for 'report' in
lots of sub-commands, but the matching is not quite strict currently.

This can be puzzling: if we mistype 'report' as 'recport', perf
silently performs 'record' instead, without any message.

To fix this, add a check to ensure that the short cmd is a valid prefix
of the real command.
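
A minimal sketch of such a check, assuming a minimum abbreviation length
of three characters (which matches the 're' case rejected in the testing
below). The helper name is_cmd_abbrev() is hypothetical and is not the
function used in the patch.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: true if 'arg' is at least 'min_len' characters
 * long and is a prefix of the full command name 'cmd'.
 */
bool is_cmd_abbrev(const char *arg, const char *cmd, size_t min_len)
{
	size_t len = strlen(arg);

	if (len < min_len || len > strlen(cmd))
		return false;
	return strncmp(arg, cmd, len) == 0;
}
```

With this rule 'rec' and 'record' select record, while 're', 'recport'
and 'records' are rejected and fall through to the usage message.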

Committer testing:

[root@quaco ~]# perf c2c re sleep 1

Usage: perf c2c {record|report}

-v, --verbose be more verbose (show counter open errors, etc)

# perf c2c rec sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.038 MB perf.data (16 samples) ]
# perf c2c recport sleep 1

Usage: perf c2c {record|report}

-v, --verbose be more verbose (show counter open errors, etc)

# perf c2c record sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.038 MB perf.data (15 samples) ]
# perf c2c records sleep 1

Usage: perf c2c {record|report}

-v, --verbose be more verbose (show counter open errors, etc)

#

Signed-off-by: Wei Li <liwei391@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Rui Xiang <rui.xiang@huawei.com>
Link: http://lore.kernel.org/lkml/20220325092032.2956161-1-liwei391@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 6d18804b 04-Jan-2022 Ian Rogers <irogers@google.com>

perf cpumap: Give CPUs their own type

A common problem is confusing CPU map indices with the CPU itself.
Wrapping the CPU in a struct avoids this, because the compiler then
rejects an index where a CPU is expected. This approach is similar to
atomic_t.
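
A small sketch of the idea. 'struct perf_cpu' with a single int member
follows the description above; 'struct cpu_map' and the two lookup
functions are hypothetical simplifications written for this example.

```c
/* A CPU number wrapped in its own type. */
struct perf_cpu {
	int cpu;
};

/* Hypothetical, fixed-size map for the example. */
struct cpu_map {
	int nr;
	struct perf_cpu cpus[4];
};

/* Index in, CPU out. Returns CPU -1 for an out-of-range index. */
struct perf_cpu cpu_map__cpu(const struct cpu_map *map, int idx)
{
	struct perf_cpu invalid = { -1 };

	if (idx < 0 || idx >= map->nr)
		return invalid;
	return map->cpus[idx];
}

/* CPU in, index out. Returns -1 if the CPU is not in the map. */
int cpu_map__idx(const struct cpu_map *map, struct perf_cpu cpu)
{
	int i;

	for (i = 0; i < map->nr; i++) {
		if (map->cpus[i].cpu == cpu.cpu)
			return i;
	}
	return -1;
}
```

Passing a plain int index where a 'struct perf_cpu' is expected is now a
compile error instead of a silent bug.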

Committer notes:

To make it build with BUILD_BPF_SKEL=1 these files needed the
conversions to 'struct perf_cpu' usage:

tools/perf/util/bpf_counter.c
tools/perf/util/bpf_counter_cgroup.c
tools/perf/util/bpf_ftrace.c

Also perf_env__get_cpu() was removed back in "perf cpumap: Switch
cpu_map__build_map to cpu function".

Additionally these needed to be fixed for the ARM builds to complete:

tools/perf/arch/arm/util/cs-etm.c
tools/perf/arch/arm64/util/pmu.c

Suggested-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Vineet Singh <vineet.singh@intel.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: zhengjun.xing@intel.com
Link: https://lore.kernel.org/r/20220105061351.120843-49-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 2681bd85 19-Jul-2021 Namhyung Kim <namhyung@kernel.org>

perf tools: Remove repipe argument from perf_session__new()

The repipe argument is only used by perf inject and all the others
pass 'false'. Let's remove it from the function signature and add
__perf_session__new() to be called from perf inject directly.

This is preparation for changing the pipe input/output.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210719223153.1618812-2-namhyung@kernel.org
[ Fixed up some trivial conflicts as this patchset fell thru the cracks ;-( ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# b02736f7 30-Nov-2020 Arnaldo Carvalho de Melo <acme@redhat.com>

perf evlist: Use the right prefix for 'struct evlist' 'find' methods

perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/,
go on completing this split.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# be8299e4 08-Jul-2020 Ian Rogers <irogers@google.com>

perf kmem: Pass additional arguments to 'perf record'

'perf kmem' has an input file option but currently an output file option
fails:

$ sudo perf kmem record -o /tmp/p.data sleep 1  
 Error: unknown switch `o'

Usage: perf kmem [<options>] {record|stat}

   -f, --force           don't complain, do it
   -i, --input <file>    input file name
   -l, --line <num>      show n lines
   -s, --sort <key[,key2...]>
                         sort by keys: ptr, callsite, bytes, hit, pingpong, frag, page, order, mig>
   -v, --verbose         be more verbose (show symbol address, etc)
       --alloc           show per-allocation statistics
       --caller          show per-callsite statistics
       --live            Show live page stat
       --page            Analyze page allocator
       --raw-ip          show raw ip instead of symbol
       --slab            Analyze slab allocator
       --time <str>      Time span of interest (start,stop)

'perf sched' is similar in implementation and avoids the problem by
passing additional arguments to 'perf record'.

This change makes 'perf kmem' parse command line options consistently
with 'perf sched', although neither actually lists that -o is a supported
option.
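
The general shape of "pass additional arguments to 'perf record'" can be
sketched as follows. build_record_argv() is a hypothetical helper, and
the fixed arguments in 'base' are placeholders chosen for the example,
not a statement of what 'perf kmem' really passes.

```c
#include <stdlib.h>

/*
 * Hypothetical helper: prepend a fixed set of 'perf record' arguments to
 * whatever the user typed after the sub-command name. The caller frees
 * the returned array (but not the strings it points to).
 */
const char **build_record_argv(int argc, const char **argv, int *rec_argc)
{
	static const char *const base[] = { "record", "-a", "-R", "-c", "1" };
	const int nr_base = (int)(sizeof(base) / sizeof(base[0]));
	const char **rec;
	int i, n;

	if (argc < 1)
		return NULL;

	/* argv[0] is the sub-command name itself, so it is skipped. */
	n = nr_base + argc - 1;
	rec = calloc((size_t)n + 1, sizeof(*rec));	/* +1 for NULL terminator */
	if (rec == NULL)
		return NULL;

	for (i = 0; i < nr_base; i++)
		rec[i] = base[i];
	for (i = 1; i < argc; i++)
		rec[nr_base + i - 1] = argv[i];

	*rec_argc = n;
	return rec;
}
```

Because the user's arguments are forwarded untouched, options such as
'-o /tmp/p.data' are parsed by 'perf record' itself rather than being
rejected by the wrapper command.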

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200708183919.4141023-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 8cf5d0e0 04-May-2020 Arnaldo Carvalho de Melo <acme@redhat.com>

perf kmem: Rename perf_evsel__*() operating on 'struct evsel *' to evsel__*()

As those are 'struct evsel' methods, not part of tools/lib/perf/, aka
libperf, to whom the perf_ prefix belongs.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# efc0cdc9 29-Apr-2020 Arnaldo Carvalho de Melo <acme@redhat.com>

perf evsel: Rename perf_evsel__{str,int}val() and other tracepoint field methods to evsel__*()

As those are 'struct evsel' methods, not part of tools/lib/perf/,
aka libperf, to whom the perf_ prefix belongs.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 8ab2e96d 29-Apr-2020 Arnaldo Carvalho de Melo <acme@redhat.com>

perf evsel: Rename *perf_evsel__*name() to *evsel__*name()

As they are 'struct evsel' methods or related routines, not part of
tools/lib/perf/, aka libperf, to whom the perf_ prefix belongs.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 5f0fef8a 03-Nov-2019 Arnaldo Carvalho de Melo <acme@redhat.com>

perf callchain: Use 'struct map_symbol' in 'struct callchain_cursor_node'

To ease passing around map+symbol, just like done for other parts of the
tree recently.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 1abecfca 16-Oct-2019 Yunfeng Ye <yeyunfeng@huawei.com>

perf kmem: Fix memory leak in compact_gfp_flags()

The memory @orig_flags is allocated by strdup(), it is freed on the
normal path, but leaked on the error path.

Fix this by adding free(orig_flags) on the error path.
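
The class of bug is easy to reproduce in a few lines. The function
below is a hypothetical, much simplified relative of
compact_gfp_flags(): it only strips a leading "GFP_" from each
'|'-separated token. What matters is that the strdup() copy is freed on
the error path as well as on the normal path.

```c
#define _POSIX_C_SOURCE 200809L	/* for strdup() and strtok_r() */
#include <stdlib.h>
#include <string.h>

/* Hypothetical example; returns a malloc()ed string or NULL on error. */
char *strip_gfp_prefix(const char *flags)
{
	char *orig, *state, *tok, *out;
	size_t len = 0;

	orig = strdup(flags);	/* strtok_r() modifies its input */
	if (orig == NULL)
		return NULL;

	/* The output can never be longer than the input. */
	out = malloc(strlen(flags) + 1);
	if (out == NULL) {
		free(orig);	/* the error path must release the copy too */
		return NULL;
	}
	out[0] = '\0';

	for (tok = strtok_r(orig, "|", &state); tok != NULL;
	     tok = strtok_r(NULL, "|", &state)) {
		if (strncmp(tok, "GFP_", 4) == 0)
			tok += 4;
		if (len > 0)
			out[len++] = '|';
		strcpy(out + len, tok);
		len += strlen(tok);
	}

	free(orig);	/* normal path */
	return out;
}
```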

Fixes: 0e11115644b3 ("perf kmem: Print gfp flags in human readable string")
Signed-off-by: Yunfeng Ye <yeyunfeng@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Feilong Lin <linfeilong@huawei.com>
Cc: Hu Shiyuan <hushiyuan@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/f9e9f458-96f3-4a97-a1d5-9feec2420e07@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 6ef81c55 21-Aug-2019 Mamatha Inamdar <mamatha4@linux.vnet.ibm.com>

perf session: Return error code for perf_session__new() function on failure

Make perf_session__new() return an error code on failure instead of
NULL.

Test Results:

Before Fix:

$ perf c2c report -input
failed to open nput: No such file or directory

$ echo $?
0
$

After Fix:

$ perf c2c report -input
failed to open nput: No such file or directory

$ echo $?
254
$

Committer notes:

Fix 'perf tests topology' case, where we use that TEST_ASSERT_VAL(...,
session), i.e. we need to pass zero in case of failure, which was the
case before when NULL was returned by perf_session__new() for failure,
but now we need to negate the result of IS_ERR(session) to respect that
TEST_ASSERT_VAL() expectation of zero meaning failure.
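
The error-pointer convention referred to here can be sketched in a
self-contained way. The kernel and tools headers provide ERR_PTR(),
IS_ERR() and PTR_ERR() for this purpose; the demo_* helpers and
'struct demo_session' below are local stand-ins invented for this
example, not the perf implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_MAX_ERRNO 4095	/* errors live in the last 4095 addresses */

/* Encode a negative errno value in a pointer. */
void *demo_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* True when the pointer carries an encoded errno rather than an object. */
int demo_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)(-DEMO_MAX_ERRNO);
}

/* Decode the errno value; only meaningful for error pointers. */
long demo_ptr_err(const void *ptr)
{
	assert(demo_is_err(ptr));
	return (long)(intptr_t)ptr;
}

struct demo_session {
	const char *path;
};

/* Hypothetical constructor: reports *why* it failed instead of NULL. */
struct demo_session *demo_session__new(const char *path)
{
	struct demo_session *session;

	if (path == NULL)
		return demo_err_ptr(-EINVAL);

	session = calloc(1, sizeof(*session));
	if (session == NULL)
		return demo_err_ptr(-ENOMEM);

	session->path = path;
	return session;
}
```

A caller that used to test 'session == NULL' has to test
demo_is_err(session) instead, and a check that expects non-zero for
success needs '!demo_is_err(session)', which is the negation the
committer notes describe.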

Reported-by: Nageswara R Sastry <rnsastry@linux.vnet.ibm.com>
Signed-off-by: Mamatha Inamdar <mamatha4@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Nageswara R Sastry <rnsastry@linux.vnet.ibm.com>
Acked-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shawn Landden <shawn@git.icu>
Cc: Song Liu <songliubraving@fb.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tzvetomir Stoyanov <tstoyanov@vmware.com>
Link: http://lore.kernel.org/lkml/20190822071223.17892.45782.stgit@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 4a3cec84 30-Aug-2019 Arnaldo Carvalho de Melo <acme@redhat.com>

perf dsos: Move the dsos struct and its methods to separate source files

So that we can reduce the header dependency tree further, in the process
noticed that lots of places were getting even things like build-id
routines and 'struct perf_tool' definition indirectly, so fix all those
too.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-ti0btma9ow5ndrytyoqdk62j@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 38847db9 05-Aug-2019 Tzvetomir Stoyanov <tstoyanov@vmware.com>

libtraceevent, perf tools: Changes in tep_print_event_* APIs

Libtraceevent APIs for printing various trace events information are
complicated, there are complex extra parameters. To control the way
event information is printed, the user should call a set of functions in
a specific sequence.

These APIs are reimplemented to provide a simpler interface for
printing event information.

Removed APIs:

tep_print_event_task()
tep_print_event_time()
tep_print_event_data()
tep_event_info()
tep_is_latency_format()
tep_set_latency_format()
tep_data_latency_format()
tep_set_print_raw()

A new API for printing event information is introduced:
void tep_print_event(struct tep_handle *tep, struct trace_seq *s,
struct tep_record *record, const char *fmt, ...);
where "fmt" is a printf-like format string, followed by the event
fields to be printed. Supported fields:
TEP_PRINT_PID, "%d" - event PID
TEP_PRINT_CPU, "%d" - event CPU
TEP_PRINT_COMM, "%s" - event command string
TEP_PRINT_NAME, "%s" - event name
TEP_PRINT_LATENCY, "%s" - event latency
TEP_PRINT_TIME, "%d" - event time stamp. A divisor and precision
can be specified as part of this format string:
"%precision.divisord". Example:
"%3.1000d" - divide the time by 1000 and print the first 3 digits
before the dot. Thus, the time stamp "123456000" will be printed as
"123.456"
TEP_PRINT_INFO, "%s" - event information.
TEP_PRINT_INFO_RAW, "%s" - event information, in raw format.

Example:
tep_print_event(tep, s, record, "%16s-%-5d [%03d] %s %6.1000d %s %s",
TEP_PRINT_COMM, TEP_PRINT_PID, TEP_PRINT_CPU,
TEP_PRINT_LATENCY, TEP_PRINT_TIME, TEP_PRINT_NAME, TEP_PRINT_INFO);
Output:
ls-11314 [005] d.h. 185207.366383 function __wake_up

Signed-off-by: Tzvetomir Stoyanov <tstoyanov@vmware.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: linux-trace-devel@vger.kernel.org
Cc: Patrick McLean <chutzpah@gentoo.org>
Link: http://lore.kernel.org/linux-trace-devel/20190801074959.22023-2-tz.stoyanov@gmail.com
Link: http://lore.kernel.org/lkml/20190805204355.041132030@goodmis.org
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 8520a98d 29-Aug-2019 Arnaldo Carvalho de Melo <acme@redhat.com>

perf debug: Remove needless include directives from debug.h

All we need there is a forward declaration for 'union perf_event', so
remove it from there and add missing header directives in places using
things from this indirect include.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-7ftk0ztstqub1tirjj8o8xbl@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 32dcd021 21-Jul-2019 Jiri Olsa <jolsa@kernel.org>

perf evsel: Rename struct perf_evsel to struct evsel

Rename struct perf_evsel to struct evsel, so we don't have a name clash
when we add struct perf_evsel in libperf.

Committer notes:

Added fixes for arm64, provided by Jiri.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 7f7c536f 04-Jul-2019 Arnaldo Carvalho de Melo <acme@redhat.com>

tools lib: Adopt zalloc()/zfree() from tools/perf

Eroding a bit more the tools/perf/util/util.h hodgepodge header.
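
For readers unfamiliar with the two helpers, their behaviour can be
sketched as below. This is an illustration under assumed names
(demo_zalloc(), demo_zfree()), not a copy of the code that was moved:
the first returns zero-filled memory, the second frees a pointer and
resets it to NULL so that a stale pointer cannot be reused or freed
twice.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocate 'size' bytes of zero-initialized memory. */
void *demo_zalloc(size_t size)
{
	return calloc(1, size);
}

/* Free *ptr and clear it, given the address of the pointer variable. */
void demo_zfree_ptr(void **ptr)
{
	assert(ptr != NULL);
	free(*ptr);
	*ptr = NULL;
}

/* Convenience wrapper so callers can pass the address of any pointer type. */
#define demo_zfree(ptr) demo_zfree_ptr((void **)(ptr))
```

Typical use is 'p = demo_zalloc(sizeof(*p));' followed later by
'demo_zfree(&p);', after which 'p' is NULL.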

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-natazosyn9rwjka25tvcnyi0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 3052ba56 25-Jun-2019 Arnaldo Carvalho de Melo <acme@redhat.com>

tools perf: Move from sane_ctype.h obtained from git to the Linux's original

We got the sane_ctype.h headers from git and kept using it so far, but
since that code originally came from the kernel sources to the git
sources, perhaps it's better to just use the one in the kernel, so that
we can leverage tools/perf/check_headers.sh to be notified when our copy
gets out of sync, i.e. when fixes or goodies are added to the code we've
copied.

This will help with things like tools/lib/string.c where we want to have
more things in common with the kernel, such as strim(), skip_spaces(),
etc so as to go on removing the things that we have in tools/perf/util/
and instead using the code in the kernel, indirectly and removing things
like EXPORT_SYMBOL(), etc, getting notified when fixes and improvements
are made to the original code.

Hopefully this also should help with reducing the difference of code
hosted in tools/ to the one in the kernel proper.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-7k9868l713wqtgo01xxygn12@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 6a9fa4e3 25-Jun-2019 Arnaldo Carvalho de Melo <acme@redhat.com>

perf string: Move 'dots' and 'graph_dotted_line' out of sane_ctype.h

Those are not in that file in the git repo, let's move them from there so
that we get that sane ctype code fully isolated to allow getting it in
sync either with the git sources or better with the kernel sources
(include/linux/ctype.h + lib/ctype.c), that way we can use
check_headers.sh to get notified when changes are made in the original
code so that we can cherry-pick.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-ioh5sghn3943j0rxg6lb2dgs@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 69769ce1 31-Mar-2019 Tzvetomir Stoyanov <tstoyanov@vmware.com>

perf tools, tools lib traceevent: Rename "pevent" member of struct tep_event to "tep"

The member "pevent" of the struct tep_event is renamed to "tep". This
makes the struct consistent with the chosen naming convention:

tep (trace event parser), instead of the old pevent.

Signed-off-by: Tzvetomir Stoyanov <tstoyanov@vmware.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/linux-trace-devel/20190401132111.13727-3-tstoyanov@vmware.com
Link: http://lkml.kernel.org/r/20190401164344.627724996@goodmis.org
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 2d4f2799 21-Feb-2019 Jiri Olsa <jolsa@kernel.org>

perf data: Add global path holder

Add a 'path' member to 'struct perf_data'. It will keep the configured
path for the data (const char *). The path in struct perf_data_file is
now dynamically allocated (duped) from it.

This scheme is useful/used in following patches where struct
perf_data::path holds the 'configure' directory path and struct
perf_data_file::path holds the allocated path for specific files.

Also it actually makes the code a little simpler.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20190221094145.9151-3-jolsa@kernel.org
[ Fixup data-convert-bt.c missing conversion ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 1101f69a 27-Jan-2019 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Add missing map.h includes

Lots of places get the map.h file indirectly, and since we're going to
remove it from machine.h, then those need to include it directly, do it
now, before we remove that dep.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-ob8jehdjda8h5jsrv9dqj9tf@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 49b8e2be 02-Nov-2018 Rasmus Villemoes <linux@rasmusvillemoes.dk>

perf tools: Replace automatic const char[] variables by statics

An automatic const char[] variable gets initialized at runtime, just
like any other automatic variable. For long strings, that uses a lot of
stack and wastes time building the string; e.g. for the "No %s
allocation events..." case one has:

444516: 48 b8 4e 6f 20 25 73 20 61 6c movabs $0x6c61207325206f4e,%rax # "No %s al"
...
444674: 48 89 45 80 mov %rax,-0x80(%rbp)
444678: 48 b8 6c 6f 63 61 74 69 6f 6e movabs $0x6e6f697461636f6c,%rax # "location"
444682: 48 89 45 88 mov %rax,-0x78(%rbp)
444686: 48 b8 20 65 76 65 6e 74 73 20 movabs $0x2073746e65766520,%rax # " events "
444690: 66 44 89 55 c4 mov %r10w,-0x3c(%rbp)
444695: 48 89 45 90 mov %rax,-0x70(%rbp)
444699: 48 b8 66 6f 75 6e 64 2e 20 20 movabs $0x20202e646e756f66,%rax

Make them all static so that the compiler just references objects in .rodata.
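
The difference can be sketched with two otherwise identical functions.
The function names and the shortened message are made up for the
example; how much stack and code the automatic variant actually costs
depends on the compiler and optimization level.

```c
#include <assert.h>
#include <stdio.h>

/* Automatic array: may be rebuilt on the stack each time the function runs. */
int demo_format_auto(char *buf, size_t len, const char *kind)
{
	const char fmt[] = "No %s allocation events found.\n";

	assert(buf != NULL && kind != NULL);
	return snprintf(buf, len, fmt, kind);
}

/* Static array: a single object with static storage, referenced by address. */
int demo_format_static(char *buf, size_t len, const char *kind)
{
	static const char fmt[] = "No %s allocation events found.\n";

	assert(buf != NULL && kind != NULL);
	return snprintf(buf, len, fmt, kind);
}
```

Both variants produce the same output; only the storage of 'fmt'
changes, which is why the conversion is safe to apply mechanically.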

Committer testing:

Ok, using dwarves's codiff tool:

$ codiff --functions /tmp/perf.before ~/bin/perf
builtin-sched.c:
cmd_sched | -48
1 function changed, 48 bytes removed, diff: -48

builtin-report.c:
cmd_report | -32
1 function changed, 32 bytes removed, diff: -32

builtin-kmem.c:
cmd_kmem | -64
build_alloc_func_list | -50
2 functions changed, 114 bytes removed, diff: -114

builtin-c2c.c:
perf_c2c__report | -390
1 function changed, 390 bytes removed, diff: -390

ui/browsers/header.c:
tui__header_window | -104
1 function changed, 104 bytes removed, diff: -104

/home/acme/bin/perf:
9 functions changed, 688 bytes removed, diff: -688

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20181102230624.20064-1-linux@rasmusvillemoes.dk
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 6fed932e 08-Aug-2018 Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com>

tools lib traceevent, perf tools: Rename 'enum pevent_flag' to 'enum tep_flag'

In order to make libtraceevent into a proper library, variables, data
structures and functions require a unique prefix to prevent name space
conflicts. That prefix will be "tep_" and not "pevent_". This changes
pevent_get_page_size API and enum pevent_flag to enum tep_flag

Signed-off-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yordan Karadzhov (VMware) <y.karadz@gmail.com>
Cc: linux-trace-devel@vger.kernel.org
Link: http://lkml.kernel.org/r/20180808180701.623942406@goodmis.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 4d5c58b1 08-Aug-2018 Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com>

tools lib traceevent, perf tools: Rename pevent alloc / free APIs

In order to make libtraceevent into a proper library, variables, data
structures and functions require a unique prefix to prevent name space
conflicts. That prefix will be "tep_" and not "pevent_". This changes
APIs: pevent_alloc, pevent_free, pevent_event_info and pevent_func_resolver_t

Signed-off-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yordan Karadzhov (VMware) <y.karadz@gmail.com>
Cc: linux-trace-devel@vger.kernel.org
Link: http://lkml.kernel.org/r/20180808180700.152609945@goodmis.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# cbc49b25 08-Aug-2018 Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com>

tools lib traceevent, perf tools: Rename 'struct pevent_record' to 'struct tep_record'

In order to make libtraceevent into a proper library, variables, data
structures and functions require a unique prefix to prevent name space
conflicts. That prefix will be "tep_" and not "pevent_". This changes
the 'struct pevent_record' to 'struct tep_record'.

Signed-off-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yordan Karadzhov (VMware) <y.karadz@gmail.com>
Cc: linux-trace-devel@vger.kernel.org
Link: http://lkml.kernel.org/r/20180808180659.866021298@goodmis.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 107cad95 29-Apr-2018 Arnaldo Carvalho de Melo <acme@redhat.com>

perf machine: Ditch find_kernel_function variants

Since we do not have split symtabs anymore, no need to have explicit
find_kernel_function variants, use the find_kernel_symbol ones.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-hiw2ryflju000f6wl62128it@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 453f85d4 15-Nov-2017 Mel Gorman <mgorman@techsingularity.net>

mm: remove __GFP_COLD

As the page free path makes no distinction between cache hot and cold
pages, there is no real useful ordering of pages in the free list that
allocation requests can take advantage of. Judging from the users of
__GFP_COLD, it is likely that a number of them are the result of copying
other sites instead of actually measuring the impact. Remove the
__GFP_COLD parameter which simplifies a number of paths in the page
allocator.

This is potentially controversial but bear in mind that the size of the
per-cpu pagelists versus modern cache sizes means that the whole per-cpu
list can often fit in the L3 cache. Hence, there is only a potential
benefit for microbenchmarks that alloc/free pages in a tight loop. It's
even worse when THP is taken into account which has little or no chance
of getting a cache-hot page as the per-cpu list is bypassed and the
zeroing of multiple pages will thrash the cache anyway.

The truncate microbenchmarks are not shown as this patch affects the
allocation path and not the free path. A page fault microbenchmark was
tested but it showed no significant difference which is not surprising
given that the __GFP_COLD branches are a minuscule percentage of the
fault path.

Link: http://lkml.kernel.org/r/20171018075952.10627-9-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# d8be7566 15-Nov-2017 Levin, Alexander (Sasha Levin) <alexander.levin@verizon.com>

kmemcheck: remove what's left of NOTRACK flags

Now that kmemcheck is gone, we don't need the NOTRACK flags.

Link: http://lkml.kernel.org/r/20171007030159.22241-5-alexander.levin@verizon.com
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tim Hansen <devtimhansen@gmail.com>
Cc: Vegard Nossum <vegardno@ifi.uio.no>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# b2441318 01-Nov-2017 Greg Kroah-Hartman <gregkh@linuxfoundation.org>

License cleanup: add SPDX GPL-2.0 license identifier to files with no license

Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information in it,
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information.

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging were:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.

For non */uapi/* files that summary was:

SPDX license identifier                              # files
---------------------------------------------------|-------
GPL-2.0                                                11139

and resulted in the first patch in this series.

If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:

SPDX license identifier                              # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note                          930

and resulted in the second patch in this series.

- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:

SPDX license identifier                              # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note                          270
GPL-2.0+ WITH Linux-syscall-note                         169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)       21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)       17
LGPL-2.1+ WITH Linux-syscall-note                         15
GPL-1.0+ WITH Linux-syscall-note                          14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)       5
LGPL-2.0+ WITH Linux-syscall-note                          4
LGPL-2.1 WITH Linux-syscall-note                           3
((GPL-2.0 WITH Linux-syscall-note) OR MIT)                 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT)                1

and that resulted in the third patch in this series.

- when the two scanners agreed on the detected license(s), that became
the concluded license(s).

- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.

- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).

- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.

- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there were new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In the initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct

This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
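
The "different comment types" step can be sketched as follows. This is
not the script that was used (which is not shown in this log); the
function name demo_spdx_line() is hypothetical, and the sketch only
covers the two cases named above: C source files take a '//' comment,
header files take a '/* */' comment.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format the SPDX tag line for 'filename' into 'buf'.
 * Returns the snprintf() result, or -1 for an unhandled file type.
 */
int demo_spdx_line(const char *filename, const char *license,
		   char *buf, size_t len)
{
	const char *ext;

	assert(filename != NULL && license != NULL && buf != NULL);

	ext = strrchr(filename, '.');
	if (ext != NULL && strcmp(ext, ".h") == 0)
		return snprintf(buf, len,
				"/* SPDX-License-Identifier: %s */", license);
	if (ext != NULL && strcmp(ext, ".c") == 0)
		return snprintf(buf, len,
				"// SPDX-License-Identifier: %s", license);

	return -1;	/* other file types are outside this sketch */
}
```

For a file such as builtin-kmem.c the sketch yields
'// SPDX-License-Identifier: GPL-2.0'.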

Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# eae8ad80 23-Jan-2017 Jiri Olsa <jolsa@kernel.org>

perf tools: Add struct perf_data_file

Add struct perf_data_file to represent a single file within a perf_data
struct.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Changbin Du <changbin.du@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-c3f9p4xzykr845ktqcek6p4t@git.kernel.org
[ Fixup recent changes in 'perf script --per-event-dump' ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 8ceb41d7 23-Jan-2017 Jiri Olsa <jolsa@kernel.org>

perf tools: Rename struct perf_data_file to perf_data

Rename struct perf_data_file to perf_data, because we will add the
possibility to have multiple files under perf.data, so the 'perf_data'
name fits better.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Changbin Du <changbin.du@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-39wn4d77phel3dgkzo3lyan0@git.kernel.org
[ Fixup recent changes in 'perf script --per-event-dump' ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 79f56ebe 16-Sep-2017 Christophe JAILLET <christophe.jaillet@wanadoo.fr>

perf kmem: Perform some cleanup if '--time' is given an invalid value

If the string passed in '--time' is invalid, we must do some cleanup
before leaving. As in the other error handling paths of this function.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-janitors@vger.kernel.org
Fixes: 2a865bd8dddd ("perf kmem: Add option to specify time window of interest")
Link: http://lkml.kernel.org/r/20170916060936.28199-1-christophe.jaillet@wanadoo.fr
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 0ee931c4 13-Sep-2017 Michal Hocko <mhocko@suse.com>

mm: treewide: remove GFP_TEMPORARY allocation flag

GFP_TEMPORARY was introduced by commit e12ba74d8ff3 ("Group short-lived
and reclaimable kernel allocations") along with __GFP_RECLAIMABLE. Its
primary motivation was to allow users to tell that an allocation is
short lived and so the allocator can try to place such allocations close
together and prevent long term fragmentation. As much as this sounds
like a reasonable semantic it becomes much less clear when to use the
highlevel GFP_TEMPORARY allocation flag. How long is temporary? Can the
context holding that memory sleep? Can it take locks? It seems there is
no good answer for those questions.

The current implementation of GFP_TEMPORARY is basically GFP_KERNEL |
__GFP_RECLAIMABLE which in itself is tricky because basically none of
the existing callers provide a way to reclaim the allocated memory. So
this is rather misleading and hard to evaluate for any benefits.

I have checked some random users and none of them has added the flag
with a specific justification. I suspect most of them just copied from
other existing users and others just thought it might be a good idea to
use without any measuring. This suggests that GFP_TEMPORARY just
motivates for cargo cult usage without any reasoning.

I believe that our gfp flags are quite complex already and especially
those with high-level semantics should be clearly defined to prevent
confusion and abuse. Therefore I propose dropping GFP_TEMPORARY and
replace all existing users to simply use GFP_KERNEL. Please note that
SLAB users with shrinkers will still get __GFP_RECLAIMABLE heuristic and
so they will be placed properly for memory fragmentation prevention.

I can see reasons we might want some gfp flag to reflect short-term
allocations but I propose starting from a clear semantic definition and
only then add users with proper justification.

This has been brought up before LSF this year by Matthew [1] and it
turned out that GFP_TEMPORARY really doesn't have a clear semantic. It
seems to be a heuristic without any measured advantage for most (if not
all) its current users. The follow up discussion has revealed that
opinions on what might be temporary allocation differ a lot between
developers. So rather than trying to tweak existing users into a
semantic which they haven't expected I propose to simply remove the flag
and start from scratch if we really need a semantic for short term
allocations.

[1] http://lkml.kernel.org/r/20170118054945.GD18349@bombadil.infradead.org

[akpm@linux-foundation.org: fix typo]
[akpm@linux-foundation.org: coding-style fixes]
[sfr@canb.auug.org.au: drm/i915: fix up]
Link: http://lkml.kernel.org/r/20170816144703.378d4f4d@canb.auug.org.au
Link: http://lkml.kernel.org/r/20170728091904.14627-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Neil Brown <neilb@suse.de>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# dcda9b04 12-Jul-2017 Michal Hocko <mhocko@suse.com>

mm, tree wide: replace __GFP_REPEAT by __GFP_RETRY_MAYFAIL with more useful semantic

__GFP_REPEAT was designed to allow retry-but-eventually-fail semantic to
the page allocator. This has been true but only for allocation
requests larger than PAGE_ALLOC_COSTLY_ORDER. It has always been
ignored for smaller sizes. This is a bit unfortunate because there is
no way to express the same semantic for those requests and they are
considered too important to fail so they might end up looping in the
page allocator for ever, similarly to GFP_NOFAIL requests.

Now that the whole tree has been cleaned up and accidental or misled
usage of the __GFP_REPEAT flag has been removed for !costly requests, we
can give the original flag a better name and, more importantly, a more
useful semantic. Let's rename it to __GFP_RETRY_MAYFAIL, which tells the
user that the allocator will try really hard but there is no promise of
success. This works independently of the order and overrides the
default allocator behavior. Page allocator users have several levels of
guarantee vs. cost options (take GFP_KERNEL as an example):

- GFP_KERNEL & ~__GFP_RECLAIM - optimistic allocation without _any_
attempt to free memory at all. The most lightweight mode, which doesn't
even kick the background reclaim. Should be used carefully because
it might deplete the memory and the next user might hit the more
aggressive reclaim.

- GFP_KERNEL & ~__GFP_DIRECT_RECLAIM (or GFP_NOWAIT) - optimistic
allocation without any attempt to free memory from the current
context but can wake kswapd to reclaim memory if the zone is below
the low watermark. Can be used from either atomic contexts or when
the request is a performance optimization and there is another
fallback for a slow path.

- (GFP_KERNEL|__GFP_HIGH) & ~__GFP_DIRECT_RECLAIM (aka GFP_ATOMIC) -
non sleeping allocation with an expensive fallback so it can access
some portion of memory reserves. Usually used from interrupt/bh
context with an expensive slow path fallback.

- GFP_KERNEL - both background and direct reclaim are allowed and the
_default_ page allocator behavior is used. That means that !costly
allocation requests are basically nofail but there is no guarantee of
that behavior so failures have to be checked properly by callers
(e.g. OOM killer victim is allowed to fail currently).

- GFP_KERNEL | __GFP_NORETRY - overrides the default allocator behavior
and all allocation requests fail early rather than cause disruptive
reclaim (one round of reclaim in this implementation). The OOM killer
is not invoked.

- GFP_KERNEL | __GFP_RETRY_MAYFAIL - overrides the default allocator
behavior and all allocation requests try really hard. The request
will fail if the reclaim cannot make any progress. The OOM killer
won't be triggered.

- GFP_KERNEL | __GFP_NOFAIL - overrides the default allocator behavior
and all allocation requests will loop endlessly until they succeed.
This might be really dangerous especially for larger orders.

Existing users of __GFP_REPEAT are changed to __GFP_RETRY_MAYFAIL
because they already had this semantic. No new users are added.
__alloc_pages_slowpath is changed to bail out for __GFP_RETRY_MAYFAIL if
there is no progress and we have already passed the OOM point.

This means that all the reclaim opportunities have been exhausted except
the most disruptive one (the OOM killer) and a user defined fallback
behavior is more sensible than to keep retrying in the page allocator.
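
The three overriding levels in the list above can be illustrated with a
toy user-space model. This is only a sketch, not kernel code: sim_zone,
sim_alloc() and the SIM_* policies are hypothetical names, reclaim is
assumed to free exactly one page per round, and "need" stands in for
the allocation order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical policies modelled on the three override flags. */
enum sim_policy { SIM_NORETRY, SIM_RETRY_MAYFAIL, SIM_NOFAIL };

/* Hypothetical zone: pages free right now and pages reclaim could free. */
struct sim_zone {
	int free_pages;
	int reclaimable_pages;
};

static bool sim_try_alloc(struct sim_zone *z, int need)
{
	if (z->free_pages < need)
		return false;
	z->free_pages -= need;
	return true;
}

/* One reclaim round: frees at most one page, reports whether it did. */
static bool sim_reclaim(struct sim_zone *z)
{
	if (z->reclaimable_pages == 0)
		return false;
	z->reclaimable_pages--;
	z->free_pages++;
	return true;
}

bool sim_alloc(struct sim_zone *z, int need, enum sim_policy policy)
{
	assert(z != NULL && need > 0);

	for (;;) {
		bool progress;

		if (sim_try_alloc(z, need))
			return true;

		progress = sim_reclaim(z);

		/* NORETRY: a single reclaim round, then fail early. */
		if (policy == SIM_NORETRY)
			return sim_try_alloc(z, need);

		/* RETRY_MAYFAIL: give up once reclaim stops progressing. */
		if (policy == SIM_RETRY_MAYFAIL && !progress)
			return false;

		/* NOFAIL: keep looping until the allocation succeeds. */
	}
}
```

With a zone of { 0 free, 3 reclaimable } and need == 2, SIM_NORETRY
fails after its single round while SIM_RETRY_MAYFAIL succeeds on the
second round. SIM_NOFAIL never returns if the request cannot be
satisfied, which is the danger the list points out for larger orders.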

[akpm@linux-foundation.org: fix arch/sparc/kernel/mdesc.c]
[mhocko@suse.com: semantic fix]
Link: http://lkml.kernel.org/r/20170626123847.GM11534@dhcp22.suse.cz
[mhocko@kernel.org: address other thing spotted by Vlastimil]
Link: http://lkml.kernel.org/r/20170626124233.GN11534@dhcp22.suse.cz
Link: http://lkml.kernel.org/r/20170623085345.11304-3-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Alex Belits <alex.belits@cavium.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: David Daney <david.daney@cavium.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: NeilBrown <neilb@suse.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 62d94b00 27-Jun-2017 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Replace error() with pr_err()

To consolidate the error reporting facility.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-b41iot1094katoffdf19w9zk@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# a43783ae 18-Apr-2017 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Include errno.h where needed

Removing it from util.h, part of an effort to disentangle the includes
hell, which makes any change to util.h or to something included by it
cause a complete rebuild of the tools.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ztrjy52q1rqcchuy3rubfgt2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 3d689ed6 17-Apr-2017 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Move sane ctype stuff from util.h to sane_ctype.h

More stuff that came from git, out of the hodge-podge that is util.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-e3lana4gctz3ub4hn4y29hkw@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# fd20e811 17-Apr-2017 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Including missing inttypes.h header

Needed to use the PRI[xu](32,64) formatting macros.
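
As a minimal sketch of what those macros are for (demo_format_u64() is
a hypothetical helper, not part of perf): "%lu" and "%lx" only match a
64-bit value where long is 64 bits wide, while PRIu64 and PRIx64 expand
to the right conversion for uint64_t on every ABI.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: print a 64-bit count and a 64-bit address.
 * The PRI* macros are string literals, so they are pasted into the
 * format string after a bare '%'.
 */
int demo_format_u64(char *buf, size_t size, uint64_t count, uint64_t addr)
{
	assert(buf != NULL && size > 0);
	return snprintf(buf, size, "%" PRIu64 "/0x%" PRIx64, count, addr);
}
```

Calling demo_format_u64(buf, sizeof(buf), 3, 0xf0000081) writes
"3/0xf0000081" regardless of whether the platform defines uint64_t as
unsigned long or unsigned long long.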

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-wkbho8kaw24q67dd11q0j39f@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 877a7a11 17-Apr-2017 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Add include <linux/kernel.h> where ARRAY_SIZE() is used

To pave the way for further cleanups where linux/kernel.h may stop being
included in some header.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-qqxan6tfsl6qx3l0v3nwgjvk@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# b0ad8ea6 27-Mar-2017 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Remove unused 'prefix' from builtin functions

We got it from the git sources but never used it for anything, with the
place where this would be somehow used remaining:

static int run_builtin(struct cmd_struct *p, int argc, const char **argv)
{
	prefix = NULL;
	if (p->option & RUN_SETUP)
		prefix = NULL; /* setup_perf_directory(); */

Ditch it.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-uw5swz05vol0qpr32c5lpvus@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# f3b3614a 07-Mar-2017 Hari Bathini <hbathini@linux.vnet.ibm.com>

perf tools: Add PERF_RECORD_NAMESPACES to include namespaces related info

Introduce a new option to record PERF_RECORD_NAMESPACES events emitted
by the kernel when fork, clone, setns or unshare are invoked. And update
perf-record documentation with the new option to record namespace
events.

Committer notes:

Combined it with a later patch to allow printing it via 'perf report -D'
and be able to test the feature introduced in this patch. Had to move
here also perf_ns__name(), that was introduced in another later patch.

Also used PRIu64 and PRIx64 to fix the build in some environments wrt:

util/event.c:1129:39: error: format '%lx' expects argument of type 'long unsigned int', but argument 6 has type 'long long unsigned int' [-Werror=format=]
ret += fprintf(fp, "%u/%s: %lu/0x%lx%s", idx
^
Testing it:

# perf record --namespaces -a
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.083 MB perf.data (423 samples) ]
#
# perf report -D
<SNIP>
3 2028902078892 0x115140 [0xa0]: PERF_RECORD_NAMESPACES 14783/14783 - nr_namespaces: 7
[0/net: 3/0xf0000081, 1/uts: 3/0xeffffffe, 2/ipc: 3/0xefffffff, 3/pid: 3/0xeffffffc,
4/user: 3/0xeffffffd, 5/mnt: 3/0xf0000000, 6/cgroup: 3/0xeffffffb]

0x1151e0 [0x30]: event: 9
.
. ... raw event: size 48 bytes
. 0000: 09 00 00 00 02 00 30 00 c4 71 82 68 0c 7f 00 00 ......0..q.h....
. 0010: a9 39 00 00 a9 39 00 00 94 28 fe 63 d8 01 00 00 .9...9...(.c....
. 0020: 03 00 00 00 00 00 00 00 ce c4 02 00 00 00 00 00 ................
<SNIP>
NAMESPACES events: 1
<SNIP>
#

Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sargun Dhillon <sargun@sargun.me>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/148891930386.25309.18412039920746995488.stgit@hbathini.in.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# a7c3899c 13-Feb-2017 Arnaldo Carvalho de Melo <acme@redhat.com>

perf symbols: No need to check if sym->name is NULL

As it is an array, it will always evaluate to 'true', as reported by
clang:

builtin-sched.c:2070:19: error: address of array 'sym->name' will always evaluate to 'true' [-Werror,-Wpointer-bool-conversion]
if (sym && sym->name) {
~~ ~~~~~^~~~
1 warning generated.

So just ditch all those useless checks.
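
A minimal sketch of the situation, using a hypothetical and much
simplified stand-in for the symbol struct (demo_symbol and its helpers
are illustrative names only): when the name is an array member, only
the struct pointer itself can be NULL.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for a symbol with an inline name. */
struct demo_symbol {
	unsigned long start;
	char name[32];	/* an array member, not a pointer */
};

void demo_symbol_set_name(struct demo_symbol *sym, const char *name)
{
	assert(sym != NULL && name != NULL);
	snprintf(sym->name, sizeof(sym->name), "%s", name);
}

const char *demo_symbol_name(const struct demo_symbol *sym)
{
	/*
	 * "sym->name" is the address of storage inside *sym, so it can
	 * never be NULL once sym is valid. Testing "sym && sym->name"
	 * adds nothing and is what clang warns about; checking sym alone
	 * is enough.
	 */
	return sym ? sym->name : "[unknown]";
}
```

If an empty name needs special handling, the contents have to be
tested instead, e.g. sym->name[0] != '\0'.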

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ydpm927col06paixb775jjx5@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# ecc4c561 24-Jan-2017 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Propagate perf_config() errors

Previously these were being ignored, sometimes silently.

Stop doing that, emitting debug messages and handling the errors.

Testing it:

$ cat ~/.perfconfig
cat: /home/acme/.perfconfig: No such file or directory
$ perf stat -e cycles usleep 1

Performance counter stats for 'usleep 1':

938,996 cycles:u

0.003813731 seconds time elapsed

$ perf top --stdio
Error:
You may not have permission to collect system-wide stats.

Consider tweaking /proc/sys/kernel/perf_event_paranoid,
<SNIP>
[ perf record: Captured and wrote 0.019 MB perf.data (7 samples) ]
[acme@jouet linux]$ perf report --stdio
# To display the perf.data header info, please use --header/--header-only options.
# Overhead Command Shared Object Symbol
# ........ ....... ................. .........................
71.77% usleep libc-2.24.so [.] _dl_addr
27.07% usleep ld-2.24.so [.] _dl_next_ld_env_entry
1.13% usleep [kernel.kallsyms] [k] page_fault
$
$ touch ~/.perfconfig
$ ls -la ~/.perfconfig
-rw-rw-r--. 1 acme acme 0 Jan 27 12:14 /home/acme/.perfconfig
$
$ perf stat -e instructions usleep 1

Performance counter stats for 'usleep 1':

244,610 instructions:u

0.000805383 seconds time elapsed

$
[root@jouet ~]# chown acme.acme ~/.perfconfig
[root@jouet ~]# perf stat -e cycles usleep 1
Warning: File /root/.perfconfig not owned by current user or root, ignoring it.

Performance counter stats for 'usleep 1':

937,615 cycles

0.000836931 seconds time elapsed
#

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-j2rq96so6xdqlr8p8rd6a3jx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 41b6167e 10-Jan-2017 Michal Hocko <mhocko@suse.com>

mm: get rid of __GFP_OTHER_NODE

The flag was introduced by commit 78afd5612deb ("mm: add
__GFP_OTHER_NODE flag") to allow proper accounting of remote node
allocations done by kernel daemons on behalf of a process - e.g.
khugepaged.

After "mm: fix remote numa hits statistics" we neither need nor actually
use the flag, so we can safely remove it because all allocations which
are satisfied from their "home" node are accounted properly.

[mhocko@suse.com: fix build]
Link: http://lkml.kernel.org/r/20170106122225.GK5556@dhcp22.suse.cz
Link: http://lkml.kernel.org/r/20170102153057.9451-3-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 2a865bd8 29-Nov-2016 David Ahern <dsa@cumulusnetworks.com>

perf kmem: Add option to specify time window of interest

Add an option that lets the user control the analysis window, e.g.,
collect data for a time window and analyze a segment of interest within
that window.
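
Conceptually the window acts as a per-sample filter. The sketch below
shows one way such a filter could look; demo_time_window and
demo_skip_sample() are hypothetical names, timestamps are assumed to be
in nanoseconds, a bound of 0 is assumed to mean "open", and both ends
are assumed to be inclusive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical analysis window; 0 means "no bound on this side". */
struct demo_time_window {
	uint64_t start_ns;
	uint64_t end_ns;
};

/* Returns true when the sample lies outside the window of interest. */
bool demo_skip_sample(const struct demo_time_window *w, uint64_t ts_ns)
{
	assert(w != NULL);

	if (w->start_ns && ts_ns < w->start_ns)
		return true;
	if (w->end_ns && ts_ns > w->end_ns)
		return true;
	return false;
}
```

An interval with only a start, such as { 20119782088000, 0 }, keeps
every sample from that timestamp onwards.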

Committer notes:

Testing it:

# perf kmem record usleep 1
[ perf record: Woken up 0 times to write data ]
[ perf record: Captured and wrote 1.540 MB perf.data (2049 samples) ]
# perf evlist
kmem:kmalloc
kmem:kmalloc_node
kmem:kfree
kmem:kmem_cache_alloc
kmem:kmem_cache_alloc_node
kmem:kmem_cache_free
# Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events
#
# # Use 'perf script' for a first look, then select a chunk to use
# # with 'perf kmem stat --time'
#
# perf script | tail -15
usleep 9889 [0] 20119.782088: kmem:kmem_cache_free: (selinux_file_free_security+0x27) call_site=ffffffffb936aa07 ptr=0xffff888a1df49fc0
perf 9888 [3] 20119.782088: kmem:kmem_cache_free: (jbd2_journal_stop+0x1a1) call_site=ffffffffb9334581 ptr=0xffff888bdf1a39c0
perf 9888 [3] 20119.782089: kmem:kmem_cache_alloc: (jbd2__journal_start+0x72) call_site=ffffffffb9333b42 ptr=0xffff888bdf1a39c0 bytes_req=48 bytes_alloc=48 gfp_flags=GFP_NOFS|__GFP_ZERO
perf 9888 [3] 20119.782090: kmem:kmem_cache_free: (jbd2_journal_stop+0x1a1) call_site=ffffffffb9334581 ptr=0xffff888bdf1a39c0
perf 9888 [3] 20119.782090: kmem:kmem_cache_alloc: (jbd2__journal_start+0x72) call_site=ffffffffb9333b42 ptr=0xffff888bdf1a39c0 bytes_req=48 bytes_alloc=48 gfp_flags=GFP_NOFS|__GFP_ZERO
usleep 9889 [0] 20119.782091: kmem:kmem_cache_alloc: (__sigqueue_alloc+0x4a) call_site=ffffffffb90ad33a ptr=0xffff8889f071f6e0 bytes_req=160 bytes_alloc=160 gfp_flags=GFP_ATOMIC|__GFP_NOTRACK
perf 9888 [3] 20119.782091: kmem:kmem_cache_free: (jbd2_journal_stop+0x1a1) call_site=ffffffffb9334581 ptr=0xffff888bdf1a39c0
perf 9888 [3] 20119.782093: kmem:kmem_cache_free: (__sigqueue_free.part.17+0x33) call_site=ffffffffb90ad3f3 ptr=0xffff8889f071f6e0
perf 9888 [3] 20119.782098: kmem:kmem_cache_alloc: (jbd2__journal_start+0x72) call_site=ffffffffb9333b42 ptr=0xffff888bdf1a39c0 bytes_req=48 bytes_alloc=48 gfp_flags=GFP_NOFS|__GFP_ZERO
perf 9888 [3] 20119.782098: kmem:kmem_cache_free: (jbd2_journal_stop+0x1a1) call_site=ffffffffb9334581 ptr=0xffff888bdf1a39c0
perf 9888 [3] 20119.782099: kmem:kmem_cache_alloc: (jbd2__journal_start+0x72) call_site=ffffffffb9333b42 ptr=0xffff888bdf1a39c0 bytes_req=48 bytes_alloc=48 gfp_flags=GFP_NOFS|__GFP_ZERO
perf 9888 [3] 20119.782100: kmem:kmem_cache_alloc: (alloc_buffer_head+0x21) call_site=ffffffffb9287cc1 ptr=0xffff8889b12722d8 bytes_req=104 bytes_alloc=104 gfp_flags=GFP_NOFS|__GFP_ZERO
perf 9888 [3] 20119.782101: kmem:kmem_cache_free: (jbd2_journal_stop+0x1a1) call_site=ffffffffb9334581 ptr=0xffff888bdf1a39c0
perf 9888 [3] 20119.782102: kmem:kmem_cache_alloc: (jbd2__journal_start+0x72) call_site=ffffffffb9333b42 ptr=0xffff888bdf1a39c0 bytes_req=48 bytes_alloc=48 gfp_flags=GFP_NOFS|__GFP_ZERO
perf 9888 [3] 20119.782103: kmem:kmem_cache_free: (jbd2_journal_stop+0x1a1) call_site=ffffffffb9334581 ptr=0xffff888bdf1a39c0
#
# # stats for the whole perf.data file, i.e. no interval specified
#
# perf kmem stat

SUMMARY (SLAB allocator)
========================
Total bytes requested: 172,628
Total bytes allocated: 173,088
Total bytes freed: 161,280
Net total bytes allocated: 11,808
Total bytes wasted on internal fragmentation: 460
Internal fragmentation: 0.265761%
Cross CPU allocations: 0/851
#
# # stats for an open-ended interval, after a certain time:
#
# perf kmem stat --time 20119.782088,

SUMMARY (SLAB allocator)
========================
Total bytes requested: 552
Total bytes allocated: 552
Total bytes freed: 448
Net total bytes allocated: 104
Total bytes wasted on internal fragmentation: 0
Internal fragmentation: 0.000000%
Cross CPU allocations: 0/8
#

Signed-off-by: David Ahern <dsahern@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1480439746-42695-6-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# aa58e9af 25-Nov-2016 David Ahern <dsa@cumulusnetworks.com>

perf kmem stat: Track memory freed

Track freed memory as well as allocations and show the net in the
summary.

Committer notes:

Testing it:

# perf kmem record usleep 1
[ perf record: Woken up 0 times to write data ]
[ perf record: Captured and wrote 1.626 MB perf.data (4208 samples) ]
[root@jouet ~]# perf kmem stat --slab

SUMMARY (SLAB allocator)
========================
Total bytes requested: 234,011
Total bytes allocated: 234,504
Total bytes freed: 213,328 <------
Net total bytes allocated: 21,176
Total bytes wasted on internal fragmentation: 493
Internal fragmentation: 0.210231%
Cross CPU allocations: 4/1,963
#
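
The figures in the summary above are consistent with simple arithmetic
on three counters. The sketch below is inferred from the printed
numbers, not taken from the perf source, and demo_kmem_totals and its
helpers are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the counters behind the summary. */
struct demo_kmem_totals {
	uint64_t bytes_requested;
	uint64_t bytes_allocated;
	uint64_t bytes_freed;
};

/* "Net total bytes allocated": allocated minus freed. */
int64_t demo_net_allocated(const struct demo_kmem_totals *t)
{
	assert(t != NULL);
	return (int64_t)t->bytes_allocated - (int64_t)t->bytes_freed;
}

/* "Total bytes wasted on internal fragmentation". */
uint64_t demo_wasted(const struct demo_kmem_totals *t)
{
	assert(t != NULL && t->bytes_allocated >= t->bytes_requested);
	return t->bytes_allocated - t->bytes_requested;
}

/* "Internal fragmentation" as a percentage of the bytes allocated. */
double demo_fragmentation_pct(const struct demo_kmem_totals *t)
{
	assert(t != NULL);
	if (t->bytes_allocated == 0)
		return 0.0;
	return 100.0 * (double)demo_wasted(t) / (double)t->bytes_allocated;
}
```

Feeding in 234,011 requested, 234,504 allocated and 213,328 freed
gives a net of 21,176 bytes, 493 wasted bytes and a fragmentation of
about 0.21%.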

Signed-off-by: David Ahern <dsahern@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1480110133-37039-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# be39db9f 01-Sep-2016 Arnaldo Carvalho de Melo <acme@redhat.com>

perf symbols: Remove symbol_filter_t machinery

We're not using it anymore. Few users ever were, and we really can do
without it, simplifying lots of functions by removing it.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-1zng8wdznn00iiz08bb7q3vn@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 25160354 28-Jul-2016 Vlastimil Babka <vbabka@suse.cz>

mm, thp: remove __GFP_NORETRY from khugepaged and madvised allocations

After the previous patch, we can distinguish costly allocations that
should be really lightweight, such as THP page faults, with
__GFP_NORETRY. This means we don't need to recognize khugepaged
allocations via PF_KTHREAD anymore. We can also change THP page faults
in areas where madvise(MADV_HUGEPAGE) was used to try as hard as
khugepaged, as the process has indicated that it benefits from THP's and
is willing to pay some initial latency costs.

We can also make the flags handling less cryptic by distinguishing
GFP_TRANSHUGE_LIGHT (no reclaim at all, default mode in page fault) from
GFP_TRANSHUGE (only direct reclaim, khugepaged default). Adding
__GFP_NORETRY or __GFP_KSWAPD_RECLAIM is done where needed.

The patch effectively changes the current GFP_TRANSHUGE users as
follows:

* get_huge_zero_page() - the zero page lifetime should be relatively
long and it's shared by multiple users, so it's worth spending some
effort on it. We use GFP_TRANSHUGE, and __GFP_NORETRY is not added.
This also restores direct reclaim to this allocation, which was
unintentionally removed by commit e4a49efe4e7e ("mm: thp: set THP defrag
by default to madvise and add a stall-free defrag option")

* alloc_hugepage_khugepaged_gfpmask() - this is khugepaged, so latency
is not an issue. So if khugepaged "defrag" is enabled (the default), do
reclaim via GFP_TRANSHUGE without __GFP_NORETRY. We can remove the
PF_KTHREAD check from page alloc.

As a side-effect, khugepaged will now no longer check if the initial
compaction was deferred or contended. This is OK, as khugepaged sleep
times between collapse attempts are long enough to prevent noticeable
disruption, so we should allow it to spend some effort.

* migrate_misplaced_transhuge_page() - already was masking out
__GFP_RECLAIM, so just convert to GFP_TRANSHUGE_LIGHT which is
equivalent.

* alloc_hugepage_direct_gfpmask() - vma's with VM_HUGEPAGE (via madvise)
are now allocating without __GFP_NORETRY. Other vma's keep using
__GFP_NORETRY if direct reclaim/compaction is at all allowed (by default
it's allowed only for madvised vma's). The rest is conversion to
GFP_TRANSHUGE(_LIGHT).

[mhocko@suse.com: suggested GFP_TRANSHUGE_LIGHT]
Link: http://lkml.kernel.org/r/20160721073614.24395-7-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# e5cadb93 23-Jun-2016 Arnaldo Carvalho de Melo <acme@redhat.com>

perf evlist: Rename for_each() macros to for_each_entry()

To match the semantics for list.h in the kernel, that are used to
implement those macros.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-qbcjlgj0ffxquxscahbpddi3@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 41840d21 23-Jun-2016 Taeung Song <treeze.taeung@gmail.com>

perf config: Move config declarations from util/cache.h to util/config.h

util/config.h was added recently, but util/cache.h still has the
declarations of functions and a global variable for config features.

To manage configuration code in one place, move them to util/config.h
and let source files that need config features include config.h. If a
source file that previously included cache.h needs only config.h, stop
including cache.h.

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1466672119-4852-2-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 91d7b2de 14-Apr-2016 Arnaldo Carvalho de Melo <acme@redhat.com>

perf callchain: Start moving away from global per thread cursors

The recent perf_evsel__fprintf_callchain() move to evsel.c added several
new symbol requirements to the python binding, for instance:

# perf test -v python
16: Try 'import perf' in python, checking link problems :
--- start ---
test child forked, pid 18030
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: /tmp/build/perf/python/perf.so: undefined symbol:
callchain_cursor
test child finished with -1
---- end ----
Try 'import perf' in python, checking link problems: FAILED!
#

This would require linking against callchain.c to access to the global
callchain_cursor variables.

Since lots of functions already receive as a parameter a
callchain_cursor struct pointer, make that be the case for some more
function so that we can start phasing out usage of yet another global
variable.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-djko3097eyg2rn66v2qcqfvn@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 420adbe9 15-Mar-2016 Vlastimil Babka <vbabka@suse.cz>

mm, tracing: unify mm flags handling in tracepoints and printk

In tracepoints, it's possible to print gfp flags in a human-friendly
format through a macro show_gfp_flags(), which defines a translation
array and passes it to __print_flags(). Since the following patch will
introduce support for gfp flags printing in printk(), it would be nice
to reuse the array. This is not straightforward, since __print_flags()
can't simply reference an array defined in a .c file such as mm/debug.c
- it has to be a macro to allow the macro magic to communicate the
format to userspace tools such as trace-cmd.

The solution is to create a macro __def_gfpflag_names which is used both
in show_gfp_flags(), and to define the gfpflag_names[] array in
mm/debug.c.
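
The technique is the usual "one list macro, several consumers" pattern.
The sketch below only illustrates the idea in user space: every DEMO_*
and demo_* name is hypothetical, the flag values are made up, and the
real macro bodies in the kernel differ.

```c
#include <assert.h>
#include <stddef.h>

/* Made-up flag values for the demo; the real gfp bits differ. */
#define DEMO_GFP_HIGH		0x1u
#define DEMO_GFP_ZERO		0x2u
#define DEMO_GFP_NORETRY	0x4u

struct demo_flag_name {
	unsigned int mask;
	const char *name;
};

/* Single source of truth: expands to a list of { mask, "name" } pairs. */
#define DEMO_DEF_FLAG_NAMES				\
	{ DEMO_GFP_HIGH,    "__GFP_HIGH" },		\
	{ DEMO_GFP_ZERO,    "__GFP_ZERO" },		\
	{ DEMO_GFP_NORETRY, "__GFP_NORETRY" }

static const char *demo_lookup(unsigned int flag,
			       const struct demo_flag_name *tbl)
{
	assert(tbl != NULL);
	for (; tbl->name; tbl++)
		if (tbl->mask == flag)
			return tbl->name;
	return "?";
}

/* Consumer 1: an ordinary array defined once in a .c file. */
static const struct demo_flag_name demo_flag_names[] = {
	DEMO_DEF_FLAG_NAMES,
	{ 0, NULL }
};

const char *demo_flag_to_name(unsigned int flag)
{
	return demo_lookup(flag, demo_flag_names);
}

/*
 * Consumer 2: a macro that builds the table at the use site, for
 * callers that cannot reference an array defined in another file.
 */
#define demo_show_flag(flag)						\
	demo_lookup((flag), (const struct demo_flag_name[]){		\
				DEMO_DEF_FLAG_NAMES, { 0, NULL } })
```

Adding a flag to DEMO_DEF_FLAG_NAMES updates both consumers at once,
which is the property the commit is after.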

On the other hand, mm/debug.c also defines translation tables for page
flags and vma flags, and desire was expressed (but not implemented in
this series) to use these also from tracepoints. Thus, this patch also
renames the events/gfpflags.h file to events/mmflags.h and moves the
table definitions there, using the same macro approach as for gfpflags.
This allows translating all three kinds of mm-specific flags both in
tracepoints and printk.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Michal Hocko <mhocko@suse.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 14e0a214 15-Mar-2016 Vlastimil Babka <vbabka@suse.cz>

tools, perf: make gfp_compact_table up to date

When updating tracing's show_gfp_flags() I have noticed that perf's
gfp_compact_table is also outdated. Fill in the missing flags and place
a note in gfp.h to increase chance that future updates are synced.
Convert the __GFP_X flags from "GFP_X" to "__GFP_X" strings in line with
the previous patch.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# b8cbb349 26-Feb-2016 Wang Nan <wangnan0@huawei.com>

perf config: Bring perf_default_config to the very beginning at main()

Before this patch each subcommand calls perf_config() by itself,
reading the default configuration together with subcommand specific
options. If a subcommand doesn't have its own options, it needs to call
'perf_config(perf_default_config, NULL)' to ensure .perfconfig is
loaded.

This patch brings perf_config(perf_default_config, NULL) to the very
start of main(), so subcommands don't need to do it.

After this patch, 'llvm.clang-path' works for 'perf trace'.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Suggested-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456479154-136027-4-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 4b6ab94e 15-Dec-2015 Josh Poimboeuf <jpoimboe@redhat.com>

perf subcmd: Create subcmd library

Move the subcommand-related files from perf to a new library named
libsubcmd.a.

Since we're moving files anyway, go ahead and rename 'exec_cmd.*' to
'exec-cmd.*' to be consistent with the naming of all the other files.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/c0a838d4c878ab17fee50998811612b2281355c1.1450193761.git.jpoimboe@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# a5e813c6 30-Sep-2015 Arnaldo Carvalho de Melo <acme@redhat.com>

perf machine: Add method for common kernel_map(FUNCTION) operation

And it is also a step in the direction of killing the separation of data
and text maps in map_groups.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-rrds86kb3wx5wk8v38v56gw8@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 77e65977 30-Sep-2015 Arnaldo Carvalho de Melo <acme@redhat.com>

perf machine: Use machine__kernel_map() thoroughly

In places where we were using its open coded equivalent.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-khkdugcdoqy3tkszm3jdxgbe@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 249ca1a8 30-Jun-2015 Taeung Song <treeze.taeung@gmail.com>

perf kmem: Fill in the missing session freeing after an error occurs

When an error occurs, an error value is just returned without freeing
the session. Allocating and freeing the session have to be matched as a
pair even when an error occurs.
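
The usual shape of such a fix is a single exit label that every error
path jumps to. The sketch below shows that pattern with hypothetical
demo_* names, not the actual perf kmem code; a counter of live sessions
is included only so the pairing can be checked.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical session object and a counter of live instances. */
struct demo_session {
	int fd;
};

static int demo_live_sessions;

static struct demo_session *demo_session_new(void)
{
	struct demo_session *s = calloc(1, sizeof(*s));

	if (s)
		demo_live_sessions++;
	return s;
}

static void demo_session_delete(struct demo_session *s)
{
	assert(s != NULL);
	demo_live_sessions--;
	free(s);
}

/* Hypothetical processing step that can be forced to fail. */
static int demo_step(bool fail)
{
	return fail ? -EINVAL : 0;
}

int demo_live_session_count(void)
{
	return demo_live_sessions;
}

/* fail_step selects which step fails (0 means none). */
int demo_cmd(int fail_step)
{
	struct demo_session *session = demo_session_new();
	int err;

	if (!session)
		return -ENOMEM;

	err = demo_step(fail_step == 1);
	if (err)
		goto out_delete;	/* not "return err": that would leak */

	err = demo_step(fail_step == 2);

out_delete:
	demo_session_delete(session);
	return err;
}
```

Whichever step fails, demo_live_session_count() is back to zero when
demo_cmd() returns.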

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1435652124-22414-3-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# b2365122 29-May-2015 Arnaldo Carvalho de Melo <acme@redhat.com>

perf kmem: Fix compiler warning about may be accessing uninitialized variable

The last argument to strtok_r doesn't need to be initialized, it's just
a placeholder to make this routine reentrant, but gcc doesn't know about
that and complains, breaking the build. Fix it by setting it to NULL.
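
A minimal sketch of the pattern, with a hypothetical helper that is not
the perf kmem code: strtok_r() writes to the save pointer on the first
call before ever reading it, so the NULL is there only for the
compiler's benefit.

```c
#define _POSIX_C_SOURCE 200809L	/* for strtok_r() under strict ISO C */
#include <assert.h>
#include <string.h>

/* Hypothetical helper: count '|'-separated tokens in a writable string. */
int demo_count_tokens(char *str)
{
	char *saveptr = NULL;	/* silences "may be used uninitialized" */
	char *tok;
	int n = 0;

	assert(str != NULL);

	for (tok = strtok_r(str, "|", &saveptr); tok;
	     tok = strtok_r(NULL, "|", &saveptr))
		n++;

	return n;
}
```

The input has to be writable because strtok_r() replaces delimiters
with NUL bytes; a string such as "GFP_NOFS|__GFP_ZERO" yields two
tokens.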

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/n/tip-8e8rgbg3aom9uarsyqjrsctg@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 08a9b985 11-May-2015 Arnaldo Carvalho de Melo <acme@redhat.com>

perf kmem: Fix compiler warning about may be accessing uninitialized variable

The last argument to strtok_r doesn't need to be initialized, it's just
a placeholder to make this routine reentrant, but gcc doesn't know about
that and complains, breaking the build. Fix it by setting it to NULL.

Fixes: 0e11115644b3 ("perf kmem: Print gfp flags in human readable string")
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-iyyvkbnkrd9g19f6ta9zfkem@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# b91fc39f 06-Apr-2015 Arnaldo Carvalho de Melo <acme@redhat.com>

perf machine: Protect the machine->threads with a rwlock

In addition to using refcounts for the struct thread lifetime
management, we need to protect access to machine->threads from
concurrent access.

That happens in 'perf top', where a thread processes events, inserting
and deleting entries from that rb_tree while another thread decays
hist_entries, that end up dropping references and ultimately deleting
threads from the rb_tree and releasing its resources when no further
hist_entry (or other data structures, like in 'perf sched') references
it.

So the rule is the same for refcounts + protected trees in the kernel,
get the tree lock, find object, bump the refcount, drop the tree lock,
return, use object, drop the refcount if no more use of it is needed,
keep it if storing it in some other data structure, drop when releasing
that data structure.

I.e. pair "t = machine__find(new)_thread()" with a "thread__put(t)", and
"perf_event__preprocess_sample(&al)" with "addr_location__put(&al)".

The addr_location__put() one is because as we return references to
several data structures, we may end up adding more reference counting
for the other data structures and then we'll drop it at
addr_location__put() time.

Acked-by: David Ahern <dsahern@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-bs9rt4n0jw3hi9f3zxyy3xln@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# a923e2c4 05-May-2015 Namhyung Kim <namhyung@kernel.org>

perf kmem: Show warning when trying to run stat without record

Sometimes one can mistakenly run 'perf kmem stat' without running 'perf
kmem record' before or with a different configuration like recording
--slab and stat --page. Show a warning message like the one below to
inform the user:

# perf kmem stat --page --caller
No page allocation events found. Have you run 'perf kmem record --page'?

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/1430837572-31395-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 0c160d49 20-Apr-2015 Namhyung Kim <namhyung@kernel.org>

perf kmem: Add kmem.default config option

Currently perf kmem command will select --slab if neither --slab nor
--page is given for backward compatibility. Add kmem.default config
option to select the default value ('page' or 'slab').

# cat ~/.perfconfig
[kmem]
default = page

# perf kmem stat

SUMMARY (page allocator)
========================
Total allocation requests : 1,518 [ 6,096 KB ]
Total free requests : 1,431 [ 5,748 KB ]

Total alloc+freed requests : 1,330 [ 5,344 KB ]
Total alloc-only requests : 188 [ 752 KB ]
Total free-only requests : 101 [ 404 KB ]

Total allocation failures : 0 [ 0 KB ]
...
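
The selection rule described above might look roughly like this. It is a
sketch, not the code from builtin-kmem.c: the enum, both function names and
the handling of unknown values (returning -1) are assumptions made for
illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum demo_kmem_type {
	DEMO_KMEM_NONE = 0,	/* neither --slab nor --page was given */
	DEMO_KMEM_SLAB = 1,
	DEMO_KMEM_PAGE = 2,
};

/* Map the value of the kmem.default key to a type, -1 if unrecognized. */
static int demo_parse_default(const char *value)
{
	if (strcmp(value, "slab") == 0)
		return DEMO_KMEM_SLAB;
	if (strcmp(value, "page") == 0)
		return DEMO_KMEM_PAGE;
	return -1;
}

/*
 * An explicit command line option wins, then the config value, and
 * finally slab for backward compatibility.
 */
static int demo_select_type(int from_cmdline, const char *config_value)
{
	if (from_cmdline != DEMO_KMEM_NONE)
		return from_cmdline;
	if (config_value == NULL)
		return DEMO_KMEM_SLAB;
	return demo_parse_default(config_value);
}
```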

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/1429592107-1807-6-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 0e111156 20-Apr-2015 Namhyung Kim <namhyung@kernel.org>

perf kmem: Print gfp flags in human readable string

Save libtraceevent output and print it in the header.

# perf kmem stat --page --caller
#
# GFP flags
# ---------
# 00000010: NI: GFP_NOIO
# 000000d0: K: GFP_KERNEL
# 00000200: NWR: GFP_NOWARN
# 000084d0: K|R|Z: GFP_KERNEL|GFP_REPEAT|GFP_ZERO
# 000200d2: HU: GFP_HIGHUSER
# 000200da: HUM: GFP_HIGHUSER_MOVABLE
# 000280da: HUM|Z: GFP_HIGHUSER_MOVABLE|GFP_ZERO
# 002084d0: K|R|Z|NT: GFP_KERNEL|GFP_REPEAT|GFP_ZERO|GFP_NOTRACK
# 0102005a: NF|HW|M: GFP_NOFS|GFP_HARDWALL|GFP_MOVABLE

---------------------------------------------------------------------------------------------------------
Total alloc (KB) | Hits | Order | Mig.type | GFP flags | Callsite
---------------------------------------------------------------------------------------------------------
60 | 15 | 0 | UNMOVABL | K|R|Z|NT | pte_alloc_one
40 | 10 | 0 | MOVABLE | HUM|Z | handle_mm_fault
24 | 6 | 0 | MOVABLE | HUM | do_wp_page
24 | 6 | 0 | UNMOVABL | K | __pollwait
...

Requested-by: Joonsoo Kim <js1304@gmail.com>
Suggested-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/1429592107-1807-5-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 2a7ef02c 20-Apr-2015 Namhyung Kim <namhyung@kernel.org>

perf kmem: Add --live option for current allocation stat

Currently 'perf kmem stat --page' shows total (page) allocation stat by
default, but sometimes one might want to see live (total alloc-only)
requests/pages only. The new --live option does this by subtracting freed
allocation from the stat.

E.g.:

# perf kmem stat --page

SUMMARY (page allocator)
========================
Total allocation requests : 988,858 [ 4,045,368 KB ]
Total free requests : 886,484 [ 3,624,996 KB ]

Total alloc+freed requests : 885,969 [ 3,622,628 KB ]
Total alloc-only requests : 102,889 [ 422,740 KB ]
Total free-only requests : 515 [ 2,368 KB ]

Total allocation failures : 0 [ 0 KB ]

Order Unmovable Reclaimable Movable Reserved CMA/Isolated
----- ------------ ------------ ------------ ------------ ------------
0 172,173 3,083 806,686 . .
1 284 . . . .
2 6,124 58 . . .
3 114 335 . . .
4 . . . . .
5 . . . . .
6 . . . . .
7 . . . . .
8 . . . . .
9 . . 1 . .
10 . . . . .
# perf kmem stat --page --live

SUMMARY (page allocator)
========================
Total allocation requests : 988,858 [ 4,045,368 KB ]
Total free requests : 886,484 [ 3,624,996 KB ]

Total alloc+freed requests : 885,969 [ 3,622,628 KB ]
Total alloc-only requests : 102,889 [ 422,740 KB ]
Total free-only requests : 515 [ 2,368 KB ]

Total allocation failures : 0 [ 0 KB ]

Order Unmovable Reclaimable Movable Reserved CMA/Isolated
----- ------------ ------------ ------------ ------------ ------------
0 2,214 3,025 97,156 . .
1 59 . . . .
2 19 58 . . .
3 23 335 . . .
4 . . . . .
5 . . . . .
6 . . . . .
7 . . . . .
8 . . . . .
9 . . . . .
10 . . . . .
#
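
The figures in the two outputs above are related by simple subtraction,
which the following sketch spells out. The struct and function names are
hypothetical; only the numbers come from the example:

```c
#include <assert.h>
#include <stdint.h>

/* Request counts as printed in the SUMMARY block. */
struct demo_page_totals {
	uint64_t nr_alloc;		/* total allocation requests */
	uint64_t nr_free;		/* total free requests */
	uint64_t nr_alloc_freed;	/* allocations that were later freed */
};

/* Live allocations: allocated during the trace and never freed. */
static uint64_t demo_alloc_only(const struct demo_page_totals *t)
{
	assert(t->nr_alloc >= t->nr_alloc_freed);
	return t->nr_alloc - t->nr_alloc_freed;
}

/* Frees whose matching allocation happened before recording started. */
static uint64_t demo_free_only(const struct demo_page_totals *t)
{
	assert(t->nr_free >= t->nr_alloc_freed);
	return t->nr_free - t->nr_alloc_freed;
}
```

With the example's totals (988,858 allocations, 886,484 frees, 885,969
matched pairs) this gives 102,889 alloc-only and 515 free-only requests.
The per-order table printed with --live also sums to 102,889, while the
table without --live sums to 988,858.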

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/1429592107-1807-4-git-send-email-namhyung@kernel.org
[ Added examples to the changeset log ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# fb4f313d 20-Apr-2015 Namhyung Kim <namhyung@kernel.org>

perf kmem: Support sort keys on page analysis

Add new sort keys for page: page, order, migtype, gfp - existing
'bytes', 'hit' and 'callsite' sort keys also work for page. Note that the
-s/--sort option should be preceded by either the --slab or the --page
option to determine which analysis the sort keys apply to.

Now it properly groups and sorts allocation stats - so same
page/caller with different order/migtype/gfp will be printed on a
different line.

# perf kmem stat --page --caller -l 10 -s order,hit

-----------------------------------------------------------------------------
Total alloc (KB) | Hits | Order | Mig.type | GFP flags | Callsite
-----------------------------------------------------------------------------
64 | 4 | 2 | RECLAIM | 00285250 | new_slab
50,144 | 12,536 | 0 | MOVABLE | 0102005a | __page_cache_alloc
52 | 13 | 0 | UNMOVABL | 002084d0 | pte_alloc_one
40 | 10 | 0 | MOVABLE | 000280da | handle_mm_fault
28 | 7 | 0 | UNMOVABL | 000000d0 | __pollwait
20 | 5 | 0 | MOVABLE | 000200da | do_wp_page
20 | 5 | 0 | MOVABLE | 000200da | do_cow_fault
16 | 4 | 0 | UNMOVABL | 00000200 | __tlb_remove_page
16 | 4 | 0 | UNMOVABL | 000084d0 | __pmd_alloc
8 | 2 | 0 | UNMOVABL | 000084d0 | __pud_alloc
... | ... | ... | ... | ... | ...
-----------------------------------------------------------------------------

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/1429592107-1807-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# c9758cc4 20-Apr-2015 Namhyung Kim <namhyung@kernel.org>

perf kmem: Implement stat --page --caller

Add 'perf kmem' support for per-caller statistics on pages. Unlike the slab
case, the tracepoints in the page allocator don't provide callsite info, so
it records with callchains and extracts the callsite info from them.

Note that the callchain contains several memory allocation functions
which have no meaning for users, so skip those functions to get the proper
callsites. I used the following regex pattern to skip the allocator
functions:

^_?_?(alloc|get_free|get_zeroed)_pages?

This gave me the following list of functions:

# perf kmem record --page sleep 3
# perf kmem stat --page -v
...
alloc func: __get_free_pages
alloc func: get_zeroed_page
alloc func: alloc_pages_exact
alloc func: __alloc_pages_direct_compact
alloc func: __alloc_pages_nodemask
alloc func: alloc_page_interleave
alloc func: alloc_pages_current
alloc func: alloc_pages_vma
alloc func: alloc_page_buffers
alloc func: alloc_pages_exact_nid
...
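
The pattern can be tried out with the POSIX regex API. The sketch below is
an illustration only: is_alloc_func() is a hypothetical helper, and it
recompiles the pattern on each call for brevity, which real code would
avoid:

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Returns 1 if 'name' matches the allocator pattern from the commit
 * message, 0 if it does not, and -1 if the pattern fails to compile.
 */
static int is_alloc_func(const char *name)
{
	static const char pattern[] =
		"^_?_?(alloc|get_free|get_zeroed)_pages?";
	regex_t re;
	int matched;

	assert(name != NULL);

	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return -1;

	matched = regexec(&re, name, 0, NULL, 0) == 0;
	regfree(&re);
	return matched;
}
```

The pattern is anchored only at the start, so names such as
alloc_pages_exact_nid match by prefix, while callers such as pte_alloc_one
or handle_mm_fault do not.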

The output looks mostly the same as --alloc (I also added a callsite column
to that) but groups entries by callsite. Currently, the order,
migrate type and GFP flag info is for the last allocation and is not
guaranteed to be the same for all allocations from the callsite.

---------------------------------------------------------------------------------------------
Total_alloc (KB) | Hits | Order | Mig.type | GFP flags | Callsite
---------------------------------------------------------------------------------------------
1,064 | 266 | 0 | UNMOVABL | 000000d0 | __pollwait
52 | 13 | 0 | UNMOVABL | 002084d0 | pte_alloc_one
44 | 11 | 0 | MOVABLE | 000280da | handle_mm_fault
20 | 5 | 0 | MOVABLE | 000200da | do_cow_fault
20 | 5 | 0 | MOVABLE | 000200da | do_wp_page
16 | 4 | 0 | UNMOVABL | 000084d0 | __pmd_alloc
16 | 4 | 0 | UNMOVABL | 00000200 | __tlb_remove_page
12 | 3 | 0 | UNMOVABL | 000084d0 | __pud_alloc
8 | 2 | 0 | UNMOVABL | 00000010 | bio_copy_user_iov
4 | 1 | 0 | UNMOVABL | 000200d2 | pipe_write
4 | 1 | 0 | MOVABLE | 000280da | do_wp_page
4 | 1 | 0 | UNMOVABL | 002084d0 | pgd_alloc
---------------------------------------------------------------------------------------------

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/1429592107-1807-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 6b1a2752 14-Apr-2015 David Ahern <david.ahern@oracle.com>

perf kmem: Fix compiles on RHEL6/OL6

0d68bc92c48 breaks compiles on RHEL6/OL6:
cc1: warnings being treated as errors
builtin-kmem.c: In function ‘search_page_alloc_stat’:
builtin-kmem.c:322: error: declaration of ‘stat’ shadows a global declaration
node = &parent->rb_left;
/usr/include/sys/stat.h:455: error: shadowed declaration is here
builtin-kmem.c: In function ‘perf_evsel__process_page_alloc_event’:
builtin-kmem.c:378: error: declaration of ‘stat’ shadows a global declaration
/usr/include/sys/stat.h:455: error: shadowed declaration is here
builtin-kmem.c: In function ‘perf_evsel__process_page_free_event’:
builtin-kmem.c:431: error: declaration of ‘stat’ shadows a global declaration
/usr/include/sys/stat.h:455: error: shadowed declaration is here

Rename local variable to pstat to avoid the name conflict.

Signed-off-by: David Ahern <david.ahern@oracle.com>
Link: http://lkml.kernel.org/r/1429033773-31383-1-git-send-email-david.ahern@oracle.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 4ad1f430 14-Apr-2015 David Ahern <david.ahern@oracle.com>

perf kmem: Fix compiles on RHEL6/OL6

0d68bc92c48 breaks compiles on RHEL6/OL6:
cc1: warnings being treated as errors
builtin-kmem.c: In function ‘search_page_alloc_stat’:
builtin-kmem.c:322: error: declaration of ‘stat’ shadows a global declaration
node = &parent->rb_left;
/usr/include/sys/stat.h:455: error: shadowed declaration is here
builtin-kmem.c: In function ‘perf_evsel__process_page_alloc_event’:
builtin-kmem.c:378: error: declaration of ‘stat’ shadows a global declaration
/usr/include/sys/stat.h:455: error: shadowed declaration is here
builtin-kmem.c: In function ‘perf_evsel__process_page_free_event’:
builtin-kmem.c:431: error: declaration of ‘stat’ shadows a global declaration
/usr/include/sys/stat.h:455: error: shadowed declaration is here

Rename local variable to pstat to avoid the name conflict.

Signed-off-by: David Ahern <david.ahern@oracle.com>
Link: http://lkml.kernel.org/r/1429033773-31383-1-git-send-email-david.ahern@oracle.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 6145c259 23-Apr-2015 Will Deacon <will@kernel.org>

perf kmem: Consistently use PRIu64 for printing u64 values

Building the perf tool for 32-bit ARM results in the following build
error due to a combination of an incorrect conversion specifier and
compiling with -Werror:

builtin-kmem.c: In function ‘print_page_summary’:
builtin-kmem.c:644:9: error: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘u64’ [-Werror=format=]
nr_alloc_freed, (total_alloc_freed_bytes) / 1024);
^
builtin-kmem.c:647:9: error: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘u64’ [-Werror=format=]
(total_page_alloc_bytes - total_alloc_freed_bytes) / 1024);
^
cc1: all warnings being treated as errors

This patch fixes the problem by consistently using PRIu64 for printing
out u64 values.
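
A short sketch of the portable form, assuming the value is a 64-bit byte
count printed in KB. The format_kb() helper is hypothetical:

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Format a byte count as "<n> KB". PRIu64 expands to the correct
 * conversion for uint64_t on both 32-bit and 64-bit targets, whereas
 * "%lu" is only correct where unsigned long is 64 bits wide.
 */
static int format_kb(char *buf, size_t len, uint64_t bytes)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%" PRIu64 " KB", bytes / 1024);
}
```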

Signed-off-by: Will Deacon <will.deacon@arm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1429796437-1790-1-git-send-email-will.deacon@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 0d68bc92 05-Apr-2015 Namhyung Kim <namhyung@kernel.org>

perf kmem: Analyze page allocator events also

The perf kmem command records and analyzes kernel memory allocation only
for SLAB objects. This patch implements a simple page allocator analyzer
using the kmem:mm_page_alloc and kmem:mm_page_free events.

It adds two new options, --slab and --page. The --slab option is for
analyzing the SLAB allocator, which is what perf kmem currently does.

The new --page option enables page allocator events and analyzes kernel
memory usage in units of pages. Currently, only the 'stat --alloc'
subcommand is implemented.

If neither --slab nor --page is specified, --slab is implied.

First run 'perf kmem record' to generate a suitable perf.data file:

# perf kmem record --page sleep 5

Then run 'perf kmem stat' to postprocess the perf.data file:

# perf kmem stat --page --alloc --line 10

-------------------------------------------------------------------------------
PFN | Total alloc (KB) | Hits | Order | Mig.type | GFP flags
-------------------------------------------------------------------------------
4045014 | 16 | 1 | 2 | RECLAIM | 00285250
4143980 | 16 | 1 | 2 | RECLAIM | 00285250
3938658 | 16 | 1 | 2 | RECLAIM | 00285250
4045400 | 16 | 1 | 2 | RECLAIM | 00285250
3568708 | 16 | 1 | 2 | RECLAIM | 00285250
3729824 | 16 | 1 | 2 | RECLAIM | 00285250
3657210 | 16 | 1 | 2 | RECLAIM | 00285250
4120750 | 16 | 1 | 2 | RECLAIM | 00285250
3678850 | 16 | 1 | 2 | RECLAIM | 00285250
3693874 | 16 | 1 | 2 | RECLAIM | 00285250
... | ... | ... | ... | ... | ...
-------------------------------------------------------------------------------

SUMMARY (page allocator)
========================
Total allocation requests : 44,260 [ 177,256 KB ]
Total free requests : 117 [ 468 KB ]

Total alloc+freed requests : 49 [ 196 KB ]
Total alloc-only requests : 44,211 [ 177,060 KB ]
Total free-only requests : 68 [ 272 KB ]

Total allocation failures : 0 [ 0 KB ]

Order Unmovable Reclaimable Movable Reserved CMA/Isolated
----- ------------ ------------ ------------ ------------ ------------
0 32 . 44,210 . .
1 . . . . .
2 . 18 . . .
3 . . . . .
4 . . . . .
5 . . . . .
6 . . . . .
7 . . . . .
8 . . . . .
9 . . . . .
10 . . . . .

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/1428298576-9785-4-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 28939e1a 05-Apr-2015 Jiri Olsa <jolsa@kernel.org>

perf kmem: Respect -i option

Currently perf kmem does not respect the -i option.

Initialize file.path properly after the options get parsed.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/1428298576-9785-2-git-send-email-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# d1eeb77c 02-Apr-2015 Yunlong Song <yunlong.song@huawei.com>

perf kmem: Support using -f to override perf.data file ownership

Enable perf kmem to use perf.data when it is not owned by current user
or root.

Example:

# perf kmem record ls
# chown Yunlong.Song:Yunlong.Song perf.data
# ls -al perf.data
-rw------- 1 Yunlong.Song Yunlong.Song 5315665 Apr 2 10:54 perf.data
# id
uid=0(root) gid=0(root) groups=0(root),64(pkcs11)

Before this patch:

# perf kmem stat
File perf.data not owned by current user or root (use -f to override)
# perf kmem stat -f
Error: unknown switch `f'

usage: perf kmem [<options>] {record|stat}

-i, --input <file> input file name
-v, --verbose be more verbose (show symbol address, etc)
--caller show per-callsite statistics
--alloc show per-allocation statistics
-s, --sort <key[,key2...]>
sort by keys: ptr, call_site, bytes, hit,
pingpong, frag
-l, --line <num> show n lines
--raw-ip show raw ip instead of symbol

As shown above, the -f option does not work at all.

After this patch:

# perf kmem stat
File perf.data not owned by current user or root (use -f to override)
# perf kmem stat -f
SUMMARY
=======
Total bytes requested: 437599
Total bytes allocated: 615472
Total bytes wasted on internal fragmentation: 177873
Internal fragmentation: 28.900259%
Cross CPU allocations: 6/1192

As shown above, the -f option really works now.

Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1427982439-27388-4-git-send-email-yunlong.song@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 77cfe388 23-Mar-2015 Namhyung Kim <namhyung@kernel.org>

perf kmem: Print big numbers using thousands' group

Like perf stat, this makes it easy to read the numbers in the stat output, as shown below:

# perf kmem stat

SUMMARY
=======
Total bytes requested: 9,770,900
Total bytes allocated: 9,782,712
Total bytes wasted on internal fragmentation: 11,812
Internal fragmentation: 0.120744%
Cross CPU allocations: 74/152,819
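
For illustration, the grouping shown above can be produced by hand as in
the sketch below. This log does not say which mechanism perf uses
internally, so group_thousands() is a hypothetical, locale-independent
stand-in rather than the tool's own code:

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Write 'v' into 'out' with a comma between each group of three digits. */
static void group_thousands(uint64_t v, char *out, size_t len)
{
	char digits[32];
	int n = snprintf(digits, sizeof(digits), "%" PRIu64, v);
	size_t o = 0;
	int i;

	assert(out != NULL && len > 0);

	/* Keep room for a digit, a possible comma and the terminator. */
	for (i = 0; i < n && o + 2 < len; i++) {
		int remaining = n - 1 - i;

		out[o++] = digits[i];
		if (remaining > 0 && remaining % 3 == 0)
			out[o++] = ',';
	}
	out[o] = '\0';
}
```

If the buffer is too small the output is truncated rather than overrun, so
callers should size it for the widest value they expect (27 bytes covers
any uint64_t).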

Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1427092244-22764-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 65f46e02 12-Mar-2015 Namhyung Kim <namhyung@kernel.org>

perf kmem: Fix alignment of slab result table

Its table was a bit misaligned. Fix it.

Before:

# perf kmem stat --caller -l 10
------------------------------------------------------------------------------------------------------
Callsite | Total_alloc/Per | Total_req/Per | Hit | Ping-pong | Frag
------------------------------------------------------------------------------------------------------
radeon_cs_parser_init.part.1+11a | 2080/260 | 1504/188 | 8 | 0 | 27.692%
radeon_cs_parser_init.part.1+e1 | 384/96 | 288/72 | 4 | 0 | 25.000%
radeon_cs_parser_init.part.1+93 | 128/32 | 96/24 | 4 | 0 | 25.000%
load_elf_binary+a39 | 512/512 | 392/392 | 1 | 0 | 23.438%
__alloc_skb+89 | 6144/877 | 4800/685 | 7 | 6 | 21.875%
radeon_fence_emit+5c | 1152/192 | 912/152 | 6 | 0 | 20.833%
radeon_cs_parser_relocs+ad | 8192/2048 | 6624/1656 | 4 | 0 | 19.141%
radeon_sa_bo_new+78 | 1280/64 | 1120/56 | 20 | 0 | 12.500%
load_elf_binary+2c4 | 32/32 | 28/28 | 1 | 0 | 12.500%
anon_vma_prepare+101 | 576/72 | 512/64 | 8 | 0 | 11.111%
... | ... | ... | ... | ... | ...
------------------------------------------------------------------------------------------------------

After:

---------------------------------------------------------------------------------------------------------
Callsite | Total_alloc/Per | Total_req/Per | Hit | Ping-pong | Frag
---------------------------------------------------------------------------------------------------------
radeon_cs_parser_init.part.1+11a | 2080/260 | 1504/188 | 8 | 0 | 27.692%
radeon_cs_parser_init.part.1+e1 | 384/96 | 288/72 | 4 | 0 | 25.000%
radeon_cs_parser_init.part.1+93 | 128/32 | 96/24 | 4 | 0 | 25.000%
load_elf_binary+a39 | 512/512 | 392/392 | 1 | 0 | 23.438%
__alloc_skb+89 | 6144/877 | 4800/685 | 7 | 6 | 21.875%
radeon_fence_emit+5c | 1152/192 | 912/152 | 6 | 0 | 20.833%
radeon_cs_parser_relocs+ad | 8192/2048 | 6624/1656 | 4 | 0 | 19.141%
radeon_sa_bo_new+78 | 1280/64 | 1120/56 | 20 | 0 | 12.500%
load_elf_binary+2c4 | 32/32 | 28/28 | 1 | 0 | 12.500%
anon_vma_prepare+101 | 576/72 | 512/64 | 8 | 0 | 11.111%
... | ... | ... | ... | ... | ...
---------------------------------------------------------------------------------------------------------

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1426145571-3065-4-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# bd72a33e 12-Mar-2015 Namhyung Kim <namhyung@kernel.org>

perf kmem: Allow -v option

Currently perf kmem fails when the -v option is used. As it's very useful
for debugging, let's allow it.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1426145571-3065-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 405f8755 12-Mar-2015 Namhyung Kim <namhyung@kernel.org>

perf kmem: Fix segfault when invalid sort key is given

When it tries to free 'str', it was already updated by strsep() - so it
needs to save the original pointer.

# perf kmem stat -s xxx,hit
Error: Unknown --sort key: 'xxx'
*** Error in `perf': free(): invalid pointer: 0x0000000000e9e7b6 ***
======= Backtrace: =========
/usr/lib/libc.so.6(+0x7198e)[0x7fc7e6e0d98e]
/usr/lib/libc.so.6(+0x76dee)[0x7fc7e6e12dee]
/usr/lib/libc.so.6(+0x775cb)[0x7fc7e6e135cb]
./perf[0x44a1b5]
./perf[0x490b20]
./perf(parse_options_step+0x173)[0x491773]
./perf(parse_options_subcommand+0xa7)[0x491fb7]
./perf(cmd_kmem+0x2bc)[0x44ae4c]
./perf[0x47aa13]
./perf(main+0x60a)[0x427a9a]
/usr/lib/libc.so.6(__libc_start_main+0xf0)[0x7fc7e6dbc800]
./perf(_start+0x29)[0x427bb9]
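
The bug class behind the backtrace above can be sketched as follows:
strsep() advances the pointer it is given, so the address returned by the
allocator has to be kept separately for free(). The count_sort_keys()
helper is hypothetical and is not the code from builtin-kmem.c:

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Count comma-separated keys in 'arg'.
 * Returns -1 on allocation failure or when a key is empty.
 */
static int count_sort_keys(const char *arg)
{
	char *str = strdup(arg);
	char *pos = str;	/* strsep() moves 'pos'; 'str' stays put */
	char *tok;
	int n = 0;

	if (str == NULL)
		return -1;

	while ((tok = strsep(&pos, ",")) != NULL) {
		if (*tok == '\0') {
			free(str);	/* the original pointer, not 'pos' */
			return -1;
		}
		n++;
	}

	free(str);
	return n;
}
```

Passing 'pos' to free() on the error path would hand the allocator an
address in the middle of the buffer (or NULL), which is the kind of
"free(): invalid pointer" abort shown in the backtrace.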

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1426145571-3065-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# b7b61cbe 03-Mar-2015 Arnaldo Carvalho de Melo <acme@redhat.com>

perf ordered_events: Shorten function signatures

By keeping pointers to machines, evlist and tool in ordered_events.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-0c6huyaf59mqtm2ek9pmposl@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 52e02834 23-Sep-2014 Taeung Song <treeze.taeung@gmail.com>

perf tools: Modify error code for when perf_session__new() fails

Because perf_session__new() can fail for more reasons than just ENOMEM,
change the returned error code (ENOMEM or EINVAL) to -1.

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1411522417-9917-1-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 0a7e6d1b 12-Aug-2014 Namhyung Kim <namhyung@kernel.org>

perf tools: Check recorded kernel version when finding vmlinux

Currently vmlinux_path__init() only tries to find vmlinux file from
current directory, /boot and some canonical directories with version
number of the running kernel. This can be a problem when reporting old
data recorded on a kernel version not running currently.

We can use the --symfs option for this, but it's annoying for the user to
always do it. As we already have the info in the perf.data file, it can be
changed to use it for the search automatically.

Before:

$ perf report
...
# Samples: 4K of event 'cpu-clock'
# Event count (approx.): 1067250000
#
# Overhead Command Shared Object Symbol
# ........ .......... ................. ..............................
71.87% swapper [kernel.kallsyms] [k] recover_probed_instruction

After:

# Overhead Command Shared Object Symbol
# ........ .......... ................. ....................
71.87% swapper [kernel.kallsyms] [k] native_safe_halt

This requires changing the signature of symbol__init() to receive a struct
perf_session_env *.

Reported-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1407825645-24586-14-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 2b2b2c68 12-Aug-2014 Namhyung Kim <namhyung@kernel.org>

perf kmem: Move session handling out of __cmd_kmem()

This is in preparation for fixing dso__load_kernel_sym(). It needs the
session info before calling symbol__init().

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1407825645-24586-7-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 64c40908 31-Jul-2014 Namhyung Kim <namhyung@kernel.org>

perf kmem: Do not ignore mmap events

The perf kmem command didn't process mmap events for some unknown reason
and it instead gets symbol info from a running kernel. This is
problematic if perf kmem record was run on a different kernel.

This patch adds the mmap event handlers and reverts the commit
e727ca73f85d ("perf kmem: Resolve kernel symbols again").

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1406872771-23933-1-git-send-email-namhyung@kernel.org
[ Fixed up merge conflict with Jiri's ordered_events rename patch set ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 0a8cb85c 06-Jul-2014 Jiri Olsa <jolsa@kernel.org>

perf tools: Rename ordered_samples bool to ordered_events

The time ordering is generic for all kinds of events, so using generic
name 'ordered_events' for ordered_samples bool in perf_tool struct.

No functional change was intended.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-07mrqzcuhsks9wfmxrzsvemz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 13ce34df 11-May-2014 Namhyung Kim <namhyung@kernel.org>

perf tools: Use tid for finding thread

I believe that passing pid (instead of tid) as the 3rd arg of the
machine__find*_thread() was to find a main thread so that it can
search proper map group for symbols. However with the map sharing
patch applied, it now can do it in any thread.

It fixes a bug where, when each thread has a different name, only the
main thread is reported for samples in other threads.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: David Ahern <dsahern@gmail.com>
Acked-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1399856202-26221-1-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>


# 4b627957 07-Apr-2014 Don Zickus <dzickus@redhat.com>

perf kmem: Utilize the new generic cpunode_map

Use the cpunode_map implementation from the previous patch in
builtin-kmem.c. There should not be any functional difference.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Link: http://lkml.kernel.org/r/1396896924-129847-4-git-send-email-dzickus@redhat.com
Signed-off-by: Jiri Olsa <jolsa@redhat.com>


# 3bca2354 14-Mar-2014 Ramkumar Ramachandra <artagnon@gmail.com>

perf kmem: Introduce --list-cmds for use by scripts

Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1394853474-31019-2-git-send-email-artagnon@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>


# 744a9719 06-Nov-2013 Arnaldo Carvalho de Melo <acme@redhat.com>

perf evsel: Ditch evsel->handler.data field

Not needed since this cset:

fcf65bf149af: perf evsel: Cache associated event_format

So let's trim this struct a bit.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-j8setslokt0goiwxq9dogzqm@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# b9c5143a 11-Sep-2013 Frederic Weisbecker <fweisbec@gmail.com>

perf tools: Use an accessor to read thread comm

As the thread comm is going to be implemented by way of a more
complicated data structure than just a pointer to a string from the
thread struct, convert the readers of comm to use an accessor instead of
accessing it directly.

The accessor will later be overridden to support an enhanced comm
implementation.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-wr683zwy94hmj4ibogmnv9ce@git.kernel.org
[ Rename thread__comm_curr() to thread__comm_str() ]
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
[ Fixed up some minor const pointer issues ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# f5fc1412 15-Oct-2013 Jiri Olsa <jolsa@redhat.com>

perf tools: Add data object to handle perf data file

This patch is adding 'struct perf_data_file' object as a placeholder for
all attributes regarding perf.data file handling. Changing
perf_session__new to take it as an argument.

The rest of the functionality will be added later to keep this change
simple enough, because all the places using perf_session are changed
now.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1381847254-28809-2-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 4921e320 12-Sep-2013 Jiri Olsa <jolsa@redhat.com>

perf kmem: Make it work again on non NUMA machines

The commit '2814eb0 perf kmem: Remove die() calls' disabled 'perf kmem'
command for machines without numa support. It made the command fail if
'/sys/devices/system/node' dir wasn't found.

Skip the NUMA-based initialization if the directory is not found and
continue execution.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1379003976-5839-5-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# ef89325f 27-Aug-2013 Adrian Hunter <adrian.hunter@intel.com>

perf tools: Remove references to struct ip_event

The ip_event struct assumes fixed positions for ip, pid and tid. That
is no longer true with the addition of PERF_SAMPLE_IDENTIFIER. The
information is anyway in struct sample, so use that instead.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1377591794-30553-5-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 314add6b 27-Aug-2013 Adrian Hunter <adrian.hunter@intel.com>

perf tools: change machine__findnew_thread() to set thread pid

Add a new parameter for 'pid' to machine__findnew_thread().
Change callers to pass 'pid' when it is known.

Note that callers sometimes want to find the main thread
which has the memory maps. The main thread has tid == pid
so the usage in that case is:

machine__findnew_thread(machine, pid, pid)

whereas the usage to find the specific thread is:

machine__findnew_thread(machine, pid, tid)

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1377591794-30553-2-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 38051234 04-Jul-2013 Adrian Hunter <adrian.hunter@intel.com>

perf tools: struct thread has a tid not a pid

As evident from 'machine__process_fork_event()' and
'machine__process_exit_event()' the 'pid' member of struct thread is
actually the tid.

Rename 'pid' to 'tid' in struct thread accordingly.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1372944040-32690-13-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 4a4d371a 05-Jun-2013 Jiri Olsa <jolsa@redhat.com>

perf record: Remove -f/--force option

It no longer has any effect on the processing and is marked as obsolete
anyway.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-tvwyspiqr4getzfib2lw06ty@git.kernel.org
Link: http://lkml.kernel.org/r/1372307120-737-1-git-send-email-namhyung@kernel.org
[ combined patch removing the -f usage in various sub-commands, such as 'perf sched', etc, by Namhyung Kim ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 8d9233f2 24-Jan-2013 Arnaldo Carvalho de Melo <acme@redhat.com>

perf kmem: Use memdup()

Instead of hand coded equivalent.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-42ldngi973f4ssvzlklo8t2k@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
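
As a rough illustration of what "hand coded equivalent" refers to in the
memdup() change above, the sketch below folds the usual malloc() plus
memcpy() pair into one helper. The name memdup_sketch() is made up for this
example; it is not the helper perf itself provides.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Illustrative memdup()-style helper (hypothetical name): allocate len
 * bytes and copy src into them. Returns NULL if the allocation fails;
 * the caller owns the result and must free() it.
 */
void *memdup_sketch(const void *src, size_t len)
{
	void *p;

	assert(src != NULL);

	p = malloc(len);
	if (p != NULL)
		memcpy(p, src, len);
	return p;
}
```

A caller would write something like `copy = memdup_sketch(buf, sizeof(buf))`
and check the result for NULL, rather than repeating the allocation and the
copy at every call site.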


# 49e4ba54 20-Dec-2012 Sasha Levin <sasha.levin@oracle.com>

perf kmem: use ARRAY_SIZE instead of reinventing it

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1356030701-16284-8-git-send-email-sasha.levin@oracle.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 34ba5122 19-Dec-2012 Arnaldo Carvalho de Melo <acme@redhat.com>

perf machine: Simplify accessing the host machine

It is always there, no sense in calling a function named
"perf_session__find_host_machine".

Also no sense in checking if that function return is NULL, so ditch
needless error handling.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-a6a3zx3afbrxo8p2zqm5mxo8@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 70cb4e96 29-Oct-2012 Feng Tang <feng.tang@intel.com>

perf tools: Add a global variable "const char *input_name"

Currently many perf commands (annotate, evlist, report, script, lock,
etc.) support the "-i" option to choose a specific perf data file, and
all of them create a local "input_name" to save that file name.

Since most of these commands need it, we can add a global variable for
it, which also brings some other benefits:

1. When calling script browser inside hists/annotation browser, it needs
to know the perf data file name to run that script.

2. For further feature like runtime switching to another perf data file,
this variable can also help.

Signed-off-by: Feng Tang <feng.tang@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1351569369-26732-2-git-send-email-feng.tang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 0433ffbe 01-Oct-2012 Arnaldo Carvalho de Melo <acme@redhat.com>

perf kmem: Don't use globals where not needed to

Some variables were global but used in just one function, so move it to
where it belongs.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-wu8lz0g2qg26aqgi51xgzkpp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 0f7d2f1b 24-Sep-2012 Arnaldo Carvalho de Melo <acme@redhat.com>

perf kmem: Use perf_evsel__intval and perf_session__set_tracepoints_handlers

Following the model of 'perf sched':

. raw_field_value searches first on the common fields, that are unused
in this tool

. Using perf_session__set_tracepoints_handlers will save all those
strcmp to find the right handler at sample processing time, do it just
once and get the handler from evsel->handler.func.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-v9x3q9rv4caxtox7wtjpchq5@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 1d037ca1 10-Sep-2012 Irina Tirdea <irina.tirdea@gmail.com>

perf tools: Use __maybe_used for unused variables

perf defines both __used and __unused variables to use for marking
unused variables. The variable __used is defined to
__attribute__((__unused__)), which contradicts the kernel definition to
__attribute__((__used__)) for new gcc versions. On Android, __used is
also defined in system headers and this leads to warnings like: warning:
'__used__' attribute ignored

__unused is not defined in the kernel and is not a standard definition.
If __unused is included everywhere instead of __used, this leads to
conflicts with glibc headers, since glibc has variables with this name
in its headers.

The best approach is to use __maybe_unused, the definition used in the
kernel for __attribute__((unused)). In this way there is only one
definition in perf sources (instead of 2 definitions that point to the
same thing: __used and __unused) and it works on both Linux and Android.
This patch simply replaces all instances of __used and __unused with
__maybe_unused.

Signed-off-by: Irina Tirdea <irina.tirdea@intel.com>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1347315303-29906-7-git-send-email-irina.tirdea@intel.com
[ committer note: fixed up conflict with a116e05 in builtin-sched.c ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
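
A minimal sketch of the __maybe_unused annotation described above, assuming
GCC or Clang (both accept __attribute__((unused))). The callback name and
its parameters are hypothetical and exist only to show where the annotation
goes.

```c
#include <assert.h>
#include <stddef.h>

/* The definition the commit message describes; guarded in case a header already provides it. */
#ifndef __maybe_unused
#define __maybe_unused __attribute__((unused))
#endif

/*
 * Hypothetical callback: the signature is fixed by the caller, but this
 * implementation has no use for ctx. The annotation silences the
 * unused-parameter warning without changing behavior.
 */
int handle_sample_sketch(int nr_bytes, void *ctx __maybe_unused)
{
	return nr_bytes > 0 ? nr_bytes : 0;
}
```

The attribute says the parameter may go unused; it does not forbid using
it, which is why a single name can replace both __used and __unused.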


# 2814eb05 08-Sep-2012 Arnaldo Carvalho de Melo <acme@redhat.com>

perf kmem: Remove die() calls

Just use pr_err() + return -1 and perf_session__process_events to abort
when some event would call die(), then let the perf's main() exit doing
whatever it needs.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-i7rhuqfwshjiwc9gr9m1vov4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 22ad798c 07-Aug-2012 Arnaldo Carvalho de Melo <acme@redhat.com>

perf kmem: Use evsel->tp_format and perf_sample

To reduce the number of parameters passed to the various event handling
functions.

Cc: Andrey Wagin <avagin@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-p936ngz06yo5h797ggsm7xru@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# fcf65bf1 07-Aug-2012 Arnaldo Carvalho de Melo <acme@redhat.com>

perf evsel: Cache associated event_format

We already lookup the associated event_format when reading the perf.data
header, so that we can cache the tracepoint name in evsel->name, so do
it a little further and save the event_format itself, so that we can
avoid relookups in tools that need to access it.

Change the tools to take the most obvious advantage, when they were
using pevent_find_event directly. More work is needed for further
removing the need of a pointer to pevent, such as when asking for event
field values ("common_pid" and the other common fields and per
event_format fields).

This is something that was planned but only got actually done when
Andrey Wagin needed to do this lookup at perf_tool->sample() time, when
we don't have access to pevent (session->pevent) to use with
pevent_find_event().

Cc: Andrey Wagin <avagin@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/n/tip-txkvew2ckko0b594ae8fbnyk@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# da378962 27-Jun-2012 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Stop using a global trace events description list

The pevent thing is per perf.data file, so I made it stop being static
and become a perf_session member, so tools processing perf.data files
use perf_session and _there_ we read the trace events description into
session->pevent and then change everywhere to stop using that single
global pevent variable and use the per session one.

Note that it _doesn't_ fall back to trace__event_id, as we're not
interested at all in what is present in the
/sys/kernel/debug/tracing/events in the workstation doing the analysis,
just in what is in the perf.data file.

This patch also introduces perf_session__set_tracepoints_handlers that
is the perf perf.data/session way to associate handlers to tracepoint
events by resolving their IDs using the events descriptions stored in a
perf.data file. Make 'perf sched' use it.

Reported-by: Dmitry Antipov <dmitry.antipov@linaro.org>
Tested-by: Dmitry Antipov <dmitry.antipov@linaro.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linaro-dev@lists.linaro.org
Cc: patches@linaro.org
Link: http://lkml.kernel.org/r/20120625232016.GA28525@infradead.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# aaf045f7 05-Apr-2012 Steven Rostedt <srostedt@redhat.com>

perf: Have perf use the new libtraceevent.a library

The event parsing code in perf was originally copied from trace-cmd
but never was kept up-to-date with the changes that were done there.
The trace-cmd libtraceevent.a code is much more mature than what is
currently in perf.

This updates the code to use wrappers to handle the calls to the
new event parsing code. The new code requires a handle to be passed
around, which removes the global event variables and allows
more than one event structure to be read from different files
(and different machines).

But perf still has the old global events and the code throughout
perf does not yet have a nice way to pass around a handle.
A global 'pevent' has been made for perf and the old calls have
been created as wrappers to the new event parsing code that uses
the global pevent.

With this change, perf can later incorporate the pevent handle into
the perf structures and allow more than one file to be read and
compared, that contains different events.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Arun Sharma <asharma@fb.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>


# 1b22859d 07-Jan-2012 Namhyung Kim <namhyung@gmail.com>

perf kmem: Fix a memory leak

The 'str' should be freed when sort_dimension__add() failed too.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1325957132-10600-5-git-send-email-namhyung@gmail.com
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
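
The leak fixed above follows a common pattern: a duplicated string is freed
on the success path but not when a helper fails midway. The sketch below
shows one way to free it on both paths. It is not the code from
builtin-kmem.c; add_sort_key_sketch() is a hypothetical stand-in for
sort_dimension__add(), and the accepted keys are chosen for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for sort_dimension__add(): 0 on success, -1 on an unknown key. */
int add_sort_key_sketch(const char *tok)
{
	return (strcmp(tok, "hit") == 0 || strcmp(tok, "bytes") == 0) ? 0 : -1;
}

/* Parse a comma-separated key list; the private copy is freed on every path. */
int setup_sorting_sketch(const char *arg)
{
	size_t len;
	char *str, *tok;
	int ret = 0;

	assert(arg != NULL);

	/* Work on a private copy, since strtok() modifies its input. */
	len = strlen(arg) + 1;
	str = malloc(len);
	if (str == NULL)
		return -1;
	memcpy(str, arg, len);

	for (tok = strtok(str, ","); tok != NULL; tok = strtok(NULL, ",")) {
		if (add_sort_key_sketch(tok) < 0) {
			ret = -1;
			break;	/* leave the loop so free(str) below still runs */
		}
	}

	free(str);
	return ret;
}
```

Recording the error in ret and breaking out of the loop keeps a single
free() call, so a later error path cannot reintroduce the leak.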


# 8442da1d 07-Jan-2012 Namhyung Kim <namhyung@gmail.com>

perf kmem: Add missing closedir() calls

The setup_cpunode_map() calls opendir() but misses corresponding
closedir(). Add them.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1325957132-10600-4-git-send-email-namhyung@gmail.com
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
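
To illustrate the opendir()/closedir() pairing this fix restores, here is a
small sketch that walks a directory and closes it before returning. The
function name and the counting logic are assumptions made for the example;
the real setup_cpunode_map() does different work inside the loop.

```c
#include <assert.h>
#include <dirent.h>
#include <stddef.h>

/* Count the non-hidden entries in path; returns -1 if it cannot be opened. */
int count_entries_sketch(const char *path)
{
	DIR *dir;
	struct dirent *de;
	int n = 0;

	assert(path != NULL);

	dir = opendir(path);
	if (dir == NULL)
		return -1;	/* nothing was opened, so nothing to close */

	while ((de = readdir(dir)) != NULL) {
		if (de->d_name[0] == '.')
			continue;	/* skip ".", ".." and hidden entries */
		n++;
	}

	closedir(dir);	/* pairs with the successful opendir() above */
	return n;
}
```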


# efad1415 07-Dec-2011 Robert Richter <robert.richter@amd.com>

perf report: Accept fifos as input file

The default input file for perf report is not handled the same way as
perf record does it for its output file. This leads to unexpected
behavior of perf report, etc. E.g.:

# perf record -a -e cpu-cycles sleep 2 | perf report | cat
failed to open perf.data: No such file or directory (try 'perf record' first)

While perf record writes to a fifo, perf report expects perf.data to be
read. This patch changes this to accept fifos as input file.

Applies to the following commands:

perf annotate
perf buildid-list
perf evlist
perf kmem
perf lock
perf report
perf sched
perf script
perf timechart

Also fixes char const* -> const char* type declaration for filename
strings.

v2:
* Prevent potential null pointer access to input_name in
builtin-report.c. Needed due to removal of patch "perf report: Setup
browser if stdout is a pipe"

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1323248577-11268-5-git-send-email-robert.richter@amd.com
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 45694aa7 28-Nov-2011 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Rename perf_event_ops to perf_tool

To better reflect that it became the base class for all tools, that must
be in each tool struct and where common stuff will be put.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-qgpc4msetqlwr8y2k7537cxe@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 743eb868 28-Nov-2011 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Resolve machine earlier and pass it to perf_event_ops

Reducing the exposure of perf_session further, so that we can use the
classes in cases where no perf.data file is created.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-stua66dcscsezzrcdugvbmvd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# d20deb64 25-Nov-2011 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Pass tool context in the the perf_event_ops functions

So that we don't need to have that many globals.

Next steps will remove the 'session' pointer, that in most cases is
not needed.

Then we can rename perf_event_ops to 'perf_tool' that better describes
this class hierarchy.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-wp4djox7x6w1i2bab1pt4xxp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 9e69c210 15-Mar-2011 Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Pass evsel in event_ops->sample()

Resolve the sample->id to an evsel, since the most advanced tools,
report and annotate, need it, and the others will too when they evolve
to properly support multi-event perf.data files.

Good also because it does an extra validation, checking that the ID is
valid when present. When that is not the case, the overhead is just a
branch + function call (perf_evlist__id2evsel).

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 8115d60c 29-Jan-2011 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Kill event_t typedef, use 'union perf_event' instead

And move the event_t methods to the perf_event__ too.

No code changes, just namespace consistency.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 8d50e5b4 29-Jan-2011 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Rename 'struct sample_data' to 'struct perf_sample'

Making the namespace more uniform.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 9486aa38 22-Jan-2011 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Fix 64 bit integer format strings

Using %L[uxd] has issues in some architectures, like on ppc64. Fix it
by making our 64 bit integers typedefs of stdint.h types and using
PRI[ux]64 like, for instance, git does.

Reported by Denis Kirjanov, who provided a patch for one case; I went
and changed all cases.

Reported-by: Denis Kirjanov <dkirjanov@kernel.org>
Tested-by: Denis Kirjanov <dkirjanov@kernel.org>
LKML-Reference: <20110120093246.GA8031@hera.kernel.org>
Cc: Denis Kirjanov <dkirjanov@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Pingtian Han <phan@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
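
A short sketch of the PRIu64 style the commit above switches to, using only
standard <inttypes.h> macros. The helper name and the choice to format into
a caller-supplied buffer are assumptions for the example.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format a 64-bit counter portably; returns what snprintf() returns. */
int format_total_sketch(char *buf, size_t size, uint64_t total_bytes)
{
	assert(buf != NULL);

	/* PRIu64 expands to the correct conversion for uint64_t on this platform. */
	return snprintf(buf, size, "%" PRIu64, total_bytes);
}
```

Because the macro is a string literal, it is concatenated with the rest of
the format at compile time, so the same source line works whether uint64_t
is unsigned long or unsigned long long.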


# 21ef97f0 09-Dec-2010 Ian Munsie <imunsie@au1.ibm.com>

perf session: Fallback to unordered processing if no sample_id_all

If we are running the new perf on an old kernel without support for
sample_id_all, we should fall back to the old unordered processing of
events. If we didn't, then we would *always* process events without
timestamps out of order, whether or not we hit a reordering race. In
other words, instead of there being a chance of not attributing samples
correctly, we would guarantee that samples would not be attributed.

While processing all events without timestamps before events with
timestamps may seem like an intuitive solution, it falls down as
PERF_RECORD_EXIT events would also be processed before any samples.
Even with a workaround for that case, samples before/after an exec would
not be attributed correctly.

This patch allows commands to indicate whether they need to fall back to
unordered processing, so that commands that do not care about timestamps
on every event will not be affected. If we do fall back, this will print
out a warning if report -D was invoked.

This patch adds the test in perf_session__new so that we only need to
test once per session. Commands that do not use an event_ops (such as
record and top) can simply pass NULL in its place.

Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <1291951882-sup-6069@au1.ibm.com>
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# ce47dc56 12-Nov-2010 Chris Samuel <chris@csamuel.org>

perf tools: Catch a few uncheck calloc/malloc's

There were a few stray calloc()'s and malloc()'s which were not having
their return values checked for success.

As the calling code either already coped with failure or didn't actually
care we just return -ENOMEM at that point.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Chris Samuel <chris@csamuel.org>
LKML-Reference: <4CDDF95A.1050400@csamuel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
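
The pattern described above, checking an allocation and returning -ENOMEM,
might look like the following. The struct and function names are
hypothetical and do not come from the perf sources.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical table of per-slot hit counters. */
struct hit_table_sketch {
	size_t nr;
	unsigned long *hits;
};

/* Allocate nr zeroed counters; returns 0 on success or -ENOMEM on failure. */
int hit_table_sketch__init(struct hit_table_sketch *t, size_t nr)
{
	assert(t != NULL);

	t->hits = calloc(nr, sizeof(*t->hits));
	if (t->hits == NULL)
		return -ENOMEM;	/* report failure instead of dereferencing NULL later */

	t->nr = nr;
	return 0;
}
```

Note that calloc() may legitimately return NULL for a zero-sized request, so
a caller of this sketch should pass nr greater than zero.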


# 640c03ce 02-Dec-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Parse sample earlier

At perf_session__process_event, so that we reduce the number of lines in each
tool sample processing routine that now receives a sample_data pointer already
parsed.

This will also be useful in the next patch, where we'll allow sample the
identity fields in MMAP, FORK, EXIT, etc, when it will be possible to see (cpu,
timestamp) just after before every event.

Also validate callchains in perf_session__process_event, i.e. as early as
possible, and keep a counter of the number of events discarded due to invalid
callchains, warning the user about it if it happens.

There is an assumption that was kept that all events have the same sample_type,
that will be dealt with in the future, when this preexisting limitation will be
removed.

Tested-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ian Munsie <imunsie@au1.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <1291318772-30880-4-git-send-email-acme@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 454c407e 01-May-2010 Tom Zanussi <tzanussi@gmail.com>

perf: add perf-inject builtin

Currently, perf 'live mode' writes build-ids at the end of the
session, which isn't actually useful for processing live mode events.

What would be better would be to have the build-ids sent before any of
the samples that reference them, which can be done by processing the
event stream and retrieving the build-ids on the first hit. Doing
that in perf-record itself, however, is off-limits.

This patch introduces perf-inject, which does the same job while
leaving perf-record untouched. Normal mode perf still records the
build-ids at the end of the session as it should, but for live mode,
perf-inject can be injected in between the record and report steps
e.g.:

perf record -o - ./hackbench 10 | perf inject -v -b | perf report -v -i -

perf-inject reads a perf-record event stream and repipes it to stdout.
At any point the processing code can inject other events into the
event stream - in this case build-ids (-b option) are read and
injected as needed into the event stream.
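
The "inject on first hit" idea can be sketched as below. This is only an
illustration: struct dso, struct out_event and repipe_sample() are
hypothetical names, and the real tool works on perf event records rather
than on these toy structs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of a DSO seen in the event stream. */
struct dso {
	const char *name;
	bool build_id_sent; /* set once its build-id has been injected */
};

enum out_kind { OUT_BUILD_ID, OUT_SAMPLE };

struct out_event {
	enum out_kind kind;
	const struct dso *dso;
};

/*
 * Repipe one sample that hit 'dso'. On the first hit a build-id event is
 * emitted ahead of the sample. Returns the number of events written to
 * 'out', which is either 1 or 2.
 */
static size_t repipe_sample(struct dso *dso, struct out_event out[2])
{
	size_t n = 0;

	if (!dso->build_id_sent) {
		out[n++] = (struct out_event){ OUT_BUILD_ID, dso };
		dso->build_id_sent = true;
	}
	out[n++] = (struct out_event){ OUT_SAMPLE, dso };
	return n;
}
```

With this ordering a consumer reading the pipe always sees the build-id
before the first sample that references it.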

Build-ids are just the first user of perf-inject - potentially
anything that needs userspace processing to augment the trace stream
with additional information could make use of this facility.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1272696080-16435-3-git-send-email-tzanussi@gmail.com>
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 5c0541d5 29-Apr-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf symbols: Add machine helper routines

Created when writing the first 'perf test' regression testing routine.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# d28c6223 27-Apr-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf machine: Adopt some map_groups functions

Those functions operated on members now grouped in 'struct machine', so
move those methods to this new class.

The changes made to 'perf probe' show that, by using this abstraction,
inserting probes on guests almost got supported for free.

Cc: Avi Kivity <avi@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zhang, Yanmin <yanmin_zhang@linux.intel.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 23346f21 27-Apr-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Rename "kernel_info" to "machine"

struct kernel_info and kerninfo__ are too vague, what they really
describe are machines, virtual ones or hosts.

There are more changes to introduce helpers to shorten function calls
and to make more clear what is really being done, but I left that for
subsequent patches.

Cc: Avi Kivity <avi@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Zhang, Yanmin <yanmin_zhang@linux.intel.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 587570d4 23-Apr-2010 Frederic Weisbecker <fweisbec@gmail.com>

perf: Use generic sample reordering in perf kmem

Use the new generic sample event reordering in perf kmem. This
drops the need for multiplexing the buffers at record time,
improving the scalability of perf kmem.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Li Zefan <lizf@cn.fujitsu.com>


# a1645ce1 18-Apr-2010 Zhang, Yanmin <yanmin_zhang@linux.intel.com>

perf: 'perf kvm' tool for monitoring guest performance from host

Here is the patch of userspace perf tool.

Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>


# 8c40041f 06-Apr-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf kmem: Fix breakage introduced by 5a0e3ad slab.h script

Commit 5a0e3ad ("include cleanup: Update gfp.h and slab.h
includes to prepare for breaking implicit slab.h inclusion
from percpu.h") added a '#include <linux/slab.h>' to
tools/perf/builtin-kmem.c because that tool has lines like
this:

	if (!strcmp(event->name, "kmalloc") ||
	    !strcmp(event->name, "kmem_cache_alloc")) {
		process_alloc_event(data, event, cpu, timestamp, thread, 0);
		return;
	}

So, using the script regex:

>>> import re
>>> s = re.compile(r'^(|.*[^a-zA-Z0-9_])_*(slab_is_available|kmem_cache_|k[mzc]alloc|krealloc|kz?free|ksize|__getname|putname)')
>>> l = ' !strcmp(event->name, "kmem_cache_alloc")) {'
>>> s.search(l)
<_sre.SRE_Match object at 0xb77b1ad0>
>>>

Remove that include, since slab.h is not available in the tools/perf include
path and thus builtin-kmem.c couldn't be compiled.

Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <1270561053-14308-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 71cf8b8f 01-Apr-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf kmem: Fixup the symbol address before using it

We get absolute addresses in the events, but relative ones from the
symbol subsystem, so calculate the absolute address by asking for the
map where the symbol was found, that has the place where the DSO was
actually loaded.

For the core kernel this poses no problems if the kernel is not
relocated by things like kexec, or if we use /proc/kallsyms, but for
modules we were getting really large, negative offsets.
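
A minimal sketch of the fixup, assuming the map records where the DSO was
loaded (start) and the file offset it maps (pgoff). The structs are
simplified, hypothetical stand-ins, and the formula is an assumption about
how a map-relative symbol address is turned back into an absolute one:

```c
#include <stdint.h>

/* Hypothetical, simplified versions of the map and symbol structures. */
struct map {
	uint64_t start; /* address where the DSO was actually loaded */
	uint64_t pgoff; /* offset into the DSO that 'start' corresponds to */
};

struct symbol {
	uint64_t start; /* symbol address relative to the DSO */
};

/* Translate a map-relative symbol address into an absolute one. */
static uint64_t symbol_abs_addr(const struct map *map, const struct symbol *sym)
{
	return sym->start + map->start - map->pgoff;
}

/* Offset of an absolute sample address from the start of its symbol. */
static uint64_t offset_in_symbol(uint64_t sample_addr, const struct map *map,
				 const struct symbol *sym)
{
	return sample_addr - symbol_abs_addr(map, sym);
}
```

Subtracting the relative sym->start directly from the absolute sample
address, without going through the map, is what yields the huge bogus
offsets for modules.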

LKML-Reference: <new-submission>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# e727ca73 01-Apr-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf kmem: Resolve kernel symbols again

Due to the assumption in perf_session__new that the kernel maps would be
created using the fake PERF_RECORD_MMAP event in a perf.data file, 'perf
kmem --stat caller', whose input has no such event, ends up not being
able to resolve the kernel addresses.

Fix it by calling perf_session__create_kernel_maps() in __cmd_kmem().

LKML-Reference: <new-submission>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 7e5e1b14 25-Mar-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf symbols: map_groups__find_symbol must return the map too

Tools need to know from which map in the map_group a symbol was resolved,
so that, for instance, we can annotate kernel module symbols by
getting its precise name, etc.

Also add the _by_name variants for completeness.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 5a0e3ad6 24-Mar-2010 Tejun Heo <tj@kernel.org>

include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities to include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the following.

* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
blocks and tries to put the new include such that its order conforms
to its surroundings. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.

2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
widely available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.

6. percpu.h was updated not to include slab.h.

7. Build tests were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).

* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>


# 9de89fe7 03-Feb-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf symbols: Remove perf_session usage in symbols layer

I noticed while writing the first test in 'perf regtest' that to
just test the symbol handling routines one needs to create a
perf session, that is a layer centered on a perf.data file,
events, etc, so I untied these layers.

This reduces the complexity for the users as the number of
parameters to most of the symbols and session APIs now was
reduced while not adding more state to all the map instances by
only having data that is needed to split the kernel (kallsyms
and ELF symtab sections) maps and do vmlinux relocation on the
main kernel map.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1265223128-11786-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 64abebf7 27-Jan-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Create kernel maps in the constructor

Removing one extra step needed in the tools that need this,
fixing a bug in 'perf probe' where this was not being done.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1264633557-17597-4-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# b00eca8c 19-Jan-2010 Pekka Enberg <penberg@cs.helsinki.fi>

perf kmem: Print usage help for unknown commands

This patch fixes "perf kmem" to print usage help instead of
doing nothing.

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
LKML-Reference: <1263921971-10782-1-git-send-email-penberg@cs.helsinki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 47103277 19-Jan-2010 Pekka Enberg <penberg@cs.helsinki.fi>

perf kmem: Increase "Hit" column length

It's fairly easy to overflow the "Hit" column with just a few
seconds of tracing so increase the column length to avoid broken
formatting.

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
LKML-Reference: <1263921803-10214-1-git-send-email-penberg@cs.helsinki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 0d755034 13-Jan-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Don't cast RIP to pointers

Since they can come from another architecture with bigger
pointers, i.e. processing a 64-bit perf.data on a 32-bit arch.
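
To illustrate the point: a sampled address kept as a 64-bit integer and
printed with the <inttypes.h> macros survives on any host, while a cast to
a pointer would truncate it where pointers are 32 bits wide. format_ip() is
a hypothetical helper written for this example, not a function from perf.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a sampled instruction pointer without going through a pointer
 * type, so the result is the same on 32-bit and 64-bit hosts.
 * Returns what snprintf() returns.
 */
static int format_ip(char *buf, size_t len, uint64_t ip)
{
	return snprintf(buf, len, "%#" PRIx64, ip);
}
```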

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1263478990-8200-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# b7cece76 13-Jan-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Encode kernel module mappings in perf.data

We were always looking at the running machine /proc/modules,
even when processing a perf.data file, which only makes sense
when we're doing 'perf record' and 'perf report' on the same
machine, and in close succession, or if we don't use modules at
all, right Peter? ;-)

Now, at 'perf record' time we read /proc/modules, find the long
path for modules, and put them as PERF_MMAP events, just like we
did to encode the reloc reference symbol for vmlinux. Talking
about that now it is encoded in .pgoff, so that we can use
.{start,len} to store the address boundaries for the kernel so
that when we reconstruct the kmaps tree we can do lookups right
away, without having to fixup the end of the kernel maps like we
did in the past (and now only in perf record).

One more step in the 'perf archive' direction when we'll finally
be able to collect data in one machine and analyse in another.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1263396139-4798-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 4efb5290 21-Dec-2009 Wenji Huang <wenji.huang@oracle.com>

perf kmem: Fix statistics typo

Replace bytes_req with bytes_alloc.

Signed-off-by: Wenji Huang <wenji.huang@oracle.com>
Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: acme@redhat.com
LKML-Reference: <1261389175-2227-1-git-send-email-wenji.huang@oracle.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 659d8cfb 19-Dec-2009 Ulrich Drepper <drepper@redhat.com>

perf tools: Do a few more directory handling optimizations

A few more optimizations for perf when dealing with directories.

Some of them significantly cut down the work which has to be
done. d_type should always be set; otherwise fix the kernel
code. And there are functions available to parse fstab-like
files, so use them.

Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: a.p.zijlstra@chello.nl
Cc: acme@redhat.com
Cc: eranian@google.com
Cc: fweisbec@gmail.com
Cc: lizf@cn.fujitsu.com
Cc: paulus@samba.org
Cc: xiaoguangrong@cn.fujitsu.com
LKML-Reference: <200912192140.nBJLeSfA028905@hs20-bc2-1.build.redhat.com>
[ v2: two small stylistic fixlets ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 55aa640f 27-Dec-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Remove redundant prefix & suffix from perf_event_ops

Since now all that we have are perf event handlers, leave just
the name of the event.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1261957026-15580-9-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# d549c769 27-Dec-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Remove sample_type_check from event_ops

This is really something tools need to do before asking for the
events to be processed, leaving perf_session__process_events to
do just that, process events.

Also add a msg parameter to perf_session__has_traces() so that
the right message can be printed, fixing a regression added by
me in the previous cset (right timechart message) and also
fixing 'perf kmem', that was not asking if 'perf kmem record'
was run.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1261957026-15580-6-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 27295592 27-Dec-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Share the common trace sample_check routine as perf_session__has_traces

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1261957026-15580-5-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 655000e7 15-Dec-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf symbols: Adopt the strlists for dso, comm

Will be used in perf diff too.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260914682-29652-2-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 75be6cf4 15-Dec-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf symbols: Make symbol_conf global

This simplifies a lot of functions, less stuff to be done by
tool writers.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260914682-29652-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# c019879b 14-Dec-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Adopt the sample_type variable

All tools had copies, and perf diff would have to specify a
sample_type_check method just for copying it.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260807780-19377-2-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 4e4f06e4 14-Dec-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Move the hist_entries rb tree to perf_session

As we'll need to sort multiple times for multiple perf sessions,
so that we can then do a diff.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260803439-16783-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 4aa65636 13-Dec-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Move kmaps to perf_session

There is still some more work to do to disentangle map creation
from DSO loading, but this happens only for the kernel, and for
the early adopters of perf diff, where this disentanglement
matters most, we'll be testing different kernels, so no problem
here.

Further clarification: right now we create the kernel maps for
the various modules and discontiguous kernel text maps when
loading the DSO, we should do it as a two step process, first
creating the maps, for multiple mappings with the same DSO
store, then doing the dso load just once, for the first hit on
one of the maps sharing this DSO backing store.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260741029-4430-6-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# b3165f41 13-Dec-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Move the global threads list to perf_session

So that we can process two perf.data files.

We still need to add an O_MMAP mode for perf_session so that we
can do all the mmap stuff in it.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260741029-4430-5-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# ec913369 13-Dec-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Reduce the number of parms to perf_session__process_events

By having the cwd/cwdlen in the perf_session struct and
full_paths in perf_event_ops.

Now it's just a matter of passing the ops.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260741029-4430-4-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 13df45ca 13-Dec-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Register the idle thread in perf_session__process_events

No need for all tools to register it and then immediately call
perf_session__process_events.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260741029-4430-3-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 301a0b02 13-Dec-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Ditch register_perf_file_handler

Pass the event_ops to perf_session__process_events instead.

Also move the event_ops definition to session.h, starting to
move things around to their right place, trimming the many
unneeded headers we have.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260741029-4430-2-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# d8f66248 13-Dec-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Pass the perf_session to the event handling operations

They will need it to get the right threads list, etc.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260741029-4430-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 94c744b6 11-Dec-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Introduce perf_session class

That does all the initialization boilerplate, opening the file,
reading the header, checking if it is valid, etc.

And that will as well have the threads list, kmap (now) global
variable, etc, so that we can handle two (or more) perf.data files
describing sessions to compare.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260573842-19720-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 9958e1f0 11-Dec-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf symbols: Rename kthreads to kmaps, using another abstraction for it

Using a struct thread instance just to hold the kernel space maps
(vmlinux + modules) is overkill and confuses people trying to
understand the perf symbols abstractions.

The kernel maps are really present in all threads, i.e. the kernel
is a library, not a separate thread.

So introduce the 'map_groups' abstraction and use it for the kernel
maps, now in the kmaps global variable.

It, in turn, will move, together with the threads list to the
perf_file abstraction, so that we can support multiple perf_file
instances, needed by perf diff.

Brainstormed-with: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260550239-5372-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 79312416 10-Dec-2009 Ingo Molnar <mingo@elte.hu>

perf kmem: Fix unused argument build warning

Fix:

builtin-kmem.c: In function 'parse_caller_opt':
builtin-kmem.c:690: error: unused parameter 'arg'
builtin-kmem.c: In function 'parse_alloc_opt':
builtin-kmem.c:697: error: unused parameter 'arg'

Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
LKML-Reference: <4B20A195.8030106@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 90b86a9f 10-Dec-2009 Li Zefan <lizf@cn.fujitsu.com>

perf kmem: Show usage if no option is specified

As Ingo suggested, make "perf kmem" show help information.
"perf kmem stat [--caller] [--alloc] .." will show memory
statistics.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
LKML-Reference: <4B20A195.8030106@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# f48f669d 06-Dec-2009 Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>

perf_event: Eliminate raw->size

raw->size is not used, this patch just cleans it up.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
LKML-Reference: <4B1C8CC4.4050007@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# d8bd9e0a 06-Dec-2009 Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>

perf_event: Fix raw event processing

We use the 'data.raw_data' parameter to call process_raw_event(),
but the data.raw_data buffer does not include the data size. This can
make the perf tool crash.

This bug was introduced by commit 180f95e29a ("perf: Make common
SAMPLE_EVENT parser").

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
LKML-Reference: <4B1C7F45.5080105@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 180f95e2 06-Dec-2009 OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>

perf: Make common SAMPLE_EVENT parser

Currently, sample event data is parsed separately by each command, and
each parser assumes that the data does not include other fields. (E.g.
timechart, trace, etc. can't parse the event if it has
PERF_SAMPLE_CALLCHAIN)

So, even if we record the superset data for multiple commands at
a time, the commands can't parse it.

To fix it, this makes a common sample event parser, and uses it to
parse sample events correctly. (PERF_SAMPLE_READ is unsupported
for now though, as it seems to be unused.)
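
The idea of a single parser driven by sample_type can be sketched as
follows. The SAMPLE_* constants are local copies whose values mirror the
first PERF_SAMPLE_* bits in linux/perf_event.h; struct sample and
parse_sample() are hypothetical and cover only four fields, so this is an
illustration of the approach rather than the parser added by the patch.

```c
#include <stdint.h>
#include <string.h>

/* Local copies of a few sample_type bits (values as in PERF_SAMPLE_*). */
enum {
	SAMPLE_IP   = 1U << 0,
	SAMPLE_TID  = 1U << 1,
	SAMPLE_TIME = 1U << 2,
	SAMPLE_ADDR = 1U << 3,
};

/* Hypothetical container for the fields this sketch knows about. */
struct sample {
	uint64_t ip, time, addr;
	uint32_t pid, tid;
};

/*
 * Walk the u64 array that follows the event header, consuming one field
 * per bit set in 'type', in bit order. Returns 0 on success, or -1 if
 * 'type' has a bit this sketch does not handle.
 */
static int parse_sample(const uint64_t *array, uint64_t type,
			struct sample *out)
{
	if (type & ~(uint64_t)(SAMPLE_IP | SAMPLE_TID | SAMPLE_TIME | SAMPLE_ADDR))
		return -1;

	memset(out, 0, sizeof(*out));
	if (type & SAMPLE_IP)
		out->ip = *array++;
	if (type & SAMPLE_TID) {
		uint32_t ids[2];

		/* pid, then tid, packed together in one u64 slot */
		memcpy(ids, array++, sizeof(ids));
		out->pid = ids[0];
		out->tid = ids[1];
	}
	if (type & SAMPLE_TIME)
		out->time = *array++;
	if (type & SAMPLE_ADDR)
		out->addr = *array++;
	return 0;
}
```

Because every command calls the same routine, a field that one command
does not care about is still skipped correctly instead of being
misread as the next field.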

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <87hbs48imv.fsf@devron.myhome.or.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 1ed091c4 27-Nov-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Consolidate symbol resolving across all tools

Now we have a very high level routine for simple tools to
process IP sample events:

	int event__preprocess_sample(const event_t *self,
				     struct addr_location *al,
				     symbol_filter_t filter)

It receives the event itself and will insert new threads in the
global threads list and resolve the map and symbol, filling all
this info into the new addr_location struct, so that tools like
annotate and report can further process the event by creating
hist_entries in their specific way (with or without callgraphs,
etc).

It in turn uses the new next layer function:

	void thread__find_addr_location(struct thread *self, u8 cpumode,
					enum map_type type, u64 addr,
					struct addr_location *al,
					symbol_filter_t filter)

This one, given a thread (userspace or the kernel kthread
one), will find the given type (MAP__FUNCTION now, MAP__VARIABLE
too in the near future) at the given cpumode, taking vdsos into
account (userspace hit, but kernel symbol) and will fill all
these details in the addr_location given.

Tools that need a more compact API for plain function
resolution, like 'kmem', can use this other one:

struct symbol *thread__find_function(struct thread *self, u64 addr,
symbol_filter_t filter)

So, to resolve a kernel symbol, that is all the 'kmem' tool
needs, it's just a matter of calling:

sym = thread__find_function(kthread, addr, NULL);

The 'filter' parameter is needed because we do lazy
parsing/loading of ELF symtabs or /proc/kallsyms.

With this we remove more code duplication all around, which is
always good, huh? :-)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: John Kacur <jkacur@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1259346563-12568-12-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 62daacb5 27-Nov-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Reorganize event processing routines, lotsa dups killed

While implementing event__preprocess_sample, that will do all of
the symbol lookup in one convenient function, I noticed that
util/process_event.[ch] were not being used at all, then started
looking if there were other functions that could be shared
and...

All those functions really don't need to receive offset + head:
the only thing they did with them was common to all of them, so
do it in one place instead.

Stats about the number of each type of event processed are now
kept in a central place.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: John Kacur <jkacur@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1259346563-12568-11-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# fcf1203a 24-Nov-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf symbols: Rename find_symbol routines to find_function

Paving the way for supporting variables in addition to function
symbols.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1259074912-5924-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# b32d133a 23-Nov-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf symbols: Simplify symbol machinery setup

And also express its configuration toggles via a struct.

Now all one has to do is to call symbol__init(NULL) if the
defaults are OK, or pass a struct symbol_conf pointer with the
desired configuration.

If a tool uses kernel_maps__find_symbol() to look at the kernel
and modules mappings for a symbol but didn't call symbol__init()
first, that will generate a one-time warning, alerting the
subcommand developer that symbol__init() must be called.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1259071517-3242-2-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 079d3f65 23-Nov-2009 Li Zefan <lizf@cn.fujitsu.com>

perf kmem: Measure kmalloc/kfree CPU ping-pong call-sites

Show statistics for allocations and frees on different cpus:

------------------------------------------------------------------------------------------------------
Callsite | Total_alloc/Per | Total_req/Per | Hit | Ping-pong | Frag
------------------------------------------------------------------------------------------------------
perf_event_alloc.clone.0+0 | 7504/682 | 7128/648 | 11 | 0 | 5.011%
alloc_buffer_head+16 | 288/57 | 280/56 | 5 | 0 | 2.778%
radix_tree_preload+51 | 296/296 | 288/288 | 1 | 0 | 2.703%
tracepoint_add_probe+32e | 157/31 | 154/30 | 5 | 0 | 1.911%
do_maps_open+0 | 796/12 | 792/12 | 66 | 0 | 0.503%
sock_alloc_send_pskb+16e | 23780/495 | 23744/494 | 48 | 38 | 0.151%
anon_vma_prepare+9a | 3744/44 | 3740/44 | 85 | 0 | 0.107%
d_alloc+21 | 64948/164 | 64944/164 | 396 | 0 | 0.006%
proc_alloc_inode+23 | 262292/676 | 262288/676 | 388 | 0 | 0.002%
create_object+28 | 459600/200 | 459600/200 | 2298 | 71 | 0.000%
journal_start+67 | 14440/40 | 14440/40 | 361 | 0 | 0.000%
get_empty_filp+df | 53504/256 | 53504/256 | 209 | 0 | 0.000%
getname+2a | 823296/4096 | 823296/4096 | 201 | 0 | 0.000%
seq_read+2b0 | 544768/4096 | 544768/4096 | 133 | 0 | 0.000%
seq_open+6d | 17024/128 | 17024/128 | 133 | 0 | 0.000%
mmap_region+2e6 | 11704/88 | 11704/88 | 133 | 0 | 0.000%
single_open+0 | 1072/16 | 1072/16 | 67 | 0 | 0.000%
__alloc_skb+2e | 12544/256 | 12544/256 | 49 | 38 | 0.000%
__sigqueue_alloc+4a | 1296/144 | 1296/144 | 9 | 8 | 0.000%
tracepoint_add_probe+6f | 80/16 | 80/16 | 5 | 0 | 0.000%
------------------------------------------------------------------------------------------------------
...
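
A rough sketch of how the Ping-pong column above could be counted, assuming
an allocation is a "ping-pong" when it is freed on a different CPU than the
one that allocated it. The names live_alloc, record_alloc, record_free and
MAX_LIVE are made up for this illustration, and the flat array stands in for
whatever lookup structure builtin-kmem.c really uses:

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_LIVE 1024	/* arbitrary capacity, assumption for this sketch */

struct live_alloc {
	uint64_t ptr;	/* allocated address, 0 marks an empty slot */
	int cpu;	/* CPU that performed the allocation */
};

static struct live_alloc live[MAX_LIVE];
static unsigned long nr_pingpong;

/* Remember that 'ptr' was allocated on 'cpu'. Returns -1 if no slot is free. */
static int record_alloc(uint64_t ptr, int cpu)
{
	if (ptr == 0)
		return -1;
	for (size_t i = 0; i < MAX_LIVE; i++) {
		if (live[i].ptr == 0) {
			live[i].ptr = ptr;
			live[i].cpu = cpu;
			return 0;
		}
	}
	return -1;
}

/* Account a free of 'ptr' on 'cpu'; count a ping-pong if the CPUs differ. */
static void record_free(uint64_t ptr, int cpu)
{
	if (ptr == 0)
		return;
	for (size_t i = 0; i < MAX_LIVE; i++) {
		if (live[i].ptr == ptr) {
			if (live[i].cpu != cpu)
				nr_pingpong++;
			live[i].ptr = 0;	/* slot can be reused */
			return;
		}
	}
	/* free without a recorded alloc: ignored in this sketch */
}
```

A per-call-site counter would be bumped at the same point as nr_pingpong to
produce one value per row.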

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: linux-mm@kvack.org <linux-mm@kvack.org>
LKML-Reference: <4B0B6E9F.6020309@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 7d0d3945 23-Nov-2009 Li Zefan <lizf@cn.fujitsu.com>

perf kmem: Collect cross node allocation statistics

Show cross node memory allocations:

# ./perf kmem

SUMMARY
=======
...
Cross node allocations: 0/3633

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: linux-mm@kvack.org <linux-mm@kvack.org>
LKML-Reference: <4B0B6E87.10906@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 29b3e152 23-Nov-2009 Li Zefan <lizf@cn.fujitsu.com>

perf kmem: Default to sort by fragmentation

Make the output sort by fragmentation by default.

Also make the usage of "--sort" option consistent with other
perf tools. That is, we support multi keys: "--sort
key1[,key2]...".

# ./perf kmem --stat caller
------------------------------------------------------------------------------
Callsite |Total_alloc/Per | Total_req/Per | Hit | Frag
------------------------------------------------------------------------------
__netdev_alloc_skb+23 | 5048/1682 | 4564/1521 | 3| 9.588%
perf_event_alloc.clone.0+0 | 7504/682 | 7128/648 | 11| 5.011%
tracepoint_add_probe+32e | 157/31 | 154/30 | 5| 1.911%
alloc_buffer_head+16 | 456/57 | 448/56 | 8| 1.754%
radix_tree_preload+51 | 584/292 | 576/288 | 2| 1.370%
...
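
A minimal sketch of the "--sort key1[,key2]..." idea, assuming each key maps
to a comparator and later keys only break ties left by earlier ones. The
struct stat_row layout and the helpers setup_sort_keys() and row_cmp() are
hypothetical and are not the code in builtin-kmem.c; only the key names
"hit" and "frag" come from the tool:

```c
#include <stdlib.h>
#include <string.h>

struct stat_row {
	unsigned long hit;
	double frag;
};

typedef int (*cmp_fn)(const struct stat_row *, const struct stat_row *);

static int cmp_hit(const struct stat_row *a, const struct stat_row *b)
{
	return (a->hit > b->hit) - (a->hit < b->hit);
}

static int cmp_frag(const struct stat_row *a, const struct stat_row *b)
{
	return (a->frag > b->frag) - (a->frag < b->frag);
}

static const struct {
	const char *name;
	cmp_fn fn;
} known_keys[] = {
	{ "hit", cmp_hit },
	{ "frag", cmp_frag },
};

#define MAX_KEYS 8
static cmp_fn active[MAX_KEYS];
static size_t nr_active;

/* Parse "key1[,key2]..." into the active comparator list. -1 on a bad key. */
static int setup_sort_keys(const char *arg)
{
	nr_active = 0;
	while (*arg) {
		size_t len = strcspn(arg, ",");
		cmp_fn fn = NULL;

		for (size_t i = 0; i < sizeof(known_keys) / sizeof(known_keys[0]); i++) {
			if (strlen(known_keys[i].name) == len &&
			    !strncmp(arg, known_keys[i].name, len))
				fn = known_keys[i].fn;
		}
		if (!fn || nr_active == MAX_KEYS)
			return -1;
		active[nr_active++] = fn;
		arg += len;
		if (*arg == ',')
			arg++;
	}
	return nr_active ? 0 : -1;
}

/* qsort() callback: descending order, first key that differs decides. */
static int row_cmp(const void *pa, const void *pb)
{
	const struct stat_row *a = pa, *b = pb;

	for (size_t i = 0; i < nr_active; i++) {
		int r = active[i](a, b);

		if (r)
			return -r;
	}
	return 0;
}
```

After setup_sort_keys("frag,hit"), a call such as
qsort(rows, n, sizeof(rows[0]), row_cmp) puts the most fragmented rows first.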

TODO:
- Extract duplicate code in builtin-kmem.c and builtin-sched.c
into util/sort.c.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: linux-mm@kvack.org <linux-mm@kvack.org>
LKML-Reference: <4B0B6E72.7010200@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 7707b6b6 23-Nov-2009 Li Zefan <lizf@cn.fujitsu.com>

perf kmem: Add new option to show raw ip

Add option "--raw-ip" to show raw ip instead of symbols:

# ./perf kmem --stat caller --raw-ip
------------------------------------------------------------------------------
Callsite |Total_alloc/Per | Total_req/Per | Hit | Frag
------------------------------------------------------------------------------
0xc05301aa | 733184/4096 | 733184/4096 | 179| 0.000%
0xc0542ba0 | 483328/4096 | 483328/4096 | 118| 0.000%
...

Also show symbols with format sym+offset instead of sym/offset.
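
As an illustration of the two callsite formats, here is a small sketch.
format_callsite() and its parameters are hypothetical names chosen for this
example, not the function used by the tool; it prints the raw address when
raw output is requested or no symbol is known, and sym+offset (offset in hex)
otherwise:

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a call site into 'buf'.
 *   sym       - resolved symbol name, or NULL if resolution failed
 *   sym_start - start address of that symbol
 *   addr      - sampled call-site address
 *   raw_ip    - non-zero to force the raw address format
 * Returns what snprintf() returns.
 */
static int format_callsite(char *buf, size_t size, const char *sym,
			   uint64_t sym_start, uint64_t addr, int raw_ip)
{
	if (raw_ip || sym == NULL)
		return snprintf(buf, size, "%#" PRIx64, addr);

	return snprintf(buf, size, "%s+%" PRIx64, sym, addr - sym_start);
}
```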

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: linux-mm@kvack.org <linux-mm@kvack.org>
LKML-Reference: <4B0B6E5C.4080900@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 1b145ae5 23-Nov-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf kmem: Resolve symbols

E.g.:

[root@doppio linux-2.6-tip]# perf kmem record sleep 3s
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.804 MB perf.data (~35105 samples) ]

[root@doppio linux-2.6-tip]# perf kmem --stat caller | head -10
------------------------------------------------------------------------------
Callsite |Total_alloc/Per | Total_req/Per | Hit | Frag
------------------------------------------------------------------------------
getname/40 | 1519616/4096 | 1519616/4096 | 371| 0.000%
seq_read/a2 | 987136/4096 | 987136/4096 | 241| 0.000%
__netdev_alloc_skb/43 | 260368/1049 | 259968/1048 | 248| 0.154%
__alloc_skb/5a | 77312/256 | 77312/256 | 302| 0.000%
proc_alloc_inode/33 | 76480/632 | 76472/632 | 121| 0.010%
get_empty_filp/8d | 70272/192 | 70272/192 | 366| 0.000%
split_vma/8e | 42064/176 | 42064/176 | 239| 0.000%
[root@doppio linux-2.6-tip]#

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: linux-mm@kvack.org <linux-mm@kvack.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1259005869-13487-2-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# cc612d81 23-Nov-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf symbols: Look for vmlinux in more places

Now that we can check the buildid to see if it really matches,
this can be done safely:

vmlinux
/boot/vmlinux
/boot/vmlinux-<uts.release>
/lib/modules/<uts.release>/build/vmlinux
/usr/lib/debug/lib/modules/<uts.release>/vmlinux

More can be added - if you know about distros that put the
vmlinux somewhere else please let us know.
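
A sketch of how the candidate list above could be built at run time,
assuming <uts.release> is the release string reported by uname(2).
vmlinux_candidates(), NR_VMLINUX_PATHS and VMLINUX_PATH_MAX are names
invented for this example. It only builds the paths; the build-id comparison
that makes trying them safe is not shown:

```c
#include <stdio.h>
#include <sys/utsname.h>

#define NR_VMLINUX_PATHS 5
#define VMLINUX_PATH_MAX 256

/* Fill 'paths' with the vmlinux locations to try, in order. -1 on error. */
static int vmlinux_candidates(char paths[NR_VMLINUX_PATHS][VMLINUX_PATH_MAX])
{
	struct utsname uts;

	if (uname(&uts) < 0)
		return -1;

	snprintf(paths[0], VMLINUX_PATH_MAX, "vmlinux");
	snprintf(paths[1], VMLINUX_PATH_MAX, "/boot/vmlinux");
	snprintf(paths[2], VMLINUX_PATH_MAX, "/boot/vmlinux-%s", uts.release);
	snprintf(paths[3], VMLINUX_PATH_MAX,
		 "/lib/modules/%s/build/vmlinux", uts.release);
	snprintf(paths[4], VMLINUX_PATH_MAX,
		 "/usr/lib/debug/lib/modules/%s/vmlinux", uts.release);

	return NR_VMLINUX_PATHS;
}
```

A caller would walk the array and stop at the first file whose build-id
matches the one recorded for the running kernel.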

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1259001550-8194-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# f3ced7cd 22-Nov-2009 Pekka Enberg <penberg@cs.helsinki.fi>

perf kmem: Add --sort hit and --sort frag

This patch adds support for "--sort hit" and "--sort frag" to
the "perf kmem" tool. The former was already mentioned in the
help text and the latter is useful for finding call-sites that
exhibit worst case behavior for SLAB allocators.

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Cc: linux-mm@kvack.org <linux-mm@kvack.org>
LKML-Reference: <1258883880-7149-1-git-send-email-penberg@cs.helsinki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# ba77c9e1 20-Nov-2009 Li Zefan <lizf@cn.fujitsu.com>

perf: Add 'perf kmem' tool

This tool is mostly a perf version of kmemtrace-user.

The following information is provided by this tool:

- the total amount of memory allocated and fragmentation per
call-site

- the total amount of memory allocated and fragmentation per
allocation

- total memory allocated and fragmentation in the collected
dataset

- ...

Sample output:

# ./perf kmem record
^C
# ./perf kmem --stat caller --stat alloc -l 10

------------------------------------------------------------------------------
Callsite | Total_alloc/Per | Total_req/Per | Hit | Fragmentation
------------------------------------------------------------------------------
0xc052f37a | 790528/4096 | 790528/4096 | 193 | 0.000%
0xc0541d70 | 524288/4096 | 524288/4096 | 128 | 0.000%
0xc051cc68 | 481600/200 | 481600/200 | 2408 | 0.000%
0xc0572623 | 297444/676 | 297440/676 | 440 | 0.001%
0xc05399f1 | 73476/164 | 73472/164 | 448 | 0.005%
0xc05243bf | 51456/256 | 51456/256 | 201 | 0.000%
0xc0730d0e | 31844/497 | 31808/497 | 64 | 0.113%
0xc0734c4e | 17152/256 | 17152/256 | 67 | 0.000%
0xc0541a6d | 16384/128 | 16384/128 | 128 | 0.000%
0xc059c217 | 13120/40 | 13120/40 | 328 | 0.000%
0xc0501ee6 | 11264/88 | 11264/88 | 128 | 0.000%
0xc04daef0 | 7504/682 | 7128/648 | 11 | 5.011%
0xc04e14a3 | 4216/191 | 4216/191 | 22 | 0.000%
0xc05041ca | 3524/44 | 3520/44 | 80 | 0.114%
0xc0734fa3 | 2104/701 | 1620/540 | 3 | 23.004%
0xc05ec9f1 | 2024/289 | 2016/288 | 7 | 0.395%
0xc06a1999 | 1792/256 | 1792/256 | 7 | 0.000%
0xc0463b9a | 1584/144 | 1584/144 | 11 | 0.000%
0xc0541eb0 | 1024/16 | 1024/16 | 64 | 0.000%
0xc06a19ac | 896/128 | 896/128 | 7 | 0.000%
0xc05721c0 | 772/12 | 768/12 | 64 | 0.518%
0xc054d1e6 | 288/57 | 280/56 | 5 | 2.778%
0xc04b562e | 157/31 | 154/30 | 5 | 1.911%
0xc04b536f | 80/16 | 80/16 | 5 | 0.000%
0xc05855a0 | 64/64 | 36/36 | 1 | 43.750%
------------------------------------------------------------------------------

------------------------------------------------------------------------------
Alloc Ptr | Total_alloc/Per | Total_req/Per | Hit | Fragmentation
------------------------------------------------------------------------------
0xda884000 | 1052672/4096 | 1052672/4096 | 257 | 0.000%
0xda886000 | 262144/4096 | 262144/4096 | 64 | 0.000%
0xf60c7c00 | 16512/128 | 16512/128 | 129 | 0.000%
0xf59a4118 | 13120/40 | 13120/40 | 328 | 0.000%
0xdfd4b2c0 | 11264/88 | 11264/88 | 128 | 0.000%
0xf5274600 | 7680/256 | 7680/256 | 30 | 0.000%
0xe8395000 | 5948/594 | 5464/546 | 10 | 8.137%
0xe59c3c00 | 5748/479 | 5712/476 | 12 | 0.626%
0xf4cd1a80 | 3524/44 | 3520/44 | 80 | 0.114%
0xe5bd1600 | 2892/482 | 2856/476 | 6 | 1.245%
... | ... | ... | ... | ...
------------------------------------------------------------------------------

SUMMARY
=======
Total bytes requested: 2333626
Total bytes allocated: 2353712
Total bytes wasted on internal fragmentation: 20086
Internal fragmentation: 0.853375%
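
The fragmentation figures above are consistent with "bytes wasted divided by
bytes allocated". A minimal sketch of that arithmetic follows; the helper
name fragmentation() is an assumption for this example and is not claimed to
be the function in builtin-kmem.c:

```c
#include <stdint.h>

/*
 * Internal fragmentation as a percentage of the bytes actually allocated:
 *   100 * (bytes_alloc - bytes_req) / bytes_alloc
 * Returns 0 when nothing was allocated, to avoid dividing by zero.
 */
static double fragmentation(uint64_t bytes_req, uint64_t bytes_alloc)
{
	if (bytes_alloc == 0)
		return 0.0;

	return 100.0 - (100.0 * (double)bytes_req / (double)bytes_alloc);
}
```

With the summary totals, fragmentation(2333626, 2353712) comes to about
0.8534, and the 0xc05855a0 row (36 bytes requested, 64 allocated) gives
43.75.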

TODO:
- show sym+offset in 'callsite' column
- show cross node allocation stats
- collect more useful stats?
- ...

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Cc: linux-mm@kvack.org <linux-mm@kvack.org>
LKML-Reference: <4B064AF5.9060208@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>