summaryrefslogtreecommitdiffstats
path: root/tools/perf/util
Commit message (Collapse)AuthorAgeFilesLines
* perf tools: Add tracing_path and remove unneeded functionsJiri Olsa2015-08-282-53/+5
| | | | | | | | | | | | | | | | | There's no need for find_tracing_dir, because perf already searches for debugfs/tracefs mount on start and populate tracing_events_path. Adding tracing_path to carry tracing dir string to be used in get_tracing_file instead of calling find_tracing_dir. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Raphael Beamonte <raphael.beamonte@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1440596813-12844-3-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf buildid: Introduce sysfs/filename__sprintf_build_idMasami Hiramatsu2015-08-282-0/+35
| | | | | | | | | | | | | | | | Introduce sysfs/filename__sprintf_build_id for consolidating similar code. Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20150815114259.13642.34685.stgit@localhost.localdomain Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf evsel: Add a backpointer to the evlist a evsel is inArnaldo Carvalho de Melo2015-08-283-0/+8
| | | | | | | | | | | | | | | | | | So that functions that deal primarily with an evsel to access information that concerns the whole evlist it is in. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1440677263-21954-5-git-send-email-kan.liang@intel.com Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf probe: Support probing at absolute addressWang Nan2015-08-263-24/+163
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It should be useful to allow 'perf probe' probe at absolute offset of a target. For example, when (u)probing at a instruction of a shared object in a embedded system where debuginfo is not avaliable but we know the offset of that instruction by manually digging. This patch enables following perf probe command syntax: # perf probe 0xffffffff811e6615 And # perf probe /lib/x86_64-linux-gnu/libc-2.19.so 0xeb860 In the above example, we don't need a anchor symbol, so it is possible to compute absolute addresses using other methods and then use 'perf probe' to create the probing points. v1 -> v2: Drop the leading '+' in cmdline; Allow uprobing at offset 0x0; Improve 'perf probe -l' result when uprobe at area without debuginfo. v2 -> v3: Split bugfix to a separated patch. Test result: # perf probe 0xffffffff8119d175 %ax # perf probe sys_write %ax # perf probe /lib64/libc-2.18.so 0x0 %ax # perf probe /lib64/libc-2.18.so 0x5 %ax # perf probe /lib64/libc-2.18.so 0xd8e40 %ax # perf probe /lib64/libc-2.18.so __write %ax # perf probe /lib64/libc-2.18.so 0xd8e49 %ax # cat /sys/kernel/debug/tracing/uprobe_events p:probe_libc/abs_0 /lib64/libc-2.18.so:0x (null) arg1=%ax p:probe_libc/abs_5 /lib64/libc-2.18.so:0x0000000000000005 arg1=%ax p:probe_libc/abs_d8e40 /lib64/libc-2.18.so:0x00000000000d8e40 arg1=%ax p:probe_libc/__write /lib64/libc-2.18.so:0x00000000000d8e40 arg1=%ax p:probe_libc/abs_d8e49 /lib64/libc-2.18.so:0x00000000000d8e49 arg1=%ax # cat /sys/kernel/debug/tracing/kprobe_events p:probe/abs_ffffffff8119d175 0xffffffff8119d175 arg1=%ax p:probe/sys_write _text+1692016 arg1=%ax # perf probe -l Failed to find debug information for address 5 probe:abs_ffffffff8119d175 (on sys_write+5 with arg1) probe:sys_write (on sys_write with arg1) probe_libc:__write (on @unix/syscall-template.S:81 in /lib64/libc-2.18.so with arg1) probe_libc:abs_0 (on 0x0 in /lib64/libc-2.18.so with arg1) probe_libc:abs_5 (on 0x5 in /lib64/libc-2.18.so with arg1) probe_libc:abs_d8e40 (on @unix/syscall-template.S:81 in /lib64/libc-2.18.so with arg1) probe_libc:abs_d8e49 (on __GI___libc_write+9 in /lib64/libc-2.18.so with arg1) Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1440586666-235233-7-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf probe: Fix error reported when offset without functionWang Nan2015-08-261-3/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes a bug that, when offset is provided but function is lost, parse_perf_probe_point() will give a "" string as function name, so the checking code at the end of parse_perf_probe_point() become useless. For example: # perf probe +0x1234 Failed to find symbol in kernel Error: Failed to add events. After this patch: # perf probe +0x1234 Semantic error :Offset requires an entry function. Error: Command Parse Error. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1440586666-235233-6-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf probe: Fix list result when address is zeroWang Nan2015-08-261-3/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | When manually added uprobe point with zero address, 'perf probe -l' reports error. For example: # echo p:probe_libc/abs_0 /path/to/lib.bin:0x0 arg1=%ax > \ /sys/kernel/debug/tracing/uprobe_events # perf probe -l Error: Failed to show event list. Probing at 0x0 is possible and useful when lib.bin is not a normal shared object but is manually mapped. However, in this case kernel report: # cat /sys/kernel/debug/tracing/uprobe_events p:probe_libc/abs_0 /path/to/lib.bin:0x (null) arg1=%ax This patch supports the above kernel output. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1440586666-235233-5-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf probe: Fix list result when symbol can't be foundWang Nan2015-08-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'perf probe -l' reports error if it is unable find symbol through address. Here is an example. # echo 'p:probe_libc/abs_5 /lib64/libc.so.6:0x5' > /sys/kernel/debug/tracing/uprobe_events # cat /sys/kernel/debug/tracing/uprobe_events p:probe_libc/abs_5 /lib64/libc.so.6:0x0000000000000005 # perf probe -l Error: Failed to show event list Also, this situation triggers a logical inconsistency in convert_to_perf_probe_point() that, it returns ENOMEM but actually it never try strdup(). This patch removes !tp->module && !is_kprobe condition, so it always uses address to build function name if symbol not found. Test result: # perf probe -l probe_libc:abs_5 (on 0x5 in /lib64/libc.so.6) Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1440586666-235233-4-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf probe: Prevent segfault when reading probe point with absolute addressWang Nan2015-08-261-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'perf probe -l' panic if there is a manually inserted probing point with absolute address. For example: # echo 'p:probe/abs_ffffffff811e6615 0xffffffff811e6615' > /sys/kernel/debug/tracing/kprobe_events # perf probe -l Segmentation fault (core dumped) This patch fix this problem by considering the situation that "tp->symbol == NULL" in find_perf_probe_point_from_dwarf() and find_perf_probe_point_from_map(). After this patch: # perf probe -l probe:abs_ffffffff811e6615 (on SyS_write+5@fs/read_write.c) And when debug info is missing: # rm -rf ~/.debug # mv /lib/modules/4.2.0-rc1+/build/vmlinux /lib/modules/4.2.0-rc1+/build/vmlinux.bak # perf probe -l probe:abs_ffffffff811e6615 (on sys_write+5) Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1440509256-193590-1-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Add Intel PT support for decoding TRACESTOP packetsAdrian Hunter2015-08-241-0/+11
| | | | | | | | | | | | | | | | | A TRACESTOP packet is produced when an Intel PT trace enters a defined region of the address space at which point the tracing stops. This patch just adds decoder support. Support for specifying TRACESTOP regions is left until later. For details refer to the June 2015 or later Intel 64 and IA-32 Architectures SDM Chapter 36 Intel Processor Trace. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1437150840-31811-25-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Add Intel PT support for decoding CYC packetsAdrian Hunter2015-08-241-5/+306
| | | | | | | | | | | | | | | | | | | | | | | CYC packets provide even finer grain timestamp information than MTC and TSC packets. A CYC packet contains the number of CPU cycles since the last CYC packet. This patch just adds decoder support. The CPU frequency can be related to TSC using the Maximum Non-Turbo Ratio in combination with the CBR (core-to-bus ratio) packet. However more accuracy is achieved by simply interpolating the number of cycles between other timing packets like MTC or TSC. This patch takes the latter approach. Support for a default value and validation of values is provided by a later patch. Also documentation is updated in a separate patch. For details refer to the June 2015 or later Intel 64 and IA-32 Architectures SDM Chapter 36 Intel Processor Trace. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1437150840-31811-23-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Add Intel PT support for decoding MTC packetsAdrian Hunter2015-08-242-4/+159
| | | | | | | | | | | | | | | | | | | MTC packets provide finer grain timestamp information than TSC packets. MTC packets record time using the hardware crystal clock (CTC) which is related to TSC packets using a TMA packet. This patch just adds decoder support. Support for a default value and validation of values is provided by a later patch. Also documentation is updated in a separate patch. For details refer to the June 2015 or later Intel 64 and IA-32 Architectures SDM Chapter 36 Intel Processor Trace. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1437150840-31811-21-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Pass Intel PT information for decoding MTC and CYCAdrian Hunter2015-08-243-10/+60
| | | | | | | | | | | | | | | | Record additional information in the AUXTRACE_INFO event in preparation for decoding MTC and CYC packets. Pass the information to the decoder. The AUXTRACE_INFO record can be extended by using the size to indicate the presence of new members. The additional information includes PMU config bit positions and the TSC to CTC (hardware crystal clock) ratio needed to decode MTC packets. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1437150840-31811-20-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Add new Intel PT packet definitionsAdrian Hunter2015-08-243-17/+201
| | | | | | | | | | | | | | | | | | | | | | | | New features have been added to Intel PT which include a number of new packet definitions. This patch adds packet definitions for new packets: TMA, MTC, CYC, VMCS, TRACESTOP and MNT. Also another bit in PIP is defined. This patch only adds support for the definitions. Later patches add support for decoding TMA, MTC, CYC and TRACESTOP which is where those packets are explained. VMCS and the newly defined bit in PIP are used with virtualization which is not supported yet. MNT is a maintenance packet which the decoder should ignore. For details, refer to the June 2015 or later Intel 64 and IA-32 Architectures SDM Chapter 36 Intel Processor Trace. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1437150840-31811-19-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Fix Intel PT 'instructions' sample periodAdrian Hunter2015-08-243-1/+8
| | | | | | | | | | | | The period on synthesized 'instructions' samples was being set to a fixed value, whereas the correct value is the number of instructions since the last sample, which is a value that the decoder can provide. So do it that way. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1437150840-31811-14-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf ordered_events: Clear the progress bar at the end of a flushArnaldo Carvalho de Melo2015-08-241-0/+3
| | | | | | | | | | | | | | | | | | We were depending on the next screen operation after a flush() being one that would redraw the whole screen so that the progress bar would be overwritten, when that didn't happen a screen artifact of, say, a error dialog window would be overlaid on top of the progress bar, fix it by calling ui_browser__finish(), that now has a TUI implementation. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-el0fyw6duemnx62lydjzhs8c@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf annotate: Reset the dso find_symbol cache when removing symbolsArnaldo Carvalho de Melo2015-08-242-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The 'annotate' tool does some filtering in the entries in a DSO but forgot to reset the cache done in dso__find_symbol(), cauxing a SEGV: [root@zoo ~]# perf annotate netlink_poll perf: Segmentation fault -------- backtrace -------- perf[0x526ceb] /lib64/libc.so.6(+0x34960)[0x7faedfbe0960] perf(rb_erase+0x223)[0x499d63] perf[0x4213e9] perf[0x4bc123] perf[0x4bc621] perf[0x4bf26b] perf[0x4bc855] perf(perf_session__process_events+0x340)[0x4bddc0] perf(cmd_annotate+0x6bb)[0x421b5b] perf[0x479063] perf(main+0x60a)[0x42098a] /lib64/libc.so.6(__libc_start_main+0xf0)[0x7faedfbcbfe0] perf[0x420aa9] [0x0] [root@zoo ~]# Fix it by reseting the find cache when removing symbols. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Fixes: b685ac22b436 ("perf symbols: Add front end cache for DSO symbol lookup") Link: http://lkml.kernel.org/n/tip-b2y9x46y0t8yem1ive41zqyp@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Fix tarball build broken by pt/btsAdrian Hunter2015-08-226-6/+35
| | | | | | | | | | Fix some include paths and add missing inat_types.h. Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/55D77696.60102@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf probe: Try to use symbol table if searching debug info failedWang Nan2015-08-211-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A problem can occur in a statically linked perf when vmlinux can be found: # perf probe --add sys_epoll_pwait probe-definition(0): sys_epoll_pwait symbol:sys_epoll_pwait file:(null) line:0 offset:0 return:0 lazy:(null) 0 arguments Looking at the vmlinux_path (7 entries long) Using /lib/modules/4.2.0-rc1+/build/vmlinux for symbols Open Debuginfo file: /lib/modules/4.2.0-rc1+/build/vmlinux Try to find probe point from debuginfo. Symbol sys_epoll_pwait address found : ffffffff8122bd40 Matched function: SyS_epoll_pwait Failed to get call frame on 0xffffffff8122bd40 An error occurred in debuginfo analysis (-2). Error: Failed to add events. Reason: No such file or directory (Code: -2) The reason is caused by libdw that, if libdw is statically linked, it can't load libebl_{arch}.so reliable. In this case it is still possible to get the address from /proc/kalksyms. However, perf tries that only when libdw returns -EBADF. This patch gives it another chance to utilize symbol table, even if libdw returns an error code other than -EBADF. After applying this patch: # perf probe -nv --add sys_epoll_pwait probe-definition(0): sys_epoll_pwait symbol:sys_epoll_pwait file:(null) line:0 offset:0 return:0 lazy:(null) 0 arguments Looking at the vmlinux_path (7 entries long) Using /lib/modules/4.2.0-rc1+/build/vmlinux for symbols Open Debuginfo file: /lib/modules/4.2.0-rc1+/build/vmlinux Try to find probe point from debuginfo. Symbol sys_epoll_pwait address found : ffffffff8122bd40 Matched function: SyS_epoll_pwait Failed to get call frame on 0xffffffff8122bd40 An error occurred in debuginfo analysis (-2). Trying to use symbols. Opening /sys/kernel/debug/tracing/kprobe_events write=1 Added new event: Writing event: p:probe/sys_epoll_pwait _text+2276672 probe:sys_epoll_pwait (on sys_epoll_pwait) You can now use it in all perf tools, such as: perf record -e probe:sys_epoll_pwait -aR sleep 1 Although libdw returns an error (Failed to get call frame), perf tries symbol table and finally gets correct address. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kaixu Xia <xiakaixu@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1440151770-129878-2-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Initialize reference counts in map__clone()Arnaldo Carvalho de Melo2015-08-211-2/+11
| | | | | | | | | | | | | | | | | | | | Map clone was written before we introduced reference counts for maps and dsos, so all that was needed was just a copy and then we would insert it into the new map_groups instance. Fix it by, after copying, initializing the map->refcnt, grabbing a struct dso refcount and resetting pointers that may be used to determine if a map, when deleted, is in a rb_tree. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-pd4mr80o5b9gvk50iineacec@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Add Intel BTS supportAdrian Hunter2015-08-216-4/+981
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Intel BTS support fits within the new auxtrace infrastructure. Recording is supporting by identifying the Intel BTS PMU, parsing options and setting up events. Decoding is supported by queuing up trace data by thread and then decoding synchronously delivering synthesized event samples into the session processing for tools to consume. Committer note: E.g: [root@felicio ~]# perf record --per-thread -e intel_bts// ls anaconda-ks.cfg apctest.output bin kernel-rt-3.10.0-298.rt56.171.el7.x86_64.rpm libexec lock_page.bpf.c perf.data perf.data.old [ perf record: Woken up 3 times to write data ] [ perf record: Captured and wrote 4.367 MB perf.data ] [root@felicio ~]# perf evlist -v intel_bts//: type: 6, size: 112, { sample_period, sample_freq }: 1, sample_type: IP|TID|IDENTIFIER, read_format: ID, disabled: 1, enable_on_exec: 1, sample_id_all: 1, exclude_guest: 1 dummy:u: type: 1, size: 112, config: 0x9, { sample_period, sample_freq }: 1, sample_type: IP|TID|IDENTIFIER, read_format: ID, disabled: 1, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, mmap2: 1, comm_exec: 1 [root@felicio ~]# perf script # the navigate in the pager to some interesting place: ls 1843 1 branches: ffffffff810a60cb flush_signal_handlers ([kernel.kallsyms]) => ffffffff8121a522 setup_new_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8121a529 setup_new_exec ([kernel.kallsyms]) => ffffffff8122fa30 do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122fa5d do_close_on_exec ([kernel.kallsyms]) => ffffffff81767ae0 _raw_spin_lock ([kernel.kallsyms]) ls 1843 1 branches: ffffffff81767af4 _raw_spin_lock ([kernel.kallsyms]) => ffffffff8122fa62 do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122fa8e do_close_on_exec ([kernel.kallsyms]) => ffffffff8122faf0 do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122faf7 do_close_on_exec ([kernel.kallsyms]) => ffffffff8122fa8b do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122fa8e do_close_on_exec ([kernel.kallsyms]) => ffffffff8122faf0 do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122faf7 do_close_on_exec ([kernel.kallsyms]) => ffffffff8122fa8b do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122fa8e do_close_on_exec ([kernel.kallsyms]) => ffffffff8122faf0 do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122faf7 do_close_on_exec ([kernel.kallsyms]) => ffffffff8122fa8b do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122fa8e do_close_on_exec ([kernel.kallsyms]) => ffffffff8122faf0 do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122faf7 do_close_on_exec ([kernel.kallsyms]) => ffffffff8122fa8b do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122fa8e do_close_on_exec ([kernel.kallsyms]) => ffffffff8122faf0 do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122faf7 do_close_on_exec ([kernel.kallsyms]) => ffffffff8122fa8b do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122fa8e do_close_on_exec ([kernel.kallsyms]) => ffffffff8122faf0 do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122faf7 do_close_on_exec ([kernel.kallsyms]) => ffffffff8122fa8b do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122fac9 do_close_on_exec ([kernel.kallsyms]) => ffffffff8122fad2 do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122fadd do_close_on_exec ([kernel.kallsyms]) => ffffffff8120fc80 filp_close ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8120fcaf filp_close ([kernel.kallsyms]) => ffffffff8120fcb6 filp_close ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8120fcc2 filp_close ([kernel.kallsyms]) => ffffffff812547f0 dnotify_flush ([kernel.kallsyms]) ls 1843 1 branches: ffffffff81254823 dnotify_flush ([kernel.kallsyms]) => ffffffff8120fcc7 filp_close ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8120fccd filp_close ([kernel.kallsyms]) => ffffffff81261790 locks_remove_posix ([kernel.kallsyms]) ls 1843 1 branches: ffffffff812617a3 locks_remove_posix ([kernel.kallsyms]) => ffffffff812617b9 locks_remove_posix ([kernel.kallsyms]) ls 1843 1 branches: ffffffff812617b9 locks_remove_posix ([kernel.kallsyms]) => ffffffff8120fcd2 filp_close ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8120fcd5 filp_close ([kernel.kallsyms]) => ffffffff812142c0 fput ([kernel.kallsyms]) ls 1843 1 branches: ffffffff812142d6 fput ([kernel.kallsyms]) => ffffffff812142df fput ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8121430c fput ([kernel.kallsyms]) => ffffffff810b6580 task_work_add ([kernel.kallsyms]) ls 1843 1 branches: ffffffff810b65ad task_work_add ([kernel.kallsyms]) => ffffffff810b65b1 task_work_add ([kernel.kallsyms]) ls 1843 1 branches: ffffffff810b65c1 task_work_add ([kernel.kallsyms]) => ffffffff810bc710 kick_process ([kernel.kallsyms]) ls 1843 1 branches: ffffffff810bc725 kick_process ([kernel.kallsyms]) => ffffffff810bc742 kick_process ([kernel.kallsyms]) ls 1843 1 branches: ffffffff810bc742 kick_process ([kernel.kallsyms]) => ffffffff810b65c6 task_work_add ([kernel.kallsyms]) ls 1843 1 branches: ffffffff810b65c9 task_work_add ([kernel.kallsyms]) => ffffffff81214311 fput ([kernel.kallsyms]) Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1437150840-31811-9-git-send-email-adrian.hunter@intel.com [ Merged sample->time fix for bug found after first round of testing on slightly older kernel ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Fix Intel PT timestamp handlingAdrian Hunter2015-08-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | Events that don't sample the timestamp have a timestamp value of -1. Intel PT processing wasn't taking that into account. This is particularly noticeable with Intel BTS because timestamps are not requested by default. Then, if the conversion of -1 to TSC results in a small number, the processing is unaffected. However if the conversion results in a big number, then the data is processed prematurely before relevant sideband data like mmap events, which in turn results in samples with unknown dsos. Commiter note: Since BTS wasn't upstream, I split the patch to fold the BTS part with the patch introducing it, to avoid having this bug in the commit history. PT was already upstream, so this patch contains that part. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1440060692-5585-1-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: /proc/kcore requires CAP_SYS_RAWIO message too noisyAdrian Hunter2015-08-212-2/+3
| | | | | | | | | | | | | | | | | | | | | | The "/proc/kcore requires CAP_SYS_RAWIO" message comes up all the time for 'perf script' if vmlinux is not found and the user isn't root, even when the kernel is not being traced and even though the message is only really relevant for annotation. Change it to pr_debug and instead put a note in the message displayed if annotation is not possible. Also, the file being accessed might not be /proc/kcore. Tools can be directed to a different location using the --kallsyms option in which case kcore is expected to be in the same directory. Adjust the message so it is not misleading in that case. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Li Zhang <zhlcindy@linux.vnet.ibm.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1440065260-8802-1-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf script: Fix segfault using --show-mmap-eventsAdrian Hunter2015-08-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | Patch "perf script: Don't assume evsel position of tracking events" changed 'perf script' to use 'perf_evlist__id2evsel()'. That results in a segfault if there is more than 1 event and there are synthesized mmap events e.g. $ perf record -e cycles,instructions -p$$ sleep 1 $ perf script --show-mmap-events Segmentation fault (core dumped) That happens because these synthesized events have an 'id' of zero which does not match any 'evsel'. Currently, these synthesized events use the sample type of the first evsel. Change 'perf_evlist__id2evsel()' to reflect that which also makes it consistent with 'perf_evlist__event2evsel()'. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Fixes: 06b234ec26fd ("perf script: Don't assume evsel position of tracking events") Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1440059205-1765-1-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* Merge tag 'perf-core-for-mingo' of ↵Ingo Molnar2015-08-2026-6/+7393
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: - Support Intel PT in several tools, enabling the use of the processor trace feature introduced in Intel Broadwell processors: (Adrian Hunter) # dmesg | grep Performance # [0.188477] Performance Events: PEBS fmt2+, 16-deep LBR, Broadwell events, full-width counters, Intel PMU driver. # perf record -e intel_pt//u -a sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.216 MB perf.data ] # perf script # then navigate in the tool output to some area, like this one: 184 1030 dl_main (/usr/lib64/ld-2.17.so) => 7f21ba661440 dl_main (/usr/lib64/ld-2.17.so) 185 1457 dl_main (/usr/lib64/ld-2.17.so) => 7f21ba669f10 _dl_new_object (/usr/lib64/ld-2.17.so) 186 9f37 _dl_new_object (/usr/lib64/ld-2.17.so) => 7f21ba677b90 strlen (/usr/lib64/ld-2.17.so) 187 7ba3 strlen (/usr/lib64/ld-2.17.so) => 7f21ba677c75 strlen (/usr/lib64/ld-2.17.so) 188 7c78 strlen (/usr/lib64/ld-2.17.so) => 7f21ba669f3c _dl_new_object (/usr/lib64/ld-2.17.so) 189 9f8a _dl_new_object (/usr/lib64/ld-2.17.so) => 7f21ba65fab0 calloc@plt (/usr/lib64/ld-2.17.so) 190 fab0 calloc@plt (/usr/lib64/ld-2.17.so) => 7f21ba675e70 calloc (/usr/lib64/ld-2.17.so) 191 5e87 calloc (/usr/lib64/ld-2.17.so) => 7f21ba65fa90 malloc@plt (/usr/lib64/ld-2.17.so) 192 fa90 malloc@plt (/usr/lib64/ld-2.17.so) => 7f21ba675e60 malloc (/usr/lib64/ld-2.17.so) 193 5e68 malloc (/usr/lib64/ld-2.17.so) => 7f21ba65fa80 __libc_memalign@plt (/usr/lib64/ld-2.17.so) 194 fa80 __libc_memalign@plt (/usr/lib64/ld-2.17.so) => 7f21ba675d50 __libc_memalign (/usr/lib64/ld-2.17.so) 195 5d63 __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba675e20 __libc_memalign (/usr/lib64/ld-2.17.so) 196 5e40 __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba675d73 __libc_memalign (/usr/lib64/ld-2.17.so) 197 5d97 __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba675e18 __libc_memalign (/usr/lib64/ld-2.17.so) 198 5e1e __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba675df9 __libc_memalign (/usr/lib64/ld-2.17.so) 199 5e10 __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba669f8f _dl_new_object (/usr/lib64/ld-2.17.so) 200 9fc2 _dl_new_object (/usr/lib64/ld-2.17.so) => 7f21ba678e70 memcpy (/usr/lib64/ld-2.17.so) 201 8e8c memcpy (/usr/lib64/ld-2.17.so) => 7f21ba678ea0 memcpy (/usr/lib64/ld-2.17.so) - Fix annotation of vdso (Adrian Hunter) - Fix DWARF callchains in 'perf script' (Jiri Olsa) - Fix adding probes in kernel syscalls and listing which variables can be collected at kernel syscall function lines (Masami Hiramatsu) Build Fixes: - Fix 32-bit compilation error in util/annotate.c (Adrian Hunter) - Support static linking with libdw on Fedora 22 (Andi Kleen) Infrastructure changes: - Add a helper function to probe whether cpu-wide tracing is possible (Adrian Hunter) - Move vfs_getname storage to per thread area in 'perf trace' (Arnaldo Carvalho de Melo) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * perf tools: Take Intel PT into useAdrian Hunter2015-08-172-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To record an AUX area, the weak function auxtrace_record__init() must be implemented. Equally to decode an AUX area, the AUX area tracing type must be added to the perf_event__process_auxtrace_info() function. This patch makes those two changes plus hooks up default config for the intel_pt PMU. Also some brief documentation is provided for using the tools with intel_pt. Commiter note: E.g: [root@perf4 ~]# dmesg 451 [0.405807] Performance Events: PEBS fmt2+, 16-deep LBR, Broadwell events, full-width counters, Intel PMU driver. [root@perf4 ~]# perf --version perf version 4.1.g53874a [root@perf4 ~]# perf record -e intel_pt//u -a sleep 10 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.383 MB perf.data ] [root@perf4 ~]# perf evlist intel_pt//u sched:sched_switch dummy:u [root@perf4 ~]# perf report --stdio # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 0 of event 'intel_pt//u' # Event count (approx.): 0 # # Overhead Command Shared Object Symbol # ........ ....... ............. ...... # # Samples: 393 of event 'sched:sched_switch' # Event count (approx.): 393 # # Overhead Command Shared Object Symbol # ........ .............. ................ .............. 49.62% swapper [kernel.vmlinux] [k] __schedule 10.69% rcu_sched [kernel.vmlinux] [k] __schedule 6.62% rcuos/0 [kernel.vmlinux] [k] __schedule 5.60% kworker/0:1 [kernel.vmlinux] [k] __schedule 3.56% rcuos/3 [kernel.vmlinux] [k] __schedule 3.05% kworker/u384:2 [kernel.vmlinux] [k] __schedule 2.54% kworker/2:0 [kernel.vmlinux] [k] __schedule 2.54% tuned [kernel.vmlinux] [k] __schedule <SNIP> # Samples: 0 of event 'dummy:u' # Event count (approx.): 0 # # Overhead Command Shared Object Symbol # ........ ....... ............. ...... # Samples: 28 of event 'instructions:u' # Event count (approx.): 5030172 # # Overhead Command Shared Object Symbol # ........ .......... ................... ................................ # 21.43% tuned libpython2.7.so.1.0 [.] PyEval_EvalFrameEx | ---PyEval_EvalFrameEx | |--83.33%-- PyEval_EvalCodeEx | PyEval_EvalFrameEx | | | |--60.00%-- PyEval_EvalCodeEx | | PyEval_EvalFrameEx | | PyEval_EvalFrameEx | | | --40.00%-- PyEval_EvalFrameEx | --16.67%-- PyEval_EvalFrameEx PyEval_EvalCodeEx PyEval_EvalFrameEx PyEval_EvalCodeEx PyEval_EvalFrameEx PyEval_EvalFrameEx 14.29% tuned libpython2.7.so.1.0 [.] _PyType_Lookup | ---_PyType_Lookup _PyObject_GenericGetAttrWithDict PyEval_EvalFrameEx PyEval_EvalCodeEx PyEval_EvalFrameEx PyEval_EvalCodeEx PyEval_EvalFrameEx | |--75.00%-- PyEval_EvalFrameEx | --25.00%-- PyEval_EvalCodeEx PyEval_EvalFrameEx PyEval_EvalFrameEx 3.57% irqbalance irqbalance [.] 0x0000000000004038 | ---0x4038 0x4761 0x4761 0x4761 0x49f1 0x2295 3.57% irqbalance libc-2.17.so [.] __GI_____strtoull_l_internal | ---__GI_____strtoull_l_internal 0x6f49 0x229a 3.57% irqbalance libc-2.17.so [.] __strchrnul | ---__strchrnul vfprintf __vsprintf_chk __sprintf_chk 0x2724 0x4038 0x2331 3.57% irqbalance libc-2.17.so [.] __strstr_sse42 | ---__strstr_sse42 0x71e0 0x229f # And now to some userspace ftrace on uninstrumented binaries 8-) : # Hand edited to make it a bit more compact, replacing /home/acme/bin/perf # with /bin/perf: [root@perf4 ~]# perf script perf 8921 [3] 7.310889: 1 branches:u: 0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so) perf 8921 [3] 7.310889: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310889: 1 branches:u: 481694 perf_evlist__enable (/bin/perf) => 481614 perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310889: 1 branches:u: 481630 perf_evlist__enable (/bin/perf) => 4816d8 perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310889: 1 branches:u: 4816de perf_evlist__enable (/bin/perf) => 48164f perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310889: 1 branches:u: 481652 perf_evlist__enable (/bin/perf) => 48165f perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310889: 1 branches:u: 481684 perf_evlist__enable (/bin/perf) => 41d250 ioctl@plt (/bin/perf) perf 8921 [3] 7.310889: 1 branches:u: 41d250 ioctl@plt (/bin/perf) => 7fcecadbf250 __GI___ioctl (/usr/lib64/libc-2.17.so) perf 8921 [3] 7.310889: 1 branches:u: 7fcecadbf255 __GI___ioctl (/usr/lib64/libc-2.17.so) => 0 [unknown] ([unknown]) perf 8921 [3] 7.310890: 1 branches:u: 0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so) perf 8921 [3] 7.310890: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310890: 1 branches:u: 481694 perf_evlist__enable (/bin/perf) => 481614 perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310890: 1 branches:u: 481652 perf_evlist__enable (/bin/perf) => 48165f perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310890: 1 branches:u: 481684 perf_evlist__enable (/bin/perf) => 41d250 ioctl@plt (/bin/perf) perf 8921 [3] 7.310890: 1 branches:u: 41d250 ioctl@plt (/bin/perf) => 7fcecadbf250 __GI___ioctl (/usr/lib64/libc-2.17.so) perf 8921 [3] 7.310890: 1 branches:u: 7fcecadbf255 __GI___ioctl (/usr/lib64/libc-2.17.so) => 0 [unknown] ([unknown]) perf 8921 [3] 7.310893: 1 branches:u: 0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so) perf 8921 [3] 7.310893: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310893: 1 branches:u: 4816a8 perf_evlist__enable (/bin/perf) => 4815f8 perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310893: 1 branches:u: 4815fe perf_evlist__enable (/bin/perf) => 481614 perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310893: 1 branches:u: 481652 perf_evlist__enable (/bin/perf) => 48165f perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310893: 1 branches:u: 481684 perf_evlist__enable (/bin/perf) => 41d250 ioctl@plt (/bin/perf) perf 8921 [3] 7.310893: 1 branches:u: 41d250 ioctl@plt (/bin/perf) => 7fcecadbf250 __GI___ioctl (/usr/lib64/libc-2.17.so) perf 8921 [3] 7.310893: 1 branches:u: 7fcecadbf255 __GI___ioctl (/usr/lib64/libc-2.17.so) => 0 [unknown] ([unknown]) perf 8921 [3] 7.310956: 1 branches:u: 0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so) perf 8921 [3] 7.310956: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310956: 1 branches:u: 481694 perf_evlist__enable (/bin/perf) => 481614 perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310956: 1 branches:u: 481630 perf_evlist__enable (/bin/perf) => 4816d8 perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310956: 1 branches:u: 4816de perf_evlist__enable (/bin/perf) => 48164f perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310956: 1 branches:u: 481652 perf_evlist__enable (/bin/perf) => 48165f perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310956: 1 branches:u: 481684 perf_evlist__enable (/bin/perf) => 41d250 ioctl@plt (/bin/perf) perf 8921 [3] 7.310956: 1 branches:u: 41d250 ioctl@plt (/bin/perf) => 7fcecadbf250 __GI___ioctl (/usr/lib64/libc-2.17.so) perf 8921 [3] 7.310956: 1 branches:u: 7fcecadbf255 __GI___ioctl (/usr/lib64/libc-2.17.so) => 0 [unknown] ([unknown]) perf 8921 [3] 7.310961: 1 branches:u: 0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so) perf 8921 [3] 7.310961: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310961: 1 branches:u: 481694 perf_evlist__enable (/bin/perf) => 481614 perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310961: 1 branches:u: 481652 perf_evlist__enable (/bin/perf) => 48165f perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310961: 1 branches:u: 481684 perf_evlist__enable (/bin/perf) => 41d250 ioctl@plt (/bin/perf) perf 8921 [3] 7.310961: 1 branches:u: 41d250 ioctl@plt (/bin/perf) => 7fcecadbf250 __GI___ioctl (/usr/lib64/libc-2.17.so) perf 8921 [3] 7.310961: 1 branches:u: 7fcecadbf255 __GI___ioctl (/usr/lib64/libc-2.17.so) => 0 [unknown] ([unknown]) perf 8921 [3] 7.310968: 1 branches:u: 0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so) perf 8921 [3] 7.310968: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310968: 1 branches:u: 4816a8 perf_evlist__enable (/bin/perf) => 4815f8 perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310968: 1 branches:u: 4815fe perf_evlist__enable (/bin/perf) => 481614 perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310968: 1 branches:u: 481652 perf_evlist__enable (/bin/perf) => 48165f perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310968: 1 branches:u: 481684 perf_evlist__enable (/bin/perf) => 41d250 ioctl@plt (/bin/perf) perf 8921 [3] 7.310968: 1 branches:u: 41d250 ioctl@plt (/bin/perf) => 7fcecadbf250 __GI___ioctl (/usr/lib64/libc-2.17.so) perf 8921 [3] 7.310968: 1 branches:u: 7fcecadbf255 __GI___ioctl (/usr/lib64/libc-2.17.so) => 0 [unknown] ([unknown]) perf 8921 [3] 7.311040: 1 branches:u: 0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so) perf 8921 [3] 7.311040: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf) perf 8921 [3] 7.311040: 1 branches:u: 481694 perf_evlist__enable (/bin/perf) => 481614 perf_evlist__enable (/bin/perf) perf 8921 [3] 7.311040: 1 branches:u: 481630 perf_evlist__enable (/bin/perf) => 4816d8 perf_evlist__enable (/bin/perf) perf 8921 [3] 7.311040: 1 branches:u: 4816de perf_evlist__enable (/bin/perf) => 48164f perf_evlist__enable (/bin/perf) perf 8921 [3] 7.311040: 1 branches:u: 481652 perf_evlist__enable (/bin/perf) => 48165f perf_evlist__enable (/bin/perf) perf 8921 [3] 7.311040: 1 branches:u: 481684 perf_evlist__enable (/bin/perf) => 41d250 ioctl@plt (/bin/perf) perf 8921 [3] 7.311040: 1 branches:u: 41d250 ioctl@plt (/bin/perf) => 7fcecadbf250 __GI___ioctl (/usr/lib64/libc-2.17.so) perf 8921 [3] 7.311040: 1 branches:u: 7fcecadbf255 __GI___ioctl (/usr/lib64/libc-2.17.so) => 0 [unknown] ([unknown]) perf 8921 [3] 7.311046: 1 branches:u: 0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so) perf 8921 [3] 7.311046: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf) perf 8921 [3] 7.311046: 1 branches:u: 481694 perf_evlist__enable (/bin/perf) => 481614 perf_evlist__enable (/bin/perf) perf 8921 [3] 7.311046: 1 branches:u: 481652 perf_evlist__enable (/bin/perf) => 48165f perf_evlist__enable (/bin/perf) perf 8921 [3] 7.311046: 1 branches:u: 481684 perf_evlist__enable (/bin/perf) => 41d250 ioctl@plt (/bin/perf) perf 8921 [3] 7.311046: 1 branches:u: 41d250 ioctl@plt (/bin/perf) => 7fcecadbf250 __GI___ioctl (/usr/lib64/libc-2.17.so) perf 8921 [3] 7.311046: 1 branches:u: 7fcecadbf255 __GI___ioctl (/usr/lib64/libc-2.17.so) => 0 [unknown] ([unknown]) perf 8921 [3] 7.311050: 1 branches:u: 0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so) perf 8921 [3] 7.311050: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf) : Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1437150840-31811-8-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf tools: Add Intel PT supportAdrian Hunter2015-08-173-0/+1963
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for Intel Processor Trace. Intel PT support fits within the new auxtrace infrastructure. Recording is supporting by identifying the Intel PT PMU, parsing options and setting up events. Decoding is supported by queuing up trace data by cpu or thread and then decoding synchronously delivering synthesized event samples into the session processing for tools to consume. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1437150840-31811-7-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf tools: Add Intel PT decoderAdrian Hunter2015-08-173-1/+1921
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for decoding an Intel Processor Trace. Intel PT trace data must be 'decoded' which involves walking the object code and matching the trace data packets. The decoder requests a buffer of binary data via a get_trace() call-back, which it decodes using instruction information which it gets via another call-back walk_insn(). Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1437150840-31811-6-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf tools: Add Intel PT logAdrian Hunter2015-08-173-1/+208
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a facility to log Intel Processor Trace decoding. The log is intended for debugging purposes only. The log file name is "intel_pt.log" and is opened in the current directory. The log contains a record of all packets and instructions decoded and can get very large (10 MB would be a small one). Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1437150840-31811-5-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf tools: Add Intel PT instruction decoderAdrian Hunter2015-08-179-1/+2790
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for decoding instructions for Intel Processor Trace. The kernel x86 instruction decoder is copied for this. This essentially provides intel_pt_get_insn() which takes a binary buffer, uses the kernel's x86 instruction decoder to get details of the instruction and then categorizes it for consumption by an Intel PT decoder. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1439450095-30122-1-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf tools: Add Intel PT packet decoderAdrian Hunter2015-08-174-0/+466
| | | | | | | | | | | | | | | | | | | | | | | | Add support for decoding Intel Processor Trace packets. This essentially provides intel_pt_get_packet() which takes a buffer of binary data and returns the decoded packet. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1437150840-31811-3-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf auxtrace: Add Intel PT as an AUX area tracing typeAdrian Hunter2015-08-172-0/+2
| | | | | | | | | | | | | | | | | | | | Add the Intel Processor Trace type constant PERF_AUXTRACE_INTEL_PT. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1437150840-31811-2-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf tools: Add a helper function to probe whether cpu-wide tracing is possibleAdrian Hunter2015-08-172-0/+25
| | | | | | | | | | | | | | | | | | Add a helper function to probe whether cpu-wide tracing is possible. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1439458857-30636-2-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf symbols: Fix annotation of vdsoAdrian Hunter2015-08-171-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | Older kernels attempt to prelink vdso to its virtual address. To permit annotation using objdump, the map__rip_2objdump() calculation must result in that same address which we can infer from the start and offset of the text section. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Link: http://lkml.kernel.org/r/1439556606-11297-1-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf annotate: Fix 32-bit compilation error in util/annotate.cAdrian Hunter2015-08-171-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix the following 32-bit compilation errors: util/annotate.c: In function ‘addr_map_symbol__account_cycles’: util/annotate.c:643:3: error: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 4 has type ‘u64’ [-Werror=format=] pr_debug2("BB with bad start: addr %lx start %lx sym %lx saddr %lx\n", ^ util/annotate.c:643:3: error: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 5 has type ‘u64’ [-Werror=format=] util/annotate.c:643:3: error: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 6 has type ‘u64’ [-Werror=format=] These were introduced by the patch: "perf report: Add infrastructure for a cycles histogram" Also change the 'saddr' variable from 'unsigned long' to 'u64' noting that theoretically we could be processing data captured on a 64-bit machine but processing it on a 32-bit machine. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Fixes: d4957633bf9d ("perf report: Add infrastructure for a cycles histogram") Link: http://lkml.kernel.org/r/1439536294-18241-1-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf probe: Fix to add missed brace around if blockMasami Hiramatsu2015-08-131-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The commit 75186a9b09e4 (perf probe: Fix to show lines of sys_ functions correctly) introduced a bug by a missed brace around if block. This fixes to add it. Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Fixes: 75186a9b09e4 ("perf probe: Fix to show lines of sys_ functions correctly") Link: http://lkml.kernel.org/r/20150812215541.9088.62425.stgit@localhost.localdomain Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | Merge branch 'perf/urgent' into perf/core, to pick up fixes before adding ↵Ingo Molnar2015-08-202-2/+24
|\ \ | |/ |/| | | | | | | more changes Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * perf tools: Make fork event processing more resilientAdrian Hunter2015-08-191-2/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When processing a fork event, the tools lookup the parent thread by its tid. In a couple of cases, it is possible for that thread to have the wrong pid. That can happen if the data is being processed out of order, or if the (fork) event that would have removed the erroneous thread was lost. Assume the latter case, print a dump message, remove the erroneous thread, create a new one with the correct pid, and keep going. Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Jiri Olsa <jolsa@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1439994561-27436-3-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf tools: Avoid deadlock when map_groups are brokenAdrian Hunter2015-08-191-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Attempting to clone map groups onto themselves will deadlock. It only happens because of other bugs, but the code should protect itself anyway. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/1439994561-27436-2-git-send-email-adrian.hunter@intel.com [ Use pr_debug() instead of dump_fprintf() ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf report: Show call graph from reference eventsKan Liang2015-08-122-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce --show-ref-call-graph for perf report to print reference callgraph for no callgraph event. Here is an example. perf report --show-ref-call-graph --stdio # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 5 of event 'cpu/cpu-cycles,call-graph=fp/' # Event count (approx.): 144985 # # Children Self Command Shared Object Symbol # ........ ........ ....... ................ ........................................ # 72.30% 0.00% sleep [kernel.vmlinux] [k] entry_SYSCALL_64_fastpath | ---entry_SYSCALL_64_fastpath | |--22.62%-- __GI___libc_nanosleep --77.38%-- [...] ...... # Samples: 6 of event 'cpu/instructions,call-graph=no/', show reference callgraph # Event count (approx.): 172780 # # Children Self Command Shared Object Symbol # ........ ........ ....... ................ ........................................ # 73.16% 0.00% sleep [kernel.vmlinux] [k] entry_SYSCALL_64_fastpath | ---entry_SYSCALL_64_fastpath | |--31.44%-- __GI___libc_nanosleep --68.56%-- [...] Signed-off-by: Kan Liang <kan.liang@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1439289050-40510-3-git-send-email-kan.liang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf callchain: Allow disabling call graphs per eventKan Liang2015-08-122-9/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduce "call-graph=no" to disable per-event callgraph. Here is an example. perf record -e 'cpu/cpu-cycles,call-graph=fp/,cpu/instructions,call-graph=no/' sleep 1 perf report --stdio # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 6 of event 'cpu/cpu-cycles,call-graph=fp/' # Event count (approx.): 774218 # # Children Self Command Shared Object Symbol # ........ ........ ....... ................ ........................................ # 61.94% 0.00% sleep [kernel.vmlinux] [k] entry_SYSCALL_64_fastpath | ---entry_SYSCALL_64_fastpath | |--97.30%-- __brk | --2.70%-- mmap64 _dl_check_map_versions _dl_check_all_versions 61.94% 0.00% sleep [kernel.vmlinux] [k] perf_event_mmap | ---perf_event_mmap | |--97.30%-- do_brk | sys_brk | entry_SYSCALL_64_fastpath | __brk | --2.70%-- mmap_region do_mmap_pgoff vm_mmap_pgoff sys_mmap_pgoff sys_mmap entry_SYSCALL_64_fastpath mmap64 _dl_check_map_versions _dl_check_all_versions ...... # Samples: 6 of event 'cpu/instructions,call-graph=no/' # Event count (approx.): 359692 # # Children Self Command Shared Object Symbol # ........ ........ ....... ................ ................................. # 89.03% 0.00% sleep [unknown] [.] 0xffff6598ffff6598 89.03% 0.00% sleep ld-2.17.so [.] _dl_resolve_conflicts 89.03% 0.00% sleep [kernel.vmlinux] [k] page_fault Signed-off-by: Kan Liang <kan.liang@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1439289050-40510-2-git-send-email-kan.liang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf callchain: Per-event type selection supportKan Liang2015-08-126-3/+83
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patchkit adds the ability to set callgraph mode (fp, dwarf, lbr) per event. This in term can reduce sampling overhead and the size of the perf.data. Here is an example. perf record -e 'cpu/cpu-cycles,period=1000,call-graph=fp,time=1/,cpu/instructions,call-graph=lbr/' sleep 1 perf evlist -v cpu/cpu-cycles,period=1000,call-graph=fp,time=1/: type: 4, size: 112, config: 0x3c, { sample_period, sample_freq }: 1000, sample_type: IP|TID|TIME|CALLCHAIN|PERIOD|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 cpu/instructions,call-graph=lbr/: type: 4, size: 112, config: 0xc0, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CALLCHAIN|PERIOD|BRANCH_STACK|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, freq: 1, enable_on_exec: 1, sample_id_all: 1, exclude_guest: 1 Signed-off-by: Kan Liang <kan.liang@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1439289050-40510-1-git-send-email-kan.liang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf probe: Fix to show lines of sys_ functions correctlyMasami Hiramatsu2015-08-121-5/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | "perf probe --lines sys_poll" shows only the first line of sys_poll, because the SYSCALL_DEFINE macro: ---- SYSCALL_DEFINE*(foo,...) { body; } ---- is expanded as below (on debuginfo) ---- static inline int SYSC_foo(...) { body; } int SyS_foo(...) <- is an alias of sys_foo. { return SYSC_foo(...); } ---- So, "perf probe --lines sys_foo" decodes SyS_foo function and it also skips inlined functions(SYSC_foo) inside the target function because those functions are usually defined somewhere else. To fix this issue, this fix checks whether the inlined function is defined at the same point of the target function, and if so, it doesn't skip the inline function. Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20150812012406.11811.94691.stgit@localhost.localdomain Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf sort: Check for SRCLINE_UNKNOWN case in "srcfile" processingAndi Kleen2015-08-121-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Handle the SRCLINE_UNKNOWN case correctly when processing "srcfile". Commiter note: We can't just free it, as it was't allocated via malloc, its a guard variable. Reported-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20150811133655.GC4524@tassilo.jf.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf evlist: Be more specific on -F/--freqNamhyung Kim2015-08-101-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently perf evlist -F shows the number as if it's always sampling frequency. But we now support per-event freq/period settings. So it'd better to show more detailed info whether it's freq or period. $ perf record -e 'cpu/config=1/,cpu/config=2,period=300000/' sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.017 MB perf.data ] $ perf evlist -F cpu/config=1/: sample_freq=4000 cpu/config=2,period=300000/: sample_period=300000 Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1439102724-14079-2-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf record: Support per-event freq termNamhyung Kim2015-08-106-1/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Now perf can set per-event value of time and (sampling) period. But I guess most users like me just want to set frequency rather than period. So add the 'freq' term in the event parser. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1439102724-14079-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf report: Add support for srcfile sort keyAndi Kleen2015-08-104-0/+60
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In some cases it's useful to characterize samples by file. This is useful to get a higher level categorization, for example to map cost to subsystems. Add a srcfile sort key to perf report. It builds on top of the existing srcline support. Commiter notes: E.g.: # perf record -F 10000 usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.016 MB perf.data (13 samples) ] [root@zoo ~]# perf report -s srcfile --stdio # Total Lost Samples: 0 # # Samples: 13 of event 'cycles' # Event count (approx.): 869878 # # Overhead Source File # ........ ........... 60.99% . 20.62% paravirt.h 14.23% rmap.c 4.04% signal.c 0.11% msr.h # The first line is collecting all the files for which srcfiles couldn't somehow get resolved to: # perf report -s srcfile,dso --stdio # Total Lost Samples: 0 # # Samples: 13 of event 'cycles' # Event count (approx.): 869878 # # Overhead Source File Shared Object # ........ ........... ................ 40.97% . ld-2.20.so 20.62% paravirt.h [kernel.vmlinux] 20.02% . libc-2.20.so 14.23% rmap.c [kernel.vmlinux] 4.04% signal.c [kernel.vmlinux] 0.11% msr.h [kernel.vmlinux] # XXX: Investigate why that is not resolving on Fedora 21, Andi says he hasn't seen this on Fedora 22. Signed-off-by: Andi Kleen <ak@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1438988064-21834-1-git-send-email-andi@firstfloor.org [ Added column length update, from 0e65bdb3f90f ('perf hists: Update the column width for the "srcline" sort key') ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf hists: Update the column width for the "srcline" sort keyArnaldo Carvalho de Melo2015-08-101-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we introduce a new sort key, we need to update the hists__calc_col_len() function accordingly, otherwise the width will be limited to strlen(header). We can't update it when obtaining a line value for a column (for instance, in sort__srcline_cmp()), because we reset it all when doing a resort (see hists__output_recalc_col_len()), so we need to, from what is in the hist_entry fields, set each of the column widths. Cc: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: Jiri Olsa <jolsa@kernel.org> Fixes: 409a8be61560 ("perf tools: Add sort by src line/number") Link: http://lkml.kernel.org/n/tip-jgbe0yx8v1gs89cslr93pvz2@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf hists: hist_entry__cmp() may use he_tmp.hists, initialize itArnaldo Carvalho de Melo2015-08-101-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The iter_add_next_cumulative_entry() function calls hist_entry__cmp(), which may want to access the hists where this hist_entry is stored, initialize it to let that happen and avoid segfaults. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-iqg98sfn4fvwcxp0pdvqauie@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf tools: Unset perf_event_attr::freq when period term is setJiri Olsa2015-08-101-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We need to unset 'perf_event_attr::freq' bit (default 1) when 'period' term is specified within event definition like: -e 'cpu/cpu-cycles,call-graph=fp,time,period=100000' otherwise it will handle the period value as frequency (and fail if it crossed the maximum allowed frequency value). Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20150808171210.GC17040@krava.brq.redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf tools: Support full source file paths for srclineAndi Kleen2015-08-102-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For perf report/script srcline currently only the base file name of the source file is printed. This is a good default because it usually fits on the screen. But in some cases we want to know the full file name, for example to aggregate hits per file. In the later case we need more than the base file name to resolve file naming collisions: for example the kernel source has ~70 files named "core.c" It's also useful as input to post processing tools which want to point to the right file. Add a flag to allow full file name output. Add an option to perf report/script to enable this option. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1438986245-15191-1-git-send-email-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
OpenPOWER on IntegriCloud