summaryrefslogtreecommitdiffstats
path: root/tools/perf/util/sort.c
Commit message (Collapse)AuthorAgeFilesLines
...
* perf hists: Introduce hist_entry__filter()Namhyung Kim2016-02-241-0/+113
| | | | | | | | | | | | | | | | | | | | | | | | The hist_entry__filter() function is to filter hist entries using sort key related info. This is needed to support hierarchy mode since each hist entry will be associated with a hpp fmt which has a sort key. So each entry should compare to only matching type of filters. To do that, add the ->se_filter callback field to struct sort_entry. This callback takes 'type' argument which determines whether it's matching sort key or not. It returns -1 for non-matching type, 0 for filtered entry and 1 for not filtered entries. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Pekka Enberg <penberg@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1456326830-30456-6-git-send-email-namhyung@kernel.org [ 'socket' is reserved in sys/socket.h, so replace it with 'sk' ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Add helper functions for some sort keysNamhyung Kim2016-02-241-0/+33
| | | | | | | | | | | | | | | | The 'trace', 'srcline' and 'srcfile' sort keys updates hist entry's field later. With the hierarchy mode, those fields are passed to a matching entry so it needs to identify the sort keys. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1456326830-30456-2-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Introduce perf_mem__lck_scnprintf functionJiri Olsa2016-02-241-12/+2
| | | | | | | | | | | | | | Move meminfo's lck display function into mem-events.c object, so it could be reused later from script code. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1456303616-26926-9-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Introduce perf_mem__snp_scnprintf functionJiri Olsa2016-02-241-30/+1
| | | | | | | | | | | | | | Move meminfo's snp display function into mem-events.c object, so it could be reused later from script code. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1456303616-26926-8-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Introduce perf_mem__lvl_scnprintf functionJiri Olsa2016-02-241-49/+1
| | | | | | | | | | | | | | Move meminfo's lvl display function into mem-events.c object, so it could be reused later from script code. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1456303616-26926-7-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Introduce perf_mem__tlb_scnprintf functionJiri Olsa2016-02-241-42/+2
| | | | | | | | | | | | | | Move meminfo's tlb display function into mem-events.c object, so it could be reused later from script code. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1456303616-26926-6-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Use ARRAY_SIZE in mem sort display functionsJiri Olsa2016-02-231-6/+3
| | | | | | | | | | | | | There's no need to define extra macros for that. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1455525293-8671-13-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Make cl_address globalJiri Olsa2016-02-231-6/+0
| | | | | | | | | | | | | It'll be used in following patches. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1455525293-8671-3-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Fix column width setting on 'trace' sort keyNamhyung Kim2016-02-221-3/+0
| | | | | | | | | | | | | | | It missed to update column length of the 'trace' sort key in the hists__calc_col_len() so it might truncate the output. It calculated the column length in the ->cmp() callback originally but it doesn't guarantee it's called always. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1456064558-13086-5-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Fix alignment on some sort keysNamhyung Kim2016-02-221-4/+4
| | | | | | | | | | | | | | | The srcline, srcfile and trace sort keys can have long entries. With commit 89fee7094323 ("perf hists: Do column alignment on the format iterator"), it now aligns output with hist_entry__snprintf_alignment(). So each (possibly long) sort entries don't need to do it themselves. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1456101153-14519-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Update srcline/file if neededNamhyung Kim2016-02-221-33/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Normally the hist entry's srcline and/or srcfile is set during sorting. However sometime it's possible to a hist entry's srcline is not set yet after the sorting. This is because the entry is so unique and other sort keys already make it distinct. Then the srcline/file sort didn't have a chance to be called during the sorting. In that case it has NULL srcline/srcfile field and shows nothing. Before: $ perf report -s comm,sym,srcline ... Overhead Command Symbol ----------------------------------------------------------------- 34.42% swapper [k] intel_idle intel_idle.c:0 2.44% perf [.] __poll_nocancel (null) 1.70% gnome-shell [k] fw_domains_get (null) 1.04% Xorg [k] sock_poll (null) After: 34.42% swapper [k] intel_idle intel_idle.c:0 2.44% perf [.] __poll_nocancel .:0 1.70% gnome-shell [k] fw_domains_get fw_domains_get+42 1.04% Xorg [k] sock_poll socket.c:0 Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1456101111-14400-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Fix segfault on dynamic entriesNamhyung Kim2016-02-221-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | A dynamic entry is created for each tracepoint event. When it sets up the sort key, it checks with existing keys using ->equal() callback. But it missed to set the ->equal for dynamic entries. The following segfault was due to the missing ->equal() callback. (gdb) bt #0 0x0000000000140003 in ?? () #1 0x0000000000537769 in fmt_equal (b=0x2106980, a=0x21067a0) at ui/hist.c:548 #2 perf_hpp__setup_output_field (list=0x8c6d80 <perf_hpp_list>) at ui/hist.c:560 #3 0x00000000004e927e in setup_sorting (evlist=<optimized out>) at util/sort.c:2642 #4 0x000000000043cf50 in cmd_report (argc=<optimized out>, argv=<optimized out>, prefix=<optimized out>) at builtin-report.c:932 #5 0x00000000004865a1 in run_builtin (p=p@entry=0x8bbce0 <commands+192>, argc=argc@entry=7, argv=argv@entry=0x7ffd24d56ce0) at perf.c:390 #6 0x000000000042dc1f in handle_internal_command (argv=0x7ffd24d56ce0, argc=7) at perf.c:451 #7 run_argv (argv=0x7ffd24d56a70, argcp=0x7ffd24d56a7c) at perf.c:495 #8 main (argc=7, argv=0x7ffd24d56ce0) at perf.c:620 Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1456064558-13086-2-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf hists: Do column alignment on the format iteratorArnaldo Carvalho de Melo2016-02-121-10/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | We were doing column alignment in the format function for each cell, returning a string padded with spaces so that when the next column is printed the cursor is at its column alignment. This ends up needlessly printing trailing spaces, do it at the format iterator, that is where we know if it is needed, i.e. if there is more columns to be printed. This eliminates the need for triming lines when doing a dump using 'P' in the TUI browser and also produces far saner results with things like piping 'perf report' to 'less'. Right now only the formatters for sym->name and the 'locked' column (perf mem report), that are the ones that end up at the end of lines in the default 'perf report', 'perf top' and 'perf mem report' tools, the others will be done in a subsequent patch. In the end the 'width' parameter for the formatters now mean, in 'printf' terms, the 'precision', where before it was the field 'width'. Reported-by: Dave Jones <davej@codemonkey.org.uk> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/n/tip-s7iwl2gj23w92l6tibnrcqzr@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Add comment explaining the repsep_snprintf functionArnaldo Carvalho de Melo2016-02-121-1/+9
| | | | | | | | | Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-4j67nvlfwbnkg85b969ewnkr@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf hists: Add struct perf_hpp_list argument to helper functionsJiri Olsa2016-02-031-3/+3
| | | | | | | | | | | | | | | | | Adding struct perf_hpp_list argument to following helper functions: void perf_hpp__setup_output_field(struct perf_hpp_list *list); void perf_hpp__reset_output_field(struct perf_hpp_list *list); void perf_hpp__append_sort_keys(struct perf_hpp_list *list); so they could be used on hists's hpp_list. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1453109064-1026-24-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf hists: Introduce perf_hpp_list__for_each_format macroJiri Olsa2016-02-031-4/+4
| | | | | | | | | | | | Introducing perf_hpp_list__for_each_format macro to iterate perf_hpp_list object's output entries. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1453109064-1026-20-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf hists: Pass perf_hpp_list all the way through setup_output_listJiri Olsa2016-02-031-15/+18
| | | | | | | | | | | | Passing perf_hpp_list all the way through setup_output_list so the output entry could be added on the arbitrary list. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1453109064-1026-19-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf hists: Separate output fields parsing into setup_output_list functionJiri Olsa2016-02-031-12/+22
| | | | | | | | | | | | | Separating output fields parsing into setup_output_list function, so it's separated from field_order string setup and could be reused later in following patches. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1453109064-1026-14-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf hists: Separate sort fields parsing into setup_sort_list functionJiri Olsa2016-02-031-12/+22
| | | | | | | | | | | | | Separating sort fields parsing into setup_sort_list function, so it's separated from sort_order string setup and could be reused later in following patches. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1453109064-1026-13-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf hists: Properly release format fieldsJiri Olsa2016-02-031-0/+24
| | | | | | | | | | | | With multiple list holding format entries, we need the support properly releasing format output/sort fields. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1453109064-1026-12-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf hists: Allocate output sort fieldJiri Olsa2016-02-031-8/+33
| | | | | | | | | | | | | | | | Currently we use static output fields, because we have single global list of all sort/output fields. We will add hists specific sort and output lists in following patches, so we need all format entries to be dynamically allocated. Adding support to allocate output sort field. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1453109064-1026-10-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf hists: Add 'equal' method to perf_hpp_fmt structJiri Olsa2016-02-031-19/+20
| | | | | | | | | | | | To easily compare format entries and make it available for all kinds of format entries. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1453109064-1026-6-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf sort: Provide a way to find out if per-thread bucketing is in placeNamhyung Kim2016-01-261-0/+3
| | | | | | | | | | | | | | | | | | Now the UI browsers will be able to offer thread related operations only if the thread is part of the sort order in use, i.e. if hist_entry stats are all for a single thread. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org>, Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1452960197-5323-9-git-send-email-namhyung@kernel.org [ Carved out from a larger patch ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Add overhead/overhead_children keys defaults via stringJiri Olsa2016-01-081-0/+39
| | | | | | | | | | | | | | | | | | | | We currently set 'overhead' and 'overhead_children' as default sort keys within perf_hpp__init function by directly adding into the sort list. This patch adds 'overhead' and 'overhead_children' in text form into sort_keys and let them be added by standard sort dimension interface. We need to eliminate dirrect sort_list additions to be able to add support for hists specific sort keys. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Noel Grandin <noelgrandin@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1452158050-28061-12-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Add all matching dynamic sort keys for field nameNamhyung Kim2016-01-061-16/+32
| | | | | | | | | | | | | | | | | When a perf.data file has multiple events, it's likely to be similar (tracepoint) events. In that case, they might have same field name so add all of them to sort keys instead of bailing out. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1451991518-25673-2-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Make 'trace' or 'trace_fields' sort key default for tracepoint ↵Namhyung Kim2016-01-061-5/+25
| | | | | | | | | | | | | | | | | | | | | events When an evlist contains tracepoint events only, use 'trace' sort key as default. If --raw-trace option was given, use 'trace_fields' instead. This will make users more convenient to see trace result. Suggested-and-Acked-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1450804030-29193-14-git-send-email-namhyung@kernel.org [ Check evlist in get_default_sort_order() fixing a segfault in 'perf test hists' reported by Jiri Olsa ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Add 'trace_fields' dynamic sort keyNamhyung Kim2016-01-061-9/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The 'trace_fields' sort key is similar as 'trace' sort key, but it shows each fields separately. Each event will get different columns as their fields. $ perf report -s trace_fields --stdio # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 20K of event 'kmem:kmalloc' # Event count (approx.): 20533 # # Overhead Command call_site ptr bytes_req bytes_alloc gfp_flags # ........ ....... .................. .................. ......... ........... ................... # 99.89% perf ffffffffa01d4396 0xffff8803ffb79720 96 96 GFP_NOFS|GFP_ZERO 0.06% sleep ffffffff8114e1cd 0xffff8803d228a000 4096 4096 GFP_KERNEL 0.03% perf ffffffff811d6ae6 0xffff8803f7678f00 240 256 GFP_KERNEL|GFP_ZERO 0.00% perf ffffffff812263c1 0xffff880406172380 128 128 GFP_KERNEL 0.00% perf ffffffff812264b9 0xffff8803ffac1600 504 512 GFP_KERNEL 0.00% perf ffffffff81226634 0xffff880401dc5280 28 32 GFP_KERNEL 0.00% sleep ffffffff81226da9 0xffff8803ffac3a00 392 512 GFP_KERNEL # Samples: 20K of event 'kmem:kfree' # Event count (approx.): 20597 # # Overhead call_site ptr # ........ .................. .................. # 99.58% ffffffffa01d85ad 0xffff8803ffb79720 0.07% ffffffff81443f5c 0xffff8803f7669400 0.02% ffffffff811d5753 0xffff8803f7678f00 0.01% ffffffff81443f5c 0xffff8803f766be00 0.01% ffffffff8114e359 0xffff8803d228a000 0.01% ffffffff81443f5c 0xffff8800d156dc00 0.01% ffffffff81443f5c 0xffff8803f7669400 0.01% ffffffff8114e359 0xffff8803d228a000 0.01% ffffffff8114e359 0xffff8803d228a000 0.01% ffffffff8114e359 0xffff8803d228a000 Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1450804030-29193-13-git-send-email-namhyung@kernel.org [ Combined with "perf tools: Fix segfault when using -s trace_fields" ] Link: http://lkml.kernel.org/r/1451991518-25673-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Skip dynamic fields not defined for current eventNamhyung Kim2016-01-061-6/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When there are multiple events, each dynamic sort key is defined just for one event. In this case other events will always show "N/A" for those fields. But they are meaningless and consume precious screen width. Let's skip those undefined dynamic fields. $ perf record -e kmem:kmalloc,kmem:kfree -a sleep 1 $ perf report -s 'comm,kmalloc.*' --stdio # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 20K of event 'kmem:kmalloc' # Event count (approx.): 20533 # # Overhead Command call_site ptr bytes_req bytes_alloc gfp_flags # ........ ....... .................. .................. ......... ........... ................... # 99.89% perf ffffffffa01d4396 0xffff8803ffb79720 96 96 GFP_NOFS|GFP_ZERO 0.06% sleep ffffffff8114e1cd 0xffff8803d228a000 4096 4096 GFP_KERNEL 0.03% perf ffffffff811d6ae6 0xffff8803f7678f00 240 256 GFP_KERNEL|GFP_ZERO 0.00% perf ffffffff812263c1 0xffff880406172380 128 128 GFP_KERNEL 0.00% perf ffffffff812264b9 0xffff8803ffac1600 504 512 GFP_KERNEL 0.00% perf ffffffff81226634 0xffff880401dc5280 28 32 GFP_KERNEL 0.00% sleep ffffffff81226da9 0xffff8803ffac3a00 392 512 GFP_KERNEL # Samples: 20K of event 'kmem:kfree' # Event count (approx.): 20597 # # Overhead Command # ........ .............. # 99.63% perf 0.14% sleep 0.11% irq/36-iwlwifi 0.11% kworker/u16:0 0.01% Xorg 0.00% firefox Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1450804030-29193-12-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Support '<event>.*' dynamic sort keyNamhyung Kim2016-01-061-15/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Support '*' character for field name to add all (non-common) fields as sort keys easily. $ perf report -s 'switch.*' --stdio ... # Overhead prev_comm prev_pid prev_prio prev_state next_comm next_pid next_prio # ........ ........... ......... ......... .......... ............ ........ ......... # 3.82% swapper/0 0 120 0 netctl-auto 18711 120 3.75% netctl-auto 18711 120 1 swapper/0 0 120 2.24% swapper/1 0 120 0 netctl-auto 18709 120 2.24% netctl-auto 18709 120 1 swapper/1 0 120 1.80% swapper/2 0 120 0 rcu_preempt 7 120 1.80% swapper/2 0 120 0 netctl-auto 18711 120 1.80% rcu_preempt 7 120 1 swapper/2 0 120 1.80% netctl-auto 18711 120 1 swapper/2 0 120 ... Suggested-and-acked-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1450804030-29193-11-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Support shortcuts for events in dynamic sort keysNamhyung Kim2016-01-061-20/+87
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The dynamic sort key requires event name but specifying full event name is rather inconvenient. This patch adds more ways to identify the event in a more compact way. 1. If session has just one event, event name can be omitted. 2. Events can be accessed by index preceded by a percent sign. 3. A part of the name can be used, if it's not ambiguous. The partial name should not contain ':' in it. 4. Full system + event name is still used, it should contain ':'. So in the below example all does same thing: $ perf record -e sched:sched_switch -a sleep 1 $ perf report -s next_pid,next_comm $ perf report -s %1.next_pid,%1.next_comm $ perf report -s switch.next_pid,switch.next_comm $ perf report -s sched:sched_switch.next_pid,sched:sched_switch.next_comm Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1450804030-29193-10-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf report/top: Add --raw-trace optionNamhyung Kim2016-01-061-3/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The --raw-trace option allows disabling pretty printing by the event's print_fmt or plugin. Besides that, each dynamic sort key now can receive a 'raw' suffix separated by '/' to ask for the raw trace of a specific field. $ perf report -s comm,kmem:kmalloc.gfp_flags ... # Overhead Command gfp_flags # ........ ....... ................... # 99.89% perf GFP_NOFS|GFP_ZERO 0.06% sleep GFP_KERNEL 0.03% perf GFP_KERNEL|GFP_ZERO 0.01% perf GFP_KERNEL Now $ perf report -s comm,kmem:kmalloc.gfp_flags --raw-trace or $ perf report -s comm,kmem:kmalloc.gfp_flags/raw ... # Overhead Command gfp_flags # ........ ....... .......... # 99.89% perf 32848 0.06% sleep 208 0.03% perf 32976 0.01% perf 208 Suggested-and-Acked-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1450804030-29193-9-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Add 'trace' sort keyNamhyung Kim2016-01-061-16/+60
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The 'trace' sort key is to show tracepoint event output using either print fmt or plugin. For example sched_switch event (using plugin) will show output like below: # perf record -e sched:sched_switch -a usleep 10 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.197 MB perf.data (69 samples) ] # $ perf report -s trace --stdio ... # Overhead Trace output # ........ ................................................... # 9.48% swapper/0:0 [120] R ==> transmission-gt:17773 [120] 9.48% transmission-gt:17773 [120] S ==> swapper/0:0 [120] 9.04% swapper/2:0 [120] R ==> transmission-gt:17773 [120] 8.92% transmission-gt:17773 [120] S ==> swapper/2:0 [120] 5.25% swapper/0:0 [120] R ==> kworker/0:1H:109 [100] 5.21% kworker/0:1H:109 [100] S ==> swapper/0:0 [120] 1.78% swapper/3:0 [120] R ==> transmission-gt:17773 [120] 1.78% transmission-gt:17773 [120] S ==> swapper/3:0 [120] 1.53% Xephyr:6524 [120] S ==> swapper/0:0 [120] 1.53% swapper/0:0 [120] R ==> Xephyr:6524 [120] 1.17% swapper/2:0 [120] R ==> irq/33-iwlwifi:233 [49] 1.13% irq/33-iwlwifi:233 [49] S ==> swapper/2:0 [120] Note that the 'trace' sort key works only for tracepoint events. If it's used to other type of events, just "N/A" will be printed. Suggested-and-acked-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1450804030-29193-8-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Try to show pretty printed output for dynamic sort keysNamhyung Kim2016-01-061-5/+100
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Each tracepoint event has format string for print to improve readability. Try to parse the output and match the field name. If it finds one, use that for the result. If not, fallbacks to the original output. For example, sort on kmem:kmalloc.gfp_flags looks like below: (Note: libtraceevent plugins are not installed on my system. They might affect the output below) Before: # Overhead Command gfp_flags # ........ ....... .......... # 99.89% perf 32848 0.06% sleep 208 0.03% perf 32976 0.01% perf 208 After: # Overhead Command gfp_flags # ........ ....... ................... # 99.89% perf GFP_NOFS|GFP_ZERO 0.06% sleep GFP_KERNEL 0.03% perf GFP_KERNEL|GFP_ZERO 0.01% perf GFP_KERNEL Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1450804030-29193-7-git-send-email-namhyung@kernel.org [ Fixed clash with earlier, updated patch in this patchkit ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Add dynamic sort key for tracepoint eventsNamhyung Kim2016-01-061-0/+213
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The existing sort keys are less useful for tracepoint events in that they are always sampled at the same place, the function where the tracepoint is located. For example, a 'perf report' on sched:sched_switch event looks like the following: # Overhead Command Shared Object Symbol # ........ ............... ................ .............. # 47.22% swapper [kernel.vmlinux] [k] __schedule 21.67% transmission-gt [kernel.vmlinux] [k] __schedule 8.23% netctl-auto [kernel.vmlinux] [k] __schedule 5.53% kworker/0:1H [kernel.vmlinux] [k] __schedule 1.98% Xephyr [kernel.vmlinux] [k] __schedule 1.33% irq/33-iwlwifi [kernel.vmlinux] [k] __schedule 1.17% wpa_cli [kernel.vmlinux] [k] __schedule 1.13% rcu_preempt [kernel.vmlinux] [k] __schedule 0.85% ksoftirqd/0 [kernel.vmlinux] [k] __schedule 0.77% Timer [kernel.vmlinux] [k] __schedule In fact, tracepoints have meaningful information in their fields but there's no way to use in 'perf report' currently. The dynamic sort keys are introduced in this patc to overcome this limitation. The sched:sched_switch events have following fields: # sudo cat /sys/kernel/debug/tracing/events/sched/sched_switch/format name: sched_switch ID: 268 format: field:unsigned short common_type; offset:0; size:2; signed:0; field:unsigned char common_flags; offset:2; size:1; signed:0; field:unsigned char common_preempt_count; offset:3; size:1; signed:0; field:int common_pid; offset:4; size:4; signed:1; field:char prev_comm[16]; offset:8; size:16; signed:1; field:pid_t prev_pid; offset:24; size:4; signed:1; field:int prev_prio; offset:28; size:4; signed:1; field:long prev_state; offset:32; size:8; signed:1; field:char next_comm[16]; offset:40; size:16; signed:1; field:pid_t next_pid; offset:56; size:4; signed:1; field:int next_prio; offset:60; size:4; signed:1; print fmt: "prev_comm=%s prev_pid=%d prev_prio=%d prev_state=%s%s ==> next_comm=%s next_pid=%d next_prio=%d", REC->prev_comm, REC->prev_pid, REC->prev_prio, REC->prev_state & (2048-1) ? __print_flags(REC->prev_state & (2048-1), "|", { 1, "S"} , { 2, "D" }, { 4, "T" }, { 8, "t" }, { 16, "Z" }, { 32, "X" }, { 64, "x" }, { 128, "K"}, { 256, "W" }, { 512, "P" }, { 1024, "N" }) : "R", REC->prev_state & 2048 ? "+" : "", REC->next_comm, REC->next_pid, REC->next_prio With dynamic sort keys, you can use <event.field> as a sort key. Those dynamic keys are checked and created on demand. For instance, below is to sort by next_pid field output on the same data file: $ perf report -s comm,sched:sched_switch.next_pid --stdio ... # Overhead Command next_pid # ........ ............... .......... # 21.23% transmission-gt 0 20.86% swapper 17773 6.62% netctl-auto 0 5.25% swapper 109 5.21% kworker/0:1H 0 1.98% Xephyr 0 1.98% swapper 6524 1.98% swapper 27478 1.37% swapper 27476 1.17% swapper 233 Multiple dynamic sort keys are also supported: $ perf report -s comm,sched:sched_switch.next_pid,sched:sched_switch.next_comm --stdio ... # Overhead Command next_pid next_comm # ........ ............... .......... ................ # 20.86% swapper 17773 transmission-gt 9.64% transmission-gt 0 swapper/0 9.16% transmission-gt 0 swapper/2 5.25% swapper 109 kworker/0:1H 5.21% kworker/0:1H 0 swapper/0 2.14% netctl-auto 0 swapper/2 1.98% netctl-auto 0 swapper/0 1.98% swapper 6524 Xephyr 1.98% swapper 27478 netctl-auto 1.78% transmission-gt 0 swapper/3 1.53% Xephyr 0 swapper/0 1.29% netctl-auto 0 swapper/1 1.29% swapper 27476 netctl-auto 1.21% netctl-auto 0 swapper/3 1.17% swapper 233 irq/33-iwlwifi Note that pid 0 exists for each cpu so have comm of 'swapper/N'. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1450804030-29193-6-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Pass evlist to setup_sorting()Namhyung Kim2016-01-061-6/+9
| | | | | | | | | | | | | | | | | | This is a preparation to support dynamic sort keys for tracepoint events. Dynamic sort keys can be created for specific fields in trace events so it needs the event information. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1450804030-29193-5-git-send-email-namhyung@kernel.org [ Moving the evlist creation earlier in top was split to a previous patch ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Introduce hpp_dimension__add_output functionJiri Olsa2015-10-061-0/+6
| | | | | | | | | | | | This function will allow to register output column from ui code and respect taken sort/output dimensions. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1444134312-29136-3-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Get rid of superfluos call to reset_dimensionsJiri Olsa2015-10-061-2/+0
| | | | | | | | | | | | | | There's no need to call reset_dimensions within __setup_output_field function. It's already called in its caller setup_sorting right before perf_hpp__init, which will be changed in following patch to respect taken dimension. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1444134312-29136-2-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Add support for sorting on the iaddrDon Zickus2015-10-051-0/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | Sorting on 'symbol' gives to broad a resolution as it can cover a range of IP address. Use the iaddr instead to get proper sorting on IP addresses. Need to use the 'mem_sort' feature of perf record. New sort option is: symbol_iaddr, header label is 'Code Symbol'. $ perf mem report --stdio -F +symbol_iaddr # Overhead Samples Code Symbol Local Weight # ........ ............ ........................ ............ # 54.08% 1 [k] nmi_handle 192 4.51% 1 [k] finish_task_switch 16 3.66% 1 [.] malloc 13 3.10% 1 [.] __strcoll_l 11 Signed-off-by: Don Zickus <dzickus@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1444068369-20978-8-git-send-email-jolsa@kernel.org Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Introduce new sort type "socket" for the processor socketKan Liang2015-09-141-0/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch enable perf report to sort by processor socket: $ perf report --stdio --sort socket,comm,dso,symbol # To display the perf.data header info, please use --header/--header-only options. # # Total Lost Samples: 0 # # Samples: 686 of event 'cycles' # Event count (approx.): 349215462 # # Overhead SOCKET Command Shared Object Symbol # ........ ...... ....... ................ ............................ # 97.05% 000 test test [.] plusB_c 0.98% 000 test test [.] plusA_c 0.93% 001 perf [kernel.vmlinux] [k] smp_call_function_single 0.19% 001 perf [kernel.vmlinux] [k] page_fault 0.19% 001 swapper [kernel.vmlinux] [k] pm_qos_request 0.16% 000 test [kernel.vmlinux] [k] add_mm_counter_fast Signed-off-by: Kan Liang <kan.liang@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1441377946-44429-2-git-send-email-kan.liang@intel.com [ Fix col calc, un-allcapsify col header & read the topology when not using perf.data ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Always use non inlined file name for 'srcfile' sort keyAndi Kleen2015-09-021-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | When profiling the kernel with the 'srcfile' sort key it's common to "get stuck" in include. For example a lot of code uses current or other inlines, so they get accounted to some random include file. This is not very useful as a high level categorization. For example just profiling the idle loop usually shows mostly inlines, so you never see the actual cpuidle file. This patch changes the 'srcfile' sort key to always unwind the inline stack using BFD/DWARF. So we always account to the base function that called the inline. In a few cases include is still shown (for example for MSR accesses), but that is because they get inlining expanded as part of assigning to a global function pointer. For the majority it works fine though. v2: Use simpler while loop. Add maximum iteration count. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/1441133239-31254-1-git-send-email-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf sort: Check for SRCLINE_UNKNOWN case in "srcfile" processingAndi Kleen2015-08-121-0/+2
| | | | | | | | | | | | | | | Handle the SRCLINE_UNKNOWN case correctly when processing "srcfile". Commiter note: We can't just free it, as it was't allocated via malloc, its a guard variable. Reported-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20150811133655.GC4524@tassilo.jf.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf report: Add support for srcfile sort keyAndi Kleen2015-08-101-0/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In some cases it's useful to characterize samples by file. This is useful to get a higher level categorization, for example to map cost to subsystems. Add a srcfile sort key to perf report. It builds on top of the existing srcline support. Commiter notes: E.g.: # perf record -F 10000 usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.016 MB perf.data (13 samples) ] [root@zoo ~]# perf report -s srcfile --stdio # Total Lost Samples: 0 # # Samples: 13 of event 'cycles' # Event count (approx.): 869878 # # Overhead Source File # ........ ........... 60.99% . 20.62% paravirt.h 14.23% rmap.c 4.04% signal.c 0.11% msr.h # The first line is collecting all the files for which srcfiles couldn't somehow get resolved to: # perf report -s srcfile,dso --stdio # Total Lost Samples: 0 # # Samples: 13 of event 'cycles' # Event count (approx.): 869878 # # Overhead Source File Shared Object # ........ ........... ................ 40.97% . ld-2.20.so 20.62% paravirt.h [kernel.vmlinux] 20.02% . libc-2.20.so 14.23% rmap.c [kernel.vmlinux] 4.04% signal.c [kernel.vmlinux] 0.11% msr.h [kernel.vmlinux] # XXX: Investigate why that is not resolving on Fedora 21, Andi says he hasn't seen this on Fedora 22. Signed-off-by: Andi Kleen <ak@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1438988064-21834-1-git-send-email-andi@firstfloor.org [ Added column length update, from 0e65bdb3f90f ('perf hists: Update the column width for the "srcline" sort key') ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf report: Display cycles in branch sort modeAndi Kleen2015-08-061-1/+1
| | | | | | | | | | | | | Display the cycles by default in branch sort mode. To make enough room for the new column I removed dso_to. It is usually redundant with dso_from. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1437233094-12844-9-git-send-email-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Add support for cycles, weight branch_info fieldAndi Kleen2015-08-061-0/+24
| | | | | | | | | | | | | | | | | | | | | cycles is a new branch_info field available on some CPUs that indicates the time deltas between branches in the LBR. Add a sort key and output code for the cycles to allow to display the basic block cycles individually in perf report. We also pass in the cycles for weight when LBRs are processed, which allows to get global and local weight, to get an estimate of the total cost. And also print the cycles information for perf report -D. I also added printing for the previously missing LBR flags (mispredict etc.) Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1437233094-12844-2-git-send-email-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf report: Fix sort__sym_cmp to also compare end of symbolYannick Brosseau2015-06-191-5/+3
| | | | | | | | | | | | | | | | | | When using a map file from a JIT, due to memory reuse, we can obtain multiple symbols with the same start address but a different length. The symbols__find does check for the end so not doing it in sort__sym_cmp was causing the hist_entry in the annotate part of a report to match to the wrong entry, causing a fatal error. Signed-off-by: Yannick Brosseau <scientist@fb.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: kernel-team@fb.com Link: http://lkml.kernel.org/r/1434584470-17771-1-git-send-email-scientist@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Fix "Command" sort_entry's cmp and collapse functionJiri Olsa2015-05-151-2/+2
| | | | | | | | | | | | | | | | | | | | Currently the se_cmp and se_collapse use pointer comparison, which is ok for for testing equality of strings. It's not ok as comparing function for rbtree insertion, because it gives different results based on current pointer values. We saw test 32 (hists cumulation test) failing based on different environment setup. Having all sort functions straightened fix the test for us. Reported-by: Jan Stancek <jstancek@redhat.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jan Stancek <jstancek@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf diff: Support for different binariesKan Liang2015-02-271-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, the perf diff only works with same binaries. That's because it compares the symbol start address. It doesn't work if the perf.data comes from different binaries. This patch matches the symbol names. Actually, perf diff once intended to compare the symbol names. The commit as below can look for a pair by name. 604c5c92972d (perf diff: Change the default sort order to "dso,symbol") However, at that time, perf diff used a global list of dsos. That means the binaries which has same name can only be loaded once. That's a problem for comparing different binaries. For example, we have an old binary and an updated binary. They very likely have same name and most of the functions, so only dsos from old binary will be loaded. When processing the data from updated binary, perf still use the symbol information from old binary. That's wrong. Then the commit as below used IP to replace symbol name. 9c443dfdd31e ("perf diff: Fix support for all --sort combinations") >From that time, perf diff starts to compare the symbol address. The global dsos is discarded from a patch in 2010. a1645ce12adb ("perf: 'perf kvm' tool for monitoring guest performance from host") However, at that time, perf diff already compared by address. So perf diff cannot work for different binaries as well. This patch actually rolls back the perf diff to original design. The document is also changed, so everybody knows the original design is to compare the symbol names. Here are some examples: The only difference between example_v1.c and example_v2.c is the location of f2 and f3. There is no change in behavior, but the previous perf diff display the wrong differential profile. example_v1.c noinline void f3(void) { volatile int i; for (i = 0; i < 10000;) { if(i%2) i++; else i++; } } noinline void f2(void) { volatile int a = 100, b, c; for (b = 0; b < 10000; b++) c = a * b; } noinline void f1(void) { f2(); f3(); } int main() { int i; for (i = 0; i < 100000; i++) f1(); } example_v2.c noinline void f2(void) { volatile int a = 100, b, c; for (b = 0; b < 10000; b++) c = a * b; } noinline void f3(void) { volatile int i; for (i = 0; i < 10000;) { if(i%2) i++; else i++; } } noinline void f1(void) { f2(); f3(); } int main() { int i; for (i = 0; i < 100000; i++) f1(); } [lk@localhost perf_diff]$ gcc example_v1.c -o example [lk@localhost perf_diff]$ perf record -o example_v1.data ./example [ perf record: Woken up 4 times to write data ] [ perf record: Captured and wrote 0.813 MB example_v1.data (~35522 samples) ] [lk@localhost perf_diff]$ gcc example_v2.c -o example [lk@localhost perf_diff]$ perf record -o example_v2.data ./example [ perf record: Woken up 4 times to write data ] [ perf record: Captured and wrote 0.824 MB example_v2.data (~36015 samples) ] Old perf diff result: [lk@localhost perf_diff]$ perf diff example_v1.data example_v2.data Event 'cycles' Baseline Delta Shared Object Symbol ........ ....... ................ ............................... [kernel.vmlinux] [k] __perf_event_task_sched_out 0.00% [kernel.vmlinux] [k] apic_timer_interrupt [kernel.vmlinux] [k] idle_cpu [kernel.vmlinux] [k] intel_pstate_timer_func [kernel.vmlinux] [k] native_read_msr_safe 0.00% [kernel.vmlinux] [k] native_read_tsc 0.00% [kernel.vmlinux] [k] native_write_msr_safe [kernel.vmlinux] [k] ntp_tick_length 0.00% [kernel.vmlinux] [k] rb_erase 0.00% [kernel.vmlinux] [k] tick_sched_timer 0.00% [kernel.vmlinux] [k] unmap_single_vma 0.00% [kernel.vmlinux] [k] update_wall_time 0.00% example [.] f1 46.24% example [.] f2 53.71% -7.55% example [.] f3 +53.81% example [.] f3 0.02% example [.] main New perf diff result: [lk@localhost perf_diff]$ perf diff example_v1.data example_v2.data [kernel.vmlinux] [k] __perf_event_task_sched_out 0.00% [kernel.vmlinux] [k] apic_timer_interrupt [kernel.vmlinux] [k] idle_cpu [kernel.vmlinux] [k] intel_pstate_timer_func [kernel.vmlinux] [k] native_read_msr_safe 0.00% [kernel.vmlinux] [k] native_read_tsc 0.00% [kernel.vmlinux] [k] native_write_msr_safe [kernel.vmlinux] [k] ntp_tick_length 0.00% [kernel.vmlinux] [k] rb_erase 0.00% [kernel.vmlinux] [k] tick_sched_timer 0.00% [kernel.vmlinux] [k] unmap_single_vma 0.00% [kernel.vmlinux] [k] update_wall_time 0.00% example [.] f1 46.24% -0.08% example [.] f2 53.71% +0.11% example [.] f3 0.02% example [.] main Signed-off-by: Kan Liang <kan.liang@intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1423460384-11645-1-git-send-email-kan.liang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Pass struct perf_hpp_fmt to its callbacksNamhyung Kim2015-01-211-3/+34
| | | | | | | | | | | | | | | | | Currently ->cmp, ->collapse and ->sort callbacks doesn't pass corresponding fmt. But it'll be needed by upcoming changes in perf diff command. Suggested-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1420677949-6719-6-git-send-email-namhyung@kernel.org [ fix build by passing perf_hpp_fmt pointer to hist_entry__cmp_ methods ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf callchain: Make get_srcline fall back to sym+offsetAndi Kleen2014-11-241-2/+4
| | | | | | | | | | | | | | | | When the source line is not found fall back to sym + offset. This is generally much more useful than a raw address. For this we need to pass in the symbol from the caller. For some callers it's awkward to compute, so we stay at the old behaviour. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1415844328-4884-10-git-send-email-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf hists: Fix up srcline histogram key formattingArnaldo Carvalho de Melo2014-11-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem introduced in: commit 5b5916696051 "perf report: Honor column width setting" Where the left justification signal was after the width, which ended up, when the width was, say, 11, always printing: %11.11-s Instead of src:line left justified and limited to 11 chars. Resulting in a like: 70.93% %11.11-s [.] f2 tcall When it should instead be: 70.93% tcall.c:5 [.] f2 tcall Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-2xnt0vqkoox52etq2qhyetr0@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
OpenPOWER on IntegriCloud