summaryrefslogtreecommitdiffstats
path: root/tools/perf/util
Commit message (Collapse)AuthorAgeFilesLines
* perf probe: Ensure offset provided is not greater than function length ↵Prashanth Nageshappa2012-02-291-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | without DWARF info too The 'perf probe' command allows kprobe to be inserted at any offset from a function start, which results in adding kprobes to unintended location. (example: perf probe do_fork+10000 is allowed even though size of do_fork is ~904). My previous patch https://lkml.org/lkml/2012/2/24/42 addressed the case where DWARF info was available for the kernel. This patch fixes the case where perf probe is used on a kernel without debuginfo available. Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Jason Baron <jbaron@redhat.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/4F4C544D.1010909@linux.vnet.ibm.com Signed-off-by: Prashanth Nageshappa <prashanth@linux.vnet.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Ensure comm string is properly terminatedDavid Ahern2012-02-291-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | If threads in a multi-threaded process have names shorter than the main thread the comm for the named threads is not properly terminated. E.g., for the process 'namedthreads' where each thread is named noploop%d where %d is the thread number: Before: perf script -f comm,tid,ip,sym,dso noploop:4ads 21616 400a49 noploop (/tmp/namedthreads) The 'ads' in the thread comm bleeds over from the process name. After: perf script -f comm,tid,ip,sym,dso noploop:4 21616 400a49 noploop (/tmp/namedthreads) Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1330111898-68071-1-git-send-email-dsahern@gmail.com Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf probe: Ensure offset provided is not greater than function lengthPrashanth Nageshappa2012-02-291-1/+11
| | | | | | | | | | | | | | | | | | | | | | The perf probe command allows kprobe to be inserted at any offset from a function start, which results in adding kprobes to unintended location. Example: perf probe do_fork+10000 is allowed even though size of do_fork is ~904. This patch will ensure probe addition fails when the offset specified is greater than size of the function. Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jason Baron <jbaron@redhat.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Link: http://lkml.kernel.org/r/4F473F33.4060409@linux.vnet.ibm.com Signed-off-by: Prashanth Nageshappa <prashanth@linux.vnet.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf evlist: Return first evsel for non-sample event on old kernelNamhyung Kim2012-02-291-0/+4
| | | | | | | | | | | | | | | | | On old kernels that don't support sample_id_all feature, perf_evlist__id2evsel() returns NULL for non-sampling events. This breaks perf top when multiple events are given on command line. Fix it by using first evsel in the evlist. This will also prevent getting the same (potential) problem in such new tool/ old kernel combo. Suggested-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1329702447-25045-1-git-send-email-namhyung.kim@lge.com Signed-off-by: Namhyung Kim <namhyung.kim@lge.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf evsel: Fix an issue where perf report fails to show the proper percentageNaveen N. Rao2012-02-061-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes an issue where perf report shows nan% for certain perf.data files. The below is from a report for a do_fork probe: -nan% sshd [kernel.kallsyms] [k] do_fork -nan% packagekitd [kernel.kallsyms] [k] do_fork -nan% dbus-daemon [kernel.kallsyms] [k] do_fork -nan% bash [kernel.kallsyms] [k] do_fork A git bisect shows commit f3bda2c as the cause. However, looking back through the git history, I saw commit 640c03c which seems to have removed the required initialization for perf_sample->period. The problem only started showing after commit f3bda2c. The below patch re-introduces the initialization and it fixes the problem for me. With the below patch, for the same perf.data: 73.08% bash [kernel.kallsyms] [k] do_fork 8.97% 11-dhclient [kernel.kallsyms] [k] do_fork 6.41% sshd [kernel.kallsyms] [k] do_fork 3.85% 20-chrony [kernel.kallsyms] [k] do_fork 2.56% sendmail [kernel.kallsyms] [k] do_fork This patch applies over current linux-tip commit 9949284. Problem introduced in: $ git describe 640c03c v2.6.37-rc3-83-g640c03c Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Robert Richter <robert.richter@amd.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: stable@kernel.org Link: http://lkml.kernel.org/r/20120203170113.5190.25558.stgit@localhost6.localdomain6 Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Fix prefix matching for kernel mapsJiri Olsa2012-02-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | In some perf ancient versions we used '[kernel.kallsyms._text]' as the name for the kernel map. This got changed with commit: perf: 'perf kvm' tool for monitoring guest performance from host commit a1645ce12adb6c9cc9e19d7695466204e3f017fe Author: Zhang, Yanmin <yanmin_zhang@linux.intel.com> and we started to use following name '[kernel.kallsyms]_text'. This name change is important for the report code dealing with ancient perf data. When processing the kernel map event, we need to recognize the old naming (dont match the last ']') and initialize the kernel map correctly. The subsequent call to maps__set_kallsyms_ref_reloc_sym deals with the superfluous ']' to get correct symbol name. Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1328461865-6127-1-git-send-email-jolsa@redhat.com Signed-off-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Fix strlen() bug in perf_event__synthesize_event_type()Stephane Eranian2012-01-301-1/+1
| | | | | | | | | | | | | | | | | | The event_type record has a max length for the event name. It's called MAX_EVENT_NAME. The name may be truncated to fit the max length. But the header.size still reflects the original name length. If that length is > MAX_EVENT_NAME, then the header.size field is bogus. Fix this by using the length of the name after the potential truncation. Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20120120094912.GA4882@quad Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Fix broken build by defining _GNU_SOURCE in MakefileDavid Daney2012-01-306-9/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When building on my Debian/mips system, util/util.c fails to build because commit 1aed2671738785e8f5aea663a6fda91aa7ef59b5 (perf kvm: Do guest-only counting by default) indirectly includes stdio.h before the feature selection in util.h is done. This prevents _GNU_SOURCE in util.h from enabling the declaration of getline(), from now second inclusion of stdio.h, and the build is broken. There is another breakage in util/evsel.c caused by include ordering, but I didn't fully track down the commit that caused it. The root cause of all this is an inconsistent definition of _GNU_SOURCE, so I move the definition into the Makefile so that it is passed to all invocations of the compiler and used uniformly for all system header files. All other #define and #undef of _GNU_SOURCE are removed as they cause conflicts with the definition passed to the compiler. All the features.h definitions (_LARGEFILE64_SOURCE _FILE_OFFSET_BITS=64 and _GNU_SOURCE) are needed by the python glue code too, so they are moved to BASIC_CFLAGS, and the misleading comments about BASIC_CFLAGS are removed. This gives me a clean build on x86_64 (fc12) and mips (Debian). Cc: David Daney <david.daney@cavium.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Joerg Roedel <joerg.roedel@amd.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1326836461-11952-1-git-send-email-ddaney.cavm@gmail.com Signed-off-by: David Daney <david.daney@cavium.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Fix compile error on x86_64 UbuntuNamhyung Kim2012-01-081-1/+0
| | | | | | | | | | | | | | | | | | | | The ctype.h include is not needed here and it breaks build on some systems (at least 64bit Ubuntu 10.04) like below. Just get rid of it. CC util/trace-event-info.o cc1: warnings being treated as errors util/trace-event-info.c: In function ‘record_file’: util/trace-event-info.c:192: error: implicit declaration of function ‘pwrite’ util/trace-event-info.c:192: error: nested extern declaration of ‘pwrite’ make: *** [util/trace-event-info.o] Error 1 Cc: Ingo Molnar <mingo@elte.hu> Cc: Joerg Roedel <joerg.roedel@amd.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1326035430-7621-1-git-send-email-namhyung@gmail.com Signed-off-by: Namhyung Kim <namhyung@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf report: Fix --stdio output alignment when --showcpuutilization usedNamhyung Kim2012-01-081-16/+18
| | | | | | | | | | | | | | | | | | | | | | | | | Current perf report output is broken if --showcpuutilization is used. Combination with -n and/or --show-total-period make things worse. This patch fixes it as follows: before: 48.25% 48.25% 0.00% sleep [kernel.kallsyms] [k] trace_hardirqs_off 34.99% 34.99% 0.00% sleep [kernel.kallsyms] [k] __find_get_block_slow 15.99% 15.99% 0.00% sleep [kernel.kallsyms] [k] lock_release_holdtime 0.77% 0.77% 0.00% sleep [kernel.kallsyms] [k] native_write_msr_safe after: 48.25% 48.25% 0.00% sleep [kernel.kallsyms] [k] trace_hardirqs_off 34.99% 34.99% 0.00% sleep [kernel.kallsyms] [k] __find_get_block_slow 15.99% 15.99% 0.00% sleep [kernel.kallsyms] [k] lock_release_holdtime 0.77% 0.77% 0.00% sleep [kernel.kallsyms] [k] native_write_msr_safe Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1325957132-10600-8-git-send-email-namhyung@gmail.com Signed-off-by: Namhyung Kim <namhyung@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Add support for guest/host-only profilingJoerg Roedel2012-01-061-2/+12
| | | | | | | | | | | | | | | | | To restrict a counter to either host or guest mode this patch introduces two new event modifiers: G and H. With G the counter is configured in guest-only mode and with H in host-only mode. Cc: Gleb Natapov <gleb@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Link: http://lkml.kernel.org/n/tip-or5aj3rghy9ngyg882z6kln9@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf kvm: Do guest-only counting by defaultJoerg Roedel2012-01-064-1/+24
| | | | | | | | | | | | | | | Make use of exclude_guest and exlude_host in perf-kvm to do only guest-only counting by default. Cc: Gleb Natapov <gleb@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> [ committer note: Moved perf_{guest,host} & event_attr_init to util.c ] [ so as not to drag more stuff to the python binding] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf hists: Stop using 'self' for struct hist_entryArnaldo Carvalho de Melo2012-01-062-48/+48
| | | | | | | | | | | | | | | Stop using this python/OOP convention, doesn't really helps. Will do more from time to time till we get it cleaned up in all of /perf. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-me4dyj6s5snh7jr8wb9gzt82@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf hists: Rename total_session to total_periodArnaldo Carvalho de Melo2012-01-061-8/+8
| | | | | | | | | | | | | | Nowadays we do it per evsel, not per session (that may have multiple evsels), so rename it to avoid confusion. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-azsgomr5h4dmaudoogw48w49@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf session: Remove impossible condition checkNamhyung Kim2012-01-031-2/+1
| | | | | | | | | | | | The 'size' cannot be 0 because it was set to 8 on the above line in case it was 0 and never changed. Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1325000151-4463-1-git-send-email-namhyung@gmail.com Signed-off-by: Namhyung Kim <namhyung@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Fix feature-bits rework fallout, remove unused variableIngo Molnar2011-12-291-3/+0
| | | | | | | | | | Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Robert Richter <robert.richter@amd.com> Link: http://lkml.kernel.org/n/tip-lfckuwbl8m1ykb7t9ydsxe4r@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
* perf script: Add generic perl handler to process eventsRobert Richter2011-12-231-6/+67
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current perf scripting facility only supports tracepoints. This patch implements a generic perl handler to support other events than tracepoints too. This patch introduces a function process_event() that is called by perf for each sample. The function is called with byte streams as arguments containing information about the event, its attributes, the sample and raw data. Perl's unpack() function can easily be used for byte decoding. The following is the default implementation for process_event() that can also be generated with perf script: # Packed byte string args of process_event(): # # $event: union perf_event util/event.h # $attr: struct perf_event_attr linux/perf_event.h # $sample: struct perf_sample util/event.h # $raw_data: perf_sample->raw_data util/event.h sub process_event { my ($event, $attr, $sample, $raw_data) = @_; my @event = unpack("LSS", $event); my @attr = unpack("LLQQQQQLLQQ", $attr); my @sample = unpack("QLLQQQQQLL", $sample); my @raw_data = unpack("C*", $raw_data); use Data::Dumper; print Dumper \@event, \@attr, \@sample, \@raw_data; } Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1323969824-9711-4-git-send-email-robert.richter@amd.com Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Use for_each_set_bit() to iterate over feature flagsRobert Richter2011-12-233-93/+149
| | | | | | | | | | | | | This patch introduces the for_each_set_bit() macro and modifies feature implementation to use it. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1323248577-11268-8-git-send-email-robert.richter@amd.com Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Unify handling of features when writing feature sectionRobert Richter2011-12-231-9/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The features HEADER_TRACE_INFO and HEADER_BUILD_ID are handled different when writing the feature section. All other features are simply disabled on failure and writing the section goes on without returning an error. There is no reason for these special cases. This patch unifies handling of the features. This should be ok since all features can be parsed independently. Offset and size of a feature's block is stored in struct perf_file_ section right after the data block of perf.data (see perf_session__ write_header()). Thus, if a feature does not exist then other features can be processed anyway. Also moving special code for HEADER_BUILD_ID out to write_build_id(). v2: * perf record throws an error now if buildids may not be generated, which can be disabled with the --no-buildid option. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1323248577-11268-6-git-send-email-robert.richter@amd.com Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf report: Accept fifos as input fileRobert Richter2011-12-231-2/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The default input file for perf report is not handled the same way as perf record does it for its output file. This leads to unexpected behavior of perf report, etc. E.g.: # perf record -a -e cpu-cycles sleep 2 | perf report | cat failed to open perf.data: No such file or directory (try 'perf record' first) While perf record writes to a fifo, perf report expects perf.data to be read. This patch changes this to accept fifos as input file. Applies to the following commands: perf annotate perf buildid-list perf evlist perf kmem perf lock perf report perf sched perf script perf timechart Also fixes char const* -> const char* type declaration for filename strings. v2: * Prevent potential null pointer access to input_name in builtin-report.c. Needed due to removal of patch "perf report: Setup browser if stdout is a pipe" Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1323248577-11268-5-git-send-email-robert.richter@amd.com Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Moving code in some filesRobert Richter2011-12-231-249/+246
| | | | | | | | | | | | Needed for later changes. No modified functionality. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1323248577-11268-4-git-send-email-robert.richter@amd.com Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Fix out-of-bound access to struct perf_sessionRobert Richter2011-12-232-2/+2
| | | | | | | | | | | | | | | If filename is NULL there is an out-of-bound access to struct perf_session if it would be used with perf_session__open(). Shouldn't actually happen in current implementation as filename is always !NULL. Fixing this by always null-terminating filename. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1323248577-11268-3-git-send-email-robert.richter@amd.com Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Continue processing header on unknown featuresRobert Richter2011-12-231-1/+1
| | | | | | | | | | | | | | A feature may be unknown if perf.data is created and parsed on different perf tool versions. This should not stop the header to be processed, instead continue processing it. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1323248577-11268-2-git-send-email-robert.richter@amd.com Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Improve macros for struct feature_opsRobert Richter2011-12-231-18/+22
| | | | | | | | | | | | | Reducing duplication and line size by extending function names for print and write from a single name. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1323248577-11268-7-git-send-email-robert.richter@amd.com Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf: builtin-record: Document and check that mmap_pages must be a power of two.Nelson Elhage2011-12-232-0/+13
| | | | | | | | | | | | | | Now that we automatically point users at it, let's provide them some guidance so that they hopefully don't just get mysterious EINVAL's from the kernel. Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1324301972-22740-4-git-send-email-nelhage@nelhage.com Signed-off-by: Nelson Elhage <nelhage@nelhage.com> [ committer note: Made it work after 50a682c ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Fix truncated annotationIngo Molnar2011-12-231-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I get such truncated annotation results in 'perf top': : Disassembly of section .text: ▒ : ▒ : ffffffff810966a8 <nr_iowait_cpu>: ▒ 4.94 : ffffffff810966a8: movslq %edi,%rdi ▒ 3.70 : ffffffff810966ab: mov $0x13700,%rax ▒ 0.00 : ffffffff810966b2: add -0x7e32cb00(,%rdi,8),%rax ▒ 8.64 : ffffffff810966ba: mov 0x7e0(%rax),%eax ▒ 82.72 : ffffffff810966c0: cltq ▒ Note the missing 'retq' which is there in the original function: ffffffff810966a8 <nr_iowait_cpu>: ffffffff810966a8: 48 63 ff movslq %edi,%rdi ffffffff810966ab: 48 c7 c0 00 37 01 00 mov $0x13700,%rax ffffffff810966b2: 48 03 04 fd 00 35 cd add -0x7e32cb00(,%rdi,8),%rax ffffffff810966b9: 81 ffffffff810966ba: 8b 80 e0 07 00 00 mov 0x7e0(%rax),%eax ffffffff810966c0: 48 98 cltq ffffffff810966c2: c3 retq ffffffff810966c3 <this_cpu_load>: I'm using a fairly recent binutils: GNU objdump version 2.21.51.0.6-2.fc16 20110118 AFAICS the bug is simply that sym->end points to the last byte of the symbol in question - while objdump's --stop-address expects the last byte plus 1 to disassemble the full range. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20111223130804.GA24305@elte.hu Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Look up thread names for system wide profilingDavid Ahern2011-12-231-22/+53
| | | | | | | | | | | | | | | | This handles multithreaded processes with named threads when doing system wide profiling: the comm for each thread is looked up allowing them to be different from the thread group leader. v2: - fixed sizeof arg to perf_event__get_comm_tgid Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1324578603-12762-3-git-send-email-dsahern@gmail.com Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Fix comm for processes with named threadsDavid Ahern2011-12-231-5/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | perf does not properly handle monitoring of processes with named threads. For example: $ ps -C myapp -L PID LWP TTY TIME CMD 25118 25118 ? 00:00:00 myapp 25118 25119 ? 00:00:00 myapp:worker perf record -e cs -c 1 -fo /tmp/perf.data -p 25118 -- sleep 10 perf report --stdio -i /tmp/perf.data 100.00% myapp:worker [kernel.kallsyms] [k] perf_event_task_sched_out The process name is set to the name of the last thread it finds for the process. The Problem: perf-top and perf-record both create a thread_map of threads to be monitored. That map is used in perf_event__synthesize_thread_map which loops over the entries in thread_map and calls __event__synthesize_thread to generate COMM and MMAP events. __event__synthesize_thread calls perf_event__synthesize_comm which opens /proc/pid/status, reads the name of the task and its thread group id. That's all fine. The problem is that it then reads /proc/pid/task and generates COMM events for each task it finds - but using the name found in /proc/pid/status where pid is the thread of interest. The end result (looping over thread_map + synthesizing comm events for each thread each time) means the name of the last thread processed sets the name for all threads in the process - which is not good for multithreaded processes with named threads. The Fix: perf_event__synthesize_comm has an input argument (full) that decides whether to process task entries for each pid it is passed. It currently never set to 0 (perf_event__synthesize_comm has a single caller and it always passes the value 1). Let's fix that. Add the full input argument to __event__synthesize_thread which passes it to perf_event__synthesize_comm. For thread/process monitoring set full to 0 which means COMM and MMAP events are only generated for the pid passed to it. For system wide monitoring set full to 1 so that COMM events are generated for all threads in a process. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1324578603-12762-2-git-send-email-dsahern@gmail.com Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf: Add support for PERF_HW_COUNT_REF_CPU_CYCLESStephane Eranian2011-12-211-0/+2
| | | | | | | | | | | | Add new generic hw event: ref-cycles, which maps to PERF_HW_COUNT_REF_CPUCYCLES: $ perf stat -e ref-cycles ls Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1323559734-3488-5-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
* Merge branch 'perf/core' of git://github.com/acmel/linux into perf/coreIngo Molnar2011-12-207-18/+26
|\
| * perf events: Tidy up perf_event__preprocess_sampleNamhyung Kim2011-12-201-6/+7
| | | | | | | | | | | | | | | | | | | | | | | | Use local variable 'dso' to reduce typing a bit and rearrange the if condition. Also NULL check of al->map in the condition is not necessary. Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1323703017-6060-7-git-send-email-namhyung@gmail.com Signed-off-by: Namhyung Kim <namhyung@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf tools: Remove stale git headlines from top commentNamhyung Kim2011-12-202-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | These files are part of PERF not GIT although they're come from there :) Cc: Ingo Molnar <mingo@elte.hu> Cc: Johannes Schindelin <Johannes.Schindelin@gmx.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1323784323-2150-1-git-send-email-namhyung@gmail.com Signed-off-by: Namhyung Kim <namhyung@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf tools: Fix a memory leak on perf_read_values_destroyNamhyung Kim2011-12-201-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | After freeing each elements of the @values->value, we should free itself too. Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1323703017-6060-5-git-send-email-namhyung@gmail.com Signed-off-by: Namhyung Kim <namhyung@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf symbols: Fix error path on symbol__init()Namhyung Kim2011-12-201-2/+2
| | | | | | | | | | | | | | | | | | | | | | The order of freeing comm_list and dso_list should be reversed. Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1323703017-6060-4-git-send-email-namhyung@gmail.com Signed-off-by: Namhyung Kim <namhyung@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf symbols: Get rid of duplicated snprintf()Namhyung Kim2011-12-201-6/+1
| | | | | | | | | | | | | | | | | | | | | | The 'path' variable is set on a upper line, don't need to do it again. Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1323703017-6060-3-git-send-email-namhyung@gmail.com Signed-off-by: Namhyung Kim <namhyung@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf evlist: Fix errno value reporting on failed mmapNelson Elhage2011-12-201-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On failure, perf_evlist__mmap_per_{cpu,thread} will try to munmap() every map that doesn't have a NULL base. This will fail with EINVAL if one of them has base == MAP_FAILED, clobbering errno, so that perf_evlist__map will return EINVAL on any failure regardless of the root cause. Fix this by resetting failed maps to a NULL base. Acked-by: Namhyung Kim <namhyung@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1324301972-22740-2-git-send-email-nelhage@nelhage.com Signed-off-by: Nelson Elhage <nelhage@nelhage.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf evsel: Fix uninitialized memory access to struct perf_sampleRobert Richter2011-12-201-1/+1
| | | | | | | | | | | | | | | | | | | | | | Memory in struct perf_sample is not fully initialized during parsing. Depending on sampling data some parts may left unchanged. Zero out struct perf_sample first to avoid access to uninitialized memory. Cc: Ingo Molnar <mingo@elte.hu> Link: http://lkml.kernel.org/r/1323966762-8574-2-git-send-email-robert.richter@amd.com Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf record: Add ability to record event periodAndrew Vagin2011-12-201-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The problem is that when SAMPLE_PERIOD is not set, the kernel generates a number of samples in proportion to an event's period. Number of these samples may be too big and the kernel throttles all samples above a defined limit. E.g.: I want to trace when a process sleeps. I created a process which sleeps for 1ms and for 4ms. perf got 100 events in both cases. swapper 0 [000] 1141.371830: sched_stat_sleep: comm=foo pid=1801 delay=1386750 [ns] swapper 0 [000] 1141.369444: sched_stat_sleep: comm=foo pid=1801 delay=4499585 [ns] In the first case a kernel want to send 4499585 events and in the second case it wants to send 1386750 events. perf-reports shows that process sleeps in both places equal time. Instead of this we can get only one sample with an attribute period. As result we have less data transferring between kernel and user-space and we avoid throttling of samples. The patch "events: Don't divide events if it has field period" added a kernel part of this functionality. Acked-by: Arun Sharma <asharma@fb.com> Cc: Arun Sharma <asharma@fb.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: devel@openvz.org Link: http://lkml.kernel.org/r/1324391565-1369947-1-git-send-email-avagin@openvz.org Signed-off-by: Andrew Vagin <avagin@openvz.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | Merge commit 'v3.2-rc6' into perf/coreIngo Molnar2011-12-201-1/+1
|\ \ | |/ |/| | | | | | | Merge reason: Update with the latest fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * Merge branch 'perf/urgent' of git://github.com/acmel/linux into perf/urgentIngo Molnar2011-12-071-1/+1
| |\
| | * perf header: Use event_name() to get an event nameAndrew Vagin2011-12-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | perf_evsel.name may be not initialized Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arun Sharma <asharma@fb.com> Cc: devel@openvz.org Link: http://lkml.kernel.org/r/1322471015-107825-2-git-send-email-avagin@openvz.org Signed-off-by: Andrew Vagin <avagin@openvz.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | | perf tools: Add ability to synthesize event according to a sampleAndrew Vagin2011-12-123-0/+90
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It's the counterpart of perf_session__parse_sample. v2: fixed mistakes found by David Ahern. v3: s/data/sample/ s/perf_event__change_sample/perf_event__synthesize_sample Reviewed-by: David Ahern <dsahern@gmail.com> Cc: Arun Sharma <asharma@fb.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: devel@openvz.org Link: http://lkml.kernel.org/r/1323266161-394927-3-git-send-email-avagin@openvz.org Signed-off-by: Andrew Vagin <avagin@openvz.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | | Merge branch 'perf/urgent' into perf/coreIngo Molnar2011-12-064-12/+14
|\ \ \ | |/ / | | | | | | | | | | | | | | | Merge reason: Add these cherry-picked commits so that future changes on perf/core don't conflict. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | perf: Fix parsing of __print_flags() in TP_printk()Steven Rostedt2011-12-051-0/+2
| |/ | | | | | | | | | | | | | | | | | | | | | | | | A update is made to the sched:sched_switch event that adds some logic to the first parameter of the __print_flags() that shows the state of tasks. This change cause perf to fail parsing the flags. A simple fix is needed to have the parser be able to process ops within the argument. Cc: stable@vger.kernel.org Reported-by: Andrew Vagin <avagin@openvz.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * perf session: Fix crash with invalid CPU listDavid Ahern2011-11-161-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | commit 5d67be9 added the option to specify a range of CPUs of interest, but does not catch an invalid CPU list: $ perf script -c foo Segmentation fault (core dumped) Cc: Anton Blanchard <anton@samba.org> Link: http://lkml.kernel.org/r/1321206327-5881-1-git-send-email-dsahern@gmail.com Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf python: Fix undefined symbol problemArnaldo Carvalho de Melo2011-11-163-12/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Recently we made perf_evsel__init call hists__init, which broke the perf python binding: [root@emilia linux]# ./tools/perf/python/twatch.py Traceback (most recent call last): File "./tools/perf/python/twatch.py", line 16, in <module> import perf ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: hists__init Fix it by moving the hists__init function to its only caller, evsel.c. This way we avoid dragging in other parts of tools/perf/util/ to the perf python binding. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-5nffmdt5mu6ozxgj54oi4qon@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf event: Introduce perf_event__fprintfArnaldo Carvalho de Melo2011-12-022-6/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | So that tools like 'perf test' can print the events when in verbose mode, for instance. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-xnovdqfi25nc48gy6604k7yp@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf evlist: Always do automatic allocation of pollfd and mmap structuresArnaldo Carvalho de Melo2011-11-292-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | At first tools were required to do that, but while writing the python bindings to simplify the API I made them auto-allocate when needed. This just makes record, stat and top use that auto allocation, simplifying them a bit. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-iokhcvkzzijr3keioubx8hlq@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf tools: Save some loops using perf_evlist__id2evselArnaldo Carvalho de Melo2011-11-284-0/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since we already ask for PERF_SAMPLE_ID and use it to quickly find the associated evsel, add handler func + data to struct perf_evsel to avoid using chains of if(strcmp(event_name)) and also to avoid all the linear list searches via trace_event_find. To demonstrate the technique convert 'perf sched' to it: # perf sched record sleep 5m And then: Performance counter stats for '/tmp/oldperf sched lat': 646.929438 task-clock # 0.999 CPUs utilized 9 context-switches # 0.000 M/sec 0 CPU-migrations # 0.000 M/sec 20,901 page-faults # 0.032 M/sec 1,290,144,450 cycles # 1.994 GHz <not supported> stalled-cycles-frontend <not supported> stalled-cycles-backend 1,606,158,439 instructions # 1.24 insns per cycle 339,088,395 branches # 524.151 M/sec 4,550,735 branch-misses # 1.34% of all branches 0.647524759 seconds time elapsed Versus: Performance counter stats for 'perf sched lat': 473.564691 task-clock # 0.999 CPUs utilized 9 context-switches # 0.000 M/sec 0 CPU-migrations # 0.000 M/sec 20,903 page-faults # 0.044 M/sec 944,367,984 cycles # 1.994 GHz <not supported> stalled-cycles-frontend <not supported> stalled-cycles-backend 1,442,385,571 instructions # 1.53 insns per cycle 308,383,106 branches # 651.195 M/sec 4,481,784 branch-misses # 1.45% of all branches 0.474215751 seconds time elapsed [root@emilia ~]# Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-1kbzpl74lwi6lavpqke2u2p3@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf top: Stop using globals for tool stateArnaldo Carvalho de Melo2011-11-281-0/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | Use its 'perf_tool' base class instead. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-i33q40wwvk2zna8fd36ex6sm@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
OpenPOWER on IntegriCloud