diff options
Diffstat (limited to 'tools/perf/Documentation')
20 files changed, 373 insertions, 56 deletions
diff --git a/tools/perf/Documentation/intel-pt.txt b/tools/perf/Documentation/intel-pt.txt index b0b3007d3c9c..4b6cdbf8f935 100644 --- a/tools/perf/Documentation/intel-pt.txt +++ b/tools/perf/Documentation/intel-pt.txt @@ -108,6 +108,9 @@ approach is available to export the data to a postgresql database. Refer to script export-to-postgresql.py for more details, and to script call-graph-from-postgresql.py for an example of using the database. +There is also script intel-pt-events.py which provides an example of how to +unpack the raw data for power events and PTWRITE. + As mentioned above, it is easy to capture too much data. One way to limit the data captured is to use 'snapshot' mode which is explained further below. Refer to 'new snapshot option' and 'Intel PT modes of operation' further below. @@ -364,6 +367,42 @@ cyc_thresh Specifies how frequently CYC packets are produced - see cyc CYC packets are not requested by default. +pt Specifies pass-through which enables the 'branch' config term. + + The default config selects 'pt' if it is available, so a user will + never need to specify this term. + +branch Enable branch tracing. Branch tracing is enabled by default so to + disable branch tracing use 'branch=0'. + + The default config selects 'branch' if it is available. + +ptw Enable PTWRITE packets which are produced when a ptwrite instruction + is executed. + + Support for this feature is indicated by: + + /sys/bus/event_source/devices/intel_pt/caps/ptwrite + + which contains "1" if the feature is supported and + "0" otherwise. + +fup_on_ptw Enable a FUP packet to follow the PTWRITE packet. The FUP packet + provides the address of the ptwrite instruction. In the absence of + fup_on_ptw, the decoder will use the address of the previous branch + if branch tracing is enabled, otherwise the address will be zero. + Note that fup_on_ptw will work even when branch tracing is disabled. + +pwr_evt Enable power events. The power events provide information about + changes to the CPU C-state. + + Support for this feature is indicated by: + + /sys/bus/event_source/devices/intel_pt/caps/power_event_trace + + which contains "1" if the feature is supported and + "0" otherwise. + new snapshot option ------------------- @@ -674,13 +713,15 @@ Having no option is the same as which, in turn, is the same as - --itrace=ibxe + --itrace=ibxwpe The letters are: i synthesize "instructions" events b synthesize "branches" events x synthesize "transactions" events + w synthesize "ptwrite" events + p synthesize "power" events c synthesize branches events (calls only) r synthesize branches events (returns only) e synthesize tracing error events @@ -699,7 +740,40 @@ and "r" can be combined to get calls and returns. 'flags' field can be used in perf script to determine whether the event is a tranasaction start, commit or abort. -Error events are new. They show where the decoder lost the trace. Error events +Note that "instructions", "branches" and "transactions" events depend on code +flow packets which can be disabled by using the config term "branch=0". Refer +to the config terms section above. + +"ptwrite" events record the payload of the ptwrite instruction and whether +"fup_on_ptw" was used. "ptwrite" events depend on PTWRITE packets which are +recorded only if the "ptw" config term was used. Refer to the config terms +section above. perf script "synth" field displays "ptwrite" information like +this: "ip: 0 payload: 0x123456789abcdef0" where "ip" is 1 if "fup_on_ptw" was +used. + +"Power" events correspond to power event packets and CBR (core-to-bus ratio) +packets. While CBR packets are always recorded when tracing is enabled, power +event packets are recorded only if the "pwr_evt" config term was used. Refer to +the config terms section above. The power events record information about +C-state changes, whereas CBR is indicative of CPU frequency. perf script +"event,synth" fields display information like this: + cbr: cbr: 22 freq: 2189 MHz (200%) + mwait: hints: 0x60 extensions: 0x1 + pwre: hw: 0 cstate: 2 sub-cstate: 0 + exstop: ip: 1 + pwrx: deepest cstate: 2 last cstate: 2 wake reason: 0x4 +Where: + "cbr" includes the frequency and the percentage of maximum non-turbo + "mwait" shows mwait hints and extensions + "pwre" shows C-state transitions (to a C-state deeper than C0) and + whether initiated by hardware + "exstop" indicates execution stopped and whether the IP was recorded + exactly, + "pwrx" indicates return to C0 +For more details refer to the Intel 64 and IA-32 Architectures Software +Developer Manuals. + +Error events show where the decoder lost the trace. Error events are quite important. Users must know if what they are seeing is a complete picture or not. diff --git a/tools/perf/Documentation/itrace.txt b/tools/perf/Documentation/itrace.txt index e2a4c5e0dbe5..a3abe04c779d 100644 --- a/tools/perf/Documentation/itrace.txt +++ b/tools/perf/Documentation/itrace.txt @@ -3,13 +3,15 @@ c synthesize branches events (calls only) r synthesize branches events (returns only) x synthesize transactions events + w synthesize ptwrite events + p synthesize power events e synthesize error events d create a debug log g synthesize a call chain (use with i or x) l synthesize last branch entries (use with i or x) s skip initial number of events - The default is all events i.e. the same as --itrace=ibxe + The default is all events i.e. the same as --itrace=ibxwpe In addition, the period (default 100000) for instructions events can be specified in units of: @@ -26,8 +28,8 @@ Also the number of last branch entries (default 64, max. 1024) for instructions or transactions events can be specified. - It is also possible to skip events generated (instructions, branches, transactions) - at the beginning. This is useful to ignore initialization code. + It is also possible to skip events generated (instructions, branches, transactions, + ptwrite, power) at the beginning. This is useful to ignore initialization code. --itrace=i0nss1000000 diff --git a/tools/perf/Documentation/perf-annotate.txt b/tools/perf/Documentation/perf-annotate.txt index 8ffbd272952d..a89273d8e744 100644 --- a/tools/perf/Documentation/perf-annotate.txt +++ b/tools/perf/Documentation/perf-annotate.txt @@ -39,6 +39,10 @@ OPTIONS --verbose:: Be more verbose. (Show symbol address, etc) +-q:: +--quiet:: + Do not show any message. (Suppress -v) + -D:: --dump-raw-trace:: Dump raw trace in ASCII. diff --git a/tools/perf/Documentation/perf-c2c.txt b/tools/perf/Documentation/perf-c2c.txt index 3f06730c7f47..822414235170 100644 --- a/tools/perf/Documentation/perf-c2c.txt +++ b/tools/perf/Documentation/perf-c2c.txt @@ -76,7 +76,7 @@ REPORT OPTIONS -c:: --coalesce:: - Specify sorintg fields for single cacheline display. + Specify sorting fields for single cacheline display. Following fields are available: tid,pid,iaddr,dso (see COALESCE) @@ -106,7 +106,7 @@ REPORT OPTIONS -d:: --display:: - Siwtch to HITM type (rmt, lcl) to display and sort on. Total HITMs as default. + Switch to HITM type (rmt, lcl) to display and sort on. Total HITMs as default. C2C RECORD ---------- @@ -248,7 +248,7 @@ output fields set for caheline offsets output: Code address, Code symbol, Shared Object, Source line dso - coalesced by shared object -By default the coalescing is setup with 'pid,tid,iaddr'. +By default the coalescing is setup with 'pid,iaddr'. STDIO OUTPUT ------------ diff --git a/tools/perf/Documentation/perf-config.txt b/tools/perf/Documentation/perf-config.txt index 9365b75fd04f..5b4fff3adc4b 100644 --- a/tools/perf/Documentation/perf-config.txt +++ b/tools/perf/Documentation/perf-config.txt @@ -498,6 +498,18 @@ record.*:: But if this option is 'no-cache', it will not update the build-id cache. 'skip' skips post-processing and does not update the cache. +diff.*:: + diff.order:: + This option sets the number of columns to sort the result. + The default is 0, which means sorting by baseline. + Setting it to 1 will sort the result by delta (or other + compute method selected). + + diff.compute:: + This options sets the method for computing the diff result. + Possible values are 'delta', 'delta-abs', 'ratio' and + 'wdiff'. Default is 'delta'. + SEE ALSO -------- linkperf:perf[1] diff --git a/tools/perf/Documentation/perf-diff.txt b/tools/perf/Documentation/perf-diff.txt index 3e9490b9c533..a79c84ae61aa 100644 --- a/tools/perf/Documentation/perf-diff.txt +++ b/tools/perf/Documentation/perf-diff.txt @@ -73,6 +73,10 @@ OPTIONS Be verbose, for instance, show the raw counts in addition to the diff. +-q:: +--quiet:: + Do not show any message. (Suppress -v) + -f:: --force:: Don't do ownership validation. @@ -86,8 +90,9 @@ OPTIONS -c:: --compute:: - Differential computation selection - delta,ratio,wdiff (default is delta). - See COMPARISON METHODS section for more info. + Differential computation selection - delta, ratio, wdiff, delta-abs + (default is delta-abs). Default can be changed using diff.compute + config option. See COMPARISON METHODS section for more info. -p:: --period:: @@ -99,7 +104,11 @@ OPTIONS -o:: --order:: - Specify compute sorting column number. + Specify compute sorting column number. 0 means sorting by baseline + overhead and 1 (default) means sorting by computed value of column 1 + (data from the first file other base baseline). Values more than 1 + can be used only if enough data files are provided. + The default value can be set using the diff.order config option. --percentage:: Determine how to display the overhead percentage of filtered entries. @@ -181,6 +190,10 @@ with: relative to how entries are filtered. Use --percentage=absolute to prevent such fluctuation. +delta-abs +~~~~~~~~~ +Same as 'delta` method, but sort the result with the absolute values. + ratio ~~~~~ If specified the 'Ratio' column is displayed with value 'r' computed as: diff --git a/tools/perf/Documentation/perf-ftrace.txt b/tools/perf/Documentation/perf-ftrace.txt new file mode 100644 index 000000000000..721a447f046e --- /dev/null +++ b/tools/perf/Documentation/perf-ftrace.txt @@ -0,0 +1,87 @@ +perf-ftrace(1) +============= + +NAME +---- +perf-ftrace - simple wrapper for kernel's ftrace functionality + + +SYNOPSIS +-------- +[verse] +'perf ftrace' <command> + +DESCRIPTION +----------- +The 'perf ftrace' command is a simple wrapper of kernel's ftrace +functionality. It only supports single thread tracing currently and +just reads trace_pipe in text and then write it to stdout. + +The following options apply to perf ftrace. + +OPTIONS +------- + +-t:: +--tracer=:: + Tracer to use: function_graph or function. + +-v:: +--verbose=:: + Verbosity level. + +-p:: +--pid=:: + Trace on existing process id (comma separated list). + +-a:: +--all-cpus:: + Force system-wide collection. Scripts run without a <command> + normally use -a by default, while scripts run with a <command> + normally don't - this option allows the latter to be run in + system-wide mode. + +-C:: +--cpu=:: + Only trace for the list of CPUs provided. Multiple CPUs can + be provided as a comma separated list with no space like: 0,1. + Ranges of CPUs are specified with -: 0-2. + Default is to trace on all online CPUs. + +-T:: +--trace-funcs=:: + Only trace functions given by the argument. Multiple functions + can be given by using this option more than once. The function + argument also can be a glob pattern. It will be passed to + 'set_ftrace_filter' in tracefs. + +-N:: +--notrace-funcs=:: + Do not trace functions given by the argument. Like -T option, + this can be used more than once to specify multiple functions + (or glob patterns). It will be passed to 'set_ftrace_notrace' + in tracefs. + +-G:: +--graph-funcs=:: + Set graph filter on the given function (or a glob pattern). + This is useful for the function_graph tracer only and enables + tracing for functions executed from the given function. + This can be used more than once to specify multiple functions. + It will be passed to 'set_graph_function' in tracefs. + +-g:: +--nograph-funcs=:: + Set graph notrace filter on the given function (or a glob pattern). + Like -G option, this is useful for the function_graph tracer only + and disables tracing for function executed from the given function. + This can be used more than once to specify multiple functions. + It will be passed to 'set_graph_notrace' in tracefs. + +-D:: +--graph-depth=:: + Set max depth for function graph tracer to follow + +SEE ALSO +-------- +linkperf:perf-record[1], linkperf:perf-trace[1] diff --git a/tools/perf/Documentation/perf-kallsyms.txt b/tools/perf/Documentation/perf-kallsyms.txt new file mode 100644 index 000000000000..954ea9e21236 --- /dev/null +++ b/tools/perf/Documentation/perf-kallsyms.txt @@ -0,0 +1,24 @@ +perf-kallsyms(1) +============== + +NAME +---- +perf-kallsyms - Searches running kernel for symbols + +SYNOPSIS +-------- +[verse] +'perf kallsyms <options> symbol_name[,symbol_name...]' + +DESCRIPTION +----------- +This command searches the running kernel kallsyms file for the given symbol(s) +and prints information about it, including the DSO, the kallsyms begin/end +addresses and the addresses in the ELF kallsyms symbol table (for symbols in +modules). + +OPTIONS +------- +-v:: +--verbose=:: + Increase verbosity level, showing details about symbol table loading, etc. diff --git a/tools/perf/Documentation/perf-list.txt b/tools/perf/Documentation/perf-list.txt index 41857cce5e86..f709de54707b 100644 --- a/tools/perf/Documentation/perf-list.txt +++ b/tools/perf/Documentation/perf-list.txt @@ -8,7 +8,7 @@ perf-list - List all symbolic event types SYNOPSIS -------- [verse] -'perf list' [--no-desc] [--long-desc] [hw|sw|cache|tracepoint|pmu|event_glob] +'perf list' [--no-desc] [--long-desc] [hw|sw|cache|tracepoint|pmu|sdt|event_glob] DESCRIPTION ----------- @@ -24,6 +24,10 @@ Don't print descriptions. --long-desc:: Print longer event descriptions. +--details:: +Print how named events are resolved internally into perf events, and also +any extra expressions computed by perf stat. + [[EVENT_MODIFIERS]] EVENT MODIFIERS @@ -240,6 +244,8 @@ To limit the list use: . 'pmu' to print the kernel supplied PMU events. +. 'sdt' to list all Statically Defined Tracepoint events. + . If none of the above is matched, it will apply the supplied glob to all events, printing the ones that match. diff --git a/tools/perf/Documentation/perf-probe.txt b/tools/perf/Documentation/perf-probe.txt index e6c9902c6d82..165c2b1d4317 100644 --- a/tools/perf/Documentation/perf-probe.txt +++ b/tools/perf/Documentation/perf-probe.txt @@ -240,9 +240,13 @@ Add a probe on schedule() function 12th line with recording cpu local variable: or ./perf probe --add='schedule:12 cpu' - this will add one or more probes which has the name start with "schedule". +Add one or more probes which has the name start with "schedule". - Add probes on lines in schedule() function which calls update_rq_clock(). + ./perf probe schedule* + or + ./perf probe --add='schedule*' + +Add probes on lines in schedule() function which calls update_rq_clock(). ./perf probe 'schedule;update_rq_clock*' or diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt index 5054d9147f0f..b0e9e921d534 100644 --- a/tools/perf/Documentation/perf-record.txt +++ b/tools/perf/Documentation/perf-record.txt @@ -157,7 +157,7 @@ OPTIONS -a:: --all-cpus:: - System-wide collection from all CPUs. + System-wide collection from all CPUs (default if no target is specified). -p:: --pid=:: @@ -225,7 +225,7 @@ OPTIONS the libunwind or libdw library) should be used instead. Using the "lbr" method doesn't require any compiler options. It will produce call graphs from the hardware LBR registers. The - main limition is that it is only available on new Intel + main limitation is that it is only available on new Intel platforms, such as Haswell. It can only get user call chain. It doesn't work with branch stack sampling at the same time. @@ -347,6 +347,9 @@ Enable weightened sampling. An additional weight is recorded per sample and can displayed with the weight and local_weight sort keys. This currently works for TSX abort events and some memory events in precise mode on modern Intel CPUs. +--namespaces:: +Record events of type PERF_RECORD_NAMESPACES. + --transaction:: Record transaction flags for transaction related events. @@ -421,9 +424,19 @@ Configure all used events to run in user space. --timestamp-filename Append timestamp to output file name. ---switch-output:: +--switch-output[=mode]:: Generate multiple perf.data files, timestamp prefixed, switching to a new one -when receiving a SIGUSR2. +based on 'mode' value: + "signal" - when receiving a SIGUSR2 (default value) or + <size> - when reaching the size threshold, size is expected to + be a number with appended unit character - B/K/M/G + <time> - when reaching the time threshold, size is expected to + be a number with appended unit character - s/m/h/d + + Note: the precision of the size threshold hugely depends + on your configuration - the number and size of your ring + buffers (-m). It is generally more precise for higher sizes + (like >5M), for lower values expect different sizes. A possible use case is to, given an external event, slice the perf.data file that gets then processed, possibly via a perf script, to decide if that diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt index f2914f03ae7b..9fa84617181e 100644 --- a/tools/perf/Documentation/perf-report.txt +++ b/tools/perf/Documentation/perf-report.txt @@ -25,6 +25,10 @@ OPTIONS --verbose:: Be more verbose. (show symbol address, etc) +-q:: +--quiet:: + Do not show any message. (Suppress -v) + -n:: --show-nr-samples:: Show the number of samples for each symbol @@ -68,7 +72,8 @@ OPTIONS --sort=:: Sort histogram entries by given key(s) - multiple keys can be specified in CSV format. Following sort keys are available: - pid, comm, dso, symbol, parent, cpu, socket, srcline, weight, local_weight. + pid, comm, dso, symbol, parent, cpu, socket, srcline, weight, + local_weight, cgroup_id. Each key has following meaning: @@ -76,6 +81,7 @@ OPTIONS - pid: command and tid of the task - dso: name of library or module executed at the time of sample - symbol: name of function executed at the time of sample + - symbol_size: size of function executed at the time of sample - parent: name of function matched to the parent regex filter. Unmatched entries are displayed as "[other]". - cpu: cpu number the task ran at the time of sample @@ -87,6 +93,7 @@ OPTIONS - weight: Event specific weight, e.g. memory latency or transaction abort cost. This is the global weight. - local_weight: Local weight version of the weight above. + - cgroup_id: ID derived from cgroup namespace device and inode numbers. - transaction: Transaction abort flags. - overhead: Overhead percentage of sample - overhead_sys: Overhead percentage of sample running in system mode @@ -168,11 +175,14 @@ OPTIONS By default, every sort keys not specified in -F will be appended automatically. + If the keys starts with a prefix '+', then it will append the specified + field(s) to the default field order. For example: perf report -F +period,sample. + -p:: --parent=<regex>:: A regex filter to identify parent. The parent is a caller of this function and searched through the callchain, thus it requires callchain - information recorded. The pattern is in the exteneded regex format and + information recorded. The pattern is in the extended regex format and defaults to "\^sys_|^do_page_fault", see '--sort parent'. -x:: @@ -197,8 +207,8 @@ OPTIONS -g:: --call-graph=<print_type,threshold[,print_limit],order,sort_key[,branch],value>:: Display call chains using type, min percent threshold, print limit, - call order, sort key, optional branch and value. Note that ordering of - parameters is not fixed so any parement can be given in an arbitraty order. + call order, sort key, optional branch and value. Note that ordering + is not fixed so any parameter can be given in an arbitrary order. One exception is the print_limit which should be preceded by threshold. print_type can be either: @@ -225,6 +235,7 @@ OPTIONS sort_key can be: - function: compare on functions (default) - address: compare on individual code addresses + - srcline: compare on source filename and line number branch can be: - branch: include last branch information in callgraph when available. @@ -420,6 +431,10 @@ include::itrace.txt[] --hierarchy:: Enable hierarchical output. +--inline:: + If a callgraph address belongs to an inlined function, the inline stack + will be printed. Each entry is function name or file/line. + include::callchain-overhead-calculation.txt[] SEE ALSO diff --git a/tools/perf/Documentation/perf-sched.txt b/tools/perf/Documentation/perf-sched.txt index 76173969ab80..a092a2499e8f 100644 --- a/tools/perf/Documentation/perf-sched.txt +++ b/tools/perf/Documentation/perf-sched.txt @@ -132,6 +132,10 @@ OPTIONS for 'perf sched timehist' --migrations:: Show migration events. +-n:: +--next:: + Show next task. + -I:: --idle-hist:: Show idle-related events only. @@ -143,6 +147,8 @@ OPTIONS for 'perf sched timehist' stop time is not given (i.e, time string is 'x.y,') then analysis goes to end of file. +--state:: + Show task state when it switched out. SEE ALSO -------- diff --git a/tools/perf/Documentation/perf-script-perl.txt b/tools/perf/Documentation/perf-script-perl.txt index dfbb506d2c34..142606c0ec9c 100644 --- a/tools/perf/Documentation/perf-script-perl.txt +++ b/tools/perf/Documentation/perf-script-perl.txt @@ -39,7 +39,7 @@ EVENT HANDLERS When perf script is invoked using a trace script, a user-defined 'handler function' is called for each event in the trace. If there's no handler function defined for a given event type, the event is -ignored (or passed to a 'trace_handled' function, see below) and the +ignored (or passed to a 'trace_unhandled' function, see below) and the next event is processed. Most of the event's field values are passed as arguments to the diff --git a/tools/perf/Documentation/perf-script-python.txt b/tools/perf/Documentation/perf-script-python.txt index 54acba221558..51ec2d20068a 100644 --- a/tools/perf/Documentation/perf-script-python.txt +++ b/tools/perf/Documentation/perf-script-python.txt @@ -149,10 +149,8 @@ def raw_syscalls__sys_enter(event_name, context, common_cpu, print "id=%d, args=%s\n" % \ (id, args), -def trace_unhandled(event_name, context, common_cpu, common_secs, common_nsecs, - common_pid, common_comm): - print_header(event_name, common_cpu, common_secs, common_nsecs, - common_pid, common_comm) +def trace_unhandled(event_name, context, event_fields_dict): + print ' '.join(['%s=%s'%(k,str(v))for k,v in sorted(event_fields_dict.items())]) def print_header(event_name, cpu, secs, nsecs, pid, comm): print "%-20s %5u %05u.%09u %8u %-20s " % \ @@ -321,7 +319,7 @@ So those are the essential steps in writing and running a script. The process can be generalized to any tracepoint or set of tracepoints you're interested in - basically find the tracepoint(s) you're interested in by looking at the list of available events shown by -'perf list' and/or look in /sys/kernel/debug/tracing events for +'perf list' and/or look in /sys/kernel/debug/tracing/events/ for detailed event and field info, record the corresponding trace data using 'perf record', passing it the list of interesting events, generate a skeleton script using 'perf script -g python' and modify the @@ -334,7 +332,7 @@ right place, you can have your script listed alongside the other scripts listed by the 'perf script -l' command e.g.: ---- -root@tropicana:~# perf script -l +# perf script -l List of available trace scripts: wakeup-latency system-wide min/max/avg wakeup latency rw-by-file <comm> r/w activity for a program, by file @@ -383,8 +381,6 @@ source tree: ---- # ls -al kernel-source/tools/perf/scripts/python - -root@tropicana:/home/trz/src/tip# ls -al tools/perf/scripts/python total 32 drwxr-xr-x 4 trz trz 4096 2010-01-26 22:30 . drwxr-xr-x 4 trz trz 4096 2010-01-26 22:29 .. @@ -399,7 +395,7 @@ otherwise your script won't show up at run-time), 'perf script -l' should show a new entry for your script: ---- -root@tropicana:~# perf script -l +# perf script -l List of available trace scripts: wakeup-latency system-wide min/max/avg wakeup latency rw-by-file <comm> r/w activity for a program, by file @@ -437,7 +433,7 @@ EVENT HANDLERS When perf script is invoked using a trace script, a user-defined 'handler function' is called for each event in the trace. If there's no handler function defined for a given event type, the event is -ignored (or passed to a 'trace_handled' function, see below) and the +ignored (or passed to a 'trace_unhandled' function, see below) and the next event is processed. Most of the event's field values are passed as arguments to the @@ -532,7 +528,7 @@ can implement a set of optional functions: gives scripts a chance to do setup tasks: ---- -def trace_begin: +def trace_begin(): pass ---- @@ -541,7 +537,7 @@ def trace_begin: as display results: ---- -def trace_end: +def trace_end(): pass ---- @@ -550,8 +546,7 @@ def trace_end: of common arguments are passed into it: ---- -def trace_unhandled(event_name, context, common_cpu, common_secs, - common_nsecs, common_pid, common_comm): +def trace_unhandled(event_name, context, event_fields_dict): pass ---- diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt index 5dc5c6a09ac4..5ee8796be96e 100644 --- a/tools/perf/Documentation/perf-script.txt +++ b/tools/perf/Documentation/perf-script.txt @@ -36,7 +36,7 @@ There are several variants of perf script: 'perf script report <script> [args]' to run and display the results of <script>. <script> is the name displayed in the output of 'perf - trace --list' i.e. the actual script name minus any language + script --list' i.e. the actual script name minus any language extension. The perf.data output from a previous run of 'perf script record <script>' is used and should be present for this command to succeed. [args] refers to the (mainly optional) args expected by @@ -76,7 +76,7 @@ OPTIONS Any command you can specify in a shell. -D:: ---dump-raw-script=:: +--dump-raw-trace=:: Display verbose dump of the trace data. -L:: @@ -116,8 +116,9 @@ OPTIONS --fields:: Comma separated list of fields to print. Options are: comm, tid, pid, time, cpu, event, trace, ip, sym, dso, addr, symoff, - srcline, period, iregs, brstack, brstacksym, flags, bpf-output, - callindent, insn, insnlen. Field list can be prepended with the type, trace, sw or hw, + srcline, period, iregs, brstack, brstacksym, flags, bpf-output, brstackinsn, brstackoff, + callindent, insn, insnlen, synth. + Field list can be prepended with the type, trace, sw or hw, to indicate to which event type the field list applies. e.g., -F sw:comm,tid,time,ip,sym and -F trace:time,cpu,trace @@ -130,6 +131,14 @@ OPTIONS i.e., the specified fields apply to all event types if the type string is not given. + In addition to overriding fields, it is also possible to add or remove + fields from the defaults. For example + + -F -cpu,+insn + + removes the cpu field and adds the insn field. Adding/removing fields + cannot be mixed with normal overriding. + The arguments are processed in the order received. A later usage can reset a prior request. e.g.: @@ -185,19 +194,29 @@ OPTIONS instruction bytes and the instruction length of the current instruction. + The synth field is used by synthesized events which may be created when + Instruction Trace decoding. + Finally, a user may not set fields to none for all event types. i.e., -F "" is not allowed. The brstack output includes branch related information with raw addresses using the - /v/v/v/v/ syntax in the following order: + /v/v/v/v/cycles syntax in the following order: FROM: branch source instruction TO : branch target instruction M/P/-: M=branch target mispredicted or branch direction was mispredicted, P=target predicted or direction predicted, -=not supported X/- : X=branch inside a transactional region, -=not in transaction region or not supported A/- : A=TSX abort entry, -=not aborted region or not supported + cycles The brstacksym is identical to brstack, except that the FROM and TO addresses are printed in a symbolic form if possible. + When brstackinsn is specified the full assembler sequences of branch sequences for each sample + is printed. This is the full execution path leading to the sample. This is only supported when the + sample was recorded with perf record -b or -j any. + + The brstackoff field will print an offset into a specific dso/binary. + -k:: --vmlinux=<file>:: vmlinux pathname @@ -248,6 +267,9 @@ OPTIONS --show-mmap-events Display mmap related events (e.g. MMAP, MMAP2). +--show-namespace-events + Display namespace events i.e. events of type PERF_RECORD_NAMESPACES. + --show-switch-events Display context switch events i.e. events of type PERF_RECORD_SWITCH or PERF_RECORD_SWITCH_CPU_WIDE. @@ -299,6 +321,14 @@ include::itrace.txt[] stop time is not given (i.e, time string is 'x.y,') then analysis goes to end of file. +--max-blocks:: + Set the maximum number of program blocks to print with brstackasm for + each sample. + +--inline:: + If a callgraph address belongs to an inlined function, the inline stack + will be printed. Each entry has function name and file/line. + SEE ALSO -------- linkperf:perf-record[1], linkperf:perf-script-perl[1], diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt index d96ccd4844df..698076313606 100644 --- a/tools/perf/Documentation/perf-stat.txt +++ b/tools/perf/Documentation/perf-stat.txt @@ -63,7 +63,7 @@ report:: -a:: --all-cpus:: - system-wide collection from all CPUs + system-wide collection from all CPUs (default if no target is specified) -c:: --scale:: @@ -94,8 +94,7 @@ to activate system-wide monitoring. Default is to count on all CPUs. -A:: --no-aggr:: -Do not aggregate counts across all monitored CPUs in system-wide mode (-a). -This option is only valid in system-wide mode. +Do not aggregate counts across all monitored CPUs. -n:: --null:: @@ -237,6 +236,23 @@ To interpret the results it is usually needed to know on which CPUs the workload runs on. If needed the CPUs can be forced using taskset. +--no-merge:: +Do not merge results from same PMUs. + +--smi-cost:: +Measure SMI cost if msr/aperf/ and msr/smi/ events are supported. + +During the measurement, the /sys/device/cpu/freeze_on_smi will be set to +freeze core counters on SMI. +The aperf counter will not be effected by the setting. +The cost of SMI can be measured by (aperf - unhalted core cycles). + +In practice, the percentages of SMI cycles is very useful for performance +oriented analysis. --metric_only will be applied by default. +The output is SMI cycles%, equals to (aperf - unhalted core cycles) / aperf + +Users who wants to get the actual value can apply --no-metric-only. + EXAMPLES -------- diff --git a/tools/perf/Documentation/perf-trace.txt b/tools/perf/Documentation/perf-trace.txt index 781b019751a4..c1e3288a2dfb 100644 --- a/tools/perf/Documentation/perf-trace.txt +++ b/tools/perf/Documentation/perf-trace.txt @@ -35,7 +35,10 @@ OPTIONS -e:: --expr:: - List of syscalls to show, currently only syscall names. +--event:: + List of syscalls and other perf events (tracepoints, HW cache events, + etc) to show. + See 'perf list' for a complete list of events. Prefixing with ! shows all syscalls but the ones specified. You may need to escape it. @@ -120,7 +123,8 @@ the thread executes on the designated CPUs. Default is to monitor all CPUs. major or all pagefaults. Default value is maj. --syscalls:: - Trace system calls. This options is enabled by default. + Trace system calls. This options is enabled by default, disable with + --no-syscalls. --call-graph [mode,type,min[,limit],order[,key][,branch]]:: Setup and enable call-graph (stack chain/backtrace) recording. @@ -135,9 +139,6 @@ the thread executes on the designated CPUs. Default is to monitor all CPUs. --kernel-syscall-graph:: Show the kernel callchains on the syscall exit path. ---event:: - Trace other events, see 'perf list' for a complete list. - --max-stack:: Set the stack depth limit when parsing the callchain, anything beyond the specified depth will be ignored. Note that at this point diff --git a/tools/perf/Documentation/perf.data-file-format.txt b/tools/perf/Documentation/perf.data-file-format.txt index b664b18d3991..de8b39dda7b8 100644 --- a/tools/perf/Documentation/perf.data-file-format.txt +++ b/tools/perf/Documentation/perf.data-file-format.txt @@ -11,8 +11,8 @@ All fields are in native-endian of the machine that generated the perf.data. When perf is writing to a pipe it uses a special version of the file format that does not rely on seeking to adjust data offsets. This -format is not described here. The pipe version can be converted to -normal perf.data with perf inject. +format is described in "Pipe-mode data" section. The pipe data version can be +augmented with additional events using perf inject. The file starts with a perf_header: @@ -270,7 +270,7 @@ When the event stream contains multiple events each event is identified by an ID. This can be either through the PERF_SAMPLE_ID or the PERF_SAMPLE_IDENTIFIER header. The PERF_SAMPLE_IDENTIFIER header is at a fixed offset from the event header, which allows reliable -parsing of the header. Relying on ID may be ambigious. +parsing of the header. Relying on ID may be ambiguous. IDENTIFIER is only supported by newer Linux kernels. Perf record specific events: @@ -288,7 +288,7 @@ struct attr_event { uint64_t id[]; }; - PERF_RECORD_HEADER_EVENT_TYPE = 65, /* depreceated */ + PERF_RECORD_HEADER_EVENT_TYPE = 65, /* deprecated */ #define MAX_EVENT_NAME 64 @@ -411,6 +411,21 @@ An array bound by the perf_file_section size. ids points to a array of uint64_t defining the ids for event attr attr. +Pipe-mode data + +Pipe-mode avoid seeks in the file by removing the perf_file_section and flags +from the struct perf_header. The trimmed header is: + +struct perf_pipe_file_header { + u64 magic; + u64 size; +}; + +The information about attrs, data, and event_types is instead in the +synthesized events PERF_RECORD_ATTR, PERF_RECORD_HEADER_TRACING_DATA and +PERF_RECORD_HEADER_EVENT_TYPE that are generated by perf record in pipe-mode. + + References: include/uapi/linux/perf_event.h diff --git a/tools/perf/Documentation/tips.txt b/tools/perf/Documentation/tips.txt index 8a6479c0eac9..db0ca3063eae 100644 --- a/tools/perf/Documentation/tips.txt +++ b/tools/perf/Documentation/tips.txt @@ -22,8 +22,8 @@ If you have debuginfo enabled, try: perf report -s sym,srcline For memory address profiling, try: perf mem record / perf mem report For tracepoint events, try: perf report -s trace_fields To record callchains for each sample: perf record -g -To record every process run by an user: perf record -u <user> -Skip collecing build-id when recording: perf record -B +To record every process run by a user: perf record -u <user> +Skip collecting build-id when recording: perf record -B To change sampling frequency to 100 Hz: perf record -F 100 See assembly instructions with percentage: perf annotate <symbol> If you prefer Intel style assembly, try: perf annotate -M intel |