Diffstat (limited to 'tools/perf/Documentation')
| -rw-r--r-- | tools/perf/Documentation/Build.txt | 24 |
| -rw-r--r-- | tools/perf/Documentation/perf-config.txt | 47 |
| -rw-r--r-- | tools/perf/Documentation/perf-diff.txt | 56 |
| -rw-r--r-- | tools/perf/Documentation/perf-record.txt | 23 |
| -rw-r--r-- | tools/perf/Documentation/perf-report.txt | 13 |
| -rw-r--r-- | tools/perf/Documentation/perf-script.txt | 9 |
| -rw-r--r-- | tools/perf/Documentation/perf-stat.txt | 5 |
| -rw-r--r-- | tools/perf/Documentation/perf-trace.txt | 8 |
| -rw-r--r-- | tools/perf/Documentation/perf.data-file-format.txt | 11 |
| -rw-r--r-- | tools/perf/Documentation/tips.txt | 7 |
10 files changed, 193 insertions, 10 deletions
diff --git a/tools/perf/Documentation/Build.txt b/tools/perf/Documentation/Build.txt
index f6fc6507ba55..3766886c4bca 100644
--- a/tools/perf/Documentation/Build.txt
+++ b/tools/perf/Documentation/Build.txt
@@ -47,3 +47,27 @@ Those objects are then used in final linking:
 
 NOTE this description is omitting other libraries involved, only
      focusing on build framework outcomes
+
+3) Build with ASan or UBSan
+==========================
+  $ cd tools/perf
+  $ make DESTDIR=/usr
+  $ make DESTDIR=/usr install
+
+AddressSanitizer (or ASan) is a GCC feature that detects memory corruption bugs
+such as buffer overflows and memory leaks.
+
+  $ cd tools/perf
+  $ make DEBUG=1 EXTRA_CFLAGS='-fno-omit-frame-pointer -fsanitize=address'
+  $ ASAN_OPTIONS=log_path=asan.log ./perf record -a
+
+ASan outputs all detected issues into a log file named 'asan.log.<pid>'.
+
+UndefinedBehaviorSanitizer (or UBSan) is a fast undefined behavior detector
+supported by GCC. UBSan detects undefined behaviors of programs at runtime.
+
+  $ cd tools/perf
+  $ make DEBUG=1 EXTRA_CFLAGS='-fno-omit-frame-pointer -fsanitize=undefined'
+  $ UBSAN_OPTIONS=print_stacktrace=1 ./perf record -a
+
+If UBSan detects any problem at runtime, it outputs a “runtime error:” message.
diff --git a/tools/perf/Documentation/perf-config.txt b/tools/perf/Documentation/perf-config.txt
index 4ac7775fbc11..462b3cde0675 100644
--- a/tools/perf/Documentation/perf-config.txt
+++ b/tools/perf/Documentation/perf-config.txt
@@ -114,12 +114,16 @@ Given a $HOME/.perfconfig like this:
 
 	[report]
 		# Defaults
-		sort-order = comm,dso,symbol
+		sort_order = comm,dso,symbol
 		percent-limit = 0
 		queue-size = 0
 		children = true
 		group = true
 
+	[llvm]
+		dump-obj = true
+		clang-opt = -g
+
 You can hide source code of annotate feature setting the config to false with
 
 	% perf config annotate.hide_src_code=true
@@ -553,6 +557,47 @@ trace.*::
 	trace.show_zeros::
 		Do not suppress syscall arguments that are equal to zero.
 
+llvm.*::
+	llvm.clang-path::
+		Path to clang. If omit, search it from $PATH.
+
+	llvm.clang-bpf-cmd-template::
+		Cmdline template. Below lines show its default value. Environment
+		variable is used to pass options.
+		"$CLANG_EXEC -D__KERNEL__ $CLANG_OPTIONS $KERNEL_INC_OPTIONS \
+		-Wno-unused-value -Wno-pointer-sign -working-directory \
+		$WORKING_DIR -c $CLANG_SOURCE -target bpf -O2 -o -"
+
+	llvm.clang-opt::
+		Options passed to clang.
+
+	llvm.kbuild-dir::
+		kbuild directory. If not set, use /lib/modules/`uname -r`/build.
+		If set to "" deliberately, skip kernel header auto-detector.
+
+	llvm.kbuild-opts::
+		Options passed to 'make' when detecting kernel header options.
+
+	llvm.dump-obj::
+		Enable perf dump BPF object files compiled by LLVM.
+
+	llvm.opts::
+		Options passed to llc.
+
+samples.*::
+
+	samples.context::
+		Define how many ns worth of time to show
+		around samples in perf report sample context browser.
+
+scripts.*::
+
+	Any option defines a script that is added to the scripts menu
+	in the interactive perf browser and whose output is displayed.
+	The name of the option is the name, the value is a script command line.
+	The script gets the same options passed as a full perf script,
+	in particular -i perfdata file, --cpu, --tid
+
 SEE ALSO
 --------
 linkperf:perf[1]
diff --git a/tools/perf/Documentation/perf-diff.txt b/tools/perf/Documentation/perf-diff.txt
index a79c84ae61aa..da7809b15cc9 100644
--- a/tools/perf/Documentation/perf-diff.txt
+++ b/tools/perf/Documentation/perf-diff.txt
@@ -118,6 +118,62 @@ OPTIONS
 	sum of shown entries will be always 100%. "absolute" means it retains
 	the original value before and after the filter is applied.
 
+--time::
+	Analyze samples within given time window. It supports time
+	percent with multiple time ranges. Time string is 'a%/n,b%/m,...'
+	or 'a%-b%,c%-%d,...'.
+
+	For example:
+
+	Select the second 10% time slice to diff:
+
+	  perf diff --time 10%/2
+
+	Select from 0% to 10% time slice to diff:
+
+	  perf diff --time 0%-10%
+
+	Select the first and the second 10% time slices to diff:
+
+	  perf diff --time 10%/1,10%/2
+
+	Select from 0% to 10% and 30% to 40% slices to diff:
+
+	  perf diff --time 0%-10%,30%-40%
+
+	It also supports analyzing samples within a given time window
+	<start>,<stop>. Times have the format seconds.microseconds. If 'start'
+	is not given (i.e., time string is ',x.y') then analysis starts at
+	the beginning of the file. If stop time is not given (i.e, time
+	string is 'x.y,') then analysis goes to the end of the file. Time string is
+	'a1.b1,c1.d1:a2.b2,c2.d2'. Use ':' to separate timestamps for different
+	perf.data files.
+
+	For example, we get the timestamp information from 'perf script'.
+
+	  perf script -i perf.data.old
+	    mgen 13940 [000] 3946.361400: ...
+
+	  perf script -i perf.data
+	    mgen 13940 [000] 3971.150589 ...
+
+	  perf diff --time 3946.361400,:3971.150589,
+
+	It analyzes the perf.data.old from the timestamp 3946.361400 to
+	the end of perf.data.old and analyzes the perf.data from the
+	timestamp 3971.150589 to the end of perf.data.
+
+--cpu:: Only diff samples for the list of CPUs provided. Multiple CPUs can
+	be provided as a comma-separated list with no space: 0,1. Ranges of
+	CPUs are specified with -: 0-2. Default is to report samples on all
+	CPUs.
+
+--pid=::
+	Only diff samples for given process ID (comma separated list).
+
+--tid=::
+	Only diff samples for given thread ID (comma separated list).
+
 COMPARISON
 ----------
 The comparison is governed by the baseline file. The baseline perf.data
diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index d232b13ea713..8fe4dffcadd0 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -88,6 +88,20 @@ OPTIONS
           If you want to profile write accesses in [0x1000~1008), just set
           'mem:0x1000/8:w'.
 
+        - a BPF source file (ending in .c) or a precompiled object file (ending
+          in .o) selects one or more BPF events.
+          The BPF program can attach to various perf events based on the ELF section
+          names.
+
+          When processing a '.c' file, perf searches an installed LLVM to compile it
+          into an object file first. Optional clang options can be passed via the
+          '--clang-opt' command line option, e.g.:
+
+            perf record --clang-opt "-DLINUX_VERSION_CODE=0x50000" \
+                        -e tests/bpf-script-example.c
+
+          Note: '--clang-opt' must be placed before '--event/-e'.
+
         - a group of events surrounded by a pair of brace ("{event1,event2,...}").
           Each event is separated by commas and the group should be quoted to
           prevent the shell interpretation. You also need to use --group on
@@ -440,6 +454,11 @@ Use <n> control blocks in asynchronous (Posix AIO) trace writing mode (default:
 Asynchronous mode is supported only when linking Perf tool with libc library
 providing implementation for Posix AIO API.
 
+--affinity=mode::
+Set affinity mask of trace reading thread according to the policy defined by 'mode' value:
+    node - thread affinity mask is set to NUMA node cpu mask of the processed mmap buffer
+    cpu - thread affinity mask is set to cpu of the processed mmap buffer
+
 --all-kernel::
 Configure all used events to run in kernel space.
 
@@ -476,6 +495,10 @@ overhead. You can still switch them on with:
 
   --switch-output --no-no-buildid --no-no-buildid-cache
 
+--switch-max-files=N::
+
+When rotating perf.data with --switch-output, only keep N files.
+
 --dry-run::
 Parse options then exit. --dry-run can be used to detect errors in cmdline
 options.
diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 1a27bfe05039..f441baa794ce 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -105,6 +105,8 @@ OPTIONS
 	guest machine
 	- sample: Number of sample
 	- period: Raw number of event count of sample
+	- time: Separate the samples by time stamp with the resolution specified by
+	--time-quantum (default 100ms). Specify with overhead and before it.
 
 	By default, comm, dso and symbol keys are used.
 	(i.e. --sort comm,dso,symbol)
@@ -459,6 +461,10 @@ include::itrace.txt[]
 --socket-filter::
 	Only report the samples on the processor socket that match with this filter
 
+--samples=N::
+	Save N individual samples for each histogram entry to show context in perf
+	report tui browser.
+
 --raw-trace::
 	When displaying traceevent output, do not use print fmt or plugins.
 
@@ -477,6 +483,9 @@ include::itrace.txt[]
 	Please note that not all mmaps are stored, options affecting which ones
 	are include 'perf record --data', for instance.
 
+--ns::
+	Show time stamps in nanoseconds.
+
 --stats::
 	Display overall events statistics without any further processing.
 	(like the one at the end of the perf report -D command)
@@ -494,6 +503,10 @@ include::itrace.txt[]
 	The period/hits keywords set the base the percentage is computed
 	on - the samples period or the number of samples (hits).
 
+--time-quantum::
+	Configure time quantum for time sort key. Default 100ms.
+	Accepts s, us, ms, ns units.
+
 include::callchain-overhead-calculation.txt[]
 
 SEE ALSO
diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index 9e4def08d569..9b0d04dd2a61 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -159,6 +159,12 @@ OPTIONS
 	the override, and the result of the above is that only S/W and H/W
 	events are displayed with the given fields.
 
+	It's possible tp add/remove fields only for specific event type:
+
+		-Fsw:-cpu,-period
+
+	removes cpu and period from software events.
+
 	For the 'wildcard' option if a user selected field is invalid for an
 	event type, a message is displayed to the user that the option is
 	ignored for that type. For example:
@@ -374,6 +380,9 @@ include::itrace.txt[]
 	Set the maximum number of program blocks to print with brstackasm for
 	each sample.
 
+--reltime::
+	Print time stamps relative to trace start.
+
 --per-event-dump::
 	Create per event files with a "perf.data.EVENT.dump" name instead of
 	printing to stdout, useful, for instance, for generating flamegraphs.
diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 4bc2085e5197..39c05f89104e 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -72,9 +72,8 @@ report::
 --all-cpus::
 	system-wide collection from all CPUs (default if no target is specified)
 
--c::
---scale::
-	scale/normalize counter values
+--no-scale::
+	Don't scale/normalize counter values
 
 -d::
 --detailed::
diff --git a/tools/perf/Documentation/perf-trace.txt b/tools/perf/Documentation/perf-trace.txt
index 631e687be4eb..fc6e43262c41 100644
--- a/tools/perf/Documentation/perf-trace.txt
+++ b/tools/perf/Documentation/perf-trace.txt
@@ -210,6 +210,14 @@ the thread executes on the designated CPUs. Default is to monitor all CPUs.
 	may happen, for instance, when a thread gets migrated to a different CPU
 	while processing a syscall.
 
+--map-dump::
+	Dump BPF maps setup by events passed via -e, for instance the augmented_raw_syscalls
+	living in tools/perf/examples/bpf/augmented_raw_syscalls.c. For now this
+	dumps just boolean map values and integer keys, in time this will print in hex
+	by default and use BTF when available, as well as use functions to do pretty
+	printing using the existing 'perf trace' syscall arg beautifiers to map integer
+	arguments to strings (pid to comm, syscall id to syscall name, etc).
+
 PAGEFAULTS
 ----------
 
diff --git a/tools/perf/Documentation/perf.data-file-format.txt b/tools/perf/Documentation/perf.data-file-format.txt
index dfb218feaad9..593ef49b273c 100644
--- a/tools/perf/Documentation/perf.data-file-format.txt
+++ b/tools/perf/Documentation/perf.data-file-format.txt
@@ -43,11 +43,10 @@ struct perf_file_section {
 
 Flags section:
 
-The header is followed by different optional headers, described by the bits set
-in flags. Only headers for which the bit is set are included. Each header
-consists of a perf_file_section located after the initial header.
-The respective perf_file_section points to the data of the additional
-header and defines its size.
+For each of the optional features a perf_file_section it placed after the data
+section if the feature bit is set in the perf_header flags bitset. The
+respective perf_file_section points to the data of the additional header and
+defines its size.
 
 Some headers consist of strings, which are defined like this:
 
@@ -131,7 +130,7 @@ An uint64_t with the total memory in bytes.
 
 	HEADER_CMDLINE = 11,
 
-A perf_header_string with the perf command line used to collect the data.
+A perf_header_string_list with the perf arg-vector used to collect the data.
 
 	HEADER_EVENT_DESC = 12,
 
diff --git a/tools/perf/Documentation/tips.txt b/tools/perf/Documentation/tips.txt
index 849599f39c5e..869965d629ce 100644
--- a/tools/perf/Documentation/tips.txt
+++ b/tools/perf/Documentation/tips.txt
@@ -15,6 +15,7 @@ To see callchains in a more compact form: perf report -g folded
 Show individual samples with: perf script
 Limit to show entries above 5% only: perf report --percent-limit 5
 Profiling branch (mis)predictions with: perf record -b / perf report
+To show assembler sample contexts use perf record -b / perf script -F +brstackinsn --xed
 Treat branches as callchains: perf report --branch-history
 To count events in every 1000 msec: perf stat -I 1000
 Print event counts in CSV format with: perf stat -x,
@@ -34,3 +35,9 @@ Show current config key-value pairs: perf config --list
 Show user configuration overrides: perf config --user --list
 To add Node.js USDT(User-Level Statically Defined Tracing): perf buildid-cache --add `which node`
 To report cacheline events from previous recording: perf c2c report
+To browse sample contexts use perf report --sample 10 and select in context menu
+To separate samples by time use perf report --sort time,overhead,sym
+To set sample time separation other than 100ms with --sort time use --time-quantum
+Add -I to perf report to sample register values visible in perf report context.
+To show IPC for sampling periods use perf record -e '{cycles,instructions}:S' and then browse context
+To show context switches in perf report sample context add --switch-events to perf record.

