Diffstat (limited to 'Documentation/trace')
-rw-r--r-- Documentation/trace/boottime-trace.rst 184
-rw-r--r-- Documentation/trace/coresight/coresight-cpu-debug.rst (renamed from Documentation/trace/coresight-cpu-debug.txt) 67
-rw-r--r-- Documentation/trace/coresight/coresight-etm4x-reference.rst 798
-rw-r--r-- Documentation/trace/coresight/coresight.rst (renamed from Documentation/trace/coresight.txt) 372
-rw-r--r-- Documentation/trace/coresight/index.rst 9
-rw-r--r-- Documentation/trace/events.rst 515
-rw-r--r-- Documentation/trace/ftrace-uses.rst 10
-rw-r--r-- Documentation/trace/ftrace.rst 33
-rw-r--r-- Documentation/trace/index.rst 2
-rw-r--r-- Documentation/trace/intel_th.rst 28
-rw-r--r-- Documentation/trace/kprobetrace.rst 3
-rw-r--r-- Documentation/trace/ring-buffer-design.txt 2
-rw-r--r-- Documentation/trace/uprobetracer.rst 1
13 files changed, 1801 insertions, 223 deletions
diff --git a/Documentation/trace/boottime-trace.rst b/Documentation/trace/boottime-trace.rst
new file mode 100644
index 000000000000..dcb390075ca1
--- /dev/null
+++ b/Documentation/trace/boottime-trace.rst
@@ -0,0 +1,184 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=================
+Boot-time tracing
+=================
+
+:Author: Masami Hiramatsu <mhiramat@kernel.org>
+
+Overview
+========
+
+Boot-time tracing allows users to trace the boot process, including
+device initialization, with the full feature set of ftrace: per-event
+filters and actions, histograms, kprobe events, synthetic events,
+and trace instances.
+Since the kernel command line is not expressive enough to control these
+complex features, boot-time tracing uses a bootconfig file to describe them.
+
+Options in the Boot Config
+==========================
+
+Here is the list of options available for boot-time tracing in the
+boot config file [1]_. All options are under the "ftrace." or "kernel."
+prefix. See the kernel parameters for the options which start
+with the "kernel." prefix [2]_.
+
+.. [1] See :ref:`Documentation/admin-guide/bootconfig.rst <bootconfig>`
+.. [2] See :ref:`Documentation/admin-guide/kernel-parameters.rst <kernelparameters>`
+
+Ftrace Global Options
+---------------------
+
+Ftrace global options have the "kernel." prefix in the boot config, which
+means these options are passed as part of the legacy kernel command line.
+
+kernel.tp_printk
+ Output trace-event data to the printk buffer too.
+
+kernel.dump_on_oops [= MODE]
+ Dump ftrace on Oops. If MODE = 1 or omitted, dump the trace buffers
+ of all CPUs. If MODE = 2, dump only the buffer of the CPU that
+ triggered the Oops.
+
+kernel.traceoff_on_warning
+ Stop tracing if WARN_ON() occurs.
+
+kernel.fgraph_max_depth = MAX_DEPTH
+ Set the maximum depth of the fgraph tracer to MAX_DEPTH.
+
+kernel.fgraph_filters = FILTER[, FILTER2...]
+ Add fgraph tracing function filters.
+
+kernel.fgraph_notraces = FILTER[, FILTER2...]
+ Add fgraph non-tracing function filters.
+
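+As a minimal sketch, the global options above could be combined in a boot
+config as follows. The depth and the filter patterns are hypothetical values
+chosen only for illustration::
+
+ kernel {
+     tp_printk
+     # dump only the buffer of the CPU that hit the Oops
+     dump_on_oops = 2
+     traceoff_on_warning
+     # hypothetical depth and filter patterns
+     fgraph_max_depth = 3
+     fgraph_filters = "vfs_*", "do_sys_*"
+ }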
+
+Ftrace Per-instance Options
+---------------------------
+
+These options can be used for each instance, including the global ftrace node.
+
+ftrace.[instance.INSTANCE.]options = OPT1[, OPT2[...]]
+ Enable given ftrace options.
+
+ftrace.[instance.INSTANCE.]trace_clock = CLOCK
+ Set ftrace's trace_clock to the given CLOCK.
+
+ftrace.[instance.INSTANCE.]buffer_size = SIZE
+ Set the ftrace buffer size to SIZE. SIZE accepts a "KB" or "MB"
+ suffix.
+
+ftrace.[instance.INSTANCE.]alloc_snapshot
+ Allocate snapshot buffer.
+
+ftrace.[instance.INSTANCE.]cpumask = CPUMASK
+ Set CPUMASK as trace cpu-mask.
+
+ftrace.[instance.INSTANCE.]events = EVENT[, EVENT2[...]]
+ Enable given events on boot. You can use a wild card in EVENT.
+
+ftrace.[instance.INSTANCE.]tracer = TRACER
+ Set TRACER as the current tracer on boot (e.g. function).
+
+ftrace.[instance.INSTANCE.]ftrace.filters
+ This will take an array of tracing function filter rules.
+
+ftrace.[instance.INSTANCE.]ftrace.notraces
+ This will take an array of NON-tracing function filter rules.
+
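+For illustration, a boot config using several of these options on the
+global node and on one instance might look like the sketch below. The
+instance name "foo" and all values are assumptions, not recommended
+settings::
+
+ ftrace {
+     # global ftrace node
+     options = "sym-addr"
+     trace_clock = global
+     buffer_size = 1MB
+     tracer = "function"
+     ftrace.filters = "vfs_*"
+     # hypothetical instance named "foo"
+     instance.foo {
+         buffer_size = 512KB
+         cpumask = 1
+         events = "sched:*", "irq:*"
+     }
+ }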
+
+Ftrace Per-Event Options
+------------------------
+
+These options configure individual events.
+
+ftrace.[instance.INSTANCE.]event.GROUP.EVENT.enable
+ Enable GROUP:EVENT tracing.
+
+ftrace.[instance.INSTANCE.]event.GROUP.EVENT.filter = FILTER
+ Set the FILTER rule on GROUP:EVENT.
+
+ftrace.[instance.INSTANCE.]event.GROUP.EVENT.actions = ACTION[, ACTION2[...]]
+ Set the ACTIONs on GROUP:EVENT.
+
+ftrace.[instance.INSTANCE.]event.kprobes.EVENT.probes = PROBE[, PROBE2[...]]
+ Define a new kprobe event based on PROBEs. Multiple probes can be
+ defined on one event, but they must have the same type of
+ arguments. This option is available only for events whose
+ group name is "kprobes".
+
+ftrace.[instance.INSTANCE.]event.synthetic.EVENT.fields = FIELD[, FIELD2[...]]
+ Define a new synthetic event with FIELDs. Each field should be
+ "type varname".
+
+Note that kprobe and synthetic event definitions can be written under
+an instance node, but they are also visible from other instances. Take
+care to avoid event name conflicts.
+
+
+Examples
+========
+
+For example, to add a filter and actions to each event, and to define
+kprobe events and synthetic events with histograms, write a boot config
+like the one below::
+
+ ftrace.event {
+     task.task_newtask {
+         filter = "pid < 128"
+         enable
+     }
+     kprobes.vfs_read {
+         probes = "vfs_read $arg1 $arg2"
+         filter = "common_pid < 200"
+         enable
+     }
+     synthetic.initcall_latency {
+         fields = "unsigned long func", "u64 lat"
+         actions = "hist:keys=func.sym,lat:vals=lat:sort=lat"
+     }
+     initcall.initcall_start {
+         actions = "hist:keys=func:ts0=common_timestamp.usecs"
+     }
+     initcall.initcall_finish {
+         actions = "hist:keys=func:lat=common_timestamp.usecs-$ts0:onmatch(initcall.initcall_start).initcall_latency(func,$lat)"
+     }
+ }
+
+Boot-time tracing also supports the "instance" node, which allows several
+tracers to run at once for different purposes. For example, if one tracer
+traces functions starting with "user\_" and another traces
+"kernel\_" functions, you can write the boot config as below::
+
+ ftrace.instance {
+     foo {
+         tracer = "function"
+         ftrace.filters = "user_*"
+     }
+     bar {
+         tracer = "function"
+         ftrace.filters = "kernel_*"
+     }
+ }
+
+The instance node also accepts event nodes so that each instance
+can customize its event tracing.
+
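+A minimal sketch of such a per-instance event setting, reusing the
+task_newtask event from the earlier example (the instance name "foo" and
+the filter value are illustrative assumptions)::
+
+ ftrace.instance.foo {
+     tracer = "function"
+     ftrace.filters = "user_*"
+     # event node scoped to the hypothetical instance "foo"
+     event.task.task_newtask {
+         filter = "pid < 128"
+         enable
+     }
+ }
+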
+Boot-time tracing also supports ftrace kernel parameters via the boot
+config.
+For example, the following kernel parameters::
+
+ trace_options=sym-addr trace_event=initcall:* tp_printk trace_buf_size=1M ftrace=function ftrace_filter="vfs*"
+
+These can be written in the boot config as below::
+
+ kernel {
+     trace_options = sym-addr
+     trace_event = "initcall:*"
+     tp_printk
+     trace_buf_size = 1M
+     ftrace = function
+     ftrace_filter = "vfs*"
+ }
+
+Note that these parameters use the "kernel" prefix instead of "ftrace".
diff --git a/Documentation/trace/coresight-cpu-debug.txt b/Documentation/trace/coresight/coresight-cpu-debug.rst
index 1a660a39e3c0..993dd294b81b 100644
--- a/Documentation/trace/coresight-cpu-debug.txt
+++ b/Documentation/trace/coresight/coresight-cpu-debug.rst
@@ -1,8 +1,9 @@
- Coresight CPU Debug Module
- ==========================
+==========================
+Coresight CPU Debug Module
+==========================
- Author: Leo Yan <leo.yan@linaro.org>
- Date: April 5th, 2017
+ :Author: Leo Yan <leo.yan@linaro.org>
+ :Date: April 5th, 2017
Introduction
------------
@@ -69,6 +70,7 @@ Before accessing debug registers, we should ensure the clock and power domain
have been enabled properly. In ARMv8-a ARM (ARM DDI 0487A.k) chapter 'H9.1
Debug registers', the debug registers are spread into two domains: the debug
domain and the CPU domain.
+::
+---------------+
| |
@@ -125,18 +127,21 @@ If you want to enable debugging functionality at boot time, you can add
"coresight_cpu_debug.enable=1" to the kernel command line parameter.
The driver also can work as module, so can enable the debugging when insmod
-module:
-# insmod coresight_cpu_debug.ko debug=1
+module::
+
+ # insmod coresight_cpu_debug.ko debug=1
When boot time or insmod module you have not enabled the debugging, the driver
uses the debugfs file system to provide a knob to dynamically enable or disable
debugging:
-To enable it, write a '1' into /sys/kernel/debug/coresight_cpu_debug/enable:
-# echo 1 > /sys/kernel/debug/coresight_cpu_debug/enable
+To enable it, write a '1' into /sys/kernel/debug/coresight_cpu_debug/enable::
+
+ # echo 1 > /sys/kernel/debug/coresight_cpu_debug/enable
+
+To disable it, write a '0' into /sys/kernel/debug/coresight_cpu_debug/enable::
-To disable it, write a '0' into /sys/kernel/debug/coresight_cpu_debug/enable:
-# echo 0 > /sys/kernel/debug/coresight_cpu_debug/enable
+ # echo 0 > /sys/kernel/debug/coresight_cpu_debug/enable
As explained in chapter "Clock and power domain", if you are working on one
platform which has idle states to power off debug logic and the power
@@ -154,34 +159,34 @@ subsystem, more specifically by using the "/dev/cpu_dma_latency"
interface (see Documentation/power/pm_qos_interface.rst for more
details). As specified in the PM QoS documentation the requested
parameter will stay in effect until the file descriptor is released.
-For example:
+For example::
-# exec 3<> /dev/cpu_dma_latency; echo 0 >&3
-...
-Do some work...
-...
-# exec 3<>-
+ # exec 3<> /dev/cpu_dma_latency; echo 0 >&3
+ ...
+ Do some work...
+ ...
+ # exec 3<>-
The same can also be done from an application program.
Disable specific CPU's specific idle state from cpuidle sysfs (see
-Documentation/admin-guide/pm/cpuidle.rst):
-# echo 1 > /sys/devices/system/cpu/cpu$cpu/cpuidle/state$state/disable
+Documentation/admin-guide/pm/cpuidle.rst)::
+ # echo 1 > /sys/devices/system/cpu/cpu$cpu/cpuidle/state$state/disable
Output format
-------------
-Here is an example of the debugging output format:
-
-ARM external debug module:
-coresight-cpu-debug 850000.debug: CPU[0]:
-coresight-cpu-debug 850000.debug: EDPRSR: 00000001 (Power:On DLK:Unlock)
-coresight-cpu-debug 850000.debug: EDPCSR: handle_IPI+0x174/0x1d8
-coresight-cpu-debug 850000.debug: EDCIDSR: 00000000
-coresight-cpu-debug 850000.debug: EDVIDSR: 90000000 (State:Non-secure Mode:EL1/0 Width:64bits VMID:0)
-coresight-cpu-debug 852000.debug: CPU[1]:
-coresight-cpu-debug 852000.debug: EDPRSR: 00000001 (Power:On DLK:Unlock)
-coresight-cpu-debug 852000.debug: EDPCSR: debug_notifier_call+0x23c/0x358
-coresight-cpu-debug 852000.debug: EDCIDSR: 00000000
-coresight-cpu-debug 852000.debug: EDVIDSR: 90000000 (State:Non-secure Mode:EL1/0 Width:64bits VMID:0)
+Here is an example of the debugging output format::
+
+ ARM external debug module:
+ coresight-cpu-debug 850000.debug: CPU[0]:
+ coresight-cpu-debug 850000.debug: EDPRSR: 00000001 (Power:On DLK:Unlock)
+ coresight-cpu-debug 850000.debug: EDPCSR: handle_IPI+0x174/0x1d8
+ coresight-cpu-debug 850000.debug: EDCIDSR: 00000000
+ coresight-cpu-debug 850000.debug: EDVIDSR: 90000000 (State:Non-secure Mode:EL1/0 Width:64bits VMID:0)
+ coresight-cpu-debug 852000.debug: CPU[1]:
+ coresight-cpu-debug 852000.debug: EDPRSR: 00000001 (Power:On DLK:Unlock)
+ coresight-cpu-debug 852000.debug: EDPCSR: debug_notifier_call+0x23c/0x358
+ coresight-cpu-debug 852000.debug: EDCIDSR: 00000000
+ coresight-cpu-debug 852000.debug: EDVIDSR: 90000000 (State:Non-secure Mode:EL1/0 Width:64bits VMID:0)
diff --git a/Documentation/trace/coresight/coresight-etm4x-reference.rst b/Documentation/trace/coresight/coresight-etm4x-reference.rst
new file mode 100644
index 000000000000..b64d9a9c79df
--- /dev/null
+++ b/Documentation/trace/coresight/coresight-etm4x-reference.rst
@@ -0,0 +1,798 @@
+===============================================
+ETMv4 sysfs linux driver programming reference.
+===============================================
+
+ :Author: Mike Leach <mike.leach@linaro.org>
+ :Date: October 11th, 2019
+
+Supplement to existing ETMv4 driver documentation.
+
+Sysfs files and directories
+---------------------------
+
+Root: ``/sys/bus/coresight/devices/etm<N>``
+
+
+The following paragraphs explain the association between sysfs files and the
+ETMv4 registers that they affect. Note the register names are given without
+the ‘TRC’ prefix.
+
+----
+
+:File: ``mode`` (rw)
+:Trace Registers: {CONFIGR + others}
+:Notes:
+ Bit select trace features. See ‘mode’ section below. Bits
+ in this will cause equivalent programming of trace config and
+ other registers to enable the features requested.
+
+:Syntax & eg:
+ ``echo bitfield > mode``
+
+ bitfield up to 32 bits setting trace features.
+
+:Example:
+ ``$> echo 0x012 > mode``
+
+----
+
+:File: ``reset`` (wo)
+:Trace Registers: All
+:Notes:
+ Reset all programming to trace nothing / no logic programmed.
+
+:Syntax:
+ ``echo 1 > reset``
+
+----
+
+:File: ``enable_source`` (wo)
+:Trace Registers: PRGCTLR, All hardware regs.
+:Notes:
+ - > 0 : Programs up the hardware with the current values held in the driver
+ and enables trace.
+
+ - = 0 : disable trace hardware.
+
+:Syntax:
+ ``echo 1 > enable_source``
+
+----
+
+:File: ``cpu`` (ro)
+:Trace Registers: None.
+:Notes:
+ CPU ID that this ETM is attached to.
+
+:Example:
+ ``$> cat cpu``
+
+ ``$> 0``
+
+----
+
+:File: ``addr_idx`` (rw)
+:Trace Registers: None.
+:Notes:
+ Virtual register to index address comparator and range
+ features. Set index for first of the pair in a range.
+
+:Syntax:
+ ``echo idx > addr_idx``
+
+ Where idx < nr_addr_cmp x 2
+
+----
+
+:File: ``addr_range`` (rw)
+:Trace Registers: ACVR[idx, idx+1], VIIECTLR
+:Notes:
+ Pair of addresses for a range selected by addr_idx. Include
+ / exclude according to the optional parameter, or if omitted
+ uses the current ‘mode’ setting. Select comparator range in
+ control register. Error if index is odd value.
+
+:Depends: ``mode, addr_idx``
+:Syntax:
+ ``echo addr1 addr2 [exclude] > addr_range``
+
+ Where addr1 and addr2 define the range and addr1 < addr2.
+
+ Optional exclude value:-
+
+ - 0 for include
+ - 1 for exclude.
+:Example:
+ ``$> echo 0x0000 0x2000 0 > addr_range``
+
+----
+
+:File: ``addr_single`` (rw)
+:Trace Registers: ACVR[idx]
+:Notes:
+ Set a single address comparator according to addr_idx. This
+ is used if the address comparator is used as part of event
+ generation logic etc.
+
+:Depends: ``addr_idx``
+:Syntax:
+ ``echo addr1 > addr_single``
+
+----
+
+:File: ``addr_start`` (rw)
+:Trace Registers: ACVR[idx], VISSCTLR
+:Notes:
+ Set a trace start address comparator according to addr_idx.
+ Select comparator in control register.
+
+:Depends: ``addr_idx``
+:Syntax:
+ ``echo addr1 > addr_start``
+
+----
+
+:File: ``addr_stop`` (rw)
+:Trace Registers: ACVR[idx], VISSCTLR
+:Notes:
+ Set a trace stop address comparator according to addr_idx.
+ Select comparator in control register.
+
+:Depends: ``addr_idx``
+:Syntax:
+ ``echo addr1 > addr_stop``
+
+----
+
+:File: ``addr_context`` (rw)
+:Trace Registers: ACATR[idx,{6:4}]
+:Notes:
+ Link context ID comparator to address comparator addr_idx
+
+:Depends: ``addr_idx``
+:Syntax:
+ ``echo ctxt_idx > addr_context``
+
+ Where ctxt_idx is the index of the linked context id / vmid
+ comparator.
+
+----
+
+:File: ``addr_ctxtype`` (rw)
+:Trace Registers: ACATR[idx,{3:2}]
+:Notes:
+ Input value string. Set type for linked context ID comparator
+
+:Depends: ``addr_idx``
+:Syntax:
+ ``echo type > addr_ctxtype``
+
+ Type one of {all, vmid, ctxid, none}
+:Example:
+ ``$> echo ctxid > addr_ctxtype``
+
+----
+
+:File: ``addr_exlevel_s_ns`` (rw)
+:Trace Registers: ACATR[idx,{14:8}]
+:Notes:
+ Set the ELx secure and non-secure matching bits for the
+ selected address comparator
+
+:Depends: ``addr_idx``
+:Syntax:
+ ``echo val > addr_exlevel_s_ns``
+
+ val is a 7 bit value for exception levels to exclude. Input
+ value shifted to correct bits in register.
+:Example:
+ ``$> echo 0x4F > addr_exlevel_s_ns``
+
+----
+
+:File: ``addr_instdatatype`` (rw)
+:Trace Registers: ACATR[idx,{1:0}]
+:Notes:
+ Set the comparator address type for matching. Driver only
+ supports setting instruction address type.
+
+:Depends: ``addr_idx``
+
+----
+
+:File: ``addr_cmp_view`` (ro)
+:Trace Registers: ACVR[idx, idx+1], ACATR[idx], VIIECTLR
+:Notes:
+ Read the currently selected address comparator. If part of
+ address range then display both addresses.
+
+:Depends: ``addr_idx``
+:Syntax:
+ ``cat addr_cmp_view``
+:Example:
+ ``$> cat addr_cmp_view``
+
+ ``addr_cmp[0] range 0x0 0xffffffffffffffff include ctrl(0x4b00)``
+
+----
+
+:File: ``nr_addr_cmp`` (ro)
+:Trace Registers: From IDR4
+:Notes:
+ Number of address comparator pairs
+
+----
+
+:File: ``sshot_idx`` (rw)
+:Trace Registers: None
+:Notes:
+ Select single shot register set.
+
+----
+
+:File: ``sshot_ctrl`` (rw)
+:Trace Registers: SSCCR[idx]
+:Notes:
+ Access a single shot comparator control register.
+
+:Depends: ``sshot_idx``
+:Syntax:
+ ``echo val > sshot_ctrl``
+
+ Writes val into the selected control register.
+
+----
+
+:File: ``sshot_status`` (ro)
+:Trace Registers: SSCSR[idx]
+:Notes:
+ Read a single shot comparator status register
+
+:Depends: ``sshot_idx``
+:Syntax:
+ ``cat sshot_status``
+
+ Read status.
+:Example:
+ ``$> cat sshot_status``
+
+ ``0x1``
+
+----
+
+:File: ``sshot_pe_ctrl`` (rw)
+:Trace Registers: SSPCICR[idx]
+:Notes:
+ Access a single shot PE comparator input control register.
+
+:Depends: ``sshot_idx``
+:Syntax:
+ ``echo val > sshot_pe_ctrl``
+
+ Writes val into the selected control register.
+
+----
+
+:File: ``ns_exlevel_vinst`` (rw)
+:Trace Registers: VICTLR{23:20}
+:Notes:
+ Program non-secure exception level filters. Set / clear NS
+ exception filter bits. Setting ‘1’ excludes trace from the
+ exception level.
+
+:Syntax:
+ ``echo bitfield > ns_exlevel_vinst``
+
+ Where bitfield contains bits to set or clear for EL0 to EL2
+:Example:
+ ``$> echo 0x4 > ns_exlevel_vinst``
+
+ Excludes EL2 NS trace.
+
+----
+
+:File: ``vinst_pe_cmp_start_stop`` (rw)
+:Trace Registers: VIPCSSCTLR
+:Notes:
+ Access PE start stop comparator input control registers
+
+----
+
+:File: ``bb_ctrl`` (rw)
+:Trace Registers: BBCTLR
+:Notes:
+ Define ranges that Branch Broadcast will operate in.
+ Default (0x0) is all addresses.
+
+:Depends: BB enabled.
+
+----
+
+:File: ``cyc_threshold`` (rw)
+:Trace Registers: CCCTLR
+:Notes:
+ Set the threshold for which cycle counts will be emitted.
+ Error if attempt to set below minimum defined in IDR3, masked
+ to width of valid bits.
+
+:Depends: CC enabled.
+
+----
+
+:File: ``syncfreq`` (rw)
+:Trace Registers: SYNCPR
+:Notes:
+ Set trace synchronisation period. Power of 2 value, 0 (off)
+ or 8-20. Driver defaults to 12 (every 4096 bytes).
+
+----
+
+:File: ``cntr_idx`` (rw)
+:Trace Registers: none
+:Notes:
+ Select the counter to access
+
+:Syntax:
+ ``echo idx > cntr_idx``
+
+ Where idx < nr_cntr
+
+----
+
+:File: ``cntr_ctrl`` (rw)
+:Trace Registers: CNTCTLR[idx]
+:Notes:
+ Set counter control value.
+
+:Depends: ``cntr_idx``
+:Syntax:
+ ``echo val > cntr_ctrl``
+
+ Where val is per ETMv4 spec.
+
+----
+
+:File: ``cntrldvr`` (rw)
+:Trace Registers: CNTRLDVR[idx]
+:Notes:
+ Set counter reload value.
+
+:Depends: ``cntr_idx``
+:Syntax:
+ ``echo val > cntrldvr``
+
+ Where val is per ETMv4 spec.
+
+----
+
+:File: ``nr_cntr`` (ro)
+:Trace Registers: From IDR5
+
+:Notes:
+ Number of counters implemented.
+
+----
+
+:File: ``ctxid_idx`` (rw)
+:Trace Registers: None
+:Notes:
+ Select the context ID comparator to access
+
+:Syntax:
+ ``echo idx > ctxid_idx``
+
+ Where idx < numcidc
+
+----
+
+:File: ``ctxid_pid`` (rw)
+:Trace Registers: CIDCVR[idx]
+:Notes:
+ Set the context ID comparator value
+
+:Depends: ``ctxid_idx``
+
+----
+
+:File: ``ctxid_masks`` (rw)
+:Trace Registers: CIDCCTLR0, CIDCCTLR1, CIDCVR<0-7>
+:Notes:
+ Pair of values to set the byte masks for 1-8 context ID
+ comparators. Automatically clears masked bytes to 0 in CID
+ value registers.
+
+:Syntax:
+ ``echo m3m2m1m0 [m7m6m5m4] > ctxid_masks``
+
+ 32 bit values made up of mask bytes, where mN represents a
+ byte mask value for Context ID comparator N.
+
+ Second value not required on systems that have 4 or fewer
+ context ID comparators
+
+----
+
+:File: ``numcidc`` (ro)
+:Trace Registers: From IDR4
+:Notes:
+ Number of Context ID comparators
+
+----
+
+:File: ``vmid_idx`` (rw)
+:Trace Registers: None
+:Notes:
+ Select the VM ID comparator to access.
+
+:Syntax:
+ ``echo idx > vmid_idx``
+
+ Where idx < numvmidc
+
+----
+
+:File: ``vmid_val`` (rw)
+:Trace Registers: VMIDCVR[idx]
+:Notes:
+ Set the VM ID comparator value
+
+:Depends: ``vmid_idx``
+
+----
+
+:File: ``vmid_masks`` (rw)
+:Trace Registers: VMIDCCTLR0, VMIDCCTLR1, VMIDCVR<0-7>
+:Notes:
+ Pair of values to set the byte masks for 1-8 VM ID comparators.
+ Automatically clears masked bytes to 0 in VMID value registers.
+
+:Syntax:
+ ``echo m3m2m1m0 [m7m6m5m4] > vmid_masks``
+
+ Where mN represents a byte mask value for VMID comparator N.
+ Second value not required on systems that have 4 or fewer
+ VMID comparators.
+
+----
+
+:File: ``numvmidc`` (ro)
+:Trace Registers: From IDR4
+:Notes:
+ Number of VMID comparators
+
+----
+
+:File: ``res_idx`` (rw)
+:Trace Registers: None.
+:Notes:
+ Select the resource selector control to access. Must be 2 or
+ higher as selectors 0 and 1 are hardwired.
+
+:Syntax:
+ ``echo idx > res_idx``
+
+ Where 2 <= idx < nr_resource x 2
+
+----
+
+:File: ``res_ctrl`` (rw)
+:Trace Registers: RSCTLR[idx]
+:Notes:
+ Set resource selector control value. Value per ETMv4 spec.
+
+:Depends: ``res_idx``
+:Syntax:
+ ``echo val > res_ctrl``
+
+ Where val is per ETMv4 spec.
+
+----
+
+:File: ``nr_resource`` (ro)
+:Trace Registers: From IDR4
+:Notes:
+ Number of resource selector pairs
+
+----
+
+:File: ``event`` (rw)
+:Trace Registers: EVENTCTL0R
+:Notes:
+ Set up to 4 implemented event fields.
+
+:Syntax:
+ ``echo ev3ev2ev1ev0 > event``
+
+ Where evN is an 8 bit event field. Up to 4 event fields make up the
+ 32-bit input value. Number of valid fields is implementation dependent,
+ defined in IDR0.
+
+----
+
+:File: ``event_instren`` (rw)
+:Trace Registers: EVENTCTL1R
+:Notes:
+ Choose events which insert event packets into trace stream.
+
+:Depends: EVENTCTL0R
+:Syntax:
+ ``echo bitfield > event_instren``
+
+ Where bitfield is up to 4 bits according to number of event fields.
+
+----
+
+:File: ``event_ts`` (rw)
+:Trace Registers: TSCTLR
+:Notes:
+ Set the event that will generate timestamp requests.
+
+:Depends: ``TS activated``
+:Syntax:
+ ``echo evfield > event_ts``
+
+ Where evfield is an 8 bit event selector.
+
+----
+
+:File: ``seq_idx`` (rw)
+:Trace Registers: None
+:Notes:
+ Sequencer event register select - 0 to 2
+
+----
+
+:File: ``seq_state`` (rw)
+:Trace Registers: SEQSTR
+:Notes:
+ Sequencer current state - 0 to 3.
+
+----
+
+:File: ``seq_event`` (rw)
+:Trace Registers: SEQEVR[idx]
+:Notes:
+ State transition event registers
+
+:Depends: ``seq_idx``
+:Syntax:
+ ``echo evBevF > seq_event``
+
+ Where evBevF is a 16 bit value made up of two event selectors,
+
+ - evB : back
+ - evF : forwards.
+
+----
+
+:File: ``seq_reset_event`` (rw)
+:Trace Registers: SEQRSTEVR
+:Notes:
+ Sequencer reset event
+
+:Syntax:
+ ``echo evfield > seq_reset_event``
+
+ Where evfield is an 8 bit event selector.
+
+----
+
+:File: ``nrseqstate`` (ro)
+:Trace Registers: From IDR5
+:Notes:
+ Number of sequencer states (0 or 4)
+
+----
+
+:File: ``nr_pe_cmp`` (ro)
+:Trace Registers: From IDR4
+:Notes:
+ Number of PE comparator inputs
+
+----
+
+:File: ``nr_ext_inp`` (ro)
+:Trace Registers: From IDR5
+:Notes:
+ Number of external inputs
+
+----
+
+:File: ``nr_ss_cmp`` (ro)
+:Trace Registers: From IDR4
+:Notes:
+ Number of Single Shot control registers
+
+----
+
+*Note:* When programming any address comparator the driver will tag the
+comparator with the type used - i.e. RANGE, SINGLE, START, STOP. Once this tag
+is set, the values can only be changed using the same sysfs file / type
+that was used to program it.
+
+Thus::
+
+ % echo 0 > addr_idx ; select address comparator 0
+ % echo 0x1000 0x5000 0 > addr_range ; set address range on comparators 0, 1.
+ % echo 0x2000 > addr_start ; error as comparator 0 is a range comparator
+ % echo 2 > addr_idx ; select address comparator 2
+ % echo 0x2000 > addr_start ; this is OK as comparator 2 is unused.
+ % echo 0x3000 > addr_stop ; error as comparator 2 set as start address.
+ % echo 3 > addr_idx ; select address comparator 3
+ % echo 0x3000 > addr_stop ; this is OK
+
+To remove programming on all the comparators (and all the other hardware) use
+the reset parameter::
+
+ % echo 1 > reset
+
+
+
+The ‘mode’ sysfs parameter.
+---------------------------
+
+This is a bitfield selection parameter that sets the overall trace mode for the
+ETM. The table below describes the bits, using the defines from the driver
+source file, along with a description of the feature these represent. Many
+features are optional and therefore dependent on implementation in the
+hardware.
+
+Bit assignments shown below:-
+
+----
+
+**bit (0):**
+ ETM_MODE_EXCLUDE
+
+**description:**
+ This is the default value for the include / exclude function when
+ setting address ranges. Set 1 for exclude range. When the mode
+ parameter is set this value is applied to the currently indexed
+ address range.
+
+
+**bit (4):**
+ ETM_MODE_BB
+
+**description:**
+ Set to enable branch broadcast if supported in hardware [IDR0].
+
+
+**bit (5):**
+ ETMv4_MODE_CYCACC
+
+**description:**
+ Set to enable cycle accurate trace if supported [IDR0].
+
+
+**bit (6):**
+ ETMv4_MODE_CTXID
+
+**description:**
+ Set to enable context ID tracing if supported in hardware [IDR2].
+
+
+**bit (7):**
+ ETM_MODE_VMID
+
+**description:**
+ Set to enable virtual machine ID tracing if supported [IDR2].
+
+
+**bit (11):**
+ ETMv4_MODE_TIMESTAMP
+
+**description:**
+ Set to enable timestamp generation if supported [IDR0].
+
+
+**bit (12):**
+ ETM_MODE_RETURNSTACK
+
+**description:**
+ Set to enable trace return stack use if supported [IDR0].
+
+
+**bit (13-14):**
+ ETM_MODE_QELEM(val)
+
+**description:**
+ ‘val’ determines level of Q element support enabled if
+ implemented by the ETM [IDR0]
+
+
+**bit (19):**
+ ETM_MODE_ATB_TRIGGER
+
+**description:**
+ Set to enable the ATBTRIGGER bit in the event control register
+ [EVENTCTL1R] if supported [IDR5].
+
+
+**bit (20):**
+ ETM_MODE_LPOVERRIDE
+
+**description:**
+ Set to enable the LPOVERRIDE bit in the event control register
+ [EVENTCTL1R], if supported [IDR5].
+
+
+**bit (21):**
+ ETM_MODE_ISTALL_EN
+
+**description:**
+ Set to enable the ISTALL bit in the stall control register
+ [STALLCTLR]
+
+
+**bit (23):**
+ ETM_MODE_INSTPRIO
+
+**description:**
+ Set to enable the INSTPRIORITY bit in the stall control register
+ [STALLCTLR] , if supported [IDR0].
+
+
+**bit (24):**
+ ETM_MODE_NOOVERFLOW
+
+**description:**
+ Set to enable the NOOVERFLOW bit in the stall control register
+ [STALLCTLR], if supported [IDR3].
+
+
+**bit (25):**
+ ETM_MODE_TRACE_RESET
+
+**description:**
+ Set to enable the TRCRESET bit in the viewinst control register
+ [VICTLR] , if supported [IDR3].
+
+
+**bit (26):**
+ ETM_MODE_TRACE_ERR
+
+**description:**
+ Set to enable the TRCERR bit in the viewinst control register
+ [VICTLR].
+
+
+**bit (27):**
+ ETM_MODE_VIEWINST_STARTSTOP
+
+**description:**
+ Set the initial state value of the ViewInst start / stop logic
+ in the viewinst control register [VICTLR]
+
+
+**bit (30):**
+ ETM_MODE_EXCL_KERN
+
+**description:**
+ Set default trace setup to exclude kernel mode trace (see note a)
+
+
+**bit (31):**
+ ETM_MODE_EXCL_USER
+
+**description:**
+ Set default trace setup to exclude user space trace (see note a)
+
+----
+
+*Note a)* On startup the ETM is programmed to trace the complete address space
+using address range comparator 0. ‘mode’ bits 30 / 31 modify this setting to
+set EL exclude bits for NS state in either user space (EL0) or kernel space
+(EL1) in the address range comparator. (the default setting excludes all
+secure EL, and NS EL2)
+
+Once the reset parameter has been used, and/or custom programming has been
+implemented - using these bits will result in the EL bits for address
+comparator 0 being set in the same way.
+
+*Note b)* Bits 2-3, 8-10, 15-16, 18, 22 control features that only work with
+data trace. As A-profile data trace is architecturally prohibited in ETMv4,
+these have been omitted here. Possible uses could be where a kernel has
+support for control of R or M profile infrastructure as part of a heterogeneous
+system.
+
+Bits 17, 28-29 are unused.
diff --git a/Documentation/trace/coresight.txt b/Documentation/trace/coresight/coresight.rst
index b027d61b27a6..a566719f8e7e 100644
--- a/Documentation/trace/coresight.txt
+++ b/Documentation/trace/coresight/coresight.rst
@@ -1,8 +1,9 @@
- Coresight - HW Assisted Tracing on ARM
- ======================================
+======================================
+Coresight - HW Assisted Tracing on ARM
+======================================
- Author: Mathieu Poirier <mathieu.poirier@linaro.org>
- Date: September 11th, 2014
+ :Author: Mathieu Poirier <mathieu.poirier@linaro.org>
+ :Date: September 11th, 2014
Introduction
------------
@@ -26,7 +27,7 @@ implementation, either storing the compressed stream in a memory buffer or
creating an interface to the outside world where data can be transferred to a
host without fear of filling up the onboard coresight memory buffer.
-At typical coresight system would look like this:
+A typical coresight system would look like this::
*****************************************************************
**************************** AMBA AXI ****************************===||
@@ -95,15 +96,24 @@ Acronyms and Classification
Acronyms:
-PTM: Program Trace Macrocell
-ETM: Embedded Trace Macrocell
-STM: System trace Macrocell
-ETB: Embedded Trace Buffer
-ITM: Instrumentation Trace Macrocell
-TPIU: Trace Port Interface Unit
-TMC-ETR: Trace Memory Controller, configured as Embedded Trace Router
-TMC-ETF: Trace Memory Controller, configured as Embedded Trace FIFO
-CTI: Cross Trigger Interface
+PTM:
+ Program Trace Macrocell
+ETM:
+ Embedded Trace Macrocell
+STM:
+ System trace Macrocell
+ETB:
+ Embedded Trace Buffer
+ITM:
+ Instrumentation Trace Macrocell
+TPIU:
+ Trace Port Interface Unit
+TMC-ETR:
+ Trace Memory Controller, configured as Embedded Trace Router
+TMC-ETF:
+ Trace Memory Controller, configured as Embedded Trace FIFO
+CTI:
+ Cross Trigger Interface
Classification:
@@ -118,7 +128,7 @@ Misc:
Device Tree Bindings
-----------------------
+--------------------
See Documentation/devicetree/bindings/arm/coresight.txt for details.
@@ -133,79 +143,79 @@ The coresight framework provides a central point to represent, configure and
manage coresight devices on a platform. Any coresight compliant device can
register with the framework for as long as they use the right APIs:
-struct coresight_device *coresight_register(struct coresight_desc *desc);
-void coresight_unregister(struct coresight_device *csdev);
+.. c:function:: struct coresight_device *coresight_register(struct coresight_desc *desc);
+.. c:function:: void coresight_unregister(struct coresight_device *csdev);
-The registering function is taking a "struct coresight_device *csdev" and
-register the device with the core framework. The unregister function takes
-a reference to a "struct coresight_device", obtained at registration time.
+The registering function takes a ``struct coresight_desc *desc`` and
+registers the device with the core framework. The unregister function takes
+a reference to a ``struct coresight_device *csdev`` obtained at registration time.
If everything goes well during the registration process the new devices will
-show up under /sys/bus/coresight/devices, as showns here for a TC2 platform:
+show up under /sys/bus/coresight/devices, as shown here for a TC2 platform::
-root:~# ls /sys/bus/coresight/devices/
-replicator 20030000.tpiu 2201c000.ptm 2203c000.etm 2203e000.etm
-20010000.etb 20040000.funnel 2201d000.ptm 2203d000.etm
-root:~#
+ root:~# ls /sys/bus/coresight/devices/
+ replicator 20030000.tpiu 2201c000.ptm 2203c000.etm 2203e000.etm
+ 20010000.etb 20040000.funnel 2201d000.ptm 2203d000.etm
+ root:~#
-The functions take a "struct coresight_device", which looks like this:
+The registration function takes a ``struct coresight_desc``, which looks like this::
-struct coresight_desc {
- enum coresight_dev_type type;
- struct coresight_dev_subtype subtype;
- const struct coresight_ops *ops;
- struct coresight_platform_data *pdata;
- struct device *dev;
- const struct attribute_group **groups;
-};
+ struct coresight_desc {
+ enum coresight_dev_type type;
+ struct coresight_dev_subtype subtype;
+ const struct coresight_ops *ops;
+ struct coresight_platform_data *pdata;
+ struct device *dev;
+ const struct attribute_group **groups;
+ };
The "coresight_dev_type" identifies what the device is, i.e, source link or
sink while the "coresight_dev_subtype" will characterise that type further.
-The "struct coresight_ops" is mandatory and will tell the framework how to
+The ``struct coresight_ops`` is mandatory and will tell the framework how to
perform base operations related to the components, each component having
-a different set of requirement. For that "struct coresight_ops_sink",
-"struct coresight_ops_link" and "struct coresight_ops_source" have been
+a different set of requirements. For that ``struct coresight_ops_sink``,
+``struct coresight_ops_link`` and ``struct coresight_ops_source`` have been
provided.
-The next field, "struct coresight_platform_data *pdata" is acquired by calling
-"of_get_coresight_platform_data()", as part of the driver's _probe routine and
-"struct device *dev" gets the device reference embedded in the "amba_device":
+The next field ``struct coresight_platform_data *pdata`` is acquired by calling
+``of_get_coresight_platform_data()``, as part of the driver's _probe routine and
+``struct device *dev`` gets the device reference embedded in the ``amba_device``::
-static int etm_probe(struct amba_device *adev, const struct amba_id *id)
-{
- ...
- ...
- drvdata->dev = &adev->dev;
- ...
-}
+ static int etm_probe(struct amba_device *adev, const struct amba_id *id)
+ {
+ ...
+ ...
+ drvdata->dev = &adev->dev;
+ ...
+ }
Specific class of device (source, link, or sink) have generic operations
-that can be performed on them (see "struct coresight_ops"). The
-"**groups" is a list of sysfs entries pertaining to operations
+that can be performed on them (see ``struct coresight_ops``). The ``**groups``
+is a list of sysfs entries pertaining to operations
specific to that component only. "Implementation defined" customisations are
expected to be accessed and controlled using those entries.
-
Device Naming scheme
-------------------------
+--------------------
+
The devices that appear on the "coresight" bus were named the same as their
parent devices, i.e, the real devices that appears on AMBA bus or the platform bus.
Thus the names were based on the Linux Open Firmware layer naming convention,
which follows the base physical address of the device followed by the device
-type. e.g:
+type. e.g::
-root:~# ls /sys/bus/coresight/devices/
- 20010000.etf 20040000.funnel 20100000.stm 22040000.etm
- 22140000.etm 230c0000.funnel 23240000.etm 20030000.tpiu
- 20070000.etr 20120000.replicator 220c0000.funnel
- 23040000.etm 23140000.etm 23340000.etm
+ root:~# ls /sys/bus/coresight/devices/
+ 20010000.etf 20040000.funnel 20100000.stm 22040000.etm
+ 22140000.etm 230c0000.funnel 23240000.etm 20030000.tpiu
+ 20070000.etr 20120000.replicator 220c0000.funnel
+ 23040000.etm 23140000.etm 23340000.etm
However, with the introduction of ACPI support, the names of the real
devices are a bit cryptic and non-obvious. Thus, a new naming scheme was
introduced to use more generic names based on the type of the device. The
-following rules apply:
+following rules apply::
1) Devices that are bound to CPUs, are named based on the CPU logical
number.
@@ -220,11 +230,11 @@ following rules apply:
e.g, tmc_etf0, tmc_etr0, funnel0, funnel1
-Thus, with the new scheme the devices could appear as :
+Thus, with the new scheme the devices could appear as ::
-root:~# ls /sys/bus/coresight/devices/
- etm0 etm1 etm2 etm3 etm4 etm5 funnel0
- funnel1 funnel2 replicator0 stm0 tmc_etf0 tmc_etr0 tpiu0
+ root:~# ls /sys/bus/coresight/devices/
+ etm0 etm1 etm2 etm3 etm4 etm5 funnel0
+ funnel1 funnel2 replicator0 stm0 tmc_etf0 tmc_etr0 tpiu0
Some of the examples below might refer to old naming scheme and some
to the newer scheme, to give a confirmation that what you see on your
@@ -234,9 +244,12 @@ the system under specified locations.
How to use the tracer modules
-----------------------------
-There are two ways to use the Coresight framework: 1) using the perf cmd line
-tools and 2) interacting directly with the Coresight devices using the sysFS
-interface. Preference is given to the former as using the sysFS interface
+There are two ways to use the Coresight framework:
+
+1. using the perf cmd line tools.
+2. interacting directly with the Coresight devices using the sysFS interface.
+
+Preference is given to the former as using the sysFS interface
requires a deep understanding of the Coresight HW. The following sections
provide details on using both methods.
@@ -245,107 +258,107 @@ provide details on using both methods.
Before trace collection can start, a coresight sink needs to be identified.
There is no limit on the amount of sinks (nor sources) that can be enabled at
any given moment. As a generic operation, all device pertaining to the sink
-class will have an "active" entry in sysfs:
-
-root:/sys/bus/coresight/devices# ls
-replicator 20030000.tpiu 2201c000.ptm 2203c000.etm 2203e000.etm
-20010000.etb 20040000.funnel 2201d000.ptm 2203d000.etm
-root:/sys/bus/coresight/devices# ls 20010000.etb
-enable_sink status trigger_cntr
-root:/sys/bus/coresight/devices# echo 1 > 20010000.etb/enable_sink
-root:/sys/bus/coresight/devices# cat 20010000.etb/enable_sink
-1
-root:/sys/bus/coresight/devices#
+class will have an "enable_sink" entry in sysfs::
+
+ root:/sys/bus/coresight/devices# ls
+ replicator 20030000.tpiu 2201c000.ptm 2203c000.etm 2203e000.etm
+ 20010000.etb 20040000.funnel 2201d000.ptm 2203d000.etm
+ root:/sys/bus/coresight/devices# ls 20010000.etb
+ enable_sink status trigger_cntr
+ root:/sys/bus/coresight/devices# echo 1 > 20010000.etb/enable_sink
+ root:/sys/bus/coresight/devices# cat 20010000.etb/enable_sink
+ 1
+ root:/sys/bus/coresight/devices#
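
The sysfs steps above can also be driven from a small user-space program. The
following is a minimal sketch, not part of the CoreSight framework:
``write_attr()`` and ``read_attr()`` are hypothetical helper names, and any
device path passed to them (for example the TC2 ``20010000.etb/enable_sink``
entry shown above) is platform specific.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write a short string to a sysfs-style attribute file.
 * Returns 0 on success, -1 if the file cannot be opened or written.
 */
static int write_attr(const char *path, const char *value)
{
	FILE *f;

	assert(path && value);
	f = fopen(path, "w");
	if (!f)
		return -1;
	if (fputs(value, f) == EOF) {
		fclose(f);
		return -1;
	}
	/* fclose() flushes the buffer, so a late write error shows up here */
	return fclose(f) == 0 ? 0 : -1;
}

/*
 * Read the first line of an attribute file into buf, without the
 * trailing newline. Returns 0 on success, -1 on failure.
 */
static int read_attr(const char *path, char *buf, size_t len)
{
	FILE *f;

	assert(path && buf && len > 1);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}
```

Calling ``write_attr(".../20010000.etb/enable_sink", "1")`` followed by
``read_attr()`` on the same path mirrors the ``echo`` and ``cat`` commands
above; writing ``"0"`` disables the sink again.
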
At boot time the current etm3x driver will configure the first address
comparator with "_stext" and "_etext", essentially tracing any instruction
that falls within that range. As such "enabling" a source will immediately
-trigger a trace capture:
-
-root:/sys/bus/coresight/devices# echo 1 > 2201c000.ptm/enable_source
-root:/sys/bus/coresight/devices# cat 2201c000.ptm/enable_source
-1
-root:/sys/bus/coresight/devices# cat 20010000.etb/status
-Depth: 0x2000
-Status: 0x1
-RAM read ptr: 0x0
-RAM wrt ptr: 0x19d3 <----- The write pointer is moving
-Trigger cnt: 0x0
-Control: 0x1
-Flush status: 0x0
-Flush ctrl: 0x2001
-root:/sys/bus/coresight/devices#
-
-Trace collection is stopped the same way:
-
-root:/sys/bus/coresight/devices# echo 0 > 2201c000.ptm/enable_source
-root:/sys/bus/coresight/devices#
-
-The content of the ETB buffer can be harvested directly from /dev:
-
-root:/sys/bus/coresight/devices# dd if=/dev/20010000.etb \
-of=~/cstrace.bin
-
-64+0 records in
-64+0 records out
-32768 bytes (33 kB) copied, 0.00125258 s, 26.2 MB/s
-root:/sys/bus/coresight/devices#
+trigger a trace capture::
+
+ root:/sys/bus/coresight/devices# echo 1 > 2201c000.ptm/enable_source
+ root:/sys/bus/coresight/devices# cat 2201c000.ptm/enable_source
+ 1
+ root:/sys/bus/coresight/devices# cat 20010000.etb/status
+ Depth: 0x2000
+ Status: 0x1
+ RAM read ptr: 0x0
+ RAM wrt ptr: 0x19d3 <----- The write pointer is moving
+ Trigger cnt: 0x0
+ Control: 0x1
+ Flush status: 0x0
+ Flush ctrl: 0x2001
+ root:/sys/bus/coresight/devices#
+
+Trace collection is stopped the same way::
+
+ root:/sys/bus/coresight/devices# echo 0 > 2201c000.ptm/enable_source
+ root:/sys/bus/coresight/devices#
+
+The content of the ETB buffer can be harvested directly from /dev::
+
+ root:/sys/bus/coresight/devices# dd if=/dev/20010000.etb \
+ of=~/cstrace.bin
+ 64+0 records in
+ 64+0 records out
+ 32768 bytes (33 kB) copied, 0.00125258 s, 26.2 MB/s
+ root:/sys/bus/coresight/devices#
The file cstrace.bin can be decompressed using "ptm2human", DS-5 or Trace32.
Following is a DS-5 output of an experimental loop that increments a variable up
to a certain value. The example is simple and yet provides a glimpse of the
wealth of possibilities that coresight provides.
-
-Info Tracing enabled
-Instruction 106378866 0x8026B53C E52DE004 false PUSH {lr}
-Instruction 0 0x8026B540 E24DD00C false SUB sp,sp,#0xc
-Instruction 0 0x8026B544 E3A03000 false MOV r3,#0
-Instruction 0 0x8026B548 E58D3004 false STR r3,[sp,#4]
-Instruction 0 0x8026B54C E59D3004 false LDR r3,[sp,#4]
-Instruction 0 0x8026B550 E3530004 false CMP r3,#4
-Instruction 0 0x8026B554 E2833001 false ADD r3,r3,#1
-Instruction 0 0x8026B558 E58D3004 false STR r3,[sp,#4]
-Instruction 0 0x8026B55C DAFFFFFA true BLE {pc}-0x10 ; 0x8026b54c
-Timestamp Timestamp: 17106715833
-Instruction 319 0x8026B54C E59D3004 false LDR r3,[sp,#4]
-Instruction 0 0x8026B550 E3530004 false CMP r3,#4
-Instruction 0 0x8026B554 E2833001 false ADD r3,r3,#1
-Instruction 0 0x8026B558 E58D3004 false STR r3,[sp,#4]
-Instruction 0 0x8026B55C DAFFFFFA true BLE {pc}-0x10 ; 0x8026b54c
-Instruction 9 0x8026B54C E59D3004 false LDR r3,[sp,#4]
-Instruction 0 0x8026B550 E3530004 false CMP r3,#4
-Instruction 0 0x8026B554 E2833001 false ADD r3,r3,#1
-Instruction 0 0x8026B558 E58D3004 false STR r3,[sp,#4]
-Instruction 0 0x8026B55C DAFFFFFA true BLE {pc}-0x10 ; 0x8026b54c
-Instruction 7 0x8026B54C E59D3004 false LDR r3,[sp,#4]
-Instruction 0 0x8026B550 E3530004 false CMP r3,#4
-Instruction 0 0x8026B554 E2833001 false ADD r3,r3,#1
-Instruction 0 0x8026B558 E58D3004 false STR r3,[sp,#4]
-Instruction 0 0x8026B55C DAFFFFFA true BLE {pc}-0x10 ; 0x8026b54c
-Instruction 7 0x8026B54C E59D3004 false LDR r3,[sp,#4]
-Instruction 0 0x8026B550 E3530004 false CMP r3,#4
-Instruction 0 0x8026B554 E2833001 false ADD r3,r3,#1
-Instruction 0 0x8026B558 E58D3004 false STR r3,[sp,#4]
-Instruction 0 0x8026B55C DAFFFFFA true BLE {pc}-0x10 ; 0x8026b54c
-Instruction 10 0x8026B54C E59D3004 false LDR r3,[sp,#4]
-Instruction 0 0x8026B550 E3530004 false CMP r3,#4
-Instruction 0 0x8026B554 E2833001 false ADD r3,r3,#1
-Instruction 0 0x8026B558 E58D3004 false STR r3,[sp,#4]
-Instruction 0 0x8026B55C DAFFFFFA true BLE {pc}-0x10 ; 0x8026b54c
-Instruction 6 0x8026B560 EE1D3F30 false MRC p15,#0x0,r3,c13,c0,#1
-Instruction 0 0x8026B564 E1A0100D false MOV r1,sp
-Instruction 0 0x8026B568 E3C12D7F false BIC r2,r1,#0x1fc0
-Instruction 0 0x8026B56C E3C2203F false BIC r2,r2,#0x3f
-Instruction 0 0x8026B570 E59D1004 false LDR r1,[sp,#4]
-Instruction 0 0x8026B574 E59F0010 false LDR r0,[pc,#16] ; [0x8026B58C] = 0x80550368
-Instruction 0 0x8026B578 E592200C false LDR r2,[r2,#0xc]
-Instruction 0 0x8026B57C E59221D0 false LDR r2,[r2,#0x1d0]
-Instruction 0 0x8026B580 EB07A4CF true BL {pc}+0x1e9344 ; 0x804548c4
-Info Tracing enabled
-Instruction 13570831 0x8026B584 E28DD00C false ADD sp,sp,#0xc
-Instruction 0 0x8026B588 E8BD8000 true LDM sp!,{pc}
-Timestamp Timestamp: 17107041535
+::
+
+ Info Tracing enabled
+ Instruction 106378866 0x8026B53C E52DE004 false PUSH {lr}
+ Instruction 0 0x8026B540 E24DD00C false SUB sp,sp,#0xc
+ Instruction 0 0x8026B544 E3A03000 false MOV r3,#0
+ Instruction 0 0x8026B548 E58D3004 false STR r3,[sp,#4]
+ Instruction 0 0x8026B54C E59D3004 false LDR r3,[sp,#4]
+ Instruction 0 0x8026B550 E3530004 false CMP r3,#4
+ Instruction 0 0x8026B554 E2833001 false ADD r3,r3,#1
+ Instruction 0 0x8026B558 E58D3004 false STR r3,[sp,#4]
+ Instruction 0 0x8026B55C DAFFFFFA true BLE {pc}-0x10 ; 0x8026b54c
+ Timestamp Timestamp: 17106715833
+ Instruction 319 0x8026B54C E59D3004 false LDR r3,[sp,#4]
+ Instruction 0 0x8026B550 E3530004 false CMP r3,#4
+ Instruction 0 0x8026B554 E2833001 false ADD r3,r3,#1
+ Instruction 0 0x8026B558 E58D3004 false STR r3,[sp,#4]
+ Instruction 0 0x8026B55C DAFFFFFA true BLE {pc}-0x10 ; 0x8026b54c
+ Instruction 9 0x8026B54C E59D3004 false LDR r3,[sp,#4]
+ Instruction 0 0x8026B550 E3530004 false CMP r3,#4
+ Instruction 0 0x8026B554 E2833001 false ADD r3,r3,#1
+ Instruction 0 0x8026B558 E58D3004 false STR r3,[sp,#4]
+ Instruction 0 0x8026B55C DAFFFFFA true BLE {pc}-0x10 ; 0x8026b54c
+ Instruction 7 0x8026B54C E59D3004 false LDR r3,[sp,#4]
+ Instruction 0 0x8026B550 E3530004 false CMP r3,#4
+ Instruction 0 0x8026B554 E2833001 false ADD r3,r3,#1
+ Instruction 0 0x8026B558 E58D3004 false STR r3,[sp,#4]
+ Instruction 0 0x8026B55C DAFFFFFA true BLE {pc}-0x10 ; 0x8026b54c
+ Instruction 7 0x8026B54C E59D3004 false LDR r3,[sp,#4]
+ Instruction 0 0x8026B550 E3530004 false CMP r3,#4
+ Instruction 0 0x8026B554 E2833001 false ADD r3,r3,#1
+ Instruction 0 0x8026B558 E58D3004 false STR r3,[sp,#4]
+ Instruction 0 0x8026B55C DAFFFFFA true BLE {pc}-0x10 ; 0x8026b54c
+ Instruction 10 0x8026B54C E59D3004 false LDR r3,[sp,#4]
+ Instruction 0 0x8026B550 E3530004 false CMP r3,#4
+ Instruction 0 0x8026B554 E2833001 false ADD r3,r3,#1
+ Instruction 0 0x8026B558 E58D3004 false STR r3,[sp,#4]
+ Instruction 0 0x8026B55C DAFFFFFA true BLE {pc}-0x10 ; 0x8026b54c
+ Instruction 6 0x8026B560 EE1D3F30 false MRC p15,#0x0,r3,c13,c0,#1
+ Instruction 0 0x8026B564 E1A0100D false MOV r1,sp
+ Instruction 0 0x8026B568 E3C12D7F false BIC r2,r1,#0x1fc0
+ Instruction 0 0x8026B56C E3C2203F false BIC r2,r2,#0x3f
+ Instruction 0 0x8026B570 E59D1004 false LDR r1,[sp,#4]
+ Instruction 0 0x8026B574 E59F0010 false LDR r0,[pc,#16] ; [0x8026B58C] = 0x80550368
+ Instruction 0 0x8026B578 E592200C false LDR r2,[r2,#0xc]
+ Instruction 0 0x8026B57C E59221D0 false LDR r2,[r2,#0x1d0]
+ Instruction 0 0x8026B580 EB07A4CF true BL {pc}+0x1e9344 ; 0x804548c4
+ Info Tracing enabled
+ Instruction 13570831 0x8026B584 E28DD00C false ADD sp,sp,#0xc
+ Instruction 0 0x8026B588 E8BD8000 true LDM sp!,{pc}
+ Timestamp Timestamp: 17107041535
2) Using perf framework:
@@ -370,19 +383,18 @@ A Coresight PMU works the same way as any other PMU, i.e the name of the PMU is
listed along with configuration options within forward slashes '/'. Since a
Coresight system will typically have more than one sink, the name of the sink to
work with needs to be specified as an event option.
-On newer kernels the available sinks are listed in sysFS under:
-($SYSFS)/bus/event_source/devices/cs_etm/sinks/
+On newer kernels the available sinks are listed in sysFS under
+($SYSFS)/bus/event_source/devices/cs_etm/sinks/::
root@localhost:/sys/bus/event_source/devices/cs_etm/sinks# ls
tmc_etf0 tmc_etr0 tpiu0
On older kernels, this may need to be found from the list of coresight devices,
-available under ($SYSFS)/bus/coresight/devices/:
+available under ($SYSFS)/bus/coresight/devices/::
root:~# ls /sys/bus/coresight/devices/
etm0 etm1 etm2 etm3 etm4 etm5 funnel0
funnel1 funnel2 replicator0 stm0 tmc_etf0 tmc_etr0 tpiu0
-
root@linaro-nano:~# perf record -e cs_etm/@tmc_etr0/u --per-thread program
As mentioned above in section "Device Naming scheme", the names of the devices could
@@ -395,14 +407,14 @@ to use for the trace session.
More information on the above and other example on how to use Coresight with
the perf tools can be found in the "HOWTO.md" file of the openCSD gitHub
-repository [3].
+repository [#third]_.
2.1) AutoFDO analysis using the perf tools:
perf can be used to record and analyze trace of programs.
Execution can be recorded using 'perf record' with the cs_etm event,
-specifying the name of the sink to record to, e.g:
+specifying the name of the sink to record to, e.g::
perf record -e cs_etm/@tmc_etr0/u --per-thread
@@ -421,12 +433,14 @@ Generating coverage files for Feedback Directed Optimization: AutoFDO
'perf inject' accepts the --itrace option in which case tracing data is
removed and replaced with the synthesized events. e.g.
+::
perf inject --itrace --strip -i perf.data -o perf.data.new
Below is an example of using ARM ETM for autoFDO. It requires autofdo
(https://github.com/google/autofdo) and gcc version 5. The bubble
sort example is from the AutoFDO tutorial (https://gcc.gnu.org/wiki/AutoFDO/Tutorial).
+::
$ gcc-5 -O3 sort.c -o sort
$ taskset -c 2 ./sort
@@ -455,28 +469,30 @@ difference is that clients are driving the trace capture rather
than the program flow through the code.
As with any other CoreSight component, specifics about the STM tracer can be
-found in sysfs with more information on each entry being found in [1]:
+found in sysfs with more information on each entry being found in [#first]_::
-root@genericarmv8:~# ls /sys/bus/coresight/devices/stm0
-enable_source hwevent_select port_enable subsystem uevent
-hwevent_enable mgmt port_select traceid
-root@genericarmv8:~#
+ root@genericarmv8:~# ls /sys/bus/coresight/devices/stm0
+ enable_source hwevent_select port_enable subsystem uevent
+ hwevent_enable mgmt port_select traceid
+ root@genericarmv8:~#
Like any other source a sink needs to be identified and the STM enabled before
-being used:
+being used::
-root@genericarmv8:~# echo 1 > /sys/bus/coresight/devices/tmc_etf0/enable_sink
-root@genericarmv8:~# echo 1 > /sys/bus/coresight/devices/stm0/enable_source
+ root@genericarmv8:~# echo 1 > /sys/bus/coresight/devices/tmc_etf0/enable_sink
+ root@genericarmv8:~# echo 1 > /sys/bus/coresight/devices/stm0/enable_source
From there user space applications can request and use channels using the devfs
-interface provided for that purpose by the generic STM API:
+interface provided for that purpose by the generic STM API::
+
+ root@genericarmv8:~# ls -l /dev/stm0
+ crw------- 1 root root 10, 61 Jan 3 18:11 /dev/stm0
+ root@genericarmv8:~#
+
+Details on how to use the generic STM API can be found in :doc:`../stm` [#second]_.
-root@genericarmv8:~# ls -l /dev/stm0
-crw------- 1 root root 10, 61 Jan 3 18:11 /dev/stm0
-root@genericarmv8:~#
+.. [#first] Documentation/ABI/testing/sysfs-bus-coresight-devices-stm
-Details on how to use the generic STM API can be found here [2].
+.. [#second] Documentation/trace/stm.rst
-[1]. Documentation/ABI/testing/sysfs-bus-coresight-devices-stm
-[2]. Documentation/trace/stm.rst
-[3]. https://github.com/Linaro/perf-opencsd
+.. [#third] https://github.com/Linaro/perf-opencsd
diff --git a/Documentation/trace/coresight/index.rst b/Documentation/trace/coresight/index.rst
new file mode 100644
index 000000000000..8d31b155a87c
--- /dev/null
+++ b/Documentation/trace/coresight/index.rst
@@ -0,0 +1,9 @@
+==============================
+CoreSight - ARM Hardware Trace
+==============================
+
+.. toctree::
+ :maxdepth: 2
+ :glob:
+
+ *
diff --git a/Documentation/trace/events.rst b/Documentation/trace/events.rst
index f7e1fcc0953c..ed79b220bd07 100644
--- a/Documentation/trace/events.rst
+++ b/Documentation/trace/events.rst
@@ -525,3 +525,518 @@ The following commands are supported:
event counts (hitcount).
See Documentation/trace/histogram.rst for details and examples.
+
+6.3 In-kernel trace event API
+-----------------------------
+
+In most cases, the command-line interface to trace events is more than
+sufficient. Sometimes, however, applications might find the need for
+more complex relationships than can be expressed through a simple
+series of linked command-line expressions, or putting together sets of
+commands may be simply too cumbersome. An example might be an
+application that needs to 'listen' to the trace stream in order to
+maintain an in-kernel state machine detecting, for instance, when an
+illegal kernel state occurs in the scheduler.
+
+The trace event subsystem provides an in-kernel API allowing modules
+or other kernel code to generate user-defined 'synthetic' events at
+will, which can be used to either augment the existing trace stream
+and/or signal that a particular important state has occurred.
+
+A similar in-kernel API is also available for creating kprobe and
+kretprobe events.
+
+Both the synthetic event and k/ret/probe event APIs are built on top
+of a lower-level "dynevent_cmd" event command API, which is also
+available for more specialized applications, or as the basis of other
+higher-level trace event APIs.
+
+The API provided for these purposes is described below and allows the
+following:
+
+ - dynamically creating synthetic event definitions
+ - dynamically creating kprobe and kretprobe event definitions
+ - tracing synthetic events from in-kernel code
+ - the low-level "dynevent_cmd" API
+
+6.3.1 Dynamically creating synthetic event definitions
+------------------------------------------------------
+
+There are a couple of ways to create a new synthetic event from a kernel
+module or other kernel code.
+
+The first creates the event in one step, using synth_event_create().
+In this method, the name of the event to create and an array defining
+the fields are supplied to synth_event_create(). If successful, a
+synthetic event with that name and fields will exist following that
+call. For example, to create a new "schedtest" synthetic event::
+
+ ret = synth_event_create("schedtest", sched_fields,
+ ARRAY_SIZE(sched_fields), THIS_MODULE);
+
+The sched_fields param in this example points to an array of struct
+synth_field_desc, each of which describes an event field by type and
+name::
+
+ static struct synth_field_desc sched_fields[] = {
+ { .type = "pid_t", .name = "next_pid_field" },
+ { .type = "char[16]", .name = "next_comm_field" },
+ { .type = "u64", .name = "ts_ns" },
+ { .type = "u64", .name = "ts_ms" },
+ { .type = "unsigned int", .name = "cpu" },
+ { .type = "char[64]", .name = "my_string_field" },
+ { .type = "int", .name = "my_int_field" },
+ };
+
+See synth_field_size() for available types. If field_name contains [n]
+the field is considered to be an array.
+
+If the event is created from within a module, a pointer to the module
+must be passed to synth_event_create(). This will ensure that the
+trace buffer won't contain unreadable events when the module is
+removed.
+
+At this point, the event object is ready to be used for generating new
+events.
+
+In the second method, the event is created in several steps. This
+allows events to be created dynamically and without the need to create
+and populate an array of fields beforehand.
+
+To use this method, an empty or partially empty synthetic event should
+first be created using synth_event_gen_cmd_start() or
+synth_event_gen_cmd_array_start(). For synth_event_gen_cmd_start(),
+the name of the event along with one or more pairs of args each pair
+representing a 'type field_name;' field specification should be
+supplied. For synth_event_gen_cmd_array_start(), the name of the
+event along with an array of struct synth_field_desc should be
+supplied. Before calling synth_event_gen_cmd_start() or
+synth_event_gen_cmd_array_start(), the user should create and
+initialize a dynevent_cmd object using synth_event_cmd_init().
+
+For example, to create a new "schedtest" synthetic event with two
+fields::
+
+ struct dynevent_cmd cmd;
+ char *buf;
+
+ /* Create a buffer to hold the generated command */
+ buf = kzalloc(MAX_DYNEVENT_CMD_LEN, GFP_KERNEL);
+
+ /* Before generating the command, initialize the cmd object */
+ synth_event_cmd_init(&cmd, buf, MAX_DYNEVENT_CMD_LEN);
+
+ ret = synth_event_gen_cmd_start(&cmd, "schedtest", THIS_MODULE,
+ "pid_t", "next_pid_field",
+ "u64", "ts_ns");
+
+Alternatively, using an array of struct synth_field_desc fields
+containing the same information::
+
+ ret = synth_event_gen_cmd_array_start(&cmd, "schedtest", THIS_MODULE,
+ fields, n_fields);
+
+Once the synthetic event object has been created, it can then be
+populated with more fields. Fields are added one by one using
+synth_event_add_field(), supplying the dynevent_cmd object, a field
+type, and a field name. For example, to add a new int field named
+"intfield", the following call should be made::
+
+ ret = synth_event_add_field(&cmd, "int", "intfield");
+
+See synth_field_size() for available types. If field_name contains [n]
+the field is considered to be an array.
+
+A group of fields can also be added all at once using an array of
+synth_field_desc with synth_event_add_fields(). For example, this
+would add just the first four sched_fields::
+
+ ret = synth_event_add_fields(&cmd, sched_fields, 4);
+
+If you already have a string of the form 'type field_name',
+synth_event_add_field_str() can be used to add it as-is; it will
+also automatically append a ';' to the string.
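
As a rough illustration of the 'type field_name;' specifications mentioned
above, the following user-space sketch assembles such a string in a
caller-supplied buffer. It is only a model under stated assumptions, not the
kernel's dynevent_cmd implementation: ``struct cmd_buf``, ``cmd_init()``,
``cmd_start()`` and ``cmd_add_field()`` are hypothetical names, and the exact
spacing of the generated text is chosen for readability.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a command object backed by a caller's buffer. */
struct cmd_buf {
	char *buf;	/* caller-supplied storage */
	size_t cap;	/* total size of buf in bytes */
	size_t len;	/* bytes used so far, excluding the NUL */
};

static void cmd_init(struct cmd_buf *c, char *buf, size_t cap)
{
	assert(c && buf && cap > 0);
	c->buf = buf;
	c->cap = cap;
	c->len = 0;
	buf[0] = '\0';
}

/* Begin the command with the event name. Returns 0, or -1 if it won't fit. */
static int cmd_start(struct cmd_buf *c, const char *event_name)
{
	int n = snprintf(c->buf, c->cap, "%s", event_name);

	if (n < 0 || (size_t)n >= c->cap) {
		c->buf[0] = '\0';
		c->len = 0;
		return -1;
	}
	c->len = (size_t)n;
	return 0;
}

/* Append one " type name;" field. On overflow the buffer is left unchanged. */
static int cmd_add_field(struct cmd_buf *c, const char *type, const char *name)
{
	size_t room = c->cap - c->len;
	int n = snprintf(c->buf + c->len, room, " %s %s;", type, name);

	if (n < 0 || (size_t)n >= room) {
		c->buf[c->len] = '\0';	/* drop the partially written field */
		return -1;
	}
	c->len += (size_t)n;
	return 0;
}
```

Starting a command for "schedtest" and adding the ("pid_t", "next_pid_field")
and ("u64", "ts_ns") pairs leaves
``schedtest pid_t next_pid_field; u64 ts_ns;`` in the buffer. Passing the
buffer and its size at initialization time is what lets each append be
bounds-checked.
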
+
+Once all the fields have been added, the event should be finalized and
+registered by calling the synth_event_gen_cmd_end() function::
+
+ ret = synth_event_gen_cmd_end(&cmd);
+
+At this point, the event object is ready to be used for tracing new
+events.
+
+6.3.3 Tracing synthetic events from in-kernel code
+--------------------------------------------------
+
+To trace a synthetic event, there are several options. The first
+option is to trace the event in one call, using synth_event_trace()
+with a variable number of values, or synth_event_trace_array() with an
+array of values to be set. A second option can be used to avoid the
+need for a pre-formed array of values or list of arguments, via
+synth_event_trace_start() and synth_event_trace_end() along with
+synth_event_add_next_val() or synth_event_add_val() to add the values
+piecewise.
+
+6.3.3.1 Tracing a synthetic event all at once
+---------------------------------------------
+
+To trace a synthetic event all at once, the synth_event_trace() or
+synth_event_trace_array() functions can be used.
+
+The synth_event_trace() function is passed the trace_event_file
+representing the synthetic event (which can be retrieved using
+trace_get_event_file() using the synthetic event name, "synthetic" as
+the system name, and the trace instance name (NULL if using the global
+trace array)), along with a variable number of u64 args, one for each
+synthetic event field, and the number of values being passed.
+
+So, to trace an event corresponding to the synthetic event definition
+above, code like the following could be used::
+
+ ret = synth_event_trace(create_synth_test, 7, /* number of values */
+ 444, /* next_pid_field */
+ (u64)"clackers", /* next_comm_field */
+ 1000000, /* ts_ns */
+ 1000, /* ts_ms */
+ smp_processor_id(),/* cpu */
+ (u64)"Thneed", /* my_string_field */
+ 999); /* my_int_field */
+
+All vals should be cast to u64, and string vals are just pointers to
+strings, cast to u64. Strings will be copied into space reserved in
+the event for the string, using these pointers.
+
+Alternatively, the synth_event_trace_array() function can be used to
+accomplish the same thing. It is passed the trace_event_file
+representing the synthetic event (which can be retrieved using
+trace_get_event_file() using the synthetic event name, "synthetic" as
+the system name, and the trace instance name (NULL if using the global
+trace array)), along with an array of u64, one for each synthetic
+event field.
+
+To trace an event corresponding to the synthetic event definition
+above, code like the following could be used::
+
+ u64 vals[7];
+
+ vals[0] = 777; /* next_pid_field */
+ vals[1] = (u64)"tiddlywinks"; /* next_comm_field */
+ vals[2] = 1000000; /* ts_ns */
+ vals[3] = 1000; /* ts_ms */
+ vals[4] = smp_processor_id(); /* cpu */
+ vals[5] = (u64)"thneed"; /* my_string_field */
+ vals[6] = 398; /* my_int_field */
+
+The 'vals' array is just an array of u64, the number of which must
+match the number of fields in the synthetic event, and which must be in
+the same order as the synthetic event fields.
+
+All vals should be cast to u64, and string vals are just pointers to
+strings, cast to u64. Strings will be copied into space reserved in
+the event for the string, using these pointers.
+
+In order to trace a synthetic event, a pointer to the trace event file
+is needed. The trace_get_event_file() function can be used to get
+it - it will find the file in the given trace instance (in this case
+NULL since the top trace array is being used) while at the same time
+preventing the instance containing it from going away::
+
+ schedtest_event_file = trace_get_event_file(NULL, "synthetic",
+ "schedtest");
+
+Before tracing the event, it should be enabled in some way, otherwise
+the synthetic event won't actually show up in the trace buffer.
+
+To enable a synthetic event from the kernel, trace_array_set_clr_event()
+can be used (which is not specific to synthetic events, so does need
+the "synthetic" system name to be specified explicitly).
+
+To enable the event, pass 'true' to it::
+
+ trace_array_set_clr_event(schedtest_event_file->tr,
+ "synthetic", "schedtest", true);
+
+To disable it, pass 'false'::
+
+ trace_array_set_clr_event(schedtest_event_file->tr,
+ "synthetic", "schedtest", false);
+
+Finally, synth_event_trace_array() can be used to actually trace the
+event, which should be visible in the trace buffer afterwards::
+
+ ret = synth_event_trace_array(schedtest_event_file, vals,
+ ARRAY_SIZE(vals));
+
+To remove the synthetic event, the event should be disabled, and the
+trace instance should be 'put' back using trace_put_event_file()::
+
+ trace_array_set_clr_event(schedtest_event_file->tr,
+ "synthetic", "schedtest", false);
+ trace_put_event_file(schedtest_event_file);
+
+If those have been successful, synth_event_delete() can be called to
+remove the event::
+
+ ret = synth_event_delete("schedtest");
+
+6.3.3.2 Tracing a synthetic event piecewise
+-------------------------------------------
+
+To trace a synthetic event using the piecewise method described above,
+the synth_event_trace_start() function is used to 'open' the synthetic
+event trace::
+
+ struct synth_trace_state trace_state;
+
+ ret = synth_event_trace_start(schedtest_event_file, &trace_state);
+
+It's passed the trace_event_file representing the synthetic event
+using the same methods as described above, along with a pointer to a
+struct synth_trace_state object, which will be zeroed before use and
+used to maintain state between this and following calls.
+
+Once the event has been opened, which means space for it has been
+reserved in the trace buffer, the individual fields can be set. There
+are two ways to do that, either one after another for each field in
+the event, which requires no lookups, or by name, which does. The
+tradeoff is flexibility in doing the assignments vs the cost of a
+lookup per field.
+
+To assign the values one after the other without lookups,
+synth_event_add_next_val() should be used. Each call is passed the
+same synth_trace_state object used in the synth_event_trace_start(),
+along with the value to set the next field in the event. After each
+field is set, the 'cursor' points to the next field, which will be set
+by the subsequent call, continuing until all the fields have been set
+in order. The same sequence of calls as in the above examples using
+this method would be (without error-handling code)::
+
+ /* next_pid_field */
+ ret = synth_event_add_next_val(777, &trace_state);
+
+ /* next_comm_field */
+ ret = synth_event_add_next_val((u64)"slinky", &trace_state);
+
+ /* ts_ns */
+ ret = synth_event_add_next_val(1000000, &trace_state);
+
+ /* ts_ms */
+ ret = synth_event_add_next_val(1000, &trace_state);
+
+ /* cpu */
+ ret = synth_event_add_next_val(smp_processor_id(), &trace_state);
+
+ /* my_string_field */
+ ret = synth_event_add_next_val((u64)"thneed_2.01", &trace_state);
+
+ /* my_int_field */
+ ret = synth_event_add_next_val(395, &trace_state);
+
+To assign the values in any order, synth_event_add_val() should be
+used. Each call is passed the same synth_trace_state object used in
+the synth_event_trace_start(), along with the field name of the field
+to set and the value to set it to. The same sequence of calls as in
+the above examples using this method would be (without error-handling
+code):
+
+ ret = synth_event_add_val("next_pid_field", 777, &trace_state);
+ ret = synth_event_add_val("next_comm_field", (u64)"silly putty",
+ &trace_state);
+ ret = synth_event_add_val("ts_ns", 1000000, &trace_state);
+ ret = synth_event_add_val("ts_ms", 1000, &trace_state);
+ ret = synth_event_add_val("cpu", smp_processor_id(), &trace_state);
+ ret = synth_event_add_val("my_string_field", (u64)"thneed_9",
+ &trace_state);
+ ret = synth_event_add_val("my_int_field", 3999, &trace_state);
+
+Note that synth_event_add_next_val() and synth_event_add_val() are
+incompatible if used within the same trace of an event - either one
+can be used but not both at the same time.
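+
The difference between the two assignment styles can be illustrated with a small userspace model. This is only a sketch and not kernel code: struct mock_state, mock_add_next_val(), mock_add_val() and the three-field table are hypothetical stand-ins for struct synth_trace_state, synth_event_add_next_val() and synth_event_add_val(), and the rule that mixing the two styles fails is modeled here simply by returning -1.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MOCK_N_FIELDS 3

/* Hypothetical field table standing in for an event definition. */
static const char *const mock_field_names[MOCK_N_FIELDS] = {
	"next_pid_field", "ts_ns", "my_int_field",
};

/* Hypothetical stand-in for struct synth_trace_state. */
struct mock_state {
	uint64_t vals[MOCK_N_FIELDS];
	size_t cursor;	/* next field for in-order assignment */
	int used_next;	/* in-order style was used in this trace */
	int used_name;	/* by-name style was used in this trace */
};

/* In-order assignment: no lookup, just advance the cursor. */
static int mock_add_next_val(uint64_t val, struct mock_state *st)
{
	if (st->used_name || st->cursor >= MOCK_N_FIELDS)
		return -1;
	st->used_next = 1;
	st->vals[st->cursor++] = val;
	return 0;
}

/* By-name assignment: any order, at the cost of one lookup per field. */
static int mock_add_val(const char *name, uint64_t val, struct mock_state *st)
{
	size_t i;

	if (st->used_next)
		return -1;
	for (i = 0; i < MOCK_N_FIELDS; i++) {
		if (strcmp(mock_field_names[i], name) == 0) {
			st->used_name = 1;
			st->vals[i] = val;
			return 0;
		}
	}
	return -1;	/* unknown field name */
}
```

In this model the in-order variant does constant work per field, while the by-name variant walks the field table on every call, which is the lookup cost mentioned above.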
+
+Finally, the event won't actually be traced until it's 'closed',
+which is done using synth_event_trace_end(), which takes only the
+struct synth_trace_state object used in the previous calls:
+
+ ret = synth_event_trace_end(&trace_state);
+
+Note that synth_event_trace_end() must be called at the end regardless
+of whether any of the add calls failed (say due to a bad field name
+being passed in).
+
+6.3.4 Dynamically creating kprobe and kretprobe event definitions
+-----------------------------------------------------------------
+
+To create a kprobe or kretprobe trace event from kernel code, the
+kprobe_event_gen_cmd_start() or kretprobe_event_gen_cmd_start()
+functions can be used.
+
+To create a kprobe event, an empty or partially empty kprobe event
+should first be created using kprobe_event_gen_cmd_start(). The name
+of the event and the probe location should be specified, along with
+one or more args, each representing a probe field. Before calling
+kprobe_event_gen_cmd_start(), the user should create and initialize a
+dynevent_cmd object using kprobe_event_cmd_init().
+
+For example, to create a new "gen_kprobe_test" kprobe event with two fields:
+
+ struct dynevent_cmd cmd;
+ char *buf;
+
+ /* Create a buffer to hold the generated command */
+ buf = kzalloc(MAX_DYNEVENT_CMD_LEN, GFP_KERNEL);
+
+ /* Before generating the command, initialize the cmd object */
+ kprobe_event_cmd_init(&cmd, buf, MAX_DYNEVENT_CMD_LEN);
+
+ /*
+ * Define the gen_kprobe_test event with the first 2 kprobe
+ * fields.
+ */
+ ret = kprobe_event_gen_cmd_start(&cmd, "gen_kprobe_test", "do_sys_open",
+ "dfd=%ax", "filename=%dx");
+
+Once the kprobe event object has been created, it can then be
+populated with more fields. Fields can be added using
+kprobe_event_add_fields(), supplying the dynevent_cmd object along
+with a variable arg list of probe fields. For example, to add a
+couple additional fields, the following call could be made:
+
+ ret = kprobe_event_add_fields(&cmd, "flags=%cx", "mode=+4($stack)");
+
+Once all the fields have been added, the event should be finalized and
+registered by calling the kprobe_event_gen_cmd_end() or
+kretprobe_event_gen_cmd_end() functions, depending on whether a kprobe
+or kretprobe command was started:
+
+ ret = kprobe_event_gen_cmd_end(&cmd);
+
+or
+
+ ret = kretprobe_event_gen_cmd_end(&cmd);
+
+At this point, the event object is ready to be used for tracing new
+events.
+
+Similarly, a kretprobe event can be created using
+kretprobe_event_gen_cmd_start() with a probe name and location and
+additional params such as $retval:
+
+ ret = kretprobe_event_gen_cmd_start(&cmd, "gen_kretprobe_test",
+ "do_sys_open", "$retval");
+
+Similar to the synthetic event case, code like the following can be
+used to enable the newly created kprobe event:
+
+ gen_kprobe_test = trace_get_event_file(NULL, "kprobes", "gen_kprobe_test");
+
+ ret = trace_array_set_clr_event(gen_kprobe_test->tr,
+ "kprobes", "gen_kprobe_test", true);
+
+Finally, also similar to synthetic events, the following code can be
+used to give the kprobe event file back and delete the event:
+
+ trace_put_event_file(gen_kprobe_test);
+
+ ret = kprobe_event_delete("gen_kprobe_test");
+
+6.3.5 The "dynevent_cmd" low-level API
+--------------------------------------
+
+Both the in-kernel synthetic event and kprobe interfaces are built on
+top of a lower-level "dynevent_cmd" interface. This interface is
+meant to provide the basis for higher-level interfaces such as the
+synthetic and kprobe interfaces, which can be used as examples.
+
+The basic idea is simple and amounts to providing a general-purpose
+layer that can be used to generate trace event commands. The
+generated command strings can then be passed to the command-parsing
+and event creation code that already exists in the trace event
+subsystem for creating the corresponding trace events.
+
+In a nutshell, the way it works is that the higher-level interface
+code creates a struct dynevent_cmd object, then uses a couple of
+functions, dynevent_arg_add() and dynevent_arg_pair_add(), to build up
+a command string, and finally causes the command to be executed
+using the dynevent_create() function. The details of the interface
+are described below.
+
+The first step in building a new command string is to create and
+initialize an instance of a dynevent_cmd. Here, for instance, we
+create a dynevent_cmd on the stack and initialize it:
+
+ struct dynevent_cmd cmd;
+ char *buf;
+ int ret;
+
+ buf = kzalloc(MAX_DYNEVENT_CMD_LEN, GFP_KERNEL);
+
+ dynevent_cmd_init(&cmd, buf, MAX_DYNEVENT_CMD_LEN, DYNEVENT_TYPE_FOO,
+ foo_event_run_command);
+
+The dynevent_cmd initialization needs to be given a user-specified
+buffer and the length of the buffer (MAX_DYNEVENT_CMD_LEN can be used
+for this purpose - at 2k it's generally too big to be comfortably put
+on the stack, so is dynamically allocated), a dynevent type id, which
+is meant to be used to check that further API calls are for the
+correct command type, and a pointer to an event-specific run_command()
+callback that will be called to actually execute the event-specific
+command function.
+
+Once that's done, the command string can be built up by successive
+calls to argument-adding functions.
+
+To add a single argument, define and initialize a struct dynevent_arg
+or struct dynevent_arg_pair object. Here's an example of the simplest
+possible arg addition, which is simply to append the given string as
+a whitespace-separated argument to the command:
+
+ struct dynevent_arg arg;
+
+ dynevent_arg_init(&arg, NULL, 0);
+
+ arg.str = name;
+
+ ret = dynevent_arg_add(&cmd, &arg);
+
+The arg object is first initialized using dynevent_arg_init() and in
+this case the parameters are NULL or 0, which means there's no
+optional sanity-checking function or separator appended to the end of
+the arg.
+
+Here's another more complicated example using an 'arg pair', which is
+used to create an argument that consists of a couple components added
+together as a unit, for example, a 'type field_name;' arg or a simple
+expression arg e.g. 'flags=%cx':
+
+ struct dynevent_arg_pair arg_pair;
+
+ dynevent_arg_pair_init(&arg_pair, dynevent_foo_check_arg_fn, 0, ';');
+
+ arg_pair.lhs = type;
+ arg_pair.rhs = name;
+
+ ret = dynevent_arg_pair_add(&cmd, &arg_pair);
+
+Again, the arg_pair is first initialized, in this case with a callback
+function used to check the sanity of the args (for example, that
+neither part of the pair is NULL), along with a character to be used
+to add an operator between the pair (here none) and a separator to be
+appended onto the end of the arg pair (here ';').
+
+There's also a dynevent_str_add() function that can be used to simply
+add a string as-is, with no spaces, delimiters, or arg check.
+
+Any number of dynevent_*_add() calls can be made to build up the string
+(until its length surpasses cmd->maxlen). When all the arguments have
+been added and the command string is complete, the only thing left to
+do is run the command, which happens by simply calling
+dynevent_create():
+
+ ret = dynevent_create(&cmd);
+
+At that point, if the return value is 0, the dynamic event has been
+created and is ready to use.
+
+See the dynevent_cmd function definitions themselves for the details
+of the API.
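+
The way a command string is accumulated from plain strings, single args and arg pairs can be modeled in a few lines of userspace C. This is a sketch under stated assumptions, not the kernel implementation: struct mock_cmd and the mock_*() helpers are hypothetical, the exact spacing rules (a leading space before each arg, a space between the halves of a pair when no operator is given) are assumptions made for illustration, and sanity-check callbacks are left out.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for struct dynevent_cmd. */
struct mock_cmd {
	char *buf;	/* caller-supplied buffer */
	size_t maxlen;	/* size of buf, including the terminating NUL */
	size_t len;	/* current length of the command string */
};

static void mock_cmd_init(struct mock_cmd *cmd, char *buf, size_t maxlen)
{
	cmd->buf = buf;
	cmd->maxlen = maxlen;
	cmd->len = 0;
	if (maxlen)
		buf[0] = '\0';
}

/* Append a string as-is; fail and leave buf untouched if it won't fit. */
static int mock_str_add(struct mock_cmd *cmd, const char *str)
{
	size_t n = strlen(str);

	if (cmd->len + n + 1 > cmd->maxlen)
		return -1;
	memcpy(cmd->buf + cmd->len, str, n + 1);
	cmd->len += n;
	return 0;
}

/* Append " str" followed by an optional separator (0 means none). */
static int mock_arg_add(struct mock_cmd *cmd, const char *str, char separator)
{
	char sep[2] = { separator, '\0' };

	/* Check the whole arg up front so a failure adds nothing. */
	if (cmd->len + 1 + strlen(str) + strlen(sep) + 1 > cmd->maxlen)
		return -1;
	mock_str_add(cmd, " ");
	mock_str_add(cmd, str);
	return mock_str_add(cmd, sep);
}

/* Append " lhs<op>rhs<separator>"; op 0 puts a space between the halves. */
static int mock_arg_pair_add(struct mock_cmd *cmd, const char *lhs,
			     const char *rhs, char op, char separator)
{
	char op_str[2] = { op ? op : ' ', '\0' };
	char sep[2] = { separator, '\0' };
	size_t need = 1 + strlen(lhs) + 1 + strlen(rhs) + strlen(sep);

	if (cmd->len + need + 1 > cmd->maxlen)
		return -1;
	mock_str_add(cmd, " ");
	mock_str_add(cmd, lhs);
	mock_str_add(cmd, op_str);
	mock_str_add(cmd, rhs);
	return mock_str_add(cmd, sep);
}
```

With these helpers, adding the string "foo", the arg "bar" and the pair ("u64", "ts") with separator ';' leaves "foo bar u64 ts;" in the buffer, and any addition that would exceed maxlen is rejected without modifying the string built so far.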
diff --git a/Documentation/trace/ftrace-uses.rst b/Documentation/trace/ftrace-uses.rst
index 1fbc69894eed..2a05e770618a 100644
--- a/Documentation/trace/ftrace-uses.rst
+++ b/Documentation/trace/ftrace-uses.rst
@@ -146,7 +146,7 @@ FTRACE_OPS_FL_RECURSION_SAFE
itself or any nested functions that those functions call.
If this flag is set, it is possible that the callback will also
- be called with preemption enabled (when CONFIG_PREEMPT is set),
+ be called with preemption enabled (when CONFIG_PREEMPTION is set),
but this is not guaranteed.
FTRACE_OPS_FL_IPMODIFY
@@ -170,6 +170,14 @@ FTRACE_OPS_FL_RCU
a callback may be executed and RCU synchronization will not protect
it.
+FTRACE_OPS_FL_PERMANENT
+ If this is set on any ftrace ops, then the tracing cannot be disabled by
+ writing 0 to the proc sysctl ftrace_enabled. Equally, a callback with
+ the flag set cannot be registered if ftrace_enabled is 0.
+
+ Livepatch uses it not to lose the function redirection, so the system
+ stays protected.
+
Filtering which functions to trace
==================================
diff --git a/Documentation/trace/ftrace.rst b/Documentation/trace/ftrace.rst
index f60079259669..ff658e27d25b 100644
--- a/Documentation/trace/ftrace.rst
+++ b/Documentation/trace/ftrace.rst
@@ -95,7 +95,8 @@ of ftrace. Here is a list of some of the key files:
current_tracer:
This is used to set or display the current tracer
- that is configured.
+ that is configured. Changing the current tracer clears
+ the ring buffer content as well as the "snapshot" buffer.
available_tracers:
@@ -125,7 +126,9 @@ of ftrace. Here is a list of some of the key files:
This file holds the output of the trace in a human
readable format (described below). Note, tracing is temporarily
- disabled while this file is being read (opened).
+ disabled when the file is open for reading. Once all readers
+ are closed, tracing is re-enabled. Opening this file for
+ writing with the O_TRUNC flag clears the ring buffer content.
trace_pipe:
@@ -139,8 +142,9 @@ of ftrace. Here is a list of some of the key files:
will not be read again with a sequential read. The
"trace" file is static, and if the tracer is not
adding more data, it will display the same
- information every time it is read. This file will not
- disable tracing while being read.
+ information every time it is read. Unlike the
+ "trace" file, opening this file for reading will not
+ temporarily disable tracing.
trace_options:
@@ -183,7 +187,8 @@ of ftrace. Here is a list of some of the key files:
CPU buffer and not total size of all buffers. The
trace buffers are allocated in pages (blocks of memory
that the kernel uses for allocation, usually 4 KB in size).
- If the last page allocated has room for more bytes
+ A few extra pages may be allocated to accommodate buffer management
+ meta-data. If the last page allocated has room for more bytes
than requested, the rest of the page will be used,
making the actual allocation bigger than requested or shown.
( Note, the size may not be a multiple of the page size
@@ -233,7 +238,7 @@ of ftrace. Here is a list of some of the key files:
This interface also allows for commands to be used. See the
"Filter commands" section for more details.
- As a speed up, since processing strings can't be quite expensive
+ As a speed up, since processing strings can be quite expensive
and requires a check of all functions registered to tracing, instead
an index can be written into this file. A number (starting with "1")
written will instead select the same corresponding at the line position
@@ -380,7 +385,7 @@ of ftrace. Here is a list of some of the key files:
By default, 128 comms are saved (see "saved_cmdlines" above). To
increase or decrease the amount of comms that are cached, echo
- in a the number of comms to cache, into this file.
+ the number of comms to cache into this file.
saved_tgids:
@@ -488,6 +493,9 @@ of ftrace. Here is a list of some of the key files:
# echo global > trace_clock
+ Setting a clock clears the ring buffer content as well as the
+ "snapshot" buffer.
+
trace_marker:
This is a very useful file for synchronizing user space
@@ -2974,7 +2982,9 @@ Note, the proc sysctl ftrace_enable is a big on/off switch for the
function tracer. By default it is enabled (when function tracing is
enabled in the kernel). If it is disabled, all function tracing is
disabled. This includes not only the function tracers for ftrace, but
-also for any other uses (perf, kprobes, stack tracing, profiling, etc).
+also for any other uses (perf, kprobes, stack tracing, profiling, etc). It
+cannot be disabled if there is a callback with FTRACE_OPS_FL_PERMANENT set
+registered.
Please disable this with care.
@@ -3153,7 +3163,10 @@ different. The trace is live.
Note, reading the trace_pipe file will block until more input is
-added.
+added. This is contrary to the trace file. If any process opened
+the trace file for reading, it will actually disable tracing and
+prevent new entries from being added. The trace_pipe file does
+not have this limitation.
trace entries
-------------
@@ -3317,7 +3330,7 @@ directories after it is created.
As you can see, the new directory looks similar to the tracing directory
itself. In fact, it is very similar, except that the buffer and
-events are agnostic from the main director, or from any other
+events are agnostic from the main directory, or from any other
instances that are created.
The files in the new directory work just like the files with the
diff --git a/Documentation/trace/index.rst b/Documentation/trace/index.rst
index 6b4107cf4b98..fa9e1c730f6a 100644
--- a/Documentation/trace/index.rst
+++ b/Documentation/trace/index.rst
@@ -19,7 +19,9 @@ Linux Tracing Technologies
events-msr
mmiotrace
histogram
+ boottime-trace
hwlat_detector
intel_th
stm
sys-t
+ coresight/index
diff --git a/Documentation/trace/intel_th.rst b/Documentation/trace/intel_th.rst
index baa12eb09ef4..70b7126eaeeb 100644
--- a/Documentation/trace/intel_th.rst
+++ b/Documentation/trace/intel_th.rst
@@ -44,7 +44,8 @@ Documentation/trace/stm.rst for more information on that.
MSU can be configured to collect trace data into a system memory
buffer, which can later on be read from its device nodes via read() or
-mmap() interface.
+mmap() interface and directed to a "software sink" driver that will
+consume the data and/or relay it further.
On the whole, Intel(R) Trace Hub does not require any special
userspace software to function; everything can be configured, started
@@ -122,3 +123,28 @@ In order to enable the host mode, set the 'host_mode' parameter of the
will show up on the intel_th bus. Also, trace configuration and
capture controlling attribute groups of the 'gth' device will not be
exposed. The 'sth' device will operate as usual.
+
+Software Sinks
+--------------
+
+The Memory Storage Unit (MSU) driver provides an in-kernel API for
+drivers to register themselves as software sinks for the trace data.
+Such drivers can further export the data via other devices, such as
+USB device controllers or network cards.
+
+The API has two main parts::
+ - notifying the software sink that a particular window is full, and
+ "locking" that window, that is, making it unavailable for the trace
+ collection; when this happens, the MSU driver will automatically
+ switch to the next window in the buffer if it is unlocked, or stop
+ the trace capture if it's not;
+ - tracking the "locked" state of windows and providing a way for the
+ software sink driver to notify the MSU driver when a window is
+ unlocked and can be used again to collect trace data.
+
+An example sink driver, msu-sink, illustrates the implementation of a
+software sink. Functionally, it simply unlocks windows as soon as they
+are full, keeping the MSU running in a circular buffer mode. Unlike the
+"multi" mode, it will fill out all the windows in the buffer as opposed
+to just the first one. It can be enabled by writing "sink" to the "mode"
+file (assuming msu-sink.ko is loaded).
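+
The window locking behaviour described above can be sketched as a tiny userspace state machine. This is an illustrative model only, not the MSU driver's API: struct mock_msu, mock_window_full(), mock_window_unlock() and the fixed window count are all hypothetical, and restarting a stopped capture is deliberately not modeled.

```c
#include <stdbool.h>
#include <stddef.h>

#define MOCK_NR_WINDOWS 3

/* Hypothetical model of a multi-window trace buffer. */
struct mock_msu {
	bool locked[MOCK_NR_WINDOWS];	/* window is owned by the sink */
	size_t cur;			/* window currently collecting trace */
	bool running;			/* trace capture is active */
};

/*
 * The current window is full: lock it for the sink, then switch to the
 * next window if it is unlocked, or stop the capture if it is not.
 */
static void mock_window_full(struct mock_msu *m)
{
	size_t next = (m->cur + 1) % MOCK_NR_WINDOWS;

	if (!m->running)
		return;
	m->locked[m->cur] = true;
	if (m->locked[next])
		m->running = false;
	else
		m->cur = next;
}

/* The sink is done with a window and hands it back for trace collection. */
static void mock_window_unlock(struct mock_msu *m, size_t win)
{
	m->locked[win] = false;
}
```

A sink that unlocks each window as soon as it is notified, as msu-sink is described to do, keeps this model cycling through all windows indefinitely; a sink that never unlocks stops it once every window has been filled.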
diff --git a/Documentation/trace/kprobetrace.rst b/Documentation/trace/kprobetrace.rst
index fbb314bfa112..cc4c5fc313df 100644
--- a/Documentation/trace/kprobetrace.rst
+++ b/Documentation/trace/kprobetrace.rst
@@ -52,6 +52,7 @@ Synopsis of kprobe_events
$retval : Fetch return value.(\*2)
$comm : Fetch current task comm.
+|-[u]OFFS(FETCHARG) : Fetch memory at FETCHARG +|- OFFS address.(\*3)(\*4)
+ \IMM : Store an immediate value to the argument.
NAME=FETCHARG : Set NAME as the argument name of FETCHARG.
FETCHARG:TYPE : Set TYPE as the type of FETCHARG. Currently, basic types
(u8/u16/u32/u64/s8/s16/s32/s64), hexadecimal types
@@ -96,6 +97,7 @@ which shows given pointer in "symbol+offset" style.
For $comm, the default type is "string"; any other type is invalid.
.. _user_mem_access:
+
User Memory Access
------------------
Kprobe events supports user-space memory access. For that purpose, you can use
@@ -251,4 +253,3 @@ And you can see the traced information via /sys/kernel/debug/tracing/trace.
Each line shows when the kernel hits an event, and <- SYMBOL means kernel
returns from SYMBOL(e.g. "sys_open+0x1b/0x1d <- do_sys_open" means kernel
returns from do_sys_open to sys_open+0x1b).
-
diff --git a/Documentation/trace/ring-buffer-design.txt b/Documentation/trace/ring-buffer-design.txt
index ff747b6fa39b..2d53c6f25b91 100644
--- a/Documentation/trace/ring-buffer-design.txt
+++ b/Documentation/trace/ring-buffer-design.txt
@@ -37,7 +37,7 @@ commit_page - a pointer to the page with the last finished non-nested write.
cmpxchg - hardware-assisted atomic transaction that performs the following:
- A = B iff previous A == C
+ A = B if previous A == C
R = cmpxchg(A, C, B) is saying that we replace A with B if and only if
current A is equal to C, and we put the old (current) A into R
diff --git a/Documentation/trace/uprobetracer.rst b/Documentation/trace/uprobetracer.rst
index 6e75a6c5a2c8..98cde99939d7 100644
--- a/Documentation/trace/uprobetracer.rst
+++ b/Documentation/trace/uprobetracer.rst
@@ -45,6 +45,7 @@ Synopsis of uprobe_tracer
$retval : Fetch return value.(\*1)
$comm : Fetch current task comm.
+|-[u]OFFS(FETCHARG) : Fetch memory at FETCHARG +|- OFFS address.(\*2)(\*3)
+ \IMM : Store an immediate value to the argument.
NAME=FETCHARG : Set NAME as the argument name of FETCHARG.
FETCHARG:TYPE : Set TYPE as the type of FETCHARG. Currently, basic types
(u8/u16/u32/u64/s8/s16/s32/s64), hexadecimal types