<feed xmlns='http://www.w3.org/2005/Atom'>
<title>blackbird-op-linux/tools/perf/util/Build, branch master</title>
<subtitle>Blackbird™ Linux sources for OpenPOWER</subtitle>
<id>https://git.raptorcs.com/git/blackbird-op-linux/atom?h=master</id>
<link rel='self' href='https://git.raptorcs.com/git/blackbird-op-linux/atom?h=master'/>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/blackbird-op-linux/'/>
<updated>2019-11-28T11:08:38+00:00</updated>
<entry>
<title>perf affinity: Add infrastructure to save/restore affinity</title>
<updated>2019-11-28T11:08:38+00:00</updated>
<author>
<name>Andi Kleen</name>
<email>ak@linux.intel.com</email>
</author>
<published>2019-11-21T00:15:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/blackbird-op-linux/commit/?id=267ed5d8593cedd6146eabe00d270629c9cff771'/>
<id>urn:sha1:267ed5d8593cedd6146eabe00d270629c9cff771</id>
<content type='text'>
The kernel perf subsystem has to IPI to the target CPU for many
operations. On systems with many CPUs and when managing many events the
overhead can be dominated by lots of IPIs.

An alternative is to set up CPU affinity in the perf tool, then set up
all the events for that CPU, and then move on to the next CPU.

Add some affinity management infrastructure to enable such a model.
Used in followon patches.

Committer notes:

Use zfree() in some places, add missing stdbool.h header, some minor
coding style changes.

Signed-off-by: Andi Kleen &lt;ak@linux.intel.com&gt;
Acked-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Link: http://lore.kernel.org/lkml/20191121001522.180827-3-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf pmu: Use file system cache to optimize sysfs access</title>
<updated>2019-11-28T11:08:38+00:00</updated>
<author>
<name>Andi Kleen</name>
<email>ak@linux.intel.com</email>
</author>
<published>2019-11-21T00:15:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/blackbird-op-linux/commit/?id=d96645821e940bddff3fc5290656f83bf70d4c92'/>
<id>urn:sha1:d96645821e940bddff3fc5290656f83bf70d4c92</id>
<content type='text'>
pmu.c does a lot of redundant /sys accesses while parsing aliases
and probing for PMUs. On large systems with a lot of PMUs this
can get expensive (&gt;2s):

  % time     seconds  usecs/call     calls    errors syscall
  ------ ----------- ----------- --------- --------- ----------------
   27.25    1.227847           8    160888     16976 openat
   26.42    1.190481           7    164224    164077 stat

Add a cache to remember if specific file names exist or don't
exist, which eliminates most of this overhead.

Also optimize some stat() calls to be slightly cheaper access()

Resulting in:

    0.18    0.004166           2      1851       305 open
    0.08    0.001970           2       829       622 access

Signed-off-by: Andi Kleen &lt;ak@linux.intel.com&gt;
Acked-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Link: http://lore.kernel.org/lkml/20191121001522.180827-2-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf block: Cleanup and refactor block info functions</title>
<updated>2019-11-07T12:09:18+00:00</updated>
<author>
<name>Jin Yao</name>
<email>yao.jin@linux.intel.com</email>
</author>
<published>2019-11-07T07:47:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/blackbird-op-linux/commit/?id=6041441870ab521a2652f1d558a770c586b790be'/>
<id>urn:sha1:6041441870ab521a2652f1d558a770c586b790be</id>
<content type='text'>
We have already implemented some block-info related functions.
Now it's time to do some cleanup, refactoring and move the
functions and structures to new block-info.h/block-info.c.

 v4:
 ---
 Move code for skipping column length calculation to patch:
 'perf diff: Don't use hack to skip column length calculation'

 v3:
 ---
 1. Rename the patch title
 2. Rename from block.h/block.c to block-info.h/block-info.c
 3. Move more common part to block-info, such as
    block_info__process_sym.
 4. Remove the nasty hack for skipping calculation of column
    length

Signed-off-by: Jin Yao &lt;yao.jin@linux.intel.com&gt;
Reviewed-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Jin Yao &lt;yao.jin@intel.com&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lore.kernel.org/lkml/20191107074719.26139-3-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf diff: Report noisy for cycles diff</title>
<updated>2019-10-11T13:57:00+00:00</updated>
<author>
<name>Jin Yao</name>
<email>yao.jin@linux.intel.com</email>
</author>
<published>2019-09-25T01:14:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/blackbird-op-linux/commit/?id=cebf7d51a6c3babc4d0589da7aec0de1af0a5691'/>
<id>urn:sha1:cebf7d51a6c3babc4d0589da7aec0de1af0a5691</id>
<content type='text'>
This patch prints the stddev and hist for the cycles diff of program
block. It can help us to understand if the cycles is noisy or not.

This patch is inspired by Andi Kleen's patch:

  https://lwn.net/Articles/600471/

We create new option '--cycles-hist'.

Example:

  perf record -b ./div
  perf record -b ./div
  perf diff -c cycles

  # Baseline                                [Program Block Range] Cycles Diff  Shared Object      Symbol
  # ........  .......................................................... ....  .................  ............................
  #
      46.72%                                      [div.c:40 -&gt; div.c:40]    0  div                [.] main
      46.72%                                      [div.c:42 -&gt; div.c:44]    0  div                [.] main
      46.72%                                      [div.c:42 -&gt; div.c:39]    0  div                [.] main
      20.54%                          [random_r.c:357 -&gt; random_r.c:394]    1  libc-2.27.so       [.] __random_r
      20.54%                          [random_r.c:357 -&gt; random_r.c:380]    0  libc-2.27.so       [.] __random_r
      20.54%                          [random_r.c:388 -&gt; random_r.c:388]    0  libc-2.27.so       [.] __random_r
      20.54%                          [random_r.c:388 -&gt; random_r.c:391]    0  libc-2.27.so       [.] __random_r
      17.04%                              [random.c:288 -&gt; random.c:291]    0  libc-2.27.so       [.] __random
      17.04%                              [random.c:291 -&gt; random.c:291]    0  libc-2.27.so       [.] __random
      17.04%                              [random.c:293 -&gt; random.c:293]    0  libc-2.27.so       [.] __random
      17.04%                              [random.c:295 -&gt; random.c:295]    0  libc-2.27.so       [.] __random
      17.04%                              [random.c:295 -&gt; random.c:295]    0  libc-2.27.so       [.] __random
      17.04%                              [random.c:298 -&gt; random.c:298]    0  libc-2.27.so       [.] __random
       8.40%                                      [div.c:22 -&gt; div.c:25]    0  div                [.] compute_flag
       8.40%                                      [div.c:27 -&gt; div.c:28]    0  div                [.] compute_flag
       5.14%                                    [rand.c:26 -&gt; rand.c:27]    0  libc-2.27.so       [.] rand
       5.14%                                    [rand.c:28 -&gt; rand.c:28]    0  libc-2.27.so       [.] rand
       2.15%                                  [rand@plt+0 -&gt; rand@plt+0]    0  div                [.] rand@plt
       0.00%                                                                   [kernel.kallsyms]  [k] __x86_indirect_thunk_rax
       0.00%                                [do_mmap+714 -&gt; do_mmap+732]  -10  [kernel.kallsyms]  [k] do_mmap
       0.00%                                [do_mmap+737 -&gt; do_mmap+765]    1  [kernel.kallsyms]  [k] do_mmap
       0.00%                                [do_mmap+262 -&gt; do_mmap+299]    0  [kernel.kallsyms]  [k] do_mmap
       0.00%  [__x86_indirect_thunk_r15+0 -&gt; __x86_indirect_thunk_r15+0]    7  [kernel.kallsyms]  [k] __x86_indirect_thunk_r15
       0.00%            [native_sched_clock+0 -&gt; native_sched_clock+119]   -1  [kernel.kallsyms]  [k] native_sched_clock
       0.00%                 [native_write_msr+0 -&gt; native_write_msr+16]  -13  [kernel.kallsyms]  [k] native_write_msr

When we enable the option '--cycles-hist', the output is

  perf diff -c cycles --cycles-hist

  # Baseline                                [Program Block Range] Cycles Diff        stddev/Hist  Shared Object      Symbol
  # ........  .......................................................... ....  .................  .................  ............................
  #
      46.72%                                      [div.c:40 -&gt; div.c:40]    0  ± 37.8% ▁█▁▁██▁█   div                [.] main
      46.72%                                      [div.c:42 -&gt; div.c:44]    0  ± 49.4% ▁▁▂█▂▂▂▂   div                [.] main
      46.72%                                      [div.c:42 -&gt; div.c:39]    0  ± 24.1% ▃█▂▄▁▃▂▁   div                [.] main
      20.54%                          [random_r.c:357 -&gt; random_r.c:394]    1  ± 33.5% ▅▂▁█▃▁▂▁   libc-2.27.so       [.] __random_r
      20.54%                          [random_r.c:357 -&gt; random_r.c:380]    0  ± 39.4% ▁▁█▁██▅▁   libc-2.27.so       [.] __random_r
      20.54%                          [random_r.c:388 -&gt; random_r.c:388]    0                     libc-2.27.so       [.] __random_r
      20.54%                          [random_r.c:388 -&gt; random_r.c:391]    0  ± 41.2% ▁▃▁▂█▄▃▁   libc-2.27.so       [.] __random_r
      17.04%                              [random.c:288 -&gt; random.c:291]    0  ± 48.8% ▁▁▁▁███▁   libc-2.27.so       [.] __random
      17.04%                              [random.c:291 -&gt; random.c:291]    0  ±100.0% ▁█▁▁▁▁▁▁   libc-2.27.so       [.] __random
      17.04%                              [random.c:293 -&gt; random.c:293]    0  ±100.0% ▁█▁▁▁▁▁▁   libc-2.27.so       [.] __random
      17.04%                              [random.c:295 -&gt; random.c:295]    0  ±100.0% ▁█▁▁▁▁▁▁   libc-2.27.so       [.] __random
      17.04%                              [random.c:295 -&gt; random.c:295]    0                     libc-2.27.so       [.] __random
      17.04%                              [random.c:298 -&gt; random.c:298]    0  ± 75.6% ▃█▁▁▁▁▁▁   libc-2.27.so       [.] __random
       8.40%                                      [div.c:22 -&gt; div.c:25]    0  ± 42.1% ▁▃▁▁███▁   div                [.] compute_flag
       8.40%                                      [div.c:27 -&gt; div.c:28]    0  ± 41.8% ██▁▁▄▁▁▄   div                [.] compute_flag
       5.14%                                    [rand.c:26 -&gt; rand.c:27]    0  ± 37.8% ▁▁▁████▁   libc-2.27.so       [.] rand
       5.14%                                    [rand.c:28 -&gt; rand.c:28]    0                     libc-2.27.so       [.] rand
       2.15%                                  [rand@plt+0 -&gt; rand@plt+0]    0                     div                [.] rand@plt
       0.00%                                                                                      [kernel.kallsyms]  [k] __x86_indirect_thunk_rax
       0.00%                                [do_mmap+714 -&gt; do_mmap+732]  -10                     [kernel.kallsyms]  [k] do_mmap
       0.00%                                [do_mmap+737 -&gt; do_mmap+765]    1                     [kernel.kallsyms]  [k] do_mmap
       0.00%                                [do_mmap+262 -&gt; do_mmap+299]    0                     [kernel.kallsyms]  [k] do_mmap
       0.00%  [__x86_indirect_thunk_r15+0 -&gt; __x86_indirect_thunk_r15+0]    7                     [kernel.kallsyms]  [k] __x86_indirect_thunk_r15
       0.00%            [native_sched_clock+0 -&gt; native_sched_clock+119]   -1  ± 38.5% ▄█▁        [kernel.kallsyms]  [k] native_sched_clock
       0.00%                 [native_write_msr+0 -&gt; native_write_msr+16]  -13  ± 47.1% ▁█▇▃▁▁     [kernel.kallsyms]  [k] native_write_msr

 v8:
 ---
 Rebase to perf/core branch

 v7:
 ---
 1. v6 got Jiri's ACK.
 2. Rebase to latest perf/core branch.

 v6:
 ---
 1. Jiri provides better code for using data__hpp_register() in ui_init().
    Use this code in v6.

 v5:
 ---
 1. Refine the use of data__hpp_register() in ui_init() according to
    Jiri's suggestion.

 v4:
 ---
 1. Rename the new option from '--noisy' to '--cycles-hist'
 2. Remove the option '-n'.
 3. Only update the spark value and stats when '--cycles-hist' is enabled.
 4. Remove the code of printing '..'.

 v3:
 ---
 1. Move the histogram to a separate column
 2. Move the svals[] out of struct stats

 v2:
 ---
 Jiri got a compile error,

  CC       builtin-diff.o
  builtin-diff.c: In function ‘compute_cycles_diff’:
  builtin-diff.c:712:10: error: taking the absolute value of unsigned type ‘u64’ {aka ‘long unsigned int’} has no effect [-Werror=absolute-value]
  712 |          labs(pair-&gt;block_info-&gt;cycles_spark[i] -
      |          ^~~~

 Because the result of u64 - u64 is still u64. Now we change the type of
 cycles_spark[] to s64.

Signed-off-by: Jin Yao &lt;yao.jin@linux.intel.com&gt;
Acked-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lore.kernel.org/lkml/20190925011446.30678-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf evsel: Introduce evsel_fprintf.h</title>
<updated>2019-09-25T19:26:34+00:00</updated>
<author>
<name>Arnaldo Carvalho de Melo</name>
<email>acme@redhat.com</email>
</author>
<published>2019-09-24T18:41:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/blackbird-op-linux/commit/?id=ca1252779f48ece225c6003e01c675abb91cf1b4'/>
<id>urn:sha1:ca1252779f48ece225c6003e01c675abb91cf1b4</id>
<content type='text'>
We already had evsel_fprintf.c, add its counterpart, so that we can
reduce evsel.h a bit more.

We needed a new perf_event_attr_fprintf.c file so as to have a separate
object to link with the python binding in tools/perf/util/python-ext-sources
and not drag symbol_conf, etc into the python binding.

Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Link: https://lkml.kernel.org/n/tip-06bdmt1062d9unzgqmxwlv88@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf copyfile: Move copyfile routines to separate files</title>
<updated>2019-09-25T12:51:49+00:00</updated>
<author>
<name>Arnaldo Carvalho de Melo</name>
<email>acme@redhat.com</email>
</author>
<published>2019-09-24T18:14:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/blackbird-op-linux/commit/?id=32ff3fec07b6d8e6c5cc2342f6cbbdcb224d484c'/>
<id>urn:sha1:32ff3fec07b6d8e6c5cc2342f6cbbdcb224d484c</id>
<content type='text'>
Further reducing the util.c hodgepodge files.

Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Link: https://lkml.kernel.org/n/tip-0i62zh7ok25znibyebgq0qs4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf tools: Move event synthesizing routines to separate .c file</title>
<updated>2019-09-20T13:28:21+00:00</updated>
<author>
<name>Arnaldo Carvalho de Melo</name>
<email>acme@redhat.com</email>
</author>
<published>2019-09-18T19:08:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/blackbird-op-linux/commit/?id=055c67ed39887c5563e9540470a4617c1b772aec'/>
<id>urn:sha1:055c67ed39887c5563e9540470a4617c1b772aec</id>
<content type='text'>
For better grouping, in time we may end up making most of these static,
i.e. generalizing the 'perf record' synthesizing code so that based on
the target it can do the right thing and call the needed synthesizers.

Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Link: https://lkml.kernel.org/n/tip-s9zxxhk40s95pjng9panet16@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf dsos: Move the dsos struct and its methods to separate source files</title>
<updated>2019-09-01T01:24:10+00:00</updated>
<author>
<name>Arnaldo Carvalho de Melo</name>
<email>acme@redhat.com</email>
</author>
<published>2019-08-30T14:11:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/blackbird-op-linux/commit/?id=4a3cec84949d14dc3ef7fb8a51b8949af93cac13'/>
<id>urn:sha1:4a3cec84949d14dc3ef7fb8a51b8949af93cac13</id>
<content type='text'>
So that we can reduce the header dependency tree further, in the process
noticed that lots of places were getting even things like build-id
routines and 'struct perf_tool' definition indirectly, so fix all those
too.

Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Link: https://lkml.kernel.org/n/tip-ti0btma9ow5ndrytyoqdk62j@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf cacheline: Move cacheline related routines to separate files</title>
<updated>2019-08-26T14:58:29+00:00</updated>
<author>
<name>Arnaldo Carvalho de Melo</name>
<email>acme@redhat.com</email>
</author>
<published>2019-08-22T19:58:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/blackbird-op-linux/commit/?id=125009026bfc9ec929975d35344bf69d2c636e95'/>
<id>urn:sha1:125009026bfc9ec929975d35344bf69d2c636e95</id>
<content type='text'>
To disentangle util/sort.h a bit more.

Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Link: https://lkml.kernel.org/n/tip-6kbf2cauas06rbqp15pyter5@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf evswitch: Move switch logic to use in other tools</title>
<updated>2019-08-15T15:24:31+00:00</updated>
<author>
<name>Arnaldo Carvalho de Melo</name>
<email>acme@redhat.com</email>
</author>
<published>2019-08-15T14:00:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/blackbird-op-linux/commit/?id=8829e56fa050998164e496d380cd69186ae9b8d0'/>
<id>urn:sha1:8829e56fa050998164e496d380cd69186ae9b8d0</id>
<content type='text'>
Now other tools that want switching can use an evswitch for that, just
set it up and add it to the PERF_RECORD_SAMPLE processing function.

Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Florian Weimer &lt;fweimer@redhat.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: William Cohen &lt;wcohen@redhat.com&gt;
Link: https://lkml.kernel.org/n/tip-b1trj1q97qwfv251l66q3noj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
</feed>
