| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
As disscused in https://bugs.llvm.org/show_bug.cgi?id=43219,
i believe it may be somewhat useful to show //some// aggregates
over all the sea of statistics provided.
Example:
```
Average Wait times (based on the timeline view):
[0]: Executions
[1]: Average time spent waiting in a scheduler's queue
[2]: Average time spent waiting in a scheduler's queue while ready
[3]: Average time elapsed from WB until retire stage
[0] [1] [2] [3]
0. 3 1.0 1.0 4.7 vmulps %xmm0, %xmm1, %xmm2
1. 3 2.7 0.0 2.3 vhaddps %xmm2, %xmm2, %xmm3
2. 3 6.0 0.0 0.0 vhaddps %xmm3, %xmm3, %xmm4
3 3.2 0.3 2.3 <total>
```
I.e. we average the averages.
Reviewers: andreadb, mattd, RKSimon
Reviewed By: andreadb
Subscribers: gbedwell, arphaman, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68714
llvm-svn: 374361
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On Jaguar, horizontal adds/subs have local forwarding disable.
That means, we pay a compulsory extra cycle of write-back stage, and the value
is not available until the end of that stage.
This patch changes the latency of horizontal operations by adding an extra
cycle. With this patch, latency numbers now match what is reported by perf.
I plan to send another patch to also 'fix' the latency of shuffle operations (on
Jaguar, local forwarding is disabled for vector shuffles too).
Differential Revision: https://reviews.llvm.org/D56777
llvm-svn: 351366
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
generated by the SummaryView.
This patch adds two new fields to the perf report generated by the SummaryView.
Fields are now logically organized into two small groups; only the second group
contains throughput indicators.
Example:
```
Iterations: 100
Instructions: 300
Total Cycles: 414
Total uOps: 700
Dispatch Width: 4
uOps Per Cycle: 1.69
IPC: 0.72
Block RThroughput: 4.0
```
This patch also updates the docs for llvm-mca.
Due to the nature of this change, several tests in the tools/llvm-mca directory
were affected, and had to be updated using script `update_mca_test_checks.py`.
llvm-svn: 340946
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
in the Instruction Info View. NFC
This makes easier to identify changes in the instruction info flags. It also
helps spotting potential regressions similar to the one recently introduced at
r336728.
Using the same character to mark MayLoad/MayStore/HasSideEffects is problematic
for llvm-lit. When pattern matching substrings, llvm-lit consumes tabs and
spaces. A change in position of the flag marker may not trigger a test failure.
This patch only changes the character used for flag `hasSideEffects`. The reason
why I didn't touch other flags is because I want to avoid spamming the mailing
because of the massive diff due to the numerous tests affected by this change.
In future, each instruction flag should be associated with a different character
in the Instruction Info View.
llvm-svn: 336797
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
It's super irritating.
[properly configured] git client then complains about that double-newline,
and you have to use `--force` to ignore the warning, since even if you
fix it manually, it will be reintroduced the very next runtime :/
Reviewers: RKSimon, andreadb, courbet, craig.topper, javed.absar, gbedwell
Reviewed By: gbedwell
Subscribers: javed.absar, tschuett, gbedwell, llvm-commits
Differential Revision: https://reviews.llvm.org/D47697
llvm-svn: 333887
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch implements the "block reciprocal throughput" computation in the
SummaryView.
The block reciprocal throughput is computed as the MAX of:
- NumMicroOps / DispatchWidth
- Resource Cycles / #Units (for every resource consumed).
The block throughput is bounded from above by the hardware dispatch throughput.
That is because the DispatchWidth is an upper bound on how many opcodes can be part
of a single dispatch group.
The block throughput is also limited by the amount of hardware parallelism. The
number of available resource units affects how the resource pressure is
distributed, and also how many blocks can be delivered every cycle.
llvm-svn: 333095
|
|
|
|
| |
llvm-svn: 332447
|
|
|
|
|
|
|
|
|
|
|
| |
This script can be used to regenerate tests in the
test/tools/llvm-mca directory (PR36904).
Regenerated a number of tests using the pattern: test/tools/llvm-mca/*/*/*.s
Differential Revision: https://reviews.llvm.org/D45369
llvm-svn: 330246
|
|
|
|
|
|
|
|
| |
timeline view."
This reapplies r329403 with a fix for the floating point rounding issue.
llvm-svn: 329680
|
|
|
|
|
|
|
|
|
|
|
|
| |
timeline view."
This made AArch64/CortexA57/direct-branch.s fail on Windows, e.g.
http://lab.llvm.org:8011/builders/clang-x86-windows-msvc2015/builds/11251
> Also, update a few tests to minimize the diff in D45369.
> No functional change intended.
llvm-svn: 329569
|
|
|
|
|
|
|
| |
Also, update a few tests to minimize the diff in D45369.
No functional change intended.
llvm-svn: 329403
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds the ability to describe properties of the hardware retire
control unit.
Tablegen class RetireControlUnit has been added for this purpose (see
TargetSchedule.td).
A RetireControlUnit specifies the size of the reorder buffer, as well as the
maximum number of opcodes that can be retired every cycle.
A zero (or negative) value for the reorder buffer size means: "the size is
unknown". If the size is unknown, then llvm-mca defaults it to the value of
field SchedMachineModel::MicroOpBufferSize. A zero or negative number of
opcodes retired per cycle means: "there is no restriction on the number of
instructions that can be retired every cycle".
Models can optionally specify an instance of RetireControlUnit. There can only
be up-to one RetireControlUnit definition per scheduling model.
Information related to the RCU (RetireControlUnit) is stored in (two new fields
of) MCExtraProcessorInfo. llvm-mca loads that information when it initializes
the DispatchUnit / RetireControlUnit (see Dispatch.h/Dispatch.cpp).
This patch fixes PR36661.
Differential Revision: https://reviews.llvm.org/D45259
llvm-svn: 329304
|
|
|
|
|
|
|
|
|
|
| |
unit for JWriteResFpuPair defs
Jaguar's FPU has 2 scheduler pipes (JFPU0/JFPU1) which forward to multiple functional sub-units each. We need to model that an micro-op will both consume the scheduler pipe and a functional unit.
This patch just handles the ops defined through JWriteResFpuPair, I'll go through the custom cases later.
llvm-svn: 327791
|
|
|
|
|
|
|
| |
In future, both the summary information and the 'instruction info' table should
be moved into a separate "Summary" view.
llvm-svn: 327010
|
|
llvm-mca is an LLVM based performance analysis tool that can be used to
statically measure the performance of code, and to help triage potential
problems with target scheduling models.
llvm-mca uses information which is already available in LLVM (e.g. scheduling
models) to statically measure the performance of machine code in a specific cpu.
Performance is measured in terms of throughput as well as processor resource
consumption. The tool currently works for processors with an out-of-order
backend, for which there is a scheduling model available in LLVM.
The main goal of this tool is not just to predict the performance of the code
when run on the target, but also help with diagnosing potential performance
issues.
Given an assembly code sequence, llvm-mca estimates the IPC (instructions per
cycle), as well as hardware resources pressure. The analysis and reporting style
were mostly inspired by the IACA tool from Intel.
This patch is related to the RFC on llvm-dev visible at this link:
http://lists.llvm.org/pipermail/llvm-dev/2018-March/121490.html
Differential Revision: https://reviews.llvm.org/D43951
llvm-svn: 326998
|