| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
The patch gives out the details of the znver2 scheduler model.
There are few improvements with respect to execution units, latencies and
throughput when compared with znver1.
The tests that were present for znver1 for llvm-mca tool were replicated.
The latencies, execution units, timeline and throughput information are updated for znver2.
Reviewers: craig.topper, Simon Pilgrim
Differential Revision: https://reviews.llvm.org/D66088
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
As disscused in https://bugs.llvm.org/show_bug.cgi?id=43219,
i believe it may be somewhat useful to show //some// aggregates
over all the sea of statistics provided.
Example:
```
Average Wait times (based on the timeline view):
[0]: Executions
[1]: Average time spent waiting in a scheduler's queue
[2]: Average time spent waiting in a scheduler's queue while ready
[3]: Average time elapsed from WB until retire stage
[0] [1] [2] [3]
0. 3 1.0 1.0 4.7 vmulps %xmm0, %xmm1, %xmm2
1. 3 2.7 0.0 2.3 vhaddps %xmm2, %xmm2, %xmm3
2. 3 6.0 0.0 0.0 vhaddps %xmm3, %xmm3, %xmm4
3 3.2 0.3 2.3 <total>
```
I.e. we average the averages.
Reviewers: andreadb, mattd, RKSimon
Reviewed By: andreadb
Subscribers: gbedwell, arphaman, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68714
llvm-svn: 374361
|
|
|
|
|
|
| |
D66424 adds the base support for LOCK so we should be able to add special case support for all these cases in future patches
llvm-svn: 369367
|
|
|
|
|
|
|
|
|
|
| |
resources-x86_64.s files. NFC
In D66424 it has been requested to move all the new tests added by r369278 into
resources-x86_64.s. That is because only the 8b/16 ops should be tested by
resources-cmpxchg.s. This partially reverts r369278.
llvm-svn: 369288
|
|
|
|
|
|
| |
Addresses a review comment in D66424
llvm-svn: 369279
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
llvm.x86.sse.stmxcsr only writes to memory.
llvm.x86.sse.ldmxcsr only reads from memory, and might generate an FPE.
Reviewers: craig.topper, RKSimon
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62896
llvm-svn: 363773
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
printing.
We require d/q suffixes on the memory form of these instructions to disambiguate the memory size.
We don't require it on the register forms, but need to support parsing both with and without it.
Previously we always printed the d/q suffix on the register forms, but it's redundant and
inconsistent with gcc and objdump.
After this patch we should support the d/q for parsing, but not print it when its unneeded.
llvm-svn: 360085
|
|
|
|
|
|
| |
This is defined as part of SSE1, XMM PMOVMSKB doesn't appear until SSE2
llvm-svn: 359477
|
|
|
|
| |
llvm-svn: 359397
|
|
|
|
|
|
|
|
|
|
| |
custom printing and custom parsing to achieve the same result and more
Similar to previous change done for VPCOM and VPCMP
Differential Revision: https://reviews.llvm.org/D59468
llvm-svn: 356384
|
|
|
|
|
|
| |
Some targets have fast-path handling for these patterns that we should model.
llvm-svn: 355498
|
|
|
|
|
|
|
|
|
|
| |
arguments where on is %st.
All of these instructions consume one encoded register and the other register is %st. They either write the result to %st or the encoded register. Previously we printed both arguments when the encoded register was written. And we printed one argument when the result was written to %st. For the stack popping forms the encoded register is always the destination and we didn't print both operands. This was inconsistent with gcc and objdump and just makes the output assembly code harder to read.
This patch changes things to always print both operands making us consistent with gcc and objdump. The parser should still be able to handle the single register forms just as it did before. This also matches the GNU assembler behavior.
llvm-svn: 353061
|
|
|
|
|
|
|
|
| |
printing it as %st(0) when its encoded in the instruction.
This is a step back from the change I made in r352985. This appears to be more consistent with gcc and objdump behavior.
llvm-svn: 353015
|
|
|
|
|
|
|
|
|
|
| |
as the clobber name to make MS inline asm work correctly"
Looking into gcc and objdump behavior more this was overly aggressive. If the register is encoded in the instruction we should print %st(0), if its implicit we should print %st.
I'll be making a more directed change in a future patch.
llvm-svn: 353013
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
name to make MS inline asm work correctly
Summary:
When calculating clobbers for MS style inline assembly we fail if the asm clobbers stack top because we print st(0) and try to pass it through the gcc register name check. This was found with when I attempted to make a emms/femms clobber all ST registers. If you use emms/femms in MS inline asm we would try to use st(0) as the clobber name but clang would think that wasn't a valid clobber name.
This also matches what objdump disassembly prints. It's also what is printed by gcc -S.
Reviewers: RKSimon, rnk, efriedma, spatel, andreadb, lebedev.ri
Reviewed By: rnk
Subscribers: eraman, gbedwell, lebedev.ri, llvm-commits
Differential Revision: https://reviews.llvm.org/D57621
llvm-svn: 352985
|
|
|
|
|
|
| |
We're getting pretty close to matching/exceeding test coverage of the test\CodeGen\X86\*-schedule.ll files, which should allow us to get rid of -print-schedule and fix PR37160
llvm-svn: 351836
|
|
|
|
|
|
| |
and rdtsc/rdtscp tests
llvm-svn: 351835
|
|
|
|
| |
llvm-svn: 351831
|
|
|
|
|
|
| |
These technically should be under a MONITOR cpuid bit, but we tag them as SSE3 so I've done that here as well.
llvm-svn: 351829
|
|
|
|
| |
llvm-svn: 351828
|
|
|
|
| |
llvm-svn: 351827
|
|
|
|
|
|
| |
Add missing non-VEX instructions
llvm-svn: 348623
|
|
|
|
| |
llvm-svn: 348622
|
|
|
|
| |
llvm-svn: 343421
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
generated by the SummaryView.
This patch adds two new fields to the perf report generated by the SummaryView.
Fields are now logically organized into two small groups; only the second group
contains throughput indicators.
Example:
```
Iterations: 100
Instructions: 300
Total Cycles: 414
Total uOps: 700
Dispatch Width: 4
uOps Per Cycle: 1.69
IPC: 0.72
Block RThroughput: 4.0
```
This patch also updates the docs for llvm-mca.
Due to the nature of this change, several tests in the tools/llvm-mca directory
were affected, and had to be updated using script `update_mca_test_checks.py`.
llvm-svn: 340946
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D49912
llvm-svn: 339145
|
|
|
|
|
|
| |
I've put CMPXCHG8B/CMPXCHG16B in the same file, even though technically they are under separate CPUID bits all targets seem to support both (or neither).
llvm-svn: 338595
|
|
|
|
|
|
| |
These aren't just available via 3DNow! so test for them separately as well.
llvm-svn: 338584
|
|
|
|
|
|
| |
Renamed the btver2 file that already contained them - the other targets were only testing the AVX versions
llvm-svn: 338583
|
|
|
|
| |
llvm-svn: 338576
|
|
|
|
|
|
| |
We already added these to btver2, now add them to other targets, even though none of their models treat them specially (yet).
llvm-svn: 338565
|
|
|
|
|
|
| |
CPUID, IN/OUT, INS/OUTS, INT, PAUSE, SCAS, UD2, XLAT
llvm-svn: 338563
|
|
|
|
| |
llvm-svn: 338550
|
|
|
|
| |
llvm-svn: 338532
|
|
|
|
| |
llvm-svn: 338514
|
|
|
|
|
|
| |
These aren't exhaustive, but cover some instructions that are only available in 32-bit mode (where would we be without good BCD math performance?).
llvm-svn: 338404
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
partially written registers.
Summary:
Pretty mechanical follow-up for D49196.
As microarchitecture.pdf notes, "20 AMD Ryzen pipeline",
"20.8 Register renaming and out-of-order schedulers":
The integer register file has 168 physical registers of 64 bits each.
The floating point register file has 160 registers of 128 bits each.
"20.14 Partial register access":
The processor always keeps the different parts of an integer register together.
...
An instruction that writes to part of a register will therefore have a false dependence
on any previous write to the same register or any part of it.
Reviewers: andreadb, courbet, RKSimon, craig.topper, GGanesh
Reviewed By: GGanesh
Subscribers: gbedwell, llvm-commits
Differential Revision: https://reviews.llvm.org/D49393
llvm-svn: 337676
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: andreadb, courbet, RKSimon, craig.topper, GGanesh
Reviewed By: GGanesh
Subscribers: gbedwell, llvm-commits
Differential Revision: https://reviews.llvm.org/D49392
llvm-svn: 337675
|
|
|
|
| |
llvm-svn: 337586
|
|
|
|
|
|
| |
x86_64 resource tests
llvm-svn: 337306
|
|
|
|
|
|
| |
SNB doesn't support MOVBE but the numbers in Generic (which use the SNB model) look sane.
llvm-svn: 337305
|
|
|
|
| |
llvm-svn: 337302
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before revision 336728, the "mayLoad" flag for instruction (V)MOVLPSrm was
inferred directly from the "default" pattern associated with the instruction
definition.
r336728 removed special node X86Movlps, and all the patterns associated to it.
Now instruction (V)MOVLPSrm doesn't have a pattern associated to it, and the
'mayLoad/hasSideEffects' flags are left unset.
When the instruction info is emitted by tablegen, method
CodeGenDAGPatterns::InferInstructionFlags() sees that (V)MOVLPSrm doesn't have a
pattern, and flags are undefined. So, it conservatively sets the
"hasSideEffects" flag for it.
As a consequence, we were losing the 'mayLoad' flag, and we were gaining a
'hasSideEffect' flag in its place.
This patch fixes the issue (originally reported by Michael Holmen).
The mca tests show the differences in the instruction info flags. Instructions
that were affected by this problem were: MOVLPSrm/VMOVLPSrm/VMOVLPSZ128rm.
Differential Revision: https://reviews.llvm.org/D49182
llvm-svn: 336818
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
in the Instruction Info View. NFC
This makes easier to identify changes in the instruction info flags. It also
helps spotting potential regressions similar to the one recently introduced at
r336728.
Using the same character to mark MayLoad/MayStore/HasSideEffects is problematic
for llvm-lit. When pattern matching substrings, llvm-lit consumes tabs and
spaces. A change in position of the flag marker may not trigger a test failure.
This patch only changes the character used for flag `hasSideEffects`. The reason
why I didn't touch other flags is because I want to avoid spamming the mailing
because of the massive diff due to the numerous tests affected by this change.
In future, each instruction flag should be associated with a different character
in the Instruction Info View.
llvm-svn: 336797
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: RKSimon, andreadb, courbet
Reviewed By: RKSimon
Subscribers: gbedwell, llvm-commits
Differential Revision: https://reviews.llvm.org/D48997
llvm-svn: 336510
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
I ran llvm-exegesis on SKX, SKL, BDW, HSW, SNB.
Atom is from Agner and SLM is a guess.
I've left AMD processors alone.
Reviewers: RKSimon, craig.topper
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D48079
llvm-svn: 335097
|
|
|
|
|
|
|
|
| |
in reg-reg cases
I noticed while working on zero-idiom + dependency-breaking support (PR36671) that most of our binary instruction tests were reusing the same src registers, which would cause the tests to fail once we enable scalar zero-idiom support on btver2. Fixed in all targets to keep them in sync.
llvm-svn: 334110
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a fix for the problem arising in D47374 (PR37678):
https://bugs.llvm.org/show_bug.cgi?id=37678
We may not have throughput info because it's not specified in the model
or it's not available with variant scheduling, so assume that those
instructions can execute/complete at max-issue-width.
Differential Revision: https://reviews.llvm.org/D47723
llvm-svn: 334055
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
It's super irritating.
[properly configured] git client then complains about that double-newline,
and you have to use `--force` to ignore the warning, since even if you
fix it manually, it will be reintroduced the very next runtime :/
Reviewers: RKSimon, andreadb, courbet, craig.topper, javed.absar, gbedwell
Reviewed By: gbedwell
Subscribers: javed.absar, tschuett, gbedwell, llvm-commits
Differential Revision: https://reviews.llvm.org/D47697
llvm-svn: 333887
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
After SNB, Intel CPUs can rename CF independently of other EFLAGS,
so the renamer can zero it for free. Note that STC still consumes resources.
To reproduce: `$ llvm-exegesis -mode=uops -opcode-name=CLC`
On SNB:
```
---
key:
opcode_name: CLC
mode: uops
config: ''
cpu_name: sandybridge
llvm_triple: x86_64-unknown-linux-gnu
num_repetitions: 10000
measurements:
- { key: '3', value: 0.0014, debug_string: SBPort0 }
- { key: '4', value: 0.0013, debug_string: SBPort1 }
- { key: '5', value: 0.0003, debug_string: SBPort4 }
- { key: '6', value: 0.0029, debug_string: SBPort5 }
- { key: '10', value: 0.0003, debug_string: SBPort23 }
error: ''
info: 'instruction is serial, repeating a random one.
Snippet:
CLC
'
...
```
On HSW:
```
---
key:
opcode_name: CLC
mode: uops
config: ''
cpu_name: haswell
llvm_triple: x86_64-unknown-linux-gnu
num_repetitions: 10000
measurements:
- { key: '3', value: 0.001, debug_string: HWPort0 }
- { key: '4', value: 0.0009, debug_string: HWPort1 }
- { key: '5', value: 0.0004, debug_string: HWPort2 }
- { key: '6', value: 0.0006, debug_string: HWPort3 }
- { key: '7', value: 0.0002, debug_string: HWPort4 }
- { key: '8', value: 0.0012, debug_string: HWPort5 }
- { key: '9', value: 0.0022, debug_string: HWPort6 }
- { key: '10', value: 0.0001, debug_string: HWPort7 }
error: ''
info: 'instruction is serial, repeating a random one.
Snippet:
CLC
'
...
```
Reviewers: craig.topper, RKSimon
Subscribers: gchatelet, llvm-commits
Differential Revision: https://reviews.llvm.org/D47362
llvm-svn: 333392
|