| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
|
| |
There was a missing space before the instruction name, and the newline
is redundant since MI::print by default adds one.
llvm-svn: 353046
|
| |
|
|
|
|
|
| |
Also add some more select tests to help show future legalization
changes.
llvm-svn: 353045
|
| |
|
|
|
|
|
|
| |
Noticed while investigating PR40483, and fixes the basic test case from the bug - but not a more general case.
We're pretty weak at dealing with ADD/SUB combines compared to the SimplifyAssociativeOrCommutative/SimplifyUsingDistributiveLaws abilities that InstCombine can manage.
llvm-svn: 353044
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch removes hidden codegen flag -print-schedule effectively reverting the
logic originally committed as r300311
(https://llvm.org/viewvc/llvm-project?view=revision&revision=300311).
Flag -print-schedule was originally introduced by r300311 to address PR32216
(https://bugs.llvm.org/show_bug.cgi?id=32216). That bug was about adding "Better
testing of schedule model instruction latencies/throughputs".
These days, we can use llvm-mca to test scheduling models. So there is no longer
a need for flag -print-schedule in LLVM. The main use case for PR32216 is
now addressed by llvm-mca.
Flag -print-schedule is mainly used for debugging purposes, and it is only
actually used by x86 specific tests. We already have extensive (latency and
throughput) tests under "test/tools/llvm-mca" for X86 processor models. That
means, most (if not all) existing -print-schedule tests for X86 are redundant.
When flag -print-schedule was first added to LLVM, several files had to be
modified; a few APIs gained new arguments (see for example method
MCAsmStreamer::EmitInstruction), and MCSubtargetInfo/TargetSubtargetInfo gained
a couple of getSchedInfoStr() methods.
Method getSchedInfoStr() had to originally work for both MCInst and
MachineInstr. The original implmentation of getSchedInfoStr() introduced a
subtle layering violation (reported as PR37160 and then fixed/worked-around by
r330615).
In retrospect, that new API could have been designed more optimally. We can
always query MCSchedModel to get the latency and throughput. More importantly,
the "sched-info" string should not have been generated by the subtarget.
Note, r317782 fixed an issue where "print-schedule" didn't work very well in the
presence of inline assembly. That commit is also reverted by this change.
Differential Revision: https://reviews.llvm.org/D57244
llvm-svn: 353043
|
| |
|
|
|
|
| |
Noticed while investigating PR40483
llvm-svn: 353042
|
| |
|
|
| |
llvm-svn: 353041
|
| |
|
|
|
|
|
|
|
|
| |
This prevents Constant Hoisting from pulling the constant out of the block,
allowing us to still produce LDRH/UXTH nodes. LDRB/UXTB (255) is already cheap
by the default getIntImmCost, but I've added it for clarity.
Differential Revision: https://reviews.llvm.org/D57671
llvm-svn: 353040
|
| |
|
|
| |
llvm-svn: 353039
|
| |
|
|
| |
llvm-svn: 353038
|
| |
|
|
| |
llvm-svn: 353037
|
| |
|
|
| |
llvm-svn: 353036
|
| |
|
|
| |
llvm-svn: 353035
|
| |
|
|
| |
llvm-svn: 353034
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
From https://bugs.llvm.org/show_bug.cgi?id=40516
```
$ cat a.cpp
const NamespaceName::VeryLongClassName &NamespaceName::VeryLongClassName::myFunction() {
// do stuff
}
const NamespaceName::VeryLongClassName &NamespaceName::VeryLongClassName::operator++() {
// do stuff
}
$ ~/ll/build/opt/bin/clang-format -style=LLVM a.cpp
const NamespaceName::VeryLongClassName &
NamespaceName::VeryLongClassName::myFunction() {
// do stuff
}
const NamespaceName::VeryLongClassName &NamespaceName::VeryLongClassName::
operator++() {
// do stuff
}
```
What was happening is that the split penalty before `operator` was being set to
a smaller value by a prior if block. Moved checks around to fix this and added a
regression test.
Reviewers: djasper
Reviewed By: djasper
Tags: #clang
Differential Revision: https://reviews.llvm.org/D57604
llvm-svn: 353033
|
| |
|
|
| |
llvm-svn: 353032
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Enables users to add comment handlers to preprocessor when building
preambles.
Reviewers: ilya-biryukov, ioeric
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D57507
llvm-svn: 353030
|
| |
|
|
| |
llvm-svn: 353028
|
| |
|
|
|
|
|
|
| |
CHANGELOG:
- cleanup filestatus caches when clangd crashes
- extension workwith LSP v3.14.0, support go-to-declaration feature
llvm-svn: 353027
|
| |
|
|
|
|
| |
This allows us to use latest LSP v3.14.0 (for go-to-declaration feature).
llvm-svn: 353026
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(YamlContext::getRegNo())
Summary:
Together with the previous patch, it's an -90% improvement,
or roughly -96% improvement if you look starting with rL347204
```
$ perf stat -r 9 ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput-onefull.yaml -analysis-clusters-output-file="" -analysis-inconsistencies-output-file=/tmp/clusters-bew.html
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 14656 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-bew.html'
...
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 14656 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-bew.html'
Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput-onefull.yaml -analysis-clusters-output-file= -analysis-inconsistencies-output-file=/tmp/clusters-bew.html' (9 runs):
1483.18 msec task-clock # 0.999 CPUs utilized ( +- 0.10% )
68 context-switches # 46.085 M/sec ( +- 22.62% )
0 cpu-migrations # 0.000 K/sec
11641 page-faults # 7850.880 M/sec ( +- 0.62% )
5943246799 cycles # 4008184.428 GHz ( +- 0.10% ) (83.28%)
442869514 stalled-cycles-frontend # 7.45% frontend cycles idle ( +- 0.41% ) (83.29%)
1443375663 stalled-cycles-backend # 24.29% backend cycles idle ( +- 0.47% ) (33.43%)
7714006752 instructions # 1.30 insn per cycle
# 0.19 stalled cycles per insn ( +- 0.07% ) (50.17%)
1977242936 branches # 1333472193.855 M/sec ( +- 0.07% ) (66.79%)
32624220 branch-misses # 1.65% of all branches ( +- 0.18% ) (83.34%)
1.48438 +- 0.00143 seconds time elapsed ( +- 0.10% )
```
```
$ perf stat -r 9 ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput-onefull.yaml -analysis-clusters-output-file="" -analysis-inconsistencies-output-file=/tmp/clusters-newer.html
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 14656 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-newer.html'
...
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 14656 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-newer.html'
Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput-onefull.yaml -analysis-clusters-output-file= -analysis-inconsistencies-output-file=/tmp/clusters-newer.html' (9 runs):
963.28 msec task-clock # 0.999 CPUs utilized ( +- 0.37% )
12 context-switches # 12.695 M/sec ( +- 52.79% )
0 cpu-migrations # 0.000 K/sec
11599 page-faults # 12046.971 M/sec ( +- 0.59% )
3860122322 cycles # 4009359.596 GHz ( +- 0.37% ) (83.19%)
380300669 stalled-cycles-frontend # 9.85% frontend cycles idle ( +- 0.34% ) (83.30%)
1071910340 stalled-cycles-backend # 27.77% backend cycles idle ( +- 1.30% ) (33.51%)
4773418224 instructions # 1.24 insn per cycle
# 0.22 stalled cycles per insn ( +- 0.15% ) (50.17%)
1106990316 branches # 1149787979.919 M/sec ( +- 0.11% ) (66.80%)
23632231 branch-misses # 2.13% of all branches ( +- 0.18% ) (83.33%)
0.96389 +- 0.00356 seconds time elapsed ( +- 0.37% )
```
```
$ sha512sum /tmp/clusters-*
db4bbd904fe8840853b589b032c5041bc060b91bcd9c27b914b56581fbc473550eea74b852238c79963b5adf2419f379e9f5db76784048b48e3937f9f3e732bf /tmp/clusters-bew.html
db4bbd904fe8840853b589b032c5041bc060b91bcd9c27b914b56581fbc473550eea74b852238c79963b5adf2419f379e9f5db76784048b48e3937f9f3e732bf /tmp/clusters-newer.html
db4bbd904fe8840853b589b032c5041bc060b91bcd9c27b914b56581fbc473550eea74b852238c79963b5adf2419f379e9f5db76784048b48e3937f9f3e732bf /tmp/clusters-old.html
```
Reviewers: courbet, gchatelet
Reviewed By: courbet
Subscribers: tschuett, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57658
llvm-svn: 353025
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(YamlContext::getInstrOpcode())
Summary:
```
$ perf stat -r 9 ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput-onefull.yaml -analysis-clusters-output-file="" -analysis-inconsistencies-output-file=/tmp/clusters-old.html
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 14656 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-old.html'
...
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 14656 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-old.html'
Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput-onefull.yaml -analysis-clusters-output-file= -analysis-inconsistencies-output-file=/tmp/clusters-old.html' (9 runs):
9465.46 msec task-clock # 1.000 CPUs utilized ( +- 0.05% )
60 context-switches # 6.363 M/sec ( +- 79.45% )
0 cpu-migrations # 0.000 K/sec
11364 page-faults # 1200.697 M/sec ( +- 0.60% )
37935623543 cycles # 4008083.912 GHz ( +- 0.05% ) (83.32%)
2371625356 stalled-cycles-frontend # 6.25% frontend cycles idle ( +- 0.37% ) (83.32%)
8476077875 stalled-cycles-backend # 22.34% backend cycles idle ( +- 0.18% ) (33.36%)
41822439158 instructions # 1.10 insn per cycle
# 0.20 stalled cycles per insn ( +- 0.02% ) (50.03%)
11607658944 branches # 1226405861.486 M/sec ( +- 0.01% ) (66.69%)
210864633 branch-misses # 1.82% of all branches ( +- 0.06% ) (83.34%)
9.46636 +- 0.00441 seconds time elapsed ( +- 0.05% )
```
```
$ perf stat -r 9 ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput-onefull.yaml -analysis-clusters-output-file="" -analysis-inconsistencies-output-file=/tmp/clusters-bew.html
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 14656 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-bew.html'
...
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 14656 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-bew.html'
Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput-onefull.yaml -analysis-clusters-output-file= -analysis-inconsistencies-output-file=/tmp/clusters-bew.html' (9 runs):
1480.66 msec task-clock # 1.000 CPUs utilized ( +- 0.19% )
13 context-switches # 8.483 M/sec ( +- 83.10% )
0 cpu-migrations # 0.075 M/sec ( +-100.00% )
11596 page-faults # 7834.247 M/sec ( +- 0.59% )
5933732194 cycles # 4008977.535 GHz ( +- 0.19% ) (83.22%)
438111928 stalled-cycles-frontend # 7.38% frontend cycles idle ( +- 0.37% ) (83.25%)
1454969705 stalled-cycles-backend # 24.52% backend cycles idle ( +- 0.94% ) (33.53%)
7724218604 instructions # 1.30 insn per cycle
# 0.19 stalled cycles per insn ( +- 0.07% ) (50.14%)
1979796413 branches # 1337599858.945 M/sec ( +- 0.06% ) (66.74%)
32641638 branch-misses # 1.65% of all branches ( +- 0.18% ) (83.31%)
1.48128 +- 0.00284 seconds time elapsed ( +- 0.19% )
$ sha512sum /tmp/clusters-*
db4bbd904fe8840853b589b032c5041bc060b91bcd9c27b914b56581fbc473550eea74b852238c79963b5adf2419f379e9f5db76784048b48e3937f9f3e732bf /tmp/clusters-bew.html
db4bbd904fe8840853b589b032c5041bc060b91bcd9c27b914b56581fbc473550eea74b852238c79963b5adf2419f379e9f5db76784048b48e3937f9f3e732bf /tmp/clusters-old.html
```
Reviewers: courbet, gchatelet
Reviewed By: courbet
Subscribers: tschuett, llvm-commits, RKSimon
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57657
llvm-svn: 353024
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
D57000 / [[ https://bugs.llvm.org/show_bug.cgi?id=37698 | PR37698 ]] added support for measuring of the inverse throughput.
But the support for the analysis was not added.
This attempts to fix that. (analysis done o bdver2 / piledriver)
First, small-scale experiment:
```
$ ./bin/llvm-exegesis -num-repetitions=10000 -mode=inverse_throughput -opcode-name=BSF64rr
Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-d0acdd.o
---
mode: inverse_throughput
key:
instructions:
- 'BSF64rr RAX RDX'
config: ''
register_initial_values:
- 'RDX=0x0'
cpu_name: bdver2
llvm_triple: x86_64-unknown-linux-gnu
num_repetitions: 10000
measurements:
- { key: inverse_throughput, value: 3.0278, per_snippet_value: 3.0278 }
error: ''
info: instruction has no tied variables picking Uses different from defs
assembled_snippet: 48BA0000000000000000480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2C3
...
```
If we plug `bsfq %r12, %r10` into llvm-mca:
https://godbolt.org/z/ZtOyhJ
```
Dispatch Width: 4
uOps Per Cycle: 3.00
IPC: 0.50
Block RThroughput: 2.0
```
So RThroughput mismatch exists.
Now, let's upscale and analyse:
{F8207148}
`$ ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html`:
{F8207172}
{F8207197}
And if we now look at https://www.agner.org/optimize/instruction_tables.pdf,
`Reciprocal throughput` for `BSF r,r` is listed as `3`.
Yay?
Reviewers: courbet, gchatelet
Reviewed By: courbet
Subscribers: tschuett, RKSimon, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57647
llvm-svn: 353023
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
... from 8.
`VALIGNDZ128rmbik XMM0 XMM0 K1 XMM3 RDI i_0x1 i_0x0 i_0x1` instruction already has 9 components.
It does not matter much in terms of performance, but avoiding allocation seems to come with low cost here..
Reviewers: courbet, gchatelet
Reviewed By: courbet
Subscribers: tschuett, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57654
llvm-svn: 353022
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Up until the point i have looked in the source, i didn't even understood that
i can disable 'cluster' output. I have always silenced it via ` &> /dev/null`.
(And hoped it wasn't contributing much of the run time.)
While i expect that it has it's use-cases i never once needed it so far.
If i forget to silence it, console is completely flooded with that output.
How about not expecting users to opt-out of analyses,
but to explicitly specify the analyses that should be performed?
Reviewers: courbet, gchatelet
Reviewed By: courbet
Subscribers: tschuett, RKSimon, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57648
llvm-svn: 353021
|
| |
|
|
|
|
|
|
|
|
|
| |
Summary: this commit adds support to a new dependence type introduced in OpenMP
5.0. The LLVM OpenMP RTL already supports this feature, so we only need to
modify CLANG to take advantage of them.
Differential Revision: https://reviews.llvm.org/D57576
llvm-svn: 353018
|
| |
|
|
|
|
|
|
|
|
|
| |
Currently, SCEV creates SCEVUnknown for every node of unreachable code. If we
have a huge amounts of such code, we will be littering SE with these nodes. We could
just state that they all are undef and save some memory.
Differential Revision: https://reviews.llvm.org/D57567
Reviewed By: sanjoy
llvm-svn: 353017
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
We now print ST0 as 'st' when generating the clobber list for MS inline assembly in clang. This matches what the gcc reg name list expects.
Original commit message:
This fixes the test case in PR35982 by preventing MMX instructions that read MM0-7 from being moved below EMMS/FEMMS by the post RA scheduler.
Though as discussed in bugzilla, this is not a complete fix. There is still the possibility of reordering in IR or by the pre-RA scheduler.
Differential Revision: https://reviews.llvm.org/D57298
llvm-svn: 353016
|
| |
|
|
|
|
|
|
| |
printing it as %st(0) when its encoded in the instruction.
This is a step back from the change I made in r352985. This appears to be more consistent with gcc and objdump behavior.
llvm-svn: 353015
|
| |
|
|
|
|
| |
updates.
llvm-svn: 353014
|
| |
|
|
|
|
|
|
|
|
| |
as the clobber name to make MS inline asm work correctly"
Looking into gcc and objdump behavior more this was overly aggressive. If the register is encoded in the instruction we should print %st(0), if its implicit we should print %st.
I'll be making a more directed change in a future patch.
llvm-svn: 353013
|
| |
|
|
|
|
|
| |
The MachO tests can run on any target, but require that the x86 backend
is available. Broaden the coverage of the test.
llvm-svn: 353012
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
In llvm-nm, the symbol size was being computed only with --print-size option,
even though it was being printed in other cases, such as with --format=posix.
This patch simply removes the guard, so that the size is computed
independently of the later decision to print it or not.
Fixes PR39997.
Differential Revision: https://reviews.llvm.org/D57599
llvm-svn: 353011
|
| |
|
|
|
|
| |
This fixes compilation after SVN r352966 in SEH mode.
llvm-svn: 353010
|
| |
|
|
|
|
| |
explicit function types.
llvm-svn: 353009
|
| |
|
|
|
|
| |
Pointed out by Shoaib Meenai.
llvm-svn: 353008
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: PR40564
Reviewers: aprantl, rnk
Subscribers: llvm-commits, hiraditya
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57629
llvm-svn: 353007
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The description of what the various Expr::Ignore* do has drifted from the
actual implementation.
Inspection reveals that IgnoreParenImpCasts() is not equivalent to doing
IgnoreParens() + IgnoreImpCasts() until reaching a fixed point, but
IgnoreParenCasts() is equivalent to doing IgnoreParens() + IgnoreCasts()
until reaching a fixed point. There is also a fair amount of duplication
in the various Expr::Ignore* functions which increase the chance of further
future inconsistencies. In preparation for the next patch which will factor
out the implementation of the various Expr::Ignore*, do the following cleanups:
Remove Stmt::IgnoreImplicit, in favor of Expr::IgnoreImplicit. IgnoreImplicit
is the only function among all of the Expr::Ignore* which is available in Stmt.
There are only a few users of Stmt::IgnoreImplicit. They can just use instead
Expr::IgnoreImplicit like they have to do for the other Ignore*.
Move Expr::IgnoreImpCasts() from Expr.h to Expr.cpp. This made no difference
in the run-time with my usual benchmark (-fsyntax-only on all of Boost).
While we are at it, make IgnoreParenNoopCasts take a const reference to the
ASTContext for const correctness.
Update the comments to match what the Expr::Ignore* are actually doing.
I am not sure that listing exactly what each Expr::Ignore* do is optimal,
but it certainly looks better than the current state which is in my opinion
between misleading and just plain wrong.
The whole patch is NFC (if you count removing Stmt::IgnoreImplicit as NFC).
Differential Revision: https://reviews.llvm.org/D57266
Reviewed By: aaron.ballman
llvm-svn: 353006
|
| |
|
|
|
|
|
|
|
|
| |
As discussed in D50222, this changes the vector types in tests required for that revision to ones legal for X86.
Patch by @hermord (Dmytro Shynkevych)
Differential Revision: https://reviews.llvm.org/D56372
llvm-svn: 353004
|
| |
|
|
|
|
|
|
|
|
|
| |
There is currently no way to distinguish implicit from explicit
CXXThisExpr in the AST dump output.
Differential Revision: https://reviews.llvm.org/D57649
Reviewed By: steveire
llvm-svn: 353003
|
| |
|
|
|
|
|
| |
We don't need a mtctr/bctr for this test now; a regular
conditional branch is fine.
llvm-svn: 353002
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are 2 changes visible here:
1. There's no reason to limit this transform based on number
of condition registers. That diff allows PPC to produce
slightly better (dot-instructions should be generally good)
code.
Note: someone that cares about PPC codegen might want to
look closer at that output because it seems like we could
still improve this.
2. We (probably?) should not bother trying to form uaddo (or
other overflow ops) when there's no target support for such
an op. This goes beyond checking whether the op is expanded
because both PPC and AArch64 show better codegen for standard
types regardless of whether the op is legal/custom.
llvm-svn: 353001
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Enable tests that were previously disabled because they didn't work on
Windows.
Reviewers: morehouse
Reviewed By: morehouse
Subscribers: morehouse
Differential Revision: https://reviews.llvm.org/D57563
llvm-svn: 353000
|
| |
|
|
|
|
| |
getTargetShuffleMask can only do this safely if we're extracting the lowest subvector from a vector of the same result type.
llvm-svn: 352999
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the most important uaddo problem mentioned in PR31754:
https://bugs.llvm.org/show_bug.cgi?id=31754
...but that was overcome in x86 codegen with D57637.
That patch also corrects the inc vs. add regressions seen with the previous attempt at this.
Still, we want to make this matcher complete, so we can potentially canonicalize the pattern
even if it's an 'add 1' operation.
Pattern matching, however, shouldn't assume that we have canonicalized IR, so we match 4
commuted variants of uaddo.
There's also a test with a crazy type to show that the existing CGP transform based on this
matcher is not limited by target legality checks.
I'm not sure if the Hexagon diff means the test is no longer testing what it intended to
test, but that should be solvable in a follow-up.
Differential Revision: https://reviews.llvm.org/D57516
llvm-svn: 352998
|
| |
|
|
| |
llvm-svn: 352997
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Different Unix "errno" values are returned for the following scenarios:
$ echo test > /tmp/existingFile/impossibleDir/impossibleFile
"Not a directory"
$ echo test > /tmp/nonexistentDir/impossibleFile
"No such file or directory"
This fixes the regression introduced by r352971 / D57592.
llvm-svn: 352996
|
| |
|
|
| |
llvm-svn: 352995
|
| |
|
|
|
|
|
|
| |
Aim to use scalar source or lowest 128-bit vector directly.
We're still missing some VZMOVL_LOAD combines.
llvm-svn: 352994
|
| |
|
|
|
|
|
| |
There's probably no reason to try this transform
for an obviously unsupported op.
llvm-svn: 352993
|
| |
|
|
|
|
|
|
| |
The test specifiies the triple, so it needs to be in the
x86 directory in case a bot has been configured without
the x86 target.
llvm-svn: 352992
|