| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(trunc (X)), 0), Y, (or Y, C2))
Summary:
InstCombine likes to turn (icmp eq (and X, C1), 0) into (icmp slt (trunc (X)), 0) sometimes. This breaks foldSelectICmpAndOr's ability to recognize (select (icmp eq (and X, C1), 0), Y, (or Y, C2))->(or (shl (and X, C1), C3), y).
This patch tries to recover this. I had to flip around some of the early out checks so that I could create a new And instruction during the compare processing without it possibly never getting used.
Reviewers: spatel, majnemer, davide
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34184
llvm-svn: 306029
|
|
|
|
|
|
|
|
|
|
| |
If the components of the and/or had multiple uses, this transform created an additional instruction.
This patch makes sure we remove one of the components.
Differential Revision: https://reviews.llvm.org/D34498
llvm-svn: 306027
|
|
|
|
| |
llvm-svn: 306012
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are 2 parts to this patch made simultaneously to avoid a regression.
We're reversing the canonicalization that moves bitwise vector ops before bitcasts.
We're moving bitwise vector ops *after* bitcasts instead. That's the 1st and 3rd hunks
of the patch. The motivation is that there's only one fold that currently depends on
the existing canonicalization (see next), but there are many folds that would
automatically benefit from the new canonicalization.
PR33138 ( https://bugs.llvm.org/show_bug.cgi?id=33138 ) shows why/how we have these
patterns in IR.
There's an or(and,andn) pattern that requires an adjustment in order to continue matching
to 'select' because the bitcast changes position. This match is unfortunately complicated
because it requires 4 logic ops with optional bitcast and sext ops.
Test diffs:
1. The bitcast.ll and bitcast-bigendian.ll changes show the most basic difference -
bitcast comes before logic.
2. There are also tests with no diffs in bitcast.ll that verify that we're still doing
folds that were enabled by the previous canonicalization.
3. icmp-xor-signbit.ll shows the payoff. We don't need to adjust existing icmp patterns
to look through bitcasts.
4. logical-select.ll contains several tests for the or(and,andn) --> select fold to
verify that we are still handling those cases. The lone diff shows the movement of
the bitcast from the new canonicalization rule.
Differential Revision: https://reviews.llvm.org/D33517
llvm-svn: 306011
|
|
|
|
|
|
|
|
|
|
|
| |
This commit adds prologue code emission for stack probe function
calls.
Reviewed By: majnemer
Differential Revision: https://reviews.llvm.org/D34387
llvm-svn: 306010
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The ARM ELF ABI requires the linker to do interworking for wide
conditional branches from Thumb code to ARM code.
That was pointed out by @peter.smith in the comments for D33436.
Reviewers: rafael, peter.smith, echristo
Reviewed By: peter.smith
Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits, peter.smith
Differential Revision: https://reviews.llvm.org/D34447
llvm-svn: 306009
|
|
|
|
|
|
|
|
|
|
|
| |
This patch allows $AT to be used as a register name in assembly files.
Currently only $at is recognized as a valid register name.
Patch by Stanislav Ocovaj.
Differential Revision: https://reviews.llvm.org/D34348
llvm-svn: 306007
|
|
|
|
|
|
|
| |
Reserve an extra scavenging stack slot if the offset field in store-
-immediate instructions may overflow.
llvm-svn: 306004
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
encoding
Summary:
Despite that this instructions are listed in VOP2, they are treated as VOP3 in specs. They should not support SDWA.
There are no real instructions for them, but there are pseudo instructions.
Reviewers: arsenm, vpykhtin, cfang
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
Differential Revision: https://reviews.llvm.org/D34403
llvm-svn: 305999
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This has been deprecated since ARMARM v7-AR, release C.b, published back
in 2012.
This also removes test/CodeGen/Thumb2/ifcvt-neon.ll that originally was
introduced to check that conditionalization of Neon instructions did
happen when generating Thumb2. However, the test had evolved and was no
longer testing that. Rather than trying to adapt that test, this commit
introduces test/CodeGen/Thumb2/ifcvt-neon-deprecated.mir, since we can
now use the MIR framework to write nicer/more maintainable tests.
llvm-svn: 305998
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Rather than creating a separate ".rdata" section distinct from the
customary ".rodata" in ELF, ".rdata" switches to the ".rodata" section.
This patch relands r305949 and r305950 with the correct commit message
and addresses nit raised during review.
Patch By: John Baldwin!
Differential Revision: https://reviews.llvm.org/D34452
llvm-svn: 305995
|
|
|
|
|
|
|
|
| |
These appear to have been simply missing.
Differential Revision: https://reviews.llvm.org/D34461
llvm-svn: 305993
|
|
|
|
|
|
| |
This reverts commit r305960 because it broke self-hosting on AArch64.
llvm-svn: 305990
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Support vector type G_INSERT legalization/selection.
Split from https://reviews.llvm.org/D33665
Reviewers: qcolombet, t.p.northover, zvi, guyblank
Reviewed By: guyblank
Subscribers: guyblank, rovka, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D33956
llvm-svn: 305989
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch adds a macro fusion using CodeGen/MacroFusion.cpp to pair AES
instructions back to back and adds FeatureFuseAES to enable the feature.
Reviewers: evandro, javed.absar, rengolin, t.p.northover
Reviewed By: javed.absar
Subscribers: aemerson, mgorny, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D34142
llvm-svn: 305988
|
|
|
|
|
|
|
|
|
|
|
|
| |
Masked gather for vector length 2 is lowered incorrectly for element type i32.
The type <2 x i32> was automatically extended to <2 x i64> and we generated VPGATHERQQ instead of VPGATHERQD.
The type <2 x float> is extended to <4 x float>, so there is no bug for this type, but the sequence may be more optimal.
In this patch I'm fixing <2 x i32>bug and optimizing <2 x float> sequence for GATHERs only. The same fix should be done for Scatters as well.
Differential revision: https://reviews.llvm.org/D34343
llvm-svn: 305987
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Added support based on merged SDWA pseudo instructions. Now peephole allow one scalar operand, omod and clamp modifiers.
Added several subtarget features for GFX9 SDWA.
This diff also contains changes from D34026.
Depends D34026
Reviewers: vpykhtin, rampitec, arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
Differential Revision: https://reviews.llvm.org/D34241
llvm-svn: 305986
|
|
|
|
|
|
| |
folding can create more instructions than it removed when there are multiple uses. NFC
llvm-svn: 305985
|
|
|
|
|
|
|
|
| |
This includes the safe SEH tables and the control flow guard function
table. LLD will emit the guard table soon, and I need a tool that dumps
them for testing.
llvm-svn: 305979
|
|
|
|
| |
llvm-svn: 305976
|
|
|
|
|
|
|
| |
This reverts commit r305949 and r305950 as they didn't have the
correct commit message.
llvm-svn: 305973
|
|
|
|
|
|
|
|
| |
This is one of the nodes which also compile as v_cmp_*.
Differential Revision: https://reviews.llvm.org/D34485
llvm-svn: 305970
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This fixes a bug where we always treat APSInts in Codeview as
signed when writing them to YAML. One symptom of this problem is that
llvm-pdbdump raw would show Enumerator Values that differ between the
original PDB and a PDB that has been round-tripped through YAML.
Reviewers: zturner
Reviewed By: zturner
Subscribers: llvm-commits, fhahn
Differential Revision: https://reviews.llvm.org/D34013
llvm-svn: 305965
|
|
|
|
|
|
|
|
|
| |
If one of the arguments of adde/sube is zero we can fold another
add/sub into it.
Differential Revision: https://reviews.llvm.org/D34374
llvm-svn: 305964
|
|
|
|
|
|
|
|
|
| |
This simplification allows to avoid generating v_cndmask_b32
to serialize condition code between compare and use.
Differential Revision: https://reviews.llvm.org/D34300
llvm-svn: 305962
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
vectorizer-maximize-bandwidth is generally useful in terms of performance. I've tested the impact of changing this to default on speccpu benchmarks on sandybridge machines. The result shows non-negative impact:
spec/2006/fp/C++/444.namd 26.84 -0.31%
spec/2006/fp/C++/447.dealII 46.19 +0.89%
spec/2006/fp/C++/450.soplex 42.92 -0.44%
spec/2006/fp/C++/453.povray 38.57 -2.25%
spec/2006/fp/C/433.milc 24.54 -0.76%
spec/2006/fp/C/470.lbm 41.08 +0.26%
spec/2006/fp/C/482.sphinx3 47.58 -0.99%
spec/2006/int/C++/471.omnetpp 22.06 +1.87%
spec/2006/int/C++/473.astar 22.65 -0.12%
spec/2006/int/C++/483.xalancbmk 33.69 +4.97%
spec/2006/int/C/400.perlbench 33.43 +1.70%
spec/2006/int/C/401.bzip2 23.02 -0.19%
spec/2006/int/C/403.gcc 32.57 -0.43%
spec/2006/int/C/429.mcf 40.35 +0.27%
spec/2006/int/C/445.gobmk 26.96 +0.06%
spec/2006/int/C/456.hmmer 24.4 +0.19%
spec/2006/int/C/458.sjeng 27.91 -0.08%
spec/2006/int/C/462.libquantum 57.47 -0.20%
spec/2006/int/C/464.h264ref 46.52 +1.35%
geometric mean +0.29%
The regression on 453.povray seems real, but is due to secondary effects as all hot functions are bit-identical with and without the flag.
I started this patch to consult upstream opinions on this. It will be greatly appreciated if the community can help test the performance impact of this change on other architectures so that we can decided if this should be target-dependent.
Reviewers: hfinkel, mkuper, davidxl, chandlerc
Reviewed By: chandlerc
Subscribers: rengolin, sanjoy, javed.absar, bjope, dorit, magabari, RKSimon, llvm-commits, mzolotukhin
Differential Revision: https://reviews.llvm.org/D33341
llvm-svn: 305960
|
|
|
|
| |
llvm-svn: 305951
|
|
|
|
| |
llvm-svn: 305950
|
|
|
|
|
|
|
|
| |
Patch by Fedor Sergeev.
Differential Revision: https://reviews.llvm.org/D33868
llvm-svn: 305948
|
|
|
|
|
|
|
|
|
|
| |
(consumer).
Reviewer: aprantl
Differential Revision: https://reviews.llvm.org/D34418
llvm-svn: 305944
|
|
|
|
| |
llvm-svn: 305943
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This attribute is used to ensure the guard page is triggered on stack
overflow. Stack frames larger than the guard page size will generate
a call to __probestack to touch each page so the guard page won't
be skipped.
Reviewed By: majnemer
Differential Revision: https://reviews.llvm.org/D34386
llvm-svn: 305939
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Using various methods, BasicAA tries to determine whether two
GetElementPtr memory locations alias when its base pointers are known
to be equal. When none of its heuristics are applicable, it falls back
to PartialAlias to, according to a comment, protect TBAA making a wrong
decision in case of unions and malloc. PartialAlias is not correct,
because a PartialAlias result implies that some, but not all, bytes
overlap which is not necessarily the case here.
AAResults returns the first analysis result that is not MayAlias.
BasicAA is always the first alias analysis. When it returns
PartialAlias, no other analysis is queried to give a more exact result
(which was the intention of returning PartialAlias instead of MayAlias).
For instance, ScopedAA could return a more accurate result.
The PartialAlias hack was introduced in r131781 (and re-applied in
r132632 after some reverts) to fix llvm.org/PR9971 where TBAA returns a
wrong NoAlias result due to a union. A test case for the malloc case
mentioned in the comment was not provided and I don't think it is
affected since it returns an omnipotent char anyway.
Since r303851 (https://reviews.llvm.org/D33328) clang does emit specific
TBAA for unions anymore (but "omnipotent char" instead). Hence, the
PartialAlias workaround is not required anymore.
This patch passes the test-suite and check-llvm/check-clang of a
self-hoisted build on x64.
Reviewed By: hfinkel
Differential Revision: https://reviews.llvm.org/D34318
llvm-svn: 305938
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: r305009 disables recursive inlining for indirect calls in sample loader pass. The same logic applies to direct recursive calls.
Reviewers: iteratee, davidxl
Reviewed By: iteratee
Subscribers: sanjoy, llvm-commits, eraman
Differential Revision: https://reviews.llvm.org/D34456
llvm-svn: 305934
|
|
|
|
|
|
|
|
|
|
|
| |
Define target hook isReallyTriviallyReMaterializable() to explicitly specify
PowerPC instructions that are trivially rematerializable. This will allow
the MachineLICM pass to accurately identify PPC instructions that should always
be hoisted.
Differential Revision: https://reviews.llvm.org/D34255
llvm-svn: 305932
|
|
|
|
|
|
|
| |
I don't think there's any visible difference from having the wrong layout
for the 32-bit case at this point, but that could change in the future.
llvm-svn: 305931
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
known bits
Summary:
I noticed that passing known bits across these intrinsics isn't great at capturing the information we really know. Turning known bits of the input into known bits of a count output isn't able to convey a lot of what we really know.
This patch adds range metadata to these intrinsics based on the known bits.
Currently the patch punts if we already have range metadata present.
Reviewers: spatel, RKSimon, davide, majnemer
Reviewed By: RKSimon
Subscribers: sanjoy, hfinkel, llvm-commits
Differential Revision: https://reviews.llvm.org/D32582
llvm-svn: 305927
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
C2)) create more instructions than it removes
Summary:
Previously this folding had no checks to see if it was going to result in less instructions. This was pointed out during the review of D34184
This patch adds code to count how many instructions its going to create vs how many its going to remove so we can make a proper decision.
Reviewers: spatel, majnemer
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34437
llvm-svn: 305926
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: This patch adds support for xors of splat vectors.
Reviewers: mcrosier
Reviewed By: mcrosier
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34354
llvm-svn: 305925
|
|
|
|
|
|
|
|
|
|
| |
See Bug 33509: https://bugs.llvm.org//show_bug.cgi?id=33509
Reviewers: Sam Kolton, Artem Tamazov, Valery Pykhtin
Differential Revision: https://reviews.llvm.org/D34360
llvm-svn: 305923
|
|
|
|
|
|
|
|
|
|
|
|
| |
Added test file for ARMv8.1 LSE Atomics that I forgot to include in
commit r305893.
Patch by Ananth Jasty.
Differential Revision: https://reviews.llvm.org/D33586
Change-Id: Ic1ad8ed87c1b584c4c791b459a686c866a3c3087
llvm-svn: 305918
|
|
|
|
|
|
| |
Use triple and attribute only for consistency
llvm-svn: 305916
|
|
|
|
|
|
|
|
|
|
|
|
| |
different than any of the src
See Bug 33279: https://bugs.llvm.org//show_bug.cgi?id=33279
Reviewers: artem.tamazov, vpykhtin
Differential Revision: https://reviews.llvm.org/D34003
llvm-svn: 305915
|
|
|
|
|
|
| |
Use triple and attribute only for consistency
llvm-svn: 305913
|
|
|
|
| |
llvm-svn: 305910
|
|
|
|
|
|
| |
Use triple and attribute only for consistency
llvm-svn: 305909
|
|
|
|
|
|
| |
Use triple and attribute only for consistency
llvm-svn: 305908
|
|
|
|
|
|
| |
Use triple and attribute only for consistency
llvm-svn: 305907
|
|
|
|
|
|
| |
Use triple and attribute only for consistency
llvm-svn: 305906
|
|
|
|
| |
llvm-svn: 305905
|