| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
aggressive cast-folding
Summary:
InstCombine unfolds expressions of the form `zext(or(icmp, icmp))` to `or(zext(icmp), zext(icmp))` such that in a later iteration of InstCombine the exposed `zext(icmp)` instructions can be optimized. We now combine this unfolding and the subsequent `zext(icmp)` optimization to be performed together. Since the unfolding doesn't happen separately anymore, we also again enable the folding of `logic(cast(icmp), cast(icmp))` expressions to `cast(logic(icmp, icmp))` which had been disabled due to its interference with the unfolding transformation.
Tested via `make check` and `lnt`.
Background
==========
For a better understanding on how it came to this change we subsequently summarize its history. In commit r275989 we've already tried to enable the folding of `logic(cast(icmp), cast(icmp))` to `cast(logic(icmp, icmp))` which had to be reverted in r276106 because it could lead to an endless loop in InstCombine (also see http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160718/374347.html). The root of this problem is that in `visitZExt()` in InstCombineCasts.cpp there also exists a reverse of the above folding transformation, that unfolds `zext(or(icmp, icmp))` to `or(zext(icmp), zext(icmp))` in order to expose `zext(icmp)` operations which would then possibly be eliminated by subsequent iterations of InstCombine. However, before these `zext(icmp)` would be eliminated the folding from r275989 could kick in and cause InstCombine to endlessly switch back and forth between the folding and the unfolding transformation. This is the reason why we now combine the `zext`-unfolding and the elimination of the exposed `zext(icmp)` to happen at one go because this enables us to still allow the cast-folding in `logic(cast(icmp), cast(icmp))` without entering an endless loop again.
Details on the submitted changes
================================
- In `visitZExt()` we combine the unfolding and optimization of `zext` instructions.
- In `transformZExtICmp()` we have to use `Builder->CreateIntCast()` instead of `CastInst::CreateIntegerCast()` to make sure that the new `CastInst` is inserted in a `BasicBlock`. The new calls to `transformZExtICmp()` that we introduce in `visitZExt()` would otherwise cause according assertions to be triggered (in our case this happend, for example, with lnt for the MultiSource/Applications/sqlite3 and SingleSource/Regression/C++/EH/recursive-throw tests). The subsequent usage of `replaceInstUsesWith()` is necessary to ensure that the new `CastInst` replaces the `ZExtInst` accordingly.
- In InstCombineAndOrXor.cpp we again allow the folding of casts on `icmp` instructions.
- The instruction order in the optimized IR for the zext-or-icmp.ll test case is different with the introduced changes.
- The test cases in zext.ll have been adopted from the reverted commits r275989 and r276105.
Reviewers: grosser, majnemer, spatel
Subscribers: eli.friedman, majnemer, llvm-commits
Differential Revision: https://reviews.llvm.org/D22864
Contributed-by: Matthias Reisinger <d412vv1n@gmail.com>
llvm-svn: 277635
|
| |
|
|
|
|
|
|
| |
Patch by Aditya Kumar.
Differential Revision: https://reviews.llvm.org/D22967
llvm-svn: 277634
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
We currently only support combining target shuffles that consist of a single source input (plus elements known to be undef/zero).
This patch generalizes the recursive combining of the target shuffle to collect all the inputs, merging any duplicates along the way, into a full set of src ops and its shuffle mask.
We uncover a number of cases where we have failed to combine a unary shuffle because the input has been duplicated and separated during lowering.
This will allow us to combine to 2-input shuffles in a future patch.
Differential Revision: https://reviews.llvm.org/D22859
llvm-svn: 277631
|
| |
|
|
|
|
|
|
| |
This reverts commit the revert commit r277627. The build errors
mentioned in r277627 were likely caused by an unclean build directory.
Sorry for the noise.
llvm-svn: 277630
|
| |
|
|
|
|
|
|
|
| |
splat vectors
This removes the restriction for the icmp constant, but as noted by the FIXME comments,
we still need to change individual checks for binop operand constants.
llvm-svn: 277629
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit r277540. It breaks the build with:
../lib/Object/Archive.cpp:264:41: error: return type of out-of-line definition of 'llvm::object::ArchiveMemberHeader::getUID' differs from that in the declaration
Expected<unsigned> ArchiveMemberHeader::getUID() const {
~~~~~~~~~~~~~~~~~~ ^
include/llvm/Object/Archive.h:53:12: note: previous declaration is here
unsigned getUID() const;
~~~~~~~~ ^
llvm-svn: 277627
|
| |
|
|
| |
llvm-svn: 277626
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a fix for PR28697.
An MDNode can indirectly refer to a GlobalValue, through a
ConstantAsMetadata. When the GlobalValue is deleted, the MDNode operand
is reset to `nullptr`. If the node is uniqued, this can lead to a
hard-to-detect cache invalidation in a Metadata map that's shared across
an LLVMContext.
Consider:
1. A map from Metadata* to `T` called RemappedMDs.
2. A node that references a global variable, `!{i1* @GV}`.
3. Insert `!{i1* @GV} -> SomeT` in the map.
4. Delete `@GV`, leaving behind `!{null} -> SomeT`.
Looking up the generic and uninteresting `!{null}` gives you `SomeT`,
which is likely related to `@GV`. Worse, `SomeT`'s lifetime may be tied
to the deleted `@GV`.
This occurs in practice in the shared ValueMap used since r266579 in the
IRMover. Other code that handles more than one Module (with different
lifetimes) in the same LLVMContext could hit it too.
The fix here is a partial revert of r225223: in the rare case that an
MDNode operand is a ConstantAsMetadata (i.e., wrapping a node from the
Value hierarchy), drop uniquing if it gets replaced with `nullptr`.
This changes step #4 above to leave behind `distinct !{null} -> SomeT`,
which can't be confused with the generic `!{null}`.
In theory, this can cause some churn in the LLVMContext's MDNode
uniquing map when Values are being deleted. However:
- The number of GlobalValues referenced from uniqued MDNodes is
expected to be quite small. E.g., the debug info metadata schema
only references GlobalValues from distinct nodes.
- Other Constants have the lifetime of the LLVMContext, whose teardown
is careful to drop references before deleting the constants.
As a result, I don't expect a compile time regression from this change.
llvm-svn: 277625
|
| |
|
|
| |
llvm-svn: 277622
|
| |
|
|
|
|
|
|
|
| |
It is possible for the value map to not have an entry for some value
that has already been removed.
I don't have a testcase, this is fall-out from a buildbot.
llvm-svn: 277614
|
| |
|
|
| |
llvm-svn: 277612
|
| |
|
|
|
|
|
|
|
|
|
| |
We were able to figure out that the result of a call is some constant.
While propagating that fact, we added the constant to the value map.
This is problematic because it results in us losing the call site when
processing the value map.
This fixes PR28802.
llvm-svn: 277611
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: Thumb2 supports encoding immediates with specific patterns into mov.w by splatting the low 8 bits into other bytes.
Reviewers: john.brawn, jmolloy
Subscribers: jmolloy, aemerson, rengolin, samparker, llvm-commits
Differential Revision: https://reviews.llvm.org/D23090
llvm-svn: 277610
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
MappedBlockSTream can work with any sequence of block data where
the ordering is specified by a list of block numbers. So rather
than manually stitch them together in the case of the FPM, reuse
this functionality so that we can treat the FPM as if it were
contiguous.
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D23066
llvm-svn: 277609
|
| |
|
|
|
|
| |
This reverts commit r277592, trying to fix the AArch64 42VMA buildbot.
llvm-svn: 277607
|
| |
|
|
| |
llvm-svn: 277605
|
| |
|
|
|
|
|
|
|
| |
When expanding FP constants, we attempt to shrink doubles to floats and perform an extending load.
However, on SystemZ, and possibly on other targets (I've only confirmed the problem on SystemZ), the FP extending load instruction may convert SNaN into QNaN, or may cause an exception. So in the general case, we would still like to shrink FP constants, but SNaNs should be left as doubles.
Differential Revision: https://reviews.llvm.org/D22685
llvm-svn: 277602
|
| |
|
|
|
|
|
|
| |
When the same base address is used to load two different data types, LSR
would assume a memory type of "void". This type is not sized and has no
alignment information. Checking for it causes a crash.
llvm-svn: 277601
|
| |
|
|
|
|
|
|
| |
obsolete comment (NFC)
Differential Revision: https://reviews.llvm.org/D23013
llvm-svn: 277595
|
| |
|
|
|
|
|
|
|
|
| |
Use LVI to prove that adds do not wrap. The change is motivated by https://llvm.org/bugs/show_bug.cgi?id=28620 bug and it's the first step to fix that problem.
Reviewed By: sanjoy
Differential Revision: http://reviews.llvm.org/D23059
llvm-svn: 277592
|
| |
|
|
|
|
|
|
| |
vcvttsd2usi{l|q} instructions.
Differential Revision: http://reviews.llvm.org/D23111
llvm-svn: 277586
|
| |
|
|
|
|
|
| |
called "FPM" instead of "LPM" in a hold-over from when the code was
modeled on that used to parse function passes.
llvm-svn: 277584
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
manager.
While this has some utility for debugging and testing on its own, it is
primarily intended to demonstrate the technique for adding custom
wrappers that can provide more interesting interation behavior in
a nice, orthogonal, and composable layer.
Being able to write these kinds of very dynamic and customized controls
for running passes was one of the motivating use cases of the new pass
manager design, and this gives a hint at how they might look. The actual
logic is tiny here, and most of this is just wiring in the pipeline
parsing so that this can be widely used.
I'm adding this now to show the wiring without a lot of business logic.
This is a precursor patch for showing how a "iterate up to N times as
long as we devirtualize a call" utility can be added as a separable and
composable component along side the CGSCC pass management.
Differential Revision: https://reviews.llvm.org/D22405
llvm-svn: 277581
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
We also add a test to show what currently happens when we create a
section per function and emit an xray_instr_map. This illustrates the
relationship (or lack thereof) between the per-function section and the
xray_instr_map section.
We also change the code generation slightly so that we don't always
create group sections, but rather only do so if a function where the
table is associated with is in a group.
Also in this change:
- Remove the "merge" flag on the xray_instr_map section.
- Test that we're generating the right table for comdat and non-comdat functions.
Reviewers: echristo, majnemer
Subscribers: llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D23104
llvm-svn: 277580
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
IfConversion used to always add the undef flag when adding a use operand
on a newly predicated instruction. This would be an operand for the register
being conditionally redefined. Due to the undef flag, the liveness of this
register prior to the predicated instruction would get lost.
This patch changes this so that such use operands are added only when the
register is live, without the undef flag.
This was reverted but pushed again now, for details follow link below.
Reviewed by Quentin Colombet.
http://reviews.llvm.org/D209077
llvm-svn: 277571
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is the first refactoring before adding new functionality.
Add a class wrapper for the functions and container for
state associated with the transformation.
No functional change
Reviewers: majnemer, nadav, mehdi_amini
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D23065
llvm-svn: 277565
|
| |
|
|
| |
llvm-svn: 277564
|
| |
|
|
|
|
|
|
| |
I forgot to do this initially, and added when I saw this fail in
a no-asserts build, but managed to loose the diff from the actual patch
that got submitted. Very sorry.
llvm-svn: 277562
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
reason about and less error prone.
The core idea is to fully parse the text without trying to identify
passes or structure. This is done with a single state machine. There
were various bugs in the logic around this previously that were repeated
and scattered across the code. Having a single routine makes it much
easier to fix and get correct. For example, this routine doesn't suffer
from PR28577.
Then the actual pass construction is handled using *much* easier to read
code and simple loops, with particular pass manager construction sunk to
live with other pass construction. This is especially nice as the pass
managers *are* in fact passes.
Finally, the "implicit" pass manager synthesis is done much more simply
by forming "pre-parsed" structures rather than having to duplicate tons
of logic.
One of the bugs fixed by this was evident in the tests where we accepted
a pipeline that wasn't really well formed. Another bug is PR28577 for
which I have added a test case.
The code is less efficient than the previous code but I'm really hoping
that's not a priority. ;]
Thanks to Sean for the review!
Differential Revision: https://reviews.llvm.org/D22724
llvm-svn: 277561
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes a bug where we'd sometimes cache overly-conservative results
with our walker. This bug was made more obvious by r277480, which makes
our cache far more spotty than it was. Test case is llvm-unit, because
we're likely going to use CachingWalker only for def optimization in the
future.
The bug stems from that there was a place where the walker assumed that
`DefNode.Last` was a valid target to cache to when failing to optimize
phis. This is sometimes incorrect if we have a cache hit. The fix is to
use the thing we *can* assume is a valid target to cache to. :)
llvm-svn: 277559
|
| |
|
|
|
|
| |
here. NFC.
llvm-svn: 277557
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Sometimes, bitsets could get really large (>300k entries) and
we might want to drop a check, as it would have a too much cost.
Adding a flag to control how much penalty are we willing to pay
for bitsets.
Reviewers: kcc
Differential Revision: https://reviews.llvm.org/D23088
llvm-svn: 277556
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: Depends on D23072
Reviewers: george.burgess.iv
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D23076
llvm-svn: 277553
|
| |
|
|
|
|
|
| |
Leftover from D22686; the passes can handle all the instructions
unconditionally; only isel needs to care whether to generate them.
llvm-svn: 277549
|
| |
|
|
| |
llvm-svn: 277544
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Kicks off the implementation of wasm SIMD128 support (spec:
https://github.com/stoklund/portable-simd/blob/master/portable-simd.md),
adding support for add, sub, mul for i8x16, i16x8, i32x4, and f32x4.
The spec is WIP, and might change in the near future.
Patch by João Porto
Differential Revision: https://reviews.llvm.org/D22686
llvm-svn: 277543
|
| |
|
|
|
|
|
|
| |
In this particular example we wouldn't want the smmls anyway (the value is
actually unused), but in general smmls does not provide the required flags
register so if that SUBE result is used we can't replace it.
llvm-svn: 277541
|
| |
|
|
|
|
|
| |
Fixed the last incorrect uses of llvm_unreachable() in the code
which were actually just cases of errors in the input Archives.
llvm-svn: 277540
|
| |
|
|
|
|
| |
Clean-up before changing this to allow folds for vectors.
llvm-svn: 277538
|
| |
|
|
| |
llvm-svn: 277535
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: tejohnson, eraman
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D22980
llvm-svn: 277534
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
There were issues with simply reporting AttrUnknown on
previously-unknown values in CFLAnders. So, we now act *entirely*
conservatively for values we haven't seen before. As in the prior patch
(r277362), writing a lit test for this isn't exactly trivial. If someone
wants a test badly, I'm willing to try to write one.
Patch by Jia Chen.
Differential Revision: https://reviews.llvm.org/D23077
llvm-svn: 277533
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: We really want to move towards MemoryLocOrCall (or fix AA) everywhere, but for now, this lets us have a single instructionClobbersQuery.
Reviewers: george.burgess.iv
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D23072
llvm-svn: 277530
|
| |
|
|
|
|
|
|
|
| |
BitVector::extend initializes extended bits as true by default.
That is not desirable because new pages should be initially free.
Differential Revision: https://reviews.llvm.org/D23048
llvm-svn: 277529
|
| |
|
|
| |
llvm-svn: 277528
|
| |
|
|
|
|
|
|
|
|
| |
original value.
As agreed in post-commit review of r265388, I'm switching the flag to
its original value until the 90% runtime performance regression on
SingleSource/Benchmarks/Stanford/Bubblesort is addressed.
llvm-svn: 277524
|
| |
|
|
|
|
|
|
|
|
| |
info.
Also added test case to verify IR changes done by NVPTXGenericToNVVM pass.
Differential Revision: https://reviews.llvm.org/D22837
llvm-svn: 277520
|
| |
|
|
|
|
|
|
|
| |
Update comment for isOutOfScope and add a testcase for uniform value being used
out of scope.
Differential Revision: https://reviews.llvm.org/D23073
llvm-svn: 277515
|
| |
|
|
|
|
|
|
|
|
|
|
| |
We were relying on the misleadingly-names $status result to actually be the
status. Actually it's just a scratch register that may or may not be valid (and
is the inverse of the real ststus anyway). Success can be determined by
comparing the value loaded against the one we wanted to see for "cmpxchg
strong" loops like this.
Should fix PR28819.
llvm-svn: 277513
|
| |
|
|
| |
llvm-svn: 277510
|