| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
| |
commuted IR. NFC
llvm-svn: 305438
|
|
|
|
|
|
| |
to the same target regardless of condition
llvm-svn: 305416
|
|
|
|
|
|
|
|
|
|
|
| |
This way we end up not looking at PHI args already removed.
MemSSA now goes through the updater so we can prune
it to avoid having redundant MemoryPHI arguments, but that
doesn't quite work for the general case.
Discussed with Daniel Berlin, fixes PR33406.
llvm-svn: 305409
|
|
|
|
|
|
|
|
|
|
|
|
| |
with non power of 2 bit widths
There's an early out that's trying to detect when we don't know any bits that make up the legal range of a shift. The code subtracts one from BitWidth which creates a mask in the lower bits for power of 2 bit widths. This is then ANDed with the known bits to see if any of those bits are known. If the bit width isn't a power of 2 this creates a non-sensical mask.
This patch corrects this by rounding up to a power of 2 before doing the subtract and mask.
Differential Revision: https://reviews.llvm.org/D34165
llvm-svn: 305400
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch is part of 3 patches that together form a single patch, but must be introduced in stages in order not to break things.
The way that LLVM interprets DW_OP_plus in DIExpression nodes is basically that of the DW_OP_plus_uconst operator since LLVM expects an unsigned constant operand. This unnecessarily restricts the DW_OP_plus operator, preventing it from being used to describe the evaluation of runtime values on the expression stack. These patches try to align the semantics of DW_OP_plus and DW_OP_minus with that of the DWARF definition, which pops two elements off the expression stack, performs the operation and pushes the result back on the stack.
This is done in three stages:
• The first patch (LLVM) adds support for DW_OP_plus_uconst.
• The second patch (Clang) contains changes all its uses from DW_OP_plus to DW_OP_plus_uconst.
• The third patch (LLVM) changes the semantics of DW_OP_plus and DW_OP_minus to be in line with its DWARF meaning. This patch includes the bitcode upgrade from legacy DIExpressions.
Patch by Sander de Smalen.
Reviewers: echristo, pcc, aprantl
Reviewed By: aprantl
Subscribers: fhahn, javed.absar, aprantl, llvm-commits
Differential Revision: https://reviews.llvm.org/D33894
llvm-svn: 305386
|
|
|
|
|
|
|
|
| |
eq (and X, C1), 0), Y, (or Y, C2)) when the icmp portion gets turned into a truncate and a signed compare with 0.
InstCombine has an optimization that recognizes an and with the sign bit of legal type size and turns it into a truncate and compare that checks the sign bit. But the select handling code doesn't recognize this idiom.
llvm-svn: 305338
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Leave an updated VP metadata on the fallback memcpy intrinsic after
specialization. This can be used for later possible expansion based on
the average of the remaining values.
Reviewers: davidxl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34164
llvm-svn: 305321
|
|
|
|
|
|
| |
functions in the AlwaysInliner
llvm-svn: 305245
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
After RS4GC, we should drop metadata that is no longer valid. These metadata
is used by optimizations scheduled after RS4GC, and can cause a miscompile.
One such metadata is invariant.load which is used by LICM sinking transform.
After rewriting statepoints, the address of a load maybe relocated. With
invariant.load metadata on a load instruction, LICM sinking assumes the
loaded value (from a dererenceable address) to be invariant, and
rematerializes the load operand and the load at the exit block.
This transforms the IR to have an unrelocated use of the
address after a statepoint, which is incorrect.
Other metadata we conservatively remove are related to
dereferenceability and noalias metadata.
This patch drops such metadata on store and load instructions after
rewriting statepoints.
Reviewers: reames, sanjoy, apilipenko
Reviewed by: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33756
llvm-svn: 305234
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a follow-up to https://reviews.llvm.org/D33879 / https://reviews.llvm.org/rL304939 ,
and was discussed in https://reviews.llvm.org/D33338.
We prefer this form because a narrower shift may be cheaper, and we can more easily fold a
zext than a sext.
http://rise4fun.com/Alive/slVe
Name: shz
%s = sext i8 %x to i12
%r = lshr i12 %s, 4
=>
%a = ashr i8 %x, 4
%r = zext i8 %a to i12
llvm-svn: 305190
|
|
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D33847
llvm-svn: 305170
|
|
|
|
|
|
| |
As discussed on D33983, as SLM has so many custom costs its worth testing as well.
llvm-svn: 305151
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D33737
llvm-svn: 305132
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently there is a bug in SROA::presplitLoadsAndStores which causes assertion in
GEPOperator::accumulateConstantOffset.
Basically it does not consider the situation that the pointer operand of load or store
may be in a non-zero address space and its size may be different from the size of
a pointer in address space 0.
This patch fixes assertion when compiling Blender Cycles kernels for amdgpu backend.
Diffferential Revision: https://reviews.llvm.org/D33298
llvm-svn: 305107
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
isSafeToSpeculativelyExecute is the wrong predicate to use here.
All that checks for is whether it is safe to hoist a value due to
unaligned/un-dereferencable accesses. However, not only are we doing
sinking rather than hoisting, our concern is that the location
we're loading from may have been modified. Instead forbid sinking
any load across a critical edge.
Reviewers: majnemer
Subscribers: davide, llvm-commits
Differential Revision: https://reviews.llvm.org/D33179
llvm-svn: 305102
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change adds an option disable-lftr to be able to disable Linear Function Test Replace optimization.
By default option is off so current behavior is not changed.
Reviewers: reames, sanjoy, wmi, andreadb, apilipenko
Reviewed By: sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33979
llvm-svn: 305055
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we're shrinking a binary operation, it may be the case that the new
operations wraps where the old didn't. If this happens, the behavior
should be well-defined. So, we can't always carry wrapping flags with us
when we shrink operations.
If we do, we get incorrect optimizations in cases like:
void foo(const unsigned char *from, unsigned char *to, int n) {
for (int i = 0; i < n; i++)
to[i] = from[i] - 128;
}
which gets optimized to:
void foo(const unsigned char *from, unsigned char *to, int n) {
for (int i = 0; i < n; i++)
to[i] = from[i] | 128;
}
Because:
- InstCombine turned `sub i32 %from.i, 128` into
`add nuw nsw i32 %from.i, 128`.
- LoopVectorize vectorized the add to be `add nuw nsw <16 x i8>` with a
vector full of `i8 128`s
- InstCombine took advantage of the fact that the newly-shrunken add
"couldn't wrap", and changed the `add` to an `or`.
InstCombine seems happy to figure out whether we can add nuw/nsw on its
own, so I just decided to drop the flags. There are already a number of
places in LoopVectorize where we rely on InstCombine to clean up.
llvm-svn: 305053
|
|
|
|
|
|
|
|
|
| |
Other comments/implications are that this isn't intended behavior (nor
perserved/reimplemented in the new inliner) & complicates fixing the
'inlining' of trivially dead calls without consulting the cost function
first.
llvm-svn: 305052
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since D17854 LinkerSubsectionsViaSymbols is unnecessary.
It is interfering with ThinLTO implementation of CFI-ICall, where
the aliases used on the !LinkerSubsectionsViaSymbols branch are
needed to export jump tables to ThinLTO backends.
This is the second attempt to land this change after fixing PR33316.
llvm-svn: 305031
|
|
|
|
|
|
|
|
|
| |
This is to prepare to allow for dead stripping of globals in the
merged modules.
Differential Revision: https://reviews.llvm.org/D33921
llvm-svn: 305027
|
|
|
|
|
|
|
|
|
|
|
|
| |
No IR tests were added with rL304313 ( https://reviews.llvm.org/D28637 ),
so I want these for extra coverage if we enable memcmp expansion for x86.
As shown, nothing is expanded for x86 in CGP yet.
Also fundamentally, we're doing an IR transform, so we should have IR tests
for just that part. If something goes wrong, we need to know if the bug is
in CGP or later lowering.
llvm-svn: 305011
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Early-inlining of recursive call makes the code size bloat exponentially. We should not disable it.
Reviewers: davidxl, dnovillo, iteratee
Reviewed By: iteratee
Subscribers: iteratee, llvm-commits, sanjoy
Differential Revision: https://reviews.llvm.org/D34017
llvm-svn: 305009
|
|
|
|
|
|
|
|
|
| |
ordering.
This is a temporarily fix which needs additional work, as it triggers a test3 failure.
test3 is commented out till then.
llvm-svn: 304993
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
cloned constexpr
Have cloneConstantExprWithNewAddressSpaces return nullptr when
returning initial ConstantExpr.
Reviewers: arsenm
Subscribers: jholewinski, wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D33995
llvm-svn: 304975
|
|
|
|
| |
llvm-svn: 304973
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was discussed in D33338. We have larger pattern-matching ending in a truncate that
we can reduce or remove by handling these smaller patterns first. Further motivation is
that narrower shift ops are easier for value tracking and zext is better than sext.
http://rise4fun.com/Alive/rhh
Name: boolshift
%sext = sext i1 %x to i8
%r = lshr i8 %sext, 7
=>
%r = zext i1 %x to i8
Name: noboolshift
%sext = sext i3 %x to i8
%r = lshr i8 %sext, 7
=>
%sh = lshr i3 %x, 2
%r = zext i3 %sh to i8
Differential Revision: https://reviews.llvm.org/D33879
llvm-svn: 304939
|
|
|
|
|
|
|
|
| |
PR33346
Skip cases when expected value is not constant int.
llvm-svn: 304933
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This makes it so that the code quality for CFI checks when compiling
with -O2 and linking with --lto-O0 is similar to that of the rest of
the code.
Reduces the size of a chrome binary built with -O2/--lto-O0 by
about 750KB.
Differential Revision: https://reviews.llvm.org/D33925
llvm-svn: 304921
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
in llvm/test/Transforms/Util/PredicateInfo/
A few tests in llvm/test/Transforms/Util/PredicateInfo/ are using -reverse-iterate.
The option -reverse-iterate is enabled with +Asserts in usual cases, but it can be turned on/off regardless of LLVM_ENABLE_ASSERTIONS.
I wonder if this were incompatible to https://reviews.llvm.org/D33908 (r304757).
Differential Revision: https://reviews.llvm.org/D33854
llvm-svn: 304851
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: RKSimon, DavidKreitzer
Differential Revision: https://reviews.llvm.org/D33684
Patch by: Aleen Farhana <Farhana.aleen@gmail.com>
llvm-svn: 304834
|
|
|
|
| |
llvm-svn: 304829
|
|
|
|
| |
llvm-svn: 304826
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The patch makes instruction count the highest priority for
LSR solution for X86 (previously registers had highest priority).
Reviewers: qcolombet
Differential Revision: http://reviews.llvm.org/D30562
From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 304824
|
|
|
|
|
|
|
|
|
|
| |
Patch https://reviews.llvm.org/rL304806 was causing failures in Aarch64
and multiple other targets since the test should be run on X86 only.
Specifying the target triple is not enough. Moving the testcase to the
X86 target directory in LoopIdiom.
llvm-svn: 304809
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1. When there is no perfect iteration order, we can't let phi nodes
put themselves in terms of things that come later in the iteration
order, or we will endlessly cycle (the normal RPO algorithm clears the
hashtable to avoid this issue).
2. We are sometimes erasing the wrong expression (causing pessimism)
because our equality says loads and stores are the same.
We introduce an exact equality function and use it when erasing to
make sure we erase only identical expressions, not equivalent ones.
llvm-svn: 304807
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Expanding the loop idiom test for memcpy to also recognize
unordered atomic memcpy. The only difference for recognizing
an unordered atomic memcpy and instead of a normal memcpy is
that the loads and/or stores involved are unordered atomic operations.
Background: http://lists.llvm.org/pipermail/llvm-dev/2017-May/112779.html
Patch by Daniel Neilson!
Reviewers: reames, anna, skatkov
Reviewed By: reames, anna
Subscribers: llvm-commits, mzolotukhin
Differential Revision: https://reviews.llvm.org/D33243
llvm-svn: 304806
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
We were canonizalizing the pre loop (into loop-simplify form) before
the post loop blocks were added into parent loop. This is incorrect when IRCE is
done on a subloop. The post-loop blocks are created, but not yet added to the
parent loop. So, loop-simplification on the pre-loop incorrectly updates
LoopInfo.
This patch corrects the ordering so that pre and post loop blocks are added to
parent loop (if any), and then the loops are canonicalized to LCSSA and
LoopSimplifyForm.
Reviewers: reames, sanjoy, apilipenko
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33846
llvm-svn: 304800
|
|
|
|
| |
llvm-svn: 304784
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes a bug that can cause extractelements with operands that
haven't been defined yet to be inserted at a wrong point when
optimising insertelements.
Patch by Karl Hylen.
Differential Revision: https://reviews.llvm.org/D33449
llvm-svn: 304701
|
|
|
|
|
|
| |
x86-registered-target' which seems to be the correct way to make them run on an x86 build.
llvm-svn: 304682
|
|
|
|
|
|
| |
intrinsic. The second argument is not a vector so needs special treatment.
llvm-svn: 304679
|
|
|
|
|
|
| |
vector llvm.powi intrinsics due to the second argument not being a vector.
llvm-svn: 304678
|
|
|
|
|
|
| |
known bits.
llvm-svn: 304669
|
|
|
|
|
|
| |
to understand that the second argument is still a scalar.
llvm-svn: 304668
|
|
|
|
|
|
| |
some showing missed optimizations. NFC
llvm-svn: 304667
|
|
|
|
|
|
| |
like a copy paste mistake.
llvm-svn: 304666
|
|
|
|
|
|
| |
This reverts commit r304582: breaks cfi-devirt :: anon-namespace.cpp on Darwin.
llvm-svn: 304626
|
|
|
|
|
|
| |
Also added a test option and 2 cost analysis related tests.
llvm-svn: 304599
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is to enable the new switch inline cost heuristic (r301649) by removing the
old heuristic as well as the flag itself.
In my experiment for LLVM test suite and spec2000/2006, +17.82% performance and
8% code size reduce was observed in spec2000/vertex with O3 LTO in AArch64.
No significant code size / performance regression was found in O3/O2/Os. No
significant complain was reported from the llvm-dev thread.
Reviewers: hans, chandlerc, eraman, haicheng, mcrosier, bmakam, eastig, ddibyend, echristo
Reviewed By: echristo
Subscribers: javed.absar, kristof.beyls, echristo, aemerson, rengolin, mehdi_amini
Differential Revision: https://reviews.llvm.org/D32653
llvm-svn: 304594
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
As shown in the test case, SROA was crashing when trying to split
stores (to the alloca) of loads (from anywhere), because it assumed
the pointer operand to the loads and stores had to have the same
address space. This isn't the case. Make sure to use the correct
pointer type for both the load and the store.
Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D32593
llvm-svn: 304585
|