| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we're shrinking a binary operation, it may be the case that the new
operations wraps where the old didn't. If this happens, the behavior
should be well-defined. So, we can't always carry wrapping flags with us
when we shrink operations.
If we do, we get incorrect optimizations in cases like:
void foo(const unsigned char *from, unsigned char *to, int n) {
for (int i = 0; i < n; i++)
to[i] = from[i] - 128;
}
which gets optimized to:
void foo(const unsigned char *from, unsigned char *to, int n) {
for (int i = 0; i < n; i++)
to[i] = from[i] | 128;
}
Because:
- InstCombine turned `sub i32 %from.i, 128` into
`add nuw nsw i32 %from.i, 128`.
- LoopVectorize vectorized the add to be `add nuw nsw <16 x i8>` with a
vector full of `i8 128`s
- InstCombine took advantage of the fact that the newly-shrunken add
"couldn't wrap", and changed the `add` to an `or`.
InstCombine seems happy to figure out whether we can add nuw/nsw on its
own, so I just decided to drop the flags. There are already a number of
places in LoopVectorize where we rely on InstCombine to clean up.
llvm-svn: 305053
|
|
|
|
|
|
|
|
|
| |
Other comments/implications are that this isn't intended behavior (nor
perserved/reimplemented in the new inliner) & complicates fixing the
'inlining' of trivially dead calls without consulting the cost function
first.
llvm-svn: 305052
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
InstSimplify
Summary: This matches the behavior we already had for compares and makes us consistent everywhere.
Reviewers: dberlin, hfinkel, spatel
Reviewed By: dberlin
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33604
llvm-svn: 305049
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since D17854 LinkerSubsectionsViaSymbols is unnecessary.
It is interfering with ThinLTO implementation of CFI-ICall, where
the aliases used on the !LinkerSubsectionsViaSymbols branch are
needed to export jump tables to ThinLTO backends.
This is the second attempt to land this change after fixing PR33316.
llvm-svn: 305031
|
|
|
|
|
|
| |
refer to global values instead of functions. While there fix an 80 column violation. NFC
llvm-svn: 305030
|
|
|
|
|
|
|
|
|
| |
This is to prepare to allow for dead stripping of globals in the
merged modules.
Differential Revision: https://reviews.llvm.org/D33921
llvm-svn: 305027
|
|
|
|
|
|
| |
-fsanitize-coverage=inline-8bit-counters. Experimental so far, not documenting yet. Reapplying revisions 304630, 304631, 304632, 304673, see PR33308
llvm-svn: 305026
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Early-inlining of recursive call makes the code size bloat exponentially. We should not disable it.
Reviewers: davidxl, dnovillo, iteratee
Reviewed By: iteratee
Subscribers: iteratee, llvm-commits, sanjoy
Differential Revision: https://reviews.llvm.org/D34017
llvm-svn: 305009
|
|
|
|
|
|
|
|
|
| |
ordering.
This is a temporarily fix which needs additional work, as it triggers a test3 failure.
test3 is commented out till then.
llvm-svn: 304993
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
cloned constexpr
Have cloneConstantExprWithNewAddressSpaces return nullptr when
returning initial ConstantExpr.
Reviewers: arsenm
Subscribers: jholewinski, wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D33995
llvm-svn: 304975
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The zero heuristic assumes that integers are more likely positive than negative,
but this also has the effect of assuming that strcmp return values are more
likely positive than negative. Given that for nonzero strcmp return values it's
the ordering of arguments that determines the sign of the result there's no
reason to assume that's true.
Fix this by inspecting the LHS of the compare and using TargetLibraryInfo to
decide if it's strcmp-like, and if so only assume that nonzero is more likely
than zero i.e. strings are more often different than the same. This causes a
slight code generation change in the spec2006 benchmark 403.gcc, but with no
noticeable performance impact. The intent of this patch is to allow better
optimisation of dhrystone on Cortex-M cpus, but currently it won't as there are
also some changes that need to be made to if-conversion.
Differential Revision: https://reviews.llvm.org/D33934
llvm-svn: 304970
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was discussed in D33338. We have larger pattern-matching ending in a truncate that
we can reduce or remove by handling these smaller patterns first. Further motivation is
that narrower shift ops are easier for value tracking and zext is better than sext.
http://rise4fun.com/Alive/rhh
Name: boolshift
%sext = sext i1 %x to i8
%r = lshr i8 %sext, 7
=>
%r = zext i1 %x to i8
Name: noboolshift
%sext = sext i3 %x to i8
%r = lshr i8 %sext, 7
=>
%sh = lshr i3 %x, 2
%r = zext i3 %sh to i8
Differential Revision: https://reviews.llvm.org/D33879
llvm-svn: 304939
|
|
|
|
|
|
|
|
| |
PR33346
Skip cases when expected value is not constant int.
llvm-svn: 304933
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This makes it so that the code quality for CFI checks when compiling
with -O2 and linking with --lto-O0 is similar to that of the rest of
the code.
Reduces the size of a chrome binary built with -O2/--lto-O0 by
about 750KB.
Differential Revision: https://reviews.llvm.org/D33925
llvm-svn: 304921
|
|
|
|
|
|
|
|
| |
compiled code for comparing APInts with 0 and 1. NFC
These methods are specifically optimized to only counting leading zeros without an additional uint64_t compare.
llvm-svn: 304876
|
|
|
|
|
|
|
|
| |
pointer is non-zero instead of checking that the APInt self is non-zero.
I believe this code used to use APInt references which would have worked. But then they were changed to pointers to allow m_APInt to be used.
llvm-svn: 304875
|
|
|
|
|
|
|
|
|
|
|
|
| |
This creates a new library called BinaryFormat that has all of
the headers from llvm/Support containing structure and layout
definitions for various types of binary formats like dwarf, coff,
elf, etc as well as the code for identifying a file from its
magic.
Differential Revision: https://reviews.llvm.org/D33843
llvm-svn: 304864
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The patch makes instruction count the highest priority for
LSR solution for X86 (previously registers had highest priority).
Reviewers: qcolombet
Differential Revision: http://reviews.llvm.org/D30562
From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 304824
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1. When there is no perfect iteration order, we can't let phi nodes
put themselves in terms of things that come later in the iteration
order, or we will endlessly cycle (the normal RPO algorithm clears the
hashtable to avoid this issue).
2. We are sometimes erasing the wrong expression (causing pessimism)
because our equality says loads and stores are the same.
We introduce an exact equality function and use it when erasing to
make sure we erase only identical expressions, not equivalent ones.
llvm-svn: 304807
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Expanding the loop idiom test for memcpy to also recognize
unordered atomic memcpy. The only difference for recognizing
an unordered atomic memcpy and instead of a normal memcpy is
that the loads and/or stores involved are unordered atomic operations.
Background: http://lists.llvm.org/pipermail/llvm-dev/2017-May/112779.html
Patch by Daniel Neilson!
Reviewers: reames, anna, skatkov
Reviewed By: reames, anna
Subscribers: llvm-commits, mzolotukhin
Differential Revision: https://reviews.llvm.org/D33243
llvm-svn: 304806
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
We were canonizalizing the pre loop (into loop-simplify form) before
the post loop blocks were added into parent loop. This is incorrect when IRCE is
done on a subloop. The post-loop blocks are created, but not yet added to the
parent loop. So, loop-simplification on the pre-loop incorrectly updates
LoopInfo.
This patch corrects the ordering so that pre and post loop blocks are added to
parent loop (if any), and then the loops are canonicalized to LCSSA and
LoopSimplifyForm.
Reviewers: reames, sanjoy, apilipenko
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33846
llvm-svn: 304800
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I did this a long time ago with a janky python script, but now
clang-format has built-in support for this. I fed clang-format every
line with a #include and let it re-sort things according to the precise
LLVM rules for include ordering baked into clang-format these days.
I've reverted a number of files where the results of sorting includes
isn't healthy. Either places where we have legacy code relying on
particular include ordering (where possible, I'll fix these separately)
or where we have particular formatting around #include lines that
I didn't want to disturb in this patch.
This patch is *entirely* mechanical. If you get merge conflicts or
anything, just ignore the changes in this patch and run clang-format
over your #include lines in the files.
Sorry for any noise here, but it is important to keep these things
stable. I was seeing an increasing number of patches with irrelevant
re-ordering of #include lines because clang-format was used. This patch
at least isolates that churn, makes it easy to skip when resolving
conflicts, and gets us to a clean baseline (again).
llvm-svn: 304787
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
same basic block.
Summary:
This problem stems from the fact that instructions are allocated using new
in LLVM, i.e. there is no relationship that can be derived by just looking
at the pointer value.
This interface dispatches to appropriate dominance check given 2 instructions,
i.e. in case the instructions are in the same basic block, ordered basicblock
(with instruction numbering and caching) are used. Otherwise, dominator tree
is used.
This is a preparation patch for https://reviews.llvm.org/D32720
Reviewers: dberlin, hfinkel, davide
Subscribers: davide, mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D33380
llvm-svn: 304764
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The patch moves LSR cost comparison to target part.
Reviewers: qcolombet
Differential Revision: http://reviews.llvm.org/D30561
From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 304750
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The patch guard all instruction cost calculations with InsnCosts (-lsr-insns-cost) option.
Currently even if the option set to false we calculate and print (in debug mode) instruction costs.
Reviewers: qcolombet
Differential Revision: http://reviews.llvm.org/D33914
From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 304746
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes a bug that can cause extractelements with operands that
haven't been defined yet to be inserted at a wrong point when
optimising insertelements.
Patch by Karl Hylen.
Differential Revision: https://reviews.llvm.org/D33449
llvm-svn: 304701
|
|
|
|
|
|
|
|
| |
-fsanitize-coverage=inline-8bit-counters. Experimental so far, not documenting yet."
This reverts commit r304630, as it broke ARM/AArch64 bots for 2 days.
llvm-svn: 304698
|
|
|
|
|
|
|
|
|
|
| |
Following the request made in https://reviews.llvm.org/D32871,
scalarizeInstruction() which is no longer overridden by InnerLoopUnroller is
hereby made non-virtual in InnerLoopVectorizer.
Should have been part of r297580 originally.
llvm-svn: 304685
|
|
|
|
|
|
| |
known bits.
llvm-svn: 304669
|
|
|
|
| |
llvm-svn: 304638
|
|
|
|
| |
llvm-svn: 304637
|
|
|
|
| |
llvm-svn: 304636
|
|
|
|
|
|
| |
-fsanitize-coverage=inline-8bit-counters. Experimental so far, not documenting yet.
llvm-svn: 304630
|
|
|
|
|
|
| |
This reverts commit r304582: breaks cfi-devirt :: anon-namespace.cpp on Darwin.
llvm-svn: 304626
|
|
|
|
|
|
|
|
|
| |
Fixed some comments, added an additional description of the algorithms,
improved readability of the code.
Differential revision: https://reviews.llvm.org/D33320
llvm-svn: 304616
|
|
|
|
|
|
| |
sections in future. NFC
llvm-svn: 304610
|
|
|
|
|
|
| |
This reverts commit 6e311de8b907aa20da9a1a13ab07c3ce2ef4068a.
llvm-svn: 304609
|
|
|
|
|
|
| |
We'd called this "vm state" in the early days, but have long since standardized on calling it "deopt" in line with the operand bundle tag. Fix a few cases we'd missed.
llvm-svn: 304607
|
|
|
|
| |
llvm-svn: 304600
|
|
|
|
|
|
| |
Also added a test option and 2 cost analysis related tests.
llvm-svn: 304599
|
|
|
|
|
|
|
|
|
| |
empty
Minor optimization but mostly simplifies my debugging so I'm not dealing
with empty SCCNodeSets while investigating issues in this optimization.
llvm-svn: 304597
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Fixed some comments, added an additional description of the algorithms,
improved readability of the code.
Reviewers: anemet
Subscribers: llvm-commits, mzolotukhin
Differential Revision: https://reviews.llvm.org/D33320
llvm-svn: 304593
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
As shown in the test case, SROA was crashing when trying to split
stores (to the alloca) of loads (from anywhere), because it assumed
the pointer operand to the loads and stores had to have the same
address space. This isn't the case. Make sure to use the correct
pointer type for both the load and the store.
Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D32593
llvm-svn: 304585
|
|
|
|
|
|
|
|
|
|
| |
Since D17854 LinkerSubsectionsViaSymbols is unnecessary.
It is interfering with ThinLTO implementation of CFI-ICall, where
the aliases used on the !LinkerSubsectionsViaSymbols branch are
needed to export jump tables to ThinLTO backends.
llvm-svn: 304582
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D33805
llvm-svn: 304578
|
|
|
|
|
|
| |
constant
llvm-svn: 304562
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Op1 (RHS) is a constant, so putting it on the LHS makes us churn through visitICmp
an extra time to canonicalize it:
INSTCOMBINE ITERATION #1 on cmpnot
IC: ADDING: 3 instrs to worklist
IC: Visiting: %notx = xor i8 %x, -1
IC: Visiting: %cmp = icmp sgt i8 %notx, 42
IC: Old = %cmp = icmp sgt i8 %notx, 42
New = <badref> = icmp sgt i8 -43, %x
IC: ADD: %cmp = icmp sgt i8 -43, %x
IC: ERASE %1 = icmp sgt i8 %notx, 42
IC: ADD: %notx = xor i8 %x, -1
IC: DCE: %notx = xor i8 %x, -1
IC: ERASE %notx = xor i8 %x, -1
IC: Visiting: %cmp = icmp sgt i8 -43, %x
IC: Mod = %cmp = icmp sgt i8 -43, %x
New = %cmp = icmp slt i8 %x, -43
IC: ADD: %cmp = icmp slt i8 %x, -43
IC: Visiting: %cmp = icmp slt i8 %x, -43
IC: Visiting: ret i1 %cmp
If we create the swapped ICmp directly, we go faster:
INSTCOMBINE ITERATION #1 on cmpnot
IC: ADDING: 3 instrs to worklist
IC: Visiting: %notx = xor i8 %x, -1
IC: Visiting: %cmp = icmp sgt i8 %notx, 42
IC: Old = %cmp = icmp sgt i8 %notx, 42
New = <badref> = icmp slt i8 %x, -43
IC: ADD: %cmp = icmp slt i8 %x, -43
IC: ERASE %1 = icmp sgt i8 %notx, 42
IC: ADD: %notx = xor i8 %x, -1
IC: DCE: %notx = xor i8 %x, -1
IC: ERASE %notx = xor i8 %x, -1
IC: Visiting: %cmp = icmp slt i8 %x, -43
IC: Visiting: ret i1 %cmp
llvm-svn: 304558
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Optimization passes may remove llvm.coro.suspend intrinsic while leaving matching llvm.coro.save intrinsic orphaned.
Make sure we clean up orphaned coro.saves. The bug manifested with a crash similar to this:
```
llvm_unreachable("Unknown type!");
llvm::MVT::getVT (Ty=0x489518, HandleUnknown=false)
llvm::EVT::getEVT
llvm::TargetLoweringBase::getValueType
llvm::ComputeValueVTs
llvm::SelectionDAGBuilder::visitTargetIntrinsic
```
Reviewers: GorNishanov
Subscribers: EricWF, llvm-commits
Differential Revision: https://reviews.llvm.org/D33817
llvm-svn: 304518
|
|
|
|
|
|
|
|
|
| |
builtin_expect applied on && or || expressions were not
handled properly before. With this patch, the problem is fixed.
Differential Revision: http://reviews.llvm.org/D33164
llvm-svn: 304517
|
|
|
|
| |
llvm-svn: 304514
|