| Commit message (Collapse) | Author | Age | Files | Lines |
| |
Summary: We need a new LLVM intrinsic to implement MS _AddressOfReturnAddress builtin on 64-bit Windows.
Reviewers: majnemer, rnk
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D25293
llvm-svn: 284061
|
unroll a loop"
Reapply r284044 after revert in r284051. Krzysztof fixed the error in r284049.
The original summary:
This patch tries to fully unroll loops that contain a break statement, like this:
  for (int i = 0; i < 8; i++) {
    if (a[i] == value) {
      found = true;
      break;
    }
  }
GCC can fully unroll such loops, but currently LLVM cannot because LLVM only
supports loops having exact constant trip counts.
The upper bound of the trip count can be obtained from calling
ScalarEvolution::getMaxBackedgeTakenCount(). Part of the patch is the
refactoring work in SCEV to prevent duplicating code.
Using the upper bound is enabled under the same circumstances as runtime
unrolling, since both are used to unroll loops without knowing the exact
constant trip count.
llvm-svn: 284053
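To illustrate the transformation described in the summary above, here is a minimal hand-written sketch of what full unrolling by the upper bound amounts to at the source level. The function names `containsRolled` and `containsUnrolled` are hypothetical and are not part of the patch; the actual work is done on LLVM IR by the loop unroller, not on C++ source.

```cpp
#include <array>
#include <cstddef>

// Rolled form: because of the break, the exact trip count is unknown,
// but the loop body runs at most 8 times (the upper bound).
bool containsRolled(const std::array<int, 8> &a, int value) {
  bool found = false;
  for (std::size_t i = 0; i < a.size(); ++i) {
    if (a[i] == value) {
      found = true;
      break;
    }
  }
  return found;
}

// Hand-unrolled form: one copy of the body per possible iteration.
// Each early exit is kept, so the observable behavior is unchanged.
bool containsUnrolled(const std::array<int, 8> &a, int value) {
  if (a[0] == value) return true;
  if (a[1] == value) return true;
  if (a[2] == value) return true;
  if (a[3] == value) return true;
  if (a[4] == value) return true;
  if (a[5] == value) return true;
  if (a[6] == value) return true;
  if (a[7] == value) return true;
  return false;
}
```

Both functions should return the same result for any input; the unrolled form simply has no backedge.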
|
Differential Revision: https://reviews.llvm.org/D25530
llvm-svn: 284052
|
unroll a loop"
This reverts commit r284044.
llvm-svn: 284051
|
The codegen has changed slightly between my tests and the commit.
llvm-svn: 284049
|
This patch tries to fully unroll loops that contain a break statement, like this:
  for (int i = 0; i < 8; i++) {
    if (a[i] == value) {
      found = true;
      break;
    }
  }
GCC can fully unroll such loops, but currently LLVM cannot because LLVM only
supports loops having exact constant trip counts.
The upper bound of the trip count can be obtained from calling
ScalarEvolution::getMaxBackedgeTakenCount(). Part of the patch is the
refactoring work in SCEV to prevent duplicating code.
Using the upper bound is enabled under the same circumstances as runtime
unrolling, since both are used to unroll loops without knowing the exact
constant trip count.
Differential Revision: https://reviews.llvm.org/D24790
llvm-svn: 284044
|
LTOCodeGenerator::applyScopeRestrictions().
We need to use the overload of Mangler::getNameWithPrefix that takes a
GlobalValue in order to mangle in the stdcall stack byte count for Windows
targets.
Differential Revision: https://reviews.llvm.org/D25529
llvm-svn: 284040
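As a rough illustration of why the GlobalValue overload matters: on 32-bit x86 Windows, a stdcall symbol is conventionally decorated as `_name@N`, where N is the number of argument bytes. That count comes from the function's signature, so a name-only mangling call has nothing to derive it from. The helper `decorateStdcall` below is hypothetical and is not LLVM's Mangler; it only sketches the decoration itself.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of LLVM): apply the conventional
// 32-bit x86 Windows stdcall decoration "_name@N", where N is the
// total size in bytes of the arguments passed on the stack.
std::string decorateStdcall(const std::string &name, unsigned argBytes) {
  assert(!name.empty() && "symbol name must not be empty");
  return "_" + name + "@" + std::to_string(argBytes);
}
```

For example, a stdcall function `foo` taking two 4-byte arguments would be decorated as `_foo@8` under this convention.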
|
Branch folder removes implicit defs if they are the only non-branching
instructions in a block, and the branches do not use the defined registers.
The problem is that in some cases these implicit defs are required for
the liveness information to be correct.
Differential Revision: https://reviews.llvm.org/D25478
llvm-svn: 284036
|
This is the most basic handling of the indirect access
pseudos using GPR indexing mode. This currently only enables
the mode for a single v_mov_b32 and then disables it.
This is much more complicated to use than the movrel instructions,
so a new optimization pass is probably needed to fold the access
into the uses and keep the mode enabled for them.
llvm-svn: 284031
|
Module inline asm was always being linked/concatenated
when running the IRLinker. This is correct for full LTO but not when
we are importing for ThinLTO, as it can result in multiply defined
symbols when the module asm defines a global symbol.
In order to test with llvm-lto2, I had to work around PR30396,
where a symbol that is defined in module assembly but defined in the
LLVM IR appears twice. Added workaround to llvm-lto2 with a FIXME.
Fixes PR30610.
Reviewers: mehdi_amini
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D25359
llvm-svn: 284030
|
Summary:
Constant bundle operands may need to retain their constant-ness for
correctness. I'll admit that this is slightly odd, but it looks like
SimplifyCFG already does this for things like @llvm.frameaddress and
@llvm.stackmap, so I suppose adding one more case is not a big deal.
It is possible to add a mechanism to denote bundle operands that need to
remain constants, but that's probably too complicated for the time
being.
Reviewers: jmolloy
Subscribers: mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D25502
llvm-svn: 284028
|
VI added a second method of indexing into VGPRs
besides using v_movrel*
llvm-svn: 284027
|
llvm-svn: 284025
|
This makes more fields overridable and removes redundant bits.
Patch by: Changpeng Fang
llvm-svn: 284024
|
Since this change is known to cause performance degradations in some cases, it is committed under a temporary flag that is turned off by default.
Patch by Li Huang
Differential Revision: https://reviews.llvm.org/D18777
llvm-svn: 284022
|
Prevent partial parsing of '$' or '@' of invalid identifiers and fixup
workaround points. NFC Intended.
llvm-svn: 284017
|
Add a number of helper functions to match scalar or vector equivalent constant/splat values to allow most of the combine patterns to be used by vectors.
Differential Revision: https://reviews.llvm.org/D25374
llvm-svn: 284015
|
disabled
This combiner breaks the debugging experience and should not be run when optimizations are disabled.
For example:
  int main() {
    int j = 0;
    j += 2;
    if (j == 2)
      return 0;
    return 5;
  }
When debugging this code compiled at /O0, it should be valid to break at the line "j += 2;" and edit the value of j. Doing so should change the return value of the function.
Differential Revision: https://reviews.llvm.org/D19268
llvm-svn: 284014
|
An arithmetic shift can be safely changed to a logical shift if the first
operand is known positive. This allows ComputeKnownBits (and similar analysis)
to determine the sign bit of the shifted value in some cases. In turn, this
allows InstCombine to canonicalize a signed comparison (a > 0) into an equality
check (a != 0).
PR30577
Differential Revision: https://reviews.llvm.org/D25119
llvm-svn: 284013
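The equivalence this change relies on can be checked with a small standalone sketch. The function names below are hypothetical and unrelated to the InstCombine code. Note the assumptions: `>>` on a negative signed value is implementation-defined before C++20 (arithmetic on mainstream compilers), and the conversion from unsigned back to signed is likewise only guaranteed to wrap from C++20 onward.

```cpp
#include <cassert>
#include <cstdint>

// Arithmetic shift right: copies of the sign bit are shifted in.
std::int32_t arithmeticShiftRight(std::int32_t x, unsigned k) {
  assert(k < 32 && "shift amount must be less than the bit width");
  return x >> k;
}

// Logical shift right: zeros are shifted in, done on the unsigned bits.
std::int32_t logicalShiftRight(std::int32_t x, unsigned k) {
  assert(k < 32 && "shift amount must be less than the bit width");
  return static_cast<std::int32_t>(static_cast<std::uint32_t>(x) >> k);
}

// For a non-negative x the sign bit is 0, so both shifts insert zeros
// and must produce the same value.
bool shiftsAgree(std::int32_t x, unsigned k) {
  return arithmeticShiftRight(x, k) == logicalShiftRight(x, k);
}
```

After a logical shift by at least one bit the top bit of the result is always zero, which is the kind of fact a known-bits analysis can then use to reason about the sign of the result.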
|
As discussed by Andrea on PR30486, we have an unsafe cast to an Instruction type in the select combine which doesn't take into account that it could be a ConstantExpr instead.
Differential Revision: https://reviews.llvm.org/D25466
llvm-svn: 284000
|
Add unit tests for checking a few tricky instruction sizes. Also remove the old
tests for the instruction sizes, which were clunky and brittle.
Since this is the first set of target-specific unit tests, we need to add some
CMake plumbing. In the future, adding unit tests for a given target will be as
simple as creating a directory with the same name as the target under
unittests/Target. The tests are only run if the target is enabled in
LLVM_TARGETS_TO_BUILD.
Differential Revision: https://reviews.llvm.org/D24548
llvm-svn: 283990
|
I screwed up my merge conflict and lost some of the CHECK lines.
llvm-svn: 283974
|
llvm-svn: 283973
|
Although Copies are not specific to preISel, we still have to assign them
a proper register class. However, given they are not constrained to
anything we do not have to handle the source register at the copy. It
will be properly mapped when reaching the related definition.
In the process, the handling of G_ANYEXT is slightly modified, as those
end up being selected as copies. The difference is that when the register
sizes do not match on both sides, we need to insert a SUBREG_TO_REG operation,
otherwise the post-RA copy expansion will not be happy!
llvm-svn: 283972
|
llvm-svn: 283971
|
Those are copies, we do not have to do any legalization action for them.
llvm-svn: 283970
|
This is a refreshed version of a patch that was reverted: it fixes
the problems reported in both PR30216 and PR30499, and
contains all the test-cases from both bugs.
To hoist stores past loads, we used to search for potential
conflicting loads on the hoisting path by following a MemorySSA
def-def link from the store to be hoisted to the previous
defining memory access, and from there we followed the def-use
chains to all the uses that occur on the hoisting path. The
problem is that the def-def link may point to a store that does
not alias with the store to be hoisted, and so the loads that are
walked may not alias with the store to be hoisted, and even as in
the testcase of PR30216, the loads that may alias with the store
to be hoisted are not visited.
The current patch visits all loads on the path from the store to
be hoisted to the hoisting position and uses the alias analysis
to ask whether the store may alias the load. I was not able to
use the MemorySSA functionality to ask for whether load and
store are clobbered: I'm not sure which function to call, so I
used a call to AA->isNoAlias().
Store past store is still working as before using a MemorySSA
query: I added an extra test to pr30216.ll to make sure store
past store does not regress.
Tested on x86_64-linux with check and a test-suite run.
Differential Revision: https://reviews.llvm.org/D25476
llvm-svn: 283965
|
Summary:
In PPCMIPeephole, when we see two splat instructions, we can't simply do the following transformation:
B = Splat A
C = Splat B
=>
C = Splat A
because B may still be used between these two instructions. Instead, we should make the second Splat a PPC::COPY and let later passes decide whether to remove it or not:
B = Splat A
C = Splat B
=>
B = Splat A
C = COPY B
Fixes PR30663.
Reviewers: echristo, iteratee, kbarton, nemanjai
Subscribers: mehdi_amini, llvm-commits
Differential Revision: https://reviews.llvm.org/D25493
llvm-svn: 283961
|
Fixes a crash in the build_vector -> vector_shuffle combine
when the first vector input is twice as wide as the output,
and the second input vector is even wider.
llvm-svn: 283953
|
Mostly Ahmed's work again, I'm just sprucing things up slightly before
committing.
llvm-svn: 283952
|
On Darwin, marking a section as "regular,live_support" means that a
symbol in the section should only be kept live if it has a reference to
something that is live. Otherwise, the linker is free to dead-strip it.
Turn this functionality on for the __llvm_prf_data section.
This means that counters and data associated with dead functions will be
removed from dead-stripped binaries. This will result in smaller
profiles and binaries, and should speed up profile collection.
Tested with check-profile, llvm-lit test/tools/llvm-{cov,profdata}, and
check-llvm.
Differential Revision: https://reviews.llvm.org/D25456
llvm-svn: 283947
|
Reverts r283938 to reinstate r283867 with a fix.
The original change had an ArrayRef referring to a destroyed temporary
initializer list. Use plain C arrays instead.
llvm-svn: 283942
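The lifetime problem mentioned above can be sketched without LLVM. `IntView` below is a hypothetical stand-in for a non-owning view such as llvm::ArrayRef, and the array contents are illustrative values, not the registers from the patch. The point is only that a view is safe when its backing storage outlives it, which a plain array with static storage duration guarantees.

```cpp
#include <cstddef>

// Hypothetical stand-in for a non-owning view type: it stores a pointer
// and a length, and never owns or extends the lifetime of the elements.
struct IntView {
  const int *data;
  std::size_t size;
};

// Risky pattern (shown only as a comment): building a view directly from
// a braced list, e.g. "View V = {4, 5, 6, 7};", points the view at a
// temporary array that is gone once the statement finishes.

// Safe pattern: a plain C array with static storage duration.
static const int kRegs[] = {4, 5, 6, 7}; // illustrative values

IntView allRegs() {
  return IntView{kRegs, sizeof(kRegs) / sizeof(kRegs[0])};
}

// Walk the view; valid because kRegs lives for the whole program.
int sum(IntView v) {
  int total = 0;
  for (std::size_t i = 0; i < v.size; ++i)
    total += v.data[i];
  return total;
}
```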
|
load commands that use the MachO::linker_option_command
type, which is not used in llvm libObject code but is used in llvm tool code.
This includes just the LC_LINKER_OPTION load command.
llvm-svn: 283939
|
This reverts r283867.
This appears to be an infinite loop:
  while (HiRegToSave != AllHighRegs.end() && CopyReg != AllCopyRegs.end()) {
    if (HiRegsToSave.count(*HiRegToSave)) {
      ...
      CopyReg = findNextOrderedReg(++CopyReg, CopyRegs, AllCopyRegs.end());
      HiRegToSave =
          findNextOrderedReg(++HiRegToSave, HiRegsToSave, AllHighRegs.end());
    }
  }
llvm-svn: 283938
|
Patch mostly by Ahmed Bougacha.
llvm-svn: 283937
|
The tail duplication pass uses an assumed layout when making duplication
decisions. This is fine, but passes up duplication opportunities that
may arise when blocks are outlined. Because we want the updated CFG to
affect subsequent placement decisions, this change must occur during
placement.
In order to achieve this goal, TailDuplicationPass is split into a
utility class, TailDuplicator, and the pass itself. The pass delegates
nearly everything to the TailDuplicator object, except for looping over
the blocks in a function. This allows the same code to be used for tail
duplication in both places.
This change, in concert with outlining optional branches, allows
triangle-shaped code to perform much better, especially when the
taken/untaken branches are correlated, as it creates a second spine when
the tests are small enough.
Issue from previous rollback fixed, and a new test was added for that
case as well. Issue was worklist/scheduling/taildup issue in layout.
Issue from 2nd rollback fixed, with 2 additional tests. Issue was
tail merging/loop info/tail-duplication causing issue with loops that share
a header block.
Issue with early tail-duplication of blocks that branch to a fallthrough
predecessor fixed with test case: tail-dup-branch-to-fallthrough.ll
Differential revision: https://reviews.llvm.org/D18226
llvm-svn: 283934
|
llvm-svn: 283930
|
commented-out code.
llvm-svn: 283924
|
llvm-svn: 283911
|
This enhances the fold added with:
https://reviews.llvm.org/rL283900
llvm-svn: 283905
|
llvm-svn: 283903
|
Differential Revision: https://reviews.llvm.org/D25471
llvm-svn: 283902
|
Summary:
This test is allowed to run on non-x86 hosts and thus must use
llvm-nm rather than nm.
Differential Revision: https://reviews.llvm.org/D25473
llvm-svn: 283901
|
The non-obvious motivation for adding this fold (which already happens in InstCombine)
is that we want to canonicalize IR towards select instructions and canonicalize DAG
nodes towards boolean math. So we need to recreate some folds in the DAG to handle that
change in direction.
An interesting implementation difference for cases like this is that InstCombine
generally works top-down while the DAG goes bottom-up. That means we need to detect
different patterns. In this case, the SimplifyDemandedBits fold prevents us from
performing a zext to sext fold that would then be recognized as a negation of a sext.
llvm-svn: 283900
|
llvm-svn: 283894
|
Differential Revision: http://reviews.llvm.org/D25454
Reviewers: tstellarAMD
llvm-svn: 283893
|
Added 32-bit target test
llvm-svn: 283883
|
llvm-svn: 283881
|
To make it more obvious how bad some of that truncation code is....
llvm-svn: 283880
|
llvm-svn: 283876
|