| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
| |
This should fix the GlobalISel verifier.
llvm-svn: 293550
|
| |
|
|
|
|
|
|
|
| |
For some reason the exception selector register must be a pointer (that's
assumed by SDag); on the other hand, it gets moved into an IR-level type which
might be entirely different (i32 on AArch64). IRTranslator needs to be aware of
this.
llvm-svn: 293546
|
| |
|
|
| |
llvm-svn: 293545
|
| |
|
|
| |
llvm-svn: 293541
|
| |
|
|
|
|
|
| |
This can happen if earlier combining has removed all uses of some VReg, which
is fine and shouldn't flag an error.
llvm-svn: 293537
|
| |
|
|
| |
llvm-svn: 293532
|
| |
|
|
|
|
|
|
|
|
|
| |
Previously, we would hit UB (or the ISD::DELETED_NODE assert) if we
happened to replace a node during UpdateChains, because it would be
left in the list we were iterating over. This nulls out the pointer
when that happens so that we can avoid the issue.
Fixes llvm.org/PR31710
llvm-svn: 293522
|
| |
|
|
|
|
| |
possible. NFCI.
llvm-svn: 293520
|
| |
|
|
|
|
|
|
| |
fcmp (fneg x), c, pred -> fcmp x, -c, (swap pred)
InstCombine already does this.
llvm-svn: 293512
|
| |
|
|
|
|
|
|
|
| |
To simplify/clarify memory ownership, make leaks (as one was found/fixed
recently) harder to write, etc.
(also, while I was there - removed a duplicate lookup in a container)
llvm-svn: 293506
|
| |
|
|
|
|
|
| |
This fixes emitting conversions of constants on targets
without legal f16 that need to use these for legalization.
llvm-svn: 293499
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D28079
llvm-svn: 293470
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
The primary use of the dump() functions in LLVM is for use in a
debugger. Unfortunately lldb does not seem to handle default arguments
so using `p SomeMI.dump()` fails and you have to type the longer `p
SomeMI.dump(nullptr)`. Remove the paramter to make the most common use
easy. (You can always construct something like `p
SomeMI.print(dbgs(),MyTII)` if you need more features).
Differential Revision: https://reviews.llvm.org/D29241
llvm-svn: 293440
|
| |
|
|
|
|
| |
It's operation already exists manually in many places without using the method.
llvm-svn: 293421
|
| |
|
|
|
|
| |
The type system requires that the number of vector elements should fit in 32-bits so this should be safe.
llvm-svn: 293414
|
| |
|
|
|
|
|
|
| |
of EXTRACT_SUBVECTOR.
The type system already requires that the number of vector elements must fit in 32-bits so an index should as well. Even if the type of the index were larger all we care about is that the constant index can fit in 64-bits so that we can call getZExtValue.
llvm-svn: 293413
|
| |
|
|
|
|
| |
trying to use getConstantOperandVal.
llvm-svn: 293412
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D29141
llvm-svn: 293408
|
| |
|
|
| |
llvm-svn: 293371
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When the OperandsMapper creates virtual registers, it used to just create
plain scalar register with the right size. This may confuse the
instruction selector because we lose the information of the instruction
using those registers what supposed to do. The MachineVerifier complains
about that already.
With this patch, the OperandsMapper still creates plain scalar register,
but the expectation is for the mapping function to remap the type
properly. The default mapping function has been updated to do that.
rdar://problem/30231850
llvm-svn: 293362
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We had various variants of defining dump() functions in LLVM. Normalize
them (this should just consistently implement the things discussed in
http://lists.llvm.org/pipermail/cfe-dev/2014-January/034323.html
For reference:
- Public headers should just declare the dump() method but not use
LLVM_DUMP_METHOD or #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
- The definition of a dump method should look like this:
#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
LLVM_DUMP_METHOD void MyClass::dump() {
// print stuff to dbgs()...
}
#endif
llvm-svn: 293359
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In r292621, the recommit fixes a bug related with live interval update
after the partial redundent copy is moved.
This recommit solves an additional bug related to the lack of update of
subranges.
The original patch is to solve the performance problem described in
PR27827. Register coalescing sometimes cannot remove a copy because of
interference. But if we can find a reverse copy in one of the predecessor
block of the copy, the copy is partially redundent and we may remove the
copy partially by moving it to the predecessor block without the
reverse copy.
Differential Revision: https://reviews.llvm.org/D28585
Re-apply r292621
Revert "Revert rL292621. Caused some internal build bot failures in apple."
This reverts commit r292984.
Original patch: Wei Mi <wmi@google.com>
Subrange fix: Mostly Matthias Braun <matze@braunis.de>
llvm-svn: 293353
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
#0 0x89cdeb in operator new[](unsigned long) /code/llvm/projects/compiler-rt/lib/asan/asan_new_delete.cc:84:37
#1 0x4ec87c4 in llvm::RegisterBankInfo::ValueMapping const* llvm::RegisterBankInfo::getOperandsMapping<llvm::RegisterBankInfo::ValueMapping const* const*>(llvm::RegisterBankInfo::ValueMapping const* const*, llvm::RegisterBankInfo::ValueMapping const* const*) const /code/llvm/lib/CodeGen/GlobalISel/RegisterBankInfo.cpp:297:9
#2 0x9327ee in llvm::AArch64RegisterBankInfo::getInstrMapping(llvm::MachineInstr const&) const /code/llvm/lib/Target/AArch64/AArch64RegisterBankInfo.cpp:540:30
#3 0x4eb8d07 in llvm::RegBankSelect::assignInstr(llvm::MachineInstr&) /code/llvm/lib/CodeGen/GlobalISel/RegBankSelect.cpp:546:24
#4 0x4eb9dd2 in llvm::RegBankSelect::runOnMachineFunction(llvm::MachineFunction&) /code/llvm/lib/CodeGen/GlobalISel/RegBankSelect.cpp:624:12
#5 0x3141875 in llvm::MachineFunctionPass::runOnFunction(llvm::Function&) /code/llvm/lib/CodeGen/MachineFunctionPass.cpp:62:13
#6 0x396128d in llvm::FPPassManager::runOnFunction(llvm::Function&) /code/llvm/lib/IR/LegacyPassManager.cpp:1513:27
#7 0x3961832 in llvm::FPPassManager::runOnModule(llvm::Module&) /code/llvm/lib/IR/LegacyPassManager.cpp:1534:16
#8 0x3962540 in runOnModule /code/llvm/lib/IR/LegacyPassManager.cpp:1590:27
#9 0x3962540 in llvm::legacy::PassManagerImpl::run(llvm::Module&) /code/llvm/lib/IR/LegacyPassManager.cpp:1693
#10 0x8ae368 in compileModule(char**, llvm::LLVMContext&) /code/llvm/tools/llc/llc.cpp:562:8
#11 0x8a7a1b in main /code/llvm/tools/llc/llc.cpp:316:22
llvm-svn: 293351
|
| |
|
|
|
|
|
| |
We have to delete the block manually or it leaks. That triggers failures in
-fsanitize=leak bots (unsurprisingly), which should be fixed by this patch.
llvm-svn: 293347
|
| |
|
|
|
|
|
| |
Since it's not actually a generic MI, its register operands need a RegClass,
which is conveniently the target's pointer RegClass.
llvm-svn: 293335
|
| |
|
|
|
|
| |
Should fix machine verifier failures.
llvm-svn: 293334
|
| |
|
|
|
|
|
|
| |
Preparation for upcoming changes. No testcase as none of the public
targets bundles early enough and has a post machine scheduler enabled at
the same time. The error is also easily catched by asserts.
llvm-svn: 293324
|
| |
|
|
| |
llvm-svn: 293323
|
| |
|
|
|
|
| |
Comment, doxygen and a bit of whitespace cleanup.
llvm-svn: 293322
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: This change prevent the signed value of cost from being negative as the value is passed as an unsigned argument.
Reviewers: mcrosier, jmolloy, qcolombet, javed.absar
Reviewed By: mcrosier, qcolombet
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28871
llvm-svn: 293307
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
In case of a SIGN/ZERO_EXTEND of an incomplete vector type (using only a
partial number of available vector elements), WidenVecRes_Convert() used to
resort to scalarization.
This patch adds a handling of the (common) case where an input vector can be
found of same width as the widened result vector, by converting the node to
SIGN/ZERO_EXTEND_VECTOR_INREG.
Review: Eli Friedman
llvm-svn: 293268
|
| |
|
|
|
|
|
|
| |
The translation scheme is mostly cribbed from FastISel, and it's not entirely
convincing semantically. But it does seem to work in the common cases and allow
variables to be printed so it can't be all wrong.
llvm-svn: 293228
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit introduces a set of experimental intrinsics intended to prevent
optimizations that make assumptions about the rounding mode and floating point
exception behavior. These intrinsics will later be extended to specify
flush-to-zero behavior. More work is also required to model instruction
dependencies in machine code and to generate these instructions from clang
(when required by pragmas and/or command line options that are not currently
supported).
Differential Revision: https://reviews.llvm.org/D27028
llvm-svn: 293226
|
| |
|
|
|
|
| |
This simplifies skipping debug instructions and shrinking ranges.
llvm-svn: 293202
|
| |
|
|
|
|
|
|
| |
UseAA is enabled."
This reverts commit r293184 which is failing in LTO builds
llvm-svn: 293188
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
enabled.
* Simplify Consecutive Merge Store Candidate Search
Now that address aliasing is much less conservative, push through
simplified store merging search and chain alias analysis which only
checks for parallel stores through the chain subgraph. This is cleaner
as the separation of non-interfering loads/stores from the
store-merging logic.
When merging stores search up the chain through a single load, and
finds all possible stores by looking down from through a load and a
TokenFactor to all stores visited.
This improves the quality of the output SelectionDAG and the output
Codegen (save perhaps for some ARM cases where we correctly constructs
wider loads, but then promotes them to float operations which appear
but requires more expensive constant generation).
Some minor peephole optimizations to deal with improved SubDAG shapes (listed below)
Additional Minor Changes:
1. Finishes removing unused AliasLoad code
2. Unifies the chain aggregation in the merged stores across code
paths
3. Re-add the Store node to the worklist after calling
SimplifyDemandedBits.
4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is
arbitrary, but seems sufficient to not cause regressions in
tests.
5. Remove Chain dependencies of Memory operations on CopyfromReg
nodes as these are captured by data dependence
6. Forward loads-store values through tokenfactors containing
{CopyToReg,CopyFromReg} Values.
7. Peephole to convert buildvector of extract_vector_elt to
extract_subvector if possible (see
CodeGen/AArch64/store-merge.ll)
8. Store merging for the ARM target is restricted to 32-bit as
some in some contexts invalid 64-bit operations are being
generated. This can be removed once appropriate checks are
added.
This finishes the change Matt Arsenault started in r246307 and
jyknight's original patch.
Many tests required some changes as memory operations are now
reorderable, improving load-store forwarding. One test in
particular is worth noting:
CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store
forwarding converts a load-store pair into a parallel store and
a memory-realized bitcast of the same value. However, because we
lose the sharing of the explicit and implicit store values we
must create another local store. A similar transformation
happens before SelectionDAG as well.
Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle
llvm-svn: 293184
|
| |
|
|
|
|
| |
undef subvectors.
llvm-svn: 293152
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This allows MIR passes to emit optimization remarks with the same level
of functionality that is available to IR passes.
It also hooks up the greedy register allocator to report spills. This
allows for interesting use cases like increasing interleaving on a loop
until spilling of registers is observed.
I still need to experiment whether reporting every spill scales but this
demonstrates for now that the functionality works from llc
using -pass-remarks*=<pass>.
Differential Revision: https://reviews.llvm.org/D29004
llvm-svn: 293110
|
| |
|
|
|
|
|
|
| |
Later code expects the vector loads produced to be directly
concatenable, which means we shouldn't pad anything except the last load
produced with UNDEF.
llvm-svn: 293088
|
| |
|
|
|
|
| |
Thanks to Quentin for suggesting the refactoring.
llvm-svn: 293087
|
| |
|
|
|
|
|
| |
I think it's a hold-over from some previous iteration, but it's never
set to true in LLVM as it exists now.
llvm-svn: 293086
|
| |
|
|
| |
llvm-svn: 293077
|
| |
|
|
|
|
|
| |
This reverts commit r293033, per Danny's comment. In short, we require
domtrees to have roots at all times.
llvm-svn: 293075
|
| |
|
|
|
|
| |
Fix unused variable, specify types explicitly to make VC compiler happy.
llvm-svn: 293039
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Attempt #2.
The previous patch (https://reviews.llvm.org/rL289538) got reverted because of a bug. Chandler also requested some changes to the algorithm.
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20161212/413479.html
This is an updated patch. The key difference is that collectBitProviders (renamed to calculateByteProvider) now collects the origin of one byte, not the whole value. It simplifies the implementation and allows to stop the traversal earlier if we know that the result won't be used.
From the original commit:
Match a pattern where a wide type scalar value is loaded by several narrow loads and combined by shifts and ors. Fold it into a single load or a load and a bswap if the targets supports it.
Assuming little endian target:
i8 *a = ...
i32 val = a[0] | (a[1] << 8) | (a[2] << 16) | (a[3] << 24)
=>
i32 val = *((i32)a)
i8 *a = ...
i32 val = (a[0] << 24) | (a[1] << 16) | (a[2] << 8) | a[3]
=>
i32 val = BSWAP(*((i32)a))
This optimization was discussed on llvm-dev some time ago in "Load combine pass" thread. We came to the conclusion that we want to do this transformation late in the pipeline because in presence of atomic loads load widening is irreversible transformation and it might hinder other optimizations.
Eventually we'd like to support folding patterns like this where the offset has a variable and a constant part:
i32 val = a[i] | (a[i + 1] << 8) | (a[i + 2] << 16) | (a[i + 3] << 24)
Matching the pattern above is easier at SelectionDAG level since address reassociation has already happened and the fact that the loads are adjacent is clear. Understanding that these loads are adjacent at IR level would have involved looking through geps/zexts/adds while looking at the addresses.
The general scheme is to match OR expressions by recursively calculating the origin of individual bytes which constitute the resulting OR value. If all the OR bytes come from memory verify that they are adjacent and match with little or big endian encoding of a wider value. If so and the load of the wider type (and bswap if needed) is allowed by the target generate a load and a bswap if needed.
Reviewed By: RKSimon, filcab, chandlerc
Differential Revision: https://reviews.llvm.org/D27861
llvm-svn: 293036
|
| |
|
|
|
|
|
|
|
|
|
| |
If dominator tree has no roots, the pass that calculates it is
likely to be skipped. It occures, for instance, in the case of
entities with linkage available_externally. Do not run tree
verification in such case.
Differential Revision: https://reviews.llvm.org/D28767
llvm-svn: 293033
|
| |
|
|
|
|
|
|
| |
clang already emits this with -cl-no-signed-zeros, but codegen
doesn't do anything with it. Treat it like the other fast math
attributes, and change one place to use it.
llvm-svn: 293024
|
| |
|
|
| |
llvm-svn: 293023
|
| |
|
|
| |
llvm-svn: 293019
|
| |
|
|
|
|
|
|
|
|
|
| |
Looks like our cmake goop for handling .inc->td dependencies doesn't
track the .td files.
This manifests as cmake complaining about missing files since r293009.
Force a rerun to avoid that.
llvm-svn: 293012
|