Summary:
As part of this, define DenseMapInfo for MCRegister (and Register while I'm at it)
Depends on D65599
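For illustration, a DenseMapInfo specialization for a register wrapper usually has the following shape. This is a sketch only: delegating to DenseMapInfo<unsigned> and the Register::id() accessor are assumptions, not necessarily what the committed patch does.
```
#include "llvm/ADT/DenseMapInfo.h"
#include "llvm/CodeGen/Register.h"

namespace llvm {
// Register is a thin wrapper around an unsigned register id, so the
// specialization can delegate everything to DenseMapInfo<unsigned>.
template <> struct DenseMapInfo<Register> {
  static inline Register getEmptyKey() {
    return Register(DenseMapInfo<unsigned>::getEmptyKey());
  }
  static inline Register getTombstoneKey() {
    return Register(DenseMapInfo<unsigned>::getTombstoneKey());
  }
  static unsigned getHashValue(const Register &Val) {
    return DenseMapInfo<unsigned>::getHashValue(Val.id());
  }
  static bool isEqual(const Register &LHS, const Register &RHS) {
    return LHS == RHS;
  }
};
} // namespace llvm
```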
Reviewers: arsenm
Subscribers: MatzeB, qcolombet, jvesely, wdng, nhaehnle, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65605
llvm-svn: 367719

This really should have been part of r366765. For some reason, I forgot to handle the corresponding load side, and the readable test cases (using deopt vs statepoints) turned out to be overly reduced. Oops.
As seen in the test change, the problem was that we were using a load with alignment expectations rather than the unaligned variant when the stack alignment was less than the preferred type alignment.
llvm-svn: 367718

compressstore scalarization
This adds support for generating all the loads or stores for a constant mask into a single basic block with no conditionals.
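For illustration, a scalar C++ model of what the single-block expansion computes for a hypothetical constant mask of 0b1011; the actual patch emits the equivalent DAG nodes rather than C code.
```
#include <cstdint>

// masked.compressstore with a compile-time mask: enabled lanes are
// written to consecutive slots, so no conditional branches are needed.
void compressstoreMask0b1011(const int32_t *Src, int32_t *Dst) {
  *Dst++ = Src[0]; // lane 0 enabled: store and advance
  *Dst++ = Src[1]; // lane 1 enabled: store and advance
  // lane 2 disabled: no store, destination does not advance
  *Dst++ = Src[3]; // lane 3 enabled: store and advance
}
```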
Differential Revision: https://reviews.llvm.org/D65613
llvm-svn: 367715

This reverses a questionable IR canonicalization when a truncate
is free:
sra (add (shl X, N1C), AddC), N1C -->
sext (add (trunc X to (width - N1C)), AddC')
https://rise4fun.com/Alive/slRC
More details in PR42644:
https://bugs.llvm.org/show_bug.cgi?id=42644
I limited this to pre-legalization for code simplicity because that
should be enough to reverse the IR patterns. I don't have any
evidence (no regression test diffs) that we need to try this later.
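A standalone C++ check of the underlying identity for W = 32 and N1C = 16. It assumes AddC's low N1C bits are zero (which the IR canonicalization being reversed guarantees) and that >> on a negative int is an arithmetic shift, as on mainstream targets.
```
#include <cassert>
#include <cstdint>

// sra (add (shl X, 16), AddC), 16
int32_t before(int32_t X, int32_t AddC) {
  uint32_t Shl = (uint32_t)X << 16;              // shl X, 16
  int32_t Sum = (int32_t)(Shl + (uint32_t)AddC); // wrapping add
  return Sum >> 16;                              // arithmetic shift right
}

// sext (add (trunc X to i16), AddC')  with  AddC' = AddC >> 16
int32_t after(int32_t X, int32_t AddC) {
  int16_t Trunc = (int16_t)X;
  int16_t AddCp = (int16_t)((uint32_t)AddC >> 16);
  return (int16_t)(Trunc + AddCp); // 16-bit add, then sign-extend
}

int main() {
  for (int32_t X : {0, 1, -1, 0x12345, -70000})
    for (int32_t AddC : {0, 1 << 16, -(1 << 16), 0x7fff0000})
      assert(before(X, AddC) == after(X, AddC));
}
```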
Differential Revision: https://reviews.llvm.org/D65607
llvm-svn: 367710

Codeview debug info assembly format for better readability"
This is breaking bots and the author asked me to revert.
This reverts commit 367704.
llvm-svn: 367707

assembly format for better readability
llvm-svn: 367704

This fixes a crash in the case where the type info object is an alias
pointing to a non-zero offset within a global or is otherwise unanalyzable
by the stripPointerCasts() function. Looking through the alias is not the
right thing to do anyway for similar reasons as D65118.
Differential Revision: https://reviews.llvm.org/D65314
llvm-svn: 367696

llvm-svn: 367683

This optimisation isn't generally profitable for ARM, because we can
save/restore many registers in the prologue and epilogue using the PUSH
and POP instructions, but mostly use individual LDR/STR instructions for
other spills.
Differential revision: https://reviews.llvm.org/D64910
llvm-svn: 367670

- Avoid a crash when IPRA calls ARMFrameLowering::determineCalleeSaves
  with a null RegScavenger. Simply not updating the register scavenger
  is fine because IPRA only cares about the SavedRegs vector; the actual
  code of the function has already been generated at this point.
- Add a new hook to TargetRegisterInfo to get the set of registers which
  can be clobbered inside a call by linker-generated code, even if the
  compiler can see both sides.
Differential revision: https://reviews.llvm.org/D64908
llvm-svn: 367669

dupRetToEnableTailCallOpts()
Summary:
The old code can be simplified by defining the element type of TailCalls as `BasicBlock` rather than `CallInst`. Also, use a range-based for loop instead of a plain for loop.
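A minimal sketch of the resulting shape; the surrounding function is hypothetical.
```
#include "llvm/ADT/SmallVector.h"
#include "llvm/IR/BasicBlock.h"

using namespace llvm;

// TailCalls now holds the blocks themselves rather than CallInsts,
// and is walked with a range-based for loop.
void processTailCalls(const SmallVectorImpl<BasicBlock *> &TailCalls) {
  for (BasicBlock *BB : TailCalls) {
    (void)BB; // ... duplicate the return instruction into BB ...
  }
}
```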
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D64905
llvm-svn: 367644

inline comments"
due to a sanitizer failure.
This reverts commit 367623.
llvm-svn: 367640

llvm::Register as started by r367614. NFC
llvm-svn: 367633

Signed-off-by: Nilanjana Basu <nilanjana.basu87@gmail.com>
llvm-svn: 367623

AMDGPU sometimes has legal s16 and <2 x s16> operations, but all
registers are really 32-bit. An unmerge destination really should be
widened to a 32-bit register. When widening a scalarizing unmerge of a
vector whose target size matches the vector size, bitcast to integer
and extract the relevant bits with shifts.
I'm not sure if this is the right place for this. This could arguably
be part of widenScalar for the result. I also have a growing feeling
that we're missing a bitcast legalize action.
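A scalar model of the bitcast-and-shift extraction for <2 x s16>, assuming little-endian lane order; the legalizer would emit the equivalent generic operations.
```
#include <cstdint>

// Unmerge <2 x s16> into two s16 pieces by reinterpreting the whole
// vector as one 32-bit integer and shifting out each element.
void unmergeViaBitcast(uint32_t VecBits, uint16_t &Lo, uint16_t &Hi) {
  Lo = (uint16_t)(VecBits & 0xFFFFu); // element 0
  Hi = (uint16_t)(VecBits >> 16);     // element 1
}
```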
llvm-svn: 367604

multiply is legal
If a type is larger than a legal type and needs to be split, we would previously allow the multiply to be decomposed even if the split multiply is legal. Since the shift + add/sub code would also need to be split, it's not any better to decompose it.
This patch figures out what type the mul will eventually be legalized to and then uses that type for the query. I tried just returning false for illegal types and letting them get handled after type legalization, but then we can't recognize an i64 constant splat on 32-bit targets since it will be destroyed by type legalization. We could special-case vectors of i64 to avoid that...
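For reference, the kind of decomposition being gated, written out in C++ for a constant multiplier of 10. If the type later has to be split, each of these shifts and the add split too, so nothing is saved.
```
#include <cassert>
#include <cstdint>

// x * 10 = x * (8 + 2) = (x << 3) + (x << 1)
uint64_t mulBy10(uint64_t X) { return (X << 3) + (X << 1); }

int main() { assert(mulBy10(7) == 70); }
```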
Differential Revision: https://reviews.llvm.org/D65533
llvm-svn: 367601

The note in the documentation suggests this restriction is a compile
time optimization for architectures that make heavy use of
bundling. Allowing virtual registers in a bundle is useful for some
(non-R600) AMDGPU use cases, and such cases are infrequent enough not to matter.
A more common AMDGPU use case has already been using virtual registers
in bundles since r333691, although never calling finalizeBundle on
them and manually creating the use/def list on the BUNDLE
instruction. This is also relatively infrequent, and only happens for
consecutive sequences of some load/store types.
llvm-svn: 367597

AMDGPU testcase isn't broken now, but will be in a future patch
without this.
llvm-svn: 367591

ISD::INSERT_VECTOR_ELT handling
Allow us to peek through vector insertions to avoid dependencies on entire insertion chains.
llvm-svn: 367588

Also use KnownBits::isNegative/isNonNegative to further simplify.
llvm-svn: 367518

The AMDGPU change and test are placeholders until a future patch with
complete handling.
llvm-svn: 367503

functions and globals when using -function-sections and -data-sections.
This allows functions and globals to be reordered later in the linking phase
(using the -symbol-ordering-file) even though reordering will be limited to
the scope of the explicit section.
Patch by Rahman Lavaee!
Differential Revision: https://reviews.llvm.org/D65478
llvm-svn: 367501

and partial fix.
Causes Windows buildbot errors.
This reverts commit 6e65c34523963094acd0d6c94a5f5c64b32fe6aa and
53da7ca94343166ac68aef81db0398932fc258bb.
llvm-svn: 367496

scalar bit tests for the branches.
X86 at least is able to use movmsk or kmov to move the mask to the scalar
domain. Then we can just use test instructions to test individual bits.
This is more efficient than extracting each mask element
individually.
I special cased v1i1 to use the previous behavior. This avoids
poor type legalization of bitcast of v1i1 to i1.
I've skipped expandload/compressstore as I think we need to
handle constant masks for those better first.
Many tests end up with duplicate test instructions due to tail
duplication in the branch folding pass. But the same thing
happens when constructing similar code in C. So it's not unique
to the scalarization.
Not sure if this lowering code will also be good for other targets,
but we're only testing X86 today.
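A scalar C++ model of the new control flow for a hypothetical v4i32 masked store; the mask-packing step corresponds to movmsk/kmov, and each guard is a single-bit test instead of an element extraction.
```
#include <cstdint>

// MaskBits is the vector mask moved to the scalar domain.
void maskedStoreV4(uint32_t MaskBits, const int32_t *Src, int32_t *Dst) {
  if (MaskBits & 1) Dst[0] = Src[0];
  if (MaskBits & 2) Dst[1] = Src[1];
  if (MaskBits & 4) Dst[2] = Src[2];
  if (MaskBits & 8) Dst[3] = Src[3];
}
```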
Differential Revision: https://reviews.llvm.org/D65319
llvm-svn: 367489

utilize NoSignedZerosFPMath options control
Summary: Honoring no signed zeros is also available as a user control through clang, separately from any fast-math or UnsafeFPMath context; DAG guards should reflect this.
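A sketch of the guard shape being argued for. The helper and its call site are hypothetical, but SDNodeFlags::hasNoSignedZeros() and TargetOptions::NoSignedZerosFPMath are the existing knobs.
```
#include "llvm/CodeGen/SelectionDAGNodes.h"
#include "llvm/Target/TargetOptions.h"

// Respect NSZ from either the per-node fast-math flags or the global
// option, rather than keying on UnsafeFPMath alone.
bool canIgnoreSignedZeros(const llvm::SDNodeFlags &Flags,
                          const llvm::TargetOptions &Options) {
  return Options.NoSignedZerosFPMath || Flags.hasNoSignedZeros();
}
```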
Reviewers: spatel, arsenm, hfinkel, wristow, craig.topper
Reviewed By: spatel
Subscribers: rampitec, foad, nhaehnle, wuzish, nemanjai, jvesely, wdng, javed.absar, MaskRay, jsji
Differential Revision: https://reviews.llvm.org/D65170
llvm-svn: 367486

after Windows buildbot failure.
Added a check that the MachineInstr exists and is a call before trying
to add symbols around it.
llvm-svn: 367483

llvm-svn: 367482

Summary:
This will make it possible to improve IPRA by taking into account
register usage in indirect calls.
NFC yet; this is just laying the groundwork to start building
up patches to take advantage of the information for improved register
allocation.
Reviewers: aditya_nandakumar, volkan, qcolombet, arsenm, rovka, aemerson, paquette
Subscribers: sdardis, wdng, javed.absar, hiraditya, jrtc27, atanasyan, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65488
llvm-svn: 367476

char to unsigned.
This makes the field wider than MachineOperand::SubReg_TargetFlags so that
we don't end up silently truncating any higher bits. We should still catch
any bits truncated from the MachineOperand field as a consequence of the
assertion in MachineOperand::setTargetFlags().
Differential Revision: https://reviews.llvm.org/D65465
llvm-svn: 367474

to bail out in store merging dependence check.
We run into a case where the dependence check in store merging bails out many times
for the same store and root nodes in a huge basic block. That increases compile
time by almost 100x. The patch adds a map to track how many times the bailout
happens for the same store and root, and if it is over a limit, stop
considering the store with the same root as a merging candidate.
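A hypothetical sketch of the counting-map idea; the names and the limit value are illustrative, not the committed code.
```
#include "llvm/ADT/DenseMap.h"
#include <utility>

// Count how often the dependence check gives up for a (store, root)
// pair and stop treating that store as a merge candidate at a limit.
struct StoreMergeLimiter {
  static constexpr unsigned BailoutLimit = 10; // illustrative
  llvm::DenseMap<std::pair<const void *, const void *>, unsigned> Bailouts;

  // Returns true once this pair has bailed out too many times.
  bool recordBailout(const void *Store, const void *Root) {
    return ++Bailouts[{Store, Root}] > BailoutLimit;
  }
};
```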
Differential Revision: https://reviews.llvm.org/D65174
llvm-svn: 367472

The build failure found after rL365467 has been
resolved.
Differential Revision: https://reviews.llvm.org/D60716
llvm-svn: 367446

Summary: This emits labels around heapallocsite calls in SelectionDAG.
Reviewers: rnk
Subscribers: MatzeB, aprantl, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61105
llvm-svn: 367374

llvm-svn: 367369

Add an option to control whether or not to enable store merging in the DAG combiner
so we can work around some bugs more easily.
Differential Revision: https://reviews.llvm.org/D65482
llvm-svn: 367365

Reviewers: arsenm
Reviewed By: arsenm
Subscribers: volkan, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, dstuttard, tpr, t-tye, hiraditya, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64966
llvm-svn: 367344

Addresses a number of comments made on D64652 after committing:
- Reorder function decls in the TargetLoweringObjectFileXCOFF class.
- Fix comment in MCSectionXCOFF to include a description of external reference
  csects.
- Convert several llvm_unreachables to report_fatal_error.
- Convert several dyn_casts to casts, as they are expected not to fail.
- Avoid copying the DataLayout object.
llvm-svn: 367324

change value type. NFCI.
This is implicit in the value type checks in getSubVectorSrc - this just makes it upfront and obvious.
llvm-svn: 367220

If the subvector binop is illegal then early-out and avoid the subvector searches.
llvm-svn: 367181

support (Reapplied)
This allows us to peek through BITCASTs, attempt to simplify the source operand, and then bitcast back.
This reapplies rL367091 which was reverted at rL367118 - we were inconsistently peeking through the bitcasts to the source value.
Fixes PR42777
llvm-svn: 367174

instead of just equal the limit.
If anything called the recursive isKnownNeverNaN/computeKnownBits/ComputeNumSignBits/SimplifyDemandedBits/SimplifyMultipleUseDemandedBits with an incorrect depth then we could continue to recurse if we'd already exceeded the depth limit.
This replaces the limit check (Depth == 6) with a (Depth >= 6) to make sure that we don't circumvent it.
This causes a couple of regressions as a mixture of calls (SimplifyMultipleUseDemandedBits + combineX86ShufflesRecursively) were calling with depths that were already over the limit. I've fixed SimplifyMultipleUseDemandedBits to not do this. combineX86ShufflesRecursively is trickier, as we get a lot of regressions if we reduce its own limit from 8 to 6 (it also starts at Depth == 1 instead of Depth == 0 like the others...) - I'll see what I can do in future patches.
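In shape, the fix is a one-character change to the recursion guard (illustrative, not the exact source):
```
// Before: only an exact match stopped the recursion, so a call that
// entered already past the limit kept recursing.
//   if (Depth == 6) return false;
// After: at or beyond the limit always stops.
bool atDepthLimit(unsigned Depth) { return Depth >= 6; }
```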
llvm-svn: 367171

We're getting reports of massive compile-time increases because SimplifyMultipleUseDemandedBits was losing track of the depth and failing to early-out. No repro yet, but consider this a pre-emptive commit.
llvm-svn: 367169

We need this to narrow a sext to s128.
Differential Revision: https://reviews.llvm.org/D65357
llvm-svn: 367164

Adds machine operand lowering for MCSymbolSDNodes to the PowerPC
backend. This is needed to produce call instructions in assembly for AIX
because the callee operand is an MCSymbolSDNode. The test is XFAIL'ed for
asserts due to a (valid) assertion in PEI that the AIX ABI isn't supported yet.
Differential Revision: https://reviews.llvm.org/D63738
llvm-svn: 367133

llvm-svn: 367118

SimplifyMultipleUseDemandedBits.
llvm-svn: 367098

support.
llvm-svn: 367096

SimplifyMultipleUseDemandedBits.
Eventually all of these will be moved over, but at the moment we create nodes in the GetDemandedBits recursion, which causes regressions when we try to remove them all.
llvm-svn: 367092

support.
This allows us to peek through BITCASTs and attempt to simplify the source operand, and then bitcast back.
llvm-svn: 367091

llvm-svn: 367083

blocks
Summary:
In the `block-placement` pass, it will create some patterns of unconditional branches for which we can do a simple early return.
But the `early-ret` pass runs before `block-placement`, so we don't want to run it again.
This patch does the simple early return to optimize such blocks at the end of `block-placement`.
Below is an example
```
BB:           | BB:
  XOR 3, 3, 4 |   XOR 3, 3, 4
  B TBB       |   B ChainBB
...           | ...
ChainBB:      | ChainBB:
  B TBB       |   ADD 3, 3, 4
...           |   BLR
TBB:          |
  ADD 3, 3, 4 |
  BLR         |
```
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D63972
llvm-svn: 367080