summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
...
* Use MCRegister in MCRegisterInfo's interfacesDaniel Sanders2019-08-022-5/+6
| | | | | | | | | | | | | | | | | Summary: As part of this, define DenseMapInfo for MCRegister (and Register while I'm at it) Depends on D65599 Reviewers: arsenm Subscribers: MatzeB, qcolombet, jvesely, wdng, nhaehnle, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65605 llvm-svn: 367719
* [Statepoints] Fix overalignment of loads in no-realign-stack functionsPhilip Reames2019-08-021-8/+15
| | | | | | | | This really should have been part of 366765. For some reason, I forgot to handle the corresponding load side, and the readable test cases (using deopt vs statepoints) turned out to be overly reduced. Oops. As seen in the test change, the problem was that we were using a load with alignment expectations rather than the unaligned variant when the stack alignment was less than that prefered type alignment. llvm-svn: 367718
* [ScalarizeMaskedMemIntrin] Add constant mask support to expandload and ↵Craig Topper2019-08-021-0/+34
| | | | | | | | | | compressstore scalarization This adds support for generating all the loads or stores for a constant mask into a single basic block with no conditionals. Differential Revision: https://reviews.llvm.org/D65613 llvm-svn: 367715
* [DAGCombiner] try to convert opposing shifts to castsSanjay Patel2019-08-021-0/+26
| | | | | | | | | | | | | | | | | | | | | This reverses a questionable IR canonicalization when a truncate is free: sra (add (shl X, N1C), AddC), N1C --> sext (add (trunc X to (width - N1C)), AddC') https://rise4fun.com/Alive/slRC More details in PR42644: https://bugs.llvm.org/show_bug.cgi?id=42644 I limited this to pre-legalization for code simplicity because that should be enough to reverse the IR patterns. I don't have any evidence (no regression test diffs) that we need to try this later. Differential Revision: https://reviews.llvm.org/D65607 llvm-svn: 367710
* Temporarily Revert "Changing representation of cv_def_range directives in ↵Eric Christopher2019-08-021-6/+17
| | | | | | | | | | Codeview debug info assembly format for better readability" This is breaking bots and the author asked me to revert. This reverts commit 367704. llvm-svn: 367707
* Changing representation of cv_def_range directives in Codeview debug info ↵Nilanjana Basu2019-08-021-17/+6
| | | | | | assembly format for better readability llvm-svn: 367704
* CodeGen: Don't follow aliases when extracting type info.Peter Collingbourne2019-08-021-1/+1
| | | | | | | | | | | This fixes a crash in the case where the type info object is an alias pointing to a non-zero offset within a global or is otherwise unanalyzable by the stripPointerCasts() function. Looking through the alias is not the right thing to do anyway for similar reasons as D65118. Differential Revision: https://reviews.llvm.org/D65314 llvm-svn: 367696
* GlobalISel: support swiftself attributeTim Northover2019-08-022-15/+0
| | | | llvm-svn: 367683
* [IPRA][ARM] Disable no-CSR optimisation for ARMOliver Stannard2019-08-022-2/+5
| | | | | | | | | | | This optimisation isn't generally profitable for ARM, because we can save/restore many registers in the prologue and epilogue using the PUSH and POP instructions, but mostly use individual LDR/STR instructions for other spills. Differential revision: https://reviews.llvm.org/D64910 llvm-svn: 367670
* Fix and test inter-procedural register allocation for ARMOliver Stannard2019-08-021-0/+7
| | | | | | | | | | | | | | - Avoid a crash when IPRA calls ARMFrameLowering::determineCalleeSaves with a null RegScavenger. Simply not updating the register scavenger is fine because IPRA only cares about the SavedRegs vector, the acutal code of the function has already been generated at this point. - Add a new hook to TargetRegisterInfo to get the set of registers which can be clobbered inside a call, even if the compiler can see both sides, by linker-generated code. Differential revision: https://reviews.llvm.org/D64908 llvm-svn: 367669
* [NFC][CodeGen] Modify the type element of TailCalls to simplify the ↵Kang Zhang2019-08-021-12/+9
| | | | | | | | | | | | | dupRetToEnableTailCallOpts() Summary: The old code can be simplified to define the element type of TailCalls as `BasicBlock` not `CallInst`. Also I use the for-range loop instead the for loop. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D64905 llvm-svn: 367644
* Temporarily revert "Changes to improve CodeView debug info type record ↵Eric Christopher2019-08-021-20/+20
| | | | | | | | | | inline comments" due to a sanitizer failure. This reverts commit 367623. llvm-svn: 367640
* Finish moving TargetRegisterInfo::isVirtualRegister() and friends to ↵Daniel Sanders2019-08-0173-538/+508
| | | | | | llvm::Register as started by r367614. NFC llvm-svn: 367633
* Changes to improve CodeView debug info type record inline commentsNilanjana Basu2019-08-011-20/+20
| | | | | Signed-off-by: Nilanjana Basu <nilanjana.basu87@gmail.com> llvm-svn: 367623
* GlobalISel: Lower scalarizing unmerge of a vector to shiftsMatt Arsenault2019-08-011-0/+35
| | | | | | | | | | | | | | AMDGPU sometimes has legal s16 and <2 x s16> operations, but all registers are really 32-bit. An unmerge destination really should ben widened to a 32-bit register. If widening a scalarizing vector with a target size that matches the vector size, bitcast to integer and extract the relevant bits with shifts. I'm not sure if this is the right place for this. This could arguably be part of widenScalar for the result. I also have a growing feeling that we're missing a bitcast legalize action. llvm-svn: 367604
* [X86] In decomposeMulByConstant, legalize the VT before querying whether the ↵Craig Topper2019-08-011-1/+1
| | | | | | | | | | | | multiply is legal If a type is larger than a legal type and needs to be split, we would previously allow the multiply to be decomposed even if the split multiply is legal. Since the shift + add/sub code would also need to be split, its not any better to decompose it. This patch figures out what type the mul will eventually be legalized to and then uses that type for the query. I tried just returning false illegal types and letting them get handled after type legalization, but then we can't recognize and i64 constant splat on 32-bit targets since will be destroyed by type legalization. We could special case vectors of i64 to avoid that... Differential Revision: https://reviews.llvm.org/D65533 llvm-svn: 367601
* CodeGen: Allow virtual registers in bundlesMatt Arsenault2019-08-011-2/+2
| | | | | | | | | | | | | | | The note in the documentation suggests this restriction is a compile time optimization for architectures that make heavy use of bundling. Allowing virtual registers in a bundle is useful for some (non-R600) AMDGPU use cases and are infrequent enough to matter. A more common AMDGPU use case has already been using virtual registers in bundles since r333691, although never calling finalizeBundle on them and manually creating the use/def list on the BUNDLE instruction. This is also relatively infrequent, and only happens for consecutive sequences of some load/store types. llvm-svn: 367597
* GlobalISel: Fix widenScalar for G_MERGE_VALUES to pointerMatt Arsenault2019-08-011-1/+3
| | | | | | | AMDGPU testcase isn't broken now, but will be in a future patch without this. llvm-svn: 367591
* [TargetLowering] SimplifyMultipleUseDemandedBits - Add ↵Simon Pilgrim2019-08-011-0/+10
| | | | | | | | ISD::INSERT_VECTOR_ELT handling Allow us to peek through vector insertions to avoid dependencies on entire insertion chains. llvm-svn: 367588
* [SelectionDAG] Use APInt::isSubsetOf/intersects to simplify some code.Craig Topper2019-08-011-2/+2
| | | | | | Also use KnownBits::isNegative/isNonNegative to further simplify. llvm-svn: 367518
* GlobalISel: moreElementsVector for G_LOAD/G_STOREMatt Arsenault2019-08-011-1/+11
| | | | | | | AMDGPU change and test is a placeholder until a future patch with complete handling. llvm-svn: 367503
* Create unique, but identically-named ELF sections for explicitly-sectioned ↵Peter Collingbourne2019-08-011-2/+17
| | | | | | | | | | | | | | functions and globals when using -function-sections and -data-sections. This allows functions and globals to to be reordered later in the linking phase (using the -symbol-ordering-file) even though reordering will be limited to the scope of the explicit section. Patch by Rahman Lavaee! Differential Revision: https://reviews.llvm.org/D65478 llvm-svn: 367501
* Revert "[MS] Emit S_HEAPALLOCSITE debug info in Selection DAG" andAmy Huang2019-07-311-10/+0
| | | | | | | | | | and partial fix. Causes windows buildbot errors. This reverts commit 6e65c34523963094acd0d6c94a5f5c64b32fe6aa and 53da7ca94343166ac68aef81db0398932fc258bb. llvm-svn: 367496
* [ScalarizeMaskedMemIntrin] Bitcast the mask to the scalar domain and use ↵Craig Topper2019-07-311-11/+72
| | | | | | | | | | | | | | | | | | | | | | | | | | | | scalar bit tests for the branches. X86 at least is able to use movmsk or kmov to move the mask to the scalar domain. Then we can just use test instructions to test individual bits. This is more efficient than extracting each mask element individually. I special cased v1i1 to use the previous behavior. This avoids poor type legalization of bitcast of v1i1 to i1. I've skipped expandload/compressstore as I think we need to handle constant masks for those better first. Many tests end up with duplicate test instructions due to tail duplication in the branch folding pass. But the same thing happens when constructing similar code in C. So its not unique to the scalarization. Not sure if this lowering code will also be good for other targets, but we're only testing X86 today. Differential Revision: https://reviews.llvm.org/D65319 llvm-svn: 367489
* Migrate some more fadd and fsub cases away from UnsafeFPMath control to ↵Michael Berg2019-07-312-7/+7
| | | | | | | | | | | | | | | | utilize NoSignedZerosFPMath options control Summary: Honoring no signed zeroes is also available as a user control through clang separately regardless of fastmath or UnsafeFPMath context, DAG guards should reflect this context. Reviewers: spatel, arsenm, hfinkel, wristow, craig.topper Reviewed By: spatel Subscribers: rampitec, foad, nhaehnle, wuzish, nemanjai, jvesely, wdng, javed.absar, MaskRay, jsji Differential Revision: https://reviews.llvm.org/D65170 llvm-svn: 367486
* Fix to r367374 "[MS] Emit S_HEAPALLOCSITE debug info in Selection DAG"Amy Huang2019-07-311-4/+8
| | | | | | | | | after windows buildbot failure. Added a check that the MachineInstr exists and is a call before trying to add symbols around it. llvm-svn: 367483
* Fix unused variable warning for non-assert builds.Eric Christopher2019-07-311-0/+1
| | | | llvm-svn: 367482
* [GISel] Pass MD_callees metadata down in call lowering.Mark Lacey2019-07-311-1/+5
| | | | | | | | | | | | | | | | | | | | Summary: This will make it possible to improve IPRA by taking into account register usage in indirect calls. NFC yet; this is just laying the groundwork to start building up patches to take advantage of the information for improved register allocation. Reviewers: aditya_nandakumar, volkan, qcolombet, arsenm, rovka, aemerson, paquette Subscribers: sdardis, wdng, javed.absar, hiraditya, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65488 llvm-svn: 367476
* SelectionDAG, MI, AArch64: Widen target flags fields/arguments from unsigned ↵Peter Collingbourne2019-07-312-18/+15
| | | | | | | | | | | | | char to unsigned. This makes the field wider than MachineOperand::SubReg_TargetFlags so that we don't end up silently truncating any higher bits. We should still catch any bits truncated from the MachineOperand field as a consequence of the assertion in MachineOperand::setTargetFlags(). Differential Revision: https://reviews.llvm.org/D65465 llvm-svn: 367474
* [DAGCombine] Limit the number of times for the same store and root nodesWei Mi2019-07-311-3/+42
| | | | | | | | | | | | | | to bail out in store merging dependence check. We run into a case where dependence check in store merging bail out many times for the same store and root nodes in a huge basicblock. That increases compile time by almost 100x. The patch add a map to track how many times the bailing out happen for the same store and root, and if it is over a limit, stop considering the store with the same root as a merging candidate. Differential Revision: https://reviews.llvm.org/D65174 llvm-svn: 367472
* Reland "[DwarfDebug] Dump call site debug info"Djordje Todorovic2019-07-3110-43/+421
| | | | | | | | | The build failure found after the rL365467 has been resolved. Differential Revision: https://reviews.llvm.org/D60716 llvm-svn: 367446
* [MS] Emit S_HEAPALLOCSITE debug info in SelectionDAGAmy Huang2019-07-311-0/+6
| | | | | | | | | | | | | | Summary: This emits labels around heapallocsite calls in SelectionDAG. Reviewers: rnk Subscribers: MatzeB, aprantl, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61105 llvm-svn: 367374
* GlobalISel: Add G_ATOMICRMW_{FADD|FSUB}Matt Arsenault2019-07-302-14/+36
| | | | llvm-svn: 367369
* [DAGCombiner] Add an option to control whether or not to enable store merging.Wei Mi2019-07-301-1/+6
| | | | | | | | | Add an option to control whether or not to enable store merging in dag combiner so we can workaround some bugs more easily. Differential Revision: https://reviews.llvm.org/D65482 llvm-svn: 367365
* [AMDGPU/GlobalISel] Add llvm.amdgcn.fdiv.fast legalization.Austin Kerbow2019-07-301-4/+6
| | | | | | | | | | | | | | Reviewers: arsenm Reviewed By: arsenm Subscribers: volkan, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, dstuttard, tpr, t-tye, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64966 llvm-svn: 367344
* Address post commit review comments on revision 366727.Sean Fertile2019-07-301-5/+5
| | | | | | | | | | | | | Addresses number of comment made on D64652 after commiting: - Reorders function decls in the TargetLoweringObjectFileXCOFF class. - Fix comment in MCSectionXCOFF to include description of external reference csects. - Convert several llvm_unreachables to report_fatal_error - Convert several dyn_casts to casts as they are expected not to fail. - Avoid copying DataLayout object. llvm-svn: 367324
* [DAGCombine] narrowInsertExtractVectorBinOp - early out for binops that ↵Simon Pilgrim2019-07-291-0/+4
| | | | | | | | change value type. NFCI. This is implicit in the value type checks in getSubVectorSrc - this just makes it upfront and obvious. llvm-svn: 367220
* [DAGCombine] narrowInsertExtractVectorBinOp - early out for illegal op. NFCI.Simon Pilgrim2019-07-271-1/+4
| | | | | | If the subvector binop is illegal then early-out and avoid the subvector searches. llvm-svn: 367181
* [TargetLowering] SimplifyMultipleUseDemandedBits - add BITCAST pass through ↵Simon Pilgrim2019-07-271-2/+59
| | | | | | | | | | | | support (Reapplied) This allows us to peek through BITCASTs, attempt to simplify the source operand, and then bitcast back. This reapplies rL367091 which was reverted at rL367118 - we were inconsistently peeking through the bitcasts to the source value. Fixes PR42777 llvm-svn: 367174
* [SelectionDAG] Check for any recursion depth greater than or equal to limit ↵Simon Pilgrim2019-07-272-5/+5
| | | | | | | | | | | | instead of just equal the limit. If anything called the recursive isKnownNeverNaN/computeKnownBits/ComputeNumSignBits/SimplifyDemandedBits/SimplifyMultipleUseDemandedBits with an incorrect depth then we could continue to recurse if we'd already exceeded the depth limit. This replaces the limit check (Depth == 6) with a (Depth >= 6) to make sure that we don't circumvent it. This causes a couple of regressions as a mixture of calls (SimplifyMultipleUseDemandedBits + combineX86ShufflesRecursively) were calling with depths that were already over the limit. I've fixed SimplifyMultipleUseDemandedBits to not do this. combineX86ShufflesRecursively is trickier as we get a lot of regressions if we reduce its own limit from 8 to 6 (it also starts at Depth == 1 instead of Depth == 0 like the others....) - I'll see what I can do in future patches. llvm-svn: 367171
* [TargetLowering] Add depth limit to SimplifyMultipleUseDemandedBitsSimon Pilgrim2019-07-271-0/+3
| | | | | | We're getting reports of massive compile time increases because SimplifyMultipleUseDemandedBits was losing track of the depth and not earlying-out. No repro yet, but consider this a pre-emptive commit. llvm-svn: 367169
* [AArch64][GlobalISel] Implement narrowing of G_SEXT.Amara Emerson2019-07-261-0/+20
| | | | | | | | We need this to narrow a sext to s128. Differential Revision: https://reviews.llvm.org/D65357 llvm-svn: 367164
* [PowerPC][AIX]Add lowering of MCSymbol MachineOperand.Sean Fertile2019-07-261-0/+3
| | | | | | | | | | | Adds machine operand lowering for MCSymbolSDNodes to the PowerPC backend. This is needed to produce call instructions in assembly for AIX because the callee operand is a MCSymbolSDNode. The test is XFAIL'ed for asserts due to a (valid) assertion in PEI that the AIX ABI isn't supported yet. Differential Revision: https://reviews.llvm.org/D63738 llvm-svn: 367133
* Revert r367091, it caused PR42777.Nico Weber2019-07-261-59/+2
| | | | llvm-svn: 367118
* [SelectionDAG] GetDemandedBits - update SIGN_EXTEND_INREG op to just call ↵Simon Pilgrim2019-07-261-9/+1
| | | | | | SimplifyMultipleUseDemandedBits. llvm-svn: 367098
* [TargetLowering] SimplifyMultipleUseDemandedBits - add SIGN_EXTEND_INREG ↵Simon Pilgrim2019-07-261-0/+7
| | | | | | support. llvm-svn: 367096
* [SelectionDAG] GetDemandedBits - update OR/XOR ops to just call ↵Simon Pilgrim2019-07-261-6/+2
| | | | | | | | SimplifyMultipleUseDemandedBits. Eventually all of these will be moved over, but we create nodes in GetDemandedBits recursion at the moment which causes regressions when we try to remove them all. llvm-svn: 367092
* [TargetLowering] SimplifyMultipleUseDemandedBits - add BITCAST pass through ↵Simon Pilgrim2019-07-261-2/+59
| | | | | | | | support. This allows us to peek through BITCASTs and attempt simplify the source operand, and then bitcast back. llvm-svn: 367091
* Some case eror for: detected memory leaksKang Zhang2019-07-261-25/+0
| | | | llvm-svn: 367083
* [PowerPC] Do the Simple Early Return in block-placement pass to optimize the ↵Kang Zhang2019-07-261-0/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | blocks Summary: In `block-placement` pass, it will create some patterns for unconditional we can do the simple early retrun. But the `early-ret` pass is before `block-placement`, we don't want to run it again. This patch is to do the simple early return to optimize the blocks at the last of `block-placement`. Below is an example ``` BB: | BB: XOR 3, 3, 4 | XOR 3, 3, 4 B TBB | B ChainBB ... | ... ChainBB: | ChainBB: B TBB | ADD 3, 3, 4 ... | BLR TBB: | ADD 3, 3, 4 | BLR | ``` Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D63972 llvm-svn: 367080
OpenPOWER on IntegriCloud