summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
* Re-commit r315885: [globalisel][tblgen] Add support for iPTR and implement ↵Daniel Sanders2017-10-161-0/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | am_unscaled* and am_indexed* Summary: iPTR is a pointer of subtarget-specific size to any address space. Therefore type checks on this size derive the SizeInBits from a subtarget hook. At this point, we can import the simplests G_LOAD rules and select load instructions using them. Further patches will support for the predicates to enable additional loads as well as the stores. The previous commit failed on MSVC due to a failure to convert an initializer_list to a std::vector. Hopefully, MSVC will accept this version. Depends on D37457 Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar Reviewed By: qcolombet Subscribers: kristof.beyls, javed.absar, llvm-commits, igorb Differential Revision: https://reviews.llvm.org/D37458 llvm-svn: 315887
* Revert r315885: [globalisel][tblgen] Add support for iPTR and implement ↵Daniel Sanders2017-10-161-18/+0
| | | | | | | | am_unscaled* and am_indexed* MSVC doesn't like one of the constructors. llvm-svn: 315886
* [globalisel][tblgen] Add support for iPTR and implement am_unscaled* and ↵Daniel Sanders2017-10-161-0/+18
| | | | | | | | | | | | | | | | | | | | | | | | am_indexed* Summary: iPTR is a pointer of subtarget-specific size to any address space. Therefore type checks on this size derive the SizeInBits from a subtarget hook. At this point, we can import the simplests G_LOAD rules and select load instructions using them. Further patches will support for the predicates to enable additional loads as well as the stores. Depends on D37457 Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar Reviewed By: qcolombet Subscribers: kristof.beyls, javed.absar, llvm-commits, igorb Differential Revision: https://reviews.llvm.org/D37458 llvm-svn: 315885
* Re-commit r315863: [globalisel][tablegen] Import ComplexPattern when used as ↵Daniel Sanders2017-10-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | an operator Summary: It's possible for a ComplexPattern to be used as an operator in a match pattern. This is used by the load/store patterns in AArch64 to name the suboperands returned by ComplexPattern predicate so that they can be broken apart and referenced independently in the result pattern. This patch adds support for this in order to enable the import of load/store patterns. Depends on D37445 Hopefully fixed the ambiguous constructor that a large number of bots reported. Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar Reviewed By: qcolombet Subscribers: aemerson, javed.absar, igorb, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D37456 llvm-svn: 315869
* Revert r315863: [globalisel][tablegen] Import ComplexPattern when used as an ↵Daniel Sanders2017-10-151-1/+1
| | | | | | | | operator A large number of bots are failing on an ambiguous constructor call. llvm-svn: 315866
* [globalisel][tablegen] Import ComplexPattern when used as an operatorDaniel Sanders2017-10-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | Summary: It's possible for a ComplexPattern to be used as an operator in a match pattern. This is used by the load/store patterns in AArch64 to name the suboperands returned by ComplexPattern predicate so that they can be broken apart and referenced independently in the result pattern. This patch adds support for this in order to enable the import of load/store patterns. Depends on D37445 Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar Reviewed By: qcolombet Subscribers: aemerson, javed.absar, igorb, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D37456 llvm-svn: 315863
* Reverting r315590; it did not include changes for llvm-tblgen, which is ↵Aaron Ballman2017-10-1539-62/+62
| | | | | | | | causing link errors for several people. Error LNK2019 unresolved external symbol "public: void __cdecl `anonymous namespace'::MatchableInfo::dump(void)const " (?dump@MatchableInfo@?A0xf4f1c304@@QEBAXXZ) referenced in function "public: void __cdecl `anonymous namespace'::AsmMatcherEmitter::run(class llvm::raw_ostream &)" (?run@AsmMatcherEmitter@?A0xf4f1c304@@QEAAXAEAVraw_ostream@llvm@@@Z) llvm-tblgen D:\llvm\2017\utils\TableGen\AsmMatcherEmitter.obj 1 llvm-svn: 315854
* [RegisterBankInfo] Cache the getMinimalPhysRegClass informationQuentin Colombet2017-10-132-8/+21
| | | | | | | | | | | | | | | TargetRegisterInfo::getMinimalPhysRegClass is actually pretty expensive because it has to iterate over all the register classes. Cache this information as we need and get it so that we limit its usage. Right now, we heavily rely on it, because this is how we get the mapping for vregs defined by copies from physreg (i.e., the one that are ABI related). Improve compile time by up to 10% for that pass. NFC llvm-svn: 315759
* [Legalizer] Use SmallSetVector instead of SetVector.Quentin Colombet2017-10-131-1/+1
| | | | | | NFC llvm-svn: 315758
* [LegalizerInfo] Don't evaluate end boundary every time through the loopQuentin Colombet2017-10-131-3/+4
| | | | | | | | Match the LLVM coding standard for loop conditions. NFC. llvm-svn: 315757
* [Legalizer] Only allocate the SetVectors once per function.Quentin Colombet2017-10-131-3/+5
| | | | | | | | | | | | Prior to this patch we used to create SetVectors in temporaries that were created and destroyed for each instruction. Now, instead we create and destroyed them only once, but clear the content for each instruction. This speeds up the pass by ~25%. NFC. llvm-svn: 315756
* DAG: Add opcode and source type to isFPExtFreeMatt Arsenault2017-10-131-235/+253
| | | | | | | | This is only currently used for mad/fma transforms. This is the only case where it should be used for AMDGPU, so add an opcode to be sure. llvm-svn: 315740
* DAG: Add flags to dumpsMatt Arsenault2017-10-131-0/+30
| | | | llvm-svn: 315690
* [SelectionDAG] Cleanup the SIGN_EXTEND_INREG handling in computeKnownBits. NFCICraig Topper2017-10-131-26/+14
| | | | | | Use less temporary APInts. Use bit counting more. Don't call getScalarSizeInBits so many places, just capture it once. llvm-svn: 315671
* [SelectionDAG] Fix typo in comment. NFCCraig Topper2017-10-131-1/+1
| | | | llvm-svn: 315670
* [SelectionDAG] Correct the early out in SelectionDAG::getZeroExtendInReg to ↵Craig Topper2017-10-131-1/+1
| | | | | | | | work properly for vector types. I don't know if we ever hit this case or not. Turning it into an assert only fired on expanding some atomic operation in a SystemZ lit test. llvm-svn: 315648
* Add DK_Remark to SMDiagnosticAdam Nemet2017-10-121-0/+3
| | | | | | | | | | | Swift uses SMDiagnostic for diagnostic messages. For https://github.com/apple/swift/pull/12294, we need remark support. I picked the color that clang uses to display them. Differential Revision: https://reviews.llvm.org/D38865 llvm-svn: 315642
* [SelectionDAG] Const-correct the DemandedMask argument to one of the ↵Craig Topper2017-10-121-1/+1
| | | | | | overloads of SimplifyDemandedBits. NFC llvm-svn: 315641
* Revert "TargetMachine: Merge TargetMachine and LLVMTargetMachine"Matthias Braun2017-10-125-443/+39
| | | | | | | | | | Reverting to investigate layering effects of MCJIT not linking libCodeGen but using TargetMachine::getNameWithPrefix() breaking the lldb bots. This reverts commit r315633. llvm-svn: 315637
* Deprecate DwarfUnit::addBlockByrefAddress().Adrian Prantl2017-10-121-0/+6
| | | | | | | | | | | | The clang frontend already creates a DIExpression that replicates the logic in addBlockByrefAddress() exactly, thus making this function effectively unreachable. To guard against human error I'm hereby marking the function with an assertion and let it hit the bots before eventually removing it. rdar://problem/31629055 llvm-svn: 315636
* TargetMachine: Merge TargetMachine and LLVMTargetMachineMatthias Braun2017-10-125-39/+443
| | | | | | | | | | | | | | | Merge LLVMTargetMachine into TargetMachine. - There is no in-tree target anymore that just implements TargetMachine but not LLVMTargetMachine. - It should still be possible to stub out all the various functions in case a target does not want to use lib/CodeGen - This simplifies the code and avoids methods ending up in the wrong interface. Differential Revision: https://reviews.llvm.org/D38489 llvm-svn: 315633
* [SelectionDAG] Simplify the ISD::SIGN_EXTEND/ZERO_EXTEND handling to use ↵Craig Topper2017-10-121-25/+11
| | | | | | less temporary APInts by counting bits instead. NFCI llvm-svn: 315628
* [DWARF] Fix bad comparator in sortGlobalExprs.Eli Friedman2017-10-121-7/+12
| | | | | | | | | | | | | | | | | | The comparator passed to std::sort must provide a strict weak ordering; otherwise, the behavior is undefined. Fixes an assertion failure generating debug info for globals split by GlobalOpt. I have a testcase, but not sure how to reduce it, so not included here. (Someone else came up with a testcase, but I can't reproduce the crash with it, presumably because my version of LLVM ends up sorting the array differently.) This isn't really a complete fix (see the FIXME in the patch), but at least it doesn't have undefined behavior. Differential Revision: https://reviews.llvm.org/D38830 llvm-svn: 315619
* Implement custom lowering for ISD::CTTZ_ZERO_UNDEF and ISD::CTTZ.Wei Ding2017-10-121-4/+15
| | | | | | Differential Revision: http://reviews.llvm.org/D37348 llvm-svn: 315610
* [dump] Remove NDEBUG from test to enable dump methods [NFC]Don Hinton2017-10-1239-62/+62
| | | | | | | | | | | | | | | Summary: Add LLVM_FORCE_ENABLE_DUMP cmake option, and use it along with LLVM_ENABLE_ASSERTIONS to set LLVM_ENABLE_DUMP. Remove NDEBUG and only use LLVM_ENABLE_DUMP to enable dump methods. Move definition of LLVM_ENABLE_DUMP from config.h to llvm-config.h so it'll be picked up by public headers. Differential Revision: https://reviews.llvm.org/D38406 llvm-svn: 315590
* MachineInstr: Make isEqual agree with getHashValue in ↵Diana Picus2017-10-121-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | MachineInstrExpressionTrait MachineInstr::isIdenticalTo has a lot of logic for dealing with register Defs (i.e. deciding whether to take them into account or ignore them). This logic gets things wrong in some obscure cases, for instance if an operand is not a Def for both the current MI and the one we are comparing to. I'm not sure if it's possible for this to happen for regular register operands, but it may happen in the ARM backend for special operands which use sentinel values for the register (i.e. 0, which is neither a physical register nor a virtual one). This causes MachineInstrExpressionTrait::isEqual (which uses MachineInstr::isIdenticalTo) to return true for the following instructions, which are the same except for the fact that one sets the flags and the other one doesn't: %1114 = ADDrsi %1113, %216, 17, 14, _, def _ %1115 = ADDrsi %1113, %216, 17, 14, _, _ OTOH, MachineInstrExpressionTrait::getHashValue returns different values for the 2 instructions due to the different isDef on the last operand. In practice this means that when trying to add those instructions to a DenseMap, they will be considered different because of their different hash values, but when growing the map we might get an assertion while copying from the old buckets to the new buckets because isEqual misleadingly returns true. This patch makes sure that isEqual and getHashValue agree, by improving the checks in MachineInstr::isIdenticalTo when we are ignoring virtual register definitions (which is what the Trait uses). Firstly, instead of checking isPhysicalRegister, we use !isVirtualRegister, so that we cover both physical registers and sentinel values. Secondly, instead of checking MachineOperand::isReg, we use MachineOperand::isIdenticalTo, which checks isReg, isSubReg and isDef, which are the same values that the hash function uses to compute the hash. Note that the function is symmetric with this change, since if the current operand is not a Def, we check MachineOperand::isIdenticalTo, which returns false if the operands have different isDef's. Differential Revision: https://reviews.llvm.org/D38789 llvm-svn: 315579
* Reinstantiate old/bad deduplication logic that was removed in r315279.Daniel Jasper2017-10-121-0/+10
| | | | | | | | | | | While this shouldn't be necessary anymore, we have cases where we run into the assertion below, i.e. cases with two non-fragment entries for the same variable at different frame indices. This should be fixed, but for now, we should revert to a version that does not trigger asserts. llvm-svn: 315576
* [ScheduleDAGInstrs] fix behavior of getUnderlyingObjectsForCodeGen when no ↵Hiroshi Inoue2017-10-121-10/+17
| | | | | | | | | | | | | | | identifiable object found This patch fixes the bug introduced in https://reviews.llvm.org/D35907; the bug is reported by http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20171002/491452.html. Before D35907, when GetUnderlyingObjects fails to find an identifiable object, allMMOsOkay lambda in getUnderlyingObjectsForInstr returns false and Objects vector is cleared. This behavior is unintentionally changed by D35907. This patch makes the behavior for such case same as the previous behavior. Since D35907 introduced a wrapper function getUnderlyingObjectsForCodeGen around GetUnderlyingObjects, getUnderlyingObjectsForCodeGen is modified to return a boolean value to ask the caller to clear the Objects vector. Differential Revision: https://reviews.llvm.org/D38735 llvm-svn: 315565
* [RegisterCoalescer] Don't set read-undef in pruneValues, only clearMikael Holmen2017-10-121-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The comments in the code said // Remove <def,read-undef> flags. This def is now a partial redef. but the code didn't just remove read-undef, it could introduce new ones which could cause errors. E.g. if we have something like %vreg1<def> = IMPLICIT_DEF %vreg2:subreg1<def, read-undef> = op %vreg3, %vreg4 %vreg2:subreg2<def> = op %vreg6, %vreg7 and we merge %vreg1 and %vreg2 then we should not set undef on the second subreg def, which the old code did. Now we solve this by actually do what the code comment says. We remove read-undef flags rather than remove or introduce them. Reviewers: qcolombet, MatzeB Reviewed By: MatzeB Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38616 llvm-svn: 315564
* Revert r307036 because of PR34919.Wei Mi2017-10-121-92/+0
| | | | llvm-svn: 315540
* [MC] Have MCObjectStreamer take its MCAsmBackend argument via unique_ptr.Lang Hames2017-10-111-4/+5
| | | | | | | | MCObjectStreamer owns its MCCodeEmitter -- this fixes the types to reflect that, and allows us to remove the last instance of MCObjectStreamer's weird "holding ownership via someone else's reference" trick. llvm-svn: 315531
* [codeview] Implement FPO data assembler directivesReid Kleckner2017-10-111-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This adds a set of new directives that describe 32-bit x86 prologues. The directives are limited and do not expose the full complexity of codeview FPO data. They are merely a convenience for the compiler to generate more readable assembly so we don't need to generate tons of labels in CodeGen. If our prologue emission changes in the future, we can change the set of available directives to suit our needs. These are modelled after the .seh_ directives, which use a different format that interacts with exception handling. The directives are: .cv_fpo_proc _foo .cv_fpo_pushreg ebp/ebx/etc .cv_fpo_setframe ebp/esi/etc .cv_fpo_stackalloc 200 .cv_fpo_endprologue .cv_fpo_endproc .cv_fpo_data _foo I tried to follow the implementation of ARM EHABI CFI directives by sinking most directives out of MCStreamer and into X86TargetStreamer. This helps avoid polluting non-X86 code with WinCOFF specific logic. I used cdb to confirm that this can show locals in parent CSRs in a few cases, most importantly the one where we use ESI as a frame pointer, i.e. the one in http://crbug.com/756153#c28 Once we have cdb integration in debuginfo-tests, we can add integration tests there. Reviewers: majnemer, hans Subscribers: aemerson, mgorny, kristof.beyls, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D38776 llvm-svn: 315513
* [MachineCombiner] Fix initialisation of LastUpdate for incremental update.Florian Hahn2017-10-111-2/+4
| | | | | | | | | | | | | | | | | Summary: Fixes a bogus iterator resulting from the removal of a block's first instruction at the point that incremental update is enabled. Patch by Paul Walker. Reviewers: fhahn, Gerolf, efriedma, MatzeB Reviewed By: fhahn Subscribers: aemerson, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D38734 llvm-svn: 315502
* [NFC] Convert OptimizationRemarkEmitter old emit() calls to new closureVivek Pandya2017-10-114-56/+71
| | | | | | | | | | | | | | parameterized emit() calls Summary: This is not functional change to adopt new emit() API added in r313691. Reviewed By: anemet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38285 llvm-svn: 315476
* [Pipeliner] Fix offset value for instrs dependent on post-inc load/storesKrzysztof Parzyszek2017-10-111-3/+8
| | | | | | | | | | | | The software pipeliner and the packetizer try to break dependence between the post-increment instruction and the dependent memory instructions by changing the base register and the offset value. However, in some cases, the existing logic didn't work properly and created incorrect offset value. Patch by Jyotsna Verma. llvm-svn: 315468
* [Pipeliner] Improve serialization order for post-incrementsKrzysztof Parzyszek2017-10-111-13/+57
| | | | | | | | | | | | | | | | | | | The pipeliner is generating a serial sequence that causes poor register allocation when a post-increment instruction appears prior to the use of the post-increment register. This occurs when there is a circular set of dependences involved with a sequence of instructions in the same cycle. In this case, there is no serialization of the parallel semantics that will not cause an additional register to be allocated. This patch fixes the problem by changing the instructions so that the post-increment instruction is used by the subsequent instruction, which enables the register allocator to make a better decision and not require another register. Patch by Brendon Cahoon. llvm-svn: 315466
* [DAGCombiner] convert insertelement of bitcasted vector into shuffleSanjay Patel2017-10-111-3/+62
| | | | | | | | | | | | | | | | Eg: insert v4i32 V, (v2i16 X), 2 --> shuffle v8i16 V', X', {0,1,2,3,8,9,6,7} This is a generalization of the IR fold in D38316 to handle insertion into a non-undef vector. We may want to abandon that one if we can't find value in squashing the more specific pattern sooner. We're using the existing legal shuffle target hook to avoid AVX512 horror with vXi1 shuffles. There may be room for improvement in the shuffle lowering here, but that would be follow-up work. Differential Revision: https://reviews.llvm.org/D38388 llvm-svn: 315460
* [TargetLowering] Correctly track NumFixedArgs field of CallLoweringInfoAlex Bradbury2017-10-111-0/+1
| | | | | | | | | | | | | | | | | | | | | The NumFixedArgs field of CallLoweringInfo is used by TargetLowering::LowerCallTo to determine whether a given argument is passed using the vararg calling convention or not (specifically, to set IsFixed for each ISD::OutputArg). Firstly, CallLoweringInfo::setLibCallee and CallLoweringInfo::setCallee both incorrectly set NumFixedArgs based on the _previous_ args list. Secondly, TargetLowering::LowerCallTo failed to increment NumFixedArgs when modifying the argument list so a pointer is passed for the return value. If your backend uses the IsFixed property or directly accesses NumFixedArgs, it is _possible_ this change could result in codegen changes (although the previous behaviour would have been incorrect). No such cases have been identified during code review for any in-tree architecture. Differential Revision: https://reviews.llvm.org/D37898 llvm-svn: 315457
* [MC] Have MCObjectStreamer take its MCAsmBackend argument via unique_ptr.Lang Hames2017-10-111-2/+5
| | | | | | | | MCObjectStreamer owns its MCAsmBackend -- this fixes the types to reflect that, and allows us to remove another instance of MCObjectStreamer's weird "holding ownership via someone else's reference" trick. llvm-svn: 315410
* CodeGen: Minor cleanups to use MachineInstr::getMF. NFCJustin Bogner2017-10-1018-45/+42
| | | | | | | Since r315388 we have a shorter way to say this, so we'll replace MI->getParent()->getParent() with MI->getMF() in a few places. llvm-svn: 315390
* CodeGen: Add MachineInstr::getMF(). NFCJustin Bogner2017-10-101-0/+4
| | | | | | | | Similarly to how Instruction has getFunction, this adds a less verbose way to write MI->getParent()->getParent(). I'll follow up shortly with a change that changes a bunch of the uses. llvm-svn: 315388
* [CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; ↵Eugene Zelenko2017-10-1016-196/+403
| | | | | | other minor fixes (NFC). llvm-svn: 315380
* Convert condition to an early exit (NFC).Adrian Prantl2017-10-101-1/+3
| | | | | | <rdar://problem/34689604> llvm-svn: 315359
* [DAGCombine] Fix for shuffle to vector extend for non power 2 vectorsDavid Stuttard2017-10-101-0/+3
| | | | | | | | | | | | | | | | | | | | | Summary: See https://llvm.org/PR33743 for more details It seems that for non-power of 2 vector sizes, the algorithm can produce non-matching sizes for input and result causing an assert. This usually isn't a problem as the isAnyExtend check will weed these out, but in some cases (most often with lots of undefined values for the mask indices) it can pass this check for non power of 2 vectors. Adding in an extra check that ensures that bit size will match for the result and input (as required) Subscribers: nhaehnle Differential Revision: https://reviews.llvm.org/D35241 llvm-svn: 315307
* Ignore all duplicate frame index expressionBjorn Steinbrink2017-10-102-24/+26
| | | | | | | | | | | | | | | | | Some passes might duplicate calls to llvm.dbg.declare creating duplicate frame index expression which currently trigger an assertion which is meant to catch erroneous, overlapping fragment declarations. But identical frame index expressions are just redundant and don't actually conflict with each other, so we can be more lenient and just ignore the duplicates. Reviewers: aprantl, rnk Subscribers: llvm-commits, JDevlieghere Differential Revision: https://reviews.llvm.org/D38540 llvm-svn: 315279
* Rename OptimizationDiagnosticInfo.* to OptimizationRemarkEmitter.*Adam Nemet2017-10-094-4/+4
| | | | | | | Sync it up with the name of the class actually defined here. This has been bothering me for a while... llvm-svn: 315249
* [GISel]: Fix generation of illegal COPYs during CallLoweringAditya Nandakumar2017-10-091-3/+4
| | | | | | | | | | | We end up creating COPY's that are either truncating/extending and this should be illegal. https://reviews.llvm.org/D37640 Patch for X86 and ARM by igorb, rovka llvm-svn: 315240
* [DAG] combine assertsexts around a truncSanjay Patel2017-10-091-10/+10
| | | | | | | This was a suggested follow-up to: D37017 / https://reviews.llvm.org/rL313577 llvm-svn: 315206
* Remove unused variables. No functionality change.Benjamin Kramer2017-10-082-2/+1
| | | | llvm-svn: 315185
* [SelectionDAG} Use KnownBits::isUnknown and hasConflict. NFCCraig Topper2017-10-071-8/+8
| | | | llvm-svn: 315154
OpenPOWER on IntegriCloud