path: root/llvm/lib/CodeGen
Commit message (Author, Age, Files, Lines)
...
* AMDGPU: Implement canonicalize (Matt Arsenault, 2016-04-14, 2 files, -1/+4)
| | | | | | Also add generic DAG node for it. llvm-svn: 266272
* TargetLowering: Factor out common code for tail call eligibility checking; NFC (Matthias Braun, 2016-04-14, 1 file, -0/+27)
| | | | llvm-svn: 266270
* Cleanup Store Merging in UseAA case (Nirav Dave, 2016-04-13, 1 file, -30/+44)
| | | | | | | | | | | | | | This patch fixes a bug (PR26827) when using alias analysis in store merging. It sets the chain users of the component stores to point to the new store instead of the component stores' chain parent. Reviewers: jyknight Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18909 llvm-svn: 266217
* Calculate __builtin_object_size when pointer depends on a condition (Petar Jovanovic, 2016-04-13, 1 file, -3/+12)
| | | | | | | | | | | | | | | | This patch fixes the calculation of __builtin_object_size when the pointer depends on a condition. Before this patch, the compiler did not know how to calculate the object size when it found a condition that could not be eliminated. This patch enables the calculation even in that case by choosing the minimum or maximum of the candidate sizes as the result. Whether the minimum or the maximum is chosen depends on the second argument of __builtin_object_size. Patch by Strahinja Petrovic. Differential Revision: http://reviews.llvm.org/D18438 llvm-svn: 266193
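The min/max choice described above can be sketched as follows. This is a minimal illustration, not the code from the patch: `objectSizeForCondition` is a hypothetical helper, and it only relies on the documented meaning of the second argument of __builtin_object_size (types 0 and 1 ask for an upper bound, types 2 and 3 for a lower bound).

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper (not LLVM's API): combines the sizes of the two objects
// that a conditionally selected pointer may refer to.
std::uint64_t objectSizeForCondition(std::uint64_t sizeIfTrue,
                                     std::uint64_t sizeIfFalse,
                                     int type) {
  assert(type >= 0 && type <= 3 && "type must be in the range 0..3");
  // Bit 1 of the type argument selects "minimum" instead of "maximum".
  const bool wantMinimum = (type & 2) != 0;
  return wantMinimum ? std::min(sizeIfTrue, sizeIfFalse)
                     : std::max(sizeIfTrue, sizeIfFalse);
}
```

For a pointer that selects between a 16-byte and a 32-byte buffer, this sketch yields 32 for type 0 and 16 for type 2.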
* Recommit r265547, and r265610,r265639,r265657 on top of it, plus (Wei Mi, 2016-04-13, 10 files, -533/+749)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | two fixes: one for an error reported by verify-regalloc, and another for the live range update of a phi after rematerialization. r265547: Replace analyzeSiblingValues with new algorithm to fix its compile time issue. The patch is to solve PR17409 and its duplicates. analyzeSiblingValues is an N x N complexity algorithm where N is the number of siblings generated by reg splitting. Although it causes a significant compile time issue when N is large, it is also important for performance since it removes redundant spills and enables rematerialization. To solve the compile time issue, the patch removes analyzeSiblingValues and replaces it with lower cost alternatives containing two parts. The first part creates a new spill hoisting method in postOptimization of register allocation. It does spill hoisting at once after all the spills are generated instead of inside every instance of selectOrSplit. The second part queries the define expr of the original register for rematerialization and keeps it always available during register allocation even if it is already dead. It deletes those dead instructions only in postOptimization. With the two parts in the patch, it can remove analyzeSiblingValues without sacrificing performance. Patches on top of r265547: r265610 "Fix the compare-clang diff error introduced by r265547." r265639 "Fix the sanitizer bootstrap error in r265547." r265657 "InlineSpiller.cpp: Escap \@ in r265547. [-Wdocumentation]" Differential Revision: http://reviews.llvm.org/D15302 Differential Revision: http://reviews.llvm.org/D18934 Differential Revision: http://reviews.llvm.org/D18935 Differential Revision: http://reviews.llvm.org/D18936 llvm-svn: 266162
* CodeGen: Clear the MFI's save and restore point after PrologEpilogInserter (Justin Bogner, 2016-04-12, 1 file, -0/+2)
| | | | | | | | | | This state is no longer useful and not guaranteed to be valid in later codegen passes. For example, see the added test, which would print a savepoint of %bb.-1 without this change, and crashes with a use-after-free error under ASan if you apply the recycling allocator patch from llvm.org/PR26808. llvm-svn: 266150
* Pre-fill LibcallRoutineNames with nullptr. (James Y Knight, 2016-04-12, 1 file, -32/+12)
| | | | | | And rearrange InitLibcallNames slightly. llvm-svn: 266142
* Add __atomic_* lowering to AtomicExpandPass. (James Y Knight, 2016-04-12, 2 files, -9/+555)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (Recommit of r266002, with r266011, r266016, and not accidentally including an extra unused/uninitialized element in LibcallRoutineNames) AtomicExpandPass can now lower atomic load, atomic store, atomicrmw, and cmpxchg instructions to __atomic_* library calls, when the target doesn't support atomics of a given size. This is the first step towards moving all atomic lowering from clang into llvm. When all is done, the behavior of __sync_* builtins, __atomic_* builtins, and C11 atomics will be unified. Previously LLVM would pass everything through to the ISelLowering code. There, unsupported atomic instructions would turn into __sync_* library calls. Because of that behavior, Clang currently avoids emitting llvm IR atomic instructions when this would happen, and emits __atomic_* library functions itself, in the frontend. This change makes LLVM able to emit __atomic_* libcalls, and thus will eventually allow clang to depend on LLVM to do the right thing. It is advantageous to do the new lowering to atomic libcalls in AtomicExpandPass, before ISel time, because it's important that all atomic operations for a given size either lower to __atomic_* libcalls (which may use locks), or native instructions which won't. No mixing and matching. At the moment, this code is enabled only for SPARC, as a demonstration. The next commit will expand support to all of the other targets. Differential Revision: http://reviews.llvm.org/D18200 llvm-svn: 266115
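The "no mixing and matching" rule above can be illustrated with a small model in which the lowering decision depends only on the operation size. This is a sketch under stated assumptions, not the pass itself: `lowerAtomicLoad` and `maxNativeBytes` are hypothetical names, and the choice between the sized `__atomic_load_N` entry points and the generic `__atomic_load` follows the libatomic naming convention rather than anything stated in the commit message.

```cpp
#include <cassert>
#include <string>

// Hypothetical model: returns "native" when the target supports atomics of
// this size, otherwise the name of the __atomic_* library routine to call.
// maxNativeBytes stands in for whatever limit the target reports.
std::string lowerAtomicLoad(unsigned sizeInBytes, unsigned maxNativeBytes) {
  assert(sizeInBytes > 0 && "atomic access must have a size");
  if (sizeInBytes <= maxNativeBytes)
    return "native";
  switch (sizeInBytes) {
  case 1: case 2: case 4: case 8: case 16:
    // Sized libcall variants exist for these power-of-two sizes.
    return "__atomic_load_" + std::to_string(sizeInBytes);
  default:
    // Any other size goes through the generic, size-taking entry point.
    return "__atomic_load";
  }
}
```

Because the result is a pure function of the size, every atomic operation of a given size takes the same path, so lock-based library calls and native instructions never touch the same object.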
* [CodeGen] Remove constant-folding dead code. NFC. (Ahmed Bougacha, 2016-04-12, 1 file, -12/+4)
| | | | | | | | | | | This code was specific to vector operations with scalar operands: all the opcodes in FoldValue (via FoldConstantArithmetic) can't match those criteria. Replace it with an assert if that ever changes: at that point, we might need to add back a splat BUILD_VECTOR. llvm-svn: 266100
* Introduce an GCRelocateInst class [NFC] (Philip Reames, 2016-04-12, 3 files, -7/+6)
| | | | | | Previously, we were using isGCRelocate predicates. Using a subclass of IntrinsicInst is far more idiomatic. The refactoring also enables a couple of minor simplifications and code sharing. llvm-svn: 266098
* [ScheduleDAGInstrs] Handle instructions with multiple MMOs (Geoff Berry, 2016-04-12, 1 file, -30/+41)
| | | | | | | | | | | | | | | | | | | | | Summary: In getUnderlyingObjectsForInstr(): Don't give up on instructions with multiple MMOs, instead look through all the MMOs and if they all meet the conservative criteria previously used for single MMO instructions, then return all of the underlying objects derived from the MMOs. The change to ScheduleDAGInstrs::buildSchedGraph() is needed to avoid the case where multiple underlying objects are present and are related in such a way that successive iterations of the loop end up adding a dependency from an instruction to itself. Reviewers: atrick, hfinkel Subscribers: MatzeB, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18093 llvm-svn: 266084
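The all-or-nothing rule for multiple MMOs described above can be sketched like this. It is a simplified model, not the LLVM code: `MemOperand`, its `meetsConservativeCriteria` flag, and `collectUnderlyingObjects` are hypothetical stand-ins for the real memory operand checks.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a machine memory operand (MMO).
struct MemOperand {
  std::string underlyingObject;
  bool meetsConservativeCriteria; // models the checks used for single MMOs
};

// Returns every underlying object when all MMOs pass the criteria.
// Returns an empty list ("give up") as soon as one of them does not.
std::vector<std::string>
collectUnderlyingObjects(const std::vector<MemOperand> &mmos) {
  std::vector<std::string> objects;
  for (const MemOperand &mmo : mmos) {
    if (!mmo.meetsConservativeCriteria)
      return {};
    objects.push_back(mmo.underlyingObject);
  }
  return objects;
}
```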
* This reverts commit r266002, r266011 and r266016. (Rafael Espindola, 2016-04-12, 2 files, -555/+9)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | They broke the msan bot. Original message: Add __atomic_* lowering to AtomicExpandPass. AtomicExpandPass can now lower atomic load, atomic store, atomicrmw, and cmpxchg instructions to __atomic_* library calls, when the target doesn't support atomics of a given size. This is the first step towards moving all atomic lowering from clang into llvm. When all is done, the behavior of __sync_* builtins, __atomic_* builtins, and C11 atomics will be unified. Previously LLVM would pass everything through to the ISelLowering code. There, unsupported atomic instructions would turn into __sync_* library calls. Because of that behavior, Clang currently avoids emitting llvm IR atomic instructions when this would happen, and emits __atomic_* library functions itself, in the frontend. This change makes LLVM able to emit __atomic_* libcalls, and thus will eventually allow clang to depend on LLVM to do the right thing. It is advantageous to do the new lowering to atomic libcalls in AtomicExpandPass, before ISel time, because it's important that all atomic operations for a given size either lower to __atomic_* libcalls (which may use locks), or native instructions which won't. No mixing and matching. At the moment, this code is enabled only for SPARC, as a demonstration. The next commit will expand support to all of the other targets. Differential Revision: http://reviews.llvm.org/D18200 llvm-svn: 266062
* [RegBankSelect] Teach the repairing code how to handle physical (Quentin Colombet, 2016-04-12, 1 file, -2/+6)
| | | | | | registers. llvm-svn: 266029
* [RegisterBankInfo] Do not provide a default mapping for non-reg of phi (Quentin Colombet, 2016-04-12, 1 file, -0/+7)
| | | | | | operations. llvm-svn: 266027
* [RegBankSelect] Teach how to repair definitions. (Quentin Colombet, 2016-04-12, 1 file, -13/+112)
| | | | | | | | | Although repairing definitions is not mandatory for correctness (only phis would be impacted because of the RPO traversal), not repairing might go against the cost model. Therefore, just repair when it is possible. llvm-svn: 266025
* Replace MachineRegisterInfo::TracksLiveness with a MachineFunctionProperty (Derek Schuff, 2016-04-11, 2 files, -9/+7)
| | | | | | | | | | Use the MachineFunctionProperty mechanism to indicate whether the liveness info is accurate instead of a bool flag on MRI. Keeps the MRI accessor function for convenience. NFC Differential Revision: http://reviews.llvm.org/D18767 llvm-svn: 266020
* AtomicExpandPass: mark assert variable as used (JF Bastien, 2016-04-11, 1 file, -0/+3)
| | | | | | Avoid -Wunused-variable llvm-svn: 266016
* Fix compile with GCC after r266002 (Add __atomic_* lowering to AtomicExpandPass) (James Y Knight, 2016-04-11, 1 file, -8/+8)
| | | | | | It doesn't like implicitly calling the ArrayRef constructor with a returned array -- it appears to decay the returned value to a pointer, first, before trying to make an ArrayRef out of it. llvm-svn: 266011
* CodeGen: Fix a use-after-free in TailDuplication (Justin Bogner, 2016-04-11, 1 file, -2/+0)
| | | | | | | | | | The call to processPHI already erased MI from its parent, so MI isn't even valid here, making the getParent() call a use-after-free in addition to being redundant. Found by ASan with the ArrayRecycler changes in llvm.org/pr26808. llvm-svn: 266008
* [safestack] Add canary to unsafe stack frames (Evgeniy Stepanov, 2016-04-11, 2 files, -19/+79)
| | | | | | | | Add StackProtector to SafeStack. This adds limited protection against data corruption in the caller frame. Current implementation treats all stack protector levels as -fstack-protector-all. llvm-svn: 266004
* Add __atomic_* lowering to AtomicExpandPass. (James Y Knight, 2016-04-11, 2 files, -9/+552)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | AtomicExpandPass can now lower atomic load, atomic store, atomicrmw, and cmpxchg instructions to __atomic_* library calls, when the target doesn't support atomics of a given size. This is the first step towards moving all atomic lowering from clang into llvm. When all is done, the behavior of __sync_* builtins, __atomic_* builtins, and C11 atomics will be unified. Previously LLVM would pass everything through to the ISelLowering code. There, unsupported atomic instructions would turn into __sync_* library calls. Because of that behavior, Clang currently avoids emitting llvm IR atomic instructions when this would happen, and emits __atomic_* library functions itself, in the frontend. This change makes LLVM able to emit __atomic_* libcalls, and thus will eventually allow clang to depend on LLVM to do the right thing. It is advantageous to do the new lowering to atomic libcalls in AtomicExpandPass, before ISel time, because it's important that all atomic operations for a given size either lower to __atomic_* libcalls (which may use locks), or native instructions which won't. No mixing and matching. At the moment, this code is enabled only for SPARC, as a demonstration. The next commit will expand support to all of the other targets. Differential Revision: http://reviews.llvm.org/D18200 llvm-svn: 266002
* [DAGCombiner] Fold xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B)) (Simon Pilgrim, 2016-04-11, 1 file, -2/+2)
| | | | | | | | | | | | anytime before LegalizeVectorOprs xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B)) was only being combined at the AfterLegalizeTypes stage; this patch permits the combine to occur anytime before then as well. The main aim is to improve the ability to recognise bitmasks that can be converted to shuffles. I had to modify a number of AVX512 mask tests as the basic bitcast to/from scalar pattern was being stripped out, preventing testing of the mmask bitops. By replacing the bitcasts with loads we can get almost the same result. Differential Revision: http://reviews.llvm.org/D18944 llvm-svn: 265998
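The fold is valid because a bitcast leaves the bits untouched, so a bitwise operation gives the same bit pattern whether it runs before or after the cast. The following sketch checks that identity for xor on plain C++ values. The type `V4i16` and the helper names are illustrative assumptions, and `std::memcpy` stands in for the bitcast.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

using V4i16 = std::array<std::uint16_t, 4>; // models a <4 x i16> vector
static_assert(sizeof(V4i16) == sizeof(std::uint64_t),
              "a bitcast needs equal bit widths");

// Models bitcast <4 x i16> -> i64: reinterpret the same bits.
std::uint64_t bitcastToI64(const V4i16 &v) {
  std::uint64_t bits = 0;
  std::memcpy(&bits, v.data(), sizeof(bits));
  return bits;
}

// Lane-wise xor on the original vector type.
V4i16 xorLanes(const V4i16 &a, const V4i16 &b) {
  V4i16 r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = static_cast<std::uint16_t>(a[i] ^ b[i]);
  return r;
}

// xor(bitcast(A), bitcast(B)) == bitcast(xor(A, B))
bool xorFoldHolds(const V4i16 &a, const V4i16 &b) {
  return (bitcastToI64(a) ^ bitcastToI64(b)) == bitcastToI64(xorLanes(a, b));
}
```

The same argument applies to and/or, since all three operate on each bit independently.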
* Fix a couple of redundant conditional expressions (PR27283, PR28282) (Hans Wennborg, 2016-04-11, 1 file, -2/+2)
| | | | llvm-svn: 265987
* use range-loops; NFCI (Sanjay Patel, 2016-04-11, 1 file, -13/+8)
| | | | llvm-svn: 265985
* Combine redundant stack realignment booleans in MachineFrameInfo (Reid Kleckner, 2016-04-11, 1 file, -17/+14)
| | | | | | | | | MachineFrameInfo does not need to be able to distinguish between the user asking us not to realign the stack and the target telling us it doesn't support stack realignment. Either way, fixed stack objects have their alignment clamped. llvm-svn: 265971
* TargetRegisterInfo: Add getRegAsmName() (Tom Stellard, 2016-04-11, 1 file, -1/+1)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The motivation for this new function is to move an invalid assumption about the relationship between the names of register definitions in tablegen files and their assembly names into TargetRegisterInfo, so that we can begin working on fixing this assumption. The current problem is that if you have a register definition in TableGen like: def MYReg0 : Register<"r0", 0>; The function TargetLowering::getRegForInlineAsmConstraint() derives the assembly name from the tablegen name: "MYReg0" rather than the given assembly name "r0". This is working, because on most targets the tablegen name and the assembly names are case insensitive matches for each other (e.g. def EAX : X86Reg<"eax", ...>). getRegAsmName() will allow targets to override this default assumption and return the correct assembly name. Reviewers: echristo, hfinkel Subscribers: SamWot, echristo, hfinkel, llvm-commits Differential Revision: http://reviews.llvm.org/D15614 llvm-svn: 265955
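The mismatch described above can be reproduced with a small model. This is only a sketch: `RegDesc`, `matchesByTableGenName`, and `matchesByAsmName` are hypothetical names and do not reflect the real TargetRegisterInfo interface. They contrast the old assumption (compare against the TableGen definition name, ignoring case) with a lookup against the declared assembly name.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical register description: the TableGen definition name and the
// assembly name given in the Register<"..."> declaration.
struct RegDesc {
  std::string tableGenName;
  std::string asmName;
};

std::string toLower(std::string s) {
  std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
    return static_cast<char>(std::tolower(c));
  });
  return s;
}

// Old assumption: the TableGen name matches the assembly name ignoring case.
bool matchesByTableGenName(const RegDesc &reg, const std::string &constraint) {
  return toLower(reg.tableGenName) == toLower(constraint);
}

// What an override would allow: compare against the real assembly name.
bool matchesByAsmName(const RegDesc &reg, const std::string &constraint) {
  return toLower(reg.asmName) == toLower(constraint);
}
```

With `EAX`/`eax` both lookups agree, which is why the assumption went unnoticed on most targets. With `MYReg0`/`r0` only the assembly-name lookup finds the register.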
* [CodeGen] Don't assume that fixed stack objects are aligned in a (Charles Davis, 2016-04-09, 1 file, -5/+16)
| | | | | | | | | | | | | | | | | | | | stack-realigned function. Summary: After we make the adjustment, we can assume that for local allocas, but not for stack parameters, the return address, or any other fixed stack object (which has a negative offset and therefore lies prior to the adjusted SP). Fixes PR26662. Reviewers: hfinkel, qcolombet, rnk Subscribers: rnk, llvm-commits Differential Revision: http://reviews.llvm.org/D18471 llvm-svn: 265886
* Drop debug info for DISubprograms that are not referenced by anything (Adrian Prantl, 2016-04-09, 4 files, -53/+8)
| | | | | | | | | | | | | | | | | | | | | This patch drops the debug info for all DISubprograms that are (a) not attached to an llvm::Function and (b) not indirectly reachable via inline scopes from any surviving Function and (c) not reachable from a type (i.e.: member functions). Background: I'm currently working on a patch to reverse the pointers between DICompileUnit and DISubprogram (for more info check Duncan's RFC on lazy-loading of debug info metadata http://lists.llvm.org/pipermail/llvm-dev/2016-March/097419.html). The idea is to remove the list of subprograms from DICompileUnit and instead point to the owning compile unit from each DISubprogram. After doing this all DISubprograms fulfilling the above criteria will be implicitly dropped unless we go through an extra effort to preserve them. http://reviews.llvm.org/D18477 <rdar://problem/25256815> llvm-svn: 265876
* [x86] use BMI 'andn' for logic + compare ops (Sanjay Patel, 2016-04-09, 1 file, -0/+4)
| | | | | | | | | | With BMI, we can use 'andn' to save an instruction when the result is only used in a compare. This is related to one of the potential sequences to check 'isfinite' in: https://llvm.org/bugs/show_bug.cgi?id=27164 Differential Revision: http://reviews.llvm.org/D18910 llvm-svn: 265875
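A source-level pattern of the kind this commit targets might look like the function below: a negated value is and-ed with another value and the result is only compared against zero. This is an illustrative example, and `hasAllBits` is a hypothetical name. Whether a given compiler actually selects 'andn' for it depends on the target and on BMI being enabled, which is an assumption here rather than something the snippet can demonstrate.

```cpp
#include <cassert>
#include <cstdint>

// True when every bit set in `required` is also set in `value`.
// The expression (~value & required) maps onto a single and-not operation,
// and its result feeds only the comparison with zero.
bool hasAllBits(std::uint32_t value, std::uint32_t required) {
  return (~value & required) == 0;
}
```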
* Support the Nodebug emission kind for DICompileUnits. (Adrian Prantl, 2016-04-08, 1 file, -12/+21)
| | | | | | | | | | | | | | | | Sample-based profiling and optimization remarks currently remove DICompileUnits from llvm.dbg.cu to suppress the emission of debug info from them. This is somewhat of a hack and only borderline legal IR. This patch uses the recently introduced NoDebug emission kind in DICompileUnit to achieve the same result without breaking the Verifier. A nice side-effect of this change is that it is now possible to combine NoDebug and regular compile units under LTO. http://reviews.llvm.org/D18808 <rdar://problem/25427165> llvm-svn: 265861
* [SSP] Remove llvm.stackprotectorcheck. (Tim Shen, 2016-04-08, 5 files, -125/+76)
| | | | | | | | | | This is a cleanup patch for SSP support in LLVM. There is no functional change. llvm.stackprotectorcheck is not needed, because SelectionDAG isn't actually lowering it in SelectBasicBlock; rather, it adds check code in FinishBasicBlock, ignoring the position where the intrinsic is inserted (See FindSplitPointForStackProtector()). llvm-svn: 265851
* Codegen: Factor tail duplication into a utility class. NFC (Kyle Butt, 2016-04-08, 3 files, -949/+923)
| | | | | | | | | | | This is in preparation for tail duplication during block placement. See D18226. This needs to be a utility class for 2 reasons. No passes may run after block placement, and also, tail-duplication affects subsequent layout decisions, so it must be interleaved with placement, and can't be separated out into its own pass. The original pass is still useful, and now runs by delegating to the utility class. llvm-svn: 265842
* Fix Load Control Dependence in MemCpy Generation (Nirav Dave, 2016-04-08, 2 files, -57/+1)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In Memcpy lowering we had missed a dependence from the load of the operation to successor operations. This causes us to potentially construct an initial DAG with a memory dependence not fully represented in the chain sub-DAG but rather requiring a look at the entire DAG, breaking alias analysis by allowing incorrect repositioning of memory operations. To work around this, r200033 changed DAGCombiner::GatherAllAliases to be conservative if any possible issues could happen. Unfortunately this check forbade many non-problematic situations as well. For example, it's common for incoming argument lowering to add a non-aliasing load hanging off of EntryNode. Then, if GatherAllAliases visited EntryNode, it would find that other (unvisited) use of the EntryNode chain, and just give up entirely. Furthermore, the check was incomplete: it would not actually detect all such potentially problematic DAG constructions, because GatherAllAliases did not guarantee to visit all chain nodes going up to the root EntryNode. This is in general fine -- giving up early will just miss a potential optimization, not generate incorrect results. But, for this non-chain dependency detection code, it's possible that you could have a load attached to a higher-up chain node than any which were visited. If that load aliases your store, but the only dependency is through the value operand of a non-aliasing store, it would've been missed by this code, and potentially reordered. With the dependence added, this check can be removed and Alias Analysis can be much more aggressive. This fixes code quality regression in the Consecutive Store Merge cleanup (D14834). Test Change: ppc64-align-long-double.ll now may see multiple serializations of its stores Differential Revision: http://reviews.llvm.org/D18062 llvm-svn: 265836
* [RegBankSelect] Use reverse post order traversal. (Quentin Colombet, 2016-04-08, 1 file, -2/+12)
| | | | | | | | When assigning the register banks of an instruction, it is best to know all the constraints of the input to have a good idea of how this will impact the cost of the whole function. llvm-svn: 265812
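A reverse post-order walk visits a block only after all of its predecessors, apart from those reached through back edges, which is what makes the inputs' constraints available when an instruction is assigned. The sketch below computes such an order for a small control-flow graph. It is a generic illustration, not the RegBankSelect code: the `Graph` alias and function names are assumptions, and blocks are plain integer indices.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Successor lists indexed by block number.
using Graph = std::vector<std::vector<int>>;

// Depth-first walk that records a block after all of its successors.
static void postOrder(const Graph &g, int block, std::vector<bool> &seen,
                      std::vector<int> &out) {
  seen[block] = true;
  for (int succ : g[block])
    if (!seen[succ])
      postOrder(g, succ, seen, out);
  out.push_back(block);
}

// Reversing the post-order yields an order in which each block comes after
// its predecessors, ignoring back edges.
std::vector<int> reversePostOrder(const Graph &g, int entry) {
  std::vector<bool> seen(g.size(), false);
  std::vector<int> order;
  postOrder(g, entry, seen, order);
  std::reverse(order.begin(), order.end());
  return order;
}
```

On a diamond-shaped graph (block 0 branching to 1 and 2, both joining in 3), the join block is visited last, after both of its predecessors.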
* [RegisterBankInfo] Change the implementation for the default mapping. (Quentin Colombet, 2016-04-08, 1 file, -1/+14)
| | | | | | | | Do not give that much importance to the current register bank of an operand. This is likely just a side effect of the current execution and it is probably wise to prefer a register bank that can be extracted from the information available statically (like encoding constraints and type). llvm-svn: 265810
* [RegBankSelect] Improve debug output. (Quentin Colombet, 2016-04-08, 1 file, -1/+10)
| | | | | | | | Add verbose information when checking if the current and the desired register banks match. Detail what happens when we assign a register bank. llvm-svn: 265804
* [MIR] Teach the parser how to deal with register banks. (Quentin Colombet, 2016-04-08, 1 file, -10/+51)
| | | | llvm-svn: 265802
* [MachineVerifier] Teach how to check some of the properties of generic (Quentin Colombet, 2016-04-08, 1 file, -1/+24)
| | | | | | | | | | | | | | virtual registers. Generic virtual registers: - May not have a register class - May not have a register bank - If they do not have a register class they must have a size - If they have a register bank, the size of the register bank must be greater or equal to the size of the virtual register (basically check that the virtual register will fit into that register class) llvm-svn: 265798
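The rules listed above can be written down as a small checker. This is a sketch of the stated properties only, not the MachineVerifier implementation: `GenericVReg`, its fields, and `verifyGenericVReg` are hypothetical, and a size of 0 is used here to mean "no size recorded".

```cpp
#include <cassert>
#include <string>

// Hypothetical description of a generic virtual register.
struct GenericVReg {
  bool hasRegClass;
  bool hasRegBank;
  unsigned regBankSizeInBits; // only meaningful when hasRegBank is true
  unsigned sizeInBits;        // 0 models "no size recorded"
};

// Returns an empty string when the register satisfies the listed properties,
// otherwise a short description of the first violated one.
std::string verifyGenericVReg(const GenericVReg &reg) {
  // A missing register class and a missing register bank are both allowed.
  if (!reg.hasRegClass && reg.sizeInBits == 0)
    return "generic virtual register without a class must have a size";
  if (reg.hasRegBank && reg.regBankSizeInBits < reg.sizeInBits)
    return "register bank is too small for the virtual register";
  return "";
}
```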
* [MIR] Teach the mir printer how to print the register bank. (Quentin Colombet, 2016-04-08, 1 file, -5/+8)
| | | | | | | | | For now, we put the register bank in the Class field since a register may only have one of those at a given time. The downside of that representation is that if a register class and a register bank have the same name, we will not be able to distinguish them. llvm-svn: 265796
* Revert r265547 "Recommit r265309 after fixed an invalid memory reference bug (Hans Wennborg, 2016-04-08, 10 files, -722/+526)
| | | | | | | | | | | | | happened" It caused PR27275: "ARM: Bad machine code: Using an undefined physical register" Also reverting the following commits that were landed on top: r265610 "Fix the compare-clang diff error introduced by r265547." r265639 "Fix the sanitizer bootstrap error in r265547." r265657 "InlineSpiller.cpp: Escap \@ in r265547. [-Wdocumentation]" llvm-svn: 265790
* CXX_FAST_TLS calling convention: performance improvement for PPC64 (Chuang-Yu Cheng, 2016-04-08, 1 file, -2/+5)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is the same change on PPC64 as r255821 on AArch64. I have even borrowed his commit message. The access function has a short entry and a short exit, the initialization block is only run the first time. To improve the performance, we want to have a short frame at the entry and exit. We explicitly handle most of the CSRs via copies. Only the CSRs that are not handled via copies will be in CSR_SaveList. Frame lowering and prologue/epilogue insertion will generate a short frame in the entry and exit according to CSR_SaveList. The majority of the CSRs will be handled by the register allocator. The register allocator will try to spill and reload them in the initialization block. We add CSRsViaCopy, it will be explicitly handled during lowering. 1> we first set FunctionLoweringInfo->SplitCSR if conditions are met (the target supports it for the given machine function and the function has only return exits). We also call TLI->initializeSplitCSR to perform initialization. 2> we call TLI->insertCopiesSplitCSR to insert copies from CSRsViaCopy to virtual registers at beginning of the entry block and copies from virtual registers to CSRsViaCopy at beginning of the exit blocks. 3> we also need to make sure the explicit copies will not be eliminated. Author: Tom Jablin (tjablin) Reviewers: hfinkel kbarton cycheng http://reviews.llvm.org/D17533 llvm-svn: 265781
* Use std::fill to simplify some code. NFC (Craig Topper, 2016-04-08, 1 file, -2/+3)
| | | | llvm-svn: 265771
* [TargetRegisterInfo] Re-apply r265734. (Quentin Colombet, 2016-04-08, 1 file, -12/+5)
| | | | | | | Original commit message: [TargetRegisterInfo] Refactor the code to use BitMaskClassIterator. llvm-svn: 265764
* DwarfDebug: Support floating point constants in location lists. (Adrian Prantl, 2016-04-08, 3 files, -18/+41)
| | | | | | | | | | | | | | | | | | This patch closes a gap in the DWARF backend that caused LLVM to drop debug info for floating point variables that were constant for part of their scope. Floating point constants are emitted as one or more DW_OP_constu joined via DW_OP_piece. This fixes a regression caught by the LLDB testsuite that I introduced in r262247 when we stopped blindly expanding the range of singular DBG_VALUEs to span the entire scope and started to emit location lists with accurate ranges instead. Also deletes a now-impossible testcase (debug-loc-empty-entries). <rdar://problem/25448338> llvm-svn: 265760
* Revert "[TargetRegisterInfo] Refactor the code to use BitMaskClassIterator." (Quentin Colombet, 2016-04-08, 1 file, -5/+12)
| | | | | | | | | | This reverts commit r265734. Looks like ASan is not happy about it. http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/11741 Looking. llvm-svn: 265755
* [RegisterBankInfo] Make the debug output more compact. (Quentin Colombet, 2016-04-08, 1 file, -1/+1)
| | | | | Print the mask of the partial mapping as a hexadecimal value instead of a binary one. llvm-svn: 265754
* [RegBankSelect] Add a few debug statements. (Quentin Colombet, 2016-04-07, 1 file, -0/+9)
| | | | llvm-svn: 265749
* [RegisterBankInfo] Add print and dump method to the InstructionMapping (Quentin Colombet, 2016-04-07, 1 file, -0/+16)
| | | | | | helper class. llvm-svn: 265747
* [RegisterBankInfo] Add print and dump method to the ValueMapping helper (Quentin Colombet, 2016-04-07, 1 file, -0/+16)
| | | | | | class. llvm-svn: 265746
* [MachineInstr] Teach the print method about RegisterBank. (Quentin Colombet, 2016-04-07, 1 file, -11/+10)
| | | | | | Properly print either the register class or the register bank of a virtual register. Get rid of a few ifdefs in the process. llvm-svn: 265745