path: root/llvm/lib/CodeGen/SelectionDAG
* DebugInfo: Avoid propagating incorrect debug locations in SelectionDAG via CSE.
  (Wolfgang Pieb, 2016-05-02; 1 file changed, -31/+37)

  Summary: When SelectionDAG performs CSE it is possible that the context's
  source location is different from that of the selected node. This can lead
  to incorrect line number records. We update the debug location to the one
  that occurs earlier in the instruction sequence. This fixes PR21006.

  Reviewers: echristo, sdmitrouk
  Subscribers: jevinskie, asl, llvm-commits
  Differential Revision: http://reviews.llvm.org/D12094
  llvm-svn: 268323
* Fix grammar and correct comment - the debug information wasn't incorrect, rather suboptimal.
  (Eric Christopher, 2016-05-02; 1 file changed, -2/+2)

  llvm-svn: 268211
* [CodeGen] Add OPC_MoveChild0-OPC_MoveChild7 opcodes to isel matching tables to optimize table size.
  (Craig Topper, 2016-05-02; 1 file changed, -0/+12)

  Shaves about 12K off the X86 matcher table.
  llvm-svn: 268209
* getelementptr instruction, support index vector of EVT.
  (Igor Breger, 2016-05-01; 1 file changed, -1/+2)

  Differential Revision: http://reviews.llvm.org/D19775
  llvm-svn: 268195
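  A minimal sketch (types chosen for illustration, not from the patch) of the
  IR shape this code path handles: a getelementptr whose index operand is a
  vector.

  ```
  define <4 x i32*> @vgep(<4 x i32*> %base, <4 x i64> %idx) {
    %p = getelementptr i32, <4 x i32*> %base, <4 x i64> %idx
    ret <4 x i32*> %p
  }
  ```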
* DAGCombiner: Reduce truncated shl width
  (Matt Arsenault, 2016-04-29; 1 file changed, -0/+19)

  llvm-svn: 268094
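  A minimal sketch of the combine, with illustrative widths: when only a
  truncated result of a shift-left is used, the shift can be done in the
  narrower type, since left-shifting never moves high bits down.

  ```
  define i32 @narrow_shl(i64 %x) {
    %s = shl i64 %x, 5
    %t = trunc i64 %s to i32
    ret i32 %t
  }
  ; the low 32 bits of (%x << 5) equal (trunc %x to i32) << 5, so this can
  ; become a 32-bit shift of the truncated value
  ```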
* Use SelectionDAG::getTargetConstant* helper functions. NFC.
  (Simon Pilgrim, 2016-04-29; 1 file changed, -4/+4)

  Instead of calling SelectionDAG::getConstant directly, to make it more
  obvious that we're creating target constants.
  llvm-svn: 268074
* Unify XDEBUG and EXPENSIVE_CHECKS (into the latter), and add an option to the cmake build to enable them.
  (Filipe Cabecinhas, 2016-04-29; 2 files changed, -4/+4)

  Summary: Historically, we had a switch in the Makefiles for turning on
  "expensive checks". This has never been ported to the cmake build, but the
  (dead-ish) code is still around. This will also make it easier to turn it
  on in buildbots.

  Reviewers: chandlerc
  Subscribers: jyknight, mzolotukhin, RKSimon, gberry, llvm-commits
  Differential Revision: http://reviews.llvm.org/D19723
  llvm-svn: 268050
* [DAGCombiner] Follow coding convention for function name (NFC)
  (Gerolf Hoflehner, 2016-04-27; 1 file changed, -2/+2)

  llvm-svn: 267745
* Revert r267649, it caused PR27539.
  (Nico Weber, 2016-04-27; 1 file changed, -11/+7)

  llvm-svn: 267723
* Detects the SAD pattern on X86 so that much better code will be emitted once the pattern is matched.
  (Cong Hou, 2016-04-27; 1 file changed, -7/+11)

  Differential revision: http://reviews.llvm.org/D14840
  llvm-svn: 267649
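  A hedged sketch of the sum-of-absolute-differences shape being recognized,
  with the lane count cut to 4 for brevity (the real matcher targets
  psadbw-sized vectors):

  ```
  define i32 @sad4(<4 x i8> %a, <4 x i8> %b) {
    %za  = zext <4 x i8> %a to <4 x i32>
    %zb  = zext <4 x i8> %b to <4 x i32>
    %d   = sub <4 x i32> %za, %zb
    %nd  = sub <4 x i32> zeroinitializer, %d
    %pos = icmp sgt <4 x i32> %d, zeroinitializer
    %ad  = select <4 x i1> %pos, <4 x i32> %d, <4 x i32> %nd
    ; horizontal add of the absolute differences
    %h1  = shufflevector <4 x i32> %ad, <4 x i32> undef, <4 x i32> <i32 2, i32 3, i32 undef, i32 undef>
    %s1  = add <4 x i32> %ad, %h1
    %h2  = shufflevector <4 x i32> %s1, <4 x i32> undef, <4 x i32> <i32 1, i32 undef, i32 undef, i32 undef>
    %s2  = add <4 x i32> %s1, %h2
    %r   = extractelement <4 x i32> %s2, i32 0
    ret i32 %r
  }
  ```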
* [CodeGen] Add getBuildVector and getSplatBuildVector helpers. NFCI.
  (Ahmed Bougacha, 2016-04-26; 2 files changed, -40/+25)

  Differential Revision: http://reviews.llvm.org/D17176
  llvm-svn: 267606
* [PR27390] [CodeGen] Reject indexed loads in DAGCombiner.
  (Marcin Koscielnicki, 2016-04-25; 2 files changed, -3/+10)

  visitAND, when folding (and (load)), forgets to check which output of an
  indexed load is involved, happily folding the updated address output on the
  following testcase:

      target datalayout = "e-m:e-i64:64-n32:64"
      target triple = "powerpc64le-unknown-linux-gnu"

      %typ = type { i32, i32 }

      define signext i32 @_Z8access_pP1Tc(%typ* %p, i8 zeroext %type) {
        %b = getelementptr inbounds %typ, %typ* %p, i64 0, i32 1
        %1 = load i32, i32* %b, align 4
        %2 = ptrtoint i32* %b to i64
        %3 = and i64 %2, -35184372088833
        %4 = inttoptr i64 %3 to i32*
        %_msld = load i32, i32* %4, align 4
        %zzz = add i32 %1, %_msld
        ret i32 %zzz
      }

  Fix this by checking ResNo. I've found a few more places that currently
  neglect to check for indexed loads, and tightened them up as well, but I
  don't have test cases for them. In fact, they might not be triggerable at
  all, at least with current targets. Still, better safe than sorry.

  Differential Revision: http://reviews.llvm.org/D19202
  llvm-svn: 267420
* [MachineCombiner] Support for floating-point FMA on ARM64 (re-commit r267098)
  (Gerolf Hoflehner, 2016-04-24; 1 file changed, -1/+10)

  The original patch caused crashes because it could dereference a null
  pointer for SelectionDAGTargetInfo for targets that do not define it.

  Evaluates fmul+fadd -> fmadd combines and similar code sequences in the
  machine combiner. It adds support for float and double similar to the
  existing integer implementation. The key features are:
  - DAGCombiner checks whether it should combine greedily or let the machine
    combiner do the evaluation. This is only supported on ARM64.
  - It gives preference to throughput over latency: the heuristic used is to
    combine always in loops. The target decides whether the machine combiner
    should optimize for throughput or latency.
  - Support for fmadd, f(n)msub, fmla, fmls patterns
  - On by default at O3 ffast-math

  llvm-svn: 267328
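  A minimal sketch of the input the combiner looks for (illustrative, not a
  test from the patch): a multiply feeding an add under fast-math, fusable
  into fmadd.

  ```
  define double @muladd(double %a, double %b, double %c) {
    %m = fmul fast double %a, %b
    %r = fadd fast double %m, %c
    ret double %r
  }
  ```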
* [CodeGen] Teach DAG combine to fold select_cc seteq X, 0, sizeof(X), ctlz_zero_undef(X) -> ctlz(X).
  (Craig Topper, 2016-04-24; 1 file changed, -0/+35)

  InstCombine already does this for IR and X86 pattern matches this during
  isel. A follow up commit will remove the X86 patterns to allow this to be
  tested.
  llvm-svn: 267325
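  The IR-level equivalent of the folded pattern, as a hedged sketch:
  selecting the bit width when X is zero makes the pair behave exactly like a
  ctlz that is defined at zero.

  ```
  define i32 @lz(i32 %x) {
    %c = call i32 @llvm.ctlz.i32(i32 %x, i1 true)  ; undefined at zero
    %z = icmp eq i32 %x, 0
    %r = select i1 %z, i32 32, i32 %c
    ret i32 %r                                     ; == ctlz(%x), defined at zero
  }
  declare i32 @llvm.ctlz.i32(i32, i1)
  ```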
* [CodeGen] When promoting CTTZ operations to a larger type, don't insert a select to detect if the input is zero to return the original size instead of the extended size. Instead just set the first bit in the zero extended part.
  (Craig Topper, 2016-04-23; 1 file changed, -9/+11)

  llvm-svn: 267280
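  A hedged illustration of the trick for an i8 cttz promoted to i32: setting
  bit 8 of the widened value makes the all-zero input come out as 8 with no
  compare-and-select.

  ```
  define i8 @tz(i8 %x) {
    %w = zext i8 %x to i32
    %o = or i32 %w, 256          ; set the first bit of the extended part
    %c = call i32 @llvm.cttz.i32(i32 %o, i1 true)
    %r = trunc i32 %c to i8      ; 0-7 for nonzero %x, 8 for %x == 0
    ret i8 %r
  }
  declare i32 @llvm.cttz.i32(i32, i1)
  ```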
* DAGCombiner: Relax alignment restriction when changing store type
  (Matt Arsenault, 2016-04-22; 1 file changed, -10/+14)

  If the target allows the alignment, this should be OK.
  llvm-svn: 267217
* DAGCombiner: Relax alignment restriction when changing load type
  (Matt Arsenault, 2016-04-22; 1 file changed, -3/+4)

  If the target allows the alignment, this should still be OK.
  llvm-svn: 267209
* Revert r267098 - [MachineCombiner] Support for floating-point FMA on ARM64
  (Daniel Sanders, 2016-04-22; 1 file changed, -11/+2)

  It introduced buildbot failures on clang-cmake-mips, clang-ppc64le-linux,
  among others.
  llvm-svn: 267127
* [MachineCombiner] Support for floating-point FMA on ARM64
  (Gerolf Hoflehner, 2016-04-22; 1 file changed, -2/+11)

  Evaluates fmul+fadd -> fmadd combines and similar code sequences in the
  machine combiner. It adds support for float and double similar to the
  existing integer implementation. The key features are:
  - DAGCombiner checks whether it should combine greedily or let the machine
    combiner do the evaluation. This is only supported on ARM64.
  - It gives preference to throughput over latency: the heuristic used is to
    combine always in loops. The target decides whether the machine combiner
    should optimize for throughput or latency.
  - Support for fmadd, f(n)msub, fmla, fmls patterns
  - On by default at O3 ffast-math

  llvm-svn: 267098
* LegalizeDAG: Move unaligned load/store expansion to TLI
  (Matt Arsenault, 2016-04-21; 2 files changed, -310/+304)

  The expansion is not called when the load/store is custom lowered. Move it
  into a utility function so targets can easily expand unaligned accesses
  when custom lowering.
  llvm-svn: 267029
* DAGCombiner: Reduce 64-bit BFE pattern to pattern on 32-bit component
  (Matt Arsenault, 2016-04-21; 1 file changed, -0/+44)

  If the extracted bits are restricted to the upper half or lower half, this
  can be truncated.
  llvm-svn: 267024
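  A minimal sketch of the low-half case (constants are illustrative): all the
  extracted bits live in the low 32 bits, so the extract can be done on the
  32-bit component.

  ```
  define i32 @bfe_low(i64 %x) {
    %s = lshr i64 %x, 8
    %m = and i64 %s, 65535       ; bits [8,24), entirely in the low half
    %t = trunc i64 %m to i32
    ret i32 %t
  }
  ; equivalent to: and (lshr (trunc %x to i32), 8), 65535
  ```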
* [SelectionDAG] Teach LegalizeVectorOps to directly Expand CTTZ_ZERO_UNDEF/CTLZ_ZERO_UNDEF to CTTZ/CTLZ if those ops are Legal/Custom instead of deferring it to LegalizeOps.
  (Craig Topper, 2016-04-21; 1 file changed, -3/+5)

  This is needed to support CTTZ/CTLZ Custom correctly since LegalizeOps
  would be too late to do the custom lowering.
  llvm-svn: 266951
* [PPC, SSP] Support PowerPC Linux stack protection.
  (Tim Shen, 2016-04-19; 1 file changed, -13/+11)

  llvm-svn: 266809
* [SSP, 2/2] Create llvm.stackguard() intrinsic and lower it to LOAD_STACK_GUARD
  (Tim Shen, 2016-04-19; 2 files changed, -54/+51)

  With this change, ideally IR passes can always generate an llvm.stackguard
  call to get the stack guard; but for now there are still IR-form stack
  guard customizations around (see getIRStackGuard()). Future SSP
  customization should go through LOAD_STACK_GUARD.

  There is a behavior change: stack guard values are not CSEed anymore, since
  we should never reuse the value in case it has been spilled (and
  corrupted). See ssp-guard-spill.ll. This also causes changes in stack size
  and codegen in X86 and AArch64 test cases.

  Ideally we'd like to know if the guard created in llvm.stackprotector()
  gets spilled or not. If the value is spilled, discard the value and reload
  the stack guard; otherwise reuse the value. This can be done by teaching
  the register allocator how to rematerialize LOAD_STACK_GUARD and forcing a
  rematerialization (which seems hard), or by checking for spilling in
  expandPostRAPseudo. It only makes sense when the stack guard is a global
  variable, which requires more instructions to load. Anyway, this seems to
  go beyond the scope of the current patch.

  llvm-svn: 266806
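  A minimal sketch of the new intrinsic as seen in IR (the exact form here is
  an assumption based on the description above):

  ```
  declare i8* @llvm.stackguard()

  define i8* @read_guard() {
    %g = call i8* @llvm.stackguard()
    ret i8* %g
  }
  ```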
* [NFC] Header cleanup
  (Mehdi Amini, 2016-04-18; 5 files changed, -5/+0)

  Removed some unused headers, replaced some headers with forward class
  declarations.

  Found using simple scripts like this one:

      clear && ack --cpp -l '#include "llvm/ADT/IndexedMap.h"' | xargs grep -L 'IndexedMap[<]' | xargs grep -n --color=auto 'IndexedMap'

  Patch by Eugene Kosov <claprix@yandex.ru>
  Differential Revision: http://reviews.llvm.org/D19219
  From: Mehdi Amini <mehdi.amini@apple.com>
  llvm-svn: 266595
* Switch lowering: don't add incoming PHI values from skipped bit test MBB's (PR27135)
  (Hans Wennborg, 2016-04-15; 1 file changed, -12/+26)

  After r245976, LLVM will skip the last bit test case if it knows it will
  always be true. However, we would still erroneously update PHI nodes with
  incoming values from the MBB that would perform the final bit test, causing
  -verify-machineinstrs to fail.
  llvm-svn: 266479
* SelectionDAGISel: rangeify a loop
  (Hans Wennborg, 2016-04-15; 1 file changed, -23/+20)

  llvm-svn: 266478
* Sink DI metadata usage out of MachineInstr.h and MachineInstrBuilder.h
  (Reid Kleckner, 2016-04-14; 1 file changed, -0/+1)

  MachineInstr.h and MachineInstrBuilder.h are very popular headers, widely
  included across all LLVM backends. It turns out that there are only a
  handful of TUs that actually care about DI operands on MachineInstrs.

  After this change, touching DebugInfoMetadata.h and rebuilding llc only
  needs 112 actions instead of 542.
  llvm-svn: 266351
* [CodeGen] Teach LLVM how to lower @llvm.{min,max}num to {MIN,MAX}NAN
  (David Majnemer, 2016-04-14; 1 file changed, -6/+16)

  The behavior of {MIN,MAX}NAN differs from that of {MIN,MAX}NUM when only
  one of the inputs is NaN: -NUM will return the non-NaN argument while -NAN
  would return NaN.

  It is desirable to lower @llvm.{min,max}num to -NAN if the target doesn't
  have a native instruction for -NUM. Notably, ARMv7 NEON's vmin has the -NAN
  semantics.

  N.B. Of course, it is only safe to do this if the intrinsic call is marked
  nnan.
  llvm-svn: 266279
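  A hedged sketch of the safe case: with nnan on the call, the
  NaN-propagation difference is unobservable, so a MINNAN-style instruction
  such as NEON vmin can implement it (the exact flag placement here is an
  assumption):

  ```
  define float @min_nnan(float %a, float %b) {
    %r = call nnan float @llvm.minnum.f32(float %a, float %b)
    ret float %r
  }
  declare float @llvm.minnum.f32(float, float)
  ```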
* AMDGPU: Implement canonicalize
  (Matt Arsenault, 2016-04-14; 2 files changed, -1/+4)

  Also add generic DAG node for it.
  llvm-svn: 266272
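  For reference, a minimal use of the intrinsic the new node backs (showing
  the f32 flavor as an assumption):

  ```
  define float @canon(float %x) {
    %c = call float @llvm.canonicalize.f32(float %x)
    ret float %c
  }
  declare float @llvm.canonicalize.f32(float)
  ```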
* TargetLowering: Factor out common code for tail call eligibility checking; NFC
  (Matthias Braun, 2016-04-14; 1 file changed, -0/+27)

  llvm-svn: 266270
* Cleanup Store Merging in UseAA case
  (Nirav Dave, 2016-04-13; 1 file changed, -30/+44)

  This patch fixes a bug (PR26827) when using anti-aliasing in store merging.
  This sets the chain users of the component stores to point to the new store
  instead of the component stores' chain parent.

  Reviewers: jyknight
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D18909
  llvm-svn: 266217
* [CodeGen] Remove constant-folding dead code. NFC.
  (Ahmed Bougacha, 2016-04-12; 1 file changed, -12/+4)

  This code was specific to vector operations with scalar operands: all the
  opcodes in FoldValue (via FoldConstantArithmetic) can't match those
  criteria. Replace it with an assert if that ever changes: at that point, we
  might need to add back a splat BUILD_VECTOR.
  llvm-svn: 266100
* Introduce a GCRelocateInst class [NFC]
  (Philip Reames, 2016-04-12; 3 files changed, -7/+6)

  Previously, we were using isGCRelocate predicates. Using a subclass of
  IntrinsicInst is far more idiomatic. The refactoring also enables a couple
  of minor simplifications and code sharing.
  llvm-svn: 266098
* [DAGCombiner] Fold xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B)) any time before LegalizeVectorOps
  (Simon Pilgrim, 2016-04-11; 1 file changed, -2/+2)

  xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B)) was only being
  combined at the AfterLegalizeTypes stage; this patch permits the combine to
  occur anytime before then as well.

  The main aim of this is to improve the ability to recognise bitmasks that
  can be converted to shuffles.

  I had to modify a number of AVX512 mask tests as the basic bitcast to/from
  scalar pattern was being stripped out, preventing testing of the mmask
  bitops. By replacing the bitcasts with loads we can get almost the same
  result.

  Differential Revision: http://reviews.llvm.org/D18944
  llvm-svn: 265998
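  The shape of the fold, as a hedged sketch: a logic op on bitcasts becomes a
  bitcast of the logic op in the original type.

  ```
  define <2 x i64> @fold(<4 x i32> %a, <4 x i32> %b) {
    %ba = bitcast <4 x i32> %a to <2 x i64>
    %bb = bitcast <4 x i32> %b to <2 x i64>
    %x  = xor <2 x i64> %ba, %bb
    ret <2 x i64> %x
    ; -> bitcast (xor <4 x i32> %a, %b) to <2 x i64>
  }
  ```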
* TargetRegisterInfo: Add getRegAsmName()
  (Tom Stellard, 2016-04-11; 1 file changed, -1/+1)

  Summary: The motivation for this new function is to move an invalid
  assumption about the relationship between the names of register
  definitions in tablegen files and their assembly names into
  TargetRegisterInfo, so that we can begin working on fixing this assumption.

  The current problem is that if you have a register definition in TableGen
  like:

      def MYReg0 : Register<"r0", 0>;

  The function TargetLowering::getRegForInlineAsmConstraint() derives the
  assembly name from the tablegen name: "MyReg0" rather than the given
  assembly name "r0". This is working, because on most targets the tablegen
  name and the assembly names are case-insensitive matches for each other
  (e.g. def EAX : X86Reg<"eax", ...>).

  getRegAsmName() will allow targets to override this default assumption and
  return the correct assembly name.

  Reviewers: echristo, hfinkel
  Subscribers: SamWot, echristo, hfinkel, llvm-commits
  Differential Revision: http://reviews.llvm.org/D15614
  llvm-svn: 265955
* [x86] use BMI 'andn' for logic + compare ops
  (Sanjay Patel, 2016-04-09; 1 file changed, -0/+4)

  With BMI, we can use 'andn' to save an instruction when the result is only
  used in a compare. This is related to one of the potential sequences to
  check 'isfinite' in: https://llvm.org/bugs/show_bug.cgi?id=27164

  Differential Revision: http://reviews.llvm.org/D18910
  llvm-svn: 265875
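  A minimal sketch of the shape this targets: x & ~y consumed only by a
  compare, which BMI can do in one 'andn'.

  ```
  define i1 @is_clear(i32 %x, i32 %y) {
    %ny = xor i32 %y, -1
    %a  = and i32 %x, %ny
    %c  = icmp eq i32 %a, 0
    ret i1 %c
  }
  ```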
* [SSP] Remove llvm.stackprotectorcheck.
  (Tim Shen, 2016-04-08; 3 files changed, -30/+16)

  This is a cleanup patch for SSP support in LLVM. There is no functional
  change. llvm.stackprotectorcheck is not needed, because SelectionDAG isn't
  actually lowering it in SelectBasicBlock; rather, it adds check code in
  FinishBasicBlock, ignoring the position where the intrinsic is inserted
  (see FindSplitPointForStackProtector()).
  llvm-svn: 265851
* Fix Load Control Dependence in MemCpy Generation
  (Nirav Dave, 2016-04-08; 2 files changed, -57/+1)

  In memcpy lowering we had missed a dependence from the load of the
  operation to successor operations. This causes us to potentially construct
  an initial DAG with a memory dependence not fully represented in the chain
  sub-DAG but rather requiring looking at the entire DAG, breaking alias
  analysis by allowing incorrect repositioning of memory operations.

  To work around this, r200033 changed DAGCombiner::GatherAllAliases to be
  conservative whenever an issue could possibly arise. Unfortunately this
  check forbade many non-problematic situations as well. For example, it's
  common for incoming argument lowering to add a non-aliasing load hanging
  off of EntryNode. Then, if GatherAllAliases visited EntryNode, it would
  find that other (unvisited) use of the EntryNode chain, and just give up
  entirely.

  Furthermore, the check was incomplete: it would not actually detect all
  such potentially problematic DAG constructions, because GatherAllAliases
  did not guarantee to visit all chain nodes going up to the root EntryNode.
  This is in general fine -- giving up early will just miss a potential
  optimization, not generate incorrect results. But, for this non-chain
  dependency detection code, it's possible that you could have a load
  attached to a higher-up chain node than any which were visited. If that
  load aliases your store, but the only dependency is through the value
  operand of a non-aliasing store, it would've been missed by this code, and
  potentially reordered.

  With the dependence added, this check can be removed and alias analysis can
  be much more aggressive. This fixes a code quality regression in the
  Consecutive Store Merge cleanup (D14834).

  Test change: ppc64-align-long-double.ll now may see multiple serializations
  of its stores.

  Differential Revision: http://reviews.llvm.org/D18062
  llvm-svn: 265836
* NFC: make AtomicOrdering an enum class
  (JF Bastien, 2016-04-06; 1 file changed, -1/+1)

  Summary: In the context of http://wg21.link/lwg2445 C++ uses the concept of
  'stronger' ordering but doesn't define it properly. This should be fixed in
  C++17 barring a small question that's still open.

  The code currently plays fast and loose with the AtomicOrdering enum. Using
  an enum class is one step towards tightening things. I later also want to
  tighten related enums, such as clang's AtomicOrderingKind (which should be
  shared with LLVM as a 'C++ ABI' enum).

  This change touches a few lines of code which can be improved later, I'd
  like to keep it as NFC for now as it's already quite complex. I have
  related changes for clang.

  As a follow-up I'll add:

      bool operator<(AtomicOrdering, AtomicOrdering) = delete;
      bool operator>(AtomicOrdering, AtomicOrdering) = delete;
      bool operator<=(AtomicOrdering, AtomicOrdering) = delete;
      bool operator>=(AtomicOrdering, AtomicOrdering) = delete;

  This is separate so that clang and LLVM changes don't need to be in sync.

  Reviewers: jyknight, reames
  Subscribers: jyknight, llvm-commits
  Differential Revision: http://reviews.llvm.org/D18775
  llvm-svn: 265602
* Lower @llvm.experimental.deoptimize as a noreturn call
  (Sanjoy Das, 2016-04-06; 3 files changed, -7/+35)

  While preserving the return value for @llvm.experimental.deoptimize at the
  IR level is useful during mid-level optimization, doing so at the machine
  instruction level requires generating some extra code and a return that is
  non-ideal. This change has LLVM lower

  ```
    %val = call @llvm.experimental.deoptimize
    ret %val
  ```

  to effectively

  ```
    call @__llvm_deoptimize()
    unreachable
  ```

  instead.
  llvm-svn: 265502
* Swift Calling Convention: swifterror target-independent change.
  (Manman Ren, 2016-04-05; 5 files changed, -4/+353)

  At the IR level, the swifterror argument is an input argument with type
  ErrorObject**. For targets that support swifterror, we want to optimize it
  to behave as an inout value with type ErrorObject*; it will be passed in a
  fixed physical register.

  The main idea is to track the virtual registers for each swifterror value.
  We define swifterror values as AllocaInsts with the swifterror attribute or
  function arguments with the swifterror attribute.

  In SelectionDAGISel.cpp, we set up swifterror values (SwiftErrorVals)
  before handling the basic blocks. When iterating over all basic blocks in
  RPO, before actually visiting the basic block, we call
  mergeIncomingSwiftErrors to merge incoming swifterror values when there are
  multiple predecessors or to simply propagate them. There, we create a
  virtual register for each swifterror value in the entry block. For
  predecessors that are not yet visited, we create virtual registers to hold
  the swifterror values at the end of the predecessor. The assignments are
  saved in SwiftErrorWorklist and will be materialized at the end of visiting
  the basic block.

  When visiting a load from a swifterror value, we copy from the current
  virtual register assignment. When visiting a store to a swifterror value,
  we create a virtual register to hold the swifterror value and update
  SwiftErrorMap to track the current virtual register assignment.

  Differential Revision: http://reviews.llvm.org/D18108
  llvm-svn: 265433
* fix typos; NFC
  (Sanjay Patel, 2016-04-04; 1 file changed, -2/+2)

  llvm-svn: 265356
* Swift Calling Convention: add swifterror attribute.
  (Manman Ren, 2016-04-01; 3 files changed, -0/+9)

  A ``swifterror`` attribute can be applied to a function parameter or an
  AllocaInst. This commit does not include any target-specific change. The
  target-specific optimization will come as a follow-up patch.

  Differential Revision: http://reviews.llvm.org/D18092
  llvm-svn: 265189
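  A minimal sketch of the attribute on a parameter (the %swift_error struct
  type is a placeholder, not from the patch):

  ```
  %swift_error = type { i64, i8 }

  define float @foo(%swift_error** swifterror %error_ptr) {
    ret float 0.0
  }
  ```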
* Don't use an i64 return type with webkit_jscc
  (Sanjoy Das, 2016-04-01; 1 file changed, -6/+7)

  Re-enable an assertion enabled by Justin Lebar in rL265092. rL265092 was
  breaking test/CodeGen/X86/deopt-intrinsic.ll because webkit_jscc does not
  like non-i64 return types. Change the test case to not do that.
  llvm-svn: 265099
* Revert "Protect some assertions with NDEBUG rather than DEBUG()."Justin Lebar2016-04-011-7/+6
| | | | | | This reverts r265092, because it breaks CodeGen/X86/deopt-intrinsic.ll. llvm-svn: 265093
* Protect some assertions with NDEBUG rather than DEBUG().
  (Justin Lebar, 2016-04-01; 1 file changed, -6/+7)

  DEBUG() only runs if you pass -debug, but these assertions are generally
  useful.
  llvm-svn: 265092
* fix typo; NFC
  (Sanjay Patel, 2016-03-31; 1 file changed, -2/+1)

  llvm-svn: 265054
* Prevent X86ISelLowering from merging volatile loads
  (Nirav Dave, 2016-03-31; 2 files changed, -14/+9)

  Change isConsecutiveLoads to check that loads are non-volatile as this is a
  requirement for any load merges. Propagate change to two callers.

  Reviewers: RKSimon
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D18546
  llvm-svn: 265013
* LegalizeDAG: Don't replace vector store with integer if not legal
  (Matt Arsenault, 2016-03-30; 3 files changed, -41/+87)

  For the same reason as the corresponding load change. Note that ExpandStore
  is completely broken for non-byte-sized element vector stores, but preserve
  the current broken behavior which has tests for it. The behavior should be
  the same, but now introduces a new typed store that is incorrectly split
  later rather than doing it directly.
  llvm-svn: 264928