path: root/llvm/lib/CodeGen/SelectionDAG
* SelectionDAG: create correct BooleanContent constants (Tim Northover, 2013-09-06; 2 files, -2/+11)
  Occasionally DAGCombiner can spot that a SETCC operation is completely redundant and reduce it to "all true" or "all false". If this happens to a vector, the value produced has to take account of what a normal comparison would have produced, which may be an all-1s bitmask.
  The fix in SelectionDAG.cpp is tested; as far as I can see, the code in TargetLowering.cpp is possibly unreachable and almost certainly irrelevant when triggered, so there are no tests for it. I believe it is still clearly the right change and may save someone else some hassle if it suddenly becomes reachable, so I'm doing it anyway.
  llvm-svn: 190147
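  A minimal C++ sketch of the idea (illustrative only, not the DAGCombiner code): the lane value used for a folded "true" result depends on whether the target's vector booleans are 0/1 or 0/all-ones.
    #include <cstdint>

    // Hypothetical helper: the constant each lane of a folded vector SETCC
    // should produce for "true", given the target's boolean contents.
    int32_t foldedTrueLane(bool ZeroOrNegativeOneBooleans) {
      return ZeroOrNegativeOneBooleans ? -1 : 1; // all-1s bitmask vs. plain 1
    }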
* Use TargetSubtargetInfo::useAA() in DAGCombine (Hal Finkel, 2013-08-29; 1 file, -3/+10)
  This uses the TargetSubtargetInfo::useAA() function to control the defaults of the -combiner-alias-analysis and -combiner-global-alias-analysis options.
  llvm-svn: 189564
* Fix a typo and coding style of a previous commit. No functional change. (Juergen Ributzka, 2013-08-28; 1 file, -3/+2)
  llvm-svn: 189526
* DAGCombiner: make sure or/shl/srl really has zero high bits before forming bswap (Tim Northover, 2013-08-27; 1 file, -6/+24)
  We want to convert code like (or (srl N, 8), (shl N, 8)) into (srl (bswap N), const), but this is only valid if the bits above 16 in the source pattern are 0; the checks we were doing for this were slightly wrong before.
  llvm-svn: 189348
* Remove an over-zealous assertion. A pointer type could be illegal if the target is prepared to custom-legalize pointer operands. (Owen Anderson, 2013-08-27; 1 file, -1/+0)
  This assertion was evaluated before the target would have a chance to do so, making it impossible.
  llvm-svn: 189299
* SelectionDAG: Remove unnecessary uses of TargetLowering::getPointerTy() (Tom Stellard, 2013-08-26; 7 files, -43/+50)
  If we have a binary operation like ISD::ADD, we can set the result type equal to the result type of one of its operands rather than using TargetLowering::getPointerTy(). Also, any use of DAG.getIntPtrConstant(C) as an operand for a binary operation can be replaced with DAG.getConstant(C, OtherOperand.getValueType()).
  llvm-svn: 189227
* SelectionDAG: Use correct pointer size when splitting vector stores (Tom Stellard, 2013-08-26; 1 file, -1/+1)
  llvm-svn: 189224
* SelectionDAG: Use correct pointer size when lowering function arguments v2 (Tom Stellard, 2013-08-26; 1 file, -5/+8)
  This adds minimal support to the SelectionDAG for handling address spaces with different pointer sizes. The SelectionDAG should now correctly lower pointer function arguments to the correct size as well as generate the correct code when lowering getelementptr. This patch also updates the R600 DataLayout to use 32-bit pointers for the local address space.
  v2:
  - Add more helper functions to TargetLoweringBase
  - Use CHECK-LABEL for tests
  llvm-svn: 189221
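  A minimal sketch of the query this enables, assuming LLVM's DataLayout interface (illustrative, not code from the patch):
    #include "llvm/IR/DataLayout.h"

    // With per-address-space pointer sizes in the DataLayout, lowering can ask
    // for the pointer width of a specific address space instead of assuming
    // the default one.
    static unsigned pointerBitsFor(const llvm::DataLayout &DL, unsigned AddrSpace) {
      return DL.getPointerSizeInBits(AddrSpace);
    }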
* Add a function object to compare the first or second component of a std::pair. (Benjamin Kramer, 2013-08-24; 1 file, -10/+1)
  Replace instances of this scattered around the code base.
  llvm-svn: 189169
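  A sketch of what such a comparator looks like (names here are illustrative, not necessarily the ones the commit added):
    #include <algorithm>
    #include <utility>
    #include <vector>

    // Order pairs by their first component only.
    struct LessFirst {
      template <typename T, typename U>
      bool operator()(const std::pair<T, U> &L, const std::pair<T, U> &R) const {
        return L.first < R.first;
      }
    };

    int main() {
      std::vector<std::pair<int, char>> V = {{3, 'c'}, {1, 'a'}, {2, 'b'}};
      std::sort(V.begin(), V.end(), LessFirst()); // yields {1,a}, {2,b}, {3,c}
      return 0;
    }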
* [stack protector] Work around an issue with the BMOVPCB_CALL instruction on ARM by disabling "does not return" on __stack_chk_fail. (Michael Gottesman, 2013-08-22; 1 file, -1/+1)
  This is to fix the bots while I look to see if there is something I can do here.
  rdar://14811848
  llvm-svn: 189076
* [stackprotector] When finding the split point to splice off the end of a parent MBB into a success MBB, include any DBG_VALUE MIs. (Michael Gottesman, 2013-08-22; 1 file, -9/+44)
  Fix for PR16954.
  llvm-svn: 188987
* SelectionDAG: Make sure stores are always added to the LegalizedNodes list (Tom Stellard, 2013-08-21; 1 file, -1/+1)
  When truncated vector stores were being custom lowered in VectorLegalizer::LegalizeOp(), the old (illegal) and new (legal) node pair was not being added to the LegalizedNodes list. Instead of the legalized result being passed to VectorLegalizer::TranslateLegalizeResult(), the result was being passed back into VectorLegalizer::LegalizeOp(), which ended up adding a (new, new) pair to the list instead.
  This was causing an assertion failure when a custom lowered truncated vector store was the last instruction in a basic block and the VectorLegalizer was unable to find it in the LegalizedNodes list when updating the DAG root.
  llvm-svn: 188953
* Teach BaseIndexOffset::match to identify base pointers in loops. (Juergen Ributzka, 2013-08-21; 1 file, -2/+14)
  The small utility function that pattern matches Base + Index + Offset patterns for loads and stores fails to recognize the base pointer for loads/stores from/into an array at offset 0 inside a loop. As a result DAGCombiner::MergeConsecutiveStores was not able to merge all stores.
  This commit fixes the issue by adding an additional pattern match and also a test case.
  Reviewer: Nadav
  llvm-svn: 188936
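  The kind of source pattern described above, sketched in C++ (illustrative; the commit's actual test case may differ):
    #include <cstdint>

    // Four adjacent byte stores per iteration, addressed as base + index + {0,1,2,3}.
    // Recognizing the offset-0 store as well lets MergeConsecutiveStores combine
    // all four into one wider store.
    void fillPattern(uint8_t *Dst, int N) {
      for (int I = 0; I < N; I += 4) {
        Dst[I + 0] = 0x11;
        Dst[I + 1] = 0x22;
        Dst[I + 2] = 0x33;
        Dst[I + 3] = 0x44;
      }
    }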
* [SystemZ] Use SRST to optimize memchr (Richard Sandiford, 2013-08-20; 2 files, -0/+36)
  SystemZTargetLowering::emitStringWrapper() previously loaded the character into R0 before the loop and made R0 live on entry. I'd forgotten that allocatable registers weren't allowed to be live across blocks at this stage, and it confused LiveVariables enough to cause a miscompilation of f3 in memchr-02.ll.
  This patch instead loads R0 in the loop and leaves LICM to hoist it after RA. This is actually what I'd tried originally, but I went for the manual optimisation after noticing that R0 often wasn't being hoisted. This bug forced me to go back and look at why, now fixed as r188774.
  We should also try to optimize null checks so that they test the CC result of the SRST directly. The select between null and the SRST GPR result could then usually be deleted as dead.
  llvm-svn: 188779
* Remove unused variables that crept in. (Michael Gottesman, 2013-08-20; 1 file, -6/+0)
  llvm-svn: 188761
* Teach SelectionDAG how to handle the stackprotectorcheck intrinsic. (Michael Gottesman, 2013-08-20; 3 files, -4/+390)
  Previously, generation of stack protectors was done exclusively in the pre-SelectionDAG CodeGen LLVM IR pass "Stack Protector". This necessitated splitting basic blocks at the IR level to create the success/failure basic blocks in the tail of the basic block in question. As a result of this, calls that would have qualified for the sibling call optimization were no longer eligible for optimization since said calls were no longer right in the "tail position" (i.e. the immediate predecessor of a ReturnInst instruction).
  Then it was noticed that since the sibling call optimization causes the callee to reuse the caller's stack, if we could delay the generation of the stack protector check until later in CodeGen, after the sibling call decision was made, we get both the tail call optimization and the stack protector check!
  A few goals in solving this problem were:
  1. Preserve the architecture independence of stack protector generation.
  2. Preserve the normal IR level stack protector check for platforms like OpenBSD for which we support platform-specific stack protector generation.
  The main problem that guided the present solution is that one can not solve this problem in an architecture-independent manner at the IR level only. This is because:
  1. The decision on whether or not to perform a sibling call on certain platforms (for instance i386) requires lower-level information related to available registers that can not be known at the IR level.
  2. Even if the previous point were not true, the decision on whether to perform a tail call is done in LowerCallTo in SelectionDAG, which occurs after the Stack Protector pass. As a result, one would need to put the relevant callinst into the stack protector check success basic block (where the return inst is placed) and then move it back later at SelectionDAG/MI time before the stack protector check if the tail call optimization failed.
  The MI level option was nixed immediately since it would require platform-specific pattern matching. The SelectionDAG level option was nixed because SelectionDAG only processes one IR level basic block at a time, implying one could not create a DAG Combine to move the callinst.
  To get around this problem a few things were realized:
  1. While one can not handle multiple IR level basic blocks at the SelectionDAG level, one can generate multiple machine basic blocks for one IR level basic block. This is how we handle bit tests and switches.
  2. At the MI level, tail calls are represented via a special return MIInst called "tcreturn". Thus if we know the basic block in which we wish to insert the stack protector check, we get the correct behavior by always inserting the stack protector check right before the return statement. This is a "magical transformation" since no matter where the stack protector check intrinsic is, we always insert the stack protector check code at the end of the BB.
  Given the aforementioned constraints, the following solution was devised:
  1. On platforms that do not support SelectionDAG stack protector check generation, allow for the normal IR level stack protector check generation to continue.
  2. On platforms that do support SelectionDAG stack protector check generation:
     a. Use the IR level stack protector pass to decide if a stack protector is required and which BB we insert the stack protector check in, by reusing the logic already therein. If we wish to generate a stack protector check in a basic block, we place a special IR intrinsic called llvm.stackprotectorcheck right before the BB's returninst or, if there is a callinst that could potentially be sibling call optimized, before the call inst.
     b. Then when a BB with said intrinsic is processed, we codegen the BB normally via SelectBasicBlock. In said process, when we visit the stack protector check, we do not actually emit anything into the BB. Instead, we just initialize the stack protector descriptor class (which involves stashing information and creating the success MBB and the failure MBB if we have not created one for this function yet) and export the guard variable that we are going to compare.
     c. After we finish selecting the basic block, in FinishBasicBlock, if the StackProtectorDescriptor attached to the SelectionDAGBuilder is initialized, we first find a splice point in the parent basic block before the terminator and then splice the terminator of said basic block into the success basic block. Then we code-gen a new tail for the parent basic block consisting of the two loads, the comparison, and finally two branches to the success/failure basic blocks. We conclude by code-gening the failure basic block if we have not code-gened it already (all stack protector checks we generate in the same function use the same failure basic block).
  llvm-svn: 188755
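  A tiny C++ illustration of why the IR-level split mattered (hypothetical code, not from the commit): a call loses sibling-call eligibility as soon as anything is emitted between it and the return, which is exactly what deferring the check to CodeGen avoids.
    int callee(int X) { return X + 1; } // stand-in for any sibling-callable function

    int caller(int X) {
      // Eligible for the sibling call optimization only while nothing sits
      // between the call and the return; emitting the stack protector
      // compare-and-branch here at the IR level (the old scheme) destroys
      // that property, so the check is now emitted late, during CodeGen.
      return callee(X);
    }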
* Add an llvm.copysign intrinsic (Hal Finkel, 2013-08-19; 3 files, -0/+9)
  This adds an llvm.copysign intrinsic; we already have Libfunc recognition for copysign (which is turned into the FCOPYSIGN SDAG node). In order to autovectorize calls to copysign in the loop vectorizer, we need a corresponding intrinsic as well.
  In addition to the expected changes to the language reference, the loop vectorizer, BasicTTI, and the SDAG builder (the intrinsic is transformed into an FCOPYSIGN node, just like the function call), this also adds FCOPYSIGN to a few lists in LegalizeVector{Ops,Types} so that vector copysigns can be expanded.
  In TargetLoweringBase::initActions, I've made the default action for FCOPYSIGN be Expand for vector types. This seems correct for all in-tree targets, and I think it is the right thing to do because, previously, there was no way to generate vector-valued FCOPYSIGN nodes (and most targets don't specify an action for vector-typed FCOPYSIGN).
  llvm-svn: 188728
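  For example, a loop like the following (illustrative C++) is the kind of code the change targets: each copysign call can map onto the new intrinsic and then onto FCOPYSIGN nodes per lane.
    #include <cmath>
    #include <cstddef>

    // Apply the sign of Sgn[i] to the magnitude of Mag[i], element by element.
    void applySigns(double *Out, const double *Mag, const double *Sgn, std::size_t N) {
      for (std::size_t I = 0; I != N; ++I)
        Out[I] = std::copysign(Mag[I], Sgn[I]);
    }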
* Improve the widening of integral binary vector operations (Paul Redmond, 2013-08-19; 2 files, -10/+24)
  - split WidenVecRes_Binary into WidenVecRes_Binary and WidenVecRes_BinaryCanTrap
  - WidenVecRes_BinaryCanTrap preserves the original behaviour for operations that can trap
  - WidenVecRes_Binary simply widens the operation and improves codegen for 3-element vectors by allowing widening and promotion on x86 (matches the behaviour of unary and ternary operation widening)
  - use WidenVecRes_Binary for operations on integers
  Reviewed by: nrotem
  llvm-svn: 188699
* Add ExpandFloatOp_FCOPYSIGN to handle ppcf128-related expansions (Hal Finkel, 2013-08-19; 2 files, -0/+13)
  We had previously been asserting when faced with a FCOPYSIGN f64, ppcf128 node because there was no way to expand the FCOPYSIGN node. Because ppcf128 is the sum of two doubles, and the first double must have the larger magnitude, we can take the sign from the first double. As a result, in addition to fixing the crash, this is also an optimization.
  llvm-svn: 188655
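  A scalar sketch of the observation above (illustrative; the ppcf128 value is shown as an explicit high/low pair of doubles):
    #include <cmath>

    // Because |Hi| >= |Lo| in the double-double representation, the sign of the
    // whole ppcf128 value is the sign of Hi, so only the high double is needed.
    double copysignFromDoubleDouble(double Mag, double Hi, double /*Lo*/) {
      return std::copysign(Mag, Hi);
    }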
* ARM: Fix more fast-isel verifier failures. (Jim Grosbach, 2013-08-16; 1 file, -0/+4)
  Teach the generic instruction selection helper functions to constrain the register classes of their input operands. For non-physical register references, the generic code needs to be careful not to mess that up when replacing references to result registers. As the comment indicates for MachineRegisterInfo::replaceRegWith(), it's important to call constrainRegClass() first.
  rdar://12594152
  llvm-svn: 188593
* [SystemZ] Use SRST to implement strlen and strnlen (Richard Sandiford, 2013-08-16; 2 files, -0/+63)
  It would also make sense to use it for memchr; I'm working on that now.
  llvm-svn: 188547
* [SystemZ] Use MVST to implement strcpy and stpcpy (Richard Sandiford, 2013-08-16; 2 files, -0/+38)
  llvm-svn: 188546
* [SystemZ] Use CLST to implement strcmp (Richard Sandiford, 2013-08-16; 2 files, -0/+34)
  llvm-svn: 188544
* [SystemZ] Fix handling of 64-bit memcmp results (Richard Sandiford, 2013-08-16; 2 files, -19/+31)
  Generalize r188163 to cope with return types other than MVT::i32, just as the existing visitMemCmpCall code did. I've split this out into a subroutine so that it can be used for other upcoming patches.
  I also noticed that I'd used the wrong API to record the out chain. It's a load that uses DAG.getRoot() rather than getRoot(), so the out chain should go on PendingLoads. I don't have a testcase for that because we don't do any interesting scheduling on z yet.
  llvm-svn: 188540
* Replace getValueType().getSimpleVT() with getSimpleValueType(). (Craig Topper, 2013-08-15; 5 files, -15/+15)
  llvm-svn: 188442
* DAG: Combine (and (setne X, 0), (setne X, -1)) -> (setuge (add X, 1), 2) (Jim Grosbach, 2013-08-13; 1 file, -0/+13)
  A common idiom is to use zero and all-ones as sentinel values and to check for both in a single conditional ("x != 0 && x != (unsigned)-1"). That generates code, for i32, like:
    testl %edi, %edi
    setne %al
    cmpl $-1, %edi
    setne %cl
    andb %al, %cl
  With this transform, we generate the simpler:
    incl %edi
    cmpl $1, %edi
    seta %al
  Similar improvements for other integer sizes and on other platforms. In general, combining the two setcc instructions into one is better.
  rdar://14689217
  llvm-svn: 188315
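  The equivalence is easy to check at the source level (illustrative C++; unsigned wrap-around makes the rewritten form well defined):
    #include <cstdint>

    // Original idiom: reject both sentinels 0 and ~0u with two compares.
    bool notSentinel(uint32_t X) { return X != 0 && X != UINT32_MAX; }

    // Combined form: one increment plus one unsigned compare.
    // X == 0 -> 1 >= 2 is false; X == ~0u -> 0 >= 2 is false (wrap-around);
    // every other X -> X + 1 >= 2 is true, matching the original.
    bool notSentinelCombined(uint32_t X) { return X + 1u >= 2u; }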
* Update makeLibCall to return both the call and the chain associated with the libcall instead of just the call. (Michael Gottesman, 2013-08-13; 4 files, -63/+73)
  This allows us to specify libcalls that return void.
  LowerCallTo returns a pair with the return value of the call as the first element and the chain associated with the return value as the second element. If we lower a call that has a void return value, LowerCallTo returns an SDValue with a NULL SDNode and the chain for the call. Thus makeLibCall, by just returning the first value, makes it impossible for you to set up the chain so that the call is not eliminated as dead code.
  I also updated all references to makeLibCall to reflect the new return type.
  llvm-svn: 188300
* Fixed SelectionDAGBuilder.h C++ filetype declaration to use the canonical C++ instead of c++. (Michael Gottesman, 2013-08-12; 1 file, -1/+1)
  llvm-svn: 188203
* [SystemZ] Use CLC and IPM to implement memcmp (Richard Sandiford, 2013-08-12; 1 file, -0/+21)
  For now this is restricted to fixed-length comparisons with a length in the range [1, 256], as for memcpy() and MVC.
  llvm-svn: 188163
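  The kind of call that qualifies (illustrative C++): a comparison whose length is a constant in [1, 256].
    #include <cstring>

    // A compile-time length of 16 bytes falls inside the supported [1, 256]
    // range, so on SystemZ it can be lowered to CLC followed by IPM rather
    // than a library call.
    bool sameHeader(const char *A, const char *B) {
      return std::memcmp(A, B, 16) == 0;
    }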
* Change asserts at the top of getVectorShuffle to check that LHS and RHS have the same type as the result. (Craig Topper, 2013-08-09; 1 file, -6/+3)
  Previously the asserts were only checking that RHS and LHS were the same type and had the same element type as the result. All downstream code for ISD::VECTOR_SHUFFLE requires the types to be the same.
  Also removed one unnecessary check of matched element counts that was present in the code.
  llvm-svn: 188051
* Remove AllUndef check from one of the loops in getVectorShuffle. It was already handled by the 'AllLHS && AllRHS' check after the previous loop. (Craig Topper, 2013-08-08; 1 file, -5/+1)
  llvm-svn: 187965
* Optimize mask generation for one of the DAG combiner shufflevector cases. (Craig Topper, 2013-08-08; 1 file, -3/+3)
  llvm-svn: 187961
* Add ISD::FROUND for libm round() (Hal Finkel, 2013-08-07; 7 files, -1/+49)
  All libm floating-point rounding functions, except for round(), had their own ISD nodes. Recent PowerPC cores have an instruction for round(), and so here I'm adding ISD::FROUND so that round() can be custom lowered as well.
  For the most part, this is straightforward. I've added an intrinsic and a matching ISD node just like those for nearbyint() and friends. I've named the SelectionDAG pattern frnd (because ISD::FP_ROUND has already claimed fround).
  This will be used by the PowerPC backend in a follow-up commit.
  llvm-svn: 187926
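  For instance (illustrative C++), calls like these can now be expressed as ISD::FROUND during selection and custom lowered by targets, such as recent PowerPC cores, that have a native rounding instruction:
    #include <cmath>
    #include <cstddef>

    // Round each element to the nearest integer, halfway cases away from zero.
    void roundAll(double *V, std::size_t N) {
      for (std::size_t I = 0; I != N; ++I)
        V[I] = std::round(V[I]);
    }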
* TargetLowering: Add getVectorIdxTy() function v2 (Tom Stellard, 2013-08-05; 8 files, -77/+98)
  This virtual function can be implemented by targets to specify the type to use for the index operand of INSERT_VECTOR_ELT, EXTRACT_VECTOR_ELT, INSERT_SUBVECTOR, EXTRACT_SUBVECTOR. The default implementation returns the result from TargetLowering::getPointerTy().
  The previous code was using TargetLowering::getPointerTy() for vector indices, because this is guaranteed to be legal on all targets. However, using TargetLowering::getPointerTy() can be a problem for targets with pointer sizes that differ across address spaces. On such targets, when vectors need to be loaded or stored to an address space other than the default 'zero' address space (which is the address space assumed by TargetLowering::getPointerTy()), having an index that is a different size than the pointer can lead to inefficient pointer calculations (e.g. 64-bit adds for a 32-bit address space).
  There is no intended functionality change with this patch.
  llvm-svn: 187748
* Fix crashing on invalid inline asm with matching constraints. (Eric Christopher, 2013-07-31; 1 file, -15/+16)
  For a testcase like the following:
    typedef unsigned long uint64_t;
    typedef struct {
      uint64_t lo;
      uint64_t hi;
    } blob128_t;
    void add_128_to_128(const blob128_t *in, blob128_t *res) {
      asm ("PAND %1, %0" : "+Q"(*res) : "Q"(*in));
    }
  where we'll fail to allocate the register for the output constraint, our matching input constraint will not find a register to match, and could try to search past the end of the current operands array.
  On the idea that we'd like to attempt to keep compilation going to find more errors in the module, change the error cases when we're visiting inline asm IR to return immediately and avoid trying to create a node in the DAG. This leaves us with only a single error message per inline asm instruction, but allows us to safely keep going in the general case.
  llvm-svn: 187470
* Reflow this to be easier to read. (Eric Christopher, 2013-07-30; 1 file, -7/+5)
  llvm-svn: 187459
* [DAGCombiner] insert_vector_elt: Avoid building a vector twice. (Quentin Colombet, 2013-07-30; 1 file, -1/+3)
  This patch prevents the following combine when the input vector is used more than once:
    insert_vector_elt (build_vector elt0, ..., eltN), NewEltIdx, idx
    => build_vector elt0, ..., NewEltIdx, ..., eltN
  The reasons are:
  - Building a vector may be expensive, so try to reuse the existing part of a vector instead of creating a new one (think big vectors).
  - elt0 to eltN now have two users instead of one. This may prevent some other optimizations.
  llvm-svn: 187396
* Reimplement isPotentiallyReachable to make nocapture deduction much stronger. (Nick Lewycky, 2013-07-27; 1 file, -0/+1)
  Adds unit tests for it too.
  Split BasicBlockUtils into an analysis-half and a transforms-half, and put the analysis bits into a new Analysis/CFG.{h,cpp}. Promote isPotentiallyReachable into llvm::isPotentiallyReachable and move it into Analysis/CFG.
  llvm-svn: 187283
* Add a target legalize hook for SplitVectorOperand (again) (Justin Holewinski, 2013-07-26; 1 file, -0/+4)
  CustomLowerNode was not being called during SplitVectorOperand, meaning custom legalization could not be used by targets. This also adds a test case for NVPTX that depends on this custom legalization.
  Differential Revision: http://llvm-reviews.chandlerc.com/D1195
  Attempt to fix the buildbots by making the X86 test I just added platform independent.
  llvm-svn: 187202
* Revert "Add a target legalize hook for SplitVectorOperand"Rafael Espindola2013-07-261-4/+0
| | | | | | | | | | This reverts commit 187198. It broke the bots. The soft float test probably needs a -triple because of name differences. On the hard float test I am getting a "roundss $1, %xmm0, %xmm0", instead of "vroundss $1, %xmm0, %xmm0, %xmm0". llvm-svn: 187201
* Add a target legalize hook for SplitVectorOperand (Justin Holewinski, 2013-07-26; 1 file, -0/+4)
  CustomLowerNode was not being called during SplitVectorOperand, meaning custom legalization could not be used by targets. This also adds a test case for NVPTX that depends on this custom legalization.
  Differential Revision: http://llvm-reviews.chandlerc.com/D1195
  llvm-svn: 187198
* DAGCombiner: Pass the correct type to TargetLowering::isF(Abs|Neg)Free (Tom Stellard, 2013-07-23; 1 file, -2/+2)
  This commit also implements these functions for R600 and removes a test case that was relying on the buggy behavior.
  llvm-svn: 187007
* Add -*- C++ -*- to InstrEmitter.h. (Michael Gottesman, 2013-07-17; 1 file, -1/+1)
  llvm-svn: 186527
* Remove invalid assert in DAGTypeLegalizer::RemapValue (Hal Finkel, 2013-07-15; 1 file, -1/+4)
  There is a comment at the top of DAGTypeLegalizer::PerformExpensiveChecks which, in part, says:
    // Note that these invariants may not hold momentarily when processing a node:
    // the node being processed may be put in a map before being marked Processed.
  Unfortunately, this assert would be valid only if the above-mentioned invariant held unconditionally. This was causing llc to assert when, in fact, everything was fine.
  Thanks to Richard Sandiford for investigating this issue!
  Fixes PR16562.
  llvm-svn: 186338
* Add 'const' qualifier to some arrays. (Craig Topper, 2013-07-15; 1 file, -1/+2)
  llvm-svn: 186312
* Use SmallVectorImpl& instead of SmallVector to avoid repeating small vector size. (Craig Topper, 2013-07-14; 7 files, -44/+44)
  llvm-svn: 186274
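  A minimal sketch of the idiom, assuming LLVM's SmallVector ADT (illustrative, not code from the commit):
    #include "llvm/ADT/SmallVector.h"

    // Taking SmallVectorImpl<T>& decouples the interface from the caller's
    // inline element count, so the same function accepts SmallVector<int, 4>,
    // SmallVector<int, 16>, and so on.
    static void appendSquares(llvm::SmallVectorImpl<int> &Out, int N) {
      for (int I = 0; I < N; ++I)
        Out.push_back(I * I);
    }

    // Usage:
    //   llvm::SmallVector<int, 8> Buf;
    //   appendSquares(Buf, 4); // Buf = {0, 1, 4, 9}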
* Pass SmallVector by const reference instead of by value. (Craig Topper, 2013-07-13; 1 file, -2/+2)
  llvm-svn: 186243
* Remove trailing whitespace (Stephen Lin, 2013-07-10; 1 file, -2/+2)
  llvm-svn: 186032
* Un-break the buildbot by tweaking the indirection flag. (Adrian Prantl, 2013-07-10; 1 file, -2/+8)
  Pulled in a testcase from the debuginfo-test suite.
  llvm-svn: 185993
* Document a known limitation of the status quo. (Adrian Prantl, 2013-07-10; 1 file, -1/+3)
  llvm-svn: 185992