path: root/llvm/lib/CodeGen/SelectionDAG
...
* Move ExtractVectorElements to SelectionDAG. (Matt Arsenault, 2014-04-11; 1 file, -0/+16)
  This seems generally useful, and makes sense to go along with SplitVector.
  llvm-svn: 206041
* SelectionDAG: Use helper function to improve legalization of ISD::MUL (Tom Stellard, 2014-04-11; 1 file, -0/+17)
  The TargetLowering::expandMUL() helper contains lowering code extracted from
  the DAGTypeLegalizer and allows the SelectionDAGLegalizer to expand more
  ISD::MUL patterns without having to use a library call.
  llvm-svn: 206037
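  The underlying expansion is the standard schoolbook split of one wide
  multiply into narrower ones; a minimal sketch of the arithmetic in plain
  C++ (the helper's actual code is not reproduced here):

    #include <cstdint>

    // Sketch: the low 64 bits of a 64x64-bit multiply built from 32-bit
    // parts, i.e. (aLo + aHi*2^32) * (bLo + bHi*2^32) mod 2^64.
    uint64_t mul64_from_parts(uint32_t aLo, uint32_t aHi,
                              uint32_t bLo, uint32_t bHi) {
      uint64_t lo = (uint64_t)aLo * bLo;      // full 64-bit low product
      uint32_t cross = aLo * bHi + aHi * bLo; // cross terms wrap mod 2^32
      return lo + ((uint64_t)cross << 32);    // high cross bits shift out
    }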
* SelectionDAG: Factor ISD::MUL lowering code out of DAGTypeLegalizer (Tom Stellard, 2014-04-11; 2 files, -67/+113)
  This code has been moved to a new function in the TargetLowering class called
  expandMUL(). The purpose of this is to be able to share lowering code between
  the SelectionDAGLegalize and DAGTypeLegalizer classes. No functional change
  intended.
  llvm-svn: 206036
* [c++11] Range'ify use list loops in InstrEmitter. (Jim Grosbach, 2014-04-11; 1 file, -9/+3)
  llvm-svn: 206015
* [c++11] Range'ify use list loops in DAGCombiner. (Jim Grosbach, 2014-04-11; 1 file, -18/+7)
  llvm-svn: 206014
* SelectionDAG: Don't constant fold target-specific nodes. (Jim Grosbach, 2014-04-09; 1 file, -0/+6)
  FoldConstantArithmetic() only knows how to deal with a few target-independent
  ISD opcodes. Bail early if it sees a target-specific ISD node. These nodes do
  funny things with operand types which may break the assumptions of the code
  that follows, and there's no actual folding that can be done anyway. For
  example, non-constant 256-bit vector shifts on X86 have a shift-amount
  operand that's a 128-bit v4i32 vector regardless of what the first operand
  type is, and that breaks the assumption that the operand types must match.
  rdar://16530923
  llvm-svn: 205937
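  Target-specific opcodes are numbered from ISD::BUILTIN_OP_END upward, so the
  early bail-out plausibly has a shape like the following (a sketch, not the
  exact guard from the commit):

    // Sketch: opcodes >= ISD::BUILTIN_OP_END belong to the target, so give
    // up before making any assumptions about their operand types.
    if (Opcode >= ISD::BUILTIN_OP_END)
      return SDValue(); // no folding attempted for target nodes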
* [DAGCombiner] DAG combine does not know how to combine indexed loads with sign/zero/any extensions. (Quentin Colombet, 2014-04-09; 1 file, -2/+5)
  However, a few places were not properly checking this property of the load
  and were turning an indexed load into a regular extended load. The indexed
  value was therefore lost during the process, which triggered an assertion.
  <rdar://problem/16389332>
  llvm-svn: 205923
* Bug 19348: Check for legal ExtLoad operation before folding (Matt Arsenault, 2014-04-08; 1 file, -9/+12)
  (aext (zextload x)) -> (aext (truncate (*extload x)))
  Patch by Stanislav Mekhanoshin!
  llvm-svn: 205805
* Put a limit on ScheduleDAGSDNodes::ClusterNeighboringLoads to avoid blowing up compile time. (Andrew Trick, 2014-04-07; 1 file, -1/+6)
  Fixes PR16365 - Extremely slow compilation in -O1 and -O2.
  The SD scheduler has a quadratic implementation of load clustering which
  absolutely blows up compile time for large blocks with constant pool loads.
  The MI scheduler has a better implementation of load clustering. However, we
  have not done the work yet to completely eliminate the SD scheduler. Some
  benchmarks still seem to benefit from early load clustering, although maybe
  by chance.
  As an intermediate term fix, I just put a nice limit on the number of DAG
  users to search before finding a match. With this limit there are no binary
  differences in the LLVM test suite, and the PR16365 test case does not
  suffer any compile time impact from this routine.
  llvm-svn: 205738
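  The shape of such a fix is generic enough to sketch; every name and the
  budget below are assumed for illustration, not taken from the commit:

    // Sketch: cap a potentially quadratic use-list scan with a fixed budget.
    unsigned Visited = 0;
    const unsigned MaxSearch = 16; // assumed budget, not the commit's value
    for (SDNode *User : Chain->uses()) {
      if (++Visited > MaxSearch)
        break; // give up rather than blow up compile time on huge blocks
      // ... check whether User is a neighboring load we can cluster ...
    }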
* Add DAG parameter to ComputeNumSignBitsForTargetNode (Matt Arsenault, 2014-04-04; 2 files, -1/+2)
  This way, you can check the number of sign bits in the operands. The depth
  parameter it already has is pretty useless without this.
  llvm-svn: 205649
* DAGLegalize: add last-ditch type-legalization for VSELECT. (Tim Northover, 2014-04-04; 2 files, -0/+16)
  When LLVM sees something like (v1iN (vselect v1i1, v1iN, v1iN)) it can decide
  that the result is OK (v1i64 is legal on AArch64, for example) but it still
  needs scalarising because of that v1i1. There was no code to do this though.
  AArch64 and ARM64 have DAG combines to produce efficient code and prevent
  that occurring in *most* such situations, but there are edge cases that they
  miss. This adds a legalization to cope with that.
  llvm-svn: 205626
* ARM64: handle v1i1 types arising from setcc properly. (Tim Northover, 2014-04-04; 1 file, -3/+15)
  There were several overlapping problems here, and this solution is closely
  inspired by the one adopted in AArch64 in r201381.
  Firstly, scalarisation of v1i1 setcc operations simply fails if the input
  types are legal. This is fixed in LegalizeVectorTypes.cpp this time, and
  allows AArch64 code to be simplified slightly.
  Second, vselect with such a setcc feeding into it ends up in
  ScalarizeVectorOperand, where it's not handled. I experimented with an
  implementation, but found that whatever DAG came out was rather horrific. I
  think Hao's DAG combine approach is a good one for quality, though there are
  edge cases it won't catch (to be fixed separately).
  Should fix PR19335.
  llvm-svn: 205625
* Make consistent use of MCPhysReg instead of uint16_t throughout the tree. (Craig Topper, 2014-04-04; 1 file, -1/+1)
  llvm-svn: 205610
* Fix for PR 19261: (Eric Christopher, 2014-04-03; 1 file, -2/+3)
  llc doesn't generate nodes for unconditional fall-through branches for
  targets without a FastISel implementation (X86 has one, but it can be
  disabled by "-fast-isel=false") in SelectionDAGBuilder::visitBr(). So for
  line 4 in the following testcase:
    1: void foo(int i){
    2:   switch(i){
    3:     default:
    4:       break;
    5:   }
    6:   return;
    7: }
  there is no corresponding line in the .debug_line section, and a debugger
  cannot set a breakpoint at line 4.
  Fix this by always emitting a branch when we're not optimizing, and add a
  testcase to ensure that there's code on every line we'd want to break on.
  Patch by Daniil Fukalov.
  llvm-svn: 205529
* Add comments and test case for [DAG] Keep the opaque constant flag when performing unary constant folding operations (r204737). (Juergen Ributzka, 2014-04-02; 1 file, -1/+5)
  llvm-svn: 205474
* Make isSetCCEquivalent respect the TargetBooleanContents (Matt Arsenault, 2014-04-01; 1 file, -19/+22)
  llvm-svn: 205336
* Add helpers for checking if a value is a target boolean constant. (Matt Arsenault, 2014-04-01; 1 file, -0/+48)
  llvm-svn: 205335
* Add an optional ability to expand larger BUILD_VECTORs with shuffles (Hal Finkel, 2014-03-31; 1 file, -20/+117)
  This adds the ability to expand large (meaning with more than two unique
  defined values) BUILD_VECTOR nodes in terms of SCALAR_TO_VECTOR and (legal)
  vector shuffles. There is now no limit on the size we are capable of
  expanding this way, although we don't currently do this for vectors with
  many unique values because of the default implementation of TLI's
  shouldExpandBuildVectorWithShuffles function.
  There is currently no functional change to any existing targets because the
  new capabilities are not used unless some target overrides the TLI
  shouldExpandBuildVectorWithShuffles function. As a result, I've not included
  a test case for the new functionality in this commit, but regression tests
  will (at least) be added soon when I commit support for the PPC QPX vector
  instruction set.
  The benefit of committing this now is that it makes the
  shouldExpandBuildVectorWithShuffles callback, which had to be added for
  other reasons regardless, fully functional. I suspect that other targets
  will also benefit from tuning the heuristic.
  llvm-svn: 205243
* Add a TLI hook to control when BUILD_VECTOR might be expanded using shuffles (Hal Finkel, 2014-03-31; 1 file, -1/+10)
  There are two general methods for expanding a BUILD_VECTOR node:
    1. Use SCALAR_TO_VECTOR on the defined scalar values and then shuffle them
       together.
    2. Build the vector on the stack and then load it.
  Currently, we use a fixed heuristic: if there are only one or two unique
  defined values, then we attempt an expansion in terms of SCALAR_TO_VECTOR
  and vector shuffles (provided that the required shuffle mask is legal).
  Otherwise, we always expand via the stack. Even when SCALAR_TO_VECTOR is not
  legal, this can still be a good idea depending on what tricks the target can
  play when lowering the resulting shuffle. If the target can't do anything
  special, however, and if SCALAR_TO_VECTOR is expanded via the stack, this
  heuristic leads to sub-optimal code (two stack loads instead of one).
  Because only the target knows whether the SCALAR_TO_VECTORs and shuffles for
  a build vector of a particular type are likely to be optimal, this adds a
  new TLI function: shouldExpandBuildVectorWithShuffles, which takes the
  vector type and the count of unique defined values. If this function returns
  true, then method (1) will be used, subject to the constraint that all of
  the necessary shuffles are legal (as determined by isShuffleMaskLegal). If
  this function returns false, then method (2) is always used.
  This commit does not enhance the current code to support expanding a
  build_vector with more than two unique values using shuffles, but I'll
  commit an implementation of the more-general case shortly.
  llvm-svn: 205230
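  A target opting in would override the hook along these lines (a sketch:
  only the hook's name and parameters come from the message above; the class
  name and threshold are invented for illustration):

    // Sketch: prefer SCALAR_TO_VECTOR + shuffles while the number of
    // distinct defined values stays small; otherwise fall back to the
    // stack expansion.
    bool MyTargetLowering::shouldExpandBuildVectorWithShuffles(
        EVT VT, unsigned DefinedValues) const {
      return DefinedValues <= 4; // assumed threshold for illustration
    }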
* Look at shuffles of build_vectors in DAGCombiner::visitEXTRACT_VECTOR_ELT (Hal Finkel, 2014-03-31; 1 file, -7/+24)
  When the loop vectorizer vectorizes code that uses the loop induction
  variable, we often end up with IR like this:
    %b1 = insertelement <2 x i32> undef, i32 %v, i32 0
    %b2 = shufflevector <2 x i32> %b1, <2 x i32> undef, <2 x i32> zeroinitializer
    %i = add <2 x i32> %b2, <i32 2, i32 3>
  If the add in this example is not legal (as is the case on PPC with VSX), it
  will be scalarized, and we'll end up with a number of extract_vector_elt
  nodes with the vector shuffle as the input operand, and that vector shuffle
  is fed by one or more build_vector nodes. By the time vector operations are
  expanded, visitEXTRACT_VECTOR_ELT will not create new extract_vector_elt
  nodes by looking through the vector shuffle (to make sure that no illegal
  operations are created), and so the extract_vector_elt -> vector shuffle ->
  build_vector chain is never simplified to an operand of the build vector.
  By looking at build_vectors through a shuffle we fix this particular
  situation, preventing a vector from being built, only to be deconstructed
  again (for the scalarized add) -- an expensive proposition when this all
  needs to be done via the stack. We probably want a more comprehensive fix
  here, where we look back recursively through any shuffles to any
  build_vectors or scalar_to_vectors, etc., but that can come later.
  llvm-svn: 205179
* Make use of previously generated stores in SelectionDAGLegalize::ExpandExtractFromVectorThroughStack (Hal Finkel, 2014-03-30; 1 file, -4/+33)
  When expanding EXTRACT_VECTOR_ELT and EXTRACT_SUBVECTOR using
  SelectionDAGLegalize::ExpandExtractFromVectorThroughStack, we store the
  entire vector and then load the piece we want. This is fine in isolation,
  but generating a new store (and corresponding stack slot) for each
  extraction ends up producing code of poor quality. When we scalarize a
  vector operation (using SelectionDAG::UnrollVectorOp for example) we
  generate one EXTRACT_VECTOR_ELT for each element in the vector. This used to
  generate one stored copy of the vector for each element in the vector. Now
  we search the uses of the vector for a suitable store before generating a
  new one, which results in much more efficient scalarization code.
  llvm-svn: 205153
* Avoid storing Twines. (Benjamin Kramer, 2014-03-29; 1 file, -22/+19)
  While there, moved the nested ifs into a helper function. No functionality
  change.
  llvm-svn: 205108
* Prevent alias from pointing to weak aliases. (Rafael Espindola, 2014-03-27; 1 file, -1/+1)
  This adds back r204781.
  Original message:
  Aliases are just another name for a position in a file. As such, the regular
  symbol resolutions are not applied. For example, given
    define void @my_func() {
      ret void
    }
    @my_alias = alias weak void ()* @my_func
    @my_alias2 = alias void ()* @my_alias
  We produce without this patch:
    .weak my_alias
    my_alias = my_func
    .globl my_alias2
    my_alias2 = my_alias
  That is, in the resulting ELF file my_alias, my_func and my_alias2 are just
  3 names pointing to offset 0 of .text.
  That is *not* the semantics of IR linking. For example, linking in a
    @my_alias = alias void ()* @other_func
  would require the strong my_alias to override the weak one and my_alias2
  would end up pointing to other_func.
  There is no way to represent that with aliases being just another name, so
  the best solution seems to be to just disallow it, converting a miscompile
  into an error.
  llvm-svn: 204934
* Add @llvm.clear_cache builtin (Renato Golin, 2014-03-26; 1 file, -0/+2)
  Implementing the LLVM part of the call to __builtin___clear_cache, which
  translates into an intrinsic @llvm.clear_cache and is lowered by each
  target, either to a call to __clear_cache or to nothing at all in case the
  caches are unified.
  Updating LangRef and adding some tests for the implemented architectures.
  Other archs will have to implement the method in case this builtin has to be
  compiled for them, since the default behaviour is to bail unimplemented.
  A Clang patch is required for the builtin to be lowered into the llvm
  intrinsic. This will be done next.
  llvm-svn: 204802
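  For context, the builtin takes a [begin, end) range of char pointers; a
  typical (hypothetical) use after emitting JIT-compiled code:

    #include <cstddef>

    // Sketch: make freshly written code visible to the instruction cache
    // before jumping to it, on targets where the caches are not unified.
    void finalize_jit_buffer(void *buf, std::size_t size) {
      char *begin = static_cast<char *>(buf);
      __builtin___clear_cache(begin, begin + size);
    }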
* Revert "Prevent alias from pointing to weak aliases."Rafael Espindola2014-03-261-1/+1
| | | | | | | | | This reverts commit r204781. I will follow up to with msan folks to see what is what they were trying to do with aliases to weak aliases. llvm-svn: 204784
* Prevent alias from pointing to weak aliases. (Rafael Espindola, 2014-03-26; 1 file, -1/+1)
  Aliases are just another name for a position in a file. As such, the regular
  symbol resolutions are not applied. For example, given
    define void @my_func() {
      ret void
    }
    @my_alias = alias weak void ()* @my_func
    @my_alias2 = alias void ()* @my_alias
  We produce without this patch:
    .weak my_alias
    my_alias = my_func
    .globl my_alias2
    my_alias2 = my_alias
  That is, in the resulting ELF file my_alias, my_func and my_alias2 are just
  3 names pointing to offset 0 of .text.
  That is *not* the semantics of IR linking. For example, linking in a
    @my_alias = alias void ()* @other_func
  would require the strong my_alias to override the weak one and my_alias2
  would end up pointing to other_func.
  There is no way to represent that with aliases being just another name, so
  the best solution seems to be to just disallow it, converting a miscompile
  into an error.
  llvm-svn: 204781
* [DAG] Keep the opaque constant flag when performing unary constant folding operations. (Juergen Ributzka, 2014-03-25; 1 file, -6/+12)
  Usually opaque constants shouldn't be folded, unless they are simple unary
  operations that don't create new constants. Even then, the folding shouldn't
  drop the opaque constant flag. This commit fixes that.
  Related to <rdar://problem/14774662>
  llvm-svn: 204737
* Fix creating illegal setcc cond codes. (Matt Arsenault, 2014-03-25; 1 file, -10/+18)
  If GT/UGT or LT/ULT were set to expand, a comparison with a constant would
  replace it with the illegal cond code. There are several more places later
  in this function that will have the same basic problem.
  Theoretically R600 should hit this problem in a test, but for some reason it
  doesn't.
  llvm-svn: 204727
* SelectionDAG: Allow promotion of SELECT nodes from float to int types (Tom Stellard, 2014-03-24; 1 file, -1/+2)
  And vice-versa, as long as the types are the same width. There are a few
  R600 tests that will cover this.
  llvm-svn: 204616
* remove a bunch of unused private methods (Nuno Lopes, 2014-03-23; 1 file, -10/+0)
  found with a smarter version of -Wunused-member-function that I'm playing
  with. Apologies in advance if I removed someone's WIP code.
    include/llvm/CodeGen/MachineSSAUpdater.h            |  1
    include/llvm/IR/DebugInfo.h                         |  3
    lib/CodeGen/MachineSSAUpdater.cpp                   | 10 --
    lib/CodeGen/PostRASchedulerList.cpp                 |  1
    lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp    | 10 --
    lib/IR/DebugInfo.cpp                                | 12 --
    lib/MC/MCAsmStreamer.cpp                            |  2
    lib/Support/YAMLParser.cpp                          | 39 ---------
    lib/TableGen/TGParser.cpp                           | 16 ---
    lib/TableGen/TGParser.h                             |  1
    lib/Target/AArch64/AArch64TargetTransformInfo.cpp   |  9 --
    lib/Target/ARM/ARMCodeEmitter.cpp                   | 12 --
    lib/Target/ARM/ARMFastISel.cpp                      | 84 --------------------
    lib/Target/Mips/MipsCodeEmitter.cpp                 | 11 --
    lib/Target/Mips/MipsConstantIslandPass.cpp          | 12 --
    lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp              | 21 -----
    lib/Target/NVPTX/NVPTXISelDAGToDAG.h                |  2
    lib/Target/PowerPC/PPCFastISel.cpp                  |  1
    lib/Transforms/Instrumentation/AddressSanitizer.cpp |  2
    lib/Transforms/Instrumentation/BoundsChecking.cpp   |  2
    lib/Transforms/Instrumentation/MemorySanitizer.cpp  |  1
    lib/Transforms/Scalar/LoopIdiomRecognize.cpp        |  8 -
    lib/Transforms/Scalar/SCCP.cpp                      |  1
    utils/TableGen/CodeEmitterGen.cpp                   |  2
    24 files changed, 2 insertions(+), 261 deletions(-)
  llvm-svn: 204560
* [DAG] Fix an assertion failure caused by an invalid cast in method 'BuildVectorSDNode::isConstantSplat' (Andrea Di Biagio, 2014-03-22; 2 files, -8/+7)
  This patch renames method 'isConstantSplat' as 'getConstantSplatValue'
  (mainly for consistency reasons), and rewrites its logic to ensure that we
  always perform a legal 'cast<ConstantSDNode>'.
  Added test shift-combine-crash.ll to verify that DAGCombiner no longer
  crashes with an assertion failure in the attempt to simplify a vector shift
  by a vector of all undef counts.
  llvm-svn: 204536
* Fix an assertion caused by using inline asm with indirect register inputs. (Kevin Qin, 2014-03-21; 1 file, -1/+1)
  llvm-svn: 204425
* Add support for scalarizing/splitting vector bswap. (Raul E. Silvera, 2014-03-18; 1 file, -0/+2)
  Summary: SLP vectorization of intrinsics (r203707) has exposed cases where
  the expansion of vector bswap is failing (PR19151).
  Reviewers: hfinkel
  CC: chandlerc
  Differential Revision: http://llvm-reviews.chandlerc.com/D3104
  llvm-svn: 204163
* [DAGCombiner] teach how to simplify xor/and/or nodes according to the following rules: (Andrea Di Biagio, 2014-03-18; 1 file, -21/+52)
  1) (AND (shuf (A, C, Mask), shuf (B, C, Mask)) -> shuf (AND (A, B), C, Mask)
  2) (OR  (shuf (A, C, Mask), shuf (B, C, Mask)) -> shuf (OR (A, B), C, Mask)
  3) (XOR (shuf (A, C, Mask), shuf (B, C, Mask)) -> shuf (XOR (A, B), V_0, Mask)
  4) (AND (shuf (C, A, Mask), shuf (C, B, Mask)) -> shuf (C, AND (A, B), Mask)
  5) (OR  (shuf (C, A, Mask), shuf (C, B, Mask)) -> shuf (C, OR (A, B), Mask)
  6) (XOR (shuf (C, A, Mask), shuf (C, B, Mask)) -> shuf (V_0, XOR (A, B), Mask)
  llvm-svn: 204160
* Make DAGCombiner work on vector bitshifts with constant splat vectors. (Matt Arsenault, 2014-03-17; 2 files, -137/+178)
  llvm-svn: 204071
* [VectorLegalizer/X86] Don't unvectorize fp_to_uint for v8f32->v8i16 (Adam Nemet, 2014-03-17; 1 file, -7/+41)
  Rather than LegalizeAction::Expand, this needs LegalizeAction::Promote to
  get promoted to fp_to_sint v8f32->v8i32. This is a legal operation on AVX.
  For that to work properly, we also need to teach the legalizer about the
  specific promotion required here. The default vector promotion uses
  bitcasting to a vector type of the same total size. We want to promote the
  vector element type instead, effectively widening the operation and then
  truncating the result. This is analogous to how int_to_fp is currently
  promoted.
  The change also factors out some code from the int_to_fp promotion code to
  ValueType::widenIntegerVectorElementType. This is now shared between
  int_to_fp and fp_to_int.
  There is no longer a need for the custom lowering of fp_to_sint f32->v8i16
  in X86. It can now go through the new target-independent fp_to_*int
  promotion logic.
  I also checked that no other target uses Promote for these ops yet, so there
  shouldn't be any unexpected change in behavior.
  Fixes <rdar://problem/16202247>
  llvm-svn: 204058
* Phase 2 of the great MachineRegisterInfo cleanup. (Owen Anderson, 2014-03-13; 1 file, -1/+1)
  This time, we're changing operator* on the by-operand iterators to return a
  MachineOperand& rather than a MachineInstr&. At this point they almost
  behave like normal iterators!
  Again, this requires making some existing loops more verbose, but should
  pave the way for the big range-based for-loop cleanups in the future.
  llvm-svn: 203865
* Phase 1 of refactoring the MachineRegisterInfo iterators to make them suitable for use with C++11 range-based for-loops. (Owen Anderson, 2014-03-13; 1 file, -3/+4)
  The gist of phase 1 is to remove the skipInstruction() and skipBundle()
  methods from these iterators, instead splitting each iterator into a version
  that walks operands, a version that walks instructions, and a version that
  walks bundles. This has the result of making some "clever" loops in
  lib/CodeGen more verbose, but also makes their iterator invalidation
  characteristics much more obvious to the casual reader. (Making them concise
  again in the future is a good motivating case for a pre-incrementing range
  adapter!)
  Phase 2 of this undertaking will consist of removing the getOperand() method
  and changing operator*() of the operand-walker to return a MachineOperand&.
  At that point, it should be possible to add range views for them that work
  as one might expect.
  llvm-svn: 203757
* Replace '#include ValueTypes.h' with forward declarations. (Patrik Hagglund, 2014-03-12; 1 file, -1/+1)
  In some cases the include is pushed "downstream" (or removed if unused).
  llvm-svn: 203644
* Remove copy ctors that did the same thing as the default one. (Benjamin Kramer, 2014-03-11; 1 file, -8/+0)
  The code added nothing but potentially disabled move semantics and made
  types non-trivially copyable.
  llvm-svn: 203563
* IR: add a second ordering operand to cmpxchg for failure (Tim Northover, 2014-03-11; 4 files, -15/+33)
  The syntax for "cmpxchg" should now look something like:
    cmpxchg i32* %addr, i32 42, i32 3 acquire monotonic
  where the second ordering argument gives the required semantics in the case
  that no exchange takes place. It should be no stronger than the first
  ordering constraint and cannot be either "release" or "acq_rel" (since no
  store will have taken place).
  rdar://problem/15996804
  llvm-svn: 203559
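  Programmatically, the same instruction is built through
  IRBuilder::CreateAtomicCmpXchg, which takes both orderings. A hedged sketch
  (the exact parameter list has shifted across LLVM versions, e.g. an
  alignment argument was added later, so treat it as illustrative):

    // Sketch: compare-and-swap with a weaker ordering on the failure path.
    Value *CAS = Builder.CreateAtomicCmpXchg(
        Addr, Expected, Desired,
        AtomicOrdering::Acquire,     // ordering if the exchange happens
        AtomicOrdering::Monotonic);  // if the compare fails: no store, so
                                     // no release semantics are possible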
* Fix non 2-space indentation. (Matt Arsenault, 2014-03-11; 1 file, -73/+73)
  llvm-svn: 203514
* [C++11] Add range based accessors for the Use-Def chain of a Value. (Chandler Carruth, 2014-03-09; 3 files, -14/+9)
  This requires a number of steps:
    1) Move value_use_iterator into the Value class as an implementation
       detail.
    2) Change it to actually be a *Use* iterator rather than a *User*
       iterator.
    3) Add an adaptor which is a User iterator that always looks through the
       Use to the User.
    4) Wrap these in Value::use_iterator and Value::user_iterator typedefs.
    5) Add the range adaptors as Value::uses() and Value::users().
    6) Update *all* of the callers to correctly distinguish between whether
       they wanted a use_iterator (and to explicitly dig out the User when
       needed), or a user_iterator which makes the Use itself totally opaque.
  Because #6 requires churning essentially everything that walked the Use-Def
  chains, I went ahead and added all of the range adaptors and switched them
  to range-based loops where appropriate. Also because the renaming requires
  at least churning every line of code, it didn't make any sense to split
  these up into multiple commits -- all of which would touch all of the same
  lines of code.
  The result is still not quite optimal. The Value::use_iterator is a nice
  regular iterator, but Value::user_iterator is an iterator over User*s rather
  than over the User objects themselves. As a consequence, it fits a bit
  awkwardly into the range-based world and it has the weird
  extra-dereferencing 'operator->' that so many of our iterators have. I think
  this could be fixed by providing something which transforms a range of T&s
  into a range of T*s, but that *can* be separated into another patch, and it
  isn't yet 100% clear whether this is the right move.
  However, this change gets us most of the benefit and cleans up a substantial
  amount of code around Use and User. =]
  llvm-svn: 203364
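  In practice the two adaptors added in step 5 read like this (a small
  sketch; V stands for any Value*):

    // Sketch: walk Use edges when the edge itself matters...
    for (Use &U : V->uses()) {
      User *Usr = U.getUser();          // dig out the User explicitly
      unsigned OpNo = U.getOperandNo(); // which operand of Usr is V
      // ...
    }
    // ...or walk the users directly when it doesn't.
    for (User *Usr : V->users()) {
      // ... inspect Usr ...
    }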
* [C++11] Add 'override' keyword to virtual methods that override their base class. (Craig Topper, 2014-03-08; 9 files, -30/+31)
  llvm-svn: 203339
* [DAGCombiner] Distribute TRUNC through AND in rotation amount (Adam Nemet, 2014-03-07; 1 file, -0/+16)
  This is already done for shifts. Allow it for rotations as well. E.g.:
    (rotl:i32 x, (trunc (and y, 31))) -> (rotl:i32 x, (and (trunc y), 31))
  Use the newly factored-out distributeTruncateThroughAnd.
  With this patch and some X86.td tweaks we should be able to remove redundant
  masking of the rotation amount like in the example above. HW implicitly
  performs this masking.
  The testcase will be added as part of the X86 patch.
  llvm-svn: 203316
* [DAGCombiner] Recognize another rotation idiom (Adam Nemet, 2014-03-07; 1 file, -0/+8)
  This is the new idiom:
    x<<(y&31) | x>>((0-y)&31)
  which is recognized as:
    x ROTL (y&31)
  The change refines matchRotateSub. In
    Neg & (OpSize - 1) == (OpSize - Pos) & (OpSize - 1),
  if Pos is Pos' & (OpSize - 1), we can just use Pos' instead of Pos.
  llvm-svn: 203315
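  In C++ source the matched idiom looks like this; since both shift amounts
  are masked into [0, 31], the expression is well defined for every y (a
  sketch for illustration):

    #include <cstdint>

    // Sketch: should now lower to a single 32-bit rotate-left instruction.
    uint32_t rotl_idiom(uint32_t x, uint32_t y) {
      return (x << (y & 31)) | (x >> ((0u - y) & 31));
    }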
* [DAGCombiner] Slightly improve readability of matchRotateSub (Adam Nemet, 2014-03-07; 1 file, -8/+9)
  Slightly change the wording in the function comment. Originally, it could be
  misunderstood as saying we turned the input into two subsequent rotates.
  Better connect the comment which talks about Mask and the code which uses
  LoBits. Renamed the variable to MaskLoBits.
  llvm-svn: 203314
* ISel: Make VSELECT selection terminate in cases where the condition type has to be split and the result type widened. (Arnold Schwaighofer, 2014-03-07; 1 file, -0/+11)
  When the condition of a vselect has to be split, it makes no sense to widen
  the vselect and thereby widen the condition. We end up in an endless loop of
  widening (vselect result type) and splitting (condition mask type) doing
  this. Instead, split both the condition and the vselect and widen the
  result.
  I ran this over the test suite with i686 and mattr=+sse and saw no
  regressions.
  Fixes PR18036.
  llvm-svn: 203311
* [X86] Teach the DAGCombiner how to fold an OR of two shufflevector nodes. (Andrea Di Biagio, 2014-03-06; 1 file, -0/+54)
  This patch teaches the DAGCombiner how to fold a binary OR between two
  shufflevectors into a single shuffle vector when possible.
  The rules are:
    1. fold (or (shuf A, V_0, MA), (shuf B, V_0, MB)) -> (shuf A, B, Mask1)
    2. fold (or (shuf A, V_0, MA), (shuf B, V_0, MB)) -> (shuf B, A, Mask2)
  The DAGCombiner can take advantage of the fact that OR is commutative and
  compute two possible shuffle masks (Mask1 and Mask2) for the resulting
  shuffle node.
  Before folding a dag according to either rule 1 or 2, DAGCombiner verifies
  that the resulting shuffle mask is legal for the target. DAGCombiner first
  tries to fold according to rule 1; if that is not possible, it then tries
  rule 2. If both Mask1 and Mask2 are illegal, we conservatively don't fold
  the OR instruction.
  llvm-svn: 203156
* R600: Fix extloads from i8 / i16 to i64. (Matt Arsenault, 2014-03-06; 1 file, -0/+15)
  This appears to only be working for global loads. Private and local break
  for other reasons.
  llvm-svn: 203135