bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	Convert getMemIntrinsicNode to take ArrayRef of SDValue instead of pointer ↵	Craig Topper	2014-04-26	2	-11/+9
	and size. llvm-svn: 207329
*	Convert SelectionDAG::getNode methods to use ArrayRef<SDValue>.	Craig Topper	2014-04-26	9	-249/+183
	llvm-svn: 207327
*	Remove an unused version of getMemIntrinsicNode and getNode. Additionally, ↵	Craig Topper	2014-04-26	1	-20/+0
	these were calling makeVTList with the pointers passed in, which were unlikely to belong to SelectionDAG and would likely have just been stack pointers. llvm-svn: 207326
*	Rip out X86-specific vector SDIV lowering, make the corresponding ↵	Benjamin Kramer	2014-04-26	1	-13/+24
	DAGCombiner transform work on vectors. llvm-svn: 207316
*	DAGCombiner: Turn divs of vector splats into vectorized multiplications.	Benjamin Kramer	2014-04-26	2	-24/+54
	Otherwise the legalizer would just scalarize everything. Support for mulhi in the targets isn't that great yet, so on most targets we get exactly the same scalarized output. Add a test for x86 vector udiv. I had to disable the mulhi nodes on ARM because there aren't any patterns for them. As far as I know, ARM has instructions for getting the high part of a multiply, so this should be fixed. llvm-svn: 207315
*	[DAG] During DAG legalization keep opaque constants even after expanding.	Juergen Ributzka	2014-04-26	1	-3/+8
	The included test case would return incorrect results, because the expansion of a shift with a constant shift amount of 0 would generate undefined behavior. This is because ExpandShiftByConstant assumes that all shifts by constants with a value of 0 have already been optimized away. This doesn't happen for opaque constants and usually this isn't a problem, because opaque constants won't take this code path - they are not supposed to. In the case that the opaque constant has to be expanded by the legalizer, the legalizer would drop the opaque flag. In this case we hit the limitations of ExpandShiftByConstant and create incorrect code. This commit fixes the legalizer by not dropping the opaque flag when expanding opaque constants and adding an assertion to ExpandShiftByConstant to catch this unsupported case in the future. This fixes <rdar://problem/16718472> llvm-svn: 207304
*	This reapplies r207235 with additional bugfixes caught by the msan	Adrian Prantl	2014-04-25	5	-33/+50
	buildbot - do not insert debug intrinsics before phi nodes.
	Debug info for optimized code: Support variables that are on the stack and described by DBG_VALUEs during their lifetime.
	Previously, when a variable was at a FrameIndex for any part of its lifetime, this would shadow all other DBG_VALUEs and only a single fbreg location would be emitted, which in fact is only valid for a small range and not the entire lexical scope of the variable. The included dbg-value-const-byref testcase demonstrates this.
	This patch fixes this by
	Local
	- emitting dbg.value intrinsics for allocas that are passed by reference
	- dropping all dbg.declares (they are now fully lowered to dbg.values)
	SelectionDAG
	- renamed constructors for SDDbgValue for better readability.
	- fix UserValue::match() to handle indirect values correctly
	- not inserting MMI table entries for dbg.values that describe allocas.
	- lowering dbg.values that describe allocas into indirect DBG_VALUEs.
	CodeGenPrepare
	- leaving dbg.values for an alloca where they are (see comment)
	Other
	- regenerated/updated instcombine.ll testcase and included source
	rdar://problem/16679879
	http://reviews.llvm.org/D3374
	llvm-svn: 207269
*	Revert "This reapplies r207130 with an additional testcase+and a missing ↵	Adrian Prantl	2014-04-25	5	-50/+33
	check for" This reverts commit 207235 to investigate msan buildbot breakage. llvm-svn: 207250
*	This reapplies r207130 with an additional testcase+and a missing check for	Adrian Prantl	2014-04-25	5	-33/+50
	AllocaInst that was missing in one location.
	Debug info for optimized code: Support variables that are on the stack and described by DBG_VALUEs during their lifetime.
	Previously, when a variable was at a FrameIndex for any part of its lifetime, this would shadow all other DBG_VALUEs and only a single fbreg location would be emitted, which in fact is only valid for a small range and not the entire lexical scope of the variable. The included dbg-value-const-byref testcase demonstrates this.
	This patch fixes this by
	Local
	- emitting dbg.value intrinsics for allocas that are passed by reference
	- dropping all dbg.declares (they are now fully lowered to dbg.values)
	SelectionDAG
	- renamed constructors for SDDbgValue for better readability.
	- fix UserValue::match() to handle indirect values correctly
	- not inserting MMI table entries for dbg.values that describe allocas.
	- lowering dbg.values that describe allocas into indirect DBG_VALUEs.
	CodeGenPrepare
	- leaving dbg.values for an alloca where they are (see comment)
	Other
	- regenerated/updated instcombine.ll testcase and included source
	rdar://problem/16679879
	http://reviews.llvm.org/D3374
	llvm-svn: 207235
*	Revert "This reapplies r207130 with an additional testcase+and a missing ↵	Adrian Prantl	2014-04-25	5	-50/+33
	check for" Typo in testcase. llvm-svn: 207166
*	This reapplies r207130 with an additional testcase+and a missing check for	Adrian Prantl	2014-04-25	5	-33/+50
	AllocaInst that was missing in one location.
	Debug info for optimized code: Support variables that are on the stack and described by DBG_VALUEs during their lifetime.
	Previously, when a variable was at a FrameIndex for any part of its lifetime, this would shadow all other DBG_VALUEs and only a single fbreg location would be emitted, which in fact is only valid for a small range and not the entire lexical scope of the variable. The included dbg-value-const-byref testcase demonstrates this.
	This patch fixes this by
	Local
	- emitting dbg.value intrinsics for allocas that are passed by reference
	- dropping all dbg.declares (they are now fully lowered to dbg.values)
	SelectionDAG
	- renamed constructors for SDDbgValue for better readability.
	- fix UserValue::match() to handle indirect values correctly
	- not inserting MMI table entries for dbg.values that describe allocas.
	- lowering dbg.values that describe allocas into indirect DBG_VALUEs.
	CodeGenPrepare
	- leaving dbg.values for an alloca where they are (see comment)
	Other
	- regenerated/updated instcombine.ll testcase and included source
	rdar://problem/16679879
	http://reviews.llvm.org/D3374
	llvm-svn: 207165
*	Revert "Debug info for optimized code: Support variables that are on the ↵	Adrian Prantl	2014-04-25	4	-43/+25
	stack and" This reverts commit 207130 for buildbot breakage. llvm-svn: 207162
*	Debug info for optimized code: Support variables that are on the stack and	Adrian Prantl	2014-04-24	4	-25/+43
	described by DBG_VALUEs during their lifetime.
	Previously, when a variable was at a FrameIndex for any part of its lifetime, this would shadow all other DBG_VALUEs and only a single fbreg location would be emitted, which in fact is only valid for a small range and not the entire lexical scope of the variable. The included dbg-value-const-byref testcase demonstrates this.
	This patch fixes this by
	Local
	- emitting dbg.value intrinsics for allocas that are passed by reference
	- dropping all dbg.declares (they are now fully lowered to dbg.values)
	SelectionDAG
	- renamed constructors for SDDbgValue for better readability.
	- fix UserValue::match() to handle indirect values correctly
	- not inserting MMI table entries for dbg.values that describe allocas.
	- lowering dbg.values that describe allocas into indirect DBG_VALUEs.
	CodeGenPrepare
	- leaving dbg.values for an alloca where they are (see comment)
	Other
	- regenerated/updated instcombine-intrinsics testcase and included source
	rdar://problem/16679879
	http://reviews.llvm.org/D3374
	llvm-svn: 207130
*	Fix an infinite loop bug in DAG Combine about keeping transferring between ↵	Hao Liu	2014-04-22	1	-9/+7
	ANY_EXTEND and SIGN_EXTEND. llvm-svn: 206873
*	[Modules] Remove potential ODR violations by sinking the DEBUG_TYPE	Chandler Carruth	2014-04-22	16	-16/+32
	define below all header includes in the lib/CodeGen/... tree. While the current modules implementation doesn't check for this kind of ODR violation yet, it is likely to grow support for it in the future. It also removes one layer of macro pollution across all the included headers. Other sub-trees will follow. llvm-svn: 206837
*	[Modules] Make Support/Debug.h modular. This requires it to not change	Chandler Carruth	2014-04-21	1	-0/+2
	behavior based on other files defining DEBUG_TYPE, which means it cannot define DEBUG_TYPE at all. This is actually better IMO as it forces folks to define relevant DEBUG_TYPEs for their files. However, it requires all files that currently use DEBUG(...) to define a DEBUG_TYPE if they don't already. I've updated all such files in LLVM and will do the same for other upstream projects.
	This still leaves one important change in how LLVM uses the DEBUG_TYPE macro going forward: we need to only define the macro after header files have been #include-ed. Previously, this wasn't possible because Debug.h required the macro to be pre-defined. This commit removes that. By defining DEBUG_TYPE after the includes two things are fixed:
	- Header files that need to provide a DEBUG_TYPE for some inline code can do so by defining the macro before their inline code and undef-ing it afterward so the macro does not escape.
	- We no longer have rampant ODR violations due to including headers with different DEBUG_TYPE definitions. This may be mostly an academic violation today, but with modules these types of violations are easy to check for and potentially very relevant.
	Where necessary to support headers with DEBUG_TYPE, I have moved the definitions below the includes in this commit. I plan to move the rest of the DEBUG_TYPE macros in LLVM in subsequent commits; this one is big enough.
	The comments in Debug.h, which were hilariously out of date already, have been updated to reflect the recommended practice going forward.
	llvm-svn: 206822
*	[Modules] Sink the DEBUG_TYPE macro out of LegalizeTypes.h and into the	Chandler Carruth	2014-04-21	6	-1/+5
	various .cpp files. This macro is inherently non-modular, and it wasn't even needed in this header file. llvm-svn: 206775
*	Fix unnecessary line break	Matt Arsenault	2014-04-21	1	-4/+2
	llvm-svn: 206772
*	Patch by Vadim Chugunov	Yaron Keren	2014-04-19	3	-3/+10
	Win64 stack unwinder gets confused when execution flow "falls through" after a call to 'noreturn' function. This fixes the "missing epilogue" problem by emitting a trap instruction for IR 'unreachable' on x86_x64-pc-windows. A secondary use for it would be for anyone wanting to make double-sure that 'noreturn' functions, indeed, do not return. llvm-svn: 206684
*	DAGCombiner: don't optimise non-existent litpool load	Tim Northover	2014-04-16	1	-1/+3
	This particular DAG combine is designed to kick in when both ConstantFPs will end up being loaded via a litpool; however, those nodes have a semi-legal status, dictated by isFPImmLegal, so in some cases there wouldn't have been a litpool in the first place. Don't try to be clever in those circumstances. Picked up while merging some AArch64 tests. llvm-svn: 206365
*	Convert SelectionDAG::getVTList to use ArrayRef	Craig Topper	2014-04-16	5	-18/+19
	llvm-svn: 206357
*	[C++11] More 'nullptr' conversion. In some cases just using a boolean check ↵	Craig Topper	2014-04-16	2	-14/+14
	instead of comparing to nullptr. llvm-svn: 206356
*	Make FastISel::SelectInstruction return before target specific fast-isel code	Akira Hatanaka	2014-04-15	1	-2/+8
	handles Intrinsic::trap if TargetOptions::TrapFuncName is set. This fixes a bug in which the trap function was not taken into consideration when a program was compiled without optimization (at -O0). <rdar://problem/16291933> llvm-svn: 206323
*	Revert r191049/r191059 as it can produce wrong code (see PR17975).	Robert Lougher	2014-04-15	1	-21/+0
	It has already been reverted on the 3.4 branch in r196521. llvm-svn: 206311
*	FastISel: constrain the RegClass of operands when emitting instructions.	Tim Northover	2014-04-15	1	-8/+47
	ARM64 suffered multiple -verify-machineinstr failures (principally over the xsp/xzr issue) because FastISel was completely ignoring which subset of the general-purpose registers each instruction required. More fixes are coming in ARM64 specific FastISel, but this should cover the generic problems. llvm-svn: 206283
*	Break PseudoSourceValue out of the Value hierarchy. It is now the root of ↵	Nick Lewycky	2014-04-15	3	-149/+55
	its own tree containing FixedStackPseudoSourceValue (which you can use isa/dyn_cast on) and MipsCallEntry (which you can't). Anything that needs to use either a PseudoSourceValue* or a Value* is strongly encouraged to use a MachinePointerInfo instead. llvm-svn: 206255
*	[C++11] More 'nullptr' conversion. In some cases just using a boolean check ↵	Craig Topper	2014-04-14	20	-425/+425
	instead of comparing to nullptr. llvm-svn: 206142
*	Reenable use of TBAA during CodeGen	Hal Finkel	2014-04-12	1	-7/+1
	We had disabled use of TBAA during CodeGen (even when otherwise using AA) because the ptrtoint/inttoptr used by CGP for address sinking caused BasicAA to miss basic type punning that it should catch (and, thus, we'd fail to override TBAA when we should). However, when AA is in use during CodeGen, CGP now uses normal GEPs and bitcasts, instead of ptrtoint/inttoptr, when doing address sinking. As a result, BasicAA should be able to make us do the right thing in the face of type-punning, and it seems safe to enable use of TBAA again. Self-hosting seems fine on PPC64/Linux on the P7, with TBAA enabled and -misched=shuffle. Note: We still don't update TBAA when merging stack slots, although because BasicAA should now catch all such cases, this is no longer a blocking issue. Nevertheless, I plan to commit code to deal with this properly in the near future. llvm-svn: 206093
*	Move ExtractVectorElements to SelectionDAG.	Matt Arsenault	2014-04-11	1	-0/+16
	This seems generally useful, and makes sense to go along with SplitVector. llvm-svn: 206041
*	SelectionDAG: Use helper function to improve legalization of ISD::MUL	Tom Stellard	2014-04-11	1	-0/+17
	The TargetLowering::expandMUL() helper contains lowering code extracted from the DAGTypeLegalizer and allows the SelectionDAGLegalizer to expand more ISD::MUL patterns without having to use a library call. llvm-svn: 206037
*	SelectionDAG: Factor ISD::MUL lowering code out of DAGTypeLegalizer	Tom Stellard	2014-04-11	2	-67/+113
	This code has been moved to a new function in the TargetLowering class called expandMUL(). The purpose of this is to be able to share lowering code between the SelectionDAGLegalize and DAGTypeLegalizer classes. No functionality change intended. llvm-svn: 206036
*	[c++11] Range'ify use list loops in InstrEmitter.	Jim Grosbach	2014-04-11	1	-9/+3
	llvm-svn: 206015
*	[c++11] Range'ify use list loops in DAGCombiner.	Jim Grosbach	2014-04-11	1	-18/+7
	llvm-svn: 206014
*	SelectionDAG: Don't constant fold target-specific nodes.	Jim Grosbach	2014-04-09	1	-0/+6
	FoldConstantArithmetic() only knows how to deal with a few target independent ISD opcodes. Bail early if it sees a target-specific ISD node. These nodes do funny things with operand types which may break the assumptions of the code that follows, and there's no actual folding that can be done anyway. For example, non-constant 256 bit vector shifts on X86 have a shift-amount operand that's a 128-bit v4i32 vector regardless of what the first operand type is and that breaks the assumption that the operand types must match. rdar://16530923 llvm-svn: 205937
*	[DAGCombiner] DAG combine does not know how to combine indexed loads with	Quentin Colombet	2014-04-09	1	-2/+5
	sign/zero/any extensions. However, a few places were not properly checking the property of the load and were turning an indexed load into a regular extended load. Therefore the indexed value was lost during the process and this was triggering an assertion. <rdar://problem/16389332> llvm-svn: 205923
*	Bug 19348: Check for legal ExtLoad operation before folding	Matt Arsenault	2014-04-08	1	-9/+12
	(aext (zextload x)) -> (aext (truncate (*extload x))) Patch by Stanislav Mekhanoshin! llvm-svn: 205805
*	Put a limit on ScheduleDAGSDNodes::ClusterNeighboringLoads to avoid blowing ↵	Andrew Trick	2014-04-07	1	-1/+6
	up compile time. Fixes PR16365 - Extremely slow compilation in -O1 and -O2. The SD scheduler has a quadratic implementation of load clustering which absolutely blows up compile time for large blocks with constant pool loads. The MI scheduler has a better implementation of load clustering. However, we have not done the work yet to completely eliminate the SD scheduler. Some benchmarks still seem to benefit from early load clustering, although maybe by chance. As an intermediate term fix, I just put a nice limit on the number of DAG users to search before finding a match. With this limit there are no binary differences in the LLVM test suite, and the PR16365 test case does not suffer any compile time impact from this routine. llvm-svn: 205738
*	Add DAG parameter to ComputeNumSignBitsForTargetNode	Matt Arsenault	2014-04-04	2	-1/+2
	This way, you can check the number of sign bits in the operands. The depth parameter it already has is pretty useless without this. llvm-svn: 205649
*	DAGLegalize: add last-ditch type-legalization for VSELECT.	Tim Northover	2014-04-04	2	-0/+16
	When LLVM sees something like (v1iN (vselect v1i1, v1iN, v1iN)) it can decide that the result is OK (v1i64 is legal on AArch64, for example) but it still needs scalarising because of that v1i1. There was no code to do this though. AArch64 and ARM64 have DAG combines to produce efficient code and prevent that occurring in most such situations, but there are edge cases that they miss. This adds a legalization to cope with that. llvm-svn: 205626
*	ARM64: handle v1i1 types arising from setcc properly.	Tim Northover	2014-04-04	1	-3/+15
	There were several overlapping problems here, and this solution is closely inspired by the one adopted in AArch64 in r201381. Firstly, scalarisation of v1i1 setcc operations simply fails if the input types are legal. This is fixed in LegalizeVectorTypes.cpp this time, and allows AArch64 code to be simplified slightly. Second, vselect with such a setcc feeding into it ends up in ScalarizeVectorOperand, where it's not handled. I experimented with an implementation, but found that whatever DAG came out was rather horrific. I think Hao's DAG combine approach is a good one for quality, though there are edge cases it won't catch (to be fixed separately). Should fix PR19335. llvm-svn: 205625
*	Make consistent use of MCPhysReg instead of uint16_t throughout the tree.	Craig Topper	2014-04-04	1	-1/+1
	llvm-svn: 205610
*	Fix for PR 19261:	Eric Christopher	2014-04-03	1	-2/+3
	llc doesn't generate nodes for unconditional fall-through branches for targets without a FastISel implementation (X86 has it, but can be disabled by "-fast-isel=false") in SelectionDAGBuilder::visitBr(). So for line 4 in the following testcase
	1: void foo(int i){
	2:   switch(i){
	3:   default:
	4:     break;
	5:   }
	6:   return;
	7: }
	there is no corresponding line in .debug_line section, and a debugger cannot set a breakpoint at line 4. Fix this by always emitting a branch when we're not optimizing and add a testcase to ensure that there's code on every line we'd want to break. Patch by Daniil Fukalov. llvm-svn: 205529
*	Add comments and test case for [DAG] Keep the opaque constant flag when ↵	Juergen Ributzka	2014-04-02	1	-1/+5
	performing unary constant folding operations (r204737). llvm-svn: 205474
*	Make isSetCCEquivalent respect the TargetBooleanContents	Matt Arsenault	2014-04-01	1	-19/+22
	llvm-svn: 205336
*	Add helpers for checking if a value is a target boolean constant.	Matt Arsenault	2014-04-01	1	-0/+48
	llvm-svn: 205335
*	Add an optional ability to expand larger BUILD_VECTORs with shuffles	Hal Finkel	2014-03-31	1	-20/+117
	This adds the ability to expand large (meaning with more than two unique defined values) BUILD_VECTOR nodes in terms of SCALAR_TO_VECTOR and (legal) vector shuffles. There is now no limit on the size we are capable of expanding this way, although we don't currently do this for vectors with many unique values because of the default implementation of TLI's shouldExpandBuildVectorWithShuffles function. There is currently no functional change to any existing targets because the new capabilities are not used unless some target overrides the TLI shouldExpandBuildVectorWithShuffles function. As a result, I've not included a test case for the new functionality in this commit, but regression tests will (at least) be added soon when I commit support for the PPC QPX vector instruction set. The benefit of committing this now is that it makes the shouldExpandBuildVectorWithShuffles callback, which had to be added for other reasons regardless, fully functional. I suspect that other targets will also benefit from tuning the heuristic. llvm-svn: 205243
*	Add a TLI hook to control when BUILD_VECTOR might be expanded using shuffles	Hal Finkel	2014-03-31	1	-1/+10
	There are two general methods for expanding a BUILD_VECTOR node:
	1. Use SCALAR_TO_VECTOR on the defined scalar values and then shuffle them together.
	2. Build the vector on the stack and then load it.
	Currently, we use a fixed heuristic: If there are only one or two unique defined values, then we attempt an expansion in terms of SCALAR_TO_VECTOR and vector shuffles (provided that the required shuffle mask is legal). Otherwise, always expand via the stack. Even when SCALAR_TO_VECTOR is not legal, this can still be a good idea depending on what tricks the target can play when lowering the resulting shuffle. If the target can't do anything special, however, and if SCALAR_TO_VECTOR is expanded via the stack, this heuristic leads to sub-optimal code (two stack loads instead of one).
	Because only the target knows whether the SCALAR_TO_VECTORs and shuffles for a build vector of a particular type are likely to be optimal, this adds a new TLI function: shouldExpandBuildVectorWithShuffles which takes the vector type and the count of unique defined values. If this function returns true, then method (1) will be used, subject to the constraint that all of the necessary shuffles are legal (as determined by isShuffleMaskLegal). If this function returns false, then method (2) is always used.
	This commit does not enhance the current code to support expanding a build_vector with more than two unique values using shuffles, but I'll commit an implementation of the more-general case shortly.
	llvm-svn: 205230
*	Look at shuffles of build_vectors in DAGCombiner::visitEXTRACT_VECTOR_ELT	Hal Finkel	2014-03-31	1	-7/+24
	When the loop vectorizer vectorizes code that uses the loop induction variable, we often end up with IR like this:
	  %b1 = insertelement <2 x i32> undef, i32 %v, i32 0
	  %b2 = shufflevector <2 x i32> %b1, <2 x i32> undef, <2 x i32> zeroinitializer
	  %i = add <2 x i32> %b2, <i32 2, i32 3>
	If the add in this example is not legal (as is the case on PPC with VSX), it will be scalarized, and we'll end up with a number of extract_vector_elt nodes with the vector shuffle as the input operand, and that vector shuffle is fed by one or more build_vector nodes. By the time that vector operations are expanded, visitEXTRACT_VECTOR_ELT will not create new extract_vector_elt by looking through the vector shuffle (to make sure that no illegal operations are created), and so the extract_vector_elt -> vector shuffle -> build_vector is never simplified to an operand of the build vector.
	By looking at build_vectors through a shuffle we fix this particular situation, preventing a vector from being built, only to be deconstructed again (for the scalarized add) -- an expensive proposition when this all needs to be done via the stack. We probably want a more comprehensive fix here where we look back recursively through any shuffles to any build_vectors or scalar_to_vectors, etc. but that can come later.
	llvm-svn: 205179
*	Make use of previously generated stores in ↵	Hal Finkel	2014-03-30	1	-4/+33
	SelectionDAGLegalize::ExpandExtractFromVectorThroughStack When expanding EXTRACT_VECTOR_ELT and EXTRACT_SUBVECTOR using SelectionDAGLegalize::ExpandExtractFromVectorThroughStack, we store the entire vector and then load the piece we want. This is fine in isolation, but generating a new store (and corresponding stack slot) for each extraction ends up producing code of poor quality. When we scalarize a vector operation (using SelectionDAG::UnrollVectorOp for example) we generate one EXTRACT_VECTOR_ELT for each element in the vector. This used to generate one stored copy of the vector for each element in the vector. Now we search the uses of the vector for a suitable store before generating a new one, which results in much more efficient scalarization code. llvm-svn: 205153
*	Avoid storing Twines.	Benjamin Kramer	2014-03-29	1	-22/+19
	While there, move nested ifs into a helper function. No functionality change. llvm-svn: 205108
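
Note on r207315 ("DAGCombiner: Turn divs of vector splats into vectorized multiplications"): the log entry depends on the usual trick of replacing an unsigned divide by a constant with a multiply-high and a shift. The sketch below is not LLVM code; it only illustrates the arithmetic on a four-lane vector whose divisor is the splat constant 3. The names mulhu32, udiv3, V4 and udivSplat3 are hypothetical, and the magic constant 0xAAAAAAAB (with a total shift of 33 bits) is the standard one for 32-bit unsigned division by 3, chosen here purely as an example.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// High 32 bits of a 32x32 -> 64-bit unsigned multiply.
// This plays the role that a "mulhi" (MULHU) node plays in the DAG.
constexpr std::uint32_t mulhu32(std::uint32_t a, std::uint32_t b) {
  return static_cast<std::uint32_t>((static_cast<std::uint64_t>(a) * b) >> 32);
}

// x / 3 for unsigned 32-bit x, without a divide instruction:
// multiply by ceil(2^33 / 3) = 0xAAAAAAAB, then shift right by 33 in total
// (32 bits from the multiply-high, 1 more bit here).
constexpr std::uint32_t udiv3(std::uint32_t x) {
  return mulhu32(x, 0xAAAAAAABu) >> 1;
}

// A stand-in for a <4 x i32> vector.
using V4 = std::array<std::uint32_t, 4>;

// Divide every lane by the same constant 3 (a "splat" divisor).
// Each lane gets the identical multiply-high + shift sequence, which is
// what makes the transform expressible as one vector multiply.
inline V4 udivSplat3(const V4 &v) {
  V4 out{};
  for (std::size_t i = 0; i < v.size(); ++i)
    out[i] = udiv3(v[i]);
  return out;
}

// Compile-time spot checks against ordinary division.
static_assert(udiv3(9) == 3, "9 / 3");
static_assert(udiv3(0xFFFFFFFFu) == 0xFFFFFFFFu / 3, "UINT32_MAX / 3");
```

Calling udivSplat3(V4{0, 7, 9, 100}) yields the lanes {0, 2, 3, 33}, matching plain integer division. Whether a target benefits depends on it having a usable vector multiply-high, which is the caveat the commit message itself raises.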