Summary:
SLP Vectorization of intrinsics (r203707) has exposed cases where the
expansion of vector bswap is failing (PR19151).
Reviewers: hfinkel
CC: chandlerc
Differential Revision: http://llvm-reviews.chandlerc.com/D3104
llvm-svn: 204163
Teach the DAGCombiner to fold a logic op (AND/OR/XOR) of two vector shuffles that share one operand into a single shuffle, according to the following rules:
1) (AND (shuf (A, C, Mask), shuf (B, C, Mask))) -> shuf (AND (A, B), C, Mask)
2) (OR (shuf (A, C, Mask), shuf (B, C, Mask))) -> shuf (OR (A, B), C, Mask)
3) (XOR (shuf (A, C, Mask), shuf (B, C, Mask))) -> shuf (XOR (A, B), V_0, Mask)
4) (AND (shuf (C, A, Mask), shuf (C, B, Mask))) -> shuf (C, AND (A, B), Mask)
5) (OR (shuf (C, A, Mask), shuf (C, B, Mask))) -> shuf (C, OR (A, B), Mask)
6) (XOR (shuf (C, A, Mask), shuf (C, B, Mask))) -> shuf (V_0, XOR (A, B), Mask)
In rules 3 and 6, the shared operand C cancels with itself under XOR, so its position is filled by the zero vector V_0.
llvm-svn: 204160
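To see why rule 1 above is sound, here is a small standalone sketch (toy C++, not the DAGCombiner implementation) that checks the identity on 4-lane vectors, using shufflevector's lane-selection semantics (mask index i < 4 selects from the first operand, i >= 4 selects lane i - 4 of the second):

#include <array>
#include <cassert>

using Vec4 = std::array<unsigned, 4>;

Vec4 shuf(const Vec4 &X, const Vec4 &Y, const std::array<int, 4> &Mask) {
  Vec4 R{};
  for (int i = 0; i < 4; ++i)
    R[i] = Mask[i] < 4 ? X[Mask[i]] : Y[Mask[i] - 4];
  return R;
}

Vec4 vand(const Vec4 &X, const Vec4 &Y) {
  Vec4 R{};
  for (int i = 0; i < 4; ++i)
    R[i] = X[i] & Y[i];
  return R;
}

int main() {
  Vec4 A{0xF0, 0x0F, 0xFF, 0x00}, B{0x3C, 0xC3, 0xA5, 0x5A},
       C{0x11, 0x22, 0x33, 0x44};
  std::array<int, 4> Mask{0, 5, 2, 7}; // mixes lanes of both operands
  // Rule 1: AND'ing two shuffles that share C equals shuffling
  // AND(A, B) with C under the same mask, since C & C == C lane-wise.
  assert(vand(shuf(A, C, Mask), shuf(B, C, Mask)) ==
         shuf(vand(A, B), C, Mask));
}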
llvm-svn: 204071
Rather than LegalizeAction::Expand, this needs LegalizeAction::Promote to get
promoted to fp_to_sint v8f32->v8i32. This is a legal operation on AVX.
For that to work properly, we also need to teach the legalizer about the
specific promotion required here. The default vector promotion uses
bitcasting to a vector type of the same total size. We want to promote the
vector element type, effectively widening the operation and then truncating
the result. This is analogous to the current logic of how int_to_fp is
promoted.
The change also factors out some code from the int_to_fp promotion code to
ValueType::widenIntegerVectorElementType. This is now shared between
int_to_fp and fp_to_int.
The custom lowering of fp_to_sint v8f32->v8i16 in X86 is no longer needed; it
can now go through the new target-independent fp_to_*int promotion logic.
I also checked that no other target uses Promote for these ops yet, so there
shouldn't be any unexpected change in behavior.
Fixes <rdar://problem/16202247>
llvm-svn: 204058
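The element-type promotion can be sketched on scalars standing in for vector lanes (illustrative toy code, not the legalizer itself): converting through the wider legal integer type and truncating agrees with the direct conversion for in-range values.

#include <cassert>
#include <cstdint>

int16_t fp_to_sint_direct(float F) { return static_cast<int16_t>(F); }

int16_t fp_to_sint_promoted(float F) {
  int32_t Wide = static_cast<int32_t>(F); // widened fp_to_sint
  return static_cast<int16_t>(Wide);      // truncate back to i16
}

int main() {
  // All values fit in i16, so both conversion routes are well defined.
  for (float F : {-32768.0f, -1.5f, 0.0f, 42.9f, 32767.0f})
    assert(fp_to_sint_direct(F) == fp_to_sint_promoted(F));
}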
operator* on the by-operand iterators now returns a MachineOperand& rather than
a MachineInstr&. At this point they almost behave like normal iterators!
Again, this requires making some existing loops more verbose, but should pave
the way for the big range-based for-loop cleanups in the future.
llvm-svn: 203865
for use with C++11 range-based for-loops.
The gist of phase 1 is to remove the skipInstruction() and skipBundle()
methods from these iterators, instead splitting each iterator into a version
that walks operands, a version that walks instructions, and a version that
walks bundles. This has the result of making some "clever" loops in lib/CodeGen
more verbose, but also makes their iterator invalidation characteristics much
more obvious to the casual reader. (Making them concise again in the future is a
good motivating case for a pre-incrementing range adapter!)
Phase 2 of this undertaking will consist of removing the getOperand() method,
and changing operator*() of the operand-walker to return a MachineOperand&. At
that point, it should be possible to add range views for them that work as one
might expect.
llvm-svn: 203757
In some cases the include is pushed "downstream" (or removed if
unused).
llvm-svn: 203644
The code added nothing but potentially disabled move semantics and made
types non-trivially copyable.
llvm-svn: 203563
The syntax for "cmpxchg" should now look something like:
cmpxchg i32* %addr, i32 42, i32 3 acquire monotonic
where the second ordering argument gives the required semantics in the case
that no exchange takes place. It should be no stronger than the first ordering
constraint and cannot be either "release" or "acq_rel" (since no store will
have taken place).
rdar://problem/15996804
llvm-svn: 203559
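This mirrors C++11's compare_exchange, which likewise takes a success ordering and a weaker failure ordering ("monotonic" in IR corresponds to std::memory_order_relaxed):

#include <atomic>

int main() {
  std::atomic<int> Addr{3};
  int Expected = 42;
  // Matches the IR example: compare against 42 and store 3 on success,
  // with acquire on success and relaxed (monotonic) on failure. The
  // failure ordering may be no stronger than the success ordering and
  // cannot be a release ordering, since no store takes place.
  Addr.compare_exchange_strong(Expected, 3, std::memory_order_acquire,
                               std::memory_order_relaxed);
}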
llvm-svn: 203514
This requires a number of steps.
1) Move value_use_iterator into the Value class as an implementation
detail
2) Change it to actually be a *Use* iterator rather than a *User*
iterator.
3) Add an adaptor which is a User iterator that always looks through the
Use to the User.
4) Wrap these in Value::use_iterator and Value::user_iterator typedefs.
5) Add the range adaptors as Value::uses() and Value::users().
6) Update *all* of the callers to correctly distinguish between whether
they wanted a use_iterator (and to explicitly dig out the User when
needed), or a user_iterator which makes the Use itself totally
opaque.
Because #6 requires churning essentially everything that walked the
Use-Def chains, I went ahead and added all of the range adaptors and
switched them to range-based loops where appropriate. Also because the
renaming requires at least churning every line of code, it didn't make
any sense to split these up into multiple commits -- all of which would
touch all of the same lines of code.
The result is still not quite optimal. The Value::use_iterator is a nice
regular iterator, but Value::user_iterator is an iterator over User*s
rather than over the User objects themselves. As a consequence, it fits
a bit awkwardly into the range-based world and it has the weird
extra-dereferencing 'operator->' that so many of our iterators have.
I think this could be fixed by providing something which transforms
a range of T&s into a range of T*s, but that *can* be separated into
another patch, and it isn't yet 100% clear whether this is the right
move.
However, this change gets us most of the benefit and cleans up
a substantial amount of code around Use and User. =]
llvm-svn: 203364
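A toy sketch of the distinction (simplified stand-in types, not LLVM's actual classes): uses() hands out the operand slots themselves, while users() looks through each Use to the owning User.

#include <cstdio>
#include <vector>

struct User { const char *Name; }; // stand-in for an instruction

struct Use {
  User *Parent;  // the User that owns this operand slot
  int OperandNo; // which operand of Parent this Use occupies
};

struct Value {
  std::vector<Use> UseList;
  // use_iterator view: yields the Use, exposing the operand slot.
  const std::vector<Use> &uses() const { return UseList; }
  // user_iterator view: looks through each Use to the owning User.
  std::vector<User *> users() const {
    std::vector<User *> Result;
    for (const Use &U : UseList)
      Result.push_back(U.Parent);
    return Result;
  }
};

int main() {
  User Add{"add"}, Mul{"mul"};
  Value V;
  V.UseList = {{&Add, 0}, {&Mul, 1}};
  for (const Use &U : V.uses()) // explicit access to the Use
    std::printf("used as operand #%d of %s\n", U.OperandNo, U.Parent->Name);
  for (User *U : V.users())     // the Use itself stays opaque
    std::printf("user: %s\n", U->Name);
}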
class.
llvm-svn: 203339
This is already done for shifts. Allow it for rotations as well. E.g.:
(rotl:i32 x, (trunc (and y, 31))) -> (rotl:i32 x, (and (trunc y), 31))
Use the newly factored-out distributeTruncateThroughAnd.
With this patch and some X86.td tweaks, we should be able to remove the
redundant masking of the rotation amount as in the example above; the
hardware performs this masking implicitly.
The testcase will be added as part of the X86 patch.
llvm-svn: 203316
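A scalar sketch of why the distribution is safe (toy code, assuming 32-bit lanes): truncating before or after the mask yields the same rotate amount, because the mask value 31 fits in the narrow type.

#include <cassert>
#include <cstdint>

// Rotate left on a 32-bit lane; the amount is taken modulo 32.
uint32_t rotl32(uint32_t X, unsigned Amt) {
  Amt &= 31;
  return Amt == 0 ? X : (X << Amt) | (X >> (32 - Amt));
}

int main() {
  const uint32_t X = 0xDEADBEEF;
  for (uint64_t Y = 0; Y < 4096; Y += 37) {
    unsigned TruncOfAnd = static_cast<unsigned>(Y & 31); // trunc (and y, 31)
    unsigned AndOfTrunc = static_cast<uint32_t>(Y) & 31; // (and (trunc y), 31)
    assert(rotl32(X, TruncOfAnd) == rotl32(X, AndOfTrunc));
  }
}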
This is the new idiom:
x<<(y&31) | x>>((0-y)&31)
which is recognized as:
x ROTL (y&31)
The change refines matchRotateSub: when checking
Neg & (OpSize - 1) == (OpSize - Pos) & (OpSize - 1), if Pos has the form
Pos' & (OpSize - 1), we can simply use Pos' instead of Pos.
llvm-svn: 203315
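The identity can be checked in standalone C++ (a sketch assuming 32-bit lanes); note that both shift amounts in the idiom stay within [0, 31], so no undefined shift by 32 occurs.

#include <cassert>
#include <cstdint>

// Reference rotate-left by Amt & 31, written to avoid a shift by 32.
uint32_t rotl32(uint32_t X, uint32_t Amt) {
  Amt &= 31;
  return Amt == 0 ? X : (X << Amt) | (X >> (32 - Amt));
}

int main() {
  const uint32_t X = 0x12345678;
  for (uint32_t Y = 0; Y < 256; ++Y) {
    // The new idiom: when Y & 31 == 0, both shift amounts are 0 and
    // the two halves coincide (X | X == X); otherwise they are
    // k and 32 - k, giving the usual rotate.
    uint32_t Idiom = (X << (Y & 31)) | (X >> ((0 - Y) & 31));
    assert(Idiom == rotl32(X, Y));
  }
}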
Slightly change the wording in the function comment; originally it could be
misunderstood as saying that we turn the input into two subsequent rotates.
Also better connect the comment, which talks about Mask, with the code, which
used LoBits, by renaming the variable to MaskLoBits.
llvm-svn: 203314
be split and the result type widened.
When the condition of a vselect has to be split, it makes no sense to widen the
vselect and thereby widen the condition; we would end up in an endless loop of
widening (the vselect result type) and splitting (the condition mask type).
Instead, split both the condition and the vselect and widen the result.
I ran this over the test suite with i686 and mattr=+sse and saw no regressions.
Fixes PR18036.
llvm-svn: 203311
This patch teaches the DAGCombiner how to fold a binary OR between two
shufflevector into a single shuffle vector when possible.
The rules are:
1. fold (or (shuf A, V_0, MA), (shuf B, V_0, MB)) -> (shuf A, B, Mask1)
2. fold (or (shuf A, V_0, MA), (shuf B, V_0, MB)) -> (shuf B, A, Mask2)
The DAGCombiner can take advantage of the fact that OR is commutative and
compute two possible shuffle masks (Mask1 and Mask2) for the resulting
shuffle node.
Before folding a dag according to either rule 1 or 2, DAGCombiner verifies
that the resulting shuffle mask is legal for the target.
The DAGCombiner first tries to fold according to rule 1; if that is not
possible, it tries rule 2. If both Mask1 and Mask2 are illegal, we
conservatively don't fold the OR.
llvm-svn: 203156
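A toy sketch of how such a combined mask can be built (illustrative only, not the DAGCombiner code): each lane takes A's element where MA selects a real element, and otherwise takes B's element, renumbered by +N for the second shuffle operand. It assumes every lane gets a real element from exactly one of the two shuffles.

#include <array>
#include <cassert>

constexpr int N = 4;
using Vec = std::array<unsigned, N>;
using Mask = std::array<int, N>;

Vec shuf(const Vec &X, const Vec &Y, const Mask &M) {
  Vec R{};
  for (int i = 0; i < N; ++i)
    R[i] = M[i] < N ? X[M[i]] : Y[M[i] - N];
  return R;
}

int main() {
  Vec A{1, 2, 3, 4}, B{5, 6, 7, 8}, Zero{};
  // Each shuffle pulls some lanes from its vector, the rest from V_0.
  Mask MA{0, N, 2, N}; // lanes 0 and 2 from A; lanes 1 and 3 are zero
  Mask MB{N, 1, N, 3}; // lanes 1 and 3 from B; lanes 0 and 2 are zero
  Vec SA = shuf(A, Zero, MA), SB = shuf(B, Zero, MB);
  Vec OrResult{};
  for (int i = 0; i < N; ++i)
    OrResult[i] = SA[i] | SB[i];

  // Build Mask1: take A's lane where MA selects A, else B's lane (+N).
  Mask Mask1{};
  for (int i = 0; i < N; ++i)
    Mask1[i] = MA[i] < N ? MA[i] : N + MB[i];
  assert(shuf(A, B, Mask1) == OrResult);
}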
This appears to work only for global loads; private and local address spaces
break for other reasons.
llvm-svn: 203135
already lives.
llvm-svn: 203046
already lived there and it is where it belongs -- this is the in-memory
debug location representation.
This is just cleanup -- Modules can actually cope with this, but that
doesn't make it right. After chatting with folks that have out-of-tree
stuff, going ahead and moving the rest of the headers seems preferable.
llvm-svn: 202960
Patchpoints already did this. Doing it for stackmaps is a convenience
for the runtime in the event that it needs to scratch register to
patch or perform a runtime call thunk.
Unlike patchpoints, we just assume the AnyRegCC calling convention. This is
the only language- and target-independent calling convention specific to
stackmaps, so it makes sense, although the calling convention is not currently
used to select the scratch registers.
llvm-svn: 202943
llvm-svn: 202932
selection dag (PR19012)
In X86SelectionDagInfo::EmitTargetCodeForMemcpy we check with MachineFrameInfo
to make sure that ESI isn't used as a base pointer register before we choose to
emit rep movs (which clobbers ESI).
The problem is that MachineFrameInfo wouldn't know about dynamic allocas or
inline asm that clobbers the stack pointer until SelectionDAGBuilder has
encountered them.
This patch fixes the problem by checking for such things when building the
FunctionLoweringInfo.
Differential Revision: http://llvm-reviews.chandlerc.com/D2954
llvm-svn: 202930
Currently this code is duplicated across visitSHL, visitSRA and visitSRL. The
plan is to add rotates as clients to this new function.
There is no functional change intended here.
llvm-svn: 202908
abstracting between a CallInst and an InvokeInst, both of which are IR
concepts.
llvm-svn: 202816
The old implementation is no longer needed in C++11.
llvm-svn: 202644
Remove the old functions.
llvm-svn: 202636
of boilerplate.
No intended functionality change.
llvm-svn: 202588
This extract-and-trunc vector optimization cannot work for i1 values as
currently implemented, and so I'm disabling this for now for i1 values. In the
future, this can be fixed properly.
Soon I'll commit support for i1 CR bit tracking in the PowerPC backend, and
this will be covered by one of the existing regression tests.
llvm-svn: 202449
llvm-svn: 202074
llvm-svn: 202073
shifted mask rather than masking and shifting separately.
The patch adds this transformation to the DAGCombiner:
(shl (and (setcc:i8v16 ...) N01C) N1C) -> (and (setcc:i8v16 ...) N01C<<N1C)
<rdar://problem/16054492>
Patch by Adam Nemet <anemet@apple.com>
llvm-svn: 201906
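A scalar sketch of why the fold holds (toy code): a setcc lane is all-ones or all-zeros, so masking the lane and then shifting equals masking with the pre-shifted constant.

#include <cassert>
#include <cstdint>

int main() {
  const uint8_t C = 0x01; // N01C: the mask applied to the setcc result
  const unsigned S = 5;   // N1C: the shift amount
  // A setcc lane is either all-zeros or all-ones; for such lanes,
  // (Lane & C) << S == Lane & (C << S).
  for (uint8_t Lane : {uint8_t(0x00), uint8_t(0xFF)}) {
    uint8_t MaskThenShift = static_cast<uint8_t>((Lane & C) << S);
    uint8_t ShiftedMask = static_cast<uint8_t>(Lane & (C << S));
    assert(MaskThenShift == ShiftedMask);
  }
}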
llvm-svn: 201870
This is quite a bit less confusing now that TargetData was renamed DataLayout.
llvm-svn: 201606
TargetData was renamed DataLayout back in r165242.
llvm-svn: 201581
This fix checks the original LLVM IR node to identify opaque constants by
looking for the bitcast-constant pattern. Originally we looked at the generated
SDNode, but this might lead to incorrect results. The SDNode could have been
generated by a constant expression that was folded to a constant.
This fixes <rdar://problem/16050719>
llvm-svn: 201291
We are now no longer relying on the target-specific call lowering implementation
to lower a stackmap intrinsic call. Instead we perform the call lowering in a
target-independent way directly in the stackmap lowering code. This simplifies
the code and removes the need to fix up the code after the target-specific call
lowering.
llvm-svn: 201263
documentation)
The ID type for the stackmap and patchpoint intrinsics is in both cases i64.
This fixes a zero extend in the SelectionDAGBuilder that still used i32. This
also updates the target independent instructions STACKMAP and PATCHPOINT to use
the correct type.
llvm-svn: 201262
BUILD_VECTOR nodes, e.g.:
(concat_vectors (BUILD_VECTOR a1, a2, a3, a4), (BUILD_VECTOR b1, b2, b3, b4))
->
(BUILD_VECTOR a1, a2, a3, a4, b1, b2, b3, b4)
This fixes an issue with AVX, where a sequence was not recognized as a 256-bit
vbroadcast due to the concat_vectors.
llvm-svn: 201158
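The fold can be illustrated with a toy sketch (standalone code, not the DAGCombiner): concatenating two vectors built from scalars yields exactly the vector built from all eight scalars.

#include <array>
#include <cassert>

int main() {
  std::array<int, 4> A{1, 2, 3, 4}, B{5, 6, 7, 8}; // two BUILD_VECTORs
  std::array<int, 8> Concat{};
  for (int i = 0; i < 4; ++i) { // concat_vectors of the two halves
    Concat[i] = A[i];
    Concat[i + 4] = B[i];
  }
  // A single wide BUILD_VECTOR of all eight scalars is equivalent.
  std::array<int, 8> BuildVector{1, 2, 3, 4, 5, 6, 7, 8};
  assert(Concat == BuildVector);
}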
opaque constants.
During DAGCombine visitShiftByConstant assumes that certain binary operations
with only constant operands can always be folded successfully. This is no longer
true when the constant is opaque. This commit fixes visitShiftByConstant by not
performing the optimization for opaque constants. Otherwise we would end up in
an infinite DAGCombine loop.
llvm-svn: 200900
llvm-svn: 200888
On R600, some address spaces have more strict alignment
requirements than others.
llvm-svn: 200887
from X86 matcher table.
llvm-svn: 200821
ISD::BSWAP was missing from the list of node types that should be expanded
element-wise.
llvm-svn: 200705
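Element-wise expansion can be sketched as follows (toy 4-lane code, not the legalizer): the vector bswap becomes a scalar bswap applied to each extracted lane.

#include <array>
#include <cassert>
#include <cstdint>

// Scalar byte swap on a 32-bit value.
uint32_t bswap32(uint32_t X) {
  return (X >> 24) | ((X >> 8) & 0xFF00) | ((X << 8) & 0xFF0000) | (X << 24);
}

int main() {
  std::array<uint32_t, 4> V{0x11223344, 0xAABBCCDD, 0x00000001, 0xFF000000};
  std::array<uint32_t, 4> R{};
  for (size_t i = 0; i < V.size(); ++i)
    R[i] = bswap32(V[i]); // scalar bswap per extracted lane
  assert(R[0] == 0x44332211 && R[2] == 0x01000000);
}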
Calls with inalloca are lowered by skipping all stores for arguments
passed in memory and the initial stack adjustment to allocate argument
memory.
Now the frontend is responsible for the memory layout, and the backend
doesn't have to do any work. As a result these changes are pretty
minimal.
Reviewers: echristo
Differential Revision: http://llvm-reviews.chandlerc.com/D2637
llvm-svn: 200596
Allocas marked inalloca are never static, but we were trying to put them
into the static alloca map if they were in the entry block. Also add an
assertion in x86 fastisel.
llvm-svn: 200593
when the input is a concat_vectors and the insert replaces one of the
concat halves:
Lower half: fold (insert_subvector (concat_vectors X, Y), Z) -> (concat_vectors Z, Y)
Upper half: fold (insert_subvector (concat_vectors X, Y), Z) -> (concat_vectors X, Z)
This can be seen with the following IR:
define <8 x float> @lower_half(<4 x float> %v1, <4 x float> %v2, <4 x float> %v3) {
  %1 = shufflevector <4 x float> %v1, <4 x float> %v2, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
  %2 = tail call <8 x float> @llvm.x86.avx.vinsertf128.ps.256(<8 x float> %1, <4 x float> %v3, i8 0)
The vinsertf128 intrinsic is converted into an insert_subvector node
in SelectionDAGBuilder.cpp.
Using AVX, without the patch this generates two vinsertf128 instructions:
vinsertf128 $1, %xmm1, %ymm0, %ymm0
vinsertf128 $0, %xmm2, %ymm0, %ymm0
With the patch this is optimized into:
vinsertf128 $1, %xmm1, %ymm2, %ymm0
Patch by Robert Lougher.
llvm-svn: 200506
they're not legal.
llvm-svn: 200503
When converting from "or + br" to two branches, or converting from
"and + br" to two branches, we correctly update the edge weights of
the two branches.
The previous attempt at r200431 was reverted at r200434 because of two
test-case failures. I modified my patch a little, but forgot to re-run
"make check-all".
The test case CodeGen/ARM/lsr-unfolded-offset.ll is updated because of the
patch's impact on branch probability, which causes changes in spill placement.
llvm-svn: 200502
llvm-svn: 200434