bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	AMDGPU: Add convergent flag to INLINEASM instruction.	Wei Ding	2016-06-22	1	-0/+2
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D21214 llvm-svn: 273455
*	[SDAG] Remove FixedArgs parameter from CallLoweringInfo::setCallee	Krzysztof Parzyszek	2016-06-22	1	-3/+2
\| \| \| \| \| \| \| \| \| \| \|	The setCallee function will set the number of fixed arguments based on the size of the argument list. The FixedArgs parameter was often explicitly set to 0, leading to a lack of consistent value for non- vararg functions. Differential Revision: http://reviews.llvm.org/D20376 llvm-svn: 273403
*	[SelectionDAG] Don't treat library calls specially if marked with nobuiltin.	Marcin Koscielnicki	2016-06-17	1	-2/+3
\| \| \| \| \| \| \| \|	To be used by D19781. Differential Revision: http://reviews.llvm.org/D19801 llvm-svn: 273039
*	[SelectionDAG] Remove exit-on-error flag from test (PR27765)	Diana Picus	2016-06-14	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The exit-on-error flag in the ARM test is necessary in order to avoid an unreachable in the DAGTypeLegalizer, when trying to expand a physical register. We can also avoid this situation by introducing a bitcast early on, where the invalid scalar-to-vector conversion is detected. We also add a test for PowerPC, which goes through a similar code path in the SelectionDAGBuilder. Fixes PR27765. Differential Revision: http://reviews.llvm.org/D21061 llvm-svn: 272644
*	Pass DebugLoc and SDLoc by const ref.	Benjamin Kramer	2016-06-12	1	-37/+34
\| \| \| \| \| \| \| \|	This used to be free, copying and moving DebugLocs became expensive after the metadata rewrite. Passing by reference eliminates a ton of track/untrack operations. No functionality change intended. llvm-svn: 272512
*	[stack-protection] Add support for MSVC buffer security check	Etienne Bergeron	2016-06-07	1	-5/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch is adding support for the MSVC buffer security check implementation The buffer security check is turned on with the '/GS' compiler switch. * https://msdn.microsoft.com/en-us/library/8dbf701c.aspx * To be added to clang here: http://reviews.llvm.org/D20347 Some overview of buffer security check feature and implementation: * https://msdn.microsoft.com/en-us/library/aa290051(VS.71).aspx * http://www.ksyash.com/2011/01/buffer-overflow-protection-3/ * http://blog.osom.info/2012/02/understanding-vs-c-compilers-buffer.html For the following example: ``` int example(int offset, int index) { char buffer[10]; memset(buffer, 0xCC, index); return buffer[index]; } ``` The MSVC compiler is adding these instructions to perform stack integrity check: ``` push ebp mov ebp,esp sub esp,50h [1] mov eax,dword ptr [__security_cookie (01068024h)] [2] xor eax,ebp [3] mov dword ptr [ebp-4],eax push ebx push esi push edi mov eax,dword ptr [index] push eax push 0CCh lea ecx,[buffer] push ecx call _memset (010610B9h) add esp,0Ch mov eax,dword ptr [index] movsx eax,byte ptr buffer[eax] pop edi pop esi pop ebx [4] mov ecx,dword ptr [ebp-4] [5] xor ecx,ebp [6] call @__security_check_cookie@4 (01061276h) mov esp,ebp pop ebp ret ``` The instrumentation above is: * [1] is loading the global security canary, * [3] is storing the local computed ([2]) canary to the guard slot, * [4] is loading the guard slot and ([5]) re-compute the global canary, * [6] is validating the resulting canary with the '__security_check_cookie' and performs error handling. Overview of the current stack-protection implementation: * lib/CodeGen/StackProtector.cpp * There is a default stack-protection implementation applied on intermediate representation. * The target can overload 'getIRStackGuard' method if it has a standard location for the stack protector cookie. * An intrinsic 'Intrinsic::stackprotector' is added to the prologue. It will be expanded by the instruction selection pass (DAG or Fast). * Basic Blocks are added to every instrumented function to receive the code for handling stack guard validation and errors handling. * Guard manipulation and comparison are added directly to the intermediate representation. * lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp * lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp * There is an implementation that adds instrumentation during instruction selection (for better handling of sibbling calls). * see long comment above 'class StackProtectorDescriptor' declaration. * The target needs to override 'getSDagStackGuard' to activate SDAG stack protection generation. (note: getIRStackGuard MUST be nullptr). * 'getSDagStackGuard' returns the appropriate stack guard (security cookie) * The code is generated by 'SelectionDAGBuilder.cpp' and 'SelectionDAGISel.cpp'. * include/llvm/Target/TargetLowering.h * Contains function to retrieve the default Guard 'Value'; should be overriden by each target to select which implementation is used and provide Guard 'Value'. * lib/Target/X86/X86ISelLowering.cpp * Contains the x86 specialisation; Guard 'Value' used by the SelectionDAG algorithm. Function-based Instrumentation: * The MSVC doesn't inline the stack guard comparison in every function. Instead, a call to '__security_check_cookie' is added to the epilogue before every return instructions. * To support function-based instrumentation, this patch is * adding a function to get the function-based check (llvm 'Value', see include/llvm/Target/TargetLowering.h), * If provided, the stack protection instrumentation won't be inlined and a call to that function will be added to the prologue. * modifying (SelectionDAGISel.cpp) do avoid producing basic blocks used for inline instrumentation, * generating the function-based instrumentation during the ISEL pass (SelectionDAGBuilder.cpp), * if FastISEL (not SelectionDAG), using the fallback which rely on the same function-based implemented over intermediate representation (StackProtector.cpp). Modifications * adding support for MSVC (lib/Target/X86/X86ISelLowering.cpp) * adding support function-based instrumentation (lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp, .h) Results * IR generated instrumentation: ``` clang-cl /GS test.cc /Od /c -mllvm -print-isel-input ``` ``` * Final LLVM Code input to ISel * ; Function Attrs: nounwind sspstrong define i32 @"\01?example@@YAHHH@Z"(i32 %offset, i32 %index) #0 { entry: %StackGuardSlot = alloca i8* <<<-- Allocated guard slot %0 = call i8* @llvm.stackguard() <<<-- Loading Stack Guard value call void @llvm.stackprotector(i8* %0, i8** %StackGuardSlot) <<<-- Prologue intrinsic call (store to Guard slot) %index.addr = alloca i32, align 4 %offset.addr = alloca i32, align 4 %buffer = alloca [10 x i8], align 1 store i32 %index, i32* %index.addr, align 4 store i32 %offset, i32* %offset.addr, align 4 %arraydecay = getelementptr inbounds [10 x i8], [10 x i8]* %buffer, i32 0, i32 0 %1 = load i32, i32* %index.addr, align 4 call void @llvm.memset.p0i8.i32(i8* %arraydecay, i8 -52, i32 %1, i32 1, i1 false) %2 = load i32, i32* %index.addr, align 4 %arrayidx = getelementptr inbounds [10 x i8], [10 x i8]* %buffer, i32 0, i32 %2 %3 = load i8, i8* %arrayidx, align 1 %conv = sext i8 %3 to i32 %4 = load volatile i8, i8* %StackGuardSlot <<<-- Loading Guard slot call void @__security_check_cookie(i8* %4) <<<-- Epilogue function-based check ret i32 %conv } ``` * SelectionDAG generated instrumentation: ``` clang-cl /GS test.cc /O1 /c /FA ``` ``` "?example@@YAHHH@Z": # @"\01?example@@YAHHH@Z" # BB#0: # %entry pushl %esi subl $16, %esp movl ___security_cookie, %eax <<<-- Loading Stack Guard value movl 28(%esp), %esi movl %eax, 12(%esp) <<<-- Store to Guard slot leal 2(%esp), %eax pushl %esi pushl $204 pushl %eax calll _memset addl $12, %esp movsbl 2(%esp,%esi), %esi movl 12(%esp), %ecx <<<-- Loading Guard slot calll @__security_check_cookie@4 <<<-- Epilogue function-based check movl %esi, %eax addl $16, %esp popl %esi retl ``` Reviewers: kcc, pcc, eugenis, rnk Subscribers: majnemer, llvm-commits, hans, thakis, rnk Differential Revision: http://reviews.llvm.org/D20346 llvm-svn: 272053
*	SDAG: Use an Optional<> instead of a sigil value. NFC	Justin Bogner	2016-05-26	1	-6/+6
\| \| \| \| \| \| \|	This just makes it a bit more clear that we don't intend to use a deleted node for anything here. llvm-svn: 270931
*	Fix some comment typos in SelectionDAGBuilder. NFC	Diana Picus	2016-05-20	1	-3/+3
\| \| \| \|	llvm-svn: 270190
*	Fix an assert in SelectionDAGBuilder when processing inline asm	Renato Golin	2016-05-17	1	-25/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	When processing inline asm that contains errors, make sure we can recover gracefully by creating an UNDEF SDValue for the inline asm statement before returning from SelectionDAGBuilder::visitInlineAsm. This is necessary for consumers that don't exit on the first error that is emitted (e.g. clang) and that would assert later on. Fixes PR24071. Patch by Diana Picus. llvm-svn: 269811
*	SelectionDAG: Select min/max when both are used	Matt Arsenault	2016-05-16	1	-1/+9
\| \| \| \| \| \| \| \| \| \| \|	Allow two users of the condition if the other user is also a min/max select. i.e. %c = icmp slt i32 %x, %y %min = select i1 %c, i32 %x, i32 %y %max = select i1 %c, i32 %y, i32 %x llvm-svn: 269699
*	getelementptr instruction, support index vector of EVT.	Igor Breger	2016-05-01	1	-1/+2
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D19775 llvm-svn: 268195
*	[PPC, SSP] Support PowerPC Linux stack protection.	Tim Shen	2016-04-19	1	-13/+11
\| \| \| \|	llvm-svn: 266809
*	[SSP, 2/2] Create llvm.stackguard() intrinsic and lower it to LOAD_STACK_GUARD	Tim Shen	2016-04-19	1	-45/+50
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With this change, ideally IR pass can always generate llvm.stackguard call to get the stack guard; but for now there are still IR form stack guard customizations around (see getIRStackGuard()). Future SSP customization should go through LOAD_STACK_GUARD. There is a behavior change: stack guard values are not CSEed anymore, since we should never reuse the value in case that it has been spilled (and corrupted). See ssp-guard-spill.ll. This also cause the change of stack size and codegen in X86 and AArch64 test cases. Ideally we'd like to know if the guard created in llvm.stackprotector() gets spilled or not. If the value is spilled, discard the value and reload stack guard; otherwise reuse the value. This can be done by teaching register allocator to know how to rematerialize LOAD_STACK_GUARD and force a rematerialization (which seems hard), or check for spilling in expandPostRAPseudo. It only makes sense when the stack guard is a global variable, which requires more instructions to load. Anyway, this seems to go out of the scope of the current patch. llvm-svn: 266806
*	[CodeGen] Teach LLVM how to lower @llvm.{min,max}num to {MIN,MAX}NAN	David Majnemer	2016-04-14	1	-6/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The behavior of {MIN,MAX}NAN differs from that of {MIN,MAX}NUM when only one of the inputs is NaN: -NUM will return the non-NaN argument while -NAN would return NaN. It is desirable to lower to @llvm.{min,max}num to -NAN if they don't have a native instruction for -NUM. Notably, ARMv7 NEON's vmin has the -NAN semantics. N.B. Of course, it is only safe to do this if the intrinsic call is marked nnan. llvm-svn: 266279
*	AMDGPU: Implement canonicalize	Matt Arsenault	2016-04-14	1	-1/+3
\| \| \| \| \| \|	Also add generic DAG node for it. llvm-svn: 266272
*	Introduce an GCRelocateInst class [NFC]	Philip Reames	2016-04-12	1	-1/+1
\| \| \| \| \| \|	Previously, we were using isGCRelocate predicates. Using a subclass of IntrinsicInst is far more idiomatic. The refactoring also enables a couple of minor simplifications and code sharing. llvm-svn: 266098
*	[SSP] Remove llvm.stackprotectorcheck.	Tim Shen	2016-04-08	1	-13/+4
\| \| \| \| \| \| \| \| \| \|	This is a cleanup patch for SSP support in LLVM. There is no functional change. llvm.stackprotectorcheck is not needed, because SelectionDAG isn't actually lowering it in SelectBasicBlock; rather, it adds check code in FinishBasicBlock, ignoring the position where the intrinsic is inserted (See FindSplitPointForStackProtector()). llvm-svn: 265851
*	NFC: make AtomicOrdering an enum class	JF Bastien	2016-04-06	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: In the context of http://wg21.link/lwg2445 C++ uses the concept of 'stronger' ordering but doesn't define it properly. This should be fixed in C++17 barring a small question that's still open. The code currently plays fast and loose with the AtomicOrdering enum. Using an enum class is one step towards tightening things. I later also want to tighten related enums, such as clang's AtomicOrderingKind (which should be shared with LLVM as a 'C++ ABI' enum). This change touches a few lines of code which can be improved later, I'd like to keep it as NFC for now as it's already quite complex. I have related changes for clang. As a follow-up I'll add: bool operator<(AtomicOrdering, AtomicOrdering) = delete; bool operator>(AtomicOrdering, AtomicOrdering) = delete; bool operator<=(AtomicOrdering, AtomicOrdering) = delete; bool operator>=(AtomicOrdering, AtomicOrdering) = delete; This is separate so that clang and LLVM changes don't need to be in sync. Reviewers: jyknight, reames Subscribers: jyknight, llvm-commits Differential Revision: http://reviews.llvm.org/D18775 llvm-svn: 265602
*	Lower @llvm.experimental.deoptimize as a noreturn call	Sanjoy Das	2016-04-06	1	-0/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	While preserving the return value for @llvm.experimental.deoptimize at the IR level is useful during mid-level optimization, doing so at the machine instruction level requires generating some extra code and a return that is non-ideal. This change has LLVM lower ``` %val = call @llvm.experimental.deoptimize ret %val ``` to effectively ``` call @__llvm_deoptimize() unreachable ``` instead. llvm-svn: 265502
*	Swift Calling Convention: swifterror target-independent change.	Manman Ren	2016-04-05	1	-3/+197
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	At IR level, the swifterror argument is an input argument with type ErrorObject*. For targets that support swifterror, we want to optimize it to behave as an inout value with type ErrorObject; it will be passed in a fixed physical register. The main idea is to track the virtual registers for each swifterror value. We define swifterror values as AllocaInsts with swifterror attribute or a function argument with swifterror attribute. In SelectionDAGISel.cpp, we set up swifterror values (SwiftErrorVals) before handling the basic blocks. When iterating over all basic blocks in RPO, before actually visiting the basic block, we call mergeIncomingSwiftErrors to merge incoming swifterror values when there are multiple predecessors or to simply propagate them. There, we create a virtual register for each swifterror value in the entry block. For predecessors that are not yet visited, we create virtual registers to hold the swifterror values at the end of the predecessor. The assignments are saved in SwiftErrorWorklist and will be materialized at the end of visiting the basic block. When visiting a load from a swifterror value, we copy from the current virtual register assignment. When visiting a store to a swifterror value, we create a virtual register to hold the swifterror value and update SwiftErrorMap to track the current virtual register assignment. Differential Revision: http://reviews.llvm.org/D18108 llvm-svn: 265433
*	Swift Calling Convention: add swifterror attribute.	Manman Ren	2016-04-01	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \|	A ``swifterror`` attribute can be applied to a function parameter or an AllocaInst. This commit does not include any target-specific change. The target-specific optimization will come as a follow-up patch. Differential Revision: http://reviews.llvm.org/D18092 llvm-svn: 265189
*	Don't use an i64 return type with webkit_jscc	Sanjoy Das	2016-04-01	1	-6/+7
\| \| \| \| \| \| \| \| \|	Re-enable an assertion enabled by Justin Lebar in rL265092. rL265092 was breaking test/CodeGen/X86/deopt-intrinsic.ll because webkit_jscc does not like non-i64 return types. Change the test case to not do that. llvm-svn: 265099
*	Revert "Protect some assertions with NDEBUG rather than DEBUG()."	Justin Lebar	2016-04-01	1	-7/+6
\| \| \| \| \| \|	This reverts r265092, because it breaks CodeGen/X86/deopt-intrinsic.ll. llvm-svn: 265093
*	Protect some assertions with NDEBUG rather than DEBUG().	Justin Lebar	2016-04-01	1	-6/+7
\| \| \| \| \| \| \|	DEBUG() only runs if you pass -debug, but these assertions are generally useful. llvm-svn: 265092
*	Add support for no-jump-tables	Nirav Dave	2016-03-29	1	-2/+7
\| \| \| \| \| \| \| \| \| \| \| \| \|	Add function soft attribute to the generation of Jump Tables in CodeGen as initial step towards clang support of gcc's no-jump-table support Reviewers: hans, echristo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18321 llvm-svn: 264756
*	Swift Calling Convention: add swiftself attribute.	Manman Ren	2016-03-29	1	-0/+5
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D17866 llvm-svn: 264754
*	[Codegen] Decrease minimum jump table density.	Kyle Butt	2016-03-29	1	-4/+22
\| \| \| \| \| \| \| \| \| \| \|	Minimum density for both optsize and non optsize are now options -sparse-jump-table-density (default 10) for non optsize functions -dense-jump-table-density (default 40) for optsize functions, which matches the current default. This improves several benchmarks at google at the cost of a small codesize increase. For code compiled with -Os, the old behavior continues llvm-svn: 264689
*	Add lowering support for llvm.experimental.deoptimize	Sanjoy Das	2016-03-24	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Only adds support for "naked" calls to llvm.experimental.deoptimize. Support for round-tripping through RewriteStatepointsForGC will come as a separate patch (should be simpler than this one). Reviewers: reames Subscribers: sanjoy, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18429 llvm-svn: 264329
*	Add a hasOperandBundlesOtherThan helper, and use it; NFC	Sanjoy Das	2016-03-22	1	-12/+6
\| \| \| \|	llvm-svn: 264072
*	Add "first class" lowering for deopt operand bundles	Sanjoy Das	2016-03-22	1	-4/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: After this change, deopt operand bundles can be lowered directly by SelectionDAG into STATEPOINT instructions (which are then lowered to a call or sequence of nop, with an associated __llvm_stackmaps entry0. This obviates the need to round-trip deoptimization state through gc.statepoint via RewriteStatepointsForGC. Reviewers: reames, atrick, majnemer, JosephTremoulet, pgavlin Subscribers: sanjoy, mcrosier, majnemer, llvm-commits Differential Revision: http://reviews.llvm.org/D18257 llvm-svn: 264015
*	[SelectionDAG] Remove visitStatepoint; NFC	Sanjoy Das	2016-03-17	1	-1/+1
\| \| \| \| \| \| \|	This way we have a single entry point into StatepointLowering. The method was a direct dispatch to LowerStatepoint anyway. llvm-svn: 263682
*	[SelectionDAG] Extract out populateCallLoweringInfo; NFC	Sanjoy Das	2016-03-16	1	-14/+17
\| \| \| \| \| \| \| \| \|	SelectionDAGBuilder::populateCallLoweringInfo is now used instead of SelectionDAGBuilder::lowerCallOperands. The populateCallLoweringInfo interface is more composable in face of design changes like http://reviews.llvm.org/D18106 llvm-svn: 263663
*	[DAG] use isUndef() ; NFCI	Sanjay Patel	2016-03-14	1	-2/+2
\| \| \| \|	llvm-svn: 263448
*	SelectionDAG: Fix a crash on inline asm when output register supports ↵	Tom Stellard	2016-03-09	1	-3/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	multiple types Summary: The code in SelectionDAG did not handle the case where the register type and output types were different, but had the same size. Reviewers: arsenm, echristo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17940 llvm-svn: 263022
*	Revert "[mips] Promote the result of SETCC nodes to GPR width."	Vasileios Kalintiris	2016-03-01	1	-11/+1
\| \| \| \| \| \| \| \| \|	This reverts commit r262316. It seems that my change breaks an out-of-tree chromium buildbot, so I'm reverting this in order to investigate the situation further. llvm-svn: 262387
*	[NVPTX] Use different, convergent MIs for convergent calls.	Justin Lebar	2016-03-01	1	-3/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Calls sometimes need to be convergent. This is already handled at the LLVM IR level, but it also needs to be handled at the MI level. Ideally we'd propagate convergence from instructions, down through the selection DAG, and into MIs. But this is Hard, and would affect optimizations in the SDNs -- right now only SDNs with two operands have any flags at all. Instead, here's a much simpler hack: Add new opcodes for NVPTX for convergent calls, and generate these when lowering convergent LLVM calls. Reviewers: jholewinski Subscribers: jholewinski, chandlerc, joker.eph, jhen, tra, llvm-commits Differential Revision: http://reviews.llvm.org/D17423 llvm-svn: 262373
*	[mips] Promote the result of SETCC nodes to GPR width.	Vasileios Kalintiris	2016-03-01	1	-1/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch modifies the existing comparison, branch, conditional-move and select patterns, and adds new ones where needed. Also, the updated SLT{u,i,iu} set of instructions generate a GPR width result. The majority of the code changes in the Mips back-end fix the wrong assumption that the result of SETCC nodes always produce an i32 value. The changes in the common code path account for the fact that in 64-bit MIPS targets, i1 is promoted to i32 instead of i64. Reviewers: dsanders Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D10970 llvm-svn: 262316
*	Fix a bug in isVectorReductionOp() in SelectionDAGBuilder.cpp that may cause ↵	Cong Hou	2016-02-26	1	-4/+4
\| \| \| \| \| \|	assertion failure on AArch64. llvm-svn: 262091
*	Detecte vector reduction operations just before instruction selection.	Cong Hou	2016-02-24	1	-0/+130
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	(This is the second attemp to commit this patch, after fixing pr26652 & pr26653). This patch detects vector reductions before instruction selection. Vector reductions are vectorized reduction operations, and for such operations we have freedom to reorganize the elements of the result as long as the reduction of them stay unchanged. This will enable some reduction pattern recognition during instruction combine such as SAD/dot-product on X86. A flag is added to SDNodeFlags to mark those vector reduction nodes to be checked during instruction combine. To detect those vector reductions, we search def-use chains starting from the given instruction, and check if all uses fall into two categories: 1. Reduction with another vector. 2. Reduction on all elements. in which 2 is detected by recognizing the pattern that the loop vectorizer generates to reduce all elements in the vector outside of the loop, which includes several ShuffleVector and one ExtractElement instructions. Differential revision: http://reviews.llvm.org/D15250 llvm-svn: 261804
*	NFC. Move isDereferenceable to Loads.h/cpp	Artur Pilipenko	2016-02-24	1	-0/+1
\| \| \| \| \| \| \| \| \| \|	This is a part of the refactoring to unify isSafeToLoadUnconditionally and isDereferenceablePointer functions. In subsequent change I'm going to eliminate isDerferenceableAndAlignedPointer from Loads API, leaving isSafeToLoadSpecualtively the only function to check is load instruction can be speculated. Reviewed By: hfinkel Differential Revision: http://reviews.llvm.org/D16180 llvm-svn: 261736
*	Remove uses of builtin comma operator.	Richard Trieu	2016-02-18	1	-18/+26
\| \| \| \| \| \|	Cleanup for upcoming Clang warning -Wcomma. No functionality change intended. llvm-svn: 261270
*	Revert r261070, it caused PR26652 / PR26653.	Nico Weber	2016-02-17	1	-126/+0
\| \| \| \|	llvm-svn: 261127
*	Detecte vector reduction operations just before instruction selection.	Cong Hou	2016-02-17	1	-0/+126
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch detects vector reductions before instruction selection. Vector reductions are vectorized reduction operations, and for such operations we have freedom to reorganize the elements of the result as long as the reduction of them stay unchanged. This will enable some reduction pattern recognition during instruction combine such as SAD/dot-product on X86. A flag is added to SDNodeFlags to mark those vector reduction nodes to be checked during instruction combine. To detect those vector reductions, we search def-use chains starting from the given instruction, and check if all uses fall into two categories: 1. Reduction with another vector. 2. Reduction on all elements. in which 2 is detected by recognizing the pattern that the loop vectorizer generates to reduce all elements in the vector outside of the loop, which includes several ShuffleVector and one ExtractElement instructions. Differential revision: http://reviews.llvm.org/D15250 llvm-svn: 261070
*	[CodeGen] Prefer "if (SDValue R = ...)" to "if (R.getNode())". NFCI.	Ahmed Bougacha	2016-02-09	1	-5/+2
\| \| \| \|	llvm-svn: 260316
*	[X86] Don't zero/sign-extend i1, i8, or i16 return values to 32 bits (PR22532)	Hans Wennborg	2016-02-08	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This matches GCC and MSVC's behaviour, and saves on code size. We were already not extending i1 return values on x86_64 after r127766. This takes that patch further by applying it to x86 target as well, and also for i8 and i16. The ABI docs have been unclear about the required behaviour here. The new i386 psABI [1] clearly states (Table 2.4, page 14) that i1, i8, and i16 return vales do not need to be extended beyond 8 bits. The x86_64 ABI doc is being updated to say the same [2]. Differential Revision: http://reviews.llvm.org/D16907 [1]. https://01.org/sites/default/files/file_attach/intel386-psabi-1.0.pdf [2]. https://groups.google.com/d/msg/x86-64-abi/E8O33onbnGQ/_RFWw_ixDQAJ llvm-svn: 260133
*	SelectionDAG: Lower some range metadata to AssertZext	Matt Arsenault	2016-02-08	1	-3/+40
\| \| \| \| \| \| \| \| \| \|	If a range has a lower bound of 0, add an AssertZext from the nearest floor power of two. This allows operations with some workitem intrinsics with known maximum ranges to use fast 24-bit multiplies. llvm-svn: 260109
*	Rename TargetSelectionDAGInfo into SelectionDAGTargetInfo and move it to ↵	Benjamin Kramer	2016-01-27	1	-7/+7
\| \| \| \| \| \| \| \|	CodeGen/ It's a SelectionDAG thing, not a Target thing. llvm-svn: 258939
*	Remove extra whitespace. NFC.	Junmo Park	2016-01-23	1	-4/+4
\| \| \| \|	llvm-svn: 258617
*	[opaque pointer types] [NFC] Add an explicit type argument to ↵	Eduard Burtescu	2016-01-22	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	ConstantFoldLoadFromConstPtr. Reviewers: mjacob, dblaikie Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16418 llvm-svn: 258472
*	[NFC] Replace several manual GEP loops with gep_type_iterator.	Eduard Burtescu	2016-01-20	1	-17/+7
\| \| \| \| \| \| \| \| \| \|	Reviewers: dblaikie Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16335 llvm-svn: 258262