bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[X86] Fix emitEpilogue() to make less assumptions about pops	Michael Kuperstein	2015-09-16	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is the mirror image of r242395. When X86FrameLowering::emitEpilogue() looks for where to insert the %esp addition that deallocates stack space used for local allocations, it assumes that any sequence of pop instructions from function exit backwards consists purely of restoring callee-save registers. This may be false, since from some point backward, the pops may be clean-up of stack space allocated for arguments to a call. Patch by: amjad.aboud@intel.com Differential Revision: http://reviews.llvm.org/D12688 llvm-svn: 247784
*	Use range-based for loops. NFC	Craig Topper	2015-09-16	1	-47/+31
\| \| \| \|	llvm-svn: 247772
*	Fix a spelling error in the description of a statistic. NFC	Craig Topper	2015-09-16	1	-1/+1
\| \| \| \|	llvm-svn: 247771
*	Introducing llvm.invariant.group.barrier intrinsic	Piotr Padlewski	2015-09-15	1	-0/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	For more info on why it was invented, see: http://lists.llvm.org/pipermail/cfe-dev/2015-July/044227.html invariant.group.barrier: http://reviews.llvm.org/D12310 docs: http://reviews.llvm.org/D11399 CodeGenPrepare: http://reviews.llvm.org/D12875 llvm-svn: 247711
*	[ShrinkWrapping] Fix an infinite loop while looking for restore point.	Quentin Colombet	2015-09-15	1	-0/+8
\| \| \| \| \| \| \| \| \| \| \|	This may happen when the input program itself contains an infinite loop with no exit block. In that case, we would fail to find a block post-dominating the loop such that this block is outside of the loop. This fixes PR24823. Working on reducing the test case. llvm-svn: 247710
*	Revert r247692: Replace Triple with a new TargetTuple in MCTargetDesc/* and ↵	Daniel Sanders	2015-09-15	1	-4/+4
\| \| \| \| \| \| \| \|	related. NFC. Eric has replied and has demanded the patch be reverted. llvm-svn: 247702
*	Re-commit r247683: Replace Triple with a new TargetTuple in MCTargetDesc/* ↵	Daniel Sanders	2015-09-15	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	and related. NFC. Summary: This is the first patch in the series to migrate Triple's (which are ambiguous) to TargetTuple's (which aren't). For the moment, TargetTuple simply passes all requests to the Triple object it holds. Once it has replaced Triple, it will start to implement the interface in a more suitable way. This change makes some changes to the public C++ API. In particular, InitMCSubtargetInfo(), createMCRelocationInfo(), and createMCSymbolizer() now take TargetTuples instead of Triples. The other public C++ API's have been left as-is for the moment to reduce patch size. This commit also contains a trivial patch to clang to account for the C++ API change. Thanks go to Pavel Labath for fixing LLDB for me. Reviewers: rengolin Subscribers: jyknight, dschuff, arsenm, rampitec, danalbert, srhines, javed.absar, dsanders, echristo, emaste, jholewinski, tberghammer, ted, jfb, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D10969 llvm-svn: 247692
*	Revert r247684 - Replace Triple with a new TargetTuple ...	Daniel Sanders	2015-09-15	1	-4/+4
\| \| \| \| \| \|	LLDB needs to be updated in the same commit. llvm-svn: 247686
*	Replace Triple with a new TargetTuple in MCTargetDesc/* and related. NFC.	Daniel Sanders	2015-09-15	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is the first patch in the series to migrate Triple's (which are ambiguous) to TargetTuple's (which aren't). For the moment, TargetTuple simply passes all requests to the Triple object it holds. Once it has replaced Triple, it will start to implement the interface in a more suitable way. This change makes some changes to the public C++ API. In particular, InitMCSubtargetInfo(), createMCRelocationInfo(), and createMCSymbolizer() now take TargetTuples instead of Triples. The other public C++ API's have been left as-is for the moment to reduce patch size. This commit also contains a trivial patch to clang to account for the C++ API change. Reviewers: rengolin Subscribers: jyknight, dschuff, arsenm, rampitec, danalbert, srhines, javed.absar, dsanders, echristo, emaste, jholewinski, tberghammer, ted, jfb, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D10969 llvm-svn: 247683
*	DwarfDebug: Emit dwo_id+dwo_name for DICompileUnits that provide a dwoId.	Adrian Prantl	2015-09-14	1	-0/+8
\| \| \| \| \| \| \|	For module debugging clang emits prefabricated skeleton compile units that can be recognized by a nonzero dwoId. llvm-svn: 247626
*	[opaque pointer types] Switch a few cases of getElementType over, since I ↵	David Blaikie	2015-09-14	1	-5/+3
\| \| \| \| \| \|	had them lying around anyway llvm-svn: 247610
*	RegisterPressure: Simplify close{Top\|Bottom}()	Matthias Braun	2015-09-14	1	-12/+2
\| \| \| \| \| \| \| \| \| \| \|	- There are no duplicate registers in the LiveRegs list we are copying from, so we do not need to sort the registers. - Simply use SmallVector::append instead of a loop between begin() and end() with push_back(). Differential Revision: http://reviews.llvm.org/D12813 llvm-svn: 247588
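The second point above can be illustrated with a minimal sketch. It assumes std::vector as a stand-in for SmallVector, with std::vector::insert playing the role of the range append; RegList, copyWithLoop and copyWithAppend are hypothetical names, not code from the patch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the register list copied in close{Top|Bottom}().
using RegList = std::vector<unsigned>;

// Element-by-element copy: a loop between begin() and end() with push_back().
inline void copyWithLoop(const RegList &LiveRegs, RegList &Out) {
  for (RegList::const_iterator I = LiveRegs.begin(), E = LiveRegs.end();
       I != E; ++I)
    Out.push_back(*I);
}

// Single range append. std::vector::insert is used here as an assumed
// equivalent of a range append on SmallVector.
inline void copyWithAppend(const RegList &LiveRegs, RegList &Out) {
  // Inserting a container's own range into itself is not allowed.
  assert(&LiveRegs != &Out && "source and destination must differ");
  Out.insert(Out.end(), LiveRegs.begin(), LiveRegs.end());
}
```

Both helpers leave Out with the same contents. The range form states the intent in one call and lets the container grow once instead of once per element.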
*	Revert "[opaque pointer type] Pass GlobalAlias the actual pointer type ↵	David Blaikie	2015-09-14	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	rather than decomposing it into pointee type + address space" This was a flawed change - it just caused the getElementType call to be deferred until later, when we really need to remove it. Now that the IR for GlobalAliases has been updated, the root cause is addressed that way instead and this change is no longer needed (and in fact gets in the way - because we want to pass the pointee type directly down further). Follow up patches to push this through GlobalValue, bitcode format, etc, will come along soon. This reverts commit 236160. llvm-svn: 247585
*	[CodeGen] Fix AtomicExpand invalidation issue caused by r247429.	Ahmed Bougacha	2015-09-12	1	-2/+4
\| \| \| \|	llvm-svn: 247514
*	Fix typos.	Bruce Mitchener	2015-09-12	2	-4/+4
\| \| \| \| \| \| \| \| \| \|	Summary: This fixes a variety of typos in docs, code and headers. Subscribers: jholewinski, sanjoy, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D12626 llvm-svn: 247495
*	Use function attribute "stackrealign" to decide whether stack	Akira Hatanaka	2015-09-11	1	-9/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	realignment should be forced. With this commit, we can now force stack realignment when doing LTO and do so on a per-function basis. Also, add a new cl::opt option "stackrealign" to CommandFlags.h which is used to force stack realignment via llc's command line. Out-of-tree projects currently using -force-align-stack to force stack realignment should make changes to attach the attribute to the functions in the IR. Differential Revision: http://reviews.llvm.org/D11814 llvm-svn: 247450
*	[X86] Make sure startproc/endproc are paired	David Majnemer	2015-09-11	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	We used different conditions to determine if we should emit startproc vs endproc. Use the same condition to ensure that they will always be paired. This fixes PR24374. llvm-svn: 247435
*	[CodeGen] Refactor TLI/AtomicExpand interface to make LLSC explicit.	Ahmed Bougacha	2015-09-11	1	-18/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We used to have this magic "hasLoadLinkedStoreConditional()" callback, which really meant two things: - expand cmpxchg (to ll/sc). - expand atomic loads using ll/sc (rather than cmpxchg). Remove it, and, instead, introduce explicit callbacks: - bool shouldExpandAtomicCmpXchgInIR(inst) - AtomicExpansionKind shouldExpandAtomicLoadInIR(inst) Differential Revision: http://reviews.llvm.org/D12557 llvm-svn: 247429
*	[CodeGen] Rename AtomicRMWExpansionKind to AtomicExpansionKind.	Ahmed Bougacha	2015-09-11	1	-3/+3
\| \| \| \| \| \|	This lets us generalize its usage to the other atomic instructions. llvm-svn: 247428
*	Pass BranchProbability/BlockMass by value instead of const& as they are ↵	Cong Hou	2015-09-10	1	-6/+6
\| \| \| \| \| \|	small. NFC. llvm-svn: 247357
*	Fix SEH state numbering algorithm to handle cleanupendpads	Reid Kleckner	2015-09-10	1	-4/+9
\| \| \| \| \| \| \|	WinEHPrepare's new coloring algorithm really expects to see cleanupendpads now, so Clang will start emitting them soon. llvm-svn: 247341
*	Add an explicit 'inline' specifier to these static functions. GCC is	Chandler Carruth	2015-09-10	1	-14/+14
\| \| \| \| \| \| \| \|	warning on them having always_inline attribute for reasons I don't fully understand -- static functions are just as inlinable as inline functions in terms of linkage. llvm-svn: 247334
*	Debug Info: Allow a DIModule to appear as the scope of other entities.	Adrian Prantl	2015-09-10	1	-0/+2
\| \| \| \|	llvm-svn: 247304
*	[WinEH] Fix single-block cleanup coloring	Joseph Tremoulet	2015-09-10	1	-8/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The coloring code in WinEHPrepare queues cleanuprets' successors with the correct color (the parent one) when it sees their cleanuppad, and so later when iterating successors knows to skip processing cleanuprets since they've already been queued. This latter check was incorrectly under an 'else' condition and so inadvertently was not kicking in for single-block cleanups. This change sinks the check out of the 'else' to fix the bug. Reviewers: majnemer, andrew.w.kaylor, rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12751 llvm-svn: 247299
*	Re-commit r247216: "Fix Clang-tidy misc-use-override warnings, other minor ↵	Hans Wennborg	2015-09-10	1	-2/+2
\| \| \| \| \| \| \| \| \|	fixes" Except the changes that defined virtual destructors as =default, because that ran into problems with GCC 4.7 and overriding methods that weren't noexcept. llvm-svn: 247298
*	Fix PR 24724 - The implicit register verifier shouldn't assume certain operand	Alex Lorenz	2015-09-10	1	-39/+16
\| \| \| \| \| \| \| \| \| \|	order. The implicit register verifier in the MIR parser should only check if the instruction's default implicit operands are present in the instruction. It should not check the order in which they occur. llvm-svn: 247283
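A minimal sketch of an order-independent presence check, assuming implicit operands can be modelled as plain strings. The function name and the string representation are hypothetical and do not reflect the MIR parser's actual data structures.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model: each implicit operand is identified by a string.
// Returns true if every expected default implicit operand appears somewhere
// in the parsed operand list, regardless of its position.
inline bool hasAllImplicitOperands(const std::vector<std::string> &Expected,
                                   const std::vector<std::string> &Parsed) {
  for (const std::string &E : Expected)
    if (std::find(Parsed.begin(), Parsed.end(), E) == Parsed.end())
      return false;
  return true;
}
```

A list holding the expected operands in a different order passes this check, while a list that is missing one of them fails it.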
*	[DAGCombine] Truncate BUILD_VECTOR operators if necessary when constant ↵	Silviu Baranga	2015-09-10	1	-11/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	folding vectors Summary: The BUILD_VECTOR node will truncate its operators to match the type. We need to take this into account when constant folding - we need to perform a truncation before constant folding the elements. This is because the upper bits can change the result, depending on the operation type (for example this is the case for min/max). This change also adds a regression test. Reviewers: jmolloy Subscribers: jmolloy, llvm-commits Differential Revision: http://reviews.llvm.org/D12697 llvm-svn: 247265
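The effect of the upper bits on min/max can be sketched as follows. This is an illustration only, assuming 8-bit vector elements whose operands are held in wider 32-bit constants; truncToElt, foldUMinNoTrunc and foldUMinTruncFirst are hypothetical helpers, not DAGCombiner code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical model: the element type is 8 bits wide, but the operands are
// stored as 32-bit constants that may carry stray upper bits.
inline uint32_t truncToElt(uint32_t Operand) { return Operand & 0xFFu; }

// Fold on the raw wide operands, then truncate the result. The upper bits
// take part in the comparison.
inline uint32_t foldUMinNoTrunc(uint32_t A, uint32_t B) {
  return truncToElt(std::min(A, B));
}

// Truncate each operand first, then fold. This matches what an unsigned
// minimum over the actual 8-bit elements would compute.
inline uint32_t foldUMinTruncFirst(uint32_t A, uint32_t B) {
  return std::min(truncToElt(A), truncToElt(B));
}
```

With operands 0x1FF and 0x203, folding on the wide values picks 0x1FF and yields 0xFF after truncation, whereas truncating first compares 0xFF with 0x03 and yields 0x03.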
*	Revert r247216: "Fix Clang-tidy misc-use-override warnings, other minor fixes"	Hans Wennborg	2015-09-10	1	-2/+2
\| \| \| \| \| \| \|	This caused build breakages, e.g. http://lab.llvm.org:8011/builders/clang-x86_64-ubuntu-gdb-75/builds/24926 llvm-svn: 247226
*	[WinEH] Add codegen support for cleanuppad and cleanupret	Reid Kleckner	2015-09-10	4	-56/+82
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	All of the complexity is in cleanupret, and it mostly follows the same codepaths as catchret, except it doesn't take a return value in RAX. This small example now compiles and executes successfully on win32: extern "C" int printf(const char *, ...) noexcept; struct Dtor { ~Dtor() { printf("~Dtor\n"); } }; void has_cleanup() { Dtor o; throw 42; } int main() { try { has_cleanup(); } catch (int) { printf("caught it\n"); } } Don't try to put the cleanup in the same function as the catch, or Bad Things will happen. llvm-svn: 247219
*	Fix Clang-tidy misc-use-override warnings, other minor fixes	Hans Wennborg	2015-09-10	1	-2/+2
\| \| \| \| \| \| \| \|	Patch by Eugene Zelenko! Differential Revision: http://reviews.llvm.org/D12740 llvm-svn: 247216
*	[SEH] Emit 32-bit SEH tables for the new EH IR	Reid Kleckner	2015-09-09	5	-79/+225
\| \| \| \| \| \| \| \| \| \| \|	The 32-bit tables don't actually contain PC range data, so emitting them is incredibly simple. The 64-bit tables, on the other hand, use the same table for state numbering as well as label ranges. This makes things more difficult, so it will be implemented later. llvm-svn: 247192
*	Save LaneMask with livein registers	Matthias Braun	2015-09-09	16	-58/+89
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With subregister liveness enabled we can detect the case where only parts of a register are live in, this is expressed as a 32bit lanemask. The current code only keeps registers in the live-in list and therefore enumerated all subregisters affected by the lanemask. This turned out to be too conservative as the subregister may also cover additional parts of the lanemask which are not live. Expressing a given lanemask by enumerating a minimum set of subregisters is computationally expensive so the best solution is to simply change the live-in list to store the lanemasks as well. This will reduce memory usage for targets using subregister liveness and slightly increase it for other targets Differential Revision: http://reviews.llvm.org/D12442 llvm-svn: 247171
*	VirtRegMap: Improve addMBBLiveIns() using SlotIndex::MBBIndexIterator; NFC	Matthias Braun	2015-09-09	1	-25/+62
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now that we have an explicit iterator over the idx2MBBMap in SlotIndices we can use the fact that segments and the idx2MBBMap are sorted by SlotIndex position, so we can advance both simultaneously instead of starting from the beginning for each segment. This complicates the code for the subregister case somewhat but should be more efficient and has the advantage that we get the final lanemask for each block immediately, which will be important for a subsequent change. Removes the now unused SlotIndexes::findMBBLiveIns function. Differential Revision: http://reviews.llvm.org/D12443 llvm-svn: 247170
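The idea of advancing two sorted sequences together can be sketched as below. This is a simplified model under stated assumptions: a segment is a half-open range of positions, a block is represented only by its start position, and a block counts as live-in when its start falls inside a segment. Segment and liveInBlocks are hypothetical names, not the VirtRegMap implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical simplification: a half-open [Start, End) range of positions.
using Segment = std::pair<unsigned, unsigned>;

// Returns the indices of blocks whose start position lies inside a segment.
// Both inputs must be sorted by position and segments must not overlap.
inline std::vector<std::size_t>
liveInBlocks(const std::vector<Segment> &Segments,
             const std::vector<unsigned> &BlockStarts) {
  std::vector<std::size_t> Result;
  std::size_t B = 0; // Block cursor; it only moves forward.
  for (const Segment &S : Segments) {
    // Skip blocks that start before this segment begins.
    while (B < BlockStarts.size() && BlockStarts[B] < S.first)
      ++B;
    // Collect blocks that start inside this segment.
    while (B < BlockStarts.size() && BlockStarts[B] < S.second) {
      Result.push_back(B);
      ++B;
    }
  }
  return Result;
}
```

Because the block cursor never rewinds, the total work is linear in the number of segments plus the number of blocks, rather than restarting the block search for every segment.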
*	[PM/AA] Rebuild LLVM's alias analysis infrastructure in a way compatible	Chandler Carruth	2015-09-09	17	-42/+51
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	with the new pass manager, and no longer relying on analysis groups. This builds essentially a ground-up new AA infrastructure stack for LLVM. The core ideas are the same that are used throughout the new pass manager: type erased polymorphism and direct composition. The design is as follows: - FunctionAAResults is a type-erasing alias analysis results aggregation interface to walk a single query across a range of results from different alias analyses. Currently this is function-specific as we always assume that aliasing queries are within a function. - AAResultBase is a CRTP utility providing stub implementations of various parts of the alias analysis result concept, notably in several cases in terms of other more general parts of the interface. This can be used to implement only a narrow part of the interface rather than the entire interface. This isn't really ideal, this logic should be hoisted into FunctionAAResults as currently it will cause a significant amount of redundant work, but it faithfully models the behavior of the prior infrastructure. - All the alias analysis passes are ported to be wrapper passes for the legacy PM and new-style analysis passes for the new PM with a shared result object. In some cases (most notably CFL), this is an extremely naive approach that we should revisit when we can specialize for the new pass manager. - BasicAA has been restructured to reflect that it is much more fundamentally a function analysis because it uses dominator trees and loop info that need to be constructed for each function. All of the references to getting alias analysis results have been updated to use the new aggregation interface. 
All the preservation and other pass management code has been updated accordingly. The way the FunctionAAResultsWrapperPass works is to detect the available alias analyses when run, and add them to the results object. This means that we should be able to continue to respect when various passes are added to the pipeline, for example adding CFL or adding TBAA passes should just cause their results to be available and to get folded into this. The exception to this rule is BasicAA which really needs to be a function pass due to using dominator trees and loop info. As a consequence, the FunctionAAResultsWrapperPass directly depends on BasicAA and always includes it in the aggregation. This has significant implications for preserving analyses. Generally, most passes shouldn't bother preserving FunctionAAResultsWrapperPass because rebuilding the results just updates the set of known AA passes. The exception to this rule are LoopPass instances which need to preserve all the function analyses that the loop pass manager will end up needing. This means preserving both BasicAAWrapperPass and the aggregating FunctionAAResultsWrapperPass. Now, when preserving an alias analysis, you do so by directly preserving that analysis. This is only necessary for non-immutable-pass-provided alias analyses though, and there are only three of interest: BasicAA, GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is preserved when needed because it (like DominatorTree and LoopInfo) is marked as a CFG-only pass. I've expanded GlobalsAA into the preserved set everywhere we previously were preserving all of AliasAnalysis, and I've added SCEVAA in the intersection of that with where we preserve SCEV itself. 
One significant challenge to all of this is that the CGSCC passes were actually using the alias analysis implementations by taking advantage of a pretty amazing set of loopholes in the old pass manager's analysis management code which allowed analysis groups to slide through in many cases. Moving away from analysis groups makes this problem much more obvious. To fix it, I've leveraged the flexibility the design of the new PM components provides to just directly construct the relevant alias analyses for the relevant functions in the IPO passes that need them. This is a bit hacky, but should go away with the new pass manager, and is already in many ways cleaner than the prior state. Another significant challenge is that various facilities of the old alias analysis infrastructure just don't fit any more. The most significant of these is the alias analysis 'counter' pass. That pass relied on the ability to snoop on AA queries at different points in the analysis group chain. Instead, I'm planning to build printing functionality directly into the aggregation layer. I've not included that in this patch merely to keep it smaller. Note that all of this needs a nearly complete rewrite of the AA documentation. I'm planning to do that, but I'd like to make sure the new design settles, and to flesh out a bit more of what it looks like in the new pass manager first. Differential Revision: http://reviews.llvm.org/D12080 llvm-svn: 247167
*	MachineVerifier: Check that SlotIndex MBBIndexList is sorted.	Matthias Braun	2015-09-09	1	-0/+17
\| \| \| \| \| \| \|	This introduces a check that the MBBIndexList is sorted as proposed in http://reviews.llvm.org/D12443 but split up into a separate commit. llvm-svn: 247166
*	Fix vector splitting for extract_vector_elt and vector elements of <8-bits.	Daniel Sanders	2015-09-09	2	-2/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: One of the vector splitting paths for extract_vector_elt tries to lower: define i1 @via_stack_bug(i8 signext %idx) { %1 = extractelement <2 x i1> <i1 false, i1 true>, i8 %idx ret i1 %1 } to: define i1 @via_stack_bug(i8 signext %idx) { %base = alloca <2 x i1> store <2 x i1> <i1 false, i1 true>, <2 x i1>* %base %2 = getelementptr <2 x i1>, <2 x i1>* %base, i32 %idx %3 = load i1, i1* %2 ret i1 %3 } However, the elements of <2 x i1> are not byte-addressable. The result of this is that the getelementptr expands to '%base + %idx * (1 / 8)' which simplifies to '%base + %idx * 0', and then simply '%base' causing all values of %idx to extract element zero. This commit fixes this by promoting the vector elements of <8-bits to i8 before splitting the vector. This fixes a number of test failures in pocl. Reviewers: pekka.jaaskelainen Subscribers: pekka.jaaskelainen, llvm-commits Differential Revision: http://reviews.llvm.org/D12591 llvm-svn: 247128
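The stride arithmetic described above can be worked through in a few lines. This is only a sketch of the integer division involved, assuming the byte stride is the element width in bits divided by 8; byteOffset and promotedEltBits are hypothetical helpers, not legalizer code.

```cpp
#include <cassert>
#include <cstdint>

// Byte offset of element Idx when the stride is computed by integer division
// of the element width in bits by 8, as in '%base + %idx * (1 / 8)'.
inline uint64_t byteOffset(uint64_t Idx, unsigned EltBits) {
  assert(EltBits > 0 && "element width must be non-zero");
  return Idx * (EltBits / 8);
}

// Hypothetical helper: widen sub-byte elements to 8 bits first, mirroring the
// promotion to i8 that the commit message describes.
inline unsigned promotedEltBits(unsigned EltBits) {
  return EltBits < 8 ? 8 : EltBits;
}
```

For an i1 element the stride is 1 / 8 = 0, so every index maps to offset 0. After promotion to 8 bits the stride becomes 1 and index 1 maps to offset 1.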
*	SelectionDAG: Support Expand of f16 extloads	Matt Arsenault	2015-09-09	1	-1/+19
\| \| \| \| \| \| \| \| \| \|	Currently this hits an assert that extload should always be supported, which assumes integer extloads. This moves a hack out of SI's argument lowering and is covered by existing tests. llvm-svn: 247113
*	Fix typos / grammar	Matt Arsenault	2015-09-09	1	-26/+26
\| \| \| \|	llvm-svn: 247109
*	[WinEH] Avoid creating MBBs for LLVM BBs that cannot contain code	Reid Kleckner	2015-09-08	5	-57/+83
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Typically these are catchpads, which hold data used to decide whether to catch the exception or continue unwinding. We also shouldn't create MBBs for catchendpads, cleanupendpads, or terminatepads, since no real code can live in them. This fixes a problem where MI passes (like the register allocator) would try to put code into catchpad blocks, which are not executed by the runtime. In the new world, blocks ending in invokes now have many possible successors. llvm-svn: 247102
*	[WinEH] Emit prologues and epilogues for funclets	Reid Kleckner	2015-09-08	3	-30/+52
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: 32-bit funclets have short prologues that allocate enough stack for the largest call in the whole function. The runtime saves CSRs for the funclet. It doesn't restore CSRs after we finally transfer control back to the parent function via a CATCHRET, but that's a separate issue. 32-bit funclets also have to adjust the incoming EBP value, which is what llvm.x86.seh.recoverframe does in the old model. 64-bit funclets need to spill CSRs as normal. For simplicity, this just spills the same set of CSRs as the parent function, rather than trying to compute different CSR sets for the parent function and each funclet. 64-bit funclets also allocate enough stack space for the largest outgoing call frame, like 32-bit. Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12546 llvm-svn: 247092
*	[WebAssembly] Support running without a register allocator in the default ↵	Dan Gohman	2015-09-08	1	-16/+19
\| \| \| \| \| \| \| \| \| \| \| \| \|	CodeGen passes This allows backends which don't use a traditional register allocator, but do need PHI lowering and other passes, to use the default TargetPassConfig::addFastRegAlloc and TargetPassConfig::addOptimizedRegAlloc implementations. Differential Revision: http://reviews.llvm.org/D12691 llvm-svn: 247065
*	[SelectionDAG] Swap commutative binops before constant-based folding	Hal Finkel	2015-09-06	1	-6/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In searching for a fix for the underlying code-quality bug highlighted by r246937 (that SDAG simplification can lead to us generating an ISD::OR node with a constant zero LHS), I ran across this: We generically canonicalize commutative binary-operation nodes in SDAG getNode so that, if only one operand is a constant, it will be on the RHS. However, we were doing this only after a bunch of constant-based simplification checks that all assume this canonical form (that any constant will be on the RHS). Moving the operand-swapping canonicalization prior to these checks seems like the right thing to do (and, as it turns out, causes SDAG to completely fold away the computation in test/CodeGen/ARM/2012-11-14-subs_carry.ll, just like InstCombine would do). llvm-svn: 246938
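Why the ordering matters can be shown with a toy model. It assumes an operand is either a constant or an opaque value, and uses a single RHS-only fold ('x | 0' becomes 'x'); Operand, foldOrZeroRHS and canonicalizeThenFold are hypothetical names and not part of SelectionDAG.

```cpp
#include <cassert>
#include <utility>

// Hypothetical operand: a constant (IsConst set, Value is the constant) or an
// opaque non-constant value (Value is just an identifier).
struct Operand {
  bool IsConst;
  int Value;
};

// A fold that only inspects the RHS, like the checks described above:
// 'x | 0' simplifies to 'x'. Returns true and sets Out on success.
inline bool foldOrZeroRHS(const Operand &LHS, const Operand &RHS,
                          Operand &Out) {
  if (RHS.IsConst && RHS.Value == 0) {
    Out = LHS;
    return true;
  }
  return false;
}

// Canonicalize first (a lone constant moves to the RHS), then run the fold.
inline bool canonicalizeThenFold(Operand LHS, Operand RHS, Operand &Out) {
  if (LHS.IsConst && !RHS.IsConst)
    std::swap(LHS, RHS);
  return foldOrZeroRHS(LHS, RHS, Out);
}
```

Given '0 | x', the RHS-only fold misses the simplification when it runs on the operands as written, but succeeds once the constant has been swapped to the RHS.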
*	Typo. NFC.	Chad Rosier	2015-09-04	1	-1/+1
\| \| \| \|	llvm-svn: 246851
*	Sink COFF.h MC include into .cpp files	Reid Kleckner	2015-09-03	1	-0/+1
\| \| \| \| \| \| \| \|	This prevents MC clients from getting COFF.h, which conflicts with winnt.h macros. Also a minor IWYU cleanup. Now the only public headers including COFF.h are in Object, and they actually need it. llvm-svn: 246784
*	check for fastness before merging in DAGCombiner::MergeConsecutiveStores()	Sanjay Patel	2015-09-03	1	-11/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use and check the 'IsFast' optional parameter to TLI.allowsMemoryAccess() any time we have a merged access candidate. Without this patch, we were generating unaligned 16-byte (SSE) memops for x86 targets where those accesses are slow. This change was mentioned in: http://reviews.llvm.org/D10662 and http://reviews.llvm.org/D10905 and will help solve PR21711. Differential Revision: http://reviews.llvm.org/D12573 llvm-svn: 246771
*	[WinEH] Add cleanupendpad instruction	Joseph Tremoulet	2015-09-03	4	-9/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Add a `cleanupendpad` instruction, used to mark exceptional exits out of cleanups (for languages/targets that can abort a cleanup with another exception). The `cleanupendpad` instruction is similar to the `catchendpad` instruction in that it is an EH pad which is the target of unwind edges in the handler and which itself has an unwind edge to the next EH action. The `cleanupendpad` instruction, similar to `cleanupret`, has a `cleanuppad` argument indicating which cleanup it exits. The unwind successors of a `cleanuppad`'s `cleanupendpad`s must agree with each other and with its `cleanupret`s. Update WinEHPrepare (and docs/tests) to accommodate `cleanupendpad`. Reviewers: rnk, andrew.w.kaylor, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12433 llvm-svn: 246751
*	use "unpredictable" metadata in fast-isel when splitting compares	Sanjay Patel	2015-09-02	1	-1/+4
\| \| \| \| \| \| \| \|	This patch uses the metadata defined in D12341 to avoid creating an unpredictable branch. Differential Revision: http://reviews.llvm.org/D12342 llvm-svn: 246692
*	use "unpredictable" metadata in SelectionDAG when splitting compares	Sanjay Patel	2015-09-02	1	-4/+5
\| \| \| \| \| \| \| \|	This patch uses the metadata defined in D12341 to avoid creating an unpredictable branch. Differential Revision: http://reviews.llvm.org/D12343 llvm-svn: 246691
*	Optimization for Gather/Scatter with uniform base	Elena Demikhovsky	2015-09-02	1	-31/+43
\| \| \| \| \| \| \| \| \|	Vector 'getelementptr' with scalar base is an opportunity for gather/scatter intrinsic to generate a better sequence. While looking for uniform base, we want to use the scalar base pointer of GEP, if it exists. Differential Revision: http://reviews.llvm.org/D11121 llvm-svn: 246622
*	[ARM][AArch64] Turn on by default interleaved access lowering	Silviu Baranga	2015-09-01	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Interleaved access lowering removes a memory operation and a sequence of vector shuffles and replaces it with a series of memory operations. This should always be beneficial. This pass is only enabled on ARM/AArch64. Reviewers: rengolin Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D12145 llvm-svn: 246540