bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[SEH] Emit 32-bit SEH tables for the new EH IR	Reid Kleckner	2015-09-09	5	-79/+225
\| \| \| \| \| \| \| \| \| \| \|	The 32-bit tables don't actually contain PC range data, so emitting them is incredibly simple. The 64-bit tables, on the other hand, use the same table for state numbering as well as label ranges. This makes things more difficult, so it will be implemented later. llvm-svn: 247192
*	Save LaneMask with livein registers	Matthias Braun	2015-09-09	16	-58/+89
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With subregister liveness enabled we can detect the case where only parts of a register are live in, this is expressed as a 32bit lanemask. The current code only keeps registers in the live-in list and therefore enumerated all subregisters affected by the lanemask. This turned out to be too conservative as the subregister may also cover additional parts of the lanemask which are not live. Expressing a given lanemask by enumerating a minimum set of subregisters is computationally expensive so the best solution is to simply change the live-in list to store the lanemasks as well. This will reduce memory usage for targets using subregister liveness and slightly increase it for other targets Differential Revision: http://reviews.llvm.org/D12442 llvm-svn: 247171
*	VirtRegMap: Improve addMBBLiveIns() using SlotIndex::MBBIndexIterator; NFC	Matthias Braun	2015-09-09	1	-25/+62
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now that we have an explicit iterator over the idx2MBBMap in SlotIndices we can use the fact that segments and the idx2MBBMap is sorted by SlotIndex position so can advance both simultaneously instead of starting from the beginning for each segment. This complicates the code for the subregister case somewhat but should be more efficient and has the advantage that we get the final lanemask for each block immediately which will be important for a subsequent change. Removes the now unused SlotIndexes::findMBBLiveIns function. Differential Revision: http://reviews.llvm.org/D12443 llvm-svn: 247170
*	[PM/AA] Rebuild LLVM's alias analysis infrastructure in a way compatible	Chandler Carruth	2015-09-09	17	-42/+51
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	with the new pass manager, and no longer relying on analysis groups. This builds essentially a ground-up new AA infrastructure stack for LLVM. The core ideas are the same that are used throughout the new pass manager: type erased polymorphism and direct composition. The design is as follows: - FunctionAAResults is a type-erasing alias analysis results aggregation interface to walk a single query across a range of results from different alias analyses. Currently this is function-specific as we always assume that aliasing queries are within a function. - AAResultBase is a CRTP utility providing stub implementations of various parts of the alias analysis result concept, notably in several cases in terms of other more general parts of the interface. This can be used to implement only a narrow part of the interface rather than the entire interface. This isn't really ideal, this logic should be hoisted into FunctionAAResults as currently it will cause a significant amount of redundant work, but it faithfully models the behavior of the prior infrastructure. - All the alias analysis passes are ported to be wrapper passes for the legacy PM and new-style analysis passes for the new PM with a shared result object. In some cases (most notably CFL), this is an extremely naive approach that we should revisit when we can specialize for the new pass manager. - BasicAA has been restructured to reflect that it is much more fundamentally a function analysis because it uses dominator trees and loop info that need to be constructed for each function. All of the references to getting alias analysis results have been updated to use the new aggregation interface. All the preservation and other pass management code has been updated accordingly. The way the FunctionAAResultsWrapperPass works is to detect the available alias analyses when run, and add them to the results object. This means that we should be able to continue to respect when various passes are added to the pipeline, for example adding CFL or adding TBAA passes should just cause their results to be available and to get folded into this. The exception to this rule is BasicAA which really needs to be a function pass due to using dominator trees and loop info. As a consequence, the FunctionAAResultsWrapperPass directly depends on BasicAA and always includes it in the aggregation. This has significant implications for preserving analyses. Generally, most passes shouldn't bother preserving FunctionAAResultsWrapperPass because rebuilding the results just updates the set of known AA passes. The exception to this rule are LoopPass instances which need to preserve all the function analyses that the loop pass manager will end up needing. This means preserving both BasicAAWrapperPass and the aggregating FunctionAAResultsWrapperPass. Now, when preserving an alias analysis, you do so by directly preserving that analysis. This is only necessary for non-immutable-pass-provided alias analyses though, and there are only three of interest: BasicAA, GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is preserved when needed because it (like DominatorTree and LoopInfo) is marked as a CFG-only pass. I've expanded GlobalsAA into the preserved set everywhere we previously were preserving all of AliasAnalysis, and I've added SCEVAA in the intersection of that with where we preserve SCEV itself. One significant challenge to all of this is that the CGSCC passes were actually using the alias analysis implementations by taking advantage of a pretty amazing set of loop holes in the old pass manager's analysis management code which allowed analysis groups to slide through in many cases. Moving away from analysis groups makes this problem much more obvious. To fix it, I've leveraged the flexibility the design of the new PM components provides to just directly construct the relevant alias analyses for the relevant functions in the IPO passes that need them. This is a bit hacky, but should go away with the new pass manager, and is already in many ways cleaner than the prior state. Another significant challenge is that various facilities of the old alias analysis infrastructure just don't fit any more. The most significant of these is the alias analysis 'counter' pass. That pass relied on the ability to snoop on AA queries at different points in the analysis group chain. Instead, I'm planning to build printing functionality directly into the aggregation layer. I've not included that in this patch merely to keep it smaller. Note that all of this needs a nearly complete rewrite of the AA documentation. I'm planning to do that, but I'd like to make sure the new design settles, and to flesh out a bit more of what it looks like in the new pass manager first. Differential Revision: http://reviews.llvm.org/D12080 llvm-svn: 247167
*	MachineVerifier: Check that SlotIndex MBBIndexList is sorted.	Matthias Braun	2015-09-09	1	-0/+17
\| \| \| \| \| \| \|	This introduces a check that the MBBIndexList is sorted as proposed in http://reviews.llvm.org/D12443 but split up into a separate commit. llvm-svn: 247166
*	Fix vector splitting for extract_vector_elt and vector elements of <8-bits.	Daniel Sanders	2015-09-09	2	-2/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: One of the vector splitting paths for extract_vector_elt tries to lower: define i1 @via_stack_bug(i8 signext %idx) { %1 = extractelement <2 x i1> <i1 false, i1 true>, i8 %idx ret i1 %1 } to: define i1 @via_stack_bug(i8 signext %idx) { %base = alloca <2 x i1> store <2 x i1> <i1 false, i1 true>, <2 x i1>* %base %2 = getelementptr <2 x i1>, <2 x i1>* %base, i32 %idx %3 = load i1, i1* %2 ret i1 %3 } However, the elements of <2 x i1> are not byte-addressible. The result of this is that the getelementptr expands to '%base + %idx * (1 / 8)' which simplifies to '%base + %idx * 0', and then simply '%base' causing all values of %idx to extract element zero. This commit fixes this by promoting the vector elements of <8-bits to i8 before splitting the vector. This fixes a number of test failures in pocl. Reviewers: pekka.jaaskelainen Subscribers: pekka.jaaskelainen, llvm-commits Differential Revision: http://reviews.llvm.org/D12591 llvm-svn: 247128
*	SelectionDAG: Support Expand of f16 extloads	Matt Arsenault	2015-09-09	1	-1/+19
\| \| \| \| \| \| \| \| \| \|	Currently this hits an assert that extload should always be supported, which assumes integer extloads. This moves a hack out of SI's argument lowering and is covered by existing tests. llvm-svn: 247113
*	Fix typos / grammar	Matt Arsenault	2015-09-09	1	-26/+26
\| \| \| \|	llvm-svn: 247109
*	[WinEH] Avoid creating MBBs for LLVM BBs that cannot contain code	Reid Kleckner	2015-09-08	5	-57/+83
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Typically these are catchpads, which hold data used to decide whether to catch the exception or continue unwinding. We also shouldn't create MBBs for catchendpads, cleanupendpads, or terminatepads, since no real code can live in them. This fixes a problem where MI passes (like the register allocator) would try to put code into catchpad blocks, which are not executed by the runtime. In the new world, blocks ending in invokes now have many possible successors. llvm-svn: 247102
*	[WinEH] Emit prologues and epilogues for funclets	Reid Kleckner	2015-09-08	3	-30/+52
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: 32-bit funclets have short prologues that allocate enough stack for the largest call in the whole function. The runtime saves CSRs for the funclet. It doesn't restore CSRs after we finally transfer control back to the parent funciton via a CATCHRET, but that's a separate issue. 32-bit funclets also have to adjust the incoming EBP value, which is what llvm.x86.seh.recoverframe does in the old model. 64-bit funclets need to spill CSRs as normal. For simplicity, this just spills the same set of CSRs as the parent function, rather than trying to compute different CSR sets for the parent function and each funclet. 64-bit funclets also allocate enough stack space for the largest outgoing call frame, like 32-bit. Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12546 llvm-svn: 247092
*	[WebAssembly] Support running without a register allocator in the default ↵	Dan Gohman	2015-09-08	1	-16/+19
\| \| \| \| \| \| \| \| \| \| \| \| \|	CodeGen passes This allows backends which don't use a traditional register allocator, but do need PHI lowering and other passes, to use the default TargetPassConfig::addFastRegAlloc and TargetPassConfig::addOptimizedRegAlloc implementations. Differential Revision: http://reviews.llvm.org/D12691 llvm-svn: 247065
*	[SelectionDAG] Swap commutative binops before constant-based folding	Hal Finkel	2015-09-06	1	-6/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In searching for a fix for the underlying code-quality bug highlighted by r246937 (that SDAG simplification can lead to us generating an ISD::OR node with a constant zero LHS), I ran across this: We generically canonicalize commutative binary-operation nodes in SDAG getNode so that, if only one operand is a constant, it will be on the RHS. However, we were doing this only after a bunch of constant-based simplification checks that all assume this canonical form (that any constant will be on the RHS). Moving the operand-swapping canonicalization prior to these checks seems like the right thing to do (and, as it turns out, causes SDAG to completely fold away the computation in test/CodeGen/ARM/2012-11-14-subs_carry.ll, just like InstCombine would do). llvm-svn: 246938
*	Typo. NFC.	Chad Rosier	2015-09-04	1	-1/+1
\| \| \| \|	llvm-svn: 246851
*	Sink COFF.h MC include into .cpp files	Reid Kleckner	2015-09-03	1	-0/+1
\| \| \| \| \| \| \| \|	This prevents MC clients from getting COFF.h, which conflicts with winnt.h macros. Also a minor IWYU cleanup. Now the only public headers including COFF.h are in Object, and they actually need it. llvm-svn: 246784
*	check for fastness before merging in DAGCombiner::MergeConsecutiveStores()	Sanjay Patel	2015-09-03	1	-11/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use and check the 'IsFast' optional parameter to TLI.allowsMemoryAccess() any time we have a merged access candidate. Without this patch, we were generating unaligned 16-byte (SSE) memops for x86 targets where those accesses are slow. This change was mentioned in: http://reviews.llvm.org/D10662 and http://reviews.llvm.org/D10905 and will help solve PR21711. Differential Revision: http://reviews.llvm.org/D12573 llvm-svn: 246771
*	[WinEH] Add cleanupendpad instruction	Joseph Tremoulet	2015-09-03	4	-9/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Add a `cleanupendpad` instruction, used to mark exceptional exits out of cleanups (for languages/targets that can abort a cleanup with another exception). The `cleanupendpad` instruction is similar to the `catchendpad` instruction in that it is an EH pad which is the target of unwind edges in the handler and which itself has an unwind edge to the next EH action. The `cleanupendpad` instruction, similar to `cleanupret` has a `cleanuppad` argument indicating which cleanup it exits. The unwind successors of a `cleanuppad`'s `cleanupendpad`s must agree with each other and with its `cleanupret`s. Update WinEHPrepare (and docs/tests) to accomodate `cleanupendpad`. Reviewers: rnk, andrew.w.kaylor, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12433 llvm-svn: 246751
*	use "unpredictable" metadata in fast-isel when splitting compares	Sanjay Patel	2015-09-02	1	-1/+4
\| \| \| \| \| \| \| \|	This patch uses the metadata defined in D12341 to avoid creating an unpredictable branch. Differential Revision: http://reviews.llvm.org/D12342 llvm-svn: 246692
*	use "unpredictable" metadata in SelectionDAG when splitting compares	Sanjay Patel	2015-09-02	1	-4/+5
\| \| \| \| \| \| \| \|	This patch uses the metadata defined in D12341 to avoid creating an unpredictable branch. Differential Revision: http://reviews.llvm.org/D12343 llvm-svn: 246691
*	Optimization for Gather/Scatter with uniform base	Elena Demikhovsky	2015-09-02	1	-31/+43
\| \| \| \| \| \| \| \| \|	Vector 'getelementptr' with scalar base is an opportunity for gather/scatter intrinsic to generate a better sequence. While looking for uniform base, we want to use the scalar base pointer of GEP, if exists. Differential Revision: http://reviews.llvm.org/D11121 llvm-svn: 246622
*	[ARM][AArch64] Turn on by default interleaved access lowering	Silviu Baranga	2015-09-01	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Interleaved access lowering removes a memory operation and a sequence of vector shuffles and replaces it with a series of memory operations. This should be always beneficial. This pass in only enabled on ARM/AArch64. Reviewers: rengolin Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D12145 llvm-svn: 246540
*	Distribute the weight on the edge from switch to default statement to edges ↵	Cong Hou	2015-09-01	3	-21/+43
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	generated in lowering switch. Currently, when edge weights are assigned to edges that are created when lowering switch statement, the weight on the edge to default statement (let's call it "default weight" here) is not considered. We need to distribute this weight properly. However, without value profiling, we have no idea how to distribute it. In this patch, I applied the heuristic that this weight is evenly distributed to successors. For example, given a switch statement with cases 1,2,3,5,10,11,20, and every edge from switch to each successor has weight 10. If there is a binary search tree built to test if n < 10, then its two out-edges will have weight 4x10+10/2 = 45 and 3x10 + 10/2 = 35 respectively (currently they are 40 and 30 without considering the default weight). Each distribution (which is 5 here) will be stored in each SwitchWorkListItem for further distribution. There are some exceptions: For a jump table header which doesn't have any edge to default statement, we don't distribute the default weight to it. For a bit test header which covers a contiguous range and hence has no edges to default statement, we don't distribute the default weight to it. When the branch checks a single value or a contiguous range with no edge to default statement, we don't distribute the default weight to it. In other cases, the default weight is evenly distributed to successors. Differential Revision: http://reviews.llvm.org/D12418 llvm-svn: 246522
*	[DAGCombine] Fixup SETCC legality checking	Hal Finkel	2015-08-31	1	-11/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	SETCC is one of those special node types for which operation actions (legality, etc.) is keyed off of an operand type, not the node's value type. This makes sense because the value type of a legal SETCC node is determined by its operands' value type (via the TLI function getSetCCResultType). When the SDAGBuilder creates SETCC nodes, it either creates them with an MVT::i1 value type, or directly with the value type provided by TLI.getSetCCResultType. The first problem being fixed here is that DAGCombine had several places querying TLI.isOperationLegal on SETCC, but providing the return of getSetCCResultType, instead of the operand type directly. This does not mean what the author thought, and "luckily", most in-tree targets have SETCC with Custom lowering, instead of marking them Legal, so these checks return false anyway. The second problem being fixed here is that two of the DAGCombines could create SETCC nodes with arbitrary (integer) value types; specifically, those that would simplify: (setcc a, b, op1) and\|or (setcc a, b, op2) -> setcc a, b, op3 (which is possible for some combinations of (op1, op2)) If the operands of the and\|or node are actual setcc nodes, then this is not an issue (because the and\|or must share the same type), but, the relevant code in DAGCombiner::visitANDLike and DAGCombiner::visitORLike actually calls DAGCombiner::isSetCCEquivalent on each operand, and that function will recognise setcc-like select_cc nodes with other return types. And, thus, when creating new SETCC nodes, we need to be careful to respect the value-type constraint. This is even true before type legalization, because it is quite possible for the SELECT_CC node to have a legal type that does not happen to match the corresponding TLI.getSetCCResultType type. To be explicit, there is nothing that later fixes the value types of SETCC nodes (if the type is legal, but does not happen to match TLI.getSetCCResultType). Creating SETCCs with an MVT::i1 value type seems to work only because, either MVT::i1 is not legal, or it is what TLI.getSetCCResultType returns if it is legal. Fixing that is a larger change, however. For the time being, restrict the relevant transformations to produce only SETCC nodes with a value type matching TLI.getSetCCResultType (or MVT::i1 prior to type legalization). Fixes PR24636. llvm-svn: 246507
*	don't set a legal vector type if we know we can't use that type (NFCI)	Sanjay Patel	2015-08-31	1	-18/+12
\| \| \| \| \| \|	Added benefit: the 'if' logic now matches the text of the comment that describes it. llvm-svn: 246506
*	generalize helper function of MergeConsecutiveStores to handle vector types ↵	Sanjay Patel	2015-08-31	1	-14/+21
\| \| \| \| \| \| \| \| \| \|	(NFCI) This was part of D7208 (r227242), but that commit was reverted because it exposed a bug in AArch64 lowering. I should have that fixed and the rest of the commit reinstated soon. llvm-svn: 246493
*	[DAGCombine] Use getSetCCResultType utility function	Hal Finkel	2015-08-31	1	-1/+1
\| \| \| \| \| \| \|	DAGCombine has a utility wrapper around TLI's getSetCCResultType; use it in the one place in DAGCombine still directly calling the TLI function. NFC. llvm-svn: 246482
*	[EH] Handle non-Function personalities like unknown personalities	Reid Kleckner	2015-08-31	5	-74/+13
\| \| \| \| \| \| \| \| \|	Also delete and simplify a lot of MachineModuleInfo code that used to be needed to handle personalities on landingpads. Now that the personality is on the LLVM Function, we no longer need to track it this way on MMI. Certainly it should not live on LandingPadInfo. llvm-svn: 246478
*	[DAGCombine] Remove some old dead code for forming SETCC nodes	Hal Finkel	2015-08-31	1	-45/+0
\| \| \| \| \| \| \| \| \| \|	This code was dead when it was committed in r23665 (Oct 7, 2005), and before it reaches its 10th anniversary, it really should go. We can always bring it back if we'd like, but it forms more SETCC nodes, and the way we do legality checking on SETCC nodes is wrong in a number of places, and removing this means fewer places to fix. NFC. llvm-svn: 246466
*	Rework of the new interface for shrink wrapping	Kit Barton	2015-08-31	2	-21/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Based on comments from Hal (http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20150810/292978.html), I've changed the interface to add a callback mechanism to the TargetFrameLowering class to query whether the specific target supports shrink wrapping. By default, shrink wrapping is disabled by default. Each target can override the default behaviour using the TargetFrameLowering::targetSupportsShrinkWrapping() method. Shrink wrapping can still be explicitly enabled or disabled from the command line, using the existing -enable-shrink-wrap=<true\|false> option. Phabricator: http://reviews.llvm.org/D12293 llvm-svn: 246463
*	[AggressiveAntiDepBreaker] Check for EarlyClobber on defining instruction	Hal Finkel	2015-08-31	1	-0/+14
\| \| \| \| \| \| \| \| \|	AggressiveAntiDepBreaker was doing some EarlyClobber checking, but was not checking that the register being potentially renamed was defined by an early-clobber def where there was also a use, in that instruction, of the register being considered as the target of the rename. Fixes PR24014. llvm-svn: 246423
*	Support: Support LLVM_ENABLE_THREADS=0 in llvm/Support/thread.h.	Peter Collingbourne	2015-08-31	1	-2/+2
\| \| \| \| \| \| \| \|	Specifically, the header now provides llvm::thread, which is either a typedef of std::thread or a replacement that calls the function synchronously depending on the value of LLVM_ENABLE_THREADS. llvm-svn: 246402
*	Revert "Revert "New interface function is added to VectorUtils Value ↵	Renato Golin	2015-08-30	1	-17/+13
\| \| \| \| \| \| \| \| \|	getSplatValue(Value Val);"" This reverts commit r246379. It seems that the commit was not the culprit, and the bot will be investigated for instability. llvm-svn: 246380
*	Revert "New interface function is added to VectorUtils Value ↵	Renato Golin	2015-08-30	1	-13/+17
\| \| \| \| \| \| \| \| \| \|	getSplatValue(Value Val);" This reverts commit r246371, as it cause a rather obscure bug in AArch64 test-suite paq8p (time outs, seg-faults). I'll investigate it before reapplying. llvm-svn: 246379
*	New interface function is added to VectorUtils	Elena Demikhovsky	2015-08-30	1	-17/+13
\| \| \| \| \| \| \| \| \| \| \| \| \|	Value getSplatValue(Value Val); It complements the CreateVectorSplat(), which creates 2 instructions - insertelement and shuffle with all-zero mask. The new function recognizes the pattern - insertelement+shuffle and returns the splat value (or nullptr). It also returns a splat value form ConstantDataVector, for completeness. Differential Revision: http://reviews.llvm.org/D11124 llvm-svn: 246371
*	SelectionDAG: add missing ComputeSignBits case for SELECT_CC	Fiona Glaser	2015-08-29	1	-0/+5
\| \| \| \| \| \|	Identical to SELECT, just with different operand numbers. llvm-svn: 246366
*	Fix shared library build.	Peter Collingbourne	2015-08-29	1	-0/+7
\| \| \| \|	llvm-svn: 246365
*	AsmPrinter: Allow null subroutine type	Duncan P. N. Exon Smith	2015-08-28	1	-4/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently the DWARF backend requires that subprograms have a type, and the type is ignored if it has an empty type array. The long term direction here -- see PR23079 -- is instead to skip the type entirely if there's no valid type. It turns out we have cases in tree of missing types on subprograms, but since they're not referenced by compile units, the backend never crashes on them. One option would be to add a Verifier check that subprograms have types, and fix the bitrot. However, this is a fair bit of churn (20-30 testcases) that would be reversed anyway by PR23079. I found this inconsistency because of a WIP patch and upgrade script for PR23367 that started crashing on test/DebugInfo/2010-10-01-crash.ll. This commit updates the testcase to reference the subprogram from the compile unit, and fixes the resulting crash (in line with the direction of PR23079). This also updates `DIBuilder` to stop assuming a non-null pointer for the subroutine types. llvm-svn: 246333
*	Revert r246232 and r246304.	David Majnemer	2015-08-28	1	-10/+12
\| \| \| \| \| \| \| \| \|	This reverts isSafeToSpeculativelyExecute's use of ReadNone until we split ReadNone into two pieces: one attribute which reasons about how the function reasons about memory and another attribute which determines how it may be speculated, CSE'd, trap, etc. llvm-svn: 246331
*	Make MergeConsecutiveStores look at other stores on same chain	Matt Arsenault	2015-08-28	1	-24/+149
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When combiner AA is enabled, look at stores on the same chain. Non-aliasing stores are moved to the same chain so the existing code fails because it expects to find an adajcent store on a consecutive chain. Because of how DAGCombiner tries these store combines, MergeConsecutiveStores doesn't see the correct set of stores on the chain when it visits the other stores. Each store individually has its chain fixed before trying to merge consecutive stores, and then tries to merge stores from that point before the other stores have been processed to have their chains fixed. To fix this, attempt to use FindBetterChain on any possibly neighboring stores in visitSTORE. Suppose you have 4 32-bit stores that should be merged into 1 vector store. One store would be visited first, fixing the chain. What happens is because not all of the store chains have yet been fixed, 2 of the stores are merged. The other 2 stores later have their chains fixed, but because the other stores were already merged, they have different memory types and merging the two different sized stores is not supported and would be more difficult to handle. llvm-svn: 246307
*	[CodeGen] isInTailCallPosition didn't consider readnone tailcalls	David Majnemer	2015-08-28	1	-12/+10
\| \| \| \| \| \| \| \| \| \|	A readnone tailcall may still have a chain of computation which follows it that would invalidate a tailcall lowering. Don't skip the analysis in such cases. This fixes PR24613. llvm-svn: 246304
*	LLVMCodeGen: Update libdeps corresponding to r246236.	NAKAMURA Takumi	2015-08-28	1	-1/+1
\| \| \| \|	llvm-svn: 246274
*	[CodeGen] Support (and default to) expanding READCYCLECOUNTER to 0.	Ahmed Bougacha	2015-08-28	4	-0/+28
\| \| \| \| \| \| \| \| \| \| \|	For targets that didn't support this, this will let us respect the langref instead of failing to select. Note that we don't need to change the 32-bit x86/PPC lowerings (to account for the result type/# difference) because they're both custom and bypass type legalization. llvm-svn: 246258
*	[WinEH] Update coloring to handle nested cases cleanly	Joseph Tremoulet	2015-08-28	1	-70/+114
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Change the coloring algorithm in WinEHPrepare to visit a funclet's exits in its parents' contexts and so properly classify the continuations of nested funclets. Also change the placement of cloned blocks to be deterministic and to maintain the relative order of each funclet's blocks. Add a lit test showing various patterns that require cloning, the last several of which don't have CHECKs yet because they require cloning entire funclets which is NYI. Reviewers: rnk, andrew.w.kaylor, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12353 llvm-svn: 246245
*	CodeGen: Introduce splitCodeGen and teach LTOCodeGenerator to use it.	Peter Collingbourne	2015-08-27	2	-0/+96
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	llvm::splitCodeGen is a function that implements the core of parallel LTO code generation. It uses llvm::SplitModule to split the module into linkable partitions and spawning one code generation thread per partition. The function produces multiple object files which can be linked in the usual way. This has been threaded through to LTOCodeGenerator (and llvm-lto for testing purposes). Separate patches will add parallel LTO support to the gold plugin and lld. Differential Revision: http://reviews.llvm.org/D12260 llvm-svn: 246236
*	[WinEH] Add some support for code generating catchpad	Reid Kleckner	2015-08-27	20	-77/+134
\| \| \| \| \| \| \|	We can now run 32-bit programs with empty catch bodies. The next step is to change PEI so that we get funclet prologues and epilogues. llvm-svn: 246235
*	[CodeGen] Check FoldConstantArithmetic result before using it.	Ahmed Bougacha	2015-08-27	1	-2/+3
\| \| \| \| \| \| \| \|	Fixes PR24602: r245689 introduced an unguarded use of SelectionDAG::FoldConstantArithmetic, which returns 0 when it fails because of opaque (hoisted) constants. llvm-svn: 246217
*	Fixed a bug that edge weights are not assigned correctly when lowering ↵	Cong Hou	2015-08-27	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	switch statement. This is a one-line-change patch that moves the update to UnhandledWeights to the correct position: it should be updated for all clusters instead of just range clusters. Differential Revision: http://reviews.llvm.org/D12391 llvm-svn: 246129
*	Assign weights to edges to jump table / bit test header when lowering switch ↵	Cong Hou	2015-08-26	2	-11/+22
\| \| \| \| \| \| \| \| \| \|	statement. Currently, when lowering switch statement and a new basic block is built for jump table / bit test header, the edge to this new block is not assigned with a correct weight. This patch collects the edge weight from all its successors and assign this sum of weights to the edge (and also the other fall-through edge). Test cases are adjusted accordingly. Differential Revision: http://reviews.llvm.org/D12166#fae6eca7 llvm-svn: 246104
*	SelectionDAGBuilder: Fix SPDescriptor not resetting GuardReg	Matthias Braun	2015-08-26	1	-0/+1
\| \| \| \| \| \| \| \|	This was causing problems when some functions use a GuardReg and some don't as can happen when mixing SelectionDAG and FastISel generated functions. llvm-svn: 246075
*	FastISel: Avoid adding a successor block twice for degenerate IR.	Matthias Braun	2015-08-26	1	-1/+5
\| \| \| \| \| \| \| \|	This fixes http://llvm.org/PR24581 Differential Revision: http://reviews.llvm.org/D12350 llvm-svn: 246074
*	FastISel: Factor out common code; NFC intended	Matthias Braun	2015-08-26	1	-0/+12
\| \| \| \| \| \| \| \| \|	This should be no functional change but for the record: For three cases in X86FastISel this will change the order in which the FalseMBB and TrueMBB of a conditional branch is addedd to the successor/predecessor lists. llvm-svn: 245997