bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[IR] Conservatively mark 'catchpad' as accessing memory	David Majnemer	2015-09-10	1	-0/+2
\| \| \| \| \| \| \| \|	The exact semantics of 'catchpad' are really in the hands of the personality routine so we shouldn't assume that they have no side effects. llvm-svn: 247322
*	[libFuzzer] refactor the code to allow building libFuzzer on platforms that ↵	Kostya Serebryany	2015-09-10	2	-17/+65
\| \| \| \| \| \|	don't have dfsan and don't support weak functions llvm-svn: 247321
*	[SPARC] Switch to the Machine Scheduler.	James Y Knight	2015-09-10	2	-1/+6
\| \| \| \| \| \| \| \| \| \| \|	The (mostly-deprecated) SelectionDAG-based ILPListDAGScheduler scheduler was making poor scheduling decisions, causing high register pressure and extraneous register spills. Switching to the newer machine scheduler generates better code -- even without there being a machine model defined for SPARC yet. llvm-svn: 247315
*	[SCEV] Consistently Handle Expressions That Cannot Be Divided	Matthew Simpson	2015-09-10	1	-36/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch addresses the issue of SCEV division asserting on some input expressions (e.g., non-affine expressions) and quietly giving up on others. When giving up, we set the quotient to be equal to zero and the remainder to be equal to the numerator. With this patch, we always quietly give up when we cannot perform the division. This patch also adds a test case for DependenceAnalysis that previously caused an assertion. Differential Revision: http://reviews.llvm.org/D11725 llvm-svn: 247314
*	[MergeFuncs] Fix callsite attributes in thunk generation	JF Bastien	2015-09-10	1	-1/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change correctly sets the attributes on the callsites generated in thunks. This makes sure things such as sret, sext, etc. are correctly set, so that the call can be a proper tailcall. Also, the transfer of attributes in the replaceDirectCallers function appears to be unnecessary, but until this is confirmed it will remain. Author: jrkoenig Reviewers: dschuff, jfb Subscribers: llvm-commits, nlewycky Differential revision: http://reviews.llvm.org/D12581 llvm-svn: 247313
*	[SimplifyCFG] Use known bits to eliminate dead switch defaults	Philip Reames	2015-09-10	1	-3/+9
\| \| \| \| \| \| \| \| \| \| \| \|	This is a follow up to http://reviews.llvm.org/D11995 implementing the suggestion by Hans. If we know some of the bits of the value being switched on, we know that the maximum number of unique cases covers the unknown bits. This allows to eliminate switch defaults for large integers (i32) when most bits in the value are known. Note that I had to make the transform contingent on not having any dead cases. This is conservatively correct with the old code, but required for the new code since we might have a dead case which varies one of the known bits. Counting that towards our number of covering cases would be bad. If we do have dead cases, we'll eliminate them first, then revisit the possibly dead default. Differential Revision: http://reviews.llvm.org/D12497 llvm-svn: 247309
*	Debug Info: Allow a DIModule to appear as the scope of other entities.	Adrian Prantl	2015-09-10	1	-0/+2
\| \| \| \|	llvm-svn: 247304
*	[libFuzzer] add two more variants of FuzzerDriver for convenience	Kostya Serebryany	2015-09-10	2	-20/+39
\| \| \| \|	llvm-svn: 247300
*	[WinEH] Fix single-block cleanup coloring	Joseph Tremoulet	2015-09-10	1	-8/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The coloring code in WinEHPrepare queues cleanuprets' successors with the correct color (the parent one) when it sees their cleanuppad, and so later when iterating successors knows to skip processing cleanuprets since they've already been queued. This latter check was incorrectly under an 'else' condition and so inadvertently was not kicking in for single-block cleanups. This change sinks the check out of the 'else' to fix the bug. Reviewers: majnemer, andrew.w.kaylor, rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12751 llvm-svn: 247299
*	Re-commit r247216: "Fix Clang-tidy misc-use-override warnings, other minor ↵	Hans Wennborg	2015-09-10	4	-58/+58
\| \| \| \| \| \| \| \| \|	fixes" Except the changes that defined virtual destructors as =default, because that ran into problems with GCC 4.7 and overriding methods that weren't noexcept. llvm-svn: 247298
*	Fix an undefined behavior introduces in r247234	Steven Wu	2015-09-10	1	-1/+1
\| \| \| \|	llvm-svn: 247296
*	80-cols; NFC	Sanjay Patel	2015-09-10	1	-4/+4
\| \| \| \|	llvm-svn: 247295
*	use range-based for loop; NFCI	Sanjay Patel	2015-09-10	1	-2/+2
\| \| \| \|	llvm-svn: 247294
*	use range-based for loop; NFCI	Sanjay Patel	2015-09-10	1	-2/+2
\| \| \| \|	llvm-svn: 247293
*	fix typo; NFC	Sanjay Patel	2015-09-10	1	-1/+1
\| \| \| \|	llvm-svn: 247287
*	Fix PR 24724 - The implicit register verifier shouldn't assume certain operand	Alex Lorenz	2015-09-10	1	-39/+16
\| \| \| \| \| \| \| \| \| \|	order. The implicit register verifier in the MIR parser should only check if the instruction's default implicit operands are present in the instruction. It should not check the order in which they occur. llvm-svn: 247283
*	AVX512: Implemented encoding and intrinsics for	Igor Breger	2015-09-10	1	-52/+109
\| \| \| \| \| \| \| \| \|	vextracti64x4 ,vextracti64x2, vextracti32x8, vextracti32x4, vextractf64x4, vextractf64x2, vextractf32x8, vextractf32x4 Added tests for intrinsics and encoding. Differential Revision: http://reviews.llvm.org/D11802 llvm-svn: 247276
*	There is a trunc(lshr (zext A), Cst) optimization in InstCombineCasts that	Jakub Kuderski	2015-09-10	1	-0/+20
\| \| \| \| \| \| \| \| \| \| \|	removes cast by performing the lshr on smaller types. However, currently there is no trunc(lshr (sext A), Cst) variant. This patch add such optimization by transforming trunc(lshr (sext A), Cst) to ashr A, Cst. Differential Revision: http://reviews.llvm.org/D12520 llvm-svn: 247271
*	[ADT] Rewrite the StringRef::find implementation to be simpler, clearer,	Chandler Carruth	2015-09-10	1	-16/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	and tremendously less reliant on the optimizer to fix things. The code is always necessarily looking for the entire length of the string when doing the equality tests in this find implementation, but it previously was needlessly re-checking the size each time among other annoyances. By writing this so simply an ddirectly in terms of memcmp, it also is about 8x faster in a debug build, which in turn makes FileCheck about 2x faster in 'ninja check-llvm'. This saves about 8% of the time for FileCheck-heavy parts of the test suite like the x86 backend tests. llvm-svn: 247269
*	[DAGCombine] Truncate BUILD_VECTOR operators if necessary when constant ↵	Silviu Baranga	2015-09-10	1	-11/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	folding vectors Summary: The BUILD_VECTOR node will truncate its operators to match the type. We need to take this into account when constant folding - we need to perform a truncation before constant folding the elements. This is because the upper bits can change the result, depending on the operation type (for example this is the case for min/max). This change also adds a regression test. Reviewers: jmolloy Subscribers: jmolloy, llvm-commits Differential Revision: http://reviews.llvm.org/D12697 llvm-svn: 247265
*	Enable GlobalsAA by default	James Molloy	2015-09-10	1	-1/+1
\| \| \| \| \| \|	This can give significant improvements to alias analysis in some situations, and improves its testing coverage in all situations. llvm-svn: 247264
*	Add GlobalsAA as preserved to a bunch of transforms	James Molloy	2015-09-10	14	-0/+28
\| \| \| \| \| \|	GlobalsAA must by definition be preserved in function passes, but the passmanager doesn't know that. Make each pass explicitly preserve GlobalsAA. llvm-svn: 247263
*	[ARM] Do not use vtrn for vectorshuffle if the order is reversed	James Molloy	2015-09-10	1	-4/+13
\| \| \| \| \| \| \| \|	The tests in isVTRNMask and isVTRN_v_undef_Mask should also check that the elements of the upper and lower half of the vectorshuffle occur in the correct order when both halves are used. Without this test the code assumes that it is correct to use vector transpose (vtrn) for the masks <1, 1, 0, 0> and <1, 3, 0, 2>, among others, but the transpose actually incorrectly generates shuffles for <0, 0, 1, 1> and <0, 2, 1, 3> in this case. Patch by Jeroen Ketema! llvm-svn: 247254
*	[ADT] Micro-optimize the Triple constructor by doing a single split and	Chandler Carruth	2015-09-10	1	-8/+21
\| \| \| \| \| \| \| \| \| \| \| \|	re-using the resulting components rather than repeatedly splitting and re-splitting to compute each component as part of the initializer list. This is more work on PR23676. Sadly, it doesn't help much. It removes the constructor from my profile, but doesn't make a sufficient dent in the total time. But it should play together nicely with subsequent changes. llvm-svn: 247250
*	[ADT] Fix a confusing interface spec and some annoying peculiarities	Chandler Carruth	2015-09-10	1	-31/+43
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	with the StringRef::split method when used with a MaxSplit argument other than '-1' (which nobody really does today, but which should actually work). The spec claimed both to split up to MaxSplit times, but also to append <= MaxSplit strings to the vector. One of these doesn't make sense. Given the name "MaxSplit", let's go with it being a max over how many splits occur, which means the max on how many strings get appended is MaxSplit+1. I'm not actually sure the implementation correctly provided this logic either, as it used a really opaque loop structure. The implementation was also playing weird games with nullptr in the data field to try to rely on a totally opaque hidden property of the split method that returns a pair. Nasty IMO. Replace all of this with what is (IMO) simpler code that doesn't use the pair returning split method, and instead just finds each separator and appends directly. I think this is a lot easier to read, and it most definitely matches the spec. Added some tests that exercise the corner cases around StringRef() and StringRef("") that all now pass. I'll start using this in code in the next commit. llvm-svn: 247249
*	GlobalsAAResult(&&): Move every members.	NAKAMURA Takumi	2015-09-10	1	-1/+6
\| \| \| \| \| \|	Or, one of MSVC builders failed with unexpected behavior. llvm-svn: 247247
*	[ADT] Switch a bunch of places in LLVM that were doing single-character	Chandler Carruth	2015-09-10	8	-12/+12
\| \| \| \| \| \| \|	splits to actually use the single character split routine which does less work, and in a debug build is substantially faster. llvm-svn: 247245
*	[ADT] Add a single-character version of the small vector split routine	Chandler Carruth	2015-09-10	2	-1/+21
\| \| \| \| \| \| \| \| \| \| \|	on StringRef. Finding and splitting on a single character is substantially faster than doing it on even a single character StringRef -- we immediately get to a very tuned memchr call this way. Even nicer, we get to this even in a debug build, shaving 18% off the runtime of TripleTest.Normalization, helping PR23676 some more. llvm-svn: 247244
*	[ScalarEvolution] Fix PR24757.	Sanjoy Das	2015-09-10	1	-2/+42
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: PR24757 was caused by some incorect math in `ScalarEvolution::HowFarToZero` -- the smallest unsigned solution for X in 2^N * A = 2^N * X is not necessarily A. Reviewers: atrick, majnemer, meheff Subscribers: llvm-commits, sanjoy Differential Revision: http://reviews.llvm.org/D12721 llvm-svn: 247242
*	[LPM] Simplify this code and fix a compile error for compilers that	Chandler Carruth	2015-09-10	1	-3/+1
\| \| \| \| \| \| \| \|	don't correctly implement the scoping rules of C++11 range based for loops. This kind of aliasing isn't a good idea anyways (and wasn't really intended). llvm-svn: 247241
*	[LPM] Use a map from analysis ID to immutable passes in the legacy pass	Chandler Carruth	2015-09-10	1	-18/+24
\| \| \| \| \| \| \| \| \| \| \| \|	manager to avoid a slow linear scan of every immutable pass and on every attempt to find an analysis pass. This speeds up 'check-llvm' on an unoptimized build for me by 15%, YMMV. It should also help (a tiny bit) other folks that are really bottlenecked on repeated runs of tiny pass pipelines across small IR files. llvm-svn: 247240
*	Enable the shrink wrapping optimization for PPC64.	Kit Barton	2015-09-10	3	-77/+89
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The changes in this patch are as follows: 1. Modify the emitPrologue and emitEpilogue methods to work properly when the prologue and epilogue blocks are not the first/last blocks in the function 2. Fix a bug in PPCEarlyReturn optimization caused by an empty entry block in the function 3. Override the runShrinkWrap PredicateFtor (defined in TargetMachine) to check whether shrink wrapping should run: Shrink wrapping will run on PPC64 (Little Endian and Big Endian) unless -enable-shrink-wrap=false is specified on command line A new test case, ppc-shrink-wrapping.ll was created based on the existing shrink wrapping tests for x86, arm, and arm64. Phabricator review: http://reviews.llvm.org/D11817 llvm-svn: 247237
*	[AArch64] Match FI+offset in STNP addressing mode.	Ahmed Bougacha	2015-09-10	2	-0/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	First, we need to teach isFrameOffsetLegal about STNP. It already knew about the STP/LDP variants, but those were probably never exercised, because it's only the load/store optimizer that generates STP/LDP, and the only user of the method is frame lowering, which runs earlier. The STP/LDP cases were wrong: they didn't take into account the fact that they return two results, not one, so the immediate offset will be the 4th operand, not the 3rd. Follow-up to r247234. llvm-svn: 247236
*	[AArch64] Match base+offset in STNP addressing mode.	Ahmed Bougacha	2015-09-10	1	-0/+16
\| \| \| \| \| \|	Followup to r247231. llvm-svn: 247234
*	[AArch64] Support selecting STNP.	Ahmed Bougacha	2015-09-10	3	-0/+78
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We could go through the load/store optimizer and match STNP where we would have matched a nontemporal-annotated STP, but that's not reliable enough, as an opportunistic optimization. Insetad, we can guarantee emitting STNP, by matching them at ISel. Since there are no single-input nontemporal stores, we have to resort to some high-bits-extracting trickery to generate an STNP from a plain store. Also, we need to support another, LDP/STP-specific addressing mode, base + signed scaled 7-bit immediate offset. For now, only match the base. Let's make it smart separately. Part of PR24086. llvm-svn: 247231
*	AMDGPU/SI: Fix more cases of losing exec operands	Matt Arsenault	2015-09-10	3	-16/+12
\| \| \| \|	llvm-svn: 247230
*	AMDGPU/SI: Fix creating v_mov_b32s without exec uses	Matt Arsenault	2015-09-10	1	-2/+14
\| \| \| \| \| \| \|	This will be caught by existing tests with a verifier check to be added in a future commit. llvm-svn: 247229
*	Revert r247216: "Fix Clang-tidy misc-use-override warnings, other minor fixes"	Hans Wennborg	2015-09-10	4	-58/+58
\| \| \| \| \| \| \|	This caused build breakges, e.g. http://lab.llvm.org:8011/builders/clang-x86_64-ubuntu-gdb-75/builds/24926 llvm-svn: 247226
*	[CodeGen] Make x86 nontemporal store patfrags generic. NFC.	Ahmed Bougacha	2015-09-10	1	-19/+0
\| \| \| \| \| \|	To be used by other targets. llvm-svn: 247225
*	[RewriteStatepointsForGC] Minor refactor to use shared implementation [NFC]	Philip Reames	2015-09-10	1	-8/+1
\| \| \| \|	llvm-svn: 247223
*	[RewriteStatepointsForGC] Strengthen a confusingly weak assertion [NFC]	Philip Reames	2015-09-10	1	-3/+3
\| \| \| \| \| \|	The assertion was weaker than it should be and gave the impression we're growing the number of base defining values being considered during the fixed point interation. That's not true. The tighter form of the assert is useful documentation. llvm-svn: 247221
*	[RewriteStatepointsForGC] One last bit of naming [NFCI]	Philip Reames	2015-09-10	1	-7/+7
\| \| \| \|	llvm-svn: 247220
*	[WinEH] Add codegen support for cleanuppad and cleanupret	Reid Kleckner	2015-09-10	9	-62/+116
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	All of the complexity is in cleanupret, and it mostly follows the same codepaths as catchret, except it doesn't take a return value in RAX. This small example now compiles and executes successfully on win32: extern "C" int printf(const char *, ...) noexcept; struct Dtor { ~Dtor() { printf("~Dtor\n"); } }; void has_cleanup() { Dtor o; throw 42; } int main() { try { has_cleanup(); } catch (int) { printf("caught it\n"); } } Don't try to put the cleanup in the same function as the catch, or Bad Things will happen. llvm-svn: 247219
*	[RewriteStatepointsForGC] Further style/naming fixup [NFCI]	Philip Reames	2015-09-10	1	-26/+26
\| \| \| \|	llvm-svn: 247217
*	Fix Clang-tidy misc-use-override warnings, other minor fixes	Hans Wennborg	2015-09-10	4	-58/+58
\| \| \| \| \| \| \| \|	Patch by Eugene Zelenko! Differential Revision: http://reviews.llvm.org/D12740 llvm-svn: 247216
*	[RewriteStatepointsForGC] More naming cleanup [NFCI]	Philip Reames	2015-09-10	1	-6/+6
\| \| \| \|	llvm-svn: 247213
*	[RewriteStatepointsForGC] Code cleanup [NFC]	Philip Reames	2015-09-09	1	-25/+26
\| \| \| \| \| \|	Factor out common code related to naming values, fix a small style issue. More to follow in separate changes. llvm-svn: 247211
*	[RewriteStatepointsForGC] Extend base pointer inference to handle insertelement	Philip Reames	2015-09-09	1	-58/+61
\| \| \| \| \| \| \| \| \| \| \| \|	This change is simply enhancing the existing inference algorithm to handle insertelement instructions by conservatively inserting a new instruction to propagate the vector of associated base pointers. In the process, I'm ripping out the peephole optimizations which mostly helped cover the fact this hadn't been done. Note that most of the newly inserted nodes will be nearly immediately removed by the post insertion optimization pass introduced in 246718. Arguably, we should be trying harder to avoid the malloc traffic here, but I'd rather get the code correct, then worry about compile time. Unlike previous extensions of the algorithm to handle more case, I discovered the existing code was causing miscompiles in some cases. In particular, we had an implicit assumption that the peephole covered all insert element instructions, so if we had a value directly based on a insert element the peephole didn't cover, we proceeded as if it were a base anyways. Not good. I believe we had the same issue with shufflevector which is why I adjusted the predicate for them as well. Differential Revision: http://reviews.llvm.org/D12583 llvm-svn: 247210
*	[RewriteStatepointsForGC] Make base pointer inference deterministic	Philip Reames	2015-09-09	1	-44/+35
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, the base pointer algorithm wasn't deterministic. The core fixed point was (of course), but we were inserting new nodes and optimizing them in an order which was unspecified and variable. We'd somewhat hacked around this for testing by sorting by value name, but that doesn't solve the general determinism problem. Instead, we can use the order of traversal over the def/use graph to give us a single consistent ordering. Today, this is a DFS order, but the exact order doesn't mater provided it's deterministic for a given input. (Q: It is safe to rely on a deterministic order of operands right?) Note that this only fixes the determinism within a single inference step. The inference step is currently invoked many times in a non-deterministic order. That's a future change in the sequence. :) Differential Revision: http://reviews.llvm.org/D12640 llvm-svn: 247208
*	LowerBitSets: Fix non-determinism bug.	Peter Collingbourne	2015-09-09	1	-4/+22
\| \| \| \| \| \| \| \|	Visit disjoint sets in a deterministic order based on the maximum BitSetNM index, otherwise the order in which we visit them will depend on pointer comparisons. This was being exposed by MSan. llvm-svn: 247201