path: root/llvm/test/Transforms
Commit message (Author, Date; files changed, lines -/+)
* AArch64/ARM64: move ARM64 into AArch64's place (Tim Northover, 2014-05-24; 15 files, -20/+14)
    This commit starts with a "git mv ARM64 AArch64" and continues out from
    there, renaming the C++ classes, intrinsics, and other target-local
    objects for consistency. "ARM64" test directories are also moved, and
    tests that began their life in ARM64 use an arm64 triple, while those
    from AArch64 use an aarch64 triple. Both should be equivalent, though.
    This finishes the AArch64 merge, and everyone should feel free to
    continue committing as normal now.
    llvm-svn: 209577
* AArch64/ARM64: remove AArch64 from tree prior to renaming ARM64 (Tim Northover, 2014-05-24; 1 file, -1/+1)
    I'm doing this in two phases for a better "git blame" record. This
    commit removes the previous AArch64 backend and redirects all
    functionality to ARM64. It also deduplicates test lines and removes
    orphaned AArch64 tests. The next step will be "git mv ARM64 AArch64"
    and rewiring most of the tests. Hopefully LLVM is still functional,
    though it would be even better if no-one ever had to care because the
    rename happens straight afterwards.
    llvm-svn: 209576
* Implement sext(C1 + C2*X) --> sext(C1) + sext(C2*X) and sext{C1,+,C2} --> sext(C1) + sext{0,+,C2} transformations in Scalar Evolution (Michael Zolotukhin, 2014-05-24; 1 file, -0/+175)
    This helps the SLP vectorizer recognize consecutive loads and stores.
    <rdar://problem/14860614>
    llvm-svn: 209568
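    A hypothetical sketch (function and value names invented, not from the
    commit) of the kind of IR this helps: both store addresses are built
    from sign-extended 32-bit indices that differ by a constant, and the
    rewritten SCEVs let the SLP vectorizer prove the accesses are
    consecutive:

        define void @two_stores(float* %a, i32 %i) {
          ; %p1 is exactly one float past %p0; the new sext rules let
          ; SCEV see that through the sign extensions
          %i1 = add nsw i32 %i, 1
          %idx0 = sext i32 %i to i64
          %idx1 = sext i32 %i1 to i64
          %p0 = getelementptr inbounds float* %a, i64 %idx0
          %p1 = getelementptr inbounds float* %a, i64 %idx1
          store float 1.0, float* %p0
          store float 2.0, float* %p1
          ret void
        }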
* Fix broken FileCheck prefixes (Nico Rieck, 2014-05-23; 1 file, -1/+1)
    llvm-svn: 209538
* Add the extracted constant offset using GEP (Jingyue Wu, 2014-05-23; 2 files, -13/+30)
    Fixed a TODO in r207783. Add the extracted constant offset using GEP
    instead of ugly ptrtoint+add+inttoptr. Using GEP simplifies future
    optimizations and makes IR easier to understand. Updated all affected
    tests, and added a new test in split-gep.ll to cover a corner case
    where emitting uglygep is necessary.
    llvm-svn: 209537
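    A hypothetical before/after sketch (names invented) of the two ways to
    re-add an extracted constant offset of 32 bytes:

        ; before: ugly ptrtoint+add+inttoptr
        define float* @uglygep(float* %base) {
          %i = ptrtoint float* %base to i64
          %sum = add i64 %i, 32
          %p = inttoptr i64 %sum to float*
          ret float* %p
        }

        ; after: the same offset as a GEP (8 x float = 32 bytes), which
        ; later passes can reason about
        define float* @cleangep(float* %base) {
          %p = getelementptr float* %base, i64 8
          ret float* %p
        }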
* ScalarEvolution: Fix handling of AddRecs in isKnownPredicate (Justin Bogner, 2014-05-23; 1 file, -0/+30)
    ScalarEvolution::isKnownPredicate() can wrongly reduce a comparison
    when both the LHS and RHS are SCEVAddRecExprs. This checks that both
    LHS and RHS are guarded in the case when both are SCEVAddRecExprs.
    The test case is against indvars because I could not find a way to
    directly test SCEV.
    Patch by Sanjay Patel!
    llvm-svn: 209487
* Add support for missed and analysis optimization remarks (Diego Novillo, 2014-05-22; 1 file, -0/+60)
    Summary:
    This adds two new diagnostics: -pass-remarks-missed and
    -pass-remarks-analysis. They take the same values as -pass-remarks but
    are intended to be triggered in different contexts.

    -pass-remarks-missed is used by
    LLVMContext::emitOptimizationRemarkMissed, which passes call when they
    tried to apply a transformation but could not.

    -pass-remarks-analysis is used by
    LLVMContext::emitOptimizationRemarkAnalysis, which passes call when
    they want to inform the user about analysis results.

    The patch also:
    1. Adds support in the inliner for the two new remarks and a test case.
    2. Moves the emitOptimizationRemark* functions to the llvm namespace.
    3. Adds an LLVMContext argument instead of making them member
       functions of LLVMContext.

    Reviewers: qcolombet
    Subscribers: llvm-commits
    Differential Revision: http://reviews.llvm.org/D3682
    llvm-svn: 209442
* Similar to bitcast, treat addrspacecast as a foldable operand (Eli Bendersky, 2014-05-22; 1 file, -0/+37)
    Added a test, sink-addrspacecast.ll, to verify this change.
    Patch by Jingyue Wu.
    llvm-svn: 209343
* Teach isKnownNonNull that a nonnull return is not null (Nick Lewycky, 2014-05-20; 1 file, -0/+17)
    Add a test for this case as well as the case of a nonnull attribute
    (already handled but not tested).
    llvm-svn: 209193
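    A minimal sketch (hypothetical function names) of the pattern this
    lets us fold:

        declare nonnull i8* @get()

        define i1 @is_null() {
          ; the nonnull return attribute on @get proves %p != null,
          ; so the icmp folds to false
          %p = call nonnull i8* @get()
          %c = icmp eq i8* %p, null
          ret i1 %c
        }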
* [ConstantHoisting][X86] Change the cost model to never hoist constants for types larger than i128 (Juergen Ributzka, 2014-05-19; 1 file, -0/+9)
    Currently the X86 backend doesn't support types larger than i128 very
    well. For example, an i192 multiply will assert in codegen when the
    second argument is a constant and the constant got hoisted. This fix
    changes the cost model to never hoist constants for types larger than
    i128. Once the codegen issues have been resolved, the cost model can
    be updated to allow larger types as well.
    This is related to <rdar://problem/16954938>.
    llvm-svn: 209162
* Check the alwaysinline attribute on the call as well as on the caller (Peter Collingbourne, 2014-05-19; 1 file, -0/+11)
    Differential Revision: http://reviews.llvm.org/D3815
    llvm-svn: 209150
* Flip on vectorization of bswap intrinsics (Benjamin Kramer, 2014-05-19; 1 file, -0/+44)
    The cost model conservatively assumes that it will always get
    scalarized, and that's about as good as we can get with the generic
    TTI; reasoning about whether a shuffle with an efficient lowering is
    available is hard. We can override that conservative estimate for some
    targets in the future.
    llvm-svn: 209125
* Added instcombine for 'MIN(MIN(A, 97), 23)' and 'MAX(MAX(A, 23), 97)' (Dinesh Dwivedi, 2014-05-19; 1 file, -0/+52)
    This removes a TODO added in r208849 [http://reviews.llvm.org/D3629]:
        MIN(MIN(A, 97), 23) -> MIN(A, 23)
        MAX(MAX(A, 23), 97) -> MAX(A, 97)
    Differential Revision: http://reviews.llvm.org/D3785
    llvm-svn: 209110
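    In IR these MIN/MAX patterns appear as icmp+select pairs; a
    hypothetical sketch of the folded form:

        define i32 @nested_min(i32 %a) {
          %c1 = icmp slt i32 %a, 97
          %min1 = select i1 %c1, i32 %a, i32 97    ; MIN(A, 97)
          %c2 = icmp slt i32 %min1, 23
          %min2 = select i1 %c2, i32 %min1, i32 23 ; MIN(MIN(A, 97), 23)
          ret i32 %min2                            ; folds to MIN(A, 23)
        }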
* Revert r209049 and r209065, "Add support for combining GEPs across PHI nodes" (NAKAMURA Takumi, 2014-05-17; 1 file, -56/+0)
    It broke clang selfhosting even after r209065.
    llvm-svn: 209067
* Add support for combining GEPs across PHI nodes (Louis Gerbarg, 2014-05-16; 1 file, -0/+56)
    Currently LLVM will generally merge GEPs. This allows backends to use
    more complex addressing modes. In some cases this is not happening
    because there is a PHI between the two GEPs:

        GEP1--\
              |-->PHI1-->GEP3
        GEP2--/

    This patch checks to see if GEP1 and GEP2 are similar enough that they
    can be cloned (GEP12) in GEP3's BB, allowing GEP->GEP merging (GEP123):

        GEP1--\                    --\                         --\
              |-->PHI1-->GEP3  ==>    |-->PHI2->GEP12->GEP3 ==>   |-->PHI2->GEP123
        GEP2--/                    --/                         --/

    This also breaks certain use chains that were preventing GEP->GEP
    merges that the existing instcombine would otherwise perform. Tests
    included.
    rdar://15547484
    llvm-svn: 209049
* Add comdat key field to llvm.global_ctors and llvm.global_dtors (Reid Kleckner, 2014-05-16; 1 file, -2/+17)
    This allows us to put dynamic initializers for weak data into the same
    comdat group as the data being initialized. This is necessary for MSVC
    ABI compatibility. Once we have comdats for guard variables, we can
    use the combination to help GlobalOpt fire more often for weak data
    with guarded initialization on other platforms.
    Reviewers: nlewycky
    Differential Revision: http://reviews.llvm.org/D3499
    llvm-svn: 209015
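    A hypothetical sketch (global and function names invented) of the
    extended llvm.global_ctors format, where the new trailing field
    associates the initializer with the data it initializes:

        @v = weak_odr global i32 0

        define internal void @init_v() {
          store i32 42, i32* @v
          ret void
        }

        ; the third (i8*) field is the comdat key; null means "no key"
        @llvm.global_ctors = appending global [1 x { i32, void ()*, i8* }]
            [{ i32, void ()*, i8* } { i32 65535, void ()* @init_v,
                                      i8* bitcast (i32* @v to i8*) }]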
* Fix most of PR10367 (Rafael Espindola, 2014-05-16; 5 files, -21/+21)
    This patch changes the design of GlobalAlias so that it doesn't take a
    ConstantExpr anymore. It now points directly to a GlobalObject, but
    its type is independent of the aliasee type.

    To avoid changing all alias-related tests in this patch, I kept the
    common syntax

        @foo = alias i32* @bar

    to mean the same as now. The cases that used to use a cast now use the
    more general syntax

        @foo = alias i16, i32* @bar

    Note that GlobalAlias now behaves a bit more like GlobalVariable. We
    know that its type is always a pointer, so we omit the '*'.

    For the bitcode, a nice surprise is that we were writing both
    identical types already, so the format change is minimal. Auto-upgrade
    is handled by looking through the casts; no new fields are needed for
    now. New bitcode will simply have different types for Alias and
    Aliasee.

    One last interesting point in the patch is that replaceAllUsesWith
    becomes smart enough to avoid putting a ConstantExpr in the aliasee.
    This seems better than checking and updating every caller.

    A followup patch will delete getAliasedGlobal now that it is
    redundant. Another patch will add support for an explicit offset.
    llvm-svn: 209007
* InstSimplify: Improve handling of ashr/lshr (David Majnemer, 2014-05-16; 1 file, -0/+40)
    Summary: Analyze the range of values produced by 'ashr/lshr cst, %V'
    when it is being used in an icmp.
    Reviewers: nicholas
    Subscribers: llvm-commits
    Differential Revision: http://reviews.llvm.org/D3774
    llvm-svn: 209000
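    A minimal hypothetical example: 'lshr 15, %v' can only produce values
    in [0, 15], so the comparison below simplifies to false:

        define i1 @shr_range(i32 %v) {
          %s = lshr i32 15, %v
          %c = icmp ugt i32 %s, 15  ; never true
          ret i1 %c
        }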
* InstSimplify: Optimize using dividend in sdiv (David Majnemer, 2014-05-16; 1 file, -0/+9)
    Summary: The dividend in an sdiv tells us the largest and smallest
    possible results. Use this fact to optimize comparisons against an
    sdiv with a constant dividend.
    Reviewers: nicholas
    Subscribers: llvm-commits
    Differential Revision: http://reviews.llvm.org/D3795
    llvm-svn: 208999
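    A minimal hypothetical example: with a dividend of 11, the result of
    the sdiv lies in [-11, 11], so the comparison simplifies to false:

        define i1 @div_range(i32 %x) {
          %d = sdiv i32 11, %x
          %c = icmp sgt i32 %d, 11  ; never true
          ret i1 %c
        }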
* Revert "Implement global merge optimization for global variables."Rafael Espindola2014-05-166-88/+39
| | | | | | | | | | | | This reverts commit r208934. The patch depends on aliases to GEPs with non zero offsets. That is not supported and fairly broken. The good news is that GlobalAlias is being redesigned and will have support for offsets, so this patch should be a nice match for it. llvm-svn: 208978
* Implement global merge optimization for global variables (Jiangning Liu, 2014-05-15; 6 files, -39/+88)
    This commit implements two command line switches,
    -global-merge-on-external and -global-merge-aligned. Both default to
    false, so this optimization is disabled by default for all targets.
    For ARM64, some back-end behaviors need to be tuned to get this
    optimization further enabled.
    llvm-svn: 208934
* Don't insert lifetime.end markers between a musttail call and ret (Reid Kleckner, 2014-05-15; 1 file, -0/+36)
    The allocas going out of scope are immediately killed by the return
    instruction.
    This is a resend of r208912, which was committed accidentally.
    Reviewers: chandlerc
    Differential Revision: http://reviews.llvm.org/D3792
    llvm-svn: 208920
* Revert "Don't insert lifetime.end markers between a musttail call and ret"Reid Kleckner2014-05-151-36/+0
| | | | | | | | This reverts commit r208912. It was committed accidentally without review. llvm-svn: 208914
* Don't insert lifetime.end markers between a musttail call and ret (Reid Kleckner, 2014-05-15; 1 file, -0/+36)
    The allocas going out of scope are immediately killed by the return
    instruction.
    Reviewers: chandlerc
    Differential Revision: http://reviews.llvm.org/D3630
    llvm-svn: 208912
* Teach the inliner how to preserve musttail invariants (Reid Kleckner, 2014-05-15; 1 file, -9/+140)
    The interesting case is what happens when you inline a musttail call
    through a musttail call site. In this case, we can't break perfect
    forwarding or allow any stack growth.

    Instead of merging control flow from the inlined return instruction
    after a musttail call into the body of the caller, leave the inlined
    return instruction in the caller so that the musttail call stays in
    the tail position.

    More work is required in http://reviews.llvm.org/D3630 to handle the
    case where the inlined function has dynamic allocas or byval
    arguments.

    Reviewers: chandlerc
    Differential Revision: http://reviews.llvm.org/D3491
    llvm-svn: 208910
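    A hypothetical sketch (function names invented) of the interesting
    case, a musttail call inlined through a musttail call site:

        define i32 @inner(i32 %x) {
          %r = musttail call i32 @sink(i32 %x)
          ret i32 %r
        }

        define i32 @outer(i32 %x) {
          ; when @inner is inlined here, the inlined 'musttail call' to
          ; @sink must stay immediately before a ret, so the inlined
          ; return is kept instead of being merged into new control flow
          %r = musttail call i32 @inner(i32 %x)
          ret i32 %r
        }

        declare i32 @sink(i32)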
* Teach the constant folder to look through bitcast constant expressions when constant folding a load of a constant (Chandler Carruth, 2014-05-15; 1 file, -0/+34)
    Previously, we only handled bitcasts by trying to find a totally
    generic byte representation of the constant and use that. Now, we look
    through the bitcast to see what constant we might fold the load into,
    and then try to form a constant expression cast of the found value
    that would be equivalent to loading the value.

    You might wonder why on earth this actually matters. Well, it turns
    out that the Itanium ABI causes us to create a single array for a
    vtable where the first elements are virtual base offsets, followed by
    the virtual function pointers. Because the array is homogeneous, the
    element type is consistently i8* and we inttoptr the virtual base
    offsets into the initial elements. Then constructors bitcast these
    pointers to i64 pointers prior to loading them. Boom, no more constant
    folding of virtual base offsets.

    This is the first fix to LLVM to address the *insane* performance Eric
    Niebler discovered with Clang on his range comprehensions [1]. There
    is more to come, though; this doesn't *really* fix the problem fully.
    [1]: http://ericniebler.com/2014/04/27/range-comprehensions/
    llvm-svn: 208856
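    A simplified, hypothetical version of the vtable pattern described
    above (real vtables are larger; datalayout omitted):

        ; a homogeneous i8* array whose first element is an inttoptr'd
        ; virtual base offset
        @vtable = constant [2 x i8*] [i8* inttoptr (i64 16 to i8*), i8* null]

        define i64 @vbase_offset() {
          ; with this change, the load through the bitcast constant
          ; expression can be folded directly to i64 16
          %p = bitcast [2 x i8*]* @vtable to i64*
          %v = load i64* %p
          ret i64 %v
        }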
* Reverting r208848, reason: build failure: sanitizer-x86_64-linux-bootstrap/builds/3399 (Dinesh Dwivedi, 2014-05-15; 1 file, -54/+0)
    llvm-svn: 208852
* Added instcombine for 'MIN(MIN(A, 27), 93)' and 'MAX(MAX(A, 93), 27)' (Dinesh Dwivedi, 2014-05-15; 1 file, -0/+48)
        MIN(MIN(A, 23), 97) -> MIN(A, 23)
        MAX(MAX(A, 97), 23) -> MAX(A, 97)
    Differential Revision: http://reviews.llvm.org/D3629
    llvm-svn: 208849
* Added instcombine transforms for single bit tests from Chris's note (Dinesh Dwivedi, 2014-05-15; 1 file, -0/+54)
        if ((x & C) == 0) x |= C     becomes  x |= C
        if ((x & C) != 0) x ^= C     becomes  x &= ~C
        if ((x & C) == 0) x ^= C     becomes  x |= C
        if ((x & C) != 0) x &= ~C    becomes  x &= ~C
        if ((x & C) == 0) x &= ~C    becomes  nothing
    Z3 verification code for the above transforms:
    http://rise4fun.com/Z3/Pmsh
    Differential Revision: http://reviews.llvm.org/D3717
    llvm-svn: 208848
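    A hypothetical IR sketch of the first row with C = 8: the select
    always produces x with bit 3 set, so it simplifies to a plain or:

        define i32 @set_if_clear(i32 %x) {
          ; if ((x & 8) == 0) x |= 8
          %and = and i32 %x, 8
          %cmp = icmp eq i32 %and, 0
          %or = or i32 %x, 8
          %r = select i1 %cmp, i32 %or, i32 %x
          ret i32 %r  ; becomes: or i32 %x, 8
        }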
* InstCombine: Optimize -x s< cst (David Majnemer, 2014-05-15; 1 file, -0/+9)
    Summary: This gets rid of a sub instruction by moving the negation to
    the constant when valid.
    Reviewers: nicholas
    Subscribers: llvm-commits
    Differential Revision: http://reviews.llvm.org/D3773
    llvm-svn: 208827
* InstSimplify: Optimize signed icmp of -(zext V) (David Majnemer, 2014-05-14; 1 file, -0/+60)
    Summary: We know that -(zext V) will always be <= zero; simplify
    signed icmps that have these.
    Uncovered using http://www.cs.utah.edu/~regehr/souper/
    Reviewers: nicholas
    Subscribers: llvm-commits
    Differential Revision: http://reviews.llvm.org/D3754
    llvm-svn: 208809
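    A minimal hypothetical example: the negated zext lies in [-255, 0]
    here, so the signed comparison simplifies to false:

        define i1 @neg_zext(i8 %v) {
          %z = zext i8 %v to i32
          %n = sub i32 0, %z
          %c = icmp sgt i32 %n, 0  ; never true
          ret i1 %c
        }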
* Fix the case when reordering shuffle and binop produces a constant (Serge Pavlov, 2014-05-14; 1 file, -0/+11)
    This resolves PR19737.
    llvm-svn: 208762
* Optimize integral reciprocal (udiv 1, x and sdiv 1, x) to not use division (Nick Lewycky, 2014-05-14; 1 file, -0/+19)
    This fires exactly once in a clang bootstrap, but covers a few
    different results from http://www.cs.utah.edu/~regehr/souper/
    llvm-svn: 208750
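    A hypothetical sketch of the unsigned case: 'udiv 1, %x' is 1 only
    when %x is 1, so it can be rewritten without a division:

        define i32 @recip_u(i32 %x) {
          %r = udiv i32 1, %x
          ret i32 %r
          ; can become:
          ;   %c = icmp eq i32 %x, 1
          ;   %r = zext i1 %c to i32
        }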
* Fix type of shuffle resulting from shuffle merge (Serge Pavlov, 2014-05-13; 1 file, -0/+8)
    This fix resolves PR19730.
    llvm-svn: 208666
* Convert test to FileCheck (Rafael Espindola, 2014-05-13; 1 file, -1/+8)
    llvm-svn: 208658
* Convert test to FileCheck (Rafael Espindola, 2014-05-13; 1 file, -2/+12)
    llvm-svn: 208644
* [Test] Trim unnecessary .c and .cpp from config.suffix in lit.local.cfg (Adam Nemet, 2014-05-12; 2 files, -2/+2)
    Tested by comparing 'make check VERBOSE=1' before and after to make
    sure no tests are missed. (VERBOSE=1 prints the list of tests.)
    Only one test :( remains where .cpp is required:
        tools/llvm-cov/range_based_for.cpp:
        // RUN: llvm-cov range_based_for.cpp | FileCheck %s --check-prefix=STDOUT
    The topic was discussed in this thread:
    http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140428/214905.html
    llvm-svn: 208621
* Fix type of shuffle obtained from reordering with binary operation (Serge Pavlov, 2014-05-12; 1 file, -0/+11)
    In the transformation
        BinOp(shuffle(v1, undef), shuffle(v2, undef))
            -> shuffle(BinOp(v1, v2), undef)
    the type of the undef argument must be the same as the type of the
    BinOp.
    llvm-svn: 208531
* Fix reordering of shuffles and binary operations (Serge Pavlov, 2014-05-12; 1 file, -0/+12)
    Do not apply the transformation
        BinOp(shuffle(v1), shuffle(v2)) -> shuffle(BinOp(v1, v2))
    if operands v1 and v2 are of different size. This change fixes
    PR19717, which was caused by r208488.
    llvm-svn: 208518
* Reorder shuffle and binary operation (Serge Pavlov, 2014-05-11; 2 files, -12/+125)
    This patch enables the transformations
        BinOp(shuffle(v1), shuffle(v2)) -> shuffle(BinOp(v1, v2))
        BinOp(shuffle(v1), const1)      -> shuffle(BinOp(v1, const2))
    which eliminate extra shuffles in some cases.
    Differential Revision: http://reviews.llvm.org/D3525
    llvm-svn: 208488
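    A hypothetical sketch of the first transformation: both operands are
    permuted by the same mask, so the add can be done first and shuffled
    once:

        define <4 x i32> @reorder(<4 x i32> %v1, <4 x i32> %v2) {
          %s1 = shufflevector <4 x i32> %v1, <4 x i32> undef,
                              <4 x i32> <i32 3, i32 2, i32 1, i32 0>
          %s2 = shufflevector <4 x i32> %v2, <4 x i32> undef,
                              <4 x i32> <i32 3, i32 2, i32 1, i32 0>
          %r = add <4 x i32> %s1, %s2
          ret <4 x i32> %r
          ; becomes: %a = add <4 x i32> %v1, %v2, followed by a single
          ; shufflevector of %a with the same mask
        }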
* SLPVectorizer: When sorting by domination for CSE, don't assert on unreachable code (Benjamin Kramer, 2014-05-09; 1 file, -0/+30)
    There is no total ordering if the CFG is disconnected. We don't care
    whether we catch all CSE opportunities in dead code either, so just
    ignore unreachable blocks in the assert.
    PR19646
    llvm-svn: 208461
* Add ExtractValue instruction to SimplifyCFG's ComputeSpeculationCost (Louis Gerbarg, 2014-05-09; 1 file, -0/+22)
    Since ExtractValue is not included in ComputeSpeculationCost, CFGs
    containing ExtractValueInsts cannot be simplified. In particular, this
    interacts with InstCombineCompare's tendency to insert
    add.with.overflow intrinsics for certain idiomatic math operations,
    preventing optimization. This patch adds ExtractValue to
    ComputeSpeculationCost. Test case included.
    rdar://14853450
    llvm-svn: 208434
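    A hypothetical sketch (names invented) of the kind of CFG involved:
    flattening the conditional block requires ComputeSpeculationCost to be
    able to price the extractvalue:

        declare { i32, i1 } @llvm.sadd.with.overflow.i32(i32, i32)

        define i32 @maybe_add(i32 %a, i32 %b, i1 %cond) {
        entry:
          br i1 %cond, label %then, label %end
        then:
          %s = call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
          %v = extractvalue { i32, i1 } %s, 0
          br label %end
        end:
          %r = phi i32 [ %v, %then ], [ 0, %entry ]
          ret i32 %r
        }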
* [InstCombine] Some cleanup in optimization of redundant insertvalue instructions (Michael Zolotukhin, 2014-05-08; 1 file, -0/+11)
    And one more test added.
    llvm-svn: 208355
* Revert test commit. Removed blank line. (Dario Domizioli, 2014-05-08; 1 file, -1/+0)
    llvm-svn: 208308
* Test commit. Added blank line. (Dario Domizioli, 2014-05-08; 1 file, -0/+1)
    llvm-svn: 208298
* Move late partial-unrolling thresholds into the processor definitions (Hal Finkel, 2014-05-08; 2 files, -12/+12)
    The old method used by X86TTI to determine partial-unrolling
    thresholds was messy (because it worked by testing target features),
    and also would not correctly identify the target CPU if certain target
    features were disabled. After some discussions on IRC with Chandler et
    al., it was decided that the processor scheduling models were the
    right containers for this information (because it is often tied to
    special uop dispatch-buffer sizes).

    This does represent a small functionality change:
    - For generic x86-64 (which uses the SB model and, thus, will get some
      unrolling).
    - For AMD cores (because they still currently use the SB scheduling
      model).
    - For Haswell (based on benchmarking by Louis Gerbarg, it was decided
      to bump the default threshold to 50; we're working on a test case
      for this).
    Otherwise, nothing has changed for any other targets. The logic,
    however, has been moved into BasicTTI, so other targets may now also
    opt in to this functionality simply by setting LoopMicroOpBufferSize
    in their processor model definitions.
    llvm-svn: 208289
* IR: Don't allow non-default visibility on local linkage (Duncan P. N. Exon Smith, 2014-05-07; 5 files, -20/+20)
    Visibilities of 'hidden' and 'protected' are meaningless for symbols
    with local linkage.
    - Change the assembler to reject non-default visibility on symbols
      with local linkage.
    - Change the bitcode reader to auto-upgrade 'hidden' and 'protected'
      to 'default' when the linkage is local.
    - Update LangRef.
    <rdar://problem/16141113>
    llvm-svn: 208263
* [InstCombine] Add optimization of redundant insertvalue instructions (Michael Zolotukhin, 2014-05-07; 1 file, -0/+25)
    rdar://problem/11861387
    llvm-svn: 208214
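    A minimal hypothetical example of a redundant insertvalue: the first
    insert is dead because the second writes the same index:

        define { i32, i32 } @redundant({ i32, i32 } %agg, i32 %a, i32 %b) {
          %t = insertvalue { i32, i32 } %agg, i32 %a, 0
          %r = insertvalue { i32, i32 } %t, i32 %b, 0
          ret { i32, i32 } %r  ; simplifies to a single insertvalue of %b
        }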
* Improve 'tail' call marking in TRE (Nick Lewycky, 2014-05-05; 1 file, -0/+23)
    A bootstrap of clang goes from 375k calls marked tail in the IR to
    470k; however, this improvement does not carry over into an
    improvement of the call/jmp ratio on x86. The most common pattern is a
    tail call + br to a block with nothing but a 'ret'. The number of tail
    call to loop conversions remains the same (1618 by my count).

    The new algorithm does a local scan over the use-def chains to
    identify local "alloca-derived" values, as well as points where the
    alloca could escape. Then, a visit over the CFG marks blocks as being
    before or after the allocas have escaped, and annotates the calls
    accordingly.
    llvm-svn: 208017
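    A minimal hypothetical illustration of the alloca-derived analysis:
    %local never escapes into the call, so the call can be marked 'tail':

        define i32 @caller() {
          %local = alloca i32
          store i32 1, i32* %local
          %v = load i32* %local
          ; @g is never given a pointer derived from %local, so this
          ; call cannot read or write the caller's frame
          %r = tail call i32 @g(i32 %v)
          ret i32 %r
        }

        declare i32 @g(i32)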
* Move test from r207969 to another folder and rename it (Michael Zolotukhin, 2014-05-05; 1 file, -25/+0)
    llvm-svn: 207984