path: root/llvm/test/Transforms
Commit message | Author | Age | Files | Lines
...
* [InstCombine] Recognize and simplify three way comparison idioms | Anna Thomas | 2017-06-23 | 1 | -0/+395

  Summary:
  Many languages have a three way comparison idiom where comparing two values
  produces not a boolean, but a tri-state value. Typical values (e.g. as used
  in the lcmp/fcmp bytecodes from Java) are -1 for less than, 0 for equality,
  and +1 for greater than.

  We actually do a great job already of converting three way comparisons into
  binary comparisons when the result produced has a single use. Unfortunately,
  such values can have more than one use, and in that case our existing
  optimizations break down.

  The patch adds a peephole which converts a three-way compare + test idiom
  into a binary comparison on the original inputs. It focuses on replacing the
  test on the result of the three way compare and does nothing about removing
  the three way compare itself. That's left to other optimizations (which do
  actually kick in commonly).

  We currently recognize one idiom on signed integer compare. In the future,
  we plan to recognize and simplify other comparison idioms on other
  signed/unsigned datatypes such as floats, vectors etc.

  This is a resurrection of Philip Reames' original patch:
  https://reviews.llvm.org/D19452

  Reviewers: majnemer, apilipenko, reames, sanjoy, mkazantsev
  Reviewed by: mkazantsev
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D34278
  llvm-svn: 306100
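  (Illustrative C sketch, not part of the commit: the function names are
  invented and the committed tests exercise the equivalent LLVM IR patterns.
  A three-way compare whose result is then tested against zero can be answered
  directly from the original operands.)

    /* Three-way comparison producing -1/0/+1, as in Java's lcmp. */
    static int cmp3(long a, long b) {
        return (a < b) ? -1 : (a > b) ? 1 : 0;
    }

    int uses_three_way(long a, long b) {
        int c = cmp3(a, b);   /* c may have several uses */
        if (c > 0)            /* the test on the three-way result ...        */
            return 1;         /* ... simplifies to the binary compare a > b  */
        return c;             /* other uses keep the three-way compare alive */
    }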
* [LoopSimplify] Factor the logic to form dedicated exits into a utility. | Chandler Carruth | 2017-06-23 | 3 | -3/+0

  I want to use the same logic as LoopSimplify to form dedicated exits in
  another pass (SimpleLoopUnswitch) so I wanted to factor it out here.

  I also noticed that there is a pretty significantly more efficient way to
  implement this than the way the code in LoopSimplify worked. We don't need
  to actually retain the set of unique exit blocks, we can just rewrite them
  as we find them and use only a set to deduplicate.

  This did require changing one part of LoopSimplify to not re-use the unique
  set of exits, but it only used it to check that there was a single unique
  exit. That part of the code is about to walk the exiting blocks anyways, so
  it seemed better to rewrite it to use those exiting blocks to compute this
  property on-demand.

  I also had to ditch a statistic, but it doesn't seem terribly valuable.

  Differential Revision: https://reviews.llvm.org/D34049
  llvm-svn: 306081
* [LVI] Teach LVI to reason about ORs of icmps similar to how it reasons about ANDs of icmps | Craig Topper | 2017-06-23 | 1 | -0/+95

  Summary:
  LVI can reason about an AND of icmps on the true dest of a branch. I believe
  we can do similar for the false dest of ORs. This allows us to get the same
  answer for the demorganed versions of some of the AND test cases as you can
  see.

  Reviewers: anna, reames
  Reviewed By: reames
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D34431
  llvm-svn: 306076
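  (Illustrative C sketch, not taken from the commit's tests: on the false edge
  of a branch over an OR of comparisons, both comparisons are known false,
  which is the De Morgan dual of an AND of comparisons on the true edge.)

    void clamp_example(int a) {
        if (a < 0 || a > 10) {
            /* true edge: little is known about a here */
        } else {
            /* false edge: both (a < 0) and (a > 10) are false,
               so LVI can conclude 0 <= a && a <= 10 */
        }
    }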
* Restrict the definition of loop preheader to avoid EH blocks | Andrew Kaylor | 2017-06-22 | 1 | -0/+41

  Differential Revision: https://reviews.llvm.org/D34487
  llvm-svn: 306070
* Define behavior of "stack-probe-size" attribute when inlining. | whitequark | 2017-06-22 | 1 | -0/+29

  Also document the attribute, since "probe-stack" already is.

  Reviewed By: majnemer
  Differential Revision: https://reviews.llvm.org/D34528
  llvm-svn: 306069
* Supported lowerInterleavedStore() in X86InterleavedAccess. | Farhana Aleen | 2017-06-22 | 1 | -9/+64

  Reviewers: RKSimon, DavidKreitzer
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D32658
  llvm-svn: 306068
* Remove the LoadCombine pass. It was never enabled and is unsupported. | Eric Christopher | 2017-06-22 | 5 | -355/+0

  Based on discussions with the author on mailing lists.
  llvm-svn: 306067
* [LoopDeletion] Update exits correctly when multiple duplicate edges from an exiting block | Anna Thomas | 2017-06-22 | 1 | -0/+76

  Summary:
  Currently, we incorrectly update exit blocks of loops when there are
  multiple edges from a single exiting block to the exit block. This can
  happen when we have switches as the terminator of the exiting blocks. The
  fix here is to correctly update the phi nodes in the exit block, and remove
  all incoming values *except* for one which is from the preheader.

  Note: Currently, this error can manifest only while deleting non-executed
  loops. However, it is possible to trigger this error in invariant loops,
  once we enhance the logic around the exit conditions for the loop check.

  Reviewers: chandlerc, dberlin, sanjoy, efriedma
  Reviewed by: efriedma
  Subscribers: mzolotukhin, llvm-commits
  Differential Revision: https://reviews.llvm.org/D34516
  llvm-svn: 306048
* [InstCombine] Teach foldSelectICmpAndOr to recognize (select (icmp slt (trunc (X)), 0), Y, (or Y, C2)) | Craig Topper | 2017-06-22 | 1 | -10/+9

  Summary:
  InstCombine likes to turn (icmp eq (and X, C1), 0) into
  (icmp slt (trunc (X)), 0) sometimes. This breaks foldSelectICmpAndOr's
  ability to recognize
  (select (icmp eq (and X, C1), 0), Y, (or Y, C2))->(or (shl (and X, C1), C3), y).
  This patch tries to recover this.

  I had to flip around some of the early out checks so that I could create a
  new And instruction during the compare processing without it possibly never
  getting used.

  Reviewers: spatel, majnemer, davide
  Reviewed By: spatel
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D34184
  llvm-svn: 306029
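  (Hedged C-level sketch of the underlying select fold; the constants 4 and 8
  are chosen only for illustration. When the OR'ed constant is the tested mask
  shifted by a fixed amount, the select collapses to a shift of the masked
  bit.)

    unsigned select_form(unsigned x, unsigned y) {
        /* (select (icmp eq (and X, 4), 0), Y, (or Y, 8)) */
        return ((x & 4) == 0) ? y : (y | 8);
    }

    unsigned folded_form(unsigned x, unsigned y) {
        /* (or (shl (and X, 4), 1), Y): bit 2 of x moved up to bit 3 */
        return ((x & 4) << 1) | y;
    }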
* [InstCombine] Add one use checks to or/and->xnor folding | Craig Topper | 2017-06-22 | 1 | -4/+2

  If the components of the and/or had multiple uses, this transform created an
  additional instruction. This patch makes sure we remove one of the
  components.

  Differential Revision: https://reviews.llvm.org/D34498
  llvm-svn: 306027
* [InstCombine] reverse bitcast + bitwise-logic canonicalization (PR33138) | Sanjay Patel | 2017-06-22 | 4 | -39/+30

  There are 2 parts to this patch made simultaneously to avoid a regression.

  We're reversing the canonicalization that moves bitwise vector ops before
  bitcasts. We're moving bitwise vector ops *after* bitcasts instead. That's
  the 1st and 3rd hunks of the patch. The motivation is that there's only one
  fold that currently depends on the existing canonicalization (see next), but
  there are many folds that would automatically benefit from the new
  canonicalization. PR33138 ( https://bugs.llvm.org/show_bug.cgi?id=33138 )
  shows why/how we have these patterns in IR.

  There's an or(and,andn) pattern that requires an adjustment in order to
  continue matching to 'select' because the bitcast changes position. This
  match is unfortunately complicated because it requires 4 logic ops with
  optional bitcast and sext ops.

  Test diffs:
  1. The bitcast.ll and bitcast-bigendian.ll changes show the most basic
     difference - bitcast comes before logic.
  2. There are also tests with no diffs in bitcast.ll that verify that we're
     still doing folds that were enabled by the previous canonicalization.
  3. icmp-xor-signbit.ll shows the payoff. We don't need to adjust existing
     icmp patterns to look through bitcasts.
  4. logical-select.ll contains several tests for the or(and,andn) --> select
     fold to verify that we are still handling those cases. The lone diff
     shows the movement of the bitcast from the new canonicalization rule.

  Differential Revision: https://reviews.llvm.org/D33517
  llvm-svn: 306011
* Revert "Enable vectorizer-maximize-bandwidth by default." | Diana Picus | 2017-06-22 | 11 | -76/+67

  This reverts commit r305960 because it broke self-hosting on AArch64.
  llvm-svn: 305990
* [InstCombine] Add test cases to demonstrate that and->xnor and or->xnor folding can create more instructions than it removed when there are multiple uses. NFC | Craig Topper | 2017-06-22 | 1 | -0/+50

  llvm-svn: 305985
* Enable vectorizer-maximize-bandwidth by default. | Dehao Chen | 2017-06-21 | 11 | -67/+76

  Summary:
  vectorizer-maximize-bandwidth is generally useful in terms of performance.
  I've tested the impact of changing this to default on speccpu benchmarks on
  sandybridge machines. The result shows non-negative impact:

    spec/2006/fp/C++/444.namd          26.84   -0.31%
    spec/2006/fp/C++/447.dealII        46.19   +0.89%
    spec/2006/fp/C++/450.soplex        42.92   -0.44%
    spec/2006/fp/C++/453.povray        38.57   -2.25%
    spec/2006/fp/C/433.milc            24.54   -0.76%
    spec/2006/fp/C/470.lbm             41.08   +0.26%
    spec/2006/fp/C/482.sphinx3         47.58   -0.99%
    spec/2006/int/C++/471.omnetpp      22.06   +1.87%
    spec/2006/int/C++/473.astar        22.65   -0.12%
    spec/2006/int/C++/483.xalancbmk    33.69   +4.97%
    spec/2006/int/C/400.perlbench      33.43   +1.70%
    spec/2006/int/C/401.bzip2          23.02   -0.19%
    spec/2006/int/C/403.gcc            32.57   -0.43%
    spec/2006/int/C/429.mcf            40.35   +0.27%
    spec/2006/int/C/445.gobmk          26.96   +0.06%
    spec/2006/int/C/456.hmmer          24.4    +0.19%
    spec/2006/int/C/458.sjeng          27.91   -0.08%
    spec/2006/int/C/462.libquantum     57.47   -0.20%
    spec/2006/int/C/464.h264ref        46.52   +1.35%
    geometric mean                             +0.29%

  The regression on 453.povray seems real, but is due to secondary effects as
  all hot functions are bit-identical with and without the flag.

  I started this patch to consult upstream opinions on this. It will be
  greatly appreciated if the community can help test the performance impact of
  this change on other architectures so that we can decide if this should be
  target-dependent.

  Reviewers: hfinkel, mkuper, davidxl, chandlerc
  Reviewed By: chandlerc
  Subscribers: rengolin, sanjoy, javed.absar, bjope, dorit, magabari, RKSimon,
    llvm-commits, mzolotukhin
  Differential Revision: https://reviews.llvm.org/D33341
  llvm-svn: 305960
* Add a "probe-stack" attribute | whitequark | 2017-06-21 | 1 | -0/+20

  This attribute is used to ensure the guard page is triggered on stack
  overflow. Stack frames larger than the guard page size will generate a call
  to __probestack to touch each page so the guard page won't be skipped.

  Reviewed By: majnemer
  Differential Revision: https://reviews.llvm.org/D34386
  llvm-svn: 305939
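  (Illustrative C sketch of the situation the attribute guards against; the
  64 KiB frame and 4 KiB page size are assumptions made for the example.)

    void large_frame(void) {
        /* With 4 KiB pages and a single guard page, allocating this frame
           could move the stack pointer past the guard page in one step.
           Stack probing touches each intervening page so the guard page
           still faults on overflow. */
        char buf[64 * 1024];
        buf[0] = 1;
        buf[sizeof(buf) - 1] = 2;
    }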
* Do not inline recursive direct calls in sample loader pass. | Dehao Chen | 2017-06-21 | 2 | -0/+19

  Summary:
  r305009 disables recursive inlining for indirect calls in sample loader
  pass. The same logic applies to direct recursive calls.

  Reviewers: iteratee, davidxl
  Reviewed By: iteratee
  Subscribers: sanjoy, llvm-commits, eraman
  Differential Revision: https://reviews.llvm.org/D34456
  llvm-svn: 305934
* [x86] set the datalayout to match the RUN line triple; NFC | Sanjay Patel | 2017-06-21 | 1 | -4/+2

  I don't think there's any visible difference from having the wrong layout
  for the 32-bit case at this point, but that could change in the future.
  llvm-svn: 305931
* [InstCombine] Add range metadata to cttz/ctlz/ctpop intrinsic calls based on known bits | Craig Topper | 2017-06-21 | 2 | -18/+62

  Summary:
  I noticed that passing known bits across these intrinsics isn't great at
  capturing the information we really know. Turning known bits of the input
  into known bits of a count output isn't able to convey a lot of what we
  really know.

  This patch adds range metadata to these intrinsics based on the known bits.

  Currently the patch punts if we already have range metadata present.

  Reviewers: spatel, RKSimon, davide, majnemer
  Reviewed By: RKSimon
  Subscribers: sanjoy, hfinkel, llvm-commits
  Differential Revision: https://reviews.llvm.org/D32582
  llvm-svn: 305927
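  (Hedged C illustration of the kind of range the known bits imply; the
  constants are invented, and the patch itself records the fact as !range
  metadata on the intrinsic call in IR.)

    unsigned cttz_range_demo(unsigned x) {
        /* Bit 0 is forced to zero and bit 3 is forced to one, so the count
           of trailing zeros is always in [1, 3], i.e. the range [1, 4). */
        unsigned y = (x << 1) | 0x8u;
        return (unsigned)__builtin_ctz(y);
    }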
* [InstCombine] Don't let folding (select (icmp eq (and X, C1), 0), Y, (or Y, C2)) create more instructions than it removes | Craig Topper | 2017-06-21 | 1 | -36/+23

  Summary:
  Previously this folding had no checks to see if it was going to result in
  fewer instructions. This was pointed out during the review of D34184.

  This patch adds code to count how many instructions it's going to create vs
  how many it's going to remove so we can make a proper decision.

  Reviewers: spatel, majnemer
  Reviewed By: spatel
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D34437
  llvm-svn: 305926
* [Reassociate] Support xor reassociating for splat vectors | Craig Topper | 2017-06-21 | 1 | -0/+101

  Summary: This patch adds support for xors of splat vectors.

  Reviewers: mcrosier
  Reviewed By: mcrosier
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D34354
  llvm-svn: 305925
* [SCEV] Make MulOpsInlineThreshold lower to avoid excessive compilation time | Max Kazantsev | 2017-06-21 | 1 | -0/+87

  MulOpsInlineThreshold option of SCEV is defaulted to 1000, which is
  inadequately high. When constructing SCEVs of expressions like:

    x1 = a * a
    x2 = x1 * x1
    x3 = x2 * x2
    ...

  we actually have huge SCEVs with the max allowed amount of operands inlined.
  Such expressions are easy to get from unrolling of loops looking like

    x = a
    for (i = 0; i < n; i++)
      x = x * x

  or more tricky cases where big powers are involved. If some non-linear
  analysis tries to work with a SCEV that has 1000 operands, it may lead to
  excessively long compilation. The attached test does not pass within 1
  minute with the default threshold.

  This patch decreases its default value to 32, which looks much more
  reasonable if we use analyses with complexity O(N^2) or O(N^3) working with
  SCEV.

  Differential Revision: https://reviews.llvm.org/D34397
  llvm-svn: 305882
* AMDGPU: Allow vectorization of packed types | Matt Arsenault | 2017-06-20 | 3 | -70/+229

  llvm-svn: 305844
* [x86] enable CGP memcmp() expansion for 2/4/8 byte sizes | Sanjay Patel | 2017-06-20 | 1 | -18/+132

  There are a couple of potential improvements as seen in the IR and asm:
  1. We're unnecessarily extending to a larger type to compare values.
  2. The codegen for (select cond, 1, -1) could avoid a cmov.
     (or we could change the order of the compares, so we have a select with
     0 operand)
  llvm-svn: 305802
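  (Hedged C sketch of what expanding a small fixed-size memcmp means; the
  function names are invented. A 4-byte equality test reduces to two 4-byte
  loads and one integer compare; the three-way case additionally needs a byte
  swap on little-endian targets so the comparison is lexicographic.)

    #include <string.h>

    int eq4_call(const void *p, const void *q) {
        return memcmp(p, q, 4) == 0;    /* the library call ...            */
    }

    int eq4_expanded(const void *p, const void *q) {
        unsigned a, b;
        memcpy(&a, p, 4);               /* ... becomes two 4-byte loads    */
        memcpy(&b, q, 4);               /* (memcpy keeps the C portable)   */
        return a == b;                  /* and a single integer compare    */
    }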
* [InstCombine] fix code/test comments for r305792; NFC | Sanjay Patel | 2017-06-20 | 1 | -2/+2

  These diffs were in the last version of the patch in D33342, but I
  accidentally committed the previous rev.
  llvm-svn: 305793
* [InstCombine] try to canonicalize xor-of-icmps to and-of-icmps | Sanjay Patel | 2017-06-20 | 1 | -11/+7

  We have a large portfolio of folds for and-of-icmps and or-of-icmps in
  InstSimplify and InstCombine, but hardly anything for xor-of-icmps. Rather
  than trying to rethink and translate all of those folds, we can use the
  truth table definition of xor:

    X ^ Y --> (X | Y) & !(X & Y)

  ...to see if we can convert the xor to and/or and then use the existing
  folds.

  http://rise4fun.com/Alive/J9v
  Differential Revision: https://reviews.llvm.org/D33342
  llvm-svn: 305792
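  (A quick C check of the identity the fold relies on, for illustration only.)

    #include <assert.h>

    int main(void) {
        for (int x = 0; x <= 1; ++x)
            for (int y = 0; y <= 1; ++y)
                /* X ^ Y == (X | Y) & !(X & Y) for booleans */
                assert((x ^ y) == ((x | y) & !(x & y)));
        return 0;
    }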
* [PATCH] [PGO] Fixed cast operation in emIntrinsicVisitor::instrumentOneMemIntrinsic. | Ana Pazos | 2017-06-19 | 1 | -0/+14

  Reviewers: xur, efriedma, davidxl
  Reviewed By: davidxl
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D34293
  llvm-svn: 305737
* Improve profile-guided heuristics to use estimated trip count. | Taewook Oh | 2017-06-19 | 2 | -26/+91

  Summary:
  Existing heuristic uses the ratio between the function entry frequency and
  the loop invocation frequency to find cold loops. However, even if the loop
  executes frequently, if it has a small trip count per each invocation,
  vectorization is not beneficial. On the other hand, even if the loop
  invocation frequency is much smaller than the function invocation frequency,
  if the trip count is high it is still beneficial to vectorize the loop.

  This patch uses estimated trip count computed from the profile metadata as a
  primary metric to determine coldness of the loop. If the estimated trip
  count cannot be computed, it falls back to the original heuristics.

  Reviewers: Ayal, mssimpso, mkuper, danielcdh, wmi, tejohnson
  Reviewed By: tejohnson
  Subscribers: tejohnson, mzolotukhin, llvm-commits
  Differential Revision: https://reviews.llvm.org/D32451
  llvm-svn: 305729
* [InstCombine] Make sure AddReachableCodeToWorklist sets MadeIRChange | Bjorn Pettersson | 2017-06-19 | 2 | -0/+51

  Summary:
  Some optimizations in AddReachableCodeToWorklist did not update the
  MadeIRChange state. This could happen both when removing trivially dead
  instructions (DCE) and at constant folds.

  It is essential that changes to the IR are reported correctly, since for
  example InstCombinePass::run() will indicate that all analyses are preserved
  otherwise. And the CGPassManager determines if the CallGraph is up-to-date
  based on status from InstructionCombiningPass::runOnFunction().

  The new test case early_dce_clobbers_callgraph.ll is a reproducer for some
  asserts that started to trigger after changes in the inliner in r305245.
  With this patch the test case passes again.

  Reviewers: sanjoy, craig.topper, dblaikie
  Reviewed By: craig.topper
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D34346
  llvm-svn: 305725
* Revert r304824 "Fix PR23384 (part 3 of 3)" | Hans Wennborg | 2017-06-19 | 6 | -26/+22

  This seems to be interacting badly with ASan somehow, causing false reports
  of heap-buffer overflows: PR33514.

  > Summary:
  > The patch makes instruction count the highest priority for
  > LSR solution for X86 (previously registers had highest priority).
  >
  > Reviewers: qcolombet
  >
  > Differential Revision: http://reviews.llvm.org/D30562
  >
  > From: Evgeny Stupachenko <evstupac@gmail.com>

  llvm-svn: 305720
* [InstCombine] Cleanup some duplicated one use checks | Craig Topper | 2017-06-19 | 2 | -4/+4

  Summary:
  These 4 patterns have the same one use check repeated twice for each. Once
  without a cast and one with. But the cast has no effect on what method is
  called.

  For the OR case I believe it is always profitable regardless of the number
  of uses since we'll never increase the instruction count.

  For the AND case I believe it is profitable if the pair of xors has one use
  such that we'll get rid of it completely. Or if the C value is something
  freely invertible, in which case the not doesn't cost anything.

  Reviewers: spatel, majnemer
  Reviewed By: spatel
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D34308
  llvm-svn: 305705
* [Reassociate] Support some reassociation of vector xors | Craig Topper | 2017-06-19 | 1 | -4/+14

  Summary:
  Currently we don't try to do anything with vector xors. This patch adds
  support for removing duplicate pairs from a chain of vector xors as it's
  pretty easy to support. We still don't try to combine the xors with and/ors,
  but I might try that in a future patch.

  Reviewers: mcrosier, davide, resistor
  Reviewed By: mcrosier
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D34338
  llvm-svn: 305704
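  (For reference, the algebraic identity involved, shown with scalars for
  brevity; the pass applies the same duplicate-pair removal to vector operands
  in the IR.)

    unsigned xor_chain(unsigned a, unsigned b, unsigned c) {
        /* (a ^ b) ^ (a ^ c): the two a's cancel, leaving b ^ c */
        return (a ^ b) ^ (a ^ c);
    }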
* [TRE] Improve code motion in TRE, use AA to tell whether a load can be moved before a call that writes to memory. | Xin Tong | 2017-06-19 | 1 | -0/+27

  Summary: use AA to tell whether a load can be moved before a call that
  writes to memory.

  Reviewers: dberlin, davide, sanjoy, hfinkel
  Reviewed By: hfinkel
  Subscribers: hfinkel, llvm-commits
  Differential Revision: https://reviews.llvm.org/D34115
  llvm-svn: 305698
* [SCEV] Teach SCEVExpander to expand BinPow | Max Kazantsev | 2017-06-19 | 1 | -0/+264

  Current implementation of SCEVExpander demonstrates a very naive behavior
  when it deals with power calculation. For example, a SCEV for x^8 looks like

    (x * x * x * x * x * x * x * x)

  If we try to expand it, it generates a very straightforward sequence of
  muls, like:

    x2 = mul x, x
    x3 = mul x2, x
    x4 = mul x3, x
    ...
    x8 = mul x7, x

  This is a non-efficient way of doing that. A better way is to generate a
  sequence of binary power calculation. In this case the expanded calculation
  will look like:

    x2 = mul x, x
    x4 = mul x2, x2
    x8 = mul x4, x4

  In some cases the code size reduction for such SCEVs is dramatic. If we had
  a loop:

    x = a;
    for (int i = 0; i < 3; i++)
      x = x * x;

  and this loop has been fully unrolled, we have something like:

    x = a;
    x2 = x * x;
    x4 = x2 * x2;
    x8 = x4 * x4;

  The SCEV for x8 is the same as in the example above, and if we for some
  reason want to expand it, we will naively generate 7 multiplications instead
  of 3. The BinPow expansion algorithm here allows us to keep the code size
  reasonable.

  This patch teaches SCEV Expander to generate a sequence of BinPow
  multiplications if we have repeating arguments in SCEVMulExpressions.

  Differential Revision: https://reviews.llvm.org/D34025
  llvm-svn: 305663
* NewGVN: Fix PR 33461, caused by slightly overzealous verification. | Daniel Berlin | 2017-06-19 | 1 | -0/+36

  llvm-svn: 305657
* Add argmemonly attribute to strlen and wcslen, i.e. they only read memory (string) passed to them. | Xin Tong | 2017-06-18 | 2 | -3/+23

  Summary: This allows LICM to move strlen out of the loop in case its
  argument is not modified in the loop.

  Reviewers: hfinkel, davide, sanjoy, dberlin
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D34323
  llvm-svn: 305641
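  (Hedged C example of the optimization this enables; the function and
  variable names are invented. Once strlen is known to only read its argument,
  LICM can hoist the call out of a loop that never writes the string.)

    #include <string.h>

    unsigned long count_bounded(const char *s, unsigned long limit) {
        unsigned long n = 0;
        /* strlen(s) is loop-invariant here: s is not written in the loop,
           and strlen only reads the memory its argument points to, so the
           call can be hoisted to the loop preheader. */
        for (unsigned long i = 0; i < limit && i < strlen(s); ++i)
            ++n;
        return n;
    }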
* [SROA] Add support for non-integral pointers | Sanjoy Das | 2017-06-17 | 1 | -0/+46

  Summary: C.f. http://llvm.org/docs/LangRef.html#non-integral-pointer-type

  Reviewers: chandlerc, loladiro
  Reviewed By: loladiro
  Subscribers: reames, loladiro, mcrosier, llvm-commits
  Differential Revision: https://reviews.llvm.org/D32203
  llvm-svn: 305639
* [InstCombine] Make FPMathOperator working with ConstantExpression(s). | Davide Italiano | 2017-06-17 | 1 | -0/+15

  Fixes PR33453.

  Differential Revision: https://reviews.llvm.org/D34303
  llvm-svn: 305618
* Revert rL305578. There is still some buildbot failure to be fixed. | Wei Mi | 2017-06-16 | 3 | -135/+4

  llvm-svn: 305603
* [InstCombine] Set correct insertion point for selects generated while folding phis | Anna Thomas | 2017-06-16 | 1 | -0/+29

  Summary:
  When we fold vector constants that are operands of phi's that feed into
  select, we need to set the correct insertion point for the *new* selects
  that get generated. The correct insertion point is the incoming block for
  the phi.

  Such cases can occur with patch r298845, which fixed folding of vector
  constants, but the new selects could be inserted incorrectly (as the added
  test case shows).

  Reviewers: majnemer, spatel, sanjoy
  Reviewed by: spatel
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D34162
  llvm-svn: 305591
* Change YAML traits for vector<string> to flow_vector. | Evgeniy Stepanov | 2017-06-16 | 1 | -7/+2

  This is a workaround for an ODR conflict with the definition in
  AMDGPUCodeObjectMetadata.cpp.
  llvm-svn: 305584
* [GVN] Recommit the patch "Add phi-translate support in scalarpre". | Wei Mi | 2017-06-16 | 3 | -4/+135

  The recommit fixes two bugs: The first one is to use CurrentBlock instead of
  PREInstr's Parent as param of performScalarPREInsertion because the Parent
  of a clone instruction may be uninitialized. The second one is to stop PRE
  when the edge from CurrentBlock to its predecessor is a backedge and an
  operand of CurInst is defined inside of CurrentBlock. The same value defined
  inside of the loop in the last iteration can not be regarded as available.

  Right now scalarpre doesn't have phi-translate support, so it will miss some
  simple pre opportunities. Like the following testcase, current scalarpre
  cannot recognize that the last "a * b" is fully redundant because a and b
  used by the last "a * b" expr are both defined by phis.

    long a[100], b[100], g1, g2, g3;
    __attribute__((pure)) long goo();

    void foo(long a, long b, long c, long d) {
      g1 = a * b;
      if (__builtin_expect(g2 > 3, 0)) {
        a = c;
        b = d;
        g2 = a * b;
      }
      g3 = a * b; // fully redundant.
    }

  The patch adds phi-translate support in scalarpre. This is only a temporary
  solution before the newpre based on newgvn is available.

  Differential Revision: https://reviews.llvm.org/D32252
  llvm-svn: 305578
* [InstCombine] Add test cases to show missed opportunities due to overly conservative single use checks. NFC | Craig Topper | 2017-06-16 | 2 | -0/+77

  llvm-svn: 305562
* [Atomics] Rename and change prototype for atomic memcpy intrinsic | Daniel Neilson | 2017-06-16 | 3 | -31/+37

  Summary:
  Background: http://lists.llvm.org/pipermail/llvm-dev/2017-May/112779.html

  This change is to alter the prototype for the atomic memcpy intrinsic. The
  prototype itself is being changed to more closely resemble the semantics and
  parameters of the llvm.memcpy intrinsic -- to ease later combination of the
  llvm.memcpy and atomic memcpy intrinsics. Furthermore, the name of the
  atomic memcpy intrinsic is being changed to make it clear that it is not a
  generic atomic memcpy, but specifically a memcpy that is unordered atomic.

  Reviewers: reames, sanjoy, efriedma
  Reviewed By: reames
  Subscribers: mzolotukhin, anna, llvm-commits, skatkov
  Differential Revision: https://reviews.llvm.org/D33240
  llvm-svn: 305558
* [InstCombine] Fold (!iszero(A & K1) & !iszero(A & K2)) -> (A & (K1 | K2)) == (K1 | K2) if K1 and K2 are a 1-bit mask | Craig Topper | 2017-06-16 | 1 | -12/+8

  Summary: This is the demorganed version of the case we already handle for
  the OR of iszero.

  Reviewers: spatel
  Reviewed By: spatel
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D34244
  llvm-svn: 305548
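  (Hedged C rendering of the fold with example single-bit masks 0x4 and 0x10,
  chosen arbitrarily.)

    int both_bits_set(unsigned a) {
        /* (!iszero(A & K1) & !iszero(A & K2)) with K1 = 0x4, K2 = 0x10 ... */
        return ((a & 0x4u) != 0) & ((a & 0x10u) != 0);
    }

    int both_bits_set_folded(unsigned a) {
        /* ... folds to (A & (K1 | K2)) == (K1 | K2) */
        return (a & 0x14u) == 0x14u;
    }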
* [cfi] CFI-ICall for ThinLTO. | Evgeniy Stepanov | 2017-06-16 | 4 | -0/+152

  Implement ControlFlowIntegrity for indirect function calls in ThinLTO.
  Design follows the RFC in llvm-dev, see
  https://groups.google.com/d/msg/llvm-dev/MgUlaphu4Qc/kywu0AqjAQAJ
  llvm-svn: 305533
* [InstCombine] Add test cases to demonstrate instcombine increasing instruction count when trying to fold (select (icmp eq (and X, C1), 0), Y, (or Y, C2))->(or (shl (and X, C1), C3), y) when the pieces have multiple uses. | Craig Topper | 2017-06-15 | 1 | -0/+237

  llvm-svn: 305509
* [InstCombine] Pre-commit test cases for the transform proposed in D34244. | Craig Topper | 2017-06-15 | 1 | -0/+58

  llvm-svn: 305492
* [InstCombine] Handle (iszero(A & K1) | iszero(A & K2)) -> (A & (K1 | K2)) != (K1 | K2) when one of the Ands is commuted relative to the other | Craig Topper | 2017-06-15 | 1 | -9/+4

  Currently we expect A to be on the same side in both Ands but nothing
  guarantees that.

  While there, also switch to using matchers for some of the code.

  Differential Revision: https://reviews.llvm.org/D34230
  llvm-svn: 305487
* [BasicAA] Add test case that goes with r305481. | Craig Topper | 2017-06-15 | 1 | -0/+53

  Forgot to 'git add' the file.
  llvm-svn: 305483
* [InstCombine] auto-generate complete checks; NFC | Sanjay Patel | 2017-06-15 | 1 | -50/+106

  llvm-svn: 305474