bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	AMDGPU: Add cvt.pkrtz intrinsic	Matt Arsenault	2017-02-22	1	-0/+25
	Convert llvm.SI.packf16 test uses llvm-svn: 295797
*	[InstCombine] canonicalize non-obvious forms of integer min/max	Sanjay Patel	2017-02-21	1	-17/+24
	This is part of trying to clean up our handling of min/max patterns in IR. By converting these to canonical form, we're more likely to recognize them because there are various places in InstCombine that don't use matchSelectPattern or m_SMax and friends. The backend fixups referenced in the now deleted TODO comment were added with: https://reviews.llvm.org/rL291392 https://reviews.llvm.org/rL289738 If there's any codegen fallout from this change, we should be able to address it in DAGCombiner or target-specific lowering. llvm-svn: 295758
*	[InstCombine] Do not exercise nested max/min pattern on abs	Anna Thomas	2017-02-21	1	-1/+3
	Summary: This is a fix for assertion failure in `getInverseMinMaxSelectPattern` when ABS is passed in as a select pattern. We should not be invoking the simplification rule for ABS(MIN(~x,y)) or ABS(MAX(~x,y)) combinations. Added a test case which would cause an assertion failure without the patch. Reviewers: sanjoy, majnemer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30051 llvm-svn: 295719
*	[InstCombine] add nsw/nuw X, signbit --> or X, signbit	Sanjay Patel	2017-02-18	1	-2/+9
	Changing to 'or' (rather than 'xor' when no wrapping flags are set) allows icmp simplifies to happen as expected. Differential Revision: https://reviews.llvm.org/D29729 llvm-svn: 295574
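The claim in this commit can be checked exhaustively at a small bit width. The sketch below models i8 arithmetic in plain Python; the function names are illustrative and are not LLVM APIs. It checks that adding the sign bit behaves like `xor` in general, and like `or` whenever the nsw or nuw flag would hold (that is, whenever the add does not wrap).

```python
BITS = 8
MASK = (1 << BITS) - 1          # 0xFF for i8
SIGNBIT = 1 << (BITS - 1)       # 0x80 for i8


def to_signed(v):
    """Interpret an unsigned BITS-wide value as two's-complement signed."""
    return v - (1 << BITS) if v & SIGNBIT else v


def add_signbit_is_xor():
    """Without wrapping flags, add X, signbit equals xor X, signbit."""
    return all(((x + SIGNBIT) & MASK) == (x ^ SIGNBIT) for x in range(MASK + 1))


def flagged_add_signbit_is_or():
    """When nuw or nsw holds, add X, signbit equals or X, signbit."""
    for x in range(MASK + 1):
        nuw = x + SIGNBIT <= MASK
        nsw = -SIGNBIT <= to_signed(x) + to_signed(SIGNBIT) <= SIGNBIT - 1
        if (nuw or nsw) and ((x + SIGNBIT) & MASK) != (x | SIGNBIT):
            return False
    return True
```

Either flag can only hold when the sign bit of X is clear, which is why `or` and `add` agree in that case.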
*	InstCombine: fix extraction when performing vector/array punning	Eugene Leviant	2017-02-17	1	-1/+1
	Differential revision: https://reviews.llvm.org/D29491 llvm-svn: 295429
*	InstCombine: Canonicalize fast fmuladd to fmul + fadd	Matt Arsenault	2017-02-16	1	-1/+14
	llvm-svn: 295353
*	[AVX-512][InstCombine] Teach InstCombine to optimize 512-bit packss/packus intrinsics like it does 128/256-bit.	Craig Topper	2017-02-16	2	-4/+9
	llvm-svn: 295294
*	[InstCombine] improve formatting; NFC	Sanjay Patel	2017-02-15	1	-6/+3
	llvm-svn: 295237
*	[InstCombine] fold icmp sgt/slt (add nsw X, C2), C --> icmp sgt/slt X, (C - C2)	Sanjay Patel	2017-02-12	1	-6/+16
	I found one special case of this transform for 'slt 0', so I removed that and added the general transform. Alive code to check correctness:
	  Name: slt_no_overflow
	  Pre: WillNotOverflowSignedSub(C1, C2)
	  %a = add nsw i8 %x, C2
	  %b = icmp slt %a, C1
	  =>
	  %b = icmp slt %x, C1 - C2
	  Name: sgt_no_overflow
	  Pre: WillNotOverflowSignedSub(C1, C2)
	  %a = add nsw i8 %x, C2
	  %b = icmp sgt %a, C1
	  =>
	  %b = icmp sgt %x, C1 - C2
	http://rise4fun.com/Alive/MH Differential Revision: https://reviews.llvm.org/D29774 llvm-svn: 294898
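As a rough cross-check of the Alive proof in this commit, the fold can also be brute-forced at a narrow bit width. This is a minimal sketch in plain Python, not the InstCombine implementation; `wrap`, `fold_holds`, and the `check_sub_overflow` switch are hypothetical names introduced for illustration, and a 5-bit type is assumed to keep the search small.

```python
import operator


def wrap(v, bits):
    """Reduce v modulo 2**bits and reinterpret it as a signed integer."""
    v &= (1 << bits) - 1
    return v - (1 << bits) if v >> (bits - 1) else v


def fold_holds(pred, bits=5, check_sub_overflow=True):
    """Check icmp pred (add nsw X, C2), C == icmp pred X, (C - C2) for all inputs."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    values = range(lo, hi + 1)
    for c in values:
        for c2 in values:
            # Precondition from the Alive code: C - C2 must not overflow.
            if check_sub_overflow and not lo <= c - c2 <= hi:
                continue
            new_c = wrap(c - c2, bits)
            for x in values:
                # If the nsw add overflows, the result is poison, so any
                # replacement is acceptable and the case is skipped.
                if not lo <= x + c2 <= hi:
                    continue
                if pred(x + c2, c) != pred(x, new_c):
                    return False
    return True
```

Calling `fold_holds(operator.lt)` and `fold_holds(operator.gt)` covers the slt and sgt forms. Passing `check_sub_overflow=False` drops the precondition and makes the check fail, which illustrates why the overflow test on `C - C2` is needed.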
*	[InstCombine] Move class into anonymous namespace. NFC.	Benjamin Kramer	2017-02-10	1	-0/+2
	This is necessary to avoid warnings from GCC. InstCombineLoadStoreAlloca.cpp:238:7: error: 'PointerReplacer' declared with greater visibility than the type of its field 'PointerReplacer::IC' llvm-svn: 294794
*	[InstCombine] Silence unused variable warning in Release builds.	Benjamin Kramer	2017-02-10	1	-0/+2
	llvm-svn: 294788
*	Fix invalid addrspacecast due to combining alloca with global var	Yaxun Liu	2017-02-10	2	-7/+120
	For function-scope variables with large initialisation list, FE usually generates a global variable to hold the initializer, then generates memcpy intrinsic to initialize the alloca. InstCombiner::visitAllocaInst identifies such allocas which are accessed only by reading and replaces them with the global variable. This is done by casting the global variable to the type of the alloca and replacing all references. However, when the global variable is in a different address space which is disjoint with addr space 0 (e.g. for IR generated from OpenCL, global variable cannot be in private addr space i.e. addr space 0), casting the global variable to addr space 0 results in invalid IR for certain targets (e.g. amdgpu). To fix this issue, when the global variable is not in addr space 0, instead of casting it to addr space 0, this patch chases down the uses of alloca until reaching the load instructions, then replaces load from alloca with load from the global variable. If during the chasing bitcast and GEP are encountered, new bitcast and GEP based on the global variable are generated and used in the load instructions. Differential Revision: https://reviews.llvm.org/D27283 llvm-svn: 294786
*	[InstCombine] allow (X * C2) << C1 --> X * (C2 << C1) for vectors	Sanjay Patel	2017-02-09	1	-13/+12
	This fold already existed for vectors but only when 'C1' was a splat constant (but 'C2' could be any constant). There were no tests for any vector constants, so I'm adding a test that shows non-splat constants for both operands. llvm-svn: 294650
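To illustrate why the fold is safe lane by lane even with non-splat constants, here is a small sketch that models i8 vectors as Python tuples. The function names are hypothetical, and the model assumes plain wrapping arithmetic with no nsw/nuw flags.

```python
BITS = 8
MASK = (1 << BITS) - 1


def mul_then_shl(xs, c2s, c1s):
    """Model of (X * C2) << C1, computed per lane with i8 wraparound."""
    return tuple((((x * c2) & MASK) << c1) & MASK
                 for x, c2, c1 in zip(xs, c2s, c1s))


def mul_by_shifted_const(xs, c2s, c1s):
    """Model of X * (C2 << C1), with the constant folded per lane first."""
    return tuple((x * ((c2 << c1) & MASK)) & MASK
                 for x, c2, c1 in zip(xs, c2s, c1s))


# Non-splat constants: each lane has its own multiplier and shift amount.
lanes = (3, 200, 77, 255)
multipliers = (5, 7, 9, 11)
shifts = (1, 3, 0, 7)
same = mul_then_shl(lanes, multipliers, shifts) == mul_by_shifted_const(
    lanes, multipliers, shifts)
```

Both forms multiply X by `C2 * 2**C1` modulo `2**8`, so they agree regardless of whether the constants are uniform across lanes.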
*	[InstCombine] use m_APInt to allow demanded bits analysis on splat constants	Sanjay Patel	2017-02-09	1	-10/+13
	llvm-svn: 294628
*	[InstCombine] add local name for repeated calls; NFC	Sanjay Patel	2017-02-08	1	-6/+4
	llvm-svn: 294470
*	[InstCombineCalls] Fix buildbot failures after r294453.	Igor Laevsky	2017-02-08	1	-1/+1
	Some targets don't support uint64_t options. Change type to unsigned. Differential Revision: https://reviews.llvm.org/D28909 llvm-svn: 294461
*	[InstCombineCalls] Unfold element atomic memcpy instruction	Igor Laevsky	2017-02-08	2	-0/+83
	Differential Revision: https://reviews.llvm.org/D28909 llvm-svn: 294453
*	[InstCombineCalls] Remove zero length atomic memcpy intrinsics	Igor Laevsky	2017-02-08	1	-0/+6
	Differential Revision: https://reviews.llvm.org/D28909 llvm-svn: 294452
*	Fix the -Werror build for some sign-comparisons	David Blaikie	2017-02-07	1	-1/+1
	llvm-svn: 294331
*	[InstCombine] Make max size array combine a tunable.	Davide Italiano	2017-02-07	4	-3/+13
	Requested by Sanjoy/Hal a while ago, and forgotten by me (r283612). llvm-svn: 294323
*	Merge DebugLoc on combined stores; in this case, when combining stores	Paul Robinson	2017-02-06	1	-1/+4
	from the end of two blocks, merge instead of arbitrarily picking one. Differential Revision: http://reviews.llvm.org/D29504 llvm-svn: 294251
*	[InstCombine] simplify dyn_cast + isa; NFCI	Sanjay Patel	2017-02-06	1	-6/+4
	llvm-svn: 294198
*	[InstCombine] treat i1 as a special type in shouldChangeType()	Sanjay Patel	2017-02-03	1	-4/+8
	This patch is based on the llvm-dev discussion here: http://lists.llvm.org/pipermail/llvm-dev/2017-January/109631.html Folding to i1 should always be desirable because that's better for value tracking and we have special folds for i1 types. I checked for other users of shouldChangeType() where this might have an effect, but we already handle the i1 case differently than other types in all of those cases. Side note: the default datalayout includes i1, so it seems we only find this gap in shouldChangeType + phi folding for the case when there is (1) an explicit datalayout without i1, (2) casting to i1 from a legal type, and (3) a phi with exactly 2 incoming casted operands (as Björn mentioned). Differential Revision: https://reviews.llvm.org/D29336 llvm-svn: 294066
*	[InstCombine] fix operand-complexity-based canonicalization (PR28296)	Sanjay Patel	2017-02-03	1	-7/+15
	The code comments didn't match the code logic, and we didn't actually distinguish the fake unary (not/neg/fneg) operators from arguments. Adding another level to the weighting scheme provides more structure and can help simplify the pattern matching in InstCombine and other places. I fixed regressions that would have shown up from this change in: rL290067 rL290127 But that doesn't mean there are no pattern-matching logic holes left; some combines may just be missing regression tests. Should fix: https://llvm.org/bugs/show_bug.cgi?id=28296 Differential Revision: https://reviews.llvm.org/D27933 llvm-svn: 294049
*	[InstCombine] move folds for shift-shift pairs; NFCI	Sanjay Patel	2017-02-01	1	-48/+34
	Although this is 'no-functional-change-intended', I'm adding tests for shl-shl and lshr-lshr pairs because there is no existing test coverage for those folds. It seems like we should be able to remove some code from foldShiftedShift() at this point because we're handling those patterns on the general path. llvm-svn: 293814
*	[InstCombine] Allow InstCombine to merge adjacent guards	Sanjoy Das	2017-02-01	1	-6/+14
	Summary: If there are two adjacent guards with different conditions, we can remove one of them and include its condition into the condition of another one. This patch allows InstCombine to merge them by the following pattern: guard(a); guard(b) -> guard(a & b). Reviewers: reames, apilipenko, igor-laevsky, anna, sanjoy Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29378 llvm-svn: 293778
*	[Instcombine] Combine consecutive identical fences	Davide Italiano	2017-01-31	2	-0/+10
	Differential Revision: https://reviews.llvm.org/D29314 llvm-svn: 293661
*	Don't combine stores to a swifterror pointer operand to a different type	Arnold Schwaighofer	2017-01-31	1	-1/+2
	llvm-svn: 293658
*	fix formatting; NFC	Sanjay Patel	2017-01-31	5	-17/+17
	llvm-svn: 293652
*	[InstCombine] Make sure that LHS and RHS have the same type in transformToIndexedCompare	Silviu Baranga	2017-01-31	1	-0/+4
	If they don't have the same type, the size of the constant index would need to be adjusted (and this wouldn't be always possible). Alternatively we could try the analysis with the initial RHS value, which would guarantee that the two sides have the same type. However it is unlikely that in practice this would pass our transformation requirements. Fixes PR31808 (https://llvm.org/bugs/show_bug.cgi?id=31808). llvm-svn: 293629
*	[InstCombine] enable (X <<nsw C1) >>s C2 --> X <<nsw (C1 - C2) for vectors with splat constants	Sanjay Patel	2017-01-30	1	-54/+19
	llvm-svn: 293570
*	[InstCombine] enable more lshr(shl X, C1), C2 folds for vectors with splat constants	Sanjay Patel	2017-01-30	1	-23/+17
	llvm-svn: 293562
*	[InstCombine] enable (X >>?exact C1) << C2 --> X >>?exact (C1-C2) for vectors with splat constants	Sanjay Patel	2017-01-30	1	-24/+22
	llvm-svn: 293524
*	[InstCombine] use auto with obvious type; NFC	Sanjay Patel	2017-01-30	1	-3/+3
	llvm-svn: 293508
*	[InstCombine] enable (X <<nsw C1) >>s C2 --> X <<nsw (C1-C2) for vectors with splat constants	Sanjay Patel	2017-01-30	1	-20/+16
	llvm-svn: 293507
*	[InstCombine] fixed to propagate 'exact' on lshr	Sanjay Patel	2017-01-30	1	-1/+1
	The original shift is bigger, so this may qualify as 'obvious', but here's an attempt at an Alive-based proof:
	  Name: exact
	  Pre: (C1 u< C2)
	  %a = shl i8 %x, C1
	  %b = lshr exact i8 %a, C2
	  =>
	  %c = lshr exact i8 %x, C2 - C1
	  %b = and i8 %c, ((1 << width(C1)) - 1) u>> C2
	  Optimization is correct!
	llvm-svn: 293498
*	[InstCombine] enable lshr(shl X, C1), C2 folds for vectors with splat constants	Sanjay Patel	2017-01-30	1	-25/+25
	llvm-svn: 293489
*	[InstCombine] enable (X >>?,exact C1) << C2 --> X << (C2 - C1) for vectors with splats	Sanjay Patel	2017-01-29	1	-17/+17
	llvm-svn: 293435
*	[InstCombine] move icmp transforms that might be recognized as min/max and inf-loop (PR31751)	Sanjay Patel	2017-01-27	1	-10/+21
	This is a minimal patch to avoid the infinite loop in: https://llvm.org/bugs/show_bug.cgi?id=31751 But the general problem is bigger: we're not canonicalizing all of the min/max forms reported by value tracking's matchSelectPattern(), and we don't define min/max consistently. Some code uses matchSelectPattern(), other code uses matchers like m_Umax, and others have their own inline definitions which may be subtly different from any of the above. The reason that the test cases in this patch need a cast op to trigger is because we don't (yet) canonicalize all min/max forms based on matchSelectPattern() in canonicalizeMinMaxWithConstant(), but we do make min/max+cast transforms based on matchSelectPattern() in visitSelectInst(). The location of the icmp transforms that trigger the inf-loop seems arbitrary at best, so I'm moving those behind the min/max fence in visitICmpInst() as the quick fix. llvm-svn: 293345
*	[NVPTX] [InstCombine] Add llvm_unreachable to appease MSVC.	Justin Lebar	2017-01-27	1	-0/+1
	llvm-svn: 293253
*	[NVPTX] Fix use-after-stack-free bug in InstCombineCalls.	Justin Lebar	2017-01-27	1	-1/+1
	Introduced in r293244. llvm-svn: 293251
*	[NVPTX] Upgrade NVVM intrinsics in InstCombineCalls.	Justin Lebar	2017-01-27	1	-0/+250
	Summary: There are many NVVM intrinsics that we can't entirely get rid of, but that nonetheless often correspond to target-generic LLVM intrinsics. For example, if flush denormals to zero (ftz) is enabled, we can convert @llvm.nvvm.ceil.ftz.f to @llvm.ceil.f32. On the other hand, if ftz is disabled, we can't do this, because @llvm.ceil.f32 will be lowered to a non-ftz PTX instruction. In this case, we can, however, simplify the non-ftz nvvm ceil intrinsic, @llvm.nvvm.ceil.f, to @llvm.ceil.f32. These transformations are particularly useful because they let us constant fold instructions that appear in libdevice, the bitcode library that ships with CUDA and essentially functions as its libm. Reviewers: tra Subscribers: hfinkel, majnemer, llvm-commits Differential Revision: https://reviews.llvm.org/D28794 llvm-svn: 293244
*	Revert a couple of InstCombine/Guard checkins	Sanjoy Das	2017-01-26	1	-29/+0
	This change reverts: r293061: "[InstCombine] Canonicalize guards for NOT OR condition" r293058: "[InstCombine] Canonicalize guards for AND condition" They miscompile cases like:
```
declare void @llvm.experimental.guard(i1, ...)
define void @test_guard_not_or(i1 %A, i1 %B) {
  %C = or i1 %A, %B
  %D = xor i1 %C, true
  call void(i1, ...) @llvm.experimental.guard(i1 %D, i32 20, i32 30)[ "deopt"() ]
  ret void
}
```
	because they do not transfer the `i32 20, i32 30` parameters to newly created guard instructions. llvm-svn: 293227
*	[InstCombine] fold (X >>u C) << C --> X & (-1 << C)	Sanjay Patel	2017-01-26	1	-18/+17
	We already have this fold when the lshr has one use, but it doesn't need that restriction. We may be able to remove some code from foldShiftedShift(). Also, move the similar: (X << C) >>u C --> X & (-1 >>u C) ...directly into visitLShr to help clean up foldShiftByConstOfShiftByConst(). That whole function seems questionable since it is called by commonShiftTransforms(), but there's really not much in common if we're checking the shift opcodes for every fold. llvm-svn: 293215
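Both shift-pair folds named in this commit reduce to a single mask, which is easy to confirm exhaustively. The following is a minimal sketch that models i8 values as Python integers; the helper names are made up for this example and do not correspond to functions in InstCombine.

```python
BITS = 8
MASK = (1 << BITS) - 1  # all-ones i8 value, i.e. -1 as unsigned


def lshr_then_shl(x, c):
    """(X >>u C) << C with i8 wraparound."""
    return ((x >> c) << c) & MASK


def and_high_mask(x, c):
    """X & (-1 << C): keeps the high bits, clears the low C bits."""
    return x & ((MASK << c) & MASK)


def shl_then_lshr(x, c):
    """(X << C) >>u C with i8 wraparound."""
    return ((x << c) & MASK) >> c


def and_low_mask(x, c):
    """X & (-1 >>u C): keeps the low bits, clears the high C bits."""
    return x & (MASK >> c)


def folds_hold():
    """Compare both shift pairs with their masked forms for every i8 X and valid C."""
    return all(
        lshr_then_shl(x, c) == and_high_mask(x, c)
        and shl_then_lshr(x, c) == and_low_mask(x, c)
        for x in range(MASK + 1)
        for c in range(BITS)
    )
```

Neither identity depends on how many uses the inner shift has, which is consistent with the commit dropping the one-use restriction.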
*	[InstCombine] use m_APInt to allow (X << C) >>u C --> X & (-1 >>u C) with splat vectors	Sanjay Patel	2017-01-26	1	-16/+24
	llvm-svn: 293208
*	[X86] Add demanded elts support for the inputs to pclmul intrinsic	Craig Topper	2017-01-26	1	-0/+38
	This intrinsic uses bit 0 and bit 4 of an immediate argument to determine which bits of its inputs to read. This patch uses this information to simplify the demanded elements of the input vectors. Differential Revision: https://reviews.llvm.org/D28979 llvm-svn: 293151
*	[InstCombine] Canonicalize guards for NOT OR condition	Artur Pilipenko	2017-01-25	1	-0/+12
	This is a partial fix for Bug 31520 - [guards] canonicalize guards in instcombine Reviewed By: apilipenko Differential Revision: https://reviews.llvm.org/D29075 Patch by Maxim Kazantsev. llvm-svn: 293061
*	[InstCombine][SSE] Add support for PACKSS/PACKUS constant folding	Simon Pilgrim	2017-01-25	1	-0/+94
	Differential Revision: https://reviews.llvm.org/D28949 llvm-svn: 293060
*	[InstCombine] Canonicalize guards for AND condition	Artur Pilipenko	2017-01-25	1	-0/+17
	This is a partial fix for Bug 31520 - [guards] canonicalize guards in instcombine Reviewed By: apilipenko Differential Revision: https://reviews.llvm.org/D29074 Patch by Maxim Kazantsev. llvm-svn: 293058
*	[InstCombine] Allow InstCombine to remove one of adjacent guards if they are equivalent	Artur Pilipenko	2017-01-25	1	-0/+10
	This is a partial fix for Bug 31520 - [guards] canonicalize guards in instcombine Reviewed By: majnemer, apilipenko Differential Revision: https://reviews.llvm.org/D29071 Patch by Maxim Kazantsev. llvm-svn: 293056