bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[Scalarizer] Propagate IR flags	Jay Foad	2019-06-21	1	-0/+53
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The motivation for this was to propagate fast-math flags like nnan and ninf on vector floating point operations to the corresponding scalar operations to take advantage of follow-on optimizations. But I think the same argument applies to all of our IR flags: if they apply to the vector operation then they also apply to all the individual scalar operations, and they might enable follow-on optimizations. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63593 llvm-svn: 364051
*	[RISCV] Add RISCV-specific TargetTransformInfo	Sam Elliott	2019-06-21	2	-0/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: LLVM Allows Targets to provide information that guides optimisations made to LLVM IR. This is done with callbacks on a TargetTransformInfo object. This patch adds a TargetTransformInfo class for RISC-V. This will allow us to implement RISC-V specific callbacks as they become necessary. This commit also adds the getIntImmCost callbacks, and tests them with a simple constant hoisting test. Our immediate costs are on the conservative side, for the moment, but we prevent hoisting in most circumstances anyway. Previous review was on D63007 Reviewers: asb, luismarques Reviewed By: asb Subscribers: ributzka, MaskRay, llvm-commits, Jim, benna, psnobl, jocewei, PkmX, rkruppe, the_o, brucehoult, MartinMosbeck, rogfer01, edward-jones, zzheng, jrtc27, shiva0217, kito-cheng, niosHD, sabuasal, apazos, simoncook, johnrusso, rbar, hiraditya, mgorny Tags: #llvm Differential Revision: https://reviews.llvm.org/D63433 llvm-svn: 364046
*	[Reassociate] Remove bogus assert reported in PR42349.	Cameron McInally	2019-06-20	1	-0/+18
\| \| \| \| \| \| \| \|	Also, add a FIXME for the unsafe transform on a unary FNeg. A unary FNeg can only be transformed to a FMul by -1.0 when the nnan flag is present. The unary FNeg project is a WIP, so the unsafe transformation is acceptable until that work is complete. The bogus assert with introduced in D63445. llvm-svn: 363998
*	[InstSimplify] simplify power-of-2 (single bit set) sequences	Sanjay Patel	2019-06-20	1	-8/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As discussed in PR42314: https://bugs.llvm.org/show_bug.cgi?id=42314 Improving the canonicalization for these patterns: rL363956 ...means we should adjust/enhance the related simplification. https://rise4fun.com/Alive/w1cp Name: isPow2 or zero %x = and i32 %xx, 2048 %a = add i32 %x, -1 %r = and i32 %a, %x => %r = i32 0 llvm-svn: 363997
*	[LICM & MSSA] Limit unsafe sinking and hoisting.	Alina Sbirlea	2019-06-20	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The getClobberingMemoryAccess API checks for clobbering accesses in a loop by walking the backedge. This may check if a memory access is being clobbered by the loop in a previous iteration, depending how smart AA got over the course of the updates in MemorySSA (it does not occur when built from scratch). If no clobbering access is found inside the loop, it will optimize to an access outside the loop. This however does not mean that access is safe to sink. Given: ``` for i load a[i] store a[i] ``` The access corresponding to the load can be optimized to outside the loop, and the load can be hoisted. But it is incorrect to sink it. In order to sink the load, we'd need to check no Def clobbers the Use in the same iteration. With this patch we currently restrict sinking to either Defs not existing in the loop, or Defs preceding the load in the same block. An easy extension is to ensure the load (Use) post-dominates all Defs. Caught by PR42294. This issue also shed light on the converse problem: hoisting stores in this same scenario would be illegal. With this patch we restrict hoisting of stores to the case when their corresponding Defs are dominating all Uses in the loop. Reviewers: george.burgess.iv Subscribers: jlebar, Prazek, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63582 llvm-svn: 363982
*	[InstSimplify] add tests for known-not-a-power-of-2; NFC	Sanjay Patel	2019-06-20	1	-0/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	I added a canonicalization to create this general pattern in: rL363956 But as noted in PR42314: https://bugs.llvm.org/show_bug.cgi?id=42314#c11 ...we have a (potentially expensive) simplification for the version of the code that we just canonicalized away from, so we should add/adjust that code to match. llvm-svn: 363981
*	[NFC][SLP] Pre-commit unary FNeg test to X86/propagate_ir_flags.ll	Cameron McInally	2019-06-20	1	-0/+78
\| \| \| \|	llvm-svn: 363978
*	[NFC] Updated tests for D63546	David Bolvansky	2019-06-20	1	-46/+123
\| \| \| \|	llvm-svn: 363967
*	[InstCombine] canonicalize check for power-of-2	Sanjay Patel	2019-06-20	1	-21/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The form that compares against 0 is better because: 1. It removes a use of the input value. 2. It's the more standard form for this pattern: https://graphics.stanford.edu/~seander/bithacks.html#DetermineIfPowerOf2 3. It results in equal or better codegen (tested with x86, AArch64, ARM, PowerPC, MIPS). This is a root cause for PR42314, but probably doesn't completely answer the codegen request: https://bugs.llvm.org/show_bug.cgi?id=42314 Alive proof: https://rise4fun.com/Alive/9kG Name: is power-of-2 %neg = sub i32 0, %x %a = and i32 %neg, %x %r = icmp eq i32 %a, %x => %dec = add i32 %x, -1 %a2 = and i32 %dec, %x %r = icmp eq i32 %a2, 0 Name: is not power-of-2 %neg = sub i32 0, %x %a = and i32 %neg, %x %r = icmp ne i32 %a, %x => %dec = add i32 %x, -1 %a2 = and i32 %dec, %x %r = icmp ne i32 %a2, 0 llvm-svn: 363956
*	[Tests] Add a tricky LFTR case for documentation purposes	Philip Reames	2019-06-20	1	-0/+64
\| \| \| \| \| \|	Thought of this case while working on something else. We appear to get it right in all of the variations I tried, but that's by accident. So, add a test which would catch the potential bug. llvm-svn: 363953
*	[InstCombine] cttz(-x) -> cttz(x)	David Bolvansky	2019-06-20	1	-16/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Signedness does not change number of trailing zeros. Reviewers: spatel, lebedev.ri, nikic Reviewed By: spatel Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63534 llvm-svn: 363951
*	[InstCombine] add commuted variants for power-of-2 checks; NFC	Sanjay Patel	2019-06-20	1	-0/+47
\| \| \| \|	llvm-svn: 363945
*	[InstCombine] add tests for checking power-of-2; NFC	Sanjay Patel	2019-06-20	1	-0/+138
\| \| \| \|	llvm-svn: 363938
*	[NFC][SLP] Pre-commit unary FNeg test to X86/phi3.ll	Cameron McInally	2019-06-20	1	-0/+41
\| \| \| \|	llvm-svn: 363937
*	[SLP][X86] Add lookahead reordering tests from D60897	Simon Pilgrim	2019-06-20	1	-3/+235
\| \| \| \|	llvm-svn: 363925
*	LFTR for multiple exit loops	Philip Reames	2019-06-19	2	-64/+57
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Teach IndVarSimply's LinearFunctionTestReplace transform to handle multiple exit loops. LFTR does two key things 1) it rewrites (all) exit tests in terms of a common IV potentially eliminating one in the process and 2) it moves any offset/indexing/f(i) style logic out of the loop. This turns out to actually be pretty easy to implement. SCEV already has all the information we need to know what the backedge taken count is for each individual exit. (We use that when computing the BE taken count for the loop as a whole.) We basically just need to iterate through the exiting blocks and apply the existing logic with the exit specific BE taken count. (The previously landed NFC makes this super obvious.) I chose to go ahead and apply this to all loop exits instead of only latch exits as originally proposed. After reviewing other passes, the only case I could find where LFTR form was harmful was LoopPredication. I've fixed the latch case, and guards aren't LFTRed anyways. We'll have some more work to do on the way towards widenable_conditions, but that's easily deferred. I do want to note that I added one bit after the review. When running tests, I saw a new failure (no idea why didn't see previously) which pointed out LFTR can rewrite a constant condition back to a loop varying one. This was theoretically possible with a single exit, but the zero case covered it in practice. With multiple exits, we saw this happening in practice for the eliminate-comparison.ll test case because we'd compute a ExitCount for one of the exits which was guaranteed to never actually be reached. Since LFTR ran after simplifyAndExtend, we'd immediately turn around and undo the simplication work we'd just done. The solution seemed obvious, so I didn't bother with another round of review. Differential Revision: https://reviews.llvm.org/D62625 llvm-svn: 363883
*	[Tests] Autogen a test so that future changes are understandable	Philip Reames	2019-06-19	1	-100/+444
\| \| \| \|	llvm-svn: 363882
*	[InstCombine] Fold icmp eq/ne (and %x, signbit), 0 -> %x s>=/s< 0 earlier	Huihui Zhang	2019-06-19	2	-36/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: To generate simplified IR, make sure fold ``` (X & signbit) ==/!= 0) -> X s>=/s< 0; ``` is scheduled before fold ``` ((X << Y) & C) == 0 -> (X & (C >> Y)) == 0. ``` https://rise4fun.com/Alive/fbdh Reviewers: lebedev.ri, efriedma, spatel, craig.topper Reviewed By: lebedev.ri Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63026 llvm-svn: 363845
*	[InstSimplify] add a phi test with 1 incoming value; NFC	Sanjay Patel	2019-06-19	1	-2/+32
\| \| \| \| \| \| \|	D63489 proposes to change this behavior, but there's no direct -instsimplify test to verify that the transform exists. llvm-svn: 363842
*	[NFC][LSR] Avoid undefined grep in pr2570.ll	Hubert Tong	2019-06-19	1	-1/+1
\| \| \| \| \| \| \| \| \|	greater-than-sign is not a BRE special character. POSIX.1-2017 XBD Section 9.3.2 indicates that the interpretation of `\>` is undefined. This patch replaces the pattern. llvm-svn: 363828
*	[Reassociate] Handle unary FNeg in the Reassociate pass	Cameron McInally	2019-06-19	1	-4/+17
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D63445 llvm-svn: 363813
*	[NFC] Added tests for D63534	David Bolvansky	2019-06-19	1	-0/+74
\| \| \| \|	llvm-svn: 363796
*	[NFC] Added tests for cttz(abs(x)) -> cttz(x) fold	David Bolvansky	2019-06-19	1	-0/+169
\| \| \| \|	llvm-svn: 363795
*	[DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step through ↵	Orlando Cazalet-Hyams	2019-06-19	11	-56/+368
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	loop even after completion Summary: Bug: https://bugs.llvm.org/show_bug.cgi?id=39024 The bug reports that a vectorized loop is stepped through 4 times and each step through the loop seemed to show a different path. I found two problems here: A) An incorrect line number on a preheader block (for.body.preheader) instruction causes a step into the loop before it begins. B) Instructions in the middle block have different line numbers which give the impression of another iteration. In this patch I give all of the middle block instructions the line number of the scalar loop latch terminator branch. This seems to provide the smoothest debugging experience because the vectorized loops will always end on this line before dropping into the scalar loop. To solve problem A I have altered llvm::SplitBlockPredecessors to accommodate loop header blocks. I have set up a separate review D61933 for a fix which is required for this patch. Reviewers: samsonov, vsk, aprantl, probinson, anemet, hfinkel, jmorse Reviewed By: hfinkel, jmorse Subscribers: jmorse, javed.absar, eraman, kcc, bjope, jmellorcrummey, hfinkel, gbedwell, hiraditya, zzheng, llvm-commits Tags: #llvm, #debug-info Differential Revision: https://reviews.llvm.org/D60831 > llvm-svn: 363046 llvm-svn: 363786
*	[ConstantFolding] Fix assertion failure on non-power-of-two vector load.	Jay Foad	2019-06-19	1	-0/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The test case does an (out of bounds) load from a global constant with type <3 x float>. InstSimplify tried to turn this into an integer load of the whole alloc size of the vector, which is 128 bits due to alignment padding, and then bitcast this to <3 x vector> which failed an assertion due to the type size mismatch. The fix is to do an integer load of the normal size of the vector, with no alignment padding. Reviewers: tpr, arsenm, majnemer, dstuttard Reviewed By: arsenm Subscribers: hfinkel, wdng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63375 llvm-svn: 363784
*	Recommit [SROA] Enhance SROA to handle `addrspacecast`ed allocas	Michael Liao	2019-06-18	1	-0/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	[SROA] Enhance SROA to handle `addrspacecast`ed allocas - Fix typo in original change - Add additional handling to ensure all return pointers are properly casted. Summary: - After `addrspacecast` is allowed to be eliminated in SROA, the adjusting of storage pointer (from `alloca) needs to handle the potential different address spaces between the storage pointer (from alloca) and the pointer being used. Reviewers: arsenm Subscribers: wdng, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63501 llvm-svn: 363743
*	InstCombine: Pre-commit test for reassociating nuw	Matt Arsenault	2019-06-18	1	-0/+131
\| \| \| \| \| \|	D39417 llvm-svn: 363741
*	Fix broken debug info in in an !llvm.loop attachment in this testcase.	Adrian Prantl	2019-06-18	1	-2/+2
\| \| \| \|	llvm-svn: 363730
*	Revert [SROA] Enhance SROA to handle `addrspacecast`ed allocas	Jordan Rupprecht	2019-06-18	1	-15/+0
\| \| \| \| \| \| \| \| \| \|	This reverts r363711 (git commit 76a149ef8187310a60fd20481fdb2a10c8ba968e) This causes stage2 build failures, e.g.: http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/132/steps/stage%202%20build/logs/stdio http://lab.llvm.org:8011/builders/ppc64le-lld-multistage-test/builds/87/steps/build-stage2-unified-tree/logs/stdio llvm-svn: 363718
*	[SROA] Enhance SROA to handle `addrspacecast`ed allocas	Michael Liao	2019-06-18	1	-0/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: - After `addrspacecast` is allowed to be eliminated in SROA, the adjusting of storage pointer (from `alloca) needs to handle the potential different address spaces between the storage pointer (from alloca) and the pointer being used. Reviewers: arsenm Subscribers: wdng, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63501 llvm-svn: 363711
*	Teach getSCEVAtScope how to handle loop phis w/invariant operands in loops ↵	Philip Reames	2019-06-17	2	-15/+30
\| \| \| \| \| \| \| \| \| \| \| \|	w/taken backedges This patch really contains two pieces: Teach SCEV how to fold a phi in the header of a loop to the value on the backedge when a) the backedge is known to execute at least once, and b) the value is safe to use globally within the scope dominated by the original phi. Teach IndVarSimplify's rewriteLoopExitValues to allow loop invariant expressions which already exist (and thus don't need new computation inserted) even in loops where we can't optimize away other uses. Differential Revision: https://reviews.llvm.org/D63224 llvm-svn: 363619
*	Fix a bug w/inbounds invalidation in LFTR (recommit)	Philip Reames	2019-06-17	4	-12/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Recommit r363289 with a bug fix for crash identified in pr42279. Issue was that a loop exit test does not have to be an icmp, leading to a null dereference crash when new logic was exercised for that case. Test case previously committed in r363601. Original commit comment follows: This contains fixes for two cases where we might invalidate inbounds and leave it stale in the IR (a miscompile). Case 1 is when switching to an IV with no dynamically live uses, and case 2 is when doing pre-to-post conversion on the same pointer type IV. The basic scheme used is to prove that using the given IV (pre or post increment forms) would have to already trigger UB on the path to the test we're modifying. As such, our potential UB triggering use does not change the semantics of the original program. As was pointed out in the review thread by Nikita, this is defending against a separate issue from the hasConcreteDef case. This is about poison, that's about undef. Unfortunately, the two are different, see Nikita's comment for a fuller explanation, he explains it well. (Note: I'm going to address Nikita's last style comment in a separate commit just to minimize chance of subtle bugs being introduced due to typos.) Differential Revision: https://reviews.llvm.org/D62939 llvm-svn: 363613
*	Reduced test case for pr42279 in advance of the relevant re-commit + fix	Philip Reames	2019-06-17	1	-0/+31
\| \| \| \|	llvm-svn: 363601
*	[EarlyCSE] Fix hashing of self-compares	Joseph Tremoulet	2019-06-17	1	-0/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Update compare normalization in SimpleValue hashing to break ties (when the same value is being compared to itself) by switching to the swapped predicate if it has a lower numerical value. This brings the hashing in line with isEqual, which already recognizes the self-compares with swapped predicates as equal. Fixes PR 42280. Reviewers: spatel, efriedma, nikic, fhahn, uabelho Reviewed By: nikic Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63349 llvm-svn: 363598
*	AMDGPU: Fold readlane/readfirstlane calls	Matt Arsenault	2019-06-17	1	-0/+125
\| \| \| \|	llvm-svn: 363587
*	[LV] Suppress vectorization in some nontemporal cases	Warren Ristow	2019-06-17	2	-5/+117
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When considering a loop containing nontemporal stores or loads for vectorization, suppress the vectorization if the corresponding vectorized store or load with the aligment of the original scaler memory op is not supported with the nontemporal hint on the target. This adds two new functions: bool isLegalNTStore(Type DataType, unsigned Alignment) const; bool isLegalNTLoad(Type DataType, unsigned Alignment) const; to TTI, leaving the target independent default implementation as returning true, but with overriding implementations for X86 that check the legality based on available Subtarget features. This fixes https://llvm.org/PR40759 Differential Revision: https://reviews.llvm.org/D61764 llvm-svn: 363581
*	InferAddressSpaces: Fix cloning original addrspacecast	Matt Arsenault	2019-06-17	6	-39/+65
\| \| \| \| \| \| \| \|	If an addrspacecast needed to be inserted again, this was creating a clone of the original cast for each user. Just use the original, which also saves losing the value name. llvm-svn: 363562
*	AMDGPU: Ignore subtarget for InferAddressSpaces	Matt Arsenault	2019-06-17	1	-0/+13
\| \| \| \| \| \| \|	Even if the target doesn't have flat instructions, addrspace(0) is still flat. It just happens to not work. llvm-svn: 363561
*	AMDGPU: Mark exp/exp.compr as inaccessiblememonly	Matt Arsenault	2019-06-17	1	-17/+21
\| \| \| \| \| \| \| \| \| \|	Should also be marked writeonly, but I think that would require splitting the version with done set to a separate intrinsic Test change is only from renumbering the attribute group numbers, which for some reason the generated check lines consider. llvm-svn: 363560
*	[CodeGen] Check for HardwareLoop Latch ExitBlock	Sam Parker	2019-06-17	2	-0/+122
\| \| \| \| \| \| \| \| \| \| \| \|	The HardwareLoops pass finds exit blocks with a scevable exit count. If the target specifies to update the loop counter in a register, through a phi, we need to ensure that the exit block is a latch so that we can insert the phi with the correct value for the incoming edge. Differential Revision: https://reviews.llvm.org/D63336 llvm-svn: 363556
*	[LV] Deny irregular types in interleavedAccessCanBeWidened	Bjorn Pettersson	2019-06-17	1	-0/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Avoid that loop vectorizer creates loads/stores of vectors with "irregular" types when interleaving. An example of an irregular type is x86_fp80 that is 80 bits, but that may have an allocation size that is 96 bits. So an array of x86_fp80 is not bitcast compatible with a vector of the same type. Not sure if interleavedAccessCanBeWidened is the best place for this check, but it solves the problem seen in the added test case. And it is the same kind of check that already exists in memoryInstructionCanBeWidened. Reviewers: fhahn, Ayal, craig.topper Reviewed By: fhahn Subscribers: hiraditya, rkruppe, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63386 llvm-svn: 363547
*	[SCEV] Use NoWrapFlags when expanding a simple mul	Sam Parker	2019-06-17	9	-19/+19
\| \| \| \| \| \| \| \| \| \|	Second functional change following on from rL362687. Pass the NoWrapFlags from the MulExpr to InsertBinop when we're generating a shl or mul. Differential Revision: https://reviews.llvm.org/D61934 llvm-svn: 363540
*	[lit] Delete empty lines at the end of lit.local.cfg NFC	Fangrui Song	2019-06-17	51	-51/+0
\| \| \| \|	llvm-svn: 363538
*	Re-commit r357452 (take 3): "SimplifyCFG SinkCommonCodeFromPredecessors: ↵	Hans Wennborg	2019-06-17	1	-0/+44
\| \| \| \| \| \| \| \| \| \| \|	Also sink function calls without used results (PR41259)" Third time's the charm. This was reverted in r363220 due to being suspected of an internal benchmark regression and a test failure, none of which turned out to be caused by this. llvm-svn: 363529
*	[SimplifyCFG] Fix prof branch_weights MD while removing unreachable switch cases	Yevgeny Rouban	2019-06-17	1	-0/+35
\| \| \| \| \| \| \| \| \| \| \|	SimplifyCFG has a bug that results in inconsistent prof branch_weights metadata if unreachable switch cases are removed. This patch fixes this bug by making use of the newly introduced SwitchInstProfUpdateWrapper class (see patch D62122). A new test is created. Differential Revision: https://reviews.llvm.org/D62186 llvm-svn: 363527
*	[InstSimplify] Fix addo/subo undef folds (PR42209)	Roman Lebedev	2019-06-16	2	-8/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix folds of addo and subo with an undef operand to be: `@llvm.{u,s}{add,sub}.with.overflow` all fold to `{ undef, false }`, as per LLVM undef rules. Same for commuted variants. Based on the original version of the patch by @nikic. Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=42209 \| PR42209 ]] Differential Revision: https://reviews.llvm.org/D63065 llvm-svn: 363522
*	[CodeGenPrepare][x86] shift both sides of a vector select when profitable	Sanjay Patel	2019-06-16	1	-18/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is based on the example/discussion in PR37428: https://bugs.llvm.org/show_bug.cgi?id=37428 Proper vector shift instructions don't appear until AVX2, so we may generate several extra instructions within a loop trying to compensate for that. It's difficult to recover from that shift expansion later than this, so use the existing TLI hook and splat analysis to enable better codegen. This extends CGP functionality introduced with: rL201655 Differential Revision: https://reviews.llvm.org/D63233 llvm-svn: 363511
*	[SimplifyIndVar] Simplify non-overflowing saturating add/sub	Nikita Popov	2019-06-15	1	-8/+8
\| \| \| \| \| \| \| \| \| \| \|	If we can detect that saturating math that depends on an IV cannot overflow, replace it with simple math. This is similar to the CVP optimization from D62703, just based on a different underlying analysis (SCEV vs LVI) that catches different cases. Differential Revision: https://reviews.llvm.org/D62792 llvm-svn: 363489
*	[InstCombine] Add tests to show missing fold opportunity for "icmp and ↵	Huihui Zhang	2019-06-15	6	-0/+1434
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	shift" (nfc). Summary: For icmp pred (and (sh X, Y), C), 0 When C is signbit, expect to fold (X << Y) & signbit ==/!= 0 into (X << Y) >=/< 0, rather than (X & (signbit >> Y)) != 0. When C+1 is power of 2, expect to fold (X << Y) & ~C ==/!= 0 into (X << Y) </>= C+1, rather than (X & (~C >> Y)) == 0. For icmp pred (and X, (sh signbit, Y)), 0 Expect to fold (X & (signbit l>> Y)) ==/!= 0 into (X << Y) >=/< 0 Expect to fold (X & (signbit << Y)) ==/!= 0 into (X l>> Y) >=/< 0 Reviewers: lebedev.ri, efriedma, spatel, craig.topper Reviewed By: lebedev.ri Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63025 llvm-svn: 363479
*	[ObjC][ARC] Delete ObjC runtime calls on global variables annotated	Akira Hatanaka	2019-06-14	1	-0/+65
\| \| \| \| \| \| \| \| \| \| \| \|	with 'objc_arc_inert' Those calls are no-ops, so they can be safely deleted. rdar://problem/49839633 Differential Revision: https://reviews.llvm.org/D62433 llvm-svn: 363468