bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	[CostModel][X86] Add vXi8 vector division by constants costs.	Simon Pilgrim	2018-10-24	3	-193/+193
\| \| \| \| \| \|	ISD::MULHS/ISD::MULHU lowering of vXi8 types means we expand these in TargetLowering BuildSDIV/BuildUDIV. llvm-svn: 345175
*	[CostModel][X86] Enable non-uniform vector division by constants costs.	Simon Pilgrim	2018-10-24	3	-102/+598
\| \| \| \| \| \|	Non-uniform division/remainder handling was added back at D49248/D50765 - so share the 'mul+sub' costs that already exist for uniform cases. llvm-svn: 345164
*	[TTI][X86] Treat SK_Transpose shuffles as SK_PermuteTwoSrc - there's no difference in lowering.	Simon Pilgrim	2018-10-23	1	-61/+185
\| \| \| \|	llvm-svn: 345048
*	[CostModel][X86] Add transpose shuffle cost tests	Simon Pilgrim	2018-10-23	1	-0/+164
\| \| \| \|	llvm-svn: 345045
*	Add BROADCAST shuffle cost tests.	Simon Pilgrim	2018-10-23	1	-0/+35
\| \| \| \| \| \|	Part of a lot of cleanup necessary before PR39368. llvm-svn: 345025
*	Add BROADCAST shuffle cost tests.	Simon Pilgrim	2018-10-23	1	-0/+33
\| \| \| \| \| \|	Part of a lot of cleanup necessary before PR39368. llvm-svn: 345023
*	[ARM] Regenerate reverse shuffle costs	Simon Pilgrim	2018-10-22	1	-19/+17
\| \| \| \| \| \|	Came about while cleaning up general shuffle costs for PR39368 llvm-svn: 344966
*	[CostModel][X86] Add some initial extract/insert subvector shuffle cost tests	Simon Pilgrim	2018-10-20	2	-0/+252
\| \| \| \| \| \|	Just f64/i64 tests initially to demonstrate PR39368 llvm-svn: 344857
*	[CostModel][X86] Add integer vector reduction cost tests	Simon Pilgrim	2018-10-20	9	-0/+2561
\| \| \| \|	llvm-svn: 344846
*	[ConstantFolding] Constant fold minimum and maximum intrinsics	Thomas Lively	2018-10-19	1	-0/+136
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: Depends on D52764 Reviewers: aheejin, dschuff Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52765 llvm-svn: 344796
*	[LV] Teach vectorizer about variant value store into uniform address	Anna Thomas	2018-10-16	4	-12/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Teach vectorizer about vectorizing variant value stores to uniform address. Similar to rL343028, we do not allow vectorization if we have multiple stores to the same uniform address. Cost model already has the change for considering the extract instruction cost for a variant value store. See added test cases for how vectorization is done. The patch also contains changes to the ORE messages. Reviewers: Ayal, mkuper, anemet, hsaito Subscribers: rkruppe, llvm-commits Differential Revision: https://reviews.llvm.org/D52656 llvm-svn: 344613
*	[SCEV] Limit AddRec "simplifications" to avoid combinatorial explosions	Max Kazantsev	2018-10-16	1	-0/+47
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	SCEV's transform that turns `{A1,+,A2,+,...,+,An}<L> * {B1,+,B2,+,...,+,Bn}<L>` into a single AddRec of size `2n+1` with complex combinatorial coefficients can easily trigger exponential growth of the SCEV (if nothing gets folded and simplified). We tried to restrain this transform using the option `scalar-evolution-max-add-rec-size`, but its default value seems to be insufficiently small: the test attached to this patch, with the default value of this option `16`, has a SCEV of >3M symbols (when printed out). This patch reduces the simplification limit. It is not a cure for combinatorial explosions, but at least it reduces this corner case to something more or less reasonable. Differential Revision: https://reviews.llvm.org/D53282 Reviewed By: sanjoy llvm-svn: 344584
*	[SystemZ] Temporarily disable high VFs with integer div/rem.	Jonas Paulsson	2018-10-10	1	-32/+35
\| \| \| \| \| \| \| \|	Until mischeduler is clever enough to avoid spilling in a vectorized loop with many (scalar) DLRs it is better to avoid high vectorization factors (8 and above). llvm-svn: 344129
*	[SystemZ] Take better care when computing needed vector registers in TTI.	Jonas Paulsson	2018-10-10	2	-0/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A new function getNumVectorRegs() is better to use for the number of needed vector registers instead of getNumberOfParts(). This is to make sure that the number of vector registers (and typically operations) required for a vector type is accurate. getNumberOfParts(), which was previously used, works by splitting the vector type until it is legal, which gives incorrect results for types with a non-power-of-two number of elements (rare). A new static function getScalarSizeInBits() also checks for a pointer type and returns 64U for it (since otherwise it gets a value of 0). It is used in a few places where Ty may be a pointer. Review: Ulrich Weigand llvm-svn: 344115
*	[Analysis] Make LocationSize pretty-printing more descriptive	George Burgess IV	2018-10-10	6	-223/+223
\| \| \| \| \| \| \| \| \| \| \| \| \|	This is the third patch in a series intended to make https://reviews.llvm.org/D44748 more easily reviewable. Please see that patch for more context. The second being r344013. The intent is to make the output of printing a LocationSize more precise. The main motivation for this is that we plan to add a bit to distinguish whether a given LocationSize is an upper-bound or is precise; making that information available in pretty-printing is nice. llvm-svn: 344108
*	[LV][LAA] Vectorize loop invariant values stored into loop invariant address	Anna Thomas	2018-09-25	4	-8/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: We are overly conservative in loop vectorizer with respect to stores to loop invariant addresses. More details in https://bugs.llvm.org/show_bug.cgi?id=38546 This is the first part of the fix where we start with vectorizing loop invariant values to loop invariant addresses. This also includes changes to ORE for stores to invariant address. Reviewers: anemet, Ayal, mkuper, mssimpso Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D50665 llvm-svn: 343028
*	[SystemZ] Adjust cost functions for subtargets that use LI + LOC instead of IPM	Jonas Paulsson	2018-09-14	3	-84/+606
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	After recent improvements which make better use of LOC instead of IPM, the TTI cost functions also need to be updated to reflect this. This involves sext, zext and xor of i1. The tests were updated so that for z13 the new costs are expected, while the old costs are still checked for on zEC12. Review: Ulrich Weigand https://reviews.llvm.org/D51339 llvm-svn: 342207
*	Prevent Constant Folding From Optimizing inrange GEP	Peter Collingbourne	2018-09-11	1	-15/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch does the following things: 1. update SymbolicallyEvaluateGEP so that it bails out if it cannot preserve the inrange attribute; 2. update llvm/test/Analysis/ConstantFolding/gep.ll to remove UB in it; 3. remove inaccurate comment above ConstantFoldInstOperandsImpl in llvm/lib/Analysis/ConstantFolding.cpp; 4. add a new regression test that makes sure that no optimizations change an inrange GEP in an unexpected way. Patch by Zhaomo Yang! Differential Revision: https://reviews.llvm.org/D51698 llvm-svn: 341888
*	[AST] Add test coverage of memsets	Philip Reames	2018-09-10	1	-0/+48
\| \| \| \| \| \|	Immediately after posting https://reviews.llvm.org/D51895, I noticed a small bug. These tests would have caught that. llvm-svn: 341880
*	[AST] Visit memtransfer arguments in order	Philip Reames	2018-09-10	1	-8/+8
\| \| \| \| \| \| \| \| \| \|	The only point to this change is the test diffs. When I remove this code entirely (in favor of the recently added generic handling), I don't want there to be any confusion due to spurious test diffs. As an aside, the fact our tests are AST construction order dependent is not great. I thought about fixing that, but the reasonable schemes I might want (e.g. sort by name) need the test diffs anyways. Philip llvm-svn: 341841
*	InstCombine: move hasOneUse check to the top of foldICmpAddConstant	Tim Northover	2018-09-10	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	There were two combines not covered by the check before now, neither of which actually differed from normal in the benefit analysis. The most recent seems to be because it was just added at the top of the function (naturally). The older is from way back in 2008 (r46687) when we just didn't put those checks in so routinely, and has been diligently maintained since. llvm-svn: 341831
*	AMDGPU: Fix tests using old number for constant address space	Matt Arsenault	2018-09-10	1	-3/+3
\| \| \| \|	llvm-svn: 341770
*	[AST] Generalize argument specific aliasing	Philip Reames	2018-09-07	1	-36/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	AliasSetTracker has special case handling for memset, memcpy and memmove which pre-existed argmemonly on functions and readonly and writeonly on arguments. This patch generalizes it using the AA infrastructure to any call correctly annotated. The motivation here is to cut down on confusion, not performance per se. For most instructions, there is a direct mapping to alias set. However, this is not guaranteed by the interface and was not in fact true for these three intrinsics and only these three intrinsics. I kept getting myself confused about this invariant, so I figured it would be good to clearly distinguish between instructions and alias sets. Calls happened to be an easy target. The nice side effect is that custom implementations of memset/memcpy/memmove - including wrappers discovered by IPO - can now be optimized the same as builtins by LICM. Note: The actual removal of the memset/memtransfer specific handling will happen in a follow on NFC patch. It was originally part of this one, but separate for ease of review and rebase. Differential Revision: https://reviews.llvm.org/D50730 llvm-svn: 341713
*	[InstCombine] Fold icmp ugt/ult (add nuw X, C2), C --> icmp ugt/ult X, (C - C2)	Nicola Zaghen	2018-09-04	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	Support for sgt/slt was added in rL294898, this adds the same cases also for unsigned compares. This is the Alive proof: https://rise4fun.com/Alive/nyY Differential Revision: https://reviews.llvm.org/D50972 llvm-svn: 341353
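The unsigned fold described in this commit can be checked by brute force on 8-bit values. The sketch below is not the InstCombine code; the helper names are hypothetical, and it assumes C >= C2 so that C - C2 does not wrap (the commit message does not state the patch's exact preconditions).

```cpp
#include <cassert>
#include <cstdint>

// Original form: icmp ugt (add nuw X, C2), C
// "nuw" means the addition is known not to wrap, which the assert models.
bool ugtOriginal(std::uint8_t x, std::uint8_t c2, std::uint8_t c) {
  assert(x + c2 <= 255 && "nuw: the add must not wrap");
  return static_cast<std::uint8_t>(x + c2) > c;
}

// Folded form: icmp ugt X, (C - C2)
// Assumption of this sketch: c >= c2, so the subtraction cannot wrap.
bool ugtFolded(std::uint8_t x, std::uint8_t c2, std::uint8_t c) {
  assert(c >= c2 && "sketch only covers the non-wrapping case");
  return x > static_cast<std::uint8_t>(c - c2);
}

// Exhaustively compare both forms (and the matching ult forms) over every
// 8-bit triple that satisfies the nuw and C >= C2 assumptions.
bool foldHoldsForAllI8() {
  for (unsigned c2 = 0; c2 < 256; ++c2) {
    for (unsigned c = c2; c < 256; ++c) {
      for (unsigned x = 0; x + c2 < 256; ++x) {
        bool ultOriginal = (x + c2) < c;
        bool ultFolded = x < (c - c2);
        if (ugtOriginal(x, c2, c) != ugtFolded(x, c2, c) ||
            ultOriginal != ultFolded)
          return false;
      }
    }
  }
  return true;
}
```

Calling `foldHoldsForAllI8()` returns true when the two forms agree on every admissible input, which mirrors at small scale what the linked Alive proof establishes in general.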
*	Move test/Analysis/DivergenceAnalysis/AMDGPU/loads.ll	Nicolai Haehnle	2018-08-30	1	-0/+0
\| \| \| \| \| \| \|	Should fix failures of buildbots that don't build the AMDGPU backend. Change-Id: I01cb84b4b47803b10c5b21ea0353546239860a51 llvm-svn: 341079
*	[NFC] Rename the DivergenceAnalysis to LegacyDivergenceAnalysis	Nicolai Haehnle	2018-08-30	12	-10/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is patch 1 of the new DivergenceAnalysis (https://reviews.llvm.org/D50433). The purpose of this patch is to free up the name DivergenceAnalysis for the new generic implementation. The generic implementation class will be shared by specialized divergence analysis classes. Patch by: Simon Moll Reviewed By: nhaehnle Subscribers: jvesely, jholewinski, arsenm, nhaehnle, mgorny, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D50434 Change-Id: Ie8146b11be2c50d5312f30e11c7a3036a15b48cb llvm-svn: 341071
*	[InstCombine] remove unnecessary shuffle undef folding	Sanjay Patel	2018-08-29	1	-0/+8
\| \| \| \| \| \| \| \|	Add a test for constant folding to show that (shuffle undef, undef, mask) should already be handled via instsimplify. llvm-svn: 340926
*	[PPC] Remove Darwin support from POWER backend.	Kit Barton	2018-08-28	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch issues an error message if Darwin ABI is attempted with the PPC backend. It also cleans up existing test cases, either converting the test to use an alternative triple or removing the test if the coverage is no longer needed. Updated Tests ------------- The majority of test cases were updated to use a different triple that does not include the Darwin ABI. Many tests were also updated to use FileCheck, in place of grep. Deleted Tests ------------- llvm/test/tools/dsymutil/PowerPC/sibling.test was originally added to test specific functionality of dsymutil using an object file created with an old version of llvm-gcc for a Powerbook G4. After a discussion with @JDevlieghere he suggested removing the test. llvm/test/CodeGen/PowerPC/combine_loads_from_build_pair.ll was converted from a PPC test to a SystemZ test, as the behavior is also reproducible there. All other tests that were deleted were specific to the darwin/ppc ABI and no longer necessary. Phabricator Review: https://reviews.llvm.org/D50988 llvm-svn: 340795
*	[X86] Correct the cost of (v4i32 (fptoui (v4f64))) under AVX512F.	Craig Topper	2018-08-26	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This was inheriting the cost from the AVX table, but should be legal under AVX512. Reviewers: RKSimon Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D51267 llvm-svn: 340708
*	[PhiValues] Use callback value handles to invalidate deleted values	John Brawn	2018-08-24	1	-4/+34
\| \| \| \| \| \| \| \| \| \| \|	The way that PhiValues is integrated with BasicAA it is possible for a pass which uses BasicAA to pick up an instance of BasicAA that uses PhiValues without intending to, and then delete values from a function in a way that causes PhiValues to return dangling pointers to these deleted values. Fix this by having a set of callback value handles to invalidate values when they're deleted. llvm-svn: 340613
*	[FunctionAttrs] Infer WriteOnly Function Attribute	Brian Homerding	2018-08-23	1	-3/+4
\| \| \| \| \| \| \| \| \| \| \| \|	These changes expand the FunctionAttr logic in order to mark functions as WriteOnly when appropriate. This is done through an additional bool variable and extended logic. Reviewers: hfinkel, jdoerfert Differential Revision: https://reviews.llvm.org/D48387 llvm-svn: 340537
*	[AST] Add a test for attribute intersection	Philip Reames	2018-08-22	1	-0/+18
\| \| \| \| \| \|	Already works, but I initially convinced myself it doesn't, so add a test which shows it does. :) llvm-svn: 340453
*	[AMDGPU] Consider loads from flat addrspace to be potentially divergent	Scott Linder	2018-08-21	1	-0/+15
\| \| \| \| \| \| \| \| \|	In general we can't assume flat loads are uniform, and cases where we can prove they are should be handled through infer-address-spaces. Differential Revision: https://reviews.llvm.org/D50991 llvm-svn: 340343
*	[AST] Remove notion of volatile from alias sets [NFCI]	Philip Reames	2018-08-21	1	-2/+2
\| \| \| \| \| \| \| \| \| \|	Volatility is not an aliasing property. We used to model volatile as if it had extremely conservative aliasing implications, but that hasn't been true for several years now. So, it doesn't make sense to be in AliasSet. It also turns out the code is entirely a noop. Outside of the AST code to update it, there was only one user: load store promotion in LICM. L/S promotion doesn't need the check since it walks all the users of the address anyway. It already checks each load or store via !isUnordered which causes us to bail for volatile accesses. (Look at the lines immediately following the two removed asserts.) There is the possibility of some small compile time impact here, but the only case which will get noticeably slower is a loop with a large number of loads and stores to the same address where only the last one we inspect is volatile. This is sufficiently rare it's not worth optimizing for. llvm-svn: 340312
*	[ConstantFolding] improve folding of binops with vector undef operand	Sanjay Patel	2018-08-20	1	-8/+8
\| \| \| \| \| \| \|	A non-undef operand may still have undef constant elements, so we should always propagate the vector results per-lane. llvm-svn: 340194
*	[ConstantFolding] add tests for binops on vectors with undef elements; NFC	Sanjay Patel	2018-08-20	1	-0/+61
\| \| \| \|	llvm-svn: 340190
*	[AST] Clarify printing of unknown size locations [NFC]	Philip Reames	2018-08-17	1	-0/+22
\| \| \| \| \| \|	Printing "unknown" is much more clear than an arbitrary large integer llvm-svn: 340108
*	[AST][Tests] Clarify what each test is doing	Philip Reames	2018-08-17	1	-20/+23
\| \| \| \|	llvm-svn: 340100
*	[AST][Tests] Shorten tests using noalias params	Philip Reames	2018-08-17	1	-12/+4
\| \| \| \|	llvm-svn: 340099
*	[AST] Add tests for argmemonly calls [NFC]	Philip Reames	2018-08-17	1	-0/+130
\| \| \| \| \| \|	First step towards building a test set to rebase D50730 on top of. Starting with clone of memtransfer tests, more to come. llvm-svn: 340095
*	[ConstantFolding] add simplifications for funnel shift intrinsics	Sanjay Patel	2018-08-17	1	-12/+6
\| \| \| \| \| \| \| \| \| \| \|	This is another step towards being able to canonicalize to the funnel shift intrinsics in IR (see D49242 for the initial patch). We should not have any loss of simplification power in IR between these and the equivalent IR constructs. Differential Revision: https://reviews.llvm.org/D50848 llvm-svn: 340022
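For readers unfamiliar with the funnel shift intrinsics that this folding targets, here is a minimal 8-bit sketch of their semantics as documented in the LLVM Language Reference: the two operands are concatenated, shifted by the amount modulo the bit width, and the high half (fshl) or low half (fshr) is kept. The function names `fshl8` and `fshr8` are hypothetical and this is not the ConstantFolding implementation.

```cpp
#include <cstdint>

// fshl on i8: concatenate a:b (a is the high byte), shift left by c % 8,
// and keep the high 8 bits of the 16-bit result.
std::uint8_t fshl8(std::uint8_t a, std::uint8_t b, std::uint8_t c) {
  unsigned s = c % 8;  // shift amount is taken modulo the bit width
  if (s == 0)
    return a;          // avoids shifting b right by a full 8 bits
  return static_cast<std::uint8_t>((a << s) | (b >> (8 - s)));
}

// fshr on i8: concatenate a:b, shift right by c % 8,
// and keep the low 8 bits of the 16-bit result.
std::uint8_t fshr8(std::uint8_t a, std::uint8_t b, std::uint8_t c) {
  unsigned s = c % 8;
  if (s == 0)
    return b;          // avoids shifting a left by a full 8 bits
  return static_cast<std::uint8_t>((a << (8 - s)) | (b >> s));
}
```

Passing the same value for both operands gives a rotate, for example `fshl8(x, x, 1)` rotates `x` left by one bit. A constant folder can evaluate calls like these at compile time whenever all three operands are constants.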
*	[MustExecute] Fix algorithmic bug in isGuaranteedToExecute. PR38514	Max Kazantsev	2018-08-17	2	-16/+46
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The description of `isGuaranteedToExecute` does not correspond to its implementation. According to the description, it should return `true` if an instruction is executed under the assumption that its loop is entered. However, there is a sophisticated algorithm inside that tries to prove that the instruction is executed if the loop is exited, which is not the same thing for infinite loops. There is an attempt to protect from dealing with infinite loops by prohibiting loops without exit blocks, however an infinite loop can have exit blocks. As a result, MustExecute can falsely consider some blocks that are never entered as mustexec, and LICM can hoist dangerous instructions out of them based on this fact. This may introduce UB to programs which did not contain it initially. This patch removes the problematic algorithm and replaces it with one which tries to prove what is required in the description. Differential Revision: https://reviews.llvm.org/D50558 Reviewed By: reames llvm-svn: 339984
*	[ConstantFolding] add tests for funnel shift intrinsics; NFC	Sanjay Patel	2018-08-16	1	-0/+89
\| \| \| \| \| \|	No functionality for this yet. llvm-svn: 339889
*	[BFI] Use rounding while computing profile counts.	Easwaran Raman	2018-08-16	1	-10/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Profile count of a block is computed by multiplying its block frequency by entry count and dividing the result by entry block frequency. Do rounded division in the last step and update test cases appropriately. Reviewers: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D50822 llvm-svn: 339835
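The computation this commit describes (block frequency times entry count, divided by entry block frequency, now with rounding) can be sketched as follows. This is an illustration, not BFI's code: the function names are hypothetical, and the sketch assumes the product fits in 64 bits rather than using a wider intermediate type.

```cpp
#include <cassert>
#include <cstdint>

// Truncating division, shown for comparison with the rounded version.
std::uint64_t profileCountTruncated(std::uint64_t blockFreq,
                                    std::uint64_t entryCount,
                                    std::uint64_t entryFreq) {
  assert(entryFreq != 0 && "entry block frequency must be non-zero");
  return (blockFreq * entryCount) / entryFreq;
}

// Rounded division: adding half the divisor before dividing rounds the
// quotient to the nearest integer (halves round up).
// Assumption of this sketch: blockFreq * entryCount does not overflow.
std::uint64_t profileCountRounded(std::uint64_t blockFreq,
                                  std::uint64_t entryCount,
                                  std::uint64_t entryFreq) {
  assert(entryFreq != 0 && "entry block frequency must be non-zero");
  return (blockFreq * entryCount + entryFreq / 2) / entryFreq;
}
```

With a block frequency of 5, an entry count of 3 and an entry frequency of 2, the exact value is 7.5: the truncating version yields 7 and the rounded version yields 8, which is the kind of off-by-one difference the updated test expectations reflect.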
*	[AliasSetTracker] Do not treat experimental_guard intrinsic as memory writing instruction	Max Kazantsev	2018-08-15	2	-48/+94
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The `experimental_guard` intrinsic has memory write semantics to model the thread-exiting logic, but does not do any actual writes to memory. Currently, `AliasSetTracker` treats it as a normal memory write. As a result, a loop-invariant load cannot be hoisted out of the loop because the guard may possibly alias with it. This patch changes `AliasSetTracker` so that it doesn't treat guards as memory writes. Differential Revision: https://reviews.llvm.org/D50497 Reviewed By: reames llvm-svn: 339753
*	[NFC] Add comprehensive test of AliasSetTracker with guards	Max Kazantsev	2018-08-14	1	-0/+1550
\| \| \| \|	llvm-svn: 339643
*	[BasicAA] Don't assume tail calls with byval don't alias allocas	Reid Kleckner	2018-08-14	1	-0/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Calls marked 'tail' cannot read or write allocas from the current frame because the current frame might be destroyed by the time they run. However, a tail call may use an alloca with byval. Calling with byval copies the contents of the alloca into argument registers or stack slots, so there is no lifetime issue. Tail calls never modify allocas, so we can return just ModRefInfo::Ref. Fixes PR38466, a longstanding bug. Reviewers: hfinkel, nlewycky, gbiv, george.burgess.iv Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D50679 llvm-svn: 339636
*	[NFC] Add tests that demonstrate that MustExecute is fundamentally broken	Max Kazantsev	2018-08-10	1	-0/+118
\| \| \| \|	llvm-svn: 339417
*	[SCEV] Properly solve quadratic equations	Krzysztof Parzyszek	2018-08-02	3	-0/+515
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D48283 llvm-svn: 338758
*	[BasicAA] Use PhiValuesAnalysis if available when handling phi alias	John Brawn	2018-07-30	3	-2/+100
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	By using PhiValuesAnalysis we can get all the values reachable from a phi, so we can be more precise instead of giving up when a phi has phi operands. We can't make BasicAA directly use PhiValuesAnalysis though, as the user of BasicAA may modify the function in ways that PhiValuesAnalysis can't cope with. For this optional usage to work correctly BasicAAWrapperPass now needs to be not marked as CFG-only (i.e. it is now invalidated even when CFG is preserved) due to how the legacy pass manager handles dependent passes being invalidated, namely the depending pass still has a pointer to the now-dead dependent pass. Differential Revision: https://reviews.llvm.org/D44564 llvm-svn: 338242