bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	[X86] Add a combine for back to back VSRAI instructions	Craig Topper	2018-11-28	1	-3/+3
\| \| \| \| \| \| \| \|	Expansion of SIGN_EXTEND_INREG can create a VSRAI instruction. If there is already a VSRAI after it, we should combine them into a larger VSRAI Differential Revision: https://reviews.llvm.org/D54959 llvm-svn: 347784
*	[SystemZ::TTI] Improve cost for compare of i64 with extended i32 load	Jonas Paulsson	2018-11-28	1	-1/+17
\| \| \| \| \| \| \| \| \| \| \| \|	CGF/CLGF compares an i64 register with a sign/zero extended loaded i32 value in memory. This patch makes such a load considered foldable and so gets a 0 cost. Review: Ulrich Weigand https://reviews.llvm.org/D54944 llvm-svn: 347735
*	[SystemZ::TTI] Improve costs for i16 add, sub and mul against memory.	Jonas Paulsson	2018-11-28	1	-0/+93
\| \| \| \| \| \| \| \| \| \| \| \| \|	AH, SH and MH costs are already covered in the cases where LHS is 32 bits and RHS is 16 bits of memory sign-extended to i32. As these instructions are also used when LHS is i16, this patch recognizes that the loads will get folded then as well. Review: Ulrich Weigand https://reviews.llvm.org/D54940 llvm-svn: 347734
*	[SystemZ::TTI] Improved cost values for comparison against memory.	Jonas Paulsson	2018-11-28	1	-0/+27
\| \| \| \| \| \| \| \| \| \| \| \| \|	Single instructions exist for i8 and i16 comparisons of memory against a small immediate. This patch makes sure that if the load in these cases has a single user (the ICmp), it gets a 0 cost (folded), and also that the ICmp gets a cost of 1. Review: Ulrich Weigand https://reviews.llvm.org/D54897 llvm-svn: 347733
*	[SystemZ::TTI] Return zero cost for scalar load/store connected with a bswap.	Jonas Paulsson	2018-11-28	1	-0/+67
\| \| \| \| \| \| \| \| \| \|	Since byte-swapping loads and stores are supported, a 'load -> bswap' or 'bswap -> store' sequence should have the cost of one. Review: Ulrich Weigand https://reviews.llvm.org/D54870 llvm-svn: 347732
*	[X86] Add test cases to show that we don't properly take ↵	Craig Topper	2018-11-28	1	-0/+144
\| \| \| \| \| \| \| \|	-mprefer-vector-width=256 and -min-legal-vector-width=256 into account when costing sext/zext. The check lines marked AVX256 in the zext256/sext256 functions should be closer to the AVX values which would take into account a splitting cost. llvm-svn: 347722
*	[X86] Add exhaustive cost model testing for sext/zext for all vector types ↵	Craig Topper	2018-11-27	2	-0/+958
\| \| \| \| \| \| \| \| \| \|	we reasonably support. Add cost model tests for truncating to vXi1. Our sext/zext cost modeling was somewhat incomplete. And had no coverage for the fact that avx512bw v32i16/v64i8 types return a scalarization cost. Truncates are a whole different mess because isTruncateFree is returning true for vectors when it shouldn't and that's the fall back for anything not in the tables. llvm-svn: 347719
*	[X86] Add cost model tests for experimental.vector.reduce.* with ↵	Craig Topper	2018-11-27	9	-0/+2852
\| \| \| \| \| \|	-x86-experimental-vector-widening-legalization llvm-svn: 347697
*	[X86] Add cost model test for masked load an store with ↵	Craig Topper	2018-11-27	1	-0/+606
\| \| \| \| \| \|	-x86-experimental-vector-widening-legalization llvm-svn: 347696
*	[X86] Add cost model tests for fp_to_int/int_to_fp with ↵	Craig Topper	2018-11-27	5	-0/+1765
\| \| \| \| \| \|	-x86-experimental-vector-widening-legalization llvm-svn: 347695
*	[X86] Add cost model tests for shifts with ↵	Craig Topper	2018-11-27	3	-0/+1589
\| \| \| \| \| \|	-x86-experimental-vector-widening-legalization. llvm-svn: 347694
*	[stack-safety] Inter-Procedural Analysis implementation	Vitaly Buka	2018-11-26	7	-7/+790
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: IPA is implemented as module pass which produce map from Function or Alias to StackSafetyInfo for a single function. From prototype by Evgenii Stepanov and Vlad Tsyrklevich. Reviewers: eugenis, vlad.tsyrklevich, pcc, glider Subscribers: hiraditya, mgrang, llvm-commits Differential Revision: https://reviews.llvm.org/D54543 llvm-svn: 347611
*	[stack-safety] Empty local passes for Stack Safety Global Analysis	Vitaly Buka	2018-11-26	1	-0/+4
\| \| \| \| \| \| \| \| \| \|	Reviewers: eugenis, vlad.tsyrklevich Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D54541 llvm-svn: 347610
*	[stack-safety] Local analysis implementation	Vitaly Buka	2018-11-26	2	-3/+538
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Analysis produces StackSafetyInfo which contains information with how allocas and parameters were used in functions. From prototype by Evgenii Stepanov and Vlad Tsyrklevich. Reviewers: eugenis, vlad.tsyrklevich, pcc, glider Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D54504 llvm-svn: 347603
*	[stack-safety] Empty local passes for Stack Safety Local Analysis	Vitaly Buka	2018-11-26	1	-0/+13
\| \| \| \| \| \| \| \| \| \|	Reviewers: eugenis, vlad.tsyrklevich Subscribers: mgorny, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D54502 llvm-svn: 347602
*	[DemandedBits] Add support for funnel shifts	Nikita Popov	2018-11-26	1	-0/+79
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add support for funnel shifts to the DemandedBits analysis. The demanded bits of the first two operands can be determined if the shift amount is constant. The demanded bits of the third operand (shift amount) can be determined if the bitwidth is a power of two. This is basically the same functionality as implemented in D54869 and D54478, but for DemandedBits rather than InstCombine. Differential Revision: https://reviews.llvm.org/D54876 llvm-svn: 347561
*	Revert "[TTI] Reduction costs only need to include a single extract element ↵	Fedor Sergeev	2018-11-26	11	-1049/+1049
\| \| \| \| \| \| \| \| \| \|	cost" This reverts commit r346970. It was causing PR39774, a crash in slp-vectorizer on a rather simple loop with just a bunch of 'and's in the body. llvm-svn: 347541
*	[SystemZTTIImpl] Give correct cost values for vector bswap intrinsics.	Jonas Paulsson	2018-11-22	1	-0/+57
\| \| \| \| \| \| \| \| \| \|	Implement getIntrinsicInstrCost() and return costs reflecting that bswap can be done with a vperm per vector register. Review: Ulrich Weigand https://reviews.llvm.org/D54789 llvm-svn: 347445
*	[LVI] run transfer function for binary operator even when the RHS isn't a ↵	John Regehr	2018-11-21	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	constant LVI was symbolically executing binary operators only when the RHS was constant, missing the case where we have a ConstantRange for the RHS, but not an actual constant. Tested using check-all and by bootstrapping. Compile time is not impacted measurably. Differential Revision: https://reviews.llvm.org/D19859 llvm-svn: 347379
*	[ConstantFolding] Add support for saturating add/sub	Sanjay Patel	2018-11-20	1	-0/+111
\| \| \| \| \| \| \| \| \| \|	Support saturating add/sub in constant folding, based on the APInt methods introduced in D54332. Patch by: @nikic (Nikita Popov) Differential Revision: https://reviews.llvm.org/D54531 llvm-svn: 347328
*	[LV] Avoid vectorizing unsafe dependencies in uniform address	Anna Thomas	2018-11-19	4	-8/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Currently, when vectorizing stores to uniform addresses, the only instance we prevent vectorization is if there are multiple stores to the same uniform address causing an unsafe dependency. This patch teaches LAA to avoid vectorizing loops that have an unsafe cross-iteration dependency between a load and a store to the same uniform address. Fixes PR39653. Reviewers: Ayal, efriedma Subscribers: rkruppe, llvm-commits Differential Revision: https://reviews.llvm.org/D54538 llvm-svn: 347220
*	[TTI] Reduction costs only need to include a single extract element cost	Simon Pilgrim	2018-11-15	11	-1049/+1049
\| \| \| \| \| \| \| \| \| \| \| \|	We were adding the entire scalarization extraction cost for reductions, which returns the total cost of extracting every element of a vector type. For reductions we don't need to do this - we just need to extract the 0'th element after the reduction pattern has completed. Fixes PR37731 Differential Revision: https://reviews.llvm.org/D54585 llvm-svn: 346970
*	[TTI] getOperandInfo - a broadcast shuffle means the result is OK_UniformValue	Simon Pilgrim	2018-11-14	2	-340/+268
\| \| \| \|	llvm-svn: 346868
*	[CostModel] Add generic expansion funnel shift cost support	Simon Pilgrim	2018-11-14	2	-1274/+2930
\| \| \| \| \| \| \| \|	Add support for the expansion of funnelshift/rotates to getIntrinsicInstrCost. This also required us to move the X86 fshl/fshr costs to the same place as the rotates to avoid expansion and get correct scalarization vs vectorization costs. llvm-svn: 346854
*	[CostModel][X86] Fix constant vector XOP rights shifts	Simon Pilgrim	2018-11-13	2	-87/+71
\| \| \| \| \| \| \| \|	We'll constant fold these cases so they are as cheap as vector left shift cases. Noticed while improving funnel shift costs. llvm-svn: 346760
*	[CostModel][X86] Add more cost tests for funnel shifts	Simon Pilgrim	2018-11-13	2	-8/+2853
\| \| \| \| \| \|	Added full uniform/constant coverage for funnel shifts + rotates llvm-svn: 346754
*	[CostModel][X86] Add funnel shift rotation special case costs	Simon Pilgrim	2018-11-12	2	-84/+140
\| \| \| \| \| \|	When we repeat the 2 shifting operands then this is a bit rotation - annoyingly this has to be done in the other getIntrinsicInstrCost than most intrinsics as we need to check the operands are the same. llvm-svn: 346688
*	[CostModel][X86] Add SHLD/SHRD scalar funnel shift costs	Simon Pilgrim	2018-11-12	2	-320/+320
\| \| \| \| \| \|	The costs match the typical reg-reg cases - the RMW case can be a lot slower but we don't model that at this level llvm-svn: 346683
*	[CostModel][X86] Add some initial cost tests for funnel shifts	Simon Pilgrim	2018-11-12	2	-0/+780
\| \| \| \| \| \|	Still need to add full uniform/constant coverage but this is enough to check basic fshl/fshr cost handling llvm-svn: 346670
*	[CostModel][X86] SK_ExtractSubvector is cheap if the (legal) subvector is ↵	Simon Pilgrim	2018-11-12	1	-22/+22
\| \| \| \| \| \|	aligned within the source vector llvm-svn: 346664
*	[SystemZ::TTI] Improve accuracy of costs for vector fp <-> int conversions	Jonas Paulsson	2018-11-12	1	-136/+136
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Improve getCastInstrCost() by respecting the different types of Src and Dst for vector integer <-> fp conversions. This means that extracting from integer becomes more expensive (by the extraction penalty), and the extraction from fp becomes cheaper (no longer has a false extraction penalty). Review: Ulrich Weigand https://reviews.llvm.org/D54423 llvm-svn: 346663
*	[CostModel] Add more realistic SK_InsertSubvector generic costs.	Simon Pilgrim	2018-11-12	1	-8/+8
\| \| \| \| \| \|	Instead of defaulting to a cost = 1, expand to element extract/insert like we do for other shuffles. llvm-svn: 346662
*	[CostModel] Add more realistic SK_ExtractSubvector generic costs.	Simon Pilgrim	2018-11-12	1	-28/+28
\| \| \| \| \| \| \| \|	Instead of defaulting to a cost = 1, expand to element extract/insert like we do for other shuffles. This exposes an issue in LoopVectorize which could call SK_ExtractSubvector with a scalar subvector type. llvm-svn: 346656
*	[CostModel][X86] SK_ExtractSubvector is free if the subvector is at the ↵	Simon Pilgrim	2018-11-09	11	-515/+563
\| \| \| \| \| \|	start of the source vector llvm-svn: 346538
*	[CostModel] Add SK_ExtractSubvector handling to getInstructionThroughput ↵	Simon Pilgrim	2018-11-09	1	-36/+36
\| \| \| \| \| \| \| \|	(PR39368) Add ShuffleVectorInst::isExtractSubvectorMask helper to match shuffle masks. llvm-svn: 346510
*	Return "[IndVars] Smart hard uses detection"	Max Kazantsev	2018-11-08	1	-3/+3
\| \| \| \| \| \| \| \| \| \|	The patch has been reverted because it ended up prohibiting propagation of a constant to exit value. For such values, we should skip all checks related to hard uses because propagating a constant is always profitable. Differential Revision: https://reviews.llvm.org/D53691 llvm-svn: 346397
*	Revert "[IndVars] Smart hard uses detection"	Max Kazantsev	2018-11-06	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit 2f425e9c7946b9d74e64ebbfa33c1caa36914402. It seems that the check that we still should do the transform if we know the result is constant is missing in this code. So the logic that has been deleted by this change is still sometimes accidentally useful. I revert the change to see what can be done about it. The motivating case is the following: @Y = global [400 x i16] zeroinitializer, align 1 define i16 @foo() { entry: br label %for.body for.body: ; preds = %entry, %for.body %i = phi i16 [ 0, %entry ], [ %inc, %for.body ] %arrayidx = getelementptr inbounds [400 x i16], [400 x i16]* @Y, i16 0, i16 %i store i16 0, i16* %arrayidx, align 1 %inc = add nuw nsw i16 %i, 1 %cmp = icmp ult i16 %inc, 400 br i1 %cmp, label %for.body, label %for.end for.end: ; preds = %for.body %inc.lcssa = phi i16 [ %inc, %for.body ] ret i16 %inc.lcssa } We should be able to figure out that the result is constant, but the patch breaks it. Differential Revision: https://reviews.llvm.org/D51584 llvm-svn: 346198
*	[SystemZ::TTI] Improve cost handling of uint/sint to fp conversions.	Jonas Paulsson	2018-11-02	1	-0/+46
\| \| \| \| \| \| \| \| \| \| \| \|	Let i8/i16 uint/sint to fp conversions cost 1 if operand is a load. Since the load already does the extension, there is no extra cost (previously returned 2). Review: Ulrich Weigand https://reviews.llvm.org/D54028 llvm-svn: 346009
*	[ProfileSummary] Add options to override hot and cold count thresholds.	Easwaran Raman	2018-11-02	1	-0/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The hot and cold count thresholds are derived from the summary, but for debugging purposes it is convenient to provide the actual thresholds. Reviewers: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D54040 llvm-svn: 346005
*	[SystemZ::TTI] Recognize the higher cost of scalar i1 -> fp conversion	Jonas Paulsson	2018-11-01	1	-0/+23
\| \| \| \| \| \| \| \| \| \|	Scalar i1 to fp conversions are done with a branch sequence, so it should have a higher cost. Review: Ulrich Weigand https://reviews.llvm.org/D53924 llvm-svn: 345818
*	[SystemZ::TTI] Accurate costs for i1->double vector conversions	Jonas Paulsson	2018-11-01	1	-0/+43
\| \| \| \| \| \| \| \| \| \| \|	This factors out a new method getBoolVecToIntConversionCost() containing the code for vector sext/zext of i1, in order to reuse it for i1 to double vector conversions. Review: Ulrich Weigand https://reviews.llvm.org/D53923 llvm-svn: 345817
*	[IndVars] Smart hard uses detection	Max Kazantsev	2018-11-01	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When rewriting loop exit values, IndVars considers this transform not profitable if the loop instruction has a loop user which it believes cannot be optimized away. In current implementation only calls that immediately use the instruction are considered as such. This patch extends the definition of "hard" users to any side-effecting instructions (which usually cannot be optimized away from the loop) and also allows handling of not just immediate users, but use chains. Differentlai Revision: https://reviews.llvm.org/D51584 Reviewed By: etherzhhb llvm-svn: 345814
*	[SCEV] Avoid redundant computations when doing AddRec merge	Max Kazantsev	2018-11-01	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	When we calculate a product of 2 AddRecs, we end up making quite massive computations to deduce the operands of resulting AddRec. This process can be optimized by computing all args of intermediate sum and then calling `getAddExpr` once rather than calling `getAddExpr` with intermediate result every time a new argument is computed. Differential Revision: https://reviews.llvm.org/D53189 Reviewed By: rtereshin llvm-svn: 345813
*	[TTI] Fix uses of SK_ExtractSubvector shuffle costs (PR39368)	Simon Pilgrim	2018-10-30	11	-478/+785
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Correct costings of SK_ExtractSubvector requires the SubTy argument to indicate the type/size of the extracted subvector. Unlike the rest of the shuffle kinds this means that the main Ty argument represents the source vector type not the destination! I've done my best to fix a number of vectorizer uses: SLP - the reduction epilogue costs should be using a SK_PermuteSingleSrc shuffle as these all occur at the hardware vector width - we're not extracting (illegal) subvector types. This is causing the cost model diffs as SK_ExtractSubvector costs are poorly handled and tend to just return 1 at the moment. LV - I'm not clear on what the SK_ExtractSubvector should represents for recurrences - I've used a <1 x ?> subvector extraction as that seems to match the VF delta. Differential Revision: https://reviews.llvm.org/D53573 llvm-svn: 345617
*	[SystemZ] Improve isFoldableLoad() for Sub, SDiv and UDiv.	Jonas Paulsson	2018-10-30	1	-36/+179
\| \| \| \| \| \| \| \| \| \|	Sub, SDiv and UDiv are not commutative, so only the RHS operand can fold a load. This patch adds a check for this. Review: Ulrich Weigand https://reviews.llvm.org/D53791 llvm-svn: 345596
*	[X86] Add -LABEL to some FileCheck checks. NFC	Craig Topper	2018-10-26	3	-240/+240
\| \| \| \|	llvm-svn: 345407
*	[SystemZ] Improve getMemoryOpCost() to find foldable loads that are converted.	Jonas Paulsson	2018-10-25	1	-1/+257
\| \| \| \| \| \| \| \| \| \| \| \| \|	The SystemZ backend can do arithmetic of memory by loading and then extending one of the operands. Similarly, a load + truncate can be folded into an operand. This patch improves the SystemZ TTI cost function to recognize this. Review: Ulrich Weigand https://reviews.llvm.org/D52692 llvm-svn: 345327
*	[SystemZ] Improve handling and cost estimates of vector integer div/rem	Jonas Paulsson	2018-10-25	6	-355/+974
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Enable the DAG optimization that converts vector div/rem with constants into multiply+shifts sequences by expanding them early. This is needed since ISD::SMUL_LOHI is 'Custom' lowered on SystemZ, and will therefore not be available to BuildSDIV after legalization. Better cost values for these instructions based on how they will be implemented (a constant divisor is cheaper). Review: Ulrich Weigand https://reviews.llvm.org/D53196 llvm-svn: 345321
*	[CostModel][X86] Add realistic vXi64 uitofp vXf64 costs	Simon Pilgrim	2018-10-25	1	-14/+14
\| \| \| \| \| \|	Match codegen improvements from D53649/rL345256 llvm-svn: 345263
*	[CostModel][X86] Add realistic i64 uitofp f64 scalar costs	Simon Pilgrim	2018-10-25	1	-3/+3
\| \| \| \|	llvm-svn: 345261