path: root/llvm/lib/Analysis/TargetTransformInfo.cpp
Commit history for this file, newest first. Each entry shows the commit message followed by (author, date, files changed, lines -removed/+added).
* Only promote args when function attributes are compatible (Tom Stellard, 2019-01-16, 1 file, -0/+6)
  Summary: Check to make sure that the caller and the callee have compatible function arguments before promoting arguments. This uses the same TargetTransformInfo queries that are used to determine if attributes are compatible for inlining. The goal here is to avoid breaking ABI when a called function's ABI depends on a target feature that is not enabled in the caller. This is a very conservative fix for PR37358. Ideally we would have a more sophisticated check for ABI compatibility rather than checking if the attributes are compatible for inlining.
  Reviewers: echristo, chandlerc, eli.friedman, craig.topper
  Reviewed By: echristo, chandlerc
  Subscribers: nikic, xbolva00, rkruppe, alexcrichton, llvm-commits
  Differential Revision: https://reviews.llvm.org/D53554
  llvm-svn: 351296
* [TTI] getOperandInfo - a broadcast shuffle means the result is OK_UniformValue (Simon Pilgrim, 2018-11-14, 1 file, -0/+7)
  llvm-svn: 346868
* [TTI] Make TargetTransformInfo::getOperandInfo static. NFCI. (Simon Pilgrim, 2018-11-13, 1 file, -2/+1)
  It has no member dependencies and this makes it easier to reuse in other cost analysis code.
  llvm-svn: 346755
* [TTI] Flip vector types in getShuffleCost SK_ExtractSubvector call (Simon Pilgrim, 2018-11-09, 1 file, -1/+1)
  For SK_ExtractSubvector, the default 'Ty' type is the source operand type and 'SubTy' is the destination subvector type. I got this the wrong way around when I added rL346510.
  llvm-svn: 346534
* [CostModel] Add SK_ExtractSubvector handling to getInstructionThroughput (PR39368) (Simon Pilgrim, 2018-11-09, 1 file, -2/+8)
  Add ShuffleVectorInst::isExtractSubvectorMask helper to match shuffle masks.
  llvm-svn: 346510
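A minimal sketch of the kind of mask the new helper matches and how it might be called; the (ArrayRef<int> Mask, int NumSrcElts, int &Index) signature is an assumption based on this log, not a verified API reference.

    #include "llvm/ADT/ArrayRef.h"
    #include "llvm/IR/Instructions.h"
    using namespace llvm;

    // The mask <2,3> over a 4-element source reads a contiguous run starting at
    // element 2, so it can be costed as an SK_ExtractSubvector shuffle.
    bool isHighHalfExtract() {
      const int Mask[] = {2, 3};
      int Index = 0; // filled in with the starting element on success
      bool IsExtract =
          ShuffleVectorInst::isExtractSubvectorMask(Mask, /*NumSrcElts=*/4, Index);
      return IsExtract && Index == 2;
    }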
* [LV] Support vectorization of interleave-groups that require an epilog under optsize using masked wide loads (Dorit Nuzman, 2018-10-31, 1 file, -3/+6)
  Under Opt for Size, the vectorizer does not vectorize interleave-groups that have gaps at the end of the group (such as a loop that reads only the even elements: a[2*i]) because that implies that we'll require a scalar epilogue (which is not allowed under Opt for Size). This patch extends the support for masked-interleave-groups (introduced by D53011 for conditional accesses) to also cover the case of gaps in a group of loads; targets that enable the masked-interleave-group feature don't have to invalidate interleave-groups of loads with gaps; they could now use masked wide-loads and shuffles (if that's what the cost model selects).
  Reviewers: Ayal, hsaito, dcaballe, fhahn
  Reviewed By: Ayal
  Differential Revision: https://reviews.llvm.org/D53668
  llvm-svn: 345705
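The kind of loop the message describes, as a small illustrative example (not taken from the patch itself): only the even elements are loaded, so the interleave group has a gap that previously forced a scalar epilogue under optsize.

    // Stride-2, load-only access: element 2*i is read, element 2*i+1 is the gap.
    int sumEvenElements(const int *A, int N) {
      int Sum = 0;
      for (int I = 0; I < N; ++I)
        Sum += A[2 * I];
      return Sum;
    }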
* recommit 344472 after fixing build failure on ARM and PPC. (Dorit Nuzman, 2018-10-14, 1 file, -3/+7)
  llvm-svn: 344475
* revert 344472 due to failures. (Dorit Nuzman, 2018-10-14, 1 file, -7/+3)
  llvm-svn: 344473
* [IAI,LV] Add support for vectorizing predicated strided accesses using masked interleave-group (Dorit Nuzman, 2018-10-14, 1 file, -3/+7)
  The vectorizer currently does not attempt to create interleave-groups that contain predicated loads/stores; predicated strided accesses can currently be vectorized only using masked gather/scatter or scalarization. This patch makes predicated loads/stores candidates for forming interleave-groups during the Loop-Vectorizer's analysis, and adds the proper support for masked-interleave-groups to the Loop-Vectorizer's planning and transformation stages. The patch also extends the TTI API to allow querying the cost of masked interleave groups (which each target can control); targets that support masked vector loads/stores may choose to enable this feature and allow vectorizing predicated strided loads/stores using masked wide loads/stores and shuffles.
  Reviewers: Ayal, hsaito, dcaballe, fhahn, javed.absar
  Reviewed By: Ayal
  Differential Revision: https://reviews.llvm.org/D53011
  llvm-svn: 344472
* [LoopVectorizer] Use TTI.getOperandInfo() (Jonas Paulsson, 2018-10-05, 1 file, -43/+43)
  Call getOperandInfo() instead of using (near) duplicated code in LoopVectorizationCostModel::getInstructionCost(). This gets the OperandValueKind and OperandValueProperties values for a Value passed as operand to an arithmetic instruction. getOperandInfo() used to be a static method in TargetTransformInfo.cpp, but is now instead a public member.
  Review: Florian Hahn
  https://reviews.llvm.org/D52883
  llvm-svn: 343852
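A hedged sketch of what such a query might look like from a caller's side; the exact (Value*, OperandValueProperties&) signature and the static qualifier (which a later commit in this log adds) are assumptions, not a verified API reference.

    #include "llvm/Analysis/TargetTransformInfo.h"
    using namespace llvm;

    // Classify an arithmetic operand the way the cost model does: uniform or
    // non-uniform, value or constant, and whether it carries the power-of-two
    // property.
    void classifyOperand(Value *Op) {
      TargetTransformInfo::OperandValueProperties Props = TargetTransformInfo::OP_None;
      TargetTransformInfo::OperandValueKind Kind =
          TargetTransformInfo::getOperandInfo(Op, Props);
      (void)Kind; // e.g. OK_UniformConstantValue with Props == OP_PowerOf2
    }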
* Remove trailing space (Fangrui Song, 2018-07-30, 1 file, -9/+9)
  sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h}
  llvm-svn: 338293
* [TargetTransformInfo] Add pow2 analysis for scalar constants (Simon Pilgrim, 2018-07-11, 1 file, -0/+6)
  Add ConstantInt analysis to getOperandInfo so we get more realistic div/rem expansion costs comparable to the vector costs.
  llvm-svn: 336827
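A small example of why the power-of-two property matters for div/rem costing (illustrative only, not code from the patch): an unsigned divide by a power-of-two constant reduces to a shift, while a general divide needs a real division or an expensive expansion.

    unsigned divByEight(unsigned X) { return X / 8u; }            // typically lowers to X >> 3
    unsigned divGeneral(unsigned X, unsigned Y) { return X / Y; } // real division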
* [IR] move shuffle mask queries from TTI to ShuffleVectorInst (Sanjay Patel, 2018-06-19, 1 file, -168/+18)
  The optimizer is getting smarter (eg, D47986) about differentiating shuffles based on their mask values, so we should make queries on the mask constant operand generally available to avoid code duplication. We'll probably use this soon in the vectorizers and instcombine (D48023 and https://bugs.llvm.org/show_bug.cgi?id=37806). We might clean up TTI a bit more once all of its current 'SK_*' options are covered.
  Differential Revision: https://reviews.llvm.org/D48236
  llvm-svn: 335067
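A sketch of how the relocated mask queries might be used from cost-analysis code; the exact set of helpers and their ArrayRef<int> signatures are assumptions based on this log.

    #include "llvm/ADT/ArrayRef.h"
    #include "llvm/IR/Instructions.h"
    using namespace llvm;

    // The mask predicates now live on ShuffleVectorInst as static helpers, so the
    // cost model and other passes can share one implementation.
    void classifyMask(ArrayRef<int> Mask) {
      bool IsIdentity = ShuffleVectorInst::isIdentityMask(Mask);  // e.g. <0,1,2,3>
      bool IsReverse = ShuffleVectorInst::isReverseMask(Mask);    // e.g. <3,2,1,0>
      bool IsSplat = ShuffleVectorInst::isZeroEltSplatMask(Mask); // e.g. <0,0,0,0>
      bool IsSelect = ShuffleVectorInst::isSelectMask(Mask);      // e.g. <0,5,2,7>
      (void)IsIdentity; (void)IsReverse; (void)IsSplat; (void)IsSelect;
    }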
* Fix namespaces. No functionality change. (Benjamin Kramer, 2018-06-16, 1 file, -1/+1)
  llvm-svn: 334890
* [CostModel] Cleanup isSingleSourceVectorMask to match other shuffle matchers. NFCI. (Simon Pilgrim, 2018-06-14, 1 file, -10/+12)
  llvm-svn: 334699
* [CostModel] Recognise REVERSE shuffle mask if the elements come from the second src (Simon Pilgrim, 2018-06-14, 1 file, -4/+11)
  llvm-svn: 334698
* [CostModel] Recognise BROADCAST shuffle mask if the elements come from the second src (Simon Pilgrim, 2018-06-13, 1 file, -4/+11)
  llvm-svn: 334620
* [CostModel] Replace ShuffleKind::SK_Alternate with ShuffleKind::SK_Select (PR33744) (Simon Pilgrim, 2018-06-12, 1 file, -19/+15)
  As discussed on PR33744, this patch relaxes ShuffleKind::SK_Alternate, which requires shuffle masks to only match an alternating pattern from its 2 sources, e.g. v4f32: <0,5,2,7> or <4,1,6,3>. This seems far too restrictive, as most SIMD hardware will implement it using a general blend/bit-select instruction, so it is replaced with SK_Select, permitting elements from either source as long as each element stays in its lane, e.g. v4f32: <0,5,2,7>, <4,1,6,3>, <0,1,6,7>, <4,1,2,3>, etc. This initial patch just updates the name and the cost model's shuffle mask analysis; later patch reviews will update SLP to better utilise this - it still limits itself to SK_Alternate style patterns.
  Differential Revision: https://reviews.llvm.org/D47985
  llvm-svn: 334513
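The mask examples from the message, written out as plain C++ data; the notes on which masks are newly allowed are an interpretation of the text above, not taken from the patch.

    // SK_Select: each output lane i is taken from lane i of either source (a blend).
    static const int AlternateA[] = {0, 5, 2, 7};  // matched old SK_Alternate; still SK_Select
    static const int AlternateB[] = {4, 1, 6, 3};  // matched old SK_Alternate; still SK_Select
    static const int SelectOnlyA[] = {0, 1, 6, 7}; // only recognised under the new SK_Select
    static const int SelectOnlyB[] = {4, 1, 2, 3}; // only recognised under the new SK_Select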
* Fix signed/unsigned warning. NFCI. (Simon Pilgrim, 2018-06-12, 1 file, -2/+2)
  llvm-svn: 334509
* [CostModel] Treat Identity shuffle masks as zero cost (Simon Pilgrim, 2018-06-12, 1 file, -0/+20)
  As discussed on D47985, identity shuffle masks should probably be free. I've limited this to the case where the input and output types all match - but we could probably accept all cases.
  Differential Revision: https://reviews.llvm.org/D47986
  llvm-svn: 334506
* [TTI] Add uniform/non-uniform constant Pow2 detection to TargetTransformInfo::getInstructionThroughput (Simon Pilgrim, 2018-05-22, 1 file, -13/+28)
  This enables us to detect more fast path sdiv cases under cost analysis. This patch also enables us to handle non-uniform-constant pow2 cases for X86 SDIV costs. Found while working on D46276. Future patches can then extend the vectorizers to more fully support non-uniform pow2 cases.
  Differential Revision: https://reviews.llvm.org/D46637
  llvm-svn: 332969
* Remove \brief commands from doxygen comments. (Adrian Prantl, 2018-05-01, 1 file, -1/+1)
  We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers in our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by:
    for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done
  Differential Revision: https://reviews.llvm.org/D46290
  llvm-svn: 331272
* [TTI, AArch64] Add transpose shuffle kind (Matthew Simpson, 2018-04-26, 1 file, -10/+74)
  This patch adds a new shuffle kind useful for transposing a 2xn matrix. These transpose shuffle masks read corresponding even- or odd-numbered vector elements from two n-dimensional source vectors and write each result into consecutive elements of an n-dimensional destination vector. The transpose shuffle kind is meant to model the TRN1 and TRN2 AArch64 instructions. As such, this patch also considers transpose shuffles in the AArch64 implementation of getShuffleCost.
  Differential Revision: https://reviews.llvm.org/D45982
  llvm-svn: 330941
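Concrete masks for the n = 4 case, spelled out as plain data; the mapping to TRN1/TRN2 follows the description above and general AArch64 behaviour, so treat it as an illustration rather than text from the patch.

    // For sources A = <a0,a1,a2,a3> and B = <b0,b1,b2,b3>:
    static const int TransposeEven[] = {0, 4, 2, 6}; // -> <a0,b0,a2,b2> (TRN1-like)
    static const int TransposeOdd[] = {1, 5, 3, 7};  // -> <a1,b1,a3,b3> (TRN2-like)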
* [LV] Introduce TTI::getMinimumVF (Krzysztof Parzyszek, 2018-04-13, 1 file, -0/+4)
  The function getMinimumVF(ElemWidth) will return the minimum VF for a vector with elements of size ElemWidth bits. This value will only apply to targets for which TTI::shouldMaximizeVectorBandwidth returns true. The value of 0 indicates that there is no minimum VF.
  Differential Revision: https://reviews.llvm.org/D45271
  llvm-svn: 330062
* Plumb useAA through TargetTransformInfo to remove Transforms->CodeGen header dependency (David Blaikie, 2018-03-28, 1 file, -0/+2)
  Thanks to echristo for the pointers on direction.
  llvm-svn: 328737
* [LV] Add TTI::shouldMaximizeVectorBandwidth to allow enabling it per target (Krzysztof Parzyszek, 2018-03-27, 1 file, -0/+4)
  The default implementation returns false and keeps the current behavior.
  Differential Revision: https://reviews.llvm.org/D44735
  llvm-svn: 328632
* [LSR] Allow giving priority to post-incrementing addressing modes (Krzysztof Parzyszek, 2018-03-26, 1 file, -0/+14)
  Implement a TTI interface for targets to indicate that LSR should give priority to post-incrementing addressing modes. Combination of patches by Sebastian Pop and Brendon Cahoon.
  Differential Revision: https://reviews.llvm.org/D44758
  llvm-svn: 328490
* [LoopStrengthReduce, x86] don't add cost for a cmp that will be macro-fused (PR35681) (Sanjay Patel, 2018-02-05, 1 file, -0/+4)
  In the motivating case from PR35681, represented by the macro-fuse-cmp test (https://bugs.llvm.org/show_bug.cgi?id=35681), there's a 37 -> 31 byte size win for the loop because we eliminate the big base address offsets. SPEC2017 on Ryzen shows no significant perf difference.
  Differential Revision: https://reviews.llvm.org/D42607
  llvm-svn: 324289
* Re-commit: [PowerPC] Add handling for ColdCC calling convention and a pass to mark candidates with coldcc attribute (Zaara Syeda, 2018-01-30, 1 file, -0/+4)
  This recommits r322721, which was reverted due to sanitizer memory leak buildbot failures.
  Original commit message: This patch adds support for the coldcc calling convention for Power. This changes the set of non-volatile registers. It includes a pass to stress test the implementation by marking all static directly called functions with the coldcc attribute through the option -enable-coldcc-stress-test. It also includes an option, -ppc-enable-coldcc, to add the coldcc attribute to functions which are cold at all call sites based on BlockFrequencyInfo when the containing function does not call any non-cold functions.
  Differential Revision: https://reviews.llvm.org/D38413
  llvm-svn: 323778
* Revert [PowerPC] This reverts commit rL322721 (Zaara Syeda, 2018-01-17, 1 file, -4/+0)
  Failing build bots. Revert the commit now.
  llvm-svn: 322748
* [PowerPC] Add handling for ColdCC calling convention and a pass to mark candidates with coldcc attribute (Zaara Syeda, 2018-01-17, 1 file, -0/+4)
  This patch adds support for the coldcc calling convention for Power. This changes the set of non-volatile registers. It includes a pass to stress test the implementation by marking all static directly called functions with the coldcc attribute through the option -enable-coldcc-stress-test. It also includes an option, -ppc-enable-coldcc, to add the coldcc attribute to functions which are cold at all call sites based on BlockFrequencyInfo when the containing function does not call any non-cold functions.
  Differential Revision: https://reviews.llvm.org/D38413
  llvm-svn: 322721
* Revert r321377, it causes regression to https://reviews.llvm.org/P8055. (Guozhi Wei, 2017-12-28, 1 file, -4/+0)
  llvm-svn: 321528
* [SimplifyCFG] Don't do if-conversion if there is a long dependence chain (Guozhi Wei, 2017-12-22, 1 file, -0/+4)
  If, after if-conversion, most of the instructions in the new BB form a long and slow dependence chain, the result may be slower than cmp/branch even if the branch has a high miss rate, because the control dependence is transformed into a data dependence: control dependences can be speculated, so the second part can execute in parallel with the first part on a modern OOO processor. This patch checks for such a long dependence chain and gives up on if-conversion if it finds one.
  Differential Revision: https://reviews.llvm.org/D39352
  llvm-svn: 321377
* [Memcpy Loop Lowering] Remove the fixed int8 lowering. (Sean Fertile, 2017-12-18, 1 file, -9/+0)
  Switch over to the lowering that uses target-supplied operand types.
  Differential Revision: https://reviews.llvm.org/D41201
  llvm-svn: 320989
* [PartiallyInlineLibCalls][x86] add TTI hook to allow sqrt inlining to depend on arg rather than result (Sanjay Patel, 2017-11-27, 1 file, -0/+4)
  This should fix PR31455: https://bugs.llvm.org/show_bug.cgi?id=31455
  Differential Revision: https://reviews.llvm.org/D28314
  llvm-svn: 319094
* [CodeGen][ExpandMemcmp] Allow memcmp to expand to vector loads (2). (Clement Courbet, 2017-10-30, 1 file, -2/+3)
  - Targets that want to support memcmp expansions now return the list of supported load sizes.
  - Expansion codegen does not assume that all power-of-two load sizes smaller than the max load size are valid. For example, this is not the case for x86 (32-bit) + SSE2.
  Fixes PR34887.
  llvm-svn: 316905
* [NVPTX] allow address space inference for volatile loads/stores. (Artem Belevich, 2017-10-24, 1 file, -0/+5)
  If a particular target supports volatile memory access operations, we can avoid AS casting to generic AS. Currently it's only enabled in NVPTX for loads and stores that access global & shared AS.
  Differential Revision: https://reviews.llvm.org/D39026
  llvm-svn: 316495
* Revert r314923: "Recommit : Use the basic cost if a GEP is not used as addressing mode" (Daniel Jasper, 2017-10-13, 1 file, -5/+0)
  Significantly reduces performance (~30%) of gipfeli (https://github.com/google/gipfeli). I have not yet managed to reproduce this regression with the open-source version of the benchmark on github, but will work with others to get a reproducer to you later today.
  llvm-svn: 315680
* Recommit : Use the basic cost if a GEP is not used as addressing mode (Jun Bum Lim, 2017-10-04, 1 file, -0/+5)
  Recommitting r314517 with the fix for handling ConstantExpr.
  Original commit message: Currently, getGEPCost() returns TCC_FREE whenever a GEP is a legal addressing mode in the target. However, since it doesn't check its actual users, it will return FREE even in cases where the GEP cannot be folded away as a part of the actual addressing mode. For example, if a user of the GEP is a call instruction taking the GEP as a parameter, then the GEP may not be folded in isel.
  llvm-svn: 314923
* Revert "Use the basic cost if a GEP is not used as addressing mode" (Alex Shlyapnikov, 2017-09-29, 1 file, -5/+0)
  This reverts commit r314517. This commit crashes sanitizer bots, for example: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/4167
  Stack snippet:
    ...
    /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/Support/Casting.h:255:0
    llvm::TargetTransformInfoImplCRTPBase<llvm::X86TTIImpl>::getGEPCost(llvm::GEPOperator const*, llvm::ArrayRef<llvm::Value const*>) /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h:742:0
    llvm::TargetTransformInfoImplCRTPBase<llvm::X86TTIImpl>::getUserCost(llvm::User const*, llvm::ArrayRef<llvm::Value const*>) /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h:782:0
    /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/lib/Analysis/TargetTransformInfo.cpp:116:0
    /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/ADT/SmallVector.h:116:0
    /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/ADT/SmallVector.h:343:0
    /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/ADT/SmallVector.h:864:0
    /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/Analysis/TargetTransformInfo.h:285:0
    ...
  llvm-svn: 314560
* Use the basic cost if a GEP is not used as addressing mode (Jun Bum Lim, 2017-09-29, 1 file, -0/+5)
  Summary: Currently, getGEPCost() returns TCC_FREE whenever a GEP is a legal addressing mode in the target. However, since it doesn't check its actual users, it will return FREE even in cases where the GEP cannot be folded away as a part of the actual addressing mode. For example, if a user of the GEP is a call instruction taking the GEP as a parameter, then the GEP may not be folded in isel.
  Reviewers: hfinkel, efriedma, mcrosier, jingyue, haicheng
  Reviewed By: hfinkel
  Subscribers: javed.absar, llvm-commits
  Differential Revision: https://reviews.llvm.org/D38085
  llvm-svn: 314517
* [CodeGenPrepare][NFC] Rename TargetTransformInfo::expandMemCmp -> TargetTransformInfo::enableMemCmpExpansion. (Clement Courbet, 2017-09-25, 1 file, -2/+2)
  Summary: Right now there are two functions with the same name, one does the work and the other one returns true if expansion is needed. Rename TargetTransformInfo::expandMemCmp to make it more consistent with other members of TargetTransformInfo. Remove the unused Instruction* parameter.
  Differential Revision: https://reviews.llvm.org/D38165
  llvm-svn: 314096
* [DivRempairs] add a pass to optimize div/rem pairs (PR31028) (Sanjay Patel, 2017-09-09, 1 file, -0/+4)
  This is intended to be a superset of the functionality from D31037 (EarlyCSE) but implemented as an independent pass, so there's no stretching of scope and feature creep for an existing pass. I also proposed a weaker version of this for SimplifyCFG in D30910, and I initially had almost this same functionality as an addition to CGP in the motivating example of PR31028 (https://bugs.llvm.org/show_bug.cgi?id=31028). The advantage of positioning this ahead of SimplifyCFG in the pass pipeline is that it can allow more flattening, but it needs to be after passes (InstCombine) that could sink a div/rem and undo the hoisting that is done here. Decomposing remainder may allow removing some code from the backend (PPC and possibly others).
  Differential Revision: https://reviews.llvm.org/D37121
  llvm-svn: 312862
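The remainder decomposition the message alludes to, as a small standalone sketch (not code from the pass): when both the quotient and the remainder of the same operands are needed, the remainder can be recomputed from the quotient so only one division is emitted.

    // Rem ends up equal to A % B because (A / B) * B + A % B == A for integers.
    void divRem(int A, int B, int &Quot, int &Rem) {
      Quot = A / B;
      Rem = A - Quot * B;
    }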
* [TargetTransformInfo] Add a new public interface getInstructionCost (Guozhi Wei, 2017-09-08, 1 file, -0/+557)
  The current TargetTransformInfo can support a throughput cost model and a code size model, but sometimes we also need an instruction latency cost model in different optimizations. Hal suggested we need a single public interface to query the different costs of an instruction. So I proposed the following interface:
    enum TargetCostKind {
      TCK_RecipThroughput, ///< Reciprocal throughput.
      TCK_Latency,         ///< The latency of instruction.
      TCK_CodeSize         ///< Instruction code size.
    };
    int getInstructionCost(const Instruction *I, enum TargetCostKind kind) const;
  All clients should mainly use this function to query the cost of an instruction; the <kind> parameter specifies the desired cost model. This patch also provides a simple default implementation of getInstructionLatency. The default getInstructionLatency provides latency numbers for only a small number of instruction classes; those latency numbers are only reasonable for modern OOO processors. It can be extended in the following ways:
  - Add more detail into this function.
  - Add a getXXXLatency function and call it from here.
  - Implement a target-specific getInstructionLatency function.
  Differential Revision: https://reviews.llvm.org/D37170
  llvm-svn: 312832
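A minimal caller sketch built from the interface quoted above; whether the surrounding headers and cost-kind spellings match the tree exactly is an assumption.

    #include "llvm/Analysis/TargetTransformInfo.h"
    #include "llvm/IR/Instruction.h"

    // Ask the unified interface for the latency cost model of one instruction.
    int instructionLatency(const llvm::Instruction &I,
                           const llvm::TargetTransformInfo &TTI) {
      return TTI.getInstructionCost(&I, llvm::TargetTransformInfo::TCK_Latency);
    }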
* [SLP] Support for horizontal min/max reduction. (Alexey Bataev, 2017-09-08, 1 file, -0/+9)
  The SLP vectorizer supports horizontal reductions for Add/FAdd binary operations. This patch adds support for horizontal min/max reductions. Function getReductionCost() is split into getArithmeticReductionCost() for binary operation reductions and getMinMaxReductionCost() for min/max reductions. Fixes PR26956.
  Differential revision: https://reviews.llvm.org/D27846
  llvm-svn: 312791
* Model cache size and associativity in TargetTransformInfo (Tobias Grosser, 2017-08-24, 1 file, -0/+10)
  Summary: We add the precise cache sizes and associativity for the following Intel architectures:
  - Penryn
  - Nehalem
  - Westmere
  - Sandy Bridge
  - Ivy Bridge
  - Haswell
  - Broadwell
  - Skylake
  - Kabylake
  For several months, Polly has used a performance model for BLAS computations that derives optimal cache and register tile sizes from cache and latency information (based on ideas from "Analytical Modeling Is Enough for High-Performance BLIS" by Tze Meng Low, published at TOMS 2016). While bootstrapping this model, these target values have been kept in Polly. However, as our implementation is now rather mature, it seems time to teach LLVM itself about cache sizes. Interestingly, L1 and L2 cache sizes are pretty constant across micro-architectures, hence a set of architecture-specific default values seems like a good start. They can be expanded to more target-specific values, in case certain newer architectures require different values. For now a set of Intel architectures is provided. Just as a little teaser, for a simple gemm kernel this model allows us to improve performance from 1.2s to 0.27s. For gemm kernels with less optimal memory layouts even larger speedups can be reported.
  Reviewers: Meinersbur, bollu, singam-sanjay, hfinkel, gareevroman, fhahn, sebpop, efriedma, asb
  Reviewed By: fhahn, asb
  Subscribers: lsaba, asb, pollydev, llvm-commits
  Differential Revision: https://reviews.llvm.org/D37051
  llvm-svn: 311647
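A hedged sketch of reading the modelled values back out of TTI; the accessor names, the CacheLevel enumeration, and the Optional<unsigned> return type are assumptions about the interface this commit adds, not a verified API reference.

    #include "llvm/Analysis/TargetTransformInfo.h"
    #include "llvm/Support/raw_ostream.h"

    // Print the L1 data cache parameters if the target models them.
    void dumpL1D(const llvm::TargetTransformInfo &TTI) {
      using TTIRef = llvm::TargetTransformInfo;
      if (auto Size = TTI.getCacheSize(TTIRef::CacheLevel::L1D))
        llvm::outs() << "L1D size: " << *Size << " bytes\n";
      if (auto Assoc = TTI.getCacheAssociativity(TTIRef::CacheLevel::L1D))
        llvm::outs() << "L1D associativity: " << *Assoc << "\n";
    }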
* [LSR / TTI / SystemZ] Eliminate TargetTransformInfo::isFoldableMemAccess() (Jonas Paulsson, 2017-08-09, 1 file, -5/+0)
  isLegalAddressingMode() has recently gained the extra optional Instruction* parameter, and therefore it can now do the job that previously only isFoldableMemAccess() could do. The SystemZ implementation of isLegalAddressingMode() has gained the functionality of checking for offsets, which used to be done with isFoldableMemAccess(). The isFoldableMemAccess() hook has been removed everywhere.
  Review: Quentin Colombet, Ulrich Weigand
  https://reviews.llvm.org/D35933
  llvm-svn: 310463
* [Cost] Rename getReductionCost() to getArithmeticReductionCost(), NFC. (Alexey Bataev, 2017-07-31, 1 file, -3/+3)
  llvm-svn: 309563
* [TTI] fixing a bug in the isLegalMaskedScatter API (Mohammed Agabaria, 2017-07-27, 1 file, -1/+1)
  isLegalMaskedScatter called the Gather version, which is a bug. A test case is provided within the patch for AVX2 gathers at: https://reviews.llvm.org/D35772
  Differential Revision: https://reviews.llvm.org/D35786
  llvm-svn: 309260
* [SystemZ, LoopStrengthReduce] (Jonas Paulsson, 2017-07-21, 1 file, -2/+7)
  This patch makes LSR generate better code for SystemZ in the cases of memory intrinsics, Load->Store pairs or comparison of immediate with memory. In order to achieve this, the following common code changes were made:
  * New TTI hook: LSRWithInstrQueries(), which defaults to false. Controls if LSR should do instruction-based addressing evaluations by calling isLegalAddressingMode() with the Instruction pointers.
  * In LoopStrengthReduce: handle address operands of memset, memmove and memcpy as address uses, and call isFoldableMemAccessOffset() for any LSRUse::Address, not just loads or stores.
  SystemZ changes:
  * isLSRCostLess() implemented with Insns first, and without ImmCost.
  * New function supportedAddressingMode() that is a helper for TTI methods looking at Instructions passed via pointers.
  Review: Ulrich Weigand, Quentin Colombet
  https://reviews.llvm.org/D35262
  https://reviews.llvm.org/D35049
  llvm-svn: 308729