bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	Rename TTI::getIntImmCost for instructions and intrinsics	Reid Kleckner	2019-12-11	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Soon Intrinsic::ID will be a plain integer, so this overload will not be possible. Rename both overloads to ensure that downstream targets observe this as a build failure instead of a runtime failure. Split off from D71320 Reviewers: efriedma Differential Revision: https://reviews.llvm.org/D71381
*	[ARM] Teach the Arm cost model that a Shift can be folded into other ↵	David Green	2019-12-09	1	-6/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	instructions This attempts to teach the cost model in Arm that code such as: %s = shl i32 %a, 3 %a = and i32 %s, %b Can under Arm or Thumb2 become: and r0, r1, r2, lsl #3 So the cost of the shift can essentially be free. To do this without trying to artificially adjust the cost of the "and" instruction, it needs to get the users of the shl and check if they are a type of instruction that the shift can be folded into. And so it needs to have access to the actual instruction in getArithmeticInstrCost, which if available is added as an extra parameter much like getCastInstrCost. We otherwise limit it to shifts with a single user, which should hopefully handle most of the cases. The list of instructions that the shift can be folded into includes ADC, ADD, AND, BIC, CMP, EOR, MVN, ORR, ORN, RSB, SBC and SUB. This translates to Add, Sub, And, Or, Xor and ICmp. Differential Revision: https://reviews.llvm.org/D70966
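The rule described in this message can be sketched as a toy model. This is only an illustration under stated assumptions: `Op`, `Inst`, `canFoldShift` and `toyArithmeticCost` are hypothetical names, not LLVM's API, and the unit costs are placeholders rather than the values the Arm cost model returns.

```cpp
#include <vector>

// Hypothetical, simplified opcode set (not LLVM's Instruction opcodes).
enum class Op { Shl, Add, Sub, And, Or, Xor, ICmp, Mul };

// Hypothetical stand-in for an IR instruction: an opcode plus its users.
struct Inst {
  Op opcode;
  std::vector<const Inst *> users;
};

// True for the IR operations the message lists as able to absorb a shift.
bool canFoldShift(Op user) {
  switch (user) {
  case Op::Add:
  case Op::Sub:
  case Op::And:
  case Op::Or:
  case Op::Xor:
  case Op::ICmp:
    return true;
  default:
    return false;
  }
}

// Toy cost: a shl with exactly one user that can fold it is free;
// every other instruction is charged a placeholder cost of 1.
int toyArithmeticCost(const Inst &I) {
  if (I.opcode == Op::Shl && I.users.size() == 1 &&
      canFoldShift(I.users.front()->opcode))
    return 0;
  return 1;
}
```

In this sketch a `shl` feeding a single `and` costs 0, while a `shl` feeding a `mul`, or one with two users, keeps its cost of 1.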
*	[Alignment][NFC] getMemoryOpCost uses MaybeAlign	Guillaume Chatelet	2019-10-25	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: nemanjai, hiraditya, kbarton, MaskRay, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69307
*	recommit: [LoopVectorize][PowerPC] Estimate int and float register pressure ↵	Zi Xuan Wu	2019-10-12	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	separately in loop-vectorize In loop-vectorize, the interleave count and vector factor depend on the target register number. Currently, it does not estimate register pressure separately for different register classes (especially for scalar types, where float should not be counted together with int), so the estimate is not accurate. Specifically, it causes too much interleaving/unrolling, which results in too many register spills in the loop body and hurts performance. So we need to classify the register classes at the IR level. Importantly, these are abstract register classes, not the target register classes that the backend provides in the td file. They are used to establish the mapping between the types of IR values and the number of simultaneous live ranges to which we'd like to limit for some set of those types. For example, on the POWER target, the register number is special when VSX is enabled. When VSX is enabled, the number of int scalar registers is 32 (GPR) and float is 64 (VSR), but int and float vector registers are both 64 (VSR). So there should be 2 kinds of register class when VSX is enabled, and 3 kinds of register class when VSX is NOT enabled. Run on the POWER target, it gives a big (+~30%) performance improvement in one specific benchmark (503.bwaves_r) of spec2017 and no other obvious regressions. Differential revision: https://reviews.llvm.org/D67148 llvm-svn: 374634
*	Revert "[LoopVectorize][PowerPC] Estimate int and float register pressure ↵	Jinsong Ji	2019-10-08	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	separately in loop-vectorize" Also Revert "[LoopVectorize] Fix non-debug builds after rL374017" This reverts commit 9f41deccc0e648a006c9f38e11919f181b6c7e0a. This reverts commit 18b6fe07bcf44294f200bd2b526cb737ed275c04. The patch is breaking PowerPC internal build, checked with author, reverting on behalf of him for now due to timezone. llvm-svn: 374091
*	[LoopVectorize][PowerPC] Estimate int and float register pressure separately ↵	Zi Xuan Wu	2019-10-08	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	in loop-vectorize In loop-vectorize, the interleave count and vector factor depend on the target register number. Currently, it does not estimate register pressure separately for different register classes (especially for scalar types, where float should not be counted together with int), so the estimate is not accurate. Specifically, it causes too much interleaving/unrolling, which results in too many register spills in the loop body and hurts performance. So we need to classify the register classes at the IR level. Importantly, these are abstract register classes, not the target register classes that the backend provides in the td file. They are used to establish the mapping between the types of IR values and the number of simultaneous live ranges to which we'd like to limit for some set of those types. For example, on the POWER target, the register number is special when VSX is enabled. When VSX is enabled, the number of int scalar registers is 32 (GPR) and float is 64 (VSR), but int and float vector registers are both 64 (VSR). So there should be 2 kinds of register class when VSX is enabled, and 3 kinds of register class when VSX is NOT enabled. Run on the POWER target, it gives a big (+~30%) performance improvement in one specific benchmark (503.bwaves_r) of spec2017 and no other obvious regressions. Differential revision: https://reviews.llvm.org/D67148 llvm-svn: 374017
*	[SystemZ] Support z15 processor name	Ulrich Weigand	2019-09-20	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	The recently announced IBM z15 processor implements the architecture already supported as "arch13" in LLVM. This patch adds support for "z15" as an alternate architecture name for arch13. The patch also uses z15 in a number of places where we used arch13 as long as the official name was not yet announced. llvm-svn: 372435
*	[SystemZ] Add support for new cpu architecture - arch13	Ulrich Weigand	2019-07-12	1	-5/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch series adds support for the next-generation arch13 CPU architecture to the SystemZ backend. This includes: - Basic support for the new processor and its features. - Assembler/disassembler support for new instructions. - CodeGen for new instructions, including new LLVM intrinsics. - Scheduler description for the new processor. - Detection of arch13 as host processor. Note: No currently available Z system supports the arch13 architecture. Once new systems become available, the official system name will be added as supported -march name. llvm-svn: 365932
*	Update the file headers across all of the LLVM projects in the monorepo	Chandler Carruth	2019-01-19	1	-4/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636
*	[SystemZ::TTI] Return zero cost for ICmp that becomes Load And Test.	Jonas Paulsson	2018-12-03	1	-0/+10
\| \| \| \| \| \| \| \| \| \| \| \| \|	A loaded value with multiple users compared with 0 will become a load and test single instruction. The load is not folded in this case (multiple users), but the compare instruction is eliminated. This patch returns 0 cost for the icmp in these cases. Review: Ulrich Weigand https://reviews.llvm.org/D55111 llvm-svn: 348141
*	[SystemZ::TTI] i8/i16 operands extension costs revisited	Jonas Paulsson	2018-11-30	1	-20/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Three minor changes to these extra costs: * For ICmp instructions, instead of adding 2 all the time for extending each operand, this is only done if that operand is neither a load nor an immediate. * The operands extension costs for divides are removed, because we now use a high cost already for the divide (20). * The extra costs for lshr/ashr are removed as this did not seem useful. Review: Ulrich Weigand https://reviews.llvm.org/D55053 llvm-svn: 347961
*	[SystemZ::TTI] Improve cost for compare of i64 with extended i32 load	Jonas Paulsson	2018-11-28	1	-7/+9
\| \| \| \| \| \| \| \| \| \| \| \|	CGF/CLGF compares an i64 register with a sign/zero extended loaded i32 value in memory. This patch makes such a load considered foldable and so gets a 0 cost. Review: Ulrich Weigand https://reviews.llvm.org/D54944 llvm-svn: 347735
*	[SystemZ::TTI] Improve costs for i16 add, sub and mul against memory.	Jonas Paulsson	2018-11-28	1	-4/+6
\| \| \| \| \| \| \| \| \| \| \| \| \|	AH, SH and MH costs are already covered in the cases where LHS is 32 bits and RHS is 16 bits of memory sign-extended to i32. As these instructions are also used when LHS is i16, this patch recognizes that the loads will get folded then as well. Review: Ulrich Weigand https://reviews.llvm.org/D54940 llvm-svn: 347734
*	[SystemZ::TTI] Improved cost values for comparison against memory.	Jonas Paulsson	2018-11-28	1	-1/+16
\| \| \| \| \| \| \| \| \| \| \| \| \|	Single instructions exist for i8 and i16 comparisons of memory against a small immediate. This patch makes sure that if the load in these cases has a single user (the ICmp), it gets a 0 cost (folded), and also that the ICmp gets a cost of 1. Review: Ulrich Weigand https://reviews.llvm.org/D54897 llvm-svn: 347733
*	[SystemZ::TTI] Return zero cost for scalar load/store connected with a bswap.	Jonas Paulsson	2018-11-28	1	-0/+25
\| \| \| \| \| \| \| \| \| \|	Since byte-swapping loads and stores are supported, a 'load -> bswap' or 'bswap -> store' sequence should have the cost of one. Review: Ulrich Weigand https://reviews.llvm.org/D54870 llvm-svn: 347732
*	[SystemZTTIImpl] Give correct cost values for vector bswap intrinsics.	Jonas Paulsson	2018-11-22	1	-0/+26
\| \| \| \| \| \| \| \| \| \|	Implement getIntrinsicInstrCost() and return costs reflecting that bswap can be done with a vperm per vector register. Review: Ulrich Weigand https://reviews.llvm.org/D54789 llvm-svn: 347445
*	[SystemZ::TTI] Improve accuracy of costs for vector fp <-> int conversions	Jonas Paulsson	2018-11-12	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Improve getCastInstrCost() by respecting the different types of Src and Dst for vector integer <-> fp conversions. This means that extracting from integer becomes more expensive (by the extraction penalty), and the extraction from fp becomes cheaper (no longer has a false extraction penalty). Review: Ulrich Weigand https://reviews.llvm.org/D54423 llvm-svn: 346663
*	[SystemZ::TTI] Improve cost handling of uint/sint to fp conversions.	Jonas Paulsson	2018-11-02	1	-4/+6
\| \| \| \| \| \| \| \| \| \| \| \|	Let i8/i16 uint/sint to fp conversions cost 1 if operand is a load. Since the load already does the extension, there is no extra cost (previously returned 2). Review: Ulrich Weigand https://reviews.llvm.org/D54028 llvm-svn: 346009
*	[SystemZ] Rework getInterleavedMemoryOpCost()	Jonas Paulsson	2018-11-02	1	-16/+48
\| \| \| \| \| \| \| \| \| \| \| \| \|	Model this function more closely after the BasicTTIImpl version, with separate handling of loads and stores. For loads, the set of actually loaded vectors is checked. This makes it more readable and just slightly more accurate generally. Review: Ulrich Weigand https://reviews.llvm.org/D53071 llvm-svn: 345998
*	[SystemZ::TTI] Recognize the higher cost of scalar i1 -> fp conversion	Jonas Paulsson	2018-11-01	1	-1/+3
\| \| \| \| \| \| \| \| \| \|	Scalar i1 to fp conversions are done with a branch sequence, so it should have a higher cost. Review: Ulrich Weigand https://reviews.llvm.org/D53924 llvm-svn: 345818
*	[SystemZ::TTI] Accurate costs for i1->double vector conversions	Jonas Paulsson	2018-11-01	1	-15/+28
\| \| \| \| \| \| \| \| \| \| \|	This factors out a new method getBoolVecToIntConversionCost() containing the code for vector sext/zext of i1, in order to reuse it for i1 to double vector conversions. Review: Ulrich Weigand https://reviews.llvm.org/D53923 llvm-svn: 345817
*	[LV] Support vectorization of interleave-groups that require an epilog under	Dorit Nuzman	2018-10-31	1	-3/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	optsize using masked wide loads Under Opt for Size, the vectorizer does not vectorize interleave-groups that have gaps at the end of the group (such as a loop that reads only the even elements: a[2*i]) because that implies that we'll require a scalar epilogue (which is not allowed under Opt for Size). This patch extends the support for masked-interleave-groups (introduced by D53011 for conditional accesses) to also cover the case of gaps in a group of loads; Targets that enable the masked-interleave-group feature don't have to invalidate interleave-groups of loads with gaps; they could now use masked wide-loads and shuffles (if that's what the cost model selects). Reviewers: Ayal, hsaito, dcaballe, fhahn Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D53668 llvm-svn: 345705
*	[SystemZ] Improve isFoldableLoad() for Sub, SDiv and UDiv.	Jonas Paulsson	2018-10-30	1	-0/+5
\| \| \| \| \| \| \| \| \| \|	Sub, SDiv and UDiv are not commutative, so only the RHS operand can fold a load. This patch adds a check for this. Review: Ulrich Weigand https://reviews.llvm.org/D53791 llvm-svn: 345596
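The operand-position check described above can be sketched as follows. This is a minimal illustration, not the SystemZ implementation: `ArithOp`, `isCommutative` and `loadMayFold` are hypothetical names, and the sketch looks only at operand position, ignoring every other condition a real foldability check would need.

```cpp
// Hypothetical opcode set for the sketch.
enum class ArithOp { Add, Mul, And, Or, Xor, Sub, SDiv, UDiv };

// Sub, SDiv and UDiv are the non-commutative operations named in the message.
bool isCommutative(ArithOp op) {
  switch (op) {
  case ArithOp::Sub:
  case ArithOp::SDiv:
  case ArithOp::UDiv:
    return false;
  default:
    return true;
  }
}

// operandIndex: 0 = LHS, 1 = RHS.
// A commutative user can swap its operands, so either position may hold the
// loaded value; a non-commutative user can only take it as the RHS.
bool loadMayFold(ArithOp user, unsigned operandIndex) {
  return isCommutative(user) || operandIndex == 1;
}
```

With this model, `loadMayFold(ArithOp::Sub, 0)` is false while `loadMayFold(ArithOp::Sub, 1)` and `loadMayFold(ArithOp::Add, 0)` are true.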
*	[SystemZ] NFC reformatting in SystemZTargetTransformInfo.cpp	Jonas Paulsson	2018-10-25	1	-5/+7
\| \| \| \| \| \|	Some lines more than 80 characters long reformatted. llvm-svn: 345331
*	[SystemZ] Improve getMemoryOpCost() to find foldable loads that are converted.	Jonas Paulsson	2018-10-25	1	-41/+89
\| \| \| \| \| \| \| \| \| \| \| \| \|	The SystemZ backend can do arithmetic of memory by loading and then extending one of the operands. Similarly, a load + truncate can be folded into an operand. This patch improves the SystemZ TTI cost function to recognize this. Review: Ulrich Weigand https://reviews.llvm.org/D52692 llvm-svn: 345327
*	[SystemZ] Improve handling and cost estimates of vector integer div/rem	Jonas Paulsson	2018-10-25	1	-39/+43
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Enable the DAG optimization that converts vector div/rem with constants into multiply+shifts sequences by expanding them early. This is needed since ISD::SMUL_LOHI is 'Custom' lowered on SystemZ, and will therefore not be available to BuildSDIV after legalization. Better cost values for these instructions based on how they will be implemented (a constant divisor is cheaper). Review: Ulrich Weigand https://reviews.llvm.org/D53196 llvm-svn: 345321
*	recommit 344472 after fixing build failure on ARM and PPC.	Dorit Nuzman	2018-10-14	1	-1/+5
\| \| \| \|	llvm-svn: 344475
*	revert 344472 due to failures.	Dorit Nuzman	2018-10-14	1	-5/+1
\| \| \| \|	llvm-svn: 344473
*	[IAI,LV] Add support for vectorizing predicated strided accesses using masked	Dorit Nuzman	2018-10-14	1	-1/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	interleave-group The vectorizer currently does not attempt to create interleave-groups that contain predicated loads/stores; predicated strided accesses can currently be vectorized only using masked gather/scatter or scalarization. This patch makes predicated loads/stores candidates for forming interleave-groups during the Loop-Vectorizer's analysis, and adds the proper support for masked-interleave-groups to the Loop-Vectorizer's planning and transformation stages. The patch also extends the TTI API to allow querying the cost of masked interleave groups (which each target can control); Targets that support masked vector loads/stores may choose to enable this feature and allow vectorizing predicated strided loads/stores using masked wide loads/stores and shuffles. Reviewers: Ayal, hsaito, dcaballe, fhahn, javed.absar Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D53011 llvm-svn: 344472
*	[SystemZ] Temporarily disable high VFs with integer div/rem.	Jonas Paulsson	2018-10-10	1	-0/+7
\| \| \| \| \| \| \| \|	Until mischeduler is clever enough to avoid spilling in a vectorized loop with many (scalar) DLRs it is better to avoid high vectorization factors (8 and above). llvm-svn: 344129
*	[SystemZ] Take better care when computing needed vector registers in TTI.	Jonas Paulsson	2018-10-10	1	-17/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A new function getNumVectorRegs() is better to use for the number of needed vector registers instead of getNumberOfParts(). This is to make sure that the number of vector registers (and typically operations) required for a vector type is accurate. getNumberOfParts(), which was previously used, works by splitting the vector type until it is legal, and gives incorrect results for types with a non-power-of-two number of elements (rare). A new static function getScalarSizeInBits() also checks for a pointer type and returns 64U for it (since otherwise it gets a value of 0). It is used in a few places where Ty may be a pointer. Review: Ulrich Weigand llvm-svn: 344115
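The difference between the two counting strategies can be sketched as below. The sketch assumes 128-bit vector registers, and `numPartsByHalving` is only a toy model of "split until legal" (widen to a power of two, then halve); it is not LLVM's getNumberOfParts(), and both function names are hypothetical.

```cpp
// Assumption for the sketch: vector registers are 128 bits wide.
constexpr unsigned kVectorRegBits = 128;

// Count registers by rounding the total bit width up to whole registers.
unsigned numVectorRegsByWidth(unsigned numElts, unsigned scalarBits) {
  unsigned widthBits = numElts * scalarBits;
  return (widthBits + kVectorRegBits - 1) / kVectorRegBits;
}

// Toy model of legalization by splitting: widen the element count to the
// next power of two, then halve the type until it fits in one register.
unsigned numPartsByHalving(unsigned numElts, unsigned scalarBits) {
  unsigned elts = 1;
  while (elts < numElts)
    elts *= 2;
  unsigned parts = 1;
  while (elts * scalarBits > kVectorRegBits) {
    elts /= 2;
    parts *= 2;
  }
  return parts;
}
```

For a vector of six 64-bit elements the width-based count gives 3 registers, while the halving model gives 4. For power-of-two element counts the two agree.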
*	[SystemZ] Adjust cost functions for subtargets that use LI + LOC instead of IPM	Jonas Paulsson	2018-09-14	1	-4/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	After recent improvements which make better use of LOC instead of IPM, the TTI cost functions also need to be updated to reflect this. This involves sext, zext and xor of i1. The tests were updated so that for z13 the new costs are expected, while the old costs are still checked for on zEC12. Review: Ulrich Weigand https://reviews.llvm.org/D51339 llvm-svn: 342207
*	Remove trailing space	Fangrui Song	2018-07-30	1	-4/+4
\| \| \| \| \| \|	sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h} llvm-svn: 338293
*	Fix a bunch of places where operator-> was used directly on the return from ↵	Craig Topper	2018-05-05	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	dyn_cast. Inspired by r331508, I did a grep and found these. Mostly just change from dyn_cast to cast. Some cases also showed a dyn_cast result being converted to bool, so those I changed to isa. llvm-svn: 331577
*	Fix a bunch more layering of CodeGen headers that are in Target	David Blaikie	2017-11-17	1	-2/+2
\| \| \| \| \| \| \| \|	All these headers already depend on CodeGen headers so moving them into CodeGen fixes the layering (since CodeGen depends on Target, not the other way around). llvm-svn: 318490
*	[SystemZ] implement hasDivRemOp()	Jonas Paulsson	2017-11-06	1	-0/+5
\| \| \| \| \| \| \| \| \|	SystemZ can do division and remainder in a single instruction for scalar integer types, which is now reflected by returning true in this hook for those cases. Review: Ulrich Weigand llvm-svn: 317477
*	[SystemZ, LoopStrengthReduce]	Jonas Paulsson	2017-07-21	1	-0/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch makes LSR generate better code for SystemZ in the cases of memory intrinsics, Load->Store pairs or comparison of immediate with memory. In order to achieve this, the following common code changes were made: * New TTI hook: LSRWithInstrQueries(), which defaults to false. Controls if LSR should do instruction-based addressing evaluations by calling isLegalAddressingMode() with the Instruction pointers. * In LoopStrengthReduce: handle address operands of memset, memmove and memcpy as address uses, and call isFoldableMemAccessOffset() for any LSRUse::Address, not just loads or stores. SystemZ changes: * isLSRCostLess() implemented with Insns first, and without ImmCost. * New function supportedAddressingMode() that is a helper for TTI methods looking at Instructions passed via pointers. Review: Ulrich Weigand, Quentin Colombet https://reviews.llvm.org/D35262 https://reviews.llvm.org/D35049 llvm-svn: 308729
*	[SystemZ] Add support for IBM z14 processor (2/3)	Ulrich Weigand	2017-07-17	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This adds support for the new 32-bit vector float instructions of z14. This includes: - Enabling the instructions for the assembler/disassembler. - CodeGen for the instructions, including new LLVM intrinsics. - Scheduler description support for the instructions. - Update to the vector cost function calculations. In general, CodeGen support for the new v4f32 instructions closely matches support for the existing v2f64 instructions. llvm-svn: 308195
*	[IR] Add Type::isIntOrIntVectorTy(unsigned) similar to the existing ↵	Craig Topper	2017-07-09	1	-2/+1
\| \| \| \| \| \|	isIntegerTy(unsigned), but also works for vectors. llvm-svn: 307492
*	[IR] Make use of ↵	Craig Topper	2017-07-09	1	-1/+1
\| \| \| \| \| \|	Type::isPtrOrPtrVectorTy/isIntOrIntVectorTy/isFPOrFPVectorTy to shorten code. NFC llvm-svn: 307491
*	[LoopUnroll] Pass SCEV to getUnrollingPreferences hook. NFCI.	Geoff Berry	2017-06-28	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	Reviewers: sanjoy, anna, reames, apilipenko, igor-laevsky, mkuper Subscribers: jholewinski, arsenm, mzolotukhin, nemanjai, nhaehnle, javed.absar, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D34531 llvm-svn: 306554
*	Const correctness for TTI::getRegisterBitWidth	Daniel Neilson	2017-06-12	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The method TargetTransformInfo::getRegisterBitWidth() is declared const, but the type erasing implementation classes (TargetTransformInfo::Concept & TargetTransformInfo::Model) that were introduced by Chandler in https://reviews.llvm.org/D7293 do not have the method declared const. This is an NFC to tidy up the const consistency between TTI and its implementation. Reviewers: chandlerc, rnk, reames Reviewed By: reames Subscribers: reames, jfb, arsenm, dschuff, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, llvm-commits Differential Revision: https://reviews.llvm.org/D33903 llvm-svn: 305189
*	[SystemZ] Modelling of costs of divisions with a constant power of 2.	Jonas Paulsson	2017-05-17	1	-1/+33
\| \| \| \| \| \| \| \|	Such divisions will eventually be implemented with shifts which should be reflected in the cost function. Review: Ulrich Weigand llvm-svn: 303254
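As background for why such divisions are cheap, the shift sequences can be sketched as below. This shows the general technique only; it is not the code SystemZ emits, and `udivPow2` and `sdivPow2` are hypothetical helper names. The signed version assumes 32-bit values, a shift amount in the range 0 to 30, and an arithmetic right shift for negative values (guaranteed since C++20, and the usual behavior before that).

```cpp
#include <cstdint>

// Unsigned division by 2^k is a single logical right shift.
uint32_t udivPow2(uint32_t x, unsigned k) { return x >> k; }

// Signed division by 2^k, rounding toward zero like the C++ '/' operator.
// A plain arithmetic shift rounds toward negative infinity, so negative
// inputs are first biased by 2^k - 1.
int32_t sdivPow2(int32_t x, unsigned k) {
  int32_t bias = (x >> 31) & ((int32_t{1} << k) - 1); // 2^k - 1 if x < 0, else 0
  return (x + bias) >> k;
}
```

For example, `sdivPow2(-7, 1)` matches `-7 / 2`, whereas an unbiased `-7 >> 1` would round down to -4.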
*	[SystemZ] Properly check number of operands in getCmpOpsType()	Jonas Paulsson	2017-05-03	1	-3/+4
\| \| \| \| \| \| \| \|	The number of operands must be checked to be 2 when finding the case of a logic combination, e.g. 'and' of two compares. Review: Ulrich Weigand llvm-svn: 302022
*	[SystemZ] Remove incorrect assert in SystemZTTIImpl	Jonas Paulsson	2017-04-27	1	-1/+0
\| \| \| \| \| \| \| \| \|	In getCmpSelInstrCost(), CondTy may actually be scalar while ValTy is a vector when LoopVectorizer is the caller. Therefore the assert that CondTy must be a vector type whenever ValTy is one was wrong and is now removed. Review: Ulrich Weigand llvm-svn: 301533
*	[SystemZ] TargetTransformInfo cost functions implemented.	Jonas Paulsson	2017-04-12	1	-4/+545
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	getArithmeticInstrCost(), getShuffleCost(), getCastInstrCost(), getCmpSelInstrCost(), getVectorInstrCost(), getMemoryOpCost(), getInterleavedMemoryOpCost() implemented. Interleaved access vectorization enabled. BasicTTIImpl::getCastInstrCost() improved to check for legal extending loads, in which case the cost of the z/sext instruction becomes 0. Review: Ulrich Weigand, Renato Golin. https://reviews.llvm.org/D29631 llvm-svn: 300052
*	[SystemZ] Implementation of getUnrollingPreferences().	Jonas Paulsson	2016-09-28	1	-0/+57
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This commit enables more unrolling for SystemZ by implementing the SystemZTargetTransformInfo::getUnrollingPreferences() method. It has been found that it is better to only unroll moderately, so the DefaultUnrollRuntimeCount has been moved into UnrollingPreferences in order to set this to a lower value for SystemZ (4). Reviewers: Evgeny Stupachenko, Ulrich Weigand. https://reviews.llvm.org/D24451 llvm-svn: 282570
*	[TTI] Make the cost APIs in TargetTransformInfo consistently use 'int'	Chandler Carruth	2015-08-05	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	rather than 'unsigned' for their costs. For something like costs in particular there is a natural "negative" value, that of savings or saved cost. As a consequence, there is a lot of code that subtracts or creates negative values based on cost, all of which is prone to awkwardness or bugs when dealing with an unsigned type. Similarly, we never want these values to wrap, as that would cause Very Bad code generation (likely perceived as an infinite loop as we try to emit over 2^32 instructions or some such insanity). All around 'int' seems a much better fit for these basic metrics. I've added asserts to ensure that at least the TTI interface never returns negative numbers here. If we ever have a use case for negative numbers, we can remove this, but this way a bug where someone used '-1' to produce a 'very large' cost will be caught by the assert. This passes all tests, and is also UBSan clean. No functional change intended. Differential Revision: http://reviews.llvm.org/D11741 llvm-svn: 244080
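The wrap-around hazard this message describes can be reproduced in a few lines. The helpers below are hypothetical and exist only for illustration; `checkedCost` mimics the spirit of the added asserts and is not the TTI code itself.

```cpp
#include <cassert>
#include <climits>

// With a signed type, a saving shows up as a negative difference.
int costDeltaSigned(int newCost, int oldCost) { return newCost - oldCost; }

// With an unsigned type, the same subtraction wraps modulo 2^N and
// turns a small saving into an enormous "cost".
unsigned costDeltaUnsigned(unsigned newCost, unsigned oldCost) {
  return newCost - oldCost;
}

// Hypothetical interface-level guard: reject negative costs so that a
// '-1' used to mean "very large" is caught instead of propagating.
int checkedCost(int cost) {
  assert(cost >= 0 && "cost returned through the interface must be non-negative");
  return cost;
}
```

Here `costDeltaSigned(3, 5)` is -2, while `costDeltaUnsigned(3, 5)` is `UINT_MAX - 1`.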
*	[SystemZ] Add CodeGen support for integer vector types	Ulrich Weigand	2015-05-05	1	-0/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is the first of a series of patches to add CodeGen support exploiting the instructions of the z13 vector facility. This patch adds support for the native integer vector types (v16i8, v8i16, v4i32, v2i64). When the vector facility is present, we default to the new vector ABI. This is characterized by two major differences: - Vector types are passed/returned in vector registers (except for unnamed arguments of a variable-argument list function). - Vector types are at most 8-byte aligned. The reason for the choice of 8-byte vector alignment is that the hardware is able to efficiently load vectors at 8-byte alignment, and the ABI only guarantees 8-byte alignment of the stack pointer, so requiring any higher alignment for vectors would require dynamic stack re-alignment code. However, for compatibility with old code that may use vector types, when not using the vector facility, the old alignment rules (vector types are naturally aligned) remain in use. These alignment rules are not only implemented at the C language level (implemented in clang), but also at the LLVM IR level. This is done by selecting a different DataLayout string depending on whether the vector ABI is in effect or not. Based on a patch by Richard Sandiford. llvm-svn: 236521
*	[SystemZ] Use POPCNT instruction on z196	Ulrich Weigand	2015-03-31	1	-0/+9
\| \| \| \| \| \| \| \| \| \|	We already exploit a number of instructions specific to z196, but not yet POPCNT. Add support for the population-count facility, MC support for the POPCNT instruction, CodeGen support for using POPCNT, and implement the getPopcntSupport TargetTransformInfo hook. llvm-svn: 233689