bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[PowerPC][LoopVectorize] Extend getRegisterClassForType to consider double ↵	Jinsong Ji	2020-01-06	1	-3/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	and other floating point type In https://reviews.llvm.org/D67148, we use isFloatTy to test floating point type, otherwise we return GPRRC. So 'double' will be classified as GPRRC, which is not accurate. This patch covers other floating point types. Reviewed By: #powerpc, nemanjai Differential Revision: https://reviews.llvm.org/D71946
*	Delete llvm.{sig,}{setjmp,longjmp} remnant after r136821	Fangrui Song	2019-12-27	1	-18/+0
\| \| \| \| \| \| \|	Intrinsic has incorrect argument type! i32 (i32) @llvm.setjmp wipes tear
*	[PowerPC] Add missing legalization for vector BSWAP	Nemanja Ivanovic	2019-12-17	1	-0/+14
\| \| \| \| \| \| \| \|	We somehow missed doing this when we were working on Power9 exploitation. This just adds the missing legalization and cost for producing the vector intrinsics. Differential revision: https://reviews.llvm.org/D70436
*	Rename TTI::getIntImmCost for instructions and intrinsics	Reid Kleckner	2019-12-11	1	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Soon Intrinsic::ID will be a plain integer, so this overload will not be possible. Rename both overloads to ensure that downstream targets observe this as a build failure instead of a runtime failure. Split off from D71320 Reviewers: efriedma Differential Revision: https://reviews.llvm.org/D71381
*	[ARM] Teach the Arm cost model that a Shift can be folded into other ↵	David Green	2019-12-09	1	-4/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	instructions This attempts to teach the cost model in Arm that code such as: %s = shl i32 %a, 3 %a = and i32 %s, %b Can under Arm or Thumb2 become: and r0, r1, r2, lsl #3 So the cost of the shift can essentially be free. To do this without trying to artificially adjust the cost of the "and" instruction, it needs to get the users of the shl and check if they are a type of instruction that the shift can be folded into. And so it needs to have access to the actual instruction in getArithmeticInstrCost, which if available is added as an extra parameter much like getCastInstrCost. We otherwise limit it to shifts with a single user, which should hopefully handle most of the cases. The list of instruction that the shift can be folded into include ADC, ADD, AND, BIC, CMP, EOR, MVN, ORR, ORN, RSB, SBC and SUB. This translates to Add, Sub, And, Or, Xor and ICmp. Differential Revision: https://reviews.llvm.org/D70966
*	[PowerPC] Add new Future CPU for PowerPC in LLVM	Stefan Pintilie	2019-11-27	1	-2/+4
\| \| \| \| \| \| \| \| \| \|	This is a continuation of D70262 The previous patch as listed above added the future CPU in clang. This patch adds the future CPU in the PowerPC backend. At this point the patch simply assumes that a future CPU will have the same characteristics as pwr9. Those characteristics may change with later patches. Differential Revision: https://reviews.llvm.org/D70333
*	[PowerPC] Rename DarwinDirective to CPUDirective (NFC)	Kit Barton	2019-11-25	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch renames the DarwinDirective (used to identify which CPU was defined) to CPUDirective. It also adds the getCPUDirective() method and replaces all uses of getDarwinDirective() with getCPUDirective(). Once this patch lands and downstream users of the getDarwinDirective() method have switched to the getCPUDirective() method, the old getDarwinDirective() method will be removed. Reviewers: nemanjai, hfinkel, power-llvm-team, jsji, echristo, #powerpc, jhibbits Reviewed By: hfinkel, jsji, jhibbits Subscribers: hiraditya, shchenz, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70352
*	[PowerPC] Do not emit HW loop if the body contains calls to lrint/lround	Nemanja Ivanovic	2019-10-28	1	-0/+4
\| \| \| \| \| \| \| \|	These two intrinsics are lowered to calls so should prevent the formation of CTR loops. In a subsequent patch, we will handle all currently known intrinsics and prevent the formation of HW loops if any unknown intrinsics are encountered. Differential revision: https://reviews.llvm.org/D68841
*	[Alignment][NFC] getMemoryOpCost uses MaybeAlign	Guillaume Chatelet	2019-10-25	1	-4/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: nemanjai, hiraditya, kbarton, MaskRay, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69307
*	recommit: [LoopVectorize][PowerPC] Estimate int and float register pressure ↵	Zi Xuan Wu	2019-10-12	1	-4/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	separately in loop-vectorize In loop-vectorize, interleave count and vector factor depend on target register number. Currently, it does not estimate different register pressure for different register class separately(especially for scalar type, float type should not be on the same position with int type), so it's not accurate. Specifically, it causes too many times interleaving/unrolling, result in too many register spills in loop body and hurting performance. So we need classify the register classes in IR level, and importantly these are abstract register classes, and are not the target register class of backend provided in td file. It's used to establish the mapping between the types of IR values and the number of simultaneous live ranges to which we'd like to limit for some set of those types. For example, POWER target, register num is special when VSX is enabled. When VSX is enabled, the number of int scalar register is 32(GPR), float is 64(VSR), but for int and float vector register both are 64(VSR). So there should be 2 kinds of register class when vsx is enabled, and 3 kinds of register class when VSX is NOT enabled. It runs on POWER target, it makes big(+~30%) performance improvement in one specific bmk(503.bwaves_r) of spec2017 and no other obvious degressions. Differential revision: https://reviews.llvm.org/D67148 llvm-svn: 374634
*	[System Model] [TTI] Update cache and prefetch TTI interfaces	David Greene	2019-10-09	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Re-apply 9fdfb045ae8b/r365676 with fixes for PPC and Hexagon. This involved moving defaults from TargetTransformInfoImplBase to MCSubtargetInfo. Rework the TTI cache and software prefetching APIs to prepare for the introduction of a general system model. Changes include: - Marking existing interfaces const and/or override as appropriate - Adding comments - Adding BasicTTIImpl interfaces that delegate to a subtarget implementation - Moving the default TargetTransformInfoImplBase implementation to a default MCSubtarget implementation Only a handful of targets use these interfaces currently: AArch64, Hexagon, PPC and SystemZ. AArch64 already has a custom subtarget implementation, so its custom TTI implementation is migrated to use the new facilities in BasicTTIImpl to invoke its custom subtarget implementation. The custom TTI implementations continue to exist for the other targets with this change. They are not moved over to subtarget-based implementations. The end goal is to have the default subtarget implementation defer to the system model defined by the target. With this change, the default MCSubtargetInfo implementation essentially returns the defaults TargetTransformInfoImplBase used to return. Existing users of TTI defaults will hit the defaults now in MCSubtargetInfo. Targets that define their own custom TTI implementations won't use the BasicTTIImpl implementations that route to the subtarget. Once system models are in place for the targets that use these interfaces, their custom TTI implementations can be removed. Differential Revision: https://reviews.llvm.org/D63614 llvm-svn: 374205
*	Revert "[LoopVectorize][PowerPC] Estimate int and float register pressure ↵	Jinsong Ji	2019-10-08	1	-31/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	separately in loop-vectorize" Also Revert "[LoopVectorize] Fix non-debug builds after rL374017" This reverts commit 9f41deccc0e648a006c9f38e11919f181b6c7e0a. This reverts commit 18b6fe07bcf44294f200bd2b526cb737ed275c04. The patch is breaking PowerPC internal build, checked with author, reverting on behalf of him for now due to timezone. llvm-svn: 374091
*	[LoopVectorize][PowerPC] Estimate int and float register pressure separately ↵	Zi Xuan Wu	2019-10-08	1	-4/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	in loop-vectorize In loop-vectorize, interleave count and vector factor depend on target register number. Currently, it does not estimate different register pressure for different register class separately(especially for scalar type, float type should not be on the same position with int type), so it's not accurate. Specifically, it causes too many times interleaving/unrolling, result in too many register spills in loop body and hurting performance. So we need classify the register classes in IR level, and importantly these are abstract register classes, and are not the target register class of backend provided in td file. It's used to establish the mapping between the types of IR values and the number of simultaneous live ranges to which we'd like to limit for some set of those types. For example, POWER target, register num is special when VSX is enabled. When VSX is enabled, the number of int scalar register is 32(GPR), float is 64(VSR), but for int and float vector register both are 64(VSR). So there should be 2 kinds of register class when vsx is enabled, and 3 kinds of register class when VSX is NOT enabled. It runs on POWER target, it makes big(+~30%) performance improvement in one specific bmk(503.bwaves_r) of spec2017 and no other obvious degressions. Differential revision: https://reviews.llvm.org/D67148 llvm-svn: 374017
*	Recommit [PowerPC] Update P9 vector costs for insert/extract	Roland Froese	2019-08-26	1	-0/+29
\| \| \| \| \| \| \|	Now that the v1i128 smin regression has been fixed, recommit the P9 cost updates from D60160. llvm-svn: 369952
*	Revert "[System Model] [TTI] Update cache and prefetch TTI interfaces"	David Greene	2019-07-10	1	-2/+2
\| \| \| \| \| \| \| \|	This broke some PPC prefetching tests. This reverts commit 9fdfb045ae8bb643ab0d0455dcf9ecaea3b1eb3c. llvm-svn: 365680
*	[System Model] [TTI] Update cache and prefetch TTI interfaces	David Greene	2019-07-10	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Rework the TTI cache and software prefetching APIs to prepare for the introduction of a general system model. Changes include: - Marking existing interfaces const and/or override as appropriate - Adding comments - Adding BasicTTIImpl interfaces that delegate to a subtarget implementation - Adding a default "no information" subtarget implementation Only a handful of targets use these interfaces currently: AArch64, Hexagon, PPC and SystemZ. AArch64 already has a custom subtarget implementation, so its custom TTI implementation is migrated to use the new facilities in BasicTTIImpl to invoke its custom subtarget implementation. The custom TTI implementations continue to exist for the other targets with this change. They are not moved over to subtarget-based implementations. The end goal is to have the default subtarget implementation defer to the system model defined by the target. With this change, the default subtarget implementation essentially returns "no information" for these interfaces. None of the existing users of TTI will hit that implementation because they define their own custom TTI implementations and won't use the BasicTTIImpl implementations. Once system models are in place for the targets that use these interfaces, their custom TTI implementations can be removed. Differential Revision: https://reviews.llvm.org/D63614 llvm-svn: 365676
*	[PowerPC] exclude ICmpZero in LSR if icmp can be replaced in later hardware ↵	Chen Zheng	2019-07-03	1	-0/+22
\| \| \| \| \| \| \| \| \|	loop. Differential Revision: https://reviews.llvm.org/D63477 llvm-svn: 364993
*	Revert Recommit [PowerPC] Update P9 vector costs for insert/extract element	Jordan Rupprecht	2019-07-01	1	-29/+0
\| \| \| \| \| \| \| \|	This reverts r364557 (git commit 9f7f5858fe46b8e706e87a83e2fd0a2678be619e) This crashes as reported on the commit thread. Repro instructions TBD. llvm-svn: 364876
*	Recommit [PowerPC] Update P9 vector costs for insert/extract element	Roland Froese	2019-06-27	1	-0/+29
\| \| \| \| \| \|	Recommit patch D60160 after regression fix patch D63463. llvm-svn: 364557
*	[ExpandMemCmp] Move all options to TargetTransformInfo.	Clement Courbet	2019-06-25	1	-11/+6
\| \| \| \| \| \|	Split off from D60318. llvm-svn: 364281
*	[NFC] move some hardware loop checking code to a common place for other using.	Chen Zheng	2019-06-19	1	-1/+1
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D63478 llvm-svn: 363758
*	[CodeGen] Generic Hardware Loop Support	Sam Parker	2019-06-07	1	-0/+344
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Patch which introduces a target-independent framework for generating hardware loops at the IR level. Most of the code has been taken from PowerPC CTRLoops and PowerPC has been ported over to use this generic pass. The target dependent parts have been moved into TargetTransformInfo, via isHardwareLoopProfitable, with HardwareLoopInfo introduced to transfer information from the backend. Three generic intrinsics have been introduced: - void @llvm.set_loop_iterations Takes as a single operand, the number of iterations to be executed. - i1 @llvm.loop_decrement(anyint) Takes the maximum number of elements processed in an iteration of the loop body and subtracts this from the total count. Returns false when the loop should exit. - anyint @llvm.loop_decrement_reg(anyint, anyint) Takes the number of elements remaining to be processed as well as the maximum numbe of elements processed in an iteration of the loop body. Returns the updated number of elements remaining. llvm-svn: 362774
*	Revert "[llvm] r359313 - [PowerPC] Update P9 vector costs for insert/extract ↵	David L. Jones	2019-05-01	1	-29/+0
\| \| \| \| \| \| \| \|	element" This causes segfaults during optimized builds. More details, including a reproducer, are on the llvm-commits thread for r359313. llvm-svn: 359648
*	Fix operator precedence warning. NFCI.	Simon Pilgrim	2019-04-29	1	-1/+2
\| \| \| \| \| \|	Reported in https://www.viva64.com/en/b/0629/ llvm-svn: 359469
*	[PowerPC] Update P9 vector costs for insert/extract element	Roland Froese	2019-04-26	1	-0/+29
\| \| \| \| \| \| \| \| \| \| \|	The PPC vector cost model values for insert/extract element reflect older processors that lacked vector insert/extract and move-to/move-from VSR instructions. Update getVectorInstrCost to give appropriate values for when the newer instructions are present. Differential Revision: https://reviews.llvm.org/D60160 llvm-svn: 359313
*	test commit (add blank line) NFC	Roland Froese	2019-02-01	1	-0/+1
\| \| \| \|	llvm-svn: 352897
*	[PowerPC] Update Vector Costs for P9	Nemanja Ivanovic	2019-01-26	1	-11/+46
\| \| \| \| \| \| \| \| \| \| \| \| \|	For the power9 CPU, vector operations consume a pair of execution units rather than one execution unit like a scalar operation. Update the target transform cost functions to reflect the higher cost of vector operations when targeting Power9. Patch by RolandF. Differential revision: https://reviews.llvm.org/D55461 llvm-svn: 352261
*	Update the file headers across all of the LLVM projects in the monorepo	Chandler Carruth	2019-01-19	1	-4/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636
*	[LV] Support vectorization of interleave-groups that require an epilog under	Dorit Nuzman	2018-10-31	1	-3/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	optsize using masked wide loads Under Opt for Size, the vectorizer does not vectorize interleave-groups that have gaps at the end of the group (such as a loop that reads only the even elements: a[2*i]) because that implies that we'll require a scalar epilogue (which is not allowed under Opt for Size). This patch extends the support for masked-interleave-groups (introduced by D53011 for conditional accesses) to also cover the case of gaps in a group of loads; Targets that enable the masked-interleave-group feature don't have to invalidate interleave-groups of loads with gaps; they could now use masked wide-loads and shuffles (if that's what the cost model selects). Reviewers: Ayal, hsaito, dcaballe, fhahn Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D53668 llvm-svn: 345705
*	recommit 344472 after fixing build failure on ARM and PPC.	Dorit Nuzman	2018-10-14	1	-1/+6
\| \| \| \|	llvm-svn: 344475
*	revert 344472 due to failures.	Dorit Nuzman	2018-10-14	1	-6/+1
\| \| \| \|	llvm-svn: 344473
*	[IAI,LV] Add support for vectorizing predicated strided accesses using masked	Dorit Nuzman	2018-10-14	1	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	interleave-group The vectorizer currently does not attempt to create interleave-groups that contain predicated loads/stores; predicated strided accesses can currently be vectorized only using masked gather/scatter or scalarization. This patch makes predicated loads/stores candidates for forming interleave-groups during the Loop-Vectorizer's analysis, and adds the proper support for masked-interleave- groups to the Loop-Vectorizer's planning and transformation stages. The patch also extends the TTI API to allow querying the cost of masked interleave groups (which each target can control); Targets that support masked vector loads/ stores may choose to enable this feature and allow vectorizing predicated strided loads/stores using masked wide loads/stores and shuffles. Reviewers: Ayal, hsaito, dcaballe, fhahn, javed.absar Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D53011 llvm-svn: 344472
*	Remove trailing space	Fangrui Song	2018-07-30	1	-1/+1
\| \| \| \| \| \|	sed -Ei 's/[[:space:]]+$//' include/*/.{def,h,td} lib/*/.{cpp,h} llvm-svn: 338293
*	Revert "[PowerPC] LSR tunings for PowerPC"	Stefan Pintilie	2018-03-09	1	-13/+0
\| \| \| \| \| \| \| \|	Revert the rest of the LST tune commit. It seems that the LSR tune commit breaks internal tests. Reverting the commit. llvm-svn: 327143
*	[PowerPC] LSR tunings for PowerPC	Stefan Pintilie	2018-03-07	1	-0/+13
\| \| \| \| \| \| \| \| \|	The purpose of this patch is to have LSR generate better code on Power. This is done by overriding isLSRCostLess. Differential Revision: https://reviews.llvm.org/D40855 llvm-svn: 326906
*	Re-commit : [PowerPC] Add handling for ColdCC calling convention and a pass ↵	Zaara Syeda	2018-01-30	1	-0/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	to mark candidates with coldcc attribute. This recommits r322721 reverted due to sanitizer memory leak build bot failures. Original commit message: This patch adds support for the coldcc calling convention for Power. This changes the set of non-volatile registers. It includes a pass to stress test the implementation by marking all static directly called functions with the coldcc attribute through the option -enable-coldcc-stress-test. It also includes an option, -ppc-enable-coldcc, to add the coldcc attribute to functions which are cold at all call sites based on BlockFrequencyInfo when the containing function does not call any non cold functions. Differential Revision: https://reviews.llvm.org/D38413 llvm-svn: 323778
*	Revert [PowerPC] This reverts commit rL322721	Zaara Syeda	2018-01-17	1	-13/+0
\| \| \| \| \| \|	Failing build bots. Revert the commit now. llvm-svn: 322748
*	[PowerPC] Add handling for ColdCC calling convention and a pass to mark	Zaara Syeda	2018-01-17	1	-0/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	candidates with coldcc attribute. This patch adds support for the coldcc calling convention for Power. This changes the set of non-volatile registers. It includes a pass to stress test the implementation by marking all static directly called functions with the coldcc attribute through the option -enable-coldcc-stress-test. It also includes an option, -ppc-enable-coldcc, to add the coldcc attribute to functions which are cold at all call sites based on BlockFrequencyInfo when the containing function does not call any non cold functions. Differential Revision: https://reviews.llvm.org/D38413 llvm-svn: 322721
*	Fix a bunch more layering of CodeGen headers that are in Target	David Blaikie	2017-11-17	1	-2/+2
\| \| \| \| \| \| \| \|	All these headers already depend on CodeGen headers so moving them into CodeGen fixes the layering (since CodeGen depends on Target, not the other way around). llvm-svn: 318490
*	[CodeGen][ExpandMemcmp] Allow memcmp to expand to vector loads (2).	Clement Courbet	2017-10-30	1	-3/+11
\| \| \| \| \| \| \| \| \| \| \| \|	- Targets that want to support memcmp expansions now return the list of supported load sizes. - Expansion codegen does not assume that all power-of-two load sizes smaller than the max load size are valid. For examples, this is not the case for x86(32bit)+sse2. Fixes PR34887. llvm-svn: 316905
*	The cost of splitting a large vector instruction is not being taken into ↵	Graham Yiu	2017-10-19	1	-0/+11
\| \| \| \| \| \| \| \| \| \|	account by the getUserCost function. This was leading to some loops being over unrolled. The cost of a vector instruction is now being multiplied by the cost of the type legalization. This will return a more accurate cost. Committing on behalf on Brad Nemanich (brad.nemanich@ibm.com) Differential Revision: https://reviews.llvm.org/D38961 llvm-svn: 316174
*	[CodeGenPrepare][NFC] Rename TargetTransformInfo::expandMemCmp -> ↵	Clement Courbet	2017-09-25	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	TargetTransformInfo::enableMemCmpExpansion. Summary: Right now there are two functions with the same name, one does the work and the other one returns true if expansion is needed. Rename TargetTransformInfo::expandMemCmp to make it more consistent with other members of TargetTransformInfo. Remove the unused Instruction* parameter. Differential Revision: https://reviews.llvm.org/D38165 llvm-svn: 314096
*	[LoopUnroll] Pass SCEV to getUnrollingPreferences hook. NFCI.	Geoff Berry	2017-06-28	1	-2/+2
\| \| \| \| \| \| \| \| \| \|	Reviewers: sanjoy, anna, reames, apilipenko, igor-laevsky, mkuper Subscribers: jholewinski, arsenm, mzolotukhin, nemanjai, nhaehnle, javed.absar, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D34531 llvm-svn: 306554
*	Const correctness for TTI::getRegisterBitWidth	Daniel Neilson	2017-06-12	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The method TargetTransformInfo::getRegisterBitWidth() is declared const, but the type erasing implementation classes (TargetTransformInfo::Concept & TargetTransformInfo::Model) that were introduced by Chandler in https://reviews.llvm.org/D7293 do not have the method declared const. This is an NFC to tidy up the const consistency between TTI and its implementation. Reviewers: chandlerc, rnk, reames Reviewed By: reames Subscribers: reames, jfb, arsenm, dschuff, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, llvm-commits Differential Revision: https://reviews.llvm.org/D33903 llvm-svn: 305189
*	[PowerPC] Correctly specify the cache line size for Power 7, 8 and 9.	Sean Fertile	2017-05-31	1	-3/+12
\| \| \| \| \| \| \| \| \| \|	Fixes PPCTTIImpl::getCacheLineSize() returning the wrong cache line size for newer ppc processors. Commiting on behalf of Stefan Pintilie. Differential Revision: https://reviews.llvm.org/D33656 llvm-svn: 304317
*	[PPC] Inline expansion of memcmp	Zaara Syeda	2017-05-31	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch does an inline expansion of memcmp. It changes the memcmp library call into an inline expansion when the size is known at compile time and is under a target specified threshold. This expansion is implemented in CodeGenPrepare and expands into straight line code. The target specifies a maximum load size and the expansion works by using this size to load the two sources, compare, and exit early if a difference is found. It also has a special case when the memcmp result is used in a compare to zero equality. Differential Revision: https://reviews.llvm.org/D28637 llvm-svn: 304313
*	[SystemZ] TargetTransformInfo cost functions implemented.	Jonas Paulsson	2017-04-12	1	-4/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	getArithmeticInstrCost(), getShuffleCost(), getCastInstrCost(), getCmpSelInstrCost(), getVectorInstrCost(), getMemoryOpCost(), getInterleavedMemoryOpCost() implemented. Interleaved access vectorization enabled. BasicTTIImpl::getCastInstrCost() improved to check for legal extending loads, in which case the cost of the z/sext instruction becomes 0. Review: Ulrich Weigand, Renato Golin. https://reviews.llvm.org/D29631 llvm-svn: 300052
*	[PPC] Give unaligned memory access lower cost on processor that supports it	Guozhi Wei	2017-02-17	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \|	Newer ppc supports unaligned memory access, it reduces the cost of unaligned memory access significantly. This patch handles this case in PPCTTIImpl::getMemoryOpCost. This patch fixes pr31492. Differential Revision: https://reviews.llvm.org/D28630 This is resubmit of r292680, which was reverted by r293092. The internal application failures were actually caused by a source code bug. llvm-svn: 295506
*	Revert "[PPC] Give unaligned memory access lower cost on processor that ↵	Daniel Jasper	2017-01-25	1	-4/+0
\| \| \| \| \| \| \| \| \| \|	supports it" This reverts commit r292680. It is causing significantly worse performance and test timeouts in our internal builds. I have already routed reproduction instructions your way. llvm-svn: 293092
*	[PPC] Give unaligned memory access lower cost on processor that supports it	Guozhi Wei	2017-01-20	1	-0/+4
\| \| \| \| \| \| \| \| \| \|	Newer ppc supports unaligned memory access, it reduces the cost of unaligned memory access significantly. This patch handles this case in PPCTTIImpl::getMemoryOpCost. This patch fixes pr31492. Differential Revision: https://reviews.llvm.org/D28630 llvm-svn: 292680