bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[PowerPC] Fix powerpcspe subtarget enablement in llvm backend	Justin Hibbits	2020-01-14	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: As currently written, -target powerpcspe will enable SPE regardless of disabling the feature later on in the command line. Instead, change this to just set a default CPU to 'e500' instead of a generic CPU. As part of this, add FeatureSPE to the e500 definition. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D72673
*	[PowerPC] Change default for unaligned FP access for older subtargets	Nemanja Ivanovic	2019-12-28	1	-1/+5
\| \| \| \| \| \| \| \| \| \| \|	This is a fix for https://bugs.llvm.org/show_bug.cgi?id=40554 Some CPU's trap to the kernel on unaligned floating point access and there are kernels that do not handle the interrupt. The program then fails with a SIGBUS according to the PR. This just switches the default for unaligned access to only allow it on recent server CPUs that are known to allow this. Differential revision: https://reviews.llvm.org/D71954
*	[NFC][PowerPC] Add the inheritable and additional features to make the ↵	QingShan Zhang	2019-12-03	1	-39/+81
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	processor definition more clear The old processor design assume that, all the old processor's feature must be inherited into future processor. That is not true as instruction fusion or some implementation defined features are not inheritable. What this patch did: * Rename the old "specific features" to "additional features" that keep the new added inheritable features. * Use the "specific features" to keep those features only for specific processor. * Add the "inheritable features" to keep all the features that inherited from early processor. Differential Revision: https://reviews.llvm.org/D70768
*	[PowerPC] Separate Features that are known to be Power9 specific from Future CPU	Stefan Pintilie	2019-11-27	1	-4/+13
\| \| \| \| \| \| \| \|	The Power 9 CPU has some features that are unlikely to be passed on to future versions of the CPU. This patch separates this out so that future CPU does not inherit them. Differential Revision: https://reviews.llvm.org/D70466
*	[PowerPC] Add new Future CPU for PowerPC in LLVM	Stefan Pintilie	2019-11-27	1	-0/+12
\| \| \| \| \| \| \| \| \| \|	This is a continuation of D70262 The previous patch as listed above added the future CPU in clang. This patch adds the future CPU in the PowerPC backend. At this point the patch simply assumes that a future CPU will have the same characteristics as pwr9. Those characteristics may change with later patches. Differential Revision: https://reviews.llvm.org/D70333
*	[PowerPC] Rename DarwinDirective to CPUDirective (NFC)	Kit Barton	2019-11-25	1	-24/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch renames the DarwinDirective (used to identify which CPU was defined) to CPUDirective. It also adds the getCPUDirective() method and replaces all uses of getDarwinDirective() with getCPUDirective(). Once this patch lands and downstream users of the getDarwinDirective() method have switched to the getCPUDirective() method, the old getDarwinDirective() method will be removed. Reviewers: nemanjai, hfinkel, power-llvm-team, jsji, echristo, #powerpc, jhibbits Reviewed By: hfinkel, jsji, jhibbits Subscribers: hiraditya, shchenz, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70352
*	[PowerPC][NFC] Remove unused (and unsupported) fusion feature bits.	Jinsong Ji	2019-06-27	1	-4/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	FeatureFusion bits was first introduced in https://reviews.llvm.org/rL253724. for add/load integer fusion for P8. The only use of `hasFusion` was https://reviews.llvm.org/rL255319. However, this was removed later in https://reviews.llvm.org/rL280440. So, there is NO any reference to fusion in code now. Leaving it there is misleading and confusing, so remove it for now. We can alwasy add back if we ever support fusion in the future. llvm-svn: 364581
*	[PowerPC][NFC] Fix comments for AltVSXFMARel mapping.	Jinsong Ji	2019-06-20	1	-3/+2
\| \| \| \|	llvm-svn: 363987
*	[PowerPC] Use the two-constant NR algorithm for refining estimates	Nemanja Ivanovic	2019-05-07	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \|	The single-constant algorithm produces infinities on a lot of denormal values. The precision of the two-constant algorithm is actually sufficient across the range of denormals. We will switch to that algorithm for now to avoid the infinities on denormals. In the future, we will re-evaluate the algorithm to find the optimal one for PowerPC. Differential revision: https://reviews.llvm.org/D60037 llvm-svn: 360144
*	[NFC][PowerPC] Custom PowerPC specific machine-scheduler	QingShan Zhang	2019-03-27	1	-1/+7
\| \| \| \| \| \| \| \| \| \|	This patch lays the groundwork for extending the generic machine scheduler by providing a PPC-specific implementation. There are no functional changes as this is an incremental patch that simply provides the necessary overrides which just encapsulate the behavior of the generic scheduler. Subsequent patches will add specific behavior. Differential Revision: https://reviews.llvm.org/D59284 llvm-svn: 357047
*	[PowerPC] Update Vector Costs for P9	Nemanja Ivanovic	2019-01-26	1	-1/+9
\| \| \| \| \| \| \| \| \| \| \| \| \|	For the power9 CPU, vector operations consume a pair of execution units rather than one execution unit like a scalar operation. Update the target transform cost functions to reflect the higher cost of vector operations when targeting Power9. Patch by RolandF. Differential revision: https://reviews.llvm.org/D55461 llvm-svn: 352261
*	Update the file headers across all of the LLVM projects in the monorepo	Chandler Carruth	2019-01-19	1	-4/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636
*	[llvm-exegesis][NFC] Add a way to declare the default counter binding for ↵	Clement Courbet	2018-11-09	1	-6/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	unbound CPUs for a target. Summary: This simplifies the code and moves everything to tablegen for consistency. This also prepares the ground for adding issue counters. Reviewers: gchatelet, john.brawn, jsji Subscribers: nemanjai, mgorny, javed.absar, kbarton, tschuett, llvm-commits Differential Revision: https://reviews.llvm.org/D54297 llvm-svn: 346489
*	Introduce codegen for the Signal Processing Engine	Justin Hibbits	2018-07-18	1	-14/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The Signal Processing Engine (SPE) is found on NXP/Freescale e500v1, e500v2, and several e200 cores. This adds support targeting the e500v2, as this is more common than the e500v1, and is in SoCs still on the market. This patch is very intrusive because the SPE is binary incompatible with the traditional FPU. After discussing with others, the cleanest solution was to make both SPE and FPU features on top of a base PowerPC subset, so all FPU instructions are now wrapped with HasFPU predicates. Supported by this are: * Code generation following the SPE ABI at the LLVM IR level (calling conventions) * Single- and Double-precision math at the level supported by the APU. Still to do: * Vector operations * SPE intrinsics As this changes the Callee-saved register list order, one test, which tests the precise generated code, was updated to account for the new register order. Reviewed by: nemanjai Differential Revision: https://reviews.llvm.org/D44830 llvm-svn: 337347
*	Add PowerPC e500(v2) core scheduler and directives.	Justin Hibbits	2018-07-18	1	-0/+6
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D44828 llvm-svn: 337345
*	[PowerPC] Secure PLT support	Strahinja Petrovic	2018-03-27	1	-0/+2
\| \| \| \| \| \| \| \|	This patch supports secure PLT mode for PowerPC 32 architecture. Differential Revision: https://reviews.llvm.org/D42112 llvm-svn: 328617
*	[MachineOperand][Target] MachineOperand::isRenamable semantics changes	Geoff Berry	2018-02-23	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Add a target option AllowRegisterRenaming that is used to opt in to post-register-allocation renaming of registers. This is set to 0 by default, which causes the hasExtraSrcRegAllocReq/hasExtraDstRegAllocReq fields of all opcodes to be set to 1, causing MachineOperand::isRenamable to always return false. Set the AllowRegisterRenaming flag to 1 for all in-tree targets that have lit tests that were effected by enabling COPY forwarding in MachineCopyPropagation (AArch64, AMDGPU, ARM, Hexagon, Mips, PowerPC, RISCV, Sparc, SystemZ and X86). Add some more comments describing the semantics of the MachineOperand::isRenamable function and how it is set and maintained. Change isRenamable to check the operand's opcode hasExtraSrcRegAllocReq/hasExtraDstRegAllocReq bit directly instead of relying on it being consistently reflected in the IsRenamable bit setting. Clear the IsRenamable bit when changing an operand's register value. Remove target code that was clearing the IsRenamable bit when changing registers/opcodes now that this is done conservatively by default. Change setting of hasExtraSrcRegAllocReq in AMDGPU target to be done in one place covering all opcodes that have constant pipe read limit restrictions. Reviewers: qcolombet, MatzeB Subscribers: aemerson, arsenm, jyknight, mcrosier, sdardis, nhaehnle, javed.absar, tpr, arichardson, kristof.beyls, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, niosHD, escha, nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D43042 llvm-svn: 325931
*	[Power9] Processor Model for Scheduling	Ehsan Amiri	2016-12-19	1	-3/+1
\| \| \| \| \| \| \| \|	PWR9 processor model for instruction scheduling. A subsequent patch will migrate PWR9 to Post RA MIScheduler. https://reviews.llvm.org/D24525 llvm-svn: 290102
*	[PowerPC] Refactor soft-float support, and enable PPC64 soft float	Hal Finkel	2016-10-02	1	-19/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change enables soft-float for PowerPC64, and also makes soft-float disable all vector instruction sets for both 32-bit and 64-bit modes. This latter part is necessary because the PPC backend canonicalizes many Altivec vector types to floating-point types, and so soft-float breaks scalarization support for many operations. Both for embedded targets and for operating-system kernels desiring soft-float support, it seems reasonable that disabling hardware floating-point also disables vector instructions (embedded targets without hardware floating point support are unlikely to have Altivec, etc. and operating system kernels desiring not to use floating-point registers to lower syscall cost are unlikely to want to use vector registers either). If someone needs this to work, we'll need to change the fact that we promote many Altivec operations to act on v4f32. To make it possible to disable Altivec when soft-float is enabled, hardware floating-point support needs to be expressed as a positive feature, like the others, and not a negative feature, because target features cannot have dependencies on the disabling of some other feature. So +soft-float has now become -hard-float. Fixes PR26970. llvm-svn: 283060
*	Adding missing directive for Power9.	Nemanja Ivanovic	2016-09-14	1	-1/+1
\| \| \| \| \| \| \| \|	There is currently no codegen for Power9 that depends on the directive so this is NFC for now but will be important in the future. This was missed in r268950 so I'm adding it now. llvm-svn: 281473
*	[PowerPC] Add support for -mlongcall	Hal Finkel	2016-08-30	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \|	The "long call" option forces the use of the indirect calling sequence for all calls (even those that don't really need it). GCC provides this option; This is helpful, under certain circumstances, for building very-large binaries, and some other specialized use cases. Fixes PR19098. llvm-svn: 280040
*	Use C++ comments for large block comment.	Eric Christopher	2016-06-23	1	-16/+17
\| \| \| \|	llvm-svn: 273526
*	[Power9] Add support for -mcpu=pwr9 in the back end	Nemanja Ivanovic	2016-05-09	1	-0/+5
\| \| \| \| \| \| \| \| \|	This patch corresponds to review: http://reviews.llvm.org/D19683 Simply adds the bits for being able to specify -mcpu=pwr9 to the back end. llvm-svn: 268950
*	[PowerPC] Basic support for P9 byte comparison and count trailing zero insns	Nemanja Ivanovic	2016-04-13	1	-7/+9
\| \| \| \| \| \| \| \| \| \|	This patch corresponds to review: http://reviews.llvm.org/D17850 This patch implements the following instructions: cmprb, cmpeqb, cnttzw, cnttzw., cnttzd, cnttzd. llvm-svn: 266228
*	[PowerPC] Basic support for P9 atomic loads and stores	Nemanja Ivanovic	2016-03-31	1	-0/+4
\| \| \| \| \| \| \| \| \| \|	This patch corresponds to review: http://reviews.llvm.org/D18032 This patch provides asm implementation for the following instructions: lwat, ldat, stwat, stdat, ldmx, mcrxrx llvm-svn: 265022
*	[PowerPC] Refactor popcnt[dw] target features	Hal Finkel	2016-03-29	1	-9/+11
\| \| \| \| \| \| \| \| \|	Instead of using two feature bits, one to indicate the availability of the popcnt[dw] instructions, and another to indicate whether or not they're fast, use a single enum. This allows more consistent control via target attribute strings, and via Clang's command line. llvm-svn: 264690
*	[PowerPC] On the A2, popcnt[dw] are very slow	Hal Finkel	2016-03-28	1	-4/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The A2 cores support the popcntw/popcntd instructions, but they're microcoded, and slower than our default software emulation. Specifically, popcnt[dw] take approximately 74 cycles, whereas our software emulation takes only 24-28 cycles. I've added a new target feature to indicate a slow popcnt[dw], instead of just removing the existing target feature from the a2/a2q processor models, because: 1. This allows us to return more accurate information via the TTI interface (I recognize that this currently makes no practical difference) 2. Is hopefully easier to understand (it allows the core's features to match its manual while still having the desired effect). llvm-svn: 264600
*	Power9] Implement new vsx instructions: compare and conversion	Kit Barton	2016-02-26	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change implements the following vsx instructions: Quad/Double-Precision Compare: xscmpoqp xscmpuqp xscmpexpdp xscmpexpqp xscmpeqdp xscmpgedp xscmpgtdp xscmpnedp xvcmpnedp(.) xvcmpnesp(.) Quad-Precision Floating-Point Conversion xscvqpdp(o) xscvdpqp xscvqpsdz xscvqpswz xscvqpudz xscvqpuwz xscvsdqp xscvudqp xscvdphp xscvhpdp xvcvhpsp xvcvsphp xsrqpi xsrqpix xsrqpxp 28 instructions Phabricator: http://reviews.llvm.org/D16709 llvm-svn: 262068
*	Define a feature for __float128 support in the PPC back end	Nemanja Ivanovic	2015-12-15	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch corresponds to review: http://reviews.llvm.org/D15117 In preparation for supporting IEEE Quad precision floating point, this patch simply defines a feature to specify the target supports this. For now, nothing is done with the target feature, we just don't want warnings from the Clang FE when a user specifies -mfloat128. Calling convention and other related work will add to this patch in the near future. llvm-svn: 255642
*	[Power PC] llvm soft float support for ppc32	Petar Jovanovic	2015-12-14	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \|	This is the second in a set of patches for soft float support for ppc32, it enables soft float operations. Patch by Strahinja Petrovic. Differential Revision: http://reviews.llvm.org/D13700 llvm-svn: 255516
*	[PowerPC] Don't generate mfocrf on the e500mc	Hal Finkel	2015-11-25	1	-1/+1
\| \| \| \| \| \| \| \| \|	The e500mc does not actually support the mfocrf instruction; update the processor definitions to reflect that fact. Patch by Tom Rix (with some test-case cleanup by me). llvm-svn: 254064
*	Power8 and later support fusing addis/addi and addis/ld instruction	Eric Christopher	2015-11-20	1	-1/+4
\| \| \| \| \| \| \| \| \|	pairs that use the same register to execute as a single instruction. No Functional Change Patch by Kyle Butt! llvm-svn: 253724
*	[AsmParser] Backends can parameterize ASM tokenization.	Colin LeMahieu	2015-11-09	1	-0/+1
\| \| \| \|	llvm-svn: 252439
*	Properly handle the mftb instruction.	Kit Barton	2015-06-16	1	-32/+47
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The mftb instruction was incorrectly marked as deprecated in the PPC Backend. Instead, it should not be treated as deprecated, but rather be implemented using the mfspr instruction. A similar patch was put into GCC last year. Details can be found at: https://sourceware.org/ml/binutils/2014-11/msg00383.html. This change will replace instances of the mftb instruction with the mfspr instruction for all CPUs except 601 and pwr3. This will also be the default behaviour. Additional details can be found in: https://llvm.org/bugs/show_bug.cgi?id=23680 Phabricator review: http://reviews.llvm.org/D10419 llvm-svn: 239827
*	[PowerPC] Disable part-word atomics on the P7	Hal Finkel	2015-04-11	1	-2/+2
\| \| \| \| \| \| \|	As it turns out, even though these are part of ISA 2.06, the P7 does not support them (or, at least, not any P7s we're tested so far). llvm-svn: 234686
*	Add direct moves to/from VSR and exploit them for FP/INT conversions	Nemanja Ivanovic	2015-04-11	1	-1/+5
\| \| \| \| \| \| \| \| \| \|	This patch corresponds to review: http://reviews.llvm.org/D8928 It adds direct move instructions to/from VSX registers to GPR's. These are exploited for FP <-> INT conversions. llvm-svn: 234682
*	Add LLVM support for remaining integer divide and permute instructions from ↵	Nemanja Ivanovic	2015-04-09	1	-36/+37
\| \| \| \| \| \| \| \| \| \| \|	ISA 2.06 This is the patch corresponding to review: http://reviews.llvm.org/D8406 It adds some missing instructions from ISA 2.06 to the PPC back end. llvm-svn: 234546
*	Add Hardware Transactional Memory (HTM) Support	Kit Barton	2015-03-25	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds Hardware Transaction Memory (HTM) support supported by ISA 2.07 (POWER8). The intrinsic support is based on GCC one [1], but currently only the 'PowerPC HTM Low Level Built-in Function' are implemented. The HTM instructions follows the RC ones and the transaction initiation result is set on RC0 (with exception of tcheck). Currently approach is to create a register copy from CR0 to GPR and comapring. Although this is suboptimal, since the branch could be taken directly by comparing the CR0 value, it generates code correctly on both test and branch and just return value. A possible future optimization could be elimitate the MFCR instruction to branch directly. The HTM usage requires a recently newer kernel with PPC HTM enabled. Tested on powerpc64 and powerpc64le. This is send along a clang patch to enabled the builtins and option switch. [1] https://gcc.gnu.org/onlinedocs/gcc/PowerPC-Hardware-Transactional-Memory-Built-in-Functions.html Phabricator Review: http://reviews.llvm.org/D8247 llvm-svn: 233204
*	Add support for part-word atomics for PPC	Nemanja Ivanovic	2015-03-10	1	-5/+7
\| \| \| \| \| \|	http://reviews.llvm.org/D8090#inline-67337 llvm-svn: 231843
*	Add LLVM support for PPC cryptography builtins	Nemanja Ivanovic	2015-03-04	1	-1/+4
\| \| \| \| \| \|	Review: http://reviews.llvm.org/D7955 llvm-svn: 231285
*	Test commit. Removed an unnecessary space	Nemanja Ivanovic	2015-03-04	1	-1/+1
\| \| \| \|	llvm-svn: 231257
*	Move ABI handling and 64-bitness to the PowerPC target machine.	Eric Christopher	2015-02-17	1	-10/+0
\| \| \| \| \| \| \|	This required changing how the computation of the ABI is handled and how some of the checks for ABI/target are done. llvm-svn: 229471
*	[PowerPC] Implement the vpopcnt instructions for POWER8	Bill Schmidt	2015-02-03	1	-4/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Patch by Kit Barton. Add the vector population count instructions for byte, halfword, word, and doubleword sizes. There are two major changes here: PPCISelLowering.cpp: Make CTPOP legal for vector types. PPCRegisterInfo.td: Added v2i64 to the VRRC register definition. This is needed for the doubleword variations of the integer ops that were added in P8. Test Plan Test the instruction vpcnt* encoding/decoding in ppc64-encoding-vmx.s Test the generation of the vpopcnt instructions for various vector data types. When adding the v2i64 type to the Vector Register set, I also needed to add the appropriate bit conversion patterns between v2i64 and the existing vector types. Testing for these conversions were also added in the test case by passing a different vector type as a parameter into the test functions. There is also a run step that will ensure the vpopcnt instructions are generated when the vsx feature is disabled. llvm-svn: 228046
*	[PowerPC] Reset the baseline for ppc64le to be equivalent to pwr8	Bill Schmidt	2015-01-25	1	-14/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Test by Nemanja Ivanovic. Since ppc64le implies POWER8 as a minimum, it makes sense that the same features are included. Since the pwr8 processor model will likely be getting new features until the implementation is complete, I created a new list to add these updates to. This will include them in both pwr8 and ppc64le. Furthermore, it seems that it would make sense to compose the feature lists for other processor models (pwr3 and up). Per discussion in the review, I will make this change in a subsequent patch. In order to test the changes, I've added an additional run step to test cases that specify -march=ppc64le -mcpu=pwr8 to omit the -mcpu option. Since the feature lists are the same, the behaviour should be unchanged. llvm-svn: 227053
*	[PowerPC] Loosen ELFv1 PPC64 func descriptor loads for indirect calls	Hal Finkel	2015-01-15	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Function pointers under PPC64 ELFv1 (which is used on PPC64/Linux on the POWER7, A2 and earlier cores) are really pointers to a function descriptor, a structure with three pointers: the actual pointer to the code to which to jump, the pointer to the TOC needed by the callee, and an environment pointer. We used to chain these loads, and make them opaque to the rest of the optimizer, so that they'd always occur directly before the call. This is not necessary, and in fact, highly suboptimal on embedded cores. Once the function pointer is known, the loads can be performed ahead of time; in fact, they can be hoisted out of loops. Now these function descriptors are almost always generated by the linker, and thus the contents of the descriptors are invariant. As a result, by default, we'll mark the associated loads as invariant (allowing them to be hoisted out of loops). I've added a target feature to turn this off, however, just in case someone needs that option (constructing an on-stack descriptor, casting it to a function pointer, and then calling it cannot be well-defined C/C++ code, but I can imagine some JIT-compilation system doing so). Consider this simple test: $ cat call.c typedef void (fp)(); void bar(fp x) { for (int i = 0; i < 1600000000; ++i) x(); } $ cat main.c typedef void (fp)(); void bar(fp x); void foo() {} int main() { bar(foo); } On the PPC A2 (the BG/Q supercomputer), marking the function-descriptor loads as invariant brings the execution time down to ~8 seconds from ~32 seconds with the loads in the loop. The difference on the POWER7 is smaller. Compiling with: gcc -std=c99 -O3 -mcpu=native call.c main.c : ~6 seconds [this is 4.8.2] clang -O3 -mcpu=native call.c main.c : ~5.3 seconds clang -O3 -mcpu=native call.c main.c -mno-invariant-function-descriptors : ~4 seconds (looks like we'd benefit from additional loop unrolling here, as a first guess, because this is faster with the extra loads) The -mno-invariant-function-descriptors will be added to Clang shortly. llvm-svn: 226207
*	[PPC64] Add support for the ICBT instruction on POWER8.	Bill Schmidt	2015-01-14	1	-12/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Patch by Kit Barton. Support for the ICBT instruction is currently present, but limited to embedded processors. This change adds a new FeatureICBT that can be used to identify whether the ICBT instruction is available on a specific processor. Two new tests are added: * Positive test to ensure the icbt instruction is present when using -mcpu=pwr8 * Negative test to ensure the icbt instruction is not generated when using -mcpu=pwr7 Both test cases use the Prefetch opcode in LLVM. They are based on the ppc64-prefetch.ll test case. llvm-svn: 226033
*	[PowerPC] Add support for the CMPB instruction	Hal Finkel	2015-01-03	1	-7/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Newer POWER cores, and the A2, support the cmpb instruction. This instruction compares its operands, treating each of the 8 bytes in the GPRs separately, returning a 'mask' result of 0 (for false) or -1 (for true) in each byte. Code generation support is added, in the form of a PPCISelDAGToDAG DAG-preprocessing routine, that recognizes patterns close to what the instruction computes (either exactly, or related by a constant masking operation), and generates the cmpb instruction (along with any necessary constant masking operation). This can be expanded if use cases arise. llvm-svn: 225106
*	Enable the P8Model entry	Will Schmidt	2014-12-17	1	-1/+1
\| \| \| \| \| \| \| \|	This was missed last time around, for the P8 Instruction Scheduling changes (223257). This will hook the P8Model entry in so those changes will actually be used. llvm-svn: 224452
*	Restore r223709 as it was meant to be, and enable FeatureP8Vector for P8	Bill Schmidt	2014-12-09	1	-2/+2
\| \| \| \|	llvm-svn: 223751
*	Revert r223709, "[PowerPC]Activate FeatureVSX for the Power target", to ↵	NAKAMURA Takumi	2014-12-09	1	-3/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	unbreak bots. CodeGen/PowerPC/vsx-p8.ll was failing. '+power8-vector' is not a recognized feature for this target (ignoring feature) llvm/test/CodeGen/PowerPC/vsx-p8.ll:33:14: error: expected string not found in input ; CHECK-REG: lxvw4x 34, 0, 3 ^ <stdin>:50:2: note: scanning from here .align 3 ^ <stdin>:61:2: note: possible intended match here lvx 3, 0, 3 ^ llvm-svn: 223729