bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[NFC][PowerPC] Refactor classifyGlobalReference	Jinsong Ji	2019-09-20	1	-2/+1
\| \| \| \| \| \| \| \| \| \|	We always(and only) check the NLP flag after calling classifyGlobalReference to see whether it is accessed indirectly. Refactor to code to use isGVIndirectSym instead. llvm-svn: 372417
*	[PowerPC] Remove the SPE4RC register class and instead add f32 to the GPRC ↵	Craig Topper	2019-09-12	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	register class. Summary: Since the SPE4RC register class contains an identical set of registers and an identical spill size to the GPRC class its slightly confusing the tablegen emitter. It's preventing the GPRC_and_GPRC_NOR0 synthesized register class from inheriting VTs and AltOrders from GPRC or GPRC_NOR0. This is because SPE4C is found first in the super register class list when inheriting these properties and it doesn't set the VTs or AltOrders the same way as GPRC or GPRC_NOR0. This patch replaces all uses of GPE4RC with GPRC and allows GPRC and GPRC_NOR0 to contain f32. The test changes here are because the AltOrders are being inherited to GPRC_NOR0 now. Found while trying to determine if getCommonSubClass needs to take a VT argument. It was originally added to support fp128 on x86-64, I've changed some things about that so that it might be needed anymore. But a PowerPC test crashed without it and I think its due to this subclass issue. Reviewers: jhibbits, nemanjai, kbarton, hfinkel Subscribers: wuzish, nemanjai, mehdi_amini, hiraditya, kbarton, MaskRay, dexonsmith, jsji, shchenz, steven.zhang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67513 llvm-svn: 371779
*	[PowerPC] Add combined ELF ABI and 32/64 bit queries to the subtarget. [NFC]	Sean Fertile	2019-08-22	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	A lot of places in the code combine checks for both ABI (SVR4/Darwin/AIX) and addressing mode (64-bit vs 32-bit). In an attempt to make some of the code more readable I've added a couple functions that combine checking for the ELF abi and 64-bit/32-bit code at once. As we add more AIX support I intend to add similar functions for the AIX ABI. Differential Revision: https://reviews.llvm.org/D65814 llvm-svn: 369658
*	Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM	Daniel Sanders	2019-08-15	1	-7/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This clang-tidy check is looking for unsigned integer variables whose initializer starts with an implicit cast from llvm::Register and changes the type of the variable to llvm::Register (dropping the llvm:: where possible). Partial reverts in: X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister X86FixupLEAs.cpp - Some functions return unsigned and arguably should be MCRegister X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister HexagonBitSimplify.cpp - Function takes BitTracker::RegisterRef which appears to be unsigned& MachineVerifier.cpp - Ambiguous operator==() given MCRegister and const Register PPCFastISel.cpp - No Register::operator-=() PeepholeOptimizer.cpp - TargetInstrInfo::optimizeLoadInstr() takes an unsigned& MachineTraceMetrics.cpp - MachineTraceMetrics lacks a suitable constructor Manual fixups in: ARMFastISel.cpp - ARMEmitLoad() now takes a Register& instead of unsigned& HexagonSplitDouble.cpp - Ternary operator was ambiguous between unsigned/Register HexagonConstExtenders.cpp - Has a local class named Register, used llvm::Register instead of Register. PPCFastISel.cpp - PPCEmitLoad() now takes a Register& instead of unsigned& Depends on D65919 Reviewers: arsenm, bogner, craig.topper, RKSimon Reviewed By: arsenm Subscribers: RKSimon, craig.topper, lenary, aemerson, wuzish, jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65962 llvm-svn: 369041
*	[NFC][PowerPC]Change ADDIStocHA to ADDIStocHA8 to follow 64-bit naming ↵	Jason Liu	2019-07-22	1	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	convention Summary: Since we are planning to add ADDIStocHA for 32bit in later patch, we decided to change 64bit one first to follow naming convention with 8 behind opcode. Patch by: Xiangling_L Differential Revision: https://reviews.llvm.org/D64814 llvm-svn: 366731
*	[AIX] Implement function descriptor on SDAG	Jason Liu	2019-06-06	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: (1) Function descriptor on AIX On AIX, a called routine may have 2 distinct symbols associated with it: * A function descriptor (Name) * A function entry point (.Name) The descriptor structure on AIX is the same as those in the ELF V1 ABI: * The address of the entry point of the function. * The TOC base address for the function. * The environment pointer. The descriptor symbol uses the same name as the source level function in C. The function entry point is analogous to the symbol we would generate for a function in a non-descriptor-based ABI, except that it is renamed by prepending a ".". Which symbol gets referenced depends on the context: * Taking the address of the function references the descriptor symbol. * Calling the function references the entry point symbol. (2) Speaking of implementation on AIX, for direct function call target, we create proper MCSymbol SDNode(e.g . ".foo") while constructing SDAG to replace original TargetGlobalAddress SDNode. Then down the path, we can take advantage of this MCSymbol. Patch by: Xiangling_L Reviewed by: sfertile, hubert.reinterpretcast, jasonliu, syzaara Differential Revision: https://reviews.llvm.org/D62532 llvm-svn: 362735
*	[PowerPC] [PowerPC] Enhance the fast selection of fptoi & fptrunc ↵	Kang Zhang	2019-02-25	1	-4/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	instruction and clean up related asserts Summary: Fast selection of llvm fptoi & fptrunc instructions is not handled well about VSX instruction support. We'd use VSX float convert integer instruction instead of non-vsx float convert integer instruction if the operand register class is VSSRC or VSFRC because i32 and i64 are mapped to VSSRC and VSFRC correspondingly if VSX feature is openeded. For float trunc instruction, we do this silimar work like float convert integer instruction to try to use VSX instruction. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D58430 llvm-svn: 354762
*	[PowerPC] [NFC] Create a helper function to copy register to particular ↵	Zi Xuan Wu	2019-01-30	1	-35/+18
\| \| \| \| \| \| \| \| \| \| \| \|	register class at PPCFastISel Make copy register code as common function as following. unsigned copyRegToRegClass(const TargetRegisterClass *ToRC, unsigned SrcReg, unsigned Flag = 0, unsigned SubReg = 0); Differential Revision: https://reviews.llvm.org/D57368 llvm-svn: 352596
*	[PPC] Include tablegenerated PPCGenCallingConv.inc once	Reid Kleckner	2019-01-29	1	-18/+0
\| \| \| \| \| \| \| \| \|	Move the CC analysis implementation to its own .cpp file instead of duplicating it and artificually using functions in PPCISelLowering.cpp and PPCFastISel.cpp. Follow-up to the same change done for X86, ARM, and AArch64. llvm-svn: 352444
*	[PowerPC] Enhance the fast selection of cmp instruction and clean up related ↵	Zi Xuan Wu	2019-01-25	1	-3/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	asserts Fast selection of llvm icmp and fcmp instructions is not handled well about VSX instruction support. We'd use VSX float comparison instruction instead of non-vsx float comparison instruction if the operand register class is VSSRC or VSFRC because i32 and i64 are mapped to VSSRC and VSFRC correspondingly if VSX feature is opened. If the target does not have corresponding VSX instruction comparison for some type, just copy VSX-related register to common float register class and use non-vsx comparison instruction. Differential Revision: https://reviews.llvm.org/D57078 llvm-svn: 352174
*	Update the file headers across all of the LLVM projects in the monorepo	Chandler Carruth	2019-01-19	1	-4/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636
*	Recommit "[PowerPC] Fix assert from machine verify pass that unmatched ↵	Zi Xuan Wu	2019-01-10	1	-13/+24
\| \| \| \| \| \| \| \| \| \|	register class about fcmp selection in fast-isel" This re-commit r350685. Differential Revision: https://reviews.llvm.org/D55686 llvm-svn: 350799
*	Revert "[PowerPC] Fix assert from machine verify pass that unmatched ↵	Zi Xuan Wu	2019-01-09	1	-20/+13
\| \| \| \| \| \| \| \| \| \|	register class about fcmp selection in fast-isel" This reverts commit r350685. See compile assert in compiler-rt. llvm-svn: 350693
*	[PowerPC] Fix assert from machine verify pass that unmatched register class ↵	Zi Xuan Wu	2019-01-09	1	-13/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	about fcmp selection in fast-isel Bad machine code: Illegal virtual register for instruction function: TestULE basic block: %bb.0 entry (0x1000a39b158) instruction: %2:crrc = FCMPUD %1:vsfrc, %3:f8rc operand 1: %1:vsfrc Fix assert about missing match between fcmp instruction and register class. We should use vsx related cmp instruction xvcmpudp instead of fcmpu when vsx is opened. add -verifymachineinstrs option into related test cases to enable the verify pass. Differential Revision: https://reviews.llvm.org/D55686 llvm-svn: 350685
*	FastIsel: take care to update iterators when removing instructions.	Tim Northover	2018-12-17	1	-1/+2
\| \| \| \| \| \| \| \| \| \|	We keep a few iterators into the basic block we're selecting while performing FastISel. Usually this is fine, but occasionally code wants to remove already-emitted instructions. When this happens we have to be careful to update those iterators so they're not pointint at dangling memory. llvm-svn: 349365
*	Fix clang -Wimplicit-fallthrough warnings across llvm, NFC	Reid Kleckner	2018-11-01	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch should not introduce any behavior changes. It consists of mostly one of two changes: 1. Replacing fall through comments with the LLVM_FALLTHROUGH macro 2. Inserting 'break' before falling through into a case block consisting of only 'break'. We were already using this warning with GCC, but its warning behaves slightly differently. In this patch, the following differences are relevant: 1. GCC recognizes comments that say "fall through" as annotations, clang doesn't 2. GCC doesn't warn on "case N: foo(); default: break;", clang does 3. GCC doesn't warn when the case contains a switch, but falls through the outer case. I will enable the warning separately in a follow-up patch so that it can be cleanly reverted if necessary. Reviewers: alexfh, rsmith, lattner, rtrieu, EricWF, bollu Differential Revision: https://reviews.llvm.org/D53950 llvm-svn: 345882
*	DAG: Add calling convention argument to calling convention funcs	Matt Arsenault	2018-07-28	1	-1/+1
\| \| \| \| \| \| \| \|	This seems like a pretty glaring omission, and AMDGPU wants to treat kernels differently from other calling conventions. llvm-svn: 338194
*	Fix build failures from r337347, found by clang	Justin Hibbits	2018-07-18	1	-7/+5
\| \| \| \| \| \| \| \| \| \| \|	* Delete a no-longer-used override, and mark the other getRegisterTypeForCallingConv() as override. * SPE only supports i32, not i64, as the internal type, so simply remove the type check, so that DestReg and Opc are provably always set. GCC 6.4 did not warn about either of the above. llvm-svn: 337350
*	Introduce codegen for the Signal Processing Engine	Justin Hibbits	2018-07-18	1	-29/+119
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The Signal Processing Engine (SPE) is found on NXP/Freescale e500v1, e500v2, and several e200 cores. This adds support targeting the e500v2, as this is more common than the e500v1, and is in SoCs still on the market. This patch is very intrusive because the SPE is binary incompatible with the traditional FPU. After discussing with others, the cleanest solution was to make both SPE and FPU features on top of a base PowerPC subset, so all FPU instructions are now wrapped with HasFPU predicates. Supported by this are: * Code generation following the SPE ABI at the LLVM IR level (calling conventions) * Single- and Double-precision math at the level supported by the APU. Still to do: * Vector operations * SPE intrinsics As this changes the Callee-saved register list order, one test, which tests the precise generated code, was updated to account for the new register order. Reviewed by: nemanjai Differential Revision: https://reviews.llvm.org/D44830 llvm-svn: 337347
*	Re-commit : [PowerPC] Add handling for ColdCC calling convention and a pass ↵	Zaara Syeda	2018-01-30	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	to mark candidates with coldcc attribute. This recommits r322721 reverted due to sanitizer memory leak build bot failures. Original commit message: This patch adds support for the coldcc calling convention for Power. This changes the set of non-volatile registers. It includes a pass to stress test the implementation by marking all static directly called functions with the coldcc attribute through the option -enable-coldcc-stress-test. It also includes an option, -ppc-enable-coldcc, to add the coldcc attribute to functions which are cold at all call sites based on BlockFrequencyInfo when the containing function does not call any non cold functions. Differential Revision: https://reviews.llvm.org/D38413 llvm-svn: 323778
*	[NFC] fix trivial typos in comments and documents	Hiroshi Inoue	2018-01-29	1	-1/+1
\| \| \| \| \| \|	"to to" -> "to" llvm-svn: 323628
*	Revert [PowerPC] This reverts commit rL322721	Zaara Syeda	2018-01-17	1	-2/+0
\| \| \| \| \| \|	Failing build bots. Revert the commit now. llvm-svn: 322748
*	[PowerPC] Add handling for ColdCC calling convention and a pass to mark	Zaara Syeda	2018-01-17	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	candidates with coldcc attribute. This patch adds support for the coldcc calling convention for Power. This changes the set of non-volatile registers. It includes a pass to stress test the implementation by marking all static directly called functions with the coldcc attribute through the option -enable-coldcc-stress-test. It also includes an option, -ppc-enable-coldcc, to add the coldcc attribute to functions which are cold at all call sites based on BlockFrequencyInfo when the containing function does not call any non cold functions. Differential Revision: https://reviews.llvm.org/D38413 llvm-svn: 322721
*	[CodeGen] Print register names in lowercase in both MIR and debug output	Francis Visoiu Mistrih	2017-11-28	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \|	As part of the unification of the debug format and the MIR format, always print registers as lowercase. * Only debug printing is affected. It now follows MIR. Differential Revision: https://reviews.llvm.org/D40417 llvm-svn: 319187
*	Fix a bunch more layering of CodeGen headers that are in Target	David Blaikie	2017-11-17	1	-1/+1
\| \| \| \| \| \| \| \|	All these headers already depend on CodeGen headers so moving them into CodeGen fixes the layering (since CodeGen depends on Target, not the other way around). llvm-svn: 318490
*	Delete Default and JITDefault code models	Rafael Espindola	2017-08-03	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	IMHO it is an antipattern to have a enum value that is Default. At any given piece of code it is not clear if we have to handle Default or if has already been mapped to a concrete value. In this case in particular, only the target can do the mapping and it is nice to make sure it is always done. This deletes the two default enum values of CodeModel and uses an explicit Optional<CodeModel> when it is possible that it is unspecified. llvm-svn: 309911
*	Sort the remaining #include lines in include/... and lib/....	Chandler Carruth	2017-06-06	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line with a #include and let it re-sort things according to the precise LLVM rules for include ordering baked into clang-format these days. I've reverted a number of files where the results of sorting includes isn't healthy. Either places where we have legacy code relying on particular include ordering (where possible, I'll fix these separately) or where we have particular formatting around #include lines that I didn't want to disturb in this patch. This patch is entirely mechanical. If you get merge conflicts or anything, just ignore the changes in this patch and run clang-format over your #include lines in the files. Sorry for any noise here, but it is important to keep these things stable. I was seeing an increasing number of patches with irrelevant re-ordering of #include lines because clang-format was used. This patch at least isolates that churn, makes it easy to skip when resolving conflicts, and gets us to a clean baseline (again). llvm-svn: 304787
*	[PowerPC] Eliminate integer compare instructions - vol. 1	Nemanja Ivanovic	2017-05-11	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	This patch is the first in a series of patches to provide code gen for doing compares in GPRs when the compare result is required in a GPR. It adds the infrastructure to select GPR sequences for i1->i32 and i1->i64 extensions. This first patch handles equality comparison on i32 operands with the result sign or zero extended. Differential Revision: https://reviews.llvm.org/D31847 llvm-svn: 302810
*	Add extra operand to CALLSEQ_START to keep frame part set up previously	Serge Pavlov	2017-05-09	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Using arguments with attribute inalloca creates problems for verification of machine representation. This attribute instructs the backend that the argument is prepared in stack prior to CALLSEQ_START..CALLSEQ_END sequence (see http://llvm.org/docs/InAlloca.htm for details). Frame size stored in CALLSEQ_START in this case does not count the size of this argument. However CALLSEQ_END still keeps total frame size, as caller can be responsible for cleanup of entire frame. So CALLSEQ_START and CALLSEQ_END keep different frame size and the difference is treated by MachineVerifier as stack error. Currently there is no way to distinguish this case from actual errors. This patch adds additional argument to CALLSEQ_START and its target-specific counterparts to keep size of stack that is set up prior to the call frame sequence. This argument allows MachineVerifier to calculate actual frame size associated with frame setup instruction and correctly process the case of inalloca arguments. The changes made by the patch are: - Frame setup instructions get the second mandatory argument. It affects all targets that use frame pseudo instructions and touched many files although the changes are uniform. - Access to frame properties are implemented using special instructions rather than calls getOperand(N).getImm(). For X86 and ARM such replacement was made previously. - Changes that reflect appearance of additional argument of frame setup instruction. These involve proper instruction initialization and methods that access instruction arguments. - MachineVerifier retrieves frame size using method, which reports sum of frame parts initialized inside frame instruction pair and outside it. The patch implements approach proposed by Quentin Colombet in https://bugs.llvm.org/show_bug.cgi?id=27481#c1. It fixes 9 tests failed with machine verifier enabled and listed in PR27481. Differential Revision: https://reviews.llvm.org/D32394 llvm-svn: 302527
*	IR: Change the gep_type_iterator API to avoid always exposing the "current" ↵	Peter Collingbourne	2016-12-02	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	type. Instead, expose whether the current type is an array or a struct, if an array what the upper bound is, and if a struct the struct type itself. This is in preparation for a later change which will make PointerType derive from Type rather than SequentialType. Differential Revision: https://reviews.llvm.org/D26594 llvm-svn: 288458
*	[PowerPC] Zero-extend constants in FastISel	Hal Finkel	2016-09-04	1	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As it turns out, whether we zero-extend or sign-extend i8/i16 constants, which are illegal types promoted to i32 on PowerPC, is a choice constrained by assumptions within the infrastructure. Specifically, the logic in FunctionLoweringInfo::ComputePHILiveOutRegInfo assumes that constant PHI operands will be zero extended, and so, at least when materializing constants that are PHI operands, we must do the same. The rest of our fast-isel implementation does not appear to depend on the fact that we were sign-extending i8/i16 constants, and all other targets also appear to zero-extend small-bitwidth constants in fast-isel; we'll now do the same (we had been doing this only for i1 constants, and sign-extending the others). Fixes PR27721. llvm-svn: 280614
*	Reformat.	NAKAMURA Takumi	2016-08-22	1	-20/+20
\| \| \| \|	llvm-svn: 279409
*	Untabify.	NAKAMURA Takumi	2016-08-22	1	-2/+2
\| \| \| \|	llvm-svn: 279408
*	[PowerPC] Wrong fast-isel codegen for VSX floating-point loads	Ulrich Weigand	2016-08-05	1	-12/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There were two locations where fast-isel would generate a LFD instruction with a target register class VSFRC instead of F8RC when VSX was enabled. This can ccause invalid registers to be used in certain cases, like: lfd 36, ... instead of using a VSX load instruction. The wrong register number gets silently truncated, causing invalid code to be generated. The first place is PPCFastISel::PPCEmitLoad, which had multiple problems: 1.) The IsVSSRC and IsVSFRC flags are not initialized correctly, since they are computed from resultReg, which is still zero at this point in many cases. Fixed by changing the helper routines to operate on a register class instead of a register and passing in UseRC. 2.) Even with this fixed, Is64VSXLoad is still wrong due to a typo: bool Is32VSXLoad = IsVSSRC && Opc == PPC::LFS; bool Is64VSXLoad = IsVSSRC && Opc == PPC::LFD; The second line needs to use isVSFRC (like PPCEmitStore does). 3.) Once both the above are fixed, we're now generating a VSX instruction -- but an incorrect one, since generation of an indexed instruction with null index is wrong. Fixed by copying the code handling the same issue in PPCEmitStore. The second place is PPCFastISel::PPCMaterializeFP, where we would emit an LFD to load a constant from the literal pool, and use the wrong result register class. Fixed by hardcoding a F8RC class even on systems supporting VSX. Fixes: https://llvm.org/bugs/show_bug.cgi?id=28630 Differential Revision: https://reviews.llvm.org/D22632 llvm-svn: 277823
*	Add back some dead code.	Rafael Espindola	2016-06-21	1	-0/+14
\| \| \| \| \| \| \|	It was there just to avoid warnings. Add a LLVM_ATTRIBUTE_UNUSED attribute so that it doesn't produce warnings with gcc 6. llvm-svn: 273308
*	Delete some dead code.	Rafael Espindola	2016-06-21	1	-14/+0
\| \| \| \| \| \|	Found by gcc 6. llvm-svn: 273303
*	[PowerPC] fix register alignment for long double type	Strahinja Petrovic	2016-05-09	1	-0/+1
\| \| \| \| \| \| \| \| \|	This patch fixes register alignment for long double type in soft float mode. Before this patch alignment was 8 and this patch changes it to 4. Differential Revision: http://reviews.llvm.org/D18034 llvm-svn: 268909
*	CXX_FAST_TLS calling convention: performance improvement for PPC64	Chuang-Yu Cheng	2016-04-08	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is the same change on PPC64 as r255821 on AArch64. I have even borrowed his commit message. The access function has a short entry and a short exit, the initialization block is only run the first time. To improve the performance, we want to have a short frame at the entry and exit. We explicitly handle most of the CSRs via copies. Only the CSRs that are not handled via copies will be in CSR_SaveList. Frame lowering and prologue/epilogue insertion will generate a short frame in the entry and exit according to CSR_SaveList. The majority of the CSRs will be handled by register allcoator. Register allocator will try to spill and reload them in the initialization block. We add CSRsViaCopy, it will be explicitly handled during lowering. 1> we first set FunctionLoweringInfo->SplitCSR if conditions are met (the target supports it for the given machine function and the function has only return exits). We also call TLI->initializeSplitCSR to perform initialization. 2> we call TLI->insertCopiesSplitCSR to insert copies from CSRsViaCopy to virtual registers at beginning of the entry block and copies from virtual registers to CSRsViaCopy at beginning of the exit blocks. 3> we also need to make sure the explicit copies will not be eliminated. Author: Tom Jablin (tjablin) Reviewers: hfinkel kbarton cycheng http://reviews.llvm.org/D17533 llvm-svn: 265781
*	[PowerPC] Correctly compute 64-bit offsets in fast isel	Ulrich Weigand	2016-03-31	1	-6/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	PPCSimplifyAddress contains this code: IntegerType OffsetTy = ((VT == MVT::i32) ? Type::getInt32Ty(Context) : Type::getInt64Ty(Context)); to determine the type to be used for an index register, if one needs to be created. However, the "VT" here is the type of the data being loaded or stored, not* the type of an address. This means that if a data element of type i32 is accessed using an index that does not not fit into 32 bits, a wrong address is computed here. Note that PPCFastISel is only ever used on 64-bit currently, so the type of an address is actually always MVT::i64. Other parts of the code, even in this same PPCSimplifyAddress routine, already rely on that fact. Thus, this patch changes the code to simply unconditionally use Type::getInt64Ty(*Context) as OffsetTy. llvm-svn: 265023
*	[PowerPC] Remove incorrect use of COPY_TO_REGCLASS in fast isel	Ulrich Weigand	2016-03-31	1	-4/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The fast isel pass currently emits a COPY_TO_REGCLASS node to convert from a F4RC to a F8RC register class during conversion of a floating-point number to integer. There is actually no support in the common code instruction printers to emit COPY_TO_REGCLASS nodes, so the PowerPC back-end has special code there to simply ignore COPY_TO_REGCLASS. This is correct if and only if the source and destination registers of COPY_TO_REGCLASS are the same (except for the different register class). But nothing guarantees this to be the case, and if the register allocator does end up allocating source and destination to different registers after all, the back-end simply generates incorrect code. I've included a test case that shows such incorrect code generation. However, it seems that COPY_TO_REGCLASS is actually not intended to be used at the MI layer at all. It is used during SelectionDAG, but always lowered to a plain COPY before emitting MI. Other back-end's fast isel passes never emit COPY_TO_REGCLASS at all. I suspect it is simply wrong for the PowerPC back-end to emit it here. This patch changes the PowerPC back-end to directly emit COPY instead of COPY_TO_REGCLASS and removes the special handling in the instruction printers. Differential Revision: http://reviews.llvm.org/D18605 llvm-svn: 265020
*	[PPC, FastISel] Fix ordered/unordered fcmp	Tim Shen	2016-03-17	1	-7/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For fcmp, major concern about the following 6 cases is NaN result. The comparison result consists of 4 bits, indicating lt, eq, gt and un (unordered), only one of which will be set. The result is generated by fcmpu instruction. However, bc instruction only inspects one of the first 3 bits, so when un is set, bc instruction may jump to to an undesired place. More specifically, if we expect an unordered comparison and un is set, we expect to always go to true branch; in such case UEQ, UGT and ULT still give false, which are undesired; but UNE, UGE, ULE happen to give true, since they are tested by inspecting !eq, !lt, !gt, respectively. Similarly, for ordered comparison, when un is set, we always expect the result to be false. In such case OGT, OLT and OEQ is good, since they are actually testing GT, LT, and EQ respectively, which are false. OGE, OLE and ONE are tested through !lt, !gt and !eq, and these are true. llvm-svn: 263753
*	Fix for PR26180	Nemanja Ivanovic	2016-02-29	1	-3/+3
\| \| \| \| \| \| \| \| \| \|	Corresponds to Phabricator review: http://reviews.llvm.org/D16592 This fix includes both an update to how we handle the "generic" CPU on LE systems as well as Anton's fix for the Fast Isel issue. llvm-svn: 262233
*	Fix for PR 26356	Nemanja Ivanovic	2016-02-04	1	-5/+4
\| \| \| \| \| \| \| \| \|	Using the load immediate only when the immediate (whether signed or unsigned) can fit in a 16-bit signed field. Namely, from -32768 to 32767 for signed and 0 to 65535 for unsigned. This patch also ensures that we sign-extend under the right conditions. llvm-svn: 259840
*	Fix for PR 26381	Nemanja Ivanovic	2016-02-03	1	-1/+1
\| \| \| \| \| \|	Simple fix - Constant values were not being sign extended in FastIsel. llvm-svn: 259645
*	Refactor common code for PPC fast isel load immediate selection.	Eric Christopher	2016-01-29	1	-9/+5
\| \| \| \|	llvm-svn: 259178
*	Since LI/LIS sign extend the constant passed into the instruction we should	Eric Christopher	2016-01-29	1	-2/+3
\| \| \| \| \| \| \| \| \|	check that the sign extended constant fits into 16-bits if we want a zero extended value, otherwise go ahead and put it together piecemeal. Fixes PR26356. llvm-svn: 259177
*	Fix up conditional formatting.	Eric Christopher	2016-01-29	1	-5/+4
\| \| \| \|	llvm-svn: 259176
*	Refactor: Simplify boolean conditional return statements in lib/Target/PowerPC	Alexander Kornienko	2015-12-28	1	-4/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Use clang-tidy to simplify boolean conditional return statements Reviewers: uweigand, rafael, wschmidt Subscribers: craig.topper, llvm-commits Patch by Richard Thomson! Differential Revision: http://reviews.llvm.org/D9984 llvm-svn: 256493
*	Weak non-function symbols were being accessed directly, which is	Eric Christopher	2015-11-20	1	-8/+4
\| \| \| \| \| \| \| \| \| \|	incorrect, as the chosen representative of the weak symbol may not live with the code in question. Always indirect the access through the TOC instead. Patch by Kyle Butt! llvm-svn: 253708
*	FastISel: Use finishCondBranch() for ARM,Mips,PowerPC FastISel	Matthias Braun	2015-08-26	1	-2/+1
\| \| \| \| \| \|	Note that after this change branch probabilities are preserved now. llvm-svn: 245998