path: root/llvm/lib/Target/PowerPC
Commit message | Author | Age | Files | Lines
* [PowerPC] Implement atomic NAND operations as actual NAND | Ulrich Weigand | 2014-07-08 | 1 | -4/+4
| | | | | | | | | | | This changes the implementation of atomic NAND operations from "a & ~b" (compatible with GCC < 4.4) to actual "~(a & b)" (compatible with GCC >= 4.4). This is in line with the common-code and ARM back-end change implemented in r212433. llvm-svn: 212547
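The difference between the two NAND conventions named above can be sketched in a few lines of C++. The function names `legacyNand` and `trueNand` are made up for this illustration and do not appear in the patch; the sketch only shows the value computed, not the atomic load-reserve/store-conditional loop the back-end emits.

```cpp
#include <cstdint>

// Meaning used by GCC < 4.4: clear in `a` every bit that is set in `b`.
constexpr std::uint32_t legacyNand(std::uint32_t a, std::uint32_t b) {
  return a & ~b;
}

// Meaning used by GCC >= 4.4, adopted by this commit: a true bitwise NAND.
constexpr std::uint32_t trueNand(std::uint32_t a, std::uint32_t b) {
  return ~(a & b);
}

// The two definitions disagree for most inputs.
static_assert(legacyNand(0xF0u, 0x3Cu) == 0xC0u, "a & ~b");
static_assert(trueNand(0xF0u, 0x3Cu) == 0xFFFFFFCFu, "~(a & b)");
```

Code compiled against one convention and linked with code expecting the other would silently compute different values, which is presumably why the back-ends were switched together.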
* [PowerPC] Fix no-assert build | Ulrich Weigand | 2014-07-07 | 1 | -0/+1
| | | | | | | r212476 caused a compile failure (unused variable) in a non-assertion build ... llvm-svn: 212477
* [PowerPC] Fix "byval align" arguments | Ulrich Weigand | 2014-07-07 | 1 | -67/+62
| Arguments passed as "byval align" should get the specified alignment in the parameter save area. There was some code in PPCISelLowering.cpp that attempted to implement this, but it did not work correctly: while the code did update the ArgOffset value, it neglected to update the PtrOff value (which was already computed from the old ArgOffset), and it also neglected to update GPR_idx -- fields skipped due to alignment in the save area must likewise be skipped in GPRs.
|
| This patch fixes and simplifies this logic by:
| - handling argument offset alignment right at the beginning of argument processing, using a new helper routine CalculateStackSlotAlignment (this avoids having to update PtrOff and other derived values later on)
| - not tracking GPR_idx separately, but always computing the correct GPR_idx for each argument *from* its ArgOffset
| - removing some redundant computation in LowerFormalArguments: MinReservedArea must equal ArgOffset after argument processing, so there's no use in computing it twice.
|
| [This doesn't change the behavior of the current clang front-end, since that never creates "byval align" arguments at the moment. This will change with a follow-on patch, however.]
|
| llvm-svn: 212476
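The idea of aligning the offset first and then deriving the GPR index from it can be sketched as follows. This is an illustration only: `alignOffset` and `gprIndexFor` are hypothetical helpers, not the `CalculateStackSlotAlignment` routine from the patch, and the constants assume the 64-bit SVR4 (ELFv1) layout with a 48-byte linkage area and 8-byte parameter doublewords.

```cpp
#include <cassert>

// Assumed layout constants (64-bit SVR4, ELFv1); illustrative only.
constexpr unsigned PtrByteSize = 8;   // one parameter save area doubleword
constexpr unsigned LinkageSize = 48;  // linkage area preceding the save area

// Round Offset up to the next multiple of Align (a power of two).
constexpr unsigned alignOffset(unsigned Offset, unsigned Align) {
  return (Offset + Align - 1) & ~(Align - 1);
}

// Derive the GPR index from the (already aligned) argument offset, so that
// doublewords skipped in the save area are skipped in the GPRs as well.
inline unsigned gprIndexFor(unsigned ArgOffset) {
  assert(ArgOffset >= LinkageSize && "offset lies inside the linkage area");
  return (ArgOffset - LinkageSize) / PtrByteSize;
}
```

For example, after one 8-byte argument the next free offset is 56; a "byval align 16" argument is placed at `alignOffset(56, 16)`, which is 64, and `gprIndexFor(64)` yields index 2, so the GPR with index 1 is skipped along with the padding doubleword.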
* [DAG] Pass the argument list to the CallLoweringInfo via move semantics. NFCI. | Juergen Ributzka | 2014-07-01 | 1 | -1/+2
| | | | | | | | The argument list vector is never used after it has been passed to the CallLoweringInfo and moving it to the CallLoweringInfo is cleaner and pretty much as cheap as keeping a pointer to it. llvm-svn: 212135
* Add ops() method to SDNode that returns an ArrayRef<SDUse>. Use it to simplify some code. | Craig Topper | 2014-06-29 | 1 | -8/+6
| llvm-svn: 211993
* [PowerPC] Constrain base register in PPCRegisterInfo::resolveFrameIndex | Ulrich Weigand | 2014-06-27 | 1 | -0/+8
| I've run into a bug where current LLVM at -O0 (with fast-isel) generated invalid code like:
|
|     ld 0, 20936(1) # 8-byte Folded Reload
|     stw 12, 10348(0)
|     stw 12, 10344(0)
|
| The underlying vreg had been introduced as base register by the Local Stack Slot Allocation pass. That register was constrained to G8RC by PPCRegisterInfo::materializeFrameBaseRegister to match the ADDI instruction used to set it, but it was *not* constrained to G8RC_NOX0 to fit the *use* of the register in an address. That should have happened in PPCRegisterInfo::resolveFrameIndex. This patch adds an appropriate constrainRegClass call.
|
| Reviewed by Hal Finkel.
|
| llvm-svn: 211897
* Remove extraneous includes from the target machines. | Eric Christopher | 2014-06-26 | 1 | -4/+0
| | | | llvm-svn: 211800
* add ppc64/pwr8 as target | Will Schmidt | 2014-06-26 | 5 | -3/+18
| Includes handling DIR_PWR8 where appropriate.
|
| The P7Model Itinerary is currently tied in for use under the P8Model, and will be updated later.
|
| llvm-svn: 211779
* Move expression visitation logic up to MCStreamer. | Rafael Espindola | 2014-06-25 | 2 | -2/+2
| | | | | | Remove the duplicate from MCRecordStreamer. No functionality change. llvm-svn: 211714
* Simplify the visitation of target expressions. No functionality change. | Rafael Espindola | 2014-06-25 | 2 | -30/+4
| | | | llvm-svn: 211707
* [PPC64] Fix PR20071 (fctiduz generated for targets lacking that instruction) | Bill Schmidt | 2014-06-24 | 1 | -0/+4
| | | | | | | | | | | | | | | | | PR20071 identifies a problem in PowerPC's fast-isel implementation for floating-point conversion to integer. The fctiduz instruction was added in Power ISA 2.06 (i.e., Power7 and later). However, this instruction is being generated regardless of which 64-bit PowerPC target is selected. The intent is for fast-isel to punt to DAG selection when this instruction is not available. This patch implements that change. For testing purposes, the existing fast-isel-conversion.ll test adds a RUN line for -mcpu=970 and tests for the expected code generation. Additionally, the existing test fast-isel-conversion-p5.ll was found to be incorrectly expecting the unavailable instruction to be generated. I've removed these test variants since we have adequate coverage in fast-isel-conversion.ll. llvm-svn: 211627
* [PowerPC] Refactor getMinCallFrameSize / getMinCallArgumentsSize | Ulrich Weigand | 2014-06-23 | 4 | -51/+20
| As of r211495, the only remaining users of getMinCallFrameSize are in core ABI code (LowerFormalParameter / LowerCall). This is actually a good thing, since the details of the parameter save area are ABI specific. With the new ELFv2 ABI in particular, the rules defining the size of the save area will become significantly more complex, so it wouldn't make sense to implement those outside ABI code that has all required information.
|
| In preparation, this patch eliminates the getMinCallFrameSize (and associated getMinCallArgumentsSize) routines, and inlines them into all callers. Note that since nearly all call arguments are constant, this allows simplifying the inlined copies to a single line everywhere.
|
| No change in generated code expected.
|
| llvm-svn: 211497
* [PowerPC] Allow stack frames without parameter save area | Ulrich Weigand | 2014-06-23 | 2 | -3/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The PPCFrameLowering::determineFrameLayout routine currently ensures that every function that allocates a stack frame provides space for the parameter save area (via PPCFrameLowering::getMinCallFrameSize). This is actually not necessary. There may be functions that never call another routine but still allocate a frame; those do not require the parameter save area. In the future, with the ELFv2 ABI, even some routines that do call other functions do not need to allocate the parameter save area. While it is not a bug to allocate the parameter area when it is not needed, it is better to avoid it to save stack space. Note that when any particular function call requires the parameter save area, this space will already have been included by ABI code in the size the CALLSEQ_START insn is annotated with, and therefore included in the size returned by MFI->getMaxCallFrameSize(). This means that determineFrameLayout simply does not need to care about the parameter save area. (It still needs to ensure that every frame provides the linkage area.) This is implemented by this patch. Note that this exposed a bug in the new fast-isel code where the parameter area was *not* included in the CALLSEQ_START size; this is also fixed. A couple of test cases needed to be adapted for the new (smaller) stack frame size those tests now see. llvm-svn: 211495
* [PowerPC] Fix IsDarwin arg in PPCFrameLowering:: calls | Ulrich Weigand | 2014-06-23 | 1 | -5/+5
| | | | | | | | | | | | | | | | As remarked in the commit message to r211493, in several places throughout the 64-bit SVR4 ABI code there are calls to PPCFrameLowering::getLinkageSize and getMinCallFrameSize using an incorrect IsDarwin argument of "true". (Some of those were made explicit by the above refactoring patch, others have been there all along.) This patch fixes those places to pass "false" for IsDarwin. No change in generated code expected. llvm-svn: 211494
* [PowerPC] Refactor setMinReservedArea and CalculateParameterAndLinkageAreaSize | Ulrich Weigand | 2014-06-23 | 2 | -127/+108
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The PPCISelLowering.cpp routines PPCTargetLowering::setMinReservedArea and CalculateParameterAndLinkageAreaSize are currently used as subroutines from both 64-bit SVR4 and Darwin ABI code. However, the two ABIs are already quite different w.r.t. AltiVec conventions, and they will become more different when the ELFv2 ABI is supported. Also, in general it seems better to disentangle ABI support routines for different ABIs to avoid accidentally affecting one ABI when intending to change only the other. (Actually, the current code strictly speaking already contains a bug: these routines call PPCFrameLowering::getMinCallFrameSize and PPCFrameLowering::getLinkageSize with the IsDarwin parameter set to "true" even on 64-bit SVR4. This bug currently has no adverse effect since those routines always return the same for 64-bit SVR4 and 64-bit Darwin, but it still seems wrong ... I'll fix this in a follow-up commit shortly.) To remove this code sharing, I'm simply inlining both routines into all call sites (there are just two each, one for 64-bit SVR4 and one for Darwin), and simplifying due to constant parameters where possible. A small piece of code that *does* make sense to share is refactored into the new routine EnsureStackAlignment, now also called from 32-bit SVR4 ABI code. No change in generated code is expected. llvm-svn: 211493
* [PowerPC] Fix on-stack AltiVec arguments with 64-bit SVR4 | Ulrich Weigand | 2014-06-23 | 1 | -44/+29
| Current 64-bit SVR4 code seems to have some remnants of Darwin code in AltiVec argument handling. This had the effect that AltiVec arguments (or subsequent arguments) were not correctly placed in the parameter area in some cases.
|
| The correct behaviour with the 64-bit SVR4 ABI is:
| - All AltiVec arguments take up space in the parameter area, just like any other arguments, whether vararg or not.
| - They are always 16-byte aligned, skipping a parameter area doubleword (and the associated GPR, if any), if necessary.
|
| This patch implements the correct behaviour and adds a test case. (Verified against GCC behaviour via the ABI compat test suite.)
|
| llvm-svn: 211492
* [PowerPC] Fix small argument stack slot offset for LE | Ulrich Weigand | 2014-06-20 | 1 | -11/+20
| | | | | | | | | | | | | | When small arguments (structures < 8 bytes or "float") are passed in a stack slot in the ppc64 SVR4 ABI, they must reside in the least significant part of that slot. On BE, this means that an offset needs to be added to the stack address of the parameter, but on LE, the least significant part of the slot has the same address as the slot itself. This changes the PowerPC back-end ABI code to only add the small argument stack slot offset for BE. It also adds test cases to verify the correct behavior on both BE and LE. llvm-svn: 211368
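The rule for small arguments reduces to a one-line computation. The helper name `smallArgOffset` below is hypothetical and only restates the rule from the message; it is not the code from the patch.

```cpp
// Byte offset of a small argument inside its stack slot.
// The value must sit in the least significant part of the slot:
//  - big endian: least significant bytes are at the END of the slot
//  - little endian: least significant bytes are at the START of the slot
constexpr unsigned smallArgOffset(unsigned ArgSize, unsigned SlotSize,
                                  bool IsLittleEndian) {
  return (IsLittleEndian || ArgSize >= SlotSize) ? 0 : SlotSize - ArgSize;
}

// A 4-byte "float" in an 8-byte slot.
static_assert(smallArgOffset(4, 8, /*IsLittleEndian=*/false) == 4, "BE");
static_assert(smallArgOffset(4, 8, /*IsLittleEndian=*/true) == 0, "LE");
```

So a `float` passed on the stack is read from slot address plus 4 on big endian, and from the slot address itself on little endian.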
* [PowerPC] Remove unnecessary load of r12 in indirect call | Ulrich Weigand | 2014-06-18 | 1 | -4/+0
| | | | | | | | | | | | | | | When looking at the 64-bit SVR4 indirect call sequence, I noticed an unnecessary load of r12. And indeed the code says: // R12 must contain the address of an indirect callee. But this is not correct; in the 64-bit SVR4 (ELFv1) ABI, there is no need to load r12 at this point. It seems this code and comment is a remnant of code originally shared with the Darwin ABI ... This patch simply removes the unnecessary load. llvm-svn: 211203
* [PowerPC] Simplify and improve loading into TOC register | Ulrich Weigand | 2014-06-18 | 6 | -40/+25
| During an indirect function call sequence on the 64-bit SVR4 ABI, generated code must load and then restore the TOC register.
|
| This does not use a regular LOAD instruction since the TOC register r2 is marked as reserved. Instead, there are two special instruction patterns:
|
|     let RST = 2, DS = 2 in
|     def LDinto_toc: DSForm_1a<58, 0, (outs), (ins g8rc:$reg),
|                         "ld 2, 8($reg)", IIC_LdStLD,
|                         [(PPCload_toc i64:$reg)]>, isPPC64;
|
|     let RST = 2, DS = 10, RA = 1 in
|     def LDtoc_restore : DSForm_1a<58, 0, (outs), (ins),
|                         "ld 2, 40(1)", IIC_LdStLD,
|                         [(PPCtoc_restore)]>, isPPC64;
|
| Note that these not only restrict the destination of the load to r2, but they also restrict the *source* of the load to particular address combinations. The latter is a problem when we want to support the ELFv2 ABI, since there the TOC save slot is no longer at 40(1).
|
| This patch replaces those two instructions with a single instruction pattern that only hard-codes r2 as destination, but supports generic addresses as source. This will allow supporting the ELFv2 ABI, and also helps generate more efficient code for calls to absolute addresses (allowing simplification of the ppc64-calls.ll test case).
|
| llvm-svn: 211193
* [PowerPC] Do not use BLA with the 64-bit SVR4 ABI | Ulrich Weigand | 2014-06-18 | 1 | -7/+7
| The PowerPC back-end uses BLA to implement calls to functions at known-constant addresses, which is apparently used for certain system routines on Darwin.
|
| However, with the 64-bit SVR4 ABI, this is actually incorrect. An immediate function pointer value on this platform is not directly usable as a target address for BLA:
| - in the ELFv1 ABI, the function pointer value refers to the *function descriptor*, not the code address
| - in the ELFv2 ABI, the function pointer value refers to the global entry point, but BL(A) would only be correct when calling the *local* entry point
|
| This bug didn't show up since using immediate function pointer values is not usually done in the 64-bit SVR4 ABI in the first place. However, I ran into this issue with a certain use case of LLVM as JIT, where immediate function pointer values were used to implement callbacks from JITted code to helpers in statically compiled code.
|
| Fixed by simply not using BLA with the 64-bit SVR4 ABI.
|
| llvm-svn: 211174
* [PowerPC] Fix emitting instruction pairs on LE | Ulrich Weigand | 2014-06-18 | 1 | -9/+37
| My patch r204634 to emit instructions in little-endian format failed to handle those special cases where we emit a pair of instructions from a single LLVM MC instruction (like the bl; nop pairs used to implement the call sequence).
|
| In those cases, we still need to emit the "first" instruction (the one in the more significant word) first, on both big and little endian, and not swap them.
|
| llvm-svn: 211171
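The rule stated in this message (byte-swap each 32-bit instruction word, but keep the order of the two words) can be sketched as below. `emitWord` and `emitPair` are hypothetical names for this illustration, not the functions in the PowerPC MC code emitter; the encodings used in the example are the standard ones for `bl` with a zero displacement (0x48000001) and `nop` (0x60000000).

```cpp
#include <cstdint>
#include <vector>

// Append one 32-bit instruction word in the requested byte order.
void emitWord(std::vector<std::uint8_t> &Out, std::uint32_t Insn,
              bool IsLittleEndian) {
  for (unsigned i = 0; i < 4; ++i) {
    unsigned Shift = IsLittleEndian ? i * 8 : (3 - i) * 8;
    Out.push_back(static_cast<std::uint8_t>(Insn >> Shift));
  }
}

// Emit a two-instruction sequence. The instruction order is the same on
// both endiannesses; only the bytes inside each word are reversed on LE.
std::vector<std::uint8_t> emitPair(std::uint32_t First, std::uint32_t Second,
                                   bool IsLittleEndian) {
  std::vector<std::uint8_t> Out;
  emitWord(Out, First, IsLittleEndian);
  emitWord(Out, Second, IsLittleEndian);
  return Out;
}
```

For a `bl; nop` pair, the little-endian byte stream is `01 00 00 48` followed by `00 00 00 60`: the `bl` still comes first in memory, so it still executes first.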
* [PPC64] Fix PR19893 - improve code generation for local function addresses | Bill Schmidt | 2014-06-16 | 3 | -21/+25
| | | | | | | | | | | | | | | | | | | | | Rafael opened http://llvm.org/bugs/show_bug.cgi?id=19893 to track non-optimal code generation for forming a function address that is local to the compile unit. The existing code was treating both local and non-local functions identically. This patch fixes the problem by properly identifying local functions and generating the proper addis/addi code. I also noticed that Rafael's earlier changes to correct the surrounding code in PPCISelLowering.cpp were also needed for fast instruction selection in PPCFastISel.cpp, so this patch fixes that code as well. The existing test/CodeGen/PowerPC/func-addr.ll is modified to test the new code generation. I've added a -O0 run line to test the fast-isel code as well. Tested on powerpc64[le]-unknown-linux-gnu with no regressions. llvm-svn: 211056
* The hazard recognizer only needs a subtarget, not a target machine | Eric Christopher | 2014-06-13 | 2 | -7/+9
| | | | | | so make it take one. Fix up all users accordingly. llvm-svn: 210948
* Fix typo. | Eric Christopher | 2014-06-13 | 1 | -1/+1
| | | | llvm-svn: 210947
* Move the PPCSelectionDAGInfo off the TargetMachine and onto the | Eric Christopher | 2014-06-12 | 4 | -5/+6
| | | | | | subtarget. llvm-svn: 210854
* Make PPCSelectionDAGInfo take a DataLayout instead of a TargetMachine | Eric Christopher | 2014-06-12 | 3 | -7/+6
| | | | | | since that's all it needs. llvm-svn: 210853
* Move PPCTargetLowering off of the TargetMachine and onto the subtarget. | Eric Christopher | 2014-06-12 | 5 | -8/+11
| | | | llvm-svn: 210852
* Remove an extraneous this-> to access the subtarget. | Eric Christopher | 2014-06-12 | 1 | -1/+1
| | | | llvm-svn: 210849
* Rename PPCSubTarget to Subtarget in PPCTargetLowering for consistency. | Eric Christopher | 2014-06-12 | 2 | -126/+124
| | | | | | Also remove an extra local subtarget in the initialization functions. llvm-svn: 210848
* Move PPCJITInfo off of the TargetMachine and onto the subtarget. | Eric Christopher | 2014-06-12 | 6 | -30/+33
| | | | | | | Needed to migrate a few functions around to avoid circular header dependencies. llvm-svn: 210845
* Remove the use of TargetMachine from PPCJITInfo and replace with | Eric Christopher | 2014-06-12 | 3 | -6/+6
| | | | | | | the subtarget. Also remove unnecessary argument to the constructor at the same time, we already have access via the subtarget. llvm-svn: 210844
* Move PPCInstrInfo off of the target machine and onto the subtarget. | Eric Christopher | 2014-06-12 | 4 | -7/+11
| | | | llvm-svn: 210839
* Remove TargetMachine from PPCInstrInfo and all dependencies and | Eric Christopher | 2014-06-12 | 5 | -27/+29
| | | | | | replace with the current subtarget. llvm-svn: 210836
* Move DataLayout from the PPCTargetMachine to the subtarget. | Eric Christopher | 2014-06-12 | 4 | -40/+46
| | | | llvm-svn: 210824
* Move PPCFrameLowering into PPCSubtarget from PPCTargetMachine. Use | Eric Christopher | 2014-06-12 | 6 | -196/+211
| | | | | | | | the initializeSubtargetDependencies code to obtain an initialized subtarget and migrate a couple of subtarget using functions to the .cpp file to avoid circular includes. llvm-svn: 210822
* Remove duplicate copy of InstrItineraryData from the TargetMachine, | Eric Christopher | 2014-06-11 | 2 | -13/+8
| | | | | | it's already on the subtarget. llvm-svn: 210619
* [PPC64LE] Recognize shufflevector patterns for little endian | Bill Schmidt | 2014-06-10 | 3 | -84/+151
| | | | | | | | | | | | | | | | | Various masks on shufflevector instructions are recognizable as specific PowerPC instructions (vector pack, vector merge, etc.). There is existing code in PPCISelLowering.cpp to recognize the correct patterns for big endian code. The masks for these instructions are different for little endian code due to the big-endian numbering employed by these instructions. This patch adds the recognition code for little endian. I've added a new test case test/CodeGen/PowerPC/vec_shuffle_le.ll for this. The existing recognizer test (vec_shuffle.ll) is unnecessarily verbose and difficult to read, so I felt it was better to add a new test rather than modify the old one. llvm-svn: 210536
* [PPC64LE] Generate correct code for unaligned little-endian vector loads | Bill Schmidt | 2014-06-09 | 1 | -21/+39
| | | | | | | | | | | | | | | | | | | The code in PPCTargetLowering::PerformDAGCombine() that handles unaligned Altivec vector loads generates a lvsl followed by a vperm. As we've seen in numerous other places, the vperm instruction has a big-endian bias, and this is fixed for little endian by complementing the permute control vector and swapping the input operands. In this case the lvsl is providing the permute control vector. Rather than generating an lvsl and a complement operation, it is sufficient to generate an lvsr instruction instead. Thus for LE code generation we will generate an lvsr rather than an lvsl, and swap the other input arguments on the vperm. The existing test/CodeGen/PowerPC/vec_misalign.ll is updated to test the code generation for PPC64 and PPC64LE, in addition to the existing PPC32/G5 testing. llvm-svn: 210493
* [PPC64LE] Generate correct little-endian code for v16i8 multiply | Bill Schmidt | 2014-06-09 | 1 | -4/+16
| | | | | | | | | | | | | | | | The existing code in PPCTargetLowering::LowerMUL() for multiplying two v16i8 values assumes that vector elements are numbered in big-endian order. For little-endian targets, the vector element numbering is reversed, but the vmuleub, vmuloub, and vperm instructions still assume big-endian numbering. To account for this, we must adjust the permute control vector and reverse the order of the input registers on the vperm instruction. The existing test/CodeGen/PowerPC/vec_mul.ll is updated to be executed on powerpc64 and powerpc64le targets as well as the original powerpc (32-bit) target. llvm-svn: 210474
* AsmMatchers: Use unique_ptr to manage ownership of MCParsedAsmOperand | David Blaikie | 2014-06-08 | 1 | -55/+48
| | | | | | | | | | | | I saw at least a memory leak or two from inspection (on probably untested error paths) and r206991, which was the original inspiration for this change. I ran this idea by Jim Grosbach a few weeks ago & he was OK with it. Since it's a basically mechanical patch that seemed sufficient - usual post-commit review, revert, etc, as needed. llvm-svn: 210427
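Why `unique_ptr` ownership closes leaks on error paths can be shown with a small stand-alone sketch. `ParsedOperand`, `OperandVector` and `parseOperands` here are hypothetical stand-ins and not the LLVM MC classes; the only convention borrowed from the asm parsers is that a parse routine returns true on error.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a parsed assembly operand.
struct ParsedOperand {
  std::string Text;
};

// The container owns its operands; no manual delete is needed anywhere.
using OperandVector = std::vector<std::unique_ptr<ParsedOperand>>;

// Parse every token into an operand. Returns true on error.
// On the early-return path the operands parsed so far stay owned by the
// vector and are released when the caller's vector is destroyed.
bool parseOperands(const std::vector<std::string> &Tokens,
                   OperandVector &Operands) {
  for (const std::string &Tok : Tokens) {
    if (Tok.empty())
      return true; // error path: nothing to clean up by hand
    Operands.push_back(std::make_unique<ParsedOperand>(ParsedOperand{Tok}));
  }
  return false;
}
```

With a vector of raw pointers, the early return would require the caller to remember to delete every element, which is exactly the kind of rarely tested path where leaks hide.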
* Have TargetSelectionDAGInfo take a DataLayout initializer rather than | Eric Christopher | 2014-06-06 | 1 | -1/+1
| | | | | | a TargetMachine since the only thing it wants is DataLayout. llvm-svn: 210366
* [PPC64LE] Fix lowering of BUILD_VECTOR and SHUFFLE_VECTOR for little endian | Bill Schmidt | 2014-06-06 | 1 | -3/+34
| | | | | | | | | | | | | | | This patch fixes a couple of lowering issues for little endian PowerPC. The code for lowering BUILD_VECTOR contains a number of optimizations that are only valid for big endian. For now, we disable those optimizations for correctness. In the future, we will add analogous optimizations that are correct for little endian. When lowering a SHUFFLE_VECTOR to a VPERM operation, we again need to make the now-familiar transformation of swapping the input operands and complementing the permute control vector. Correctness of this transformation is tested by the accompanying test case. llvm-svn: 210336
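The "swap the input operands and complement the permute control vector" transformation can be checked with a small software model. Everything below is an illustrative sketch under stated assumptions: `vperm` models only the byte-select behaviour of the instruction (low five bits of each control byte, big-endian numbering), "complement" is taken to mean 31 minus the index, and `shuffleLE` and `shuffleViaVperm` are hypothetical names rather than LLVM functions.

```cpp
#include <array>
#include <cstdint>

// A 16-byte vector register, indexed in big-endian byte order.
using Vec = std::array<std::uint8_t, 16>;

// Model of vperm: result byte b selects from the 32-byte concatenation
// A||B using the low five bits of control byte b (big-endian numbering).
Vec vperm(const Vec &A, const Vec &B, const Vec &C) {
  Vec R{};
  for (unsigned b = 0; b < 16; ++b) {
    unsigned Sel = C[b] & 31u;
    R[b] = Sel < 16 ? A[Sel] : B[Sel - 16];
  }
  return R;
}

// Reference byte shuffle with little-endian element numbering:
// element i of a register is big-endian byte 15 - i, and mask values
// 0..15 select from V1, 16..31 from V2.
Vec shuffleLE(const Vec &V1, const Vec &V2, const std::array<int, 16> &Mask) {
  Vec R{};
  for (int i = 0; i < 16; ++i) {
    int M = Mask[i];
    R[15 - i] = M < 16 ? V1[15 - M] : V2[15 - (M - 16)];
  }
  return R;
}

// The same shuffle expressed through vperm: inputs swapped, and each
// control byte complemented (31 - index).
Vec shuffleViaVperm(const Vec &V1, const Vec &V2,
                    const std::array<int, 16> &Mask) {
  Vec C{};
  for (int i = 0; i < 16; ++i)
    C[15 - i] = static_cast<std::uint8_t>(31 - Mask[i]);
  return vperm(V2, V1, C);
}
```

For any mask with entries in the range 0 to 31, `shuffleViaVperm` and `shuffleLE` produce the same register contents in this model, which is the property the transformation relies on.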
* [PPC64LE] Temporarily disable VSX support in little-endian mode | Bill Schmidt | 2014-06-05 | 1 | -0/+5
| | | | | | | | | | This is a preliminary patch for the PowerPC64LE support. In stage 1 of the vector support, we will support the VMX (Altivec) instruction set, but will not yet support the VSX instructions. This is merely a staging issue to provide functional vector support as soon as possible. llvm-svn: 210271
* Omit else branch after return. | Eric Christopher | 2014-06-02 | 1 | -2/+4
| | | | llvm-svn: 210034
* Have the TLOF creation take a Triple rather than needing a subtarget. | Eric Christopher | 2014-05-31 | 1 | -3/+5
| | | | llvm-svn: 209937
* isSVR4ABI() returned !isDarwin() so just move that to the else | Eric Christopher | 2014-05-30 | 1 | -4/+1
| | | | | | block and remove the unreachable code. llvm-svn: 209927
* Rename CreateTLOF->createTLOF to match the rest of the file and the | Eric Christopher | 2014-05-30 | 1 | -4/+4
| | | | | | rest of the targets with a similar function name. llvm-svn: 209926
* [PPC] Use alias symbols in address computation. | Rafael Espindola | 2014-05-29 | 2 | -34/+17
| This seems to match what gcc does for ppc and what every other llvm backend does.
|
| This is a fixed version of r209638. The difference is to avoid any change in behavior for functions. The logic for using constant pools for function addresses is spread over a few places and we have to keep them in sync.
|
| llvm-svn: 209821
* [pr19844] Add thread local mode to aliases. | Rafael Espindola | 2014-05-28 | 1 | -7/+1
| | | | | | | | | | This matches gcc's behavior. It also seems natural given that aliases contain other properties that govern how it is accessed (linkage, visibility, dll storage). Clang still has to be updated to expose this feature to C. llvm-svn: 209759
* Revert "[PPC] Use alias symbols in address computation." | Hal Finkel | 2014-05-28 | 2 | -15/+34
| | | | | | | | | This reverts commit r209638 because it broke self-hosting on ppc64/Linux. (the Clang-compiled TableGen would segfault because it jumped to an invalid address from within _ZNK4llvm17ManagedStaticBase21RegisterManagedStaticEPFPvvEPFvS1_E (which is within the command-line parameter registration process)). llvm-svn: 209745