summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/PowerPC
Commit message (Collapse)AuthorAgeFilesLines
* [PowerPC] Remove unnecessary load of r12 in indirect callUlrich Weigand2014-06-181-4/+0
| | | | | | | | | | | | | | | When looking at the 64-bit SVR4 indirect call sequence, I noticed an unnecessary load of r12. And indeed the code says: // R12 must contain the address of an indirect callee. But this is not correct; in the 64-bit SVR4 (ELFv1) ABI, there is no need to load r12 at this point. It seems this code and comment is a remnant of code originally shared with the Darwin ABI ... This patch simply removes the unnecessary load. llvm-svn: 211203
* [PowerPC] Simplify and improve loading into TOC registerUlrich Weigand2014-06-186-40/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During an indirect function call sequence on the 64-bit SVR4 ABI, generate code must load and then restore the TOC register. This does not use a regular LOAD instruction since the TOC register r2 is marked as reserved. Instead, the are two special instruction patterns: let RST = 2, DS = 2 in def LDinto_toc: DSForm_1a<58, 0, (outs), (ins g8rc:$reg), "ld 2, 8($reg)", IIC_LdStLD, [(PPCload_toc i64:$reg)]>, isPPC64; let RST = 2, DS = 10, RA = 1 in def LDtoc_restore : DSForm_1a<58, 0, (outs), (ins), "ld 2, 40(1)", IIC_LdStLD, [(PPCtoc_restore)]>, isPPC64; Note that these not only restrict the destination of the load to r2, but they also restrict the *source* of the load to particular address combinations. The latter is a problem when we want to support the ELFv2 ABI, since there the TOC save slot is no longer at 40(1). This patch replaces those two instructions with a single instruction pattern that only hard-codes r2 as destination, but supports generic addresses as source. This will allow supporting the ELFv2 ABI, and also helps generate more efficient code for calls to absolute addresses (allowing simplification of the ppc64-calls.ll test case). llvm-svn: 211193
* [PowerPC] Do not use BLA with the 64-bit SVR4 ABIUlrich Weigand2014-06-181-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | The PowerPC back-end uses BLA to implement calls to functions at known-constant addresses, which is apparently used for certain system routines on Darwin. However, with the 64-bit SVR4 ABI, this is actually incorrect. An immediate function pointer value on this platform is not directly usable as a target address for BLA: - in the ELFv1 ABI, the function pointer value refers to the *function descriptor*, not the code address - in the ELFv2 ABI, the function pointer value refers to the global entry point, but BL(A) would only be correct when calling the *local* entry point This bug didn't show up since using immediate function pointer values is not usually done in the 64-bit SVR4 ABI in the first place. However, I ran into this issue with a certain use case of LLVM as JIT, where immediate function pointer values were uses to implement callbacks from JITted code to helpers in statically compiled code. Fixed by simply not using BLA with the 64-bit SVR4 ABI. llvm-svn: 211174
* [PowerPC] Fix emitting instruction pairs on LEUlrich Weigand2014-06-181-9/+37
| | | | | | | | | | | | | My patch r204634 to emit instructions in little-endian format failed to handle those special cases where we emit a pair of instructions from a single LLVM MC instructions (like the bl; nop pairs used to implement the call sequence). In those cases, we still need to emit the "first" instruction (the one in the more significant word) first, on both big and little endian, and not swap them. llvm-svn: 211171
* [PPC64] Fix PR19893 - improve code generation for local function addressesBill Schmidt2014-06-163-21/+25
| | | | | | | | | | | | | | | | | | | | | Rafael opened http://llvm.org/bugs/show_bug.cgi?id=19893 to track non-optimal code generation for forming a function address that is local to the compile unit. The existing code was treating both local and non-local functions identically. This patch fixes the problem by properly identifying local functions and generating the proper addis/addi code. I also noticed that Rafael's earlier changes to correct the surrounding code in PPCISelLowering.cpp were also needed for fast instruction selection in PPCFastISel.cpp, so this patch fixes that code as well. The existing test/CodeGen/PowerPC/func-addr.ll is modified to test the new code generation. I've added a -O0 run line to test the fast-isel code as well. Tested on powerpc64[le]-unknown-linux-gnu with no regressions. llvm-svn: 211056
* The hazard recognizer only needs a subtarget, not a target machineEric Christopher2014-06-132-7/+9
| | | | | | so make it take one. Fix up all users accordingly. llvm-svn: 210948
* Fix typo.Eric Christopher2014-06-131-1/+1
| | | | llvm-svn: 210947
* Move the PPCSelectionDAGInfo off the TargetMachine and onto theEric Christopher2014-06-124-5/+6
| | | | | | subtarget. llvm-svn: 210854
* Make PPCSelectionDAGInfo take a DataLayout instead of a TargetMachineEric Christopher2014-06-123-7/+6
| | | | | | since that's all it needs. llvm-svn: 210853
* Move PPCTargetLowering off of the TargetMachine and onto the subtarget.Eric Christopher2014-06-125-8/+11
| | | | llvm-svn: 210852
* Remove an extraneous this-> to access the subtarget.Eric Christopher2014-06-121-1/+1
| | | | llvm-svn: 210849
* Rename PPCSubTarget to Subtarget in PPCTargetLowering for consistency.Eric Christopher2014-06-122-126/+124
| | | | | | Also remove an extra local subtarget in the initialization functions. llvm-svn: 210848
* Move PPCJITInfo off of the TargetMachine and onto the subtarget.Eric Christopher2014-06-126-30/+33
| | | | | | | Needed to migrate a few functions around to avoid circular header dependencies. llvm-svn: 210845
* Remove the use of TargetMachine from PPCJITInfo and replace withEric Christopher2014-06-123-6/+6
| | | | | | | the subtarget. Also remove unnecessary argument to the constructor at the same time, we already have access via the subtarget. llvm-svn: 210844
* Move PPCInstrInfo off of the target machine and onto the subtarget.Eric Christopher2014-06-124-7/+11
| | | | llvm-svn: 210839
* Remove TargetMachine from PPCInstrInfo and all dependencies andEric Christopher2014-06-125-27/+29
| | | | | | replace with the current subtarget. llvm-svn: 210836
* Move DataLayout from the PPCTargetMachine to the subtarget.Eric Christopher2014-06-124-40/+46
| | | | llvm-svn: 210824
* Move PPCFrameLowering into PPCSubtarget from PPCTargetMachine. UseEric Christopher2014-06-126-196/+211
| | | | | | | | the initializeSubtargetDependencies code to obtain an initialized subtarget and migrate a couple of subtarget using functions to the .cpp file to avoid circular includes. llvm-svn: 210822
* Remove duplicate copy of InstrItineraryData from the TargetMachine,Eric Christopher2014-06-112-13/+8
| | | | | | it's already on the subtarget. llvm-svn: 210619
* [PPC64LE] Recognize shufflevector patterns for little endianBill Schmidt2014-06-103-84/+151
| | | | | | | | | | | | | | | | | Various masks on shufflevector instructions are recognizable as specific PowerPC instructions (vector pack, vector merge, etc.). There is existing code in PPCISelLowering.cpp to recognize the correct patterns for big endian code. The masks for these instructions are different for little endian code due to the big-endian numbering employed by these instructions. This patch adds the recognition code for little endian. I've added a new test case test/CodeGen/PowerPC/vec_shuffle_le.ll for this. The existing recognizer test (vec_shuffle.ll) is unnecessarily verbose and difficult to read, so I felt it was better to add a new test rather than modify the old one. llvm-svn: 210536
* [PPC64LE] Generate correct code for unaligned little-endian vector loadsBill Schmidt2014-06-091-21/+39
| | | | | | | | | | | | | | | | | | | The code in PPCTargetLowering::PerformDAGCombine() that handles unaligned Altivec vector loads generates a lvsl followed by a vperm. As we've seen in numerous other places, the vperm instruction has a big-endian bias, and this is fixed for little endian by complementing the permute control vector and swapping the input operands. In this case the lvsl is providing the permute control vector. Rather than generating an lvsl and a complement operation, it is sufficient to generate an lvsr instruction instead. Thus for LE code generation we will generate an lvsr rather than an lvsl, and swap the other input arguments on the vperm. The existing test/CodeGen/PowerPC/vec_misalign.ll is updated to test the code generation for PPC64 and PPC64LE, in addition to the existing PPC32/G5 testing. llvm-svn: 210493
* [PPC64LE] Generate correct little-endian code for v16i8 multiplyBill Schmidt2014-06-091-4/+16
| | | | | | | | | | | | | | | | The existing code in PPCTargetLowering::LowerMUL() for multiplying two v16i8 values assumes that vector elements are numbered in big-endian order. For little-endian targets, the vector element numbering is reversed, but the vmuleub, vmuloub, and vperm instructions still assume big-endian numbering. To account for this, we must adjust the permute control vector and reverse the order of the input registers on the vperm instruction. The existing test/CodeGen/PowerPC/vec_mul.ll is updated to be executed on powerpc64 and powerpc64le targets as well as the original powerpc (32-bit) target. llvm-svn: 210474
* AsmMatchers: Use unique_ptr to manage ownership of MCParsedAsmOperandDavid Blaikie2014-06-081-55/+48
| | | | | | | | | | | | I saw at least a memory leak or two from inspection (on probably untested error paths) and r206991, which was the original inspiration for this change. I ran this idea by Jim Grosbach a few weeks ago & he was OK with it. Since it's a basically mechanical patch that seemed sufficient - usual post-commit review, revert, etc, as needed. llvm-svn: 210427
* Have TargetSelectionDAGInfo take a DataLayout initializer rather thanEric Christopher2014-06-061-1/+1
| | | | | | a TargetMachine since the only thing it wants is DataLayout. llvm-svn: 210366
* [PPC64LE] Fix lowering of BUILD_VECTOR and SHUFFLE_VECTOR for little endianBill Schmidt2014-06-061-3/+34
| | | | | | | | | | | | | | | This patch fixes a couple of lowering issues for little endian PowerPC. The code for lowering BUILD_VECTOR contains a number of optimizations that are only valid for big endian. For now, we disable those optimizations for correctness. In the future, we will add analogous optimizations that are correct for little endian. When lowering a SHUFFLE_VECTOR to a VPERM operation, we again need to make the now-familiar transformation of swapping the input operands and complementing the permute control vector. Correctness of this transformation is tested by the accompanying test case. llvm-svn: 210336
* [PPC64LE] Temporarily disable VSX support in little-endian modeBill Schmidt2014-06-051-0/+5
| | | | | | | | | | This is a preliminary patch for the PowerPC64LE support. In stage 1 of the vector support, we will support the VMX (Altivec) instruction set, but will not yet support the VSX instructions. This is merely a staging issue to provide functional vector support as soon as possible. llvm-svn: 210271
* Omit else branch after return.Eric Christopher2014-06-021-2/+4
| | | | llvm-svn: 210034
* Have the TLOF creation take a Triple rather than needing a subtarget.Eric Christopher2014-05-311-3/+5
| | | | llvm-svn: 209937
* isSVR4ABI() returned !isDarwin() so just move that to the elseEric Christopher2014-05-301-4/+1
| | | | | | block and remove the unreachable code. llvm-svn: 209927
* Rename CreateTLOF->createTLOF to match the rest of the file and theEric Christopher2014-05-301-4/+4
| | | | | | rest of the targets with a similar function name. llvm-svn: 209926
* [PPC] Use alias symbols in address computation.Rafael Espindola2014-05-292-34/+17
| | | | | | | | | | | This seems to match what gcc does for ppc and what every other llvm backend does. This is a fixed version of r209638. The difference is to avoid any change in behavior for functions. The logic for using constant pools for function addresseses is spread over a few places and we have to keep them in sync. llvm-svn: 209821
* [pr19844] Add thread local mode to aliases.Rafael Espindola2014-05-281-7/+1
| | | | | | | | | | This matches gcc's behavior. It also seems natural given that aliases contain other properties that govern how it is accessed (linkage, visibility, dll storage). Clang still has to be updated to expose this feature to C. llvm-svn: 209759
* Revert "[PPC] Use alias symbols in address computation."Hal Finkel2014-05-282-15/+34
| | | | | | | | | This reverts commit r209638 because it broke self-hosting on ppc64/Linux. (the Clang-compiled TableGen would segfault because it jumped to an invalid address from within _ZNK4llvm17ManagedStaticBase21RegisterManagedStaticEPFPvvEPFvS1_E (which is within the command-line parameter registration process)). llvm-svn: 209745
* [PATCH] Correct type used for VADD_SPLAT optimization on PowerPCBill Schmidt2014-05-271-4/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In PPCISelLowering.cpp: PPCTargetLowering::LowerBUILD_VECTOR(), there is an optimization for certain patterns to generate one or two vector splats followed by a vector add or subtract. This operation is represented by a VADD_SPLAT in the selection DAG. Prior to this patch, it was possible for the VADD_SPLAT to be assigned the wrong data type, causing incorrect code generation. This patch corrects the problem. Specifically, the code previously assigned the value type of the BUILD_VECTOR node to the newly generated VADD_SPLAT node. This is correct much of the time, but not always. The problem is that the call to isConstantSplat() may return a SplatBitSize that is not the same as the number of bits in the original element vector type. The correct type to assign is a vector type with the same element bit size as SplatBitSize. The included test case shows an example of this, where the BUILD_VECTOR node has a type of v16i8. The vector to be built is {0, 16, 0, 16, 0, 16, 0, 16, 0, 16, 0, 16, 0, 16, 0, 16}. isConstantSplat detects that we can generate a splat of 16 for type v8i16, which is the type we must assign to the VADD_SPLAT node. If we do not, we generate a vspltisb of 8 and a vaddubm, which generates the incorrect result {16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16}. The correct code generation is a vspltish of 8 and a vadduhm. This patch also corrected code generation for CodeGen/PowerPC/2008-07-10-SplatMiscompile.ll, which had been marked as an XFAIL, so we can remove the XFAIL from the test case. llvm-svn: 209662
* [PPC] Use alias symbols in address computation.Rafael Espindola2014-05-262-34/+15
| | | | | | | This seems to match what gcc does for ppc and what every other llvm backend does. llvm-svn: 209638
* Fix typo.Eric Christopher2014-05-221-1/+1
| | | | llvm-svn: 209377
* Avoid using subtarget features when initializing the pass pipelineEric Christopher2014-05-222-12/+17
| | | | | | on PPC. llvm-svn: 209376
* Reset the subtarget for DAGToDAG on every iteration of runOnMachineFunction.Eric Christopher2014-05-225-47/+47
| | | | | | | This required updating the generated functions and TD file accordingly to be pointers rather than const references. llvm-svn: 209375
* Make early if conversion dependent upon the subtarget and addEric Christopher2014-05-212-6/+4
| | | | | | | a subtarget hook to enable. Unconditionally add to the pass pipeline for targets that might want to use it. No functional change. llvm-svn: 209340
* [PowerPC] PR19796: Also match ISD::TargetConstant in isIntS16ImmediateAdam Nemet2014-05-201-1/+1
| | | | | | | | | | | | | The SplitIndexingFromLoad changes exposed a latent isel bug in the PowerPC64 backend. We matched an immediate offset with STWX8 even though it only supports register offset. The culprit is the complex-pattern predicate, SelectAddrIdx, which decides that if the offset is not ISD::Constant it must be a register. Many thanks to Bill Schmidt for testing this. llvm-svn: 209219
* SDAG: Legalize vector BSWAP into a shuffle if the shuffle is legal but the ↵Benjamin Kramer2014-05-191-0/+1
| | | | | | | | | | bswap not. - On ARM/ARM64 we get a vrev because the shuffle matching code is really smart. We still unroll anything that's not v4i32 though. - On X86 we get a pshufb with SSSE3. Required more cleverness in isShuffleMaskLegal. - On PPC we get a vperm for v8i16 and v4i32. v2i64 is unrolled. llvm-svn: 209123
* Target: remove old constructors for CallLoweringInfoSaleem Abdulrasool2014-05-171-10/+5
| | | | | | | | | | This is mostly a mechanical change changing all the call sites to the newer chained-function construction pattern. This removes the horrible 15-parameter constructor for the CallLoweringInfo in favour of setting properties of the call via chained functions. No functional change beyond the removal of the old constructors are intended. llvm-svn: 209082
* Use a sized enum for MachineOperandType. No functionality changePete Cooper2014-05-161-1/+1
| | | | llvm-svn: 209048
* Delete getAliasedGlobal.Rafael Espindola2014-05-163-9/+5
| | | | llvm-svn: 209040
* Rename ComputeMaskedBits to computeKnownBits. "Masked" has beenJay Foad2014-05-143-20/+20
| | | | | | inappropriate since it lost its Mask parameter in r154011. llvm-svn: 208811
* Fix typo in function name.Eric Christopher2014-05-141-5/+5
| | | | llvm-svn: 208743
* Save the optimization level the subtarget was created with in aEric Christopher2014-05-132-15/+14
| | | | | | | | | | member variable and sink the initialization of crbits into the subtarget feature reset code. No functional change, but this refactor will be used in a future commit. llvm-svn: 208726
* [PowerPC] Add global named register supportHal Finkel2014-05-112-0/+27
| | | | | | | Support for the intrinsics that read from and write to global named registers is added for r1, r2 and r13 (depending on the subtarget). llvm-svn: 208509
* [PowerPC] On PPC32, 128-bit shifts might be runtime callsHal Finkel2014-05-111-0/+8
| | | | | | | | | | | The counter-loops formation pass needs to know what operations might be function calls (because they can't appear in counter-based loops). On PPC32, 128-bit shifts might be runtime calls (even though you can't use __int128 on PPC32, it seems that SROA might form them). Fixes PR19709. llvm-svn: 208501
* Remove the UseCFI option from createAsmStreamer.Rafael Espindola2014-05-071-4/+3
| | | | | | We were already always passing true, this just removes the option. llvm-svn: 208205
OpenPOWER on IntegriCloud