path: root/llvm/lib/Target/PowerPC
...
* [PowerPC] Remove some dead VSX v4f32 store patterns (Hal Finkel, 2014-03-26; 1 file changed, -4/+2)
  These patterns are dead (because v4f32 stores are currently promoted to v4i32 and stored using Altivec instructions), and also are likely not correct (because they'd store the vector elements in the opposite order from that assumed by the rest of the Altivec code).
  llvm-svn: 204839
* [PowerPC] Use VSX vector load/stores for v2[fi]64 (Hal Finkel, 2014-03-26; 2 files changed, -0/+13)
  These instructions have access to the complete VSX register file. In addition, they "swap" the order of the elements so that element 0 (the scalar part) comes first in memory and element 1 follows at a higher address.
  llvm-svn: 204838
* [PowerPC] Add v2i64 as a legal VSX type (Hal Finkel, 2014-03-26; 4 files changed, -9/+32)
  v2i64 needs to be a legal VSX type because it is the SetCC result type from v2f64 comparisons. We need to expand all non-arithmetic v2i64 operations. This fixes the lowering for v2f64 VSELECT.
  llvm-svn: 204828
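  A brief C++ sketch of the kind of source that produces such a comparison (illustrative names and loop, not taken from the commit or its tests): once vectorized, the per-lane compare is a v2f64 SetCC whose result is a v2i64 mask, which is why the type must be legal even though most v2i64 operations are expanded.

    // Each lane's compare yields an all-ones/all-zeros 64-bit mask element,
    // i.e. a v2i64 value, with no v2i64 arithmetic performed on it here.
    void lane_less(const double *a, const double *b, long long *mask, int n) {
      for (int i = 0; i < n; ++i)
        mask[i] = (a[i] < b[i]) ? -1LL : 0LL;
    }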
* [PowerPC] Lower VSELECT using xxsel when VSX is available (Hal Finkel, 2014-03-26; 2 files changed, -3/+22)
  With VSX there is a real vector select instruction, and so we should use it. Note that VSELECT will still scalarize for v2f64 because the corresponding SetCC result type (v2i64) is not currently a legal type.
  llvm-svn: 204801
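  For illustration (an assumed example, not from the commit), this is the kind of source pattern whose vectorized form contains a VSELECT node that can now map onto xxsel for the legal VSX types:

    // A compare feeding a lane-wise conditional: the vectorizer produces a
    // setcc plus a vector select, which with VSX can become a single xxsel
    // rather than being scalarized or emulated with and/andc/or.
    void lane_max(const float *a, const float *b, float *out, int n) {
      for (int i = 0; i < n; ++i)
        out[i] = (a[i] > b[i]) ? a[i] : b[i];
    }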
* Revert "Prevent alias from pointing to weak aliases." (Rafael Espindola, 2014-03-26; 3 files changed, -9/+9)
  This reverts commit r204781. I will follow up with the msan folks to see what they were trying to do with aliases to weak aliases.
  llvm-svn: 204784
* [PowerPC] Generate logical vector VSX instructions (Hal Finkel, 2014-03-26; 1 file changed, -5/+12)
  These instructions are essentially the same as their Altivec counterparts, but have access to the larger VSX register file.
  llvm-svn: 204782
* Prevent alias from pointing to weak aliases. (Rafael Espindola, 2014-03-26; 3 files changed, -9/+9)
  Aliases are just another name for a position in a file. As such, the regular symbol resolutions are not applied. For example, given

    define void @my_func() {
      ret void
    }
    @my_alias = alias weak void ()* @my_func
    @my_alias2 = alias void ()* @my_alias

  We produce without this patch:

    .weak my_alias
    my_alias = my_func
    .globl my_alias2
    my_alias2 = my_alias

  That is, in the resulting ELF file my_alias, my_alias2 and my_func are just 3 names pointing to offset 0 of .text.

  That is *not* the semantics of IR linking. For example, linking in a

    @my_alias = alias void ()* @other_func

  would require the strong my_alias to override the weak one and my_alias2 would end up pointing to other_func.

  There is no way to represent that with aliases being just another name, so the best solution seems to be to just disallow it, converting a miscompile into an error.
  llvm-svn: 204781
* [PowerPC] Select between VSX A-type and M-type FMA instructions just before RA (Hal Finkel, 2014-03-25; 3 files changed, -0/+283)
  The VSX instruction set has two types of FMA instructions: A-type (where the addend is taken from the output register) and M-type (where one of the product operands is taken from the output register).

  This adds a small pass that runs just after MI scheduling (and, thus, just before register allocation) that mutates A-type instructions (that are created during isel) into M-type instructions when:

    1. This will eliminate an otherwise-necessary copy of the addend
    2. One of the product operands is killed by the instruction

  The "right" moment to make this decision is in between scheduling and register allocation, because only there do we know whether or not one of the product operands is killed by any particular instruction. Unfortunately, this also makes the implementation somewhat complicated, because the MIs are not in SSA form and we need to preserve the LiveIntervals analysis.

  As a simple example, if we have:

    %vreg5<def> = COPY %vreg9; VSLRC:%vreg5,%vreg9
    %vreg5<def,tied1> = XSMADDADP %vreg5<tied0>, %vreg17, %vreg16, %RM<imp-use>; VSLRC:%vreg5,%vreg17,%vreg16
    ...
    %vreg9<def,tied1> = XSMADDADP %vreg9<tied0>, %vreg17, %vreg19, %RM<imp-use>; VSLRC:%vreg9,%vreg17,%vreg19
    ...

  We can eliminate the copy by changing from the A-type to the M-type instruction. This means:

    %vreg5<def,tied1> = XSMADDADP %vreg5<tied0>, %vreg17, %vreg16, %RM<imp-use>; VSLRC:%vreg5,%vreg17,%vreg16

  is replaced by:

    %vreg16<def,tied1> = XSMADDMDP %vreg16<tied0>, %vreg18, %vreg9, %RM<imp-use>; VSLRC:%vreg16,%vreg18,%vreg9

  and we remove:

    %vreg5<def> = COPY %vreg9; VSLRC:%vreg5,%vreg9

  llvm-svn: 204768
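  As a purely illustrative C++ sketch (the struct and function names below are invented, not code from the pass), the functional difference between the two forms is only which input register the result overwrites:

    // Both forms compute a*b + c. The A-type (e.g. XSMADDADP) writes the
    // result over the addend's register, while the M-type (e.g. XSMADDMDP)
    // writes it over one of the product operands' registers; mutating to
    // M-type is profitable when that product operand is killed anyway.
    struct VsxRegs { double a, b, c; };
    void fmaATypeLike(VsxRegs &r) { r.c = r.a * r.b + r.c; }
    void fmaMTypeLike(VsxRegs &r) { r.b = r.a * r.b + r.c; }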
* [PowerPC] Correct commutable indices for VSX FMA instructions (Hal Finkel, 2014-03-25; 2 files changed, -0/+18)
  Although the first two operands are the ones that can be swapped, the tied input operand is listed before them, so we need to adjust for that. I have a test case for this, but it goes along with an upcoming commit (so it will come soon).
  llvm-svn: 204748
* [PowerPC] Add a TableGen relation for A-type and M-type VSX FMA instructions (Hal Finkel, 2014-03-25; 2 files changed, -27/+103)
  TableGen will create a lookup table for the A-type FMA instructions providing their corresponding M-form opcodes. This will be used by upcoming commits.
  llvm-svn: 204746
* [PowerPC] Generate little-endian object files (Ulrich Weigand, 2014-03-24; 5 files changed, -24/+55)
  As a first step towards real little-endian code generation, this patch changes the PowerPC MC layer to actually generate little-endian object files. This involves passing the little-endian flag through the various layers, including down to createELFObjectWriter so we actually get basic little-endian ELF objects, emitting instructions in little-endian order, and handling fixups and relocations as appropriate for little-endian.

  The bulk of the patch is to update most test cases in test/MC/PowerPC to verify both big- and little-endian encodings. (The only test cases *not* updated are those that create actual big-endian ABI code, like the TLS tests.)

  Note that while the object files are now little-endian, the generated code itself is not yet updated, in particular, it still does not adhere to the ELFv2 ABI.
  llvm-svn: 204634
* [PPC64LE] ELFv2 ABI updates for the .opd section (Will Schmidt, 2014-03-24; 1 file changed, -0/+5)
  The PPC64 Little Endian (PPC64LE) target supports the ELFv2 ABI, and as such, does not have a ".opd" section. This is keyed off a _CALL_ELF=2 macro check. The CALL_ELF check is not clearly documented at this time. The basis for usage in this patch is from the gcc thread here: http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01144.html

  > Adding comment from Uli: Looks good to me. I think the old-style JIT doesn't really work anyway for 64-bit, but at least with this patch LLVM will compile and link again on a ppc64le host ...
  llvm-svn: 204614
* [PowerPC] Mark many instructions as commutative (Hal Finkel, 2014-03-24; 4 files changed, -4/+63)
  I'm under the impression that we used to infer the isCommutable flag from the instruction-associated pattern. Regardless, we don't seem to do this (at least by default) any more. I've gone through all of our instruction definitions, and marked as commutative all of those that should be trivial to commute (by exchanging the first two operands). There has been special code for the RL* instructions, and that's not changed.

  Before this change, we had the following commutative instructions:

    RLDIMI RLDIMIo RLWIMI RLWIMI8 RLWIMI8o RLWIMIo XSADDDP XSMULDP XVADDDP
    XVADDSP XVMULDP XVMULSP

  After:

    ADD4 ADD4o ADD8 ADD8o ADDC ADDC8 ADDC8o ADDCo ADDE ADDE8 ADDE8o ADDEo
    AND AND8 AND8o ANDo CRAND CREQV CRNAND CRNOR CROR CRXOR EQV EQV8 EQV8o
    EQVo FADD FADDS FADDSo FADDo FMADD FMADDS FMADDSo FMADDo FMSUB FMSUBS
    FMSUBSo FMSUBo FMUL FMULS FMULSo FMULo FNMADD FNMADDS FNMADDSo FNMADDo
    FNMSUB FNMSUBS FNMSUBSo FNMSUBo MULHD MULHDU MULHDUo MULHDo MULHW
    MULHWU MULHWUo MULHWo MULLD MULLDo MULLW MULLWo NAND NAND8 NAND8o NANDo
    NOR NOR8 NOR8o NORo OR OR8 OR8o ORo RLDIMI RLDIMIo RLWIMI RLWIMI8
    RLWIMI8o RLWIMIo VADDCUW VADDFP VADDSBS VADDSHS VADDSWS VADDUBM VADDUBS
    VADDUHM VADDUHS VADDUWM VADDUWS VAND VAVGSB VAVGSH VAVGSW VAVGUB VAVGUH
    VAVGUW VMADDFP VMAXFP VMAXSB VMAXSH VMAXSW VMAXUB VMAXUH VMAXUW
    VMHADDSHS VMHRADDSHS VMINFP VMINSB VMINSH VMINSW VMINUB VMINUH VMINUW
    VMLADDUHM VMULESB VMULESH VMULEUB VMULEUH VMULOSB VMULOSH VMULOUB
    VMULOUH VNMSUBFP VOR VXOR XOR XOR8 XOR8o XORo XSADDDP XSMADDADP XSMAXDP
    XSMINDP XSMSUBADP XSMULDP XSNMADDADP XSNMSUBADP XVADDDP XVADDSP
    XVMADDADP XVMADDASP XVMAXDP XVMAXSP XVMINDP XVMINSP XVMSUBADP XVMSUBASP
    XVMULDP XVMULSP XVNMADDADP XVNMADDASP XVNMSUBADP XVNMSUBASP XXLAND
    XXLNOR XXLOR XXLXOR

  This is a by-inspection change, and I'm not sure how to write a reliable test case. I would like advice on this, however.
  llvm-svn: 204609
* [PowerPC] Don't schedule VSX copy legalization unless VSX is enabled (Hal Finkel, 2014-03-24; 1 file changed, -1/+2)
  There is no need to schedule this extra pass if it will have nothing to do.
  llvm-svn: 204594
* [PowerPC] Update comment re: VSX copy-instruction selection (Hal Finkel, 2014-03-24; 1 file changed, -2/+4)
  I've done some experimentation with this, and it looks like using the lower-latency (but lower throughput) copy instruction is essentially always the right thing to do.

  My assumption is that, in order to be relatively sure that the higher-latency copy will increase throughput, we'd want to have it unlikely to be in-flight with its use. On the P7, the global completion table (GCT) can hold a maximum of 120 instructions, shared among all active threads (up to 4), giving 30 instructions per thread. So specifically, I'd require at least that many instructions between the copy and the use before the high-latency variant is used.

  Trying this, however, over the entire test suite resulted in zero cases where the high-latency form would be preferable. This may be a consequence of the fact that the scheduler views copies as free, and so they tend to end up close to their uses. For this experiment I created a function:

    unsigned chooseVSXCopy(MachineBasicBlock &MBB,
                           MachineBasicBlock::iterator I,
                           unsigned DestReg, unsigned SrcReg,
                           unsigned StartDist = 1,
                           unsigned Depth = 3) const;

  with an implementation like:

    if (!Depth)
      return PPC::XXLOR;

    const unsigned MaxDist = 30;
    unsigned Dist = StartDist;
    for (auto J = I, JE = MBB.end(); J != JE && Dist <= MaxDist; ++J) {
      if (J->isTransient() && !J->isCopy())
        continue;

      if (J->isCall() || J->isReturn() || J->readsRegister(DestReg, TRI))
        return PPC::XXLOR;

      ++Dist;
    }

    // We've exceeded the required distance for the high-latency form, use it.
    if (Dist > MaxDist)
      return PPC::XVCPSGNDP;

    // If this is only an exit block, use the low-latency form.
    if (MBB.succ_empty())
      return PPC::XXLOR;

    // We've reached the end of the block, check the successor blocks (up to
    // some depth), and use the high-latency form if that is okay with all
    // successors.
    for (auto J = MBB.succ_begin(), JE = MBB.succ_end(); J != JE; ++J) {
      if (chooseVSXCopy(**J, (*J)->begin(), DestReg, SrcReg,
                        Dist, --Depth) == PPC::XXLOR)
        return PPC::XXLOR;
    }

    // All of our successor blocks seem okay with the high-latency variant,
    // so we'll use it.
    return PPC::XVCPSGNDP;

  and then changed the copy opcode selection from:

    Opc = PPC::XXLOR;

  to:

    Opc = chooseVSXCopy(MBB, std::next(I), DestReg, SrcReg);

  In conclusion, I'm removing the FIXME from the comment, because I believe that there is, at least absent other examples, nothing to fix.
  llvm-svn: 204591
* remove a bunch of unused private methods (Nuno Lopes, 2014-03-23; 1 file changed, -1/+0)
  Found with a smarter version of -Wunused-member-function that I'm playing with. Apologies in advance if I removed someone's WIP code.

    include/llvm/CodeGen/MachineSSAUpdater.h             |  1
    include/llvm/IR/DebugInfo.h                          |  3
    lib/CodeGen/MachineSSAUpdater.cpp                    | 10 --
    lib/CodeGen/PostRASchedulerList.cpp                  |  1
    lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp     | 10 --
    lib/IR/DebugInfo.cpp                                 | 12 --
    lib/MC/MCAsmStreamer.cpp                             |  2
    lib/Support/YAMLParser.cpp                           | 39 ---------
    lib/TableGen/TGParser.cpp                            | 16 ---
    lib/TableGen/TGParser.h                              |  1
    lib/Target/AArch64/AArch64TargetTransformInfo.cpp    |  9 --
    lib/Target/ARM/ARMCodeEmitter.cpp                    | 12 --
    lib/Target/ARM/ARMFastISel.cpp                       | 84 --------------------
    lib/Target/Mips/MipsCodeEmitter.cpp                  | 11 --
    lib/Target/Mips/MipsConstantIslandPass.cpp           | 12 --
    lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp               | 21 -----
    lib/Target/NVPTX/NVPTXISelDAGToDAG.h                 |  2
    lib/Target/PowerPC/PPCFastISel.cpp                   |  1
    lib/Transforms/Instrumentation/AddressSanitizer.cpp  |  2
    lib/Transforms/Instrumentation/BoundsChecking.cpp    |  2
    lib/Transforms/Instrumentation/MemorySanitizer.cpp   |  1
    lib/Transforms/Scalar/LoopIdiomRecognize.cpp         |  8 -
    lib/Transforms/Scalar/SCCP.cpp                       |  1
    utils/TableGen/CodeEmitterGen.cpp                    |  2
    24 files changed, 2 insertions(+), 261 deletions(-)

  llvm-svn: 204560
* [PowerPC] Make use of VSX f64 <-> i64 conversion instructions (Hal Finkel, 2014-03-23; 1 file changed, -6/+12)
  When VSX is available, these instructions should be used in preference to the older variants that only have access to the scalar floating-point registers.
  llvm-svn: 204559
* [PowerPC] Fix the VSX v2f64 return register (Hal Finkel, 2014-03-22; 1 file changed, -4/+4)
  v2f64 values, like other 128-bit values, are returned under VSX in register vs34 (Altivec register v2).
  llvm-svn: 204543
* Add llvm_unreachable after fully-covered switches to appease GCC (Alexey Samsonov, 2014-03-20; 1 file changed, -0/+1)
  llvm-svn: 204318
* Look through variables when computing relocations. (Rafael Espindola, 2014-03-20; 1 file changed, -4/+29)
  Given

    bar = foo + 4
    .long bar

  MC would eat the 4. GNU as includes it in the relocation. The rule seems to be that a variable that defines a symbol is used in the relocation and one that does not define a symbol is evaluated and the result included in the relocation.

  Fixing this unfortunately required some other changes:

  * Since the variable is now evaluated, it would prevent the ELF writer from noticing the weakref marker the elf streamer uses. This patch then replaces that with a VariantKind in MCSymbolRefExpr.

  * Using VariantKind then requires us to look past other VariantKind to see

      .weakref bar,foo
      call bar@PLT

    doing this also fixes

      zed = foo +2
      call zed@PLT

    so that is a good thing.

  * Looking past VariantKind means that the relocation selection has to use the fixup instead of the target.

  This is a reboot of the previous fixes for MC. I will watch the sanitizer buildbot and wait for a build before adding back the previous fixes.
  llvm-svn: 204294
* Fix PR19144: Incorrect offset generated for int-to-fp conversion at -O0. (Bill Schmidt, 2014-03-18; 1 file changed, -3/+5)
  When converting a signed 32-bit integer to double-precision floating point on hardware without a lfiwax instruction, we have to instead use a lfd followed by fcfid. We were erroneously offsetting the address by 4 bytes in preparation for either a lfiwax or lfiwzx when generating the lfd. This fixes that silly error.

  This was not caught in the test suite since the conversion tests were run with -mcpu=pwr7, which implies availability of lfiwax. I've added another test case for older hardware that checks the code we expect in the absence of lfiwax and other flavors of fcfid. There are fewer tests in this test case because we punt to DAG selection in more cases on older hardware. (We must generate complex fiddly sequences in those cases, and there is marginal benefit in duplicating that logic in fast-isel.)
  llvm-svn: 204155
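  A minimal C++ sketch of the conversion path in question (an assumption for illustration, not the commit's test case):

    // Signed i32 -> f64. Without lfiwax (pre-POWER7), the integer is spilled
    // to the stack and reloaded with lfd before fcfid; the bug was an
    // incorrect 4-byte offset applied to that lfd at -O0.
    double int_to_double(int x) {
      return static_cast<double>(x);
    }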
* [C++11] Mark the target fast isel classes as 'final' so that the compiler can de-virtualize some of the internal calls. (Craig Topper, 2014-03-18; 1 file changed, -1/+1)
  llvm-svn: 204123
* [ppc64] Avoid copy relocs in named rodata sections (Ulrich Weigand, 2014-03-14; 1 file changed, -13/+9)
  Commit r181723 introduced code to avoid placing initialized variables needing relocations into the .rodata section, which avoids copy relocs that do not work as expected on ppc64 function references. The same treatment is also needed for *named* .rodata.XXX sections.

  This patch changes PPC64LinuxTargetObjectFile::SelectSectionForGlobal to modify "Kind" *before* calling the default SelectSectionForGlobal routine, instead of first calling the default routine and then just checking for the (main) .rodata section afterwards.
  llvm-svn: 203921
* Phase 2 of the great MachineRegisterInfo cleanup (Owen Anderson, 2014-03-13; 1 file changed, -8/+9)
  This time, we're changing operator* on the by-operand iterators to return a MachineOperand& rather than a MachineInstr&. At this point they almost behave like normal iterators!

  Again, this requires making some existing loops more verbose, but should pave the way for the big range-based for-loop cleanups in the future.
  llvm-svn: 203865
* [PowerPC] Initial support for the VSX instruction set (Hal Finkel, 2014-03-13; 19 files changed, -23/+1275)
  VSX is an ISA extension supported on the POWER7 and later cores that enhances floating-point vector and scalar capabilities. Among other things, this adds <2 x double> support and generally helps to reduce register pressure.

  The interesting part of this ISA feature is the register configuration: there are 64 new 128-bit vector registers, the first 32 of which are super-registers of the existing 32 scalar floating-point registers, and the second 32 of which overlap with the 32 Altivec vector registers. This makes things like vector insertion and extraction tricky: this can be free but only if we force a restriction to the right register subclass when needed. A new "minipass" PPCVSXCopy takes care of this (although it could do a more-optimal job of it; see the comment about unnecessary copies below).

  Please note that, currently, VSX is not enabled by default when targeting anything because it is not yet ready for that. The assembler and disassembler are fully implemented and tested. However:

  - CodeGen support causes miscompiles; test-suite runtime failures:
      MultiSource/Benchmarks/FreeBench/distray/distray
      MultiSource/Benchmarks/McCat/08-main/main
      MultiSource/Benchmarks/Olden/voronoi/voronoi
      MultiSource/Benchmarks/mafft/pairlocalalign
      MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4
      SingleSource/Benchmarks/CoyoteBench/almabench
      SingleSource/Benchmarks/Misc/matmul_f64_4x4

  - The lowering currently falls back to using Altivec instructions far more than it should. Worse, there are some things that are scalarized through the stack that shouldn't be.

  - A lot of unnecessary copies make it past the optimizers, and this needs to be fixed.

  - Many more regression tests are needed.

  Normally, I'd fix these things prior to committing, but there are some students and other contributors who would like to work on this, and so it makes sense to move this development process upstream where it can be subject to the regular code-review procedures.
  llvm-svn: 203768
* [TableGen] Optionally forbid overlap between named and positional operands (Hal Finkel, 2014-03-13; 1 file changed, -0/+2)
  There are currently two schemes for mapping instruction operands to instruction-format variables for generating the instruction encoders and decoders for the assembler and disassembler respectively: a) to map by name and b) to map by position.

  In the long run, we'd like to remove the position-based scheme and use only name-based mapping. Unfortunately, the name-based scheme currently cannot deal with complex operands (those with suboperands), and so we currently must use the position-based scheme for those. On the other hand, the position-based scheme cannot deal with (register) variables that are split into multiple ranges. An upcoming commit to the PowerPC backend (adding VSX support) will require this capability. While we could teach the position-based scheme to handle that, since we'd like to move away from the position-based mapping generally, it seems silly to teach it new tricks now. What makes more sense is to allow for partial transitioning: use the name-based mapping when possible, and only use the position-based scheme when necessary.

  Now the problem is that mixing the two sensibly was not possible: the position-based mapping would map based on position, but would not skip those variables that were mapped by name. Instead, the two sets of assignments would overlap. However, I cannot currently change the current behavior, because there are some backends that rely on it [I think mistakenly, but I'll send a message to llvmdev about that]. So I've added a new TableGen bit variable: noNamedPositionallyEncodedOperands, that can be used to cause the position-based mapping to skip variables mapped by name.
  llvm-svn: 203767
* Allow exclamation and tilde to be parsed as a part of the ppc asm operand. (Roman Divacky, 2014-03-12; 1 file changed, -0/+2)
  llvm-svn: 203699
* Try harder to evaluate expressions when printing assembly. (Rafael Espindola, 2014-03-12; 1 file changed, -1/+4)
  When printing assembly we don't have a Layout object, but we can still try to fold some constants.

  Testcase by Ulrich Weigand.
  llvm-svn: 203677
* Update the datalayout string for ppc64LE. (Will Schmidt, 2014-03-12; 1 file changed, -2/+7)
  llvm-svn: 203664
* Replace '#include ValueTypes.h' with forward declarations. (Patrik Hagglund, 2014-03-12; 1 file changed, -1/+0)
  In some cases the include is pushed "downstream" (or removed if unused).
  llvm-svn: 203644
* [TTI] There is actually no realistic way to pop TTI implementations off the stack of the analysis group because they are all immutable passes. (Chandler Carruth, 2014-03-10; 1 file changed, -4/+0)
  This is made clear by Craig's recent work to use override systematically -- we weren't overriding anything for 'finalizePass' because there is no such thing.

  This is kind of a lame restriction on the API -- we can no longer push and pop things, we just set up the stack and run. However, I'm not invested in building some better solution on top of the existing (terrifying) immutable pass and legacy pass manager.
  llvm-svn: 203437
* Don't avoid cfi instructions on the bg/p. (Rafael Espindola, 2014-03-07; 2 files changed, -6/+0)
  The integrated assembler now works for ppc. Since this was the last use of the bg/p predicate and Hal says that it is now dead, drop the predicate too.
  llvm-svn: 203269
* MC: Remove superfluous section attribute flag definitions (David Majnemer, 2014-03-07; 1 file changed, -8/+8)
  Summary: llvm/MC/MCSectionMachO.h and llvm/Support/MachO.h both had the same definitions for the section flags. Instead, grab the definitions out of support. No functionality change.

  Reviewers: grosbach, Bigcheese, rafael
  Reviewed By: rafael
  CC: llvm-commits
  Differential Revision: http://llvm-reviews.chandlerc.com/D2998
  llvm-svn: 203211
* Replace PROLOG_LABEL with a new CFI_INSTRUCTION. (Rafael Espindola, 2014-03-07; 2 files changed, -29/+30)
  The old system was fairly convoluted:
  * A temporary label was created.
  * A single PROLOG_LABEL was created with it.
  * A few MCCFIInstructions were created with the same label.

  The semantics were that the cfi instructions were mapped to the PROLOG_LABEL via the temporary label. The output position was that of the PROLOG_LABEL. The temporary label itself was used only for doing the mapping.

  The new CFI_INSTRUCTION has a 1:1 mapping to MCCFIInstructions and points to one by holding an index into the CFI instructions of this function.

  I did consider removing MMI.getFrameInstructions completely and having CFI_INSTRUCTION own a MCCFIInstruction, but MCCFIInstructions have non-trivial constructors and destructors and are somewhat big, so this setup is probably better.

  The net result is that we don't create temporary labels that are never used.
  llvm-svn: 203204
* The PPC global base register cannot be r0 (Hal Finkel, 2014-03-06; 1 file changed, -2/+2)
  The global base register cannot be r0 because it might end up as the first argument to addi or addis. Fixes PR18316.

  I don't have a small stable test case.
  llvm-svn: 203054
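  A hedged sketch of why r0 is special here (illustrative helper, not code from the patch): in addi/addis the RA field value 0 selects the constant 0 rather than the contents of r0.

    // ISA semantics of the RA operand of addi/addis: an RA field of 0 means
    // "use the literal 0", so a global base pointer allocated to r0 would be
    // silently read as zero when it appears as the first addi/addis operand.
    long long addiResult(unsigned raField, long long raContents, int simm16) {
      return (raField == 0 ? 0 : raContents) + simm16;
    }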
* [Layering] Move DebugInfo.h into the IR library where its implementation already lives. (Chandler Carruth, 2014-03-06; 1 file changed, -1/+1)
  llvm-svn: 203046
* Fixup PPC Darwin i1 argument handling (Hal Finkel, 2014-03-06; 1 file changed, -0/+7)
  Like on other targets, we need to zero_extend/truncate i1 args before copying them to GPRs.
  llvm-svn: 203045
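  For example (hypothetical C++ source, not the commit's test), any call passing a bool produces an i1 argument value that must be widened before it is copied into the argument GPR:

    // The i1 result of the comparison is zero-extended (or truncated as
    // needed) before the copy into the parameter-passing GPR.
    void takesFlag(bool f);  // declaration only, for illustration
    void caller(int x) {
      takesFlag(x > 0);
    }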
* When using CR bit registers on PPC32, handle the i1 vaarg case (Hal Finkel, 2014-03-06; 1 file changed, -0/+3)
  When copying an i1 value into a GPR for a vaarg call, we need to explicitly zero-extend the i1 value (otherwise an invalid CRBIT -> GPR copy will be generated).
  llvm-svn: 203041
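  An illustrative case (an assumed example, not from the commit): passing a comparison result through a variadic call, where the i1 must be explicitly zero-extended into a GPR because there is no legal CRBIT-to-GPR copy.

    #include <cstdio>

    // 'a == b' is an i1 in IR; as a variadic argument it is promoted to int,
    // so the bit has to be zero-extended into a GPR before the call.
    void report(int a, int b) {
      std::printf("%d\n", a == b);
    }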
* With PPC CR bit registers, handle int_to_fp on older cores (Hal Finkel, 2014-03-05; 1 file changed, -6/+16)
  On cores without fpcvt support, we cannot promote int_to_fp i1 operations, because there is nothing to promote them to. The most straightforward implementation of this uses a select to choose between the two possible resulting floating-point values (and that's what is done here).
  llvm-svn: 203015
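  Conceptually (a C++ sketch, not the committed lowering code), the i1-to-floating-point conversion on these cores amounts to selecting between the only two possible results:

    // What the user writes:
    double fromFlag(bool b) { return static_cast<double>(b); }

    // Roughly what the lowering does on cores without fpcvt: pick between
    // the two constant results instead of promoting the i1 for a convert.
    double fromFlagLowered(bool b) { return b ? 1.0 : 0.0; }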
* Enable integrated assembler on OpenBSD/PPC32 by default, too. (Joerg Sonnenberger, 2014-03-05; 1 file changed, -1/+2)
  From Brad Smith.
  llvm-svn: 202967
* [PowerPC] support powerpc64le as syntax-checking target (pass2) (Will Schmidt, 2014-03-04; 1 file changed, -0/+1)
  Register the Asm Printer for the ppc64le target. This fills in a spot that was missed in an earlier change (r187179).
  llvm-svn: 202861
* [Modules] Move ValueHandle into the IR library where Value itself lives. (Chandler Carruth, 2014-03-04; 1 file changed, -1/+1)
  Move the test for this class into the IR unittests as well.

  This uncovers that ValueMap too is in the IR library. Ironically, the unittest for ValueMap is useless in the Support library (honestly, so was the ValueHandle test) and so it already lives in the IR unittests. Mmmm, tasty layering.
  llvm-svn: 202821
* [Modules] Move GetElementPtrTypeIterator into the IR library. (Chandler Carruth, 2014-03-04; 1 file changed, -1/+1)
  As its name might indicate, it is an iterator over the types in an instruction in the IR.... You see where this is going. Another step of modularizing the support library.
  llvm-svn: 202815
* Add a PPC inline asm constraint type for single CR bits (Hal Finkel, 2014-03-02; 1 file changed, -0/+8)
  Now that the PowerPC backend can track individual CR bits as first-class registers, we should also have a way of allocating them for inline asm statements. Because these registers are only one bit, if an output variable is implicitly cast to a larger integer size, we'll get an any_extend to that larger type (this is part of the existing target-independent logic). As a result, regardless of the size of the output type, only the first bit is meaningful.

  The constraint identifier "wc" has been chosen for this purpose. Although gcc does not currently support allocating individual CR bits, this identifier choice has been coordinated with the gcc PowerPC team, and will be marked as reserved for this purpose in the gcc constraints.md file.
  llvm-svn: 202657
* [C++11] Replace llvm::next and llvm::prior with std::next and std::prev. (Benjamin Kramer, 2014-03-02; 2 files changed, -12/+7)
  Remove the old functions.
  llvm-svn: 202636
* Switch all uses of LLVM_OVERRIDE to just use 'override' directly. (Craig Topper, 2014-03-02; 3 files changed, -21/+19)
  llvm-svn: 202621
* Switch all uses of LLVM_FINAL to just use 'final', and remove the macro. (Craig Topper, 2014-03-02; 1 file changed, -1/+1)
  llvm-svn: 202618
* Remove extra truncs/exts around i32 bit operations on PPC64 (Hal Finkel, 2014-03-01; 1 file changed, -12/+82)
  This generalizes the code to eliminate extra truncs/exts around i1 bit operations to also do the same on PPC64 for i32 bit operations. This eliminates a fairly prevalent code wart:

    int foo(int a) {
      return a == 5 ? 7 : 8;
    }

  On PPC64, because of the extension implied by the ABI, this would generate:

    cmplwi 0, 3, 5
    li 12, 8
    li 4, 7
    isel 3, 4, 12, 2
    rldicl 3, 3, 0, 32
    blr

  where the 'rldicl 3, 3, 0, 32', the extension, is completely unnecessary. At least for the single-BB case (which is all that the DAG combine mechanism can handle), this unnecessary extension is no longer generated.
  llvm-svn: 202600
* Swap PPC isel operands to allow for 0-folding (Hal Finkel, 2014-02-28; 1 file changed, -0/+117)
  The PPC isel instruction can fold 0 into the first operand (thus eliminating the need to materialize a zero-containing register when the 'true' result of the isel is 0). When the isel is fed by a bit register operation that we can invert, do so as part of the bit-register-operation peephole routine.
  llvm-svn: 202469
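  For example (hypothetical C++ source, not from the commit), when the value selected on the true path is zero, isel's first register operand slot can encode RA=0, which the ISA defines to read as the constant 0, so no extra li is needed:

    // After the operand swap (and inversion of the feeding CR bit operation
    // when needed), the zero result can sit in isel's RA slot, where an RA
    // field of 0 means the constant 0 rather than a register's contents.
    int zeroOrSeven(bool b) { return b ? 0 : 7; }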
* Trying to unbreak the darwin11 builder (Hal Finkel, 2014-02-28; 1 file changed, -0/+3)
  The CR bit tracking code broke PPC/Darwin; trying to get it working again...

  (the darwin11 builder, which defaults to the darwin ABI when running PPC tests, asserted when running test/CodeGen/PowerPC/inverted-bool-compares.ll)
  llvm-svn: 202459