path: root/llvm/lib/Target/PowerPC
...
* [C++11] Add 'override' keywords and remove 'virtual'. (Craig Topper, 2014-04-29; 24 files, -306/+311)
  Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. PowerPC edition.
  llvm-svn: 207504
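  For illustration, the pattern being applied looks like the following sketch; the class and method names are hypothetical, not taken from the patch:

      class PPCBase {
      public:
        virtual ~PPCBase() = default;
        virtual bool isLegalOp(unsigned Opc) const;
      };

      // 'final' documents (and enforces) that nothing derives further.
      class PPCImpl final : public PPCBase {
      public:
        // 'override' replaces a bare 'virtual'; the compiler now rejects
        // any signature mismatch that would silently fail to override.
        bool isLegalOp(unsigned Opc) const override;
      };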
* None of these targets actually define their own CFI_INSTRUCTION opcode so there's no reason to use the target namespace for it rather than TargetOpcode. (Eric Christopher, 2014-04-29; 1 file, -7/+8)
  llvm-svn: 207475
* 80-column, tab characters, comment fixups. (Eric Christopher, 2014-04-29; 1 file, -43/+44)
  llvm-svn: 207473
* Convert more SelectionDAG functions to use ArrayRef. (Craig Topper, 2014-04-28; 1 file, -1/+1)
  llvm-svn: 207397
* [C++] Use 'nullptr'. (Craig Topper, 2014-04-28; 5 files, -9/+9)
  llvm-svn: 207394
* Convert SelectionDAG::SelectNodeTo to use ArrayRef. (Craig Topper, 2014-04-27; 1 file, -17/+17)
  llvm-svn: 207377
* Convert SelectionDAG::getMergeValues to use ArrayRef. (Craig Topper, 2014-04-27; 1 file, -4/+4)
  llvm-svn: 207374
* Convert getMemIntrinsicNode to take ArrayRef of SDValue instead of pointer and size. (Craig Topper, 2014-04-26; 1 file, -7/+5)
  llvm-svn: 207329
* Convert SelectionDAG::getNode methods to use ArrayRef&lt;SDValue&gt;. (Craig Topper, 2014-04-26; 1 file, -35/+24)
  llvm-svn: 207327
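  A sketch of what such a call-site change looks like; the opcode and operand names here are illustrative, not taken from the patch:

      SDValue Ops[] = { Chain, Base, Offset };
      // Before: raw pointer plus an element count that could drift
      // out of sync with the array:
      //   DAG.getNode(PPCISD::STFIWX, dl, VT, &Ops[0], 3);
      // After: an ArrayRef<SDValue> is deduced directly from the array.
      SDValue N = DAG.getNode(PPCISD::STFIWX, dl, VT, Ops);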
* [C++] Use 'nullptr'. Target edition. (Craig Topper, 2014-04-25; 15 files, -70/+71)
  llvm-svn: 207197
* Add 'musttail' marker to call instructions (Reid Kleckner, 2014-04-24; 1 file, -0/+4)
  This is similar to the 'tail' marker, except that it guarantees that tail call optimization will occur. It also comes with conservative IR verification rules that ensure that tail call optimization is possible.
  Reviewers: nicholas
  Differential Revision: http://llvm-reviews.chandlerc.com/D3240
  llvm-svn: 207143
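  For illustration, a frontend would request the guaranteed form through the C++ API along these lines (a sketch against the current IRBuilder interface; the helper function is hypothetical):

      #include "llvm/IR/IRBuilder.h"
      using namespace llvm;

      // Emit a call marked 'musttail': the verifier checks that tail-call
      // optimization is possible, and codegen is required to perform it.
      static ReturnInst *emitMustTailCall(IRBuilder<> &Builder,
                                          FunctionCallee Callee,
                                          ArrayRef<Value *> Args) {
        CallInst *CI = Builder.CreateCall(Callee, Args);
        CI->setTailCallKind(CallInst::TCK_MustTail);
        return Builder.CreateRet(CI); // musttail must be followed by a ret
      }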
* Spread some const around for non-mutating uses of MCSymbolData. (David Blaikie, 2014-04-24; 1 file, -3/+3)
  I discovered this const-hole while attempting to coalesce the Symbol and SymbolMap data structures. There are some pending issues with that, but I figured this change was easy to flush early.
  llvm-svn: 207124
* Create MCTargetOptions. (Evgeniy Stepanov, 2014-04-23; 1 file, -1/+2)
  For now it contains a single flag, SanitizeAddress, which enables AddressSanitizer instrumentation of inline assembly.
  Patch by Yuri Gorshenin.
  llvm-svn: 206971
* [Modules] Fix potential ODR violations by sinking the DEBUG_TYPE definition below all of the header #include lines, lib/Target/... edition. (Chandler Carruth, 2014-04-22; 13 files, -14/+26)
  llvm-svn: 206842
* [cleanup] Lift using directives, DEBUG_TYPE definitions, and even some system headers above the includes of generated '.inc' files that actually contain code. (Chandler Carruth, 2014-04-22; 4 files, -10/+10)
  In a few targets this was already done pretty consistently, but it wasn't done *really* consistently anywhere. It is strictly cleaner IMO and necessary in a bunch of places where the DEBUG_TYPE is referenced from the generated code. Consistency with the necessary places trumps.
  Hopefully the build bots are OK with the movement of intrin.h...
  llvm-svn: 206838
* [Modules] Make Support/Debug.h modular. (Chandler Carruth, 2014-04-21; 3 files, -0/+6)
  This requires it to not change behavior based on other files defining DEBUG_TYPE, which means it cannot define DEBUG_TYPE at all. This is actually better IMO as it forces folks to define relevant DEBUG_TYPEs for their files. However, it requires all files that currently use DEBUG(...) to define a DEBUG_TYPE if they don't already. I've updated all such files in LLVM and will do the same for other upstream projects.
  This still leaves one important change in how LLVM uses the DEBUG_TYPE macro going forward: we need to only define the macro *after* header files have been #include-ed. Previously, this wasn't possible because Debug.h required the macro to be pre-defined. This commit removes that. By defining DEBUG_TYPE after the includes two things are fixed:
  - Header files that need to provide a DEBUG_TYPE for some inline code can do so by defining the macro before their inline code and undef-ing it afterward so the macro does not escape.
  - We no longer have rampant ODR violations due to including headers with different DEBUG_TYPE definitions. This may be mostly an academic violation today, but with modules these types of violations are easy to check for and potentially very relevant.
  Where necessary to support headers with DEBUG_TYPE, I have moved the definitions below the includes in this commit. I plan to move the rest of the DEBUG_TYPE macros in LLVM in subsequent commits; this one is big enough.
  The comments in Debug.h, which were hilariously out of date already, have been updated to reflect the recommended practice going forward.
  llvm-svn: 206822
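  The recommended ordering, sketched for a hypothetical PowerPC source file (the DEBUG_TYPE string and function are illustrative):

      #include "llvm/Support/Debug.h"
      #include "llvm/Support/raw_ostream.h"
      // All includes first; only then define the debug type, so no header
      // can observe, or collide with, this file's DEBUG_TYPE.
      #define DEBUG_TYPE "ppc-isel"

      void selectSomething() {
        DEBUG(llvm::dbgs() << "selecting a node\n"); // keyed on "ppc-isel"
      }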
* Break PseudoSourceValue out of the Value hierarchy. (Nick Lewycky, 2014-04-15; 1 file, -2/+2)
  It is now the root of its own tree containing FixedStackPseudoSourceValue (which you can use isa/dyn_cast on) and MipsCallEntry (which you can't). Anything that needs to use either a PseudoSourceValue* or a Value* is strongly encouraged to use a MachinePointerInfo instead.
  llvm-svn: 206255
* [MC] Require an MCContext when constructing an MCDisassembler. (Lang Hames, 2014-04-15; 1 file, -4/+5)
  This patch re-introduces the MCContext member that was removed from MCDisassembler in r206063, and requires that an MCContext be passed in at MCDisassembler construction time. (Previously the MCContext member had been initialized in an ad-hoc fashion after construction.) The MCContext member can be used by MCDisassembler sub-classes to construct constant or target-specific MCExprs.
  This patch updates disassemblers for in-tree targets, and provides the MCRegisterInfo instance that some disassemblers were using through the MCContext (previously those backends were constructing their own MCRegisterInfo instances).
  llvm-svn: 206241
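  A sketch of the new construction contract (the class name is illustrative; the PPC disassembler follows this general shape):

      class PPCDisassembler : public MCDisassembler {
      public:
        // The context arrives at construction time instead of being
        // patched in after the fact.
        PPCDisassembler(const MCSubtargetInfo &STI, MCContext &Ctx)
            : MCDisassembler(STI, Ctx) {}
        // getContext() is now available for building target-specific
        // MCExprs or reaching the shared MCRegisterInfo while decoding.
      };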
* [PowerPC] [Constant Hoisting] Enable constant hoisting on PPC (Hal Finkel, 2014-04-13; 1 file, -0/+147)
  Implements the various TTI functions to enable constant hoisting on PPC. The only significant test-suite change is this:
    MultiSource/Benchmarks/VersaBench/bmm/bmm - 20% speedup (which essentially reverses the slowdown from r206120)
  llvm-svn: 206141
* [PowerPC] Fix rlwimi isel when mask is not constant (Hal Finkel, 2014-04-13; 1 file, -1/+8)
  We had been using the known-zero values of the operand of the or to construct the mask for an rlwimi; this is not quite correct, but fine when the mask is constant. When the mask is constant, then the known zeros of the operand must be a superset of the zeros in the mask. However, when the mask is not a constant, then there might be bits in the operand that are not known to be zero that, at runtime, might be zero in the mask. Therefore, we check that any bits not known to be zero *are* known to be one in the mask. Otherwise, we can't fold the mask with the or and shift.
  This was revealed as a miscompile of MultiSource/Benchmarks/BitBench/drop3/drop3 when I started experimenting with constant hoisting.
  llvm-svn: 206136
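  A sketch of the safety check described above, in terms of known-bits values (the variable names are illustrative, not the patch's):

      // Folding is safe only if every operand bit that is not known to be
      // zero is known to be one in the mask; otherwise an unknown operand
      // bit could coincide with a runtime-zero mask bit.
      APInt MightBeOne = ~OpKnownZero;
      bool SafeToFold = (MightBeOne & ~MaskKnownOne) == 0;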
* [PowerPC] Implement some additional TLI callbacks (Hal Finkel, 2014-04-12; 2 files, -0/+59)
  Add implementations of:
    bool isLegalICmpImmediate(int64_t Imm) const
    bool isLegalAddImmediate(int64_t Imm) const
    bool isTruncateFree(Type *Ty1, Type *Ty2) const
    bool isTruncateFree(EVT VT1, EVT VT2) const
    bool shouldConvertConstantLoadToIntImm(const APInt &Imm, Type *Ty) const
  Unfortunately, this regresses counter-register-based loop formation because some of the loops now end up in forms where SE cannot compute loop counts. Nevertheless, the test-suite results favor committing:
    SingleSource/Benchmarks/BenchmarkGame/puzzle: 26% speedup
    MultiSource/Benchmarks/FreeBench/analyzer/analyzer: 21% speedup
    MultiSource/Benchmarks/MiBench/automotive-susan/automotive-susan: 20% speedup
    SingleSource/Benchmarks/Polybench/linear-algebra/kernels/trisolv/trisolv: 19% speedup
    SingleSource/Benchmarks/Polybench/linear-algebra/kernels/gesummv/gesummv: 15% speedup
    MultiSource/Benchmarks/FreeBench/pcompress2/pcompress2: 2% speedup
    MultiSource/Benchmarks/VersaBench/bmm/bmm: 26% slowdown
  llvm-svn: 206120
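  As a sketch of the first two hooks: PPC compare and add instructions encode 16-bit immediates, so a plausible implementation (an assumption, not quoted from the patch) reduces to a halfword range check:

      bool PPCTargetLowering::isLegalICmpImmediate(int64_t Imm) const {
        // cmpwi takes a signed, cmplwi an unsigned 16-bit immediate.
        return isInt<16>(Imm) || isUInt<16>(Imm);
      }

      bool PPCTargetLowering::isLegalAddImmediate(int64_t Imm) const {
        return isInt<16>(Imm); // addi takes a signed 16-bit immediate
      }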
* LLVMBuild.txt: Reformat. (NAKAMURA Takumi, 2014-04-10; 2 files, -3/+3)
  llvm-svn: 205961
* [PowerPC] Don't return false from PPC::isVSLDOIShuffleMask (Hal Finkel, 2014-04-08; 1 file, -1/+1)
  PPC::isVSLDOIShuffleMask should return -1, not false, when the shuffle predicate should be false. Noticed by inspection; no test case (yet).
  llvm-svn: 205787
* [PowerPC] Remove unused TM member variable to unbreak build (Hal Finkel, 2014-04-05; 1 file, -3/+2)
  Fix "error: private field 'TM' is not used [-Werror,-Wunused-private-field]"
  llvm-svn: 205660
* [PowerPC] Adjust load/store costs in PPCTTI (Hal Finkel, 2014-04-04; 1 file, -3/+23)
  This provides more realistic costs for the insert/extractelement instructions (which are load/store pairs), accounts for the cheap unaligned Altivec load sequence, and for unaligned VSX load/stores.
  Bad news:
    MultiSource/Applications/sgefa/sgefa - 35% slowdown (this will require more investigation)
    SingleSource/Benchmarks/McGill/queens - 20% slowdown (we no longer vectorize this, but it was a constant store that was scalarized)
    MultiSource/Benchmarks/FreeBench/pcompress2/pcompress2 - 2% slowdown
  Good news:
    SingleSource/Benchmarks/Shootout/ary3 - 54% speedup
    SingleSource/Benchmarks/Shootout-C++/ary - 40% speedup
    MultiSource/Benchmarks/Ptrdist/ks/ks - 35% speedup
    MultiSource/Benchmarks/FreeBench/neural/neural - 30% speedup
    MultiSource/Benchmarks/TSVC/Symbolics-flt/Symbolics-flt - 20% speedup
  Unfortunately, estimating the costs of the stack-based scalarization sequences is hard, and adjusting these costs is like a game of whac-a-mole :( I'll revisit this again after we have better codegen for vector extloads and truncstores and unaligned load/stores.
  llvm-svn: 205658
* [PowerPC] PPCTTI Cleanup (Hal Finkel, 2014-04-04; 1 file, -4/+0)
  Remove the declaration of an unimplemented function.
  llvm-svn: 205657
* [PowerPC] Add a full condition code register to make the "cc" clobber work (Hal Finkel, 2014-04-04; 1 file, -0/+12)
  gcc inline asm supports specifying "cc" as a clobber of all condition registers. Add just enough modeling of the full register to make this work. Fixed PR19326.
  llvm-svn: 205630
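  The kind of GCC extended inline asm this enables, for illustration (the assembly body is a made-up example):

      // "cc" declares that all condition registers are clobbered; the new
      // full-CR modeling lets the backend honor this on PowerPC.
      unsigned compareToCR(unsigned a, unsigned b) {
        unsigned cr;
        asm("cmpw 0,%1,%2\n\tmfcr %0"
            : "=r"(cr)          // CR contents copied into a GPR
            : "r"(a), "r"(b)
            : "cc");            // clobbers the condition registers
        return cr;
      }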
* Make consistent use of MCPhysReg instead of uint16_t throughout the tree. (Craig Topper, 2014-04-04; 3 files, -26/+26)
  llvm-svn: 205610
* [PowerPC] Make PPCTTI::getMemoryOpCost call BasicTTI::getMemoryOpCost (Hal Finkel, 2014-04-02; 1 file, -3/+3)
  PPCTTI::getMemoryOpCost will now make use of BasicTTI::getMemoryOpCost to calculate the base cost of the memory access, and then adjust on top of that. There is no functionality change from this modification, but it will become important so that PPCTTI can take advantage of scalarization information for which BasicTTI::getMemoryOpCost will account in the near future.
  llvm-svn: 205476
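  The delegation pattern described, sketched with an abbreviated signature (an approximation of that era's TTI chaining, not the patch text):

      unsigned PPCTTI::getMemoryOpCost(unsigned Opcode, Type *Src,
                                       unsigned Alignment,
                                       unsigned AddressSpace) const {
        // Generic base cost first; this is where scalarization
        // information will eventually be accounted for.
        unsigned Cost = TargetTransformInfo::getMemoryOpCost(
            Opcode, Src, Alignment, AddressSpace);
        // ...then PPC-specific adjustments (unaligned Altivec/VSX, etc.).
        return Cost;
      }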
* Simplify resolveFrameIndex() signature. (Jim Grosbach, 2014-04-02; 2 files, -7/+4)
  Just pass a MachineInstr reference rather than an MBB iterator. Creating a MachineInstr& is the first thing every implementation did anyway.
  llvm-svn: 205453
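  The shape of the change, sketched (the parameter list is abbreviated from the real interface):

      // Before: every implementation began by dereferencing the iterator.
      //   void resolveFrameIndex(MachineBasicBlock::iterator I, ...) {
      //     MachineInstr &MI = *I;
      //     ...
      //   }
      // After: callers hand over the instruction directly.
      void PPCRegisterInfo::resolveFrameIndex(MachineInstr &MI,
                                              unsigned BaseReg,
                                              int64_t Offset) const {
        // operate on MI directly
      }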
* [PowerPC] Add some missing VSX bitcast patterns (Hal Finkel, 2014-04-01; 1 file, -0/+8)
  llvm-svn: 205352
* [PowerPC] Don't ever expand BUILD_VECTOR of v2i64 with shuffles (Hal Finkel, 2014-03-31; 2 files, -0/+14)
  If we have two unique values for a v2i64 build vector, this will always result in two vector loads if we expand using shuffles. Only one is necessary.
  llvm-svn: 205231
* [PowerPC] Correct P7 dispatch unit allocation for vector instructions (Hal Finkel, 2014-03-31; 1 file, -16/+8)
  llvm-svn: 205222
* [PowerPC] Handle VSX v2i64 SIGN_EXTEND_INREG (Hal Finkel, 2014-03-30; 3 files, -0/+45)
  sitofp from v2i32 to v2f64 ends up generating a SIGN_EXTEND_INREG v2i64 node (and similarly for v2i16 and v2i8). Even though there are no sign-extension (or algebraic shift) instructions for v2i64 types, we can handle v2i32 sign extensions by converting to and from v2i64. The small trick necessary here is to shift the i32 elements into the right lanes before the i32 -> f64 step. Because of the big-endian nature of the system, we need the i32 portion in the high word of the i64 elements.
  For v2i16 and v2i8 we can do the same, but we first use the default Altivec shift-based expansion from v2i16 or v2i8 to v2i32 (by casting to v4i32) and then apply the above procedure.
  llvm-svn: 205146
* [PowerPC] Handle v2i64 comparisons (Hal Finkel, 2014-03-29; 1 file, -0/+23)
  v2i64 is a legal type under VSX; however, we don't have native vector comparisons. We can handle eq/ne by casting it to an Altivec type, but everything else must be expanded.
  llvm-svn: 205106
* [PowerPC] VSX instruction latency corrections (Hal Finkel, 2014-03-29; 2 files, -15/+15)
  The vector divide and sqrt instructions have high latencies, and the scalar comparisons are like all of the others. On the P7, permutations take an extra cycle over purely-simple vector ops.
  llvm-svn: 205096
* Completely rewrite ELFObjectWriter::RecordRelocation. (Rafael Espindola, 2014-03-29; 2 files, -57/+1)
  I started trying to fix a small issue, but this code has seen a small fix too many.
  The old code was fairly convoluted. Some of the issues it had:
  * It failed to check if a symbol difference was in the same section when converting a relocation to pcrel.
  * It failed to check if the relocation was already pcrel.
  * The pcrel value computation was wrong in some cases (relocation-pc.s).
  * It was missing quite a few cases where it should not convert symbol relocations to section relocations, leaving the backends to patch it up.
  * It would not propagate the fact that it had changed a relocation to pcrel, requiring a quite nasty workaround in ARM.
  * It was missing comments.
  llvm-svn: 205076
* [PowerPC] Add subregister classes for f64 VSX values (Hal Finkel, 2014-03-29; 8 files, -59/+192)
  We had stored both f64 values and v2f64, etc. values in the VSX registers. This worked, but was suboptimal because we would always spill 16-byte values even though we almost always had scalar 8-byte values. This resulted in an increase in stack-size use, extra memory bandwidth, etc. To fix this, I've added 64-bit subregisters of the Altivec registers, and combined those with the existing scalar floating-point registers to form a class of VSX scalar floating-point registers. The ABI code has also been enhanced to use this register class and some other necessary improvements have been made.
  llvm-svn: 205075
* [PowerPC] Fix VSX permutation isel (Hal Finkel, 2014-03-28; 1 file, -1/+1)
  Not only did I invert the indices when I wrote the code, but I also did the same thing when I wrote the regression test. Oops.
  llvm-svn: 205046
* [PowerPC] v2[fi]64 need to be explicitly passed in VSX registers (Hal Finkel, 2014-03-28; 2 files, -7/+36)
  v2[fi]64 values need to be explicitly passed in VSX registers. This is because the code in TRI that finds the minimal register class given a register and a value type will assert if given an Altivec register and a non-Altivec type.
  llvm-svn: 205041
* [PowerPC] Use a small cleanup pass to remove VSX self copies (Hal Finkel, 2014-03-27; 3 files, -0/+78)
  As explained in r204976, because of how the allocation of VSX registers interacts with the call-lowering code, we sometimes end up generating self VSX copies. Specifically, things like this:
    %VSL2<def> = COPY %F2, %VSL2<imp-use,kill>
  (where %F2 is really a sub-register of %VSL2, and so this copy is a nop)
  This adds a small cleanup pass to remove these prior to post-RA scheduling.
  llvm-svn: 204980
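  A minimal sketch of such a cleanup, assuming a machine function MF and a TargetRegisterInfo TRI are in scope (not the actual pass body):

      // Erase COPYs whose source equals the destination or is one of its
      // sub-registers; after VSX allocation these are nops.
      for (MachineBasicBlock &MBB : MF)
        for (MachineInstr &MI : llvm::make_early_inc_range(MBB))
          if (MI.isCopy() &&
              TRI->isSubRegisterEq(MI.getOperand(0).getReg(),
                                   MI.getOperand(1).getReg()))
            MI.eraseFromParent();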
* [PowerPC] Don't remove self VSX copies in PPCInstrInfo::copyPhysReg (Hal Finkel, 2014-03-27; 1 file, -9/+13)
  Because of how the allocation of VSX registers interacts with the call-lowering code, we sometimes end up generating self VSX copies. Specifically, things like this:
    %VSL2<def> = COPY %F2, %VSL2<imp-use,kill>
  (where %F2 is really a sub-register of %VSL2, and so this copy is a nop)
  The problem is that ExpandPostRAPseudos always assumes that *some* instruction has been inserted, and adds implicit defs to it. This is a problem if no copy was inserted because it can cause subtle problems during post-RA scheduling. These self copies will have to be removed some other way.
  llvm-svn: 204976
* [PowerPC] Fix v2f64 vector extract and related patterns (Hal Finkel, 2014-03-27; 2 files, -4/+4)
  First, v2f64 vector extract had not been declared legal (and so the existing patterns were not being used). Second, the patterns for that, and for scalar_to_vector, should really be a regclass copy, not a subregister operation, because the VSX registers directly hold both the vector and scalar data.
  llvm-svn: 204971
* [PowerPC] Expand v2i64 shifts (Hal Finkel, 2014-03-27; 1 file, -0/+4)
  These operations need to be expanded during legalization so that isel does not crash. In theory, we might be able to custom lower some of these. That, however, would need to be follow-up work.
  llvm-svn: 204963
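  Requesting expansion is a one-liner per opcode in the target's ISelLowering setup; given the diff size, the patch plausibly amounts to something like this (an assumption, not quoted):

      // Let legalization expand v2i64 shifts rather than handing isel an
      // unsupported node.
      setOperationAction(ISD::SHL, MVT::v2i64, Expand);
      setOperationAction(ISD::SRA, MVT::v2i64, Expand);
      setOperationAction(ISD::SRL, MVT::v2i64, Expand);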
* Remove another unused argument. (Rafael Espindola, 2014-03-27; 1 file, -3/+2)
  llvm-svn: 204961
* Remove unused argument. (Rafael Espindola, 2014-03-27; 1 file, -5/+3)
  llvm-svn: 204956
* Prevent alias from pointing to weak aliases. (Rafael Espindola, 2014-03-27; 3 files, -9/+9)
  This adds back r204781.
  Original message:
  Aliases are just another name for a position in a file. As such, the regular symbol resolutions are not applied. For example, given
    define void @my_func() {
      ret void
    }
    @my_alias = alias weak void ()* @my_func
    @my_alias2 = alias void ()* @my_alias
  We produce without this patch:
    .weak my_alias
    my_alias = my_func
    .globl my_alias2
    my_alias2 = my_alias
  That is, in the resulting ELF file my_alias, my_func and my_alias2 are just 3 names pointing to offset 0 of .text. That is *not* the semantics of IR linking. For example, linking in a
    @my_alias = alias void ()* @other_func
  would require the strong my_alias to override the weak one and my_alias2 would end up pointing to other_func.
  There is no way to represent that with aliases being just another name, so the best solution seems to be to just disallow it, converting a miscompile into an error.
  llvm-svn: 204934
* [PowerPC] Generate VSX permutations for v2[fi]64 vectors (Hal Finkel, 2014-03-26; 3 files, -5/+45)
  llvm-svn: 204873
* [PowerPC] VSX loads and stores support unaligned access (Hal Finkel, 2014-03-26; 2 files, -3/+10)
  I've not yet updated PPCTTI because I'm not sure what the actual relative cost is compared to the aligned uses.
  llvm-svn: 204848
* [PowerPC] Use v2f64 <-> v2i64 VSX conversion instructions (Hal Finkel, 2014-03-26; 2 files, -4/+13)
  llvm-svn: 204843