summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/PowerPC
Commit message (Collapse)AuthorAgeFilesLines
...
* WebAssembly: fix build break introduced by ELFObjectWriter churnJF Bastien2016-01-131-1/+1
| | | | llvm-svn: 257709
* Convert a few assert failures into proper errors.Rafael Espindola2016-01-131-3/+3
| | | | | | Fixes PR25944. llvm-svn: 257697
* [PowerPC] Fix large code model with the ELFv2 ABIUlrich Weigand2016-01-133-24/+101
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The global entry point prologue currently assumes that the TOC associated with a function is less than 2GB away from the function entry point. This is always true when using the medium or small code model, but may not be the case when using the large code model. This patch adds a new variant of the ELFv2 global entry point prologue that lifts the 2GB restriction when building with -mcmodel=large. This works by emitting a quadword containing the distance from the function entry point to its associated TOC immediately before the entry point, and then using a prologue like: ld r2,-8(r12) add r2,r2,r12 Since creation of the entry point prologue is now split across two separate routines (PPCLinuxAsmPrinter::EmitFunctionEntryLabel emits the data word, PPCLinuxAsmPrinter::EmitFunctionBodyStart the prolog code), I've switched to using named labels instead of just temporaries to indicate the locations of the global and local entry points and the new TOC offset data word. These names are provided by new routines in PPCFunctionInfo modeled after the existing PPCFunctionInfo::getPICOffsetSymbol. Note that a corresponding change was committed to GCC here: https://gcc.gnu.org/ml/gcc-patches/2015-12/msg00355.html Reviewers: hfinkel Differential Revision: http://reviews.llvm.org/D15500 llvm-svn: 257597
* Codegen: [PPC] Handle weighted comparisons when inserting selects.Kyle Butt2016-01-121-10/+33
| | | | | | | | | | | | | Only non-weighted predicates were handled in PPCInstrInfo::insertSelect. Handle the weighted predicates as well. This latent bug was triggered by r255398, because it added use of the branch-weighted predicates. While here, switch over an enum instead of an int to get the compiler to enforce totality in the future. llvm-svn: 257518
* Prevent renaming of CR fields in AADB when a CR restore is presentNemanja Ivanovic2016-01-082-2/+28
| | | | | | | | | | | This patch corresponds to review: http://reviews.llvm.org/D15930 Moves to and from CR fields depend on shifts/masks that depend on the target/source CR field. Thus, post-ra anti-dep breaking must not later change that CR register assignment. llvm-svn: 257168
* Add call sequence start and end for __tls_get_addrKyle Butt2016-01-081-0/+7
| | | | | | | | | | | | | | This is a fix for bug http://llvm.org/bugs/show_bug.cgi?id=25839. For a PIC TLS variable access in a function, prologue (mflr followed by std and stdu) gets scheduled after a tls_get_addr call. tls_get_addr messed up LR but no one saves/restores it. Also added a test for save/restore clobbered registers during calling __tls_get_addr. Patch by Tim Shen llvm-svn: 257137
* Refactor: Simplify boolean conditional return statements in lib/Target/PowerPCAlexander Kornienko2015-12-284-14/+4
| | | | | | | | | | | | | | Summary: Use clang-tidy to simplify boolean conditional return statements Reviewers: uweigand, rafael, wschmidt Subscribers: craig.topper, llvm-commits Patch by Richard Thomson! Differential Revision: http://reviews.llvm.org/D9984 llvm-svn: 256493
* Remove extra forward declarations and scrub includes for all in tree ↵Craig Topper2015-12-251-2/+0
| | | | | | InstPrinters. NFC llvm-svn: 256427
* [BPI] Replace weights by probabilities in BPI.Cong Hou2015-12-221-11/+9
| | | | | | | | | | | | This patch removes all weight-related interfaces from BPI and replace them by probability versions. With this patch, we won't use edge weight anymore in either IR or MC passes. Edge probabilitiy is a better representation in terms of CFG update and validation. Differential revision: http://reviews.llvm.org/D15519 llvm-svn: 256263
* Simplify. NFC.Rafael Espindola2015-12-171-6/+3
| | | | llvm-svn: 255894
* [CodeGen] Make MachineInstrBuilder::copyImplicitOps const. NFC.Ahmed Bougacha2015-12-161-13/+11
| | | | | | | | This matches the other MIB methods, none of which modify the builder. Without this, we can't chain copyImplicitOps. Also reformat the few users, in PPCEarlyReturn. llvm-svn: 255828
* LPM: Stop threading `Pass *` through all of the loop utility APIs. NFCJustin Bogner2015-12-152-2/+9
| | | | | | | | | | | | | | | | | | | | | | A large number of loop utility functions take a `Pass *` and reach into it to find out which analyses to preserve. There are a number of problems with this: - The APIs have access to pretty well any Pass state they want, so it's hard to tell what they may or may not do. - Other APIs have copied these and pass around a `Pass *` even though they don't even use it. Some of these just hand a nullptr to the API since the callers don't even have a pass available. - Passes in the new pass manager don't work like the current ones, so the APIs can't be used as is there. Instead, we should explicitly thread the analysis results that we actually care about through these APIs. This is both simpler and more reusable. llvm-svn: 255669
* Bitcasts between FP and INT values using direct movesNemanja Ivanovic2015-12-153-4/+41
| | | | | | | | | | | | | | | This patch corresponds to review: http://reviews.llvm.org/D15286 This patch was meant to land in revision 255246, but I accidentally uploaded the patch that corresponds to http://reviews.llvm.org/D15372 in that revision accidentally. Thereby, this patch is the actual Bitcasts using direct moves patch, whereas http://reviews.llvm.org/rL255246 actually corresponds to http://reviews.llvm.org/D15372. llvm-svn: 255649
* Define a feature for __float128 support in the PPC back endNemanja Ivanovic2015-12-153-0/+7
| | | | | | | | | | | | | | This patch corresponds to review: http://reviews.llvm.org/D15117 In preparation for supporting IEEE Quad precision floating point, this patch simply defines a feature to specify the target supports this. For now, nothing is done with the target feature, we just don't want warnings from the Clang FE when a user specifies -mfloat128. Calling convention and other related work will add to this patch in the near future. llvm-svn: 255642
* [Power PC] llvm soft float support for ppc32Petar Jovanovic2015-12-146-8/+32
| | | | | | | | | | | This is the second in a set of patches for soft float support for ppc32, it enables soft float operations. Patch by Strahinja Petrovic. Differential Revision: http://reviews.llvm.org/D13700 llvm-svn: 255516
* [PPC] Early exit loop. NFC.Chad Rosier2015-12-141-1/+4
| | | | llvm-svn: 255497
* Normalize MBB's successors' probabilities in several locations.Cong Hou2015-12-131-2/+2
| | | | | | | | | | | | This patch adds some missing calls to MBB::normalizeSuccProbs() in several locations where it should be called. Those places are found by checking if the sum of successors' probabilities is approximate one in MachineBlockPlacement pass with some instrumented code (not in this patch). Differential revision: http://reviews.llvm.org/D15259 llvm-svn: 255455
* [PowerPC] OutStreamer cleanup in PPCAsmPrinterHal Finkel2015-12-121-23/+19
| | | | | | | | We don't need to pass OutStreamer as a parameter to LowerSTACKMAP and LowerPATCHPOINT. It is a member variable of PPCAsmPrinter, and thus, is already available. NFC. llvm-svn: 255418
* [PowerPC] Add Branch Hints for Highly-Biased BranchesHal Finkel2015-12-122-2/+74
| | | | | | | | | | | This branch adds hints for highly biased branches on the PPC architecture. Even in absence of profiling information, LLVM will mark code reaching unreachable terminators and other exceptional control flow constructs as highly unlikely to be reached. Patch by Tom Jablin! llvm-svn: 255398
* Start replacing vector_extract/vector_insert with extractelt/inserteltMatt Arsenault2015-12-112-14/+14
| | | | | | | | | | | | | | | | | | | | These are redundant pairs of nodes defined for INSERT_VECTOR_ELEMENT/EXTRACT_VECTOR_ELEMENT. insertelement/extractelement are slightly closer to the corresponding C++ node name, and has stricter type checking so prefer it. Update targets to only use these nodes where it is trivial to do so. AArch64, ARM, and Mips all have various type errors on simple replacement, so they will need work to fix. Example from AArch64: def : Pat<(sext_inreg (vector_extract (v16i8 V128:$Rn), VectorIndexB:$idx), i8), (i32 (SMOVvi8to32 V128:$Rn, VectorIndexB:$idx))>; Which is trying to do sext_inreg i8, i8. llvm-svn: 255359
* Fix build after r255319.Hans Wennborg2015-12-111-1/+1
| | | | llvm-svn: 255322
* [PPC]: Peephole optimize small accesss to aligned globals.Kyle Butt2015-12-111-9/+26
| | | | | | | | | | | | | | | | | | | | | | | Access to aligned globals gives us a chance to peephole optimize nonzero offsets. If a struct is 4 byte aligned, then accesses to bytes 0-3 won't overflow the available displacement. For example: addis 3, 2, b4v@toc@ha addi 4, 3, b4v@toc@l lbz 5, b4v@toc@l(3) ; This is the result of the current peephole lbz 6, 1(4) ; optimizer lbz 7, 2(4) lbz 8, 3(4) If b4v is 4-byte aligned, we can skip using register 4 because we know that b4v@toc@l+{1,2,3} won't overflow 32K, and instead generate: addis 3, 2, b4v@toc@ha lbz 4, b4v@toc@l(3) lbz 5, b4v@toc@l+1(3) lbz 6, b4v@toc@l+2(3) lbz 7, b4v@toc@l+3(3) Saving a register and an addition. Larger alignments allow larger structures/arrays to be optimized. llvm-svn: 255319
* PPC: Teach FMA mutate to respect register classes.Kyle Butt2015-12-101-2/+9
| | | | | | | | | This was causing bad code gen and assembly that won't assemble, as mixed altivec and vsx code would end up with a vsx high register assigned to an altivec instruction, which won't work. Constraining the classes allows the optimization to proceed. llvm-svn: 255299
* Bitcasts between FP and INT values using direct movesNemanja Ivanovic2015-12-101-90/+218
| | | | | | | | | | | | This patch corresponds to review: http://reviews.llvm.org/D15286 LLVM IR frequently contains bitcast operations between floating point and integer values of the same width. Doing this through memory operations is quite expensive on PPC. This patch allows the use of direct register moves between FPRs and GPRs for lowering bitcasts. llvm-svn: 255246
* [PPC64] Convert bool literals to i32Kit Barton2015-12-074-0/+261
| | | | | | | Convert i1 values to i32 values if they should be allocated in GPRs instead of CRs. Phabricator: http://reviews.llvm.org/D14064 llvm-svn: 254942
* Replace uint16_t with the MCPhysReg typedef in many places. A lot of ↵Craig Topper2015-12-052-3/+3
| | | | | | physical register arrays already use this typedef. llvm-svn: 254843
* [PowerPC] Remove wild call to RegScavenger::initRegState().Alexey Samsonov2015-12-021-2/+1
| | | | | | | | This call should in fact be made by RegScavenger::enterBasicBlock() called below. The first call does nothing except for triggering UB, indicated by UBSan (passing nullptr to memset()). llvm-svn: 254548
* Test Commit: iterateeKyle Butt2015-12-021-2/+2
| | | | | | Remove whitespace from blank lines. NFC llvm-svn: 254531
* Patch to fix a crash in the PowerPC back end due to ISD::ROTL and ISD::ROTRNemanja Ivanovic2015-12-021-0/+2
| | | | | | not being expanded. Test case included. llvm-svn: 254501
* Introduce new @llvm.get.dynamic.area.offset.i{32, 64} intrinsics.Yury Gribov2015-12-016-0/+60
| | | | | | | | | | | | | | | The @llvm.get.dynamic.area.offset.* intrinsic family is used to get the offset from native stack pointer to the address of the most recent dynamic alloca on the caller's stack. These intrinsics are intendend for use in combination with @llvm.stacksave and @llvm.restore to get a pointer to the most recent dynamic alloca. This is useful, for example, for AddressSanitizer's stack unpoisoning routines. Patch by Max Ostapenko. Differential Revision: http://reviews.llvm.org/D14983 llvm-svn: 254404
* Enable shrink wrapping for PPC64Kit Barton2015-11-301-6/+14
| | | | | | | | | | | Re-enable shrink wrapping for PPC64 Little Endian. One minor modification to PPCFrameLowering::findScratchRegister was necessary to handle fall-thru blocks (blocks with no terminator) correctly. Tested with all LLVM test, clang tests, and the self-hosting build, with no problems found. PHabricator: http://reviews.llvm.org/D14778 llvm-svn: 254314
* Expose isXxxConstant() functions from SelectionDAGNodes.h (NFC)Artyom Skrobov2015-11-251-12/+6
| | | | | | | | | | | | | | Summary: Many target lowerings copy-paste the code to test SDValues for known constants. This code can instead be shared in SelectionDAG.cpp, and reused in the targets. Reviewers: MatzeB, andreadb, tstellarAMD Subscribers: arsenm, jyknight, llvm-commits Differential Revision: http://reviews.llvm.org/D14945 llvm-svn: 254085
* [PowerPC] Don't generate mfocrf on the e500mcHal Finkel2015-11-251-1/+1
| | | | | | | | | The e500mc does not actually support the mfocrf instruction; update the processor definitions to reflect that fact. Patch by Tom Rix (with some test-case cleanup by me). llvm-svn: 254064
* Let SelectionDAG start to use probability-based interface to add successors.Cong Hou2015-11-241-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | The patch in http://reviews.llvm.org/D13745 is broken into four parts: 1. New interfaces without functional changes. 2. Use new interfaces in SelectionDAG, while in other passes treat probabilities as weights. 3. Use new interfaces in all other passes. 4. Remove old interfaces. This the second patch above. In this patch SelectionDAG starts to use probability-based interfaces in MBB to add successors but other MC passes are still using weight-based interfaces. Therefore, we need to maintain correct weight list in MBB even when probability-based interfaces are used. This is done by updating weight list in probability-based interfaces by treating the numerator of probabilities as weights. This change affects many test cases that check successor weight values. I will update those test cases once this patch looks good to you. Differential revision: http://reviews.llvm.org/D14361 llvm-svn: 253965
* Power8 and later support fusing addis/addi and addis/ld instructionEric Christopher2015-11-203-1/+7
| | | | | | | | | pairs that use the same register to execute as a single instruction. No Functional Change Patch by Kyle Butt! llvm-svn: 253724
* Weak non-function symbols were being accessed directly, which isEric Christopher2015-11-205-47/+64
| | | | | | | | | | incorrect, as the chosen representative of the weak symbol may not live with the code in question. Always indirect the access through the TOC instead. Patch by Kyle Butt! llvm-svn: 253708
* [Assembler] Make fatal assembler errors non-fatalOliver Stannard2015-11-171-2/+2
| | | | | | | | | | | | | | Currently, if the assembler encounters an error after parsing (such as an out-of-range fixup), it reports this as a fatal error, and so stops after the first error. However, for most of these there is an obvious way to recover after emitting the error, such as emitting the fixup with a value of zero. This means that we can report on all of the errors in a file, not just the first one. MCContext::reportError records the fact that an error was encountered, so we won't actually emit an object file with the incorrect contents. Differential Revision: http://reviews.llvm.org/D14717 llvm-svn: 253328
* Fix typos in comments.Jay Foad2015-11-171-1/+1
| | | | llvm-svn: 253324
* Drop prelink support.Rafael Espindola2015-11-171-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The way prelink used to work was * The compiler decides if a given section only has relocations that are know to point to the same DSO. If so, it names it .data.rel.ro.local<something>. * The static linker puts all of these together. * The prelinker program assigns addresses to each library and resolves the local relocations. There are many problems with this: * It is incompatible with address space randomization. * The information passed by the compiler is redundant. The linker knows if a given relocation is in the same DSO or not. If could sort by that if so desired. * There are newer ways of speeding up DSO (gnu hash for example). * Even if we want to implement this again in the compiler, the previous implementation is pretty broken. It talks about relocations that are "resolved by the static linker". If they are resolved, there are none left for the prelinker. What one needs to track is if an expression will require only dynamic relocations that point to the same DSO. At this point it looks like the prelinker is an historical curiosity. For example, fedora has retired it because it failed to build for two releases (http://pkgs.fedoraproject.org/cgit/prelink.git/commit/?id=eb43100a8331d91c801ee3dcdb0a0bb9babfdc1f) This patch removes support for it. That is, it stops printing the ".local" sections. llvm-svn: 253280
* Find available scratch register to use in function prologue and epilogue as ↵Kit Barton2015-11-162-4/+91
| | | | | | | part of shrink wrapping. Phabricator: http://reviews.llvm.org/D13955 llvm-svn: 253247
* Reduce the size of MCRelaxableFragment.Akira Hatanaka2015-11-141-1/+1
| | | | | | | | | | | | | | | | | | | | | | MCRelaxableFragment previously kept a copy of MCSubtargetInfo and MCInst to enable re-encoding the MCInst later during relaxation. A copy of MCSubtargetInfo (instead of a reference or pointer) was needed because the feature bits could be modified by the parser. This commit replaces the MCSubtargetInfo copy in MCRelaxableFragment with a constant reference to MCSubtargetInfo. The copies of MCSubtargetInfo are kept in MCContext, and the target parsers are now responsible for asking MCContext to provide a copy whenever the feature bits of MCSubtargetInfo have to be toggled. With this patch, I saw a 4% reduction in peak memory usage when I compiled verify-uselistorder.lto.bc using llc. rdar://problem/21736951 Differential Revision: http://reviews.llvm.org/D14346 llvm-svn: 253127
* [MCTargetAsmParser] Move the member varialbes that referenceAkira Hatanaka2015-11-141-7/+6
| | | | | | | | | | MCSubtargetInfo in the subclasses into MCTargetAsmParser and define a member function getSTI. This is done in preparation for making changes to shrink the size of MCRelaxableFragment. (see http://reviews.llvm.org/D14346). llvm-svn: 253124
* Add PPCMIPeephole.cpp to CMakeLists.txtBill Schmidt2015-11-101-0/+1
| | | | llvm-svn: 252654
* [PowerPC] Add an MI SSA peephole pass.Bill Schmidt2015-11-103-0/+241
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds a pass for doing PowerPC peephole optimizations at the MI level while the code is still in SSA form. This allows for easy modifications to the instructions while depending on a subsequent pass of DCE. Both passes are very fast due to the characteristics of SSA. At this time, the only peepholes added are for cleaning up various redundancies involving the XXPERMDI instruction. However, I would expect this will be a useful place to add more peepholes for inefficiencies generated during instruction selection. The pass is placed after VSX swap optimization, as it is best to let that pass remove unnecessary swaps before performing any remaining clean-ups. The utility of these clean-ups are demonstrated by changes to four existing test cases, all of which now have tighter expected code generation. I've also added Eric Schweiz's bugpoint-reduced test from PR25157, for which we now generate tight code. One other test started failing for me, and I've fixed it (test/Transforms/PlaceSafepoints/finite-loops.ll) as well; this is not related to my changes, and I'm not sure why it works before and not after. The problem is that the CHECK-NOT: of "statepoint" from test1 fails because of the "statepoint" in test2, and so forth. Adding a CHECK-LABEL in between keeps the different occurrences of that string properly scoped. llvm-svn: 252651
* [PowerPC] Remove redundant code.Tilmann Scheller2015-11-101-3/+2
| | | | | | | | The local variable Hi is never being read. Issue identified by the Clang static analyzer. llvm-svn: 252600
* [AsmParser] Backends can parameterize ASM tokenization.Colin LeMahieu2015-11-091-0/+1
| | | | llvm-svn: 252439
* [PowerPC] Fix LoopPreIncPrep not to depend on SCEV constant simplificationsHal Finkel2015-11-081-36/+78
| | | | | | | | | | | | | | | | | | | | | | | Under most circumstances, if SCEV can simplify X-Y to a constant, then it can also simplify Y-X to a constant. However, there is no guarantee that this is always true, and concensus is not to consider that a correctness bug in SCEV (although it is undesirable). PPCLoopPreIncPrep gathers pointers used to access memory (via loads, stores and prefetches) into buckets, where in each bucket the relative pointer offsets are constant. We used to keep each bucket as a multimap, where SCEV's subtraction operation was used to define the ordering predicate. Instead, use a fixed SCEV base expression for each bucket, record the constant offsets from that base expression, and adjust it later, if desirable, once all pointers have been collected. Doing it this way should be more compile-time efficient than the previous scheme (in addition to making the implementation less sensitive to SCEV simplification quirks). Fixes PR25170. llvm-svn: 252417
* [WinEH] Update exception pointer registersJoseph Tremoulet2015-11-072-10/+21
| | | | | | | | | | | | | | | | | | | | Summary: The CLR's personality routine passes these in rdx/edx, not rax/eax. Make getExceptionPointerRegister a virtual method parameterized by personality function to allow making this distinction. Similarly make getExceptionSelectorRegister a virtual method parameterized by personality function, for symmetry. Reviewers: pgavlin, majnemer, rnk Subscribers: jyknight, dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D14344 llvm-svn: 252383
* [WinEH] Mark funclet entries and exits as clobbering all registersReid Kleckner2015-11-061-1/+1
| | | | | | | | | | | | | | | | | Summary: In this implementation, LiveIntervalAnalysis invents a few register masks on basic block boundaries that preserve no registers. The nice thing about this is that it prevents the prologue inserter from thinking it needs to spill all XMM CSRs, because it doesn't see any explicit physreg defs in the MI. Reviewers: MatzeB, qcolombet, JosephTremoulet, majnemer Subscribers: MatzeB, llvm-commits Differential Revision: http://reviews.llvm.org/D14407 llvm-svn: 252318
* replace MachineCombinerPattern namespace and enum with enum class; NFCISanjay Patel2015-11-052-2/+2
| | | | | | | | Also, remove an enum hack where enum values were used as indexes into an array. We may want to make this a real class to allow pattern-based queries/customization (D13417). llvm-svn: 252196
OpenPOWER on IntegriCloud