summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp
Commit message (Collapse)AuthorAgeFilesLines
...
* [PowerPC] Use rldicr instruction for AND with an immediate if possibleNemanja Ivanovic2017-02-241-0/+13
| | | | | | | | | | | Emit clrrdi (extended mnemonic for rldicr) for AND-ing with masks that clear bits from the right hand size. Committing on behalf of Hiroshi Inoue. Differential Revision: https://reviews.llvm.org/D29388 llvm-svn: 296143
* [PPC] cleanup of mayLoad/mayStore flags and memory operands.Sean Fertile2017-01-261-1/+5
| | | | | | | | | | | | 1) Explicitly sets mayLoad/mayStore property in the tablegen files on load/store instructions. 2) Updated the flags on a number of intrinsics indicating that they write memory. 3) Added SDNPMemOperand flags for some target dependent SDNodes so that they propagate their memory operand Review: https://reviews.llvm.org/D28818 llvm-svn: 293200
* [PowerPC] Expand ISEL instruction into if-then-else sequence.Tony Jiang2017-01-161-4/+0
| | | | | | | | | Generally, the ISEL is expanded into if-then-else sequence, in some cases (like when the destination register is the same with the true or false value register), it may just be expanded into just the if or else sequence. llvm-svn: 292154
* Revert "[PowerPC] Expand ISEL instruction into if-then-else sequence."Tony Jiang2017-01-161-0/+4
| | | | | | This reverts commit 1d0e0374438ca6e153844c683826ba9b82486bb1. llvm-svn: 292131
* [PowerPC] Expand ISEL instruction into if-then-else sequence.Tony Jiang2017-01-161-4/+0
| | | | | | | | | Generally, the ISEL is expanded into if-then-else sequence, in some cases (like when the destination register is the same with the true or false value register), it may just be expanded into just the if or else sequence. llvm-svn: 292128
* [PowerPC] Fix some Clang-tidy modernize and Include What You Use warnings; ↵Eugene Zelenko2017-01-131-33/+53
| | | | | | other minor fixes (NFC). llvm-svn: 291872
* Create the virtual register for the global base in the intersection ofJoerg Sonnenberger2016-11-021-2/+2
| | | | | | | | GPRC and GPRC_NOR0 (or the 64bit equivalent) and not just the latter. GPRC_NOR0 contains ZERO as alternative meaning of r0 and is therefore not a true subclass of GPRC. llvm-svn: 285813
* [PowerPC] - No SExt/ZExt needed for count trailing zerosNemanja Ivanovic2016-10-271-2/+4
| | | | | | | | | This patch corresponds to review: https://reviews.llvm.org/D25896 It just eliminates the redundant ZExt after a count trailing zeros instruction. llvm-svn: 285267
* [PPC] Better codegen for AND, ANY_EXT, SRL sequenceEhsan Amiri2016-10-241-0/+17
| | | | | | | | https://reviews.llvm.org/D24924 This improves the code generated for a sequence of AND, ANY_EXT, SRL instructions. This is a targetted fix for this special pattern. The pattern is generated by target independet dag combiner and so a more general fix may not be necessary. If we come across other similar cases, some ideas for handling it are discussed on the code review. llvm-svn: 284983
* [PPC] Shorter sequence to load 64bit constant with same hi/lo wordsGuozhi Wei2016-10-141-2/+23
| | | | | | | | | | | | This is a patch to implement pr30640. When a 64bit constant has the same hi/lo words, we can use rldimi to copy the low word into high word of the same register. This optimization caused failure of test case bperm.ll because of not optimal heuristic in function SelectAndParts64. It chooses AND or ROTATE to extract bit groups from a register, and OR them together. This optimization lowers the cost of loading 64bit constant mask used in AND method, and causes different code sequence. But actually ROTATE method is better in this test case. The reason is in ROTATE method the final OR operation can be avoided since rldimi can insert the rotated bits into target register directly. So this patch also enhances SelectAndParts64 to prefer ROTATE method when the two methods have same cost and there are multiple bit groups need to be ORed together. Differential Revision: https://reviews.llvm.org/D25521 llvm-svn: 284276
* Use StringRef in Pass/PassManager APIs (NFC)Mehdi Amini2016-10-011-1/+1
| | | | llvm-svn: 283004
* getValueType().getSizeInBits() -> getValueSizeInBits() ; NFCISanjay Patel2016-09-141-5/+5
| | | | llvm-svn: 281493
* [PowerPC] Fix address-offset folding for plain addiHal Finkel2016-09-071-15/+38
| | | | | | | | | | | | | | | | | | | | | | | When folding an addi into a memory access that can take an immediate offset, we were implicitly assuming that the existing offset was zero. This was incorrect. If we're dealing with an addi with a plain constant, we can add it to the existing offset (assuming that doesn't overflow the immediate, etc.), but if we have anything else (i.e. something that will become a relocation expression), we'll go back to requiring the existing immediate offset to be zero (because we don't know what the requirements on that relocation expression might be - e.g. maybe it is paired with some addis in some relevant way). On the other hand, when dealing with a plain addi with a regular constant immediate, the alignment restrictions (from the TOC base pointer, etc.) are irrelevant. I've added the test case from PR30280, which demonstrated the bug, but also demonstrates a missed optimization opportunity (i.e. we don't need the memory accesses at all). Fixes PR30280. llvm-svn: 280789
* [PowerPC] For larger offsets, when possible, fold offset into addis toc@haHal Finkel2016-09-021-2/+27
| | | | | | | | | | | | | When we have an offset into a global, etc. that is accessed relative to the TOC base pointer, and the offset is larger than the minimum alignment of the global itself and the TOC base pointer (which is 8-byte aligned), we can still fold the @toc@ha into the memory access, but we must update the addis instruction's symbol reference with the offset as the symbol addend. When there is only one use of the addi to be folded and only one use of the addis that would need its symbol's offset adjusted, then we can make the adjustment and fold the @toc@l into the memory access. llvm-svn: 280545
* [PowerPC] Don't apply the PPC64 address-formation peephole for offsets ↵Hal Finkel2016-09-021-2/+7
| | | | | | | | | | | | | | | | greater than 7 When applying our address-formation PPC64 peephole, we are reusing the @ha TOC addis value with the low parts associated with different offsets (i.e. different effective symbol addends). We were assuming this was okay so long as the offsets were less than the alignment of the global variable being accessed. This ignored the fact, however, that the TOC base pointer itself need only be 8-byte aligned. As a result, what we were doing is legal only for offsets less than 8 regardless of the alignment of the object being accessed. Fixes PR28727. llvm-svn: 280441
* [PowerPC] Don't consider fusion in PPC64 address-formation peepholeHal Finkel2016-09-021-7/+0
| | | | | | | | | | | | | The logic in this function assumes that the P8 supports fusion of addis/addi, but it does not. As a result, there is no advantage to restricting our peephole application, merging addi instructions into dependent memory accesses, even when the addi has multiple users, regardless of whether or not we're optimizing for size. We might need something like this again for the P9; I suspect we'll revisit this code when we work on P9 tuning. llvm-svn: 280440
* Replace "fallthrough" comments with LLVM_FALLTHROUGHJustin Bogner2016-08-171-1/+2
| | | | | | | This is a mechanical change of comments in switches like fallthrough, fall-through, or fall-thru to use the LLVM_FALLTHROUGH macro instead. llvm-svn: 278902
* [PPC] Memoize getValueBits. NFC.Tim Shen2016-08-121-35/+49
| | | | | | | | | | | | Summary: It triggers exponential behavior when the DAG has many branches. Reviewers: hfinkel, kbarton Subscribers: iteratee, nemanjai, echristo Differential Revision: https://reviews.llvm.org/D23428 llvm-svn: 278548
* Use the range variant of remove_if instead of unpacking begin/endDavid Majnemer2016-08-121-2/+1
| | | | | | No functionality change is intended. llvm-svn: 278475
* [Codegen] Change PICLevel.Davide Italiano2016-06-171-1/+1
| | | | | | | | | We convert `Default` to `NotPIC` so that target independent code can reason about this correctly. Differential Revision: http://reviews.llvm.org/D21394 llvm-svn: 273024
* Pass DebugLoc and SDLoc by const ref.Benjamin Kramer2016-06-121-15/+17
| | | | | | | | This used to be free, copying and moving DebugLocs became expensive after the metadata rewrite. Passing by reference eliminates a ton of track/untrack operations. No functionality change intended. llvm-svn: 272512
* Remove bogus initialization of the PPC and Hexagon SelectionDAGISelChandler Carruth2016-06-031-18/+1
| | | | | | | | | | | | | | | | | subclasses. These are not passes proper. We don't support registering them, they can't be constructed with default arguments, and the ID is actually in a base class. Only these two targets even had any boiler plate to try to do this, and it had to be munged out of the INITIALIZE_PASS macros to work. What's worse, the boiler plate has rotted and the "name" of the pass is actually the description string now!!! =/ All of this is completely unnecessary. No other target bothers, and nothing breaks if you don't initialize them because CodeGen has an entirely separate initialization path that is somewhat more durable than relying on the implicit initialization the way the 'opt' tool does for registered passes. llvm-svn: 271650
* SDAG: Implement Select instead of SelectImpl in PPCDAGToDAGISelJustin Bogner2016-05-201-151/+209
| | | | | | | | | | | - Where we were returning a node before, call ReplaceNode instead. - Where we would return null to fall back to another selector, rename the method to try* and return a bool for success. - Where we were calling SelectNodeTo, just return afterwards. Part of llvm.org/pr26808. llvm-svn: 270283
* SDAG: Rename Select->SelectImpl and repurpose Select as returning voidJustin Bogner2016-05-051-4/+3
| | | | | | | | | | | | | | This is a step towards removing the rampant undefined behaviour in SelectionDAG, which is a part of llvm.org/PR26808. We rename SelectionDAGISel::Select to SelectImpl and update targets to match, and then change Select to return void and consolidate the sketchy behaviour we're trying to get away from there. Next, we'll update backends to implement `void Select(...)` instead of SelectImpl and eventually drop the base Select implementation. llvm-svn: 268693
* Remove uses of builtin comma operator.Richard Trieu2016-02-181-16/+24
| | | | | | Cleanup for upcoming Clang warning -Wcomma. No functionality change intended. llvm-svn: 261270
* Refactor: Simplify boolean conditional return statements in lib/Target/PowerPCAlexander Kornienko2015-12-281-4/+1
| | | | | | | | | | | | | | Summary: Use clang-tidy to simplify boolean conditional return statements Reviewers: uweigand, rafael, wschmidt Subscribers: craig.topper, llvm-commits Patch by Richard Thomson! Differential Revision: http://reviews.llvm.org/D9984 llvm-svn: 256493
* [BPI] Replace weights by probabilities in BPI.Cong Hou2015-12-221-11/+9
| | | | | | | | | | | | This patch removes all weight-related interfaces from BPI and replace them by probability versions. With this patch, we won't use edge weight anymore in either IR or MC passes. Edge probabilitiy is a better representation in terms of CFG update and validation. Differential revision: http://reviews.llvm.org/D15519 llvm-svn: 256263
* [PowerPC] Add Branch Hints for Highly-Biased BranchesHal Finkel2015-12-121-2/+66
| | | | | | | | | | | This branch adds hints for highly biased branches on the PPC architecture. Even in absence of profiling information, LLVM will mark code reaching unreachable terminators and other exceptional control flow constructs as highly unlikely to be reached. Patch by Tom Jablin! llvm-svn: 255398
* Fix build after r255319.Hans Wennborg2015-12-111-1/+1
| | | | llvm-svn: 255322
* [PPC]: Peephole optimize small accesss to aligned globals.Kyle Butt2015-12-111-9/+26
| | | | | | | | | | | | | | | | | | | | | | | Access to aligned globals gives us a chance to peephole optimize nonzero offsets. If a struct is 4 byte aligned, then accesses to bytes 0-3 won't overflow the available displacement. For example: addis 3, 2, b4v@toc@ha addi 4, 3, b4v@toc@l lbz 5, b4v@toc@l(3) ; This is the result of the current peephole lbz 6, 1(4) ; optimizer lbz 7, 2(4) lbz 8, 3(4) If b4v is 4-byte aligned, we can skip using register 4 because we know that b4v@toc@l+{1,2,3} won't overflow 32K, and instead generate: addis 3, 2, b4v@toc@ha lbz 4, b4v@toc@l(3) lbz 5, b4v@toc@l+1(3) lbz 6, b4v@toc@l+2(3) lbz 7, b4v@toc@l+3(3) Saving a register and an addition. Larger alignments allow larger structures/arrays to be optimized. llvm-svn: 255319
* Test Commit: iterateeKyle Butt2015-12-021-2/+2
| | | | | | Remove whitespace from blank lines. NFC llvm-svn: 254531
* Weak non-function symbols were being accessed directly, which isEric Christopher2015-11-201-8/+5
| | | | | | | | | | incorrect, as the chosen representative of the weak symbol may not live with the code in question. Always indirect the access through the TOC instead. Patch by Kyle Butt! llvm-svn: 253708
* [PowerPC] Remove redundant code.Tilmann Scheller2015-11-101-3/+2
| | | | | | | | The local variable Hi is never being read. Issue identified by the Clang static analyzer. llvm-svn: 252600
* Fix for bootstrap bug introduced in r244921Nemanja Ivanovic2015-11-021-1/+1
| | | | | | | | | | This revision has introduced an issue that only affects bootstrapped compiler when it is printing the ASM. It turns out that the new code path taken due to legalizing a scalar_to_vector of i64 -> v2i64 exposes a missing check in a micro optimization to change a load followed by a scalar_to_vector into a load and splat instruction on PPC. llvm-svn: 251798
* PowerPC: Remove implicit ilist iterator conversions, NFCDuncan P. N. Exon Smith2015-10-201-3/+3
| | | | llvm-svn: 250787
* [PowerPC] Fix invalid lxvdsx optimization (PR25157)Bill Schmidt2015-10-141-0/+2
| | | | | | | | | | | | PR25157 identifies a bug where a load plus a vector shuffle is incorrectly converted into an LXVDSX instruction. That optimization is only valid if the load is of a doubleword, and in the noted case, it was not. This corrects that problem. Joint patch with Eric Schweitz, who provided the bugpoint-reduced test case. llvm-svn: 250324
* MachineBasicBlock: Factor out common code into isReturnBlock()Matthias Braun2015-09-251-1/+1
| | | | llvm-svn: 248617
* [PowerPC] Fix and(or(x, c1), c2) -> rlwimi generationHal Finkel2015-09-051-3/+15
| | | | | | | | | | | | | | | | PPCISelDAGToDAG has a transformation that generates a rlwimi instruction from an input pattern that looks like this: and(or(x, c1), c2) but the associated logic does not work if there are bits that are 1 in c1 but 0 in c2 (these are normally canonicalized away, but that can't happen if the 'or' has other users. Make sure we abort the transformation if such bits are discovered. Fixes PR24704. llvm-svn: 246900
* [PowerPC] Fix value type on XVCMPEQDP for v2f64 comparisonsHal Finkel2015-08-201-3/+4
| | | | | | | | | XVCMPEQDP is used for VSX v2f64 equality comparisons, but the value type needs to be v2i64 (as that's the corresponding SETCC type). Fixes PR24225. llvm-svn: 245535
* Fix some comment typos.Benjamin Kramer2015-08-081-1/+1
| | | | llvm-svn: 244402
* Add allnodes() iterator range to SelectionDAG. NFC.Pete Cooper2015-07-141-3/+2
| | | | | | | | | | | SelectionDAG already had begin/end methods for iterating over all the nodes, but didn't define an iterator_range for us in foreach loops. This adds such a method and uses it in some of the eligible places throughout the backends. llvm-svn: 242212
* Make TargetLowering::getPointerTy() taking DataLayout as an argumentMehdi Amini2015-07-091-14/+22
| | | | | | | | | | | | | | | | Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, ted, yaron.keren, rafael, llvm-commits Differential Revision: http://reviews.llvm.org/D11028 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241775
* IR: Do not consider available_externally linkage to be linker-weak.Peter Collingbourne2015-07-051-1/+1
| | | | | | | | | | | | | | | From the linker's perspective, an available_externally global is equivalent to an external declaration (per isDeclarationForLinker()), so it is incorrect to consider it to be a weak definition. Also clean up some logic in the dead argument elimination pass and clarify its comments to better explain how its behavior depends on linkage, introduce GlobalValue::isStrongDefinitionForLinker() and start using it throughout the optimizers and backend. Differential Revision: http://reviews.llvm.org/D10941 llvm-svn: 241413
* [PPC64LE] Enable missing lxvdsx optimization, and related swap optimizationBill Schmidt2015-07-011-12/+11
| | | | | | | | | | | | | | | | | | | | | | | When adding little-endian vector support for PowerPC last year, I inadvertently disabled an optimization that recognizes a load-splat idiom and generates the lxvdsx instruction. This patch moves the offending logic so lxvdsx is once again generated. This pattern is frequently generated by the vectorizer for scalar loads of an effective constant. Previously the lxvdsx instruction was wrongly listed as lane-sensitive for the VSX swap optimization (since both doublewords are identical, swaps are safe). This patch fixes this as well, so that vectorized code using lxvdsx can now have swaps removed from the computation. There is an existing test (@test50) in test/CodeGen/PowerPC/vsx.ll that checks for the missing optimization. However, vsx.ll was only being tested for POWER7 with big-endian code generation. I've added a little-endian RUN statement and expected LE code generation for all the tests in vsx.ll to give us a bit better VSX coverage, including what's needed for this patch. llvm-svn: 241183
* Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC)Alexander Kornienko2015-06-231-1/+1
| | | | | | Apparently, the style needs to be agreed upon first. llvm-svn: 240390
* [PPC] Factor vector removal into a function and remove O(n^2) behavior.Benjamin Kramer2015-06-201-25/+17
| | | | | | No functionality change intended. llvm-svn: 240222
* Fixed/added namespace ending comments using clang-tidy. NFCAlexander Kornienko2015-06-191-1/+1
| | | | | | | | | | | | | The patch is generated using this command: tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \ -checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \ llvm/lib/ Thanks to Eugene Kosov for the original patch! llvm-svn: 240137
* Add VSX Scalar loads and stores to the PPC back endNemanja Ivanovic2015-05-071-1/+6
| | | | | | | | | | | This patch corresponds to review: http://reviews.llvm.org/D9440 It adds a new register class to the PPC back end to contain single precision values in VSX registers. Additionally, it adds scalar loads and stores for VSX registers. llvm-svn: 236755
* Reapply r235977 "[DebugInfo] Add debug locations to constant SD nodes"Sergey Dmitrouk2015-04-281-96/+122
| | | | | | | | | | | | | | | | | | | | | | | | | [DebugInfo] Add debug locations to constant SD nodes This adds debug location to constant nodes of Selection DAG and updates all places that create constants to pass debug locations (see PR13269). Can't guarantee that all locations are correct, but in a lot of cases choice is obvious, so most of them should be. At least all tests pass. Tests for these changes do not cover everything, instead just check it for SDNodes, ARM and AArch64 where it's easy to get incorrect locations on constants. This is not complete fix as FastISel contains workaround for wrong debug locations, which drops locations from instructions on processing constants, but there isn't currently a way to use debug locations from constants there as llvm::Constant doesn't cache it (yet). Although this is a bit different issue, not directly related to these changes. Differential Revision: http://reviews.llvm.org/D9084 llvm-svn: 235989
* Revert "[DebugInfo] Add debug locations to constant SD nodes"Daniel Jasper2015-04-281-122/+96
| | | | | | | This breaks a test: http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/23870 llvm-svn: 235987
OpenPOWER on IntegriCloud