summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/PowerPC
Commit message (Collapse)AuthorAgeFilesLines
...
* [NFC] fix formattingHiroshi Inoue2018-06-081-1/+1
| | | | llvm-svn: 334263
* [PowerPC] avoid unprofitable Repl32 flag in BitPermutationSelectorHiroshi Inoue2018-06-071-0/+14
| | | | | | | | | | | | | | | | BitPermutationSelector sets Repl32 flag for bit groups which can be (potentially) benefit from 32-bit rotate-and-mask instructions with bit replication, i.e. rlwinm/rlwimi copies lower 32 bits into upper 32 bits on 64-bit PowerPC before rotation. However, enforcing 32-bit instruction sometimes results in redundant generated code. For example, the following simple code is compiled into rotldi + rlwimi while it can be compiled into only rldimi instruction if Repl32 flag is not set on the bit group for (a & 0xFFFFFFFF). uint64_t func(uint64_t a, uint64_t b) { return (a & 0xFFFFFFFF) | (b << 32) ; } To avoid such problem, this patch checks the potential benefit of Repl32 flag before setting it. If a bit group does not require rotation (i.e. RLAmt == 0) and won't be merged into another group, we do not benefit from Repl32 flag on this group. Differential Revision: https://reviews.llvm.org/D47867 llvm-svn: 334195
* [PowerPC] fix trivial typos in comment, NFCHiroshi Inoue2018-06-071-2/+2
| | | | llvm-svn: 334191
* [MC] Pass MCSubtargetInfo to fixupNeedsRelaxation and applyFixupPeter Smith2018-06-061-2/+4
| | | | | | | | | | | | | | | | | | On targets like Arm some relaxations may only be performed when certain architectural features are available. As functions can be compiled with differing levels of architectural support we must make a judgement on whether we can relax based on the MCSubtargetInfo for the function. This change passes through the MCSubtargetInfo for the function to fixupNeedsRelaxation so that the decision on whether to relax can be made per function. In this patch, only the ARM backend makes use of this information. We must also pass the MCSubtargetInfo to applyFixup because some fixups skip error checking on the assumption that relaxation has occurred, to prevent code-generation errors applyFixup must see the same MCSubtargetInfo as fixupNeedsRelaxation. Differential Revision: https://reviews.llvm.org/D44928 llvm-svn: 334078
* [PowerPC] reduce rotate in BitPermutationSelectorHiroshi Inoue2018-06-051-1/+7
| | | | | | | | | | | | | | BitPermutationSelector builds the output value by repeating rotate-and-mask instructions with input registers. Here, we may avoid one rotate instruction if we start building from an input register that does not require rotation. For example of the test case bitfieldinsert.ll, it first rotates left r4 by 8 bits and then inserts some bits from r5 without rotation. This can be executed by one rlwimi instruction, which rotates r4 by 8 bits and inserts its bits into r5. This patch adds a check for rotation amounts in the comparator used in sorting to process the input without rotation first. Differential Revision: https://reviews.llvm.org/D47765 llvm-svn: 334011
* Move Analysis/Utils/Local.h back to TransformsDavid Blaikie2018-06-042-2/+2
| | | | | | | | | | Review feedback from r328165. Split out just the one function from the file that's used by Analysis. (As chandlerc pointed out, the original change only moved the header and not the implementation anyway - which was fine for the one function that was used (since it's a template/inlined in the header) but not in general) llvm-svn: 333954
* [NFC] Zero initialize local variablesHiroshi Inoue2018-06-011-1/+1
| | | | | | This patch makes local variables zero initialized to avoid broken values in debug output. llvm-svn: 333754
* Set ADDE/ADDC/SUBE/SUBC to expand by defaultAmaury Sechet2018-06-011-0/+9
| | | | | | | | | | | | | | | Summary: They've been deprecated in favor of UADDO/ADDCARRY or USUBO/SUBCARRY for a while. Target that uses these opcodes are changed in order to ensure their behavior doesn't change. Reviewers: efriedma, craig.topper, dblaikie, bkramer Subscribers: jholewinski, arsenm, jyknight, sdardis, nemanjai, nhaehnle, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, mgrang, atanasyan, llvm-commits Differential Revision: https://reviews.llvm.org/D47422 llvm-svn: 333748
* [PowerPC] Fix the incorrect iterator inside peepholeLei Huang2018-05-291-6/+3
| | | | | | | | | | | | Instruction selection can insert nodes into the underlying list after the root node so iterating will thereby miss it. We should NOT assume that, the root node is the last element in the DAG nodelist. Patch by: steven.zhang (Qing Shan Zhang) Differential Revision: https://reviews.llvm.org/D47437 llvm-svn: 333415
* [Power9]Legalize and emit code for HW/Byte vector extract and convert to QPLei Huang2018-05-281-0/+63
| | | | | | | | | Implemente patterns to extract HWord and Byte vector elements and convert to quad-precision. Differential Revision: https://reviews.llvm.org/D46774 llvm-svn: 333377
* [PowerPC] Set isAsmParserOnly=1 for X-form TLS loads/storesZaara Syeda2018-05-282-6/+33
| | | | | | | | | | | | The X-form TLS load/store instructions added for optimizing the initial-exec sequence in https://reviews.llvm.org/rL327635 fail to assemble. llvm-mc fails with the error: invalid operand for instruction. This patch adds these instructions into a block with isAsmParserOnly, similar to how ADD8TLS_ is currently handled. Differential Revision: https://reviews.llvm.org/D47382 llvm-svn: 333374
* [PowerPC] Remove the match pattern in the definition of LXSDX/STXSDXLei Huang2018-05-241-2/+2
| | | | | | | | | | | | | | The match pattern in the definition of LXSDX is xoaddr, so the Pseudo instruction XFLOADf64 never gets selected. XFLOADf64 expands to LXSDX/LFDX post RA based on the register pressure. To avoid ambiguity, we need to remove the select pattern for LXSDX, same as what was done for LXSD. STXSDX also have the same issue. Patch by Qing Shan Zhang (steven.zhang). Differential Revision: https://reviews.llvm.org/D47178 llvm-svn: 333150
* [Power9]Legalize and emit code for W vector extract and convert to QPLei Huang2018-05-231-5/+37
| | | | | | | | | Implemente patterns to extract [Un]signed Word vector element and convert to quad-precision. Differential Revision: https://reviews.llvm.org/D46536 llvm-svn: 333115
* [Power9]Legalize and emit code for DW vector extract and convert to QPLei Huang2018-05-231-0/+27
| | | | | | | | | Implemente patterns to extract [Un]signed DWord vector element and convert to quad-precision. Differential Revision: https://reviews.llvm.org/D46333 llvm-svn: 333112
* MC: Separate creating a generic object writer from creating a target object ↵Peter Collingbourne2018-05-214-27/+17
| | | | | | | | | | | | | writer. NFCI. With this we gain a little flexibility in how the generic object writer is created. Part of PR37466. Differential Revision: https://reviews.llvm.org/D47045 llvm-svn: 332868
* MC: Change MCAsmBackend::writeNopData() to take a raw_ostream instead of an ↵Peter Collingbourne2018-05-211-17/+17
| | | | | | | | | | | | | MCObjectWriter. NFCI. To make this work I needed to add an endianness field to MCAsmBackend so that writeNopData() implementations know which endianness to use. Part of PR37466. Differential Revision: https://reviews.llvm.org/D47035 llvm-svn: 332857
* Support: Simplify endian stream interface. NFCI.Peter Collingbourne2018-05-181-11/+4
| | | | | | | | | | | | Provide some free functions to reduce verbosity of endian-writing a single value, and replace the endianness template parameter with a field. Part of PR37466. Differential Revision: https://reviews.llvm.org/D47032 llvm-svn: 332757
* [NFC] [Power] Fix instruction format for xsrqpiZaara Syeda2018-05-142-1/+22
| | | | | | | | | | | | xsrqpi is currently using Z23Form_1. The instruction format is xsrqpi R,VRT,VRB,RMC. Rathar than bits 11-15 being used for FRA, it should have bits 11-14 reserved and bit 15 for R. This patch adds a new class Z23Form_4 to fix the instruction format. Differential Revision: https://reviews.llvm.org/D46761 llvm-svn: 332253
* Rename DEBUG macro to LLVM_DEBUG.Nicola Zaghen2018-05-1415-354/+370
| | | | | | | | | | | | | | | | The DEBUG() macro is very generic so it might clash with other projects. The renaming was done as follows: - git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g' - git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM - Manual change to APInt - Manually chage DOCS as regex doesn't match it. In the transition period the DEBUG() macro is still present and aliased to the LLVM_DEBUG() one. Differential Revision: https://reviews.llvm.org/D43624 llvm-svn: 332240
* [STLExtras] Add distance() for ranges, pred_size(), and succ_size()Vedant Kumar2018-05-101-2/+1
| | | | | | | | | | | This commit adds a wrapper for std::distance() which works with ranges. As it would be a common case to write `distance(predecessors(BB))`, this also introduces `pred_size()` and `succ_size()` helpers to make that easier to write. Differential Revision: https://reviews.llvm.org/D46668 llvm-svn: 332057
* [DebugInfo] Examine all uses of isDebugValue() for debug instructions.Shiva Chen2018-05-094-6/+6
| | | | | | | | | | | | | | | | | | Because we create a new kind of debug instruction, DBG_LABEL, we need to check all passes which use isDebugValue() to check MachineInstr is debug instruction or not. When expelling debug instructions, we should expel both DBG_VALUE and DBG_LABEL. So, I create a new function, isDebugInstr(), in MachineInstr to check whether the MachineInstr is debug instruction or not. This patch has no new test case. I have run regression test and there is no difference in regression test. Differential Revision: https://reviews.llvm.org/D45342 Patch by Hsiangkai Wang. llvm-svn: 331844
* [Power9]Legalize and emit code for truncate and convert QP to HW and ByteLei Huang2018-05-081-2/+14
| | | | | | | | | Legalize and emit code for truncate and convert float128 to (un)signed short and (un)signed char. Differential Revision: https://reviews.llvm.org/D46194 llvm-svn: 331797
* [Power9]Legalize and emit code for truncate and convert Quad-Precision to WordLei Huang2018-05-081-0/+10
| | | | | | | | | | | Legalize and emit code for: * xscvqpswz : VSX Scalar truncate & Convert Quad-Precision to Signed Word * xscvqpuwz : VSX Scalar truncate & Convert Quad-Precision to Unsigned Word Differential Revision: https://reviews.llvm.org/D45635 llvm-svn: 331790
* [Power9]Legalize and emit code for truncate and convert QP to DWLei Huang2018-05-082-2/+27
| | | | | | | | | | | Legalize and emit code for: * xscvqpsdz : VSX Scalar truncate & Convert Quad-Precision to Signed Dword * xscvqpudz : VSX Scalar truncate & Convert Quad-Precision to Unsigned Dword Differential Revision: https://reviews.llvm.org/D45553 llvm-svn: 331787
* [PowerPC] Unify handling for conversion of FP_TO_INT feeding a storeLei Huang2018-05-084-56/+149
| | | | | | | | | | | | Existing DAG combine only handles conversions for FP_TO_SINT: "{f32, f64} x { i32, i16 }" This patch simplifies the code to handle: "{ FP_TO_SINT, FP_TO_UINT } x { f64, f32 } x { i64, i32, i16, i8 }" Differential Revision: https://reviews.llvm.org/D46102 llvm-svn: 331778
* Commit r331416 breaks the big-endian PPC bot. On the big endian build, weNemanja Ivanovic2018-05-031-0/+3
| | | | | | | actually encounter constants wider than 64-bits. Add the guard to prevent tripping the assert. llvm-svn: 331420
* [PowerPC] Implement isMaskAndCmp0FoldingBeneficialNemanja Ivanovic2018-05-022-0/+15
| | | | | | | | | | | Sinking the and closer to a compare against zero is beneficial on PPC as it allows us to emit record-form instructions. In the future, we may expand this to a larger set of operations that feed compares against zero since PPC has lots of record-form instructions. Differential revision: https://reviews.llvm.org/D46060 llvm-svn: 331416
* [PowerPC] No CTR loop if the candidate exiting block is in a different loopNemanja Ivanovic2018-05-021-0/+14
| | | | | | | | | | | | | | | | The CTR loops pass will insert the decrementing branch instruction in an exiting block for the loop being transformed. However if that block is part of another loop as well (whether a nested loop or with irreducible CFG), it is not valid to use that exiting block. In fact, if the loop hass irreducible CFG, we don't bother analyzing it and we just bail on the transformation. In practice, this doesn't lead to a noticeable reduction in the number of loops transformed by this pass. Fixes https://bugs.llvm.org/show_bug.cgi?id=37229 Differential Revision: https://reviews.llvm.org/D46162 llvm-svn: 331410
* Remove \brief commands from doxygen comments.Adrian Prantl2018-05-014-13/+13
| | | | | | | | | | | | | | | | We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done Differential Revision: https://reviews.llvm.org/D46290 llvm-svn: 331272
* IWYU for llvm-config.h in llvm, additions.Nico Weber2018-04-302-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | See r331124 for how I made a list of files missing the include. I then ran this Python script: for f in open('filelist.txt'): f = f.strip() fl = open(f).readlines() found = False for i in xrange(len(fl)): p = '#include "llvm/' if not fl[i].startswith(p): continue if fl[i][len(p):] > 'Config': fl.insert(i, '#include "llvm/Config/llvm-config.h"\n') found = True break if not found: print 'not found', f else: open(f, 'w').write(''.join(fl)) and then looked through everything with `svn diff | diffstat -l | xargs -n 1000 gvim -p` and tried to fix include ordering and whatnot. No intended behavior change. llvm-svn: 331184
* Consistently sort add_subdirectory calls in lib/Target/*/CMakeLists.txtNico Weber2018-04-231-1/+1
| | | | llvm-svn: 330584
* [PowerPC] fix incorrect vectorization of abs() on POWER9Hiroshi Inoue2018-04-212-14/+95
| | | | | | | | | | | | | | | | | | | | Vectorized loops with abs() returns incorrect results on POWER9. This patch fixes it. For example the following code returns negative result if input values are negative though it sums up the absolute value of the inputs. int vpx_satd_c(const int16_t *coeff, int length) { int satd = 0; for (int i = 0; i < length; ++i) satd += abs(coeff[i]); return satd; } This problem causes test failures for libvpx. For vector absolute and vector absolute difference on POWER9, LLVM generates VABSDUW (Vector Absolute Difference Unsigned Word) instruction or variants. Since these instructions are for unsigned integers, we need adjustment for signed integers. For abs(sub(a, b)), we generate VABSDUW(a+0x80000000, b+0x80000000). Otherwise, abs(sub(-1, 0)) returns 0xFFFFFFFF(=-1) instead of 1. For abs(a), we generate VABSDUW(a+0x80000000, 0x80000000). Differential Revision: https://reviews.llvm.org/D45522 llvm-svn: 330497
* [Power9]Legalize and emit code for converting Unsigned HWord/Char to ↵Lei Huang2018-04-181-0/+8
| | | | | | | | | | | | | | | | Quad-Precision Legalize and emit code for converting unsigned HWord/Char to QP: xscvsdqp xscvudqp Only covering patterns for unsigned forms cause we don't have part-word sign-extending integer loads into VSX registers. Differential Revision: https://reviews.llvm.org/D45494 llvm-svn: 330278
* [Power9]Legalize and emit code for converting (Un)Signed Word to Quad-PrecisionLei Huang2018-04-181-1/+10
| | | | | | | | | | | Legalize and emit code for converting (Un)Signed Word to quad-precision via: xscvsdqp xscvudqp Differential Revision: https://reviews.llvm.org/D45389 llvm-svn: 330273
* [NFC] Move verificaiton check for f128 conversion into LowerINT_TO_FP()Lei Huang2018-04-161-24/+14
| | | | | | | Move veriication check for legal conversions to f128 into LowerINT_TO_FP() and fix some indentations to match other sections of the code for readability. llvm-svn: 330138
* [Power9] Add the TLS store instructions to the Power 9 modelStefan Pintilie2018-04-132-2/+2
| | | | | | | | | | | The Power 9 scheduler model should now include the TLS instructions. We can now, once again, mark the model as complete. From now on, if instructions are added to Power 9 but are not added to the model the build should produce an error. Hopefully that will alert the developer who is adding new instructions that they should also be added to the scheulder model. llvm-svn: 330060
* [Power9]Legalize and emit code for converting (Un)Signed DWord to Quad-PrecisionLei Huang2018-04-122-1/+22
| | | | | | | | | | | Legalize and emit code for: * xscvsdqp * xscvudqp Differential Revision: https://reviews.llvm.org/D45230 llvm-svn: 329931
* [PowerPC] Fix condition for 64-bit rotate when replacing r+r instr with r+iNemanja Ivanovic2018-04-111-1/+2
| | | | | | | | | | This patch fixes https://bugs.llvm.org/show_bug.cgi?id=37039 The condition only covers one of the two 64-bit rotate instructions. This just adds the second (RLDICLo). Patch by Josh Stone. llvm-svn: 329852
* [PowerPC] Change std::sort to llvm::sort in response to r327219Mandeep Singh Grang2018-04-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | Summary: r327219 added wrappers to std::sort which randomly shuffle the container before sorting. This will help in uncovering non-determinism caused due to undefined sorting order of objects having the same key. To make use of that infrastructure we need to invoke llvm::sort instead of std::sort. Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort. Refer the comments section in D44363 for a list of all the required patches. Reviewers: hfinkel, RKSimon Reviewed By: RKSimon Subscribers: nemanjai, kbarton, llvm-commits Differential Revision: https://reviews.llvm.org/D44870 llvm-svn: 329535
* [PowerPC] allow D-form VSX load/store when accessing FrameIndex without offset Hiroshi Inoue2018-04-061-8/+16
| | | | | | | | | | VSX D-form load/store instructions of POWER9 require the offset be a multiple of 16 and a helper`isOffsetMultipleOf` is used to check this. So far, the helper handles FrameIndex + offset case, but not handling FrameIndex without offset case. Due to this, we are missing opportunities to exploit D-form instructions when accessing an object or array allocated on stack. For example, x-form store (stxvx) is used for int a[4] = {0}; instead of d-form store (stxv). For larger arrays, D-form instruction is not used when accessing the first 16-byte. Using D-form instructions reduces register pressure as well as instructions. Differential Revision: https://reviews.llvm.org/D45079 llvm-svn: 329377
* [PowerPC] fix assertion failure due to missing instruction in ↵Hiroshi Inoue2018-04-051-3/+3
| | | | | | | | P9InstrResources.td This patch adds L(W|H|B)ZXTLS_32 instructions introduced by https://reviews.llvm.org/rL327635 in P9InstrResources.td. llvm-svn: 329299
* [SchedModel] Complete models shouldn't match against itineraries when they ↵Simon Pilgrim2018-04-051-1/+1
| | | | | | | | | | | | don't use them (PR35639) For schedule models that don't use itineraries, checkCompleteness still checks that an instruction has a matching itinerary instead of skipping and going straight to matching the InstRWs. That doesn't seem to match what happens in TargetSchedule.cpp This patch causes problems for a number of models that had been incorrectly flagged as complete. Differential Revision: https://reviews.llvm.org/D43235 llvm-svn: 329280
* [Power9]Legalize and emit code for quad-precision fma instructionsLei Huang2018-04-042-8/+39
| | | | | | | | | | | | | Legalize and emit code for the following quad-precision fma: * xsmaddqp * xsnmaddqp * xsmsubqp * xsnmsubqp Differential Revision: https://reviews.llvm.org/D44843 llvm-svn: 329206
* Sort targetgen calls in lib/Target/*/CMakeLists.Nico Weber2018-04-041-5/+6
| | | | | | | | | | | Makes it easier to see mistakes such as the one fixed in r329178 and makes the different target CMakeLists more consistent. Also remove some stale-looking comments from the Nios2 target cmakefile. No intended behavior change. llvm-svn: 329181
* [PowerPC] reorder entries in P9InstrResources.td in alphabetical order; NFCHiroshi Inoue2018-04-031-1/+1
| | | | | | Reorder entries added in my previous commit (rL328969) to keep alphabetical order. llvm-svn: 329064
* [PowerPC] fix assertion failure due to missing instruction in ↵Hiroshi Inoue2018-04-021-8/+4
| | | | | | | | P9InstrResources.td This patch adds L(D|W|H|B)XTLS instructions introduced by https://reviews.llvm.org/rL327635 in P9InstrResources.td. llvm-svn: 328969
* [IR][CodeGen] Remove dependency on EVT from IR/Function.cpp. Move EVT to ↵Craig Topper2018-03-293-3/+3
| | | | | | | | | | | | CodeGen layer. Currently EVT is in the IR layer only because of Function.cpp needing a very small piece of the functionality of EVT::getEVTString(). The rest of EVT is used in codegen making CodeGen a better place for it. The previous code converted a Type* to EVT and then called getEVTString. This was only expected to handle the primitive types from Type*. Since there only a few primitive types, we can just print them as strings directly. Differential Revision: https://reviews.llvm.org/D45017 llvm-svn: 328806
* Plumb useAA through TargetTransformInfo to remove Transforms->CodeGen header ↵David Blaikie2018-03-281-1/+1
| | | | | | | | dependency Thanks to echristo for the pointers on direction. llvm-svn: 328737
* Transforms: Introduce Transforms/Utils.h rather than spreading the ↵David Blaikie2018-03-282-0/+2
| | | | | | | | | declarations amongst Scalar.h and IPO.h Fixes layering - Transforms/Utils shouldn't depend on including a Scalar or IPO header, because Scalar and IPO depend on Utils. llvm-svn: 328717
* Initialize variable added in r328617.Sterling Augustine2018-03-271-0/+1
| | | | llvm-svn: 328667
OpenPOWER on IntegriCloud