summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/PowerPC
Commit message (Collapse)AuthorAgeFilesLines
...
* [PowerPC] fix incorrect vectorization of abs() on POWER9Hiroshi Inoue2018-04-211-9/+17
| | | | | | | | | | | | | | | | | | | | Vectorized loops with abs() returns incorrect results on POWER9. This patch fixes it. For example the following code returns negative result if input values are negative though it sums up the absolute value of the inputs. int vpx_satd_c(const int16_t *coeff, int length) { int satd = 0; for (int i = 0; i < length; ++i) satd += abs(coeff[i]); return satd; } This problem causes test failures for libvpx. For vector absolute and vector absolute difference on POWER9, LLVM generates VABSDUW (Vector Absolute Difference Unsigned Word) instruction or variants. Since these instructions are for unsigned integers, we need adjustment for signed integers. For abs(sub(a, b)), we generate VABSDUW(a+0x80000000, b+0x80000000). Otherwise, abs(sub(-1, 0)) returns 0xFFFFFFFF(=-1) instead of 1. For abs(a), we generate VABSDUW(a+0x80000000, 0x80000000). Differential Revision: https://reviews.llvm.org/D45522 llvm-svn: 330497
* [DAGCombine] (float)((int) f) --> ftrunc (PR36617)Sanjay Patel2018-04-204-48/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This was originally committed at rL328921 and reverted at rL329920 to investigate failures in Chrome. This time I've added to the ReleaseNotes to warn users of the potential of exposing UB and let me repeat that here for more exposure: Optimization of floating-point casts is improved. This may cause surprising results for code that is relying on undefined behavior. Code sanitizers can be used to detect affected patterns such as this: int main() { float x = 4294967296.0f; x = (float)((int)x); printf("junk in the ftrunc: %f\n", x); return 0; } $ clang -O1 ftrunc.c -fsanitize=undefined ; ./a.out ftrunc.c:5:15: runtime error: 4.29497e+09 is outside the range of representable values of type 'int' junk in the ftrunc: 0.000000 Original commit message: fptosi / fptoui round towards zero, and that's the same behavior as ISD::FTRUNC, so replace a pair of casts with the equivalent node. We don't have to account for special cases (NaN, INF) because out-of-range casts are undefined. Differential Revision: https://reviews.llvm.org/D44909 llvm-svn: 330437
* [NFC] test case clean upLei Huang2018-04-181-54/+14
| | | | | | | 1. remove redundant tests 2. update XForm_tests to generated expected code gen llvm-svn: 330290
* [Power9]Legalize and emit code for converting Unsigned HWord/Char to ↵Lei Huang2018-04-181-0/+141
| | | | | | | | | | | | | | | | Quad-Precision Legalize and emit code for converting unsigned HWord/Char to QP: xscvsdqp xscvudqp Only covering patterns for unsigned forms cause we don't have part-word sign-extending integer loads into VSX registers. Differential Revision: https://reviews.llvm.org/D45494 llvm-svn: 330278
* [Power9]Legalize and emit code for converting (Un)Signed Word to Quad-PrecisionLei Huang2018-04-181-1/+160
| | | | | | | | | | | Legalize and emit code for converting (Un)Signed Word to quad-precision via: xscvsdqp xscvudqp Differential Revision: https://reviews.llvm.org/D45389 llvm-svn: 330273
* [PowerPC] Mark the BDNZ intrinsic as NoDuplicateNemanja Ivanovic2018-04-171-0/+75
| | | | | | | | | | | | | Duplicating this intrinsic is not generally valid because it has the side-effect of decrementing the CTR. Any passes that duplicate it would need to be taught to keep the regions formed completely disjoint. This patch should be NFC for typical uses as CTRLoops runs after the remaining loop passes. It only affects situations where the loop passes are scheduled on the IR after the codegen passes (as is the case with some JIT pipelines). Fixes https://bugs.llvm.org/show_bug.cgi?id=37050 llvm-svn: 330186
* [DAGCombiner, PowerPC] allow X - (fpext(-Y) --> X + fpext(Y) with multiple usesSanjay Patel2018-04-151-4/+4
| | | | | | | | | | | | | This is a transform that I limited in instcombine in rL329821 because it was creating more instructions in IR when the cast has multiple uses. But if the cast is free, then we can do the transform regardless of other uses because it improves the potential throughput of the calculation by removing a dependency on the fneg. Differential Revision: https://reviews.llvm.org/D45598 llvm-svn: 330098
* [PowerPC] add fsub-fneg test; NFCSanjay Patel2018-04-121-0/+21
| | | | | | | | | This is a test for a transform that was suggested in the post-commit mailing list thread for rL329821. The target in question is not in trunk, so PPC gets to stand in for it because it's the only in-tree target that sets 'isFPExtFree()' to 'true'. llvm-svn: 329963
* [Power9]Legalize and emit code for converting (Un)Signed DWord to Quad-PrecisionLei Huang2018-04-121-0/+140
| | | | | | | | | | | Legalize and emit code for: * xscvsdqp * xscvudqp Differential Revision: https://reviews.llvm.org/D45230 llvm-svn: 329931
* revert r328921 - [DAGCombine] (float)((int) f) --> ftrunc (PR36617)Sanjay Patel2018-04-124-13/+48
| | | | | | | This change is exposing UB in source code - as was warned/predicted. :) See D44909 for discussion. Reverting while we figure out how to fix things. llvm-svn: 329920
* [PowerPC] Fix condition for 64-bit rotate when replacing r+r instr with r+iNemanja Ivanovic2018-04-111-0/+64
| | | | | | | | | | This patch fixes https://bugs.llvm.org/show_bug.cgi?id=37039 The condition only covers one of the two 64-bit rotate instructions. This just adds the second (RLDICLo). Patch by Josh Stone. llvm-svn: 329852
* [DAGCombine] Improve ReduceLoad for SRLSam Parker2018-04-091-0/+18
| | | | | | | | | | | | | Recommitting r329283, third time lucky... If the SRL node is only used by an AND, we may be able to set the ExtVT to the width of the mask, making the AND redundant. To support this, another check has been added in isLegalNarrowLoad which queries whether the load is valid. Differential Revision: https://reviews.llvm.org/D41350 llvm-svn: 329551
* Reapply ARM: Do not spill CSR to stack on entry to noreturn functionsTim Northover2018-04-071-1/+1
| | | | | | | | | | | | | | | | | | Should fix UBSan bot by also checking there's no "uwtable" attribute before skipping. Otherwise the unwind table will be useless since its moves expect CSRs to actually be preserved. A noreturn nounwind function can be expected to never return in any way, and by never returning it will also never have to restore any callee-saved registers for its caller. This makes it possible to skip spills of those registers during function entry, saving some stack space and time in the process. This is rather useful for embedded targets with limited stack space. Should fix PR9970. Patch mostly by myeisha (pmb). llvm-svn: 329494
* Revert "ARM: Do not spill CSR to stack on entry to noreturn functions"Vitaly Buka2018-04-071-1/+1
| | | | | | | | Breaks ubsan test TestCases/Misc/missing_return.cpp on ARM This reverts commit r329287 llvm-svn: 329486
* [PowerPC] allow D-form VSX load/store when accessing FrameIndex without offset Hiroshi Inoue2018-04-061-0/+38
| | | | | | | | | | VSX D-form load/store instructions of POWER9 require the offset be a multiple of 16 and a helper`isOffsetMultipleOf` is used to check this. So far, the helper handles FrameIndex + offset case, but not handling FrameIndex without offset case. Due to this, we are missing opportunities to exploit D-form instructions when accessing an object or array allocated on stack. For example, x-form store (stxvx) is used for int a[4] = {0}; instead of d-form store (stxv). For larger arrays, D-form instruction is not used when accessing the first 16-byte. Using D-form instructions reduces register pressure as well as instructions. Differential Revision: https://reviews.llvm.org/D45079 llvm-svn: 329377
* ARM: Do not spill CSR to stack on entry to noreturn functionsTim Northover2018-04-051-1/+1
| | | | | | | | | | | | | | A noreturn nounwind function can be expected to never return in any way, and by never returning it will also never have to restore any callee-saved registers for its caller. This makes it possible to skip spills of those registers during function entry, saving some stack space and time in the process. This is rather useful for embedded targets with limited stack space. Should fix PR9970. Patch by myeisha (pmb). llvm-svn: 329287
* [Power9]Legalize and emit code for quad-precision fma instructionsLei Huang2018-04-041-0/+203
| | | | | | | | | | | | | Legalize and emit code for the following quad-precision fma: * xsmaddqp * xsnmaddqp * xsmsubqp * xsnmsubqp Differential Revision: https://reviews.llvm.org/D44843 llvm-svn: 329206
* [DAGCombine] (float)((int) f) --> ftrunc (PR36617)Sanjay Patel2018-03-314-48/+13
| | | | | | | | | | fptosi / fptoui round towards zero, and that's the same behavior as ISD::FTRUNC, so replace a pair of casts with the equivalent node. We don't have to account for special cases (NaN, INF) because out-of-range casts are undefined. Differential Revision: https://reviews.llvm.org/D44909 llvm-svn: 328921
* [PowerPC] add ftrunc vector tests; NFCSanjay Patel2018-03-281-0/+47
| | | | | | Baseline tests for vectors as suggested in D44909. llvm-svn: 328682
* [Power9] Fix the resource list for the COPY instruction.Stefan Pintilie2018-03-272-156/+112
| | | | | | | The COPY instruction was listed as a 4 cycle instruction. It is now listed correctly as a 2 cycle ALU instruction. llvm-svn: 328647
* Use .set instead of = when printing assignment in assembly outputKrzysztof Parzyszek2018-03-271-3/+3
| | | | | | | | | On Hexagon "x = y" is a syntax used in most instructions, and is not treated as a directive. Differential Revision: https://reviews.llvm.org/D44256 llvm-svn: 328635
* [PowerPC] Secure PLT supportStrahinja Petrovic2018-03-271-0/+4
| | | | | | | | This patch supports secure PLT mode for PowerPC 32 architecture. Differential Revision: https://reviews.llvm.org/D42112 llvm-svn: 328617
* [Power9]Legalize and emit code for quad-precision convert from double-precisionLei Huang2018-03-261-1/+29
| | | | | | | | | Legalize and emit code for quad-precision floating point operation xscvdpqp and add option to guard the quad precision operation support. Differential Revision: https://reviews.llvm.org/D44746 llvm-svn: 328558
* [PowerPC] Infrastructure work. Implement getting the opcode for a spill in ↵Stefan Pintilie2018-03-261-8/+8
| | | | | | | | | | | one place. A new function getOpcodeForSpill should now be the only place to get the opcode for a given spilled register. Differential Revision: https://reviews.llvm.org/D43086 llvm-svn: 328556
* Re-commit: [MachineLICM] Add functions to MachineLICM to hoist invariant storesZaara Syeda2018-03-233-4/+129
| | | | | | | | | | | | | | | | This patch adds functions to allow MachineLICM to hoist invariant stores. Currently, MachineLICM does not hoist any store instructions, however when storing the same value to a constant spot on the stack, the store instruction should be considered invariant and be hoisted. The function isInvariantStore iterates each operand of the store instruction and checks that each register operand satisfies isCallerPreservedPhysReg. The store may be fed by a copy, which is hoisted by isCopyFeedingInvariantStore. This patch also adds the PowerPC changes needed to consider the stack register as caller preserved. Differential Revision: https://reviews.llvm.org/D40196 llvm-svn: 328326
* [POWER9][NFC] update testcase check statementsLei Huang2018-03-211-39/+39
| | | | llvm-svn: 328147
* [PowerPC][LegalizeFloatTypes] Move the PPC hacks for (i32 ↵Craig Topper2018-03-201-1/+318
| | | | | | | | | | | | | | fp_to_sint/fp_to_uint (ppcf128 X)) out of LegalizeFloatTypes and into PPC specific code I'm not entirely sure these hacks are still needed. If you remove the hacks completely, the name of the library call that gets generated doesn't match the grep the test previously had. So the test wasn't really checking anything. If the hack is still needed it belongs in PPC specific code. I believe the FP_TO_SINT code here is the only place in the tree where a FP_ROUND_INREG node is created today. And I don't think its even being used correctly because the legalization returned a BUILD_PAIR with the same value twice. That doesn't seem right to me. By moving the code entirely to PPC we can avoid creating the FP_ROUND_INREG at all. I replaced the grep in the existing test with full checks generated by hacking update_llc_test_check.py to support ppc32 just long enough to generate it. Differential Revision: https://reviews.llvm.org/D44061 llvm-svn: 328017
* [Power9]Legalize and emit code for quad-precision copySign/abs/nabs/neg/sqrtLei Huang2018-03-191-0/+76
| | | | | | | | | | | | | | Legalize and emit code for quad-precision floating point operations: * xscpsgnqp * xsabsqp * xsnabsqp * xsnegqp * xssqrtqp Differential Revision: https://reviews.llvm.org/D44530 llvm-svn: 327889
* [PowerPC][Power9]Legalize and emit code for quad-precision add/div/mul/subLei Huang2018-03-191-0/+73
| | | | | | | | | | | | | Legalize and emit code for quad-precision floating point operations: * xsaddqp * xssubqp * xsdivqp * xsmulqp Differential Revision: https://reviews.llvm.org/D44506 llvm-svn: 327878
* [PowerPC] Make AddrSpaceCast noopNemanja Ivanovic2018-03-191-0/+22
| | | | | | | | | | | PowerPC targets do not use address spaces. As a result, we can get selection failures with address space casts. This patch makes those casts noops. Patch by Valentin Churavy. Differential revision: https://reviews.llvm.org/D43781 llvm-svn: 327877
* Revert [MachineLICM] This reverts commit rL327856Zaara Syeda2018-03-193-129/+4
| | | | | | Failing build bots. Revert the commit now. llvm-svn: 327864
* [MachineLICM] Add functions to MachineLICM to hoist invariant storesZaara Syeda2018-03-193-4/+129
| | | | | | | | | | | | | | | | This patch adds functions to allow MachineLICM to hoist invariant stores. Currently, MachineLICM does not hoist any store instructions, however when storing the same value to a constant spot on the stack, the store instruction should be considered invariant and be hoisted. The function isInvariantStore iterates each operand of the store instruction and checks that each register operand satisfies isCallerPreservedPhysReg. The store may be fed by a copy, which is hoisted by isCopyFeedingInvariantStore. This patch also adds the PowerPC changes needed to consider the stack register as caller preserved. Differential Revision: https://reviews.llvm.org/D40196 llvm-svn: 327856
* [MergeICmps] Re-land 324317 "Enable the MergeICmps Pass by default."Clement Courbet2018-03-191-16/+6
| | | | | | Now that PR36557 is fixed. llvm-svn: 327840
* [PPC] Avoid non-simple MVT in STBRX optimizationGuozhi Wei2018-03-151-0/+18
| | | | | | | | | | PR35402 triggered this case. It bswap and stores a 48bit value, current STBRX optimization transforms it into STBRX. Unfortunately 48bit is not a simple MVT, there is no PPC instruction to support it, and it can't be automatically expanded by llvm, so caused a crash. This patch detects the non-simple MVT and returns early. Differential Revision: https://reviews.llvm.org/D44500 llvm-svn: 327651
* [PowerPC] Optimize TLS initial-exec sequence to use X-Form loads/storesZaara Syeda2018-03-151-0/+169
| | | | | | | | | This patch adds new load/store instructions for integer scalar types which can be used for X-Form when fed by add with an @tls relocation. Differential Revision: https://reviews.llvm.org/D43315 llvm-svn: 327635
* [CodeGen] Use MIR syntax for MachineMemOperand printingFrancis Visoiu Mistrih2018-03-143-8/+8
| | | | | | | | | | Get rid of the "; mem:" suffix and use the one we use in MIR: ":: (load 2)". rdar://38163529 Differential Revision: https://reviews.llvm.org/D42377 llvm-svn: 327580
* [PowerPC] fix tests to be independent of FP undefSanjay Patel2018-03-102-5/+5
| | | | llvm-svn: 327210
* Revert "[PowerPC] Move test to correct location."Stefan Pintilie2018-03-091-57/+0
| | | | | | Revert part of the LSR tune commit. llvm-svn: 327142
* [PowerPC] Move test to correct location.Stefan Pintilie2018-03-071-0/+57
| | | | | | | Test was added in r326906 to an incorrect location. Moving the test to PPC CodeGen directory as the test is PPC specific. llvm-svn: 326923
* [PowerPC] Do not emit record-form rotates when record-form andi sufficesNemanja Ivanovic2018-03-052-1/+43
| | | | | | | | | | | | | | | | Up until Power9, the performance profile for rlwinm., rldicl. and andi. looked more or less equivalent. However with Power9, the rotates are still 2-way cracked whereas the and-immediate is not. This patch just ensures that we don't emit record-form rotates when an andi. is adequate. As first pointed out by Carrot in https://bugs.llvm.org/show_bug.cgi?id=30833 (this patch is a fix for that PR). Differential Revision: https://reviews.llvm.org/D43977 llvm-svn: 326736
* [MergeICmps] Revert 324317 "Enable the MergeICmps Pass by default."Clement Courbet2018-03-021-6/+16
| | | | | | While working on PR36557. llvm-svn: 326575
* [DAGCombiner] When combining zero_extend of a truncate, only mask before ↵Craig Topper2018-03-011-4/+4
| | | | | | | | | | extending for vectors. Masking first, prevents the extend from being combine with loads. Its also interfering with some vXi1 extraction code. Differential Revision: https://reviews.llvm.org/D42679 llvm-svn: 326500
* [TLS] use emulated TLS if the target supports only this modeChih-Hung Hsieh2018-02-281-0/+7
| | | | | | | | | | | | | | | Emulated TLS is enabled by llc flag -emulated-tls, which is passed by clang driver. When llc is called explicitly or from other drivers like LTO, missing -emulated-tls flag would generate wrong TLS code for targets that supports only this mode. Now use useEmulatedTLS() instead of Options.EmulatedTLS to decide whether emulated TLS code should be generated. Unit tests are modified to run with and without the -emulated-tls flag. Differential Revision: https://reviews.llvm.org/D42999 llvm-svn: 326341
* Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding"Geoff Berry2018-02-276-7/+8
| | | | | | | | Re-enable commit r323991 now that r325931 has been committed to make MachineOperand::isRenamable() check more conservative w.r.t. code changes and opt-in on a per-target basis. llvm-svn: 326208
* [CodeGen] Don't omit any redundant information in -debug outputFrancis Visoiu Mistrih2018-02-261-1/+1
| | | | | | | | | | | | | | | | | | | | | In r322867, we introduced IsStandalone when printing MIR in -debug output. The default behaviour for that was: 1) If any of MBB, MI, or MO are -debug-printed separately, don't omit any redundant information. 2) When -debug-printing a MF entirely, don't print any redundant information. 3) When printing MIR, don't print any redundant information. I'd like to change 2) to: 2) When -debug-printing a MF entirely, don't omit any redundant information. Differential Revision: https://reviews.llvm.org/D43337 llvm-svn: 326094
* [PowerPC] Disable shrink-wrapping when getting PC address through the LRNemanja Ivanovic2018-02-231-0/+70
| | | | | | | | | | | | | | The instruction sequence used to get the address of the PC into a GPR requires that we clobber the link register. Doing so without having first saved it in the prologue leaves the function unable to return. Currently, this sequence is emitted into the entry block. To ensure the prologue is inserted before this sequence, disable shrink-wrapping. This fixes PR33547. Differential Revision: https://reviews.llvm.org/D43677 llvm-svn: 325972
* [PowerPC] Do not produce invalid CTR loop with an FRemNemanja Ivanovic2018-02-221-0/+46
| | | | | | | | | | | An FRem instruction inside a loop should prevent the loop from being converted into a CTR loop since this is not an operation that is legal on any PPC subtarget. This will always be a call to a library function which means the loop will be invalid if this instruction is in the body. Fixes PR36292. llvm-svn: 325739
* [PowerPC] Reduce stack frame for fastcc functions by only allocating ↵Lei Huang2018-02-201-0/+141
| | | | | | | | | | | | parameter save area when needed Current implementation always allocates the parameter save area conservatively for fastcc functions. There is no reason to allocate the parameter save area if all the parameters can be passed via registers. Differential Revision: https://reviews.llvm.org/D42602 llvm-svn: 325581
* Revert "[MachineCopyPropagation] Extend pass to do COPY source forwarding"Quentin Colombet2018-02-176-8/+7
| | | | | | | | | | | | | | | | | This reverts commit r323991. This commit breaks target that don't model all the register constraints in TableGen. So far the workaround was to set the hasExtraXXXRegAllocReq, but it proves that it doesn't cover all the cases. For instance, when mutating an instruction (like in the lowering of COPYs) the isRenamable flag is not properly updated. The same problem will happen when attaching machine operand from one instruction to another. Geoff Berry is working on a fix in https://reviews.llvm.org/D43042. llvm-svn: 325421
* Run these tests, the errors were old and not valid anymore.Eric Christopher2018-02-161-11/+8
| | | | llvm-svn: 325407
OpenPOWER on IntegriCloud