summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [NEON] Fix combining of vldx_dup intrinsics with updating of base addressesIvan A. Kosarev2018-07-051-0/+6
| | | | | | | | | | | | | Resolves: Unsupported ARM Neon intrinsics in Target-specific DAG combine function for VLDDUP https://bugs.llvm.org/show_bug.cgi?id=38031 Related diff: D48439 Differential Revision: https://reviews.llvm.org/D48920 llvm-svn: 336325
* Reverting r336322 for now, as it causes an assert failureSander de Smalen2018-07-052-16/+1
| | | | | | | in TableGen, for which there is already a patch in Phabricator (D48937) that needs to be committed first. llvm-svn: 336324
* [AArch64][SVE] Asm: Support for predicated FP rounding instructions.Sander de Smalen2018-07-052-1/+16
| | | | | | | | | | | | | | | | | | This patch also adds instructions for predicated FP square-root and reciprocal exponent. The added instructions are: - FRINTI Round to integral value (current FPCR rounding mode) - FRINTX Round to integral value (current FPCR rounding mode, signalling inexact) - FRINTA Round to integral value (to nearest, with ties away from zero) - FRINTN Round to integral value (to nearest, with ties to even) - FRINTZ Round to integral value (toward zero) - FRINTM Round to integral value (toward minus Infinity) - FRINTP Round to integral value (toward plus Infinity) - FSQRT Floating-point square root - FRECPX Floating-point reciprocal exponent llvm-svn: 336322
* [ARM] ParallelDSP: only support i16 loads for nowSjoerd Meijer2018-07-051-28/+25
| | | | | | | | | We were miscompiling i8 loads, so reject them as unsupported narrow operations for now. Differential Revision: https://reviews.llvm.org/D48944 llvm-svn: 336319
* [AArch64][SVE] Asm: Support for signed/unsigned MIN/MAX/ABDSander de Smalen2018-07-054-0/+53
| | | | | | | | | | | | | | | | | | | | This patch implements the following varieties: - Unpredicated signed max, e.g. smax z0.h, z1.h, #-128 - Unpredicated signed min, e.g. smin z0.h, z1.h, #-128 - Unpredicated unsigned max, e.g. umax z0.h, z1.h, #255 - Unpredicated unsigned min, e.g. umin z0.h, z1.h, #255 - Predicated signed max, e.g. smax z0.h, p0/m, z0.h, z1.h - Predicated signed min, e.g. smin z0.h, p0/m, z0.h, z1.h - Predicated signed abd, e.g. sabd z0.h, p0/m, z0.h, z1.h - Predicated unsigned max, e.g. umax z0.h, p0/m, z0.h, z1.h - Predicated unsigned min, e.g. umin z0.h, p0/m, z0.h, z1.h - Predicated unsigned abd, e.g. uabd z0.h, p0/m, z0.h, z1.h llvm-svn: 336317
* [Power9] Optimize codgen for conversions of int to float128Lei Huang2018-07-051-0/+17
| | | | | | | | | | | | Optimize code sequences for integer conversion to fp128 when the integer is a result of: * float->int * float->long * double->int * double->long Differential Revision: https://reviews.llvm.org/D48429 llvm-svn: 336316
* [X86] Remove X86 specific scalar FMA intrinsics and upgrade to tart ↵Craig Topper2018-07-054-58/+33
| | | | | | independent FMA and extractelement/insertelement. llvm-svn: 336315
* [demangler] Avoid alignment warningSerge Pavlov2018-07-051-1/+1
| | | | | | | | | | | | | | The alignment specified by a constant for the field `BumpPointerAllocator::InitialBuffer` exceeded the alignment guaranteed by `malloc` and `new` on Windows. This change set the alignment value to that of `long double`, which is defined by the used platform. It fixes https://bugs.llvm.org/show_bug.cgi?id=37944. Differential Revision: https://reviews.llvm.org/D48889 llvm-svn: 336311
* [Power9] Ensure float128 in non-homogenous aggregates are passed via VSX regLei Huang2018-07-054-0/+43
| | | | | | | | | | | | | Non-homogenous aggregates are passed in consecutive GPRs, in GPRs and in memory, or in memory. This patch ensures that float128 members of non-homogenous aggregates are passed via VSX registers. This is done via custom lowering a bitcast of a build_pari(i64,i64) to float128 to a new PPCISD node, BUILD_FP128. Differential Revision: https://reviews.llvm.org/D48308 llvm-svn: 336310
* [Power9]Legalize and emit code for quad-precision convert from single-precisionLei Huang2018-07-052-2/+10
| | | | | | | | | Legalize and emit code for quad-precision floating point operation conversion of single-precision value to quad-precision. Differential Revision: https://reviews.llvm.org/D47569 llvm-svn: 336307
* [Power9] Implement float128 parameter passing and return valuesLei Huang2018-07-052-5/+25
| | | | | | | | | | This patch enable parameter passing and return by value for float128 types. Passing aggregate/union which contain float128 members will be submitted in subsequent patches. Differential Revision: https://reviews.llvm.org/D47552 llvm-svn: 336306
* [X86] Remove some isel patterns for X86ISD::SELECTS that specifically looked ↵Craig Topper2018-07-051-18/+0
| | | | | | | | for the v1i1 mask to have come from a scalar_to_vector from GR8. We have patterns for SELECTS that top at v1i1 and we have a pattern for (v1i1 (scalar_to_vector GR8)). The patterns being removed here do the same thing as the two other patterns combined so there is no need for them. llvm-svn: 336305
* [X86] Add support for combining FMSUB/FNMADD/FNMSUB ISD nodes with an fneg ↵Craig Topper2018-07-051-78/+119
| | | | | | | | | | input. Previously we could only negate the FMADD opcodes. This used to be mostly ok when we lowered FMA intrinsics during lowering. But with the move to llvm.fma from target specific intrinsics, we can combine (fneg (fma)) to (fmsub) earlier. So if we start with (fneg (fma (fneg))) we would get stuck at (fmsub (fneg)). This patch fixes that so we can also combine things like (fmsub (fneg)). llvm-svn: 336304
* [X86] Remove some of the packed FMA3 intrinsics since we no longer use them ↵Craig Topper2018-07-052-44/+32
| | | | | | | | | | in clang. There's a regression in here due to inability to combine fneg inputs of X86ISD::FMSUB/FNMSUB/FNMADD nodes. More removals to come, but I wanted to stop and fix the regression that showed up in this first. llvm-svn: 336303
* [Power9]Legalize and emit code for round & convert quad-precision valuesLei Huang2018-07-043-3/+26
| | | | | | | | | Legalize and emit code for round & convert float128 to double precision and single precision. Differential Revision: https://reviews.llvm.org/D46997 llvm-svn: 336299
* [mips] Warn when crc, ginv, virt flags are used with too old revisionVladimir Stefanovic2018-07-042-11/+33
| | | | | | | | | CRC and GINV ASE require revision 6, Virtualization requires revision 5. Print a warning when revision is older than required. Differential Revision: https://reviews.llvm.org/D48843 llvm-svn: 336296
* [PowerPC] Replace the Post RA List Scheduler with the Machine SchedulerStefan Pintilie2018-07-041-1/+7
| | | | | | | | | | | We want to run the Machine Scheduler instead of the List Scheduler after RA. Checked with a performance run on a Power 9 machine with SPEC 2006 and while some benchmarks improved and others degraded the geomean was slightly improved with the Machine Scheduler. Differential Revision: https://reviews.llvm.org/D45265 llvm-svn: 336295
* [InstCombine] allow narrowing of min/max/absSanjay Patel2018-07-041-14/+14
| | | | | | | | | | | | | | | We have bailout hacks based on min/max in various places in instcombine that shouldn't be necessary. The affected test was added for: D48930 ...which is a consequence of the improvement in: D48584 (https://reviews.llvm.org/rL336172) I'm assuming the visitTrunc bailout in this patch was added specifically to avoid a change from SimplifyDemandedBits, so I'm just moving that below the EvaluateInDifferentType optimization. A narrow min/max is still a min/max. llvm-svn: 336293
* Fix some irregular whitespace/indentation. NFCI.Simon Pilgrim2018-07-041-18/+14
| | | | llvm-svn: 336291
* [ARM] [Assembler] Support negative immediates: cover few missing casesVolodymyr Turanskyy2018-07-042-2/+25
| | | | | | | | | | | | | | | | | | | | | | | | Support for negative immediates was implemented in https://reviews.llvm.org/rL298380, however few instruction options were missing. This change adds negative immediates support and respective tests for the following: ADD ADDS ADDS.W AND.W ANDS BIC.W BICS BICS.W SUB SUBS SUBS.W Differential Revision: https://reviews.llvm.org/D48649 llvm-svn: 336286
* [MachineOutliner] Fix typo in getOutliningCandidateInfo function nameYvan Roux2018-07-045-5/+5
| | | | | | | | getOutlininingCandidateInfo -> getOutliningCandidateInfo Differential Revision: https://reviews.llvm.org/D48867 llvm-svn: 336285
* [llvm-objdump] Add --file-headers (-f) optionPaul Semel2018-07-041-0/+6
| | | | llvm-svn: 336284
* [ThinLTO] Update ThinLTO cache file atimes when on WindowsAndrew Ng2018-07-043-9/+56
| | | | | | | | | | | | | | | | | | ThinLTO cache file access times are used for expiration based pruning and since Vista, file access times are not updated by Windows by default: https://blogs.technet.microsoft.com/filecab/2006/11/07/disabling-last-access-time-in-windows-vista-to-improve-ntfs-performance This means on Windows, cache files are currently being pruned from creation time. This change manually updates cache files that are accessed by ThinLTO, when on Windows. Patch by Owen Reynolds. Differential Revision: https://reviews.llvm.org/D47266 llvm-svn: 336276
* [AArch64][SVE] Asm: Support for reversed subtract (SUBR) instruction. Sander de Smalen2018-07-041-2/+4
| | | | | | | | | | | | | | | | | | This patch adds both a vector and an immediate form, e.g. - Vector form: subr z0.h, p0/m, z0.h, z1.h subtract active elements of z0 from z1, and store the result in z0. - Immediate form: subr z0.h, z0.h, #255 subtract elements of z0, and store the result in z0. llvm-svn: 336274
* [AArch64][SVE] Asm: Support for instructions to set/read FFR.Sander de Smalen2018-07-042-0/+61
| | | | | | | | | | | | | | | | | | Includes instructions to read the First-Faulting Register (FFR): - RDFFR (unpredicated) rdffr p0.b - RDFFR (predicated) rdffr p0.b, p0/z - RDFFRS (predicated, sets condition flags) rdffr p0.b, p0/z Includes instructions to set/write the FFR: - SETFFR (no arguments, sets the FFR to all true) setffr - WRFFR (unpredicated) wrffr p0.b llvm-svn: 336267
* [AArch64][SVE] Asm: Support for FP conversion instructions.Sander de Smalen2018-07-042-0/+61
| | | | | | | | | | | | | | | | | | | The variants added are: - fcvt (FP convert precision) - scvtf (signed int -> FP) - ucvtf (unsigned int -> FP) - fcvtzs (FP -> signed int (round to zero)) - fcvtzu (FP -> unsigned int (round to zero)) For example: fcvt z0.h, p0/m, z0.s (single- to half-precision FP) scvtf z0.h, p0/m, z0.s (32-bit int to half-precision FP) ucvtf z0.h, p0/m, z0.s (32-bit unsigned int to half-precision FP) fcvtzs z0.s, p0/m, z0.h (half-precision FP to 32-bit int) fcvtzu z0.s, p0/m, z0.h (half-precision FP to 32-bit unsigned int) llvm-svn: 336265
* [DebugInfo][LoopVectorize] Preserve DL in generated phi instructionAnastasis Grammenos2018-07-041-0/+2
| | | | | | | | | When creating `phi` instructions to resume at the scalar part of the loop, copy the DebugLoc from the original phi over to the new one. Differential Revision: https://reviews.llvm.org/D48769 llvm-svn: 336256
* [DebugInfo][InstCombine] Preserve DI after combining zextAnastasis Grammenos2018-07-041-0/+11
| | | | | | | | | | | When zext is EvaluatedInDifferentType, InstCombine drops the dbg.value intrinsic. This patch tries to preserve said DI, by inserting the zext's old DI in the resulting instruction. (Only for integer type for now) Differential Revision: https://reviews.llvm.org/D48331 llvm-svn: 336254
* [X86][SSE] Blend any v8i16/v4i32 shift with 2 shift unique values (REAPPLIED)Simon Pilgrim2018-07-041-48/+23
| | | | | | | | We were only doing this for basic blends, despite shuffle lowering now being good enough to handle more complex blends. This means that the two v8i16 splat shifts are performed in parallel instead of serially as the general shift case. Reapplied with a fixed (extra null tests) version of rL336113 after reversion in rL336189 - extra test case added at rL336247. llvm-svn: 336250
* [AArch64][SVE] Asm: Support for SVE condition code aliasesSander de Smalen2018-07-041-0/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SVE overloads the AArch64 PSTATE condition flags and introduces a set of condition code aliases for the assembler. The details are described in section 2.2 of the architecture reference manual supplement for SVE. In short: SVE alias => AArch64 name -------------------------- NONE => EQ ANY => NE NLAST => HS LAST => LO FIRST => MI NFRST => PL PMORE => HI PLAST => LS TCONT => GE TSTOP => LT Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D48869 llvm-svn: 336245
* [ImplicitNullChecks] Check for rewrite of register used in 'test' instructionMax Kazantsev2018-07-041-2/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | The following code pattern: mov %rax, %rcx test %rax, %rax %rax = .... je throw_npe mov(%rcx), %r9 mov(%rax), %r10 gets transformed into the following incorrect code after implicit null check pass: mov %rax, %rcx %rax = .... faulting_load_op("movl (%rax), %r10", throw_npe) mov(%rcx), %r9 For implicit null check pass, if the register that is checked for null value (ie, the register used in the 'test' instruction) is written into before the condition jump, we should avoid doing the optimization. Patch by Surya Kumari Jangala! Differential Revision: https://reviews.llvm.org/D48627 Reviewed By: skatkov llvm-svn: 336241
* [lanai] Handle atomic load of i8 like regular load.Jacques Pienaar2018-07-031-0/+4
| | | | | | Loads and stores less than 64-bits are already atomic, this adds support for a special case thereof. This needs to be expanded. llvm-svn: 336236
* [X86][AsmParser] Fix inconsistent declaration parameter name in r336218Fangrui Song2018-07-031-1/+1
| | | | llvm-svn: 336232
* [NVPTX] Expand v2f16 INSERT_VECTOR_ELTBenjamin Kramer2018-07-031-0/+1
| | | | | | Vectorization can create them. llvm-svn: 336227
* [X86] Remove repeated 'the' from multiple comments that have been copy and ↵Craig Topper2018-07-031-7/+7
| | | | | | pasted. NFC llvm-svn: 336226
* [ARM] Fix inconsistent declaration parameter name in r336195Fangrui Song2018-07-031-1/+1
| | | | llvm-svn: 336223
* [AArch64] Make function parameter names in declarations match those of ↵Fangrui Song2018-07-031-8/+8
| | | | | | definitions llvm-svn: 336222
* [X86][AsmParser] Rework the in/out (%dx) hack one more time.Craig Topper2018-07-032-24/+45
| | | | | | | | This patch adds a new token type specifically for (%dx). We will now always create this token when we parse (%dx). After all operands have been parsed, if the mnemonic is in/out we'll morph this token to a regular register token. Otherwise we keep it as the special DX token which won't match any instructions. This removes the need for passing Mnemonic through the parsing functions. It also seems closer to gas where when its used on the wrong instruction it just gets diagnosed as an invalid operand rather than a bad memory address. llvm-svn: 336218
* [X86][AsmParser] Don't consider %eip as a valid register outside of 32-bit mode.Craig Topper2018-07-031-1/+1
| | | | | | | | This might make the error message added in r335668 unneeded, but I'm not sure yet. The check for RIP is technically unnecessary since RIP is in GR64, but that fact is kind of surprising so be explicit. llvm-svn: 336217
* Fix typo in lib/Support/Path.cpp to test commit accessVladimir Stefanovic2018-07-031-1/+1
| | | | llvm-svn: 336216
* [Constants] add identity constants for fadd/fmulSanjay Patel2018-07-031-1/+6
| | | | | | | | | | | | | | | | As the test diffs show, the current users of getBinOpIdentity() are InstCombine and Reassociate. SLP vectorizer is a candidate for using this functionality too (D28907). The InstCombine shuffle improvements are part of the planned enhancements noted in D48830. InstCombine actually has several other uses of getBinOpIdentity() via SimplifyUsingDistributiveLaws(), but we don't call that for any FP ops. Fixing that might be another part of removing the custom reassociation in InstCombine that is only done for fadd+fmul. llvm-svn: 336215
* [AArch64][SVE] Asm: Support for FP Complex ADD/MLA.Sander de Smalen2018-07-033-4/+114
| | | | | | | | | | | | | | | | | | | | | | | | The variants added in this patch are: - Predicated Complex floating point ADD with rotate, e.g. fcadd z0.h, p0/m, z0.h, z1.h, #90 - Predicated Complex floating point MLA with rotate, e.g. fcmla z0.h, p0/m, z1.h, z2.h, #180 - Unpredicated Complex floating point MLA with rotate (indexed operand), e.g. fcmla z0.h, p0/m, z1.h, z2.h[0], #180 Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D48824 llvm-svn: 336210
* [AArch64][GlobalISel] Fix fallbacks introduced in r336120 due to ↵Amara Emerson2018-07-031-2/+5
| | | | | | | | | | unselectable stores. r336120 resulted in falling back to SelectionDAG more often due to the G_STORE MMOs not matching the vreg size. This fixes that by explicitly any-extending the value. llvm-svn: 336209
* Rename lazy initialization functions to reflect behavior (NFC)Teresa Johnson2018-07-031-12/+12
| | | | | | Suggested in review for D48698. llvm-svn: 336207
* [AArch64][SVE] Asm: Support for FMUL (indexed)Sander de Smalen2018-07-035-3/+125
| | | | | | | | | | | | | | | | | | | | | | Unpredicated FP-multiply of SVE vector with a vector-element given by vector[index], for example: fmul z0.s, z1.s, z2.s[0] which performs an unpredicated FP-multiply of all 32-bit elements in 'z1' with the first element from 'z2'. This patch adds restricted register classes for SVE vectors: ZPR_3b (only z0..z7 are allowed) - for indexed vector of 16/32-bit elements. ZPR_4b (only z0..z15 are allowed) - for indexed vector of 64-bit elements. Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D48823 llvm-svn: 336205
* [AArch64][SVE] Asm: Support for predicated unary operations.Sander de Smalen2018-07-032-0/+57
| | | | | | | | | | | | | | | | | | The patch includes support for the following instructions: ABS z0.h, p0/m, z0.h NEG z0.h, p0/m, z0.h (S|U)XTB z0.h, p0/m, z0.h (S|U)XTB z0.s, p0/m, z0.s (S|U)XTB z0.d, p0/m, z0.d (S|U)XTH z0.s, p0/m, z0.s (S|U)XTH z0.d, p0/m, z0.d (S|U)XTW z0.d, p0/m, z0.d llvm-svn: 336204
* [DAGCombiner] visitSDIV - Permit MIN_SIGNED_VALUE in pow2 vector codegenSimon Pilgrim2018-07-031-2/+0
| | | | | | Now that D45806 has landed, we can re-enable support for MIN_SIGNED_VALUE in the sdiv by pow2-constant code llvm-svn: 336198
* [InstCombine] fold shuffle-with-binop and common valueSanjay Patel2018-07-032-0/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | This is the last significant change suggested in PR37806: https://bugs.llvm.org/show_bug.cgi?id=37806#c5 ...though there are several follow-ups noted in the code comments in this patch to complete this transform. It's possible that a binop feeding a select-shuffle has been eliminated by earlier transforms (or the code was just written like this in the 1st place), so we'll fail to match the patterns that have 2 binops from: D48401, D48678, D48662, D48485. In that case, we can try to materialize identity constants for the remaining binop to fill in the "ghost" lanes of the vector (where we just want to pass through the original values of the source operand). I added comments to ConstantExpr::getBinOpIdentity() to show planned follow-ups. For now, we only handle the 5 commutative integer binops (add/mul/and/or/xor). Differential Revision: https://reviews.llvm.org/D48830 llvm-svn: 336196
* [ARM][NFC] Refactor sequential access for DSPSam Parker2018-07-031-18/+27
| | | | | | | | | | With a view to support parallel operations that have their results stored to memory, refactor the consecutive access helper out so it could support stores instructions. Differential Revision: https://reviews.llvm.org/D48872 llvm-svn: 336195
* [IR] Strip trailing whitespace. NFCBjorn Pettersson2018-07-037-46/+46
| | | | llvm-svn: 336194
OpenPOWER on IntegriCloud