summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [Constants] extend getBinOpIdentity(); NFCSanjay Patel2018-07-061-24/+41
| | | | | | | | The enhanced version will be used in D48893 and related patches and an almost identical (fadd is different) version is proposed in D28907, so adding this as a preliminary step. llvm-svn: 336444
* [Constant] add undef element query for vector constants; NFCSanjay Patel2018-07-061-0/+10
| | | | | | | This is likely to be used in D48987 and similar patches, so adding it as an NFC preliminary step. llvm-svn: 336442
* [ARM] ParallelDSP: added statistics, NFC.Sjoerd Meijer2018-07-061-3/+7
| | | | | | | | | Added statistics for the number of SMLAD instructions created, and als renamed the pass name to -arm-parallel-dsp. Differential Revision: https://reviews.llvm.org/D48971 llvm-svn: 336441
* [LoopSink] Make the enforcement of determinism deterministic.Benjamin Kramer2018-07-061-4/+6
| | | | | | | | | | | | | | LoopBlockNumber is a DenseMap<BasicBlock*, int>, comparing the result of find() will compare a pair<BasicBlock*, int>. That's of course depending on pointer ordering which varies from run to run. Reverse iteration doesn't find this because we're copying to a vector first. This bug has been there since 2016 but only recently showed up on clang selfhost with FDO and ThinLTO, which is also why I didn't manage to get a reasonable test case for this. Add an assert that would've caught this. llvm-svn: 336439
* [AArch64] Armv8.4-A: TLB supportSjoerd Meijer2018-07-062-0/+56
| | | | | | | | This adds: - outer shareable TLB Maintenance instructions, and - TLB range maintenance instructions. llvm-svn: 336434
* Recommit: [AArch64] Armv8.4-A: Flag manipulation instructionsSjoerd Meijer2018-07-063-0/+61
| | | | | | Now with the asm operand definition included. llvm-svn: 336432
* Added missing semicolonDiogo N. Sampaio2018-07-061-2/+1
| | | | llvm-svn: 336428
* [SelectionDAG] https://reviews.llvm.org/D48278Diogo N. Sampaio2018-07-061-0/+106
| | | | | | | | | | | | | | | | D48278 Allow to reduce redundant shift masks. For example: x1 = x & 0xAB00 x2 = (x >> 8) & 0xAB can be reduced to: x1 = x & 0xAB00 x2 = x1 >> 8 It only allows folding when the masks and shift values are constants. llvm-svn: 336426
* Revert [AArch64] Armv8.4-A: Flag manipulation instructionsSjoerd Meijer2018-07-062-35/+0
| | | | | | It's causing build errors. llvm-svn: 336422
* [AArch64] Armv8.4-A: Flag manipulation instructionsSjoerd Meijer2018-07-062-0/+35
| | | | | | | | These instructions are added to AArch64 only. Differential Revision: https://reviews.llvm.org/D48926 llvm-svn: 336421
* CallGraphSCCPass: iterate over all functions.Tim Northover2018-07-061-39/+71
| | | | | | | | | | | | | | | Previously we only iterated over functions reachable from the set of external functions in the module. But since some of the passes under this (notably the always-inliner and coroutine lowerer) are required for correctness, they need to run over everything. This just adds an extra layer of iteration over the CallGraph to keep track of which functions we've already visited and get the next batch of SCCs. Should fix PR38029. llvm-svn: 336419
* [AArch64][ARM] Armv8.4-A: Trace synchronization barrier instructionSjoerd Meijer2018-07-0612-4/+164
| | | | | | | | This adds the Armv8.4-A Trace synchronization barrier (TSB) instruction. Differential Revision: https://reviews.llvm.org/D48918 llvm-svn: 336418
* [X86] Remove FMA4 scalar intrinsics. Use llvm.fma intrinsic instead.Craig Topper2018-07-066-55/+77
| | | | | | | | The intrinsics can be implemented with a f32/f64 llvm.fma intrinsic and an insert into a zero vector. There are a couple regressions here due to SelectionDAG not being able to pull an fneg through an extract_vector_elt. I'm not super worried about this though as InstCombine should be able to do it before we get to SelectionDAG. llvm-svn: 336416
* Revert "[InstCombine] Delay foldICmpUsingKnownBits until simple transforms ↵Max Kazantsev2018-07-061-7/+3
| | | | | | are done" llvm-svn: 336410
* [X86] Remove all of the avx512 masked packed fma intrinsics. Use llvm.fma or ↵Craig Topper2018-07-063-98/+129
| | | | | | | | | | unmasked 512-bit intrinsics with rounding mode. This upgrades all of the intrinsics to use fneg instructions to convert fma into fmsub/fnmsub/fnmadd/fmsubadd. And uses a select instruction for masking. This matches how clang uses the intrinsics these days. llvm-svn: 336409
* [Power9] Add __float128 library call for fremStefan Pintilie2018-07-061-0/+2
| | | | | | | | Power 9 does not have a hardware instruction for frem but we can call fmodf128. Differential Revision: https://reviews.llvm.org/D48552 llvm-svn: 336406
* [PDB] Sort globals symbols by name in GSI hash buckets.Zachary Turner2018-07-061-5/+19
| | | | | | | | | | | It seems like the debugger first computes a symbol's bucket, and then does a binary search of entries in the bucket using the symbol's name in order to find it. If the bucket entries are not in sorted order, this obviously won't work. After this patch a couple of simple test cases show that we generate an exactly identical GSI hash stream, which is very nice. llvm-svn: 336405
* [OpenEmbedded] Add OpenEmbedded vendorMandeep Singh Grang2018-07-051-0/+2
| | | | | | | | | | | | | | | | | | Summary: The lib paths are not correctly picked up for OpenEmbedded sysroots (like arm-oe-linux-gnueabi). I fix this in a follow-up clang patch. But in order to add the correct libs I need to detect if the vendor is oe. For this reason, it is first necessary to teach llvm to detect oe vendor, which is what this patch does. Reviewers: chandlerc, compnerd, rengolin, javed.absar Reviewed By: compnerd Subscribers: kristof.beyls, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D48861 llvm-svn: 336401
* [X86][Disassembler] Fix LOCK prefix disassembler supportMaksim Panchenko2018-07-053-0/+7
| | | | | | | | | | | | | | | | | | | Summary: If LOCK prefix is not the first prefix in an instruction, LLVM disassembler silently drops the prefix. The fix is to select a proper instruction with a builtin LOCK prefix if one exists. Reviewers: craig.topper Reviewed By: craig.topper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49001 llvm-svn: 336400
* Revert r332168: "Reapply "[PR16756] Use SSAUpdaterBulk in JumpThreading.""Michael Zolotukhin2018-07-051-19/+15
| | | | | | | There were a couple of issues reported (PR38047, PR37929) - I'll reland the patch when I figure out and fix the rootcause. llvm-svn: 336393
* [WebAssembly] Add missing _S opcodes of atomic stores to InstPrinterHeejin Ahn2018-07-051-0/+7
| | | | | | | | | | | | Summary: This was missing in D48839 (rL336145). Reviewers: aardappel Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D48992 llvm-svn: 336390
* [ORC] Add BitReader/BitWriter to target_link_librariesHeejin Ahn2018-07-051-0/+6
| | | | | | | | | | | | | | Summary: CompileOnDemandLayer.cpp uses function in these libraries, and builds with `-DSHARED_LIB=ON` fail without this. Reviewers: lhames Subscribers: mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D48995 llvm-svn: 336389
* This is a recommit of r336322, previously reverted in r336324 due toSander de Smalen2018-07-052-1/+16
| | | | | | | | | | | | | | | | | | | | | | a deficiency in TableGen that has been addressed in r336334. [AArch64][SVE] Asm: Support for predicated FP rounding instructions. This patch also adds instructions for predicated FP square-root and reciprocal exponent. The added instructions are: - FRINTI Round to integral value (current FPCR rounding mode) - FRINTX Round to integral value (current FPCR rounding mode, signalling inexact) - FRINTA Round to integral value (to nearest, with ties away from zero) - FRINTN Round to integral value (to nearest, with ties to even) - FRINTZ Round to integral value (toward zero) - FRINTM Round to integral value (toward minus Infinity) - FRINTP Round to integral value (toward plus Infinity) - FSQRT Floating-point square root - FRECPX Floating-point reciprocal exponent llvm-svn: 336387
* [ORC] In CompileOnDemandLayer2, clone modules on to different contexts byLang Hames2018-07-051-77/+79
| | | | | | | | | | | | | | | writing them to a buffer and re-loading them. Also introduces a multithreaded variant of SimpleCompiler (MultiThreadedSimpleCompiler) for compiling IR concurrently on multiple threads. These changes are required to JIT IR on multiple threads correctly. No test case yet. I will be looking at how to modify LLI / LLJIT to test multithreaded JIT support soon. llvm-svn: 336385
* Testing commit permisionDiogo N. Sampaio2018-07-051-1/+1
| | | | llvm-svn: 336384
* [X86] Remove the last of the 'x86.fma.' intrinsics and autoupgrade them to ↵Craig Topper2018-07-052-23/+25
| | | | | | | | 'llvm.fma'. Add upgrade tests for all. Still need to remove the AVX512 masked versions. llvm-svn: 336383
* [X86] Add SHUF128 to target shuffle decoding.Craig Topper2018-07-051-0/+11
| | | | | | Differential Revision: https://reviews.llvm.org/D48954 llvm-svn: 336376
* Fix asserts in AMDGCN fmed3 folding by handling more cases of NaNMatt Arsenault2018-07-051-7/+18
| | | | | | | | | | | | | | | Better NaN handling for AMDGCN fmed3. All operands are checked for NaN now. The checks were moved before the canonicalization to provide a better mapping from fclamp. Changed the behaviour of fmed3(x,y,NaN) to return max(x,y) instead of min(x,y) in light of this. Updated tests as a result and added some new cases to cover the fix. Patch by Alan Baker llvm-svn: 336375
* AMDGPU/GlobalISel: Implement custom kernel arg loweringMatt Arsenault2018-07-054-32/+40
| | | | | | | | | | | | | Avoid using allocateKernArg / AssignFn. We do not want any of the type splitting properties of normal calling convention lowering. For now at least this exists alongside the IR argument lowering pass. This is necessary to handle struct padding correctly while some arguments are still skipped by the IR argument lowering pass. llvm-svn: 336373
* [CostModel][X86] Add UDIV/UREM by pow2 costsSimon Pilgrim2018-07-051-15/+29
| | | | | | Normally InstCombine would have simplified these to SRL/AND instructions but we may still see these during SLP vectorization etc. llvm-svn: 336371
* [Power9] Add lib calls for float128 operations with no equivalent PPC ↵Lei Huang2018-07-051-0/+19
| | | | | | | | | | | instructions Map the following instructions to the proper float128 lib calls: pow[i], exp[2], log[2|10], sin, cos, fmin, fmax Differential Revision: https://reviews.llvm.org/D48544 llvm-svn: 336361
* [SLPVectorizer] Begin abstracting InstructionsState alternate matching away ↵Simon Pilgrim2018-07-051-42/+55
| | | | | | | | | | from opcodes. NFCI. This is an early step towards matching Instructions by attributes other than the opcode. This will be necessary for cast/call alternates which share the same opcode but have different types/intrinsicIDs etc. - which we could vectorize as long as we split them using the alternate mechanism. Differential Revision: https://reviews.llvm.org/D48945 llvm-svn: 336344
* [AMDGPU] Add VALU to V_INTERP InstructionsRyan Taylor2018-07-051-0/+1
| | | | | | | | | | | | Wait states are not properly being inserted after buffer_store for v_interp instructions. Add VALU to V_INTERP instructions so that the GCNHazardRecognizer can check and insert the appropriate wait states when needed. Differential Revision: https://reviews.llvm.org/D48772 Change-Id: Id540c9b074fc69b5c1de6b182276aa089c74aa64 llvm-svn: 336339
* Try to fix -Wimplicit-fallthrough warning. NFCI.Simon Pilgrim2018-07-051-0/+1
| | | | llvm-svn: 336331
* [mips] Fix atomic operations at O0, v3Aleksandar Beserminji2018-07-058-337/+1058
| | | | | | | | | | | | | | | | | | | | | | | Similar to PR/25526, fast-regalloc introduces spills at the end of basic blocks. When this occurs in between an ll and sc, the stores can cause the atomic sequence to fail. This patch fixes the issue by introducing more pseudos to represent atomic operations and moving their lowering to after the expansion of postRA pseudos. This version addresses issues with the initial implementation and covers all atomic operations. This resolves PR/32020. Thanks to James Cowgill for reporting the issue! Patch By: Simon Dardis Differential Revision: https://reviews.llvm.org/D31287 llvm-svn: 336328
* [NEON] Fix combining of vldx_dup intrinsics with updating of base addressesIvan A. Kosarev2018-07-051-0/+6
| | | | | | | | | | | | | Resolves: Unsupported ARM Neon intrinsics in Target-specific DAG combine function for VLDDUP https://bugs.llvm.org/show_bug.cgi?id=38031 Related diff: D48439 Differential Revision: https://reviews.llvm.org/D48920 llvm-svn: 336325
* Reverting r336322 for now, as it causes an assert failureSander de Smalen2018-07-052-16/+1
| | | | | | | in TableGen, for which there is already a patch in Phabricator (D48937) that needs to be committed first. llvm-svn: 336324
* [AArch64][SVE] Asm: Support for predicated FP rounding instructions.Sander de Smalen2018-07-052-1/+16
| | | | | | | | | | | | | | | | | | This patch also adds instructions for predicated FP square-root and reciprocal exponent. The added instructions are: - FRINTI Round to integral value (current FPCR rounding mode) - FRINTX Round to integral value (current FPCR rounding mode, signalling inexact) - FRINTA Round to integral value (to nearest, with ties away from zero) - FRINTN Round to integral value (to nearest, with ties to even) - FRINTZ Round to integral value (toward zero) - FRINTM Round to integral value (toward minus Infinity) - FRINTP Round to integral value (toward plus Infinity) - FSQRT Floating-point square root - FRECPX Floating-point reciprocal exponent llvm-svn: 336322
* [ARM] ParallelDSP: only support i16 loads for nowSjoerd Meijer2018-07-051-28/+25
| | | | | | | | | We were miscompiling i8 loads, so reject them as unsupported narrow operations for now. Differential Revision: https://reviews.llvm.org/D48944 llvm-svn: 336319
* [AArch64][SVE] Asm: Support for signed/unsigned MIN/MAX/ABDSander de Smalen2018-07-054-0/+53
| | | | | | | | | | | | | | | | | | | | This patch implements the following varieties: - Unpredicated signed max, e.g. smax z0.h, z1.h, #-128 - Unpredicated signed min, e.g. smin z0.h, z1.h, #-128 - Unpredicated unsigned max, e.g. umax z0.h, z1.h, #255 - Unpredicated unsigned min, e.g. umin z0.h, z1.h, #255 - Predicated signed max, e.g. smax z0.h, p0/m, z0.h, z1.h - Predicated signed min, e.g. smin z0.h, p0/m, z0.h, z1.h - Predicated signed abd, e.g. sabd z0.h, p0/m, z0.h, z1.h - Predicated unsigned max, e.g. umax z0.h, p0/m, z0.h, z1.h - Predicated unsigned min, e.g. umin z0.h, p0/m, z0.h, z1.h - Predicated unsigned abd, e.g. uabd z0.h, p0/m, z0.h, z1.h llvm-svn: 336317
* [Power9] Optimize codgen for conversions of int to float128Lei Huang2018-07-051-0/+17
| | | | | | | | | | | | Optimize code sequences for integer conversion to fp128 when the integer is a result of: * float->int * float->long * double->int * double->long Differential Revision: https://reviews.llvm.org/D48429 llvm-svn: 336316
* [X86] Remove X86 specific scalar FMA intrinsics and upgrade to tart ↵Craig Topper2018-07-054-58/+33
| | | | | | independent FMA and extractelement/insertelement. llvm-svn: 336315
* [demangler] Avoid alignment warningSerge Pavlov2018-07-051-1/+1
| | | | | | | | | | | | | | The alignment specified by a constant for the field `BumpPointerAllocator::InitialBuffer` exceeded the alignment guaranteed by `malloc` and `new` on Windows. This change set the alignment value to that of `long double`, which is defined by the used platform. It fixes https://bugs.llvm.org/show_bug.cgi?id=37944. Differential Revision: https://reviews.llvm.org/D48889 llvm-svn: 336311
* [Power9] Ensure float128 in non-homogenous aggregates are passed via VSX regLei Huang2018-07-054-0/+43
| | | | | | | | | | | | | Non-homogenous aggregates are passed in consecutive GPRs, in GPRs and in memory, or in memory. This patch ensures that float128 members of non-homogenous aggregates are passed via VSX registers. This is done via custom lowering a bitcast of a build_pari(i64,i64) to float128 to a new PPCISD node, BUILD_FP128. Differential Revision: https://reviews.llvm.org/D48308 llvm-svn: 336310
* [Power9]Legalize and emit code for quad-precision convert from single-precisionLei Huang2018-07-052-2/+10
| | | | | | | | | Legalize and emit code for quad-precision floating point operation conversion of single-precision value to quad-precision. Differential Revision: https://reviews.llvm.org/D47569 llvm-svn: 336307
* [Power9] Implement float128 parameter passing and return valuesLei Huang2018-07-052-5/+25
| | | | | | | | | | This patch enable parameter passing and return by value for float128 types. Passing aggregate/union which contain float128 members will be submitted in subsequent patches. Differential Revision: https://reviews.llvm.org/D47552 llvm-svn: 336306
* [X86] Remove some isel patterns for X86ISD::SELECTS that specifically looked ↵Craig Topper2018-07-051-18/+0
| | | | | | | | for the v1i1 mask to have come from a scalar_to_vector from GR8. We have patterns for SELECTS that top at v1i1 and we have a pattern for (v1i1 (scalar_to_vector GR8)). The patterns being removed here do the same thing as the two other patterns combined so there is no need for them. llvm-svn: 336305
* [X86] Add support for combining FMSUB/FNMADD/FNMSUB ISD nodes with an fneg ↵Craig Topper2018-07-051-78/+119
| | | | | | | | | | input. Previously we could only negate the FMADD opcodes. This used to be mostly ok when we lowered FMA intrinsics during lowering. But with the move to llvm.fma from target specific intrinsics, we can combine (fneg (fma)) to (fmsub) earlier. So if we start with (fneg (fma (fneg))) we would get stuck at (fmsub (fneg)). This patch fixes that so we can also combine things like (fmsub (fneg)). llvm-svn: 336304
* [X86] Remove some of the packed FMA3 intrinsics since we no longer use them ↵Craig Topper2018-07-052-44/+32
| | | | | | | | | | in clang. There's a regression in here due to inability to combine fneg inputs of X86ISD::FMSUB/FNMSUB/FNMADD nodes. More removals to come, but I wanted to stop and fix the regression that showed up in this first. llvm-svn: 336303
* [Power9]Legalize and emit code for round & convert quad-precision valuesLei Huang2018-07-043-3/+26
| | | | | | | | | Legalize and emit code for round & convert float128 to double precision and single precision. Differential Revision: https://reviews.llvm.org/D46997 llvm-svn: 336299
OpenPOWER on IntegriCloud