summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* Rename __asan_gen_* symbols to ___asan_gen_*.Peter Collingbourne2018-07-181-1/+1
| | | | | | | | | | This prevents gold from printing a warning when trying to export these symbols via the asan dynamic list after ThinLTO promotes them from private symbols to external symbols with hidden visibility. Differential Revision: https://reviews.llvm.org/D49498 llvm-svn: 337428
* [WebAssembly] Add missing -mattr=+exception-handling guardsHeejin Ahn2018-07-181-0/+3
| | | | | | | | | | | | | | Summary: The use of exception handling instructions should only be enabled with `-mattr=+exception-handling` option. Reviewers: jgravelle-google Subscribers: dschuff, sbc100, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D49391 llvm-svn: 337425
* Revert "ARM: switch armv7em triple to hard-float defaults and libcalls."Tim Northover2018-07-181-1/+0
| | | | | | This reverts commit r337385 until it can be targeted at MachO only. llvm-svn: 337424
* [X86][SSE] Canonicalize scalar fp arithmetic shuffle patternsSimon Pilgrim2018-07-181-2/+31
| | | | | | | | | | | | As discussed on PR38197, this canonicalizes MOVS*(N0, OP(N0, N1)) --> MOVS*(N0, SCALAR_TO_VECTOR(OP(N0[0], N1[0]))) This returns the scalar-fp codegen lost by rL336971. Additionally it handles the OP(N1, N0)) case for commutable (FADD/FMUL) ops. Differential Revision: https://reviews.llvm.org/D49474 llvm-svn: 337419
* Skip debuginfo intrinsic in markLiveBlocks.Xin Tong2018-07-181-39/+38
| | | | | | | | | | | | | | | | | | | | Summary: The optimizer is 10%+ slower with vs without debuginfo. I started checking where the difference is coming from. I compiled sqlite3.c with and without debug info from CTMark and compare the time difference. I use Xcode Instrument to find where time is spent. This brings about 20ms, out of ~20s. Reviewers: davide, hfinkel Reviewed By: hfinkel Subscribers: hfinkel, aprantl, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D49337 llvm-svn: 337416
* [DebugInfo] Dwarfv5: Avoid unnecessary base_address specifiers in rnglistsDavid Blaikie2018-07-181-10/+18
| | | | | | | | | | | | Since DWARFv5 rnglists are self descriptive and have distinct encodings for base-relative (offset_pair) and absolute (start_length) entries, there's no need to use a base address specifier when describing a lone address range in a section. Use that, and improve the test coverage a bit here to include cases like this and others. llvm-svn: 337411
* [ScheduleDAG] Fix unfolding of SUnits to already existent nodes.Nirav Dave2018-07-181-18/+30
| | | | | | | | | | | | | | | | | Summary: If unfolding an SUnit results in both load or the operation using it which already exist in the DAG, abort the unfold if they are already scheduled. If not, make sure we don't add duplicate dependencies. This fixes PR37916. Reviewers: davide, eli.friedman, fhahn, bogner Subscribers: MatzeB, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D48666 llvm-svn: 337409
* [RegAlloc][NFC] Fix the help string of the option "huge-size-for-split".Wei Mi2018-07-181-1/+2
| | | | llvm-svn: 337402
* [MC] Fix nested macro body parsingNirav Dave2018-07-181-1/+2
| | | | | | Add missing .rep case in nestlevel checking for macro body parsing. llvm-svn: 337398
* [mips] Fix predicate for the MipsTruncIntFP patternSimon Atanasyan2018-07-181-1/+1
| | | | | | | | | This is a follow-up to the rL337171. This patch fixes regression introduced by the r337171 and enables MipsTruncIntFP pattern. Differential revision: https://reviews.llvm.org/D49469 llvm-svn: 337392
* [SLPVectorizer] Avoid duplicate scalar cost calculations in ↵Simon Pilgrim2018-07-181-50/+37
| | | | | | | | BoUpSLP::getEntryCost. NFCI. Pulled out from D49225, we have a lot of repeated scalar cost calculations, often with arguments that don't look the same but turn out to be. llvm-svn: 337390
* [Support] Build fix for Haiku when checking for a local filesystemTim Northover2018-07-181-0/+3
| | | | | | | | | Haiku does not expose information about local versus remote mounts, so just return false, like Cygwin. Patch by Niels Sascha Reedijk. llvm-svn: 337389
* [X86][SSE] Remove BLENDPD canonicalization from combineTargetShuffleSimon Pilgrim2018-07-181-25/+0
| | | | | | When rL336971 removed the scalar-fp isel patterns, we lost the need for this canonicalization - commutation/folding can handle everything else. llvm-svn: 337387
* ARM: stop explicitly marking armv7k libcalls as hard-float. NFC.Tim Northover2018-07-181-7/+0
| | | | | | | Since the triple's default is hard float, the libcalls will already use VFP registers. llvm-svn: 337386
* ARM: switch armv7em triple to hard-float defaults and libcalls.Tim Northover2018-07-181-0/+1
| | | | | | | We were emitting incorrect calls to libm functions that LLVM had decided it knew about because the default is soft-float. llvm-svn: 337385
* ARM: deduplicate hard-float detection code. NFC.Tim Northover2018-07-184-12/+12
| | | | | | | | ARMSubtarget had a copy/pasted block to determine whether the target was hard-float, but it just delegated to triple features anyway so it's better at the TargetMachine level. llvm-svn: 337384
* [AArch64][SVE] Asm: Support for unpredicated FP operations.Sander de Smalen2018-07-182-0/+35
| | | | | | | | | | | | | | | | | | This patch adds support for the following unpredicated floating-point instructions: FADD Floating point add FSUB Floating point subtract FMUL Floating point multiplication FTSMUL Floating point trigonometric starting value FRECPS Floating point reciprocal step FRSQRTS Floating point reciprocal square root step The instructions have the following assembly format: fadd z0.h, z1.h, z2.h and have variants for 16, 32 and 64-bit FP elements. llvm-svn: 337383
* [InstCombine] Re-commit: Fold 'check for [no] signed truncation' patternRoman Lebedev2018-07-181-0/+78
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: [[ https://bugs.llvm.org/show_bug.cgi?id=38149 | PR38149 ]] As discussed in https://reviews.llvm.org/D49179#1158957 and later, the IR for 'check for [no] signed truncation' pattern can be improved: https://rise4fun.com/Alive/gBf ^ that pattern will be produced by Implicit Integer Truncation sanitizer, https://reviews.llvm.org/D48958 https://bugs.llvm.org/show_bug.cgi?id=21530 in signed case, therefore it is probably a good idea to improve it. The DAGCombine will reverse this transform, see https://reviews.llvm.org/D49266 This transform is surprisingly frustrating. This does not deal with non-splat shift amounts, or with undef shift amounts. I've outlined what i think the solution should be: ``` // Potential handling of non-splats: for each element: // * if both are undef, replace with constant 0. // Because (1<<0) is OK and is 1, and ((1<<0)>>1) is also OK and is 0. // * if both are not undef, and are different, bailout. // * else, only one is undef, then pick the non-undef one. ``` This is a re-commit, as the original patch, committed in rL337190 was reverted in rL337344 as it broke chromium build: https://bugs.llvm.org/show_bug.cgi?id=38204 and https://crbug.com/864832 Proofs that the fixed folds are ok: https://rise4fun.com/Alive/VYM Differential Revision: https://reviews.llvm.org/D49320 llvm-svn: 337376
* Revert "[Sparc] Use the IntPair reg class for r constraints with value type f64"Daniel Cederman2018-07-181-1/+1
| | | | | | | | | This reverts commit 55222c9183c6e07f53a54c4061677734f54feac1. I missed that this patch has a dependency on https://reviews.llvm.org/D49219 that has not been approved yet. llvm-svn: 337373
* [AArch64][SVE] Asm: Support for UDOT/SDOT instructions.Sander de Smalen2018-07-182-0/+72
| | | | | | | | | | | | | | | | | | | | | | | | | The signed/unsigned DOT instructions perform a dot-product on quadtuplets from two source vectors and accumulate the result in the destination register. The instructions come in two forms: Vector form, e.g. sdot z0.s, z1.b, z2.b - signed dot product on four 8-bit quad-tuplets, accumulating results in 32-bit elements. udot z0.d, z1.h, z2.h - unsigned dot product on four 16-bit quad-tuplets, accumulating results in 64-bit elements. Indexed form, e.g. sdot z0.s, z1.b, z2.b[3] - signed dot product on four 8-bit quad-tuplets with specified quadtuplet from second source vector, accumulating results in 32-bit elements. udot z0.d, z1.h, z2.h[1] - dot product on four 16-bit quad-tuplets with specified quadtuplet from second source vector, accumulating results in 64-bit elements. llvm-svn: 337372
* [Sparc] Use the IntPair reg class for r constraints with value type f64Daniel Cederman2018-07-181-1/+1
| | | | | | | | | | | | | | | Summary: This is how it appears to be handled in GCC and it prevents a "Unknown mismatch" error in the SelectionDAGBuilder. Reviewers: venkatra, jyknight, jrtc27 Reviewed By: jyknight, jrtc27 Subscribers: eraman, fedor.sergeev, jrtc27, llvm-commits Differential Revision: https://reviews.llvm.org/D49218 llvm-svn: 337370
* [AArch64][SVE] Asm: Integer divide instructions.Sander de Smalen2018-07-182-0/+11
| | | | | | | | | | | | | | | | | | This patch adds the following predicated instructions: UDIV Unsigned divide active elements UDIVR Unsigned divide active elements, reverse form. SDIV Signed divide active elements SDIVR Signed divide active elements, reverse form. e.g. udiv z0.s, p0/m, z0.s, z1.s (unsigned divide active elements in z0 by z1, store result in z0) sdivr z0.s, p0/m, z0.s, z1.s (signed divide active elements in z1 by z0, store result in z0) llvm-svn: 337369
* Fix -Wdocumentation warning. NFCI.Simon Pilgrim2018-07-181-1/+1
| | | | llvm-svn: 337367
* [AArch64][SVE] Asm: Support for integer MUL instructions.Sander de Smalen2018-07-182-8/+26
| | | | | | | | | | | | | | | | This patch adds the following instructions: MUL - multiply vectors, e.g. mul z0.h, p0/m, z0.h, z1.h - multiply with immediate, e.g. mul z0.h, z0.h, #127 SMULH - signed multiply returning high half, e.g. smulh z0.h, p0/m, z0.h, z1.h UMULH - unsigned multiply returning high half, e.g. umulh z0.h, p0/m, z0.h, z1.h llvm-svn: 337358
* [X86] Enable commuting of VUNPCKHPD to VMOVLHPS to enable load folding by ↵Craig Topper2018-07-183-16/+38
| | | | | | | | using VMOVLPS with a modified address. This required an annoying amount of tablegen multiclass changes to make only VUNPCKHPDZ128rr commutable. llvm-svn: 337357
* [NFC] fix trivial typos in commentsHiroshi Inoue2018-07-182-4/+4
| | | | llvm-svn: 337351
* Fix build failures from r337347, found by clangJustin Hibbits2018-07-183-15/+6
| | | | | | | | | | | * Delete a no-longer-used override, and mark the other getRegisterTypeForCallingConv() as override. * SPE only supports i32, not i64, as the internal type, so simply remove the type check, so that DestReg and Opc are provably always set. GCC 6.4 did not warn about either of the above. llvm-svn: 337350
* [X86] Remove patterns that mix X86ISD::MOVLHPS/MOVHLPS with v2i64/v2f64 types.Craig Topper2018-07-182-33/+0
| | | | | | The X86ISD::MOVLHPS/MOVHLPS should now only be emitted in SSE1 only. This means that the v2i64/v2f64 types would be illegal thus we don't need these patterns. llvm-svn: 337349
* [X86] Generate v2f64 X86ISD::UNPCKL/UNPCKH instead of ↵Craig Topper2018-07-182-4/+17
| | | | | | | | | | | | X86ISD::MOVLHPS/MOVHLPS for unary v2f64 {0,0} and {1,1} shuffles with SSE2. I'm trying to restrict the MOVLHPS/MOVHLPS ISD nodes to SSE1 only. With SSE2 we can use unpcks. I believe this will allow some patterns to be cleaned up to require fewer bitcasts. I've put in an odd isel hack to still select MOVHLPS instruction from the unpckh node to avoid changing tests and because movhlps is a shorter encoding. Ideally we'd do execution domain switching on this, but the operands are in the wrong order and are tied. We might be able to try a commute in the domain switching using custom code. We already support domain switching for UNPCKLPD and MOVLHPS. llvm-svn: 337348
* Introduce codegen for the Signal Processing EngineJustin Hibbits2018-07-1818-614/+1323
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The Signal Processing Engine (SPE) is found on NXP/Freescale e500v1, e500v2, and several e200 cores. This adds support targeting the e500v2, as this is more common than the e500v1, and is in SoCs still on the market. This patch is very intrusive because the SPE is binary incompatible with the traditional FPU. After discussing with others, the cleanest solution was to make both SPE and FPU features on top of a base PowerPC subset, so all FPU instructions are now wrapped with HasFPU predicates. Supported by this are: * Code generation following the SPE ABI at the LLVM IR level (calling conventions) * Single- and Double-precision math at the level supported by the APU. Still to do: * Vector operations * SPE intrinsics As this changes the Callee-saved register list order, one test, which tests the precise generated code, was updated to account for the new register order. Reviewed by: nemanjai Differential Revision: https://reviews.llvm.org/D44830 llvm-svn: 337347
* Complete the SPE instruction set patternsJustin Hibbits2018-07-186-225/+562
| | | | | | | | | This is the lead-up to having SPE codegen. Add the rest of the instructions, along with MC tests. Differential Revision: https://reviews.llvm.org/D44829 llvm-svn: 337346
* Add PowerPC e500(v2) core scheduler and directives.Justin Hibbits2018-07-187-220/+497
| | | | | | Differential Revision: https://reviews.llvm.org/D44828 llvm-svn: 337345
* Revert "[InstCombine] Fold 'check for [no] signed truncation' pattern"Bob Haarman2018-07-181-69/+0
| | | | | | | | | This reverts r337190 (and a few follow-up commits), which caused the Chromium build to fail. See https://bugs.llvm.org/show_bug.cgi?id=38204 and https://crbug.com/864832 llvm-svn: 337344
* CodeGen: Don't create address significance table entries for thread-local ↵Peter Collingbourne2018-07-181-1/+2
| | | | | | | | | | variables. The presence of these symbols in the symbol table can cause symbol type mismatch errors (or undefined symbol errors on emulated TLS targets) and they can't be ICF'd anyway. llvm-svn: 337338
* [X86] Remove the vector alignment requirement from the patterns added in ↵Craig Topper2018-07-171-2/+4
| | | | | | | | r337320. The resulting instruction will only load 64 bits so alignment isn't required. llvm-svn: 337334
* CodeGen: Add a target option for emitting .addrsig directives for all ↵Peter Collingbourne2018-07-171-0/+8
| | | | | | | | address-significant symbols. Differential Revision: https://reviews.llvm.org/D48143 llvm-svn: 337331
* MC: Implement support for new .addrsig and .addrsig_sym directives.Peter Collingbourne2018-07-176-0/+89
| | | | | | | | | Part of the address-significance tables proposal: http://lists.llvm.org/pipermail/llvm-dev/2018-May/123514.html Differential Revision: https://reviews.llvm.org/D47744 llvm-svn: 337328
* [X86] Add patterns for folding full vector load into MOVHPS and MOVLPS with ↵Craig Topper2018-07-172-16/+25
| | | | | | SSE1 only. llvm-svn: 337320
* [Demangle] Add missing header filesFangrui Song2018-07-171-0/+2
| | | | llvm-svn: 337318
* Add missing include.Zachary Turner2018-07-171-1/+2
| | | | llvm-svn: 337317
* Add some helper functions to the demangle utility classes.Zachary Turner2018-07-173-16/+128
| | | | | | | | | | | | | | These are all methods that, while not currently used in the Itanium demangler, are generally useful enough that it's likely the itanium demangler could find a use for them. More importantly, they are all necessary for the Microsoft demangler which is up and coming in a subsequent patch. Rather than combine these into a single monolithic patch, I think it makes sense to commit this utility code first since it is very simple, this way it won't detract from the substance of the MS demangler patch. llvm-svn: 337316
* [InstCombine] Preserve debug value when simplifying cast-of-selectVedant Kumar2018-07-171-1/+3
| | | | | | | | | | | | | | | | | | | | InstCombine has a cast transform that matches a cast-of-select: Orig = cast (Src = select Cond TV FV) And tries to replace it with a select which has the cast folded in: NewSel = select Cond (cast TV) (cast FV) The combiner does RAUW(Orig, NewSel), so any debug values for Orig would survive the transform. But debug values for Src would be lost. This patch teaches InstCombine to replace all debug uses of Src with NewSel (taking care of doing any necessary DIExpression rewriting). Differential Revision: https://reviews.llvm.org/D49270 llvm-svn: 337310
* [x86/SLH] Flesh out the data-invariant instruction table a bit based on ↵Chandler Carruth2018-07-171-7/+37
| | | | | | | | | | | | | | | | | | | | | | | | | feedback from Craig. Summary: The only thing he suggested that I've skipped here is the double-wide multiply instructions. Multiply is an area I'm nervous about there being some hidden data-dependent behavior, and it doesn't seem important for any benchmarks I have, so skipping it and sticking with the minimal multiply support that matches what I know is widely used in existing crypto libraries. We can always add double-wide multiply when we have clarity from vendors about its behavior and guarantees. I've tried to at least cover the fundamentals here with tests, although I've not tried to cover every width or permutation. I can add more tests where folks think it would be helpful. Reviewers: craig.topper Subscribers: sanjoy, mcrosier, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D49413 llvm-svn: 337308
* [WebAssembly] Update WebAssemblyLowerEmscriptenEHSjLj to handle separate ↵Sam Clegg2018-07-171-38/+17
| | | | | | | | | | | | | | | | | | | | | compilation Previously we were assuming whole program compilation. Now that separate compilation is a thing we need to update this pass. Firstly, it can no longer assert on the existence of malloc and free. This functions might not be in the current translation unit. If we need them then we will generate not imports for them. Secondly the global helper function we create should be marked as weak since we will be generating a separate copy in each translation unit. Finally the names of the symbols used must be unique and fixed since they need to agree across translation units. Differential Revision: https://reviews.llvm.org/D49263 llvm-svn: 337301
* [X86] Remove some standalone patterns in favor of the patterns in the MOVLPD ↵Craig Topper2018-07-172-20/+2
| | | | | | | | instruction definitions. Previously we passed 'null_frag' into the instruction definition. The multiclass is shared with MOVHPD which doesn't use null_frag. It turns out by passing X86Movsd it produces patterns equivalent to some standalone patterns. llvm-svn: 337299
* [AArch64][SVE]: Integer multiply-add/subtract instructions.Sander de Smalen2018-07-172-0/+69
| | | | | | | | | | This patch adds support for the following instructions: MLA mul-add, writing addend (Zda = Zda + Zn * Zm) MLS mul-sub, writing addend (Zda = Zda + -Zn * Zm) MAD mul-add, writing multiplicant (Zdn = Za + Zdn * Zm) MSB mul-sub, writing multiplicant (Zdn = Za + -Zdn * Zm) llvm-svn: 337293
* [Mips][FastISel] Fix handling of icmp with i1 typePetar Jovanovic2018-07-171-0/+4
| | | | | | | | | | | The Mips FastISel back-end does not extend i1 values while lowering icmp. Ensure that we bail into DAG ISel when handling this case. Patch by Dragan Mladjenovic. Differential Revision: https://reviews.llvm.org/D49290 llvm-svn: 337288
* [IPSCCP] Run Solve each time we resolved an undef in a function.Florian Hahn2018-07-171-3/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Once we resolved an undef in a function we can run Solve, which could lead to finding a constant return value for the function, which in turn could turn undefs into constants in other functions that call it, before resolving undefs there. Computationally the amount of work we are doing stays the same, just the order we process things is slightly different and potentially there are a few less undefs to resolve. We are still relying on the order of functions in the IR, which means depending on the order, we are able to resolve the optimal undef first or not. For example, if @test1 comes before @testf, we find the constant return value of @testf too late and we cannot use it while solving @test1. This on its own does not lead to more constants removed in the test-suite, probably because currently we have to be very lucky to visit applicable functions in the right order. Maybe we manage to come up with a better way of resolving undefs in more 'profitable' functions first. Reviewers: efriedma, mssimpso, davide Reviewed By: efriedma, davide Differential Revision: https://reviews.llvm.org/D49385 llvm-svn: 337283
* [AArch64][SVE] Asm: FP fused multiply-add/subtract instructions.Sander de Smalen2018-07-172-0/+119
| | | | | | | | | | | | | | | | | | This patch adds support for the following instructions: FMLA mul-add, writing addend (Zda = Zda + Zn * Zm) FNMLA negated mul-add, writing addend (Zda = -Zda + -Zn * Zm) FMLS mul-sub, writing addend (Zda = Zda + -Zn * Zm) FNMLS negated mul-sub, writing addend (Zda = -Zda + Zn * Zm) FMAD mul-add, writing multiplicant (Zdn = Za + Zdn * Zm) FNMAD negated mul-add, writing multiplicant (Zdn = -Za + -Zdn * Zm) FMSB mul-sub, writing multiplicant (Zdn = Za + -Zdn * Zm) FNMSB negated mul-sub, writing multiplicant (Zdn = -Za + Zdn * Zm) llvm-svn: 337282
* [SLPVectorizer] Don't attempt horizontal reduction on pointer types (PR38191)Simon Pilgrim2018-07-171-0/+2
| | | | | | TTI::getMinMaxReductionCost typically can't handle pointer types - until this is changed its better to limit horizontal reduction to integer/float vector types only. llvm-svn: 337280
OpenPOWER on IntegriCloud