path: root/llvm/lib
Commit message | Author | Age | Files | Lines
...
* [ARM] Add a batch of similarly encoded MVE instructions. | Simon Tatham | 2019-06-21 | 3 | -1/+345
  Summary: This adds the `MVE_qDest_qSrc` superclass and all instructions that inherit from it. It's not the complete class of _everything_ with a q-register as both destination and source; it's a subset of them that all have similar encodings (but it would have been hopelessly unwieldy to call it anything like MVE_111x11100). This category includes add/sub with carry; long multiplies; halving multiplies; multiply and accumulate, and some more complex instructions.
  Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover
  Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D62677
  llvm-svn: 364037
* [X86] createMMXBuildVector - call with BuildVectorSDNode directly. NFCI. | Simon Pilgrim | 2019-06-21 | 1 | -7/+5
  llvm-svn: 364030
* [ARM] Fix -Wimplicit-fallthrough after D62675 | Fangrui Song | 2019-06-21 | 1 | -0/+2
  llvm-svn: 364028
* [ARM] Add MVE vector compare instructions. | Simon Tatham | 2019-06-21 | 3 | -6/+201
  Summary: These take a pair of vector registers to compare, and a comparison type (written in the form of an Arm condition suffix); they output a vector of booleans in the VPR register, where predication can conveniently use them.
  Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover
  Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D62676
  llvm-svn: 364027
* [X86] combineAndnp - use isNOT instead of manually checking for (XOR x, -1) | Simon Pilgrim | 2019-06-21 | 1 | -5/+3
  llvm-svn: 364026
* [Symbolize] Avoid lifetime extension and simplify std::map find/insert. NFC | Fangrui Song | 2019-06-21 | 1 | -33/+26
  llvm-svn: 364025
* [X86] foldVectorXorShiftIntoCmp - use isConstOrConstSplat. NFCI. | Simon Pilgrim | 2019-06-21 | 1 | -7/+4
  Use the isConstOrConstSplat helper instead of inspecting the build vector manually.
  llvm-svn: 364024
* [X86][AVX] isNOT - handle concat_vectors(xor X, -1, xor Y, -1) pattern | Simon Pilgrim | 2019-06-21 | 1 | -0/+10
  llvm-svn: 364022
* [ARM] Add a batch of MVE floating-point instructions. | Simon Tatham | 2019-06-21 | 3 | -4/+456
  Summary: This includes floating-point basic arithmetic (add/sub/multiply), complex add/multiply, unary negation and absolute value, rounding to integer value, and conversion to/from integer formats.
  Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover
  Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D62675
  llvm-svn: 364013
* Simplify std::lower_bound with llvm::{bsearch,lower_bound}. NFC | Fangrui Song | 2019-06-21 | 24 | -90/+57
  llvm-svn: 364006
* [MIPS GlobalISel] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds after D63541 | Fangrui Song | 2019-06-21 | 2 | -2/+4
  llvm-svn: 364003
* [GlobalISel][Localizer] Allow localization of G_INTTOPTR and chains of instructions. | Amara Emerson | 2019-06-21 | 1 | -14/+15
  G_INTTOPTR can prevent the localizer from moving G_CONSTANTs, but since it's essentially a side effect free cast instruction we can remat both instructions.
  This patch changes the localizer to enable localization of the chains by iterating over the entry block instructions in reverse order. That way, uses will be localized first, and then the defs are free to be localized as well.
  This also changes the previous SmallPtrSet of localized instructions to use a SetVector instead. We're dealing with pointers and need deterministic iteration order.
  Overall, this change improves ARM64 -O0 CTMark code size by around 0.7% geomean.
  Differential Revision: https://reviews.llvm.org/D63630
  llvm-svn: 364001
* [Reassociate] Remove bogus assert reported in PR42349. | Cameron McInally | 2019-06-20 | 1 | -5/+1
  Also, add a FIXME for the unsafe transform on a unary FNeg. A unary FNeg can only be transformed to a FMul by -1.0 when the nnan flag is present. The unary FNeg project is a WIP, so the unsafe transformation is acceptable until that work is complete.
  The bogus assert was introduced in D63445.
  llvm-svn: 363998
* [InstSimplify] simplify power-of-2 (single bit set) sequences | Sanjay Patel | 2019-06-20 | 1 | -0/+10
  As discussed in PR42314: https://bugs.llvm.org/show_bug.cgi?id=42314
  Improving the canonicalization for these patterns (rL363956) means we should adjust/enhance the related simplification.
  https://rise4fun.com/Alive/w1cp
    Name: isPow2 or zero
    %x = and i32 %xx, 2048
    %a = add i32 %x, -1
    %r = and i32 %a, %x
    =>
    %r = i32 0
  llvm-svn: 363997
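The Alive proof in the commit above can be sanity-checked outside LLVM. The following is only a minimal sketch that models the three i32 instructions with Python integers; the function name `single_bit_pattern` and the `MASK32` constant are made up for illustration and are not part of LLVM.

```python
MASK32 = 0xFFFFFFFF  # model i32 wrap-around arithmetic


def single_bit_pattern(xx: int, bit: int = 2048) -> int:
    """Model the sequence from the commit message for a single-bit mask."""
    x = xx & bit          # %x = and i32 %xx, 2048  (x is either 0 or `bit`)
    a = (x - 1) & MASK32  # %a = add i32 %x, -1     (wraps modulo 2**32)
    return a & x          # %r = and i32 %a, %x


# Check a range of inputs for the mask used in the commit message.
all_zero = all(single_bit_pattern(xx) == 0 for xx in range(1 << 13))
```

Because `x` has at most one bit set, subtracting 1 always clears that bit, so the final `and` can only produce 0, which is what the simplification relies on.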
* AMDGPU: Always use s33 for global scratch wave offset | Matt Arsenault | 2019-06-20 | 2 | -9/+1
  Every called function could possibly need this to calculate the absolute address of stack objects, and this avoids inserting a copy around every call site in the kernel. It's also somewhat cleaner to keep this in a callee saved SGPR.
  llvm-svn: 363990
* [ARM GlobalISel] Add support for s64 G_ADD and G_SUB. | Eli Friedman | 2019-06-20 | 2 | -2/+19
  Teach RegisterBankInfo to use the correct register class, and tell the legalizer it's legal. Everything else just works.
  The one thing that's slightly weird about this compared to SelectionDAG isel is that legalization can't distinguish between i64 and <1 x i64>, so we might end up with more NEON instructions than the user expects.
  Differential Revision: https://reviews.llvm.org/D63585
  llvm-svn: 363989
* [PowerPC][NFC] Fix comments for AltVSXFMARel mapping. | Jinsong Ji | 2019-06-20 | 1 | -3/+2
  llvm-svn: 363987
* [profile] Solaris ld supports __start___llvm_prof_data etc. labels | Rainer Orth | 2019-06-20 | 1 | -1/+2
  Currently, many profiling tests on Solaris FAIL like

    Command Output (stderr):
    --
    Undefined                               first referenced
     symbol                                     in file
    __llvm_profile_register_names_function  /tmp/lit_tmp_Nqu4eh/infinite_loop-9dc638.o
    __llvm_profile_register_function        /tmp/lit_tmp_Nqu4eh/infinite_loop-9dc638.o

  Solaris 11.4 ld supports the non-standard GNU ld extension of adding __start_SECNAME and __stop_SECNAME labels to sections whose names are valid as C identifiers. Given that we already use Solaris 11.4-only features like ld -z gnu-version-script-compat and fully working .preinit_array support in compiler-rt, we don't need to worry about older versions of Solaris ld.
  The patch documents that support (although the comment in lib/Transforms/Instrumentation/InstrProfiling.cpp (needsRuntimeRegistrationOfSectionRange) is quite cryptic as to what it is actually about), and adapts the affected testcase not to expect the alternative __llvm_profile_register_functions and __llvm_profile_init. It fixes all affected tests.
  Tested on amd64-pc-solaris2.11.
  Differential Revision: https://reviews.llvm.org/D41111
  llvm-svn: 363984
* AMDGPU: Add intrinsics for DS GWS semaphore instructions | Matt Arsenault | 2019-06-20 | 5 | -25/+72
  llvm-svn: 363983
* [LICM & MSSA] Limit unsafe sinking and hoisting. | Alina Sbirlea | 2019-06-20 | 1 | -10/+50
  Summary: The getClobberingMemoryAccess API checks for clobbering accesses in a loop by walking the backedge. This may check if a memory access is being clobbered by the loop in a previous iteration, depending how smart AA got over the course of the updates in MemorySSA (it does not occur when built from scratch). If no clobbering access is found inside the loop, it will optimize to an access outside the loop. This however does not mean that access is safe to sink. Given:
  ```
  for i
    load a[i]
    store a[i]
  ```
  The access corresponding to the load can be optimized to outside the loop, and the load can be hoisted. But it is incorrect to sink it. In order to sink the load, we'd need to check no Def clobbers the Use in the same iteration. With this patch we currently restrict sinking to either Defs not existing in the loop, or Defs preceding the load in the same block. An easy extension is to ensure the load (Use) post-dominates all Defs.
  Caught by PR42294.
  This issue also shed light on the converse problem: hoisting stores in this same scenario would be illegal. With this patch we restrict hoisting of stores to the case when their corresponding Defs are dominating all Uses in the loop.
  Reviewers: george.burgess.iv
  Subscribers: jlebar, Prazek, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D63582
  llvm-svn: 363982
* AMDGPU: Insert mem_viol check loop around GWS pre-GFX9 | Matt Arsenault | 2019-06-20 | 5 | -19/+129
  It is necessary to emit this loop around GWS operations in case the wave is preempted pre-GFX9.
  llvm-svn: 363979
* [InstCombine] fix typo in comment; NFC | Sanjay Patel | 2019-06-20 | 1 | -1/+1
  llvm-svn: 363974
* [clang][NewPM] Do not eliminate available_externally during `-O2 -flto` runs | Leonard Chan | 2019-06-20 | 1 | -2/+4
  This fixes CodeGen/available-externally-suppress.c when the new pass manager is turned on by default. available_externally was not emitted during -O2 -flto runs when it should still be retained for link time inlining purposes. This can be fixed by checking that we aren't LTOPrelinking when adding the EliminateAvailableExternallyPass.
  Differential Revision: https://reviews.llvm.org/D63580
  llvm-svn: 363971
* [LFTR] Fix a (latent?) bug related to nested loops | Philip Reames | 2019-06-20 | 1 | -0/+6
  I can't actually come up with a test case this triggers on without an out of tree change, but in theory, it's a bug in the recently added multiple exit LFTR support. The root issue is that an exiting block common to two loops can (in theory) have computable exit counts for both loops. Rewriting the exit of an inner loop in terms of the outer loop's IV would cause the inner loop to either a) run forever, or b) terminate on the first iteration.
  In practice, we appear to get lucky and not have the exit count computable for the outer loop, except when it's trivially zero. Given we bail on zero exit counts, we don't appear to ever trigger this. But I can't come up with a reason we *can't* compute an exit count for the outer loop on the common exiting block, so this may very well be triggering in some cases.
  llvm-svn: 363964
* [X86] Add BLSI to isUseDefConvertible. | Craig Topper | 2019-06-20 | 1 | -0/+4
  Summary: BLSI sets the C flag if the input is not zero. So if it's followed by a TEST of the input where only the Z flag is consumed, we can replace it with the opposite check of the C flag.
  We should be able to do the same for BLSMSK and BLSR, but the naive test case for those is being optimized to a subo by CodeGenPrepare.
  Reviewers: spatel, RKSimon
  Subscribers: hiraditya, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D63589
  llvm-svn: 363957
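The flag relationship the BLSI commit relies on can be illustrated with a small model. This is a sketch under stated assumptions, not the X86 backend code: `blsi` and `zf_after_test` are hypothetical helpers that model only the result and the two flags mentioned in the commit message, for 32-bit operands.

```python
BITS = 32
MASK = (1 << BITS) - 1


def blsi(x: int):
    """Model BLSI: isolate the lowest set bit; CF is set when the input is non-zero."""
    x &= MASK
    result = x & (-x & MASK)  # x & -x keeps only the lowest set bit
    carry = x != 0            # C flag as described in the commit message
    return result, carry


def zf_after_test(x: int) -> bool:
    """Model the Z flag produced by `TEST x, x`."""
    return (x & MASK) == 0


samples = [0, 1, 2, 12, 0x80000000, 0xFFFFFFFF]
# The Z flag of the TEST is always the inverse of the C flag left by BLSI.
flags_agree = all(zf_after_test(x) == (not blsi(x)[1]) for x in samples)
```

Since the two flags always carry opposite values, a consumer that only reads Z from the TEST can read the inverted C flag from BLSI instead, which is what makes the TEST redundant.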
* [InstCombine] canonicalize check for power-of-2 | Sanjay Patel | 2019-06-20 | 1 | -0/+20
  The form that compares against 0 is better because:
  1. It removes a use of the input value.
  2. It's the more standard form for this pattern: https://graphics.stanford.edu/~seander/bithacks.html#DetermineIfPowerOf2
  3. It results in equal or better codegen (tested with x86, AArch64, ARM, PowerPC, MIPS).
  This is a root cause for PR42314, but probably doesn't completely answer the codegen request: https://bugs.llvm.org/show_bug.cgi?id=42314
  Alive proof: https://rise4fun.com/Alive/9kG
    Name: is power-of-2
    %neg = sub i32 0, %x
    %a = and i32 %neg, %x
    %r = icmp eq i32 %a, %x
    =>
    %dec = add i32 %x, -1
    %a2 = and i32 %dec, %x
    %r = icmp eq i32 %a2, 0

    Name: is not power-of-2
    %neg = sub i32 0, %x
    %a = and i32 %neg, %x
    %r = icmp ne i32 %a, %x
    =>
    %dec = add i32 %x, -1
    %a2 = and i32 %dec, %x
    %r = icmp ne i32 %a2, 0
  llvm-svn: 363956
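The equivalence of the two forms in the commit above can be checked exhaustively at a small bit width. This is a minimal sketch that assumes an 8-bit integer type to keep the search space tiny; `old_form` and `new_form` are illustrative names, not LLVM functions.

```python
BITS = 8
MASK = (1 << BITS) - 1


def old_form(x: int) -> bool:
    """(-x & x) == x, the form before canonicalization."""
    neg = (-x) & MASK      # %neg = sub 0, %x
    return (neg & x) == x  # icmp eq (and %neg, %x), %x


def new_form(x: int) -> bool:
    """((x - 1) & x) == 0, the canonical form that compares against 0."""
    dec = (x - 1) & MASK   # %dec = add %x, -1
    return (dec & x) == 0  # icmp eq (and %dec, %x), 0


# Both predicates agree on every 8-bit value, so the 'ne' variants agree too.
equivalent = all(old_form(x) == new_form(x) for x in range(1 << BITS))
```

Note that both forms also return true for 0, so the pattern is really "power of 2 or zero", as the related InstSimplify commit names it.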
* [DAGCombiner] Use getAPIntValue() instead of getZExtValue() where possible. | Simon Pilgrim | 2019-06-20 | 1 | -21/+20
  Better handling of out-of-i64-range values due to large integer types or from fuzz tests.
  llvm-svn: 363955
* [DAGCombiner][NFC] Remove unused var | Jordan Rupprecht | 2019-06-20 | 1 | -1/+0
  llvm-svn: 363954
* Store a pointer to the return value in a static alloca and let the debugger use that as the variable address for NRVO variables. | Amy Huang | 2019-06-20 | 1 | -2/+12
  Subscribers: hiraditya, cfe-commits, llvm-commits
  Tags: #clang, #llvm
  Differential Revision: https://reviews.llvm.org/D63361
  llvm-svn: 363952
* [InstCombine] cttz(-x) -> cttz(x) | David Bolvansky | 2019-06-20 | 1 | -0/+6
  Summary: Signedness does not change number of trailing zeros.
  Reviewers: spatel, lebedev.ri, nikic
  Reviewed By: spatel
  Subscribers: llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D63534
  llvm-svn: 363951
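The claim behind the cttz fold is easy to confirm by brute force. The sketch below assumes an 8-bit type and defines its own `cttz` helper (returning the bit width for a zero input); it is an illustration, not the LLVM intrinsic itself.

```python
BITS = 8
MASK = (1 << BITS) - 1


def cttz(x: int) -> int:
    """Count trailing zeros of an 8-bit value; returns BITS for zero."""
    x &= MASK
    if x == 0:
        return BITS
    return (x & -x).bit_length() - 1  # index of the lowest set bit


def negate(x: int) -> int:
    """Two's-complement negation in BITS bits."""
    return (-x) & MASK


# Negation keeps the lowest set bit in place, so the count is unchanged.
same_count = all(cttz(negate(x)) == cttz(x) for x in range(1 << BITS))
```

Two's-complement negation inverts every bit above the lowest set bit and leaves that bit and the zeros below it untouched, which is why the trailing-zero count survives.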
* AMDGPU: Fix ignoring DisableFramePointerElim in leaf functions | Matt Arsenault | 2019-06-20 | 1 | -11/+7
  The attribute can specify elimination for leaf or non-leaf, so it should always be considered. I copied this bug from AArch64, which probably should also be fixed.
  llvm-svn: 363949
* [CodeGen] Fix formatting and comments (NFC) | Evandro Menezes | 2019-06-20 | 1 | -6/+8
  llvm-svn: 363947
* AMDGPU: Treat undef as an inline immediate | Matt Arsenault | 2019-06-20 | 2 | -5/+19
  This should only matter in vectors with an undef component, since a full undef vector would have been folded out.
  llvm-svn: 363941
* [ARM] Add a batch of MVE integer instructions. | Simon Tatham | 2019-06-20 | 3 | -1/+406
  This includes integer arithmetic of various kinds (add/sub/multiply, saturating and not), and the immediate forms of VMOV and VMVN that load an immediate into all lanes of a vector.
  Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover
  Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D62674
  llvm-svn: 363936
* [AMDGPU] gfx1010 core wave32 changes | Stanislav Mekhanoshin | 2019-06-20 | 10 | -40/+56
  Differential Revision: https://reviews.llvm.org/D63204
  llvm-svn: 363934
* [DAGCombiner] Support (shl (zext (srl x, C)), C) -> (zext (shl (srl x, C), C)) non-uniform folds. | Simon Pilgrim | 2019-06-20 | 1 | -17/+19
  Use matchBinaryPredicate instead of isConstOrConstSplat to let us handle non-uniform shift cases.
  llvm-svn: 363929
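The fold named in the commit title holds per lane, which is why non-uniform shift amounts are safe. The sketch below assumes an i8 source zero-extended to i16 and models vectors as Python lists; `shl_of_zext`, `zext_of_shl`, and `per_lane` are hypothetical helpers, not DAGCombiner APIs.

```python
NARROW_MASK = 0xFF   # assumed narrow type: i8
WIDE_MASK = 0xFFFF   # assumed wide type: i16


def shl_of_zext(x: int, c: int) -> int:
    """(shl (zext (srl x, C)), C): the left shift happens in the wide type."""
    return (((x & NARROW_MASK) >> c) << c) & WIDE_MASK


def zext_of_shl(x: int, c: int) -> int:
    """(zext (shl (srl x, C), C)): the left shift happens in the narrow type."""
    narrow = (((x & NARROW_MASK) >> c) << c) & NARROW_MASK
    return narrow  # zero-extension does not change a non-negative value


def per_lane(fn, xs, cs):
    """Apply a scalar fold lane by lane, each lane with its own shift amount."""
    return [fn(x, c) for x, c in zip(xs, cs)]


# Exhaustive scalar check, then a non-uniform vector example.
scalar_ok = all(shl_of_zext(x, c) == zext_of_shl(x, c)
                for x in range(256) for c in range(8))
lanes, shifts = [0xFF, 0x81, 0x10, 0x07], [0, 1, 4, 7]
vector_ok = per_lane(shl_of_zext, lanes, shifts) == per_lane(zext_of_shl, lanes, shifts)
```

Shifting right by C and then left by the same C can never set a bit above the narrow width, so doing the left shift before or after the zero-extension gives the same value in every lane.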
* [DAGCombine] Add TODOs for some combines that should support non-uniform vectors | Simon Pilgrim | 2019-06-20 | 1 | -0/+15
  We tend to only test for scalar/scalar consts when really we could support non-uniform vectors using ISD::matchUnaryPredicate/matchBinaryPredicate etc.
  llvm-svn: 363924
* [X86] LowerAVXExtend - handle ANY_EXTEND_VECTOR_INREG lowering as well. | Simon Pilgrim | 2019-06-20 | 1 | -6/+10
  llvm-svn: 363922
* [DAGCombine] Reduce scope of ShAmtVal variable. NFCI. | Simon Pilgrim | 2019-06-20 | 1 | -2/+1
  Fixes cppcheck warning.
  Use the more capable getAPIntValue() instead of getZExtValue() as well since I'm here.
  llvm-svn: 363921
* [MIPS GlobalISel] Select integer to floating point conversions | Petar Avramovic | 2019-06-20 | 3 | -2/+23
  Select G_SITOFP and G_UITOFP for MIPS32.
  Differential Revision: https://reviews.llvm.org/D63542
  llvm-svn: 363912
* [MIPS GlobalISel] Select floating point to integer conversions | Petar Avramovic | 2019-06-20 | 4 | -2/+53
  Select G_FPTOSI and G_FPTOUI for MIPS32.
  Differential Revision: https://reviews.llvm.org/D63541
  llvm-svn: 363911
* [X86] Remove memory instructions from isUseDefConvertible. | Craig Topper | 2019-06-20 | 1 | -15/+15
  The caller of this is looking for comparisons of the input to these instructions with 0. But the memory instructions' input is an address, not a value input in a register.
  llvm-svn: 363907
* [X86] Add v64i8/v32i16 to several places in X86CallingConv.td where they seemed obviously missing. | Craig Topper | 2019-06-20 | 1 | -3/+4
  llvm-svn: 363906
* AMDGPU: Don't clobber VCC in MUBUF addr64 emulation | Matt Arsenault | 2019-06-20 | 1 | -9/+16
  Introducing VCC defs during SIFixSGPRCopies is generally problematic. Avoid it by starting with the VOP3 form with the general condition register. This is the easiest-to-fix instance, but doesn't solve any specific problems I'm looking at.
  llvm-svn: 363904
* [llvm-objdump] Switch between ARM/Thumb based on mapping symbols. | Eli Friedman | 2019-06-20 | 1 | -29/+28
  The ARMDisassembler changes allow changing between ARM and Thumb mode based on the MCSubtargetInfo, rather than the Target, which simplifies the other changes a bit.
  I'm not really happy with adding more target-specific logic to tools/llvm-objdump/, but there isn't any easy way around it: the logic in question specifically applies to disassembling an object file, and that code simply isn't located in lib/Target, at least at the moment.
  Differential Revision: https://reviews.llvm.org/D60927
  llvm-svn: 363903
* AMDGPU: Consolidate some getGeneration checks | Matt Arsenault | 2019-06-19 | 9 | -31/+82
  This is incomplete, and ideally these would all be removed, but it's better to localize them to the subtarget first with comments about what they're for.
  llvm-svn: 363902
* [FileCheck] Stop qualifying expressions as numeric | Thomas Preud'homme | 2019-06-19 | 1 | -40/+39
  Summary: Stop referring to "numeric expression", using simply the term "expression" instead. Likewise for numeric operation since operations are only used in numeric expressions.
  Reviewers: jhenderson, jdenny, probinson, arichardson
  Subscribers: hiraditya, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D63500
  llvm-svn: 363901
* FileCheck: Return parse error w/ Error & Expected | Thomas Preud'homme | 2019-06-19 | 1 | -201/+214
  Summary: Make use of Error and Expected to bubble up diagnostics and force checking of errors in the callers.
  Reviewers: jhenderson, jdenny, probinson, arichardson
  Subscribers: hiraditya, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D63125
  llvm-svn: 363900
* AMDGPU: Undo sub x, c canonicalization for v2i16 | Matt Arsenault | 2019-06-19 | 3 | -26/+87
  Should avoid regression from D62341
  llvm-svn: 363899
* [DAGCombine] Use ConstantSDNode::getAPIntValue() instead of getZExtValue(). | Simon Pilgrim | 2019-06-19 | 1 | -2/+2
  Use getAPIntValue() in a few more places. Most of the time getZExtValue() is fine, but occasionally there's fuzzed code or someone decides to create i65536 or something...
  llvm-svn: 363887