summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [Attributor] Deduce memory behavior of functions and argumentsJohannes Doerfert2019-10-071-3/+480
| | | | | | | | | | | | | | | Deduce the memory behavior, aka "read-none", "read-only", or "write-only", for functions and arguments. Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67384 llvm-svn: 373965
* [InstCombine] Fold conditional sign-extend of high-bit-extract into ↵Roman Lebedev2019-10-071-0/+110
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | high-bit-extract-with-signext (PR42389) This can come up in Bit Stream abstractions. The pattern looks big/scary, but it can't be simplified any further. It only is so simple because a number of my preparatory folds had happened already (shift amount reassociation / shift amount reassociation in bit test, sign bit test detection). Highlights: * There are two main flavors: https://rise4fun.com/Alive/zWi The difference is add vs. sub, and left-shift of -1 vs. 1 * Since we only change the shift opcode, we can preserve the exact-ness: https://rise4fun.com/Alive/4u4 * There can be truncation after high-bit-extraction: https://rise4fun.com/Alive/slHc1 (the main pattern i'm after!) Which means that we need to ignore zext of shift amounts and of NBits. * The sign-extending magic can be extended itself (in add pattern via sext, in sub pattern via zext. not the other way around!) https://rise4fun.com/Alive/NhG (or those sext/zext can be sinked into `select`!) Which again means we should pay attention when matching NBits. * We can have both truncation of extraction and widening of magic: https://rise4fun.com/Alive/XTw In other words, i don't believe we need to have any checks on bitwidths of any of these constructs. This is worsened in general by the fact that we may have `sext` instead of `zext` for shift amounts, and we don't yet canonicalize to `zext`, although we should. I have not done anything about that here. Also, we really should have something to weed out `sub` like these, by folding them into `add` variant. https://bugs.llvm.org/show_bug.cgi?id=42389 llvm-svn: 373964
* [InstCombine] Move isSignBitCheck(), handle rest of the predicatesRoman Lebedev2019-10-072-28/+39
| | | | | | | | | True, no test coverage is being added here. But those non-canonical predicates that are already handled here already have no test coverage as far as i can tell. I tried to add tests for them, but all the patterns already get handled elsewhere. llvm-svn: 373962
* [InstCombine][NFC] dropRedundantMaskingOfLeftShiftInput(): change how we ↵Roman Lebedev2019-10-071-70/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | deal with mask Summary: Currently, we pre-check whether we need to produce a mask or not. This involves some rather magical constants. I'd like to extend this fold to also handle the situation when there's also a `trunc` before outer shift. That will require another set of magical constants. It's ugly. Instead, we can just compute the mask, and check whether mask is a pass-through (all-ones) or not. This way we don't need to have any magical numbers. This change is NFC other than the fact that we now compute the mask and then check if we need (and can!) apply it. Reviewers: spatel Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68470 llvm-svn: 373961
* [InstCombine] dropRedundantMaskingOfLeftShiftInput(): propagate undef shift ↵Roman Lebedev2019-10-071-0/+33
| | | | | | | | | | | | | | | | | | | | | | amounts Summary: When we do `ConstantExpr::getZExt()`, that "extends" `undef` to `0`, which means that for patterns a/b we'd assume that we must not produce any bits for that channel, while in reality we simply didn't care about that channel - i.e. we don't need to mask it. Reviewers: spatel Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68239 llvm-svn: 373960
* [Bitcode] Update naming of UNOP_NEG to UNOP_FNEGCameron McInally2019-10-072-2/+2
| | | | | | Differential Revision: https://reviews.llvm.org/D68588 llvm-svn: 373958
* AMDGPU/GlobalISel: Handle more G_INSERT casesMatt Arsenault2019-10-073-57/+55
| | | | | | | | | Start manually writing a table to get the subreg index. TableGen should probably generate this, but I'm not sure what it looks like in the arbitrary case where subregisters are allowed to not fully cover the super-registers. llvm-svn: 373947
* GlobalISel: Partially implement lower for G_INSERTMatt Arsenault2019-10-072-7/+44
| | | | llvm-svn: 373946
* AMDGPU/GlobalISel: Fix selection of 16-bit shiftsMatt Arsenault2019-10-071-3/+6
| | | | llvm-svn: 373945
* AMDGPU/GlobalISel: Select VALU G_AMDGPU_FFBH_U32Matt Arsenault2019-10-071-1/+1
| | | | llvm-svn: 373944
* AMDGPU/GlobalISel: Use S_MOV_B64 for inline constantsMatt Arsenault2019-10-071-20/+27
| | | | | | | This hides some defects in SIFoldOperands when the immediates are split. llvm-svn: 373943
* AMDGPU/GlobalISel: Widen 16-bit G_MERGE_VALUEs sourcesMatt Arsenault2019-10-071-18/+29
| | | | | | Continue making a mess of merge/unmerge legality. llvm-svn: 373942
* AMDGPU/GlobalISel: Select more G_INSERT casesMatt Arsenault2019-10-071-20/+78
| | | | | | | | | | At minimum handle the s64 insert type, which are emitted in real cases during legalization. We really need TableGen to emit something to emit something like the inverse of composeSubRegIndices do determine the subreg index to use. llvm-svn: 373938
* GlobalISel: Add target pre-isel instructionsMatt Arsenault2019-10-076-4/+18
| | | | | | | | | | | | | | Allows targets to introduce regbankselectable pseudo-instructions. Currently the closet feature to this is an intrinsic. However this requires creating a public intrinsic declaration. This litters the public intrinsic namespace with operations we don't necessarily want to expose to IR producers, and would rather leave as private to the backend. Use a new instruction bit. A previous attempt tried to keep using enum value ranges, but it turned into a mess. llvm-svn: 373937
* Second attempt to add iterator_range::empty()Jordan Rose2019-10-0718-32/+32
| | | | | | | | | | | | Doing this makes MSVC complain that `empty(someRange)` could refer to either C++17's std::empty or LLVM's llvm::empty, which previously we avoided via SFINAE because std::empty is defined in terms of an empty member rather than begin and end. So, switch callers over to the new method as it is added. https://reviews.llvm.org/D68439 llvm-svn: 373935
* Fix Calling Convention through aliasesErich Keane2019-10-071-0/+8
| | | | | | | | | | | | | | r369697 changed the behavior of stripPointerCasts to no longer include aliases. However, the code in CGDeclCXX.cpp's createAtExitStub counted on the looking through aliases to properly set the calling convention of a call. The result of the change was that the calling convention mismatch of the call would be replaced with a llvm.trap, causing a runtime crash. Differential Revision: https://reviews.llvm.org/D68584 llvm-svn: 373929
* [Remarks] Pass StringBlockValue as StringRef.Florian Hahn2019-10-071-1/+1
| | | | | | | | | | | | | | | After changing the remark serialization, we now pass StringRefs to the serializer. We should use StringRef for StringBlockVal, to avoid creating temporary objects, which then cause StringBlockVal.Value to point to invalid memory. Reviewers: thegameg, anemet Reviewed By: thegameg Differential Revision: https://reviews.llvm.org/D68571 llvm-svn: 373923
* Fix build errors caused by rL373914.Wei Mi2019-10-071-1/+2
| | | | llvm-svn: 373919
* [llvm-profdata] Minor format fixWenlei He2019-10-071-0/+1
| | | | | | | | | | | | | | Summary: Minor format fix for output of "llvm-profdata -show" Reviewers: wmi Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68440 llvm-svn: 373917
* [X86][SSE] getTargetShuffleInputs - move VT.isSimple/isVector checks inside. ↵Simon Pilgrim2019-10-071-4/+11
| | | | | | | | NFCI. Stop all the callers from having to check the value type before calling getTargetShuffleInputs. llvm-svn: 373915
* [SampleFDO] Add compression support for any section in ExtBinary profile formatWei Mi2019-10-073-83/+158
| | | | | | | | | | | | | Previously ExtBinary profile format only supports compression using zlib for profile symbol list. In this patch, we extend the compression support to any section. User can select some or all of the sections to compress. In an experiment, for a 45M profile in ExtBinary format, compressing name table reduced its size to 24M, and compressing all the sections reduced its size to 11M. Differential Revision: https://reviews.llvm.org/D68253 llvm-svn: 373914
* [Mips] Always save RA when disabling frame pointer eliminationSimon Atanasyan2019-10-071-2/+5
| | | | | | | | | | | | This ensures that frame-based unwinding will continue to work when calling a noreturn function; there is not much use having the caller's frame pointer saved if you don't also have the caller's program counter. Patch by James Clarke. Differential Revision: https://reviews.llvm.org/D68542 llvm-svn: 373907
* [Mips] Fix evaluating J-format branch targetsSimon Atanasyan2019-10-071-4/+7
| | | | | | | | | | | | J/JAL/JALX/JALS are absolute branches, but stay within the current 256 MB-aligned region, so we must include the high bits of the instruction address when calculating the branch target. Patch by James Clarke. Differential Revision: https://reviews.llvm.org/D68548 llvm-svn: 373906
* [LLVM-C] Add bindings to create macro debug infowhitequark2019-10-071-0/+20
| | | | | | | | | | | | | | | | Summary: The C API doesn't have the bindings to create macro debug information. Reviewers: whitequark, CodaFi, deadalnix Reviewed By: whitequark Subscribers: aprantl, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58334 llvm-svn: 373903
* Test commitMirko Brkusanin2019-10-071-1/+1
| | | | | | Fix comment. llvm-svn: 373901
* [FPEnv] Add constrained intrinsics for lrint and lroundKevin P. Neal2019-10-077-28/+121
| | | | | | | | | | | Earlier in the year intrinsics for lrint, llrint, lround and llround were added to llvm. The constrained versions are now implemented here. Reviewed by: andrew.w.kaylor, craig.topper, cameron.mcinally Approved by: craig.topper Differential Revision: https://reviews.llvm.org/D64746 llvm-svn: 373900
* Revert r373888 "[IA] Recognize hexadecimal escape sequences"Nico Weber2019-10-071-16/+1
| | | | | | | | | It broke MC/AsmParser/directive_ascii.s on all bots: Assertion failed: (Index < Length && "Invalid index!"), function operator[], file ../../llvm/include/llvm/ADT/StringRef.h, line 243. llvm-svn: 373898
* [IA] Recognize hexadecimal escape sequencesBill Wendling2019-10-071-1/+16
| | | | | | | | | | | | | | | | | Summary: Implement support for hexadecimal escape sequences to match how GNU 'as' handles them. I.e., read all hexadecimal characters and truncate to the lower 16 bits. Reviewers: nickdesaulniers Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68483 llvm-svn: 373888
* Revert "[SLP] avoid reduction transform on patterns that the backend can ↵Martin Storsjo2019-10-072-65/+3
| | | | | | | | | | load-combine" This reverts SVN r373833, as it caused a failed assert "Non-zero loop cost expected" on building numerous projects, see PR43582 for details and reproduction samples. llvm-svn: 373882
* [X86] Support LEA64_32r in processInstrForSlow3OpLEA and use INC/DEC when ↵Craig Topper2019-10-071-80/+110
| | | | | | | | | | | | | | possible. Move the erasing and iterator updating inside to match the other slow LEA function. I've adapted code from optTwoAddrLEA and basically rebuilt the implementation here. We do lose the kill flags now just like optTwoAddrLEA. This runs late enough in the pipeline that shouldn't really be a problem. llvm-svn: 373877
* [X86][AVX] Access a scalar float/double as a free extract from a broadcast ↵Simon Pilgrim2019-10-062-11/+29
| | | | | | | | | | | | | | | | load (PR43217) If a fp scalar is loaded and then used as both a scalar and a vector broadcast, perform the load as a broadcast and then extract the scalar for 'free' from the 0th element. This involved switching the order of the X86ISD::BROADCAST combines so we only convert to X86ISD::BROADCAST_LOAD once all other canonicalizations have been attempted. Adds a DAGCombinerInfo::recursivelyDeleteUnusedNodes wrapper. Fixes PR43217 Differential Revision: https://reviews.llvm.org/D68544 llvm-svn: 373871
* Fix signed/unsigned warning. NFCISimon Pilgrim2019-10-061-1/+1
| | | | llvm-svn: 373870
* [NFC][PowerPC] Reorganize CRNotPat multiclass patterns in PPCInstrInfo.tdAmy Kwan2019-10-061-84/+91
| | | | | | | | | | | | This is patch aims to group together the `CRNotPat` multi class instantiations within the `PPCInstrInfo.td` file. Integer instantiations of the multi class are grouped together into a section, and the floating point patterns are separated into its own section. Differential Revision: https://reviews.llvm.org/D67975 llvm-svn: 373869
* [X86][SSE] Remove resolveTargetShuffleInputs and use getTargetShuffleInputs ↵Simon Pilgrim2019-10-061-42/+22
| | | | | | | | directly. Move the resolveTargetShuffleInputsAndMask call to after the shuffle mask combine before the undef/zero constant fold instead. llvm-svn: 373868
* [X86][SSE] Don't merge known undef/zero elements into target shuffle masks.Simon Pilgrim2019-10-061-30/+50
| | | | | | | | Replaces setTargetShuffleZeroElements with getTargetShuffleAndZeroables which reports the Zeroable elements but doesn't merge them into the decoded target shuffle mask (the merging has been moved up into getTargetShuffleInputs until we can get rid of it entirely). This is part of the work to fix PR43024 and allow us to use SimplifyDemandedElts to simplify shuffle chains - we need to get to a point where the target shuffle mask isn't adjusted by its source inputs but instead we cache them in a parallel Zeroable mask. llvm-svn: 373867
* [X86] Add custom type legalization for v16i64->v16i8 truncate and ↵Craig Topper2019-10-061-3/+23
| | | | | | | | | | | | | | | | | | | | | v8i64->v8i8 truncate when v8i64 isn't legal Summary: The default legalization for v16i64->v16i8 tries to create a multiple stage truncate concatenating after each stage and truncating again. But avx512 implements truncates with multiple uops. So it should be better to truncate all the way to the desired element size and then concatenate the pieces using unpckl instructions. This minimizes the number of 2 uop truncates. The unpcks are all single uop instructions. I tried to handle this by just custom splitting the v16i64->v16i8 shuffle. And hoped that the DAG combiner would leave the two halves in the state needed to make D68374 do the job for each half. This worked for the first half, but the second half got messed up. So I've implemented custom handling for v8i64->v8i8 when v8i64 needs to be split to produce the VTRUNCs directly. Reviewers: RKSimon, spatel Reviewed By: RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68428 llvm-svn: 373864
* [LegalizeTypes][X86] When splitting a vselect for type legalization, don't ↵Craig Topper2019-10-061-3/+12
| | | | | | | | | | | | | | | | | | split a setcc condition if the setcc input is legal and vXi1 conditions are supported Summary: The VSELECT splitting code tries to split a setcc input as well. But on avx512 where mask registers are well supported it should be better to just split the mask and use a single compare. Reviewers: RKSimon, spatel, efriedma Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68359 llvm-svn: 373863
* [LOOPGUARD] Remove asserts in getLoopGuardBranchWhitney Tsang2019-10-061-3/+9
| | | | | | | | | | | | | Summary: The assertion in getLoopGuardBranch can be a 'return nullptr' under if condition. Authored By: DTharun Reviewer: Whitney, fhahn Reviewed By: Whitney, fhahn Subscribers: fhahn, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D66084 llvm-svn: 373857
* [X86][SSE] resolveTargetShuffleInputs - call getTargetShuffleInputs instead ↵Simon Pilgrim2019-10-061-5/+4
| | | | | | of using setTargetShuffleZeroElements directly. NFCI. llvm-svn: 373855
* Revert [DAGCombine] Match more patterns for half word bswapSanjay Patel2019-10-061-29/+29
| | | | | | | | This reverts r373850 (git commit 25ba49824d2d4f2347b4a7cb1623600a76ce9433) This patch appears to cause multiple codegen regression test failures - http://lab.llvm.org:8011/builders/clang-cmake-armv7-quick/builds/10680 llvm-svn: 373853
* [NFC] Replace 'isDarwin' with 'IsDarwin'Xiangling Liao2019-10-067-38/+38
| | | | | | | | Summary: Replace 'isDarwin' with 'IsDarwin' based on LLVM naming convention. Differential Revision: https://reviews.llvm.org/D68336 llvm-svn: 373852
* [InstCombine] fold fneg disguised as select+fmul (PR43497)Sanjay Patel2019-10-061-18/+49
| | | | | | | Extends rL373230 and solves the motivating bug (although in a narrow way): https://bugs.llvm.org/show_bug.cgi?id=43497 llvm-svn: 373851
* [DAGCombine] Match more patterns for half word bswapAmaury Sechet2019-10-061-29/+29
| | | | | | | | | | | | | | Summary: It ensures that the bswap is generated even when a part of the subtree already matches a bswap transform. Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68250 llvm-svn: 373850
* [X86][AVX] combineExtractSubvector - merge duplicate variables. NFCI.Simon Pilgrim2019-10-061-18/+17
| | | | llvm-svn: 373849
* [InstCombine] don't assume 'inbounds' for bitcast pointer to GEP transform ↵Sanjay Patel2019-10-061-2/+9
| | | | | | | | | | | | (PR43501) https://bugs.llvm.org/show_bug.cgi?id=43501 We can't declare a GEP 'inbounds' in general. But we may salvage that information if we have known dereferenceable bytes on the source pointer. Differential Revision: https://reviews.llvm.org/D68244 llvm-svn: 373847
* [X86][SSE] matchVectorShuffleAsBlend - use Zeroable element mask directly.Simon Pilgrim2019-10-061-34/+13
| | | | | | | | | | We can make use of the Zeroable mask to indicate which elements we can safely set to zero instead of creating a target shuffle mask on the fly. This allows us to remove createTargetShuffleMask. This is part of the work to fix PR43024 and allow us to use SimplifyDemandedElts to simplify shuffle chains - we need to get to a point where the target shuffle masks isn't adjusted by its source inputs in setTargetShuffleZeroElements but instead we cache them in a parallel Zeroable mask. llvm-svn: 373846
* [X86] Enable AVX512BW for memcmp()David Zarzycki2019-10-061-2/+7
| | | | llvm-svn: 373845
* AMDGPU/GlobalISel: Fall back on weird G_EXTRACT offsetsMatt Arsenault2019-10-061-2/+5
| | | | llvm-svn: 373842
* AMDGPU/GlobalISel: RegBankSelect mul24 intrinsicsMatt Arsenault2019-10-061-0/+2
| | | | llvm-svn: 373841
* AMDGPU/GlobalISel: RegBankSelect DS GWS intrinsicsMatt Arsenault2019-10-061-0/+35
| | | | llvm-svn: 373840
OpenPOWER on IntegriCloud