path: root/llvm/lib
Commit message | Author | Age | Files | Lines
...
* [CodeGen] Add missing vector type legalization for ctlz_zero_undef | Roland Froese | 2019-06-24 | 1 | -0/+2
  Widen vector result type for ctlz_zero_undef and cttz_zero_undef the same as ctlz and cttz.
  Differential Revision: https://reviews.llvm.org/D63463
  llvm-svn: 364221
* [SLP] Support unary FNeg vectorization | Cameron McInally | 2019-06-24 | 1 | -2/+30
  Differential Revision: https://reviews.llvm.org/D63609
  llvm-svn: 364219
* AMDGPU/GlobalISel: Select G_TRUNC | Matt Arsenault | 2019-06-24 | 4 | -24/+115
  llvm-svn: 364215
* AMDGPU/GlobalISel: RegBankSelect for amdgcn.class | Matt Arsenault | 2019-06-24 | 1 | -0/+9
  llvm-svn: 364214
* AMDGPU/GlobalISel: Split VALU s64 G_ZEXT/G_SEXT in RegBankSelect | Matt Arsenault | 2019-06-24 | 1 | -13/+57
  Scalar extends to s64 can use S_BFE_{I64|U64}, but vector extends need to extend to the 32-bit half, and then to 64. I'm not sure what the line should be between what RegBankSelect handles and what instruction select does, but for now I'm erring on the side of RegBankSelect for future post-RBS combines.
  llvm-svn: 364212
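A minimal C++ sketch of the semantics of that split (illustrative only, not the RegBankSelect code itself): the 64-bit result is assembled from a 32-bit low half and a derived high half.

```cpp
#include <cstdint>

// sext to s64 via 32-bit halves: the high half replicates the sign bit.
uint64_t sext32to64_via_halves(int32_t lo32) {
  uint32_t lo = static_cast<uint32_t>(lo32);
  uint32_t hi = static_cast<uint32_t>(lo32 >> 31); // all-ones or all-zeros
  return (static_cast<uint64_t>(hi) << 32) | lo;
}

// zext is simpler still: the high half is just zero.
uint64_t zext32to64_via_halves(uint32_t lo) {
  return static_cast<uint64_t>(lo); // high 32 bits are 0
}
```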
* [AMDGPU] Allow any value in unused src0 field in v_nop | Tim Renouf | 2019-06-24 | 1 | -1/+1
  Summary: The LLVM disassembler assumes that the unused src0 operand of v_nop is zero. Other tools can put another value in that field, which is still valid. This commit fixes the LLVM disassembler to recognize such an encoding as v_nop, in the same way as we already do for s_getpc.
  Differential Revision: https://reviews.llvm.org/D63724
  Change-Id: Iaf0363eae26ff92fc4ebc716216476adbff37a6f
  llvm-svn: 364208
* [X86] Don't emit a vzext_movl in LowerBuildVectorv16i8/LowerBuildVectorv8i16 if there are no zeroes in the vector we're building | Craig Topper | 2019-06-24 | 1 | -3/+3
  In LowerBuildVectorv16i8 we took care to use an any_extend if the first pair is in the lower 16 bits of the vector and no elements are 0, so bits [31:16] will be undefined. But we still emitted a vzext_movl to ensure that bits [127:32] are 0. If we don't need any zeroes we should be consistent and make all of [127:16] undefined.
  In LowerBuildVectorv8i16 we can just delete the vzext_movl code because we only use the scalar_to_vector when there are no zeroes. So the vzext_movl is always unnecessary.
  Found while investigating whether (vzext_movl (scalar_to_vector (loadi32))) patterns are necessary. At least one of the cases where they were necessary was where the loadi32 matched a 32-bit aligned 16-bit extload. It seemed weird that we required vzext_movl for that case.
  Differential Revision: https://reviews.llvm.org/D63700
  llvm-svn: 364207
* [X86] Cleanups and safety checks around the isFNEG function | Craig Topper | 2019-06-24 | 1 | -11/+20
  This patch does a few things to start cleaning up the isFNEG function.
  - Remove the Op0/Op1 peekThroughBitcast calls that seem unnecessary. getTargetConstantBitsFromNode has its own peekThroughBitcast inside, and we have a separate peekThroughBitcast on the return value.
  - Add a check of the scalar size after the first peekThroughBitcast to ensure we haven't changed the element size and just done something like f32->i32 or f64->i64.
  - Remove an unnecessary check that Op1's type is floating point after the peekThroughBitcast. We're just going to look for a bit pattern from a constant; we don't care about its type.
  - Add VT checks on several places that consume the return value of isFNEG. Due to the peekThroughBitcasts inside, the type of the return value isn't guaranteed, so it's not safe to use it to build other nodes without ensuring the type matches the type being used to build the node. We might be able to replace these checks with bitcasts instead, but I don't have a test case, so a bail-out check seemed better for now.
  Differential Revision: https://reviews.llvm.org/D63683
  llvm-svn: 364206
* AMDGPU/GlobalISel: Fix selecting G_IMPLICIT_DEF for s1 | Matt Arsenault | 2019-06-24 | 2 | -9/+27
  Try to fail for scc, since I don't think that should ever be produced.
  llvm-svn: 364199
* Hexagon: Rename another copy of Register class | Matt Arsenault | 2019-06-24 | 1 | -87/+90
  For some reason clang is happy with the conflict, but MSVC is not.
  llvm-svn: 364196
* ARC: Fix -Wimplicit-fallthrough | Matt Arsenault | 2019-06-24 | 1 | -0/+4
  llvm-svn: 364195
* GlobalISel: Remove unsigned variant of SrcOp | Matt Arsenault | 2019-06-24 | 20 | -441/+442
  Force using Register. One downside is that the generated register enums require explicit conversion.
  llvm-svn: 364194
* CodeGen: Introduce a class for registers | Matt Arsenault | 2019-06-24 | 80 | -412/+414
  Avoids using a plain unsigned for registers throughout codegen. This doesn't attempt to change every register use, just something a little more than the set needed to build after changing the return type of MachineOperand::getReg().
  llvm-svn: 364191
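A rough sketch of the shape of such a class (the real llvm::Register has more to it; this just illustrates the idea of wrapping a raw `unsigned` while keeping call sites compiling):

```cpp
// A thin value type standing in for a raw `unsigned` register number.
// Implicit conversions let existing code migrate gradually.
class Register {
  unsigned Reg = 0;

public:
  constexpr Register(unsigned R = 0) : Reg(R) {}
  constexpr operator unsigned() const { return Reg; } // back-compat escape hatch
  constexpr bool isValid() const { return Reg != 0; }
};

// Call sites can then write:
//   Register R = MO.getReg();   // instead of: unsigned R = MO.getReg();
```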
* [AMDGPU] Remove unused variable AllSGPRSpilledToVGPRs. NFC | Bjorn Pettersson | 2019-06-24 | 1 | -5/+1
  Summary: Removing the unused variable AllSGPRSpilledToVGPRs in SIFrameLowering::processFunctionBeforeFrameFinalized to avoid "error: variable 'AllSGPRSpilledToVGPRs' set but not used [-Werror=unused-but-set-variable]".
  Reviewers: arsenm, nhaehnle
  Reviewed By: nhaehnle
  Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D63721
  llvm-svn: 364190
* Hexagon: Rename Register class | Matt Arsenault | 2019-06-24 | 1 | -32/+33
  This avoids a naming conflict in a future patch.
  llvm-svn: 364188
* [InstCombine] reduce funnel-shift i16 X, X, 8 to bswap X | Sanjay Patel | 2019-06-24 | 1 | -0/+7
  Prefer the more exact intrinsic to remove a use of the input value and possibly make further transforms easier (we will still need to match patterns with funnel-shift of wider types as pieces of bswap, especially if we want to canonicalize to funnel-shift with constant shift amount). Discussed in D46760.
  llvm-svn: 364187
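Why the fold is sound, as a hedged C++ sketch: a funnel shift of a value with itself is a rotate, and rotating an i16 by 8 swaps its two bytes.

```cpp
#include <cstdint>

// fshl(x, x, 8) on a 16-bit value is a rotate-left by 8...
uint16_t rot16by8(uint16_t x) {
  return static_cast<uint16_t>((x << 8) | (x >> 8));
}

// ...which produces exactly the same result as bswap.
uint16_t bswap16(uint16_t x) { return __builtin_bswap16(x); }
```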
* AMDGPU/GlobalISel: Fix RegBankSelect for s1 sext/zext/anyext | Matt Arsenault | 2019-06-24 | 1 | -10/+76
  This needs different handling depending on whether or not the source is known to be a valid condition. Handle turning it into shifts or a select during regbankselect.
  llvm-svn: 364186
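For illustration, the two lowering shapes mentioned (shifts vs. a select), sketched in C++ for sext of an i1 held in bit 0 of a 32-bit register; the function names and the bit-0 convention are my assumptions:

```cpp
#include <cstdint>

// If bits 31:1 of the source may hold junk, shift the flag bit to the
// top and arithmetic-shift it back down to smear it across the register.
int32_t sext_i1_shifts(uint32_t v) {
  return static_cast<int32_t>(v << 31) >> 31; // -1 if bit 0 set, else 0
}

// If the source is already a valid condition, a select does the job.
int32_t sext_i1_select(bool cond) { return cond ? -1 : 0; }
```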
* AMDGPU: Fold frame index into MUBUF | Matt Arsenault | 2019-06-24 | 2 | -10/+49
  This matters for byval uses outside of the entry block, which appear as copies. Previously, the only folding done was during selection, which could not see the underlying frame index. For any uses outside the entry block, the frame index was materialized in the entry block relative to the global scratch wave offset. This may produce worse code in cases where the offset ends up not fitting in the MUBUF offset field. A better heuristic would be helpful for extreme frames.
  llvm-svn: 364185
* AMDGPU: Clean up checking when spills need emergency slots | Matt Arsenault | 2019-06-24 | 1 | -7/+6
  Address a FIXME, which should no longer be a problem since r363757.
  llvm-svn: 364182
* [InstCombine] SliceUpIllegalIntegerPHI - bail on out-of-range shifts | Simon Pilgrim | 2019-06-24 | 1 | -0/+5
  trunc(lshr) handling: if the shift is out of range (undefined), then bail as we do for non-constant shifts.
  Fixes OSS-Fuzz #15217.
  llvm-svn: 364181
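A small C++ illustration of the hazard (my example, not the fuzz case): shifting by the full bit width or more is undefined, so a folder must refuse rather than invent a value.

```cpp
#include <cstdint>
#include <optional>

// Fold trunc(lshr(x, amt)) only when the shift amount is in range;
// otherwise bail, exactly as for a non-constant shift amount.
std::optional<uint16_t> foldTruncLshr(uint32_t x, uint32_t amt) {
  if (amt >= 32)
    return std::nullopt; // out of range: undefined, don't fold
  return static_cast<uint16_t>(x >> amt);
}
```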
* [DAGCombine] visitMUL - allow shift by zero in MulByConstant | Simon Pilgrim | 2019-06-24 | 1 | -6/+6
  This can occur under certain circumstances when undefs are created later on in the constant multipliers (e.g. in this case due to SimplifyDemandedVectorElts). It's better to let the shift by zero occur and perform any cleanup afterward.
  Fixes OSS-Fuzz #15429.
  llvm-svn: 364179
* [ConstantFolding] Use hasVectorInstrinsicScalarOpd. NFC | Bjorn Pettersson | 2019-06-24 | 1 | -16/+13
  Summary: Use the hasVectorInstrinsicScalarOpd helper function in ConstantFoldVectorCall.
  Reviewers: rengolin, RKSimon, dblaikie
  Reviewed By: rengolin, RKSimon
  Subscribers: tschuett, hiraditya, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D63705
  llvm-svn: 364178
* [Scalarizer] Add scalarizer support for smul.fix.sat | Bjorn Pettersson | 2019-06-24 | 1 | -4/+6
  Summary: Handle smul.fix.sat in the scalarizer. This is done by adding smul.fix.sat to the set of "isTriviallyVectorizable" intrinsics. The addition of smul.fix.sat in isTriviallyVectorizable and hasVectorInstrinsicScalarOpd can also be seen as a preparation to be able to use hasVectorInstrinsicScalarOpd in ConstantFolding.
  Reviewers: rengolin, RKSimon, dblaikie
  Reviewed By: rengolin
  Subscribers: hiraditya, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D63704
  llvm-svn: 364177
* [ARM] Add MVE interleaving load/store family | Simon Tatham | 2019-06-24 | 7 | -33/+272
  This adds the family of loads and stores with names like VLD20.8 and VST42.32, which load and store parts of multiple q-registers in such a way that executing both VLD20 and VLD21, or all four of VLD40..VLD43, will distribute 2 or 4 vectors' worth of memory data across the lanes of the same number of registers but in a transposed order.
  In addition to the Tablegen descriptions of the instructions themselves, this patch also adds encode and decode support for the QQPR and QQQQPR register classes (representing the range of loaded or stored vector registers), and tweaks to the parsing system for lists of vector registers to make it return the right format in this case (since, unlike NEON, MVE regards q-registers as primitive, and not just an alias for two d-registers).
  llvm-svn: 364172
* [Support] Fix error handling in DataExtractor::get[US]LEB128 | Pavel Labath | 2019-06-24 | 1 | -14/+14
  Summary: These functions are documented as not modifying the offset argument if the extraction fails (just like other DataExtractor functions). However, while reviewing D63591 we discovered that this is not the case -- if the function reaches the end of the data buffer, it will just return the value parsed until that point and set the offset to point to the end of the buffer. This fixes the functions to act as advertised, and adds a regression test.
  Reviewers: dblaikie, probinson, bkramer
  Subscribers: kristina, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D63645
  llvm-svn: 364169
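A hedged sketch of the documented contract (simplified; not the DataExtractor implementation): the offset is only advanced once a complete value has been decoded.

```cpp
#include <cstddef>
#include <cstdint>

// Decode a ULEB128 value, committing *offset only on success so that a
// truncated buffer leaves the caller's offset untouched.
bool getULEB128(const uint8_t *data, size_t size, size_t *offset,
                uint64_t *result) {
  uint64_t value = 0;
  unsigned shift = 0;
  for (size_t pos = *offset; pos < size && shift < 64; ++pos, shift += 7) {
    uint8_t byte = data[pos];
    value |= static_cast<uint64_t>(byte & 0x7f) << shift;
    if ((byte & 0x80) == 0) { // high bit clear: this was the last byte
      *offset = pos + 1;      // commit the new offset only now
      *result = value;
      return true;
    }
  }
  return false; // truncated or over-long encoding: *offset unchanged
}
```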
* [X86] Turn v16i16->v16i8 truncate+store into an any_extend+truncstore if we have avx512f, but not avx512bw | Craig Topper | 2019-06-23 | 2 | -3/+14
  Ideally we'd be able to represent this truncate as an any_extend to v16i32 and a truncate, but SelectionDAG doesn't know how to not fold those together. We have isel patterns to use vpmovzxwd+vpmovdb for the truncate, but we aren't able to simultaneously fold the load and the store from the isel pattern. By pulling the truncate into the store we can successfully hide it from the DAG combiner. Then we can isel pattern match the truncstore and load+any_extend separately.
  llvm-svn: 364163
* Fix typo in comment; NFC | Sanjoy Das | 2019-06-23 | 1 | -1/+1
  llvm-svn: 364159
* [X86] Fix isel pattern that was looking for a bitcasted load. Remove what appears to be a copy/paste mistake | Craig Topper | 2019-06-23 | 1 | -13/+1
  DAG combine should ensure bitcasts of loads don't exist. Also remove 3 patterns that are identical to the block above them.
  llvm-svn: 364158
* [IndVars] Remove dead instructions after folding trivial loop exit | Philip Reames | 2019-06-23 | 1 | -3/+5
  In rL364135, I taught IndVars to fold exiting branches in loops with a zero backedge-taken count (i.e. loops that only run one iteration). This extends that to eliminate the dead comparison left around.
  llvm-svn: 364155
* SlotIndexes: delete unused functions | Fangrui Song | 2019-06-23 | 1 | -15/+0
  llvm-svn: 364154
* [InstCombine] squash is-power-of-2 that uses ctpop | Sanjay Patel | 2019-06-23 | 1 | -0/+23
  This is another intermediate IR step towards solving PR42314: https://bugs.llvm.org/show_bug.cgi?id=42314
  We can test if a value is power-of-2-or-0 using ctpop(X) < 2, so combining that with a non-zero check of the input is the same as testing if exactly 1 bit is set: (X != 0) && (ctpop(X) u< 2) --> ctpop(X) == 1
  Differential Revision: https://reviews.llvm.org/D63660
  llvm-svn: 364153
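The equivalence, as a C++ sketch using the popcount builtin (my example, with the same predicate shapes as above):

```cpp
#include <cstdint>

// Before: non-zero AND fewer than two bits set...
bool isPowerOf2_before(uint32_t x) {
  return x != 0 && __builtin_popcount(x) < 2;
}

// ...after: exactly one bit set. Same predicate, but only one use of x.
bool isPowerOf2_after(uint32_t x) {
  return __builtin_popcount(x) == 1;
}
```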
* SlotIndexes: simplify IdxMBBPair operators | Fangrui Song | 2019-06-23 | 1 | -1/+1
  llvm-svn: 364152
* [SelectionDAG] Remove the code that attempts to calculate the alignment for the second half of a split masked load/store | Craig Topper | 2019-06-23 | 2 | -27/+4
  The code divides the alignment by 2 if the original alignment is equal to the original VT size. But this wouldn't be correct if the alignment were larger than the VT size. The memory operand object already takes care of calling MinAlign on the base alignment and the memory pointer offset, so we don't need any special code at all.
  llvm-svn: 364151
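For reference, a sketch of the computation the memory operand already performs; this mirrors what llvm::MinAlign computes, shown here as a standalone function for illustration:

```cpp
#include <cstdint>

// Alignment of (base + offset) given the base alignment: the lowest set
// bit of (align | offset). E.g. base align 64, offset 32 -> align 32;
// base align 16, offset 32 -> still 16.
uint64_t minAlign(uint64_t align, uint64_t offset) {
  uint64_t v = align | offset;
  return v & (~v + 1); // isolate the lowest set bit
}
```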
* [X86][SelectionDAG] Cleanup and simplify masked_load/masked_store in tablegen. Use more precise PatFrags for scalar masked load/store | Craig Topper | 2019-06-23 | 3 | -59/+35
  Rename masked_load/masked_store to masked_ld/masked_st to discourage their direct use: we need to check truncating/extending and compressing/expanding before using them. This revealed that our scalar masked load/store patterns were misusing these.
  With those out of the way, rename masked_load_unaligned and masked_store_unaligned to remove the "_unaligned"; we didn't check the alignment anyway, so the name was somewhat misleading.
  Make the aligned versions inherit from masked_load/store instead of from a separate identical version. Merge the 3 different alignment PatFrags into a single version that uses the VT from the SDNode to determine the size that the alignment needs to match.
  llvm-svn: 364150
* [Support] Fix build under Emscripten | Keno Fischer | 2019-06-23 | 1 | -0/+3
  Summary: Emscripten's libc doesn't define MNT_LOCAL, thus causing a build failure in the fallback path. However, to the best of my knowledge, it also doesn't support remote file system mounts, so we may simply return `true` here (as we do for e.g. Fuchsia). With this fix, the core LLVM libraries build correctly under Emscripten (though some of the tools and utils do not).
  Reviewers: kripken
  Differential Revision: https://reviews.llvm.org/D63688
  llvm-svn: 364143
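A hedged sketch of the kind of guard this describes (function name and structure are illustrative, not the actual Support code):

```cpp
// When the platform's libc has no MNT_LOCAL (Emscripten), there are no
// remote file system mounts to detect, so report every path as local.
bool isLocalFilesystem(const char *path) {
#if defined(__EMSCRIPTEN__)
  (void)path;
  return true; // no remote mounts under Emscripten
#else
  // ... query statfs()/MNT_LOCAL on platforms that support it ...
  (void)path;
  return true; // placeholder for the real query
#endif
}
```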
* Revert "[CommandLine] Remove OptionCategory and SubCommand caches from the Option class." | Don Hinton | 2019-06-22 | 1 | -49/+58
  This reverts r364134 (git commit a5b83bc9e3b8e8945b55068c762bd6c73621a4b0). It caused errors in the asan bot, so the GeneralCategory global needs to be changed to ManagedStatic.
  Differential Revision: https://reviews.llvm.org/D62105
  llvm-svn: 364141
* [X86][SSE] Fold extract_subvector(vselect(x,y,z),0) -> vselect(extract_subvector(x,0),extract_subvector(y,0),extract_subvector(z,0)) | Simon Pilgrim | 2019-06-22 | 1 | -0/+10
  llvm-svn: 364136
* Exploit a zero LoopExit count to eliminate loop exits | Philip Reames | 2019-06-22 | 1 | -2/+14
  This turned out to be surprisingly effective. I was originally doing this just for completeness' sake, but it seems like there are a lot of cases where SCEV's exit count reasoning is stronger than its isKnownPredicate reasoning. Once this is in, I'm thinking about trying to build on the same infrastructure to eliminate provably untaken checks. There may be something generally interesting here.
  Differential Revision: https://reviews.llvm.org/D63618
  llvm-svn: 364135
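An illustrative C++ shape (my example) of what a zero exit count means: SCEV proves the backedge is never taken, so the exiting branch can be folded and, per the follow-up commit above, the dead comparison removed.

```cpp
// The backedge-taken count of this loop is provably zero: the body runs
// exactly once, so the exiting branch always leaves the loop and the
// comparison feeding it becomes dead.
int runsOnce(const int *a) {
  int sum = 0;
  for (int i = 0; i < 1; ++i) // SCEV: backedge-taken count = 0
    sum += a[i];
  return sum;
}
```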
* [CommandLine] Remove OptionCategory and SubCommand caches from the Option class | Don Hinton | 2019-06-22 | 1 | -58/+49
  Summary: This change processes `OptionCategory`s and `SubCommand`s as they are seen instead of caching them in the Option class and processing them later. Doing so simplifies the work needed to be done by the global parser and significantly reduces the size of the Option class to a mere 64 bytes. Removing the `OptionCategory` cache saved 24 bytes, and removing the `SubCommand` cache saved an additional 48 bytes, for a total of a 72-byte reduction.
  Reviewers: beanz, zturner, MaskRay, serge-sans-paille
  Reviewed By: serge-sans-paille
  Subscribers: serge-sans-paille, tstellar, zturner, hiraditya, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D62105
  llvm-svn: 364134
* [NFC] Fix indentation in PPCAsmPrinter.cpp | Hubert Tong | 2019-06-22 | 1 | -54/+54
  After r248261, the indentation switches, inside a namespace definition, between indenting and not indenting one level for that namespace; the abomination occurs in the middle of a class definition. Fix that.
  llvm-svn: 364133
* [PowerPC][NFC] Move comment to the relevant function | Hubert Tong | 2019-06-22 | 1 | -1/+1
  A comment that applies to a virtual destructor was placed on a class constructor. Move the comment to where it belongs.
  llvm-svn: 364132
* [NewGVN] Fix copy/paste mistake in cast | Nikita Popov | 2019-06-22 | 1 | -1/+1
  llvm-svn: 364130
* [NewGVN] Remove dead SwitchEdges variable; NFC | Nikita Popov | 2019-06-22 | 1 | -4/+0
  llvm-svn: 364129
* AArch64: Add support for reading pc using llvm.read_register | Peter Collingbourne | 2019-06-22 | 1 | -0/+8
  This is useful for allowing code to efficiently take an address that can be later mapped onto debug info. Currently the hwasan pass achieves this by taking the address of the current function: http://llvm-cs.pcc.me.uk/lib/Transforms/Instrumentation/HWAddressSanitizer.cpp#921
  But this costs two instructions (plus a GOT entry in PIC code) per function with stack variables. This will allow the cost to be reduced to a single instruction.
  Differential Revision: https://reviews.llvm.org/D63471
  llvm-svn: 364126
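For context, the pattern this replaces, sketched in C++ (the function name is my invention): taking the current function's address yields a value debug info can map back to the function, at the cost described above.

```cpp
// The old approach: materialize an address that can later be mapped onto
// debug info. In PIC code this needs an extra GOT load per function.
static void *frame_tag_base() {
  return reinterpret_cast<void *>(&frame_tag_base);
}
```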
* [CMake] Delete redundant DEPENDS/LINK_LIBS from LineEditor/XRay | Fangrui Song | 2019-06-22 | 2 | -9/+0
  The link dependencies are already specified in LLVMBuild.txt.
  llvm-svn: 364125
* Make GlobalISel depend on SelectionDAG after D63169 | Fangrui Song | 2019-06-22 | 1 | -1/+1
  GlobalISel/IRTranslator.cpp now references SelectionDAG/FunctionLoweringInfo.cpp. This fixes a link error in -DBUILD_SHARED_LIBS=on builds:
    ld.lld: error: undefined symbol: llvm::FunctionLoweringInfo::clear()
    >>> referenced by IRTranslator.cpp:2198 (../lib/CodeGen/GlobalISel/IRTranslator.cpp:2198)
    >>> lib/CodeGen/GlobalISel/CMakeFiles/LLVMGlobalISel.dir/IRTranslator.cpp.o:(llvm::IRTranslator::finalizeFunction())
  llvm-svn: 364124
* AArch64: Prefer FP-relative debug locations in HWASANified functions | Peter Collingbourne | 2019-06-22 | 3 | -11/+18
  To help produce better diagnostics for stack use-after-return, we'd like to be able to determine the addresses of each HWASANified function's local variables given a small amount of information recorded on entry to the function. Currently we require all HWASANified functions to use frame pointers and record (PC, FP) on function entry. This works better than recording SP because FP cannot change during the function, unlike SP which can change e.g. due to dynamic alloca.
  However, most variables currently end up using SP-relative locations in their debug info. This prevents us from recomputing the address of most variables because the distance between SP and FP isn't recorded in the debug info. To address this, make the AArch64 backend prefer FP-relative debug locations when producing debug info for HWASANified functions.
  Differential Revision: https://reviews.llvm.org/D63300
  llvm-svn: 364117
* [COFF, ARM64] Fix encoding of debugtrap for Windows | Tom Tan | 2019-06-21 | 4 | -0/+17
  On Windows ARM64, the intrinsic __debugbreak is compiled into brk #0xF000, which is mapped to llvm.debugtrap in Clang. Instruction brk #0xF000 is the defined breakpoint instruction on ARM64; it is recognized by the Windows debugger and exception handling code, so llvm.debugtrap should map to it instead of redirecting to llvm.trap (brk #1) as the default implementation does.
  Differential Revision: https://reviews.llvm.org/D63635
  llvm-svn: 364115
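A small usage sketch: Clang's __builtin_debugtrap (like MSVC's __debugbreak, per the message above) lowers to llvm.debugtrap, which after this change encodes as brk #0xF000 on Windows ARM64.

```cpp
// Compiled with Clang targeting Windows ARM64, the trap below should
// encode as `brk #0xF000` rather than the generic `brk #1`.
int divideChecked(int a, int b) {
  if (b == 0)
    __builtin_debugtrap(); // lowers to llvm.debugtrap
  return a / b;
}
```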
* Revert "[SLP] Look-ahead operand reordering heuristic." | Reid Kleckner | 2019-06-21 | 1 | -232/+46
  This reverts r364084 (git commit 5698921be2d567f6abf925479ac9f5a376d6d74f). It caused crashes while compiling a file in Chrome. Reduction forthcoming.
  llvm-svn: 364111
* [ASan] Use dynamic shadow on 32-bit iOS and simulators | Julian Lettner | 2019-06-21 | 1 | -10/+2
  The VM layout on iOS is not stable between releases. On 64-bit iOS and its derivatives we use a dynamic shadow offset that enables ASan to search for a valid location for the shadow heap on process launch rather than hardcode it. This commit extends that approach to 32-bit iOS plus derivatives and their simulators.
  rdar://50645192 rdar://51200372 rdar://51767702
  Reviewed By: delcypher
  Differential Revision: https://reviews.llvm.org/D63586
  llvm-svn: 364105
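A hedged sketch of the mapping involved (the shift constant and details are illustrative): with a dynamic shadow, the base is read from a runtime global chosen at process launch instead of being baked in at link time.

```cpp
#include <cstdint>

// ASan shadow mapping: Shadow = (Addr >> Scale) + Base. With a static
// shadow, Base is a compile-time constant; with a dynamic shadow it is
// loaded from a global that the runtime fills in at process launch.
extern uintptr_t __asan_shadow_memory_dynamic_address; // set by the runtime

uint8_t *shadowFor(uintptr_t addr) {
  return reinterpret_cast<uint8_t *>(
      (addr >> 3) + __asan_shadow_memory_dynamic_address);
}
```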