path: root/llvm/lib
Commit message | Author | Age | Files | Lines
...
* CodeGen: Remove pipeline dependencies on StackProtector; NFC | Matthias Braun | 2018-07-13 | 16 | -69/+95
  This re-applies r336929 with a fix to accommodate the Mips target scheduling multiple SelectionDAG instances into the pass pipeline.
  PrologEpilogInserter and StackColoring depend on the StackProtector analysis being alive from the point it is run until PEI, which requires that they are all scheduled in the same FunctionPassManager. Inserting a (machine) ModulePass between StackProtector and PEI results in these passes being in separate FunctionPassManagers and the StackProtector is not available for PEI.
  PEI and StackColoring don't use much information from the StackProtector pass, so transferring the required information to MachineFrameInfo is cleaner than keeping the StackProtector pass around. This commit moves the SSP layout information to MFI instead of keeping it in the pass.
  This patch set (D37580, D37581, D37582, D37583, D37584, D37585, D37586, D37587) is a first draft of the pagerando implementation described in http://lists.llvm.org/pipermail/llvm-dev/2017-June/113794.html.
  Patch by Stephen Crane <sjc@immunant.com>
  Differential Revision: https://reviews.llvm.org/D49256
  llvm-svn: 336964
* Simplify recursive launder.invariant.group and strip | Piotr Padlewski | 2018-07-12 | 1 | -1/+39
  Summary: This patch is crucial for proving equality of laundered/stripped pointers, e.g.:

      bool foo(A *a) {
        return a == std::launder(a);
      }

  Clang with -fstrict-vtable-pointers will emit something like:

      define dso_local zeroext i1 @_Z3fooP1A(%struct.A* %a) {
      entry:
        %c = bitcast %struct.A* %a to i8*
        %call = tail call i8* @llvm.launder.invariant.group.p0i8(i8* %c)
        %0 = bitcast %struct.A* %a to i8*
        %1 = tail call i8* @llvm.strip.invariant.group.p0i8(i8* %0)
        %2 = tail call i8* @llvm.strip.invariant.group.p0i8(i8* %call)
        %cmp = icmp eq i8* %1, %2
        ret i1 %cmp
      }

  Because %2 can be replaced with @llvm.strip.invariant.group(%0), and %2 and %1 will then produce the same value (because strip is readnone), we can replace the compare with true.
  Reviewers: rsmith, hfinkel, majnemer, amharc, kuhar
  Subscribers: llvm-commits, hiraditya
  Differential Revision: https://reviews.llvm.org/D47423
  llvm-svn: 336963
* [InstCombine] Simplify isKnownNegation | Fangrui Song | 2018-07-12 | 1 | -5/+2
  llvm-svn: 336957
* [X86] Add AVX512 equivalents of some isel patterns so we get EVEX instructions. | Craig Topper | 2018-07-12 | 2 | -17/+48
  These are the patterns for matching fceil, ffloor, and sqrt to intrinsic instructions if they have a MOVSS/SD.
  llvm-svn: 336954
* Revert r336950 and r336951 "[X86] Add AVX512 equivalents of some isel patterns so we get EVEX instructions." and "foo" | Craig Topper | 2018-07-12 | 2 | -48/+17
  One of them had a bad title and they should have been squashed.
  llvm-svn: 336953
* Remove redundant *_or_null checks; NFC | George Burgess IV | 2018-07-12 | 1 | -2/+2
  For the first one, we dereference `NewDef` right before the `if` anyway. For the second, we shouldn't have NULL users().
  llvm-svn: 336952
* [X86] Add AVX512 equivalents of some isel patterns so we get EVEX instructions. | Craig Topper | 2018-07-12 | 1 | -0/+31
  These are the patterns for matching fceil, ffloor, and sqrt to intrinsic instructions if they have a MOVSS/SD.
  llvm-svn: 336951
* foo | Craig Topper | 2018-07-12 | 2 | -17/+17
  llvm-svn: 336950
* Revert "[SLPVectorizer] Add initial alternate opcode support for cast instructions. (REAPPLIED)" | Martin Storsjo | 2018-07-12 | 1 | -62/+22
  This reverts commit r336812, which broke compilation of a number of projects, see PR38154.
  llvm-svn: 336949
* [SanitizerCoverage] Add associated metadata to 8-bit counters. | Matt Morehouse | 2018-07-12 | 1 | -1/+3
  Summary: This allows counters associated with unused functions to be dead-stripped along with their functions. This approach is the same one we used for PC tables.
  Fixes an issue where LLD removes an unused PC table but leaves the 8-bit counter.
  Reviewers: eugenis
  Reviewed By: eugenis
  Subscribers: llvm-commits, hiraditya, kcc
  Differential Revision: https://reviews.llvm.org/D49264
  llvm-svn: 336941
* [X86][FastISel] Support EVEX version of sqrt. | Craig Topper | 2018-07-12 | 1 | -8/+11
  llvm-svn: 336939
* AMDGPU: Fix assert in truncate combine with vectors | Matt Arsenault | 2018-07-12 | 1 | -1/+1
  The piece above probably has the same problem, but I need to try to come up with a test for it.
  llvm-svn: 336935
* Revert "(HEAD -> master, origin/master, arcpatch-D37582) CodeGen: Remove pipeline dependencies on StackProtector; NFC" | Matthias Braun | 2018-07-12 | 13 | -83/+69
  This was triggering pass scheduling failures.
  This reverts commit r336929.
  llvm-svn: 336934
* CodeGen: Remove pipeline dependencies on StackProtector; NFC | Matthias Braun | 2018-07-12 | 13 | -69/+83
  PrologEpilogInserter and StackColoring depend on the StackProtector analysis being alive from the point it is run until PEI, which requires that they are all scheduled in the same FunctionPassManager. Inserting a (machine) ModulePass between StackProtector and PEI results in these passes being in separate FunctionPassManagers and the StackProtector is not available for PEI.
  PEI and StackColoring don't use much information from the StackProtector pass, so transferring the required information to MachineFrameInfo is cleaner than keeping the StackProtector pass around. This commit moves the SSP layout information to MFI instead of keeping it in the pass.
  This patch set (D37580, D37581, D37582, D37583, D37584, D37585, D37586, D37587) is a first draft of the pagerando implementation described in http://lists.llvm.org/pipermail/llvm-dev/2017-June/113794.html.
  Patch by Stephen Crane <sjc@immunant.com>
  Differential Revision: https://reviews.llvm.org/D49256
  llvm-svn: 336929
* [DWARF v5] Generate range list tables into the .debug_rnglists section. No support for split DWARF and no use of DW_FORM_rnglistx with the DW_AT_ranges attribute. | Wolfgang Pieb | 2018-07-12 | 7 | -19/+144
  Reviewer: aprantl
  Differential Revision: https://reviews.llvm.org/D49214
  llvm-svn: 336927
* [X86] Connect the flags user from PCMPISTR instructions to the correct node from the instruction. | Craig Topper | 2018-07-12 | 1 | -1/+1
  We were accidentally connecting it to result 0 instead of result 1. This was caught by the machine verifier that noticed the flags were dead, but we were using them somehow. I'm still not clear what actually happened downstream.
  llvm-svn: 336925
* [X86][FastISel] Choose EVEX instructions when possible when lowering x86_sse_cvttss2si and similar intrinsics. | Craig Topper | 2018-07-12 | 1 | -8/+12
  This should fix a machine verifier error.
  llvm-svn: 336924
* [AArch64] Armv8.4-A: LDAPR & STLR with immediate offset instructions | Sjoerd Meijer | 2018-07-12 | 3 | -0/+43
  These instructions are added to AArch64 only.
  llvm-svn: 336913
* [InstCombine] Fold x & (-1 >> y) != x to x u> (-1 >> y) | Roman Lebedev | 2018-07-12 | 1 | -0/+4
  Summary: A complementary fold to D49179.
  https://bugs.llvm.org/show_bug.cgi?id=38123
  https://rise4fun.com/Alive/Rny
  Caveat: one more thing in `test/Transforms/InstCombine/icmp-logical.ll` breaks.
  Reviewers: spatel, craig.topper
  Reviewed By: spatel
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D49205
  llvm-svn: 336911
* [ThinLTO] Escape module paths when printing | Andrew Ng | 2018-07-12 | 1 | -2/+3
  We have located a bug in AssemblyWriter::printModuleSummaryIndex(). This function outputs path strings incorrectly. Backslashes in the strings are not correctly escaped.
  Consequently, if a path name contains a backslash followed by two hexadecimal characters, the sequence is incorrectly interpreted when the output is read by another component. This mangles the path and results in error.
  This patch fixes this issue by calling printEscapedString() to output the module paths.
  Patch by Chris Jackson.
  Differential Revision: https://reviews.llvm.org/D49090
  llvm-svn: 336908
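The failure mode described above can be illustrated with a small stand-alone sketch. It is not the LLVM implementation: `escapeForIR` and `unescapeFromIR` are hypothetical helpers, and the escaping rule (backslash, double quote and non-printable bytes become `\XX`) is an assumption about what an escaping printer typically does.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical writer side: printable characters pass through, while
// backslash, double quote and non-printable bytes are emitted as \XX (hex).
std::string escapeForIR(const std::string &s) {
  static const char hex[] = "0123456789ABCDEF";
  std::string out;
  for (unsigned char c : s) {
    if (std::isprint(c) && c != '\\' && c != '"') {
      out += static_cast<char>(c);
    } else {
      out += '\\';
      out += hex[c >> 4];
      out += hex[c & 0x0F];
    }
  }
  return out;
}

// Hypothetical reader side: a backslash followed by two hex digits is
// decoded as a single byte; anything else is copied unchanged.
std::string unescapeFromIR(const std::string &s) {
  auto hexVal = [](char ch) -> int {
    if (ch >= '0' && ch <= '9') return ch - '0';
    if (ch >= 'a' && ch <= 'f') return ch - 'a' + 10;
    if (ch >= 'A' && ch <= 'F') return ch - 'A' + 10;
    return -1;
  };
  std::string out;
  for (std::size_t i = 0; i < s.size(); ++i) {
    if (s[i] == '\\' && i + 2 < s.size() && hexVal(s[i + 1]) >= 0 &&
        hexVal(s[i + 2]) >= 0) {
      out += static_cast<char>(hexVal(s[i + 1]) * 16 + hexVal(s[i + 2]));
      i += 2;
    } else {
      out += s[i];
    }
  }
  return out;
}

// Self-check: an unescaped Windows-style path is mangled by the reader,
// while an escaped one round-trips.
void checkModulePathEscaping() {
  const std::string path = "C:\\ab\\file.o"; // the text C:\ab\file.o
  assert(unescapeFromIR(path) != path);      // "\ab" is read as byte 0xAB
  assert(unescapeFromIR(escapeForIR(path)) == path);
}
```

In this sketch the path `C:\ab\file.o` is written as `C:\5Cab\5Cfile.o`, so the reader no longer sees `\ab` as a hex escape.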
* [X86][SSE] Utilize ZeroableElements for canWidenShuffleElements | Simon Pilgrim | 2018-07-12 | 1 | -2/+31
  canWidenShuffleElements can do a better job if given a mask with ZeroableElements info. Apparently, ZeroableElements was being only used to identify AllZero candidates, but possibly we could plug it into more shuffle matchers.
  Original Patch by Zvi Rackover @zvi
  Differential Revision: https://reviews.llvm.org/D42044
  llvm-svn: 336903
* [X86][AVX] Use Zeroable mask to improve shuffle mask widening | Simon Pilgrim | 2018-07-12 | 1 | -2/+17
  Noticed while updating D42044, lowerV2X128VectorShuffle can improve the shuffle mask with the zeroable data to create a target shuffle mask to recognise more 'zero upper 128' patterns.
  NOTE: lowerV4X128VectorShuffle could benefit as well but the code needs refactoring first to discriminate between SM_SentinelUndef and SM_SentinelZero for negative shuffle indices.
  Differential Revision: https://reviews.llvm.org/D49092
  llvm-svn: 336900
* [UnJ] Use SmallPtrSets for block collections. NFC | David Green | 2018-07-12 | 1 | -30/+27
  We no longer care about the order of blocks in these collections, so can change to SmallPtrSets, making contains checks quicker.
  Differential revision: https://reviews.llvm.org/D49060
  llvm-svn: 336897
* [mips] Mark standard encoded instructions as not being in MIPS16e | Simon Atanasyan | 2018-07-12 | 2 | -3/+3
  Mark standard encoded instructions and pseudo "standard encoded" as not being in MIPS16e by default.
  Patch by Simon Dardis.
  Differential revision: https://reviews.llvm.org/D48379
  llvm-svn: 336893
* [X86] Remove i128 type from FR128 regclass. | Craig Topper | 2018-07-12 | 3 | -18/+1
  i128 isn't a legal type in our x86 implementation today. So remove this and the few patterns that used it until it becomes necessary.
  llvm-svn: 336889
* Fix few typos in comments (write access test commit) | Stefan Granitz | 2018-07-12 | 1 | -2/+2
  llvm-svn: 336887
* [X86] Remove patterns and ISD nodes for the old scalar FMA intrinsic lowering. | Craig Topper | 2018-07-12 | 5 | -165/+19
  We now use llvm.fma.f32/f64 or llvm.x86.fmadd.f32/f64 intrinsics that use scalar types rather than vector types. So we don't need these special ISD nodes that operate on the lowest element of a vector.
  llvm-svn: 336883
* [InstSimplify] simplify add instruction if two operands are negative | Chen Zheng | 2018-07-12 | 2 | -0/+24
  Differential Revision: https://reviews.llvm.org/D49216
  llvm-svn: 336881
* [AsmParser] Fix inconsistent declaration parameter name | Fangrui Song | 2018-07-12 | 3 | -41/+41
  llvm-svn: 336879
* Temporarily revert "Recommit r328307: [IPSCCP] Use constant range information for comparisons of parameters." as it's causing miscompiles. | Eric Christopher | 2018-07-12 | 1 | -81/+111
  A testcase was provided in the original review thread.
  This reverts commit r336098.
  llvm-svn: 336877
* [x86] Fix another trivial bug in x86 flags copy lowering that has been | Chandler Carruth | 2018-07-12 | 1 | -3/+6
  there for a long time.
  The boolean tracking whether we saw a kill of the flags was supposed to be per-block we are scanning and instead was outside that loop and never cleared. It requires a quite contrived test case to hit this as you have to have multiple levels of successors and interleave them with kills. I've included such a test case here.
  This is another bug found testing SLH and extracted to its own focused patch.
  llvm-svn: 336876
* [X86] Add patterns to use VMOVSS/SD zero masking for scalar f32/f64 select with zero. | Craig Topper | 2018-07-12 | 1 | -0/+8
  These showed up in some of the upgraded FMA code. We really need to improve these test cases more, but this helps for now.
  llvm-svn: 336875
* [x86] Fix EFLAGS copy lowering to correctly handle walking past uses in | Chandler Carruth | 2018-07-12 | 1 | -1/+1
  multiple successors where some of the uses end up killing the EFLAGS register.
  There was a bug where rather than skipping to the next basic block queued up with uses once we saw a kill, we stopped processing the blocks entirely. =/
  Test case produces completely nonsensical code w/o this tiny fix.
  This was found testing Speculative Load Hardening and split out of that work.
  Differential Revision: https://reviews.llvm.org/D49211
  llvm-svn: 336874
* [X86] Remove and autoupgrade the scalar fma intrinsics with masking. | Craig Topper | 2018-07-12 | 7 | -136/+129
  This converts them to what clang is now using for codegen. Unfortunately, there seem to be a few kinks to work out still. I'll try to address with follow up patches.
  llvm-svn: 336871
* IR: Skip -print-*-all after -print-* | Duncan P. N. Exon Smith | 2018-07-11 | 1 | -3/+3
  This changes `-print-*` from transformation passes to analysis passes so that `-print-after-all` and `-print-before-all` don't trigger. This avoids some redundant output.
  Patch by Son Tuan Vu!
  llvm-svn: 336869
* [CodeGen] Emit more precise AssertZext/AssertSext nodes. | Eli Friedman | 2018-07-11 | 2 | -26/+9
  This is marginally helpful for removing redundant extensions, and the code is easier to read, so it seems like an all-around win.
  In the new test i8-phi-ext.ll, we used to emit an AssertSext i8; now we emit an AssertZext i2, which allows the extension of the return value to be eliminated.
  Differential Revision: https://reviews.llvm.org/D49004
  llvm-svn: 336868
* [LoopIdiomRecognize] Don't convert a do while loop to ctlz. | Craig Topper | 2018-07-11 | 1 | -10/+15
  This commit suppresses turning loops like this into "(bitwidth - ctlz(input))".

      unsigned foo(unsigned input) {
        unsigned num = 0;
        do {
          ++num;
          input >>= 1;
        } while (input != 0);
        return num;
      }

  The loop version returns a value of 1 for both an input of 0 and an input of 1. Converting to a naive ctlz does not preserve that.
  Theoretically we could do better if we checked isKnownNonZero or we could insert a select to handle the divergence. But until we have motivating cases for that, this is the easiest solution.
  llvm-svn: 336864
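The divergence at zero can be checked directly. The sketch below assumes a 32-bit input and uses a hand-written `ctlz32` helper (a hypothetical name, defined to return 32 for a zero input) in place of a compiler intrinsic.

```cpp
#include <cassert>
#include <cstdint>

// The do-while loop from the commit message: the body always runs once,
// so the result is at least 1.
unsigned countWithDoWhile(std::uint32_t input) {
  unsigned num = 0;
  do {
    ++num;
    input >>= 1;
  } while (input != 0);
  return num;
}

// Portable count-leading-zeros; returns 32 when v is zero.
unsigned ctlz32(std::uint32_t v) {
  unsigned n = 0;
  for (std::uint32_t bit = 0x80000000u; bit != 0 && (v & bit) == 0; bit >>= 1)
    ++n;
  return n;
}

// The naive replacement, "bitwidth - ctlz(input)".
unsigned countWithCtlz(std::uint32_t input) { return 32 - ctlz32(input); }

// Self-check: the two forms agree for non-zero inputs and differ at zero.
void checkCtlzDivergence() {
  assert(countWithDoWhile(0) == 1);
  assert(countWithCtlz(0) == 0);
  for (std::uint32_t v = 1; v < 100000; ++v)
    assert(countWithDoWhile(v) == countWithCtlz(v));
  assert(countWithDoWhile(0xFFFFFFFFu) == countWithCtlz(0xFFFFFFFFu));
}
```

A select on `input == 0`, as the message suggests, would be one way to reconcile the two forms; the commit instead skips the transform for this loop shape.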
* AMDGPU/SI: Initialize InstrInfo before TargetLoweringInfo in GCNSubtarget | Tom Stellard | 2018-07-11 | 2 | -3/+3
  SITargetLowering queries SIInstrInfo in its constructor, so SIInstrInfo must be initialized first. This fixes msan buildbot failures that were introduced by r336851.
  llvm-svn: 336861
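As a general C++ illustration of this kind of ordering dependency (not the actual GCNSubtarget code), the sketch below uses hypothetical stand-in types. It assumes the dependency is expressed through class members, which C++ initializes in declaration order regardless of the order written in the constructor's initializer list.

```cpp
#include <cassert>

// Hypothetical stand-in for an instruction-info object.
struct InstrInfoSketch {
  int numOpcodes = 42;
};

// Hypothetical stand-in for a lowering object that reads the instruction
// info while it is being constructed.
struct TargetLoweringSketch {
  int opcodesSeen;
  explicit TargetLoweringSketch(const InstrInfoSketch &ii)
      : opcodesSeen(ii.numOpcodes) {}
};

struct SubtargetSketch {
  // Members are initialized in declaration order, so instrInfo must be
  // declared before lowering for the constructor below to read a fully
  // constructed object.
  InstrInfoSketch instrInfo;
  TargetLoweringSketch lowering;

  SubtargetSketch() : instrInfo(), lowering(instrInfo) {}
};

// Self-check: the lowering object saw the initialized value.
void checkInitOrder() {
  SubtargetSketch st;
  assert(st.lowering.opcodesSeen == 42);
}
```

If the two members were declared in the opposite order, `lowering` would read `instrInfo` before its constructor had run, which is the kind of uninitialized read a memory sanitizer reports.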
* [MemorySSA] Add APIs to move memory accesses between blocks, following CFG changes. | Alina Sbirlea | 2018-07-11 | 2 | -1/+61
  Summary: The move APIs added in this patch will be used to update MemorySSA when CFG changes merge or split blocks, by moving memory accesses accordingly in MemorySSA's internal data structures.
  [Split from D45299 for easier review]
  Reviewers: george.burgess.iv
  Subscribers: sanjoy, jlebar, Prazek, llvm-commits
  Differential Revision: https://reviews.llvm.org/D48897
  llvm-svn: 336860
* AMDGPU: Remove duplicate call to initializeSubtargetDependencies() | Tom Stellard | 2018-07-11 | 1 | -1/+0
  This was added in r336851.
  llvm-svn: 336853
* AMDGPU: Refactor Subtarget classes | Tom Stellard | 2018-07-11 | 74 | -381/+340
  Summary: This is a follow-up to r335942.
  - Merge SISubtarget into AMDGPUSubtarget and rename to GCNSubtarget
  - Rename AMDGPUCommonSubtarget to AMDGPUSubtarget
  - Merge R600Subtarget::Generation and GCNSubtarget::Generation into AMDGPUSubtarget::Generation.
  Reviewers: arsenm, jvesely
  Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits
  Differential Revision: https://reviews.llvm.org/D49037
  llvm-svn: 336851
* [DebugInfo] Fix getPreviousSibling after r336823 | Fangrui Song | 2018-07-11 | 1 | -1/+2
  llvm-svn: 336837
* [InstCombine] Fold x & (-1 >> y) == x to x u<= (-1 >> y) | Roman Lebedev | 2018-07-11 | 1 | -0/+34
  Summary: https://bugs.llvm.org/show_bug.cgi?id=38123
  This pattern will be produced by Implicit Integer Truncation sanitizer, https://reviews.llvm.org/D48958 https://bugs.llvm.org/show_bug.cgi?id=21530 in unsigned case, therefore it is probably a good idea to improve it.
  https://rise4fun.com/Alive/Rny
  ^ there are more opportunities for folds, I will follow up with them afterwards.
  Caveat: this somehow exposes missing opportunities in `test/Transforms/InstCombine/icmp-logical.ll`. It seems the problem is in `foldLogOpOfMaskedICmps()` in `InstCombineAndOrXor.cpp`. But I'm not quite sure what is wrong, because it calls `getMaskedTypeForICmpPair()`, which calls `decomposeBitTestICmp()`, which should already work for these cases... As @spatel notes in https://reviews.llvm.org/D49179#1158760, that code is a rather complex mess, so we'll let it slide.
  Reviewers: spatel, craig.topper
  Reviewed By: spatel
  Subscribers: yamauchi, majnemer, t.p.northover, llvm-commits
  Differential Revision: https://reviews.llvm.org/D49179
  llvm-svn: 336834
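The equivalence behind this fold, and behind the complementary `!=` to `u>` fold, can be sanity-checked exhaustively at a small bit width. The sketch below assumes 8-bit unsigned values and in-range shift amounts (0 to 7); the function names are illustrative and do not come from InstCombine.

```cpp
#include <cassert>
#include <cstdint>

// The mask -1 >> y as a logical shift on an 8-bit value.
std::uint8_t lowMask(unsigned y) {
  assert(y < 8 && "shift amount must be smaller than the bit width");
  return static_cast<std::uint8_t>(0xFFu >> y);
}

// Original form: (x & (-1 >> y)) == x.
bool maskedEq(std::uint8_t x, unsigned y) {
  return static_cast<std::uint8_t>(x & lowMask(y)) == x;
}

// Folded form: x u<= (-1 >> y).
bool unsignedLe(std::uint8_t x, unsigned y) { return x <= lowMask(y); }

// Compares both forms for every 8-bit x and every in-range y. The != / u>
// pair is the logical negation of each side, so it is checked alongside.
bool foldHoldsForAllI8() {
  for (unsigned x = 0; x < 256; ++x) {
    for (unsigned y = 0; y < 8; ++y) {
      const std::uint8_t v = static_cast<std::uint8_t>(x);
      if (maskedEq(v, y) != unsignedLe(v, y))
        return false;
      if (!maskedEq(v, y) != (v > lowMask(y)))
        return false;
    }
  }
  return true;
}
```

The intuition is that `-1 >> y` is a run of low set bits, so masking leaves `x` unchanged exactly when `x` has no bits above that run, which is the same as `x` being no larger than the mask.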
* [X86] Remove patterns for inserting a load into a zero vector. | Craig Topper | 2018-07-11 | 2 | -90/+50
  We can instead block the load folding isProfitableToFold. Then isel will emit a register->register move for the zeroing part and a separate load. The PostProcessISelDAG should be able to remove the register->register move.
  This saves us patterns and fixes the fact that we only had unaligned load patterns. The test changes show places where we should have been using an aligned load.
  llvm-svn: 336828
* [TargetTransformInfo] Add pow2 analysis for scalar constants | Simon Pilgrim | 2018-07-11 | 1 | -0/+6
  Add ConstantInt analysis to getOperandInfo so we get more realistic div/rem expansion costs comparable to the vector costs.
  llvm-svn: 336827
* AMDGPU/NFC: Use already available explicit kernarg | Konstantin Zhuravlyov | 2018-07-11 | 1 | -1/+2
  size instead of calculating it again when filling out the metadata.
  llvm-svn: 336825
* [DebugInfo] Make children iterator bidirectional | Jonas Devlieghere | 2018-07-11 | 2 | -0/+45
  Make the DIE iterator bidirectional so we can move to the previous sibling of a DIE.
  Differential revision: https://reviews.llvm.org/D49173
  llvm-svn: 336823
* [X86] Fix MayLoad/HasSideEffect flag for (V)MOVLPSrm instructions. | Andrea Di Biagio | 2018-07-11 | 2 | -1/+3
  Before revision 336728, the "mayLoad" flag for instruction (V)MOVLPSrm was inferred directly from the "default" pattern associated with the instruction definition.
  r336728 removed special node X86Movlps, and all the patterns associated to it. Now instruction (V)MOVLPSrm doesn't have a pattern associated to it, and the 'mayLoad/hasSideEffects' flags are left unset.
  When the instruction info is emitted by tablegen, method CodeGenDAGPatterns::InferInstructionFlags() sees that (V)MOVLPSrm doesn't have a pattern, and flags are undefined. So, it conservatively sets the "hasSideEffects" flag for it.
  As a consequence, we were losing the 'mayLoad' flag, and we were gaining a 'hasSideEffect' flag in its place. This patch fixes the issue (originally reported by Michael Holmen).
  The mca tests show the differences in the instruction info flags. Instructions that were affected by this problem were: MOVLPSrm/VMOVLPSrm/VMOVLPSZ128rm.
  Differential Revision: https://reviews.llvm.org/D49182
  llvm-svn: 336818
* [SLPVectorizer] Add initial alternate opcode support for cast instructions. (REAPPLIED) | Simon Pilgrim | 2018-07-11 | 1 | -22/+62
  We currently only support binary instructions in the alternate opcode shuffles.
  This patch is an initial attempt at adding cast instructions as well, this raises several issues that we probably want to address as we continue to generalize the alternate mechanism:
  1 - Duplication of cost determination - we should probably add scalar/vector costs helper functions and get BoUpSLP::getEntryCost to use them instead of determining costs directly.
  2 - Support alternate instructions with the same opcode (e.g. casts with different src types) - alternate vectorization of calls with different IntrinsicIDs will require this.
  3 - Allow alternates to be a different instruction type - mixing binary/cast/call etc.
  4 - Allow passthrough of unsupported alternate instructions - related to PR30787/D28907 'copyable' elements.
  Reapplied with fix to only accept 2 different casts if they come from the same source type.
  Differential Revision: https://reviews.llvm.org/D49135
  llvm-svn: 336812
* Revert rL336804: [SLPVectorizer] Add initial alternate opcode support for cast instructions. | Simon Pilgrim | 2018-07-11 | 1 | -58/+22
  Reverting due to buildbot failures
  llvm-svn: 336806