summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* [ARM][MVE] Change VCTP operandSam Parker2019-09-301-3/+3
| | | | | | | | | | | | The VCTP instruction will calculate the predicate masked based upon the number of elements that need to be processed. I had inserted the sub before the vctp intrinsic and supplied it as the operand, but this is incorrect as the phi should directly feed the vctp. The sub is calculating the value for the next iteration. Differential Revision: https://reviews.llvm.org/D67921 llvm-svn: 373188
* [TargetLowering] Simplify expansion of S{ADD,SUB}ORoger Ferrer Ibanez2019-09-301-18/+13
| | | | | | | | | | ISD::SADDO uses the suggested sequence described in the section §2.4 of the RISCV Spec v2.2. ISD::SSUBO uses the dual approach but checking for (non-zero) positive. Differential Revision: https://reviews.llvm.org/D47927 llvm-svn: 373187
* [ARM][CGP] Allow signext argumentsSam Parker2019-09-301-5/+2
| | | | | | | | | | | | As we perform a zext on any arguments used in the promoted tree, it doesn't matter if they're marked as signext. The only permitted user(s) in the tree which would interpret the sign bits are signed icmps. For these instructions, their promoted operands are truncated before the icmp uses them. Differential Revision: https://reviews.llvm.org/D68019 llvm-svn: 373186
* Revert "[SCEV] add no wrap flag for SCEVAddExpr."Tim Northover2019-09-301-1/+1
| | | | | | | | This reverts r366419 because the analysis performed is within the context of the loop and it's only valid to add wrapping flags to "global" expressions if they're always correct. llvm-svn: 373184
* [SystemZ] Add SystemZPostRewrite in addPostRegAlloc() instead at -O0.Jonas Paulsson2019-09-301-1/+4
| | | | | | | | SystemZPostRewrite needs to be run before (it may emit COPYs) the Post-RA pseudo pass also at -O0, so it should be added in addPostRegAlloc(). Review: Ulrich Weigand llvm-svn: 373182
* [X86] Remove some redundant isel patterns. NFCICraig Topper2019-09-301-78/+0
| | | | | | | These are all also implemented in avx512_logical_lowering_types with support for masking. llvm-svn: 373181
* AMDGPU/GlobalISel: Fix select for v2s16 and/or/xorMatt Arsenault2019-09-301-15/+17
| | | | llvm-svn: 373180
* [X86] Split v16i32/v8i64 bitreverse on avx512f targets without avx512bw to ↵Craig Topper2019-09-301-1/+12
| | | | | | enable the use of vpshufb on the 256-bit halves. llvm-svn: 373177
* [X86] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds after r373174Fangrui Song2019-09-301-0/+1
| | | | llvm-svn: 373175
* [X86] Remove -x86-experimental-vector-widening-legalization command line flagCraig Topper2019-09-292-1257/+145
| | | | | | | | | This was added back to allow some performance regressions to be investigated. The main perf issue was fixed shortly after adding this back and no other major issues have been reported. So I think its safe to remove this again. llvm-svn: 373174
* [X86] Add custom isel logic to match VPTERNLOG from 2 logic ops.Craig Topper2019-09-291-1/+79
| | | | | | | | | | | | | | | There's room from improvement here, but this is a decent starting point. There are a few minor regressions in the vector-rotate tests, where we are now forming a vpternlog from an and before we get a chance to form it for a bitselect that we were matching previously. This results in an AND and an ANDN feeding the vpternlog where previously we just had an AND after the vpternlog. I think we can probably DAG combine the AND with the bitselect to get back to similar codegen. llvm-svn: 373172
* [LLVM-C][Ocaml] Add MergeFunctions and DCE passAditya Kumar2019-09-292-0/+8
| | | | | | | | | | | | | | | | | | | MergeFunctions and DCE pass are missing from OCaml/C-api. This patch adds them. Differential Revision: https://reviews.llvm.org/D65071 Reviewers: whitequark, hiraditya, deadalnix Reviewed By: whitequark Subscribers: llvm-commits Tags: #llvm Authored by: kren1 llvm-svn: 373170
* [MC] Emit unused undefined symbol even if its binding is not setFangrui Song2019-09-291-3/+0
| | | | | | | | | | | | | | | | | | | | | | | For the following two cases, we currently suppress the symbols. This patch emits them (compatible with GNU as). * `test2_a = undef`: if `undef` is otherwise unused. * `.hidden hidden`: if `hidden` is unused. This is the main point of the patch, because omitting the symbol would cause a linker semantic difference. It causes a behavior change that is not compatible with GNU as: .weakref foo1, bar1 When neither foo1 nor bar1 is used, we now emit bar1, which is arguably more consistent. Another change is that we will emit .TOC. for .TOC.@tocbase . For this directive, suppressing .TOC. can be seen as a size optimization, but we choose to drop it for simplicity and consistency. llvm-svn: 373168
* [DivRemPairs] Don't assert that we won't ever get expanded-form rem pairs in ↵Roman Lebedev2019-09-291-2/+0
| | | | | | | | | | | | | different BB's (PR43500) If we happen to have the same div in two basic blocks, and in one of those we also happen to have the rem part, we'd match the div-rem pair, but the wrong ones. So let's drop overly-ambiguous assert. Fixes https://bugs.llvm.org/show_bug.cgi?id=43500 llvm-svn: 373167
* [SLP] Fix for PR31847: Assertion failed: (isLoopInvariant(Operands[i], L) && ↵Alexey Bataev2019-09-291-67/+85
| | | | | | | | | | | | | | | | | | | | "SCEVAddRecExpr operand is not loop-invariant!") Initially SLP vectorizer replaced all going-to-be-vectorized instructions with Undef values. It may break ScalarEvaluation and may cause a crash. Reworked SLP vectorizer so that it does not replace vectorized instructions by UndefValue anymore. Instead vectorized instructions are marked for deletion inside if BoUpSLP class and deleted upon class destruction. Reviewers: mzolotukhin, mkuper, hfinkel, RKSimon, davide, spatel Subscribers: RKSimon, Gerolf, anemet, hans, majnemer, llvm-commits, sanjoy Differential Revision: https://reviews.llvm.org/D29641 llvm-svn: 373166
* [PowerPC] Fix conditions of assert in PPCAsmPrinterJinsong Ji2019-09-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: g++ build emits warning: llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp:667:77: error: suggest parentheses around ?&&? within ?||? [-Werror=parentheses] assert(MO.isGlobal() || MO.isCPI() || MO.isJTI() || MO.isBlockAddress() && ~~~~~~~~~~~~~~~~~~~~^~ "Unexpected operand type for LWZtoc pseudo."); I believe the intension is to assert all different types, so we should add a parentheses to include all '||'. Reviewers: #powerpc, sfertile, hubert.reinterpretcast, Xiangling_L Reviewed By: Xiangling_L Subscribers: wuzish, nemanjai, hiraditya, kbarton, MaskRay, shchenz, steven.zhang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68180 llvm-svn: 373164
* [ARM] Cortex-M4 schedule additionsDavid Green2019-09-295-17/+40
| | | | | | | | | | | | | | | | | | | This is an attempt to fill in some of the missing instructions from the Cortex-M4 schedule, and make it easier to do the same for other ARM cpus. - Some instructions are marked as hasNoSchedulingInfo as they are pseudos or otherwise do not require scheduling info - A lot of features have been marked not supported - Some WriteRes's have been added for cvt instructions. - Some extra instruction latencies have been added, notably by relaxing the regex for dsp instruction to catch more cases, and some fp instructions. This goes a long way to get the CompleteModel working for this CPU. It does not go far enough as to get all scheduling info for all output operands correct. Differential Revision: https://reviews.llvm.org/D67957 llvm-svn: 373163
* [X86] Enable isel to fold broadcast loads that have been bitcasted from FP ↵Craig Topper2019-09-291-0/+96
| | | | | | into a vpternlog. llvm-svn: 373157
* [X86] Move bitselect matching to vpternlog into X86ISelDAGToDAG.cppCraig Topper2019-09-292-43/+160
| | | | | | | | | | | | This allows us to reduce the use count on the condition node before the match. This enables load folding for that operand without relying on the peephole pass. This will be improved on for broadcast load folding in a subsequent commit. This still requires a bunch of isel patterns for vXi16/vXi8 types though. llvm-svn: 373156
* [X86] Enable canonicalizeBitSelect for AVX512 since we can use VPTERNLOG now.Craig Topper2019-09-291-5/+7
| | | | llvm-svn: 373155
* [X86] Match (or (and A, B), (andn (A, C))) to VPTERNLOG with AVX512.Craig Topper2019-09-291-0/+43
| | | | | | This uses a similar isel pattern as we used for vpcmov with XOP. llvm-svn: 373154
* [NFC] Move hot cold splitting class to header fileAditya Kumar2019-09-281-31/+0
| | | | | | | | | | | | | | | | Summary: This is to facilitate unittests Reviewers: compnerd, vsk, tejohnson, sebpop, brzycki, SirishP Reviewed By: tejohnson Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68079 llvm-svn: 373151
* [GlobalISel Enable memcpy inlining with optsize.Amara Emerson2019-09-281-1/+1
| | | | | | We should be disabling inline for minsize, not optsize. llvm-svn: 373143
* [TimeProfiler] Fix "OptModule" section and add new "Backend" sectionsAnton Afanasyev2019-09-281-1/+0
| | | | | | | Remove unnecessary "OptModule" section. Add "PerFunctionPasses", "PerModulePasses" and "CodeGenPasses" sections under "Backend" section. llvm-svn: 373142
* Add an operand to memory intrinsics to denote the "tail" marker.Amara Emerson2019-09-283-2/+23
| | | | | | | | | | | | | | We need to propagate this information from the IR in order to be able to safely do tail call optimizations on the intrinsics during legalization. Assuming it's safe to do tail call opt without checking for the marker isn't safe because the mem libcall may use allocas from the caller. This adds an extra immediate operand to the end of the intrinsics and fixes the legalizer to handle it. Differential Revision: https://reviews.llvm.org/D68151 llvm-svn: 373140
* AMDGPU/GlobalISel: Avoid getting MRI in every functionMatt Arsenault2019-09-282-221/+156
| | | | | | | Store it in AMDGPUInstructionSelector to avoid boilerplate in nearly every select function. llvm-svn: 373139
* [X86] Add broadcast load unfolding support for VPTESTMD/Q and VPTESTNMD/Q.Craig Topper2019-09-281-0/+12
| | | | llvm-svn: 373138
* [X86] Stop using UpdateNodeOperands in combineGatherScatter. Create new ↵Craig Topper2019-09-281-35/+58
| | | | | | | | | | | | | | nodes like most other DAG combines. Creating new nodes is what we usually do. Have to explicitly check that we don't update to an existing node and having to manually manage the worklist is unusual. We can probably add a helper function to reduce the duplication of having to check if we should create a gather or scatter, but I wanted to just get the simple thing done. llvm-svn: 373137
* [X86] Split combineGatherScatter into a version for generic ISD nodes and ↵Craig Topper2019-09-281-5/+39
| | | | | | | | | | | | | | | | another version for X86 specific nodes. The majority of the code doesn't run on the X86 nodes today since its gated by isBeforeLegalizeOps and we don't formm X86 nodes until after that. Except for a couple special case in type legalization. But I think we would probably break those if some of the transforms fire on them. I want to remove the hardcoded operand numbers and the unusual use of UpdateNodeOperands. Being able to know which ISD opcodes are present should help with that. llvm-svn: 373136
* [SampleFDO] Create a separate flag profile-accurate-for-symsinlist to handleWei Mi2019-09-271-35/+58
| | | | | | | | | | | | | | | | | | | | profile symbol list. Currently many existing users using profile-sample-accurate want to reduce code size as much as possible. Their use cases are different from the scenario profile symbol list tries to handle -- the major motivation of adding profile symbol list is to get the major memory/code size saving without introduce performance regression. So to keep the behavior of profile-sample-accurate unchanged, we think decoupling these two things and using a new flag to control the handling of profile symbol list may be better. When profile-sample-accurate and the new flag profile-accurate-for-symsinlist are both present, since profile-sample-accurate is a user assertion we let it have a higher precedence. Differential Revision: https://reviews.llvm.org/D68047 llvm-svn: 373133
* [InstSimplify] generalize FP folds with undef/NaN; NFCSanjay Patel2019-09-271-12/+14
| | | | | | We can reuse this logic for things like fma. llvm-svn: 373119
* Revert [Dominators][CodeGen] Clean up MachineDominatorsJakub Kuderski2019-09-271-3/+13
| | | | | | This reverts r373101 (git commit 72c57ec3e6b320c31274dadb888dc16772b8e7b6) llvm-svn: 373117
* [X86] Call SimplifyDemandedBits in combineGatherScatter any time the mask ↵Craig Topper2019-09-271-3/+3
| | | | | | | | | element is wider than i1, not just when AVX512 is disabled. The AVX2 intrinsics can still be used when AVX512 is enabled and those go through this path. So we should simplify them. llvm-svn: 373108
* [InstCombine] Simplify shift-by-sext to shift-by-zextRoman Lebedev2019-09-271-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is valid for any `sext` bitwidth pair: ``` Processing /tmp/opt.ll.. ---------------------------------------- %signed = sext %y %r = shl %x, %signed ret %r => %unsigned = zext %y %r = shl %x, %unsigned ret %r %signed = sext %y Done: 2016 Optimization is correct! ``` (This isn't so for funnel shifts, there it's illegal for e.g. i6->i7.) Main motivation is the C++ semantics: ``` int shl(int a, char b) { return a << b; } ``` ends as ``` %3 = sext i8 %1 to i32 %4 = shl i32 %0, %3 ``` https://godbolt.org/z/0jgqUq which is, as this shows, too pessimistic. There is another problem here - we can only do the fold if sext is one-use. But we can trivially have cases where several shifts have the same sext shift amount. This should be resolved, later. Reviewers: spatel, nikic, RKSimon Reviewed By: spatel Subscribers: efriedma, hiraditya, nlopes, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68103 llvm-svn: 373106
* [Dominators][CodeGen] Clean up MachineDominatorsJakub Kuderski2019-09-271-13/+3
| | | | | | | | | | | | | | | | Summary: This is a cleanup patch for MachineDominatorTree. It would be an NFC, except for replacing custom DomTree verification with the generic one. Reviewers: tstellar, tpr, nhaehnle, arsenm, NutshellySima, grosser, hliao Reviewed By: arsenm Subscribers: wdng, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67976 llvm-svn: 373101
* ModuleUtils - silence static analyzer dyn_cast<> null dereference warning. NFCI.Simon Pilgrim2019-09-271-1/+1
| | | | | | The static analyzer is warning about a potential null dereference, but we should be able to use cast<> directly and if not assert will fire for us. llvm-svn: 373099
* FunctionImportGlobalProcessing::processGlobalForThinLTO - silence static ↵Simon Pilgrim2019-09-271-1/+1
| | | | | | | | analyzer dyn_cast<FunctionSummary> null dereference warning. NFCI. The static analyzer is warning about a potential null dereference, but we should be able to use cast<FunctionSummary> directly and if not assert will fire for us. llvm-svn: 373097
* [RISCV] Rename FPRs and use Register arithmeticLuis Marques2019-09-276-199/+136
| | | | | | | | | | | | | | | | | The new names for FPRs ensure that the Register values within the same class are enumerated consecutively (the order is determined by the `LessRecordRegister` function object). Where there were tables mapping between 32- and 64-bit FPRs (and vice versa) this patch replaces them with Register arithmetic. The enumeration order between different register classes is expected to continue to be arbitrary, although it does impact the conversion from the (overloaded) asm FPR names to Register values, and therefore might require updates to the target if the sorting algorithm is changed. Static asserts were added to ensure that changes to the ordering that would impact the current implementation are detected. Differential Revision: https://reviews.llvm.org/D67423 llvm-svn: 373096
* SCCP - silence static analyzer dyn_cast<StructType> null dereference ↵Simon Pilgrim2019-09-271-1/+1
| | | | | | | | warning. NFCI. The static analyzer is warning about a potential null dereference, but we should be able to use cast<StructType> directly and if not assert will fire for us. llvm-svn: 373095
* [AMDGPU][MC] Corrected parsing of registersDmitry Preobrazhensky2019-09-271-144/+208
| | | | | | | | | | | | | Summary of changes: refactored code for better readability and future improvements; fixed bug 41281: https://bugs.llvm.org/show_bug.cgi?id=41281 Reviewers: artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D65224 llvm-svn: 373094
* [DebugInfo] Exclude memory location values as parameter entry valuesDjordje Todorovic2019-09-272-14/+5
| | | | | | | | | | | | | | | Abandon describing of loaded values due to safety concerns. Loaded values are described as derefed memory location at caller point. At callee we can unintentionally change that memory location which would lead to different entry being printed value before and after the memory location clobbering. This problem is described in llvm.org/PR43343. Patch by Nikola Prica Differential Revision: https://reviews.llvm.org/D67717 llvm-svn: 373089
* [CodeGenPrepare] Mend "avoid crashing from replacing a phi twice" fix.Jesper Antonsson2019-09-271-1/+1
| | | | | | | | | | | | | | | | | | | Summary: An erroneously negated if-statement by an earlier (March 2019) bugfix left phi replacement/simplification under optimizeMemoryInst() in CodeGenPrepare largely inactivated. The error was found when csmith found that the same assert as in the original bug report could still be triggered in a different way. This patch fixes the bugfix. The original bug was: https://bugs.llvm.org/show_bug.cgi?id=41052 ... and the previous fix was D59358. Reviewers: aprantl, skatkov Reviewed By: skatkov Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67838 llvm-svn: 373084
* [Alignment][NFC] Remove unneeded llvm:: scoping on Align typesGuillaume Chatelet2019-09-2777-359/+341
| | | | llvm-svn: 373081
* Revert r372893 "[CodeGen] Replace -max-jump-table-size with ↵Hans Wennborg2019-09-275-66/+61
| | | | | | | | | | | | | | | | | | | | | | | | | | | -max-jump-table-targets" This caused severe compile-time regressions, see PR43455. > Modern processors predict the targets of an indirect branch regardless of > the size of any jump table used to glean its target address. Moreover, > branch predictors typically use resources limited by the number of actual > targets that occur at run time. > > This patch changes the semantics of the option `-max-jump-table-size` to limit > the number of different targets instead of the number of entries in a jump > table. Thus, it is now renamed to `-max-jump-table-targets`. > > Before, when `-max-jump-table-size` was specified, it could happen that > cluster jump tables could have targets used repeatedly, but each one was > counted and typically resulted in tables with the same number of entries. > With this patch, when specifying `-max-jump-table-targets`, tables may have > different lengths, since the number of unique targets is counted towards the > limit, but the number of unique targets in tables is the same, but for the > last one containing the balance of targets. > > Differential revision: https://reviews.llvm.org/D60295 llvm-svn: 373060
* [Alignment][NFC] MaybeAlign in GVNExpressionGuillaume Chatelet2019-09-271-1/+1
| | | | | | | | | | | | | | | | | Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67922 llvm-svn: 373054
* [MC][ARM] vscclrm disassembles as vldmiaAlexandros Lamprineas2019-09-271-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | Happens only when the mve.fp subtarget feature is enabled: $ llvm-mc -triple thumbv8.1m.main -mattr=+mve.fp,+8msecext -disassemble <<< "0x9f,0xec,0x08,0x0b" .text vldmia pc, {d0, d1, d2, d3} $ llvm-mc -triple thumbv8.1m.main -mattr=+8msecext -disassemble <<< "0x9f,0xec,0x08,0x0b" .text vscclrm {d0, d1, d2, d3, vpr} Assembling returns the correct encoding with or without mve.fp: $ llvm-mc -triple thumbv8.1m.main -mattr=+mve.fp,+8msecext -show-encoding <<< "vscclrm {d0-d3, vpr}" .text vscclrm {d0, d1, d2, d3, vpr} @ encoding: [0x9f,0xec,0x08,0x0b] $ llvm-mc -triple thumbv8.1m.main -mattr=+8msecext -show-encoding <<< "vscclrm {d0-d3, vpr}" .text vscclrm {d0, d1, d2, d3, vpr} @ encoding: [0x9f,0xec,0x08,0x0b] The problem seems to be in the TableGen description of VSCCLRMD. The least significant bit should be set to zero. Differential Revision: https://reviews.llvm.org/D68025 llvm-svn: 373052
* Revert "[LoopInfo] Limit the iterations to check whether a loop has dedicatedWei Mi2019-09-271-7/+0
| | | | | | | | | | | exits" Get a better approach in https://reviews.llvm.org/D68107 to solve the problem. Revert the initial patch and will commit the new one soon. This reverts commit rL372990. llvm-svn: 373044
* [WebAssembly] v128.andnotThomas Lively2019-09-271-0/+5
| | | | | | | | | | | | | | | | Summary: As specified at https://github.com/WebAssembly/simd/blob/master/proposals/simd/SIMD.md#bitwise-and-not Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68113 llvm-svn: 373041
* [WebAssembly] SIMD Load and extend operationsThomas Lively2019-09-274-2/+71
| | | | | | | | | | | | | | | | | | Summary: As specified at https://github.com/webassembly/simd/blob/master/proposals/simd/SIMD.md#load-and-extend. These instructions are behind the unimplemented-simd128 target feature for now because they have not been implemented in V8 yet. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68058 llvm-svn: 373040
* Speculative fix for gcc build.Peter Collingbourne2019-09-271-2/+4
| | | | llvm-svn: 373038
OpenPOWER on IntegriCloud