summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* [AMDGPU] Print register pressure for agpr and vgpr separatelyStanislav Mekhanoshin2019-07-301-1/+2
| | | | | | Differential Revision: https://reviews.llvm.org/D65476 llvm-svn: 367355
* [MemorySSA] Extend allowed behavior for simplified instructions.Alina Sbirlea2019-07-302-27/+51
| | | | | | | | | | | | | | | | Summary: LoopRotate may simplify instructions, leading to the new instructions not having memory accesses created for them. Allow this behavior, by allowing the new access to be null when the template is null, and looking upwards for the proper defined access when dealing with simplified instructions. Reviewers: george.burgess.iv Subscribers: jlebar, Prazek, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65338 llvm-svn: 367352
* [NVPTX] Fix PR41651Michael Liao2019-07-301-2/+2
| | | | | | | | | | | | | | | Summary: - Use the passed `DL` directly as retrieving data layout from CS by checking the called function is not reliable. Under indirect function call, there is no called function. Subscribers: jholewinski, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65468 llvm-svn: 367349
* [AMDGPU] Reserve all AGPRs on targets which do not have themStanislav Mekhanoshin2019-07-302-0/+10
| | | | | | Differential Revision: https://reviews.llvm.org/D65471 llvm-svn: 367347
* [AMDGPU/GlobalISel] Add llvm.amdgcn.fdiv.fast legalization.Austin Kerbow2019-07-304-5/+47
| | | | | | | | | | | | | | Reviewers: arsenm Reviewed By: arsenm Subscribers: volkan, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, dstuttard, tpr, t-tye, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64966 llvm-svn: 367344
* [FunctionAttrs] Annotate "willreturn" for AssumeLikeInstHideto Ueno2019-07-301-4/+1
| | | | | | | | | | | | | | | | | Summary: In D37215, AssumeLikeInstruction are regarded as `willreturn`. In this patch, annotation is added to those which don't have `willreturn` now(`sideeffect, object_size, experimental_widenable_condition`). Reviewers: jdoerfert, nikic, sstefan1 Reviewed By: nikic Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65455 llvm-svn: 367342
* [WebAssembly] Do not emit tail calls with return type mismatchThomas Lively2019-07-302-7/+41
| | | | | | | | | | | | | | | | | Summary: return_call and return_call_indirect are only valid if the return types of the callee and caller match. We were previously not enforcing that, which was producing invalid modules. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65246 llvm-svn: 367339
* Revert [GVN] Preserve loop related analysis/canonical forms.Florian Hahn2019-07-302-29/+7
| | | | | | This reverts r367332 (git commit 2d7227ec3ac91f36fc32b1c21e72e2f1f5d030ad) llvm-svn: 367335
* [GVN] Preserve loop related analysis/canonical forms.Florian Hahn2019-07-302-7/+29
| | | | | | | | | | | | | | | | | | | | LoopInfo can be easily preserved by passing it to the functions that modify the CFG (SplitCriticalEdge and MergeBlockIntoPredecessor. SplitCriticalEdge also preserves LoopSimplify and LCSSA form when when passing in LoopInfo. The test case shows that we preserve LoopSimplify and LoopInfo. Adding addPreservedID(LCSSAID) did not preserve LCSSA for some reason. Also I am not sure if it is possible to preserve those in the new pass manager, as they aren't analysis passes. Reviewers: reames, hfinkel, davide, jdoerfert Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D65137 llvm-svn: 367332
* [Remarks] Add two serialization modes for remarks: separate and standaloneFrancis Visoiu Mistrih2019-07-303-9/+20
| | | | | | | | | | The default mode is separate, where the metadata is serialized separately from the remarks. Another mode is the standalone mode, where the metadata is serialized before the remarks, on the same stream. llvm-svn: 367328
* [LoopFusion] Extend use of OptimizationRemarkEmitterKit Barton2019-07-301-73/+107
| | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch extends the use of the OptimizationRemarkEmitter to provide information about loops that are not fused, and loops that are not eligible for fusion. In particular, it uses the OptimizationRemarkAnalysis to identify loops that are not eligible for fusion and the OptimizationRemarkMissed to identify loops that cannot be fused. It also reuses the statistics to provide the messages used in the OptimizationRemarks. This provides common message strings between the optimization remarks and the statistics. I would like feedback on this approach, in general. If people are OK with this, I will flesh out additional remarks in subsequent commits. Subscribers: hiraditya, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63844 llvm-svn: 367327
* AMDGPU: Avoid emitting "true" predicatesMatt Arsenault2019-07-301-1/+1
| | | | | | | | | | Empty condition strings are considerde always true. This removes a lot of clutter from the generated matcher tables. This shrinks the source size of AMDGPUGenDAGISel.inc from 7.3M to 6.1M. llvm-svn: 367326
* Address post commit review comments on revision 366727.Sean Fertile2019-07-303-11/+11
| | | | | | | | | | | | | Addresses number of comment made on D64652 after commiting: - Reorders function decls in the TargetLoweringObjectFileXCOFF class. - Fix comment in MCSectionXCOFF to include description of external reference csects. - Convert several llvm_unreachables to report_fatal_error - Convert several dyn_casts to casts as they are expected not to fail. - Avoid copying DataLayout object. llvm-svn: 367324
* [InstCombine] Fold "x ?% y ==/!= 0" to "x & (y-1) ==/!= 0" iff y is ↵Roman Lebedev2019-07-303-0/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | power-of-two Summary: I have stumbled into this by accident while preparing to extend backend `x s% C ==/!= 0` handling. While we did happen to handle this fold in most of the cases, the folding is indirect - we fold `x u% y` to `x & (y-1)` (iff `y` is power-of-two), or first turn `x s% -y` to `x u% y`; that does handle most of the cases. But we can't turn `x s% INT_MIN` to `x u% -INT_MIN`, and thus we end up being stuck with `(x s% INT_MIN) == 0`. There is no such restriction for the more general fold: https://rise4fun.com/Alive/IIeS To be noted, the fold does not enforce that `y` is a constant, so it may indeed increase instruction count. This is consistent with what `x u% y`->`x & (y-1)` already does. I think it makes sense, it's at most one (simple) extra instruction, while `rem`ainder is really much more un-simple (and likely **very** costly). Reviewers: spatel, RKSimon, nikic, xbolva00, craig.topper Reviewed By: RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65046 llvm-svn: 367322
* [X86] SimplifyDemandedVectorEltsForTargetNode should be calling ↵Simon Pilgrim2019-07-301-0/+1
| | | | | | | | resolveTargetShuffleInputs not getTargetShuffleMask Add TODO comment. llvm-svn: 367318
* [X86][AVX] SimplifyDemandedVectorElts - handle extraction from ↵Simon Pilgrim2019-07-301-8/+10
| | | | | | | | | | X86ISD::SUBV_BROADCAST source (PR42819) PR42819 showed an issue that we couldn't handle the case where we demanded a 'sub-sub-vector' of the SUBV_BROADCAST 'sub-vector' source. This patch recognizes these cases and extracts the sub-sub-vector instead of trying to broadcast to a type smaller than the 'sub-vector' source. llvm-svn: 367306
* [ARM][LowOverheadLoops] Enable by defaultSam Parker2019-07-301-1/+1
| | | | | | | | | The code is now in a good enough state to pass the bunch of tests that I have run (after fixing the bugs), so let's enable it by default. Differential Revision: https://reviews.llvm.org/D65277 llvm-svn: 367297
* [ARM][LowOverheadLoops] Revert non-header LE targetSam Parker2019-07-301-3/+9
| | | | | | | | | Revert the hardware loop upon finding a LoopEnd that doesn't target the loop header, instead of asserting a failure. Differential Revision: https://reviews.llvm.org/D65268 llvm-svn: 367296
* Revert "[DivRemPairs] Handling for expanded-form rem - recomposition (PR42673)"Roman Lebedev2019-07-301-97/+13
| | | | | | | | | | | test-suite/MultiSource/Benchmarks/DOE-ProxyApps-C/miniGMG broke: Only PHI nodes may reference their own value! %sub33 = srem i32 %sub33, %ranks_in_i This reverts commit r367288. llvm-svn: 367289
* [DivRemPairs] Handling for expanded-form rem - recomposition (PR42673)Roman Lebedev2019-07-301-13/+97
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: While `-div-rem-pairs` pass can decompose rem in div+rem pair when div-rem pair is unsupported by target, nothing performs the opposite fold. We can't do that in InstCombine or DAGCombine since neither of those has access to TTI. So it makes most sense to teach `-div-rem-pairs` about it. If we matched rem in expanded form, we know we will be able to place div-rem pair next to each other so we won't regress the situation. Also, we shouldn't decompose rem if we matched already-decomposed form. This is surprisingly straight-forward otherwise. https://bugs.llvm.org/show_bug.cgi?id=42673 Reviewers: spatel, RKSimon, efriedma, ZaMaZaN4iK, bogner Reviewed By: bogner Subscribers: bogner, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65298 llvm-svn: 367288
* Revert "[llvm-objdump] Add warning messages if disassembly + source for ↵Michael Pozulp2019-07-302-7/+12
| | | | | | | | | | problematic inputs" This reverts r367284 (git commit b1cbe51bdf44098c74f5c74b7bcd8c041a7c6772). My changes to LLVMSymbolizer caused a test to fail: http://lab.llvm.org:8011/builders/clang-ppc64be-linux-lnt/builds/29488 llvm-svn: 367286
* [llvm-objdump] Add warning messages if disassembly + source for problematic ↵Michael Pozulp2019-07-302-12/+7
| | | | | | | | | | | | | | | | | | inputs Summary: Addresses https://bugs.llvm.org/show_bug.cgi?id=41905 Reviewers: jhenderson, rupprecht, grimar Reviewed By: jhenderson, grimar Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62462 llvm-svn: 367284
* [FileCollector] Add a VFS that records FS accesses using the FileCollectorAlex Lorenz2019-07-291-0/+79
| | | | | | | | | | This patch adds a VFS that can be overlaid on top of another VFS to record file system accesses using the FileCollector. This can help to gather files that are needed for reproducers. Differential Revision: https://reviews.llvm.org/D65411 llvm-svn: 367278
* [IR] Consolidate fixed metadata kind definitions (NFC)Vedant Kumar2019-07-291-28/+3
| | | | | | | | | | Put the list of fixed metadata kinds in one place. Testing: check-llvm with+without LLVM_ENABLE_MODULES=On Differential Revision: https://reviews.llvm.org/D64437 llvm-svn: 367257
* [PowerPC][NFC]Fix a typo in comment.Jinsong Ji2019-07-291-1/+1
| | | | llvm-svn: 367252
* [X86] Fix typo in comment. We're looking at a right shift not a left shift. NFCCraig Topper2019-07-291-1/+1
| | | | llvm-svn: 367251
* [Remarks] Update error message format stringFrancis Visoiu Mistrih2019-07-291-4/+4
| | | | | | | All the clang-cmake-armv{7,8} bots are failing this test. This is an attempt to fix this. llvm-svn: 367243
* ThinLTOBitcodeWriter: Include globals associated with type metadata globals ↵Peter Collingbourne2019-07-291-3/+11
| | | | | | | | | | | in the merged module. Globals that are associated with globals with type metadata need to appear in the merged module because they will reference the global's section directly. Differential Revision: https://reviews.llvm.org/D65312 llvm-svn: 367242
* [X86] resolveTargetShuffleInputs - add depth to limit recursion.Simon Pilgrim2019-07-291-15/+19
| | | | | | Avoids slow downs from calls to ComputeNumSignBits/computeKnownBits going too deep. llvm-svn: 367240
* AMDGPU/LoadStoreOptimizer: combine MMOs when merging instructionsTom Stellard2019-07-291-3/+38
| | | | | | | | | | | | | | | | | | | | | | | Summary: The LoadStoreOptimizer was creating instructions with 2 MachineMemOperands, which meant they were assumed to alias with all other instructions, because MachineInstr:mayAlias() returns true when an instruction has multiple MachineMemOperands. This was preventing these instructions from being merged again, and was giving the scheduler less freedom to reorder them. Reviewers: arsenm, nhaehnle Reviewed By: arsenm Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65036 llvm-svn: 367237
* [AMDGPU] Fix typo in error messageJay Foad2019-07-291-1/+1
| | | | llvm-svn: 367235
* [X86] combineX86ShufflesRecursively - start recursion at depth = 0. NFCI.Simon Pilgrim2019-07-291-18/+18
| | | | | | | | As discussed on rL367171, we have a problem where the depth recursion used in combineX86ShufflesRecursively was subtly different to computeKnownBits etc. - it starts at Depth=1 instead of Depth=0 like the others and has a different maximum recursion depth. This NFC patch fixes the recursion depth to start at 0, so we can more easily reuse depth values in calls from combineX86ShufflesRecursively and its helper functions in computeKnownBits etc. llvm-svn: 367232
* [RISCV] Fix uninitialized variable after call to evaluateConstantImmFrancis Visoiu Mistrih2019-07-291-22/+22
| | | | | | | | | | | | | | | | | For llvm/test/MC/RISCV/rv64i-aliases-invalid.s, UBSan reports: lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp:371:9: runtime error: load of value 3879186881, which is not a valid value for type 'RISCVMCExpr::VariantKind' SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp:371:9 in It turns out that evaluateConstantImm does not set `VK` and it remains unitialized when doing comparisons in `isImmXLenLI()`. Differential Revision: https://reviews.llvm.org/D65347 llvm-svn: 367230
* [InstCombine] fold fadd+fneg with fdiv/fmul betweenaSanjay Patel2019-07-291-0/+18
| | | | | | | | | The backend already does this via isNegatibleForFree(), but we may want to alter the fneg IR canonicalizations that currently exist, so we need to try harder to fold fneg in IR to avoid regressions. llvm-svn: 367227
* [ValueTracking] Remove volatile check in ↵Hideto Ueno2019-07-291-15/+2
| | | | | | | | | | | | | | | | | | isGuaranteedToTransferExecutionToSuccessor Summary: As clarified in D53184, volatile load and store do not trap. Therefore, we should remove volatile checks for instructions in `isGuaranteedToTransferExecutionToSuccessor`. Reviewers: jdoerfert, efriedma, nikic Reviewed By: nikic Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65375 llvm-svn: 367226
* [InstCombine] reduce code for fadd with fneg operand; NFCSanjay Patel2019-07-291-7/+4
| | | | llvm-svn: 367224
* [DAGCombine] narrowInsertExtractVectorBinOp - early out for binops that ↵Simon Pilgrim2019-07-291-0/+4
| | | | | | | | change value type. NFCI. This is implicit in the value type checks in getSubVectorSrc - this just makes it upfront and obvious. llvm-svn: 367220
* [DivergenceAnalysis] Add methods for querying divergence at useJay Foad2019-07-293-11/+36
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: The existing isDivergent(Value) methods query whether a value is divergent at its definition. However even if a value is uniform at its definition, a use of it in another basic block can be divergent because of divergent control flow between the def and the use. This patch adds new isDivergent(Use) methods to DivergenceAnalysis, LegacyDivergenceAnalysis and GPUDivergenceAnalysis. This might allow D63953 or other similar workarounds to be removed. Reviewers: alex-t, nhaehnle, arsenm, rtaylor, rampitec, simoll, jingyue Reviewed By: nhaehnle Subscribers: jfb, jvesely, wdng, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65141 llvm-svn: 367218
* [NFC][ARM[ParallelDSP] Cleanup of BinOpChainSam Parker2019-07-291-81/+58
| | | | | | | | | | | | - Remove some unused typedefs. - Rename BinOpChain struct to MulCandidate. - Remove the size method of MulCandidate. - Store only the first input of the ValueList provided to MulCandidate, as it's the only value we care about. This means we don't have to perform any ugly (and unnecessary) iterations of the list later on. llvm-svn: 367208
* [AMDGPU] Enable v4f16 and above for v_pk_fma instructionsDavid Stuttard2019-07-292-0/+28
| | | | | | | | | | | | | | | | | | | | Summary: If isel is presented with <2 x half> vectors then it will correctly select v_pk_fma style instructions. If isel is presented with e.g. <4 x half> vectors it will scalarize, unlike for other instruction types (such as fadd, fmul etc.) Added extra support to enable this. Updated one of the tests to include a test for this (as well as extending the test to GFX9) Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65325 Change-Id: I50a4577a3f8223fb53992af3b7d26121f65b71ee llvm-svn: 367206
* [NFC][ARM][ParallelDSP] Remove AreSymmetricalSam Parker2019-07-291-43/+0
| | | | | | | We explicitly search for a parallel mac and we only care about its inputs, checking for symmetry doesn't add anything here. llvm-svn: 367205
* [NFC][ARM][ParallelDSP] Remove PopulateLoadsSam Parker2019-07-291-9/+0
| | | | | | | | We no longer have to check what loads are used, all this is performed at the start of the transform, so it's not doing anything now. llvm-svn: 367204
* [X86] Don't use PMADDWD for vector add reductions of multiplies if the mul ↵Craig Topper2019-07-291-12/+22
| | | | | | | | | | | | | | | | | | | | | inputs have an additional user. The pmaddwd inserts a truncate, if that truncate would end up creating additional instructions instead of making a zext narrower, then we shouldn't do it. I've restricted this to only sse4.1 targets since on prior targets the zext will be done in stages. So the truncate will probably not create additional instructions. Might need some more investigation of mul shrinking and the other pmaddwd transform to be sure this is the right decision. There might be a slight regression on AVX1 targets due to add splitting. Hard to say for sure. Maybe we need to look into using the vector reduction flag to use 2 narrow loads and a blend instead of extracting and inserting. llvm-svn: 367198
* [X86] In combineLoopMAddPattern and combineLoopSADPattern, preserve the ↵Craig Topper2019-07-281-78/+63
| | | | | | | | | | vector reduction flag on the final add. Handle unrolled loops by letting DAG combine revisit. This reverts r340478 and r340631 and replaces them with a simpler method of just letting DAG combine revisit the nodes to handle the other operand. llvm-svn: 367195
* [InstCombine] fold fsub+fneg with fdiv/fmul betweenSanjay Patel2019-07-281-0/+15
| | | | | | | | | The backend already does this via isNegatibleForFree(), but we may want to alter the fneg IR canonicalizations that currently exist, so we need to try harder to fold fneg in IR to avoid regressions. llvm-svn: 367194
* [ARM] MVE VPNOTDavid Green2019-07-282-3/+22
| | | | | | | | | | This adds the patterns required to transform xor P0, -1 to a VPNOT. The instruction operands have to change a little for this, adding an in and an out VCCR reg and using a custom DecodeMVEVPNOT for the decode. Differential Revision: https://reviews.llvm.org/D65133 llvm-svn: 367192
* [ARM] Better patterns for fp <> predicate vectorsDavid Green2019-07-282-4/+26
| | | | | | | | | | These are some better patterns for converting between predicates and floating points. Much like the extends, we select "1"/"-1" or "0" depending on the predicate value. Or we perform a compare against 0 to convert to a predicate. Differential Revision: https://reviews.llvm.org/D65103 llvm-svn: 367191
* [Attributor] Deduce "align" attributeHideto Ueno2019-07-281-0/+236
| | | | | | | | | | | | | | | | | Summary: Deduce "align" attribute in attributor. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64152 llvm-svn: 367187
* [IR] Fix getPointerAlignment for CallBaseHideto Ueno2019-07-281-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: In current getPointerAlignemnt implementation, CallBase.getPointerAlignement(..) checks only parameter attriutes in the callsite. For example, ``` declare align 8 i8* @foo() define void @bar() { %a = tail call align 8 i8* @foo() ; getPointerAlignment returns 8 %b = tail call i8* @foo() ; getPointerAlignemnt returns 0 ret void } ``` This patch will fix the problem. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65281 llvm-svn: 367185
* [DAGCombine] narrowInsertExtractVectorBinOp - early out for illegal op. NFCI.Simon Pilgrim2019-07-271-1/+4
| | | | | | If the subvector binop is illegal then early-out and avoid the subvector searches. llvm-svn: 367181
OpenPOWER on IntegriCloud