summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [DAGCombiner][X86][AArch64] (x - C) + y -> (x + y) - C fold. Try 2Roman Lebedev2019-05-301-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Only vector tests are being affected here, since subtraction by scalar constant is rewritten as addition by negated constant. No surprising test changes. https://rise4fun.com/Alive/pbT This is a recommit, originally committed in rL361852, but reverted to investigate test-suite compile-time hangs. Reviewers: RKSimon, craig.topper, spatel Reviewed By: RKSimon Subscribers: javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62257 llvm-svn: 362146
* [DAGCombine] (x - C) - y -> (x - y) - C fold. Try 3Roman Lebedev2019-05-301-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | Summary: Again only vectors affected. Frustrating. Let me take a look into that.. https://rise4fun.com/Alive/AAq This is a recommit, originally committed in rL361852, but reverted to investigate test-suite compile-time hangs, and then reverted in rL362109 to fix missing constant folds that were causing endless combine loops. Reviewers: RKSimon, craig.topper, spatel Reviewed By: RKSimon Subscribers: javed.absar, JDevlieghere, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62294 llvm-svn: 362145
* [DAGCombine][X86][AArch64][AMDGPU] (x - y) + -1 -> add (xor y, -1), x ↵Roman Lebedev2019-05-301-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | fold. Try 3 Summary: This prevents regressions in next patch, and somewhat recovers from the regression to AMDGPU test in D62223. It is indeed not great that we leave vector decrement, don't transform it into vector add all-ones.. https://rise4fun.com/Alive/ZRl This is a recommit, originally committed in rL361852, but reverted to investigate test-suite compile-time hangs, and then reverted in rL362109 to fix missing constant folds that were causing endless combine loops. Reviewers: RKSimon, craig.topper, spatel, arsenm Reviewed By: RKSimon, arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, javed.absar, dstuttard, tpr, t-tye, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62263 llvm-svn: 362144
* [DAGCombiner][X86][AArch64][SPARC][SystemZ] y - (x + C) -> (y - x) - C ↵Roman Lebedev2019-05-301-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | fold. Try 3 Summary: Direct sibling of D62223 patch. While i don't have a direct motivational pattern for this, it would seem to make sense to handle both patterns (or none), for symmetry? The aarch64 changes look neutral; sparc and systemz look like improvement (one less instruction each); x86 changes - 32bit case improves, 64bit case shows that LEA no longer gets constructed, which may be because that whole test is `-mattr=+slow-lea,+slow-3ops-lea` https://rise4fun.com/Alive/ffh This is a recommit, originally committed in rL361852, but reverted to investigate test-suite compile-time hangs, and then reverted in rL362109 to fix missing constant folds that were causing endless combine loops. Reviewers: RKSimon, craig.topper, spatel, t.p.northover Reviewed By: t.p.northover Subscribers: t.p.northover, jyknight, javed.absar, kristof.beyls, fedor.sergeev, jrtc27, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62252 llvm-svn: 362143
* [DAGCombiner][X86][AArch64][AMDGPU] (x + C) - y -> (x - y) + C fold. Try 3Roman Lebedev2019-05-301-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The main motivation is shown by all these `neg` instructions that are now created. In particular, the `@reg32_lshr_by_negated_unfolded_sub_b` test. AArch64 test changes all look good (`neg` created), or neutral. X86 changes look neutral (vectors), or good (`neg` / `xor eax, eax` created). I'm not sure about `X86/ragreedy-hoist-spill.ll`, it looks like the spill is now hoisted into preheader (which should still be good?), 2 4-byte reloads become 1 8-byte reload, and are elsewhere, but i'm not sure how that affects that loop. I'm unable to interpret AMDGPU change, looks neutral-ish? This is hopefully a step towards solving [[ https://bugs.llvm.org/show_bug.cgi?id=41952 | PR41952 ]]. https://rise4fun.com/Alive/pkdq (we are missing more patterns, i'll submit them later) This is a recommit, originally committed in rL361852, but reverted to investigate test-suite compile-time hangs, and then reverted in rL362109 to fix missing constant folds that were causing endless combine loops. Reviewers: craig.topper, RKSimon, spatel, arsenm Reviewed By: RKSimon Subscribers: bjope, qcolombet, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, javed.absar, dstuttard, tpr, t-tye, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62223 llvm-svn: 362142
* [RuntimeDyld] Apply padding and alignment bumps to all sections with stubs, andLang Hames2019-05-302-7/+6
| | | | | | | | | | | | | increase the MachO/x86-64 stub alignment to 8. Stub alignment should be guaranteed for any section containing RuntimeDyld stubs/GOT-entries. To do this we should pad and align all sections containing stubs, not just code sections. This commit also bumps the MachO/x86-64 stub alignment to 8, so that GOT entries will be aligned. llvm-svn: 362139
* AMDGPU/GlobalISel: Add wave scratch offset argumentMatt Arsenault2019-05-301-0/+42
| | | | | | Avoids crashing in PEI in a future change. llvm-svn: 362136
* [DAGCombine] ((c1-A)-c2) -> ((c1-c2)-A) constant-foldRoman Lebedev2019-05-301-0/+10
| | | | | | | | | | | | | | | | Summary: https://rise4fun.com/Alive/B0A Reviewers: t.p.northover, RKSimon, spatel, craig.topper Reviewed By: RKSimon Subscribers: javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62691 llvm-svn: 362135
* [DAGCombine] (A-C1)-C2 -> A-(C1+C2) constant-foldRoman Lebedev2019-05-301-0/+10
| | | | | | | | | | | | | | | | Summary: https://rise4fun.com/Alive/Mb1M Reviewers: RKSimon, craig.topper, spatel, t.p.northover Reviewed By: t.p.northover Subscribers: t.p.northover, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62689 llvm-svn: 362134
* [DAGCombine] (A+C1)-C2 -> A+(C1-C2) constant-foldRoman Lebedev2019-05-301-0/+10
| | | | | | | | | | | | | | | | | | | Summary: Direct sibling of D62662, the root cause of the endless combine loop in D62257 https://rise4fun.com/Alive/d3W Reviewers: RKSimon, craig.topper, spatel, t.p.northover Reviewed By: t.p.northover Subscribers: t.p.northover, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62664 llvm-svn: 362133
* [DAGCombine] Use FoldConstantArithmetic() to perform C2-(A+C1) -> (C2-C1)-A foldRoman Lebedev2019-05-301-1/+3
| | | | | | | | | | | | | | | | | Summary: No tests change, and i'm not sure how to test this, but it's better safe than sorry. Reviewers: spatel, RKSimon, craig.topper, t.p.northover Reviewed By: craig.topper Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62663 llvm-svn: 362132
* [DAGCombine] ((A-c1)+c2) -> (A+(c2-c1)) constant-foldRoman Lebedev2019-05-301-0/+9
| | | | | | | | | | | | | | | | | | | Summary: This was the root cause of the endless combine loop in D62257 https://rise4fun.com/Alive/d3W Reviewers: RKSimon, spatel, craig.topper, t.p.northover Reviewed By: t.p.northover Subscribers: t.p.northover, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62662 llvm-svn: 362131
* [DAGCombine] Use FoldConstantArithmetic() to perform ((c1-A)+c2) -> ↵Roman Lebedev2019-05-301-4/+4
| | | | | | | | | | | | | | | | | | (c1+c2)-A fold Summary: No tests change, and i'm not sure how to test this, but it's better safe than sorry. Reviewers: spatel, RKSimon, craig.topper, t.p.northover Reviewed By: craig.topper Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62661 llvm-svn: 362130
* Reapply: IR: add optional type to 'byval' function parametersTim Northover2019-05-3015-27/+294
| | | | | | | | | | | | | | | | | When we switch to opaque pointer types we will need some way to describe how many bytes a 'byval' parameter should occupy on the stack. This adds a (for now) optional extra type parameter. If present, the type must match the pointee type of the argument. The original commit did not remap byval types when linking modules, which broke LTO. This version fixes that. Note to front-end maintainers: if this causes test failures, it's probably because the "byval" attribute is printed after attributes without any parameter after this change. llvm-svn: 362128
* [AMDGPU] Added target-specific attribute amdgpu-max-memory-clauseTim Renouf2019-05-301-1/+3
| | | | | | | | | | | | | | | | With LLPC, previous investigation has suggested that si-scheduler interacts badly with SiFormMemoryClauses on an XNACK target in some games. That needs further investigation in the future. In the meantime, this commit adds a target-specific attribute to allow us to disable SIFormMemoryClauses by setting it to 1 on a per-function basis for LLPC to use. Differential Revision: https://reviews.llvm.org/D62572 Change-Id: Ia0ca12ce79093cbbe86caded723ffb13384ede92 llvm-svn: 362127
* [LV] Remove the redundant using LoopVectorizationPlanner:VPlanPtrFlorian Hahn2019-05-302-7/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | VPlan.h already contains the declaration of VPlanPtr type alias: using VPlanPtr = std::unique_ptr<VPlan>; The LoopVectorizationPlanner class also contains the same declaration of VPlanPtr and therefore LoopVectorize requires a long wording when its methods return VPlanPtr: LoopVectorizationPlanner::VPlanPtr LoopVectorizationPlanner::buildVPlanWithVPRecipes(...) but LoopVectorize.cpp includes VPlan.h (via LoopVectorizationPlanner.h) and can use VPlanPtr from that header. Patch by Pavel Samolysov. Reviewers: hsaito, rengolin, fhahn Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D62576 llvm-svn: 362126
* [LoopVectorize] Add FNeg instruction supportCraig Topper2019-05-301-9/+20
| | | | | | Differential Revision: https://reviews.llvm.org/D62510 llvm-svn: 362124
* [MIR-Canon] Add support for rewriting VRegs that are typed but don't have an RC.Puyan Lotfi2019-05-301-5/+6
| | | | | | | | | | | There were crashes (addrspace-memoperands.mir was only one of them) in MIR that had operands that came from before register classes were set. With these operands, creating a replacement vreg (for MIR-Canon's renaming) needs to use the vreg type rather than the RegisterClass which is not present. Differential Revision: https://reviews.llvm.org/D62543 llvm-svn: 362122
* Correct error in revert of r362112.Kevin P. Neal2019-05-301-1/+1
| | | | | | Differential Revision: http://reviews.llvm.org/D62546 llvm-svn: 362118
* Revert r362112, it broke the bots with the message "Unsupported vector ↵Kevin P. Neal2019-05-302-63/+1
| | | | | | | | argument or return type" Differential Revision: http://reviews.llvm.org/D62546 llvm-svn: 362117
* [FPEnv] Added a special UnrollVectorOp method to deal with the chain on ↵Kevin P. Neal2019-05-302-1/+63
| | | | | | | | | | | | | StrictFP opcodes This change creates UnrollVectorOp_StrictFP. The purpose of this is to address a failure that consistently occurs when calling StrictFP functions on vectors whose number of elements is 3 + 2n on most platforms, such as PowerPC or SystemZ. The old UnrollVectorOp method does not expect that the vector that it will unroll will have a chain, so it has an assert that prevents it from running if this is the case. This new StrictFP version of the method deals with the chain while unrolling the vector. With this new function in place during vector widending, llc can run vector-constrained-fp-intrinsics.ll for SystemZ successfully. Submitted by: Drew Wock <drew.wock@sas.com> Reviewed by: Cameron McInally, Kevin P. Neal Approved by: Cameron McInally Differential Revision: http://reviews.llvm.org/D62546 llvm-svn: 362112
* [DAGCombine] Revert of recommit of "binop-with-const hoisting" patchesRoman Lebedev2019-05-301-34/+0
| | | | | | | | | | | | | | I was looking into an endless combine loop the uncommitted follow-up patch was causing, and it appears even these patches can exibit such an endless loop. The root cause is that we try to hoist one binop (add/sub) with constant operand, and if we get two such binops both of which are eligible for this hoisting, we get stuck. Some cases may highlight missing constant-folds. Reverts r361871,r361872,r361873,r361874. llvm-svn: 362109
* [NFC][ARM][ParallelDSP] Refactor narrow sequenceSam Parker2019-05-301-48/+19
| | | | | | | Most of the code used for finding a 'narrow' sequence is not used, so I've removed it and simplified the calls from the smlad matcher. llvm-svn: 362104
* [ARM] Change the MC names for VMAXNM/VMINNMSjoerd Meijer2019-05-303-35/+36
| | | | | | | | | | | | | Now the NEON ones have a prefix "NEON_", and the VFP ones have a prefix "VFP_". This is so that the regex in ARMScheduleA57.td can be made to match both of _those_ classes of VMAXNM without also matching the MVE ones that are going to be introduced soon. NFCI. Patch by Simon Tatham. Differential Revision: https://reviews.llvm.org/D60700 llvm-svn: 362097
* [ARM] LowerVECTOR_SHUFFLE - fix uninitialized variable warnings. NFCI.Simon Pilgrim2019-05-301-4/+4
| | | | llvm-svn: 362094
* [LoopIdiom] Basic OptimizationRemarkEmitter handlingRoman Lebedev2019-05-301-4/+40
| | | | | | | | | | | | | | | | | | Summary: I'm adding ORE to memset/memcpy formation, with tests, but mainly this is split off from D61144. Reviewers: reames, anemet, thegameg, craig.topper Reviewed By: thegameg Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62631 llvm-svn: 362092
* [LoopIdiomRecognize][NFC] Sort includesRoman Lebedev2019-05-301-2/+2
| | | | | | Split off from D61144 llvm-svn: 362091
* [ARM] add target arch definitions for 8.1-M and MVESjoerd Meijer2019-05-309-3/+96
| | | | | | | | | | | | | | | | | This adds: - LLVM subtarget features to make all the new instructions conditional on, - CPU and FPU names for use on clang's command line, with default FPUs set so that "armv8.1-m.main+fp" and "armv8.1-m.main+fp.dp" will select the right FPU features, - architecture extension names "mve" and "mve.fp", - ABI build attribute support for v8.1-M (a new value for Tag_CPU_arch) and MVE (a new actual tag). Patch mostly by Simon Tatham. Differential Revision: https://reviews.llvm.org/D60698 llvm-svn: 362090
* [ARM] Introduce separate features for FP registersSjoerd Meijer2019-05-305-18/+69
| | | | | | | | | | | | | | | | | The MVE extension in Arm v8.1-M permits the use of some move, load and store isntructions which access the FP registers, even if there's no actual FP support in the processor (in particular, if you have the integer-only version of MVE). Therefore, we need separate subtarget features to condition those instructions on, which are implied by both FP and MVE but are not part of either. Patch mostly by Simon Tatham. Differential Revision: https://reviews.llvm.org/D60694 llvm-svn: 362088
* [X86][SSE] Improve bool vector extload (PR26091)Simon Pilgrim2019-05-301-0/+15
| | | | | | | | We already have good codegen for (vXiY *ext(vXi1 bitcast(iX))) cases, this patch uses it for loads of vXi1 types as well - changing the load into a iX integer load, and bitcasting so that combineToExtendBoolVectorInReg can then use it. Differential Revision: https://reviews.llvm.org/D62449 llvm-svn: 362081
* [AArch64][SVE2] Asm: support SVE2 vector splice (constructive)Cullen Rhodes2019-05-302-0/+27
| | | | | | | | | | | | Summary: The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D62530 llvm-svn: 362073
* [AArch64][SVE2] Asm: support SVE2 load instructionsCullen Rhodes2019-05-302-0/+55
| | | | | | | | | | | | | | | Summary: Patch adds support for the following instructions: * LDNT1SB, LDNT1B, LDNT1SH, LDNT1H, LDNT1SW, LDNT1W, LDNT1D The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D62528 llvm-svn: 362072
* [AArch64][SVE2] Asm: support FCVTX/FLOGB instructionsCullen Rhodes2019-05-302-0/+12
| | | | | | | | | | | | | | | | | Summary: Patch completes SVE2 support for: SVE Floating Point Unary Operations - Predicated Group The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D62526 llvm-svn: 362071
* [AArch64][SVE2] Asm: add ext (immediate offset, constructive) instructionCullen Rhodes2019-05-302-0/+18
| | | | | | | | | | | | Summary: The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: chill Differential Revision: https://reviews.llvm.org/D62518 llvm-svn: 362070
* [ARM] Add an MVE execution domainSjoerd Meijer2019-05-302-6/+8
| | | | | | | | | | | | | | | | | | | | | | MVE architecturally specifies a 'beat' system in which a vector instruction executed now will complete its actual operation over the next four cycles, so it can overlap with the execution of the previous and next MVE instruction. This makes it generally an advantage to avoid moving values back and forth between MVE registers and anywhere else, if there's any sensible way to do the same processing in whatever register type the values already occupied. That's just what the 'execution domain' system is supposed to achieve. So here we add a new execution domain which will contain all the MVE vector instructions when they are added. Patch by: Simon Tatham Differential Revision: https://reviews.llvm.org/D60703 llvm-svn: 362068
* [LV] Inform about exactly reason of loop illegalityFlorian Hahn2019-05-301-2/+10
| | | | | | | | | | | | | | | | | | | | | | | Currently, only the following information is provided by LoopVectorizer in the case when the CF of the loop is not legal for vectorization: LV: Can't vectorize the instructions or CFG LV: Not vectorizing: Cannot prove legality. But this information is not enough for the root cause analysis; what is exactly wrong with the loop should also be printed: LV: Not vectorizing: The exiting block is not the loop latch. Patch by Pavel Samolysov. Reviewers: mkuper, hsaito, rengolin, fhahn Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D62311 llvm-svn: 362056
* [X86] Add ENQCMD instructionsPengfei Wang2019-05-307-0/+75
| | | | | | | | | | | | For more details about these instructions, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference. Patch by Tianqing Wang (tianqing) Differential Revision: https://reviews.llvm.org/D62281 llvm-svn: 362053
* CodeView - add static data members to global variable debug info.Amy Huang2019-05-291-1/+6
| | | | | | | | | | | | | | | | | | Summary: Add static data members to IR debug info's list of global variables so that they are emitted as S_CONSTANT records. Related to https://bugs.llvm.org/show_bug.cgi?id=41615. Reviewers: rnk Subscribers: aprantl, cfe-commits, llvm-commits, thakis Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D62167 llvm-svn: 362038
* LoopVersioningLICM: Respect convergent and noduplicateMatt Arsenault2019-05-291-1/+9
| | | | llvm-svn: 362031
* Revert "IR: add optional type to 'byval' function parameters"Tim Northover2019-05-2913-252/+26
| | | | | | | The IRLinker doesn't delve into the new byval attribute when mapping types, and this breaks LTO. llvm-svn: 362029
* [LoopIdiomRecognize][NFC] Use DEBUG_TYPE, add LLVM_DEBUG() to ↵Roman Lebedev2019-05-291-2/+8
| | | | | | | | runOnNoncountableLoop() Split off from D61144 llvm-svn: 362022
* [ARC] Cleanup ARCAsmPrinter.Pete Couperus2019-05-291-16/+0
| | | | | | | | | | | | | | | | Summary: Remove unused getTargetStreamer. Remove unused headers. Reviewers: dantrushin Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62549 llvm-svn: 362021
* [DAGCombiner] Replace gathers with a zero mask with the passthru valueBenjamin Kramer2019-05-291-3/+7
| | | | | | | | | | These can be created by the legalizer when splitting a larger gather. See https://llvm.org/PR42055 for a motivating example. Differential Revision: https://reviews.llvm.org/D62613 llvm-svn: 362015
* IR: add optional type to 'byval' function parametersTim Northover2019-05-2913-26/+252
| | | | | | | | | | | | | | When we switch to opaque pointer types we will need some way to describe how many bytes a 'byval' parameter should occupy on the stack. This adds a (for now) optional extra type parameter. If present, the type must match the pointee type of the argument. Note to front-end maintainers: if this causes test failures, it's probably because the "byval" attribute is printed after attributes without any parameter after this change. llvm-svn: 362012
* [InstCombine] Optimize always overflowing signed saturating add/subNikita Popov2019-05-291-8/+12
| | | | | | | | | Based on the overflow direction information added in D62463, we can now fold always overflowing signed saturating add/sub to signed min/max. Differential Revision: https://reviews.llvm.org/D62544 llvm-svn: 362006
* AMDGPU: Return address loweringAakanksha Patil2019-05-292-1/+27
| | | | | | | | The patch computes the return address for the current function. Differential revision: https://reviews.llvm.org/D59666 llvm-svn: 362001
* CallSiteSplitting: Respect convergent and noduplicateMatt Arsenault2019-05-291-0/+3
| | | | llvm-svn: 361990
* [ThinLTO] Use original alias visibility when importingTeresa Johnson2019-05-291-2/+3
| | | | | | | | | | | | | | | | | | | | | Summary: When we import an alias, we do so by making a clone of the aliasee. Just as this clone uses the original alias name and linkage, it should also use the same visibility (not the aliasee's visibility). Otherwise, linker behavior is affected (e.g. if the aliasee was hidden, but the alias is not, the resulting imported clone should not be hidden, otherwise the linker will make the final symbol hidden which is incorrect). Reviewers: wmi Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62535 llvm-svn: 361989
* Qualify use of llvm::empty that's ambiguous with std::emptySam McCall2019-05-291-1/+1
| | | | llvm-svn: 361968
* [mips] Iterate over MSACtrlRegClass to reserve all MSA control registers. NFCSimon Atanasyan2019-05-291-8/+2
| | | | llvm-svn: 361965
OpenPOWER on IntegriCloud