path: root/llvm/lib
Commit message | Author | Age | Files | Lines
* [NFC] Encapsulate work with BlockColors in LoopSafetyInfoMax Kazantsev2018-10-162-10/+22
| | | | llvm-svn: 344590
* [DebugInfo][LCSSA] Rewrite pre-existing debug values outside loopDavid Stenberg2018-10-162-0/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Extend LCSSA so that debug values outside loops are rewritten to use the PHI nodes that the pass creates. This fixes PR39019. In that case, we ran LCSSA on a loop that was later on vectorized, which left us with something like this: for.cond.cleanup: %add.lcssa = phi i32 [ %add, %for.body ], [ %34, %middle.block ] call void @llvm.dbg.value(metadata i32 %add, ret i32 %add.lcssa for.body: %add = [...] br i1 %exitcond, label %for.cond.cleanup, label %for.body which later resulted in the debug.value becoming undef when removing the scalar loop (and the location would have probably been wrong for the vectorized case otherwise). As we now may need to query the AvailableVals cache more than once for a basic block, FindAvailableVals() in SSAUpdaterImpl is changed so that it updates the cache for blocks that we do not create a PHI node for, regardless of the block's number of predecessors. The debug value in the attached IR reproducer would not be properly rewritten without this. Debug values residing in blocks where we have not inserted any PHI nodes are currently left as-is by this patch. I'm not sure what should be done with those uses. Reviewers: mattd, aprantl, vsk, probinson Reviewed By: mattd, aprantl Subscribers: jmorse, gbedwell, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D53130 llvm-svn: 344589
* [NFC] Move block throw check inside allLoopPathsLeadToBlockMax Kazantsev2018-10-161-6/+10
| | | | llvm-svn: 344588
* [NFC] Turn isGuaranteedToExecute into a methodMax Kazantsev2018-10-163-12/+12
| | | | llvm-svn: 344587
* [SCEV] Limit AddRec "simplifications" to avoid combinatorial explosionsMax Kazantsev2018-10-161-1/+1
| | | | | | | | | | | | | | | | | | SCEV's transform that turns `{A1,+,A2,+,...,+,An}<L> * {B1,+,B2,+,...,+,Bn}<L>` into a single AddRec of size `2n+1` with complex combinatorial coefficients can easily trigger exponential growth of the SCEV (when nothing gets folded or simplified). We tried to restrain this transform using the option `scalar-evolution-max-add-rec-size`, but its default value does not seem to be small enough: with the default value of `16`, the test attached to this patch produces a SCEV of >3M symbols (when printed out). This patch reduces the simplification limit. It is not a cure for combinatorial explosions, but at least it reduces this corner case to something more or less reasonable. Differential Revision: https://reviews.llvm.org/D53282 Reviewed By: sanjoy llvm-svn: 344584
* [WebAssembly] LSDA info generationHeejin Ahn2018-10-1616-62/+264
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This adds support for LSDA (exception table) generation for wasm EH. Wasm EH mostly follows the structure of Itanium-style exception tables, with one exception: a call site table entry in wasm EH corresponds not to a call site but to a landing pad. In wasm EH, the VM is responsible for stack unwinding. After an exception occurs and the stack is unwound, the control flow is transferred to the wasm 'catch' instruction by the VM, after which the personality function is called from the compiler-generated code. (Refer to WasmEHPrepare pass for more information on this part.) This patch: - Changes wasm.landingpad.index intrinsic to take a token argument, to make this 1:1 match with a catchpad instruction - Stores landingpad index info and catch type info in MachineFunction before instruction selection - Lowers wasm.lsda intrinsic to an MCSymbol pointing to the start of an exception table - Adds WasmException class with overridden methods for table generation - Adds support for LSDA section in Wasm object writer Reviewers: dschuff, sbc100, rnk Subscribers: mgorny, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D52748 llvm-svn: 344575
* [X86] Remove some isel patterns that shouldn't be possible.Craig Topper2018-10-152-6/+0
| | | | | | These included a bitcast of a load from v4f32 to v2f64, but DAG combine should have already changed the type of the load to remove the cast. llvm-svn: 344573
* [ORC] Rename ORC layers to make the "new" ORC layers the default.Lang Hames2018-10-1510-44/+44
| | | | | | | | | | | | | This commit adds a 'Legacy' prefix to old ORC layers and utilities, and removes the '2' suffix from the new ORC layers. If you wish to continue using the old ORC layers you will need to add a 'Legacy' prefix to your classes. If you were already using the new ORC layers you will need to drop the '2' suffix. The legacy layers will remain in-tree until the new layers reach feature parity with them. This will involve adding support for removing code from the new layers, and ensuring that performance is comparable. llvm-svn: 344572
* [ORC] Rename MultiThreadedSimpleCompiler to ConcurrentIRCompiler.Lang Hames2018-10-151-1/+1
| | | | | | | | | | | The new name is a better fit: This class does not actually spawn any new threads for compilation, it is just safe to call from multiple threads concurrently. The "Simple" part of the name did not convey much either, so it was dropped. llvm-svn: 344567
* Change a TerminatorInst* to an Instruction* in HotColdSplitting.cpp.Lang Hames2018-10-151-1/+1
| | | | | | | | | | | r344558 added an assignment to a TerminatorInst* from BasicBlock::getTerminatorInst(), but BasicBlock::getTerminatorInst() returns an Instruction* rather than a TerminatorInst* since r344504 so this fails to compile. Changing the variable to an Instruction* should get the bots building again. llvm-svn: 344566
* [ORC] Switch to DenseMap/DenseSet for ORC symbol map/set types.Lang Hames2018-10-152-29/+38
| | | | llvm-svn: 344565
* NFC: Fix a -Wsign-conversion warningErik Pilkington2018-10-151-5/+11
| | | | llvm-svn: 344564
* [X86] Fix a bad bitcast in the load form of vXi16 uniform shift patterns for EVEX encoded instructions.Craig Topper2018-10-151-9/+10
| | | | llvm-svn: 344563
* [hot-cold-split] fix static analysis of cold regionsSebastian Pop2018-10-151-7/+41
| | | | | | | | | | | | | | | | | | | | | | Make blockEndsInUnreachable match the function of the same name in CodeGen/BranchFolding.cpp. I have also added a note to make sure this function is not modified unless the back-end version is modified as well. An early return before outlining has been added to avoid outlining the full function body when the first block in the function is marked cold. The static analysis of cold code has been amended to avoid marking the whole function as cold by back-propagation, because the back-propagation would mark blocks with return statements as cold. The patch adds debug statements to help discover these problems. Differential Revision: https://reviews.llvm.org/D52904 llvm-svn: 344558
* [AARCH64] Improve vector popcnt lowering with ADDLPSimon Pilgrim2018-10-151-12/+36
| | | | | | | | | | AARCH64 equivalent to D53257 - uses widening pairwise adds on vXi8 CTPOP to support i16/i32/i64 vectors. This is a blocker for generic vector CTPOP expansion (P32655) - this will remove the aarch64 diff from D53258. Differential Revision: https://reviews.llvm.org/D53259 llvm-svn: 344554
* AMDGPU: Generate .amdgcn_target for object code v3Konstantin Zhuravlyov2018-10-151-3/+10
| | | | | | Differential Revision: https://reviews.llvm.org/D53221 llvm-svn: 344552
* [CodeExtractor] Erase debug intrinsics in outlined thunks (fix PR22900)Vedant Kumar2018-10-151-0/+13
| | | | | | | | | | | | | | | Variable updates within the outlined function are invisible to debuggers. This could be improved by defining a DISubprogram for the new function. For the moment, simply erase the debug intrinsics instead. This fixes verifier failures about function-local metadata being used in the wrong function, seen while testing the hot/cold splitting pass. rdar://45142482 Differential Revision: https://reviews.llvm.org/D53267 llvm-svn: 344545
* [SelectionDAG] allow FP binops in SimplifyDemandedVectorEltsSanjay Patel2018-10-151-1/+6
| | | | | | | | | | | | This is intended to make the backend on par with functionality that was added to the IR version of SimplifyDemandedVectorElts in: rL343727 ...and the original motivation is that we need to improve demanded-vector-elements in several ways to avoid problems that would be exposed in D51553. Differential Revision: https://reviews.llvm.org/D52912 llvm-svn: 344541
* [DAGCombiner] allow undef elts in vector fmul matchingSanjay Patel2018-10-151-1/+1
| | | | llvm-svn: 344534
* [DAGCombiner] refactor folds for fadd (fmul X, -2.0), Y; NFCISanjay Patel2018-10-151-16/+18
| | | | | | The transform doesn't work if the vector constant has undef elements. llvm-svn: 344532
* [DAGCombiner] allow undef elts in vector fma matchingSanjay Patel2018-10-151-21/+22
| | | | llvm-svn: 344528
* [DAGCombiner] allow undef elts in vector fma matchingSanjay Patel2018-10-151-9/+10
| | | | llvm-svn: 344525
* Revert "[NewPM] teach -passes= to emit meaningful error messages"Fedor Sergeev2018-10-152-218/+162
| | | | | | This reverts r344519 due to failures in pipeline-parsing test. llvm-svn: 344524
* [NewPM] teach -passes= to emit meaningful error messagesFedor Sergeev2018-10-152-162/+218
| | | | | | | | | | | | | | | Summary: All the PassBuilder::parse interfaces now return a descriptive StringError instead of a plain bool. This makes -passes/aa-pipeline parsing errors context-specific and thus less confusing. TODO: ideally we should also make suggestions for misspelled pass names, but that requires some extensions to PassBuilder. Reviewed By: philip.pfaffe, chandlerc Differential Revision: https://reviews.llvm.org/D53246 llvm-svn: 344519
* [mips][micromips] Fix overlapping FDEs errorAleksandar Beserminji2018-10-152-0/+24
| | | | | | | | | | | | | | When compiling a static executable for micromips, CFI symbols are incorrectly labeled as MICROMIPS, which causes the ".eh_frame_hdr refers to overlapping FDEs." error. This patch does not label CFI symbols as MICROMIPS, and FDEs do not overlap anymore. This patch also exposes another bug, which is fixed here: https://reviews.llvm.org/D52985 Differential Revision: https://reviews.llvm.org/D52987 llvm-svn: 344516
* [mips][micromips] Revert "Fix overlapping FDEs error"Aleksandar Beserminji2018-10-152-24/+0
| | | | | | This reverts r344511. llvm-svn: 344515
* [ARM][NEON] Improve vector popcnt lowering with PADDL (PR39281)Simon Pilgrim2018-10-151-130/+26
| | | | | | | | | | As I suggested on PR39281, this patch uses PADDL pairwise addition to widen from the vXi8 CTPOP result to the target vector type. This is a blocker for moving more x86 code to generic vector CTPOP expansion (P32655 + D53258) - ARM's vXi64 CTPOP currently expands, which would generate a vXi64 MUL but ARM's custom lowering expands the general MUL case and vectors aren't well handled in LegalizeDAG - improving the CTPOP lowering was a lot easier than fixing the MUL lowering for this one case...... Differential Revision: https://reviews.llvm.org/D53257 llvm-svn: 344512
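The widening scheme can be modelled in scalar code by treating a 64-bit integer as a vector of eight i8 lanes. This is a rough sketch under that assumption, not the ARM lowering itself: all function names are hypothetical, and each pairwise add stands in for what a PADDL-style instruction does to adjacent lane pairs.

```cpp
#include <cstdint>

// Model of a v8i8 CTPOP: every byte of the result holds the population
// count of the corresponding byte of V.
constexpr uint64_t ctpopV8i8(uint64_t V) {
  V = V - ((V >> 1) & 0x5555555555555555ull);
  V = (V & 0x3333333333333333ull) + ((V >> 2) & 0x3333333333333333ull);
  return (V + (V >> 4)) & 0x0F0F0F0F0F0F0F0Full;
}

// Pairwise widening adds: each step sums adjacent lanes into one lane of
// twice the width (i8 -> i16, i16 -> i32, i32 -> i64).
constexpr uint64_t pairwiseAddLong8(uint64_t V) {
  return (V & 0x00FF00FF00FF00FFull) + ((V >> 8) & 0x00FF00FF00FF00FFull);
}
constexpr uint64_t pairwiseAddLong16(uint64_t V) {
  return (V & 0x0000FFFF0000FFFFull) + ((V >> 16) & 0x0000FFFF0000FFFFull);
}
constexpr uint64_t pairwiseAddLong32(uint64_t V) {
  return (V & 0x00000000FFFFFFFFull) + (V >> 32);
}

// Wider element types reuse the byte CTPOP and widen the result step by step.
constexpr uint64_t ctpopV4i16(uint64_t V) { return pairwiseAddLong8(ctpopV8i8(V)); }
constexpr uint64_t ctpopV2i32(uint64_t V) { return pairwiseAddLong16(ctpopV4i16(V)); }
constexpr uint64_t ctpopV1i64(uint64_t V) { return pairwiseAddLong32(ctpopV2i32(V)); }
```

No multiply appears anywhere in this chain, which is the property the commit message cares about for the vXi64 case.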
* [mips][micromips] Fix overlapping FDEs errorAleksandar Beserminji2018-10-152-0/+24
| | | | | | | | | | | | | | When compiling a static executable for micromips, CFI symbols are incorrectly labeled as MICROMIPS, which causes the ".eh_frame_hdr refers to overlapping FDEs." error. This patch does not label CFI symbols as MICROMIPS, and FDEs do not overlap anymore. This patch also exposes another bug, which is fixed here: https://reviews.llvm.org/D52985 Differential Revision: https://reviews.llvm.org/D52987 llvm-svn: 344511
* [NewPM] implement SCC printing for -print-before-all/-print-after-allFedor Sergeev2018-10-151-4/+28
| | | | | | | | | | This removes a deficiency of the initial implementation of -print-before-all/-print-after-all: it was effectively skipping IR printing for all the SCC passes. Now LazyCallGraph::SCC gets its IR printed. Reviewed By: skatkov Differential Revision: https://reviews.llvm.org/D53270 llvm-svn: 344505
* [TI removal] Make `getTerminator()` return a generic `Instruction`.Chandler Carruth2018-10-154-14/+15
| | | | | | | | | | | | | | | | | | | | | | | | This removes the primary remaining API producing `TerminatorInst` which will reduce the rate at which code is introduced trying to use it and generally make it much easier to remove the remaining APIs across the codebase. Also clean up some of the stragglers that the previous mechanical update of variables missed. Users of LLVM and out-of-tree code generally will need to update any explicit variable types to handle this. Replacing `TerminatorInst` with `Instruction` (or `auto`) almost always works. Most of these edits were made in prior commits using the perl one-liner: ``` perl -i -ple 's/TerminatorInst(\b.* = .*getTerminator\(\))/Instruction\1/g' ``` This also may break some rare use cases where people overload for both `Instruction` and `TerminatorInst`, but these should be easily fixed by removing the `TerminatorInst` overload. llvm-svn: 344504
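The overload caveat can be illustrated with stand-in types. The classes below only mimic the inheritance relationship between `Instruction` and `TerminatorInst`; they are not the LLVM classes, and `describe`, `getTerminatorOld` and `getTerminatorNew` are hypothetical names used purely for this sketch.

```cpp
// Stand-ins for the real class hierarchy (assumption: only the
// base/derived relationship matters for overload resolution).
struct Instruction {};
struct TerminatorInst : Instruction {};

// A hypothetical helper overloaded on both static types.
constexpr int describe(const Instruction *) { return 0; }    // generic path
constexpr int describe(const TerminatorInst *) { return 1; } // terminator path

constexpr TerminatorInst Term{};

// Old API shape: the accessor returns the narrow type.
constexpr const TerminatorInst *getTerminatorOld() { return &Term; }
// New API shape: the same object, returned through the base type.
constexpr const Instruction *getTerminatorNew() { return &Term; }
```

Overload resolution looks only at the static type, so `describe(getTerminatorOld())` picks the terminator path while `describe(getTerminatorNew())` silently picks the generic one for the same object. Folding the two overloads into the `Instruction` one, as the commit message suggests, removes the ambiguity.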
* [TI removal] Rework `InstVisitor` to support visiting instructions thatChandler Carruth2018-10-152-24/+24
| | | | | | | | | are terminators without relying on the specific `TerminatorInst` type. This required cleaning up two users of `InstVisitor`s usage of `TerminatorInst` as well. llvm-svn: 344503
* [TI removal] Make variables declared as `TerminatorInst` and initializedChandler Carruth2018-10-1558-144/+143
| | | | | | | | | | | | | by `getTerminator()` calls instead be declared as `Instruction`. This is the biggest remaining chunk of the usage of `getTerminator()` that insists on the narrow type and so is an easy batch of updates. Several files saw more extensive updates where this would cascade to requiring API updates within the file to use `Instruction` instead of `TerminatorInst`. All of these were trivial in nature (pervasively using `Instruction` instead just worked). llvm-svn: 344502
* [TI removal] Remove `TerminatorInst` from GVN.h and GVN.cpp.Chandler Carruth2018-10-151-1/+1
| | | | | | | | | | This is the last interesting usage in all of LLVM's headers. The remaining usages in headers are the core typesystem bits (Core.h, instruction types, and InstVisitor) and as the return of `BasicBlock::getTerminator`. The latter is the big remaining API point that I'll remove after mass updates to user code. llvm-svn: 344501
* [TI removal] Remove `TerminatorInst` from BasicBlockUtils.hChandler Carruth2018-10-158-32/+34
| | | | | | | This requires updating a number of .cpp files to adapt to the new API. I've just systematically updated all uses of `TerminatorInst` within these files to `Instruction` so that I won't have to touch them again in the future. llvm-svn: 344498
* [TI removal] Remove TerminatorInst as an input parameter from all publicChandler Carruth2018-10-152-2/+3
| | | | | | | | | LLVM APIs. There weren't very many. We still have the instruction visitor, and APIs with TerminatorInst as a return type or an output parameter. llvm-svn: 344494
* [TwoAddressInstructionPass] Replace subregister uses when processing tied operandsBjorn Pettersson2018-10-151-8/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: TwoAddressInstruction pass typically rewrites %1:short = foo %0.sub_lo:long as %1:short = COPY %0.sub_lo:long %1:short = foo %1:short when having tied operands. If there are extra un-tied operands that use the same reg and subreg, such as the second and third inputs to fie here: %1:short = fie %0.sub_lo:long, %0.sub_hi:long, %0.sub_lo:long then there was a bug which replaced the register %0 also for the un-tied operand, but without changing the subregister indices. So we used to get: %1:short = COPY %0.sub_lo:long %1:short = fie %1, %1.sub_hi:short, %1.sub_lo:short With this fix we instead get: %1:short = COPY %0.sub_lo:long %1:short = fie %1, %0.sub_hi:long, %1 Reviewers: arsenm, JesperAntonsson, kparzysz, MatzeB Reviewed By: MatzeB Subscribers: bjope, kparzysz, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D36224 llvm-svn: 344492
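The corrected rule can be reduced to a small model: an extra use is redirected to the copy's destination only when it reads exactly the same register and subregister as the tied source, and the subregister index is dropped in that case. This is a sketch of that rule as described in the commit message, not the pass's actual code; `Operand`, `rewriteUse` and the index constants are hypothetical.

```cpp
// Hypothetical subregister indices; 0 means "whole register".
constexpr int NoSubReg = 0;
constexpr int SubLo = 1;
constexpr int SubHi = 2;

// A register use: virtual register number plus subregister index.
struct Operand {
  int Reg;
  int SubIdx;
};

// Rewrite one use after `NewReg = COPY TiedSrc` has been inserted.
// Only uses that match both the register and the subregister of the tied
// source are replaced; anything else (e.g. %0.sub_hi) is left untouched.
constexpr Operand rewriteUse(Operand Use, Operand TiedSrc, int NewReg) {
  return (Use.Reg == TiedSrc.Reg && Use.SubIdx == TiedSrc.SubIdx)
             ? Operand{NewReg, NoSubReg}
             : Use;
}
```

Applied to the `fie` example, the third operand `%0.sub_lo` becomes plain `%1`, while the second operand `%0.sub_hi` keeps both its register and its subregister index.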
* [ORC] Simplify naming for JITDylib definition generators.Lang Hames2018-10-152-33/+36
| | | | | | | | | Renames: JITDylib's setFallbackDefinitionGenerator method to setGenerator. DynamicLibraryFallbackGenerator class to DynamicLibrarySearchGenerator. ReexportsFallbackDefinitionGenerator to ReexportsGenerator. llvm-svn: 344489
* [X86] Move promotion of vector and/or/xor from legalization to DAG combineCraig Topper2018-10-151-17/+34
| | | | | | | | | | | | | | | | | | | Summary: I've noticed that the bitcasts we introduce for these make computeKnownBits and computeNumSignBits not work well in LegalizeVectorOps. LegalizeVectorOps legalizes bottom up while LegalizeDAG legalizes top down. The bottom up strategy for LegalizeVectorOps means operands are legalized before their uses. So we promote and/or/xor before we legalize the operands that use them making computeKnownBits/computeNumSignBits in places like LowerTruncate suboptimal. I looked at changing LegalizeVectorOps to be top down as well, but that was more disruptive and caused some regressions. I also looked at just moving promotion of binops to LegalizeDAG, but that had a few issues one around matching AND,ANDN,OR into VSELECT because I had to create ANDN as vXi64, but the other nodes hadn't legalized yet, I didn't look too hard at fixing that. This patch seems to produce better results overall than my other attempts. We now form broadcasts of constants better in some cases. For at least some of them the AND was being introduced in LegalizeDAG, promoted to vXi64, and the BUILD_VECTOR was also legalized there. I think we got bad ordering of that. Now the promotion is out of the legalizer so we handle this better. In the longer term I think we really should evaluate whether we should be doing this promotion at all. It's really there to reduce isel pattern count, but I'm wondering if we'd be better served just eating the pattern cost or doing C++ based isel for vector and/or/xor in X86ISelDAGToDAG. The masked and/or/xor will definitely be difficult in patterns if a bitcast gets between the vselect and the and/or/xor node. That becomes a lot of permutations to cover. Reviewers: RKSimon, spatel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D53107 llvm-svn: 344487
* [X86] Add 128 MOVDDUP to the constant pool printing in X86AsmPrinter::EmitInstruction.Craig Topper2018-10-151-0/+6
| | | | We use this instruction to broadcast a single 64-bit value to a v2i64/v2f64 vector. llvm-svn: 344486
* [LV] Fix comments reported when not vectorizing single iteration loops; NFCAyal Zaks2018-10-141-1/+8
| | | | | | | | Landing this as a separate part of https://reviews.llvm.org/D50480, being a seemingly unrelated change ([LV] Vectorizing loops of arbitrary trip count without remainder under opt for size). llvm-svn: 344483
* [X86][AVX] Enable lowerVectorShuffleAsLanePermuteAndPermute v16i16/v32i8 shuffle loweringSimon Pilgrim2018-10-141-0/+10
| | | | Extends D53148 from v4f64 now that we have test coverage for v16i16/v32i8 shuffles. llvm-svn: 344481
* [LegalizeDAG] Don't bother with final MUL+SRL stage for byte CTPOP. Simon Pilgrim2018-10-141-3/+4
| | | | | | | | The final stage of CTPOP expansion (v = (v * 0x01010101...) >> (Len - 8)) is completely pointless for the byte (Len = 8) case as it reduces to (v = (v * 0x01...) >> 0), but annoyingly this doesn't always get optimized away. Found while investigating generic vector CTPOP expansion (PR32655). llvm-svn: 344477
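The redundancy is easy to see in a scalar model of the usual bit-twiddling popcount. The sketch below is a standalone illustration, not the LegalizeDAG code; the function names are hypothetical.

```cpp
#include <cstdint>

// Shared first stages for Len = 32: afterwards every byte of the result
// holds the population count of the corresponding input byte.
constexpr uint32_t bytePopcounts32(uint32_t V) {
  V = V - ((V >> 1) & 0x55555555u);
  V = (V & 0x33333333u) + ((V >> 2) & 0x33333333u);
  return (V + (V >> 4)) & 0x0F0F0F0Fu;
}

// Len = 32: the multiply accumulates the four byte counts in the top byte,
// and the shift by Len - 8 = 24 moves that sum down.
constexpr uint32_t popcount32(uint32_t V) {
  return (bytePopcounts32(V) * 0x01010101u) >> 24;
}

// Len = 8: the final stage would be (V * 0x01) >> 0, which is just V,
// so the expansion can stop after the nibble-combining step.
constexpr uint8_t popcount8(uint8_t V) {
  V = static_cast<uint8_t>(V - ((V >> 1) & 0x55));
  V = static_cast<uint8_t>((V & 0x33) + ((V >> 2) & 0x33));
  return static_cast<uint8_t>((V + (V >> 4)) & 0x0F);
}
```

With a single byte there is nothing left to sum horizontally, so emitting the multiply and shift only adds nodes that later passes have to clean up.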
* [InstCombine] combine a shuffle and an extract subvector shuffle Sanjay Patel2018-10-141-0/+38
| | | | | | | | | | | | This is part of the missing IR-level folding noted in D52912. This should be ok as a canonicalization because the new shuffle mask can't be any more complicated than the existing shuffle mask. If there's some target where the shorter vector shuffle is not legal, it should just end up expanding to something like the pair of shuffles that we're starting with here. Differential Revision: https://reviews.llvm.org/D53037 llvm-svn: 344476
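Why the combined mask cannot be more complicated than the original follows from how two single-source shuffles compose: each result lane simply looks up one lane of the inner mask. The sketch below is a general model of that composition under the assumption that both shuffles have an undef second operand; it is not the InstCombine implementation, which handles a narrower pattern, and `Mask` and `composeMasks` are hypothetical names.

```cpp
// A shuffle mask with N lanes; -1 marks an undefined lane.
template <int N> struct Mask {
  int Lane[N];
};

// Fold  shuffle(shuffle(X, undef, Inner), undef, Outer)  into one mask
// that indexes X directly.
template <int M, int N>
constexpr Mask<N> composeMasks(const Mask<M> &Inner, const Mask<N> &Outer) {
  Mask<N> Result{};
  for (int I = 0; I < N; ++I) {
    const int Idx = Outer.Lane[I];
    // Undefined lanes, and lanes that would read the undef second operand
    // (Idx >= M), stay undefined in the combined mask.
    Result.Lane[I] = (Idx < 0 || Idx >= M) ? -1 : Inner.Lane[Idx];
  }
  return Result;
}
```

For example, an inner mask `<2, 4, -1, 6>` followed by an extract of the low two lanes `<0, 1>` collapses to the single mask `<2, 4>` over the original vector.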
* recommit 344472 after fixing build failure on ARM and PPC.Dorit Nuzman2018-10-1417-58/+195
| | | | llvm-svn: 344475
* revert 344472 due to failures.Dorit Nuzman2018-10-1417-195/+58
| | | | llvm-svn: 344473
* [IAI,LV] Add support for vectorizing predicated strided accesses using maskedDorit Nuzman2018-10-1417-58/+195
| | | | | | | | | | | | | | | | | | | | | | | interleave-group The vectorizer currently does not attempt to create interleave-groups that contain predicated loads/stores; predicated strided accesses can currently be vectorized only using masked gather/scatter or scalarization. This patch makes predicated loads/stores candidates for forming interleave-groups during the Loop-Vectorizer's analysis, and adds the proper support for masked-interleave- groups to the Loop-Vectorizer's planning and transformation stages. The patch also extends the TTI API to allow querying the cost of masked interleave groups (which each target can control); Targets that support masked vector loads/ stores may choose to enable this feature and allow vectorizing predicated strided loads/stores using masked wide loads/stores and shuffles. Reviewers: Ayal, hsaito, dcaballe, fhahn, javed.absar Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D53011 llvm-svn: 344472
* [X86] Fix bad indentation. NFCCraig Topper2018-10-141-1/+1
| | | | llvm-svn: 344471
* [X86] Type legalize v2f32 stores by widening to v4f32, casting to v2f64, extracting f64 and storing.Craig Topper2018-10-141-13/+34
| | | | | | | | | | | | | | | | Summary: This is similar to what D52528 did for loads. It should match what generic type legalization does in 64-bit mode where it uses a v2i64 cast and an i64 store. Reviewers: RKSimon, spatel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D53173 llvm-svn: 344470
* Move some helpers from the global namespace into anonymous ones.Benjamin Kramer2018-10-135-13/+17
| | | | llvm-svn: 344468
* [ORC] During lookup, do not match against hidden symbols in other JITDylibs.Lang Hames2018-10-136-46/+64
| | | | | | | | | | | | This adds two arguments to the main ExecutionSession::lookup method: MatchNonExportedInJD, and MatchNonExported. These control whether and where hidden symbols should be matched when searching a list of JITDylibs. A similar effect could have been achieved by filtering search results, but this would have involved materializing symbol definitions (since materialization is triggered on lookup) only to throw the results away, among other issues. llvm-svn: 344467