path: root/llvm/lib
Commit message | Author | Age | Files | Lines
* [ThinLTO] Add functions in callees metadata to CallGraphEdges | Taewook Oh | 2018-03-13 | 1 | -0/+12
  Summary:
  If there's a callees metadata attached to the indirect call instruction, add CallGraphEdges to the callees mentioned in the metadata when computing FunctionSummary.

  * Why this is necessary:
  Consider the following code example:
  ```
  (foo.c)
  static int f1(int x) {...}
  static int f2(int x);
  static int (*fptr)(int) = f2;
  static int f2(int x) {
    if (x)
      fptr=f1;
    return f1(x);
  }
  int foo(int x) {
    (*fptr)(x); // !callees metadata of !{i32 (i32)* @f1, i32 (i32)* @f2} would be attached to this call.
  }

  (bar.c)
  int bar(int x) {
    return foo(x);
  }
  ```
  At LTO time when `foo.o` is imported into `bar.o`, function `foo` might be inlined into `bar`, and PGO-guided indirect call promotion will run after that. If the profile data indicates that promoting `@f1` or `@f2` is beneficial, the optimizer will check whether the "promoted" `@f1` or `@f2` (such as `@f1.llvm.0` or `@f2.llvm.0`) is available. Without this patch, importing `!callees` metadata would only add promoted declarations of `@f1` and `@f2` to `bar.o`, but the optimizer would still assume that the function is available and perform the promotion. The result is a link failure with `undefined reference to @f1.llvm.0`.

  This patch fixes the problem by adding the callees in the `!callees` metadata to CallGraphEdges so that their definitions are properly imported.

  One may object that there is already logic to add indirect call promotion targets to CallGraphEdges. However, if the profile data says "indirect call promotion is only beneficial under a certain inline context", that logic wouldn't work. In the code example above, if the profile data is like
  ```
  bar:1000000:100000
   1:100000
   1: foo:100000
    1: 100000 f1:100000
  ```
  then computing FunctionSummary for `foo.o` wouldn't add `foo->f1` to CallGraphEdges. (Also, it is at least "possible" to provide profile data only to the link step and not to the compilation step.)

  Reviewers: tejohnson, mehdi_amini, pcc
  Reviewed By: tejohnson
  Subscribers: inglorion, eraman, llvm-commits
  Differential Revision: https://reviews.llvm.org/D44399
  llvm-svn: 327358
* [LegalizeTypes] In SplitVecOp_TruncateHelper, use GetSplitVector on the input instead of creating new extract_subvectors. | Craig Topper | 2018-03-13 | 1 | -2/+2
  llvm-svn: 327355
* ObjCARC: address review comments from majnemer | Saleem Abdulrasool | 2018-03-12 | 1 | -8/+5
  I forgot to incorporate these comments into the original revision. This is just code cleanup addressing the feedback, NFC.
  llvm-svn: 327351
* BlockExtractor: Don’t delete functions directly | Volkan Keles | 2018-03-12 | 1 | -2/+3
  Blocks may have function calls, so don’t erase functions directly to avoid erasing a function that has a user.
  llvm-svn: 327340
* ObjCARC: teach the cloner about funclets | Saleem Abdulrasool | 2018-03-12 | 1 | -1/+36
  In the case that the CallInst that is being moved has an associated operand bundle which is a funclet, the move will construct an invalid instruction. The new site will have a different token and needs to be reassociated with the new instruction.

  Unfortunately, there is no way to alter the bundle after the construction of the instruction. Replace the call instruction cloning with a custom helper to clone the instruction and reassociate the funclet token.
  llvm-svn: 327336
* [X86][Btver2] Clean up formatting/comments in scheduler model. NFCI. | Simon Pilgrim | 2018-03-12 | 1 | -11/+18
  Moved 'special cases' to be closer to other system classes.
  llvm-svn: 327332
* Remove the LoopInstSimplify pass (-loop-instsimplify) | Vedant Kumar | 2018-03-12 | 5 | -230/+2
  LoopInstSimplify is unused and untested. Reading through the commit history the pass also seems to have a high maintenance burden.

  It would be best to retire the pass for now. It should be easy to recover if we need something similar in the future.
  Differential Revision: https://reviews.llvm.org/D44053
  llvm-svn: 327329
* Improve caching scheme in ProvenanceAnalysis. | Michael Zolotukhin | 2018-03-12 | 2 | -8/+10
  Summary:
  ProvenanceAnalysis::related(A, B) currently memoizes its results, and on big tests the cache grows too large, so we spend most of the time growing and looking through the DenseMap.

  This patch reduces the size of the cache by normalizing keys first: we do that by calling GetUnderlyingObjCPtr on the input values. The results of GetUnderlyingObjCPtr are also memoized in a separate cache.

  The patch doesn't bring noticeable changes to compile time on CTMark, but it significantly helps one of our internal tests.
  Reviewers: gottesmm
  Subscribers: hiraditya, llvm-commits
  Differential Revision: https://reviews.llvm.org/D44270
  llvm-svn: 327328
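The key-normalization idea in the ProvenanceAnalysis commit can be sketched in isolation. The following is a minimal illustration, not the LLVM implementation: `RelatedCache`, the integer value ids, the `parent` table, and the placeholder `expensiveQuery` are all hypothetical, and the sketch assumes the relation is symmetric and the parent chain is acyclic.

```cpp
#include <cstddef>
#include <map>
#include <unordered_map>
#include <utility>

// Hypothetical model: every value is an int id, and parent[v] names the value
// that v is derived from. Roots have no entry. The chain is assumed acyclic.
class RelatedCache {
public:
  explicit RelatedCache(std::unordered_map<int, int> parent)
      : parent_(std::move(parent)) {}

  bool related(int a, int b) {
    // Normalize both keys first, so all values derived from the same
    // underlying object share a single cache entry.
    std::pair<int, int> key(underlying(a), underlying(b));
    if (key.first > key.second)
      std::swap(key.first, key.second); // assumption: the relation is symmetric
    auto it = related_.find(key);
    if (it != related_.end())
      return it->second;
    bool result = expensiveQuery(key.first, key.second);
    related_.emplace(key, result);
    return result;
  }

  std::size_t cacheSize() const { return related_.size(); }

private:
  // Walks the parent chain to the root and memoizes the answer separately.
  int underlying(int v) {
    auto hit = underlying_.find(v);
    if (hit != underlying_.end())
      return hit->second;
    int root = v;
    for (auto p = parent_.find(root); p != parent_.end();
         p = parent_.find(root))
      root = p->second;
    underlying_.emplace(v, root);
    return root;
  }

  // Placeholder for the real analysis: two roots are related iff equal.
  static bool expensiveQuery(int a, int b) { return a == b; }

  std::unordered_map<int, int> parent_;
  std::unordered_map<int, int> underlying_;
  std::map<std::pair<int, int>, bool> related_;
};
```

With a parent table such as `{2 -> 1, 3 -> 2}`, the queries `related(3, 2)` and `related(1, 3)` both normalize to the key `(1, 1)`, so the result cache holds one entry instead of two.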
* [PowerPC][NFC] Explicitly state types on FP SDAG patterns in anticipation of adding the f128 type | Lei Huang | 2018-03-12 | 3 | -132/+158
  llvm-svn: 327319
* [AArch64] Fold adds with tprel_lo12_nc and secrel_lo12 into a following ldr/str | Martin Storsjo | 2018-03-12 | 3 | -10/+22
  Differential Revision: https://reviews.llvm.org/D44355
  llvm-svn: 327316
* [InstCombine] Replace calls to getNumUses with hasNUses or hasNUsesOrMore | Craig Topper | 2018-03-12 | 2 | -5/+5
  getNumUses is a linear time operation. It traverses the user linked list to the end and counts as it goes. Since we are only interested in small constant counts, we should use hasNUses or hasNUsesOrMore, which terminate the traversal as soon as they can provide the answer.

  There are still two other locations in InstCombine, but changing those would force a rebase of D44266, which if accepted would remove them.
  Differential Revision: https://reviews.llvm.org/D44398
  llvm-svn: 327315
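The difference between a full count and an early-terminating check can be illustrated on a plain singly linked list. This is a sketch under stated assumptions: `UseNode`, `countUses`, `hasExactlyNUses`, and `hasAtLeastNUses` are hypothetical stand-ins and do not reproduce LLVM's `Value` or `Use` classes.

```cpp
#include <cstddef>

// Hypothetical stand-in for one entry in a value's use list.
struct UseNode {
  UseNode *next = nullptr;
};

// Linear in the list length: always walks to the end, like a full count.
std::size_t countUses(const UseNode *head) {
  std::size_t n = 0;
  for (const UseNode *u = head; u != nullptr; u = u->next)
    ++n;
  return n;
}

// Follows at most N links: true iff the list has exactly N entries.
bool hasExactlyNUses(const UseNode *head, std::size_t N) {
  const UseNode *u = head;
  for (; N != 0 && u != nullptr; --N)
    u = u->next;
  return N == 0 && u == nullptr;
}

// Follows at most N links: true iff the list has at least N entries.
bool hasAtLeastNUses(const UseNode *head, std::size_t N) {
  const UseNode *u = head;
  for (; N != 0 && u != nullptr; --N)
    u = u->next;
  return N == 0;
}
```

For a small constant N, the two bounded checks do a constant amount of work regardless of how long the list is, whereas `countUses(head) == N` always pays for the whole list.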
* [CallSiteSplitting] Use !Instruction::use_empty instead of checking for a non-zero return from getNumUses | Craig Topper | 2018-03-12 | 1 | -1/+1
  getNumUses is a linear operation. It walks a linked list to get a count. So in this case it's better to just ask if there are any users rather than how many.
  llvm-svn: 327314
* [NFC] Replace iterators in PrintHelp with range-based for | Jan Korous | 2018-03-12 | 1 | -6/+4
  llvm-svn: 327312
* [NFC] PrintHelp cleanup | Jan Korous | 2018-03-12 | 1 | -3/+1
  llvm-svn: 327311
* [Hexagon] Counting leading/trailing bits is cheap | Krzysztof Parzyszek | 2018-03-12 | 1 | -0/+4
  llvm-svn: 327308
* [X86][Btver2] FSqrt/FDiv reg-reg instructions don't use the AGU. | Simon Pilgrim | 2018-03-12 | 1 | -4/+4
  I love you llvm-mca.
  llvm-svn: 327306
* [SelectionDAG] Improve handling of dangling debug info | Bjorn Pettersson | 2018-03-12 | 6 | -58/+77
  Summary:
  1) Make sure to discard dangling debug info if the variable (or variable fragment) is mapped to something new before we had a chance to resolve the dangling debug info.

  2) When resolving debug info, make sure to bump the associated SDNodeOrder to ensure that the DBG_VALUE is emitted after the instruction that defines the value used in the DBG_VALUE. This will avoid a debug-use before def scenario as seen in https://bugs.llvm.org/show_bug.cgi?id=36417.

  The new test case, test/DebugInfo/X86/sdag-dangling-dbgvalue.ll, shows some other limitations in how dangling debug info is handled in the SelectionDAG. Since we currently only support having one dangling dbg.value per Value, we will end up dropping debug info when more than one variable is described by the same "dangling value".
  Reviewers: aprantl
  Reviewed By: aprantl
  Subscribers: aprantl, eraman, llvm-commits, JDevlieghere
  Tags: #debug-info
  Differential Revision: https://reviews.llvm.org/D44369
  llvm-svn: 327303
* [Hexagon] Subtarget feature to emit one instruction per packet | Krzysztof Parzyszek | 2018-03-12 | 6 | -11/+34
  This adds two features: "packets", and "nvj".

  Enabling "packets" allows the compiler to generate instruction packets, while disabling it will prevent it and disable all optimizations that generate them. This feature is enabled by default on all subtargets.

  The feature "nvj" allows the compiler to generate new-value jumps and it implies "packets". It is enabled on all subtargets.

  The exception is made for packets with endloop instructions, since they require a certain minimum number of instructions in the packets to which they apply. Disabling "packets" will not prevent hardware loops from being generated.
  llvm-svn: 327302
* [X86] Deleting README-MMX.txt now that all tasks have been completed. | Simon Pilgrim | 2018-03-12 | 1 | -42/+0
  MMX buildvectors were improved at rL327247 - new MMX bugs should be raised on bugzilla
  llvm-svn: 327300
* [AMDGPU][MC][GFX8] Added BUFFER_STORE_LDS_DWORD Instruction | Dmitry Preobrazhensky | 2018-03-12 | 2 | -4/+33
  See bug 36558: https://bugs.llvm.org/show_bug.cgi?id=36558
  Differential Revision: https://reviews.llvm.org/D43950
  Reviewers: artem.tamazov, arsenm
  llvm-svn: 327299
* [X86][Btver2] Prefix all scheduler defs. NFCI. | Simon Pilgrim | 2018-03-12 | 1 | -147/+147
  These are all global, so prefix with 'J' to help prevent accidental name clashes with other models.
  llvm-svn: 327296
* [X86] Remove use of MVT class from the ShuffleDecode library. | Craig Topper | 2018-03-12 | 4 | -277/+240
  MVT belongs to the CodeGen layer, but ShuffleDecode is used by the X86 InstPrinter which is part of the MC layer. This only worked because MVT is completely implemented in a header file with no other library dependencies.
  Differential Revision: https://reviews.llvm.org/D44353
  llvm-svn: 327292
* [AMDGPU] Fix lowering enqueue kernel when kernel has no name | Yaxun Liu | 2018-03-12 | 1 | -8/+16
  Since the enqueued kernels have internal linkage, their names may be dropped. In this case, give them unique names __amdgpu_enqueued_kernel or __amdgpu_enqueued_kernel.n where n is a sequential number starting from 1.
  Differential Revision: https://reviews.llvm.org/D44322
  llvm-svn: 327291
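The naming scheme described in the AMDGPU commit can be sketched as a small counter-based helper. Only the names `__amdgpu_enqueued_kernel` and `__amdgpu_enqueued_kernel.n` come from the commit message; the `EnqueuedKernelNamer` class is a hypothetical illustration and not the code in the patch.

```cpp
#include <string>

// Hypothetical helper: hands out the base name first, then base.1, base.2, ...
class EnqueuedKernelNamer {
public:
  std::string next() {
    std::string name = "__amdgpu_enqueued_kernel";
    if (count_ != 0)
      name += "." + std::to_string(count_); // suffix starts at 1
    ++count_;
    return name;
  }

private:
  unsigned count_ = 0; // number of names handed out so far
};
```

Keeping the counter in one object makes uniqueness depend only on every unnamed kernel going through the same namer instance.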
* [X86][Btver2] Extend JWriteResFpuPair to accept resource/uop counts. NFCI. | Simon Pilgrim | 2018-03-12 | 1 | -51/+24
  This allows the single resource classes (VarBlend, MPSAD, VarVecShift) to use the JWriteResFpuPair macro.
  llvm-svn: 327289
* [X86][Btver2] Use JWriteResFpuPair wrapper for AES/CLMUL/HADD scheduler cases. NFCI. | Simon Pilgrim | 2018-03-12 | 1 | -49/+6
  These are single pipe and have the default resource/uop counts like JWriteResFpuPair so there's no need to handle them separately.
  llvm-svn: 327283
* [AMDGPU][MC] Corrected GATHER4 opcodes | Dmitry Preobrazhensky | 2018-03-12 | 3 | -82/+119
  See bug 36252: https://bugs.llvm.org/show_bug.cgi?id=36252
  Differential Revision: https://reviews.llvm.org/D43874
  Reviewers: artem.tamazov, arsenm
  llvm-svn: 327278
* [DebugInfo] Replace unreachable with None | Jonas Devlieghere | 2018-03-12 | 1 | -1/+1
  Invalid user input should not trigger assertions and unreachables. We already return an Option so we should just return None here.

  Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=5532
  llvm-svn: 327274
* [Hexagon] fix 'must explicitly initialize the const member' error which clang 3.8 emits | Sam McCall | 2018-03-12 | 1 | -2/+2
  llvm-svn: 327273
* AMDGPU/GlobalISel: Legality and RegBankInfo for G_{INSERT|EXTRACT}_VECTOR_ELT | Matt Arsenault | 2018-03-12 | 2 | -0/+70
  llvm-svn: 327269
* AMDGPU/GlobalISel: InstrMapping for G_MERGE_VALUES | Matt Arsenault | 2018-03-12 | 1 | -0/+12
  llvm-svn: 327268
* AMDGPU/GlobalISel: Make some G_MERGE_VALUEs legal | Matt Arsenault | 2018-03-12 | 1 | -0/+27
  llvm-svn: 327267
* [mips] Split out ASEPredicate from InsnPredicates (NFC) | Simon Dardis | 2018-03-12 | 5 | -38/+38
  This simplifies tagging instructions with the correct ISA and ASE, albeit making instruction definitions a bit more verbose.
  Reviewers: atanasyan, abeserminji
  Differential Revision: https://reviews.llvm.org/D44299
  llvm-svn: 327265
* MC intel asm parser: Allow @ at the start of function names. | Nico Weber | 2018-03-12 | 1 | -1/+5
  Ports parts of r193000 to the intel parser. Fixes part of PR36676.
  https://reviews.llvm.org/D44359
  llvm-svn: 327262
* [X86][SSE] createVariablePermute - PSHUFB requires SSSE3 not just SSE3 | Simon Pilgrim | 2018-03-12 | 1 | -3/+3
  llvm-svn: 327259
* Fix compilation on Darwin with expensive checks. | Jonas Devlieghere | 2018-03-12 | 1 | -5/+10
  After r327219 was landed, the bot with expensive checks on GreenDragon started failing. The problem was missing symbols `regex_t` and `regmatch_t` in `xlocale/_regex.h`. The latter was included because after the change in r327219, `random` is needed, which transitively includes `xlocale.h`, which in turn conditionally includes `xlocale/_regex.h` when _REGEX_H_ is defined. Because this is the header guard in `regex_impl.h` and because `regex_impl.h` was included before the other LLVM includes, `xlocale/_regex.h` was included without the necessary types being available.

  This commit fixes this by moving the include of `regex_impl.h` all the way down. I also added a comment to stress the significance of its position.
  llvm-svn: 327256
* [ThinLTO] Recommit of import global variables | Eugene Leviant | 2018-03-12 | 2 | -20/+87
  This was reverted in r326638 due to link problems and fixed afterwards
  llvm-svn: 327254
* Back out "Re-land: Teach CorrelatedValuePropagation to reduce the width of udiv/urem instructions." | Justin Lebar | 2018-03-12 | 1 | -54/+0
  This reverts r326908, originally landed as D44102.

  Reverted for causing performance regressions on x86. (These regressions are not yet understood.)
  llvm-svn: 327252
* [X86] Don't compute known bits twice for the same SDValue in LowerMUL. | Craig Topper | 2018-03-12 | 1 | -4/+8
  We called MaskedValueIsZero with two different masks, but underneath that calls computeKnownBits before applying the mask. This means we compute the same known bits twice due to the two calls. Instead just call computeKnownBits directly and apply the two masks ourselves.
  llvm-svn: 327251
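The "analyze once, mask twice" pattern from the LowerMUL commit can be shown with a toy model. This is a sketch under stated assumptions: `KnownBits`, `computeKnownBitsOfZext32`, `maskIsKnownZero`, `analyzeOnce`, and the two 32-bit masks are hypothetical and are not the SelectionDAG API or the masks used in the patch.

```cpp
#include <cstdint>

// Toy known-bits record: a set bit in `zero` means that bit is known to be 0.
struct KnownBits {
  std::uint64_t zero = 0;
  std::uint64_t one = 0;
};

// Stand-in for an expensive analysis. It models a value produced by
// zero-extending a 32-bit quantity, so the upper 32 bits are known zero.
// `calls` counts how often the analysis runs.
KnownBits computeKnownBitsOfZext32(unsigned &calls) {
  ++calls;
  KnownBits known;
  known.zero = 0xFFFFFFFF00000000ULL;
  return known;
}

// True when every bit selected by `mask` is known to be zero.
bool maskIsKnownZero(const KnownBits &known, std::uint64_t mask) {
  return (known.zero & mask) == mask;
}

struct MulOperandFacts {
  bool upperHalfZero;
  bool lowerHalfZero;
};

// Runs the analysis a single time and answers both mask queries from it.
MulOperandFacts analyzeOnce(unsigned &calls) {
  const KnownBits known = computeKnownBitsOfZext32(calls);
  return {maskIsKnownZero(known, 0xFFFFFFFF00000000ULL),
          maskIsKnownZero(known, 0x00000000FFFFFFFFULL)};
}
```

A helper that hides the analysis behind each mask query would run it once per query; hoisting the analysis out makes each additional mask check a single AND and compare.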
* [CGP] Fix the remove of matched phis in complex addressing mode | Serguei Katkov | 2018-03-12 | 1 | -1/+13
  When we replace the Phi we created with matched ones, it is possible that there are two identical phi nodes in the IR. The matcher is smart enough to find that the newly created phi matches both of them. So we try to replace our phi node with the matched ones twice and, what is worse, we delete our phi node twice, causing a crash.

  As soon as we find that we have two identical Phi nodes, it makes sense to do a clean-up and replace one phi node with the other. The patch implements it.
  Reviewers: john.brawn, reames
  Reviewed By: john.brawn
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D43758
  llvm-svn: 327250
* [X86][MMX] Support MMX build vectors to avoid SSE usage (PR29222) | Simon Pilgrim | 2018-03-11 | 1 | -0/+81
  64-bit MMX vector generation usually ends up lowering into SSE instructions before being spilled/reloaded as a MMX type.

  This patch creates a MMX vector from MMX source values, taking the lowest element from each source and constructing broadcasts/build_vectors with direct calls to the MMX PUNPCKL/PSHUFW intrinsics.

  We're missing a few consecutive load combines that could be handled in a future patch if that would be useful - my main interest here is just avoiding a lot of the MMX/SSE crossover.
  Differential Revision: https://reviews.llvm.org/D43618
  llvm-svn: 327247
* [X86][AVX] createVariablePermute - scale v16i16 variable permutes to use v32i8 codegen | Simon Pilgrim | 2018-03-11 | 1 | -1/+1
  XOP was already doing this, and now AVX performs v32i8 variable permutes as well.
  llvm-svn: 327245
* [X86][AVX] createVariablePermute - widen permutes for cases where the source vector is wider than the destination type | Simon Pilgrim | 2018-03-11 | 1 | -5/+19
  llvm-svn: 327244
* [X86][AVX] createVariablePermute - use PSHUFB+PCMPGT+SELECT for v32i8 variable permutes | Simon Pilgrim | 2018-03-11 | 1 | -0/+20
  Same as the VPERMILPS/VPERMILPD approach for v8f32/v4f64 cases, rely on PSHUFB using bits[3:0] for indexing - we can ignore the sign bit (zero element) as those index vector values are considered undefined.

  Then select between the lo/hi permute results based on the index size.
  llvm-svn: 327242
* Fix for buildbots which didn't like makeArrayRef with initializer lists. | Simon Pilgrim | 2018-03-11 | 1 | -2/+2
  llvm-svn: 327241
* [X86][SSE] Generalized SplitBinaryOpsAndApply to SplitOpsAndApply to support any number of ops. | Simon Pilgrim | 2018-03-11 | 1 | -44/+54
  I've kept SplitBinaryOpsAndApply as a wrapper to avoid a lot of makeArrayRef code.
  llvm-svn: 327240
* [X86][AVX] createVariablePermute - use 2xVPERMIL+PCMPGT+SELECT for v8i32/v8f32 and v4i64/v4f64 variable permutes | Simon Pilgrim | 2018-03-11 | 1 | -11/+26
  As VPERMILPS/VPERMILPD only selects elements based on the bits[1:0]/bit[1] then we can permute both the (repeated) lo/hi 128-bit vectors in each case and then select between these results based on whether the index was for lo/hi.

  For v4i64/v4f64 this avoids some rather nasty v4i64 multiplies on the AVX2 implementation, which seems to be worse than the extra port5 pressure from the additional shuffles/blends.
  llvm-svn: 327239
* [X86][AVX512] createVariablePermute - Non-VLX targets can widen v4i64/v8f64 variable permutes to v8i64/v8f64 | Simon Pilgrim | 2018-03-11 | 1 | -2/+12
  Permutes in the upper elements will be undefined, but they will be discarded anyway.
  llvm-svn: 327238
* [x86][SSE] Add widenSubVector helper. NFCI. | Simon Pilgrim | 2018-03-11 | 1 | -9/+17
  Helper function to insert a subvector into the bottom elements of a larger zero/undef vector with the same scalar type.

  I've converted a couple of INSERT_SUBVECTOR calls to use it, there are plenty more although in some cases I was worried it might make the code more ambiguous.
  llvm-svn: 327236
* [MemorySSA] Fix comment + remove redundant dyn_casts; NFC | George Burgess IV | 2018-03-11 | 1 | -18/+12
  StartingAccess is already a MemoryUseOrDef.
  llvm-svn: 327235
* Test commit - change comment slightly. | Michael Bedy | 2018-03-11 | 1 | -2/+2
  llvm-svn: 327234