summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [SelectionDAG] Assert on the width of DemandedElts argument to ↵Craig Topper2018-11-081-2/+3
| | | | | | | | computeKnownBits for all vector typed operations not just build_vector. Fix AArch64 unit test that fails with the assertion added. llvm-svn: 346437
* [LTO] Drop non-prevailing definitions only if linkage is not local or appendingPirama Arumuga Nainar2018-11-084-26/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This fixes PR 37422 In ELF, non-weak symbols can also be non-prevailing. In this particular PR, the __llvm_profile_* symbols are non-prevailing but weren't getting dropped - causing multiply-defined errors with lld. Also add a test, strong_non_prevailing.ll, to ensure that multiple copies of a strong symbol are dropped. To fix the test regressions exposed by this fix, - do not mark prevailing copies for symbols with 'appending' linkage. There's no one prevailing copy for such symbols. - fix the prevailing version in dead-strip-fulllto.ll - explicitly pass exported symbols to llvm-lto in fumcimport.ll and funcimport_var.ll Reviewers: tejohnson, pcc Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, dang, srhines, llvm-commits Differential Revision: https://reviews.llvm.org/D54125 llvm-svn: 346436
* [x86] use shuffles for scalar insertion into high elements of a constant vectorSanjay Patel2018-11-081-4/+18
| | | | | | | | | | | | | | | As discussed in D54073, we have a potential regression from more aggressive vector narrowing here, so let's try to avoid that by changing build-vector lowering slightly. Insert-vector-element lowering always does this since there's no "pinsr" for ymm/zmm: // If the vector is wider than 128 bits, extract the 128-bit subvector, insert // into that, and then insert the subvector back into the result. ...but we can sometimes do better for insert-into-constant-vector by using shuffle lowering. Differential Revision: https://reviews.llvm.org/D54271 llvm-svn: 346433
* [DAGCombine] Improve alias analysis for chain of independent stores.Nirav Dave2018-11-081-59/+116
| | | | | | | | | | | | | | | | | | | FindBetterNeighborChains simulateanously improves the chain dependencies of a chain of related stores avoiding the generation of extra token factors. For chains longer than the GatherAllAliasDepths, stores further down in the chain will necessarily fail, a potentially significant waste and preventing otherwise trivial parallelization. This patch directly parallelize the chains of stores before improving each store. This generally improves DAG-level parallelism. Reviewers: courbet, spatel, RKSimon, bogner, efriedma, craig.topper, rnk Subscribers: sdardis, javed.absar, hiraditya, jrtc27, atanasyan, llvm-commits Differential Revision: https://reviews.llvm.org/D53552 llvm-svn: 346432
* InstCombine: Avoid introducing poison values when lowering llvm.amdgcn.[us]bfeTom Stellard2018-11-081-13/+5
| | | | | | | | | | | | | | | | | | | Summary: When the 3rd argument to these intrinsics is zero, lowering them to shift instructions produces poison values, since we end up with shift amounts equal to the number of bits in the shifted value. This means we can only lower these intrinsics if we can prove that the 3rd argument is not zero. Reviewers: arsenm Reviewed By: arsenm Subscribers: bnieuwenhuizen, jvesely, wdng, nhaehnle, llvm-commits Differential Revision: https://reviews.llvm.org/D53739 llvm-svn: 346422
* [CodeExtractor] Mark functions noreturn when applicableVedant Kumar2018-11-081-0/+7
| | | | | | | | | | This eliminates the outlining penalty for llvm.trap/unreachable, because callers no longer have to emit cleanup/ret instructions after calling an outlined `noreturn` function. rdar://45523626 llvm-svn: 346421
* Revert "[MSP430] Add MC layer"Davide Italiano2018-11-0830-2691/+1092
| | | | | | | | | | | This commit broke the module buildbots. Error: lib/Target/MSP430/MSP430GenAsmMatcher.inc:1027:1: error: redundant namespace 'llvm' [-Wmodules-import-nested-redundant] ^ llvm-svn: 346410
* [SystemZ] Bugfix in shouldCoalesce()Jonas Paulsson2018-11-081-10/+15
| | | | | | | | | | | | | | | | | | | | It was discovered in randomized testing that the SystemZ implementation of shouldCoalesce() could be caused to crash when subreg liveness was enabled. This was because an undef use of the virtual register was copied outside current MBB at the point of shouldCoalesce() being called. For more details, see https://bugs.llvm.org/show_bug.cgi?id=39276. This patch changes the check for MBB locality from livein/liveout checks to do checks for all instructions of both intervals being inside MBB. This avoids the cases with dead defs / undef uses outside MBB, which are not affecting liveness in/out of MBB. The original test case included as a reduced .mir test case. Review: Ulrich Weigand https://reviews.llvm.org/D54197 llvm-svn: 346406
* [LLD] Fix Microsoft precompiled headers cross-compile on LinuxAlexandre Ganea2018-11-081-0/+2
| | | | | | Differential revision: https://reviews.llvm.org/D54122 llvm-svn: 346403
* [ARM] Enable spilling of the hGPR register class in Thumb2Petr Pavlu2018-11-081-6/+2
| | | | | | | | | | Generalize code in Thumb2InstrInfo::storeRegToStackSlot() and loadRegToStackSlot() to allow the GPR class or any of its sub-classes (including hGPR) to be stored/loaded by ARM::t2STRi12/ARM::t2LDRi12. Differential Revision: https://reviews.llvm.org/D51927 llvm-svn: 346401
* Return "[IndVars] Smart hard uses detection"Max Kazantsev2018-11-081-13/+26
| | | | | | | | | | The patch has been reverted because it ended up prohibiting propagation of a constant to exit value. For such values, we should skip all checks related to hard uses because propagating a constant is always profitable. Differential Revision: https://reviews.llvm.org/D53691 llvm-svn: 346397
* [MSP430] Fix encodeInstruction() for big endian hostsAnton Korobeynikov2018-11-081-4/+3
| | | | | | | | | | Reviewers: asl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D54251 llvm-svn: 346391
* [LSR] Combine unfolded offset into invariant registerGil Rapaport2018-11-081-12/+42
| | | | | | | | | | | | | | | LSR reassociates constants as unfolded offsets when the constants fit as immediate add operands, which currently prevents such constants from being combined later with loop invariant registers. This patch modifies GenerateCombinations() to generate a second formula which includes the unfolded offset in the combined loop-invariant register. This commit fixes a bug in the original patch (committed at r345114, reverted at r345123). Differential Revision: https://reviews.llvm.org/D51861 llvm-svn: 346390
* [SCEV][NFC] Verify IR in isLoop[Entry,Backedge]GuardedByCondMax Kazantsev2018-11-081-0/+15
| | | | | | | | | | | | | | | | | | We have a lot of various bugs that are caused by misuse of SCEV (in particular in LV), all of them can simply be described as "we ask SCEV to prove some fact on invalid IR". Some of examples of those are PR36311, PR37221, PR39160. The problem is that these failues manifest differently (what we saw was failure of various asserts across SCEV, but there can also be miscompiles). This patch adds an assert into two SCEV methods that strongly rely on correctness of the IR and are involved in known failues. This will at least allow us to have a clear indication of what was wrong in this case. This patch also fixes a unit test with incorrect IR that fails this verification. Differential Revision: https://reviews.llvm.org/D52930 Reviewed By: fhahn llvm-svn: 346389
* [MergeFuncs] Improve ordering of equal functionswhitequark2018-11-081-9/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: MergeFunctions currently tries to process strong functions before weak functions, because weak functions can simply call strong functions, while a strong/weak function cannot call a weak function (a backing strong function is needed). This patch additionally tries to process external functions before local functions, because we definitely have to keep the external function, but may be able to drop the local one (and definitely can if it is also unnamed_addr). Unfortunately, this exposes an existing bug in the implementation: The FnTree and FNodesInTree structures can currently go out of sync in the case where two weak functions are merged, because the function in FnTree/FNodesInTree is RAUWed. This leaves it behind in FnTree (this is intended, as it is the strong backing function which should be used for further merges), while it is replaced in FNodesInTree (this is not intended). This is fixed by switching FNodesInTree from using a ValueMap to using a DenseMap of AssertingVH. This exposes another minor issue: Currently FNodesInTree is not cleared after MergeFunctions finishes running. Currently, this is potentially dangerous (e.g. if something else wants to RAUW a function with a non-function), but at the very least it is unnecessary/inefficient. After the change to use AssertingVH it becomes more problematic, because there are certainly passes that remove functions. This issue is fixed by clearing FNodesInTree at the end of the pass. Reviewers: jfb, whitequark Reviewed By: whitequark Subscribers: rkruppe, llvm-commits Differential Revision: https://reviews.llvm.org/D53271 llvm-svn: 346386
* [MergeFuncs] Call removeUsers() prior to unnamed_addr RAUWwhitequark2018-11-081-0/+1
| | | | | | | | | | | | | | | | | Summary: For unnamed_addr functions we RAUW instead of only replacing direct callers. However, functions in which replacements were performed currently are not added back to the worklist, resulting in missed merging opportunities. Fix this by calling removeUsers() prior to RAUW. Reviewers: jfb, whitequark Reviewed By: whitequark Subscribers: rkruppe, llvm-commits Differential Revision: https://reviews.llvm.org/D53262 llvm-svn: 346385
* [WebAssembly] Add V128 to WebAssemblyInstrInfo::copyPhysRegThomas Lively2018-11-081-0/+2
| | | | | | | | | | Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D53872 llvm-svn: 346384
* [sancov] Put .SCOV* sections into the right comdat groups on COFFReid Kleckner2018-11-083-7/+22
| | | | | | | | | | | | | | | Avoids linker errors about relocations against discarded sections. This was uncovered during the Chromium clang roll here: https://chromium-review.googlesource.com/c/chromium/src/+/1321863#message-717516acfcf829176f6a2f50980f7a4bdd66469a After this change, Chromium's libGLESv2 links successfully for me. Reviewers: metzman, hans, morehouse Differential Revision: https://reviews.llvm.org/D54232 llvm-svn: 346381
* NFC: DebugInfo: Track the origin CU rather than just the base address for ↵David Blaikie2018-11-084-13/+11
| | | | | | | | | | range lists Turns out knowing more than just the base address might be useful - specifically a future change to respect a DICompileUnit flag for the use of base address specifiers in DWARF < 5. llvm-svn: 346380
* [MachineOutliner][NFC] Only map blocks which have adjacent legal instructionsJessica Paquette2018-11-081-14/+36
| | | | | | | | | | | If a block doesn't have any ranges of adjacent legal instructions, then it can't have outlining candidates. There's no point in mapping legal isntructions in situations like this. I noticed this reduces the size of the suffix tree in sqlite3 for AArch64 at -Oz by about 3%. llvm-svn: 346379
* [AMDGPU] Extend promote alloca vectorizationStanislav Mekhanoshin2018-11-081-4/+20
| | | | | | | | | | | Promote alloca can vectorize a small array by bitcasting it to a vector type. Extend vectorization for the case when alloca is already a vector type. We still want to replace GEPs with an insert/extract element instructions in this case. Differential Revision: https://reviews.llvm.org/D54219 llvm-svn: 346376
* [MSP430] Add MC layerAnton Korobeynikov2018-11-0830-1092/+2692
| | | | | | | | | | | | | | | | | Summary: This change implements assembler parser, code emitter, ELF object writer and disassembler for the MSP430 ISA. Also, more instruction forms are added to the target description. Reviewers: asl Reviewed By: asl Subscribers: pftbest, krisb, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D53661 llvm-svn: 346374
* [MachineOutliner][NFC] Don't map MBBs that don't contain legal instructionsJessica Paquette2018-11-081-18/+47
| | | | | | | | | | | | | | I noticed that there are lots of basic blocks that don't have enough legal instructions in them to warrant outlining. We can skip mapping these entirely. In sqlite3, compiled for AArch64 at -Oz, this results in a 10% reduction of the total nodes in the suffix tree. These nodes can never be part of a repeated substring, and so they don't impact the result at all. Before this, there were 62128 nodes in the tree for sqlite3. After this, there are 56457 nodes. llvm-svn: 346373
* Extend virtual file system with `isLocal` methodJonas Devlieghere2018-11-081-0/+25
| | | | | | | | Expose the `llvm::sys::fs::is_local` function through the VFS. Differential revision: https://reviews.llvm.org/D54127 llvm-svn: 346372
* [PGO] Exit early if all count values are zeroRong Xu2018-11-071-3/+12
| | | | | | | | | If all the edge counts for a function are zero, skip count population and annotation, as nothing will happen. This can save some compile time. Differential Revision: https://reviews.llvm.org/D54212 llvm-svn: 346370
* [AArch64] [Windows] Address post-commit review comment on r346358.Eli Friedman2018-11-071-1/+2
| | | | | | | | In this context, usesWindowsCFI() is basically the same thing as isOSWindows(), but it makes the relevant property of the target more explicit. llvm-svn: 346366
* Add parentheses to silence warning.Jorge Gorbe Moya2018-11-071-1/+1
| | | | | | DWARFContext.cpp:356:20: error: using the result of an assignment as a condition without parentheses [-Werror,-Wparentheses] llvm-svn: 346365
* Revert "AMDGPU: Divergence-driven selection of scalar buffer load intrinsics"Nicolai Haehnle2018-11-076-90/+220
| | | | | | | | This reverts commit r344696 for now (except for some test additions). See https://bugs.freedesktop.org/show_bug.cgi?id=108611. llvm-svn: 346364
* AMDGPU/InsertWaitcnts: Cleanup some old cruft (NFCI)Nicolai Haehnle2018-11-071-91/+71
| | | | | | | | | | | | Summary: Remove redundant logic and simplify control flow. Reviewers: msearles, rampitec, scott.linder, kanarayan Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D54086 llvm-svn: 346363
* AMDGPU/InsertWaitcnts: Remove kill-related logicNicolai Haehnle2018-11-071-101/+1
| | | | | | | | | | | | | | | | Summary: This is not needed, because we don't actually insert relevant branches for KILLs that late in the compilation flow. Besides, this was always checking for the wrong kill opcode anyway... Reviewers: msearles, rampitec, scott.linder, kanarayan Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D54085 llvm-svn: 346362
* AMDGPU/NFC: Split FLAT_Global_Atomic_Pseudo into RTN/NO_RTN multiclassesKonstantin Zhuravlyov2018-11-071-11/+30
| | | | llvm-svn: 346361
* [DWARFv5] Read and dump multiple .debug_info sections.Paul Robinson2018-11-072-33/+66
| | | | | | | | Type units go in .debug_info comdats, not .debug_types, in v5. Differential Revision: https://reviews.llvm.org/D53907 llvm-svn: 346360
* [AArch64] [Windows] Trap after noreturn calls.Eli Friedman2018-11-071-0/+10
| | | | | | | | | Like the comment says, this isn't the most efficient fix in terms of codesize, but it works. Differential Revision: https://reviews.llvm.org/D54129 llvm-svn: 346358
* AMDGPU/NFC: Split MUBUF_Pseudo_Atomics into RTN/NO_RTN multiclassesKonstantin Zhuravlyov2018-11-071-5/+16
| | | | llvm-svn: 346357
* [ARM] Fix CPSR liveness in tMOVCCr_pseudo lowering.Eli Friedman2018-11-071-0/+44
| | | | | | | | | | | | | | | | | | | | | The lowering was missing live-ins in certain cases, like a sequence of multiple tMOVCCr_pseudo instructions. This would lead to a verifier failure, and on pre-v6 Thumb CPSR would be incorrectly clobbered. For reasons I don't completely understand, it's hard to get a sequence of multiple tMOVCCr_pseudo instructions; the issue only seems to show up with 64-bit comparisons where the result is zero-extended. I added some extra testcases in case that changes in the future. Probably some optimization opportunities here if anyone is interested. (@test_slt_not is the case that was getting miscompiled.) The code to check the liveness of CPSR was stolen from X86ISelLowering.cpp; maybe it could be refactored into common helper, but I have no idea where to put it. Differential Revision: https://reviews.llvm.org/D54192 llvm-svn: 346355
* Allow subclassing ExternalAAMatt Arsenault2018-11-075-29/+24
| | | | | | | | | | | | | | This allows testing AMDGPU alias analysis like any other alias analysis pass. This fixes the existing test pointlessly running opt -O3 when it really just wants to run the one analysis. Before there was no way to test this using -aa-eval with opt, since the default constructed pass is run. The wrapper subclass allows the default constructor to pass the necessary callback. llvm-svn: 346353
* [SimpleLoopUnswitch] partial unswitch needs to be careful when replacing ↵Fedor Sergeev2018-11-071-1/+14
| | | | | | | | | | | | | | | | | | | | | | | | | invariants with constants When partial unswitch operates on multiple conditions at once, .e.g: if (Cond1 || Cond2 || NonInv) ... it should infer (and replace) values for individual conditions only on one side of unswitch and not another. More precisely only these derivations hold true: (Cond1 || Cond2) == false => Cond1 == Cond2 == false (Cond1 && Cond2) == true => Cond1 == Cond2 == true By the way we organize unswitching it means only replacing on "continue" blocks and never on "unswitched" ones. Since trivial unswitch does not have "unswitched" blocks it does not have this problem. Fixes PR 39568. Reviewers: chandlerc, asbirlea Differential Revision: https://reviews.llvm.org/D54211 llvm-svn: 346350
* [MachineOutliner][NFC] Remove Parent field from SuffixTreeNodeJessica Paquette2018-11-071-28/+14
| | | | | | | | | | This is only used for calculating ConcatLen. This isn't necessary, since it's easily derived from the traversal setting suffix indices. Remove that. Rename CurrIdx to CurrNodeLen to better describe what's going on. llvm-svn: 346349
* [MachineOutliner][NFC] Traverse suffix tree using a RepeatedSubstring iteratorJessica Paquette2018-11-071-53/+111
| | | | | | | | | This takes the traversal methods introduced in r346269 and adapts them into an iterator. This allows the outliner to iterate over repeated substrings within the suffix tree directly without having to initially find all of the substrings and then iterate over them after you've found them. llvm-svn: 346345
* [MachineOutliner] Don't store outlined function numberings on OutlinedFunctionJessica Paquette2018-11-071-5/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | NFC-ish. This doesn't change the behaviour of the outliner, but does make sure that you won't end up with say OUTLINED_FUNCTION_2: ... ret OUTLINED_FUNCTION_248: ... ret as the only outlined functions in your module. Those should really be OUTLINED_FUNCTION_0: ... ret OUTLINED_FUNCTION_1: ... ret If we produce outlined functions, they probably should have sequential numbers attached to them. This makes it a bit easier+stable to write outliner tests. The point of this is to move towards a bit more stability in outlined function names. By doing this, we at least don't rely on the traversal order of the suffix tree. Instead, we rely on the order of the candidate list, which is *far* more consistent. The candidate list is ordered by the end indices of candidates, so we're more likely to get a stable ordering. This is still susceptible to changes in the cost model though (like, if we suddenly find new candidates, for example). llvm-svn: 346340
* [LoopSink] Do not sink instructions into non-cold blocksMandeep Singh Grang2018-11-071-0/+7
| | | | | | | | | | | | | | Summary: This fixes PR39570. Reviewers: danielcdh, rnk, bkramer Reviewed By: rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D54181 llvm-svn: 346337
* [X86] improve split-stack machine BB placementThan McIntosh2018-11-071-2/+2
| | | | | | | | | | | | | | | Summary: The conditional branch created to support -fsplit-stack for X86 is left unbiased/unhinted, resulting in less than ideal block placement: the __morestack call block is kept on the main hot path. Bias the branch to insure that the stack allocation block is treated as a "cold" block during machine basic block placement. Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D54123 llvm-svn: 346336
* [NewGVN] Make sure we do not add a user to itself.Florian Hahn2018-11-071-3/+7
| | | | | | | | | | | | | | | | If we simplify an instruction to itself, we do not need to add a user to itself. For congruence classes with a defining expression, we already use a similar logic. Fixes PR38259. Reviewers: davide, efriedma, mcrosier Reviewed By: davide Differential Revision: https://reviews.llvm.org/D51168 llvm-svn: 346335
* Fix ignorded type qualifier warning [NFC]Serge Guelton2018-11-071-1/+1
| | | | llvm-svn: 346332
* [InstCombine] propagate FMF for fcmp+fabs foldsSanjay Patel2018-11-071-7/+13
| | | | | | | By morphing the instruction rather than deleting and creating a new one, we retain fast-math-flags and potentially other metadata (profile info?). llvm-svn: 346331
* [InstCombine] peek through fabs() when checking isnan()Sanjay Patel2018-11-071-1/+6
| | | | | | | | | That should be the end of the missing cases for this fold. See earlier patches in this series: rL346321 rL346324 llvm-svn: 346327
* [InstCombine] add folds for fcmp Pred fabs(X), 0.0Sanjay Patel2018-11-071-0/+8
| | | | | | | | Similar to rL346321, we had folds for the ordered versions of these compares already, so add the unordered siblings for completeness. llvm-svn: 346324
* Add support for llvm.is.constant intrinsic (PR4898)James Y Knight2018-11-076-14/+73
| | | | | | | | | | | | | | | This adds the llvm-side support for post-inlining evaluation of the __builtin_constant_p GCC intrinsic. Also fixed SCCPSolver::visitCallSite to not blow up when seeing a call to a function where canConstantFoldTo returns true, and one of the arguments is a struct. Updated from patch initially by Janusz Sobczak. Differential Revision: https://reviews.llvm.org/D4276 llvm-svn: 346322
* [InstCombine] add fold for fabs(X) u< 0.0Sanjay Patel2018-11-071-0/+5
| | | | | | | | | | | | | | The sibling fold for 'oge' --> 'ord' was already here, but this half was missing. The result of fabs() must be positive or nan, so asking if the result is negative or nan is the same as asking if the result is nan. This is another step towards fixing: https://bugs.llvm.org/show_bug.cgi?id=39475 llvm-svn: 346321
* Fix unit tests after patch https://reviews.llvm.org/rL346313Calixte Denizet2018-11-071-5/+1
| | | | | | | | | | | | | | Summary: Tests are broken so fix them. Reviewers: marco-c Reviewed By: marco-c Subscribers: sylvestre.ledru, llvm-commits Differential Revision: https://reviews.llvm.org/D54208 llvm-svn: 346318
OpenPOWER on IntegriCloud