path: root/llvm/lib
Commit message | Author | Age | Files | Lines
* [Hexagon] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). | Eugene Zelenko | 2017-07-29 | 8 | -216/+279
    llvm-svn: 309469
* [llvm] Update MachOObjectFile::exports interface | Alexander Shaposhnikov | 2017-07-29 | 1 | -3/+2
    This diff removes the second argument of the method MachOObjectFile::exports. In all in-tree uses this argument is equal to "this" and without this argument the interface seems to be cleaner.
    Test plan: make check-all
    llvm-svn: 309462
* Remove the unused offset field from LiveDebugValues (NFC) | Adrian Prantl | 2017-07-28 | 1 | -16/+3
    Followup to r309426.
    rdar://problem/33580047
    llvm-svn: 309455
* Remove the unused offset field from LiveDebugVariables (NFC) | Adrian Prantl | 2017-07-28 | 1 | -17/+14
    Followup to r309426.
    rdar://problem/33580047
    llvm-svn: 309451
* Remove the unused offset from DBG_VALUE (NFC) | Adrian Prantl | 2017-07-28 | 7 | -23/+23
    Followup to r309426.
    rdar://problem/33580047
    llvm-svn: 309450
* Remove the unused DBG_VALUE offset parameter from GlobalISel (NFC) | Adrian Prantl | 2017-07-28 | 2 | -9/+7
    Followup to r309426.
    rdar://problem/33580047
    llvm-svn: 309449
* Remove the unused DBG_VALUE offset parameter from RegAllocFast (NFC) | Adrian Prantl | 2017-07-28 | 1 | -2/+4
    Followup to r309426.
    rdar://problem/33580047
    llvm-svn: 309446
* [SimplifyCFG] Make the no-jump-tables attribute also disable switch lookup tables | Sumanth Gundapaneni | 2017-07-28 | 1 | -3/+6
    Differential Revision: https://reviews.llvm.org/D35579
    llvm-svn: 309444
* [libFuzzer] improve support for inline-8bit-counters (make it more correct and faster) | Kostya Serebryany | 2017-07-28 | 3 | -3/+21
    llvm-svn: 309443
* [Hexagon] Formatting changes, NFC | Krzysztof Parzyszek | 2017-07-28 | 1 | -66/+49
    llvm-svn: 309442
* [Inliner] Do not apply any bonus for cold callsites. | Easwaran Raman | 2017-07-28 | 1 | -28/+75
    Summary: Inlining threshold is increased by application of bonuses when the callee has a single reachable basic block or is rich in vector instructions. Similarly, inlining cost is reduced by applying a large bonus when the last call to a static function is considered for inlining. This patch disables the application of these bonuses when the callsite or the callee is cold. The intention here is to prevent a large cold callsite from being inlined to a non-cold caller that could prevent the caller from being inlined. This is especially important when the cold callsite is a last call to a static since the associated bonus is very high.
    Reviewers: chandlerc, davidxl
    Subscribers: danielcdh, llvm-commits
    Differential Revision: https://reviews.llvm.org/D35823
    llvm-svn: 309441
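The rule described in the commit above can be illustrated with a small stand-alone sketch. This is not the code from the patch: the struct, the function name, and every numeric value in the usage note are assumptions chosen for illustration only.

```cpp
#include <cassert>

// Hypothetical inputs; these are not the fields LLVM's inline cost analysis uses.
struct CallSiteFacts {
  bool IsCold;               // callsite or callee is cold
  bool CalleeHasSingleBlock; // callee has a single reachable basic block
  bool CalleeIsVectorRich;   // callee is rich in vector instructions
  bool IsLastCallToStatic;   // last call to a static function
};

struct ThresholdAndCost {
  int Threshold;
  int Cost;
};

// Applies the threshold and cost bonuses unless the callsite is cold.
ThresholdAndCost applyBonusesSketch(const CallSiteFacts &Facts,
                                    int BaseThreshold, int BaseCost,
                                    int SingleBlockBonusPercent,
                                    int VectorBonusPercent,
                                    int LastCallBonus) {
  assert(BaseThreshold >= 0 && "sketch assumes a non-negative threshold");
  ThresholdAndCost Result{BaseThreshold, BaseCost};
  if (Facts.IsCold)
    return Result; // cold: no bonus of any kind is applied
  if (Facts.CalleeHasSingleBlock)
    Result.Threshold += BaseThreshold * SingleBlockBonusPercent / 100;
  if (Facts.CalleeIsVectorRich)
    Result.Threshold += BaseThreshold * VectorBonusPercent / 100;
  if (Facts.IsLastCallToStatic)
    Result.Cost -= LastCallBonus; // a large bonus lowers the effective cost
  return Result;
}
```

With made-up values (base threshold 100, base cost 500, bonuses of 50%, 150% and 15000), a cold callsite keeps threshold 100 and cost 500, while an otherwise identical non-cold callsite ends up with threshold 300 and cost -14500.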
* Remove the unused dbg.value offset from SelectionDAG (NFC) | Adrian Prantl | 2017-07-28 | 5 | -63/+43
    Followup to r309426.
    rdar://problem/33580047
    llvm-svn: 309436
* Remove the obsolete offset parameter from @llvm.dbg.value | Adrian Prantl | 2017-07-28 | 6 | -31/+43
    There is no situation where this rarely-used argument cannot be substituted with a DIExpression and removing it allows us to simplify the DWARF backend. Note that this patch does not yet remove any of the newly dead code.
    rdar://problem/33580047
    Differential Revision: https://reviews.llvm.org/D35951
    llvm-svn: 309426
* [SLP] Allow vectorization of the instruction from the same basic blocks only, NFC. | Alexey Bataev | 2017-07-28 | 1 | -3/+8
    Summary: After some changes in SLP vectorizer we missed some additional checks to limit the instructions for vectorization. We should not perform analysis of the instructions if the parent of instruction is not the same as the parent of the first instruction in the tree or it was analyzed already.
    Subscribers: mzolotukhin
    Differential Revision: https://reviews.llvm.org/D34881
    llvm-svn: 309425
* Fix conditional tail call branch folding when both edges are the same | Reid Kleckner | 2017-07-28 | 1 | -2/+3
    The conditional tail call logic did the wrong thing when both destinations of a conditional branch were the same:

      BB#1: derived from LLVM BB %entry
          Live Ins: %EFLAGS
          Predecessors according to CFG: BB#0
              JE_1 <BB#5>, %EFLAGS<imp-use,kill>
              JMP_1 <BB#5>

      BB#5: derived from LLVM BB %sw.epilog
          Predecessors according to CFG: BB#1
              TCRETURNdi64 <ga:@mergeable_conditional_tailcall>, 0, ...

    We would fold the JE_1 to a TCRETURNdi64cc, and then remove our BB#5 successor. Then BB#5 would be deleted as it had no predecessors, leaving a dangling "JMP_1 <BB#5>" reference behind to cause assertions later.
    This patch checks that both conditional branch destinations are different before doing the transform. The standard branch folding logic is able to remove both the JMP_1 and the JE_1, and for my test case we end up forming a better conditional tail call later.
    Fixes PR33980
    llvm-svn: 309422
* AMDGPU: Look through a bitcast user of an out argument | Matt Arsenault | 2017-07-28 | 1 | -16/+101
    This allows handling of a lot more of the interesting cases in Blender. Most of the large functions unlikely to be inlined have this pattern.
    This is a special case for what clang emits for OpenCL 3 element vectors. Annoyingly, these are emitted as <3 x elt>* pointers, but accessed as <4 x elt>* operations. This also needs to handle cases where a struct containing a single vector is used.
    llvm-svn: 309419
* [Value Tracking] Refactor icmp comparison logic into helper. NFC. | Chad Rosier | 2017-07-28 | 1 | -41/+62
    llvm-svn: 309417
* AMDGPU: Add pass to replace out arguments | Matt Arsenault | 2017-07-28 | 4 | -0/+381
    It is better to return arguments directly in registers if we are making a call rather than introducing expensive stack usage. In a sample compile of one of Blender's many kernel variants, this fires on about 20 different functions. Future improvements may be to recognize simple cases where the pointer is indexing a small array. This also fails when the store to the out argument is in a separate block from the return, which happens in a few of the Blender functions. This should also probably be using MemorySSA which might help with that.
    I'm not sure this is correct as a FunctionPass, but MemoryDependenceAnalysis seems to not work with a ModulePass.
    I'm also not sure where it should run. I think it should run before DeadArgumentElimination, so maybe either EP_CGSCCOptimizerLate or EP_ScalarOptimizerLate.
    llvm-svn: 309416
* [LVI] Constant-propagate a zero extension of the switch condition value through case edges | Hiroshi Yamauchi | 2017-07-28 | 1 | -4/+108
    Summary: LazyValueInfo currently computes the constant value of the switch condition through case edges, which allows the constant value to be propagated through the case edges. But we have seen a case where a zero-extended value of the switch condition is used past case edges for which the constant propagation doesn't occur.
    This patch adds a small amount of logic to handle such a case in getEdgeValueLocal().
    This is motivated by the Python 2.7 eval loop in PyEval_EvalFrameEx() where the lack of the constant propagation causes longer live ranges and more spill code than necessary.
    With this patch, we see that the code size of PyEval_EvalFrameEx() decreases by ~5.4% and a performance test improves by ~4.6%.
    Reviewers: wmi, dberlin, sanjoy
    Reviewed By: sanjoy
    Subscribers: davide, davidxl, llvm-commits
    Differential Revision: https://reviews.llvm.org/D34822
    llvm-svn: 309415
* GlobalISel: map 128-bit values to an FPR by default. | Tim Northover | 2017-07-28 | 1 | -1/+2
    Eventually we may want to allow a pair of GPRs but absolutely nothing in the entire world is ready for that yet.
    llvm-svn: 309404
* AMDGPU: Annotate implicitarg.ptr usage | Matt Arsenault | 2017-07-28 | 6 | -6/+32
    We need to pass something to functions for this to work. It isn't derivable just from the kernarg segment pointer because the implicit arguments are placed after the kernel arguments.
    Also fixes missing test for the intrinsic.
    llvm-svn: 309398
* [GVN] Recommit the patch "Add phi-translate support in scalarpre" | Wei Mi | 2017-07-28 | 1 | -28/+158
    Recommit after working around the bug PR31652.
    Three bugs fixed in previous recommits: The first one is to use CurrentBlock instead of PREInstr's Parent as param of performScalarPREInsertion because the Parent of a clone instruction may be uninitialized. The second one is to stop PRE when CurrentBlock to its predecessor is a backedge and an operand of CurInst is defined inside of CurrentBlock. The same value defined inside of loop in last iteration cannot be regarded as available. The third one is an out-of-bound array access in a flipped if guard.
    Right now scalarpre doesn't have phi-translate support, so it will miss some simple pre opportunities. Like the following testcase, current scalarpre cannot recognize the last "a * b" is fully redundant because a and b used by the last "a * b" expr are both defined by phis.

      long a[100], b[100], g1, g2, g3;
      __attribute__((pure)) long goo();

      void foo(long a, long b, long c, long d) {
        g1 = a * b;
        if (__builtin_expect(g2 > 3, 0)) {
          a = c;
          b = d;
          g2 = a * b;
        }
        g3 = a * b; // fully redundant.
      }

    The patch adds phi-translate support in scalarpre. This is only a temporary solution before the newpre based on newgvn is available.
    Differential Revision: https://reviews.llvm.org/D32252
    llvm-svn: 309397
* [ValueTracking] Remove a number of unused arguments. NFC. | Chad Rosier | 2017-07-28 | 1 | -26/+17
    llvm-svn: 309385
* [AArch64] Standardize suffixes for LSE Atomics mnemonics (NFCI) | Joel Jones | 2017-07-28 | 3 | -130/+130
    This NFC changeset standardizes the suffixes used for LSE Atomics instructions. It changes the existing suffixes - 'b', 'h', 's', 'd' - to the existing standard 'B', 'H', 'W' and 'X'.
    This changeset is the result of the code review discussion for D35319.
    Patch by: steleman
    Differential Revision: https://reviews.llvm.org/D35927
    llvm-svn: 309384
* [ARM] Add the option to directly access TLS pointer | Strahinja Petrovic | 2017-07-28 | 3 | -1/+16
    This patch enables choice for accessing thread local storage pointer (like '-mtp' in gcc).
    Differential Revision: https://reviews.llvm.org/D34408
    llvm-svn: 309381
* [SCEV] Do not visit nodes twice in containsConstantSomewhere | Max Kazantsev | 2017-07-28 | 1 | -13/+20
    This patch reworks the function that searches constants in Add and Mul SCEV expression chains so that now it does not visit a node more than once, and also renames this function for better correspondence between its implementation and semantics.
    Differential Revision: https://reviews.llvm.org/D35931
    llvm-svn: 309367
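The "do not visit a node more than once" idea in the commit above is the usual worklist-plus-visited-set traversal. Below is a minimal sketch under stated assumptions: `ExprNode` and `containsConstantInAddMulChain` are hypothetical stand-ins, not the SCEV classes or the renamed LLVM function.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical expression node; this is not the SCEV class hierarchy.
struct ExprNode {
  enum Kind { Constant, Add, Mul, Other } K;
  std::vector<const ExprNode *> Operands;
};

// Returns true if a Constant node is reachable from Root through Add/Mul
// chains. NumVisited (optional) reports how many distinct nodes were examined.
bool containsConstantInAddMulChain(const ExprNode *Root,
                                   unsigned *NumVisited = nullptr) {
  assert(Root && "expected a non-null expression");
  std::unordered_set<const ExprNode *> Visited;
  std::vector<const ExprNode *> Worklist{Root};
  bool Found = false;
  while (!Worklist.empty() && !Found) {
    const ExprNode *N = Worklist.back();
    Worklist.pop_back();
    if (!Visited.insert(N).second)
      continue; // already examined through another operand path
    if (N->K == ExprNode::Constant)
      Found = true;
    else if (N->K == ExprNode::Add || N->K == ExprNode::Mul)
      for (const ExprNode *Op : N->Operands)
        Worklist.push_back(Op);
  }
  if (NumVisited)
    *NumVisited = static_cast<unsigned>(Visited.size());
  return Found;
}
```

Because expression operands can be shared, a naive recursive walk may examine the same subexpression many times; the visited set bounds the work by the number of distinct nodes.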
* [MachineOutliner] NFC: Comment tidying | Jessica Paquette | 2017-07-28 | 1 | -23/+1
    The comment on describing the suffix tree had some pruning stuff that was out of date in it. Also fixed some typos.
    llvm-svn: 309365
* MC: add support for cfi_return_column | Saleem Abdulrasool | 2017-07-28 | 4 | -10/+40
    This adds support for the CFI pseudo-op return_column. This specifies the frame table column which contains the return address.
    Addresses PR33953!
    llvm-svn: 309360
* MC: clang-format enumeration (NFC) | Saleem Abdulrasool | 2017-07-28 | 1 | -29/+146
    This was hard to insert elements into. clang-format it so that it is easier. NFC.
    llvm-svn: 309359
* Revert "[SCEV] Cache results of computeExitLimit" | Sanjoy Das | 2017-07-28 | 1 | -21/+0
    This reverts commit r309080. The patch needs to clear out the ScalarEvolution::ExitLimits cache in forgetMemoizedResults.
    I've replied on the commit thread for the patch with more details.
    llvm-svn: 309357
* [MachineOutliner] NFC: Split up getOutliningBenefit | Jessica Paquette | 2017-07-28 | 5 | -268/+312
    This is some more cleanup in preparation for some actual functional changes. This splits getOutliningBenefit into two cost functions: getOutliningCallOverhead and getOutliningFrameOverhead. These functions return the number of instructions that would be required to call a specific function and the number of instructions that would be required to construct a frame for a specific function. The actual outlining benefit logic is moved into the outliner, which calls these functions.
    The goal of refactoring getOutliningBenefit is to:
    - Get us closer to getting rid of the IsTailCall flag
    - Further split up "target-specific" things and "general algorithm" things
    llvm-svn: 309356
* [JumpThreading] Stop falsely preserving LazyValueInfo. | Davide Italiano | 2017-07-28 | 1 | -1/+0
    JumpThreading claims to preserve LVI, but it doesn't preserve the analyses which LVI holds a reference to (e.g. the Dominator). In the current pass manager infrastructure, after JT runs, the PM frees these analyses (including DominatorTree) but preserves LVI.
    CorrelatedValuePropagation runs immediately after and queries a corrupted domtree, causing weird miscompiles.
    This commit disables the preservation of LVI for the time being. Eventually, we should either move LVI to a proper dependency tracking mechanism (i.e. an analysis shouldn't hold references to other analyses and compute them on demand if needed), or we should teach all the passes preserving LVI to preserve the analyses LVI depends on.
    The new pass manager has a mechanism to invalidate LVI in case one of the analyses it depends on becomes invalid, so this problem shouldn't exist (at least not in this immediate form), but handling of analyses holding references is still a very delicate subject.
    Fixes PR33917 (and rustc).
    llvm-svn: 309355
* DebugInfo: Consider a CU containing only local imported entities to be 'empty' | David Blaikie | 2017-07-28 | 1 | -11/+14
    This can come up in ThinLTO & wastes space & makes degenerate IR.
    As per the added FIXME, ultimately, local imported entities should hang off the function and that way the imported entity list on the CU can be tested for emptiness like all the other CU lists.
    (function-attached local imported entities are probably also the best path forward for fixing how imported entities are handled both in cross-module use (currently, while ThinLTO preserves the imported entities, they would not get used at the imported inlined location - only in the abstract origin that appears in the partial CU created by the import (which isn't emitted under Fission due to cross-CU limitations there)) and to reduce the number of points where imported entities are emitted (they're currently emitted into every inlined instance, concrete instance, and abstract origin - they should only go in the abstract origin if there is one, otherwise in the concrete instance - but this requires lots of delayed handling and wiring up, same as abstract variables & subprograms))
    llvm-svn: 309354
* [JumpThreading] Add an option to dump LazyValueInfo after the run. | Davide Italiano | 2017-07-28 | 1 | -2/+15
    Differential Revision: https://reviews.llvm.org/D35973
    llvm-svn: 309353
* ARMFrameLowering: Only set ExtraCSSpill for actually unused registers. | Matthias Braun | 2017-07-28 | 1 | -9/+18
    The code assumed that unclobbered/unspilled callee saved registers are unused in the function. This is not true for callee saved registers that are also used to pass parameters such as swiftself.
    rdar://33401922
    llvm-svn: 309350
* Changing the default MaxNumPromotions from 2 to 3. | Dehao Chen | 2017-07-28 | 1 | -1/+1
    Summary: In performance tuning, we see performance benefits when enlarging the maximum number of promotion targets to 3. This is safe as long as we have the total percentage threshold properly set up (https://reviews.llvm.org/D35962).
    Reviewers: davidxl, tejohnson
    Reviewed By: tejohnson
    Subscribers: llvm-commits, sanjoy
    Differential Revision: https://reviews.llvm.org/D35966
    llvm-svn: 309346
* Separate the ICP total threshold and remaining threshold. | Dehao Chen | 2017-07-28 | 1 | -12/+20
    Summary: In the current implementation, isPromotionProfitable only checks if the call count to a direct target is no less than a certain percentage threshold of the remaining call counts that have not been promoted. This causes code size problems when the target count is small but greater than a large portion of remaining counts. E.g. target1 takes 99.9%, while target2 takes 0.1%. Both targets will be promoted and inlined, making the function size too large, which potentially prevents it from further inlining into its callers. This patch adds another percentage threshold against the total indirect call count. The target count needs to be no less than both thresholds in order to be promoted speculatively.
    Reviewers: davidxl, tejohnson
    Reviewed By: tejohnson
    Subscribers: sanjoy, llvm-commits
    Differential Revision: https://reviews.llvm.org/D35962
    llvm-svn: 309345
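The two-threshold rule from the commit above can be sketched as follows. This is an illustrative stand-alone example rather than the patch itself: the function name, the parameter names, and the percentages used in the usage note (5% of total, 30% of remaining) are assumptions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch: a target is promoted only if its call count clears
// BOTH percentage thresholds. Names and signature are illustrative, not LLVM's.
// The multiplications assume counts small enough not to overflow 64 bits.
bool isPromotionProfitableSketch(std::uint64_t TargetCount,
                                 std::uint64_t TotalCount,
                                 std::uint64_t RemainingCount,
                                 std::uint64_t TotalPercent,
                                 std::uint64_t RemainingPercent) {
  assert(TotalPercent <= 100 && RemainingPercent <= 100);
  // Threshold against the calls that have not been promoted yet.
  bool PassesRemaining =
      TargetCount * 100 >= RemainingCount * RemainingPercent;
  // Threshold against all calls made through this indirect call site.
  bool PassesTotal = TargetCount * 100 >= TotalCount * TotalPercent;
  return PassesRemaining && PassesTotal;
}
```

Taking the 99.9% / 0.1% example with 1000 total calls: target1 (999 calls, 1000 remaining) passes both checks. Once it is promoted, target2 (1 call, 1 remaining) still passes the remaining-count check, since it accounts for all remaining calls, but fails the total-count check, so it is no longer promoted.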
* Increase the ImportHotMultiplier to 10.0 | Dehao Chen | 2017-07-28 | 1 | -1/+1
    Summary: The original 3.0 hot multiplier is too small, and would prevent hot callsites from being inlined. This patch increases the hot multiplier to 10.0.
    Reviewers: davidxl, tejohnson
    Reviewed By: tejohnson
    Subscribers: llvm-commits, sanjoy
    Differential Revision: https://reviews.llvm.org/D35969
    llvm-svn: 309344
* [X86] Fix latent bug in sibcall eligibility logic | Reid Kleckner | 2017-07-28 | 1 | -0/+7
    The X86 tail call eligibility logic was correct when it was written, but the addition of inalloca and argument copy elision broke its assumptions. It was assuming that fixed stack objects were immutable.
    Currently, we aim to emit a tail call if no arguments have to be re-arranged in memory. This code would trace the outgoing argument values back to check if they are loads from an incoming stack object. If the stack argument is immutable, then we won't need to store it back to the stack when we tail call.
    Fortunately, stack objects track their mutability, so we can just make the obvious check to fix the bug.
    This was http://crbug.com/749826
    llvm-svn: 309343
* [sanitizer-coverage] rename sanitizer-coverage-create-pc-table into sanitizer-coverage-pc-table and add plumbing for a clang flag | Kostya Serebryany | 2017-07-28 | 1 | -4/+5
    llvm-svn: 309337
* Remove unused function from AArch64 backend (NFC) | Adrian Prantl | 2017-07-27 | 2 | -16/+0
    llvm-svn: 309336
* [sanitizer-coverage] add a feature sanitizer-coverage-create-pc-table=1 (works with trace-pc-guard and inline-8bit-counters) that adds a static table of instrumented PCs to be used at run-time | Kostya Serebryany | 2017-07-27 | 1 | -22/+81
    llvm-svn: 309335
* [MachineOutliner] Cleanup: move findCandidates out of suffix tree | Jessica Paquette | 2017-07-27 | 1 | -204/+166
    Doing some cleanup in preparation for some functional changes. This commit moves findCandidates out of the suffix tree and into the MachineOutliner class. This is much easier to follow, and removes the burden of candidate choice from the suffix tree.
    It also adds a couple FIXMEs and simplifies building outlined function names.
    llvm-svn: 309334
* [PDB] Initialize the std::array<ulittle32_t> used for the gsi bitmap | Reid Kleckner | 2017-07-27 | 1 | -0/+2
    With ASan, we would write about 512 bytes of malloc fill value to the PDB, with some random bits ORed in here and there. Dumping the PDB would always fail reliably.
    llvm-svn: 309331
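The class of bug fixed above comes from the fact that a default-initialized `std::array` of a trivial element type holds indeterminate values, so ORing bits into it keeps whatever garbage was already there. A minimal sketch of the safe pattern follows; the word count, the function name, and the use of `uint32_t` in place of `ulittle32_t` are assumptions, and this is not the code from the patch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical bitmap size; the commit message does not state the real one.
constexpr std::size_t NumBitmapWords = 16;

// Builds a presence bitmap with one bit per bucket index.
std::array<std::uint32_t, NumBitmapWords>
buildBitmapSketch(const std::vector<unsigned> &PresentBuckets) {
  // The trailing {} value-initializes every word to zero. Without it the
  // words would start out indeterminate and |= would preserve the garbage.
  std::array<std::uint32_t, NumBitmapWords> Bitmap{};
  for (unsigned Bucket : PresentBuckets) {
    assert(Bucket < NumBitmapWords * 32 && "bucket index out of range");
    Bitmap[Bucket / 32] |= std::uint32_t{1} << (Bucket % 32);
  }
  return Bitmap;
}
```

For buckets 0 and 33, word 0 becomes 1, word 1 becomes 2, and all other words stay zero.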
* [ConstantFolder] Don't try to fold gep when the idx is a vector. | Davide Italiano | 2017-07-27 | 1 | -4/+8
    The code in ConstantFoldGetElementPtr() assumes integers, and therefore it crashes trying to get the integer bitwidth of a vector type (in this case <4 x i32>). I just changed the code to prevent the folding in case of vectors and I didn't bother to generalize as this doesn't seem to me something that really happens in practice, but I'm willing to change the patch if you think it's worth it. This is hard to trigger from -instsimplify or -instcombine only as the second instruction is dead, so the test uses loop-unroll.
    Differential Revision: https://reviews.llvm.org/D35956
    llvm-svn: 309330
* [X86] Don't lie about legality to TLI's demanded bits. | Ahmed Bougacha | 2017-07-27 | 1 | -2/+2
    Like r309323, X86 had a typo where it passed the wrong flags to TLO.
    Found by inspection; I haven't been able to tickle this into having observable behavior. I don't think it does, given that X86 doesn't have custom demanded bits logic, and the generic logic doesn't have a lot of exposure to illegal constructs.
    llvm-svn: 309325
* [AArch64] Remove outdated comment. NFC. | Ahmed Bougacha | 2017-07-27 | 1 | -2/+0
    There hasn't been a ternary since r231987.
    llvm-svn: 309324
* [AArch64] Fix legality info passed to demanded bits for TBI opt. | Ahmed Bougacha | 2017-07-27 | 1 | -2/+2
    The (seldom-used) TBI-aware optimization had a typo lying dormant since it was first introduced, in r252573: when asking for demanded bits, it told TLI that it was running after legalize, where the opposite was true.
    This is an important piece of information, that the demanded bits analysis uses to make assumptions about the node. r301019 added such an assumption, which was broken by the TBI combine.
    Instead, pass the correct flags to TLO.
    llvm-svn: 309323
* [ARM] Add use-misched feature, to enable the MachineScheduler. | Florian Hahn | 2017-07-27 | 3 | -8/+16
    Summary: This change makes it easier to experiment with the MachineScheduler in the ARM backend and also makes it very explicit which CPUs use the MachineScheduler (currently only swift and cyclone).
    Reviewers: MatzeB, t.p.northover, javed.absar
    Reviewed By: MatzeB
    Subscribers: aemerson, kristof.beyls, llvm-commits
    Differential Revision: https://reviews.llvm.org/D35935
    llvm-svn: 309316
* [MergeFunctions] Remove alias support. | whitequark | 2017-07-27 | 1 | -47/+4
    The alias support was dead code since 2011. It was last touched in r124182, where it was reintroduced after being removed in r110434, and since then it was gated behind a HasGlobalAliases flag that was permanently stuck as `false`.
    It is also broken. I'm not sure if it bitrotted or was just broken in the first place because it appears to have never been tested, but the following IR results in a crash:

      define internal i32 @a(i32 %a, i32 %b) unnamed_addr {
        %c = add i32 %a, %b
        %d = xor i32 %a, %c
        ret i32 %c
      }

      define internal i32 @b(i32 %a, i32 %b) unnamed_addr {
        %c = add i32 %a, %b
        %d = xor i32 %a, %c
        ret i32 %c
      }

    It seems safe to remove buggy untested code that no one cared about for seven years.
    Differential Revision: https://reviews.llvm.org/D34802
    llvm-svn: 309313