summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* AMDGPU: Fix sub_oneuse being marked commutativeMatt Arsenault2017-01-121-1/+2
| | | | llvm-svn: 291748
* [AVX-512] Improve lowering of zero_extend of v4i1 to v4i32 and v2i1 to v2i64 ↵Craig Topper2017-01-121-4/+4
| | | | | | with VLX, but no DQ or BW support. llvm-svn: 291747
* [AVX-512] Improve lowering of sign_extend of v4i1 to v4i32 and v2i1 to v2i64 ↵Craig Topper2017-01-121-11/+13
| | | | | | when avx512vl is available, but not avx512dq. llvm-svn: 291746
* [X86][AVX512] Fix PR31515 - Do not flip vselect condition if it's not a vXi1 ↵Elad Cohen2017-01-121-5/+8
| | | | | | | | | | | | | | | mask r289653 added a case where `vselect <cond> <vector1> <all-zeros>` is transformed to: `vselect xor(cond, DAG.getConstant(1, DL, CondVT) <all-zeros> <vector1>` This was not aimed to catch cases where Cond is not a vXi1 mask but it does. Moreover, when Cond type is VxiN (N > 1) then xor(cond, DAG.getConstant(1, DL, CondVT) != NOT(cond). This patch changes the above to xor with allones, and avoids entering the case for non-mask Conds. llvm-svn: 291745
* AMDGPU: Fold fneg into fma or fmadMatt Arsenault2017-01-121-0/+24
| | | | | | Patch mostly by Fiona Glaser llvm-svn: 291733
* AMDGPU: Fold fneg into fmulMatt Arsenault2017-01-121-0/+17
| | | | | | Patch mostly by Fiona Glaser llvm-svn: 291732
* AMDGPU: Fold fneg into faddMatt Arsenault2017-01-122-0/+61
| | | | | | Patch mostly by Fiona Glaser llvm-svn: 291731
* AMDGPU: Pull fneg/fabs out of a selectMatt Arsenault2017-01-111-0/+74
| | | | | | Allows better source modifier usage. llvm-svn: 291729
* [NewGVN] Fixup store count for the `initial` congruency class.Davide Italiano2017-01-111-3/+6
| | | | | | | | | | | | It was always zero. When we move a store from `initial` to its own congruency class, we end up with a negative store count, which is obviously wrong. Also, while here, change StoreCount to be signed so that the assertions actually fire. Ack'ed by Daniel Berlin. llvm-svn: 291725
* [CodeView] Finish decoupling TypeDatabase from TypeDumper.Zachary Turner2017-01-115-542/+576
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously the type dumper itself was passed around to a lot of different places and manipulated in ways that were more appropriate on the type database. For example, the entire TypeDumper was passed into the symbol dumper, when all the symbol dumper wanted to do was lookup the name of a TypeIndex so it could print it. That's what the TypeDatabase is for -- mapping type indices to names. Another example is how if the user runs llvm-pdbdump with the option to dump symbols but not types, we still have to visit all types so that we can print minimal information about the type of a symbol, but just without dumping full symbol records. The way we did this before is by hacking it up so that we run everything through the type dumper with a null printer, so that the output goes to /dev/null. But really, we don't need to dump anything, all we want to do is build the type database. Since TypeDatabaseVisitor now exists independently of TypeDumper, we can do this. We just build a custom visitor callback pipeline that includes a database visitor but not a dumper. All the hackery around printers etc goes away. After this patch, we could probably even delete the entire CVTypeDumper class since really all it is at this point is a thin wrapper that hides the details of how to build a useful visitation pipeline. It's not a priority though, so CVTypeDumper remains for now. After this patch we will be able to easily plug in a different style of type dumper by only implementing the proper visitation methods to dump one-line output and then sticking it on the pipeline. Differential Revision: https://reviews.llvm.org/D28524 llvm-svn: 291724
* X86: Remove dead code. NFC.Peter Collingbourne2017-01-111-10/+0
| | | | llvm-svn: 291721
* AMDGPU: Fix shrinking of addc/subb.Matt Arsenault2017-01-111-7/+25
| | | | | | To shrink to VOP2 the input carry must also be VCC. llvm-svn: 291720
* AMDGPU: Fix sext_inreg for i1 in i16Matt Arsenault2017-01-111-0/+5
| | | | | | | | This produces worse code when i16 is legal, mostly due to combines getting confused by conversions inserted for uniform 16-bit operations. llvm-svn: 291717
* AMDGPU: Fix breaking VOP3 v_add_i32sMatt Arsenault2017-01-111-1/+11
| | | | | | | This was shrinking the instruction even though the carry output register was a virtual register, not known VCC. llvm-svn: 291716
* [asan] Set alignment of __asan_global_* globals to sizeof(GlobalStruct)Kuba Mracek2017-01-111-6/+3
| | | | | | | | When using profiling and ASan together (-fprofile-instr-generate -fcoverage-mapping -fsanitize=address), at least on Darwin, the section of globals that ASan emits (__asan_globals) is misaligned and starts at an odd offset. This really doesn't have anything to do with profiling, but it triggers the issue because profiling emits a string section, which can have arbitrary size. This patch changes the alignment to sizeof(GlobalStruct). Differential Revision: https://reviews.llvm.org/D28573 llvm-svn: 291715
* Revert "[NewGVN] Strengthen a couple of assertions."Davide Italiano2017-01-111-2/+2
| | | | | | It's breaking some bots. Will investigate and recommit. llvm-svn: 291712
* AMDGPU: Fix folding immediates into mac src2Matt Arsenault2017-01-111-2/+30
| | | | | | | Whether it is legal or not needs to check for the instruction it will be replaced with. llvm-svn: 291711
* [NewGVN] Parenthesise assertion condition (-Wparenthesis).Davide Italiano2017-01-111-5/+4
| | | | | | Format an assertion message while I'm here. llvm-svn: 291710
* [NewGVN] Strengthen a couple of assertions.Davide Italiano2017-01-111-2/+2
| | | | | | StoreCount >= 0 on `unsigned` is always true, otherwise. llvm-svn: 291709
* LowerTypeTests: Represent the memory region size with the constant size-1.Peter Collingbourne2017-01-111-15/+15
| | | | | | | | | This means that we can use a shorter instruction sequence in the case where the size is a power of two and on the boundary between two representations. Differential Revision: https://reviews.llvm.org/D28421 llvm-svn: 291706
* [SCEV] Make howFarToZero max backedge-taken count check for precondition.Eli Friedman2017-01-111-0/+17
| | | | | | | | | Refines max backedge-taken count if a loop like "for (int i = 0; i != n; ++i) { /* body */ }" is rotated. Differential Revision: https://reviews.llvm.org/D28536 llvm-svn: 291704
* [SCEV] Make howFarToZero use a simpler formula for max backedge-taken count.Eli Friedman2017-01-111-11/+2
| | | | | | | | | This is both easier to understand, and produces a tighter bound in certain cases. Differential Revision: https://reviews.llvm.org/D28393 llvm-svn: 291701
* Re-apply r291205, "LowerTypeTests: Split the pass in two: a resolution phase ↵Peter Collingbourne2017-01-111-110/+155
| | | | | | and a lowering phase.", with a fix for an off-by-one error. llvm-svn: 291699
* NewGVN: Fix PR31594, by tracking the store count of congruenceDaniel Berlin2017-01-111-11/+50
| | | | | | | | | | | classes, and updating checking to allow for equivalence through reachability. (Sadly, the checking here is not perfect, and can't be made perfect, so we'll have to disable it after we are satisfied with correctness. Right now it is just "very unlikely" to happen.) llvm-svn: 291698
* NewGVN: Refactor performCongruenceFinding and split out congruence class movingDaniel Berlin2017-01-111-27/+43
| | | | llvm-svn: 291697
* Resubmit "[PGO] Turn off comdat renaming in IR PGO by default"Rong Xu2017-01-113-16/+71
| | | | | | This patch resubmits the changes in r291588. llvm-svn: 291696
* Revert "CodeGen: Allow small copyable blocks to "break" the CFG."Kyle Butt2017-01-111-52/+7
| | | | | | | | | This reverts commit ada6595a526d71df04988eb0a4b4fe84df398ded. This needs a simple probability check because there are some cases where it is not profitable. llvm-svn: 291695
* [ARM] More aggressive matching for vpadd and vpaddl.Eli Friedman2017-01-111-4/+104
| | | | | | | | | The new matchers work after legalization to make them simpler, and to avoid blocking other optimizations. Differential Revision: https://reviews.llvm.org/D27779 llvm-svn: 291693
* [SLP] Remove bogus assert.Michael Kuperstein2017-01-111-4/+0
| | | | | | | | | | | | The removed assert seems bogus - it's perfectly legal for the roots of the vectorized subtrees to be equal even if the original scalar values aren't, if the original scalars happen to be equivalent. This fixes PR31599. Differential Revision: https://reviews.llvm.org/D28539 llvm-svn: 291692
* [lib/Object] Unbreak build with -Werror (unused variable). NFCI.Davide Italiano2017-01-111-1/+1
| | | | llvm-svn: 291691
* Remove all variants of DWARFDie::getAttributeValueAs...() that had ↵Greg Clayton2017-01-112-45/+10
| | | | | | | | | | parameters that specified default values. Now we only support returning Optional<> values and have changed all clients over to use Optional::getValueOr(). Differential Revision: https://reviews.llvm.org/D28569 llvm-svn: 291686
* GlobalISel: only print debug info with -debug. NFC.Tim Northover2017-01-111-1/+1
| | | | | | Turns out DEBUG(...) has uses even inside NDEBUG checks. llvm-svn: 291685
* Revert rL291205 because it breaks Chrome tests under CFI.Ivan Krasin2017-01-111-155/+110
| | | | | | | | | | | | | | | | | | | | | Summary: Revert LowerTypeTests: Split the pass in two: a resolution phase and a lowering phase. This change separates how type identifiers are resolved from how intrinsic calls are lowered. All information required to lower an intrinsic call is stored in a new TypeIdLowering data structure. The idea is that this data structure can either be initialized using the module itself during regular LTO, or using the module summary in ThinLTO backends. Original URL: https://reviews.llvm.org/D28341 Reviewers: pcc Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D28532 llvm-svn: 291684
* Remove trailing whitespace. NFCI.Simon Pilgrim2017-01-111-3/+3
| | | | llvm-svn: 291680
* [MemDep] NFC variable name changePiotr Padlewski2017-01-111-3/+3
| | | | llvm-svn: 291679
* [lib/Object] - Introduce Decompressor class.George Rimar2017-01-113-66/+115
| | | | | | | | | | | | | Decompressor intention is to reduce duplication of code. Currently LLD has own implementation of decompressor for compressed debug sections. This class helps to avoid it and share the code. LLD patch for reusing it is D28106 Differential revision: https://reviews.llvm.org/D28105 llvm-svn: 291675
* [SystemZ] Improve isFoldableMemAccessOffset().Jonas Paulsson2017-01-111-2/+20
| | | | | | | | | | | A store of an extracted element or a load which gets inserted into a vector, will be combined into a vector load/store element instruction. Therefore, isFoldableMemAccessOffset(), which is called by LSR, should return false in these cases. Reviewer: Ulrich Weigand llvm-svn: 291673
* Make processing @llvm.assume more efficient - Add affected values to the ↵Hal Finkel2017-01-114-2/+117
| | | | | | | | | | | | | | | | | | | | | | | | assumption cache Here's my second try at making @llvm.assume processing more efficient. My previous attempt, which leveraged operand bundles, r289755, didn't end up working: it did make assume processing more efficient but eliminating the assumption cache made ephemeral value computation too expensive. This is a more-targeted change. We'll keep the assumption cache, but extend it to keep a map of affected values (i.e. values about which an assumption might provide some information) to the corresponding assumption intrinsics. This allows ValueTracking and LVI to find assumptions relevant to the value being queried without scanning all assumptions in the function. The fact that ValueTracking started doing O(number of assumptions in the function) work, for every known-bits query, has become prohibitively expensive in some cases. As discussed during the review, this is a pragmatic fix that, longer term, will likely be replaced by a more-principled solution (perhaps based on an extended SSA form). Differential Revision: https://reviews.llvm.org/D28459 llvm-svn: 291671
* X86 CodeGen: Optimized pattern for truncate with unsigned saturation.Elena Demikhovsky2017-01-111-0/+97
| | | | | | | | | DAG patterns optimization: truncate + unsigned saturation supported by VPMOVUS* instructions in AVX-512. And VPACKUS* instructions on SEE* targets. Differential Revision: https://reviews.llvm.org/D28216 llvm-svn: 291670
* [AMDGPU] Assembler: SDWA/DPP should not accept scalar registers and ↵Sam Kolton2017-01-115-39/+133
| | | | | | | | | | | | immediate operands Reviewers: artem.tamazov, nhaustov, vpykhtin, tstellarAMD Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D28157 llvm-svn: 291668
* [X86][AVX512BW] Vectorize v64i8 vector shiftsSimon Pilgrim2017-01-112-4/+20
| | | | | | Differential Revision: https://reviews.llvm.org/D28447 llvm-svn: 291665
* [PM] Separate the LoopAnalysisManager from the LoopPassManager and moveChandler Carruth2017-01-1122-116/+168
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | the latter to the Transforms library. While the loop PM uses an analysis to form the IR units, the current plan is to have the PM itself establish and enforce both loop simplified form and LCSSA. This would be a layering violation in the analysis library. Fundamentally, the idea behind the loop PM is to *transform* loops in addition to running passes over them, so it really seemed like the most natural place to sink this was into the transforms library. We can't just move *everything* because we also have loop analyses that rely on a subset of the invariants. So this patch splits the the loop infrastructure into the analysis management that has to be part of the analysis library, and the transform-aware pass manager. This also required splitting the loop analyses' printer passes out to the transforms library, which makes sense to me as running these will transform the code into LCSSA in theory. I haven't split the unittest though because testing one component without the other seems nearly intractable. Differential Revision: https://reviews.llvm.org/D28452 llvm-svn: 291662
* [X86] Fix PR30926 - Add patterns for (v)cvtsi2s{s,d} and (v)cvtsd2s{s,d}Elad Cohen2017-01-112-1/+112
| | | | | | | | | | | The code emiited by Clang's intrinsics for (v)cvtsi2ss, (v)cvtsi2sd, (v)cvtsd2ss and (v)cvtss2sd is lowered to a code sequence that includes redundant (v)movss/(v)movsd instructions. This patch adds patterns for optimizing these sequences. Differential revision: https://reviews.llvm.org/D28455 llvm-svn: 291660
* [X86] updating TTI costs for arithmetic instructions on X86\SLM arch.Mohammed Agabaria2017-01-1118-23/+84
| | | | | | | | | | | | updated instructions: pmulld, pmullw, pmulhw, mulsd, mulps, mulpd, divss, divps, divsd, divpd, addpd and subpd. special optimization case which replaces pmulld with pmullw\pmulhw\pshuf seq. In case if the real operands bitwidth <= 16. Differential Revision: https://reviews.llvm.org/D28104 llvm-svn: 291657
* [XRay] Define the library for XRay trace logsDean Michael Berris2017-01-113-0/+210
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: In this change we move the definition of the log reading routines from the tools directory in LLVM to {include/llvm,lib}/XRay. We improve the documentation a little bit for the publicly accessible headers, and adjust the top-matter. This also leads to some refactoring and cleanup in the tooling code. In particular, we do the following: - Rename the class from LogReader to Trace, as it better represents the logical set of records as opposed to a log. - Use file type detection instead of asking the user to say what format the input file is. This allows us to keep the interface simple and encapsulate the logic of loading the data appropriately. In future changes we increase the API surface and write dedicated unit tests for the XRay library. Depends on D24376. Reviewers: dblaikie, echristo Subscribers: mehdi_amini, mgorny, llvm-commits, varno Differential Revision: https://reviews.llvm.org/D28345 llvm-svn: 291652
* [PM] Rewrite the loop pass manager to use a worklist and augmented runChandler Carruth2017-01-1117-190/+266
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | arguments much like the CGSCC pass manager. This is a major redesign following the pattern establish for the CGSCC layer to support updates to the set of loops during the traversal of the loop nest and to support invalidation of analyses. An additional significant burden in the loop PM is that so many passes require access to a large number of function analyses. Manually ensuring these are cached, available, and preserved has been a long-standing burden in LLVM even with the help of the automatic scheduling in the old pass manager. And it made the new pass manager extremely unweildy. With this design, we can package the common analyses up while in a function pass and make them immediately available to all the loop passes. While in some cases this is unnecessary, I think the simplicity afforded is worth it. This does not (yet) address loop simplified form or LCSSA form, but those are the next things on my radar and I have a clear plan for them. While the patch is very large, most of it is either mechanically updating loop passes to the new API or the new testing for the loop PM. The code for it is reasonably compact. I have not yet updated all of the loop passes to correctly leverage the update mechanisms demonstrated in the unittests. I'll do that in follow-up patches along with improved FileCheck tests for those passes that ensure things work in more realistic scenarios. In many cases, there isn't much we can do with these until the loop simplified form and LCSSA form are in place. Differential Revision: https://reviews.llvm.org/D28292 llvm-svn: 291651
* Revert r291645 "[DAGCombiner] Teach DAG combiner to fold (vselect (N0 xor ↵Craig Topper2017-01-111-9/+0
| | | | | | | | AllOnes), N1, N2) -> (vselect N0, N2, N1). Only do this if the target indicates its vector boolean type is ZeroOrNegativeOneBooleanContent." Some test appears to be hanging on the build bots. llvm-svn: 291650
* [LICM] Report failing to hoist conditionally-executed loadsAdam Nemet2017-01-111-6/+20
| | | | | | | | | | | | These are interesting again because the user may not be aware that this is a common reason preventing LICM. A const is removed from an instruction pointer declaration in order to pass it to ORE. Differential Revision: https://reviews.llvm.org/D27940 llvm-svn: 291649
* [LICM] Report failing to hoist a load with an invariant addressAdam Nemet2017-01-111-4/+15
| | | | | | | | | | | | | | | | | These are interesting because lack of precision in alias information could be standing in the way of this optimization. An example is the case in the test suite that I showed in the DevMeeting talk: http://lab.llvm.org:8080/artifacts/opt-view_test-suite/build/MultiSource/Benchmarks/FreeBench/distray/CMakeFiles/distray.dir/html/_org_test-suite_MultiSource_Benchmarks_FreeBench_distray_distray.c.html#L236 canSinkOrHoistInst is also used from LoopSink, which does not use opt-remarks so we need to take ORE as an optional argument. Differential Revision: https://reviews.llvm.org/D27939 llvm-svn: 291648
* [LICM] Report successful hoist/sink/promotionAdam Nemet2017-01-111-19/+45
| | | | | | Differential Revision: https://reviews.llvm.org/D27938 llvm-svn: 291646
OpenPOWER on IntegriCloud