summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* [InstCombine] Refactor SimplifyUsingDistributiveLaws to more explicitly skip ↵Craig Topper2017-04-141-30/+33
| | | | | | | | | | | | | | code when LHS/RHS aren't BinaryOperators Currently this code always makes 2 or 3 calls to tryFactorization regardless of whether the LHS/RHS are BinaryOperators. We make 3 calls when both operands are BinaryOperators with the same opcode. Or surprisingly, when neither are BinaryOperators. This is because getBinOpsForFactorization returns Instruction::BinaryOpsEnd when the operand is not a BinaryOperator. If both LHS and RHS are not BinaryOperators then they both have an Opcode of Instruction::BinaryOpsEnd. When this happens we rely on tryFactorization to early out due to A/B/C/D being null. Similar behavior occurs for the other calls, we rely on getBinOpsForFactorization having made A/B or C/D null to get tryFactorization to early out. We also rely on these null checks to check the result of getIdentityValue and early out for it. This patches refactors this to pull these checks up to SimplifyUsingDistributiveLaws so we don't rely on BinaryOpsEnd as a sentinel or this A/B/C/D null behavior. I think this makes this code easier to reason about. Should also give a tiny performance improvement for cases where the LHS or RHS isn't a BinaryOperator. Differential Revision: https://reviews.llvm.org/D31913 llvm-svn: 300353
* [Profile] Make host tool aware of object format when quering prof section names Xinliang David Li2017-04-143-16/+28
| | | | | | Differential Revision: https://reviews.llvm.org/D32073 llvm-svn: 300352
* Rewrite SCEV Normalization using SCEVRewriteVisitor; NFCSanjoy Das2017-04-141-121/+57
| | | | | | | Removes all of the boilerplate, cache management etc. from ScalarEvolutionNormalization, and keeps only the interesting bits. llvm-svn: 300349
* [RDF] Switch RegisterAggr to a bit vector of register unitsKrzysztof Parzyszek2017-04-145-185/+168
| | | | | | | This avoids many complications related to the complex register aliasing schemes. llvm-svn: 300345
* [FunctionImport] assert(false) -> llvm_unreachable(). NFCI.Davide Italiano2017-04-141-1/+1
| | | | llvm-svn: 300344
* Remove "#if 0"ed out assertSanjoy Das2017-04-141-5/+0
| | | | | | | | | | | It won't compile after the recent changes I've made, and I think keeping it in provides very little value. Instead I've added (in an earlier commit) a C++ unit test to check the Denormalize(Normalized(X)) == X property for specific instances of X, which is what the assert was trying to do anyway. llvm-svn: 300339
* Delete some unnecessary boilerplateSanjoy Das2017-04-141-47/+29
| | | | | | | | | | | | The PostIncTransform class was not pulling its weight, so delete it and use free functions instead. This also makes the use of `function_ref` more idiomatic. We were storing an instance of function_ref in the PostIncTransform class before, which was fine in that specific case, but the usage after this change is more obviously okay. llvm-svn: 300338
* [RDF] Refine propagation of reached uses in liveness computationKrzysztof Parzyszek2017-04-143-5/+63
| | | | llvm-svn: 300337
* [Hexagon] Fix a latent problem with interpreting live-in lane masksKrzysztof Parzyszek2017-04-141-5/+7
| | | | | | | A non-zero lane mask on a register with no subregister means that the whole register is live-in. It is equivalent to a full mask. llvm-svn: 300335
* Use range forSanjoy Das2017-04-141-3/+1
| | | | llvm-svn: 300334
* Simplify PostIncTransform further; NFCSanjoy Das2017-04-141-16/+19
| | | | | | | Instead of having two ways to check if an add recurrence needs to be normalized, just pass in one predicate to decide that. llvm-svn: 300333
* Tighten the API for ScalarEvolutionNormalizationSanjoy Das2017-04-144-20/+43
| | | | llvm-svn: 300331
* Remove NormalizeAutodetect; NFCSanjoy Das2017-04-144-131/+99
| | | | | | | | | It is cleaner to have a callback based system where the logic of whether an add recurrence is normalized or not lives on IVUsers. This is one step in a multi-step cleanup. llvm-svn: 300330
* [Hexagon] Make a couple of passes compliant with -opt-bisect-limitKrzysztof Parzyszek2017-04-142-0/+5
| | | | llvm-svn: 300329
* [X86][SSE] Update MOVNTDQA non-temporal loads to generic implementation (LLVM)Simon Pilgrim2017-04-143-12/+23
| | | | | | | | | | MOVNTDQA non-temporal aligned vector loads can be correctly represented using generic builtin loads, allowing us to remove the existing x86 intrinsics. Clang companion patch: D31766. Differential Revision: https://reviews.llvm.org/D31767 llvm-svn: 300325
* Reorder StoreMergeCandidates to run faster. NFCI.Nirav Dave2017-04-141-20/+23
| | | | llvm-svn: 300321
* [AMDGPU][MC] Corrected ds_write_src2_* to require one offset instead of two.Dmitry Preobrazhensky2017-04-141-14/+2
| | | | | | | | | | Fixed bug 32551: https://bugs.llvm.org//show_bug.cgi?id=32551 Reviewers: vpykhtin Differential Revision: https://reviews.llvm.org/D31809 llvm-svn: 300319
* [AMDGPU][MC] Enabled constants for src operands of s_cbranch_g_forkDmitry Preobrazhensky2017-04-141-1/+1
| | | | | | | | | | Fixed bug 32619: https://bugs.llvm.org//show_bug.cgi?id=32619 Reviewers: artem.tamazov, vpykhtin Differential Revision: https://reviews.llvm.org/D31973 llvm-svn: 300318
* Fix for PR#30562: Selection DAG error: Detected cycle in SelectionDAG.Andrew V. Tischenko2017-04-141-2/+5
| | | | | | Patch by Dinar Temirbulatov llvm-svn: 300314
* This patch closes PR#32216: Better testing of schedule model instruction ↵Andrew V. Tischenko2017-04-1418-41/+212
| | | | | | | | latencies/throughputs. The details are here: https://reviews.llvm.org/D30941 llvm-svn: 300311
* [LV] Remove implicit single basic block assumptionGil Rapaport2017-04-141-6/+5
| | | | | | | | | | | | | This patch is part of D28975's breakdown - no change in output intended. LV's code currently assumes the vectorized loop is a single basic block up until predicateInstructions() is called. This patch removes two manifestations of this assumption (loop phi incoming values, dominator tree update) by replacing the use of vectorLoopBody with the vectorized loop's latch/header. Differential Revision: https://reviews.llvm.org/D32040 llvm-svn: 300310
* [ValueTracking] Calculate the KnownZeros for Intrinsic::ctpop without using ↵Craig Topper2017-04-141-5/+2
| | | | | | | | a temporary APInt to count leading zeros on. The APInt was created from an 'unsigned' and we just wanted to know how many bits the value needed to represent it. We can just use Log2_32 from MathExtras.h to get the info. llvm-svn: 300309
* [ValueTracking] Use APInt::isNegative(). NFCCraig Topper2017-04-141-1/+1
| | | | llvm-svn: 300308
* [ValueTracking] Use APInt::sext instead of zext and setBitsFrom. NFCCraig Topper2017-04-141-7/+2
| | | | llvm-svn: 300307
* [InstCombine] Use APInt::setSignBit and APInt::isNegative(). NFCCraig Topper2017-04-141-3/+3
| | | | llvm-svn: 300305
* Fix test failure on windows: pass module to getInstrProfXXName callsXinliang David Li2017-04-141-4/+4
| | | | llvm-svn: 300302
* Object, LTO: Add target triple to irsymtab and LTO API.Peter Collingbourne2017-04-142-0/+2
| | | | | | | | | | Start using it in LLD to avoid needing to read bitcode again just to get the target triple, and in llvm-lto2 to avoid printing symbol table information that is inappropriate for the target. Differential Revision: https://reviews.llvm.org/D32038 llvm-svn: 300300
* NewGVN: Don't propagate over phi backedges where undef causes us toDaniel Berlin2017-04-141-8/+149
| | | | | | | | have >1 value, unless we can prove the phi node is cycle free. Fixes PR 32607. llvm-svn: 300299
* Use range-for; NFCSanjoy Das2017-04-141-6/+4
| | | | llvm-svn: 300292
* Use transform instead of manual loop; NFCSanjoy Das2017-04-141-5/+5
| | | | llvm-svn: 300291
* LLVMCodeGen: Add ProfileData into deps corresponding to r300277.NAKAMURA Takumi2017-04-141-1/+1
| | | | llvm-svn: 300289
* [AMDGPU] added SIInstrInfo::getAddNoCarry() helperStanislav Mekhanoshin2017-04-144-23/+44
| | | | | | | | Addressed rest of post submit comments from D31993. Differential Revision: https://reviews.llvm.org/D32057 llvm-svn: 300288
* Simplify some Verifier attribute checks with AttributeSetReid Kleckner2017-04-141-188/+175
| | | | | | | | | | | Now that we have a type that can represent the attributes on a single return, function, or parameter, we can pass it around directly rather than passing around AttributeList and Idx. Removes some more one-based argument attribute index counting. NFC llvm-svn: 300285
* [Profile] PE binary coverage bug fixXinliang David Li2017-04-135-16/+105
| | | | | | | | PR/32584 Differential Revision: https://reviews.llvm.org/D32023 llvm-svn: 300277
* [AArch64] Avoid partial register writes on lane 0 of BUILD_VECTOR for i8/i16/f16Adam Nemet2017-04-131-3/+8
| | | | | | | | | | | | This further improves Ahmed's change in rL299482. See the new comment for the rationale. The patch recovers most of the regression for bzip2 after D31965. We're down to +2.68% from +6.97%. Differential Revision: https://reviews.llvm.org/D32028 llvm-svn: 300276
* AMDGPU/GFX9: Do not use v_pack_b32_f16 when packingKonstantin Zhuravlyov2017-04-131-29/+15
| | | | | | Differential Revision: https://reviews.llvm.org/D31819 llvm-svn: 300275
* [IR] Make getParamAttributes take argument numbers, not ArgNo+1Reid Kleckner2017-04-1314-74/+73
| | | | | | | | | | | | Add hasParamAttribute() and use it instead of hasAttribute(ArgNo+1, Kind) everywhere. The fact that the AttributeList index for an argument is ArgNo+1 should be a hidden implementation detail. NFC llvm-svn: 300272
* [bpf] Fix memory offset check for loads and storesAlexei Starovoitov2017-04-131-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the offset cannot fit into the instruction, an addition to the pointer is emitted before the actual access. However, BPF offsets are 16-bit but LLVM considers them to be, for the matter of this check, to be 32-bit long. This causes the following program: int bpf_prog1(void *ign) { volatile unsigned long t = 0x8983984739ull; return *(unsigned long *)((0xffffffff8fff0002ull) + t); } To generate the following (wrong) code: 0: 18 01 00 00 39 47 98 83 00 00 00 00 89 00 00 00 r1 = 590618314553ll 2: 7b 1a f8 ff 00 00 00 00 *(u64 *)(r10 - 8) = r1 3: 79 a1 f8 ff 00 00 00 00 r1 = *(u64 *)(r10 - 8) 4: 79 10 02 00 00 00 00 00 r0 = *(u64 *)(r1 + 2) 5: 95 00 00 00 00 00 00 00 exit Fix it by changing the offset check to 16-bit. Patch by Nadav Amit <nadav.amit@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Differential Revision: https://reviews.llvm.org/D32055 llvm-svn: 300269
* [Support] Fix ErrorOr assertion when /proc/cpuinfo doesn't exist.Teresa Johnson2017-04-131-0/+1
| | | | | | | | | | | | | | The ErrorOr should not be dereferenced on the error path. Patch by Jacob Young Reviewers: tejohnson Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32032 llvm-svn: 300267
* [InstCombine] Use APInt::getBitsSetFrom instead of inverting the result of ↵Craig Topper2017-04-131-4/+2
| | | | | | getLowBitsSet. NFC llvm-svn: 300265
* [llvm-pdbdump] Recursively dump class layout.Zachary Turner2017-04-133-42/+191
| | | | llvm-svn: 300258
* [ValueTracking] Remove duplicate call to computeKnownBits for the operands ↵Craig Topper2017-04-131-5/+1
| | | | | | | | of Select. We call it unconditionally on the operands of the select. Then decide if its a min/max and call it on the min/max operands or on the select operands again. Either of those second calls will overwrite the results of the initial call so we can just delete the first call. llvm-svn: 300256
* [LCSSA] Efficiently compute blocks dominating at least one exit.Davide Italiano2017-04-131-19/+54
| | | | | | | | | | | | | | | | | | | | | | | For LCSSA purposes, loop BBs not dominating any of the exits aren't interesting, as none of the values defined in these blocks can be used outside the loop. The way the code computed this information was by comparing each BB of the loop with each of the exit blocks and ask the dominator tree about their dominance relation. This is slow. A more efficient way, implemented here, is that of starting from the exit blocks and walking the dom upwards until we hit an header. By transitivity, all the blocks we encounter in our path dominate an exit. For the testcase provided in PR31851, this reduces compile time on `opt -O2` by ~25%, going from 1m47s to 1m22s. Thanks to Dan/MichaelZ for discussions/suggesting the approach/review. Differential Revision: https://reviews.llvm.org/D31843 llvm-svn: 300255
* Fix -Wunused-value warningReid Kleckner2017-04-131-6/+6
| | | | llvm-svn: 300254
* Revert accidentally-committed files in r300252.Richard Smith2017-04-131-403/+0
| | | | llvm-svn: 300253
* Remove all allocation and divisions from GreatestCommonDivisorRichard Smith2017-04-132-70/+485
| | | | | | | | | | | Switch from Euclid's algorithm to Stein's algorithm for computing GCD. This avoids the (expensive) APInt division operation in favour of bit operations. Remove all memory allocation from within the GCD loop by tweaking our `lshr` implementation so it can operate in-place. Differential Revision: https://reviews.llvm.org/D31968 llvm-svn: 300252
* [InstCombine] Fix !prof metadata preservation for invokesReid Kleckner2017-04-131-18/+16
| | | | | | | | | | | | | | | | | | | | Summary: Bug noticed by inspection. Extend the test to handle invokes as well as calls, and rewrite it to not depend on the inliner and other passes. Also simplify the call site replacement code with CallSite, similar to what I did to dead arg elimination and arg promotion (rL300235 and rL300229). Reviewers: danielcdh, davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32041 llvm-svn: 300251
* [LCSSA] Assert that we always have a valid loop.Davide Italiano2017-04-131-0/+1
| | | | | | | We could otherwise add BBs not belonging to a loop in `formLCSSA` and later crash when trying to iterate the loop blocks. llvm-svn: 300244
* [LCSSA] Remove spurious whitespaces. NFCI.Davide Italiano2017-04-131-1/+1
| | | | llvm-svn: 300243
* [LCSSA] Use `auto` when the type is obvious. NFCI.Davide Italiano2017-04-131-3/+3
| | | | llvm-svn: 300242
OpenPOWER on IntegriCloud