path: root/llvm/lib
Commit message | Author | Age | Files | Lines
...
* Remove NormalizeAutodetect; NFC | Sanjoy Das | 2017-04-14 | 4 files | -131/+99
    It is cleaner to have a callback based system where the logic of whether an add recurrence is normalized or not lives on IVUsers.
    This is one step in a multi-step cleanup.
    llvm-svn: 300330
* [Hexagon] Make a couple of passes compliant with -opt-bisect-limit | Krzysztof Parzyszek | 2017-04-14 | 2 files | -0/+5
    llvm-svn: 300329
* [X86][SSE] Update MOVNTDQA non-temporal loads to generic implementation (LLVM) | Simon Pilgrim | 2017-04-14 | 3 files | -12/+23
    MOVNTDQA non-temporal aligned vector loads can be correctly represented using generic builtin loads, allowing us to remove the existing x86 intrinsics.
    Clang companion patch: D31766.
    Differential Revision: https://reviews.llvm.org/D31767
    llvm-svn: 300325
* Reorder StoreMergeCandidates to run faster. NFCI. | Nirav Dave | 2017-04-14 | 1 file | -20/+23
    llvm-svn: 300321
* [AMDGPU][MC] Corrected ds_write_src2_* to require one offset instead of two. | Dmitry Preobrazhensky | 2017-04-14 | 1 file | -14/+2
    Fixed bug 32551: https://bugs.llvm.org//show_bug.cgi?id=32551
    Reviewers: vpykhtin
    Differential Revision: https://reviews.llvm.org/D31809
    llvm-svn: 300319
* [AMDGPU][MC] Enabled constants for src operands of s_cbranch_g_fork | Dmitry Preobrazhensky | 2017-04-14 | 1 file | -1/+1
    Fixed bug 32619: https://bugs.llvm.org//show_bug.cgi?id=32619
    Reviewers: artem.tamazov, vpykhtin
    Differential Revision: https://reviews.llvm.org/D31973
    llvm-svn: 300318
* Fix for PR#30562: Selection DAG error: Detected cycle in SelectionDAG. | Andrew V. Tischenko | 2017-04-14 | 1 file | -2/+5
    Patch by Dinar Temirbulatov
    llvm-svn: 300314
* This patch closes PR#32216: Better testing of schedule model instruction latencies/throughputs. | Andrew V. Tischenko | 2017-04-14 | 18 files | -41/+212
    The details are here: https://reviews.llvm.org/D30941
    llvm-svn: 300311
* [LV] Remove implicit single basic block assumption | Gil Rapaport | 2017-04-14 | 1 file | -6/+5
    This patch is part of D28975's breakdown - no change in output intended.
    LV's code currently assumes the vectorized loop is a single basic block up until predicateInstructions() is called. This patch removes two manifestations of this assumption (loop phi incoming values, dominator tree update) by replacing the use of vectorLoopBody with the vectorized loop's latch/header.
    Differential Revision: https://reviews.llvm.org/D32040
    llvm-svn: 300310
* [ValueTracking] Calculate the KnownZeros for Intrinsic::ctpop without using a temporary APInt to count leading zeros on. | Craig Topper | 2017-04-14 | 1 file | -5/+2
    The APInt was created from an 'unsigned' and we just wanted to know how many bits the value needed to represent it. We can just use Log2_32 from MathExtras.h to get the info.
    llvm-svn: 300309
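The ctpop reasoning above can be illustrated without LLVM at all. The population count of a BitWidth-bit value is at most BitWidth, so only the low floor(log2(BitWidth)) + 1 bits of the result can ever be set. The sketch below is an assumption-laden illustration: `log2Floor` is a stand-in for `Log2_32`, and `ctpopKnownZeroMask` is a hypothetical helper operating on plain 64-bit masks rather than APInt. It is not the code from the commit.

```cpp
#include <cassert>
#include <cstdint>

// Floor of log2(Value); a stand-in for LLVM's Log2_32. Value must be nonzero.
unsigned log2Floor(uint32_t Value) {
  assert(Value != 0 && "log2 of zero is undefined");
  unsigned Result = 0;
  while (Value >>= 1)
    ++Result;
  return Result;
}

// Hypothetical helper: mask of result bits that are known to be zero for a
// population count taken over a BitWidth-bit value (1 <= BitWidth <= 64).
uint64_t ctpopKnownZeroMask(unsigned BitWidth) {
  assert(BitWidth >= 1 && BitWidth <= 64 && "unsupported width");
  // Number of bits needed to represent the largest possible count (BitWidth).
  unsigned LowBits = log2Floor(BitWidth) + 1;
  uint64_t LowMask = (UINT64_C(1) << LowBits) - 1; // LowBits is at most 7
  uint64_t WidthMask =
      BitWidth == 64 ? ~UINT64_C(0) : (UINT64_C(1) << BitWidth) - 1;
  // Everything above the low bits, clipped to the value's width, is zero.
  return ~LowMask & WidthMask;
}
```

For a 32-bit value the count needs 6 bits, so the upper 26 bits of the result are known zero.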
* [ValueTracking] Use APInt::isNegative(). NFC | Craig Topper | 2017-04-14 | 1 file | -1/+1
    llvm-svn: 300308
* [ValueTracking] Use APInt::sext instead of zext and setBitsFrom. NFC | Craig Topper | 2017-04-14 | 1 file | -7/+2
    llvm-svn: 300307
* [InstCombine] Use APInt::setSignBit and APInt::isNegative(). NFC | Craig Topper | 2017-04-14 | 1 file | -3/+3
    llvm-svn: 300305
* Fix test failure on windows: pass module to getInstrProfXXName calls | Xinliang David Li | 2017-04-14 | 1 file | -4/+4
    llvm-svn: 300302
* Object, LTO: Add target triple to irsymtab and LTO API. | Peter Collingbourne | 2017-04-14 | 2 files | -0/+2
    Start using it in LLD to avoid needing to read bitcode again just to get the target triple, and in llvm-lto2 to avoid printing symbol table information that is inappropriate for the target.
    Differential Revision: https://reviews.llvm.org/D32038
    llvm-svn: 300300
* NewGVN: Don't propagate over phi backedges where undef causes us to | Daniel Berlin | 2017-04-14 | 1 file | -8/+149
    have >1 value, unless we can prove the phi node is cycle free.
    Fixes PR 32607.
    llvm-svn: 300299
* Use range-for; NFC | Sanjoy Das | 2017-04-14 | 1 file | -6/+4
    llvm-svn: 300292
* Use transform instead of manual loop; NFC | Sanjoy Das | 2017-04-14 | 1 file | -5/+5
    llvm-svn: 300291
* LLVMCodeGen: Add ProfileData into deps corresponding to r300277. | NAKAMURA Takumi | 2017-04-14 | 1 file | -1/+1
    llvm-svn: 300289
* [AMDGPU] added SIInstrInfo::getAddNoCarry() helper | Stanislav Mekhanoshin | 2017-04-14 | 4 files | -23/+44
    Addressed rest of post submit comments from D31993.
    Differential Revision: https://reviews.llvm.org/D32057
    llvm-svn: 300288
* Simplify some Verifier attribute checks with AttributeSet | Reid Kleckner | 2017-04-14 | 1 file | -188/+175
    Now that we have a type that can represent the attributes on a single return, function, or parameter, we can pass it around directly rather than passing around AttributeList and Idx. Removes some more one-based argument attribute index counting.
    NFC
    llvm-svn: 300285
* [Profile] PE binary coverage bug fix | Xinliang David Li | 2017-04-13 | 5 files | -16/+105
    PR/32584
    Differential Revision: https://reviews.llvm.org/D32023
    llvm-svn: 300277
* [AArch64] Avoid partial register writes on lane 0 of BUILD_VECTOR for i8/i16/f16 | Adam Nemet | 2017-04-13 | 1 file | -3/+8
    This further improves Ahmed's change in rL299482. See the new comment for the rationale.
    The patch recovers most of the regression for bzip2 after D31965. We're down to +2.68% from +6.97%.
    Differential Revision: https://reviews.llvm.org/D32028
    llvm-svn: 300276
* AMDGPU/GFX9: Do not use v_pack_b32_f16 when packing | Konstantin Zhuravlyov | 2017-04-13 | 1 file | -29/+15
    Differential Revision: https://reviews.llvm.org/D31819
    llvm-svn: 300275
* [IR] Make getParamAttributes take argument numbers, not ArgNo+1 | Reid Kleckner | 2017-04-13 | 14 files | -74/+73
    Add hasParamAttribute() and use it instead of hasAttribute(ArgNo+1, Kind) everywhere.
    The fact that the AttributeList index for an argument is ArgNo+1 should be a hidden implementation detail.
    NFC
    llvm-svn: 300272
* [bpf] Fix memory offset check for loads and stores | Alexei Starovoitov | 2017-04-13 | 1 file | -2/+2
    If the offset cannot fit into the instruction, an addition to the pointer is emitted before the actual access. However, BPF offsets are 16-bit, while LLVM treats them as 32-bit for the purposes of this check.
    This causes the following program:
        int bpf_prog1(void *ign)
        {
            volatile unsigned long t = 0x8983984739ull;
            return *(unsigned long *)((0xffffffff8fff0002ull) + t);
        }
    To generate the following (wrong) code:
        0: 18 01 00 00 39 47 98 83 00 00 00 00 89 00 00 00 r1 = 590618314553ll
        2: 7b 1a f8 ff 00 00 00 00 *(u64 *)(r10 - 8) = r1
        3: 79 a1 f8 ff 00 00 00 00 r1 = *(u64 *)(r10 - 8)
        4: 79 10 02 00 00 00 00 00 r0 = *(u64 *)(r1 + 2)
        5: 95 00 00 00 00 00 00 00 exit
    Fix it by changing the offset check to 16-bit.
    Patch by Nadav Amit <nadav.amit@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Differential Revision: https://reviews.llvm.org/D32055
    llvm-svn: 300269
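The effect of the offset width can be shown with a small range check. This is a sketch only: `fitsSigned` mimics the kind of signed-width test the message describes, and `offsetFitsInBpfInsn` is a hypothetical name, not a function from the BPF backend. The constant from the example program passes a 32-bit check but fails a 16-bit one, which is why only its low 16 bits (`+ 2`) ended up in the emitted load.

```cpp
#include <cassert>
#include <cstdint>

// True if Value is representable as a signed N-bit integer (0 < N < 64).
template <unsigned N> constexpr bool fitsSigned(int64_t Value) {
  static_assert(N > 0 && N < 64, "width out of range");
  return Value >= -(INT64_C(1) << (N - 1)) &&
         Value < (INT64_C(1) << (N - 1));
}

// Hypothetical helper: can this byte offset be encoded directly in the
// 16-bit offset field of a BPF load/store instruction?
constexpr bool offsetFitsInBpfInsn(int64_t Offset) {
  return fitsSigned<16>(Offset);
}

// The constant from the example, 0xffffffff8fff0002, as a signed value.
constexpr int64_t ExampleOffset = -INT64_C(0x7000FFFE);
```

With the 16-bit check, `ExampleOffset` is rejected, so the addition has to be materialized before the access instead of being folded into the instruction.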
* [Support] Fix ErrorOr assertion when /proc/cpuinfo doesn't exist. | Teresa Johnson | 2017-04-13 | 1 file | -0/+1
    The ErrorOr should not be dereferenced on the error path.
    Patch by Jacob Young
    Reviewers: tejohnson
    Subscribers: llvm-commits
    Differential Revision: https://reviews.llvm.org/D32032
    llvm-svn: 300267
* [InstCombine] Use APInt::getBitsSetFrom instead of inverting the result of getLowBitsSet. NFC | Craig Topper | 2017-04-13 | 1 file | -4/+2
    llvm-svn: 300265
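The equivalence behind this NFC change can be checked on plain 32-bit masks. The two functions below are illustrative stand-ins written for this sketch; they only borrow the names of the APInt helpers and make no claim about how APInt implements them.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for getLowBitsSet: a mask with the low N bits set (0 <= N <= 32).
uint32_t lowBitsSet(unsigned N) {
  assert(N <= 32 && "too many bits");
  return N == 32 ? ~UINT32_C(0) : (UINT32_C(1) << N) - 1;
}

// Stand-in for getBitsSetFrom: a mask with every bit from LoBit upward set.
uint32_t bitsSetFrom(unsigned LoBit) {
  assert(LoBit <= 32 && "bit index out of range");
  return LoBit == 32 ? UINT32_C(0) : ~UINT32_C(0) << LoBit;
}
```

For every bit position, `bitsSetFrom(N)` equals `~lowBitsSet(N)`, so building the high mask directly saves the separate inversion.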
* [llvm-pdbdump] Recursively dump class layout. | Zachary Turner | 2017-04-13 | 3 files | -42/+191
    llvm-svn: 300258
* [ValueTracking] Remove duplicate call to computeKnownBits for the operands of Select. | Craig Topper | 2017-04-13 | 1 file | -5/+1
    We call it unconditionally on the operands of the select. Then we decide whether it's a min/max and call it on the min/max operands or on the select operands again. Either of those second calls will overwrite the results of the initial call, so we can just delete the first call.
    llvm-svn: 300256
* [LCSSA] Efficiently compute blocks dominating at least one exit. | Davide Italiano | 2017-04-13 | 1 file | -19/+54
    For LCSSA purposes, loop BBs not dominating any of the exits aren't interesting, as none of the values defined in these blocks can be used outside the loop. The code used to compute this information by comparing each BB of the loop with each of the exit blocks and asking the dominator tree about their dominance relation. This is slow.
    A more efficient way, implemented here, is to start from the exit blocks and walk the dominator tree upwards until we hit a header. By transitivity, all the blocks we encounter on our path dominate an exit. For the testcase provided in PR31851, this reduces compile time on `opt -O2` by ~25%, going from 1m47s to 1m22s.
    Thanks to Dan/MichaelZ for discussions/suggesting the approach/review.
    Differential Revision: https://reviews.llvm.org/D31843
    llvm-svn: 300255
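The upward walk described above can be sketched on a toy dominator tree. Everything here is an assumption made for illustration: blocks are integer ids, `IDom[B]` holds the immediate dominator of block `B` (-1 for the entry), and `blocksDominatingExits` is a hypothetical free function. The walk simply stops at the first block outside the loop, which sidesteps the corner cases the real pass has to handle.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Collect the loop blocks that dominate at least one exit block by walking
// the immediate-dominator chain upward from each exit.
std::set<int> blocksDominatingExits(const std::vector<int> &IDom,
                                    const std::set<int> &LoopBlocks,
                                    const std::vector<int> &ExitBlocks) {
  std::set<int> Result;
  for (int Exit : ExitBlocks) {
    // Every block on the idom chain of Exit dominates Exit (transitivity).
    for (int B = IDom[Exit]; B != -1 && LoopBlocks.count(B); B = IDom[B]) {
      // If B was already recorded, the rest of its chain was walked before.
      if (!Result.insert(B).second)
        break;
    }
  }
  return Result;
}

// Toy CFG: 0 -> 1 (header) -> {2, 3} -> 4 (latch) -> {1, 5 (exit)}.
std::set<int> exampleResult() {
  std::vector<int> IDom = {-1, 0, 1, 1, 1, 4};
  std::set<int> LoopBlocks = {1, 2, 3, 4};
  return blocksDominatingExits(IDom, LoopBlocks, {5});
}
```

Each block is visited at most once across all exits, instead of querying dominance for every (loop block, exit block) pair. In the toy CFG, blocks 2 and 3 are the two arms of a diamond, so neither dominates the exit and neither is collected.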
* Fix -Wunused-value warning | Reid Kleckner | 2017-04-13 | 1 file | -6/+6
    llvm-svn: 300254
* Revert accidentally-committed files in r300252. | Richard Smith | 2017-04-13 | 1 file | -403/+0
    llvm-svn: 300253
* Remove all allocation and divisions from GreatestCommonDivisor | Richard Smith | 2017-04-13 | 2 files | -70/+485
    Switch from Euclid's algorithm to Stein's algorithm for computing GCD. This avoids the (expensive) APInt division operation in favour of bit operations.
    Remove all memory allocation from within the GCD loop by tweaking our `lshr` implementation so it can operate in-place.
    Differential Revision: https://reviews.llvm.org/D31968
    llvm-svn: 300252
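For readers unfamiliar with Stein's (binary) GCD algorithm, here is a minimal sketch on `uint64_t`. It is not the APInt implementation from this commit, and `binaryGCD` is a name chosen for this example. It only shows how shifts and subtraction replace division.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Stein's binary GCD: uses only shifts, comparisons, and subtraction.
uint64_t binaryGCD(uint64_t A, uint64_t B) {
  if (A == 0)
    return B;
  if (B == 0)
    return A;

  // Factor out the powers of two common to both operands.
  unsigned Shift = 0;
  while (((A | B) & 1) == 0) {
    A >>= 1;
    B >>= 1;
    ++Shift;
  }

  // Make A odd; remaining factors of two in A cannot be common.
  while ((A & 1) == 0)
    A >>= 1;

  // Invariant: A is odd. Each iteration strictly shrinks B.
  while (B != 0) {
    while ((B & 1) == 0)
      B >>= 1;
    if (A > B)
      std::swap(A, B);
    B -= A; // odd - odd is even, so the next pass shifts again
  }

  // Restore the common powers of two.
  return A << Shift;
}
```

On arbitrary-precision integers a shift or subtraction is much cheaper than a division, which is the motivation the commit message gives.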
* [InstCombine] Fix !prof metadata preservation for invokes | Reid Kleckner | 2017-04-13 | 1 file | -18/+16
    Summary:
    Bug noticed by inspection. Extend the test to handle invokes as well as calls, and rewrite it to not depend on the inliner and other passes.
    Also simplify the call site replacement code with CallSite, similar to what I did to dead arg elimination and arg promotion (rL300235 and rL300229).
    Reviewers: danielcdh, davidxl
    Subscribers: llvm-commits
    Differential Revision: https://reviews.llvm.org/D32041
    llvm-svn: 300251
* [LCSSA] Assert that we always have a valid loop. | Davide Italiano | 2017-04-13 | 1 file | -0/+1
    We could otherwise add BBs not belonging to a loop in `formLCSSA` and later crash when trying to iterate the loop blocks.
    llvm-svn: 300244
* [LCSSA] Remove spurious whitespaces. NFCI. | Davide Italiano | 2017-04-13 | 1 file | -1/+1
    llvm-svn: 300243
* [LCSSA] Use `auto` when the type is obvious. NFCI. | Davide Italiano | 2017-04-13 | 1 file | -3/+3
    llvm-svn: 300242
* [DAG] Fold away temporary vector in store candidate merge NFC. | Nirav Dave | 2017-04-13 | 1 file | -14/+11
    llvm-svn: 300241
* SamplePGO: convert callsite samples map key from callsite_location to callsite_location+callee_name | Dehao Chen | 2017-04-13 | 4 files | -72/+126
    Summary:
    For iterative SamplePGO, an indirect call can be speculatively promoted to multiple direct calls and get inlined. All these promoted direct calls will share the same callsite location (offset+discriminator). With the current implementation, we cannot distinguish between different promotion candidates and their inlined instances. This patch adds callee_name to the key of the callsite sample map, and adds helper functions to get all inlined callee samples for a given callsite location. This helps the profile annotator promote the correct targets and inline them before annotation, and ensures all indirect call targets are annotated correctly.
    Reviewers: davidxl, dnovillo
    Reviewed By: davidxl
    Subscribers: andreadb, llvm-commits
    Differential Revision: https://reviews.llvm.org/D31950
    llvm-svn: 300240
* [ValueTracking] Prevent a call to computeKnownBits if we already know the state of the bit we would calculate. | Craig Topper | 2017-04-13 | 1 file | -7/+8
    Also reuse a temporary APInt instead of creating a new one.
    llvm-svn: 300239
* [LV] Fix the vector code generation for first order recurrence | Anna Thomas | 2017-04-13 | 2 files | -24/+25
    Summary:
    In first order recurrences where phi's are used outside the loop, we should generate an additional vector.extract of the second last element from the vectorized phi update. This is because we require the phi itself (which is the value at the second last iteration of the vector loop) and not the phi's update within the loop.
    Also fix the code gen when we just unroll, but don't vectorize.
    Fixes PR32396.
    Reviewers: mssimpso, mkuper, anemet
    Subscribers: llvm-commits, mzolotukhin
    Differential Revision: https://reviews.llvm.org/D31979
    llvm-svn: 300238
* [InstCombine] fold X == 0 || X == -1 to one compare (PR32524) | Sanjay Patel | 2017-04-13 | 1 file | -1/+5
    This is effectively a retry of: https://reviews.llvm.org/rL299851 but now we have tests and an assert to make sure the bug that was exposed with that attempt will not happen again.
    I'll fix the code duplication and missing sibling fold next, but I want to make this change as small as possible to reduce risk since I messed it up last time.
    This should fix: https://bugs.llvm.org/show_bug.cgi?id=32524
    llvm-svn: 300236
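The single compare that such a fold can produce is, to my understanding, an unsigned range check: `X == 0 || X == -1` holds exactly when `(X + 1) u< 2`. The sketch below checks that identity on 8-bit values. The function names are hypothetical, and the exact IR InstCombine emits is not shown in the commit message.

```cpp
#include <cassert>
#include <cstdint>

// The original two-compare form on an 8-bit value (-1 is 0xFF).
bool isZeroOrAllOnes(uint8_t X) { return X == 0 || X == 0xFF; }

// The folded form: add 1 with wraparound, then one unsigned compare.
// 0xFF wraps to 0 and 0 becomes 1; every other value lands at 2 or above.
bool isZeroOrAllOnesFolded(uint8_t X) {
  return static_cast<uint8_t>(X + 1) < 2;
}

// Exhaustively compare both forms over all 256 inputs.
bool foldIsEquivalent() {
  for (unsigned V = 0; V < 256; ++V)
    if (isZeroOrAllOnes(static_cast<uint8_t>(V)) !=
        isZeroOrAllOnesFolded(static_cast<uint8_t>(V)))
      return false;
  return true;
}
```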
* [DAE] Simplify call site replacement code with CallSite NFC | Reid Kleckner | 2017-04-13 | 1 file | -27/+24
    llvm-svn: 300235
* [ValueTracking] Move a temporary APInt instead of copying it. | Craig Topper | 2017-04-13 | 1 file | -1/+1
    llvm-svn: 300233
* [InstCombine] Simplify attribute code with new AttributeList::get NFC | Reid Kleckner | 2017-04-13 | 1 file | -31/+20
    llvm-svn: 300230
* [ArgPromotion] Don't drop !prof metadata on promoted calls | Reid Kleckner | 2017-04-13 | 1 file | -1/+4
    Noticed by inspection while doing attribute work. DAE, InstCombineCalls, and ArgPromotion have a fair amount of duplicated code for hacking on call sites, and you can find bugs by comparing them.
    Add a test case for this.
    llvm-svn: 300229
* [AMDGPU] Combine DS operations with offsets bigger than byte | Stanislav Mekhanoshin | 2017-04-13 | 1 file | -150/+166
    In many cases ds operations can be combined even if offsets do not fit into 8 bit encoding. What it takes is to adjust base address.
    Differential Revision: https://reviews.llvm.org/D31993
    llvm-svn: 300227
* [InstCombine] use similar ops for related folds; NFCI | Sanjay Patel | 2017-04-13 | 1 file | -10/+9
    It's less efficient to produce 'ule' than 'ult' since we know we're going to canonicalize to 'ult', but we shouldn't have duplicated code for these folds.
    As a trade-off, this was a pretty terrible way to make a '2'. :)
        if (LHSC == SubOne(RHSC))
          AddC = ConstantExpr::getSub(AddOne(RHSC), LHSC);
    The next steps are to share the code to fix PR32524 and add the missing 'and' fold that was left out when PR14708 was fixed: https://bugs.llvm.org/show_bug.cgi?id=14708
    llvm-svn: 300222
* [Analysis] Support bitreverse in -demanded-bits pass | Brian Gesiak | 2017-04-13 | 1 file | -0/+3
    Summary:
    * Add a bitreverse case in the demanded bits analysis pass.
    * Add tests for the bitreverse (and bswap) intrinsic in the demanded bits pass.
    * Add a test case to the BDCE tests: that manipulations to high-order bits are eliminated once the bits are reversed and then right-shifted.
    Reviewers: mkuper, jmolloy, hfinkel, trentxintong
    Reviewed By: jmolloy
    Subscribers: llvm-commits
    Differential Revision: https://reviews.llvm.org/D31857
    llvm-svn: 300215
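The idea behind a bitreverse case in a demanded-bits analysis can be sketched on 8-bit masks: if only some bits of the reversed result are demanded, the demanded bits of the operand are the mirror image of that mask. The helper names below are hypothetical and the code is independent of LLVM's DemandedBits implementation.

```cpp
#include <cassert>
#include <cstdint>

// Reverse the bit order of an 8-bit value.
uint8_t reverseBits8(uint8_t V) {
  uint8_t R = 0;
  for (int I = 0; I < 8; ++I) {
    R = static_cast<uint8_t>((R << 1) | (V & 1));
    V >>= 1;
  }
  return R;
}

// Hypothetical helper: given the demanded bits of bitreverse(X), return the
// bits of X that can influence them.
uint8_t demandedInputBitsForBitreverse(uint8_t DemandedOut) {
  return reverseBits8(DemandedOut);
}

// Check that clearing non-demanded input bits never changes demanded output.
bool maskIsSufficient(uint8_t DemandedOut) {
  uint8_t DemandedIn = demandedInputBitsForBitreverse(DemandedOut);
  for (unsigned V = 0; V < 256; ++V) {
    uint8_t Full = reverseBits8(static_cast<uint8_t>(V)) & DemandedOut;
    uint8_t Masked =
        reverseBits8(static_cast<uint8_t>(V & DemandedIn)) & DemandedOut;
    if (Full != Masked)
      return false;
  }
  return true;
}
```

This matches the BDCE scenario in the summary: a right shift by 4 after the reversal demands only the high nibble of the reversed value (mask `0xF0`), which maps back to the low nibble of the input (`0x0F`), so operations that only touch the input's high-order bits are dead.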