summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* [InstCombine] Use APInt::setSignBit and APInt::isNegative(). NFCCraig Topper2017-04-141-3/+3
| | | | llvm-svn: 300305
* Fix test failure on windows: pass module to getInstrProfXXName callsXinliang David Li2017-04-141-4/+4
| | | | llvm-svn: 300302
* Object, LTO: Add target triple to irsymtab and LTO API.Peter Collingbourne2017-04-142-0/+2
| | | | | | | | | | Start using it in LLD to avoid needing to read bitcode again just to get the target triple, and in llvm-lto2 to avoid printing symbol table information that is inappropriate for the target. Differential Revision: https://reviews.llvm.org/D32038 llvm-svn: 300300
* NewGVN: Don't propagate over phi backedges where undef causes us toDaniel Berlin2017-04-141-8/+149
| | | | | | | | have >1 value, unless we can prove the phi node is cycle free. Fixes PR 32607. llvm-svn: 300299
* Use range-for; NFCSanjoy Das2017-04-141-6/+4
| | | | llvm-svn: 300292
* Use transform instead of manual loop; NFCSanjoy Das2017-04-141-5/+5
| | | | llvm-svn: 300291
* LLVMCodeGen: Add ProfileData into deps corresponding to r300277.NAKAMURA Takumi2017-04-141-1/+1
| | | | llvm-svn: 300289
* [AMDGPU] added SIInstrInfo::getAddNoCarry() helperStanislav Mekhanoshin2017-04-144-23/+44
| | | | | | | | Addressed rest of post submit comments from D31993. Differential Revision: https://reviews.llvm.org/D32057 llvm-svn: 300288
* Simplify some Verifier attribute checks with AttributeSetReid Kleckner2017-04-141-188/+175
| | | | | | | | | | | Now that we have a type that can represent the attributes on a single return, function, or parameter, we can pass it around directly rather than passing around AttributeList and Idx. Removes some more one-based argument attribute index counting. NFC llvm-svn: 300285
* [Profile] PE binary coverage bug fixXinliang David Li2017-04-135-16/+105
| | | | | | | | PR/32584 Differential Revision: https://reviews.llvm.org/D32023 llvm-svn: 300277
* [AArch64] Avoid partial register writes on lane 0 of BUILD_VECTOR for i8/i16/f16Adam Nemet2017-04-131-3/+8
| | | | | | | | | | | | This further improves Ahmed's change in rL299482. See the new comment for the rationale. The patch recovers most of the regression for bzip2 after D31965. We're down to +2.68% from +6.97%. Differential Revision: https://reviews.llvm.org/D32028 llvm-svn: 300276
* AMDGPU/GFX9: Do not use v_pack_b32_f16 when packingKonstantin Zhuravlyov2017-04-131-29/+15
| | | | | | Differential Revision: https://reviews.llvm.org/D31819 llvm-svn: 300275
* [IR] Make getParamAttributes take argument numbers, not ArgNo+1Reid Kleckner2017-04-1314-74/+73
| | | | | | | | | | | | Add hasParamAttribute() and use it instead of hasAttribute(ArgNo+1, Kind) everywhere. The fact that the AttributeList index for an argument is ArgNo+1 should be a hidden implementation detail. NFC llvm-svn: 300272
* [bpf] Fix memory offset check for loads and storesAlexei Starovoitov2017-04-131-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the offset cannot fit into the instruction, an addition to the pointer is emitted before the actual access. However, BPF offsets are 16-bit but LLVM considers them to be, for the matter of this check, to be 32-bit long. This causes the following program: int bpf_prog1(void *ign) { volatile unsigned long t = 0x8983984739ull; return *(unsigned long *)((0xffffffff8fff0002ull) + t); } To generate the following (wrong) code: 0: 18 01 00 00 39 47 98 83 00 00 00 00 89 00 00 00 r1 = 590618314553ll 2: 7b 1a f8 ff 00 00 00 00 *(u64 *)(r10 - 8) = r1 3: 79 a1 f8 ff 00 00 00 00 r1 = *(u64 *)(r10 - 8) 4: 79 10 02 00 00 00 00 00 r0 = *(u64 *)(r1 + 2) 5: 95 00 00 00 00 00 00 00 exit Fix it by changing the offset check to 16-bit. Patch by Nadav Amit <nadav.amit@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Differential Revision: https://reviews.llvm.org/D32055 llvm-svn: 300269
* [Support] Fix ErrorOr assertion when /proc/cpuinfo doesn't exist.Teresa Johnson2017-04-131-0/+1
| | | | | | | | | | | | | | The ErrorOr should not be dereferenced on the error path. Patch by Jacob Young Reviewers: tejohnson Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32032 llvm-svn: 300267
* [InstCombine] Use APInt::getBitsSetFrom instead of inverting the result of ↵Craig Topper2017-04-131-4/+2
| | | | | | getLowBitsSet. NFC llvm-svn: 300265
* [llvm-pdbdump] Recursively dump class layout.Zachary Turner2017-04-133-42/+191
| | | | llvm-svn: 300258
* [ValueTracking] Remove duplicate call to computeKnownBits for the operands ↵Craig Topper2017-04-131-5/+1
| | | | | | | | of Select. We call it unconditionally on the operands of the select. Then decide if its a min/max and call it on the min/max operands or on the select operands again. Either of those second calls will overwrite the results of the initial call so we can just delete the first call. llvm-svn: 300256
* [LCSSA] Efficiently compute blocks dominating at least one exit.Davide Italiano2017-04-131-19/+54
| | | | | | | | | | | | | | | | | | | | | | | For LCSSA purposes, loop BBs not dominating any of the exits aren't interesting, as none of the values defined in these blocks can be used outside the loop. The way the code computed this information was by comparing each BB of the loop with each of the exit blocks and ask the dominator tree about their dominance relation. This is slow. A more efficient way, implemented here, is that of starting from the exit blocks and walking the dom upwards until we hit an header. By transitivity, all the blocks we encounter in our path dominate an exit. For the testcase provided in PR31851, this reduces compile time on `opt -O2` by ~25%, going from 1m47s to 1m22s. Thanks to Dan/MichaelZ for discussions/suggesting the approach/review. Differential Revision: https://reviews.llvm.org/D31843 llvm-svn: 300255
* Fix -Wunused-value warningReid Kleckner2017-04-131-6/+6
| | | | llvm-svn: 300254
* Revert accidentally-committed files in r300252.Richard Smith2017-04-131-403/+0
| | | | llvm-svn: 300253
* Remove all allocation and divisions from GreatestCommonDivisorRichard Smith2017-04-132-70/+485
| | | | | | | | | | | Switch from Euclid's algorithm to Stein's algorithm for computing GCD. This avoids the (expensive) APInt division operation in favour of bit operations. Remove all memory allocation from within the GCD loop by tweaking our `lshr` implementation so it can operate in-place. Differential Revision: https://reviews.llvm.org/D31968 llvm-svn: 300252
* [InstCombine] Fix !prof metadata preservation for invokesReid Kleckner2017-04-131-18/+16
| | | | | | | | | | | | | | | | | | | | Summary: Bug noticed by inspection. Extend the test to handle invokes as well as calls, and rewrite it to not depend on the inliner and other passes. Also simplify the call site replacement code with CallSite, similar to what I did to dead arg elimination and arg promotion (rL300235 and rL300229). Reviewers: danielcdh, davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32041 llvm-svn: 300251
* [LCSSA] Assert that we always have a valid loop.Davide Italiano2017-04-131-0/+1
| | | | | | | We could otherwise add BBs not belonging to a loop in `formLCSSA` and later crash when trying to iterate the loop blocks. llvm-svn: 300244
* [LCSSA] Remove spurious whitespaces. NFCI.Davide Italiano2017-04-131-1/+1
| | | | llvm-svn: 300243
* [LCSSA] Use `auto` when the type is obvious. NFCI.Davide Italiano2017-04-131-3/+3
| | | | llvm-svn: 300242
* [DAG] Fold away temporary vector in store candidate merge NFC.Nirav Dave2017-04-131-14/+11
| | | | llvm-svn: 300241
* SamplePGO: convert callsite samples map key from callsite_location to ↵Dehao Chen2017-04-134-72/+126
| | | | | | | | | | | | | | | | callsite_location+callee_name Summary: For iterative SamplePGO, an indirect call can be speculatively promoted to multiple direct calls and get inlined. All these promoted direct calls will share the same callsite location (offset+discriminator). With the current implementation, we cannot distinguish between different promotion candidates and its inlined instance. This patch adds callee_name to the key of the callsite sample map. And added helper functions to get all inlined callee samples for a given callsite location. This helps the profile annotator promote correct targets and inline it before annotation, and ensures all indirect call targets to be annotated correctly. Reviewers: davidxl, dnovillo Reviewed By: davidxl Subscribers: andreadb, llvm-commits Differential Revision: https://reviews.llvm.org/D31950 llvm-svn: 300240
* [ValueTracking] Prevent a call to computeKnownBits if we already know the ↵Craig Topper2017-04-131-7/+8
| | | | | | state of the bit we would calculate. Also reuse a temporary APInt instead of creating a new one. llvm-svn: 300239
* [LV] Fix the vector code generation for first order recurrenceAnna Thomas2017-04-132-24/+25
| | | | | | | | | | | | | | | | | | | Summary: In first order recurrences where phi's are used outside the loop, we should generate an additional vector.extract of the second last element from the vectorized phi update. This is because we require the phi itself (which is the value at the second last iteration of the vector loop) and not the phi's update within the loop. Also fix the code gen when we just unroll, but don't vectorize. Fixes PR32396. Reviewers: mssimpso, mkuper, anemet Subscribers: llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D31979 llvm-svn: 300238
* [InstCombine] fold X == 0 || X == -1 to one compare (PR32524)Sanjay Patel2017-04-131-1/+5
| | | | | | | | | | | | | | | | This is effectively a retry of: https://reviews.llvm.org/rL299851 but now we have tests and an assert to make sure the bug that was exposed with that attempt will not happen again. I'll fix the code duplication and missing sibling fold next, but I want to make this change as small as possible to reduce risk since I messed it up last time. This should fix: https://bugs.llvm.org/show_bug.cgi?id=32524 llvm-svn: 300236
* [DAE] Simplify call site replacement code with CallSite NFCReid Kleckner2017-04-131-27/+24
| | | | llvm-svn: 300235
* [ValueTracking] Move a temporary APInt instead of copying it.Craig Topper2017-04-131-1/+1
| | | | llvm-svn: 300233
* [InstCombine] Simplify attribute code with new AttributeList::get NFCReid Kleckner2017-04-131-31/+20
| | | | llvm-svn: 300230
* [ArgPromotion] Don't drop !prof metadata on promoted callsReid Kleckner2017-04-131-1/+4
| | | | | | | | | | Noticed by inspection while doing attribute work. DAE, InstCombineCalls, and ArgPromotion have a fair amount of duplicated code for hacking on call sites, and you can find bugs by comparing them. Add a test case for this. llvm-svn: 300229
* [AMDGPU] Combine DS operations with offsets bigger than byteStanislav Mekhanoshin2017-04-131-150/+166
| | | | | | | | | In many cases ds operations can be combined even if offsets do not fit into 8 bit encoding. What it takes is to adjust base address. Differential Revision: https://reviews.llvm.org/D31993 llvm-svn: 300227
* [InstCombine] use similar ops for related folds; NFCISanjay Patel2017-04-131-10/+9
| | | | | | | | | | | | | | | It's less efficient to produce 'ule' than 'ult' since we know we're going to canonicalize to 'ult', but we shouldn't have duplicated code for these folds. As a trade-off, this was a pretty terrible way to make a '2'. :) if (LHSC == SubOne(RHSC)) AddC = ConstantExpr::getSub(AddOne(RHSC), LHSC); The next steps are to share the code to fix PR32524 and add the missing 'and' fold that was left out when PR14708 was fixed: https://bugs.llvm.org/show_bug.cgi?id=14708 llvm-svn: 300222
* [Analysis] Support bitreverse in -demanded-bits passBrian Gesiak2017-04-131-0/+3
| | | | | | | | | | | | | | | | | | | | Summary: * Add a bitreverse case in the demanded bits analysis pass. * Add tests for the bitreverse (and bswap) intrinsic in the demanded bits pass. * Add a test case to the BDCE tests: that manipulations to high-order bits are eliminated once the bits are reversed and then right-shifted. Reviewers: mkuper, jmolloy, hfinkel, trentxintong Reviewed By: jmolloy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31857 llvm-svn: 300215
* LTO: Pass SF_Executable flag through to InputFile::SymbolTobias Edler von Koch2017-04-131-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The linker needs to be able to determine whether a symbol is text or data to handle the case of a common being overridden by a strong definition in an archive. If the archive contains a text member of the same name as the common, that function is discarded. However, if the archive contains a data member of the same name, that strong definition overrides the common. This is a behavior of ld.bfd, which the Qualcomm linker also supports in LTO. Here's a test case to illustrate: #### cat > 1.c << \! int blah; ! cat > 2.c << \! int blah() { return 0; } ! cat > 3.c << \! int blah = 20; ! clang -c 1.c clang -c 2.c clang -c 3.c ar cr lib.a 2.o 3.o ld 1.o lib.a -t #### The correct output is: 1.o (lib.a)3.o Thanks to Shankar Easwaran and Hemant Kulkarni for the test case! Reviewers: mehdi_amini, rafael, pcc, davide Reviewed By: pcc Subscribers: davide, llvm-commits, inglorion Differential Revision: https://reviews.llvm.org/D31901 llvm-svn: 300205
* [InstCombine] fix assert to not always be trueSanjay Patel2017-04-131-1/+1
| | | | llvm-svn: 300202
* Re-apply "[GVNHoist] Move GVNHoist to function simplification part of pipeline."Geoff Berry2017-04-131-2/+2
| | | | | | This reverts commit r296872 now that PR32153 has been fixed. llvm-svn: 300200
* [Hexagon] Implement HexagonTargetLowering::CanLowerReturnKrzysztof Parzyszek2017-04-132-12/+18
| | | | | | | | Patch by Michael Wu. Differential Revision: https://reviews.llvm.org/D32000 llvm-svn: 300199
* [Hexagon] Fix "LowerFormalArguments emitted a value with the wrong type!" ↵Krzysztof Parzyszek2017-04-131-1/+1
| | | | | | | | | | assertion Patch by Michael Wu. Differential Revision: https://reviews.llvm.org/D31999 llvm-svn: 300198
* Use methods to access data stored with frame instructionsSerge Pavlov2017-04-138-46/+36
| | | | | | | | | | | | | Instructions CALLSEQ_START..CALLSEQ_END and their target dependent counterparts keep data like frame size, stack adjustment etc. These data are accessed by getOperand using hard coded indices. It is error prone way. This change implements the access by special methods, which improve readability and allow changing data representation without massive changes of index values. Differential Revision: https://reviews.llvm.org/D31953 llvm-svn: 300196
* [X86] Added missing mayLoad/mayStore attributes to some X86 instructions.Ayman Musa2017-04-137-19/+55
| | | | | | | | | Throughout the effort of automatically generating the X86 memory folding tables these missing information were encountered. This is a preparation work for a future patch including the automation of these tables. Differential Revision: https://reviews.llvm.org/D31714 llvm-svn: 300190
* [DWARF] - Simplify (use dyn_cast instead of isa + cast).George Rimar2017-04-131-2/+2
| | | | | | This addresses post commit review comments for r300039. llvm-svn: 300188
* [X86] Change instructions names to keep consistency with the naming ↵Ayman Musa2017-04-131-2/+2
| | | | | | | | convention. NFC Differential Revision: https://reviews.llvm.org/D31743 llvm-svn: 300184
* [LV] Refactor ILV to provide vectorizeInstruction(); NFCAyal Zaks2017-04-131-310/+302
| | | | | | | | | | | Refactoring InnerLoopVectorizer's vectorizeBlockInLoop() to provide vectorizeInstruction(). Aligning DeadInstructions with its only user. Facilitates driving the transformation by VPlan - follows https://reviews.llvm.org/D28975 and its tentative breakdown. Differential Revision: https://reviews.llvm.org/D31997 llvm-svn: 300183
* [APInt] Reorder fields to avoid a hole in the middle of the classCraig Topper2017-04-131-1/+1
| | | | | | | | | | | | | | | | | | | Summary: APInt is currently implemented with an unsigned BitWidth field first and then a uint_64/pointer union. Due to the 64-bit size of the union there is a hole after the bitwidth. Putting the union first allows the class to be packed. Making it 12 bytes instead of 16 bytes. An APSInt goes from 20 bytes to 16 bytes. This shows a 4k reduction on the size of the opt binary on my local x86-64 build. So this enables some other improvement to the code as well. Reviewers: dblaikie, RKSimon, hans, davide Reviewed By: davide Subscribers: davide, llvm-commits Differential Revision: https://reviews.llvm.org/D32001 llvm-svn: 300171
* [APInt] Generalize the implementation of tcIncrement to support adding a ↵Craig Topper2017-04-131-64/+40
| | | | | | full 'word' by introducing tcAddPart. Use this to support tcIncrement, operator++ and operator+=(uint64_t). Do the same for subtract. NFCI. llvm-svn: 300169
OpenPOWER on IntegriCloud