path: root/llvm
Commit message | Author | Age | Files | Lines
* [InstCombine] clean up visitAshr(); NFCI | Sanjay Patel | 2017-01-14 | 1 | -20/+9
    llvm-svn: 292036
* [InstCombine] add test to show missed vector fold; NFC | Sanjay Patel | 2017-01-14 | 1 | -0/+13
    llvm-svn: 292035
* Adding const overloads of operator* and operator-> for DenseSet iterators | David Majnemer | 2017-01-14 | 1 | -2/+4
    This fixes some problems when building ClangDiagnostics.cpp on Visual Studio 2017 RC. As far as I understand, there was a change in the implementation of the constructor for std::vector with two iterator parameters, which in our case causes an attempt to dereference const Iterator objects. Since there was no overload for a const Iterator, the compile would fail.
    Patch by Hugo Puhlmann!
    Differential Revision: https://reviews.llvm.org/D28726
    llvm-svn: 292034
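The failure mode described in this commit can be illustrated with a minimal sketch. This is not the DenseSet code; `KeyIterator`, `keyLength`, and `firstChar` are hypothetical names invented for the example. The point is only that dereferencing a `const` iterator object requires const-qualified `operator*` and `operator->`.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical set-like iterator: wraps a map iterator and exposes only the key.
class KeyIterator {
  std::map<std::string, char>::iterator I;

public:
  explicit KeyIterator(std::map<std::string, char>::iterator It) : I(It) {}

  // Non-const overloads: usable only on non-const iterator objects.
  const std::string &operator*() { return I->first; }
  const std::string *operator->() { return &I->first; }

  // Const overloads: without these, the helpers below would not compile,
  // because they dereference the iterator through a const reference.
  const std::string &operator*() const { return I->first; }
  const std::string *operator->() const { return &I->first; }
};

// Both helpers take the iterator by const reference on purpose.
inline std::size_t keyLength(const KeyIterator &It) { return It->size(); }
inline char firstChar(const KeyIterator &It) { return (*It)[0]; }
```

Deleting the two const overloads makes `keyLength` and `firstChar` fail to compile, which mirrors the kind of error the commit message reports.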
* [NewGVN] Fix a warning from GCC. | Davide Italiano | 2017-01-14 | 1 | -7/+6
    Patch by Gonsolo.
    Differential Revision: https://reviews.llvm.org/D28731
    llvm-svn: 292031
* [NewGVN] clang-format this file after recent changes. | Davide Italiano | 2017-01-14 | 1 | -6/+7
    llvm-svn: 292026
* [NewGVN] Try to be consistent with the style used in this file. NFCI. | Davide Italiano | 2017-01-14 | 1 | -1/+1
    llvm-svn: 292025
* [TargetLowering] Simplify a bit. NFCI. | Davide Italiano | 2017-01-14 | 1 | -4/+1
    llvm-svn: 292024
* [CostModel][X86] Updated vXi64 ASHR costs on AVX512 targets now that D28604 has landed | Simon Pilgrim | 2017-01-14 | 2 | -8/+16
    llvm-svn: 292023
* [X86][XOP] Added support for VPMADCSWD 'extend+hadd' IFMA patterns | Simon Pilgrim | 2017-01-14 | 2 | -2/+4
    VPMADCSWD acts as VPADDD( VPMADDWD( x, y ), z ) - multiply+extend+hadd and add to v4i32 accumulator
    llvm-svn: 292021
* [X86][XOP] Added support for VPMACSDQH/VPMACSDQL 'extension' IFMA patterns | Simon Pilgrim | 2017-01-14 | 2 | -17/+18
    VPMACSDQH/VPMACSDQL act as VPADDQ( VPMULDQ( x, y ), z ) - multiply+extending either the odd/even 4i32 input elements and adding to v2i64 accumulator
    llvm-svn: 292020
* [X86][XOP] Added support for VPMACSWW/VPMACSDD 'lossy' IFMA patterns | Simon Pilgrim | 2017-01-14 | 2 | -20/+25
    VPMACSWW/VPMACSDD act as add( mul( x, y ), z ) - ignoring any upper bits from both the multiply and add stages
    llvm-svn: 292019
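As a rough illustration of the 'lossy' behaviour described in this commit, the sketch below models one 32-bit lane as add( mul( x, y ), z ) with every bit above bit 31 discarded. The function names and the four-lane `std::array` wrapper are assumptions made for the example; this is a scalar model, not the backend's pattern-matching code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// One lane: multiply, add, and keep only the low 32 bits of the result.
// The arithmetic is done in 64 bits and truncated so the wrap is explicit.
inline std::uint32_t macLow32(std::uint32_t X, std::uint32_t Y, std::uint32_t Z) {
  return static_cast<std::uint32_t>(static_cast<std::uint64_t>(X) * Y + Z);
}

// Four lanes, standing in for a v4i32 vector (illustrative only).
using V4 = std::array<std::uint32_t, 4>;

inline V4 macLow32x4(const V4 &X, const V4 &Y, const V4 &Z) {
  V4 R{};
  for (std::size_t I = 0; I < R.size(); ++I)
    R[I] = macLow32(X[I], Y[I], Z[I]);
  return R;
}
```

Because the upper bits are simply dropped, an ordinary wrapping multiply followed by a wrapping add of the same element width computes the same lanes, with no extension or truncation steps in between.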
* [X86][XOP] Add tests for integer fused multiply add | Simon Pilgrim | 2017-01-14 | 1 | -0/+142
    Tests showing missed opportunities to use XOP's integer fma instructions.
    Some of these are pretty awkward to match as they often have implicit sext/trunc stages, but many just ignore overflow bits, which makes things pretty straightforward.
    llvm-svn: 292017
* fix some typos in the doc | Sylvestre Ledru | 2017-01-14 | 8 | -10/+10
    llvm-svn: 292014
* [utils] Improve extraction of check prefixes from RUN lines | Nikolai Bozhenov | 2017-01-14 | 2 | -6/+6
    Correct handling of the following FileCheck options is implemented in update_llc_test_checks.py and update_test_checks.py scripts:
      1) -check-prefix (with a single dash)
      2) -check-prefixes (with multiple prefixes)
    Differential Revision: https://reviews.llvm.org/D28572
    llvm-svn: 292008
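A minimal sketch of the prefix extraction this commit describes follows. The real scripts are written in Python and their regular expressions are not shown in this log, so the pattern, the function name `extractCheckPrefixes`, and the fallback to `CHECK` when no option is present are all assumptions made for illustration.

```cpp
#include <cassert>
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Collects prefixes from -check-prefix / --check-prefix / -check-prefixes /
// --check-prefixes options found on a RUN line. Hypothetical helper.
std::vector<std::string> extractCheckPrefixes(const std::string &RunLine) {
  // One or two dashes, optional "es" suffix, then '=' or whitespace, then the value.
  static const std::regex PrefixRE(R"(-{1,2}check-prefix(?:es)?(?:=|\s+)(\S+))");

  std::vector<std::string> Prefixes;
  for (std::sregex_iterator It(RunLine.begin(), RunLine.end(), PrefixRE), End;
       It != End; ++It) {
    // The value may be a comma-separated list (the -check-prefixes form).
    std::stringstream List((*It)[1].str());
    std::string Prefix;
    while (std::getline(List, Prefix, ','))
      if (!Prefix.empty())
        Prefixes.push_back(Prefix);
  }

  // Assumed fallback: with no explicit option, use the default prefix.
  if (Prefixes.empty())
    Prefixes.push_back("CHECK");
  return Prefixes;
}
```

Accepting one or two leading dashes and splitting the value on commas covers both cases listed in the commit message with a single pattern.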
* [AVX-512] Teach two address instruction pass to replace masked move instructions with blendm instructions when it's beneficial. | Craig Topper | 2017-01-14 | 16 | -357/+314
    Isel now selects masked move instructions for vselect instead of blendm. But sometimes it is beneficial to register allocation to remove the tied register constraint by using blendm instructions.
    This also picks up cases where the masked move was created due to a masked load intrinsic.
    Differential Revision: https://reviews.llvm.org/D28454
    llvm-svn: 292005
* [AVX-512] Replace V_SET0 in AVX-512 patterns with AVX512_128_SET0. Enhance AVX512_128_SET0 expansion to make this possible. | Craig Topper | 2017-01-14 | 4 | -28/+47
    We'll now expand AVX512_128_SET0 to an EVEX VXORD if VLX is available. If it's not, but register allocation has selected a non-extended register, we will use VEX VXORPS. And if it's an extended register without VLX, we'll use a 512-bit XOR. Do the same for AVX512_FsFLD0SS/SD.
    This makes it possible for the register allocator to have all 32 registers available to work with.
    llvm-svn: 292004
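The three-way choice described in this commit can be summarised as a small decision function. This is only a sketch of the selection logic as stated in the message. `ZeroIdiom` and `pickZeroIdiom` are hypothetical names, and it assumes that "extended register" means register index 16 through 31.

```cpp
#include <cassert>

// Hypothetical labels for the three expansions named in the commit message.
enum class ZeroIdiom {
  EvexXor128,  // EVEX-encoded 128-bit XOR, needs VLX
  VexXorps128, // VEX-encoded XORPS, only reaches registers 0-15
  EvexXor512,  // 512-bit XOR, used when neither of the above applies
};

// HasVLX: whether the subtarget has the VLX feature.
// RegIndex: index of the vector register chosen by the register allocator.
inline ZeroIdiom pickZeroIdiom(bool HasVLX, unsigned RegIndex) {
  assert(RegIndex < 32 && "only 32 vector registers are modelled here");
  if (HasVLX)
    return ZeroIdiom::EvexXor128;
  if (RegIndex < 16) // non-extended register: a VEX encoding can address it
    return ZeroIdiom::VexXorps128;
  return ZeroIdiom::EvexXor512; // extended register without VLX
}
```

Deferring the choice until the register is known is what lets the allocator pick from all 32 registers, as the message notes.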
* Removing potentially error-prone fallthrough. NFC | Marcello Maggioni | 2017-01-14 | 1 | -0/+1
    If other cases are added between fabs and default, this fallthrough could cause fabs to fall into the next case, resulting in a bug.
    Better to get rid of it immediately just to be sure.
    llvm-svn: 292003
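The hazard can be sketched with a generic switch. The enum, the function, and the cost values below are invented for the example and do not come from the changed file.

```cpp
#include <cassert>

// Hypothetical stand-ins for the cases involved.
enum class Intrinsic { Fabs, Other };

inline int costOf(Intrinsic ID) {
  int Cost = 1; // illustrative default
  switch (ID) {
  case Intrinsic::Fabs:
    Cost = 0;
    break; // explicit break: a case inserted below cannot be reached from here
  default:
    break;
  }
  return Cost;
}
```

Without the `break`, the behaviour is the same today, but any case later inserted between `Fabs` and `default` would silently run for `Fabs` as well.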
* Delete duplicate word. NFC | Xin Tong | 2017-01-14 | 1 | -1/+1
    llvm-svn: 291999
* [X86] Simplify the code that calculates a scaled blend mask. We don't need a second loop. | Craig Topper | 2017-01-14 | 1 | -2/+1
    llvm-svn: 291996
* [AVX-512] Change blend mask in lowerVectorShuffleAsBlend to a 64-bit value. | Craig Topper | 2017-01-14 | 2 | -349/+738
    Also add 32-bit mode command lines to the test case that exercises this just to make sure we sanely handle the 64-bit immediate there.
    This fixes an undefined-behavior sanitizer failure from r291888.
    llvm-svn: 291994
* Fix modules buildbots broken in r291983. | Eugene Zelenko | 2017-01-14 | 1 | -1/+2
    llvm-svn: 291985
* [Transforms/Utils] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). | Eugene Zelenko | 2017-01-14 | 12 | -152/+175
    llvm-svn: 291983
* Compute summary before calling extractProfTotalWeight | Easwaran Raman | 2017-01-14 | 3 | -35/+66
    extractProfTotalWeight checks if the profile type is sample profile, but before that we have to ensure that summary is available. Also expanded the unittest to test the case where there is no summary.
    Differential Revision: https://reviews.llvm.org/D28708
    llvm-svn: 291982
* NewGVN: Kill unneeded DFSDomMap, cleanup a few comments. | Daniel Berlin | 2017-01-14 | 1 | -13/+7
    llvm-svn: 291981
* Fix update_test_checks not to accidentally believe type names are variable names | Daniel Berlin | 2017-01-13 | 1 | -1/+1
    llvm-svn: 291980
* NewGVN: Fix PR31613 test regex naming | Daniel Berlin | 2017-01-13 | 1 | -2/+2
    llvm-svn: 291979
* GlobalISel: Abort in ResetMachineFunctionPass if fallback isn't enabled | Justin Bogner | 2017-01-13 | 3 | -6/+16
    When GlobalISel is configured to abort rather than fall back, the only thing that resetting the machine function does is make things harder to debug. If we ever get to this point in the abort configuration, it indicates that we've already hit a bug, so this changes the behaviour to abort instead.
    llvm-svn: 291977
* [InstCombine] optimize unsigned icmp of increment | Sanjay Patel | 2017-01-13 | 2 | -0/+169
    Allows LLVM to optimize sequences like the following:
      %add = add nuw i32 %x, 1
      %cmp = icmp ugt i32 %add, %y
    Into:
      %cmp = icmp uge i32 %x, %y
    Previously, only signed comparisons were being handled.
    Decrements could also be handled, but 'sub nuw %x, 1' is currently canonicalized to 'add %x, -1' in InstCombineAddSub, losing the nuw flag. Removing that canonicalization seems like it might have far-reaching ramifications so I kept this simple for now.
    Patch by Matti Niemenmaa!
    Differential Revision: https://reviews.llvm.org/D24700
    llvm-svn: 291975
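The equivalence behind this fold can be checked exhaustively on a narrower type. The sketch below uses 8-bit values as a stand-in for i32, which is an assumption made to keep the brute-force search small; the function names are invented for the example. It also shows why the nuw flag matters: once the increment wraps, the two comparisons disagree.

```cpp
#include <cassert>
#include <cstdint>

// Compares "(X + 1) ugt Y" with "X uge Y", letting X + 1 wrap modulo 2^8.
inline bool sameResult(std::uint8_t X, std::uint8_t Y) {
  std::uint8_t Add = static_cast<std::uint8_t>(X + 1); // wraps when X == 255
  bool Before = Add > Y; // icmp ugt (add X, 1), Y
  bool After = X >= Y;   // icmp uge X, Y
  return Before == After;
}

// Checks every pair for which the increment does not wrap, which is the
// set of inputs that 'add nuw' leaves well defined.
inline bool foldHoldsForAllI8() {
  for (unsigned X = 0; X < 255; ++X) // X == 255 excluded: the add would wrap
    for (unsigned Y = 0; Y <= 255; ++Y)
      if (!sameResult(static_cast<std::uint8_t>(X), static_cast<std::uint8_t>(Y)))
        return false;
  return true;
}
```

For X = 255 and Y = 0 the wrapped sum is 0, so the original comparison is false while `X uge Y` is true. That single input is why the transform needs the no-unsigned-wrap guarantee.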
* [GlobalISel] track predecessor mapping during switch lowering. | Tim Northover | 2017-01-13 | 3 | -18/+110
    Correctly populating Machine PHIs relies on knowing exactly how the IR level CFG was lowered to MachineIR. This needs to be tracked by any translation phases that meddle (currently only SwitchInst handling).
    llvm-svn: 291973
* [InstCombine] use m_APInt to allow lshr folds for vectors with splat constants | Sanjay Patel | 2017-01-13 | 2 | -24/+21
    llvm-svn: 291972
* [InstCombine / InstSimplify] add and move tests for lshr transforms; NFC | Sanjay Patel | 2017-01-13 | 3 | -22/+146
    llvm-svn: 291970
* NewGVN: Move leaders around properly to ensure we have a canonical dominating leader. Fixes PR 31613. | Daniel Berlin | 2017-01-13 | 2 | -40/+224
    Summary:
    This is a testcase where phi node cycling happens, and because we do not order the leaders by domination or anything similar, the leader keeps changing.
    Using std::set for the members is too expensive, and we actually don't need them sorted all the time, only at leader changes.
    We could keep both a set and a vector, and keep them mostly sorted and re-sort as necessary, or use a set and a fibheap, but all of this seems premature.
    After running some statistics, we are able to avoid the vast majority of sorting by keeping a "next leader" field. Most congruence classes only have leader changes once or twice during GVN.
    Reviewers: davide
    Subscribers: llvm-commits
    Differential Revision: https://reviews.llvm.org/D28594
    llvm-svn: 291968
* Add a variant of DWARFDie::find() and DWARFDie::findRecursively() that takes an llvm::ArrayRef<dwarf::Attribute>. | Greg Clayton | 2017-01-13 | 5 | -20/+131
    This allows us to efficiently look for more than one attribute, something that is quite common in DWARF consumption.
    Differential Revision: https://reviews.llvm.org/D28704
    llvm-svn: 291967
* [LoopStrengthReduce] Don't bother rewriting PHIs in catchswitch blocks | David Majnemer | 2017-01-13 | 2 | -1/+63
    The catchswitch instruction cannot be split, so don't bother trying to rewrite it.
    This fixes PR31627.
    llvm-svn: 291966
* [CodeGen] Simplify getRecipEstimateForFunc | David Majnemer | 2017-01-13 | 1 | -5/+1
    It used two attribute lookups when only one was needed.
    llvm-svn: 291965
* Clean up how DWARFDie attributes are accessed and decoded. | Greg Clayton | 2017-01-13 | 9 | -326/+513
    Removed all DWARFDie::getAttributeValueAs*() calls.
    Renamed:
      Optional<DWARFFormValue> DWARFDie::getAttributeValue(dwarf::Attribute);
    To:
      Optional<DWARFFormValue> DWARFDie::find(dwarf::Attribute);
    Added:
      Optional<DWARFFormValue> DWARFDie::findRecursively(dwarf::Attribute);
    All decoding of Optional<DWARFFormValue> values is now done using the dwarf::to*() functions from DWARFFormValue.h:
    Old code:
      auto DeclLine = DWARFDie.getAttributeValueAsSignedConstant(DW_AT_decl_line).getValueOr(0);
    New code:
      auto DeclLine = toUnsigned(DWARFDie.find(DW_AT_decl_line), 0);
    This composition helps us since we can now easily do:
      auto DeclLine = toUnsigned(DWARFDie.findRecursively(DW_AT_decl_line), 0);
    This allows us to easily find attribute values in the current DIE only (the first new code above) or in any DW_AT_abstract_origin or DW_AT_specification DIEs using the line above. Note that the new code line is shorter and more concise.
    Differential Revision: https://reviews.llvm.org/D28581
    llvm-svn: 291959
* "Use" lambda captures which are otherwise only used in asserts. NFC | David L. Jones | 2017-01-13 | 3 | -0/+3
    Summary:
    The LLVM coding standards recommend "using" values that are only needed by asserts:
      http://llvm.org/docs/CodingStandards.html#assert-liberally
    Without this change, LLVM cannot bootstrap with -Werror as the second stage fails with this new warning:
      https://reviews.llvm.org/rL291905
    See also the previous fixes:
      https://reviews.llvm.org/rL291916
      https://reviews.llvm.org/rL291939
      https://reviews.llvm.org/rL291940
      https://reviews.llvm.org/rL291941
    Reviewers: rsmith
    Subscribers: llvm-commits
    Differential Revision: https://reviews.llvm.org/D28695
    llvm-svn: 291957
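A small sketch of the idiom follows. The function and variable names are made up for the example, and the commit's actual three one-line changes are not reproduced here. The capture `Size` is read only inside an `assert`, so in a build with `NDEBUG` defined it would otherwise be unused, and a compiler that warns about unused lambda captures would reject it under `-Werror`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sums the first N elements, bounds-checking each access with an assert.
inline int sumFirstN(const std::vector<int> &Values, std::size_t N) {
  const std::size_t Size = Values.size();
  auto At = [&Values, Size](std::size_t I) {
    (void)Size; // "use" the capture so release builds do not see it as unused
    assert(I < Size && "index out of range");
    return Values[I];
  };

  int Sum = 0;
  for (std::size_t I = 0; I < N; ++I)
    Sum += At(I);
  return Sum;
}
```

The `(void)` cast has no runtime effect. It only marks the value as intentionally referenced, which is the approach the linked coding-standards section suggests for values needed solely by assertions.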
* [NVPTX] Added support for half-precision floating point. | Artem Belevich | 2017-01-13 | 18 | -102/+1487
    Only scalar half-precision operations are supported at the moment.
    - Adds general support for 'half' type in NVPTX.
    - fp16 math operations are supported on sm_53+ GPUs only (can be disabled with --nvptx-no-f16-math).
    - Type conversions to/from fp16 are supported on all GPU variants.
    - On GPU variants that do not have full fp16 support (or if it's disabled), fp16 operations are promoted to fp32 and results are converted back to fp16 for storage.
    Differential Revision: https://reviews.llvm.org/D28540
    llvm-svn: 291956
* [AMDGPU] Implement f16 fcopysign and fcopysign(f32, f64) | Konstantin Zhuravlyov | 2017-01-13 | 3 | -0/+299
    Differential Revision: https://reviews.llvm.org/D28496
    llvm-svn: 291954
* Add a description of how to check out the LLD repository. | Rui Ueyama | 2017-01-13 | 1 | -0/+6
    Differential Revision: https://reviews.llvm.org/D28687
    llvm-svn: 291948
* Check for register clobbers when merging a vreg live range with a reserved physreg in RegisterCoalescer. | James Y Knight | 2017-01-13 | 3 | -8/+58
    Previously, we only checked for clobbers when merging into a READ of the physreg, but not when merging from a WRITE to the physreg.
    Differential Revision: https://reviews.llvm.org/D28527
    llvm-svn: 291942
* [InstCombine] use 'match' and other clean-up; NFCI | Sanjay Patel | 2017-01-13 | 1 | -17/+8
    llvm-svn: 291937
* [NVPTX] Only lower sin/cos to approximate instructions if unsafe math is allowed. | Artem Belevich | 2017-01-13 | 8 | -13/+76
    Previously we'd always lower @llvm.{sin,cos}.f32 to {sin,cos}.approx.f32 instructions even when unsafe FP math was not allowed.
    Clang-generated IR is not affected by this as it uses precise sin/cos from CUDA's libdevice when unsafe math is disabled.
    Differential Revision: https://reviews.llvm.org/D28619
    llvm-svn: 291936
* [InstCombine] use m_APInt to allow shl folds for vectors with splat constants | Sanjay Patel | 2017-01-13 | 2 | -7/+9
    llvm-svn: 291934
* [SCEV] Limit recursion depth of constant evolving. | Michael Liao | 2017-01-13 | 1 | -3/+10
    - For a loop body with a VERY complicated exit condition evaluation, constant evolving may run out of stack on platforms such as Windows. We need to limit the recursion depth.
    Differential Revision: https://reviews.llvm.org/D28629
    llvm-svn: 291927
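The general technique named in this commit, giving up once a recursive walk gets too deep instead of exhausting the stack, can be sketched on a toy expression tree. Nothing below is ScalarEvolution code. `Expr`, `evaluate`, `makeChain`, and the limit of 32 are assumptions chosen for illustration.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Toy expression: either a leaf holding Value, or an addition of LHS and RHS.
struct Expr {
  int Value = 0;
  std::unique_ptr<Expr> LHS, RHS;
};

constexpr unsigned MaxDepth = 32; // hypothetical cut-off

// Returns false (leaving Result untouched) when the tree is deeper than
// MaxDepth, so the caller can fall back to a conservative answer.
inline bool evaluate(const Expr &E, int &Result, unsigned Depth = 0) {
  if (Depth > MaxDepth)
    return false; // bail out rather than keep recursing
  if (!E.LHS || !E.RHS) {
    Result = E.Value;
    return true;
  }
  int L = 0, R = 0;
  if (!evaluate(*E.LHS, L, Depth + 1) || !evaluate(*E.RHS, R, Depth + 1))
    return false;
  Result = L + R;
  return true;
}

// Builds a left-leaning chain of N additions over leaves of value 1.
inline std::unique_ptr<Expr> makeChain(unsigned N) {
  auto Node = std::make_unique<Expr>();
  Node->Value = 1;
  for (unsigned I = 0; I < N; ++I) {
    auto Parent = std::make_unique<Expr>();
    Parent->LHS = std::move(Node);
    Parent->RHS = std::make_unique<Expr>();
    Parent->RHS->Value = 1;
    Node = std::move(Parent);
  }
  return Node;
}
```

Passing the depth as a parameter keeps the limit local to each call chain. A caller that receives `false` treats the expression as unknown, trading a missed simplification for bounded stack use.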
* [InstCombine] add tests to show missing transforms for vector shl; NFC | Sanjay Patel | 2017-01-13 | 1 | -5/+24
    llvm-svn: 291926
* [X86][AVX] Bad v4f64/v4i64 '1z3z' shuffle test case | Simon Pilgrim | 2017-01-13 | 1 | -0/+49
    This lowers to SHUFPD if the input is zeroinitializer but not with a demanded elts optimized build vector.
    llvm-svn: 291924
* [InstCombine] use Op0/Op1 local variables more consistently with shifts; NFC | Sanjay Patel | 2017-01-13 | 1 | -22/+16
    llvm-svn: 291923
* Regenerate test. | Simon Pilgrim | 2017-01-13 | 1 | -5/+5
    llvm-svn: 291920
* Fix UBSan bots by blacklisting bits/stl_tree.h. | Ivan Krasin | 2017-01-13 | 2 | -0/+9
    Summary:
    libstdc++ has some undefined behavior in bits/stl_tree.h that has recently become exercised by some of the LLVM code. Given that fixing libstdc++ will take years, adding the file to a blacklist to fix the bots seems like a necessity.
    Reviewers: vitalybuka
    Subscribers: llvm-commits, mgorny
    Differential Revision: https://reviews.llvm.org/D28686
    llvm-svn: 291918