summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [DAGCombiner] Teach visitEXTRACT_SUBVECTOR to turn extracts of BUILD_VECTOR ↵Craig Topper2017-08-281-0/+23
| | | | | | | | | | into smaller BUILD_VECTORs Only do this before operations are legalized of BUILD_VECTOR is Legal for the target. Differential Revision: https://reviews.llvm.org/D37186 llvm-svn: 311892
* The current version of LLVM X86 disassembler incorrectly interprets some ↵Andrew V. Tischenko2017-08-282-17/+71
| | | | | | | | | | | | | | possible sets of x86 prefixes. This patch is the first step to close PR7709 and PR17697. There will be next patch(es) to close relative PRs. Differential Revision: https://reviews.llvm.org/D36788 M lib/Target/X86/Disassembler/X86DisassemblerDecoder.cpp M lib/Target/X86/Disassembler/X86DisassemblerDecoder.h A test/MC/Disassembler/X86/prefixes-i386.s A test/MC/Disassembler/X86/prefixes-x86_64.s M test/MC/Disassembler/X86/prefixes.txt llvm-svn: 311882
* [X86][Haswell] Updating HSW instruction scheduling informationGadi Haber2017-08-281-1204/+3475
| | | | | | | | | | | | | | | This patch completely replaces the instruction scheduling information for the Haswell architecture target by modifying the file X86SchedHaswell.td located under the X86 Target. We used the scheduling information retrieved from the Haswell architects in order to replace and modify the existing scheduling. The patch continues the scheduling replacement effort started with the SNB target in r307529 and r310792. Information includes latency, number of micro-Ops and used ports by each HSW instruction. Please expect some performance fluctuations due to code alignment effects. Reviewers: RKSimon, zvi, aymanmus, craig.topper, m_zuckerman, igorb, dim, chandlerc, aaboud Differential Revision: https://reviews.llvm.org/D36663 llvm-svn: 311879
* Untabify.NAKAMURA Takumi2017-08-2813-65/+65
| | | | llvm-svn: 311875
* [X86] Use getUnpackl helper to create an ISD::VECTOR_SHUFFLE instead of ↵Craig Topper2017-08-281-1/+1
| | | | | | | | using X86ISD::UNPCKL in reduceVMULWidth. This runs fairly early, we should use target independent nodes if possible. llvm-svn: 311873
* [X86] Add an early out to combineLoopMAddPattern and combineLoopSADPattern ↵Craig Topper2017-08-281-0/+6
| | | | | | | | when SSE2 is disabled. Without this the madd.ll and sad.ll test cases both throw assertions if you run them with SSE2 disabled. llvm-svn: 311872
* revert r310985 which breaks for the following case:Dehao Chen2017-08-271-2/+0
| | | | | | | | | | | | | | | | | | | | | struct string { ~string(); }; void f2(); void f1(int) { f2(); } void run(int c) { string body; while (true) { if (c) f1(c); else f1(c); } } Will recommit once the issue is fixed. llvm-svn: 311864
* [mips] Generate NMADD and NMSUB instructions when fneg node is presentPetar Jovanovic2017-08-272-0/+20
| | | | | | | | | | | | This patch enables generation of NMADD and NMSUB instructions when fneg node is present. These instructions are currently only generated if fsub node is present. Patch by Stanislav Ocovaj. Differential Revision: https://reviews.llvm.org/D34507 llvm-svn: 311862
* [ARM] Tidy-up condition-code support functionsJaved Absar2017-08-273-102/+89
| | | | | | | | | Move condition code support functions to Utils and remove code duplication. Reviewed by: @fhahn, @asb Differential Revision: https://reviews.llvm.org/D37179 llvm-svn: 311860
* [AVX512] Add more patterns for using masked moves for subvector extracts of ↵Craig Topper2017-08-271-123/+58
| | | | | | the lowest subvector. This time with bitcasts between the vselect and the extract. llvm-svn: 311856
* [DAGCombiner] allow undef shuffle operands when eliminating bitcasts (PR34111)Sanjay Patel2017-08-271-1/+4
| | | | | | | | As noted in the FIXME, this could be improved more, but this is the smallest fix that helps: https://bugs.llvm.org/show_bug.cgi?id=34111 llvm-svn: 311853
* [ARM] Tidy-up ARMAsmParser. NFC.Javed Absar2017-08-271-23/+5
| | | | | | | | | Simplify getDRegFromQReg function Reviewed by: @fhahn, @asb Differential Revision: https://reviews.llvm.org/D37118 llvm-svn: 311850
* [LV] Fix PR34248 - recommit D32871 after revert r311304Ayal Zaks2017-08-274-523/+2247
| | | | | | | | | | | Original commit r311077 of D32871 was reverted in r311304 due to failures reported in PR34248. This recommit fixes PR34248 by restricting the packing of predicated scalars into vectors only when vectorizing, avoiding doing so when unrolling w/o vectorizing. Added a test derived from the reproducer of PR34248. llvm-svn: 311849
* [X86] Add a target-specific DAG combine to combine extract_subvector from ↵Craig Topper2017-08-271-0/+22
| | | | | | all zero/one build_vectors. llvm-svn: 311841
* [X86] Use getOnesVector instead of using DAG.getConstant(-1).Craig Topper2017-08-271-1/+1
| | | | llvm-svn: 311840
* [NewGVN] Use `auto` when the type is obvious NFCI.Davide Italiano2017-08-261-2/+2
| | | | llvm-svn: 311838
* [AVX512] Add patterns to match masked extract_subvector with bitcasts ↵Craig Topper2017-08-262-21/+106
| | | | | | | | | | between the vselect and the extract_subvector. Remove the late DAG combine. We used to do a late DAG combine to move the bitcasts out of the way, but I'm starting to think that it's better to canonicalize extract_subvector's type to match the type of its input. I've seen some cases where we've formed two different extract_subvector from the same node where one had a bitcast and the other didn't. Add some more test cases to ensure we've also got most of the zero masking covered too. llvm-svn: 311837
* [Dominators] Remove redundant explicit template instantiation.Don Hinton2017-08-261-2/+0
| | | | | | | | | | | | | | | Summary: Remove redundant explicit template instantiation. This was reported by Andrew Kelley building release_50 with gcc7.2.0 on MacOS: duplicate symbol llvm::DominatorTreeBase. Reviewers: kuhar, andrewrk, davide, hans Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37185 llvm-svn: 311835
* [DAGCombiner] Extending pattern detection for vector shuffle.Jatin Bhateja2017-08-261-3/+53
| | | | | | | | | | | | | | | | | | Summary: If all the operands of a BUILD_VECTOR extract elements from same vector then split the vector efficiently based on the maximum vector access index. This will also fix PR 33784 Reviewers: zvi, delena, RKSimon, thakis Reviewed By: RKSimon Subscribers: chandlerc, eladcohen, llvm-commits Differential Revision: https://reviews.llvm.org/D35788 llvm-svn: 311833
* Revert rL311247 : To rectify commit message.Jatin Bhateja2017-08-261-53/+3
| | | | | | | | Summary: This reverts commit rL311247. Differential Revision: https://reviews.llvm.org/D36927 llvm-svn: 311832
* NewGVN: Fix PR33204 - We need to add memory users when we bypass memorydefs ↵Daniel Berlin2017-08-261-4/+8
| | | | | | for loads, not just when we do it for stores. llvm-svn: 311829
* [X86] Qualify the RMW INC/DEC patterns with NotSlowIncDec.Craig Topper2017-08-261-2/+2
| | | | | | We were suppressing most uses of INC/DEC, but this one seems to have been missed. llvm-svn: 311828
* Add options to dump block frequency/branch probability info in text.Hiroshi Yamauchi2017-08-263-0/+42
| | | | | | | | | | | | | | | | | | | | | Summary: Add options -print-bfi/-print-bpi that dump block frequency and branch probability info like -view-block-freq-propagation-dags and -view-machine-block-freq-propagation-dags do but in text. This is useful when the graph is very large and complex (the dot command crashes, lines/edges too close to tell apart, hard to navigate without textual search) or simply when text is preferred. Reviewers: davidxl Reviewed By: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37165 llvm-svn: 311822
* [AVX512] Add patterns to use masked moves to implement masked ↵Craig Topper2017-08-251-0/+133
| | | | | | | | extract_subvector of the lowest subvector. This only supports 32 and 64 bit element sizes for now. But we could probably do 16 and 8-bit elements with BWI. llvm-svn: 311821
* [x86] Teach the backend to fold more read-modify-write memory operandsChandler Carruth2017-08-251-53/+73
| | | | | | | | | | | | | | | | | | | | | | | | to instructions. These can't be reasonably matched in tablegen due to the handling of flags, so we have to do this in C++ code. We only did it for `inc` and `dec` historically, this starts fleshing that out to more interesting instructions. Notably, this handles transfering operands to `add` and `sub`. Currently this forces them into a register. The next patch will add support for keeping immediate operands as immediates. Then I'll extend this beyond just `add` and `sub`. I'm not super thrilled by the repeated switches in the code but everything else I tried was really ugly or problematic. Many thanks to Craig Topper for the suggestions about where to even begin here and how to make this stuff work. Differential Revision: https://reviews.llvm.org/D37130 llvm-svn: 311806
* [Verifier] Diagnose invalid DIType references instead of crashing.Davide Italiano2017-08-251-0/+1
| | | | | | Fixes PR34325. llvm-svn: 311805
* [Inliner] Only compute fully inline cost when remarks are enabled.Davide Italiano2017-08-251-1/+11
| | | | | | | | Prior to this change (and after r311371), we computed it unconditionally, causin gsevere compile time regressions (in some cases, 5 to 10x). llvm-svn: 311804
* Revert "[SanitizeCoverage] Enable stack-depth coverage for -fsanitize=fuzzer"Matt Morehouse2017-08-251-21/+9
| | | | | | This reverts r311801 due to a bot failure. llvm-svn: 311803
* [SanitizeCoverage] Enable stack-depth coverage for -fsanitize=fuzzerMatt Morehouse2017-08-251-9/+21
| | | | | | | | | | | | | | | | | Summary: - Don't sanitize __sancov_lowest_stack. - Don't instrument leaf functions. - Add CoverageStackDepth to Fuzzer and FuzzerNoLink. Reviewers: vitalybuka, kcc Reviewed By: kcc Subscribers: cfe-commits, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D37156 llvm-svn: 311801
* [sanitizer-coverage] extend fsanitize-coverage=pc-table with flags for every PCKostya Serebryany2017-08-251-13/+20
| | | | llvm-svn: 311794
* [InlineCost] Small changes to early exit condition. NFC.Haicheng Wu2017-08-251-3/+3
| | | | | | | | | Change the early exit condition from Cost > Threshold to Cost >= Threshold because the inline condition is Cost < Threshold. Differential Revision: https://reviews.llvm.org/D37087 llvm-svn: 311791
* [InstCombine] Don't fall back to only calling computeKnownBits if the upper ↵Craig Topper2017-08-251-23/+24
| | | | | | | | | | | | | | | | bit of Add/Sub is demanded. Just create an all 1s demanded mask and continue recursing like normal. The recursive calls should be able to handle an all 1s mask and do the right thing. The only time we should care about knowing whether the upper bit was demanded is when we need to know if we should clear the NSW/NUW flags. Now that we have a consistent path through the code for all cases, use KnownBits::computeForAddSub to compute the known bits at the end since we already have the LHS and RHS. My larger goal here is to move the code that turns add into xor if only 1 bit is demanded and no bits below it are non-zero from InstCombiner::OptAndOp to here. This will allow it to be more general instead of just looking for 'add' and 'and' with constant RHS. Differential Revision: https://reviews.llvm.org/D36486 llvm-svn: 311789
* [LoopInterchange] Skip zext instructions when looking for induction var.Florian Hahn2017-08-251-1/+2
| | | | | | | | | | | | | | | | | | Summary: SimplifyIndVar may introduce zext instructions to widen arguments of the loop exit check. They should not prevent us from splitting the loop at the induction variable, but maybe the check should be more conservative, e.g. making sure it only extends arguments used by a comparison? Reviewers: karthikthecool, mcrosier, mzolotukhin Reviewed By: mcrosier Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D34879 llvm-svn: 311783
* Fix unused-lambda-capture warning by using default capture-by-refDavid Blaikie2017-08-251-2/+1
| | | | | | | | Since the lambda isn't escaped (via a std::function or similar) it's fine/better to use default capture-by-ref to provide semantics similar to language-level nested scopes (if/for/while/etc). llvm-svn: 311782
* Fix buildbot breakage from r311763. Remove unused lambda capture.Matt Morehouse2017-08-251-2/+1
| | | | llvm-svn: 311781
* Normlize to LF line endings.Michael Kruse2017-08-252-3/+3
| | | | | | | Commit r297442 introduced mixed CRLF/LF line endings to two files. Normalize to to LF-only line endings. llvm-svn: 311774
* [InstCombine] Consider more cases where SimplifyDemandedUseBits does not ↵Amjad Aboud2017-08-251-2/+5
| | | | | | | | | | convert AShr to LShr. There are cases where AShr have better chance to be optimized than LShr, especially when the demanded bits are not known to be Zero, and also known to be similar to the sign bit. Differential Revision: https://reviews.llvm.org/D36936 llvm-svn: 311773
* [X86] Use SDValue::getOpcode instead of calling getNode and calling ↵Craig Topper2017-08-251-2/+2
| | | | | | getOpcode on that. NFC llvm-svn: 311765
* [X86] Use isUInt and isShiftedUInt instead of using our own masking and ↵Craig Topper2017-08-251-21/+13
| | | | | | | | compares. NFCI While there use a local variable instead of calling C->getZExtValue() repeatedly. llvm-svn: 311764
* [GISel]: Implement widenScalar for Legalizing G_PHIAditya Nandakumar2017-08-253-14/+50
| | | | | | https://reviews.llvm.org/D37018 llvm-svn: 311763
* [coroutines] Add support for symmetric control transfer (musttail on ↵Gor Nishanov2017-08-251-0/+88
| | | | | | | | | | | | | | | | | | | | | coro.resumes followed by a suspend) Summary: Add musttail to any resume instructions that is immediately followed by a suspend (i.e. ret). We do this even in -O0 to support guaranteed tail call for symmetrical coroutine control transfer (C++ Coroutines TS extension). This transformation is done only in the resume part of the coroutine that has identical signature and calling convention as the coro.resume call. Reviewers: GorNishanov Reviewed By: GorNishanov Subscribers: EricWF, majnemer, llvm-commits Differential Revision: https://reviews.llvm.org/D37125 llvm-svn: 311751
* [x86] NFC: More refactoring to pave the way to extending this ISel logicChandler Carruth2017-08-251-36/+51
| | | | | | | | | to handle other x86 pseudos that carry flags and thus can't be matched by our ISel patterns with fused memory accesses. Differential Revision: https://reviews.llvm.org/D37088 llvm-svn: 311749
* [x86] NFC - Refactor the custom lowering of `(load; op; store)` RMW sequences.Chandler Carruth2017-08-251-49/+57
| | | | | | | | | | This extracts the code out of a giant switch in preparation for expanding it to handle operations other thin `inc` and `dec`. Add a FIXME indicating what's coming here. Differential Revision: https://reviews.llvm.org/D37045 llvm-svn: 311748
* [X86] Add TBM instructions to X86InstrInfo::isDefConvertible.Craig Topper2017-08-251-0/+16
| | | | | | This allows us to remove "test" instructions and use the flags from the TBM instructions directly. llvm-svn: 311747
* [sanitizer-coverage] Make sure pc-tables aren't dead strippedJustin Bogner2017-08-251-0/+4
| | | | | | | Add a reference to the PC array in llvm.used so that linkers that aggressively dead strip (like ld64) don't remove it. llvm-svn: 311742
* [x86] Back out one aspect of r311318: don't generically setChandler Carruth2017-08-251-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | FeatureSlowUAMem32. The idea was to mark things that are slow on widely available processors as slow in the generic CPU so that the code generated for that CPU would be fast across those processors. However, for this feature that doesn't work out very well at all. The problem here is that you can very easily enable AVX or AVX2 on top of this generic CPU. For example, this can happen just by using AVX2 intrinsics from Clang within a region of code guarded by a dynamic CPU feature test. When you do that, the generated code with SlowUAMem32 set is ... amazingly slower. The problem is that there really aren't very good alternatives to the unaligned loads, and so our vector codegen regresses significantly. The other issue is that there are plenty of AMD CPUs with AVX1 that don't set FeatureSlowUAMem32 and so we shouldn't just check for AVX2 instead of this special feature. =/ It would be nice to have the target attriute logic be able to enable/disable more than just one feature at a time and control this in a more fine grained and useful way, but that doesn't seem easy. Given that it is only Sandybridge and Ivybridge that set this feature, for now I'm just backing it out of the generic CPU. That has the additional advantage of going back to the previous state that people seemed vaguely happy with. llvm-svn: 311740
* [x86] Fix an amazing goof in the handling of sub, or, and xor lowering.Chandler Carruth2017-08-251-13/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The comment for this code indicated that it should work similar to our handling of add lowering above: if we see uses of an instruction other than flag usage and store usage, it tries to avoid the specialized X86ISD::* nodes that are designed for flag+op modeling and emits an explicit test. Problem is, only the add case actually did this. In all the other cases, the logic was incomplete and inverted. Any time the value was used by a store, we bailed on the specialized X86ISD node. All of this appears to have been historical where we had different logic here. =/ Turns out, we have quite a few patterns designed around these nodes. We should actually form them. I fixed the code to match what we do for add, and it has quite a positive effect just within some of our test cases. The only thing close to a regression I see is using: notl %r testl %r, %r instead of: xorl -1, %r But we can add a pattern or something to fold that back out. The improvements seem more than worth this. I've also worked with Craig to update the comments to no longer be actively contradicted by the code. =[ Some of this still remains a mystery to both Craig and myself, but this seems like a large step in the direction of consistency and slightly more accurate comments. Many thanks to Craig for help figuring out this nasty stuff. Differential Revision: https://reviews.llvm.org/D37096 llvm-svn: 311737
* [DAG] convert vector select-of-constants to logic/mathSanjay Patel2017-08-244-6/+68
| | | | | | | | | | | | | | | | | | | | | | | | | | This goes back to a discussion about IR canonicalization. We'd like to preserve and convert more IR to 'select' than we currently do because that's likely the best choice in IR: http://lists.llvm.org/pipermail/llvm-dev/2016-September/105335.html ...but that's often not true for codegen, so we need to account for this pattern coming in to the backend and transform it to better DAG ops. Steps in this patch: 1. Add an EVT param to the existing convertSelectOfConstantsToMath() TLI hook to more finely enable this transform. Other targets will probably want that anyway to distinguish scalars from vectors. We're using that here to exclude AVX512 targets, but it may not be necessary. 2. Convert a vselect to ext+add. This eliminates a constant load/materialization, and the vector ext is often free. Implementing a more general fold using xor+and can be a follow-up for targets that don't have a legal vselect. It's also possible that we can remove the TLI hook for the special case fold implemented here because we're eliminating a constant, but it needs to be tested on other targets. Differential Revision: https://reviews.llvm.org/D36840 llvm-svn: 311731
* [ADT] Enable reverse iteration for DenseMapMandeep Singh Grang2017-08-241-11/+0
| | | | | | | | | | | | Reviewers: mehdi_amini, dexonsmith, dblaikie, davide, chandlerc, davidxl, echristo, efriedma Reviewed By: dblaikie Subscribers: rsmith, mgorny, emaste, llvm-commits Differential Revision: https://reviews.llvm.org/D35043 llvm-svn: 311730
* [Profile] backward propagate profile info in JumpThreadingXinliang David Li2017-08-241-1/+115
| | | | | | | | Take-2 after fixing bugs in the original patch. Differential Revsion: http://reviews.llvm.org/D36864 llvm-svn: 311727
OpenPOWER on IntegriCloud