Commit message (Author, Age, Files, Lines)
...
* [InstCombine] More thoroughly canonicalize the position of zexts (David Majnemer, 2016-12-30, 2 files, -9/+120)
  We correctly canonicalized (add (sext x), (sext y)) to (sext (add x, y)) where possible. However, we didn't perform the same canonicalization for zexts or for muls.
  llvm-svn: 290733
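
The arithmetic behind this canonicalization can be sketched in plain C++. This is an illustration only, not the InstCombine code: the helper names are hypothetical, and the i8-to-i32 widths are an assumption chosen to keep the example small. Zero-extending after the add matches adding after zero-extension only when the narrow add does not wrap, which is presumably the condition behind "where possible".

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers modelling an i8 -> i32 zero-extension.

// (zext (add x, y)): add in the narrow type, then zero-extend.
inline uint32_t zextOfAdd(uint8_t X, uint8_t Y) {
  return static_cast<uint32_t>(static_cast<uint8_t>(X + Y));
}

// (add (zext x), (zext y)): zero-extend both operands, then add.
inline uint32_t addOfZexts(uint8_t X, uint8_t Y) {
  return static_cast<uint32_t>(X) + static_cast<uint32_t>(Y);
}

// True when the 8-bit add would wrap around (unsigned overflow).
inline bool narrowAddWraps(uint8_t X, uint8_t Y) {
  return static_cast<unsigned>(X) + static_cast<unsigned>(Y) > 0xFFu;
}

// Model of the rewritten form; only valid when the narrow add does not wrap.
inline uint32_t canonicalZextAdd(uint8_t X, uint8_t Y) {
  assert(!narrowAddWraps(X, Y) && "rewrite requires no unsigned wrap");
  return zextOfAdd(X, Y);
}
```

With inputs such as 100 and 27 both forms agree; with 200 and 100 the narrow add wraps, so the two forms differ and the rewrite would not be legal.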
* [AVR] Optimize 16-bit ORs with '0' (Dylan McKay, 2016-12-30, 3 files, -16/+64)
  Summary: Fixes PR 31344
  Authored by Anmol P. Paralkar
  Reviewers: dylanmckay
  Subscribers: fhahn, llvm-commits
  Differential Revision: https://reviews.llvm.org/D28121
  llvm-svn: 290732
* Simplify FunctionLoweringInfo.cpp with range for loops (Reid Kleckner, 2016-12-30, 1 file, -40/+31)
  I'm preparing to add some pattern matching code here, so simplify the code before I do. NFC
  llvm-svn: 290731
* Include <algorithm> for std::max etc (Reid Kleckner, 2016-12-30, 1 file, -0/+1)
  llvm-svn: 290730
* [LICM] Compute exit blocks for promotion eagerly. NFC. (Michael Kuperstein, 2016-12-29, 1 file, -35/+36)
  This moves the exit block and insertion point computation to be eager, instead of after seeing the first scalar we can promote. The cost is relatively small (the computation happens anyway, see discussion on D28147), and the code is easier to follow, and can bail out earlier if there's a catchswitch present.
  llvm-svn: 290729
* [LICM] Don't try to promote in loops where we have no chance to promote. NFC. (Michael Kuperstein, 2016-12-29, 1 file, -10/+6)
  We would check whether we have a preheader *or* dedicated exit blocks, and go into the promotion loop. Then, for each alias set we'd check if we have a preheader *and* dedicated exit blocks, and bail if not. Instead, bail immediately if we don't have both.
  llvm-svn: 290728
* Fix build using the buildit script (Eric Fiselier, 2016-12-29, 1 file, -1/+1)
  llvm-svn: 290727
* [LICM] Only recompute LCSSA when we actually promoted something. (Michael Kuperstein, 2016-12-29, 1 file, -3/+6)
  We want to recompute LCSSA only when we actually promoted a value. This means we only need to look at changes made by promotion when deciding whether to recompute it or not, not at regular sinking/hoisting. (This was what the code was documented as doing, just not what it did.)
  Hopefully NFC.
  llvm-svn: 290726
* [OpenMP] Sema and parsing for 'target teams distribute parallel for' pragma (Kelvin Li, 2016-12-29, 49 files, -17/+5451)
  This patch implements sema and parsing for the 'target teams distribute parallel for' pragma.
  Differential Revision: https://reviews.llvm.org/D28160
  llvm-svn: 290725
* NewGVN: Fix PR 31491 by ensuring that we touch the right instructions. Change to one-based numbering so we can assert we don't cause the same bug again. (Daniel Berlin, 2016-12-29, 2 files, -11/+51)
  llvm-svn: 290724
* [Analysis] Remove repeated text from a comment. NFC (Craig Topper, 2016-12-29, 1 file, -2/+1)
  llvm-svn: 290723
* Fix indentation in r290716. (Bryant Wong, 2016-12-29, 2 files, -8/+8)
  Use two-space indentation like the rest of the file.
  llvm-svn: 290722
* Fix PR31489 - std::function self-swap segfaults (Eric Fiselier, 2016-12-29, 4 files, -94/+205)
  llvm-svn: 290721
* [ADT] Rewrite IntrusiveRefCntPtr's comments. NFC (Justin Lebar, 2016-12-29, 1 file, -48/+84)
  Edit for voice, and also add examples. In particular, add an explanation for why you might want to specialize IntrusiveRefCntPtrInfo, which is not obvious.
  llvm-svn: 290720
* [ADT] Rename RefCountedBase::ref_cnt to RefCount. NFC (Justin Lebar, 2016-12-29, 1 file, -5/+5)
  This makes it comply with the LLVM style guide, and also makes it consistent with ThreadSafeRefCountedBase below.
  llvm-svn: 290719
* [ADT] clang-format IntrusiveRefCntrPtr.h. NFC (Justin Lebar, 2016-12-29, 1 file, -150/+128)
  This file had some strange indentation. Also remove some unnecessary whitespace between one-line member functions.
  llvm-svn: 290718
* [ADT] Delete RefCountedBaseVPTR. (Justin Lebar, 2016-12-29, 7 files, -77/+28)
  Summary:
  This class is unnecessary.
  Its comment indicated that it was a compile error to allocate an instance of a class that inherits from RefCountedBaseVPTR on the stack. This may have been true at one point, but it's not today. Moreover you really do not want to allocate *any* refcounted object on the stack, vptrs or not, so if we did have a way to prevent these objects from being stack-allocated, we'd want to apply it to regular RefCountedBase too, obviating the need for a separate RefCountedBaseVPTR class.
  It seems that the main way RefCountedBaseVPTR provides safety is by making its subclass's destructor virtual. This may have been helpful at one point, but these days clang will emit an error if you define a class with virtual functions that inherits from RefCountedBase but doesn't have a virtual destructor.
  Reviewers: compnerd, dblaikie
  Subscribers: cfe-commits, klimek, llvm-commits, mgorny
  Differential Revision: https://reviews.llvm.org/D28162
  llvm-svn: 290717
* Correctly handle multi-lined RUN lines. (Bryant Wong, 2016-12-29, 2 files, -2/+16)
  `utils/update_{llc_test,test}_checks` ought to be able to handle RUN commands that span multiple lines, as shown in the example at http://llvm.org/docs/CommandGuide/FileCheck.html#the-filecheck-check-prefix-option
  Differential Revision: https://reviews.llvm.org/D26523
  llvm-svn: 290716
* [ADT] Use memcpy for type punning in MathExtras. (Justin Lebar, 2016-12-29, 1 file, -24/+16)
  Summary: Previously we type-punned through a union, which is not safe.
  Reviewers: rnk
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D28161
  llvm-svn: 290715
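
As a rough illustration of the technique named in this commit, the sketch below reinterprets a float's bits with std::memcpy instead of reading an inactive union member. It is not the MathExtras code: the function names are made up for this example, and the expected bit pattern in the usage note assumes IEEE-754 single precision.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copy the object representation of a float into an
// integer of the same size. memcpy between distinct trivially copyable
// types is well defined, unlike reading an inactive union member in C++.
inline uint32_t floatToBits(float F) {
  static_assert(sizeof(uint32_t) == sizeof(float), "size mismatch");
  uint32_t Bits;
  std::memcpy(&Bits, &F, sizeof(F));
  return Bits;
}

// Hypothetical inverse of floatToBits.
inline float bitsToFloat(uint32_t Bits) {
  static_assert(sizeof(uint32_t) == sizeof(float), "size mismatch");
  float F;
  std::memcpy(&F, &Bits, sizeof(Bits));
  return F;
}

// Small self-check: a value survives the round trip unchanged.
inline void selfCheck() {
  assert(bitsToFloat(floatToBits(2.5f)) == 2.5f);
}
```

On an IEEE-754 platform, floatToBits(1.0f) yields 0x3F800000. Compilers typically lower a fixed-size memcpy like this to a plain register move, so the safer form usually costs nothing.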
* Revert "[COFF] Use 32-bit jump table entries in .rdata for Win64" (Reid Kleckner, 2016-12-29, 5 files, -53/+9)
  This reverts commit r290694. It broke sanitizer tests on Win64. I'll probably bring this back, but the jump tables will just live in .text like they do for MSVC.
  llvm-svn: 290714
* [TBAAVerifier] Be stricter around verifying scalar nodes (Sanjoy Das, 2016-12-29, 6 files, -33/+40)
  This fixes the issue exposed in PR31393, where we weren't trying sufficiently hard to diagnose bad TBAA metadata.
  This does reduce the variety in the error messages we print out, but I think the tradeoff of verifying more, simply and quickly, overrules the need for more helpful error messages here.
  llvm-svn: 290713
* [TBAAVerifier] Make things const-consistent; NFC (Sanjoy Das, 2016-12-29, 2 files, -11/+12)
  llvm-svn: 290712
* [TBAAVerifier] Memoize validity of scalar tbaa nodes; NFCI (Sanjoy Das, 2016-12-29, 2 files, -5/+20)
  llvm-svn: 290711
* [AMDGPU][mc] Enable absolute expressions in .hsa_code_object_isa directive (Artem Tamazov, 2016-12-29, 3 files, -28/+48)
  Among other things, this allows using the predefined .option.machine_version_major/minor/stepping symbols in the directive.
  The relevant test is expanded as well (and the file is renamed for clarity).
  Differential Revision: https://reviews.llvm.org/D28140
  llvm-svn: 290710
* Fix documentation generator warnings after rL290708. (Igor Laevsky, 2016-12-29, 1 file, -3/+3)
  llvm-svn: 290709
* Introduce element-wise atomic memcpy intrinsic (Igor Laevsky, 2016-12-29, 8 files, -0/+275)
  This change adds a new intrinsic which is intended to provide memcpy functionality with additional atomicity guarantees. Please refer to the review thread or language reference for further details.
  Differential Revision: https://reviews.llvm.org/D27133
  llvm-svn: 290708
* [InstCombine] Use getVectorNumElements instead of explicitly casting to VectorType and calling getNumElements. NFC (Craig Topper, 2016-12-29, 1 file, -8/+7)
  llvm-svn: 290707
* [InstCombine] Fix typo in comment. NFC (Craig Topper, 2016-12-29, 1 file, -1/+1)
  llvm-svn: 290706
* [InstCombine] Use 32 bits instead of 64 bits for storing the number of elements in VectorType for a ShuffleVector. While there, use getVectorNumElements to avoid an explicit cast. NFC (Craig Topper, 2016-12-29, 1 file, -2/+2)
  llvm-svn: 290705
* [InstCombine][X86] If the lowest element of a scalar intrinsic isn't used, make sure we add it to the worklist so we can DCE it sooner. (Craig Topper, 2016-12-29, 1 file, -6/+18)
  We bypassed the intrinsic and returned the passthru operand, but we should also add the intrinsic to the worklist since it's now dead. This can allow DCE to find it sooner and remove it. The same was done for InsertElement when the inserted element isn't demanded.
  llvm-svn: 290704
* [libFuzzer] make __sanitizer_cov_trace_switch more predictable (Kostya Serebryany, 2016-12-29, 2 files, -24/+19)
  llvm-svn: 290703
* [InstCombine] Fix some of the AVX-512 scalar arithmetic test cases to do a better job of testing what they intended to test. (Craig Topper, 2016-12-29, 1 file, -36/+36)
  They accidentally had trivially dead code. Also adjust the rounding mode to something other than CUR_DIRECTION so the intrinsics don't get converted to native operations before going through SimplifyDemandedVectorElts.
  llvm-svn: 290702
* Remove BitstreamWriter::Emit64(), it was never called (NFC) (Mehdi Amini, 2016-12-29, 1 file, -9/+0)
  llvm-svn: 290701
* Fix mingw build by moving the static const data member before the bitfields (Reid Kleckner, 2016-12-29, 1 file, -2/+3)
  Apparently GCC targeting Windows breaks bitfields on static data members:
    struct Foo {
      unsigned X : 16;
      static const int M = 42;
      unsigned Y : 16;
    };
    static_assert(sizeof(Foo) == 4, "asdf"); // fails
  Who knew.
  llvm-svn: 290700
* NewGVN: Sort Dominator Tree in RPO order, and use that for generating order. (Daniel Berlin, 2016-12-29, 1 file, -4/+24)
  Summary:
  The optimal iteration order for this problem is RPO order. We want to process as many preds of a backedge as we can before we process the backedge.
  At the same time, as we add predicate handling, we want to be able to touch instructions that are dominated by a given block by ranges (because a change in value numbering a predicate possibly affects all users we dominate that are using that predicate). If we don't do it this way, we can't do value inference over backedges (the paper covers this in depth).
  The newgvn branch currently overshoots the last part, and guarantees that it will touch *at least* the right set of instructions, but it does touch more. This is because the bitvector instruction ranges are currently generated in RPO order (so we take the max and the min of the ranges of dominated blocks, which means there are some in the middle we didn't have to touch that we did).
  We can do better by sorting the dominator tree, and then just using dominator tree order.
  As a preliminary, the dominator tree has some RPO guarantees, but not enough. It guarantees that for a given node, your idom must come before you in the RPO ordering. It guarantees no relative RPO ordering for siblings. We add siblings in whatever order they appear in the module.
  So that is what we fix. We sort the children array of the domtree into RPO order, and then use the dominator tree for ordering, instead of RPO, since the dominator tree is now a valid RPO ordering.
  Note: This would help any other pass that iterates a forward problem in dominator tree order. Most of them are single pass. It will still maximize whatever result they compute. We could also build the dominator tree in this order, but our incremental updates would still put it out of sort order, and recomputing the sort order is almost as hard as general incremental updates of the domtree.
  Also note that the sorting does not affect any tests, etc. Nothing depends on domtree order, including the verifier, the equals functions for domtree nodes, etc.
  How much could this matter, you ask? Here are the current numbers. This is generated by running NewGVN over all files in LLVM. Note that once we propagate equalities, the differences go up by an order of magnitude or two (i.e., instead of 29, the max ends up in the thousands, since in the worst case we add a factor of N, where N is the number of branch predicates). So while it doesn't look that stark for the default ordering, it gets *much much* worse. There are also programs in the wild where the difference is already pretty stark (2 iterations vs hundreds).
  RPO ordering:
    759040 Number of iterations is 1
    112908 Number of iterations is 2
  Default dominator tree ordering:
    755081 Number of iterations is 1
    116234 Number of iterations is 2
    603 Number of iterations is 3
    27 Number of iterations is 4
    2 Number of iterations is 5
    1 Number of iterations is 7
  Dominator tree sorted:
    759040 Number of iterations is 1
    112908 Number of iterations is 2
    <yay!>
  Really bad ordering (sort domtree siblings in postorder. not quite the worst possible, but yeah):
    754008 Number of iterations is 1
    21 Number of iterations is 10
    8 Number of iterations is 11
    6 Number of iterations is 12
    5 Number of iterations is 13
    2 Number of iterations is 14
    2 Number of iterations is 15
    3 Number of iterations is 16
    1 Number of iterations is 17
    2 Number of iterations is 18
    96642 Number of iterations is 2
    1 Number of iterations is 20
    2 Number of iterations is 21
    1 Number of iterations is 22
    1 Number of iterations is 29
    17266 Number of iterations is 3
    2598 Number of iterations is 4
    798 Number of iterations is 5
    273 Number of iterations is 6
    186 Number of iterations is 7
    80 Number of iterations is 8
    42 Number of iterations is 9
  Reviewers: chandlerc, davide
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D28129
  llvm-svn: 290699
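
The sorting step this message describes can be sketched on a toy tree. This is a simplified model and not the NewGVN patch: `Node`, `sortChildrenByRPO`, and `preorder` are hypothetical names, and the RPO numbers are assumed to have been computed beforehand by a separate CFG traversal.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a dominator tree node (not LLVM's DomTreeNode).
struct Node {
  int RPONumber;                // position of the block in reverse post order
  std::vector<Node *> Children; // blocks immediately dominated by this block
};

// Sort every node's children by RPO number, recursively. Siblings start out
// in arbitrary order, so this is the only ordering that needs fixing.
inline void sortChildrenByRPO(Node &N) {
  std::sort(N.Children.begin(), N.Children.end(),
            [](const Node *A, const Node *B) {
              return A->RPONumber < B->RPONumber;
            });
  for (Node *C : N.Children)
    sortChildrenByRPO(*C);
}

// Preorder walk of the tree, recording the RPO number of each visited node.
inline void preorder(const Node &N, std::vector<int> &Out) {
  Out.push_back(N.RPONumber);
  for (const Node *C : N.Children)
    preorder(*C, Out);
}
```

For a root numbered 0 whose children are stored as {3, 1}, with node 1 having a child 2, the walk visits 0, 1, 2, 3 after sorting, where before it would have visited 3 ahead of 1.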
* Add a static_assert about the sizeof(GlobalValue) (Reid Kleckner, 2016-12-29, 1 file, -0/+7)
  I added one for Value back in r262045, and I'm starting to think we should have these for any class with bitfields whose memory efficiency really matters.
  llvm-svn: 290698
* Update equalsStoreHelper for the fact that only one branch can be true (Daniel Berlin, 2016-12-29, 1 file, -4/+5)
  llvm-svn: 290697
* [GlobalValue] Move HasLLVMReservedName into existing bitfield. NFC (Justin Lebar, 2016-12-29, 1 file, -9/+10)
  Summary:
  Follow-up to r290691, where I introduced HasLLVMReservedName. rnk pointed out that that patch added an extra word to GlobalValue on MSVC, because it doesn't pack bitfields with different types.
  This patch moves HasLLVMReservedName into the existing bitfield, where we appear to have plenty of bits to spare.
  Reviewers: rnk
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D28149
  llvm-svn: 290696
* [IR] Clarify that Value::getName() is not actually cheap. (Justin Lebar, 2016-12-29, 1 file, -2/+3)
  It involves a hashtable lookup when the Value has a name.
  llvm-svn: 290695
* [COFF] Use 32-bit jump table entries in .rdata for Win64 (Reid Kleckner, 2016-12-29, 5 files, -9/+53)
  Summary:
  We were already using 32-bit jump table entries, but this was a consequence of the default PIC model on Win64, and not an intentional design decision. This patch ensures that we always use 32-bit label difference jump table entries on Win64 regardless of the PIC model. This is a good idea because it saves executable size and object file size.
  Moving the jump tables to .rdata cleans up the disassembled object code and reduces the available ROP targets, but it requires adding one more RIP-relative lea to the code. COFF doesn't have relocations to express the difference between two arbitrary symbols, so we can't use the jump table label in the label difference like we do elsewhere.
  Fixes PR31488
  Reviewers: majnemer, compnerd
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D28141
  llvm-svn: 290694
* Change Metadata Index emission in the bitcode to use 2x32 bits for the placeholder (Mehdi Amini, 2016-12-28, 3 files, -4/+10)
  The Bitstream reader and writer are limited to handling a "size_t" at most, which means that we can't backpatch and read back a 64-bit value on a 32-bit platform.
  llvm-svn: 290693
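
The idea of reserving a 64-bit value as two 32-bit slots can be sketched as follows. This is a toy model rather than the Bitstream writer: `WordStream` and its members are hypothetical, and storing the low half first is an assumption made for the example, not something the commit message states.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical word-oriented output buffer (not LLVM's BitstreamWriter).
struct WordStream {
  std::vector<uint32_t> Words;

  // Reserve two 32-bit slots for a value that is not known yet and return
  // the index of the first slot.
  std::size_t emitPlaceholder64() {
    std::size_t Pos = Words.size();
    Words.push_back(0);
    Words.push_back(0);
    return Pos;
  }

  // Fill the reserved slots later. Each write touches only 32 bits, so no
  // single operation needs more than a 32-bit size_t can express.
  void backpatch64(std::size_t Pos, uint64_t Value) {
    assert(Pos + 1 < Words.size() && "placeholder out of range");
    Words[Pos] = static_cast<uint32_t>(Value & 0xFFFFFFFFu); // low half
    Words[Pos + 1] = static_cast<uint32_t>(Value >> 32);     // high half
  }

  // Reassemble the 64-bit value from its two halves.
  uint64_t read64(std::size_t Pos) const {
    assert(Pos + 1 < Words.size() && "read out of range");
    return static_cast<uint64_t>(Words[Pos]) |
           (static_cast<uint64_t>(Words[Pos + 1]) << 32);
  }
};
```

A writer would call emitPlaceholder64() where the offset belongs, keep emitting records, and call backpatch64() once the real offset is known.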
* Revert "[NewGVN] replace emplace_back with push_back" (Piotr Padlewski, 2016-12-28, 1 file, -7/+7)
  llvm-svn: 290692
* Speed up Function::isIntrinsic() by adding a bit to GlobalValue. NFC (Justin Lebar, 2016-12-28, 3 files, -6/+18)
  Summary:
  Previously isIntrinsic() called getName(). This involves a hashtable lookup, so is nontrivially expensive. And isIntrinsic() is called frequently, particularly by dyn_cast<IntrinsicInstr>.
  This patch steals a bit of IntID and uses that to store whether or not getName() starts with "llvm."
  Reviewers: bogner, arsenm, joker-eph
  Subscribers: sanjoy, llvm-commits
  Differential Revision: https://reviews.llvm.org/D22949
  llvm-svn: 290691
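
The caching idea in this commit can be illustrated with a small stand-alone class. This is only a sketch under assumptions: `NamedValue` is a hypothetical type, not llvm::GlobalValue, and it keeps the flag as a plain bool rather than a bit stolen from another field.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical class modelling the idea: compute "does the name start with
// llvm.?" once, when the name changes, instead of on every query.
class NamedValue {
  std::string Name;
  bool HasReservedName = false; // cached result of the prefix check

public:
  void setName(std::string NewName) {
    Name = std::move(NewName);
    // compare(0, 5, ...) looks at no more than the first five characters,
    // so names shorter than the prefix simply compare unequal.
    HasReservedName = Name.compare(0, 5, "llvm.") == 0;
  }

  // Cheap query: reads the cached flag, no string or table lookup.
  bool isIntrinsic() const { return HasReservedName; }
};
```

The trade-off is the usual one for cached state: every path that changes the name has to refresh the flag, which is why the update lives inside setName() here.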
* Add an index for Module Metadata record in the bitcode (Mehdi Amini, 2016-12-28, 9 files, -35/+174)
  This index records the position of each metadata record in the bitcode, so that the reader will be able to lazy-load each individual record on demand. We also make sure that every abbrev is emitted upfront so that the block can be skipped while reading.
  I don't plan to commit this before having the reader counterpart, but I figured this can be reviewed mostly independently.
  Recommit r290684 (was reverted in r290686 because a test was broken) after adding a threshold to avoid emitting the index when unnecessary (small amount of metadata). This optimization "hides" a limitation of the ability to backpatch in the bitstream: we can only backpatch safely when the position has been flushed. So if we emit an index for one metadata, it is possible that (part of) the offset placeholder hasn't been flushed and the backpatch will fail.
  Differential Revision: https://reviews.llvm.org/D28083
  llvm-svn: 290690
* Decrease kLargeMalloc block size in ASAN unit tests. (Evgeniy Stepanov, 2016-12-28, 1 file, -1/+3)
  Summary:
  Make kLargeMalloc big enough to be handled by secondary allocator and small enough to fit into quarantine for all configurations. It became too big to fit into quarantine on Android after D27873.
  Reviewers: eugenis
  Patch by Alex Shlyapnikov.
  Subscribers: danalbert, llvm-commits, kubabrecka
  Differential Revision: https://reviews.llvm.org/D28142
  llvm-svn: 290689
* Fix the variable view in the "gui" curses mode so that variables whose children change will update correctly. (Greg Clayton, 2016-12-28, 4 files, -22/+182)
  Previously the variable view would update the children once and not change. If you were stepping through code where the dynamic type of a variable would change the value and its children, or a synthetic type (like say for a std::vector<int>), the variable view wouldn't update. Now it caches the children and uses the process stop ID to tell when the children need to be updated.
  llvm-svn: 290688
* Quiet a warning where we weren't checking if this was the same as rhs. (Greg Clayton, 2016-12-28, 1 file, -1/+2)
  llvm-svn: 290687
* Revert "Add an index for Module Metadata record in the bitcode" (Saleem Abdulrasool, 2016-12-28, 9 files, -133/+8)
  This reverts commit a0ca6ae2d38339e4ede0dfa588086fc23d87e836.
  Revert at Mehdi's request as it is breaking bots.
  llvm-svn: 290686
* [NewGVN] replace emplace_back with push_back (Piotr Padlewski, 2016-12-28, 1 file, -7/+7)
  emplace_back is not faster if it is equivalent to push_back. In these cases the emplaced value had the same type as the one stored in the container. It is ugly and it might be even slower (see Scott Meyers' presentation about emplacement).
  llvm-svn: 290685
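
The distinction this message relies on can be shown with std::string. This is a generic illustration with made-up function names and is not taken from the NewGVN change: when the argument already has the element type, both calls end up invoking the same move constructor, and emplace_back only helps when it can build the element from constructor arguments.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// The argument already is a std::string, so push_back moves it in.
inline std::vector<std::string> viaPushBack(std::string S) {
  std::vector<std::string> V;
  V.push_back(std::move(S));
  return V;
}

// Same situation with emplace_back: it forwards to the same move
// constructor, so nothing is gained over push_back.
inline std::vector<std::string> viaEmplaceBack(std::string S) {
  std::vector<std::string> V;
  V.emplace_back(std::move(S));
  return V;
}

// Where emplace_back does help: constructing the element in place from
// constructor arguments, here std::string(3, 'x'), with no temporary.
inline std::vector<std::string> constructedInPlace() {
  std::vector<std::string> V;
  V.emplace_back(3, 'x');
  return V;
}

// Small self-check tying the three cases together.
inline void selfCheck() {
  assert(viaPushBack("abc") == viaEmplaceBack("abc"));
  assert(constructedInPlace().front() == "xxx");
}
```

Whether one form is measurably slower than the other depends on the compiler and the type, so the sketch makes no performance claim of its own.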
* Add an index for Module Metadata record in the bitcode (Mehdi Amini, 2016-12-28, 9 files, -8/+133)
  Summary:
  This index records the position of each metadata record in the bitcode, so that the reader will be able to lazy-load each individual record on demand. We also make sure that every abbrev is emitted upfront so that the block can be skipped while reading.
  I don't plan to commit this before having the reader counterpart, but I figured this can be reviewed mostly independently.
  Reviewers: pcc, tejohnson
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D28083
  llvm-svn: 290684