path: root/llvm
Commit message (Author, Age, Files, Lines)
* Reapply "Make BitCodeAbbrev ownership explicit using shared_ptr rather than IntrusiveRefCntPtr" (David Blaikie, 2017-01-04, 7 files, -127/+119)
| If this is a problem for anyone (shared_ptr is two pointers in size, whereas IntrusiveRefCntPtr is 1 - and the ref count control block that make_shared adds is probably larger than the one int in RefCountedBase) I'd prefer to address this by adding a lower-overhead version of shared_ptr (possibly refactoring IntrusiveRefCntPtr into such a thing) to avoid the intrusiveness - this allows memory ownership to remain orthogonal to types and at least to me, seems to make code easier to understand (since no implicit ownership acquisition can happen).
|
| This recommits 291006, reverted in r291007.
| llvm-svn: 291016
* [Legalizer] Fix fp-to-uint to fp-to-sint promotion assertion. (Tim Shen, 2017-01-04, 3 files, -2/+29)
| Summary:
| When promoting fp-to-uint16 to fp-to-sint32, the result is actually zero extended. For example, given double 65534.0:
|   without legalization: fp-to-uint16: 65534.0 -> 0xfffe
|   with the legalization: fp-to-sint32: 65534.0 -> 0x0000fffe
|
| Without this patch, legalization wrongly emits a signed extend assertion, which is consumed by a later icmp instruction and causes a miscompile.
|
| Note that the floating point value must be in [0, 65535), otherwise the behavior is undefined.
|
| This patch reverts r279223 behavior and adds more tests and documentation. In PR29041's context, James Molloy mentioned that:
|   We don't need to mask because conversion from float->uint8_t is undefined if the integer part of the float value is not representable in uint8_t. Therefore we can assume this doesn't happen!
| which is totally true and good, because fptoui is documented clearly to have undefined behavior when overflow/underflow happens. We should take advantage of this behavior so that we can save unnecessary mask instructions.
|
| Reviewers: jmolloy, nadav, echristo, kbarton
| Subscribers: mehdi_amini, nemanjai, llvm-commits
| Differential Revision: https://reviews.llvm.org/D28284
| llvm-svn: 291015
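The zero-extension point in the commit above can be checked with a small standalone sketch. This is not the legalizer code; the helper names (`promotedFpToUint16`, `isZeroExtended16`, `isSignExtended16`) are hypothetical, and the precondition simply mirrors the range quoted in the commit message.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the promoted conversion (fp-to-sint32) applied to a
// value that was originally converted with fp-to-uint16.
std::int32_t promotedFpToUint16(double value) {
  // Range taken from the commit message; outside it the result is undefined.
  assert(value >= 0.0 && value < 65535.0);
  return static_cast<std::int32_t>(value);
}

// True when the upper 16 bits are all zero, i.e. the 32-bit result is a
// zero extension of its low 16 bits.
bool isZeroExtended16(std::int32_t wide) {
  return (static_cast<std::uint32_t>(wide) >> 16) == 0;
}

// True when the 32-bit value equals the sign extension of its low 16 bits.
bool isSignExtended16(std::int32_t wide) {
  std::int32_t low = wide & 0xffff;
  return wide == ((low ^ 0x8000) - 0x8000);
}
```

For 65534.0 the promoted result is 0x0000fffe, which is zero extended but not sign extended from 16 bits, so a sign-extension assertion on that value would be the wrong one to record.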
* Fix some buildbot issues with const objects with default ctors (David Blaikie, 2017-01-04, 1 file, -2/+2)
| llvm-svn: 291013
* The patch fixes (base, index, offset) match. (Evgeny Stupachenko, 2017-01-04, 2 files, -9/+43)
| Summary:
| Instead of matching:
|   (a + i) + 1 -> (a + i, undef, 1)
| Now it matches:
|   (a + i) + 1 -> (a, i, 1)
|
| Reviewers: rengolin
| Differential Revision: http://reviews.llvm.org/D26367
| From: Evgeny Stupachenko <evstupac@gmail.com>
| llvm-svn: 291012
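The difference between the two match results can be illustrated with a toy expression tree. This is only a sketch under stated assumptions: `Expr`, `AddressMatch`, and `matchAddress` are hypothetical names, not the types or functions touched by the patch, and an empty `index` string stands in for `undef`.

```cpp
#include <cstdint>
#include <memory>
#include <string>
#include <utility>

// Hypothetical minimal expression tree (not LLVM's data structures).
struct Expr;
using ExprPtr = std::shared_ptr<const Expr>;

struct Expr {
  enum Kind { Var, Const, Add };
  Kind kind;
  std::string name;     // meaningful for Var
  std::int64_t value;   // meaningful for Const
  ExprPtr lhs, rhs;     // meaningful for Add
};

ExprPtr var(std::string name) {
  return std::make_shared<Expr>(Expr{Expr::Var, std::move(name), 0, nullptr, nullptr});
}
ExprPtr constant(std::int64_t value) {
  return std::make_shared<Expr>(Expr{Expr::Const, "", value, nullptr, nullptr});
}
ExprPtr add(ExprPtr lhs, ExprPtr rhs) {
  return std::make_shared<Expr>(Expr{Expr::Add, "", 0, std::move(lhs), std::move(rhs)});
}

struct AddressMatch {
  std::string base;
  std::string index;        // empty plays the role of "undef"
  std::int64_t offset = 0;
};

// Matches (base + index) + offset, splitting the inner sum into base and
// index instead of treating the whole sum as the base.
AddressMatch matchAddress(const ExprPtr &expr) {
  AddressMatch match;
  ExprPtr current = expr;
  // Peel a trailing constant: (x + C) contributes offset C.
  if (current->kind == Expr::Add && current->rhs->kind == Expr::Const) {
    match.offset = current->rhs->value;
    current = current->lhs;
  }
  if (current->kind == Expr::Add && current->lhs->kind == Expr::Var &&
      current->rhs->kind == Expr::Var) {
    match.base = current->lhs->name;
    match.index = current->rhs->name;
  } else if (current->kind == Expr::Var) {
    match.base = current->name;
  }
  return match;
}
```

With this sketch, `matchAddress(add(add(var("a"), var("i")), constant(1)))` yields base `a`, index `i`, offset `1`, which corresponds to the `(a, i, 1)` form in the commit message.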
* [AArch64] Update the feature set for Qualcomm's Falkor CPU. (Chad Rosier, 2017-01-04, 2 files, -1/+12)
| llvm-svn: 291010
* Add positive test for sqrt "partial inlining". NFC. (Michael Kuperstein, 2017-01-04, 2 files, -0/+23)
| llvm-svn: 291009
* [AArch64] Fix over-eager early-exit in load-store combiner (Nirav Dave, 2017-01-04, 2 files, -0/+15)
| Fix early-exit analysis for memory operation pairing when operations are not emitted in ascending order.
|
| Reviewers: mcrosier, t.p.northover
| Subscribers: aemerson, rengolin, llvm-commits
| Differential Revision: https://reviews.llvm.org/D28251
| llvm-svn: 291008
* Revert "Make BitCodeAbbrev ownership explicit using shared_ptr rather than IntrusiveRefCntPtr" (David Blaikie, 2017-01-04, 7 files, -119/+127)
| Breaks Clang's use of bitcode. Reverting until I have a fix to go with it there.
|
| This reverts commit r291006.
| llvm-svn: 291007
* Make BitCodeAbbrev ownership explicit using shared_ptr rather than IntrusiveRefCntPtr (David Blaikie, 2017-01-04, 7 files, -127/+119)
| If this is a problem for anyone (shared_ptr is two pointers in size, whereas IntrusiveRefCntPtr is 1 - and the ref count control block that make_shared adds is probably larger than the one int in RefCountedBase) I'd prefer to address this by adding a lower-overhead version of shared_ptr (possibly refactoring IntrusiveRefCntPtr into such a thing) to avoid the intrusiveness - this allows memory ownership to remain orthogonal to types and at least to me, seems to make code easier to understand (since no implicit ownership acquisition can happen).
| llvm-svn: 291006
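The size trade-off mentioned above can be observed with a short sketch. `IntrusiveCounted` and `IntrusiveHandle` are hypothetical stand-ins for an intrusively counted object and its one-pointer handle, not LLVM's `RefCountedBase` or `IntrusiveRefCntPtr`; the exact size of `std::shared_ptr` is implementation-defined, although common standard libraries use two pointers.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical intrusively counted object: the count lives in the object.
struct IntrusiveCounted {
  int refCount = 0;
};

// Hypothetical handle to such an object: a single raw pointer.
struct IntrusiveHandle {
  IntrusiveCounted *object = nullptr;
};

// Size of the one-pointer intrusive handle.
constexpr std::size_t intrusiveHandleSize() { return sizeof(IntrusiveHandle); }

// Size of a shared_ptr handle (object pointer plus control-block pointer on
// common implementations; the standard does not fix this).
constexpr std::size_t sharedHandleSize() {
  return sizeof(std::shared_ptr<IntrusiveCounted>);
}
```

On typical 64-bit toolchains this reports 8 bytes for the intrusive handle and 16 bytes for `std::shared_ptr`, which is the per-handle overhead the commit message is weighing against clearer ownership.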
* Remove unnecessary intrusive ref counting in favor of std::shared_ptr/make_shared (David Blaikie, 2017-01-04, 1 file, -11/+7)
| The intrusive nature of the reference counting is not required/used here, so simplify the ownership model to make the code easier to understand.
| llvm-svn: 291005
* Remove accidentally target-dependent test and pacify bots. (Michael Kuperstein, 2017-01-04, 1 file, -23/+0)
| llvm-svn: 291004
* [PowerPC] Fix logic dealing with nop after calls (and tail-call eligibility) (Hal Finkel, 2017-01-04, 3 files, -44/+172)
| This change aims to unify and correct our logic for when we need to allow for the possibility of the linker adding a TOC restoration instruction after a call. This comes up in two contexts:
|
|  1. When determining tail-call eligibility. If we make a tail call (i.e. directly branch to a function) then there is no place for the linker to add a TOC restoration.
|  2. When determining when we need to add a nop instruction after a call. Likewise, if there is a possibility that the linker might need to add a TOC restoration after a call, then we need to put a nop after the call (the bl instruction).
|
| First problem: We were using similar, but different, logic to decide (1) and (2). This is just wrong. Both the resideInSameModule function (used when determining tail-call eligibility) and the isLocalCall function (used when deciding if the post-call nop is needed) were supposed to be determining the same underlying fact (i.e. might a TOC restoration be needed after the call). The same logic should be used in both places.
|
| Second problem: The logic in both places was wrong. We only know that two functions will share the same TOC when both functions come from the same section of the same object. Otherwise the linker might cause the functions to use different TOC base addresses (unless the multi-TOC linker option is disabled, in which case only shared-library boundaries are relevant). There are a number of factors that can cause functions to be placed in different sections or come from different objects (-ffunction-sections, explicitly-specified section names, COMDAT, weak linkage, etc.). All of these need to be checked. The existing logic only checked properties of the callee, but the properties of the caller must also be checked (for example, calling from a function in a COMDAT section means calling between sections).
|
| There was a conceptual error in the resideInSameModule function in that it allowed tail calls to functions with weak linkage and protected/hidden visibility. While protected/hidden visibility does prevent the function implementation from being replaced at runtime (via interposition), it does not prevent the linker from using an alternate implementation at link time (i.e. using some strong definition to replace the provided weak one during linking). If this happens, then we're still potentially looking at a required TOC restoration upon return.
|
| Otherwise, in general, the post-call nop is needed wherever ELF interposition needs to be supported. We don't currently support ELF interposition at the IR level (see http://lists.llvm.org/pipermail/llvm-dev/2016-November/107625.html for more information), and I don't think we should try to make it appear to work in the backend in spite of that fact. Unfortunately, because of the way that the ABI works, we need to generate code as if we supported interposition whenever the linker might insert stubs for the purpose of supporting it.
|
| Differential Revision: https://reviews.llvm.org/D27231
| llvm-svn: 291003
* NewGVN: Track the maximum number of iterations GVN takes on any function, so we can pinpoint performance issues. (Daniel Berlin, 2017-01-04, 1 file, -1/+4)
| llvm-svn: 291002
* Add positive test for sqrt "partial inlining". NFC. (Michael Kuperstein, 2017-01-04, 1 file, -0/+23)
| llvm-svn: 291001
* [lib/LTO] Simplify logic removing set but unused variable. NFCI. (Davide Italiano, 2017-01-04, 1 file, -9/+3)
| Reported by David Binderman and ack'ed by Teresa on IRC.
| PR: 31527
| llvm-svn: 291000
* YAML: Remove Input::MapHNode::isValidKey(), use llvm::is_contained() instead. NFC. (Peter Collingbourne, 2017-01-04, 2 files, -11/+1)
| llvm-svn: 290999
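For readers unfamiliar with the helper named in this commit, the idea behind a range-based containment check can be sketched in standalone C++. `isContained` below is a hypothetical stand-in written for illustration; it is not the LLVM implementation, only the usual `std::find`-based pattern that such a helper wraps.

```cpp
#include <algorithm>
#include <iterator>

// Returns true if `value` compares equal to some element of `range`.
// Works with any type usable with std::begin/std::end, including arrays.
template <typename Range, typename T>
bool isContained(const Range &range, const T &value) {
  return std::find(std::begin(range), std::end(range), value) !=
         std::end(range);
}
```

A hand-written loop such as a dedicated `isValidKey()` member can then be replaced by a single call, for example `isContained(validKeys, key)`, where `validKeys` and `key` are placeholder names.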
* Remove dead and unused variable NumSentinelElements. (Eric Christopher, 2017-01-04, 1 file, -2/+2)
| Fixes PR31529.
| llvm-svn: 290998
* Remove dead variable Len. (Eric Christopher, 2017-01-04, 1 file, -4/+1)
| Fixes PR31528
| llvm-svn: 290995
* Add missing CHECK: line to test case added in 29097 (Tobias Grosser, 2017-01-04, 1 file, -0/+1)
| Without this CHECK line, we may fail to notice additional regions that are incorrectly detected at the end of the region tree.
| llvm-svn: 290994
* ADT: IntrusiveRefCntPtr: Broaden the definition of correct usage of RefCountedBase (David Blaikie, 2017-01-04, 1 file, -5/+2)
| This roughly matches the semantics of std::enable_shared_from_this - that it does not dictate the ownership model of all users, but constrains those users taking advantage of the intrusive nature to do so only when there's a guarantee that that's the ownership model being used for the object being passed.
|
| Reviewers: jlebar
| Differential Revision: https://reviews.llvm.org/D28245
| llvm-svn: 290987
* fix comment formatting; NFC (Sanjay Patel, 2017-01-04, 1 file, -10/+8)
| llvm-svn: 290980
* AMDGPU/SI: Implement sendmsghalt intrinsic (Jan Vesely, 2017-01-04, 11 files, -45/+230)
| v2: expose using amdgcn prefix
|
| Differential Revision: https://reviews.llvm.org/D23511
| llvm-svn: 290977
* RegionInfo: add new test case (Tobias Grosser, 2017-01-04, 1 file, -0/+42)
| This test case has been reduced from test/Analysis/RegionInfo/mix_1.ll and provides us with a minimal example of a test case which caused problems while working on an improved version of the RegionInfo analysis. We upstream this test case, as it certainly can be helpful in future debugging and optimization tests.
|
| Test case reduced by Pratik Bhatu <cs12b1010@iith.ac.in>
| llvm-svn: 290974
* Reapply "[SimplifyCFG] In sinkLastInstruction correctly set debugloc of common inst" (Robert Lougher, 2017-01-04, 2 files, -1/+79)
| This reapplies r289828 (reverted in r289833 as it broke the address sanitizer). The debugloc is now only set when the instruction is not a call, as this causes the verifier to assert (the inliner requires an inlinable callsite to have a debug loc if the caller and callee have debug info).
|
| Original commit message:
| Simplify CFG will try to sink the last instruction in a series of basic blocks, creating a "common" instruction in the successor block (sinkLastInstruction). When it does this, the debug location of the single instruction should be the merged debug locations of the commoned instructions.
|
| Original review: https://reviews.llvm.org/D27590
| llvm-svn: 290973
* Revert r290970 [SLPVectorizer] Regenerate test. (Simon Pilgrim, 2017-01-04, 1 file, -1/+1)
| The check script will use var names before they are declared, which filecheck doesn't like.
| llvm-svn: 290971
* [SLPVectorizer] Regenerate test. (Simon Pilgrim, 2017-01-04, 1 file, -1/+1)
| Missed var name
| llvm-svn: 290970
* Regenerate test. (Simon Pilgrim, 2017-01-04, 1 file, -5/+10)
| llvm-svn: 290969
* Fix x86 gold tests on non-x86 targets. (Asiri Rathnayake, 2017-01-04, 13 files, -0/+31)
| These tests are missing a target triple and the -m elf_x86_64 gold option, which makes them fail on non-x86 targets.
|
| Differential revision: https://reviews.llvm.org/D28285
| Reviewers: tejohnson
| llvm-svn: 290965
* [ThinLTO] Rework llvm-link to use the FunctionImporter (Teresa Johnson, 2017-01-04, 4 files, -38/+20)
| Summary:
| Change llvm-link to use the FunctionImporter handling, instead of manually invoking the Linker. We still need to load the module in llvm-link to do the desired testing for invalid import requests (weak functions), and to get the GUID (in case the function is local).
|
| Also change the drop-debug-info test to use llvm-link so that importing is forced (in order to test debug info handling) and independent of import logic changes.
|
| Reviewers: mehdi_amini
| Subscribers: mgorny, llvm-commits, aprantl
| Differential Revision: https://reviews.llvm.org/D28277
| llvm-svn: 290964
* [SPARC] Fix test so that it checks the correct label. (Davide Italiano, 2017-01-04, 1 file, -3/+3)
| Before it wasn't checking anything.
| llvm-svn: 290963
* [CostModel][X86] Updated vXi8 and vXi16 Reverse/Alternate shuffle costs (Simon Pilgrim, 2017-01-04, 2 files, -33/+31)
| Actual codegen is much better than the extract+insert patterns that were assumed.
| llvm-svn: 290962
* [PowerPC] Add identification for POWER8NVL (Nemanja Ivanovic, 2017-01-04, 1 file, -0/+1)
| This CPU type was not previously recognized by LLVM which led to emitting poor (and sometimes incorrect) code in some JIT workloads on such a machine.
| llvm-svn: 290961
* [MC/COFF] Fix a test to actually check the relocation. (Davide Italiano, 2017-01-04, 1 file, -1/+1)
| Inspired by r290953 + grep -R 'CHCEK'.
| llvm-svn: 290958
* [X86] Merged Reverse/Alternate shuffle cost tables. NFCI. (Simon Pilgrim, 2017-01-04, 1 file, -141/+81)
| As discussed on D27811, merged the shuffle cost LUTs and use the shuffle kind to perform the lookup instead of the ISD opcode.
| llvm-svn: 290956
* [framelowering] Skip dbg values when getting next/previous instruction. (Florian Hahn, 2017-01-04, 4 files, -16/+158)
| Summary:
| In mergeSPUpdates, debug values need to be ignored when getting the previous element, otherwise debug data could have an impact on codegen.
|
| In eliminateCallFramePseudoInstr, debug values after the erased element could have an impact on codegen and should be skipped.
|
| Closes PR31319 (https://llvm.org/bugs/show_bug.cgi?id=31319)
|
| Reviewers: aprantl, MatzeB, mkuper
| Subscribers: gbedwell, llvm-commits
| Differential Revision: https://reviews.llvm.org/D27688
| llvm-svn: 290955
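The general pattern described above, stepping to the previous instruction while ignoring debug values, might be sketched as follows. `Instr` and `prevNonDebug` are hypothetical names used for illustration only; the real code operates on machine basic block iterators rather than a vector.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical instruction record: only the debug-value flag matters here.
struct Instr {
  int opcode;
  bool isDebugValue;
};

// Returns the index of the nearest non-debug instruction strictly before
// `position`, or -1 if there is none. Debug values are skipped so that their
// presence cannot change which instruction is considered "previous".
std::ptrdiff_t prevNonDebug(const std::vector<Instr> &block,
                            std::size_t position) {
  for (std::size_t i = position; i > 0; --i) {
    if (!block[i - 1].isDebugValue)
      return static_cast<std::ptrdiff_t>(i - 1);
  }
  return -1;
}
```

With a block of `{real, debug, real}`, the instruction before index 2 is index 0 rather than the debug value at index 1, so the answer is the same whether or not debug info is present.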
* [ADT] Speculative attempt to fix build bot issues with r290952. (Chandler Carruth, 2017-01-04, 2 files, -1/+13)
| This just removes the usage of llvm::reverse and llvm::seq. That makes it harder to handle the empty case correctly and so I've also added a test there.
|
| This is just a shot in the dark at what might be behind the buildbot failures. I can't reproduce any issues locally including with ASan... I feel like I'm missing something...
| llvm-svn: 290954
* [Inliner] Fix a test where I typo'ed 'CHECK' as 'CHCEK' when converting to FileCheck. (Chandler Carruth, 2017-01-04, 1 file, -1/+1)
| Fortunately, it passes. =]
|
| Spotted in review by Bob Wilson!
| llvm-svn: 290953
* [ADT] Enhance the PriorityWorklist to support bulk insertion. (Chandler Carruth, 2017-01-04, 2 files, -0/+74)
| This is both convenient and more efficient as we can skip any intermediate reallocation of the vector.
|
| This usage pattern came up in a subsequent patch on the pass manager, but it seems generically useful so I factored it out and added unittests here.
| llvm-svn: 290952
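To make the reallocation point concrete, here is a rough sketch of a priority worklist with a bulk insert. It is not the LLVM `PriorityWorklist` implementation; `WorklistSketch`, `insertAll`, and `pop` are hypothetical names, and the only property carried over from the commit message is that a bulk insert can reserve storage once instead of growing the vector per element.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Sketch of a worklist where re-inserting an item moves it to the back
// (the highest-priority position) and items are popped from the back.
class WorklistSketch {
public:
  void insert(int item) {
    auto found = index_.find(item);
    if (found != index_.end())
      live_[found->second] = false;  // vacate the older slot
    index_[item] = items_.size();
    items_.push_back(item);
    live_.push_back(true);
  }

  // Bulk insertion: reserve once up front so the vectors do not reallocate
  // repeatedly while the range is appended.
  template <typename Range> void insertAll(const Range &range) {
    items_.reserve(items_.size() + range.size());
    live_.reserve(live_.size() + range.size());
    for (int item : range)
      insert(item);
  }

  bool empty() const { return index_.empty(); }

  int pop() {
    assert(!empty());
    while (!live_.back()) {  // drop vacated slots left by re-insertions
      items_.pop_back();
      live_.pop_back();
    }
    int item = items_.back();
    items_.pop_back();
    live_.pop_back();
    index_.erase(item);
    return item;
  }

private:
  std::vector<int> items_;
  std::vector<bool> live_;
  std::unordered_map<int, std::size_t> index_;
};
```

Inserting `1` and then bulk-inserting `{2, 1, 3}` pops in the order `3, 1, 2`: the second insertion of `1` supersedes the first.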
* Fix for InlineSpiller accessing not updated dom tree base information. (Bjorn Pettersson, 2017-01-04, 2 files, -5/+7)
| Summary:
| The InlineSpiller was accessing the DominatorTreeBase directly through the public data member DT in the MachineDominatorTree. This is not a good idea as the "cached" information in SplitCriticalEdges is not applied before the access. The DominatorTreeBase must be accessed through the member function getBase() in MachineDominatorTree.
|
| The fault was introduced in r266162.
|
| I think the public data member DT in the MachineDominatorTree should have been made private in the original code (r215576) that introduced the concept of lazily updating the MachineDominatorTree information from MachineBasicBlock::SplitCriticalEdge().
|
| Patch by Karl-Johan Karlsson <karl-johan.karlsson@ericsson.com>
|
| Reviewers: wmi, qcolombet
| Subscribers: llvm-commits, bjope, uabelho
| Differential Revision: https://reviews.llvm.org/D27983
| llvm-svn: 290950
* [LLC][MIPS] Fix crash after enabling LLVM_ENABLE_EXPENSIVE_CHECKS (Nitesh Jain, 2017-01-04, 2 files, -0/+8)
| Reviewers: sdardis, vkalintiris
| Subscribers: jaydeep, slthakur, RKSimon, llvm-commits
| Differential Revision: https://reviews.llvm.org/D27841
| llvm-svn: 290949
* [X86][AVX512] Passing the appropriate memory operand class to INT_{U}COMIS{S|D} instructions (Ayman Musa, 2017-01-04, 2 files, -26/+43)
| Replacing the memory operand in the intrinsic versions of the comis/ucomis instructions from f128mem to ssmem/sdmem accordingly.
|
| Differential Revision: https://reviews.llvm.org/D28138
| llvm-svn: 290948
* [X86] Attempt to pre-truncate arithmetic operations if useful (Simon Pilgrim, 2017-01-04, 4 files, -1004/+504)
| In some cases it's more efficient to combine TRUNC( BINOP( X, Y ) ) --> BINOP( TRUNC( X ), TRUNC( Y ) ) if the binop is legal for the truncated types.
|
| This is true for vector integer multiplication (especially vXi64), as well as ADD/AND/XOR/OR in cases where we only need to truncate one of the inputs at runtime (e.g. a duplicated input or a one-use constant we can fold).
|
| Further work could be done here - scalar cases (especially i64) could often benefit (if we avoid partial registers etc.), other opcodes, and better analysis of when truncating the inputs reduces costs.
|
| I have considered implementing this for all targets within the DAGCombiner but wasn't sure we could devise a suitable cost model system that would give us the range we need.
|
| Differential Revision: https://reviews.llvm.org/D28219
| llvm-svn: 290947
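The rewrite above relies on truncation commuting with wrap-around integer arithmetic. A scalar sketch of that identity, using hypothetical helper names and 64-to-32-bit truncation as the example (the patch itself targets vector types), assuming a platform where `unsigned int` is 32 bits:

```cpp
#include <cstdint>

// TRUNC(MUL(X, Y)): multiply at 64 bits, then keep the low 32 bits.
std::uint32_t mulThenTrunc(std::uint64_t x, std::uint64_t y) {
  return static_cast<std::uint32_t>(x * y);
}

// MUL(TRUNC(X), TRUNC(Y)): truncate both inputs, then multiply at 32 bits.
std::uint32_t truncThenMul(std::uint64_t x, std::uint64_t y) {
  return static_cast<std::uint32_t>(x) * static_cast<std::uint32_t>(y);
}

// The same pair for ADD.
std::uint32_t addThenTrunc(std::uint64_t x, std::uint64_t y) {
  return static_cast<std::uint32_t>(x + y);
}

std::uint32_t truncThenAdd(std::uint64_t x, std::uint64_t y) {
  return static_cast<std::uint32_t>(x) + static_cast<std::uint32_t>(y);
}
```

Because unsigned arithmetic is modular, the low 32 bits of the wide result depend only on the low 32 bits of the inputs, so both orders agree. Whether the narrow form is actually cheaper is the cost question the commit message discusses.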
* [AVX-512] Add support for detecting 512-bit shuffles that contain a 128-bit subvector insertion from the lowest subvector of one of the sources. (Craig Topper, 2017-01-04, 5 files, -65/+71)
| These are best handled with a vinsert32x4 or vinsert64x2 instruction.
| llvm-svn: 290946
* [AVX-512] Add more test cases for shuffles that should be handled with subvector insert instructions. (Craig Topper, 2017-01-04, 2 files, -0/+266)
| llvm-svn: 290945
* [AVX-512] Fix a typo in a couple case names to match their behavior. (Craig Topper, 2017-01-04, 1 file, -4/+4)
| llvm-svn: 290944
* [AVX-512] Add avx512dq to the vector-shuffle-512-v16.ll test command lines in preparation for a future change that needs these features. (Craig Topper, 2017-01-04, 1 file, -6/+6)
| llvm-svn: 290943
* [AVX-512] Simplify code for creating 512-bit SHUF128 operations. (Craig Topper, 2017-01-04, 1 file, -18/+11)
| We don't need two loops and we can safely assume and hardcode the size of the widened mask.
| llvm-svn: 290942
* Support: Add YAML I/O support for custom mappings. (Peter Collingbourne, 2017-01-04, 3 files, -3/+175)
| This will be used to YAMLify parts of the module summary.
|
| Differential Revision: https://reviews.llvm.org/D28014
| llvm-svn: 290935
* On a 64-bit system, the DWARFDebugLine::Row struct is 32 bytes. Each field has the following byte offsets: (Eric Christopher, 2017-01-04, 1 file, -3/+3)
|    0-7: Address
|   8-11: Line
|  12-13: Column
|  14-15: File
|  16-19: Isa
|  20-23: Discriminator
|    24+: bit fields
|
| The packing is fine until the "Isa" field, which is an 8-bit int that occupies 4 bytes. We can instead move Discriminator into the 16-19 slot, and pack Isa into the 20-23 range along with the bit fields:
|    0-7: Address
|   8-11: Line
|  12-13: Column
|  14-15: File
|  16-19: Discriminator
|  20-23: Isa + bit fields
|
| This layout is only 24 bytes. This 25% reduction in size may seem small but a large binary can have line tables with thousands of rows stored in a vector.
|
| Patch by Simon Que!
|
| Differential Revision: https://reviews.llvm.org/D27961
| llvm-svn: 290931
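The two layouts can be reproduced with plain structs. This is a sketch, not the actual `DWARFDebugLine::Row` definition: the field widths follow the offsets listed in the commit message, while the individual bit-field names are assumptions, and the exact sizes depend on the target ABI.

```cpp
#include <cstddef>
#include <cstdint>

// Layout before the change: the 1-byte Isa is followed by 3 bytes of padding
// so that the 4-byte Discriminator stays aligned.
struct RowBefore {
  std::uint64_t Address;        // bytes 0-7
  std::uint32_t Line;           // bytes 8-11
  std::uint16_t Column;         // bytes 12-13
  std::uint16_t File;           // bytes 14-15
  std::uint8_t Isa;             // byte 16, then padding to 20
  std::uint32_t Discriminator;  // bytes 20-23
  std::uint8_t IsStmt : 1;      // bit fields start at byte 24
  std::uint8_t BasicBlock : 1;
  std::uint8_t EndSequence : 1;
  std::uint8_t PrologueEnd : 1;
  std::uint8_t EpilogueBegin : 1;
};

// Layout after the change: Discriminator moves up, and Isa shares the last
// 4-byte slot with the bit fields.
struct RowAfter {
  std::uint64_t Address;        // bytes 0-7
  std::uint32_t Line;           // bytes 8-11
  std::uint16_t Column;         // bytes 12-13
  std::uint16_t File;           // bytes 14-15
  std::uint32_t Discriminator;  // bytes 16-19
  std::uint8_t Isa;             // byte 20
  std::uint8_t IsStmt : 1;      // bit fields packed after Isa
  std::uint8_t BasicBlock : 1;
  std::uint8_t EndSequence : 1;
  std::uint8_t PrologueEnd : 1;
  std::uint8_t EpilogueBegin : 1;
};

constexpr std::size_t rowBeforeSize() { return sizeof(RowBefore); }
constexpr std::size_t rowAfterSize() { return sizeof(RowAfter); }
```

On a typical 64-bit ABI this gives 32 bytes before and 24 bytes after, matching the figures in the commit message; reordering members so that small fields share one aligned slot is what removes the padding.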
* [InstCombine] Add a test for r290733 (David Majnemer, 2017-01-04, 1 file, -0/+71)
| llvm-svn: 290929