path: root/llvm/lib
Commit message | Author | Age | Files | Lines
...
* Revert "Make BitCodeAbbrev ownership explicit using shared_ptr rather than IntrusiveRefCntPtr" | David Blaikie | 2017-01-04 | 2 | -94/+94
  Breaks Clang's use of bitcode. Reverting until I have a fix to go with it there.
  This reverts commit r291006.
  llvm-svn: 291007
* Make BitCodeAbbrev ownership explicit using shared_ptr rather than IntrusiveRefCntPtr | David Blaikie | 2017-01-04 | 2 | -94/+94
  If this is a problem for anyone (shared_ptr is two pointers in size, whereas IntrusiveRefCntPtr is one, and the ref count control block that make_shared adds is probably larger than the one int in RefCountedBase), I'd prefer to address this by adding a lower-overhead version of shared_ptr (possibly refactoring IntrusiveRefCntPtr into such a thing) to avoid the intrusiveness. This allows memory ownership to remain orthogonal to types and, at least to me, seems to make code easier to understand (since no implicit ownership acquisition can happen).
  llvm-svn: 291006
* [PowerPC] Fix logic dealing with nop after calls (and tail-call eligibility) | Hal Finkel | 2017-01-04 | 1 | -40/+39
  This change aims to unify and correct our logic for when we need to allow for the possibility of the linker adding a TOC restoration instruction after a call. This comes up in two contexts:

  1. When determining tail-call eligibility. If we make a tail call (i.e. directly branch to a function) then there is no place for the linker to add a TOC restoration.
  2. When determining when we need to add a nop instruction after a call. Likewise, if there is a possibility that the linker might need to add a TOC restoration after a call, then we need to put a nop after the call (the bl instruction).

  First problem: We were using similar, but different, logic to decide (1) and (2). This is just wrong. Both the resideInSameModule function (used when determining tail-call eligibility) and the isLocalCall function (used when deciding if the post-call nop is needed) were supposed to be determining the same underlying fact (i.e. might a TOC restoration be needed after the call). The same logic should be used in both places.

  Second problem: The logic in both places was wrong. We only know that two functions will share the same TOC when both functions come from the same section of the same object. Otherwise the linker might cause the functions to use different TOC base addresses (unless the multi-TOC linker option is disabled, in which case only shared-library boundaries are relevant). There are a number of factors that can cause functions to be placed in different sections or come from different objects (-ffunction-sections, explicitly-specified section names, COMDAT, weak linkage, etc.). All of these need to be checked. The existing logic only checked properties of the callee, but the properties of the caller must also be checked (for example, calling from a function in a COMDAT section means calling between sections).

  There was a conceptual error in the resideInSameModule function in that it allowed tail calls to functions with weak linkage and protected/hidden visibility. While protected/hidden visibility does prevent the function implementation from being replaced at runtime (via interposition), it does not prevent the linker from using an alternate implementation at link time (i.e. using some strong definition to replace the provided weak one during linking). If this happens, then we're still potentially looking at a required TOC restoration upon return.

  Otherwise, in general, the post-call nop is needed wherever ELF interposition needs to be supported. We don't currently support ELF interposition at the IR level (see http://lists.llvm.org/pipermail/llvm-dev/2016-November/107625.html for more information), and I don't think we should try to make it appear to work in the backend in spite of that fact. Unfortunately, because of the way that the ABI works, we need to generate code as if we supported interposition whenever the linker might insert stubs for the purpose of supporting it.

  Differential Revision: https://reviews.llvm.org/D27231
  llvm-svn: 291003
* NewGVN: Track the maximum number of iterations GVN takes on any function, so we can pinpoint performance issues. | Daniel Berlin | 2017-01-04 | 1 | -1/+4
  llvm-svn: 291002
* [lib/LTO] Simplify logic removing set but unused variable. NFCI. | Davide Italiano | 2017-01-04 | 1 | -9/+3
  Reported by David Binderman and ack'ed by Teresa on IRC.
  PR: 31527
  llvm-svn: 291000
* YAML: Remove Input::MapHNode::isValidKey(), use llvm::is_contained() instead. NFC. | Peter Collingbourne | 2017-01-04 | 1 | -9/+1
  llvm-svn: 290999
* Remove dead and unused variable NumSentinelElements. | Eric Christopher | 2017-01-04 | 1 | -2/+2
  Fixes PR31529.
  llvm-svn: 290998
* Remove dead variable Len. | Eric Christopher | 2017-01-04 | 1 | -4/+1
  Fixes PR31528.
  llvm-svn: 290995
* AMDGPU/SI: Implement sendmsghalt intrinsic | Jan Vesely | 2017-01-04 | 6 | -4/+21
  v2: expose using amdgcn prefix
  Differential Revision: https://reviews.llvm.org/D23511
  llvm-svn: 290977
* Reapply "[SimplifyCFG] In sinkLastInstruction correctly set debugloc of common inst" | Robert Lougher | 2017-01-04 | 1 | -1/+9
  This reapplies r289828 (reverted in r289833 as it broke the address sanitizer). The debugloc is now only set when the instruction is not a call, as this causes the verifier to assert (the inliner requires an inlinable callsite to have a debug loc if the caller and callee have debug info).

  Original commit message:
  Simplify CFG will try to sink the last instruction in a series of basic blocks, creating a "common" instruction in the successor block (sinkLastInstruction). When it does this, the debug location of the single instruction should be the merged debug locations of the commoned instructions.

  Original review: https://reviews.llvm.org/D27590
  llvm-svn: 290973
* [CostModel][X86] Updated vXi8 and vXi16 Reverse/Alternate shuffle costs | Simon Pilgrim | 2017-01-04 | 1 | -11/+9
  Actual codegen is much better than the extract+insert patterns that were assumed.
  llvm-svn: 290962
* [PowerPC] Add identification for POWER8NVL | Nemanja Ivanovic | 2017-01-04 | 1 | -0/+1
  This CPU type was not previously recognized by LLVM, which led to emitting poor (and sometimes incorrect) code in some JIT workloads on such a machine.
  llvm-svn: 290961
* [X86] Merged Reverse/Alternate shuffle cost tables. NFCI. | Simon Pilgrim | 2017-01-04 | 1 | -141/+81
  As discussed on D27811, merged the shuffle cost LUTs and use the shuffle kind to perform the lookup instead of the ISD opcode.
  llvm-svn: 290956
* [framelowering] Skip dbg values when getting next/previous instruction. | Florian Hahn | 2017-01-04 | 1 | -8/+14
  Summary:
  In mergeSPUpdates, debug values need to be ignored when getting the previous element, otherwise debug data could have an impact on codegen.

  In eliminateCallFramePseudoInstr, debug values after the erased element could have an impact on codegen and should be skipped.

  Closes PR31319 (https://llvm.org/bugs/show_bug.cgi?id=31319)

  Reviewers: aprantl, MatzeB, mkuper
  Subscribers: gbedwell, llvm-commits
  Differential Revision: https://reviews.llvm.org/D27688
  llvm-svn: 290955
* Fix for InlineSpiller accessing not updated dom tree base information. | Bjorn Pettersson | 2017-01-04 | 1 | -4/+4
  Summary:
  The InlineSpiller was accessing the DominatorTreeBase directly through the public data member DT in the MachineDominatorTree. This is not a good idea as the "cached" information in SplitCriticalEdges is not applied before the access. The DominatorTreeBase must be accessed through the member function getBase() in MachineDominatorTree.

  The fault was introduced in r266162.

  I think the public data member DT in the MachineDominatorTree should have been made private in the original code (r215576) that introduced the concept of lazily updating the MachineDominatorTree information from MachineBasicBlock::SplitCriticalEdge().

  Patch by Karl-Johan Karlsson <karl-johan.karlsson@ericsson.com>

  Reviewers: wmi, qcolombet
  Subscribers: llvm-commits, bjope, uabelho
  Differential Revision: https://reviews.llvm.org/D27983
  llvm-svn: 290950
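The hazard described in the commit above (reading raw state that still has pending lazy updates) can be sketched in isolation. The following is a minimal illustration only: `LazyTree`, `recordUpdate`, and `Nodes` are hypothetical names invented here, not the MachineDominatorTree API. The only name taken from the commit message is `getBase()`, and its behavior here is an assumed simplification.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a lazily updated analysis; not the LLVM class.
class LazyTree {
  std::vector<int> Pending; // updates recorded but not yet applied

public:
  std::vector<int> Nodes; // raw state, analogous to a public data member

  // Record an update without applying it (analogous to caching edge splits).
  void recordUpdate(int N) { Pending.push_back(N); }

  // Accessor that applies all pending updates before exposing the state.
  std::vector<int> &getBase() {
    Nodes.insert(Nodes.end(), Pending.begin(), Pending.end());
    Pending.clear();
    return Nodes;
  }
};
```

Reading `Nodes` directly after `recordUpdate` sees stale data, while going through `getBase()` sees the update. Making the raw member private, as the commit message suggests for DT, would rule out the stale read at compile time.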
* [LLC][MIPS] Fix crash after enabling LLVM_ENABLE_EXPENSIVE_CHECKS | Nitesh Jain | 2017-01-04 | 2 | -0/+8
  Reviewers: sdardis, vkalintiris
  Subscribers: jaydeep, slthakur, RKSimon, llvm-commits
  Differential Revision: https://reviews.llvm.org/D27841
  llvm-svn: 290949
* [X86][AVX512] Passing the appropriate memory operand class to INT_{U}COMIS{S|D} instructions | Ayman Musa | 2017-01-04 | 2 | -26/+43
  Replacing the memory operand in the intrinsic versions of the comis/ucomis instructions from f128mem to ssmem/sdmem accordingly.
  Differential Revision: https://reviews.llvm.org/D28138
  llvm-svn: 290948
* [X86] Attempt to pre-truncate arithmetic operations if useful | Simon Pilgrim | 2017-01-04 | 1 | -0/+81
  In some cases it's more efficient to combine TRUNC( BINOP( X, Y ) ) --> BINOP( TRUNC( X ), TRUNC( Y ) ) if the binop is legal for the truncated types.

  This is true for vector integer multiplication (especially vXi64), as well as ADD/AND/XOR/OR in cases where we only need to truncate one of the inputs at runtime (e.g. a duplicated input or a one-use constant we can fold).

  Further work could be done here: scalar cases (especially i64) could often benefit (if we avoid partial registers etc.), as could other opcodes and better analysis of when truncating the inputs reduces costs.

  I have considered implementing this for all targets within the DAGCombiner but wasn't sure we could devise a suitable cost model system that would give us the range we need.

  Differential Revision: https://reviews.llvm.org/D28219
  llvm-svn: 290947
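The combine above is only sound because truncation commutes with the listed opcodes in modular arithmetic. A minimal scalar sketch of that identity follows, assuming 64-bit to 32-bit truncation. The function names are made up for illustration, and this is not the X86 DAG combine itself, which operates on vector types and applies a profitability check.

```cpp
#include <cassert>
#include <cstdint>

// Illustration only: scalar 64-bit to 32-bit truncation.
// Returns true if truncating the result equals operating on the truncated
// inputs for multiply, add, and, xor, and or.
bool truncCommutesWithBinops(uint64_t X, uint64_t Y) {
  const uint32_t TX = static_cast<uint32_t>(X);
  const uint32_t TY = static_cast<uint32_t>(Y);
  return static_cast<uint32_t>(X * Y) == static_cast<uint32_t>(TX * TY) &&
         static_cast<uint32_t>(X + Y) == static_cast<uint32_t>(TX + TY) &&
         static_cast<uint32_t>(X & Y) == (TX & TY) &&
         static_cast<uint32_t>(X ^ Y) == (TX ^ TY) &&
         static_cast<uint32_t>(X | Y) == (TX | TY);
}

// By contrast, a logical right shift does not always commute with
// truncation, because high bits flow into the low half.
bool truncCommutesWithLshr(uint64_t X, unsigned Amt) {
  return static_cast<uint32_t>(X >> Amt) == (static_cast<uint32_t>(X) >> Amt);
}
```

The right-shift helper is included to show why the set of opcodes matters: the low bits of a multiply, add, or bitwise operation depend only on the low bits of the inputs, which is not the case for a right shift.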
* [AVX-512] Add support for detecting 512-bit shuffles that contain a 128-bit subvector insertion from the lowest subvector of one of the sources. | Craig Topper | 2017-01-04 | 1 | -3/+33
  These are best handled with a vinsert32x4 or vinsert64x2 instruction.
  llvm-svn: 290946
* [AVX-512] Simplify code for creating 512-bit SHUF128 operations. | Craig Topper | 2017-01-04 | 1 | -18/+11
  We don't need two loops and we can safely assume and hardcode the size of the widened mask.
  llvm-svn: 290942
* Support: Add YAML I/O support for custom mappings. | Peter Collingbourne | 2017-01-04 | 1 | -2/+18
  This will be used to YAMLify parts of the module summary.
  Differential Revision: https://reviews.llvm.org/D28014
  llvm-svn: 290935
* [InstCombine] Move casts around shift operations | David Majnemer | 2017-01-04 | 1 | -0/+19
  It is possible to perform a left shift before zero extending if the shift would only shift out zeros.
  llvm-svn: 290928
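The condition stated above can be checked exhaustively at a small bit width. The sketch below assumes an 8-bit value zero-extended to 16 bits and a shift amount below 8. The helper names are hypothetical, and it only demonstrates the arithmetic fact, not how InstCombine establishes that the shifted-out bits are zero.

```cpp
#include <cassert>
#include <cstdint>

// zext first, then shift left in the wide (16-bit) type.
uint16_t zextThenShl(uint8_t X, unsigned C) {
  return static_cast<uint16_t>(static_cast<uint32_t>(X) << C);
}

// Shift left in the narrow (8-bit) type first, then zext.
uint16_t shlThenZext(uint8_t X, unsigned C) {
  return static_cast<uint8_t>(static_cast<uint32_t>(X) << C);
}

// True if the top C bits of the 8-bit value are zero (requires C < 8).
bool shiftsOutOnlyZeros(uint8_t X, unsigned C) {
  return (static_cast<uint32_t>(X) >> (8 - C)) == 0;
}

// Checks every 8-bit value and every shift amount below 8.
bool shiftFoldHoldsForAllInputs() {
  for (unsigned C = 0; C < 8; ++C)
    for (unsigned V = 0; V < 256; ++V) {
      const uint8_t X = static_cast<uint8_t>(V);
      if (shiftsOutOnlyZeros(X, C) && zextThenShl(X, C) != shlThenZext(X, C))
        return false;
    }
  return true;
}
```

When the precondition fails, the two orders differ: shifting 0xF0 left by 4 in 8 bits discards the set bits, whereas shifting after the extension keeps them.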
* [InstCombine] Combine adds across a zext | David Majnemer | 2017-01-04 | 1 | -0/+12
  We can perform the following:
    (add (zext (add nuw X, C1)), C2) -> (zext (add nuw X, C1+C2))

  This is only possible if C2 is negative and C2 is greater than or equal to negative C1.
  llvm-svn: 290927
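The precondition on C2 can be sanity-checked by brute force at a small width. This sketch assumes an 8-bit inner add zero-extended to 16 bits, and it models `nuw` by only enumerating X values for which X + C1 does not wrap. The function names are illustrative and are not taken from the InstCombine source.

```cpp
#include <cassert>
#include <cstdint>

// Original form: zext(X + C1) + C2, with the outer add done in 16 bits.
uint16_t addZextAdd(uint8_t X, uint8_t C1, int16_t C2) {
  const uint16_t Wide = static_cast<uint8_t>(X + C1); // inner add, then zext
  return static_cast<uint16_t>(Wide + C2);
}

// Folded form: zext(X + (C1 + C2)), with the constant folded in 8 bits.
uint16_t zextFoldedAdd(uint8_t X, uint8_t C1, int16_t C2) {
  const uint8_t Folded = static_cast<uint8_t>(C1 + C2);
  return static_cast<uint8_t>(X + Folded);
}

// Enumerates all C1, all C2 with -C1 <= C2 < 0, and all X for which the
// inner add does not wrap (the nuw assumption).
bool addFoldHoldsForAllInputs() {
  for (int C1 = 1; C1 < 256; ++C1)
    for (int C2 = -C1; C2 < 0; ++C2)
      for (int X = 0; X + C1 < 256; ++X)
        if (addZextAdd(static_cast<uint8_t>(X), static_cast<uint8_t>(C1),
                       static_cast<int16_t>(C2)) !=
            zextFoldedAdd(static_cast<uint8_t>(X), static_cast<uint8_t>(C1),
                          static_cast<int16_t>(C2)))
          return false;
  return true;
}
```

With X = 0, C1 = 10, and C2 = -11 (which violates C2 >= -C1), the original form wraps to 65535 in 16 bits while the folded form yields 255, so the bound on C2 is needed.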
* [Hexagon, TableGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). | Eugene Zelenko | 2017-01-04 | 13 | -392/+313
  llvm-svn: 290925
* [ThinLTO] Import type as decl only when non-null Identifier | Teresa Johnson | 2017-01-03 | 1 | -1/+1
  As per post-commit review for r289993 (D27775), we can only safely import a type as a decl if it has an Identifier, as the Name alone is not enough to be unique across modules.
  llvm-svn: 290915
* InstCombine: Fold fabs on select of constants | Matt Arsenault | 2017-01-03 | 1 | -0/+12
  llvm-svn: 290913
* [InstCombine] use 'match' to reduce code bloat; NFCI | Sanjay Patel | 2017-01-03 | 1 | -15/+11
  I wrote this patch before seeing the comment in:
  https://reviews.llvm.org/D27114
  ...that suggests we should actually be canonicalizing the other way. So just in case we decide this is the right way, we might as well have a cleaner implementation.
  llvm-svn: 290912
* [CodeGen] Further simplify returned call operand logic. NFC. | Ahmed Bougacha | 2017-01-03 | 1 | -8/+2
  As Pete points out in r290905, CallSite lets us avoid duplicating this!
  llvm-svn: 290909
* [ExecutionEngine] Fix compile errors in OProfileJITEventListener. | Lang Hames | 2017-01-03 | 1 | -8/+8
  Allows LLVM to build with LLVM_USE_OPROFILE=True.
  Patch by Mark Dewing. Thanks Mark!
  llvm-svn: 290908
* [CodeGen] Simplify logic that looks for returned call operands. NFC-ish. | Ahmed Bougacha | 2017-01-03 | 1 | -22/+10
  Use getReturnedArgOperand() instead of rolling our own. Note that it's equivalent because there can only be one 'returned' operand.

  The existing code was also incorrect: there already was awkward logic to ignore callee/EH blocks, but operands can now also be operand bundles, in which case we'll look for non-existent parameter attributes.

  Unfortunately, this isn't observable in-tree, as it only crashes when exercising the regular call lowering logic with operand bundles. Still, this is a nice small cleanup anyway.
  llvm-svn: 290905
* [libFuzzer] disable -print_pcs by default (was enabled by mistake) | Kostya Serebryany | 2017-01-03 | 1 | -0/+2
  llvm-svn: 290899
* [ADT] APFloatBase: Prevent collapsing semPPCDoubleDouble and semBogus | Michal Gorny | 2017-01-03 | 1 | -2/+6
  Provide distinct contents for semBogus and semPPCDoubleDouble in order to prevent compilers from collapsing them to a single memory address, since we rely heavily on every semantic having a distinct address.

  This happens if an insecure optimization that collapses identical values is enabled. As a result, APFloats of semBogus are indistinguishable from semPPCDoubleDouble, and whenever the move constructor is used, the old value begins being incorrectly recognized as a semPPCDoubleDouble.

  Since the values in semPPCDoubleDouble are not used anywhere, we can easily solve this issue by altering the value of one of the fields, thereby ensuring that the collapse cannot occur.

  Differential Revision: https://reviews.llvm.org/D28112
  llvm-svn: 290896
* [X86] Move 128-bit shuffle mask widening check into lowerV2X128VectorShuffle to reduce code duplication. | Craig Topper | 2017-01-03 | 1 | -22/+17
  Use the now available widened mask to simplify some code inside lowerV2X128VectorShuffle.
  llvm-svn: 290872
* [AVX-512] Simplify the code added in r290870 to recognize 256-bit subvector inserts and avoid calling isShuffleEquivalent on a widened mask. | Craig Topper | 2017-01-03 | 1 | -30/+7
  llvm-svn: 290871
* [AVX-512] Teach shuffle lowering to use vinsert instructions for shuffles corresponding to 256-bit subvector inserts. | Craig Topper | 2017-01-03 | 1 | -0/+39
  llvm-svn: 290870
* [AVX-512] Teach EVEX to VEX conversion pass to handle VINSERT and VEXTRACT instructions. | Craig Topper | 2017-01-03 | 1 | -0/+16
  llvm-svn: 290869
* [X86] Remove trailing whitespace and an unnecessary line wrap. NFC | Craig Topper | 2017-01-03 | 1 | -37/+35
  llvm-svn: 290867
* [X86] Fix header comment. NFC | Craig Topper | 2017-01-03 | 1 | -1/+1
  llvm-svn: 290866
* [AVX-512] Add support for pushing bitcasts through INSERT_SUBVEC in order to select a masked operation. | Craig Topper | 2017-01-03 | 1 | -0/+23
  llvm-svn: 290865
* [AVX-512] Remove vinsert intrinsics and autoupgrade to native shufflevectors. | Craig Topper | 2017-01-03 | 3 | -49/+25
  There are some codegen problems here that I'll try to fix in future commits.
  llvm-svn: 290864
* [AVX-512] Remove vextract intrinsics and autoupgrade to native shufflevectors. | Craig Topper | 2017-01-03 | 2 | -37/+16
  This unfortunately generates some really terrible code without VLX support due to v2i1 and v4i1 not being legal. Hopefully we can improve that in future patches.
  llvm-svn: 290863
* InstCombine: Add fma with constant transforms | Matt Arsenault | 2017-01-03 | 1 | -3/+17
  DAGCombine already does these.
  llvm-svn: 290860
* InstCombine: Add fma + fabs/fneg transforms | Matt Arsenault | 2017-01-03 | 1 | -0/+30
  fma (fneg x), (fneg y), z -> fma x, y, z
  fma (fabs x), (fabs x), z -> fma x, x, z
  llvm-svn: 290859
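Both rewrites rest on the fact that negating both factors, or taking the absolute value of a squared factor, leaves the exact product unchanged before the single rounding that fma performs. A small sketch using `std::fma` from the C++ standard library follows. The helper names are invented here, and the comparison uses `==`, so it is only meaningful for inputs whose result is not NaN.

```cpp
#include <cassert>
#include <cmath>

// fma(-x, -y, z) versus fma(x, y, z): the product (-x)*(-y) equals x*y
// exactly, so the fused results agree for non-NaN results.
bool fnegPairFoldHolds(double X, double Y, double Z) {
  return std::fma(-X, -Y, Z) == std::fma(X, Y, Z);
}

// fma(|x|, |x|, z) versus fma(x, x, z): |x|*|x| equals x*x exactly.
bool fabsSquareFoldHolds(double X, double Z) {
  return std::fma(std::fabs(X), std::fabs(X), Z) == std::fma(X, X, Z);
}
```

Note that the fabs form in the commit message requires both multiplicands to be the same value; with two different operands, fabs would change the sign of a negative product.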
* [XRay] Merge instrumentation point table emission code into AsmPrinter. | Dean Michael Berris | 2017-01-03 | 7 | -150/+59
  Summary:
  No need to have this per-architecture. While there, unify 32-bit ARM's behaviour with what changed elsewhere and start function names lowercase as per the coding standards. Individual entry emission code goes to the entry's own class.

  Fully tested on amd64, cross-builds on both ARMs and PowerPC.

  Reviewers: dberris
  Subscribers: aemerson, llvm-commits
  Differential Revision: https://reviews.llvm.org/D28209
  llvm-svn: 290858
* [EarlyCSE] less else, more auto; NFC | Sanjay Patel | 2017-01-03 | 1 | -2/+2
  llvm-svn: 290848
* [InstCombine] use combineMetadataForCSE instead of copying it; NFCI | Sanjay Patel | 2017-01-02 | 1 | -14/+4
  llvm-svn: 290844
* Make sure total loop body weight is preserved in loop peeling | Xin Tong | 2017-01-02 | 1 | -8/+17
  Summary:
  Regardless of how the loop body weight is distributed, we should preserve the total loop body weight, i.e. the same weight should reach the body of the loop or its duplicates in the peeled and unpeeled cases.

  Reviewers: mkuper, davidxl, anemet
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D28179
  llvm-svn: 290833
* NewGVN: Clean up after removing possibility of null expressions. | Daniel Berlin | 2017-01-02 | 1 | -17/+14
  llvm-svn: 290828
* fix typo; NFC | Sanjay Patel | 2017-01-02 | 1 | -1/+1
  llvm-svn: 290827
* [ValueTracking] remove stale comments; NFC | Sanjay Patel | 2017-01-02 | 1 | -6/+0
  The checks were improved with:
  https://reviews.llvm.org/rL290194
  llvm-svn: 290826