summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* MC: De-duplicate the object streamer implementations of EmitFileDirective ↵Peter Collingbourne2017-03-036-25/+6
| | | | | | into MCObjectStreamer. NFCI. llvm-svn: 296912
* [ObjectYAML] [DWARF] Abstract DWARF Initial Length valuesChris Bieneman2017-03-032-12/+20
| | | | | | | | In the DWARF 4 Spec section 7.2.2, data in many DWARF sections, and some DWARF structures start with "Initial Length Values", which are a 32-bit length, and an optional 64-bit length if the 32 bit value == UINT32_MAX. This patch abstracts the Initial Length type in YAML, and extends its use to all the DWARF structures that are supported in the DWARFYAML code that have Initial Length values. llvm-svn: 296911
* LTO: Hash the set of imported symbols for each module.Peter Collingbourne2017-03-031-1/+19
| | | | | | | | | | This set may affect code generation and is sensitive to link order (and possibly in the future to the linker's choice of prevailing symbol), so we need to include it. Differential Revision: https://reviews.llvm.org/D30586 llvm-svn: 296907
* RegisterCoalescer: Simplify subrange splitting code; NFCMatthias Braun2017-03-033-94/+51
| | | | | | - Use slightly better variable names / compute in a more direct way. llvm-svn: 296905
* Fix a compiler warningSanjoy Das2017-03-031-1/+2
| | | | llvm-svn: 296903
* Add missing #includes for FreeBSD.Zachary Turner2017-03-031-4/+9
| | | | llvm-svn: 296902
* Make TargetInstrInfo::isPredicable take a const reference, NFCKrzysztof Parzyszek2017-03-0312-16/+16
| | | | llvm-svn: 296901
* Try again to appease the FreeBSD bot.Zachary Turner2017-03-031-7/+2
| | | | | | | The actual logic was wrong, not just the type conversion. This should get it correct. llvm-svn: 296899
* [LoopUnrolling] Peel loops with invariant backedge Phi inputSanjoy Das2017-03-031-0/+25
| | | | | | | | | | | | | | | | | | | | | Summary: If a loop contains a Phi node which has an invariant input from back edge, it is profitable to peel such loops (rather than unroll them) to use the advantage that this Phi is always invariant starting from 2nd iteration. After the 1st iteration is peeled, other optimizations can potentially simplify calculations with this invariant. Patch by Max Kazantsev! Reviewers: sanjoy, apilipenko, igor-laevsky, anna, mkuper, reames Reviewed By: mkuper Subscribers: mkuper, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D30161 llvm-svn: 296898
* [LoopUnrolling] Re-prioritize Peeling and Partial unrollingSanjoy Das2017-03-032-10/+16
| | | | | | | | | | | | | | | | | | | | | | | Summary: In current implementation the loop peeling happens after trip-count based partial unrolling and may sometimes not happen at all due to it (for example, if trip count is known, but UP.Partial = false). This is generally bad, the more than there are some situations where peeling is profitable even if the partial unrolling is disabled. This patch is a NFC which reorders peeling and partial unrolling application and prepares the code for implementation of the said optimizations. Patch by Max Kazantsev! Reviewers: sanjoy, anna, reames, apilipenko, igor-laevsky, mkuper Reviewed By: mkuper Subscribers: mkuper, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D30243 llvm-svn: 296897
* [x86] clean up materializeSBB(); NFCISanjay Patel2017-03-031-20/+14
| | | | | | This is producing SBB where it is obviously not necessary, so it needs to be limited. llvm-svn: 296894
* Try to appease the FreeBSD bots.Zachary Turner2017-03-031-2/+2
| | | | | | | | pthread_self() returns a pthread_t, but we were setting it to an int. It seems the cast to int when calling sysctl is still the correct thing to do, though. llvm-svn: 296892
* Don't bring in llvm/Support/thread.h in Threading.cppZachary Turner2017-03-031-2/+8
| | | | | | | | | | | | Doing so defines the type llvm::thread. On FreeBSD, we need to call a macro which references its own ::thread type, which causes an ambiguity due to ADL when inside of the llvm namespace. Since we don't even need this unless LLVM_ENABLE_THREADS == 1, we don't even need this type anyway, as it is always equal to std::thread, so we can just use that directly. llvm-svn: 296891
* Add #include for unistd.h on Linux.Zachary Turner2017-03-031-0/+1
| | | | llvm-svn: 296890
* [Support] Provide access to current thread name/thread id.Zachary Turner2017-03-033-84/+311
| | | | | | | | | | | | | | | | Applications often need the current thread id when making system calls, and some operating systems provide the notion of a thread name, which can be useful in enabling better diagnostics when debugging or logging. This patch adds an accessor for the thread id, and "best effort" getters and setters for the thread name. Since this is non critical functionality, no error is returned to indicate that a platform doesn't support thread names. Differential Revision: https://reviews.llvm.org/D30526 llvm-svn: 296887
* Use APInt::setBits instead of OR'ing in a separate APInt::getBitsSet callSimon Pilgrim2017-03-031-1/+1
| | | | llvm-svn: 296886
* Use APInt::getLowBitsSet instead of APInt::getBitsSet for lower bit mask ↵Simon Pilgrim2017-03-031-1/+1
| | | | | | creation llvm-svn: 296882
* Use APInt::getOneBitSet instead of APInt::getBitsSet for sign bit mask creationSimon Pilgrim2017-03-031-1/+1
| | | | | | Avoids all the unnecessary extra bitrange creation/shift stages. llvm-svn: 296879
* [x86] fix formatting; NFCSanjay Patel2017-03-031-3/+2
| | | | llvm-svn: 296875
* Use APInt::getHighBitsSet instead of APInt::getBitsSet for upper bit mask ↵Simon Pilgrim2017-03-031-1/+1
| | | | | | creation llvm-svn: 296874
* [AMDGPU][MC] Fix for Bug 30829 + LIT testsDmitry Preobrazhensky2017-03-037-0/+163
| | | | | | | | Added code to check constant bus restrictions for VOP formats (only one SGPR value or literal-constant may be used by the instruction). Note that the same checks are performed by SIInstrInfo::verifyInstruction (used by lowering code). Added LIT tests. llvm-svn: 296873
* Revert "Re-apply "[GVNHoist] Move GVNHoist to function simplification part ↵Benjamin Kramer2017-03-031-2/+2
| | | | | | | | of pipeline."" This reverts commit r296759. Miscompiles bash. llvm-svn: 296872
* Use APInt::getOneBitSet instead of APInt::getBitsSet for sign bit mask creationSimon Pilgrim2017-03-031-1/+1
| | | | | | Avoids all the unnecessary extra bitrange creation/shift stages. llvm-svn: 296871
* Fix Wdocumentation warningSimon Pilgrim2017-03-031-3/+3
| | | | llvm-svn: 296866
* [SLP] Fixes the bug due to absence of in order uses of scalars which needs ↵Mohammad Shahid2017-03-032-80/+103
| | | | | | | | | | | | | | | | to be available for VectorizeTree() API.This API uses it for proper mask computation to be used in shufflevector IR. The fix is to compute the mask for out of order memory accesses while building the vectorizable tree instead of actual vectorization of vectorizable tree.It also needs to recompute the proper Lane for external use of vectorizable scalars based on shuffle mask. Reviewers: mkuper Differential Revision: https://reviews.llvm.org/D30159 Change-Id: Ide8773ce0ad3562f3cf4d1a0ad0f487e2f60ce5d llvm-svn: 296863
* [SDAG] Revert r296476 (and r296486, r296668, r296690).Chandler Carruth2017-03-035-386/+372
| | | | | | | | | | This patch causes compile times for some patterns to explode. I have a (large, unreduced) test case that slows down by more than 20x and several test cases slow down by 2x. I'm sending some of the test cases directly to Nirav and following up with more details in the review log, but this should unblock anyone else hitting this. llvm-svn: 296862
* [X86] Generate VZEROUPPER for Skylake-avx512.Amjad Aboud2017-03-034-65/+77
| | | | | | | | VZEROUPPER should not be issued on Knights Landing (KNL), but on Skylake-avx512 it should be. Differential Revision: https://reviews.llvm.org/D29874 llvm-svn: 296859
* [AArch64AsmParser] rewrite of function parseSysAliasSjoerd Meijer2017-03-032-212/+62
| | | | | | | | | | | | This is a cleanup/rewrite of the parseSysAlias function. It was not using the tablegen instruction descriptions, but was “manually” matching the mnemonics and recreating the operands whereas all this information is already in tablegen; all this code has been replaced with calls to lookupXYZByName tablegen calls. Differential Revision: https://reviews.llvm.org/D30491 llvm-svn: 296857
* [GlobalISel][X86] Support float/double and vector types.Igor Breger2017-03-037-32/+306
| | | | | | | | | | | | | | Summary: [GlobalISel][X86] Add support for f32/f64 and vector types in RegisterBank and InstructionSelector. Reviewers: delena, zvi Reviewed By: zvi Subscribers: dberris, rovka, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D30533 llvm-svn: 296856
* [msan] Handle x86_sse_stmxcsr and x86_sse_ldmxcsr.Evgeniy Stepanov2017-03-031-4/+46
| | | | llvm-svn: 296848
* LiveDebugValues: Assume calls never clobber SP.Adrian Prantl2017-03-031-1/+5
| | | | | | | | | | | | | | A call should never modify the stack pointer, but some backends are not so sure about this and never list SP in the regmask. For the purposes of LiveDebugValues we assume a call never clobbers SP. We already have a similar workaround in DbgValueHistoryCalculator (which we hopefully can retire soon). This fixes the availabilty of local ASANified variables on AArch64. rdar://problem/27757381 llvm-svn: 296847
* [ProfileData] Fix some Clang-tidy modernize and Include What You Use ↵Eugene Zelenko2017-03-039-97/+241
| | | | | | warnings; other minor fixes (NFC). llvm-svn: 296846
* CodeGen: BlockPlacement: Precompute layout for chains of triangles.Kyle Butt2017-03-031-0/+134
| | | | | | | | | | | | | | | | | | | | | | | | | | For chains of triangles with small join blocks that can be tail duplicated, a simple calculation of probabilities is insufficient. Tail duplication can be profitable in 3 different ways for these cases: 1) The post-dominators marked 50% are actually taken 56% (This shrinks with longer chains) 2) The chains are statically correlated. Branch probabilities have a very U-shaped distribution. [http://nrs.harvard.edu/urn-3:HUL.InstRepos:24015805] If the branches in a chain are likely to be from the same side of the distribution as their predecessor, but are independent at runtime, this transformation is profitable. (Because the cost of being wrong is a small fixed cost, unlike the standard triangle layout where the cost of being wrong scales with the # of triangles.) 3) The chains are dynamically correlated. If the probability that a previous branch was taken positively influences whether the next branch will be taken We believe that 2 and 3 are common enough to justify the small margin in 1. The code pre-scans a function's CFG to identify this pattern and marks the edges so that the standard layout algorithm can use the computed results. llvm-svn: 296845
* [msan] Remove stale comments.Evgeniy Stepanov2017-03-031-2/+0
| | | | | | ClStoreCleanOrigin flag was removed back in 2014. llvm-svn: 296844
* AMDGPU: Fix missing dominator tree dependencyMatt Arsenault2017-03-021-0/+1
| | | | llvm-svn: 296842
* ThinLTOBitcodeWriter: Do not follow operand edges of type GlobalValue when ↵Peter Collingbourne2017-03-021-0/+2
| | | | | | | | | | | | looking for virtual functions. Such edges may otherwise result in infinite recursion if a pointer to a vtable is reachable from the vtable itself. This can happen in practice if a TU defines the ABI types used to implement RTTI, and is itself compiled with RTTI. Fixes PR32121. llvm-svn: 296839
* Move defClobbersUseOrDef to being a protected member of a class since we ↵Daniel Berlin2017-03-022-4/+4
| | | | | | don't want anyone else using it llvm-svn: 296838
* [BypassSlowDivision] Use ValueTracking to simplify run-time checksNikolai Bozhenov2017-03-021-29/+108
| | | | | | | | | | | | | | | | | | | | | | | | ValueTracking is used for more thorough analysis of operands. Based on the analysis, either run-time checks can be simplified (e.g. check only one operand instead of two) or the transformation can be avoided. For example, it is quite often the case that a divisor is promoted from a shorter type and run-time checks for it are redundant. With additional compile-time analysis of values, two special cases naturally arise and are addressed by the patch: 1) Both operands are known to be short enough. Then, the long division can be simply replaced with a short one without CFG modification. 2) If a division is unsigned and the dividend is known to be short then the long division is not needed at all. Because if the divisor is too big for short division then the quotient is obviously zero (and the remainder is equal to the dividend). Actually, the division is not needed when (divisor > dividend). Differential Revision: https://reviews.llvm.org/D29897 llvm-svn: 296832
* [BypassSlowDivision] Refactor fast division insertion logic (NFC)Nikolai Bozhenov2017-03-021-160/+218
| | | | | | | | | | The most important goal of the patch is to break large insertFastDiv function into separate pieces, so that later a different fast insertion logic can be implemented using some of these pieces. Differential Revision: https://reviews.llvm.org/D29896 llvm-svn: 296828
* [DAGCombiner] Fix DebugLoc propagation when folding !(x cc y) -> (x !cc y)Taewook Oh2017-03-021-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Currently, when 't1: i1 = setcc t2, t3, cc' followed by 't4: i1 = xor t1, Constant:i1<-1>' is folded into 't5: i1 = setcc t2, t3 !cc', SDLoc of newly created SDValue 't5' follows SDLoc of 't4', not 't1'. However, as the opcode of newly created SDValue is 'setcc', it make more sense to take DebugLoc from 't1' than 't4'. For the code below ``` extern int bar(); extern int baz(); int foo(int x, int y) { if (x != y) return bar(); else return baz(); } ``` , following is the bitcode representation of 'foo' at the end of llvm-ir level optimization: ``` define i32 @foo(i32 %x, i32 %y) !dbg !4 { entry: tail call void @llvm.dbg.value(metadata i32 %x, i64 0, metadata !9, metadata !11), !dbg !12 tail call void @llvm.dbg.value(metadata i32 %y, i64 0, metadata !10, metadata !11), !dbg !13 %cmp = icmp ne i32 %x, %y, !dbg !14 br i1 %cmp, label %if.then, label %if.else, !dbg !16 if.then: ; preds = %entry %call = tail call i32 (...) @bar() #3, !dbg !17 br label %return, !dbg !18 if.else: ; preds = %entry %call1 = tail call i32 (...) @baz() #3, !dbg !19 br label %return, !dbg !20 return: ; preds = %if.else, %if.then %retval.0 = phi i32 [ %call, %if.then ], [ %call1, %if.else ] ret i32 %retval.0, !dbg !21 } !14 = !DILocation(line: 5, column: 9, scope: !15) !16 = !DILocation(line: 5, column: 7, scope: !4) ``` As you can see, in 'entry' block, 'icmp' instruction and 'br' instruction have different debug locations. However, with current implementation, there's no distinction between debug locations of these two when they are lowered to asm instructions. This is because 'icmp' and 'br' become 'setcc' 'xor' and 'brcond' in SelectionDAG, where SDLoc of 'setcc' follows the debug location of 'icmp' but SDLOC of 'xor' and 'brcond' follows the debug location of 'br' instruction, and SDLoc of 'xor' overwrites SDLoc of 'setcc' when they are folded. This patch addresses this issue. Reviewers: atrick, bogner, andreadb, craig.topper, aprantl Reviewed By: andreadb Subscribers: jlebar, mkuper, jholewinski, andreadb, llvm-commits Differential Revision: https://reviews.llvm.org/D29813 llvm-svn: 296825
* [DAG] early exit to improve readability and formatting of visitMemCmpCall(); ↵Sanjay Patel2017-03-021-64/+53
| | | | | | NFCI llvm-svn: 296824
* [Hexagon] Pick the right branch opcode depending on branch probabilitiesKrzysztof Parzyszek2017-03-021-15/+69
| | | | | | | Specifically, pick the opcode with the correct branch prediction, i.e. jump:t or jump:nt. llvm-svn: 296821
* CodeGen: MachineBlockPlacement: Remove the unused outlining heuristic.Kyle Butt2017-03-021-98/+1
| | | | | | | Outlining optional branches isn't a good heuristic, and it's never been on by default. Remove it to clean things up. llvm-svn: 296818
* [ARM] Fix insert point for store rescheduling.Eli Friedman2017-03-021-12/+19
| | | | | | | | | | | | | | | | In ARMPreAllocLoadStoreOpt::RescheduleOps, LastOp should be the last operation which we want to merge. If we break out of the loop because an operation has the wrong offset, we shouldn't use that operation as LastOp. This patch fixes some cases where we would move stores to the wrong insert point. Re-commit with a fix to increment NumMove in the right place. Differential Revision: https://reviews.llvm.org/D30124 llvm-svn: 296815
* Revert "Fix PR 24415 (at least), by making our post-dominator tree behavior ↵Tobias Grosser2017-03-021-14/+37
| | | | | | | | | | | | | | | sane." and also "clang-format GenericDomTreeConstruction.h, since the current formatting makes it look like their is a bug in the loop indentation, and there is not" This reverts commit r296535. There are still some open design questions which I would like to discuss. I revert this for Daniel (who gave the OK), as he is on vacation. llvm-svn: 296812
* [PPC] Fix code generation for bswap(int32) followed by store16Guozhi Wei2017-03-021-2/+10
| | | | | | | | | | | | | | | | | | | This patch fixes pr32063. Current code in PPCTargetLowering::PerformDAGCombine can transform bswap store into a single PPCISD::STBRX instruction. but it doesn't consider the case that the operand size of bswap may be larger than store size. When it occurs, we need 2 modifications, 1 For the last operand of PPCISD::STBRX, we should not use DAG.getValueType(N->getOperand(1).getValueType()), instead we should use cast<StoreSDNode>(N)->getMemoryVT(). 2 Before PPCISD::STBRX, we need to shift the original operand of bswap to the right side. Differential Revision: https://reviews.llvm.org/D30362 llvm-svn: 296811
* [Support] Move Stream library from MSF -> Support.Zachary Turner2017-03-0236-56/+56
| | | | | | | | | | After several smaller patches to get most of the core improvements finished up, this patch is a straight move and header fixup of the source. Differential Revision: https://reviews.llvm.org/D30266 llvm-svn: 296810
* [AArch64] Extend redundant copy elimination pass to handle non-zero stores.Chad Rosier2017-03-021-76/+212
| | | | | | | | | | | | | | | This patch extends the current functionality of the AArch64 redundant copy elimination pass to handle non-zero cases such as: BB#0: cmp x0, #1 b.eq .LBB0_1 .LBB0_1: orr x0, xzr, #0x1 ; <-- redundant copy; x0 known to hold #1. Differential Revision: https://reviews.llvm.org/D29344 llvm-svn: 296809
* [DAG] improve documentation comments; NFCSanjay Patel2017-03-022-90/+48
| | | | llvm-svn: 296808
* [MSP430] Add SRet support to MSP430 targetVadzim Dambrouski2017-03-024-24/+85
| | | | | | | | | | | | | This patch adds support for struct return values to the MSP430 target backend. It also reverses the order of argument and return registers in the calling convention to bring it into closer alignment with the published EABI from TI. Patch by Andrew Wygle (awygle). Differential Revision: https://reviews.llvm.org/D29069 llvm-svn: 296807
OpenPOWER on IntegriCloud