Commit message | Author | Age | Files | Lines
* The patch improves ValueTracking on left shift with nsw flag. | Evgeny Stupachenko | 2016-08-24 | 2 | -5/+78
| | | | | | | | | | | | Summary: The patch fixes PR28946. Reviewers: majnemer, sanjoy Differential Revision: http://reviews.llvm.org/D23296 From: Li Huang llvm-svn: 279684
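The commit message does not spell out what the improvement is. One property that an analysis of `shl nsw` can rely on is that a non-poison result always has the same sign as the shifted operand. The sketch below is a brute-force check of that property on 8-bit values; it is not the ValueTracking code, and the helper names (`to_signed`, `shl_nsw`, `sign_is_preserved`) are made up for illustration.

```python
BITS = 8  # small width so the exhaustive check stays cheap


def to_signed(value, bits=BITS):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def shl_nsw(x, amount, bits=BITS):
    """Model of 'shl nsw': return the shifted value, or None for poison.

    The result is poison when the exact product x * 2**amount does not fit
    in a signed integer of the given width (signed wrap).
    """
    exact = x << amount
    wrapped = to_signed(exact, bits)
    return wrapped if wrapped == exact else None


def sign_is_preserved(bits=BITS):
    """Check every operand and shift amount: non-poison results keep the sign."""
    for x in range(-(1 << (bits - 1)), 1 << (bits - 1)):
        for amount in range(bits):
            result = shl_nsw(x, amount, bits)
            if result is not None and (result < 0) != (x < 0):
                return False
    return True
```

Under this model, an analysis that knows the sign bit of the operand also knows the sign bit of any non-poison result.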
* [WebAssembly] Change a comment line | Heejin Ahn | 2016-08-24 | 1 | -1/+2
| | | | | | Test for commit access. llvm-svn: 279683
* MIRYamlMapping cleanup | Matthias Braun | 2016-08-24 | 1 | -2/+0
| | | | | | Two lines got lost when cherry-picking old commits to master. llvm-svn: 279682
* [Hexagon] Check for block end when skipping debug instructions | Krzysztof Parzyszek | 2016-08-24 | 2 | -4/+60
| | | | llvm-svn: 279681
* MIRParser/MIRPrinter: Compute HasInlineAsm instead of printing/parsing it | Matthias Braun | 2016-08-24 | 22 | -45/+13
| | | | llvm-svn: 279680
* Missed a test in my last commit | Matthias Braun | 2016-08-24 | 1 | -1/+0
| | | | llvm-svn: 279679
* [Hexagon] Change insertion of expand-condsets pass to avoid memory leaks | Krzysztof Parzyszek | 2016-08-24 | 2 | -5/+10
| | | | llvm-svn: 279678
* [InstCombine] use m_APInt to allow icmp eq/ne (shr X, C2), C folds for splat constant vectors | Sanjay Patel | 2016-08-24 | 5 | -38/+51
| | | | llvm-svn: 279677
* MachineRegisterInfo/MIR: Initialize tracksSubRegLiveness early, do not print/parse it | Matthias Braun | 2016-08-24 | 26 | -57/+17
| | | | | | | | | | tracksSubRegLiveness only depends on the Subtarget and a cl::opt, so there is no need to change it or save/parse it in a .mir file. Make the field const and move the initialization from LiveIntervalAnalysis to the MachineRegisterInfo constructor. Also clean up some code and fix some instances which should use MachineRegisterInfo::subRegLivenessEnabled() instead of TargetSubtargetInfo::enableSubRegLiveness(). llvm-svn: 279676
* [CMake] Be more consistent about naming targets and components | Chris Bieneman | 2016-08-24 | 3 | -12/+18
| | | | | | | | | | | | | Summary: The point of this patch is to have a consistent convention for naming build, check and install targets so that the targets can be constructed from the project name. This change renames a bunch of CMake components and targets from libcxx to cxx. For each renamed target I've added a convenience target that matches the old target name and depends on the new target. This will preserve the function of the old targets so that the change doesn't break the world. We can evaluate whether it is worth removing the extra targets later. Reviewers: EricWF Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D23699 llvm-svn: 279675
* [lsan] give a test a bit more stack -- it started failing after r279664 on the debian bot, hopefully this is a fix. | Kostya Serebryany | 2016-08-24 | 1 | -2/+3
| | | | llvm-svn: 279674
* [Order Files] On Darwin use DTrace's oneshot probe | Chris Bieneman | 2016-08-24 | 1 | -0/+5
| | | | | | The oneshot probe only gets executed the first time the probe is hit in the process. For order file generation this is really all we care about. llvm-svn: 279673
* fix darwin_log test errors on macOS < 10.12 | Todd Fiala | 2016-08-24 | 1 | -0/+9
| | | | | | | | The newer event-based tests I added neglected to do the macOS 10.12 check in the setup. This caused earlier macOS test suite runs to attempt to compile code that doesn't exist. llvm-svn: 279672
* CodeGen: If Convert blocks that would form a diamond when tail-merged. | Kyle Butt | 2016-08-24 | 3 | -78/+356
| The following function currently relies on tail-merging for if conversion to succeed. The common tail of cond_true and cond_false is extracted, and this then forms a diamond pattern that can be successfully if converted. If this block does not get extracted, either because tail-merging is disabled or the threshold is higher, we should still recognize this pattern and if-convert it.
|
| Fixed a regression in the original commit. Need to un-reverse branches after reversing them, or other conversions go awry.
|
| define i32 @t2(i32 %a, i32 %b) nounwind {
| entry:
|   %tmp1434 = icmp eq i32 %a, %b ; <i1> [#uses=1]
|   br i1 %tmp1434, label %bb17, label %bb.outer
|
| bb.outer: ; preds = %cond_false, %entry
|   %b_addr.021.0.ph = phi i32 [ %b, %entry ], [ %tmp10, %cond_false ]
|   %a_addr.026.0.ph = phi i32 [ %a, %entry ], [ %a_addr.026.0, %cond_false ]
|   br label %bb
|
| bb: ; preds = %cond_true, %bb.outer
|   %indvar = phi i32 [ 0, %bb.outer ], [ %indvar.next, %cond_true ]
|   %tmp. = sub i32 0, %b_addr.021.0.ph
|   %tmp.40 = mul i32 %indvar, %tmp.
|   %a_addr.026.0 = add i32 %tmp.40, %a_addr.026.0.ph
|   %tmp3 = icmp sgt i32 %a_addr.026.0, %b_addr.021.0.ph
|   br i1 %tmp3, label %cond_true, label %cond_false
|
| cond_true: ; preds = %bb
|   %tmp7 = sub i32 %a_addr.026.0, %b_addr.021.0.ph
|   %tmp1437 = icmp eq i32 %tmp7, %b_addr.021.0.ph
|   %indvar.next = add i32 %indvar, 1
|   br i1 %tmp1437, label %bb17, label %bb
|
| cond_false: ; preds = %bb
|   %tmp10 = sub i32 %b_addr.021.0.ph, %a_addr.026.0
|   %tmp14 = icmp eq i32 %a_addr.026.0, %tmp10
|   br i1 %tmp14, label %bb17, label %bb.outer
|
| bb17: ; preds = %cond_false, %cond_true, %entry
|   %a_addr.026.1 = phi i32 [ %a, %entry ], [ %tmp7, %cond_true ], [ %a_addr.026.0, %cond_false ]
|   ret i32 %a_addr.026.1
| }
|
| Without tail-merging or diamond-tail if conversion:
|
| LBB1_1: @ %bb
|   @ =>This Inner Loop Header: Depth=1
|   cmp r0, r1
|   ble LBB1_3
| @ BB#2: @ %cond_true
|   @ in Loop: Header=BB1_1 Depth=1
|   subs r0, r0, r1
|   cmp r1, r0
|   it ne
|   cmpne r0, r1
|   bgt LBB1_4
| LBB1_3: @ %cond_false
|   @ in Loop: Header=BB1_1 Depth=1
|   subs r1, r1, r0
|   cmp r1, r0
|   bne LBB1_1
| LBB1_4: @ %bb17
|   bx lr
|
| With diamond-tail if conversion, but without tail-merging:
|
| @ BB#0: @ %entry
|   cmp r0, r1
|   it eq
|   bxeq lr
| LBB1_1: @ %bb
|   @ =>This Inner Loop Header: Depth=1
|   cmp r0, r1
|   ite le
|   suble r1, r1, r0
|   subgt r0, r0, r1
|   cmp r1, r0
|   bne LBB1_1
| @ BB#2: @ %bb17
|   bx lr
|
| llvm-svn: 279671
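As a reading aid for the commit above, here is a rough Python transliteration of `@t2`, once with the two-armed branch and once with the arms folded into conditional updates followed by the shared compare, mirroring the `ite le` sequence in the second listing. It is a sketch, not generated from the IR: the function names are invented, Python integers stand in for `i32`, and only positive inputs are considered so that the loop terminates.

```python
def t2_branchy(a, b):
    """Two-armed loop body: each arm ends with the same compare-and-branch."""
    if a == b:
        return a
    while True:
        if a > b:            # cond_true
            a = a - b
            if a == b:       # common tail: compare, then exit or loop
                return a
        else:                # cond_false
            b = b - a
            if a == b:       # same tail duplicated in the other arm
                return a


def t2_predicated(a, b):
    """If-converted form: both arms become conditional updates, one tail."""
    if a == b:
        return a
    while True:
        le = a <= b
        new_b = b - a if le else b   # suble r1, r1, r0
        new_a = a if le else a - b   # subgt r0, r0, r1
        a, b = new_a, new_b
        if a == b:                   # cmp r1, r0 ; bne LBB1_1
            return a
```

Both versions compute the same value for positive inputs; the second has a single loop block with no inner branch, which is the shape the commit aims for when tail-merging has not already extracted the common tail.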
* IfConversion: Rescan diamonds. | Kyle Butt | 2016-08-24 | 3 | -54/+208
| | | | | | | | | | | | | | The cost of predicating a diamond is only the instructions that are not shared between the two branches. Additionally, if a predicate-clobbering instruction occurs in the shared portion of the branches (e.g. a cond move), it may still be possible to if-convert the sub-cfg. This change handles these two facts by rescanning the non-shared portion of a diamond sub-cfg to recalculate both the predication cost and whether both blocks are pred-clobbering. Fixed 2 bugs before recommitting. Branch instructions must be compared and found identical before diamond conversion. Also, predicate-clobbering instructions in the shared prefix disqualify a potential diamond conversion. Includes tests for both. llvm-svn: 279670
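The cost rule described above (only the instructions that differ between the two arms need predication) can be illustrated with a toy model. This is not the IfConversion pass: instructions are plain strings, predicate clobbering is ignored, and `split_diamond` and `predication_cost` are hypothetical names.

```python
def split_diamond(true_block, false_block):
    """Split two instruction lists into (shared prefix, true-only middle,
    false-only middle, shared suffix)."""
    limit = min(len(true_block), len(false_block))

    # Count identical instructions from the top of both blocks.
    prefix = 0
    while prefix < limit and true_block[prefix] == false_block[prefix]:
        prefix += 1

    # Count identical instructions from the bottom, without overlapping the prefix.
    suffix = 0
    while (suffix < limit - prefix
           and true_block[-1 - suffix] == false_block[-1 - suffix]):
        suffix += 1

    true_mid = true_block[prefix:len(true_block) - suffix]
    false_mid = false_block[prefix:len(false_block) - suffix]
    shared_tail = true_block[len(true_block) - suffix:]
    return true_block[:prefix], true_mid, false_mid, shared_tail


def predication_cost(true_block, false_block):
    """Number of instructions that would have to be predicated."""
    _, true_mid, false_mid, _ = split_diamond(true_block, false_block)
    return len(true_mid) + len(false_mid)
```

For two four-instruction arms that differ in a single subtraction, the model reports a cost of 2 rather than 8, which is the saving the rescan is meant to capture.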
* [StreamExecutor] Rename Executor to Device | Jason Henline | 2016-08-24 | 14 | -580/+575
| | | | | | | | | | | | Summary: This more clearly describes what the class is. Reviewers: jlebar Subscribers: jprice, parallel_libs-commits Differential Revision: https://reviews.llvm.org/D23851 llvm-svn: 279669
* Disable test under asan: it uses a lot of stack, and asan increases the | Richard Smith | 2016-08-24 | 1 | -4/+9
| | | | | | | | per-frame stack usage enough to cause it to hit our stack limit. This is not ideal; we should find a better way of dealing with this, such as increasing our stack allocation when built with ASan. llvm-svn: 279668
* PR29097: add an update record when we instantiate the default member | Richard Smith | 2016-08-24 | 8 | -6/+68
| | | | | | initializer of an imported field. llvm-svn: 279667
* [clang-tidy misc-move-const-arg] more specific messages + suggest alternative solution | Alexander Kornienko | 2016-08-24 | 2 | -10/+15
| | | | llvm-svn: 279666
* ARM: don't diagnose cbz/cbnz to Thumb functions. | Tim Northover | 2016-08-24 | 3 | -1/+17
| | | | | | | | A branch-distance to a Thumb function shouldn't be forced to be odd for CBZ/CBNZ instructions because (assuming it's within range), it's going to be a valid, even offset. llvm-svn: 279665
* [sanitizer] re-apply r279572 and r279595 reverted in r279643: change the 64-bit allocator to use a single array for free-d chunks instead of a lock-free linked list of transfer batches. | Kostya Serebryany | 2016-08-24 | 3 | -182/+144
| | | | This change simplifies the code, makes the allocator more 'hardened', and will allow simpler code to release RAM to OS. This may also slow down malloc stress tests due to lock contention, but I did not observe noticeable slowdown on various real multi-threaded benchmarks. llvm-svn: 279664
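To make the trade-off in this commit concrete, the following is a minimal sketch of a free-chunk store kept as one array behind a lock. It only illustrates the idea of a single locked array; it says nothing about the sanitizer allocator's real layout, and the class and method names are assumptions.

```python
import threading


class FreeArray:
    """Toy store of freed chunk addresses for one size class.

    All chunks live in one flat list; a single lock serializes access,
    which is where contention can show up under heavy malloc/free load.
    """

    def __init__(self):
        self._chunks = []
        self._lock = threading.Lock()

    def push_many(self, chunks):
        """Return a batch of freed chunks to the store."""
        with self._lock:
            self._chunks.extend(chunks)

    def pop_many(self, count):
        """Take up to `count` chunks from the end of the array."""
        with self._lock:
            if count <= 0:
                return []
            taken = self._chunks[-count:]
            del self._chunks[-count:]
            return taken
```

Because the free chunks sit contiguously at the end of one array, handing memory back or trimming it is a matter of slicing, at the price of taking the lock on every batch operation.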
* [Sema][Comments] Support @param with c++ 'using' keyword | Bruno Cardoso Lopes | 2016-08-24 | 2 | -6/+69
| | | | | | | | | | | | | Give appropriate warnings with -Wdocumentation for @param comments that refer to function aliases defined with 'using'. Very similar to typedef's behavior. This does not add support for TypeAliasTemplateDecl yet. Differential Revision: https://reviews.llvm.org/D23783 rdar://problem/27300695 llvm-svn: 279662
* [ubsan] fix the test to be more resistant against changes in the sanitizer allocator | Kostya Serebryany | 2016-08-24 | 1 | -0/+7
| | | | llvm-svn: 279661
* AMDGCN/SI: Implement readlane/readfirstlane intrinsics | Changpeng Fang | 2016-08-24 | 4 | -4/+91
| | | | | | | | | | | | | | | Summary: This patch implements readlane/readfirstlane intrinsics. TODO: need to define a new register class to consider the case that the source could be a vector register or M0. Reviewed by: arsenm and tstellarAMD Differential Revision: http://reviews.llvm.org/D22489 llvm-svn: 279660
* Clang-tidy documentation style. Two Google checks are aliases. | Eugene Zelenko | 2016-08-24 | 8 | -36/+29
| | | | | | Differential revision: https://reviews.llvm.org/D23815 llvm-svn: 279659
* [StreamExecutor] Fix allocateDeviceMemory | Jason Henline | 2016-08-24 | 2 | -2/+37
| | | | | | | | | | | | | | | | | | | Summary: The return value from PlatformExecutor::allocateDeviceMemory needs to be converted from Expected<GlobalDeviceMemoryBase> to Expected<GlobalDeviceMemory<T>> in Executor::allocateDeviceMemory. A similar bug is also fixed for Executor::allocateHostMemory. Thanks to jprice for identifying this bug. Reviewers: jprice, jlebar Subscribers: parallel_libs-commits Differential Revision: https://reviews.llvm.org/D23849 llvm-svn: 279658
* Add %loadPolly to test command line. | Michael Kruse | 2016-08-24 | 1 | -1/+1
| | | | | | Required for out-of-tree builds of Polly. llvm-svn: 279657
* amdgcn: Also correct get_local_size type for HSA | Matt Arsenault | 2016-08-24 | 1 | -5/+8
| | | | llvm-svn: 279656
* Use isTargetMachO instead of isTargetDarwin. | Rafael Espindola | 2016-08-24 | 2 | -1/+11
| | | | llvm-svn: 279655
* [StreamExecutor] Clean up device copy comments | Jason Henline | 2016-08-24 | 4 | -159/+2361
| | | | | | | | | | | | | | | Summary: Consolidate Executor::synchronousCopy* and Stream::thenCopy* methods into Doxygen method groups and combine all their comments into one section. Also add a "doc" target to the build files to use Doxygen to build the documentation. Reviewers: jlebar Subscribers: jprice, parallel_libs-commits Differential Revision: https://reviews.llvm.org/D23845 llvm-svn: 279654
* Fix offload bundler tests so that diagnostic can start with caps. | Samuel Antao | 2016-08-24 | 1 | -1/+1
| | | | | | Windows requires that. llvm-svn: 279653
* [X86][SSE] Add MINSD/MAXSD/MINSS/MAXSS intrinsic scalar load folding support | Simon Pilgrim | 2016-08-24 | 2 | -6/+50
| | | | | | These are no different in load behaviour to the existing ADD/SUB/MUL/DIV scalar ops but were missing from isNonFoldablePartialRegisterLoad llvm-svn: 279652
* DebugInfo: Add flag to CU to disable emission of inline debug info into the skeleton CU | David Blaikie | 2016-08-24 | 8 | -6/+24
| | | | | | In cases where .dwo/.dwp files are guaranteed to be available, skipping the extra online (in the .o file) inline info can save a substantial amount of space - see the original r221306 for more details there. llvm-svn: 279651
* DebugInfo: Add flag to CU to disable emission of inline debug info into the skeleton CU | David Blaikie | 2016-08-24 | 12 | -42/+105
| | | | | | In cases where .dwo/.dwp files are guaranteed to be available, skipping the extra online (in the .o file) inline info can save a substantial amount of space - see the original r221306 for more details there. llvm-svn: 279650
* [LV] Unify vector and scalar maps | Matthew Simpson | 2016-08-24 | 5 | -296/+325
| This patch unifies the data structures we use for mapping instructions from the original loop to their corresponding instructions in the new loop. Previously, we maintained two distinct maps for this purpose: WidenMap and ScalarIVMap. WidenMap maintained the vector values each instruction from the old loop was represented with, and ScalarIVMap maintained the scalar values each scalarized induction variable was represented with. With this patch, all values created for the new loop are maintained in VectorLoopValueMap.
|
| The change allows for several simplifications. Previously, when an instruction was scalarized, we had to insert the scalar values into vectors in order to maintain the mapping in WidenMap. Then, if a user of the scalarized value was also scalar, we had to extract the scalar values from the temporary vector we created. We now avoid these unnecessary scalar-to-vector-to-scalar conversions. If a scalarized value is used by a scalar instruction, the scalar value is used directly. However, if the scalarized value is needed by a vector instruction, we generate the needed insertelement instructions on-demand.
|
| A common idiom in several locations in the code (including the scalarization code) is to first get the vector values an instruction from the original loop maps to, and then extract a particular scalar value. This patch adds getScalarValue for this purpose alongside getVectorValue as an interface into VectorLoopValueMap. These functions work together to return the requested values if they're available or to produce them if they're not.
|
| The mapping has also been made less permissive. Entries can be added to VectorLoopValueMap with the new initVector and initScalar functions. getVectorValue has been modified to return a constant reference to the mapped entries.
|
| There's no real functional change with this patch; however, in some cases we will generate slightly different code. For example, instead of an insertelement sequence following the definition of an instruction, it will now precede the first use of that instruction. This can be seen in the test case changes.
|
| Differential Revision: https://reviews.llvm.org/D23169
|
| llvm-svn: 279649
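The interface described in this commit can be pictured with a small model. The sketch below borrows the names initVector, initScalar, getVectorValue and getScalarValue from the message (in snake_case), but everything else is an assumption: IR values are plain strings, the "instructions" are recorded in a list instead of being emitted, and the real class may cache or organize entries differently.

```python
class VectorLoopValueMap:
    """Toy model: one map holding both vector and scalar values per original value."""

    def __init__(self, unroll_factor, vector_width):
        self.uf = unroll_factor
        self.vf = vector_width
        self._vectors = {}   # key -> list of UF vector values
        self._scalars = {}   # key -> list of UF lists, each with VF scalar values
        self.emitted = []    # log of the instructions the model had to create

    def init_vector(self, key, parts):
        """Register the widened values for key; entries cannot be overwritten."""
        assert key not in self._vectors and len(parts) == self.uf
        self._vectors[key] = list(parts)

    def init_scalar(self, key, parts):
        """Register the scalarized values for key, one list of lanes per part."""
        assert key not in self._scalars and len(parts) == self.uf
        assert all(len(lanes) == self.vf for lanes in parts)
        self._scalars[key] = [list(lanes) for lanes in parts]

    def get_scalar_value(self, key, part, lane):
        """Use the scalar directly if it exists, otherwise extract it from the vector."""
        if key in self._scalars:
            return self._scalars[key][part][lane]
        vector = self._vectors[key][part]
        self.emitted.append(("extractelement", key, part, lane))
        return f"extract({vector}, {lane})"

    def get_vector_value(self, key):
        """Return the vector values, building them from scalars on first demand."""
        if key not in self._vectors:
            built = []
            for part, lanes in enumerate(self._scalars[key]):
                vector = "undef"
                for lane, scalar in enumerate(lanes):
                    vector = f"insert({vector}, {scalar}, {lane})"
                    self.emitted.append(("insertelement", key, part, lane))
                built.append(vector)
            self._vectors[key] = built
        return tuple(self._vectors[key])  # immutable view, like a const reference
```

In this model a scalar user of a scalarized value triggers no insert or extract at all, and the insertelement sequence appears only when a vector user first asks for the value, which matches the "precede the first use" behaviour mentioned in the message.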
* [AArch64] Adjust the feature set for Exynos M1. | Evandro Menezes | 2016-08-24 | 1 | -1/+2
| | | | | | Enable zero cycle zeroing. llvm-svn: 279648
* [SCCP] Don't delete side-effecting instructions | Sanjoy Das | 2016-08-24 | 2 | -22/+21
| | | | | | | | I'm not sure if the `!isa<CallInst>(Inst) && !isa<TerminatorInst>(Inst))` bit is correct either, but this fixes the case we know is broken. llvm-svn: 279647
* [X86][SSE] Add support for combining VZEXT_MOVL target shuffles | Simon Pilgrim | 2016-08-24 | 7 | -82/+63
| | | | | | | | Includes adding more general support for the pattern: VZEXT_MOVL(VZEXT_LOAD(ptr)) -> VZEXT_LOAD(ptr). This has unearthed a couple of latent poor codegen issues (MINSS/MAXSS scalar load folding and MOVDDUP/BROADCAST load folding patterns), which will be fixed shortly. It's also reduced a couple of tests so that they no longer reach the instruction threshold necessary to be combined to PSHUFB (see PR26183). llvm-svn: 279646
* amdgcn: Fix return type for get_global_size | Matt Arsenault | 2016-08-24 | 5 | -2/+24
| | | | llvm-svn: 279644
* Revert r279572 "[sanitizer] change the 64-bit..." because of failures on ubsan | Vitaly Buka | 2016-08-24 | 3 | -144/+182
| | | | | | This reverts commit r279572 and r279595. llvm-svn: 279643
* [Hexagon] Enable subregister liveness tracking | Krzysztof Parzyszek | 2016-08-24 | 1 | -1/+1
| | | | llvm-svn: 279642
* clang-offload-bundler: Update libdeps. | NAKAMURA Takumi | 2016-08-24 | 1 | -3/+1
| | | | llvm-svn: 279641
* [StreamExecutor] Executor add synchronous methods | Jason Henline | 2016-08-24 | 9 | -117/+1475
| | | | | | | | | | | | | | Summary: Add Executor methods that block the host until completion. Since these methods are host-synchronous, they don't require Stream arguments. Reviewers: jlebar Subscribers: jprice, parallel_libs-commits Differential Revision: https://reviews.llvm.org/D23577 llvm-svn: 279640
* fix typo "varaible" | Nico Weber | 2016-08-24 | 1 | -1/+1
| | | | llvm-svn: 279639
* fix typo "varaibles" | Nico Weber | 2016-08-24 | 1 | -1/+1
| | | | llvm-svn: 279638
* [Hexagon] Remove the utilization of IMPLICIT_DEFs from expand-condsets | Krzysztof Parzyszek | 2016-08-24 | 1 | -104/+1
| | | | | | | This is no longer necessary, because since r279625 the subregister liveness properly accounts for read-undefs. llvm-svn: 279637
* fix typo 'varaible' in assert | Nico Weber | 2016-08-24 | 1 | -1/+1
| | | | llvm-svn: 279636
* Add target REQUIRES directives to offload bundler test. | Samuel Antao | 2016-08-24 | 1 | -0/+3
| | | | llvm-svn: 279635
* [Driver][OpenMP][CUDA] Add capability to bundle object files in sections of the host binary format. | Samuel Antao | 2016-08-24 | 3 | -7/+331
| | | | | | | | | | | | | | | | | Summary: This patch adds the capability to bundle object files in sections of the host binary using a designated naming convention for these sections. This patch uses the functionality of the object reader already in the LLVM library to read bundled files, and invokes clang with the incremental linking options to create bundle files. Bundling files involves creating an IR file with the contents of the bundle assigned as initializers of globals bound to the designated sections. This way the bundling implementation is agnostic of the host object format. The features added by this patch were requested in the RFC discussion in http://lists.llvm.org/pipermail/cfe-dev/2016-February/047547.html. Reviewers: echristo, tra, jlebar, hfinkel, ABataev, Hahnfeld Subscribers: mkuron, whchung, cfe-commits, andreybokhanko, Hahnfeld, arpith-jacob, carlo.bertolli, mehdi_amini, caomhin Differential Revision: https://reviews.llvm.org/D21851 llvm-svn: 279634
* GlobalISel: fix cmp test to be in SSA form | Tim Northover | 2016-08-24 | 1 | -19/+20
| | | | llvm-svn: 279633