path: root/llvm/lib
Commit message, Author, Age, Files, Lines
...
* Include X86CallFrameOptimization in the opt-bisect process.Andrew Kaylor2016-08-181-1/+1
| | | | | | Differential Revision: https://reviews.llvm.org/D23683 llvm-svn: 279175
* AArch64: remove extraneous paddingSaleem Abdulrasool2016-08-181-3/+3
| | | | | | | | | | The structs BarrierOp, PrefetchOp, PSBHintOp are in AArch64AsmParser.cpp (inside anonymous namespace). This diff changes the order of fields and removes the excessive padding (8 bytes). Patch by Alexander Shaposhnikov! llvm-svn: 279173
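The padding removal described above can be illustrated with a minimal sketch. The two structs below are hypothetical (their field names are not taken from AArch64AsmParser.cpp); they only show how placing pointer-sized fields first can remove alignment padding, assuming a typical 64-bit target with 8-byte pointers and 4-byte `unsigned`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical layout: each 4-byte field is followed by 4 bytes of padding
// so that the next pointer (or the struct end) is 8-byte aligned.
struct PaddedOp {
  const char *Data;
  unsigned Length;
  const void *Ctx;
  unsigned Val;
};

// Same fields, reordered so the two 4-byte fields share one 8-byte slot.
struct PackedOp {
  const char *Data;
  const void *Ctx;
  unsigned Length;
  unsigned Val;
};

// Number of bytes saved by the reordering on the current target.
constexpr std::size_t savedBytes() {
  return sizeof(PaddedOp) - sizeof(PackedOp);
}
```

On a typical LP64 target this sketch shrinks the struct from 32 to 24 bytes; on targets where pointers and `unsigned` have the same size, the two layouts are expected to be equal.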
* CodeGen: Add/Factor out LiveRegUnits class; NFCIMatthias Braun2016-08-183-60/+107
| | | | | | | | | | | | | This is a set of register units intended to track register liveness; it is similar in spirit to LivePhysRegs. You can also think of this as the liveness tracking parts of the RegisterScavenger factored out into its own class. This was proposed in http://llvm.org/PR27609 Differential Revision: http://reviews.llvm.org/D21916 llvm-svn: 279171
* CodeGen: If Convert blocks that would form a diamond when tail-merged.Kyle Butt2016-08-181-78/+287
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The following function currently relies on tail-merging for if conversion to succeed. The common tail of cond_true and cond_false is extracted, and this then forms a diamond pattern that can be successfully if converted. If this block does not get extracted, either because tail-merging is disabled or the threshold is higher, we should still recognize this pattern and if-convert it. Fixed a regression in the original commit. Need to un-reverse branches after reversing them, or other conversions go awry. Regression on self-hosting bots with no obvious explanation. Tidied up range handling to be more obviously correct, but there was no smoking gun. define i32 @t2(i32 %a, i32 %b) nounwind { entry: %tmp1434 = icmp eq i32 %a, %b ; <i1> [#uses=1] br i1 %tmp1434, label %bb17, label %bb.outer bb.outer: ; preds = %cond_false, %entry %b_addr.021.0.ph = phi i32 [ %b, %entry ], [ %tmp10, %cond_false ] %a_addr.026.0.ph = phi i32 [ %a, %entry ], [ %a_addr.026.0, %cond_false ] br label %bb bb: ; preds = %cond_true, %bb.outer %indvar = phi i32 [ 0, %bb.outer ], [ %indvar.next, %cond_true ] %tmp. = sub i32 0, %b_addr.021.0.ph %tmp.40 = mul i32 %indvar, %tmp. 
%a_addr.026.0 = add i32 %tmp.40, %a_addr.026.0.ph %tmp3 = icmp sgt i32 %a_addr.026.0, %b_addr.021.0.ph br i1 %tmp3, label %cond_true, label %cond_false cond_true: ; preds = %bb %tmp7 = sub i32 %a_addr.026.0, %b_addr.021.0.ph %tmp1437 = icmp eq i32 %tmp7, %b_addr.021.0.ph %indvar.next = add i32 %indvar, 1 br i1 %tmp1437, label %bb17, label %bb cond_false: ; preds = %bb %tmp10 = sub i32 %b_addr.021.0.ph, %a_addr.026.0 %tmp14 = icmp eq i32 %a_addr.026.0, %tmp10 br i1 %tmp14, label %bb17, label %bb.outer bb17: ; preds = %cond_false, %cond_true, %entry %a_addr.026.1 = phi i32 [ %a, %entry ], [ %tmp7, %cond_true ], [ %a_addr.026.0, %cond_false ] ret i32 %a_addr.026.1 } Without tail-merging or diamond-tail if conversion: LBB1_1: @ %bb @ =>This Inner Loop Header: Depth=1 cmp r0, r1 ble LBB1_3 @ BB#2: @ %cond_true @ in Loop: Header=BB1_1 Depth=1 subs r0, r0, r1 cmp r1, r0 it ne cmpne r0, r1 bgt LBB1_4 LBB1_3: @ %cond_false @ in Loop: Header=BB1_1 Depth=1 subs r1, r1, r0 cmp r1, r0 bne LBB1_1 LBB1_4: @ %bb17 bx lr With diamond-tail if conversion, but without tail-merging: @ BB#0: @ %entry cmp r0, r1 it eq bxeq lr LBB1_1: @ %bb @ =>This Inner Loop Header: Depth=1 cmp r0, r1 ite le suble r1, r1, r0 subgt r0, r0, r1 cmp r1, r0 bne LBB1_1 @ BB#2: @ %bb17 bx lr llvm-svn: 279168
* IfConversion: Rescan diamonds.Kyle Butt2016-08-181-34/+115
| | | | | | | | | | | The cost of predicating a diamond is only the instructions that are not shared between the two branches. Additionally, if a predicate-clobbering instruction occurs in the shared portion of the branches (e.g. a cond move), it may still be possible to if-convert the sub-cfg. This change handles these two facts by rescanning the non-shared portion of a diamond sub-cfg to recalculate both the predication cost and whether both blocks are pred-clobbering. llvm-svn: 279167
* IfConversion: Handle inclusive ranges more carefully.Kyle Butt2016-08-181-22/+56
| | | | | | | | | | | This may affect calculations for thresholds, but is not a significant change in behavior. The problem was that an inclusive range must have an additional flag to show that it is empty, because otherwise begin == end implies that the range has one element, and it may not be possible to move past on either side. llvm-svn: 279166
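The inclusive-range problem described above can be sketched as follows. This is a hypothetical helper over indices, not the IfConversion code: with both ends inclusive, `First == Last` already means "one element", so emptiness has to be carried in a separate flag.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical inclusive index range [First, Last] with an explicit
// emptiness flag; the names are illustrative only.
struct InclusiveRange {
  std::size_t First = 0;
  std::size_t Last = 0;
  bool Empty = true;

  // Without the flag, First == Last could not distinguish 0 from 1 element.
  std::size_t size() const { return Empty ? 0 : Last - First + 1; }
};

// Convert a half-open range [Begin, End) into the inclusive form.
inline InclusiveRange makeRange(std::size_t Begin, std::size_t End) {
  InclusiveRange R;
  if (Begin < End) {
    R.First = Begin;
    R.Last = End - 1;
    R.Empty = false;
  }
  return R;
}
```

The flag also avoids computing `End - 1` or `Begin + 1` when there is nothing on either side to step to, which is the situation the commit message describes.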
* [SystemZ] Use valid base/index regs for inline asmZhan Jun Liau2016-08-181-0/+23
| | | | | | | | | | | | | | | Summary: Inline asm memory constraints can have the base or index register be assigned to %r0 right now. Make sure that we assign only ADDR64 registers to the base and index. Reviewers: uweigand Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23367 llvm-svn: 279157
* [InstCombine] add helper function for folds of icmp (shl 1, Y), C; NFCISanjay Patel2016-08-181-62/+65
| | | | | | | | | | | | | Clean up the existing code by: 1. Renaming variables 2. Adding local variables 3. Making it vector-safe This is still guarded by a ConstantInt check, so no functional change is intended. But this should be ready to go: if we move the ConstantInt check down, all of these folds should do the right thing for vector types. llvm-svn: 279150
* [libFuzzer] add more __attribute__((visibility("default")))Kostya Serebryany2016-08-181-0/+2
| | | | llvm-svn: 279143
* Make ctlz and cttz zero undef when the operand cannot be zero in InstCombineAmaury Sechet2016-08-181-5/+20
| | | | | | | | | | | | Summary: Also add popcount(n) == bitsize(n) -> n == -1 transformation. Reviewers: majnemer, spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23134 llvm-svn: 279141
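The popcount transformation mentioned above rests on a simple identity: a value has as many set bits as its width exactly when every bit is set. A minimal sketch that checks this exhaustively for 16-bit values (the function names are illustrative, not InstCombine's):

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>

// Left-hand side of the fold: popcount(n) == bitsize(n), for 16-bit n.
inline bool popcountIsBitWidth(std::uint16_t N) {
  return std::bitset<16>(N).count() == 16;
}

// Right-hand side of the fold: n == -1 (all bits set).
inline bool isAllOnes(std::uint16_t N) {
  return N == static_cast<std::uint16_t>(-1);
}

// Check that both sides agree on every 16-bit value.
inline bool foldHoldsForAll16BitValues() {
  for (std::uint32_t I = 0; I <= 0xFFFF; ++I) {
    std::uint16_t N = static_cast<std::uint16_t>(I);
    if (popcountIsBitWidth(N) != isAllOnes(N))
      return false;
  }
  return true;
}
```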
* [InstCombine] use m_APInt to allow icmp (trunc X, Y), C folds for splat ↵Sanjay Patel2016-08-181-9/+4
| | | | | | | | | | | | | | constant vectors This is a sibling of: https://reviews.llvm.org/rL278859 https://reviews.llvm.org/rL278935 https://reviews.llvm.org/rL278945 https://reviews.llvm.org/rL279066 https://reviews.llvm.org/rL279077 https://reviews.llvm.org/rL279101 llvm-svn: 279133
* [InstCombine] clean up foldICmpTruncConstant(); NFCISanjay Patel2016-08-181-14/+17
| | | | | | | 1. Fix variable names 2. Add local variables to reduce code llvm-svn: 279132
* [SelectionDAG] Rename fextend -> fpextend, fround -> fpround, frnd -> froundMichael Kuperstein2016-08-1821-109/+109
| | | | | | | | | | The names of the tablegen defs now match the names of the ISD nodes. This makes the world a slightly saner place, as previously "fround" matched ISD::FP_ROUND and not ISD::FROUND. Differential Revision: https://reviews.llvm.org/D23597 llvm-svn: 279129
* AMDGPU : Fix QSAD and MQSAD instructions' incorrect data type.Wei Ding2016-08-183-2/+3
| | | | | | Differential Revision: http://reviews.llvm.org/D23689 llvm-svn: 279126
* [SLP] Initialize VectorizedValue when gatheringMatthew Simpson2016-08-181-8/+66
| | | | | | | | | | | | | | | We abort building vectorizable trees in some cases (e.g., if the maximum recursion depth is reached, if the region size is too large, etc.). If this happens for a reduction, we can be left with a root entry that needs to be gathered. For these cases, we need to make sure we actually set VectorizedValue to the resulting vector. This patch ensures we properly set VectorizedValue, and it also ensures the insertelement sequence generated for the gathers is inserted at the correct location. Reference: https://llvm.org/bugs/show_bug.cgi?id=28330 Differential Revision: https://reviews.llvm.org/D23410 llvm-svn: 279125
* RegScavenging: Add scavengeRegisterBackwards()Matthias Braun2016-08-182-108/+243
| | | | | | | | | | | | | | | | Re-apply r276044 with off-by-1 instruction fix for the reload placement. This is a variant of scavengeRegister() that works for enterBasicBlockEnd()/backward(). The benefit of the backward mode is that it is not affected by incomplete kill flags. This patch also changes PrologEpilogInserter::doScavengeFrameVirtualRegs() to use the register scavenger in backwards mode. Differential Revision: http://reviews.llvm.org/D21885 llvm-svn: 279124
* Branch Folding: Accept explicit threshold for tail merge size.Kyle Butt2016-08-183-22/+44
| | | | | | | | This is prep work for allowing the threshold to be different during layout, and to enforce a single threshold between merging and duplicating during layout. No observable change intended. llvm-svn: 279117
* Add a version of Intrinsic::getName which is more efficient when there are ↵Pete Cooper2016-08-181-0/+5
| | | | | | | | | | | | | no overloads. When running 'opt -O2 verify-uselistorder-nodbg.lto.bc', there are 33m allocations. 8.2m come from std::string allocations in Intrinsic::getName(). Turns out this method only returns a std::string because it needs to handle overloads, but that is not the common case. This adds an overload of getName which just returns a StringRef when there are no overloads and so saves on the allocations. llvm-svn: 279113
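The idea behind the allocation-free overload can be sketched with standard C++17 types. Everything below is hypothetical (the `sketch` namespace, the name table, the enum), and `std::string_view` stands in for LLVM's `StringRef`; it is not the actual `Intrinsic::getName` code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

namespace sketch {

// Hypothetical intrinsic IDs and a static table of their base names.
enum class ID : std::size_t { trap = 0, ctlz = 1 };
constexpr std::string_view NameTable[] = {"llvm.trap", "llvm.ctlz"};

// Non-overloaded case: return a view into the static table, no allocation.
inline std::string_view getName(ID Id) {
  return NameTable[static_cast<std::size_t>(Id)];
}

// Overloaded case: the type suffix must be appended, so a std::string
// has to be built.
inline std::string getName(ID Id, std::string_view TypeSuffix) {
  std::string Result(NameTable[static_cast<std::size_t>(Id)]);
  Result += '.';
  Result.append(TypeSuffix.data(), TypeSuffix.size());
  return Result;
}

} // namespace sketch
```

Callers that only need the base name take the first overload and pay nothing; only callers that need a mangled name such as `llvm.ctlz.i32` build a string.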
* [AMDGPU] add s_incperflevel/s_decperflevel intrinsics.Valery Pykhtin2016-08-181-2/+12
| | | | | | Differential revision: https://reviews.llvm.org/D23666 llvm-svn: 279106
* Fix SystemZ compilation abort caused by negative AND maskElliot Colp2016-08-182-34/+34
| | | | | | | | | | Normally, when an AND with a constant is lowered to NILL, the constant value is truncated to 16 bits. However, since r274066, ANDs whose results are used in a shift are caught by a different pattern that does not truncate. The instruction printer expects a 16-bit unsigned immediate operand for NILL, so this results in an abort. This patch adds code to manually truncate the constant in this situation. The rest of the bits are then set, so we will detect a case for NILL "naturally" rather than using peephole optimizations. Differential Revision: http://reviews.llvm.org/D21854 llvm-svn: 279105
* AArch64: Don't call getIterator() on iteratorsDuncan P. N. Exon Smith2016-08-181-2/+1
| | | | | | | | | | | | | | | | | Remove an unnecessary round-trip: iterator => operator->() => getIterator() In some cases, the iterator is end(), so the dereference of operator-> is invalid (UB). The testcase only crashes with r278974 (currently reverted to investigate this), which adds an assertion for invalid dereferences of ilist nodes. Fixes PR29035. llvm-svn: 279104
* [LLVM] Fix some Clang-tidy modernize-use-using and Include What You Use warningsEugene Zelenko2016-08-184-38/+48
| | | | | | Differential revision: https://reviews.llvm.org/D23675 llvm-svn: 279102
* [InstCombine] use m_APInt to allow icmp (udiv X, Y), C folds for splat ↵Sanjay Patel2016-08-181-18/+20
| | | | | | | | | | | | | constant vectors This is a sibling of: https://reviews.llvm.org/rL278859 https://reviews.llvm.org/rL278935 https://reviews.llvm.org/rL278945 https://reviews.llvm.org/rL279066 https://reviews.llvm.org/rL279077 llvm-svn: 279101
* [WebAssembly] Disable the store-results optimization.Dan Gohman2016-08-181-20/+0
| | | | | | | | | | The WebAssembly spec is removing the return value from store instructions, so remove the associated optimization from LLVM. This patch leaves the store instruction operands in place for now, so stores now always write to "$drop"; these will be removed in a separate patch. llvm-svn: 279100
* [Assumptions] Make collecting ephemeral values not quadratic in theChandler Carruth2016-08-181-23/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | number of assume intrinsics. The classical way to have a cache-friendly vector style container when we need queue semantics for BFS instead of stack semantics for DFS is to use an ever-growing vector and an index. Erasing from the front requires O(size) work, and unless we expect the worklist to grow *very* large, it's probably cheaper to just grow and race down the list. But that makes it worse that we're putting the assume intrinsics in this at all. We end up looking at the (by definition empty) use list to see if they're ephemeral (when we've already put them in that set), etc. Instead, directly populate the worklist with the operands when we mark the assume intrinsics as ephemeral. Also, test the visited set *before* putting things into the worklist so we don't accumulate the same value in the list 100s of times. It would be nice to use a set-vector for this but I think it's useful to test the set earlier to avoid repeatedly querying whether the same instruction is safe to speculate. Hopefully with these changes the number of values pushed onto the worklist is smaller, and we avoid quadratic work by letting it grow as necessary. Differential Revision: https://reviews.llvm.org/D23396 llvm-svn: 279099
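The worklist pattern described above (an ever-growing vector plus an index, seeded with the operands of the roots, with the visited set tested before pushing) can be sketched as below. This is a simplified illustration on a hypothetical integer graph, not the ephemeral-value code itself; in particular it omits the per-value checks the real analysis performs.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Operands[N] lists the nodes that node N uses (hypothetical adjacency list).
// Returns the nodes reached from the roots' operands, in BFS order.
inline std::vector<int>
collectReachable(const std::vector<std::vector<int>> &Operands,
                 const std::vector<int> &Roots) {
  // The roots are marked visited up front and never enter the worklist.
  std::unordered_set<int> Visited(Roots.begin(), Roots.end());
  std::vector<int> Worklist;

  // Seed the worklist with the roots' operands, testing the visited set
  // before each push so no value is queued twice.
  for (int R : Roots)
    for (int Op : Operands[R])
      if (Visited.insert(Op).second)
        Worklist.push_back(Op);

  // Queue semantics without erasing from the front: walk an index over a
  // vector that keeps growing behind it.
  for (std::size_t I = 0; I < Worklist.size(); ++I) {
    int N = Worklist[I]; // copy, since push_back may reallocate
    for (int Op : Operands[N])
      if (Visited.insert(Op).second)
        Worklist.push_back(Op);
  }
  return Worklist;
}
```

Because each node is pushed at most once, the total work is linear in the number of nodes and edges, at the cost of keeping already-processed entries in the vector.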
* Fix -Wpessimizing-move error, NFCVedant Kumar2016-08-181-1/+1
| | | | llvm-svn: 279095
* [InstCombine] clean up foldICmpUDivConstant; NFCSanjay Patel2016-08-181-16/+12
| | | | | | | 1. Better variable names 2. Remove unnecessary check of ConstantInt llvm-svn: 279094
* Resubmit "Write the TPI stream from a PDB to Yaml."Zachary Turner2016-08-186-92/+144
| | | | | | | | The original patch was breaking some buildbots due to an incorrect ordering of function definitions which caused some compilers to recognize a definition but others to not. llvm-svn: 279089
* CVP. Turn marking adds as no wrap (introduced by r278107) off by defaultArtur Pilipenko2016-08-181-0/+5
| | | | | | It causes a regression on our internal benchmark. Introduce cvp-dont-process flag and set it off by default while investigating the regression. llvm-svn: 279082
* [AArch64][GlobalISel] Select floating-point binary ops.Ahmed Bougacha2016-08-182-0/+38
| | | | | | There is no FREM instruction, but the others are straightforward. llvm-svn: 279081
* [IRCE] Switch over to LLVM_DUMP_METHOD. NFCI.Davide Italiano2016-08-181-2/+1
| | | | llvm-svn: 279079
* [InstCombine] use m_APInt to allow icmp (mul X, Y), C folds for splat ↵Sanjay Patel2016-08-181-18/+14
| | | | | | | | | | | | constant vectors This is a sibling of: https://reviews.llvm.org/rL278859 https://reviews.llvm.org/rL278935 https://reviews.llvm.org/rL278945 https://reviews.llvm.org/rL279066 llvm-svn: 279077
* [WebAssembly] Refactor WebAssemblyLowerEmscriptenException pass for ↵Derek Schuff2016-08-184-124/+182
| | | | | | | | | | | | | | | | | | | | | | setjmp/longjmp This patch changes the code structure of WebAssemblyLowerEmscriptenException pass to support both exception handling and setjmp/longjmp. It also changes the name of the pass and the source file. 1. Change the file/pass name to WebAssemblyLowerEmscriptenExceptions -> WebAssemblyLowerEmscriptenEHSjLj to make it clear that it supports both EH and SjLj 2. List function / global variable names at the top so they can be changed easily 3. Some cosmetic changes Patch by Heejin Ahn Differential Revision: https://reviews.llvm.org/D23588 llvm-svn: 279075
* [AArch64][GlobalISel] Select G_SDIV/G_UDIV.Ahmed Bougacha2016-08-182-1/+11
| | | | | | | | There is no REM instruction; that will require an expansion. It's not obvious that should be done in select, rather than as a (custom?) legalization. llvm-svn: 279074
* [InstCombine] use APInt in isSignTest instead of ConstantInt; NFCSanjay Patel2016-08-181-6/+7
| | | | | | | This will enable vector splat folding, but NFC until the callers have their ConstantInt restrictions removed. llvm-svn: 279072
* fix typo; NFCSanjay Patel2016-08-181-1/+1
| | | | llvm-svn: 279068
* [Hexagon] Create vcombine in HexagonCopyToCombineKrzysztof Parzyszek2016-08-181-18/+56
| | | | llvm-svn: 279067
* [InstCombine] use m_APInt to allow icmp (xor X, Y), C folds for splat ↵Sanjay Patel2016-08-181-13/+10
| | | | | | | | | | | constant vectors This is a sibling of: https://reviews.llvm.org/rL278859 https://reviews.llvm.org/rL278935 https://reviews.llvm.org/rL278945 llvm-svn: 279066
* [mips] Correct tail call encoding for MIPSR6Simon Dardis2016-08-189-41/+31
| | | | | | | | | | | | | r277708 enabled tail calls for MIPS but used the 'jr' instruction when the jump target was held in a register. For MIPSR6, 'jalr $zero, $reg' should have been used. Additionally, add missing patterns for external and global symbols for tail calls. Reviewers: dsanders, vkalintiris Differential Review: https://reviews.llvm.org/D23301 llvm-svn: 279064
* (Trivial) TargetPassConfig: assert when TargetMachine has no MCAsmInfoAlex Bradbury2016-08-181-1/+3
| | | | | | | | | | | | | | | Summary: This is pretty trivial, but I thought it was worth just checking that nobody feels it's completely the wrong thing to be doing. The motivation is that when starting a new backend, you often start with a minimal stub, pretty much just FooTargetMachine and FooTargetInfo. Once that's built, you might naturally try `llc -march=foo myinput.ll` and it seems more developer-friendly if this ends up asserting due to the lack of MCAsmInfo with an informative message rather than just segfaulting. Reviewers: MatzeB, chandlerc Subscribers: bogner, llvm-commits Differential Revision: https://reviews.llvm.org/D23443 llvm-svn: 279061
* Remove trailing whitespaceSimon Pilgrim2016-08-181-9/+9
| | | | llvm-svn: 279054
* Revert r279016 -- it breaks win32-elf JIT tests.Lang Hames2016-08-181-2/+2
| | | | llvm-svn: 279029
* [sanitizer-coverage/libFuzzer] instrument comparisons with ↵Kostya Serebryany2016-08-183-11/+74
| | | | | | __sanitizer_cov_trace_cmp[1248] instead of __sanitizer_cov_trace_cmp, don't pass the comparison type to save a bit of performance. Use these new callbacks in libFuzzer llvm-svn: 279027
* TailDuplicator: Fix crash after r278974Matthias Braun2016-08-181-1/+1
| | | | | | | | Some inputs would crash after r278974 without this fix (see http://lab.llvm.org:8080/green/job/clang-stage2-cmake-RgSan_build/2733/console for an example) llvm-svn: 279022
* [LTO] Promote before performing weak resolutionMehdi Amini2016-08-181-2/+2
| | | | | | | | | | | | | | | | Summary: This was reversed compared to ThinLTOCodeGenerator for some reason, and led to an increased code-size on my tests. I figured that the weak resolution may internalize a linkonce function, which will be promoted immediately (and renamed), before being internalized again. Reviewers: tejohnson Subscribers: pcc, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23632 llvm-svn: 279021
* [asan] Add support of lifetime poisoning into ComputeASanStackFrameLayoutVitaly Buka2016-08-182-4/+14
| | | | | | | | | | | | | | | Summary: We are going to combine poisoning of red zones and scope poisoning. PR27453 Reviewers: kcc, eugenis Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23623 llvm-svn: 279020
* [RuntimeDyld] Strip leading '_' from symbols on 32-bit windows inLang Hames2016-08-181-2/+2
| | | | | | | | | | | RTDyldMemoryManager::getSymbolAddressInProcess() This should allow JIT'd code for win32 to find in-process symbols. See http://llvm.org/PR28699 . Patch by James Holderness. Thanks James! llvm-svn: 279016
* [LTO] Change addSaveTemps API: do not add dot to the supplied prefix pathMehdi Amini2016-08-181-5/+3
| | | | | | | | | | | | | | | | Summary: It does not play well with directories (end up with a bunch of hidden files). Also, do not strip the 0 suffix for the first task, especially since 0 can be used by ThinLTO as well now. Reviewers: tejohnson Subscribers: mehdi_amini, pcc, llvm-commits Differential Revision: https://reviews.llvm.org/D23612 llvm-svn: 279014
* [WebAssembly] Handle debug information and virtual registers without ↵Dominic Chen2016-08-173-3/+5
| | | | | | | | | | | | | | crashing (reland r278967) Summary: Currently, enabling debug information when compiling for WebAssembly crashes the backend. This commit fixes these by skipping debug values in backend passes. Reviewers: jfb, aprantl, dschuff, echristo Subscribers: llvm-commits, dschuff, jfb, MatzeB, dexonsmith, yurydelendik, mehdi_amini Differential Revision: https://reviews.llvm.org/D23635 llvm-svn: 279011
* [libFuzzer] force proper popcnt instructionKostya Serebryany2016-08-172-1/+3
| | | | llvm-svn: 279002