path: root/llvm
Commit message (Author, Age, Files, Lines)
* [SimplifyCFG] Rewrite SinkThenElseCodeToEnd (James Molloy, 2016-08-19, 5 files, -153/+416)
| The new version has several advantages:
|   1) IMSHO it's more readable and neater
|   2) It handles loads and stores properly
|   3) It can handle any number of incoming blocks rather than just two. I'll be taking advantage of this in a followup patch.
|
| With this change we can now finally sink load-modify-store idioms such as:
|
|     if (a)
|       return *b += 3;
|     else
|       return *b += 4;
|
|   =>
|
|     %z = load i32, i32* %y
|     %.sink = select i1 %a, i32 3, i32 4
|     %b = add i32 %z, %.sink
|     store i32 %b, i32* %y
|     ret i32 %b
|
| When this works for switches it'll be even more powerful.
|
| llvm-svn: 279229
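The load-modify-store idiom in the SimplifyCFG entry can be illustrated at the source level. The sketch below is not taken from the patch: `addBranchy` and `addSunk` are hypothetical names, and `addSunk` is a hand-written approximation of the sunk form, in which the only per-branch difference (the constant) becomes a select and the load, add, store, and return are shared.

```cpp
#include <cassert>

// Source-level shape of the idiom: each arm loads *b, adds a constant,
// stores the result back, and returns it.
int addBranchy(bool a, int *b) {
  if (a)
    return *b += 3;
  else
    return *b += 4;
}

// Hand-written equivalent of the sunk form. Only the constant differs
// between the arms, so it is chosen first and everything else is shared.
int addSunk(bool a, int *b) {
  int z = *b;            // corresponds to the single load
  int sink = a ? 3 : 4;  // corresponds to the select
  int result = z + sink; // corresponds to the shared add
  *b = result;           // corresponds to the single store
  return result;
}
```

Both functions should leave `*b` in the same state and return the same value for either value of `a`, which is the property the transformation relies on.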
* [PM] Fix a compile error with GCC. NFC. (Chandler Carruth, 2016-08-19, 1 file, -2/+2)
| llvm-svn: 279228
* [PM] Make the new pass manager support fully generic extra arguments (Chandler Carruth, 2016-08-19, 5 files, -51/+151)
| to run methods, both for transform passes and analysis passes.
|
| This also allows the analysis manager to use a different set of extra arguments from the pass manager where useful. Consider passes over analysis produced units of IR like SCCs of the call graph or loops. Passes of this nature will often want to refer to the analysis result that was used to compute their IR units (the call graph or LoopInfo). And for transformations, they may want to communicate special update information to the outer pass manager.
|
| With this change, it becomes possible to have a run method for a loop pass that looks more like:
|
|     PreservedAnalyses run(Loop &L, AnalysisManager<Loop, LoopInfo> &AM,
|                           LoopInfo &LI, LoopUpdateRecord &UR);
|
| And to query the analysis manager like:
|
|     AM.getResult<MyLoopAnalysis>(L, LI);
|
| This makes accessing the known-available analyses convenient and clear, and it makes passing customized data structures around easy.
|
| My initial use case is going to be in updating the pass manager layers when the analysis units of IR change. But there are more use cases here such as having a layer that lets inner passes signal whether certain additional passes should be run because of particular simplifications made. Two desires for this have come up in the past: triggering additional optimization after successfully unrolling loops, and triggering additional inlining after collapsing indirect calls to direct calls.
|
| Despite adding this layer of generic extensibility, the *only* changes to existing, simple usage are for places where we forward declare the AnalysisManager template. We really shouldn't be doing this because of the fragility exposed here, but currently it makes coping with the legacy PM code easier.
|
| Differential Revision: http://reviews.llvm.org/D21462
|
| llvm-svn: 279227
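The idea of forwarding extra arguments to every run method can be sketched with a variadic class template. Everything below is a toy under stated assumptions: `ToyPassManager`, `Loop`, `LoopInfo`, `UpdateRecord`, and `runExample` are hypothetical stand-ins, not LLVM's real classes or signatures.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

namespace sketch {
// Hypothetical stand-ins; none of these are LLVM's real types.
struct Loop { int Depth; };
struct LoopInfo { int NumLoops; };
struct UpdateRecord { int Visited = 0; };

// A toy pass manager parameterized on the IR unit and on any number of
// extra argument types, which are handed to every pass when it runs.
template <typename IRUnitT, typename... ExtraArgTs>
class ToyPassManager {
public:
  using PassFn = std::function<void(IRUnitT &, ExtraArgTs...)>;

  void addPass(PassFn P) { Passes.push_back(std::move(P)); }

  // Runs each pass in order, passing the same extra arguments to all.
  void run(IRUnitT &IR, ExtraArgTs... Args) {
    for (PassFn &P : Passes)
      P(IR, Args...);
  }

private:
  std::vector<PassFn> Passes;
};

// Builds a two-pass pipeline over a Loop with two extra arguments and
// returns the value the passes accumulated in the update record.
inline int runExample() {
  ToyPassManager<Loop, LoopInfo &, UpdateRecord &> PM;
  PM.addPass([](Loop &L, LoopInfo &LI, UpdateRecord &UR) {
    UR.Visited += L.Depth + LI.NumLoops;
  });
  PM.addPass([](Loop &, LoopInfo &, UpdateRecord &UR) { ++UR.Visited; });

  Loop L{2};
  LoopInfo LI{3};
  UpdateRecord UR;
  PM.run(L, LI, UR);
  return UR.Visited;
}
} // namespace sketch
```

Passing the extra arguments as reference types lets an inner pass report updates to the caller through the record, which is the kind of communication the commit message describes.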
* [PM] Try to work-around what appears to be an MSVC SFINAE issue with (Chandler Carruth, 2016-08-19, 1 file, -1/+10)
| r279217 where it fails to select the path that other compilers select.
|
| The workaround won't be as careful to produce an error when an analysis result is incorrect, but we can rely on non-MSVC builds to catch such errors it seems and MSVC doesn't seem to support the alternative techniques.
|
| Hoping this brings the windows bots back to life. If not, will have to revert all of this.
|
| llvm-svn: 279225
* [CodeGen] Fix a trivial type conversion bug dating back to pre-2008 (James Molloy, 2016-08-19, 2 files, -1/+16)
| The heuristic above this code is incredibly suspect, but disregarding that, it mutates the cast opcode, so we need to check the *mutated* opcode later to see if we need to emit an AssertSext or AssertZext node.
|
| Fixes PR29041.
|
| llvm-svn: 279223
* [asan] Fix size of shadow incorrectly calculated in r279178 (Vitaly Buka, 2016-08-19, 2 files, -4/+3)
| Summary: r279178 generates 8 times more stores than necessary.
|
| Reviewers: eugenis
|
| Subscribers: llvm-commits
|
| Differential Revision: https://reviews.llvm.org/D23708
|
| llvm-svn: 279222
* [PM] NFC refactoring: remove the AnalysisManagerBase class, folding it (Chandler Carruth, 2016-08-19, 1 file, -156/+95)
| into the AnalysisManager class template.
|
| Back when I first added this base class there were separate analysis managers and some plausible reason why it would be a useful factoring of common code between them. However, after a lot of refactoring and cleaning, we now have *entirely* shared code.
|
| The base class was just an arbitrary division between code in one class template and a separate class template. It didn't add anything and forced lots of indirection through "derived_this" for no real gain.
|
| We can always factor a base CRTP class out with common code if there is ever some *other* analysis manager that wants to share a subset of logic. But for now, folding things into the primary template is a non-trivial simplification with no down sides I see. It shortens the code considerably, removes an unhelpful abstraction, and will make subsequent patches, which enhance the analysis manager infrastructure to effectively cope with invalidation, *dramatically* less complex.
|
| llvm-svn: 279221
* [modules] Add missing include. (Vassil Vassilev, 2016-08-19, 1 file, -0/+1)
| llvm-svn: 279219
* [PM] Redesign how the new PM detects whether an analysis result provides (Chandler Carruth, 2016-08-19, 3 files, -10/+20)
| its own invalidate method.
|
| Previously, the technique would assume that if a result had an invalidate method that didn't exactly match the expected signature, it didn't have one at all. This is in fact not the case. And we had analyses with incorrect signatures for the invalidate method in the tree that would be erroneously invalidated in certain cases! Yikes.
|
| Moreover a result might legitimately want to have multiple overloads for the invalidate method, and if one changes or a new one is needed we again really want a compiler error. For example in the tree we had not added the overload for a *function* IR unit to the invalidate routine for TLI. Doh.
|
| So a new technique for the SFINAE detection here: if the result has *any* member spelled "invalidate" we turn off the synthesis of a default version. We don't care if it is a member function or a member variable or how many overloads there are. Once a result has something by that name it must provide suitable overloads for the contexts in which it is used. This seems much more resilient and durable.
|
| Huge props to Richard Smith who helped me figure out how on earth we could even do this in C++. It took quite some doing. The technique is remarkably clean however, and merely requires that the analysis results are not *final* classes. I think that's a requirement we can live with even if it is a bit odd.
|
| I've fixed the two bad in-tree analysis results. And this will make my next change which changes the API for invalidate much easier to validate as correct.
|
| llvm-svn: 279217
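One well-known way to detect "any member spelled invalidate" is the ambiguous-lookup idiom: derive a probe class from both the type under test and a helper that declares a member with that name, then check whether naming the member is ambiguous. The sketch below illustrates that idiom under assumptions; it is not claimed to be the code in the patch, and `Fallback`, `Probe`, `HasMemberNamedInvalidate`, and the four sample structs are hypothetical names.

```cpp
#include <cassert>
#include <type_traits>

namespace sketch {
// Maps any list of types to void; used to make a partial specialization
// participate only when an expression is well-formed.
template <typename...> struct MakeVoid { using type = void; };

// Helper that is known to have a member named "invalidate".
struct Fallback { int invalidate; };

// Inherits from both; requires T to be a non-final class type.
template <typename T> struct Probe : Fallback, T {};

// Primary template: assume T does have some member named "invalidate".
template <typename T, typename = void>
struct HasMemberNamedInvalidate : std::true_type {};

// If &Probe<T>::invalidate is unambiguous, the only such member is the
// one from Fallback, so T has none. If T declares the name in any form,
// the lookup is ambiguous, substitution fails, and the primary is used.
template <typename T>
struct HasMemberNamedInvalidate<
    T, typename MakeVoid<decltype(&Probe<T>::invalidate)>::type>
    : std::false_type {};

// Sample types covering the cases the commit message lists.
struct NoInvalidate { int Value; };
struct OneOverload { bool invalidate(int) { return false; } };
struct TwoOverloads { bool invalidate(int); bool invalidate(double); };
struct DataMember { bool invalidate; };
} // namespace sketch
```

The check does not inspect signatures at all, so a wrong or missing overload surfaces later as an ordinary compile error at the call site instead of silently selecting a default.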
* [PM] Rework the new PM support for building the ModuleSummaryIndex to (Chandler Carruth, 2016-08-19, 4 files, -91/+55)
| directly produce the index as the value type result.
|
| This requires making the index movable which is straightforward. It greatly simplifies things by allowing us to completely avoid the builder API and the layers of abstraction inherent there. Instead both pass managers can directly construct these when run by value. They still won't be constructed truly eagerly thanks to the optional in the legacy PM. The code that directly builds the index can also just share a direct function.
|
| A notable change here is that the result type of the analysis for the new PM is no longer a reference type. This was really problematic when making changes to how we handle result types to make our interface requirements *much* more strict and precise. But I think this is an overall improvement.
|
| Differential Revision: https://reviews.llvm.org/D23701
|
| llvm-svn: 279216
* Fix tests in llvm/test/tools/gold/X86 to satisfy r279014. (NAKAMURA Takumi, 2016-08-19, 7 files, -9/+9)
| They would unexpectedly pass if test/tools/gold/X86/Output had outputs of previous tests.
|
| llvm-svn: 279214
* [Profile] Fix edge count read bug (Xinliang David Li, 2016-08-19, 1 file, -2/+2)
| Use uint64_t to avoid value truncation before scaling.
|
| llvm-svn: 279213
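The kind of bug described in the edge count fix can be shown with a small, self-contained sketch. This is not the code from the patch: `scaleNarrow`, `scaleWide`, and the numerator/denominator parameters are hypothetical, chosen only to show what happens when a 64-bit count passes through a 32-bit variable before it is scaled.

```cpp
#include <cassert>
#include <cstdint>

// Buggy shape: the count is narrowed to 32 bits first, so any count of
// 2^32 or more loses its upper bits before the scaling is applied.
uint64_t scaleNarrow(uint64_t Count, uint64_t Num, uint64_t Den) {
  uint32_t Truncated = static_cast<uint32_t>(Count);
  return Truncated * Num / Den;
}

// Fixed shape: the count stays 64 bits wide through the scaling.
uint64_t scaleWide(uint64_t Count, uint64_t Num, uint64_t Den) {
  return Count * Num / Den;
}
```

For counts that fit in 32 bits the two functions agree, which is why this class of bug tends to show up only on very hot edges.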
* [LTO] Move callback member from base class to the derived where it is used (NFC) (Mehdi Amini, 2016-08-19, 1 file, -12/+10)
| llvm-svn: 279212
* Constify some path in the bitcode writer (NFC) (Mehdi Amini, 2016-08-19, 2 files, -7/+7)
| llvm-svn: 279211
* [LTO] Add a move to initialize member in ctor initialization list (NFC) (Mehdi Amini, 2016-08-19, 1 file, -1/+1)
| llvm-svn: 279210
* [Profile] Simple code refactoring for reuse /NFC (Xinliang David Li, 2016-08-19, 1 file, -12/+16)
| llvm-svn: 279209
* [XRay] Synthesize a reference to the xray_instr_map (Dean Michael Berris, 2016-08-19, 2 files, -0/+18)
| Without the synthesized reference to a symbol in the xray_instr_map, linker section garbage collection will helpfully remove the whole xray_instr_map section from the final executable (or archive). This will cause the runtime to not be able to identify the sleds and hot-patch the calls/jumps into the runtime trampolines.
|
| This change adds a reference from the text section at the end of the function to keep around the associated xray_instr_map section as well.
|
| We also make sure that we catch this reference in the test.
|
| Reviewers: chandlerc, echristo, majnemer, mehdi_amini
|
| Subscribers: mehdi_amini, llvm-commits, dberris
|
| Differential Revision: https://reviews.llvm.org/D23398
|
| llvm-svn: 279204
* [RuntimeDyld][MCJIT] Un-XFAIL some tests that were fixed by r279182. (Lang Hames, 2016-08-19, 10 files, -10/+10)
| llvm-svn: 279201
* Revert "RegScavenging: Add scavengeRegisterBackwards()" (Matthias Braun, 2016-08-19, 10 files, -551/+209)
| The ppc64 multistage bot fails on this.
|
| This reverts commit r279124.
|
| Also Revert "CodeGen: Add/Factor out LiveRegUnits class; NFCI" because it depends on the previous change
|
| This reverts commit r279171.
|
| llvm-svn: 279199
* [ADT] Add the world's simplest STL extra. Or at least close to it. (Chandler Carruth, 2016-08-19, 3 files, -0/+46)
| This is a little class template that just builds an inheritance chain of empty classes. Despite how simple this is, it can be used to really nicely create ranked overload sets. I've added a unittest as much to document this as test it. You can pass an object of this type as an argument to a function overload set and it will call the first viable and enabled candidate at or below the rank of the object.
|
| I'm planning to use this in a subsequent commit to more clearly rank overload candidates used for SFINAE. All credit for this technique and both lines of code here to Richard Smith who was helping me rewrite the SFINAE check in question to much more effectively capture the intended set of checks.
|
| llvm-svn: 279197
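A ranked overload set built on an inheritance chain of empty classes might look like the sketch below. The two-line `rank` template follows the description in the commit message; the `describe` overloads and the `sketch` namespace are hypothetical and exist only to show how the ranking picks a candidate.

```cpp
#include <cassert>
#include <string>

namespace sketch {
// An inheritance chain of empty classes: rank<2> derives from rank<1>,
// which derives from rank<0>.
template <int N> struct rank : rank<N - 1> {};
template <> struct rank<0> {};

// Preferred candidate: only enabled when T has a size() member. It takes
// rank<1>, which is an exact match for the argument passed below.
template <typename T>
auto describe(const T &V, rank<1>) -> decltype(V.size(), std::string()) {
  return "sized";
}

// Fallback candidate: always viable, but reaching rank<0> needs a
// derived-to-base conversion, so it loses whenever the one above is enabled.
template <typename T> std::string describe(const T &, rank<0>) {
  return "unsized";
}

// Entry point: start at the highest rank and let overload resolution
// pick the first enabled candidate at or below it.
template <typename T> std::string describe(const T &V) {
  return describe(V, rank<1>());
}
} // namespace sketch
```

Without the rank parameter the two candidates would be ambiguous for types that have `size()`; the rank argument turns that ambiguity into a deterministic preference order.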
* [RuntimeDyld] Add support for ELF R_ARM_REL32 and R_ARM_GOT_PREL. (Lang Hames, 2016-08-19, 1 file, -0/+16)
| Patch by William Dillon. Thanks William!
|
| This patch adds support for the R_ARM_REL32 and R_ARM_GOT_PREL ELF ARM relocations to RuntimeDyld, which should allow JITing of code that produces these relocations.
|
| No test case: Unfortunately RuntimeDyldELF's GOT building mechanism (which uses a separate section for GOT entries) isn't compatible with RuntimeDyldChecker. The correct fix for this is to fix RuntimeDyldELF's GOT support (it's fundamentally broken at the moment: separate sections aren't guaranteed to be in range of a GOT entry load), but that's a non-trivial job.
|
| llvm-svn: 279182
* [asan] Optimize store size in FunctionStackPoisoner::poisonRedZones (Vitaly Buka, 2016-08-18, 4 files, -47/+63)
| Summary: Reduce store size to avoid leading and trailing zeros.
|
| Reviewers: kcc, eugenis
|
| Subscribers: llvm-commits
|
| Differential Revision: https://reviews.llvm.org/D23648
|
| llvm-svn: 279178
* Include X86CallFrameOptimization in the opt-bisect process. (Andrew Kaylor, 2016-08-18, 1 file, -1/+1)
| Differential Revision: https://reviews.llvm.org/D23683
|
| llvm-svn: 279175
* AArch64: remove extraneous padding (Saleem Abdulrasool, 2016-08-18, 1 file, -3/+3)
| The structs BarrierOp, PrefetchOp, PSBHintOp are in AArch64AsmParser.cpp (inside anonymous namespace). This diff changes the order of fields and removes the excessive padding (8 bytes).
|
| Patch by Alexander Shaposhnikov!
|
| llvm-svn: 279173
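How reordering fields removes padding can be seen with a pair of toy structs. The field names and types below are assumptions made for illustration and are not copied from AArch64AsmParser.cpp; the exact sizes also depend on the target ABI.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical "before" layout: a 4-byte field ahead of a pointer forces
// padding before the pointer and again at the end on typical 64-bit ABIs.
struct Before {
  unsigned Val;
  const char *Data;
  unsigned Length;
};

// Hypothetical "after" layout: the pointer comes first and the two
// 4-byte fields share the following 8 bytes.
struct After {
  const char *Data;
  unsigned Length;
  unsigned Val;
};

// Number of bytes saved by the reordering on the current target.
constexpr std::size_t savedBytes() { return sizeof(Before) - sizeof(After); }
```

On a typical LP64 target this would give 24 bytes versus 16 bytes, a saving of 8 bytes per object, while on a 32-bit target both layouts are usually the same size.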
* [CMake] Add variables for tracking which runtimes are included (Chris Bieneman, 2016-08-18, 1 file, -0/+4)
| This allows sub-projects to have conditionals based on the presence of other projects.
|
| llvm-svn: 279172
* CodeGen: Add/Factor out LiveRegUnits class; NFCI (Matthias Braun, 2016-08-18, 5 files, -68/+229)
| This is a set of register units intended to track register liveness; it is similar in spirit to LivePhysRegs. You can also think of this as the liveness tracking parts of the RegisterScavenger factored out into its own class.
|
| This was proposed in http://llvm.org/PR27609
|
| Differential Revision: http://reviews.llvm.org/D21916
|
| llvm-svn: 279171
* Fix link quotes on AArch64's CompilerWriterInfo section. (Jacques Pienaar, 2016-08-18, 1 file, -2/+2)
| Reviewers: t.p.northover
|
| Subscribers: t.p.northover, aemerson, rengolin
|
| Differential Revision: https://reviews.llvm.org/D23697
|
| llvm-svn: 279169
* CodeGen: If Convert blocks that would form a diamond when tail-merged. (Kyle Butt, 2016-08-18, 3 files, -82/+366)
| The following function currently relies on tail-merging for if conversion to succeed. The common tail of cond_true and cond_false is extracted, and this then forms a diamond pattern that can be successfully if converted. If this block does not get extracted, either because tail-merging is disabled or the threshold is higher, we should still recognize this pattern and if-convert it.
|
| Fixed a regression in the original commit. Need to un-reverse branches after reversing them, or other conversions go awry.
|
| Regression on self-hosting bots with no obvious explanation. Tidied up range handling to be more obviously correct, but there was no smoking gun.
|
|     define i32 @t2(i32 %a, i32 %b) nounwind {
|     entry:
|       %tmp1434 = icmp eq i32 %a, %b ; <i1> [#uses=1]
|       br i1 %tmp1434, label %bb17, label %bb.outer
|
|     bb.outer: ; preds = %cond_false, %entry
|       %b_addr.021.0.ph = phi i32 [ %b, %entry ], [ %tmp10, %cond_false ]
|       %a_addr.026.0.ph = phi i32 [ %a, %entry ], [ %a_addr.026.0, %cond_false ]
|       br label %bb
|
|     bb: ; preds = %cond_true, %bb.outer
|       %indvar = phi i32 [ 0, %bb.outer ], [ %indvar.next, %cond_true ]
|       %tmp. = sub i32 0, %b_addr.021.0.ph
|       %tmp.40 = mul i32 %indvar, %tmp.
|       %a_addr.026.0 = add i32 %tmp.40, %a_addr.026.0.ph
|       %tmp3 = icmp sgt i32 %a_addr.026.0, %b_addr.021.0.ph
|       br i1 %tmp3, label %cond_true, label %cond_false
|
|     cond_true: ; preds = %bb
|       %tmp7 = sub i32 %a_addr.026.0, %b_addr.021.0.ph
|       %tmp1437 = icmp eq i32 %tmp7, %b_addr.021.0.ph
|       %indvar.next = add i32 %indvar, 1
|       br i1 %tmp1437, label %bb17, label %bb
|
|     cond_false: ; preds = %bb
|       %tmp10 = sub i32 %b_addr.021.0.ph, %a_addr.026.0
|       %tmp14 = icmp eq i32 %a_addr.026.0, %tmp10
|       br i1 %tmp14, label %bb17, label %bb.outer
|
|     bb17: ; preds = %cond_false, %cond_true, %entry
|       %a_addr.026.1 = phi i32 [ %a, %entry ], [ %tmp7, %cond_true ], [ %a_addr.026.0, %cond_false ]
|       ret i32 %a_addr.026.1
|     }
|
| Without tail-merging or diamond-tail if conversion:
|
|     LBB1_1: @ %bb
|             @ =>This Inner Loop Header: Depth=1
|       cmp r0, r1
|       ble LBB1_3
|     @ BB#2: @ %cond_true
|             @ in Loop: Header=BB1_1 Depth=1
|       subs r0, r0, r1
|       cmp r1, r0
|       it ne
|       cmpne r0, r1
|       bgt LBB1_4
|     LBB1_3: @ %cond_false
|             @ in Loop: Header=BB1_1 Depth=1
|       subs r1, r1, r0
|       cmp r1, r0
|       bne LBB1_1
|     LBB1_4: @ %bb17
|       bx lr
|
| With diamond-tail if conversion, but without tail-merging:
|
|     @ BB#0: @ %entry
|       cmp r0, r1
|       it eq
|       bxeq lr
|     LBB1_1: @ %bb
|             @ =>This Inner Loop Header: Depth=1
|       cmp r0, r1
|       ite le
|       suble r1, r1, r0
|       subgt r0, r0, r1
|       cmp r1, r0
|       bne LBB1_1
|     @ BB#2: @ %bb17
|       bx lr
|
| llvm-svn: 279168
* IfConversion: Rescan diamonds. (Kyle Butt, 2016-08-18, 1 file, -34/+115)
| The cost of predicating a diamond is only the instructions that are not shared between the two branches. Additionally, if a predicate clobbering instruction occurs in the shared portion of the branches (e.g. a cond move), it may still be possible to if convert the sub-cfg. This change handles these two facts by rescanning the non-shared portion of a diamond sub-cfg to recalculate both the predication cost and whether both blocks are pred-clobbering.
|
| llvm-svn: 279167
* IfConversion: Handle inclusive ranges more carefully. (Kyle Butt, 2016-08-18, 1 file, -22/+56)
| This may affect calculations for thresholds, but is not a significant change in behavior.
|
| The problem was that an inclusive range must have an additional flag to show that it is empty, because otherwise begin == end implies that the range has one element, and it may not be possible to move past on either side.
|
| llvm-svn: 279166
* llvm-objdump: Add Hexagon printer changes for -S/-l options (Hemant Kulkarni, 2016-08-18, 5 files, -0/+105)
| Differential Revision: https://reviews.llvm.org/D23521
|
| llvm-svn: 279161
* [CMake] Create convenience targets for runtime projects (Chris Bieneman, 2016-08-18, 1 file, -0/+7)
| Each runtime project has a top-level target that is the name of the runtime (minus the "lib" prefix if applicable). This creates top-level targets mapping to runtime projects.
|
| llvm-svn: 279160
* [SystemZ] Use valid base/index regs for inline asm (Zhan Jun Liau, 2016-08-18, 2 files, -2/+25)
| Summary: Inline asm memory constraints can have the base or index register be assigned to %r0 right now. Make sure that we assign only ADDR64 registers to the base and index.
|
| Reviewers: uweigand
|
| Subscribers: llvm-commits
|
| Differential Revision: https://reviews.llvm.org/D23367
|
| llvm-svn: 279157
* [Analysis] Change several Analysis pieces to use NodeRef. NFC. (Tim Shen, 2016-08-18, 3 files, -66/+74)
| Reviewers: dblaikie, grosser
|
| Subscribers: mzolotukhin, llvm-commits
|
| Differential Revision: https://reviews.llvm.org/D23625
|
| llvm-svn: 279156
* [CMake] Make llvm-config implicit dependency for subprojects (Chris Bieneman, 2016-08-18, 2 files, -2/+2)
| The subproject interface being used for runtime libraries expects that llvm-config is passed into the subproject for consumption. We currently do this for every subproject, so we should expect that all LLVM ExternalProjects depend on llvm-config for the time being.
|
| Eventually I'd like to see the sub-projects using LLVMConfig.cmake instead of the llvm-config binary, but that will take time to roll out.
|
| llvm-svn: 279155
* [CMake] Minor fix to regex in r279152 (Chris Bieneman, 2016-08-18, 1 file, -1/+1)
| The third version component is optional in Xcode's version spew, so we need to make it optional in the regex.
|
| llvm-svn: 279153
* [CMake] Support for generating Xcode 8 compatible toolchains (Chris Bieneman, 2016-08-18, 1 file, -1/+30)
| Xcode 8 requires toolchain compatibility version 2. This allows us to select the correct compatibility version based on the installed version of Xcode.
|
| llvm-svn: 279152
* [InstCombine] add helper function for folds of icmp (shl 1, Y), C; NFCI (Sanjay Patel, 2016-08-18, 1 file, -62/+65)
| Clean up the existing code by:
| 1. Renaming variables
| 2. Adding local variables
| 3. Making it vector-safe
|
| This is still guarded by a ConstantInt check, so no functional change is intended. But this should be ready to go: if we move the ConstantInt check down, all of these folds should do the right thing for vector types.
|
| llvm-svn: 279150
* [lanai] Add ISA document to CompilerWritersInfo (Jacques Pienaar, 2016-08-18, 1 file, -0/+6)
| Summary: Add Lanai ISA document to CompilerWritersInfo.
|
| Reviewers: eliben
|
| Subscribers: aemerson, llvm-commits
|
| Differential Revision: https://reviews.llvm.org/D23693
|
| llvm-svn: 279149
* AMDGPU/SI: Fix a test in wqm.ll to always use s_cbranch_vcc* (Tom Stellard, 2016-08-18, 1 file, -7/+7)
| Summary: We need to use floating-point compares to ensure that s_cbranch_vcc* instructions are always generated. With integer compares, future optimizations could cause s_cbranch_scc* to be generated instead.
|
| Reviewers: arsenm, nhaehnle
|
| Subscribers: llvm-commits, kzhuravl
|
| Differential Revision: https://reviews.llvm.org/D23401
|
| llvm-svn: 279148
* [libFuzzer] add more __attribute__((visibility("default"))) (Kostya Serebryany, 2016-08-18, 1 file, -0/+2)
| llvm-svn: 279143
* Make ctlz and cttz zero undef when the operand cannot be zero in InstCombine (Amaury Sechet, 2016-08-18, 2 files, -5/+40)
| Summary: Also add popcount(n) == bitsize(n) -> n == -1 transformation.
|
| Reviewers: majnemer, spatel
|
| Subscribers: llvm-commits
|
| Differential Revision: https://reviews.llvm.org/D23134
|
| llvm-svn: 279141
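The identity behind the `popcount(n) == bitsize(n) -> n == -1` transformation can be checked exhaustively for a small bit width. The sketch below does this for 16-bit values; `popcount16` and `foldHoldsForAll16BitValues` are hypothetical helper names, and the check is an illustration of the arithmetic, not InstCombine code.

```cpp
#include <cassert>
#include <cstdint>

// Counts the set bits of a 16-bit value one bit at a time.
int popcount16(uint16_t N) {
  int Count = 0;
  while (N) {
    Count += N & 1;
    N = static_cast<uint16_t>(N >> 1);
  }
  return Count;
}

// Compares both sides of the fold for every 16-bit value: the population
// count equals the bit width exactly when all bits are set, that is, when
// the value is -1 in two's complement (0xFFFF as an unsigned value).
bool foldHoldsForAll16BitValues() {
  for (uint32_t I = 0; I <= 0xFFFF; ++I) {
    uint16_t N = static_cast<uint16_t>(I);
    bool Before = popcount16(N) == 16;
    bool After = N == 0xFFFF;
    if (Before != After)
      return false;
  }
  return true;
}
```

The same argument holds for any width, since a value has as many set bits as it has bits only when every bit is one.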
* [InstCombine] use m_APInt to allow icmp (trunc X, Y), C folds for splat constant vectors (Sanjay Patel, 2016-08-18, 3 files, -20/+8)
| This is a sibling of:
| https://reviews.llvm.org/rL278859
| https://reviews.llvm.org/rL278935
| https://reviews.llvm.org/rL278945
| https://reviews.llvm.org/rL279066
| https://reviews.llvm.org/rL279077
| https://reviews.llvm.org/rL279101
|
| llvm-svn: 279133
* [InstCombine] clean up foldICmpTruncConstant(); NFCI (Sanjay Patel, 2016-08-18, 1 file, -14/+17)
| 1. Fix variable names
| 2. Add local variables to reduce code
|
| llvm-svn: 279132
* [SelectionDAG] Rename fextend -> fpextend, fround -> fpround, frnd -> fround (Michael Kuperstein, 2016-08-18, 22 files, -112/+112)
| The names of the tablegen defs now match the names of the ISD nodes. This makes the world a slightly saner place, as previously "fround" matched ISD::FP_ROUND and not ISD::FROUND.
|
| Differential Revision: https://reviews.llvm.org/D23597
|
| llvm-svn: 279129
* AMDGPU: Fix QSAD and MQSAD instructions' incorrect data type. (Wei Ding, 2016-08-18, 6 files, -22/+23)
| Differential Revision: http://reviews.llvm.org/D23689
|
| llvm-svn: 279126
* [SLP] Initialize VectorizedValue when gathering (Matthew Simpson, 2016-08-18, 2 files, -8/+161)
| We abort building vectorizable trees in some cases (e.g., if the maximum recursion depth is reached, if the region size is too large, etc.). If this happens for a reduction, we can be left with a root entry that needs to be gathered. For these cases, we need to make sure we actually set VectorizedValue to the resulting vector.
|
| This patch ensures we properly set VectorizedValue, and it also ensures the insertelement sequence generated for the gathers is inserted at the correct location.
|
| Reference: https://llvm.org/bugs/show_bug.cgi?id=28330
|
| Differential Revision: https://reviews.llvm.org/D23410
|
| llvm-svn: 279125
* RegScavenging: Add scavengeRegisterBackwards() (Matthias Braun, 2016-08-18, 7 files, -144/+325)
| Re-apply r276044 with off-by-1 instruction fix for the reload placement.
|
| This is a variant of scavengeRegister() that works for enterBasicBlockEnd()/backward(). The benefit of the backward mode is that it is not affected by incomplete kill flags.
|
| This patch also changes PrologEpilogInserter::doScavengeFrameVirtualRegs() to use the register scavenger in backwards mode.
|
| Differential Revision: http://reviews.llvm.org/D21885
|
| llvm-svn: 279124
* Branch Folding: Accept explicit threshold for tail merge size. (Kyle Butt, 2016-08-18, 3 files, -22/+44)
| This is prep work for allowing the threshold to be different during layout, and to enforce a single threshold between merging and duplicating during layout. No observable change intended.
|
| llvm-svn: 279117
* Add a version of Intrinsic::getName which is more efficient when there are no overloads. (Pete Cooper, 2016-08-18, 2 files, -1/+9)
| When running 'opt -O2 verify-uselistorder-nodbg.lto.bc', there are 33m allocations. 8.2m come from std::string allocations in Intrinsic::getName().
|
| Turns out this method only returns a std::string because it needs to handle overloads, but that is not the common case.
|
| This adds an overload of getName which just returns a StringRef when there are no overloads and so saves on the allocations.
|
| llvm-svn: 279113
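The allocation-saving pattern in the getName entry can be sketched with a pair of overloads. This is a toy under stated assumptions: `ToyID`, `ToyNames`, and both `getName` functions are hypothetical, and a plain `const char *` stands in for LLVM's StringRef so the example has no dependencies beyond the standard library.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

namespace sketch {
// Hypothetical intrinsic IDs and their base names in a static table.
enum ToyID { toy_sqrt, toy_memcpy };
static const char *const ToyNames[] = {"toy.sqrt", "toy.memcpy"};

// Cheap path for the common case: returns a view of the static table
// entry, so no string is allocated.
inline const char *getName(ToyID ID) { return ToyNames[ID]; }

// Overloaded case: the name depends on the type suffixes, so a new
// std::string has to be built and returned by value.
inline std::string getName(ToyID ID, const std::vector<std::string> &Tys) {
  std::string Result = ToyNames[ID];
  for (const std::string &Ty : Tys)
    Result += "." + Ty;
  return Result;
}
} // namespace sketch
```

Callers that know there are no overloaded types can take the first overload and avoid the allocation entirely, which is the saving the commit message measures.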