summaryrefslogtreecommitdiffstats
path: root/llvm/test/Transforms
Commit message (Collapse)AuthorAgeFilesLines
* [InstCombine][X86] Regenerate masked memory op combine testsSimon Pilgrim2016-09-081-88/+114
| | | | llvm-svn: 280960
* [InstCombine][X86] Regenerate vperm2f128/vperm2i128 combine testsSimon Pilgrim2016-09-081-86/+116
| | | | llvm-svn: 280959
* [InstCombine][X86] Regenerate insertps combine testsSimon Pilgrim2016-09-081-43/+59
| | | | llvm-svn: 280957
* Revert "[LoopUnroll] Properly update loop-info when cloning prologues and ↵Michael Zolotukhin2016-09-081-44/+0
| | | | | | | | | | epilogues." This reverts commit r280901. This caused a bunch of failures, reverting it until I investigate them. llvm-svn: 280905
* [LoopUnroll] Properly update loop-info when cloning prologues and epilogues.Michael Zolotukhin2016-09-081-0/+44
| | | | | | | | | | | | | | | | | | | Summary: When cloning blocks for prologue/epilogue we need to replicate the loop structure from the original loop. It wasn't a problem for the innermost loops, but it led to an incorrect loop info when we unrolled a loop with a child loop - in this case created prologue-loop had a child loop, but loop info didn't reflect that. This fixes PR28888. Reviewers: chandlerc, sanjoy, hfinkel Subscribers: llvm-commits, silvas Differential Revision: https://reviews.llvm.org/D24203 llvm-svn: 280901
* [InstCombine] use m_APInt to allow icmp (and (sh X, Y), C2), C1 folds for ↵Sanjay Patel2016-09-074-15/+9
| | | | | | splat constant vectors llvm-svn: 280873
* [SimplifyCFG] Don't try to create metadata-valued PHIsHal Finkel2016-09-071-0/+51
| | | | | | | | | | | | | | | | We can't create metadata-valued PHIs; don't try to do so when sinking. I created a test case for this using the @llvm.type.test intrinsic, because it takes a metadata parameter and does not have severe side effects (thus SimplifyCFG is willing to otherwise sink it). Previously, running the test case would crash with: Invalid use of metadata! %.sink = select i1 %flag, metadata <...>, metadata <0x4e45dc0> LLVM ERROR: Broken function found, compilation aborted! llvm-svn: 280866
* [InstCombine] allow icmp (and X, C2), C1 folds for splat constant vectorsSanjay Patel2016-09-071-12/+22
| | | | | | | | This is a revert of r280676 which was a revert of r280637; ie, this is r280637 again. It was speculatively reverted to help debug buildbot failures. llvm-svn: 280861
* [LSV] Use the original loads' names for the extractelement instructions.Justin Lebar2016-09-072-35/+35
| | | | | | | | | | | | | | | | Summary: LSV replaces multiple adjacent loads with one vectorized load and a bunch of extractelement instructions. This patch makes the extractelement instructions' names match those of the original loads, for (hopefully) improved readability. Reviewers: asbirlea, tstellarAMD Subscribers: arsenm, mzolotukhin Differential Revision: https://reviews.llvm.org/D23748 llvm-svn: 280818
* Regenerate vector bitcast folding tests using update_test_checks.py.Andrea Di Biagio2016-09-073-108/+125
| | | | | | | Two tests have been merged together, regenerated and then moved to a more appropriate directory. No functional change. llvm-svn: 280814
* [InstCombine][SSE4a] Fix assertion failure in the insertq/insertqi combining ↵Andrea Di Biagio2016-09-071-0/+18
| | | | | | | | | | | logic. This fixes a similar issue to the one already fixed by r280804 (revieved in D24256). Revision 280804 fixed the problem with unsafe dyn_casts in the extrq/extrqi combining logic. However, it turns out that even the insertq/insertqi logic was affected by the same problem. llvm-svn: 280807
* [InstCombine][SSE4a] Fix assertion failure caused by unsafe dyn_casts on the ↵Andrea Di Biagio2016-09-071-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | operands of extrq/extrqi intrinsic calls. This patch fixes an assertion failure caused by unsafe dynamic casts on the constant operands of sse4a intrinsic calls to extrq/extrqi The combine logic that simplifies sse4a extrq/extrqi intrinsic calls currently checks if the input operands are constants. Internally, that logic relies on dyn_casts of values returned by calls to method Constant::getAggregateElement. However, method getAggregateElemet may return nullptr if the constant element cannot be retrieved. So, all the dyn_casts can potentially fail. This is what happens for example if a constexpr value is passed in input to an extrq/extrqi intrinsic call. This patch fixes the problem by using a dyn_cast_or_null (instead of a simple dyn_cast) on the result of each call to Constant::getAggregateElement. Added reproducible test cases to x86-sse4a.ll. Differential Revision: https://reviews.llvm.org/D24256 llvm-svn: 280804
* [SimplifyCFG] Update workaround for PR30188 to also include loadsJames Molloy2016-09-071-0/+29
| | | | | | | | I should have realised this the first time around, but if we're avoiding sinking stores where the operands come from allocas so they don't create selects, we also have to do the same for loads because SROA will be just as defective looking at loads of selected addresses as stores. Fixes PR30188 (again). llvm-svn: 280792
* [SimplifyCFG] Check PHI uses more accuratelyJames Molloy2016-09-071-0/+25
| | | | | | | | PR30292 showed a case where our PHI checking wasn't correct. We were checking that all values were used by the same PHI before deciding to sink, but we weren't checking that the incoming values for that PHI were what we expected. As a result, we had to bail out after block splitting which caused us to never reach a steady state in SimplifyCFG. Fixes PR30292. llvm-svn: 280790
* [JumpThreading] Only write back branch-weight MDs for blocks that originally ↵Adam Nemet2016-09-061-0/+119
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | had PGO info Currently the pass updates branch weights in the IR if the function has any PGO info (entry frequency is set). However we could still have regions of the CFG that does not have branch weights collected (e.g. a cold region). In this case we'd use static estimates. Since static estimates for branches are determined independently, they are inconsistent. Updating them can "randomly" inflate block frequencies. I've run into this in a completely cold loop of h264ref from SPEC. -Rpass-with-hotness showed the loop to be completely cold during inlining (before JT) but completely hot during vectorization (after JT). The new testcase demonstrate the problem. We check array elements against 1, 2 and 3 in a loop. The check against 3 is the loop-exiting check. The block names should be self-explanatory. In this example, jump threading incorrectly updates the weight of the loop-exiting branch to 0, drastically inflating the frequency of the loop (in the range of billions). There is no run-time profile info for edges inside the loop, so branch probabilities are estimated. These are the resulting branch and block frequencies for the loop body: check_1 (16) (8) / | eq_1 | (8) \ | check_2 (16) (8) / | eq_2 | (8) \ | check_3 (16) (1) / | (loop exit) | (15) | (back edge) First we thread eq_1 -> check_2 to check_3. Frequencies are updated to remove the frequency of eq_1 from check_2 and then from the false edge leaving check_2. Changed frequencies are highlighted with * *: check_1 (16) (8) / | eq_1~ | (8) / | / check_2 (*8*) / (8) / | \ eq_2 | (*0*) \ \ | ` --- check_3 (16) (1) / | (loop exit) | (15) | (back edge) Next we thread eq_1 -> check_3 and eq_2 -> check_3 to check_1 as new back edges. Frequencies are updated to remove the frequency of eq_1 and eq_3 from check_3 and then the false edge leaving check_3 (changed frequencies are highlighted with * *): check_1 (16) (8) / | eq_1~ | (8) / | / check_2 (*8*) / (8) / | /-- eq_2~ | (*0*) (back edge) | check_3 (*0*) (*0*) / | (loop exit) | (*0*) | (back edge) As a result, the loop exit edge ends up with 0 frequency which in turn makes the loop header to have maximum frequency. There are a few potential problems here: 1. The profile data seems odd. There is a single profile sample of the loop being entered. On the other hand, there are no weights inside the loop. 2. Based on static estimation we shouldn't set edges to "extreme" values, i.e. extremely likely or unlikely. 3. We shouldn't create profile metadata that is calculated from static estimation. I am not sure what policy is but it seems to make sense to treat profile metadata as something that is known to originate from profiling. Estimated probabilities should only be reflected in BPI/BFI. Any one of these would probably fix the immediate problem. I went for 3 because I think it's a good policy to have and added a FIXME about 2. Differential Revision: https://reviews.llvm.org/D24118 llvm-svn: 280713
* fix FileCheck variables for test added with r280677Sanjay Patel2016-09-051-2/+2
| | | | | | | The script (utils/update_test_checks.py) seems to have problems with variable names that start with the same string. llvm-svn: 280679
* [Coroutines] Part12: Handle alloca address-takenGor Nishanov2016-09-051-0/+67
| | | | | | | | | | | | | | | | | | | | | | | Summary: Move early uses of spilled variables after CoroBegin. For example, if a parameter had address taken, we may end up with the code like: define @f(i32 %n) { %n.addr = alloca i32 store %n, %n.addr ... call @coro.begin This patch fixes the problem by moving uses of spilled variables after CoroBegin. Reviewers: majnemer Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D24234 llvm-svn: 280678
* [InstCombine] don't assert that division-by-constant has been folded (PR30281)Sanjay Patel2016-09-051-0/+93
| | | | | | | | | | This is effectively a revert of: https://reviews.llvm.org/rL280115 And this should fix https://llvm.org/bugs/show_bug.cgi?id=30281: llvm-svn: 280677
* [InstCombine] revert r280637 because it causes test failures on an ARM botSanjay Patel2016-09-051-22/+12
| | | | | | http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15/builds/14952/steps/ninja%20check%201/logs/FAIL%3A%20LLVM%3A%3Aicmp.ll llvm-svn: 280676
* [SimplifyCFG] Add test for sinking inline asm in if/elseOliver Stannard2016-09-051-0/+29
| | | | | | | | | | | | | | | This test code previously caused a failure in the module verifier, because SimplifyCFG created this invalid instruction, which tries to take the address of inline asm: %.sink = select i1 %1, i64 ()* asm "mov $0, #1", "=r", i64 ()* asm %"mov $0, #2", "=r" This has been fixed recently, presumably by James Molloy's patches that re-wrote and changed parts of SimplifyCFG, so this patch just adds a regression test for it. Differential Revision: https://reviews.llvm.org/D24231 llvm-svn: 280660
* [Coroutines] Part11: Add final suspend handling.Gor Nishanov2016-09-051-0/+73
| | | | | | | | | | | | | | | | | | | Summary: A frontend may designate a particular suspend to be final, by setting the second argument of the coro.suspend intrinsic to true. Such a suspend point has two properties: * it is possible to check whether a suspended coroutine is at the final suspend point via coro.done intrinsic; * a resumption of a coroutine stopped at the final suspend point leads to undefined behavior. The only possible action for a coroutine at a final suspend point is destroying it via coro.destroy intrinsic. This patch adds final suspend handling logic to CoroEarly and CoroSplit passes. Now, the final suspend point example from docs\Coroutines.rst compiles and produces expected result (see test/Transform/Coroutines/ex5.ll). Reviewers: majnemer Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D24068 llvm-svn: 280646
* [InstCombine] allow icmp (and X, C2), C1 folds for splat constant vectorsSanjay Patel2016-09-041-12/+22
| | | | | | | | The code to calculate 'UsesRemoved' could be simplified. As-is, that code is a victim of PR30273: https://llvm.org/bugs/show_bug.cgi?id=30273 llvm-svn: 280637
* [InstCombine] Preserve llvm.mem.parallel_loop_access metadata when replacingDorit Nuzman2016-09-041-0/+61
| | | | | | | | | | | | memcpy with ld/st. When InstCombine replaces a memcpy with loads+stores it does not copy over the llvm.mem.parallel_loop_access from the memcpy instruction. This patch fixes that. Differential Revision: https://reviews.llvm.org/D23499 llvm-svn: 280617
* Fix inliner funclet unwind memoizationJoseph Tremoulet2016-09-041-4/+224
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The inliner may need to determine where a given funclet unwinds to, and this determination may depend on other funclets throughout the funclet tree. The code that performs this walk in getUnwindDestToken memoizes results to avoid redundant computations. In the case that a funclet's unwind destination is derived from its ancestor, there's code to walk back down the tree from the ancestor updating the memo map of its descendants to record the unwind destination. This change fixes that code to account for the case that some descendant has a different unwind destination, which can happen if that unwind dest is a descendant of the EHPad being queried and thus didn't determine its unwind destination. Also update test inline-funclets.ll, which is supposed to cover such scenarios, to include a case that fails an assertion without this fix but passes with it. Fixes PR29151. Reviewers: majnemer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24117 llvm-svn: 280610
* AMDGPU: Do basic folding of class intrinsicMatt Arsenault2016-09-031-0/+237
| | | | | | | This allows more of the OCML builtin library to be constant folded. llvm-svn: 280586
* Fix buildbot error.Wei Mi2016-09-031-62/+0
| | | | | | Add -mtriple=x86_64-unknown-linux-gnu for the test and move it to CodeGen/X86. llvm-svn: 280568
* [Profile] handle select instruction in 'expect' loweringXinliang David Li2016-09-021-0/+10
| | | | | | | | | Builtin expect lowering currently ignores select. This patch fixes the issue Differential Revision: http://reviews.llvm.org/D24166 llvm-svn: 280547
* [InstCombine] auto-generate assertions for tighter checkingSanjay Patel2016-09-021-60/+95
| | | | llvm-svn: 280531
* Split the store of a wide value merged from an int-fp pair into multiple stores.Wei Mi2016-09-021-0/+62
| | | | | | | | | | | | | | For the store of a wide value merged from a pair of values, especially int-fp pair, sometimes it is more efficent to split it into separate narrow stores, which can remove the bitwise instructions or sink them to colder places. Now the feature is only enabled on x86 target, and only store of int-fp pair is splitted. It is possible that the application scope gets extended with perf evidence support in the future. Differential Revision: https://reviews.llvm.org/D22840 llvm-svn: 280505
* [InsttCombine] fold insertelement of constant into shuffle with constant ↵Sanjay Patel2016-09-021-7/+4
| | | | | | | | | | | | | | | | | operand (PR29126) The motivating case occurs with SSE/AVX scalar intrinsics, so this is a first step towards shrinking that to a single shufflevector. Note that the transform is intentionally limited to shuffles that are equivalent to vector selects to avoid creating arbitrary shuffle masks that may not lower well. This should solve PR29126: https://llvm.org/bugs/show_bug.cgi?id=29126 Differential Revision: https://reviews.llvm.org/D23886 llvm-svn: 280504
* [LV] Ensure reverse interleaved group GEPs remain uniformMatthew Simpson2016-09-021-2/+8
| | | | | | | | | | | | | | For uniform instructions, we're only required to generate a scalar value for the first vector lane of each unroll iteration. Thus, if we have a reverse interleaved group, computing the member index off the scalar GEP corresponding to the last vector lane of its pointer operand technically makes the GEP non-uniform. We should compute the member index off the first scalar GEP instead. I've added the updated member index computation to the existing reverse interleaved group test. llvm-svn: 280497
* [instsimplify] Fix incorrect folding of an ordered fcmp with a vector of all ↵Andrea Di Biagio2016-09-021-0/+8
| | | | | | | | | | | | | | | | | | | | | | | NaN. This patch fixes a crash caused by an incorrect folding of an ordered comparison between a packed floating point vector and a splat vector of NaN. An ordered comparison between a vector and a constant vector of NaN, should always be folded into a constant vector where each element is i1 false. Since revision 266175, SimplifyFCmpInst folds the ordered fcmp into a scalar 'false'. Later on, this would cause an assertion failure, since the value type of the folded value doesn't match the expected value type of the uses of the original instruction: "Assertion failed: New->getType() == getType() && "replaceAllUses of value with new value of different type!". This patch fixes the issue and adds a test case to the already existing test InstSimplify/floating-point-compares.ll. Differential Revision: https://reviews.llvm.org/D24143 llvm-svn: 280488
* [InstCombine] Add test for insertelementinsts with constants.Alexey Bataev2016-09-021-0/+77
| | | | | | | Added a tests that shows that several insertelementinsts with constant indexes/data are not folded into a single shuffleinst. llvm-svn: 280474
* [SimplifyCFG] Add a workaround to fix PR30188James Molloy2016-09-021-0/+23
| | | | | | | | We're sinking stores, which is a good thing, but in the process creating selects for the store address operand, which SROA/Mem2Reg can't look through, which caused serious regressions. The real fix is in SROA, which I'll be looking into. llvm-svn: 280470
* llvm/test/Transforms/GCOVProfiling/three-element-mdnode.ll: Use %/T instead ↵NAKAMURA Takumi2016-09-021-1/+1
| | | | | | of %T, not to emit backslashes. llvm-svn: 280451
* [InstCombine] add tests to show potential shuffle+insert foldsSanjay Patel2016-09-011-0/+112
| | | | llvm-svn: 280403
* [InstCombine] remove fold of an icmp pattern that should never happenSanjay Patel2016-09-011-0/+15
| | | | | | | | | | | | While removing a scalar shackle from an icmp fold, I noticed that I couldn't find any tests to trigger this code path. The 'and' shrinking transform should be handled by InstCombiner::foldCastedBitwiseLogic() or eliminated with InstSimplify. The icmp narrowing is part of InstCombiner::foldICmpWithCastAndCast(). Differential Revision: https://reviews.llvm.org/D24031 llvm-svn: 280370
* [SimplifyCFG] Handle tail-sinking of more than 2 incoming branchesJames Molloy2016-09-011-0/+111
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This was a real restriction in the original version of SinkIfThenCodeToEnd. Now it's been rewritten, the restriction can be lifted. As part of this, we handle a very common and useful case where one of the incoming branches is actually conditional. Consider: if (a) x(1); else if (b) x(2); This produces the following CFG: [if] / \ [x(1)] [if] | | \ | | \ | [x(2)] | \ | / [ end ] [end] has two unconditional predecessor arcs and one conditional. The conditional refers to the implicit empty 'else' arc. This same pattern can also be caused by an empty default block in a switch. We can't sink the call to x() down to end because no call to x() happens on the third incoming arc (assume that x() has sideeffects for the sake of argument; if something is safe to speculate we could indeed sink nevertheless but this cannot happen in the general case and causes many extra selects). We are now able to detect this case and split off the unconditional arcs to a common successor: [if] / \ [x(1)] [if] | | \ | | \ | [x(2)] | \ / | [sink.split] | \ / [ end ] Now we can sink the call to x() into %sink.split. This can cause significant code simplification in many testcases. llvm-svn: 280364
* [SimplifyCFG] Change the algorithm in SinkThenElseCodeToEndJames Molloy2016-09-011-0/+84
| | | | | | | | | | | | | | | | | | | | | | | | | r279460 rewrote this function to be able to handle more than two incoming edges and took pains to ensure this didn't regress anything. This time we change the logic for determining if an instruction should be sunk. Previously we used a single pass greedy algorithm - sink instructions until one requires more than one PHI node or we run out of instructions to sink. This had the problem that sinking instructions that had non-identical but trivially the same operands needed extra logic so we sunk them aggressively. For example: %a = load i32* %b %d = load i32* %b %c = gep i32* %a, i32 0 %e = gep i32* %d, i32 1 Sinking %c and %e would naively require two PHI merges as %a != %d. But the loads are obviously equivalent (and maybe can't be hoisted because there is no common predecessor). This is why we implemented the fairly complex function areValuesTriviallySame(), to look through trivial differences like this. However it's just not clever enough. Instead, throw areValuesTriviallySame away, use pointer equality to check equivalence of operands and switch to a two-stage algorithm. In the "scan" stage, we look at every sinkable instruction in isolation from end of block to front. If it's sinkable, we keep track of all operands that required PHI merging. In the "sink" stage, we iteratively sink the last non-terminator in the source blocks. But when calculating how many PHIs are actually required to be inserted (to work out if we should stop or not) we remove any values that have already been sunk from the set of PHI-merges required, which allows us to be more aggressive. This turns an algorithm with potentially recursive lookahead (looking through GEPs, casts, loads and any other instruction potentially not CSE'd) to two linear scans. llvm-svn: 280351
* Add a counter-function insertion passHal Finkel2016-09-011-0/+27
| | | | | | | | | | | | | | | | | | As discussed in https://reviews.llvm.org/D22666, our current mechanism to support -pg profiling, where we insert calls to mcount(), or some similar function, is fundamentally broken. We insert these calls in the frontend, which means they get duplicated when inlining, and so the accumulated execution counts for the inlined-into functions are wrong. Because we don't want the presence of these functions to affect optimizaton, they should be inserted in the backend. Here's a pass which would do just that. The knowledge of the name of the counting function lives in the frontend, so we're passing it here as a function attribute. Clang will be updated to use this mechanism. Differential Revision: https://reviews.llvm.org/D22825 llvm-svn: 280347
* Add -fprofile-dir= to clang.Nick Lewycky2016-08-311-0/+27
| | | | | | | | | | | | | | | | | | | | | -fprofile-dir=path allows the user to specify where .gcda files should be emitted when the program is run. In particular, this is the first flag that causes the .gcno and .o files to have different paths, LLVM is extended to support this. -fprofile-dir= does not change the file name in the .gcno (and thus where lcov looks for the source) but it does change the name in the .gcda (and thus where the runtime library writes the .gcda file). It's different from a GCOV_PREFIX because a user can observe that the GCOV_PREFIX_STRIP will strip paths off of -fprofile-dir= but not off of a supplied GCOV_PREFIX. To implement this we split -coverage-file into -coverage-data-file and -coverage-notes-file to specify the two different names. The !llvm.gcov metadata node grows from a 2-element form {string coverage-file, node dbg.cu} to 3-elements, {string coverage-notes-file, string coverage-data-file, node dbg.cu}. In the 3-element form, the file name is already "mangled" with .gcno/.gcda suffixes, while the 2-element form left that to the middle end pass. llvm-svn: 280306
* [InstCombine] allow icmp (shr exact X, C2), C fold for splat constant vectorsSanjay Patel2016-08-311-3/+1
| | | | | | | The enhancement to foldICmpDivConstant ( http://llvm.org/viewvc/llvm-project?view=revision&revision=280299 ) allows us to remove the ConstantInt check; no other changes needed. llvm-svn: 280300
* [InstCombine] allow icmp (div X, Y), C folds for splat constant vectorsSanjay Patel2016-08-314-44/+27
| | | | | | Converting all of the overflow ops to APInt looked risky, so I've left that as a TODO. llvm-svn: 280299
* [EarlyCSE] Optionally use MemorySSA. NFC.Geoff Berry2016-08-3115-0/+49
| | | | | | | | | | | | | | | | | Summary: Use MemorySSA, if requested, to do less conservative memory dependency checking. This change doesn't enable the MemorySSA enhanced EarlyCSE in the default pipelines, so should be NFC. Reviewers: dberlin, sanjoy, reames, majnemer Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D19821 llvm-svn: 280279
* [EarlyCSE] Allow forwarding a non-invariant load into an invariant load.Geoff Berry2016-08-311-7/+3
| | | | | | | | | | Reviewers: sanjoy Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D23935 llvm-svn: 280265
* [statepoints][experimental] Add support for live-in semantics of values in ↵Philip Reames2016-08-311-0/+23
| | | | | | | | | | | | | | | | deopt bundles This is a first step towards supporting deopt value lowering and reporting entirely with the register allocator. I hope to build on this in the near future to support live-on-return semantics, but I have a use case which allows me to test and investigate code quality with just the live-in semantics so I've chosen to start there. For those curious, my use cases is our implementation of the "__llvm_deoptimize" function we bind to @llvm.deoptimize. I'm choosing not to hard code that fact in the patch and instead make it configurable via function attributes. The basic approach here is modelled on what is done for the "Live In" values on stackmaps and patchpoints. (A secondary goal here is to remove one of the last barriers to merging the pseudo instructions.) We start by adding the operands directly to the STATEPOINT SDNode. Once we've lowered to MI, we extend the remat logic used by the register allocator to fold virtual register uses into StackMap::Indirect entries as needed. This does rely on the fact that the register allocator rematerializes. If it didn't along some code path, we could end up with more vregs than physical registers and fail to allocate. Today, we *only* fold in the register allocator. This can create some weird effects when combined with arguments passed on the stack because we don't fold them appropriately. I have an idea how to fix that, but it needs this patch in place to work on that effectively. (There's some weird interaction with the scheduler as well, more investigation needed.) My near term plan is to land this patch off-by-default, experiment in my local tree to identify any correctness issues and then start fixing codegen problems one by one as I find them. Once I have the live-in lowering fully working (both correctness and code quality), I'm hoping to move on to the live-on-return semantics. Note: I don't have any *known* miscompiles with this patch enabled, but I'm pretty sure I'll find at least a couple. Thus, the "experimental" tag and the fact it's off by default. Differential Revision: https://reviews.llvm.org/D24000 llvm-svn: 280250
* Revert "[SimplifyCFG] Change the algorithm in SinkThenElseCodeToEnd"James Molloy2016-08-311-60/+0
| | | | | | This reverts commit r280216 - it caused buildbot failures. llvm-svn: 280234
* Revert "[SimplifyCFG] Handle tail-sinking of more than 2 incoming branches"James Molloy2016-08-311-110/+0
| | | | | | This reverts commit r280217. r280216 caused buildbot failures - backing out the entire chain. llvm-svn: 280233
* Revert "[SimplifyCFG] Add a workaround to fix PR30188"James Molloy2016-08-311-23/+0
| | | | | | This reverts commit r280219. r280216 caused buildbot failures - backing out the entire chain. llvm-svn: 280232
* Revert "[SimplifyCFG] Fix bootstrap failure after r280220"James Molloy2016-08-311-23/+0
| | | | | | This reverts commit r280228. r280216 caused buildbot failures - backing out the entire sequence. llvm-svn: 280231
OpenPOWER on IntegriCloud