path: root/llvm/lib/Transforms
Commit message | Author | Age | Files | Lines
* [IndVars] Fix a bug noticed by inspection | Philip Reames | 2019-08-23 | 1 | -1/+2
  We were computing the loop exit value, but not ensuring the addrec belonged to the loop whose exit value we were computing. I couldn't actually trip this; the test case shows the basic setup which *might* trip this, but none of the variations I've tried actually do.
  llvm-svn: 369730
* [AlignmentFromAssumptions] getNewAlignmentDiff(): use getURemExpr() | Fangrui Song | 2019-08-23 | 1 | -3/+1
  The alignment is calculated incorrectly, so aligned mov instructions are sometimes not generated, as shown by the example below:

  ```
  // b.cc
  typedef long long index;

  extern "C" index g_tid;
  extern "C" index g_num;

  void add3(float* __restrict__ a, float* __restrict__ b, float* __restrict__ c) {
      index n = 64*1024;
      index m = 16*1024;
      index k = 4*1024;
      index tid = g_tid;
      index num = g_num;
      __builtin_assume_aligned(a, 32);
      __builtin_assume_aligned(b, 32);
      __builtin_assume_aligned(c, 32);
      for (index i0=tid*k; i0<m; i0+=num*k)
          for (index i1=0; i1<n*m; i1+=m)
              for (index i2=0; i2<k; i2++)
                  c[i1+i0+i2] = b[i0+i2] + a[i1+i0+i2];
  }
  ```

  Compile with `clang b.cc -Ofast -march=skylake -mavx2 -S`

  ```
  vmovaps -224(%rdi,%rbx,4), %ymm0
  vmovups -192(%rdi,%rbx,4), %ymm1 # should be movaps
  vmovups -160(%rdi,%rbx,4), %ymm2 # should be movaps
  vmovups -128(%rdi,%rbx,4), %ymm3 # should be movaps
  vaddps -224(%rsi,%rbx,4), %ymm0, %ymm0
  vaddps -192(%rsi,%rbx,4), %ymm1, %ymm1
  vaddps -160(%rsi,%rbx,4), %ymm2, %ymm2
  vaddps -128(%rsi,%rbx,4), %ymm3, %ymm3
  vmovaps %ymm0, -224(%rdx,%rbx,4)
  vmovups %ymm1, -192(%rdx,%rbx,4) # should be movaps
  vmovups %ymm2, -160(%rdx,%rbx,4) # should be movaps
  vmovups %ymm3, -128(%rdx,%rbx,4) # should be movaps
  ```

  Differential Revision: https://reviews.llvm.org/D66575
  Patch by Dun Liang
  llvm-svn: 369723
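The arithmetic behind an offset-based alignment computation can be sketched as follows. This is a minimal illustration, not the pass's code: `alignmentAtOffset` and `isPowerOfTwo` are hypothetical names, and the rule "a power-of-two remainder is itself a usable alignment" is an assumption about how such a helper would typically behave, not something stated in the commit message.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API): true if v is a nonzero power of two.
inline bool isPowerOfTwo(std::uint64_t v) { return v != 0 && (v & (v - 1)) == 0; }

// Hypothetical sketch: given a base pointer known to be aligned to
// baseAlign bytes, return the alignment that can be proven for
// (base + offsetBytes), or 0 if nothing can be proven this way.
inline std::uint64_t alignmentAtOffset(std::uint64_t offsetBytes,
                                       std::uint64_t baseAlign) {
  assert(isPowerOfTwo(baseAlign) && "base alignment must be a power of two");
  // Unsigned remainder of the offset with respect to the base alignment.
  const std::uint64_t rem = offsetBytes % baseAlign;
  if (rem == 0)
    return baseAlign;  // offset is a multiple: full alignment is kept
  if (isPowerOfTwo(rem))
    return rem;        // assumption: a power-of-two remainder is the new alignment
  return 0;            // unknown
}
```

For instance, with a 32-byte aligned base, an offset of 64 keeps the full 32-byte alignment, while an offset of 16 only guarantees 16 bytes.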
* hwasan: Untag unwound stack frames by wrapping personality functions. | Peter Collingbourne | 2019-08-23 | 1 | -8/+102
  One problem with untagging memory in landing pads is that it only works correctly if the function that catches the exception is instrumented. If the function is uninstrumented, we have no opportunity to untag the memory.

  To address this, replace landing pad instrumentation with personality function wrapping. Each function with an instrumented stack has its personality function replaced with a wrapper provided by the runtime. Functions that did not have a personality function to begin with also get wrappers if they may be unwound past.

  As the unwinder calls personality functions during stack unwinding, the original personality function is called and the function's stack frame is untagged by the wrapper if the personality function instructs the unwinder to keep unwinding. If unwinding stops at a landing pad, the function is still responsible for untagging its stack frame if it resumes unwinding.

  The old landing pad mechanism is preserved for compatibility with old runtimes.

  Differential Revision: https://reviews.llvm.org/D66377
  llvm-svn: 369721
* IR. Change strip* family of functions to not look through aliases. | Peter Collingbourne | 2019-08-22 | 2 | -3/+3
  I noticed another instance of the issue where references to aliases were being replaced with aliasees, this time in InstCombine. In the instance that I saw it turned out to be only a QoI issue (a symbol ended up being missing from the symbol table due to the last reference to the alias being removed, preventing HWASAN from symbolizing a global reference), but it could easily have manifested as incorrect behaviour.

  Since this is the third such issue encountered (previously: D65118, D65314) it seems to be time to address this common error/QoI issue once and for all and make the strip* family of functions not look through aliases.

  Includes a test for the specific issue that I saw, but no doubt there are other similar bugs fixed here.

  As with D65118 this has been tested to make sure that the optimization isn't load bearing. I built Clang, Chromium for Linux, Android and Windows as well as the test-suite and there were no size regressions.

  Differential Revision: https://reviews.llvm.org/D66606
  llvm-svn: 369697
* [Attributor][NFC] Move DerefState to header and use StateWrapper | Hideto Ueno | 2019-08-22 | 1 | -96/+1
  Summary: In D65402, I want to get DerefState from AADereferenceable but it was not allowed. This patch moves the DerefState definition into Attributor.h and makes AADereferenceable inherit StateWrapper.
  Reviewers: jdoerfert, sstefan1
  Reviewed By: jdoerfert
  Subscribers: hiraditya, jfb, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D66585
  llvm-svn: 369653
* [Loop Peeling] Fix silly bug in metadata update. | Serguei Katkov | 2019-08-22 | 1 | -6/+6
  We must update the loop metadata before we move to the parent loop, if it is present.
  llvm-svn: 369637
* [Attributor] Fix: Gracefully handle non-instruction users | Johannes Doerfert | 2019-08-21 | 1 | -1/+5
  Functions can have users that are not instructions, e.g., bitcasts. For now, we simply give up when we see them.
  llvm-svn: 369588
* [Attributor][NFC] Fix copy & paste error | Johannes Doerfert | 2019-08-21 | 1 | -1/+1
  llvm-svn: 369577
* [Attributor][NFC] Remove leftover semicolon | Johannes Doerfert | 2019-08-21 | 1 | -1/+1
  llvm-svn: 369576
* [Attributor] Use existing unreachable instead of introducing new ones | Johannes Doerfert | 2019-08-21 | 1 | -0/+3
  So far we split the unreachable off and placed a new one; this is not necessary.
  llvm-svn: 369575
* [GVN] Do PHI translations across all edges between the load and the unavailable pred. | Florian Hahn | 2019-08-21 | 1 | -6/+25
  Currently we do not properly translate addresses with PHIs if LoadBB != LI->getParent(), because PHITranslateAddr expects a direct predecessor as its argument: it considers all instructions outside of the current block as not requiring translation.

  The number of cases that trigger this should be very low, as most single-predecessor blocks should be folded into their predecessor by GVN before we actually start with value numbering. It is still not guaranteed to happen, so we should do PHI translation along all edges between the load's block and the predecessor where we have to place a load.

  There are a few test cases showing current limits of the PHI translation, which could be improved later.

  Reviewers: spatel, reames, efriedma, john.brawn
  Reviewed By: efriedma
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D65020
  llvm-svn: 369570
* [instcombine] icmp eq/ne (sub C, Y), C -> icmp eq/ne Y, 0 | Philip Reames | 2019-08-21 | 1 | -0/+5
  Noticed while looking at pr43028.
  llvm-svn: 369541
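The fold in the title relies on wraparound arithmetic: `C - Y == C` holds exactly when `Y == 0`. A minimal sketch that checks this exhaustively for 8-bit values follows; the function name `subEqFoldHoldsForAllI8` is hypothetical and the check is only an illustration of the identity, not the InstCombine code.

```cpp
#include <cstdint>

// Hypothetical check: returns true if, for every 8-bit C and Y,
// "(C - Y) == C" (with wraparound) agrees with "Y == 0".
inline bool subEqFoldHoldsForAllI8() {
  for (unsigned c = 0; c < 256; ++c) {
    for (unsigned y = 0; y < 256; ++y) {
      // Truncate to 8 bits to model i8 subtraction with wraparound.
      const std::uint8_t diff = static_cast<std::uint8_t>(c - y);
      const bool original = (diff == static_cast<std::uint8_t>(c));
      const bool folded = (y == 0);
      if (original != folded)
        return false;
    }
  }
  return true;
}
```

The `ne` form follows by negating both sides of the comparison.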
* [InstCombine] narrow icmp with extended operands of different widths | Sanjay Patel | 2019-08-21 | 1 | -6/+17
  An intermediate extend is used to widen the narrow operand to the width of the other (wider) operand. At that point, we have the same logic as the existing transform that was restricted to folds of equal width zext/sext.

  This mostly solves PR42700: https://bugs.llvm.org/show_bug.cgi?id=42700
  llvm-svn: 369519
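As a rough illustration of why the intermediate extend is sound, the sketch below compares a zero-extended 8-bit value with a zero-extended 16-bit value at 32 bits, and again after widening only the 8-bit operand to 16 bits. The function name is hypothetical, and only the zext case with `eq` and unsigned less-than is covered; the exact predicates and the sext handling in the commit are not reproduced here.

```cpp
#include <cstdint>

// Hypothetical check: comparing at 32 bits gives the same answer as
// comparing at 16 bits after widening only the narrow (8-bit) operand.
inline bool narrowedZextCompareHolds() {
  for (std::uint32_t x = 0; x <= 0xFF; ++x) {
    for (std::uint32_t y = 0; y <= 0xFFFF; ++y) {
      const std::uint8_t x8 = static_cast<std::uint8_t>(x);
      const std::uint16_t y16 = static_cast<std::uint16_t>(y);
      // Original form: both operands zero-extended to 32 bits.
      const std::uint32_t wideX = x8;
      const std::uint32_t wideY = y16;
      // Narrowed form: intermediate zero-extend of the 8-bit operand to 16 bits.
      const std::uint16_t midX = x8;
      if ((wideX == wideY) != (midX == y16))
        return false;
      if ((wideX < wideY) != (midX < y16))
        return false;
    }
  }
  return true;
}
```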
* [Attributor] Liveness for internal functions. | Stefan Stipanovic | 2019-08-20 | 1 | -2/+43
  For an internal function, if all its call sites are dead, the body of the function is considered dead.
  Reviewers: jdoerfert, uenoku
  Subscribers: hiraditya, llvm-commits
  Differential Revision: https://reviews.llvm.org/D66155
  llvm-svn: 369470
* [Sanitizer] Remove unused functions | Alexandre Ganea | 2019-08-20 | 1 | -9/+0
  Differential Revision: https://reviews.llvm.org/D66503
  llvm-svn: 369468
* [Attributor] Remove unused variable. NFC. | Michael Liao | 2019-08-20 | 1 | -1/+1
  llvm-svn: 369444
* [AutoFDO] Make call targets order deterministic for sample profile | Wenlei He | 2019-08-20 | 1 | -13/+8
  Summary: StringMap is used to store the call-target-to-frequency map for AutoFDO. However, the iteration order of StringMap is non-deterministic, which leads to non-determinism in the AutoFDO profile output. New APIs getSortedCallTargets and SortCallTargets are added for deterministic ordering and output. Roundtrip tests for text and binary profiles are added.
  Reviewers: wmi, davidxl, danielcdh
  Subscribers: hiraditya, mgrang, llvm-commits, twoh
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D66191
  llvm-svn: 369440
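The general technique (copy the entries of an unordered map into a vector and sort with a total order) can be sketched as below. This uses `std::unordered_map` as a stand-in for StringMap; the function name `sortedCallTargets` and the specific order (count descending, then name ascending) are assumptions for illustration, not the signatures or ordering confirmed by the commit message.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using CallTarget = std::pair<std::string, std::uint64_t>;

// Hypothetical helper: returns the call targets in a deterministic order,
// independent of the hash map's iteration order.
inline std::vector<CallTarget> sortedCallTargets(
    const std::unordered_map<std::string, std::uint64_t>& targets) {
  std::vector<CallTarget> sorted(targets.begin(), targets.end());
  std::sort(sorted.begin(), sorted.end(),
            [](const CallTarget& l, const CallTarget& r) {
              // Assumed order: higher count first; ties broken by name so
              // that the comparison is a total order over distinct keys.
              if (l.second != r.second)
                return l.second > r.second;
              return l.first < r.first;
            });
  return sorted;
}
```

The tie-break on the name matters: sorting by count alone would leave equal-count entries in whatever order the map happened to yield them.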
* [InstCombine] add helper function for icmp+zext/sext; NFC | Sanjay Patel | 2019-08-20 | 1 | -69/+75
  llvm-svn: 369421
* [InstCombine] make fold for icmp with sext more efficient; NFC | Sanjay Patel | 2019-08-20 | 1 | -13/+7
  We were creating 2 instructions and relying on a subsequent fold to invert a not(icmp). Create the final icmp directly instead.
  llvm-svn: 369411
* [InstCombine] improve readability for icmp with cast folds; NFC | Sanjay Patel | 2019-08-20 | 2 | -47/+42
  1. Update function name and stale code comments.
  2. Use variable names that are less ambiguous.
  3. Move operand checks into the function as early exits.
  llvm-svn: 369390
* [BlockExtractor] Avoid assert with wrong line format | Jinsong Ji | 2019-08-20 | 1 | -0/+2
  Summary: When the line format is wrong, we may end up accessing out-of-bounds memory. E.g., the test with an invalid line triggers an assertion:
    Assertion `idx < size()' failed
  The fix is to report a fatal error when we find a mismatched line format.
  Reviewers: qcolombet, volkan
  Reviewed By: qcolombet
  Subscribers: hiraditya, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D66444
  llvm-svn: 369389
* [InstCombine] simplify min/max of min/max with same operands (PR35607) | Sanjay Patel | 2019-08-20 | 1 | -0/+10
  This is the original integer variant requested in: https://bugs.llvm.org/show_bug.cgi?id=35607

  As noted in the TODO and several similar TODOs around this block, we could do this in instsimplify, but then it would cost more because we would be trying to match min/max via ValueTracking in 2 different places.

  There are 4 commuted variants for each of smin/smax/umin/umax that are not matched here. There are also icmp predicate variants that are not included in the affected test file because they are already handled by instsimplify by folding the final icmp to true/false.

  https://rise4fun.com/Alive/3KVc

  Name: smax(smax, smin)
    %c1 = icmp slt i32 %x, %y
    %c2 = icmp slt i32 %y, %x
    %min = select i1 %c1, i32 %x, i32 %y
    %max = select i1 %c2, i32 %x, i32 %y
    %c3 = icmp sgt i32 %max, %min
    %r = select i1 %c3, i32 %max, i32 %min
    =>
    %r = %max

  Name: smin(smax, smin)
    %c1 = icmp slt i32 %x, %y
    %c2 = icmp slt i32 %y, %x
    %min = select i1 %c1, i32 %x, i32 %y
    %max = select i1 %c2, i32 %x, i32 %y
    %c3 = icmp sgt i32 %max, %min
    %r = select i1 %c3, i32 %min, i32 %max
    =>
    %r = %min

  Name: umax(umax, umin)
    %c1 = icmp ult i32 %x, %y
    %c2 = icmp ult i32 %y, %x
    %min = select i1 %c1, i32 %x, i32 %y
    %max = select i1 %c2, i32 %x, i32 %y
    %c3 = icmp ult i32 %min, %max
    %r = select i1 %c3, i32 %max, i32 %min
    =>
    %r = %max

  Name: umin(umax, umin)
    %c1 = icmp ult i32 %x, %y
    %c2 = icmp ult i32 %y, %x
    %min = select i1 %c1, i32 %x, i32 %y
    %max = select i1 %c2, i32 %x, i32 %y
    %c3 = icmp ult i32 %min, %max
    %r = select i1 %c3, i32 %min, i32 %max
    =>
    %r = %min

  llvm-svn: 369386
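The identities proved by the Alive listings can also be checked by brute force over a small value range. The sketch below does so for the signed 8-bit range and the unsigned 8-bit range; the function name `minMaxOfMinMaxFoldsHold` is hypothetical, and `std::min`/`std::max` stand in for the icmp+select patterns.

```cpp
#include <algorithm>

// Hypothetical check: max(max(x, y), min(x, y)) == max(x, y) and
// min(max(x, y), min(x, y)) == min(x, y), for signed and unsigned values.
inline bool minMaxOfMinMaxFoldsHold() {
  // Signed case, over the i8 value range.
  for (int x = -128; x <= 127; ++x) {
    for (int y = -128; y <= 127; ++y) {
      const int mx = std::max(x, y);
      const int mn = std::min(x, y);
      if (std::max(mx, mn) != mx || std::min(mx, mn) != mn)
        return false;
    }
  }
  // Unsigned case, over the u8 value range.
  for (unsigned x = 0; x < 256; ++x) {
    for (unsigned y = 0; y < 256; ++y) {
      const unsigned mx = std::max(x, y);
      const unsigned mn = std::min(x, y);
      if (std::max(mx, mn) != mx || std::min(mx, mn) != mn)
        return false;
    }
  }
  return true;
}
```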
* [Attributor] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds after r369331 | Fangrui Song | 2019-08-20 | 1 | -0/+1
  llvm-svn: 369334
* [Attributor] Create abstract attributes on-demand | Johannes Doerfert | 2019-08-20 | 1 | -140/+194
  Before, we created the set of abstract attributes initially and then dealt with the fact that a lookup could fail, e.g., return a nullptr. This patch will ensure we always return a valid object from a lookup, allowing us not only to remove the nullptr checks but also to grow the set of abstract attributes "in-flight" on-demand. One can now start from those that have the best chance of improving performance without the need to specify all they might depend on.

  While this introduces some boilerplate, the usage of attributes is much easier and cleaner now.

  Reviewers: uenoku, sstefan1
  Subscribers: hiraditya, bollu, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D66276
  llvm-svn: 369331
* [Attributor][NFC] Cleanup statistics code | Johannes Doerfert | 2019-08-20 | 1 | -4/+7
  llvm-svn: 369330
* [Attributor] Use structured deduction for AADereferenceable | Johannes Doerfert | 2019-08-20 | 1 | -139/+74
  Summary: This is analogous to D66128 but for AADereferenceable. We have the logic concentrated in the floating value updateImpl and we use the combiner helper classes for arguments and return values.

  The regressions will go away with "on-demand" attribute creation. Improvements are already visible in the existing tests.

  Reviewers: uenoku, sstefan1
  Subscribers: hiraditya, bollu, jfb, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D66272
  llvm-svn: 369329
* [Attributor] Use structured deduction for AANonNull | Johannes Doerfert | 2019-08-20 | 1 | -103/+83
  Summary: What D66126 did for AAAlign, this patch does for AANonNull. Again, the logic becomes more concise and localized. Again, returned pointers are not annotated properly but that will not be an issue if this lands with the "on-demand" generation of attributes.

  First improvements due to the genericValueTraversal are already visible.

  Reviewers: sstefan1, uenoku
  Subscribers: hiraditya, bollu, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D66128
  llvm-svn: 369328
* [Attributor] Fix the "clamp" operator | Johannes Doerfert | 2019-08-20 | 1 | -0/+6
  The clamp operator should not take the known of the given state as the known is potentially based on assumed information.

  This also adds TODOs to guide improvements.
  llvm-svn: 369327
* [SLP][NFC] Avoid repetitive calls to getSameOpcode() | Dinar Temirbulatov | 2019-08-20 | 1 | -120/+176
  We can avoid repetitive calls to getSameOpcode() for already known tree elements by keeping MainOp and AltOp in TreeEntry.
  Differential Revision: https://reviews.llvm.org/D64700
  llvm-svn: 369315
* Recommit "[Attributor] Fix: Do not partially resolve returned calls." | Johannes Doerfert | 2019-08-19 | 1 | -11/+28
  This reverts commit b1752f670f3d6393306dd5d37546b6e23384d8a2.

  Fixed the issue with a different commit, reapply this one as it was, afaik, not broken.
  llvm-svn: 369303
* Refactor isPointerOffset (NFC). | Evgeniy Stepanov | 2019-08-19 | 1 | -7/+7
  Summary: Simplify the API using Optional<> and address comments in https://reviews.llvm.org/D66165
  Reviewers: vitalybuka
  Subscribers: hiraditya, llvm-commits, ostannard, pcc
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D66317
  llvm-svn: 369300
* Re-apply fixed "[Attributor] Fix: Make sure we set the changed flag" | Johannes Doerfert | 2019-08-19 | 1 | -4/+4
  This reverts commit cedd0d9a6e4b433e1cd6585d1d4d152eb5e60b11.

  Re-apply the original commit but make sure the variables are initialized (even if they are not used) so UBSan is not complaining.
  llvm-svn: 369294
* [MemorySSA] Rename uses when inserting memory uses. | Alina Sbirlea | 2019-08-19 | 1 | -2/+4
  Summary: When inserting uses from outside the MemorySSA creation, we don't normally need to rename uses, based on the assumption that there will be no inserted Phis (if a Def existed that required a Phi, that Phi already exists). However, when dealing with unreachable blocks, MemorySSA will optimize away Phis whose incoming blocks are unreachable, and these Phis end up being re-added when inserting a Use. There are two potential solutions here:
  1. Analyze the inserted Phis and clean them up if they are unneeded (the current method for cleaning up trivial phis does not cover this).
  2. Leave the Phi in place and rename uses, the same way as when inserting defs.
  This patch uses approach 2.

  Resolves first test in PR42940.

  Reviewers: george.burgess.iv
  Subscribers: Prazek, sanjoy.google, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D66033
  llvm-svn: 369291
* [SLP] reduce duplicated code; NFC | Sanjay Patel | 2019-08-19 | 1 | -2/+4
  llvm-svn: 369250
* Revert [Attributor] Fix: Make sure we set the changed flag | David L. Jones | 2019-08-19 | 1 | -4/+4
  This reverts r369159 (git commit cbaf1fdea2de891bdbc49cdec89ae2077e6b9ed0)

  r369160 caused a test to fail under UBSAN. See thread on llvm-commits.
  llvm-svn: 369241
* Revert [Attributor] Fix: Do not partially resolve returned calls. | David L. Jones | 2019-08-19 | 1 | -28/+11
  This reverts r369160 (git commit f72d9b1c97b41fff48ad1eecbba59a29c171bff4)

  r369160 caused some tests to fail under UBSAN. See thread on llvm-commits.
  llvm-svn: 369236
* [InstCombine] Cherry-pick NFC cleanups of foldShiftIntoShiftInAnotherHandOfAndInICmp() from D66383 | Roman Lebedev | 2019-08-18 | 1 | -5/+8
  llvm-svn: 369207
* [MemorySSA] Loop passes should mark MSSA preserved when available. | Alina Sbirlea | 2019-08-17 | 5 | -6/+6
  This patch applies only to the new pass manager.

  Currently, when MSSA Analysis is available and passed to each loop pass, it will be preserved by that loop pass. Hence, mark the analysis preserved based on that condition, vs the current `EnableMSSALoopDependency`. This leaves the global flag to affect only the entry point in the loop pass manager (in FunctionToLoopPassAdaptor).
  llvm-svn: 369181
* Revert r367891 - "[InstCombine] combine mul+shl separated by zext" | Sanjay Patel | 2019-08-16 | 1 | -13/+2
  This reverts commit 5dbb90bfe14ace30224239cac7c61a1422fa5144.

  As noted in the post-commit thread for r367891, this can create a multiply that is lowered to a libcall that may not exist.

  We need to improve the backend decomposition for integer multiply before trying to re-land this (if it's still worthwhile after doing the backend work).
  llvm-svn: 369174
* Reland "[ARM] push LR before __gnu_mcount_nc" | Jian Cai | 2019-08-16 | 1 | -1/+1
  This relands r369147 with fixes to unit tests.
  https://reviews.llvm.org/D65019
  llvm-svn: 369173
* [Attributor] Fix: Do not partially resolve returned calls. | Johannes Doerfert | 2019-08-16 | 1 | -11/+28
  By partially resolving returned calls we did not record that they were not fully resolved which caused odd behavior down the line. We could also end up with some, but not all, returned values of the callee in the returned values map of the caller, another odd behavior we want to avoid.
  llvm-svn: 369160
* [Attributor] Fix: Make sure we set the changed flag | Johannes Doerfert | 2019-08-16 | 1 | -4/+4
  The flag was updated *before* we actually run the visitor callback so we might miss updates.
  llvm-svn: 369159
* [Attributor] Add all missing attribute definitions/symbols | Johannes Doerfert | 2019-08-16 | 1 | -35/+117
  As a preparation to "on-demand" abstract attribute generation we need implementations for all attributes (as they can be queried and then created on-demand where we now fail to find one).
  Reviewers: uenoku, sstefan1
  Subscribers: hiraditya, bollu, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D66129
  llvm-svn: 369155
* Revert "[ARM] push LR before __gnu_mcount_nc" | Jian Cai | 2019-08-16 | 1 | -1/+1
  This reverts commit f4cf3b959333f62b7a7b2d7771f7010c9d8da388.
  llvm-svn: 369149
* [ARM] push LR before __gnu_mcount_nc | Jian Cai | 2019-08-16 | 1 | -1/+1
  Push LR register before calling __gnu_mcount_nc as it expects the value of LR register to be the top value of the stack on ARM32.
  Differential Revision: https://reviews.llvm.org/D65019
  llvm-svn: 369147
* [Attributor] Towards a more structured deduction pattern | Johannes Doerfert | 2019-08-16 | 1 | -111/+222
  Summary: This is the first commit aiming to structure the attribute deduction. The base idea is that we have default propagation patterns as listed below on top of which we can add specific, e.g., context sensitive, logic.

  Deduction patterns used in this patch:
  - argument states are determined from call site argument states, see AAAlignArgument and AAArgumentFromCallSiteArguments.
  - call site argument states are determined as if they were floating values, see AAAlignCallSiteArgument and AAAlignFloating.
  - floating value states are determined by traversing the def-use chain and combining the states determined for the leaves, see AAAlignFloating and genericValueTraversal.
  - call site return states are determined from function return states, see AAAlignCallSiteReturned and AACallSiteReturnedFromReturned.
  - function return states are determined from returned value states, see AAAlignReturned and AAReturnedFromReturnedValues.

  Through this strategy all logic for alignment is concentrated in the AAAlignFloating::updateImpl method.

  Note: This commit works on its own but is part of a larger change that involves "on-demand" creation of abstract attributes that will participate in the fixpoint iteration. Without this part, we sometimes do not have an AAAlign abstract attribute to query, losing information we determined before. All tests have appropriate FIXMEs and the information will be recovered once we have added all parts.

  Reviewers: sstefan1, uenoku
  Subscribers: hiraditya, bollu, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D66126
  llvm-svn: 369144
* [Attributor][NFC] Introduce aliases for call site attributes | Johannes Doerfert | 2019-08-16 | 1 | -7/+40
  Until we have call site specific liveness and/or value information there is no need to do call site specific deduction. Though, we need the symbols in follow up patches that make Attributor::getAAFor return a reference.
  llvm-svn: 369143
* [Attributor] Introduce initialize calls and move code to keep attributes concise | Johannes Doerfert | 2019-08-16 | 1 | -179/+180
  Summary: This patch should not change the behavior except that the added initialize methods might indicate an optimistic fixpoint earlier. The code movement is done to keep the attribute definitions in a single block where it makes sense. No functional changes intended there.
  Reviewers: uenoku, sstefan1
  Subscribers: hiraditya, bollu, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D66258
  llvm-svn: 369142
* [InstCombine] canonicalize a scalar-select-of-vectors to vector select | Sanjay Patel | 2019-08-16 | 1 | -0/+27
  This pattern may arise more frequently with an enhancement to SLP vectorization suggested in PR42755: https://bugs.llvm.org/show_bug.cgi?id=42755 ...but we should handle this pattern to make things easier for the backend either way.

  For all in-tree targets that I looked at, codegen for typical vector sizes looks better when we change to a vector select, so this is safe to do without a cost model (in other words, as a target-independent canonicalization).

  For example, if the condition of the select is a scalar, we end up with something like this on x86:
    vpcmpgtd %xmm0, %xmm1, %xmm0
    vpextrb $12, %xmm0, %eax
    testb $1, %al
    jne LBB0_2
    ## %bb.1:
    vmovaps %xmm3, %xmm2
    LBB0_2:
    vmovaps %xmm2, %xmm0

  Rather than the splat-condition variant:
    vpcmpgtd %xmm0, %xmm1, %xmm0
    vpshufd $255, %xmm0, %xmm0 ## xmm0 = xmm0[3,3,3,3]
    vblendvps %xmm0, %xmm2, %xmm3, %xmm0

  Differential Revision: https://reviews.llvm.org/D66095
  llvm-svn: 369140
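The equivalence that makes this canonicalization possible can be sketched in plain C++: selecting between two whole vectors on one scalar condition gives the same result as a lane-wise select whose mask is that condition splatted to every lane. The names `Vec4`, `Mask4`, `scalarSelect`, `vectorSelect` and `splat` are hypothetical and only model the semantics; they are not LLVM APIs.

```cpp
#include <array>
#include <cstddef>

using Vec4 = std::array<int, 4>;
using Mask4 = std::array<bool, 4>;

// Scalar condition choosing between two whole vectors.
inline Vec4 scalarSelect(bool cond, const Vec4& a, const Vec4& b) {
  return cond ? a : b;
}

// Lane-wise select: each lane picks from a or b according to its mask bit.
inline Vec4 vectorSelect(const Mask4& mask, const Vec4& a, const Vec4& b) {
  Vec4 result{};
  for (std::size_t i = 0; i < result.size(); ++i)
    result[i] = mask[i] ? a[i] : b[i];
  return result;
}

// Broadcast one condition bit to all lanes.
inline Mask4 splat(bool cond) { return {cond, cond, cond, cond}; }
```

With these definitions, `vectorSelect(splat(c), a, b)` equals `scalarSelect(c, a, b)` for either value of `c`.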
* [SLPVectorizer] Make the scheduler aware of the TreeEntry operands. | Vasileios Porpodas | 2019-08-16 | 1 | -79/+171
  Summary: The scheduler's dependence graph gets the use-def dependencies by accessing the operands of the instructions in a bundle. However, buildTree_rec() may change the order of the operands in TreeEntry, and the scheduler is currently not aware of this. This is not causing any functional issues currently, because reordering is restricted to the operands of a single instruction. Once we support operand reordering across multiple TreeEntries, as shown here: http://www.llvm.org/devmtg/2019-04/slides/Poster-Porpodas-Supernode_SLP.pdf , the scheduler will need to get the correct operands from TreeEntry and not from the individual instructions.

  In short, this patch:
  - Connects the scheduler's bundle with the corresponding TreeEntry. It introduces new TE and Lane fields in ScheduleData.
  - Moves the location where the operands of the TreeEntry are initialized. This used to take place in newTreeEntry() setting one operand at a time, but is now moved pre-order just before the recursion of buildTree_rec(). This is required because the scheduler needs to access both operands of the TreeEntry in tryScheduleBundle().
  - Updates the scheduler to access the instruction operands through the TreeEntry operands instead of accessing the instruction operands directly.

  Reviewers: ABataev, RKSimon, dtemirbulatov, Ayal, dorit, hfinkel
  Reviewed By: ABataev
  Subscribers: hiraditya, llvm-commits, lebedev.ri, rcorcs
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D62432
  llvm-svn: 369131