summaryrefslogtreecommitdiffstats
path: root/llvm/test/Transforms/LoopStrengthReduce
Commit message (Collapse)AuthorAgeFilesLines
* AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernelMatt Arsenault2017-03-215-12/+12
| | | | | | | | | | | | Currently the default C calling convention functions are treated the same as compute kernels. Make this explicit so the default calling convention can be changed to a non-kernel. Converted with perl -pi -e 's/define void/define amdgpu_kernel void/' on the relevant test directories (and undoing in one place that actually wanted a non-kernel). llvm-svn: 298444
* Set option enabling LSR alternative way to resolve complex solution to false.Evgeny Stupachenko2017-03-042-2/+2
| | | | | | | Differential Revision: http://reviews.llvm.org/D29862 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 296959
* [LSR] Canonicalize formula and put recursive Reg related with current loop ↵Wei Mi2017-02-221-0/+65
| | | | | | | | | | | | | | | | in ScaledReg. After rL294814, LSR formula can have multiple SCEVAddRecExprs inside of its BaseRegs. Previous canonicalization will swap the first SCEVAddRecExpr in BaseRegs with ScaledReg. But now we want to swap the SCEVAddRecExpr Reg related with current loop with ScaledReg. Otherwise, we may generate code like this: RegA + lsr.iv + RegB, where loop invariant parts RegA and RegB are not grouped together and cannot be promoted outside of loop. With this patch, it will ensure lsr.iv to be generated later in the expr: RegA + RegB + lsr.iv, so that RegA + RegB can be promoted outside of loop. Differential Revision: https://reviews.llvm.org/D26781 llvm-svn: 295884
* The patch introduces new way of narrowing complex (>UINT16 variants) solutions.Evgeny Stupachenko2017-02-212-2/+2
| | | | | | | | | | | | | | | | | | | The new method introduced under "-lsr-exp-narrow" option (currenlty set to true). Summary: The method is based on registers number mathematical expectation and should be generally closer to optimal solution. Please see details in comments to "LSRInstance::NarrowSearchSpaceByDeletingCostlyFormulas()" function (in lib/Transforms/Scalar/LoopStrengthReduce.cpp). Reviewers: qcolombet Differential Revision: http://reviews.llvm.org/D29862 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 295704
* [LSR] Prevent formula with SCEVAddRecExpr type of Reg from Sibling loopsWei Mi2017-02-161-0/+97
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In rL294814, we allow formula with SCEVAddRecExpr type of Reg from loops other than current loop. This is good for the case when induction variable of outerloop being used in expr in innerloop. But it is very bad to allow such Reg from sibling loop because we may need to add lsr.iv in other sibling loops when scev expanding those SCEVAddRecExpr type exprs. For the testcase below, one loop can be inserted with a bunch of lsr.iv because of LSR for other loops. // The induction variable j from a loop in the middle will have initial // value generated from previous sibling loop and exit value used by its // next sibling loop. void goo(long i, long j); long cond; void foo(long N) { long i = 0; long j = 0; i = 0; do { goo(i, j); i++; j++; } while (cond); i = 0; do { goo(i, j); i++; j++; } while (cond); i = 0; do { goo(i, j); i++; j++; } while (cond); i = 0; do { goo(i, j); i++; j++; } while (cond); i = 0; do { goo(i, j); i++; j++; } while (cond); i = 0; do { goo(i, j); i++; j++; } while (cond); } The fix is to only allow formula with SCEVAddRecExpr type of Reg from current loop or its parents. Differential Revision: https://reviews.llvm.org/D30021 llvm-svn: 295378
* [LSR] Pointers with different address spaces are considered incompatible.Mikael Holmen2017-02-141-0/+31
| | | | | | | | | | | | | | | | | | | | | | Summary: Function isCompatibleIVType is already used as a guard before the call to SE.getMinusSCEV(OperExpr, PrevExpr); in LSRInstance::ChainInstruction. getMinusSCEV requires the expressions to be of the same type, so we now consider two pointers with different address spaces to be incompatible, since it is possible that the pointers in fact have different sizes. Reviewers: qcolombet, eli.friedman Reviewed By: qcolombet Subscribers: nhaehnle, Ka-Ka, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D29885 llvm-svn: 295033
* The patch fixes r294821Evgeny Stupachenko2017-02-112-6/+6
| | | | | | | | Summary: Update register match for windows testing From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 294825
* Fix PR23384 (under "-lsr-insns-cost" option)Evgeny Stupachenko2017-02-112-0/+110
| | | | | | | | | | | | | Summary: The patch adds instructions number generated by a solution to LSR cost under "-lsr-insns-cost" option. Reviewers: qcolombet, hfinkel Differential Revision: http://reviews.llvm.org/D28307 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 294821
* [LSR] Recommit: Allow formula containing Reg for SCEVAddRecExpr related with ↵Wei Mi2017-02-111-0/+65
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | outerloop. The recommit includes some changes of testcases. No functional change to the patch. In RateRegister of existing LSR, if a formula contains a Reg which is a SCEVAddRecExpr, and this SCEVAddRecExpr's loop is an outerloop, the formula will be marked as Loser and dropped. Suppose we have an IR that %for.body is outerloop and %for.body2 is innerloop. LSR only handle inner loop now so only %for.body2 will be handled. Using the logic above, formula like reg(%array) + reg({1,+, %size}<%for.body>) + 1*reg({0,+,1}<%for.body2>) will be dropped no matter what because reg({1,+, %size}<%for.body>) is a SCEVAddRecExpr type reg related with outerloop. Only formula like reg(%array) + 1*reg({{1,+, %size}<%for.body>,+,1}<nuw><nsw><%for.body2>) will be kept because the SCEVAddRecExpr related with outerloop is folded into the initial value of the SCEVAddRecExpr related with current loop. But in some cases, we do need to share the basic induction variable reg{0 ,+, 1}<%for.body2> among LSR Uses to reduce the final total number of induction variables used by LSR, so we don't want to drop the formula like reg(%array) + reg({1,+, %size}<%for.body>) + 1*reg({0,+,1}<%for.body2>) unconditionally. From the existing comment, it tries to avoid considering multiple level loops at the same time. However, existing LSR only handles innermost loop, so for any SCEVAddRecExpr with a loop other than current loop, it is an invariant and will be simple to handle, and the formula doesn't have to be dropped. Differential Revision: https://reviews.llvm.org/D26429 llvm-svn: 294814
* LSR: Check atomic instruction pointer operandsMatt Arsenault2017-02-081-0/+87
| | | | llvm-svn: 294410
* LSR: Don't drop address space when type doesn't matchMatt Arsenault2017-01-301-0/+54
| | | | | | | | | | For targets with different addressing modes in each address space, if this is dropped querying isLegalAddressingMode later with this will give a nonsense result, breaking the isLegalUse assertions. This is a candidate for the 4.0 release branch. llvm-svn: 293542
* This test apparently requires an x86 target and is failing on numerousChandler Carruth2017-01-231-0/+48
| | | | | | | | | | | bots ever since d0k fixed the CHECK lines so that it did something at all. It isn't actually testing SCEV directly but LSR, so move it into LSR and the x86-specific tree of tests that already exists there. Target dependence is common and unavoidable with the current design of LSR. llvm-svn: 292774
* [PM] Clean up the testing for IVUsers, especially with the new PM.Chandler Carruth2017-01-151-54/+0
| | | | | | | | | | First, I've moved a test of IVUsers from the LSR tree to a dedicated IVUsers test directory. I've also simplified its RUN line now that the new pass manager's loop PM is providing analyses on their own. No functionality changed, but it makes subsequent changes cleaner. llvm-svn: 292060
* [LoopStrengthReduce] Don't bother rewriting PHIs in catchswitch blocksDavid Majnemer2017-01-131-0/+58
| | | | | | | | | The catchswitch instruction cannot be split, don't bother trying to rewrite it. This fixes PR31627. llvm-svn: 291966
* Revert r286999 which caused buildbot test failures. Some testcases need to ↵Wei Mi2016-11-151-65/+0
| | | | | | be made target specific. llvm-svn: 287014
* [LSR] Allow formula containing Reg for SCEVAddRecExpr related with outerloop.Wei Mi2016-11-151-0/+65
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In RateRegister of existing LSR, if a formula contains a Reg which is a SCEVAddRecExpr, and this SCEVAddRecExpr's loop is an outerloop, the formula will be marked as Loser and dropped. Suppose we have an IR that %for.body is outerloop and %for.body2 is innerloop. LSR only handle inner loop now so only %for.body2 will be handled. Using the logic above, formula like reg(%array) + reg({1,+, %size}<%for.body>) + 1*reg({0,+,1}<%for.body2>) will be dropped no matter what because reg({1,+, %size}<%for.body>) is a SCEVAddRecExpr type reg related with outerloop. Only formula like reg(%array) + 1*reg({{1,+, %size}<%for.body>,+,1}<nuw><nsw><%for.body2>) will be kept because the SCEVAddRecExpr related with outerloop is folded into the initial value of the SCEVAddRecExpr related with current loop. But in some cases, we do need to share the basic induction variable reg{0 ,+, 1}<%for.body2> among LSR Uses to reduce the final total number of induction variables used by LSR, so we don't want to drop the formula like reg(%array) + reg({1,+, %size}<%for.body>) + 1*reg({0,+,1}<%for.body2>) unconditionally. From the existing comment, it tries to avoid considering multiple level loops at the same time. However, existing LSR only handles innermost loop, so for any SCEVAddRecExpr with a loop other than current loop, it is an invariant and will be simple to handle, and the formula doesn't have to be dropped. Differential Revision: https://reviews.llvm.org/D26429 llvm-svn: 286999
* [ARM] Loop Strength Reduction crashes when targeting ARM or Thumb.Alexandros Lamprineas2016-11-091-0/+35
| | | | | | | | | | | Scalar Evolution asserts when not all the operands of an Add Recurrence Expression are loop invariants. Loop Strength Reduction should only create affine Add Recurrences, so that both the start and the step of the expression are loop invariants. Differential Revision: https://reviews.llvm.org/D26185 llvm-svn: 286347
* Fix testcases failing after r284036Krzysztof Parzyszek2016-10-121-3/+1
| | | | | | The codegen has changed slightly between my tests and the commit. llvm-svn: 284049
* Do not remove implicit defs in BranchFolderKrzysztof Parzyszek2016-10-121-0/+1
| | | | | | | | | | | Branch folder removes implicit defs if they are the only non-branching instructions in a block, and the branches do not use the defined registers. The problem is that in some cases these implicit defs are required for the liveness information to be correct. Differential Revision: https://reviews.llvm.org/D25478 llvm-svn: 284036
* [LSR] Don't try and create post-inc expressions on non-rotated loopsJames Molloy2016-08-151-0/+43
| | | | | | | | | | | | | | | If a loop is not rotated (for example when optimizing for size), the latch is not the backedge. If we promote an expression to post-inc form, we not only increase register pressure and add a COPY for that IV expression but for all IVs! Motivating testcase: void f(float *a, float *b, float *c, int n) { while (n-- > 0) *c++ = *a++ + *b++; } It's imperative that the pointer increments be located in the latch block and not the header block; if not, we cannot use post-increment loads and stores and we have to keep both the post-inc and pre-inc values around until the end of the latch which bloats register usage. llvm-svn: 278658
* [SCEV] Update interface to handle SCEVExpander insert point motion.Geoff Berry2016-08-111-0/+47
| | | | | | | | | | | | | | | | | | | | Summary: This is an extension of the fix in r271424. That fix dealt with builder insert points being moved by SCEV expansion, but only for the lifetime of the expand call. This change modifies the interface so that LSR can safely call expand multiple times at the same insert point and do the right thing if one of the expansions decides to move the original insert point. This is a fix for PR28719. Reviewers: sanjoy Subscribers: llvm-commits, mcrosier, mzolotukhin Differential Revision: https://reviews.llvm.org/D23342 llvm-svn: 278413
* [PM] Significantly refactor the pass pipeline parsing to be easier toChandler Carruth2016-08-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | reason about and less error prone. The core idea is to fully parse the text without trying to identify passes or structure. This is done with a single state machine. There were various bugs in the logic around this previously that were repeated and scattered across the code. Having a single routine makes it much easier to fix and get correct. For example, this routine doesn't suffer from PR28577. Then the actual pass construction is handled using *much* easier to read code and simple loops, with particular pass manager construction sunk to live with other pass construction. This is especially nice as the pass managers *are* in fact passes. Finally, the "implicit" pass manager synthesis is done much more simply by forming "pre-parsed" structures rather than having to duplicate tons of logic. One of the bugs fixed by this was evident in the tests where we accepted a pipeline that wasn't really well formed. Another bug is PR28577 for which I have added a test case. The code is less efficient than the previous code but I'm really hoping that's not a priority. ;] Thanks to Sean for the review! Differential Revision: https://reviews.llvm.org/D22724 llvm-svn: 277561
* [PM] Convert Loop Strength Reduce pass to new PMDehao Chen2016-07-181-0/+1
| | | | | | | | | | | | Summary: Convert Loop String Reduce pass to new PM Reviewers: davidxl, silvas Subscribers: junbuml, sanjoy, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D22468 llvm-svn: 275919
* [PM] Convert IVUsers analysis to new pass manager.Dehao Chen2016-07-161-0/+1
| | | | | | | | | | | | Summary: Convert IVUsers analysis to new pass manager. Reviewers: davidxl, silvas Subscribers: junbuml, sanjoy, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D22434 llvm-svn: 275698
* AMDGPU: Run pointer optimization passesMatt Arsenault2016-06-151-1/+2
| | | | llvm-svn: 272736
* Reapply [LSR] Create fewer redundant instructions.Geoff Berry2016-06-062-0/+81
| | | | | | | | | | | | | | | | | | | Summary: Fix LSRInstance::HoistInsertPosition() to check the original insert position block first for a canonical insertion point that is dominated by all inputs. This leads to SCEV being able to reuse more instructions since it currently tracks the instructions it creates for reuse by keeping a table of <Value, insert point> pairs. Originally reviewed in http://reviews.llvm.org/D18001 Reviewers: atrick Subscribers: llvm-commits, mzolotukhin, mcrosier Differential Revision: http://reviews.llvm.org/D18480 llvm-svn: 271929
* AMDGPU: Fix a few slightly broken testsMatt Arsenault2016-05-181-22/+23
| | | | | | | Fix minor bugs and uses of undef which break when pointer related optimization passes are run. llvm-svn: 269944
* AMDGPU: Stop reporting an addressing mode for unknown addrspaceMatt Arsenault2016-04-291-4/+20
| | | | | | | | | This was being treated the same as private, which has an immediate offset. For unknown, it probably means it's for a computation not actually being used for accessing memory, so it should not have a nontrivial addressing mode. llvm-svn: 268002
* Don't delete empty preheaders in CodeGenPrepare if it would create a ↵Chuang-Yu Cheng2016-04-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | critical edge Presently, CodeGenPrepare deletes all nearly empty (only phi and branch) basic blocks. This pass can delete loop preheaders which frequently creates critical edges. A preheader can be a convenient place to spill registers to the stack. If the entrance to a loop body is a critical edge, then spills may occur in the loop body rather than immediately before it. This patch protects loop preheaders from deletion in CodeGenPrepare even if they are nearly empty. Since the patch alters the CFG, it affects a large number of test cases. In most cases, the changes are merely cosmetic (basic blocks have different names or instruction orders change slightly). I am somewhat concerned about the test/CodeGen/Mips/brdelayslot.ll test case. If the loop preheader is not deleted, then the MIPS backend does not take advantage of a branch delay slot. Consequently, I would like some close review by a MIPS expert. The patch also partially subsumes D16893 from George Burgess IV. George correctly notes that CodeGenPrepare does not actually preserve the dominator tree. I think the dominator tree was usually not valid when CodeGenPrepare ran, but I am using LoopInfo to mark preheaders, so the dominator tree is now always valid before CodeGenPrepare. Author: Tom Jablin (tjablin) Reviewers: hfinkel george.burgess.iv vkalintiris dsanders kbarton cycheng http://reviews.llvm.org/D16984 llvm-svn: 265397
* [LoopStrengthReduce] Don't hoist into a catchswitchDavid Majnemer2016-03-241-0/+50
| | | | | | | | We try to hoist the insertion point as high as possible to encourage sharing. However, we must be careful not to hoist into a catchswitch as it is both an EHPad and a terminator. llvm-svn: 264344
* Revert "[LSR] Create fewer redundant instructions."Geoff Berry2016-03-161-34/+0
| | | | | | This reverts commit r263644. Investigating bootstrap failures. llvm-svn: 263655
* [LSR] Create fewer redundant instructions.Geoff Berry2016-03-161-0/+34
| | | | | | | | | | | | | | | | | Summary: Fix LSRInstance::HoistInsertPosition() to check the original insert position block first for a canonical insertion point that is dominated by all inputs. This leads to SCEV being able to reuse more instructions since it currently tracks the instructions it creates for reuse by keeping a table of <Value, insert point> pairs. Reviewers: atrick Subscribers: mcrosier, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D18001 llvm-svn: 263644
* [SCEV] Try to reuse existing value during SCEV expansionWei Mi2016-02-041-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current SCEV expansion will expand SCEV as a sequence of operations and doesn't utilize the value already existed. This will introduce redundent computation which may not be cleaned up throughly by following optimizations. This patch introduces an ExprValueMap which is a map from SCEV to the set of equal values with the same SCEV. When a SCEV is expanded, the set of values is checked and reused whenever possible before generating a sequence of operations. The original commit triggered regressions in Polly tests. The regressions exposed two problems which have been fixed in current version. 1. Polly will generate a new function based on the old one. To generate an instruction for the new function, it builds SCEV for the old instruction, applies some tranformation on the SCEV generated, then expands the transformed SCEV and insert the expanded value into new function. Because SCEV expansion may reuse value cached in ExprValueMap, the value in old function may be inserted into new function, which is wrong. In SCEVExpander::expand, there is a logic to check the cached value to be used should dominate the insertion point. However, for the above case, the check always passes. That is because the insertion point is in a new function, which is unreachable from the old function. However for unreachable node, DominatorTreeBase::dominates thinks it will be dominated by any other node. The fix is to simply add a check that the cached value to be used in expansion should be in the same function as the insertion point instruction. 2. When the SCEV is of scConstant type, expanding it directly is cheaper than reusing a normal value cached. Although in the cached value set in ExprValueMap, there is a Constant type value, but it is not easy to find it out -- the cached Value set is not sorted according to the potential cost. Existing reuse logic in SCEVExpander::expand simply chooses the first legal element from the cached value set. The fix is that when the SCEV is of scConstant type, don't try the reuse logic. simply expand it. Differential Revision: http://reviews.llvm.org/D12090 llvm-svn: 259736
* [LoopStrengthReduce] Don't rewrite PHIs with incoming values from CatchSwitchesDavid Majnemer2016-02-031-0/+29
| | | | | | | | | | Bail out if we have a PHI on an EHPad that gets a value from a CatchSwitchInst. Because the CatchSwitchInst cannot be split, there is no good place to stick any instructions. This fixes PR26373. llvm-svn: 259702
* Revert r259662, which caused regressions on polly tests.Wei Mi2016-02-031-3/+2
| | | | llvm-svn: 259675
* [SCEV] Try to reuse existing value during SCEV expansionWei Mi2016-02-031-2/+3
| | | | | | | | | | | | | | | | Current SCEV expansion will expand SCEV as a sequence of operations and doesn't utilize the value already existed. This will introduce redundent computation which may not be cleaned up throughly by following optimizations. This patch introduces an ExprValueMap which is a map from SCEV to the set of equal values with the same SCEV. When a SCEV is expanded, the set of values is checked and reused whenever possible before generating a sequence of operations. Differential Revision: http://reviews.llvm.org/D12090 llvm-svn: 259662
* Followup to 258750; update more tests to use .p2align .Dan Gohman2016-01-261-1/+1
| | | | llvm-svn: 258755
* [IR] Remove terminatepadDavid Majnemer2015-12-141-3/+0
| | | | | | | | | | | | | It turns out that terminatepad gives little benefit over a cleanuppad which calls the termination function. This is not sufficient to implement fully generic filters but MSVC doesn't support them which makes terminatepad a little over-designed. Depends on D15478. Differential Revision: http://reviews.llvm.org/D15479 llvm-svn: 255522
* [IR] Reformulate LLVM's EH funclet IRDavid Majnemer2015-12-122-60/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While we have successfully implemented a funclet-oriented EH scheme on top of LLVM IR, our scheme has some notable deficiencies: - catchendpad and cleanupendpad are necessary in the current design but they are difficult to explain to others, even to seasoned LLVM experts. - catchendpad and cleanupendpad are optimization barriers. They cannot be split and force all potentially throwing call-sites to be invokes. This has a noticable effect on the quality of our code generation. - catchpad, while similar in some aspects to invoke, is fairly awkward. It is unsplittable, starts a funclet, and has control flow to other funclets. - The nesting relationship between funclets is currently a property of control flow edges. Because of this, we are forced to carefully analyze the flow graph to see if there might potentially exist illegal nesting among funclets. While we have logic to clone funclets when they are illegally nested, it would be nicer if we had a representation which forbade them upfront. Let's clean this up a bit by doing the following: - Instead, make catchpad more like cleanuppad and landingpad: no control flow, just a bunch of simple operands; catchpad would be splittable. - Introduce catchswitch, a control flow instruction designed to model the constraints of funclet oriented EH. - Make funclet scoping explicit by having funclet instructions consume the token produced by the funclet which contains them. - Remove catchendpad and cleanupendpad. Their presence can be inferred implicitly using coloring information. N.B. The state numbering code for the CLR has been updated but the veracity of it's output cannot be spoken for. An expert should take a look to make sure the results are reasonable. Reviewers: rnk, JosephTremoulet, andrew.w.kaylor Differential Revision: http://reviews.llvm.org/D15139 llvm-svn: 255422
* [LoopStrengthReduce] Don't increment iterator past the end of the BBDavid Majnemer2015-11-161-0/+51
| | | | | | | | | | | We tried to move the insertion point beyond instructions like landingpad and cleanuppad. However, we *also* tried to move past catchpad. This is problematic because catchpad is also a terminator. This fixes PR25541. llvm-svn: 253238
* [LoopStrengthReduce] Don't bother fixing up PHIs from EH Pad predsDavid Majnemer2015-11-081-0/+53
| | | | | | | | We cannot really insert fixup code into a PHI's predecessor. This fixes PR25445. llvm-svn: 252416
* [ScalarEvolutionExpander] PHI on a catchpad can be used on both edgesDavid Majnemer2015-10-271-0/+42
| | | | | | | | A PHI on a catchpad might be used by both edges out of the catchpad, feeding back into a loop. In this case, just use the insertion point. Anything more clever would require new basic blocks or PHI placement. llvm-svn: 251442
* [ScalarEvolutionExpander] Properly insert no-op casts + EH PadsDavid Majnemer2015-10-271-0/+148
| | | | | | | | | | | We want to insert no-op casts as close as possible to the def. This is tricky when the cast is of a PHI node and the BasicBlocks between the def and the use cannot hold any instructions. Iteratively walk EH pads until we hit a non-EH pad. This fixes PR25326. llvm-svn: 251393
* [SCEV] Mark AddExprs as nsw or nuw if legalSanjoy Das2015-10-221-1/+1
| | | | | | | | | | | | | | Summary: This uses `ScalarEvolution::getRange` and not potentially control dependent `nsw` and `nuw` bits on the arithmetic instruction. Reviewers: atrick, hfinkel, nlewycky Subscribers: llvm-commits, sanjoy Differential Revision: http://reviews.llvm.org/D13613 llvm-svn: 251048
* [TwoAddressInstructionPass] When looking for a 3 addr conversion after ↵Craig Topper2015-10-061-1/+1
| | | | | | commuting, make sure regB has been updated to take into account the commute. llvm-svn: 249378
* [ARM][NEON] Use address space in vld([1234]|[234]lane) and ↵Jeroen Ketema2015-09-301-25/+25
| | | | | | | | | | | | | | | | | | | | | vst([1234]|[234]lane) instructions This commit changes the interface of the vld[1234], vld[234]lane, and vst[1234], vst[234]lane ARM neon intrinsics and associates an address space with the pointer that these intrinsics take. This changes, e.g., <2 x i32> @llvm.arm.neon.vld1.v2i32(i8*, i32) to <2 x i32> @llvm.arm.neon.vld1.v2i32.p0i8(i8*, i32) This change ensures that address spaces are fully taken into account in the ARM target during lowering of interleaved loads and stores. Differential Revision: http://reviews.llvm.org/D12985 llvm-svn: 248887
* DI: Require subprogram definitions to be distinctDuncan P. N. Exon Smith2015-08-281-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | As a follow-up to r246098, require `DISubprogram` definitions (`isDefinition: true`) to be 'distinct'. Specifically, add an assembler check, a verifier check, and bitcode upgrading logic to combat testcase bitrot after the `DIBuilder` change. While working on the testcases, I realized that test/Linker/subprogram-linkonce-weak-odr.ll isn't relevant anymore. Its purpose was to check for a corner case in PR22792 where two subprogram definitions match exactly and share the same metadata node. The new verifier check, requiring that subprogram definitions are 'distinct', precludes that possibility. I updated almost all the IR with the following script: git grep -l -E -e '= !DISubprogram\(.* isDefinition: true' | grep -v test/Bitcode | xargs sed -i '' -e 's/= \(!DISubprogram(.*, isDefinition: true\)/= distinct \1/' Likely some variant of would work for out-of-tree testcases. llvm-svn: 246327
* [NVPTX] truncating 64-bit to 32-bit is freeJingyue Wu2015-08-202-0/+47
| | | | | | | | | | | | | | Summary: Add an LSR test that exercises isTruncateFree. Without this change, LSR creates another indvar representing the truncated value. Reviewers: jholewinski, eliben Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D12058 llvm-svn: 245611
* LoopStrengthReduce: Try to pass address space to isLegalAddressingModeMatt Arsenault2015-08-153-0/+272
| | | | | | | | | This seems to only work some of the time. In some situations, this seems to use a nonsensical type and isn't actually aware of the memory being accessed. e.g. if branch condition is an icmp of a pointer, it checks the addressing mode of i1. llvm-svn: 245137
* [SCEV] Apply NSW and NUW flags via poison value analysis for sub, mul and shlBjarke Hammersholt Roune2015-08-141-0/+104
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: http://reviews.llvm.org/D11212 made Scalar Evolution able to propagate NSW and NUW flags from instructions to SCEVs for add instructions. This patch expands that to sub, mul and shl instructions. This change makes LSR able to generate pointer induction variables for loops like these, where the index is 32 bit and the pointer is 64 bit: for (int i = 0; i < numIterations; ++i) sum += ptr[i - offset]; for (int i = 0; i < numIterations; ++i) sum += ptr[i * stride]; for (int i = 0; i < numIterations; ++i) sum += ptr[3 * (i << 7)]; Reviewers: atrick, sanjoy Subscribers: sanjoy, majnemer, hfinkel, llvm-commits, meheff, jingyue, eliben Differential Revision: http://reviews.llvm.org/D11860 llvm-svn: 245118
OpenPOWER on IntegriCloud