summaryrefslogtreecommitdiffstats
path: root/llvm/test/Transforms
Commit message (Collapse)AuthorAgeFilesLines
* [InstCombine] add tests for fneg+sel; NFCSanjay Patel2019-05-061-0/+84
| | | | llvm-svn: 360058
* Add FNeg support to InstructionSimplifyCameron McInally2019-05-061-0/+9
| | | | | | Differential Revision: https://reviews.llvm.org/D61573 llvm-svn: 360053
* [InstCombine] regenerate test checks; NFCSanjay Patel2019-05-062-24/+36
| | | | llvm-svn: 360052
* [SimplifyLibCalls] Simplify bcmp too.Clement Courbet2019-05-061-0/+144
| | | | | | | | | | | | | | Summary: Fixes PR40699. Reviewers: gchatelet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61585 llvm-svn: 360021
* [DebugInfo] GlobalOpt DW_OP_deref_size instead of DW_OP_deref.Markus Lavin2019-05-061-1/+1
| | | | | | | | | | Optimization pass lib/Transforms/IPO/GlobalOpt.cpp needs to insert DW_OP_deref_size instead of DW_OP_deref to be compatible with big-endian targets for same reasons as in D59687. Differential Revision: https://reviews.llvm.org/D60611 llvm-svn: 360013
* Add FNeg IR constant folding supportCameron McInally2019-05-054-55/+8
| | | | llvm-svn: 359982
* Add InstCombine tests for FNeg instruction.Cameron McInally2019-05-041-0/+46
| | | | llvm-svn: 359970
* [CodeGenPrepare] limit overflow intrinsic matching to a single basic block ↵Sanjay Patel2019-05-042-15/+34
| | | | | | | | | | | | | | | | | | | | | (2nd try) This is a subset of the original commit from rL359879 which was reverted because it could crash when using the 'RemovedInstructions' structure that enables delayed deletion of dead instructions. The motivating compile-time win does not require that change though. We should get most of that win from this change alone. Using/updating a dominator tree to match math overflow patterns may be very expensive in compile-time (because of the way CGP uses a DT), so just handle the single-block case. See post-commit thread for rL354298 for more details: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190422/646276.html Differential Revision: https://reviews.llvm.org/D61075 llvm-svn: 359969
* Revert "[CodeGenPrepare] limit overflow intrinsic matching to a single basic ↵Evgeniy Stepanov2019-05-032-14/+15
| | | | | | | | block" This reverts commit r359879, which introduced a compiler crash. llvm-svn: 359908
* Revert r359549 - incorrect update of test checks. NFCRobert Lougher2019-05-031-4/+4
| | | | llvm-svn: 359897
* [LICM] auto-generate complete test checks; NFCSanjay Patel2019-05-031-3/+27
| | | | llvm-svn: 359881
* [CodeGenPrepare] limit overflow intrinsic matching to a single basic blockSanjay Patel2019-05-032-15/+14
| | | | | | | | | | | | | | | | | | | Using/updating a dominator tree to match math overflow patterns may be very expensive in compile-time (because of the way CGP uses a DT), so just handle the single-block case. Also, we were restarting the iterator loops when doing the overflow intrinsic transforms by marking the dominator tree for update. That was done to prevent iterating over a removed instruction. But we can postpone the deletion using the existing "RemovedInsts" structure, and that means we don't need to update the DT. See post-commit thread for rL354298 for more details: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190422/646276.html Differential Revision: https://reviews.llvm.org/D61075 llvm-svn: 359879
* remove inalloca parameters in globalopt and simplify argpromotionBob Haarman2019-05-023-7/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Inalloca parameters require special handling in some optimizations. This change causes globalopt to strip the inalloca attribute from function parameters when it is safe to do so, removes the special handling for inallocas from argpromotion, and replaces it with a simple check that causes argpromotion to skip functions that receive inallocas (for when the pass is invoked on code that didn't run through globalopt first). This also avoids a case where argpromotion would incorrectly try to pass an inalloca in a register. Fixes PR41658. Reviewers: rnk, efriedma Reviewed By: rnk Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61286 llvm-svn: 359743
* [PGO][CHR] A bug fix.Hiroshi Yamauchi2019-05-011-0/+119
| | | | | | | | | | | | | | | | Summary: Fix a transformation bug where two scopes share a common instrution to hoist. Reviewers: davidxl Reviewed By: davidxl Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61405 llvm-svn: 359736
* [tests] Add host-byteorder-*-endian; update XFAILs of big-endian triplesHubert Tong2019-05-012-2/+2
| | | | | | | | | | | | | | | | | | | | Summary: Triple components in `XFAIL` lines are tested against the target triple. Various tests that are expected to fail on big-endian hosts are marked as being `XFAIL` for big-endian targets. This patch corrects these tests by having them test against a new `host-byteorder-big-endian` feature. Reviewers: xingxue, sfertile, jasonliu Reviewed By: xingxue Subscribers: jvesely, nhaehnle, fedor.sergeev, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60551 llvm-svn: 359689
* [InstCombine] Limit a vector demanded elts rule which was producing invalid IR.Philip Reames2019-04-301-0/+13
| | | | | | | | The demanded elts rules introduced for GEPs in https://reviews.llvm.org/rL356293 replaced vector constants with undefs (by design). It turns out that the LangRef disallows such cases when indexing structs. The right fix is probably to relax the langref requirement, and update other passes to expect the result, but for the moment, limit the transform to avoid compiler crashes. This should fix https://bugs.llvm.org/show_bug.cgi?id=41624. llvm-svn: 359633
* [PassManagerBuilder] Add option for interleaved loops, for loop vectorize.Alina Sbirlea2019-04-301-1/+1
| | | | | | | | | | | | | | | | | Summary: Match NewPassManager behavior: add option for interleaved loops in the old pass manager, and use that instead of the flag used to disable loop unroll. No changes in the defaults. Reviewers: chandlerc Subscribers: mehdi_amini, jlebar, dmgreen, hsaito, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61030 llvm-svn: 359615
* [SLP] Lit test that cannot get vectorized due to lack of look-ahead operand ↵Simon Pilgrim2019-04-301-0/+74
| | | | | | | | | | | | | reordering heuristic. The code in this test is not vectorized by SLP because its operand reordering cannot look beyond the immediate predecessors. This will get fixed in a follow-up patch that introduces the look-ahead operand reordering heuristic. Committed on behalf of @vporpo (Vasileios Porpodas) Differential Revision: https://reviews.llvm.org/D61283 llvm-svn: 359553
* Update checks in an instcombine test, NFCJeremy Morse2019-04-301-4/+4
| | | | | | This reduces the delta in some incoming work that changes this test. llvm-svn: 359549
* [BlockExtractor] Change the basic block separator from ',' to ';'Quentin Colombet2019-04-291-2/+2
| | | | | | | | | This change aims at making the file format be compatible with the way LLVM handles command line options. Differential Revision: https://reviews.llvm.org/D60970 llvm-svn: 359462
* [InstCombine][X86] Add PACKSS tests for truncation of sign-extended comparisonsSimon Pilgrim2019-04-291-0/+106
| | | | llvm-svn: 359435
* [NFC] Add baseline tests for int isKnownNonZeroDan Robertson2019-04-262-0/+197
| | | | | | | | Add baseline tests for improvements of isKnownNonZero for integer types. Differential Revision: https://reviews.llvm.org/D60932 llvm-svn: 359267
* [ObjC][ARC] Let ARC optimizer bail out if the number of pointer statesAkira Hatanaka2019-04-251-0/+26
| | | | | | | | | | | | | | | | | | | | it keeps track of becomes too large ARC optimizer does a top-down and a bottom-up traversal of the whole function to pair up retain and release instructions and remove them. This can be expensive if the number of instructions in the function and pointer states it tracks are large since it has to look at each pointer state and determine whether the instruction being visited can potentially use the pointer. This patch adds a command line option that sets a limit to the number of pointers it tracks. rdar://problem/49477063 Differential Revision: https://reviews.llvm.org/D61100 llvm-svn: 359226
* [Evaluator] Walk initial elements when handling load through bitcastRobert Lougher2019-04-252-0/+93
| | | | | | | | | | | | | | | | | | | | | | When evaluating a store through a bitcast, the evaluator tries to move the bitcast from the pointer onto the stored value. If the cast is invalid, it tries to "introspect" the type to get a valid cast by obtaining a pointer to the initial element (if the type is nested, this may require walking several initial elements). In some situations it is possible to get a bitcast on a load (e.g. with unions, where the bitcast may not be the same type as the store). However, equivalent logic to the store to introspect the type is missing. This patch add this logic. Note, when developing the patch I was unhappy with adding similar logic directly to the load case as it could get out of step. Instead, I have abstracted the "introspection" into a helper function, with the specifics being handled by a passed-in lambda function. Differential Revision: https://reviews.llvm.org/D60793 llvm-svn: 359205
* [InstCombine][X86] Add PACKSS/PACKUS tests for truncation where saturation ↵Simon Pilgrim2019-04-251-0/+160
| | | | | | won't occur llvm-svn: 359185
* [NFC][LoopIdiomRecognize] Some basic baseline tests for bcmp loop idiomRoman Lebedev2019-04-254-0/+2734
| | | | | | | Doubt this is the final test coverage, but this appears to have good coverage already, so i figure i might as well precommit it. llvm-svn: 359173
* Enable LoopVectorization by default.Alina Sbirlea2019-04-252-506/+2
| | | | | | | | | | | | | | | | | Summary: When refactoring vectorization flags, vectorization was disabled by default in the new pass manager. This patch re-enables is for both managers, and changes the assumptions opt makes, based on the new defaults. Comments in opt.cpp should clarify the intended use of all flags to enable/disable vectorization. Reviewers: chandlerc, jgorbe Subscribers: jlebar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61091 llvm-svn: 359167
* [SLP] Fix crash after r358519, by V. Porpodas.Alexey Bataev2019-04-241-0/+47
| | | | | | | | | | | | | | | | Summary: The code did not check if operand was undef before casting it to Instruction. Reviewers: RKSimon, ABataev, dtemirbulatov Reviewed By: ABataev Subscribers: uabelho Tags: #llvm Differential Revision: https://reviews.llvm.org/D61024 llvm-svn: 359136
* The error message for mismatched value sites is very cryptic.Dmitry Mikulin2019-04-232-0/+21
| | | | | | | | Make it more readable for an average user. Differential Revision: https://reviews.llvm.org/D60896 llvm-svn: 359043
* [ObjC][ARC] Check the basic block size before callingAkira Hatanaka2019-04-231-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | DominatorTree::dominate. ARC contract pass has an optimization that replaces the uses of the argument of an ObjC runtime function call with the call result. For example: ; Before optimization %1 = tail call i8* @foo1() %2 = tail call i8* @llvm.objc.retainAutoreleasedReturnValue(i8* %1) store i8* %1, i8** @g0, align 8 ; After optimization %1 = tail call i8* @foo1() %2 = tail call i8* @llvm.objc.retainAutoreleasedReturnValue(i8* %1) store i8* %2, i8** @g0, align 8 // %1 is replaced with %2 Before replacing the argument use, DominatorTree::dominate is called to determine whether the user instruction is dominated by the ObjC runtime function call instruction. The call to DominatorTree::dominate can be expensive if the two instructions belong to the same basic block and the size of the basic block is large. This patch checks the basic block size and just bails out if the size exceeds the limit set by command line option "arc-contract-max-bb-size". rdar://problem/49477063 Differential Revision: https://reviews.llvm.org/D60900 llvm-svn: 359027
* [InstCombine] Convert a masked.load of a dereferenceable address to an ↵Philip Reames2019-04-231-2/+3
| | | | | | | | | | unconditional load If we have a masked.load from a location we know to be dereferenceable, we can simply issue a speculative unconditional load against that address. The key advantage is that it produces IR which is well understood by the optimizer. The select (cnd, load, passthrough) form produced should be pattern matchable back to hardware predication if profitable. Differential Revision: https://reviews.llvm.org/D59703 llvm-svn: 359000
* [LSR] Limit the recursion for setup costDavid Green2019-04-232-1/+94
| | | | | | | | | | | | | | In some circumstances we can end up with setup costs that are very complex to compute, even though the scevs are not very complex to create. This can also lead to setupcosts that are calculated to be exactly -1, which LSR treats as an invalid cost. This patch puts a limit on the recursion depth for setup cost to prevent them taking too long. Thanks to @reames for the report and test case. Differential Revision: https://reviews.llvm.org/D60944 llvm-svn: 358958
* [InstCombine] Eliminate stores to constant memoryPhilip Reames2019-04-224-4/+0
| | | | | | | | | | | | If we have a store to a piece of memory which is known constant, then we know the store must be storing back the same value. As a result, the store (or memset, or memmove) must either be down a dead path, or a noop. In either case, it is valid to simply remove the store. The motivating case for this involves a memmove to a buffer which is constant down a path which is dynamically dead. Note that I'm choosing to implement the less aggressive of two possible semantics here. We could simply say that the store *is undefined*, and prune the path. Consensus in the review was that the more aggressive form might be a good follow on change at a later date. Differential Revision: https://reviews.llvm.org/D60659 llvm-svn: 358919
* [Tests] Revise a test as requested by reviewer in D59703Philip Reames2019-04-221-13/+14
| | | | llvm-svn: 358907
* [Tests] Add a negative test for masked.gather part of D59703Philip Reames2019-04-221-0/+14
| | | | llvm-svn: 358906
* [NewPM] Add Option handling for SimpleLoopUnswitchSerguei Katkov2019-04-2210-57/+57
| | | | | | | | | | | This patch enables passing options to SimpleLoopUnswitch via the passes pipeline. Reviewers: chandlerc, fedor.sergeev, leonardchan, philip.pfaffe Reviewed By: fedor.sergeev Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D60676 llvm-svn: 358880
* [NewPM] Add dummy Test for LoopVectorize option parsing.Serguei Katkov2019-04-221-0/+10
| | | | llvm-svn: 358878
* [CorrelatedValuePropagation] Mark subs that we know not to wrap with nuw/nsw.Luqman Aden2019-04-202-19/+19
| | | | | | | | | | | | | | | | | | | Summary: Teach CorrelatedValuePropagation to also handle sub instructions in addition to add. Relatively simple since makeGuaranteedNoWrapRegion already understood sub instructions. Only subtle change is which range is passed as "Other" to that function, since sub isn't commutative. Note that CorrelatedValuePropagation::processAddSub is still hidden behind a default-off flag as IndVarSimplify hasn't yet been fixed to strip the added nsw/nuw flags and causes a miscompile. (PR31181) Reviewers: sanjoy, apilipenko, nikic Reviewed By: nikic Subscribers: hiraditya, jfb, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60036 llvm-svn: 358816
* [IndVarSimplify] Generate full checks for some LFTR tests; NFCNikita Popov2019-04-207-128/+326
| | | | llvm-svn: 358813
* [IndVarSimplify] Add tests for PR31181; NFCNikita Popov2019-04-201-0/+152
| | | | llvm-svn: 358812
* [CVP] Add tests for sub nowrap inference; NFCNikita Popov2019-04-201-0/+601
| | | | | | | | These are baseline tests for D60036. Patch by Luqman Aden. llvm-svn: 358808
* [GVN+LICM] Use line 0 locations for better crash attributionVedant Kumar2019-04-191-3/+4
| | | | | | | | | | | | This is a follow-up to r291037+r291258, which used null debug locations to prevent jumpy line tables. Using line 0 locations achieves the same effect, but works better for crash attribution because it preserves the right inline scope. Differential Revision: https://reviews.llvm.org/D60913 llvm-svn: 358791
* [MergeFunc] removeUsers: call remove() only on direct usersFangrui Song2019-04-191-4/+18
| | | | | | | | | | | | | | | removeUsers uses a work list to collect indirect users and call remove() on those functions. However it has a bug (`if (!Visited.insert(UU).second)`). Actually, we don't have to collect indirect users. After the merge of F and G, G's callers will be considered (added to Deferred). If G's callers can be merged, G's callers' callers will be considered. Update the test unnamed-addr-reprocessing.ll to make it clear we can still merge indirect callers. llvm-svn: 358741
* MergeFunc: preserve COMDAT information when creating a thunkSaleem Abdulrasool2019-04-191-0/+24
| | | | | | | | | | We would previously drop the COMDAT on the thunk we generated when replacing a function body with the forwarding thunk. This would result in a function that may have been multiply emitted and multiply merged to be emitted with the same name without the COMDAT. This is a hard error with PE/COFF where the COMDAT is used for the deduplication of Value Witness functions for Swift. llvm-svn: 358728
* [GuardWidening] Wire up a NPM version of the LoopGuardWidening passPhilip Reames2019-04-181-0/+138
| | | | llvm-svn: 358704
* [BlockExtractor] Extend the file format to support the grouping of basic blocksQuentin Colombet2019-04-181-0/+122
| | | | | | | | | | | | | | | | | | | | | | | | | | Prior to this patch, each basic block listed in the extrack-blocks-file would be extracted to a different function. This patch adds the support for comma separated list of basic blocks to form group. When the region formed by a group is not extractable, e.g., not single entry, all the blocks of that group are left untouched. Let us see this new format in action (comments are not part of the file format): ;; funcName bbName[,bbName...] foo bb1 ;; Extract bb1 in its own function foo bb2,bb3 ;; Extract bb2,bb3 in their own function bar bb1,bb4 ;; Extract bb1,bb4 in their own function bar bb2 ;; Extract bb2 in its own function Assuming all regions are extractable, this will create one function and thus one call per region. Differential Revision: https://reviews.llvm.org/D60746 llvm-svn: 358701
* [LoopPred] Fix a blatantly obvious bug in r358684Philip Reames2019-04-181-0/+64
| | | | | | The bug is that I didn't check whether the operand of the invariant_loads were themselves invariant. I don't know how this got missed in the patch and review. I even had an unreduced test case locally, and I remember handling this case, but I must have lost it in one of the rebases. Oops. llvm-svn: 358688
* [LoopPredication] Allow predication of loop invariant computations (within ↵Philip Reames2019-04-183-9/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | the loop) The purpose of this patch is to eliminate a pass ordering dependence between LoopPredication and LICM. To understand the purpose, consider the following snippet of code inside some loop 'L' with IV 'i' A = _a.length; guard (i < A) a = _a[i] B = _b.length; guard (i < B); b = _b[i]; ... Z = _z.length; guard (i < Z) z = _z[i] accum += a + b + ... + z; Today, we need LICM to hoist the length loads, LoopPredication to make the guards loop invariant, and TrivialUnswitch to eliminate the loop invariant guard to establish must execute for the next length load. Today, if we can't prove speculation safety, we'd have to iterate these three passes 26 times to reduce this example down to the minimal form. Using the fact that the array lengths are known to be invariant, we can short circuit this iteration. By forming the loop invariant form of all the guards at once, we remove the need for LoopPredication from the iterative cycle. At the moment, we'd still have to iterate LICM and TrivialUnswitch; we'll leave that part for later. As a secondary benefit, this allows LoopPred to expose peeling oppurtunities in a much more obvious manner. See the udiv test changes as an example. If the udiv was not hoistable (i.e. we couldn't prove speculation safety) this would be an example where peeling becomes obviously profitable whereas it wasn't before. A couple of subtleties in the implementation: - SCEV's isSafeToExpand guarantees speculation safety (i.e. let's us expand at a new point). It is not a precondition for expansion if we know the SCEV corresponds to a Value which dominates the requested expansion point. - SCEV's isLoopInvariant returns true for expressions which compute the same value across all iterations executed, regardless of where the original Value is located. (i.e. it can be in the loop) This implies we have a speculation burden to prove before expanding them outside loops. - invariant_loads and AA->pointsToConstantMemory are two cases that SCEV currently does not handle, but meets the SCEV definition of invariance. I plan to sink this part into SCEV once this has baked for a bit. Differential Revision: https://reviews.llvm.org/D60093 llvm-svn: 358684
* Add basic loop fusion pass.Kit Barton2019-04-175-0/+1030
| | | | | | | | | | | | | | | | | | | | This patch adds a basic loop fusion pass. It will fuse loops that conform to the following 4 conditions: 1. Adjacent (no code between them) 2. Control flow equivalent (if one loop executes, the other loop executes) 3. Identical bounds (both loops iterate the same number of iterations) 4. No negative distance dependencies between the loop bodies. The pass does not make any changes to the IR to create opportunities for fusion. Instead, it checks if the necessary conditions are met and if so it fuses two loops together. The pass has not been added to the pass pipeline yet, and thus is not enabled by default. It can be run stand alone using the -loop-fusion option. Differential Revision: https://reviews.llvm.org/D55851 llvm-svn: 358607
* [ThinLTO] Fix ThinLTOCodegenerator to export llvm.used symbolsSteven Wu2019-04-171-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Reapply r357931 with fixes to ThinLTO testcases and llvm-lto tool. ThinLTOCodeGenerator currently does not preserve llvm.used symbols and it can internalize them. In order to pass the necessary information to the legacy ThinLTOCodeGenerator, the input to the code generator is rewritten to be based on lto::InputFile. Now ThinLTO using the legacy LTO API will requires data layout in Module. "internalize" thinlto action in llvm-lto is updated to run both "promote" and "internalize" with the same configuration as ThinLTOCodeGenerator. The old "promote" + "internalize" option does not produce the same output as ThinLTOCodeGenerator. This fixes: PR41236 rdar://problem/49293439 Reviewers: tejohnson, pcc, kromanova, dexonsmith Reviewed By: tejohnson Subscribers: ormris, bd1976llvm, mehdi_amini, inglorion, eraman, hiraditya, jkorous, dexonsmith, arphaman, dang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60421 llvm-svn: 358601
OpenPOWER on IntegriCloud