summaryrefslogtreecommitdiffstats
path: root/llvm/test/Transforms/LoadStoreVectorizer/X86
Commit message (Collapse)AuthorAgeFilesLines
* [lit] Delete empty lines at the end of lit.local.cfg NFCFangrui Song2019-06-171-1/+0
| | | | llvm-svn: 363538
* Revert "Temporarily Revert "Add basic loop fusion pass.""Eric Christopher2019-04-1711-0/+547
| | | | | | | | The reversion apparently deleted the test/Transforms directory. Will be re-reverting again. llvm-svn: 358552
* Temporarily Revert "Add basic loop fusion pass."Eric Christopher2019-04-1711-547/+0
| | | | | | | | As it's causing some bot failures (and per request from kbarton). This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda. llvm-svn: 358546
* Fix invalid target triples in tests. (NFC)Florian Hahn2019-03-041-2/+2
| | | | llvm-svn: 355349
* [PM] Port LoadStoreVectorizer to the new pass manager.Markus Lavin2018-12-0710-0/+13
| | | | | | Differential Revision: https://reviews.llvm.org/D54848 llvm-svn: 348570
* Revert "[SCEV][NFC] Check NoWrap flags before lexicographical comparison of ↵Roman Tereshin2018-08-271-0/+76
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SCEVs" This reverts r319889. Unfortunately, wrapping flags are not a part of SCEV's identity (they do not participate in computing a hash value or in equality comparisons) and in fact they could be assigned after the fact w/o rebuilding a SCEV. Grep for const_cast's to see quite a few of examples, apparently all for AddRec's at the moment. So, if 2 expressions get built in 2 slightly different ways: one with flags set in the beginning, the other with the flags attached later on, we may end up with 2 expressions which are exactly the same but have their operands swapped in one of the commutative N-ary expressions, and at least one of them will have "sorted by complexity" invariant broken. 2 identical SCEV's won't compare equal by pointer comparison as they are supposed to. A real-world reproducer is added as a regression test: the issue described causes 2 identical SCEV expressions to have different order of operands and therefore compare not equal, which in its turn prevents LoadStoreVectorizer from vectorizing a pair of consecutive loads. On a larger example (the source of the test attached, which is a bugpoint) I have seen even weirder behavior: adding a constant to an existing SCEV changes the order of the existing terms, for instance, getAddExpr(1, ((A * B) + (C * D))) returns (1 + (C * D) + (A * B)). Differential Revision: https://reviews.llvm.org/D40645 llvm-svn: 340777
* [SCEV] Add zext(C + x + ...) -> D + zext(C-D + x + ...)<nuw><nsw> transformRoman Tereshin2018-07-241-0/+78
| | | | | | | | | | | | | | | | | | | | | | | | | | if the top level addition in (D + (C-D + x + ...)) could be proven to not wrap, where the choice of D also maximizes the number of trailing zeroes of (C-D + x + ...), ensuring homogeneous behaviour of the transformation and better canonicalization of such expressions. This enables better canonicalization of expressions like 1 + zext(5 + 20 * %x + 24 * %y) and zext(6 + 20 * %x + 24 * %y) which get both transformed to 2 + zext(4 + 20 * %x + 24 * %y) This pattern is common in address arithmetics and the transformation makes it easier for passes like LoadStoreVectorizer to prove that 2 or more memory accesses are consecutive and optimize (vectorize) them. Reviewed By: mzolotukhin Differential Revision: https://reviews.llvm.org/D48853 llvm-svn: 337859
* NFC - Various typo fixes in testsGabor Buella2018-07-041-1/+1
| | | | llvm-svn: 336268
* [LoadStoreVectorizer] Differentiate between <1 x T> and TSven van Haastregt2018-03-071-0/+14
| | | | | | | | | | | The LoadStoreVectorizer thought that <1 x T> and T were the same types when merging stores, leading to a crash later. Patch by Erik Hogeman. Differential Revision: https://reviews.llvm.org/D44014 llvm-svn: 326884
* [Analysis] Fix merging TBAA tags with different final access typesIvan A. Kosarev2017-11-081-0/+46
| | | | | | | | | | | | | | | | | | | | | There are cases when we have to merge TBAA access tags with the same base access type, but different final access types. For example, accesses to different members of the same structure may be vectorized into a single load or store instruction. Since we currently assume that the tags to merge always share the same final access type, we incorrectly return a tag that describes an access to one of the original final access types as the generic tag. This patch fixes that by producing generic tags for the common type and not the final access types of the original tags. Resolves: PR35225: Wrong tbaa metadata after load store vectorizer due to recent change https://bugs.llvm.org/show_bug.cgi?id=35225 Differential Revision: https://reviews.llvm.org/D39732 llvm-svn: 317682
* [LSV] Skip all non-byte sizes, not only less than eight bitsBjorn Pettersson2017-10-261-0/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The code comments indicate that no effort has been spent on handling load/stores when the size isn't a multiple of the byte size correctly. However, the code only avoided types smaller than 8 bits. So for example a load of an i28 could still be considered as a candidate for vectorization. This patch adjusts the code to behave according to the code comment. The test case used to hit the following assert when trying to use "cast" an i32 to i28 using CreateBitOrPointerCast: opt: ../lib/IR/Instructions.cpp:2565: Assertion `castIsValid(op, S, Ty) && "Invalid cast!"' failed. #0 PrintStackTraceSignalHandler(void*) #1 SignalHandler(int) #2 __restore_rt #3 __GI_raise #4 __GI_abort #5 __GI___assert_fail #6 llvm::CastInst::Create(llvm::Instruction::CastOps, llvm::Value*, llvm::Type*, llvm::Twine const&, llvm::Instruction*) #7 llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>::CreateBitOrPointerCast(llvm::Value*, llvm::Type*, llvm::Twine const&) #8 (anonymous namespace)::Vectorizer::vectorizeLoadChain(llvm::ArrayRef<llvm::Instruction*>, llvm::SmallPtrSet<llvm::Instruction*, 16u>*) Reviewers: arsenm Reviewed By: arsenm Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39295 llvm-svn: 316663
* [X86 TTI] Implement LSV hookKeno Fischer2017-04-051-0/+38
| | | | | | | | | | | | | | | | | | Summary: LSV wants to know the maximum size that can be loaded to a vector register. On X86, this always matches the maximum register width. Implement this accordingly and add a test to make sure that LSV can vectorize up to the maximum permissible width on X86. Reviewers: delena, arsenm Reviewed By: arsenm Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D31504 llvm-svn: 299589
* [LoadStoreVectorizer] Change VectorSet to Vector to match head and tail ↵Alina Sbirlea2016-08-301-0/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | positions. Resolves PR29148. Summary: LSV was using two vector sets (heads and tails) to track pairs of adjiacent position to vectorize. A recent optimization is trying to obtain the longest chain to vectorize and assumes the positions in heads(H) and tails(T) match, which is not the case is there are multiple tails for the same head. e.g.: i1: store a[0] i2: store a[1] i3: store a[1] Leads to: H: i1 T: i2 i3 Instead of: H: i1 i1 T: i2 i3 So the positions for instructions that follow i3 will have different indexes in H/T. This patch resolves PR29148. This issue also surfaced the fact that if the chain is too long, and TLI returns a "not-fast" answer, the whole chain will be abandoned for vectorization, even though a smaller one would be beneficial. Added a testcase and FIXME for this. Reviewers: tstellarAMD, arsenm, jlebar Subscribers: mzolotukhin, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D24057 llvm-svn: 280179
* LoadStoreVectorizer: Remove TargetBaseAlign. Keep alignment for stack ↵Alina Sbirlea2016-08-044-14/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | adjustments. Summary: TargetBaseAlign is no longer required since LSV checks if target allows misaligned accesses. A constant defining a base alignment is still needed for stack accesses where alignment can be adjusted. Previous patch (D22936) was reverted because tests were failing. This patch also fixes the cause of those failures: - x86 failing tests either did not have the right target, or the right alignment. - NVPTX failing tests did not have the right alignment. - AMDGPU failing test (merge-stores) should allow vectorization with the given alignment but the target info considers <3xi32> a non-standard type and gives up early. This patch removes the condition and only checks for a maximum size allowed and relies on the next condition checking for %4 for correctness. This should be revisited to include 3xi32 as a MVT type (on arsenm's non-immediate todo list). Note that checking the sizeInBits for a MVT is undefined (leads to an assertion failure), so we need to create an EVT, hence the interface change in allowsMisaligned to include the Context. Reviewers: arsenm, jlebar, tstellarAMD Subscribers: jholewinski, arsenm, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D23068 llvm-svn: 277735
* [LSV] Add detail to correct-order.ll test.Justin Lebar2016-07-191-5/+6
| | | | | | | | | | | | | | Summary: This helps keep us honest -- there were a number of ways we could screw up and still have passed this test. Reviewers: asbirlea Subscribers: llvm-commits, arsenm Differential Revision: https://reviews.llvm.org/D22531 llvm-svn: 276053
* Extended LoadStoreVectorizer to vectorize subchains.Alina Sbirlea2016-07-131-7/+3
| | | | | | | | | | | | | | Summary: LSV used to abort vectorizing a chain for interleaved load/store accesses that alias. Allow a valid prefix of the chain to be vectorized, mark just the prefix and retry vectorizing the remaining chain. Reviewers: llvm-commits, jlebar, arsenm Subscribers: mzolotukhin Differential Revision: http://reviews.llvm.org/D22119 llvm-svn: 275317
* Correct ordering of loads/stores.Alina Sbirlea2016-07-114-4/+176
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Aiming to correct the ordering of loads/stores. This patch changes the insert point for loads to the position of the first load. It updates the ordering method for loads to insert before, rather than after. Before this patch the following sequence: "load a[1], store a[1], store a[0], load a[2]" Would incorrectly vectorize to "store a[0,1], load a[1,2]". The correctness check was assuming the insertion point for loads is at the position of the first load, when in practice it was at the last load. An alternative fix would have been to invert the correctness check. The current fix changes insert position but also requires reordering of instructions before the vectorized load. Updated testcases to reflect the changes. Reviewers: tstellarAMD, llvm-commits, jlebar, arsenm Subscribers: mzolotukhin Differential Revision: http://reviews.llvm.org/D22071 llvm-svn: 275117
* Address two correctness issues in LoadStoreVectorizerAlina Sbirlea2016-07-013-0/+53
Summary: GetBoundryInstruction returns the last instruction as the instruction which follows or end(). Otherwise the last instruction in the boundry set is not being tested by isVectorizable(). Partially solve reordering of instructions. More extensive solution to follow. Reviewers: tstellarAMD, llvm-commits, jlebar Subscribers: escha, arsenm, mzolotukhin Differential Revision: http://reviews.llvm.org/D21934 llvm-svn: 274389
OpenPOWER on IntegriCloud