path: root/llvm/test/CodeGen/AArch64/urem-seteq-vec-nonsplat.ll
Commit message | Author | Age | Files | Lines
* [NFC][X86][AArch64] Revisit test coverage for X s% C == 0 fold - add tests for negative divisors, INT_MIN divisors | Roman Lebedev | 2019-07-30 | 1 | -76/+162
  As discussed in the review, that fold is only valid for positive divisors, so while we can negate negative divisors, we have to special-case INT_MIN.
  llvm-svn: 367294
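  The reasoning in this commit message can be checked on a small bit width. The sketch below is not LLVM code: the helpers `srem` and `wrap_neg` and the 8-bit width are assumptions chosen for illustration. It shows that negating a negative divisor leaves the `== 0` answer unchanged, and that INT_MIN has no positive counterpart in W bits, which is one way to see why it needs separate handling.

```python
W = 8                        # illustrative width, not taken from the commit
INT_MIN = -(1 << (W - 1))    # -128
INT_MAX = (1 << (W - 1)) - 1 # 127


def srem(x, d):
    """Truncating signed remainder: the result takes the sign of the dividend."""
    r = abs(x) % abs(d)
    return -r if x < 0 else r


def wrap_neg(d, w=W):
    """Two's-complement negation of d in w bits."""
    v = (-d) & ((1 << w) - 1)
    return v - (1 << w) if v >= (1 << (w - 1)) else v


def negation_preserves_divisibility():
    """Check X s% D == 0  <=>  X s% -D == 0 for every negative D except INT_MIN."""
    return all(
        (srem(x, d) == 0) == (srem(x, -d) == 0)
        for d in range(INT_MIN + 1, 0)
        for x in range(INT_MIN, INT_MAX + 1)
    )


def multiples_of_int_min():
    """All X in range with X s% INT_MIN == 0."""
    return [x for x in range(INT_MIN, INT_MAX + 1) if srem(x, INT_MIN) == 0]
```

  In W bits, `wrap_neg(INT_MIN)` wraps back to INT_MIN, so the "negate the divisor" step cannot produce a positive value for it. The only multiples of INT_MIN in range are INT_MIN itself and 0. How the actual fold special-cases this divisor is not stated in the commit message.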
* [Codegen][SelectionDAG] X u% C == 0 fold: non-splat vector improvements | Roman Lebedev | 2019-07-20 | 1 | -51/+17
  Summary:
  Four things here:
  1. Generalize the fold to handle non-splat divisors. Reasonably trivial.
  2. Unban power-of-two divisors. I don't see any reason why they should be illegal.
     * There is no ban in Hacker's Delight
     * I think the ban came from the same bug that caused the miscompile in the base patch - in `floor((2^W - 1) / D)` we were dividing by `D0` instead of `D`, and we **were** ensuring that `D0` is not `1`, which made sense.
  3. Unban `1` divisors. I no longer believe Hacker's Delight actually says that the fold is invalid for `D = 0`. Further considerations:
     * We know that
       * `(X u% 1) == 0` can be constant-folded to `1`,
       * `(X u% 1) != 0` can be constant-folded to `0`,
     * Also, we know that
       * `X u<= -1` can be constant-folded to `1`,
       * `X u> -1` can be constant-folded to `0`,
     * https://godbolt.org/z/7jnZJX https://rise4fun.com/Alive/oF6p
     * We know we will end up with the following: `(setule/setugt (rotr (mul N, P), K), Q)`
     * Therefore, for given new DAG nodes and comparison predicates (`ule`/`ugt`), we will still produce the correct answer if: `Q` is an all-ones constant; and both `P` and `K` are *anything* other than `undef`.
     * The fold will indeed produce `Q = all-ones`.
  4. Try to re-splat the `P` and `K` vectors - we don't care about their values for the lanes where divisor was `1`.
  Reviewers: RKSimon, hermord, craig.topper, spatel, xbolva00
  Reviewed By: RKSimon
  Subscribers: hiraditya, javed.absar, dexonsmith, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D63963
  llvm-svn: 366637
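  The `1`-divisor argument in this summary can be checked exhaustively at a small width. The sketch below is a model in Python, not the SelectionDAG code: the 8-bit width and the helpers `lane_params`, `folded_lane`, and `resplat` are hypothetical names introduced here. In particular, `resplat` simply borrows `P` and `K` from the first lane whose divisor is not `1`, which is a simplification assumed for illustration.

```python
W = 8                 # illustrative lane width
MASK = (1 << W) - 1   # the all-ones constant for this width


def rotr(x, k, w=W):
    """Rotate the w-bit value x right by k bits."""
    k %= w
    return ((x >> k) | (x << (w - k))) & ((1 << w) - 1)


def lane_params(d, w=W):
    """(P, K, Q) for one lane, writing D = D0 * 2^K with D0 odd."""
    k = (d & -d).bit_length() - 1   # number of trailing zero bits
    p = pow(d >> k, -1, 1 << w)     # inverse of D0 mod 2^W (Python 3.8+)
    q = ((1 << w) - 1) // d         # floor((2^W - 1) / D), dividing by D, not D0
    return p, k, q


def folded_lane(x, p, k, q, w=W):
    """The setule form: rotr(mul(X, P), K) u<= Q."""
    return rotr((x * p) & ((1 << w) - 1), k, w) <= q


def resplat(divisors, w=W):
    """Per-lane (P, K, Q); lanes with divisor 1 reuse P and K from another lane."""
    params = [lane_params(d, w) for d in divisors]
    donor = next(
        ((p, k) for d, (p, k, _) in zip(divisors, params) if d != 1),
        (1, 0),
    )
    return [
        (donor[0], donor[1], q) if d == 1 else (p, k, q)
        for d, (p, k, q) in zip(divisors, params)
    ]
```

  For the divisor vector `[6, 1, 6, 6]`, `resplat` yields the same `P` and `K` in every lane while `Q` stays all-ones in the `1` lane, so that lane still compares true for every input.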
* [NFC][Codegen] Revisit test coverage for X % C == 0 fold once more (add tests with '1' divisor) | Roman Lebedev | 2019-06-28 | 1 | -75/+418
  llvm-svn: 364661
* [NFC][Codegen] Revisit test coverage for X % C == 0 fold | Roman Lebedev | 2019-06-28 | 1 | -107/+219
  llvm-svn: 364642
* [CodeGen] [SelectionDAG] More efficient code for X % C == 0 (UREM case) (try 3) | Roman Lebedev | 2019-06-27 | 1 | -1/+0
  Summary:
  I'm submitting a new revision since I don't understand how to reclaim/reopen/take over the existing one, D50222. There is no such action in "Add Action" menu...
  This implements an optimization described in Hacker's Delight 10-17: when `C` is constant, the result of `X % C == 0` can be computed more cheaply without actually calculating the remainder. The motivation is discussed here: https://bugs.llvm.org/show_bug.cgi?id=35479.
  This is a recommit; the original commit rL364563 was reverted in rL364568 because test-suite detected a miscompile - the new comparison constant 'Q' was being computed incorrectly (we divided by `D0` instead of `D`).
  Original patch D50222 by @hermord (Dmytro Shynkevych)
  Notes:
  - In principle, it's possible to also handle the `X % C1 == C2` case, as discussed on bugzilla. This seems to require an extra branch on overflow, so I refrained from implementing this for now.
  - An explicit check for when the `REM` can be reduced to just its LHS is included: the `X % C == 0` optimization breaks `test1` in `test/CodeGen/X86/jump_sign.ll` otherwise. I hadn't managed to find a better way to not generate worse output in this case.
  - The `test/CodeGen/X86/jump_sign.ll` regresses, and is being fixed by a followup patch D63390.
  Reviewers: RKSimon, craig.topper, spatel, hermord, xbolva00
  Reviewed By: RKSimon, xbolva00
  Subscribers: dexonsmith, kristina, xbolva00, javed.absar, llvm-commits, hermord
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D63391
  llvm-svn: 364600
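  The divisibility test described in this summary can be modelled in a few lines. This is a sketch of the arithmetic only, not the SelectionDAG implementation: the function names and the default 32-bit width are assumptions made here. With `D = D0 * 2^K` and `D0` odd, it takes `P` as the inverse of `D0` modulo `2^W` and `Q = floor((2^W - 1) / D)`, then tests `rotr(X * P, K) u<= Q`. Note that `Q` divides by `D`, not `D0`, which is the detail the reverted commit got wrong.

```python
def rotr(x, k, w):
    """Rotate the w-bit value x right by k bits."""
    k %= w
    return ((x >> k) | (x << (w - k))) & ((1 << w) - 1)


def fold_params(d, w=32):
    """Return (P, K, Q) for a non-zero unsigned divisor d of width w."""
    assert 0 < d < (1 << w)
    k = (d & -d).bit_length() - 1   # trailing zero bits, so d = d0 * 2^k
    d0 = d >> k                     # odd part of the divisor
    p = pow(d0, -1, 1 << w)         # modular inverse, needs Python 3.8+
    q = ((1 << w) - 1) // d         # floor((2^W - 1) / D)
    return p, k, q


def is_divisible(x, d, w=32):
    """Model of X u% D == 0 computed without a remainder operation."""
    p, k, q = fold_params(d, w)
    return rotr((x * p) & ((1 << w) - 1), k, w) <= q


def check_exhaustively(w=8):
    """Compare the model against the plain remainder for every x and d."""
    return all(
        is_divisible(x, d, w) == (x % d == 0)
        for d in range(1, 1 << w)
        for x in range(1 << w)
    )
```

  Running `check_exhaustively()` compares the fold with `x % d == 0` for every 8-bit pair, which is a cheap way to catch a mistake such as computing `Q` from `D0`.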
* [NFC][CodeGen] Add negative test for X u% C == 0 fold (D63391) | Roman Lebedev | 2019-06-27 | 1 | -12/+44
  The fold (D63391) uses multiplicativeInverse(), but it is not guaranteed to always succeed, and '100' appears to be one of the problematic values.
  llvm-svn: 364578
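  One arithmetic fact that may be relevant here: modulo `2^W`, only odd values have a multiplicative inverse, and 100 is even. Whether this is the exact failure the commit refers to is an assumption, and the sketch below does not model `multiplicativeInverse()` itself. The helper names are hypothetical.

```python
from math import gcd


def inverse_mod_pow2(d, w=32):
    """Return the inverse of d modulo 2^w, or None when no inverse exists."""
    m = 1 << w
    if gcd(d, m) != 1:      # even d shares a factor of 2 with 2^w
        return None
    return pow(d, -1, m)    # Python 3.8+


def odd_part(d):
    """Split d into (d0, k) with d = d0 * 2^k and d0 odd."""
    k = (d & -d).bit_length() - 1
    return d >> k, k
```

  Under this reading, `100 = 25 * 2^2` has no inverse as a whole, while its odd part 25 does.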
* [SelectionDAG] Add icmp UNDEF handling to SelectionDAG::FoldSetCC | Simon Pilgrim | 2019-03-25 | 1 | -6/+2
  First half of PR40800, this patch adds DAG undef handling to icmp instructions to match the behaviour in llvm::ConstantFoldCompareInstruction and SimplifyICmpInst, this permits constant folding of vector comparisons where some elements had been reduced to UNDEF (by SimplifyDemandedVectorElts etc.).
  This involved a lot of tweaking to reduced tests as bugpoint loves to reduce icmp arguments to undef...
  Differential Revision: https://reviews.llvm.org/D59363
  llvm-svn: 356938
* [AARCH64][X86] Remove _nonsplat from test names | Simon Pilgrim | 2018-10-07 | 1 | -12/+12
  As discussed on D50222
  llvm-svn: 343934
* [DagCombine][NFC] Some more tests for X % C == 0 (UREM case) transform | Roman Lebedev | 2018-09-11 | 1 | -0/+240
  For https://reviews.llvm.org/D50222
  Patch by: hermord (Dmytro Shynkevych)!
  llvm-svn: 341953