summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/AArch64/unfold-masked-merge-scalar-variablemask.ll
Commit message (Collapse)AuthorAgeFilesLines
* Relanding r368987 [AArch64] Change location of frame-record within ↵Sander de Smalen2019-08-161-8/+8
| | | | | | | | | | | | | | | | callee-save area. Changes: There was a condition for `!NeedsFrameRecord` missing in the assert. The assert in question has changed to: + assert((!RPI.isPaired() || !NeedsFrameRecord || RPI.Reg2 != AArch64::FP || + RPI.Reg1 == AArch64::LR) && + "FrameRecord must be allocated together with LR"); This addresses PR43016. llvm-svn: 369122
* Revert r368987, it caused PR43016.Nico Weber2019-08-161-8/+8
| | | | llvm-svn: 369080
* [AArch64] Change location of frame-record within callee-save area.Sander de Smalen2019-08-151-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch changes the location of the frame-record (FP, LR) to the bottom of the callee-saved area. According to the AAPCS the location of the frame-record within the stackframe is unspecified (section 5.2.3 The Frame Pointer), so the compiler should be free to choose a different location. The reason for changing the location of the frame-record is to prepare the frame for allocating an SVE area below the callee-saves. This way the compiler can use the VL-scaled addressing modes to directly access SVE objects from the frame-pointer. : : | stack | | stack | | args | | args | +-------+ +-------+ | x30 | | x19 | | x29 | | x20 | FP -> |- - - -| | x21 | | x19 | ==> | x22 | | x20 | |- - - -| | x21 | | x30 | | x22 | | x29 | +-------+ +-------+ <- FP |///////| |///////| // realignment gap |- - - -| |- - - -| |spills/| |spills/| | locals| | locals| SP -> +-------+ +-------+ <- SP Things to point out: - The algorithm to find a paired register should be prevented from accidentally pairing some callee-saved register with LR that is not FP, since they should always be paired together when the frame has a frame-record. - For Darwin platforms the location of the frame-record is unchanged, since the unwind encoding does not allow for encoding this position dynamically and other tools currently depend on the former layout. Reviewers: efriedma, rovka, rengolin, thegameg, greened, t.p.northover Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D65653 llvm-svn: 368987
* [DAGCombine] Fold (x & ~y) | y patternsNikita Popov2019-03-171-4/+2
| | | | | | | | | | | | | Fold (x & ~y) | y and it's four commuted variants to x | y. This pattern can in particular appear when a vselect c, x, -1 is expanded to (x & ~c) | (-1 & c) and combined to (x & ~c) | c. This change has some overlap with D59066, which avoids creating a vselect of this form in the first place during uaddsat expansion. Differential Revision: https://reviews.llvm.org/D59174 llvm-svn: 356333
* [DAGCombiner] use root SDLoc for all nodes created by logic foldSanjay Patel2018-12-071-6/+6
| | | | | | | | | | | If this is not a valid way to assign an SDLoc, then we get this wrong all over SDAG. I don't know enough about the SDAG to explain this. IIUC, theoretically, debug info is not supposed to affect codegen. But here it has clearly affected 3 different targets, and the x86 change is an actual improvement. llvm-svn: 348552
* Extend hasStoreToStackSlot with list of FI accesses.Sander de Smalen2018-09-031-4/+4
| | | | | | | | | | | | | | | | | | For instructions that spill/fill to and from multiple frame-indices in a single instruction, hasStoreToStackSlot and hasLoadFromStackSlot should return an array of accesses, rather than just the first encounter of such an access. This better describes FI accesses for AArch64 (paired) LDP/STP instructions. Reviewers: t.p.northover, gberry, thegameg, rengolin, javed.absar, MatzeB Reviewed By: MatzeB Differential Revision: https://reviews.llvm.org/D51537 llvm-svn: 341301
* [DAGCombine][NFC] Masked merge unfolding: comment: some tests are non-canonicalRoman Lebedev2018-05-071-0/+8
| | | | | | As requested in https://reviews.llvm.org/D46494#inline-407282 llvm-svn: 331650
* [DAGCombiner] Masked merge: don't touch "not" xor's.Roman Lebedev2018-05-051-8/+10
| | | | | | | | | | | | | | | | | | | | Summary: Split off form D46031. It seems we don't want to transform the pattern if the `xor`'s are actually `not`'s. In vector case, this breaks `andnpd` / `vandnps` patterns. That being said, we may want to re-visit this `not` handling, maybe in D46073. Reviewers: spatel, craig.topper, javed.absar Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D46492 llvm-svn: 331595
* [X86][AArch64][NFC] Finish adding 'bad' tests for masked merge unfolding ↵Roman Lebedev2018-04-251-29/+181
| | | | | | | | | | with constants. I have initially committed basic tests in, rL330771, but then quickly discovered that there are a few more interesting patterns. llvm-svn: 330819
* [X86][AArch64][NFC] Add tests for masked merge unfolding with %y = constRoman Lebedev2018-04-241-0/+55
| | | | | | | | The fold was added in D45733. This appears to be a regression. llvm-svn: 330771
* [DAGCombiner] Unfold scalar masked merge if profitableRoman Lebedev2018-04-231-57/+57
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is [[ https://bugs.llvm.org/show_bug.cgi?id=37104 | PR37104 ]]. [[ https://bugs.llvm.org/show_bug.cgi?id=6773 | PR6773 ]] will introduce an IR canonicalization that is likely bad for the end assembly. Previously, `andl`+`andn`/`andps`+`andnps` / `bic`/`bsl` would be generated. (see `@out`) Now, they would no longer be generated (see `@in`). So we need to make sure that they are still generated. If the mask is constant, we do nothing. InstCombine should have unfolded it. Else, i use `hasAndNot()` TLI hook. For now, only handle scalars. https://rise4fun.com/Alive/bO6 ---- I *really* don't like the code i wrote in `DAGCombiner::unfoldMaskedMerge()`. It is super fragile. Is there something like IR Pattern Matchers for this? Reviewers: spatel, craig.topper, RKSimon, javed.absar Reviewed By: spatel Subscribers: andreadb, courbet, kristof.beyls, javed.absar, rengolin, nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D45733 llvm-svn: 330646
* [X86][AArch64][NFC] Add tests for masked merge unfoldingRoman Lebedev2018-04-231-0/+415
Summary: This is [[ https://bugs.llvm.org/show_bug.cgi?id=37104 | PR37104 ]]. [[ https://bugs.llvm.org/show_bug.cgi?id=6773 | PR6773 ]] will introduce an IR canonicalization that is likely bad for the end assembly. Previously, `andl`+`andn`/`andps`+`andnps` / `bic`/`bsl` would be generated. (see `@out`) Now, they would no longer be generated (see `@in`). I'm guessing `llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp` should be able to unfold this. Reviewers: spatel, craig.topper, RKSimon, javed.absar Reviewed By: spatel Subscribers: nemanjai, rengolin, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D45563 llvm-svn: 330645
OpenPOWER on IntegriCloud