path: root/llvm/lib/CodeGen/ScalarizeMaskedMemIntrin.cpp
Commit message | Author | Age | Files | Lines
* [X86] Add test cases for masked store and masked scatter with an all zeroes mask. Fix bug in ScalarizeMaskedMemIntrin
  Author: Craig Topper | Age: 2019-06-02 | Files: 1 | Lines: -1/+1
  Need to cast only to Constant instead of ConstantVector to allow ConstantAggregateZero.
  llvm-svn: 362341
* [ScalarizeMaskedMemIntrin] Add support for scalarizing expandload and compressstore intrinsics.
  Author: Craig Topper | Age: 2019-03-21 | Files: 1 | Lines: -0/+158
  This adds support for scalarizing these intrinsics as well as the X86TargetTransformInfo support to avoid scalarizing them in the cases X86 can handle.
  I've omitted handling special cases for constant masks for this first pass, though CodeGenPrepare can constant fold the branch conditions and remove some of the control flow anyway.
  Fixes PR40994 and covers most of PR3666. Might want to implement constant masks to close that.
  Differential Revision: https://reviews.llvm.org/D59180
  llvm-svn: 356687
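  As an illustration of what the scalarized form has to compute, here is a minimal sketch of the expandload and compressstore semantics in plain C++. It is not the pass's code: the names `expandLoad` and `compressStore` are hypothetical, and `std::array` stands in for an IR vector. The point it shows is that memory is accessed contiguously while only enabled lanes take part.

```cpp
#include <array>
#include <cstddef>

// Hypothetical scalar model of an expanding load: consecutive memory
// elements are placed into the enabled lanes, and disabled lanes keep
// the pass-through value.
template <typename T, std::size_t N>
std::array<T, N> expandLoad(const T *ptr, const std::array<bool, N> &mask,
                            std::array<T, N> passthru) {
  std::size_t memIdx = 0; // advances only when a lane is enabled
  for (std::size_t lane = 0; lane < N; ++lane) {
    if (mask[lane])
      passthru[lane] = ptr[memIdx++];
  }
  return passthru;
}

// Hypothetical scalar model of a compressing store: enabled lanes are
// written to consecutive memory locations. Returns the number of
// elements written.
template <typename T, std::size_t N>
std::size_t compressStore(const std::array<T, N> &value, T *ptr,
                          const std::array<bool, N> &mask) {
  std::size_t memIdx = 0;
  for (std::size_t lane = 0; lane < N; ++lane) {
    if (mask[lane])
      ptr[memIdx++] = value[lane];
  }
  return memIdx;
}
```

  In the real pass each `if (mask[lane])` presumably becomes a conditional branch to a new basic block, which is why constant masks would let some of that control flow fold away.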
* [ScalarizeMaskedMemIntrinsics] Reverse some if conditions to reduce indentations to remove curly braces.
  Author: Craig Topper | Age: 2019-03-21 | Files: 1 | Lines: -20/+16
  Pre-commit for D59180
  llvm-svn: 356646
* [ScalarizeMaskedMemIntrin] Use IRBuilder functions that take uint32_t/uint64_t for getelementptr, extractelement, and insertelement.
  Author: Craig Topper | Age: 2019-03-09 | Files: 1 | Lines: -43/+29
  This saves needing to call getInt32 ourselves, making the code a little shorter.
  The test changes are because insert/extract use getInt64 internally. Shouldn't be a functional issue.
  This cleanup is because I plan to write similar code for expandload/compressstore.
  llvm-svn: 355767
* [ScalarizeMaskedMemIntrin] Only set the ModifiedDT flag if new basic blocks were added.
  Author: Craig Topper | Age: 2019-03-08 | Files: 1 | Lines: -12/+16
  There are special cases in the scalarization for constant masks. If we hit one of the special cases we don't need to reset the iteration.
  Noticed while starting work on adding expandload/compressstore to this pass.
  llvm-svn: 355754
* [opaque pointer types] Pass value type to LoadInst creation.
  Author: James Y Knight | Age: 2019-02-01 | Files: 1 | Lines: -5/+6
  This cleans up all LoadInst creation in LLVM to explicitly pass the value type rather than deriving it from the pointer's element-type.
  Differential Revision: https://reviews.llvm.org/D57172
  llvm-svn: 352911
* Update the file headers across all of the LLVM projects in the monorepo to reflect the new license.
  Author: Chandler Carruth | Age: 2019-01-19 | Files: 1 | Lines: -4/+3
  We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach.
  Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository.
  llvm-svn: 351636
* [ScalarizeMaskedMemIntrin] Limit the scope of some variables that are only used inside loops.
  Author: Craig Topper | Age: 2018-10-30 | Files: 1 | Lines: -8/+5
  llvm-svn: 345638
* [ScalarizeMaskedMemIntrin] Use MinAlign to calculate alignment for the scalar load/stores to handle element types that are byte-sized but not powers of 2.
  Author: Craig Topper | Age: 2018-09-28 | Files: 1 | Lines: -2/+2
  This pass doesn't handle non-byte sized types correctly at all, but at least we can make byte sized types work.
  llvm-svn: 343294
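  To see why a plain minimum is not enough here, the sketch below contrasts it with a min-align style computation in standalone C++. The function `minAlign` is a hypothetical stand-in written for this example; it returns the largest power of two that divides both inputs, which is the behaviour the commit relies on, but it is not copied from the LLVM sources.

```cpp
#include <cstdint>

// Hypothetical helper: largest power of two dividing both a and b.
// (a | b) has its lowest set bit at the lowest set bit of either input,
// and x & (~x + 1) isolates that lowest set bit.
// Assumes at least one input is nonzero.
constexpr std::uint64_t minAlign(std::uint64_t a, std::uint64_t b) {
  return (a | b) & (~(a | b) + 1);
}

// With a 16-byte aligned vector of 3-byte elements, std::min(16, 3)
// would give 3, which is not a valid alignment; minAlign gives 1.
static_assert(minAlign(16, 3) == 1, "odd element size falls back to 1");
static_assert(minAlign(16, 4) == 4, "power-of-two element size is kept");
```

  For power-of-two element sizes the two approaches agree, so the change only matters for byte-sized types such as a 3-byte element.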
* [ScalarizeMaskedMemIntrin] Fix the alignment calculation for the scalar stores of a masked store expansion.
  Author: Craig Topper | Age: 2018-09-28 | Files: 1 | Lines: -1/+1
  It should be the minimum of the original alignment and the scalar size.
  llvm-svn: 343284
* [ScalarizeMaskedMemIntrin] Ensure the mask is a vector of ConstantInts before generating the expansion without control flow.
  Author: Craig Topper | Age: 2018-09-27 | Files: 1 | Lines: -4/+19
  It's possible the mask itself or one of the elements is a ConstantExpr and we shouldn't optimize in that case.
  llvm-svn: 343278
* [ScalarizeMaskedMemIntrin] Use cast instead of dyn_cast checked by an assert. Consistently make use of the element type variable we already have. NFCI
  Author: Craig Topper | Age: 2018-09-27 | Files: 1 | Lines: -10/+6
  cast will take care of asserting internally.
  llvm-svn: 343277
* [ScalarizeMaskedMemIntrin] When expanding masked gathers, start with the passthru vector and insert the new load results into it.
  Author: Craig Topper | Age: 2018-09-27 | Files: 1 | Lines: -22/+11
  Previously we started with undef and did a final merge with the passthru at the end.
  llvm-svn: 343273
* [ScalarizeMaskedMemIntrin] When expanding masked loads, start with the passthru value and insert each conditional load result over their element.
  Author: Craig Topper | Age: 2018-09-27 | Files: 1 | Lines: -22/+12
  Previously we started with undef and did one final merge at the end with a select.
  llvm-svn: 343271
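  The "start from passthru" strategy used for masked loads and masked gathers can be pictured with the following plain C++ sketch. The names `maskedLoad` and `maskedGather` are hypothetical and `std::array` stands in for an IR vector; the sketch only models the result, not the basic blocks the pass emits.

```cpp
#include <array>
#include <cstddef>

// Hypothetical scalar model of a masked load: the result starts as the
// pass-through vector and each enabled lane is overwritten by a load,
// so no final select against the pass-through is needed.
template <typename T, std::size_t N>
std::array<T, N> maskedLoad(const T *ptr, const std::array<bool, N> &mask,
                            std::array<T, N> passthru) {
  for (std::size_t lane = 0; lane < N; ++lane) {
    if (mask[lane])
      passthru[lane] = ptr[lane]; // disabled lanes never touch memory
  }
  return passthru;
}

// Hypothetical scalar model of a masked gather: same idea, but each lane
// has its own pointer. Pointers of disabled lanes are never dereferenced.
template <typename T, std::size_t N>
std::array<T, N> maskedGather(const std::array<const T *, N> &ptrs,
                              const std::array<bool, N> &mask,
                              std::array<T, N> passthru) {
  for (std::size_t lane = 0; lane < N; ++lane) {
    if (mask[lane])
      passthru[lane] = *ptrs[lane];
  }
  return passthru;
}
```

  Starting from the pass-through value means the per-lane insert already yields the final element, which matches the line-count reduction reported for both commits.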
* [ScalarizeMaskedMemIntrin] Handle the case where the mask is an all zero vector.
  Author: Craig Topper | Age: 2018-09-27 | Files: 1 | Lines: -8/+8
  This shouldn't really happen in practice I hope, but we tried to handle other constant cases. We missed this one because we checked for ConstantVector without realizing that zero becomes ConstantAggregateZero instead.
  So instead just check for Constant and use getAggregateElement which will do the dirty work for us.
  llvm-svn: 343270
* [ScalarizeMaskedMemIntrin] Remove some temporary variables that are only used by a single if condition.
  Author: Craig Topper | Age: 2018-09-27 | Files: 1 | Lines: -14/+5
  llvm-svn: 343268
* [ScalarizeMaskedMemIntrin] Cleanup comments. NFC
  Author: Craig Topper | Age: 2018-09-27 | Files: 1 | Lines: -58/+49
  llvm-svn: 343267
* [ScalarizeMaskedMemIntrin] Don't emit 'icmp eq i1 %x, 1' to check mask values. That's just %x so use that directly.
  Author: Craig Topper | Age: 2018-09-27 | Files: 1 | Lines: -23/+9
  Had we emitted this IR earlier, InstCombine would have removed icmp so I'm going to assume using the i1 directly would be considered canonical.
  llvm-svn: 343244
* [CodeGen] Do not allow opt-bisect-limit to skip ScalarizeMaskedMemIntrin.
  Author: Andrei Elovikov | Age: 2018-04-24 | Files: 1 | Lines: -3/+0
  Summary: The pass is supposed to scalarize such intrinsics if the target does not support them natively, so if the scalarization does not happen instruction selection crashes due to inability to lower these intrinsics.
  Reviewers: andrew.w.kaylor, craig.topper
  Reviewed By: andrew.w.kaylor
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D45947
  llvm-svn: 330700
* Fix a bunch more layering of CodeGen headers that are in Target
  Author: David Blaikie | Age: 2017-11-17 | Files: 1 | Lines: -1/+1
  All these headers already depend on CodeGen headers so moving them into CodeGen fixes the layering (since CodeGen depends on Target, not the other way around).
  llvm-svn: 318490
* [CodeGen] Fix some Clang-tidy modernize-use-default-member-init and Include What You Use warnings; other minor fixes (NFC).
  Author: Eugene Zelenko | Age: 2017-09-27 | Files: 1 | Lines: -17/+29
  llvm-svn: 314363
* Sink some IntrinsicInst.h and Intrinsics.h out of llvm/include
  Author: Reid Kleckner | Age: 2017-09-07 | Files: 1 | Lines: -0/+1
  Many of these uses can get by with forward declarations. Hopefully this speeds up compilation after adding a single intrinsic.
  llvm-svn: 312759
* CodeGen: Rename DEBUG_TYPE to match passnames
  Author: Matthias Braun | Age: 2017-05-25 | Files: 1 | Lines: -6/+2
  Rename the DEBUG_TYPE to match the names of corresponding passes where it makes sense. Also establish the pattern of simply referencing DEBUG_TYPE instead of repeating the passname where possible.
  llvm-svn: 303921
* [X86] Relocate code of replacement of subtarget unsupported masked memory intrinsics to run also on -O0 option.
  Author: Ayman Musa | Age: 2017-05-15 | Files: 1 | Lines: -0/+660
  Currently, when masked load, store, gather or scatter intrinsics are used, we check in the CodeGenPrepare pass whether the subtarget supports these intrinsics; if not, we replace them with scalar code. This is a functional transformation, not an optimization (not optional).
  The CodeGenPrepare pass does not run when the optimization level is set to CodeGenOpt::None (-O0).
  Functional transformations should run at all optimization levels, so here I created a new pass which runs at all optimization levels and does no more than this transformation.
  Differential Revision: https://reviews.llvm.org/D32487
  llvm-svn: 303050