path: root/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
Commit message (Author, Date, Files, Lines)
...
* [InstCombine] Fix SimplifyLibCalls erasing an instruction while IC still had references to it. (Amara Emerson, 2018-10-11, 1 file, -1/+5)
  InstCombine keeps a worklist and assumes that optimizations don't eraseFromParent() the instruction, which SimplifyLibCalls violates. This change adds a new callback to SimplifyLibCalls to let clients specify their own handler for erasing actions.
  Differential Revision: https://reviews.llvm.org/D52729
  llvm-svn: 344251
* [IRBuilder] Fixup CreateIntrinsic to allow specifying Types to Mangle. (Neil Henning, 2018-10-08, 1 file, -5/+5)
  The IRBuilder CreateIntrinsic method wouldn't allow you to specify the types that you wanted the intrinsic to be mangled with. To fix this I've:
  - Added an ArrayRef<Type *> member to both CreateIntrinsic overloads.
  - Used that array to pass into the Intrinsic::getDeclaration call.
  - Added a CreateUnaryIntrinsic to replace the most common use of CreateIntrinsic where the type was auto-deduced from operand 0.
  - Added a bunch more unit tests to test Create*Intrinsic calls that weren't being tested (including the FMF flag that wasn't checked).
  This was suggested as part of the AMDGPU specific atomic optimizer review (https://reviews.llvm.org/D51969).
  Differential Revision: https://reviews.llvm.org/D52087
  llvm-svn: 343962
* [InstCombine][x86] try even harder to convert blendv intrinsic to generic IR (PR38814) (Sanjay Patel, 2018-09-22, 1 file, -7/+20)
  Follow-up to rL342324 (D52059). Missing optimizations with blendv are shown in: https://bugs.llvm.org/show_bug.cgi?id=38814
  This is an easier and more powerful solution than adding pattern matching for a few special cases in the backend. The potential danger with this transform in IR is that the condition value can get separated from the select, and the backend might not be able to make a blendv out of it again.
  llvm-svn: 342806
* [InstCombine][x86] try harder to convert blendv intrinsic to generic IR (PR38814) (Sanjay Patel, 2018-09-15, 1 file, -7/+15)
  Missing optimizations with blendv are shown in: https://bugs.llvm.org/show_bug.cgi?id=38814
  If this works, it's an easier and more powerful solution than adding pattern matching for a few special cases in the backend. The potential danger with this transform in IR is that the condition value can get separated from the select, and the backend might not be able to make a blendv out of it again. I don't think that's too likely, but I've kept this patch minimal with a 'TODO', so we can test that theory in the wild before expanding the transform.
  Differential Revision: https://reviews.llvm.org/D52059
  llvm-svn: 342324
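  A hedged before/after sketch of the transform (the condition shape and all names here are illustrative, not taken from the commit); when the blend condition is a sign-extended compare, the intrinsic can become a generic select:
    declare <4 x float> @llvm.x86.sse41.blendvps(<4 x float>, <4 x float>, <4 x float>)
    ; before: blendv picks %b where the sign bit of %c is set
    define <4 x float> @before(<4 x float> %a, <4 x float> %b, <4 x float> %x, <4 x float> %y) {
      %cmp = fcmp olt <4 x float> %x, %y
      %sext = sext <4 x i1> %cmp to <4 x i32>
      %c = bitcast <4 x i32> %sext to <4 x float>
      %r = call <4 x float> @llvm.x86.sse41.blendvps(<4 x float> %a, <4 x float> %b, <4 x float> %c)
      ret <4 x float> %r
    }
    ; after: the compare feeds a generic select directly
    define <4 x float> @after(<4 x float> %a, <4 x float> %b, <4 x float> %x, <4 x float> %y) {
      %cmp = fcmp olt <4 x float> %x, %y
      %r = select <4 x i1> %cmp, <4 x float> %b, <4 x float> %a
      ret <4 x float> %r
    }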
* [InstCombine] canonicalize fneg with llvm.sin (Sanjay Patel, 2018-08-29, 1 file, -5/+14)
  This is a follow-up to rL339604 which did the same transform for a sin libcall. The handling of intrinsics vs. libcalls is unfortunately scattered, so I'm just adding this next to the existing transform for llvm.cos for now.
  This should resolve PR38458: https://bugs.llvm.org/show_bug.cgi?id=38458
  If the call was already negated, the negates will cancel each other out.
  llvm-svn: 340952
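  A minimal sketch of the canonicalization, assuming fneg is spelled as fsub from -0.0 (as it was at the time); names are illustrative:
    declare double @llvm.sin.f64(double)
    ; before: sin(-x)
    define double @before(double %x) {
      %neg = fsub double -0.0, %x
      %r = call double @llvm.sin.f64(double %neg)
      ret double %r
    }
    ; after: -sin(x); an enclosing negate now cancels with this one
    define double @after(double %x) {
      %sin = call double @llvm.sin.f64(double %x)
      %r = fsub double -0.0, %sin
      ret double %r
    }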
* AMDGPU: Remove nan tests in class if src is nnan (Matt Arsenault, 2018-08-28, 1 file, -0/+7)
  llvm-svn: 340850
* [X86] Remove masking from the 512-bit padds and psubs intrinsics. Use select in IR instead. (Craig Topper, 2018-08-16, 1 file, -33/+13)
  llvm-svn: 339842
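  A hedged sketch of the unmasked-intrinsic-plus-select shape this change produces (the intrinsic name and mask plumbing are assumptions):
    declare <32 x i16> @llvm.x86.avx512.padds.w.512(<32 x i16>, <32 x i16>)
    define <32 x i16> @masked_padds(<32 x i16> %a, <32 x i16> %b, <32 x i16> %passthru, i32 %m) {
      %mask = bitcast i32 %m to <32 x i1>
      %sum = call <32 x i16> @llvm.x86.avx512.padds.w.512(<32 x i16> %a, <32 x i16> %b)
      ; masking is now expressed as a generic select instead of an intrinsic argument
      %r = select <32 x i1> %mask, <32 x i16> %sum, <32 x i16> %passthru
      ret <32 x i16> %r
    }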
* AMDGPU: Stop producing icmp/fcmp intrinsics with invalid types (Matt Arsenault, 2018-08-15, 1 file, -0/+27)
  llvm-svn: 339815
* [X86] Constant folding of adds/subs intrinsics (Tomasz Krupa, 2018-08-14, 1 file, -0/+98)
  Summary: This adds constant folding of signed add/sub with saturation intrinsics.
  Reviewers: craig.topper, spatel, RKSimon, chandlerc, efriedma
  Reviewed By: craig.topper
  Subscribers: rnk, llvm-commits
  Differential Revision: https://reviews.llvm.org/D50499
  llvm-svn: 339659
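  A small worked case, assuming the SSE2 signed saturating-add intrinsic (illustrative; the commit covers the whole family of adds/subs intrinsics):
    declare <8 x i16> @llvm.x86.sse2.padds.w(<8 x i16>, <8 x i16>)
    define <8 x i16> @fold() {
      ; lane 0: 32767 + 1 saturates to 32767; lane 1: -32768 + -1 saturates to -32768
      %r = call <8 x i16> @llvm.x86.sse2.padds.w(<8 x i16> <i16 32767, i16 -32768, i16 100, i16 0, i16 0, i16 0, i16 0, i16 0>, <8 x i16> <i16 1, i16 -1, i16 28, i16 0, i16 0, i16 0, i16 0, i16 0>)
      ; constant-folds to <i16 32767, i16 -32768, i16 128, i16 0, ...>
      ret <8 x i16> %r
    }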
* AMDGPU: Turn class x, p_zero|n_zero into fcmp oeq x, 0 (Matt Arsenault, 2018-08-10, 1 file, -0/+9)
  The library does use this for some reason.
  llvm-svn: 339461
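  An illustrative sketch, assuming the usual amdgcn class-mask encoding in which n_zero|p_zero is mask 96 (bits 5 and 6):
    declare i1 @llvm.amdgcn.class.f32(float, i32)
    define i1 @is_zero(float %x) {
      %r = call i1 @llvm.amdgcn.class.f32(float %x, i32 96)
      ; folds to: %r = fcmp oeq float %x, 0.000000e+00
      ret i1 %r
    }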
* [InstSimplify] move minnum/maxnum with Inf folds from instcombine (Sanjay Patel, 2018-08-09, 1 file, -35/+0)
  llvm-svn: 339396
* [InstSimplify] move minnum/maxnum with common op fold from instcombine (Sanjay Patel, 2018-08-07, 1 file, -30/+0)
  llvm-svn: 339144
* [InstSimplify] move minnum/maxnum with undef fold from instcombine (Sanjay Patel, 2018-08-02, 1 file, -11/+0)
  llvm-svn: 338719
* [InstSimplify] move minnum/maxnum with same arg fold from instcombine (Sanjay Patel, 2018-08-01, 1 file, -4/+0)
  llvm-svn: 338652
* PatternMatch: Add wrappers for fabs and canonicalize (Matt Arsenault, 2018-07-27, 1 file, -3/+3)
  llvm-svn: 338111
* Simplify recursive launder.invariant.group and strip (Piotr Padlewski, 2018-07-12, 1 file, -1/+39)
  Summary: This patch is crucial for proving equality of laundered/stripped pointers, e.g.:
    bool foo(A *a) { return a == std::launder(a); }
  Clang with -fstrict-vtable-pointers will emit something like:
    define dso_local zeroext i1 @_Z3fooP1A(%struct.A* %a) {
    entry:
      %c = bitcast %struct.A* %a to i8*
      %call = tail call i8* @llvm.launder.invariant.group.p0i8(i8* %c)
      %0 = bitcast %struct.A* %a to i8*
      %1 = tail call i8* @llvm.strip.invariant.group.p0i8(i8* %0)
      %2 = tail call i8* @llvm.strip.invariant.group.p0i8(i8* %call)
      %cmp = icmp eq i8* %1, %2
      ret i1 %cmp
    }
  Because %2 can be replaced with @llvm.strip.invariant.group(%0), and %2 and %1 then produce the same value (strip is readnone), we can replace the compare with true.
  Reviewers: rsmith, hfinkel, majnemer, amharc, kuhar
  Subscribers: llvm-commits, hiraditya
  Differential Revision: https://reviews.llvm.org/D47423
  llvm-svn: 336963
* [X86] Remove and autoupgrade the scalar fma intrinsics with masking. (Craig Topper, 2018-07-12, 1 file, -10/+0)
  This converts them to what clang is now using for codegen. Unfortunately, there seem to be a few kinks to work out still. I'll try to address with follow up patches.
  llvm-svn: 336871
* llvm: Add support for "-fno-delete-null-pointer-checks" (Manoj Gupta, 2018-07-09, 1 file, -1/+3)
  Summary: Support for this option is needed for building the Linux kernel. This is a very frequently requested feature by kernel developers. More details: https://lkml.org/lkml/2018/4/4/601
  GCC option description for -fdelete-null-pointer-checks: Assume that programs cannot safely dereference null pointers, and that no code or data element resides at address zero. -fno-delete-null-pointer-checks is the inverse of this, implying that null pointer dereferencing is not undefined.
  This feature is implemented in LLVM IR in this CL as the function attribute "null-pointer-is-valid"="true" (under review at D47894). The CL updates several passes that assumed null pointer dereferencing is undefined to not optimize when the "null-pointer-is-valid"="true" attribute is present.
  Reviewers: t.p.northover, efriedma, jyknight, chandlerc, rnk, srhines, void, george.burgess.iv
  Reviewed By: efriedma, george.burgess.iv
  Subscribers: eraman, haicheng, george.burgess.iv, drinkcat, theraven, reames, sanjoy, xbolva00, llvm-commits
  Differential Revision: https://reviews.llvm.org/D47895
  llvm-svn: 336613
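  A minimal IR sketch of the attribute as spelled in the commit message (the function itself is hypothetical):
    ; passes may no longer assume this load makes %p non-null
    define i32 @f(i32* %p) "null-pointer-is-valid"="true" {
      %v = load i32, i32* %p
      ret i32 %v
    }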
* Fix asserts in AMDGCN fmed3 folding by handling more cases of NaN (Matt Arsenault, 2018-07-05, 1 file, -7/+18)
  Better NaN handling for AMDGCN fmed3. All operands are checked for NaN now. The checks were moved before the canonicalization to provide a better mapping from fclamp. Changed the behaviour of fmed3(x,y,NaN) to return max(x,y) instead of min(x,y) in light of this. Updated tests as a result and added some new cases to cover the fix.
  Patch by Alan Baker
  llvm-svn: 336375
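  Illustrative IR for the quoted fmed3(x,y,NaN) rule (treat as a sketch):
    declare float @llvm.amdgcn.fmed3.f32(float, float, float)
    define float @med3_nan(float %x, float %y) {
      %r = call float @llvm.amdgcn.fmed3.f32(float %x, float %y, float 0x7FF8000000000000)
      ; now folds toward max(x, y), i.e. @llvm.maxnum.f32(%x, %y), instead of min(x, y)
      ret float %r
    }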
* [X86] Remove X86 specific scalar FMA intrinsics and upgrade to target independent FMA and extractelement/insertelement. (Craig Topper, 2018-07-05, 1 file, -2/+0)
  llvm-svn: 336315
* [X86] Rename the autoupgraded packed fp compare and fpclass intrinsics that don't take a mask as input to exclude '.mask.' from their name. (Craig Topper, 2018-06-27, 1 file, -6/+6)
  I think the intrinsics named 'avx512.mask.' should refer to the previous behavior of taking a mask argument in the intrinsic, instead of using a 'select' or 'and' instruction in IR to accomplish the masking. This is more consistent with the goal that eventually we will have no intrinsics with built-in masking. When we reach that goal, we should have no intrinsics named "avx512.mask".
  llvm-svn: 335744
* [InstCombine] ignore debuginfo when removing redundant assumes (PR37726) (Sanjay Patel, 2018-06-20, 1 file, -3/+5)
  This is similar to: rL335083
  Fixes: https://bugs.llvm.org/show_bug.cgi?id=37726
  llvm-svn: 335121
* [IR] Introduce helpers to skip debug instructions (NFC) (Vedant Kumar, 2018-06-19, 1 file, -11/+3)
  This patch introduces two helpers to make it easier to ignore debug intrinsics:
  - Instruction::getNextNonDebugInstruction()
    This is just like Instruction::getNextNode(), except that it skips debug info.
  - skipDebugInfo(BasicBlock::iterator)
    A free function which advances a BasicBlock iterator past any debug info. This is a no-op when the iterator already points to a non-debug instruction.
  Part of: llvm.org/PR37728
  Related to: https://reviews.llvm.org/D47874
  Differential Revision: https://reviews.llvm.org/D48305
  llvm-svn: 335083
* [InstCombine] Replacing X86-specific rounding intrinsics with generic floor-ceil (Mikhail Dvoretckii, 2018-06-19, 1 file, -2/+128)
  This patch replaces calls to X86-specific intrinsics with floor-ceil semantics with calls to target-independent @llvm.floor.* and @llvm.ceil.* intrinsics. This doesn't affect the resulting machine code, as those intrinsics are lowered to the same instructions, but exposes these specific rounding cases to generic optimizations.
  Differential Revision: https://reviews.llvm.org/D48067
  llvm-svn: 335039
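  A sketch of one such replacement, assuming SSE4.1 rounding immediate 1 selects round-toward-negative-infinity (floor) and 2 selects round-toward-positive-infinity (ceil):
    declare <4 x float> @llvm.x86.sse41.round.ps(<4 x float>, i32)
    define <4 x float> @f(<4 x float> %x) {
      %r = call <4 x float> @llvm.x86.sse41.round.ps(<4 x float> %x, i32 1)
      ; becomes: %r = call <4 x float> @llvm.floor.v4f32(<4 x float> %x)
      ret <4 x float> %r
    }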
* [X86] Remove masking from the 512-bit masked floating point add/sub/mul/div intrinsics. Use a select in IR instead. (Craig Topper, 2018-06-10, 1 file, -50/+17)
  llvm-svn: 334358
* [InstCombine] Skip dbg.value(s) when looking at stack{save,restore}. (Davide Italiano, 2018-06-08, 1 file, -1/+8)
  Fixes PR37713.
  llvm-svn: 334317
* InstCombine: ignore debug instructions during fence combine (Tim Northover, 2018-06-06, 1 file, -1/+5)
  We should never get different CodeGen based on whether the code is being compiled in debug mode, so we must skip over @llvm.dbg.value (and similar) calls.
  Should fix at least the worst part of PR37690.
  llvm-svn: 334090
* Move Analysis/Utils/Local.h back to Transforms (David Blaikie, 2018-06-04, 1 file, -1/+1)
  Review feedback from r328165. Split out just the one function from the file that's used by Analysis. (As chandlerc pointed out, the original change only moved the header and not the implementation anyway - which was fine for the one function that was used (since it's a template/inlined in the header) but not in general.)
  llvm-svn: 333954
* [InstCombine, ARM] Convert vld1 to llvm load (Alexandros Lamprineas, 2018-05-31, 1 file, -1/+30)
  Convert a vector load intrinsic into an llvm load instruction. This is beneficial when the underlying object being addressed comes from a constant, since we get constant-folding for free.
  Differential Revision: https://reviews.llvm.org/D46273
  llvm-svn: 333643
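  A hedged before/after sketch (the vld1 intrinsic signature here, an i8* plus an alignment argument, is an assumption about that era's IR):
    declare <4 x i32> @llvm.arm.neon.vld1.v4i32.p0i8(i8*, i32)
    define <4 x i32> @before(i8* %p) {
      %v = call <4 x i32> @llvm.arm.neon.vld1.v4i32.p0i8(i8* %p, i32 4)
      ret <4 x i32> %v
    }
    ; after: a plain load, which constant-folds when %p addresses a constant
    define <4 x i32> @after(i8* %p) {
      %cast = bitcast i8* %p to <4 x i32>*
      %v = load <4 x i32>, <4 x i32>* %cast, align 4
      ret <4 x i32> %v
    }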
* [InstCombine, ARM, AArch64] Convert table lookup to shuffle vector (Alexandros Lamprineas, 2018-05-30, 1 file, -0/+46)
  Turning a table lookup intrinsic into a shuffle vector instruction can be beneficial. If the mask used for the lookup is the constant vector {7,6,5,4,3,2,1,0}, then the back-end generates byte reverse instructions instead.
  Differential Revision: https://reviews.llvm.org/D46133
  llvm-svn: 333550
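  A sketch using the AArch64 single-register lookup with the mask quoted above (intrinsic name and types are assumptions):
    declare <8 x i8> @llvm.aarch64.neon.tbl1.v8i8(<16 x i8>, <8 x i8>)
    define <8 x i8> @rev(<16 x i8> %tbl) {
      %r = call <8 x i8> @llvm.aarch64.neon.tbl1.v8i8(<16 x i8> %tbl, <8 x i8> <i8 7, i8 6, i8 5, i8 4, i8 3, i8 2, i8 1, i8 0>)
      ; becomes: shufflevector <16 x i8> %tbl, <16 x i8> undef, <8 x i32> <i32 7, i32 6, i32 5, i32 4, i32 3, i32 2, i32 1, i32 0>
      ret <8 x i8> %r
    }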
* [InstCombine] Combine XOR and AES instructions on ARM/ARM64. (Chad Rosier, 2018-05-24, 1 file, -0/+17)
  The ARM/ARM64 AESE and AESD instructions have a builtin XOR as the first step in the instruction. Therefore, if the AES key is zero and the AES data was previously XORed, it can be combined into a single instruction.
  Differential Revision: https://reviews.llvm.org/D47239
  Patch by Michael Brase!
  llvm-svn: 333193
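  A hedged AArch64 sketch (intrinsic name is an assumption); the explicit xor folds into the instruction's built-in xor when the key operand is zero:
    declare <16 x i8> @llvm.aarch64.crypto.aese(<16 x i8>, <16 x i8>)
    define <16 x i8> @round(<16 x i8> %data, <16 x i8> %key) {
      %x = xor <16 x i8> %data, %key
      %r = call <16 x i8> @llvm.aarch64.crypto.aese(<16 x i8> %x, <16 x i8> zeroinitializer)
      ; combines to: call <16 x i8> @llvm.aarch64.crypto.aese(<16 x i8> %data, <16 x i8> %key)
      ret <16 x i8> %r
    }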
* [AMDGPU] Optimize old value of v_mov_b32_dpp (Stanislav Mekhanoshin, 2018-05-22, 1 file, -0/+17)
  We can eliminate the old value if bound_ctrl = 1 and row_mask = bank_mask = 0xf. This is an alternative implementation working with the intrinsic in InstCombine. Original review for the post-ISel optimization: D46570.
  Differential Revision: https://reviews.llvm.org/D46596
  llvm-svn: 332956
* [X86] Remove mask arguments from permvar builtins/intrinsics. Use a select in IR instead. (Craig Topper, 2018-05-20, 1 file, -20/+12)
  Someday maybe we'll use selects for all intrinsics.
  llvm-svn: 332824
* Rename DEBUG macro to LLVM_DEBUG. (Nicola Zaghen, 2018-05-14, 1 file, -2/+2)
  The DEBUG() macro is very generic so it might clash with other projects. The renaming was done as follows:
  - git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
  - git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
  - Manual change to APInt
  - Manually change DOCS as the regex doesn't match it.
  In the transition period the DEBUG() macro is still present and aliased to the LLVM_DEBUG() one.
  Differential Revision: https://reviews.llvm.org/D43624
  llvm-svn: 332240
* [X86] Extend instcombine folds for pclmuldq intrinsics to the 256 and 512 bit versions. (Craig Topper, 2018-05-13, 1 file, -9/+12)
  llvm-svn: 332202
* [X86] Remove and autoupgrade masked vpermd/vpermps intrinsics. (Craig Topper, 2018-05-13, 1 file, -2/+0)
  llvm-svn: 332198
* [X86] Remove and autoupgrade a bunch of FMA intrinsics that are no longer used by clang. (Craig Topper, 2018-05-11, 1 file, -6/+0)
  llvm-svn: 332146
* [InstCombine] Handle atomic memset in the same way as regular memset (Daniel Neilson, 2018-05-11, 1 file, -3/+5)
  Summary: This change adds handling of the atomic memset intrinsic to the code path that simplifies the regular memset. In practice this means that we will now also expand a small constant-length atomic memset into a single unordered atomic store.
  Reviewers: apilipenko, skatkov, mkazantsev, anna, reames
  Reviewed By: reames
  Subscribers: reames, llvm-commits
  Differential Revision: https://reviews.llvm.org/D46660
  llvm-svn: 332132
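  A hedged sketch of the 4-byte constant-length case:
    declare void @llvm.memset.element.unordered.atomic.p0i8.i32(i8* nocapture writeonly, i8, i32, i32)
    define void @zero4(i8* align 4 %p) {
      call void @llvm.memset.element.unordered.atomic.p0i8.i32(i8* align 4 %p, i8 0, i32 4, i32 4)
      ; expands to a single unordered atomic store:
      ;   %q = bitcast i8* %p to i32*
      ;   store atomic i32 0, i32* %q unordered, align 4
      ret void
    }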
* [InstCombine] Unify handling of atomic memtransfer with non-atomic memtransfer (Daniel Neilson, 2018-05-11, 1 file, -110/+31)
  Summary: This change reworks the handling of atomic memcpy within the instcombine pass. Previously, a constant-length atomic memcpy would be lowered into loads & stores as long as no more than 16 load/store pairs were created. This is quite different from the lowering done for a non-atomic memcpy, which only ever lowers into a single load/store pair of no more than 8 bytes. Larger constant-sized memcpy calls are expanded to load/stores in later passes, such as SelectionDAG lowering.
  In this change the behaviour for atomic memcpy is unified with non-atomic memcpy; atomic memcpy is now treated in the same way as non-atomic memcpy has always been. We leave it to later passes to lower longer-length atomic memcpy calls.
  Due to the structure of the pass's handling of memtransfer intrinsics, this change also gives us handling of atomic memmove that we did not previously have.
  Reviewers: apilipenko, skatkov, mkazantsev, anna, reames
  Reviewed By: reames
  Subscribers: reames, llvm-commits
  Differential Revision: https://reviews.llvm.org/D46658
  llvm-svn: 332093
* [InstCombine] add folds for minnum(-a, -b) --> -maxnum(a, b) (Sanjay Patel, 2018-05-10, 1 file, -0/+16)
  This is similar to what we do for integer min/max with 'not' ops (rL321882).
  This should fix:
  https://bugs.llvm.org/show_bug.cgi?id=37404
  https://bugs.llvm.org/show_bug.cgi?id=37405
  llvm-svn: 332031
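  A minimal sketch of the fold, with fneg spelled as fsub from -0.0:
    declare float @llvm.minnum.f32(float, float)
    define float @fold(float %a, float %b) {
      %nega = fsub float -0.0, %a
      %negb = fsub float -0.0, %b
      %r = call float @llvm.minnum.f32(float %nega, float %negb)
      ; folds to: %m = call float @llvm.maxnum.f32(float %a, float %b)
      ;           %r = fsub float -0.0, %m
      ret float %r
    }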
* [InstCombine] fix a signedness warning which broke -Werror builds (Philip Reames, 2018-05-10, 1 file, -1/+1)
  llvm-svn: 331944
* [InstCombine] Widen guards with conditions between (Philip Reames, 2018-05-09, 1 file, -1/+22)
  The previous handling for guard widening in InstCombine was extremely restrictive. In particular, it didn't handle the common case where we had two guards separated by a single icmp. Handle this by scanning through a small fixed window of instructions to find the next guard if needed.
  Differential Revision: https://reviews.llvm.org/D46203
  llvm-svn: 331935
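  An illustrative sketch of the widened pattern:
    declare void @llvm.experimental.guard(i1, ...)
    define void @widen(i32 %x, i1 %c1) {
      call void (i1, ...) @llvm.experimental.guard(i1 %c1) [ "deopt"() ]
      %c2 = icmp ult i32 %x, 10 ; a single instruction between the two guards
      call void (i1, ...) @llvm.experimental.guard(i1 %c2) [ "deopt"() ]
      ; widens to one guard on the conjunction:
      ;   %wide = and i1 %c1, %c2
      ;   call void (i1, ...) @llvm.experimental.guard(i1 %wide) [ "deopt"() ]
      ret void
    }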
* [X86] Remove the pmuldq/pmuludq intrinsics and replace with native IR. (Craig Topper, 2018-04-13, 1 file, -69/+0)
  This completes the work started in r329604 and r329605 when we changed clang to no longer use the intrinsics. We lost some InstCombine SimplifyDemandedBits optimizations through this change, as we aren't able to fold 'and', bitcast, shuffle very well.
  llvm-svn: 329990
* [InstCombine] cleanup; NFC (Sanjay Patel, 2018-04-05, 1 file, -15/+12)
  llvm-svn: 329282
* [InstCombine] Don't strip function type casts from musttail calls (Reid Kleckner, 2018-04-02, 1 file, -1/+10)
  Summary: The cast simplifications that instcombine does here do not make any attempt to obey the verifier rules for musttail calls. Therefore we have to disable them.
  Reviewers: efriedma, majnemer, pcc
  Subscribers: hiraditya, llvm-commits
  Differential Revision: https://reviews.llvm.org/D45186
  llvm-svn: 329027
* [PatternMatch] allow undef elements when matching vector FP +0.0 (Sanjay Patel, 2018-03-25, 1 file, -2/+2)
  This continues the FP constant pattern matching improvements from:
  https://reviews.llvm.org/rL327627
  https://reviews.llvm.org/rL327339
  https://reviews.llvm.org/rL327307
  Several integer constant matchers also have this ability. I'm separating matching of integer/pointer null from FP positive zero and renaming/commenting to make the functionality clearer.
  llvm-svn: 328461
* [InstCombine] simplify code for FP intrinsic shrinking; NFCI (Sanjay Patel, 2018-03-23, 1 file, -10/+5)
  llvm-svn: 328372
* [InstCombineCalls] Update deprecated API usage (NFC) (Daniel Neilson, 2018-03-22, 1 file, -1/+1)
  Summary: Just updating a call to MemSetInst::getAlignment() to MemSetInst::getDestAlignment(). The former has been deprecated.
  llvm-svn: 328227
* Fix a couple of layering violations in Transforms (David Blaikie, 2018-03-21, 1 file, -1/+1)
  Remove #include of Transforms/Scalar.h from Transform/Utils to fix layering. Transforms depends on Transforms/Utils, not the other way around. So remove the header and the "createStripGCRelocatesPass" function declaration (& definition) that is unused and motivated this dependency.
  Move Transforms/Utils/Local.h into Analysis because it's used by Analysis/MemoryBuiltins.cpp.
  llvm-svn: 328165
* [Transforms] Propagate new-format TBAA tags on simplification of memory-transfer intrinsics (Ivan A. Kosarev, 2018-02-19, 1 file, -1/+3)
  With this patch in place, when a new-format TBAA tag is available for a memory-transfer intrinsic call, we prefer propagating that new-format tag. Otherwise, we fall back to the old approach where we try to construct a proper TBAA access tag from 'tbaa.struct' metadata.
  Differential Revision: https://reviews.llvm.org/D41543
  llvm-svn: 325488