summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
Commit message (Collapse)AuthorAgeFilesLines
* [LibCallSimplifier] propagate FMF when shrinking binary callsSanjay Patel2015-12-311-0/+4
| | | | llvm-svn: 256682
* [LibCallSimplifier] propagate FMF when shrinking unary callsSanjay Patel2015-12-311-0/+4
| | | | llvm-svn: 256679
* Variable names start with an upper case letter; NFCSanjay Patel2015-12-311-4/+4
| | | | llvm-svn: 256676
* fix formatting; NFCSanjay Patel2015-12-311-17/+22
| | | | llvm-svn: 256675
* getParent() ^ 3 == getModule() ; NFCISanjay Patel2015-12-141-1/+1
| | | | llvm-svn: 255511
* [SimplifyLibCalls] Optimization for pow(x, n) where n is some constantWeiming Zhao2015-12-041-0/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: In order to avoid calling pow function we generate repeated fmul when n is a positive or negative whole number. For each exponent we pre-compute Addition Chains in order to minimize the no. of fmuls. Refer: http://wwwhomes.uni-bielefeld.de/achim/addition_chain.html We pre-compute addition chains for exponents upto 32 (which results in a max of 7 fmuls). For eg: 4 = 2+2 5 = 2+3 6 = 3+3 and so on Hence, pow(x, 4.0) ==> y = fmul x, x x = fmul y, y ret x For negative exponents, we simply compute the reciprocal of the final result. Note: This transformation is only enabled under fast-math. Patch by Mandeep Singh Grang <mgrang@codeaurora.org> Reviewers: weimingz, majnemer, escha, davide, scanon, joerg Subscribers: probinson, escha, llvm-commits Differential Revision: http://reviews.llvm.org/D13994 llvm-svn: 254776
* [SimplifyLibCalls] Transform log(exp2(y)) to y*log(2) under fast-math.Davide Italiano2015-11-301-1/+9
| | | | llvm-svn: 254317
* [SimplifyLibCalls] Don't crash if the function doesn't have a name.Davide Italiano2015-11-291-3/+2
| | | | llvm-svn: 254265
* [SimplifyLibCalls] Cross out implemented transformations.Davide Italiano2015-11-291-2/+0
| | | | llvm-svn: 254264
* [SimplifyLibCalls] Tranform log(pow(x, y)) -> y*log(x).Davide Italiano2015-11-291-5/+50
| | | | | | | | | | | | | | | | | | This one is enabled only under -ffast-math. There are cases where the difference between the value computed and the correct value is huge even for ffast-math, e.g. as Steven pointed out: x = -1, y = -4 log(pow(-1), 4) = 0 4*log(-1) = NaN I checked what GCC does and apparently they do the same optimization (which result in the dramatic difference). Future work might try to make this (slightly) less worse. Differential Revision: http://reviews.llvm.org/D14400 llvm-svn: 254263
* [SimplifyLibCalls] Use any_of(). Suggested by David Blaikie!Davide Italiano2015-11-281-4/+3
| | | | llvm-svn: 254239
* [SimplifyLibCalls] Fix inverted condition that lead to an uninitialized ↵Benjamin Kramer2015-11-281-2/+2
| | | | | | | | memory read below. Found by msan! llvm-svn: 254238
* [SimplifyLibCalls] Use range-based loop. NFC.Davide Italiano2015-11-271-4/+2
| | | | llvm-svn: 254193
* [SimplifyLibCalls] Don't depend on a called function having a name, it might ↵Benjamin Kramer2015-11-261-11/+8
| | | | | | | | be an indirect call. Fixes the crasher in PR25651 and related crashers using the same pattern. llvm-svn: 254145
* [Utils] Put includes in correct order. NFC.Weiming Zhao2015-11-241-1/+1
| | | | | | | | | | | | | | | | | | | Summary: Followed the guidelines in: http://llvm.org/docs/CodingStandards.html#include-style However, I noticed that uppercase named headers come before lowercase ones throughout the codebase. So kept them as is. Patch by Mandeep Singh Grang <mgrang@codeaurora.org> Reviewers: majnemer, davide, jmolloy, atrick Subscribers: sanjoy Differential Revision: http://reviews.llvm.org/D14939 llvm-svn: 254005
* [SimplifyLibCalls] Removed some TODOs which are already implemented. NFC.Weiming Zhao2015-11-211-4/+0
| | | | | | | | | | | | | | Summary: D14302 implements tan(atan(x)) -> x D14045 implements pow(exp(x), y) -> exp(x*y) Patch by Mandeep Singh Grang <mgrang@codeaurora.org> Reviewers: majnemer, davide Differential Revision: http://reviews.llvm.org/D14882 llvm-svn: 253768
* Revert "Change memcpy/memset/memmove to have dest and source alignments."Pete Cooper2015-11-191-11/+10
| | | | | | | | | | This reverts commit r253511. This likely broke the bots in http://lab.llvm.org:8011/builders/clang-ppc64-elf-linux2/builds/20202 http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/3787 llvm-svn: 253543
* [SimplifyLibCalls] New trick: pow(x, 0.5) -> sqrt(x) under -ffast-math.Davide Italiano2015-11-181-2/+9
| | | | | | Differential Revision: http://reviews.llvm.org/D14466 llvm-svn: 253521
* Change memcpy/memset/memmove to have dest and source alignments.Pete Cooper2015-11-181-10/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Note, this was reviewed (and more details are in) http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html These intrinsics currently have an explicit alignment argument which is required to be a constant integer. It represents the alignment of the source and dest, and so must be the minimum of those. This change allows source and dest to each have their own alignments by using the alignment attribute on their arguments. The alignment argument itself is removed. There are a few places in the code for which the code needs to be checked by an expert as to whether using only src/dest alignment is safe. For those places, they currently take the minimum of src/dest alignments which matches the current behaviour. For example, code which used to read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 500, i32 8, i1 false) will now read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %dest, i8* align 8 %src, i32 500, i1 false) For out of tree owners, I was able to strip alignment from calls using sed by replacing: (call.*llvm\.memset.*)i32\ [0-9]*\,\ i1 false\) with: $1i1 false) and similarly for memmove and memcpy. I then added back in alignment to test cases which needed it. A similar commit will be made to clang which actually has many differences in alignment as now IRBuilder can generate different source/dest alignments on calls. In IRBuilder itself, a new argument was added. Instead of calling: CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, /* isVolatile */ false) you now call CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, SrcAlign, /* isVolatile */ false) There is a temporary class (IntegerAlignment) which takes the source alignment and rejects implicit conversion from bool. This is to prevent isVolatile here from passing its default parameter to the source alignment. Note, changes in future can now be made to codegen. I didn't change anything here, but this change should enable better memcpy code sequences. Reviewed by Hal Finkel. llvm-svn: 253511
* [SimplifyLibCalls] Generalize a comment. This doesn't apply only to sqrt.Davide Italiano2015-11-161-2/+2
| | | | llvm-svn: 253224
* [SimplifyLibCalls] Make a function shorter. NFC.Davide Italiano2015-11-121-10/+2
| | | | llvm-svn: 252970
* [SimplifyLibCalls] Don't hardcode the function name.Davide Italiano2015-11-061-1/+2
| | | | llvm-svn: 252342
* [SimplifyLibCalls] Use hasFloatVersion(). NFCI.Davide Italiano2015-11-051-15/+13
| | | | llvm-svn: 252186
* [SimplifyLibCalls] New transformation: tan(atan(x)) -> xDavide Italiano2015-11-041-1/+38
| | | | | | | | | | | | | | | | | | | | This is enabled only under -ffast-math. So, instead of emitting: 4007b0: 50 push %rax 4007b1: e8 8a fd ff ff callq 400540 <atanf@plt> 4007b6: 58 pop %rax 4007b7: e9 94 fd ff ff jmpq 400550 <tanf@plt> 4007bc: 0f 1f 40 00 nopl 0x0(%rax) for: float mytan(float x) { return tanf(atanf(x)); } we emit a single retq. Differential Revision: http://reviews.llvm.org/D14302 llvm-svn: 252098
* [SimplifyLibCalls] Add a new transformation: pow(exp(x), y) -> exp(x*y)Davide Italiano2015-11-031-0/+26
| | | | | | | | | | | | | | | | | | | | | | | | This one is enabled only under -ffast-math (due to rounding/overflows) but allows us to emit shorter code. Before (on FreeBSD x86-64): 4007f0: 50 push %rax 4007f1: f2 0f 11 0c 24 movsd %xmm1,(%rsp) 4007f6: e8 75 fd ff ff callq 400570 <exp2@plt> 4007fb: f2 0f 10 0c 24 movsd (%rsp),%xmm1 400800: 58 pop %rax 400801: e9 7a fd ff ff jmpq 400580 <pow@plt> 400806: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1) 40080d: 00 00 00 After: 4007b0: f2 0f 59 c1 mulsd %xmm1,%xmm0 4007b4: e9 87 fd ff ff jmpq 400540 <exp2@plt> 4007b9: 0f 1f 80 00 00 00 00 nopl 0x0(%rax) Differential Revision: http://reviews.llvm.org/D14045 llvm-svn: 251976
* [SimplifyLibCalls] Remove variables that are not used. NFC.Davide Italiano2015-11-021-7/+2
| | | | llvm-svn: 251852
* [SimplifyLibCalls] Merge two if statements. NFC.Davide Italiano2015-11-021-4/+1
| | | | llvm-svn: 251845
* Simplify a check. NFC.Davide Italiano2015-11-011-2/+2
| | | | llvm-svn: 251757
* [SimplifyLibCalls] Factor out other common code.Davide Italiano2015-10-311-21/+10
| | | | llvm-svn: 251754
* [SimplifyLibCalls] Remove dead code.Davide Italiano2015-10-311-6/+0
| | | | llvm-svn: 251737
* [SimplifyLibCalls] Factor out common unsafe-math checks.Davide Italiano2015-10-291-29/+23
| | | | llvm-svn: 251595
* [SimplifyLibCalls] Use range-based loop. No functional change.Davide Italiano2015-10-271-4/+2
| | | | llvm-svn: 251383
* TransformUtils: Remove implicit ilist iterator conversions, NFCDuncan P. N. Exon Smith2015-10-131-2/+1
| | | | | | | | | | | Continuing the work from last week to remove implicit ilist iterator conversions. First related commit was probably r249767, with some more motivation in r249925. This edition gets LLVMTransformUtils compiling without the implicit conversions. No functional change intended. llvm-svn: 250142
* [SimplifyLibCalls] Fix instruction misplacement in string/memory libcall ↵Bruno Cardoso Lopes2015-10-011-2/+6
| | | | | | | | | | | | | | | | | | optimization When trying to optimize fortified library functions use the right location to insert new instructions in order to preserve correct def-use order. This fixes an issue where a misplaced instruction definition would happen to be *after* one of its use after a RAUW, forming invalid IR. This behavior was introduced by r227250. Differential Revision: http://reviews.llvm.org/D13301 rdar://problem/22802369 llvm-svn: 249092
* Prune utf8 chars in comments.NAKAMURA Takumi2015-09-071-1/+1
| | | | llvm-svn: 246953
* Optimize memcmp(x,y,n)==0 for small n and suitably aligned x/y.Chad Rosier2015-08-281-0/+22
| | | | | | | http://reviews.llvm.org/D6952 PR20673 llvm-svn: 246313
* [SimplifyLibCalls] Fix a typoDavid Majnemer2015-08-261-1/+1
| | | | | | | cbrt(sqrt(x)) calculates the sixth root, not the ninth root. cbrt(cbrt(x)) calculates the ninth root. llvm-svn: 246046
* [SimplifyLibCalls] Drop default template args. No functional change.Benjamin Kramer2015-08-161-4/+2
| | | | llvm-svn: 245189
* transform fmin/fmax calls when possible (PR24314)Sanjay Patel2015-08-161-2/+61
| | | | | | | | | | | | | | | If we can ignore NaNs, fmin/fmax libcalls can become compare and select (this is what we turn std::min / std::max into). This IR should then be optimized in the backend to whatever is best for any given target. Eg, x86 can use minss/maxss instructions. This should solve PR24314: https://llvm.org/bugs/show_bug.cgi?id=24314 Differential Revision: http://reviews.llvm.org/D11866 llvm-svn: 245187
* [SimplifyLibCalls] Correctly set the is_zero_undef flag for llvm.cttzDavide Italiano2015-08-131-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If <src> is non-zero we can safely set the flag to true, and this results in less code generated for, e.g. ffs(x) + 1 on FreeBSD. Thanks to majnemer for suggesting the fix and reviewing. Code generated before the patch was applied: 0: 0f bc c7 bsf %edi,%eax 3: b9 20 00 00 00 mov $0x20,%ecx 8: 0f 45 c8 cmovne %eax,%ecx b: 83 c1 02 add $0x2,%ecx e: b8 01 00 00 00 mov $0x1,%eax 13: 85 ff test %edi,%edi 15: 0f 45 c1 cmovne %ecx,%eax 18: c3 retq Code generated after the patch was applied: 0: 0f bc cf bsf %edi,%ecx 3: 83 c1 02 add $0x2,%ecx 6: 85 ff test %edi,%edi 8: b8 01 00 00 00 mov $0x1,%eax d: 0f 45 c1 cmovne %ecx,%eax 10: c3 retq It seems we can still use cmove and save another 'test' instruction, but that can be tackled separately. Differential Revision: http://reviews.llvm.org/D11989 llvm-svn: 244947
* fix typo; NFCSanjay Patel2015-08-121-1/+1
| | | | llvm-svn: 244805
* Change Function::getIntrinsicID() to return an Intrinsic::ID. NFC.Pete Cooper2015-05-201-1/+1
| | | | | | | | Now that Intrinsic::ID is a typed enum, we can forward declare it and so return it from this method. This updates all users which were either using an unsigned to store it, or had a now unnecessary cast. llvm-svn: 237810
* Simplify IRBuilder::CreateCall* by using ArrayRef+initializer_list/braced ↵David Blaikie2015-05-181-2/+2
| | | | | | init only llvm-svn: 237624
* [opaque pointer type] More GEP IRBuilder API migrations...David Blaikie2015-04-031-2/+2
| | | | llvm-svn: 234058
* Use early returns to reduce indentation.David Blaikie2015-04-031-18/+18
| | | | llvm-svn: 234057
* [SimplifyLibCalls] Ignore nobuiltin/unavailable fortified libcalls.Ahmed Bougacha2015-04-011-3/+13
| | | | | | | | | | | | | | | | | We used to do this before refactorings around r225640. Some clang users checked for _chk libcall availability using: __has_builtin(__builtin___memcpy_chk) When compiling with -fno-builtin, this is always true. When passing -ffreestanding/-mkernel, which both imply -fno-builtin, we end up with fortified libcalls, which isn't acceptable in a freestanding environment which only provides their non-fortified counterparts. Until we change clang and/or teach external users to check for availability differently, disregard the "nobuiltin" attribute and TLI::has. Workaround for PR23093. llvm-svn: 233776
* [opaque pointer type] More IRBuilder::createGEP (non-inbounds) migrations: ↵David Blaikie2015-03-301-9/+9
| | | | | | CodeGenPrepare and SimplifyLibCalls llvm-svn: 233596
* [SimplifyLibCalls] Fix negative shifts being produced by the memchr -> ↵Benjamin Kramer2015-03-211-1/+3
| | | | | | bitfield transform. llvm-svn: 232903
* [SimplifyLibCalls] Turn memchr(const, C, const) into a bitfield check.Benjamin Kramer2015-03-211-4/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | strchr("123!", C) != nullptr is a common pattern to check if C is one of 1, 2, 3 or !. If the largest element of the string is smaller than the target's register size we can easily create a bitfield and just do a simple test for set membership. int foo(char C) { return strchr("123!", C) != nullptr; } now becomes cmpl $64, %edi ## range check sbbb %al, %al movabsq $0xE000200000001, %rcx btq %rdi, %rcx ## bit test sbbb %cl, %cl andb %al, %cl ## and the two conditions andb $1, %cl movzbl %cl, %eax ## returning an int ret (imho the backend should expand this into a series of branches, but that's a different story) The code is currently limited to bit fields that fit in a register, so usually 64 or 32 bits. Sadly, this misses anything using alpha chars or {}. This could be fixed by just emitting a i128 bit field, but that can generate really ugly code so we have to find a better way. To some degree this is also recreating switch lowering logic, but we can't simply emit a switch instruction and thus change the CFG within instcombine. llvm-svn: 232902
* SimplifyLibCalls: Add basic optimization of memchr calls.Benjamin Kramer2015-03-211-0/+40
| | | | | | This is just memchr(x, y, 0) -> nullptr and constant folding. llvm-svn: 232896
OpenPOWER on IntegriCloud