path: root/llvm/lib/Transforms/InstCombine
Commit message | Author | Age | Files | Lines
* [InstCombine][SSE] Added support to VPERMD/VPERMPS to shuffle combine to accept UNDEF elements. | Simon Pilgrim | 2016-05-01 | 1 | -8/+13
  llvm-svn: 268206
* [InstCombine][SSE] Added support to VPERMILVAR to shuffle combine to accept UNDEF elements. | Simon Pilgrim | 2016-05-01 | 1 | -20/+27
  llvm-svn: 268204
* [InstCombine][SSE] Added support to PSHUFB to shuffle combine to accept UNDEF elements. | Simon Pilgrim | 2016-05-01 | 1 | -16/+17
  llvm-svn: 268202
* [InstCombine][AVX2] Combine VPERMD/VPERMPS intrinsics with constant masks to shufflevector. | Simon Pilgrim | 2016-05-01 | 1 | -0/+37
  llvm-svn: 268199
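As a rough illustration of why a constant-mask VPERMD/VPERMPS can be expressed as a plain shuffle, the sketch below emulates the lane selection in standalone C++. It assumes eight 32-bit lanes and that only the low 3 bits of each index are used; `vpermd`, `toShuffleMask`, and `shuffle` are hypothetical names for this example and are not LLVM functions.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Emulates VPERMD lane selection on eight 32-bit lanes.
// Assumption: only the low 3 bits of each index are significant.
std::array<int32_t, 8> vpermd(const std::array<int32_t, 8> &src,
                              const std::array<uint32_t, 8> &idx) {
  std::array<int32_t, 8> out{};
  for (std::size_t i = 0; i < 8; ++i)
    out[i] = src[idx[i] & 7];
  return out;
}

// Hypothetical helper: converts a constant index vector into plain lane
// numbers (0..7), the form a generic shuffle mask would take.
std::array<int, 8> toShuffleMask(const std::array<uint32_t, 8> &idx) {
  std::array<int, 8> mask{};
  for (std::size_t i = 0; i < 8; ++i)
    mask[i] = static_cast<int>(idx[i] & 7);
  return mask;
}

// Generic single-source shuffle driven by a mask of lane numbers.
std::array<int32_t, 8> shuffle(const std::array<int32_t, 8> &src,
                               const std::array<int, 8> &mask) {
  std::array<int32_t, 8> out{};
  for (std::size_t i = 0; i < 8; ++i)
    out[i] = src[static_cast<std::size_t>(mask[i])];
  return out;
}
```

With a constant `idx`, `shuffle(src, toShuffleMask(idx))` produces the same lanes as `vpermd(src, idx)`, which is the property the combine relies on.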
* [InstCombine][AVX] VPERMILVAR to shuffle combine to use general aggregate elements. NFCI. | Simon Pilgrim | 2016-04-30 | 1 | -18/+20
  Make use of Constant::getAggregateElement instead of checking constant types - first step towards adding support for UNDEF mask elements.
  llvm-svn: 268158
* [InstCombine][SSE] PSHUFB to shuffle combine to use general aggregate elements. NFCI. | Simon Pilgrim | 2016-04-29 | 1 | -17/+23
  Make use of Constant::getAggregateElement instead of checking constant types - first step towards adding support for UNDEF mask elements.
  llvm-svn: 268115
* [InstCombine] Determine the result of a select based on a dominating condition. | Chad Rosier | 2016-04-29 | 1 | -0/+18
  Differential Revision: http://reviews.llvm.org/D19550
  llvm-svn: 268104
* [InstCombine] clean up; NFC | Sanjay Patel | 2016-04-29 | 1 | -1/+1
  llvm-svn: 268099
* [InstCombine] add helper function for ICmp with constant canonicalization; NFCI | Sanjay Patel | 2016-04-29 | 1 | -24/+38
  As suggested in http://reviews.llvm.org/D17859 , we should enhance this to support vectors.
  llvm-svn: 268059
* [InstCombine] Propagate operand bundles | David Majnemer | 2016-04-29 | 2 | -3/+9
  We neglected to transfer operand bundles for some transforms. These were found via inspection, I'll try to come up with some test cases.
  llvm-svn: 268010
* [InstCombine] Remove trailing whitespace. NFC. | Ahmed Bougacha | 2016-04-28 | 1 | -1/+1
  r267873.
  llvm-svn: 267887
* [InstCombine][SSE] Add MOVMSK support to SimplifyDemandedUseBits | Simon Pilgrim | 2016-04-28 | 1 | -0/+22
  The MOVMSK instructions copy the sign bits of a vector's elements to the low bits of a scalar register and zero the high bits.
  This patch adds MOVMSK support to SimplifyDemandedUseBits so that it's aware that the upper bits are known to be zero. It also removes the call to MOVMSK if none of the lower bits are actually required and just returns zero.
  Differential Revision: http://reviews.llvm.org/D19614
  llvm-svn: 267873
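To make the known-zero reasoning concrete, here is a small standalone emulation of the four-lane float case. It is only a sketch: `movmskps` and `demandedMovmsk` are hypothetical names, and the second function merely models the "return zero if no low bit is demanded" idea described above, not the actual SimplifyDemandedUseBits code.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Emulates MOVMSKPS on four floats: bit i of the result is the sign bit
// of lane i, and every higher bit is zero.
uint32_t movmskps(const std::array<float, 4> &v) {
  uint32_t result = 0;
  for (unsigned i = 0; i < 4; ++i) {
    uint32_t bits;
    std::memcpy(&bits, &v[i], sizeof(bits)); // reinterpret the float's bits
    result |= (bits >> 31) << i;
  }
  return result;
}

// Hypothetical model of the simplification: when the caller demands none
// of the low four bits, the answer is zero without evaluating MOVMSK.
uint32_t demandedMovmsk(const std::array<float, 4> &v, uint32_t demandedMask) {
  if ((demandedMask & 0xFu) == 0)
    return 0;
  return movmskps(v) & demandedMask;
}
```

For the input `{-1.0f, 2.0f, -0.0f, 3.0f}` the sign bits of lanes 0 and 2 are set, so `movmskps` yields 5, and masking with `~0xF` always gives zero.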
* isSafeToLoadUnconditionally support queries without a context | Artur Pilipenko | 2016-04-27 | 1 | -2/+2
  This is required to use this function from isSafeToSpeculativelyExecute
  Reviewed By: hfinkel
  Differential Revision: http://reviews.llvm.org/D16231
  llvm-svn: 267692
* Optimize store of "bitcast" from vector to aggregate. | Arch D. Robison | 2016-04-25 | 1 | -0/+60
  This patch is what was the "instcombine" portion of D14185, with an additional test added (see julia_pseudovec in test/Transforms/InstCombine/insert-val-extract-elem.ll).
  The patch causes instcombine to replace sequences of extractelement-insertvalue-store that act essentially like a bitcast followed by a store.
  Differential review: http://reviews.llvm.org/D14260
  llvm-svn: 267482
* Cleanup redundant expression in InstCombineAndOrXor. | Etienne Bergeron | 2016-04-25 | 1 | -2/+0
  Summary: The expression is redundant on both sides of operator |.
  Detected by: http://reviews.llvm.org/D19451
  Reviewers: rnk, majnemer
  Subscribers: cfe-commits
  Differential Revision: http://reviews.llvm.org/D19459
  llvm-svn: 267458
* Test commit: modified comment. NFC | Anna Thomas | 2016-04-25 | 1 | -1/+1
  llvm-svn: 267406
* Tweak comments to make it clear that these combines are for SSE scalar instructions. | Simon Pilgrim | 2016-04-24 | 1 | -4/+5
  llvm-svn: 267360
* [InstCombine][SSE] Reduce DIVSS/DIVSD to FDIV if only first element is required | Simon Pilgrim | 2016-04-24 | 1 | -1/+7
  As discussed on D19318, if we only demand the first element of a DIVSS/DIVSD intrinsic, then reduce to a FDIV call. This matches the existing FADD/FSUB/FMUL patterns.
  llvm-svn: 267359
* [InstCombine][SSE] Demanded vector elements for scalar intrinsics (Part 2 of 2) | Simon Pilgrim | 2016-04-24 | 1 | -1/+38
  Split from D17490. This patch improves support for determining the demanded vector elements through SSE scalar intrinsics:
  1 - demanded vector element support for unary and some extra binary scalar intrinsics (RCP/RSQRT/SQRT/FRCZ and ADD/CMP/DIV/ROUND).
  2 - addss/addsd get simplified to a fadd call if we aren't interested in the pass through elements
  3 - if we don't need the lowest element of a scalar operation then just use the first argument (the pass through elements) directly
  We can add support for propagating demanded elements through any equivalent packed SSE intrinsics in a future patch (these wouldn't use the pass through patterns).
  Differential Revision: http://reviews.llvm.org/D19318
  llvm-svn: 267357
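Points 2 and 3 above can be sketched with a plain C++ model of ADDSS on four float lanes. This is an illustration under the assumption that lane 0 holds the scalar result and lanes 1 to 3 pass through from the first argument; `addss`, `addssLane0`, and `addssUpperLanes` are hypothetical names, not LLVM or intrinsic APIs.

```cpp
#include <array>

using V4 = std::array<float, 4>;

// Emulates ADDSS: lane 0 is a[0] + b[0]; lanes 1..3 pass through from a.
V4 addss(const V4 &a, const V4 &b) {
  return V4{{a[0] + b[0], a[1], a[2], a[3]}};
}

// Point 2: if only lane 0 is demanded, a scalar add is sufficient.
float addssLane0(const V4 &a, const V4 &b) { return a[0] + b[0]; }

// Point 3: if lane 0 is not demanded, the first argument can stand in
// for the whole result, because lanes 1..3 are identical to it.
const V4 &addssUpperLanes(const V4 &a, const V4 & /*b unused*/) { return a; }
```

In both reduced forms the second operand's upper lanes never influence the result, which matches the observation in Part 1 that only its lowest element is needed.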
* [InstCombine][SSE] Demanded vector elements for scalar intrinsics (Part 1 of 2) | Simon Pilgrim | 2016-04-24 | 1 | -0/+52
  This patch improves support for determining the demanded vector elements through SSE scalar intrinsics:
  1 - recognise that we only need the lowest element of the second input for binary scalar operations (and all the elements of the first input)
  2 - recognise that the roundss/roundsd intrinsics use the lowest element of the second input and the remaining elements from the first input
  Differential Revision: http://reviews.llvm.org/D17490
  llvm-svn: 267356
* [InstCombine] Avoid updating argument demanded elements in separate passes. | Simon Pilgrim | 2016-04-24 | 1 | -7/+15
  As discussed on D17490, we should attempt to update an intrinsic's arguments demanded elements in one pass if we can.
  llvm-svn: 267355
* [X86][InstCombine] Tidyup VPERMILVAR -> shufflevector conversion to helper function. NFCI. | Simon Pilgrim | 2016-04-24 | 1 | -36/+47
  llvm-svn: 267352
* [X86][InstCombine] Tidyup PSHUFB -> shufflevector conversion to helper function. NFCI. | Simon Pilgrim | 2016-04-24 | 1 | -40/+45
  llvm-svn: 267351
* Revert r267210, it makes clang assert (PR27490). | Nico Weber | 2016-04-22 | 1 | -11/+6
  llvm-svn: 267232
* Re-commit optimization bisect support (r267022) without new pass manager support. | Andrew Kaylor | 2016-04-22 | 1 | -1/+1
  The original commit was reverted because of a buildbot problem with LazyCallGraph::SCC handling (not related to the OptBisect handling).
  Differential Revision: http://reviews.llvm.org/D19172
  llvm-svn: 267231
* [unordered] sink unordered stores at end of blocks | Philip Reames | 2016-04-22 | 1 | -4/+3
  The existing code turned out to be completely correct when audited. Thus, only minor code changes and adding a couple of tests.
  llvm-svn: 267215
* Fold compares for distinct allocations | Sanjoy Das | 2016-04-22 | 1 | -5/+11
  Summary: We can fold compares to false when two distinct allocations within a function are compared for equality.
  Patch by Anna Thomas!
  Reviewers: majnemer, reames, sanjoy
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D19390
  llvm-svn: 267214
* [unordered] Extend load/store type canonicalization to handle unordered operations | Philip Reames | 2016-04-22 | 1 | -6/+11
  Extend the type canonicalization logic to work for unordered atomic loads and stores. Note that while this change itself is fairly simple and low risk, there's a reasonable chance this will expose problems in the backends by suddenly generating IR they wouldn't have seen before. Anything of this nature will be an existing bug in the backend (you could write an atomic float load), but this will definitely change the frequency with which such cases are encountered. If you see problems, feel free to revert this change, but please make sure you collect a test case.
  llvm-svn: 267210
* [InstCombine] Preserve fast math flags when combining PHIs | Silviu Baranga | 2016-04-22 | 1 | -38/+11
  Summary: When optimizing PHIs whose inputs are floating point binary operators, we preserve all IR flags except the fast math flags.
  This change removes the logic which tracked some of the IR flags (no wrap, exact) and replaces it by doing an and on the IR flags of all inputs to the PHI - which will also handle the fast math flags.
  Reviewers: majnemer
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D19370
  llvm-svn: 267139
* Revert "Initial implementation of optimization bisect support." | Vedant Kumar | 2016-04-22 | 1 | -5/+1
  This reverts commit r267022, due to an ASan failure: http://lab.llvm.org:8080/green/job/clang-stage2-cmake-RgSan_check/1549
  llvm-svn: 267115
* NFC: fix copy / paste comment | JF Bastien | 2016-04-21 | 1 | -2/+2
  llvm-svn: 267039
* NFC: fix nonsensical comment | JF Bastien | 2016-04-21 | 1 | -1/+1
  llvm-svn: 267036
* Folding compares with unescaped allocations | Sanjoy Das | 2016-04-21 | 1 | -1/+14
  Summary: If we know that the pointer allocated within a function does not escape, we can fold away comparisons that are done with global pointers.
  Patch by Anna Thomas!
  Reviewers: reames, majnemer, sanjoy
  Subscribers: mgrang, mcrosier, majnemer, llvm-commits
  Differential Revision: http://reviews.llvm.org/D19276
  llvm-svn: 267035
* [instcombine][unordered] Extend load(select) transform to handle unordered loads | Philip Reames | 2016-04-21 | 1 | -4/+3
  llvm-svn: 267023
* Initial implementation of optimization bisect support. | Andrew Kaylor | 2016-04-21 | 1 | -1/+5
  This patch implements an optimization bisect feature, which will allow optimizations to be selectively disabled at compile time in order to track down test failures that are caused by incorrect optimizations.
  The bisection is enabled using a new command line option (-opt-bisect-limit). Individual passes that may be skipped call the OptBisect object (via an LLVMContext) to see if they should be skipped based on the bisect limit. A finer level of control (disabling individual transformations) can be managed through an additional OptBisect method, but this is not yet used.
  The skip checking in this implementation is based on (and replaces) the skipOptnoneFunction check. Where that check was being called, a new call has been inserted in its place which checks the bisect limit and the optnone attribute. A new function call has been added for module and SCC passes that behaves in a similar way.
  Differential Revision: http://reviews.llvm.org/D19172
  llvm-svn: 267022
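The limit check described above can be pictured with a toy counter. This is a sketch only and does not reproduce LLVM's OptBisect class: `BisectCounter` and `runPipeline` are hypothetical names, and the numbering scheme (each skippable pass gets the next number, a negative limit means "no limit") is an assumption made for the example.

```cpp
#include <string>
#include <vector>

// Toy model of a bisect limit. Not LLVM's OptBisect; names are hypothetical.
class BisectCounter {
  int limit_;          // assumed convention: negative means "run everything"
  int lastNumber_ = 0; // number assigned to the most recent skippable pass

public:
  explicit BisectCounter(int limit) : limit_(limit) {}

  // Assigns the next number to a skippable pass and reports whether it
  // falls within the limit and should therefore run.
  bool shouldRun() {
    int n = ++lastNumber_;
    return limit_ < 0 || n <= limit_;
  }
};

// Runs a list of pass names and returns the ones that were not skipped.
std::vector<std::string> runPipeline(const std::vector<std::string> &passes,
                                     int limit) {
  BisectCounter bisect(limit);
  std::vector<std::string> executed;
  for (const std::string &name : passes)
    if (bisect.shouldRun())
      executed.push_back(name);
  return executed;
}
```

Lowering the limit step by step until a failing test starts passing identifies the first pass whose execution triggers the failure, which is the workflow the feature is meant to support.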
* [unordered] unordered loads from null are still unreachable | Philip Reames | 2016-04-21 | 1 | -3/+7
  llvm-svn: 267019
* [instcombine][unordered] Implement *-load forwarding for unordered atomics | Philip Reames | 2016-04-21 | 1 | -4/+4
  This builds on 266999 which made FindAvailableValue do the right thing. Tests included show the newly enabled transforms and those which are disabled either due to conservatism or correctness requirements.
  llvm-svn: 267006
* [ValueTracking, VectorUtils] Refactor getIntrinsicIDForCall | David Majnemer | 2016-04-19 | 1 | -1/+1
  The functionality contained within getIntrinsicIDForCall is two-fold: it checks if a CallInst's callee is a vectorizable intrinsic. If it isn't an intrinsic, it attempts to map the call's target to a suitable intrinsic.
  Move the mapping functionality into getIntrinsicForCallSite and rename getIntrinsicIDForCall to getVectorIntrinsicIDForCall while reimplementing it in terms of getIntrinsicForCallSite.
  llvm-svn: 266801
* [NFC] Header cleanup | Mehdi Amini | 2016-04-18 | 1 | -3/+2
  Removed some unused headers, replaced some headers with forward class declarations.
  Found using simple scripts like this one:
  clear && ack --cpp -l '#include "llvm/ADT/IndexedMap.h"' | xargs grep -L 'IndexedMap[<]' | xargs grep -n --color=auto 'IndexedMap'
  Patch by Eugene Kosov <claprix@yandex.ru>
  Differential Revision: http://reviews.llvm.org/D19219
  From: Mehdi Amini <mehdi.amini@apple.com>
  llvm-svn: 266595
* Fix a typo in rL265762 | Sanjoy Das | 2016-04-17 | 1 | -1/+1
  I accidentally replaced `mayBeOverridden` with `!isInterposable`. Remove the negation and add a test case that would've caught this.
  Many thanks to Håkan Hjort for spotting this!
  llvm-svn: 266551
* [InstCombine] Don't transform compares of calls to functions named fabs{f,l,} | David Majnemer | 2016-04-15 | 1 | -30/+25
  InstCombine wants to optimize compares of calls to fabs with zero. However, we didn't have the necessary legality checking to verify that the function call had the same behavior as fabs.
  llvm-svn: 266452
* [InstCombine] remove constant by inverting compare + logic (PR27105) | Sanjay Patel | 2016-04-14 | 1 | -0/+9
  https://llvm.org/bugs/show_bug.cgi?id=27105
  We can check if all bits outside of a constant mask are set with a single constant.
  As noted in the bug report, although this form should be considered the canonical IR, backends may want to transform this into an 'andn' / 'andc' comparison against zero because that could be a single machine instruction.
  Differential Revision: http://reviews.llvm.org/D18842
  llvm-svn: 266362
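The commit text does not spell out the exact pattern, so the following is one plausible reading rather than a transcription of the patch: testing "all bits outside the mask C are set" as `(x | C) == -1` needs both C and the all-ones constant, while the equivalent `(x & ~C) == ~C` needs only `~C`. The function names below are hypothetical and exist only to show the equivalence on 32-bit values.

```cpp
#include <cstdint>

// Form with two constants: c and all-ones.
bool orIsAllOnes(uint32_t x, uint32_t c) {
  return (x | c) == 0xFFFFFFFFu;
}

// Equivalent form that only needs the single constant ~c.
bool andEqualsInvertedMask(uint32_t x, uint32_t c) {
  return (x & ~c) == ~c;
}
```

Both predicates are true exactly when every bit that is clear in `c` is set in `x`, so they agree for all inputs. A target with an and-not instruction could instead test `(~x & ~c) == 0`, which is the 'andn' / 'andc' form mentioned above.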
* [InstCombine] We folded an fcmp to an i1 instead of a vector of i1 | David Majnemer | 2016-04-13 | 1 | -1/+1
  Remove an ad-hoc transform in InstCombine and replace it with more general machinery (ValueTracking, InstructionSimplify and VectorUtils).
  This fixes PR27332.
  llvm-svn: 266175
* [x86, InstCombine] fix masked load pass-through operand to be a zero vector | Sanjay Patel | 2016-04-12 | 1 | -3/+6
  This bug was introduced with: http://reviews.llvm.org/rL262269
  AVX masked loads are specified to set vector lanes to zero when the high bit of the mask element for that lane is zero:
  "If the mask is 0, the corresponding data element is set to zero in the load form of these instructions, and unmodified in the store form." --Intel manual
  Differential Revision: http://reviews.llvm.org/D19017
  llvm-svn: 266148
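The reason the pass-through operand has to be zero can be seen in a small model of both operations. This sketch assumes four 32-bit lanes; `maskload` emulates the quoted load-form behaviour, and `maskedLoadWithPassThru` is a hypothetical stand-in for a generic masked load that takes an explicit pass-through vector. Neither name is an LLVM or intrinsic API.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using I4 = std::array<int32_t, 4>;

// Emulates the load form of an AVX masked load: a lane is read from memory
// when the high bit of its mask element is set, otherwise it becomes zero.
I4 maskload(const I4 &mem, const I4 &mask) {
  I4 out{};
  for (std::size_t i = 0; i < 4; ++i)
    out[i] = (mask[i] < 0) ? mem[i] : 0; // negative <=> high bit set
  return out;
}

// Hypothetical generic masked load: disabled lanes take the pass-through value.
I4 maskedLoadWithPassThru(const I4 &mem, const std::array<bool, 4> &enabled,
                          const I4 &passThru) {
  I4 out{};
  for (std::size_t i = 0; i < 4; ++i)
    out[i] = enabled[i] ? mem[i] : passThru[i];
  return out;
}
```

The two functions agree only when `passThru` is all zeros; any other pass-through value would leave non-zero data in the disabled lanes, which contradicts the behaviour quoted from the manual.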
* Add the allocsize attribute to LLVM. | George Burgess IV | 2016-04-12 | 1 | -2/+7
  `allocsize` is a function attribute that allows users to request that LLVM treat arbitrary functions as allocation functions.
  This patch makes LLVM accept the `allocsize` attribute, and makes `@llvm.objectsize` recognize said attribute.
  The review for this was split into two patches for ease of reviewing: D18974 and D14933. As promised on the revisions, I'm landing both patches as a single commit.
  Differential Revision: http://reviews.llvm.org/D14933
  llvm-svn: 266032
* add FIXME comment; NFC | Sanjay Patel | 2016-04-11 | 1 | -1/+3
  llvm-svn: 265970
* add an assert for safety; NFC | Sanjay Patel | 2016-04-11 | 1 | -0/+2
  llvm-svn: 265969
* variable names start with a capital letter; NFC | Sanjay Patel | 2016-04-11 | 1 | -9/+9
  llvm-svn: 265968
* [InstCombine] use canEvaluateShiftedShift() to handle the lshr case (NFCI) | Sanjay Patel | 2016-04-11 | 1 | -33/+12
  We need just a couple of logic tweaks to consolidate the shl and lshr cases.
  This is step 5 of refactoring to solve PR26760: https://llvm.org/bugs/show_bug.cgi?id=26760
  llvm-svn: 265965
* [InstCombine] don't try to shift an illegal amount (PR26760) | Sanjay Patel | 2016-04-11 | 1 | -1/+3
  This is the straightforward fix for PR26760: https://llvm.org/bugs/show_bug.cgi?id=26760
  But we still need to make some changes to generalize this helper function and then send the lshr case into here.
  llvm-svn: 265960