path: root/llvm/test/Transforms
Commit message | Author | Age | Files | Lines
* Optimize store of "bitcast" from vector to aggregate. | Arch D. Robison | 2016-04-25 | 1 | -0/+74
| This patch is what was the "instcombine" portion of D14185, with an additional test added (see julia_pseudovec in test/Transforms/InstCombine/insert-val-extract-elem.ll). The patch causes instcombine to replace sequences of extractelement-insertvalue-store that act essentially like a bitcast followed by a store.
|
| Differential review: http://reviews.llvm.org/D14260
|
| llvm-svn: 267482
* Fix typo from r267432. | Chad Rosier | 2016-04-25 | 1 | -2/+2
| llvm-svn: 267436
* [ValueTracking] Add an additional test case for r266767 where one operand is a const. | Chad Rosier | 2016-04-25 | 1 | -0/+24
| llvm-svn: 267432
* [ValueTracking] Improve isImpliedCondition when the dominating cond is false. | Chad Rosier | 2016-04-25 | 2 | -0/+389
| llvm-svn: 267430
* [GlobalOpt] Allow constant globals to be SRA'd | James Molloy | 2016-04-25 | 1 | -0/+21
| The current logic assumes that any constant global will never be SRA'd. I presume this is because normally constant globals can be pushed into their uses and deleted. However, that sometimes can't happen (which is where you really want SRA, so the elements that can be eliminated, are!).
|
| There seems to be no reason why we can't SRA constants too, so let's do it.
|
| llvm-svn: 267393
* [InstCombine][SSE] Reduce DIVSS/DIVSD to FDIV if only first element is required | Simon Pilgrim | 2016-04-24 | 2 | -10/+4
| As discussed on D19318, if we only demand the first element of a DIVSS/DIVSD intrinsic, then reduce to a FDIV call. This matches the existing FADD/FSUB/FMUL patterns.
|
| llvm-svn: 267359
* [InstCombine][SSE] Demanded vector elements for scalar intrinsics (Part 2 of 2) | Simon Pilgrim | 2016-04-24 | 4 | -182/+60
| Split from D17490.
|
| This patch improves support for determining the demanded vector elements through SSE scalar intrinsics:
|
| 1 - demanded vector element support for unary and some extra binary scalar intrinsics (RCP/RSQRT/SQRT/FRCZ and ADD/CMP/DIV/ROUND).
| 2 - addss/addsd get simplified to a fadd call if we aren't interested in the pass through elements
| 3 - if we don't need the lowest element of a scalar operation then just use the first argument (the pass through elements) directly
|
| We can add support for propagating demanded elements through any equivalent packed SSE intrinsics in a future patch (these wouldn't use the pass through patterns).
|
| Differential Revision: http://reviews.llvm.org/D19318
|
| llvm-svn: 267357
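The pass-through behaviour described in points 2 and 3 can be sketched with a small model. This is an illustration only: `addss` below is a hypothetical Python stand-in for the scalar SSE add (lane 0 computed, lanes 1..3 copied from the first operand), not LLVM code.

```python
from typing import List

def addss(a: List[float], b: List[float]) -> List[float]:
    """Model of a scalar SSE add: lane 0 is a[0] + b[0], lanes 1..3 pass through from a."""
    return [a[0] + b[0]] + a[1:]

a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]
r = addss(a, b)

# Point 2: if only lane 0 is demanded, a plain scalar add yields the same value.
lane0_only = a[0] + b[0]

# Point 3: if lane 0 is not demanded, the first argument can stand in for the result.
upper_lanes = r[1:]
```

In this model the upper lanes of `r` are identical to those of `a`, which is why the first argument can replace the whole operation when lane 0 is unused.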
* [InstCombine][SSE] Demanded vector elements for scalar intrinsics (Part 1 of 2) | Simon Pilgrim | 2016-04-24 | 3 | -145/+74
| This patch improves support for determining the demanded vector elements through SSE scalar intrinsics:
|
| 1 - recognise that we only need the lowest element of the second input for binary scalar operations (and all the elements of the first input)
| 2 - recognise that the roundss/roundsd intrinsics use the lowest element of the second input and the remaining elements from the first input
|
| Differential Revision: http://reviews.llvm.org/D17490
|
| llvm-svn: 267356
* DebugInfo: Remove MDString-based type references | Duncan P. N. Exon Smith | 2016-04-23 | 4 | -17/+17
| Eliminate DITypeIdentifierMap and make DITypeRef a thin wrapper around DIType*. It is no longer legal to refer to a DICompositeType by its 'identifier:', and DIBuilder no longer retains all types with an 'identifier:' automatically.
|
| Aside from the bitcode upgrade, this is mainly removing logic to resolve an MDString-based reference to an actual DIType. The commits leading up to this have made the implicit type map in DICompileUnit's 'retainedTypes:' field superfluous.
|
| This does not remove DITypeRef, DIScopeRef, DINodeRef, and DITypeRefArray, or stop using them in DI-related metadata. Although as of this commit they aren't serving a useful purpose, there are patches under review to reuse them for CodeView support.
|
| The tests in LLVM were updated with deref-typerefs.sh, which is attached to the thread "[RFC] Lazy-loading of debug info metadata": http://lists.llvm.org/pipermail/llvm-dev/2016-April/098318.html
|
| llvm-svn: 267296
* Revert r267210, it makes clang assert (PR27490). | Nico Weber | 2016-04-22 | 1 | -39/+0
| llvm-svn: 267232
* Introduce llvm.load.relative intrinsic. | Peter Collingbourne | 2016-04-22 | 3 | -0/+120
| This intrinsic takes two arguments, ``%ptr`` and ``%offset``. It loads a 32-bit value from the address ``%ptr + %offset``, adds ``%ptr`` to that value and returns it. The constant folder specifically recognizes the form of this intrinsic and the constant initializers it may load from; if a loaded constant initializer is known to have the form ``i32 trunc(x - %ptr)``, the intrinsic call is folded to ``x``.
|
| LLVM provides that the calculation of such a constant initializer will not overflow at link time under the medium code model if ``x`` is an ``unnamed_addr`` function. However, it does not provide this guarantee for a constant initializer folded into a function body. This intrinsic can be used to avoid the possibility of overflows when loading from such a constant.
|
| Differential Revision: http://reviews.llvm.org/D18367
|
| llvm-svn: 267223
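The load-then-add behaviour described above can be modelled in a few lines. This is a sketch under stated assumptions, not the intrinsic's implementation: memory is a flat byte array, the 32-bit value is read as little-endian and signed, and `load_relative`, `table`, and `target` are names made up for the illustration.

```python
import struct

def load_relative(memory: bytes, ptr: int, offset: int) -> int:
    """Load a 32-bit value at ptr + offset (assumed signed, little-endian) and add ptr to it."""
    (rel,) = struct.unpack_from("<i", memory, ptr + offset)
    return ptr + rel

# Hypothetical layout: a table at address 16 whose first entry refers to address 40.
memory = bytearray(64)
table = 16
target = 40

# The stored entry has the form trunc(x - ptr), matching the commit message.
struct.pack_into("<i", memory, table, target - table)

result = load_relative(bytes(memory), table, 0)
```

Because the entry stores `target - table`, adding `table` back recovers `target`, which is the same identity the constant folder relies on when it folds the call to ``x``.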
* [unordered] sink unordered stores at end of blocks | Philip Reames | 2016-04-22 | 1 | -0/+34
| The existing code turned out to be completely correct when audited. Thus, only minor code changes and adding a couple of tests.
|
| llvm-svn: 267215
* Fold compares for distinct allocations | Sanjoy Das | 2016-04-22 | 1 | -0/+50
| Summary:
| We can fold compares to false when two distinct allocations within a function are compared for equality.
|
| Patch by Anna Thomas!
|
| Reviewers: majnemer, reames, sanjoy
|
| Subscribers: llvm-commits
|
| Differential Revision: http://reviews.llvm.org/D19390
|
| llvm-svn: 267214
* [unordered] Extend load/store type canonicalization to handle unordered operations | Philip Reames | 2016-04-22 | 1 | -0/+39
| Extend the type canonicalization logic to work for unordered atomic loads and stores. Note that while this change itself is fairly simple and low risk, there's a reasonable chance this will expose problems in the backends by suddenly generating IR they wouldn't have seen before. Anything of this nature will be an existing bug in the backend (you could write an atomic float load), but this will definitely change the frequency with which such cases are encountered. If you see problems, feel free to revert this change, but please make sure you collect a test case.
|
| llvm-svn: 267210
* PM: Port SinkingPass to the new pass manager | Justin Bogner | 2016-04-22 | 2 | -1/+1
| llvm-svn: 267199
* [DeadStoreElimination] Shorten beginning of memset overwritten by later stores | Jun Bum Lim | 2016-04-22 | 1 | -0/+90
| Summary: This change will shorten memset if the beginning of memset is overwritten by later stores.
|
| Reviewers: hfinkel, eeckstein, dberlin, mcrosier
|
| Subscribers: mgrang, mcrosier, llvm-commits
|
| Differential Revision: http://reviews.llvm.org/D18906
|
| llvm-svn: 267197
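The interval arithmetic behind shortening a memset from the front can be sketched as follows. This is a simplified model and not the pass's code: `shorten_memset_begin` is a hypothetical helper, byte ranges are half-open, and any alignment or size constraints the real transform may apply are ignored.

```python
from typing import Tuple

def shorten_memset_begin(ms_start: int, ms_len: int,
                         st_start: int, st_len: int) -> Tuple[int, int]:
    """Return the (start, length) of the memset after dropping the leading
    bytes that a later store to [st_start, st_start + st_len) overwrites."""
    ms_end = ms_start + ms_len
    st_end = st_start + st_len
    # The store must cover the first byte of the memset for the front to be dead.
    if st_start <= ms_start < st_end:
        new_start = min(st_end, ms_end)
        return new_start, ms_end - new_start
    return ms_start, ms_len

front_overwritten = shorten_memset_begin(0, 32, 0, 8)    # store covers bytes 0..7
middle_overwritten = shorten_memset_begin(0, 32, 16, 8)  # store does not touch the front
fully_overwritten = shorten_memset_begin(4, 8, 0, 100)   # store covers the whole memset
```

A store into the middle of the range leaves the memset unchanged in this model, since only a prefix (or, in the pre-existing transform, a suffix) can be trimmed without splitting the call.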
* PM: Port DCE to the new pass manager | Justin Bogner | 2016-04-22 | 1 | -0/+11
| Also add a very basic test, since apparently there aren't any tests for DCE whatsoever to add the new pass version to.
|
| llvm-svn: 267196
* [LoopVersioningLICM] Add test coverage for llvm.loop.licm_versioning.disable | Adam Nemet | 2016-04-22 | 1 | -0/+104
| In the next change, I am generalizing the function findStringMetadataForLoop and I want to make sure I don't break this. Looks like there was no coverage for this so far.
|
| llvm-svn: 267182
* [SimplifyCFG] Add final missing implications to isImpliedTrueByMatchingCmp. | Chad Rosier | 2016-04-22 | 1 | -8/+8
| Summary: eq implies that [u|s]ge and [u|s]le are true.
|
| Remove redundant logic by implementing isImpliedFalseByMatchingCmp(Pred1, Pred2) as isImpliedTrueByMatchingCmp(Pred1, getInversePredicate(Pred2)).
|
| llvm-svn: 267177
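The implications named in this commit and the one before it can be checked exhaustively on a small bit width. The sketch below is an independent brute-force model, not the LLVM implementation: the names `implied_true`, `implied_false`, and the `INVERSE` table are assumptions made for the illustration, and both compares are taken to have the same operands in the same order.

```python
from itertools import product

BITS = 3  # small width so that every operand pair can be enumerated
VALUES = range(1 << BITS)

def signed(x: int) -> int:
    """Reinterpret an unsigned BITS-wide value as two's complement."""
    return x - (1 << BITS) if x >= (1 << (BITS - 1)) else x

PREDICATES = {
    "eq":  lambda a, b: a == b,
    "ne":  lambda a, b: a != b,
    "ugt": lambda a, b: a > b,
    "uge": lambda a, b: a >= b,
    "ult": lambda a, b: a < b,
    "ule": lambda a, b: a <= b,
    "sgt": lambda a, b: signed(a) > signed(b),
    "sge": lambda a, b: signed(a) >= signed(b),
    "slt": lambda a, b: signed(a) < signed(b),
    "sle": lambda a, b: signed(a) <= signed(b),
}

INVERSE = {
    "eq": "ne", "ne": "eq",
    "ugt": "ule", "ule": "ugt", "uge": "ult", "ult": "uge",
    "sgt": "sle", "sle": "sgt", "sge": "slt", "slt": "sge",
}

def implied_true(p1: str, p2: str) -> bool:
    """True if 'a p1 b' holding forces 'a p2 b' to hold for every operand pair."""
    return all(PREDICATES[p2](a, b)
               for a, b in product(VALUES, repeat=2)
               if PREDICATES[p1](a, b))

def implied_false(p1: str, p2: str) -> bool:
    """Same shape as the commit: implied-false is implied-true of the inverse predicate."""
    return implied_true(p1, INVERSE[p2])
```

For example, `implied_true("eq", "sle")` and `implied_false("ult", "ugt")` both hold in this model, while `implied_true("ugt", "sgt")` does not, because unsigned and signed orderings disagree once the sign bit is set.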
* [SimplifyCFG] Add missing implications to isImpliedTrueByMatchingCmp. | Chad Rosier | 2016-04-22 | 1 | -0/+1005
| Summary: [u|s]gt and [u|s]lt imply [u|s]ge and [u|s]le are true, respectively.
|
| I've simplified the existing tests and added additional tests to cover the new cases mentioned above. I've also added tests for all the cases where the first compare doesn't imply anything about the second compare.
|
| llvm-svn: 267171
* [SimplifyCFG] Simplify code review by temporarily removing this test file. | Chad Rosier | 2016-04-22 | 1 | -478/+0
| A followup commit will replace these tests with simplified and more inclusive tests. The diff would be unreadable if this were done in a single commit.
|
| llvm-svn: 267170
* [EarlyCSE] Don't add the overflow flags to the hash | David Majnemer | 2016-04-22 | 1 | -3/+2
| We take the intersection of overflow flags while CSE'ing. This permits us to consider two instructions with different overflow behavior to be replaceable.
|
| llvm-svn: 267153
* [InstCombine] Preserve fast math flags when combining PHIs | Silviu Baranga | 2016-04-22 | 1 | -0/+89
| Summary:
| When optimizing PHIs whose inputs are floating point binary operators, we preserve all IR flags except the fast math flags.
|
| This change removes the logic which tracked some of the IR flags (no wrap, exact) and replaces it by doing an and on the IR flags of all inputs to the PHI - which will also handle the fast math flags.
|
| Reviewers: majnemer
|
| Subscribers: llvm-commits
|
| Differential Revision: http://reviews.llvm.org/D19370
|
| llvm-svn: 267139
* [GVN] Respect fast-math-flags on fcmps | David Majnemer | 2016-04-22 | 1 | -0/+18
| We assumed that flags were only present on binary operators. This is not true, they may also be present on calls and fcmps.
|
| llvm-svn: 267113
* [EarlyCSE] Take the intersection of flags on instructions | David Majnemer | 2016-04-22 | 1 | -0/+18
| EarlyCSE had inconsistent behavior with regards to flag'd instructions:
| - In some cases, it would pessimize if the available instruction had different flags by not performing CSE.
| - In other cases, it would miscompile if it replaced an instruction which had no flags with an instruction which has flags.
|
| Fix this by being more consistent with our flag handling by utilizing andIRFlags.
|
| llvm-svn: 267111
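Why replacing an unflagged instruction with a flagged one is unsafe, and why intersecting the flags repairs it, can be illustrated with a toy model. Everything here is an assumption for the sketch: `add_i8` models an 8-bit signed add where `None` stands for a poison result, and `and_flags` is a stand-in for the idea behind andIRFlags rather than its actual code.

```python
from typing import FrozenSet, Optional

def add_i8(a: int, b: int, flags: FrozenSet[str]) -> Optional[int]:
    """8-bit signed add. Returns None (standing in for poison) when 'nsw'
    is set and the mathematically exact sum does not fit in 8 bits."""
    exact = a + b
    if "nsw" in flags and not -128 <= exact <= 127:
        return None
    return (exact + 128) % 256 - 128  # wrap to two's complement

def and_flags(x: FrozenSet[str], y: FrozenSet[str]) -> FrozenSet[str]:
    """Keep only the flags that both instructions carry."""
    return x & y

available = frozenset({"nsw"})  # the earlier instruction that CSE would reuse
redundant = frozenset()         # the later instruction, which has no flags

# Reusing the flagged instruction as-is changes the wrapped result into poison.
unsafe = add_i8(127, 1, available)

# Intersecting the flags first keeps the reused instruction valid for both uses.
merged = and_flags(available, redundant)
safe = add_i8(127, 1, merged)
```

With the flags intersected, the surviving instruction makes no promise that either original did not make, so CSE can proceed in both of the cases listed above.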
* Folding compares with unescaped allocations | Sanjoy Das | 2016-04-21 | 1 | -0/+42
| Summary:
| If we know that the pointer allocated within a function does not escape, we can fold away comparisons that are done with global pointers.
|
| Patch by Anna Thomas!
|
| Reviewers: reames, majnemer, sanjoy
|
| Subscribers: mgrang, mcrosier, majnemer, llvm-commits
|
| Differential Revision: http://reviews.llvm.org/D19276
|
| llvm-svn: 267035
* [instcombine][unordered] Extend load(select) transform to handle unordered loads | Philip Reames | 2016-04-21 | 1 | -0/+28
| llvm-svn: 267023
* [unordered] unordered loads from null are still unreachable | Philip Reames | 2016-04-21 | 1 | -0/+51
| llvm-svn: 267019
* [instcombine][unordered] Implement *-load forwarding for unordered atomics | Philip Reames | 2016-04-21 | 1 | -2/+35
| This builds on 266999 which made FindAvailableValue do the right thing. Tests included show the newly enabled transforms and those which are disabled either due to conservatism or correctness requirements.
|
| llvm-svn: 267006
* [unordered] Add tests and conservative handling in support of future changes [NFCI] | Philip Reames | 2016-04-21 | 1 | -1/+47
| This change adds a couple of test cases to make sure FindAvailableLoadedValue does the right thing. At the moment, the code added is dead, but separating it makes follow on changes far more obvious.
|
| llvm-svn: 266999
* [SimplifyCFG] Fold `llvm.guard(false)` to unreachable | Sanjoy Das | 2016-04-21 | 1 | -0/+86
| Summary:
| `llvm.guard(false)` always bails out of the current compilation unit, so we can prune any control flow following it.
|
| Reviewers: hfinkel, pcc, reames
|
| Subscribers: majnemer, reames, mcrosier, llvm-commits
|
| Differential Revision: http://reviews.llvm.org/D19245
|
| llvm-svn: 266955
* ThinLTO/ModuleLinker: add a flag to not always pull-in linkonce when performing importing | Mehdi Amini | 2016-04-21 | 2 | -3/+57
| Summary:
| The function importer already decided what symbols need to be pulled in. Also these magically added ones will not be in the export list for the source module, which can confuse the internalizer for instance.
|
| Reviewers: tejohnson, rafael
|
| Subscribers: joker.eph, llvm-commits
|
| Differential Revision: http://reviews.llvm.org/D19096
|
| From: Mehdi Amini <mehdi.amini@apple.com>
|
| llvm-svn: 266948
* Add optimization for 'icmp slt (or A, B), A' and some related idioms based on knowledge of the sign bit for A and B. | Nick Lewycky | 2016-04-21 | 1 | -0/+137
| No matter what value you OR in to A, the result of (or A, B) is going to be UGE A. When A and B are positive, it's SGE too. If A is negative, OR'ing a value into it can't make it positive, but can increase its value closer to -1, therefore (or A, B) is SGE A. Working through all possible combinations produces this truth table:
|
| ```
|            A is
|          +   -   +/-
|       +  F   F    F
| B is  -  T   F    ?
|      +/- ?   F    ?
| ```
|
| The related optimizations are flipping the 'slt' for 'sge' which always NOTs the result (if the result is known), and swapping the LHS and RHS while swapping the comparison predicate.
|
| There are more idioms left to implement (aren't there always!) but I've stopped here because any more would risk becoming unreasonable for reviewers.
|
| llvm-svn: 266939
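The known entries of the truth table can be confirmed by enumerating every 8-bit operand pair. This is a brute-force check written for this log, not the InstCombine code; `outcomes` and `signed` are hypothetical helpers, and "+" is taken to mean "sign bit clear", so zero counts as positive.

```python
BITS = 8
SIGN = 1 << (BITS - 1)

def signed(x: int) -> int:
    """Reinterpret an unsigned BITS-wide value as two's complement."""
    return x - (1 << BITS) if x & SIGN else x

def outcomes(a_negative: bool, b_negative: bool) -> set:
    """All results of 'icmp slt (or A, B), A' for operands with the given sign bits."""
    seen = set()
    for a in range(1 << BITS):
        if bool(a & SIGN) != a_negative:
            continue
        for b in range(1 << BITS):
            if bool(b & SIGN) != b_negative:
                continue
            seen.add(signed(a | b) < signed(a))
    return seen

# The unsigned claim from the first sentence: (or A, B) is always UGE A.
always_uge = all((a | b) >= a for a in range(1 << BITS) for b in range(1 << BITS))
```

Each sign combination yields a single outcome, matching the `F`, `T`, `F`, `F` entries in the table; the `?` entries are exactly the rows or columns where a sign is unknown and the two possible signs give different answers.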
* [test/PGOProfile] Make tests independent of the raw profile version (NFC) | Vedant Kumar | 2016-04-20 | 8 | -9/+9
| Differential Revision: http://reviews.llvm.org/D19290
|
| llvm-svn: 266928
* [ThinLTO] Prevent importing of "llvm.used" values | Teresa Johnson | 2016-04-20 | 2 | -0/+30
| Summary:
| This patch prevents importing from (and therefore exporting from) any module with a "llvm.used" local value. Local values need to be promoted and renamed when importing, and their presence on the llvm.used variable indicates that there are opaque uses that won't see the rename. One such example is a use in inline assembly.
|
| See also the discussion at: http://lists.llvm.org/pipermail/llvm-dev/2016-April/098047.html
|
| As part of this, move collectUsedGlobalVariables out of Transforms/Utils and into IR/Module so that it can be used more widely. There are several other places in LLVM that used copies of this code that can be cleaned up as a follow on NFC patch.
|
| Reviewers: joker.eph
|
| Subscribers: pcc, llvm-commits, joker.eph
|
| Differential Revision: http://reviews.llvm.org/D18986
|
| llvm-svn: 266877
* [LLVM] Remove unwanted --check-prefix=CHECK from unit tests. NFC. | Mandeep Singh Grang | 2016-04-19 | 2 | -2/+2
| Summary: Removed unwanted --check-prefix=CHECK from numerous unit tests.
|
| Reviewers: t.p.northover, dblaikie, uweigand, MatzeB, tstellarAMD, mcrosier
|
| Subscribers: mcrosier, dsanders
|
| Differential Revision: http://reviews.llvm.org/D19279
|
| llvm-svn: 266834
* [AArch64] [ARM] Make a target-independent llvm.thread.pointer intrinsic. | Marcin Koscielnicki | 2016-04-19 | 2 | -4/+4
| Both AArch64 and ARM support llvm.<arch>.thread.pointer intrinsics that just return the thread pointer. I have a pending patch that does the same for SystemZ (D19054), and there are many more targets that could benefit from one.
|
| This patch merges the ARM and AArch64 intrinsics into a single target independent one that will also be used by subsequent targets.
|
| Differential Revision: http://reviews.llvm.org/D19098
|
| llvm-svn: 266818
* [ValueTracking] Improve isImpliedCondition for conditions with matching operands. | Chad Rosier | 2016-04-19 | 2 | -0/+507
| This patch improves SimplifyCFG to catch cases like:
|
|   if (a < b) {
|     if (a > b) <- known to be false
|       unreachable;
|   }
|
| Phabricator Revision: http://reviews.llvm.org/D18905
|
| llvm-svn: 266767
* [InstCombine][X86] Added extra tests introduced for D17490 | Simon Pilgrim | 2016-04-19 | 4 | -0/+578
| llvm-svn: 266732
* [InstCombine][X86] Regenerate SSE combine tests as part of setup for D17490 | Simon Pilgrim | 2016-04-19 | 6 | -468/+581
| Regenerated with utils/update_test_checks.py
|
| llvm-svn: 266731
* ARM: use a pseudo-instruction for cmpxchg at -O0. | Tim Northover | 2016-04-18 | 3 | -3/+3
| The fast register-allocator cannot cope with inter-block dependencies without spilling. This is fine for ldrex/strex loops coming from atomicrmw instructions where any value produced within a block is dead by the end, but not for cmpxchg. So we lower a cmpxchg at -O0 via a pseudo-inst that gets expanded after regalloc.
|
| Fortunately this is at -O0 so we don't have to care about performance. This simplifies the various axes of expansion considerably: we assume a strong seq_cst operation and ensure ordering via the always-present DMB instructions rather than v8 acquire/release instructions.
|
| Should fix the 32-bit part of PR25526.
|
| llvm-svn: 266679
* [ValueTracking] Correct lit test comments. NFC. | Chad Rosier | 2016-04-18 | 1 | -2/+2
| llvm-svn: 266657
* Revert "Replace the use of MaxFunctionCount module flag" | Eric Liu | 2016-04-18 | 2 | -38/+16
| This reverts commit r266477.
|
| This commit introduces a cyclic dependency. It has "Analysis" depend on "ProfileData", while "ProfileData" depends on "Object", which depends on "BitCode", which depends on "Analysis".
|
| llvm-svn: 266619
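The dependency chain quoted in the revert message can be checked mechanically. The sketch below is only an illustration: the `DEPS` table contains just the four edges named in the message (the real libraries have many more dependencies), and `find_cycle` is a hypothetical depth-first helper.

```python
from typing import Dict, List, Optional

def find_cycle(deps: Dict[str, List[str]], start: str) -> Optional[List[str]]:
    """Depth-first walk from start; returns the first dependency cycle found, if any."""
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        if node in path:
            # Found a node already on the current path: report the loop.
            return path[path.index(node):] + [node]
        path.append(node)
        for nxt in deps.get(node, []):
            cycle = visit(nxt)
            if cycle:
                return cycle
        path.pop()
        return None

    return visit(start)

# Only the edges stated in the commit message.
DEPS = {
    "Analysis": ["ProfileData"],
    "ProfileData": ["Object"],
    "Object": ["BitCode"],
    "BitCode": ["Analysis"],
}

cycle = find_cycle(DEPS, "Analysis")
without_new_edge = find_cycle({k: v for k, v in DEPS.items() if k != "Analysis"}, "Analysis")
```

Dropping the `Analysis` to `ProfileData` edge, which is what the revert does, leaves no cycle in this reduced graph.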
* [ARM] AArch32 v8 NEON is still not IEEE-754 compliant | Renato Golin | 2016-04-18 | 1 | -14/+8
| llvm-svn: 266603
* Fix a typo in rL265762 | Sanjoy Das | 2016-04-17 | 1 | -0/+12
| I accidentally replaced `mayBeOverridden` with `!isInterposable`. Remove the negation and add a test case that would've caught this.
|
| Many thanks to Håkan Hjort for spotting this!
|
| llvm-svn: 266551
* ThinLTO: Make aliases explicit in the summary | Mehdi Amini | 2016-04-16 | 1 | -1/+1
| To be able to work accurately on the reference graph when making decisions about internalizing, promoting, renaming, etc., we need to have the alias information explicit.
|
| Differential Revision: http://reviews.llvm.org/D18836
|
| From: Mehdi Amini <mehdi.amini@apple.com>
|
| llvm-svn: 266517
* [cfi] Support explicit sections for functions in cfi-icall. | Evgeniy Stepanov | 2016-04-15 | 1 | -0/+26
| Allow explicit section for indirectly called functions in cfi-icall. Jumptables for functions in the same type class must be contiguous, so they always go to the default text section.
|
| Fixes PR25079.
|
| llvm-svn: 266486
* Convert this sample-based-profiling testcase to use a NoDebug CU. | Adrian Prantl | 2016-04-15 | 1 | -4/+1
| llvm-svn: 266481
* Replace the use of MaxFunctionCount module flag | Easwaran Raman | 2016-04-15 | 2 | -16/+38
| Adds an interface to get ProfileSummary for a module and makes InlineCost use ProfileSummary to get max function count.
|
| Differential Revision: http://reviews.llvm.org/D18622
|
| llvm-svn: 266477
* ARM: don't try to hoist constant RHS out of a division. | Tim Northover | 2016-04-15 | 1 | -0/+45
| Divisions by a constant can be converted into multiplies which are usually cheaper, but this isn't possible if the constant gets separated (particularly in loops). Fix this by telling ConstantHoisting that the immediate in a DIV is cheap.
|
| I considered making the check generic, but neither AArch64 (strangely) nor x86 showed any benefit on the tests I had.
|
| llvm-svn: 266464
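The conversion of a constant division into a multiply can be illustrated with the well-known reciprocal trick for unsigned 32-bit division by 10. This is a general compiler technique shown as a sketch, not the sequence the ARM backend necessarily emits; `udiv10`, `MAGIC`, and `SHIFT` are names chosen for the example.

```python
MAGIC = 0xCCCCCCCD  # ceil(2**35 / 10)
SHIFT = 35

def udiv10(x: int) -> int:
    """Divide an unsigned 32-bit value by 10 using a multiply and a shift."""
    assert 0 <= x < 2**32
    return (x * MAGIC) >> SHIFT

samples = [0, 1, 9, 10, 11, 99, 100, 12345, 2**31, 2**32 - 1]
quotients = [udiv10(x) for x in samples]
```

The multiplier is derived from the divisor, so the rewrite is only available while the divisor is still visible as an immediate on the division. Once the constant has been hoisted into a register outside a loop, the division looks like a division by an unknown value, which is the situation the commit avoids.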