path: root/llvm/lib/Transforms
Commit message | Author | Age | Files | Lines
* transform fmin/fmax calls when possible (PR24314) | Sanjay Patel | 2015-08-16 | 1 | -2/+61
  If we can ignore NaNs, fmin/fmax libcalls can become compare and select
  (this is what we turn std::min / std::max into). This IR should then be
  optimized in the backend to whatever is best for any given target, e.g.,
  x86 can use minss/maxss instructions.

  This should solve PR24314:
  https://llvm.org/bugs/show_bug.cgi?id=24314

  Differential Revision: http://reviews.llvm.org/D11866

  llvm-svn: 245187
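  A minimal before/after sketch of the rewrite, assuming NaNs can be
  ignored (value names are illustrative, not taken from the patch):

    ; before: a libcall that must honor NaN semantics
    %r = call float @fminf(float %a, float %b)

    ; after: compare and select, the same form std::min lowers to;
    ; valid only under the no-NaNs assumption
    %cmp = fcmp olt float %a, %b
    %r2  = select i1 %cmp, float %a, float %b

    declare float @fminf(float, float)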
* [LSR][NFC] Don't duplicate entity name at the beginning of the comment. | Sanjoy Das | 2015-08-16 | 1 | -236/+208
  llvm-svn: 245183
* [LSR][NFC] Use camelCase for method names in Formula and RegUseTracker. | Sanjoy Das | 2015-08-16 | 1 | -34/+34
  llvm-svn: 245182
* Revert "Add support for cross block dse. This patch enables dead stroe ↵David Majnemer2015-08-161-224/+51
| | | | | | | | elimination across basicblocks." This reverts commit r245025, it caused PR24469. llvm-svn: 245172
* [InstCombine] Replace an and+icmp with a trunc+icmp | David Majnemer | 2015-08-16 | 1 | -0/+17
  Bitwise arithmetic can obscure a simple sign-test. Replacing the mask
  with a truncate is preferable if the type is legal, because it permits
  us to rephrase the comparison more explicitly.

  llvm-svn: 245171
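  An illustrative instance of the pattern (the forms the commit actually
  matches may be broader):

    ; before: bitwise arithmetic obscures a sign-test on the low byte
    %masked = and i32 %x, 128
    %isneg  = icmp ne i32 %masked, 0

    ; after: the truncate makes the sign-test explicit
    %lo     = trunc i32 %x to i8
    %isneg2 = icmp slt i8 %lo, 0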
* MergeFunc: Quick fix for r245140, Ignore second, aka Function*, in sorting. | NAKAMURA Takumi | 2015-08-16 | 1 | -1/+6
  Don't assume second would be ordered in the module.

  llvm-svn: 245168
* Try to appease VS 2015 warnings from http://reviews.llvm.org/D11890 | Yaron Keren | 2015-08-15 | 1 | -21/+19
  ByteSize and BitSize should not be size_t but unsigned, considering

  1) They are at most 2^16 and 2^19, respectively.
  2) BitSize is an argument to Type::getIntNTy which takes unsigned.

  Also, use the correct utostr instead of itostr and cache the string
  result.

  Thanks to James Touton for reporting this!

  llvm-svn: 245167
* [IR] Give catchret an optional 'return value' operand | David Majnemer | 2015-08-15 | 3 | -9/+18
  Some personality routines require funclet exit points to be clearly
  marked; this is done by producing a token at the funclet pad and
  consuming it at the corresponding ret instruction. CleanupReturnInst
  already had a spot for this operand but CatchReturnInst did not.
  Other personality routines don't need to use this, which is why it
  has been made optional.

  llvm-svn: 245149
* Accelerate MergeFunctions with hashing | JF Bastien | 2015-08-15 | 1 | -4/+99
  This patch makes the Merge Functions pass faster by calculating and
  comparing a hash value which captures the essential structure of
  a function before performing a full function comparison.

  The hash is calculated by hashing the function signature, then walking
  the basic blocks of the function in the same order as the main
  comparison function. The opcode of each instruction is hashed in
  sequence, which means that different functions according to the
  existing total order cannot have the same hash, as the comparison
  requires the opcodes of the two functions to be in the same order.

  The hash function is a static member of the FunctionComparator class
  because it is tightly coupled to the exact comparison function used.
  For example, functions which are equivalent modulo a single variant
  callsite might be merged by a more aggressive MergeFunctions, and the
  hash function would need to be insensitive to these differences in
  order to exploit this.

  The hashing function uses a utility class which accumulates the values
  into an internal state using a standard bit-mixing function. Note that
  this is a different interface than a regular hashing routine, because
  the values to be hashed are scattered amongst the properties of
  a llvm::Function, not linear in memory. This scheme is fast because
  only one word of state needs to be kept, and the mixing function is
  a few instructions.

  The main runOnModule function first computes the hash of each function,
  and only further processes functions which do not have a unique
  function hash. The hash is also used to order the sorted function set.
  If the hashes differ, their values are used to order the functions;
  otherwise the full comparison is done.

  Both of these are helpful in speeding up MergeFunctions. Together they
  result in speedups of 9% for mysqld (a mostly C application with
  little redundancy), 46% for libxul in Firefox, and 117% for Chromium.
  (These are all LTO builds.) In all three cases, the new speed of
  MergeFunctions is about half that of the module verifier, making it
  relatively inexpensive even for large LTO builds with hundreds of
  thousands of functions. The same functions are merged, so this change
  is free performance.

  Author: jrkoenig
  Reviewers: nlewycky, dschuff, jfb
  Subscribers: llvm-commits, aemerson
  Differential revision: http://reviews.llvm.org/D11923

  llvm-svn: 245140
* LoopStrengthReduce: Try to pass address space to isLegalAddressingMode | Matt Arsenault | 2015-08-15 | 1 | -63/+94
  This seems to only work some of the time. In some situations, it seems
  to use a nonsensical type and isn't actually aware of the memory being
  accessed, e.g. if the branch condition is an icmp of a pointer, it
  checks the addressing mode of i1.

  llvm-svn: 245137
* Fix a crash where a utility function wasn't aware of fcmp vectors and created a value with the wrong type. Fixes PR24458! | Nick Lewycky | 2015-08-14 | 1 | -1/+2
  llvm-svn: 245119
* [msan] Fix handling of musttail calls. | Evgeniy Stepanov | 2015-08-14 | 1 | -0/+20
  MSan instrumentation for return values of musttail calls is not allowed
  by the IR constraints, and is not needed either.

  llvm-svn: 245106
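  For context, a sketch of the IR constraint in question: a musttail call
  must be immediately followed by a ret of its value, so there is no legal
  point at which to insert shadow-propagation code (function names are
  hypothetical):

    define i32 @caller(i32 %x) {
      ; only an optional bitcast may appear between this call and the
      ; ret, so no instrumentation can be inserted here
      %r = musttail call i32 @callee(i32 %x)
      ret i32 %r
    }

    declare i32 @callee(i32)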
* [sancov] Fix an unused variable warning introduced in r245067 | Justin Bogner | 2015-08-14 | 1 | -1/+1
  llvm-svn: 245072
* [sancov] Leave llvm.localescape in the entry block | Reid Kleckner | 2015-08-14 | 2 | -9/+40
  Summary: Similar to the change we applied to ASan. The same test case
  works.

  Reviewers: samsonov

  Subscribers: llvm-commits

  Differential Revision: http://reviews.llvm.org/D11961

  llvm-svn: 245067
* Separate out BDCE's analysis into a separate DemandedBits analysis. | James Molloy | 2015-08-14 | 1 | -328/+18
  This allows other areas of the compiler to use BDCE's bit-tracking.
  NFCI.

  llvm-svn: 245039
* [LVer] Remove unused Pass parameter from versionLoop, NFC | Adam Nemet | 2015-08-14 | 2 | -2/+2
  llvm-svn: 245032
* [IR] Add token types | David Majnemer | 2015-08-14 | 3 | -7/+7
  This introduces the basic functionality to support "token types".

  The motivation stems from the need to perform operations on a Value
  whose provenance cannot be obscured.

  There are several applications for such a type but my immediate
  motivation stems from WinEH. Our personality routine enforces
  a single-entry - single-exit regime for cleanups. After several rounds
  of optimizations, we may be left with a terminator whose
  "cleanup-entry block" is not entirely clear because control flow has
  merged two cleanups together. We have experimented with using labels as
  operands inside of instructions which are not terminators to indicate
  where we came from but found that LLVM does not expect such exotic uses
  of BasicBlocks.

  Instead, we can use this new type to clearly associate the "entry
  point" and "exit point" of our cleanup. This is done by having the
  cleanuppad yield a Token and consuming it at the cleanupret. The token
  type makes it impossible to obscure or otherwise hide the Value, making
  it trivial to track the relationship between the two points.

  What is the burden to the optimizer? Well, it turns out we have already
  paid down this cost by accepting that there are certain calls that we
  are not permitted to duplicate; optimizations have to watch out for
  such instructions anyway. There are additional places in the optimizer
  that we will probably have to update but early examination has given me
  the impression that this will not be heroic.

  Differential Revision: http://reviews.llvm.org/D11861

  llvm-svn: 245029
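  A conceptual fragment of the pairing described above; the funclet-pad
  syntax was still in flux at this point, so the spelling below is
  approximate rather than authoritative:

    cleanup:
      ; the pad yields a token marking the funclet entry...
      %tok = cleanuppad []
      call void @do_cleanup()
      ; ...and the matching ret consumes it, marking the exit. Tokens
      ; cannot be obscured (no casts or phis), so the pairing survives
      ; optimization.
      cleanupret %tok unwind to caller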
* Add support for cross block dse. | Karthik Bhat | 2015-08-14 | 1 | -51/+224
  This patch enables dead store elimination across basicblocks.

  Example:

    define void @test_02(i32 %N) {
      %1 = alloca i32
      store i32 %N, i32* %1
      store i32 10, i32* @x
      %2 = load i32, i32* %1
      %3 = icmp ne i32 %2, 0
      br i1 %3, label %4, label %5

    ; <label>:4
      store i32 5, i32* @x
      br label %7

    ; <label>:5
      %6 = load i32, i32* @x
      store i32 %6, i32* @y
      br label %7

    ; <label>:7
      store i32 15, i32* @x
      ret void
    }

  In the above example the dead store "store i32 5, i32* @x" is now
  eliminated.

  Differential Revision: http://reviews.llvm.org/D11143

  llvm-svn: 245025
* [PM/AA] Run clang-format over the ObjCARC Alias Analysis code to normalize its formatting before I make more substantial changes. | Chandler Carruth | 2015-08-14 | 2 | -37/+34
  llvm-svn: 245024
* [PM/AA] Don't bother forward declaring Function and Value, just include their headers. | Chandler Carruth | 2015-08-14 | 1 | -5/+2
  llvm-svn: 245023
* [PM/AA] Extract the interface for GlobalsModRef into a header along with its creation function. | Chandler Carruth | 2015-08-14 | 1 | -0/+1
  This required shifting a bunch of method definitions to be out-of-line
  so that we could leave most of the implementation guts in the .cpp
  file.

  llvm-svn: 245021
* [PM/AA] Hoist the interface to TBAA into a dedicated header along with its creation function. | Chandler Carruth | 2015-08-14 | 2 | -0/+2
  Update the relevant includes accordingly.

  llvm-svn: 245019
* [PM/AA] Hoist ScopedNoAliasAA's interface into a header and move the creation function there. | Chandler Carruth | 2015-08-14 | 2 | -0/+2
  Same basic refactoring as the other alias analyses. Nothing special
  required this time around.

  llvm-svn: 245012
* [PM/AA] Extract a minimal interface for CFLAA to its own header file. | Chandler Carruth | 2015-08-14 | 1 | -0/+1
  I've used forward declarations and reordered the source code some to
  make this reasonably clean and keep as much of the code as possible in
  the source file, including all the stratified set details. Just the
  basic AA interface and the create function are in the header file, and
  the header file is now included into the relevant locations.

  llvm-svn: 245009
* [SeparateConstOffsetFromGEP] sext(a)+sext(b) => sext(a+b) when a+b can't sign-overflow. | Jingyue Wu | 2015-08-14 | 1 | -5/+100
  Summary: This patch implements my promised optimization to reunite
  certain sexts from operands after we extract the constant offset. See
  the header comment of reuniteExts for its motivation.

  One key building block that enables this optimization is Bjarke's
  poison value analysis (D11212). That helps to prove "a +nsw b" can't
  overflow.

  Reviewers: broune

  Subscribers: jholewinski, sanjoy, llvm-commits

  Differential Revision: http://reviews.llvm.org/D12016

  llvm-svn: 245003
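  An illustrative sketch of the reunification (value names invented); the
  rewrite is sound only because the nsw proof shows %a + %b cannot
  sign-overflow in the narrow type:

    ; before: two sign extensions feeding a wide add
    %a.ext = sext i32 %a to i64
    %b.ext = sext i32 %b to i64
    %sum   = add i64 %a.ext, %b.ext

    ; after: one narrow add that provably does not overflow, one sext
    %sum.n = add nsw i32 %a, %b
    %sum.w = sext i32 %sum.n to i64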
* [LIR] Re-instate r244880, reverted in r244884, factoring the handling of AliasAnalysis in LoopIdiomRecognize. | Chandler Carruth | 2015-08-14 | 1 | -3/+5
  The previous commit to LIR, r244879, exposed some scary bug in the loop
  pass pipeline with an assert failure that showed up on several bots.
  This patch got reverted as part of getting that revision reverted, but
  they're actually independent and unrelated. This patch has no
  functional change and should be completely safe. It is also useful for
  my current work on the AA infrastructure.

  llvm-svn: 244993
* don't repeat function names in comments; NFC | Sanjay Patel | 2015-08-13 | 1 | -24/+20
  llvm-svn: 244977
* [SimplifyLibCalls] Correctly set the is_zero_undef flag for llvm.cttz | Davide Italiano | 2015-08-13 | 1 | -1/+1
  If <src> is non-zero we can safely set the flag to true, and this
  results in less code generated for, e.g., ffs(x) + 1 on FreeBSD.
  Thanks to majnemer for suggesting the fix and reviewing.

  Code generated before the patch was applied:

     0:   0f bc c7           bsf    %edi,%eax
     3:   b9 20 00 00 00     mov    $0x20,%ecx
     8:   0f 45 c8           cmovne %eax,%ecx
     b:   83 c1 02           add    $0x2,%ecx
     e:   b8 01 00 00 00     mov    $0x1,%eax
    13:   85 ff              test   %edi,%edi
    15:   0f 45 c1           cmovne %ecx,%eax
    18:   c3                 retq

  Code generated after the patch was applied:

     0:   0f bc cf           bsf    %edi,%ecx
     3:   83 c1 02           add    $0x2,%ecx
     6:   85 ff              test   %edi,%edi
     8:   b8 01 00 00 00     mov    $0x1,%eax
     d:   0f 45 c1           cmovne %ecx,%eax
    10:   c3                 retq

  It seems we can still use cmove and save another 'test' instruction,
  but that can be tackled separately.

  Differential Revision: http://reviews.llvm.org/D11989

  llvm-svn: 244947
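  For reference, the flag being set is the second operand of the cttz
  intrinsic; a sketch of the ffs-style pattern (the surrounding code is
  illustrative):

    ; the select already guards the zero case, so the cttz itself may
    ; assume a non-zero operand (is_zero_undef = true)
    %ct     = call i32 @llvm.cttz.i32(i32 %x, i1 true)
    %ct.1   = add i32 %ct, 1
    %iszero = icmp eq i32 %x, 0
    %ffs    = select i1 %iszero, i32 0, i32 %ct.1

    declare i32 @llvm.cttz.i32(i32, i1)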
* [SeparateConstOffsetFromGEP] strengthen the inbounds attribute | Jingyue Wu | 2015-08-13 | 1 | -4/+9
  We used to be over-conservative about preserving inbounds. Actually,
  the second GEP (which applies the constant offset) can inherit the
  inbounds attribute of the original GEP, because the resultant pointer
  is equivalent to that of the original GEP. For example,

    x = GEP inbounds a, i+5
      =>
    y = GEP a, i            // inbounds removed
    x = GEP inbounds y, 5   // inbounds preserved

  llvm-svn: 244937
* [DeadStoreElimination] remove a redundant store even if the load is in a different block. | Erik Eckstein | 2015-08-13 | 1 | -10/+71
  DeadStoreElimination does eliminate a store if it stores a value which
  was loaded from the same memory location. So far this worked only if
  the store is in the same block as the load. Now we can also handle
  stores which are in a different block than the load.

  Example:

    define i32 @test(i1, i32*) {
    entry:
      %l2 = load i32, i32* %1, align 4
      br i1 %0, label %bb1, label %bb2
    bb1:
      br label %bb3
    bb2:
      ; This store is redundant
      store i32 %l2, i32* %1, align 4
      br label %bb3
    bb3:
      ret i32 0
    }

  Differential Revision: http://reviews.llvm.org/D11854

  llvm-svn: 244901
* [InstCombinePHI] Partial simplification of identity operations. | Charlie Turner | 2015-08-13 | 1 | -0/+115
  Consider this code:

    BB:
      %i = phi i32 [ 0, %if.then ], [ %c, %if.else ]
      %add = add nsw i32 %i, %b
      ...

  In this common case the add can be moved to the %if.else basic block,
  because adding zero is an identity operation. If we go through the
  %if.then branch it's always a win, because the add is not executed; if
  not, the number of instructions stays the same.

  This pattern applies also to other instructions like sub, shl, shr,
  ashr | 0, mul, sdiv, div | 1.

  Patch by Jakub Kuderski!

  llvm-svn: 244887
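  For concreteness, a hedged sketch of the transformed form (block and
  value names carried over from the snippet above):

    if.else:
      ; the add is executed only on the path where it is not an identity
      %add.else = add nsw i32 %c, %b
      br label %BB

    BB:
      ; 0 + %b folds to %b on the %if.then path
      %add = phi i32 [ %b, %if.then ], [ %add.else, %if.else ]
      ; ...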
* Revert "[LIR] Start leveraging the fundamental guarantees of a loop..."Renato Golin2015-08-131-15/+12
| | | | | | | This reverts commit r244879, as it broke the test-suite on SingleSource/Regression/C/2004-03-15-IndirectGoto in AArch64. llvm-svn: 244885
* Revert "[LIR] Handle access to AliasAnalysis the same way as the other ↵Renato Golin2015-08-131-5/+3
| | | | | | | | | analysis in LoopIdiomRecognize." This reverts commit r244880, as it broke the test-suite on SingleSource/Regression/C/2004-03-15-IndirectGoto in AArch64. llvm-svn: 244884
* Test Commit. | Ashutosh Nema | 2015-08-13 | 1 | -1/+1
  llvm-svn: 244883
* [LIR] Handle access to AliasAnalysis the same way as the other analysis in LoopIdiomRecognize. | Chandler Carruth | 2015-08-13 | 1 | -3/+5
  This is what started me staring at this code. Now migrating it with the
  new AA stuff will be trivial.

  llvm-svn: 244880
* [LIR] Start leveraging the fundamental guarantees of a loop in simplified form to remove redundant checks and simplify the code for popcount recognition. | Chandler Carruth | 2015-08-13 | 1 | -12/+15
  We don't actually need to handle all of these cases. I've left a FIXME
  for one in particular until I finish inspecting to make sure we don't
  actually *rely* on the predicate in any way.

  llvm-svn: 244879
* [LIR] Handle the LoopInfo the same as all the other analyses. No utility really in breaking pattern just for this analysis. | Chandler Carruth | 2015-08-13 | 1 | -3/+3
  llvm-svn: 244878
* [InstCombine] SSE/AVX vector shifts demanded shift amount bits | Simon Pilgrim | 2015-08-13 | 1 | -27/+84
  Most SSE/AVX (non-constant) vector shift instructions only use the
  lower 64 bits of the 128-bit shift amount vector operand, so this patch
  calls SimplifyDemandedVectorElts to optimize for this.

  I had to refactor some of my recent InstCombiner work on the vector
  shifts to avoid quite a bit of duplicate code; it means that
  SimplifyX86immshift now (re)decodes the type of shift.

  Differential Revision: http://reviews.llvm.org/D11938

  llvm-svn: 244872
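  A hedged illustration of the demanded-elements fact being exploited:
  psrl.d reads only the low 64 bits of its count vector, so a write to an
  upper element is dead (the insertelement is contrived for the example):

    define <4 x i32> @shift(<4 x i32> %v, <4 x i32> %amt) {
      ; element 3 lies in the upper 64 bits, which the shift never
      ; reads, so this insert can be simplified away
      %amt2 = insertelement <4 x i32> %amt, i32 1234, i32 3
      %r = call <4 x i32> @llvm.x86.sse2.psrl.d(<4 x i32> %v, <4 x i32> %amt2)
      ret <4 x i32> %r
    }

    declare <4 x i32> @llvm.x86.sse2.psrl.d(<4 x i32>, <4 x i32>)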
* [LoopUnswitch] Check OptimizeForSize before traversing over all basic blocks in current loop | Chen Li | 2015-08-13 | 1 | -7/+6
  Summary: This patch moves the check of OptimizeForSize before
  traversing over all basic blocks in current loop. If OptimizeForSize is
  set to true, no non-trivial unswitch is ever allowed. Therefore, the
  early exit will help reduce compilation time. This patch should be NFC.

  Reviewers: reames, weimingz, broune

  Subscribers: llvm-commits

  Differential Revision: http://reviews.llvm.org/D11997

  llvm-svn: 244868
* [LIR] Make the LoopIdiomRecognize pass get analyses essentially the same way as every other pass. | Chandler Carruth | 2015-08-13 | 1 | -40/+6
  This simplifies the code quite a bit and is also more idiomatic!
  <ba-dum!>

  llvm-svn: 244853
* [LIR] Remove the dedicated class for popcount recognition and sink the code into methods on LoopIdiomRecognize. | Chandler Carruth | 2015-08-13 | 1 | -392/+343
  This simplifies the code somewhat and also makes it much easier to move
  the analyses around. Ultimately, the separate class wasn't providing
  significant value over methods -- it contained the precondition basic
  block and the current loop. The current loop is already available and
  the precondition block wasn't needed everywhere and is easy to pass
  around.

  In several cases I just moved things to be static functions because
  they already accepted most of their inputs as arguments.

  This doesn't fix the way we manage analyses yet, that will be the next
  patch, but it already makes the code over 50 lines shorter.

  No functionality changed.

  llvm-svn: 244851
* [LIR] Move all the helpers to be private and re-order the methods in a way that groups things logically. | Chandler Carruth | 2015-08-13 | 1 | -46/+55
  No functionality changed.

  llvm-svn: 244845
* [LIR] Remove the 'LIRUtils' abstraction which was unnecessary and adding complexity. | Chandler Carruth | 2015-08-12 | 1 | -51/+18
  There is only one function that was called from multiple locations, and
  that was 'getBranch' which has a reasonable one-line spelling already:
  dyn_cast<BranchInst>(BB->getTerminator()). We could make this shorter,
  but it doesn't seem to add much value. Instead, we should avoid calling
  it so many times on the same basic blocks, but that will be in
  a subsequent patch.

  The other functions are only called in one location, so inline them
  there, and take advantage of this to use direct early exit and reduce
  indentation. This makes it much more clear what is being tested for,
  and in fact makes it clear now to me that there are simpler ways to do
  this work. However, this patch just does the mechanical inlining. I'll
  clean up the functionality of the code to leverage loop simplified form
  more effectively in a follow-up.

  Despite lots of early line breaks due to early-exit, this is still
  shorter than it was before.

  llvm-svn: 244841
* [LIR] Run clang-format over LoopIdiomRecognize in preparation for a significant code cleanup here. | Chandler Carruth | 2015-08-12 | 1 | -227/+219
  The handling of analyses in this pass is overly complex and can be
  simplified significantly, but the right way to do that is to simplify
  all of the code, not just the analyses, and that'll require pretty
  extensive edits that would be noisy with formatting changes mixed into
  them.

  llvm-svn: 244828
* [RewriteStatepointsForGC] Avoid using unrelocated pointers after safepoints | Philip Reames | 2015-08-12 | 1 | -0/+31
  To be clear: this is an *optimization*, not a correctness change.

  CodeGenPrep likes to duplicate icmps feeding branch instructions to
  take advantage of x86's ability to fuse many comparison/branch patterns
  into a single micro-op and to reduce the need for materializing i1s
  into general registers. PlaceSafepoints likes to place safepoint polls
  right at the end of basic blocks (immediately before terminators) when
  inserting entry and backedge safepoints. These two heuristics interact
  in a somewhat unfortunate way where the branch terminating the original
  block will be controlled by a condition driven by unrelocated pointers.
  This forces the register allocator to keep both the relocated and
  unrelocated values of the pointers feeding the icmp alive over the
  safepoint poll.

  One simple fix would have been to just adjust PlaceSafepoints to move
  one back in the basic block, but you can reach similar cases as
  a result of LICM or other hoisting passes. As a result, doing a post
  insertion fixup seems to be more robust.

  I considered doing this in CodeGenPrep itself, but having to update the
  live sets of already rewritten safepoints gets complicated fast. In
  particular, you can't just use def/use information because by moving
  the icmp, we're potentially extending the live range of its inputs.
  Instead, this patch teaches RewriteStatepointsForGC to make the
  required adjustments before making the relocations explicit in the IR.

  This change really highlights the fact that RSForGC is
  a CodeGenPrep-like pass which is performing target specific lowering.
  In the long run, we may even want to combine the two though this would
  require a lot more smarts to be integrated into RSForGC first. We
  currently rely on being able to run a set of cleanup passes post
  rewriting because the IR RSForGC generates is pretty damn ugly.

  Differential Revision: http://reviews.llvm.org/D11819

  llvm-svn: 244821
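  A rough sketch of the problematic shape, heavily simplified from real
  statepoint IR (the poll call stands in for a full gc.statepoint
  sequence):

    ; %cmp is computed from the unrelocated %p but used after the poll,
    ; so both %p and its relocated copy must stay live across it
    %cmp = icmp eq i8 addrspace(1)* %p, null
    call void @safepoint_poll()   ; %p is relocated here
    br i1 %cmp, label %is.null, label %not.null

    ; the fixup instead recomputes the compare after the poll from the
    ; relocated value, keeping only one copy of the pointer live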
* [RewriteStatepointsForGC] Handle extractelement fully in the base pointer algorithm | Philip Reames | 2015-08-12 | 1 | -61/+96
  When rewriting the IR such that base pointers are available for every
  live pointer, we potentially need to duplicate instructions to
  propagate the base. The original code had only handled PHI and Select
  under the belief those were the only instructions which would need to
  be duplicated. When I added support for vector instructions, I'd added
  a collection of hacks for ExtractElement which caught most of the
  common cases. Of course, I then found the one test case my hacks
  couldn't cover. :)

  This change removes all of the early hacks for extract element. By
  defining extractelement as a BDV (rather than trying to look through
  it), we can extend the rewriting algorithm to duplicate the extract as
  needed. Note that a couple of peephole optimizations were left in for
  the moment, because while we now handle extractelement as a first class
  citizen, we're not yet handling insertelement. That change will follow
  in the near future.

  llvm-svn: 244808
* fix typo; NFC | Sanjay Patel | 2015-08-12 | 1 | -1/+1
  llvm-svn: 244805
* [PM/AA] Add missing static dependency edges from DSE and memdep to TLI. | Chandler Carruth | 2015-08-12 | 1 | -1/+2
  I forgot to add these in r244780 and r244778. Sorry about that. Also
  order the static dependencies in lexicographical order.

  llvm-svn: 244787
* [PM/AA] Explicitly depend on TLI rather than getting it out of the AliasAnalysis. | Chandler Carruth | 2015-08-12 | 1 | -1/+7
  Same as the other commits, the TLI access from an alias analysis is
  going away and isn't very clean -- it is better to explicitly mark the
  dependencies.

  llvm-svn: 244785
* [PM/AA] Stop getting the TargetLibraryInfo out of the AliasAnalysis and just depend on it directly. | Chandler Carruth | 2015-08-12 | 1 | -36/+36
  This was particularly frustrating because there was a really wide
  mixture of using a member variable and re-extracting it from the AA
  that happened to be around. I think the result is much more clear.

  I've also deleted all of the pointless null checks and used references
  across the APIs where I could to make it explicit that this cannot be
  null in a useful fashion.

  llvm-svn: 244780