path: root/llvm/lib
Commit message | Author | Age | Files | Lines
* MIR Serialization: Serialize the memory operand's alias scope metadata node.
  Alex Lorenz | 2015-08-17 | 4 files | -3/+12
  llvm-svn: 245245
* MIR Serialization: Serialize the memory operand's TBAA metadata node.
  Alex Lorenz | 2015-08-17 | 4 files | -11/+60
  llvm-svn: 245244
* [WinEHPrepare] Replace unreasonable funclet terminators with unreachable
  David Majnemer | 2015-08-17 | 1 file | -3/+33
  It is possible to be in a situation where more than one funclet token is a
  valid SSA value. If we see a terminator which exits a funclet which doesn't
  use the funclet's token, replace it with unreachable.
  Differential Revision: http://reviews.llvm.org/D12074
  llvm-svn: 245238
* [SPARC]: recognize '.' as the start of an assembler expression.
  Douglas Katzman | 2015-08-17 | 1 file | -0/+1
  llvm-svn: 245232
* [ARM] Fix crash when targeting CPU without NEON
  James Molloy | 2015-08-17 | 1 file | -3/+3
  We emulate a scalar vmin/vmax with NEON instructions as they don't exist in
  the VFP ISA. So only mark these as legal when NEON is available.
  Found here: https://code.google.com/p/chromium/issues/detail?id=521671
  llvm-svn: 245231
* [ScalarEvolutionExpander] Reuse findExistingExpansion during expansion cost
  calculation for division
  Igor Laevsky | 2015-08-17 | 1 file | -19/+11
  The primary purpose of this change is to reuse existing code inside
  findExistingExpansion. However, it introduces a very slight semantic change:
  findExistingExpansion now looks into exiting blocks instead of loop latches.
  The original heuristic was based on the fact that we want to look at the
  loop exit conditions, and since all exiting latches will be listed in the
  ExitingBlocks, the heuristic stays roughly the same.
  Differential Revision: http://reviews.llvm.org/D12008
  llvm-svn: 245227
* [CostModel][AArch64] Increase cost of vector insert element and add missing
  cast costs
  Silviu Baranga | 2015-08-17 | 1 file | -1/+33
  Summary: Increase the estimated costs for insert/extract element operations
  on AArch64. This is motivated by results from benchmarking interleaved
  accesses. Add missing costs for zext/sext/trunc instructions and some
  integer to floating point conversions. These costs were previously
  calculated by scalarizing these operations and were affected by the cost
  increase of the insert/extract element operations.
  Reviewers: rengolin
  Subscribers: mcrosier, aemerson, rengolin, llvm-commits
  Differential Revision: http://reviews.llvm.org/D11939
  llvm-svn: 245226
* [CostModel][ARM] Increase cost of insert/extract operations
  Silviu Baranga | 2015-08-17 | 1 file | -5/+12
  Summary: This change limits the minimum cost of an insert/extract element
  operation to 2 in cases where this would result in mixing of NEON and VFP
  code.
  Reviewers: rengolin
  Subscribers: mssimpso, aemerson, llvm-commits, rengolin
  Differential Revision: http://reviews.llvm.org/D12030
  llvm-svn: 245225
* [BasicAliasAnalysis] Do not check ModRef table for intrinsics
  Igor Laevsky | 2015-08-17 | 1 file | -7/+0
  All possible ModRef behaviours can be completely represented using existing
  LLVM IR attributes.
  Differential Revision: http://reviews.llvm.org/D12033
  llvm-svn: 245224
* Take alignment into account in isSafeToSpeculativelyExecute and
  isSafeToLoadUnconditionally.
  Artur Pilipenko | 2015-08-17 | 2 files | -39/+91
  Reviewed By: hfinkel, sanjoy, MatzeB
  Differential Revision: http://reviews.llvm.org/D9791
  llvm-svn: 245223
* Extend MCAsmLexer so that it can peek forward several tokens
  Benjamin Kramer | 2015-08-17 | 1 file | -3/+13
  This commit adds a virtual `peekTokens()` function to `MCAsmLexer` which can
  peek forward an arbitrary number of tokens. It also makes the `peekTok()`
  method call the `peekTokens()` method, requesting only one token. The idea
  is to better support targets with more ambiguous assembly syntaxes.
  Patch by Dylan McKay!
  llvm-svn: 245221
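  As a rough standalone sketch (toy class and token type, not the real
  MCAsmLexer/AsmToken interface), the relationship described above can be
  pictured as single-token peeking implemented on top of multi-token peeking:

    #include <cstddef>
    #include <string>
    #include <utility>
    #include <vector>

    // Toy stand-ins for illustration only; a real lexer reads from a source
    // buffer and has a richer token type.
    struct Token { std::string Text; };

    class ToyAsmLexer {
      std::vector<Token> Stream; // pre-lexed token stream, for simplicity
      std::size_t Cursor = 0;    // index of the next unconsumed token

    public:
      explicit ToyAsmLexer(std::vector<Token> S) : Stream(std::move(S)) {}

      // Peek forward up to Buf.size() tokens without consuming them; returns
      // how many tokens were actually available.
      std::size_t peekTokens(std::vector<Token> &Buf) const {
        std::size_t N = 0;
        for (; N < Buf.size() && Cursor + N < Stream.size(); ++N)
          Buf[N] = Stream[Cursor + N];
        return N;
      }

      // Peeking a single token is just the one-element case of peekTokens().
      Token peekTok() const {
        std::vector<Token> Buf(1);
        return peekTokens(Buf) ? Buf[0] : Token{"<eof>"};
      }

      // Consume and return one token.
      Token lex() {
        return Cursor < Stream.size() ? Stream[Cursor++] : Token{"<eof>"};
      }
    };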
* Correcting a -Woverflow warning where 0xFFFF was overflowing an implicit
  constant conversion.
  Aaron Ballman | 2015-08-17 | 1 file | -1/+1
  llvm-svn: 245220
* [WinEHPrepare] Fix catchret successor phi demotion
  Joseph Tremoulet | 2015-08-17 | 1 file | -0/+36
  Summary: When demoting an SSA value that has a use on a phi and one of the
  phi's predecessors terminates with catchret, the edge needs to be split and
  the load inserted in the new block, else we'll still have a cross-funclet
  SSA value. Add a test for this, and for the similar case where a def to be
  spilled is on an invoke and a critical edge, which was already implemented
  but missing a test.
  Reviewers: majnemer
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D12065
  llvm-svn: 245218
* Revert "Disable targetdatalayoutcheck"Tobias Grosser2015-08-171-0/+5
| | | | | | | I committed by accident a local hack that should not have made it upstream. Sorry for the noise. llvm-svn: 245212
* Disable targetdatalayoutcheckTobias Grosser2015-08-171-5/+0
| | | | llvm-svn: 245210
* [mips] [IAS] Add support for the DLA pseudo-instruction and fix problems ↵Daniel Sanders2015-08-174-213/+283
| | | | | | | | | | | | | | with DLI Summary: It is the same as LA, except that it can also load 64-bit addresses and it only works on 64-bit MIPS architectures. Reviewers: tomatabacu, seanbruno, vkalintiris Subscribers: brooks, seanbruno, emaste, llvm-commits Differential Revision: http://reviews.llvm.org/D9524 llvm-svn: 245208
* [GMR] isNonEscapingGlobalNoAlias() should look through Bitcasts/GEPs when ↵Michael Kuperstein2015-08-171-1/+1
| | | | | | | | | | looking at loads. This fixes yet another case from PR24288. Differential Revision: http://reviews.llvm.org/D12064 llvm-svn: 245207
* Remove hand-rolled matching for fmin and fmax.James Molloy2015-08-171-98/+2
| | | | | | SDAGBuilder now does this all for us. llvm-svn: 245198
* Rip out hand-rolled matching code for VMIN, VMAX, VMINNM and VMAXNMJames Molloy2015-08-171-194/+0
| | | | | | This is no longer needed - SDAGBuilder will do this for us. llvm-svn: 245197
* Generate FMINNAN/FMINNUM/FMAXNAN/FMAXNUM from SDAGBuilder.James Molloy2015-08-171-12/+33
| | | | | | | | | | These only get generated if the target supports them. If one of the variants is not legal and the other is, and it is safe to do so, the other variant will be emitted. For example on AArch32 (V8), we have scalar fminnm but not fmin. Fix up a couple of tests while we're here - one now produces better code, and the other was just plain wrong to start with. llvm-svn: 245196
* Fix PR24469 resulting from r245025 and re-enable dead store elimination
  across basicblocks.
  Karthik Bhat | 2015-08-17 | 1 file | -51/+231
  PR24469 resulted because DeleteDeadInstruction in
  handleNonLocalStoreDeletion was invalidating the next basic block iterator.
  Fixed by resetting the basic block iterator after the call to
  DeleteDeadInstruction.
  llvm-svn: 245195
* Revert "[InstCombinePHI] Partial simplification of identity operations."David Majnemer2015-08-171-115/+0
| | | | | | This reverts commit r244887, it caused PR24470. llvm-svn: 245194
* [PM] Port ScalarEvolution to the new pass manager.
  Chandler Carruth | 2015-08-17 | 38 files | -251/+299
  This change makes ScalarEvolution a stand-alone object and just produces one
  from a pass as needed. Making this work well requires making the object
  movable, using references instead of overwritten pointers in a number of
  places, and other refactorings.

  I've also wired it up to the new pass manager and added a RUN line to a test
  to exercise it under the new pass manager. This includes basic printing
  support much like with other analyses.

  But there is a big and somewhat scary change here. Prior to this patch
  ScalarEvolution was never *actually* invalidated!!! Re-running the pass just
  re-wired up the various other analyses and didn't remove any of the existing
  entries in the SCEV caches or clear out anything at all. This might seem OK
  as everything in SCEV uses ValueHandles to track updates to the values that
  serve as SCEV keys. However, this still means that as we ran SCEV over each
  function in the module, we kept accumulating more and more SCEVs into the
  cache. At the end, we would have a SCEV cache with every value that we ever
  needed a SCEV for in the entire module!!! Yowzers. The releaseMemory routine
  would dump all of this, but that isn't really called during normal runs of
  the pipeline as far as I can see.

  To make matters worse, there *is* actually a key that we don't update with
  value handles -- there is a map keyed off of Loop*s. Because LoopInfo *does*
  release its memory from run to run, it is entirely possible to run SCEV over
  one function, then over another function, and then lookup a Loop* from the
  second function but find an entry inserted for the first function! Ouch.

  To make matters still worse, there are plenty of updates that *don't* trip a
  value handle. It seems incredibly unlikely that today GVN or another pass
  that invalidates SCEV can update values in *just* such a way that a
  subsequent run of SCEV will incorrectly find lookups in a cache, but it is
  theoretically possible and would be a nightmare to debug.

  With this refactoring, I've fixed all this by actually destroying and
  recreating the ScalarEvolution object from run to run. Technically, this
  could increase the amount of malloc traffic we see, but then again it is
  also technically correct. ;] I don't actually think we're suffering from
  tons of malloc traffic from SCEV because if we were, the fact that we never
  clear the memory would seem more likely to have come up as an actual problem
  before now. So, I've made the simple fix here. If in fact there are serious
  issues with too much allocation and deallocation, I can work on a clever fix
  that preserves the allocations (while clearing the data) between each run,
  but I'd prefer to do that kind of optimization with a test case / benchmark
  that shows why we need such cleverness (and that can test that we actually
  make it faster). It's possible that this will make some things faster by
  making the SCEV caches have higher locality (due to being significantly
  smaller) so until there is a clear benchmark, I think the simple change is
  best.

  Differential Revision: http://reviews.llvm.org/D12063
  llvm-svn: 245193
* [ADT] Teach FoldingSet to be movable.
  Chandler Carruth | 2015-08-16 | 1 file | -0/+20
  This is a very minimal move support - it leaves the moved-from object in a
  zombie state that is only valid for destruction and move assignment. This
  seems fine to me, and leaving it in the default constructed state would
  require adding more state to the object and potentially allocating memory
  (!!!) and so seems like a Bad Idea.
  llvm-svn: 245192
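  A generic sketch of the trade-off described above, using a hypothetical
  bucket container rather than FoldingSet itself: the move operations steal
  the allocation and leave the source in a state that is only safe to destroy
  or move-assign.

    #include <cstddef>

    // Hypothetical container, for illustration only.
    class ToyBucketSet {
      void **Buckets = nullptr;   // owned allocation
      std::size_t NumBuckets = 0;

    public:
      explicit ToyBucketSet(std::size_t N)
          : Buckets(new void *[N]()), NumBuckets(N) {}

      // Move construction: no new allocation; the source becomes a "zombie"
      // that may only be destroyed or move-assigned afterwards.
      ToyBucketSet(ToyBucketSet &&Other) noexcept
          : Buckets(Other.Buckets), NumBuckets(Other.NumBuckets) {
        Other.Buckets = nullptr;
        Other.NumBuckets = 0;
      }

      ToyBucketSet &operator=(ToyBucketSet &&Other) noexcept {
        if (this != &Other) {
          delete[] Buckets;
          Buckets = Other.Buckets;
          NumBuckets = Other.NumBuckets;
          Other.Buckets = nullptr;
          Other.NumBuckets = 0;
        }
        return *this;
      }

      ToyBucketSet(const ToyBucketSet &) = delete;
      ToyBucketSet &operator=(const ToyBucketSet &) = delete;

      ~ToyBucketSet() { delete[] Buckets; }
    };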
* [SimplifyLibCalls] Drop default template args. No functional change.
  Benjamin Kramer | 2015-08-16 | 1 file | -4/+2
  llvm-svn: 245189
* [IR] Simplify code. No functionality change.
  Benjamin Kramer | 2015-08-16 | 1 file | -6/+4
  llvm-svn: 245188
* transform fmin/fmax calls when possible (PR24314)
  Sanjay Patel | 2015-08-16 | 1 file | -2/+61
  If we can ignore NaNs, fmin/fmax libcalls can become compare and select
  (this is what we turn std::min / std::max into). This IR should then be
  optimized in the backend to whatever is best for any given target. Eg, x86
  can use minss/maxss instructions.
  This should solve PR24314: https://llvm.org/bugs/show_bug.cgi?id=24314
  Differential Revision: http://reviews.llvm.org/D11866
  llvm-svn: 245187
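  A small sketch of the scalar equivalence being exploited, valid only under
  the commit's assumption that NaNs can be ignored (function names here are
  made up):

    // With NaNs out of the picture, an fmin/fmax libcall reduces to a compare
    // and select, which a backend can lower to e.g. x86 minss/maxss.
    float fminNoNaN(float X, float Y) { return X < Y ? X : Y; }
    float fmaxNoNaN(float X, float Y) { return X > Y ? X : Y; }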
* [LSR][NFC] Don’t duplicate entity name at the beginning of the comment.
  Sanjoy Das | 2015-08-16 | 1 file | -236/+208
  llvm-svn: 245183
* [LSR][NFC] Use camelCase for method names in Formula and RegUseTracker.
  Sanjoy Das | 2015-08-16 | 1 file | -34/+34
  llvm-svn: 245182
* use SDValue bool operator; NFCI
  Sanjay Patel | 2015-08-16 | 1 file | -4/+3
  llvm-svn: 245181
* Add missing include guard.
  Yaron Keren | 2015-08-16 | 1 file | -0/+4
  llvm-svn: 245173
* Revert "Add support for cross block dse. This patch enables dead stroe
  elimination across basicblocks."
  David Majnemer | 2015-08-16 | 1 file | -224/+51
  This reverts commit r245025, it caused PR24469.
  llvm-svn: 245172
* [InstCombine] Replace an and+icmp with a trunc+icmp
  David Majnemer | 2015-08-16 | 1 file | -0/+17
  Bitwise arithmetic can obscure a simple sign-test. Replacing the mask with a
  truncate is preferable if the type is legal because it permits us to
  rephrase the comparison more explicitly.
  llvm-svn: 245171
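  An illustrative example (made-up values, not the exact pattern the pass
  matches) of restating a masked sign-test as a truncate plus a signed
  compare:

    #include <cassert>
    #include <cstdint>
    #include <initializer_list>

    // Mask form: test bit 31 of the low 32-bit half of a 64-bit value.
    bool signTestWithMask(uint64_t X) { return (X & 0x80000000u) != 0; }

    // Truncate form: the same question asked directly, "is the low 32-bit
    // half negative?" (two's-complement narrowing, well-defined since C++20).
    bool signTestWithTrunc(uint64_t X) {
      return static_cast<int32_t>(static_cast<uint32_t>(X)) < 0;
    }

    int main() {
      for (uint64_t X :
           {0ull, 0x7FFFFFFFull, 0x80000000ull, 0xFFFFFFFF00000000ull})
        assert(signTestWithMask(X) == signTestWithTrunc(X));
      return 0;
    }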
* Revert r244127: [PM] Remove a failed attempt to port the CallGraph
  analysis ...
  Chandler Carruth | 2015-08-16 | 1 file | -0/+15
  It turns out that we *do* need the old CallGraph ported to the new pass
  manager. There are times where this model of a call graph is really superior
  to the one provided by the LazyCallGraph. For example, GlobalsModRef very
  specifically needs the model provided by CallGraph.
  While here, I've tried to make the move semantics actually work. =]
  llvm-svn: 245170
* [X86] Widen the 'AND' mask if doing so shrinks the encoding size
  David Majnemer | 2015-08-16 | 1 file | -2/+61
  We can set additional bits in a mask given that we know the other operand of
  an AND already has some bits set to zero. This can be more efficient if
  doing so allows us to use an instruction which implicitly sign extends the
  immediate.
  This fixes PR24085.
  Differential Revision: http://reviews.llvm.org/D11289
  llvm-svn: 245169
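  A small numeric illustration (values invented here) of why the widening is
  sound and useful: if the other operand's high bits are known to be zero,
  setting those bits in the mask cannot change the result, and the widened
  mask below is -16, which x86-64 encodes as a sign-extended 8-bit immediate:

    #include <cassert>
    #include <cstdint>

    int main() {
      uint64_t V = 0x00000000DEADBEEFull;          // high 32 bits known zero
      uint64_t NarrowMask = 0x00000000FFFFFFF0ull; // needs a wide immediate
      uint64_t WideMask = ~uint64_t(15);           // 0xFF...F0, i.e. -16

      // The extra mask bits only cover positions where V is already zero.
      assert((V & NarrowMask) == (V & WideMask));
      return 0;
    }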
* MergeFunc: Quick fix for r245140, Ignore second, aka Function*, in sorting.
  NAKAMURA Takumi | 2015-08-16 | 1 file | -1/+6
  Don't assume second would be ordered in the module.
  llvm-svn: 245168
* Try to appease VS 2015 warnings from http://reviews.llvm.org/D11890
  Yaron Keren | 2015-08-15 | 1 file | -21/+19
  ByteSize and BitSize should not be size_t but unsigned, considering
  1) They are at most 2^16 and 2^19, respectively.
  2) BitSize is an argument to Type::getIntNTy which takes unsigned.
  Also, use the correct utostr instead of itostr and cache the string result.
  Thanks to James Touton for reporting this!
  llvm-svn: 245167
* [x86] enable machine combiner reassociations for scalar single-precision
  minimums
  Sanjay Patel | 2015-08-15 | 1 file | -0/+6
  llvm-svn: 245166
* Silence VS2015 warning.
  Yaron Keren | 2015-08-15 | 1 file | -1/+1
  Patch by James Touton! http://reviews.llvm.org/D11890
  llvm-svn: 245161
* [DAGCombiner] Attempt to mask vectors before zero extension instead of after.
  Simon Pilgrim | 2015-08-15 | 2 files | -17/+47
  For cases where we TRUNCATE and then ZERO_EXTEND to a larger size (often from
  vector legalization), see if we can mask the source data and then ZERO_EXTEND
  (instead of after an ANY_EXTEND). This can help avoid having to generate a
  larger mask, and possibly applying it to several sub-vectors:
    (zext (truncate x)) -> (zext (and (x, m)))
  Includes a minor patch to SystemZ to better recognise 8/16-bit zero extension
  patterns from RISBG bit-extraction code.
  This is the first of a number of minor patches to help improve the conversion
  of byte masks to clear mask shuffles.
  Differential Revision: http://reviews.llvm.org/D11764
  llvm-svn: 245160
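  A scalar sketch of the identity behind the combine (the patch applies it to
  vector nodes): truncating and then zero-extending is the same as masking in
  the source type and then zero-extending:

    #include <cassert>
    #include <cstdint>

    int main() {
      uint32_t X = 0xCAFEBABEu;

      // zext(trunc(x)): i32 -> i8 -> i64.
      uint64_t ZextOfTrunc = static_cast<uint64_t>(static_cast<uint8_t>(X));

      // zext(and(x, m)): mask first with m = 0xFF, then extend.
      uint64_t ZextOfMasked = static_cast<uint64_t>(X & 0xFFu);

      assert(ZextOfTrunc == ZextOfMasked);
      return 0;
    }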
* [PM/AA] Delete the LibCallAliasAnalysis and all the associated
  infrastructure.
  Chandler Carruth | 2015-08-15 | 4 files | -185/+0
  This AA was never used in tree. Its infrastructure also completely overlaps
  that of TargetLibraryInfo, which is used heavily by BasicAA to achieve
  similar goals to those stated for this analysis.
  As has come up in several discussions, the use case here is still really
  important, but this code isn't helping move toward that use case. Any
  progress on better supporting rich AA information for runtime library
  environments would likely be better off starting from scratch or starting
  from TargetLibraryInfo than from this base.
  Differential Revision: http://reviews.llvm.org/D12028
  llvm-svn: 245155
* AMDGPU/SI: Only look at live out SGPR defs
  Matt Arsenault | 2015-08-15 | 1 file | -3/+7
  When trying to fix SGPR live ranges, skip defs that are killed in the same
  block as the def. I don't think we need to worry about these cases as long
  as the live ranges of the SGPRs in dominating blocks are correct.
  This reduces the number of elements the second loop over the function needs
  to look at, and makes it generally easier to understand. The second loop
  also only considers if the live range is live in to a block, which logically
  means it must have been live out from another.
  llvm-svn: 245150
* [IR] Give catchret an optional 'return value' operand
  David Majnemer | 2015-08-15 | 12 files | -42/+104
  Some personality routines require funclet exit points to be clearly marked;
  this is done by producing a token at the funclet pad and consuming it at the
  corresponding ret instruction. CleanupReturnInst already had a spot for this
  operand but CatchReturnInst did not. Other personality routines don't need
  to use this, which is why it has been made optional.
  llvm-svn: 245149
* Remove redundant TargetFrameLowering::getFrameIndexOffset virtual function.
  James Y Knight | 2015-08-15 | 18 files | -78/+94
  This was the same as getFrameIndexReference, but without the FrameReg
  output.
  Differential Revision: http://reviews.llvm.org/D12042
  llvm-svn: 245148
* [WebAssembly] Add Relooper
  JF Bastien | 2015-08-15 | 3 files | -0/+1121
  This is just an initial checkin of an implementation of the Relooper
  algorithm, in preparation for WebAssembly codegen to utilize. It doesn't do
  anything yet by itself.
  The Relooper algorithm takes an arbitrary control flow graph and generates
  structured control flow from that, utilizing a helper variable when
  necessary to handle irreducibility. The WebAssembly backend will be able to
  use this in order to generate an AST for its binary format.
  Author: azakai
  Reviewers: jfb, sunfish
  Subscribers: jevinskie, arsenm, jroelofs, llvm-commits
  Differential revision: http://reviews.llvm.org/D11691
  llvm-svn: 245142
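  A hand-written sketch (not the Relooper's code or API) of the kind of
  structured output the description implies for an irreducible region: a
  helper variable selects the next block each time around a single loop:

    #include <cstdio>

    // Irreducible toy CFG: entry block A jumps into either B or C, and B and
    // C jump to each other; there is no single loop header, so the structured
    // form drives the control flow with a helper variable.
    void runRelooped(bool EnterAtB) {
      int Label = EnterAtB ? 1 : 2; // block A picks where the loop is entered
      int Count = 3;                // gives the example a finite trip count
      while (Label != 0) {
        switch (Label) {
        case 1: // block B
          std::printf("B\n");
          Label = (--Count > 0) ? 2 : 0;
          break;
        case 2: // block C
          std::printf("C\n");
          Label = (--Count > 0) ? 1 : 0;
          break;
        }
      }
    }

    int main() {
      runRelooped(true);
      return 0;
    }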
* Accelerate MergeFunctions with hashing
  JF Bastien | 2015-08-15 | 1 file | -4/+99
  This patch makes the Merge Functions pass faster by calculating and
  comparing a hash value which captures the essential structure of a function
  before performing a full function comparison.

  The hash is calculated by hashing the function signature, then walking the
  basic blocks of the function in the same order as the main comparison
  function. The opcode of each instruction is hashed in sequence, which means
  that different functions according to the existing total order cannot have
  the same hash, as the comparison requires the opcodes of the two functions
  to be in the same order.

  The hash function is a static member of the FunctionComparator class because
  it is tightly coupled to the exact comparison function used. For example,
  functions which are equivalent modulo a single variant callsite might be
  merged by a more aggressive MergeFunctions, and the hash function would need
  to be insensitive to these differences in order to exploit this.

  The hashing function uses a utility class which accumulates the values into
  an internal state using a standard bit-mixing function. Note that this is a
  different interface than a regular hashing routine, because the values to be
  hashed are scattered amongst the properties of a llvm::Function, not linear
  in memory. This scheme is fast because only one word of state needs to be
  kept, and the mixing function is a few instructions.

  The main runOnModule function first computes the hash of each function, and
  only further processes functions which do not have a unique function hash.
  The hash is also used to order the sorted function set. If the hashes
  differ, their values are used to order the functions, otherwise the full
  comparison is done.

  Both of these are helpful in speeding up MergeFunctions. Together they
  result in speedups of 9% for mysqld (a mostly C application with little
  redundancy), 46% for libxul in Firefox, and 117% for Chromium. (These are
  all LTO builds.) In all three cases, the new speed of MergeFunctions is
  about half that of the module verifier, making it relatively inexpensive
  even for large LTO builds with hundreds of thousands of functions. The same
  functions are merged, so this change is free performance.

  Author: jrkoenig
  Reviewers: nlewycky, dschuff, jfb
  Subscribers: llvm-commits, aemerson
  Differential revision: http://reviews.llvm.org/D11923
  llvm-svn: 245140
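  A rough sketch (not the patch's actual code; the constants and helper names
  are invented) of the kind of order-sensitive structural hash described
  above: one word of state, a cheap mixing step, a little signature
  information, then every opcode in basic-block order:

    #include "llvm/IR/BasicBlock.h"
    #include "llvm/IR/Function.h"
    #include "llvm/IR/Instruction.h"
    #include <cstdint>

    namespace {

    // Single word of state updated with a few xor/multiply/shift operations.
    class OpcodeHashAccumulator {
      uint64_t State = 0x9E3779B97F4A7C15ull;

    public:
      void add(uint64_t V) {
        State ^= V;
        State *= 0xBF58476D1CE4E5B9ull;
        State ^= State >> 27;
      }
      uint64_t result() const { return State; }
    };

    // Hash what the full comparison examines first: a bit of the signature,
    // then each instruction's opcode, walking blocks in function order.
    uint64_t structuralHash(const llvm::Function &F) {
      OpcodeHashAccumulator Acc;
      Acc.add(F.isVarArg());
      Acc.add(F.arg_size());
      for (const llvm::BasicBlock &BB : F)
        for (const llvm::Instruction &I : BB)
          Acc.add(I.getOpcode());
      return Acc.result();
    }

    } // namespace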
* LoopStrengthReduce: Try to pass address space to isLegalAddressingMode
  Matt Arsenault | 2015-08-15 | 1 file | -63/+94
  This seems to only work some of the time. In some situations, this seems to
  use a nonsensical type and isn't actually aware of the memory being
  accessed, e.g. if the branch condition is an icmp of a pointer, it checks
  the addressing mode of i1.
  llvm-svn: 245137
* AMDGPU/SI: Fix printing useless info with amdhsa
  Matt Arsenault | 2015-08-15 | 1 file | -1/+1
  The comments at the bottom would all report 0 if amdhsa was used.
  llvm-svn: 245135
* AMDGPU/SI: Update LiveVariables
  Matt Arsenault | 2015-08-15 | 1 file | -2/+15
  This is simple but won't work if/when this pass is moved to be post-SSA.
  llvm-svn: 245134
* AMDGPU/SI: Update LiveIntervals during SIFixSGPRLiveRanges
  Matt Arsenault | 2015-08-15 | 1 file | -4/+13
  Does not mark SlotIndexes as reserved, although I think that might be OK.
  LiveVariables still need to be handled.
  llvm-svn: 245133