path: root/llvm
Commit message  Author  Age  Files  Lines
...
* fix trivial typo in comment, NFC  Hiroshi Inoue  2017-06-26  1  -1/+1
| | | | llvm-svn: 306274
* [MBP] do not rotate loop if it creates extra branch  Serguei Katkov  2017-06-26  3  -4/+83
| This is the last fix for the corner case of PR32214, although in general it is not really a corner case.
|
| We should not rotate a loop if doing so creates an additional branch. Consider a loop chain H, M, B, C, where:
|   H - the header, with a viable fallthrough from the pre-header and an exit from the loop
|   M - some middle block
|   B - the backedge to the header, which also has an exit from the loop
|   C - some cold block of the loop
|
| Suppose H is determined to be the best exit. If we rotate the loop to M, B, C, H, we can introduce an extra branch. The change in the number of branches is:
|   +1 branch from the pre-header to the header
|   -1 branch from the header to the exit
|   +1 branch from the header to the middle block, if there is one
|   -1 branch from the cold block to the header, if there is one
|
| So if C is not a predecessor of H, we introduce an extra branch. This change prohibits rotation of the loop if both of the following are true:
|   1) The best exit has the next element in the chain as a successor.
|   2) The last element in the chain is not a predecessor of the first element of the chain.
|
| Reviewers: iteratee, xur
| Reviewed By: iteratee
| Subscribers: llvm-commits
| Differential Revision: https://reviews.llvm.org/D34271
| llvm-svn: 306272
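The branch accounting in the message above can be sketched as a small function. This only illustrates the arithmetic: the names `RotationInfo`, `branchDelta`, and `shouldRotate` are hypothetical and are not taken from the MachineBlockPlacement code.

```cpp
#include <cassert>

// Hypothetical summary of a loop chain H, M, B, C that is a candidate for
// rotation to M, B, C, H. These fields are illustrative, not LLVM types.
struct RotationInfo {
  bool ExitHasNextInChainAsSuccessor; // H branches to M, the next block
  bool LastIsPredecessorOfFirst;      // C branches back to H
};

// Net change in the number of branches if the chain is rotated.
int branchDelta(const RotationInfo &R) {
  int Delta = 0;
  Delta += 1; // the pre-header no longer falls through to the header
  Delta -= 1; // the header now falls through to the exit
  if (R.ExitHasNextInChainAsSuccessor)
    Delta += 1; // the header needs an explicit branch to the middle block
  if (R.LastIsPredecessorOfFirst)
    Delta -= 1; // the cold block now falls through to the header
  return Delta;
}

// Rotation is only worthwhile when it does not add a branch.
bool shouldRotate(const RotationInfo &R) { return branchDelta(R) <= 0; }
```

With both conditions from the message (the exit has the next block as a successor, and the last block is not a predecessor of the first), `branchDelta` is +1 and `shouldRotate` returns false.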
* [CFL-AA] Remove unneeded function declaration. NFCI.  Davide Italiano  2017-06-26  1  -3/+0
| | | | llvm-svn: 306268
* [InstCombine] Factor the logic for propagating !nonnull and !range  Chandler Carruth  2017-06-26  3  -31/+64
| | | | | | | | | | | | | | | | metadata out of InstCombine and into helpers. NFC, this just exposes the logic used by InstCombine when propagating metadata from one load instruction to another. The plan is to use this in SROA to address PR32902. If anyone has better ideas about how to factor this or name variables, I'm all ears, but this seemed like a pretty good start and lets us make progress on the PR. This is based on a patch by Ariel Ben-Yehuda (D34285). llvm-svn: 306267
* AMDGPU: Whitespace fixes  Matt Arsenault  2017-06-26  4  -6/+6
| | | | llvm-svn: 306265
* AMDGPU: Partially fix implicit.buffer.ptr intrinsic handling  Matt Arsenault  2017-06-26  8  -30/+93
| This should not be treated as a different version of private_segment_buffer. These are distinct things with different uses and register classes, and they require the function argument info to have more context about the function's type and environment.
|
| Also add missing test coverage for the intrinsic, and emit an error for HSA. This also uncovers that the intrinsic is broken unless there happen to be stack objects.
|
| llvm-svn: 306264
* fix various typos  Sylvestre Ledru  2017-06-26  3  -12/+12
| | | | llvm-svn: 306262
* [LoopSimplify] Re-instate r306081 with a bug fix w.r.t. indirectbr.  Chandler Carruth  2017-06-25  7  -65/+175
| This was reverted in r306252, but I already had the bug fixed and was just trying to form a test case.
|
| The original commit factored the logic for forming dedicated exits inside of LoopSimplify into a helper that could be used elsewhere and with an approach that required fewer intermediate data structures. See that commit for full details including the change to the statistic, etc.
|
| The code looked fine to me and my reviewers, but in fact didn't handle indirectbr correctly -- it left the 'InLoopPredecessors' vector dirty.
|
| If you have code that looks *just* right, you can end up leaking these predecessors into a subsequent rewrite, and crash deep down when trying to update PHI nodes for predecessors that don't exist.
|
| I've added an assert that makes the bug much more obvious, and then changed the code to reliably clear the vector so we don't get this bug again in some other form as the code changes.
|
| I've also added a test case that *does* manage to catch this while also giving some nice positive coverage in the face of indirectbr.
|
| The real code that found this came out of what I think is CPython's interpreter loop, but any code with really "creative" interpreter loops mixing indirectbr and other exit paths could manage to tickle the bug.
|
| It was hard to reduce the original test case because, in addition to having a particular pattern of IR, the whole thing depends on the order of the predecessors, which in turn depends on use list order. The test case added here was designed so that in multiple different predecessor orderings it should always end up going down the same path and tripping the same bug. I hope. At least, it tripped it for me without manipulating the use list order, which is better than anything bugpoint could do...
|
| llvm-svn: 306257
* [LoopSimplify] Improve a test for loop simplify minorly. NFC.  Chandler Carruth  2017-06-25  1  -12/+150
| | | | | | | | I did some basic testing while looking for a bug in my recent change to loop simplify and even though it didn't find the bug it seems like a useful improvement anyways. llvm-svn: 306256
* [MemDep] Cleanup return after else & use `auto`. NFC.  Davide Italiano  2017-06-25  1  -3/+3
| | | | llvm-svn: 306255
* [LoopDeletion] NFC: Move phi node value setting into prepass  Anna Thomas  2017-06-25  1  -11/+14
| | | | | | | | | | | | | | | Recommit NFC patch (rL306157) where I missed incrementing the basic block iterator, which caused loop deletion tests to hang due to infinite loop. Had reverted it in rL306162. rL306157 commit message: Currently, the implementation of delete dead loops has a special case when the loop being deleted is never executed. This special case (updating of exit block's incoming values for phis) can be run as a prepass for non-executable loops before performing the actual deletion. llvm-svn: 306254
* Revert "[LoopSimplify] Factor the logic to form dedicated exits into a utility."  Daniel Jasper  2017-06-25  6  -91/+65
| | | | | | | This leads to a segfault. Chandler already has a test case and should be able to recommit with a fix soon. llvm-svn: 306252
* [TableGen] Remove some copies around PatternToMatch.  Craig Topper  2017-06-25  2  -16/+14
| Summary:
| This patch does a few things that should remove some copies around PatternsToMatch. These were noticed while reviewing code for D34341.
|
| Change the constructor to take Dstregs by value and move it into the class. Change one of the callers to add std::move to the argument so that it gets moved.
|
| Make AddPatternToMatch take PatternToMatch by rvalue reference so we can move it into the PatternsToMatch vector. I believe we should have an implicit default move constructor available on PatternToMatch. I chose rvalue reference because both callers already call it with temporaries.
|
| Reviewers: RKSimon, aymanmus, spatel
| Reviewed By: aymanmus
| Subscribers: llvm-commits
| Differential Revision: https://reviews.llvm.org/D34411
| llvm-svn: 306251
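The two idioms named in the message above (a by-value sink parameter that is moved into a member, and an rvalue-reference parameter that is moved into a vector) can be sketched as follows. `Pattern`, `PatternList`, and `demo` are hypothetical stand-ins, not the TableGen classes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a pattern record; not the TableGen class.
struct Pattern {
  std::vector<std::string> DstRegs;
  // Take the vector by value and move it into the member: a caller passing
  // an rvalue pays for moves only, a caller passing an lvalue pays one copy.
  explicit Pattern(std::vector<std::string> Regs) : DstRegs(std::move(Regs)) {}
};

struct PatternList {
  std::vector<Pattern> Patterns;
  // Accept only rvalues so the pattern is moved, never copied, into the list.
  void add(Pattern &&P) { Patterns.push_back(std::move(P)); }
};

// Builds a list holding one pattern and returns its register count.
std::size_t demo() {
  std::vector<std::string> Regs = {"EAX", "EDX"};
  PatternList List;
  List.add(Pattern(std::move(Regs))); // the temporary binds to Pattern &&
  return List.Patterns.front().DstRegs.size();
}
```

The rvalue-reference overload fits when every caller already passes a temporary, which is the reason the message gives for choosing it.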
* [IR] Use isIntOrIntVectorTy instead of writing it out the long way. NFC  Craig Topper  2017-06-25  1  -10/+4
| | | | llvm-svn: 306250
* [IR] Move repeated asserts in FCmpInst constructor to a helper method like we do for ICmpInst and other classes. NFC  Craig Topper  2017-06-25  1  -21/+13
| | | | llvm-svn: 306249
* [X86][SSE] Remove unused memopfsf32_128/memopfsf64_128 scalar memops  Simon Pilgrim  2017-06-25  1  -10/+0
| | | | | | The 'scalar' simd bitops were dropped a while ago llvm-svn: 306248
* Strip trailing whitespace. NFCI.  Simon Pilgrim  2017-06-25  2  -2/+2
| | | | llvm-svn: 306247
* [X86] Add test case for PR15705  Simon Pilgrim  2017-06-25  1  -0/+48
| | | | llvm-svn: 306246
* [InstCombine] add (sext i1 X), 1 --> zext (not X)  Sanjay Patel  2017-06-25  2  -17/+22
| http://rise4fun.com/Alive/i8Q
|
| A narrow bitwise logic op is obviously better than math for value tracking, and zext is better than sext. Typically, the 'not' will be folded into an icmp predicate.
|
| The IR difference would even survive through codegen for x86, so we would see worse code:
| https://godbolt.org/g/C14HMF
|
|   one_or_zero(int, int):                  # @one_or_zero(int, int)
|     xorl %eax, %eax
|     cmpl %esi, %edi
|     setle %al
|     retq
|
|   one_or_zero_alt(int, int):              # @one_or_zero_alt(int, int)
|     xorl %ecx, %ecx
|     cmpl %esi, %edi
|     setg %cl
|     movl $1, %eax
|     subl %ecx, %eax
|     retq
|
| llvm-svn: 306243
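The fold above can be checked exhaustively, since an i1 has only two values. The sketch below models the IR operations with plain C++ integers; the function names are hypothetical and none of this is InstCombine code.

```cpp
#include <cassert>
#include <cstdint>

// Models 'sext i1 X to i32': true becomes -1 (all ones), false becomes 0.
int32_t sextI1(bool X) { return X ? -1 : 0; }

// Models 'zext i1 X to i32': true becomes 1, false becomes 0.
int32_t zextI1(bool X) { return X ? 1 : 0; }

// Left-hand side of the fold: add (sext i1 X), 1.
int32_t addSextOne(bool X) { return sextI1(X) + 1; }

// Right-hand side of the fold: zext (not X).
int32_t zextNot(bool X) { return zextI1(!X); }
```

Both sides yield 1 when X is false and 0 when X is true, so the rewrite preserves the value while replacing the add with a bitwise 'not'.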
* AVX-512: Fixed a crash during legalization of <3 x i8> type  Elena Demikhovsky  2017-06-25  2  -2/+32
| The compiler fails with an assertion during legalization of SETCC for <3 x i8> operands. The result is extended to <4 x i8> and then truncated to <4 x i1>. This does not happen on AVX2, because the final result of SETCC is <4 x i32>.
|
| Differential Revision: https://reviews.llvm.org/D34503
| llvm-svn: 306242
* [AST] Fix a bug in aliasesUnknownInst. Make sure we are comparing the unknown instructions in the alias set and the instruction interested in.  Xin Tong  2017-06-25  3  -2/+90
| Summary:
| Make sure we are comparing the unknown instructions in the alias set and the instruction we are interested in. I believe this is clearly a bug (missed opportunity). I can also add some test cases if desired.
|
| Reviewers: hfinkel, davide, dberlin
| Subscribers: llvm-commits
| Differential Revision: https://reviews.llvm.org/D34597
| llvm-svn: 306241
* [GlobalISel][X86] Support vector type G_EXTRACT selection.  Igor Breger  2017-06-25  3  -0/+311
| | | | | | | | | | | | | | | | Summary: Support vector type G_EXTRACT selection. For now G_EXTRACT marked as legal for any type, so nothing to do in legalizer. Split from https://reviews.llvm.org/D33665 Reviewers: qcolombet, t.p.northover, zvi, guyblank Reviewed By: guyblank Subscribers: guyblank, rovka, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D33957 llvm-svn: 306240
* [AVX2] [TTI CostModel] Add cost of interleaved loads/stores for AVX2  Dorit Nuzman  2017-06-25  4  -0/+298
| The cost of an interleaved access was only implemented for AVX512. For other X86 targets an overly conservative Base cost was returned, which prevented vectorization in cases where it is actually profitable to vectorize. This patch starts to add costs for AVX2 for the most prominent cases of interleaved accesses (stride 3,4 chars, for now).
|
| Note 1: Improvements of up to ~4x were observed in some of EEMBC's rgb workloads. There is also a known issue of 15-30% degradations on some of these workloads, associated with an interleaved access followed by type promotion/widening; the resulting shuffle sequence is currently inefficient and will be improved by a series of patches that extend the X86InterleavedAccess pass (such as D34601 and more to follow).
|
| Note 2: The costs in this patch do not reflect port pressure penalties, which can be very dominant in the case of interleaved accesses since most of the shuffle operations are restricted to a single port. Further tuning, which may incorporate these considerations, will be done on top of the upcoming improved shuffle sequences (that is, along with the abovementioned work to extend the X86InterleavedAccess pass).
|
| Differential Revision: https://reviews.llvm.org/D34023
| llvm-svn: 306238
* Add support for Ananas platform  Ed Schouten  2017-06-25  3  -0/+9
| | | | | | | | | | | | | | | | | Ananas is a home-brew operating system, mainly for amd64 machines. After using GCC for quite some time, it has switched to clang and never looked back - yet, having to manually patch things is annoying, so it'd be much nicer if this was in the official tree. More information: https://github.com/zhmu/ananas/ https://rink.nu/projects/ananas.html Submitted by: Rink Springer Differential Revision: https://reviews.llvm.org/D32937 llvm-svn: 306237
* [PatternMatch] Just check if value is a Constant before calling isAllOnesValue for not_match. We don't really need to check for a specific subclass of Constant. NFC  Craig Topper  2017-06-25  1  -4/+1
| | | | llvm-svn: 306236
* [pdb] Fix reading of llvm-generated PDBs by cvdump.  Zachary Turner  2017-06-25  1  -6/+18
| If you dump a pdb to yaml, and then round-trip it back to a pdb, and run cvdump -l <file> on the new pdb, cvdump will generate output such as this.
|
|   *** LINES
|   ** Module: "d:\src\llvm\test\DebugInfo\PDB\Inputs\empty.obj"
|   Error: Line number corrupted: invalid file id 0
|   <Unknown> (MD5), 0001:00000010-0000001A, line/addr pairs = 3
|   5 00000010 6 00000013 7 00000018
|
| Note the error message about the corrupted line number.
|
| It turns out that the problem is that cvdump cannot find the /names stream (e.g. the global string table), and the reason it can't find the /names stream is because it doesn't understand the NameMap that we serialize which tells pdb consumers which stream has the string table.
|
| Some experimentation shows that if we add items to the hash table in a specific order before serializing it, cvdump can read it. This suggests that either we're using the wrong hash function, or we're serializing something incorrectly, but it will take some deeper investigation to figure out how / why. For now, this at least allows cvdump to read our line information (and incidentally, produces an identical byte sequence to what Microsoft tools produce when writing the named stream map).
|
| Differential Revision: https://reviews.llvm.org/D34491
| llvm-svn: 306233
* [PGO] Implementate profile counter regiser promotion  Xinliang David Li  2017-06-25  8  -15/+463
| | | | | | Differential Revision: http://reviews.llvm.org/D34085 llvm-svn: 306231
* [Support] Don't use std::iterator, it's deprecated in C++17.  Zachary Turner  2017-06-25  1  -6/+5
| | | | | | | | | | In converting this over to iterator_facade_base, some member operators and methods are no longer needed since iterator_facade implements them in the base class using CRTP. Differential Revision: https://reviews.llvm.org/D34223 llvm-svn: 306230
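For readers without the LLVM headers at hand, the sketch below shows the standard-library-only way to drop `std::iterator`: declare the five member type aliases directly. This is a swapped-in technique for illustration; the commit itself uses LLVM's `iterator_facade_base`, which is not shown here. `CountingIterator` and `sumRange` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <numeric>

// Hypothetical input iterator over a half-open range of ints. It declares
// the iterator traits as member aliases instead of inheriting them from the
// deprecated std::iterator.
class CountingIterator {
  int Value;

public:
  using iterator_category = std::input_iterator_tag;
  using value_type = int;
  using difference_type = std::ptrdiff_t;
  using pointer = const int *;
  using reference = const int &;

  explicit CountingIterator(int V) : Value(V) {}

  reference operator*() const { return Value; }
  CountingIterator &operator++() {
    ++Value;
    return *this;
  }
  CountingIterator operator++(int) {
    CountingIterator Tmp = *this;
    ++Value;
    return Tmp;
  }
  bool operator==(const CountingIterator &O) const { return Value == O.Value; }
  bool operator!=(const CountingIterator &O) const { return !(*this == O); }
};

// Sums the integers in [Begin, End) through the iterator interface.
int sumRange(int Begin, int End) {
  return std::accumulate(CountingIterator(Begin), CountingIterator(End), 0);
}
```

A CRTP facade such as the one the commit adopts exists to generate operators like the post-increment and `operator!=` above from a smaller core, which is why the message says some members were no longer needed.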
* [SCEV] Avoid copying ConstantRange just to get the min/max value  Craig Topper  2017-06-24  2  -72/+89
| | | | | | | | | | | | | | | | | | | | | Summary: This patch changes getRange to getRangeRef and returns a reference to the ConstantRange object stored inside the DenseMap caches. We then take advantage of that to add new helper methods that can return min/max value of a signed or unsigned ConstantRange using that reference without first copying the ConstantRange. getRangeRef calls itself recursively and I believe the reference return is fine for those calls. I've left getSignedRange and getUnsignedRange returning a ConstantRange object so they will make a copy now. This is to ensure safety since the reference will be invalidated if the DenseMap changes. I'm sure there are still more places that can take advantage of the reference and I'll submit future patches as I find them. Reviewers: sanjoy, davide Reviewed By: sanjoy Subscribers: zzheng, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D32978 llvm-svn: 306229
* [PatternMatch] Use ConstantFP::isNan instead of getting the APFloat and calling isNaN on that. NFC  Craig Topper  2017-06-24  1  -4/+2
| | | | llvm-svn: 306227
* [IR] Implement commutable matchers without using combineOr  Craig Topper  2017-06-24  1  -56/+58
| Summary:
| Turns out creating matchers with combineOr isn't very efficient, as we have to build matcher objects for both sides of the OR. Those objects aren't free; the trees usually contain several objects that contain a reference to a Value *, ConstantInt *, APInt *, or some such thing. The compiler isn't always willing to inline all the matcher code to get rid of these member variables, so we end up with loads and stores of these variables.
|
| Using combineOr ends up creating two complete copies of the tree and the associated stores. I believe we're also paying for the opcode check twice.
|
| This patch adds a commutable mode to several of the matcher objects as a bool template parameter that can be used to enable commutable support directly in the match functions of the corresponding objects. This avoids the duplicate object creation and the duplicate opcode checks.
|
| This shows a reduction of roughly 7-8k in the opt binary size on my local build.
|
| Reviewers: spatel, majnemer, davide
| Reviewed By: majnemer
| Subscribers: llvm-commits
| Differential Revision: https://reviews.llvm.org/D34592
| llvm-svn: 306226
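The idea of a commutable mode selected by a bool template parameter can be sketched with a toy matcher. Everything here (`BinOp`, `Specific`, `BinaryOpMatch`, and the helper functions) is a hypothetical simplification and does not reproduce the PatternMatch.h classes.

```cpp
#include <cassert>

// Toy expression node; a stand-in for an IR binary operator.
struct BinOp {
  char Opcode;
  int LHS;
  int RHS;
};

// Leaf matcher that succeeds on one specific operand value.
struct Specific {
  int Expected;
  bool match(int V) const { return V == Expected; }
};

// Binary matcher with commutativity as a compile-time flag. One matcher
// object holds one copy of each sub-matcher, and the opcode is checked once,
// instead of building two full matchers and OR-ing their results.
template <typename LTy, typename RTy, char Opcode, bool Commutable = false>
struct BinaryOpMatch {
  LTy L;
  RTy R;
  bool match(const BinOp &Op) const {
    if (Op.Opcode != Opcode)
      return false;
    return (L.match(Op.LHS) && R.match(Op.RHS)) ||
           (Commutable && L.match(Op.RHS) && R.match(Op.LHS));
  }
};

using Add = BinaryOpMatch<Specific, Specific, '+'>;
using AddCommuted = BinaryOpMatch<Specific, Specific, '+', true>;

// Matches '1 + 2' in that operand order only.
bool matchesAdd12(const BinOp &Op) { return Add{{1}, {2}}.match(Op); }

// Matches '1 + 2' or '2 + 1'.
bool matchesAdd12Commuted(const BinOp &Op) {
  return AddCommuted{{1}, {2}}.match(Op);
}
```

Because `Commutable` is a template argument, the non-commutable instantiation carries no extra runtime state, and the swapped-operand check can be dropped by the compiler.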
* Another test commit  Anton Korobeynikov  2017-06-24  1  -1/+1
| | | | llvm-svn: 306224
* Remove test commit change.  Tanya Lattner  2017-06-24  1  -1/+1
| | | | llvm-svn: 306223
* test commit  Tanya Lattner  2017-06-24  1  -1/+1
| | | | llvm-svn: 306222
* Still debugging  Anton Korobeynikov  2017-06-24  1  -1/+1
| | | | llvm-svn: 306216
* Still test commit  Anton Korobeynikov  2017-06-24  1  -1/+1
| | | | llvm-svn: 306215
* Another test commit  Anton Korobeynikov  2017-06-24  1  -1/+1
| | | | llvm-svn: 306214
* Another test commit  Anton Korobeynikov  2017-06-24  1  -1/+1
| | | | llvm-svn: 306213
* Test commit  Anton Korobeynikov  2017-06-24  1  -1/+1
| | | | llvm-svn: 306212
* fix trivial typos in comment, NFC  Hiroshi Inoue  2017-06-24  3  -3/+3
| | | | llvm-svn: 306211
* fix trivial typos in comment, NFC  Hiroshi Inoue  2017-06-24  2  -3/+3
| | | | | | dereferencable -> dereferenceable llvm-svn: 306210
* [SelectionDAG] set dereferenceable flag when expanding memcpy/memmove  Hiroshi Inoue  2017-06-24  4  -8/+122
| When SelectionDAG expands a memcpy (or memmove) call into a sequence of load and store instructions, it disregards the dereferenceable flag even if the source pointer is known to be dereferenceable. This results in an assertion failure if SelectionDAG commonizes a load instruction generated for memcpy with another load instruction for the source pointer. This patch makes SelectionDAG set the dereferenceable flag for the load instructions properly to avoid the assertion failure.
|
| Differential Revision: https://reviews.llvm.org/D34467
| llvm-svn: 306209
* Ensure backends available in 'opt' are also available in 'bugpoint'  Tobias Grosser  2017-06-24  3  -0/+11
| This patch links LLVM back-ends into bugpoint the same way they are already available in 'opt' and 'clang'. This resolves an inconsistency that allowed the use of LLVM backends in loadable modules that run in 'opt', but that would prevent the debugging of these modules with bugpoint due to unavailable / unresolved symbols.
|
| For example, in D31859, Polly requires the NVPTX back-end.
|
| Reviewers: hfinkel, bogner, chandlerc, grosser, Meinersbur
| Subscribers: bollu, mgorny, grosser, Meinersbur
| Tags: #polly
| Contributed by: Singapuram Sanjay
| Differential Revision: https://reviews.llvm.org/D32003
| llvm-svn: 306208
* [IR] Remove BinOp2_match and replace its usage with the more capable BinOpPred_match.  Craig Topper  2017-06-24  1  -49/+48
| | | | llvm-svn: 306207
* [IR][AssumptionCache] Add m_Shift and m_BitwiseLogic matchers to replace a couple m_CombineOr  Craig Topper  2017-06-24  3  -10/+49
| Summary:
| m_CombineOr isn't very efficient. The code using it is also quite verbose.
|
| This patch adds m_Shift and m_BitwiseLogic matchers to make the using code more concise and improve the match efficiency.
|
| Reviewers: spatel, davide
| Reviewed By: davide
| Subscribers: davide, llvm-commits
| Differential Revision: https://reviews.llvm.org/D34593
| llvm-svn: 306206
* [ValueTracking][InstCombine] Use m_Shr instead m_CombineOr(m_LShr, m_AShr). NFC  Craig Topper  2017-06-24  2  -7/+3
| | | | llvm-svn: 306205
* [Analysis][Transforms] Use commutable matchers instead of m_CombineOr in a few places. NFC  Craig Topper  2017-06-24  3  -8/+4
| | | | llvm-svn: 306204
* Simplify the processFixupValue interface. NFC.  Rafael Espindola  2017-06-24  8  -35/+18
| | | | llvm-svn: 306202
* Add comments for OrderedInstruction. NFC  Xin Tong  2017-06-24  1  -0/+3
| | | | llvm-svn: 306201
* Remove a processFixupValue hack.  Rafael Espindola  2017-06-24  2  -35/+32
| The intention of processFixupValue is not to redefine the semantics of MCExpr. It is odd enough that an expression lowers to a PCRel MCExpr or not depending on what it looks like. At least it is a local hack now.
|
| I left a fix for anyone trying to figure out what producers should be producing a different expression.
|
| llvm-svn: 306200