summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* AMDGPU: Expand frame indexes to be relative to scratch wave offsetMatt Arsenault2017-05-171-6/+71
| | | | | | | | | | | | In order for an arbitrary callee to access an object in a caller's stack frame, the 32-bit offset used as the private pointer needs to be relative to the kernel's scratch wave offset register. Convert to this by finding the difference from the current stack frame and scaling by the wavefront size. llvm-svn: 303303
* AMDGPU: Change mubuf soffset register when SP relativeMatt Arsenault2017-05-172-15/+53
| | | | | | | | | | Check the MachinePointerInfo for whether the access is supposed to be relative to the stack pointer. No tests because this is used in later commits implementing calls. llvm-svn: 303301
* [X86][AVX512] Add 512-bit vector ctlz costs + testsSimon Pilgrim2017-05-171-0/+24
| | | | llvm-svn: 303300
* [llvm-pdbdump] in yaml2pdb, generate default output filename if none givenBob Haarman2017-05-171-0/+1
| | | | | | | | | | | | | | | | | | | | Summary: llvm-pdbdump yaml2pdb used to fail with a misleading error message ("An I/O error occurred on the file system") if no output file was specified. This change adds an assert to PDBFileBuilder to check that an output file name is specified, and makes llvm-pdbdump generate an output file name based on the input file name if no output file name is explicitly specified. Reviewers: amccarth, zturner Reviewed By: zturner Subscribers: fhahn, llvm-commits Differential Revision: https://reviews.llvm.org/D33296 llvm-svn: 303299
* Add some helpers for manipulating BinaryStreamRefs.Zachary Turner2017-05-171-0/+5
| | | | llvm-svn: 303297
* AMDGPU: Make better use of op_sel with high componentsMatt Arsenault2017-05-172-8/+57
| | | | | | Handle more general swizzles. llvm-svn: 303296
* [InstSimplify] handle all icmp i1 X, C in one place; NFCISanjay Patel2017-05-171-28/+39
| | | | | | | | | | | | | | We already handled all of the new tests identically, but several of those went through a lot of unnecessary processing before getting folded. Another motivation for grouping these cases together is that InstCombine needs a similar fold. Currently, it handles the 'not' cases inefficiently which can lead to bugs as described in the post-commit comments of: https://reviews.llvm.org/D32143 llvm-svn: 303295
* [BinaryStream] Reduce the amount of boiler plate needed to use.Zachary Turner2017-05-174-4/+158
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Often you have an array and you just want to use it. With the current design, you have to first construct a `BinaryByteStream`, and then create a `BinaryStreamRef` from it. Worse, the `BinaryStreamRef` holds a pointer to the `BinaryByteStream`, so you can't just create a temporary one to appease the compiler, you have to actually hold onto both the `ArrayRef` as well as the `BinaryByteStream` *AND* the `BinaryStreamReader` on top of that. This makes for very cumbersome code, often requiring one to store a `BinaryByteStream` in a class just to circumvent this. At the cost of some added complexity (not exposed to users, but internal to the library), we can do better than this. This patch allows us to construct `BinaryStreamReaders` and `BinaryStreamWriters` directly from source data (e.g. `StringRef`, `MutableArrayRef<uint8_t>`, etc). Not only does this reduce the amount of code you have to type and make it more obvious how to use it, but it solves real lifetime issues when it's inconvenient to hold onto a `BinaryByteStream` for a long time. The additional complexity is in the form of an added layer of indirection. Whereas before we simply stored a `BinaryStream*` in the ref, we now store both a `BinaryStream*` **and** a `std::shared_ptr<BinaryStream>`. When the user wants to construct a `BinaryStreamRef` directly from an `ArrayRef` etc, we allocate an internal object that holds ownership over a `BinaryByteStream` and forwards all calls, and store this in the `shared_ptr<>`. This also maintains the ref semantics, as you can copy it by value and references refer to the same underlying stream -- the one being held in the object stored in the `shared_ptr`. Differential Revision: https://reviews.llvm.org/D33293 llvm-svn: 303294
* [X86][AVX512] Add 512-bit vector cttz costs + testsSimon Pilgrim2017-05-171-0/+6
| | | | llvm-svn: 303293
* Only enable LiveRangeShrink for x86.Dehao Chen2017-05-172-3/+1
| | | | | | | | | | | | | | Summary: Moving LiveRangeShrink to x86 as this pass is mostly useful for archtectures with great register pressure. Reviewers: MatzeB, qcolombet Reviewed By: qcolombet Subscribers: jholewinski, jyknight, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D33294 llvm-svn: 303292
* AMDGPU: Try to use op_sel when selecting packed instructionsMatt Arsenault2017-05-171-1/+29
| | | | | | | | | | | | Avoids instructions to pack a vector when the source is really a scalar being broadcast. Also be smarter and look for per-component fneg. Doesn't yet handle scalar from upper half of register or other swizzles. llvm-svn: 303291
* [WebAssembly][NFC] Update expected testsuite failures for newly passing testsJacob Gravelle2017-05-171-3/+0
| | | | | | | | | | | | Summary: r303050 fixes crashes when calling scalarizeMaskedMemIntrin pass from WebAssembly backend. This updates expected test failures for that. Reviewers: sbc100 Subscribers: jfb, llvm-commits, dschuff Differential Revision: https://reviews.llvm.org/D33295 llvm-svn: 303288
* AMDGPU: Use appropriate soffset for spillingMatt Arsenault2017-05-172-20/+20
| | | | | | | This needs to be the frame offset register, and not the global scratch wave offset register. For kernels, these are the same. llvm-svn: 303287
* Revert r303015, because it has the unintended side effect of breakingDimitry Andric2017-05-171-24/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | driver-mode recognition in clang (this is because the sysctl method always returns one and only one executable path, even for an executable with multiple links): Fix DynamicLibraryTest.cpp on FreeBSD and NetBSD Summary: After rL301562, on FreeBSD the DynamicLibrary unittests fail, because the test uses getMainExecutable("DynamicLibraryTests", Ptr), and since the path does not contain any slashes, retrieving the main executable will not work. Reimplement getMainExecutable() for FreeBSD and NetBSD using sysctl(3), which is more reliable than fiddling with relative or absolute paths. Also add retrieval of the original argv[] from the GoogleTest framework, to use as a fallback for other OSes. Reviewers: emaste, marsupial, hans, krytarowski Reviewed By: krytarowski Subscribers: krytarowski, llvm-commits Differential Revision: https://reviews.llvm.org/D33171 llvm-svn: 303285
* AMDGPU: Fix min3/max3 combines for f16/i16Matt Arsenault2017-05-173-2/+25
| | | | | | Fix missing instruction definitions for min3/max3. llvm-svn: 303284
* [X86][AVX512] Add 512-bit vector bitreverse costs + testsSimon Pilgrim2017-05-171-0/+18
| | | | llvm-svn: 303283
* Re-land r303274: "[CrashRecovery] Use SEH __try instead of VEH when available"Reid Kleckner2017-05-171-48/+71
| | | | | | | | | | | We have to check gCrashRecoveryEnabled before using __try. In other words, SEH works too well and we ended up recovering from crashes in implicit module builds that we weren't supposed to. Only libclang is supposed to enable CrashRecoveryContext to allow implicit module builds to crash. llvm-svn: 303279
* [GISel]: Fix undefined behavior in IRTranslatorAditya Nandakumar2017-05-171-0/+5
| | | | | | | | Make sure IRTranslator->MachineIRBuilder->DebugLoc doesn't outlive the DILocation. Clear it at the end of IRTranslator::runOnMachineFunction llvm-svn: 303277
* Revert "[CrashRecovery] Use SEH __try instead of VEH when available"Reid Kleckner2017-05-171-66/+48
| | | | | | This reverts commit r303274, it appears to break some clang tests. llvm-svn: 303275
* [CrashRecovery] Use SEH __try instead of VEH when availableReid Kleckner2017-05-171-48/+66
| | | | | | | | | | | | | | | | | | | | Summary: It avoids problems when other libraries raise exceptions. In particular, OutputDebugString raises an exception that the debugger is supposed to catch and suppress. VEH kicks in first right now, and that is entirely incorrect. Unfortunately, GCC does not support SEH, so I've kept the old buggy VEH codepath around. We could fix it with SetUnhandledExceptionFilter, but that is not per-thread, so a well-behaved library shouldn't set it. Reviewers: zturner Subscribers: llvm-commits, mgorny Differential Revision: https://reviews.llvm.org/D33261 llvm-svn: 303274
* Workaround for incorrect Win32 header on GCC.Zachary Turner2017-05-171-6/+4
| | | | llvm-svn: 303272
* [CodeView] Simplify the use of visiting type records & streams.Zachary Turner2017-05-177-50/+106
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is often a lot of boilerplate code required to visit a type record or type stream. The #1 use case is that you have a sequence of bytes that represent one or more records, and you want to deserialize each one, switch on it, and call a callback with the deserialized record that the user can examine. Currently this requires at least 6 lines of code: codeview::TypeVisitorCallbackPipeline Pipeline; Pipeline.addCallbackToPipeline(Deserializer); Pipeline.addCallbackToPipeline(MyCallbacks); codeview::CVTypeVisitor Visitor(Pipeline); consumeError(Visitor.visitTypeRecord(Record)); With this patch, it becomes one line of code: consumeError(codeview::visitTypeRecord(Record, MyCallbacks)); This is done by having the deserialization happen internally inside of the visitTypeRecord function. Since this is occasionally not desirable, the function provides a 3rd parameter that can be used to change this behavior. Hopefully this can significantly reduce the barrier to entry to using the visitation infrastructure. Differential Revision: https://reviews.llvm.org/D33245 llvm-svn: 303271
* [InstCombine] add isCanonicalPredicate() helper function and use it; NFCISanjay Patel2017-05-172-31/+32
| | | | | | | | | | | | | | | | | | | | There should be a slight efficiency improvement from handling icmp/fcmp with one matcher and reducing duplicated code. The larger motivation is that there are questions about how predicate canonicalization is handled, and the refactoring should make it easier if we want to change any of that behavior. 1. As noted in the code comment, we've chosen 3 of the 16 FCMP preds as not canonical. Why those 3? It goes back to rL32751 from what I can tell, but I'm not sure if there's a justification for that rule. 2. We currently do not canonicalize integer select conditions. Should we use the same rule that applies to branches for selects? 3. We currently do canonicalize some FP select conditions, and those rules would conflict with the rule shown here. Should one or both be changed? No-functional-change-intended, but adding tests anyway because there's no coverage for most of the predicates. Differential Revision: https://reviews.llvm.org/D33247 llvm-svn: 303261
* [PPC] Properly update register save area offsetsKrzysztof Parzyszek2017-05-171-9/+14
| | | | | | | | | | | | The variables MinGPR/MinG8R were not updated properly when resetting the offsets, which in the included testcase lead to saving the CR register in the same location as R30. This fixes another issue reported in PR26519. Differential Revision: https://reviews.llvm.org/D33017 llvm-svn: 303257
* [GlobalISel][X86] Support add i64 in IA32.Igor Breger2017-05-172-0/+71
| | | | | | | | | | | | | | Summary: support G_UADDE instruction selection. Reviewers: zvi, guyblank Reviewed By: guyblank Subscribers: rovka, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D33096 llvm-svn: 303255
* [SystemZ] Modelling of costs of divisions with a constant power of 2.Jonas Paulsson2017-05-171-1/+33
| | | | | | | | Such divisions will eventually be implemented with shifts which should be reflected in the cost function. Review: Ulrich Weigand llvm-svn: 303254
* Reland r303247: [ARM] GlobalISel: Remove dead instruction selection codeDiana Picus2017-05-171-15/+0
| | | | | | | | It only failed on llvm-clang-x86_64-expensive-checks-win, probably because the TableGen stuff hasn't been regenerated. Requires a clean build. llvm-svn: 303252
* [DWARF] - Cleanup relocations proccessing.George Rimar2017-05-171-39/+22
| | | | | | | | | | | | | | | RelocAddrMap was a pair of <width, address>, where width is relocation size (4/8/x, x < 8), and width field was never used in code. Relocations proccessing loop had checks for width field. Does not look like DWARF parser should do that. There is probably no much sense to validate relocations during proccessing them in parser. Patch removes relocation's width relative code from DWARFContext. Differential revision: https://reviews.llvm.org/D33194 llvm-svn: 303251
* Revert "[ARM] GlobalISel: Remove dead instruction selection code"Diana Picus2017-05-171-0/+15
| | | | | | | This reverts commit r303247 because the tests are failing on some bots. Sorry! llvm-svn: 303249
* [ARM] GlobalISel: Remove dead instruction selection codeDiana Picus2017-05-171-15/+0
| | | | | | | We can now generate code for selecting G_ADD, G_SUB and G_MUL. Remove the hand-written versions. llvm-svn: 303247
* [Sparc] Remove execute permissions from non-executable text filesDaniel Cederman2017-05-173-0/+0
| | | | | | | | | | | | Reviewers: jyknight, lero_chris, venkatra Reviewed By: jyknight Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27127 llvm-svn: 303245
* [RuntimeDyld] Fix debug section relocation (pr20457)Pavel Labath2017-05-171-3/+7
| | | | | | | | | | | | | | | | | | Summary: Debug info sections, (or non-SHF_ALLOC sections in general) should be linked as if their load address was zero to emulate the behavior of the static linker. This bug was discovered because it was breaking lldb expression evaluation on linux. Reviewers: lhames Subscribers: aprantl, eugene, clayborg, lldb-commits, llvm-commits Differential Revision: https://reviews.llvm.org/D32899 llvm-svn: 303239
* Make sure -optimize-regalloc=false is used correctly by user.Jonas Paulsson2017-05-171-10/+14
| | | | | | | | | | | Don't allow -optimize-regalloc=false with -regalloc given for anything other than 'fast'. The other register allocators depend on the supporting passes added by addOptimizedRegAlloc(). Reviewers: Quentin Colombet, Matthias Braun https://reviews.llvm.org/D33181 llvm-svn: 303238
* [SCEV] Always sort AddRecExprs from different loops by dominanceMax Kazantsev2017-05-171-7/+7
| | | | | | | | | | | | | | Sorting of AddRecExprs by loop nesting does not make sense since we only invoke the CompareSCEVComplexity for AddRecExprs that are used by one SCEV. This guarantees that there is always a dominance relationship between them. This patch removes the sorting by nesting which is a dead code in current usage of this function. Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D33228 llvm-svn: 303235
* [SCEV][NFC] Replace redundant dyn_cast with cast in getAddExprMax Kazantsev2017-05-171-14/+15
| | | | | | | | Replace dyn_cast which is ensured by isa just one line above with cast. Differential Revision: https://reviews.llvm.org/D33231 llvm-svn: 303234
* [coroutines] Handle spills before catchswitchGor Nishanov2017-05-171-2/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | If we need to spill the result of the PHI instruction, we insert the spill after all of the PHIs and EHPads, however, in a catchswitch block there is no room to insert the spill. Make room by splitting away catchswitch into a separate block. Before the fix: catch.dispatch: %val = phi i32 [ 1, %if.then ], [ 2, %if.else ] %switch = catchswitch within none [label %catch] unwind label %cleanuppad After: catch.dispatch: %val = phi i32 [ 1, %if.then ], [ 2, %if.else ] %tok = cleanuppad within none [] ; spill goes here cleanupret from %tok unwind label %catch.dispatch.switch catch.dispatch.switch: %switch = catchswitch within none [label %catch] unwind label %cleanuppad https://reviews.llvm.org/D31846 llvm-svn: 303232
* BitVector: add iterators for set bitsFrancis Visoiu Mistrih2017-05-1715-50/+36
| | | | | | Differential revision: https://reviews.llvm.org/D32060 llvm-svn: 303227
* [ADT] Fix some Clang-tidy modernize-use-using warnings; other minor fixes (NFC).Eugene Zelenko2017-05-161-9/+28
| | | | llvm-svn: 303221
* Fix for compilers with older CRT header libraries.Zachary Turner2017-05-161-1/+6
| | | | llvm-svn: 303220
* [Support] Ignore OutputDebugString exceptions in our crash recovery.Zachary Turner2017-05-161-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since we use AddVectoredExceptionHandler, we get notified of every exception that gets raised by a program. Sometimes these are not necessarily errors though, and this can be especially true when linking against a library that we have no control over, and may raise an exception internally which it intends to catch. In particular, the Windows API OutputDebugString does exactly this. It raises an exception inside of a __try / __except, giving the debugger a chance to handle the exception to print the message to the debug console. But this doesn't interoperate nicely with our vectored exception handler, which just sees another exception and decides that we need to terminate the program. Add a special case for this so that we ignore ODS exceptions and continue normally. Note that a better fix is to simply not use vectored exception handlers and use SEH instead, but given that MinGW doesn't support SEH, this is the only solution for MinGW. Differential Revision: https://reviews.llvm.org/D33260 llvm-svn: 303219
* [IR] Prefer use_empty() to !hasNUsesOrMore(1) for clarity.Davide Italiano2017-05-162-2/+2
| | | | llvm-svn: 303218
* [InstSimplify] add folds for constant mask of value shifted by constantSanjay Patel2017-05-161-0/+18
| | | | | | | | | | | | | | | | | We would eventually catch these via demanded bits and computing known bits in InstCombine, but I think it's better to handle the simple cases as soon as possible as a matter of efficiency. This fold allows further simplifications based on distributed ops transforms. eg: %a = lshr i8 %x, 7 %b = or i8 %a, 2 %c = and i8 %b, 1 InstSimplify can directly fold this now: %a = lshr i8 %x, 7 Differential Revision: https://reviews.llvm.org/D33221 llvm-svn: 303213
* The patch exclude a case from zero check skip inEvgeny Stupachenko2017-05-161-7/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CTLZ idiom recognition (r303102). Summary: The following case: i = 1; if(n) while (n >>= 1) i++; use(i); Was converted to: i = 1; if(n) i += builtin_ctlz(n >> 1, false); use(i); Which is not correct. The patch make it: i = 1; if(n) i += builtin_ctlz(n >> 1, true); use(i); From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 303212
* Re-commit r302678, fixing PR33053.Amara Emerson2017-05-164-269/+101
| | | | | | | The issue was that the AArch64 TTI hook allowed unpacked integer cmp reductions which didn't have a lowering. llvm-svn: 303211
* [Inliner] Do not mix callsite and callee hotness based updates.Easwaran Raman2017-05-161-15/+27
| | | | | | | | | | Update threshold based on callee's hotness only when BFI is not available. Otherwise use only callsite's hotness. This makes it easier to reason about hotness related threshold updates. Differential revision: https://reviews.llvm.org/D33157 llvm-svn: 303210
* [PPC] Lower load acquire/seq_cst trailing fence to cmp + bne + isync.Tim Shen2017-05-165-8/+67
| | | | | | | | | | | | | | | | | Summary: This fixes pr32392. The lowering pipeline is: llvm.ppc.cfence in IR -> PPC::CFENCE8 in isel -> Actual instructions in expandPostRAPseudo. The reason why expandPostRAPseudo is chosen is because previous passes are likely eliminating instructions like cmpw 3, 3 (early CSE) and bne- 7, .+4 (some branch pass(s)). Differential Revision: https://reviews.llvm.org/D32763 llvm-svn: 303205
* Add hasProfileSummary and has{Sample|Instrumentation}Profile methodsEaswaran Raman2017-05-161-1/+1
| | | | | | | | ProfileSummaryInfo already checks whether the module has sample profile in determining profile counts. This will also be useful in inliner to clean up threshold updates. llvm-svn: 303204
* In debug builds non-trivial amount of time is spent in InstCombine processingDmitry Mikulin2017-05-161-1/+4
| | | | | | @llvm.dbg.* calls in visitCallInst(). They can be safely ignored. llvm-svn: 303202
* NewGVN: Only do something in verifyStoreExpressions if assertions are ↵Daniel Berlin2017-05-161-0/+2
| | | | | | enabled, to avoid unused code warnings. llvm-svn: 303201
* NewGVN: Fix PR 33051 by making sure we remove old store expressionsDaniel Berlin2017-05-161-27/+36
| | | | | | from the ExpressionToClass mapping. llvm-svn: 303200
OpenPOWER on IntegriCloud