summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [BinaryStream] Reduce the amount of boiler plate needed to use.Zachary Turner2017-05-174-4/+158
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Often you have an array and you just want to use it. With the current design, you have to first construct a `BinaryByteStream`, and then create a `BinaryStreamRef` from it. Worse, the `BinaryStreamRef` holds a pointer to the `BinaryByteStream`, so you can't just create a temporary one to appease the compiler, you have to actually hold onto both the `ArrayRef` as well as the `BinaryByteStream` *AND* the `BinaryStreamReader` on top of that. This makes for very cumbersome code, often requiring one to store a `BinaryByteStream` in a class just to circumvent this. At the cost of some added complexity (not exposed to users, but internal to the library), we can do better than this. This patch allows us to construct `BinaryStreamReaders` and `BinaryStreamWriters` directly from source data (e.g. `StringRef`, `MutableArrayRef<uint8_t>`, etc). Not only does this reduce the amount of code you have to type and make it more obvious how to use it, but it solves real lifetime issues when it's inconvenient to hold onto a `BinaryByteStream` for a long time. The additional complexity is in the form of an added layer of indirection. Whereas before we simply stored a `BinaryStream*` in the ref, we now store both a `BinaryStream*` **and** a `std::shared_ptr<BinaryStream>`. When the user wants to construct a `BinaryStreamRef` directly from an `ArrayRef` etc, we allocate an internal object that holds ownership over a `BinaryByteStream` and forwards all calls, and store this in the `shared_ptr<>`. This also maintains the ref semantics, as you can copy it by value and references refer to the same underlying stream -- the one being held in the object stored in the `shared_ptr`. Differential Revision: https://reviews.llvm.org/D33293 llvm-svn: 303294
* [X86][AVX512] Add 512-bit vector cttz costs + testsSimon Pilgrim2017-05-171-0/+6
| | | | llvm-svn: 303293
* Only enable LiveRangeShrink for x86.Dehao Chen2017-05-172-3/+1
| | | | | | | | | | | | | | Summary: Moving LiveRangeShrink to x86 as this pass is mostly useful for archtectures with great register pressure. Reviewers: MatzeB, qcolombet Reviewed By: qcolombet Subscribers: jholewinski, jyknight, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D33294 llvm-svn: 303292
* AMDGPU: Try to use op_sel when selecting packed instructionsMatt Arsenault2017-05-171-1/+29
| | | | | | | | | | | | Avoids instructions to pack a vector when the source is really a scalar being broadcast. Also be smarter and look for per-component fneg. Doesn't yet handle scalar from upper half of register or other swizzles. llvm-svn: 303291
* [WebAssembly][NFC] Update expected testsuite failures for newly passing testsJacob Gravelle2017-05-171-3/+0
| | | | | | | | | | | | Summary: r303050 fixes crashes when calling scalarizeMaskedMemIntrin pass from WebAssembly backend. This updates expected test failures for that. Reviewers: sbc100 Subscribers: jfb, llvm-commits, dschuff Differential Revision: https://reviews.llvm.org/D33295 llvm-svn: 303288
* AMDGPU: Use appropriate soffset for spillingMatt Arsenault2017-05-172-20/+20
| | | | | | | This needs to be the frame offset register, and not the global scratch wave offset register. For kernels, these are the same. llvm-svn: 303287
* Revert r303015, because it has the unintended side effect of breakingDimitry Andric2017-05-171-24/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | driver-mode recognition in clang (this is because the sysctl method always returns one and only one executable path, even for an executable with multiple links): Fix DynamicLibraryTest.cpp on FreeBSD and NetBSD Summary: After rL301562, on FreeBSD the DynamicLibrary unittests fail, because the test uses getMainExecutable("DynamicLibraryTests", Ptr), and since the path does not contain any slashes, retrieving the main executable will not work. Reimplement getMainExecutable() for FreeBSD and NetBSD using sysctl(3), which is more reliable than fiddling with relative or absolute paths. Also add retrieval of the original argv[] from the GoogleTest framework, to use as a fallback for other OSes. Reviewers: emaste, marsupial, hans, krytarowski Reviewed By: krytarowski Subscribers: krytarowski, llvm-commits Differential Revision: https://reviews.llvm.org/D33171 llvm-svn: 303285
* AMDGPU: Fix min3/max3 combines for f16/i16Matt Arsenault2017-05-173-2/+25
| | | | | | Fix missing instruction definitions for min3/max3. llvm-svn: 303284
* [X86][AVX512] Add 512-bit vector bitreverse costs + testsSimon Pilgrim2017-05-171-0/+18
| | | | llvm-svn: 303283
* Re-land r303274: "[CrashRecovery] Use SEH __try instead of VEH when available"Reid Kleckner2017-05-171-48/+71
| | | | | | | | | | | We have to check gCrashRecoveryEnabled before using __try. In other words, SEH works too well and we ended up recovering from crashes in implicit module builds that we weren't supposed to. Only libclang is supposed to enable CrashRecoveryContext to allow implicit module builds to crash. llvm-svn: 303279
* [GISel]: Fix undefined behavior in IRTranslatorAditya Nandakumar2017-05-171-0/+5
| | | | | | | | Make sure IRTranslator->MachineIRBuilder->DebugLoc doesn't outlive the DILocation. Clear it at the end of IRTranslator::runOnMachineFunction llvm-svn: 303277
* Revert "[CrashRecovery] Use SEH __try instead of VEH when available"Reid Kleckner2017-05-171-66/+48
| | | | | | This reverts commit r303274, it appears to break some clang tests. llvm-svn: 303275
* [CrashRecovery] Use SEH __try instead of VEH when availableReid Kleckner2017-05-171-48/+66
| | | | | | | | | | | | | | | | | | | | Summary: It avoids problems when other libraries raise exceptions. In particular, OutputDebugString raises an exception that the debugger is supposed to catch and suppress. VEH kicks in first right now, and that is entirely incorrect. Unfortunately, GCC does not support SEH, so I've kept the old buggy VEH codepath around. We could fix it with SetUnhandledExceptionFilter, but that is not per-thread, so a well-behaved library shouldn't set it. Reviewers: zturner Subscribers: llvm-commits, mgorny Differential Revision: https://reviews.llvm.org/D33261 llvm-svn: 303274
* Workaround for incorrect Win32 header on GCC.Zachary Turner2017-05-171-6/+4
| | | | llvm-svn: 303272
* [CodeView] Simplify the use of visiting type records & streams.Zachary Turner2017-05-177-50/+106
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is often a lot of boilerplate code required to visit a type record or type stream. The #1 use case is that you have a sequence of bytes that represent one or more records, and you want to deserialize each one, switch on it, and call a callback with the deserialized record that the user can examine. Currently this requires at least 6 lines of code: codeview::TypeVisitorCallbackPipeline Pipeline; Pipeline.addCallbackToPipeline(Deserializer); Pipeline.addCallbackToPipeline(MyCallbacks); codeview::CVTypeVisitor Visitor(Pipeline); consumeError(Visitor.visitTypeRecord(Record)); With this patch, it becomes one line of code: consumeError(codeview::visitTypeRecord(Record, MyCallbacks)); This is done by having the deserialization happen internally inside of the visitTypeRecord function. Since this is occasionally not desirable, the function provides a 3rd parameter that can be used to change this behavior. Hopefully this can significantly reduce the barrier to entry to using the visitation infrastructure. Differential Revision: https://reviews.llvm.org/D33245 llvm-svn: 303271
* [InstCombine] add isCanonicalPredicate() helper function and use it; NFCISanjay Patel2017-05-172-31/+32
| | | | | | | | | | | | | | | | | | | | There should be a slight efficiency improvement from handling icmp/fcmp with one matcher and reducing duplicated code. The larger motivation is that there are questions about how predicate canonicalization is handled, and the refactoring should make it easier if we want to change any of that behavior. 1. As noted in the code comment, we've chosen 3 of the 16 FCMP preds as not canonical. Why those 3? It goes back to rL32751 from what I can tell, but I'm not sure if there's a justification for that rule. 2. We currently do not canonicalize integer select conditions. Should we use the same rule that applies to branches for selects? 3. We currently do canonicalize some FP select conditions, and those rules would conflict with the rule shown here. Should one or both be changed? No-functional-change-intended, but adding tests anyway because there's no coverage for most of the predicates. Differential Revision: https://reviews.llvm.org/D33247 llvm-svn: 303261
* [PPC] Properly update register save area offsetsKrzysztof Parzyszek2017-05-171-9/+14
| | | | | | | | | | | | The variables MinGPR/MinG8R were not updated properly when resetting the offsets, which in the included testcase lead to saving the CR register in the same location as R30. This fixes another issue reported in PR26519. Differential Revision: https://reviews.llvm.org/D33017 llvm-svn: 303257
* [GlobalISel][X86] Support add i64 in IA32.Igor Breger2017-05-172-0/+71
| | | | | | | | | | | | | | Summary: support G_UADDE instruction selection. Reviewers: zvi, guyblank Reviewed By: guyblank Subscribers: rovka, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D33096 llvm-svn: 303255
* [SystemZ] Modelling of costs of divisions with a constant power of 2.Jonas Paulsson2017-05-171-1/+33
| | | | | | | | Such divisions will eventually be implemented with shifts which should be reflected in the cost function. Review: Ulrich Weigand llvm-svn: 303254
* Reland r303247: [ARM] GlobalISel: Remove dead instruction selection codeDiana Picus2017-05-171-15/+0
| | | | | | | | It only failed on llvm-clang-x86_64-expensive-checks-win, probably because the TableGen stuff hasn't been regenerated. Requires a clean build. llvm-svn: 303252
* [DWARF] - Cleanup relocations proccessing.George Rimar2017-05-171-39/+22
| | | | | | | | | | | | | | | RelocAddrMap was a pair of <width, address>, where width is relocation size (4/8/x, x < 8), and width field was never used in code. Relocations proccessing loop had checks for width field. Does not look like DWARF parser should do that. There is probably no much sense to validate relocations during proccessing them in parser. Patch removes relocation's width relative code from DWARFContext. Differential revision: https://reviews.llvm.org/D33194 llvm-svn: 303251
* Revert "[ARM] GlobalISel: Remove dead instruction selection code"Diana Picus2017-05-171-0/+15
| | | | | | | This reverts commit r303247 because the tests are failing on some bots. Sorry! llvm-svn: 303249
* [ARM] GlobalISel: Remove dead instruction selection codeDiana Picus2017-05-171-15/+0
| | | | | | | We can now generate code for selecting G_ADD, G_SUB and G_MUL. Remove the hand-written versions. llvm-svn: 303247
* [Sparc] Remove execute permissions from non-executable text filesDaniel Cederman2017-05-173-0/+0
| | | | | | | | | | | | Reviewers: jyknight, lero_chris, venkatra Reviewed By: jyknight Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27127 llvm-svn: 303245
* [RuntimeDyld] Fix debug section relocation (pr20457)Pavel Labath2017-05-171-3/+7
| | | | | | | | | | | | | | | | | | Summary: Debug info sections, (or non-SHF_ALLOC sections in general) should be linked as if their load address was zero to emulate the behavior of the static linker. This bug was discovered because it was breaking lldb expression evaluation on linux. Reviewers: lhames Subscribers: aprantl, eugene, clayborg, lldb-commits, llvm-commits Differential Revision: https://reviews.llvm.org/D32899 llvm-svn: 303239
* Make sure -optimize-regalloc=false is used correctly by user.Jonas Paulsson2017-05-171-10/+14
| | | | | | | | | | | Don't allow -optimize-regalloc=false with -regalloc given for anything other than 'fast'. The other register allocators depend on the supporting passes added by addOptimizedRegAlloc(). Reviewers: Quentin Colombet, Matthias Braun https://reviews.llvm.org/D33181 llvm-svn: 303238
* [SCEV] Always sort AddRecExprs from different loops by dominanceMax Kazantsev2017-05-171-7/+7
| | | | | | | | | | | | | | Sorting of AddRecExprs by loop nesting does not make sense since we only invoke the CompareSCEVComplexity for AddRecExprs that are used by one SCEV. This guarantees that there is always a dominance relationship between them. This patch removes the sorting by nesting which is a dead code in current usage of this function. Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D33228 llvm-svn: 303235
* [SCEV][NFC] Replace redundant dyn_cast with cast in getAddExprMax Kazantsev2017-05-171-14/+15
| | | | | | | | Replace dyn_cast which is ensured by isa just one line above with cast. Differential Revision: https://reviews.llvm.org/D33231 llvm-svn: 303234
* [coroutines] Handle spills before catchswitchGor Nishanov2017-05-171-2/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | If we need to spill the result of the PHI instruction, we insert the spill after all of the PHIs and EHPads, however, in a catchswitch block there is no room to insert the spill. Make room by splitting away catchswitch into a separate block. Before the fix: catch.dispatch: %val = phi i32 [ 1, %if.then ], [ 2, %if.else ] %switch = catchswitch within none [label %catch] unwind label %cleanuppad After: catch.dispatch: %val = phi i32 [ 1, %if.then ], [ 2, %if.else ] %tok = cleanuppad within none [] ; spill goes here cleanupret from %tok unwind label %catch.dispatch.switch catch.dispatch.switch: %switch = catchswitch within none [label %catch] unwind label %cleanuppad https://reviews.llvm.org/D31846 llvm-svn: 303232
* BitVector: add iterators for set bitsFrancis Visoiu Mistrih2017-05-1715-50/+36
| | | | | | Differential revision: https://reviews.llvm.org/D32060 llvm-svn: 303227
* [ADT] Fix some Clang-tidy modernize-use-using warnings; other minor fixes (NFC).Eugene Zelenko2017-05-161-9/+28
| | | | llvm-svn: 303221
* Fix for compilers with older CRT header libraries.Zachary Turner2017-05-161-1/+6
| | | | llvm-svn: 303220
* [Support] Ignore OutputDebugString exceptions in our crash recovery.Zachary Turner2017-05-161-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since we use AddVectoredExceptionHandler, we get notified of every exception that gets raised by a program. Sometimes these are not necessarily errors though, and this can be especially true when linking against a library that we have no control over, and may raise an exception internally which it intends to catch. In particular, the Windows API OutputDebugString does exactly this. It raises an exception inside of a __try / __except, giving the debugger a chance to handle the exception to print the message to the debug console. But this doesn't interoperate nicely with our vectored exception handler, which just sees another exception and decides that we need to terminate the program. Add a special case for this so that we ignore ODS exceptions and continue normally. Note that a better fix is to simply not use vectored exception handlers and use SEH instead, but given that MinGW doesn't support SEH, this is the only solution for MinGW. Differential Revision: https://reviews.llvm.org/D33260 llvm-svn: 303219
* [IR] Prefer use_empty() to !hasNUsesOrMore(1) for clarity.Davide Italiano2017-05-162-2/+2
| | | | llvm-svn: 303218
* [InstSimplify] add folds for constant mask of value shifted by constantSanjay Patel2017-05-161-0/+18
| | | | | | | | | | | | | | | | | We would eventually catch these via demanded bits and computing known bits in InstCombine, but I think it's better to handle the simple cases as soon as possible as a matter of efficiency. This fold allows further simplifications based on distributed ops transforms. eg: %a = lshr i8 %x, 7 %b = or i8 %a, 2 %c = and i8 %b, 1 InstSimplify can directly fold this now: %a = lshr i8 %x, 7 Differential Revision: https://reviews.llvm.org/D33221 llvm-svn: 303213
* The patch exclude a case from zero check skip inEvgeny Stupachenko2017-05-161-7/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CTLZ idiom recognition (r303102). Summary: The following case: i = 1; if(n) while (n >>= 1) i++; use(i); Was converted to: i = 1; if(n) i += builtin_ctlz(n >> 1, false); use(i); Which is not correct. The patch make it: i = 1; if(n) i += builtin_ctlz(n >> 1, true); use(i); From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 303212
* Re-commit r302678, fixing PR33053.Amara Emerson2017-05-164-269/+101
| | | | | | | The issue was that the AArch64 TTI hook allowed unpacked integer cmp reductions which didn't have a lowering. llvm-svn: 303211
* [Inliner] Do not mix callsite and callee hotness based updates.Easwaran Raman2017-05-161-15/+27
| | | | | | | | | | Update threshold based on callee's hotness only when BFI is not available. Otherwise use only callsite's hotness. This makes it easier to reason about hotness related threshold updates. Differential revision: https://reviews.llvm.org/D33157 llvm-svn: 303210
* [PPC] Lower load acquire/seq_cst trailing fence to cmp + bne + isync.Tim Shen2017-05-165-8/+67
| | | | | | | | | | | | | | | | | Summary: This fixes pr32392. The lowering pipeline is: llvm.ppc.cfence in IR -> PPC::CFENCE8 in isel -> Actual instructions in expandPostRAPseudo. The reason why expandPostRAPseudo is chosen is because previous passes are likely eliminating instructions like cmpw 3, 3 (early CSE) and bne- 7, .+4 (some branch pass(s)). Differential Revision: https://reviews.llvm.org/D32763 llvm-svn: 303205
* Add hasProfileSummary and has{Sample|Instrumentation}Profile methodsEaswaran Raman2017-05-161-1/+1
| | | | | | | | ProfileSummaryInfo already checks whether the module has sample profile in determining profile counts. This will also be useful in inliner to clean up threshold updates. llvm-svn: 303204
* In debug builds non-trivial amount of time is spent in InstCombine processingDmitry Mikulin2017-05-161-1/+4
| | | | | | @llvm.dbg.* calls in visitCallInst(). They can be safely ignored. llvm-svn: 303202
* NewGVN: Only do something in verifyStoreExpressions if assertions are ↵Daniel Berlin2017-05-161-0/+2
| | | | | | enabled, to avoid unused code warnings. llvm-svn: 303201
* NewGVN: Fix PR 33051 by making sure we remove old store expressionsDaniel Berlin2017-05-161-27/+36
| | | | | | from the ExpressionToClass mapping. llvm-svn: 303200
* Revert "[X86] Replace slow LEA instructions in X86"Reid Kleckner2017-05-164-237/+43
| | | | | | | This reverts commit r303183, it broke various buildbots and introduced sanitizer errors. llvm-svn: 303199
* Elide stores which are overwritten without being observed.Nirav Dave2017-05-161-7/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: In SelectionDAG, when a store is immediately chained to another store to the same address, elide the first store as it has no observable effects. This is causes small improvements dealing with intrinsics lowered to stores. Test notes: * Many testcases overwrite store addresses multiple times and needed minor changes, mainly making stores volatile to prevent the optimization from optimizing the test away. * Many X86 test cases optimized out instructions associated with associated with va_start. * Note that test_splat in CodeGen/AArch64/misched-stp.ll no longer has dependencies to check and can probably be removed and potentially replaced with another test. Reviewers: rnk, john.brawn Subscribers: aemerson, rengolin, qcolombet, jyknight, nemanjai, nhaehnle, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D33206 llvm-svn: 303198
* ShrinkWrap: Add skipFunction() callMatthias Braun2017-05-161-1/+1
| | | | | | | ShrinkWrapping is a performance optimization that can safely be skipped, so we can add `if (!skipFunction()) return;` llvm-svn: 303197
* [MetadataLoader] Remove unused Vector. NFCI.Davide Italiano2017-05-161-1/+1
| | | | llvm-svn: 303196
* Revert "[ARM] Mark LEApcrel instructions as isAsCheapAsAMove"Renato Golin2017-05-163-4/+4
| | | | | | | | | | | | | | | | | Revert "[ARM] Mark LEApcrel as not having side effects" This reverts commit r303054 and r303053, as they broke the ARM self-hosting buildbots: http://lab.llvm.org:8011/builders/clang-cmake-thumbv7-a15-full-sh/builds/1550 http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15-selfhost-neon/builds/1349 http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15-selfhost/builds/1845 Offline investigation on course. llvm-svn: 303193
* [AMDGPU] Use GCNRPTracker dumper methods in schedulerStanislav Mekhanoshin2017-05-163-18/+21
| | | | | | Differential Revision: https://reviews.llvm.org/D33244 llvm-svn: 303186
* [AMDGPU] Cache live-ins and register pressure in schedulerStanislav Mekhanoshin2017-05-162-75/+154
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Using LIS can be quite expensive, so caching of calculated region live-ins and pressure is implemented. It does two things: 1. Caches the info for the second stage when we schedule with decreased target occupancy. 2. Tracks the basic block from top to bottom thus eliminating the need to scan whole register file liveness at every region split in the middle of the block. The scheduling is now done in 3 stages instead of two, with the first one being really a no-op and only used to collect scheduling regions as sent by the scheduler driver. There is no functional change to the current behavior, only compilation speed is affected. In general computeBlockPressure() could be simplified if we switch to backward RP tracker, because scheduler sends regions within a block starting from the last upward. We could use a natural order of upward tracker to seamlessly change between regions of the same block, since live reg set of a previous tracked region would become a live-out of the next region. That however requires fixing upward tracker to properly account defs and uses of the same instruction as both are contributing to the current pressure. When we converge on the produced pressure we should be able to switch between them back and forth. In addition, backward tracker is less expensive as it uses LIS in recede less often than forward uses it in advance. At the moment the worst known case compilation time has improved from 26 minutes to 8.5. Differential Revision: https://reviews.llvm.org/D33117 llvm-svn: 303184
OpenPOWER on IntegriCloud