path: root/llvm/lib
Commit message | Author | Age | Files | Lines
...
* [LoopStrengthReduce] Don't neglect the Fixup.Offset in isAMCompletelyFolded(). | Jonas Paulsson | 2017-08-09 | 1 | -2/+2
    In the recursive call to isAMCompletelyFolded(), the passed offset should be the sum of F.BaseOffset and Fixup.Offset.
    Review: Quentin Colombet.
    llvm-svn: 310462
* [mips] PR34083 - Wimplicit-fallthrough warning in MipsAsmParser.cpp | Simon Dardis | 2017-08-09 | 1 | -6/+7
    Assert that a binary expression is actually a binary expression, rather than potentially incorrectly attempting to handle it as a unary expression.
    This resolves PR34083.
    Thanks to Simon Pilgrim for reporting the issue!
    llvm-svn: 310460
* Suppress a warning. NFC. | Gabor Horvath | 2017-08-09 | 1 | -1/+1
    llvm-svn: 310459
* [AsmParser] Hash is not a comment on some targets | Oliver Stannard | 2017-08-09 | 2 | -18/+0
    The '#' token is not a comment for all targets (on ARM and AArch64 it marks an immediate operand), so we shouldn't treat it as such.
    Comments are already converted to AsmToken::EndOfStatement by AsmLexer::LexLineComment, so this check was unnecessary.
    Differential Revision: https://reviews.llvm.org/D36405
    llvm-svn: 310457
* [LCG] Completely remove the map-based association of post-order numbers | Chandler Carruth | 2017-08-09 | 1 | -35/+34
    to Nodes when removing ref edges from a RefSCC.
    This map based association turns out to be pretty expensive for large RefSCCs and pointless as we already have embedded data members inside nodes that we use to track the DFS state. We can reuse one of those and the map becomes unnecessary.
    This also fuses the update of those numbers into the scan across the pending stack of nodes so that we don't walk the nodes twice during the DFS.
    With this I expect the new PM to be faster than the old PM for the test case I have been optimizing. That said, it also seems simpler and more direct in many ways. The side storage was always pretty awkward.
    The last remaining hot-spot in the profile of the LCG once this is done will be the edge iterator walk in the DFS. I'll take a look at improving that next.
    llvm-svn: 310456
* [GlobalOpt] Switch an explicit loop to llvm::all_of(). NFCI. | Davide Italiano | 2017-08-09 | 1 | -5/+2
    llvm-svn: 310453
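As a rough illustration of this kind of cleanup (not the actual GlobalOpt code), the sketch below rewrites a hand-written loop with an early exit as a call to std::all_of, the standard algorithm that llvm::all_of wraps for whole ranges. The function names and the predicate are hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical predicate standing in for the per-element check.
static bool isNonNegative(int V) { return V >= 0; }

// Before: explicit loop with an early exit on the first failing element.
bool allNonNegativeLoop(const std::vector<int> &Values) {
  for (int V : Values)
    if (!isNonNegative(V))
      return false;
  return true;
}

// After: the same check expressed with std::all_of.
bool allNonNegativeAlgo(const std::vector<int> &Values) {
  return std::all_of(Values.begin(), Values.end(), isNonNegative);
}
```

Both functions return true for an empty input, so the rewrite does not change behavior at that edge case.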
* [LCG] Special case when removing a ref edge from a RefSCC leaves | Chandler Carruth | 2017-08-09 | 1 | -12/+29
    that RefSCC still connected.
    This is common and can be handled much more efficiently. As soon as we know we've covered every node in the RefSCC with the DFS, we can simply reset our state and return. This avoids numerous data structure updates and other complexity.
    On top of other changes, this appears to get new PM back to parity with the old PM for a large protocol buffer message source code. The dense map updates are very hot in this function.
    llvm-svn: 310451
* [LCG] Switch one of the update methods for the LazyCallGraph to support | Chandler Carruth | 2017-08-09 | 2 | -155/+119
    limited batch updates.
    Specifically, allow removing multiple reference edges starting from a common source node. There are a few constraints that play into supporting this form of batching:
    1) The way updates occur during the CGSCC walk, about the most we can functionally batch together are those with a common source node. This also makes the batching simpler to implement, so it seems a worthwhile restriction.
    2) The far and away hottest function for large C++ files I measured (generated code for protocol buffers) showed a huge amount of time was spent removing ref edges specifically, so it seems worth focusing there.
    3) The algorithm for removing ref edges is very amenable to this restricted batching. There are just both API and implementation special casing for the non-batch case that gets in the way. Once removed, supporting batches is nearly trivial.
    This does modify the API in an interesting way -- now, we only preserve the target RefSCC when the RefSCC structure is unchanged. In the face of any splits, we create brand new RefSCC objects. However, all of the users were OK with it that I could find. Only the unittest needed interesting updates here.
    How much does batching these updates help? I instrumented the compiler when run over a very large generated source file for a protocol buffer and found that the majority of updates are intrinsically updating one function at a time. However, nearly 40% of the total ref edges removed are removed as part of a batch of removals greater than one, so these are the cases batching can help with.
    When compiling the IR for this file with 'opt' and 'O3', this patch reduces the total time by 8-9%.
    Differential Revision: https://reviews.llvm.org/D36352
    llvm-svn: 310450
* [X86] Add the rest of the ADC and SBB instructions to isDefConvertible. | Craig Topper | 2017-08-09 | 1 | -6/+10
    I don't know if this really affects anything. Just thought it was weird that we had all of the ADD/SUB/AND/OR/XOR instructions.
    llvm-svn: 310447
* [InstCombine] Use regular dyn_cast instead of a matcher for a simple case. NFC | Craig Topper | 2017-08-09 | 1 | -2/+2
    llvm-svn: 310446
* [ImplicitNullCheck] Fix the bug when dependent instruction accesses memory | Serguei Katkov | 2017-08-09 | 1 | -1/+3
    It is possible that a dependent instruction accesses memory. In this case we must reject the optimization because the memory change will be visible in the null handler basic block. Otherwise we would execute an instruction which we must not execute if the check fails.
    Reviewers: sanjoy, reames
    Reviewed By: sanjoy
    Subscribers: llvm-commits
    Differential Revision: https://reviews.llvm.org/D36392
    llvm-svn: 310443
* [PDB] Fix an issue writing the publics stream. | Zachary Turner | 2017-08-09 | 2 | -17/+11
    In the refactor to merge the publics and globals stream, a bug was introduced that wrote the wrong value for one of the fields of the PublicsStreamHeader. This caused debugging in WinDbg to break.
    We had no way of dumping any of these fields, so in addition to fixing the bug I've added dumping support for them along with a test that verifies the correct value is written.
    llvm-svn: 310439
* [PDB] Merge Global and Publics Builders. | Zachary Turner | 2017-08-09 | 6 | -343/+326
    The publics stream and globals stream are very similar. They both contain a list of hash buckets that refer into a single shared stream, the symbol record stream. Because of the need for each builder to manage both an independent hash stream as well as a single shared record stream, making the two builders be independent entities is not the right design. This patch merges them into a single class, of which only a single instance is needed to create all 3 streams. PublicsStreamBuilder and GlobalsStreamBuilder are now merged into the single GSIStreamBuilder class, which writes all 3 streams at once.
    Note that this patch does not contain any functionality change. So we're still not yet writing any records to the globals stream. All we're doing is making it so that when we do start writing records to the globals, this refactor won't have to be part of that patch.
    Differential Revision: https://reviews.llvm.org/D36489
    llvm-svn: 310438
* [AMDGPU] Revert r310429 changes in AMDKernelCodeT.h which broke some build bots. | Eugene Zelenko | 2017-08-09 | 1 | -20/+28
    llvm-svn: 310430
* [AMDGPU] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). | Eugene Zelenko | 2017-08-08 | 10 | -193/+300
    llvm-svn: 310429
* Revert "[GlobalISel] Remove the GISelAccessor API." | Quentin Colombet | 2017-08-08 | 8 | -75/+200
    This reverts commit r310115.
    It causes a linker failure for one of the AArch64 unittests on one of the Linux bots:
    http://lab.llvm.org:8011/builders/clang-ppc64le-linux-multistage/builds/3429
    : && /home/fedora/gcc/install/gcc-7.1.0/bin/g++ -fPIC -fvisibility-inlines-hidden -Werror=date-time -std=c++11 -Wall -W -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wno-maybe-uninitialized -Wdelete-non-virtual-dtor -Wno-comment -ffunction-sections -fdata-sections -O2 -L/home/fedora/gcc/install/gcc-7.1.0/lib64 -Wl,-allow-shlib-undefined -Wl,-O3 -Wl,--gc-sections unittests/Target/AArch64/CMakeFiles/AArch64Tests.dir/InstSizes.cpp.o -o unittests/Target/AArch64/AArch64Tests lib/libLLVMAArch64CodeGen.so.6.0.0svn lib/libLLVMAArch64Desc.so.6.0.0svn lib/libLLVMAArch64Info.so.6.0.0svn lib/libLLVMCodeGen.so.6.0.0svn lib/libLLVMCore.so.6.0.0svn lib/libLLVMMC.so.6.0.0svn lib/libLLVMMIRParser.so.6.0.0svn lib/libLLVMSelectionDAG.so.6.0.0svn lib/libLLVMTarget.so.6.0.0svn lib/libLLVMSupport.so.6.0.0svn -lpthread lib/libgtest_main.so.6.0.0svn lib/libgtest.so.6.0.0svn -lpthread -Wl,-rpath,/home/buildbots/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage1/lib && :
    unittests/Target/AArch64/CMakeFiles/AArch64Tests.dir/InstSizes.cpp.o:(.toc+0x0): undefined reference to `vtable for llvm::LegalizerInfo'
    unittests/Target/AArch64/CMakeFiles/AArch64Tests.dir/InstSizes.cpp.o:(.toc+0x8): undefined reference to `vtable for llvm::RegisterBankInfo'
    The particularity of this bot is that it is built with BUILD_SHARED_LIBS=ON.
    However, I was not able to reproduce the problem so far. Reverting to unblock the bot.
    llvm-svn: 310425
* My commit r310346 introduced some valid warnings. This cleans them up. | Nemanja Ivanovic | 2017-08-08 | 1 | -6/+9
    llvm-svn: 310424
* [MachineOutliner] Ensure AArch64 outliner doesn't mess with W30 or LR | Jessica Paquette | 2017-08-08 | 1 | -6/+7
    Before, the outliner would mark all instructions that read from/modify LR as illegal. This doesn't handle W30, which overlaps with LR. This shouldn't be outlined.
    This commit fixes that by making modifiesRegister() and readsRegister() look at W30 + take in a TRI argument. This makes sure that modifiesRegister() and readsRegister() won't outline either of W30 and LR.
    https://reviews.llvm.org/D36435
    llvm-svn: 310422
* [GVN] Remove stale entries in phitranslate cache when new phi is generated for PRE | Wei Mi | 2017-08-08 | 1 | -0/+14
    When a new phi is generated for scalarpre of an expression, the phiTranslate cache will become stale: Before PRE, the candidate expression must not be available in a predecessor block, and phitranslate will cache the information. After PRE, the expression will become available in all predecessor blocks, so the related entries in the phiTranslate cache become stale. The patch will simply remove the stale entries so phiTranslate can be recomputed next time.
    The stale entries in the phitranslate cache will not affect correctness but will cause missed PRE opportunities for later instructions.
    Differential Revision: https://reviews.llvm.org/D36124
    llvm-svn: 310421
* BasicAA: assert on another case where aliasGEP shouldn't get a PartialAlias response | Nuno Lopes | 2017-08-08 | 1 | -1/+3
    llvm-svn: 310420
* Make ICP use PSI to check for hotness. | Dehao Chen | 2017-08-08 | 2 | -18/+21
    Summary: Currently, ICP checks the count against a fixed value to see if it is hot enough to be promoted. This does not work for SamplePGO because the sampled count may be much smaller. This patch uses PSI to check if the count is hot enough to be promoted.
    Reviewers: davidxl, tejohnson, eraman
    Reviewed By: davidxl
    Subscribers: sanjoy, llvm-commits, mehdi_amini
    Differential Revision: https://reviews.llvm.org/D36341
    llvm-svn: 310416
* [codeview] Emit nested enums and typedefs from classes | Reid Kleckner | 2017-08-08 | 1 | -4/+6
    Previously we limited ourselves to only emitting nested classes, but we need other kinds of types as well.
    This fixes the Visual Studio STL visualizers, so that users can visualize std::string and other objects.
    llvm-svn: 310410
* [InstCombine] Support pulling left shifts through a subtract with constant LHS | Craig Topper | 2017-08-08 | 1 | -0/+14
    We already support pulling through an add with constant RHS. We can do the same for subtract.
    Differential Revision: https://reviews.llvm.org/D36443
    llvm-svn: 310407
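The algebra behind this fold can be checked with plain modular arithmetic: for fixed-width unsigned integers, (C - X) << S equals (C << S) - (X << S). The sketch below is only an illustration of that identity on 32-bit values; the function names are hypothetical, and the conditions under which InstCombine actually applies the transform are not modeled here.

```cpp
#include <cassert>
#include <cstdint>

// (C - X) << S, computed directly.
uint32_t shiftOfSub(uint32_t C, uint32_t X, unsigned S) {
  assert(S < 32 && "shift amount must be smaller than the bit width");
  return (C - X) << S;
}

// (C << S) - (X << S): the shift pulled through the subtract. When C is a
// constant, C << S can be folded ahead of time.
uint32_t subOfShifts(uint32_t C, uint32_t X, unsigned S) {
  assert(S < 32 && "shift amount must be smaller than the bit width");
  return (C << S) - (X << S);
}
```

Because unsigned arithmetic wraps modulo 2^32, the two functions agree even when C - X underflows.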
* [DAG] Introduce peekThroughBitcast function. NFCI. | Nirav Dave | 2017-08-08 | 1 | -23/+14
    llvm-svn: 310405
* [DAG] Update comments. NFC. | Nirav Dave | 2017-08-08 | 1 | -8/+9
    llvm-svn: 310404
* [AMDGPU] Add llvm.amdgpu.update.dpp intrinsic | Connor Abbott | 2017-08-08 | 1 | -0/+8
    Summary: Now that we've made all the necessary backend changes, we can add a new intrinsic which exposes the new capabilities to IR producers. Since llvm.amdgpu.update.dpp is a strict superset of llvm.amdgpu.mov.dpp, we should deprecate the latter. We also add tests for all the functionality that was added in previous changes, now that we can access it via an IR construct.
    Reviewers: tstellar, arsenm
    Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
    Differential Revision: https://reviews.llvm.org/D34718
    llvm-svn: 310399
* [NewGVN] Use a cast instead of a dyn_cast. | Chad Rosier | 2017-08-08 | 1 | -1/+3
    Differential Revision: https://reviews.llvm.org/D36478
    llvm-svn: 310397
* [PDB] Fix linking of function symbols and local variables. | Zachary Turner | 2017-08-08 | 1 | -8/+39
    The compiler outputs PROC32_ID symbols into the object files for functions, and these symbols have an embedded type index which, when copied to the PDB, refer to the IPI stream. However, the symbols themselves are also converted into regular symbols (e.g. S_GPROC32_ID -> S_GPROC32), and type indices in the regular symbol records refer to the TPI stream. So this patch applies two fixes to function records.
    1. It converts ID symbols to the proper non-ID record type.
    2. After remapping the type index from the object file's index space to the PDB file/IPI stream's index space, it then remaps that index to the TPI stream's index space.
    Besides functions, during the remapping process we were also discarding symbol record types which we did not recognize. In particular, we were discarding S_BPREL32 records, which is what MSVC uses to describe local variables on the stack. So this patch fixes that as well by copying them to the PDB.
    Differential Revision: https://reviews.llvm.org/D36426
    llvm-svn: 310394
* [LoopVectorize] Fix assertion failure in Fcmp vectorization | Anna Thomas | 2017-08-08 | 1 | -1/+3
    Summary: When vectorizing fcmps we can trip on an incorrect cast assertion when setting the FastMathFlags after generating the vectorized FCmp. This can happen if the FCmp can be folded to true or false directly. The fix here is to set the FastMathFlag using the FastMathFlagBuilder *before* creating the FCmp Instruction. This is what's done by other optimizations such as InstCombine.
    Added a test case which trips on the cast assertion without this patch.
    Reviewers: Ayal, mssimpso, mkuper, gilr
    Reviewed by: Ayal, mssimpso
    Subscribers: llvm-commits, mzolotukhin
    Differential Revision: https://reviews.llvm.org/D36244
    llvm-svn: 310389
* Revert "[ARM] Fix assembly and disassembly for VMRS/VMSR" | Tim Northover | 2017-08-08 | 3 | -73/+36
    This reverts r310243. Only MVFR2 is actually restricted to v8 and it'll be a little while before we can get a proper fix together. Better that we allow incorrect code than reject correct code in the meantime.
    llvm-svn: 310384
* [KnownBits][ValueTracking] Move the math for calculating known bits for add/sub into a static method in KnownBits object | Craig Topper | 2017-08-08 | 3 | -41/+67
    I want to reuse this code in SimplifyDemandedBits handling of Add/Sub. This will make that easier.
    Wonder if we should use it in SelectionDAG's computeKnownBits too.
    Differential Revision: https://reviews.llvm.org/D36433
    llvm-svn: 310378
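For readers unfamiliar with known-bits arithmetic, here is a simplified sketch of one common way to compute the known bits of a sum. It is not the code moved by this patch: the Known32 struct and knownBitsOfAdd function are hypothetical, the width is fixed at 32 bits, and the incoming carry is assumed to be zero.

```cpp
#include <cstdint>

// Hypothetical 32-bit stand-in for a known-bits pair: a set bit in Zero means
// that bit is known to be 0, a set bit in One means it is known to be 1.
struct Known32 {
  uint32_t Zero;
  uint32_t One;
};

// Known bits of LHS + RHS, assuming no incoming carry.
Known32 knownBitsOfAdd(Known32 LHS, Known32 RHS) {
  // Sum with every unknown bit set to 1, and sum with every unknown bit set
  // to 0 (both wrap modulo 2^32).
  uint32_t PossibleSumZero = ~LHS.Zero + ~RHS.Zero;
  uint32_t PossibleSumOne = LHS.One + RHS.One;

  // Positions where the carry into the bit is known to be 0 or known to be 1.
  uint32_t CarryKnownZero = ~(PossibleSumZero ^ LHS.Zero ^ RHS.Zero);
  uint32_t CarryKnownOne = PossibleSumOne ^ LHS.One ^ RHS.One;

  // A result bit is known only where both operand bits and the carry-in are
  // all known.
  uint32_t Known = (LHS.Zero | LHS.One) & (RHS.Zero | RHS.One) &
                   (CarryKnownZero | CarryKnownOne);

  Known32 Out;
  Out.Zero = ~PossibleSumZero & Known;
  Out.One = PossibleSumOne & Known;
  return Out;
}
```

For example, if both operands are only known to have their two low bits clear, the result is known to have its two low bits clear and nothing else; if both operands are fully known, the result is fully known.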
* [RISCV] Fix warning about unused getSubtargetFeatureName() | Alex Bradbury | 2017-08-08 | 1 | -1/+0
    llvm-svn: 310375
* BasicAA: aliasGEP shouldn't get a PartialAlias response here | Nuno Lopes | 2017-08-08 | 1 | -1/+3
    add an assert() to ensure that's the case (as I'm not convinced it won't happen)
    llvm-svn: 310373
* [DAGCombiner] simplifyShuffleMask - handle UNDEF inputs from shuffles as well as BUILD_VECTOR | Simon Pilgrim | 2017-08-08 | 1 | -11/+10
    Minor extension to D36393
    llvm-svn: 310372
* [RISCV] Add basic RISCVAsmParser (missing files) | Alex Bradbury | 2017-08-08 | 3 | -0/+399
    This commit adds the files missing from rL310361. Apologies for the noise.
    Differential Revision: https://reviews.llvm.org/D23563
    llvm-svn: 310363
* [RISCV] Add basic RISCVAsmParser | Alex Bradbury | 2017-08-08 | 4 | -2/+19
    This doesn't yet support parsing things like %pcrel_hi(foo), but will handle basic instructions with register or immediate operands.
    Differential Revision: https://reviews.llvm.org/D23563
    llvm-svn: 310361
* [PowerPC] Don't crash on larger splats achieved through 1-byte splats | Nemanja Ivanovic | 2017-08-08 | 1 | -0/+9
    We've implemented a 1-byte splat using XXSPLTISB on P9. However, LLVM will produce a 1-byte splat even for wider element BUILD_VECTOR nodes. This patch prevents crashing in that situation.
    Differential Revision: https://reviews.llvm.org/D35650
    llvm-svn: 310358
* Appease compilers that have the -Wcovered-switch-default switch. | Nemanja Ivanovic | 2017-08-08 | 1 | -2/+5
    llvm-svn: 310356
* [X86] Improved X86::CMOV to Branch heuristic. | Amjad Aboud | 2017-08-08 | 1 | -4/+18
    Resolved PR33954.
    This patch contains two more constraints that aim to reduce the noise cases where we convert CMOV into branch for small gain, and end up spending more cycles due to overhead.
    Differential Revision: https://reviews.llvm.org/D36081
    llvm-svn: 310352
* [PowerPC] Eliminate compares - add i32 sext/zext handling for SETLE/SETGE | Nemanja Ivanovic | 2017-08-08 | 1 | -0/+126
    Adds handling for SETLE/SETGE comparisons on i32 values. Furthermore, it adds the handling for the special case where RHS == 0.
    Differential Revision: https://reviews.llvm.org/D34048
    llvm-svn: 310346
* [DAGCombiner] Simplify shuffle mask index if the referenced input element is UNDEF | Simon Pilgrim | 2017-08-08 | 1 | -0/+36
    Fixes one of the cases in PR34041.
    Differential Revision: https://reviews.llvm.org/D36393
    llvm-svn: 310344
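The idea can be sketched outside of SelectionDAG with a toy model: if a shuffle mask entry selects an input element that is itself undefined, the mask entry can be replaced by the undef marker. Everything below is a hypothetical stand-in (the function name, the per-element undef flags, and the use of -1 for an undef lane are assumptions for illustration), not the DAGCombiner implementation.

```cpp
#include <cstddef>
#include <vector>

// Toy model: each shuffle input is described by per-element flags saying
// whether that element is undefined. Mask entries index into the
// concatenation of the two inputs; -1 denotes an undef lane.
// Assumes both inputs have the same length and every non-negative mask
// entry is smaller than twice that length.
std::vector<int> simplifyMask(const std::vector<int> &Mask,
                              const std::vector<bool> &Undef0,
                              const std::vector<bool> &Undef1) {
  const int NumElts = static_cast<int>(Undef0.size());
  std::vector<int> Result = Mask;
  for (std::size_t I = 0; I < Result.size(); ++I) {
    int M = Result[I];
    if (M < 0)
      continue; // Lane is already undef.
    bool IsUndef = M < NumElts ? Undef0[M] : Undef1[M - NumElts];
    if (IsUndef)
      Result[I] = -1; // Referenced input element is undef, so the lane is too.
  }
  return Result;
}
```

With four-element inputs whose element 1 is undef in both, the mask {0, 5, -1, 2} simplifies to {0, -1, -1, 2}: index 5 refers to element 1 of the second input.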
* [globalisel][tablegen] Add support for importing 'imm' operands. | Daniel Sanders | 2017-08-08 | 1 | -1/+2
    Summary: This patch enables the import of rules containing 'imm' operands that do not constrain the acceptable values using predicates. Support for ImmLeaf will arrive in a later patch.
    Depends on D35681
    Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
    Reviewed By: rovka
    Subscribers: kristof.beyls, javed.absar, igorb, llvm-commits
    Differential Revision: https://reviews.llvm.org/D35833
    llvm-svn: 310343
* [PM] Fix a likely more critical infloop bug in the CGSCC pass manager. | Chandler Carruth | 2017-08-08 | 1 | -3/+10
    This was just a bad oversight on my part. The code in question should never have worked without this fix. But it turns out, there are relatively few places that involve libfunctions that participate in a single SCC, and unless they do, this happens to not matter.
    The effect of not having this correct is that each time through this routine, the edge from write_wrapper to write was toggled between a call edge and a ref edge. First time through, it becomes a demoted call edge and is turned into a ref edge. Next time it is a promoted call edge from a ref edge. On, and on it goes forever.
    I've added the asserts which should have always been here to catch silly mistakes like this in the future as well as a test case that will actually infloop without the fix.
    The other (much scarier) infinite-inlining issue I think didn't actually occur in practice, and I simply misdiagnosed this minor issue as that much more scary issue. The other issue *is* still a real issue, but I'm somewhat relieved that so far it hasn't happened in real-world code yet...
    llvm-svn: 310342
* [InstCombine] Cast to BinaryOperator earlier in foldSelectIntoOp to simplify the code. | Craig Topper | 2017-08-08 | 1 | -14/+10
    We no longer need the explicit operand count check or the later dynamic cast.
    llvm-svn: 310339
* AMDGPU: Fix warnings introduced by r310336 | Tom Stellard | 2017-08-08 | 1 | -4/+2
    llvm-svn: 310337
* AMDGPU: Move R600 parts of AMDGPUISelDAGToDAG into their own class | Tom Stellard | 2017-08-08 | 3 | -112/+184
    Summary: This refactoring is required in order to split the R600 and GCN tablegen files.
    Reviewers: arsenm
    Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, llvm-commits, t-tye
    Differential Revision: https://reviews.llvm.org/D36286
    llvm-svn: 310336
* [PM] Fix new LoopUnroll function pass by invalidating loop analysis | Chandler Carruth | 2017-08-08 | 1 | -2/+18
    results when a loop is completely removed.
    This is very hard to manifest as a visible bug. You need to arrange for there to be a subsequent allocation of a 'Loop' object which gets the exact same address as the one which the unroll deleted, and you need the LoopAccessAnalysis results to be significant in the way that they're stale. And you need a million other things to align.
    But when it does, you get a deeply mysterious crash due to actually finding a stale analysis result. This fixes the issue and tests for it by directly checking we successfully invalidate things. I have not been able to get *any* test case to reliably trigger this. Changes to LLVM itself caused the only test case I ever had to cease to crash.
    I've looked pretty extensively at less brittle ways of fixing this and they are actually very, very hard to do. This is a somewhat strange and unusual case as we have a pass which is deleting an IR unit, but is not running within that IR unit's pass framework (which is what handles this cleanly for the normal loop unroll). And where there isn't a definitive way to clear *all* of the stale cache entries. And where the pass *is* updating the core analysis that provides the IR units!
    For example, we don't have any of these problems with Function analyses because it is easy to clear out function analyses when the functions themselves may have been deleted -- we clear an entire module's worth! But that is too heavy of a hammer down here in the LoopAnalysisManager layer.
    A better long-term solution IMO is to require that AnalysisManagers make their keys durable to this kind of thing. Specifically, when caching an analysis for one IR unit that is conceptually "owned" by a higher level IR unit, the AnalysisManager should incorporate this into its data structures so that we can reliably clear these results without having to teach each and every pass to do so manually as we do here. But that is a change for another day as it will be a fairly invasive change to the AnalysisManager infrastructure. Until then, this fortunately seems to be quite rare.
    llvm-svn: 310333
* [AMDGPU] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). | Eugene Zelenko | 2017-08-08 | 10 | -215/+294
    llvm-svn: 310328
* [libFuzzer] simplify code, NFC | Kostya Serebryany | 2017-08-08 | 2 | -11/+7
    llvm-svn: 310326
* [libFuzzer] remove stale code | Kostya Serebryany | 2017-08-08 | 3 | -14/+0
    llvm-svn: 310325