summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* [ValueTracking] Honour recursion limit.Davide Italiano2017-08-091-0/+4
| | | | | | | | | | | The recently improved support for `icmp` in ValueTracking (r307304) exposes the fact that `isImplied` condition doesn't really bail out if we hit the recursion limit (and calls `computeKnownBits` which increases the depth and asserts). Differential Revision: https://reviews.llvm.org/D36512 llvm-svn: 310481
* [AArch64] Assembler support for the ARMv8.2a dot product instructionsSjoerd Meijer2017-08-095-0/+39
| | | | | | | | | | | Dot product is an optional ARMv8.2a extension, see also the public architecture specification here: https://developer.arm.com/products/architecture/a-profile/exploration-tools. This patch adds AArch64 assembler support for these dot product instructions. Differential Revision: https://reviews.llvm.org/D36515 llvm-svn: 310480
* [ARM] Remove FeatureNoARM implies ModeThumb.Florian Hahn2017-08-091-3/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: By removing FeatureNoARM implies ModeThumb, we can detect cases where a function's target-features contain -thumb-mode (enables ARM codegen for the function), but the architecture does not support ARM mode. Previously, the implication caused the FeatureNoARM bit to be cleared for functions with -thumb-mode, making the assertion in ARMSubtarget::ARMSubtarget [1] pointless for such functions. This assertion is the only guard against generating ARM code for architectures without ARM codegen support. Is there a place where we could easily generate error messages for the user? At the moment, we would generate ARM code for Thumb-only architectures. X86 has the same behavior as ARM, as in it only has an assertion and no error message, but I think for ARM an error message would be helpful. What do you think? For the example below, `llc -mtriple=armv7m-eabi test.ll -o -` will generate ARM assembler (or fail with an assertion error with this patch). Note that if we run the resulting assembler through llvm-mc, we get an appropriate error message, but not when codegen is handled through clang. ``` define void @bar() #0 { entry: ret void } attributes #0 = { "target-features"="-thumb-mode" } ``` [1] https://github.com/llvm-mirror/llvm/blob/c1f7b54cef62e9c8aa745d40bea146a167bf844e/lib/Target/ARM/ARMSubtarget.cpp#L147 Reviewers: t.p.northover, rengolin, peter.smith, aadg, silviu.baranga, richard.barton.arm, echristo Reviewed By: rengolin, echristo Subscribers: efriedma, aemerson, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D35569 llvm-svn: 310476
* [DAG] Explicitly cleanup merged load values during store merge. NFCI.Nirav Dave2017-08-091-2/+8
| | | | llvm-svn: 310474
* Fix -Wpessimizing-move warning.Haojian Wu2017-08-091-2/+2
| | | | llvm-svn: 310469
* [AsmParser][AVX512]Enhance OpMask/Zero/Merge syntax check rubostnessCoby Tayree2017-08-091-4/+8
| | | | | | | | Adopt a more strict approach regarding what marks should/can appear after a destination register, when operating upon an AVX512 platform. Differential Revision: https://reviews.llvm.org/D35785 llvm-svn: 310467
* [LSR / TTI / SystemZ] Eliminate TargetTransformInfo::isFoldableMemAccess()Jonas Paulsson2017-08-094-23/+18
| | | | | | | | | | | | | | | | | isLegalAddressingMode() has recently gained the extra optional Instruction* parameter, and therefore it can now do the job that previously only isFoldableMemAccess() could do. The SystemZ implementation of isLegalAddressingMode() has gained the functionality of checking for offsets, which used to be done with isFoldableMemAccess(). The isFoldableMemAccess() hook has been removed everywhere. Review: Quentin Colombet, Ulrich Weigand https://reviews.llvm.org/D35933 llvm-svn: 310463
* [LoopStrengthReduce] Don't neglect the Fixup.Offset in isAMCompletelyFolded().Jonas Paulsson2017-08-091-2/+2
| | | | | | | | In the recursive call to isAMCompletelyFolded(), the passed offset should be the sum of F.BaseOffset and Fixup.Offset. Review: Quentin Colombet. llvm-svn: 310462
* [mips] PR34083 - Wimplicit-fallthrough warning in MipsAsmParser.cppSimon Dardis2017-08-091-6/+7
| | | | | | | | | | | | Assert that a binary expression is actually a binary expression, rather than potientially incorrectly attempting to handle it as a unary expression. This resolves PR34083. Thanks to Simonn Pilgrim for reporting the issue! llvm-svn: 310460
* Suppress a warning. NFC.Gabor Horvath2017-08-091-1/+1
| | | | llvm-svn: 310459
* [AsmParser] Hash is not a comment on some targetsOliver Stannard2017-08-092-18/+0
| | | | | | | | | | | | The '#' token is not a comment for all targets (on ARM and AArch64 it marks an immediate operand), so we shouldn't treat it as such. Comments are already converted to AsmToken::EndOfStatement by AsmLexer::LexLineComment, so this check was unnecessary. Differential Revision: https://reviews.llvm.org/D36405 llvm-svn: 310457
* [LCG] Completely remove the map-based association of post-order numbersChandler Carruth2017-08-091-35/+34
| | | | | | | | | | | | | | | | | | | | | | | to Nodes when removing ref edges from a RefSCC. This map based association turns out to be pretty expensive for large RefSCCs and pointless as we already have embedded data members inside nodes that we use to track the DFS state. We can reuse one of those and the map becomes unnecessary. This also fuses the update of those numbers into the scan across the pending stack of nodes so that we don't walk the nodes twice during the DFS. With this I expect the new PM to be faster than the old PM for the test case I have been optimizing. That said, it also seems simpler and more direct in many ways. The side storage was always pretty awkward. The last remaining hot-spot in the profile of the LCG once this is done will be the edge iterator walk in the DFS. I'll take a look at improving that next. llvm-svn: 310456
* [GlobalOpt] Switch an explicit loop to llvm::all_of(). NFCI.Davide Italiano2017-08-091-5/+2
| | | | llvm-svn: 310453
* [LCG] Special case when removing a ref edge from a RefSCC leavesChandler Carruth2017-08-091-12/+29
| | | | | | | | | | | | | | | that RefSCC still connected. This is common and can be handled much more efficiently. As soon as we know we've covered every node in the RefSCC with the DFS, we can simply reset our state and return. This avoids numerous data structure updates and other complexity. On top of other changes, this appears to get new PM back to parity with the old PM for a large protocol buffer message source code. The dense map updates are very hot in this function. llvm-svn: 310451
* [LCG] Switch one of the update methods for the LazyCallGraph to supportChandler Carruth2017-08-092-155/+119
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | limited batch updates. Specifically, allow removing multiple reference edges starting from a common source node. There are a few constraints that play into supporting this form of batching: 1) The way updates occur during the CGSCC walk, about the most we can functionally batch together are those with a common source node. This also makes the batching simpler to implement, so it seems a worthwhile restriction. 2) The far and away hottest function for large C++ files I measured (generated code for protocol buffers) showed a huge amount of time was spent removing ref edges specifically, so it seems worth focusing there. 3) The algorithm for removing ref edges is very amenable to this restricted batching. There are just both API and implementation special casing for the non-batch case that gets in the way. Once removed, supporting batches is nearly trivial. This does modify the API in an interesting way -- now, we only preserve the target RefSCC when the RefSCC structure is unchanged. In the face of any splits, we create brand new RefSCC objects. However, all of the users were OK with it that I could find. Only the unittest needed interesting updates here. How much does batching these updates help? I instrumented the compiler when run over a very large generated source file for a protocol buffer and found that the majority of updates are intrinsically updating one function at a time. However, nearly 40% of the total ref edges removed are removed as part of a batch of removals greater than one, so these are the cases batching can help with. When compiling the IR for this file with 'opt' and 'O3', this patch reduces the total time by 8-9%. Differential Revision: https://reviews.llvm.org/D36352 llvm-svn: 310450
* [X86] Add the rest of the ADC and SBB instructions to isDefConvertible.Craig Topper2017-08-091-6/+10
| | | | | | I don't know if this really affects anything. Just thought it was weird that we had all of the ADD/SUB/AND/OR/XOR instructions. llvm-svn: 310447
* [InstCombine] Use regular dyn_cast instead of a matcher for a simple case. NFCCraig Topper2017-08-091-2/+2
| | | | llvm-svn: 310446
* [ImplicitNullCheck] Fix the bug when dependent instruction accesses memorySerguei Katkov2017-08-091-1/+3
| | | | | | | | | | | | | | It is possible that dependent instruction may access memory. In this case we must reject optimization because the memory change will be visible in null handler basic block. So we will execute an instruction which we must not execute if check fails. Reviewers: sanjoy, reames Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36392 llvm-svn: 310443
* [PDB] Fix an issue writing the publics stream.Zachary Turner2017-08-092-17/+11
| | | | | | | | | | | | | In the refactor to merge the publics and globals stream, a bug was introduced that wrote the wrong value for one of the fields of the PublicsStreamHeader. This caused debugging in WinDbg to break. We had no way of dumping any of these fields, so in addition to fixing the bug I've added dumping support for them along with a test that verifies the correct value is written. llvm-svn: 310439
* [PDB] Merge Global and Publics Builders.Zachary Turner2017-08-096-343/+326
| | | | | | | | | | | | | | | | | | | | | The publics stream and globals stream are very similar. They both contain a list of hash buckets that refer into a single shared stream, the symbol record stream. Because of the need for each builder to manage both an independent hash stream as well as a single shared record stream, making the two builders be independent entities is not the right design. This patch merges them into a single class, of which only a single instance is needed to create all 3 streams. PublicsStreamBuilder and GlobalsStreamBuilder are now merged into the single GSIStreamBuilder class, which writes all 3 streams at once. Note that this patch does not contain any functionality change. So we're still not yet writing any records to the globals stream. All we're doing is making it so that when we do start writing records to the globals, this refactor won't have to be part of that patch. Differential Revision: https://reviews.llvm.org/D36489 llvm-svn: 310438
* [AMDGPU] Revert r310429 changes in AMDKernelCodeT.h which broke some build bots.Eugene Zelenko2017-08-091-20/+28
| | | | llvm-svn: 310430
* [AMDGPU] Fix some Clang-tidy modernize-use-using and Include What You Use ↵Eugene Zelenko2017-08-0810-193/+300
| | | | | | warnings; other minor fixes (NFC). llvm-svn: 310429
* Revert "[GlobalISel] Remove the GISelAccessor API."Quentin Colombet2017-08-088-75/+200
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit r310115. It causes a linker failure for the one of the unittests of AArch64 on one of the linux bot: http://lab.llvm.org:8011/builders/clang-ppc64le-linux-multistage/builds/3429 : && /home/fedora/gcc/install/gcc-7.1.0/bin/g++ -fPIC -fvisibility-inlines-hidden -Werror=date-time -std=c++11 -Wall -W -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wno-maybe-uninitialized -Wdelete-non-virtual-dtor -Wno-comment -ffunction-sections -fdata-sections -O2 -L/home/fedora/gcc/install/gcc-7.1.0/lib64 -Wl,-allow-shlib-undefined -Wl,-O3 -Wl,--gc-sections unittests/Target/AArch64/CMakeFiles/AArch64Tests.dir/InstSizes.cpp.o -o unittests/Target/AArch64/AArch64Tests lib/libLLVMAArch64CodeGen.so.6.0.0svn lib/libLLVMAArch64Desc.so.6.0.0svn lib/libLLVMAArch64Info.so.6.0.0svn lib/libLLVMCodeGen.so.6.0.0svn lib/libLLVMCore.so.6.0.0svn lib/libLLVMMC.so.6.0.0svn lib/libLLVMMIRParser.so.6.0.0svn lib/libLLVMSelectionDAG.so.6.0.0svn lib/libLLVMTarget.so.6.0.0svn lib/libLLVMSupport.so.6.0.0svn -lpthread lib/libgtest_main.so.6.0.0svn lib/libgtest.so.6.0.0svn -lpthread -Wl,-rpath,/home/buildbots/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage1/lib && : unittests/Target/AArch64/CMakeFiles/AArch64Tests.dir/InstSizes.cpp.o:(.toc+0x0): undefined reference to `vtable for llvm::LegalizerInfo' unittests/Target/AArch64/CMakeFiles/AArch64Tests.dir/InstSizes.cpp.o:(.toc+0x8): undefined reference to `vtable for llvm::RegisterBankInfo' The particularity of this bot is that it is built with BUILD_SHARED_LIBS=ON However, I was not able to reproduce the problem so far. Reverting to unblock the bot. llvm-svn: 310425
* My commit r310346 introduced some valid warnings. This cleans them up.Nemanja Ivanovic2017-08-081-6/+9
| | | | llvm-svn: 310424
* [MachineOutliner] Ensure AArch64 outliner doesn't mess with W30 or LRJessica Paquette2017-08-081-6/+7
| | | | | | | | | | | | | | Before, the outliner would mark all instructions that read from/modify LR as illegal. This doesn't handle W30, which overlaps with LR. This shouldn't be outlined. This commit fixes that by making modifiesRegister() and readsRegister() look at W30 + take in a TRI argument. This makes sure that modifiesRegister() and readsRegister() won't outline either of W30 and LR. https://reviews.llvm.org/D36435 llvm-svn: 310422
* [GVN] Remove stale entries in phitranslate cache when new phi is generated ↵Wei Mi2017-08-081-0/+14
| | | | | | | | | | | | | | | | | | for PRE When a new phi is generated for scalarpre of an expression, the phiTranslate cache will become stale: Before PRE, the candidate expression must not be available in a predecessor block, and phitranslate will cache the information. After PRE, the expression will become available in all predecessor blocks, so the related entries in phiTranslate cache becomes stale. The patch will simply remove the stale entries so phiTranslate can be recomputed next time. The stale entries in phitranslate cache will not affect correctness but will cause missing PRE opportunity for later instructions. Differential Revision: https://reviews.llvm.org/D36124 llvm-svn: 310421
* BasicAA: assert on another case where aliasGEP shouldn't get a PartialAlias ↵Nuno Lopes2017-08-081-1/+3
| | | | | | response llvm-svn: 310420
* Make ICP uses PSI to check for hotness.Dehao Chen2017-08-082-18/+21
| | | | | | | | | | | | | | Summary: Currently, ICP checks the count against a fixed value to see if it is hot enough to be promoted. This does not work for SamplePGO because sampled count may be much smaller. This patch uses PSI to check if the count is hot enough to be promoted. Reviewers: davidxl, tejohnson, eraman Reviewed By: davidxl Subscribers: sanjoy, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D36341 llvm-svn: 310416
* [codeview] Emit nested enums and typedefs from classesReid Kleckner2017-08-081-4/+6
| | | | | | | | | | Previously we limited ourselves to only emitting nested classes, but we need other kinds of types as well. This fixes the Visual Studio STL visualizers, so that users can visualize std::string and other objects. llvm-svn: 310410
* [InstCombine] Support pulling left shifts through a subtract with constant LHSCraig Topper2017-08-081-0/+14
| | | | | | | | We already support pulling through an add with constant RHS. We can do the same for subtract. Differential Revision: https://reviews.llvm.org/D36443 llvm-svn: 310407
* [DAG] Introduce peekThroughBitcast function. NFCI.Nirav Dave2017-08-081-23/+14
| | | | llvm-svn: 310405
* [DAG] Update comments. NFC.Nirav Dave2017-08-081-8/+9
| | | | llvm-svn: 310404
* [AMDGPU] Add llvm.amdgpu.update.dpp intrinsicConnor Abbott2017-08-081-0/+8
| | | | | | | | | | | | | | | | | | Summary: Now that we've made all the necessary backend changes, we can add a new intrinsic which exposes the new capabilities to IR producers. Since llvm.amdgpu.update.dpp is a strict superset of llvm.amdgpu.mov.dpp, we should deprecate the former. We also add tests for all the functionality that was added in previous changes, now that we can access it via an IR construct. Reviewers: tstellar, arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D34718 llvm-svn: 310399
* [NewGVN] Use a cast instead of a dyn_cast.Chad Rosier2017-08-081-1/+3
| | | | | | Differential Revision: https://reviews.llvm.org/D36478 llvm-svn: 310397
* [PDB] Fix linking of function symbols and local variables.Zachary Turner2017-08-081-8/+39
| | | | | | | | | | | | | | | | | | | | | | | | The compiler outputs PROC32_ID symbols into the object files for functions, and these symbols have an embedded type index which, when copied to the PDB, refer to the IPI stream. However, the symbols themselves are also converted into regular symbols (e.g. S_GPROC32_ID -> S_GPROC32), and type indices in the regular symbol records refer to the TPI stream. So this patch applies two fixes to function records. 1. It converts ID symbols to the proper non-ID record type. 2. After remapping the type index from the object file's index space to the PDB file/IPI stream's index space, it then remaps that index to the TPI stream's index space by. Besides functions, during the remapping process we were also discarding symbol record types which we did not recognize. In particular, we were discarding S_BPREL32 records, which is what MSVC uses to describe local variables on the stack. So this patch fixes that as well by copying them to the PDB. Differential Revision: https://reviews.llvm.org/D36426 llvm-svn: 310394
* [LoopVectorize] Fix assertion failure in Fcmp vectorizationAnna Thomas2017-08-081-1/+3
| | | | | | | | | | | | | | | | | | | | | Summary: When vectorizing fcmps we can trip on incorrect cast assertion when setting the FastMathFlags after generating the vectorized FCmp. This can happen if the FCmp can be folded to true or false directly. The fix here is to set the FastMathFlag using the FastMathFlagBuilder *before* creating the FCmp Instruction. This is what's done by other optimizations such as InstCombine. Added a test case which trips on cast assertion without this patch. Reviewers: Ayal, mssimpso, mkuper, gilr Reviewed by: Ayal, mssimpso Subscribers: llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D36244 llvm-svn: 310389
* Revert "[ARM] Fix assembly and disassembly for VMRS/VMSR"Tim Northover2017-08-083-73/+36
| | | | | | | | This reverts r310243. Only MVFR2 is actually restricted to v8 and it'll be a little while before we can get a proper fix together. Better that we allow incorrect code than reject correct in the meantime. llvm-svn: 310384
* [KnownBits][ValueTracking] Move the math for calculating known bits for ↵Craig Topper2017-08-083-41/+67
| | | | | | | | | | | | add/sub into a static method in KnownBits object I want to reuse this code in SimplifyDemandedBits handling of Add/Sub. This will make that easier. Wonder if we should use it in SelectionDAG's computeKnownBits too. Differential Revision: https://reviews.llvm.org/D36433 llvm-svn: 310378
* [RISCV] Fix warning about unused getSubtargetFeatureName()Alex Bradbury2017-08-081-1/+0
| | | | llvm-svn: 310375
* BasicAA: aliasGEP shouldn't get a PartialAlias response hereNuno Lopes2017-08-081-1/+3
| | | | | | add an assert() to ensure that's the case (as I'm not convinced it won't happen) llvm-svn: 310373
* [DAGCombiner] simplifyShuffleMask - handle UNDEF inputs from shuffles as ↵Simon Pilgrim2017-08-081-11/+10
| | | | | | | | well as BUILD_VECTOR Minor extension to D36393 llvm-svn: 310372
* [RISCV] Add basic RISCVAsmParser (missing files)Alex Bradbury2017-08-083-0/+399
| | | | | | | | This commit adds the files missing from rL310361. Apologies for the noise. Differential Revision: https://reviews.llvm.org/D23563 llvm-svn: 310363
* [RISCV] Add basic RISCVAsmParserAlex Bradbury2017-08-084-2/+19
| | | | | | | | | This doesn't yet support parsing things like %pcrel_hi(foo), but will handle basic instructions with register or immediate operands. Differential Revision: https://reviews.llvm.org/D23563 llvm-svn: 310361
* [PowerPC] Don't crash on larger splats achieved through 1-byte splatsNemanja Ivanovic2017-08-081-0/+9
| | | | | | | | | | We've implemented a 1-byte splat using XXSPLTISB on P9. However, LLVM will produce a 1-byte splat even for wider element BUILD_VECTOR nodes. This patch prevents crashing in that situation. Differential Revision: https://reviews.llvm.org/D35650 llvm-svn: 310358
* Appease compilers that have the -Wcovered-switch-default switch.Nemanja Ivanovic2017-08-081-2/+5
| | | | llvm-svn: 310356
* [X86] Improved X86::CMOV to Branch heuristic.Amjad Aboud2017-08-081-4/+18
| | | | | | | | | Resolved PR33954. This patch contains two more constraints that aim to reduce the noise cases where we convert CMOV into branch for small gain, and end up spending more cycles due to overhead. Differential Revision: https://reviews.llvm.org/D36081 llvm-svn: 310352
* [PowerPC] Eliminate compares - add i32 sext/zext handling for SETLE/SETGENemanja Ivanovic2017-08-081-0/+126
| | | | | | | | | Adds handling for SETLE/SETGE comparisons on i32 values. Furthermore, it adds the handling for the special case where RHS == 0. Differential Revision: https://reviews.llvm.org/D34048 llvm-svn: 310346
* [DAGCombiner] Simplify shuffle mask index if the referenced input element is ↵Simon Pilgrim2017-08-081-0/+36
| | | | | | | | | | UNDEF Fixes one of the cases in PR34041. Differential Revision: https://reviews.llvm.org/D36393 llvm-svn: 310344
* [globalisel][tablegen] Add support for importing 'imm' operands.Daniel Sanders2017-08-081-1/+2
| | | | | | | | | | | | | | | | | | | Summary: This patch enables the import of rules containing 'imm' operands that do not constrain the acceptable values using predicates. Support for ImmLeaf will arrive in a later patch. Depends on D35681 Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar Reviewed By: rovka Subscribers: kristof.beyls, javed.absar, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D35833 llvm-svn: 310343
* [PM] Fix a likely more critical infloop bug in the CGSCC pass manager.Chandler Carruth2017-08-081-3/+10
| | | | | | | | | | | | | | | | | | | | | | | | | This was just a bad oversight on my part. The code in question should never have worked without this fix. But it turns out, there are relatively few places that involve libfunctions that participate in a single SCC, and unless they do, this happens to not matter. The effect of not having this correct is that each time through this routine, the edge from write_wrapper to write was toggled between a call edge and a ref edge. First time through, it becomes a demoted call edge and is turned into a ref edge. Next time it is a promoted call edge from a ref edge. On, and on it goes forever. I've added the asserts which should have always been here to catch silly mistakes like this in the future as well as a test case that will actually infloop without the fix. The other (much scarier) infinite-inlining issue I think didn't actually occur in practice, and I simply misdiagnosed this minor issue as that much more scary issue. The other issue *is* still a real issue, but I'm somewhat relieved that so far it hasn't happened in real-world code yet... llvm-svn: 310342
OpenPOWER on IntegriCloud