path: root/llvm
Commit message (Author, Age, Files, Lines)
* [KnownBits][ValueTracking] Move the math for calculating known bits for add/sub into a static method in KnownBits object (Craig Topper, 2017-08-08, 4 files, -41/+71)
| | | | | | | | | | | | I want to reuse this code in SimplifyDemandedBits handling of Add/Sub. This will make that easier. Wonder if we should use it in SelectionDAG's computeKnownBits too. Differential Revision: https://reviews.llvm.org/D36433 llvm-svn: 310378
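The known-bits math for an addition can be sketched as follows. This is a minimal illustration, not the code from the commit: the `(zero, one)` tuple representation and the names `known_bits_add` and `consistent` are assumptions made for this example. It bounds the sum from above (every unknown bit set) and from below (every unknown bit clear), and reports a result bit as known only where both operand bits and the incoming carry are known.

```python
def consistent(value, known):
    """True if `value` agrees with a (zero_mask, one_mask) known-bits pair."""
    zero, one = known
    return (value & zero) == 0 and (value & one) == one


def known_bits_add(lhs, rhs, width):
    """Known bits of lhs + rhs for `width`-bit integers.

    Each operand is a hypothetical (zero_mask, one_mask) pair: a set bit in
    zero_mask means that bit is known to be 0, a set bit in one_mask means
    it is known to be 1.
    """
    mask = (1 << width) - 1
    lhs_zero, lhs_one = lhs
    rhs_zero, rhs_one = rhs

    # Largest possible sum: every bit not known to be zero is set.
    max_sum = ((~lhs_zero & mask) + (~rhs_zero & mask)) & mask
    # Smallest possible sum: only the bits known to be one are set.
    min_sum = (lhs_one + rhs_one) & mask

    # Recover the carry into each bit position for both extremes.
    carry_known_zero = ~(max_sum ^ lhs_zero ^ rhs_zero) & mask
    carry_known_one = (min_sum ^ lhs_one ^ rhs_one) & mask

    # A result bit is known only if both inputs and the carry are known.
    known = ((carry_known_zero | carry_known_one)
             & (lhs_zero | lhs_one)
             & (rhs_zero | rhs_one))

    return (~max_sum & mask & known, min_sum & known)
```

For instance, adding two 4-bit values whose low two bits are known to be zero yields a sum whose low two bits are also known to be zero.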
* [RISCV] Fix warning about unused getSubtargetFeatureName() (Alex Bradbury, 2017-08-08, 1 file, -1/+0)
| | | | llvm-svn: 310375
* BasicAA: aliasGEP shouldn't get a PartialAlias response here (Nuno Lopes, 2017-08-08, 1 file, -1/+3)
| | | | | | add an assert() to ensure that's the case (as I'm not convinced it won't happen) llvm-svn: 310373
* [DAGCombiner] simplifyShuffleMask - handle UNDEF inputs from shuffles as well as BUILD_VECTOR (Simon Pilgrim, 2017-08-08, 2 files, -13/+12)
| | | | | | | | Minor extension to D36393 llvm-svn: 310372
* [RISCV] Add basic RISCVAsmParser (missing files) (Alex Bradbury, 2017-08-08, 3 files, -0/+399)
| | | | | | | | This commit adds the files missing from rL310361. Apologies for the noise. Differential Revision: https://reviews.llvm.org/D23563 llvm-svn: 310363
* [RISCV] Add basic RISCVAsmParser (Alex Bradbury, 2017-08-08, 4 files, -2/+19)
| | | | | | | | | This doesn't yet support parsing things like %pcrel_hi(foo), but will handle basic instructions with register or immediate operands. Differential Revision: https://reviews.llvm.org/D23563 llvm-svn: 310361
* [PowerPC] Don't crash on larger splats achieved through 1-byte splats (Nemanja Ivanovic, 2017-08-08, 2 files, -0/+29)
| | | | | | | | | | We've implemented a 1-byte splat using XXSPLTISB on P9. However, LLVM will produce a 1-byte splat even for wider element BUILD_VECTOR nodes. This patch prevents crashing in that situation. Differential Revision: https://reviews.llvm.org/D35650 llvm-svn: 310358
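The situation described above, where a splat of wider elements is also a splat of a single byte, can be illustrated with a small check. This is only a sketch of the idea; the function name `byte_splat_value` is hypothetical and nothing here reproduces the PowerPC lowering code.

```python
def byte_splat_value(element, element_bits):
    """Return the repeated byte if `element` consists of one byte value
    repeated across its whole width, otherwise None.

    `element` is treated as an unsigned `element_bits`-wide integer.
    """
    nbytes = element_bits // 8
    data = (element & ((1 << element_bits) - 1)).to_bytes(nbytes, "little")
    first = data[0]
    return first if all(b == first for b in data) else None
```

A 32-bit element `0x01010101` is a splat of the byte `0x01`, so a vector of such elements can be described as a 1-byte splat even though the node's element type is wider than one byte.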
* [globalisel][tablegen] Remove unnecessary ; to satisfy ubuntu-gcc7.1-werror. (Daniel Sanders, 2017-08-08, 1 file, -1/+1)
| | | | llvm-svn: 310357
* Appease compilers that have the -Wcovered-switch-default switch. (Nemanja Ivanovic, 2017-08-08, 1 file, -2/+5)
| | | | llvm-svn: 310356
* [X86] Improved X86::CMOV to Branch heuristic. (Amjad Aboud, 2017-08-08, 3 files, -10/+161)
| | | | | | | | | Resolved PR33954. This patch contains two more constraints that aim to reduce the noise cases where we convert CMOV into branch for small gain, and end up spending more cycles due to overhead. Differential Revision: https://reviews.llvm.org/D36081 llvm-svn: 310352
* [PowerPC] Eliminate compares - add i32 sext/zext handling for SETLE/SETGE (Nemanja Ivanovic, 2017-08-08, 13 files, -0/+945)
| | | | | | | | | Adds handling for SETLE/SETGE comparisons on i32 values. Furthermore, it adds the handling for the special case where RHS == 0. Differential Revision: https://reviews.llvm.org/D34048 llvm-svn: 310346
* [DAGCombiner] Simplify shuffle mask index if the referenced input element is UNDEF (Simon Pilgrim, 2017-08-08, 2 files, -12/+42)
| | | | | | | | Fixes one of the cases in PR34041. Differential Revision: https://reviews.llvm.org/D36393 llvm-svn: 310344
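The idea behind this simplification can be sketched in a few lines. The model below is an assumption made for illustration: vectors are Python lists, `UNDEF` is a sentinel, a mask index of `-1` means "undefined lane", and `simplify_shuffle_mask` is a hypothetical name rather than the DAGCombiner function.

```python
UNDEF = None  # stand-in for an undefined vector element


def simplify_shuffle_mask(mask, lhs, rhs):
    """Replace mask indices that select an UNDEF input element with -1.

    Indices 0..n-1 select from `lhs`, n..2n-1 select from `rhs`, and
    -1 already denotes an undefined lane.
    """
    n = len(lhs)
    simplified = []
    for idx in mask:
        if idx < 0:
            simplified.append(-1)
            continue
        source = lhs if idx < n else rhs
        element = source[idx % n]
        simplified.append(-1 if element is UNDEF else idx)
    return simplified
```

With `lhs = [1, UNDEF, 3, 4]` and `rhs = [5, UNDEF, 7, UNDEF]`, the mask `[0, 5, 2, 7]` becomes `[0, -1, 2, -1]`, because indices 5 and 7 refer to undefined elements of `rhs`.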
* [globalisel][tablegen] Add support for importing 'imm' operands. (Daniel Sanders, 2017-08-08, 8 files, -27/+233)
| | | | | | | | | | | | | | | | | | | Summary: This patch enables the import of rules containing 'imm' operands that do not constrain the acceptable values using predicates. Support for ImmLeaf will arrive in a later patch. Depends on D35681 Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar Reviewed By: rovka Subscribers: kristof.beyls, javed.absar, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D35833 llvm-svn: 310343
* [PM] Fix a likely more critical infloop bug in the CGSCC pass manager. (Chandler Carruth, 2017-08-08, 2 files, -4/+26)
| | | | | | | | | | | | | | | | | | | | | | | | | This was just a bad oversight on my part. The code in question should never have worked without this fix. But it turns out, there are relatively few places that involve libfunctions that participate in a single SCC, and unless they do, this happens to not matter. The effect of not having this correct is that each time through this routine, the edge from write_wrapper to write was toggled between a call edge and a ref edge. First time through, it becomes a demoted call edge and is turned into a ref edge. Next time it is a promoted call edge from a ref edge. On, and on it goes forever. I've added the asserts which should have always been here to catch silly mistakes like this in the future as well as a test case that will actually infloop without the fix. The other (much scarier) infinite-inlining issue I think didn't actually occur in practice, and I simply misdiagnosed this minor issue as that much more scary issue. The other issue *is* still a real issue, but I'm somewhat relieved that so far it hasn't happened in real-world code yet... llvm-svn: 310342
* [InstCombine] Cast to BinaryOperator earlier in foldSelectIntoOp to simplify the code. (Craig Topper, 2017-08-08, 1 file, -14/+10)
| | | | | | We no longer need the explicit operand count check or the later dynamic cast. llvm-svn: 310339
* AMDGPU: Fix warnings introduced by r310336 (Tom Stellard, 2017-08-08, 1 file, -4/+2)
| | | | llvm-svn: 310337
* AMDGPU: Move R600 parts of AMDGPUISelDAGToDAG into their own class (Tom Stellard, 2017-08-08, 3 files, -112/+184)
| | | | | | | | | | | | Summary: This refactoring is required in order to split the R600 and GCN tablegen files. Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D36286 llvm-svn: 310336
* AMDGPU: Also remove SI from docs (Konstantin Zhuravlyov, 2017-08-08, 1 file, -2/+1)
| | | | | | Differential Revision: https://reviews.llvm.org/D36424 llvm-svn: 310335
* [PM] Relax the spelling of a pass name slightly in this test. (Chandler Carruth, 2017-08-08, 1 file, -1/+1)
| | | | | | I forgot that MSVC doesn't preserve this typedef, my bad. llvm-svn: 310334
* [PM] Fix new LoopUnroll function pass by invalidating loop analysis results when a loop is completely removed. (Chandler Carruth, 2017-08-08, 2 files, -2/+125)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is very hard to manifest as a visible bug. You need to arrange for there to be a subsequent allocation of a 'Loop' object which gets the exact same address as the one which the unroll deleted, and you need the LoopAccessAnalysis results to be significant in the way that they're stale. And you need a million other things to align. But when it does, you get a deeply mysterious crash due to actually finding a stale analysis result. This fixes the issue and tests for it by directly checking we successfully invalidate things. I have not been able to get *any* test case to reliably trigger this. Changes to LLVM itself caused the only test case I ever had to cease to crash. I've looked pretty extensively at less brittle ways of fixing this and they are actually very, very hard to do. This is a somewhat strange and unusual case as we have a pass which is deleting an IR unit, but is not running within that IR unit's pass framework (which is what handles this cleanly for the normal loop unroll). And where there isn't a definitive way to clear *all* of the stale cache entries. And where the pass *is* updating the core analysis that provides the IR units! For example, we don't have any of these problems with Function analyses because it is easy to clear out function analyses when the functions themselves may have been deleted -- we clear an entire module's worth! But that is too heavy of a hammer down here in the LoopAnalysisManager layer. A better long-term solution IMO is to require that AnalysisManagers make their keys durable to this kind of thing. 
Specifically, when caching an analysis for one IR unit that is conceptually "owned" by a higher level IR unit, the AnalysisManager should incorporate this into its data structures so that we can reliably clear these results without having to teach each and every pass to do so manually as we do here. But that is a change for another day as it will be a fairly invasive change to the AnalysisManager infrastructure. Until then, this fortunately seems to be quite rare. llvm-svn: 310333
* [AMDGPU] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). (Eugene Zelenko, 2017-08-08, 10 files, -215/+294)
| | | | | | llvm-svn: 310328
* [libFuzzer] simplify code, NFC (Kostya Serebryany, 2017-08-08, 2 files, -11/+7)
| | | | llvm-svn: 310326
* [libFuzzer] remove stale code (Kostya Serebryany, 2017-08-08, 3 files, -14/+0)
| | | | llvm-svn: 310325
* [libFuzzer] simplify the implementation of -print_coverage=1 (Kostya Serebryany, 2017-08-08, 2 files, -103/+69)
| | | | llvm-svn: 310324
* [KnownBits] Fix copy pasto in comment. NFC (Craig Topper, 2017-08-07, 1 file, -1/+1)
| | | | llvm-svn: 310320
* [X86][AVX] Added test for broadcast shuffle from binary sources with undefs (D36393) (Simon Pilgrim, 2017-08-07, 1 file, -0/+24)
| | | | | | llvm-svn: 310317
* AMDGPU: Implement getMinimumNopSize (Matt Arsenault, 2017-08-07, 1 file, -0/+6)
| | | | llvm-svn: 310310
* [Object] Initialize LoadConfig member to null (Reid Kleckner, 2017-08-07, 3 files, -1/+6)
| | | | | | | | | | Executables may not contain a load config, and clients should be able to test for nullability. Previously we'd return uninitialized memory. Now getLoadConfig32/64 return valid pointers or null. Fixes PR34108 llvm-svn: 310308
* Do not instrument libFuzzer itself when built with -DLLVM_USE_SANITIZE_COVERAGE (George Karpenkov, 2017-08-07, 1 file, -0/+5)
| | | | | | | | Fixes regression from https://reviews.llvm.org/D36295 Differential Revision: https://reviews.llvm.org/D36428 llvm-svn: 310305
* [llvm-pdbutil] Don't crash when a section contrib's isect is invalid. (Zachary Turner, 2017-08-07, 1 file, -2/+6)
| | | | llvm-svn: 310298
* Move the SampleProfileLoader right after EarlyFPM. (Dehao Chen, 2017-08-07, 3 files, -25/+49)
| | | | | | | | | | | | | | Summary: The SampleProfileLoader pass needs to run after some early cleanup passes so that inlining can happen correctly inside the SampleProfileLoader pass. Reviewers: chandlerc, davidxl, tejohnson Reviewed By: chandlerc, tejohnson Subscribers: sanjoy, mehdi_amini, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D36333 llvm-svn: 310296
* Reapply fix PR23384 (part 3 of 3) r304824 (was reverted in r305720). (Evgeny Stupachenko, 2017-08-07, 17 files, -92/+107)
| | | | | | | | | | | | | | | | The root cause of reverting was fixed - PR33514. Summary: The patch makes instruction count the highest priority for LSR solution for X86 (previously registers had highest priority). Reviewers: qcolombet Differential Revision: http://reviews.llvm.org/D30562 From: Evgeny Stupachenko <evstupac@gmail.com> <evgeny.v.stupachenko@intel.com> llvm-svn: 310289
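The change in priority described above amounts to switching the order of a lexicographic comparison. The sketch below is a deliberately simplified model: the real LSR cost has more components than the two shown, and the names `Cost`, `best_regs_first`, and `best_insns_first` are hypothetical.

```python
from typing import NamedTuple


class Cost(NamedTuple):
    """Simplified cost of a candidate solution (assumed two-field model)."""
    insns: int  # number of instructions
    regs: int   # number of registers


def best_regs_first(candidates):
    """Previous ordering: fewer registers wins, instructions break ties."""
    return min(candidates, key=lambda c: (c.regs, c.insns))


def best_insns_first(candidates):
    """New ordering for X86: fewer instructions wins, registers break ties."""
    return min(candidates, key=lambda c: (c.insns, c.regs))
```

Given `Cost(insns=5, regs=2)` and `Cost(insns=4, regs=3)`, the registers-first ordering picks the former and the instructions-first ordering picks the latter.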
* Removing an unused variable that was missed with the refactoring in r310272; NFC. (Aaron Ballman, 2017-08-07, 1 file, -3/+0)
| | | | | | llvm-svn: 310285
* [AMDGPU] Add pseudo "old" source to all DPP instructions (Connor Abbott, 2017-08-07, 7 files, -37/+41)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: All instructions with the DPP modifier may not write to certain lanes of the output if bound_ctrl=1 is set or any bits in bank_mask or row_mask aren't set, so the destination register may be both defined and modified. The right way to handle this is to add a constraint that the destination register is the same as one of the inputs. We could tie the destination to the first source, but that would be too restrictive for some use-cases where we want the destination to be some other value before the instruction executes. Instead, add a fake "old" source and tie it to the destination. Effectively, the "old" source defines what value unwritten lanes will get. We'll expose this functionality to users with a new intrinsic later. Also, we want to use DPP instructions for computing derivatives, which means we need to set WQM for them. We also need to enable the entire wavefront when using DPP intrinsics to implement nonuniform subgroup reductions, since otherwise we'll get incorrect results in some cases. To accommodate this, add a new operand to all DPP instructions which will be interpreted by the SI WQM pass. This will be exposed with a new intrinsic later. We'll also add support for Whole Wavefront Mode later. I also fixed llvm.amdgcn.mov.dpp to overwrite the source and fixed up the test. However, I could also keep the old behavior (where lanes that aren't written are undefined) if people want it. Reviewers: tstellar, arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D34716 llvm-svn: 310283
* AMDGPU: Remove -mcpu=SI (Matt Arsenault, 2017-08-07, 50 files, -67/+63)
| | | | | | Leftover from before amdgcn/r600 split. llvm-svn: 310277
* AMDGPU: Remove redundant opt level check (Matt Arsenault, 2017-08-07, 1 file, -2/+1)
| | | | | | addOptimizedRegAlloc isn't used for -O0 already. llvm-svn: 310275
* AMDGPU: Remove FixControlFlowLiveIntervals pass (Matt Arsenault, 2017-08-07, 4 files, -97/+0)
| | | | | | | | | | | | | | This hasn't done anything in a long time. This was running after the control flow pseudos were expanded, so this would never find them. The control flow pseudo expansion was moved to solve the problem this pass was supposed to solve in the first place, except handling it earlier also fixes it for fast regalloc which doesn't use LiveIntervals. Noticed by checking LCOV reports. llvm-svn: 310274
* [InstCombine] Support (X | C1) & C2 --> (X & C2^(C1&C2)) | (C1&C2) for vector splats (Craig Topper, 2017-08-07, 2 files, -15/+47)
| | | | | | | | | | Note the original code I deleted incorrectly listed this as (X | C1) & C2 --> (X & C2^(C1&C2)) | C1, which is only valid if C1 is a subset of C2. This relied on SimplifyDemandedBits to remove any extra bits from C1 before we got to that code. My new implementation avoids relying on that behavior so that it can be naively verified with alive. Differential Revision: https://reviews.llvm.org/D36384 llvm-svn: 310272
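The claim above can be checked by brute force on small bit widths. The sketch below assumes the grouping `X & (C2 ^ (C1 & C2))` for the first term of the folded form; the function names are made up for this example and it checks scalars only, not vector splats.

```python
from itertools import product


def original(x, c1, c2):
    return (x | c1) & c2


def folded(x, c1, c2):
    # The fold from the commit title, with explicit grouping.
    return (x & (c2 ^ (c1 & c2))) | (c1 & c2)


def old_comment_form(x, c1, c2):
    # The form the deleted comment listed: ORs in all of C1.
    return (x & (c2 ^ (c1 & c2))) | c1


def counterexamples(candidate, width=4):
    """All (x, c1, c2) triples of `width`-bit values where `candidate`
    disagrees with the original expression."""
    values = range(1 << width)
    return [(x, c1, c2) for x, c1, c2 in product(values, repeat=3)
            if original(x, c1, c2) != candidate(x, c1, c2)]
```

`counterexamples(folded)` is empty, while every triple returned by `counterexamples(old_comment_form)` has some bit set in `c1` that is clear in `c2`, matching the "only valid if C1 is a subset of C2" remark.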
* AMDGPU: Use a custom areInlineCompatible (Matt Arsenault, 2017-08-07, 4 files, -0/+134)
| | | | | | | Fixes not inlining OpenCL library functions on AMDGPU, which don't have an explicitly set target-cpu. llvm-svn: 310269
* [X86][AVX] Add full test coverage of subvector_broadcasts from registers (Simon Pilgrim, 2017-08-07, 1 file, -0/+648)
| | | | | | | | X86SubVBroadcast is for memory subvector broadcasts, but we must test that it handles all cases without the load as well just in case. This was noticed while I was triaging the test cases from PR34041. llvm-svn: 310268
* [DebugInfo][DWARF] Address paulr's comment on rL310253. (Simon Dardis, 2017-08-07, 1 file, -1/+1)
| | | | llvm-svn: 310267
* [X86][AVX] Cleanup subvector broadcast tests - remove old prefixes. (Simon Pilgrim, 2017-08-07, 1 file, -183/+183)
| | | | llvm-svn: 310265
* [x86] revert r310208 to investigate test-suite failures (PR34105 / PR34097) (Sanjay Patel, 2017-08-07, 12 files, -178/+251)
| | | | llvm-svn: 310264
* [DebugInfo][DWARF] Correct some usages of PRIx32 to PRIx64 (Simon Dardis, 2017-08-07, 1 file, -4/+4)
| | | | | | | | | | | These lead to tests failing spuriously as the values after being rendered to a string were incorrect. Reviewers: clayborg Differential Revision: https://reviews.llvm.org/D36319 llvm-svn: 310262
* [SLP] General improvements of SLP vectorization process. (Alexey Bataev, 2017-08-07, 5 files, -206/+224)
| | | | | | | | | | | | | | | | | | This patch tries to improve the two-pass vectorization analysis in the SLP vectorizer. What it does: 1. Defines key nodes, which are the vectorization roots. Previously, vectorization started only when a StoreInst or ReturnInst was found. Now, vectorization starts for all Instructions with no users and void types (Terminators, StoreInst) plus CallInsts. 2. CmpInsts, InsertElementInsts and InsertValueInsts are stored in an array. This array is processed only after vectorization of the first key node that follows these instructions has finished. Vectorization goes in reverse order to try to vectorize as much code as possible. Reviewers: mzolotukhin, Ayal, mkuper, gilr, hfinkel, RKSimon Subscribers: ashahid, anemet, RKSimon, mssimpso, llvm-commits Differential Revision: https://reviews.llvm.org/D29826 llvm-svn: 310260
* Fix typo in comment (Matt Arsenault, 2017-08-07, 1 file, -2/+2)
| | | | llvm-svn: 310259
* AMDGPU: Cleanup subtarget features (Matt Arsenault, 2017-08-07, 72 files, -155/+163)
| | | | | | | | | | | | Try to avoid mutually exclusive features. Don't use a real default GPU, and use a fake "generic". The goal is to make it easier to see which set of features are incompatible between feature strings. Most of the test changes are due to random scheduling changes from not having a default fullspeed model. llvm-svn: 310258
* Revert "[SLP] General improvements of SLP vectorization process." (Alexey Bataev, 2017-08-07, 5 files, -226/+206)
| | | | | | This reverts commit r310255. llvm-svn: 310257
* [DAG] Extend visitSCALAR_TO_VECTOR optimization to truncated vector. (Nirav Dave, 2017-08-07, 3 files, -20/+40)
| | | | | | | | | | | | | | | | Relanding after case to insert explicit truncation as necessary. Allow SCALAR_TO_VECTOR of EXTRACT_VECTOR_ELT to reduce to EXTRACT_SUBVECTOR of vector shuffle when output is smaller. Marginally improves vector shuffle computations. Reviewers: efriedma, RKSimon, spatel Subscribers: javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D35566 llvm-svn: 310256
* [SLP] General improvements of SLP vectorization process. (Alexey Bataev, 2017-08-07, 5 files, -206/+226)
| | | | | | | | | | | | | | | | | Summary: This patch tries to improve the two-pass vectorization analysis in the SLP vectorizer. What it does: 1. Defines key nodes, which are the vectorization roots. Previously, vectorization started only when a StoreInst or ReturnInst was found. Now, vectorization starts for all Instructions with no users and void types (Terminators, StoreInst) plus CallInsts. 2. CmpInsts, InsertElementInsts and InsertValueInsts are stored in an array. This array is processed only after vectorization of the first key node that follows these instructions has finished. Vectorization goes in reverse order to try to vectorize as much code as possible. Reviewers: mzolotukhin, Ayal, mkuper, gilr, hfinkel, RKSimon Subscribers: ashahid, anemet, RKSimon, mssimpso, llvm-commits Differential Revision: https://reviews.llvm.org/D29826 llvm-svn: 310255