summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* [AMDGPU] drop getIsFP td helperStanislav Mekhanoshin2019-10-173-23/+13
| | | | | | | | | We already have isFloatType helper, and they are out of sync. Drop one and merge the type list. Differential Revision: https://reviews.llvm.org/D69138 llvm-svn: 375175
* [NFC][InstCombine] Some more preparatory cleanup for ↵Roman Lebedev2019-10-171-4/+4
| | | | | | dropRedundantMaskingOfLeftShiftInput() llvm-svn: 375153
* [PowerPC] Turn on CR-Logical reducer passNemanja Ivanovic2019-10-171-1/+1
| | | | | | | | | | | | | | | | | Quite a while ago, we implemented a pass that will reduce the number of CR-logical operations we emit. It does so by converting a CR-logical operation into a branch. We have kept this off by default because it seemed to cause a significant regression with one benchmark. However, that regression turned out to be due to a completely unrelated reason - AADB introducing a self-copy that is a priority-setting nop and it was just exacerbated by this pass. Now that we understand the reason for the only degradation, we can turn this pass on by default. We have long since fixed the cause for the degradation. Differential revision: https://reviews.llvm.org/D52431 llvm-svn: 375152
* Reapply r375051: [support] GlobPattern: add support for `\` and `[!...]`, ↵Jordan Rupprecht2019-10-171-7/+16
| | | | | | | | | | | | | | | | | | | | | | | | and allow `]` in more places Reland r375051 (reverted in r375052) after fixing lld tests on Windows in r375126 and r375131. Original description: Update GlobPattern in libSupport to handle a few more cases. It does not fully match the `fnmatch` used by GNU objcopy since named character classes (e.g. `[[:digit:]]`) are not supported, but this should support most existing use cases (mostly just `*` is what's used anyway). This will be used to implement the `--wildcard` flag in llvm-objcopy to be more compatible with GNU objcopy. This is split off of D66613 to land the libSupport changes separately. The llvm-objcopy part will land soon. Reviewers: jhenderson, MaskRay, evgeny777, espindola, alexshap Reviewed By: MaskRay Subscribers: nickdesaulniers, emaste, arichardson, hiraditya, jakehehrlich, abrachet, seiya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66613 llvm-svn: 375149
* NFC: Fix variable only used in asserts by propagating the value.Sterling Augustine2019-10-171-3/+4
| | | | | | | | | | | | | | Summary: This fixes builds with assertions disabled that would otherwise fail with unused variable warnings Subscribers: nemanjai, hiraditya, kbarton, MaskRay, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69123 llvm-svn: 375148
* [IndVars] Split loop predication out of optimizeLoopExits [NFC]Philip Reames2019-10-171-11/+42
| | | | | | In the process of writing D69009, I realized we have two distinct sets of invariants within this single function, and basically no shared logic. The optimize loop exit transforms (including the new one in D69009) only care about *analyzeable* exits. Loop predication, on the other hand, has to reason about *all* exits. At the moment, we have the property (due to the requirement for an exact btc) that all exits are analyzeable, but that will likely change in the future as we add widenable condition support. llvm-svn: 375138
* [codeview] Workaround for PR43479, don't re-emit instr labelsReid Kleckner2019-10-171-4/+12
| | | | | | | | | | | | | | | | | | | | Summary: In the long run we should come up with another mechanism for marking call instructions as heap allocation sites, and remove this workaround. For now, we've had two bug reports about this, so let's apply this workaround. SLH (the other client of instruction labels) probably has the same bug, but the solution there is more likely to be to mark the call instruction as not duplicatable, which doesn't work for debug info. Reviewers: akhuang Subscribers: aprantl, hiraditya, aganea, chandlerc, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69068 llvm-svn: 375137
* [IndVars] Factor out a helper function for readability [NFC]Philip Reames2019-10-171-7/+20
| | | | llvm-svn: 375133
* [AIX] TOC pseudo expansion for 64bit large + 64bit small + 32bit large modelsXiangling Liao2019-10-174-11/+111
| | | | | | | | | This patch provides support for peudo ops including ADDIStocHA8, ADDIStocHA, LWZtocL, LDtoc, LDtocL for AIX, lowering them from MIR to assembly. Differential Revision: https://reviews.llvm.org/D68341 llvm-svn: 375113
* [AMDGPU] Improve code size cost modelDaniil Fukalov2019-10-173-3/+37
| | | | | | | | | | | | | | | | | | | Summary: Added estimation for zero size insertelement, extractelement and llvm.fabs operators. Updated inline/unroll parameters default values. Reviewers: rampitec, arsenm Reviewed By: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68881 llvm-svn: 375109
* [ARM][MVE] Enable truncating masked storesSam Parker2019-10-172-33/+35
| | | | | | | | | | Allow us to generate truncating masked store which take v4i32 and v8i16 vectors and can store to v4i8, v4i16 and v8i8 and memory. Removed support for unaligned masked stores. Differential Revision: https://reviews.llvm.org/D68461 llvm-svn: 375108
* JumpThreadingPass::UnfoldSelectInstr - silence static analyzer dyn_cast<> ↵Simon Pilgrim2019-10-171-1/+1
| | | | | | | | null dereference warning. NFCI. The static analyzer is warning about a potential null dereference, but we should be able to use cast<> directly and if not assert will fire for us. llvm-svn: 375103
* [LoopIdiom] BCmp: check, not assert that loop exits exit out of the loop ↵Roman Lebedev2019-10-171-7/+8
| | | | | | | | | | | | | | | (PR43687) We can't normally stumble into that assertion because a tautological *conditional* `br` in loop body is required, one that always branches to loop latch. But that should have been always folded to an unconditional branch before we get it. But that is not guaranteed if the pass is run standalone. So let's just promote the assertion into a proper check. Fixes https://bugs.llvm.org/show_bug.cgi?id=43687 llvm-svn: 375100
* Reland: Dead Virtual Function EliminationOliver Stannard2019-10-176-36/+215
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove dead virtual functions from vtables with replaceNonMetadataUsesWith, so that CGProfile metadata gets cleaned up correctly. Original commit message: Currently, it is hard for the compiler to remove unused C++ virtual functions, because they are all referenced from vtables, which are referenced by constructors. This means that if the constructor is called from any live code, then we keep every virtual function in the final link, even if there are no call sites which can use it. This patch allows unused virtual functions to be removed during LTO (and regular compilation in limited circumstances) by using type metadata to match virtual function call sites to the vtable slots they might load from. This information can then be used in the global dead code elimination pass instead of the references from vtables to virtual functions, to more accurately determine which functions are reachable. To make this transformation safe, I have changed clang's code-generation to always load virtual function pointers using the llvm.type.checked.load intrinsic, instead of regular load instructions. I originally tried writing this using clang's existing code-generation, which uses the llvm.type.test and llvm.assume intrinsics after doing a normal load. However, it is possible for optimisations to obscure the relationship between the GEP, load and llvm.type.test, causing GlobalDCE to fail to find virtual function call sites. The existing linkage and visibility types don't accurately describe the scope in which a virtual call could be made which uses a given vtable. This is wider than the visibility of the type itself, because a virtual function call could be made using a more-visible base class. I've added a new !vcall_visibility metadata type to represent this, described in TypeMetadata.rst. The internalization pass and libLTO have been updated to change this metadata when linking is performed. This doesn't currently work with ThinLTO, because it needs to see every call to llvm.type.checked.load in the linkage unit. It might be possible to extend this optimisation to be able to use the ThinLTO summary, as was done for devirtualization, but until then that combination is rejected in the clang driver. To test this, I've written a fuzzer which generates random C++ programs with complex class inheritance graphs, and virtual functions called through object and function pointers of different types. The programs are spread across multiple translation units and DSOs to test the different visibility restrictions. I've also tried doing bootstrap builds of LLVM to test this. This isn't ideal, because only classes in anonymous namespaces can be optimised with -fvisibility=default, and some parts of LLVM (plugins and bugpoint) do not work correctly with -fvisibility=hidden. However, there are only 12 test failures when building with -fvisibility=hidden (and an unmodified compiler), and this change does not cause any new failures for either value of -fvisibility. On the 7 C++ sub-benchmarks of SPEC2006, this gives a geomean code-size reduction of ~6%, over a baseline compiled with "-O2 -flto -fvisibility=hidden -fwhole-program-vtables". The best cases are reductions of ~14% in 450.soplex and 483.xalancbmk, and there are no code size increases. I've also run this on a set of 8 mbed-os examples compiled for Armv7M, which show a geomean size reduction of ~3%, again with no size increases. I had hoped that this would have no effect on performance, which would allow it to awlays be enabled (when using -fwhole-program-vtables). However, the changes in clang to use the llvm.type.checked.load intrinsic are causing ~1% performance regression in the C++ parts of SPEC2006. It should be possible to recover some of this perf loss by teaching optimisations about the llvm.type.checked.load intrinsic, which would make it worth turning this on by default (though it's still dependent on -fwhole-program-vtables). Differential revision: https://reviews.llvm.org/D63932 llvm-svn: 375094
* [ARM][MVE] Change VPST to use, not def, VPRSam Parker2019-10-171-1/+1
| | | | | | | | Unlike VPT, VPST just uses the current value of VPR.P0. Differential Revision: https://reviews.llvm.org/D69037 llvm-svn: 375087
* [DFAPacketizer] Use DFAEmitter. NFC.James Molloy2019-10-171-67/+12
| | | | | | | | | | | | | | | Summary: This is a NFC change that removes the NFA->DFA construction and emission logic from DFAPacketizerEmitter and instead uses the generic DFAEmitter logic. This allows DFAPacketizer to use the Automaton class from Support and remove a bunch of logic there too. After this patch, DFAPacketizer is mostly logic for grepping Itineraries and collecting functional units, with no state machine logic. This will allow us to modernize by removing the 16-functional-unit limit and supporting non-itinerary functional units. This is all for followup patches. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68992 llvm-svn: 375086
* [DAGCombine][ARM] Enable extending masked loadsSam Parker2019-10-175-37/+141
| | | | | | | | | | | Add generic DAG combine for extending masked loads. Allow us to generate sext/zext masked loads which can access v4i8, v8i8 and v4i16 memory to produce v4i32, v8i16 and v4i32 respectively. Differential Revision: https://reviews.llvm.org/D68337 llvm-svn: 375085
* [Alignment][NFC] Use Align for TargetFrameLowering/SubtargetGuillaume Chatelet2019-10-1733-83/+89
| | | | | | | | | | | | | | | | | Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, Jim, lenary, s.egerton, pzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68993 llvm-svn: 375084
* [ThinLTO] Import virtual method with single implementation in hybrid modeEugene Leviant2019-10-171-34/+43
| | | | | | Differential revision: https://reviews.llvm.org/D68782 llvm-svn: 375083
* Move LiveRangeCalc header to publicily available position. NFCMarcello Maggioni2019-10-177-303/+6
| | | | | | Differential Revision: https://reviews.llvm.org/D69078 llvm-svn: 375075
* Fix unused variable in r375066Daniel Sanders2019-10-171-2/+2
| | | | llvm-svn: 375070
* [gicombiner] Add the run-time rule disable optionDaniel Sanders2019-10-173-2/+20
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: Each generated helper can be configured to generate an option that disables rules in that helper. This can be used to bisect rulesets. The disable bits are stored in a SparseVector as this is very cheap for the common case where nothing is disabled. It gets more expensive the more rules are disabled but you're generally doing that for debug purposes where performance is less of a concern. Depends on D68426 Reviewers: volkan, bogner Reviewed By: volkan Subscribers: hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68438 llvm-svn: 375067
* [GISel][CombinerHelper] Add concat_vectors(build_vector, build_vector) => ↵Quentin Colombet2019-10-172-0/+86
| | | | | | | | | | | | | build_vector Teach the combiner helper how to flatten concat_vectors of build_vectors into a build_vector. Add this combine as part of AArch64 pre-legalizer combiner. Differential Revision: https://reviews.llvm.org/D69071 llvm-svn: 375066
* [gicombiner] Hoist pure C++ combine into the tablegen definitionDaniel Sanders2019-10-164-13/+16
| | | | | | | | | | | | | | | | | | | | | | Summary: This is just moving the existing C++ code around and will be NFC w.r.t AArch64. Renamed 'CombineBr' to something more descriptive ('ElideByByInvertingCond') at the same time. The remaining combines in AArch64PreLegalizeCombiner require features that aren't implemented at this point and will be hoisted as they are added. Depends on D68424 Reviewers: bogner, volkan Subscribers: kristof.beyls, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68426 llvm-svn: 375057
* [NFC] Fix unused var in release buildsJordan Rupprecht2019-10-161-0/+1
| | | | llvm-svn: 375053
* Revert [support] GlobPattern: add support for `\` and `[!...]`, and allow ↵Jordan Rupprecht2019-10-161-16/+7
| | | | | | | | | | `]` in more places This reverts r375051 (git commit a409afaad64ce83ea44cc30ee5f96b6e613a6e98) The patch does not work on Windows due to `\` in filenames being interpreted as escaping rather than literal path separators when used by lld linker scripts. llvm-svn: 375052
* [support] GlobPattern: add support for `\` and `[!...]`, and allow `]` in ↵Jordan Rupprecht2019-10-161-7/+16
| | | | | | | | | | | | | | | | | | | | | | | | more places Summary: Update GlobPattern in libSupport to handle a few more cases. It does not fully match the `fnmatch` used by GNU objcopy since named character classes (e.g. `[[:digit:]]`) are not supported, but this should support most existing use cases (mostly just `*` is what's used anyway). This will be used to implement the `--wildcard` flag in llvm-objcopy to be more compatible with GNU objcopy. This is split off of D66613 to land the libSupport changes separately. The llvm-objcopy part will land soon. Reviewers: jhenderson, MaskRay, evgeny777, espindola, alexshap Reviewed By: MaskRay Subscribers: nickdesaulniers, emaste, arichardson, hiraditya, jakehehrlich, abrachet, seiya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66613 undo objcopy changes to make this libsupport only llvm-svn: 375051
* [Utils] Cleanup similar cases to MergeBlockIntoPredecessor.Alina Sbirlea2019-10-163-69/+49
| | | | | | | | | | | | | | | | | | | Summary: There are two cases where a block is merged into its predecessor and the MergeBlockIntoPredecessor API is not used. Update the API so it can be reused in the other cases, in order to avoid code duplication. Cleanup motivated by D68659. Reviewers: chandlerc, sanjoy.google, george.burgess.iv Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68670 llvm-svn: 375050
* [AArch64] Fix offset calculationShoaib Meenai2019-10-162-4/+4
| | | | | | | | | | | | | | | | | r374772 changed Offset to be an int64_t but left NewOffset as an int. Scale is unsigned, so in the calculation `Offset - NewOffset * Scale`, `NewOffset * Scale` was promoted to unsigned and was then zero-extended to 64 bits, leading to an incorrect computation which manifested as an out-of-memory when building the Swift standard library for Android aarch64. Promote NewOffset to int64_t to fix this, and promote EmittableOffset as well, since its one user passes it to a function which takes an int64_t anyway. Test case based on a suggestion by Sander de Smalen! Differential Revision: https://reviews.llvm.org/D69018 llvm-svn: 375043
* GlobalISel: Implement lower for G_SADDO/G_SSUBOMatt Arsenault2019-10-163-3/+43
| | | | | | | Port directly from SelectionDAG, minus the path using ISD::SADDSAT/ISD::SSUBSAT. llvm-svn: 375042
* [Symbolize] Use the local MSVC C++ demangler instead of relying on dbghelp. NFC.Martin Storsjo2019-10-161-25/+11
| | | | | | | | | This allows making a couple llvm-symbolizer tests run in all environments. Differential Revision: https://reviews.llvm.org/D68133 llvm-svn: 375041
* [IndVars] Fix a miscompile in off-by-default loop predication implementationPhilip Reames2019-10-161-5/+33
| | | | | | | | | | | | The problem is that we can have two loop exits, 'a' and 'b', where 'a' and 'b' would exit at the same iteration, 'a' precedes 'b' along some path, and 'b' is predicated while 'a' is not. In this case (see the previously submitted test case), we causing the loop to exit through 'b' whereas it should have exited through 'a'. This only applies to loop exits where the exit counts are not provably inequal, but that isn't as much of a restriction as it appears. If we could order the exit counts, we'd have already removed one of the two exits. In theory, we might be able to prove inequality w/o ordering, but I didn't really explore that piece. Instead, I went for the obvious restriction and ensured we didn't predicate exits following non-predicateable exits. Credit goes to Evgeny Brevnov for figuring out the problematic case. Fuzzing probably also found it (failures seen), but due to some silly infrastructure problems I hadn't gotten to the results before Evgeny hand reduced it from a benchmark (he manually enabled the transform). Once this is fixed, I'll try to filter through the fuzzer failures to see if there's anything additional lurking. Differential Revision https://reviews.llvm.org/D68956 llvm-svn: 375038
* [AMDGPU] Do not combine dpp mov reading physregsStanislav Mekhanoshin2019-10-161-0/+6
| | | | | | | | We cannot be sure physregs will stay unchanged. Differential Revision: https://reviews.llvm.org/D69065 llvm-svn: 375033
* [AMDGPU] Do not combine dpp with physreg defStanislav Mekhanoshin2019-10-161-0/+4
| | | | | | | | We will remove dpp mov along with the physreg def otherwise. Differential Revision: https://reviews.llvm.org/D69063 llvm-svn: 375030
* [SLP] avoid reduction transform on patterns that the backend can ↵Sanjay Patel2019-10-161-0/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | load-combine (2nd try) The 1st attempt at this modified the cost model in a bad way to avoid the vectorization, but that caused problems for other users (the loop vectorizer) of the cost model. I don't see an ideal solution to these 2 related, potentially large, perf regressions: https://bugs.llvm.org/show_bug.cgi?id=42708 https://bugs.llvm.org/show_bug.cgi?id=43146 We decided that load combining was unsuitable for IR because it could obscure other optimizations in IR. So we removed the LoadCombiner pass and deferred to the backend. Therefore, preventing SLP from destroying load combine opportunities requires that it recognizes patterns that could be combined later, but not do the optimization itself ( it's not a vector combine anyway, so it's probably out-of-scope for SLP). Here, we add a cost-independent bailout with a conservative pattern match for a multi-instruction sequence that can probably be reduced later. In the x86 tests shown (and discussed in more detail in the bug reports), SDAG combining will produce a single instruction on these tests like: movbe rax, qword ptr [rdi] or: mov rax, qword ptr [rdi] Not some (half) vector monstrosity as we currently do using SLP: vpmovzxbq ymm0, dword ptr [rdi + 1] # ymm0 = mem[0],zero,zero,.. vpsllvq ymm0, ymm0, ymmword ptr [rip + .LCPI0_0] movzx eax, byte ptr [rdi] movzx ecx, byte ptr [rdi + 5] shl rcx, 40 movzx edx, byte ptr [rdi + 6] shl rdx, 48 or rdx, rcx movzx ecx, byte ptr [rdi + 7] shl rcx, 56 or rcx, rdx or rcx, rax vextracti128 xmm1, ymm0, 1 vpor xmm0, xmm0, xmm1 vpshufd xmm1, xmm0, 78 # xmm1 = xmm0[2,3,0,1] vpor xmm0, xmm0, xmm1 vmovq rax, xmm0 or rax, rcx vzeroupper ret Differential Revision: https://reviews.llvm.org/D67841 llvm-svn: 375025
* [NFC][XCOFF][AIX] Rename ControlSections to CsectGroupJason Liu2019-10-161-4/+4
| | | | | | | | | | | The name of ControlSections is not expressive enough to convey what they really are. CsectGroup can better communicate the concept of grouping csects together since they have similar property. Reviewer: daltenty Differential Revision: https://reviews.llvm.org/D69001 llvm-svn: 375021
* CombinerHelper - silence dead assignment warnings. NFCI.Simon Pilgrim2019-10-161-9/+9
| | | | | | Copy the NewAlignment value to Alignment first and then use that to update the stack frame object alignments. llvm-svn: 375019
* [AMDGPU] Supress unused sdwa insts generationStanislav Mekhanoshin2019-10-163-25/+55
| | | | | | | | | Do not generate non-existing sdwa instructions. It reduces the number of generated instructions by 185. Differential Revision: https://reviews.llvm.org/D69010 llvm-svn: 375016
* [SVE][IR] Small TypeSize improvements left out of initial commitGraham Hunter2019-10-161-4/+4
| | | | | | | The commit for D53137 left out the last round of improvements requested by reviewers. Adding those in now. llvm-svn: 375013
* [AArch64,Assembler] Compiler support for ID_MMFR5_EL1Mark Murray2019-10-161-0/+1
| | | | | | | | | | | | Summary: Add read-only system register ID_MMFR5_EL1 and unit tests. Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69039 llvm-svn: 375010
* [Remarks] Add support for prepending a path to external filesFrancis Visoiu Mistrih2019-10-165-17/+39
| | | | | | | | | This helps with testing and debugging for paths that are assumed absolute. It also uses a FileError to provide the file path it's trying to open. llvm-svn: 375008
* bpf: fix wrong truncation elimination when there is back-edge/loopJiong Wang2019-10-164-167/+205
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, BPF backend is doing truncation elimination. If one truncation is performed on a value defined by narrow loads, then it could be redundant given BPF loads zero extend the destination register implicitly. When the definition of the truncated value is a merging value (PHI node) that could come from different code paths, then checks need to be done on all possible code paths. Above described optimization was introduced as r306685, however it doesn't work when there is back-edge, for example when loop is used inside BPF code. For example for the following code, a zero-extended value should be stored into b[i], but the "and reg, 0xffff" is wrongly eliminated which then generates corrupted data. void cal1(unsigned short *a, unsigned long *b, unsigned int k) { unsigned short e; e = *a; for (unsigned int i = 0; i < k; i++) { b[i] = e; e = ~e; } } The reason is r306685 was trying to do the PHI node checks inside isel DAG2DAG phase, and the checks are done on MachineInstr. This is actually wrong, because MachineInstr is being built during isel phase and the associated information is not completed yet. A quick search shows none target other than BPF is access MachineInstr info during isel phase. For an PHI node, when you reached it during isel phase, it may have all predecessors linked, but not successors. It seems successors are linked to PHI node only when doing SelectionDAGISel::FinishBasicBlock and this happens later than PreprocessISelDAG hook. Previously, BPF program doesn't allow loop, there is probably the reason why this bug was not exposed. This patch therefore fixes the bug by the following approach: - The existing truncation elimination code and the associated "load_to_vreg_" records are removed. - Instead, implement truncation elimination using MachineSSA pass, this is where all information are built, and keep the pass together with other similar peephole optimizations inside BPFMIPeephole.cpp. Redundant move elimination logic is updated accordingly. - Unit testcase included + no compilation errors for kernel BPF selftest. Patch Review === Patch was sent to and reviewed by BPF community at: https://lore.kernel.org/bpf Reported-by: David Beckett <david.beckett@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> llvm-svn: 375007
* [RISCV] Add MachineInstr immediate verificationLuis Marques2019-10-166-4/+101
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch implements the `TargetInstrInfo::verifyInstruction` hook for RISC-V. Currently the hook verifies the machine instruction's immediate operands, to check if the immediates are within the expected bounds. Without the hook invalid immediates are not detected except when doing assembly parsing, so they are silently emitted (including being truncated when emitting object code). The bounds information is specified in tablegen by using the `OperandType` definition, which sets the `MCOperandInfo`'s `OperandType` field. Several RISC-V-specific immediate operand types were created, which extend the `MCInstrDesc`'s `OperandType` `enum`. To have the hook called with `llc` pass it the `-verify-machineinstrs` option. For Clang add the cmake build config `-DLLVM_ENABLE_EXPENSIVE_CHECKS=True`, or temporarily patch `TargetPassConfig::addVerifyPass`. Review concerns: - The patch adds immediate operand type checks that cover at least the base ISA. There are several other operand types for the C extension and one type for the F/D extensions that were left out of this initial patch because they introduced further design concerns that I felt were best evaluated separately. - Invalid register classes (e.g. passing a GPR register where a GPRC is expected) are already caught, so were not included. - This design makes the more abstract `MachineInstr` verification depend on MC layer definitions, which arguably is not the cleanest design, but is in line with how things are done in other parts of the target and LLVM in general. - There is some duplication of logic already present in the `MCOperandPredicate`s. Since the `MachineInstr` and `MCInstr` notions of immediates are fundamentally different, this is currently necessary. Reviewers: asb, lenary Reviewed By: lenary Subscribers: hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67397 llvm-svn: 375006
* [AMDGPU] Fix-up cases where writelane has 2 SGPR operandsDavid Stuttard2019-10-162-0/+87
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Even though writelane doesn't have the same constraints as other valu instructions it still can't violate the >1 SGPR operand constraint Due to later register propagation (e.g. fixing up vgpr operands via readfirstlane) changing writelane to only have a single SGPR is tricky. This implementation puts a new check after SIFixSGPRCopies that prevents multiple SGPRs being used in any writelane instructions. The algorithm used is to check for trivial copy prop of suitable constants into one of the SGPR operands and perform that if possible. If this isn't possible put an explicit copy of Src1 SGPR into M0 and use that instead (this is allowable for writelane as the constraint is for SGPR read-port and not constant-bus access). Reviewers: rampitec, tpr, arsenm, nhaehnle Reviewed By: rampitec, arsenm, nhaehnle Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, mgorny, yaxunl, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D51932 Change-Id: Ic7553fa57440f208d4dbc4794fc24345d7e0e9ea llvm-svn: 375004
* RedirectingFileSystem::openFileForRead - replace bitwise & with boolean && ↵Simon Pilgrim2019-10-161-1/+1
| | | | | | | | to fix warning Seems to be just a typo - now matches other instances which do something similar llvm-svn: 374995
* RealFile - fix self-initialization warning in constructor.Simon Pilgrim2019-10-161-3/+3
| | | | llvm-svn: 374994
* [InstCombine][AMDGPU] Fix crash with v3i16/v3f16 buffer intrinsicsPiotr Sobczak2019-10-161-0/+7
| | | | | | | | | | | | | | | | | | | | Summary: This is something of a workaround to avoid a crash later on in type legalizer (WidenVectorResult()). Also added some f16 tests, including a non-working v3f16 case with a FIXME. Reviewers: arsenm, tpr, nhaehnle Reviewed By: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68865 llvm-svn: 374993
* Revert "[HardwareLoops] Optimisation remarks"Sjoerd Meijer2019-10-161-81/+23
| | | | | | | | while I investigate the PPC build bot failures. This reverts commit ad763751565b9663bc338fa2ca5ade86c6ca22ec. llvm-svn: 374992
* [ARM] Add a register class for GPR pairs without SP and use it. NFCIMikhail Maltsev2019-10-162-7/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Currently Thumb2InstrInfo.cpp uses a register class which is auto-generated by tablegen. Such approach is fragile because auto-generated classes might change when other register classes are added. For example, before https://reviews.llvm.org/D62667 we were using GPRPair_with_gsub_1_in_rGPRRegClass, but had to change it to GPRPair_with_gsub_1_in_GPRwithAPSRnospRegClass because the former class stopped being generated (this did not change the functionality though). This patch adds a register class consisting of even-odd GPR register pairs from (R0, R1) to (R10, R11), which excludes (R12, SP) and uses it in Thumb2InstrInfo.cpp instead of GPRPair_with_gsub_1_in_GPRwithAPSRnospRegClass. Reviewers: ostannard, simon_tatham, dmgreen, efriedma Reviewed By: simon_tatham Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69026 llvm-svn: 374990
* SimpleLoopUnswitch - fix uninitialized variable and null dereference ↵Simon Pilgrim2019-10-161-2/+3
| | | | | | warnings. NFCI. llvm-svn: 374986
OpenPOWER on IntegriCloud