path: root/llvm/lib
Commit message | Author | Age | Files | Lines
* [CGP] Fix the rematerialization of gc.relocatesSerguei Katkov2017-08-171-0/+15
| | | | | | | | | | | | | | | | | | | If we want to substitute the relocation of a derived pointer with a GEP of the base, we must ensure that the relocation of the base dominates the relocation of the derived pointer. Currently only a basic-block check is present. However, both relocations may be in the same basic block, with the relocation of the derived pointer defined earlier. In that case the patch moves the relocation of the base pointer right before the relocation of the derived pointer. Reviewers: sanjoy,artagnon,igor-laevsky,reames Reviewed By: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36462 llvm-svn: 311067
* Revert "[MachineCopyPropagation] Extend pass to do COPY source forwarding"Geoff Berry2017-08-173-549/+24
| | | | | | | | | | This reverts commit r311038. Several buildbots are breaking, and at least one appears to be due to the forwarding of physical regs enabled by this change. Reverting while I investigate further. llvm-svn: 311062
* ARM: mark CPSR as clobbered for Windows VLAsSaleem Abdulrasool2017-08-171-0/+4
| | | | | | | | | | | | | When lowering a VLA, we emit a `__chkstk` call. However, this call can internally clobber CPSR. We did not mark this register as an ImpDef, which could potentially allow a comparison to be hoisted above the call to `__chkstk`. In such a case, the CPSR could be clobbered, and the check invalidated. When the support was initially added, it seemed that the call would take care of preventing CPSR from being clobbered, but this is not the case. Mark the register as clobbered to fix a possible state corruption. llvm-svn: 311061
* [X86] Exchange the memory op predicate for PALIGNR/VPALIGNR. I accidentally ↵Craig Topper2017-08-171-2/+2
| | | | | | swapped them. llvm-svn: 311060
* [X86] Cleanup multiclasses for SSE/AVX2 PALIGNR. Add missing load patterns.Craig Topper2017-08-171-43/+21
| | | | | | | | We used to have a separate multiclass for AVX2 and SSE/AVX. Now we have one multiclass and pass the relevant differences. We were also missing load patterns, though we had them for the AVX-512 version. llvm-svn: 311059
* [X86] Remove patterns for PALIGNR with non-vXi8 types.Craig Topper2017-08-173-37/+5
| | | | llvm-svn: 311058
* Reapply: [ADCE][Dominators] Teach ADCE to preserve dominatorsJakub Kuderski2017-08-171-7/+46
| | | | | | | | | | | | | | | | | | | | | Summary: This patch teaches ADCE to preserve both DominatorTrees and PostDominatorTrees. I didn't notice any performance impact when bootstrapping clang with this patch. The patch was originally committed in r311039 and reverted in r311049. This revision fixes the problem with not adding a dependency on the DominatorTreeWrapperPass for the LegacyPassManager. Reviewers: dberlin, chandlerc, sanjoy, davide, grosser, brzycki Reviewed By: davide Subscribers: grandinj, zhendongsu, llvm-commits, david2050 Differential Revision: https://reviews.llvm.org/D35869 llvm-svn: 311057
* [X86] Put multiclass closer to its use and simplify slightly. NFCCraig Topper2017-08-161-10/+11
| | | | llvm-svn: 311055
* [X86] Use a static array instead of a SmallVector for a small fixed size ↵Craig Topper2017-08-161-2/+2
| | | | | | array. NFC llvm-svn: 311054
* [InstCombine] Teach canEvaluateTruncated to handle arithmetic shift ↵Amjad Aboud2017-08-161-0/+17
| | | | | | | | (including those with vector splat shift amount) Differential Revision: https://reviews.llvm.org/D36784 llvm-svn: 311050
* Revert "[ADCE][Dominators] Teach ADCE to preserve dominators"Jakub Kuderski2017-08-161-43/+7
| | | | | | | This reverts commit r311039. The patch caused the `test/Bindings/OCaml/Output/scalar_opts.ml` to fail. llvm-svn: 311049
* [Analysis] Fix some Clang-tidy modernize and Include What You Use warnings; ↵Eugene Zelenko2017-08-165-81/+166
| | | | | | other minor fixes (NFC). llvm-svn: 311048
* [InstCombine] Make folding (X >s -1) ? C1 : C2 --> ((X >>s 31) & (C2 - C1)) ↵Craig Topper2017-08-161-17/+22
| | | | | | | | | | + C1 support splat vectors This also uses decomposeBitTestICmp to decode the compare. Differential Revision: https://reviews.llvm.org/D36781 llvm-svn: 311044
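The fold named in this commit's title can be checked numerically. The sketch below models 32-bit two's-complement arithmetic in Python; the helper names (`to_signed`, `select_form`, `folded_form`) are illustrative assumptions, not LLVM functions, and the scalar i32 case stands in for one lane of a splat vector.

```python
BITS = 32  # assumed scalar width; a splat vector applies the same fold per lane


def to_signed(value, bits=BITS):
    """Wrap an integer to `bits` bits and read it as two's complement."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def select_form(x, c1, c2):
    """Original pattern: (X >s -1) ? C1 : C2."""
    return c1 if x > -1 else c2


def folded_form(x, c1, c2, bits=BITS):
    """Folded pattern: ((X >>s 31) & (C2 - C1)) + C1, with wrapping arithmetic."""
    sign_mask = x >> (bits - 1)      # arithmetic shift: 0 for x >= 0, -1 otherwise
    diff = to_signed(c2 - c1, bits)  # C2 - C1 may wrap, as it would in i32
    return to_signed((sign_mask & diff) + c1, bits)
```

For a non-negative X the mask is 0 and the result is C1; for a negative X the mask is all ones, so the wrapped difference is added back to C1 and yields C2.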
* [ADCE][Dominators] Teach ADCE to preserve dominatorsJakub Kuderski2017-08-161-7/+43
| | | | | | | | | | | | | | | | | Summary: This patch teaches ADCE to preserve both DominatorTrees and PostDominatorTrees. I didn't notice any performance impact when bootstrapping clang with this patch. Reviewers: dberlin, chandlerc, sanjoy, davide, grosser, brzycki Reviewed By: davide Subscribers: grandinj, zhendongsu, llvm-commits, david2050 Differential Revision: https://reviews.llvm.org/D35869 llvm-svn: 311039
* [MachineCopyPropagation] Extend pass to do COPY source forwardingGeoff Berry2017-08-163-24/+549
| | | | | | | | | | | | | | | | | | This change extends MachineCopyPropagation to do COPY source forwarding. This change also extends the MachineCopyPropagation pass to be able to be run during register allocation, after physical registers have been assigned, but before the virtual registers have been re-written, which allows it to remove virtual register COPY LiveIntervals that become dead through the forwarding of all of their uses. Reviewers: qcolombet, javed.absar, MatzeB, jonpa Subscribers: jyknight, nemanjai, llvm-commits, nhaehnle, mcrosier, mgorny Differential Revision: https://reviews.llvm.org/D30751 llvm-svn: 311038
* [LoopDataPrefetch][AArch64FalkorHWPFFix] Preserve ScalarEvolutionGeoff Berry2017-08-162-6/+2
| | | | | | | | | | | | | | | | | | | | | Summary: Mark LoopDataPrefetch and AArch64FalkorHWPFFix passes as preserving ScalarEvolution since they do not alter loop structure and should not alter any SCEV values (though LoopDataPrefetch may introduce new instructions that won't have cached SCEV values yet). This can result in slight code differences, mainly w.r.t. nsw/nuw flags on SCEVs, since these are computed somewhat lazily when a zext/sext instruction is encountered. As a result, passes after the modified passes may see SCEVs with more nsw/nuw flags present. Reviewers: sanjoy, anemet Subscribers: aemerson, rengolin, mzolotukhin, javed.absar, kristof.beyls, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D36716 llvm-svn: 311032
* Add a convenience overload of DWARFDie::dump() for debugging purposes.Adrian Prantl2017-08-161-0/+2
| | | | llvm-svn: 311026
* Add more commentXinliang David Li2017-08-161-1/+9
| | | | llvm-svn: 311025
* [PGO] Fix ThinLTO crash Xinliang David Li2017-08-161-0/+6
| | | | | | Differential Revision: http://reviews.llvm.org/D36640 llvm-svn: 311023
* [AMDGPU] NFC: test commitEvgeny Mankov2017-08-161-10/+10
| | | | llvm-svn: 311019
* AMDGPU/NFC: Sort files in CMakeLists.txt alphabeticallyKonstantin Zhuravlyov2017-08-161-17/+17
| | | | llvm-svn: 311017
* [Dominators] Introduce batch updatesJakub Kuderski2017-08-161-0/+10
| | | | | | | | | | | | | | | | | Summary: This patch introduces a way of informing the (Post)DominatorTree about multiple CFG updates that happened since the last tree update. This makes performing tree updates much easier, as it internally takes care of applying the updates in lockstep with the (virtual) updates to the CFG, which is done by reverse-applying future CFG updates. The batch updater is able to remove redundant updates that cancel each other out. In the future, it should be also possible to reorder updates to reduce the amount of work needed to perform the updates. Reviewers: dberlin, sanjoy, grosser, davide, brzycki Reviewed By: brzycki Subscribers: mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D36167 llvm-svn: 311015
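The idea of dropping updates that cancel each other out can be sketched as follows. This is a simplified illustration, not the code from D36167: the tuple layout `(kind, src, dst)` and the function name `drop_cancelling_updates` are assumptions made for the example, and it further assumes the input is a valid sequence (the same edge is never inserted or deleted twice in a row).

```python
def drop_cancelling_updates(updates):
    """Return the net CFG edge updates, omitting insert/delete pairs that cancel.

    `updates` is a list of (kind, src, dst) tuples with kind in
    {"insert", "delete"}. Hypothetical representation for illustration only.
    """
    net = {}  # (src, dst) -> net count; dicts keep first-seen order
    for kind, src, dst in updates:
        edge = (src, dst)
        net[edge] = net.get(edge, 0) + (1 if kind == "insert" else -1)

    # Edges whose net count is zero were inserted and deleted again (or the
    # reverse), so the tree does not need to hear about them at all.
    return [
        ("insert" if count > 0 else "delete", src, dst)
        for (src, dst), count in net.items()
        if count != 0
    ]
```

For example, inserting and then deleting the edge A -> B inside one batch leaves only the other updates in the batch to be applied.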
* [BDCE] Don't check demanded bits on unsized typesHal Finkel2017-08-161-2/+13
| | | | | | | | | | | | | | | To clear assumptions that are potentially invalid after trivialization, we need to walk the use/def chain. Normally, the only way to reach an instruction with an unsized type is via an instruction that has side effects (or otherwise will demand its input bits). That would stop the walk. However, if we have a readnone function that returns an unsized type (e.g., void), we must avoid asking for the demanded bits of the function call's return value. A void-returning readnone function is always dead (and so we can stop walking the use/def chain here), but the check is necessary to avoid asserting. Fixes PR34211. llvm-svn: 311014
* [Verifier] Reject globals without a type associated.Davide Italiano2017-08-161-0/+1
| | | | llvm-svn: 311012
* [AMDGPU][MC][GFX9] Added op_sel support for v_mad_*16, v_fma_f16, ↵Dmitry Preobrazhensky2017-08-161-66/+85
| | | | | | | | | | | | v_div_fixup_f16 This change implements features postponed in https://reviews.llvm.org/D35424 because of a dependency on https://reviews.llvm.org/D36322 Reviewers: SamWot, artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D36694 llvm-svn: 311011
* [DemandedBits] simplify call; NFCSanjay Patel2017-08-161-1/+1
| | | | llvm-svn: 311009
* Revert "MachineInstr: Reason locally about some memory objects before going ↵Balaram Makam2017-08-161-42/+17
| | | | | | | | | | | | | to AA." r310825 caused the clang-ppc64le-linux-lnt bot to go red (http://lab.llvm.org:8011/builders/clang-ppc64le-linux-lnt/builds/5712) because of a test-suite failure of SingleSource/UnitTests/2003-07-09-SignedArgs This reverts commit 0028f6a87224fb595a1c19c544cde9b003035996. llvm-svn: 311008
* [AMDGPU][MC][GFX9] Added integer clamping support for VOP3 opcodesDmitry Preobrazhensky2017-08-1613-45/+166
| | | | | | | | | | See Bug 34152: https://bugs.llvm.org//show_bug.cgi?id=34152 Reviewers: SamWot, artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D36674 llvm-svn: 311006
* [CostModel][X86][XOP] Improve costs for XOP shufflesSimon Pilgrim2017-08-161-0/+22
| | | | | | VPPERM/VPERMIL2PD/VPERMIL2PS all provide more effective 2-input shuffles than regular AVX instructions llvm-svn: 311005
* [mips] Handle variables with an explicit section and interactions with ↵Simon Dardis2017-08-161-0/+16
| | | | | | | | | | | | | | | | | | .sdata, .sbss If a variable has an explicit section such as .sdata or .sbss, it is placed in that section and accessed in a gp relative manner. This overrides the global -G setting. Otherwise if a variable has an explicit section attached to it, such as '.rodata' or '.mysection', it is not placed in the small data section. This also overrides the global -G setting. Reviewers: atanasyan, nitesh.jain Differential Revision: https://reviews.llvm.org/D36616 llvm-svn: 311001
* [ARM] Improve loop unrolling for Cortex-MSam Parker2017-08-161-6/+19
| | | | | | | | | | | - Set the default runtime unroll count to 4 and use the newly added UnrollRemainder option. - Create loop cost and force unroll for a cost less than 12. - Disable unrolling on Thumb1 only targets. Differential Revision: https://reviews.llvm.org/D36134 llvm-svn: 310997
* [COFF] Make the weak aliases optionalMartin Storsjo2017-08-162-3/+3
| | | | | | | | | | | | | | | | When creating an import library from lld, the cases with Name != ExtName shouldn't end up as a weak alias, but as a real export of the new name, which is what actually is exported from the DLL. This restores the behaviour of renamed exports to what it was in 4.0. The other half of this commit, including test, goes into lld. Differential Revision: https://reviews.llvm.org/D36633 llvm-svn: 310991
* [llvm-dlltool] Fix creating stdcall/fastcall import libraries for i386Martin Storsjo2017-08-163-4/+26
| | | | | | | | | | | | | | | | | | | | | Hook up the -k option (that in the original GNU dlltool removes the @n suffix from the symbol that the final executable ends up linked to). In llvm-dlltool, make sure that functions end up with the undecorate name type if this option is set and they are decorated. In mingw, when creating import libraries from def files instead of creating an import library as a side effect of linking a DLL, the symbol names in the def contain the stdcall/fastcall decoration (but no leading underscore). By setting the undecorate name type, a linker linking to the import library will omit the decoration from the DLL import entry. With this in place, mingw-w64 for i386 built with llvm-dlltool/clang produces import libraries that actually work. Differential Revision: https://reviews.llvm.org/D36548 llvm-svn: 310990
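As a rough illustration of what the undecorate name type means for a linker, the sketch below applies the rule as it is described in the PE/COFF specification: skip one leading `?`, `@`, or `_`, then truncate at the first `@`. The function name `undecorate` is hypothetical and this is not the llvm-dlltool implementation; it only shows how a decorated symbol maps to the DLL import entry.

```python
def undecorate(symbol):
    """Apply the PE/COFF 'undecorate' import-name rule to a symbol name.

    Assumption: the rule is 'skip one leading ?, @ or _, then truncate at
    the first @', as worded in the PE/COFF specification.
    """
    name = symbol[1:] if symbol[:1] in ("?", "@", "_") else symbol
    return name.split("@", 1)[0]
```

Under this rule a stdcall symbol such as `_func@8` and a fastcall symbol such as `@func@8` both resolve to the import name `func`, while an undecorated name is left unchanged.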
* [COFF] Add SymbolName as a distinct field in COFFImportFileMartin Storsjo2017-08-161-1/+1
| | | | | | | | | | | The previous Name and ExtName aren't enough to convey all the nuances between weak aliases and stdcall decorated function names. A test for this will be added in LLD. Differential Revision: https://reviews.llvm.org/D36544 llvm-svn: 310988
* [AMDGPU] Eliminate no effect instructions before s_endpgmStanislav Mekhanoshin2017-08-161-3/+63
| | | | | | Differential Revision: https://reviews.llvm.org/D36585 llvm-svn: 310987
* Merge debug info when hoist then-else code to if.Dehao Chen2017-08-161-0/+2
| | | | | | | | | | | | | | Summary: When we move then-else code to if, we need to merge its debug info, otherwise the hoisted instruction may have inaccurate debug info attached. Reviewers: aprantl, probinson, dblaikie, echristo, loladiro Reviewed By: aprantl Subscribers: sanjoy, llvm-commits Differential Revision: https://reviews.llvm.org/D36778 llvm-svn: 310985
* [VirtRegRewriter] Properly model the register liveness on undef subreg ↵Quentin Colombet2017-08-161-1/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | definition Undef subreg definition means that the content of the super register doesn't matter at this point. While that's true for virtual registers, this may not hold when replacing them with actual physical registers. Indeed, some part of the physical register may be coalesced with the related virtual register, and thus the values for those parts matter and must be live. The fix consists of checking whether subregs of the physical register being assigned to an undef subreg definition are live through that def, and inserting an implicit use if they are. Doing so keeps them alive until that point, as they should be.
E.g., let vreg14 be assigned to R0_R1, then:
  %vreg14:gsub_0<def,read-undef> = COPY %R0 ; <-- R1 is still live here
  %vreg14:gsub_1<def> = COPY %R1
Before this change, the rewriter would change the code into:
  %R0<def> = KILL %R0, %R0_R1<imp-def> ; <-- this tells R1 is redefined
  %R1<def> = KILL %R1, %R0_R1<imp-def>, %R0_R1<imp-use> ; this value of this R1 is believed to come from the previous instruction
Because of this invalid liveness, later passes could make wrong choices and, in particular, clobber a live register, as happened with the register scavenger in llvm.org/PR34107.
Now we would generate:
  %R0<def> = KILL %R0, %R0_R1<imp-def>, %R0_R1<imp-use> ; This tells R1 needs to reach this point
  %R1<def> = KILL %R1, %R0_R1<imp-def>, %R0_R1<imp-use>
The bug has been here forever; it got exposed recently because the register scavenger got smarter. Fixes llvm.org/PR34107 llvm-svn: 310979
* [InstCombine] Teach canEvaluateZExtd and canEvaluateTruncated to handle ↵Craig Topper2017-08-151-10/+18
| | | | | | | | | | vector shifts with splat shift amount We were only allowing ConstantInt before. This patch allows splat of ConstantInt too. Differential Revision: https://reviews.llvm.org/D36763 llvm-svn: 310970
* Reapply "[GlobalISel] Remove the GISelAccessor API."Quentin Colombet2017-08-158-200/+75
| | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit r310425, thus reapplying r310335 with a fix for a link issue in the AArch64 unittests on Linux bots when BUILD_SHARED_LIBS is ON. Original commit message: [GlobalISel] Remove the GISelAccessor API. Its sole purpose was to avoid spreading around ifdefs related to building global-isel. Since r309990, GlobalISel is not optional anymore, thus, we can get rid of this mechanism altogether. NFC. ---- The fix for the link issue consists of adding the GlobalISel library to the list of dependencies for the AArch64 unittests. This dependency comes from the use of AArch64Subtarget, which needs to know how to destruct the GISel-related APIs when being destroyed. Thanks to Bill Seurer and Ahmed Bougacha for helping me reproduce and understand the problem. llvm-svn: 310969
* [InstCombine] Added support for (X >>s C) << C --> X & (-1 << C)Amjad Aboud2017-08-151-2/+2
| | | | | | Differential Revision: https://reviews.llvm.org/D36743 llvm-svn: 310949
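The identity in this commit's title, (X >>s C) << C --> X & (-1 << C), can be sanity-checked with a small model of 32-bit signed arithmetic. The names `to_signed`, `shift_pair`, and `masked` are illustrative assumptions, and the shift amount is assumed to satisfy 0 <= C < 32.

```python
BITS = 32  # assumed integer width


def to_signed(value, bits=BITS):
    """Wrap an integer to `bits` bits and read it as two's complement."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def shift_pair(x, c, bits=BITS):
    """Original pattern: arithmetic shift right by C, then shift left by C."""
    return to_signed((x >> c) << c, bits)


def masked(x, c, bits=BITS):
    """Folded pattern: clear the low C bits with the mask -1 << C."""
    return to_signed(x & (-1 << c), bits)
```

Both forms clear the low C bits of X and leave the remaining bits, including the sign bit, untouched, which is why a single mask can replace the two shifts.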
* [InstCombine] sink sext after ashrSanjay Patel2017-08-151-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Narrow ops are better for bit-tracking, and in the case of vectors, may enable better codegen. As the trunc test shows, this can allow follow-on simplifications. There's a block of code in visitTrunc that deals with shifted ops with FIXME comments. It may be possible to remove some of that now, but I want to make sure there are no problems with this step first. http://rise4fun.com/Alive/Y3a
Name: hoist_ashr_ahead_of_sext_1
  %s = sext i8 %x to i32
  %r = ashr i32 %s, 3 ; shift value is < than source bit width
=>
  %a = ashr i8 %x, 3
  %r = sext i8 %a to i32
Name: hoist_ashr_ahead_of_sext_2
  %s = sext i8 %x to i32
  %r = ashr i32 %s, 8 ; shift value is >= than source bit width
=>
  %a = ashr i8 %x, 7 ; so clamp this shift value
  %r = sext i8 %a to i32
Name: junc_the_trunc
  %a = sext i16 %v to i32
  %s = ashr i32 %a, 18
  %t = trunc i32 %s to i16
=>
  %t = ashr i16 %v, 15
llvm-svn: 310942
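The Alive transforms in this commit message can also be checked exhaustively for i8 with a short script. This is a sketch under stated assumptions: the helpers `sext`, `ashr_after_sext`, and `sext_after_ashr` are hypothetical names, values are passed as raw bit patterns, and the clamp to `narrow_bits - 1` mirrors the "clamp this shift value" comment in the second transform.

```python
def sext(raw, from_bits):
    """Sign-extend the low `from_bits` bits of `raw` to a Python int."""
    raw &= (1 << from_bits) - 1
    return raw - (1 << from_bits) if raw >> (from_bits - 1) else raw


def ashr_after_sext(raw, shift, narrow_bits=8):
    """Original order: widen first, then arithmetic shift in the wide type."""
    return sext(raw, narrow_bits) >> shift


def sext_after_ashr(raw, shift, narrow_bits=8):
    """Rewritten order: shift in the narrow type (clamped), then widen."""
    clamped = min(shift, narrow_bits - 1)
    narrow = sext(raw, narrow_bits) >> clamped
    return sext(narrow, narrow_bits)
```

Once the shift amount reaches the narrow width minus one, only copies of the sign bit remain, so clamping the amount does not change the result.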
* [Dominators] Include infinite loops in PostDominatorTreeJakub Kuderski2017-08-151-18/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch teaches PostDominatorTree about infinite loops. It is built on top of D29705 by @dberlin which includes a very detailed motivation for this change. What's new is that the patch also teaches the incremental updater how to deal with reverse-unreachable regions and how to properly maintain and verify tree roots. Before that, the incremental algorithm sometimes ended up preserving reverse-unreachable regions after updates that wouldn't appear in the tree if it was constructed from scratch on the same CFG.
This patch makes the following assumptions:
- A sequence of updates should produce the same tree as recalculating it.
- Any sequence of the same updates should lead to the same tree.
- Siblings and roots are unordered.
The last two properties are essential to efficiently perform batch updates in the future. When it comes to the first one, we can decide later that the consistency between a freshly built tree and an updated one doesn't matter much, as there are many correct ways to pick roots in infinite loops, and relax this assumption. That should enable us to recalculate postdominators less frequently. This patch is pretty conservative when it comes to incremental updates on reverse-unreachable regions and ends up recalculating the whole tree in many cases. It should be possible to improve the performance in many cases, if we decide that it's important enough. That being said, my experiments showed that reverse-unreachable regions are very rare in the IR emitted by clang when bootstrapping clang. Here are the statistics I collected by analyzing IR between passes and after each removePredecessor call:
```
# functions: 52283
# samples: 337609
# reverse unreachable BBs: 216022
# BBs: 247840796
Percent reverse-unreachable: 0.08716159869015269 %
Max(PercRevUnreachable) in a function: 87.58620689655172 %
# > 25 % samples: 471 ( 0.1395104988314885 % samples )
... in 145 ( 0.27733680163724345 % functions )
```
Most of the reverse-unreachable regions come from invalid IR where it wouldn't be possible to construct a PostDomTree anyway. I would like to commit this patch in the next week in order to be able to complete the work that depends on it before the end of my internship, so please don't wait long to voice your concerns :). Reviewers: dberlin, sanjoy, grosser, brzycki, davide, chandlerc, hfinkel Reviewed By: dberlin Subscribers: nhaehnle, javed.absar, kparzysz, uabelho, jlebar, hiraditya, llvm-commits, dberlin, david2050 Differential Revision: https://reviews.llvm.org/D35851 llvm-svn: 310940
* [ORC] Add case statements for AArch64 to the local stub and callback managerLang Hames2017-08-151-0/+13
| | | | | | | | creation functions. This should allow lli to lazily execute code using OrcLazyJIT on AArch64. llvm-svn: 310938
* Fix -Wunused-lambda-capture for Release build.Rui Ueyama2017-08-151-2/+2
| | | | | | | `I` and `this` are used only in assert or DEBUG, so they are unused in Release build. llvm-svn: 310934
* [llvm-dwarfdump] - Attempt to fix BB after r310915.George Rimar2017-08-151-1/+1
| | | | | | | Now MIPS one is unhappy: http://lab.llvm.org:8011/builders/llvm-mips-linux/builds/2221 llvm-svn: 310928
* [llvm-dwarfdump] - Refactor section name/uniqueness gathering.George Rimar2017-08-152-18/+20
| | | | | | | | | As requested in the D36313 thread, with this patch section names and uniqueness are calculated once, rather than every time a range is dumped. Differential revision: https://reviews.llvm.org/D36740 llvm-svn: 310923
* Revert r310919 - [globalisel][tablegen] Support zero-instruction emission.Daniel Sanders2017-08-151-11/+1
| | | | | | | | | | As expected, this failed on the windows bots but the instrumentation showed something interesting. The ADD8ri and INC8r rules are never directly compared on the windows machines. That implies that the issue lies in transitivity of the Compare predicate. I believe I've already verified that but maybe I missed something. llvm-svn: 310922
* Re-commit with some instrumentation: [globalisel][tablegen] Support ↵Daniel Sanders2017-08-151-1/+11
| | | | | | | | | | | | | | | | | | | | | | | | | zero-instruction emission. Summary: Support the case where an operand of a pattern is also the whole of the result pattern. In this case the original result and all its uses must be replaced by the operand. However, register class restrictions can require a COPY. This patch handles both cases by always emitting the copy and leaving it for the register allocator to optimize. The previous commit failed on the windows bots and this one is likely to fail on those same bots. However, the added instrumentation should reveal a particular isHigherPriorityThan() evaluation which I'm expecting to expose that these machines are weighing priority of two rules differently from the non-windows machines. Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar Subscribers: javed.absar, kristof.beyls, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D36084 llvm-svn: 310919
* [RISCV] Add RISCVInstPrinter and basic MC assembler testsAlex Bradbury2017-08-158-4/+141
| | | | | | | | | With the addition of RISCVInstPrinter, it is now possible to test the basic operation of the RISCV MC layer. Differential Revision: https://reviews.llvm.org/D23564 llvm-svn: 310917
* [llvm-dwarfdump] - Print section name and index when dumping .debug_info rangesGeorge Rimar2017-08-153-19/+53
| | | | | | | | | Teaches llvm-dwarfdump to print section index and name of range when it dumps .debug_info. Differential revision: https://reviews.llvm.org/D36313 llvm-svn: 310915