summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* DWARFDebugLoclists: Make it possible to read relocated addressesPavel Labath2019-11-053-15/+17
| | | | | | | | | | | | | | | | | Summary: Handling relocations was not needed when the loclists section was a DWO-only thing. But since DWARF5, it is possible to use it in regular objects too, and the standard permits embedding addresses into the section directly. These addresses need to be relocated in unlinked files. Reviewers: JDevlieghere, dblaikie, probinson Subscribers: aprantl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68271
* Recommit "[HardwareLoops] Optimisation remarks"Sjoerd Meijer2019-11-051-23/+82
| | | | | | | | | With a few things fixed: - initialisaiton of the optimisation remark pass (this was causing the buildbot failures on PPC), - a test case. Differential Revision: https://reviews.llvm.org/D69660
* [IR] Remove switch's default block that causes clang 8 raise erroraqjune2019-11-051-2/+0
|
* [X86] Lower the cost of avx512 horizontal bool and/or reductions to ↵Craig Topper2019-11-041-0/+21
| | | | | | | | | | | | | | | | | 2*log2(bitwidth)+1 for legal types. This better represents the kshift+binop we'd get for each stage before the final extract. Its likely we'll do even better by doing a kmov and a cmp with a GPR, but this is a good start. The default handling was costing a worst case single source permute shuffle of the vector before the binop. This worst case assumes the shuffle might have to be emulated with extracts and inserts. But since we know we're doing a reduction we can assume we'll get kshift lowering. There's still some room for improvement here, but this is much better than it was.
* [IR] Add Freeze instructionaqjune2019-11-0513-64/+93
| | | | | | | | | | | | | | | | | | Summary: - Define Instruction::Freeze, let it be UnaryOperator - Add support for freeze to LLLexer/LLParser/BitcodeReader/BitcodeWriter The format is `%x = freeze <ty> %v` - Add support for freeze instruction to llvm-c interface. - Add m_Freeze in PatternMatch. - Erase freeze when lowering IR to SelDag. Reviewers: deadalnix, hfinkel, efriedma, lebedev.ri, nlopes, jdoerfert, regehr, filcab, delcypher, whitequark Reviewed By: lebedev.ri, jdoerfert Subscribers: jfb, kristof.beyls, hiraditya, lebedev.ri, steven_wu, dexonsmith, xbolva00, delcypher, spatel, regehr, trentxintong, vsk, filcab, nlopes, mehdi_amini, deadalnix, llvm-commits Differential Revision: https://reviews.llvm.org/D29011
* [BPF] fix a use after free bugYonghong Song2019-11-041-2/+8
| | | | | | | | | | | | | | Commit fff2721286e1 ("[BPF] Fix CO-RE bugs with bitfields") fixed CO-RE handling bitfield issues. But the implementation introduced a use after free bug. The "Base" of the intrinsic might be freed so later on accessing the Type of "Base" might access the freed memory. The failed test case, CodeGen/BPF/CORE/offset-reloc-middle-chain.ll is exactly used to test such a case. Similarly to previous attempt to remember Metadata etc, remember "Base" pointee Alignment in advance to avoid such use after free bug.
* [X86] Teach X86MCInstLower to swap operands of commutable instructions to ↵Craig Topper2019-11-041-0/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | enable 2-byte VEX encoding. Summary: The 2 source operands commutable instructions are encoded in the VEX.VVVV field and the r/m field of the MODRM byte plus the VEX.B field. The VEX.B field is missing from the 2-byte VEX encoding. If the VEX.VVVV source is 0-7 and the other register is 8-15 we can swap them to avoid needing the VEX.B field. This works as long as the VEX.W, VEX.mmmmm, and VEX.X fields are also not needed. Fixes PR36706. Reviewers: RKSimon, spatel Reviewed By: RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68550
* [BPF] Fix CO-RE bugs with bitfieldsYonghong Song2019-11-041-38/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | bitfield handling is not robust with current implementation. I have seen two issues as described below. Issue 1: struct s { long long f1; char f2; char b1:1; } *p; The current approach will generate an access bit size 56 (from b1 to the end of structure) which will be rejected as it is not power of 2. Issue 2: struct s { char f1; char b1:3; char b2:5; char b3:6: char b4:2; char f2; }; The LLVM will group 4 bitfields together with 2 bytes. But loading 2 bytes is not correct as it violates alignment requirement. Note that sometimes, LLVM breaks a large bitfield groups into multiple groups, but not in this case. To resolve the above two issues, this patch takes a different approach. The alignment for the structure is used to construct the offset of the bitfield access. The bitfield incurred memory access is an aligned memory access with alignment/size equal to the alignment of the structure. This also simplified the code. This may not be the optimal memory access in terms of memory access width. But this should be okay since extracting the bitfield value will have the same amount of work regardless of what kind of memory access width. Differential Revision: https://reviews.llvm.org/D69837
* [AArch64] Update for ExynosEvandro Menezes2019-11-041-2/+4
| | | | Fix the costs of integer division.
* [AMDGPU] Added assert in SIFoldOperands before ptr use. NFC.Stanislav Mekhanoshin2019-11-041-0/+1
|
* [AMDGPU] deduplicate tablegen predicatesStanislav Mekhanoshin2019-11-0411-38/+46
| | | | | | | | | | | | | | | We are duplicating predicates if several parts of the combined predicate list contain the same condition. Added code to deduplicate the list. We have AssemblerPredicates and AssemblerPredicate in the PredicateControl, but we never use AssemblerPredicates with an actual list, so this one is dropped. This addresses the first part of the llvm bug 43886: https://bugs.llvm.org/show_bug.cgi?id=43886 Differential Revision: https://reviews.llvm.org/D69815
* [demangle] NFC: get rid of NodeOrStringErik Pilkington2019-11-042-19/+0
| | | | | | This class was a bit overengineered, and was triggering some PVS warnings. Instead, put strings into a NameType and let clients unconditionally treat it as a Node.
* [X86] Add support for -mvzeroupper and -mno-vzeroupper to match gccCraig Topper2019-11-044-53/+68
| | | | | | | | | | | | | | | | | -mvzeroupper will force the vzeroupper insertion pass to run on CPUs that normally wouldn't. -mno-vzeroupper disables it on CPUs where it normally runs. To support this with the default feature handling in clang, we need a vzeroupper feature flag in X86.td. Since this flag has the opposite polarity of the fast-partial-ymm-or-zmm-write we used to use to disable the pass, we now need to add this new flag to every CPU except KNL/KNM and BTVER2 to keep identical behavior. Remove -fast-partial-ymm-or-zmm-write which is no longer used. Differential Revision: https://reviews.llvm.org/D69786
* [SimplifyCFG] Use a (trivially) dominanting widenable branch to remove later ↵Philip Reames2019-11-041-0/+48
| | | | | | | | | | slow path blocks This transformation is a variation on the GuardWidening transformation we have checked in as it's own pass. Instead of focusing on merge (i.e. hoisting and simplifying) two widenable branches, this transform makes the observation that simply removing a second slowpath block (by reusing an existing one) is often a very useful canonicalization. This may lead to later merging, or may not. This is a useful generalization when the intermediate block has loads whose dereferenceability is hard to establish. As noted in the patch, this can be generalized further, and will be. Differential Revision: https://reviews.llvm.org/D69689
* [DAGCombine][MSP430] use shift amount threshold in DAGCombine (2/2)Sanjay Patel2019-11-041-36/+48
| | | | | | | | | | | | | | | | | Continuation of: D69116 Contributes to a fix for PR43559: https://bugs.llvm.org/show_bug.cgi?id=43559 See also D69099 and D69116 Use the TLI hook in DAGCombine.cpp to guard against creating shift nodes that are not optimal for a target. Patch by: @joanlluch (Joan LLuch) Differential Revision: https://reviews.llvm.org/D69120
* [X86] Fix uninitialized variable warnings. NFCI.Simon Pilgrim2019-11-0410-42/+42
|
* VirtualFileSystem - fix uninitialized variable warnings. NFCI.Simon Pilgrim2019-11-041-2/+2
|
* Fix static analysis warnings in ARM calling convention loweringOliver Stannard2019-11-041-22/+22
| | | | Fixes https://bugs.llvm.org/show_bug.cgi?id=43891
* Lower generic MASSV entries to PowerPC subtarget-specific entriesJinsong Ji2019-11-044-1/+175
| | | | | | | | | | | | | | | | | | | | | This patch (second of two patches) lowers the generic PowerPC vector entries to PowerPC subtarget-specific entries. For instance, the PowerPC generic entry 'cbrtd2_massv' is lowered to 'cbrtd2_P9' or Power9 subtarget. The first patch enables the vectorizer to recognize the IBM MASS vector library routines. This patch specifically adds support for recognizing the '-vector-library=MASSV' option, and defines mappings from IEEE standard scalar math functions to generic PowerPC MASS vector counterparts. For instance, the generic PowerPC MASS vector entry for double-precision 'cbrt' function is '__cbrtd2_massv' The overall support for MASS vector library is presented as such in two patches for ease of review. Patch by pjeeva01 (Jeeva P.) Differential Revision: https://reviews.llvm.org/D59883
* Recommit "[CodeView] Add option to disable inline line tables."Amy Huang2019-11-041-11/+30
| | | | | | | | | | | | This reverts commit 004ed2b0d1b86d424643ffc88fce20ad8bab6804. Original commit hash 6d03890384517919a3ba7fe4c35535425f278f89 Summary: This adds a clang option to disable inline line tables. When it is used, the inliner uses the call site as the location of the inlined function instead of marking it as an inline location with the function location. https://reviews.llvm.org/D67723
* [FPEnv][SelectionDAG] Refactor strict FP node constructionUlrich Weigand2019-11-041-23/+23
| | | | | | | | | | | Small refactoring in visitConstrainedFPIntrinsic that should make it easier to create DAG nodes requiring extra arguments. That is the case currently only for STRICT_FP_ROUND, but may be the case for additional nodes (in particular compares) in the future. Extracted from the patch for D69281. NFC.
* [SLP]Fix PR43799: Crash on different sizes of GEP indices.Alexey Bataev2019-11-041-2/+22
| | | | | | | | | | | | | | | Summary: If the GEP instructions are going to be vectorized, the indices in those GEP instructions must be of the same type. Otherwise, the compiler may crash when trying to build the vector constant. Reviewers: RKSimon, spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69627
* [X86] Convert ShrinkMode to scoped enum class. NFCI.Simon Pilgrim2019-11-041-11/+15
|
* [SystemZ] Use LivePhysRegs instead of isCCLiveOut() in SystemZElimCompare.cppJonas Paulsson2019-11-041-9/+4
| | | | | Review: Ulrich Weigand https://reviews.llvm.org/D68267
* [MachineVerifier] Improve verification of live-in lists.Jonas Paulsson2019-11-041-0/+26
| | | | | | | | | | | | | | | | | MachineVerifier::visitMachineFunctionAfter() is extended to check the live-through case for live-in lists. This is only done for registers without aliases and that are neither allocatable or reserved, such as the SystemZ::CC register. The MachineVerifier earlier only catched the case of a live-in use without an entry in the live-in list (as "using an undefined physical register"). A comment in LivePhysRegs.h has been added stating a guarantee that addLiveOuts() can be trusted for a full register both before and after register allocation. Review: Quentin Colombet https://reviews.llvm.org/D68267
* [ARM] Use isFMAFasterThanFMulAndFAdd for MVEDavid Green2019-11-043-30/+35
| | | | | | | | | | | | | | | The Arm backend will usually return false for isFMAFasterThanFMulAndFAdd, where both the fused VFMA.f32 and a non-fused VMLA.f32 are usually available for scalar code. For MVE we don't have the non-fused version though. It makes more sense for isFMAFasterThanFMulAndFAdd to return true, allowing us to simplify some of the existing ISel patterns. The tests here are that non of the existing tests failed, and so we are still selecting VFMA and VFMS. The one test that changed shows we can now select from fast math flags, as opposed to just relying on the isFMADLegalForFAddFSub option. Differential Revision: https://reviews.llvm.org/D69115
* [IR] adjust assert when replacing undef elements in vector constantSanjay Patel2019-11-041-1/+1
| | | | | | | | | | As noted in follow-up to: rGa1e8ad4f2fa7 It's not safe to assume that an element of the constant is always non-null. It's definitely not an expected case for the current instcombine user, but that may not hold if this function is eventually called from arbitrary places.
* [SystemZ] Fix typoUlrich Weigand2019-11-041-1/+1
| | | | Typo in comment. NFC.
* Revert "[LV] Apply sink-after & interleave-groups as VPlan transformations ↵Benjamin Kramer2019-11-045-170/+125
| | | | | | (NFC)" This reverts commit 2be17087f8c38934b7fc9208ae6cf4e9b4d44f4b. Fails ASAN.
* [ARM] Add vrev32 NEON fp16 patternsDavid Green2019-11-041-3/+13
| | | | | | | Fill in the gaps for vrev32.16 f16 patterns, extending the existing i16 patterns. Differential Revision: https://reviews.llvm.org/D69508
* [InstSimplify] use FMF to improve fcmp+select foldSanjay Patel2019-11-041-7/+10
| | | | | This is part of a series of patches needed to solve PR39535: https://bugs.llvm.org/show_bug.cgi?id=39535
* [SystemZ] Add GHC calling conventionUlrich Weigand2019-11-045-0/+77
| | | | | | | This is a special calling convention to be used by the GHC compiler. Author: Stefan Schulze Frielinghaus Differential Revision: https://reviews.llvm.org/D69024
* [X86] SimplifyDemandedVectorElts - attempt to recombine target shuffle using ↵Simon Pilgrim2019-11-041-0/+17
| | | | | | | | | | DemandedElts mask (REAPPLIED) If we don't demand all elements, then attempt to combine to a simpler shuffle. At the moment we can only do this if Depth == 0 as combineX86ShufflesRecursively uses Depth to track whether the shuffle has really changed or not - we'll need to change this before we can properly start merging combineX86ShufflesRecursively into SimplifyDemandedVectorElts (see D66004). This reapplies rL368307 (reverted at rL369167) after the fix for the infinite loop reported at PR43024 was applied at rG3f087e38a2e7b87a5adaaac1c1b61e51220e7ff3
* [FIX] Removed duplicated v4f16 and v8f16 declarationsDiogo Sampaio2019-11-041-10/+10
| | | | | | | | | | | | Reviewers: RKSimon, ostannard Reviewed By: RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69795
* [RISCV] Implement the TargetLowering::getRegisterByName hookLuís Marques2019-11-042-0/+26
| | | | | | | | | | | | | | | Summary: The hook should work for any RISC-V register. Non-allocatable registers do not need to be reserved, for the remaining the hook will only succeed if you pass clang the -ffixed-xX flag. This builds upon D67185, which currently only allows reserving GPRs. Reviewers: asb, lenary Reviewed By: lenary Tags: #llvm Differential Revision: https://reviews.llvm.org/D69130
* [hwasan] Remove lazy thread-initialisationDavid Spickett2019-11-041-27/+2
| | | | | | | | | | | | | | | | | | | | | This was an experiment made possible by a non-standard feature of the Android dynamic loader. It required introducing a flag to tell the compiler which ABI was being targeted. This flag is no longer needed, since the generated code now works for both ABI's. We leave that flag untouched for backwards compatibility. This also means that if we need to distinguish between targeted ABI's again we can do that without disturbing any existing workflows. We leave a comment in the source code and mention in the help text to explain this for any confused person reading the code in the future. Patch by Matthew Malcomson Differential Revision: https://reviews.llvm.org/D69574
* [SystemZ] Improve handling of huge PC relative immediate offsets.Jonas Paulsson2019-11-042-13/+40
| | | | | | | | | | | | | | | Demand that an immediate offset to a PC relative address fits in 32 bits, or else load it into a register and perform a separate add. Verify in the assembler that such immediate offsets fit the bitwidth. Even though the final address of a Load Address Relative Long may fit in 32 bits even with a >32 bit offset (depending on where the symbol lives relative to PC), the GNU toolchain demands the offset by itself to be in range. This patch adapts the same behavior for llvm. Review: Ulrich Weigand https://reviews.llvm.org/D69749
* [LV] Apply sink-after & interleave-groups as VPlan transformations (NFC)Gil Rapaport2019-11-045-125/+170
| | | | | | | | | | The sink-after and interleave-group vectorization decisions were so far applied to VPlan during initial VPlan construction, which complicates VPlan construction – also because of their inter-dependence. This patch refactors buildVPlanWithRecipes() to construct a simpler initial VPlan and later apply both these vectorization decisions, in order, as VPlan-to-VPlan transformations. Differential Revision: https://reviews.llvm.org/D68577
* Set the floating point status register as reservedPengfei Wang2019-11-031-0/+3
| | | | | | | | | | | | | | | | | | | | Summary: This patch sets the FPSW (X87 floating-point status register) as a reserved physical register and fix the test failure caused by [[ https://reviews.llvm.org/D68854| D68854 ]]. Before this patch, some tests will fail because it implicit uses FPSW without define it. Setting the FPSW as a reserved physical register will skip liveness analysis because it is always live. Reviewers: pengfei, craig.topper Reviewed By: craig.topper Subscribers: craig.topper, hiraditya, llvm-commits Patch by LiuChen. Differential Revision: https://reviews.llvm.org/D69784
* [X86][SSE] combineX86ShufflesRecursively - at Depth==0, only resolve ↵Simon Pilgrim2019-11-031-6/+31
| | | | | | | | KnownZero if it removes an input. This stops infinite loops where KnownUndef elements are converted to Zeroable, resulting in KnownZero elements which are then simplified (via SimplifyDemandedElts etc.) back to KnownUndef elements........ Prep fix for PR43024 which will allow rL368307 to be re-applied.
* [SIMachineScheduler] Fixed ''then' statement is equivalent to the 'else' ↵Dávid Bolvanský2019-11-031-6/+1
| | | | statement.' warning. NFCI.
* [SILoadStoreOptimizer] Fixed typo. NFCI.Dávid Bolvanský2019-11-031-1/+1
|
* Reland '[InstructionCombining] Fixed null check after dereferencing warning. ↵Dávid Bolvanský2019-11-031-2/+5
| | | | NFCI.'
* Revert "[InstructionCombining] Fixed null check after dereferencing warning. ↵Dávid Bolvanský2019-11-031-1/+0
| | | | | | NFCI." This reverts commit 8308187fd9bfa08ffad0a636d4dd1d25e7de5a76. This exposed a bug.
* [SCEV] Fixed 'Uninitialized variable 'ContainsAddRec' used.' warning. NFCI.Dávid Bolvanský2019-11-031-1/+1
|
* [MemorySSA] Fixed null check after dereferencing warning. NFCI.Dávid Bolvanský2019-11-031-0/+1
|
* Revert "[InstructionCompares] Fixed null check after dereferencing warning. ↵Dávid Bolvanský2019-11-031-1/+0
| | | | | | NFCI." This reverts commit b8685cf3042f6a2e129061922bd6b18e3c42258e.
* [InstructionCompares] Fixed null check after dereferencing warning. NFCI.Dávid Bolvanský2019-11-031-0/+1
|
* [InstructionCombining] Fixed null check after dereferencing warning. NFCI.Dávid Bolvanský2019-11-031-0/+1
|
* [CHR] Fixed null check after dereferencing warning. NFCI.Dávid Bolvanský2019-11-031-1/+1
|
OpenPOWER on IntegriCloud