path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
...
* [BPF] fix a use after free bug | Yonghong Song | 2019-11-04 | 1 | -2/+8
    Commit fff2721286e1 ("[BPF] Fix CO-RE bugs with bitfields") fixed CO-RE bitfield handling issues, but the implementation introduced a use-after-free bug. The "Base" of the intrinsic might be freed, so a later access to the Type of "Base" might read freed memory. The failing test case, CodeGen/BPF/CORE/offset-reloc-middle-chain.ll, exercises exactly this case.

    Similar to the previous approach of remembering Metadata etc., remember the "Base" pointee Alignment in advance to avoid the use after free.
* [X86] Teach X86MCInstLower to swap operands of commutable instructions to enable 2-byte VEX encoding. | Craig Topper | 2019-11-04 | 1 | -0/+46
    Summary: The two source operands of commutable instructions are encoded in the VEX.VVVV field and in the r/m field of the MODRM byte plus the VEX.B field. The VEX.B field is missing from the 2-byte VEX encoding. If the VEX.VVVV source is 0-7 and the other register is 8-15, we can swap them to avoid needing the VEX.B field. This works as long as the VEX.W, VEX.mmmmm, and VEX.X fields are also not needed.

    Fixes PR36706.

    Reviewers: RKSimon, spatel
    Reviewed By: RKSimon
    Subscribers: hiraditya, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D68550
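The swap condition described in this commit can be sketched as a small predicate. This is only an illustration of the rule stated in the message, not the code added to X86MCInstLower: the `VexSources` struct and the function names are hypothetical, registers are modelled as plain numbers 0-15, and the caller is assumed to have already checked that the instruction is commutable and that VEX.W, VEX.X and VEX.mmmmm are not needed.

```cpp
// Hypothetical model of the two source operands of a commutable VEX
// instruction. Register numbers 0-15 stand in for xmm0-xmm15.
struct VexSources {
  unsigned VvvvReg; // source encoded in the VEX.VVVV field
  unsigned RmReg;   // source encoded in ModRM.r/m (plus VEX.B for 8-15)
};

// Registers 8-15 need the VEX.B extension bit when placed in ModRM.r/m.
constexpr bool needsVexB(unsigned Reg) { return Reg >= 8; }

// Swapping helps only when the VVVV source is 0-7 and the r/m source is
// 8-15: after the swap, r/m holds a low register and VEX.B is not needed.
constexpr bool shouldSwapForTwoByteVex(VexSources S) {
  return !needsVexB(S.VvvvReg) && needsVexB(S.RmReg);
}

// Returns the operand assignment to encode, commuted when that is profitable.
constexpr VexSources commuteIfProfitable(VexSources S) {
  return shouldSwapForTwoByteVex(S) ? VexSources{S.RmReg, S.VvvvReg} : S;
}
```

For example, sources (VVVV = 3, r/m = 9) are commuted to (VVVV = 9, r/m = 3), while (9, 3) and (9, 12) are left alone: the first already avoids VEX.B, and swapping cannot help the second.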
* [BPF] Fix CO-RE bugs with bitfields | Yonghong Song | 2019-11-04 | 1 | -38/+34
    Bitfield handling is not robust with the current implementation. I have seen two issues, as described below.

    Issue 1:
        struct s {
          long long f1;
          char f2;
          char b1:1;
        } *p;
    The current approach will generate an access of bit size 56 (from b1 to the end of the structure), which will be rejected as it is not a power of 2.

    Issue 2:
        struct s {
          char f1;
          char b1:3;
          char b2:5;
          char b3:6;
          char b4:2;
          char f2;
        };
    LLVM will group the 4 bitfields together with 2 bytes. But loading 2 bytes is not correct as it violates the alignment requirement. Note that sometimes LLVM breaks a large bitfield group into multiple groups, but not in this case.

    To resolve the above two issues, this patch takes a different approach. The alignment of the structure is used to construct the offset of the bitfield access. The memory access incurred by the bitfield is an aligned memory access with alignment/size equal to the alignment of the structure. This also simplifies the code.

    This may not be the optimal memory access in terms of memory access width. But this should be okay since extracting the bitfield value takes the same amount of work regardless of the memory access width.

    Differential Revision: https://reviews.llvm.org/D69837
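One way to read the approach described above is as simple offset arithmetic: round the bitfield's byte position down to the structure alignment and load exactly one alignment-sized unit. The sketch below follows that reading only; it is not taken from the patch. `AlignedAccess`, `alignDown` and `alignedBitfieldAccess` are hypothetical names, and the sketch assumes the field does not straddle two aligned units.

```cpp
#include <cstdint>

// Hypothetical description of the aligned load that covers a bitfield.
struct AlignedAccess {
  uint64_t ByteOffset;      // start of the aligned unit within the struct
  uint64_t ByteSize;        // width of the load, equal to the struct alignment
  uint64_t BitOffsetInUnit; // position of the field relative to ByteOffset
};

// Round Value down to a multiple of Align (Align must be non-zero).
constexpr uint64_t alignDown(uint64_t Value, uint64_t Align) {
  return Value / Align * Align;
}

// Derive the aligned access from the field's bit offset in the struct and
// the struct's alignment in bytes.
constexpr AlignedAccess alignedBitfieldAccess(uint64_t FieldBitOffset,
                                              uint64_t StructAlignBytes) {
  return AlignedAccess{
      alignDown(FieldBitOffset / 8, StructAlignBytes), StructAlignBytes,
      FieldBitOffset - 8 * alignDown(FieldBitOffset / 8, StructAlignBytes)};
}
```

Assuming a typical ABI where `long long` is 8-byte aligned, `b1` in Issue 1 sits at bit offset 72, so the sketch yields an 8-byte load at byte offset 8 rather than a 56-bit access. In Issue 2 the structure alignment is 1, so `b3` at bit offset 16 becomes a 1-byte load at byte offset 2.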
* [AArch64] Update for Exynos | Evandro Menezes | 2019-11-04 | 1 | -2/+4
    Fix the costs of integer division.
* [AMDGPU] Added assert in SIFoldOperands before ptr use. NFC. | Stanislav Mekhanoshin | 2019-11-04 | 1 | -0/+1
* [AMDGPU] deduplicate tablegen predicates | Stanislav Mekhanoshin | 2019-11-04 | 11 | -38/+46
    We are duplicating predicates if several parts of the combined predicate list contain the same condition. Added code to deduplicate the list.

    We have AssemblerPredicates and AssemblerPredicate in the PredicateControl, but we never use AssemblerPredicates with an actual list, so this one is dropped.

    This addresses the first part of llvm bug 43886: https://bugs.llvm.org/show_bug.cgi?id=43886

    Differential Revision: https://reviews.llvm.org/D69815
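The idea of deduplicating a combined predicate list can be illustrated in C++. The actual change operates on TableGen predicate lists, so the following is only an analogy: `deduplicatePredicates` and the predicate names `PredA`, `PredB`, `PredC` are hypothetical, and keeping the first occurrence of each entry is an assumption of this sketch.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Remove repeated entries from a combined predicate list, keeping the first
// occurrence of each predicate so the original order is preserved.
std::vector<std::string>
deduplicatePredicates(const std::vector<std::string> &Preds) {
  std::unordered_set<std::string> Seen;
  std::vector<std::string> Unique;
  for (const std::string &P : Preds)
    if (Seen.insert(P).second) // insert() reports whether P was new
      Unique.push_back(P);
  return Unique;
}

// Small usage demo with made-up predicate names.
void demoDeduplicatePredicates() {
  const std::vector<std::string> Combined = {"PredA", "PredB", "PredA",
                                             "PredC", "PredB"};
  const std::vector<std::string> Expected = {"PredA", "PredB", "PredC"};
  assert(deduplicatePredicates(Combined) == Expected);
}
```

A hash set keeps the pass linear in the list length; for the short lists typical of predicate combinations a nested scan would work equally well.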
* [X86] Add support for -mvzeroupper and -mno-vzeroupper to match gcc | Craig Topper | 2019-11-04 | 4 | -53/+68
    -mvzeroupper will force the vzeroupper insertion pass to run on CPUs where it normally wouldn't run. -mno-vzeroupper disables it on CPUs where it normally runs.

    To support this with the default feature handling in clang, we need a vzeroupper feature flag in X86.td. Since this flag has the opposite polarity of the fast-partial-ymm-or-zmm-write flag we used to use to disable the pass, we now need to add this new flag to every CPU except KNL/KNM and BTVER2 to keep identical behavior.

    Remove -fast-partial-ymm-or-zmm-write, which is no longer used.

    Differential Revision: https://reviews.llvm.org/D69786
* [X86] Fix uninitialized variable warnings. NFCI. | Simon Pilgrim | 2019-11-04 | 10 | -42/+42
* Fix static analysis warnings in ARM calling convention lowering | Oliver Stannard | 2019-11-04 | 1 | -22/+22
    Fixes https://bugs.llvm.org/show_bug.cgi?id=43891
* Lower generic MASSV entries to PowerPC subtarget-specific entries | Jinsong Ji | 2019-11-04 | 4 | -1/+175
    This patch (second of two patches) lowers the generic PowerPC vector entries to PowerPC subtarget-specific entries. For instance, the PowerPC generic entry 'cbrtd2_massv' is lowered to 'cbrtd2_P9' for the Power9 subtarget.

    The first patch enables the vectorizer to recognize the IBM MASS vector library routines. It specifically adds support for recognizing the '-vector-library=MASSV' option, and defines mappings from IEEE standard scalar math functions to generic PowerPC MASS vector counterparts. For instance, the generic PowerPC MASS vector entry for the double-precision 'cbrt' function is '__cbrtd2_massv'.

    The overall support for the MASS vector library is presented in two patches for ease of review.

    Patch by pjeeva01 (Jeeva P.)

    Differential Revision: https://reviews.llvm.org/D59883
* [X86] Convert ShrinkMode to scoped enum class. NFCI. | Simon Pilgrim | 2019-11-04 | 1 | -11/+15
* [SystemZ] Use LivePhysRegs instead of isCCLiveOut() in SystemZElimCompare.cpp | Jonas Paulsson | 2019-11-04 | 1 | -9/+4
    Review: Ulrich Weigand
    https://reviews.llvm.org/D68267
* [ARM] Use isFMAFasterThanFMulAndFAdd for MVE | David Green | 2019-11-04 | 3 | -30/+35
    The Arm backend will usually return false for isFMAFasterThanFMulAndFAdd, where both the fused VFMA.f32 and a non-fused VMLA.f32 are usually available for scalar code. For MVE we don't have the non-fused version though. It makes more sense for isFMAFasterThanFMulAndFAdd to return true, allowing us to simplify some of the existing ISel patterns.

    The test here is that none of the existing tests failed, so we are still selecting VFMA and VFMS. The one test that changed shows we can now select from fast math flags, as opposed to just relying on the isFMADLegalForFAddFSub option.

    Differential Revision: https://reviews.llvm.org/D69115
* [SystemZ] Fix typo | Ulrich Weigand | 2019-11-04 | 1 | -1/+1
    Typo in comment. NFC.
* [ARM] Add vrev32 NEON fp16 patterns | David Green | 2019-11-04 | 1 | -3/+13
    Fill in the gaps for vrev32.16 f16 patterns, extending the existing i16 patterns.

    Differential Revision: https://reviews.llvm.org/D69508
* [SystemZ] Add GHC calling convention | Ulrich Weigand | 2019-11-04 | 5 | -0/+77
    This is a special calling convention to be used by the GHC compiler.

    Author: Stefan Schulze Frielinghaus
    Differential Revision: https://reviews.llvm.org/D69024
* [X86] SimplifyDemandedVectorElts - attempt to recombine target shuffle using DemandedElts mask (REAPPLIED) | Simon Pilgrim | 2019-11-04 | 1 | -0/+17
    If we don't demand all elements, then attempt to combine to a simpler shuffle.

    At the moment we can only do this if Depth == 0 as combineX86ShufflesRecursively uses Depth to track whether the shuffle has really changed or not - we'll need to change this before we can properly start merging combineX86ShufflesRecursively into SimplifyDemandedVectorElts (see D66004).

    This reapplies rL368307 (reverted at rL369167) after the fix for the infinite loop reported at PR43024 was applied at rG3f087e38a2e7b87a5adaaac1c1b61e51220e7ff3
* [FIX] Removed duplicated v4f16 and v8f16 declarations | Diogo Sampaio | 2019-11-04 | 1 | -10/+10
    Reviewers: RKSimon, ostannard
    Reviewed By: RKSimon
    Subscribers: hiraditya, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D69795
* [RISCV] Implement the TargetLowering::getRegisterByName hook | Luís Marques | 2019-11-04 | 2 | -0/+26
    Summary: The hook should work for any RISC-V register. Non-allocatable registers do not need to be reserved; for the remaining registers the hook will only succeed if you pass clang the -ffixed-xX flag. This builds upon D67185, which currently only allows reserving GPRs.

    Reviewers: asb, lenary
    Reviewed By: lenary
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D69130
* [SystemZ] Improve handling of huge PC relative immediate offsets. | Jonas Paulsson | 2019-11-04 | 2 | -13/+40
    Demand that an immediate offset to a PC relative address fits in 32 bits, or else load it into a register and perform a separate add.

    Verify in the assembler that such immediate offsets fit the bitwidth.

    Even though the final address of a Load Address Relative Long may fit in 32 bits even with a >32 bit offset (depending on where the symbol lives relative to PC), the GNU toolchain demands the offset by itself to be in range. This patch adopts the same behavior for llvm.

    Review: Ulrich Weigand
    https://reviews.llvm.org/D69749
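The rule stated in this commit (keep the offset in the instruction only if it fits in 32 bits, otherwise materialise it separately) can be sketched as follows. This is a simplified illustration rather than the SystemZ backend code: `SplitOffset`, `fitsInSigned32` and `splitPCRelOffset` are hypothetical names, treating the range as signed 32 bits is an assumption, and target details such as offset scaling or parity requirements are ignored.

```cpp
#include <cstdint>

// Hypothetical result of deciding where a PC-relative offset goes.
struct SplitOffset {
  int64_t InInstruction; // immediate kept on the PC-relative instruction
  int64_t InRegister;    // part loaded into a register and added separately
};

// True when V is representable as a signed 32-bit immediate.
constexpr bool fitsInSigned32(int64_t V) {
  return V >= INT32_MIN && V <= INT32_MAX;
}

// Keep the whole offset as an immediate when it is in range; otherwise move
// the whole offset to a register and leave the instruction immediate at 0.
constexpr SplitOffset splitPCRelOffset(int64_t Offset) {
  return fitsInSigned32(Offset) ? SplitOffset{Offset, 0}
                                : SplitOffset{0, Offset};
}
```

With this sketch an offset of 4096 stays on the instruction, while an offset of 2^32 is moved entirely into the register-plus-add path, which mirrors the GNU toolchain requirement that the offset itself be in range.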
* Set the floating point status register as reserved | Pengfei Wang | 2019-11-03 | 1 | -0/+3
    Summary: This patch sets the FPSW (X87 floating-point status register) as a reserved physical register and fixes the test failure caused by D68854 (https://reviews.llvm.org/D68854). Before this patch, some tests will fail because they implicitly use FPSW without defining it. Setting the FPSW as a reserved physical register will skip liveness analysis because it is always live.

    Reviewers: pengfei, craig.topper
    Reviewed By: craig.topper
    Subscribers: craig.topper, hiraditya, llvm-commits
    Patch by LiuChen.
    Differential Revision: https://reviews.llvm.org/D69784
* [X86][SSE] combineX86ShufflesRecursively - at Depth==0, only resolve KnownZero if it removes an input. | Simon Pilgrim | 2019-11-03 | 1 | -6/+31
    This stops infinite loops where KnownUndef elements are converted to Zeroable, resulting in KnownZero elements which are then simplified (via SimplifyDemandedElts etc.) back to KnownUndef elements.

    Prep fix for PR43024 which will allow rL368307 to be re-applied.
* [SIMachineScheduler] Fixed ''then' statement is equivalent to the 'else' statement.' warning. NFCI. | Dávid Bolvanský | 2019-11-03 | 1 | -6/+1
* [SILoadStoreOptimizer] Fixed typo. NFCI. | Dávid Bolvanský | 2019-11-03 | 1 | -1/+1
* [X86][SSE] combineX86ShufflesRecursively - don't bother merging shuffles with empty roots. NFCI. | Simon Pilgrim | 2019-11-03 | 1 | -92/+105
    This doesn't affect actual codegen, but is a minor refactor toward fixing PR43024 where we need to avoid excess changes (folding zeroables etc.) to the shuffle mask at Depth == 0.
* [X86] Convert PICStyles::Style to scoped enum class. NFCI. | Simon Pilgrim | 2019-11-03 | 2 | -11/+11
    Fixes MSVC static analyzer warnings about enum safety; this enum performs no integer math, so it'd be better to fix its scope.
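The safety difference between an unscoped enum and a scoped `enum class` can be shown in a few lines. The enumerator names below are placeholders chosen for illustration and the `sketch` namespace and `isPICStyleSet` helper are hypothetical; this is not the declaration touched by the commit.

```cpp
#include <type_traits>

namespace sketch {

// Unscoped enum: enumerators leak into the enclosing scope and convert to
// int implicitly, which is what enum-safety warnings tend to flag.
enum UnscopedStyle { U_StubPIC, U_GOT, U_RIPRel, U_None };

// Scoped enum: enumerators must be qualified and never convert implicitly.
enum class Style { StubPIC, GOT, RIPRel, None };

// Comparisons between values of the same scoped enum still work as usual.
constexpr bool isPICStyleSet(Style S) { return S != Style::None; }

static_assert(std::is_convertible<UnscopedStyle, int>::value,
              "unscoped enums convert to int implicitly");
static_assert(!std::is_convertible<Style, int>::value,
              "scoped enums do not convert to int implicitly");

} // namespace sketch
```

Since the enum takes part in no integer arithmetic, the conversion to `enum class` only requires qualifying the enumerators at their uses, which fits the NFCI label on the commit.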
* [BPF] fix a bug in __builtin_preserve_field_info() with FIELD_BYTE_SIZE | Yonghong Song | 2019-11-03 | 1 | -1/+1
    While deriving the proper bitfield access FIELD_BYTE_SIZE, function Member->getStorageOffsetInBits() is used to get the llvm IR type storage offset in bits so that the byte size can permit aligned loads/stores with the previously derived FIELD_BYTE_OFFSET. The function should only be used with bitfield members and it will assert if ASSERT is turned on during cmake build.

        Constant *getStorageOffsetInBits() const {
          assert(getTag() == dwarf::DW_TAG_member && isBitField());
          if (auto *C = cast_or_null<ConstantAsMetadata>(getExtraData()))
            return C->getValue();
          return nullptr;
        }

    This patch fixes the issue by using Member->isBitField() directly, and a test case is added to cover this missing case.

    This issue was discovered when running Andrii's linux kernel CO-RE tests.

    Differential Revision: https://reviews.llvm.org/D69761
* Fix uninitialized variable warning. NFCI. | Simon Pilgrim | 2019-11-03 | 1 | -1/+1
* [mips] Remove trailing spaces. NFC | Simon Atanasyan | 2019-11-03 | 1 | -4/+4
* [mips] Split long lines in the code. NFC | Simon Atanasyan | 2019-11-03 | 28 | -143/+216
* A15SDOptimizer::getPrefSPRLane - fix null dereference warning. NFCI | Simon Pilgrim | 2019-11-02 | 1 | -2/+1
* isImmPCRel/isImmSigned - both functions should return bool not unsigned. NFCI. | Simon Pilgrim | 2019-11-02 | 1 | -2/+2
* X86_MC::createX86MCSubtargetInfo - X86_MC::ParseX86Triple never returns an empty string. NFCI. | Simon Pilgrim | 2019-11-02 | 1 | -6/+3
    PVS Studio was complaining that the expression '!ArchFS.empty()' is always true.
* X86Operand::print - fix SymName shadow variable warning. NFCI. | Simon Pilgrim | 2019-11-02 | 1 | -2/+2
* X86AsmPrinter - fix uninitialized variable warnings. NFCI. | Simon Pilgrim | 2019-11-02 | 1 | -2/+2
* TargetMachine - fix uninitialized variable warning. NFCI. | Simon Pilgrim | 2019-11-02 | 1 | -2/+2
    TargetPassConfig::addCoreISelPasses() always initializes O0WantsFastISel, but this change appeases static analyzers that complain that O0WantsFastISel isn't initialized in the constructor.
* [X86] Move computeZeroableShuffleElements before getTargetShuffleAndZeroables. NFCI. | Simon Pilgrim | 2019-11-02 | 1 | -87/+87
    Prep work toward merging some of the functionality.
* [X86] Remove FeatureSSE3 from the implies list of HasFastHorizontalOps. | Craig Topper | 2019-11-01 | 1 | -1/+1
    HasFastHorizontalOps is a tuning flag. It shouldn't imply an ISA flag.
* [X86] Model MXCSR for MMX FP instructions | Pengfei Wang | 2019-11-01 | 1 | -5/+5
    Summary: This patch models MXCSR and adds flag "mayRaiseFPException" for MMX FP instructions.

    Reviewers: craig.topper, andrew.w.kaylor, RKSimon, cameron.mcinally
    Reviewed By: craig.topper
    Subscribers: hiraditya, llvm-commits, LiuChen3
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D69702
* [X86] add mayRaiseFPException flag and FPCW registers for X87 instructions | Pengfei Wang | 2019-11-01 | 2 | -25/+46
    Summary: This patch adds flag "mayRaiseFPException", FPCW and FPSW for X87 instructions which could raise a floating-point exception.

    Reviewers: pengfei, RKSimon, andrew.w.kaylor, uweigand, kpn, spatel, cameron.mcinally, craig.topper
    Reviewed By: craig.topper
    Subscribers: thakis, hiraditya, llvm-commits
    Patch by LiuChen.
    Differential Revision: https://reviews.llvm.org/D68854
* [amdgpu] Fix known bits computation on `MUL_I24`/`MUL_U24`. | Michael Liao | 2019-11-01 | 1 | -0/+3
    Reviewers: arsenm, rampitec
    Subscribers: kzhuravl, jvesely, wdng, nhaehnle, dstuttard, tpr, t-tye, hiraditya, llvm-commits, yaxunl
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D69735
* [X86] Change the behavior of canWidenShuffleElements used by lowerV2X128Shuffle to match the behavior in lowerVectorShuffle with regards to zeroable elements. | Craig Topper | 2019-11-01 | 1 | -19/+14
    Previously we marked zeroable elements in a way that prevented the widening check from recognizing that it could widen. Now we only mark them zeroable if V2 is an all zeros vector. This matches what we do for widening elements in lowerVectorShuffle.

    Fixes PR43866.
* [MIPS GlobalISel] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds after D69663 | Fangrui Song | 2019-11-01 | 1 | -0/+1
* [X86][AVX] Add support for and/or scalar bool reduction with AVX512 mask registers | Simon Pilgrim | 2019-11-01 | 1 | -0/+6
    combineBitcastvxi1 only handles bitcast->MOVMSK combines; with mask registers we use BITCAST directly.
* [WebAssembly] Add experimental SIMD dot product instruction | Thomas Lively | 2019-11-01 | 1 | -0/+7
    Summary: This instruction is not merged to the spec proposal, but we need it to be implemented in the toolchain to experiment with it. It is available only on an opt-in basis through a clang builtin.

    Defined in https://github.com/WebAssembly/simd/pull/127.

    Depends on D69696.

    Reviewers: aheejin
    Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits
    Tags: #clang, #llvm
    Differential Revision: https://reviews.llvm.org/D69697
* Reland "[WebAssembly] Expand setcc of v2i64" | Thomas Lively | 2019-11-01 | 2 | -0/+31
    This reverts commit e5cae5692b5899631b5bfe5c23234deb5efda10c, which reverted 11850a6305c5778b180243eb06aefe86762dd4ce. The original revert was done because of breakage that was actually in a separate commit, 2ab1b8c1ec452fb743f6cc5051e75a01039cabfe, which was also reverted and has since been fixed and relanded.
* [X86] isFNEG - use switch() instead of if-else tree. NFCI. | Simon Pilgrim | 2019-11-01 | 1 | -33/+36
    In a future patch this will avoid some checks which don't need to be done for some opcodes.
* Revert "[AArch64][MachineOutliner] Return address signing for outlined functions" | Oliver Stannard | 2019-11-01 | 1 | -241/+7
    This is causing faults when an instruction which modifies SP is outlined, causing the PAC and AUT instructions to not match.

    This reverts commits 70caa1fc30c392974df3bccd9959765dae1779f6 and 55314d323738e4a8c1890b6a6e5064e7f4e0da1c.
* [AArch64] Output the pseudo SPACE in asm and object files | Momchil Velikov | 2019-11-01 | 2 | -2/+12
    Summary: It outputs nothing, but is useful for writing tests, checking asm output.

    Reviewers: t.p.northover, ostannard, tellenbach
    Reviewed By: tellenbach
    Subscribers: tellenbach, kristof.beyls, hiraditya, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D69185
    Change-Id: I6b58310e9e5632f0976d2000ce975ee28df90ebe
* [MIPS GlobalISel] Improve reg bank handling in MipsInstructionSelector | Petar Avramovic | 2019-11-01 | 1 | -58/+70
    Introduce helper methods and refactor pieces of code related to register banks in MipsInstructionSelector. Add a few detailed asserts in order to get a better overview of the LLT and register bank combinations that are supported at the moment, and to reduce the need to look at other files.

    Differential Revision: https://reviews.llvm.org/D69663