| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit fff2721286e1 ("[BPF] Fix CO-RE bugs with bitfields")
fixed CO-RE handling bitfield issues. But the implementation
introduced a use after free bug. The "Base" of the intrinsic
might be freed so later on accessing the Type of "Base"
might access the freed memory. The failed test case,
CodeGen/BPF/CORE/offset-reloc-middle-chain.ll
is exactly used to test such a case.
Similarly to previous attempt to remember Metadata etc,
remember "Base" pointee Alignment in advance to avoid
such use after free bug.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
enable 2-byte VEX encoding.
Summary:
The 2 source operands commutable instructions are encoded in the
VEX.VVVV field and the r/m field of the MODRM byte plus the VEX.B
field.
The VEX.B field is missing from the 2-byte VEX encoding. If the
VEX.VVVV source is 0-7 and the other register is 8-15 we can
swap them to avoid needing the VEX.B field. This works as long as
the VEX.W, VEX.mmmmm, and VEX.X fields are also not needed.
Fixes PR36706.
Reviewers: RKSimon, spatel
Reviewed By: RKSimon
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68550
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
bitfield handling is not robust with current implementation.
I have seen two issues as described below.
Issue 1:
struct s {
long long f1;
char f2;
char b1:1;
} *p;
The current approach will generate an access bit size
56 (from b1 to the end of structure) which will be
rejected as it is not power of 2.
Issue 2:
struct s {
char f1;
char b1:3;
char b2:5;
char b3:6:
char b4:2;
char f2;
};
The LLVM will group 4 bitfields together with 2 bytes. But
loading 2 bytes is not correct as it violates alignment
requirement. Note that sometimes, LLVM breaks a large
bitfield groups into multiple groups, but not in this case.
To resolve the above two issues, this patch takes a
different approach. The alignment for the structure is used
to construct the offset of the bitfield access. The bitfield
incurred memory access is an aligned memory access with alignment/size
equal to the alignment of the structure.
This also simplified the code.
This may not be the optimal memory access in terms of memory access
width. But this should be okay since extracting the bitfield value
will have the same amount of work regardless of what kind of
memory access width.
Differential Revision: https://reviews.llvm.org/D69837
|
|
|
|
| |
Fix the costs of integer division.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We are duplicating predicates if several parts of the combined
predicate list contain the same condition. Added code to deduplicate
the list.
We have AssemblerPredicates and AssemblerPredicate in the
PredicateControl, but we never use AssemblerPredicates with an
actual list, so this one is dropped.
This addresses the first part of the llvm bug 43886:
https://bugs.llvm.org/show_bug.cgi?id=43886
Differential Revision: https://reviews.llvm.org/D69815
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
-mvzeroupper will force the vzeroupper insertion pass to run on
CPUs that normally wouldn't. -mno-vzeroupper disables it on CPUs
where it normally runs.
To support this with the default feature handling in clang, we
need a vzeroupper feature flag in X86.td. Since this flag has
the opposite polarity of the fast-partial-ymm-or-zmm-write we
used to use to disable the pass, we now need to add this new
flag to every CPU except KNL/KNM and BTVER2 to keep identical
behavior.
Remove -fast-partial-ymm-or-zmm-write which is no longer used.
Differential Revision: https://reviews.llvm.org/D69786
|
| |
|
|
|
|
| |
Fixes https://bugs.llvm.org/show_bug.cgi?id=43891
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch (second of two patches) lowers the generic PowerPC vector
entries to PowerPC subtarget-specific entries.
For instance, the PowerPC generic entry 'cbrtd2_massv' is lowered to
'cbrtd2_P9' or Power9 subtarget.
The first patch enables the vectorizer to recognize the IBM MASS vector
library routines. This patch specifically adds support for recognizing
the '-vector-library=MASSV' option, and defines mappings from IEEE
standard scalar math functions to generic PowerPC MASS vector
counterparts.
For instance, the generic PowerPC MASS vector entry for double-precision
'cbrt' function is '__cbrtd2_massv'
The overall support for MASS vector library is presented as such in two
patches for ease of review.
Patch by pjeeva01 (Jeeva P.)
Differential Revision: https://reviews.llvm.org/D59883
|
| |
|
|
|
|
|
| |
Review: Ulrich Weigand
https://reviews.llvm.org/D68267
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The Arm backend will usually return false for isFMAFasterThanFMulAndFAdd,
where both the fused VFMA.f32 and a non-fused VMLA.f32 are usually
available for scalar code. For MVE we don't have the non-fused version
though. It makes more sense for isFMAFasterThanFMulAndFAdd to return
true, allowing us to simplify some of the existing ISel patterns.
The tests here are that non of the existing tests failed, and so we are
still selecting VFMA and VFMS. The one test that changed shows we can
now select from fast math flags, as opposed to just relying on the
isFMADLegalForFAddFSub option.
Differential Revision: https://reviews.llvm.org/D69115
|
|
|
|
| |
Typo in comment. NFC.
|
|
|
|
|
|
|
| |
Fill in the gaps for vrev32.16 f16 patterns, extending the existing i16
patterns.
Differential Revision: https://reviews.llvm.org/D69508
|
|
|
|
|
|
|
| |
This is a special calling convention to be used by the GHC compiler.
Author: Stefan Schulze Frielinghaus
Differential Revision: https://reviews.llvm.org/D69024
|
|
|
|
|
|
|
|
|
|
| |
DemandedElts mask (REAPPLIED)
If we don't demand all elements, then attempt to combine to a simpler shuffle.
At the moment we can only do this if Depth == 0 as combineX86ShufflesRecursively uses Depth to track whether the shuffle has really changed or not - we'll need to change this before we can properly start merging combineX86ShufflesRecursively into SimplifyDemandedVectorElts (see D66004).
This reapplies rL368307 (reverted at rL369167) after the fix for the infinite loop reported at PR43024 was applied at rG3f087e38a2e7b87a5adaaac1c1b61e51220e7ff3
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: RKSimon, ostannard
Reviewed By: RKSimon
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69795
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: The hook should work for any RISC-V register. Non-allocatable registers
do not need to be reserved, for the remaining the hook will only succeed
if you pass clang the -ffixed-xX flag. This builds upon D67185, which
currently only allows reserving GPRs.
Reviewers: asb, lenary
Reviewed By: lenary
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69130
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Demand that an immediate offset to a PC relative address fits in 32 bits, or
else load it into a register and perform a separate add.
Verify in the assembler that such immediate offsets fit the bitwidth.
Even though the final address of a Load Address Relative Long may fit in 32
bits even with a >32 bit offset (depending on where the symbol lives relative
to PC), the GNU toolchain demands the offset by itself to be in range. This
patch adapts the same behavior for llvm.
Review: Ulrich Weigand
https://reviews.llvm.org/D69749
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch sets the FPSW (X87 floating-point status register) as a reserved
physical register and fix the test failure caused by [[ https://reviews.llvm.org/D68854| D68854 ]].
Before this patch, some tests will fail because it implicit uses FPSW without
define it. Setting the FPSW as a reserved physical register will skip liveness
analysis because it is always live.
Reviewers: pengfei, craig.topper
Reviewed By: craig.topper
Subscribers: craig.topper, hiraditya, llvm-commits
Patch by LiuChen.
Differential Revision: https://reviews.llvm.org/D69784
|
|
|
|
|
|
|
|
| |
KnownZero if it removes an input.
This stops infinite loops where KnownUndef elements are converted to Zeroable, resulting in KnownZero elements which are then simplified (via SimplifyDemandedElts etc.) back to KnownUndef elements........
Prep fix for PR43024 which will allow rL368307 to be re-applied.
|
|
|
|
| |
statement.' warning. NFCI.
|
| |
|
|
|
|
|
|
| |
with empty roots. NFCI.
This doesn't affect actual codegen, but is a minor refactor toward fixing PR43024 where we need to avoid excess changes (folding zeroables etc.) to the shuffle mask at Depth == 0.
|
|
|
|
| |
Fixes MSVC static analyzer warnings about enum safety, this enum performs no integer math so it'd be better to fix its scope.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
During deriving proper bitfield access FIELD_BYTE_SIZE,
function Member->getStorageOffsetInBits() is used to
get llvm IR type storage offset in bits so that
the byte size can permit aligned loads/stores with previously
derived FIELD_BYTE_OFFSET.
The function should only be used with bitfield members and it will
assert if ASSERT is turned on during cmake build.
Constant *getStorageOffsetInBits() const {
assert(getTag() == dwarf::DW_TAG_member && isBitField());
if (auto *C = cast_or_null<ConstantAsMetadata>(getExtraData()))
return C->getValue();
return nullptr;
}
This patch fixed the issue by using Member->isBitField()
directly and a test case is added to cover this missing case.
This issue is discovered when running Andrii's linux kernel CO-RE
tests.
Differential Revision: https://reviews.llvm.org/D69761
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
| |
empty string. NFCI.
PVS Studio was complaining that the expression '!ArchFS.empty()' is always true.
|
| |
|
| |
|
|
|
|
| |
TargetPassConfig::addCoreISelPasses() always initializes O0WantsFastISel but it appeases static analyzers that complain that O0WantsFastISel isn't initialized in the constructor.
|
|
|
|
|
|
| |
getTargetShuffleAndZeroables. NFCI.
Prep work toward merging some of the functionality.
|
|
|
|
| |
HasFastHorizontalOps is a tuning flag. It shouldn't imply an ISA flag.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch models MXCSR and adds flag "mayRaiseFPException" for MMX FP
instructions.
Reviewers: craig.topper, andrew.w.kaylor, RKSimon, cameron.mcinally
Reviewed By: craig.topper
Subscribers: hiraditya, llvm-commits, LiuChen3
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69702
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch adds flag "mayRaiseFPException" , FPCW and FPSW for X87 instructions which could raise
float exception.
Reviewers: pengfei, RKSimon, andrew.w.kaylor, uweigand, kpn, spatel, cameron.mcinally, craig.topper
Reviewed By: craig.topper
Subscribers: thakis, hiraditya, llvm-commits
Patch by LiuChen.
Differential Revision: https://reviews.llvm.org/D68854
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm, rampitec
Subscribers: kzhuravl, jvesely, wdng, nhaehnle, dstuttard, tpr, t-tye, hiraditya, llvm-commits, yaxunl
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69735
|
|
|
|
|
|
|
|
|
|
|
| |
lowerV2X128Shuffle to match the behavior in lowerVectorShuffle with regards to zeroable elements.
Previously we marked zeroable elements in a way that prevented
the widening check from recognizing that it could widen. Now
we only mark them zeroable if V2 is an all zeros vector. This
matches what we do for widening elements in lowerVectorShuffle.
Fixes PR43866.
|
|
|
|
| |
builds after D69663
|
|
|
|
|
|
| |
registers
combineBitcastvxi1 only handles bitcast->MOVMSK combines, with mask registers we use BITCAST directly.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This instruction is not merged to the spec proposal, but we need it to
be implemented in the toolchain to experiment with it. It is available
only on an opt-in basis through a clang builtin.
Defined in https://github.com/WebAssembly/simd/pull/127.
Depends on D69696.
Reviewers: aheejin
Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D69697
|
|
|
|
|
|
|
|
| |
This reverts commit e5cae5692b5899631b5bfe5c23234deb5efda10c, which
reverted 11850a6305c5778b180243eb06aefe86762dd4ce. The original revert
was done because of breakage that was actually in a separate commit,
2ab1b8c1ec452fb743f6cc5051e75a01039cabfe, which was also reverted and
has since been fixed and relanded.
|
|
|
|
| |
In a future patch this will avoid some checks which don't need to be done for some opcodes.
|
|
|
|
|
|
|
|
|
|
| |
functions"
This is causing faults when an instruction which modifies SP is
outlined, causing the PAC and AUT instructions to not match.
This reverts commits 70caa1fc30c392974df3bccd9959765dae1779f6 and
55314d323738e4a8c1890b6a6e5064e7f4e0da1c.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: It outputs nothing, but is useful for writing tests, checking asm output.
Reviewers: t.p.northover, ostannard, tellenbach
Reviewed By: tellenbach
Subscribers: tellenbach, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69185
Change-Id: I6b58310e9e5632f0976d2000ce975ee28df90ebe
|
|
|
|
|
|
|
|
|
|
| |
Introduce helper methods and refactor pieces of code related to
register banks in MipsInstructionSelector.
Add a few detailed asserts in order to get a better overview
of LLT, register bank combinations that are supported at the moment
and reduce need to look at other files.
Differential Revision: https://reviews.llvm.org/D69663
|