| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
| |
This should prevent doing this on pre-sse4.1 targets or for 256
bit vectors without avx2.
I don't know of a failure from this. Op legalization will probably
take care of, but seemed better to be safe.
llvm-svn: 365577
|
| |
|
|
| |
llvm-svn: 365575
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D64433
llvm-svn: 365573
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D64446
llvm-svn: 365563
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A short granule is a granule of size between 1 and `TG-1` bytes. The size
of a short granule is stored at the location in shadow memory where the
granule's tag is normally stored, while the granule's actual tag is stored
in the last byte of the granule. This means that in order to verify that a
pointer tag matches a memory tag, HWASAN must check for two possibilities:
* the pointer tag is equal to the memory tag in shadow memory, or
* the shadow memory tag is actually a short granule size, the value being loaded
is in bounds of the granule and the pointer tag is equal to the last byte of
the granule.
Pointer tags between 1 to `TG-1` are possible and are as likely as any other
tag. This means that these tags in memory have two interpretations: the full
tag interpretation (where the pointer tag is between 1 and `TG-1` and the
last byte of the granule is ordinary data) and the short tag interpretation
(where the pointer tag is stored in the granule).
When HWASAN detects an error near a memory tag between 1 and `TG-1`, it
will show both the memory tag and the last byte of the granule. Currently,
it is up to the user to disambiguate the two possibilities.
Because this functionality obsoletes the right aligned heap feature of
the HWASAN memory allocator (and because we can no longer easily test
it), the feature is removed.
Also update the documentation to cover both short granule tags and
outlined checks.
Differential Revision: https://reviews.llvm.org/D63908
llvm-svn: 365551
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
isLoadBitCastBeneficial/isStoreBitCastBeneficial to allow X86 to bypass it
Basically the problem is that X86 doesn't set the Fast flag from
allowsMemoryAccess on certain CPUs due to slow unaligned memory
subtarget features. This prevents bitcasts from being folded into
loads and stores. But all vector loads and stores of the same width
are the same cost on X86.
This patch merges the allowsMemoryAccess call into isLoadBitCastBeneficial to allow X86 to skip it.
Differential Revision: https://reviews.llvm.org/D64295
llvm-svn: 365549
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D64438
llvm-svn: 365546
|
| |
|
|
|
|
|
|
|
|
| |
Stubs out a number of the classes needed to produce a new object file format
(XCOFF) for the powerpc-aix target. For testing input is an empty module which
produces an object file with just a file header.
Differential Revision: https://reviews.llvm.org/D61694
llvm-svn: 365541
|
| |
|
|
| |
llvm-svn: 365540
|
| |
|
|
|
|
|
|
| |
Fixed the file name from BPFAbstrctMemberAccess.cpp to
BPFAbstractMemberAccess.cpp.
Signed-off-by: Yonghong Song <yhs@fb.com>
llvm-svn: 365532
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D64429
llvm-svn: 365525
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
return instruction.
Function return instruction lowering, currently uses the fixed register pair s[30:31] for holding
the return address. It can be any SGPR pair other than the CSRs. Created an SGPR pair sub-register class
exclusive of the CSRs, and used this regclass while lowering the return instruction.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D63924
llvm-svn: 365512
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
There was an error being thrown from isDesirableToCommuteWithShift in
some tests. This was tracked down to the method being called before
legalisation, with an extended value type, not a machine value type.
In the case I diagnosed, the error was only hit with an instruction sequence
involving `i24`s in the add and shift. `i24` is not a Machine ValueType, it is
instead an Extended ValueType which was causing the issue.
I have added a test to cover this case, and fixed the error in the callback.
Reviewers: asb, luismarques
Reviewed By: asb
Subscribers: hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64425
llvm-svn: 365511
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
unconditional branches
If we have an icmp->brcond->br sequence where the brcond just branches to the
next block jumping over the br, while the br takes the false edge, then we can
modify the conditional branch to jump to the br's target while inverting the
condition of the incoming icmp. This means we can eliminate the br as an
unconditional branch to the fallthrough block.
Differential Revision: https://reviews.llvm.org/D64354
llvm-svn: 365510
|
| |
|
|
| |
llvm-svn: 365508
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Introduction
============
This patch added intial support for bpf program compile once
and run everywhere (CO-RE).
The main motivation is for bpf program which depends on
kernel headers which may vary between different kernel versions.
The initial discussion can be found at https://lwn.net/Articles/773198/.
Currently, bpf program accesses kernel internal data structure
through bpf_probe_read() helper. The idea is to capture the
kernel data structure to be accessed through bpf_probe_read()
and relocate them on different kernel versions.
On each host, right before bpf program load, the bpfloader
will look at the types of the native linux through vmlinux BTF,
calculates proper access offset and patch the instruction.
To accommodate this, three intrinsic functions
preserve_{array,union,struct}_access_index
are introduced which in clang will preserve the base pointer,
struct/union/array access_index and struct/union debuginfo type
information. Later, bpf IR pass can reconstruct the whole gep
access chains without looking at gep itself.
This patch did the following:
. An IR pass is added to convert preserve_*_access_index to
global variable who name encodes the getelementptr
access pattern. The global variable has metadata
attached to describe the corresponding struct/union
debuginfo type.
. An SimplifyPatchable MachineInstruction pass is added
to remove unnecessary loads.
. The BTF output pass is enhanced to generate relocation
records located in .BTF.ext section.
Typical CO-RE also needs support of global variables which can
be assigned to different values to different hosts. For example,
kernel version can be used to guard different versions of codes.
This patch added the support for patchable externals as well.
Example
=======
The following is an example.
struct pt_regs {
long arg1;
long arg2;
};
struct sk_buff {
int i;
struct net_device *dev;
};
#define _(x) (__builtin_preserve_access_index(x))
static int (*bpf_probe_read)(void *dst, int size, const void *unsafe_ptr) =
(void *) 4;
extern __attribute__((section(".BPF.patchable_externs"))) unsigned __kernel_version;
int bpf_prog(struct pt_regs *ctx) {
struct net_device *dev = 0;
// ctx->arg* does not need bpf_probe_read
if (__kernel_version >= 41608)
bpf_probe_read(&dev, sizeof(dev), _(&((struct sk_buff *)ctx->arg1)->dev));
else
bpf_probe_read(&dev, sizeof(dev), _(&((struct sk_buff *)ctx->arg2)->dev));
return dev != 0;
}
In the above, we want to translate the third argument of
bpf_probe_read() as relocations.
-bash-4.4$ clang -target bpf -O2 -g -S trace.c
The compiler will generate two new subsections in .BTF.ext,
OffsetReloc and ExternReloc.
OffsetReloc is to record the structure member offset operations,
and ExternalReloc is to record the external globals where
only u8, u16, u32 and u64 are supported.
BPFOffsetReloc Size
struct SecLOffsetReloc for ELF section #1
A number of struct BPFOffsetReloc for ELF section #1
struct SecOffsetReloc for ELF section #2
A number of struct BPFOffsetReloc for ELF section #2
...
BPFExternReloc Size
struct SecExternReloc for ELF section #1
A number of struct BPFExternReloc for ELF section #1
struct SecExternReloc for ELF section #2
A number of struct BPFExternReloc for ELF section #2
struct BPFOffsetReloc {
uint32_t InsnOffset; ///< Byte offset in this section
uint32_t TypeID; ///< TypeID for the relocation
uint32_t OffsetNameOff; ///< The string to traverse types
};
struct BPFExternReloc {
uint32_t InsnOffset; ///< Byte offset in this section
uint32_t ExternNameOff; ///< The string for external variable
};
Note that only externs with attribute section ".BPF.patchable_externs"
are considered for Extern Reloc which will be patched by bpf loader
right before the load.
For the above test case, two offset records and one extern record
will be generated:
OffsetReloc records:
.long .Ltmp12 # Insn Offset
.long 7 # TypeId
.long 242 # Type Decode String
.long .Ltmp18 # Insn Offset
.long 7 # TypeId
.long 242 # Type Decode String
ExternReloc record:
.long .Ltmp5 # Insn Offset
.long 165 # External Variable
In string table:
.ascii "0:1" # string offset=242
.ascii "__kernel_version" # string offset=165
The default member offset can be calculated as
the 2nd member offset (0 representing the 1st member) of struct "sk_buff".
The asm code:
.Ltmp5:
.Ltmp6:
r2 = 0
r3 = 41608
.Ltmp7:
.Ltmp8:
.loc 1 18 9 is_stmt 0 # t.c:18:9
.Ltmp9:
if r3 > r2 goto LBB0_2
.Ltmp10:
.Ltmp11:
.loc 1 0 9 # t.c:0:9
.Ltmp12:
r2 = 8
.Ltmp13:
.loc 1 19 66 is_stmt 1 # t.c:19:66
.Ltmp14:
.Ltmp15:
r3 = *(u64 *)(r1 + 0)
goto LBB0_3
.Ltmp16:
.Ltmp17:
LBB0_2:
.loc 1 0 66 is_stmt 0 # t.c:0:66
.Ltmp18:
r2 = 8
.loc 1 21 66 is_stmt 1 # t.c:21:66
.Ltmp19:
r3 = *(u64 *)(r1 + 8)
.Ltmp20:
.Ltmp21:
LBB0_3:
.loc 1 0 66 is_stmt 0 # t.c:0:66
r3 += r2
r1 = r10
.Ltmp22:
.Ltmp23:
.Ltmp24:
r1 += -8
r2 = 8
call 4
For instruction .Ltmp12 and .Ltmp18, "r2 = 8", the number
8 is the structure offset based on the current BTF.
Loader needs to adjust it if it changes on the host.
For instruction .Ltmp5, "r2 = 0", the external variable
got a default value 0, loader needs to supply an appropriate
value for the particular host.
Compiling to generate object code and disassemble:
0000000000000000 bpf_prog:
0: b7 02 00 00 00 00 00 00 r2 = 0
1: 7b 2a f8 ff 00 00 00 00 *(u64 *)(r10 - 8) = r2
2: b7 02 00 00 00 00 00 00 r2 = 0
3: b7 03 00 00 88 a2 00 00 r3 = 41608
4: 2d 23 03 00 00 00 00 00 if r3 > r2 goto +3 <LBB0_2>
5: b7 02 00 00 08 00 00 00 r2 = 8
6: 79 13 00 00 00 00 00 00 r3 = *(u64 *)(r1 + 0)
7: 05 00 02 00 00 00 00 00 goto +2 <LBB0_3>
0000000000000040 LBB0_2:
8: b7 02 00 00 08 00 00 00 r2 = 8
9: 79 13 08 00 00 00 00 00 r3 = *(u64 *)(r1 + 8)
0000000000000050 LBB0_3:
10: 0f 23 00 00 00 00 00 00 r3 += r2
11: bf a1 00 00 00 00 00 00 r1 = r10
12: 07 01 00 00 f8 ff ff ff r1 += -8
13: b7 02 00 00 08 00 00 00 r2 = 8
14: 85 00 00 00 04 00 00 00 call 4
Instructions #2, #5 and #8 need relocation resoutions from the loader.
Signed-off-by: Yonghong Song <yhs@fb.com>
Differential Revision: https://reviews.llvm.org/D61524
llvm-svn: 365503
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Select gprb or fprb when def/use register operand of G_PHI is
used/defined by either:
copy to/from physical register or
instruction with only one mapping available for that use/def operand.
Integer s64 phi is handled with narrowScalar when mapping is applied,
produced artifacts are combined away. Manually set gprb to all register
operands of instructions created during narrowScalar.
Differential Revision: https://reviews.llvm.org/D64351
llvm-svn: 365494
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Select gprb or fprb when def/use register operand of G_SELECT is
used/defined by either:
copy to/from physical register or
instruction with only one mapping available for that use/def operand.
Integer s64 select is handled with narrowScalar when mapping is applied,
produced artifacts are combined away. Manually set gprb to all register
operands of instructions created during narrowScalar.
For selection of floating point s32 or s64 select it is enough to set
fprb of appropriate size and selectImpl will do the rest.
Differential Revision: https://reviews.llvm.org/D64350
llvm-svn: 365492
|
| |
|
|
| |
llvm-svn: 365488
|
| |
|
|
|
|
| |
Account for 64-bit scalar eq/ne when available.
llvm-svn: 365487
|
| |
|
|
| |
llvm-svn: 365486
|
| |
|
|
| |
llvm-svn: 365484
|
| |
|
|
| |
llvm-svn: 365483
|
| |
|
|
| |
llvm-svn: 365482
|
| |
|
|
|
|
|
|
|
|
| |
The `sge/sgeu Dst, Src1, Src2/Imm` pseudo instructions set register
`Dst` to 1 if register `Src1` is greater than or equal `Src2/Imm` and
to 0 otherwise.
Differential Revision: https://reviews.llvm.org/D64314
llvm-svn: 365476
|
| |
|
|
|
|
|
|
|
| |
The `sgt/sgtu Dst, Src1, Src2/Imm` pseudo instructions set register
`Dst` to 1 if register `Src1` is greater than `Src2/Imm` and to 0 otherwise.
Differential Revision: https://reviews.llvm.org/D64313
llvm-svn: 365475
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Dump the DWARF information about call sites and call site parameters into
debug info sections.
The patch also provides an interface for the interpretation of instructions
that could load values of a call site parameters in order to generate DWARF
about the call site parameters.
([13/13] Introduce the debug entry values.)
Co-authored-by: Ananth Sowda <asowda@cisco.com>
Co-authored-by: Nikola Prica <nikola.prica@rt-rk.com>
Co-authored-by: Ivan Baev <ibaev@cisco.com>
Differential Revision: https://reviews.llvm.org/D60716
llvm-svn: 365467
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
getMinSignedBits() > 64
APInt::getSExtValue will assert if getMinSignedBits() > 64. This can happen,
for instance, if examining an i128. Avoid this assertion by checking
Imm.getMinSignedBits() <= 64 before doing
getTLI()->isLegalAddImmediate(Imm.getSExtValue()). We could directly check
getMinSignedBits() <= 12 but it seems better to reuse the isLegalAddImmediate
helper for this.
Differential Revision: https://reviews.llvm.org/D64390
llvm-svn: 365462
|
| |
|
|
| |
llvm-svn: 365433
|
| |
|
|
|
|
|
|
| |
Infrastructure work for future commit. NFC.
Differential Revision: https://reviews.llvm.org/D64370
llvm-svn: 365432
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D64369
llvm-svn: 365431
|
| |
|
|
|
|
|
|
|
|
|
| |
Summary:
`extsw` and `sldi` are supposed to be combined if they are in the same
BB in instruction selection phase. This patch handles the case where
extsw and sldi are not in the same BB.
Differential Revision: https://reviews.llvm.org/D63806
llvm-svn: 365430
|
| |
|
|
| |
llvm-svn: 365429
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Even with functions with `no-prototype` attribute, there can be an
argument `sret` (structure return) attribute, which is an optimization
when a function return type is a struct. Fixes PR42420.
Reviewers: sbc100
Subscribers: dschuff, jgravelle-google, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64318
llvm-svn: 365426
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Porting over the part of `emitComparison` in AArch64ISelLowering where we use
TST to represent a compare.
- Rename `tryOptCMN` to `tryFoldIntegerCompare`, since it now also emits TSTs
when possible.
- Add a utility function for emitting a TST with register operands.
- Rename opt-fold-cmn.mir to opt-fold-compare.mir, since it now also tests the
TST fold as well.
Differential Revision: https://reviews.llvm.org/D64371
llvm-svn: 365404
|
| |
|
|
|
|
|
| |
This will help removing the custom load predicates, allowing the
global isel emitter to handle them.
llvm-svn: 365398
|
| |
|
|
| |
llvm-svn: 365394
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This makes it so that IR files using triples without an environment work
out of the box, without normalizing them.
Typically, the MSVC behavior is more desirable. For example, it tends to
enable things like constant merging, use of associative comdats, etc.
Addresses PR42491
Reviewers: compnerd
Subscribers: hiraditya, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64109
llvm-svn: 365387
|
| |
|
|
| |
llvm-svn: 365378
|
| |
|
|
| |
llvm-svn: 365373
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Make the FP register callee saved.
This is tricky because now the FP needs to be spilled in the prolog
relative to the incoming SP register, rather than the frame register
used throughout the rest of the function. I don't like how this
bypassess the standard mechanism for CSR spills just to get the
correct insert point. I may look for a better solution, since all CSR
VGPRs may also need to have all lanes activated. Another option might
be to make getFrameIndexReference change the base register if the
frame index is a CSR, and then try to figure out the right insertion
point in emitProlog.
If there is a free VGPR lane available for SGPR spilling, try to use
it for the FP. If that would require intrtoducing a new VGPR spill,
try to use a free call clobbered SGPR. Only fallback to introducing a
new VGPR spill as a last resort.
This also doesn't attempt to handle SGPR spilling with scalar stores.
llvm-svn: 365372
|
| |
|
|
| |
llvm-svn: 365369
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Before, they were one category of operands which could cause
crashes in non-sensical combinations, e.g. "f32.const symbol".
Now these are forced to be an error.
Reviewers: dschuff
Subscribers: sbc100, jgravelle-google, aheejin, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64039
llvm-svn: 365351
|
| |
|
|
|
|
|
| |
These are identical to the *_global PatFrag, and will only create more
work to get the GlobalISel importer to handle them.
llvm-svn: 365350
|
| |
|
|
| |
llvm-svn: 365349
|
| |
|
|
| |
llvm-svn: 365343
|
| |
|
|
|
|
| |
Keep the uint64_t type from getConstantOperandVal to stop truncation/extension overflow warnings in MSVC in subvector index math.
llvm-svn: 365328
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Select gprb or fprb when loaded value is used by either:
copy to physical register or
instruction with only one mapping available for that use operand.
Load of integer s64 is handled with narrowScalar when mapping is applied,
produced artifacts are combined away. Manually set gprb to all register
operands of instructions created during narrowScalar.
Differential Revision: https://reviews.llvm.org/D64269
llvm-svn: 365323
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Select gprb or fprb when stored value is defined by either:
copy from physical register or
instruction with only one mapping available for that def operand.
Store of integer s64 is handled with narrowScalar when mapping is applied,
produced artifacts are combined away. Manually set gprb to all register
operands of instructions created during narrowScalar.
Differential Revision: https://reviews.llvm.org/D64268
llvm-svn: 365322
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary of changes:
- simplified handling of FLAT offset: offset_s13 and offset_u12 have been replaced with flat_offset;
- provided information about error position for pre-gfx9 targets;
- improved errors handling.
Reviewers: artem.tamazov, arsenm, rampitec
Differential Revision: https://reviews.llvm.org/D64244
llvm-svn: 365321
|