| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
PowerPC hits an assertion due to somewhat the same reason as https://reviews.llvm.org/D70975.
Though there are already some hack, it still failed with some case, when the operand 0 is NOT
a const fp, it is another fma that with const fp. And that const fp is negated which result in multi-uses.
A better fix is to check the uses of the negated const fp. If there are already use of its negated
value, we will have benefit as no extra Node is added.
Differential revision: https://reviews.llvm.org/D75501
(cherry picked from commit 3906ae387f0775dfe4426e4336748269fafbd190)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
L is meant to support the second word used by 32b calling conventions for 64b arguments.
This is required for build 32b PowerPC Linux kernels after upstream
commit 334710b1496a ("powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'")
Thanks for the report from @nathanchance, and reference to GCC's
implementation from @segher.
Fixes: pr/46186
Fixes: https://github.com/ClangBuiltLinux/linux/issues/1044
Reviewers: echristo, hfinkel, MaskRay
Reviewed By: MaskRay
Subscribers: MaskRay, wuzish, nemanjai, hiraditya, kbarton, steven.zhang, llvm-commits, segher, nathanchance, srhines
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81767
(cherry picked from commit 2d8e105db6bea10a6b96e4a094e73a87987ef909)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The UnwindHelp object is used during exception handling by runtime
code. It must be findable from a fixed offset from FP.
This change allocates the UnwindHelp object as a fixed object (as is
done for x86_64) to ensure that both the generated code and runtime
agree on the location of the object.
Fixes https://bugs.llvm.org/show_bug.cgi?id=45346
Differential Revision: https://reviews.llvm.org/D77016
(cherry picked from commit 494abe139a9aab991582f1b3f3370b99b252944c)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The generated code for a funclet can have an add to sp in the epilogue
for which there is no corresponding sub in the prologue.
This patch removes the early return from emitPrologue that was
preventing the sub to sp, and instead conditionalizes the appropriate
parts of the rest of the function.
Fixes https://bugs.llvm.org/show_bug.cgi?id=45345
Differential Revision: https://reviews.llvm.org/D77015
(cherry picked from commit 522b4c4b88a5606b0074926e8658e7fede97c230)
|
| |
|
|
|
|
|
|
|
|
| |
When the FP exists, the FP base CFI directive offset should take the size of variable arguments into account.
Differential Revision: https://reviews.llvm.org/D73862
(cherry picked from commit 64f417200e1020305f28f3c1e40691585f50f6ad)
|
|
|
|
|
|
|
|
|
|
|
| |
Since i32 is not legal in riscv64,
it always promoted to i64 before emitting lib call and
for conversions like float/double to int and float/double to unsigned int
wrong lib call was emitted. This commit fix it using custom lowering.
Differential Revision: https://reviews.llvm.org/D80526
(cherry picked from commit 7622ea5835f0381a426e504f4c03f11733732b83)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, the comparison argument used for ATOMIC_CMP_XCHG is legalised
with GetPromotedInteger, which leaves the upper bits of the value
undefind. Since this is used for comparing in an LR/SC loop with a
full-width comparison, we must sign extend it on RISC-V.
This is related to https://reviews.llvm.org/D58829, which solved the
issue for ATOMIC_CMP_SWAP_WITH_SUCCESS, but not the simpler
ATOMIC_CMP_SWAP.
This patch is a modified form of
616289ed29225c0ddfe5699c7fdf42a0fcbe0ab4 by Jessica Clarke. It localises
the changes to LegalizeIntegerTypes and avoids adding a new virtual
method to TargetLowering to avoid changing the ABI of libLLVM.so.
|
|
|
|
|
|
|
|
|
|
| |
@nikic raised an issue on D75936 that the added complexity to the O0 pipeline was causing noticeable slowdowns for `-O0` builds. This patch addresses the issue by adding a pass with equal security properties, but without any optimizations (and more importantly, without the need for expensive analysis dependencies).
Reviewers: nikic, craig.topper, mattdr
Reviewed By: craig.topper, mattdr
Differential Revision: https://reviews.llvm.org/D80964
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
After finding all such gadgets in a given function, the pass minimally inserts
LFENCE instructions in such a manner that the following property is satisfied:
for all SOURCE+SINK pairs, all paths in the CFG from SOURCE to SINK contain at
least one LFENCE instruction. The algorithm that implements this minimal
insertion is influenced by an academic paper that minimally inserts memory
fences for high-performance concurrent programs:
http://www.cs.ucr.edu/~lesani/companion/oopsla15/OOPSLA15.pdf
The algorithm implemented in this pass is as follows:
1. Build a condensed CFG (i.e., a GadgetGraph) consisting only of the following components:
-SOURCE instructions (also includes function arguments)
-SINK instructions
-Basic block entry points
-Basic block terminators
-LFENCE instructions
2. Analyze the GadgetGraph to determine which SOURCE+SINK pairs (i.e., gadgets) are already mitigated by existing LFENCEs. If all gadgets have been mitigated, go to step 6.
3. Use a heuristic or plugin to approximate minimal LFENCE insertion.
4. Insert one LFENCE along each CFG edge that was cut in step 3.
5. Go to step 2.
6. If any LFENCEs were inserted, return true from runOnFunction() to tell LLVM that the function was modified.
By default, the heuristic used in Step 3 is a greedy heuristic that avoids
inserting LFENCEs into loops unless absolutely necessary. There is also a
CLI option to load a plugin that can provide even better optimization,
inserting fewer fences, while still mitigating all of the LVI gadgets.
The plugin can be found here: https://github.com/intel/lvi-llvm-optimization-plugin,
and a description of the pass's behavior with the plugin can be found here:
https://software.intel.com/security-software-guidance/insights/optimized-mitigation-approach-load-value-injection.
Differential Revision: https://reviews.llvm.org/D75937
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Gadgets
Adds a new data structure, ImmutableGraph, and uses RDF to find LVI gadgets and add them to a MachineGadgetGraph.
More specifically, a new X86 machine pass finds Load Value Injection (LVI) gadgets consisting of a load from memory (i.e., SOURCE), and any operation that may transmit the value loaded from memory over a covert channel, or use the value loaded from memory to determine a branch/call target (i.e., SINK).
Also adds a new target feature to X86: +lvi-load-hardening
The feature can be added via the clang CLI using -mlvi-hardening.
Differential Revision: https://reviews.llvm.org/D75936
|
|
|
|
|
|
| |
`MBB.back()` could segfault if `MBB.empty()`. Fixed by checking for `MBB.empty()` in the loop.
Differential Revision: https://reviews.llvm.org/D77584
|
|
|
|
|
|
|
|
|
| |
Injection (LVI) Gadgets"
This reverts commit c74dd640fd740c6928f66a39c7c15a014af3f66f.
Reverting to address coding standard issues raised in post-commit
review.
|
|
|
|
|
|
|
|
|
| |
Injection (LVI)"
This reverts commit 62c42e29ba43c9d79cd5bd2084b641fbff6a96d5
Reverting to address coding standard issues raised in post-commit
review.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
After finding all such gadgets in a given function, the pass minimally inserts
LFENCE instructions in such a manner that the following property is satisfied:
for all SOURCE+SINK pairs, all paths in the CFG from SOURCE to SINK contain at
least one LFENCE instruction. The algorithm that implements this minimal
insertion is influenced by an academic paper that minimally inserts memory
fences for high-performance concurrent programs:
http://www.cs.ucr.edu/~lesani/companion/oopsla15/OOPSLA15.pdf
The algorithm implemented in this pass is as follows:
1. Build a condensed CFG (i.e., a GadgetGraph) consisting only of the following components:
-SOURCE instructions (also includes function arguments)
-SINK instructions
-Basic block entry points
-Basic block terminators
-LFENCE instructions
2. Analyze the GadgetGraph to determine which SOURCE+SINK pairs (i.e., gadgets) are already mitigated by existing LFENCEs. If all gadgets have been mitigated, go to step 6.
3. Use a heuristic or plugin to approximate minimal LFENCE insertion.
4. Insert one LFENCE along each CFG edge that was cut in step 3.
5. Go to step 2.
6. If any LFENCEs were inserted, return true from runOnFunction() to tell LLVM that the function was modified.
By default, the heuristic used in Step 3 is a greedy heuristic that avoids
inserting LFENCEs into loops unless absolutely necessary. There is also a
CLI option to load a plugin that can provide even better optimization,
inserting fewer fences, while still mitigating all of the LVI gadgets.
The plugin can be found here: https://github.com/intel/lvi-llvm-optimization-plugin,
and a description of the pass's behavior with the plugin can be found here:
https://software.intel.com/security-software-guidance/insights/optimized-mitigation-approach-load-value-injection.
Differential Revision: https://reviews.llvm.org/D75937
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Gadgets
Adds a new data structure, ImmutableGraph, and uses RDF to find LVI gadgets and add them to a MachineGadgetGraph.
More specifically, a new X86 machine pass finds Load Value Injection (LVI) gadgets consisting of a load from memory (i.e., SOURCE), and any operation that may transmit the value loaded from memory over a covert channel, or use the value loaded from memory to determine a branch/call target (i.e., SINK).
Also adds a new target feature to X86: +lvi-load-hardening
The feature can be added via the clang CLI using -mlvi-hardening.
Differential Revision: https://reviews.llvm.org/D75936
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adding a pass that replaces every ret instruction with the sequence:
pop <scratch-reg>
lfence
jmp *<scratch-reg>
where <scratch-reg> is some available scratch register, according to the
calling convention of the function being mitigated.
Differential Revision: https://reviews.llvm.org/D75935
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This pass replaces each indirect call/jump with a direct call to a thunk that looks like:
lfence
jmpq *%r11
This ensures that if the value in register %r11 was loaded from memory, then
the value in %r11 is (architecturally) correct prior to the jump.
Also adds a new target feature to X86: +lvi-cfi
("cfi" meaning control-flow integrity)
The feature can be added via clang CLI using -mlvi-cfi.
This is an alternate implementation to https://reviews.llvm.org/D75934 That merges the thunk insertion functionality with the existing X86 retpoline code.
Differential Revision: https://reviews.llvm.org/D76812
|
|
|
|
|
|
|
|
| |
Retpoline
Introduce a ThunkInserter CRTP base class from which new thunk types can inherit, e.g., thunks to mitigate https://software.intel.com/security-software-guidance/software-guidance/load-value-injection.
Differential Revision: https://reviews.llvm.org/D76811
|
|
|
|
|
|
|
|
|
|
|
|
| |
"Indirect Thunks"
There are applications for indirect call/branch thunks other than retpoline for Spectre v2, e.g.,
https://software.intel.com/security-software-guidance/software-guidance/load-value-injection
Therefore it makes sense to refactor X86RetpolineThunks as a more general capability.
Differential Revision: https://reviews.llvm.org/D76810
|
|
|
|
|
|
|
|
| |
RDF is designed to be target agnostic. Therefore it would be useful to have it available for other targets, such as X86.
Based on a previous patch by Krzysztof Parzyszek
Differential Revision: https://reviews.llvm.org/D75932
|
|
|
|
|
|
|
| |
PR45539:
https://bugs.llvm.org/show_bug.cgi?id=45539
(cherry picked from commit 01bcc3e9371470e1974f066ced353df15e10056d)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Part of the changes in D44564 made BasicAA not CFG only due to it using
PhiAnalysisValues which may have values invalidated.
Subsequent patches (rL340613) appear to have addressed this limitation.
BasicAA should not be invalidated by non-CFG-altering passes.
A concrete example is MemCpyOpt which preserves CFG, but we are testing
it invalidates BasicAA.
llvm-dev RFC: https://groups.google.com/forum/#!topic/llvm-dev/eSPXuWnNfzM
Reviewers: john.brawn, sebpop, hfinkel, brzycki
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74353
(cherry picked from commit 0cecafd647ccd9d0acc5968d4d6e80c1cbdee275)
|
|
|
|
|
|
|
|
|
|
| |
After pseudo-expansion, we may end up with ADDI (add immediate)
instructions where the operand is not an immediate but a relocation.
For such instructions, attempts to get the immediate result in
assertion failures for obvious reasons.
Fixes: https://bugs.llvm.org/show_bug.cgi?id=45432
(cherry picked from commit a56d057dfe3127c111c3470606c04e96d35b1fa3)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In BPF Instruction Selection DAGToDAG transformation phase,
BPF backend had an optimization to turn load from readonly data
section to direct load of the values. This phase is implemented
before libbpf has readonly section support and before alu32
is supported.
This phase however may generate incorrect type when alu32 is
enabled. The following is an example,
-bash-4.4$ cat ~/tmp2/t.c
struct t {
unsigned char a;
unsigned char b;
unsigned char c;
};
extern void foo(void *);
int test() {
struct t v = {
.b = 2,
};
foo(&v);
return 0;
}
The compiler will turn local variable "v" into a readonly section.
During instruction selection phase, the compiler generates two
loads from readonly section, one 2 byte load or 1 byte load, e.g., for 2 loads,
t8: i32,ch = load<(dereferenceable load 2 from `i8* getelementptr inbounds
(%struct.t, %struct.t* @__const.test.v, i64 0, i32 0)`, align 1),
anyext from i16> t3, GlobalAddress:i64<%struct.t* @__const.test.v> 0, undef:i64
t9: ch = store<(store 2 into %ir.v1.sub1), trunc to i16> t3, t8,
FrameIndex:i64<0>, undef:i64
BPF backend changed t8 to i64 = Constant<2> and eventually the generated machine IR:
t10: i64 = MOV_ri TargetConstant:i64<2>
t40: i32 = SLL_ri_32 t10, TargetConstant:i32<8>
t41: i32 = OR_ri_32 t40, TargetConstant:i64<0>
t9: ch = STH32<Mem:(store 2 into %ir.v1.sub1)> t41, TargetFrameIndex:i64<0>,
TargetConstant:i64<0>, t3
Note that t10 in the above is not correct. The type should be i32 and instruction
should be MOV_ri_32. The reason for incorrect insn selection is BPF insn selection
generated an i64 constant instead of an i32 constant as specified in the original
load instruction. Such incorrect insn sequence eventually caused the following
fatal error when a COPY insn tries to copy a 64bit register to a 32bit subregister.
Impossible reg-to-reg copy
UNREACHABLE executed at ../lib/Target/BPF/BPFInstrInfo.cpp:42!
This patch fixed the issue by using the load result type instead of always i64
when doing readonly load optimization.
Differential Revision: https://reviews.llvm.org/D81630
(cherry picked from commit 4db1878158a3f481ff673fef2396c12b7a53d280)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In BTF, pointee type pruning is used to reduce cluttering
too many unused types into prog BTF. For example,
struct task_struct {
...
struct mm_struct *mm;
...
}
If bpf program does not access members of "struct mm_struct",
there is no need to bring types for "struct mm_struct" to BTF.
This patch fixed a bug where an incorrect pruning happened.
The test case like below:
struct t;
typedef struct t _t;
struct s1 { _t *c; };
int test1(struct s1 *arg) { ... }
struct t { int a; int b; };
struct s2 { _t c; }
int test2(struct s2 *arg) { ... }
After processing test1(), among others, BPF backend generates BTF types for
"struct s1", "_t" and a placeholder for "struct t".
Note that "struct t" is not really generated. If later a direct access
to "struct t" member happened, "struct t" BTF type will be generated
properly.
During processing test2(), when processing member type "_t c",
BPF backend sees type "_t" already generated, so returned.
This caused the problem that "struct t" BTF type is never generated and
eventually causing incorrect type definition for "struct s2".
To fix the issue, during DebugInfo type traversal, even if a
typedef/const/volatile/restrict derived type has been recorded in BTF,
if it is not a type pruning candidate, type traversal of its base type continues.
Differential Revision: https://reviews.llvm.org/D82041
(cherry picked from commit 89648eb16d01725457f958e634d16c534b64c42c)
|
|
|
|
|
|
|
|
|
|
| |
As reported in PR45186, we could be in a situation where we don't
want to handle unaligned memory accesses for FP scalars but still
have VSX (which allows unaligned access for vectors). Change the
default to only apply to scalars.
Fixes: https://bugs.llvm.org/show_bug.cgi?id=45186
(cherry picked from commit 099a875f28d0131a6ae85af91b9eb8627917fbbe)
|
|
|
|
|
|
|
|
|
|
|
|
| |
We relied on the fact that the iterators walks through the elements of a
DenseSet in a deterministic order (which is not true). This caused
ThinLTO cache misses. This patch addresses this problem.
See PR 45819 for additional information
https://bugs.llvm.org/show_bug.cgi?id=45819
Differential Revision: https://reviews.llvm.org/D79772
(cherry picked from commit 252892fea7088abbeff9476e0ecbacc091d135a0)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As reported in https://bugs.llvm.org/show_bug.cgi?id=45709 we can hit an
infinite loop in legalization since we set the legalization action for
ISD::SELECT_CC for all fixed length vector types to Promote. Without some
different legalization action for the type being promoted to, the legalizer
simply loops. Since we don't have patterns to match the node, the right
legalization action should be Expand.
Differential revision: https://reviews.llvm.org/D79854
(cherry picked from commit 793cc518b9428a0b7a40c59d4ecd5939a7bc84f7)
|
|
|
|
|
|
|
|
|
|
|
|
| |
The fix for PR39865 took care of some of the handling for half precision
but it missed a number of issues that still exist. This patch fixes the
remaining issues that cause crashes in the PPC back end.
Fixes: https://bugs.llvm.org/show_bug.cgi?id=45776
Differential revision: https://reviews.llvm.org/D79283
(cherry picked from commit 1a493b0fa556a07c728862c3c3f70bfd8683bef0)
|
|
|
|
|
|
|
|
|
| |
This patch adds support for Vector Multiply-Sum Unsigned Doubleword Modulo
instruction; vmsumudm.
Differential Revision: https://reviews.llvm.org/D80294
(cherry picked from commit a28e9f1208608f8d18750bb88ca74722fb0bcce4)
|
|
|
|
|
|
|
|
|
|
|
|
| |
The approach here is to create a new (empty) component, `Extensions', where all
statically compiled extensions dynamically register their dependencies. That way
we're more natively compatible with LLVMBuild and llvm-config.
Fixes: https://bugs.llvm.org/show_bug.cgi?id=44870
Differential Revision: https://reviews.llvm.org/D78192
(cherry picked from commit 8f766e382b77eef3102798b49e087d1e4804b984)
|
|
|
|
|
|
|
|
|
|
|
| |
This is the first checkin to support Marvell ThunderX3T110.
Initial definition of the micro-ops of the instructions in ThunderX3T110
is included.
Differential Revision: https://reviews.llvm.org/D78129
(cherry picked from commit 382d3a85e2a9269569e7fb8caa487d7ef57900c6)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
8-bit immediates for minsize
Summary:
Otherwise PostRA list scheduler may reorder instruction, such as
schedule this
'''
pushq $0x8
pop %rbx
lea 0x2a0(%rsp),%r15
'''
to
'''
pushq $0x8
lea 0x2a0(%rsp),%r15
pop %rbx
'''
by mistake. The patch is to prevent this to happen by making sure POP has
implicit use of SP.
Reviewers: craig.topper
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77031
(cherry picked from commit ece79f47083babcabde3700c67b90ef19967a5b3)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Currently, `rewriteLoopExitValues()`'s logic is roughly as following:
> Loop over each incoming value in each PHI node.
> Query whether the SCEV for that incoming value is high-cost.
> Expand the SCEV.
> Perform sanity check (`isValidRewrite()`, D51582)
> Record the info
> Afterwards, see if we can drop the loop given replacements.
> Maybe perform replacements.
The problem is that we interleave SCEV cost checking and expansion.
This is A Problem, because `isHighCostExpansion()` takes special care
to not bill for the expansions that were already expanded, and we can reuse.
While it makes sense in general - if we know that we will expand some SCEV,
all the other SCEV's costs should account for that, which might cause
some of them to become non-high-cost too, and cause chain reaction.
But that isn't what we are doing here. We expand *all* SCEV's, unconditionally.
So every next SCEV's cost will be affected by the already-performed expansions
for previous SCEV's. Even if we are not planning on keeping
some of the expansions we performed.
Worse yet, this current "bonus" depends on the exact PHI node
incoming value processing order. This is completely wrong.
As an example of an issue, see @dmajor's `pr45835.ll` - if we happen to have
a PHI node with two(!) identical high-cost incoming values for the same basic blocks,
we would decide first time around that it is high-cost, expand it,
and immediately decide that it is not high-cost because we have an expansion
that we could reuse (because we expanded it right before, temporarily),
and replace the second incoming value but not the first one;
thus resulting in a broken PHI.
What we instead should do for now, is not perform any expansions
until after we've queried all the costs.
Later, in particular after `isValidRewrite()` is an assertion (D51582)
we could improve upon that, but in a more coherent fashion.
See [[ https://bugs.llvm.org/show_bug.cgi?id=45835 | PR45835 ]]
Reviewers: dmajor, reames, mkazantsev, fhahn, efriedma
Reviewed By: dmajor, mkazantsev
Subscribers: smeenai, nikic, hiraditya, javed.absar, llvm-commits, dmajor
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79787
(cherry picked from commit b2df96123198deadad74634c978e84912314da26)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
SCTLR_EL1.BT[01] controls the PACI[AB]SP compatibility with PBYTE 11
(see [1])
This bit will be set to zero so PACI[AB]SP are equal to BTI C
instruction only.
[1] https://developer.arm.com/docs/ddi0595/b/aarch64-system-registers/sctlr_el1
Reviewers: chill, tamas.petz, pbarrio, ostannard
Reviewed By: tamas.petz, ostannard
Subscribers: kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81746
(cherry picked from commit b8ae3fdfa579dbf366b1bb1cbfdbf8c51db7fa55)
|
|
|
|
|
|
|
|
|
|
| |
In some cases BTI landing pad is inserted even compatible instruction
was there already. Meta instruction does not count in this case
therefore skip them in the check for first instructions in the function.
Differential revision: https://reviews.llvm.org/D74492
(cherry picked from commit d5a186a60014dc1a8c979c978cb32aba7ecb9102)
|
|
|
|
|
|
|
|
| |
Similar to D81212.
Differential Revision: https://reviews.llvm.org/D81292
(cherry picked from commit 3408dcbdf054ac3cc32a97a6a82a3cf5844be609)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
undef.
Shifts are supposed to always shift in zeros or sign bits regardless of their inputs. It's possible the input value may have been replaced with undef by SimplifyDemandedBits, but the shift in zeros are still demanded.
This issue was reported to me by ispc from 10.0. Unfortunately their failing test does not fail on trunk. Seems to be because the shl is optimized out earlier now and doesn't become VSHLI.
ispc bug https://github.com/ispc/ispc/issues/1771
Differential Revision: https://reviews.llvm.org/D81212
(cherry picked from commit 7c9a89fed8f5d53d61fe3a61a2581a7c28b1b6d2)
|
|
|
|
|
|
|
|
|
|
| |
Bug report: https://bugs.llvm.org/show_bug.cgi?id=45291
Patch by Tomasz Miąsko
Differential Revision: https://reviews.llvm.org/D80066
(cherry picked from commit 5f65faef2c61bfb5e041f74db61665f43a05e9db)
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds the x, t and g modifiers for inline asm from GCC. These will print a vector register as xmm*, ymm* or zmm* respectively.
I also fixed register names with modifiers with inteldialect so they are no longer printed with a leading %.
Patch by Amanieu d'Antras
Differential Revision: https://reviews.llvm.org/D78977
(cherry picked from commit c5f7c039efe7ff09a44cfd252f6cb001ceed6269)
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Likely fixes https://bugs.llvm.org/show_bug.cgi?id=45858.
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80047
(cherry picked from commit fc937806efd71eb3803b35d0920bb7d76bc3040b)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
In 2e24219d3cbf, a number of ARM pcrel fixups were resolved at assembly
time, to solve PR44929. This only covered little-endian ARM however, so
add similar fixups for big-endian ARM. Also extend the test case to
cover big-endian ARM.
Reviewers: hans, psmith, MaskRay
Reviewed By: psmith, MaskRay
Subscribers: kristof.beyls, hiraditya, danielkiss, emaste, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79774
(cherry picked from commit fc373522b044e0b150561204958f0d603fb4caba)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When compiling for a arm5te cpu from clang, the +dsp attribute is set.
This meant we could try and generate qadd8 instructions where we would
end up having no pattern. I've changed the condition here to be hasV6Ops
&& hasDSP, which is what other parts of ARMISelLowering seem to use for
similar instructions.
Fixed PR45677.
Differential Revision: https://reviews.llvm.org/D78877
(cherry picked from commit 8807139026b64ac40163bb255dad38a1d8054f08)
|
|
|
|
|
|
|
|
|
|
|
|
| |
We call the function that attempts to reuse the conversion without checking
whether the target matches the constraints that the callee expects. This patch
adds the check prior to the call.
Fixes: https://bugs.llvm.org/show_bug.cgi?id=43976
Differential revision: https://reviews.llvm.org/D77564
(cherry picked from commit 64b31d96dfd6c05e6d52d8798726dec60502cfde)
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Addresses PR44728 but no tests because I've not yet made any attempt to verify
correctness of the debug info.
Reviewers: sbc100, aardappel
Differential Revision: https://reviews.llvm.org/D74656
(cherry picked from commit 2504f14a06872f2e1755a88b3aab7e6bc280bec7)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
of a struct that cover the whole struct
This can happen when the rest of the
members of are zero length. Following
the same pattern applied to the SROA
pass in:
d7f6f1636d53c3e2faf55cdf20fbb44a1a149df1
Fixes: https://bugs.llvm.org/show_bug.cgi?id=45335
Differential Revision: https://reviews.llvm.org/D78720
(cherry picked from commit 3929429347d398773577b79f7fdb780d4f7ed887)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
NOTE: I mostly put it on Phabricator to make it easy for other people to fetch & check if that fixes a bug.
This patch backports 4878aa36d4a and required earlier patches onto
the release/10.x branch.
It includes the following patches:
aa5ebfdf205de6d599c1fed3161da3b63b6f0bef [ValueLattice] Make mark* functions public, return if value changed.
c1943b42c5b7feff5d0e0c1358d02889e2be165f [ValueLattice] Update markConstantRange to return false equal ranges.
e30c257811f62fea21704caa961c61e4559de202 [CVP,SCCP] Precommit test for D75055.
4878aa36d4aa27df644430139fab2734fde4a000 [ValueLattice] Add new state for undef constants.
All patches except the last one apply cleanly. For the last one, the
changes to SCCP.cpp were stripped, because SCCP does not yet use
ValueLattice on release/10.x. Otherwise we would have to pull in more
additional changes.
Subscribers: tstellar, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76596
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Enforcement Technology)
Do not commit the llvm/test/ExecutionEngine/MCJIT/cet-code-model-lager.ll because it will
cause build bot fail(not suitable for window 32 target).
Summary:
This patch comes from H.J.'s https://github.com/hjl-tools/llvm-project/commit/2bd54ce7fa9e94fcd1118b948e14d1b6fc54dfd2
**This patch fix the failed llvm unit tests which running on CET machine. **(e.g. ExecutionEngine/MCJIT/MCJITTests)
The reason we enable IBT at "JIT compiled with CET" is mainly that: the JIT don't know the its caller program is CET enable or not.
If JIT's caller program is non-CET, it is no problem JIT generate CET code or not.
But if JIT's caller program is CET enabled, JIT must generate CET code or it will cause Control protection exceptions.
I have test the patch at llvm-unit-test and llvm-test-suite at CET machine. It passed.
and H.J. also test it at building and running VNCserver(Virtual Network Console), it works too.
(if not apply this patch, VNCserver will crash at CET machine.)
Reviewers: hjl.tools, craig.topper, LuoYuanke, annita.zhang, pengfei
Reviewed By: LuoYuanke
Subscribers: tstellar, efriedma, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76900
(cherry picked from commit 01a32f2bd3fad4331cebe9f4faa270d7c082d281)
|