| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch provides mitigation for CVE-2017-5715, Spectre variant two,
which affects the P5600 and P6600. It implements the LLVM part of
-mindirect-jump=hazard. It is _not_ enabled by default for the P5600.
The migitation strategy suggested by MIPS for these processors is to use
hazard barrier instructions. 'jalr.hb' and 'jr.hb' are hazard
barrier variants of the 'jalr' and 'jr' instructions respectively.
These instructions impede the execution of instruction stream until
architecturally defined hazards (changes to the instruction stream,
privileged registers which may affect execution) are cleared. These
instructions in MIPS' designs are not speculated past.
These instructions are used with the attribute +use-indirect-jump-hazard
when branching indirectly and for indirect function calls.
These instructions are defined by the MIPS32R2 ISA, so this mitigation
method is not compatible with processors which implement an earlier
revision of the MIPS ISA.
Performance benchmarking of this option with -fpic and lld using
-z hazardplt shows a difference of overall 10%~ time increase
for the LLVM testsuite. Certain benchmarks such as methcall show a
substantially larger increase in time due to their nature.
Reviewers: atanasyan, zoran.jovanovic
Differential Revision: https://reviews.llvm.org/D43486
llvm-svn: 325653
|
| |
|
|
|
|
|
|
|
| |
We already do this in DAGCombiner, but it should
also be good to eliminate the fsub use in IR.
This is similar to rL325648.
llvm-svn: 325649
|
| |
|
|
|
|
|
| |
We already do this in DAGCombiner, but it should
also be good to eliminate the fsub use in IR.
llvm-svn: 325648
|
| |
|
|
|
|
|
|
|
|
| |
https://reviews.llvm.org/rL325518
It breaks following OpenCL conformance tests:
- Basic - parameter_types
- Basic - vload_private
llvm-svn: 325643
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
We used to remove the first memmove in cases like this:
memmove(p, p+2, 8);
memmove(p, p+2, 8);
which is incorrect. Fix this by changing isPossibleSelfRead to what was most
likely the intended behavior.
Historical note: the buggy code was added in https://reviews.llvm.org/rL120974
to address PR8728.
Reviewers: rsmith
Subscribers: mcrosier, llvm-commits, jlebar
Differential Revision: https://reviews.llvm.org/D43425
llvm-svn: 325641
|
| |
|
|
|
|
|
|
| |
Spilling may cause previously non-empty intervals (both for the spilled vreg
and others) to become empty. Moving the pruning into initializeGraph catches
these cases and fixes PR33038.
llvm-svn: 325632
|
| |
|
|
|
|
|
|
| |
This is usually not a problem because this code's main purpose is
eliminating unused new/delete pairs. We got deletes of nullptr or
nobuiltin deletes of builtin new wrong though.
llvm-svn: 325630
|
| |
|
|
|
|
|
| |
FMul is commutative, so complexity-based canonicalization should always
take care of the swap via SimplifyAssociativeOrCommutative().
llvm-svn: 325628
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
against vector splats of constants.
This is split off from D42948 and includes just the cases that constant fold to true or false. It also includes some refactoring to keep predicate checks together.
This supports things like
(setcc uge X, 0) -> true
Differential Revision: https://reviews.llvm.org/D43489
llvm-svn: 325627
|
| |
|
|
|
|
|
|
|
|
|
| |
Get rid of icky goto loops and make the code easier to maintain. Otherwise,
NFC.
Restore r324903 and fix PR36369.
Differentail revision: https://reviews.llvm.org/D43364
llvm-svn: 325621
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
With D43396, no clients use the Path parameter anymore.
Depends on D43396.
Reviewers: pcc
Subscribers: mehdi_amini, inglorion, llvm-commits
Differential Revision: https://reviews.llvm.org/D43400
llvm-svn: 325619
|
| |
|
|
|
|
|
|
| |
This case wasn't handled yet.
Differential Revision: https://reviews.llvm.org/D43508
llvm-svn: 325616
|
| |
|
|
| |
llvm-svn: 325606
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
uses of the condition.
SimplifyDemandedBits forces the demanded mask to all 1s if the node has multiple uses, unless the AssumeSingleUse flag is set.
So previously we were only really likely to simplify something if the condition had a single use. And on the off chance we did simplify with multiple uses the demanded mask being used was all ones so there was no reason to create a shrunkblend.
This patch now checks that the condition is only used by selects first, and then sets the AssumeSingleUse flag for the simplifcation. Then we convert the selects to shrunkblend, and finally replace condition.
Differential Revision: https://reviews.llvm.org/D43446
llvm-svn: 325604
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
simplify DAGCombiner and simplifySetCC code and fix a bug.
DAGCombiner and SimplifySetCC both use getPointerTy for shift amounts pre-legalization. DAGCombiner uses a single helper function to hide this. SimplifySetCC does it in multiple places.
This patch adds a defaulted parameter to getShiftAmountTy that can make it return getPointerTy for scalar types. Use this parameter to simplify the SimplifySetCC and DAGCombiner.
Additionally, there were two places in SimplifySetCC that were creating shifts using the target's preferred shift amount pre-legalization. If the target uses a narrow type and the type is illegal, this can cause SimplfiySetCC to create a shift with an amount that can't represent all possible shift values for the type. To fix this we should use pointer type there too.
Alternatively we could make getScalarShiftAmountTy for each target return a safe value for large types as proposed in D43445. And maybe we should still do that, but fixing the SimplifySetCC code keeps other targets from tripping over this in the future.
Fixes PR36250.
Differential Revision: https://reviews.llvm.org/D43449
llvm-svn: 325602
|
| |
|
|
|
|
|
|
|
|
| |
This allows us to avoid an opsize prefix. And forcing some move immediates to i32 avoids a length changing prefix on those instructions.
This mostly replaces the existing combine we had for zext/sext+cmov of constants. I left in a case for sign extending a 32 bit cmov of constants to 64 bits.
Differential Revision: https://reviews.llvm.org/D43327
llvm-svn: 325601
|
| |
|
|
| |
llvm-svn: 325597
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
These are fdiv-with-constant-divisor, so they already become
reciprocal multiplies. The last gap for vector ops should be
closed with rL325590.
It's possible that we're missing folds for some edge cases
with denormal intermediate constants after deleting these,
but there are no tests for those patterns, and it would be
better to handle denormals more consistently (and less
conservatively) as noted in TODO comments.
llvm-svn: 325595
|
| |
|
|
| |
llvm-svn: 325590
|
| |
|
|
|
|
|
|
| |
An upcoming patch D41434, changes the ordering of the matcher table
for assembly. This patch corrects the definition of the normal MIPS
cvt.d.w not to be available in microMIPS.
llvm-svn: 325589
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Patch adds an option for emission of inlined strings rather than
.debug_str section.
Reviewers: echristo, jlebar
Subscribers: eraman, llvm-commits, JDevlieghere
Differential Revision: https://reviews.llvm.org/D43390
llvm-svn: 325583
|
| |
|
|
|
|
|
|
|
|
|
|
| |
parameter save area when needed
Current implementation always allocates the parameter save area conservatively
for fastcc functions. There is no reason to allocate the parameter save area if
all the parameters can be passed via registers.
Differential Revision: https://reviews.llvm.org/D42602
llvm-svn: 325581
|
| |
|
|
| |
llvm-svn: 325580
|
| |
|
|
|
|
|
|
| |
ExpandUINT_TO_FLOAT can accept vXi32 or vXi64 inputs, so we need to use a uint64_t shift to generate the 2^(BW/2) constant.
No test case unfortunately as no upstream target uses this, but its affecting a downstream target.
llvm-svn: 325578
|
| |
|
|
|
|
|
|
|
|
| |
We can always convert xor %a, -1 into MVN, even in thumb 1 where the -1
would not otherwise be considered a cheap constant. This prevents the
-1's from being pulled out into constants and potentially hoisted.
Differential Revision: https://reviews.llvm.org/D43451
llvm-svn: 325573
|
| |
|
|
|
|
|
|
|
|
|
| |
For instructions like call foo and jmp foo patch changes
relocation produced from R_X86_64_PC32 to R_X86_64_PLT32.
Relocation can be used as a marker for 32-bit PC-relative branches.
Linker will reduce PLT32 relocation to PC32 if function is defined locally.
Differential revision: https://reviews.llvm.org/D43383
llvm-svn: 325569
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The machine instruction scheduler was illegally moving a buffer store
past a buffer load with the same descriptor and offset. Fixed by marking
buffer ops as mayAlias and isAliased. This may be overly conservative,
and we may need to revisit.
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D43332
Change-Id: Iff3173d9e0653e830474546276ab9d30318b8ef7
llvm-svn: 325567
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
llvm-mc can crash when
there is cfi_startproc without cfi_end_proc:
.text
.globl foo
foo:
.cfi_startproc
Testcase shows the issue, patch fixes it.
Differential revision: https://reviews.llvm.org/D43456
llvm-svn: 325564
|
| |
|
|
|
|
|
|
| |
auto upgrade 128/256/512 bit masked pmulhrsw/pmulhw/pmulhuw intrinsics.
The 128 and 256 bit versions were already not used by clang. This adds an equivalent unmasked 512 bit version. Then autoupgrades all sizes to use unmasked intrinsics plus select.
llvm-svn: 325559
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the second part of recommit of r325224. The previous part was
committed in r325426, which deals with C++ memory allocation. Solution
for C memory allocation involved functions `llvm::malloc` and similar.
This was a fragile solution because it caused ambiguity errors in some
cases. In this commit the new functions have names like `llvm::safe_malloc`.
The relevant part of original comment is below, updated for new function
names.
Analysis of fails in the case of out of memory errors can be tricky on
Windows. Such error emerges at the point where memory allocation function
fails, but manifests itself when null pointer is used. These two points
may be distant from each other. Besides, next runs may not exhibit
allocation error.
In some cases memory is allocated by a call to some of C allocation
functions, malloc, calloc and realloc. They are used for interoperability
with C code, when allocated object has variable size and when it is
necessary to avoid call of constructors. In many calls the result is not
checked for null pointer. To simplify checks, new functions are defined
in the namespace 'llvm': `safe_malloc`, `safe_calloc` and `safe_realloc`.
They behave as corresponding standard functions but produce fatal error if
allocation fails. This change replaces the standard functions like 'malloc'
in the cases when the result of the allocation function is not checked
for null pointer.
Finally, there are plain C code, that uses malloc and similar functions. If
the result is not checked, assert statement is added.
Differential Revision: https://reviews.llvm.org/D43010
llvm-svn: 325551
|
| |
|
|
|
|
|
|
|
|
|
|
| |
fpr32 first.
This is a follow on commit to r[x] where we fix the other direction of copy.
For this case, after converting the source from gpr32 -> fpr32, we use a
subregister copy, which is essentially what EXTRACT_SUBREG does in SDAG land.
https://reviews.llvm.org/D43444
llvm-svn: 325550
|
| |
|
|
| |
llvm-svn: 325547
|
| |
|
|
|
|
| |
do it in two places.
llvm-svn: 325546
|
| |
|
|
|
|
| |
Also, move the folds with constants closer to make it easier to follow.
llvm-svn: 325541
|
| |
|
|
|
|
| |
This reverts commit r325532.
llvm-svn: 325539
|
| |
|
|
|
|
| |
Previously we used vptestmd, but the scheduling data for SKX says vpmovq2m/vpmovd2m is lower latency. We already used vpmovb2m/vpmovw2m for byte/word truncates. So this is more consistent anyway.
llvm-svn: 325534
|
| |
|
|
|
|
|
|
|
|
|
|
| |
-ffast-math
It's possible that we could allow this either 'arcp' or 'reassoc' alone, but this
should be conservatively better than what we have right now. GCC allows this with
only -freciprocal-math.
The last test is changed to show a case that is expected to fold, but we need D43398.
llvm-svn: 325533
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Several for loops in PromoteMemoryToRegister.cpp leave their increment
expression empty, instead incrementing the iterator within the for loop
body. I believe this is because these loops were previously implemented
as while loops; see https://reviews.llvm.org/rL188327.
Incrementing the iterator within the body of the for loop instead of
in its increment expression makes it seem like the iterator will be
modified or conditionally incremented within the loop, but that is not
the case in these loops.
Instead, use range loops.
Test Plan: `check-llvm`
Reviewers: davide, bkramer
Reviewed By: davide, bkramer
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D43473
llvm-svn: 325532
|
| |
|
|
|
|
|
|
|
| |
The last fold that used to be here was not necessary. That's a
combination of 2 folds (and there's a regression test to show that).
The transforms are guarded by isFast(), but that should be loosened.
llvm-svn: 325531
|
| |
|
|
|
|
|
|
|
|
| |
Summary:
Move a debug statement to above where an assertion is hit, so that the debug
statement can be inspected before a stack trace.
Test Plan: `check-llvm`
llvm-svn: 325529
|
| |
|
|
|
|
| |
We swapped the operands and used setle, but I don't see any reason to do that. I think this is a holdover from SSE where we swap and the invert to use pcmpgt. But with AVX512 we don't want an invert so we won't use pcmpgt. So there's no need to swap.
llvm-svn: 325527
|
| |
|
|
|
|
|
|
|
|
| |
VPTESTM/VPTESTNM matching.
Canonicalize EQ/NE PCMPM to have build vector all zeros on the RHS so we don't have to pattern match it in both locations. This significantly reduces the number of isel patterns needed since we also had to multiply it out with loads being in either operand of the 'and' input node and in the 'and' masking node.
This removes over 24000 bytes from the isel table.
llvm-svn: 325526
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: The discussion and as per need, each vendor needs a way to keep the old fast flags and the new fast flags in the auto upgrade path of the IR upgrader. This revision addresses that issue.
Patched by Michael Berg
Reviewers: qcolombet, hans, steven_wu
Reviewed By: qcolombet, steven_wu
Subscribers: dexonsmith, vsk, mehdi_amini, andrewrk, MatzeB, wristow, spatel
Differential Revision: https://reviews.llvm.org/D43253
llvm-svn: 325525
|
| |
|
|
|
|
| |
to suppression of redundant waitcnt instrs. It is necessary to make note of these existing waitcnt instrs so that we do not fall into an infinite loop when handling loops. Also, [NFC] some minor code clean-up.
llvm-svn: 325524
|
| |
|
|
|
|
|
|
|
|
| |
If we have a clamp pattern, SMIN(SMAX(X, LO),HI) or SMAX(SMIN(X, HI),LO) then we can deduce that the number of signbits (zeros/ones) will be at least the minimum of the LO and HI constants.
ComputeKnownBits equivalent of D43338.
Differential Revision: https://reviews.llvm.org/D43463
llvm-svn: 325521
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: GCN ISA supports instructions that can read 16 consecutive dwords from memory through the scalar data cache; loadstoreVectorizer should take advantage of the wider vector length and pack 16/8 elements of dwords/quadwords.
Author: FarhanaAleen
Reviewed By: rampitec
Subscribers: llvm-commits, AMDGPU
Differential Revision: https://reviews.llvm.org/D43275
llvm-svn: 325518
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This commit separates the abstract accelerator table data structure
from the code for writing out an on-disk representation of a specific
accelerator table format. The idea is that former (now called
AccelTable<T>) can be reused for the DWARF v5 accelerator tables
as-is, without any further customizations.
Some bits of the emission code (now living in the EmissionContext class)
can be reused for DWARF v5 as well, but the subtle differences in the
layout of various subtables mean the sharing is not always possible.
(Also, the individual emit*** functions are fairly simple so there's a
tradeoff between making a bigger general-purpose function, and two
smaller targeted functions.)
Another advantage of this setup is that more of the serialization logic
can be hidden in the .cpp file -- I have moved declarations of the
header and all the emission functions there.
Reviewers: JDevlieghere, aprantl, probinson, dblaikie
Subscribers: echristo, clayborg, vleschuk, llvm-commits
Differential Revision: https://reviews.llvm.org/D43285
llvm-svn: 325516
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It was reverted because it broke the grub build. The reason the grub
build broke is because grub does its own relocation processing and was
not handing R_386_PLT32. Since grub has no dynamic linker, the fix is
trivial: handle R_386_PLT32 exactly like R_386_PC32.
On the report it was noted that they are using
-fno-integrated-assembler. The upstream GAS (starting with
451875b4f976a527395e9303224c7881b65e12ed) will already be producing a
R_386_PLT32 anyway, so they have to update their code one way or the
other
Original message:
Don't assume a null GV is local for ELF and MachO.
This is already a simplification, and should help with avoiding a plt
reference when calling an intrinsic with -fno-plt.
With this change we return false for null GVs, so the caller only
needs to check the new metadata to decide if it should use foo@plt or
*foo@got.
llvm-svn: 325514
|
| |
|
|
| |
llvm-svn: 325512
|
| |
|
|
|
|
|
|
| |
Add GraphTraits definitions to the FunctionSummary and ModuleSummaryIndex classes. These GraphTraits will be used to construct find SCC's in ThinLTO analysis passes.
Third attempt - moved function from lambda to static function due to build failures.
llvm-svn: 325506
|