CodeGenPrepare sinks address computations from one basic block to another
and attempts to reuse address computations that have already been sunk. If
the same address computation appears twice with the first instance as an
operand of a load whose result is an operand to a simplifiable select,
CodeGenPrepare simplifies the select and recursively erases the now dead
instructions. CodeGenPrepare then attempts to use the erased address
computation for the second load.
Fix this by erasing the cached address value if it has zero uses before
looking for the address value in the sunken address map.
This partially resolves PR35209.
Thanks to Alexander Richardson for reporting the issue!
Reviewers: john.brawn
Differential Revision: https://reviews.llvm.org/D39841
llvm-svn: 318032
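
The ordering of the fix can be pictured with a toy model. This is only a sketch, not CodeGenPrepare's code: `Value`, `SunkAddrCache`, `remember` and `lookup` are hypothetical names, and the use count is a plain integer rather than LLVM's use list. It shows the step the message describes: forget a cached address that has zero uses before consulting the map.

```python
class Value:
    """Toy stand-in for an IR value; `uses` counts its remaining users."""

    def __init__(self, name, uses=0):
        self.name = name
        self.uses = uses


class SunkAddrCache:
    """Hypothetical map from an original address to its sunk copy."""

    def __init__(self):
        self._sunk = {}

    def remember(self, addr, sunk_value):
        self._sunk[addr] = sunk_value

    def lookup(self, addr):
        cached = self._sunk.get(addr)
        # Mirror of the described fix: a cached value with zero uses may
        # already have been erased, so forget it before trying to reuse it.
        if cached is not None and cached.uses == 0:
            del self._sunk[addr]
            return None
        return cached
```

A caller that gets `None` back would recompute the address instead of reusing a stale one.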
|
| |
|
|
| |
llvm-svn: 318029
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch extends the partial inliner to support inlining parts of
vararg functions, if the vararg handling is done in the outlined part.
It adds a `ForwardVarArgsTo` argument to InlineFunction. If it is
non-null, all varargs passed to the inlined function will be added to
all calls to `ForwardVarArgsTo`.
The partial inliner takes care to only pass `ForwardVarArgsTo` if the
varargs handling is done in the outlined function. It checks that vastart
is not part of the function to be inlined.
`test/Transforms/CodeExtractor/PartialInlineNoInline.ll` (already part
of the repo) checks we do not do partial inlining if vastart is used in
a basic block that will be inlined.
Reviewers: davide, davidxl, grosser
Reviewed By: davide, davidxl, grosser
Subscribers: gyiu, grosser, eraman, llvm-commits
Differential Revision: https://reviews.llvm.org/D39607
llvm-svn: 318028
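
A rough way to picture the forwarding behaviour: the sketch below models an inlined body as a list of `(callee, args)` calls and appends the call site's varargs to every call to the forwarding target. The function `inline_body` and the callee name `f.outlined` are made up for illustration, and substitution of the fixed parameters is deliberately left out; this is not the `InlineFunction` implementation.

```python
def inline_body(body, call_args, num_fixed_params, forward_varargs_to=None):
    """Return a copy of `body` as it might look after inlining one call site.

    `body` is a list of (callee_name, args) calls made by the inlined
    function; `call_args` are the arguments at the call site being inlined.
    """
    # Everything past the fixed parameters is a vararg at this call site.
    varargs = list(call_args[num_fixed_params:])
    inlined = []
    for callee, args in body:
        new_args = list(args)
        # Only calls to the forwarding target receive the extra arguments.
        if forward_varargs_to is not None and callee == forward_varargs_to:
            new_args.extend(varargs)
        inlined.append((callee, new_args))
    return inlined
```

With `forward_varargs_to=None` the body is copied unchanged, which corresponds to the case where no forwarding was requested.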
|
| |
|
|
| |
llvm-svn: 318027
|
| |
|
|
|
|
|
|
|
| |
This patch, together with a matching clang patch (https://reviews.llvm.org/D38672), implements the lowering of X86 shuffle i/f intrinsics to IR.
Differential Revision: https://reviews.llvm.org/D38671
Change-Id: I1e7d359a74743e995ec356237a85214ce55d3661
llvm-svn: 318026
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Updated the scheduling information of the SKX subtarget in the file X86SchedSkylakeServer.td under lib/Target/X86 to:
1. add regular opcodes in addition to the suffixed "_Int" opcodes
2. add the (V)MAXCPD/MAXCPS/MAXCSD/MAXCSS/MINCPD/MINCPS/MINCSD/MINCSS
instructions, which are equivalent to their counterparts without the 'C'; they are part of a hack to
make floating-point min/max commutable under fast math.
Reviewers: zvi, RKSimon, craig.topper
Differential Revision: https://reviews.llvm.org/D39833
Change-Id: Ie13702a5ce1b1a08af91ca637a52b6962881e7d6
llvm-svn: 318024
|
| |
|
|
|
|
| |
We support two spellings for silvermont, and we should accept both here.
llvm-svn: 318023
|
| |
|
|
|
|
| |
vrcp14ss/sd, rsqrt14ss/sd instructions.
llvm-svn: 318022
|
| |
|
|
| |
llvm-svn: 318020
|
| |
|
|
|
|
| |
intrinsics.
llvm-svn: 318019
|
| |
|
|
| |
llvm-svn: 318017
|
| |
|
|
|
|
| |
sse_load_f32/sse_load_f64 to increase load folding opportunities.
llvm-svn: 318016
|
| |
|
|
|
|
|
|
|
|
|
| |
This was using a custom function that didn't handle the
addressing modes properly for private. Use
isLegalAddressingMode to avoid duplicating this.
Additionally, skip the combine if there is only one use
since the standard combine will handle it.
llvm-svn: 318013
|
| |
|
|
| |
llvm-svn: 318010
|
| |
|
|
| |
llvm-svn: 318009
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
intrinsics.
The VRNDSCALE instructions implement a superset of the (V)ROUND instructions. They are equivalent if the upper 4 bits of the immediate are 0.
This patch lowers the legacy intrinsics to the VRNDSCALE ISD node and masks the upper bits of the immediate to 0. This allows us to take advantage of the larger register encoding space.
We should maybe consider converting VRNDSCALE back to VROUND in the EVEX to VEX pass if the extended registers are not being used.
I notice some load folding opportunities being missed for the VRNDSCALESS/SD instructions that I'll try to fix in future patches.
llvm-svn: 318008
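
The immediate handling can be sketched as follows. The helper names are hypothetical, and the field layout is an assumption stated here rather than taken from the patch: the upper 4 bits of the VRNDSCALE immediate are treated as the scale field and the lower 4 bits as the rounding-control bits it shares with (V)ROUND.

```python
def legacy_round_imm_to_vrndscale(imm8):
    """Keep the low 4 bits and force the upper 4 bits of the immediate to 0."""
    return imm8 & 0x0F


def vrndscale_fields(imm8):
    """Split an immediate into (upper 4 bits, lower 4 bits).

    Assumption: the upper nibble is the scale and the lower nibble is the
    rounding control, as described in the lead-in.
    """
    return (imm8 >> 4) & 0xF, imm8 & 0xF
```

After masking, the upper field is always 0, which is the condition under which the message says the two instruction families are equivalent.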
|
| |
|
|
|
|
|
|
| |
and without the rounding operand. NFCI
I want to reuse the VRNDSCALE node for the legacy SSE rounding intrinsics so that those intrinsics can use EVEX instructions. All of these nodes share tablegen multiclasses so I split them all so that they all remain similar in their implementations.
llvm-svn: 318007
|
| |
|
|
| |
llvm-svn: 318005
|
| |
|
|
|
|
| |
This fixes a bug where we selected packed instructions for scalar intrinsics.
llvm-svn: 317999
|
| |
|
|
| |
llvm-svn: 317997
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: davidxl, olista01, Eugene.Zelenko
Reviewed By: Eugene.Zelenko
Subscribers: sdardis, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D39917
llvm-svn: 317995
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch adds an early out to visitICmpInst if we are looking at a compare as part of an integer absolute value idiom. Similar is already done for min/max.
In the particular case I observed in a benchmark we had an absolute value of a load from an indexed global. We simplified the compare using foldCmpLoadFromIndexedGlobal into a magic bit vector, a shift, and an and. But the load result was still used for the select and the negate part of the absolute value idiom. So we overcomplicated the code and lost the ability to recognize it as an absolute value.
I've chosen a simpler case for the test here.
Reviewers: spatel, davide, majnemer
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D39766
llvm-svn: 317994
|
| |
|
|
|
|
|
|
| |
when avx512vl is enabled.
This matches what we do for scalar and 512-bit types.
llvm-svn: 317991
|
| |
|
|
|
|
|
|
|
|
| |
matchBinOpReduction currently matches against a single opcode, but we already have a case where we repeat calls to try to match against AND/OR and I'll be shortly adding another case for SMAX/SMIN/UMAX/UMIN (D39729).
This NFCI patch alters matchBinOpReduction to pattern-match against any of the provided list of candidate bin ops at once to save time.
Differential Revision: https://reviews.llvm.org/D39726
llvm-svn: 317985
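
The idea of matching against several candidate opcodes in one pass can be illustrated with a toy expression tree. This is a sketch under heavy simplification: the real matcher works on DAG nodes, whereas here a reduction is a nested tuple `(op, lhs, rhs)` with string leaves, and `match_binop_reduction` is a hypothetical name.

```python
def match_binop_reduction(expr, candidate_ops):
    """Return the single opcode used by every node of `expr`, or None.

    The opcode must be one of `candidate_ops`; a bare leaf does not match.
    """
    if not isinstance(expr, tuple):
        return None
    root_op = expr[0]
    if root_op not in candidate_ops:
        return None

    def uses_only_root_op(node):
        if not isinstance(node, tuple):
            return True  # leaves are always acceptable
        op, lhs, rhs = node
        return op == root_op and uses_only_root_op(lhs) and uses_only_root_op(rhs)

    return root_op if uses_only_root_op(expr) else None
```

A caller passes all opcodes it is interested in, for example `{"and", "or"}`, and learns from the return value which one matched, instead of calling the matcher once per opcode.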
|
| |
|
|
|
|
|
|
|
|
| |
rename the existing versions to _Int.
This is consistent with our normal implementation of scalar instructions.
While there, disable load folding for the patterns with IMPLICIT_DEF unless optimizing for size, which is also our standard practice.
llvm-svn: 317977
|
| |
|
|
| |
llvm-svn: 317975
|
| |
|
|
| |
llvm-svn: 317974
|
| |
|
|
| |
llvm-svn: 317973
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Allow a pattern rewriter to be installed in CodeGenDAGPatterns and use it to
correct situations where SelectionDAG and GlobalISel disagree on
representation. For example, it would rewrite:
(sextload:i32 $ptr)<<unindexedload>><<sextload>><<sextloadi16>>
to:
(sext:i32 (load:i16 $ptr)<<unindexedload>>)
I'd have preferred to replace the fragments and have the expansion happen
naturally as part of PatFrag expansion but the type inferencing system can't
cope with loads of types narrower than those mentioned in register classes.
This is because the SDTCisInt's on the sext constrain both the result and
operand to the 'legal' integer types (where legal is defined as 'a register
class can contain the type') which immediately rules the narrower types out.
Several targets (those with only one legal integer type) would then go on to
crash on the SDTCisOpSmallerThanOp<> when it removes all the possible types
for the result of the extend.
Also, improve isObviouslySafeToFold() slightly to automatically return true for
neighbouring instructions. There can't be any re-ordering problems if
re-ordering isn't happening. We'll need to improve it further to handle
sign/zero-extending loads when the extend and load aren't immediate neighbours
though.
llvm-svn: 317971
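
The rewrite in the example above can be sketched on a toy pattern representation. Everything here is illustrative: patterns are plain tuples, predicates such as unindexedload are omitted, and `rewrite_sextload` is a hypothetical name, not the hook installed in CodeGenDAGPatterns.

```python
def rewrite_sextload(pattern):
    """Turn ('sextload', result_ty, mem_ty, ptr) into an explicit sext of a load.

    Any other pattern is returned unchanged.
    """
    if isinstance(pattern, tuple) and pattern and pattern[0] == "sextload":
        _, result_ty, mem_ty, ptr = pattern
        # The narrow memory type moves onto the load; the extend carries
        # the wider result type.
        return ("sext", result_ty, ("load", mem_ty, ptr))
    return pattern
```

For instance, `("sextload", "i32", "i16", "$ptr")` becomes `("sext", "i32", ("load", "i16", "$ptr"))`.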
|
| |
|
|
| |
llvm-svn: 317968
|
| |
|
|
|
|
|
|
| |
multiclasses. NFC
No one ever uses this default and probably shouldn't since it sets the execution domain to generic.
llvm-svn: 317967
|
| |
|
|
|
|
|
|
| |
middle GEP indices are non-constant.
This is a fix for a bug in r317947. We were supposed to check that all the indices are constant 0, but instead we only made sure that the indices that are constant are 0. Non-constant indices were being ignored.
llvm-svn: 317950
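
The difference between the two checks is easy to see in a small model. In this sketch (function names hypothetical), a constant index is a Python `int` and a non-constant index is a string such as `"%i"`.

```python
def middle_indices_are_zero_buggy(indices):
    """Behaviour described for r317947: non-constant indices are skipped."""
    return all(i == 0 for i in indices if isinstance(i, int))


def middle_indices_are_zero(indices):
    """Intended behaviour: every index must be a constant equal to 0."""
    return all(isinstance(i, int) and i == 0 for i in indices)
```

For `[0, "%i"]` the first version answers `True` because the variable index is never inspected, while the second answers `False`.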
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
handling to accept GEPs with more than 2 operands if the middle operands are all 0s
Currently we can only get a uniform base from a simple GEP with 2 operands. This causes us to miss address folding opportunities for simple global array accesses as the test case shows.
This patch adds support for larger GEPs if the other indices are 0 since those don't require any additional computations to be inserted.
We may also want to handle constant splats of zero here, but I'm leaving that for future work when I have a real world example.
Differential Revision: https://reviews.llvm.org/D39911
llvm-svn: 317947
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The following kernel change has moved ET_DYN base to 0x4000000 on arm32:
https://marc.info/?l=linux-kernel&m=149825162606848&w=2
Switch to dynamic shadow base to avoid such conflicts in the future.
Reserve shadow memory in an ifunc resolver, but don't use it in the instrumentation
until PR35221 is fixed. This will eventually let us save one load per function.
Reviewers: kcc
Subscribers: aemerson, srhines, kubamracek, kristof.beyls, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D39393
llvm-svn: 317943
|
| |
|
|
|
|
|
|
|
|
|
| |
Also change some default cases into llvm_unreachable in
WindowsResourceCOFFWriter, to make it easier to find if they
are triggered from within e.g. lld, which supported ARM64 earlier
than llvm-cvtres did.
Differential Revision: https://reviews.llvm.org/D39892
llvm-svn: 317942
|
| |
|
|
|
|
| |
folding, this triggers needless extra rounds of combine for nothing. NFC
llvm-svn: 317926
|
| |
|
|
| |
llvm-svn: 317923
|
| |
|
|
|
|
|
| |
The Windows builder did not reconstruct the HexagonGenDAGISel.inc file
after the TableGen binary has changed.
llvm-svn: 317921
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D39880
llvm-svn: 317920
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(NFC)
Summary:
The specification of the @llvm.memcpy.element.unordered.atomic intrinsic requires
that the pointer arguments have alignments of at least the element size. The existing
IRBuilder interface to create a call to this intrinsic does not allow for providing
the alignment of these pointer args. Having an interface that makes it easy to
construct invalid intrinsic calls doesn't seem sensible, so this patch simply
adds the requirement that one provide the argument alignments when using IRBuilder
to create atomic memcpy calls.
llvm-svn: 317918
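
The requirement quoted above amounts to a simple precondition. The sketch below is a standalone check with a hypothetical name and signature; it is not the IRBuilder interface, only an illustration of the rule that both pointer alignments must be at least the element size.

```python
def check_atomic_memcpy_alignment(dst_align, src_align, element_size):
    """Raise ValueError if either pointer alignment is below the element size."""
    if dst_align < element_size:
        raise ValueError("destination alignment is smaller than the element size")
    if src_align < element_size:
        raise ValueError("source alignment is smaller than the element size")
```

Making the alignments mandatory arguments means a caller cannot forget them and build an invalid call by accident.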
|
| |
|
|
|
|
| |
This reverts r317904: broke Windows build.
llvm-svn: 317916
|
| |
|
|
|
|
|
|
| |
selectVectorAddr. NFCI
We just need to initialize a couple of variables differently based on the node type. No need for a whole separate template method.
llvm-svn: 317915
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This adds logic to CVP to remove some overflow checks. It uses LVI to remove
operations with at least one constant. Specifically, this can remove many
overflow intrinsics immediately following an overflow check in the source code,
such as:
    if (x < INT_MAX)
      ... x + 1 ...
Patch by Joel Galenson!
Reviewers: sanjoy, regehr
Reviewed By: sanjoy
Subscribers: fhahn, pirama, srhines, llvm-commits
Differential Revision: https://reviews.llvm.org/D39483
llvm-svn: 317911
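
The reasoning behind the example can be sketched with value ranges. This is a toy model, not the CVP or LVI code: it assumes a 32-bit signed `int`, represents the known range of `x` as an inclusive `(lo, hi)` pair, and uses a hypothetical helper name.

```python
INT_MAX = 2**31 - 1   # assumption: 32-bit signed int
INT_MIN = -(2**31)


def add_const_may_overflow(x_range, c):
    """Return True if x + c can leave [INT_MIN, INT_MAX] for some x in x_range."""
    lo, hi = x_range
    return lo + c < INT_MIN or hi + c > INT_MAX
```

Inside the `if (x < INT_MAX)` branch the range of `x` is `(INT_MIN, INT_MAX - 1)`, so `add_const_may_overflow((INT_MIN, INT_MAX - 1), 1)` is `False` and the overflow check on `x + 1` is redundant. Without the guard the range is `(INT_MIN, INT_MAX)` and the check has to stay.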
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Also minor cleanups:
1. Avoided multiple calls to Fixup.getKind()
2. Avoided multiple calls to getFixupKindInfo()
3. Removed a redundant return.
Reviewers: asb, apazos
Reviewed By: asb
Subscribers: rbar, johnrusso, llvm-commits
Differential Revision: https://reviews.llvm.org/D39881
llvm-svn: 317908
|
| |
|
|
| |
llvm-svn: 317904
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
I want to leverage this to clean up some of the code in clang. This will allow us to simplify D39521 which was trying to do some of the same.
If we accurately keep the code in Host.cpp synced with new CPUs added to compiler-rt/libgcc we should be able to use this file as a proxy for what's implemented in the libraries.
The entries for the CPUs recognized by the libraries use separate macros that define additional parameters like the name for __builtin_cpu_is and an alias string for the couple of cases where __builtin_cpu_is accepts two different names.
All of the macros contain an ARCHNAME that is usually the same as the __builtin_cpu_is string, but sometimes isn't. This represents the name recognized by X86.td and -march.
I'm following the precedent set by ARM and AArch64 and adding this information to lib/Support/TargetParser.cpp
Reviewers: erichkeane, echristo, asbirlea
Reviewed By: echristo
Subscribers: llvm-commits, aemerson, kristof.beyls
Differential Revision: https://reviews.llvm.org/D39782
llvm-svn: 317900
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
LTO/Caching.cpp uses file rename to atomically set the value for a
cache key. On Windows, this fails when the destination file already
exists. Previously, LLVM would report_fatal_error in such
cases. However, because the old and the new value for the cache key
are supposed to be equivalent, it actually doesn't matter which one we
keep. With this change, a failed rename is reported as success when an
openable file with the desired name already exists, instead of raising a
fatal error.
Reviewers: pcc, hans
Subscribers: mehdi_amini, llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D39874
llvm-svn: 317899
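
The policy described above can be sketched outside of LLVM. The helper names below are hypothetical and this is not the Caching.cpp code; the `rename` parameter exists only so the failure path can be exercised without relying on Windows behaviour.

```python
import os


def _is_openable(path):
    """Return True if `path` can be opened for reading."""
    try:
        with open(path, "rb"):
            return True
    except OSError:
        return False


def commit_cache_entry(tmp_path, final_path, rename=os.rename):
    """Publish a cache entry by rename; tolerate losing the race.

    If the rename fails but an openable file already exists under
    `final_path`, the existing entry is assumed to be equivalent and the
    commit is reported as successful. Otherwise the error is re-raised.
    """
    try:
        rename(tmp_path, final_path)
    except OSError:
        if not _is_openable(final_path):
            raise
    return True
```

On POSIX systems `os.rename` replaces an existing destination, so the fallback mainly matters where renaming onto an existing file fails, as the message describes for Windows.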
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Fixes PR35220
Reviewers: vadimcn, alexcrichton
Reviewed By: alexcrichton
Subscribers: pepyakin, alexcrichton, jfb, dschuff, sbc100, jgravelle-google, llvm-commits, aheejin
Differential Revision: https://reviews.llvm.org/D39866
llvm-svn: 317895
|
| |
|
|
|
|
|
|
| |
dead one
Differential revision: https://reviews.llvm.org/D38754
llvm-svn: 317884
|
| |
|
|
|
|
|
|
|
| |
This change adds generic fuzzing tools capable of running libFuzzer tests on
any optimization pass or combination of them.
Differential Revision: https://reviews.llvm.org/D39555
llvm-svn: 317883
|