| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
| |
llvm-svn: 318055
|
| |
|
|
|
|
|
|
| |
These are set with other scalar int ops few lines up
Differential Revision: https://reviews.llvm.org/D39928
llvm-svn: 318051
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
If a compare instruction is same or inverse of the compare in the
branch of the loop latch, then return a constant evolution node.
This shall facilitate computations of loop exit counts in cases
where compare appears in the evolution chain of induction variables.
Will fix PR 34538
Reviewers: sanjoy, hfinkel, junryoungju
Reviewed By: sanjoy, junryoungju
Subscribers: javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D38494
llvm-svn: 318050
|
| |
|
|
|
|
| |
This reverts commit r318032. The test broke some sanitizer bots.
llvm-svn: 318049
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
In more recent Linux kernels (including those with 47 bit VMAs) the layout of
virtual memory for powerpc64 changed causing the memory sanitizer to not
work properly. This patch adjusts a bit mask in the memory sanitizer to work
on the newer kernels while continuing to work on the older ones as well.
This is the non-runtime part of the patch and finishes it. ref: r317802
Tested on several 4.x and 3.x kernel releases.
llvm-svn: 318045
|
| |
|
|
|
|
|
|
|
| |
Remove builtins from llvm and add AutoUpgrade support.
Also add fast-isel tests for the TEST and TESTN instructions.
Differential Revision: https://reviews.llvm.org/D38736
llvm-svn: 318036
|
| |
|
|
|
|
|
|
|
|
|
|
| |
When generating table jump code for switch statements, place the jump
table label as the first operand in the various addition instructions
in order to enable addressing mode selectors to better match index
computation and possibly fold them into the addressing mode of the
table entry load instruction.
Differential revision: https://reviews.llvm.org/D39752
llvm-svn: 318033
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
CodeGenPrepare sinks address computations from one basic block to another
and attempts to reuse address computations that have already been sunk. If
the same address computation appears twice with the first instance as an
operand of a load whose result is an operand to a simplifable select,
CodeGenPrepare simplifies the select and recursively erases the now dead
instructions. CodeGenPrepare then attempts to use the erased address
computation for the second load.
Fix this by erasing the cached address value if it has zero uses before
looking for the address value in the sunken address map.
This partially resolves PR35209.
Thanks to Alexander Richardson for reporting the issue!
Reviewers: john.brawn
Differential Revision: https://reviews.llvm.org/D39841
llvm-svn: 318032
|
| |
|
|
| |
llvm-svn: 318029
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch extends the partial inliner to support inlining parts of
vararg functions, if the vararg handling is done in the outlined part.
It adds a `ForwardVarArgsTo` argument to InlineFunction. If it is
non-null, all varargs passed to the inlined function will be added to
all calls to `ForwardVarArgsTo`.
The partial inliner takes care to only pass `ForwardVarArgsTo` if the
varargs handing is done in the outlined function. It checks that vastart
is not part of the function to be inlined.
`test/Transforms/CodeExtractor/PartialInlineNoInline.ll` (already part
of the repo) checks we do not do partial inlining if vastart is used in
a basic block that will be inlined.
Reviewers: davide, davidxl, grosser
Reviewed By: davide, davidxl, grosser
Subscribers: gyiu, grosser, eraman, llvm-commits
Differential Revision: https://reviews.llvm.org/D39607
llvm-svn: 318028
|
| |
|
|
| |
llvm-svn: 318027
|
| |
|
|
|
|
|
|
|
| |
This patch, together with a matching clang patch (https://reviews.llvm.org/D38672), implements the lowering of X86 shuffle i/f intrinsics to IR.
Differential Revision: https://reviews.llvm.org/D38671
Change-Id: I1e7d359a74743e995ec356237a85214ce55d3661
llvm-svn: 318026
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Updated the scheduling information of the SKX subtarget in the file X86SchedSkylakeServer.td under lib/Target/X86 to:
1. add regular opcodes in addition to the suffixed "_Int" opcodes
2. add the (V)MAXCPD/MAXCPS/MAXCSD/MAXCSS/MINCPD/MINCPS/MINCSD/MINCSS
instructions that are equivalent to their counterparts without the 'C' as they are part of a hack to
make floating point min/max commutable under fast math.
Reviewers: zvi, RKSimon, craig.topper
Differential Revision: https://reviews.llvm.org/D39833
Change-Id: Ie13702a5ce1b1a08af91ca637a52b6962881e7d6
llvm-svn: 318024
|
| |
|
|
|
|
| |
We support 2 spelling for silvermont and we should accept both here.
llvm-svn: 318023
|
| |
|
|
|
|
| |
vrcp14ss/sd, rsqrt14ss/sd instructions.
llvm-svn: 318022
|
| |
|
|
| |
llvm-svn: 318020
|
| |
|
|
|
|
| |
intrinsics.
llvm-svn: 318019
|
| |
|
|
| |
llvm-svn: 318017
|
| |
|
|
|
|
| |
sse_load_f32/sse_load_f64 to increase load folding opportunities.
llvm-svn: 318016
|
| |
|
|
|
|
|
|
|
|
|
| |
This was using a custom function that didn't handle the
addressing modes properly for private. Use
isLegalAddressingMode to avoid duplicating this.
Additionally, skip the combine if there is only one use
since the standard combine will handle it.
llvm-svn: 318013
|
| |
|
|
| |
llvm-svn: 318010
|
| |
|
|
| |
llvm-svn: 318009
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
intrinsics.
The VRNDSCALE instructions implement a superset of the (V)ROUND instructions. They are equivalent if the upper 4-bits of the immediate are 0.
This patch lowers the legacy intrinsics to the VRNDSCALE ISD node and masks the upper bits of the immediate to 0. This allows us to take advantage of the larger register encoding space.
We should maybe consider converting VRNDSCALE back to VROUND in the EVEX to VEX pass if the extended registers are not being used.
I notice some load folding opportunities being missed for the VRNDSCALESS/SD instructions that I'll try to fix in future patches.
llvm-svn: 318008
|
| |
|
|
|
|
|
|
| |
and without the rounding operand. NFCI
I want to reuse the VRNDSCALE node for the legacy SSE rounding intrinsics so that those intrinsics can use EVEX instructions. All of these nodes share tablegen multiclasses so I split them all so that they all remain similar in their implementations.
llvm-svn: 318007
|
| |
|
|
| |
llvm-svn: 318005
|
| |
|
|
|
|
| |
This fixes a bug where we selected packed instructions for scalar intrinsics.
llvm-svn: 317999
|
| |
|
|
| |
llvm-svn: 317997
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: davidxl, olista01, Eugene.Zelenko
Reviewed By: Eugene.Zelenko
Subscribers: sdardis, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D39917
llvm-svn: 317995
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch adds an early out to visitICmpInst if we are looking at a compare as part of an integer absolute value idiom. Similar is already done for min/max.
In the particular case I observed in a benchmark we had an absolute value of a load from an indexed global. We simplified the compare using foldCmpLoadFromIndexedGlobal into a magic bit vector, a shift, and an and. But the load result was still used for the select and the negate part of the absolute valute idiom. So we overcomplicated the code and lost the ability to recognize it as an absolute value.
I've chosen a simpler case for the test here.
Reviewers: spatel, davide, majnemer
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D39766
llvm-svn: 317994
|
| |
|
|
|
|
|
|
| |
when avx512vl is enabled.
This matches what we do for scalar and 512-bit types.
llvm-svn: 317991
|
| |
|
|
|
|
|
|
|
|
| |
matchBinOpReduction currently matches against a single opcode, but we already have a case where we repeat calls to try to match against AND/OR and I'll be shortly adding another case for SMAX/SMIN/UMAX/UMIN (D39729).
This NFCI patch alters matchBinOpReduction to try and pattern match against any of the provided list of candidate bin ops at once to save time.
Differential Revision: https://reviews.llvm.org/D39726
llvm-svn: 317985
|
| |
|
|
|
|
|
|
|
|
| |
rename the existing versions to _Int.
This is consistent with out normal implementation of scalar instructions.
While there disable load folding for the patterns with IMPLICIT_DEF unless optimizing for size which is also our standard practice.
llvm-svn: 317977
|
| |
|
|
| |
llvm-svn: 317975
|
| |
|
|
| |
llvm-svn: 317974
|
| |
|
|
| |
llvm-svn: 317973
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Allow a pattern rewriter to be installed in CodeGenDAGPatterns and use it to
correct situations where SelectionDAG and GlobalISel disagree on
representation. For example, it would rewrite:
(sextload:i32 $ptr)<<unindexedload>><<sextload>><<sextloadi16>
to:
(sext:i32 (load:i16 $ptr)<<unindexedload>>)
I'd have preferred to replace the fragments and have the expansion happen
naturally as part of PatFrag expansion but the type inferencing system can't
cope with loads of types narrower than those mentioned in register classes.
This is because the SDTCisInt's on the sext constrain both the result and
operand to the 'legal' integer types (where legal is defined as 'a register
class can contain the type') which immediately rules the narrower types out.
Several targets (those with only one legal integer type) would then go on to
crash on the SDTCisOpSmallerThanOp<> when it removes all the possible types
for the result of the extend.
Also, improve isObviouslySafeToFold() slightly to automatically return true for
neighbouring instructions. There can't be any re-ordering problems if
re-ordering isn't happenning. We'll need to improve it further to handle
sign/zero-extending loads when the extend and load aren't immediate neighbours
though.
llvm-svn: 317971
|
| |
|
|
| |
llvm-svn: 317968
|
| |
|
|
|
|
|
|
| |
multiclasses. NFC
No one ever uses this default and probably shouldn't since it sets the execution domain to generic.
llvm-svn: 317967
|
| |
|
|
|
|
|
|
| |
middle GEP indices are non-constant.
This is a fix for a bug in r317947. We were supposed to check that all the indices are are constant 0, but instead we're only make sure that indices that are constant are 0. Non-constant indices are being ignored.
llvm-svn: 317950
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
handling to accept GEPs with more than 2 operands if the middle operands are all 0s
Currently we can only get a uniform base from a simple GEP with 2 operands. This causes us to miss address folding opportunities for simple global array accesses as the test case shows.
This patch adds support for larger GEPs if the other indices are 0 since those don't require any additional computations to be inserted.
We may also want to handle constant splats of zero here, but I'm leaving that for future work when I have a real world example.
Differential Revision: https://reviews.llvm.org/D39911
llvm-svn: 317947
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The following kernel change has moved ET_DYN base to 0x4000000 on arm32:
https://marc.info/?l=linux-kernel&m=149825162606848&w=2
Switch to dynamic shadow base to avoid such conflicts in the future.
Reserve shadow memory in an ifunc resolver, but don't use it in the instrumentation
until PR35221 is fixed. This will eventually let use save one load per function.
Reviewers: kcc
Subscribers: aemerson, srhines, kubamracek, kristof.beyls, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D39393
llvm-svn: 317943
|
| |
|
|
|
|
|
|
|
|
|
| |
Also change some default cases into llvm_unreachable in
WindowsResourceCOFFWriter, to make it easier to find if they
are triggerd from within e.g. lld, which supported ARM64 earlier
than llvm-cvtres did.
Differential Revision: https://reviews.llvm.org/D39892
llvm-svn: 317942
|
| |
|
|
|
|
| |
folding, this trigger needless extra rounds of combine for nothing. NFC
llvm-svn: 317926
|
| |
|
|
| |
llvm-svn: 317923
|
| |
|
|
|
|
|
| |
The Windows builder did not reconstruct the HexagonGenDAGISel.inc file
after the TableGen binary has changed.
llvm-svn: 317921
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D39880
llvm-svn: 317920
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(NFC)
Summary:
The specification of the @llvm.memcpy.element.unordered.atomic intrinsic requires
that the pointer arguments have alignments of at least the element size. The existing
IRBuilder interface to create a call to this intrinsic does not allow for providing
the alignment of these pointer args. Having an interface that makes it easy to
construct invalid intrinsic calls doesn't seem sensible, so this patch simply
adds the requirement that one provide the argument alignments when using IRBuilder
to create atomic memcpy calls.
llvm-svn: 317918
|
| |
|
|
|
|
| |
This reverts r317904: broke Windows build.
llvm-svn: 317916
|
| |
|
|
|
|
|
|
| |
selectVectorAddr. NFCI
Just need to initialize a couple variables differently based on the node type. No need for a whole separate template method.
llvm-svn: 317915
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This adds logic to CVP to remove some overflow checks. It uses LVI to remove
operations with at least one constant. Specifically, this can remove many
overflow intrinsics immediately following an overflow check in the source code,
such as:
if (x < INT_MAX)
... x + 1 ...
Patch by Joel Galenson!
Reviewers: sanjoy, regehr
Reviewed By: sanjoy
Subscribers: fhahn, pirama, srhines, llvm-commits
Differential Revision: https://reviews.llvm.org/D39483
llvm-svn: 317911
|