| Commit message | Author | Age | Files | Lines |
|
As noted in rL333782, we can be both better for optimization and
safer with this transform:
BinOp (shuffle V1, Mask), C --> shuffle (BinOp V1, NewC), Mask
The only potentially unsafe-to-speculate binops are integer div/rem.
All other binops are always safe (although I don't see a way to
assert that in code here).
For opcodes like shifts that can produce poison, it can't matter
here because we know the lanes with undef are dropped by the
subsequent shuffle.
Differential Revision: https://reviews.llvm.org/D47686
llvm-svn: 333962
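
The transform above can be modelled on plain lists. This is only a sketch, not the InstCombine code: `make_new_c`, `before`, `after` and the `filler` parameter are hypothetical names, vectors are Python lists, and the constant is assumed to be fully defined. It shows why lanes that the mask never reads can hold any safe value: the trailing shuffle drops them.

```python
import operator

UNDEF = None  # stand-in for an undef mask element or vector lane


def shuffle(vec, mask):
    """Pick lanes of vec by index; an UNDEF mask element yields an UNDEF lane."""
    return [UNDEF if m is UNDEF else vec[m] for m in mask]


def binop(op, lhs, rhs):
    """Apply op lane by lane, propagating UNDEF lanes."""
    return [UNDEF if a is UNDEF or b is UNDEF else op(a, b)
            for a, b in zip(lhs, rhs)]


def make_new_c(c, mask, width, filler):
    """Move each lane of C to the source lane that the mask pairs it with.

    Returns None when one source lane would need two different constants.
    Lanes the mask never reads get `filler` (an assumed safe value, e.g. 1
    for div/rem), since the final shuffle drops them anyway.
    """
    new_c = [UNDEF] * width
    for dst, src in enumerate(mask):
        if src is UNDEF:
            continue
        if new_c[src] is not UNDEF and new_c[src] != c[dst]:
            return None
        new_c[src] = c[dst]
    return [filler if lane is UNDEF else lane for lane in new_c]


def before(op, v1, mask, c):
    """BinOp (shuffle V1, Mask), C"""
    return binop(op, shuffle(v1, mask), c)


def after(op, v1, mask, c, filler):
    """shuffle (BinOp V1, NewC), Mask, or None if no NewC exists."""
    new_c = make_new_c(c, mask, len(v1), filler)
    return None if new_c is None else shuffle(binop(op, v1, new_c), mask)
```

With `operator.floordiv`, a mask of `[0, 0, 2, 2]` and a filler of 1, both forms agree on every lane, and the unread lanes 1 and 3 never divide by an undefined value.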
|
Review feedback from r328165. Split out just the one function from the
file that's used by Analysis. (As chandlerc pointed out, the original
change only moved the header and not the implementation anyway - which
was fine for the one function that was used (since it's a
template/inlined in the header) but not in general)
llvm-svn: 333954
|
This is setting up to fix bug 37573 cleanly.
This moves data structures that are technically both used in some way by the
target and the general-purpose outlining algorithm into MachineOutliner.h. In
particular, the `Candidate` class is of importance.
Before, the outliner passed the locations of `Candidates` to the target, which
would then make some decisions about the prospective outlined function. This
change allows us to just pass `Candidates` along to the target. This will allow
the target to discard `Candidates` that would be considered unsafe before cost
calculation. Thus, we will be able to remove the unsafe candidates described in
the bug without resorting to torching the entire prospective function.
Also, as a side-effect, it makes the outliner a bit cleaner.
https://bugs.llvm.org/show_bug.cgi?id=37573
llvm-svn: 333952
|
In preparation for the proposed linker ABI changes
(https://github.com/hjl-tools/linux-abi/wiki/linux-abi-draft.pdf,
https://github.com/hjl-tools/x86-psABI/wiki/x86-64-psABI-cet.pdf),
this patch enables emission of the .note.gnu.property section to
ELF object files when building CET-enabled modules.
patch by mike.dvoretsky
Differential Revision: https://reviews.llvm.org/D47145
llvm-svn: 333951
|
Some overloads failed to update divergence.
Differential Revision: https://reviews.llvm.org/D47148
llvm-svn: 333947
|
Windows' CRT has a limit of 512 open file descriptors, and fds which are
generated by converting a HANDLE via _open_osfhandle count towards this
limit as well.
Regardless, you often find yourself marshalling back and forth between
native HANDLE objects and fds anyway. If we know from the get-go that
we're going to need to work directly with the handle, we can cut out the
marshalling layer while also not contributing to filling up the CRT's
very limited handle table.
On Unix these functions just delegate directly to the existing set of
functions since an fd *is* the native file type. It would be nice, very
long term, if we could convert most uses of fds to file_t.
Differential Revision: https://reviews.llvm.org/D47688
llvm-svn: 333945
|
Summary: This includes variants for add, uaddo and addcarry. usubo and subcarry require the carry to be flipped to preserve semantics, but we chose to do the transform anyway in that case so as to push the transform down the carry chain.
Reviewers: efriedma, spatel, RKSimon, zvi, bkramer
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D46505
llvm-svn: 333943
|
Summary: It has been deprecated in favor of SETCCCARRY for a year now and isn't used by any in tree backend.
Reviewers: efriedma, craig.topper, dblaikie, bkramer
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D47685
llvm-svn: 333939
|
entries to reach the target. Since these calls don't require type checks,
we can short-circuit them to their real targets, except in cases when they
can be pre-empted.
Differential Revision: https://reviews.llvm.org/D46326
llvm-svn: 333937
|
bool output parameter to get the real piece of info we care about. NFC
The ParitySrc array is more of an implementation detail. A single bool to get the final parity is sufficient.
llvm-svn: 333935
|
After the latest changes, some code can be simplified.
Differential Revision: https://reviews.llvm.org/D47661
llvm-svn: 333934
|
Differential Revision: https://reviews.llvm.org/D47664
llvm-svn: 333931
|
This is a followup to rL333496.
Differential Revision: https://reviews.llvm.org/D47544
llvm-svn: 333929
|
Differential Revision: https://reviews.llvm.org/D47727
llvm-svn: 333928
|
When checking a select to see if it matches an abs, allow the true/false values
to be a sign-extension of the comparison value instead of requiring that they're
directly the comparison value, as all the comparison cares about is the sign of
the value.
This fixes a regression due to r333702, where we were no longer generating ctlz
due to isKnownNonNegative failing to match such a pattern.
Differential Revision: https://reviews.llvm.org/D47631
llvm-svn: 333927
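
To see why matching through a sign-extension is sound, here is a small bit-level sketch. It is not the ValueTracking code; `to_signed`, `sext` and `select_abs_of_sext` are hypothetical helpers, and the 8-bit and 32-bit widths are arbitrary choices. The comparison is done on the narrow value while the select arms use the sign-extended value, and the result is still the absolute value because sign-extension preserves the sign.

```python
def to_signed(bits, width):
    """Interpret the low `width` bits as a two's-complement integer."""
    bits &= (1 << width) - 1
    return bits - (1 << width) if bits >> (width - 1) else bits


def sext(bits, from_width, to_width):
    """Sign-extend a `from_width`-bit pattern to a `to_width`-bit pattern."""
    return to_signed(bits, from_width) & ((1 << to_width) - 1)


def select_abs_of_sext(x_bits, narrow=8, wide=32):
    """Model of: select (x < 0), -sext(x), sext(x), comparing the narrow x."""
    widened = sext(x_bits, narrow, wide)
    is_negative = to_signed(x_bits, narrow) < 0
    negated = (-widened) & ((1 << wide) - 1)
    return negated if is_negative else widened
```

Checking all 256 narrow inputs against Python's `abs` is cheap and confirms that the select always yields a non-negative wide value.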
|
On GFX9 and earlier, flat memory ops may decrement VMCNT out-of-order as well as LGKMCNT out-of-order.
Differential Revision: https://reviews.llvm.org/D46616
llvm-svn: 333926
|
resolveTargetShuffleInputs/getFauxShuffleMask
These methods should only be using SelectionDAG for analysis (known/sign bits etc), not node creation.
llvm-svn: 333925
|
llvm-svn: 333924
|
This patch is the last of a sequence of three patches related to LLVM-dev RFC
"MC support for variant scheduling classes".
http://lists.llvm.org/pipermail/llvm-dev/2018-May/123181.html
This fixes PR36672.
The main goal of this patch is to teach llvm-mca how to resolve variant scheduling
classes. This patch does that, plus it adds new variant scheduling classes to
the BtVer2 scheduling model to identify so-called zero-idioms (i.e.
dependency-breaking instructions that are known to generate zero, and that are
optimized out in hardware at the register renaming stage).
Without the BtVer2 change, this patch would not have had any meaningful tests.
This patch is effectively the union of two changes:
1) a change that teaches llvm-mca how to resolve variant scheduling classes.
2) a change to the BtVer2 scheduling model that allows us to special-case
packed XOR zero-idioms (this partially fixes PR36671).
Differential Revision: https://reviews.llvm.org/D47374
llvm-svn: 333909
|
llvm-svn: 333907
|
Summary:
Avoid name clashes with the corresponding bit fields in the instruction
encoding.
Change-Id: Id1644e703e976e78f7af93788d9f44cb48c3251f
Reviewers: arsenm, rampitec, kzhuravl
Subscribers: wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D47433
llvm-svn: 333905
|
Summary:
The new rules are straightforward. The main rules to keep in mind
are:
1. NAME is an implicit template argument of class and multiclass,
and will be substituted by the name of the instantiating def/defm.
2. The name of a def/defm in a multiclass must contain a reference
to NAME. If such a reference is not present, it is automatically
prepended.
And for some additional subtleties, consider these:
3. defm with no name generates a unique name but has no special
behavior otherwise.
4. def with no name generates an anonymous record, whose name is
unique but undefined. In particular, the name won't contain a
reference to NAME.
Keeping rules 1&2 in mind should allow a predictable behavior of
name resolution that is simple to follow.
The old "rules" were rather surprising: sometimes (but not always),
NAME would correspond to the name of the toplevel defm. They were
also plain bonkers when you pushed them to their limits, as the old
version of the TableGen test case shows.
Having NAME correspond to the name of the toplevel defm introduces
"spooky action at a distance" and breaks composability:
refactoring the upper layers of a hierarchy of nested multiclass
instantiations can cause unexpected breakage by changing the value
of NAME at a lower level of the hierarchy. The new rules don't
suffer from this problem.
Some existing .td files have to be adjusted because they ended up
depending on the details of the old implementation.
Change-Id: I694095231565b30f563e6fd0417b41ee01a12589
Reviewers: tra, simon_tatham, craig.topper, MartinO, arsenm, javed.absar
Subscribers: wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D47430
llvm-svn: 333900
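
Rules 1 and 2 can be illustrated with a toy model. This is not TableGen syntax or the TableGen implementation: the `{NAME}` placeholder, `resolve_name`, `instantiate` and the tuple encoding of multiclass bodies are all assumptions made for the sketch. It shows that NAME at each level is the name of the instantiating defm at that level, not the toplevel one.

```python
PLACEHOLDER = "{NAME}"  # hypothetical spelling of a reference to NAME


def resolve_name(def_name, instantiating_name):
    """Apply rule 2 (prepend NAME if it is not referenced), then rule 1."""
    if PLACEHOLDER not in def_name:
        def_name = PLACEHOLDER + def_name
    return def_name.replace(PLACEHOLDER, instantiating_name)


def instantiate(multiclass, instantiating_name, multiclasses):
    """Expand a multiclass into the list of record names it defines.

    Each body entry is ("def", name) or ("defm", name, inner_multiclass).
    A nested defm passes its own resolved name down as the inner NAME.
    """
    records = []
    for entry in multiclasses[multiclass]:
        name = resolve_name(entry[1], instantiating_name)
        if entry[0] == "def":
            records.append(name)
        else:
            records.extend(instantiate(entry[2], name, multiclasses))
    return records
```

For example, with an `Outer` multiclass containing a defm `_v` of `Inner`, and `Inner` containing defs `_rr` and `{NAME}_ri`, instantiating `Outer` as `ADD` produces `ADD_v_rr` and `ADD_v_ri`. Renaming the toplevel defm changes only the prefix, which is the composability property the message describes.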
|
Reviewers: smaksimovic, atanasyan, abeserminji
Differential Revision: https://reviews.llvm.org/D47584
llvm-svn: 333895
|
llvm-svn: 333879
|
For immediates used in DUP instructions that have the range
-128 to 127, or a multiple of 256 in the range -32768 to 32512,
one could argue that when the result element size is 16 bits (.h),
the value can be considered both signed and unsigned.
Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D47619
llvm-svn: 333873
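
The immediate ranges quoted above can be written as a predicate. This is a sketch of the stated ranges only, not the assembler's operand check; `is_dup_immediate` and `as_unsigned16` are hypothetical names. The second helper shows the signed/unsigned ambiguity for 16-bit elements: a negative value and its unsigned counterpart share one bit pattern.

```python
def is_dup_immediate(value):
    """True if value is in -128..127, or a multiple of 256 in -32768..32512."""
    if -128 <= value <= 127:
        return True
    return value % 256 == 0 and -32768 <= value <= 32512


def as_unsigned16(value):
    """Return the 16-bit pattern of value, read as an unsigned number."""
    return value & 0xFFFF
```

For instance, -256 is in range as a signed multiple of 256, and its 16-bit pattern 0xFF00 reads as 65280 when viewed as unsigned.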
|
Print the first indexed element as a FP register, for example:
mov z0.d, z1.d[0]
Is now printed as:
mov z0.d, d1
In addition to printing, this patch also adds aliases to parse 'mov z0.d, d1'.
Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D47571
llvm-svn: 333872
|
Unpredicated copy of indexed SVE element to SVE vector,
along with MOV-aliases.
For example:
dup z0.h, z1.h[0]
duplicates the first 16-bit element from z1 to all elements in
the result vector z0.
Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D47570
llvm-svn: 333871
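
The effect of the indexed form can be modelled on a list. This is an illustrative sketch only; `dup_indexed` is a hypothetical name and the vector is a plain Python list of element values, whereas a real SVE vector has an implementation-defined length.

```python
def dup_indexed(src, index):
    """Broadcast src[index] to every element of a result of the same length."""
    return [src[index]] * len(src)
```

With this model, `dup z0.h, z1.h[0]` corresponds to `z0 = dup_indexed(z1, 0)`.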
|
Predicated copy of floating-point immediate value to SVE vector,
along with MOV-aliases.
Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
Reviewed By: javed.absar
Differential Revision: https://reviews.llvm.org/D47518
llvm-svn: 333869
|
Predicated copy of possibly shifted immediate value into SVE
vector, along with MOV-aliases.
Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D47517
llvm-svn: 333868
|
When we optimize a select based on the fact that div by 0 is undef,
we should not traverse instructions which are not guaranteed to
transfer execution to the next instruction. The guard intrinsic is an example.
Reviewers: spatel, craig.topper
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D47576
llvm-svn: 333864
|
Applying synthetic debug info before the bitcode writer pass has no
testing-related purpose. This commit prevents that from happening.
It also adds tests which check that IR produced with/without
-debugify-each enabled is identical after stripping. This makes it
possible to check that individual passes (or full pipelines) are
invariant to debug info.
llvm-svn: 333861
|
intrinsics and select instructions.
llvm-svn: 333857
|
pre-existing SymbolFlags and SymbolToDefinition maps.
This constructor is useful when delegating work from an existing
IRMaterializationUnit to a new one, as it avoids the cost of re-computing these
maps.
llvm-svn: 333852
|
There's a patchwork of existing transforms trying to handle
these cases, but as seen in the changed test, we weren't
catching them all.
llvm-svn: 333845
|
Summary: This creates a small perf regression, but after talking with Jacques Pienaar, he was fine with it in order to get things moving toward removing SETCCE.
Reviewers: jpienaar, bryant
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D47626
llvm-svn: 333838
|
We currently support them only in AArch64. The NEON Reference,
however, says they are 'ARMv7, ARMv8' intrinsics.
Differential Revision: https://reviews.llvm.org/D47120
llvm-svn: 333825
|
The LLVM part was committed instead of the Clang part.
Differential Revision: https://reviews.llvm.org/D47121
llvm-svn: 333824
|
Object File Representation
At codegen time this is emitted into the ELF file as a pair of symbol indices and a weight. In assembly it looks like:
.cg_profile a, b, 32
.cg_profile freq, a, 11
.cg_profile freq, b, 20
When writing an ELF file these are put into a SHT_LLVM_CALL_GRAPH_PROFILE (0x6fff4c02) section as (uint32_t, uint32_t, uint64_t) tuples of (from symbol index, to symbol index, weight).
Differential Revision: https://reviews.llvm.org/D44965
llvm-svn: 333823
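
The tuple layout described above can be sketched with Python's `struct` module. This is an illustration of the (uint32_t, uint32_t, uint64_t) shape only, not the ELF writer: little-endian byte order is an assumption, `pack_entries` and `unpack_entries` are hypothetical names, and the symbol indices in any example are made up.

```python
import struct

# One entry: from symbol index (u32), to symbol index (u32), weight (u64).
# "<" selects little-endian with no padding; the byte order is an assumption.
ENTRY = struct.Struct("<IIQ")


def pack_entries(entries):
    """Serialize (from_index, to_index, weight) tuples back to back."""
    return b"".join(ENTRY.pack(src, dst, weight) for src, dst, weight in entries)


def unpack_entries(data):
    """Parse section contents back into (from_index, to_index, weight) tuples."""
    return [ENTRY.unpack_from(data, offset)
            for offset in range(0, len(data), ENTRY.size)]
```

Each entry occupies 16 bytes in this layout, so a section holding the three directives shown above would be 48 bytes long.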
|
This doesn't affect the assembly or disassembly, but is more accurate.
llvm-svn: 333822
|
the assembly parser.
The caret was positioned on the wrong operand. It's too hard to get right so just put the caret at the beginning of the instruction.
llvm-svn: 333821
|
As noted in the review thread for rL333782, we could have
made a bug harder to hit if we were simplifying instructions
before trying other folds.
The shuffle transform in question isn't ever a simplification;
it's just a canonicalization. So I've renamed that to make that
clearer.
This is NFCI at this point, but I've regenerated the test file
to show the cosmetic value naming difference of using
instcombine's RAUW vs. the builder.
Possible follow-ups:
1. Move reassociation folds after simplifies too.
2. Refactor common code; we shouldn't have so much repetition.
llvm-svn: 333820
|
We currently support them only in AArch64. The NEON Reference,
however, says they are 'ARMv7, ARMv8' intrinsics.
Differential Revision: https://reviews.llvm.org/D47121
llvm-svn: 333819
|
by anchor()
llvm-svn: 333817
|
instructions so they can be assembled and disassembled.
These instructions are unusual in that they operate on 4 consecutive registers so supporting them in codegen will be more difficult than normal.
Includes an assembler check to warn if the source register is not the first register of a 4 register group.
llvm-svn: 333812
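
The register-group check can be sketched as simple arithmetic on the register number. This is not the assembler's code; `first_of_group`, `check_source_register` and the warning text are hypothetical, and the sketch assumes groups are aligned to multiples of four.

```python
GROUP_SIZE = 4  # the instructions operate on 4 consecutive registers


def first_of_group(reg_number):
    """Return the number of the first register in reg_number's group."""
    return reg_number - reg_number % GROUP_SIZE


def check_source_register(reg_number):
    """Return a warning string if reg_number does not start a group, else None."""
    first = first_of_group(reg_number)
    if reg_number == first:
        return None
    return ("source register %d is not the first register of its group "
            "(%d-%d)" % (reg_number, first, first + GROUP_SIZE - 1))
```

Returning a message instead of raising mirrors the commit's choice of a warning over a hard error.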
|
Summary:
I noticed this issue because we didn't put the primary cloned loop into
the `NonChildClonedLoops` vector and so never iterated on it. Once
I fixed that, it made it clear why I had to do a really complicated and
unnecessary dance when updating the loops to remain in canonical form --
I was unwittingly working around the fact that the primary cloned loop
wasn't in the expected list of cloned loops. Doh!
Now that we include it in this vector, we don't need to return it and we
can consolidate the update logic as we correctly have a single place
where it can be handled.
I've just added a test for the iteration order aspect as every time
I changed the update logic partially or incorrectly here, an existing
test failed and caught it so that seems well covered (which is also
evidenced by the extensive working around of this missing update).
Reviewers: asbirlea, sanjoy
Subscribers: mcrosier, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D47647
llvm-svn: 333811
|
and using the latter in DIBuilder::createArtificialType and
DIBuilder::createObjectPointerType methods as well as introducing
mirroring DISubprogram::cloneWithFlags and
DIBuilder::createArtificialSubprogram methods.
The primary goal here is to add createArtificialSubprogram to support
a pass downstream while keeping the method consistent with the
existing ones and making sure we don't encourage changing already
created DI-nodes.
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D47615
llvm-svn: 333806
|
Previously we just returned undef, but really we should be returning the pass thru input. We also need to make sure we preserve the chain output that the original intrinsic node had to maintain connectivity in the DAG. So we should just return the incoming chain as the output chain.
llvm-svn: 333804
|
This makes it easier to inspect the results of
DbgValueHistoryCalculator.
Differential Revision: https://reviews.llvm.org/D47663
llvm-svn: 333801
|
value is a zero vector.
llvm-svn: 333800
|
The idea behind WindowsSupport.h is that it's in the source directory so
that windows.h'isms don't leak out into the larger LLVM project. To that
end, any symbol that references a symbol from windows.h must be in this
private header, and not in a public header.
However, we had some useful utility functions in WindowsSupport.h which
have no dependency on the Windows API, but still only make sense on
Windows. Those functions should be usable outside of Support since there
is no risk of causing a windows.h leak. Although this introduces some
preprocessor logic in some header files, it's not too egregious and it's
better than the alternative of duplicating a ton of code.
Differential Revision: https://reviews.llvm.org/D47662
llvm-svn: 333798
|