| Commit message | Author | Age | Files | Lines |
|
See D46356 for context.
llvm-svn: 334164
|
This is needed to get the CC operand in the right place, as expected by the
SchedModel.
Review: Ulrich Weigand
https://reviews.llvm.org/D47820
llvm-svn: 334161
|
When denormals are supported we produce a full division for
1.0f / x. That can still be replaced by the faster version:
bool c = fabs(x) > 0x1.0p+96f;
float s = c ? 0x1.0p-32f : 1.0f;
x *= s;
return s * v_rcp_f32(x);
if the requested accuracy is 2.5ulp or less. The same version
is used for non-1.0 numerators if denormals are not supported; in that
case plain v_rcp_f32 is used for a 1.0 numerator.
The optimization of 1/x is extended to the case -1/x, which is the
same except for the resulting sign bit.
OpenCL conformance passed with both enabled and disabled denorms.
Differential Revision: https://reviews.llvm.org/D47805
llvm-svn: 334142
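The scaled reciprocal described in the commit above (llvm-svn 334142) can be sketched on the host as follows. This is only an illustration: approx_rcp is a hypothetical stand-in for the hardware v_rcp_f32 instruction and is modelled as an ordinary division, and the function names are not taken from the patch. std::ldexp is used in place of hex float literals so that the sketch builds with older C++ standards.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for v_rcp_f32. The real instruction is an
// approximation; here it is modelled as an ordinary host division.
static float approx_rcp(float x) { return 1.0f / x; }

// 1.0f / x with pre-scaling: inputs larger than 2^96 in magnitude are scaled
// down by 2^-32 before the reciprocal, and the result is scaled by the same
// factor afterwards.
float scaled_rcp(float x) {
  const bool c = std::fabs(x) > std::ldexp(1.0f, 96);
  const float s = c ? std::ldexp(1.0f, -32) : 1.0f;
  x *= s;
  return s * approx_rcp(x);
}

// -1.0f / x: the same computation, differing only in the resulting sign bit.
float scaled_neg_rcp(float x) { return -scaled_rcp(x); }
```

For example, scaled_rcp(2.0f) takes the unscaled path, while an input of 2^100 takes the scaled path and still yields 2^-100.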
|
Fixes terrible code on targets without f16 support. The
legalization creates a mess that is difficult to recover
from. This should also avoid randomly breaking these tests
multiple times in sequence in future commits.
There are some regressions in cases where it happens to be better
to pull the source modifier after the conversion.
llvm-svn: 334132
|
Summary:
In D47428, I propose choosing `~(-(1 << nbits))` as the canonical form of low-bit-mask formation.
As these tests show, there is a reason for that.
AArch64 currently handles `~(-(1 << nbits))` better than the more traditional `(1 << nbits) - 1` (sic!).
It is the other way around for X86.
It would be much better to canonicalize.
This patch is completely monkey-typing.
I don't really understand how this works :)
I have based it on the `// x & (-1 >> (32 - y))` pattern.
Also, when we only have `BMI`, I wonder if we could use `BEXTR` with `start=0`?
Related links:
https://bugs.llvm.org/show_bug.cgi?id=36419
https://bugs.llvm.org/show_bug.cgi?id=37603
https://bugs.llvm.org/show_bug.cgi?id=37610
https://rise4fun.com/Alive/idM
Reviewers: craig.topper, spatel, RKSimon, javed.absar
Reviewed By: craig.topper
Subscribers: kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D47453
llvm-svn: 334125
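The three low-bit-mask spellings mentioned in the commit above (llvm-svn 334125) compute the same value wherever all of them are defined. A minimal sketch, with hypothetical function names and 32-bit unsigned arithmetic assumed:

```cpp
#include <cassert>
#include <cstdint>

// Proposed canonical form: ~(-(1 << nbits)). Valid for nbits in [0, 31].
std::uint32_t mask_canonical(unsigned nbits) {
  return ~(-(std::uint32_t{1} << nbits));
}

// Traditional form: (1 << nbits) - 1. Valid for nbits in [0, 31].
std::uint32_t mask_traditional(unsigned nbits) {
  return (std::uint32_t{1} << nbits) - 1;
}

// Shift-based form: -1 >> (32 - nbits), using a logical (unsigned) shift.
// Valid for nbits in [1, 32]; nbits == 0 would shift by the full width.
std::uint32_t mask_shifted(unsigned nbits) {
  return ~std::uint32_t{0} >> (32 - nbits);
}
```

On the overlap of their valid ranges (nbits from 1 to 31) the three functions agree, which is what makes canonicalizing to a single form possible.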
|
llvm-svn: 334123
|
are used as the index.
These encodings correspond to the cases in the normal encoding scheme where there is no index and our modrm reading code initially decodes it as such. The VSIB handling code tried to compensate for this, but failed to add the base needed to make later code do the right thing.
Fixes PR37712.
llvm-svn: 334121
|
The index size is represented by the letter after the 'v'. The number represents the memory size. If an 'x' appears after the number, it means the index register can be from VR128X/VR256X instead of VR128/VR256.
As vy512mem uses a VR256X index it should have an x.
And vz256mem uses a VR512 index so it shouldn't have an x.
I admit these names kind of suck and are confusing.
llvm-svn: 334120
|
dependency-breaking 'zero-idiom'
As detailed in Agner's microarchitecture doc (21.8 AMD Bobcat and Jaguar pipeline - Dependency-breaking instructions), all these instructions are dependency breaking and zero the destination register.
llvm-svn: 334119
|
Create a separate feature set for Exynos M4 and add test cases.
llvm-svn: 334115
|
llvm-svn: 334109
|
Make TII isCopyInstr() return MachineOperands through pointer to pointer
instead of via reference.
Patch by Nikola Prica.
Differential Revision: https://reviews.llvm.org/D47364
llvm-svn: 334105
|
llvm-svn: 334086
|
llvm-svn: 334085
|
Only the bottom 16 bits of BEXTR's control op are required (7:0 INDEX, 15:8 LENGTH).
Differential Revision: https://reviews.llvm.org/D47690
llvm-svn: 334083
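To illustrate why only the low 16 bits of the control operand matter in the commit above (llvm-svn 334083), here is a minimal software model of a 32-bit BEXTR. The function names are hypothetical and the model is a sketch of the documented behaviour (start index in bits 7:0, length in bits 15:8), not code from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Pack a start index and a length into a BEXTR-style control value.
std::uint32_t make_bextr_control(unsigned start, unsigned len) {
  return (start & 0xFFu) | ((len & 0xFFu) << 8);
}

// Software model of a 32-bit bit-field extract. Only control bits 15:0 are
// read; anything above them is ignored.
std::uint32_t bextr32(std::uint32_t src, std::uint32_t control) {
  const unsigned start = control & 0xFFu;        // bits 7:0
  const unsigned len = (control >> 8) & 0xFFu;   // bits 15:8
  if (start >= 32 || len == 0)
    return 0;
  const std::uint32_t shifted = src >> start;
  if (len >= 32)
    return shifted;
  return shifted & ((std::uint32_t{1} << len) - 1);
}
```

Setting garbage in the upper 16 bits of the control value does not change the result in this model, which is the property the demanded-bits reasoning relies on.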
|
On targets like Arm some relaxations may only be performed when certain
architectural features are available. As functions can be compiled with
differing levels of architectural support we must make a judgement on
whether we can relax based on the MCSubtargetInfo for the function. This
change passes through the MCSubtargetInfo for the function to
fixupNeedsRelaxation so that the decision on whether to relax can be made
per function. In this patch, only the ARM backend makes use of this
information. We must also pass the MCSubtargetInfo to applyFixup because
some fixups skip error checking on the assumption that relaxation has
occurred; to prevent code-generation errors, applyFixup must see the same
MCSubtargetInfo as fixupNeedsRelaxation.
Differential Revision: https://reviews.llvm.org/D44928
llvm-svn: 334078
|
Add minimal support to lower function calls.
Support only functions with arguments/return that go through registers
and have type i32.
Patch by Petar Avramovic.
Differential Revision: https://reviews.llvm.org/D45627
llvm-svn: 334071
|
Summary: As it turns out, the lowering for the Mips16* family of targets is exactly what the ops expand to, so the code handling them can be removed and the ops enabled only for the MipsSE* family of targets.
Reviewers: smaksimovic, atanasyan, abeserminji
Subscribers: sdardis, arichardson, llvm-commits
Differential Revision: https://reviews.llvm.org/D47703
llvm-svn: 334052
|
Preserves the low bound of the !range. I don't think
it's legal to do anything with the top half since it's
theoretically reading garbage.
llvm-svn: 334045
|
Apply to i8 vectors.
llvm-svn: 334044
|
llvm-svn: 334043
|
llvm-svn: 334038
|
Reviewers: smaksimovic, atanasyan, abeserminji
Differential Revision: https://reviews.llvm.org/D47635
llvm-svn: 334031
|
Similar to v4i32 SHL, convert v8i16 shift amounts to scale factors instead to improve performance and reduce instruction count. We were already doing this for constant shifts; this adds variable shift support.
Reduces the serial nature of the codegen, which relies on chains of pblendvb/pand+pandn+por shifts.
This is a step towards adding support for vXi16 vector rotates.
Differential Revision: https://reviews.llvm.org/D47546
llvm-svn: 334023
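The idea behind the commit above (llvm-svn 334023) is that a left shift by n equals a multiplication by 2^n, so a per-lane variable shift can become a single vector multiply once each shift amount is turned into a scale factor. A scalar sketch of that identity follows; the names are hypothetical, and shift amounts are assumed to be below 16.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using V8i16 = std::array<std::uint16_t, 8>;

// One lane: x << amt computed as x * (1 << amt), truncated to 16 bits.
// Assumes amt < 16.
std::uint16_t shl16_by_scale(std::uint16_t x, unsigned amt) {
  const std::uint32_t scale = std::uint32_t{1} << amt;
  return static_cast<std::uint16_t>(std::uint32_t{x} * scale);
}

// Eight lanes, each shifted by its own amount. A real vector lowering would
// use one multiply instruction here; this loop only models the arithmetic.
V8i16 shl_v8i16_by_scale(const V8i16 &x, const V8i16 &amt) {
  V8i16 r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = shl16_by_scale(x[i], amt[i]);
  return r;
}
```

Because every lane is independent, there is no chain of blend or mask-and-merge steps, which is the serial dependency the commit message refers to.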
|
Summary:
Allow extended parsing of variable assembler assignment syntax and modify X86 to permit
VAR = register assignment. As we emit these as .set directives when possible, we inline
such expressions in output assembly.
Fixes PR37425.
Reviewers: rnk, void, echristo
Reviewed By: rnk
Subscribers: nickdesaulniers, llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D47545
llvm-svn: 334022
|
llvm-svn: 334016
|
llvm-svn: 334015
|
BitPermutationSelector builds the output value by repeating rotate-and-mask instructions with input registers.
Here, we may avoid one rotate instruction if we start building from an input register that does not require rotation.
For example, in the test case bitfieldinsert.ll, the code first rotates r4 left by 8 bits and then inserts some bits from r5 without rotation.
This can be executed by one rlwimi instruction, which rotates r4 by 8 bits and inserts its bits into r5.
This patch adds a check for rotation amounts in the comparator used in sorting to process the input without rotation first.
Differential Revision: https://reviews.llvm.org/D47765
llvm-svn: 334011
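The saving described in the commit above (llvm-svn 334011) can be checked with a small model of rotate-and-insert-under-mask. This is a sketch under assumptions: rlwimi_model takes an explicit mask rather than the MB/ME fields of the real instruction, and the register values and mask are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

std::uint32_t rotl32(std::uint32_t v, unsigned n) {
  n &= 31;
  return n ? (v << n) | (v >> (32 - n)) : v;
}

// Model of rotate-left-then-insert-under-mask: bits selected by mask come
// from the rotated source, all other bits keep the destination's value.
std::uint32_t rlwimi_model(std::uint32_t dst, std::uint32_t src, unsigned sh,
                           std::uint32_t mask) {
  return (dst & ~mask) | (rotl32(src, sh) & mask);
}

// Starting from r4: one rotate, then an insert of r5's bits without rotation.
std::uint32_t two_step(std::uint32_t r4, std::uint32_t r5, std::uint32_t m) {
  const std::uint32_t rotated = rotl32(r4, 8);
  return rlwimi_model(rotated, r5, 0, m);
}

// Starting from r5 (which needs no rotation): a single insert of rotated r4
// under the complementary mask.
std::uint32_t one_step(std::uint32_t r4, std::uint32_t r5, std::uint32_t m) {
  return rlwimi_model(r5, r4, 8, ~m);
}
```

Both orders produce the same value, but the second needs only one operation, which is why processing the unrotated input first is preferable.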
|
X86TargetLowering::computeKnownBitsForTargetNode
Ideally we'd use resolveTargetShuffleInputs to handle faux shuffles as well but:
(a) that code path doesn't handle general/pre-legalized ops/types very well.
(b) I'm concerned about the compute time as they recurse to calls to computeKnownBits/ComputeNumSignBits which would need depth limiting somehow.
llvm-svn: 334007
|
Summary:
Bring some code duplicated in the AT&T and the Intel printers
into a common parent class.
Reviewers: craig.topper
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D47682
llvm-svn: 334005
|
When the branch target of a Thumb2 unconditional or conditional branch is
resolved at assembly time, no range checking is performed on the result,
leading to incorrect immediates. This change adds a range check:
+- 16 Megabytes for unconditional branches, +- 1 Megabyte for the
conditional branch.
Differential Revision: https://reviews.llvm.org/D46306
llvm-svn: 333997
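A range check of the kind described in the commit above (llvm-svn 333997) might look like the sketch below. The function names are hypothetical, and the bit widths are an assumption based on reading +- 16 Megabytes as a signed 25-bit byte offset and +- 1 Megabyte as a signed 21-bit byte offset; the actual patch may express the limits differently.

```cpp
#include <cassert>
#include <cstdint>

// True if value fits in a signed two's-complement field of the given width.
bool fits_signed(std::int64_t value, unsigned bits) {
  const std::int64_t limit = std::int64_t{1} << (bits - 1);
  return value >= -limit && value < limit;
}

// Assumed widths: 25 bits for the unconditional branch (+- 16 MiB) and
// 21 bits for the conditional branch (+- 1 MiB).
bool thumb2_branch_in_range(std::int64_t byte_offset, bool is_conditional) {
  return fits_signed(byte_offset, is_conditional ? 21u : 25u);
}
```

An offset that is fine for an unconditional branch can therefore still be rejected for a conditional one, for example an offset of exactly 1 MiB.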
|
X86TargetLowering::computeKnownBitsForTargetNode
Helps improve analysis of saturation ops
llvm-svn: 333995
|
The Thumb BL range is + or - either 16 Megabytes or 4 Megabytes depending
on whether the CPU supports Thumb2 or the v8-m baseline ops. The existing
check for BL range is incorrectly set at +- 32 Megabytes. This change
corrects the higher range and uses the lower range if the featurebits
don't have the necessary support for it.
Differential Revision: https://reviews.llvm.org/D46305
llvm-svn: 333991
|
This is the new version of D46181, allowing setjmp/longjmp
to work correctly with the Intel CET shadow stack by storing
SSP on setjmp and fixing it on longjmp. The patch has been
updated to use the cf-protection-return module flag instead
of HasSHSTK, and the bug that caused D46181 to be reverted
has been fixed with the test expanded to track that fix.
patch by mike.dvoretsky
Differential Revision: https://reviews.llvm.org/D47311
llvm-svn: 333990
|
the initial MMX support via one of the SSE features flags make them require the MMX feature as well.
Passing -mattr=-mmx needs to disable these instructions since the MMX register class won't have been set up. But we don't want -mattr=-mmx to disable SSE so we have to do it separately.
llvm-svn: 333984
|
The ELF version was broken (it does not deal with wasm-specific fixups)
and is now slightly less broken. It will be removed in its entirety
in the future, which this change makes slightly easier (just remove
the IsELF bool).
Differential Revision: https://reviews.llvm.org/D47745
Patch by Wouter van Oortmerssen
llvm-svn: 333964
|
Review feedback from r328165. Split out just the one function from the
file that's used by Analysis. (As chandlerc pointed out, the original
change only moved the header and not the implementation anyway - which
was fine for the one function that was used (since it's a
template/inlined in the header) but not in general)
llvm-svn: 333954
|
This is setting up to fix bug 37573 cleanly.
This moves data structures that are technically both used in some way by the
target and the general-purpose outlining algorithm into MachineOutliner.h. In
particular, the `Candidate` class is of importance.
Before, the outliner passed the locations of `Candidates` to the target, which
would then make some decisions about the prospective outlined function. This
change allows us to just pass `Candidates` along to the target. This will allow
the target to discard `Candidates` that would be considered unsafe before cost
calculation. Thus, we will be able to remove the unsafe candidates described in
the bug without resorting to torching the entire prospective function.
Also, as a side-effect, it makes the outliner a bit cleaner.
https://bugs.llvm.org/show_bug.cgi?id=37573
llvm-svn: 333952
|
In preparation for the proposed linker ABI changes
(https://github.com/hjl-tools/linux-abi/wiki/linux-abi-draft.pdf,
https://github.com/hjl-tools/x86-psABI/wiki/x86-64-psABI-cet.pdf),
this patch enables emission of the .note.gnu.property section to
ELF object files when building CET-enabled modules.
patch by mike.dvoretsky
Differential Revision: https://reviews.llvm.org/D47145
llvm-svn: 333951
|
bool output parameter to get the real piece of info we care about. NFC
The ParitySrc array is more of an implementation detail. A single bool to get the final parity is sufficient.
llvm-svn: 333935
|
After the last changes, some code can be simplified.
Differential Revision: https://reviews.llvm.org/D47661
llvm-svn: 333934
|
Differential Revision: https://reviews.llvm.org/D47664
llvm-svn: 333931
|
Differential Revision: https://reviews.llvm.org/D47727
llvm-svn: 333928
|
On GFX9 and earlier, flat memory ops may decrement VMCNT out-of-order as well as LGKMCNT out-of-order.
Differential Revision: https://reviews.llvm.org/D46616
llvm-svn: 333926
|
resolveTargetShuffleInputs/getFauxShuffleMask
These methods should only be using SelectionDAG for analysis (known/sign bits etc), not node creation.
llvm-svn: 333925
|
llvm-svn: 333924
|
This patch is the last of a sequence of three patches related to LLVM-dev RFC
"MC support for variant scheduling classes".
http://lists.llvm.org/pipermail/llvm-dev/2018-May/123181.html
This fixes PR36672.
The main goal of this patch is to teach llvm-mca how to resolve variant scheduling
classes. This patch does that, plus it adds new variant scheduling classes to
the BtVer2 scheduling model to identify so-called zero-idioms (i.e.
dependency-breaking instructions that are known to generate zero, and that are
optimized out in hardware at the register renaming stage).
Without the BtVer2 change, this patch would not have had any meaningful tests.
This patch is effectively the union of two changes:
1) a change that teaches llvm-mca how to resolve variant scheduling classes.
2) a change to the BtVer2 scheduling model that allows us to special-case
packed XOR zero-idioms (this partially fixes PR36671).
Differential Revision: https://reviews.llvm.org/D47374
llvm-svn: 333909
|
Summary:
Avoid name clashes with the corresponding bit fields in the instruction
encoding.
Change-Id: Id1644e703e976e78f7af93788d9f44cb48c3251f
Reviewers: arsenm, rampitec, kzhuravl
Subscribers: wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D47433
llvm-svn: 333905
|
Summary:
The new rules are straightforward. The main rules to keep in mind
are:
1. NAME is an implicit template argument of class and multiclass,
and will be substituted by the name of the instantiating def/defm.
2. The name of a def/defm in a multiclass must contain a reference
to NAME. If such a reference is not present, it is automatically
prepended.
And for some additional subtleties, consider these:
3. defm with no name generates a unique name but has no special
behavior otherwise.
4. def with no name generates an anonymous record, whose name is
unique but undefined. In particular, the name won't contain a
reference to NAME.
Keeping rules 1&2 in mind should allow a predictable behavior of
name resolution that is simple to follow.
The old "rules" were rather surprising: sometimes (but not always),
NAME would correspond to the name of the toplevel defm. They were
also plain bonkers when you pushed them to their limits, as the old
version of the TableGen test case shows.
Having NAME correspond to the name of the toplevel defm introduces
"spooky action at a distance" and breaks composability:
refactoring the upper layers of a hierarchy of nested multiclass
instantiations can cause unexpected breakage by changing the value
of NAME at a lower level of the hierarchy. The new rules don't
suffer from this problem.
Some existing .td files have to be adjusted because they ended up
depending on the details of the old implementation.
Change-Id: I694095231565b30f563e6fd0417b41ee01a12589
Reviewers: tra, simon_tatham, craig.topper, MartinO, arsenm, javed.absar
Subscribers: wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D47430
llvm-svn: 333900
|
Reviewers: smaksimovic, atanasyan, abeserminji
Differential Revision: https://reviews.llvm.org/D47584
llvm-svn: 333895
|