| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
| |
Use triple and attribute only for consistency
llvm-svn: 306104
|
|
|
|
|
|
| |
Use triple and attribute only for consistency
llvm-svn: 306097
|
|
|
|
|
|
| |
Use triple and attribute only for consistency
llvm-svn: 306092
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is very similar to the transform in:
https://reviews.llvm.org/rL306040
...but in this case, we use cmp X, 1 to set the carry bit as needed.
Again, we can show that all of these are logically equivalent (although
InstCombine currently canonicalizes to a form not seen here), and if
we believe IACA, then this is the smallest/fastest code. E.g., with SNB:
| Num Of |         Ports pressure in cycles         |    |
|  Uops  | 0 - DV |  1  | 2 - D | 3 - D |  4  |  5  |    |
----------------------------------------------------------
|   1    |  1.0   |     |       |       |     |     |    | cmp edi, 0x1
|   2    |        | 1.0 |       |       |     | 1.0 | CP | sbb eax, eax
The larger motivation is to clean up all select-of-constants combining/lowering
because we're missing some common cases.
llvm-svn: 306072
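A minimal sketch of why the two-instruction sequence above works, assuming the usual x86 flag rules (`cmp a, b` sets the carry flag when `a < b` as unsigned values, and `sbb r, r` computes `r - r - CF`). The helper names are hypothetical and only model 32-bit registers in Python; the exact IR patterns the commit matches are not shown in this log, so the reference function below is derived purely from the instruction semantics.

```python
MASK32 = 0xFFFFFFFF  # model 32-bit registers


def cmp_carry(a: int, b: int) -> int:
    """Carry flag after `cmp a, b`: set when a < b as unsigned 32-bit values."""
    return 1 if (a & MASK32) < (b & MASK32) else 0


def sbb_same_reg(carry: int) -> int:
    """Result of `sbb r, r`: r - r - CF, which is 0 or 0xFFFFFFFF."""
    return (0 - carry) & MASK32


def sbb_of_cmp_one(x: int) -> int:
    """Model of `cmp edi, 0x1` followed by `sbb eax, eax`."""
    return sbb_same_reg(cmp_carry(x, 1))


def select_reference(x: int) -> int:
    """Reference form: all-ones when x == 0, otherwise zero."""
    return MASK32 if (x & MASK32) == 0 else 0
```

Only `x == 0` is below 1 in unsigned arithmetic, so the carry flag ends up being exactly the `x == 0` predicate, and `sbb` broadcasts it to every bit.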
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: RKSimon, DavidKreitzer
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32658
llvm-svn: 306068
|
|
|
|
|
|
| |
These are siblings of the tests added with r306032.
llvm-svn: 306064
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
These intrinsics aren't used by clang and haven't been for a while.
There's some really terrible codegen in the 32-bit target for avx512bw due to i64 not being legal. But, as I said, these intrinsics weren't used by clang even before this patch, so this codegen reflects our clang behavior today.
Reviewers: spatel, RKSimon, zvi, igorb
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34389
llvm-svn: 306047
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Our handling of select-of-constants is lumpy in IR (https://reviews.llvm.org/D24480),
lumpy in DAGCombiner, and lumpy in X86ISelLowering. That's why we only had the 'sbb'
codegen in 1 out of the 4 tests. This is a step towards smoothing that out.
First, show that all of these IR forms are equivalent:
http://rise4fun.com/Alive/mx
Second, show that the 'sbb' version is faster/smaller. IACA output for SandyBridge
(later Intel and AMD chips are similar based on Agner's tables):
This is the "obvious" x86 codegen (what gcc appears to produce currently):
| Num Of |         Ports pressure in cycles         |    |
|  Uops  | 0 - DV |  1  | 2 - D | 3 - D |  4  |  5  |    |
----------------------------------------------------------
|   1*   |        |     |       |       |     |     |    | xor eax, eax
|   1    |  1.0   |     |       |       |     |     | CP | test edi, edi
|   1    |        |     |       |       |     | 1.0 | CP | setnz al
|   1    |        | 1.0 |       |       |     |     | CP | neg eax
This is the adc version:
|   1*   |        |     |       |       |     |     |    | xor eax, eax
|   1    |  1.0   |     |       |       |     |     | CP | cmp edi, 0x1
|   2    |        | 1.0 |       |       |     | 1.0 | CP | adc eax, 0xffffffff
And this is sbb:
|   1    |  1.0   |     |       |       |     |     |    | neg edi
|   2    |        | 1.0 |       |       |     | 1.0 | CP | sbb eax, eax
If IACA is trustworthy, then sbb became a single uop in Broadwell, so this will be
clearly better than the alternatives going forward.
llvm-svn: 306040
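As a rough cross-check of the three instruction sequences listed in this message, the sketch below models each one on 32-bit values in Python. It assumes the standard x86 flag behavior (`neg` sets the carry flag unless its operand is zero, `cmp a, b` sets it when `a < b` unsigned, `adc` adds the carry, `sbb` subtracts it). The function names are hypothetical and this is not a substitute for the Alive proof linked above.

```python
MASK32 = 0xFFFFFFFF  # model 32-bit registers


def setnz_neg(x: int) -> int:
    """xor eax, eax; test edi, edi; setnz al; neg eax"""
    al = 1 if (x & MASK32) != 0 else 0
    return (-al) & MASK32


def cmp_adc(x: int) -> int:
    """xor eax, eax; cmp edi, 0x1; adc eax, 0xffffffff"""
    carry = 1 if (x & MASK32) < 1 else 0  # CF set only when x == 0
    return (0 + MASK32 + carry) & MASK32


def neg_sbb(x: int) -> int:
    """neg edi; sbb eax, eax"""
    carry = 1 if (x & MASK32) != 0 else 0  # NEG sets CF unless operand is 0
    return (0 - carry) & MASK32
```

All three return zero for a zero input and all-ones otherwise, which is consistent with the claim that the forms are interchangeable and only differ in uop count.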
|
|
|
|
| |
llvm-svn: 306032
|
|
|
|
|
|
|
|
|
|
|
| |
This commit adds prologue code emission for stack probe function
calls.
Reviewed By: majnemer
Differential Revision: https://reviews.llvm.org/D34387
llvm-svn: 306010
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Support vector type G_INSERT legalization/selection.
Split from https://reviews.llvm.org/D33665
Reviewers: qcolombet, t.p.northover, zvi, guyblank
Reviewed By: guyblank
Subscribers: guyblank, rovka, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D33956
llvm-svn: 305989
|
|
|
|
|
|
|
|
|
|
|
|
| |
Masked gather for vector length 2 is lowered incorrectly for element type i32.
The type <2 x i32> was automatically extended to <2 x i64> and we generated VPGATHERQQ instead of VPGATHERQD.
The type <2 x float> is extended to <4 x float>, so there is no bug for this type, but the generated sequence could be better.
This patch fixes the <2 x i32> bug and optimizes the <2 x float> sequence for gathers only. The same fix should be made for scatters as well.
Differential revision: https://reviews.llvm.org/D34343
llvm-svn: 305987
|
|
|
|
|
|
|
|
| |
Patch by Fedor Sergeev.
Differential Revision: https://reviews.llvm.org/D33868
llvm-svn: 305948
|
|
|
|
|
|
| |
Use triple and attribute only for consistency
llvm-svn: 305916
|
|
|
|
|
|
| |
Use triple and attribute only for consistency
llvm-svn: 305913
|
|
|
|
| |
llvm-svn: 305910
|
|
|
|
|
|
| |
Use triple and attribute only for consistency
llvm-svn: 305909
|
|
|
|
|
|
| |
Use triple and attribute only for consistency
llvm-svn: 305908
|
|
|
|
|
|
| |
Use triple and attribute only for consistency
llvm-svn: 305907
|
|
|
|
|
|
| |
Use triple and attribute only for consistency
llvm-svn: 305906
|
|
|
|
| |
llvm-svn: 305905
|
|
|
|
| |
llvm-svn: 305904
|
|
|
|
| |
llvm-svn: 305897
|
|
|
|
|
|
|
| |
Add support for combining a build vector to a shuffle
when the build vector is made of elements extracted from 2 vectors (vec1, vec2), where vec2 is half the size of vec1.
llvm-svn: 305883
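A small sketch of the idea, in Python rather than DAGCombiner code. It assumes the smaller vector is first widened to vec1's length with undef lanes so that both shuffle inputs have the same type; that widening step, the `UNDEF` marker, and both function names are illustrative assumptions, not details taken from the commit.

```python
from typing import List, Optional, Sequence, Tuple

UNDEF = None  # stand-in for an undefined lane


def build_vector_to_shuffle_mask(
    elements: Sequence[Tuple[int, int]], vec1_len: int, vec2_len: int
) -> List[int]:
    """Map (source vector, lane) pairs to indices into concat(vec1, widened vec2)."""
    if vec1_len != 2 * vec2_len:
        raise ValueError("sketch only covers vec2 being half the size of vec1")
    mask = []
    for source, lane in elements:
        if source == 1 and 0 <= lane < vec1_len:
            mask.append(lane)
        elif source == 2 and 0 <= lane < vec2_len:
            mask.append(vec1_len + lane)  # vec2 lanes follow vec1 lanes
        else:
            raise ValueError(f"bad element reference: {(source, lane)}")
    return mask


def apply_shuffle(
    vec1: Sequence[int], vec2: Sequence[int], mask: Sequence[int]
) -> List[Optional[int]]:
    """Evaluate the shuffle after padding vec2 with undef lanes."""
    widened = list(vec2) + [UNDEF] * (len(vec1) - len(vec2))
    concat = list(vec1) + widened
    return [concat[i] for i in mask]
```

For example, a build vector of `vec1[0], vec2[1], vec1[3], vec2[0]` with a 4-lane vec1 and a 2-lane vec2 maps to the mask `[0, 5, 3, 4]`.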
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
When we're building with XRay instrumentation, we use a trick that
preserves references from the function to a function sled index. This
index table lives in a separate section, and without this trick the
linker is free to garbage-collect this section and all the segments it
refers to. Until we're able to tell the linkers to preserve these
sections, we use this reference trick to keep around both the index and
the entries in the instrumentation map.
Before this change we emitted both a synthetic reference to the label in
the instrumentation map, and to the entry in the function map index.
This change removes the first synthetic reference and only emits one
synthetic reference to the index -- the index entry has the references
to the labels in the instrumentation map, so the linker will still
preserve those if the function itself is preserved.
This reduces the size of the synthetic references we emit from 16 bytes to
just 8 bytes on x86_64, and similarly on other platforms.
Reviewers: dblaikie
Subscribers: javed.absar, kpw, pelikan, llvm-commits
Differential Revision: https://reviews.llvm.org/D34340
llvm-svn: 305880
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Right now areMemoryOpsAliased has an assertion justified as:
MMO1 should have a value due it comes from operation we'd like to use
as implicit null check.
assert(MMO1->getValue() && "MMO1 should have a Value!");
However, it is possible for that invariant to not be upheld in the
following situation (conceptually):
Null check %RAX
NotNullSucc:
%RAX = LEA %RSP, 16 // I0
%RDX = MOV64rm %RAX // I1
With the current code, we will have an early exit from
ImplicitNullChecks::isSuitableMemoryOp on I0 with SR_Unsuitable.
However, I1 will look plausible (since it loads from %RAX) and
will go ahead and call areMemoryOpsAliased(I1, I0). This will cause
us to fail the assert mentioned above since I1 does not load from an
IR level value and thus is allowed to have a non-Value base address.
The fix is to bail out earlier whenever we see an unsuitable
instruction overwrite PointerReg. This would guarantee that when we
call areMemoryOpsAliased, we're guaranteed to be looking at an
instruction that loads from or stores to an IR level value.
Original Patch Author: sanjoy
Reviewers: sanjoy, mkazantsev, reames
Reviewed By: sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34385
llvm-svn: 305879
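The bail-out described in this message can be sketched as a toy scan over the instructions that follow the null check. This is a simplified Python model, not the ImplicitNullChecks pass: the `Instr` record and `find_candidate` function are hypothetical, and "suitable" is reduced to "a memory operation whose base register is the checked pointer".

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Instr:
    name: str
    defs: Set[str] = field(default_factory=set)  # registers written
    base_reg: Optional[str] = None               # memory base register, if any

    def is_memory_op(self) -> bool:
        return self.base_reg is not None


def find_candidate(instrs: List[Instr], pointer_reg: str) -> Optional[Instr]:
    """Return the first memory op based on pointer_reg, or None if we must bail."""
    for instr in instrs:
        if instr.is_memory_op() and instr.base_reg == pointer_reg:
            return instr
        if pointer_reg in instr.defs:
            # An unsuitable instruction overwrote the checked register, so
            # later uses of it no longer refer to the null-checked value.
            return None
    return None
```

With the sequence from the message (an `LEA` that redefines `%RAX`, then a load from `%RAX`), the scan stops at the `LEA` and never considers the load.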
|
|
|
|
|
|
|
|
|
| |
There are a couple of potential improvements as seen in the IR and asm:
1. We're unnecessarily extending to a larger type to compare values.
2. The codegen for (select cond, 1, -1) could avoid a cmov.
(or we could change the order of the compares, so we have a select with 0 operand)
llvm-svn: 305802
|
|
|
|
|
|
|
|
| |
>=16bit elements
Shuffle lowering/combining now does a good job for 256/512-bit vectors, so we don't need to prevent this.
llvm-svn: 305801
|
|
|
|
| |
llvm-svn: 305787
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
In some cases the RegClass depends on a target feature. The high (16-31) vector registers exist only if AVX512F is available.
Split from https://reviews.llvm.org/D33665
Reviewers: qcolombet, t.p.northover, zvi, guyblank
Reviewed By: t.p.northover, guyblank
Subscribers: guyblank, rovka, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D33952
Conflicts:
test/CodeGen/X86/GlobalISel/select-memop-scalar.mir
llvm-svn: 305784
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
In some cases legalization ends up with non-symmetric merge/unmerge nodes.
Transform them into merge/unmerge nodes.
Reviewers: t.p.northover, qcolombet, zvi
Reviewed By: t.p.northover
Subscribers: rovka, kristof.beyls, guyblank, llvm-commits
Differential Revision: https://reviews.llvm.org/D33626
llvm-svn: 305783
|
|
|
|
| |
llvm-svn: 305781
|
|
|
|
|
|
|
|
|
|
| |
The AND64rm instruction used in the test case was incorrect. Since
the first input register to AND64rm is tied to the output register, they
must be the same.
Thanks to Jesper Antonsson for pointing this out!
llvm-svn: 305756
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This seems to be interacting badly with ASan somehow, causing false reports of
heap-buffer overflows: PR33514.
> Summary:
> The patch makes instruction count the highest priority for
> LSR solution for X86 (previously registers had highest priority).
>
> Reviewers: qcolombet
>
> Differential Revision: http://reviews.llvm.org/D30562
>
> From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 305720
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Implement some of the simplest addressing modes. It should help to test the ABI.
Reviewers: zvi, guyblank
Reviewed By: guyblank
Subscribers: rovka, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D33888
llvm-svn: 305691
|
|
|
|
| |
llvm-svn: 305670
|
|
|
|
| |
llvm-svn: 305656
|
|
|
|
| |
llvm-svn: 305655
|
|
|
|
| |
llvm-svn: 305654
|
|
|
|
|
|
| |
Increment (add 1) could be transformed to sub -1, and we'd lose coverage for these patterns.
llvm-svn: 305646
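For reference, the two forms mentioned here compute the same value under fixed-width wraparound arithmetic, which is why a transform from one to the other is legal. A minimal Python check, assuming 32-bit lanes by default (the function names are illustrative only):

```python
def add_one(x: int, bits: int = 32) -> int:
    """x + 1 with wraparound at the given bit width."""
    mask = (1 << bits) - 1
    return (x + 1) & mask


def sub_minus_one(x: int, bits: int = 32) -> int:
    """x - (-1) with wraparound, where -1 is the all-ones bit pattern."""
    mask = (1 << bits) - 1
    minus_one = mask  # two's-complement encoding of -1
    return (x - minus_one) & mask
```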
|
|
|
|
|
|
| |
Increment (add 1) could be transformed to sub -1, and we'd lose coverage for these patterns.
llvm-svn: 305645
|
|
|
|
|
|
| |
Increment (add 1) could be transformed to sub -1, and we'd lose coverage for these patterns.
llvm-svn: 305644
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Re-apply r276044/r279124/r305516. Fixed a problem where we would refuse
to place spills as the very first instruction of a basic block and thus
artificially increase pressure (test in
test/CodeGen/PowerPC/scavenging.mir:spill_at_begin)
This is a variant of scavengeRegister() that works for
enterBasicBlockEnd()/backward(). The benefit of the backward mode is
that it is not affected by incomplete kill flags.
This patch also changes
PrologEpilogInserter::doScavengeFrameVirtualRegs() to use the register
scavenger in backwards mode.
Differential Revision: http://reviews.llvm.org/D21885
llvm-svn: 305625
|
|
|
|
|
|
| |
The analysis is expected to be preserved by SelectionDAG.
llvm-svn: 305621
|
|
|
|
|
|
|
|
|
| |
Revert because of reports of some PPC input starting to spill when it
was predicted that it wouldn't and no spillslot was reserved.
This reverts commit r305516.
llvm-svn: 305566
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Background: http://lists.llvm.org/pipermail/llvm-dev/2017-May/112779.html
This change alters the prototype for the atomic memcpy intrinsic. The prototype itself is being changed to more closely resemble the semantics and parameters of the llvm.memcpy intrinsic, to ease later combination of the llvm.memcpy and atomic memcpy intrinsics. Furthermore, the name of the atomic memcpy intrinsic is being changed to make it clear that it is not a generic atomic memcpy, but specifically a memcpy that is unordered atomic.
Reviewers: reames, sanjoy, efriedma
Reviewed By: reames
Subscribers: mzolotukhin, anna, llvm-commits, skatkov
Differential Revision: https://reviews.llvm.org/D33240
llvm-svn: 305558
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Re-apply r276044/r279124. Trying to reproduce or disprove the ppc64
problems reported in the stage2 build last time, which I cannot
reproduce right now.
This is a variant of scavengeRegister() that works for
enterBasicBlockEnd()/backward(). The benefit of the backward mode is
that it is not affected by incomplete kill flags.
This patch also changes
PrologEpilogInserter::doScavengeFrameVirtualRegs() to use the register
scavenger in backwards mode.
Differential Revision: http://reviews.llvm.org/D21885
llvm-svn: 305516
|
|
|
|
|
|
|
|
|
|
|
|
| |
The code assumed that we process instructions in basic block order. FastISel
processes instructions in reverse basic block order. We need to pre-assign
virtual registers before selecting otherwise we get def-use relationships wrong.
This only affects code with swifterror registers.
rdar://32659327
llvm-svn: 305484
|
|
|
|
|
|
|
|
|
|
| |
assuming v8i16 vectors
We can use this with v16i16/v32i16 as well.
Found during fuzz testing.
llvm-svn: 305472
|
|
|
|
|
|
|
|
| |
(remove redundant shift left+right instructions).
This is causing Windows buildbot failures.
llvm-svn: 305470
|