| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is an attempt of fixing PR35807.
Due to the non-standard definition of dominance in LLVM, where uses in
unreachable blocks are dominated by anything, you can have, in an
unreachable block:
%patatino = OP1 %patatino, CONSTANT
When `SimplifyInstruction` receives a PHI where an incoming value is of
the aforementioned form, in some cases, loops indefinitely.
What I propose here instead is keeping track of the incoming values
from unreachable blocks, and replacing them with undef. It fixes this
case, and it seems to be good regardless (even if we can't prove that
the value is constant, as it's coming from an unreachable block, we
can ignore it).
Differential Revision: https://reviews.llvm.org/D41812
llvm-svn: 322006
|
| |
|
|
|
|
|
|
|
|
| |
Adds option /guard:cf to clang-cl and -cfguard to cc1 to emit function IDs
of functions that have their address taken into a section named .gfids$y for
compatibility with Microsoft's Control Flow Guard feature.
Differential Revision: https://reviews.llvm.org/D40531
llvm-svn: 322005
|
| |
|
|
|
|
|
|
|
| |
This patch improves diagnostic for case when mapped instruction
does not contain a field listed under RowFields.
Differential Revision: https://reviews.llvm.org/D41778
llvm-svn: 322004
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is precedence for factorization transforms in instcombine for FP ops with fast-math.
We also have similar logic in foldSPFofSPF().
It would take more work to add this to reassociate because that's specialized for binops,
and min/max are not binops (or even single instructions). Also, I don't have evidence that
larger min/max trees than this exist in real code, but if we find that's true, we might
want to reorganize where/how we do this optimization.
In the motivating example from https://bugs.llvm.org/show_bug.cgi?id=35717 , we have:
int test(int xc, int xm, int xy) {
int xk;
if (xc < xm)
xk = xc < xy ? xc : xy;
else
xk = xm < xy ? xm : xy;
return xk;
}
This patch solves that problem because we recognize more min/max patterns after rL321672
https://rise4fun.com/Alive/Qjne
https://rise4fun.com/Alive/3yg
Differential Revision: https://reviews.llvm.org/D41603
llvm-svn: 321998
|
| |
|
|
|
|
|
|
|
|
| |
The patch makes the unwind information not mention registers, which were pushed
solely for the purpose of saving stack adjustment instructions.
Differential revision: https://reviews.llvm.org/D41300
Fixes https://bugs.llvm.org/show_bug.cgi?id=35379
llvm-svn: 321996
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Fixes the bug with incorrect handling of InsertValue|InsertElement
instrucions in SLP vectorizer. Currently, we may use incorrect
ExtractElement instructions as the operands of the original
InsertValue|InsertElement instructions.
Reviewers: mkuper, hfinkel, RKSimon, spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D41767
llvm-svn: 321994
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
If the vectorized value is marked as extra reduction argument, its users
are not considered as external users. Patch fixes this.
Reviewers: mkuper, hfinkel, RKSimon, spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D41786
llvm-svn: 321993
|
| |
|
|
|
|
|
|
|
|
|
| |
I had falsely assumed that constant operands would be operand(1) of
the bin ops that may need their constant operand to be masked.
Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=35761
Differential Revision: https://reviews.llvm.org/D41667
llvm-svn: 321991
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This patch allows `r7` to be used, regardless of its use as a frame pointer, as
a temporary register when popping `lr`, and also falls back to using a high
temporary register if, for some reason, we weren't able to find a suitable low
one.
Differential revision: https://reviews.llvm.org/D40961
Fixes https://bugs.llvm.org/show_bug.cgi?id=35481
llvm-svn: 321989
|
| |
|
|
| |
llvm-svn: 321988
|
| |
|
|
|
|
| |
CVT2MASK is just checking the sign bit which can be represented with a comparison with zero.
llvm-svn: 321985
|
| |
|
|
|
|
| |
128/256-bit compares when VLX is not available.
llvm-svn: 321984
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
These tests assumes availability of external symbols provided by the
C++ library, but those won't be available in case when the C++ library
is statically linked because lli itself doesn't need these.
This uses llvm-readobj -needed-libs to check if C++ library is linked as
shared library and exposes that information as a feature to lit.
Differential Revision: https://reviews.llvm.org/D41272
llvm-svn: 321981
|
| |
|
|
|
|
|
|
| |
This implements the -needed-libs option in Mach-O dumper.
Differential Revision: https://reviews.llvm.org/D41527
llvm-svn: 321980
|
| |
|
|
| |
llvm-svn: 321978
|
| |
|
|
|
|
|
|
| |
The approach was never discussed, I wasn't able to reproduce this
non-determinism, and the original author went AWOL.
After a discussion on the ML, Philip suggested to revert this.
llvm-svn: 321974
|
| |
|
|
|
|
| |
Support for ISel to be added.
llvm-svn: 321970
|
| |
|
|
|
|
|
|
|
|
| |
Allow SimplifyDemandedBits to use TargetLoweringOpt::computeKnownBits to look through bitcasts. This can help simplifying in some cases where bitcasts of constants generated during or after legalization can't be folded away, and thus didn't get picked up by SimplifyDemandedBits. This fixes PR34620, where a redundant pand created during legalization from lowering and lshr <16xi8> wasn't being simplified due to the presence of a bitcasted build_vector as an operand.
Committed on the behalf of @sameconrad (Sam Conrad)
Differential Revision: https://reviews.llvm.org/D41643
llvm-svn: 321969
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
There are few oddities that occur due to v1i1, v8i1, v16i1 being legal without v2i1 and v4i1 being legal when we don't have VLX. Particularly during legalization of v2i32/v4i32/v2i64/v4i64 masked gather/scatter/load/store. We end up promoting the mask argument to these during type legalization and then have to widen the promoted type to v8iX/v16iX and truncate it to get the element size back down to v8i1/v16i1 to use a 512-bit operation. Since need to fill the upper bits of the mask we have to fill with 0s at the promoted type.
It would be better if we could just have the v2i1/v4i1 types as legal so they don't undergo any promotion. Then we can just widen with 0s directly in a k register. There are no real v4i1/v2i1 instructions anyway. Everything is done on a larger register anyway.
This also fixes an issue that we couldn't implement a masked vextractf32x4 from zmm to xmm properly.
We now have to support widening more compares to 512-bit to get a mask result out so new tablegen patterns got added.
I had to hack the legalizer for widening the operand of a setcc a bit so it didn't try create a setcc returning v4i32, extract from it, then try to promote it using a sign extend to v2i1. Now we create the setcc with v4i1 if the original setcc's result type is v2i1. Then extract that and don't sign extend it at all.
There's definitely room for improvement with some follow up patches.
Reviewers: RKSimon, zvi, guyblank
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D41560
llvm-svn: 321967
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In addition to target-dependent attributes, we can also preserve a
white-listed subset of target independent function attributes. The white-list
excludes problematic attributes, most prominently:
* attributes related to memory accesses, as alloca instructions
could be moved in/out of the extracted block
* control-flow dependent attributes, like no_return or thunk, as the
relerelevant instructions might or might not get extracted.
Thanks @efriedma and @aemerson for providing a set of attributes that cannot be
propagated.
Reviewers: efriedma, davidxl, davide, silvas
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D41334
llvm-svn: 321961
|
| |
|
|
| |
llvm-svn: 321958
|
| |
|
|
| |
llvm-svn: 321953
|
| |
|
|
| |
llvm-svn: 321951
|
| |
|
|
|
|
|
|
| |
versions of cvtps2ph to the store folding tables.
The memory form of the xmm->xmm version only writes 64-bits. If we use it in the folding tables and its get used for a stack spill, only half the slot will be written. Then a reload may read all 128-bits which will pull in garbage. But without the spill the upper bits of the register would have been zero. By not folding we would preserve the zeros.
llvm-svn: 321950
|
| |
|
|
|
|
|
|
| |
matcher table.
This is also needed to fix PR35837.
llvm-svn: 321946
|
| |
|
|
| |
llvm-svn: 321945
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: efriedma, rnk, davide
Reviewed By: rnk, davide
Differential Revision: https://reviews.llvm.org/D41556
llvm-svn: 321943
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: rnk, efriedma
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D41555
llvm-svn: 321942
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
If the varargs are not accessed by a function, we can inline the
function.
Reviewers: dblaikie, chandlerc, davide, efriedma, rnk, hfinkel
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D41335
llvm-svn: 321940
|
| |
|
|
|
|
|
|
|
|
| |
assembler matcher table
We should always prefer the VEX encoded version of these instructions. There is no advantage to the EVEX version.
Fixes PR35837.
llvm-svn: 321939
|
| |
|
|
|
|
|
|
|
|
|
| |
In the minimal case, this won't remove instructions, but it still improves
uses of existing values.
In the motivating example from PR35834, it does remove instructions, and
sets that case up to be optimized by something like D41603:
https://reviews.llvm.org/D41603
llvm-svn: 321936
|
| |
|
|
| |
llvm-svn: 321935
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the last step needed to fix PR33325:
https://bugs.llvm.org/show_bug.cgi?id=33325
We're trading branch and compares for loads and logic ops.
This makes the code smaller and hopefully faster in most cases.
The 24-byte test shows an interesting construct: we load the trailing scalar
elements into vector registers and generate the same pcmpeq+movmsk code that
we expected for a pair of full vector elements (see the 32- and 64-byte tests).
Differential Revision: https://reviews.llvm.org/D41714
llvm-svn: 321934
|
| |
|
|
|
|
| |
we don't crash when trying to print an error message using it.
llvm-svn: 321930
|
| |
|
|
|
|
| |
lowerV4I64VectorShuffle.
llvm-svn: 321929
|
| |
|
|
| |
llvm-svn: 321928
|
| |
|
|
| |
llvm-svn: 321918
|
| |
|
|
|
|
|
|
| |
instructions.
This matches their VEX equivalents.
llvm-svn: 321912
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This had been reverted because the new test failed on non-X86 bots. I moved
the new test to the appropriate subdirectory to correct this.
Differential Revision: https://reviews.llvm.org/D41264
Original submission: r321122 (which was reverted by r321125)
This reverts commit 3c1639b5703c387a0d8cba2862803b4e68dff436.
llvm-svn: 321911
|
| |
|
|
|
|
| |
Recommit r321897 with updated testcases.
llvm-svn: 321908
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This commit updates the BufferByteStreamer, used by DebugLocStream
to buffer bytes/comments to put in the debug_loc section, to
make sure that the Buffer and Comments vectors are synced.
Previously, when an SLEB128 or ULEB128 was emitted together with
a comment, the vectors could be out-of-sync if the LEB encoding
added several entries to the Buffer vectors, while we only added
a single entry to the Comments vector.
The goal with this is to get the comments in the debug_loc
section in the .s file correctly aligned.
Example (using ARM as target):
Instead of
.byte 144 @ sub-register DW_OP_regx
.byte 128 @ 256
.byte 2 @ DW_OP_piece
.byte 147 @ 8
.byte 8 @ sub-register DW_OP_regx
.byte 144 @ 257
.byte 129 @ DW_OP_piece
.byte 2 @ 8
.byte 147 @
.byte 8 @
we now get
.byte 144 @ sub-register DW_OP_regx
.byte 128 @ 256
.byte 2 @
.byte 147 @ DW_OP_piece
.byte 8 @ 8
.byte 144 @ sub-register DW_OP_regx
.byte 129 @ 257
.byte 2 @
.byte 147 @ DW_OP_piece
.byte 8 @ 8
Reviewers: JDevlieghere, rnk, aprantl
Reviewed By: aprantl
Subscribers: davide, Ka-Ka, uabelho, aemerson, javed.absar, kristof.beyls, llvm-commits, JDevlieghere
Differential Revision: https://reviews.llvm.org/D41763
llvm-svn: 321907
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D41784
llvm-svn: 321905
|
| |
|
|
|
|
|
| |
Commit message:
[Hexagon] Add patterns for sext_inreg of HVX vector types
llvm-svn: 321904
|
| |
|
|
|
|
|
|
|
|
| |
instructions as well.
Without this we allow "vmovd %rax, %xmm0", but not "vmovd %rax, %xmm16"
This exists due to continue a silly bug where really old versions of the GNU assembler required movd instead of movq on these instructions. This compatibility hack then crept forward to avx version too, but we didn't propagate it to avx512.
llvm-svn: 321903
|
| |
|
|
|
|
|
|
| |
This option is widely used by scripts and there is no reason to break them.
rdar://problem/36032398
llvm-svn: 321901
|
| |
|
|
|
|
|
|
| |
'movq' instead.
This behavior existed to work with an old version of the gnu assembler on MacOS that only accepted this form. Newer versions of GNU assembler and the current LLVM derived version of the assembler on MacOS support movq as well.
llvm-svn: 321898
|
| |
|
|
|
|
| |
Only non-bool vectors.
llvm-svn: 321895
|
| |
|
|
| |
llvm-svn: 321894
|
| |
|
|
| |
llvm-svn: 321893
|
| |
|
|
| |
llvm-svn: 321892
|