| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
| |
Per Table 1-1 in October 2017 edition of Intel® Architecture Instruction Set Extensions and Future Features
llvm-svn: 321501
|
| |
|
|
|
|
|
|
| |
My original implementation ran as a DAG combine post type legalization, but it turns out we don't run that DAG combine step if type legalization didn't change anything. Attempts to make the combine run before type legalization as well hit other issues.
So just do it in LowerMUL where we can catch more cases.
llvm-svn: 321496
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
r319980 added new patterns to the machine combiner for transforming (fsub (fmul
x y) z) into (fmla (fneg z) x y). That is, fsub's where the first source
operand is an fmul are transformed. We previously only matched the case where
the second source operand of an fsub was an fmul, transforming (fsub z (fmul x
y)) into (fmls z x y). Now, if we have an fsub where both source operands are
fmuls, both of the above patterns are applicable.
However, the order in which we add the patterns to the list of candidates
determines the transformation that takes place, since only the first pattern
that matches will be used. This patch changes the order these two patterns are
added to the list of candidates such that we prefer the case where the second
source operand is an fmul (the fmls case), rather than the other one (the
fmla/fneg case). When both source operands are fmuls, this ordering results in
fewer instructions.
Differential Revision: https://reviews.llvm.org/D41587
llvm-svn: 321491
|
| |
|
|
|
|
| |
v8i32 is legal von AVX1, but it doesn't have pmuludq for it.
llvm-svn: 321490
|
| |
|
|
|
|
|
|
| |
Returning SDValue() means nothing changed, SDValue(N,0) means there was a change but the worklist management was taken care of.
I don't know if this has a real effect other than making sure the combine counter in the DAG combiner gets updated, but it is the correct thing to do.
llvm-svn: 321463
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D41579
llvm-svn: 321459
|
| |
|
|
| |
llvm-svn: 321452
|
| |
|
|
|
|
| |
SSE/AVX counterparts.
llvm-svn: 321451
|
| |
|
|
| |
llvm-svn: 321450
|
| |
|
|
|
|
|
|
|
|
| |
upper bits are all sign bits or zeros.
Normally we catch this during lowering, but vXi64 mul is considered legal when we have AVX512DQ.
This DAG combine allows us to avoid PMULLQ with AVX512DQ if we can prove its unnecessary. PMULLQ is 3 uops that take 4 cycles each. While pmuldq/pmuludq is only one 4 cycle uop.
llvm-svn: 321437
|
| |
|
|
| |
llvm-svn: 321433
|
| |
|
|
| |
llvm-svn: 321432
|
| |
|
|
| |
llvm-svn: 321425
|
| |
|
|
|
|
|
|
| |
(PR21160, PR34080, PR34454).
Match regular x87 memory fold instructions with load/sideeffects tags, to prevent the schedulers from re-ordering them across the fnstcw/fldcw sequences for truncating stores while they are still pseudo during the stack conversion pass.
llvm-svn: 321424
|
| |
|
|
|
|
| |
Previously we extended v2i1 to v2f64 and then tried to use cvtuqq2pd/cvtqq2pd, but that only works with avx512dq. So we ended up scalarizing it. Now we widen to v4i1 first and extend to v4i32.
llvm-svn: 321420
|
| |
|
|
|
|
| |
consistent with the other AVX512 ISAs.
llvm-svn: 321416
|
| |
|
|
|
|
| |
RHS not just all zeros/ones.
llvm-svn: 321415
|
| |
|
|
|
|
| |
This can help AVX-512 code where mask types are legal allowing us to remove extends and truncates to/from mask types.
llvm-svn: 321408
|
| |
|
|
|
|
|
|
| |
truncate on N1.
Later in the code we explicitly bypass the truncate so we should be checking its type to make sure that it's safe.
llvm-svn: 321407
|
| |
|
|
|
|
|
|
| |
Immediately after it is created we check if its equal to another EVT. Then we inconsistently use one or the other variables in the code below.
Instead do the equality check directly on the getValueType result and remove the variable. Use the origina VT variable throughout the remaining code.
llvm-svn: 321406
|
| |
|
|
| |
llvm-svn: 321405
|
| |
|
|
|
|
| |
This will be used to help tidyup existing pseudos that we've added scheduling info to.
llvm-svn: 321401
|
| |
|
|
|
|
| |
Apparently we don't have tests for this which I didn't realize before. I'll try to fix that but wanted to fix the obvious bug.
llvm-svn: 321399
|
| |
|
|
| |
llvm-svn: 321398
|
| |
|
|
|
|
|
|
|
|
| |
to get the type of the operand.
getOperand returns an SDValue that contains the node and the result number. There is no guarantee that the result number if 0. By using the -> operator we are calling SDNode::getValueType rather than SDValue::getValueType. This requires supplying a result number and we shouldn't assume it was 0.
I don't have a test case. Just noticed while cleaning up some other code and saw that it occurred in other places.
llvm-svn: 321397
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Re-land r321234. It had to be reverted because it broke the shared
library build. The shared library build broke because there was a
missing LLVMBuild dependency from lib/Passes (which calls
TargetMachine::getTargetIRAnalysis) to lib/Target. As far as I can
tell, this problem was always there but was somehow masked
before (perhaps because TargetMachine::getTargetIRAnalysis was a
virtual function).
Original commit message:
This makes the TargetMachine interface a bit simpler. We still need
the std::function in TargetIRAnalysis to avoid having to add a
dependency from Analysis to Target.
See discussion:
http://lists.llvm.org/pipermail/llvm-dev/2017-December/119749.html
I avoided adding all of the backend owners to this review since the
change is simple, but let me know if you feel differently about this.
Reviewers: echristo, MatzeB, hfinkel
Reviewed By: hfinkel
Subscribers: jholewinski, jfb, arsenm, dschuff, mcrosier, sdardis, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, llvm-commits
Differential Revision: https://reviews.llvm.org/D41464
llvm-svn: 321375
|
| |
|
|
|
|
|
|
|
|
| |
See bug 35716: https://bugs.llvm.org/show_bug.cgi?id=35716
Reviewers: artem.tamazov, arsenm
Differential Revision: https://reviews.llvm.org/D41488
llvm-svn: 321372
|
| |
|
|
|
|
|
|
| |
non-constant index just use either a 128-bit type or the vXi8 type with the correct number of elements.
Despite what the comment said there isn't better codegen for 512-bit vectors. The 128/256/512 bit implementation jus stores to memory and loads an element. There's no advantage to doing that with a larger size. In fact in many cases it causes a stack realignment and generates worse code.
llvm-svn: 321369
|
| |
|
|
|
|
| |
Fix some inconsistent new line behavior and only print the FrameIndex when the address mode is a FrameIndexBase addressing mode.
llvm-svn: 321368
|
| |
|
|
|
|
|
|
|
|
| |
See bug 35645: https://bugs.llvm.org/show_bug.cgi?id=35645
Reviewers: artem.tamazov, arsenm
Differential Revision: https://reviews.llvm.org/D41186
llvm-svn: 321367
|
| |
|
|
|
|
|
|
|
|
|
|
| |
See bug 35561: https://bugs.llvm.org/show_bug.cgi?id=35561
This patch also affects implementation of SGPR and VGPR registers though changes are cosmetic.
Reviewers: artem.tamazov, arsenm
Differential Revision: https://reviews.llvm.org/D41437
llvm-svn: 321359
|
| |
|
|
|
|
|
| |
Mark conversions between pointers and 32-bit scalars as legal, map them
to the GPR and select to a simple COPY.
llvm-svn: 321356
|
| |
|
|
|
|
|
|
|
|
| |
Pointer constants are pretty rare, since we usually represent them as
integer constants and then cast to pointer. One notable exception is the
null pointer constant, which is represented directly as a G_CONSTANT 0
with pointer type. Mark it as legal and make sure it is selected like
any other integer constant.
llvm-svn: 321354
|
| |
|
|
| |
llvm-svn: 321340
|
| |
|
|
| |
llvm-svn: 321336
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
for prefetch instructions.
Previously prefetch was only considered legal if sse was enabled, but it should be supported with 3dnow as well.
The prfchw flag now imply at least some form of prefetch without the write hint is available, either the sse or 3dnow version. This is true even if 3dnow and sse are explicitly disabled.
Similarly prefetchwt1 feature implies availability of prefetchw and the the prefetcht0/1/2/nta instructions. This way we can support _MM_HINT_ET0 using prefetchw and _MM_HINT_ET1 with prefetchwt1. And its assumed that if we have levels for the write hint we would have levels for the non-write hint, thus why we enable the sse prefetch instructions.
I believe this behavior is consistent with gcc. I've updated the prefetch.ll to test all of these combinations.
llvm-svn: 321335
|
| |
|
|
| |
llvm-svn: 321334
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
The penalty is currently getting applied in a bunch of places where it
doesn't make sense, like bitcasts (which are free) and calls (which
were getting the call penalty applied twice). Instead, just apply the
penalty to binary operators and floating-point casts.
While I'm here, also fix getFPOpCost() to do the right thing in more
cases, so we don't have to dig into function attributes.
Differential Revision: https://reviews.llvm.org/D41522
llvm-svn: 321332
|
| |
|
|
|
|
|
|
| |
extract_vector_elt from vXi1 with a non-const index.
We have a better range of instructions we can use if we can fill with the value i1 value rather than zeroing.
llvm-svn: 321315
|
| |
|
|
|
|
|
|
| |
512-bit if we have VLX.
This should only affect what we do for v8i16. Previously we went to v8i64, but if we have VLX we only need v8i32. This prevents an unnecessary zmm usage.
llvm-svn: 321303
|
| |
|
|
|
|
|
|
| |
We should have equally good shuffle options for v8i32 with VLX. This was spotted during my attempts to remove 512-bit vectors from SKX.
We still use 512-bits for v16i1, v32i1, and v64i1. I'm less sure we can handle those well with narrower vectors. i32 and i64 element sizes get the best shuffle support.
llvm-svn: 321291
|
| |
|
|
|
|
|
|
| |
Patch to allow detectAVGPattern handle vectors larger than the legal size (128 SSE2, 256 AVX2, 512 AVX512BW), splitting the vectors accordingly.
Differential Revision: https://reviews.llvm.org/D41440
llvm-svn: 321288
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The build failure was caused by an assertion in pre-legalization DAGCombine:
Combining: t6: ppcf128 = uint_to_fp t5
... into: t20: f32 = PPCISD::FCFIDUS t19
which is clearly wrong since ppcf128 are definitely different type with f32 and
we cannot change the node value type when do DAGCombine. The fix is don't
handle ppc_fp128 or i1 conversions in PPCTargetLowering::combineFPToIntToFP and
leave it to downstream to legalize it and expand it to small legal types.
Differential Revision: https://reviews.llvm.org/D41411
llvm-svn: 321276
|
| |
|
|
|
|
|
|
| |
Implement MC support for the Armv8-R 'Data Full Barrier' instruction.
Differential Revision: https://reviews.llvm.org/D41430
llvm-svn: 321256
|
| |
|
|
|
|
| |
PSHUFB has the ability to implicitly 0 elements which VPERMI2W can't do. So give a chance to use it first.
llvm-svn: 321251
|
| |
|
|
|
|
| |
otherwise used two PSHUFBs ORed together.
llvm-svn: 321249
|
| |
|
|
|
|
| |
supported.
llvm-svn: 321248
|
| |
|
|
|
|
| |
This reverts commit r321234. It breaks the -DBUILD_SHARED_LIBS=ON build.
llvm-svn: 321243
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This makes the TargetMachine interface a bit simpler. We still need
the std::function in TargetIRAnalysis to avoid having to add a
dependency from Analysis to Target.
See discussion:
http://lists.llvm.org/pipermail/llvm-dev/2017-December/119749.html
I avoided adding all of the backend owners to this review since the
change is simple, but let me know if you feel differently about this.
Reviewers: echristo, MatzeB, hfinkel
Reviewed By: hfinkel
Subscribers: jholewinski, jfb, arsenm, dschuff, mcrosier, sdardis, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, llvm-commits
Differential Revision: https://reviews.llvm.org/D41464
llvm-svn: 321234
|
| |
|
|
| |
llvm-svn: 321233
|