| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
| |
llvm-svn: 328838
|
| |
|
|
|
|
|
|
|
|
| |
Sometimes the operand comes after the memory operand so we need 5 ReadDefaults first.
I suspect we also need to do something for the mask operand for masked avx512 instructions? I'm not sure if the mask should be ReadAfterLd or not since it can mask faults. If it shouldn't be ReadAfterLd then we're probably wrong for zero masking instructions already.
Differential Revision: https://reviews.llvm.org/D44726
llvm-svn: 328834
|
| |
|
|
| |
llvm-svn: 328832
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
While the stack access instructions don't care about
alignment > 4, some transformations on the pointer calculation
do make assumptions based on knowing the low bits of a pointer
are 0. If a stack object ends up being accessed through its
absolute address (relative to the kernel scratch wave offset),
the addressing expression may depend on the stack frame being
properly aligned. This was breaking in a testcase due to the
add->or combine.
I think some of the SP/FP handling logic is still backwards,
and overly simplistic to support all of the stack features.
Code which tries to modify the SP with inline asm for example
or variable sized objects will probably require redoing this.
llvm-svn: 328831
|
| |
|
|
|
|
|
|
|
|
| |
register operand in their memory form
The memory form of these instructions only read an input from memory. They don't have any register operands.
Differential Revision: https://reviews.llvm.org/D44836
llvm-svn: 328828
|
| |
|
|
|
|
|
|
|
|
| |
SchedRW for BEXTR/BZHI.
These instructions have the memory operand before the register operand. So we need to put ReadDefault for all the load ops first. Then the ReadAfterLd
Differential Revision: https://reviews.llvm.org/D44838
llvm-svn: 328823
|
| |
|
|
|
|
|
| |
8 and 16-byte values are common, so increase the default
alignment to avoid realigning the stack in most functions.
llvm-svn: 328821
|
| |
|
|
| |
llvm-svn: 328818
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.
To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.
Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort.
Refer the comments section in D44363 for a list of all the required patches.
Reviewers: sdardis, RKSimon, dsanders, atanasyan
Reviewed By: atanasyan
Subscribers: atanasyan, arichardson, llvm-commits
Differential Revision: https://reviews.llvm.org/D44869
llvm-svn: 328815
|
| |
|
|
|
|
|
|
|
|
|
|
| |
CodeGen layer.
Currently EVT is in the IR layer only because of Function.cpp needing a very small piece of the functionality of EVT::getEVTString(). The rest of EVT is used in codegen making CodeGen a better place for it.
The previous code converted a Type* to EVT and then called getEVTString. This was only expected to handle the primitive types from Type*. Since there only a few primitive types, we can just print them as strings directly.
Differential Revision: https://reviews.llvm.org/D45017
llvm-svn: 328806
|
| |
|
|
|
|
| |
Patch by Sumanth Gundapaneni.
llvm-svn: 328774
|
| |
|
|
|
|
|
|
|
|
| |
We are re-adding all the bitcasts, constant masks and target shuffles to the work list for no apparent gain.
Found while investigating adding SimplifyDemandedVectorElts to target shuffles.
Differential Revision: https://reviews.llvm.org/D44942
llvm-svn: 328771
|
| |
|
|
|
|
|
|
|
|
|
|
| |
I believe the role of ehDataReg has been replaced by MipsABIInfo::GetEhDataReg, thus removing the dead code.
Patch By: Wei-Ren Chen.
Reviewers: ehostunreach, sdardis
Differential Revision: https://reviews.llvm.org/D44867
llvm-svn: 328767
|
| |
|
|
|
|
|
|
| |
EmitAnyX86InstComments. Just always use the function from the ATTPrinter. NFC
The IntelPrinter and the ATTPrinter produce the same strings for the same input. We already use the ATTPrinter explicitly in several other places.
llvm-svn: 328762
|
| |
|
|
|
|
|
|
| |
it. NFC
This feels more consistent with the other classes. We don't need to say _NOREX if we didn't start it with an R in the first place.
llvm-svn: 328757
|
| |
|
|
|
|
| |
EVEX shouldn't inherit from VEX and EVEX_4V shouldn't inherit from VEX_4V.
llvm-svn: 328756
|
| |
|
|
| |
llvm-svn: 328745
|
| |
|
|
|
|
|
|
| |
dependency
Thanks to echristo for the pointers on direction.
llvm-svn: 328737
|
| |
|
|
|
|
|
|
|
|
| |
scheduler regexs.
Most of these were optional matches at the end of the strings, but since the strings themselves are prefix matches by default you don't need to check for something optional at the end.
I've left the 'b' on memory instructions where it means 'broadcast' because I'm not sure those really have the same load latency and we may need to split them explicitly in the future.
llvm-svn: 328730
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These instructions have been around for a long time, but we
haven't supported intrinsics for them. The "new" versions use
the CSx register for the start of the buffer instead of the K
field in the Mx register.
We need to use pseudo instructions for these instructions until
after register allocation. The problem is that these instructions
allocate a M0/CS0 or M1/CS1 pair. But, we can't generate code for
the CSx set-up until after register allocation when the Mx
register has been fixed for the instruction.
There is a related clang patch.
Patch by Brendon Cahoon.
llvm-svn: 328724
|
| |
|
|
|
|
|
|
|
|
| |
for call outlining
This commit simplifies the call outlining logic by removing references to the
Function associated with the callee. To do this, it requires that valid
callee save info is available to the outliner.
llvm-svn: 328719
|
| |
|
|
|
|
|
|
|
| |
declarations amongst Scalar.h and IPO.h
Fixes layering - Transforms/Utils shouldn't depend on including a Scalar
or IPO header, because Scalar and IPO depend on Utils.
llvm-svn: 328717
|
| |
|
|
|
|
|
|
|
| |
See bug 36833: https://bugs.llvm.org/show_bug.cgi?id=36833
Differential Revision: https://reviews.llvm.org/D44779
Reviewers: arsenm, artem.tamazov, timcorringham
llvm-svn: 328713
|
| |
|
|
|
|
|
|
|
| |
See bug 36834: https://bugs.llvm.org/show_bug.cgi?id=36834
Differential Revision: https://reviews.llvm.org/D44795
Reviewers: artem.tamazov, arsenm, timcorringham, nhaehnle
llvm-svn: 328710
|
| |
|
|
|
|
|
|
|
| |
See bug 36835: https://bugs.llvm.org/show_bug.cgi?id=36835
Differential Revision: https://reviews.llvm.org/D44825
Reviewers: artem.tamazov, arsenm, timcorringham
llvm-svn: 328707
|
| |
|
|
|
|
|
|
|
| |
See bug 36836: https://bugs.llvm.org/show_bug.cgi?id=36836
Differential Revision: https://reviews.llvm.org/D44832
Reviewers: artem.tamazov, arsenm, timcorringham
llvm-svn: 328704
|
| |
|
|
|
|
| |
Renamed JWriteFPAY22 to JWriteFCmpY - we've tended to avoid latency based names
llvm-svn: 328701
|
| |
|
|
|
|
|
|
|
|
|
| |
instructions.
Similar to r328694. The number of micro opcodes should be 2 for those
instructions.
This was found when testing AVX code for BtVer2 using llvm-mca.
llvm-svn: 328698
|
| |
|
|
|
|
|
|
|
|
|
| |
This reverts commit 0daf86291d3aa04d3cc280cd0ef24abdb0174981.
It was causing an assert in test/CodeGen/AMDGPU/amdpal.ll only on a
release-with-asserts build. I will resubmit the change when I have fixed
that.
Change-Id: If270594eba27a7dc4076bdeab3fa8e6bfda3288a
llvm-svn: 328695
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
The Jaguar backend natively supports 128-bit data types. Operations on YMM
registers are split into two COPs (complex operations). Each COP consumes a slot
in the dispatch group, and in the reorder buffer.
The scheduling model for Jaguar should mark those instructions as `let
NumMicroOps = 2`.
This was found when testing AVX code for BtVer2 using llvm-mca.
llvm-svn: 328694
|
| |
|
|
|
|
|
|
|
|
| |
Follow up patch of r328313 to support the UseVMOVSR constraint. Removed
some unneeded instructions from the test and removed some stray
comments.
Differential Revision: https://reviews.llvm.org/D44941
llvm-svn: 328691
|
| |
|
|
|
|
|
| |
Currently this seems to only really be used for debug
info.
llvm-svn: 328677
|
| |
|
|
|
|
|
|
|
| |
If an ADRP appears with, say, a CPI operand, we shouldn't outline it.
This moves the check for unsafe operands so that it occurs before the special-case
for ADRPs. Also add a test for outlining ADRPs.
llvm-svn: 328674
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
For OS type AMDPAL, the scratch descriptor is loaded from offset 0 of
the GIT, whose 32 bit pointer is in s0 (s8 for gfx9 merged shaders).
This commit fixes that to use offset 0x10 instead of offset 0 for a
compute shader, per the PAL ABI spec.
Reviewers: kzhuravl, nhaehnle, timcorringham
Subscribers: kzhuravl, wdng, yaxunl, t-tye, llvm-commits, dstuttard, nhaehnle, arsenm
Differential Revision: https://reviews.llvm.org/D44468
Change-Id: I93dffa647758e37f613bb5e0dfca840d82e6d26f
llvm-svn: 328673
|
| |
|
|
| |
llvm-svn: 328667
|
| |
|
|
|
|
|
|
| |
Currently MOVMSK instructions use the WriteVecLogic class, which is a very poor choice given that MOVMSK involves a SSE->GPR transfer.
Differential Revision: https://reviews.llvm.org/D44924
llvm-svn: 328664
|
| |
|
|
|
|
|
|
| |
Before this was not done if the function had no calls in it. This
is still a possible issue with any callable function, regardless
of calls present.
llvm-svn: 328659
|
| |
|
|
|
|
|
| |
Only 4 byte alignment is ever useful, so increasing anything
beyond this may require realigning the stack.
llvm-svn: 328656
|
| |
|
|
|
|
|
|
| |
The combine on a select of a load only triggers for
addrspace 0, and discards the MachinePointerInfo. The
conservative default needs to be used for this.
llvm-svn: 328652
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In a function, s5 is used as the frame base SGPR. If a function
is calling another function, during the call sequence
it is copied to a preserved SGPR and restored.
Before it was possible for the scheduler to move stack operations
before the restore of s5, since there's nothing to associate
a frame index access with the restore.
Add an implicit use of s5 to the adjcallstack pseudo which ends
the call sequence to preven this from happening. I'm not 100%
satisfied with this solution, but I'm not sure what else would be
better.
llvm-svn: 328650
|
| |
|
|
| |
llvm-svn: 328648
|
| |
|
|
|
|
|
| |
The COPY instruction was listed as a 4 cycle instruction.
It is now listed correctly as a 2 cycle ALU instruction.
llvm-svn: 328647
|
| |
|
|
|
|
|
|
| |
This implements a set of TTI functions that the loop vectorizer uses.
The only purpose of this is to enable testing. Auto-vectorization is
disabled by default, enabled by -hexagon-autohvx.
llvm-svn: 328639
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is a canonical way to teach objdump to print the target
symbols for branches when disassembling AArch64 code.
Reviewers: evandro, t.p.northover, espindola
Reviewed By: t.p.northover
Differential Revision: https://reviews.llvm.org/D44851
llvm-svn: 328638
|
| |
|
|
| |
llvm-svn: 328620
|
| |
|
|
|
|
|
|
| |
This patch supports secure PLT mode for PowerPC 32 architecture.
Differential Revision: https://reviews.llvm.org/D42112
llvm-svn: 328617
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
I recently added a new Fixup kind to our fork of LLVM but forgot to add
it to the table in MipsAsmBackend.cpp. With this static_assert the error
would have been caught instead of zero-initializing the array entries for
the new fixups.
Reviewers: sdardis, atanasyan
Reviewed By: atanasyan
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D44895
llvm-svn: 328616
|
| |
|
|
|
|
|
|
| |
Currently CRC32 instructions use the WriteFAdd class, this patch splits them off into their own, at the moment it is still mostly just a duplicate of WriteFAdd but it can now be tweaked on a target by target basis.
Differential Revision: https://reviews.llvm.org/D44647
llvm-svn: 328582
|
| |
|
|
|
|
|
|
| |
In restoreLatency, replace range-for loop with std::find.
Patch by Jyotsna Verma.
llvm-svn: 328574
|
| |
|
|
|
|
| |
Account for the "+i" integer pipe transfer cost (1cy use of JALU0 for GPR PRF write)
llvm-svn: 328573
|