| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently ComputeNumSignBits returns the minimum number of sign bits for all elements of vector data, when we may only be interested in one/some of the elements.
This patch adds a DemandedElts argument that allows us to specify the elements we actually care about. The original ComputeNumSignBits implementation calls with a DemandedElts demanding all elements to match current behaviour. Scalar types set this to 1.
I've only added support for BUILD_VECTOR and EXTRACT_VECTOR_ELT so far, all others will default to demanding all elements but can be updated in due course.
Followup to D25691.
Differential Revision: https://reviews.llvm.org/D31311
llvm-svn: 299219
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously compiler often extracted common immediates into specific register, e.g.:
```
%vreg0 = S_MOV_B32 0xff;
%vreg2 = V_AND_B32_e32 %vreg0, %vreg1
%vreg4 = V_AND_B32_e32 %vreg0, %vreg3
```
Because of this SDWA peephole failed to find SDWA convertible pattern. E.g. in previous example this could be converted into 2 SDWA src operands:
```
SDWA src: %vreg2 src_sel:BYTE_0
SDWA src: %vreg4 src_sel:BYTE_0
```
With this change peephole check if operand is either immediate or register that is copy of immediate.
llvm-svn: 299202
|
|
|
|
|
|
|
|
|
|
| |
computeKnownBitsForTargetNode
Follow up to D25691, this sets up the plumbing necessary to support vector demanded elements support in known bits calculations in target nodes.
Differential Revision: https://reviews.llvm.org/D31249
llvm-svn: 299201
|
|
|
|
|
|
|
|
| |
What we really want to do is distinguish functions that may
be called by other functions, and graphics shaders are not
called kernels.
llvm-svn: 299140
|
|
|
|
|
|
| |
Add scope, order, isVolatile
llvm-svn: 299122
|
|
|
|
|
|
|
|
|
| |
If set to false it does not remove global aliases. With this parameter
set to false it should be safe to run the pass before link.
Differential Revision: https://reviews.llvm.org/D31489
llvm-svn: 299108
|
|
|
|
|
|
|
|
| |
computeKnownBitsForTargetNode/ComputeNumSignBitsForTargetNode arguments. NFCI.
Based on comment in D31249.
llvm-svn: 298991
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is less important than increase threshold for private memory,
but still brings performance improvements in a wide range of tests.
Unrolling more for local memory serves three purposes: it allows
to combine ds operations if offset becomes static, saves registers
used for offsets in case of static offsets, and allows better lds
latency hiding.
Differential Revision: https://reviews.llvm.org/D31412
llvm-svn: 298948
|
|
|
|
|
|
|
|
|
|
| |
This is incorrect to record region boundaries before scheduling,
it may change after scheduling. As a result second pass may see less
instructions to schedule than it should.
Differential Revision: https://reviews.llvm.org/D31434
llvm-svn: 298945
|
|
|
|
|
|
|
|
|
|
| |
Previously it was covered by the internalization. It turns out we cannot
run internalizer in FE, it break separate compilation tests. Thus early
inliner gets its own option.
Differential Revision: https://reviews.llvm.org/D31429
llvm-svn: 298935
|
|
|
|
|
|
|
|
|
|
| |
Depends on rL298896: MachineScheduler/ScheduleDAG: Add support for GetSubGraph
Patch by Axel Davy (axel.davy@normalesup.org)
Differential revision: https://reviews.llvm.org/D30152
llvm-svn: 298902
|
|
|
|
|
|
|
|
| |
Patch by Axel Davy (axel.davy@normalesup.org)
Differential revision: https://reviews.llvm.org/D30153
llvm-svn: 298872
|
|
|
|
|
|
|
|
| |
Patch by Axel Davy (axel.davy@normalesup.org)
Differential revision: https://reviews.llvm.org/D30150
llvm-svn: 298861
|
|
|
|
|
|
|
|
| |
Patch by Axel Davy (axel.davy@normalesup.org)
Differential revision: https://reviews.llvm.org/D30145
llvm-svn: 298857
|
|
|
|
|
|
|
|
|
|
| |
Enabled clamp and omod for v_cvt_* opcodes which have src0 of an integer type
Reviewers: vpykhtin, arsenm
Differential Revision: https://reviews.llvm.org/D31327
llvm-svn: 298852
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As we introduced target triple environment amdgiz and amdgizcl, the address
space values are no longer enums. We have to decide the value by target triple.
The basic idea is to use struct AMDGPUAS to represent address space values.
For address space values which are not depend on target triple, use static
const members, so that they don't occupy extra memory space and is equivalent
to a compile time constant.
Since the struct is lightweight and cheap, it can be created on the fly at
the point of usage. Or it can be added as member to a pass and created at
the beginning of the run* function.
Differential Revision: https://reviews.llvm.org/D31284
llvm-svn: 298846
|
|
|
|
|
|
|
|
|
|
|
|
| |
Switch data layout by target triple environment amdgiz and amdgizcl indicating using of an address space mapping in which generic address space is 0.
amdgiz is for non-OpenCL environment where generic address space is 0.
amdgizcl is for OpenCL environment where generic address space is 0.
Differential Revision: https://reviews.llvm.org/D31211
llvm-svn: 298758
|
|
|
|
|
|
|
|
| |
If the branch condition for a loop was a phi which itself
was fed from a phi from a loop, it isn't safe to try
to delete the phi until after the loop is handled.
llvm-svn: 298737
|
|
|
|
| |
llvm-svn: 298730
|
|
|
|
|
|
|
|
|
|
| |
StructurizeCFG can't handle cases with multiple
returns creating regions with multiple exits.
Create a copy of UnifyFunctionExitNodes that only
unifies exit nodes that skips exit nodes
with uniform branch sources.
llvm-svn: 298729
|
|
|
|
|
|
|
|
| |
Such instructions sometimes appear after lowering and folding.
Differential Revision: https://reviews.llvm.org/D31318
llvm-svn: 298723
|
|
|
|
| |
llvm-svn: 298722
|
|
|
|
|
|
|
|
| |
Previously it was added only to the BE.
Differential Revision: https://reviews.llvm.org/D31323
llvm-svn: 298721
|
|
|
|
|
|
| |
around that don't have a constexpr std::pair.
llvm-svn: 298719
|
|
|
|
|
|
|
|
| |
Patch by Axel Davy (axel.davy@normalesup.org)
Differential revision: https://reviews.llvm.org/D30382
llvm-svn: 298718
|
|
|
|
|
|
|
|
| |
Patch by Axel Davy (axel.davy@normalesup.org)
Differential revision: https://reviews.llvm.org/D30149
llvm-svn: 298710
|
|
|
|
|
|
|
|
| |
Patch by Axel Davy (axel.davy@normalesup.org)
Differential revision: https://reviews.llvm.org/D30146
llvm-svn: 298708
|
|
|
|
|
|
| |
NFC.
llvm-svn: 298701
|
|
|
|
|
|
|
|
| |
- It was decided to expose this information through other means (rocr)
Differential Revision: https://reviews.llvm.org/D30970
llvm-svn: 298560
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D30969
llvm-svn: 298558
|
|
|
|
|
|
|
|
| |
- These are not required for low level runtime
Differential Revision: https://reviews.llvm.org/D29949
llvm-svn: 298556
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Rename runtime metadata -> code object metadata
- Make metadata not flow
- Switch enums to use ScalarEnumerationTraits
- Cleanup and move AMDGPUCodeObjectMetadata.h to AMDGPU/MCTargetDesc
- Introduce in-memory representation for attributes
- Code object metadata streamer
- Create metadata for isa and printf during EmitStartOfAsmFile
- Create metadata for kernel during EmitFunctionBodyStart
- Finalize and emit metadata to .note during EmitEndOfAsmFile
- Other minor improvements/bug fixes
Differential Revision: https://reviews.llvm.org/D29948
llvm-svn: 298552
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D31258
llvm-svn: 298551
|
|
|
|
|
|
|
|
|
|
| |
Fixed v_mad_i64_i32/u64_u32 encoding
Reviewers: artem.tamazov
Differential Revision: https://reviews.llvm.org/D30828
llvm-svn: 298502
|
|
|
|
| |
llvm-svn: 298454
|
|
|
|
|
|
|
|
| |
This is used for a specific type of return to a shader part's
epilog code. Rename to try avoiding confusion from a true
call's return.
llvm-svn: 298452
|
|
|
|
|
|
|
|
|
|
|
| |
This adds a parameter to @llvm.objectsize that makes it return
conservative values if it's given null.
This fixes PR23277.
Differential Revision: https://reviews.llvm.org/D28494
llvm-svn: 298430
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm
Subscribers: qcolombet, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, dstuttard, tpr
Differential Revision: https://reviews.llvm.org/D31158
llvm-svn: 298397
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, dstuttard, tpr
Differential Revision: https://reviews.llvm.org/D31157
llvm-svn: 298396
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This class is a list of AttributeSetNodes corresponding the function
prototype of a call or function declaration. This class used to be
called ParamAttrListPtr, then AttrListPtr, then AttributeSet. It is
typically accessed by parameter and return value index, so
"AttributeList" seems like a more intuitive name.
Rename AttributeSetImpl to AttributeListImpl to follow suit.
It's useful to rename this class so that we can rename AttributeSetNode
to AttributeSet later. AttributeSet is the set of attributes that apply
to a single function, argument, or return value.
Reviewers: sanjoy, javed.absar, chandlerc, pete
Reviewed By: pete
Subscribers: pete, jholewinski, arsenm, dschuff, mehdi_amini, jfb, nhaehnle, sbc100, void, llvm-commits
Differential Revision: https://reviews.llvm.org/D31102
llvm-svn: 298393
|
|
|
|
| |
llvm-svn: 298390
|
|
|
|
|
|
| |
Fold these to undef during lowering so users get eliminated.
llvm-svn: 298387
|
|
|
|
|
|
| |
Differential revision: https://reviews.llvm.org/D31046
llvm-svn: 298368
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
First iteration of SDWA peephole.
This pass tries to combine several instruction into one SDWA instruction. E.g. it converts:
'''
V_LSHRREV_B32_e32 %vreg0, 16, %vreg1
V_ADD_I32_e32 %vreg2, %vreg0, %vreg3
V_LSHLREV_B32_e32 %vreg4, 16, %vreg2
'''
Into:
'''
V_ADD_I32_sdwa %vreg4, %vreg1, %vreg3 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:WORD_1 src1_sel:DWORD
'''
Pass structure:
1. Iterate over machine instruction in basic block and try to apply "SDWA patterns" to each of them. SDWA patterns match machine instruction into either source or destination SDWA operand. E.g. ''' V_LSHRREV_B32_e32 %vreg0, 16, %vreg1''' is matched to source SDWA operand '''%vreg1 src_sel:WORD_1'''.
2. Iterate over found SDWA operands and find instruction that could be potentially coverted into SDWA. E.g. for source SDWA operand potential instruction are all instruction in this basic block that uses '''%vreg0'''
3. Iterate over all potential instructions and check if they can be converted into SDWA.
4. Convert instructions to SDWA.
This review contains basic implementation of SDWA peephole pass. This pass requires additional testing fot both correctness and performance (no performance testing done).
There are several ways this pass can be improved:
1. Make this pass work on whole function not only basic block. As I can see this can be done right now without changes to pass.
2. Introduce more SDWA patterns
3. Introduce mnemonics to limit when SDWA patterns should apply
Reviewers: vpykhtin, alex-t, arsenm, rampitec
Subscribers: wdng, nhaehnle, mgorny
Differential Revision: https://reviews.llvm.org/D30038
llvm-svn: 298365
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D31141
llvm-svn: 298281
|
|
|
|
|
|
|
|
|
|
| |
This fix enables sp3 abs modifier with constants
Reviewers: artem.tamazov
Differential Revision: https://reviews.llvm.org/D30825
llvm-svn: 298265
|
|
|
|
|
|
|
|
|
|
| |
Fixed several related issues with VOP3 fp modifiers.
Reviewers: artem.tamazov
Differential Revision: https://reviews.llvm.org/D30821
llvm-svn: 298255
|
|
|
|
|
|
| |
This reverts commit r297958, it breaks device-libs build.
llvm-svn: 298239
|
|
|
|
|
|
| |
labels". NFCI.
llvm-svn: 298225
|
|
|
|
|
|
|
|
|
| |
This is direct port of HSAILAliasAnalysis pass, just cleaned for
style and renamed.
Differential Revision: https://reviews.llvm.org/D31103
llvm-svn: 298172
|