| Commit message (Collapse) | Author | Age | Files | Lines |
|
related. NFC.
Eric has replied and has demanded the patch be reverted.
llvm-svn: 247702
|
and related. NFC.
Summary:
This is the first patch in the series to migrate Triples (which are ambiguous)
to TargetTuples (which aren't).
For the moment, TargetTuple simply passes all requests to the Triple object it
holds. Once it has replaced Triple, it will start to implement the interface in
a more suitable way.
This change alters the public C++ API. In particular,
InitMCSubtargetInfo(), createMCRelocationInfo(), and createMCSymbolizer()
now take TargetTuples instead of Triples. The other public C++ APIs have
been left as-is for the moment to reduce patch size.
This commit also contains a trivial patch to clang to account for the C++ API
change. Thanks go to Pavel Labath for fixing LLDB for me.
Reviewers: rengolin
Subscribers: jyknight, dschuff, arsenm, rampitec, danalbert, srhines, javed.absar, dsanders, echristo, emaste, jholewinski, tberghammer, ted, jfb, llvm-commits, rengolin
Differential Revision: http://reviews.llvm.org/D10969
llvm-svn: 247692
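The forwarding arrangement described in this message (a TargetTuple that passes every request to the Triple it holds) can be illustrated with a minimal sketch. Both classes below are hypothetical stand-ins written for this example, not the real LLVM declarations; the member name GnuTT and the two accessors are assumptions.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for llvm::Triple with only what the sketch needs.
class Triple {
public:
  explicit Triple(std::string Str) : Data(std::move(Str)) {}
  // Returns the text before the first '-', e.g. "x86_64".
  std::string getArchName() const { return Data.substr(0, Data.find('-')); }
  const std::string &str() const { return Data; }

private:
  std::string Data;
};

// Hypothetical TargetTuple: every query is forwarded to the held Triple.
class TargetTuple {
public:
  explicit TargetTuple(const Triple &T) : GnuTT(T) {}
  std::string getArchName() const { return GnuTT.getArchName(); }
  const std::string &str() const { return GnuTT.str(); }

private:
  Triple GnuTT; // assumed member name, for illustration only
};
```

Because callers only see the TargetTuple interface, the forwarding bodies could later be replaced without touching call sites, which is the migration path the message outlines.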
|
LLDB needs to be updated in the same commit.
llvm-svn: 247686
|
Summary:
This is the first patch in the series to migrate Triples (which are ambiguous)
to TargetTuples (which aren't).
For the moment, TargetTuple simply passes all requests to the Triple object it
holds. Once it has replaced Triple, it will start to implement the interface in
a more suitable way.
This change alters the public C++ API. In particular,
InitMCSubtargetInfo(), createMCRelocationInfo(), and createMCSymbolizer()
now take TargetTuples instead of Triples. The other public C++ APIs have
been left as-is for the moment to reduce patch size.
This commit also contains a trivial patch to clang to account for the C++ API
change.
Reviewers: rengolin
Subscribers: jyknight, dschuff, arsenm, rampitec, danalbert, srhines, javed.absar, dsanders, echristo, emaste, jholewinski, tberghammer, ted, jfb, llvm-commits, rengolin
Differential Revision: http://reviews.llvm.org/D10969
llvm-svn: 247683
|
Summary: This fixes a variety of typos in docs, code and headers.
Subscribers: jholewinski, sanjoy, arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D12626
llvm-svn: 247495
|
small. NFC.
llvm-svn: 247357
|
llvm-svn: 247345
|
llvm-svn: 247344
|
llvm-svn: 247230
|
This will be caught by existing tests with a
verifier check to be added in a future commit.
llvm-svn: 247229
|
Instead of extracting both 32-bit components from the 128-bit
register. This produces fewer copies and is easier for
the copy peephole optimizer to understand and see the actual uses
as extracts from a reg_sequence.
This avoids needing to handle subregister composing in the
PeepholeOptimizer's ValueTracker for this case.
llvm-svn: 247162
|
llvm-svn: 247161
|
Summary:
This helps mostly when we use add instructions for address calculations
that contain immediates.
Reviewers: arsenm
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D12256
llvm-svn: 247157
|
Broken by r247074. Should include an assembler test,
but the assembler is currently broken for VOP3b apparently.
llvm-svn: 247123
|
Currently this hits an assert that extload should
always be supported, which assumes integer extloads.
This moves a hack out of SI's argument lowering and
is covered by existing tests.
llvm-svn: 247113
|
Adds vcc to output string input for e32. Allows option
of using e64 encoding with assembler.
Also fixes these instructions not implicitly reading exec.
llvm-svn: 247074
|
These were marked as WriteSALU, which is low latency.
I'm guessing at the value to use, but it should probably
be considered the highest latency instruction.
I'm not sure this has any actual effect since hasSideEffects
probably is preventing any moving of these.
llvm-svn: 247060
|
This should be convergent. This is not a
barrier in the isBarrier sense, nor
hasCtrlDep.
llvm-svn: 247059
|
sub C, x -> add (sub 0, x), C for DS offsets.
This is mostly to fix regressions that show up when
SeparateConstOffsetFromGEP is enabled.
llvm-svn: 247054
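The rewrite in this message relies on a plain arithmetic identity: C - x equals (0 - x) + C under wrapping 32-bit arithmetic, which leaves the constant C as a trailing addend. A small sketch of the identity follows; the function names are made up for illustration, and the reading that the constant can then serve as the DS offset is an interpretation of the message rather than something the sketch demonstrates.

```cpp
#include <cstdint>

// Original form: C - x.
constexpr uint32_t subForm(uint32_t C, uint32_t X) { return C - X; }

// Rewritten form: (0 - x) + C, with the constant as the final addend.
constexpr uint32_t addForm(uint32_t C, uint32_t X) { return (0u - X) + C; }

// Both forms agree, including when the subtraction wraps around.
static_assert(subForm(16, 4) == addForm(16, 4), "forms agree");
static_assert(subForm(4, 16) == addForm(4, 16), "forms agree when wrapping");
```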
|
Use and check the 'IsFast' optional parameter to TLI.allowsMemoryAccess() any time
we have a merged access candidate. Without this patch, we were generating unaligned
16-byte (SSE) memops for x86 targets where those accesses are slow.
This change was mentioned in:
http://reviews.llvm.org/D10662 and
http://reviews.llvm.org/D10905
and will help solve PR21711.
Differential Revision: http://reviews.llvm.org/D12573
llvm-svn: 246771
|
These are already added during the MachineInstr construction,
so this was adding the implicit registers twice.
llvm-svn: 246525
|
The VOP3 encoding of these allows any SGPR pair for the i1
output, but this was forced before to always use vcc.
This doesn't yet try to use this, but does add the operand
to the definitions so the main change is adding vcc to the
output of the VOP2 encoding.
llvm-svn: 246358
|
llvm-svn: 246357
|
Without a memory operand, mayLoad or mayStore instructions
are treated as hasUnorderedMemRef, which results in much worse
scheduling.
We really should have a verifier check that any
non-side effecting mayLoad or mayStore has a memory operand.
For a few instructions (interp and images), I'm not sure
what memory operands to add or where to add them.
llvm-svn: 246356
|
Summary:
We were assuming that if the use operand had a sub-register then
the immediate was 64 bits, but this was breaking the case of
folding a 64-bit immediate into another 64-bit instruction.
Reviewers: arsenm
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D12255
llvm-svn: 246354
|
Reviewers: arsenm
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D12254
llvm-svn: 246353
|
There is no context where s_mov_b64 is emitted
and could potentially be moved to the VALU.
It is currently only emitted for materializing
immediates, which can't be dependent on vector sources.
The immediate splitting is already done when selecting
constants. I'm not sure in what contexts, if any, the register
splitting would have been used before.
Also clean up using s_mov_b64 in place of v_mov_b64_pseudo,
although this isn't required and just skips the extra step
of eliminating the copy from the SReg_64.
llvm-svn: 246080
|
llvm-svn: 246079
|
This wouldn't propagate to users of the original BFE
and would hit a verifier error.
llvm-svn: 246078
|
When splitting 64-bit operations, create the correct
VALU instructions immediately.
This was splitting things like s_or_b64 into the two
s_or_b32s and then pushing the new instructions
onto the worklist. There's no reason we need
to do this intermediate step.
llvm-svn: 246077
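The splitting this message refers to works because a 64-bit bitwise operation is the same as two independent 32-bit operations on the low and high halves. The sketch below shows only that arithmetic for OR; the function name is hypothetical, and it does not model the worklist or instruction creation in the actual pass.

```cpp
#include <cstdint>

// Computes a 64-bit OR from two independent 32-bit ORs, one per half.
uint64_t or64ViaHalves(uint64_t A, uint64_t B) {
  uint32_t Lo = static_cast<uint32_t>(A) | static_cast<uint32_t>(B);
  uint32_t Hi = static_cast<uint32_t>(A >> 32) | static_cast<uint32_t>(B >> 32);
  // Reassemble the two halves into the 64-bit result.
  return (static_cast<uint64_t>(Hi) << 32) | Lo;
}
```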
|
llvm-svn: 246056
|
I think this could potentially have broken if
one of the super registers were allocated
that contain v254/v255.
llvm-svn: 246051
|
llvm-svn: 246048
|
Although the basic s_load_* instructions happen to use the same
opcode, some of the special case SMRD instructions have
different opcodes.
llvm-svn: 245775
|
llvm-svn: 245774
|
llvm-svn: 245772
|
llvm-svn: 245769
|
llvm-svn: 245768
|
There are still a couple of CI patterns left in SIInstructions.
llvm-svn: 245767
|
The main change is inverting the condition for the
operand classes so that VT.Size == 16 uses VGPR_32
instead of 64.
llvm-svn: 245764
|
We can wait on either VM, EXP or LGKM.
The waits are independent.
Without this patch, a wait inserted because of one of them
would also wait for all the previous others.
This patch makes s_wait only wait for the ones we need for the next
instruction.
Here's an example of the subtle performance loss this patch fixes:
This is without the patch:
buffer_load_format_xyzw v[8:11], v0, s[44:47], 0 idxen
buffer_load_format_xyzw v[12:15], v0, s[48:51], 0 idxen
s_load_dwordx4 s[44:47], s[8:9], 0xc
s_waitcnt lgkmcnt(0)
buffer_load_format_xyzw v[16:19], v0, s[52:55], 0 idxen
s_load_dwordx4 s[48:51], s[8:9], 0x10
s_waitcnt vmcnt(1)
buffer_load_format_xyzw v[20:23], v0, s[44:47], 0 idxen
The s_waitcnt vmcnt(1) is useless.
It is added because the last
buffer_load_format_xyzw needs s[44:47], which was loaded
by the first s_load_dwordx4. The wait covers all VM operations
issued before that load.
Internally, 3 counters (for VM, EXP and LGKM) are updated after every
instruction. For example, buffer_load_format_xyzw will increase the VM
counter, and s_load_dwordx4 the LGKM one.
Without the patch, for every defined register,
the current 3 counters are stored, and are used to know
how long to wait when an instruction needs the register.
Because of that, the counters stored for s[44:47] imply that using the
register requires waiting for the previous buffer_load_format_xyzw.
Instead, this patch stores only the counters that matter for the register,
and puts zero for the other ones, since we don't need any wait for them.
Patch by: Axel Davy
Differential Revision: http://reviews.llvm.org/D11883
llvm-svn: 245755
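The bookkeeping change described in this message can be modelled with a small sketch. This is a simplified illustration under stated assumptions, not the code of the actual pass: WaitTracker, define, and mustWait are hypothetical names, registers are plain integers, and a stored value of zero is taken to mean "no wait needed" as the message says.

```cpp
#include <array>
#include <map>

// Simplified model; the names below are illustrative, not the pass's own.
enum CounterKind { VM = 0, EXP = 1, LGKM = 2 };
using Counters = std::array<unsigned, 3>;

struct WaitTracker {
  bool OnlyRelevant;                 // true models the behaviour with the patch
  Counters Issued{};                 // operations issued so far, per counter
  std::map<int, Counters> DefinedAt; // per register: values stored at its def

  explicit WaitTracker(bool OnlyRelevant) : OnlyRelevant(OnlyRelevant) {}

  // Record an instruction that bumps counter K and defines register Reg.
  void define(int Reg, CounterKind K) {
    ++Issued[K];
    Counters Stored{}; // zero means "no wait needed on this counter"
    if (OnlyRelevant)
      Stored[K] = Issued[K]; // keep only the counter that matters
    else
      Stored = Issued;       // old behaviour: snapshot all three counters
    DefinedAt[Reg] = Stored;
  }

  // True if a user of Reg has to wait on counter K.
  bool mustWait(int Reg, CounterKind K) const {
    auto It = DefinedAt.find(Reg);
    return It != DefinedAt.end() && It->second[K] != 0;
  }
};
```

In this model, a VM load followed by an LGKM load into register 44 makes a user of register 44 wait on VM when all counters are stored, but only on LGKM when just the relevant counter is kept.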
|
This removes the isPow2SDivCheap() query, as it is not currently used in
any meaningful way. isIntDivCheap() no longer relies on a state variable
(as all in-tree targets set it to false), but the interface allows querying
based on the type and optimization level.
NFC.
Differential Revision: http://reviews.llvm.org/D12082
llvm-svn: 245430
|
This method checks whether a physical register or any of its aliases are
used in the function.
Using this function in SIRegisterInfo::findUnusedReg() should also fix
this reported failure:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20150803/292143.html
http://reviews.llvm.org/rL242173#inline-533
The report doesn't come with a testcase and I don't know enough about
AMDGPU to create one myself.
llvm-svn: 245329
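The check this message describes (is a physical register, or any register aliasing it, used in the function) can be sketched as follows. The alias table, the integer register ids, and the function name are all assumptions made for this illustration; the real method and its data structures are not shown in the message.

```cpp
#include <map>
#include <set>
#include <vector>

// Hypothetical model: registers are ints, Aliases maps a register to the
// registers that overlap it, and Used holds registers seen in the function.
bool isRegOrAliasUsed(int Reg, const std::map<int, std::vector<int>> &Aliases,
                      const std::set<int> &Used) {
  if (Used.count(Reg))
    return true; // the register itself is used
  auto It = Aliases.find(Reg);
  if (It == Aliases.end())
    return false; // no known aliases
  for (int Alias : It->second)
    if (Used.count(Alias))
      return true; // an overlapping register is used
  return false;
}
```

A search for an unused register that tests only the register itself could pick one whose overlapping super-register is already in use, which is the kind of failure an alias-aware check is meant to avoid.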
|
llvm-svn: 245173
|
When trying to fix SGPR live ranges, skip defs that are
killed in the same block as the def. I don't think
we need to worry about these cases as long as the
live ranges of the SGPRs in dominating blocks are
correct.
This reduces the number of elements the second
loop over the function needs to look at, and makes
it generally easier to understand. The second loop
also only considers if the live range is live
in to a block, which logically means it
must have been live out from another.
llvm-svn: 245150
|
function.
This was the same as getFrameIndexReference, but without the FrameReg
output.
Differential Revision: http://reviews.llvm.org/D12042
llvm-svn: 245148
|
The comments at the bottom would all report 0 if
amdhsa was used.
llvm-svn: 245135
|
This is simple but won't work if/when this pass
is moved to be post-SSA.
llvm-svn: 245134
|
Does not mark SlotIndexes as reserved, although I think
that might be OK.
LiveVariables still need to be handled.
llvm-svn: 245133
|
These shouldn't ever be null. The number of successors
was already asserted to be 2.
llvm-svn: 245132
|