| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
More API churn, experimental target got sad.
llvm-svn: 262179
|
|
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D17684
llvm-svn: 262176
|
|
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D17654
llvm-svn: 262157
|
|
|
|
|
|
| |
Part of PR26753.
llvm-svn: 262154
|
|
|
|
|
|
|
|
|
|
|
|
| |
The maximum private allocation for the whole GPU is 4G,
so the maximum possible index for a single workitem is the
maximum size divided by the smallest granularity for a dispatch.
This increases the number of known zero high bits, which
enables more offset folding. The maximum private size per
workitem with this is 128M but may be smaller still.
llvm-svn: 262153
|
|
|
|
|
|
|
|
| |
InlineSpiller::rematerializeFor() never uses its parameter as an
iterator, so take it by reference instead. This removes an implicit
conversion from MachineBasicBlock::iterator to MachineInstr*.
llvm-svn: 262152
|
|
|
|
|
|
| |
These parameters aren't expected to be null, so take them by reference.
llvm-svn: 262151
|
|
|
|
|
|
|
|
| |
Change MachineInstr API to prefer MachineInstr& over MachineInstr*
whenever the parameter is expected to be non-null. Slowly inching
toward being able to fix PR26753.
llvm-svn: 262149
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the case where op = add, y = base_ptr, and x = offset, this
transform:
(op y, (op x, c1)) -> (op (op x, y), c1)
breaks the canonical form of add by putting the base pointer in the
second operand and the offset in the first.
This fix is important for the R600 target, because for some address
spaces the base pointer and the offset are stored in separate register
classes. The old pattern caused the ISel code for matching addressing
modes to put the base pointer and offset in the wrong register classes,
which required no-trivial code transformations to fix.
llvm-svn: 262148
|
|
|
|
|
|
| |
Also update HashEndOfMBB to take MachineBasicBlock&.
llvm-svn: 262146
|
|
|
|
|
|
|
|
| |
Take parameters as MachineInstr& instead of MachineInstr* in
AntiDepBreaker API, since these are required to be non-null. No
functionality change intended. Looking toward PR26753.
llvm-svn: 262145
|
|
|
|
|
|
|
| |
This already assumes a valid MI, since it dereferences the MI in an
assertion before checking for null. At an explicit assert.
llvm-svn: 262144
|
|
|
|
| |
llvm-svn: 262143
|
|
|
|
|
|
|
|
|
| |
In all but one case, change the DFAPacketizer API to take MachineInstr&
instead of MachineInstr*. In DFAPacketizer::endPacket(), take
MachineBasicBlock::iterator. Besides cleaning up the API, this is in
search of PR26753.
llvm-svn: 262142
|
|
|
|
|
|
|
|
| |
Update APIs in MachineInstrBundle.h to take and return MachineInstr&
instead of MachineInstr* when the instruction cannot be null. Besides
being a nice cleanup, this is tacking toward a fix for PR26753.
llvm-svn: 262141
|
|
|
|
|
|
| |
It was broken by the work for PR26753.
llvm-svn: 262140
|
|
|
|
|
|
| |
This reverts commit r262103, as it broke all ARM and AArch64 bots.
llvm-svn: 262139
|
|
|
|
|
|
| |
NFCI.
llvm-svn: 262137
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
registers supported by the Sparc v8 manual.
These are all co-processor registers, with the exception of the floating-point deferred-trap queue register.
Although these will not be lowered automatically by any instructions, it allows the use of co-processor
instructions implemented by inline-assembly.
Code Reviewed at http://reviews.llvm.org/D17133, with the exception of a very small change in brace placement in SparcInstrInfo.td,
which was formerly causing a problem in the disassembly of the %fq register.
llvm-svn: 262133
|
|
|
|
| |
llvm-svn: 262131
|
|
|
|
|
|
| |
PassManager and AnalysisManager template specializations as well.
llvm-svn: 262128
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
manager proxies and use those rather than repeating their definition
four times.
There are real differences between the two directions: outer AMs are
const and don't need to have invalidation tracked. But every proxy in
a particular direction is identical except for the analysis manager type
and the IR unit they proxy into. This makes them prime candidates for
nice templates.
I've started introducing explicit template instantiation declarations
and definitions as well because we really shouldn't be emitting all this
everywhere. I'm going to go back and add the same for the other
templates like this in a follow-up patch.
I've left the analysis manager as an opaque type rather than using two
IR units and requiring it to be an AnalysisManager template
specialization. I think its important that users retain the ability to
provide their own custom analysis management layer and provided it has
the appropriate API everything should Just Work.
llvm-svn: 262127
|
|
|
|
|
|
| |
This is OK for +0 since compares to +/-0 give the same result.
llvm-svn: 262125
|
|
|
|
|
|
|
| |
This will be more useful for marking builtins acceptable for which
subtargets.
llvm-svn: 262121
|
|
|
|
| |
llvm-svn: 262120
|
|
|
|
|
|
|
|
|
|
| |
This matches the behavior of the HSAIL clock instruction.
s_realmemtime is used if the subtarget supports it, and falls
back to s_memtime if not.
Also introduces new intrinsics for each of s_memtime / s_memrealtime.
llvm-svn: 262119
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Take MachineInstr by reference instead of by pointer in SlotIndexes and
the SlotIndex wrappers in LiveIntervals. The MachineInstrs here are
never null, so this cleans up the API a bit. It also incidentally
removes a few implicit conversions from MachineInstrBundleIterator to
MachineInstr* (see PR26753).
At a couple of call sites it was convenient to convert to a range-based
for loop over MachineBasicBlock::instr_begin/instr_end, so I added
MachineBasicBlock::instrs.
llvm-svn: 262115
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The PS4 linker seems to handle this fine.
Hi David, it seems that indeed most ELF linkers support
__{start,stop}_SECNAME, as our proprietary linker does as well.
This follows the pattern of r250679 w.r.t. the testing.
Maggie, Phillip, Paul: I've tested this with the PS4 SDK 3.5 toolchain
prerelease and it seems to work fine.
Reviewers: davidxl
Subscribers: probinson, phillip.power, MaggieYi
Differential Revision: http://reviews.llvm.org/D17672
llvm-svn: 262112
|
|
|
|
| |
llvm-svn: 262111
|
|
|
|
|
|
| |
-fsanitize-coverage=trace-pc mode; update libFuzzer doc for previous commit
llvm-svn: 262110
|
|
|
|
| |
llvm-svn: 262109
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
merged into a loop that was subsequently unrolled (or otherwise nuked).
In this case it can't merge in the ASTs for any remaining nested loops,
it needs to re-add their instructions dircetly.
The fix is very isolated, but I've pulled the code for merging blocks
into the AST into a single place in the process. The only behavior
change is in the case which would have crashed before.
This fixes a crash reported by Mikael Holmen on the list after r261316
restored much of the loop pass pipelining and allowed us to actually do
this kind of nested transformation sequenc. I've taken that test case
and further reduced it into the somewhat twisty maze of loops in the
included test case. This does in fact trigger the bug even in this
reduced form.
llvm-svn: 262108
|
|
|
|
| |
llvm-svn: 262106
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Without tree pruning clang has 2,667,552 points.
Wiht only dominators pruning: 1,515,586.
With both dominators & predominators pruning: 1,340,534.
Differential Revision: http://reviews.llvm.org/D17671
llvm-svn: 262103
|
|
|
|
| |
llvm-svn: 262102
|
|
|
|
| |
llvm-svn: 262096
|
|
|
|
|
|
|
| |
We ended up removing a save/restore pair around an inalloca call,
leading to a miscompile in Chromium.
llvm-svn: 262095
|
|
|
|
|
|
| |
assertion failure on AArch64.
llvm-svn: 262091
|
|
|
|
| |
llvm-svn: 262087
|
|
|
|
| |
llvm-svn: 262086
|
|
|
|
|
|
|
|
|
|
| |
Most of this is fairly straight forward. Add handling for min/max via existing matcher utility and ConstantRange routines. Add handling for clamp by exploiting condition constraints on inputs.
Note that I'm only handling two constant ranges at this point. It would be reasonable to consider treating overdefined as a full range if the instruction is typed as an integer, but that should be a separate change.
Differential Revision: http://reviews.llvm.org/D17184
llvm-svn: 262085
|
|
|
|
| |
llvm-svn: 262084
|
|
|
|
| |
llvm-svn: 262083
|
|
|
|
|
|
|
|
| |
This was split off from http://reviews.llvm.org/D17184.
Reviewed by: Sanjoy
llvm-svn: 262080
|
|
|
|
|
|
|
|
| |
Currently we always expand ISD::FNEG. For v4f32 and v2f64 vector types VSX has
native support for this opcode
Phabricator: http://reviews.llvm.org/D17647
llvm-svn: 262079
|
|
|
|
| |
llvm-svn: 262078
|
|
|
|
|
|
|
|
|
| |
Replicate everything for integers...because x86.
Continuation of:
http://reviews.llvm.org/rL262064
llvm-svn: 262077
|
|
|
|
|
|
| |
-fsanitize-coverage=trace-pc. This does not scale well yet, but already cracks FullCoverageSetTest in seconds
llvm-svn: 262073
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change implements the following vsx instructions:
Quad/Double-Precision Compare:
xscmpoqp xscmpuqp
xscmpexpdp xscmpexpqp
xscmpeqdp xscmpgedp xscmpgtdp xscmpnedp
xvcmpnedp(.) xvcmpnesp(.)
Quad-Precision Floating-Point Conversion
xscvqpdp(o) xscvdpqp
xscvqpsdz xscvqpswz xscvqpudz xscvqpuwz xscvsdqp xscvudqp
xscvdphp xscvhpdp xvcvhpsp xvcvsphp
xsrqpi xsrqpix xsrqpxp
28 instructions
Phabricator: http://reviews.llvm.org/D16709
llvm-svn: 262068
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The intended effect of this patch in conjunction with:
http://reviews.llvm.org/rL259392
http://reviews.llvm.org/rL260145
is that customers using the AVX intrinsics in C will benefit from combines when
the store mask is constant:
void mstore_zero_mask(float *f, __m128 v) {
_mm_maskstore_ps(f, _mm_set1_epi32(0), v);
}
void mstore_fake_ones_mask(float *f, __m128 v) {
_mm_maskstore_ps(f, _mm_set1_epi32(1), v);
}
void mstore_ones_mask(float *f, __m128 v) {
_mm_maskstore_ps(f, _mm_set1_epi32(0x80000000), v);
}
void mstore_one_set_elt_mask(float *f, __m128 v) {
_mm_maskstore_ps(f, _mm_set_epi32(0x80000000, 0, 0, 0), v);
}
...so none of the above will actually generate a masked store for optimized code.
Differential Revision: http://reviews.llvm.org/D17485
llvm-svn: 262064
|