| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
comments. NFC
llvm-svn: 262301
|
|
|
|
|
|
|
| |
This fixes regressions exposed in existing AMDGPU tests in a
future commit when all loads are custom lowered.
llvm-svn: 262299
|
|
|
|
|
|
|
|
| |
Technically you aren't supposed to emit these after type legalization
for some reason, and we use vector extracts of bitcasted integers
as the canonical way to do this.
llvm-svn: 262298
|
|
|
|
| |
llvm-svn: 262297
|
|
|
|
|
|
|
|
|
|
| |
This currently has no control over the bit width, and the
optimizations that would reduce the integer to 32 bits where
possible are still missing.
But in most situations we do want the sinking to occur.
llvm-svn: 262296
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The CatchObjOffset is relative to the end of the EH registration node
for 32-bit x86 WinEH targets. A special sentinel value, 0, is used to
indicate that no catch object should be initialized.
This means that a catch object allocated immediately before the
registration node would be assigned a CatchObjOffset of 0, leading the
runtime to believe that a catch object should not be initialized.
To handle this, allocate the registration node prior to any other frame
object. This will ensure that catch objects will not be allocated
before the registration node.
This fixes PR26757.
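
To illustrate the sentinel collision described above, here is a minimal sketch in C. It is a toy model, not the actual frame lowering code: the Frame type, the helper names, and the object sizes are all hypothetical, and it assumes a downward-growing frame in which each new object is placed directly below the previous one.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a downward-growing stack frame. `lowest` is the
   lowest offset handed out so far (0 means the frame is still empty). */
typedef struct {
    int lowest;
} Frame;

/* Reserve `size` bytes below everything allocated so far and return the
   start offset of the new object. */
static int frame_alloc(Frame *frame, int size) {
    assert(size > 0);
    frame->lowest -= size;
    return frame->lowest;
}

/* Offset of the catch object relative to the end of the registration node. */
static int catch_obj_offset(int catch_obj_start, int reg_node_start,
                            int reg_node_size) {
    return catch_obj_start - (reg_node_start + reg_node_size);
}

/* The runtime treats offset 0 as "no catch object to initialize". */
bool runtime_initializes_catch_obj(int offset) {
    return offset != 0;
}

enum { REG_NODE_SIZE = 24, CATCH_OBJ_SIZE = 8 }; /* made-up sizes */

/* Old order: the catch object is allocated right before the node. */
int offset_catch_obj_first(void) {
    Frame frame = {0};
    int catch_obj = frame_alloc(&frame, CATCH_OBJ_SIZE);
    int reg_node = frame_alloc(&frame, REG_NODE_SIZE);
    return catch_obj_offset(catch_obj, reg_node, REG_NODE_SIZE);
}

/* New order: the registration node is allocated before any other object. */
int offset_reg_node_first(void) {
    Frame frame = {0};
    int reg_node = frame_alloc(&frame, REG_NODE_SIZE);
    int catch_obj = frame_alloc(&frame, CATCH_OBJ_SIZE);
    return catch_obj_offset(catch_obj, reg_node, REG_NODE_SIZE);
}
```

In this model the old order yields an offset equal to the sentinel, so the catch object would be skipped, while allocating the registration node first always yields a nonzero (negative) offset.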
Differential Revision: http://reviews.llvm.org/D17689
llvm-svn: 262294
|
|
|
|
|
|
|
|
| |
Generally speaking, this can only happen with unreachable code.
However, neglecting to check for this condition would lead us to loop
forever.
llvm-svn: 262284
|
|
|
|
| |
llvm-svn: 262280
|
|
|
|
| |
llvm-svn: 262279
|
|
|
|
|
|
|
| |
Continuation of:
http://reviews.llvm.org/rL262269
llvm-svn: 262273
|
|
|
|
| |
llvm-svn: 262270
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The intended effect of this patch in conjunction with:
http://reviews.llvm.org/rL259392
http://reviews.llvm.org/rL260145
is that customers using the AVX intrinsics in C will benefit from combines when
the load mask is constant:
__m128 mload_zeros(float *f) {
  return _mm_maskload_ps(f, _mm_set1_epi32(0));
}
__m128 mload_fakeones(float *f) {
  return _mm_maskload_ps(f, _mm_set1_epi32(1));
}
__m128 mload_ones(float *f) {
  return _mm_maskload_ps(f, _mm_set1_epi32(0x80000000));
}
__m128 mload_oneset(float *f) {
  return _mm_maskload_ps(f, _mm_set_epi32(0x80000000, 0, 0, 0));
}
...so none of the above will actually generate a masked load for optimized code.
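
The reason these masks are foldable is that only the most significant bit of each mask element selects a lane. The following scalar model is a sketch of that rule, not the combine itself; maskload4_model and active_lanes are hypothetical names, and the mask arrays are written with element 0 first (so the mask from _mm_set_epi32(0x80000000, 0, 0, 0) has its set element at index 3).

```c
#include <stdint.h>

/* Scalar model of a 4-lane masked float load: a lane is read only when the
   most significant bit of its mask element is set; other lanes become 0. */
void maskload4_model(const float *src, const uint32_t mask[4], float dst[4]) {
    for (int lane = 0; lane < 4; ++lane) {
        dst[lane] = (mask[lane] & 0x80000000u) ? src[lane] : 0.0f;
    }
}

/* Number of lanes a constant mask would actually load. */
int active_lanes(const uint32_t mask[4]) {
    int count = 0;
    for (int lane = 0; lane < 4; ++lane) {
        count += (int)((mask[lane] >> 31) & 1u);
    }
    return count;
}
```

Under this model the all-zero mask and the all-ones-in-the-low-bit mask load nothing, the all-sign-bits mask is equivalent to an ordinary full load, and the last mask touches a single element.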
This is the masked load counterpart to:
http://reviews.llvm.org/rL262064
llvm-svn: 262269
|
|
|
|
|
|
|
| |
This change makes the verifier a little more paranoid. It was possible
to trick the verifier into crashing or looping forever.
llvm-svn: 262268
|
|
|
|
|
|
|
|
|
| |
We can actually have dependences between accesses with different
underlying types. Bail in this case.
A test will follow shortly.
llvm-svn: 262267
|
|
|
|
|
|
|
|
| |
http://reviews.llvm.org/D9979
Patch by Richard Thomson (and some conflict resolution by me).
llvm-svn: 262266
|
|
|
|
| |
llvm-svn: 262265
|
|
|
|
| |
llvm-svn: 262264
|
|
|
|
|
|
|
|
|
| |
Combinations of suffixes that look useful are actually ignored;
complaining about them will avoid mistakes.
Differential Revision: http://reviews.llvm.org/D17587
llvm-svn: 262263
|
|
|
|
| |
llvm-svn: 262262
|
|
|
|
|
|
| |
variants since they're usually in range.
llvm-svn: 262258
|
|
|
|
| |
llvm-svn: 262252
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
I re-benchmarked this and results are similar to original results in
D13259:
On ARM64:
SingleSource/Benchmarks/Polybench/linear-algebra/solvers/dynprog -59.27%
SingleSource/Benchmarks/Polybench/stencils/adi -19.78%
On x86:
SingleSource/Benchmarks/Polybench/linear-algebra/solvers/dynprog -27.14%
And of course the original ~20% gain on SPECint_2006/456.hmmer with Loop
Distribution.
In terms of compile time, there is ~5% increase on both
SingleSource/Benchmarks/Misc/oourafft and
SingleSource/Benchmarks/Linkpack/linkpack-pc. These are both very tiny
loop-intensive programs where SCEV computations dominate compile time.
The reason that time spent in SCEV increases has to do with the design
of the old pass manager. If a transform pass does not preserve an
analysis we *invalidate* the analysis even if there was *no*
modification made by the transform pass.
This means that currently we don't take advantage of LLE and LV sharing
the same analysis (LAA) and unfortunately we recompute LAA *and* SCEV
for LLE.
(There should be a way to work around this limitation in the case of
SCEV and LAA since both compute things on demand and internally cache
their result. Thus we could pretend that transform passes preserve
these analyses and manually invalidate them upon actual modification.
On the other hand, the new pass manager is supposed to solve this, so I am not
sure if this is worthwhile.)
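
The invalidation behavior described above can be sketched with a toy model. Everything here is hypothetical (the Analysis and TransformPass types, the function names); it only counts how often a cached analysis is recomputed when two passes that share it run back to back without modifying anything.

```c
#include <stdbool.h>

/* Toy cached analysis: recomputed only when it is not valid. */
typedef struct {
    bool valid;
    int computations;
} Analysis;

static void get_analysis(Analysis *analysis) {
    if (!analysis->valid) {
        analysis->computations++;
        analysis->valid = true;
    }
}

/* Toy transform pass: what it declares, and what it actually does. */
typedef struct {
    bool preserves_analysis; /* declared up front */
    bool modifies_ir;        /* what happened on this run */
} TransformPass;

/* Policy described above: invalidate unless preservation was declared,
   even if the pass changed nothing. */
static void run_invalidate_unless_preserved(Analysis *a, const TransformPass *p) {
    get_analysis(a);
    if (!p->preserves_analysis) a->valid = false;
}

/* Possible workaround: invalidate only upon actual modification. */
static void run_invalidate_on_change(Analysis *a, const TransformPass *p) {
    get_analysis(a);
    if (p->modifies_ir) a->valid = false;
}

/* Run two passes that share the analysis and change nothing; return how
   many times the analysis had to be computed. */
int count_computations(bool invalidate_only_on_change) {
    Analysis shared = {false, 0};
    TransformPass first = {false, false};
    TransformPass second = {false, false};
    if (invalidate_only_on_change) {
        run_invalidate_on_change(&shared, &first);
        run_invalidate_on_change(&shared, &second);
    } else {
        run_invalidate_unless_preserved(&shared, &first);
        run_invalidate_unless_preserved(&shared, &second);
    }
    return shared.computations;
}
```

In this model the first policy computes the shared analysis twice and the second computes it once, which is the kind of redundant work the compile-time numbers above reflect.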
Reviewers: hfinkel, dberlin
Subscribers: dberlin, reames, mssimpso, aemerson, joker.eph, llvm-commits
Differential Revision: http://reviews.llvm.org/D16300
llvm-svn: 262250
|
|
|
|
| |
llvm-svn: 262249
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: t.p.northover, jmolloy
Subscribers: mcrosier, aemerson, llvm-commits, rengolin
Differential Revision: http://reviews.llvm.org/D17463
llvm-svn: 262248
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a variable is described by a single DBG_VALUE instruction we can
often use a more efficient inline DW_AT_location instead of using a
location list.
This commit makes the heuristic that decides when to apply this
optimization stricter by also verifying that the DBG_VALUE is live at the
entry of the function (instead of just checking that it is valid until
the end of the function).
<rdar://problem/24611008>
llvm-svn: 262247
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Rename the section that embeds bitcode from ".llvmbc,.llvmbc" to "__LLVM,__bitcode".
The new name matches the MachO section naming convention.
Reviewers: rafael, pcc
Subscribers: davide, llvm-commits, joker.eph
Differential Revision: http://reviews.llvm.org/D17388
llvm-svn: 262245
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is long-standing dirtiness, as acknowledged by r77582:
The current trick is to select it into a merge_values with
the first definition being an implicit_def. The proper solution is
to add new ISD opcodes for the no-output variant.
Doing this before selection will let us combine away some constructs.
Differential Revision: http://reviews.llvm.org/D17659
llvm-svn: 262244
|
|
|
|
| |
llvm-svn: 262243
|
|
|
|
| |
llvm-svn: 262242
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
32-bit X86 EH on Windows utilizes a stack of registration nodes
allocated and deallocated on entry/exit. A registration node contains a
bunch of EH personality specific information like which try-state we are
currently in.
Because a setjmp target allows control flow from arbitrary program
points, there is no way to ensure that the try-state we are in is
correctly updated once we transfer control.
MSVC compatible compilers, like MSVC and ICC, utilize runtime helpers to
reinitialize the try-state when a longjmp occurs. This is implemented
by adding additional arguments to _setjmp3: the desired try-state and
a helper routine to update the try-state.
Differential Revision: http://reviews.llvm.org/D17721
llvm-svn: 262241
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: The discriminator is now assigned per function instead of per module.
Reviewers: davidxl, dnovillo
Subscribers: dblaikie, llvm-commits
Differential Revision: http://reviews.llvm.org/D17664
llvm-svn: 262240
|
|
|
|
| |
llvm-svn: 262238
|
|
|
|
| |
llvm-svn: 262236
|
|
|
|
|
|
|
|
|
|
| |
Corresponds to Phabricator review:
http://reviews.llvm.org/D16592
This fix includes both an update to how we handle the "generic" CPU on LE
systems as well as Anton's fix for the Fast Isel issue.
llvm-svn: 262233
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The bug was that dextu's operand 3 would print 0-31 instead of 32-63 when
printing assembly. This came up when replacing
MipsInstPrinter::printUnsignedImm() with a version that could handle arbitrary
bit widths.
MipsAsmPrinter::printUnsignedImm*() don't seem to be used so they have been
removed.
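
As a sketch of the printing issue, assume (per my reading of the MIPS64 ISA) that dextu's 5-bit position field stores the bit position minus 32. Printing the raw field then gives 0-31, while the assembly operand should be 32-63. The function names below are hypothetical and are not the MipsInstPrinter API.

```c
#include <assert.h>

/* Assumed encoding: the 5-bit field holds (pos - 32). */
unsigned dextu_pos_from_field(unsigned field) {
    assert(field < 32);
    return field + 32; /* value that should appear in the assembly */
}

/* Inverse mapping, from the assembly operand back to the encoded field. */
unsigned dextu_field_from_pos(unsigned pos) {
    assert(pos >= 32 && pos < 64);
    return pos - 32;
}
```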
Reviewers: vkalintiris
Subscribers: dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D15521
llvm-svn: 262231
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: dsanders
Subscribers: dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D15420
llvm-svn: 262230
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Previously, isel would always select DEXT and replace any invalid matches
with DEXTU/DEXTM during MipsMCCodeEmitter::encodeInstruction(). This works
but causes problems when adding range-checked immediates to IAS.
Now isel selects the correct variant up front.
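
A minimal sketch of choosing the variant up front, assuming the operand ranges as I understand them from the MIPS64 ISA: DEXT for pos 0-31 with size 1-32, DEXTM for pos 0-31 with size 33-64, and DEXTU for pos 32-63 with size 1-32. The enum and function are hypothetical and do not reflect the actual isel patterns.

```c
typedef enum { SEL_NONE, SEL_DEXT, SEL_DEXTM, SEL_DEXTU } DextVariant;

/* Pick the extract variant for a bit field starting at `pos` that is
   `size` bits wide inside a 64-bit register. */
DextVariant select_dext_variant(unsigned pos, unsigned size) {
    if (size == 0 || pos > 63 || pos + size > 64) {
        return SEL_NONE; /* field does not fit in the register */
    }
    if (pos >= 32) {
        return SEL_DEXTU; /* the fit check above already implies size <= 32 */
    }
    if (size > 32) {
        return SEL_DEXTM;
    }
    return SEL_DEXT;
}
```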
Reviewers: vkalintiris
Subscribers: dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D16810
llvm-svn: 262229
|
|
|
|
| |
llvm-svn: 262222
|
|
|
|
|
|
|
| |
I accidentally removed this in r262212 but there was no test coverage to
detect it.
llvm-svn: 262215
|
|
|
|
|
|
|
|
|
|
|
|
| |
compare-to-immediate-and-branch macros.
Reviewers: vkalintiris
Subscribers: llvm-commits, vkalintiris, dim, seanbruno, dsanders
Differential Revision: http://reviews.llvm.org/D15369
llvm-svn: 262213
|
|
|
|
|
|
|
|
| |
are ignored.
Only allow fsub -0.0, (fsub -0.0, X) ==> X without nsz. PR26746.
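
The distinction matters for signed zeros. The sketch below, written in plain C under default IEEE 754 semantics (no fast-math options), compares the -0.0 form with a +0.0 form; the function names are hypothetical.

```c
#include <math.h>
#include <stdbool.h>

/* -0.0 - (-0.0 - x): returns x for every non-NaN x, including both zeros. */
double double_sub_from_negzero(double x) {
    return -0.0 - (-0.0 - x);
}

/* 0.0 - (0.0 - x): loses the sign when x is -0.0. */
double double_sub_from_poszero(double x) {
    return 0.0 - (0.0 - x);
}

/* True when both values compare equal and carry the same sign bit. */
bool same_value_and_sign(double a, double b) {
    return a == b && (signbit(a) != 0) == (signbit(b) != 0);
}
```

With x = -0.0 the first form gives -0.0 back, while the second gives +0.0, so only the -0.0 form can be folded to X when signed zeros must be respected.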
llvm-svn: 262212
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: hans
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D17070
llvm-svn: 262211
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
in the PassBuilder.
These are really just stubs for now, but they give a nice API surface
that Clang or other tools can start learning about and enabling for
experimentation.
I've also wired up parsing various synthetic module pass names to
generate these set pipelines. This allows the pipelines to be combined
with other passes and have their order controlled, with clear separation
between the *kind* of canned pipeline, and the *level* of optimization
to be used within that canned pipeline.
The most interesting part of this patch is almost certainly the spec for
the different optimization levels. I don't think we can ever have hard
and fast rules that would make it easy to determine whether a particular
optimization makes sense at a particular level -- it will always be in
large part a judgement call. But hopefully this will outline the
expected rationale that should be used, and the direction that the
pipelines should be taken. Much of this was based on a long llvm-dev
discussion I started years ago to try and crystalize the intent behind
these pipelines, and now, at long long last I'm returning to the task of
actually writing it down somewhere that we can cite and try to be
consistent with.
Differential Revision: http://reviews.llvm.org/D12826
llvm-svn: 262196
|
|
|
|
|
|
|
|
| |
for clang.
char AnalysisBase::ID should be declared as extern and defined in one module.
llvm-svn: 262188
|
|
|
|
| |
llvm-svn: 262187
|
|
|
|
|
|
|
|
| |
tweaks."
I'll rework soon.
llvm-svn: 262186
|
|
|
|
|
|
| |
char AnalysisBase::ID should be declared as extern and defined in one module.
llvm-svn: 262185
|
|
|
|
|
|
| |
Operand order seems to have changed, the new one is nicer.
llvm-svn: 262180
|
|
|
|
|
|
| |
More API churn, experimental target got sad.
llvm-svn: 262179
|
|
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D17684
llvm-svn: 262176
|