| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
| |
Tests added contain splat-masks with undef elements.
llvm-svn: 299988
|
| |
|
|
|
|
|
|
|
| |
Check if the scale operand is identical (doesn't have to be 1) and
do not check the chaain operand.
Differential revision: https://reviews.llvm.org/D31833
llvm-svn: 299986
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the vectorization of first order recurrence, we vectorize such
that the last element in the vector will be the one extracted to pass into the
scalar remainder loop. However, this is not true when there is a phi (other
than the primary induction variable) is used outside the loop.
In such a case, we need the value from the second last iteration (i.e.
the phi value), not the last iteration (which would be the phi update).
I've added a test case for this. Also see PR32396.
A follow up patch would generate the correct code gen for such cases,
and turn this vectorization on.
Differential Revision: https://reviews.llvm.org/D31910
Reviewers: mssimpso
llvm-svn: 299985
|
| |
|
|
| |
llvm-svn: 299984
|
| |
|
|
|
|
|
|
| |
Analysis, it has Analysis passes, and once NewGVN is made an Analysis,
this removes the cross dependency from Analysis to Transform/Utils.
NFC.
llvm-svn: 299980
|
| |
|
|
|
|
|
|
|
|
|
|
| |
If you run llc -stop-after=codegenprepare and feed the resulting MIR
to llc -start-after=codegenprepare, you'll have an empty machine
function since we haven't run any isel yet. Of course, this only works
if the MIRParser believes you that this is okay.
This is essentially a revert of r241862 with a fix for the problem it
was papering over.
llvm-svn: 299975
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D31911
llvm-svn: 299973
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before this patch, pass AddDiscriminators always avoided to assign
discriminators to intrinsic calls. This was done mainly for two reasons:
1) We wanted to minimize the number of based discriminators used.
2) We wanted to avoid non-deterministic discriminator assignment for
different debug levels.
Unfortunately, that approach was problematic for MemIntrinsic calls.
MemIntrinsic calls can be split by SROA into loads and stores, and each new
load/store instruction would obtain the debug location from the original
intrinsic call.
If we don't assign a discriminator to MemIntrinsic calls, then we cannot
correctly set the discriminator for the newly created loads and stores.
This may have a negative impact on the basic block weight computation
performed by the SampleLoader.
This patch fixes the issue by letting MemIntrinsic calls have a discriminator.
Differential Revision: https://reviews.llvm.org/D31900
llvm-svn: 299972
|
| |
|
|
| |
llvm-svn: 299971
|
| |
|
|
| |
llvm-svn: 299969
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Move LTO::run() to a "run" subcommand so that we can introduce new subcommands
for testing different parts of the LTO implementation.
This doesn't use llvm::cl subcommands because it doesn't appear to be currently
possible to pass an argument not associated with a subcommand to a subcommand
(e.g. -lto-use-new-pm, -mcpu=yonah).
Differential Revision: https://reviews.llvm.org/D31410
llvm-svn: 299967
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D31589
llvm-svn: 299964
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This lets PDB readers lookup type record data by type index in O(log n)
time. It also enables makes `cvdump -t` work on PDBs produced by LLD.
cvdump will not dump a PDB that doesn't have an index-to-offset table.
The table is sorted by type index, and has an entry every 8KB. Looking
up a type record by index is a binary search of this table, followed by
a scan of at most 8KB.
Reviewers: ruiu, zturner, inglorion
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D31636
llvm-svn: 299958
|
| |
|
|
|
|
| |
This is a candidate culprit for multiple bot fails, so reverting pending investigation.
llvm-svn: 299955
|
| |
|
|
|
|
| |
Turn GVNHoist back on by default now that PR32153 has been fixed.
llvm-svn: 299944
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
In rL299692 I improved strip-dead-debug-info's ability to drop CUs that are not
referenced from the current module. However, in doing so I neglected to realize
that some SPs could be referenced entirely from inlined functions. It appears
I was not the only one to make this mistake, because DebugInfoFinder, doesn't
find those SPs either. Fix this in DebugInfoFinder and then use that to make
sure not to drop those CUs in strip-dead-debug-info.
Reviewers: aprantl
Reviewed By: aprantl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D31904
llvm-svn: 299936
|
| |
|
|
|
|
|
|
|
| |
Use the same handling in the generic legalizer code as for the other
libcalls (G_FREM, G_FPOW).
Enable it on ARM for float and double so we can test it.
llvm-svn: 299931
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Legalize only if the type is marked as Legal or Custom. If not, return Unsupported as LegalizerHelper is not able to handle non-power-of-2 types right now.
Reviewers: qcolombet, aditya_nandakumar, dsanders, t.p.northover, kristof.beyls, javed.absar, ab
Reviewed By: kristof.beyls, ab
Subscribers: dberris, rovka, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D31711
llvm-svn: 299929
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A fix for the bug reported in PR30911.
The issue arises when multiple CALLSEQ_BEGIN nodes are unscheduled as
the last node to be unscheduled will gain access to the CallResource
register. But when a node is being picked, only CALLSEQ_END nodes are
checked against the CallResource and have their chains evaluated.
This then means that other CALLSEQ_BEGIN nodes can be scheduled
before the existing call sequence has been finalised. This patch adds
a check against the FrameSetup nodes in DelayForLiveRegs to prevent
this from happening.
Differential Revision: https://reviews.llvm.org/D31536
llvm-svn: 299926
|
| |
|
|
| |
llvm-svn: 299915
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(h/t to Chandler for pointing this out)
The test in question was not at all testing what it was supposed to
test. We do not //care// about placing `!make.implicit` in inner
constant branch (since it will be folded away anyway). We care about
placing `!make.implicit` in the outer branch that switches between
either version of the loop.
Having said that, it is _correct_ to leave behind the `!make.implicit`
in the inner branch, but there is no need to do so.
llvm-svn: 299912
|
| |
|
|
|
|
|
|
|
|
|
| |
When allowed, we can hoist a division out of a loop in favor of a
multiplication by the reciprocal. Fixes PR32157.
Patch by vit9696!
Differential Revision: https://reviews.llvm.org/D30819
llvm-svn: 299911
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Check the legality of ISD::[US]MULO to see whether
Intrinsic::[us]mul_with_overflow will legalize into a function call (and, thus,
will use the CTR register). Fixes PR32485.
Patch by Tim Neumann!
Differential Revision: https://reviews.llvm.org/D31790
llvm-svn: 299910
|
| |
|
|
| |
llvm-svn: 299897
|
| |
|
|
|
|
|
|
|
|
| |
The math works out where it can actually be counter-productive. The probability
calculations correctly handle the case where the alternative is 0 probability,
rely on those calculations.
Includes a test case that demonstrates the problem.
llvm-svn: 299892
|
| |
|
|
|
|
|
| |
Qin may be large, and Succ may be more frequent than BB. Take these both into
account when deciding if tail-duplication is profitable.
llvm-svn: 299891
|
| |
|
|
|
|
|
|
| |
Merging identical blocks when it doesn't reduce fallthrough. It is common for
the blocks created from critical edge splitting to be identical. We would like
to merge these blocks whenever doing so would not reduce fallthrough.
llvm-svn: 299890
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
LLVM makes several assumptions about address space 0. However,
alloca is presently constrained to always return this address space.
There's no real way to avoid using alloca, so without this
there is no way to opt out of these assumptions.
The problematic assumptions include:
- That the pointer size used for the stack is the same size as
the code size pointer, which is also the maximum sized pointer.
- That 0 is an invalid, non-dereferencable pointer value.
These are problems for AMDGPU because alloca is used to
implement the private address space, which uses a 32-bit
index as the pointer value. Other pointers are 64-bit
and behave more like LLVM's notion of generic address
space. By changing the address space used for allocas,
we can change our generic pointer type to be LLVM's generic
pointer type which does have similar properties.
llvm-svn: 299888
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
findCalleeFunctionSamples which is going to be refactored.
Summary: Now the SamplePGO support is more stable, we do not need so many verbose optimization remarks emitted.
Reviewers: dnovillo, davidxl
Reviewed By: davidxl
Subscribers: fhahn, llvm-commits
Differential Revision: https://reviews.llvm.org/D31826
llvm-svn: 299883
|
| |
|
|
|
|
|
|
|
|
|
| |
w.r.t. https://bugs.llvm.org/show_bug.cgi?id=32153
The consensus seems to be isGuaranteedToTransferExecutionToSuccessor should be called for each function.
Patch by Aditya Kumar
Differential Revision: https://reviews.llvm.org/D31035
llvm-svn: 299882
|
| |
|
|
|
|
| |
This reverts commit r299696, which is causing mysterious test failures.
llvm-svn: 299880
|
| |
|
|
|
|
| |
This reverts commit r299697, which caused a big increase in object file size.
llvm-svn: 299879
|
| |
|
|
|
|
| |
In preparation for allowing allocas to have non-0 addrspace.
llvm-svn: 299876
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When dumping classes, show where padding occurs, and at the end of the
class print statistics about how many bytes total of padding exist in a
class.
Since PDB doesn't specifically contain information about padding, we have
to mimic this by sort of reversing a small portion of the record layout
algorithm (e.g. looking at offsets and sizes and trying to determine
whether something is part of the same field or a new field).
Differential Revision: https://reviews.llvm.org/D31800
llvm-svn: 299869
|
| |
|
|
|
|
| |
Patch by James Price
llvm-svn: 299866
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
While we don't want them aliasing with other pointers, there seems to
be no point in not having them clobber must-aliased'd pointers.
If some day, we split the aliasing and ordering chains, we'd make this
not aliasing but an ordering barrier (IE it doesn't affect it's
memory, but we can't hoist it above it).
Reviewers: hfinkel, george.burgess.iv
Subscribers: Prazek, llvm-commits
Differential Revision: https://reviews.llvm.org/D31865
llvm-svn: 299865
|
| |
|
|
|
|
|
|
|
|
|
| |
This patch refactors and strengthens the type checks performed for interleaved
accesses. The primary functional change is to ensure that the interleaved
accesses have valid element types. The added test cases previously failed
because the element type is f128.
Differential Revision: https://reviews.llvm.org/D31817
llvm-svn: 299864
|
| |
|
|
|
|
|
|
| |
code. Add missing test cases.
In one case I removed commute handling for a multiply with a constant since we'll eventually get the constant on the right hand side.
llvm-svn: 299863
|
| |
|
|
|
|
|
|
|
|
|
|
| |
The unused dummy src2_modifiers is missing, so it crashes
when trying to print it.
I tried to fully remove src2_modifiers, but there are some
irritations in the places where it is converted to mad since
it starts to require modifying use lists while iterating over
them.
llvm-svn: 299861
|
| |
|
|
|
|
| |
since they were missing. NFC
llvm-svn: 299853
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D31754
llvm-svn: 299852
|
| |
|
|
|
|
|
|
|
|
|
| |
Also, make the same change in and-of-icmps and remove a hack for detecting that case.
Finally, add some FIXME comments because the code duplication here is awful.
This should fix the remaining IR problem noted in:
https://bugs.llvm.org/show_bug.cgi?id=32524
llvm-svn: 299851
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Adds support for pointers to arrays, which was missing
* Adds some tests
* Improves consistency of const and volatile qualifiers
* Eliminates non-composable special case code for arrays and function by using
a more general recursive approach
* Has a hack for getting the calling convention into the right spot for
pointer-to-functions
Given the rapid changes happenning in llvm-pdbdump, this may be difficult to
merge.
Differential Revision: https://reviews.llvm.org/D31832
llvm-svn: 299848
|
| |
|
|
|
|
|
|
|
|
| |
select operations
We currently only fold scalar add of constants into selects. This improves this to support vectors too.
Differential Revision: https://reviews.llvm.org/D31683
llvm-svn: 299847
|
| |
|
|
| |
llvm-svn: 299846
|
| |
|
|
|
|
| |
Legalize to a libcall.
llvm-svn: 299841
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
instructions into phi nodes
Summary: I noticed in the select folding code that we copied fast math flags, but did not do the same for the similar handling in phi nodes. This patch fixes that to do the same thing as select
Reviewers: spatel, davide, majnemer, hfinkel
Reviewed By: davide
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D31690
llvm-svn: 299838
|
| |
|
|
|
|
| |
transform.
llvm-svn: 299837
|
| |
|
|
|
|
| |
version of a transform. NFC.
llvm-svn: 299836
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Resolve indirect branch target when possible.
This potentially eliminates more basicblocks and result in better evaluation for phi and other things.
Reviewers: davide, efriedma, sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D30322
llvm-svn: 299830
|