| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
| |
In getEntryCost(), make the scalar type for a compare instruction that of the
operands, not i1. This is needed in order to call getCmpSelInstrCost() for a
compare in a sensible way, the same way as the LoopVectorizer does.
New test: test/Transforms/SLPVectorizer/SystemZ/SLP-cmp-cost-query.ll
Review: Matthew Simpson
https://reviews.llvm.org/D31601
llvm-svn: 300061
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The cost for a branch after vectorization is very different depending on if
the vectorizer will if-convert the block (branch is eliminated), or if
scalarized and predicated blocks will be produced (branch duplicated before
each block). There is also the case of remaining scalar branches, such as the
back-edge branch.
This patch handles these cases differently with TTI based cost estimates.
Review: Matthew Simpson
https://reviews.llvm.org/D31175
llvm-svn: 300058
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since SystemZ supports vector element load/store instructions, there is no
need for extracts/inserts if a vector load/store gets scalarized.
This patch lets Target specify that it supports such instructions by means of
a new TTI hook that defaults to false.
The use for this is in the LoopVectorizer getScalarizationOverhead() method,
which will with this patch produce a smaller sum for a vector load/store on
SystemZ.
New test: test/Transforms/LoopVectorize/SystemZ/load-store-scalarization-cost.ll
Review: Adam Nemet
https://reviews.llvm.org/D30680
llvm-svn: 300056
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
getArithmeticInstrCost(), getShuffleCost(), getCastInstrCost(),
getCmpSelInstrCost(), getVectorInstrCost(), getMemoryOpCost(),
getInterleavedMemoryOpCost() implemented.
Interleaved access vectorization enabled.
BasicTTIImpl::getCastInstrCost() improved to check for legal extending loads,
in which case the cost of the z/sext instruction becomes 0.
Review: Ulrich Weigand, Renato Golin.
https://reviews.llvm.org/D29631
llvm-svn: 300052
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Dead basic blocks may be forming a loop, for which SSA form is
fulfilled, but with a circular def-use chain. LoadCombine could
enter an infinite loop when analysing such dead code. This patch
solves the problem by simply avoiding to analyse all basic blocks
that aren't forward reachable, from function entry, in LoadCombine.
Fixes https://bugs.llvm.org/show_bug.cgi?id=27065
Reviewers: mehdi_amini, chandlerc, grosser, Bigcheese, davide
Reviewed By: davide
Subscribers: dberlin, zzheng, bjope, grandinj, Ka-Ka, materi, jholewinski, llvm-commits, mzolotukhin
Differential Revision: https://reviews.llvm.org/D31032
llvm-svn: 300034
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
and to expose a handle to represent the actual case rather than having
the iterator return a reference to itself.
All of this allows the iterator to be used with common STL facilities,
standard algorithms, etc.
Doing this exposed some missing facilities in the iterator facade that
I've fixed and required some work to the actual iterator to fully
support the necessary API.
Differential Revision: https://reviews.llvm.org/D31548
llvm-svn: 300032
|
|
|
|
|
|
| |
code. NFC
llvm-svn: 300030
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
COFF requires that every comdat contain a symbol with the same name as
the comdat. ThinLTOBitcodeWriter renames symbols, which may cause this
requirement to be violated. This change avoids such violations by
renaming comdats if their leaders are renamed. It also keeps comdats
together when splitting modules.
Reviewers: pcc, mehdi_amini, tejohnson
Reviewed By: pcc
Subscribers: rnk, Prazek, llvm-commits
Differential Revision: https://reviews.llvm.org/D31963
llvm-svn: 300019
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
For now, it just wraps AttributeSetNode*. Eventually, it will hold
AvailableAttrs as an inline bitset, and adding and removing enum
attributes will be super cheap.
This sinks AttributeSetNode back down to lib/IR/AttributeImpl.h.
Reviewers: pete, chandlerc
Subscribers: llvm-commits, jfb
Differential Revision: https://reviews.llvm.org/D31940
llvm-svn: 300014
|
|
|
|
|
|
|
| |
Internal linkage preserves names like "__asan_global_foo" which may
account to 2% of unstripped binary size.
llvm-svn: 299995
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the vectorization of first order recurrence, we vectorize such
that the last element in the vector will be the one extracted to pass into the
scalar remainder loop. However, this is not true when there is a phi (other
than the primary induction variable) is used outside the loop.
In such a case, we need the value from the second last iteration (i.e.
the phi value), not the last iteration (which would be the phi update).
I've added a test case for this. Also see PR32396.
A follow up patch would generate the correct code gen for such cases,
and turn this vectorization on.
Differential Revision: https://reviews.llvm.org/D31910
Reviewers: mssimpso
llvm-svn: 299985
|
|
|
|
|
|
|
|
| |
Analysis, it has Analysis passes, and once NewGVN is made an Analysis,
this removes the cross dependency from Analysis to Transform/Utils.
NFC.
llvm-svn: 299980
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before this patch, pass AddDiscriminators always avoided to assign
discriminators to intrinsic calls. This was done mainly for two reasons:
1) We wanted to minimize the number of based discriminators used.
2) We wanted to avoid non-deterministic discriminator assignment for
different debug levels.
Unfortunately, that approach was problematic for MemIntrinsic calls.
MemIntrinsic calls can be split by SROA into loads and stores, and each new
load/store instruction would obtain the debug location from the original
intrinsic call.
If we don't assign a discriminator to MemIntrinsic calls, then we cannot
correctly set the discriminator for the newly created loads and stores.
This may have a negative impact on the basic block weight computation
performed by the SampleLoader.
This patch fixes the issue by letting MemIntrinsic calls have a discriminator.
Differential Revision: https://reviews.llvm.org/D31900
llvm-svn: 299972
|
|
|
|
| |
llvm-svn: 299970
|
|
|
|
|
|
| |
This removes a TODO in getIdentityValue and may allow some transforms to occur earlier. But I was unable to find any transforms we didn't already handle.
llvm-svn: 299966
|
|
|
|
|
|
| |
This is a candidate culprit for multiple bot fails, so reverting pending investigation.
llvm-svn: 299955
|
|
|
|
|
|
|
|
|
|
|
| |
templates.
From a user prospective, it forces the use of an annoying nullptr to mark the end of the vararg, and there's not type checking on the arguments.
The variadic template is an obvious solution to both issues.
Differential Revision: https://reviews.llvm.org/D31070
llvm-svn: 299949
|
|
|
|
|
|
| |
Turn GVNHoist back on by default now that PR32153 has been fixed.
llvm-svn: 299944
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
In rL299692 I improved strip-dead-debug-info's ability to drop CUs that are not
referenced from the current module. However, in doing so I neglected to realize
that some SPs could be referenced entirely from inlined functions. It appears
I was not the only one to make this mistake, because DebugInfoFinder, doesn't
find those SPs either. Fix this in DebugInfoFinder and then use that to make
sure not to drop those CUs in strip-dead-debug-info.
Reviewers: aprantl
Reviewed By: aprantl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D31904
llvm-svn: 299936
|
|
|
|
|
|
|
| |
This reverts commit r299925 because it broke the buildbots. See e.g.
http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15/builds/6008
llvm-svn: 299928
|
|
|
|
|
|
|
|
|
|
|
|
| |
Module::getOrInsertFunction is using C-style vararg instead of
variadic templates.
From a user prospective, it forces the use of an annoying nullptr
to mark the end of the vararg, and there's not type checking on the
arguments. The variadic template is an obvious solution to both
issues.
llvm-svn: 299925
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Fix coverity cid 1374240
Reviewers: dberlin
Reviewed By: dberlin
Differential Revision: https://reviews.llvm.org/D31928
llvm-svn: 299924
|
|
|
|
|
|
| |
if all the elements are Undef or ConstantInt.
llvm-svn: 299917
|
|
|
|
| |
llvm-svn: 299915
|
|
|
|
|
|
|
|
|
|
|
| |
When allowed, we can hoist a division out of a loop in favor of a
multiplication by the reciprocal. Fixes PR32157.
Patch by vit9696!
Differential Revision: https://reviews.llvm.org/D30819
llvm-svn: 299911
|
|
|
|
|
|
|
|
|
|
| |
have >1 value."
It's not ready yet this was an accidental commit :(
This reverts r299903
llvm-svn: 299904
|
|
|
|
|
|
|
|
| |
value.
Fixes PR 32607.
llvm-svn: 299903
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This re-lands r299875.
I introduced a bug in Clang code responsible for replacing K&R, no
prototype declarations with a real function definition with a prototype.
The bug was here:
// Collect any return attributes from the call.
- if (oldAttrs.hasAttributes(llvm::AttributeList::ReturnIndex))
- newAttrs.push_back(llvm::AttributeList::get(newFn->getContext(),
- oldAttrs.getRetAttributes()));
+ newAttrs.push_back(oldAttrs.getRetAttributes());
Previously getRetAttributes() carried AttributeList::ReturnIndex in its
AttributeList. Now that we return the AttributeSetNode* directly, it no
longer carries that index, and we call this overload with a single node:
AttributeList::get(LLVMContext&, ArrayRef<AttributeSetNode*>)
That aborted with an assertion on x86_32 targets. I added an explicit
triple to the test and added CHECKs to help find issues like this in the
future sooner.
llvm-svn: 299899
|
|
|
|
|
|
|
| |
This Placates GCC7 with -Werror. Also, clang-format the assertions
while I'm here.
llvm-svn: 299895
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D31818
llvm-svn: 299893
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
LLVM makes several assumptions about address space 0. However,
alloca is presently constrained to always return this address space.
There's no real way to avoid using alloca, so without this
there is no way to opt out of these assumptions.
The problematic assumptions include:
- That the pointer size used for the stack is the same size as
the code size pointer, which is also the maximum sized pointer.
- That 0 is an invalid, non-dereferencable pointer value.
These are problems for AMDGPU because alloca is used to
implement the private address space, which uses a 32-bit
index as the pointer value. Other pointers are 64-bit
and behave more like LLVM's notion of generic address
space. By changing the address space used for allocas,
we can change our generic pointer type to be LLVM's generic
pointer type which does have similar properties.
llvm-svn: 299888
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
findCalleeFunctionSamples which is going to be refactored.
Summary: Now the SamplePGO support is more stable, we do not need so many verbose optimization remarks emitted.
Reviewers: dnovillo, davidxl
Reviewed By: davidxl
Subscribers: fhahn, llvm-commits
Differential Revision: https://reviews.llvm.org/D31826
llvm-svn: 299883
|
|
|
|
|
|
|
|
|
|
|
| |
w.r.t. https://bugs.llvm.org/show_bug.cgi?id=32153
The consensus seems to be isGuaranteedToTransferExecutionToSuccessor should be called for each function.
Patch by Aditya Kumar
Differential Revision: https://reviews.llvm.org/D31035
llvm-svn: 299882
|
|
|
|
|
|
| |
This reverts commit r299696, which is causing mysterious test failures.
llvm-svn: 299880
|
|
|
|
|
|
| |
This reverts commit r299697, which caused a big increase in object file size.
llvm-svn: 299879
|
|
|
|
|
|
|
| |
This reverts r299875. A Linux bot came back with a test failure:
http://bb.pgr.jp/builders/test-clang-i686-linux-RA/builds/741/steps/test_clang/logs/Clang%20%3A%3A%20CodeGen__2006-05-19-SingleEltReturn.c
llvm-svn: 299878
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
AttributeList::get(Fn|Ret|Param)Attributes no longer creates a temporary
AttributeList just to hide the AttributeSetNode type.
I've also added a factory method to create AttributeLists from a
parallel array of AttributeSetNodes. I think this simplifies
construction of AttributeLists when rewriting function prototypes.
Previously we would test if a particular index had attributes, and
conditionally add a temporary attribute list to a vector. Now the
attribute set vector is parallel to the argument vector already that
these passes already construct.
My long term vision is to wrap AttributeSetNode* inside an AttributeSet
type that holds the enum attributes, but that will come in a follow up
change.
I haven't done any performance measurements for this change because
profiling hasn't shown that any of the affected code is hot.
Reviewers: pete, chandlerc, sanjoy, hfinkel
Reviewed By: pete
Subscribers: jfb, llvm-commits
Differential Revision: https://reviews.llvm.org/D31198
llvm-svn: 299875
|
|
|
|
| |
llvm-svn: 299871
|
|
|
|
|
|
| |
Patch by James Price
llvm-svn: 299866
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
While we don't want them aliasing with other pointers, there seems to
be no point in not having them clobber must-aliased'd pointers.
If some day, we split the aliasing and ordering chains, we'd make this
not aliasing but an ordering barrier (IE it doesn't affect it's
memory, but we can't hoist it above it).
Reviewers: hfinkel, george.burgess.iv
Subscribers: Prazek, llvm-commits
Differential Revision: https://reviews.llvm.org/D31865
llvm-svn: 299865
|
|
|
|
|
|
|
|
| |
code. Add missing test cases.
In one case I removed commute handling for a multiply with a constant since we'll eventually get the constant on the right hand side.
llvm-svn: 299863
|
|
|
|
|
|
| |
since they were missing. NFC
llvm-svn: 299853
|
|
|
|
|
|
|
|
|
|
|
| |
Also, make the same change in and-of-icmps and remove a hack for detecting that case.
Finally, add some FIXME comments because the code duplication here is awful.
This should fix the remaining IR problem noted in:
https://bugs.llvm.org/show_bug.cgi?id=32524
llvm-svn: 299851
|
|
|
|
|
|
|
|
|
|
| |
select operations
We currently only fold scalar add of constants into selects. This improves this to support vectors too.
Differential Revision: https://reviews.llvm.org/D31683
llvm-svn: 299847
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is my first time using the commutable matchers so wanted to make sure I was doing it right.
Are there any other matcher tricks to further shrink this? Can we commute the whole match so we don't have to LHS and RHS separately?
Reviewers: davide, spatel
Reviewed By: davide
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D31680
llvm-svn: 299840
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
instructions into phi nodes
Summary: I noticed in the select folding code that we copied fast math flags, but did not do the same for the similar handling in phi nodes. This patch fixes that to do the same thing as select
Reviewers: spatel, davide, majnemer, hfinkel
Reviewed By: davide
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D31690
llvm-svn: 299838
|
|
|
|
|
|
| |
transform.
llvm-svn: 299837
|
|
|
|
|
|
|
|
| |
matcher checks in visitXor.
The matchers themselves should be enough.
llvm-svn: 299835
|
|
|
|
|
|
|
|
| |
very similar (A&B)^B -> ~A & B code. This should be NFC except for the addition of hasOneUse check.
I think this code is still overly complicated and should use matchers, but first I wanted to make it consistent.
llvm-svn: 299834
|
|
|
|
| |
llvm-svn: 299833
|