| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This modification of the currently unused inter-procedural constant
propagation pass (IPConstantPropagation) shows how abstract call sites
enable optimization of callback calls alongside direct and indirect
calls. Through minimal changes, mostly dealing with the partial mapping
of callbacks, inter-procedural constant propagation was enabled for
callbacks, e.g., OpenMP runtime calls or pthreads_create.
Differential Revision: https://reviews.llvm.org/D56447
llvm-svn: 351628
|
| |
|
|
|
|
| |
Original commit: r351582
llvm-svn: 351626
|
| |
|
|
|
|
|
|
|
|
|
| |
Thanks to Nikita Popov for pointing out this missed case.
This is a follow-up to r351411, which disabled function merging for
vararg functions outright due to a miscompile (see llvm.org/PR40345).
Differential Revision: https://reviews.llvm.org/D56865
llvm-svn: 351624
|
| |
|
|
|
|
|
|
|
|
| |
If an inherently cold function is found, mark it as cold. For now this
means applying the `cold` and `minsize` attributes.
As a drive-by, revisit and clean up the criteria for considering a
function for splitting. Add tests.
llvm-svn: 351623
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
CodeExtractor permits extracting a region of blocks from a function even
when values defined within the region are used outside of it.
This is typically done by creating an alloca in the original function
and reloading the alloca after a call to the extracted function.
Wrap the reload in lifetime start/end markers to promote stack coloring.
Suggested by Sergei Kachkov!
Differential Revision: https://reviews.llvm.org/D56045
llvm-svn: 351621
|
| |
|
|
|
|
|
|
|
|
| |
This reverts commit r351618.
Compiler RT + ASAN tests are failing for PowerPC. Not sure
how would I reproduce these on macOS, so reverting (again)
until I do.
llvm-svn: 351619
|
| |
|
|
|
|
| |
Original commit: r351582
llvm-svn: 351618
|
| |
|
|
|
|
|
|
| |
This reverts commit r351582.
Bots are failing. Reverting this to fix and re-commit later.
llvm-svn: 351598
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Make sure CodeGenPrepare doesn't emit multiple inttoptr instructions of
the same integer value while sinking address computations, but rather
CSEs them on the fly: excessive inttoptr's confuse SCEV into thinking
that related pointers have nothing to do with each other.
This problem blocks LoadStoreVectorizer from vectorizing some of the
loads / stores in a downstream target.
Reviewed By: hfinkel
Differential Revision: https://reviews.llvm.org/D56838
llvm-svn: 351582
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Prior to r348205, extracting code regions with live output values was
disabled because of a miscompilation (PR39433). Lift the restriction as
PR39433 has been addressed.
Tested on LNT+externals, on a run of check-llvm in a stage2 build, and
with a full build of iOS (with hot/cold splitting enabled).
As a drive-by, remove an errant TODO.
llvm-svn: 351492
|
| |
|
|
|
|
|
|
|
|
| |
Resuming exception unwinding is roughly as unlikely as throwing an
exception.
Tested on LNT+externals (in particular, the C++ EH regression tests
provide end-to-end test coverage), as well as with a full build of iOS.
llvm-svn: 351491
|
| |
|
|
|
|
|
|
|
| |
Relaxing this requirement creates opportunities to split code dominated
by an EH pad.
Tested on LNT+externals.
llvm-svn: 351483
|
| |
|
|
|
|
|
|
| |
This gets rid of the brittle/mysterious calls to @sink()/@sideeffect()
peppered throughout the test cases. They are no longer needed to force
splitting to occur.
llvm-svn: 351480
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
If the sample profile has no inlining hierachy information included, we call
the sample profile is flattened. For flattened profile, in ThinLTO postlink
phase, SampleProfileLoader's hot function inlining and profile annotation will
do nothing, so it is better to save the effort to read in the profile and run
the sample profile loader pass. It is helpful for reducing compile time when
the flattened profile is huge.
Differential Revision: https://reviews.llvm.org/D54819
llvm-svn: 351476
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
InstCombine's sinking algorithm only thinks about memory. It doesn't
think about non-memory constraints like stack object lifetime. It can
sink dynamic allocas across a stacksave call, which may be used with
stackrestore, which can incorrectly reduce the lifetime of the dynamic
alloca.
Fixes PR40365
Reviewers: hfinkel, efriedma
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D56872
llvm-svn: 351475
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
During the transforms in LoopSimplifyCFG, when we remove a dead exiting edge, the
parent loop may stop being reachable from the child loop, and therefore they become
siblings. If the former child loop had uses of some values from its former parent loop,
now such uses will require LCSSA Phis, even if they weren't needed before. So we must
form LCSSA for all loops that stopped being ancestors of the current loop in this case.
Differential Revision: https://reviews.llvm.org/D56144
Reviewed By: fedor.sergeev
llvm-svn: 351434
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Function `DeleteDeadBlock` requires that all predecessors of a block
being deleted have already been deleted, with the exception of a
single-block loop. When we use it for removal of dead subloops that
contain more than one block, we may not fulfull this requirement and
fail an assertion.
This patch replaces invocation of `DeleteDeadBlock` with a generalized
version `DeleteDeadBlocks` that is able to deal with multiple dead blocks,
even if they contain some cycles.
Differential Revision: https://reviews.llvm.org/D56121
Reviewed By: fedor.sergeev
llvm-svn: 351433
|
| |
|
|
| |
llvm-svn: 351427
|
| |
|
|
|
|
|
|
|
|
|
|
| |
The function merging pass miscompiles identical vararg functions. The
forwarding thunk it emits doesn't forward the full variable-length list
of arguments. Disable merging for vararg functions for now.
I've filed llvm.org/PR40345 to track the issue.
rdar://47326238
llvm-svn: 351411
|
| |
|
|
|
|
|
|
|
|
|
| |
Essentially, do not treat `call` and `musttail call` as the same thing.
As a drive-by, fold CallInst and InvokeInst handling together using the
CallSite helper.
Differential Revision: https://reviews.llvm.org/D56815
llvm-svn: 351405
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently we have pgo options defined in PassManagerBuilder.cpp only for
instrument pgo, but not for sample pgo. We also have pgo options defined
in NewPMDriver.cpp in opt only for new pass manager and for all kinds of
pgo. They have some inconsistency.
To make the options more consistent and make tests writing easier, the
patch let old pass manager to share the same pgo options with new pass
manager in opt, and removes the options in PassManagerBuilder.cpp.
Differential Revision: https://reviews.llvm.org/D56749
llvm-svn: 351392
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Sometimes the SLP vectorizer tries to vectorize the horizontal reduction
nodes during regular vectorization. This may happen inside of the loops,
when there are some vectorizable PHIs. Patch fixes this by checking if
the node is the reduction node and thus it must not be vectorized, it must
be gathered.
Reviewers: RKSimon, spatel, hfinkel, fedor.sergeev
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D56783
llvm-svn: 351349
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For the given test SROA detects possible replacement and creates a correct alloca. After that SROA is adding lifetime markers for this new alloca. The function getNewAllocaSlicePtr is trying to deduce the pointer type based on the original alloca, which is split, to use it later in lifetime intrinsic.
For the test we ended up with such code (rA is initial alloca [10 x float], which is split, and rA.sroa.0.0 is a new split allocation)
```
%rA.sroa.0.0.rA.sroa_cast = bitcast i32* %rA.sroa.0 to [10 x float]* <----- this one causing the assertion and is an extra bitcast
%5 = bitcast [10 x float]* %rA.sroa.0.0.rA.sroa_cast to i8*
call void @llvm.lifetime.start.p0i8(i64 4, i8* %5)
```
isAllocaPromotable code assumes that a user of alloca may go into lifetime marker through bitcast but it must be the only one bitcast to i8* type. In the test it's not a i8* type, return false and throw the assertion.
As we are creating a pointer, which will be used in lifetime markers only, the proposed fix is to create a bitcast to i8* immediately to avoid extra bitcast creation.
The test is a greatly simplified to just reproduce the assertion.
Author: Igor Tsimbalist <igor.v.tsimbalist@intel.com>
Reviewers: chandlerc, craig.topper
Reviewed By: chandlerc
Differential Revision: https://reviews.llvm.org/D55934
llvm-svn: 351325
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Check to make sure that the caller and the callee have compatible
function arguments before promoting arguments. This uses the same
TargetTransformInfo queries that are used to determine if attributes
are compatible for inlining.
The goal here is to avoid breaking ABI when a called function's ABI
depends on a target feature that is not enabled in the caller.
This is a very conservative fix for PR37358. Ideally we would have a more
sophisticated check for ABI compatiblity rather than checking if the
attributes are compatible for inlining.
Reviewers: echristo, chandlerc, eli.friedman, craig.topper
Reviewed By: echristo, chandlerc
Subscribers: nikic, xbolva00, rkruppe, alexcrichton, llvm-commits
Differential Revision: https://reviews.llvm.org/D53554
llvm-svn: 351296
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
InstCombine is able to transform mem transfer instrinsic to alone store or store/load pair.
It might result in generation of unaligned atomic load/store which later in backend
will be transformed to libcall. It is not an evident gain and it is better to keep intrinsic as is
and handle it at backend.
Reviewers: reames, anna, apilipenko, mkazantsev
Reviewed By: reames
Subscribers: t.p.northover, jfb, llvm-commits
Differential Revision: https://reviews.llvm.org/D56582
llvm-svn: 351295
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Make recoverfp intrinsic target-independent so that it can be implemented for AArch64, etc.
Refer D53541 for the context. Clang counterpart D56748.
Reviewers: rnk, efriedma
Reviewed By: rnk, efriedma
Subscribers: javed.absar, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D56747
llvm-svn: 351281
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
InvokeInst should be treated like CallInst and
assigned a separate discriminator. This is particularly
import when an Invoke is converted to a Call
during compilation and so can invalidate sample profile
data collected wtih different link time optimizations
Reviewers: twoh, Kader, danielcdh, wmi
Reviewed By: wmi
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D56491
llvm-svn: 351251
|
| |
|
|
| |
llvm-svn: 351240
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Related to https://bugs.llvm.org/show_bug.cgi?id=40123.
Rather than scalarizing, expand a vector USUBSAT into UMAX+SUB,
which produces much better code for X86.
Reapplying with updated SLPVectorizer tests.
Differential Revision: https://reviews.llvm.org/D56636
llvm-svn: 351219
|
| |
|
|
|
|
|
|
|
| |
compiler identification lines in test-cases.
(Doing so only because it's then easier to search for references which
are actually important and need fixing.)
llvm-svn: 351200
|
| |
|
|
|
|
| |
Fixes SLP test issue with D56636
llvm-svn: 351199
|
| |
|
|
|
|
|
|
|
| |
Otherwise instcombine gets stuck in a cycle. The canonicalization was
added in D55961.
This patch fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=12400
llvm-svn: 351187
|
| |
|
|
| |
llvm-svn: 351183
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This allows moving the condition from the intrinsic to the standard ICmp
opcode, so that LLVM can do simplifications on it. The icmp.i1 intrinsic
is an identity for retrieving the SGPR mask.
And we can also get the mask from and i1, or i1, xor i1.
Reviewers: arsenm, nhaehnle
Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D52060
llvm-svn: 351150
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
official Git repository.
Remove the directions for using git-svn, and demote the prominence of
the svn instructions.
Also, fix a few other issues while I'm in there:
* Mention LLVM_ENABLE_PROJECTS more.
* Getting started doesn't need to mention test-suite, but should
mention clang and the other projects.
* Remove mentions of "configure", since that's long gone.
I've also adjusted a few other mentions of svn to point to github, but
have not done so comprehensively.
Differential Revision: https://reviews.llvm.org/D56654
llvm-svn: 351130
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
TFE and LWE support requires extra result registers that are written in the
event of a failure in order to detect that failure case.
The specific use-case that initiated these changes is sparse texture support.
This means that if image intrinsics are used with either option turned on, the
programmer must ensure that the return type can contain all of the expected
results. This can result in redundant registers since the vector size must be a
power-of-2.
This change takes roughly 6 parts:
1. Modify the instruction defs in tablegen to add new instruction variants that
can accomodate the extra return values.
2. Updates to lowerImage in SIISelLowering.cpp to accomodate setting TFE or LWE
(where the bulk of the work for these instruction types is now done)
3. Extra verification code to catch cases where intrinsics have been used but
insufficient return registers are used.
4. Modification to the adjustWritemask optimisation to account for TFE/LWE being
enabled (requires extra registers to be maintained for error return value).
5. An extra pass to zero initialize the error value return - this is because if
the error does not occur, the register is not written and thus must be zeroed
before use. Also added a new (on by default) option to ensure ALL return values
are zero-initialized that is required for sparse texture support.
6. Disable the inst_combine optimization in the presence of tfe/lwe (later TODO
for this to re-enable and handle correctly).
There's an additional fix now to avoid a dmask=0
For an image intrinsic with tfe where all result channels except tfe
were unused, I was getting an image instruction with dmask=0 and only a
single vgpr result for tfe. That is incorrect because the hardware
assumes there is at least one vgpr result, plus the one for tfe.
Fixed by forcing dmask to 1, which gives the desired two vgpr result
with tfe in the second one.
The TFE or LWE result is returned from the intrinsics using an aggregate
type. Look in the test code provided to see how this works, but in essence IR
code to invoke the intrinsic looks as follows:
%v = call {<4 x float>,i32} @llvm.amdgcn.image.load.1d.v4f32i32.i32(i32 15,
i32 %s, <8 x i32> %rsrc, i32 1, i32 0)
%v.vec = extractvalue {<4 x float>, i32} %v, 0
%v.err = extractvalue {<4 x float>, i32} %v, 1
This re-submit of the change also includes a slight modification in
SIISelLowering.cpp to work-around a compiler bug for the powerpc_le
platform that caused a buildbot failure on a previous submission.
Differential revision: https://reviews.llvm.org/D48826
Change-Id: If222bc03642e76cf98059a6bef5d5bffeda38dda
Work around for ppcle compiler bug
Change-Id: Ie284cf24b2271215be1b9dc95b485fd15000e32b
llvm-svn: 351054
|
| |
|
|
|
|
|
|
|
|
| |
Something like this is requested by:
https://bugs.llvm.org/show_bug.cgi?id=40265
...and it seems like a common enough case that we should acknowledge it.
Differential Revision: https://reviews.llvm.org/D56551
llvm-svn: 351010
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes https://bugs.llvm.org/show_bug.cgi?id=40110.
This implements handling of undef operands for integer intrinsics in
ConstantFolding, in particular for the bitcounting intrinsics (ctpop,
cttz, ctlz), the with.overflow intrinsics, the saturating math
intrinsics and the funnel shift intrinsics.
The undef behavior follows what InstSimplify does for the general cas
e of non-constant operands. For the bitcount intrinsics (where
InstSimplify doesn't do undef handling -- there cannot be a combination
of an undef + non-constant operand) I'm using a 0 result if the intrinsic
is defined for zero and undef otherwise.
Differential Revision: https://reviews.llvm.org/D55950
llvm-svn: 350971
|
| |
|
|
| |
llvm-svn: 350969
|
| |
|
|
| |
llvm-svn: 350967
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Records in the module summary index whether the bitcode was compiled
with the option necessary to enable splitting the LTO unit
(e.g. -fsanitize=cfi, -fwhole-program-vtables, or -fsplit-lto-unit).
The information is passed down to the ModuleSummaryIndex builder via a
new module flag "EnableSplitLTOUnit", which is propagated onto a flag
on the summary index.
This is then used during the LTO link to check whether all linked
summaries were built with the same value of this flag. If not, an error
is issued when we detect a situation requiring whole program visibility
of the class hierarchy. This is the case when both of the following
conditions are met:
1) We are performing LowerTypeTests or Whole Program Devirtualization.
2) There are type tests or type checked loads in the code.
Note I have also changed the ThinLTOBitcodeWriter to also gate the
module splitting on the value of this flag.
Reviewers: pcc
Subscribers: ormris, mehdi_amini, Prazek, inglorion, eraman, steven_wu, dexonsmith, arphaman, dang, llvm-commits
Differential Revision: https://reviews.llvm.org/D53890
llvm-svn: 350948
|
| |
|
|
|
|
|
|
|
|
| |
MergeFunc only deletes unused duplicate functions if they have local
linkage, but it should be safe to relax this to any "discardable if
unused" linkage type.
Differential Revision: https://reviews.llvm.org/D56574
llvm-svn: 350939
|
| |
|
|
|
|
|
|
|
|
|
| |
Currently when a select has a constant value in one branch and the select feeds
a conditional branch (via a compare/ phi and compare) we unfold the select
statement. This results in threading the conditional branch later on. Similar
opportunity exists when a select (with a constant in one branch) feeds a
switch (via a phi node). The patch unfolds select under this condition.
A testcase is provided.
llvm-svn: 350931
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Step 2 in using MemorySSA in LICM:
Use MemorySSA in LICM to do sinking and hoisting, all under "EnableMSSALoopDependency" flag.
Promotion is disabled.
Enable flag in LICM sink/hoist tests to test correctness of this change. Moved one test which
relied on promotion, in order to test all sinking tests.
Reviewers: sanjoy, davide, gberry, george.burgess.iv
Subscribers: llvm-commits, Prazek
Differential Revision: https://reviews.llvm.org/D40375
llvm-svn: 350879
|
| |
|
|
| |
llvm-svn: 350807
|
| |
|
|
|
|
|
|
|
|
|
|
| |
The C standard says "The memchr function locates the first
occurrence of c (converted to an unsigned char)[...]". The expansion
was missing the conversion to unsigned char.
Fixes https://bugs.llvm.org/show_bug.cgi?id=39041 .
Differential Revision: https://reviews.llvm.org/D55947
llvm-svn: 350775
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: chandlerc
Subscribers: haicheng, llvm-commits
Differential Revision: https://reviews.llvm.org/D56409
llvm-svn: 350751
|
| |
|
|
|
|
| |
These changed with rL350672.
llvm-svn: 350674
|
| |
|
|
|
|
|
|
|
| |
This is matching the equivalent of the DAG expansion,
so it should never end up with worse perf than the
original code even if the target doesn't have a rotate
instruction.
llvm-svn: 350672
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This fixes the IDom for exit blocks and all blocks reachable from the exit blocks, when runtime unrolling under multiexit/exiting case.
We initially had a restrictive check that the IDom is only updated when
it is the header of the loop.
However, we also need to update the IDom to the correct one when the
IDom is any block within the original loop. See added test cases (which
fail dom tree verification without the patch).
Reviewers: reames, mzolotukhin, mkazantsev, hfinkel
Reviewed by: brzycki, kuhar
Subscribers: zzheng, dmgreen, llvm-commits
Differential Revision: https://reviews.llvm.org/D56284
llvm-svn: 350640
|