| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The expandAddRecExprLiterally function incorrectly transforms
`[Start + Step * X]` into `Step * [Start + X]` instead of the correct
transform of `[Step * X] + Start`.
This caused https://github.com/JuliaLang/julia/issues/14704#issuecomment-174126219
due to what appeared to be sufficiently complicated loop interactions.
Patch by Jameson Nash (jameson@juliacomputing.com).
Reviewers: sanjoy
Differential Revision: http://reviews.llvm.org/D16505
llvm-svn: 275239
|
| |
|
|
|
|
|
| |
Remove another variable added in r275216 that was only used in debug
mode.
llvm-svn: 275238
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Refactored the profitability analysis out of the IC promotion pass and
into lib/Analysis so that it can be accessed by the summary index
builder in a follow-on patch to enable IC promotion in ThinLTO (D21932).
Reviewers: davidxl, xur
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D22182
llvm-svn: 275216
|
| |
|
|
|
|
|
|
|
| |
Use range-base for loops.
Use auto when appropriate.
No functional change is intended.
llvm-svn: 275213
|
| |
|
|
|
|
|
| |
Woohoo, unused variable warnings in builds without asserts (as a result
of r275122).
llvm-svn: 275126
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch simplifies the graph builder by encoding nodes as {Value,
Dereference Level} pairs. This lets us kill edge types, and allows us to
get rid of hacks in StratifiedSets (like addAttrsBelow/...). This
simplification also allows us to remove InstantiatedRelations and
InstantiatedAttrs.
Patch by Jia Chen.
Differential Revision: http://reviews.llvm.org/D22080
llvm-svn: 275122
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Extend TTI to access TLI.allowsMisalignedMemoryAccesses(). Check condition when vectorizing load and store chains.
Add additional parameters: AddressSpace, Alignment, Fast.
Reviewers: llvm-commits, jlebar
Subscribers: arsenm, mzolotukhin
Differential Revision: http://reviews.llvm.org/D21935
llvm-svn: 275100
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
For sample-based PGO, using BFI to calculate callsite count is sometime not accurate. This is because with sampling based approach, if a callsite resides in a hot loop deeply nested in a bunch of cold branches, the callsite's BFI frequency would be inaccurately calculated due to lack of samples in the cold branch.
E.g.
if (A1 && A2 && A3 && ..... && A10) {
for (i=0; i < 100000000; i++) {
callsite();
}
}
Assume that A1 to A100 are all 100% taken, and callsite has 1000 samples and thus is considerred hot. Because the loop's trip count is huge, it's normal that all branches outside the loop has no sample at all. As a result, we can only use static branch probability to derive the the frequency of the loop header. Assuming that static heuristic thinks each branch is 50% taken, then the count calculated from BFI will be 1/(2^10) of the actual value.
In order to get more accurate callsite count, we directly annotate the weight on the call instruction, and directly use it when checking callsite hotness.
Note that this mechanism can also be shared by instrumentation based callsite hotness analysis. The side benefit is that it breaks the dependency from Inliner to BFI as call count is embedded in the IR.
Reviewers: davidxl, eraman, dnovillo
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D22118
llvm-svn: 275073
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This subtle change to getModRefInfo(Instruction, ImmutableCallSite) is to
ensure that the semantics are equal to that of getModRefInfo(CS1, CS2) when
the Instruction is a call-site.
This is now more in line with getModRefInfo generally: it returns Mod when
I modifies a memory location that is accessed (read or written) by CS and
Ref when I reads a memory location that is written by CS.
From a grep of the code, the only uses of this particular getModRefInfo
overload are in MemorySSA and MemCpyOptimizer, and they only care about
where the result is MR_NoModRef or not. Therefore, this change should have
no visible effect.
Separated out from D17279 upon request.
llvm-svn: 275065
|
| |
|
|
|
|
|
|
|
| |
For functions which are known to return a specific argument, pointer-comparison
folding can look through the function calls as part of its analysis.
Differential Revision: http://reviews.llvm.org/D9387
llvm-svn: 275039
|
| |
|
|
|
|
|
|
|
| |
For functions which are known to return their argument,
isDereferenceableAndAlignedPointer can examine the argument value.
Differential Revision: http://reviews.llvm.org/D9384
llvm-svn: 275038
|
| |
|
|
|
|
|
|
|
| |
When building SCEVs, if a function is known to return its argument, then we can
build the SCEV using the corresponding argument value.
Differential Revision: http://reviews.llvm.org/D9381
llvm-svn: 275037
|
| |
|
|
|
|
|
|
|
| |
If a function is known to return one of its arguments, we can use that in order
to compute known bits of the return value.
Differential Revision: http://reviews.llvm.org/D9397
llvm-svn: 275036
|
| |
|
|
|
|
|
|
|
|
|
| |
Motivated by the work on the llvm.noalias intrinsic, teach BasicAA to look
through returned-argument functions when answering queries. This is essential
so that we don't loose all other AA information when supplementing with
llvm.noalias.
Differential Revision: http://reviews.llvm.org/D9383
llvm-svn: 275035
|
| |
|
|
|
|
|
| |
`const` was dropped by r274958, and the lack of `const` makes GCC6
(correctly) complain.
llvm-svn: 274961
|
| |
|
|
|
|
|
|
| |
Patch by Jia Chen.
Differential Revision: http://reviews.llvm.org/D22022
llvm-svn: 274958
|
| |
|
|
|
|
|
|
|
|
|
| |
This removes a few fields from the graph builder by making us compute
things (that we'd always compute anyway) more eagerly.
Patch by Jia Chen.
Differential Revision: http://reviews.llvm.org/D22009
llvm-svn: 274957
|
| |
|
|
|
|
|
| |
This reverts commit r274853.
Caused failure in ppcBE build
llvm-svn: 274943
|
| |
|
|
|
|
| |
NFC.
llvm-svn: 274940
|
| |
|
|
| |
llvm-svn: 274934
|
| |
|
|
| |
llvm-svn: 274927
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
We can fold truncs whose operand feeds from a load, if the trunc value
is available through a prior load/store.
This change is from: http://reviews.llvm.org/D21246, which folded the
trunc but missed the bitcast or ptrtoint/inttoptr required in the RAUW
call, when the load type didnt match the prior load/store type.
Differential Revision: http://reviews.llvm.org/D21791
llvm-svn: 274853
|
| |
|
|
| |
llvm-svn: 274765
|
| |
|
|
|
|
|
|
|
|
|
| |
friend definitions.
Based on the experiments Sean Silva and Reid did, this seems the safest
course of action and also will work around a questionable warning
provided by GCC6 on the old form of the code. Thanks for Davide pointing
out the issue and other suggesting ways to fix.
llvm-svn: 274740
|
| |
|
|
|
|
|
|
|
| |
We were inappropriately using 32-bit types to account for quantities
that can be far larger.
Fixed in PR28443.
llvm-svn: 274737
|
| |
|
|
|
|
|
| |
Note that require<domtree> and require<loops> aren't needed because they
come in implicitly via the loop pass manager.
llvm-svn: 274712
|
| |
|
|
|
|
|
|
|
|
|
| |
"More things" = StratifiedAttrs and various bits like interprocedural
summaries.
Patch by Jia Chen.
Differential Revision: http://reviews.llvm.org/D21964
llvm-svn: 274592
|
| |
|
|
|
|
|
|
| |
Patch by Jia Chen.
Differential Revision: http://reviews.llvm.org/D21963
llvm-svn: 274591
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
StratifiedSets (as implemented) is very fast, but its accuracy is also
limited. If we take a more aggressive andersens-like approach, we can be
way more accurate, but we'll also end up being slower.
So, we've decided to split CFLAA into CFLSteensAA and CFLAndersAA.
Long-term, we want to end up in a place where CFLSteens is queried
first; if it can provide an answer, great (since queries are basically
map lookups). Otherwise, we'll fall back to CFLAnders, BasicAA, etc.
This patch splits everything out so we can try to do something like
that when we get a reasonable CFLAnders implementation.
Patch by Jia Chen.
Differential Revision: http://reviews.llvm.org/D21910
llvm-svn: 274589
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This complements the earlier addition of IntrWriteMem and IntrWriteArgMem
LLVM intrinsic properties, see D18291.
Also start using the attribute for memset, memcpy, and memmove intrinsics,
and remove their special-casing in BasicAliasAnalysis.
Reviewers: reames, joker.eph
Subscribers: joker.eph, llvm-commits
Differential Revision: http://reviews.llvm.org/D18714
llvm-svn: 274485
|
| |
|
|
| |
llvm-svn: 274481
|
| |
|
|
| |
llvm-svn: 274480
|
| |
|
|
| |
llvm-svn: 274479
|
| |
|
|
| |
llvm-svn: 274478
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This actually uncovered a surprisingly large chain of ultimately unused
TLI args.
From what I can gather, this argument is a remnant of when
isKnownNonNull would look at the TLI directly.
The current approach seems to be that InferFunctionAttrs runs early in
the pipeline and uses TLI to annotate the TLI-dependent non-null
information as return attributes.
This also removes the dependence of functionattrs on TLI altogether.
llvm-svn: 274455
|
| |
|
|
|
|
|
| |
It is implemented as a LoopAnalysis pass as
discussed and agreed upon.
llvm-svn: 274452
|
| |
|
|
|
|
|
|
| |
where possible.
No functionality change intended.
llvm-svn: 274431
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D21636
llvm-svn: 274334
|
| |
|
|
| |
llvm-svn: 274302
|
| |
|
|
|
|
|
|
| |
This will be re-used by the LoadStoreVectorizer.
Fix handling of range metadata and testcase by Justin Lebar.
llvm-svn: 274281
|
| |
|
|
|
|
|
| |
In particular, check to see if we can compute a precise trip count by
exhaustively simulating the loop first.
llvm-svn: 274199
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This patch makes CFLAA answer some ModRef queries. Because we don't
distinguish between reading/writing when making StratifiedSets, we're
unable to offer any of the readonly-related answers.
Patch by Jia Chen.
Differential Revision: http://reviews.llvm.org/D21858
llvm-svn: 274197
|
| |
|
|
| |
llvm-svn: 274115
|
| |
|
|
|
|
|
|
| |
bit for a recurrence with a NSW addition."
This is breaking an optimizaton remark test in clang. I've identified a couple fixes for that, but want to understand it better before I commit to anything.
llvm-svn: 274102
|
| |
|
|
|
|
|
|
|
|
|
|
| |
a recurrence with a NSW addition.
If a operation for a recurrence is an addition with no signed wrap and both input sign bits are 0, then the result sign bit must also be 0. Similar for the negative case.
I found this deficiency while playing around with a loop in the x86 backend that contained a signed division that could be optimized into an unsigned division if we could prove both inputs were positive. One of them being the loop induction variable. With this patch we can perform the conversion for this case. One of the test cases here is a contrived variation of the loop I was looking at.
Differential revision: http://reviews.llvm.org/D21493
llvm-svn: 274098
|
| |
|
|
| |
llvm-svn: 274038
|
| |
|
|
|
|
|
|
|
|
| |
This patch enhances dot graph viewer to show hot regions
with hot bbs/edges displayed in red. The ratio of the bb
freq to the max freq of the function needs to be no less
than the value specified by view-hot-freq-percent option.
The default value is 10 (i.e. 10%).
llvm-svn: 273996
|
| |
|
|
|
|
|
|
|
| |
MBFI supports profile count dumping and function
name based filtering. Add these two feature to
BFI as well. The filtering option is shared between
BFI and MBFI: -view-bfi-func-name=..
llvm-svn: 273992
|
| |
|
|
|
|
|
|
|
|
|
| |
BFI and MBFI's dot traits class share most of the
code and all future enhancement. This patch extracts
common implementation into base class BFIDOTGraphTraitsBase.
This patch also enables BFI graph to show branch probability
on edges as MBFI does before.
llvm-svn: 273990
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
the new pass manager.
This adds operator<< overloads for the various bits of the
LazyCallGraph, dump methods for use from the debugger, and debug logging
using them to the CGSCC pass manager.
Having this was essential for debugging the call graph update patch, and
I've extracted what I could from that patch here to minimize the delta.
llvm-svn: 273961
|