| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
| |
No functional change is intended.
llvm-svn: 275999
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
generic IR
D20859 and D20860 attempted to replace the SSE (V)CVTTPS2DQ and VCVTTPD2DQ truncating conversions with generic IR instead.
It turns out that the behaviour of these intrinsics is different enough from generic IR that this will cause problems, INF/NAN/out of range values are guaranteed to result in a 0x80000000 value - which plays havoc with constant folding which converts them to either zero or UNDEF. This is also an issue with the scalar implementations (which were already generic IR and what I was trying to match).
This patch changes both scalar and packed versions back to using x86-specific builtins.
It also deals with the other scalar conversion cases that are runtime rounding mode dependent and can have similar issues with constant folding.
A companion clang patch is at D22105
Differential Revision: https://reviews.llvm.org/D22106
llvm-svn: 275981
|
| |
|
|
| |
llvm-svn: 275911
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The main goal is to able to start using the new OptRemarkEmitter
analysis from the LoopVectorizer. Since the vectorizer was recently
converted to the new PM, it makes sense to convert this analysis as
well.
This pass is currently tested through the LoopDistribution pass, so I am
also porting LoopDistribution to get coverage for this analysis with the
new PM.
Reviewers: davidxl, silvas
Subscribers: llvm-commits, mzolotukhin
Differential Revision: https://reviews.llvm.org/D22436
llvm-svn: 275810
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
To enable profile-guided indirect call promotion in ThinLTO mode, we
simply add call graph edges for each profitable target from the profile
to the summaries, then the summary-guided importing will consider the
callee for importing as usual.
Also we need to enable the indirect call promotion pass creation in the
PassManagerBuilder when PerformThinLTO=true (we are in the ThinLTO
backend), so that the newly imported functions are considered for
promotion in the backends.
The IC promotion profiles refer to callees by GUID, which required
adding GUIDs to the per-module VST in bitcode (and assigning them
valueIds similar to how they are assigned valueIds in the combined
index).
Reviewers: mehdi_amini, xur
Subscribers: mehdi_amini, davidxl, llvm-commits
Differential Revision: http://reviews.llvm.org/D21932
llvm-svn: 275707
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: Convert IVUsers analysis to new pass manager.
Reviewers: davidxl, silvas
Subscribers: junbuml, sanjoy, llvm-commits, mzolotukhin
Differential Revision: https://reviews.llvm.org/D22434
llvm-svn: 275698
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This patch adds proper handling of stratified attributes into our
anders-style CFLAA implementation. It also comes bundled with more
CFLAnders tests. :)
Patch by Jia Chen.
Differential Revision: https://reviews.llvm.org/D22325
llvm-svn: 275604
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This adds an incomplete anders-style implementation for CFLAA. It's
incomplete in that it's missing interprocedural analysis, attrs
handling, etc. and that it needs more tests. More tests and features
will be added in future commits.
Patch by Jia Chen.
Differential Revision: https://reviews.llvm.org/D22291
llvm-svn: 275602
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is the first set of changes implementing the RFC from
http://thread.gmane.org/gmane.comp.compilers.llvm.devel/98334
This is a cross-sectional patch; rather than implementing the hotness
attribute for all optimization remarks and all passes in a patch set, it
implements it for the 'missed-optimization' remark for Loop
Distribution. My goal is to shake out the design issues before scaling
it up to other types and passes.
Hotness is computed as an integer as the multiplication of the block
frequency with the function entry count. It's only printed in opt
currently since clang prints the diagnostic fields directly. E.g.:
remark: /tmp/t.c:3:3: loop not distributed: use -Rpass-analysis=loop-distribute for more info (hotness: 300)
A new API added is similar to emitOptimizationRemarkMissed. The
difference is that it additionally takes a code region that the
diagnostic corresponds to. From this, hotness is computed using BFI.
The new API is exposed via an analysis pass so that it can be made
dependent on LazyBFI. (Thanks to Hal for the analysis pass idea.)
This feature can all be enabled by setDiagnosticHotnessRequested in the
LLVM context. If this is off, LazyBFI is not calculated (D22141) so
there should be no overhead.
A new command-line option is added to turn this on in opt.
My plan is to switch all user of emitOptimizationRemark* to use this
module instead.
Reviewers: hfinkel
Subscribers: rcox2, mzolotukhin, llvm-commits
Differential Revision: http://reviews.llvm.org/D21771
llvm-svn: 275583
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Calling getModRefInfo with a fence resulted in crashes because fences
don't have a memory location. Add a new predicate to Instruction
called isFenceLike which indicates that the instruction mutates memory
but not any single memory location in particular. In practice, it is a
proxy for the set of instructions which "mayWriteToMemory" but cannot be
used with MemoryLocation::get.
This fixes PR28570.
llvm-svn: 275581
|
| |
|
|
|
|
|
|
|
| |
Most possibly problem was caused by the same reason as PR28400. This change
bypasses it by using CallbackVH instead of AssertingVH.
Differential Revision: https://reviews.llvm.org/D20957
llvm-svn: 275563
|
| |
|
|
| |
llvm-svn: 275465
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
In preparation for changing GlobalsAA to stop assuming that intrinsics
can't read arbitrary globals, we need to make sure GlobalsAA is querying
function attributes rather than relying on this assumption.
This patch was inspired by: http://reviews.llvm.org/D20206
Reviewers: jmolloy, hfinkel
Subscribers: eli.friedman, llvm-commits
Differential Revision: https://reviews.llvm.org/D21318
llvm-svn: 275433
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
constant hoisting. It not only takes into account the number of uses and the
cost of expressions in which constants appear, but now also the resulting
integer range of the offsets. Thus, the algorithm maximizes the number of uses
within an integer range that will enable more efficient code generation. On
ARM, for example, this will enable code size optimisations because less
negative offsets will be created. Negative offsets/immediates are not supported
by Thumb1 thus preventing more compact instruction encoding.
Differential Revision: http://reviews.llvm.org/D21183
llvm-svn: 275382
|
| |
|
|
|
|
|
| |
We can always pick the passthru value if the mask is undef: we are
permitted to treat the mask as-if it were filled with zeros.
llvm-svn: 275379
|
| |
|
|
|
|
|
|
|
| |
We can constant fold a masked load if the operands are appropriately
constant.
Differential Revision: http://reviews.llvm.org/D22324
llvm-svn: 275352
|
| |
|
|
|
|
|
|
|
| |
offsets
Treat loads which clip before the start of a global initializer the same
way we treat clipping beyond the end of the initializer: use zeros.
llvm-svn: 275345
|
| |
|
|
|
|
|
| |
This transform doesn't require any new instructions, it can safely live
in InstSimplify.
llvm-svn: 275344
|
| |
|
|
| |
llvm-svn: 275335
|
| |
|
|
| |
llvm-svn: 275334
|
| |
|
|
|
|
|
| |
In fact, don't even pass this to the ctor since we can get it from the
module.
llvm-svn: 275326
|
| |
|
|
| |
llvm-svn: 275325
|
| |
|
|
| |
llvm-svn: 275322
|
| |
|
|
| |
llvm-svn: 275304
|
| |
|
|
|
|
|
|
| |
Patch by Sunita Marathe
Differential Revision: http://reviews.llvm.org/D21920
llvm-svn: 275284
|
| |
|
|
|
|
| |
This is a simplification, there should be no functional change.
llvm-svn: 275273
|
| |
|
|
|
|
| |
GEP offsets are signed, don't treat them as huge positive numbers.
llvm-svn: 275251
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is necessary for D21771. In order to add the hotness attribute to
optimization remarks we need BFI to be available in all passes that emit
optimization remarks.
However we don't want to pay for computing BFI unless the hotness
attribute is requested.
This is achieved by making BFI lazy at the very high-level through a new
analysis pass -- BFI is not calculated unless requested.
I am adding a test to check the laziness under D21771 where the first
user of the analysis is added.
Reviewers: hfinkel, dexonsmith, davidxl
Subscribers: davidxl, dexonsmith, llvm-commits
Differential Revision: http://reviews.llvm.org/D22141
llvm-svn: 275250
|
| |
|
|
|
|
| |
No functional change is intended, just a minor cleanup.
llvm-svn: 275249
|
| |
|
|
|
|
|
| |
A GEPed offset can go negative, the result of getIndexedOffsetInType
should according be a signed type.
llvm-svn: 275246
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The expandAddRecExprLiterally function incorrectly transforms
`[Start + Step * X]` into `Step * [Start + X]` instead of the correct
transform of `[Step * X] + Start`.
This caused https://github.com/JuliaLang/julia/issues/14704#issuecomment-174126219
due to what appeared to be sufficiently complicated loop interactions.
Patch by Jameson Nash (jameson@juliacomputing.com).
Reviewers: sanjoy
Differential Revision: http://reviews.llvm.org/D16505
llvm-svn: 275239
|
| |
|
|
|
|
|
| |
Remove another variable added in r275216 that was only used in debug
mode.
llvm-svn: 275238
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Refactored the profitability analysis out of the IC promotion pass and
into lib/Analysis so that it can be accessed by the summary index
builder in a follow-on patch to enable IC promotion in ThinLTO (D21932).
Reviewers: davidxl, xur
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D22182
llvm-svn: 275216
|
| |
|
|
|
|
|
|
|
| |
Use range-base for loops.
Use auto when appropriate.
No functional change is intended.
llvm-svn: 275213
|
| |
|
|
|
|
|
| |
Woohoo, unused variable warnings in builds without asserts (as a result
of r275122).
llvm-svn: 275126
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch simplifies the graph builder by encoding nodes as {Value,
Dereference Level} pairs. This lets us kill edge types, and allows us to
get rid of hacks in StratifiedSets (like addAttrsBelow/...). This
simplification also allows us to remove InstantiatedRelations and
InstantiatedAttrs.
Patch by Jia Chen.
Differential Revision: http://reviews.llvm.org/D22080
llvm-svn: 275122
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Extend TTI to access TLI.allowsMisalignedMemoryAccesses(). Check condition when vectorizing load and store chains.
Add additional parameters: AddressSpace, Alignment, Fast.
Reviewers: llvm-commits, jlebar
Subscribers: arsenm, mzolotukhin
Differential Revision: http://reviews.llvm.org/D21935
llvm-svn: 275100
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
For sample-based PGO, using BFI to calculate callsite count is sometime not accurate. This is because with sampling based approach, if a callsite resides in a hot loop deeply nested in a bunch of cold branches, the callsite's BFI frequency would be inaccurately calculated due to lack of samples in the cold branch.
E.g.
if (A1 && A2 && A3 && ..... && A10) {
for (i=0; i < 100000000; i++) {
callsite();
}
}
Assume that A1 to A100 are all 100% taken, and callsite has 1000 samples and thus is considerred hot. Because the loop's trip count is huge, it's normal that all branches outside the loop has no sample at all. As a result, we can only use static branch probability to derive the the frequency of the loop header. Assuming that static heuristic thinks each branch is 50% taken, then the count calculated from BFI will be 1/(2^10) of the actual value.
In order to get more accurate callsite count, we directly annotate the weight on the call instruction, and directly use it when checking callsite hotness.
Note that this mechanism can also be shared by instrumentation based callsite hotness analysis. The side benefit is that it breaks the dependency from Inliner to BFI as call count is embedded in the IR.
Reviewers: davidxl, eraman, dnovillo
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D22118
llvm-svn: 275073
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This subtle change to getModRefInfo(Instruction, ImmutableCallSite) is to
ensure that the semantics are equal to that of getModRefInfo(CS1, CS2) when
the Instruction is a call-site.
This is now more in line with getModRefInfo generally: it returns Mod when
I modifies a memory location that is accessed (read or written) by CS and
Ref when I reads a memory location that is written by CS.
From a grep of the code, the only uses of this particular getModRefInfo
overload are in MemorySSA and MemCpyOptimizer, and they only care about
where the result is MR_NoModRef or not. Therefore, this change should have
no visible effect.
Separated out from D17279 upon request.
llvm-svn: 275065
|
| |
|
|
|
|
|
|
|
| |
For functions which are known to return a specific argument, pointer-comparison
folding can look through the function calls as part of its analysis.
Differential Revision: http://reviews.llvm.org/D9387
llvm-svn: 275039
|
| |
|
|
|
|
|
|
|
| |
For functions which are known to return their argument,
isDereferenceableAndAlignedPointer can examine the argument value.
Differential Revision: http://reviews.llvm.org/D9384
llvm-svn: 275038
|
| |
|
|
|
|
|
|
|
| |
When building SCEVs, if a function is known to return its argument, then we can
build the SCEV using the corresponding argument value.
Differential Revision: http://reviews.llvm.org/D9381
llvm-svn: 275037
|
| |
|
|
|
|
|
|
|
| |
If a function is known to return one of its arguments, we can use that in order
to compute known bits of the return value.
Differential Revision: http://reviews.llvm.org/D9397
llvm-svn: 275036
|
| |
|
|
|
|
|
|
|
|
|
| |
Motivated by the work on the llvm.noalias intrinsic, teach BasicAA to look
through returned-argument functions when answering queries. This is essential
so that we don't loose all other AA information when supplementing with
llvm.noalias.
Differential Revision: http://reviews.llvm.org/D9383
llvm-svn: 275035
|
| |
|
|
|
|
|
| |
`const` was dropped by r274958, and the lack of `const` makes GCC6
(correctly) complain.
llvm-svn: 274961
|
| |
|
|
|
|
|
|
| |
Patch by Jia Chen.
Differential Revision: http://reviews.llvm.org/D22022
llvm-svn: 274958
|
| |
|
|
|
|
|
|
|
|
|
| |
This removes a few fields from the graph builder by making us compute
things (that we'd always compute anyway) more eagerly.
Patch by Jia Chen.
Differential Revision: http://reviews.llvm.org/D22009
llvm-svn: 274957
|
| |
|
|
|
|
|
| |
This reverts commit r274853.
Caused failure in ppcBE build
llvm-svn: 274943
|
| |
|
|
|
|
| |
NFC.
llvm-svn: 274940
|
| |
|
|
| |
llvm-svn: 274934
|