llvm-svn: 339059
This assert fires when attempting to extract a subregister from the
global PIC base register. This virtual register SD node is not in the
VRBaseMap, so we shouldn't call getVR to look it up there. If this is a
RegisterSDNode, we should be able to use the virtual register directly.
Fixes PR38385
llvm-svn: 339056
Properly shrink `pow()` to `powf()` as a binary function and, when no other
simplification applies, do not discard it.
Differential revision: https://reviews.llvm.org/D50113
llvm-svn: 339046
Differential Revision: https://reviews.llvm.org/D50258
llvm-svn: 339045
Summary:
Expand isFNEG so that we generate the appropriate F(N)M(ADD|SUB)
instructions in more cases. For example, the following sequence
  a = _mm256_broadcast_ss(f)
  d = _mm256_fnmadd_ps(a, b, c)
generates an fsub and fma without this patch and an fnma with this
change.
Reviewers: craig.topper
Subscribers: llvm-commits, davidxl, wmi
Differential Revision: https://reviews.llvm.org/D48467
llvm-svn: 339043
sure the store isn't volatile
If the store is volatile, this might be a memory-mapped I/O access. In that case we shouldn't generate a load that didn't exist in the source.
Differential Revision: https://reviews.llvm.org/D50270
llvm-svn: 339041
for all the uses from the same def is done.
We run into a compile time problem with flex generated code combined with
`-fno-jump-tables`. The cause is that machineLICM hoists a lot of invariants
outside of a big loop, and drastically increases the compile time in global
register splitting and copy coalescing. https://reviews.llvm.org/D49353
relieves the problem in global splitting. This patch is to handle the problem
in copy coalescing.
The problem in copy coalescing arises as follows. After
machineLICM, we have several defs outside of a big loop with hundreds or
thousands of uses inside the loop. Rematerialization in copy coalescing
happens for each use, and every time rematerialization is done, shrinkToUses
is called to update the huge live interval. Because we have 'n' uses
for a def, and each live interval update has at least 'n' complexity,
the total update work is n^2.
To fix the problem, we do the live interval update work collectively.
If a def has more copylike uses than a threshold, then each time
rematerialization is done for one of those uses, we don't update the live
interval immediately but delay that work until rematerialization for all those
uses is completed, so we only have to do the live interval update work once.
Delaying the live interval update could potentially change the copy coalescing
result, so we hope to limit that change to those defs with many
(like above a hundred) copylike uses, and the cutoff can be adjusted by the
option -mllvm -late-remat-update-threshold=xxx.
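The quadratic cost described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: `ToyInterval`, `eagerWork`, and `deferredWork` are hypothetical names rather than LLVM code, and each update is modeled as one scan over all uses of the def.

```cpp
#include <cstddef>

// Toy stand-in for the live interval of one def (hypothetical, not LLVM's
// LiveInterval). An update is modeled as touching every use once.
struct ToyInterval {
  std::size_t NumUses;
  std::size_t WorkDone;

  // Models one shrinkToUses-style update: cost proportional to the use count.
  void update() { WorkDone += NumUses; }
};

// Eager scheme: update the interval after rematerializing each use.
inline std::size_t eagerWork(std::size_t NumUses) {
  ToyInterval LI{NumUses, 0};
  for (std::size_t I = 0; I < NumUses; ++I)
    LI.update();
  return LI.WorkDone; // NumUses * NumUses
}

// Deferred scheme: when the def has more uses than Threshold, postpone the
// update until every use has been rematerialized, then update once.
inline std::size_t deferredWork(std::size_t NumUses, std::size_t Threshold) {
  ToyInterval LI{NumUses, 0};
  const bool Defer = NumUses > Threshold;
  for (std::size_t I = 0; I < NumUses; ++I)
    if (!Defer)
      LI.update();
  if (Defer)
    LI.update();
  return LI.WorkDone;
}
```

In this model, a def with 1000 uses costs 1,000,000 units eagerly and 1000 units when deferred, while defs at or below the threshold keep the eager behavior.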
Differential Revision: https://reviews.llvm.org/D49519
llvm-svn: 339035
On Windows, when raw_fd_ostream::write_impl calls write, the character count must be a 32-bit value. Because a size_t variable is passed for this argument, an integral narrowing conversion occurs on x64. For large files, this leads to an infinite loop.
See: https://bugs.llvm.org/show_bug.cgi?id=37926
This fix allows the output of files larger than the previous int32 limit.
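A minimal sketch of the idea, assuming the fix amounts to splitting a large write into pieces that fit a signed 32-bit count. `truncatedCount` and `chunkSizes` are hypothetical helpers for illustration and are not part of raw_fd_ostream.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// What a 32-bit count parameter receives when handed a 64-bit size.
inline std::uint32_t truncatedCount(std::uint64_t Size) {
  return static_cast<std::uint32_t>(Size);
}

// Split a write of Size bytes into chunk sizes of at most MaxChunk bytes.
inline std::vector<std::uint64_t> chunkSizes(std::uint64_t Size,
                                             std::uint64_t MaxChunk = INT32_MAX) {
  std::vector<std::uint64_t> Chunks;
  if (MaxChunk == 0)
    return Chunks; // a zero-sized chunk could never make progress
  while (Size > 0) {
    const std::uint64_t Chunk = std::min(Size, MaxChunk);
    Chunks.push_back(Chunk);
    Size -= Chunk;
  }
  return Chunks;
}
```

A size of exactly 2^32 truncates to a count of 0, so a loop that keeps retrying the remaining bytes never advances. Clamping each chunk keeps every count representable.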
Differential Revision: https://reviews.llvm.org/D48948
llvm-svn: 339027
Appears from expansion of some packed cases.
llvm-svn: 339025
Also fix apparently missing test coverage for any of the
handling here.
llvm-svn: 339023
llvm-svn: 339021
llvm-svn: 339020
llvm-svn: 339019
Summary:
Ensure that NormalizedBuildVector returns a BUILD_VECTOR with operands of the
same type. This fixes an assertion failure in VerifySDNode.
Reviewers: SjoerdMeijer, t.p.northover, javed.absar
Reviewed By: SjoerdMeijer
Subscribers: kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D50202
llvm-svn: 339013
Currently we use #pragma push_macro(LLVM_DEBUG) to fiddle with the LLVM_DEBUG
macro so that we can silence debugging the Knuth division algorithm unless it's
actually desired. Unfortunately this is incompatible with enabling modules
while building LLVM (via LLVM_ENABLE_MODULES=ON), probably due to a bug being
fixed by D33004.
llvm-svn: 339009
ld64 supplies its own Thumb bit for Thumb functions, and intentionally zeroes
out that part of any addend in an object file. But it only does that for
symbols marked N_EXT -- i.e. external symbols. So LLVM should avoid setting
that extra bit in other cases.
llvm-svn: 339007
AND"
The patch was reverted because of a bug detected by the sanitizer. The bug is fixed
and the respective tests are added.
Differential Revision: https://reviews.llvm.org/D50172
llvm-svn: 339005
Multiple failures reported by sanitizer-x86_64-linux seem to be caused by this
patch. Reverting to see if they persist without it.
Differential Revision: https://reviews.llvm.org/D50172
llvm-svn: 338994
llvm-svn: 338991
`isKnownNonNullFromDominatingCondition` is able to prove non-null based on a `br` or `guard`
with a `%p != null` condition, but is unable to do so based on `(%p != null) && %other_cond`.
This patch allows it to do so.
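The reasoning can be sketched on a toy condition tree: if a conjunction holds, each of its operands holds, so a null check found in either operand is enough. `ToyCond` and `impliesNonNull` are hypothetical and do not reflect the actual ValueTracking implementation.

```cpp
// Toy condition tree (hypothetical; not LLVM IR).
struct ToyCond {
  enum Kind { NonNullCheck, And, Other } K;
  int PtrId;          // meaningful only for NonNullCheck
  const ToyCond *LHS; // meaningful only for And
  const ToyCond *RHS; // meaningful only for And
};

// Returns true if Cond being true guarantees that pointer PtrId is non-null.
inline bool impliesNonNull(const ToyCond *Cond, int PtrId) {
  if (!Cond)
    return false;
  switch (Cond->K) {
  case ToyCond::NonNullCheck:
    return Cond->PtrId == PtrId;
  case ToyCond::And:
    // A true conjunction means both operands are true, so either may prove it.
    return impliesNonNull(Cond->LHS, PtrId) || impliesNonNull(Cond->RHS, PtrId);
  case ToyCond::Other:
    return false;
  }
  return false;
}
```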
Differential Revision: https://reviews.llvm.org/D50172
Reviewed By: reames
llvm-svn: 338990
branches
If there is a frequently taken branch dominated by a guard, and its condition is available
at the point of the guard, we can widen the guard with the condition of this branch and convert
the branch into an unconditional one:
  guard(cond1)
  if (cond2) {
    // taken in 99.9% cases
    // do something
  } else {
    // do something else
  }
Converts to
  guard(cond1 && cond2)
  // do something
Differential Revision: https://reviews.llvm.org/D49974
Reviewed By: reames
llvm-svn: 338988
llvm-svn: 338987
llvm-svn: 338986
In the past, DbgInfoIntrinsic strongly assumed that these
intrinsics all have variables and expressions attached to them.
However, that assumption is too strong for deriving classes for other debug
entities, and it now causes problems for debug labels.
In order to make DbgInfoIntrinsic a base class for 'debug info', I
create a class for 'variable debug info', DbgVariableIntrinsic.
DbgDeclareInst, DbgAddrIntrinsic, and DbgValueInst will be derived from it.
Differential Revision: https://reviews.llvm.org/D50220
llvm-svn: 338984
This code was moved out from BasicObjectLayerMaterializationUnit, which required
the supplied object to be well formed. The getObjectSymbolFlags function does
not require a well-formed object, so we have to propagate the error here.
llvm-svn: 338975
flags map from a buffer representing an object file.
llvm-svn: 338974
Summary:
This patch improves the Inliner to provide causes/reasons for negative inline decisions.
1. It adds one new message field to InlineCost to report causes for Always and Never instances. All Never and Always instantiations must provide a simple message.
2. Several functions that used to return the inlining results as boolean are changed to return InlineResult, which carries the cause for a negative decision.
3. Changed remark printing and debug output messages to provide the additional messages and related inline cost.
4. Adjusted tests for changed printing.
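As a rough illustration of point 2, a result type can behave like a boolean while also carrying the reason for a negative decision. `ToyInlineResult`, `canInline`, `describe`, and the reason strings are hypothetical stand-ins, not the definitions from the patch.

```cpp
#include <string>

// Hypothetical result type: converts to true on success and carries a
// human-readable reason on failure.
struct ToyInlineResult {
  const char *Reason = nullptr; // nullptr means the decision was positive

  static ToyInlineResult success() { return ToyInlineResult{}; }
  static ToyInlineResult failure(const char *Why) {
    ToyInlineResult R;
    R.Reason = Why;
    return R;
  }
  explicit operator bool() const { return Reason == nullptr; }
};

// Toy decision function; the checks and messages are made up for this sketch.
inline ToyInlineResult canInline(bool IsRecursive, bool HasNoInlineAttr) {
  if (HasNoInlineAttr)
    return ToyInlineResult::failure("noinline function attribute");
  if (IsRecursive)
    return ToyInlineResult::failure("recursive call");
  return ToyInlineResult::success();
}

// Builds the kind of message a remark or debug printer could emit.
inline std::string describe(const ToyInlineResult &R) {
  return R ? std::string("inlined")
           : std::string("not inlined: ") + R.Reason;
}
```

Callers that only need the yes/no answer can keep writing `if (canInline(...))`, while diagnostics can read the reason.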
Patch by: yrouban (Yevgeny Rouban)
Reviewers: craig.topper, sammccall, sgraenitz, NutshellySima, shchenz, chandlerc, apilipenko, javed.absar, tejohnson, dblaikie, sanjoy, eraman, xbolva00
Reviewed By: tejohnson, xbolva00
Subscribers: xbolva00, llvm-commits, arsenm, mehdi_amini, eraman, haicheng, steven_wu, dexonsmith
Differential Revision: https://reviews.llvm.org/D49412
llvm-svn: 338969
sections"
A number of edge cases and inconsistencies in how we're emitting sections
cause this warning to fire, and it needs more work.
This reverts commit r335558.
llvm-svn: 338968
Summary:
Previously, in the NewPM pipeline, TailCallElim recalculated the DomTree whenever it modified any instruction in the Function.
For example,
```
CallInst *CI = dyn_cast<CallInst>(&I);
...
CI->setTailCall();
Modified = true;
...
if (!Modified || ...)
return PreservedAnalyses::all();
```
After applying this patch, the DomTree is recalculated only if needed (plus an extra insertEdge() + an extra deleteEdge() call).
When optimizing SQLite with the `-passes="default<O3>"` pipeline of the newPM, the number of DomTree recalculations decreases by 6.2% and the number of nodes visited by DFS decreases by 2.9%. The time used by DomTree decreases by approximately 1% to 2.5% after applying the patch.
Statistics:
```
Before the patch:
23010 dom-tree-stats - Number of DomTree recalculations
489264 dom-tree-stats - Number of nodes visited by DFS -- DomTree
After the patch:
21581 dom-tree-stats - Number of DomTree recalculations
475088 dom-tree-stats - Number of nodes visited by DFS -- DomTree
```
Reviewers: kuhar, dmgreen, brzycki, grosser, davide
Reviewed By: kuhar, brzycki
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D49982
llvm-svn: 338954
Summary: ADCE doesn't need to query domtree.
Reviewers: kuhar, brzycki, dmgreen, davide, grosser
Reviewed By: kuhar
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D49988
llvm-svn: 338950
that commit (r338826, r338827, r338829, r338880).
This commit has broken build bots and has been left unattended for too long.
llvm-svn: 338948
https://reviews.llvm.org/D48600
Added IRTranslator support to translate these known intrinsics into GISel opcodes.
llvm-svn: 338944
llvm-svn: 338940
This change allows users to pass a compression level that is not listed
in the enum. Also, I think using different values than zlib's
compression levels was just confusing.
Differential Revision: https://reviews.llvm.org/D50196
llvm-svn: 338939
Despite the comment removed in this patch, this is beneficial when the RHS of the sub is a register.
llvm-svn: 338930
and the normal instructions instead
At one point in time acquire implied mayLoad and mayStore as did release. Thus we needed separate pseudos that also carried that property. This appears to no longer be the case. I believe it was changed in 2012 with a comment saying that atomic memory accesses are marked volatile which preserves the ordering.
So from what I can tell we shouldn't need additional pseudos, since they don't carry any flags that are different from the normal instructions. The only thing I can think of is that we may now consider them as load folding candidates in the peephole pass where we didn't before. If that's important, hopefully there's something in the memory operand we can check to prevent the folding without relying on pseudo instructions.
Differential Revision: https://reviews.llvm.org/D50212
llvm-svn: 338925
There are two branch instructions created
so the new test covers them both.
Differential Revision: https://reviews.llvm.org/D50263
llvm-svn: 338917
store.
The mask operand is visited before the data operand so we need to be able to widen it.
Fixes PR38436.
llvm-svn: 338915
resize() (zeroing) makes every allocated page resident. The actual size of the compressed buffer is usually much
smaller. Making every page resident is wasteful.
When linking a test binary with ~1.9GiB uncompressed debug info with LLD, this optimization decreases max RSS by ~1.5GiB.
Differential Revision: https://reviews.llvm.org/50223
llvm-svn: 338913
Add a parameter for testing specifically for
sNaNs - at least one instruction pattern on AMDGPU
needs to check specifically for this.
Also handle more cases, and add a target hook
for custom nodes, similar to the hooks for known
bits.
llvm-svn: 338910
Summary:
libdevice in recent CUDA versions relies on __nvvm_reflect() to select
GPU-specific bitcode. This patch addresses the requirement.
Reviewers: jlebar
Subscribers: jholewinski, sanjoy, hiraditya, bixia, llvm-commits
Differential Revision: https://reviews.llvm.org/D50207
llvm-svn: 338908
enable better codegen
Clang uses "ctpop & 1" to implement __builtin_parity. If the popcnt instruction isn't supported this generates a large amount of code to calculate the population count. Instead we can bisect the data down to a single byte using xor and then check the parity flag.
Even when popcnt is supported, it's still a good idea to split 64-bit data on 32-bit targets using an xor in front of a single popcnt. Otherwise we get two popcnts and an add before the and.
I've specifically targeted this at the sizes supported by clang builtins, but we could generalize this if we think that's useful.
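The xor-folding idea can be sketched in portable C++. This is an illustration of the arithmetic only, not the code the backend emits: on x86 the final byte would be tested with the parity flag, whereas this sketch keeps folding down to one bit. The function names are hypothetical.

```cpp
#include <cstdint>

// Parity of a 32-bit value (1 if the number of set bits is odd), computed by
// repeatedly xor-ing the upper half into the lower half.
inline unsigned parity32(std::uint32_t X) {
  X ^= X >> 16;
  X ^= X >> 8;
  // The low byte now has the same parity as the original value.
  X ^= X >> 4;
  X ^= X >> 2;
  X ^= X >> 1;
  return X & 1u;
}

// Parity of a 64-bit value: one xor of the two halves, then the 32-bit path.
inline unsigned parity64(std::uint64_t X) {
  return parity32(static_cast<std::uint32_t>(X) ^
                  static_cast<std::uint32_t>(X >> 32));
}

// Reference implementation in the "ctpop & 1" style: count bits, then mask.
inline unsigned parityByPopcount(std::uint64_t X) {
  unsigned Count = 0;
  for (; X != 0; X &= X - 1) // clears the lowest set bit each iteration
    ++Count;
  return Count & 1u;
}
```

Xor preserves parity because each folding step cancels bits in pairs, so `parity64` and `parityByPopcount` agree on every input.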
Differential Revision: https://reviews.llvm.org/D50165
llvm-svn: 338907
Merge the helper functions for shrinking unary and binary functions into a
single one, while keeping all their functionality. Otherwise, NFC.
llvm-svn: 338905
In r337830 I added SCEV checks to enable us to insert fewer bounds checks. Unfortunately, this sometimes crashes when multiple bounds checks are added due to SCEV caching issues. This patch splits the bounds checking pass into two phases, one that computes all the conditions (using SCEV checks) and the other that adds the new instructions.
Differential Revision: https://reviews.llvm.org/D49946
llvm-svn: 338902
- It's possible for 'Changed' to return as false even if we did
partially inline something. Fixed to accumulate return values.
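The difference between overwriting and accumulating a change flag can be shown with a small sketch. The two helpers below are hypothetical and only model the pattern; they are not taken from the partial inliner.

```cpp
#include <functional>
#include <vector>

using Step = std::function<bool()>; // returns true if it changed something

// Buggy pattern: each step overwrites the flag, so only the last result counts.
inline bool runAllOverwriting(const std::vector<Step> &Steps) {
  bool Changed = false;
  for (const Step &S : Steps)
    Changed = S();
  return Changed;
}

// Fixed pattern: the flag stays true once any step reports a change.
inline bool runAllAccumulating(const std::vector<Step> &Steps) {
  bool Changed = false;
  for (const Step &S : Steps)
    Changed |= S();
  return Changed;
}
```

With a first step that changes something and a second that does not, the overwriting version reports no change while the accumulating version reports one.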
llvm-svn: 338896
Differential Revision: https://reviews.llvm.org/D44030
llvm-svn: 338894
already awaiting deletion
Summary:
Previously, `removeUnreachableBlocks` still returned true (which indicates the CFG is changed) even when all the unreachable blocks found were already awaiting deletion in the DDT class.
This makes code patterns like
```
// Code modified from lib/Transforms/Scalar/SimplifyCFGPass.cpp
bool EverChanged = removeUnreachableBlocks(F, nullptr, DDT);
...
do {
EverChanged = someMightHappenModifications();
EverChanged |= removeUnreachableBlocks(F, nullptr, DDT);
} while (EverChanged);
```
become an infinite loop.
Fix this by detecting whether a BasicBlock is already awaiting deletion.
Reviewers: kuhar, brzycki, dmgreen, grosser, davide
Reviewed By: kuhar, brzycki
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D49738
llvm-svn: 338882
llvm-svn: 338880
We don't expect module names to be present in the index. This patch adds
DW_TAG_module to the blacklist.
Differential revision: https://reviews.llvm.org/D50237
llvm-svn: 338878
Some instructions expand to more than one decoder group.
This has been hitherto ignored, but is handled with this patch.
Review: Ulrich Weigand
https://reviews.llvm.org/D50187
llvm-svn: 338849