| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
|
|
| |
Register indexing 64-bit elements is possible on the SALU, but not the
VALU. Handle splitting this into two 32-bit indexes. Extend waterfall
loop handling to allow moving a range of instructions.
llvm-svn: 373638
|
| |
|
|
|
|
|
|
| |
We can still do a waterfall loop over the index if using a VGPR to
index an SGPR. The result will still be a VGPR, but we can avoid the
wide copy of the source register to a VGPR.
llvm-svn: 373637
|
| |
|
|
|
|
| |
This would try to do FewerElements to v9s8
llvm-svn: 373635
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This adds a pre-pass to this optimization that scans through the basic
block and generates lists of mergeable instructions with one list per unique
address.
In the optimization phase instead of scanning through the basic block for mergeable
instructions, we now iterate over the lists generated by the pre-pass.
The decision to re-optimize a block is now made per list, so if we fail to merge any
instructions with the same address, then we do not attempt to optimize them in
future passes over the block. This will help to reduce the time this pass
spends re-optimizing instructions.
In one pathological test case, this change reduces the time spent in the
SILoadStoreOptimizer from 0.2s to 0.03s.
This restructuring will also make it possible to implement further solutions in
this pass, because we can now add less expensive checks to the pre-pass and
filter instructions out early which will avoid the need to do the expensive
scanning during the optimization pass. For example, checking for adjacent
offsets is an inexpensive test we can move to the pre-pass.
Reviewers: arsenm, pendingchaos, rampitec, nhaehnle, vpykhtin
Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65961
llvm-svn: 373630
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
The Hexagon code assumes there's no existing terminator when inserting its
trip count condition check.
This causes swp-stages5.ll to break. The generated code looks good to me,
it is likely a permutation. I have disabled the new codegen path to keep
everything green and will investigate along with the other 3-4 tests
that have different codegen.
Fixes expensive-checks build.
llvm-svn: 373629
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
During studying support for bitfield, I found an issue for
an example like the one in test offset-reloc-middle-chain.ll.
struct t1 { int c; };
struct s1 { struct t1 b; };
struct r1 { struct s1 a; };
#define _(x) __builtin_preserve_access_index(x)
void test1(void *p1, void *p2, void *p3);
void test(struct r1 *arg) {
struct s1 *ps = _(&arg->a);
struct t1 *pt = _(&arg->a.b);
int *pi = _(&arg->a.b.c);
test1(ps, pt, pi);
}
The IR looks like:
%0 = llvm.preserve.struct.access(base, ...)
%1 = llvm.preserve.struct.access(%0, ...)
%2 = llvm.preserve.struct.access(%1, ...)
using %0, %1 and %2
In this case, we need to generate three relocatiions
corresponding to chains: (%0), (%0, %1) and (%0, %1, %2).
After collecting all the chains, the current implementation
process each chain (in a map) with code generation sequentially.
For example, after (%0) is processed, the code may look like:
%0 = base + special_global_variable
// llvm.preserve.struct.access(base, ...) is delisted
// from the instruction stream.
%1 = llvm.preserve.struct.access(%0, ...)
%2 = llvm.preserve.struct.access(%1, ...)
using %0, %1 and %2
When processing chain (%0, %1), the current implementation
tries to visit intrinsic llvm.preserve.struct.access(base, ...)
to get some of its properties and this caused segfault.
This patch fixed the issue by remembering all necessary
information (kind, metadata, access_index, base) during
analysis phase, so in code generation phase there is
no need to examine the intrinsic call instructions.
This also simplifies the code.
Differential Revision: https://reviews.llvm.org/D68389
llvm-svn: 373621
|
| |
|
|
|
|
| |
This reverts commit b3af236fb5fc6e50fcc1b54d868f0bff557f3fb1.
llvm-svn: 373619
|
| |
|
|
|
|
|
|
| |
These old aliases were renamed, but are still used by some projects (eg newlib).
Differential Revision: https://reviews.llvm.org/D68392
llvm-svn: 373618
|
| |
|
|
|
|
|
|
|
| |
It allows using "Size" with or without "Content" in YAML descriptions of
SHT_LLVM_ADDRSIG sections.
Differential revision: https://reviews.llvm.org/D68334
llvm-svn: 373610
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
sections."
Fix: call `consumeError()` for a case missed.
Original commit message:
SHT_LLVM_ADDRSIG is described here:
https://llvm.org/docs/Extensions.html#sht-llvm-addrsig-section-address-significance-table
This patch teaches tools to dump them and to parse the YAML declarations of such sections.
Differential revision: https://reviews.llvm.org/D68333
llvm-svn: 373606
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
defined API for the plugins.
Summary: This PR creates a utility class called ValueProfileCollector that tells PGOInstrumentationGen and PGOInstrumentationUse what to value-profile and where to attach the profile metadata. It then refactors logic scattered in PGOInstrumentation.cpp into two plugins that plug into the ValueProfileCollector.
Authored By: Wael Yehia <wyehia@ca.ibm.com>
Reviewer: davidxl, tejohnson, xur
Reviewed By: davidxl, tejohnson, xur
Subscribers: llvm-commits
Tag: #llvm
Differential Revision: https://reviews.llvm.org/D67920
Patch By Wael Yehia <wyehia@ca.ibm.com>
llvm-svn: 373601
|
| |
|
|
| |
llvm-svn: 373600
|
| |
|
|
|
|
|
|
|
| |
sections."
It broke BB:
http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/18655/steps/test/logs/stdio
llvm-svn: 373599
|
| |
|
|
|
|
|
|
|
|
|
| |
SHT_LLVM_ADDRSIG is described here:
https://llvm.org/docs/Extensions.html#sht-llvm-addrsig-section-address-significance-table
This patch teaches tools to dump them and to parse the YAML declarations of such sections.
Differential revision: https://reviews.llvm.org/D68333
llvm-svn: 373598
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet, bollu, jdoerfert
Subscribers: hiraditya, asbirlea, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D68268
llvm-svn: 373595
|
| |
|
|
| |
llvm-svn: 373591
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adds support to AArch64FrameLowering to allocate fixed-stack SVE objects.
The focus of this patch is purely to allow the stack frame to
allocate/deallocate space for scalable SVE objects. More dynamic
allocation (at compile-time, i.e. determining placement of SVE objects
on the stack), or resolving frame-index references that include
scalable-sized offsets, are left for subsequent patches.
SVE objects are allocated in the stack frame as a separate region below
the callee-save area, and above the alignment gap. This is done so that
the SVE objects can be accessed directly from the FP at (runtime)
VL-based offsets to benefit from using the VL-scaled addressing modes.
The layout looks as follows:
+-------------+
| stack arg |
+-------------+
| Callee Saves|
| X29, X30 | (if available)
|-------------| <- FP (if available)
| : |
| SVE area |
| : |
+-------------+
|/////////////| alignment gap.
| : |
| Stack objs |
| : |
+-------------+ <- SP after call and frame-setup
SVE and non-SVE stack objects are distinguished using different
StackIDs. The offsets for objects with TargetStackID::SVEVector should be
interpreted as purely scalable offsets within their respective SVE region.
Reviewers: thegameg, rovka, t.p.northover, efriedma, rengolin, greened
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D61437
llvm-svn: 373585
|
| |
|
|
| |
llvm-svn: 373583
|
| |
|
|
| |
llvm-svn: 373582
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68329
llvm-svn: 373580
|
| |
|
|
|
|
| |
It's already available in the class.
llvm-svn: 373568
|
| |
|
|
| |
llvm-svn: 373567
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
vbroadcast_load if the scalar size is the same.
This improves broadcast load folding of i64 elements on 32-bit
targets where i64 isn't legal.
Previously we had to represent these as vXf64 vbroadcast_loads and
a bitcast to vXi64. But we didn't have any isel patterns
looking for that.
This also allows us to remove or simplify some isel patterns that
were looking for bitcasted vbroadcast_loads.
llvm-svn: 373566
|
| |
|
|
|
|
|
|
| |
VPMULLQ/VPMAXSQ/VPMAXUQ/VPMINSQ/VPMINUQ patterns.
More fixes for PR36191.
llvm-svn: 373560
|
| |
|
|
|
|
| |
copy/pasted from right above them. NFC
llvm-svn: 373559
|
| |
|
|
|
|
| |
through to tablegen
llvm-svn: 373545
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When SIFixSGPRCopies attempts to fix an illegal copy from vector to
scalar register it calls moveToVALU(). A copy from an agpr to sgpr
becomes a copy from agpr to agpr, which may result in the illegal
register class at a use of this copy.
Solution is to copy it always into a vgpr. This may result in a
subsequent copy into an agpr if that is what really needed, however
should not happen too often and likely will be folded later.
The opposite situation may not happen because an sgpr is always
illegal where agpr is legal, so such user instructions may not
exist.
Differential Revision: https://reviews.llvm.org/D68358
llvm-svn: 373544
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
https://rise4fun.com/Alive/8BY - valid for lshr+trunc+variable sext
https://rise4fun.com/Alive/7jk - the variable sext can be redundant
https://rise4fun.com/Alive/Qslu - 'exact'-ness of first shift can be preserver
https://rise4fun.com/Alive/IF63 - without trunc we could view this as
more general "drop redundant mask before right-shift",
but let's handle it here for now
https://rise4fun.com/Alive/iip - likewise, without trunc, variable sext can be redundant.
There's more patterns for sure - e.g. we can have 'lshr' as the final shift,
but that might be best handled by some more generic transform, e.g.
"drop redundant masking before right-shift" (PR42456)
I'm singling-out this sext patch because you can only extract
high bits with `*shr` (unlike abstract bit masking),
and i *know* this fold is wanted by existing code.
I don't believe there is much to review here,
so i'm gonna opt into post-review mode here.
https://bugs.llvm.org/show_bug.cgi?id=43523
llvm-svn: 373542
|
| |
|
|
|
|
|
|
| |
Brings this struct in line with the RangeSpan class so they might
eventually be used by common template code for generating range/loc
lists with less duplicate code.
llvm-svn: 373540
|
| |
|
|
|
|
|
| |
bcopy is still widely used mainly for network apps. Sadly, LLVM has no optimizations for bcopy, but there are some for memmove.
Since bcopy == memmove, it is profitable to transform bcopy to memmove and use current optimizations for memmove for free here.
llvm-svn: 373537
|
| |
|
|
|
|
|
|
| |
SplitVecRes_SETCC in SplitRes_SELECT.
No point in manually splitting the SETCC if it was already done.
llvm-svn: 373535
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is an effort to make RangeSpan and DebugLocStream::Entry more
similar to share code for their emission (to reuse the more complicated
code for using (& choosing when to use) base address selection entries,
etc).
It didn't seem like this struct was worth the complexity of
encapsulation - when the members could be initialized by the ctor to any
value (no validation) and the type is assignable (so there's no
mutability or other constraint being implemented by its interface).
llvm-svn: 373533
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is the first of a series of patches extracted from a much bigger WIP
patch. It merely establishes the tblgen pass and the way empty combiner
helpers are declared and integrated into a combiner info.
The tablegen pass takes a -combiners option to select the combiner helper
that will be generated. This can be given multiple values to generate
multiple combiner helpers at once. Doing so helps to minimize parsing
overhead.
The reason for creating a GlobalISel subdirectory in utils/TableGen is that
there will be quite a lot of non-pass files (~15) by the time the patch
series is done.
Reviewers: volkan
Subscribers: mgorny, hiraditya, simoncook, Petar.Avramovic, s.egerton, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68286
llvm-svn: 373527
|
| |
|
|
|
|
|
|
|
|
|
| |
recomputing."
The cause for the revert should be fixed by r373513 /
a80b6c15425f82521c624ff24c5c0a34cd534d54
This reverts commit 47dbcbd8ec6bf6c0b9cbe5811e81a37cc55e73ef.
llvm-svn: 373522
|
| |
|
|
|
|
|
|
|
|
| |
Store rlwinm Rx, Ry, 32, 0, 31 as rlwinm Rx, Ry, 0, 0, 31 and store
rldicl Rx, Ry, 64, 0 as rldicl Rx, Ry, 0, 0. Otherwise SH field is overflow and
fails assertion in assembly printing stage.
Differential Revision: https://reviews.llvm.org/D66991
llvm-svn: 373519
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
[MSan] handle llvm.launder.invariant.group
Msan used to give false-positives in
class Foo {
public:
virtual ~Foo() {};
};
// Return true iff *x is set.
bool f1(void **x, bool flag);
Foo* f() {
void *p;
bool found;
found = f1(&p,flag);
if (found) {
// p is always set here.
return static_cast<Foo*>(p); // False positive here.
}
return nullptr;
}
Patch by Ilya Tokar.
Reviewers: #sanitizers, eugenis
Reviewed By: #sanitizers, eugenis
Subscribers: eugenis, Prazek, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68236
llvm-svn: 373515
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Terminators like invoke can have users outside the current basic block.
We have to replace those users with undef, before replacing the
terminator.
This fixes a crash exposed by rL373430.
Reviewers: brzycki, asbirlea, davide, spatel
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D68327
llvm-svn: 373513
|
| |
|
|
|
|
|
|
|
| |
MemoryPhis should be added in the IDF of the blocks newly gaining Defs.
This includes the blocks that gained a Phi and the block gaining a Def,
if the block did not have one before.
Resolves PR43427.
llvm-svn: 373505
|
| |
|
|
| |
llvm-svn: 373503
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
of bits that might be undef
The previous code tried to do a trick where we would extract the subvector from the location we were inserting. Then xor that with the new value. Take the xored value and clear out the bits above the subvector size. Then shift that xored subvector to the insert location. And finally xor that with the original vector. Since the old subvector was used in both xors, this would leave just the new subvector at the inserted location. Since the surrounding bits had been zeroed no other bits of the original vector would be modified.
Unfortunately, if the old subvector came from undef we might aggressively propagate the undef. Then we end up with the XORs not cancelling because they aren't using the same value for the two uses of the old subvector. @bkramer gave me a case that demonstrated this, but we haven't reduced it enough to make it easily readable to see what's happening.
This patch uses a safer, but more costly approach. It isolate the bits above the insertion and bits below the insert point and ORs those together leaving 0 for the insertion location. Then widens the subvector with 0s in the upper bits, shifts it into position with 0s in the lower bits. Then we do another OR.
Differential Revision: https://reviews.llvm.org/D68311
llvm-svn: 373495
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: sdmitriev, tejohnson
Reviewed by: tejohnson
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68318
llvm-svn: 373494
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
64-bit WebAssembly (wasm64) is not specified and not supported in the
WebAssembly backend. We do have support for it in clang, however, and
we would like to keep that support because we expect wasm64 to be
specified and supported in the future. For now add an error when
trying to use wasm64 from the backend to minimize user confusion from
unexplained crashes.
Reviewers: aheejin, dschuff, sunfish
Subscribers: sbc100, jgravelle-google, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68254
llvm-svn: 373493
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Extend cachepolicy operand in the new VMEM buffer intrinsics
to supply information whether the buffer data is swizzled.
Also, propagate this information to MIR.
Intrinsics updated:
int_amdgcn_raw_buffer_load
int_amdgcn_raw_buffer_load_format
int_amdgcn_raw_buffer_store
int_amdgcn_raw_buffer_store_format
int_amdgcn_raw_tbuffer_load
int_amdgcn_raw_tbuffer_store
int_amdgcn_struct_buffer_load
int_amdgcn_struct_buffer_load_format
int_amdgcn_struct_buffer_store
int_amdgcn_struct_buffer_store_format
int_amdgcn_struct_tbuffer_load
int_amdgcn_struct_tbuffer_store
Furthermore, disable merging of VMEM buffer instructions
in SI Load/Store optimizer, if the "swizzled" bit on the instruction
is on.
The default value of the bit is 0, meaning that data in buffer
is linear and buffer instructions can be merged.
There is no difference in the generated code with this commit.
However, in the future it will be expected that front-ends
use buffer intrinsics with correct "swizzled" bit set.
Reviewers: arsenm, nhaehnle, tpr
Reviewed By: nhaehnle
Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, arphaman, jfb, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68200
llvm-svn: 373491
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
There are no users that pass in LazyValueInfo, so we can simplify the
function a bit.
Reviewers: brzycki, asbirlea, davide
Reviewed By: davide
Differential Revision: https://reviews.llvm.org/D68297
llvm-svn: 373488
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This fixes a hole in the handling of devirtualized targets that were
local but need promoting due to devirtualization in another module. We
were not correctly referencing the promoted symbol in some cases. Make
sure the code that updates the name also looks at the ExportedGUIDs set
by utilizing a callback that checks all conditions (the callback
utilized by the internalization/promotion code).
Reviewers: pcc, davidxl, hiraditya
Subscribers: mehdi_amini, Prazek, inglorion, steven_wu, dexonsmith, dang, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68159
llvm-svn: 373485
|
| |
|
|
|
|
|
|
|
|
| |
As noted on PR41772, the static analyzer reports that the MachineMemOperand::print partial wrappers set a number of args to null pointers that were then dereferenced in the actual implementation.
It turns out that these wrappers are not being used at all (hence why we're not seeing any crashes), so I'd like to propose we just get rid of them.
Differential Revision: https://reviews.llvm.org/D68208
llvm-svn: 373484
|
| |
|
|
|
|
|
|
| |
dyn_cast<PHINode> null dereference warning. NFCI.
The static analyzer is warning about a potential null dereference, but we should be able to use cast<PHINode> directly and if not assert will fire for us.
llvm-svn: 373481
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: fhahn
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68331
llvm-svn: 373479
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
default is unreachable (PR43129)"
This was reverted in r373454 due to breaking the expensive-checks bot.
This version addresses that by omitting the addSuccessorWithProb() call
when omitting the range check.
> Switch lowering: omit range check for bit tests when default is unreachable (PR43129)
>
> This is modeled after the same functionality for jump tables, which was
> added in r357067.
>
> Differential revision: https://reviews.llvm.org/D68131
llvm-svn: 373477
|
| |
|
|
|
|
|
|
|
| |
This is a follow-up for D68085 which allows using "Size"
tag together with "Content" tag or alone.
Differential revision: https://reviews.llvm.org/D68216
llvm-svn: 373473
|