| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D33847
llvm-svn: 305170
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: UADDO has 2 result, and one must check the result no before doing any kind of combine. Without it, the transform is invalid.
Reviewers: joerg
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34088
llvm-svn: 305162
|
| |
|
|
| |
llvm-svn: 305154
|
| |
|
|
| |
llvm-svn: 305153
|
| |
|
|
| |
llvm-svn: 305152
|
| |
|
|
|
|
| |
As discussed on D33983, as SLM has so many custom costs its worth testing as well.
llvm-svn: 305151
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D28531
llvm-svn: 305137
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D33737
llvm-svn: 305132
|
| |
|
|
| |
llvm-svn: 305131
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- Fix assertion failures on F16 to/from int types in FastISel by falling
back to regular ISel
- Add a testcase of various conversion cases with FastISel (-O0)
Reviewers: kristof.beyls, jmolloy, SjoerdMeijer
Reviewed By: SjoerdMeijer
Subscribers: SjoerdMeijer, llvm-commits, srhines, pirama, aemerson, rengolin, javed.absar, kristof.beyls
Differential Revision: https://reviews.llvm.org/D33734
llvm-svn: 305127
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently there is a bug in SROA::presplitLoadsAndStores which causes assertion in
GEPOperator::accumulateConstantOffset.
Basically it does not consider the situation that the pointer operand of load or store
may be in a non-zero address space and its size may be different from the size of
a pointer in address space 0.
This patch fixes assertion when compiling Blender Cycles kernels for amdgpu backend.
Diffferential Revision: https://reviews.llvm.org/D33298
llvm-svn: 305107
|
| |
|
|
|
|
|
|
|
|
| |
This is to reflect the evolving nature of the tool as being
useful for more than just dumping PDBs, as it can do many other
things.
Differential Revision: https://reviews.llvm.org/D34062
llvm-svn: 305106
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
isSafeToSpeculativelyExecute is the wrong predicate to use here.
All that checks for is whether it is safe to hoist a value due to
unaligned/un-dereferencable accesses. However, not only are we doing
sinking rather than hoisting, our concern is that the location
we're loading from may have been modified. Instead forbid sinking
any load across a critical edge.
Reviewers: majnemer
Subscribers: davide, llvm-commits
Differential Revision: https://reviews.llvm.org/D33179
llvm-svn: 305102
|
| |
|
|
| |
llvm-svn: 305100
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D34046
llvm-svn: 305098
|
| |
|
|
| |
llvm-svn: 305097
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: Add the WindowsResourceCOFFWriter class for producing the final COFF after all parsing is done.
Reviewers: hiraditya!, zturner, ruiu
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34020
llvm-svn: 305092
|
| |
|
|
|
|
| |
If the inputs won't saturate during packing then we can treat the PACKSS as a truncation shuffle
llvm-svn: 305091
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
and returns"
By target hookifying getRegisterType, getNumRegisters, getVectorBreakdown,
backends can request that LLVM to scalarize vector types for calls
and returns.
The MIPS vector ABI requires that vector arguments and returns are passed in
integer registers. With SelectionDAG's new hooks, the MIPS backend can now
handle LLVM-IR with vector types in calls and returns. E.g.
'call @foo(<4 x i32> %4)'.
Previously these cases would be scalarized for the MIPS O32/N32/N64 ABI for
calls and returns if vector types were not legal. If vector types were legal,
a single 128bit vector argument would be assigned to a single 32 bit / 64 bit
integer register.
By teaching the MIPS backend to inspect the original types, it can now
implement the MIPS vector ABI which requires a particular method of
scalarizing vectors.
Previously, the MIPS backend relied on clang to scalarize types such as "call
@foo(<4 x float> %a) into "call @foo(i32 inreg %1, i32 inreg %2, i32 inreg %3,
i32 inreg %4)".
This patch enables the MIPS backend to take either form for vector types.
The previous version of this patch had a "conditional move or jump depends on
uninitialized value".
Reviewers: zoran.jovanovic, jaydeep, vkalintiris, slthakur
Differential Revision: https://reviews.llvm.org/D27845
llvm-svn: 305083
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Alloca promotion pass not dealing with non-canonical input
Added some additional checks so the pass simply backs-off forms it can't deal with (non-canonical)
Also added some test cases in non-canonical form to check that it no longer crashes
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tpr, t-tye
Differential Revision: https://reviews.llvm.org/D31710
llvm-svn: 305079
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When an empty comment is present in an assembly file, the compiler will crash because it checks the first character for '\n' or '\r'.
The fix consists of also checking if the string is empty before accessing the *front* method of the StringRef.
A test is included for the x86 target, but this issue is reproducible with other targets as well.
Patch by Alexandru Guduleasa!
Reviewers: niravd, grosbach, llvm-commits
Reviewed By: niravd
Differential Revision: https://reviews.llvm.org/D33993
llvm-svn: 305077
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This prevents against assertion errors like PR32659 which occur from a
replacement deleting a node after it's been added to the list argument
of RemoveDeadNodes. The specific failure from PR32659 does not
currently happen, but it is still potentially possible. The underlying
cause is that the callers of the change dfunction builds up a list of
nodes to delete after having moved their uses and it possible that a
move of a later node will cause a previously deleted nodes to be
deleted.
Reviewers: bkramer, spatel, davide
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33731
llvm-svn: 305070
|
| |
|
|
|
|
|
|
|
|
| |
The scalar VFMS instructions did not have scheduling information attached (but
VFMA did), which was causing assertion failures with the Cortex-A57 scheduling
model and -fp-contract=fast.
Differential Revision: https://reviews.llvm.org/D34040
llvm-svn: 305064
|
| |
|
|
| |
llvm-svn: 305063
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Initial implementation - needs similar work/testing for other tools
bugpoint invokes (llc, lli I think, maybe more).
Alternatively (as suggested by chandlerc@) an environment variable could
be used. This would allow the option to pass transparently through user
scripts, pass to compilers if they happened to be LLVM-ish, etc.
I worry a bit about using cl::opt in the crash handling code - LLVM
might crash early, perhaps before the cl::opt is properly initialized?
Or at least before arguments have been parsed?
- should be OK since it defaults to "pretty", so if the crash is very
early in opt parsing, etc, then crash reports will still be symbolized.
I shyed away from doing this with an environment variable when I
realized that would require copying the existing environment and
appending the env variable of interest. But it seems there's no existing
LLVM API for accessing the environment (even the Support tests for
process launching have their own ifdefs for getting the environment). It
could be added, but seemed like a higher bar/untested codepath to
actually add environment variables.
Most importantly, this reduces the runtime of test/BugPoint/metadata.ll
in a split-dwarf Debug build from 1m34s to 6.5s by avoiding a lot of
symbolication. (this wasn't a problem for non-split-dwarf builds only
because the executable was too large to map into memory (due to bugpoint
setting a 400MB memory (including address space - not sure why? Going to
remove that) limit on the child process) so symbolication would fail
fast & wouldn't spend all that time parsing DWARF, etc)
Reviewers: chandlerc, dannyb
Differential Revision: https://reviews.llvm.org/D33804
llvm-svn: 305056
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This change adds an option disable-lftr to be able to disable Linear Function Test Replace optimization.
By default option is off so current behavior is not changed.
Reviewers: reames, sanjoy, wmi, andreadb, apilipenko
Reviewed By: sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33979
llvm-svn: 305055
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we're shrinking a binary operation, it may be the case that the new
operations wraps where the old didn't. If this happens, the behavior
should be well-defined. So, we can't always carry wrapping flags with us
when we shrink operations.
If we do, we get incorrect optimizations in cases like:
void foo(const unsigned char *from, unsigned char *to, int n) {
for (int i = 0; i < n; i++)
to[i] = from[i] - 128;
}
which gets optimized to:
void foo(const unsigned char *from, unsigned char *to, int n) {
for (int i = 0; i < n; i++)
to[i] = from[i] | 128;
}
Because:
- InstCombine turned `sub i32 %from.i, 128` into
`add nuw nsw i32 %from.i, 128`.
- LoopVectorize vectorized the add to be `add nuw nsw <16 x i8>` with a
vector full of `i8 128`s
- InstCombine took advantage of the fact that the newly-shrunken add
"couldn't wrap", and changed the `add` to an `or`.
InstCombine seems happy to figure out whether we can add nuw/nsw on its
own, so I just decided to drop the flags. There are already a number of
places in LoopVectorize where we rely on InstCombine to clean up.
llvm-svn: 305053
|
| |
|
|
|
|
|
|
|
| |
Other comments/implications are that this isn't intended behavior (nor
perserved/reimplemented in the new inliner) & complicates fixing the
'inlining' of trivially dead calls without consulting the cost function
first.
llvm-svn: 305052
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
RelocOffset is a 32-bit value, but we previously truncated it to 16 bits.
Fixes PR33335.
Reviewers: zturner, hiraditya!
Reviewed By: zturner
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33968
llvm-svn: 305043
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds support for Symbols, StringTable, and FrameData subsection
types. Even though these subsections rarely if ever appear in a PDB
file (they are usually in object files), there's no theoretical reason
why they *couldn't* appear in a PDB. The real issue though is that in
order to add support for dumping and writing them (which will be useful
for object files), we need a way to test them. And since there is no
support for reading and writing them to / from object files yet, making
PDB support them is the best way to both add support for the underlying
format and add support for tests at the same time. Later, when we go
to add support for reading / writing them from object files, we'll need
only minimal changes in the underlying read/write code.
llvm-svn: 305037
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This is the same change for the YAML Output style applied to the
raw output style. Previously we would queue up all subsections
until every one had been read, and then output them in a pre-
determined order. This was because some subsections need to be
read first in order to properly dump later subsections. This
patch allows them to be dumped in the order they appear.
Differential Revision: https://reviews.llvm.org/D34015
llvm-svn: 305034
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The pdb2yaml and raw subcommands did something very
similar but with a different output format, and they
used a lot of the same command line options, but each
one re-implemented the command line option with slightly
different spellings / options. This patch merges them
together into a single definition which is shared by
both subcommands. This new syntax also allows for more
flexibility in the way debug subsections are dumped.
Differential Revision: https://reviews.llvm.org/D33996
llvm-svn: 305032
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Since D17854 LinkerSubsectionsViaSymbols is unnecessary.
It is interfering with ThinLTO implementation of CFI-ICall, where
the aliases used on the !LinkerSubsectionsViaSymbols branch are
needed to export jump tables to ThinLTO backends.
This is the second attempt to land this change after fixing PR33316.
llvm-svn: 305031
|
| |
|
|
|
|
|
|
|
| |
This is to prepare to allow for dead stripping of globals in the
merged modules.
Differential Revision: https://reviews.llvm.org/D33921
llvm-svn: 305027
|
| |
|
|
|
|
| |
-fsanitize-coverage=inline-8bit-counters. Experimental so far, not documenting yet. Reapplying revisions 304630, 304631, 304632, 304673, see PR33308
llvm-svn: 305026
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(0) RegAllocPBQP: Since getRawAllocationOrder() may return a collection that includes reserved physical registers, iterate to find an un-reserved physical register.
(1) VirtRegMap: Enforce the invariant: "no reserved physical registers" in assignVirt2Phys(). Previously, this was checked only after the fact in VirtRegRewriter::rewrite.
(2) MachineVerifier: updated the test per MatzeB's review.
(3) +testcase
Patch by Nick Johnson<Nicholas.Paul.Johnson@deshawresearch.com>!
Differential Revision: https://reviews.llvm.org/D33947
llvm-svn: 305016
|
| |
|
|
| |
llvm-svn: 305014
|
| |
|
|
|
|
| |
Fixes PR33316.
llvm-svn: 305012
|
| |
|
|
|
|
|
|
|
|
|
|
| |
No IR tests were added with rL304313 ( https://reviews.llvm.org/D28637 ),
so I want these for extra coverage if we enable memcmp expansion for x86.
As shown, nothing is expanded for x86 in CGP yet.
Also fundamentally, we're doing an IR transform, so we should have IR tests
for just that part. If something goes wrong, we need to know if the bug is
in CGP or later lowering.
llvm-svn: 305011
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Early-inlining of recursive call makes the code size bloat exponentially. We should not disable it.
Reviewers: davidxl, dnovillo, iteratee
Reviewed By: iteratee
Subscribers: iteratee, llvm-commits, sanjoy
Differential Revision: https://reviews.llvm.org/D34017
llvm-svn: 305009
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This matches the behavior used in the SDAG when expanding memcmp.
For reference, we're intentionally treating the earlier fortified call transforms differently after:
https://bugs.llvm.org/show_bug.cgi?id=23093
https://reviews.llvm.org/rL233776
One motivation for not transforming nobuiltin calls is that it can interfere with sanitizers:
https://reviews.llvm.org/D19781
https://reviews.llvm.org/D19801
Differential Revision: https://reviews.llvm.org/D34043
llvm-svn: 305007
|
| |
|
|
|
|
| |
Fixes using physical registers in inline asm from clang.
llvm-svn: 305004
|
| |
|
|
|
|
|
|
|
|
| |
In PPCBoolRetToInt bool value is changed to i32 type. On ppc64 it may introduce an extra zero extension for the return value. This patch changes the integer type to i64 to avoid the zero extension on ppc64.
This patch fixed PR32442.
Differential Revision: https://reviews.llvm.org/D31407
llvm-svn: 305001
|
| |
|
|
|
|
|
|
| |
The V_MQSAD_PK_U16_U8, V_QSAD_PK_U16_U8, and V_MQSAD_U32_U8 take more than 1 pass in hardware. For these three instructions, the destination registers must be different than all sources, so that the first pass does not overwrite sources for the following passes.
Differential Revision: https://reviews.llvm.org/D33783
llvm-svn: 304998
|
| |
|
|
|
|
|
|
|
| |
ordering.
This is a temporarily fix which needs additional work, as it triggers a test3 failure.
test3 is commented out till then.
llvm-svn: 304993
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds build vector patterns to exploit the vector integer
extend instructions:
vextsb2w - Vector Extend Sign Byte To Word
vextsb2d - Vector Extend Sign Byte To Doubleword
vextsh2w - Vector Extend Sign Halfword To Word
vextsh2d - Vector Extend Sign Halfword To Doubleword
vextsw2d - Vector Extend Sign Word To Doubleword
Differential Revision: https://reviews.llvm.org/D33510
llvm-svn: 304992
|
| |
|
|
|
|
|
|
|
|
|
| |
In SDAG, we don't expand libcalls with a nobuiltin attribute.
It's not clear if that's correct from the existing code comment:
"Don't do the check if marked as nobuiltin for some reason."
...adding a test here either way to show that there is currently
a different behavior implemented in the CGP-based expansion.
llvm-svn: 304991
|
| |
|
|
| |
llvm-svn: 304989
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The test diff for PowerPC shows we can better optimize if this case is one block.
For x86, there's would be a substantial difference if CGP expansion was enabled because branches are assumed
cheap and SDAG can't optimize across blocks.
Instead of this:
_cmp_eq8:
movq (%rdi), %rax
cmpq (%rsi), %rax
je LBB23_1
## BB#2: ## %res_block
movl $1, %ecx
jmp LBB23_3
LBB23_1:
xorl %ecx, %ecx
LBB23_3: ## %endblock
xorl %eax, %eax
testl %ecx, %ecx
sete %al
retq
We get this:
cmp_eq8:
movq (%rdi), %rcx
xorl %eax, %eax
cmpq (%rsi), %rcx
sete %al
retq
And that matches the optimal codegen that we get from the current expansion in SelectionDAGBuilder::visitMemCmpCall().
If this looks right, then I just need to confirm that vector-sized expansion will work from here, and we can enable
CGP memcmp() expansion for x86. Ie, we'll bypass the power-of-2 special cases currently optimized in SDAG because we
can lower the IR produced here optimally.
Differential Revision: https://reviews.llvm.org/D34005
llvm-svn: 304987
|
| |
|
|
|
|
|
| |
This patch will close PR32801.
Differential Revision: https://reviews.llvm.org/D33203
llvm-svn: 304986
|