| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
ARMLoadStoreOpt::FixInvalidRegPairOp() was only checking if one of the
load destination registers to be split overlapped with the base register
if the base register was marked as killed. Since kill flags may not
always be present, this can lead to incorrect code.
This bug was exposed by my MachineCopyPropagation change D30751 breaking
the sanitizer-x86_64-linux-android buildbot.
Also clean up some dead code and add an assert that a register offset is
never encountered by this code, since it does not handle them correctly.
Reviewers: MatzeB, qcolombet, t.p.northover
Subscribers: aemerson, javed.absar, kristof.beyls, mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D37164
llvm-svn: 311907
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Currently, a phi node is created in the normal destination to unify the return values from promoted calls and the original indirect call. This patch makes this phi node to be created only when the return value has uses.
This patch is necessary to generate valid code, as compiler crashes with the attached test case without this patch. Without this patch, an illegal phi node that has no incoming value from `entry`/`catch` is created in `cleanup` block.
I think existing implementation is good as far as there is at least one use of the original indirect call. `insertCallRetPHI` creates a new phi node in the normal destination block only when the original indirect call dominates its use and the normal destination block. Otherwise, `fixupPHINodeForNormalDest` will handle the unification of return values naturally without creating a new phi node. However, if there's no use, `insertCallRetPHI` still creates a new phi node even when the original indirect call does not dominate the normal destination block, because `getCallRetPHINode` returns false.
Reviewers: xur, davidxl, danielcdh
Reviewed By: xur
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D37176
llvm-svn: 311906
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
S_UDT symbols are the debugger's "index" for all the structs,
typedefs, classes, and enums in a program. If any of those
structs/classes don't have a complete declaration, or if there
is a typedef to something that doesn't have a complete definition,
then emitting the S_UDT is unhelpful because it doesn't give
the debugger enough information to do anything useful. On the
other hand, it results in a huge size blow-up in the resulting
PDB, which is exacerbated by an order of magnitude when linking
with /DEBUG:FASTLINK.
With this patch, we drop S_UDT records for types that refer either
directly or indirectly (e.g. through a typedef, pointer, etc) to
a class/struct/union/enum without a complete definition. This
brings us about 50% of the way towards parity with /DEBUG:FASTLINK
PDBs generated from cl-compiled object files.
Differential Revision: https://reviews.llvm.org/D37162
llvm-svn: 311904
|
| |
|
|
|
|
|
|
|
| |
Added the following P9 instructions: mffsce, mffscdrn, mffscdrni, mffscrn,
mffscrni, mffsl
Differential Revision: https://reviews.llvm.org/D37167
llvm-svn: 311903
|
| |
|
|
|
|
|
|
|
|
| |
NSW flag when handling Add in SimplifyDemandedUseBits.
This is a typo from r311789.
This should fix PR34349.
llvm-svn: 311902
|
| |
|
|
|
|
|
| |
Insert artificial edges between loads that could cause a cache bank
conflict.
llvm-svn: 311901
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Under -cl-fast-relaxed-math we could use native_sqrt, but f64 was
allowed to produce HSAIL's nsqrt instruction. HSAIL is not here
and we stick with non-existing native_sqrt(double) as a result.
Add check for f64 to not return native functions and also remove
handling of f64 case for fold_sqrt.
Differential Revision: https://reviews.llvm.org/D37223
llvm-svn: 311900
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D37168
llvm-svn: 311896
|
| |
|
|
| |
llvm-svn: 311895
|
| |
|
|
| |
llvm-svn: 311894
|
| |
|
|
|
|
|
|
|
|
|
|
| |
combining with BUILD_VECTOR from Legalization to DAG combine
EXTRACT_SUBVECTOR was marked Custom solely so we could combine it with BUILD_VECTOR operations to create smaller BUILD_VECTORS during Legalization. But that sort of combining should really be done by the DAG combiner.
This patch adds the last piece of needed supported DAG combine to handle this. Once that's done we can make the EXTRACT_SUBVECTOR operations Legal.
Differential Revision: https://reviews.llvm.org/D37197
llvm-svn: 311893
|
| |
|
|
|
|
|
|
|
|
| |
into smaller BUILD_VECTORs
Only do this before operations are legalized of BUILD_VECTOR is Legal for the target.
Differential Revision: https://reviews.llvm.org/D37186
llvm-svn: 311892
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
possible sets of x86 prefixes. This patch is the first step to close PR7709 and PR17697. There will be next patch(es) to close relative PRs.
Differential Revision: https://reviews.llvm.org/D36788
M lib/Target/X86/Disassembler/X86DisassemblerDecoder.cpp
M lib/Target/X86/Disassembler/X86DisassemblerDecoder.h
A test/MC/Disassembler/X86/prefixes-i386.s
A test/MC/Disassembler/X86/prefixes-x86_64.s
M test/MC/Disassembler/X86/prefixes.txt
llvm-svn: 311882
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch completely replaces the instruction scheduling information for the Haswell architecture target by modifying the file X86SchedHaswell.td located under the X86 Target.
We used the scheduling information retrieved from the Haswell architects in order to replace and modify the existing scheduling.
The patch continues the scheduling replacement effort started with the SNB target in r307529 and r310792.
Information includes latency, number of micro-Ops and used ports by each HSW instruction.
Please expect some performance fluctuations due to code alignment effects.
Reviewers: RKSimon, zvi, aymanmus, craig.topper, m_zuckerman, igorb, dim, chandlerc, aaboud
Differential Revision: https://reviews.llvm.org/D36663
llvm-svn: 311879
|
| |
|
|
| |
llvm-svn: 311875
|
| |
|
|
|
|
|
|
| |
using X86ISD::UNPCKL in reduceVMULWidth.
This runs fairly early, we should use target independent nodes if possible.
llvm-svn: 311873
|
| |
|
|
|
|
|
|
| |
when SSE2 is disabled.
Without this the madd.ll and sad.ll test cases both throw assertions if you run them with SSE2 disabled.
llvm-svn: 311872
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
struct string {
~string();
};
void f2();
void f1(int) { f2(); }
void run(int c) {
string body;
while (true) {
if (c)
f1(c);
else
f1(c);
}
}
Will recommit once the issue is fixed.
llvm-svn: 311864
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This patch enables generation of NMADD and NMSUB instructions when fneg node
is present. These instructions are currently only generated if fsub node is
present.
Patch by Stanislav Ocovaj.
Differential Revision: https://reviews.llvm.org/D34507
llvm-svn: 311862
|
| |
|
|
|
|
|
|
|
| |
Move condition code support functions to Utils and remove code duplication.
Reviewed by: @fhahn, @asb
Differential Revision: https://reviews.llvm.org/D37179
llvm-svn: 311860
|
| |
|
|
|
|
| |
the lowest subvector. This time with bitcasts between the vselect and the extract.
llvm-svn: 311856
|
| |
|
|
|
|
|
|
| |
As noted in the FIXME, this could be improved more, but this is the smallest fix
that helps:
https://bugs.llvm.org/show_bug.cgi?id=34111
llvm-svn: 311853
|
| |
|
|
|
|
|
|
|
| |
Simplify getDRegFromQReg function
Reviewed by: @fhahn, @asb
Differential Revision: https://reviews.llvm.org/D37118
llvm-svn: 311850
|
| |
|
|
|
|
|
|
|
|
|
| |
Original commit r311077 of D32871 was reverted in r311304 due to failures
reported in PR34248.
This recommit fixes PR34248 by restricting the packing of predicated scalars
into vectors only when vectorizing, avoiding doing so when unrolling w/o
vectorizing. Added a test derived from the reproducer of PR34248.
llvm-svn: 311849
|
| |
|
|
|
|
| |
all zero/one build_vectors.
llvm-svn: 311841
|
| |
|
|
| |
llvm-svn: 311840
|
| |
|
|
| |
llvm-svn: 311838
|
| |
|
|
|
|
|
|
|
|
| |
between the vselect and the extract_subvector. Remove the late DAG combine.
We used to do a late DAG combine to move the bitcasts out of the way, but I'm starting to think that it's better to canonicalize extract_subvector's type to match the type of its input. I've seen some cases where we've formed two different extract_subvector from the same node where one had a bitcast and the other didn't.
Add some more test cases to ensure we've also got most of the zero masking covered too.
llvm-svn: 311837
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Remove redundant explicit template instantiation.
This was reported by Andrew Kelley building release_50 with gcc7.2.0 on MacOS: duplicate symbol llvm::DominatorTreeBase.
Reviewers: kuhar, andrewrk, davide, hans
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D37185
llvm-svn: 311835
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
If all the operands of a BUILD_VECTOR extract elements from same vector then split the
vector efficiently based on the maximum vector access index.
This will also fix PR 33784
Reviewers: zvi, delena, RKSimon, thakis
Reviewed By: RKSimon
Subscribers: chandlerc, eladcohen, llvm-commits
Differential Revision: https://reviews.llvm.org/D35788
llvm-svn: 311833
|
| |
|
|
|
|
|
|
| |
Summary: This reverts commit rL311247.
Differential Revision: https://reviews.llvm.org/D36927
llvm-svn: 311832
|
| |
|
|
|
|
| |
for loads, not just when we do it for stores.
llvm-svn: 311829
|
| |
|
|
|
|
| |
We were suppressing most uses of INC/DEC, but this one seems to have been missed.
llvm-svn: 311828
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Add options -print-bfi/-print-bpi that dump block frequency and branch
probability info like -view-block-freq-propagation-dags and
-view-machine-block-freq-propagation-dags do but in text.
This is useful when the graph is very large and complex (the dot command
crashes, lines/edges too close to tell apart, hard to navigate without textual
search) or simply when text is preferred.
Reviewers: davidxl
Reviewed By: davidxl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D37165
llvm-svn: 311822
|
| |
|
|
|
|
|
|
| |
extract_subvector of the lowest subvector.
This only supports 32 and 64 bit element sizes for now. But we could probably do 16 and 8-bit elements with BWI.
llvm-svn: 311821
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
to instructions.
These can't be reasonably matched in tablegen due to the handling of
flags, so we have to do this in C++ code. We only did it for `inc` and
`dec` historically, this starts fleshing that out to more interesting
instructions. Notably, this handles transfering operands to `add` and
`sub`.
Currently this forces them into a register. The next patch will add
support for keeping immediate operands as immediates. Then I'll extend
this beyond just `add` and `sub`.
I'm not super thrilled by the repeated switches in the code but
everything else I tried was really ugly or problematic.
Many thanks to Craig Topper for the suggestions about where to even
begin here and how to make this stuff work.
Differential Revision: https://reviews.llvm.org/D37130
llvm-svn: 311806
|
| |
|
|
|
|
| |
Fixes PR34325.
llvm-svn: 311805
|
| |
|
|
|
|
|
|
| |
Prior to this change (and after r311371), we computed it
unconditionally, causin gsevere compile time regressions (in some
cases, 5 to 10x).
llvm-svn: 311804
|
| |
|
|
|
|
| |
This reverts r311801 due to a bot failure.
llvm-svn: 311803
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- Don't sanitize __sancov_lowest_stack.
- Don't instrument leaf functions.
- Add CoverageStackDepth to Fuzzer and FuzzerNoLink.
Reviewers: vitalybuka, kcc
Reviewed By: kcc
Subscribers: cfe-commits, llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D37156
llvm-svn: 311801
|
| |
|
|
| |
llvm-svn: 311794
|
| |
|
|
|
|
|
|
|
| |
Change the early exit condition from Cost > Threshold to Cost >= Threshold
because the inline condition is Cost < Threshold.
Differential Revision: https://reviews.llvm.org/D37087
llvm-svn: 311791
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
bit of Add/Sub is demanded.
Just create an all 1s demanded mask and continue recursing like normal. The recursive calls should be able to handle an all 1s mask and do the right thing.
The only time we should care about knowing whether the upper bit was demanded is when we need to know if we should clear the NSW/NUW flags.
Now that we have a consistent path through the code for all cases, use KnownBits::computeForAddSub to compute the known bits at the end since we already have the LHS and RHS.
My larger goal here is to move the code that turns add into xor if only 1 bit is demanded and no bits below it are non-zero from InstCombiner::OptAndOp to here. This will allow it to be more general instead of just looking for 'add' and 'and' with constant RHS.
Differential Revision: https://reviews.llvm.org/D36486
llvm-svn: 311789
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
SimplifyIndVar may introduce zext instructions to widen arguments of the
loop exit check. They should not prevent us from splitting the loop at
the induction variable, but maybe the check should be more conservative,
e.g. making sure it only extends arguments used by a comparison?
Reviewers: karthikthecool, mcrosier, mzolotukhin
Reviewed By: mcrosier
Subscribers: mzolotukhin, llvm-commits
Differential Revision: https://reviews.llvm.org/D34879
llvm-svn: 311783
|
| |
|
|
|
|
|
|
| |
Since the lambda isn't escaped (via a std::function or similar) it's
fine/better to use default capture-by-ref to provide semantics similar
to language-level nested scopes (if/for/while/etc).
llvm-svn: 311782
|
| |
|
|
| |
llvm-svn: 311781
|
| |
|
|
|
|
|
| |
Commit r297442 introduced mixed CRLF/LF line endings to two files.
Normalize to to LF-only line endings.
llvm-svn: 311774
|
| |
|
|
|
|
|
|
|
|
| |
convert AShr to LShr.
There are cases where AShr have better chance to be optimized than LShr, especially when the demanded bits are not known to be Zero, and also known to be similar to the sign bit.
Differential Revision: https://reviews.llvm.org/D36936
llvm-svn: 311773
|
| |
|
|
|
|
| |
getOpcode on that. NFC
llvm-svn: 311765
|
| |
|
|
|
|
|
|
| |
compares. NFCI
While there use a local variable instead of calling C->getZExtValue() repeatedly.
llvm-svn: 311764
|