| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The 1st try was reverted because it could inf-loop by creating a dead instruction.
Fixed that to not happen and added a test case to verify.
Original commit message:
Try to fold:
memcmp(X, C, ConstantLength) == 0 --> load X == *C
Without this change, we're unnecessarily checking the alignment of the constant data,
so we miss the transform in the first 2 tests in the patch.
I noted this shortcoming of LibCallSimpifier in one of the recent CGP memcmp expansion
patches. This doesn't help the example in:
https://bugs.llvm.org/show_bug.cgi?id=34032#c13
...directly, but it's worth short-circuiting more of these simple cases since we're
already trying to do that.
The benefit of transforming to load+cmp is that existing IR analysis/transforms may
further simplify that code. For example, if the load of the variable is common to
multiple memcmp calls, CSE can remove the duplicate instructions.
Differential Revision: https://reviews.llvm.org/D36922
llvm-svn: 311366
|
| |
|
|
|
|
|
|
|
|
| |
(icmp sgt X, -1) as equivalent to an and with the sign bit of the truncated type
This is similar to what was already done in foldSelectICmpAndOr. Ultimately I'd like to see if we can call foldSelectICmpAnd from foldSelectIntoOp if we detect a power of 2 constant. This would allow us to remove foldSelectICmpAndOr entirely.
Differential Revision: https://reviews.llvm.org/D36498
llvm-svn: 311362
|
| |
|
|
|
|
|
|
| |
Looks like this fails to build with libstdc++.
This reverts r311356
llvm-svn: 311358
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This introduces the FuzzMutate library, which provides structured
fuzzing for LLVM IR, as described in my [EuroLLVM 2017 talk][1]. Most
of the basic mutators to inject and delete IR are provided, with
support for most basic operations.
I will follow up with the instruction selection fuzzer, which is
implemented in terms of this library.
[1]: http://llvm.org/devmtg/2017-03//2017/02/20/accepted-sessions.html#2
llvm-svn: 311356
|
| |
|
|
|
|
|
|
|
| |
For the medium and large code models we only need to check if a call crosses
dso-boundaries when considering tail-call elgibility.
Differential Revision: https://reviews.llvm.org/D34245
llvm-svn: 311353
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is an attempt to move WholeProgramDevirt to the new remark API.
https://bugs.llvm.org/show_bug.cgi?id=33793
Reviewers: anemet
Reviewed By: anemet
Subscribers: fhahn, llvm-commits
Differential Revision: https://reviews.llvm.org/D36943
llvm-svn: 311352
|
| |
|
|
|
|
|
|
| |
Previously, we would just assert instead.
Differential Revision: https://reviews.llvm.org/D36961
llvm-svn: 311351
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This updates the Inliner to only add a single Optimization
Remark when Inlining, rather than an Analysis Remark and an
Optimization Remark.
Fixes https://bugs.llvm.org/show_bug.cgi?id=33786
Reviewers: anemet, davidxl, chandlerc
Reviewed By: anemet
Subscribers: haicheng, fhahn, mehdi_amini, dblaikie, llvm-commits, eraman
Differential Revision: https://reviews.llvm.org/D36054
llvm-svn: 311349
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
If the bitsToClear from the LHS of an 'and' comes back non-zero, but all of those bits are known zero on the RHS, we can reset bitsToClear.
Without this, the 'or' in the modified test case blocks the transform because it has non-zero bits in its RHS in those bits.
Reviewers: spatel, majnemer, davide
Reviewed By: davide
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36944
llvm-svn: 311343
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
use of every node all the way back to the root of the match
Summary: With masked operations, its possible for the operation node like fadd, fsub, etc. to be used by multiple different vselects. Since the pattern matching will start at the vselect, we need to make sure the operation node itself is only used once before we can fold a load. Otherwise we'll end up folding the same load into multiple instructions.
Reviewers: RKSimon, spatel, zvi, igorb
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36938
llvm-svn: 311342
|
| |
|
|
| |
llvm-svn: 311341
|
| |
|
|
|
|
|
|
|
|
| |
arguments
We're getting lots of compile-timeout bot failures like:
http://lab.llvm.org:8011/builders/clang-native-arm-lnt/builds/7119
http://lab.llvm.org:8011/builders/clang-cmake-x86_64-avx2-linux
llvm-svn: 311340
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds support for dumping a summary of module symbols
and CodeView debug chunks. This option prints a table for
each module of all of the symbols that occurred in the module
and the number of times it occurred and total byte size. Then
at the end it prints the totals for the entire file.
Additionally, this patch adds the -jmc (just my code) option,
which suppresses modules which are from external libraries or
linker imports, so that you can focus only on the object files
and libraries that originate from your own source code.
llvm-svn: 311338
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Try to fold:
memcmp(X, C, ConstantLength) == 0 --> load X == *C
Without this change, we're unnecessarily checking the alignment of the constant data,
so we miss the transform in the first 2 tests in the patch.
I noted this shortcoming of LibCallSimpifier in one of the recent CGP memcmp expansion
patches. This doesn't help the example in:
https://bugs.llvm.org/show_bug.cgi?id=34032#c13
...directly, but it's worth short-circuiting more of these simple cases since we're
already trying to do that.
The benefit of transforming to load+cmp is that existing IR analysis/transforms may
further simplify that code. For example, if the load of the variable is common to
multiple memcmp calls, CSE can remove the duplicate instructions.
Differential Revision: https://reviews.llvm.org/D36922
llvm-svn: 311333
|
| |
|
|
|
|
|
|
|
|
|
| |
Preparations to use the per-increment are sometimes done in the target
independent pass Loop Strength Reduction. We try to detect them in the PowerPC
specific pass so that they are not done twice and so that we do not add PHIs
that are not required.
Differential Revision: https://reviews.llvm.org/D36736
llvm-svn: 311332
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Support G_BRCOND operation. For now don't try to fold cmp/trunc instructions.
Reviewers: zvi, guyblank
Reviewed By: guyblank
Subscribers: rovka, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D34754
llvm-svn: 311327
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Re-committing after r311325 fixed an unintentional use of '#' comments in
clang.
The '#' token is not a comment for all targets (on ARM and AArch64 it marks an
immediate operand), so we shouldn't treat it as such.
Comments are already converted to AsmToken::EndOfStatement by
AsmLexer::LexLineComment, so this check was unnecessary.
Differential Revision: https://reviews.llvm.org/D36405
llvm-svn: 311326
|
| |
|
|
|
|
| |
LOAD_STACK_GUARD and PHI nodes.
llvm-svn: 311323
|
| |
|
|
| |
llvm-svn: 311321
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
widely used processors.
This occured to me when I saw that we were generating 'inc' and 'dec'
when for Haswell and newer we shouldn't. However, there were a few "X is
slow" things that we should probably just set.
I've avoided any of the "X is fast" features because most of those would
be pretty serious regressions on processors where X isn't actually fast.
The slow things are likely to be negligible costs on processors where
these aren't slow and a significant win when they are slow.
In retrospect this seems somewhat obvious. Not sure why we didn't do
this a long time ago.
Differential Revision: https://reviews.llvm.org/D36947
llvm-svn: 311318
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
rather than doing a separate comparison.
This both saves an explicit comparision and avoids the use of `xadd`
which introduces register constraints and other challenges to the
generated code.
The motivating case is from atomic reference counts where `1` is the
sentinel rather than `0` for whatever reason. This can and should be
lowered efficiently on x86 by just using a different flag, however the
x86 code only handled the `0` case.
There remains some further opportunities here that are currently hidden
due to canonicalization. I've included test cases that show these and
FIXMEs. However, I don't at the moment have any production use cases and
they seem substantially harder to address.
Differential Revision: https://reviews.llvm.org/D36945
llvm-svn: 311317
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch introduces support for Cortex-A75 and Cortex-A55, Arm's
latest big.LITTLE A-class cores. They implement the ARMv8.2-A
architecture, including the cryptography and RAS extensions, plus
the optional dot product extension. They also implement the RCpc
AArch64 extension from ARMv8.3-A.
Cortex-A75:
https://developer.arm.com/products/processors/cortex-a/cortex-a75
Cortex-A55:
https://developer.arm.com/products/processors/cortex-a/cortex-a55
Differential Revision: https://reviews.llvm.org/D36667
llvm-svn: 311316
|
| |
|
|
|
|
|
| |
Allow those prefixes on assembly code
Differential Revision: https://reviews.llvm.org/D36845
llvm-svn: 311309
|
| |
|
|
|
|
|
|
|
|
| |
broadcasts when AVX512DQ is enabled.
There's no functional difference between the AVX512DQ instructions if we're not masking.
This change unifies test checks and removes extra isel entries. Similar was done for subvector insert and extracts recently.
llvm-svn: 311308
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
When extracting the instrumentation map from a binary, we should be able
to recognize the new kinds of instrumentation sleds we've been emitting
with the compiler using -fxray-instrument. This change adds a test for
all the kinds of sleds we currently support (sans the tail-call sled,
which is a bit harder to force in a simple prebuilt input).
Reviewers: kpw, dblaikie
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36819
llvm-svn: 311305
|
| |
|
|
|
|
|
| |
This causes LLVM to assert fail on PPC64 and crash / infloop in other
cases. Filed http://llvm.org/PR34248 with reproducer attached.
llvm-svn: 311304
|
| |
|
|
| |
llvm-svn: 311297
|
| |
|
|
|
|
| |
No functionality change intended.
llvm-svn: 311295
|
| |
|
|
| |
llvm-svn: 311294
|
| |
|
|
| |
llvm-svn: 311291
|
| |
|
|
|
|
| |
No functionality change intended.
llvm-svn: 311290
|
| |
|
|
|
|
| |
No functionality change intended.
llvm-svn: 311288
|
| |
|
|
|
|
| |
There's no reason to destroy them in a global destructor.
llvm-svn: 311287
|
| |
|
|
|
|
|
|
|
| |
Store operation takes 2 UOps on X86 processors. The exact cost calculation affects several optimization passes including loop unroling.
This change compensates performance degradation caused by https://reviews.llvm.org/D34458 and shows improvements on some benchmarks.
Differential Revision: https://reviews.llvm.org/D35888
llvm-svn: 311285
|
| |
|
|
|
|
|
|
|
|
|
| |
Added a separate metadata to indicate when the loop
has already been vectorized instead of setting width and count to 1.
Patch written by Divya Shanmughan and Aditya Kumar
Differential Revision: https://reviews.llvm.org/D36220
llvm-svn: 311281
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Support call ABI. For now only Linux C and X86_64_SysV calling conventions supported. Variadic function not supported.
Reviewers: zvi, guyblank, oren_ben_simhon
Reviewed By: oren_ben_simhon
Subscribers: rovka, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D34602
llvm-svn: 311279
|
| |
|
|
|
|
| |
Usually this case generated by ABI lowering, it requare to performe trancate/anyext.
llvm-svn: 311278
|
| |
|
|
| |
llvm-svn: 311277
|
| |
|
|
|
|
| |
Replace with report_fatal_error.
llvm-svn: 311276
|
| |
|
|
| |
llvm-svn: 311275
|
| |
|
|
|
|
| |
Reverting due to clang build failure
llvm-svn: 311274
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This updates the Inliner to only add a single Optimization
Remark when Inlining, rather than an Analysis Remark and an
Optimization Remark.
Fixes https://bugs.llvm.org/show_bug.cgi?id=33786
Reviewers: anemet, davidxl, chandlerc
Reviewed By: anemet
Subscribers: haicheng, fhahn, mehdi_amini, dblaikie, llvm-commits, eraman
Differential Revision: https://reviews.llvm.org/D36054
llvm-svn: 311273
|
| |
|
|
|
|
|
|
|
| |
cmov self-refrencing.
Pointed out by Amjad Aboud in code review, test case minorly simplified
from the one he posted.
llvm-svn: 311267
|
| |
|
|
|
|
|
|
|
|
| |
predicates.
We can load the memory VT and check for natural alignment. This also adds a new preferNonTemporalLoad helper that checks the correct subtarget feature based on the load size.
This shrinks the isel table by at least 5000 bytes by allowing more reordering and combining to occur.
llvm-svn: 311266
|
| |
|
|
|
|
|
|
| |
predicate.
We can read the memoryVT and get its store size directly from the SDNode to check its alignment.
llvm-svn: 311265
|
| |
|
|
|
|
| |
from an extract subvector operation.
llvm-svn: 311263
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D36930
llvm-svn: 311260
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This is the exact same fix as in SVN r247254. In that commit, the fix was
applied only for isVTRNMask and isVTRN_v_undef_Mask, but the same issue
is present for VZIP/VUZP as well.
This fixes PR33921.
Differential Revision: https://reviews.llvm.org/D36899
llvm-svn: 311258
|
| |
|
|
|
|
|
| |
- Move *load* functions before *atomic* functions
- Move *store* functions before *atomic* functions
llvm-svn: 311256
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
If all the operands of a BUILD_VECTOR extract elements from same vector then split the
vector efficiently based on the maximum vector access index.
Reviewers: zvi, delena, RKSimon, thakis
Reviewed By: RKSimon
Subscribers: chandlerc, eladcohen, llvm-commits
Differential Revision: https://reviews.llvm.org/D35788
llvm-svn: 311255
|