| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D69909
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
When adjusting function entry counts after inlining, Funciton::setEntryCount is called without providing an import function list. The side effect of that is the previously set import function list will be dropped. The import function list is used by ThinLTO to help import hot cross module callee for LTO inlining, so dropping that during ThinLTO pre-link may adversely affect LTO inlining. The fix is to keep the list while updating entry counts for inlining.
Reviewers: wmi, davidxl, tejohnson
Subscribers: mehdi_amini, hiraditya, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69736
|
|
|
|
|
|
|
|
|
|
|
| |
Add pattern matching and intrinsics for the following instructions:
predicated orr, eor, and, bic
predicated mul, smulh, umulh, sdiv, udiv, sdivr, udivr
predicated smax, umax, smin, umin, sabd, uabd
mad, msb, mla, mls
https://reviews.llvm.org/D69588
|
|
|
|
|
| |
This reverts commit c52efdc52cef2597a1d21595a9685e2f798025b8,
because b5913e6d2f got reverted.
|
|
|
|
| |
This reverts commit b5913e6d2f6d13fb753df701619731ca11936316.
|
|
|
|
| |
This only works if there is no use of the return value.
|
|
|
|
|
|
|
| |
AMDGPU has some atomic instructions that do not return the previous
result, and can only be selected if there are no uses. The source
pattern will only match if the use is empty, so it should be safe to
discard the result.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
"[SLP] Generalization of stores vectorization."
"[SLP] Fix -Wunused-variable. NFC"
"[SLP] Vectorize jumbled stores."
As they're causing significant (10-30x) compile time regressions on
vectorizable code.
The primary cause of the compile-time regression is f228b5371647f471853c5fb3e6719823a42fe451.
This reverts commits:
f228b5371647f471853c5fb3e6719823a42fe451
5503455ccb3f5fcedced158332c016c8d3a7fa81
21d498c9c0f32dcab5bc89ac593aa813b533b43a
|
|
|
|
|
|
|
|
|
|
| |
require scanning symbols
Performance issues lead to the libc++ std::function formatter to be disabled.
This change is the first of two changes that should address the performance issues and allow us to enable the formatter again.
In some cases we end up scanning the symbol table for the callable wrapped by std::function for those cases we will now cache the results and used the cache in subsequent look-ups. This still leaves a large cost for the initial lookup which will be addressed in the next change.
Differential Revision: https://reviews.llvm.org/D67111
|
|
|
|
|
|
| |
This was omitted. Also SReg_96Reg missed IsSGPR assignment.
Differential Revision: https://reviews.llvm.org/D69919
|
|
|
|
|
|
|
|
|
|
|
| |
Have CMake treat the unwind libraries as C libraries rather than C++.
There is no C++ runtime dependency at runtime. This ensures that we do
not accidentally end up with a link against the C++ runtime.
We need to explicitly reset the implicitly linked libraries for C++ to
ensure that we do not have CMake force the link against the C++ runtime.
This adjustment should enable the NetBSD bots to be happy with this
change.
|
|
|
|
|
| |
Reflow the CMake properties to take less vertical space. This just
makes it easier to read. NFC.
|
|
|
|
|
|
|
|
| |
The basic idea of the transform is to convert variant loop exit conditions into invariant exit conditions by changing the iteration on which the exit is taken when we know that the trip count is unobservable. See the original patch which introduced the code for a more complete explanation.
The individual parts of this have been reviewed, the result has been fuzzed, and then further analyzed by hand, but despite all of that, I will not be suprised to see breakage here. If you see problems, please don't hesitate to revert - though please do provide a test case. The most likely class of issues are latent SCEV bugs and without a reduced test case, I'll be essentially stuck on reducing them.
(Note: A bunch of tests were opted out of the new transform to preserve coverage. That landed in a previous commit to simplify revert cycles if they turn out to be needed.)
|
|
|
|
|
|
| |
I'm about to enable the new loop predication transform by default. It has the effect of completely destroying many read only loops - which happen to be a super common idiom in our test cases. So as to preserve test coverage of other transforms, disable the new transform where it would cause sharp test coverage regressions.
(This is semantically part of the enabling commit. It's committed separate to ease revert if the actual flag flip gets reverted.)
|
| |
|
|
|
|
|
|
|
|
|
|
| |
return value location depends on the calling convention of the callee.
`F.getCallingConv()`, however, is the caller CC. Correct it to the
callee CC from `CallLoweringInfo`.
Fixes PR43449
Patch by Shu-Chun Weng!
|
|
|
|
|
|
| |
Without asan and tsan as test dependencies, you might end up with a
clang that points to sanitizer runtime library that hasn't been build
yet.
|
|
|
|
|
| |
In the case where xcodebuild fails as you set up simulator tests, you
would fail because `feature` is never defined.
|
| |
|
|
|
|
|
|
|
| |
The -use-mcjit option was replaced with -jit-kind=mcjit a while back. This patch
updates the docs to reflect that.
Patch by Yu Jian. Thanks Jian!
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Much like D67339, adds ConstantRange handling for
when we know no-wrap behavior of the `sub`.
Unlike addWithNoWrap(), we only get lucky re returning empty set
for signed wrap. For unsigned, we must perform overflow check manually.
A patch that makes use of this in LVI (CVP) to be posted later.
Reviewers: nikic, shchenz, efriedma
Reviewed By: nikic
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69918
|
|
|
|
|
|
|
|
|
| |
sadd_sat()/uadd_sat()
As discussed in https://reviews.llvm.org/D69918
that happens to work as intended, and returns empty set if
there is always an overflow because we get lucky with intersection.
Since there's now an explicit test for that, let's prefer cleaner code.
|
|
|
|
|
|
|
| |
empty set
As disscussed in https://reviews.llvm.org/D69918 / https://reviews.llvm.org/D67339
that is an implied postcondition, but it's not really fully tested.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some targets (E.g. MachO/arm64) use relocations to fix some CFI record fields
in the eh-frame section. When relocations are used the initial (pre-relocation)
content of the eh-frame section can no longer be interpreted by following the
eh-frame specification. This causes errors in the existing eh-frame parser.
This patch moves eh-frame handling into two LinkGraph passes that are run after
relocations have been parsed (but before they are applied). The first] pass
breaks up blocks in the eh-frame section into per-CFI-record blocks, and the
second parses blocks of (potentially multiple) CFI records and adds the
appropriate edges to any CFI fields that do not have existing relocations.
These passes can be run independently of one another. By handling eh-frame
splitting/fixing with LinkGraph passes we can both re-use existing relocations
for CFI record fields and avoid applying eh-frame fixups before parsing the
section (which would complicate the linker and require extra temporary
allocations of working memory).
|
|
|
|
|
|
| |
To do so, we need to register the sanitizer libraries with the target
so that they get uploaded before running. This patch adds a helper to
the test class to this effect.
|
|
|
|
|
|
| |
Add support for clangs mangling extension for block invocations.
Differential Revision: https://reviews.llvm.org/D69738
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D69805
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch factors out code to clone instructions -- partly for
readability and partly to facilitate an upcoming patch of my own.
Reviewers: wmi
Subscribers: hiraditya, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69861
|
|
|
|
|
|
|
|
|
|
|
|
| |
We had a subtle, but nasty bug in our definition of a widenable branch, and thus in the transforms which used that utility. Specifically, we returned true for any branch which included a widenable condition within it's condition, regardless of whether that widenable condition also had other uses.
The problem is that the result of the WC() call is defined to be one particular value. As such, all users must agree as to what that value is. If we widen a branch without also updating *all other users* of the WC in the same way, we have broken the required semantics.
Most of the textual diff is updating existing transforms not to leave dead uses hanging around. They're largely NFC as the dead instructions would be immediately deleted by other passes. The reason to make these changes is so that the transforms preserve the widenable branch form.
In practice, we don't get bitten by this only because it isn't profitable to CSE WC() calls and the lowering pass from guards uses distinct WC calls per branch.
Differential Revision: https://reviews.llvm.org/D69916
|
|
|
|
| |
optimization
|
|
|
|
| |
This avoids config time dependencies on liblldb. And enables other refactoring.
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes two issues noticed by inspection when going to enable the loop predication code in IndVarSimplify.
Issue 1 - Both the LoopPredication transform, and the already on by default optimizeLoopExits transform, modify the exit count of the exits they modify. (either to 0 or Infinity) Looking at the code more closely, this was not reflected into SCEV and we were instead running later transforms with incorrect SCEVs. Fixing this requires forgetting the loop, weakening a too strong assert, and updating SCEV to not pessimize results when a loop is provable untaken. I haven't been able to find a test case to demonstrate the miscompile.
Issue 2 - For modules without a data layout, we can end up with unsized pointer typed exit counts. Just bail out of this case.
I think these are the last two issues which need addressed before we enable this by default. The code has already survived a decent amount of fuzzing without revealing either of the above.
Differential Revision: https://reviews.llvm.org/D69695
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
lit's test suite calls lit multiple times for various sample test
suites. `FILECHECK_OPTS` is safe for FileCheck calls in lit's test
suite. It's not safe for FileCheck calls in the sample test suites,
whose output affects the results of lit's test suite.
Without this patch, only one such sample test suite is protected from
`FILECHECK_OPTS`, and currently `shtest-shell.py` breaks with
`FILECHECK_OPTS=-vv`. Moreover, it's hard to predict the future,
especially false passes. Thus, this patch protects all existing and
future sample test suites from `FILECHECK_OPTS` (and the deprecated
`FILECHECK_DUMP_INPUT_ON_FAILURE`).
Reviewed By: probinson
Differential Revision: https://reviews.llvm.org/D65156
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The MMX intrinsics for shift by immediate take a 32-bit shift
amount but the hardware for shifting by immediate only encodes
8-bits. For the intrinsic we don't require the shift amount to
fit in 8-bits in the frontend because we don't check that its an
immediate in the frontend. If its is not an immediate we move it
to an MMX register and use the shift by register.
But if it is an immediate we'll use the shift by immediate
instruction. But we need to change the shift amount to 8-bits.
We were previously doing this accidentally by masking it in the
encoder. But this can make a large shift amount into a small
in bounds shift amount. Instead we should clamp larger shift
amounts to 255 so that the they don't become in bounds.
Fixes PR43922
|
|
|
|
|
|
|
| |
These patterns were added in D46009, but removed in D54276 due to
missing test coverage.
Differential Revision: https://reviews.llvm.org/D69831
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
dump_format_style.py to get a little closer to being correct. (part 2)
Summary:
a change {D67541} cause LanguageStandard to now be subtly different from all other clang-format options, in that the Enum value (less the prefix) is not always allowed as valid as the configuration option.
This caused the ClangFormatStyleOptions.rst and the Format.h to diverge so that the ClangFormatStyleOptions.rst could no longer be generated from the Format.h using dump_format_stlye.py
This fix tried to remedy that:
1) by allowing an additional comment (in Format.h) after the enum to be used as the `in configuration ( XXXX )` text, and changing the dump_format_style.py to support that.
This makes the following code:
```
enum {
...
LS_Cpp03, // c++03
LS_Cpp11, // c++11
...
};
```
would render as:
```* ``LS_Cpp03`` (in configuration: ``c++03``)
* ``LS_Cpp11`` (in configuration: ``c++11``)
```
And we also move the deprecated alias into the text of the enum (otherwise it won't be added at the end as an option)
This patch includes a couple of other whitespace changes which help bring Format.h and ClangFormatStyleOptions.rst almost back into line and regeneratable... (there is still one more)
Reviewers: klimek, mitchell-stellar, sammccall
Reviewed By: mitchell-stellar, sammccall
Subscribers: mrexodia, cfe-commits
Tags: #clang, #clang-format
Differential Revision: https://reviews.llvm.org/D69433
|
|
|
|
|
|
|
|
|
| |
This diff adds a new "driver" for llvm-objcopy
which is supposed to emulate the behavior of install-name-tool.
Differential revision: https://reviews.llvm.org/D69146
Test plan: make check-all
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
this allows us to move logic about when it is appropriate set
LLVM_NO_DEAD_STRIP out of each tool and into add_llvm_executable,
which will enable future platform specific handling.
This is a follow on to the reverted D69356
Reviewers: hubert.reinterpretcast, beanz, lhames
Reviewed By: beanz
Subscribers: mgorny, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D69638
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
concat_vectors
The combine G_UNMERGE_VALUES with G_CONCAT_VECTORS used to only be performed
when the result type of the G_UNMERGE_VALUES was a vector type.
In other words, we were expecting that the G_UNMERGE_VALUES was effectively
the exact opposite of the G_CONCAT_VECTORS.
Lift that constraint by allowing any G_UNMERGE_VALUES to be combined
with any G_CONCAT_VECTORS (as long as the size of the different pieces
that we merge/unmerge match).
Differential Revision: https://reviews.llvm.org/D69288
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Rewrite one of the invalid macho test input file with YAML file. The
original invalid macho is breaking our internal test infrastusture
because it is too broken to be copy around.
Need to relax an assertion in the YAML/MachoEmitter to allow yaml2obj to
write an invalid object like this.
rdar://problem/56879982
Reviewers: beanz, mtrent
Reviewed By: beanz
Subscribers: hiraditya, jkorous, dexonsmith, ributzka, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69856
|
|
|
|
| |
always true. NFCI.
|
|
|
|
|
|
| |
Noticed while fixing the reduction costs for D59710 - the SLM model doesn't account for the poor throughput of v2i64 ops.
Numbers taken from Intel AOM (+ checked against Agner)
|
|
|
|
| |
Noticed while fixing the reduction costs for D59710 - the SLM model doesn't account for the poor throughput of v2f64/v2i64 ops.
|
| |
|
|
|
|
|
|
|
|
|
| |
Disable the type information emission for libunwind. libunwind does not
use `dynamic_cast`. This results in a smaller binary, and more
importantly, avoids the dependency on libc++abi. This ensures that we
have complete symbol resolution of symbols on ELF targets without
linking to the C++ runtime support library. This change avoids the
emission of a reference to `__si_class_type_info`.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
enabling '3dnow'
All SSE capable CPUs have MMX. 3dnow implicitly enables MMX.
We have code that detects if sse is enabled and implicitly enables
MMX unless -mno-mmx is passed. So in most cases we were already
enabling MMX if march passed a CPU that supported SSE.
The exception to this is if you pass -march for a cpu supports SSE
and also pass -mno-sse. We should still enable MMX since its part
of the CPU capability.
|
|
|
|
| |
As noted on D59710 we weren't handling the high costs of these operations on SLM.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: relanding {D68969} after it failed UBSAN build caused by the passing of an invalid SMLoc() (nullptr)
Reviewers: thakis, vlad.tsyrklevich, klimek, mitchell-stellar
Reviewed By: thakis
Subscribers: merge_guards_bot, mgorny, cfe-commits
Tags: #clang-format, #clang
Differential Revision: https://reviews.llvm.org/D69854
|