| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
| |
Added matchBinaryPermuteVectorShuffle and moved the blend+zero and insertps matching code into it.
llvm-svn: 277808
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
a registry"
This differs from the previous version by being more careful about template
instantiation/specialization in order to prevent errors when building with
clang -Werror. Specifically:
* begin is not defined in the template and is instead instantiated when Head
is. I think the warning when we don't do that is wrong (PR28815) but for now
at least do it this way to avoid the warning.
* Instead of performing template specializations in LLVM_INSTANTIATE_REGISTRY
instead provide a template definition then do explicit instantiation. No
compiler I've tried has problems with doing it the other way, but strictly
speaking it's not permitted by the C++ standard so better safe than sorry.
Original commit message:
Currently the Registry class contains the vestiges of a previous attempt to
allow plugins to be used on Windows without using BUILD_SHARED_LIBS, where a
plugin would have its own copy of a registry and export it to be imported by
the tool that's loading the plugin. This only works if the plugin is entirely
self-contained with the only interface between the plugin and tool being the
registry, and in particular this conflicts with how IR pass plugins work.
This patch changes things so that instead the add_node function of the registry
is exported by the tool and then imported by the plugin, which solves this
problem and also means that instead of every plugin having to export every
registry they use instead LLVM only has to export the add_node functions. This
allows plugins that use a registry to work on Windows if
LLVM_EXPORT_SYMBOLS_FOR_PLUGINS is used.
llvm-svn: 277806
|
| |
|
|
|
|
|
|
|
|
| |
This patch fixes passing long double type arguments to function in
soft float mode. If there is less than 4 argument registers free
(long double type is mapped in 4 gpr registers in soft float mode)
long double type argument must be passed through stack.
Differential Revision: https://reviews.llvm.org/D20114.
llvm-svn: 277804
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Turn (select C, (sext A), B) into (sext (select C, A, B')) when A is i1 and
B is a compatible constant, also for zext instead of sext. This will then be
further folded into logical operations.
The transformation would be valid for non-i1 types as well, but other parts of
InstCombine prefer to have sext from non-i1 as an operand of select.
Motivated by the shader compiler frontend in Mesa for AMDGPU, which emits i32
for boolean operations. With this change, the boolean logic is fully
recovered.
Reviewers: majnemer, spatel, tstellarAMD
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D22747
llvm-svn: 277801
|
| |
|
|
| |
llvm-svn: 277793
|
| |
|
|
| |
llvm-svn: 277792
|
| |
|
|
| |
llvm-svn: 277786
|
| |
|
|
|
|
|
|
|
|
| |
The patch splits a complex && if condition into easier to read and understand
logic. That wrong early exit condition was letting some instructions with not
all operands available pass through when HoistingGeps was true.
Differential Revision: https://reviews.llvm.org/D23174
llvm-svn: 277785
|
| |
|
|
|
|
|
|
|
|
| |
Add a generalized IRBuilderCallbackInserter, which is just given a
callback to execute after insertion. This can be used to get rid of
the custom inserter in InstCombine, which will in turn allow me to add
target specific InstCombineCalls API for intrinsics without horrible
layering violations.
llvm-svn: 277784
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Shifts with a uniform but non-constant count were considered very expensive to
vectorize, because the splat of the uniform count and the shift would tend to
appear in different blocks. That made the splat invisible to ISel, and we'd
scalarize the shift at codegen time.
Since r201655, CodeGenPrepare sinks those splats to be next to their use, and we
are able to select the appropriate vector shifts. This updates the cost model to
to take this into account by making shifts by a uniform cheap again.
Differential Revision: https://reviews.llvm.org/D23049
llvm-svn: 277782
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
constant vectors
This concludes the splat vector enhancements for foldICmpEqualityWithConstant().
Other commits in this series:
https://reviews.llvm.org/rL277762
https://reviews.llvm.org/rL277752
https://reviews.llvm.org/rL277738
https://reviews.llvm.org/rL277731
https://reviews.llvm.org/rL277659
https://reviews.llvm.org/rL277638
https://reviews.llvm.org/rL277629
llvm-svn: 277779
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
is not a nullptr
when we are pointed at real data.
David Blaikie pointed out some odd logic in the case the Err value was a nullptr and
Lang Hames suggested it could be cleaned it up with an assert to know that Err is
not a nullptr when we are pointed at real data. As only in the case of constructing
the sentinel value by pointing it at null data is Err is permitted to be a nullptr,
since no error could occur in that case.
With this change the testing for “if (Err)” is removed from the constructor’s logic
and *Err is used directly without any check after the assert().
llvm-svn: 277776
|
| |
|
|
|
|
|
| |
These are the operations that are trivially identical. Division is omitted for
now because you need to use the correct sign/zero extension.
llvm-svn: 277775
|
| |
|
|
| |
llvm-svn: 277774
|
| |
|
|
| |
llvm-svn: 277769
|
| |
|
|
| |
llvm-svn: 277767
|
| |
|
|
|
|
| |
and remove the JITSymbolFlags header.
llvm-svn: 277766
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
coro.destroy.
This is the forth patch in the coroutine series. CoroEaly pass now lowers coro.resume
and coro.destroy intrinsics by replacing them with an indirect call to an address
returned by coro.subfn.addr intrinsic. This is done so that CGPassManager recognizes
devirtualization when CoroElide replaces a call to coro.subfn.addr with an appropriate
function address.
Patch by Gor Nishanov!
Differential Revision: https://reviews.llvm.org/D22998
llvm-svn: 277765
|
| |
|
|
|
|
| |
constant vectors
llvm-svn: 277762
|
| |
|
|
|
|
|
|
|
|
| |
Adding missing tests for OCL type names for half, float, double, char, short, long, and unknown.
Patch by Aaron En Ye Shi.
Differential Revision: https://reviews.llvm.org/D22964
llvm-svn: 277759
|
| |
|
|
|
|
|
| |
This eliminates the remnants of std::error_code from the
DebugInfo libraries.
llvm-svn: 277758
|
| |
|
|
|
|
|
| |
It leads to a crash when they're not. I'm *sure* I've made this mistake before,
at least once.
llvm-svn: 277755
|
| |
|
|
|
|
| |
constant vectors
llvm-svn: 277752
|
| |
|
|
| |
llvm-svn: 277749
|
| |
|
|
| |
llvm-svn: 277747
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Previously, FastISel for WebAssembly wasn't checking the return value of
`getRegForValue` in certain cases, which would generate instructions
referencing NoReg. This patch fixes this behavior.
Patch by Dominic Chen
Differential Revision: https://reviews.llvm.org/D23100
llvm-svn: 277742
|
| |
|
|
| |
llvm-svn: 277740
|
| |
|
|
|
|
|
|
|
| |
constant vectors
I'm removing a misplaced pair of more specific folds from InstCombine in this patch as well,
so we know where those folds are happening in InstSimplify.
llvm-svn: 277738
|
| |
|
|
|
|
| |
In preparation for adding a binary permute matching function.
llvm-svn: 277737
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
adjustments.
Summary:
TargetBaseAlign is no longer required since LSV checks if target allows misaligned accesses.
A constant defining a base alignment is still needed for stack accesses where alignment can be adjusted.
Previous patch (D22936) was reverted because tests were failing. This patch also fixes the cause of those failures:
- x86 failing tests either did not have the right target, or the right alignment.
- NVPTX failing tests did not have the right alignment.
- AMDGPU failing test (merge-stores) should allow vectorization with the given alignment but the target info
considers <3xi32> a non-standard type and gives up early. This patch removes the condition and only checks
for a maximum size allowed and relies on the next condition checking for %4 for correctness.
This should be revisited to include 3xi32 as a MVT type (on arsenm's non-immediate todo list).
Note that checking the sizeInBits for a MVT is undefined (leads to an assertion failure),
so we need to create an EVT, hence the interface change in allowsMisaligned to include the Context.
Reviewers: arsenm, jlebar, tstellarAMD
Subscribers: jholewinski, arsenm, mzolotukhin, llvm-commits
Differential Revision: https://reviews.llvm.org/D23068
llvm-svn: 277735
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: seanbruno, sdardis
Subscribers: tberghammer, danalbert, srhines, dsanders, sdardis, llvm-commits, seanbruno
Differential Revision: https://reviews.llvm.org/D23113
llvm-svn: 277732
|
| |
|
|
|
|
| |
constant vectors
llvm-svn: 277731
|
| |
|
|
|
|
|
|
|
|
| |
NFCI.
The new function now returns true if the shuffle should be commuted.
This will allow target shuffle combines to share the code.
llvm-svn: 277728
|
| |
|
|
| |
llvm-svn: 277727
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On modern Intel processors hardware SQRT in many cases is faster than RSQRT
followed by Newton-Raphson refinement. The patch introduces a simple heuristic
to choose between hardware SQRT instruction and Newton-Raphson software
estimation.
The patch treats scalars and vectors differently. The heuristic is that for
scalars the compiler should optimize for latency while for vectors it should
optimize for throughput. It is based on the assumption that throughput bound
code is likely to be vectorized.
Basically, the patch disables scalar NR for big cores and disables NR completely
for Skylake. Firstly, scalar SQRT has shorter latency than NR code in big cores.
Secondly, vector SQRT has been greatly improved in Skylake and has better
throughput compared to NR.
Differential Revision: https://reviews.llvm.org/D21379
llvm-svn: 277725
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D22347
llvm-svn: 277719
|
| |
|
|
| |
llvm-svn: 277716
|
| |
|
|
|
|
|
|
| |
lowering - this is what cost models are for
Improved CTTZ/CTLZ costings will be added shortly
llvm-svn: 277713
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Enable tail calls by default for (micro)MIPS(64).
microMIPS is slightly more tricky than doing it for MIPS(R6) or microMIPSR6.
microMIPS has two instruction encodings: 16bit and 32bit along with some
restrictions on the size of the instruction that can fill the delay slot.
For safe tail calls for microMIPS, the delay slot filler attempts to find
a correct size instruction for the delay slot of TAILCALL pseudos.
Reviewers: dsanders, vkalintris
Subscribers: jfb, dsanders, sdardis, llvm-commits
Differential Revision: https://reviews.llvm.org/D21138
llvm-svn: 277708
|
| |
|
|
| |
llvm-svn: 277704
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This should ensure that we can atomically write two bytes (on top of the
retq and the one past it) and have those two bytes not straddle cache
lines.
We also move the label past the alignment instruction so that we can refer
to the actual first instruction, as opposed to potential padding before the
aligned instruction.
Update the tests to allow us to reflect the new order of assembly.
Reviewers: rSerge, echristo, majnemer
Subscribers: llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D23101
llvm-svn: 277701
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: As per title.
Reviewers: majnemer, spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D23139
llvm-svn: 277694
|
| |
|
|
| |
llvm-svn: 277693
|
| |
|
|
|
|
|
| |
This reinstates r277611 + r277614 and reverts r277642. A cast_or_null
should have been a dyn_cast_or_null.
llvm-svn: 277691
|
| |
|
|
|
|
|
|
| |
This reverts commits r277685 & r277688. r277685 broke compiler-rt
compilation http://lab.llvm.org:8080/green/job/clang-stage1-configure-RA_build/23335
and r277685 is a followup from it.
llvm-svn: 277690
|
| |
|
|
|
|
|
|
|
|
|
| |
overloaded (and simpler).
Sean rightly pointed out in code review that we've started using
"wrapper pass" as a specific part of the old pass manager, and in fact
it is more applicable there. Here, we really have a pass *template* to
build a repeated pass, so call it that.
llvm-svn: 277689
|
| |
|
|
|
|
|
|
|
|
| |
As we addressed all compilation time problems with GVN-hoist
https://llvm.org/bugs/show_bug.cgi?id=28670
this patch turns GVN-hoist back by default.
Differential Revision: https://reviews.llvm.org/D23136
llvm-svn: 277685
|
| |
|
|
|
|
|
|
|
|
| |
pdbdump calls DbiStreamBuilder::commit through PDBFileBuilder::commit
without calling DbiStreamBuilder::finalize. Because `finalize` initializes
`Header` member, `Header` remained nullptr which caused a crash bug.
Differential Revision: https://reviews.llvm.org/D23143
llvm-svn: 277681
|
| |
|
|
|
|
|
|
| |
rewriteOperands() always performed liveness queries at the base index
rather than the RegSlot/Base as apropriate for the machine operand. This
could lead to illegal rewriting in some cases.
llvm-svn: 277661
|
| |
|
|
|
|
| |
constant vectors
llvm-svn: 277659
|