| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
| |
This matcher matches any node and at the same time executes all its
inner matchers to produce any possible result bindings.
This is useful when a user wants certain supplementary information
that's not always present along with the main match result.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Detect a run of memory tagging instructions for adjacent stack frame slots,
and replace them with a shorter instruction sequence
* replace STG + STG with ST2G
* replace STGloop + STGloop with STGloop
This code needs to run when stack slot offsets are already known, but before
FrameIndex operands in STG instructions are eliminated; that's the
reason for the new hook in PrologueEpilogue.
This change modifies STGloop and STZGloop pseudos to take the size as an
immediate integer operand, and base address as a FI operand when
possible. This is needed to simplify recognizing an STGloop instruction
as operating on a stack slot post-regalloc.
This improves memtag code size by ~0.25%, and it looks like an additional ~0.1%
is possible by rearranging the stack frame such that consecutive STG
instructions reference adjacent slots (patch pending).
Reviewers: pcc, ostannard
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70286
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Right now the path for each lib in whole_archive_link when MSVC is used as the compiler is not a full path - and it's not even the correct path when VS is used to build. This patch sets the lib path to a full path using CMAKE_CFG_INTDIR which means the path will be correct regardless of whether ninja, make or VS is used and it will always be a full path.
Reviewers: denis13, jpienaar
Reviewed By: jpienaar
Subscribers: mgorny, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, llvm-commits, asmith
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72403
|
|
|
|
|
|
|
| |
`readability-misleading-indentation`.
Because the Windows build uses `-fdelayed-template-parsing` by default, we cannot have a test
where we don't instantiate the template. Please see D72333.
|
|
|
|
|
|
|
|
|
|
| |
a new class AliasState.
Summary: This reduces the complexity of ModuleState and simplifies the code. A future revision will mold ModuleState into something that can be used by users for caching of printer state, as well as for implementing printAsOperand style methods.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D72292
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This diff adds lowering of the linalg.reshape op to LLVM.
A new descriptor is created with fields initialized as follows:
1. allocatedPtr, alignedPtr and offset are copied from the source descriptor
2. sizes are copied from the static destination shape
3. strides are copied from the static strides collected with `getStridesAndOffset`
Only the static case in which the target view conforms to strided memref
semantics is supported. Other cases are left for future work and will be added on
a per-need basis.
Reviewers: ftynse, mravishankar
Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72316
|
|
|
|
|
|
|
|
|
|
| |
64-bit mode
For v4i64->v4f32 uint_to_fp on pre-avx targets where v4i64 isn't legal we create two v2i64->v2f32 uint_to_fp that need to be shuffled together. Our codegen for v2i64->v2f32 involves detecting if the number is larger than (2^31 - 1), if so we do a special division by 2 so we can do a signed conversion which we need to scalarize, then do a multiply by 2 at the end if we divided earlier.
When v4i64 isn't legal we need to split the checking for a larger number and dividing by 2 into two v2i64 vectors. The scalar part can extract the 4 i64 values from those 4 splits. But we can reassemble the 4 scalar f32 results directly into a single v4f32 vector. Then we just need to combine the fixup indications from the 2 halves and we can do the final multiply by 2 fixup on all 4 values if needed at once using a single v4f32 blend and v4f32 fadd.
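A scalar sketch of the halve, convert, then double idea described above may help. It assumes that the "too large" test is whether the value is out of range for a signed 64-bit conversion, and that the special division by 2 keeps the shifted-out low bit as a sticky bit so that rounding stays correct. The function names are hypothetical and this is not the actual lowering code.

```cpp
#include <cassert>
#include <cstdint>

// Scalar model of one lane: convert an unsigned 64-bit integer to float
// using only a signed 64-bit conversion.
float convertU64ToF32Sketch(std::uint64_t value) {
  // Values that fit in a signed 64-bit integer can be converted directly.
  if ((value >> 63) == 0)
    return static_cast<float>(static_cast<std::int64_t>(value));

  // Otherwise halve the value first. OR-ing the shifted-out bit back in
  // keeps it "sticky" so the halving does not disturb the later rounding.
  std::uint64_t halved = (value >> 1) | (value & 1);
  float converted = static_cast<float>(static_cast<std::int64_t>(halved));

  // Fix up by doubling; the vector code does this with a blend and an fadd.
  return converted + converted;
}

// Small usage check covering both the direct path and the fixup path.
void checkConvertU64ToF32Sketch() {
  assert(convertU64ToF32Sketch(5) == 5.0f);
  assert(convertU64ToF32Sketch(std::uint64_t{1} << 63) == 9223372036854775808.0f);
}
```

In the vector version, the comparison and halving happen on the two v2i64 halves, while the doubling is applied to all four f32 results at once.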
Differential Revision: https://reviews.llvm.org/D72368
|
|
|
|
|
|
| |
We have to do an intermediate jump to a GPR to make the cast.
Fixes PR43750.
|
|
|
|
|
|
|
|
|
|
|
|
| |
As discussed heavily in the original review (D70157), there's a need for the compiler to be able to selectively suppress padding (either nop or prefix) to respect assumptions about the meaning of labels and instructions in generated code.
Rather than wait for syntax to be finalized - which appears to be a very slow process - this patch focuses on the compiler use case and *only* worries about the integrated assembler. To my knowledge, this covers all cases mentioned to date for clang/JIT support.
For testing purposes, I wired it up so that if the integrated assembler was using autopadding for branch alignment (e.g. enabled at command line) then the textual assembly output would contain a comment for each location where padding was enabled or disabled. This seemed like the least painful choice overall.
Note that the result of this patch effectively disables the jcc errata mitigation for many constructs (statepoints, implicit null checks, xray, etc...) which is non-ideal. It is at least *correct* and should allow us to enable the mitigation for the compiler. Once that's done, and a few other items are worked through, we probably want to come back to this and explore a bundling-based approach instead so that we can pad instructions while keeping labels in the right place.
Differential Revision: https://reviews.llvm.org/D72303
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Weak undefined symbols are preemptible after D71794.
  if (sym.isPreemptible)
    return false;
  if (!config->isPic)
    return true;
  // isPic means includeInDynsym is true after D71794.
  ...
  // We can delete this if because it can never be true.
  if (sym.isUndefWeak)
    return true;
Differential Revision: https://reviews.llvm.org/D71795
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
D59275 added the following clause to Symbol::includeInDynsym()
  if (isUndefWeak() && Config->Pie && SharedFiles.empty())
    return false;
D59549 explored the possibility to generalize it for -no-pie.
GNU ld's rules are architecture dependent and partly controlled by -z
{,no-}dynamic-undefined-weak. Our attempts to mimic its rules are
actually half-baked and don't provide perceivable benefits (it can save
a few more weak undefined symbols in .dynsym in a -static-pie
executable). Let's just delete the rule for simplicity. We will expect
cosmetic inconsistencies with ld.bfd in certain -static-pie scenarios.
This permits a simplification in D71795.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D71794
|
|
|
|
| |
Silence (clang/MSVC) static analyzer warnings that the fragment data may either write out of bounds of the local array or reference uninitialized data.
|
|
|
|
| |
Use cast<> instead of dyn_cast<> since we know that the pointer should be valid (and is dereferenced immediately below in the getSignature call).
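The difference between the two cast styles can be sketched without LLVM. The helpers below are hypothetical stand-ins built on `dynamic_cast`, not LLVM's real `cast<>` and `dyn_cast<>` templates; they only illustrate why an asserting cast fits a pointer that is dereferenced right away.

```cpp
#include <cassert>

struct Shape { virtual ~Shape() = default; };
struct Circle : Shape { int radius = 3; };
struct Square : Shape { int side = 2; };

// Checking stand-in: returns nullptr on a type mismatch (like dyn_cast<>).
template <typename To, typename From> To *dynCastSketch(From *value) {
  return dynamic_cast<To *>(value);
}

// Asserting stand-in: the caller promises the type is right (like cast<>).
template <typename To, typename From> To *castSketch(From *value) {
  To *result = dynamic_cast<To *>(value);
  assert(result && "castSketch<> used on an incompatible type");
  return result;
}

int radiusOf(Shape *shape) {
  // The result is dereferenced immediately, so an asserting cast states the
  // invariant more clearly than a cast that may silently return nullptr.
  return castSketch<Circle>(shape)->radius;
}
```

With the checking form, a reader (or a static analyzer) has to assume the null case is possible; the asserting form removes that ambiguity.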
|
|
|
|
| |
Use castAs<> instead of getAs<> since we know that the pointer will be valid (and is dereferenced immediately below).
|
|
|
|
|
|
|
| |
Libxml2 is already an optional dependency. It should use the same
infrastructure as the other dependencies.
Differential revision: https://reviews.llvm.org/D72290
|
| |
|
|
|
|
| |
Use llvm::Optional<APInt> instead of std::pair<APInt, bool> with the bool second being used to report success/failure of fold.
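As a rough illustration of the pattern, the sketch below uses `std::optional` and a plain 32-bit integer as stand-ins for `llvm::Optional` and `APInt`. The helper name and the overflow rule are assumptions made for the example only.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical fold helper: returns the folded constant, or std::nullopt
// when the fold is not possible (here: when the sum overflows 32 bits).
std::optional<std::int32_t> foldAddSketch(std::int32_t lhs, std::int32_t rhs) {
  std::int64_t wide = static_cast<std::int64_t>(lhs) + rhs;
  if (wide < std::numeric_limits<std::int32_t>::min() ||
      wide > std::numeric_limits<std::int32_t>::max())
    return std::nullopt; // failure is encoded in the type, not in a bool
  return static_cast<std::int32_t>(wide);
}

// Usage: the caller tests the optional instead of a separate success flag.
void checkFoldAddSketch() {
  std::optional<std::int32_t> folded = foldAddSketch(2, 3);
  assert(folded && *folded == 5);
  assert(!foldAddSketch(std::numeric_limits<std::int32_t>::max(), 1));
}
```

Compared with a pair whose bool reports success, the optional makes it impossible to read a value that was never produced without an explicit check.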
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The added testcase shows the current transformation for the operation
Z / (1.0 / Y), which remains unchanged. This will be updated to align
with the transformed code (Y * Z) with D72319.
The existing transformation Z / (X / Y) => (Y * Z) / X is not handling
this case as there are multiple uses for (1.0 / Y) in this testcase.
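For reference, the two forms being compared look like this at the source level. This is only a sketch with hypothetical function names; in general the two expressions may round differently, so they are presumably interchangeable only under relaxed floating-point rules.

```cpp
#include <cassert>

// Form that the test case currently keeps: division by a reciprocal.
double divideByReciprocalSketch(double z, double y) { return z / (1.0 / y); }

// Form that the follow-up change aims to produce.
double multiplyDirectlySketch(double z, double y) { return y * z; }

// With exactly representable values both forms agree.
void checkReciprocalForms() {
  assert(divideByReciprocalSketch(6.0, 4.0) == 24.0);
  assert(multiplyDirectlySketch(6.0, 4.0) == 24.0);
}
```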
Patch by: @raghesh (Raghesh Aloor)
Differential Revision: https://reviews.llvm.org/D72388
|
|
|
|
|
|
|
| |
This hopes to improve readability and adds an assert.
The functional change noted by the TODO comment is
proposed in:
D72361
|
|
|
|
|
| |
Use ParseExpression() instead of ParseAssignmentExpression() to allow
commas in combiner expressions.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch teaches JumpThreading.cpp to thread through two basic
blocks like:
  bb3:
    %var = phi i32* [ null, %bb1 ], [ @a, %bb2 ]
    %tobool = icmp eq i32 %cond, 0
    br i1 %tobool, label %bb4, label ...
  bb4:
    %cmp = icmp eq i32* %var, null
    br i1 %cmp, label bb5, label bb6
by duplicating basic blocks like bb3 above. Once we duplicate bb3 as
bb3.dup and redirect edge bb2->bb3 to bb2->bb3.dup, we have:
  bb3:
    %var = phi i32* [ @a, %bb2 ]
    %tobool = icmp eq i32 %cond, 0
    br i1 %tobool, label %bb4, label ...
  bb3.dup:
    %var = phi i32* [ null, %bb1 ]
    %tobool = icmp eq i32 %cond, 0
    br i1 %tobool, label %bb4, label ...
  bb4:
    %cmp = icmp eq i32* %var, null
    br i1 %cmp, label bb5, label bb6
Then the existing code in JumpThreading.cpp can thread edge
bb3.dup->bb4 through bb4 and eventually create bb3.dup->bb5.
Reviewers: wmi
Subscribers: hiraditya, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70247
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This batch of intrinsics fills in all the shift instructions that take
a variable shift distance in a register, instead of an immediate. Some
of these instructions take a single shift distance in a scalar
register and apply it to all lanes; others take a vector of per-lane
distances.
These instructions are all basically one family, varying in whether
they saturate out-of-range values, and whether they round when bits
are shifted off the bottom. I've implemented them at the IR level by a
much smaller family of IR intrinsics, which take flag parameters to
indicate saturating and/or rounding (along with the usual one to
specify signed/unsigned integers).
An oddity is that all of them are //left// shift instructions – but if
you pass a negative shift count, they'll shift right. So the vector
shift distances are always vectors of //signed// integers, regardless
of whether you're considering the other input vector to be of signed
or unsigned. Also, even the simplest `vshlq` instruction in this
family (neither saturating nor rounding) has to be implemented as an
IR intrinsic, because the ordinary LLVM IR `shl` operation would
consider an out-of-range shift count to be undefined behavior.
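A scalar model of one unsigned 32-bit lane shows the sign convention for the shift count. It is a sketch only: the function name is hypothetical, and returning 0 for out-of-range counts is an assumption made here so that the model is well defined, not a statement about the exact instruction behavior.

```cpp
#include <cassert>
#include <cstdint>

// Model of a non-saturating, non-rounding "left" shift whose count is signed:
// a positive count shifts left, a negative count shifts right.
std::uint32_t shiftLaneSketch(std::uint32_t value, int count) {
  // Assumption for this sketch: counts of 32 or more in either direction
  // shift every bit out, leaving 0. A plain C++ shift would be undefined here.
  if (count >= 32 || count <= -32)
    return 0;
  return count >= 0 ? value << count : value >> -count;
}

void checkShiftLaneSketch() {
  assert(shiftLaneSketch(1, 4) == 16);   // left shift
  assert(shiftLaneSketch(16, -4) == 1);  // negative count shifts right
  assert(shiftLaneSketch(1, 32) == 0);   // out of range, but well defined
}
```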
Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard
Reviewed By: dmgreen
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D72329
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This batch of intrinsics covers two sets of immediate shift
instructions, which have in common that they only overwrite part of
their output register and so they need an extra input giving its
previous value.
The VSLI and VSRI instructions shift each lane of the input vector
left or right just as if they were normal immediate VSHL/VSHR, but
then they only overwrite the output bits that correspond to actual
shifted bits of the input. So VSLI will leave the low n bits of each
output lane unchanged, and VSRI the same with the top n bits.
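The shift-and-insert behavior can be modeled on a single 32-bit lane as follows. This is a sketch with hypothetical function names; it restricts the shift to 1..31 to stay within well-defined C++ shifts, which may be narrower than what the instructions accept.

```cpp
#include <cassert>
#include <cstdint>

// VSLI-style model: shift left by n, keep the low n bits of the old value.
std::uint32_t shiftLeftInsertSketch(std::uint32_t previous, std::uint32_t input,
                                    unsigned n) {
  assert(n >= 1 && n <= 31);
  std::uint32_t keptMask = (std::uint32_t{1} << n) - 1; // low n bits
  return (input << n) | (previous & keptMask);
}

// VSRI-style model: shift right by n, keep the top n bits of the old value.
std::uint32_t shiftRightInsertSketch(std::uint32_t previous, std::uint32_t input,
                                     unsigned n) {
  assert(n >= 1 && n <= 31);
  std::uint32_t keptMask = ~(~std::uint32_t{0} >> n); // top n bits
  return (input >> n) | (previous & keptMask);
}
```

For example, with an all-ones previous value and n = 4, the left variant leaves the low nibble set and the right variant leaves the top nibble set.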
The V[Q][R]SHR[U]N family are all narrowing shifts: they take an input
vector of 2n-bit integers, shift each lane right by a constant, and
then narrow the shifted result to only n bits. So they only
overwrite half of the n-bit lanes in the output register, and the B/T
suffix indicates whether it's the bottom or top half of each 2n-bit
lane.
I've implemented the whole of the latter family using a single IR
intrinsic `vshrn`, which takes a lot of i32 parameters indicating
which instruction it expands to (by specifying signedness of the input
and output types, whether it saturates and/or rounds, etc).
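A scalar model of the plain (non-saturating, non-rounding, unsigned) narrowing shift for one 32-bit input lane is sketched below. The function name and the boolean `top` parameter are hypothetical; they only mirror the B/T suffix described above and say nothing about the real intrinsic's parameter list.

```cpp
#include <cassert>
#include <cstdint>

// Shift a 32-bit input lane right, narrow it to 16 bits, and write it into
// either the bottom or the top half of the matching 32 bits of the output,
// leaving the other half of the previous output value untouched.
std::uint32_t narrowingShiftSketch(std::uint32_t previous, std::uint32_t input,
                                   unsigned shift, bool top) {
  assert(shift >= 1 && shift <= 16);
  std::uint16_t narrowed = static_cast<std::uint16_t>(input >> shift);
  if (top)
    return (previous & 0x0000FFFFu) | (static_cast<std::uint32_t>(narrowed) << 16);
  return (previous & 0xFFFF0000u) | narrowed;
}
```

This is why the instructions need the previous register value as an extra input: half of every output lane comes from it.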
Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard
Reviewed By: dmgreen
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D72328
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
instantiation for `readability-misleading-indentation` check.
Summary: Fixes `readability-misleading-indentation` for `if constexpr`. This is very similar to D71980.
Reviewers: alexfh
Subscribers: xazax.hun, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72333
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch adds intrinsics and ISelDAG nodes for
signed and unsigned fixed-point division:
llvm.sdiv.fix.*
llvm.udiv.fix.*
These intrinsics perform scaled division on two
integers or vectors of integers. They are required
for the implementation of the Embedded-C fixed-point
arithmetic in Clang.
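Scaled division on two integers can be sketched in scalar form as below. This is a model under stated assumptions, not the intrinsic's definition: the helper name is hypothetical, both operands are taken to share the same scale (number of fractional bits), and the quotient is rounded toward zero, which the text above does not specify.

```cpp
#include <cassert>
#include <cstdint>

// Fixed-point division model: widen, pre-scale the dividend, then divide so
// that the quotient keeps `scale` fractional bits.
std::int32_t sdivFixSketch(std::int32_t lhs, std::int32_t rhs, unsigned scale) {
  assert(rhs != 0 && scale < 31);
  // Multiply instead of shifting so negative dividends stay well defined.
  std::int64_t widened = static_cast<std::int64_t>(lhs) * (std::int64_t{1} << scale);
  // The caller is assumed to ensure that the quotient fits in 32 bits.
  return static_cast<std::int32_t>(widened / rhs);
}
```

With 8 fractional bits, 3.0 is stored as 768 and 2.0 as 512, so `sdivFixSketch(768, 512, 8)` yields 384, the representation of 1.5.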
Patch by: ebevhan
Reviewers: bjope, leonardchan, efriedma, craig.topper
Reviewed By: craig.topper
Subscribers: Ka-Ka, ilya, hiraditya, jdoerfert, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70007
|
|
|
|
|
|
|
| |
This patch moves `InPQueue` into function arguments instead of template
arguments of `releaseNode`, which is a cleaner approach.
Differential Revision: https://reviews.llvm.org/D72125
|
| |
|
| |
|
|
|
|
|
|
|
|
| |
Adds a pass to the ARM backend that takes a v4i32
gather and transforms it into a call to MVE's
masked gather intrinsics.
Differential Revision: https://reviews.llvm.org/D71743
|
|
|
|
|
|
|
|
| |
An empty string literal in an asm label does not make a whole lot of sense. GCC
does not diagnose such a construct, but it also generates code that cannot be
assembled by gas should two symbols have an empty asm label within the same TU.
This does not affect an asm statement with an empty string literal, which is
still a useful construct.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
optimizing part. #2.
Summary:
This patch relands D71271. The problem with D71271 is that it has a cyclic dependency:
CodeGen->AsmPrinter->DebugInfoDWARF->CodeGen. To avoid the cyclic dependency, this patch
puts the implementation of DWARFOptimizer into a separate library: lib/DWARFLinker.
Thus the difference between this patch and D71271 is that DWARFOptimizer is renamed to
DWARFLinker and its files are put into lib/DWARFLinker.
Reviewers: JDevlieghere, friss, dblaikie, aprantl
Reviewed By: JDevlieghere
Subscribers: thegameg, merge_guards_bot, probinson, mgorny, hiraditya, llvm-commits
Tags: #llvm, #debug-info
Differential Revision: https://reviews.llvm.org/D71839
|
|
|
|
| |
Run the update_mir_test on some of the low-overhead loop tests.
|
|
|
|
|
|
|
|
|
|
|
| |
Creating an ASTContext with an unknown triple is rarely a good idea (as usually
all our ASTs have a valid triple that is either from the host or the target) and the
default argument makes it far too easy to implicitly create such an AST. Let's
remove it and force people to pass a triple.
The only place where we don't pass a triple is a DWARFASTParserClangTests
where we now just pass the host triple instead (the test doesn't depend on any
triple so this shouldn't change anything).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
relates https://bugs.llvm.org/show_bug.cgi?id=44443
Adding missing newline when printing bad input values.
Fix testcase
Reviewers: jhenderson
Reviewed By: jhenderson
Subscribers: rupprecht, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72313
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit a041c4ec6f7aa659b235cb67e9231a05e0a33b7d.
This looks like a non-trivial change and there has been no code
review (at least there were no Phabricator revisions attached to the
commit description). It is also causing a regression in one of our
downstream integration tests; we haven't been able to come up with a
minimal reproducer yet.
|
|
|
|
|
|
|
|
| |
Apple's CPUs are called A7-A13 in official communication, occasionally with
weird suffixes which we probably don't need to care about. This adds each one
and describes its features. It also switches the default CPU to the canonical
name for Cyclone, but leaves legacy support in so that existing bitcode still
compiles.
|
|
|
|
|
|
|
| |
ArchSpec has a superset of the information of llvm::Triple but the ClangASTContext
just uses the Triple part of it. This deletes the ArchSpec constructor and all
the code creating ArchSpecs and instead just uses the llvm::Triple constructor
for ClangASTContext.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
span.cons/container.pass.cpp
N4842 22.7.3.2 [span.cons]/13 constrains span's range constructor
for ranges::contiguous_range (among other criteria).
24.4.5 [range.refinements]/2 says that contiguous_range requires data(),
and (via contiguous_range, random_access_range, bidirectional_range,
forward_range, input_range, range) it also requires begin() and end()
(see 24.4.2 [range.range]/1).
Therefore, IsAContainer needs to provide begin() and end().
(Detected by MSVC's concept-constrained implementation.)
span.cons/stdarray.pass.cpp
This test uses std::array, so it must include <array>.
<span> isn't guaranteed to drag in <array>.
(Detected by MSVC's implementation which uses a forward declaration to
avoid dragging in <array>, for increased compiler throughput.)
span.objectrep/as_bytes.pass.cpp
span.objectrep/as_writable_bytes.pass.cpp
Testing `sp.extent == std::dynamic_extent` triggers MSVC warning
C4127 "conditional expression is constant". Using `if constexpr` is a
simple way to avoid this without disrupting anyone else (as span
requires C++20 mode).
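The `if constexpr` pattern can be shown without `<span>`. In the sketch below, `kDynamicExtentSketch` and `byteCountSketch` are hypothetical stand-ins for `std::dynamic_extent` and the tested code; the point is only that the compile-time branch replaces a runtime comparison of two constants.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for std::dynamic_extent.
inline constexpr std::size_t kDynamicExtentSketch = static_cast<std::size_t>(-1);

// Extent is a compile-time constant, so an ordinary `if` on it would be a
// "conditional expression is constant". `if constexpr` discards the branch
// that does not apply instead of testing it at run time.
template <std::size_t Extent>
constexpr std::size_t byteCountSketch(std::size_t runtimeSize,
                                      std::size_t elementSize) {
  if constexpr (Extent == kDynamicExtentSketch)
    return runtimeSize * elementSize;
  else
    return Extent * elementSize;
}

static_assert(byteCountSketch<4>(0, 2) == 8);
static_assert(byteCountSketch<kDynamicExtentSketch>(3, 2) == 6);
```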
span.tuple/get.pass.cpp
22.7.3.2 [span.cons]/4.3: "Preconditions: If extent is not equal to
dynamic_extent, then count is equal to extent."
These lines were triggering undefined behavior (detected by assertions
in MSVC's implementation).
I changed the count arguments in the first two chunks, followed by
changing the span extents, in order to preserve the test's coverage
and follow the existing pattern.
span.cons/span.pass.cpp
22.7.3.2 [span.cons]/18.1 constrains span's converting constructor with
"Extent == dynamic_extent || Extent == OtherExtent is true".
This means that converting from dynamic extent to static extent is
not allowed. (Other constructors tested elsewhere, like
span(It first, size_type count), can be used to write such code.)
As this is the test for the converting constructor, I have:
* Removed the "dynamic -> static" case from checkCV(), which is
comprehensive.
* Changed the initialization of std::span<T, 0> s1{}; in
testConstexprSpan() and testRuntimeSpan(), because s1 is used below.
* Removed ASSERT_NOEXCEPT(std::span<T, 0>{s0}); from those functions,
as they are otherwise comprehensive.
* Deleted testConversionSpan() entirely. Note that this could never
compile (it had a bool return type, but forgot to say `return`). And it
couldn't have provided useful coverage, as the /18.2 constraint
"OtherElementType(*)[] is convertible to ElementType(*)[]"
permits only cv-qualifications, which are already tested by checkCV().
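The extent constraint quoted above can be written as a small compile-time predicate. This is a sketch with hypothetical names (`kDynamicSketch` stands in for `std::dynamic_extent`); it models only the /18.1 extent rule, not the element type rule.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for std::dynamic_extent.
inline constexpr std::size_t kDynamicSketch = static_cast<std::size_t>(-1);

// "Extent == dynamic_extent || Extent == OtherExtent", where `extent` belongs
// to the destination span and `otherExtent` to the source span.
constexpr bool extentConversionAllowedSketch(std::size_t extent,
                                             std::size_t otherExtent) {
  return extent == kDynamicSketch || extent == otherExtent;
}

static_assert(extentConversionAllowedSketch(kDynamicSketch, 3));  // static -> dynamic
static_assert(extentConversionAllowedSketch(3, 3));               // same extent
static_assert(!extentConversionAllowedSketch(0, kDynamicSketch)); // dynamic -> static
```

The last case is the "dynamic -> static" conversion that the test changes above remove.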
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Adds macro references to the dynamic index.
Tests added.
Also exposed a new API to convert path to URI in URI.h
Reviewers: hokein
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D71406
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This rule helps avoid repeated setting of check-libc's dependency on the
various testsuites.
Reviewers: abrachet
Subscribers: mgorny, MaskRay, tschuett, libc-commits
Tags: #libc-project
Differential Revision: https://reviews.llvm.org/D72353
|
|
|
|
| |
fma-combine.ll
|
|
|
|
|
|
|
|
|
| |
As correctly pointed out by Martin on the mailing list, Python should
only be auto-enabled if SWIG is found as well. This moves the logic of
finding SWIG into FindPythonInterpAndLibs to make that possible.
To make diagnosing easier I've included a status message to convey why
Python support is disabled.
|
|
|
|
|
|
|
|
|
|
|
|
| |
In TestConvenienceVariables I changed %t from a file to a directory.
This tripped up mkdir which can't deal with an existing file at the
given location. In order to solve this issue on the bots I added an
`rm -rf %t` statement, but now the Windows bot complains that "This
function is not supported on this system".
If you never ran the test suite with this temporary workaround, the test
might fail. If this happens please remove what %t expands to in the lit
output and rerun the test.
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Adding fp128 support for strict fcmp
Reviewers: craig.topper, LiuChen3, andrew.w.kaylor, RKSimon, uweigand
Subscribers: hiraditya, llvm-commits, LuoYuanke
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71897
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is analogous to D58943, which correctly finds the corresponding
fixup. However, when linker relaxations are disabled and we evaluate the
fixup, we need to also ensure we use an offset of 0 rather than the size
of the previous fragment.
Reviewers: asb, efriedma, lenary
Reviewed By: efriedma
Subscribers: hiraditya, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71978
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D72137
|
|
|
|
|
| |
This reverts commit 7e7f849a6d94f77f1a29630419acb7226051f4b6 because
it recorded the wrong commit author.
|
|
|
|
|
| |
This partially fixes GlobalISel import of the patterns, but removes a
lot of entries from the end of the skipped pattern log.
|