| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary of changes:
- simplified handling of FLAT offset: offset_s13 and offset_u12 have been replaced with flat_offset;
- provided information about error position for pre-gfx9 targets;
- improved errors handling.
Reviewers: artem.tamazov, arsenm, rampitec
Differential Revision: https://reviews.llvm.org/D64244
llvm-svn: 365321
|
| |
|
|
| |
llvm-svn: 365320
|
| |
|
|
|
|
|
|
|
|
| |
BUILD_VECTOR cases.
Don't do this locally, computeKnownBits does this better (and can handle non-constant cases as well).
A next step would be to actually simplify non-constant elements - building on what we already do in SimplifyDemandedVectorElts.
llvm-svn: 365309
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
According to a recently updated Armv8-M spec
(https://static.docs.arm.com/ddi0553/bh/DDI0553B_h_armv8m_arm.pdf) the
32-bit width versions of the following instructions:
* VQDMLADH
* VQDMLADHX
* VQRDMLADH
* VQRDMLADHX
* VQDMLSDH
* VQDMLSDHX
* VQRDMLSDH
* VQRDMLSDHX
are no longer unpredictable when their output register is the same as
one of the input registers.
This patch updates the assembler parser and the corresponding tests
and also removes @earlyclobber from the instruction constraints.
Reviewers: simon_tatham, ostannard, dmgreen, SjoerdMeijer, samparker
Reviewed By: simon_tatham
Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64250
llvm-svn: 365306
|
| |
|
|
|
|
|
|
|
|
|
| |
Defines RISCV registers for getExceptionPointerRegister() and
getExceptionSelectorRegister().
Differential Revision: https://reviews.llvm.org/D63411
Patch by Edward Jones.
Modified by Alex Bradbury to add CHECK lines to exception-pointer-register.ll.
llvm-svn: 365301
|
| |
|
|
|
|
|
|
| |
CodeGen/ARM/Windows/stack-protector-msvc.ll.test after D64292/r365283
CLI.CS may not be set.
llvm-svn: 365299
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm
Reviewed By: arsenm
Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64201
llvm-svn: 365294
|
| |
|
|
|
|
|
| |
This can help with code size on SSE targets where SHUFPD requires
a 0x66 prefix and SHUFPS doesn't.
llvm-svn: 365293
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
targets.
This can help avoid a copy or enable load folding.
On SSE4.1 targets we can commute it to blendi instead.
I had to make shufpd with a 0x02 immediate commutable as well
since we expect commuting to be reversible.
llvm-svn: 365292
|
| |
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D57792
Patch by James Clarke.
llvm-svn: 365291
|
| |
|
|
|
|
|
|
| |
to turn UNPCKLPDrr->MOVHPDrm when load is under aligned.
If the load is aligned we can turn UNPCKLPDrr into UNPCKLPDrm.
llvm-svn: 365287
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This allows us to use the analyzer from unit tests.
* Refactor the interface to use proper error handling for most functions
after JF's work.
* Move everything into a BitstreamAnalyzer class.
* Move that to Bitcode/BitcodeAnalyzer.h.
Differential Revision: https://reviews.llvm.org/D64116
llvm-svn: 365286
|
| |
|
|
|
|
|
|
| |
Heavily based on the same for AArch64, from SVN r346469.
Differential Revision: https://reviews.llvm.org/D64292
llvm-svn: 365283
|
| |
|
|
|
|
| |
patterns.
llvm-svn: 365275
|
| |
|
|
|
|
|
|
|
|
| |
Some out of tree backend require larger vector type. Since maintaining the changes out of tree is difficult due to the many manual changes needed when adding a new type we are adding it even if no backend currently use it.
Differential Revision: https://reviews.llvm.org/D64141
Patch by Thomas Raoux!
llvm-svn: 365274
|
| |
|
|
|
|
|
|
| |
NFCI.
Fixes cppcheck warning.
llvm-svn: 365271
|
| |
|
|
| |
llvm-svn: 365270
|
| |
|
|
|
|
| |
NFCI.
llvm-svn: 365269
|
| |
|
|
|
|
|
|
| |
These patterns are the same as the MOVLPDmr and MOVHPDmr patterns,
but with a bitcast at the end. We can just select the PD instruction
and let execution domain fixing switch to PS.
llvm-svn: 365267
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
UNPCKL+load.
These narrow the load so we can only do it if the load isn't
volatile.
There also tests in vector-shuffle-128-v4.ll that this should
support, but we don't seem to fold bitcast+load on pre-sse4.2
targets due to the slow unaligned mem 16 flag.
llvm-svn: 365266
|
| |
|
|
|
|
|
|
| |
We had versions of this code scattered around, so consolidate into one location.
Not strictly NFC since the order of intermediate results may change in some places, but since these operations are associatives, should not change results.
llvm-svn: 365259
|
| |
|
|
|
|
|
|
|
|
| |
Although removeCopyByCommutingDef deals with full copies, it is still
possible to copy undef lanes and thus, we wouldn't have any a value
number for these lanes.
This fixes PR40215.
llvm-svn: 365256
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
I'm not sure if it's worth it or not to add a hook to disable the pass
for an arbitrary function.
This pass is taking up to 5% of compile time in tiny programs by
iterating through all of the physical registers in every register
class. This pass should be rewritten in terms of regunits. For now,
skip doing anything for entry point functions. The vast majority of
functions in the real world aren't callable, so just not running this
will give the majority of the benefit.
llvm-svn: 365255
|
| |
|
|
|
|
| |
This reverts commit 096600a4b073dd94a366cc8e57bff93c34ff6966.
llvm-svn: 365251
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch simplifies 2 aspects in the FileCheckNumericVariable code.
First, setValue() method is turned into a void function since being
called only on undefined variable is an invariant and is now asserted
rather than returned. This remove the assert from the callers.
Second, clearValue() method is also turned into a void function since
the only caller does not check its return value since it may be trying
to clear the value of variable that is already cleared without this
being noteworthy.
Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson, rnk
Subscribers: JonChesterfield, rogfer01, hfinkel, kristina, rnk, tra, arichardson, grimar, dblaikie, probinson, llvm-commits, hiraditya
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64231
llvm-svn: 365249
|
| |
|
|
| |
llvm-svn: 365245
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Only custom lower uaddo+addcarry or usubo+subcarry chains and leave
mixtures like usubo+addcarry or uaddo+subcarry to the generic
legalizer. Otherwise we run into issues because SystemZ uses
different CC values for carries and borrows.
Fixes https://bugs.llvm.org/show_bug.cgi?id=42512.
Differential Revision: https://reviews.llvm.org/D64213
llvm-svn: 365242
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Add a string attribute instead of directly setting
MachineFunctionInfo. This avoids trying to get the analysis in the
MachineFunctionInfo in a way that doesn't work with the new pass
manager.
This will also avoid re-visiting the call graph for every single
function.
llvm-svn: 365241
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- Explicitly specify the parent MBB to allow the end iterator to be
used.
Reviewers: aprantl, MatzeB, craig.topper, qcolombet
Subscribers: arsenm, jvesely, nhaehnle, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64261
llvm-svn: 365240
|
| |
|
|
| |
llvm-svn: 365237
|
| |
|
|
|
|
| |
Avoids a warning in Release builds.
llvm-svn: 365236
|
| |
|
|
| |
llvm-svn: 365235
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The Size either needs to be 0 meaning we aren't folding
a stack reload. Or the stack slot needs to be at least
16 bytes. I've also added a paranoia check ensure the
RCSize is at leat 16 bytes as well. This avoids any
FR32/FR64 surprises, but I think we already filtered
those earlier.
All of our test case have Size as either 0 or 16 and
RCSize == 16. So the Size <= 16 check worked for those
cases.
llvm-svn: 365234
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The indirect call sequence on PPC requires that the TOC base register be saved
prior to the indirect call and restored after the call since the indirect call
may branch to a global entry point in another DSO which will update the TOC
base. Over the last couple of years, we have improved this to:
- be able to hoist TOC saves from loops (with changes to MachineLICM)
- avoid multiple saves when one dominates the other[s]
However, it is still possible to have multiple TOC saves dynamically in the
execution path if there is no dominance relationship between them.
This patch moves the TOC save to the prologue when one of the TOC saves is in a
block that post-dominates entry (i.e. it cannot be avoided) or if it is in a
block that is hotter than entry.
Differential revision: https://reviews.llvm.org/D63803
llvm-svn: 365232
|
| |
|
|
|
|
|
|
|
|
|
| |
non-volatile before folding.
These patterns use 128-bit loads, but the instructions only load
64-bits. We shouldn't narrow the load if its volatile.
Fixes another variant of PR42079
llvm-svn: 365225
|
| |
|
|
|
|
|
|
|
| |
This was identical to a pattern for MOVPQI2QImr with a bitcast
as an input. But we should be able to turn MOVPQI2QImr into
MOVLPSmr in the execution domain fixup pass so we shouldn't
need this.
llvm-svn: 365224
|
| |
|
|
| |
llvm-svn: 365223
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch changes expression support to use one instance of
FileCheckNumericVariable per numeric variable rather than one per
variable and per definition. The current system was only necessary for
the last patch of the numeric expression support patch series in order
to handle a line using a variable defined earlier on the same line from
the input text. However this can be dealt more efficiently.
Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson, rnk
Subscribers: JonChesterfield, rogfer01, hfinkel, kristina, rnk, tra, arichardson, grimar, dblaikie, probinson, llvm-commits, hiraditya
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64229
llvm-svn: 365220
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Diagnosing use of undefined variables takes place in
parseNumericVariableUse() and printSubstitutions() for numeric variables
but only takes place in printSubstitutions() for string variables. The
reason for the split location of diagnostics is that parsing is not
aware of the clearing of variables due to --enable-var-scope and thus
use of variables cleared in this way can only be catched by
printSubstitutions().
Beyond the code level inconsistency, there is also a user facing
inconsistency since diagnostics look different between the two
functions. While the diagnostic in printSubstitutions is more verbose,
doing the diagnostic there allows to diagnose all undefined variables
rather than just the first one and error out.
This patch create dummy variable definition when encountering a use of
undefined variable so that parsing can proceed and be diagnosed by
printSubstitutions() later. Tests that were testing whether parsing
fails in such case are thus modified accordingly.
Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson, rnk
Subscribers: JonChesterfield, rogfer01, hfinkel, kristina, rnk, tra, arichardson, grimar, dblaikie, probinson, llvm-commits, hiraditya
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64228
llvm-svn: 365219
|
| |
|
|
|
|
|
|
| |
Patch by Christudasan Devadasan.
Differential Revision: https://reviews.llvm.org/D63886
llvm-svn: 365217
|
| |
|
|
|
|
|
|
|
|
| |
When looking for uses/defs to add kill flags, the iterator was double
incremented, skipping the first instruction in the bundle. The use
register in the first bundle instruction was then incorrectly killed.
The "First" instruction should be the BUNDLE itself as the proper
reverse iterator endpoint.
llvm-svn: 365216
|
| |
|
|
| |
llvm-svn: 365215
|
| |
|
|
|
|
|
|
| |
This add simple Q register forms of bitwise not instructions.
Differential Revision: https://reviews.llvm.org/D63983
llvm-svn: 365214
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This allows the DPP combiner to kick in more often. For example the
exclusive scan generated by the atomic optimizer for a divergent atomic
add used to look like this:
v_mov_b32_e32 v3, v1
v_mov_b32_e32 v5, v1
v_mov_b32_e32 v6, v1
v_mov_b32_dpp v3, v2 wave_shr:1 row_mask:0xf bank_mask:0xf
s_nop 1
v_add_u32_dpp v4, v3, v3 row_shr:1 row_mask:0xf bank_mask:0xf bound_ctrl:0
v_mov_b32_dpp v5, v3 row_shr:2 row_mask:0xf bank_mask:0xf
v_mov_b32_dpp v6, v3 row_shr:3 row_mask:0xf bank_mask:0xf
v_add3_u32 v3, v4, v5, v6
v_mov_b32_e32 v4, v1
s_nop 1
v_mov_b32_dpp v4, v3 row_shr:4 row_mask:0xf bank_mask:0xe
v_add_u32_e32 v3, v3, v4
v_mov_b32_e32 v4, v1
s_nop 1
v_mov_b32_dpp v4, v3 row_shr:8 row_mask:0xf bank_mask:0xc
v_add_u32_e32 v3, v3, v4
v_mov_b32_e32 v4, v1
s_nop 1
v_mov_b32_dpp v4, v3 row_bcast:15 row_mask:0xa bank_mask:0xf
v_add_u32_e32 v3, v3, v4
s_nop 1
v_mov_b32_dpp v1, v3 row_bcast:31 row_mask:0xc bank_mask:0xf
v_add_u32_e32 v1, v3, v1
v_add_u32_e32 v1, v2, v1
v_readlane_b32 s0, v1, 63
But now most of the dpp movs are combined into adds:
v_mov_b32_e32 v3, v1
v_mov_b32_e32 v5, v1
s_nop 0
v_mov_b32_dpp v3, v2 wave_shr:1 row_mask:0xf bank_mask:0xf
s_nop 1
v_add_u32_dpp v4, v3, v3 row_shr:1 row_mask:0xf bank_mask:0xf bound_ctrl:0
v_mov_b32_dpp v5, v3 row_shr:2 row_mask:0xf bank_mask:0xf
v_mov_b32_dpp v1, v3 row_shr:3 row_mask:0xf bank_mask:0xf
v_add3_u32 v1, v4, v5, v1
s_nop 1
v_add_u32_dpp v1, v1, v1 row_shr:4 row_mask:0xf bank_mask:0xe
s_nop 1
v_add_u32_dpp v1, v1, v1 row_shr:8 row_mask:0xf bank_mask:0xc
s_nop 1
v_add_u32_dpp v1, v1, v1 row_bcast:15 row_mask:0xa bank_mask:0xf
s_nop 1
v_add_u32_dpp v1, v1, v1 row_bcast:31 row_mask:0xc bank_mask:0xf
v_add_u32_e32 v1, v2, v1
v_readlane_b32 s0, v1, 63
Reviewers: arsenm, vpykhtin
Subscribers: kzhuravl, nemanjai, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kbarton, MaskRay, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64207
llvm-svn: 365211
|
| |
|
|
| |
llvm-svn: 365206
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reintroduces the scalable vector IR type from D32530, after it was reverted
a couple of times due to increasing chromium LTO build times. This latest
incarnation removes the walk over aggregate types from the verifier entirely,
in favor of rejecting scalable vectors in the isValidElementType methods in
ArrayType and StructType. This removes the 70% degradation observed with
the second repro tarball from PR42210.
Reviewers: thakis, hans, rengolin, sdesmalen
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D64079
llvm-svn: 365203
|
| |
|
|
|
|
|
|
| |
Revision r365061 changed a skip of debug instructions for a skip
of meta instructions. This is not safe, as IMPLICIT_DEF is classed
as a meta instruction.
llvm-svn: 365202
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On RISC-V, the `cycle` CSR holds a 64-bit count of the number of clock
cycles executed by the core, from an arbitrary point in the past. This
matches the intended semantics of `@llvm.readcyclecounter()`, which we
currently leave to the default lowering (to the constant 0).
With this patch, we will now correctly lower this intrinsic to the
intended semantics, using the user-space instruction `rdcycle`. On
64-bit targets, we can directly lower to this instruction.
On 32-bit targets, we need to do more, as `rdcycle` only returns the low
32-bits of the `cycle` CSR. In this case, we perform a custom lowering,
based on the PowerPC lowering, using `rdcycleh` to obtain the high
32-bits of the `cycle` CSR. This custom lowering inserts a new basic
block which detects overflow in the high 32-bits of the `cycle` CSR
during reading (because multiple instructions are required to read). The
emitted assembly matches the suggested assembly in the RISC-V
specification.
Differential Revision: https://reviews.llvm.org/D64125
llvm-svn: 365201
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
printing unknown args
Since OPT_UNKNOWN args never have any values and consist only of
spelling (and are never aliased), this doesn't make any difference in
practice, but it's more consistent with Arg's guidance to use
getAsString() for diagnostics, and it matches what clang does.
Also tweak two tests to use an unknown option that contains '=' for
additional coverage while here. (The new tests pass fine with the old
code too though.)
llvm-svn: 365200
|
| |
|
|
|
|
| |
should not have been added.
llvm-svn: 365199
|