| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
| |
Bots sad. Looks like missing std::is_trivially_copyable.
llvm-svn: 341730
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: I saw a few places that were punning through a union of FP and integer, and that made me sad. Luckily, C++20 adds bit_cast for exactly that purpose. Implement our own version in ADT (without constexpr, leaving us a bit sad), and use it in the few places my grep-fu found silly union punning.
Reviewers: javed.absar
Subscribers: dexonsmith, llvm-commits
Differential Revision: https://reviews.llvm.org/D51693
llvm-svn: 341728
|
|
|
|
|
|
|
|
|
|
|
| |
Because t2LDREX (& t2STREX) were marked as AddrModeNone, but did allow a
FrameIndex operand, rewriteT2FrameIndex asserted. This gives them a
proper addressing-mode and tells the rewriter about it so that encodable
offsets are exploited and others are rejected.
Should fix PR38828.
llvm-svn: 341642
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
SHF_ARM_PURECODE flag when being built with the -mexecute-only flag.
All code sections of an ELF must have the flag set for the final .text
section to be execute-only, otherwise the flag gets removed.
A HasData flag is added to MCSection to aid in the determination that
the section is empty. A virtual setTargetSectionFlags is added to
MCELFObjectTargetWriter to allow subclasses to set target specific
section flags to be added to sections which we then use in the ARM
backend to set SHF_ARM_PURECODE.
Patch by Ivan Lozano!
Reviewed By: echristo
Differential Revision: https://reviews.llvm.org/D48792
llvm-svn: 341593
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This removes the FrameAccess struct that was added to the interface
in D51537, since the PseudoValue from the MachineMemoryOperand
can be safely casted to a FixedStackPseudoSourceValue.
Reviewers: MatzeB, thegameg, javed.absar
Reviewed By: thegameg
Differential Revision: https://reviews.llvm.org/D51617
llvm-svn: 341454
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
shouldAssumeDSOLocal. NFC.
On Windows, if shouldAssumeDSOLocal returns false, it's either a
dllimport reference, or a reference that we should treat as non-local
and create a stub for.
Clean up AArch64Subtarget::ClassifyGlobalReference a little while
touching the flag handling relating to dllimport.
Differential Revision: https://reviews.llvm.org/D51590
llvm-svn: 341402
|
|
|
|
|
|
|
| |
Also adjust some of dsymutil's headers to put the header guards at the top,
otherwise the compiler will not recognize them as header guards.
llvm-svn: 341323
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For instructions that spill/fill to and from multiple frame-indices
in a single instruction, hasStoreToStackSlot and hasLoadFromStackSlot
should return an array of accesses, rather than just the first encounter
of such an access.
This better describes FI accesses for AArch64 (paired) LDP/STP
instructions.
Reviewers: t.p.northover, gberry, thegameg, rengolin, javed.absar, MatzeB
Reviewed By: MatzeB
Differential Revision: https://reviews.llvm.org/D51537
llvm-svn: 341301
|
|
|
|
|
|
|
|
|
|
|
| |
The runtime pseudo relocations can't handle the ARM format embedded
addresses in movw/movt pairs. By using stubs, the potentially
dllimported addresses can be touched up by the runtime pseudo relocation
framework.
Differential Revision: https://reviews.llvm.org/D51450
llvm-svn: 341176
|
|
|
|
|
|
|
|
|
|
|
| |
It has essentially the same benefit it has on 64-bit ARM: it
substantially reduces the number of constants used by large GEP
operations. Seems to be generally helpful across a few different
codebases I've tried.
Differential Revision: https://reviews.llvm.org/D51462
llvm-svn: 341136
|
|
|
|
|
|
| |
Enable `FeatureUseAA` for all Exynos processors.
llvm-svn: 341101
|
|
|
|
|
|
|
|
|
|
|
| |
..Move all target-dependent checks into new isCopyInstrImpl method.
This change allows us to treat MoveReg-type instructions and generic
COPY instruction in the same way
Differential Revision: https://reviews.llvm.org/D49913
llvm-svn: 341072
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is a continuation of https://reviews.llvm.org/D49727
Below the original text, current changes in the comments:
Currently, in line with GCC, when specifying reserved registers like sp or pc on an inline asm() clobber list, we don't always preserve the original value across the statement. And in general, overwriting reserved registers can have surprising results.
For example:
extern int bar(int[]);
int foo(int i) {
int a[i]; // VLA
asm volatile(
"mov r7, #1"
:
:
: "r7"
);
return 1 + bar(a);
}
Compiled for thumb, this gives:
$ clang --target=arm-arm-none-eabi -march=armv7a -c test.c -o - -S -O1 -mthumb
...
foo:
.fnstart
@ %bb.0: @ %entry
.save {r4, r5, r6, r7, lr}
push {r4, r5, r6, r7, lr}
.setfp r7, sp, #12
add r7, sp, #12
.pad #4
sub sp, #4
movs r1, #7
add.w r0, r1, r0, lsl #2
bic r0, r0, #7
sub.w r0, sp, r0
mov sp, r0
@APP
mov.w r7, #1
@NO_APP
bl bar
adds r0, #1
sub.w r4, r7, #12
mov sp, r4
pop {r4, r5, r6, r7, pc}
...
r7 is used as the frame pointer for thumb targets, and this function needs to restore the SP from the FP because of the variable-length stack allocation a. r7 is clobbered by the inline assembly (and r7 is included in the clobber list), but LLVM does not preserve the value of the frame pointer across the assembly block.
This type of behavior is similar to GCC's and has been discussed on the bugtracker: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=11807 . No consensus seemed to have been reached on the way forward. Clang behavior has briefly been discussed on the CFE mailing (starting here: http://lists.llvm.org/pipermail/cfe-dev/2018-July/058392.html). I've opted for following Eli Friedman's advice to print warnings when there are reserved registers on the clobber list so as not to diverge from GCC behavior for now.
The patch uses MachineRegisterInfo's target-specific knowledge of reserved registers, just before we convert the inline asm string in the AsmPrinter.
If we find a reserved register, we print a warning:
repro.c:6:7: warning: inline asm clobber list contains reserved registers: R7 [-Winline-asm]
"mov r7, #1"
^
Reviewers: efriedma, olista01, javed.absar
Reviewed By: efriedma
Subscribers: eraman, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D51165
llvm-svn: 341062
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Consider the endianness of the target when printing register names. This is in line with the documentation at http://llvm.org/docs/LangRef.html#asm-template-argument-modifiers
Patch by Jackson Woodruff <jackson.woodruff@arm.com>
Reviewers: t.p.northover, echristo, javed.absar, efriedma
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D49778
llvm-svn: 341052
|
|
|
|
|
|
|
|
|
|
| |
The inline sequence is very long (about 70 bytes on Thumb1), so it's
not really a good idea to inline it, especially when optimizing for
size.
Differential Revision: https://reviews.llvm.org/D47917
llvm-svn: 340458
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On Windows, movw+movt pairs with relocations are handled with a single
relocation that covers them both. Therefore we can't inject anything
between these instructions, otherwise the relocation (which in LLVM
only is treated as the movw instruction's relocation, while the movt
instruction's relocation is dropped) will end up bogus.
These instructions are bundled up until right before the constant
islands pass, making this effectively the only place that can split
them apart.
Differential Revision: https://reviews.llvm.org/D51032
llvm-svn: 340451
|
|
|
|
|
|
|
|
| |
This makes sure the flags are available for use for thumb MIR as well.
A test that requires this will be added in the next commit.
llvm-svn: 340450
|
|
|
|
|
|
|
|
|
|
|
| |
This avoids a potential infinite loop setting and unsetting bits in the
mask.
Reduced from a failure on the polly-aosp bot.
Differential Revision: https://reviews.llvm.org/D51066
llvm-svn: 340446
|
|
|
|
|
|
|
|
|
| |
Add intrinsic isel patterns for sxtb16, sxtab16, uxtb16 and uxtab16
so that they can perform a ror.
Differential Revision: https://reviews.llvm.org/D51034
llvm-svn: 340405
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds the plumbing for the Tiny code model for the AArch64 backend. This,
instead of loading addresses through the normal ADRP;ADD pair used in the Small
model, uses a single ADR. The 21 bit range of an ADR means that the code and
its statically defined symbols need to be within 1MB of each other.
This makes it mostly interesting for embedded applications where we want to fit
as much as we can in as small a space as possible.
Differential Revision: https://reviews.llvm.org/D49673
llvm-svn: 340397
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add +fp16fml feature for new FP16 instructions, which are a
mandatory part of FP16 from v8.4-A and an optional part of FP16
from v8.2-A. It doesn't seem to be possible to model this in
LLVM, but the relationship between the options is handled by
the related clang patch.
In keeping with what I think is the usual practice, the fp16fml
extension is accepted regardless of base architecture version.
Builds on/replaces Sjoerd Meijer's patch to add these instructions at
https://reviews.llvm.org/D49839.
Differential Revision: https://reviews.llvm.org/D50228
llvm-svn: 340013
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D50846
llvm-svn: 339997
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
a generically extensible collection of extra info attached to
a `MachineInstr`.
The primary change here is cleaning up the APIs used for setting and
manipulating the `MachineMemOperand` pointer arrays so chat we can
change how they are allocated.
Then we introduce an extra info object that using the trailing object
pattern to attach some number of MMOs but also other extra info. The
design of this is specifically so that this extra info has a fixed
necessary cost (the header tracking what extra info is included) and
everything else can be tail allocated. This pattern works especially
well with a `BumpPtrAllocator` which we use here.
I've also added the basic scaffolding for putting interesting pointers
into this, namely pre- and post-instruction symbols. These aren't used
anywhere yet, they're just there to ensure I've actually gotten the data
structure types correct. I'll flesh out support for these in
a subsequent patch (MIR dumping, parsing, the works).
Finally, I've included an optimization where we store any single pointer
inline in the `MachineInstr` to avoid the allocation overhead. This is
expected to be the overwhelmingly most common case and so should avoid
any memory usage growth due to slightly less clever / dense allocation
when dealing with >1 MMO. This did require several ergonomic
improvements to the `PointerSumType` to reasonably support the various
usage models.
This also has a side effect of freeing up 8 bits within the
`MachineInstr` which could be repurposed for something else.
The suggested direction here came largely from Hal Finkel. I hope it was
worth it. ;] It does hopefully clear a path for subsequent extensions
w/o nearly as much leg work. Lots of thanks to Reid and Justin for
careful reviews and ideas about how to do all of this.
Differential Revision: https://reviews.llvm.org/D50701
llvm-svn: 339940
|
|
|
|
|
|
|
|
|
|
|
| |
While searching through the use-def tree, ignore GetElementPtrInst
instructions because they don't need promoting and neither do their
indices. Otherwise, the wide indices prevent the transformation from
happening.
Differential Revision: https://reviews.llvm.org/D50762
llvm-svn: 339871
|
|
|
|
|
|
|
|
| |
Treat zext instructions as roots, like we do for truncs.
Differential Revision: https://reviews.llvm.org/D50759
llvm-svn: 339868
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Originally committed in r339755 which was reverted in r339806 due to
an asan issue. The issue was caused by my assumption that operands to
a CallInst mapped to the FunctionType Params. CallInsts are now
handled by iterating over their ArgOperands instead of Operands.
Original Message:
Treat signed icmps as 'sinks', allowing them to be in the use-def
tree, enabling more promotions to be performed. As a sink, any
promoted incoming values need to be truncated before being used by
the signed icmp.
Differential Revision: https://reviews.llvm.org/D50067
llvm-svn: 339858
|
|
|
|
|
|
|
|
| |
use-after-poison in check-llvm under asan
This reverts commit r339755.
llvm-svn: 339806
|
|
|
|
|
|
|
|
|
| |
We only try to promote types with are smaller than 16-bits, but we
also need to check that the type is not less than 8-bits.
Differential Revision: https://reviews.llvm.org/D50769
llvm-svn: 339770
|
|
|
|
|
|
|
|
|
|
|
| |
Treat signed icmps as 'sinks', allowing them to be in the use-def
tree, enabling more promotions to be performed. As a sink, any
promoted incoming values need to be truncated before being used by
the signed icmp.
Differential Revision: https://reviews.llvm.org/D50067
llvm-svn: 339755
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add pointers to the list of allowed types, but don't try to promote
them. Also fixed a bug with the promotion of undef values, so a new
value is now created instead of mutating in place. We also now only
promote if there's an instruction in the use-def chains other than
the icmp, sinks and sources.
Differential Revision: https://reviews.llvm.org/D50054
llvm-svn: 339754
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
`MachineMemOperand` pointers attached to `MachineSDNodes` and instead
have the `SelectionDAG` fully manage the memory for this array.
Prior to this change, the memory management was deeply confusing here --
The way the MI was built relied on the `SelectionDAG` allocating memory
for these arrays of pointers using the `MachineFunction`'s allocator so
that the raw pointer to the array could be blindly copied into an
eventual `MachineInstr`. This creates a hard coupling between how
`MachineInstr`s allocate their array of `MachineMemOperand` pointers and
how the `MachineSDNode` does.
This change is motivated in large part by a change I am making to how
`MachineFunction` allocates these pointers, but it seems like a layering
improvement as well.
This would run the risk of increasing allocations overall, but I've
implemented an optimization that should avoid that by storing a single
`MachineMemOperand` pointer directly instead of allocating anything.
This is expected to be a net win because the vast majority of uses of
these only need a single pointer.
As a side-effect, this makes the API for updating a `MachineSDNode` and
a `MachineInstr` reasonably different which seems nice to avoid
unexpected coupling of these two layers. We can map between them, but we
shouldn't be *surprised* at where that occurs. =]
Differential Revision: https://reviews.llvm.org/D50680
llvm-svn: 339740
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Intentionally excluding nodes from the DAGCombine worklist is likely to
lead to weird optimizations and infinite loops, so it's generally a bad
idea.
To avoid the infinite loops, fix DAGCombine to use the
isDesirableToCommuteWithShift target hook before performing the
transforms in question, and implement the target hook in the ARM backend
disable the transforms in question.
Fixes https://bugs.llvm.org/show_bug.cgi?id=38530 . (I don't have a
reduced testcase for that bug. But we should have sufficient test
coverage for PerformSHLSimplify given that we're not playing weird
tricks with the worklist. I can try to bugpoint it if necessary,
though.)
Differential Revision: https://reviews.llvm.org/D50667
llvm-svn: 339734
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D50511
llvm-svn: 339645
|
|
|
|
| |
llvm-svn: 339546
|
|
|
|
| |
llvm-svn: 339479
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
LLVM normally prefers to minimize the number of bits set in an AND
immediate, but that doesn't always match the available ARM instructions.
In Thumb1 mode, prefer uxtb or uxth where possible; otherwise, prefer
a two-instruction sequence movs+ands or movs+bics.
Some potential improvements outlined in
ARMTargetLowering::targetShrinkDemandedConstant, but seems to work
pretty well already.
The ARMISelDAGToDAG fix ensures we don't generate an invalid UBFX
instruction due to a larger-than-expected mask. (It's orthogonal, in
some sense, but as far as I can tell it's either impossible or nearly
impossible to reproduce the bug without this change.)
According to my testing, this seems to consistently improve codesize by
a small amount by forming bic more often for ISD::AND with an immediate.
Differential Revision: https://reviews.llvm.org/D50030
llvm-svn: 339472
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Enabling ARMCodeGenPrepare by default caused a whole load of
failures. This is due to zexts and truncs not being handled properly.
ZExts are messy so it's just easier to disable for now and truncs
are allowed only as 'sinks'. I still need to figure out why allowing
them as 'sources' causes so many failures. The other main changes are
that we are explicit in the types that we converting to, it's now
always 'TypeSize'. Type support is also now performed while checking
for valid opcodes as it unnecessarily complicated having the checks
are different stages.
I've moved the tests around too, so we have the zext and truncs in
their own file as well as the overflowing opcode tests.
Differential Revision: https://reviews.llvm.org/D50518
llvm-svn: 339432
|
|
|
|
|
|
|
|
|
| |
Enable `FeatureZCZeroing`, `FeatureHasSlowFPVMLx`, `FeatureExpandMLx`,
`FeatureProfUnpredicate`, `FeatureSlowVDUP32`, `FeatureSlowVGETLNi32`,
`FeatureSplatVFPToNeon`, `FeatureHasRetAddrStack`, `FeatureSlowFPBrcc` for
all Exynos processors.
llvm-svn: 339356
|
|
|
|
|
|
|
| |
Add new feature, `FeatureUseWideStrideVFP`, that replaces the need for a
processor check. Otherwise, NFC.
llvm-svn: 339354
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D50454
llvm-svn: 339340
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Normally, if any registers are spilled, we prefer to spill lr on Thumb1
so we can fold the "bx lr" into the "pop". However, if there are tail
calls involved, restoring lr is expensive, so skip the optimization in
that case.
The spill of r7 in the new test also isn't necessary, but that's
mostly orthogonal to this patch. (It's the same code in
ARMFrameLowering, but it's not related to tail calls.)
Differential Revision: https://reviews.llvm.org/D49459
llvm-svn: 339283
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D50427
llvm-svn: 339241
|
|
|
|
|
|
|
|
| |
This adds codegen support for the vmov_n_f16 and vdup_n_f16 variants.
Differential Revision: https://reviews.llvm.org/D50329
llvm-svn: 339238
|
|
|
|
|
|
|
|
| |
This adds codegen support for the vmul_lane_f16 and vmul_n_f16 variants.
Differential Revision: https://reviews.llvm.org/D50326
llvm-svn: 339232
|
|
|
|
|
|
|
|
| |
This adds codegen support for the different vcvt_f16 variants.
Differential Revision: https://reviews.llvm.org/D50393
llvm-svn: 339227
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D50238
llvm-svn: 339221
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D50236
llvm-svn: 339148
|
|
|
|
|
|
|
|
|
| |
ld64 supplies its own Thumb bit for Thumb functions, and intentionally zeroes
out that part of any addend in an object file. But it only does that for
symbols marked N_EXT -- i.e. external symbols. So LLVM should avoid setting
that extra bit in other cases.
llvm-svn: 339007
|
|
|
|
|
|
|
|
| |
This is addressing PR38404.
Differential Revision: https://reviews.llvm.org/D50186
llvm-svn: 338835
|
|
|
|
|
|
| |
This is addressing PR38404.
llvm-svn: 338830
|