| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D13896
llvm-svn: 251287
|
|
|
|
|
|
|
|
| |
Bug https://llvm.org/bugs/show_bug.cgi?id=25318
Differential Revision: http://reviews.llvm.org/D14062
llvm-svn: 251285
|
|
|
|
|
|
|
|
|
|
| |
Otherwise value can be reused , despite its value could be changed - produces incorrect assembler.
https://llvm.org/bugs/show_bug.cgi?id=25270
Differential Revision: http://reviews.llvm.org/D14057
llvm-svn: 251275
|
|
|
|
|
|
|
|
|
|
| |
We didn't validate that the .word directive was given a sane value,
leading to crashes when we attempt to write out the object file.
Instead, perform some validation and issue a diagnostic pointing at the
start of the diagnostic.
llvm-svn: 251270
|
|
|
|
| |
llvm-svn: 251266
|
|
|
|
|
|
| |
Incorrect range test - found during fuzz testing.
llvm-svn: 251245
|
|
|
|
|
|
|
|
|
|
| |
When the target does not support these intrinsics they should be converted to a chain of scalar load or store operations.
If the mask is not constant, the scalarizer will build a chain of conditional basic blocks.
I added isLegalMaskedGather() isLegalMaskedScatter() APIs.
Differential Revision: http://reviews.llvm.org/D13722
llvm-svn: 251237
|
|
|
|
|
|
|
|
|
|
|
|
| |
When using the MCU psABI, compiler-generated library calls should pass
some parameters in-register. However, since inreg marking for x86 is currently
done by the front end, it will not be applied to backend-generated calls.
This is a workaround for PR3997, which describes a similar issue for -mregparm.
Differential Revision: http://reviews.llvm.org/D13977
llvm-svn: 251223
|
|
|
|
|
|
|
|
| |
This adds support for the i?86-*-elfiamcu triple, which indicates the IAMCU psABI is used.
Differential Revision: http://reviews.llvm.org/D13977
llvm-svn: 251222
|
|
|
|
| |
llvm-svn: 251219
|
|
|
|
|
|
| |
Most 128-bit and 256-bit shuffles were manually matching UNPCK patterns - use lowerVectorShuffleWithUNPCK to be more thorough.
llvm-svn: 251211
|
|
|
|
|
|
| |
Use isShuffleEquivalent to match UNPCK shuffles - better support for build vector inputs.
llvm-svn: 251207
|
|
|
|
|
|
|
|
|
| |
This enables tail calls with thiscall, stdcall, vectorcall and
fastcall functions.
Differential Revision: http://reviews.llvm.org/D13999
llvm-svn: 251190
|
|
|
|
| |
llvm-svn: 251189
|
|
|
|
|
|
|
|
|
|
| |
This patch adds support for lowering to the XOP VPROT / VPROTI vector bit rotation instructions.
This has required changes to the DAGCombiner rotation pattern matching to support vector types - so far I've only changed it to support splat vectors, but generalising this further is feasible in the future.
Differential Revision: http://reviews.llvm.org/D13851
llvm-svn: 251188
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The logic here isn't straightforward because our support for
TargetOptions::GuaranteedTailCallOpt.
Also fix a bug where we were allowing tail calls to cdecl functions from
fastcall and vectorcall functions. We were special casing thiscall and
stdcall callers rather than checking for any convention that requires
clearing stack arguments before returning.
Reviewers: hans
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14024
llvm-svn: 251137
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This ensures that BranchFolding (and similar) won't remove these blocks.
Also allow AsmPrinter::EmitBasicBlockStart to process MBBs which are
address-taken but do not have BBs that are address-taken, since otherwise
its call to getAddrLabelSymbolTableToEmit would fail an assertion on such
blocks. I audited the other callers of getAddrLabelSymbolTableToEmit
(and getAddrLabelSymbol); they all have BBs known to be address-taken
except for the call through getAddrLabelSymbol from
WinException::create32bitRef; that call is actually now unreachable, so
I've removed it and updated the signature of create32bitRef.
This fixes PR25168.
Reviewers: majnemer, andrew.w.kaylor, rnk
Subscribers: pgavlin, llvm-commits
Differential Revision: http://reviews.llvm.org/D13774
llvm-svn: 251113
|
|
|
|
|
|
| |
unused.
llvm-svn: 251035
|
|
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D13945
llvm-svn: 251018
|
|
|
|
|
|
|
|
|
|
|
| |
Fixed faiure:
LLVM ERROR: Cannot select: t33: i1 = select_cc t25, Constant:i32<0>, t45, t42, seteq:ch
added a test
Differential Revision: http://reviews.llvm.org/D13943
llvm-svn: 250996
|
|
|
|
|
|
|
|
| |
Clang runtime failure was reported.
Assertion failed: (isExtended() && "Type is not extended!"), function getTypeForEVT
I'll need to add a proper handling for PointerType in masked load/store intrinsics.
llvm-svn: 250995
|
|
|
|
| |
llvm-svn: 250926
|
|
|
|
|
|
| |
parser and disassembler.
llvm-svn: 250911
|
|
|
|
|
| |
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 250883
|
|
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D13884
llvm-svn: 250819
|
|
|
|
| |
llvm-svn: 250741
|
|
|
|
| |
llvm-svn: 250698
|
|
|
|
|
|
|
|
|
|
| |
Originally I planned to use the same interface for masked gather/scatter and set isConsecutive to "false" in this case.
Now I'm implementing masked gather/scatter and see that the interface is inconvenient. I want to add interfaces isLegalMaskedGather() / isLegalMaskedScatter() instead of using the "Consecutive" parameter in the existing interfaces.
Differential Revision: http://reviews.llvm.org/D13850
llvm-svn: 250686
|
|
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D13769
llvm-svn: 250650
|
|
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D13632
llvm-svn: 250649
|
|
|
|
|
|
| |
Added X86ISD opcodes for VPROT vector rotate by variable and by immediate.
llvm-svn: 250620
|
|
|
|
|
|
| |
take ArrayRef instead of pointer and length. NFC
llvm-svn: 250615
|
|
|
|
|
|
| |
Targets with AVX but without AVX2 were incorrectly reporting costs of 256-bit integer shifts.
llvm-svn: 250611
|
|
|
|
|
|
|
|
|
|
| |
Add FastISel support for SSE4A scalar float / double non-temporal stores
Follow up to D13698
Differential Revision: http://reviews.llvm.org/D13773
llvm-svn: 250610
|
|
|
|
|
|
|
|
|
|
|
|
| |
Our previous value of "16 + 8 + MaxCallFrameSize" for ParentFrameOffset
is incorrect when CSRs are involved. We were supposed to have a test
case to catch this, but it wasn't very rigorous.
The main effect here is that calling _CxxThrowException inside a
catchpad doesn't immediately crash on MOVAPS when you have an odd number
of CSRs.
llvm-svn: 250583
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The motivation for this patch starts with PR20134:
https://llvm.org/bugs/show_bug.cgi?id=20134
void foo(int *a, int i) {
a[i] = a[i+1] + a[i+2];
}
It seems better to produce this (14 bytes):
movslq %esi, %rsi
movl 0x4(%rdi,%rsi,4), %eax
addl 0x8(%rdi,%rsi,4), %eax
movl %eax, (%rdi,%rsi,4)
Rather than this (22 bytes):
leal 0x1(%rsi), %eax
cltq
leal 0x2(%rsi), %ecx
movslq %ecx, %rcx
movl (%rdi,%rcx,4), %ecx
addl (%rdi,%rax,4), %ecx
movslq %esi, %rax
movl %ecx, (%rdi,%rax,4)
The most basic problem (the first test case in the patch combines constants) should also be fixed in InstCombine,
but it gets more complicated after that because we need to consider architecture and micro-architecture. For
example, AArch64 may not see any benefit from the more general transform because the ISA solves the sexting in
hardware. Some x86 chips may not want to replace 2 ADD insts with 1 LEA, and there's an attribute for that:
FeatureSlowLEA. But I suspect that doesn't go far enough or maybe it's not getting used when it should; I'm
also not sure if FeatureSlowLEA should also mean "slow complex addressing mode".
I see no perf differences on test-suite with this change running on AMD Jaguar, but I see small code size
improvements when building clang and the LLVM tools with the patched compiler.
A more general solution to the sext(add nsw(x, C)) problem that works for multiple targets is available
in CodeGenPrepare, but it may take quite a bit more work to get that to fire on all of the test cases that
this patch takes care of.
Differential Revision: http://reviews.llvm.org/D13757
llvm-svn: 250560
|
|
|
|
|
|
|
|
| |
Patch by Mitch Bodart
Differential Revision: http://reviews.llvm.org/D13780
llvm-svn: 250550
|
|
|
|
| |
llvm-svn: 250497
|
|
|
|
|
|
| |
Breaks the hexagon buildbot.
llvm-svn: 250461
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Android libc provides a fixed TLS slot for the unsafe stack pointer,
and this change implements direct access to that slot on AArch64 via
__builtin_thread_pointer() + offset.
This change also moves more code into TargetLowering and its
target-specific subclasses to get rid of target-specific codegen
in SafeStackPass.
This change does not touch the ARM backend because ARM lowers
builting_thread_pointer as aeabi_read_tp, which is not available
on Android.
llvm-svn: 250456
|
|
|
|
|
|
|
|
|
|
| |
D4796 taught LLVM to fold some atomic integer operations into a single
instruction. The pattern was unaware that the instructions clobbered
flags. I fixed some of this issue in D13680 but had missed INC/DEC.
This patch adds the missing EFLAGS definition.
llvm-svn: 250438
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
x86 codegen is clever about generating good code for relaxed
floating-point operations, but it was being silly when globals and
immediates were involved, forgetting where the global was and
loading/storing from/to the wrong place. The same applied to hard-coded
address immediates.
Don't let it forget about the displacement.
This fixes https://llvm.org/bugs/show_bug.cgi?id=25171
A very similar bug when doing floating-points atomics to the stack is
also fixed by this patch.
This fixes https://llvm.org/bugs/show_bug.cgi?id=25144
Reviewers: pete
Subscribers: llvm-commits, majnemer, rsmith
Differential Revision: http://reviews.llvm.org/D13749
llvm-svn: 250429
|
|
|
|
| |
llvm-svn: 250406
|
|
|
|
|
|
|
|
| |
shuffle packed values at 128-bit granularity )
Differential Revision: http://reviews.llvm.org/D13648
llvm-svn: 250400
|
|
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D13768
llvm-svn: 250396
|
|
|
|
|
|
|
|
|
| |
AVX-512 bit shuffle fails on 32 bit since we create a vector of 64-bit constants.
I split 8x64-bit const vector to 16x32 on 32-bit mode.
Differential Revision: http://reviews.llvm.org/D13644
llvm-svn: 250390
|
|
|
|
| |
llvm-svn: 250362
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch teaches x86 fast-isel how to select nontemporal stores.
On x86, we can use MOVNTI for nontemporal stores of doublewords/quadwords.
Instructions (V)MOVNTPS/PD/DQ can be used for SSE2/AVX aligned nontemporal
vector stores.
Before this patch, fast-isel always selected 'movd/movq' instead of 'movnti'
for doubleword/quadword nontemporal stores. In the case of nontemporal stores
of aligned vectors, fast-isel always selected movaps/movapd/movdqa instead of
movntps/movntpd/movntdq.
With this patch, if we use SSE2/AVX intrinsics for nontemporal stores we now
always get the expected (V)MOVNT instructions.
The lack of fast-isel support for nontemporal stores was spotted when analyzing
the -O0 codegen for nontemporal stores.
Differential Revision: http://reviews.llvm.org/D13698
llvm-svn: 250285
|
|
|
|
| |
llvm-svn: 250268
|
|
|
|
| |
llvm-svn: 250174
|