| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
| |
Add support for TLS access for Windows on ARM. This generates a similar access
to MSVC for ARM.
The changes to the tablegen data is needed to support loading an external symbol
global that is not for a call. The adjustments to the DAG to DAG transforms are
needed to preserve the 32-bit move.
llvm-svn: 259676
|
| |
|
|
|
|
|
|
|
|
| |
The GNU toolchain emits __aeabi_divmod for soft-divide on ARM cores
which happens to be a lot faster than __divsi3/__modsi3 when the core
has hardware divide instructions. Do the same here.
Fixes PR26450.
llvm-svn: 259657
|
| |
|
|
| |
llvm-svn: 259655
|
| |
|
|
|
|
| |
Simple fix - Constant values were not being sign extended in FastIsel.
llvm-svn: 259645
|
| |
|
|
|
|
|
|
|
|
| |
MIPS ABI states that .sbss and .sdata sections must have SHF_MIPS_GPREL
flag. See Figure 4–7 on page 69 in the following document:
ftp://www.linux-mips.org/pub/linux/mips/doc/ABI/mipsabi.pdf.
Differential Revision: http://reviews.llvm.org/D15740
llvm-svn: 259641
|
| |
|
|
|
|
|
|
|
|
|
|
| |
EltsFromConsecutiveLoads
Follow up to D16217 and D16729
This change uncovered an odd pattern where VZEXT_LOAD v4i64 was being lowered to a load of the lower v2i64 (so the 2nd i64 destination element wasn't being zeroed), I can't find any use/reason for this and have removed the pattern and replaced it so only the 1st i64 element is loaded and the upper bits all zeroed. This matches the description for X86ISD::VZEXT_LOAD
Differential Revision: http://reviews.llvm.org/D16768
llvm-svn: 259635
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The purpose of PPCVSXFMAMutate is to elide copies by changing FMA forms
on PPC.
%vreg6<def> = COPY %vreg96
%vreg6<def,tied1> = XSMADDASP %vreg6<tied0>, %vreg5<kill>, %vreg7
;v6 = v6 + v5 * v7
is replaced by
%vreg5<def,tied1> = XSMADDMSP %vreg5<tied0>, %vreg7, %vreg96
;v5 = v5 * v7 + v96
This was broken in the case where the target register was also used as a
multiplicand. Fix this case by checking for it and replacing both uses
with the copied register.
%vreg6<def> = COPY %vreg96
%vreg6<def,tied1> = XSMADDASP %vreg6<tied0>, %vreg5<kill>, %vreg6
;v6 = v6 + v5 * v6
is replaced by
%vreg5<def,tied1> = XSMADDMSP %vreg5<tied0>, %vreg96, %vreg96
;v5 = v5 * v96 + v96
llvm-svn: 259617
|
| |
|
|
|
|
| |
Will re-implement based on review feedback.
llvm-svn: 259615
|
| |
|
|
|
|
|
|
| |
See comments in test/CodeGen/X86/avx-vzeroupper.ll for more explanation.
Original patch by: Sean Silva
llvm-svn: 259576
|
| |
|
|
|
|
|
|
| |
If we can't assume the pointer value isn't within the bounds
of the object, it seems risky to try to replace the pointer
calculations.
llvm-svn: 259573
|
| |
|
|
|
|
| |
Also add missing tests for the others.
llvm-svn: 259558
|
| |
|
|
|
|
|
|
|
| |
When the merging was involving LEAs, we were taking the wrong immediate
from the list of operands.
rdar://problem/24446069
llvm-svn: 259553
|
| |
|
|
| |
llvm-svn: 259551
|
| |
|
|
|
|
| |
Mostly convert to use range loops.
llvm-svn: 259550
|
| |
|
|
| |
llvm-svn: 259547
|
| |
|
|
|
|
|
| |
We shouldn't crash on unhandled intrinsics.
Also simplify failure handling in loop.
llvm-svn: 259546
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
When promoting allocas to LDS, we know we are indexing
into a specific area just created, and the calculation
will also never overflow.
Also emit some of the muls as nsw nuw, because instcombine
infers this already from the range metadata. I think
putting this on the other adds and muls might be OK too,
but I'm not 100% sure.
llvm-svn: 259545
|
| |
|
|
|
|
| |
Differential revision: http://reviews.llvm.org/D16793
llvm-svn: 259539
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Enables eip-based addressing, e.g.,
lea constant(%eip), %rax
lea constant(%eip), %eax
in MC, (used for the x32 ABI). EIP-base addressing is also valid in x86_64,
it is left enabled for that architecture as well.
Patch by João Porto
Differential Revision: http://reviews.llvm.org/D16581
llvm-svn: 259528
|
| |
|
|
| |
llvm-svn: 259515
|
| |
|
|
| |
llvm-svn: 259510
|
| |
|
|
|
|
| |
The 3 programs used __attribute__((mode(?))) on enum, which clang r259497 fixed.
llvm-svn: 259508
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Re-commit of r258951 after fixing layering violation.
The BPF and WebAssembly backends had identical code for emitting errors
for unsupported features, and AMDGPU had very similar code. This merges
them all into one DiagnosticInfo subclass, that can be used by any
backend.
There should be minimal functional changes here, but some AMDGPU tests
have been updated for the new format of errors (it used a slightly
different format to BPF and WebAssembly). The AMDGPU error messages will
now benefit from having precise source locations when debug info is
available.
llvm-svn: 259498
|
| |
|
|
| |
llvm-svn: 259496
|
| |
|
|
|
|
| |
Having this hidden option makes it easier to debug other issues.
llvm-svn: 259482
|
| |
|
|
|
|
|
|
|
| |
description and changed the regression test accordingly.
The default configuration of a Cortex-R7 is to implement the
VFPv3-D16 architecture and the feature line as it was is too
restrictive.
llvm-svn: 259480
|
| |
|
|
|
|
|
|
|
| |
Fix a crash in `getMemOpBaseRegImmOfs` that happens if the base of
`MemOp` is a frame index memory operand. The fix is to have
`getMemOpBaseRegImmOfs` bail out in such cases. We can possibly be more
clever here, if needed.
llvm-svn: 259456
|
| |
|
|
|
|
| |
FastISel counterpart to r259448.
llvm-svn: 259449
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Officially, we don't acknowledge non-default configurations of MXCSR,
as getting there would require usage of the FENV_ACCESS pragma (at
least insofar as rounding mode is concerned).
We don't support the pragma, so we can assume that the default
rounding mode - round to nearest, ties to even - is always used.
However, it's inconsistent with the rest of the instruction set,
where MXCSR is always effective (unless otherwise specified).
Also, it's an unnecessary obstacle to the few brave souls that use
fenv.h with LLVM.
Avoid the hard-coded rounding mode for fp_to_f16; use MXCSR instead.
llvm-svn: 259448
|
| |
|
|
| |
llvm-svn: 259438
|
| |
|
|
| |
llvm-svn: 259430
|
| |
|
|
| |
llvm-svn: 259427
|
| |
|
|
| |
llvm-svn: 259420
|
| |
|
|
|
|
|
|
|
|
| |
These sets do linear searching in small mode; It is not a good idea to
use huge numbers as the small value here, save people from themselves by
adding a static_assert.
Differential Revision: http://reviews.llvm.org/D16706
llvm-svn: 259419
|
| |
|
|
| |
llvm-svn: 259411
|
| |
|
|
| |
llvm-svn: 259402
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is an extension to the existing implementation of r242436 which
restricts to only select inputs. This version fixes missed opportunities
in pr26084 by attempting to lower conditional compare sequences of
and/or trees with setcc leafs. This will additionaly handle the case
when a tree with select input is not a conjunction-disjunction tree
but some of the sub trees are conjunction-disjunction trees.
Reviewers: jmolloy, t.p.northover, mcrosier, MatzeB
Subscribers: mcrosier, llvm-commits, junbuml, haicheng, mssimpso, gberry
Differential Revision: http://reviews.llvm.org/D16291
llvm-svn: 259387
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Factor out common code for callee-save register pair calculation. This
is intended to simplify follow-on changes that reduce the number of
registers saved/restored.
Depends on D16732
Reviewers: mcrosier, jmolloy, t.p.northover
Subscribers: aemerson, rengolin, mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D16734
llvm-svn: 259384
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We've found another bug in the code generation logic conditions for a
certain class of always-false conditions, those of the form
if ((a & 1) < 0)
These only reach the back end when compiling without optimization.
The bug was introduced by the choice of using TEST UNDER MASK
to implement a check for
if ((a & MASK) < VAL)
as
if ((a & MASK) == 0)
where VAL is less than the the lowest bit of MASK. This is correct
in all cases except for VAL == 0, in which case the original
condition is always false, but the replacement isn't.
Fixed by excluding that particular case.
llvm-svn: 259381
|
| |
|
|
| |
llvm-svn: 259380
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Simplify callee-save register save/restore code generation by
remembering the size of the callee-save area when it is computed so we
don't have to scan the prologue/epilogue instructions again later to
reconstruct it.
This is intended to simplify follow-on changes that reduce the number of
registers saved/restored.
Reviewers: mcrosier, jmolloy, t.p.northover
Subscribers: aemerson, rengolin, mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D16732
llvm-svn: 259365
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D16399
llvm-svn: 259363
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The bugs were:
* teq and similar take 4-bit unsigned immediates on microMIPS.
* teqi and similar have side-effects like teq do.
* shll_s.w and shra_r.w take 5-bit unsigned immediates.
* The various DSP ext* instructions take a 5-bit immediate.
* repl.qh takes an 8-bit unsigned immediate.
* repl.ph takes a 10-bit unsigned immediate.
* rddsp/wrdsp take a 10-bit unsigned immediate.
* teqi and similar take signed 16-bit immediates (10-bit for microMIPS).
* Out-of-range immediate macros for or/xor take a simm32/simm64 depending
on architecture. I'll fix the simm64 case properly when I reach simm32.
lui is a bit more lenient than GAS and accepts signed immediates in addition
to unsigned. This is because MipsMCExpr can produce signed values when
constant folding and it currently lacks a way of knowing it should fold to
an unsigned value.
Reviewers: vkalintiris
Subscribers: dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D15446
llvm-svn: 259360
|
| |
|
|
|
|
| |
This should now be easier to read.
llvm-svn: 259349
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D16755
llvm-svn: 259346
|
| |
|
|
|
|
|
|
| |
Minor patch to trace back through target shuffles to the source of the inserted element in a (V)INSERTPS shuffle.
Differential Revision: http://reviews.llvm.org/D16652
llvm-svn: 259343
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D16752
llvm-svn: 259342
|
| |
|
|
|
|
|
|
| |
Remove unnecessary includes and class state.
No functional change intended.
llvm-svn: 259340
|
| |
|
|
|
|
| |
efficient, its unclear the few places that were using the _32 version were doing so for efficiency.
llvm-svn: 259330
|
| |
|
|
| |
llvm-svn: 259321
|