| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
| |
llvm-svn: 222911
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
The AAPCS treats small structs and homogeneous floating (or vector) aggregates
specially, and guarantees they either get passed as a contiguous block of
registers, or prevent any future use of those registers and get passed on the
stack.
This concept can fit quite neatly into LLVM's own type system, mapping an HFA
to [N x float] and so on, and small structs to [N x i64]. Doing so allows
front-ends to emit AAPCS compliant code without having to duplicate the
register counting logic.
llvm-svn: 222903
|
| |
|
|
|
|
|
|
|
|
|
|
| |
divisor info FMULs by the reciprocal.
E.g., ( a / D; b / D ) -> ( recip = 1.0 / D; a * recip; b * recip)
A hook is added to allow the target to control whether it needs to do such combine.
Reviewed in http://reviews.llvm.org/D6334
llvm-svn: 222510
|
| |
|
|
|
|
|
| |
Phabricator Revision: http://reviews.llvm.org/D5589
Patch by Balaram Makam <bmakam@codeaurora.org>!!
llvm-svn: 219276
|
| |
|
|
| |
llvm-svn: 218934
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This required a new hook called hasLoadLinkedStoreConditional to know whether
to expand atomics to LL/SC (ARM, AArch64, in a future patch Power) or to
CmpXchg (X86).
Apart from that, the new code in AtomicExpandPass is mostly moved from
X86AtomicExpandPass. The main result of this patch is to get rid of that
pass, which had lots of code duplicated with AtomicExpandPass.
llvm-svn: 217928
|
| |
|
|
|
|
|
|
|
|
|
|
| |
We were materialising big-endian constants using DAG nodes with types different
from what was requested, followed by a bitcast. This is fine on little-endian
machines where bitcasting is a nop, but we need a slightly different
representation for big-endian. This adds a new set of NVCAST (natural-vector
cast) operations which are always nops.
Patch by Asiri Rathnayake.
llvm-svn: 217138
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Split shouldExpandAtomicInIR() into different versions for Stores/Loads/RMWs/CmpXchgs.
Makes runOnFunction cleaner (no more redundant checking/casting), and will help moving
the X86 backend to this pass.
This requires a way of easily detecting which instructions are atomic.
I followed the pattern of mayReadFromMemory, mayWriteOrReadMemory, etc.. in making
isAtomic() a method of Instruction implemented by a switch on the opcodes.
Test Plan: make check
Reviewers: jfb
Subscribers: mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D5035
llvm-svn: 217080
|
| |
|
|
|
|
| |
No functionality change. Changes made by clang-tidy + some manual cleanup.
llvm-svn: 217028
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Just fixing comments, no functional change.
Test Plan: N/A
Reviewers: jfb
Subscribers: mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D5130
llvm-svn: 216784
|
| |
|
|
|
|
|
|
|
|
| |
Add header guards to files that were missing guards. Remove #endif comments
as they don't seem common in LLVM (we can easily add them back if we decide
they're useful)
Changes made by clang-tidy with minor tweaks.
llvm-svn: 215558
|
| |
|
|
|
|
|
|
|
|
| |
Rename to allowsMisalignedMemoryAccess.
On R600, 8 and 16 byte accesses are mostly OK with 4-byte alignment,
and don't need to be split into multiple accesses. Vector loads with
an alignment of the element type are not uncommon in OpenCL code.
llvm-svn: 214055
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
address of the stack guard was being spilled to the stack.
Previously the address of the stack guard would get spilled to the stack if it
was impossible to keep it in a register. This patch introduces a new target
independent node and pseudo instruction which gets expanded post-RA to a
sequence of instructions that load the stack guard value. Register allocator
can now just remat the value when it can't keep it in a register.
<rdar://problem/12475629>
llvm-svn: 213967
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The target-independent DAGcombiner will generate:
asr w1, X, #31 w1 = splat sign bit.
add X, X, w1, lsr #28 X = X + 0 or pow2-1
asr w0, X, asr #4 w0 = X/pow2
However, the add + shifts is expensive, so generate:
add w0, X, 15 w0 = X + pow2-1
cmp X, wzr X - 0
csel X, w0, X, lt X = (X < 0) ? X + pow2-1 : X;
asr w0, X, asr 4 w0 = X/pow2
llvm-svn: 213758
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
vector type legalization strategies in a more fine grained manner, and
change the legalization of several v1iN types and v1f32 to be widening
rather than scalarization on AArch64.
This fixes an assertion failure caused by scalarizing nodes like "v1i32
trunc v1i64". As v1i64 is legal it will fail to scalarize v1i32.
This also provides a foundation for other targets to have more granular
control over how vector types are legalized.
Patch by Hao Liu, reviewed by Tim Northover. I'm committing it to allow
some work to start taking place on top of this patch as it adds some
really important hooks to the backend that I'd like to immediately start
using. =]
http://reviews.llvm.org/D4322
llvm-svn: 212242
|
| |
|
|
|
|
|
| |
This currently necessitates a TargetMachine for the TargetLowering
constructor and TLOF.
llvm-svn: 210605
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit starts with a "git mv ARM64 AArch64" and continues out
from there, renaming the C++ classes, intrinsics, and other
target-local objects for consistency.
"ARM64" test directories are also moved, and tests that began their
life in ARM64 use an arm64 triple, those from AArch64 use an aarch64
triple. Both should be equivalent though.
This finishes the AArch64 merge, and everyone should feel free to
continue committing as normal now.
llvm-svn: 209577
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I'm doing this in two phases for a better "git blame" record. This
commit removes the previous AArch64 backend and redirects all
functionality to ARM64. It also deduplicates test-lines and removes
orphaned AArch64 tests.
The next step will be "git mv ARM64 AArch64" and rewire most of the
tests.
Hopefully LLVM is still functional, though it would be even better if
no-one ever had to care because the rename happens straight
afterwards.
llvm-svn: 209576
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit r208934.
The patch depends on aliases to GEPs with non zero offsets. That is not
supported and fairly broken.
The good news is that GlobalAlias is being redesigned and will have support
for offsets, so this patch should be a nice match for it.
llvm-svn: 208978
|
| |
|
|
|
|
|
|
|
|
|
| |
This commit implements two command line switches -global-merge-on-external
and -global-merge-aligned, and both of them are false by default, so this
optimization is disabled by default for all targets.
For ARM64, some back-end behaviors need to be tuned to get this optimization
further enabled.
llvm-svn: 208934
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
We must validate the value type in TLI::getRegisterByName, because if we
don't and the wrong type was used with the IR intrinsic, then we'll assert
(because we won't be able to find a valid register class with which to
construct the requested copy operation). For PPC64, additionally, the type
information is necessary to decide between the 64-bit register and the 32-bit
subregister.
No functionality change.
llvm-svn: 208508
|
| |
|
|
|
|
| |
No functionality change intended.
llvm-svn: 208507
|
| |
|
|
|
|
|
|
|
|
|
| |
This patch implements the infrastructure to use named register constructs in
programs that need access to specific registers (bare metal, kernels, etc).
So far, only the stack pointer is supported as a technology preview, but as it
is, the intrinsic can already support all non-allocatable registers from any
architecture.
llvm-svn: 208104
|
| |
|
|
|
|
| |
'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. AArch64 edition
llvm-svn: 207510
|
| |
|
|
| |
llvm-svn: 206861
|
| |
|
|
|
|
|
|
| |
back end. This should boost vectorized code performance.
Patched by Z. Zheng
llvm-svn: 206557
|
| |
|
|
| |
llvm-svn: 206089
|
| |
|
|
| |
llvm-svn: 205926
|
| |
|
|
|
|
|
|
|
| |
In AArch64 i64 to i32 truncate operation is a subregister access.
This allows more opportunities for LSR optmization to eliminate
variables of different types (i32 and i64).
llvm-svn: 205925
|
| |
|
|
|
|
|
|
| |
Lower SHL_PARTS, SRA_PARTS and SRL_PARTS to perform 128-bit integer shift
Patch by GuanHong Liu.
llvm-svn: 204940
|
| |
|
|
| |
llvm-svn: 202621
|
| |
|
|
|
|
|
|
| |
SHUFFLE_VECTOR.
Replace r199791.
llvm-svn: 200180
|
| |
|
|
|
|
| |
It's old version which has some bugs. I'll commit lattest patch soon.
llvm-svn: 200179
|
| |
|
|
|
|
| |
backend.
llvm-svn: 199858
|
| |
|
|
|
|
| |
SHUFFLE_VECTOR.
llvm-svn: 199791
|
| |
|
|
|
|
|
|
|
|
| |
subsequent changes are easier to review. About to fix some layering
issues, and wanted to separate out the necessary churn.
Also comment and sink the include of "Windows.h" in three .inc files to
match the usage in Memory.inc.
llvm-svn: 198685
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
During the years there have been some attempts at figuring out how to
align byval arguments. A look at the commit log suggests that they
were
* Use the ABI alignment.
* When that was not sufficient for x86-64, I added the 's' specification to
DataLayout.
* When that was not sufficient Evan added the virtual getByValTypeAlignment.
* When even that was not sufficient, we just got the FE to add the alignment
to the byval.
This patch is just a simple cleanup that removes my first attempt at fixing the
problem. I also added an AArch64 implementation of getByValTypeAlignment to
make sure this patch is a nop. I also left the 's' parsing for backward
compatibility.
I will send a short email to llvmdev about the change for anyone maintaining
an out of tree target.
llvm-svn: 198287
|
| |
|
|
| |
llvm-svn: 197551
|
| |
|
|
| |
llvm-svn: 196998
|
| |
|
|
|
|
| |
v4i64 and v8i64.
llvm-svn: 196456
|
| |
|
|
|
|
|
| |
and TRN.
Fix a bug when mixed use of vget_high_u8() and vuzp_u8().
llvm-svn: 195716
|
| |
|
|
| |
llvm-svn: 195078
|
| |
|
|
|
|
| |
The functions are like: vst1_s8_x2 ...
llvm-svn: 194990
|
| |
|
|
| |
llvm-svn: 194656
|
| |
|
|
| |
llvm-svn: 194118
|
| |
|
|
|
|
|
|
|
|
|
|
| |
class SIMD(lselem-post).
Including following 14 instructions:
4 ld1 insts: post-index load multiple 1-element structure to sequential 1/2/3/4 registers.
ld2/ld3/ld4: post-index load multiple N-element structure to sequential N registers (N=2,3,4).
4 st1 insts: post-index store multiple 1-element structure from sequential 1/2/3/4 registers.
st2/st3/st4: post-index store multiple N-element structure from sequential N registers (N = 2,3,4).
llvm-svn: 194043
|
| |
|
|
|
|
| |
Fixes PR17690
llvm-svn: 193625
|
| |
|
|
| |
llvm-svn: 192410
|
| |
|
|
|
|
|
|
|
|
|
|
| |
SIMD(lselem).
Including following 14 instructions:
4 ld1 insts: load multiple 1-element structure to sequential 1/2/3/4 registers.
ld2/ld3/ld4: load multiple N-element structure to sequential N registers (N=2,3,4).
4 st1 insts: store multiple 1-element structure from sequential 1/2/3/4 registers.
st2/st3/st4: store multiple N-element structure from sequential N registers (N = 2,3,4).
llvm-svn: 192361
|
| |
|
|
|
|
|
|
| |
class SIMD(lselem). Including following 14 instructions: 4 ld1 insts: load multiple 1-element structure to sequential 1/2/3/4 registers. ld2/ld3/ld4: load multiple N-element structure to sequential N registers (N=2,3,4). 4 st1 insts: store multiple 1-element structure from sequential 1/2/3/4 registers. st2/st3/st4: store multiple N-element structure from sequential N registers (N = 2,3,4)."
This reverts commit r192352. It broke the build.
llvm-svn: 192354
|