| Commit message | Author | Age | Files | Lines |
Summary: The PIC additions didn't update the prologue and epilogue code to save and restore r30 (PIC base register). This does that.
Test Plan: Tests updated.
Reviewers: hfinkel
Reviewed By: hfinkel
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D6876
llvm-svn: 225450
This partially fixes PR13007 (ARM CodeGen fails with large stack
alignment): it covers ARM and Thumb2 targets, but not Thumb1, since
stack alignment for Thumb1 targets appears never to have been
supported at all.
Producing an aligned stack pointer is done by zeroing out the lower
bits of the stack pointer. The BIC instruction was used for this.
However, the immediate field of the BIC instruction can only encode
an immediate that zeroes out at most the 8 lower bits. When a larger
alignment is requested, a BIC instruction cannot be used; llvm was
silently producing incorrect code in this case.
This commit fixes code generation for large stack alignments by
using the BFC instruction instead, when the BFC instruction is
available. When it is not, it uses two instructions: a right shift
followed by a left shift to zero out the lower bits.
The lowering of ARM::Int_eh_sjlj_dispatchsetup still has code
that unconditionally uses BIC to realign the stack pointer, so it
very likely has the same problem. However, I wasn't able to
produce a test case for that. This commit adds an assert so that
the compiler will hit the assert instead of silently generating
wrong code if this is ever reached.
llvm-svn: 225446
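To make the encoding limit concrete, here is a minimal C++ sketch of the
two realignment strategies described above; the function names are
hypothetical and this is not the actual lowering code:

  #include <cstdint>

  // BIC path: only legal while the clear-mask fits ARM's modified-immediate
  // encoding, i.e. the mask covers at most the 8 lower bits (alignment <= 256).
  uint32_t realignWithBIC(uint32_t SP, unsigned Log2Align) {
    uint32_t Mask = (1u << Log2Align) - 1;  // e.g. 0x7F for 128-byte alignment
    return SP & ~Mask;                      // BIC sp, sp, #Mask
  }

  // Shift path: works for any alignment when BFC is unavailable.
  uint32_t realignWithShifts(uint32_t SP, unsigned Log2Align) {
    return (SP >> Log2Align) << Log2Align;  // LSR then LSL
  }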
Its functionality has been replaced by calling
SIInstrInfo::legalizeOperands() from
SIISelLowering::AdjustInstrPostInstrSelection() and running the
SIFoldOperands and SIShrinkInstructions passes.
llvm-svn: 225445
The call lowering assumes that if the callee is a global, we want to emit a direct call.
This is correct for regular globals, but not for TLS ones.
Differential Revision: http://reviews.llvm.org/D6862
llvm-svn: 225438
LEA variants in Intel syntax. The memory operand is inherently unsized.
llvm-svn: 225432
type (in addition to the memory type).
The *LoadExt* legalization handling used to only have one type, the
memory type. This forced users to assume that as long as the extload
for the memory type was declared legal, and the result type was legal,
the whole extload was legal.
However, this isn't always the case. For instance, on X86, with AVX,
this extload is legal:
  v4i32 load, zext from v4i8
but this one isn't:
  v4i64 load, zext from v4i8
even though v4i64 is (arguably) a legal type, even without AVX2.
Note that the same thing was done a while ago for truncstores (r46140),
but I assume no one needed it yet for extloads, so here we go.
Calls to getLoadExtAction were changed to add the value type, found
manually in the surrounding code.
Calls to setLoadExtAction were mechanically changed, by wrapping the
call in a loop, to match previous behavior. The loop iterates over
the MVT subrange corresponding to the memory type (FP vectors, etc...).
I also pulled neighboring setTruncStoreActions into some of the loops;
those shouldn't make a difference, as the additional types are illegal.
(e.g., i128->i1 truncstores on PPC.)
No functional change intended.
Differential Revision: http://reviews.llvm.org/D6532
llvm-svn: 225421
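A sketch of the mechanical rewrite described above, using the post-change
TargetLowering-style names; this is a fragment as it might appear in a
target constructor, not a complete program:

  // Before: the action was keyed on the memory type alone.
  //   setLoadExtAction(ISD::ZEXTLOAD, MVT::v4i8, Expand);
  // After: it is keyed on (result type, memory type), and the old calls
  // are wrapped in a loop over the matching MVT subrange to preserve the
  // previous behavior.
  for (MVT VT : MVT::integer_vector_valuetypes())
    setLoadExtAction(ISD::ZEXTLOAD, VT, MVT::v4i8, Expand);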
live-ins.
llvm-svn: 225419
llvm-svn: 225410
Folding the same immediate into multiple instructions will increase
program size, which can hurt performance.
llvm-svn: 225405
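A plausible shape for such a size heuristic, as a hypothetical fragment
(not the actual SIFoldOperands logic):

  // Hypothetical guard: folding a 32-bit literal into N users encodes the
  // literal N times, so prefer to fold only when the defining move has a
  // single use; otherwise keep the s_mov and reuse its register.
  bool shouldFoldImmediate(const MachineRegisterInfo &MRI, unsigned DefReg,
                           bool IsLiteral) {
    return !IsLiteral || MRI.hasOneUse(DefReg);
  }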
A few loops do trickier things than just iterating on an MVT subset,
so I'll leave them be for now.
Follow-up of r225387.
llvm-svn: 225392
Use VGPR_32 register class instead. These two register classes were
identical and having separate classes was causing
SIInstrInfo::isLegalOperands() to be overly conservative in some cases.
This change is necessary to prevent future patches from missing a folding
opportunity in fneg-fabs.ll.
llvm-svn: 225382
[USR] maintains existing functionality for old instructions without encodings.
llvm-svn: 225377
llvm-svn: 225374
This is used to simplify the SIFoldOperands pass and make it easier to
fold immediates.
llvm-svn: 225373
llvm-svn: 225372
llvm-svn: 225371
This allows folding of sequences like:
s[0:1] = s_mov_b64 4
v_add_i32 v0, s0, v0
v_addc_u32 v1, s1, v1
into
v_add_i32 v0, 4, v0
v_add_i32 v1, 0, v1
llvm-svn: 225369
llvm-svn: 225367
LLVM emits stack probes on Windows targets to ensure that the stack is
correctly accessed. However, the amount of stack allocated before
emitting such a probe is hardcoded to 4096.
It is desirable to have this be configurable so that a function might
opt out of stack probes. Our level of granularity is the function
level, instead of, say, the module level, to permit proper generation of
code after LTO.
Patch by Andrew H!
N.B. The inliner needs to be updated to properly consider what happens
after inlining a function with a specific stack-probe-size into another
function with a different stack-probe-size.
llvm-svn: 225360
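The knob surfaces as the "stack-probe-size" string function attribute; a
sketch of how a backend might read it, defaulting to the old hardcoded
value (the helper name is hypothetical, and this is a fragment, not the
actual patch code):

  // Read the per-function probe size, falling back to 4096 bytes.
  unsigned getStackProbeSize(const Function &F) {
    unsigned Size = 4096;
    if (F.hasFnAttribute("stack-probe-size"))
      F.getFnAttribute("stack-probe-size")
          .getValueAsString()
          .getAsInteger(0, Size);
    return Size;
  }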
This will make a future patch much less intrusive.
llvm-svn: 225358
For code like:
float foo(float x) { return copysign(1.0, x); }
We used to generate:
andps <-0.000000e+00,0,0,0>, %xmm0
movss <1.000000e+00>, %xmm1
andps <nan>, %xmm1
orps %xmm0, %xmm1
Basically doing an abs(1.0f) in the two middle instructions.
We now generate:
andps <-0.000000e+00,0,0,0>, %xmm0
orps <1.000000e+00,0,0,0>, %xmm0
Builds on cleanups r223415, r223542.
rdar://19049548
Differential Revision: http://reviews.llvm.org/D6555
llvm-svn: 225357
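The identity behind the improved sequence can be checked in isolation; a
self-contained C++ sketch (mine, not from the patch) verifying that
copysign(1.0f, x) is just the sign bit of x OR'd into the bit pattern of
1.0f:

  #include <cassert>
  #include <cmath>
  #include <cstdint>
  #include <cstring>

  // The exponent/mantissa bits of 1.0f (0x3f800000) don't overlap the
  // sign bit, so an AND with the sign mask plus an OR with 1.0f suffices.
  float copysignOne(float X) {
    uint32_t Bits;
    std::memcpy(&Bits, &X, sizeof(Bits));
    uint32_t Result = (Bits & 0x80000000u)  // andps with the sign mask
                    | 0x3f800000u;          // orps with the bits of 1.0f
    float F;
    std::memcpy(&F, &Result, sizeof(F));
    return F;
  }

  int main() {
    for (float X : {1.5f, -2.25f, 0.0f, -0.0f})
      assert(copysignOne(X) == std::copysign(1.0f, X));
  }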
The change in r225266 was reviewed under D6722, but the commit has a
typo, causing some MCHammer failures. This patch fixes it.
Change-Id: I573efcff25003af7478ac02548ebbe929fc7f5fd
llvm-svn: 225347
statement on the same variable. There was no additional code in the default case, so this should be no functional change.
llvm-svn: 225345
There is no handling for them.
llvm-svn: 225344
llvm-svn: 225343
Even though gcc produces similar instructions, as Owen pointed out, the
two patterns aren't equivalent in the case where the original
subtraction could have caused an overflow.
Reverting the same.
llvm-svn: 225341
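The reverted fold isn't named here, but the classic pitfall of this kind
is rewriting (a - b) < 0 as a < b; assuming that is the pattern in
question, a self-contained C++ demonstration of the overflow case:

  #include <cstdint>
  #include <cstdio>

  int main() {
    int32_t A = INT32_MIN, B = 1;
    // With two's-complement wraparound, A - B wraps to INT32_MAX, which
    // is >= 0, so "(A - B) < 0" and "A < B" disagree on overflow.
    int32_t Diff = (int32_t)((uint32_t)A - (uint32_t)B);
    std::printf("(A-B)<0: %d   A<B: %d\n", Diff < 0, A < B);  // 0 vs 1
  }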
llvm-svn: 225331
Remove the README.txt entry regarding register allocation of CR logical ops,
and replace it with a FIXME in PPCInstrInfo.td. The text in the README.txt was
not really accurate, and thanks go to Pat Haugen (and Bill Schmidt) from
IBM for clarifying what was intended and for highlighting the relevant
text in the ISA specification.
llvm-svn: 225325
This is affecting the behavior of some ObjC++ / AArch64 test cases on Darwin.
Reverting to get the bots green while I track down the source of the changed
behavior.
llvm-svn: 225311
llvm-svn: 225310
llvm-svn: 225307
llvm-svn: 225306
llvm-svn: 225305
int->fp conversions on PPC must be done through memory loads and stores. On a
modern core, this process begins by storing the int value to memory, then
loading it using a (sometimes special) FP load instruction. Unfortunately, we
would do this even when the value to be converted was itself a load, and we can
just use that same memory location instead of copying it to another first.
There is a slight complication when handling int_to_fp(fp_to_int(x)) pairs,
because the fp_to_int operand has not been lowered when the int_to_fp is being
lowered. We handle this specially by invoking fp_to_int's lowering logic
(partially) and getting the necessary memory location (some trivial refactoring
was done to make this possible).
This is all somewhat ugly, and it would be nice if some later CodeGen stage
could just clean this stuff up, but because doing so would involve modifying
target-specific nodes (or instructions), it is not immediately clear how that
would work.
Also, remove a related entry from the README.txt for which we now generate
reasonable code.
llvm-svn: 225301
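As an illustration (my example, not from the commit), a minimal C++
function where the conversion source is already a load, so the
store/reload shuffle through a fresh stack slot is avoidable:

  // The i32 value already lives in memory, so the FP conversion can load
  // it from that location directly rather than first storing the loaded
  // value to a new stack slot and reloading it.
  double intToDouble(const int *P) {
    return (double)*P;  // int->fp whose operand is itself a load
  }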
llvm-svn: 225291
This ensures that all memory operations are complete when all threads
reach the barrier.
llvm-svn: 225290
In DS write instructions, the address operand comes before the value
operand(s) which is reversed from every other instruction type.
The SIInsertWaits pass assumed that the first use for each instruction
was the value, so for DS writes it was protecting the address
operand with s_waitcnt instructions when it should have been
protecting the value operand.
llvm-svn: 225289
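A sketch of a more robust way to find the value operand, looking it up by
name rather than by position; the operand name and helper below are
assumptions, not the actual SIInsertWaits fix:

  // Hypothetical fragment: locate a DS write's stored value by operand
  // name instead of assuming the first use operand is the value (the
  // operand name "data0" is an assumption about the DS definitions).
  int DataIdx = AMDGPU::getNamedOperandIdx(MI.getOpcode(),
                                           AMDGPU::OpName::data0);
  if (DataIdx != -1)
    insertWaitBefore(MI, MI.getOperand(DataIdx));  // hypothetical helper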
dcfetch.
llvm-svn: 225283
llvm-svn: 225279
This is equivalent to the AMDGPUTargetMachine now, but it is the
starting point for separating R600 and GCN functionality into separate
targets.
It is recommended that users start using the gcn triple for GCN-based
GPUs, because using the r600 triple for these GPUs will be deprecated in
the future.
llvm-svn: 225277
The dump was dependent on a feature string, which meant that it couldn't
be disabled or enabled on a per-compile basis.
llvm-svn: 225275
Uses [GP] maintains existing behavior.
llvm-svn: 225270
llvm-svn: 225267
No functional changes. Support for ARM's modified immediate syntax was added
in r223113 and r223115 (review: D6408). That patch introduced the mod_imm*
tblgen definitions, which render the existing so_imm* definitions redundant.
This patch gets rid of them completely.
Reviewed as: D6722
llvm-svn: 225266
Requires new AsmParserOperand types that detect 16-bit and 32/64-bit mode so that we choose the right instruction based on default sizing without predicates. This is necessary since predicates mess up the disassembler table building.
llvm-svn: 225256
non-64-bit mode. Convert to the 1-byte form in non-64-bit mode as part of MCInst lowering.
Overall this seems simpler. It reduces duplication of patterns between both modes and it simplifies the memory folding/unfolding tables as they don't need to create fake instructions just to keep track of 64-bitness.
llvm-svn: 225252
Because of how Clang represents structs as arrays (at least on non-Darwin
platforms), and what SROA does, etc. this is no longer a problem.
llvm-svn: 225251
"ELF Handling for Thread-Local Storage" specifies that R_X86_64_GOTTPOFF
relocation target a movq or addq instruction.
Prohibit the truncation of such loads to movl or addl.
This fixes PR22083.
Differential Revision: http://reviews.llvm.org/D6839
llvm-svn: 225250
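For context, a sketch of the standard initial-exec TLS sequence that
carries this relocation; the C++ source is illustrative and the assembly
shown is the usual x86-64 pattern:

  // Initial-exec TLS access. The GOT load carrying R_X86_64_GOTTPOFF must
  // stay a full movq; shrinking it to movl would break the relocation:
  //
  //   movq x@GOTTPOFF(%rip), %rax   # load TP-relative offset from the GOT
  //   movl %fs:(%rax), %eax         # read the per-thread value
  static thread_local int x;
  int readX() { return x; }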
These are used for debugging output; NFC.
llvm-svn: 225249
The old target DAG combine that allowed int_to_fp(fp_to_int(x)) to be
performed without a load/store pair is updated here to support unsigned
integers, and to handle single-precision values without a third rounding
step, on newer cores with the appropriate instructions.
llvm-svn: 225248
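For illustration (not from the commit), a minimal C++ function containing
the int_to_fp(fp_to_int(x)) pair this combine targets:

  // Rounds toward zero via an fp->int->fp round trip; with this combine
  // the PPC backend can keep the intermediate value in an FP register
  // (fctidz-style convert, then fcfid back) instead of bouncing it
  // through a stack slot.
  double truncToInteger(double X) {
    return (double)(long long)X;  // int_to_fp(fp_to_int(X))
  }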