| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
memory intrinsics in the SDAG builder.
When alignment is zero, the lang ref says that *no* alignment
assumptions can be made. This is the exact opposite of the internal API
contracts of the DAG where alignment 0 indicates that the alignment can
be made to be anything desired.
There is another, more explicit alignment that is better suited for the
role of "no alignment at all": an alignment of 1. Map the intrinsic
alignment to this early so that we don't end up generating aligned DAGs.
It is really terrifying that we've never seen this before, but we
suddenly started generating a large number of alignment 0 memcpys due to
the new code to do memcpy-based copying of POD class members. That patch
contains a bug that rounds bitfield alignments down when they are the
first field. This can in turn produce zero alignments.
This fixes weird crashes I've seen in library users of LLVM on 32-bit
hosts, etc.
llvm-svn: 176022
|
|
|
|
|
|
| |
Fix PR15239.
llvm-svn: 175985
|
|
|
|
|
|
| |
Fixes PR15115.
llvm-svn: 175962
|
|
|
|
|
|
| |
macros.The rest is some small misc. stuff.
llvm-svn: 175950
|
|
|
|
| |
llvm-svn: 175920
|
|
|
|
| |
llvm-svn: 175914
|
|
|
|
|
|
| |
under coldcc
llvm-svn: 175911
|
|
|
|
| |
llvm-svn: 175905
|
|
|
|
|
|
|
|
|
| |
instructions.
The Printer will now print instructions with the correct alignment specifier syntax, like
vld1.8 {d16}, [r0:64]
llvm-svn: 175884
|
|
|
|
| |
llvm-svn: 175862
|
|
|
|
|
|
|
|
|
|
|
|
| |
It was incorrectly checking a Function* being an IntrinsicInst* which
isn't possible. It should always have been checking the CallInst* instead.
Added test case for x86 which ensures we only get one constant load.
It was 2 before this change.
rdar://problem/13267920
llvm-svn: 175853
|
|
|
|
| |
llvm-svn: 175783
|
|
|
|
|
|
|
|
|
|
| |
This fixes some problems with too conservative checking where we were
marking all aliases of a register as used, and then also checking all
aliases when allocating a register.
<rdar://problem/13249625>
llvm-svn: 175782
|
|
|
|
|
|
|
|
|
|
|
| |
Large code model is identical to medium code model except that the
addis/addi sequence for "local" accesses is never used. All accesses
use the addis/ld sequence.
The coding changes are straightforward; most of the patch is taken up
with creating variants of the medium model tests for large model.
llvm-svn: 175767
|
|
|
|
|
|
|
|
| |
A legal BUILD_VECTOR goes in and gets constant folded into another legal
BUILD_VECTOR so we don't lose any legality here. The problematic PPC
optimization that made this check necessary was fixed recently.
llvm-svn: 175759
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes for-loop.cl piglit test
Patch By: Vincent Lejeune
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
NOTE: This is a candidate for the Mesa stable branch.
llvm-svn: 175742
|
|
|
|
|
|
|
| |
NOTE: This is a candidate for the Mesa stable branch.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 175733
|
|
|
|
|
|
|
| |
there were inline br .+4 instructions. Soon everything can enjoy the
full instruction scheduling experience.
llvm-svn: 175718
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch implements the PPCDAGToDAGISel::PostprocessISelDAG virtual
method to perform post-selection peephole optimizations on the DAG
representation.
One optimization is implemented here: folds to clean up complex
addressing expressions for thread-local storage and medium code
model. It will also be useful for large code model sequences when
those are added later. I originally thought about doing this on the
MI representation prior to register assignment, but it's difficult to
do effective global dead code elimination at that point. DCE is
trivial on the DAG representation.
A typical example of a candidate code sequence in assembly:
addis 3, 2, globalvar@toc@ha
addi 3, 3, globalvar@toc@l
lwz 5, 0(3)
When the final instruction is a load or store with an immediate offset
of zero, the offset from the add-immediate can replace the zero,
provided the relocation information is carried along:
addis 3, 2, globalvar@toc@ha
lwz 5, globalvar@toc@l(3)
Since the addi can in general have multiple uses, we need to only
delete the instruction when the last use is removed.
llvm-svn: 175697
|
|
|
|
| |
llvm-svn: 175683
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(2xi32) (truncate ((2xi64) bitcast (buildvector i32 a, i32 x, i32 b, i32 y)))
can be folded into a (2xi32) (buildvector i32 a, i32 b).
Such a DAG would cause uneccessary vdup instructions followed by vmovn
instructions.
We generate this code on ARM NEON for a setcc olt, 2xf64, 2xf64. For example, in
the vectorized version of the code below.
double A[N];
double B[N];
void test_double_compare_to_double() {
int i;
for(i=0;i<N;i++)
A[i] = (double)(A[i] < B[i]);
}
radar://13191881
Fixes bug 15283.
llvm-svn: 175670
|
|
|
|
|
|
|
|
| |
This handles the cases where the 6-bit splat element is odd, converting
to a three-instruction sequence to add or subtract two splats. With this
fix, the XFAIL in test/CodeGen/PowerPC/vec_constants.ll is removed.
llvm-svn: 175663
|
|
|
|
|
|
|
|
|
| |
- When extloading from a vector with non-byte-addressable element, e.g.
<4 x i1>, the current logic breaks. Extend the current logic to
fix the case where the element type is not byte-addressable by loading
all bytes, bit-extracting/packing each element.
llvm-svn: 175642
|
|
|
|
|
|
|
|
| |
The PPC backend doesn't handle these correctly. This patch uses logic
similar to that in the X86 and ARM backends to track these arguments
properly.
llvm-svn: 175635
|
|
|
|
|
|
|
|
| |
Add HexagonMCInst class which adds various Hexagon VLIW annotations.
In addition, this class also includes some APIs related to the
constant extenders.
llvm-svn: 175634
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
During lowering of a BUILD_VECTOR, we look for opportunities to use a
vector splat. When the splatted value fits in 5 signed bits, a single
splat does the job. When it doesn't fit in 5 bits but does fit in 6,
and is an even value, we can splat on half the value and add the result
to itself.
This last optimization hasn't been working recently because of improved
constant folding. To circumvent this, create a pseudo VADD_SPLAT that
can be expanded during instruction selection.
llvm-svn: 175632
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
sext <4 x i1> to <4 x i64>
sext <4 x i8> to <4 x i64>
sext <4 x i16> to <4 x i64>
I'm running Combine on SIGN_EXTEND_IN_REG and revert SEXT patterns:
(sext_in_reg (v4i64 anyext (v4i32 x )), ExtraVT) -> (v4i64 sext (v4i32 sext_in_reg (v4i32 x , ExtraVT)))
The sext_in_reg (v4i32 x) may be lowered to shl+sar operations.
The "sar" does not exist on 64-bit operation, so lowering sext_in_reg (v4i64 x) has no vector solution.
I also added a cost of this operations to the AVX costs table.
llvm-svn: 175619
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It is possible that frame pointer is not found in the
callee saved info, thus FramePtrSpillFI may be incorrect
if we don't check the result of hasFP(MF).
Besides, if we enable the stack coloring algorithm, there
will be an assertion to ensure the slot is live. But in
the test case, %var1 is not live in the prologue of the
function, and we will get the assertion failure.
Note: There is similar code in ARMFrameLowering.cpp.
llvm-svn: 175616
|
|
|
|
|
|
|
|
| |
SltCCRxRy16, SltiCCRxImmX16, SltiuCCRxImmX16, SltuCCRxRy16
$T8 shows up as register $24 when emitted from C++ code so we had
to change some tests that were already there for this functionality.
llvm-svn: 175593
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
MS-style inline assembly.
This is a follow-on to r175334. Forcing a FP to be emitted doesn't ensure it
will be used. Therefore, force the base pointer as well. We now treat MS
inline assembly in the same way we treat functions with dynamic stack
realignment and VLAs. This guarantees the BP will be used to reference
parameters and locals.
rdar://13218191
llvm-svn: 175576
|
|
|
|
|
|
|
|
|
| |
When creating an allocation hint for a register pair, make sure the hint
for the physical register reference is still in the allocation order.
rdar://13240556
llvm-svn: 175541
|
|
|
|
| |
llvm-svn: 175530
|
|
|
|
|
|
|
|
|
|
|
| |
Due to the execution order of doFinalization functions, the GC information were
deleted before AsmPrinter::doFinalization was executed. Thus, the
GCMetadataPrinter::finishAssembly was never called.
The patch fixes that by moving the code of the GCInfoDeleter::doFinalization to
Printer::doFinalization.
llvm-svn: 175528
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A vectorized sitfp on doubles will get scalarized to a sequence of an
extract_element of <2 x i32>, a bitcast to f32 and a sitofp.
Due to the the extract_element, and the bitcast we will uneccessarily generate
moves between scalar and vector registers.
The patch fixes this by using a COPY_TO_REGCLASS and a EXTRACT_SUBREG to extract
the element from the vector instead.
radar://13191881
llvm-svn: 175520
|
|
|
|
|
|
| |
BtnezT8SltiX16, BtnezT8SltiuX16 .
llvm-svn: 175486
|
|
|
|
| |
llvm-svn: 175474
|
|
|
|
| |
llvm-svn: 175460
|
|
|
|
|
|
|
|
| |
If the memcpy has an odd length with an alignment of 2, this would incorrectly
assert on the last 1 byte copy.
rdar://13202135
llvm-svn: 175459
|
|
|
|
| |
llvm-svn: 175457
|
|
|
|
| |
llvm-svn: 175446
|
|
|
|
| |
llvm-svn: 175420
|
|
|
|
| |
llvm-svn: 175417
|
|
|
|
| |
llvm-svn: 175416
|
|
|
|
|
|
|
|
|
|
| |
This expansion will be moved to expandISelPseudos as soon as I can figure
out how to do that. There are other instructions which use this
ExpandFEXT_T8I816_ins and as soon as I have finished expanding them all,
I will delete the macro asm string text so it has no way to be used
in the future.
llvm-svn: 175413
|
|
|
|
| |
llvm-svn: 175401
|
|
|
|
|
|
| |
Also fix one test by changing "vpermilps" to "vpshufd".
llvm-svn: 175357
|
|
|
|
|
|
| |
new features.
llvm-svn: 175336
|
|
|
|
|
|
|
|
|
|
|
| |
If the frame pointer is omitted, and any stack changes occur in the inline
assembly, e.g.: "pusha", then any C local variable or C argument references
will be incorrect.
I pass no judgement on anyone who would do such a thing. ;)
rdar://13218191
llvm-svn: 175334
|
|
|
|
| |
llvm-svn: 175322
|
|
|
|
|
|
|
| |
When we're recalculating the feature set of the subtarget, we need to have the
ivars in their initial state.
llvm-svn: 175320
|