|  | Commit message | Author | Age | Files | Lines | 
|---|
| | FFLOOR of v4f32 to Expand for ARM. v2f64 was already correct.
llvm-svn: 163458 | 
| | The 'select' transformations apply to all ARM architectures and don't
require hasV6T2Ops.
llvm-svn: 163396 | 
| | llvm-svn: 163306 | 
| | If we have a BUILD_VECTOR that is mostly a constant splat, it is often better to splat that constant and then insertelement the non-constant lanes, instead of insertelementing every lane from an undef base.
llvm-svn: 163304 | 
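The splat-then-insert idea from the commit above can be sketched as a small planner over abstract operations. This is only an illustration, not the LLVM code: the function name `plan_build_vector`, the tuple encoding of operations, and the "more than half the lanes" threshold are all assumptions made for the example.

```python
from collections import Counter

def plan_build_vector(lanes):
    """Return a list of abstract ops that materialize `lanes`.

    Constant lanes are ints; non-constant lanes are strings naming a value.
    """
    constants = [lane for lane in lanes if isinstance(lane, int)]
    # Baseline: insert every lane, one by one, into an undef vector.
    naive = [("insertelement", i, lane) for i, lane in enumerate(lanes)]
    if not constants:
        return naive
    value, count = Counter(constants).most_common(1)[0]
    # Hypothetical threshold: only splat when one constant fills most lanes.
    if count * 2 <= len(lanes):
        return naive
    # Splat the dominant constant, then patch the lanes that differ from it.
    ops = [("splat", value)]
    ops += [("insertelement", i, lane)
            for i, lane in enumerate(lanes) if lane != value]
    return ops
```

For a four-lane vector with three equal constants, the plan is one splat plus one insert instead of four inserts.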
| | This patch corrects the definition of umlal/smlal instructions and adds support
for matching them to the ARM dag combiner.
Bug 12213
Patch by Yin Ma!
llvm-svn: 163136 | 
| | Thumb2 instructions are mostly constrained to rGPR, not tGPR which is
for Thumb1.
rdar://problem/12203728
llvm-svn: 162968 | 
| | The test case ARM/2011-05-04-MultipleLandingPadSuccs.ll was creating
duplicate successor list entries.
llvm-svn: 162222 | 
| | These nodes are no longer needed because the peephole pass can fold
CMOV+AND into ANDCC etc.
llvm-svn: 162179 | 
| | This turns common i1 patterns into predicated instructions:
  (add (zext cc), x) -> (select cc (add x, 1), x)
  (add (sext cc), x) -> (select cc (add x, -1), x)
For a function like:
  unsigned f(unsigned s, int x) {
    return s + (x>0);
  }
We now produce:
  cmp r1, #0
  it  gt
  addgt.w r0, r0, #1
Instead of:
  movs  r2, #0
  cmp r1, #0
  it  gt
  movgt r2, #1
  add r0, r2
llvm-svn: 162177 | 
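The two rewrites in the commit above rely on a simple arithmetic identity, which can be checked with a minimal 32-bit model. The helper names (`add_zext`, `add_sext`, `select_add`) are made up for this sketch, and inputs are assumed to be unsigned 32-bit values; the model says nothing about the instruction selection itself.

```python
MASK32 = 0xFFFFFFFF

def add_zext(cc, x):
    # (add (zext cc), x): the i1 is widened to 0 or 1, then added.
    return (x + (1 if cc else 0)) & MASK32

def add_sext(cc, x):
    # (add (sext cc), x): a true i1 sign-extends to all ones, i.e. -1.
    return (x + (MASK32 if cc else 0)) & MASK32

def select_add(cc, x, delta):
    # (select cc (add x, delta), x): the add only happens when cc holds.
    return (x + delta) & MASK32 if cc else x

def f(s, x):
    # Python rendering of the C example: return s + (x > 0);
    return select_add(x > 0, s, 1)
```

In the select form the addition is the only operation guarded by the condition, which corresponds to the single predicated `addgt.w` in the listing above.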
| | Add these transformations to the existing add/sub ones:
  (and (select cc, -1, c), x) -> (select cc, x, (and x, c))
  (or  (select cc, 0, c), x)  -> (select cc, x, (or x, c))
  (xor (select cc, 0, c), x)  -> (select cc, x, (xor x, c))
The selects can then be transformed to a single predicated instruction
by peephole.
This transformation will make it possible to eliminate the ISD::CAND,
COR, and CXOR custom DAG nodes.
llvm-svn: 162176 | 
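Each of the three rewrites above works because the constant chosen on the true side of the select is the identity element of the operation (-1 for and, 0 for or and xor). A minimal sketch that checks this on 32-bit values follows; the `OPS` table and the `before`/`after` helpers are names invented for the example.

```python
import operator

MASK32 = 0xFFFFFFFF

# Each entry: (bitwise op, identity value that leaves x unchanged).
OPS = {
    "and": (operator.and_, MASK32),  # x & -1 == x
    "or": (operator.or_, 0),         # x | 0  == x
    "xor": (operator.xor, 0),        # x ^ 0  == x
}

def before(name, cc, c, x):
    # (op (select cc, identity, c), x)
    op, identity = OPS[name]
    return op(identity if cc else c, x)

def after(name, cc, c, x):
    # (select cc, x, (op x, c))
    op, _ = OPS[name]
    return x if cc else op(x, c)
```

Both forms agree for every condition and operand pair, so the bitwise operation only needs to execute when the condition is false.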
| | llvm-svn: 162107 | 
| | Use the target independent select analysis hooks.
llvm-svn: 162060 | 
| | The ARM select instructions are just predicated moves. If the select is
the only use of an operand, the instruction defining the operand can be
predicated instead, saving one instruction and decreasing register
pressure.
This implementation can turn AND/ORR/EOR instructions into their
corresponding ANDCC/ORRCC/EORCC variants. Ideally, we should be able to
predicate any instruction, but we don't yet support predicated
instructions in SSA form.
llvm-svn: 161994 | 
| | unaligned access. rdar://12091029
llvm-svn: 161962 | 
| | ISDNode has more than one user.
rdar://11876519
llvm-svn: 161775 | 
| | architecture
It broke MultiSource/Applications/JM/ldecod/ldecod on armv7 thumb O0 g and armv7
thumb O3.
llvm-svn: 161736 | 
| | getSimpleVT can be removed.
llvm-svn: 161735 | 
| | This patch corrects the definition of umlal/smlal instructions and adds support
for matching them to the ARM dag combiner.
Bug 12213
Patch by Yin Ma!
llvm-svn: 161581 | 
| | Fast isel doesn't currently have support for translating builtin function
calls to target instructions.  For embedded environments where the library
functions are not available, this is a matter of correctness and not
just optimization.  Most of this patch is just arranging to make the
TargetLibraryInfo available in fast isel.  <rdar://problem/12008746>
llvm-svn: 161232 | 
| | but somehow managed to be dropped later.
Patch by Karel Gardas.
llvm-svn: 161226 | 
| | Before accessing a node as a ConstantSDNode, make sure it actually is one.
No testcase of non-trivial size.
rdar://11948669
llvm-svn: 160735 | 
| | Based on Evan's suggestion without a committable test.
llvm-svn: 160441 | 
| | llvm-svn: 160440 | 
| | This change is to be enabled in clang.
rdar://9877866
llvm-svn: 158684 | 
| | This patch will optimize abs(x-y)
FROM
sub, movs, rsbmi
TO
subs, rsbmi
For abs, we will use cmp instead of movs. This is necessary because we already
have an existing peephole pass which optimizes away cmp following sub.
rdar: 11633193
llvm-svn: 158551 | 
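The shortened `subs, rsbmi` sequence from the commit above can be modeled in a few lines. This is a rough sketch under stated assumptions: values are treated as 32-bit words, only the N flag is tracked, and the helper names `subs` and `abs_diff` are invented for the example.

```python
MASK32 = 0xFFFFFFFF

def subs(x, y):
    """Model SUBS: return (x - y) mod 2**32 and the N flag (bit 31)."""
    d = (x - y) & MASK32
    return d, bool(d >> 31)

def abs_diff(x, y):
    d, negative = subs(x, y)
    if negative:
        # RSBMI d, d, #0 runs only when N is set, negating the difference.
        d = (0 - d) & MASK32
    return d
```

Because the subtraction already sets the flag that the conditional negate tests, no separate flag-setting instruction is needed between the two.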
| | We turned off the CMN instruction because it had semantics which we weren't
getting correct. If we are comparing with an immediate, then it's okay to use
the CMN instruction.
<rdar://problem/7569620>
llvm-svn: 158302 | 
| | Factor out the expansion code into a function.
This change is to be enabled in clang.
rdar://9877866
llvm-svn: 157830 | 
| | We handle struct byval by inserting a pseudo op, which will be expanded to a
loop at ExpandISelPseudos.
A separate patch for clang will be submitted to enable struct byval.
rdar://9877866
llvm-svn: 157793 | 
| | to pass around a struct instead of a large set of individual values.  This
cleans up the interface and allows more information to be added to the struct
for future targets without requiring changes to each and every target.
NV_CONTRIB
llvm-svn: 157479 | 
| | llvm-svn: 157152 | 
| | This will be used to determine whether it's profitable to turn a select into a
branch when the branch is likely to be predicted.
Currently enabled for everything but Atom on X86 and Cortex-A9 devices on ARM.
I'm not entirely happy with the name of this flag, suggestions welcome ;)
llvm-svn: 156233 | 
| | llvm-svn: 156189 | 
| | This moves the logic for selecting a TLS model to a single place,
instead of the previous three (ARM, Mips, and X86 which already
uses this function).
llvm-svn: 156162 | 
| | ARM BUILD_VECTORs created after type legalization cannot use i8 or i16
operands, since those types are not legal.  Instead use i32 operands, which
will be implicitly truncated by the BUILD_VECTOR to match the element type.
llvm-svn: 155824 | 
| | since they are equivalent.
llvm-svn: 155188 | 
| | llvm-svn: 154439 | 
| | legalizer always use the DAG entry node. This is wrong when the libcall is
emitted as a tail call since it effectively folds the return node. If
the return node's input chain is not the entry (i.e. call, load, or store)
use that as the tail call input chain.
PR12419
rdar://9770785
rdar://11195178
llvm-svn: 154370 | 
| | in-register, such that we can use a single vector store rather than a
series of scalar stores.
For func_4_8 the generated code
	vldr	d16, LCPI0_0
	vmov	d17, r0, r1
	vadd.i16	d16, d17, d16
	vmov.u16	r0, d16[3]
	strb	r0, [r2, #3]
	vmov.u16	r0, d16[2]
	strb	r0, [r2, #2]
	vmov.u16	r0, d16[1]
	strb	r0, [r2, #1]
	vmov.u16	r0, d16[0]
	strb	r0, [r2]
	bx	lr
becomes
	vldr	d16, LCPI0_0
	vmov	d17, r0, r1
	vadd.i16	d16, d17, d16
	vuzp.8	d16, d17
	vst1.32	{d16[0]}, [r2, :32]
	bx	lr
I'm not fond of how this combine pessimizes 2012-03-13-DAGCombineBug.ll,
but I couldn't think of a way to judiciously apply this combine.
This
	ldrh	r0, [r0, #4]
	strh	r0, [r1]
becomes
	vldr	d16, [r0]
	vmov.u16	r0, d16[2]
	vmov.32	d16[0], r0
	vuzp.16	d16, d17
	vst1.32	{d16[0]}, [r1, :32]
PR11158
rdar://10703339
llvm-svn: 154340 | 
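The effect of the combine above on `func_4_8` can be illustrated with a small memory model: four lanes are narrowed to bytes and written either with one byte store per lane or with a single 32-bit store. The sketch assumes four lanes and a little-endian layout, and the function names are hypothetical; it does not model `vuzp` or the alignment hint.

```python
def store_lanes_scalar(mem, addr, lanes):
    # One byte store per lane, as in the original sequence of strb instructions.
    for i, lane in enumerate(lanes):
        mem[addr + i] = lane & 0xFF

def store_lanes_packed(mem, addr, lanes):
    # Narrow the lanes in-register, then write them with one 32-bit store.
    word = 0
    for i, lane in enumerate(lanes):
        word |= (lane & 0xFF) << (8 * i)  # lane 0 lands in the lowest byte
    mem[addr:addr + 4] = word.to_bytes(4, "little")
```

Both functions leave memory in the same state; the packed form simply replaces four stores with one.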
| | llvm-svn: 154336 | 
| | llvm-svn: 154226 | 
| | which exists for this purpose.
llvm-svn: 154199 | 
| | ARM and Thumb2 mode can use cmn instructions to compare against negative
immediates. Thumb1 mode can't.
llvm-svn: 154183 | 
| | This allows us to keep passing reduced masks to SimplifyDemandedBits, but
know about all the bits if SimplifyDemandedBits fails. This allows instcombine
to simplify cases like the one in the included testcase.
llvm-svn: 154011 | 
| | tailcall opt. rdar://11140249
llvm-svn: 153717 | 
| | vmov.f32.
llvm-svn: 153696 | 
| | llvm-svn: 153500 | 
| | llvm-svn: 153422 | 
| | llvm-svn: 153421 | 
| | Patch by Weiming Zhao!
This fixes PR12212
llvm-svn: 153049 | 
| | llvm-svn: 152978 |