|  | Commit message | Author | Age | Files | Lines | 
|---|
| ... |  | 
| | 
| 
| 
| | llvm-svn: 162186 | 
| | 
| 
| 
| 
| 
| 
| | These nodes are no longer needed because the peephole pass can fold
CMOV+AND into ANDCC etc.
llvm-svn: 162179 | 
| | 
| 
| 
| 
| 
| | class, but the base class methods aren't virtual so it just increased call overhead.
llvm-svn: 162178 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This turns common i1 patterns into predicated instructions:
  (add (zext cc), x) -> (select cc (add x, 1), x)
  (add (sext cc), x) -> (select cc (add x, -1), x)
For a function like:
  unsigned f(unsigned s, int x) {
    return s + (x>0);
  }
We now produce:
  cmp r1, #0
  it  gt
  addgt.w r0, r0, #1
Instead of:
  movs  r2, #0
  cmp r1, #0
  it  gt
  movgt r2, #1
  add r0, r2
llvm-svn: 162177 | 
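The two rewrites above can be checked with a small value-level model. This is only a sketch: the function names are hypothetical and it models the DAG nodes as plain 32-bit arithmetic, not the actual combine code.

```cpp
#include <cstdint>

// (add (zext cc), x): the i1 is materialized as 0 or 1, then added.
constexpr uint32_t add_zext(bool cc, uint32_t x) {
  return x + (cc ? 1u : 0u);
}

// (select cc (add x, 1), x): one conditional add, no materialized 0/1.
constexpr uint32_t select_add_one(bool cc, uint32_t x) {
  return cc ? x + 1u : x;
}

// (add (sext cc), x): a true i1 sign-extends to all ones, i.e. -1.
constexpr uint32_t add_sext(bool cc, uint32_t x) {
  return x + (cc ? 0xFFFFFFFFu : 0u);
}

// (select cc (add x, -1), x)
constexpr uint32_t select_add_minus_one(bool cc, uint32_t x) {
  return cc ? x - 1u : x;
}
```

Both forms agree for every input, including wrap-around, which is why the select form can then be emitted as a single predicated add.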
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Add these transformations to the existing add/sub ones:
  (and (select cc, -1, c), x) -> (select cc, x, (and x, c))
  (or  (select cc, 0, c), x)  -> (select cc, x, (or x, c))
  (xor (select cc, 0, c), x)  -> (select cc, x, (xor x, c))
The selects can then be transformed to a single predicated instruction
by peephole.
This transformation will make it possible to eliminate the ISD::CAND,
COR, and CXOR custom DAG nodes.
llvm-svn: 162176 | 
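These rewrites rely on -1 being the identity for AND and 0 being the identity for OR and XOR. A minimal value-level sketch, with hypothetical function names and 32-bit values assumed:

```cpp
#include <cstdint>

// (and (select cc, -1, c), x)
constexpr uint32_t and_before(bool cc, uint32_t c, uint32_t x) {
  return (cc ? 0xFFFFFFFFu : c) & x;
}
// (select cc, x, (and x, c))
constexpr uint32_t and_after(bool cc, uint32_t c, uint32_t x) {
  return cc ? x : (x & c);
}

// (or (select cc, 0, c), x)
constexpr uint32_t or_before(bool cc, uint32_t c, uint32_t x) {
  return (cc ? 0u : c) | x;
}
// (select cc, x, (or x, c))
constexpr uint32_t or_after(bool cc, uint32_t c, uint32_t x) {
  return cc ? x : (x | c);
}

// (xor (select cc, 0, c), x)
constexpr uint32_t xor_before(bool cc, uint32_t c, uint32_t x) {
  return (cc ? 0u : c) ^ x;
}
// (select cc, x, (xor x, c))
constexpr uint32_t xor_after(bool cc, uint32_t c, uint32_t x) {
  return cc ? x : (x ^ c);
}
```

In the "after" forms the true arm is just x, so the select reduces to "keep x, or apply the operation under the inverted condition", which is the shape a predicated instruction can express.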
| | 
| 
| 
| 
| 
| | better compare/branch code.
llvm-svn: 162172 | 
| | 
| 
| 
| 
| 
| | Make sure the generic pattern is used.
llvm-svn: 162170 | 
| | 
| 
| 
| 
| 
| | functional change intended.
llvm-svn: 162166 | 
| | 
| 
| 
| | llvm-svn: 162165 | 
| | 
| 
| 
| | llvm-svn: 162164 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | arithmetic instructions. However, when small data types are used, a truncate
node appears between the SETCC node and the arithmetic operation. This patch
adds support for this pattern.
Before:
  xorl  %esi, %edi
  testb %dil, %dil
  setne %al
  ret
After:
  xorb  %dil, %sil
  setne %al
  ret
rdar://12081007
llvm-svn: 162160 | 
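The before/after sequences are equivalent because truncation distributes over XOR: testing the low byte of a 32-bit XOR gives the same answer as XORing the low bytes directly. A sketch of that identity follows; the source function behind the assembly is not shown in the message, so the names and signatures here are assumptions.

```cpp
#include <cstdint>

// Models the "Before" sequence: 32-bit xor, then test only the low byte.
constexpr bool wide_xor_then_test(uint32_t a, uint32_t b) {
  return static_cast<uint8_t>(a ^ b) != 0;
}

// Models the "After" sequence: 8-bit xor whose result feeds the
// condition directly, so no separate test is needed.
constexpr bool narrow_xor(uint32_t a, uint32_t b) {
  return static_cast<uint8_t>(static_cast<uint8_t>(a) ^
                              static_cast<uint8_t>(b)) != 0;
}
```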
| | 
| 
| 
| | llvm-svn: 162136 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | No new tests are added.
All tests in ExecutionEngine/MCJIT that had been failing now pass with this
patch applied (when "make check" is run on a MIPS board).
Patch by Petar Jovanovic.
llvm-svn: 162135 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | PEI can't handle the pseudo-instructions. This can be removed when the
pseudo-instructions are replaced by normal predicated instructions.
Fixes PR13628.
llvm-svn: 162130 | 
| | 
| 
| 
| 
| 
| | Patch by Vladimir Medic.
llvm-svn: 162124 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | make it more consistent with its intended semantics.
The `linker_private_weak_def_auto' linkage type was meant to automatically hide
globals which never had their addresses taken. It has nothing to do with the
`linker_private' linkage type, which outputs the symbols with a `l' (ell) prefix
among other things.
The intended semantic is more like the `linkonce_odr' linkage type.
Change the name of the linkage type to `linkonce_odr_auto_hide', and change
the semantics accordingly so that it produces the correct output for the linker.
Note: The old linkage name `linker_private_weak_def_auto' will still parse but
is not a synonym for `linkonce_odr_auto_hide'. This should be removed in 4.0.
<rdar://problem/11754934>
llvm-svn: 162114 | 
| | 
| 
| 
| | llvm-svn: 162107 | 
| | 
| 
| 
| | llvm-svn: 162094 | 
| | 
| 
| 
| | llvm-svn: 162089 | 
| | 
| 
| 
| 
| 
| | reduce to only a single call to it thus allowing it to be inlined by the compiler.
llvm-svn: 162088 | 
| | 
| 
| 
| | llvm-svn: 162086 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | It is not my plan to duplicate the entire ARM instruction set with
predicated versions. We need a way of representing predicated
instructions in SSA form without requiring a separate opcode.
Then the pseudo-instructions can go away.
llvm-svn: 162061 | 
| | 
| 
| 
| 
| 
| | Use the target independent select analysis hooks.
llvm-svn: 162060 | 
| | 
| 
| 
| | llvm-svn: 162039 | 
| | 
| 
| 
| | llvm-svn: 162037 | 
| | 
| 
| 
| | llvm-svn: 162032 | 
| | 
| 
| 
| 
| 
| 
| 
| | Without fastcc support, the caller falls through to CallingConv::C
for fastcc calls while the callee still uses fastcc. This calling-convention
mismatch is a bug, and adding fastcc support fixes it.
llvm-svn: 162013 | 
| | 
| 
| 
| | llvm-svn: 162012 | 
| | 
| 
| 
| | llvm-svn: 162010 | 
| | 
| 
| 
| 
| 
| | floats.
llvm-svn: 162008 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | The ARM select instructions are just predicated moves. If the select is
the only use of an operand, the instruction defining the operand can be
predicated instead, saving one instruction and decreasing register
pressure.
This implementation can turn AND/ORR/EOR instructions into their
corresponding ANDCC/ORRCC/EORCC variants. Ideally, we should be able to
predicate any instruction, but we don't yet support predicated
instructions in SSA form.
llvm-svn: 161994 | 
| | 
| 
| 
| 
| 
| | unaligned access. rdar://12091029
llvm-svn: 161962 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | When predicating this instruction:
  Rd = ADD Rn, Rm
We need an extra operand to represent the value given to Rd when the
predicate is false:
  Rd = ADDCC Rfalse, Rn, Rm, pred
The Rd and Rfalse operands are different registers while in SSA form.
Rfalse is tied to Rd to make sure they get the same register during
register allocation.
Previously, Rd and Rn were tied, but that is not required.
Compare to MOVCC:
  Rd = MOVCC Rfalse, Rtrue, pred
llvm-svn: 161955 | 
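The operand layout can be read as a value-level model: the predicated instruction yields the computed result when the predicate holds and Rfalse otherwise. The sketch below uses hypothetical C++ function names for the two instruction forms and is not the target description itself.

```cpp
#include <cstdint>

// Rd = MOVCC Rfalse, Rtrue, pred
constexpr uint32_t movcc(uint32_t rfalse, uint32_t rtrue, bool pred) {
  return pred ? rtrue : rfalse;
}

// Rd = ADDCC Rfalse, Rn, Rm, pred
// Rfalse is a separate input here, matching SSA form; tying it to Rd
// only matters once registers are assigned.
constexpr uint32_t addcc(uint32_t rfalse, uint32_t rn, uint32_t rm, bool pred) {
  return pred ? rn + rm : rfalse;
}
```

Under this model ADDCC is MOVCC applied to the result of an unpredicated ADD, which is why neither Rn nor Rm needs to be tied to Rd.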
| | 
| 
| 
| 
| 
| 
| 
| | reversed. This leads to wrong codegen for float-to-half conversion
intrinsics which are used to support storage-only fp16 type.
The NEON variants of the same instructions are fine.
llvm-svn: 161907 | 
| | 
| 
| 
| | llvm-svn: 161906 | 
| | 
| 
| 
| | llvm-svn: 161902 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | - FP_EXTEND only supports extending from vectors with matching elements.
  This results in the scalarization of extending to v2f64 from v2f32,
  which will be legalized to v4f32 not matching with v2f64.
- Add X86-specific VFPEXT supporting extending from v4f32 to v2f64.
- Add BUILD_VECTOR lowering helper to recover the original
  extending from v4f32 to v2f64.
- Test case is enhanced to include different vector widths.
llvm-svn: 161894 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Refactor the TableGen'erated fixed length disassembler to use a
table-driven state machine rather than a massive set of nested
switch() statements.
As a result, the ARM Disassembler (ARMDisassembler.cpp) builds much more
quickly and generates a smaller end result. For a Release+Asserts build on
a 16GB 3.4GHz i7 iMac w/ SSD:
Time to compile at -O2 (averaged w/ hot caches):
  Previous: 35.5s
  New:       8.9s
TEXT size:
  Previous: 447,251
  New:      297,661
Builds in 25% of the time previously required and generates code 66% of
the size.
Execution time of the disassembler is only slightly slower (7% when
disassembling 10 million ARM instructions: 19.6s previously vs 21.0s now). The new implementation has
not yet been tuned, however, so the performance should almost certainly
be recoverable should it become a concern.
llvm-svn: 161888 | 
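To illustrate the general idea of a table-driven decoder, here is a toy sketch. The opcode names, table layout, and 8-bit instruction encoding are all invented for this example (C++14 or later assumed) and are not the format TableGen actually emits: a flat byte table is walked by one small interpreter loop instead of compiling a nest of switch() statements per field.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical opcodes for a tiny decoder "bytecode".
enum Op : uint8_t {
  ExtractField, // operands: start bit, length; loads the current field
  FilterValue,  // operands: expected value, skip; on mismatch jump ahead
  Decode,       // operand: instruction id; decoding succeeded
  Fail          // no encoding matched
};

constexpr int kInvalid = -1;

// Toy encoding: bits [7:6] pick a group; in group 0, bits [1:0] pick
// between instruction 100 and 101; group 1 is instruction 200.
constexpr uint8_t kTable[] = {
    /* 0 */ ExtractField, 6, 2,
    /* 3 */ FilterValue, 0, 10, // group != 0: continue at index 16
    /* 6 */ ExtractField, 0, 2,
    /* 9 */ FilterValue, 0, 2,  // low bits != 0: continue at index 14
    /* 12 */ Decode, 100,
    /* 14 */ Decode, 101,
    /* 16 */ FilterValue, 1, 2, // group != 1: continue at index 21
    /* 19 */ Decode, 200,
    /* 21 */ Fail,
};

// Single interpreter loop shared by every table entry.
constexpr int decode(const uint8_t *table, uint32_t insn) {
  uint32_t field = 0;
  std::size_t pc = 0;
  while (true) {
    switch (table[pc]) {
    case ExtractField: {
      const unsigned start = table[pc + 1];
      const unsigned len = table[pc + 2];
      field = (insn >> start) & ((1u << len) - 1u);
      pc += 3;
      break;
    }
    case FilterValue: {
      const uint8_t expected = table[pc + 1];
      const uint8_t skip = table[pc + 2];
      pc += 3;
      if (field != expected)
        pc += skip; // skip the entries guarded by this filter
      break;
    }
    case Decode:
      return table[pc + 1];
    default: // Fail or an unknown opcode
      return kInvalid;
    }
  }
}

constexpr int decodeToy(uint32_t insn) { return decode(kTable, insn); }
```

Adding an encoding in this scheme grows the data table rather than the amount of code to compile, which is consistent with the compile-time and TEXT-size reductions reported above; the extra interpretive step per table entry is a plausible source of the small runtime cost.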
| | 
| 
| 
| | llvm-svn: 161860 | 
| | 
| 
| 
| 
| 
| | Reduces compiled code size a little bit.
llvm-svn: 161859 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | It never does anything when running 'make check', and it gets in the
way of updating live intervals in 2-addr.
The hook was originally added to help form IT blocks in Thumb2 code
before register allocation, but the pass ordering has changed since
then, and we run if-conversion after register allocation now.
When the MI scheduler is enabled, there will be no fewer than two
schedulers between 2-addr and Thumb2ITBlockPass, so this hook is
unlikely to help anything.
llvm-svn: 161794 | 
| | 
| 
| 
| 
| 
| 
| 
| | This change is to be enabled in clang.
rdar://9877866
llvm-svn: 161789 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | This was causing unnecessary spills/restores of callee saved registers.
Fixes PR13572.
Patch by Pranav Bhandarkar!
llvm-svn: 161778 | 
| | 
| 
| 
| 
| 
| 
| 
| | ISDNode has more than one user.
rdar://11876519
llvm-svn: 161775 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | OpTbl1 to OpTbl2 since they have 3 operands and the last operand can be changed
to a memory operand.
PR13576
llvm-svn: 161769 | 
| | 
| 
| 
| 
| 
| | Patch by Weiming Zhao.
llvm-svn: 161768 | 
| | 
| 
| 
| 
| 
| | Nehalem, Westmere and Sandy Bridge. AMD also has processor family 6.
llvm-svn: 161763 | 
| | 
| 
| 
| 
| 
| 
| 
| | Previously, we used VLD1.32 in all cases. However, both 16-bit and 64-bit
accesses are also being selected, so we need to use an appropriately sized
load in those cases.
llvm-svn: 161748 | 
| | 
| 
| 
| 
| 
| | putting a couple of if conditions in a better order.
llvm-svn: 161746 | 
| | 
| 
| 
| | llvm-svn: 161745 |