| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
The ARM select instructions are just predicated moves. If the select is
the only use of an operand, the instruction defining the operand can be
predicated instead, saving one instruction and decreasing register
pressure.
This implementation can turn AND/ORR/EOR instructions into their
corresponding ANDCC/ORRCC/EORCC variants. Ideally, we should be able to
predicate any instruction, but we don't yet support predicated
instructions in SSA form.
llvm-svn: 161994
|
| |
|
|
| |
llvm-svn: 161990
|
| |
|
|
|
|
|
|
|
|
| |
allocations of executable memory would not be padded
to account for the size of the allocation header.
This resulted in undersized allocations, meaning that
when the allocation was written to later the next
allocation's header would be corrupted.
llvm-svn: 161984
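A minimal sketch of the sizing issue described above, in plain C++. The `BlockHeader` type and all three helper names are hypothetical and are not taken from the LLVM memory manager; they only illustrate why a request has to be padded by the header size.

```cpp
#include <cstddef>

// Hypothetical header stored in front of every allocation.
struct BlockHeader {
  std::size_t BlockSize;
};

// Undersized request: asks only for the payload and forgets the header.
std::size_t rawRequest(std::size_t Payload) { return Payload; }

// Padded request: reserves room for the header as well as the payload.
std::size_t paddedRequest(std::size_t Payload) {
  return Payload + sizeof(BlockHeader);
}

// Bytes left for the caller once the header has been placed in the block.
std::size_t usableBytes(std::size_t Allocated) {
  return Allocated > sizeof(BlockHeader) ? Allocated - sizeof(BlockHeader) : 0;
}
```

With the unpadded request, a caller that writes its full payload runs past the end of the block and into whatever follows it, which in the case described here is the next allocation's header.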
|
| |
|
|
| |
llvm-svn: 161978
|
| |
|
|
| |
llvm-svn: 161976
|
| |
|
|
|
|
| |
infinity. Problem and solution identified by Steve Canon.
llvm-svn: 161969
|
| |
|
|
|
|
| |
unaligned access. rdar://12091029
llvm-svn: 161962
|
| |
|
|
| |
llvm-svn: 161956
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When predicating this instruction:
Rd = ADD Rn, Rm
We need an extra operand to represent the value given to Rd when the
predicate is false:
Rd = ADDCC Rfalse, Rn, Rm, pred
The Rd and Rfalse operands are different registers while in SSA form.
Rfalse is tied to Rd to make sure they get the same register during
register allocation.
Previously, Rd and Rn were tied, but that is not required.
Compare to MOVCC:
Rd = MOVCC Rfalse, Rtrue, pred
llvm-svn: 161955
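The operand layout above can be read as a small value-level model. This is only a sketch in plain C++ of what the predicated forms compute; the function names are illustrative and do not correspond to any LLVM API.

```cpp
#include <cstdint>

// Rd = MOVCC Rfalse, Rtrue, pred
uint32_t movcc(uint32_t RFalse, uint32_t RTrue, bool Pred) {
  return Pred ? RTrue : RFalse;
}

// Rd = ADDCC Rfalse, Rn, Rm, pred
// When the predicate is false the add does not execute and Rd keeps the
// value of Rfalse, which is why Rfalse (not Rn) is tied to Rd.
uint32_t addcc(uint32_t RFalse, uint32_t Rn, uint32_t Rm, bool Pred) {
  return Pred ? Rn + Rm : RFalse;
}
```

In this model, `addcc(F, N, M, P)` gives the same result as `movcc(F, N + M, P)`, which is the add-then-select pair the predicated instruction replaces.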
|
| |
|
|
|
|
|
|
|
|
| |
instruction to something absurdly high, while setting the probability of
branching to the 'unwind' destination to the bare minimum. This should cause
the normal destination's invoke blocks to be moved closer to the invoke.
PR13612
llvm-svn: 161944
|
| |
|
|
|
|
| |
to handle unaligned partially OOB accesses. See http://code.google.com/p/address-sanitizer/issues/detail?id=100
llvm-svn: 161937
|
| |
|
|
|
|
| |
results for negative inputs to trunc. Add unit tests to verify this behavior.
llvm-svn: 161929
|
| |
|
|
|
|
|
|
|
|
|
|
| |
- memcpy size is wrongly truncated to 32 bits, so an 8GB memcpy is
treated as a 0-sized memcpy
- as 0-sized memcpy/memset calls are already removed before
SimplifyMemTransfer and SimplifyMemSet in visitCallInst, replace the
0 checks with assertions.
- replace getZExtValue() with getLimitedValue(), as suggested by
Eli Friedman
llvm-svn: 161923
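The truncation in the first bullet is easy to reproduce in isolation. The sketch below uses plain C++ integers rather than LLVM's constant classes, and both helper names are hypothetical. `limitedSize` only mimics the clamping idea behind a saturating read and is not the LLVM implementation.

```cpp
#include <cstdint>
#include <limits>

// The buggy pattern: a 64-bit length squeezed into 32 bits.
uint32_t truncatedSize(uint64_t Size) { return static_cast<uint32_t>(Size); }

// A clamping read: values above Limit saturate at Limit instead of wrapping.
uint64_t limitedSize(uint64_t Size,
                     uint64_t Limit = std::numeric_limits<uint64_t>::max()) {
  return Size < Limit ? Size : Limit;
}

// 8GB is 2^33 bytes, so its low 32 bits are all zero.
const uint64_t EightGB = 8ULL << 30;
```

Here `truncatedSize(EightGB)` is 0, so a check for a zero-length copy would wrongly fire, while `limitedSize(EightGB)` keeps the full value.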
|
| |
|
|
|
|
| |
Patch by Stephen Hines!
llvm-svn: 161921
|
| |
|
|
|
|
| |
pointer.
llvm-svn: 161919
|
| |
|
|
|
|
|
|
| |
reversed. This leads to wrong codegen for the float-to-half conversion
intrinsics which are used to support the storage-only fp16 type.
NEON variants of the same instructions are fine.
llvm-svn: 161907
|
| |
|
|
| |
llvm-svn: 161906
|
| |
|
|
| |
llvm-svn: 161902
|
| |
|
|
|
|
|
|
|
|
|
|
| |
- FP_EXTEND only supports extending from vectors with matching elements.
This results in scalarization when extending from v2f32 to v2f64,
because v2f32 is legalized to v4f32, which does not match v2f64.
- add X86-specific VFPEXT supporting extending from v4f32 to v2f64.
- add BUILD_VECTOR lowering helper to recover the original
extension from v4f32 to v2f64.
- test case is enhanced to include different vector widths.
llvm-svn: 161894
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Refactor the TableGen'erated fixed length disassembler to use a
table-driven state machine rather than a massive set of nested
switch() statements.
As a result, the ARM Disassembler (ARMDisassembler.cpp) builds much more
quickly and generates a smaller end result. For a Release+Asserts build on
a 16GB 3.4GHz i7 iMac w/ SSD:
Time to compile at -O2 (averaged w/ hot caches):
Previous: 35.5s
New: 8.9s
TEXT size:
Previous: 447,251
New: 297,661
Builds in 25% of the time previously required and generates code 66% of
the size.
Execution time of the disassembler is only slightly slower (7% disassembling
10 million ARM instructions, 19.6s vs 21.0s). The new implementation has
not yet been tuned, however, so the performance should almost certainly
be recoverable should it become a concern.
llvm-svn: 161888
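To make the "table-driven state machine" idea concrete, here is a toy decoder in plain C++. The opcode names, the table layout and the instruction ids are all assumptions made for illustration; this is not the format TableGen emits.

```cpp
#include <cstdint>

// Hypothetical opcodes for a tiny decoder table.
enum Op : uint8_t { OPC_ExtractField, OPC_FilterValue, OPC_Decode, OPC_Fail };

// Table layout (every entry is one byte):
//   OPC_ExtractField, Start, Len      read Len bits starting at bit Start
//   OPC_FilterValue, Expected, Skip   on mismatch, jump Skip bytes forward
//   OPC_Decode, InstrId               success, report InstrId
//   OPC_Fail                          no match
static const uint8_t DecoderTable[] = {
    OPC_ExtractField, 28, 4,  // offset 0: field = bits [31:28]
    OPC_FilterValue, 0xE, 2,  // offset 3: mismatch skips to offset 8
    OPC_Decode, 1,            // offset 6: hypothetical instruction 1
    OPC_FilterValue, 0xF, 2,  // offset 8: mismatch skips to offset 13
    OPC_Decode, 2,            // offset 11: hypothetical instruction 2
    OPC_Fail                  // offset 13
};

// Walks the table with one small loop instead of nested switch statements.
// Returns the instruction id, or -1 if no entry matches.
// Assumes every Len in the table is smaller than 32.
int decode(uint32_t Insn, const uint8_t *Table = DecoderTable) {
  const uint8_t *P = Table;
  uint32_t Field = 0;
  for (;;) {
    switch (*P) {
    case OPC_ExtractField: {
      unsigned Start = P[1], Len = P[2];
      Field = (Insn >> Start) & ((1u << Len) - 1);
      P += 3;
      break;
    }
    case OPC_FilterValue: {
      uint8_t Expected = P[1], Skip = P[2];
      P += 3;
      if (Field != Expected)
        P += Skip;
      break;
    }
    case OPC_Decode:
      return P[1];
    default: // OPC_Fail or an unknown opcode
      return -1;
    }
  }
}
```

The interpreter loop stays the same size however many instructions the table describes, so adding encodings grows data rather than code. That is one plausible reason for the compile-time and size improvements reported above.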
|
| |
|
|
|
|
| |
safe. Fixes c-torture/execute/990826-0.c
llvm-svn: 161885
|
| |
|
|
|
|
| |
end of the function. This doesn't seem to fix or break anything, but is considered to be more friendly to downstream passes.
llvm-svn: 161870
|
| |
|
|
| |
llvm-svn: 161860
|
| |
|
|
|
|
| |
Reduces compiled code size a little bit.
llvm-svn: 161859
|
| |
|
|
|
|
| |
store to the same offset is treated as completely overwriting.
llvm-svn: 161857
|
| |
|
|
|
|
| |
change.
llvm-svn: 161853
|
| |
|
|
|
|
|
|
|
|
|
| |
and allow some optimizations to turn conditional branches into unconditional.
This commit adds a simple control-flow optimization which merges two consecutive
basic blocks which are connected by a single edge. This allows the codegen to
operate on larger basic blocks.
rdar://11973998
llvm-svn: 161852
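The block-merging step described above can be sketched on a toy CFG. The `Block` struct and both function names are hypothetical, block 0 is assumed to be the entry block, and real passes also have to update phis, terminators and analyses, which this sketch ignores.

```cpp
#include <string>
#include <vector>

// Toy basic block: instruction names plus successor indices into the CFG.
struct Block {
  std::vector<std::string> Insts;
  std::vector<int> Succs;
  bool Dead = false; // set once the block has been merged away
};

// Counts the edges that enter Target from live blocks.
static int countPreds(const std::vector<Block> &CFG, int Target) {
  int N = 0;
  for (const Block &B : CFG) {
    if (B.Dead)
      continue;
    for (int S : B.Succs)
      if (S == Target)
        ++N;
  }
  return N;
}

// Merges Pred's only successor into Pred when the edge between them is the
// single way out of Pred and the single way into the successor.
bool mergeIntoPredecessor(std::vector<Block> &CFG, int Pred) {
  Block &A = CFG[Pred];
  if (A.Dead || A.Succs.size() != 1)
    return false;
  int Succ = A.Succs[0];
  // Never merge a self-loop or the entry block (assumed to be index 0).
  if (Succ == Pred || Succ == 0 || countPreds(CFG, Succ) != 1)
    return false;
  Block &B = CFG[Succ];
  A.Insts.insert(A.Insts.end(), B.Insts.begin(), B.Insts.end());
  A.Succs = B.Succs; // Pred now branches wherever the successor did
  B.Insts.clear();
  B.Succs.clear();
  B.Dead = true;
  return true;
}
```

Calling `mergeIntoPredecessor` repeatedly on a straight-line chain collapses it into one larger block, which is the effect the commit message describes.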
|
| |
|
|
| |
llvm-svn: 161851
|
| |
|
|
| |
llvm-svn: 161826
|
| |
|
|
|
|
| |
various rounding modes. Use this to implement SelectionDAG constant folding of FFLOOR, FCEIL, and FTRUNC.
llvm-svn: 161807
|
| |
|
|
| |
llvm-svn: 161805
|
| |
|
|
| |
llvm-svn: 161804
|
| |
|
|
|
|
|
|
|
|
|
|
| |
other passes, such as LoopRotate
may invalidate its AliasSet because SSAUpdater does not update the AliasSet properly.
This patch teaches SSAUpdater to notify AliasSet that it made changes.
The testcase in PR12901 is too big to be useful and I could not reduce it to a normal size.
rdar://11872059 PR12901
llvm-svn: 161803
|
| |
|
|
|
|
|
|
|
|
|
| |
function calls.
Currently, if GetLocation reports that it did not find a valid pointer (this is the case for volatile loads/stores),
we ignore the result. This patch adds code to handle the cases where we did not obtain a valid pointer.
rdar://11872864 PR12899
llvm-svn: 161802
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It never does anything when running 'make check', and it gets in the
way of updating live intervals in 2-addr.
The hook was originally added to help form IT blocks in Thumb2 code
before register allocation, but the pass ordering has changed since
then, and we run if-conversion after register allocation now.
When the MI scheduler is enabled, there will be no less than two
schedulers between 2-addr and Thumb2ITBlockPass, so this hook is
unlikely to help anything.
llvm-svn: 161794
|
| |
|
|
|
|
|
|
| |
This change is to be enabled in clang.
rdar://9877866
llvm-svn: 161789
|
| |
|
|
| |
llvm-svn: 161788
|
| |
|
|
| |
llvm-svn: 161783
|
| |
|
|
| |
llvm-svn: 161782
|
| |
|
|
|
|
|
|
| |
It is still possible to if-convert if the tail block has extra
predecessors, but the tail phis must be rewritten instead of being
removed.
llvm-svn: 161781
|
| |
|
|
|
|
|
|
|
|
| |
This was causing unnecessary spills/restores of callee saved registers.
Fixes PR13572.
Patch by Pranav Bhandarkar!
llvm-svn: 161778
|
| |
|
|
|
|
|
|
| |
ISDNode has more than one user.
rdar://11876519
llvm-svn: 161775
|
| |
|
|
|
|
|
|
|
| |
OpTbl1 to OpTbl2 since they have 3 operands and the last operand can be changed
to a memory operand.
PR13576
llvm-svn: 161769
|
| |
|
|
|
|
| |
Patch by Weiming Zhao.
llvm-svn: 161768
|
| |
|
|
|
|
| |
Nehalem, Westmere and Sandy Bridge. AMD also has processor family 6.
llvm-svn: 161763
|
| |
|
|
|
|
| |
idea. (partly related to Bug 13225)
llvm-svn: 161757
|
| |
|
|
|
|
|
|
| |
Previously, we used VLD1.32 in all cases, however there are both 16 and 64-bit
accesses being selected, so we need to use an appropriate width load in those
cases.
llvm-svn: 161748
|
| |
|
|
|
|
| |
putting a couple of if conditions in a better order.
llvm-svn: 161746
|
| |
|
|
| |
llvm-svn: 161745
|
| |
|
|
| |
llvm-svn: 161743
|