| Commit message | Author | Age | Files | Lines |
| |
|
|
|
|
| |
compiled code size a bit.
llvm-svn: 154473
|
| |
|
|
| |
llvm-svn: 154469
|
| |
|
|
|
|
| |
ret instructions.
llvm-svn: 154468
|
| |
|
|
|
|
| |
for all opcodes handled by DecodeVLDInstruction() in ARMDisassembler.cpp.
llvm-svn: 154459
|
| |
|
|
|
|
| |
rdar://11222742
llvm-svn: 154457
|
| |
|
|
|
|
|
|
|
|
| |
1. The new instruction itinerary entries are not properly described.
2. The asm parser can't handle vfms and vfnms.
3. There were no assembler or disassembler test cases.
4. HasNEON2 has the wrong assembler predicate.
rdar://10139676
llvm-svn: 154456
|
| |
|
|
|
|
|
|
|
|
|
| |
Allow cheap instructions to be hoisted if they are register pressure
neutral or better. This happens if the instruction is the last loop use
of another virtual register.
Only expensive instructions are allowed to increase loop register
pressure.
llvm-svn: 154455
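The hoisting rule in this commit message can be sketched as a small standalone predicate. This is an illustrative model only, not the MachineLICM code: the `Candidate` struct, its fields, `pressureDelta`, and `mayHoist` are hypothetical names, and the sketch assumes the real pass applies further profitability checks that are not modeled here.

```cpp
#include <cassert>

// Hypothetical model of a hoisting candidate; not an LLVM data structure.
struct Candidate {
  bool isCheap;           // cheap to recompute inside the loop
  int regsDefined;        // virtual registers the instruction defines
  int lastLoopUsesKilled; // operands for which this is the last use in the loop
};

// Net change in loop register pressure if the candidate is hoisted:
// every def becomes live across the whole loop, while every operand whose
// last loop use was this instruction no longer has to be live in the loop.
inline int pressureDelta(const Candidate &c) {
  assert(c.regsDefined >= 0 && c.lastLoopUsesKilled >= 0);
  return c.regsDefined - c.lastLoopUsesKilled;
}

// Cheap instructions are hoisted only when pressure-neutral or better.
// Expensive instructions may increase pressure (always allowed in this sketch).
inline bool mayHoist(const Candidate &c) {
  if (!c.isCheap)
    return true;
  return pressureDelta(c) <= 0;
}
```

Under this model, a cheap instruction that defines one register and is the last loop use of one other register has a delta of zero and is hoisted, while the same instruction with no killed operand is left in the loop.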
|
| |
|
|
|
|
|
|
|
|
|
| |
Hoisting a value that is used by a PHI in the loop will introduce a
copy because the live range is extended to cross the PHI.
The same applies to PHIs in exit blocks.
Also use this opportunity to make HasLoopPHIUse() non-recursive.
llvm-svn: 154454
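One common way to make a use-chain query like HasLoopPHIUse() non-recursive is an explicit worklist. The sketch below shows that pattern on a toy IR. It is not the code from this commit: `Kind`, `Inst`, and `hasPhiUse` are hypothetical names, and following uses through copies is an assumption about what the query needs to look past.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy IR: each instruction defines one value and lists its users.
enum class Kind { Phi, Copy, Other };

struct Inst {
  Kind kind;
  std::vector<std::size_t> users; // indices of instructions using this value
};

// Returns true if the value defined by `start` reaches a PHI, either directly
// or through a chain of copies. An explicit worklist replaces recursion.
inline bool hasPhiUse(const std::vector<Inst> &insts, std::size_t start) {
  assert(start < insts.size());
  std::vector<bool> visited(insts.size(), false);
  std::vector<std::size_t> worklist{start};
  visited[start] = true;
  while (!worklist.empty()) {
    std::size_t cur = worklist.back();
    worklist.pop_back();
    for (std::size_t user : insts[cur].users) {
      if (insts[user].kind == Kind::Phi)
        return true; // hoisting would extend the live range across the PHI
      if (insts[user].kind == Kind::Copy && !visited[user]) {
        visited[user] = true; // look past the copy at its own users
        worklist.push_back(user);
      }
    }
  }
  return false;
}
```

The `visited` vector keeps the loop from revisiting a copy, so the traversal terminates even if the toy graph contains a cycle.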
|
| |
|
|
|
|
|
|
| |
one-operand version of getNode() to the two-operand version, since it became a two-operand node at some point.
Zap a testcase that this allows us to completely fold away.
llvm-svn: 154447
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
- don't instrument reads from constant globals.
Saves ~1.5% of instrumented instructions on CPU2006
(counting static instructions, not their execution).
- don't instrument reads from vtables (which are global constants too).
Saves ~5%.
I did not measure the run-time impact of this,
but it is certainly non-negative.
llvm-svn: 154444
|
| |
|
|
| |
llvm-svn: 154439
|
| |
|
|
|
|
| |
multiplication by a denormal, and some tests checking that.
llvm-svn: 154431
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
StringMap. This was redundant and unnecessarily bloated the MDString class.
Because the MDString class is a "Value" and will never have a "name", and
because the Name field in the Value class is a pointer to a StringMap entry, we
repurpose the Name field for an MDString. It stores the StringMap entry in the
Name field, and uses the normal methods to get the string (name) back.
PR12474
llvm-svn: 154429
|
| |
|
|
| |
llvm-svn: 154427
|
| |
|
|
| |
llvm-svn: 154426
|
| |
|
|
| |
llvm-svn: 154425
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
a write to the same temp follows in the same BB.
Also add stats printing.
On Spec CPU2006 this optimization saves roughly 4% of instrumented reads
(which is 3% of all instrumented accesses):
Writes : 161216
Reads : 446458
Reads-before-write: 18295
llvm-svn: 154418
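A minimal sketch of the idea, assuming the truncated subject line describes skipping instrumentation of a read when a write to the same location follows in the same basic block. The `Access` struct, `selectForInstrumentation`, and `fraction` are hypothetical names for illustration. The sketch treats a block as a flat list of accesses and ignores anything that would have to reset the analysis, such as calls.

```cpp
#include <algorithm>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical model of one memory access inside a basic block.
struct Access {
  bool isWrite;
  int addr; // stand-in for the accessed location
};

// Walk the block backwards, remembering which locations are written later.
// A read is dropped when a later write to the same location exists.
inline std::vector<Access>
selectForInstrumentation(const std::vector<Access> &block) {
  std::set<int> writtenLater;
  std::vector<Access> kept;
  for (std::size_t i = block.size(); i-- > 0;) {
    const Access &a = block[i];
    if (a.isWrite) {
      writtenLater.insert(a.addr);
      kept.push_back(a);
    } else if (writtenLater.count(a.addr) == 0) {
      kept.push_back(a);
    }
  }
  std::reverse(kept.begin(), kept.end()); // restore program order
  return kept;
}

// Helper for checking the quoted statistics.
inline double fraction(long part, long whole) {
  return static_cast<double>(part) / static_cast<double>(whole);
}
```

With the numbers quoted above, `fraction(18295, 446458)` is about 0.041 (roughly 4% of reads) and `fraction(18295, 161216 + 446458)` is about 0.030 (3% of all accesses), which matches the figures in the message.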
|
| |
|
|
|
|
|
|
|
| |
don't elide the branch instruction if it's the only one in the block;
otherwise it's OK.
PR9796 and rdar://11215207
llvm-svn: 154417
|
| |
|
|
| |
llvm-svn: 154414
|
| |
|
|
|
|
|
|
|
|
|
|
| |
We were incorrectly conflating some add variants which don't have a
cc_out operand with the mirroring sub encodings, which do. Part of the
awesome non-orthogonality legacy of thumb1. Similarly, handling of
add/sub of an immediate was sometimes incorrectly removing the cc_out
operand for add/sub register variants.
rdar://11216577
llvm-svn: 154411
|
| |
|
|
| |
llvm-svn: 154398
|
| |
|
|
|
|
|
|
|
| |
always of the same size as the compared values. This is true for SSE/AVX/NEON but not
for all targets.
llvm-svn: 154397
|
| |
|
|
|
|
|
| |
blendv uses a register for the selection while vblend uses an immediate.
On Sandy Bridge they still have the same latency and execute on the same execution ports.
llvm-svn: 154396
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
the loop header has a non-loop predecessor which has been pre-fused into
its chain due to unanalyzable branches. In this case, rotating the
header into the body of the loop in order to place a loop exit at the
bottom of the loop is a Very Bad Idea as it makes the loop
non-contiguous.
I'm working on a good test case for this, but it's a bit annoying to
craft. I should get one shortly, but I'm submitting this now so I can
begin the (lengthy) performance analysis process. An initial run of LNT
looks really, really good, but there is too much noise there for me to
trust it much.
llvm-svn: 154395
|
| |
|
|
|
|
| |
This fixes PR12516 and uncovers one weird problem in legalize (worked around).
llvm-svn: 154394
|
| |
|
|
|
|
| |
Patch by Dmitri Shubin!
llvm-svn: 154391
|
| |
|
|
|
|
| |
rational number, e.g. as 2.5 rather than 5, 2. OK'd by Peter Collingbourne.
llvm-svn: 154387
|
| |
|
|
|
|
|
|
| |
Take this opportunity to generalize the indirectbr bailout logic for
loop transformations. CFG transformations will never get indirectbr
right, and there's no point trying.
llvm-svn: 154386
|
| |
|
|
| |
llvm-svn: 154385
|
| |
|
|
| |
llvm-svn: 154378
|
| |
|
|
| |
llvm-svn: 154371
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
legalizer always use the DAG entry node. This is wrong when the libcall is
emitted as a tail call since it effectively folds the return node. If
the return node's input chain is not the entry (i.e. call, load, or store)
use that as the tail call input chain.
PR12419
rdar://9770785
rdar://11195178
llvm-svn: 154370
|
| |
|
|
|
|
| |
not fit in an i64.
llvm-svn: 154364
|
| |
|
|
|
|
| |
Generalized logic of r154141.
llvm-svn: 154362
|
| |
|
|
|
|
| |
GOT if jump table uses 64-bit gp-relative relocation.
llvm-svn: 154341
|
| |
|
|
| |
in-register, such that we can use a single vector store rather than a
series of scalar stores.
For func_4_8 the generated code
    vldr d16, LCPI0_0
    vmov d17, r0, r1
    vadd.i16 d16, d17, d16
    vmov.u16 r0, d16[3]
    strb r0, [r2, #3]
    vmov.u16 r0, d16[2]
    strb r0, [r2, #2]
    vmov.u16 r0, d16[1]
    strb r0, [r2, #1]
    vmov.u16 r0, d16[0]
    strb r0, [r2]
    bx lr
becomes
    vldr d16, LCPI0_0
    vmov d17, r0, r1
    vadd.i16 d16, d17, d16
    vuzp.8 d16, d17
    vst1.32 {d16[0]}, [r2, :32]
    bx lr
I'm not fond of how this combine pessimizes 2012-03-13-DAGCombineBug.ll,
but I couldn't think of a way to judiciously apply this combine.
This
    ldrh r0, [r0, #4]
    strh r0, [r1]
becomes
    vldr d16, [r0]
    vmov.u16 r0, d16[2]
    vmov.32 d16[0], r0
    vuzp.16 d16, d17
    vst1.32 {d16[0]}, [r1, :32]
PR11158
rdar://10703339
llvm-svn: 154340
|
| |
|
|
|
|
|
|
|
|
| |
This patch restores TwoAddressInstructionPass's pre-r153892 behaviour when
rescheduling instructions in TryInstructionTransform. Hopefully this will fix
PR12493. To refix PR11861, lowering of INSERT_SUBREGS is deferred until after
the copy that unties the operands is emitted (this seems to be a more
appropriate fix for that issue anyway).
llvm-svn: 154338
|
| |
|
|
| |
llvm-svn: 154336
|
| |
|
|
|
|
|
|
|
|
|
| |
-Wconstant-conversion.
A couple of cases where we were accidentally creating constant conditions by
something like "x == a || b" instead of "x == a || x == b". In one case a
conditional & then unreachable was used - I transformed this into a direct
assert instead.
llvm-svn: 154324
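The class of bug described in this message can be reproduced in a few lines. The following is a hedged illustration, not code from the patch: `Kind`, `mistakenCheck`, `intendedCheck`, and `indexOf` are hypothetical names. `mistakenCheck` spells out how `x == KindA || KindB` is evaluated, so the constant condition is visible without relying on the compiler warning.

```cpp
#include <cassert>

enum Kind { KindA = 1, KindB = 2, KindC = 3 };

// How the mistaken form `x == KindA || KindB` is evaluated: the right-hand
// operand is just the nonzero constant KindB converted to bool, so the
// whole condition is always true.
inline bool mistakenCheck(Kind x) { return (x == KindA) || (KindB != 0); }

// The intended form compares x against each value.
inline bool intendedCheck(Kind x) { return x == KindA || x == KindB; }

// A precondition stated as a direct assert, rather than a conditional
// followed by an "unreachable" marker.
inline int indexOf(Kind x) {
  assert((x == KindA || x == KindB) && "unexpected kind");
  return x == KindA ? 0 : 1;
}
```

Calling `mistakenCheck(KindC)` returns true even though `KindC` is neither of the intended values, whereas `intendedCheck(KindC)` returns false.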
|
| |
|
|
| |
llvm-svn: 154322
|
| |
|
|
|
|
| |
original patch to add itineraries, to X86InstrArithmetic.td.
llvm-svn: 154320
|
| |
|
|
| |
llvm-svn: 154313
|
| |
|
|
|
|
|
|
| |
target pointer type.
Move NormalizeVectorShuffle and LowerVectorBroadcast into X86TargetLowering.
llvm-svn: 154310
|
| |
|
|
|
|
| |
some checks to allow better early out.
llvm-svn: 154309
|
| |
|
|
| |
llvm-svn: 154308
|
| |
|
|
| |
llvm-svn: 154307
|
| |
|
|
|
|
| |
happen.
llvm-svn: 154305
|
| |
|
|
| |
x86 addressing modes. This allows PIE-based TLS offsets to fit directly
into an addressing mode immediate offset, which is the last remaining
code quality issue from PR12380. With this patch, that PR is completely
fixed.
To understand why this patch is correct to match these offsets into
addressing mode immediates, break it down by cases:
1) 32-bit is trivially correct, and unmodified here.
2) 64-bit non-small mode is unchanged and never matches.
3) 64-bit small PIC code which is RIP-relative is handled specially in
the match to try to fit RIP into the base register. If it fails, it
now early exits. This behavior is unchanged by the patch.
4) 64-bit small non-PIC code which is not RIP-relative continues to work
as it did before. These immediates are safe because the
ABI ensures they fit in small mode. This behavior is unchanged.
5) 64-bit small PIC code which is *not* using RIP-relative addressing.
This is the only case changed by the patch, and the primary place you
see it is in TLS, either the win64 section offset TLS or Linux
local-exec TLS model in a PIC compilation. Here the ABI again ensures
that the immediates fit because we are in small mode, and any other
operations required due to the PIC relocation model have been handled
externally to the Wrapper node (extra loads etc are made around the
wrapper node in ISelLowering).
I've tested this as much as I can comparing it with GCC's output, and
everything appears safe. I discussed this with Anton and it made sense
to him at least at face value. That said, if there are issues with PIC
code after this patch, yell and we can revert it.
llvm-svn: 154304
|
| |
|
|
| |
llvm-svn: 154299
|
| |
|
|
| |
llvm-svn: 154297
|