| Commit message | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
| |
when a compile time constant is known. This occurs when implicitly zero
extending function arguments from 16 bits to 32 bits. The 8 bit case doesn't
need to be handled, as the 8 bit constants are encoded directly, thereby
not needing a separate load instruction to form the constant into a register.
<rdar://problem/11481151>
llvm-svn: 158659
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch will optimize abs(x-y)
FROM
sub, movs, rsbmi
TO
subs, rsbmi
For abs, we will use cmp instead of movs. This is necessary because we already
have an existing peephole pass which optimizes away cmp following sub.
rdar: 11633193
llvm-svn: 158551
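A minimal sketch of why the movs is redundant, modeling only the N flag on 32-bit values. The names SubsResult, subs and absDiff are hypothetical and are not part of the patch; the point is just that the flag-setting subtract already yields the N flag that rsbmi tests.

```cpp
#include <cstdint>

// Hypothetical model of a flag-setting subtract: the result plus the N flag.
struct SubsResult {
  uint32_t value; // x - y, modulo 2^32
  bool n;         // N flag: bit 31 of the result
};

// subs r, x, y
SubsResult subs(uint32_t x, uint32_t y) {
  uint32_t r = x - y;
  return {r, (r >> 31) != 0};
}

// rsbmi r, r, #0: negate the difference only when N is set.
uint32_t absDiff(uint32_t x, uint32_t y) {
  SubsResult s = subs(x, y);
  return s.n ? (0u - s.value) : s.value;
}
```

A separate movs would only recompute N from the same value, so under this model it adds no information.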
|
|
|
|
|
|
| |
This probably mostly shows up in bugpoint-generated code.
llvm-svn: 158527
|
|
|
|
|
|
| |
Sorry that I accidentally checked in this file with my previous commit.
llvm-svn: 158442
|
|
|
|
|
|
| |
uno && ueq was converted to ueq; it should be converted to uno.
llvm-svn: 158441
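A small sketch of the identity behind this fix, modeling the two predicates on doubles. The helper names are illustrative only: uno is true when either operand is NaN, and ueq is true when the operands are unordered or equal, so the conjunction can only be true when uno is.

```cpp
#include <cmath>

// Unordered: at least one operand is NaN.
bool uno(double a, double b) { return std::isunordered(a, b); }

// Unordered or equal.
bool ueq(double a, double b) { return std::isunordered(a, b) || a == b; }

// The conjunction discussed in the commit message.
bool unoAndUeq(double a, double b) { return uno(a, b) && ueq(a, b); }
```

For two equal, ordered values ueq is true while the conjunction is false, which is why folding to ueq was wrong.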
|
|
|
|
|
|
|
|
|
|
|
| |
For store->load dependencies that may alias, we should always use
TrueMemOrderLatency, which may eventually become a subtarget hook. In
effect, we should guarantee at least TrueMemOrderLatency on at least
one DAG path from a store to a may-alias load.
This should fix the standard mode as well as "-enable-aa-sched-mi".
llvm-svn: 158380
|
|
|
|
|
|
| |
Patch by Jush Lu <jush.msn@gmail.com>.
llvm-svn: 158368
|
|
|
|
|
|
|
| |
The test is really checking the prolog/epilog load/store multiple
formation.
llvm-svn: 158328
|
|
|
|
|
|
|
|
|
| |
We turned off the CMN instruction because we weren't modeling its semantics
correctly. When comparing against an immediate, it is okay to use the CMN
instruction.
<rdar://problem/7569620>
llvm-svn: 158302
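The message does not say which semantics were wrong. One commonly cited difference, assumed here, is that "cmn rn, y" and "cmp rn, -y" disagree on the carry flag when y is zero, which a known immediate can rule out. The sketch below models NZCV with hypothetical helpers (Flags, addWithCarry, cmp, cmn); it is an illustration, not the code generator's logic.

```cpp
#include <cstdint>

struct Flags { bool n, z, c, v; };

bool operator==(const Flags& a, const Flags& b) {
  return a.n == b.n && a.z == b.z && a.c == b.c && a.v == b.v;
}

// Flags of a + b + carryIn, computed in 64 bits to expose carry and overflow.
Flags addWithCarry(uint32_t a, uint32_t b, uint32_t carryIn) {
  uint64_t uwide = uint64_t(a) + uint64_t(b) + carryIn;
  uint32_t r = static_cast<uint32_t>(uwide);
  int64_t swide = int64_t(static_cast<int32_t>(a)) +
                  int64_t(static_cast<int32_t>(b)) + carryIn;
  bool carry = (uwide >> 32) != 0;
  bool overflow = swide != int64_t(static_cast<int32_t>(r));
  return {(r >> 31) != 0, r == 0, carry, overflow};
}

// cmp rn, op2 sets flags for rn - op2, i.e. rn + ~op2 + 1.
Flags cmp(uint32_t rn, uint32_t op2) { return addWithCarry(rn, ~op2, 1); }

// cmn rn, op2 sets flags for rn + op2.
Flags cmn(uint32_t rn, uint32_t op2) { return addWithCarry(rn, op2, 0); }
```

Under this model, cmn(5, 3) matches cmp(5, -3), while cmn(5, 0) and cmp(5, 0) differ in C. The value 0x80000000 is a second corner case (V differs) that a real implementation would also need to consider.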
|
|
|
|
|
|
|
|
|
|
|
| |
The fast register allocator is not supposed to work in the optimizing
pipeline. It doesn't make sense to compute live intervals, run full copy
coalescing, and then run RAFast.
Fast register allocation in the optimizing pipeline is better done by
RABasic.
llvm-svn: 158242
|
|
|
|
| |
llvm-svn: 157972
|
|
|
|
|
|
|
|
|
| |
when a compile time constant is known. This occurs when implicitly zero
extending function arguments from 16 bits to 32 bits.
<rdar://problem/11481151>
llvm-svn: 157966
|
|
|
|
| |
llvm-svn: 157925
|
|
|
|
|
|
| |
rdar://9877866
llvm-svn: 157876
|
|
|
|
| |
llvm-svn: 157761
|
|
|
|
|
|
| |
wrote and the usual LLVM convention.
llvm-svn: 157708
|
|
|
|
|
|
| |
operands of an FMA node.
llvm-svn: 157707
|
|
|
|
|
|
| |
Patch by Jush Lu <jush.msn@gmail.com>.
llvm-svn: 157696
|
|
|
|
| |
llvm-svn: 157663
|
|
|
|
|
|
|
|
|
| |
because the old verifier just checked that something "was a pointer", but not
that the pointee was correct.
llvm-svn: 157544
|
|
|
|
|
|
| |
Patch by Jush Lu <jush.msn@gmail.com>.
llvm-svn: 157336
|
|
|
|
|
|
|
|
| |
objectsize intrinsic.
After a lot of discussion, we realized it's not the best option for run-time bounds checking.
llvm-svn: 157255
|
|
|
|
|
|
| |
They need to go on the PICLDR as the verifier points out.
llvm-svn: 157151
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use a dedicated MachO load command to annotate data-in-code regions.
This is the same format the linker produces for final executable images,
allowing consistency of representation and use of introspection tools
for both object and executable files.
Data-in-code regions are annotated via ".data_region"/".end_data_region"
directive pairs, with an optional region type.
  data_region_directive := ".data_region" { region_type }
  region_type := "jt8" | "jt16" | "jt32" | "jta32"
  end_data_region_directive := ".end_data_region"
The previous handling of ARM-style "$d.*" labels was broken and has
been removed. Specifically, it didn't handle ARM vs. Thumb mode when
marking the end of the section.
rdar://11459456
llvm-svn: 157062
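As an illustration of the directive grammar above, here is a rough recognizer sketch. It is not the assembler's actual parser: DataRegion and parseDataRegionDirective are hypothetical names, and the sketch assumes the region type appears at most once, following the "optional region type" wording.

```cpp
#include <sstream>
#include <string>

// Hypothetical parse result for one directive line.
struct DataRegion {
  bool isEnd;       // true for ".end_data_region"
  std::string type; // "", "jt8", "jt16", "jt32" or "jta32"
};

// Returns true and fills 'out' if 'line' matches the grammar.
bool parseDataRegionDirective(const std::string& line, DataRegion& out) {
  std::istringstream in(line);
  std::string directive, type, extra;
  if (!(in >> directive)) return false;
  if (directive == ".end_data_region") {
    if (in >> extra) return false; // no operands allowed
    out = DataRegion{true, ""};
    return true;
  }
  if (directive != ".data_region") return false;
  if (!(in >> type)) {             // region type omitted
    out = DataRegion{false, ""};
    return true;
  }
  if (in >> extra) return false;   // at most one region type
  static const char* const kTypes[] = {"jt8", "jt16", "jt32", "jta32"};
  for (const char* t : kTypes) {
    if (type == t) {
      out = DataRegion{false, type};
      return true;
    }
  }
  return false;                    // unknown region type
}
```

For example, ".data_region jt8" and a bare ".data_region" are accepted, while ".data_region jt64" is rejected.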
|
|
|
|
|
|
| |
Patch by Meador Inge.
llvm-svn: 156989
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It is now possible to coalesce weird skewed sub-register copies by
picking a super-register class larger than both original registers. The
included test case produces code like this:
  vld2.32 {d16, d17, d18, d19}, [r0]!
  vst2.32 {d18, d19, d20, d21}, [r0]
We still perform interference checking as if it were a normal full copy
join, so this is still quite conservative. In particular, the f1 and f2
functions in the included test case still have remaining copies because
of false interference.
llvm-svn: 156878
|
|
|
|
| |
llvm-svn: 156646
|
|
|
|
|
|
| |
Minor cleanup.
llvm-svn: 156632
|
|
|
|
|
|
|
|
| |
retval. Hoists check before emitting the call to avoid unnecessary work.
rdar://11430407
PR12796
llvm-svn: 156628
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch will optimize the following cases:
  sub r1, r3 | sub r1, imm
  cmp r3, r1 or cmp r1, r3 | cmp r1, imm
  bge L1
TO
  subs r1, r3
  bge L1 or ble L1
If the branch instruction can use flag from "sub", then we can replace
"sub" with "subs" and eliminate the "cmp" instruction.
rdar: 10734411
llvm-svn: 156599
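A rough model of why the branch condition may need to be swapped. The flags of "subs r1, r3" are those of "cmp r1, r3", so bge carries over unchanged; "cmp r3, r1" followed by bge tests r3 >= r1, which on the flags of r1 - r3 is ble. Flags, subFlags, condGE and condLE are hypothetical helpers written for this illustration, not code from the patch.

```cpp
#include <cstdint>

struct Flags { bool n, z, v; };

// Flags produced by "subs a, b" (the same as "cmp a, b").
Flags subFlags(int32_t a, int32_t b) {
  int64_t wide = int64_t(a) - int64_t(b);
  int32_t r = static_cast<int32_t>(static_cast<uint32_t>(a) -
                                   static_cast<uint32_t>(b));
  return {r < 0, r == 0, wide != int64_t(r)};
}

// ARM condition codes expressed in terms of the flags.
bool condGE(Flags f) { return f.n == f.v; }        // signed >=
bool condLE(Flags f) { return f.z || f.n != f.v; } // signed <=
```

With f = subFlags(r1, r3), condGE(f) answers "r1 >= r3" and condLE(f) answers "r3 >= r1", matching the "bge L1 or ble L1" alternatives in the message.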
|
|
|
|
|
|
| |
This commit broke an external linux bot and gave a compile-time warning.
llvm-svn: 156556
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch will optimize the following cases:
  sub r1, r3 | sub r1, imm
  cmp r3, r1 or cmp r1, r3 | cmp r1, imm
  bge L1
TO
  subs r1, r3
  bge L1 or ble L1
If the branch instruction can use flag from "sub", then we can replace
"sub" with "subs" and eliminate the "cmp" instruction.
rdar: 10734411
llvm-svn: 156550
|
|
|
|
|
|
|
|
| |
maximum runtime performance penalty that the user is willing to accept.
This commit only adds the parameter. Code taking advantage of it will follow.
llvm-svn: 156473
|
|
|
|
| |
llvm-svn: 156324
|
|
|
|
|
|
| |
just like it now knows for FMULs.
llvm-svn: 156029
|
|
|
|
| |
llvm-svn: 156023
|
|
|
|
|
|
|
|
| |
ARM BUILD_VECTORs created after type legalization cannot use i8 or i16
operands, since those types are not legal. Instead use i32 operands, which
will be implicitly truncated by the BUILD_VECTOR to match the element type.
llvm-svn: 155824
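To illustrate the implicit truncation, here is a plain C++ stand-in for a four-element i8 vector built from i32 operands. buildVectorV4i8 is a hypothetical name and this is not SelectionDAG code; it only shows that each wider operand contributes its low 8 bits.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a v4i8 BUILD_VECTOR whose operands are i32.
std::array<uint8_t, 4> buildVectorV4i8(const std::array<uint32_t, 4>& ops) {
  std::array<uint8_t, 4> v{};
  for (std::size_t i = 0; i < ops.size(); ++i)
    v[i] = static_cast<uint8_t>(ops[i]); // keep only the low 8 bits
  return v;
}
```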
|
|
|
|
|
|
| |
<rdar://problem/11325085>.
llvm-svn: 155724
|
|
|
|
| |
llvm-svn: 155686
|
|
|
|
|
|
|
| |
On some cores, mixing VFP and NEON instructions is bad for performance, and
since these patterns are NEON anyway, the NEON load should be used.
llvm-svn: 155630
|
|
|
|
|
|
| |
refuse to break edge to EH landing pad. rdar://11300144
llvm-svn: 155470
|
|
|
|
| |
llvm-svn: 154915
|
|
|
|
|
|
| |
Add an extra test to ldr_post with an immediate increment.
llvm-svn: 154859
|
|
|
|
|
|
| |
It makes it less sensitive to small changes in heuristics.
llvm-svn: 154857
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is mostly to test the waters. I'd like to get results from FNT
build bots and other bots running on non-x86 platforms.
This feature has been pretty heavily tested over the last few months by
me, and it fixes several of the execution time regressions caused by the
inlining work by preventing inlining decisions from radically impacting
block layout.
I've seen very large improvements in yacr2 and ackermann benchmarks,
along with the expected noise across all of the benchmark suite whenever
code layout changes. I've analyzed all of the regressions and fixed
them, or found them to be impossible to fix. See my email to llvmdev for
more details.
I'd like for this to be in 3.1 as it complements the inliner changes,
but if any failures are showing up or anyone has concerns, it is just
a flag flip and so can be easily turned off.
I'm switching it on tonight to try and get at least one run through
various folks' performance suites in case SPEC or something else has
serious issues with it. I'll watch bots and revert if anything shows up.
llvm-svn: 154816
|
|
|
|
|
|
| |
explicitly.
llvm-svn: 154689
|
|
|
|
| |
llvm-svn: 154484
|
|
|
|
| |
llvm-svn: 154469
|
|
|
|
| |
llvm-svn: 154466
|
|
|
|
|
|
|
|
| |
one-operand version of getNode() to the two-operand version, since it became a two-operand node at some point.
Zap a testcase that this allows us to completely fold away.
llvm-svn: 154447
|