| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
heavily on AnalyzeBranch. That routine doesn't behave as we want given
that rotation occurs mid-way through re-ordering the function. Instead
merely check that there are not unanalyzable branching constructs
present, and then reason about the CFG via successor lists. This
actually simplifies my mental model for all of this as well.
The concrete result is that we now will rotate more loop chains. I've
added a test case from Olden highlighting the effect. There is still
a bit more to do here though in order to regain all of the performance
in Olden.
llvm-svn: 145179
|
that mainline needs no autoupgrade logic for intrinsics yet, woohoo!
llvm-svn: 145178
|
autoupgrade logic for 2.9 and before.
llvm-svn: 145176
|
trampoline forms. Both of these were correct in LLVM 3.0, and we don't
need to support LLVM 2.9 and earlier in mainline.
llvm-svn: 145174
|
only produced by LLVM 2.9 and earlier. LLVM 3.0 and later prefers "load volatile".
llvm-svn: 145172
|
I think this is the last of autoupgrade that can be removed in 3.1.
Can the atomic upgrade stuff also go?
llvm-svn: 145169
|
llvm-svn: 145167
|
LLVM 3.0 and later.
llvm-svn: 145165
|
back to 3.0
llvm-svn: 145164
|
These instructions are not generated by the backend yet; this will come in a later commit.
llvm-svn: 145161
|
Fix a couple of 80-column violations.
llvm-svn: 145159
|
pass. This is designed to achieve one of the important optimizations
that the old code placement pass did, but more simply.
This is a somewhat rough and *very* conservative version of the
transform. We could get a lot fancier here if there are profitable cases
to do so. In particular, this only looks for a single pattern, it
insists that the loop backedge being rotated away is the last backedge
in the chain, and it doesn't provide any means of doing better in-loop
placement due to the rotation. However, it appears that it will handle
the important loops I am finding in the LLVM test suite.
llvm-svn: 145158
|
llvm-svn: 145154
|
Simplify some shuffle lowering code since V1 can never be UNDEF due to the canonicalization that occurs when shuffle nodes are created.
llvm-svn: 145153
|
llvm-svn: 145152
|
not be type specific. Now we just have integer high and low and floating point high and low. Pattern matching will choose the correct instruction based on the vector type.
llvm-svn: 145148
|
was returning incorrect values in rare cases, and incorrectly marking
exact conversions as inexact in some more common cases. Fixes PR11406, and a
missed optimization in test/CodeGen/X86/fp-stack-O0.ll.
llvm-svn: 145141
|
tablegen patterns for scalar FMA4 operations and intrinsic. Also
add tests for vfmaddsd.
Patch by Jan Sjodin
llvm-svn: 145133
|
llvm-svn: 145129
|
128-bit versions and let the operand type distinguish. Also fix the load form of the v8i32 patterns for these to realize that the load would be promoted to v4i64.
llvm-svn: 145126
|
the 128-bit versions and let the vector type distinguish.
llvm-svn: 145125
|
a lot.
While at it pull the trivial ctor in line.
llvm-svn: 145124
|
llvm-svn: 145122
|
llvm-svn: 145121
|
need lots of fanciness around retaining a reference to a Chain's slot in
the BlockToChain map, but that's all gone now. We can just go directly
to allocating the new chain (which will update the mapping for us) and
using it.
Somewhat gross mechanically generated test case replicates the issue
Duncan spotted when actually testing this out.
llvm-svn: 145120
|
conflicts, we should only be adding the first block of the chain to the
list, lest we try to merge into the middle of that chain. Most of the
places we were doing this we already happened to be looking at the first
block, but there is no reason to assume that, and in some cases it was
clearly wrong.
I've added a couple of tests here. One already worked, but I like having
an explicit test for it. The other is reduced from a test case Duncan
reduced for me and used to crash. Now it is handled correctly.
llvm-svn: 145119
|
- lower unaligned loads/stores.
- encode the size operand of instructions INS and EXT.
- emit relocation information needed for JAL (jump-and-link).
llvm-svn: 145113
|
llvm-svn: 145112
|
llvm-svn: 145111
|
Fixes PR11426. Not sure if a test case with a "wrong" malloc would be useful.
llvm-svn: 145106
|
and positive: positive, because it could be directly computed to be positive;
negative, because the nsw flag means it is either negative or undefined (the
multiplication always overflowed).
llvm-svn: 145104
|
Before:
movabsq $4294967296, %rax ## encoding: [0x48,0xb8,0x00,0x00,0x00,0x00,0x01,0x00,0x00,0x00]
testq %rax, %rdi ## encoding: [0x48,0x85,0xf8]
jne LBB0_2 ## encoding: [0x75,A]
After:
btq $32, %rdi ## encoding: [0x48,0x0f,0xba,0xe7,0x20]
jb LBB0_2 ## encoding: [0x72,A]
btq is usually slower than testq because it doesn't fuse with the jump, but here we're better off
saving one register and a giant movabsq.
llvm-svn: 145103
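As a rough illustration of the kind of source that could lead to the before/after sequences in the entry above, here is a minimal sketch of a single-bit test on a 64-bit value. The function name `hasBit32` is hypothetical and not taken from the commit; whether a given compiler version emits `btq` for it is an assumption, not something this log confirms.

```cpp
#include <cstdint>

// Hypothetical helper: returns true when bit 32 of x is set.
// The mask 1 << 32 (4294967296) does not fit in a 32-bit immediate,
// which is why the "Before" sequence had to materialize it with movabsq.
bool hasBit32(std::uint64_t x) {
    return (x & (std::uint64_t{1} << 32)) != 0;
}
```

A caller would typically branch on the result, e.g. `if (hasBit32(v)) { ... }`, which is the test-and-jump shape shown in the listing.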
|
further. This invariant just wasn't going to work in the face of
unanalyzable branches; we need to be resilient to the phenomenon of
chains poking into a loop and poking out of a loop. In fact, we already
were, we just needed to not assert on it.
This was found during a bootstrap with block placement turned on.
llvm-svn: 145100
|
VSHUFPS/VSHUFPD instructions while lowering VECTOR_SHUFFLE node. I check a commuted VSHUFP mask.
The patch was reviewed by Bruno.
llvm-svn: 145099
|
successors, they just are all landing pad successors. We handle this the
same way as no successors. Comments attached for the next person to wade
through here and another lovely test case courtesy of Benjamin Kramer's
bugpoint reduction.
llvm-svn: 145098
|
Patch by Bill Wendling.
llvm-svn: 145097
|
This was a bug in keeping track of the available domains when merging
domain values.
The wrong domain mask caused ExecutionDepsFix to try to move VANDPSYrr
to the integer domain which is only available in AVX2.
Also add an assertion to catch future attempts at emitting AVX2
instructions.
llvm-svn: 145096
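The bookkeeping described in the entry above can be sketched as follows, assuming the available domains are tracked as a bitmask and that merging two values keeps only the domains both support. The `Domain` numbering, the `DomainValue` struct, and the `merge` function are hypothetical names for illustration and are not the pass's actual code.

```cpp
#include <cstdint>

// Hypothetical domain numbering, for illustration only.
enum Domain : unsigned { PackedSingle = 0, PackedDouble = 1, PackedInt = 2 };

// Hypothetical stand-in for a tracked value.
struct DomainValue {
    std::uint32_t availableDomains = 0;  // bit i set => domain i is usable

    bool hasDomain(unsigned d) const { return ((availableDomains >> d) & 1u) != 0; }
};

// Merge b into a. The merged value may only live in domains that BOTH
// inputs support, so the masks are intersected. Returns false and leaves
// a untouched when there is no common domain.
bool merge(DomainValue& a, const DomainValue& b) {
    const std::uint32_t common = a.availableDomains & b.availableDomains;
    if (common == 0) return false;
    a.availableDomains = common;
    return true;
}
```

Under this sketch, a mask that still advertised a domain after the merge that one of the inputs did not support would be the kind of wrong domain mask the entry describes.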
|
reversed in the function's original ordering, and we happened to
encounter it while handling an outer unnatural CFG structure.
Thanks to the test case reduced from GCC's source by Benjamin Kramer.
This may also fix a crasher in gzip that Duncan reduced for me, but
I haven't yet gotten to testing that one.
llvm-svn: 145094
|
llvm-svn: 145092
|
new register-allocation pattern
llvm-svn: 145065
|
add AVX flavors of many instructions and fix the destination operand for some of the existing AVX entries.
llvm-svn: 145063
|
updateTerminator code didn't correctly handle EH terminators in one very
specific case. AnalyzeBranch would find no terminator instruction, and
so the fallback in updateTerminator is to assume fallthrough. This is
correct, but the destination of the fallthrough was assumed to be the
first successor.
This is *almost always* true, but in certain cases the loop
transformations will cause the landing pad to be the first successor!
Instead of this brittle logic, actually look through the successors for
a non-landing-pad successor, and assert if more than one is found.
This will hopefully fix some (if not all) of the self host miscompiles
with block placement. Thanks to Benjamin Kramer for reporting, Nick
Lewycky for an initial stab at a reduction, and Duncan for endless
advice on EH (which I know nothing about) as well as reviewing the
actual fix.
llvm-svn: 145062
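The successor scan described in the entry above might look roughly like the sketch below. `Block` and `findFallthroughTarget` are hypothetical stand-ins written for this illustration; they are not LLVM's classes or the code from the commit.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a machine basic block; not LLVM's real class.
struct Block {
    bool isLandingPad = false;
    std::vector<Block*> successors;
};

// Returns the unique non-landing-pad successor of bb, or nullptr if every
// successor is a landing pad (or there are none). Asserts if more than one
// candidate is found, instead of assuming the first successor is the
// fallthrough destination.
Block* findFallthroughTarget(const Block& bb) {
    Block* target = nullptr;
    for (Block* succ : bb.successors) {
        if (succ->isLandingPad) continue;  // EH edges are never fallthrough
        assert(target == nullptr && "more than one non-landing-pad successor");
        target = succ;
    }
    return target;
}
```

The point of scanning rather than taking `successors.front()` is that the result no longer depends on the order in which successors happen to be listed.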
|
reading files.
llvm-svn: 145061
|
dropping weights on the floor for invokes. This was impeding my writing
further test cases for invoke when interacting with probabilities and
block placement.
No test case as there doesn't appear to be a way to test this stuff. =/
Suggestions for a test case of course welcome. I hope to be able to add
test cases that indirectly cover this eventually by adding probabilities
to the exceptional edge and reordering blocks as a result.
llvm-svn: 145060
|
This was put in because in a certain version of DragonFlyBSD stat(2) lied about the
size of some files. This was fixed a long time ago so we can remove the workaround.
llvm-svn: 145059
|
before the clobber so that we copy the value if needed.
Fixes PR11415.
llvm-svn: 145056
|
correctly. Add support for decoding UNPCKHPS/UNPCKHPD for AVX 128-bit and 256-bit forms.
llvm-svn: 145055
|
the places that had to check a version of SSE and AVX.
llvm-svn: 145053
|
llvm-svn: 145047
|
llvm-svn: 145044
|