| Commit message | Author | Age | Files | Lines |
| |
new register-allocation pattern
llvm-svn: 145065
|
add AVX flavors of many instructions and fix the destination operand for some of the existing AVX entries.
llvm-svn: 145063
|
updateTerminator code didn't correctly handle EH terminators in one very
specific case. AnalyzeBranch would find no terminator instruction, and
so the fallback in updateTerminator is to assume fallthrough. This is
correct, but the destination of the fallthrough was assumed to be the
first successor.
This is *almost always* true, but in certain cases the loop
transformations will cause the landing pad to be the first successor!
Instead of this brittle logic, actually look through the successors for
a non-landing-pad successor, and assert if more than one is found.
This will hopefully fix some (if not all) of the self host miscompiles
with block placement. Thanks to Benjamin Kramer for reporting, Nick
Lewycky for an initial stab at a reduction, and Duncan for endless
advice on EH (which I know nothing about) as well as reviewing the
actual fix.
llvm-svn: 145062
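The successor search described above can be sketched as follows. This is a minimal illustration, not the code from the commit: `Block`, `isLandingPad`, and `findFallthroughSuccessor` are hypothetical stand-ins rather than LLVM's `MachineBasicBlock` API.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a machine basic block (not LLVM's class).
struct Block {
  bool isLandingPad = false;
  std::vector<Block *> successors;
};

// Returns the single successor that is not a landing pad, or nullptr if
// every successor is a landing pad. Asserts if more than one candidate
// exists, since fallthrough can only have one destination.
Block *findFallthroughSuccessor(const Block &bb) {
  Block *fallthrough = nullptr;
  for (Block *succ : bb.successors) {
    if (succ->isLandingPad)
      continue; // EH edges are never the fallthrough destination.
    assert(!fallthrough && "found more than one fallthrough successor");
    fallthrough = succ;
  }
  return fallthrough;
}
```

Note that the position of the landing pad in the successor list no longer matters, which is the property the old "first successor" assumption lacked.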
|
reading files.
llvm-svn: 145061
|
dropping weights on the floor for invokes. This was impeding my writing
further test cases for invoke when interacting with probabilities and
block placement.
No test case as there doesn't appear to be a way to test this stuff. =/
Suggestions for a test case of course welcome. I hope to be able to add
test cases that indirectly cover this eventually by adding probabilities
to the exceptional edge and reordering blocks as a result.
llvm-svn: 145060
|
This was put in because in a certain version of DragonFlyBSD stat(2) lied about the
size of some files. This was fixed a long time ago so we can remove the workaround.
llvm-svn: 145059
|
before the clobber so that we copy the value if needed.
Fixes pr11415.
llvm-svn: 145056
|
correctly. Add support for decoding UNPCKHPS/UNPCKHPD for AVX 128-bit and 256-bit forms.
llvm-svn: 145055
|
the places that had to check a version of SSE and AVX.
llvm-svn: 145053
|
llvm-svn: 145047
|
llvm-svn: 145044
|
llvm-svn: 145028
|
AVX2 is enabled.
llvm-svn: 145026
|
llvm-svn: 145025
|
use AVX2 shifts when AVX2 is enabled.
llvm-svn: 145022
|
llvm-svn: 145014
|
Suggested in code review by Eli.
That code in InstCombine looks kinda suspicious.
llvm-svn: 145013
|
properly account for the *global* probability of the edge being taken.
This manifested as a very large number of unconditional branches to
blocks being merged against the CFG even though they weren't
particularly hot within the CFG.
The fix is to check whether the edge being merged is both locally hot
relative to other successors for the source block, and globally hot
compared to other (unmerged) predecessors of the destination block.
This introduces a new crasher on GCC single-source, but it's currently
behind a flag, and Ben has offered to work on the reduction. =]
llvm-svn: 145010
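The two-part hotness check described above might look roughly like this. It is a sketch under stated assumptions: the `Edge` record, the frequency model (local probability times source-block frequency), and the `hotThreshold` default are all hypothetical and are not taken from the commit.

```cpp
#include <vector>

// Hypothetical edge record; field names and the frequency model are assumptions.
struct Edge {
  double localProb;  // probability of taking this edge out of its source block
  double sourceFreq; // estimated execution frequency of the source block
};

// Global weight of an edge: how often it is actually taken overall.
double edgeFreq(const Edge &e) { return e.localProb * e.sourceFreq; }

// Merge only if the edge is hot relative to its sibling successors (local)
// and at least as hot as every other unmerged predecessor edge of the
// destination block (global).
bool shouldMerge(const Edge &candidate,
                 const std::vector<Edge> &otherUnmergedPreds,
                 double hotThreshold = 0.8) {
  if (candidate.localProb < hotThreshold)
    return false; // not locally hot
  for (const Edge &pred : otherUnmergedPreds)
    if (edgeFreq(pred) > edgeFreq(candidate))
      return false; // some other predecessor is globally hotter
  return true;
}
```

An unconditional branch has a local probability of 1.0, so without the global comparison it would always be merged, which matches the symptom the message describes.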
|
setFlags doesn't modify its arguments.
llvm-svn: 145007
|
instructions. Remove 256-bit splat handling from LowerShift as it was already handled by PerformShiftCombine.
llvm-svn: 145005
|
llvm-svn: 145004
|
limitation of not being able to remove redundant bitconverts from patterns.
llvm-svn: 145003
|
the intrinsic patterns.
llvm-svn: 144999
|
formation phase and into the initial walk of the basic blocks. We
essentially pre-merge all blocks where unanalyzable fallthrough exists,
as we won't be able to update the terminators effectively after any
reorderings. This is quite a bit more principled as there may be CFGs
where the second half of the unanalyzable pair has some analyzable
predecessor that gets placed first. Then it may get placed next,
implicitly breaking the unanalyzable branch even though we never even
looked at the part that isn't analyzable. I've included a test case that
triggers this (thanks Benjamin yet again!), and I'm hoping to synthesize
some more general ones as I dig into related issues.
Also, to make this new scheme work we have to be able to handle branches
into the middle of a chain, so add this check. We always fall back on the
incoming ordering.
Finally, this starts to really underscore a known limitation of the
current implementation -- we don't consider broken predecessors when
merging successors. This can cause major missed opportunities, and is
something I'm planning on looking at next (modulo more bug reports).
llvm-svn: 144994
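The pre-merging step described above can be illustrated with a small sketch. The `Block` record, its `unanalyzableFallthrough` flag, and `premergeChains` are hypothetical names; the real pass works on machine basic blocks and its own chain data structure.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical block record: true when the block falls through to the next
// block in the incoming layout via a terminator that cannot be analyzed.
struct Block {
  bool unanalyzableFallthrough;
};

// Walks the blocks in their incoming order and assigns a chain id to each.
// A block with unanalyzable fallthrough keeps its layout successor in the
// same chain, so later reordering can never separate the pair.
std::vector<std::size_t> premergeChains(const std::vector<Block> &layout) {
  std::vector<std::size_t> chainOf(layout.size());
  std::size_t chain = 0;
  for (std::size_t i = 0; i < layout.size(); ++i) {
    chainOf[i] = chain;
    if (!layout[i].unanalyzableFallthrough)
      ++chain; // the next block may start a fresh chain
  }
  return chainOf;
}
```

Doing this during the initial walk, before any chains are formed from profile data, means the unanalyzable pairs are already fixed in place when placement decisions begin.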
|
add/sub of appropriate shuffle vectors.
llvm-svn: 144989
|
llvm-svn: 144988
|
llvm-svn: 144987
|
llvm-svn: 144985
|
The loop tree's inclusive block lists are painful and expensive to
update. (I have no idea why they're inclusive). The design was
supposed to handle this case but the implementation missed it and my
unit tests weren't thorough enough.
Fixes PR11335: loop unroll update.
llvm-svn: 144970
|
llvm-svn: 144967
|
large chunks of inline assembler
llvm-svn: 144962
|
work/dead code.
llvm-svn: 144959
|
in the end while emitting DWARF. If a FE needs to encode signed lower/upper array bounds then we need to extend DISubrange or add DISignedSubrange.
llvm-svn: 144937
|
r144933. For some reason this compiles on Linux
llvm-svn: 144936
|
The right way to check for a binary operation is
cast<BinaryOperator>. The original check: cast<Instruction> &&
numOperands() == 2 would match phi "instructions", leading to an
infinite loop in an extreme corner case: a useless phi with operands
[self, constant] that prior optimization passes failed to remove,
being used in the loop by another useless phi, in turn being used by an
lshr or udiv.
Fixes PR11350: runaway iteration assertion.
llvm-svn: 144935
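The difference between the two checks can be shown with a toy model. `Kind` and `Instr` below are hypothetical and only mimic the shape of the problem; in LLVM itself a type test of this sort is normally written with `isa<BinaryOperator>` or `dyn_cast<BinaryOperator>`, since `cast<>` asserts instead of testing.

```cpp
#include <cstddef>

// Hypothetical instruction model, not LLVM's class hierarchy.
enum class Kind { Phi, BinaryOp, Other };

struct Instr {
  Kind kind;
  std::size_t numOperands;
};

// The overly broad test: any instruction with two operands passes,
// including a phi with exactly two incoming values.
bool looksBinaryByOperandCount(const Instr &i) { return i.numOperands == 2; }

// The precise test: only genuine binary operations pass.
bool isBinaryOp(const Instr &i) { return i.kind == Kind::BinaryOp; }
```

A two-input phi satisfies the operand-count test but not the kind test, which is how the phi described above slipped through the original check.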
|
asan; add a test check that asan does not touch linkonce_odr
llvm-svn: 144933
|
llvm-svn: 144920
|
vector loads are promoted to i64 vector loads so patterns need a bitconvert. Also slightly simplify the AVX2 variable shift patterns by using the predefined bitconvert pattern fragments.
llvm-svn: 144896
|
llvm-svn: 144888
|
ADDs. MaxOffs is used as a threshold to limit the size of the offset. Tradeoffs
being: (1) If we can't materialize the large constant then we'll cause fast-isel
to bail. (2) Too large of an offset can't be directly encoded in the ADD
resulting in a MOV+ADD. Generally not a bad thing because otherwise we would
have had ADD+ADD, but on Thumb this turns into a MOVS+MOVT+ADD. Working on a fix
for that. (3) Conversely, with too low a threshold we'll miss opportunities to
coalesce ADDs.
rdar://10412592
llvm-svn: 144886
|
llvm-svn: 144885
|
Add a custom name for fwrite and fputs on x86-32 OSX. Make SimplifyLibCalls honor the custom
names for fwrite and fputs.
Fixes <rdar://problem/9815881>.
llvm-svn: 144876
|
rdar://10456186
llvm-svn: 144872
|
I know, and I'd like to see wider testing.
llvm-svn: 144867
|
LOAD+EXTRACT_VECTOR_ELT into a single LOAD. Fixes PR10747/PR11393.
llvm-svn: 144863
|
llvm-svn: 144861
|
We don't (yet) have the granularity in the fixups to be specific about which
bitranges are affected. That's a future cleanup, but we're not there yet.
llvm-svn: 144852
|
llvm-svn: 144849
|
llvm-svn: 144847
|
llvm-svn: 144842
|