| Commit message | Author | Age | Files | Lines |
|
Until now we detected the vectorizable tree and evaluated the cost of the
entire tree. With this patch we can decide to trim out branches of the tree
that are not profitable to vectorize.
Also, increase the max depth from 6 to 12. In the worst possible case, where all
of the code is made of diamond-shaped graphs, this can bring the cost to 2**10,
but diamonds are not very common.
llvm-svn: 184681
|
This makes it possible to write unit tests that are less susceptible
to minor code motion, particularly copy placement. block-placement.ll
covers this case with -pre-RA-sched=source which will soon be
default. One incorrectly named block is already fixed, but without
this fix, enabling new coalescing and scheduling would cause more
failures.
llvm-svn: 184680
|
sequences.
Make sure that we don't replace and RAUW two sequences if one does not dominate the other.
llvm-svn: 184674
|
The RAII builder location guard is saving a reference to instructions, so we can't erase instructions during vectorization.
llvm-svn: 184671
|
ULEB128/SLEB128 generation
llvm-svn: 184669
|
This is an awful implementation of the target hook. But we don't have
abstractions yet for common machine ops, and I don't see any quick way
to make it table-driven.
llvm-svn: 184664
|
llvm-svn: 184660
|
Rewrote the SLP-vectorization as a whole-function vectorization pass. It is now able to vectorize chains across multiple basic blocks.
It still does not vectorize PHIs, but this should be easy to do now that we scan the entire function.
I removed the support for extracting values from trees.
We are now able to vectorize more programs, but there are some serious regressions in many workloads (such as flops-6 and mandel-2).
llvm-svn: 184647
|
and parameter packs
llvm-svn: 184643
|
llvm-svn: 184642
|
argument."
It doesn't work as I intended it to. This reverts commit r184638.
llvm-svn: 184641
|
It has become an expensive operation. No functionality change.
llvm-svn: 184638
|
Although in reality the symbol table in ELF resides in a section, the
standard requires that there be no more than one SHT_SYMTAB. To enforce
this constraint, it is cleaner to group all the symbols under a
top-level `Symbols` key on the object file.
llvm-svn: 184627
|
We have no targets on trunk that bundle before regalloc. However, we
have been advertising regalloc as bundle safe for use with out-of-tree
targets. We need to at least contain the parts of the code that are
still unsafe.
llvm-svn: 184620
|
A FastISel optimization was causing us to emit no information for such
parameters & when they go missing we end up emitting a different
function type. By avoiding that shortcut we not only get types correct
(very important) but also location information (handy) - even if it's
only live at the start of a function & may be clobbered later.
Reviewed/discussion by Evan Cheng & Dan Gohman.
llvm-svn: 184604
|
Thanks to Bill Wendling for pointing this out!
llvm-svn: 184593
|
that have been run through the 'C' pre-processor.
The implementation of SrcMgr.FindLineNumber() is slow, but that is OK as long
as it can use its cache, which works when it is called multiple times with an
SMLoc that is forward of the previous call.
When generating DWARF for assembly source files that have
been run through the 'C' pre-processor, we need to calculate the
logical line number based on the last parsed cpp hash file line
comment. The current code calls SrcMgr.FindLineNumber()
twice to do this, which defeats its cache and results in very
slow compile times:
% time /Volumes/SandBox/build-llvm/Debug+Asserts/bin/llvm-mc -triple thumbv7-apple-ios -filetype=obj -o /tmp/x.o mscorlib.dll.E -g
672.542u 0.299s 11:13.15 99.9% 0+0k 0+2io 2106pf+0w
So we save the info from the last parsed cpp hash file line comment
to avoid making the second call to SrcMgr.FindLineNumber() most of the
time, and end up with compile times like:
% time /Volumes/SandBox/build-llvm/Debug+Asserts/bin/llvm-mc -triple thumbv7-apple-ios -filetype=obj -o /tmp/x.o mscorlib.dll.E -g
3.404u 0.104s 0:03.80 92.1% 0+0k 0+3io 2105pf+0w
rdar://14156934
llvm-svn: 184592
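The cache behaviour described in this message can be illustrated with a small model. This is only a sketch under stated assumptions: `LineFinder` and `totalScanned` are hypothetical names, not the SrcMgr API, and the "cpp hash line comment" is assumed to sit at offset 0 of the buffer. The model counts how many characters are examined when the lookup for the hash comment is repeated before every forward query, versus when its line number is saved after the first lookup.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical stand-in for a line-number lookup with a forward-only cache.
// It is NOT the real SrcMgr implementation, only a model of the behaviour
// the commit message describes.
class LineFinder {
  std::string Buf;
  std::size_t LastPos = 0; // position of the previous query
  unsigned LastLine = 1;   // line number found by the previous query

public:
  std::size_t Scanned = 0; // total characters examined so far

  explicit LineFinder(std::string B) : Buf(std::move(B)) {}

  unsigned findLineNumber(std::size_t Pos) {
    assert(Pos <= Buf.size() && "position out of range");
    std::size_t Start = 0;
    unsigned Line = 1;
    if (Pos >= LastPos) { // cache only helps when moving forward
      Start = LastPos;
      Line = LastLine;
    }
    for (std::size_t I = Start; I < Pos; ++I) {
      ++Scanned;
      if (Buf[I] == '\n')
        ++Line;
    }
    LastPos = Pos;
    LastLine = Line;
    return Line;
  }
};

// Query every line start of a buffer made of Lines two-character lines.
// If SaveHashLine is false, the hash comment at offset 0 is looked up again
// before each query, which resets the cache every time.
std::size_t totalScanned(unsigned Lines, bool SaveHashLine) {
  std::string Buf;
  for (unsigned I = 0; I < Lines; ++I)
    Buf += "x\n";
  LineFinder F(Buf);
  unsigned HashLine = 0;
  bool HaveHashLine = false;
  for (unsigned I = 0; I < Lines; ++I) {
    if (!SaveHashLine || !HaveHashLine) {
      HashLine = F.findLineNumber(0);
      HaveHashLine = true;
    }
    unsigned Cur = F.findLineNumber(2 * static_cast<std::size_t>(I));
    assert(Cur - HashLine == I && "logical line offset from the comment");
  }
  return F.Scanned;
}
```

In this model the interleaved lookups make the total work grow quadratically with the number of lines, while saving the hash comment's line keeps it linear, which is the same shape as the timing difference reported above.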
|
frequency with a branch probability."
This reverts commit r184584. Breaks PPC selfhost.
llvm-svn: 184590
|
PtrState.RRI private and delete the TODO.
llvm-svn: 184587
|
several methods on PtrState.
llvm-svn: 184586
|
a branch probability.
Zero is used by BlockFrequencyInfo as a special "don't know" value. It also
acts as a sink for frequencies, since you can't ever get off a zero frequency
with more multiplies.
This recovers a 10% regression on MultiSource/Benchmarks/7zip. A zero frequency
was propagated into an inner loop causing excessive spilling.
PR16402.
llvm-svn: 184584
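The "sink" effect mentioned in this message can be shown with plain integer arithmetic. The sketch below is an illustration only: `scaleFrequency` and `scaleRepeatedly` are hypothetical helpers, not the BlockFrequency API, and they assume frequencies small enough that the multiplication does not overflow. It makes no claim about how the patch itself avoids the zero value.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: scale an integer frequency by the probability N/D.
// Assumes Freq * N fits in 64 bits.
std::uint64_t scaleFrequency(std::uint64_t Freq, std::uint32_t N,
                             std::uint32_t D) {
  assert(D != 0 && N <= D && "probability must lie in [0, 1]");
  return Freq * N / D; // integer division truncates toward zero
}

// Apply the same probability Steps times in a row, as happens when a
// frequency is propagated along a chain of branches.
std::uint64_t scaleRepeatedly(std::uint64_t Freq, std::uint32_t N,
                              std::uint32_t D, unsigned Steps) {
  for (unsigned I = 0; I < Steps; ++I)
    Freq = scaleFrequency(Freq, N, D);
  return Freq;
}
```

Starting from 16 with probability 1/2, four steps give 1 and the fifth truncates to 0. From then on every further multiply returns 0, even with probability 1/1, so a block reached through such a path looks like it has the "don't know" frequency.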
|
PtrState.IsTrackingImpreciseRelease().
llvm-svn: 184583
|
PtrState.{IsCFGHazardAfflicted,SetCFGHazardAfflicted}.
llvm-svn: 184582
|
IR for CUDA should use "nvptx[64]-nvidia-cuda", and IR for NV OpenCL should use "nvptx[64]-nvidia-nvcl"
llvm-svn: 184579
|
When (srl (anyextend x), c) is folded into (anyextend (srl x, c)), the
high bits are not cleared. Add an 'and' to clear them.
llvm-svn: 184575
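The bit-level effect of this fold can be modelled outside the DAG. This is a sketch under stated assumptions, not the DAGCombiner code: it fixes the types to an 8-bit source and a 32-bit result, all function names are hypothetical, and the `HighBits` parameter stands in for the unspecified upper bits that an any-extend may produce.

```cpp
#include <cassert>
#include <cstdint>

// Model "anyextend" from 8 to 32 bits: the low 8 bits are X, the upper 24
// bits are unspecified. HighBits supplies whatever they happen to contain.
std::uint32_t anyExtend(std::uint8_t X, std::uint32_t HighBits) {
  return (HighBits & 0xFFFFFF00u) | X;
}

// (srl (anyextend x), c): the logical shift guarantees that the top C bits
// of the 32-bit result are zero.
std::uint32_t shiftAfterExtend(std::uint8_t X, unsigned C,
                               std::uint32_t HighBits) {
  assert(C < 8 && "shift amount must fit the narrow type");
  return anyExtend(X, HighBits) >> C;
}

// (anyextend (srl x, c)): the shift happens in 8 bits, so the extension can
// put arbitrary values into the top bits again.
std::uint32_t extendAfterShift(std::uint8_t X, unsigned C,
                               std::uint32_t HighBits) {
  assert(C < 8 && "shift amount must fit the narrow type");
  return anyExtend(static_cast<std::uint8_t>(X >> C), HighBits);
}

// (and (anyextend (srl x, c)), mask): masking with all-ones shifted right
// by C restores the guarantee that the top C bits are zero.
std::uint32_t extendAfterShiftMasked(std::uint8_t X, unsigned C,
                                     std::uint32_t HighBits) {
  std::uint32_t Mask = 0xFFFFFFFFu >> C;
  return extendAfterShift(X, C, HighBits) & Mask;
}
```

With X = 0xF0, C = 4 and all unspecified bits set, the original form has its top four bits clear, the folded form without the mask has them set, and the masked form clears them again while keeping the same low four bits.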
|
llvm-svn: 184574
|
llvm-svn: 184573
|
Live intervals for dead physregs may be created during coalescing. We
need to update these in the event that their instruction goes away.
crash.ll is the unit test that catches it when MI sched is enabled on
X86.
llvm-svn: 184572
|
I want to add logic to handle more cases.
llvm-svn: 184571
|
llvm-svn: 184570
|
llvm-svn: 184569
|
Always coalesce in forward order to propagate rematerialization.
I'm fixing this option so I can enable it by default soon.
llvm-svn: 184568
|
llvm-svn: 184567
|
llvm-svn: 184566
|
llvm-svn: 184565
|
llvm-svn: 184564
|
The GNU assembler supports (as an extension to the ABI) use of PC-relative
relocations in half16 fields, which allows writing code like:
li 1, base-.
This patch adds support for those relocation types in the assembler.
llvm-svn: 184552
|
The current code base only supports the minimum set of TLS-related
relocations and @modifiers that are necessary to support
compiler-generated code. This patch extends this to the full set defined
in the ABI (and supported by the GNU assembler) for the benefit
of the assembler parser.
llvm-svn: 184551
|
This adds support for the @higher, @highera, @highest, and @highesta
modifiers, including some missing relocation types.
llvm-svn: 184550
|
This adds the relocation type and other necessary infrastructure
to use the @toc@h modifier in the assembler.
llvm-svn: 184549
|
This adds necessary infrastructure to support the @h modifier.
Note that all required relocation types were already present
(and unused).
This patch provides support for using @h in the assembler;
it would also be possible to now use this feature in code
generated by the compiler, but this is not done yet.
llvm-svn: 184548
|
This renames more VK_PPC_ enums, to make them more closely reflect
the @modifier string they represent. This also prepares for adding
a bunch of new VK_PPC_ enums in upcoming patches.
For consistency, some MO_ flags related to VK_PPC_ enums are
likewise renamed.
No change in behaviour.
llvm-svn: 184547
|
PtrState.GetReleaseMetadata() and PtrState.SetReleaseMetadata().
llvm-svn: 184534
|
PtrState.IsTailCallRelease() and PtrState.SetTailCallRelease().
llvm-svn: 184533
|
PtrState.IsKnownSafe and PtrState.SetKnownSafe.
This is a part of a series of patches to encapsulate PtrState.RRI and
make PtrState.RRI a private field of PtrState.
*NOTE* This is actually the second commit in the patch stream. I should
have put this note on the first such commit r184528.
llvm-svn: 184532
|
llvm-svn: 184531
|
RRInfo::Merge.
I also added some comments and performed minor code cleanups.
llvm-svn: 184528
|
vector-register size.
llvm-svn: 184527
|
definitions (& rename the 'fwd' tag to 'decl' for clarity)
This change is version locked with a change in Clang, so expect some
transient buildbot fallout.
llvm-svn: 184525
|
Instead, just have 3 sub-lists, one for each of
{STB_LOCAL,STB_GLOBAL,STB_WEAK}.
This allows us to be a lot more explicit w.r.t. the symbol ordering in
the object file, because if we allowed explicitly setting the STB_*
`Binding` key for the symbol, then we might have ended up having to
shuffle STB_LOCAL symbols to the front of the list, which is likely to
cause confusion and potential for error.
Also, this new approach is simpler ;)
llvm-svn: 184506
|