|  | Commit message (Collapse) | Author | Age | Files | Lines | 
|---|
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | template
The core of the change is supposed to be NFC, however it also fixes
what I believe was an undefined behavior when calling:
 va_start(ValueArgs, Desc);
with Desc being a StringRef.
Differential Revision: https://reviews.llvm.org/D25342
llvm-svn: 283671 | 
| | 
| 
| 
| | llvm-svn: 280257 | 
| | 
| 
| 
| 
| 
| 
| 
| | This should finish the GraphTraits migration.
Differential Revision: http://reviews.llvm.org/D23730
llvm-svn: 279475 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | Currently nodes_iterator may dereference to a NodeType* or a NodeType&. Make them all dereference to NodeType*, which is NodeRef later.
Differential Revision: https://reviews.llvm.org/D23704
Differential Revision: https://reviews.llvm.org/D23705
llvm-svn: 279326 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
This is part of the "NodeType* -> NodeRef" migration. Notice that since
GraphWriter prints object address as identity, I added a static_assert on
NodeRef to be a pointer type.
Reviewers: dblaikie
Subscribers: llvm-commits, MatzeB
Differential Revision: https://reviews.llvm.org/D23580
llvm-svn: 278966 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Added ability to estimate the entry count of the extracted function and
the branch probabilities of the exit branches.
Patch by River Riddle!
Differential Revision: https://reviews.llvm.org/D22744
llvm-svn: 277411 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | They seem to trigger an LSan failure:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/15140/steps/check-llvm%20asan/logs/stdio
Revert "Add the tests for r277313"
This reverts commit r277314.
Revert "CodeExtractor : Add ability to preserve profile data."
This reverts commit r277313.
llvm-svn: 277317 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Added ability to estimate the entry count of the extracted function and
the branch probabilities of the exit branches.
Patch by River Riddle!
Differential Revision: https://reviews.llvm.org/D22744
llvm-svn: 277313 | 
| | 
| 
| 
| | llvm-svn: 274007 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | This patch enhances dot graph viewer to show hot regions
with hot bbs/edges displayed in red. The ratio of the bb
freq to the max freq of the function needs to be no less
than the value specified by view-hot-freq-percent option.
The default value is 10 (i.e. 10%).
llvm-svn: 273996 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | MBFI supports profile count dumping and function
name based filtering. Add these two feature to
BFI as well. The filtering option is shared between
BFI and MBFI: -view-bfi-func-name=..
llvm-svn: 273992 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | BFI and MBFI's dot traits class share most of the
code and all future enhancement. This patch extracts
common implementation into base class BFIDOTGraphTraitsBase.
This patch also enables BFI graph to show branch probability
on edges as MBFI does before.
llvm-svn: 273990 | 
| | 
| 
| 
| 
| 
| 
| 
| | Expose getBPI interface from BFI impl and use
it in graph viewer. This eliminates the dependency
on old PM interface.
llvm-svn: 273967 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | -view-machine-block-freq-propagation-dags currently
support integer and fraction as the suboptions. This
patch adds the 'count' suboption to display actual
profile count if available.
llvm-svn: 273460 | 
| | 
| 
| 
| 
| 
| | Differential Revision:  http://reviews.llvm.org/D21596
llvm-svn: 273430 | 
| | 
| 
| 
| | llvm-svn: 273366 | 
| | 
| 
| 
| | llvm-svn: 273335 | 
| | 
| 
| 
| | llvm-svn: 249880 | 
| | 
| 
| 
| 
| 
| 
| 
| | pointers to references.
http://reviews.llvm.org/D11196
llvm-svn: 242322 | 
| | 
| 
| 
| 
| 
| | Temporarily back out commits r211749, r211752 and r211754.
llvm-svn: 211814 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | string_ostream is a safe and efficient string builder that combines opaque
stack storage with a built-in ostream interface.
small_string_ostream<bytes> additionally permits an explicit stack storage size
other than the default 128 bytes to be provided. Beyond that, storage is
transferred to the heap.
This convenient class can be used in most places an
std::string+raw_string_ostream pair or SmallString<>+raw_svector_ostream pair
would previously have been used, in order to guarantee consistent access
without byte truncation.
The patch also converts much of LLVM to use the new facility. These changes
include several probable bug fixes for truncated output, a programming error
that's no longer possible with the new interface.
llvm-svn: 211749 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | define below all header includes in the lib/CodeGen/... tree. While the
current modules implementation doesn't check for this kind of ODR
violation yet, it is likely to grow support for it in the future. It
also removes one layer of macro pollution across all the included
headers.
Other sub-trees will follow.
llvm-svn: 206837 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | This reverts commit r206707, reapplying r206704.  The preceding commit
to CalcSpillWeights should have sorted out the failing buildbots.
<rdar://problem/14292693>
llvm-svn: 206766 | 
| | 
| 
| 
| 
| 
| | This reverts commit r206704, as expected.
llvm-svn: 206707 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This reverts commit r206677, reapplying my BlockFrequencyInfo rewrite.
I've done a careful audit, added some asserts, and fixed a couple of
bugs (unfortunately, they were in unlikely code paths).  There's a small
chance that this will appease the failing bots [1][2].  (If so, great!)
If not, I have a follow-up commit ready that will temporarily add
-debug-only=block-freq to the two failing tests, allowing me to compare
the code path between what the failing bots and what my machines (and
the rest of the bots) are doing.  Once I've triggered those builds, I'll
revert both commits so the bots go green again.
[1]: http://bb.pgr.jp/builders/ninja-x64-msvc-RA-centos6/builds/1816
[2]: http://llvm-amd64.freebsd.your.org/b/builders/clang-i386-freebsd/builds/18445
<rdar://problem/14292693>
llvm-svn: 206704 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This reverts commit r206666, as planned.
Still stumped on why the bots are failing.  Sanitizer bots haven't
turned anything up.  If anyone can help me debug either of the failures
(referenced in r206666) I'll owe them a beer.  (In the meantime, I'll be
auditing my patch for undefined behaviour.)
llvm-svn: 206677 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This reverts commit r206628, reapplying r206622 (and r206626).
Two tests are failing only on buildbots [1][2]: i.e., I can't reproduce
on Darwin, and Chandler can't reproduce on Linux.  Asan and valgrind
don't tell us anything, but we're hoping the msan bot will catch it.
So, I'm applying this again to get more feedback from the bots.  I'll
leave it in long enough to trigger builds in at least the sanitizer
buildbots (it was failing for reasons unrelated to my commit last time
it was in), and hopefully a few others.... and then I expect to revert a
third time.
[1]: http://bb.pgr.jp/builders/ninja-x64-msvc-RA-centos6/builds/1816
[2]: http://llvm-amd64.freebsd.your.org/b/builders/clang-i386-freebsd/builds/18445
llvm-svn: 206666 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | This reverts commit r206622 and the MSVC fixup in r206626.
Apparently the remotely failing tests are still failing, despite my
attempt to fix the nondeterminism in r206621.
llvm-svn: 206628 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This reverts commit r206556, effectively reapplying commit r206548 and
its fixups in r206549 and r206550.
In an intervening commit I've added target triples to the tests that
were failing remotely [1] (but passing locally).  I'm hoping the mystery
is solved?  I'll revert this again if the tests are still failing
remotely.
[1]: http://bb.pgr.jp/builders/ninja-x64-msvc-RA-centos6/builds/1816
llvm-svn: 206622 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This reverts commits r206548, r206549 and r206549.
There are some unit tests failing that aren't failing locally [1], so
reverting until I have time to investigate.
[1]: http://bb.pgr.jp/builders/ninja-x64-msvc-RA-centos6/builds/1816
llvm-svn: 206556 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Rewrite the shared implementation of BlockFrequencyInfo and
MachineBlockFrequencyInfo entirely.
The old implementation had a fundamental flaw:  precision losses from
nested loops (or very wide branches) compounded past loop exits (and
convergence points).
The @nested_loops testcase at the end of
test/Analysis/BlockFrequencyAnalysis/basic.ll is motivating.  This
function has three nested loops, with branch weights in the loop headers
of 1:4000 (exit:continue).  The old analysis gives non-sensical results:
    Printing analysis 'Block Frequency Analysis' for function 'nested_loops':
    ---- Block Freqs ----
     entry = 1.0
     for.cond1.preheader = 1.00103
     for.cond4.preheader = 5.5222
     for.body6 = 18095.19995
     for.inc8 = 4.52264
     for.inc11 = 0.00109
     for.end13 = 0.0
The new analysis gives correct results:
    Printing analysis 'Block Frequency Analysis' for function 'nested_loops':
    block-frequency-info: nested_loops
     - entry: float = 1.0, int = 8
     - for.cond1.preheader: float = 4001.0, int = 32007
     - for.cond4.preheader: float = 16008001.0, int = 128064007
     - for.body6: float = 64048012001.0, int = 512384096007
     - for.inc8: float = 16008001.0, int = 128064007
     - for.inc11: float = 4001.0, int = 32007
     - for.end13: float = 1.0, int = 8
Most importantly, the frequency leaving each loop matches the frequency
entering it.
The new algorithm leverages BlockMass and PositiveFloat to maintain
precision, separates "probability mass distribution" from "loop
scaling", and uses dithering to eliminate probability mass loss.  I have
unit tests for these types out of tree, but it was decided in the review
to make the classes private to BlockFrequencyInfoImpl, and try to shrink
them (or remove them entirely) in follow-up commits.
The new algorithm should generally have a complexity advantage over the
old.  The previous algorithm was quadratic in the worst case.  The new
algorithm is still worst-case quadratic in the presence of irreducible
control flow, but it's linear without it.
The key difference between the old algorithm and the new is that control
flow within a loop is evaluated separately from control flow outside,
limiting propagation of precision problems and allowing loop scale to be
calculated independently of mass distribution.  Loops are visited
bottom-up, their loop scales are calculated, and they are replaced by
pseudo-nodes.  Mass is then distributed through the function, which is
now a DAG.  Finally, loops are revisited top-down to multiply through
the loop scales and the masses distributed to pseudo nodes.
There are some remaining flaws.
  - Irreducible control flow isn't modelled correctly.  LoopInfo and
    MachineLoopInfo ignore irreducible edges, so this algorithm will
    fail to scale accordingly.  There's a note in the class
    documentation about how to get closer.  See also the comments in
    test/Analysis/BlockFrequencyInfo/irreducible.ll.
  - Loop scale is limited to 4096 per loop (2^12) to avoid exhausting
    the 64-bit integer precision used downstream.
  - The "bias" calculation proposed on llvmdev is *not* incorporated
    here.  This will be added in a follow-up commit, once comments from
    this review have been handled.
llvm-svn: 206548 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This is a shared implementation class for BlockFrequencyInfo and
MachineBlockFrequencyInfo, not for BlockFrequency, a related (but
distinct) class.
No functionality change.
<rdar://problem/14292693>
llvm-svn: 206083 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | Implement Pass::releaseMemory() in BlockFrequencyInfo and
MachineBlockFrequencyInfo.  Just delete the private implementation when
not in use.  Switch to a std::unique_ptr to make the logic more clear.
<rdar://problem/14292693>
llvm-svn: 204741 | 
| | 
| 
| 
| 
| 
| | <rdar://problem/14292693>
llvm-svn: 204740 | 
| | 
| 
| 
| 
| 
| | getBlockFreq() in all *BlockFrequencyInfo*.
llvm-svn: 197304 | 
| | 
| 
| 
| 
| 
| | new print methods.
llvm-svn: 197289 | 
| | 
| 
| 
| 
| 
| | BlockFrequencyInfo that were added to BlockFrequencyImpl in r197285 and r197284.
llvm-svn: 197287 | 
| | 
| 
| 
| | llvm-svn: 196310 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | propagation graph via graphviz.
This is useful for debugging issues in the BlockFrequency implementation
since one can easily visualize where probability mass and other errors
occur in the propagation.
This is the MI version of r194654.
llvm-svn: 196183 | 
| | 
| 
| 
| 
| 
| 
| | This is a band-aid to fix the most severe regressions we're seeing from basing
spill decisions on block frequencies, until we have a better solution.
llvm-svn: 184835 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Sooooo many of these had incorrect or strange main module includes.
I have manually inspected all of these, and fixed the main module
include to be the nearest plausible thing I could find. If you own or
care about any of these source files, I encourage you to take some time
and check that these edits were sensible. I can't have broken anything
(I strictly added headers, and reordered them, never removed), but they
may not be the headers you'd really like to identify as containing the
API being implemented.
Many forward declarations and missing includes were added to a header
files to allow them to parse cleanly when included first. The main
module rule does in fact have its merits. =]
llvm-svn: 169131 | 
| | 
| 
| 
| | llvm-svn: 146986 | 
| | 
| 
| 
| | llvm-svn: 136816 | 
| | 
| 
| 
| | llvm-svn: 136278 | 
|  | MachineBlockFrequencyInfo.
llvm-svn: 135937 |