| Commit message (Collapse) | Author | Age | Files | Lines |
| |
Remove the concepts of "forward" and "general" mass distributions, which
were wrong. The split might have made sense in an early version of the
algorithm, but it's definitely wrong now.
<rdar://problem/14292693>
llvm-svn: 207195
| |
<rdar://problem/14292693>
llvm-svn: 207194
| |
<rdar://problem/14292693>
llvm-svn: 207191
| |
Rather than scaling loop headers and then scaling all the loop members
by the header frequency, scale `LoopData::Scale` itself, and scale the
loop members by it. It's much more obvious what's going on this way,
and doesn't cost any extra multiplies.
<rdar://problem/14292693>
llvm-svn: 207189
| |
<rdar://problem/14292693>
llvm-svn: 207188
| |
<rdar://problem/14292693>
llvm-svn: 207187
| |
When unwrapping loops, just visit the loops rather than all nodes.
<rdar://problem/14292693>
llvm-svn: 207186
| |
<rdar://problem/14292693>
llvm-svn: 207185
| |
Make `getPackagedNode()` a member function of
`BlockFrequencyInfoImplBase` so that it's available for templated code.
<rdar://problem/14292693>
llvm-svn: 207183
| |
<rdar://problem/14292693>
llvm-svn: 207182
| |
<rdar://problem/14292693>
llvm-svn: 207181
| |
Instead of passing around loop headers, pass around `LoopData` directly.
<rdar://problem/14292693>
llvm-svn: 207179
| |
As pointed out by David Blaikie in code review, a `std::list<T>` is
simpler than a `std::vector<std::unique_ptr<T>>`. Another option is a
`std::deque<T>` (which allocates in chunks), but I'd like to leave open
the option of inserting in the middle of the sequence for handling
irreducible control flow on the fly.
<rdar://problem/14292693>
llvm-svn: 207177
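The container trade-off described above can be illustrated with a small sketch. `LoopData` here is a hypothetical one-field stand-in and `insertLoopBefore` is an illustrative helper, not a function from the patch. It relies only on the standard guarantee that inserting into a `std::list` leaves existing elements, and pointers to them, untouched.

```cpp
#include <cassert>
#include <iterator>
#include <list>

// Hypothetical, simplified stand-in for the real LoopData.
struct LoopData {
  int Header;
};

// Inserts a loop before Pos and returns a pointer to the new element.
// std::list never relocates existing elements, so pointers taken from
// earlier insertions stay valid, including after mid-sequence inserts.
LoopData *insertLoopBefore(std::list<LoopData> &Loops,
                           std::list<LoopData>::iterator Pos, int Header) {
  return &*Loops.insert(Pos, LoopData{Header});
}
```

A `std::vector<std::unique_ptr<LoopData>>` gives the same pointer stability, but at the cost of an extra indirection and explicit ownership handling at every insertion.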
| |
This patch adds support for vectorization of bit intrinsics such as bswap, ctpop, ctlz, and cttz.
llvm-svn: 207174
| |
llvm-svn: 207172
| |
iteration.
When fixing the symbols in each compressed section we were iterating
over all symbols for each compressed section. In extreme cases this
could snowball severely (5min uncompressed -> 35min compressed), because
DWARF type units can generate large numbers of compressed sections.
To address this, build a map of the symbols in each section ahead of
time, and access that map if a section is being compressed. This brings
compile time for the aforementioned example down to ~6 minutes.
llvm-svn: 207167
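The fix described above amounts to grouping symbols by section once, then looking up each compressed section's symbols directly. A minimal sketch of that idea, assuming a hypothetical `Symbol` struct and integer section ids in place of the real MC classes:

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the real symbol type.
struct Symbol {
  std::string Name;
  int SectionId; // id of the section that defines this symbol
};

using SymbolsBySection =
    std::unordered_map<int, std::vector<const Symbol *>>;

// Single pass over all symbols, done once up front. The pointers refer
// into Symbols, which must outlive the returned map.
SymbolsBySection groupSymbolsBySection(const std::vector<Symbol> &Symbols) {
  SymbolsBySection Map;
  for (const Symbol &S : Symbols)
    Map[S.SectionId].push_back(&S);
  return Map;
}

// Returns the symbols defined in SectionId, or an empty list if none.
std::vector<const Symbol *> symbolsIn(const SymbolsBySection &Map,
                                      int SectionId) {
  auto It = Map.find(SectionId);
  if (It == Map.end())
    return {};
  return It->second;
}
```

With the map in place, the work per compressed section is proportional to the symbols in that section rather than to all symbols in the file.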
| |
check for"
Typo in testcase.
llvm-svn: 207166
| |
AllocaInst that was missing in one location.
Debug info for optimized code: Support variables that are on the stack and
described by DBG_VALUEs during their lifetime.
Previously, when a variable was at a FrameIndex for any part of its
lifetime, this would shadow all other DBG_VALUEs and only a single
fbreg location would be emitted, which in fact is only valid for a small
range and not the entire lexical scope of the variable. The included
dbg-value-const-byref testcase demonstrates this.
This patch fixes this by
Local
- emitting dbg.value intrinsics for allocas that are passed by reference
- dropping all dbg.declares (they are now fully lowered to dbg.values)
SelectionDAG
- renamed constructors for SDDbgValue for better readability.
- fix UserValue::match() to handle indirect values correctly
- not inserting MMI table entries for dbg.values that describe allocas.
- lowering dbg.values that describe allocas into *indirect* DBG_VALUEs.
CodeGenPrepare
- leaving dbg.values for an alloca where they are (see comment)
Other
- regenerated/updated instcombine.ll testcase and included source
rdar://problem/16679879
http://reviews.llvm.org/D3374
llvm-svn: 207165
| |
stack and"
This reverts commit 207130 for buildbot breakage.
llvm-svn: 207162
| |
location"
This reverts commit 207130 for buildbot breakage.
llvm-svn: 207159
| |
llvm-svn: 207156
| |
section. Remove the former.
llvm-svn: 207153
| |
llvm-svn: 207151
| |
condition into an obviously infinite loop with an assert about the
degenerate condition. No functionality changed.
llvm-svn: 207147
| |
Should unbreak MultiSource/Benchmarks/mediabench/g721/g721encode/encode.
llvm-svn: 207145
| |
This is similar to the 'tail' marker, except that it guarantees that
tail call optimization will occur. It also comes with conservative IR
verification rules that ensure that tail call optimization is possible.
Reviewers: nicholas
Differential Revision: http://llvm-reviews.chandlerc.com/D3240
llvm-svn: 207143
| |
appease the buildbots.
llvm-svn: 207136
| |
of the dbg.value. This gets rid of tons of redundant variable DIEs in
subscopes.
rdar://problem/14874886, rdar://problem/16679936
llvm-svn: 207135
| |
rather than by adding an overload and hoping that it's declared before the code
that calls it. (In a modules build, it isn't.)
llvm-svn: 207133
| |
described by DBG_VALUEs during their lifetime.
Previously, when a variable was at a FrameIndex for any part of its
lifetime, this would shadow all other DBG_VALUEs and only a single
fbreg location would be emitted, which in fact is only valid for a small
range and not the entire lexical scope of the variable. The included
dbg-value-const-byref testcase demonstrates this.
This patch fixes this by
Local
- emitting dbg.value intrinsics for allocas that are passed by reference
- dropping all dbg.declares (they are now fully lowered to dbg.values)
SelectionDAG
- renamed constructors for SDDbgValue for better readability.
- fix UserValue::match() to handle indirect values correctly
- not inserting MMI table entries for dbg.values that describe allocas.
- lowering dbg.values that describe allocas into *indirect* DBG_VALUEs.
CodeGenPrepare
- leaving dbg.values for an alloca where they are (see comment)
Other
- regenerated/updated instcombine-intrinsics testcase and included source
rdar://problem/16679879
http://reviews.llvm.org/D3374
llvm-svn: 207130
| |
This patch:
- Adds two new X86 builtin intrinsics ('int_x86_rdtsc' and
'int_x86_rdtscp') as GCCBuiltin intrinsics;
- Teaches the backend how to lower the two new builtins;
- Introduces a common function to lower READCYCLECOUNTER dag nodes
and the two new rdtsc/rdtscp intrinsics;
- Improves (and extends) the existing x86 test 'rdtsc.ll'; now test 'rdtsc.ll'
correctly verifies that both READCYCLECOUNTER and the two new intrinsics
work fine for both 64bit and 32bit Subtargets.
llvm-svn: 207127
| |
llvm-svn: 207126
| |
I discovered this const-hole while attempting to coalesce the Symbol
and SymbolMap data structures. There are some pending issues with that,
but I figured this change was easy to flush early.
llvm-svn: 207124
| |
llvm-svn: 207123
| |
Leak identified by LSan and reported by Kostya Serebryany.
Let's get a bit experimental here... in theory our minimum compiler
versions support unordered_map.
llvm-svn: 207118
| |
This matches ARM64 behaviour, which I think is clearer. It also puts all the
churn from that difference into one easily ignored commit.
llvm-svn: 207116
| |
Patch by Yuri Gorshenin.
llvm-svn: 207115
| |
llvm-svn: 207111
| |
llvm-svn: 207110
| |
llvm-svn: 207109
| |
We only need assembly support, so it's fairly easy.
llvm-svn: 207108
| |
llvm-svn: 207106
| |
These can have different relocations in ELF. In particular both:
b.eq global
ldr x0, global
are valid, giving different relocations. The only possible way to distinguish
them is via a different fixup, so the operands had to be separated throughout
the backend.
llvm-svn: 207105
| |
ARM64 was not producing pure BFI instructions for bitfield insertion
operations, unlike AArch64. The approach had to be a little different (in
ISelDAGToDAG rather than ISelLowering), and the outcomes aren't identical but
hopefully this gives it similar power.
This should address PR19424.
llvm-svn: 207102
| |
algorithm here: http://dl.acm.org/citation.cfm?id=177301.
The idea of isolating the roots has even more relevance when using the
stack not just to implement the DFS but also to implement the recursive
step. Because we use it for the recursive step, to isolate the roots we
need to maintain two stacks: one for our recursive DFS walk, and another
of the nodes that have been walked. The nice thing is that the latter
will be half the size. It also fixes a complete hack where we scanned
backwards over the stack to find the next potential-root to continue
processing. Now that is always the top of the DFS stack.
While this is a really nice improvement already (IMO) it further opens
the door for two important simplifications:
1) De-duplicating some of the code across the two different walks. I've
actually made the duplication a bit worse in some senses with this
patch because the two are starting to converge.
2) Dramatically simplifying the loop structures of both walks.
I wanted to do those separately as they'll be essentially *just* CFG
restructuring. This patch on the other hand actually uses different
datastructures to implement the algorithm itself.
llvm-svn: 207098
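The two-stack arrangement described above can be sketched on a plain adjacency list. This is an illustration under assumptions, not the code from the patch: `findSCCs`, `DFSStack`, and `Pending` are hypothetical names, and nodes are plain integers. One stack drives the DFS walk; the other holds only nodes that have been fully walked without turning out to be an SCC root.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Sketch only: iterative SCC search. Returns SCCs in completion order.
std::vector<std::vector<int>>
findSCCs(const std::vector<std::vector<int>> &Adj) {
  const int N = static_cast<int>(Adj.size());
  std::vector<int> Index(N, -1), LowLink(N, 0);
  std::vector<bool> InSCC(N, false);
  std::vector<std::pair<int, std::size_t>> DFSStack; // (node, next edge)
  std::vector<int> Pending; // fully walked nodes that are not roots
  std::vector<std::vector<int>> SCCs;
  int NextIndex = 0;

  for (int Root = 0; Root < N; ++Root) {
    if (Index[Root] != -1)
      continue; // already reached from an earlier root
    Index[Root] = LowLink[Root] = NextIndex++;
    DFSStack.push_back({Root, 0});

    while (!DFSStack.empty()) {
      int Node = DFSStack.back().first;
      std::size_t Edge = DFSStack.back().second;

      if (Edge < Adj[Node].size()) {
        DFSStack.back().second = Edge + 1;
        int Child = Adj[Node][Edge];
        if (Index[Child] == -1) {
          // Unvisited child: this push is the "recursive" step.
          Index[Child] = LowLink[Child] = NextIndex++;
          DFSStack.push_back({Child, 0});
        } else if (!InSCC[Child]) {
          // Visited but not yet assigned: same SCC candidate.
          LowLink[Node] = std::min(LowLink[Node], Index[Child]);
        }
        continue;
      }

      // All edges of Node are walked; the top of DFSStack is always the
      // next potential root, so no backwards scan is needed.
      DFSStack.pop_back();
      if (LowLink[Node] == Index[Node]) {
        // Node is a root: claim every pending node discovered after it.
        std::vector<int> SCC{Node};
        InSCC[Node] = true;
        while (!Pending.empty() && Index[Pending.back()] > Index[Node]) {
          SCC.push_back(Pending.back());
          InSCC[Pending.back()] = true;
          Pending.pop_back();
        }
        SCCs.push_back(std::move(SCC));
      } else {
        Pending.push_back(Node);
      }
      if (!DFSStack.empty()) {
        int Parent = DFSStack.back().first;
        LowLink[Parent] = std::min(LowLink[Parent], LowLink[Node]);
      }
    }
    assert(Pending.empty() && "a DFS tree root always closes its SCC");
  }
  return SCCs;
}
```

Because roots never enter `Pending`, that stack stays smaller than the single node stack of the textbook formulation, which is consistent with the size remark in the commit message.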
| |
applied prior to pushing a node onto the DFSStack. This is the first
step toward avoiding the stack entirely for leaf nodes. It also
simplifies things a bit and I think is pointing the way toward factoring
some more of the shared logic out of the two implementations.
It is also making it more obvious how to restructure the loops
themselves to be a bit easier to read (although no different in terms of
functionality).
llvm-svn: 207095
|
| |
Patch by Yuri Gorshenin.
llvm-svn: 207092
| |
a SmallPtrSet. Currently, there is no need for stable iteration in this
dimension, and I now think there won't need to be going forward.
If this is ever re-introduced in any form, it needs to not be
a SetVector based solution because removal cannot be linear. There will
be many SCCs with large numbers of parents. When encountering these, the
incremental SCC update for intra-SCC edge removal was quadratic due to
linear removal (kind of).
I'm really hoping we can avoid having an ordering property here at all
though...
llvm-svn: 207091
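The cost difference behind this change can be sketched with standard containers. As an assumption for illustration, `std::vector` stands in for an insertion-ordered, SetVector-like container and `std::unordered_set` for an unordered pointer set; `eraseOrdered` and `eraseUnordered` are hypothetical helpers, and integers replace the SCC pointers.

```cpp
#include <algorithm>
#include <cassert>
#include <unordered_set>
#include <vector>

// Ordered stand-in: finding the element is a linear scan, and erasing it
// shifts every later element down by one. Repeating this for many parents
// is what makes the overall update quadratic.
bool eraseOrdered(std::vector<int> &Parents, int P) {
  auto It = std::find(Parents.begin(), Parents.end(), P);
  if (It == Parents.end())
    return false;
  Parents.erase(It);
  return true;
}

// Unordered stand-in: erase is constant time on average, at the price of
// having no stable iteration order.
bool eraseUnordered(std::unordered_set<int> &Parents, int P) {
  return Parents.erase(P) > 0;
}
```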
| |
can use the node -> SCC mapping in the top-level graph to test this on
the rare occasions we need it.
llvm-svn: 207090
| |
unused.
This allows us to compile
return (mask & 0x8 ? a : b);
into
testb $8, %dil
cmovnel %edx, %esi
instead of
andl $8, %edi
shrl $3, %edi
cmovnel %edx, %esi
which we formed previously because dag combiner canonicalizes setcc of and into shift.
llvm-svn: 207088