| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
infrastructure to do promotion without a domtree the same smarts about
looking through GEPs, bitcasts, etc., that I just taught mem2reg about.
This way, if SROA chooses to promote an alloca which still has some
noisy instructions this code can cope with them.
I've not used as principled of an approach here for two reasons:
1) This code doesn't really need it as we were already set up to zip
through the instructions used by the alloca.
2) I view the code here as more of a hack, and hopefully a temporary one.
The SSAUpdater path in SROA is a real sore point for me. It doesn't make
a lot of architectural sense for many reasons:
- We're likely to end up needing the domtree anyways in a subsequent
pass, so why not compute it earlier and use it.
- In the future we'll likely end up needing the domtree for parts of the
inliner itself.
- If we need to we could teach the inliner to preserve the domtree. Part
of the re-work of the pass manager will allow this to be very powerful
even in large SCCs with many functions.
- Ultimately, computing a domtree has gotten significantly faster since
the original SSAUpdater-using code went into ScalarRepl. We no longer
use domfrontiers, and much of domtree is lazily done based on queries
rather than eagerly.
- At this point keeping the SSAUpdater-based promotion saves a total of
0.7% on a build of the 'opt' tool for me. That's not a lot of
performance given the complexity!
So I'm leaving this a bit ugly in the hope that eventually we just
remove all of this nonsense.
I can't even readily test this because this code isn't reachable except
through SROA. When I re-instate the patch that fast-tracks allocas
already suitable for promotion, I'll add a testcase there that failed
before this change. Before that, SROA will fix any test case I give it.
llvm-svn: 187347
|
|
|
|
| |
llvm-svn: 187340
|
|
|
|
| |
llvm-svn: 187336
|
|
|
|
|
|
|
|
|
|
|
|
| |
standards for LLVM. Remove duplicated comments on the interface from the
implementation file (implementation comments are left there of course).
Also clean up, re-word, and fix a few typos and errors in the commenst
spotted along the way.
This is in preparation for changes to these files and to keep the
uninteresting tidying in a separate commit.
llvm-svn: 187335
|
|
|
|
|
|
| |
I forgot that we had two totally independent things here. :: sigh ::
llvm-svn: 187327
|
|
|
|
|
|
|
| |
Added 512-bit operands printing.
Added instruction formats for KNL instructions.
llvm-svn: 187324
|
|
|
|
|
|
|
|
|
|
|
|
| |
uses of an alloca, we can pre-compute promotability while analyzing an
alloca for splitting in SROA. That lets us short-circuit the common case
of a bunch of trivially promotable allocas. This cuts 20% to 30% off the
run time of SROA for typical frontend-generated IR sequneces I'm seeing.
It gets the new SROA to within 20% of ScalarRepl for such code. My
current benchmark for these numbers is PR15412, but it fits the general
pattern of IR emitted by Clang so it should be widely applicable.
llvm-svn: 187323
|
|
|
|
|
|
|
| |
useful in a subsequent patch, but causes an unfortunate amount of noise,
so I pulled it out into a separate patch.
llvm-svn: 187322
|
|
|
|
| |
llvm-svn: 187320
|
|
|
|
| |
llvm-svn: 187319
|
|
|
|
|
|
|
|
|
| |
The tests !defined(__ppc__) && !defined(__powerpc__) are not needed
or helpful when verifying that code is being compiled for a 64-bit
target. The simpler test provided by this revision is sufficient to
tell if the target is 64-bit.
llvm-svn: 187318
|
|
|
|
| |
llvm-svn: 187316
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
IEEE-754R 1.4 Exclusions states that IEEE-754R does not specify the
interpretation of the sign of NaNs. In order to remove an irrelevant
variable that most floating point implementations do not use,
standardize add, sub, mul, div, mod so that operating anything with
NaN always yields a positive NaN.
In a later commit I am going to update the APIs for creating NaNs so
that one can not even create a negative NaN.
llvm-svn: 187314
|
|
|
|
|
|
|
|
| |
Zeroing the significand of a floating point number does not necessarily cause a
floating point number to become finite non zero. For instance, if one has a NaN,
zeroing the significand will cause it to become +/- infinity.
llvm-svn: 187313
|
|
|
|
| |
llvm-svn: 187309
|
|
|
|
|
|
| |
This makes LLVM emit the same signature regardless of host and target endianess.
llvm-svn: 187304
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
do in the SDag when lowering references to the GOT: use
ARMConstantPoolSymbol rather than creating a dummy global variable. The
computation of the alignment still feels weird (it uses IR types and
datalayout) but it preserves the exact previous behavior. This change
fixes the memory leak of the global variable detected on the valgrind
leak checking bot.
Thanks to Benjamin Kramer for pointing me at ARMConstantPoolSymbol to
handle this use case.
llvm-svn: 187303
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
me) should start watching this bot more as its catching lots of bugs.
The fix here is to not construct the global if we aren't going to need
it. That's cheaper anyways, and globals have highly predictable types in
practice. I've added an assert to catch skew between our manual testing
of the type and the actual type just for paranoia's sake.
Note that this pattern is actually fine in most globals because when you
build a global with a module it automatically is moved to be owned by
that module. But here, we're in isel and don't really want to do that.
The solution of not creating a global is simpler anyways.
llvm-svn: 187302
|
|
|
|
|
|
|
|
| |
There doesn't appear to be any reason to put this variable on the heap.
I'm suspicious of the LexicalScope above that we stuff in a map and then
delete afterward, but I'm just trying to get the valgrind bot clean.
llvm-svn: 187301
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
than once, and the second time through we leaked memory. Found thanks to
the vg-leak bot, but I can't locally reproduce it with valgrind. The
debugger confirms that it is in fact leaking here.
This whole code is totally gross. Why is initialize being called on each
runOnFunction??? Why aren't these OwningPtr<>s, and why aren't their
lifetimes better defined? Anyways, this is just a surgical change to
help out the leak checking bots.
llvm-svn: 187299
|
|
|
|
|
|
|
|
|
| |
their being optimized out in debug mode. Realistically, this just isn't
going to be the slow part anyways. This also fixes unused variable
warnings that are breaking LLD build bots. =/ I didn't see these at
first, and kept losing track of the fact that they were broken.
llvm-svn: 187297
|
|
|
|
|
|
|
|
|
|
|
| |
analysis of the alloca. We don't need to visit all the users twice for
this. We build up a kill list during the analysis and then just process
it afterward. This recovers the tiny bit of performance lost by moving
to the visitor based analysis system as it removes one entire use-list
walk from mem2reg. In some cases, this is now faster than mem2reg was
previously.
llvm-svn: 187296
|
|
|
|
|
|
|
|
| |
Also always add DIType, DISubprogram and DIGlobalVariable to the list
in DebugInfoFinder without checking them, so we can verify them later
on.
llvm-svn: 187285
|
|
|
|
| |
llvm-svn: 187284
|
|
|
|
|
|
|
|
|
|
| |
Adds unit tests for it too.
Split BasicBlockUtils into an analysis-half and a transforms-half, and put the
analysis bits into a new Analysis/CFG.{h,cpp}. Promote isPotentiallyReachable
into llvm::isPotentiallyReachable and move it into Analysis/CFG.
llvm-svn: 187283
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
conditions
Merge consecutive if-regions if they contain identical statements.
Both transformations reduce number of branches. The transformation
is guarded by a target-hook, and is currently enabled only for +R600,
but the correctness has been tested on X86 target using a variety of
CPU benchmarks.
Patch by: Mei Ye
llvm-svn: 187278
|
|
|
|
|
|
| |
handled by the SelectionDAG store-vectorizer, which does a better job in deciding when to vectorize.
llvm-svn: 187267
|
|
|
|
|
|
| |
as <3 x float>, because we dont have a good cost model for these types.
llvm-svn: 187265
|
|
|
|
|
|
| |
This reverts commit r187248. It broke many bots.
llvm-svn: 187254
|
|
|
|
| |
llvm-svn: 187253
|
|
|
|
|
|
|
| |
also worthwhile for it to look through FP extensions and truncations, whose
application commutes with fneg.
llvm-svn: 187249
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Both GCC and LLVM will implicitly define __ppc__ and __powerpc__ for
all PowerPC targets, whether 32- or 64-bit. They will both implicitly
define __ppc64__ and __powerpc64__ for 64-bit PowerPC targets, and not
for 32-bit targets. We cannot be sure that all other possible
compilers used to compile Clang/LLVM define both __ppc__ and
__powerpc__, for example, so it is best to check for both when relying
on either inside the Clang/LLVM code base.
This patch makes sure we always check for both variants. In addition,
it fixes one unnecessary check in lib/Target/PowerPC/PPCJITInfo.cpp.
(At least one of __ppc__ and __powerpc__ should always be defined when
compiling for a PowerPC target, no matter which compiler is used, so
testing for them is unnecessary.)
There are some places in the compiler that check for other variants,
like __POWERPC__ and _POWER, and I have left those in place. There is
no need to add them elsewhere. This seems to be in Apple-specific
code, and I won't take a chance on breaking it.
There is no intended change in behavior; thus, no test cases are
added.
llvm-svn: 187248
|
|
|
|
| |
llvm-svn: 187247
|
|
|
|
| |
llvm-svn: 187245
|
|
|
|
|
|
| |
Patch by Sasa Stankovic.
llvm-svn: 187244
|
|
|
|
|
|
| |
register operands.
llvm-svn: 187242
|
|
|
|
|
|
| |
Thanks to Han Finkel for noticing it.
llvm-svn: 187241
|
|
|
|
|
|
| |
operands.
llvm-svn: 187238
|
|
|
|
|
|
|
|
| |
We used to call Verify before adding DICompileUnit to the list, and now we
remove the check and always add DICompileUnit to the list in DebugInfoFinder,
so we can verify them later on.
llvm-svn: 187237
|
|
|
|
| |
llvm-svn: 187234
|
|
|
|
|
|
|
|
|
| |
to have register FCC0 (the first floating point condition code register) in
their Uses/Defs list.
No intended functionality change.
llvm-svn: 187233
|
|
|
|
|
|
|
|
| |
needed. The generic method printOperand will do.
No functionality change.
llvm-svn: 187231
|
|
|
|
|
|
|
|
|
|
| |
instructions "beqz", "bnez" and "move", when possible.
beq $2, $zero, $L1 => beqz $2, $L1
bne $2, $zero, $L1 => bnez $2, $L1
or $2, $3, $zero => move $2, $3
llvm-svn: 187229
|
|
|
|
|
|
| |
for consistency.
llvm-svn: 187225
|
|
|
|
| |
llvm-svn: 187224
|
|
|
|
|
|
|
|
| |
These were reverted in r167222 along with the rest
of the last different address space pointer size attempt.
These will be used in later commits.
llvm-svn: 187223
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
type units.
Initially this support is used in the computation of an ODR checker
for C++. For now we're attaching it to the DIE, but in the future
it will be attached to the type unit.
This also starts breaking out types into the separation for type
units, but without actually splitting the DIEs.
In preparation for hashing the DIEs this adds a DIEString type
that contains a StringRef with the string contained at the label.
llvm-svn: 187213
|
|
|
|
| |
llvm-svn: 187212
|
|
|
|
|
|
|
|
|
|
| |
On Windows, this improves clean cmake configuration time on my
workstation from 1m58s to 1m32s, which is pretty significant. There's
probably more that can be done here, but this is the low hanging fruit.
Eric volunteered to regenerate ./configure for me.
llvm-svn: 187209
|
|
|
|
|
|
| |
Thanks to Hal Finkel for finding the bug and for the initial patch.
llvm-svn: 187208
|