|  | Commit message | Author | Age | Files | Lines | 
|---|
| ... |  | 
| | 
| 
| 
| 
| 
| 
| | By overriding Pass::verifyAnalysis(), the pass contents will be verified
by the pass manager.
llvm-svn: 160994 | 
| | 
| 
| 
| 
| 
| | loads from different x86 segments but the same address would get CSEd
llvm-svn: 160987 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This is a cleaned up version of the isFree() function in
MachineTraceMetrics.cpp.
Transient instructions are very unlikely to produce any code in the
final output, either because they get eliminated by RegisterCoalescing
or because they are pseudo-instructions like labels and debug values.
llvm-svn: 160977 | 
| | 
| 
| 
| 
| 
| 
| | This function verifies the consistency of cached data in the
MachineTraceMetrics analysis.
llvm-svn: 160976 | 
| | 
| 
| 
| 
| 
| 
| | The MachineTraceMetrics analysis must be invalidated before modifying
the CFG. This will catch some of the violations of that rule.
llvm-svn: 160969 | 
| | 
| 
| 
| 
| 
| 
| 
| | A->isPredecessor(B) is the same as B->isSuccessor(A), but it can
tolerate a B that is null or dangling. This shouldn't happen normally,
but it is useful for verification code.
llvm-svn: 160968 | 
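A minimal sketch of why the predecessor query can tolerate a null B while the successor query cannot. The `Block` type and `addEdge` helper are hypothetical stand-ins, not the real MachineBasicBlock API: `A.isPredecessor(B)` only searches A's own list and compares addresses, whereas `B->isSuccessor(A)` has to dereference B.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a CFG block; not the real MachineBasicBlock.
struct Block {
  std::vector<const Block *> Preds, Succs;

  // Searches this block's own predecessor list. B is only compared by
  // address and never dereferenced, so a null B is harmless here.
  bool isPredecessor(const Block *B) const {
    return std::find(Preds.begin(), Preds.end(), B) != Preds.end();
  }

  // Same idea for the successor list.
  bool isSuccessor(const Block *B) const {
    return std::find(Succs.begin(), Succs.end(), B) != Succs.end();
  }
};

// Records the edge From -> To in both adjacency lists (assumed helper).
inline void addEdge(Block &From, Block &To) {
  From.Succs.push_back(&To);
  To.Preds.push_back(&From);
}
```

With an edge B -> A, `A.isPredecessor(&B)` and `B.isSuccessor(&A)` agree, and `A.isPredecessor(nullptr)` simply returns false.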
| | 
| 
| 
| | llvm-svn: 160927 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | Machine CSE and other optimizations can remove instructions so folding
is possible at peephole while not possible at ISel.
rdar://10554090 and rdar://11873276
llvm-svn: 160919 | 
| | 
| 
| 
| 
| 
| | Jakob fixed ProcessImplicitDefs in r159149.
llvm-svn: 160910 | 
| | 
| 
| 
| | llvm-svn: 160905 | 
| | 
| 
| 
| 
| 
| 
| | This makes it possible to quickly detect blocks that are outside the
trace.
llvm-svn: 160904 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | all tests accordingly.
Fixes PR13351.
Patch by shinichiro hamaji!
llvm-svn: 160899 | 
| | 
| 
| 
| | llvm-svn: 160898 | 
| | 
| 
| 
| 
| 
| 
| 
| | A value number is a PHI def if and only if it begins at a block
boundary. This can be derived from the def slot, so a separate flag is
not necessary.
llvm-svn: 160893 | 
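The idea of deriving the PHI-def property from the def slot can be sketched as follows. The slot encoding (four slots per instruction, packed into one integer) and the names `SlotIdx` and `ValueNumber` are assumptions made for illustration, not the actual LiveInterval data structures.

```cpp
#include <cassert>

// Hypothetical slot kinds within one instruction's index range.
enum Slot : unsigned {
  SlotBlock = 0,        // block boundary: where PHI values begin
  SlotEarlyClobber = 1,
  SlotRegister = 2,     // normal register def
  SlotDead = 3
};

// Hypothetical packed index: instruction number * 4 + slot kind.
struct SlotIdx {
  unsigned Raw;
  static SlotIdx make(unsigned Instr, Slot S) { return SlotIdx{Instr * 4 + S}; }
  Slot slot() const { return static_cast<Slot>(Raw % 4); }
  bool isBlock() const { return slot() == SlotBlock; }
};

// The value number stores only its def index; no separate PHI flag.
struct ValueNumber {
  SlotIdx Def;
  bool isPHIDef() const { return Def.isBlock(); }
};
```

Because the answer is computed from `Def`, it cannot drift out of sync with the def position the way a stored flag could.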
| | 
| 
| 
| 
| 
| 
| 
| 
| | This option replaces the existing live interval computation with one
based on LiveRangeCalc.cpp. The new algorithm does not depend on
LiveVariables, and it can be run at any time, before or after leaving
SSA form.
llvm-svn: 160892 | 
| | 
| 
| 
| 
| 
| | Patch by Tyler Nowicki!
llvm-svn: 160888 | 
| | 
| 
| 
| | llvm-svn: 160798 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This is still a work in progress.
Out-of-order CPUs usually execute instructions from multiple basic
blocks simultaneously, so it is necessary to look at longer traces when
estimating the performance effects of code transformations.
The MachineTraceMetrics analysis will pick a typical trace through a
given basic block and provide performance metrics for the trace. Metrics
will include:
- Instruction count through the trace.
- Issue count per functional unit.
- Critical path length, and per-instruction 'slack'.
These metrics can be used to determine the performance limiting factor
when executing the trace, and how it will be affected by a code
transformation.
Initially, this will be used by the early if-conversion pass.
llvm-svn: 160796 | 
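As a rough illustration of the critical path and slack metrics listed above, here is a small sketch over a straight-line trace. It is not the MachineTraceMetrics implementation: the `Instr` record, the `analyzeTrace` function, and the fixed per-instruction latencies are all assumptions chosen to keep the example self-contained.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical instruction record: latency in cycles plus the indices of
// earlier instructions in the trace that it depends on.
struct Instr {
  unsigned Latency;
  std::vector<unsigned> Deps;
};

struct TraceInfo {
  std::size_t InstrCount = 0;
  unsigned CriticalPath = 0;
  std::vector<unsigned> Slack; // cycles each instruction can slip
};

inline TraceInfo analyzeTrace(const std::vector<Instr> &Trace) {
  const std::size_t N = Trace.size();
  std::vector<unsigned> Depth(N, 0), Height(N, 0);

  // Depth: earliest issue cycle, computed top-down from dependencies.
  for (std::size_t I = 0; I < N; ++I)
    for (unsigned D : Trace[I].Deps)
      Depth[I] = std::max(Depth[I], Depth[D] + Trace[D].Latency);

  // Height: cycles from issue to the end of the trace, computed bottom-up.
  for (std::size_t I = N; I-- > 0;) {
    Height[I] = std::max(Height[I], Trace[I].Latency);
    for (unsigned D : Trace[I].Deps)
      Height[D] = std::max(Height[D], Height[I] + Trace[D].Latency);
  }

  TraceInfo Info;
  Info.InstrCount = N;
  for (std::size_t I = 0; I < N; ++I)
    Info.CriticalPath = std::max(Info.CriticalPath, Depth[I] + Height[I]);
  // Slack is how far an instruction sits below the critical path.
  for (std::size_t I = 0; I < N; ++I)
    Info.Slack.push_back(Info.CriticalPath - (Depth[I] + Height[I]));
  return Info;
}
```

An instruction with zero slack lies on the critical path, so lengthening it lengthens the trace. One with positive slack can absorb that many extra cycles.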
| | 
| 
| 
| | llvm-svn: 160791 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | It is redundant; RegisterCoalescer will do the remat if it can't eliminate
the copy. Collected instruction counts before and after this. A few extra
instructions are generated due to spilling but it is normal to see these kinds
of changes with almost any small codegen change, according to Jakob.
This also fixed rdar://11830760 where xor is expected instead of movi0.
llvm-svn: 160749 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | When a live range splits into multiple connected components, we would
arbitrarily assign <undef> uses to component 0. This is wrong when the
use is tied to a def that gets assigned to a different component:
  %vreg69<def> = ADD8ri %vreg68<undef>, 1
The use and def must get the same virtual register.
Fix this by assigning <undef> uses to the same component as the value
defined by the instruction, if any:
  %vreg69<def> = ADD8ri %vreg69<undef>, 1
This fixes PR13402. The PR has a test case which I am not including
because it is unlikely to keep exposing this behavior in the future.
llvm-svn: 160739 | 
| | 
| 
| 
| 
| 
| | Include <undef> operands and virtual registers after leaving SSA form.
llvm-svn: 160734 | 
| | 
| 
| 
| 
| 
| | release builds from crashing if code uses an intrinsic with an illegal type.
llvm-svn: 160661 | 
| | 
| 
| 
| | llvm-svn: 160621 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | that do not support it (X86 does not lower select_cc).
PR: 13428
Together with Michael Kuperstein <michael.m.kuperstein@intel.com>
llvm-svn: 160619 | 
| | 
| 
| 
| | llvm-svn: 160617 | 
| | 
| 
| 
| 
| 
| | release builds from crashing if code uses an intrinsic with an illegal type. For instance 256-bit AVX intrinsics without having AVX enabled.
llvm-svn: 160616 | 
| | 
| 
| 
| 
| 
| | clang's -Wunused-private-field.
llvm-svn: 160583 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | LiveRangeEdit::foldAsLoad() can eliminate a register by folding a load
into its only use. Only do that when the load is safe to move, and it
won't extend any live ranges.
This fixes PR13414.
llvm-svn: 160575 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | PHIElimination splits critical edges when it predicts it can resolve
interference and eliminate copies. It doesn't split the edge if the
interference wouldn't be resolved because the phi-use register is live
in the critical edge anyway.
Teach PHIElimination to split loop exiting edges with interference, even
if it wouldn't resolve the interference. This moves the necessary
copies out of the loop, which is still an improvement over injecting the
copies into the loop.
The test case demonstrates the improvement. Before:
LBB0_1:
  cmpb  $0, (%rdx)
  leaq  1(%rdx), %rdx
  movl  %esi, %eax
  je  LBB0_1
After:
LBB0_1:
  cmpb  $0, (%rdx)
  leaq  1(%rdx), %rdx
  je  LBB0_1
  movl  %esi, %eax
llvm-svn: 160571 | 
| | 
| 
| 
| 
| 
| | which has no def
llvm-svn: 160531 | 
| | 
| 
| 
| 
| 
| | No functionality change.
llvm-svn: 160501 | 
| | 
| 
| 
| | llvm-svn: 160493 | 
| | 
| 
| 
| | llvm-svn: 160475 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | LiveIntervals due to the two-addr pass generating bogus MI code.
The crux of the issue was a loop nesting problem. The intent of the code
which attempts to transform instructions before converting them to
two-addr form is to defer and reprocess any transformed instructions as
the second processing is likely to have more opportunities to coalesce
copies, etc. Unfortunately, there was one section of processing that was
not deferred -- the INSERT_SUBREG rewriting. Due to quirks of how this
rewriting proceeded, not only did it occur early, it removed the bits of
information needed for the deferred processing to correctly generate the
necessary two address form (specifically inserting a copy), but didn't
trigger any immediate assertions and produced what appeared to be
already valid two-address form code. Thus, the assertion only fired much
later in the pipeline.
The fix is to hoist the transformation logic up a layer to where it can
more firmly defer all further processing, and to teach the normal
processing to handle an edge case previously handled as part of the
transformation logic. This edge case (already matched tied register
operands) needs to *not* defer any steps.
As has been brought up repeatedly in the process: wow does this code
need refactoring. I *may* squeeze in some time to at least bring sanity
to this loop... but wow... =]
Thanks to Jakob for helpful hints on the way here, and the review.
llvm-svn: 160443 | 
| | 
| 
| 
| | llvm-svn: 160411 | 
| | 
| 
| 
| 
| 
| | instcombine transformation.
llvm-svn: 160387 | 
| | 
| 
| 
| | llvm-svn: 160380 | 
| | 
| 
| 
| | llvm-svn: 160372 | 
| | 
| 
| 
| 
| 
| 
| | When truncating a result of a vector that is split we need
to use the result of the split vector, and not re-split the dead node.
llvm-svn: 160357 | 
| | 
| 
| 
| | llvm-svn: 160354 | 
| | 
| 
| 
| | llvm-svn: 160350 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | large immediates. Add dag combine logic to recover in case the large
immediate doesn't fit in the cmp immediate operand field.
int foo(unsigned long l) {
  return (l >> 47) == 1;
}
we produce
  %shr.mask = and i64 %l, -140737488355328
  %cmp = icmp eq i64 %shr.mask, 140737488355328
  %conv = zext i1 %cmp to i32
  ret i32 %conv
which codegens to
movq    $0xffff800000000000,%rax
andq    %rdi,%rax
movq    $0x0000800000000000,%rcx
cmpq    %rcx,%rax
sete    %al
movzbl    %al,%eax
ret
TargetLowering::SimplifySetCC would transform
(X & -256) == 256 -> (X >> 8) == 1
if the immediate fails the isLegalICmpImmediate() test. For x86,
those are immediates that do not fit in a signed 32-bit field.
Based on a patch by Eli Friedman.
PR10328
rdar://9758774
llvm-svn: 160346 | 
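The equivalence behind this transform can be checked with a small sketch. The helper names (`maskedCompare`, `shiftedCompare`, `fitsSigned32`) are hypothetical and only model the arithmetic, not the DAG combine itself. The two forms agree for unsigned values as long as `Shift` is below 64 and `Want << Shift` does not overflow.

```cpp
#include <cassert>
#include <cstdint>

// Masked form: keep the bits at and above Shift, then compare against
// Want << Shift. With Shift = 8 and Want = 1 this is (X & -256) == 256.
inline bool maskedCompare(uint64_t X, unsigned Shift, uint64_t Want) {
  uint64_t Mask = ~uint64_t(0) << Shift;
  return (X & Mask) == (Want << Shift);
}

// Shifted form: (X >> Shift) == Want. The compare constant stays small.
inline bool shiftedCompare(uint64_t X, unsigned Shift, uint64_t Want) {
  return (X >> Shift) == Want;
}

// Models the x86 rule stated above: a compare immediate is usable only
// if it fits in a signed 32-bit field.
inline bool fitsSigned32(int64_t Imm) {
  return Imm >= INT32_MIN && Imm <= INT32_MAX;
}
```

For the `foo` example, the masked form needs the constant `1 << 47`, which does not fit in a signed 32-bit field, while the shifted form compares against 1.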
| | 
| 
| 
| | llvm-svn: 160311 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | AssertZext value.
In the added testcase the constant 55 was behind an AssertZext of type i1, and ComputeDemandedBits
reported that some of the bits were both known to be one and known to be zero.
Together with Michael Kuperstein <michael.m.kuperstein@intel.com>
llvm-svn: 160305 | 
| | 
| 
| 
| 
| 
| 
| 
| | wider than the output element type. Make sure to trunc them if needed.
Together with Michael Kuperstein <michael.m.kuperstein@intel.com>
llvm-svn: 160235 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | Add a micro-optimization to getNode of CONCAT_VECTORS when both operands are undefs.
Can't find a testcase for this because VECTOR_SHUFFLE already handles undef operands, but Duncan suggested that we add this.
Together with Michael Kuperstein <michael.m.kuperstein@intel.com>
llvm-svn: 160229 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | The notable fix is to look at any dependencies attached to the kill
instruction (or other instructions between MI and the kill) where the
dependencies are specific to the register in question.
The old code implicitly handled this by rejecting the transform if *any*
other uses were found within the block, but after the start point. The
new code directly finds the kill, and has to re-use the existing
dependency scan to check for non-kill uses.
This was caught by self-host, but I found the bug via inspection and use
of absurd assert scaffolding to compute the kills in two ways and
compare them. So I have no useful testcase for this other than
"bootstrap". I'd work harder to reduce a test case if this particular
code were likely to live for a long time.
Thanks to Benjamin Kramer for reviewing the fix itself.
llvm-svn: 160228 | 
| | 
| 
| 
| 
| 
| 
| 
| | single undef.
The unoptimized concat_vectors isd prevented the canonicalization of the vector_shuffle node.
llvm-svn: 160221 | 
| | 
| 
| 
| 
| 
| | No test case, there are no in-tree targets that require this.
llvm-svn: 160219 |