|  | Commit message (Collapse) | Author | Age | Files | Lines | 
|---|
| ... |  | 
| | 
| 
| 
| 
| 
| | register units.
llvm-svn: 197253 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | rdar:15221834 False AVX register dependencies cause 5x slowdown on
flops-5/6 and significant slowdown on several others.
This was blocking the switch to MI-Sched.
llvm-svn: 192669 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This pass is needed to break false dependencies. Without it, unlucky
register assignment can result in wild (5x) swings in
performance. This pass was trying to handle AVX but not getting it
right. AVX doesn't have partial register defs, it has unused register
reads in which the high bits of a source operand are copied into the
unused bits of the dest.
Fixing this requires conservative liveness analysis. This is awkard
because the pass already has its own pseudo-liveness. However, proper
liveness is expensive, and we would like to use a generic utility to
compute it. The fix only invokes liveness on-demand. It is rare to
detect a case that needs undef-read dependence breaking, but when it
happens, it can be needed many times within a very large block.
I think the existing heuristic which uses a register window of 16 is
too conservative for loop-carried false dependencies. If the loop is a
reduction. The out-of-order engine may be able to execute several loop
iterations in parallel. However, I'll leave this tuning exercise for
next time.
llvm-svn: 192635 | 
| | 
| 
| 
| 
| 
| | avoid specifying the vector size unnecessarily.
llvm-svn: 185512 | 
| | 
| 
| 
| | llvm-svn: 182680 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Sooooo many of these had incorrect or strange main module includes.
I have manually inspected all of these, and fixed the main module
include to be the nearest plausible thing I could find. If you own or
care about any of these source files, I encourage you to take some time
and check that these edits were sensible. I can't have broken anything
(I strictly added headers, and reordered them, never removed), but they
may not be the headers you'd really like to identify as containing the
API being implemented.
Many forward declarations and missing includes were added to a header
files to allow them to parse cleanly when included first. The main
module rule does in fact have its merits. =]
llvm-svn: 169131 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | All callers of these functions really want the isPhysRegOrOverlapUsed()
functionality which also checks aliases. For historical reasons, targets
without register aliases were calling isPhysRegUsed() instead.
Change isPhysRegUsed() to also check aliases, and switch all
isPhysRegOrOverlapUsed() callers to isPhysRegUsed().
llvm-svn: 166117 | 
| | 
| 
| 
| 
| 
| | not propagate through implicit defs.
llvm-svn: 165102 | 
| | 
| 
| 
| | llvm-svn: 157885 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | No functional change intended.
Sorry for the churn. The iterator classes are supposed to help avoid
giant commits like this one in the future. The TableGen-produced
register lists are getting quite large, and it may be necessary to
change the table representation.
This makes it possible to do so without changing all clients (again).
llvm-svn: 157854 | 
| | 
| 
| 
| | llvm-svn: 152001 | 
| | 
| 
| 
| | llvm-svn: 147071 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | generator to it. For non-bundle instructions, these behave exactly the same
as the MC layer API.
For properties like mayLoad / mayStore, look into the bundle and if any of the
bundled instructions has the property it would return true.
For properties like isPredicable, only return true if *all* of the bundled
instructions have the property.
For properties like canFoldAsLoad, isCompare, conservatively return false for
bundles.
llvm-svn: 146026 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This was a bug in keeping track of the available domains when merging
domain values.
The wrong domain mask caused ExecutionDepsFix to try to move VANDPSYrr
to the integer domain which is only available in AVX2.
Also add an assertion to catch future attempts at emitting AVX2
instructions.
llvm-svn: 145096 | 
| | 
| 
| 
| 
| 
| | A function using any RC alias is enough to enable the ExeDepsFix pass.
llvm-svn: 144636 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Two new TargetInstrInfo hooks lets the target tell ExecutionDepsFix
about instructions with partial register updates causing false unwanted
dependencies.
The ExecutionDepsFix pass will break the false dependencies if the
updated register was written in the previoius N instructions.
The small loop added to sse-domains.ll runs twice as fast with
dependency-breaking instructions inserted.
llvm-svn: 144602 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Keep track of the last instruction to define each register individually
instead of per DomainValue.  This lets us track more accurately when a
register was last written.
Also track register ages across basic blocks.  When entering a new
basic block, use the least stale predecessor def as a worst case
estimate for register age.
The register age is used to arbitrate between conflicting domains. The
most recently defined register wins.
llvm-svn: 144601 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | During the initial RPO traversal of the basic blocks, remember the ones
that are incomplete because of back-edges from predecessors that haven't
been visited yet.
After the initial RPO, revisit all those loop headers so the incoming
DomainValues on the back-edges can be properly collapsed.
This will properly fix execution domains on software pipelined code,
like the included test case.
llvm-svn: 144151 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | When merging two uncollapsed DomainValues, place a link to the active
DomainValue from the passive DomainValue.  This allows old stale
references to the passive DomainValue to be updated to point to the
active DomainValue.
The new resolve() function finds the active DomainValue and updates the
pointer.
This change makes old live-out lists more useful since they may contain
uncollapsed DomainValues that have since been merged into other
DomainValues.
llvm-svn: 144149 | 
| | 
| 
| 
| 
| 
| | This allows clear() to be called on a DomainValue with references.
llvm-svn: 144147 | 
| | 
| 
| 
| 
| 
| | There is no need to involve the LiveRegs array and kill() any longer.
llvm-svn: 144133 | 
| | 
| 
| 
| 
| 
| | No functional change.
llvm-svn: 144132 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | This new function will decrement the reference count, and collapse a
domain value when the last reference is gone.
This simplifies DomainValue reference counting, and decouples it from
the LiveRegs array.
llvm-svn: 144131 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | The old value may still be referenced by some live-out list, and we
don't wan't to collapse those instructions twice.
This fixes the "Can only swizzle VMOVD" assertion in some armv7 SPEC
builds.
<rdar://problem/10413292>
llvm-svn: 144117 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | DomainValues that are only used by "don't care" instructions are now
collapsed to the first possible execution domain after all basic blocks
have been processed.  This typically means the PS domain on x86.
For example, the vsel_i64 and vsel_double functions in sse2-blend.ll are
completely collapsed to the PS domain instead of containing a mix of
execution domains created by isel.
llvm-svn: 144037 | 
| | 
| 
| 
| 
| 
| 
| 
| | The enterBasicBlock() function is combining live-out values from
predecessor blocks.  The RPO traversal means that more predecessors
have been visited when that happens, only back-edges are missing.
llvm-svn: 144025 | 
| | 
| 
| 
| | llvm-svn: 144020 | 
| | 
| 
| 
| | llvm-svn: 144015 | 
| | 
| 
| 
| | llvm-svn: 144014 | 
| | 
| 
| 
| 
| 
| | No functional change intended.
llvm-svn: 140664 | 
|  | I'll clean up the source in the next commit.
llvm-svn: 140663 |