| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
generic function calls and intrinsics. This is somewhat overlapping with
an existing intrinsic cost method, but that one seems targetted at
vector intrinsics. I'll merge them or separate their names and use cases
in a separate commit.
This sinks the test of 'callIsSmall' down into TTI where targets can
control it. The whole thing feels very hack-ish to me though. I've left
a FIXME comment about the fundamental design problem this presents. It
isn't yet clear to me what the users of this function *really* care
about. I'll have to do more analysis to figure that out. Putting this
here at least provides it access to proper analysis pass tools and other
such. It also allows us to more cleanly implement the baseline cost
interfaces in TTI.
With this commit, it is now theoretically possible to simplify much of
the inline cost analysis's handling of calls by calling through to this
interface. That conversion will have to happen in subsequent commits as
it requires more extensive restructuring of the inline cost analysis.
The CodeMetrics class is now really only in the business of running over
a block of code and aggregating the metrics on that block of code, with
the actual cost evaluation done entirely in terms of TTI.
llvm-svn: 173148
|
|
|
|
|
|
|
|
|
| |
Attribute.
This further restricts the use of the Attribute class to the Attribute family of
classes.
llvm-svn: 173098
|
|
|
|
|
|
|
|
|
| |
Attribute.
This is more code to isolate the use of the Attribute class to that of just
holding one attribute instead of a collection of attributes.
llvm-svn: 173094
|
|
|
|
|
|
|
|
|
| |
(sub 0, (sext bool to A)) to (zext bool to A).
Patch by Muhammad Ahmad
Reviewed by Duncan Sands
llvm-svn: 173093
|
|
|
|
| |
llvm-svn: 173061
|
|
|
|
|
|
|
|
|
|
|
|
| |
is free. The whole CodeMetrics API should probably be reworked more, but
this is enough to allow deleting the duplicate code there for computing
whether an instruction is free.
All of the passes using this have been updated to pull in TTI and hand
it to the CodeMetrics stuff. Further, a dead CodeMetrics API
(analyzeFunction) is nuked for lack of users.
llvm-svn: 173036
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
a dynamic analysis done on each call to the routine. However, now it can
use the standard pass infrastructure to reference other analyses,
instead of a silly setter method. This will become more interesting as
I teach it about more analysis passes.
This updates the two inliner passes to use the inline cost analysis.
Doing so highlights how utterly redundant these two passes are. Either
we should find a cheaper way to do always inlining, or we should merge
the two and just fiddle with the thresholds to get the desired behavior.
I'm leaning increasingly toward the latter as it would also remove the
Inliner sub-class split.
llvm-svn: 173030
|
|
|
|
|
|
| |
Formatting fixes brought to you by clang-format.
llvm-svn: 173029
|
|
|
|
|
|
| |
functionality changed.
llvm-svn: 173028
|
|
|
|
| |
llvm-svn: 172990
|
|
|
|
| |
llvm-svn: 172971
|
|
|
|
|
|
|
| |
We ignore the cpu frontend and focus on pipeline utilization. We do this because we
don't have a good way to estimate the loop body size at the IR level.
llvm-svn: 172964
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This separates the check for "too few elements to run the vector loop" from the
"memory overlap" check, giving a lot nicer code and allowing to skip the memory
checks when we're not going to execute the vector code anyways. We still leave
the decision of whether to emit the memory checks as branches or setccs, but it
seems to be doing a good job. If ugly code pops up we may want to emit them as
separate blocks too. Small speedup on MultiSource/Benchmarks/MallocBench/espresso.
Most of this is legwork to allow multiple bypass blocks while updating PHIs,
dominators and loop info.
llvm-svn: 172902
|
|
|
|
|
|
| |
includes.
llvm-svn: 172891
|
|
|
|
| |
llvm-svn: 172864
|
|
|
|
| |
llvm-svn: 172863
|
|
|
|
|
|
|
| |
Further encapsulation of the Attribute object. Don't allow direct access to the
Attribute object as an aggregate.
llvm-svn: 172853
|
|
|
|
|
|
|
|
| |
Because the Attribute class is going to stop representing a collection of
attributes, limit the use of it as an aggregate in favor of using AttributeSet.
This replaces some of the uses for querying the function attributes.
llvm-svn: 172844
|
|
|
|
| |
llvm-svn: 172839
|
|
|
|
| |
llvm-svn: 172813
|
|
|
|
| |
llvm-svn: 172806
|
|
|
|
|
|
| |
with other code related to shuffles and easier to implement in compiled code.
llvm-svn: 172788
|
|
|
|
| |
llvm-svn: 172784
|
|
|
|
| |
llvm-svn: 172782
|
|
|
|
| |
llvm-svn: 172736
|
|
|
|
|
|
| |
``Visited'' Debug message to use it.
llvm-svn: 172735
|
|
|
|
|
|
| |
the values of shadow scale and offset to the runtime
llvm-svn: 172709
|
|
|
|
|
|
| |
passes. Add test for non-default mapping scale and offset. No functionality change
llvm-svn: 172610
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
-fno-objc-arc-exception is enabled due to it's affect on correctness.
Specifically according to the semantics of ARC -fno-objc-arc-exception simply
states that it is expected that the unwind path out of a call *MAY* not release
objects. Thus we can have the situation where a release gets moved into a catch
block which we ignore when we remove a retain/release pair resulting in (even
though we assume the program is exiting anyways) the cleanup code path
potentially blowing up before program exit.
llvm-svn: 172599
|
|
|
|
|
|
| |
with a constant zero.
llvm-svn: 172576
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
some optimization opportunities (in the enclosing supper-expressions).
rule 1. (-0.0 - X ) * Y => -0.0 - (X * Y)
if expression "-0.0 - X" has only one reference.
rule 2. (0.0 - X ) * Y => -0.0 - (X * Y)
if expression "0.0 - X" has only one reference, and
the instruction is marked "noSignedZero".
2. Eliminate negation (The compiler was already able to handle these
opt if the 0.0s are replaced with -0.0.)
rule 3: (0.0 - X) * (0.0 - Y) => X * Y
rule 4: (0.0 - X) * C => X * -C
if the expr is flagged "noSignedZero".
3.
Rule 5: (X*Y) * X => (X*X) * Y
if X!=Y and the expression is flagged with "UnsafeAlgebra".
The purpose of this transformation is two-fold:
a) to form a power expression (of X).
b) potentially shorten the critical path: After transformation, the
latency of the instruction Y is amortized by the expression of X*X,
and therefore Y is in a "less critical" position compared to what it
was before the transformation.
4. Remove the InstCombine code about simplifiying "X * select".
The reasons are following:
a) The "select" is somewhat architecture-dependent, therefore the
higher level optimizers are not able to precisely predict if
the simplification really yields any performance improvement
or not.
b) The "select" operator is bit complicate, and tends to obscure
optimization opportunities. It is btter to keep it as low as
possible in expr tree, and let CodeGen to tackle the optimization.
llvm-svn: 172551
|
|
|
|
|
|
| |
vectorization factor even if the target machine does not have any vector registers.
llvm-svn: 172544
|
|
|
|
|
|
| |
Also improve test coveration of the handling of relational comparisons.
llvm-svn: 172539
|
|
|
|
| |
llvm-svn: 172489
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
---------------------------------------------------------------------------
C_A: reassociation is allowed
C_R: reciprocal of a constant C is appropriate, which means
- 1/C is exact, or
- reciprocal is allowed and 1/C is neither a special value nor a denormal.
-----------------------------------------------------------------------------
rule1: (X/C1) / C2 => X / (C2*C1) (if C_A)
=> X * (1/(C2*C1)) (if C_A && C_R)
rule 2: X*C1 / C2 => X * (C1/C2) if C_A
rule 3: (X/Y)/Z = > X/(Y*Z) (if C_A && at least one of Y and Z is symbolic value)
rule 4: Z/(X/Y) = > (Z*Y)/X (similar to rule3)
rule 5: C1/(X*C2) => (C1/C2) / X (if C_A)
rule 6: C1/(X/C2) => (C1*C2) / X (if C_A)
rule 7: C1/(C2/X) => (C1/C2) * X (if C_A)
llvm-svn: 172488
|
|
|
|
|
|
| |
Add a const version of getFpValPtr to avoid a cast-away-const warning.
llvm-svn: 172467
|
|
|
|
| |
llvm-svn: 172460
|
|
|
|
| |
llvm-svn: 172452
|
|
|
|
| |
llvm-svn: 172374
|
|
|
|
|
|
| |
use doxygen). Still some work to do though.
llvm-svn: 172371
|
|
|
|
|
|
|
|
| |
2x blocks each assigned a value via a phi-node causing each to depend on the other.
A test case is provided as well.
llvm-svn: 172368
|
|
|
|
| |
llvm-svn: 172358
|
|
|
|
|
|
| |
and i16).
llvm-svn: 172348
|
|
|
|
| |
llvm-svn: 172347
|
|
|
|
| |
llvm-svn: 172346
|
|
|
|
|
|
|
|
| |
case, but looking at the diff this was an obviously unintended change.
Thanks for the careful review Bill! =]
llvm-svn: 172336
|
|
|
|
|
|
| |
Found by valgrind.
llvm-svn: 172318
|
|
|
|
| |
llvm-svn: 172299
|
|
|
|
| |
llvm-svn: 172298
|
|
|
|
|
|
| |
=> objc_autorelease but were not updating the InstructionClass to IC_Autorelease.
llvm-svn: 172288
|