| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
llvm-svn: 296222
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds various new functionality and cleanup surrounding the
use of the Stream library. Major changes include:
* Renaming of all classes for more consistency / meaningfulness
* Addition of some new methods for reading multiple values at once.
* Full suite of unit tests for reader / writer functionality.
* Full set of doxygen comments for all classes.
* Streams now store their own endianness.
* Fixed some bugs in a few of the classes that were discovered
by the unit tests.
llvm-svn: 296215
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is part of a larger effort to get the Stream code moved
up to Support. I don't want to do it in one large patch, in
part because the changes are so big that it will treat everything
as file deletions and add, losing history in the process.
Aside from that though, it's just a good idea in general to
make small changes.
So this change only changes the names of the Stream related
source files, and applies necessary source fix ups.
llvm-svn: 296211
|
|
|
|
|
|
|
| |
This replaces the __stack_pointer variable which was allocated in linear
memory.
llvm-svn: 296201
|
|
|
|
|
|
|
|
|
| |
With the "wasm32-unknown-unknown-wasm" triple, this allows writing out
simple wasm object files, and is another step in a larger series toward
migrating from ELF to general wasm object support. Note that this code
and the binary format itself is still experimental.
llvm-svn: 296190
|
|
|
|
|
|
|
|
| |
This reverts commit r296009. It broke one out of tree target and also
does not account for all partial lines added or removed when calculating
PressureDiff.
llvm-svn: 296182
|
|
|
|
|
|
|
| |
All G_CONSTANTS created by the MachineIRBuilder have an operand of type CImm
(i.e. a ConstantInt), so that's what the selector needs to look for.
llvm-svn: 296176
|
|
|
|
|
|
|
|
|
|
| |
When we construct addressing modes, we use isNoopAddrSpaceCast to ignore
addrspacecast instructions. Make sure we insert the correct addrspacecast
when we reconstruct the addressing mode.
Differential Revision: https://reviews.llvm.org/D30114
llvm-svn: 296167
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Splitting critical edges when one of the source edges is an indirectbr
is hard in general (because it requires changing the memory the indirectbr
reads). But if a block only has a single indirectbr predecessor (which is
the common case), we can simulate splitting that edge by splitting
the destination block, and retargeting the *direct* branches.
This is motivated by the use of computed gotos in python 2.7: PyEval_EvalFrame()
ends up using an indirect branch with ~100 successors, and passing a constant to
each of those. Since MachineSink can't break indirect critical edges on demand
(and doing this in MIR doesn't look feasible), this causes us to emit about ~100
defs of registers containing constants, which we in the predecessor block, where
only one of those constants is used in each successor. So, at each computed goto,
we needlessly spill about a 100 constants to stack. The end result is that a
clang-compiled python interpreter can be about ~2.5x slower on a simple python
reduction loop than a gcc-compiled interpreter.
Differential Revision: https://reviews.llvm.org/D29916
llvm-svn: 296149
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The motivation for filling out these select-of-constants cases goes back to D24480,
where we discussed removing an IR fold from add(zext) --> select. And that goes back to:
https://reviews.llvm.org/rL75531
https://reviews.llvm.org/rL159230
The idea is that we should always canonicalize patterns like this to a select-of-constants
in IR because that's the smallest IR and the best for value tracking. Note that we currently
do the opposite in some cases (like the cases in *this* patch). Ie, the proposed folds in
this patch already exist in InstCombine today:
https://github.com/llvm-mirror/llvm/blob/master/lib/Transforms/InstCombine/InstCombineSelect.cpp#L1151
As this patch shows, most targets generate better machine code for simple ext/add/not ops
rather than a select of constants. So the follow-up steps to make this less of a patchwork
of special-case folds and missing IR canonicalization:
1. Have DAGCombiner convert any select of constants into ext/add/not ops.
2 Have InstCombine canonicalize in the other direction (create more selects).
Differential Revision: https://reviews.llvm.org/D30180
llvm-svn: 296137
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This isn't testable for AArch64 by itself so this patch also adds
support for constant immediates in the pattern and physical
register uses in the result.
The new IntOperandMatcher matches the constant in patterns such as
'(set $rd:GPR32, (G_XOR $rs:GPR32, -1))'. It's always safe to fold
immediates into an instruction so this is the first rule that will match
across multiple BB's.
The Renderer hierarchy is responsible for adding operands to the result
instruction. Renderers can copy operands (CopyRenderer) or add physical
registers (in particular %wzr and %xzr) to the result instruction
in any order (OperandMatchers now import the operand names from
SelectionDAG to allow renderers to access any operand). This allows us to
emit the result instruction for:
%1 = G_XOR %0, -1 --> %1 = ORNWrr %wzr, %0
%1 = G_XOR -1, %0 --> %1 = ORNWrr %wzr, %0
although the latter is untested since the matcher/importer has not been
taught about commutativity yet.
Added BuildMIAction which can build new instructions and mutate them where
possible. W.r.t the mutation aspect, MatchActions are now told the name of
an instruction they can recycle and BuildMIAction will emit mutation code
when the renderers are appropriate. They are appropriate when all operands
are rendered using CopyRenderer and the indices are the same as the matcher.
This currently assumes that all operands have at least one matcher.
Finally, this change also fixes a crash in
AArch64InstructionSelector::select() caused by an immediate operand
passing isImm() rather than isCImm(). This was uncovered by the other
changes and was detected by existing tests.
Depends on D29711
Reviewers: t.p.northover, ab, qcolombet, rovka, aditya_nandakumar, javed.absar
Reviewed By: rovka
Subscribers: aemerson, dberris, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D29712
llvm-svn: 296131
|
|
|
|
|
|
| |
This was missed in r293110.
llvm-svn: 296096
|
|
|
|
| |
llvm-svn: 296093
|
|
|
|
| |
llvm-svn: 296064
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Splitting critical edges when one of the source edges is an indirectbr
is hard in general (because it requires changing the memory the indirectbr
reads). But if a block only has a single indirectbr predecessor (which is
the common case), we can simulate splitting that edge by splitting
the destination block, and retargeting the *direct* branches.
This is motivated by the use of computed gotos in python 2.7: PyEval_EvalFrame()
ends up using an indirect branch with ~100 successors, and passing a constant to
each of those. Since MachineSink can't break indirect critical edges on demand
(and doing this in MIR doesn't look feasible), this causes us to emit about ~100
defs of registers containing constants, which we in the predecessor block, where
only one of those constants is used in each successor. So, at each computed goto,
we needlessly spill about a 100 constants to stack. The end result is that a
clang-compiled python interpreter can be about ~2.5x slower on a simple python
reduction loop than a gcc-compiled interpreter.
Differential Revision: https://reviews.llvm.org/D29916
llvm-svn: 296060
|
|
|
|
|
|
| |
While there, switch to the explicit ctor.
llvm-svn: 296059
|
|
|
|
|
|
|
| |
Justin added support for DISubprogram locs in r295531 and r296052.
Use that instead of no-loc for constants and arguments.
llvm-svn: 296058
|
|
|
|
|
|
|
| |
Since r296047, we're able to return early on failures.
Don't track whether we succeeded.
llvm-svn: 296057
|
|
|
|
|
|
|
| |
Add an optimization remark to asm-printer that summarizes the number
of instructions emitted per function.
llvm-svn: 296053
|
|
|
|
|
|
|
|
|
|
| |
We were stopping the translation of the parent block when the
translation of an instruction failed, but we were still trying to
translate the other blocks of the parent function.
Don't do that.
llvm-svn: 296047
|
|
|
|
|
|
|
| |
This is the compromise between having a per-function IRTranslator
and manually managing the per-function state.
llvm-svn: 296046
|
|
|
|
|
|
|
|
|
| |
Rename ComputedTrellisEdges to ComputedEdges to allow for other methods of
pre-computing edges.
Differential Revision: https://reviews.llvm.org/D30308
llvm-svn: 296018
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Having more fine-grained information on the specific construct that
caused us to fallback is valuable for large-scale data collection.
We still have the fallback warning, that's also used for FastISel.
We still need to remove the fallback warning, and teach FastISel to also
emit remarks (it currently has a combination of the warning, stats, and
debug prints: the remarks could unify all three).
The abort-on-fallback path could also be better handled using remarks:
one could imagine a "-Rpass-error", analoguous to "-Werror", which would
promote missed/failed remarks to errors. It's not clear whether that
would be useful for other remarks though, so we're not there yet.
llvm-svn: 296013
|
|
|
|
|
|
| |
This will be used with GISel opt remarks.
llvm-svn: 296012
|
|
|
|
|
|
|
| |
This matches the behavior for skip-operands. While there, document it.
This is a follow-up to r296007.
llvm-svn: 296011
|
|
|
|
|
|
|
|
|
|
| |
If a subreg is used in an instruction it counts as a whole superreg
for the purpose of register pressure calculation. This patch corrects
improper register pressure calculation by examining operand's lane mask.
Differential Revision: https://reviews.llvm.org/D29835
llvm-svn: 296009
|
|
|
|
| |
llvm-svn: 296007
|
|
|
|
|
|
| |
This lets us use more natural early-returns when selection fails.
llvm-svn: 296006
|
|
|
|
| |
llvm-svn: 296004
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since LoopInfo is not available in machine passes as universally as in IR
passes, using the same approach for OptimizationRemarkEmitter as we did for IR
will run LoopInfo and DominatorTree unnecessarily. (LoopInfo is not used
lazily by ORE.)
To fix this, I am modifying the approach I took in D29836. LazyMachineBFI now
uses its client passes including MachineBFI itself that are available or
otherwise compute them on the fly.
So for example GreedyRegAlloc, since it's already using MBFI, will reuse that
instance. On the other hand, AsmPrinter in Justin's patch will generate DT,
LI and finally BFI on the fly.
(I am of course wondering now if the simplicity of this approach is even
preferable in IR. I will do some experiments.)
Testing is provided by an updated version of D29837 which requires Justin's
patch to bring ORE to the AsmPrinter.
Differential Revision: https://reviews.llvm.org/D30128
llvm-svn: 295996
|
|
|
|
| |
llvm-svn: 295962
|
|
|
|
|
|
| |
other minor fixes (NFC).
llvm-svn: 295893
|
|
|
|
|
|
|
|
|
|
|
| |
r295336 causes a bootstrapped clang to fail for many compilations on
powerpc BE. See
http://lab.llvm.org:8011/builders/clang-ppc64be-linux-multistage/builds/2315
for example.
Reverting as per the developer's request.
llvm-svn: 295849
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Initial implementation for X86InstructionSelector. Handle selection COPY and G_ADD/G_SUB gpr, gpr .
Reviewers: qcolombet, rovka, zvi, ab
Reviewed By: rovka
Subscribers: mgorny, dberris, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D29816
llvm-svn: 295824
|
|
|
|
|
|
|
|
|
| |
This just adds the basic skeleton for supporting a new object file format.
All of the actual encoding will be implemented in followup patches.
Differential Revision: https://reviews.llvm.org/D26722
llvm-svn: 295803
|
|
|
|
|
|
|
| |
Avoids test regressions in future AMDGPU commits when
more vector types are custom lowered.
llvm-svn: 295782
|
|
|
|
|
|
| |
other minor fixes (NFC).
llvm-svn: 295773
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Rework the code that was sinking/duplicating (icmp and, 0) sequences
into blocks where they were being used by conditional branches to form
more tbz instructions on AArch64. The new code is more general in that
it just looks for 'and's that have all icmp 0's as users, with a target
hook used to select which subset of 'and' instructions to consider.
This change also enables 'and' sinking for X86, where it is more widely
beneficial than on AArch64.
The 'and' sinking/duplicating code is moved into the optimizeInst phase
of CodeGenPrepare, where it can take advantage of the fact the
OptimizeCmpExpression has already sunk/duplicated any icmps into the
blocks where they are used. One minor complication from this change is
that optimizeLoadExt needed to be updated to always mark 'and's it has
determined should be in the same block as their feeding load in the
InsertedInsts set to avoid an infinite loop of hoisting and sinking the
same 'and'.
This change fixes a regression on X86 in the tsan runtime caused by
moving GVNHoist to a later place in the optimization pipeline (see
PR31382).
Reviewers: t.p.northover, qcolombet, MatzeB
Subscribers: aemerson, mcrosier, sebpop, llvm-commits
Differential Revision: https://reviews.llvm.org/D28813
llvm-svn: 295746
|
|
|
|
|
|
|
|
|
| |
- Fix doxygen comments (do not repeat documented name, remove definition
comment if there is already one at the declaration, add \p, ...)
- Add some const modifiers
- Use range based for
llvm-svn: 295688
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
instruction.
Summary:
Currently, BranchFolder drops DebugLoc for branch instructions in some places. For example, for the test code attached, the branch instruction of 'entry' block has a DILocation of
```
!12 = !DILocation(line: 6, column: 3, scope: !11)
```
, but this information is gone when then block is lowered because BranchFolder misses it. This patch is a fix for this issue.
Reviewers: qcolombet, aprantl, craig.topper, MatzeB
Reviewed By: aprantl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D29902
llvm-svn: 295684
|
|
|
|
| |
llvm-svn: 295653
|
|
|
|
|
|
| |
Thanks to Mikael Holmén for the initial test case
llvm-svn: 295652
|
|
|
|
| |
llvm-svn: 295607
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Adapt MachineBasicBlock::getName() to have the same behavior as the IR
BasicBlock (Value::getName()).
- Add it to lib/CodeGen/CodeGen.cpp::initializeCodeGen so that it is linked in
the CodeGen library.
- MachineRegionInfoPass's name conflicts with RegionInfoPass's name ("region").
- MachineRegionInfo should depend on MachineDominatorTree,
MachinePostDominatorTree and MachineDominanceFrontier instead of their
respective IR versions.
- Since there were no tests for this, add a X86 MIR test.
Patch by Francis Visoiu Mistrih<fvisoiumistrih@apple.com>
llvm-svn: 295518
|
|
|
|
| |
llvm-svn: 295505
|
|
|
|
|
|
| |
other minor fixes (NFC).
llvm-svn: 295499
|
|
|
|
|
|
|
|
|
|
| |
This fixes PR31381, which caused an assertion and/or invalid debug info.
This affects debug variables that have multiple fragments in the MMI
side (i.e.: in the stack frame) table.
rdar://problem/30571676
llvm-svn: 295486
|
|
|
|
|
|
|
| |
The mem operand is used by GlobalISel to convey atomic constraints so dropping
it is invalid.
llvm-svn: 295476
|
|
|
|
|
|
| |
I can't find any tests of the non-i1 code path, so it may be unnecessary at this point.
llvm-svn: 295463
|
|
|
|
| |
llvm-svn: 295453
|