| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
There was support for writing the AArch64 big endian data fixup entries in
the .eh_frame section in BE. This is changed to write all such fixup
entries in BE with no restriction on the section. This is similar to
the existing support for fixup entries for ARM.
A test is added to check the length field in the .debug_line section as
this is an example of where such a fixup occurs.
Differential Revision: http://reviews.llvm.org/D16064
llvm-svn: 258320
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D16296
llvm-svn: 258316
|
| |
|
|
|
|
|
|
|
|
|
| |
If converter was somewhat careless about "diamond" cases, where there
was no join block, or in other words, where the true/false blocks did
not have analyzable branches. In such cases, it was possible for it to
remove (needed) branches, resulting in a loss of entire basic blocks.
Differential Revision: http://reviews.llvm.org/D16156
llvm-svn: 258310
|
| |
|
|
|
|
|
|
| |
implementation.
Differential Revision: http://reviews.llvm.org/D16350
llvm-svn: 258309
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
The AArch64 .inst directive was implemented using EmitIntValue, which resulted
in both $x and $d (code and data) mapping symbols being emitted at the same
address. This fixes it to only emit the $x mapping symbol.
EmitIntValue also emits the value in big-endian order when targeting big-endian
systems, but instructions are always emitted in little-endian order for
AArch64.
Differential Revision: http://reviews.llvm.org/D16349
llvm-svn: 258308
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
SelectionDAG previously missed opportunities to fold constants into
GlobalAddresses in several areas. For example, given `(add (add GA, c1), y)`, it
would often reassociate to `(add (add GA, y), c1)`, missing the opportunity to
create `(add GA+c, y)`. This isn't often visible on targets such as X86 which
effectively reassociate adds in their complex address-mode folding logic,
however it is currently visible on WebAssembly since it currently has very
simple address mode folding code that doesn't reassociate anything.
This patch fixes this by making SelectionDAG fold offsets into GlobalAddresses
at the same times that it folds constants together, so that it doesn't miss any
opportunities to perform such folding.
Differential Revision: http://reviews.llvm.org/D16090
llvm-svn: 258296
|
| |
|
|
| |
llvm-svn: 258295
|
| |
|
|
| |
llvm-svn: 258285
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Funclet EH tables require that a given funclet have only one unwind
destination for exceptional exits. The verifier will therefore reject
e.g. two cleanuprets with different unwind dests for the same cleanup, or
two invokes exiting the same funclet but to different unwind dests.
Because catchswitch has no 'nounwind' variant, and because IR producers
are not *required* to annotate calls which will not unwind as 'nounwind',
it is legal to nest a call or an "unwind to caller" catchswitch within a
funclet pad that has an unwind destination other than caller; it is
undefined behavior for such a call or catchswitch to unwind.
Normally when inlining an invoke, calls in the inlined sequence are
rewritten to invokes that unwind to the callsite invoke's unwind
destination, and "unwind to caller" catchswitches in the inlined sequence
are rewritten to unwind to the callsite invoke's unwind destination.
However, if such a call or "unwind to caller" catchswitch is located in a
callee funclet that has another exceptional exit with an unwind
destination within the callee, applying the normal transformation would
give that callee funclet multiple unwind destinations for its exceptional
exits. There would be no way for EH table generation to determine which
is the "true" exit, and the verifier would reject the function
accordingly.
Add logic to the inliner to detect these cases and leave such calls and
"unwind to caller" catchswitches as calls and "unwind to caller"
catchswitches in the inlined sequence.
This fixes PR26147.
Reviewers: rnk, andrew.w.kaylor, majnemer
Subscribers: alexcrichton, llvm-commits
Differential Revision: http://reviews.llvm.org/D16319
llvm-svn: 258273
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D15035
llvm-svn: 258256
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This teaches MachineSink to not sink instructions that might break the
implicit null check optimization that runs later. This should not
affect frontends that do not use implicit null checks.
Reviewers: aadg, reames, hfinkel, atrick
Subscribers: majnemer, llvm-commits
Differential Revision: http://reviews.llvm.org/D14632
llvm-svn: 258254
|
| |
|
|
|
|
|
|
|
|
|
| |
calling convention.
The implementation of the related callbacks in the x86 backend for such
functions are not ready to deal with a prologue block that is not the entry
block of the function.
This fixes PR26107, but the longer term solution would be to fix those callbacks.
llvm-svn: 258221
|
| |
|
|
| |
llvm-svn: 258218
|
| |
|
|
|
|
|
| |
This adds rudimentary support for a few relocations that we will use for
the CodeView debug format.
llvm-svn: 258216
|
| |
|
|
|
|
| |
Add support for decoding VZEXT_MOVL target shuffle masks, allowing it to be used as a source in target shuffle combines.
llvm-svn: 258215
|
| |
|
|
|
|
|
|
|
|
| |
As vector shuffles can only reference two inputs many (V)INSERTPS patterns end up being split over two targets shuffles.
This patch adds combines to attempt to combine (V)INSERTPS nodes with input/output nodes that are just zeroing out these additional vector elements.
Differential Revision: http://reviews.llvm.org/D16072
llvm-svn: 258205
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In some cases, the max backedge taken count can be more conservative
than the exact backedge taken count (for instance, because
ScalarEvolution::getRange is not control-flow sensitive whereas
computeExitLimitFromICmp can be). In these cases,
computeExitLimitFromCond (specifically the bit that deals with `and` and
`or` instructions) can create an ExitLimit instance with a
`SCEVCouldNotCompute` max backedge count expression, but a computable
exact backedge count expression. This violates an implicit SCEV
assumption: a computable exact BE count should imply a computable max BE
count.
This change
- Makes the above implicit invariant explicit by adding an assert to
ExitLimit's constructor
- Changes `computeExitLimitFromCond` to be more robust around
conservative max backedge counts
llvm-svn: 258184
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D16297
llvm-svn: 258161
|
| |
|
|
|
|
|
| |
This is a continuation of adding FMF to call instructions:
http://reviews.llvm.org/rL255555
llvm-svn: 258158
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
pow(x, [small integer]) calls
This is a continuation of adding FMF to call instructions:
http://reviews.llvm.org/rL255555
As with D15937, the intent of the patch is to preserve the current behavior of the transform
except that we use the pow call's 'fast' attribute as a trigger rather than a function-level
attribute.
The TODO comment notes a potential follow-on patch that would propagate FMF to the new
instructions.
Differential Revision: http://reviews.llvm.org/D16122
llvm-svn: 258153
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D16294
llvm-svn: 258144
|
| |
|
|
|
|
|
|
|
| |
Teach the register stackifier to rematerialize constants that have multiple
uses instead of leaving them in registers. In the WebAssembly encoding, it's
the same code size to materialize most constants as it is to read a value
from a register.
llvm-svn: 258142
|
| |
|
|
| |
llvm-svn: 258139
|
| |
|
|
|
|
| |
are resolved.
llvm-svn: 258138
|
| |
|
|
|
|
|
|
| |
According to x86 spec "xlat m8" is a legal instruction and it is equivalent to "xlatb".
Differential Revision: http://reviews.llvm.org/D15150
llvm-svn: 258135
|
| |
|
|
|
|
|
|
|
|
|
|
| |
argument.
Reviewers: eddyb
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D16321
llvm-svn: 258134
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The following are legal according to X86 spec:
ins mem, DX
outs DX, mem
lods mem
stos mem
scas mem
cmps mem, mem
movs mem, mem
Differential Revision: http://reviews.llvm.org/D14827
llvm-svn: 258132
|
| |
|
|
| |
llvm-svn: 258125
|
| |
|
|
|
|
|
|
| |
cover all width and types (pd/ps/sd/ss) of fixupimm instruction and inrtinsics
Differential Revision: http://reviews.llvm.org/D16313
llvm-svn: 258124
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is a companion patch for http://reviews.llvm.org/D16124.
Internalized symbols increase the size of strongly-connected components in
SCC-based module splitting and thus reduce the amount of parallelism. This
patch records the original linkage of non-local symbols prior to
internalization and then restores it just before splitting/CodeGen. This is
also useful for cases where the linker requires symbols to remain external, for
instance, so they can be placed according to linker script rules.
It's currently under its own flag (-restore-globals) but should eventually
share a common flag with D16124.
Reviewers: joker.eph, pcc
Subscribers: slarin, llvm-commits, joker.eph
Differential Revision: http://reviews.llvm.org/D16229
llvm-svn: 258100
|
| |
|
|
| |
llvm-svn: 258096
|
| |
|
|
|
|
|
|
|
|
| |
This breaks the tests that were meant for testing
64-bit inline immediates, so move those to shl where
they won't be broken up.
This should be repeated for the other related bit ops.
llvm-svn: 258095
|
| |
|
|
| |
llvm-svn: 258094
|
| |
|
|
|
|
|
| |
Reduce 64-bit shl with constant > 32. We already special cased
this for the == 32 case, but this also works for any >= 32 constant.
llvm-svn: 258092
|
| |
|
|
| |
llvm-svn: 258091
|
| |
|
|
|
|
| |
64-bit shifts are very slow on some subtargets.
llvm-svn: 258090
|
| |
|
|
|
|
|
|
| |
The coverage is almost non-existent, hopefully more will come after this.
Differential Revision: http://reviews.llvm.org/D16096
llvm-svn: 258087
|
| |
|
|
| |
llvm-svn: 258086
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
globalize any local variables.
Summary:
Currently llvm::SplitModule as the first step globalizes all local objects, which might not be desirable in some scenarios.
This change adds a new flag to llvm::SplitModule that uses SCC approach to search for a balanced partition without the need to externalize symbols.
Such partition might not be possible or fully balanced for a given number of partitions, and is a function of the module properties (global/local dependencies within the module).
Joint development Tobias Edler von Koch (tobias@codeaurora.org) and Sergei Larin (slarin@codeaurora.org)
Subscribers: llvm-commits, joker.eph
Differential Revision: http://reviews.llvm.org/D16124
llvm-svn: 258083
|
| |
|
|
|
|
|
|
| |
AVX2 can only broadcast from the zero'th element of a vector, but if the broadcastable element is the zero'th element of a 128-bit subvector its advantageous to extract the subvector, broadcast from that and avoid the loading of shuffle mask data that would be needed for VPERMPS/VPERMD. The only exception being when the source type is 4f64 or 4i64 which can directly use the immediate shuffle VPERMPD/VPERMQ directly.
Differential Revision: http://reviews.llvm.org/D16050
llvm-svn: 258081
|
| |
|
|
|
|
|
|
| |
Implemented intrinsic for the follow instructions (store) : VMOVDQU8/16/32/64, VMOVDQA32/64, VMOVAPS/PD, VMOVUPS/PD.
Differential Revision: http://reviews.llvm.org/D16271
llvm-svn: 258047
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
instruction.
code example , previous implementation.
movzbl %dil, %eax
kmovw %eax, %k0
new code
kmovw %edi, %k0
Differential Revision: http://reviews.llvm.org/D16287
llvm-svn: 258045
|
| |
|
|
|
|
|
|
|
| |
When the shift immediate is zero, PKHTB is an alias for PKHBT, but the order of
the input operands needs to be swapped.
Differential Revision: http://reviews.llvm.org/D16288
llvm-svn: 258044
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
`LCSSASafePhiForRAUW` as computed was incorrect -- in cases like
these (this exact example does not actually trigger the bug):
define i32 @f(i32 %n, i1* %c) {
entry:
br label %outer.loop
outer.loop:
br label %inner.loop
inner.loop:
%iv = phi i32 [ 0, %outer.loop ], [ %iv.inc, %inner.loop ]
%iv.inc = add nuw nsw i32 %iv, 1
%tc = udiv i32 %n, 13
%be.cond = icmp ult i32 %iv, %tc
br i1 %be.cond, label %inner.loop, label %inner.exit
inner.exit:
%iv.lcssa = phi i32 [ %iv, %inner.loop ]
%outer.be.cond = load volatile i1, i1* %c
br i1 %outer.be.cond, label %outer.loop, label %leave
leave:
%iv.lcssa.lcssa = phi i32 [ %iv.lcssa, %inner.exit ]
ret i32 %iv.lcssa.lcssa
}
`LCSSASafePhiForRAUW` is true for `%iv.lcssa` when re-rewriting the exit
value of `%iv` for `%inner.loop` to `%tc` (this can happen due to
`SCEVExpander::findExistingExpansion`), but the RAUW breaks LCSSA.
To fix this, instead of computing `SafePhi` with special logic, decide
the safety of RAUW directly via `replacementPreservesLCSSAForm`.
llvm-svn: 258016
|
| |
|
|
| |
llvm-svn: 258013
|
| |
|
|
|
|
|
|
| |
Reviewed By: reames
Differential Revision: http://reviews.llvm.org/D16226
llvm-svn: 258010
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D16189
llvm-svn: 258008
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D16194
llvm-svn: 258006
|
| |
|
|
|
|
|
|
| |
review of r257343.
Thanks Dave!
llvm-svn: 258002
|
| |
|
|
|
|
|
|
|
|
| |
MIPS 32-bit ABI uses REL relocation record format to save dynamic
relocations. The patch teaches llvm-readobj to show dynamic relocations
in this format.
Differential Revision: http://reviews.llvm.org/D16114
llvm-svn: 258001
|