| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Access to aligned globals gives us a chance to peephole optimize nonzero
offsets. If a struct is 4 byte aligned, then accesses to bytes 0-3 won't
overflow the available displacement. For example:
addis 3, 2, b4v@toc@ha
addi 4, 3, b4v@toc@l
lbz 5, b4v@toc@l(3) ; This is the result of the current peephole
lbz 6, 1(4) ; optimizer
lbz 7, 2(4)
lbz 8, 3(4)
If b4v is 4-byte aligned, we can skip using register 4 because we know
that b4v@toc@l+{1,2,3} won't overflow 32K, and instead generate:
addis 3, 2, b4v@toc@ha
lbz 4, b4v@toc@l(3)
lbz 5, b4v@toc@l+1(3)
lbz 6, b4v@toc@l+2(3)
lbz 7, b4v@toc@l+3(3)
Saving a register and an addition.
Larger alignments allow larger structures/arrays to be optimized.
llvm-svn: 255319
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Previously in the conversion cost table there are no entries for integer-integer
conversions on SSE2. This will result in imprecise costs for certain vectorized
operations. This patch adds those entries for SSE2 and SSE4.1. The cost numbers
are counted from the result of running llc on the new test case in this patch.
Differential revision: http://reviews.llvm.org/D15132
llvm-svn: 255315
|
| |
|
|
|
|
|
|
|
|
|
|
| |
x)) combines for ppc_fp128, since signbit computation is more
complicated.
Discussion thread:
http://lists.llvm.org/pipermail/llvm-dev/2015-November/092863.html
Patch by Tim Shen!
llvm-svn: 255305
|
| |
|
|
|
|
|
|
|
| |
This was causing bad code gen and assembly that won't assemble, as
mixed altivec and vsx code would end up with a vsx high register
assigned to an altivec instruction, which won't work. Constraining the
classes allows the optimization to proceed.
llvm-svn: 255299
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: As a follow-up to rL255054 I wasn't able to convince myself that the code did what I thought, so I wrote more tests.
Reviewers: reames
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D15371
llvm-svn: 255295
|
| |
|
|
|
|
| |
PR25763 demonstrated an issue with D14683 - vector comparison constant folding only works for i1 results, so we need to split off the sign-extension of the result to the required type. Luckily this can be done with the existing type legalization code.
llvm-svn: 255289
|
| |
|
|
|
|
| |
I see a few bots timing out, so I'm speculatively disabling r255247.
llvm-svn: 255286
|
| |
|
|
| |
llvm-svn: 255272
|
| |
|
|
|
|
| |
Using %t rather %T/<specific_name> as the temporary profdata filename.
llvm-svn: 255271
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Convert f16 vectors to corresponding f32 vectors before doing the
conversion to int.
Add tests for v4f16, v8f16.
Reviewers: ab, jmolloy
Subscribers: llvm-commits, srhines
Differential Revision: http://reviews.llvm.org/D14936
llvm-svn: 255263
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a redo of r255137 (reverted at r255227) which was a redo of
r255124 (reverted at r255126) with a fixed check for a scalar source
type and an added test for the failure that caused the revert.
Original commit message:
Example:
bitcast (extractelement (bitcast <2 x float> %X to <2 x i32>), 1) to float
--->
extractelement <2 x float> %X, i32 1
This is part of fixing PR25543:
https://llvm.org/bugs/show_bug.cgi?id=25543
The next step will be to generalize this fold:
trunc ( lshr ( bitcast X) ) -> extractelement (X)
Ie, I'm hoping to replace the existing transform of:
bitcast ( trunc ( lshr ( bitcast X)))
added by:
http://reviews.llvm.org/rL112232
with 2 less specific transforms to catch the case in the bug report.
Differential Revision: http://reviews.llvm.org/D14879
llvm-svn: 255261
|
| |
|
|
| |
llvm-svn: 255255
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A linker normally has two stages: symbol resolution and "moving stuff".
In lib/Linker there is the complication of lazy linking some globals,
but it was still far more mixed than it needed to.
This splits the linker into a lower level IRMover and the linker proper.
The IRMover just takes a list of globals to move and a callback that
lets the user control what is lazy linked.
The main motivation is that now tools/gold (and soon lld) can use their
own symbol resolution to instruct IRMover what to do.
llvm-svn: 255254
|
| |
|
|
|
|
|
|
|
|
|
|
| |
We extend the search for redundant stores to predecessor blocks that
unconditionally lead to the block BB with the current store instruction. That
also includes single-block loops that unconditionally lead to BB, and
if-then-else blocks where then- and else-blocks unconditionally lead to BB.
http://reviews.llvm.org/D13363
Patch by Ivan Baev <ibaev@codeaurora.org>!
llvm-svn: 255247
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This patch corresponds to review:
http://reviews.llvm.org/D15286
LLVM IR frequently contains bitcast operations between floating point and
integer values of the same width. Doing this through memory operations is
quite expensive on PPC. This patch allows the use of direct register moves
between FPRs and GPRs for lowering bitcasts.
llvm-svn: 255246
|
| |
|
|
|
|
|
|
| |
Introduced DIMacro and DIMacroFile debug info metadata in the LLVM IR to support macros.
Differential Revision: http://reviews.llvm.org/D14687
llvm-svn: 255245
|
| |
|
|
|
|
| |
This commit broke apple's internal bot.
llvm-svn: 255227
|
| |
|
|
|
|
|
|
| |
ISD::FCOPYSIGN permits its operands to have differing types, and DAGCombiner
uses this. Add some def : Pat rules to expand this out into an explicit
conversion and a normal copysign operation.
llvm-svn: 255220
|
| |
|
|
|
|
| |
It is lowered to a libcall for now, but this is expected to change in the future.
llvm-svn: 255219
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This allows us to remove the END_OF_TEXT_LABEL hack we had been using
and simplifies the fixups used to compute the address of constant
arrays.
Reviewers: arsenm
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D15257
llvm-svn: 255204
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: The 's' constraint represents sgprs and the 'v' constraint represents vgprs.
Reviewers: arsenm, echristo
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D15342
llvm-svn: 255203
|
| |
|
|
| |
llvm-svn: 255202
|
| |
|
|
| |
llvm-svn: 255191
|
| |
|
|
|
|
|
|
| |
Target-specific instructions may have uninteresting physreg clobbers,
for target-specific reasons. The peephole pass doesn't need to concern
itself with such defs, as long as they're implicit and marked as dead.
llvm-svn: 255182
|
| |
|
|
| |
llvm-svn: 255181
|
| |
|
|
| |
llvm-svn: 255179
|
| |
|
|
|
|
|
|
| |
without a frame pointer when unwind may happen.
This is a workaround for a bug in the way we emit the CFI directives for
frameless unwind information. See PR25614.
llvm-svn: 255175
|
| |
|
|
|
|
|
| |
We were deciding to not link an available_externally gv over a
declaration, but then copying over the body anyway.
llvm-svn: 255169
|
| |
|
|
|
|
|
|
| |
Two tests diag_mismatch.ll and diag_no_funcprofdata.ll generates the same
profdata filename which can conflict in current test runs. This patch
renames them to have different names.
llvm-svn: 255158
|
| |
|
|
|
|
|
|
|
|
| |
The ConstantDataArray::getFP(LLVMContext &, ArrayRef<uint16_t>)
overload has had a typo in it since it was written, where it will
create a Vector instead of an Array. This obviously doesn't work at
all, but it turns out that until r254991 there weren't actually any
callers of this overload. Fix the typo and add some test coverage.
llvm-svn: 255157
|
| |
|
|
|
|
|
| |
This fixes a crash bug. It's also not clear if we'd want to do this
transform for vectors.
llvm-svn: 255155
|
| |
|
|
|
|
|
|
| |
`CloneAndPruneIntoFromInst` can DCE instructions after cloning them into
the new function, and so an AssertingVH is too strong. This change
switches CloneCodeInfo to use a std::vector<WeakVH>.
llvm-svn: 255148
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a redo of r255124 (reverted at r255126) with an added check for a
scalar destination type and an added test for the failure seen in Clang's
test/CodeGen/vector.c. The extra test shows a different missing optimization.
Original commit message:
Example:
bitcast (extractelement (bitcast <2 x float> %X to <2 x i32>), 1) to float
--->
extractelement <2 x float> %X, i32 1
This is part of fixing PR25543:
https://llvm.org/bugs/show_bug.cgi?id=25543
The next step will be to generalize this fold:
trunc ( lshr ( bitcast X) ) -> extractelement (X)
Ie, I'm hoping to replace the existing transform of:
bitcast ( trunc ( lshr ( bitcast X)))
added by:
http://reviews.llvm.org/rL112232
with 2 less specific transforms to catch the case in the bug report.
Differential Revision: http://reviews.llvm.org/D14879
llvm-svn: 255137
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This new patch fixes a few bugs that exposed in last submit. It also improves
the test cases.
--Original Commit Message--
This patch implements a minimum spanning tree (MST) based instrumentation for
PGO. The use of MST guarantees minimum number of CFG edges getting
instrumented. An addition optimization is to instrument the less executed
edges to further reduce the instrumentation overhead. The patch contains both the
instrumentation and the use of the profile to set the branch weights.
Differential Revision: http://reviews.llvm.org/D12781
llvm-svn: 255132
|
| |
|
|
|
|
|
|
|
| |
This reverts commit r255124.
Broke http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast/builds/4193/steps/test/logs/stdio
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 255126
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reinteroduce the code for moving ARGUMENTS back to the top of the basic block.
While the ARGUMENTS physical register prevents sinking and scheduling from
moving them, it does not appear to be sufficient to prevent SelectionDAG from
moving them down in the initial schedule. This patch introduces a patch that
moves them back to the top immediately after SelectionDAG runs.
This is still hopefully a temporary solution. http://reviews.llvm.org/D14750 is
one alternative, though the review has not been favorable, and proposed
alternatives are longer-term and have other downsides.
This fixes the main outstanding -verify-machineinstrs failures, so it adds
-verify-machineinstrs to several tests.
Differential Revision: http://reviews.llvm.org/D15377
llvm-svn: 255125
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Example:
bitcast (extractelement (bitcast <2 x float> %X to <2 x i32>), 1) to float
--->
extractelement <2 x float> %X, i32 1
This is part of fixing PR25543:
https://llvm.org/bugs/show_bug.cgi?id=25543
The next step will be to generalize this fold:
trunc ( lshr ( bitcast X) ) -> extractelement (X)
Ie, I'm hoping to replace the existing transform of:
bitcast ( trunc ( lshr ( bitcast X)))
added by:
http://reviews.llvm.org/rL112232
with 2 less specific transforms to catch the case in the bug report.
Differential Revision: http://reviews.llvm.org/D14879
llvm-svn: 255124
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
of !isWeakForLinker()
Summary:
Available_externally global variable with initializer were considered "hasInitializer()",
while obviously it can't match the description:
Whether the global variable has an initializer, and any changes made to the
initializer will turn up in the final executable.
since modifying the initializer of an externally available variable does not make sense.
Reviewers: pcc, rafael
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D15351
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 255123
|
| |
|
|
|
|
|
|
|
|
| |
We mutated the DAG, which invalidated the node we were trying to use
as a base register. Sometimes we got away with it, but other times the
node really did get deleted before it was finished with.
Should fix PR25733
llvm-svn: 255120
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
During selection DAG legalization, extractelement is replaced with a load
instruction. To do this, a temporary store to the stack is used unless an
existing store is found that can be re-used.
If re-using a store, the chain going out of the store must be replaced by
the one going out of the new load (this ensures that any stores that must
take place after the store happens after the load, else the value might
be overwritten before it is loaded).
The problem is, if the extractelement index is dependent on the store
replacing the chain will introduce a cycle in the selection DAG (the load
uses the index, and by replacing the chain we will make the index dependent
on the load).
To fix this, if the index is dependent on the store, the store is skipped.
This is conservative as we may end up creating an unnecessary extra store
to the stack. However, the situation is not expected to occur very often.
Differential Revision: http://reviews.llvm.org/D15330
llvm-svn: 255114
|
| |
|
|
| |
llvm-svn: 255113
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Reviewers: vkalintiris
Subscribers: dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D15229
llvm-svn: 255112
|
| |
|
|
|
|
|
|
|
| |
Commited patch was intended to implement LH, LHE, LHU and LHUE instructions.
After commit test-suite failed with error message in the form of:
fatal error: error in backend: Cannot select: t124: i32,ch = load<LD2[%d](tbaa=<0x94acc48>), sext from i16> t0, t2, undef:i32
For that reason I decided to revert commit r254897 and make new patch which besides implementation and standard regression tests will also have dedicated tests (CodeGen) for the above error.
llvm-svn: 255109
|
| |
|
|
|
|
|
|
|
|
|
| |
DEBUG_VALUEs at each basic block and insert them. Reviewed and accepted at: http://reviews.llvm.org/D11933"
This reverts commit r255096.
Break the bots: http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-incremental_check/16378/
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 255101
|
| |
|
|
|
|
| |
DEBUG_VALUEs at each basic block and insert them. Reviewed and accepted at: http://reviews.llvm.org/D11933
llvm-svn: 255096
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Otherwise, we think that most types that look like they'd fit in a
legal vector type are legal (so, basically, *any* vector type with a
size between 33 and 128 bits, I think, since we use pow2 alignment;
e.g., v2i25, v3f32, ...).
DataLayout::getTypeAllocSize rounds up based on alignment.
When checking for target intrinsic legality, that's not what we want:
if rounding makes a difference, the type isn't legal, and the
target intrinsics shouldn't be used, as they are always assumed legal.
One could make the argument that alloc size is ultimately the most
relevant here, since we're dealing with LD/ST intrinsics. That's only
true if we did legalize them though; that's a problem for another day.
Use DataLayout::getTypeSizeInBits instead of getTypeAllocSizeInBits.
Type::getSizeInBits can't be used because that'd gratuitously break
pointer vector support.
Some of these uses are currently fine, because we only hit them when
the type is already known legal (e.g., r114454). Update them for
consistency. It's faster to avoid the rounding anyway!
llvm-svn: 255089
|
| |
|
|
|
|
|
| |
Test case attached (test case also checks that we don't drop the calling
convention, but that functionality was correct before this patch).
llvm-svn: 255088
|
| |
|
|
|
|
|
| |
Reviewer: Simon Pilgrim.
Differential Revision: http://reviews.llvm.org/D15317
llvm-svn: 255080
|
| |
|
|
|
|
|
|
| |
For an invoke with operand bundles, the [op_begin(), op_end()-3] range
can contain things other than invoke arguments. This change teaches
PruneEH to use arg_begin() and arg_end() explicitly.
llvm-svn: 255073
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This fixes failure when trying to select
insertelement <4 x half> undef, half %a, i64 0
which gets transformed to a scalar_to_vector node.
The accompanying v4 and v8 tests fail instruction selection without this
patch.
Reviewers: ab, jmolloy
Subscribers: srhines, llvm-commits
Differential Revision: http://reviews.llvm.org/D15322
llvm-svn: 255072
|