| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
| |
The code responsible for shl folding in the DAGCombiner was assuming incorrectly that all constants are less than 64 bits. This patch simply changes the way values are compared.
It has been reverted previously because of some problems with comparing APInt with raw uint64_t. That has been fixed/changed with r241204.
llvm-svn: 241254
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
TwoAddressInstructionPass stops after a successful commuting but 3 Addr
conversion might be good for some cases.
Consider:
int foo(int a, int b) {
return a + b;
}
Before this commit, we emit:
addl %esi, %edi
movl %edi, %eax
ret
After this commit, we try 3 Addr conversion:
leal (%rsi,%rdi), %eax
ret
Patch by Volkan Keles <vkeles@apple.com>!
Differential Revision: http://reviews.llvm.org/D10851
llvm-svn: 241206
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This patch is not intended to change existing codegen behavior for any target.
It just exposes the JumpIsExpensive setting on the command-line to allow for
easier testing and emergency overrides.
Also, change the existing regression test to use FileCheck, explicitly specify
the jump-is-expensive option, and use more precise checks.
Differential Revision: http://reviews.llvm.org/D10846
llvm-svn: 241179
|
| |
|
|
|
|
|
|
| |
The EH code might have been deleted as unreachable and the personality
pruned while the filter is still present. Currently I'm hitting this at
-O0 due to the clang bug PR24009.
llvm-svn: 241170
|
| |
|
|
|
|
|
|
| |
Added tests for encoding
Differential Revision: http://reviews.llvm.org/D10865
llvm-svn: 241159
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The incoming EBP value established by the runtime is actually a pointer
to the end of the EH registration object, and not the true parent
function frame pointer. Clang doesn't need llvm.x86.seh.exceptioninfo
anymore because we know that the exception info pointer is at a fixed
offset from this incoming EBP.
The llvm.x86.seh.recoverfp intrinsic takes an EBP value provided by the
EH runtime and returns a pointer that is usable with llvm.framerecover.
The llvm.x86.seh.restoreframe intrinsic is inserted by the 32-bit
specific preparation pass in blocks targetted by the EH runtime. It
re-establishes any physical registers used by the parent function to
address the stack, such as the frame, base, and stack pointers.
Neither of these intrinsics correctly handle stack realignment prologues
yet, but it's possible to add that later.
Reviewers: majnemer
Differential Revision: http://reviews.llvm.org/D10848
llvm-svn: 241125
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This change introduces a !make.implicit metadata that allows the
frontend to pre-select the set of explicit null checks that will be
considered for transformation into implicit null checks.
The reason for not using profiling data instead of !make.implicit is
explained in the change to `FaultMaps.rst`.
Reviewers: atrick, reames, pgavlin, JosephTremoulet
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10824
llvm-svn: 241116
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
It is mandatory to specify a comdat in order to receive comdat semantics
for a symbol. We were previously getting this wrong in -function-sections
mode; linker-weak symbols were being emitted in a selectany comdat. This
change causes such symbols to use a noduplicates comdat instead, fixing
the inconsistency.
Also correct an inaccuracy in the docs.
Differential Revision: http://reviews.llvm.org/D10828
llvm-svn: 241103
|
| |
|
|
|
|
|
|
|
|
| |
Duplicating an FP register "as itself" is a bad idea, since it violates the
invariant that every FP register is mapped to at most one FPU stack slot.
Use the scratch FP register instead.
This fixes PR23957.
llvm-svn: 241069
|
| |
|
|
|
|
| |
Add intrinsics for the FXSR instructions (FXSAVE/FXSAVE64/FXRSTOR/FXRSTOR64)
llvm-svn: 241049
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
cleanup.
This change unifies how LTOModule and the backend obtain linker flags
for globals: via a new TargetLoweringObjectFile member function named
emitLinkerFlagsForGlobal. A new function LTOModule::getLinkerOpts() returns
the list of linker flags as a single concatenated string.
This change affects the C libLTO API: the function lto_module_get_*deplibs now
exposes an empty list, and lto_module_get_*linkeropts exposes a single element
which combines the contents of all observed flags. libLTO should never have
tried to parse the linker flags; it is the linker's job to do so. Because
linkers will need to be able to parse flags in regular object files, it
makes little sense for libLTO to have a redundant mechanism for doing so.
The new API is compatible with the old one. It is valid for a user to specify
multiple linker flags in a single pragma directive like this:
#pragma comment(linker, "/defaultlib:foo /defaultlib:bar")
The previous implementation would not have exposed
either flag via lto_module_get_*deplibs (as the test in
TargetLoweringObjectFileCOFF::getDepLibFromLinkerOpt was case sensitive)
and would have exposed "/defaultlib:foo /defaultlib:bar" as a single flag via
lto_module_get_*linkeropts. This may have been a bug in the implementation,
but it does give us a chance to fix the interface.
Differential Revision: http://reviews.llvm.org/D10548
llvm-svn: 241010
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a new version of http://reviews.llvm.org/D10260.
It turned out that when you specify an integer register in inline asm on
x86 you get the register of the required type size back. That means that
X86TargetLowering::getRegForInlineAsmConstraint() has to accept any of
the integer registers and adapt its size to the given target size which
may be any 8/16/32/64 bit sized type. Surprisingly that means given a
constraint of "{ax}" and a type of MVT::F32 we need to return X86::EAX.
This change makes this face explicit, the previous code seemed like
working by accident because there it never returned an error once a
register was found. On the other hand this rewrite allows to actually
return errors for invalid situations like requesting an integer register
for an i128 type.
Related to rdar://21042280
Differential Revision: http://reviews.llvm.org/D10813
llvm-svn: 241002
|
| |
|
|
|
|
|
| |
implicit-null-check-negative.ll had a missing 2>&1. Fix this, and
remove an incorrect test case that this exposes.
llvm-svn: 240998
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: This patch fixes the cases of sext/zext constant folding in DAG combiner where constans do not fit 64 bits. The fix simply removes un$
Test Plan: New regression test included.
Reviewers: RKSimon
Reviewed By: RKSimon
Subscribers: RKSimon, llvm-commits
Differential Revision: http://reviews.llvm.org/D10607
llvm-svn: 240991
|
| |
|
|
|
|
| |
encoding, intrinsics and tests.
llvm-svn: 240936
|
| |
|
|
|
|
|
|
| |
Added tests for DAG lowering ,encoding and intrinsics
Differential Revision: http://reviews.llvm.org/D10796
llvm-svn: 240926
|
| |
|
|
|
|
|
|
|
|
|
| |
Add vscalef support
include encoding and intrinsics
review:
http://reviews.llvm.org/D10730
llvm-svn: 240906
|
| |
|
|
|
|
|
| |
Added intrinsics.
Added encoding and tests.
llvm-svn: 240905
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
simplify sdiv by a constant.
We had a hack in SDAGBuilder in place to work around this but now we
can avoid that. Call BuildExactSDIV from BuildSDIV so DAGCombiner can
perform this trick automatically.
The added check in DAGCombiner is necessary to prevent exact sdiv by pow2
from regressing as the target-specific pow2 lowering is not aware of
exact bits yet.
This is mostly covered by existing tests. One side effect is that we
get the better lowering for exact vector sdivs now too :)
llvm-svn: 240891
|
| |
|
|
|
|
|
|
|
| |
%struct.ref_s = type { %union.v, i16, i16 }
%union.v = type { i64 }
It seems %struct.ref_s is incompatible in tail padding.
llvm-svn: 240874
|
| |
|
|
|
|
|
|
|
|
|
| |
X86WindowsTargetObjectFile::getSectionForConstant""
This reverts commit r240793 while fixing how we handle array constant
pool entries.
This fixes PR23966.
llvm-svn: 240811
|
| |
|
|
|
|
| |
Fixes a miscompilation of MultiSource/Benchmarks/MallocBench/gs
llvm-svn: 240796
|
| |
|
|
|
|
| |
Allows more aggressive folding of ashr/shl pairs.
llvm-svn: 240788
|
| |
|
|
|
|
|
| |
Instcombine also does this but many opportunities only become visible
after GEPs are lowered.
llvm-svn: 240787
|
| |
|
|
|
|
|
|
| |
Revert until http://llvm.org/PR23955 is investigated.
This reverts commit r239309.
llvm-svn: 240746
|
| |
|
|
| |
llvm-svn: 240744
|
| |
|
|
|
|
|
|
|
|
| |
We don't always have FMA, for example when using 'clang -mavx512f'
without an explicit CPU.
Also check for an explicit +avx512f instead of CPUs in a couple
related tests.
llvm-svn: 240616
|
| |
|
|
|
|
|
| |
Reformat, isolate 213->231 xform, actually --check-prefix CHECK,
and deduplicate the FMA intrinsic tests (FMA3 in AMD-land).
llvm-svn: 240615
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary
This change turns on the emission of
__LLVM_Stackmaps section when generating COFF binaries.
Test Plan
Added a scenario to the test case:
test\CodeGen\X86\statepoint-stackmap-format.ll.
Code Review:
http://reviews.llvm.org/D10680
llvm-svn: 240613
|
| |
|
|
| |
llvm-svn: 240542
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch fixes PR23405 (https://llvm.org/bugs/show_bug.cgi?id=23405).
During a node unscheduling an entry in LiveRegGens can be replaced with a new value. That corrupts the live reg tracking and LiveReg* structure is not cleared as should be during unscheduling. Problematic condition that enforces Gen replacement is `I->getSUnit()->getHeight() < LiveRegGens[I->getReg()]->getHeight()`. This condition should be checked only if LiveRegGen was set in current node unscheduling.
Test Plan: Regression test included.
Reviewers: hfinkel, atrick
Reviewed By: atrick
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D9993
llvm-svn: 240538
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We used to erroneously match:
(v4i64 shuffle (v2i64 load), <0,0,0,0>)
Whereas vbroadcasti128 is more like:
(v4i64 shuffle (v2i64 load), <0,1,0,1>)
This problem doesn't exist for vbroadcastf128, which kept matching
the intrinsic after r231182. We should perhaps re-introduce the
intrinsic here as well, but that's a separate issue still being
discussed.
While there, add some proper vbroadcastf128 tests. We don't currently
match those, like for loading vbroadcastsd/ss on AVX (the reg-reg
broadcasts where added in AVX2).
Fixes PR23886.
llvm-svn: 240488
|
| |
|
|
|
|
| |
Some of them had gone stale.
llvm-svn: 240485
|
| |
|
|
|
|
| |
Removed some old duplicate tests.
llvm-svn: 240465
|
| |
|
|
|
|
| |
Added all intrinsics, tests for encoding, tests for intrinsics.
llvm-svn: 240386
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
instructions
Currently ( D10321, http://reviews.llvm.org/rL239486 ), we can use the machine combiner pass
to reassociate the following sequence to reduce the critical path:
A = ? op ?
B = A op X
C = B op Y
-->
A = ? op ?
B = X op Y
C = A op B
'op' is currently limited to x86 AVX scalar FP adds (with fast-math on), but in theory, it could
be any associative math/logic op (see TODO in code comment).
This patch generalizes the pattern match to ignore the instruction that defines 'A'. So instead of
a sequence of 3 adds, we now only need to find 2 dependent adds and decide if it's worth
reassociating them.
This generalization has a compile-time cost because we can now match more instruction sequences
and we rely more heavily on the machine combiner to discard sequences where reassociation doesn't
improve the critical path.
For example, in the new test case:
A = M div N
B = A add X
C = B add Y
We'll match 2 reassociation patterns, but this transform doesn't reduce the critical path:
A = M div N
B = A add Y
C = B add X
We need the combiner to reject that pattern but select this:
A = M div N
B = X add Y
C = B add A
Differential Revision: http://reviews.llvm.org/D10460
llvm-svn: 240361
|
| |
|
|
| |
llvm-svn: 240343
|
| |
|
|
|
|
| |
Fixes the regression test added in r240291.
llvm-svn: 240336
|
| |
|
|
| |
llvm-svn: 240332
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The _Int instructions are special, in that they operate on the full
VR128 instead of FR32. The load folding then looks at MOVSS, at the
user, and bails out when it sees a size mismatch.
What we really know is that the rm_Int instructions don't load the
higher lanes, so folding is fine.
This happens for the straightforward intrinsic code, e.g.:
_mm_add_ss(a, _mm_load_ss(p));
Fixes PR23349.
Differential Revision: http://reviews.llvm.org/D10554
llvm-svn: 240326
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
D8982 ( checked in at http://reviews.llvm.org/rL239001 ) added command-line
options to allow reciprocal estimate instructions to be used in place of
divisions and square roots.
This patch changes the default settings for x86 targets to allow that recip
codegen (except for scalar division because that breaks too much code) when
using -ffast-math or its equivalent.
This matches GCC behavior for this kind of codegen.
Differential Revision: http://reviews.llvm.org/D10396
llvm-svn: 240310
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The parser is exercised by llvm-objdump using -print-fault-maps. As is
probably obvious, the code itself was "heavily inspired" by
http://reviews.llvm.org/D10434.
Reviewers: reames, atrick, JosephTremoulet
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10491
llvm-svn: 240304
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: The code responsible for shl folding in the DAGCombiner was assuming incorrectly that all constants are less than 64 bits. This patch simply changes the way values are compared.
Test Plan: A regression test included.
Reviewers: andreadb
Reviewed By: andreadb
Subscribers: andreadb, test, llvm-commits
Differential Revision: http://reviews.llvm.org/D10602
llvm-svn: 240291
|
| |
|
|
|
|
| |
Added intrinsics and encoding tests.
llvm-svn: 240277
|
| |
|
|
| |
llvm-svn: 240258
|
| |
|
|
|
|
|
|
|
| |
This allows more call sequences to use pushes instead of movs when optimizing for size.
In particular, calling conventions that pass some parameters in registers (e.g. thiscall) are now supported.
Differential Revision: http://reviews.llvm.org/D10500
llvm-svn: 240257
|
| |
|
|
|
|
|
| |
VPERMI2W/D/Q/PS/PD instructions.
Added tests.
llvm-svn: 240256
|
| |
|
|
| |
llvm-svn: 240253
|
| |
|
|
| |
llvm-svn: 240241
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
binary search tree
Sparse switches with profile info are lowered as weight-balanced BSTs. For
example, if the node weights are {1,1,1,1,1,1000}, the right-most node would
end up in a tree by itself, bringing it closer to the top.
However, a leaf in this BST can contain up to 3 cases, and having a single
case in a leaf node as in the example means the tree might become
unnecessarily high.
This patch adds a heauristic to the pivot selection algorithm that moves more
cases into leaf nodes unless that would lower their rank. It still doesn't
yield the optimal tree in every case, but I believe it's conservatibely correct.
llvm-svn: 240224
|