| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
| |
llvm-svn: 241567
|
| |
|
|
|
|
|
|
|
|
|
|
| |
zero'd lanes
The vperm2f128/vperm2i128 shuffle mask decoding was not attempting to deal with shuffles that give zero lanes. This patch fixes this so that the assembly printer can provide shuffle comments.
As this decoder is also used in X86ISelLowering for shuffle combining, I've added an early-out to match existing behaviour. The hope is that we can add zero support in the future, this would allow other ops' decodes (e.g. insertps) to be combined as well.
Differential Revision: http://reviews.llvm.org/D10593
llvm-svn: 241516
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Extend the reassociation optimization of http://reviews.llvm.org/rL240361 (D10460)
to SSE scalar FP SP adds in addition to AVX scalar FP SP adds.
With the 'switch' in place, we can trivially add other opcodes and test cases in
future patches.
Differential Revision: http://reviews.llvm.org/D10975
llvm-svn: 241515
|
| |
|
|
|
|
|
|
| |
This patch adds vectorization support for uniform constant i64 arithmetic shift right operators.
Differential Revision: http://reviews.llvm.org/D9645
llvm-svn: 241514
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This patch adds support for v8i16 and v16i8 shuffle lowering using the immediate versions of the SSE4A EXTRQ and INSERTQ instructions. Although rather limited (they can only act on the lower 64-bits of the source vectors, leave the upper 64-bits of the result vector undefined and don't have VEX encoded variants), the instructions are still useful for the zero extension of any lane (EXTRQ) or inserting a lane into another vector (INSERTQ). Testing demonstrated that it wasn't typically worth it to use these instructions for v2i64 or v4i32 vector shuffles although they are capable of it.
As well as adding specific pattern matching for the shuffles, the patch uses EXTRQ for zero extension cases where SSE41 isn't available and its more efficient than the SSE2 'unpack' default approach. It also adds shuffle decode support for the EXTRQ / INSERTQ cases when the instructions are handling full byte-sized extractions / insertions.
From this foundation, future patches will be able to make use of the instructions for situations that use their ability to extract/insert at the bit level.
Differential Revision: http://reviews.llvm.org/D10146
llvm-svn: 241508
|
| |
|
|
|
|
|
|
|
|
|
|
| |
implementation
With the completion of D9746 there is now a common implementation of integer signed/unsigned min/max nodes, removing the need for the equivalent X86 specific implementations.
This patch removes the old X86ISD nodes, legalizes the relevant SSE2/SSE41/AVX2/AVX512 instructions for the ISD versions and converts the small amount of existing X86 code.
Differential Revision: http://reviews.llvm.org/D10947
llvm-svn: 241506
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This concludes the patch series to eliminate StringRef forms of GNU triples
from the internals of LLVM that began in r239036.
At this point, the StringRef-form of GNU Triples should only be used in the
public API (including IR serialization) and a couple objects that directly
interact with the API (most notably the Module class). The next step is to
replace these Triple objects with the TargetTuple object that will represent
our authoratative/unambiguous internal equivalent to GNU Triples.
Reviewers: rengolin
Subscribers: llvm-commits, jholewinski, ted, rengolin
Differential Revision: http://reviews.llvm.org/D10962
llvm-svn: 241472
|
| |
|
|
|
|
|
|
|
| |
pmulhrsw
review:
http://reviews.llvm.org/D10948
llvm-svn: 241443
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
From the linker's perspective, an available_externally global is equivalent
to an external declaration (per isDeclarationForLinker()), so it is incorrect
to consider it to be a weak definition.
Also clean up some logic in the dead argument elimination pass and clarify
its comments to better explain how its behavior depends on linkage,
introduce GlobalValue::isStrongDefinitionForLinker() and start using
it throughout the optimizers and backend.
Differential Revision: http://reviews.llvm.org/D10941
llvm-svn: 241413
|
| |
|
|
|
|
|
|
| |
There is some functional change here because it changes target code from
atoi(3) to StringRef::getAsInteger which has error checking. For valid
constraints there should be no difference.
llvm-svn: 241411
|
| |
|
|
|
|
|
|
|
| |
include encoding and intrinsics tests.
review
http://reviews.llvm.org/D10896
llvm-svn: 241406
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Correctly support assembling "pushw $imm8" on x86-64 targets.
Also some cleanup of the PUSH instructions (PUSH64i16 and PUSHi16 actually
represent the same instruction)
This fixes PR23996
Patch by: david.l.kreitzer@intel.com
Differential Revision: http://reviews.llvm.org/D10878
llvm-svn: 241404
|
| |
|
|
|
|
| |
Followup to D10433 and D10589 that fixes i8/i16 uint2fp vector conversions by zero extending to i32 and using the sint2fp path (unless the target does actually support uint2fp).
llvm-svn: 241394
|
| |
|
|
| |
llvm-svn: 241381
|
| |
|
|
|
|
|
| |
It can fail trying to get the section on ELF and COFF. This makes sure the
error is handled.
llvm-svn: 241366
|
| |
|
|
| |
llvm-svn: 241365
|
| |
|
|
|
|
|
|
| |
Add support for v2i8/v2i16 to v2f64 by using a sign extension to v2i32 before conversion to v2f64.
Differential Revision: http://reviews.llvm.org/D10589
llvm-svn: 241325
|
| |
|
|
|
|
|
|
| |
This patch adds support for sign extension for sub 128-bit vectors, such as to v2i32. It concatenates with UNDEF subvectors up to 128-bits, performs the sign extension (i.e. as v4i32) and then extracts the target subvector.
Patch 1/2 of D10589 - the second patch covers the conversion of v2i8/v2i16 to v2f64.
llvm-svn: 241323
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This function can really fail since the string table offset can be out of
bounds.
Using ErrorOr makes sure the error is checked.
Hopefully a lot of the boilerplate code in tools/* can go away once we have
a diagnostic manager in Object.
llvm-svn: 241297
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This checks subtarget feature compatibility for inlining by verifying
that the callee is a strict subset of the caller's features. This includes
the cpu as part of the subtarget we can get via the incoming functions as
the backend takes CPUs as feature sets.
This allows us to inline things like:
int foo() { return baz(); }
int __attribute__((target("sse4.2"))) bar() {
return foo();
}
so that generic code can be inlined into specialized functions.
llvm-svn: 241221
|
| |
|
|
| |
llvm-svn: 241174
|
| |
|
|
|
|
|
|
| |
The EH code might have been deleted as unreachable and the personality
pruned while the filter is still present. Currently I'm hitting this at
-O0 due to the clang bug PR24009.
llvm-svn: 241170
|
| |
|
|
|
|
|
|
| |
Added tests for encoding
Differential Revision: http://reviews.llvm.org/D10865
llvm-svn: 241159
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
instructions.
Only consider an instruction a candidate for relaxation if the last operand of the
instruction is an expression. We previously checked whether any operand is an expression,
which is useless, since for all instructions concerned, the only operand that may be
affected by relaxation is the last one.
In addition, this removes the check for having RIP as an argument, since it was
plain wrong - even when one of the arguments is RIP, relaxation may still be needed.
This fixes PR9807.
Patch by: david.l.kreitzer@intel.com
Differential Revision: http://reviews.llvm.org/D10766
llvm-svn: 241152
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The incoming EBP value established by the runtime is actually a pointer
to the end of the EH registration object, and not the true parent
function frame pointer. Clang doesn't need llvm.x86.seh.exceptioninfo
anymore because we know that the exception info pointer is at a fixed
offset from this incoming EBP.
The llvm.x86.seh.recoverfp intrinsic takes an EBP value provided by the
EH runtime and returns a pointer that is usable with llvm.framerecover.
The llvm.x86.seh.restoreframe intrinsic is inserted by the 32-bit
specific preparation pass in blocks targetted by the EH runtime. It
re-establishes any physical registers used by the parent function to
address the stack, such as the frame, base, and stack pointers.
Neither of these intrinsics correctly handle stack realignment prologues
yet, but it's possible to add that later.
Reviewers: majnemer
Differential Revision: http://reviews.llvm.org/D10848
llvm-svn: 241125
|
| |
|
|
|
|
|
|
|
|
| |
Duplicating an FP register "as itself" is a bad idea, since it violates the
invariant that every FP register is mapped to at most one FPU stack slot.
Use the scratch FP register instead.
This fixes PR23957.
llvm-svn: 241069
|
| |
|
|
| |
llvm-svn: 241061
|
| |
|
|
|
|
|
|
|
| |
represented by uint64_t, this patch replaces these
usages with the FeatureBitset (std::bitset) type.
Differential Revision: http://reviews.llvm.org/D10542
llvm-svn: 241058
|
| |
|
|
|
|
| |
Add intrinsics for the FXSR instructions (FXSAVE/FXSAVE64/FXRSTOR/FXRSTOR64)
llvm-svn: 241049
|
| |
|
|
| |
llvm-svn: 241033
|
| |
|
|
|
|
|
|
|
|
|
| |
Realistically, this will be returning ErrorOr for some time as refactoring the
user code to check once per section will take some time.
Given that, use it for checking if a relocation has addend or not.
While at it, add ELFRelocationRef to simplify the users.
llvm-svn: 241028
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
cleanup.
This change unifies how LTOModule and the backend obtain linker flags
for globals: via a new TargetLoweringObjectFile member function named
emitLinkerFlagsForGlobal. A new function LTOModule::getLinkerOpts() returns
the list of linker flags as a single concatenated string.
This change affects the C libLTO API: the function lto_module_get_*deplibs now
exposes an empty list, and lto_module_get_*linkeropts exposes a single element
which combines the contents of all observed flags. libLTO should never have
tried to parse the linker flags; it is the linker's job to do so. Because
linkers will need to be able to parse flags in regular object files, it
makes little sense for libLTO to have a redundant mechanism for doing so.
The new API is compatible with the old one. It is valid for a user to specify
multiple linker flags in a single pragma directive like this:
#pragma comment(linker, "/defaultlib:foo /defaultlib:bar")
The previous implementation would not have exposed
either flag via lto_module_get_*deplibs (as the test in
TargetLoweringObjectFileCOFF::getDepLibFromLinkerOpt was case sensitive)
and would have exposed "/defaultlib:foo /defaultlib:bar" as a single flag via
lto_module_get_*linkeropts. This may have been a bug in the implementation,
but it does give us a chance to fix the interface.
Differential Revision: http://reviews.llvm.org/D10548
llvm-svn: 241010
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a new version of http://reviews.llvm.org/D10260.
It turned out that when you specify an integer register in inline asm on
x86 you get the register of the required type size back. That means that
X86TargetLowering::getRegForInlineAsmConstraint() has to accept any of
the integer registers and adapt its size to the given target size which
may be any 8/16/32/64 bit sized type. Surprisingly that means given a
constraint of "{ax}" and a type of MVT::F32 we need to return X86::EAX.
This change makes this face explicit, the previous code seemed like
working by accident because there it never returned an error once a
register was found. On the other hand this rewrite allows to actually
return errors for invalid situations like requesting an integer register
for an i128 type.
Related to rdar://21042280
Differential Revision: http://reviews.llvm.org/D10813
llvm-svn: 241002
|
| |
|
|
|
|
| |
encoding, intrinsics and tests.
llvm-svn: 240936
|
| |
|
|
|
|
|
|
| |
Added tests for DAG lowering ,encoding and intrinsics
Differential Revision: http://reviews.llvm.org/D10796
llvm-svn: 240926
|
| |
|
|
| |
llvm-svn: 240924
|
| |
|
|
|
|
|
|
|
|
|
| |
Add vscalef support
include encoding and intrinsics
review:
http://reviews.llvm.org/D10730
llvm-svn: 240906
|
| |
|
|
|
|
|
| |
Added intrinsics.
Added encoding and tests.
llvm-svn: 240905
|
| |
|
|
|
|
|
|
|
|
|
| |
X86WindowsTargetObjectFile::getSectionForConstant""
This reverts commit r240793 while fixing how we handle array constant
pool entries.
This fixes PR23966.
llvm-svn: 240811
|
| |
|
|
|
|
|
|
| |
Patch by Matthew Barney. Thanks!
Differential Revision: http://reviews.llvm.org/D9514
llvm-svn: 240795
|
| |
|
|
|
|
| |
It seems to have caused PR23966: "UNREACHABLE executed at ..\lib\Target\X86\X86TargetObjectFile.cpp:148"
llvm-svn: 240793
|
| |
|
|
| |
llvm-svn: 240785
|
| |
|
|
|
|
| |
No functionality changed, just keeping things clean.
llvm-svn: 240762
|
| |
|
|
|
|
|
|
| |
Revert until http://llvm.org/PR23955 is investigated.
This reverts commit r239309.
llvm-svn: 240746
|
| |
|
|
|
|
|
| |
This allows user code to say Sym.getSize() instead of having to manually fetch
the object.
llvm-svn: 240708
|
| |
|
|
|
|
|
|
|
|
| |
We don't always have FMA, for example when using 'clang -mavx512f'
without an explicit CPU.
Also check for an explicit +avx512f instead of CPUs in a couple
related tests.
llvm-svn: 240616
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary
This change turns on the emission of
__LLVM_Stackmaps section when generating COFF binaries.
Test Plan
Added a scenario to the test case:
test\CodeGen\X86\statepoint-stackmap-format.ll.
Code Review:
http://reviews.llvm.org/D10680
llvm-svn: 240613
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Deciding that insn->sibIndex is SIB_INDEX_NONE does not require another
check beyond the fully decoded bits being equal to 0x4.
The expression insn->sibIndex == SIB_INDEX_sib could not have been true unless
index were 0x4, because SIB_INDEX_sib is merely the range base (SIB_INDEX_EAX)
plus 4. Respectively SIB_INDEX_sib64.
- Don't use a switch statement to perform left-shift.
Differential Revision: http://reviews.llvm.org/D9762
llvm-svn: 240598
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
COFF and MachO only define symbol sizes for common symbols. Reflect that
in the class hierarchy by having a method for common symbols only in the base
and a general one in ELF.
This avoids the need of using a magic value for the size, which had a few
problems
* Most callers didn't check for it.
* The ones that did could not tell the magic value from a file actually having
that value.
llvm-svn: 240529
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We used to erroneously match:
(v4i64 shuffle (v2i64 load), <0,0,0,0>)
Whereas vbroadcasti128 is more like:
(v4i64 shuffle (v2i64 load), <0,1,0,1>)
This problem doesn't exist for vbroadcastf128, which kept matching
the intrinsic after r231182. We should perhaps re-introduce the
intrinsic here as well, but that's a separate issue still being
discussed.
While there, add some proper vbroadcastf128 tests. We don't currently
match those, like for loading vbroadcastsd/ss on AVX (the reg-reg
broadcasts where added in AVX2).
Fixes PR23886.
llvm-svn: 240488
|