| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
| |
llvm-svn: 233554
|
| |
|
|
|
|
| |
see comments http://reviews.llvm.org/D6835
llvm-svn: 233528
|
| |
|
|
|
|
| |
by Asaf Badouh (asaf.badouh@intel.com)
llvm-svn: 233525
|
| |
|
|
|
|
| |
which doesn't have AES. Having AES and not PCLMUL makes 'corei7' halfway between Nehalem and Westmere.
llvm-svn: 233517
|
| |
|
|
|
|
| |
By Asaf Badouh (asaf.badouh@intel.com)
llvm-svn: 233489
|
| |
|
|
|
|
| |
instead of from MCInstPrinter::AvailableFeatures.
llvm-svn: 233485
|
| |
|
|
| |
llvm-svn: 233474
|
| |
|
|
| |
llvm-svn: 233473
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
per-function subtarget.
Currently, code-gen passes the default or generic subtarget to the constructors
of MCInstPrinter subclasses (see LLVMTargetMachine::addPassesToEmitFile), which
enables some targets (AArch64, ARM, and X86) to change their instprinter's
behavior based on the subtarget feature bits. Since the backend can now use
different subtargets for each function, instprinter has to be changed to use the
per-function subtarget rather than the default subtarget.
This patch takes the first step towards enabling instprinter to change its
behavior based on the per-function subtarget. It adds a bit "PassSubtarget" to
AsmWriter which tells table-gen to pass a reference to MCSubtargetInfo to the
various print methods table-gen auto-generates.
I will follow up with changes to instprinters of AArch64, ARM, and X86.
llvm-svn: 233411
|
| |
|
|
| |
llvm-svn: 233392
|
| |
|
|
| |
llvm-svn: 233293
|
| |
|
|
| |
llvm-svn: 233292
|
| |
|
|
| |
llvm-svn: 233289
|
| |
|
|
|
|
|
|
|
| |
This patch teaches fast-isel how to select 128-bit vector load instructions.
Added test CodeGen/X86/fast-isel-vecload.ll
Differential Revision: http://reviews.llvm.org/D8605
llvm-svn: 233270
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch allows AVX blend instructions to handle insertion into the low
element of a 256-bit vector for the appropriate data types.
For f32, instead of:
vblendps $1, %xmm1, %xmm0, %xmm1 ## xmm1 = xmm1[0],xmm0[1,2,3]
vblendps $15, %ymm1, %ymm0, %ymm0 ## ymm0 = ymm1[0,1,2,3],ymm0[4,5,6,7]
we get:
vblendps $1, %ymm1, %ymm0, %ymm0 ## ymm0 = ymm1[0],ymm0[1,2,3,4,5,6,7]
For f64, instead of:
vmovsd %xmm1, %xmm0, %xmm1 ## xmm1 = xmm1[0],xmm0[1]
vblendpd $3, %ymm1, %ymm0, %ymm0 ## ymm0 = ymm1[0,1],ymm0[2,3]
we get:
vblendpd $1, %ymm1, %ymm0, %ymm0 ## ymm0 = ymm1[0],ymm0[1,2,3]
For the hardware-neglected integer data types, I left a TODO comment in the
code and added regression tests for a follow-on patch.
Differential Revision: http://reviews.llvm.org/D8609
llvm-svn: 233199
|
| |
|
|
|
|
| |
functions from X86 MC layer. They haven't been used since CPU autodetection was removed from X86Subtarget.cpp.
llvm-svn: 233170
|
| |
|
|
|
|
|
|
|
|
|
| |
We can't use TargetFrameLowering::getFrameIndexOffset directly, because
Win64 really wants the offset from the stack pointer at the end of the
prologue. Instead, use X86FrameLowering::getFrameIndexOffsetFromSP(),
which is a pretty close approximiation of that. It fails to handle cases
with interestingly large stack alignments, which is pretty uncommon on
Win64 and is TODO.
llvm-svn: 233137
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
vperm2x128 instructions have the special ability (aka free hardware capability)
to shuffle zero values into a vector.
This patch recognizes that type of shuffle and generates the appropriate
control byte.
https://llvm.org/bugs/show_bug.cgi?id=22984
Differential Revision: http://reviews.llvm.org/D8563
llvm-svn: 233100
|
| |
|
|
|
|
|
|
| |
This reverts commit r233055.
It still causes buildbot failures (gcc running out of memory on several platforms, and a self-host failure on arm), although less than the previous time.
llvm-svn: 233068
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, subtarget features were a bitfield with the underlying type being uint64_t.
Since several targets (X86 and ARM, in particular) have hit or were very close to hitting this bound, switching the features to use a bitset.
No functional change.
The first time this was committed (r229831), it caused several buildbot failures.
At least some of the ARM ones were due to gcc/binutils issues, and should now be fixed.
Differential Revision: http://reviews.llvm.org/D8542
llvm-svn: 233055
|
| |
|
|
|
|
|
|
|
|
| |
Simplify boolean expressions with `true` and `false` with `clang-tidy`
Patch by Richard Thomson.
Differential Revision: http://reviews.llvm.org/D8519
llvm-svn: 233002
|
| |
|
|
| |
llvm-svn: 232998
|
| |
|
|
| |
llvm-svn: 232923
|
| |
|
|
|
|
| |
- was reporting 'warning C4715: 'getType32' : not all control paths return a value'
llvm-svn: 232913
|
| |
|
|
|
|
|
|
|
|
|
| |
As preparation for removing the getSubtargetImpl() call from
TargetMachine go ahead and flip the switch on caching the function
dependent subtarget and remove the bare getSubtargetImpl call
from the X86 port. As part of this add a few tests that show we
can generate code and assemble on X86 based on features/cpu on
the Function.
llvm-svn: 232879
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With this patch, for this one exact case, we'll generate:
blendps %xmm0, %xmm1, $1
instead of:
insertps %xmm0, %xmm1, $0
If there's a memory operand available for load folding and we're
optimizing for size, we'll still generate the insertps.
The detailed performance data motivation for this may be found in D7866;
in summary, blendps has 2-3x throughput vs. insertps on widely used chips.
Differential Revision: http://reviews.llvm.org/D8332
llvm-svn: 232850
|
| |
|
|
| |
llvm-svn: 232848
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The main differences are:
* Split in 32 and 64 bit functions.
* First switch on the Modifier so that we have only one non fully covered
switch.
* Map the fixup kind first to a x86_64 (or i386) specific enum, to make
it easy to handle cases like X86::reloc_riprel_4byte_movq_load.
* Switch on IsPCRel last, which reduces code duplication.
Fixes pr22308.
llvm-svn: 232837
|
| |
|
|
| |
llvm-svn: 232822
|
| |
|
|
| |
llvm-svn: 232814
|
| |
|
|
| |
llvm-svn: 232813
|
| |
|
|
| |
llvm-svn: 232811
|
| |
|
|
| |
llvm-svn: 232810
|
| |
|
|
|
|
| |
functions; NFCI
llvm-svn: 232781
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Another case of x86-specific shuffle strength reduction:
avoid generating insert*128 instructions with index 0 because
they are slower than their non-lane-changing blend equivalents.
Shuffle lowering already catches most of these cases, but
the zero vector case and some other paths such as in the
modified test in vector-shuffle-256-v32.ll were getting
through.
Differential Revision: http://reviews.llvm.org/D8366
llvm-svn: 232773
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
There are two main advantages to doing this
* Targets that only need to handle one of the formats specially don't have
to worry about the others. For example, x86 now only registers a
constructor for the COFF streamer.
* Changes to the arguments passed to one format constructor will not impact
the other formats.
llvm-svn: 232699
|
| |
|
|
| |
llvm-svn: 232688
|
| |
|
|
|
|
|
|
| |
Fixed broken tests.
Differential Revision: http://reviews.llvm.org/D8416
llvm-svn: 232682
|
| |
|
|
|
|
|
|
| |
appears to have broken tests/bots.
This reverts commit r232660.
llvm-svn: 232670
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Currently v2i64 vectors shifts (non-equal shift amounts) are scalarized, costing 4 x extract, 2 x x86-shifts and 2 x insert instructions - and it gets even more awkward on 32-bit targets.
This patch separately shifts the vector by both shift amounts and then shuffles the partial results back together, costing 2 x shuffles and 2 x sse-shifts instructions (+ 2 movs on pre-AVX hardware).
Note - this patch only improves the SHL / LSHR logical shifts as only these are supported in SSE hardware.
Differential Revision: http://reviews.llvm.org/D8416
llvm-svn: 232660
|
| |
|
|
|
|
|
|
| |
We can get there with .code64.
Fixes pr22349.
llvm-svn: 232651
|
| |
|
|
| |
llvm-svn: 232481
|
| |
|
|
| |
llvm-svn: 232429
|
| |
|
|
| |
llvm-svn: 232428
|
| |
|
|
|
|
|
|
|
| |
uppercase letter
This covers essentially all of llvm's headers and libs. One or two weird
cases I wasn't sure were worth/appropriate to fix.
llvm-svn: 232394
|
| |
|
|
| |
llvm-svn: 232385
|
| |
|
|
| |
llvm-svn: 232378
|
| |
|
|
| |
llvm-svn: 232375
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
InlineAsm::Constraint_m. NFC.
Summary:
This is instead of doing this in target independent code and is the last
non-functional change before targets begin to distinguish between
different memory constraints when selecting code for the ISD::INLINEASM
node.
Next, each target will individually move away from the idea that all
memory constraints behave like 'm'.
Subscribers: jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D8173
llvm-svn: 232373
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch consists of the suggestions of clang-tidy/misc-static-assert check.
Reviewers: alexfh
Reviewed By: alexfh
Subscribers: xazax.hun, llvm-commits
Differential Revision: http://reviews.llvm.org/D8343
llvm-svn: 232366
|