| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commutation
MOVSD/MOVSS take a 128-bit register and a FR32/FR64 register input, the commutation code wasn't taking this into account leading to verification errors.
This patch inserts a vreg copy mi to ensure that the registers are correct.
Fix for PR30607
Differential Revision: https://reviews.llvm.org/D25280
llvm-svn: 283539
|
|
|
|
| |
llvm-svn: 283524
|
|
|
|
| |
llvm-svn: 283523
|
|
|
|
| |
llvm-svn: 283515
|
|
|
|
|
|
|
|
|
|
|
|
| |
When replacing FrameIndex with BasePtr, we must preserve BasePtr for
LEA64_32r since BasePtr is used later for stack adjustment if it is
the same as StackPtr.
Patch by H.J Lu <hjl.tools@gmail.com>
Differential Revision: https://reviews.llvm.org/D23575
llvm-svn: 283486
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change erroneous parsing of push immediate instructions in intel syntax
to default to pointer size by rewriting into the ATT style for matching.
This fixes PR22028.
Reviewers: majnemer, rnk
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D25288
llvm-svn: 283457
|
|
|
|
|
|
|
|
| |
Fuchsia is a new operating system.
Differential Revision: https://reviews.llvm.org/D25116
llvm-svn: 283419
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: This makes a change to the state used to maintain visited information for depth first iterator. We know assume a method "completed(...)" which is called after all children of a node have been visited. In all existing cases, this method does nothing so this patch has no functional changes. It will however allow a client to distinguish back from cross edges in a DFS tree.
Reviewers: nadav, mehdi_amini, dberlin
Subscribers: MatzeB, mzolotukhin, twoh, freik, llvm-commits
Differential Revision: https://reviews.llvm.org/D25191
llvm-svn: 283391
|
|
|
|
|
|
|
|
|
| |
(PR26302)"
This is suspected to cause a miscompile in Chromium. Reverting while
investigating.
llvm-svn: 283329
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D25112
llvm-svn: 283326
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The motivation for the change is that we can't have pseudo-global settings for
codegen living in TargetOptions because that doesn't work with LTO.
Ideally, these reciprocal attributes will be moved to the instruction-level via
FMF, metadata, or something else. But making them function attributes is at least
an improvement over the current state.
The ingredients of this patch are:
Remove the reciprocal estimate command-line debug option.
Add TargetRecip to TargetLowering.
Remove TargetRecip from TargetOptions.
Clean up the TargetRecip implementation to work with this new scheme.
Set the default reciprocal settings in TargetLoweringBase (everything is off).
Update the PowerPC defaults, users, and tests.
Update the x86 defaults, users, and tests.
Note that if this patch needs to be reverted, the related clang patch checked in
at r283251 should be reverted too.
Differential Revision: https://reviews.llvm.org/D24816
llvm-svn: 283252
|
|
|
|
|
|
| |
match MOV8rm.
llvm-svn: 283184
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(PR30433)
This should fix:
https://llvm.org/bugs/show_bug.cgi?id=30433
There are a couple of open questions about the codegen:
1. Should we let scalar ops be scalars and avoid vector constant loads/splats?
2. Should we have a pass to combine constants such as the inverted pair that we have here?
Differential Revision: https://reviews.llvm.org/D25165
llvm-svn: 283119
|
|
|
|
|
|
| |
This was accidentally copy and pasted from other Pseudos in the file.
llvm-svn: 283084
|
|
|
|
|
|
| |
I don't know for sure that we truly needs this, but its the only vector load that isn't rematerializable. Making it consistent allows it to not be a special case in the td files.
llvm-svn: 283083
|
|
|
|
| |
llvm-svn: 283080
|
|
|
|
|
|
|
|
|
|
|
|
| |
To allow broadcast loads of a non-zero'th vector element, lowerVectorShuffleAsBroadcast can replace a load with a new load with an adjusted address, but unfortunately we weren't ensuring that the new load respected the same dependencies.
This patch adds a TokenFactor and updates all dependencies of the old load to reference the new load instead.
Bug found during internal testing.
Differential Revision: https://reviews.llvm.org/D25039
llvm-svn: 283070
|
|
|
|
|
|
| |
64-bit. This way we don't have to catch them and do nothing with them in ReplaceNodeResults.
llvm-svn: 283066
|
|
|
|
| |
llvm-svn: 283065
|
|
|
|
| |
llvm-svn: 283050
|
|
|
|
| |
llvm-svn: 283041
|
|
|
|
|
|
| |
We already had support for 1-input BLEND with zero - this adds support for 2-input BLEND as well.
llvm-svn: 283040
|
|
|
|
|
|
|
|
| |
Now we can commute to BLENDPD/BLENDPS on SSE41+ targets if necessary, so simplify the combine matching where we can.
This required me to add a couple of scalar math movsd/moss fold patterns that hadn't been needed in the past.
llvm-svn: 283038
|
|
|
|
|
|
|
|
|
|
|
|
| |
targets
Instead of selecting between MOVSD/MOVSS and BLENDPD/BLENDPS at shuffle lowering by subtarget this will help us select the instruction based on actual commutation requirements.
We could possibly add BLENDPD/BLENDPS -> MOVSD/MOVSS commutation and MOVSD/MOVSS memory folding using a similar approach if it proves useful
I avoided adding AVX512 handling as I'm not sure when we should be making use of VBLENDPD/VBLENDPS on EVEX targets
llvm-svn: 283037
|
|
|
|
|
|
|
|
| |
-Remove OptForSize. Not all of the backend follows the same rules for creating broadcasts and there is no conflicting pattern.
-Don't stop selecting VEX VMOVDDUP when AVX512 is supported. We need VLX for EVEX VMOVDDUP.
-Only use VMOVDDUP for v2i64 broadcasts if AVX2 is not supported.
llvm-svn: 283020
|
|
|
|
| |
llvm-svn: 283018
|
|
|
|
| |
llvm-svn: 283015
|
|
|
|
| |
llvm-svn: 283004
|
|
|
|
|
|
|
|
|
|
|
| |
We can't use Jcc to leave a Win64 function in general, because that
confuses the unwinder. However, for "leaf" functions, that is, functions
where the return address is always on top of the stack and which don't
have unwind info, it's OK.
Differential Revision: https://reviews.llvm.org/D24836
llvm-svn: 282920
|
|
|
|
|
|
| |
stack spilling pseudos for XMM16-31 and YMM16-31 without VLX.
llvm-svn: 282843
|
|
|
|
|
|
| |
without VLX to teh isFrameLoadOpcode and isFrameStoreOpcode.
llvm-svn: 282842
|
|
|
|
|
|
|
|
| |
addRegisterClass regardless of whether AVX512/VLX is enabled or not."
Turns out this doesn't pass verify-machineinstrs.
llvm-svn: 282841
|
|
|
|
|
|
|
|
| |
also missing. Change register class to include the extra 16 AVX512 registers.
I'm not completely sure what this method does or why all the 256-bit VTs returned VR128RegClass when the comments on the method definiton say it should return the largest super register class. I just figured AVX-512 should be similar.
llvm-svn: 282836
|
|
|
|
|
|
|
|
|
|
| |
addRegisterClass regardless of whether AVX512/VLX is enabled or not.
If AVX512 is disabled, the registers should already be marked reserved. Pattern predicates and register classes on instructions should take care of most of the rest. Loads/stores and physical register copies for XMM16-31 and YMM16-31 without VLX have already been taken care of.
I'm a little unclear why this changed the register allocation of the SSE2 run of the sad.ll test, but the registers selected appear to be valid after this change.
llvm-svn: 282835
|
|
|
|
|
|
|
|
|
| |
Code that doesn't use floating point and doesn't use SSE (kernel code)
shouldn't save and restore SSE registers.
Fixes PR30503
llvm-svn: 282819
|
|
|
|
| |
llvm-svn: 282732
|
|
|
|
|
|
|
|
|
|
| |
The shuffle mask decodes have a large amount of repeated code extracting/splitting mask values from Constant data.
This patch pulls all of this duplicated code into a single helper function to identify undef elements and combine/split constant integer data into the requested shuffle mask elements.
Updated PSHUFB/VPERMIL/VPERMIL2/VPPERM decoders to use it (VPERMV/VPERMV3 could be converted as well in the future).
llvm-svn: 282720
|
|
|
|
|
|
|
|
| |
This adds new pseudo instructions that can be selected during register allocation to represent loads and stores of XMM/YMM registers when AVX512F is available, but VLX isn't. They will be converted to VEX encoded moves if the register turns out to be XMM0-15/YMM0-15. Otherwise either an EVEX VEXTRACT(store) or VBROADCAST(load) will be used.
Fixes one of the cases from PR29112.
llvm-svn: 282690
|
|
|
|
|
|
| |
(X86VBroadcast f64:)). Add AVX512VL to command line of existing AVX2 test that hits this condition.
llvm-svn: 282688
|
|
|
|
|
|
| |
domain fixing table.
llvm-svn: 282687
|
|
|
|
| |
llvm-svn: 282686
|
|
|
|
| |
llvm-svn: 282684
|
|
|
|
|
|
|
|
|
|
| |
Implement 'retn' simply by aliasing it to the relevant 'ret' instruction
Commit on behalf of coby
Differential Revision: https://reviews.llvm.org/D24346
llvm-svn: 282601
|
|
|
|
|
|
|
|
|
|
|
| |
The KORTEST was introduced due to a bug where a TEST instruction used a K register.
but, turns out that the opposite case of KORTEST using a GPR is now happening
The change removes the KORTEST flow and adds a COPY instruction from the K reg to a GPR.
Differential Revision: https://reviews.llvm.org/D24953
llvm-svn: 282580
|
|
|
|
| |
llvm-svn: 282579
|
|
|
|
|
|
|
|
|
|
|
|
| |
The 'or' case shows up in copysign. The copysign code also had
redundant checking for a scalar zero operand with 'and', so I
removed that.
I'm not sure how to test vector 'and', 'andn', and 'xor' yet,
but it seems better to just include all of the logic ops since
we're fixing 'or' anyway.
llvm-svn: 282546
|
|
|
|
|
|
| |
Also, put the related FP logic functions together to see the similarities.
llvm-svn: 282522
|
|
|
|
|
|
| |
will not return a value greater than 32. I think it theoretically could be 64 for AVX-512.
llvm-svn: 282471
|
|
|
|
|
| |
PR: 30494
llvm-svn: 282451
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[X86] The .code16gcc directive parses X86 assembly input in 32-bit mode and
outputs in 16-bit mode. Teach parser to switch modes appropriately.
Reviewers: dwmw2, craig.topper
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D20109
llvm-svn: 282430
|