| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
| |
llvm-svn: 250162
|
|
|
|
| |
llvm-svn: 250151
|
|
|
|
| |
llvm-svn: 250148
|
|
|
|
|
|
| |
parser will check the size.
llvm-svn: 250147
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
D4796 taught LLVM to fold some atomic integer operations into a single
instruction. The pattern was unaware that the instructions clobbered
flags.
This patch adds the missing EFLAGS definition.
Floating point operations don't set flags, the subsequent fadd
optimization is therefore correct. The same applies for surrounding
load/store optimizations.
Reviewers: rsmith, rtrieu
Subscribers: llvm-commits, reames, morisset
Differential Revision: http://reviews.llvm.org/D13680
llvm-svn: 250135
|
|
|
|
|
|
|
|
|
| |
We made them SP relative back in March (r233137) because that's the
value the runtime passes to EH functions. With the new cleanuppad IR,
funclets adjust their frame argument from SP to FP, so our offsets
should now be FP-relative.
llvm-svn: 250088
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Function LowerVSETCC (in X86ISelLowering.cpp) worked under the wrong
assumption that for non-AVX512 targets, the source type and destination type
of a type-legalized setcc node were always the same type.
This assumption was unfortunately incorrect; the type legalizer is not always
able to promote the return type of a setcc to the same type as the first
operand of a setcc.
In the case of a vsetcc node, the legalizer firstly checks if the first input
operand has a legal type. If so, then it promotes the return type of the vsetcc
to that same type. Otherwise, the return type is promoted to the 'next legal
type', which, for vectors of MVT::i1 is always a 128-bit integer vector type.
Example (-mattr=+avx):
%0 = trunc <8 x i32> %a to <8 x i23>
%1 = icmp eq <8 x i23> %0, zeroinitializer
The initial selection dag for the code above is:
v8i1 = setcc t5, t7, seteq:ch
t5: v8i23 = truncate t2
t2: v8i32,ch = CopyFromReg t0, Register:v8i32 %vreg1
t7: v8i32 = build_vector of all zeroes.
The type legalizer would firstly check if 't5' has a legal type. If so, then it
would reuse that same type to promote the return type of the setcc node.
Unfortunately 't5' is of illegal type v8i23, and therefore it cannot be used to
promote the return type of the setcc node. Consequently, the setcc return type
is promoted to v8i16. Later on, 't5' is promoted to v8i32 thus leading to the
following dag node:
v8i16 = setcc t32, t25, seteq:ch
where t32 and t25 are now values of type v8i32.
Before this patch, function LowerVSETCC would have wrongly expanded the setcc
to a single X86ISD::PCMPEQ. Surprisingly, ISel was still able to match an
instruction. In our case, ISel would have matched a VPCMPEQWrr:
t37: v8i16 = X86ISD::VPCMPEQWrr t36, t25
However, t36 and t25 are both VR256, while the result type is instead of class
VR128. This inconsistency ended up causing the insertion of COPY instructions
like this:
%vreg7<def> = COPY %vreg3; VR128:%vreg7 VR256:%vreg3
Which is an invalid full copy (not a sub register copy).
Eventually, the backend would have hit an UNREACHABLE "Cannot emit physreg copy
instruction" in the attempt to expand the malformed pseudo COPY instructions.
This patch fixes the problem adding the missing logic in LowerVSETCC to handle
the corner case of a setcc with 128-bit return type and 256-bit operand type.
This problem was originally reported by Dimitry as PR25080. It has been latent
for a very long time. I have added the minimal reproducible from that bugzilla
as test setcc-lowering.ll.
Differential Revision: http://reviews.llvm.org/D13660
llvm-svn: 250085
|
|
|
|
| |
llvm-svn: 250075
|
|
|
|
| |
llvm-svn: 250059
|
|
|
|
| |
llvm-svn: 250049
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add intrinsics for the
XSAVE instructions (XSAVE/XSAVE64/XRSTOR/XRSTOR64)
XSAVEOPT instructions (XSAVEOPT/XSAVEOPT64)
XSAVEC instructions (XSAVEC/XSAVEC64)
XSAVES instructions (XSAVES/XSAVES64/XRSTORS/XRSTORS64)
Differential Revision: http://reviews.llvm.org/D13012
llvm-svn: 250029
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
indices have the most significant bit set.
This patch fixes a problem in function 'combineX86ShuffleChain' that causes a
chain of shuffles to be wrongly folded away when the combined shuffle mask has
only one element.
We may end up with a combined shuffle mask of one element as a result of
multiple calls to function 'canWidenShuffleElements()'.
Function canWidenShuffleElements attempts to simplify a shuffle mask by widening
the size of the elements being shuffled.
For every pair of shuffle indices, function canWidenShuffleElements checks if
indices refer to adjacent elements. If all pairs refer to "adjacent" elements
then the shuffle mask is safely widened. As a consequence of widening, we end up
with a new shuffle mask which is half the size of the original shuffle mask.
The byte shuffle (pshufb) from test pr24562.ll has a mask of all SM_SentinelZero
indices. Function canWidenShuffleElements would combine each pair of
SM_SentinelZero indices into a single SM_SentinelZero index. So, in a
logarithmic number of steps (4 in this case), the pshufb mask is simplified to
a mask with only one index which is equal to SM_SentinelZero.
Before this patch, function combineX86ShuffleChain wrongly assumed that a mask
of size one is always equivalent to an identity mask. So, the entire shuffle
chain was just folded away as the combined shuffle mask was treated as a no-op
mask.
With this patch we know check if the only element of a combined shuffle mask is
SM_SentinelZero. In case, we propagate a zero vector.
Differential Revision: http://reviews.llvm.org/D13364
llvm-svn: 250027
|
|
|
|
|
|
| |
instructions. This way the assembler will perform range checking. Believe this matches gas behavior.
llvm-svn: 250016
|
|
|
|
|
|
| |
%xmmX, %xmmX encoding if it would be a shorter VEX encoding.
llvm-svn: 250014
|
|
|
|
| |
llvm-svn: 250013
|
|
|
|
|
|
| |
parser will check the size.
llvm-svn: 250012
|
|
|
|
|
|
|
|
| |
arithmetic instructions with 8-bit immediates over the forms that implicitly use the ax/eax/rax.
This allows us to remove the explicit code for working around the existing priority
llvm-svn: 250011
|
|
|
|
|
|
| |
al/ax/eax/rax as a def. Probably doesn't have a functional affect since these aren't used in isel.
llvm-svn: 249994
|
|
|
|
|
|
|
|
| |
Instead mark its operand type as u8imm which will cause it to fail to match. This is more consistent with other instruction behavior.
This also fixes a bug where negative immediates below -128 were not being reported as errors.
llvm-svn: 249989
|
|
|
|
| |
llvm-svn: 249979
|
|
|
|
|
|
|
|
| |
comparisons to XOP PCOM/PCOMU instructions.
The XOP vector integer comparisons can deal with all signed/unsigned comparison cases directly and can be easily commuted as well (D7646).
llvm-svn: 249976
|
|
|
|
| |
llvm-svn: 249941
|
|
|
|
| |
llvm-svn: 249940
|
|
|
|
|
|
| |
wineh-parent is dead, so is ValueOrMBB.
llvm-svn: 249920
|
|
|
|
|
|
|
|
|
|
|
| |
The new implementation works at least as well as the old implementation
did.
Also delete the associated preparation tests. They don't exercise
interesting corner cases of the new implementation. All the codegen
tests of the EH tables have already been ported.
llvm-svn: 249918
|
|
|
|
|
|
|
|
| |
x64 catchpads use rax to inform the unwinder where control should go
next. However, we must initialize rax before the epilogue sequence so
as to not perturb the unwinder.
llvm-svn: 249910
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When running combine on an extract_vector_elt, it wants to look through
a bitcast to check if the argument to the bitcast was itself an
extract_vector_elt with particular operands.
However, it called getOperand() on the argument to the bitcast *before*
checking that the opcode was EXTRACT_VECTOR_ELT, assert-failing if there
were zero operands for the actual opcode.
Fix, and add trivial test.
llvm-svn: 249891
|
|
|
|
|
|
|
|
|
|
| |
Win64"""
This reverts commit r249794.
Apparently my checkouts are full of unexpected surprises today.
llvm-svn: 249796
|
|
|
|
|
|
|
|
| |
This reverts commit r249032.
TODO write commit msg
llvm-svn: 249794
|
|
|
|
|
|
|
| |
This is a simple refactoring that replaces Triple.getEnvironment()
checks for Android with Triple.isAndroid().
llvm-svn: 249750
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
its own variable.
This is needed so that we can explicitly turn off MMX without turning
off SSE and also so that we can diagnose feature set incompatibilities
that involve MMX without SSE.
Rationale:
// sse3
__m128d test_mm_addsub_pd(__m128d A, __m128d B) {
return _mm_addsub_pd(A, B);
}
// mmx
void shift(__m64 a, __m64 b, int c) {
_mm_slli_pi16(a, c);
_mm_slli_pi32(a, c);
_mm_slli_si64(a, c);
_mm_srli_pi16(a, c);
_mm_srli_pi32(a, c);
_mm_srli_si64(a, c);
_mm_srai_pi16(a, c);
_mm_srai_pi32(a, c);
}
clang -msse3 -mno-mmx file.c -c
For this code we should be able to explicitly turn off MMX
without affecting the compilation of the SSE3 function and then
diagnose and error on compiling the MMX function.
This matches the existing gcc behavior and follows the spirit of
the SSE/MMX separation in llvm where we can (and do) turn off
MMX code generation except in the presence of intrinsics.
Updated a couple of tests, but primarily tested with a couple of tests
for turning on only mmx and only sse.
This is paired with a patch to clang to take advantage of this behavior.
llvm-svn: 249731
|
|
|
|
|
|
| |
The code is correct as is, but we should test it.
llvm-svn: 249715
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We emit 1 compact unwind encoding per function, and this can’t represent
the varying stack pointer that will be generated by X86CallFrameOptimization.
Disable the optimization on Darwin.
(It might be possible to split the function into multiple ranges
and emit 1 compact unwind info per range. The compact unwind emission
code isn’t ready for that and this kind of info certainly isn’t
tested/used anywhere. It might be worth exploring this path if we want
to get the space savings at some point though)
llvm-svn: 249694
|
|
|
|
|
|
|
|
|
| |
This instructions doesn't have intrincis.
Added tests for lowering and encoding.
Differential Revision: http://reviews.llvm.org/D12317
llvm-svn: 249688
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes two separate bugs:
1) The mask for the high lane was not set correctly. That fixes PR24532.
2) The transformation should bail out if it believes it involves more than
2 lanes, as it does not currently do anything sensible in this case.
Differential Revision: http://reviews.llvm.org/D13505
llvm-svn: 249669
|
|
|
|
|
|
|
|
| |
In particular, passing non-trivially copyable objects by value on win32
uses a dynamic alloca (inalloca). We would clobber ESP in the epilogue
and end up returning to outer space.
llvm-svn: 249637
|
|
|
|
|
|
| |
period termination.
llvm-svn: 249571
|
|
|
|
|
|
|
|
|
|
|
|
| |
When outgoing function arguments are passed using push instructions, and EH
is enabled, we may need to indicate to the stack unwinder that the stack
pointer was adjusted before the call.
This should fix the exception handling issues in PR24792.
Differential Revision: http://reviews.llvm.org/D13132
llvm-svn: 249522
|
|
|
|
|
|
|
|
|
|
| |
as W0 and not W1, like all other instruction have been implemented.
Add encoding tests.
Differential Revision: http://reviews.llvm.org/D13471
llvm-svn: 249521
|
|
|
|
|
|
|
|
|
|
| |
generated files; other minor cleanups.
Patch by Eugene Zelenko!
Differential Revision: http://reviews.llvm.org/D13321
llvm-svn: 249482
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- Add CoreCLR to if/else ladders and switches as appropriate.
- Rename isMSVCEHPersonality to isFuncletEHPersonality to better
reflect what it captures.
Reviewers: majnemer, andrew.w.kaylor, rnk
Subscribers: pgavlin, AndyAyers, llvm-commits
Differential Revision: http://reviews.llvm.org/D13449
llvm-svn: 249455
|
|
|
|
|
|
|
|
| |
0x80000000-0xffffffff can be handled cheaply and don't need to be hoisted.
Most importantly, this keeps constant hoisting from preventing instruction selections ability to turn an AND with 0xffffffff into a move into a 32-bit subregister.
llvm-svn: 249370
|
|
|
|
|
|
| |
wrapped in the equivalent earlier. NFC
llvm-svn: 249369
|
|
|
|
|
|
|
|
|
|
|
|
| |
The CATCHRET operand did not match the MachineFunction's CFG. This
mismatch happened because FrameLowering created a new MachineBasicBlock
and updated the CFG but forgot to update the CATCHRET operand.
Let's make sure this doesn't happen again by strengthing the funclet
membership analysis: it can now reason about the membership of all basic
blocks, not just those inside of funclets.
llvm-svn: 249344
|
|
|
|
|
|
|
|
| |
Added tests for intrinsics and encoding.
Differential Revision: http://reviews.llvm.org/D12690
llvm-svn: 249261
|
|
|
|
|
|
| |
The custom lowering in LowerExtendedLoad is doing the equivalent shuffle, so make use of existing lowering code to reduce duplication.
llvm-svn: 249243
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
between 128/256-bit vector types."
This patch teaches FastIsel the following two things:
1) On SSE2, no instructions are needed for bitcasts between 128-bit vector types;
2) On AVX, no instructions are needed for bitcasts between 256-bit vector types.
Example:
%1 = bitcast <4 x i31> %V to <2 x i64>
Before (-fast-isel -fast-isel-abort=1):
FastIsel miss: %1 = bitcast <4 x i31> %V to <2 x i64>
Now we don't fall back to SelectionDAG and we correctly fold that computation
propagating the register associated to %V.
Originally reviewed here: http://reviews.llvm.org/D13347
llvm-svn: 249147
|
|
|
|
|
|
|
|
|
| |
128/256-bit vector types.
r249121 caused a Clang test failure (avx2-buitins.c).
Revert r249121 while I keep investigating on the reason why that test failed.
llvm-svn: 249124
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
vector types.
This patch teaches FastIsel the following two things:
1) On SSE2, no instructions are needed for bitcasts between 128-bit vector types;
2) On AVX, no instructions are needed for bitcasts between 256-bit vector types.
Example:
%1 = bitcast <4 x i31> %V to <2 x i64>
Before (-fast-isel -fast-isel-abort=1):
FastIsel miss: %1 = bitcast <4 x i31> %V to <2 x i64>
Now we don't fall back to SelectionDAG and we correctly fold that computation
propagating the register associated to %V.
Differential Revision: http://reviews.llvm.org/D13347
llvm-svn: 249121
|
|
|
|
|
|
|
|
|
|
|
|
| |
We emit denormalized tables, where every range of invokes in the same
state gets a complete list of EH action entries. This is significantly
simpler than trying to infer the correct nested scoping structure from
the MI. Fortunately, for SEH, the nesting structure is really just a
size optimization.
With this, some basic __try / __except examples work.
llvm-svn: 249078
|