| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
| |
from the Intel docs these instructions require the L-bit to be 0.
llvm-svn: 256486
|
| |
|
|
| |
llvm-svn: 256484
|
| |
|
|
|
|
|
|
| |
names. Add a missing encoding to disassembler and assembler.
I believe this also fixes a case where a 64-bit memory form that is documented as being unsupported in 32-bit mode was able to be selected there.
llvm-svn: 256483
|
| |
|
|
| |
llvm-svn: 256482
|
| |
|
|
|
|
| |
instructions.
llvm-svn: 256481
|
| |
|
|
| |
llvm-svn: 256480
|
| |
|
|
|
|
|
| |
optional '\brief' tag and reflow some comments based on the added
horizontal space. NFC.
llvm-svn: 256479
|
| |
|
|
|
|
| |
ctlz_zero_undef. Change the operation for CTLZ_ZERO_UNDEF to Expand so SelectionDAG will convert them to CTLZ before lowering.
llvm-svn: 256477
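For illustration only, a small Python model of why converting CTLZ_ZERO_UNDEF to CTLZ is safe: the ZERO_UNDEF form leaves the result for a zero input unspecified, so a plain count-leading-zeros that is defined for every input is always a valid replacement. The helper name `ctlz` and the 32-bit default width are assumptions made for this sketch; this is not LLVM code.

```python
def ctlz(x: int, bits: int = 32) -> int:
    """Count leading zeros of an unsigned `bits`-wide value.

    Defined for x == 0 (returns `bits`), which is the case the
    ZERO_UNDEF variant leaves unspecified.
    """
    if not 0 <= x < (1 << bits):
        raise ValueError("value out of range for the given width")
    return bits - x.bit_length()
```

Any result the ZERO_UNDEF form could legally produce for a non-zero input matches this function, so nothing is lost by the conversion.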
|
| |
|
|
|
|
| |
CTTZ_ZERO_UNDEF if the non-ZERO_UNDEF form is legal or custom. Will be used to simplify X86 code in a follow on commit.
llvm-svn: 256476
|
| |
|
|
|
|
| |
and VMOVSLDUP. They don't have any tests and I don't think they can be selected. If they are truly needed they should be implemented with patterns against the normal instructions and not separate instructions.
llvm-svn: 256475
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This should work with ShTest (executed externally or internally) and GTest
test formats.
To set the timeout a new option ``--timeout=`` has
been added which specifies the maximum run time of an individual test
in seconds. By default this is 0, which causes no timeout to be enforced.
The timeout can also be set from a lit configuration file by modifying
the ``lit_config.maxIndividualTestTime`` property.
To implement a timeout we now require the psutil Python module if a
timeout is requested. This dependency is confined to the newly added
``lit.util.killProcessAndChildren()``. A note has been added into the
TODO document describing how we can remove the dependency on the
``psutil`` module in the future. It would be nice to remove this
immediately but that is a lot more work and Daniel Dunbar believes it is
better that we get a working implementation first and then improve it.
To avoid breaking the existing behaviour the psutil module will not be
imported if no timeout is requested.
The included testcases are derived from test cases provided by
Jonathan Roelofs which were in a previous attempt to add a per-test
timeout to lit (http://reviews.llvm.org/D6584). Thanks Jonathan!
Reviewers: ddunbar, jroelofs, cmatthews, MatzeB
Subscribers: cmatthews, llvm-commits
Differential Revision: http://reviews.llvm.org/D14706
llvm-svn: 256471
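A minimal sketch of the per-test timeout idea described above, using only the Python standard library. This is not lit's implementation: the function name `run_with_timeout` and the `SLEEPER` command are hypothetical, and unlike ``lit.util.killProcessAndChildren()`` this version kills only the direct child process, not its descendants.

```python
import subprocess
import sys

# Hypothetical long-running command used only for demonstration.
SLEEPER = [sys.executable, "-c", "import time; time.sleep(30)"]


def run_with_timeout(cmd, timeout_seconds=0):
    """Run cmd and return (exit_code, timed_out).

    A timeout of 0 means no limit, mirroring the default described above.
    """
    limit = timeout_seconds if timeout_seconds > 0 else None
    proc = subprocess.Popen(cmd)
    try:
        return proc.wait(timeout=limit), False
    except subprocess.TimeoutExpired:
        # Only the direct child is killed here. Killing the whole process
        # tree is the part the commit delegates to psutil.
        proc.kill()
        proc.wait()
        return proc.returncode, True
```

For example, `run_with_timeout(SLEEPER, timeout_seconds=1)` should report a timeout after about one second, while the same call with the default of 0 would wait for the full sleep.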
|
| |
|
|
|
|
|
|
|
| |
Fix TRUNCATE lowering vector to vector i1, use LSB and not MSB.
Implement VPMOVB/W/D/Q2M intrinsic.
Differential Revision: http://reviews.llvm.org/D15675
llvm-svn: 256470
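To make the LSB versus MSB distinction concrete, here is a small Python model of truncating integer lanes to i1. The function names and the 8-bit lane width are assumptions for the sketch; it models the semantics only and is not the X86 lowering code.

```python
def truncate_to_i1(lanes):
    """Truncation to i1 keeps bit 0 (the LSB) of each lane."""
    return [x & 1 for x in lanes]


def msb_of_lanes(lanes, bits=8):
    """What taking the MSB would give instead (the behaviour being fixed)."""
    return [(x >> (bits - 1)) & 1 for x in lanes]
```

For the lanes `[0x80, 0x01, 0xFF, 0x00]` the two functions disagree on the first two elements, which is exactly the kind of input that exposes the bug.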
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D15786
llvm-svn: 256469
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
a standalone pass.
There is no call graph or even interesting analysis for this part of
function attributes -- it is literally inferring attributes based on the
target library identification. As such, we can do it using a much
simpler module pass that just walks the declarations. This can also
happen much earlier in the pass pipeline which has benefits for any
number of other passes.
In the process, I've cleaned up one particular aspect of the logic which
was necessary in order to separate the two passes cleanly. It now counts
inferred attributes independently rather than just counting all the
inferred attributes as one, and the counts are more clearly explained.
The two test cases we had for this code path are both ... woefully
inadequate and copies of each other. I've kept the superset test and
updated it. We need more testing here, but I had to pick somewhere to
stop fixing everything broken I saw here.
Differential Revision: http://reviews.llvm.org/D15676
llvm-svn: 256466
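A rough Python sketch of the shape of such a pass, assuming a module can be modelled as a mapping from declaration name to its attribute set. The table `KNOWN_LIBFUNCS`, its entries, and the function name `infer_libfunc_attrs` are hypothetical stand-ins, not the real pass or the real target library tables. It shows the two points made above: a single walk over declarations with no call graph, and a separate count per inferred attribute.

```python
from collections import Counter

# Hypothetical table: library function name -> attributes to infer.
KNOWN_LIBFUNCS = {
    "strlen": ("nounwind", "readonly"),
    "free": ("nounwind",),
}


def infer_libfunc_attrs(declarations):
    """Add inferred attributes to recognised declarations in place.

    declarations: dict mapping function name -> set of attribute names.
    Returns a Counter with one independent count per attribute added.
    """
    added = Counter()
    for name, attrs in declarations.items():
        for attr in KNOWN_LIBFUNCS.get(name, ()):
            if attr not in attrs:
                attrs.add(attr)
                added[attr] += 1
    return added
```

Because each declaration is handled on its own, the walk needs no particular iteration order and can run early in a pipeline.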
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
is (by default) run much earlier than FunctionAttrs proper.
This allows forcing optnone or other widely impactful attributes. It is
also a bit simpler as the force attribute behavior needs no specific
iteration order.
I've added the pass into the default module pass pipeline and LTO pass
pipeline which mirrors where function attrs itself was being run.
Differential Revision: http://reviews.llvm.org/D15668
llvm-svn: 256465
|
| |
|
|
|
|
| |
no tests for them and I don't see any way to select them anyway. If they are really needed they should be implemented as patterns and not full-fledged instructions.
llvm-svn: 256462
|
| |
|
|
| |
llvm-svn: 256460
|
| |
|
|
|
|
|
| |
MSC18 Debug didn't merge them.
FIXME: I tweaked this just to appease a builder. Almost all string literals should be addressed identically there.
llvm-svn: 256459
|
| |
|
|
| |
llvm-svn: 256458
|
| |
|
|
| |
llvm-svn: 256457
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A frame pointer must be used if stack pointer is modified after the
prologue. LLVM will emit pushf/popf if we need to save/restore the
FLAGS register, requiring us to have a frame pointer for the function.
There is a small twist: this sequence might exist in user code via
inline-assembly. For now, conservatively assume that such functions
require a frame pointer. For real world justification, please see
clang's implementation of __readeflags.
This fixes PR25945.
llvm-svn: 256456
|
| |
|
|
|
|
|
| |
This aids in debugging WinEH; similar functionality is present for
DWARF EH.
llvm-svn: 256455
|
| |
|
|
|
|
|
| |
This is a follow-on to:
http://reviews.llvm.org/rL255700
llvm-svn: 256454
|
| |
|
|
| |
llvm-svn: 256453
|
| |
|
|
| |
llvm-svn: 256452
|
| |
|
|
|
|
| |
statements. NFC
llvm-svn: 256451
|
| |
|
|
|
|
| |
Should bring back the bots after r256443.
llvm-svn: 256450
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
instead of i32 type
Summary: This patch changes the gc.statepoint intrinsic's return type to token type instead of i32 type. Using a token type prevents LLVM from merging different gc.statepoint nodes into PHI nodes, which would cause further problems with gc relocations. The patch also changes how gc.relocate and gc.result look for their corresponding gc.statepoint on the unwind path. The current implementation uses the selector value extracted from a { i8*, i32 } landingpad as a hook to find the gc.statepoint, while the patch directly uses a token type landingpad (http://reviews.llvm.org/D15405) to find the gc.statepoint.
Reviewers: sanjoy, JosephTremoulet, pgavlin, igor-laevsky, mjacob
Subscribers: reames, mjacob, sanjoy, llvm-commits
Differential Revision: http://reviews.llvm.org/D15662
llvm-svn: 256443
|
| |
|
|
|
|
| |
to be tolerant of the Constant type not matching due to folding in the constant pool and to get VPERMILPD correct."
llvm-svn: 256435
|
| |
|
|
|
|
| |
This is the test case for r256433, but it got committed incorrectly in my local repo.
llvm-svn: 256434
|
| |
|
|
|
|
| |
Constant type not matching due to folding in the constant pool and to get VPERMILPD correct.
llvm-svn: 256433
|
| |
|
|
| |
llvm-svn: 256432
|
| |
|
|
|
|
| |
code.
llvm-svn: 256431
|
| |
|
|
|
|
| |
library to fix the bots.
llvm-svn: 256430
|
| |
|
|
| |
llvm-svn: 256429
|
| |
|
|
|
|
| |
used by AsmParser library without depending on X86CodeGen library.
llvm-svn: 256428
|
| |
|
|
|
|
| |
InstPrinters. NFC
llvm-svn: 256427
|
| |
|
|
|
|
| |
violation in AsmParser.
llvm-svn: 256426
|
| |
|
|
|
|
|
|
| |
getX86SubSuperRegister with just an unsigned representing size.
This is a step towards fixing a layering violation so the X86 AsmParser won't depend on CodeGen types.
llvm-svn: 256425
|
| |
|
|
|
|
| |
getX86SubSuperRegister. Most places don't care about this argument. NFC
llvm-svn: 256424
|
| |
|
|
| |
llvm-svn: 256423
|
| |
|
|
|
|
| |
recursively. It should call itself instead. Otherwise it might fire an assertion when it was designed not to.
llvm-svn: 256422
|
| |
|
|
|
|
| |
getMemoryOperandNo. These aren't used by any instructions, but could be someday. NFC
llvm-svn: 256421
|
| |
|
|
| |
llvm-svn: 256420
|
| |
|
|
| |
llvm-svn: 256419
|
| |
|
|
|
|
|
|
| |
We already know how to properly print out basic blocks in
printAsOperand, we should not roll it ourselves in
AsmPrinter::EmitBasicBlockStart. No functionality change is intended.
llvm-svn: 256413
|
| |
|
|
|
|
| |
definitions to DerivedTypes.h so they can be inlined by the compiler.
llvm-svn: 256406
|
| |
|
|
| |
llvm-svn: 256405
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Move RegStackify after coalescing and teach it to use LiveIntervals instead
of depending on SSA form. This avoids a problem where a register in a COPY
instruction is stackified and then subsequently coalesced with a register
that is not stackified.
This also puts it after the scheduler, which allows us to simplify the
EXPR_STACK constraint, as we no longer have instructions being reordered
after stackification and before coloring.
llvm-svn: 256402
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is an extension of the shuffle combining from r203229:
http://reviews.llvm.org/rL203229
The idea is to widen a short input vector with undef elements so the
existing shuffle transform for extract/insert can kick in.
The motivation is to finally solve PR2109:
https://llvm.org/bugs/show_bug.cgi?id=2109
For that example, the IR becomes:
%1 = bitcast <2 x i32>* %P to <2 x float>*
%ld1 = load <2 x float>, <2 x float>* %1, align 8
%2 = shufflevector <2 x float> %ld1, <2 x float> undef, <4 x i32> <i32 0, i32 1, i32 undef, i32 undef>
%i2 = shufflevector <4 x float> %A, <4 x float> %2, <4 x i32> <i32 0, i32 1, i32 4, i32 5>
ret <4 x float> %i2
And x86 SSE output improves from:
movq (%rdi), %xmm1 ## xmm1 = mem[0],zero
movdqa %xmm1, %xmm2
shufps $229, %xmm2, %xmm2 ## xmm2 = xmm2[1,1,2,3]
shufps $48, %xmm0, %xmm1 ## xmm1 = xmm1[0,0],xmm0[3,0]
shufps $132, %xmm1, %xmm0 ## xmm0 = xmm0[0,1],xmm1[0,2]
shufps $32, %xmm0, %xmm2 ## xmm2 = xmm2[0,0],xmm0[2,0]
shufps $36, %xmm2, %xmm0 ## xmm0 = xmm0[0,1],xmm2[2,0]
retq
To the almost optimal:
movhpd (%rdi), %xmm0
Note: There's a tension in the existing transform related to generating
arbitrary shufflevector masks. We avoid that in other places in InstCombine
because we're scared that codegen can't handle strange masks, but it looks
like we're ok with producing those here. I purposely chose weird insert/extract
indexes for the regression tests to see the effect in these cases.
For PowerPC+Altivec, AArch64, and X86+SSE/AVX, I think the codegen is equal or
better for these examples.
Differential Revision: http://reviews.llvm.org/D15096
llvm-svn: 256394
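The widening trick above can be checked with a small Python model of shufflevector semantics. This is a sketch under stated assumptions: `UNDEF` is modelled as `None`, and the helper names `shufflevector` and `widened_insert` are invented for illustration; it mirrors the two shuffles in the IR shown in the message and is not InstCombine code.

```python
UNDEF = None  # stand-in for an undef lane or undef mask index


def shufflevector(a, b, mask):
    """Model of shufflevector on two equal-length vectors.

    A mask index i < len(a) selects a[i]; otherwise it selects
    b[i - len(a)]. An undef index yields an undef lane.
    """
    assert len(a) == len(b)
    both = list(a) + list(b)
    return [UNDEF if i is UNDEF else both[i] for i in mask]


def widened_insert(A, ld1):
    """Insert a 2-element vector into the top half of a 4-element one."""
    # Step 1: widen the short vector with undef lanes (the %2 shuffle).
    widened = shufflevector(ld1, [UNDEF, UNDEF], [0, 1, UNDEF, UNDEF])
    # Step 2: an ordinary two-input shuffle (the %i2 shuffle).
    return shufflevector(A, widened, [0, 1, 4, 5])
```

With `A = [10, 11, 12, 13]` and `ld1 = [20, 21]`, the low half of the result comes from `A` and the high half from `ld1`, and the undef lanes introduced by the widening are never selected.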
|