| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
We used to try to constant-fold them to i32 immediates.
Given that fast-isel doesn't otherwise support vNi1, when selecting
the result users, we'd fallback to SDAG anyway.
However, if the users were in another block, we'd insert broken
cross-class copies (GPR32 to FPR64).
Give up, let SDAG agree with itself on a vNi1 legalization strategy.
llvm-svn: 252364
|
|
|
|
|
|
|
|
|
|
|
| |
When matching non-LSB-extracting truncating broadcasts, we now insert
the necessary SRL. If the scalar resulted from a load, the SRL will be
folded into it, creating a narrower, offset, load.
However, i16 loads aren't Desirable, so we get i16->i32 zextloads.
We already catch i16 aextloads; catch these as well.
llvm-svn: 252363
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now that we recognize this, we can support it instead of bailing out.
That is, we can fold:
(v8i16 (shufflevector
(v8i16 (bitcast (v4i32 (build_vector X, Y, ...)))),
<1,1,...,1>))
into:
(v8i16 (vbroadcast (i16 (trunc (srl Y, 16)))))
llvm-svn: 252362
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We used to incorrectly assume that the offset we're extracting from
was a multiple of the element size. So, we'd fold:
(v8i16 (shufflevector
(v8i16 (bitcast (v4i32 (build_vector X, Y, ...)))),
<1,1,...,1>))
into:
(v8i16 (vbroadcast (i16 (trunc Y))))
whereas we should have extracted the higher bits from X.
Instead, bail out if the assumption doesn't hold.
llvm-svn: 252361
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Pass the VOPProfile object all the through to *_m multiclasses. This will
allow us to do more simplifications in the future.
Reviewers: arsenm
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D13437
llvm-svn: 252339
|
|
|
|
|
|
|
|
|
|
|
|
| |
All 3 operands of FMA3 instructions are commutable now.
Patch by Slava Klochkov
Reviewers: Quentin Colombet(qcolombet), Ahmed Bougacha(ab).
Differential Revision: http://reviews.llvm.org/D13269
llvm-svn: 252335
|
|
|
|
|
|
|
|
|
| |
Modelling of the expression stack is evolving. This patch takes another
step by making pushes explicit.
Differential Revision: http://reviews.llvm.org/D14338
llvm-svn: 252334
|
|
|
|
| |
llvm-svn: 252328
|
|
|
|
|
|
| |
Test has a bogus verifier error which will be fixed by later commits.
llvm-svn: 252327
|
|
|
|
|
|
| |
The SGPR spill pseudos don't actually use them.
llvm-svn: 252324
|
|
|
|
|
|
|
|
|
|
|
| |
Mark kernels that use certain features that require user
SGPRs to support with kernel attributes. We need to know
before instruction selection begins because it impacts
the kernel calling convention lowering.
For now this only detects the workitem intrinsics.
llvm-svn: 252323
|
|
|
|
|
|
|
| |
Instead of forcing 4 alignment when spilled, set register class
alignments.
llvm-svn: 252322
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For some reason VS_32 ends up factoring into the pressure heuristics
even though we should never see a virtual register with this class.
When SGPRs are reserved for register spilling, this for some reason
triggers reg-crit scheduling.
Setting isAllocatable = 0 may help with this since that seems to remove
it from the default implementation's generated table.
llvm-svn: 252321
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
In this implementation, LiveIntervalAnalysis invents a few register
masks on basic block boundaries that preserve no registers. The nice
thing about this is that it prevents the prologue inserter from thinking
it needs to spill all XMM CSRs, because it doesn't see any explicit
physreg defs in the MI.
Reviewers: MatzeB, qcolombet, JosephTremoulet, majnemer
Subscribers: MatzeB, llvm-commits
Differential Revision: http://reviews.llvm.org/D14407
llvm-svn: 252318
|
|
|
|
|
|
|
|
|
| |
The benefit from converting narrow loads into a wider load (r251438) could be
micro-architecturally dependent, as it assumes that a single load with two bitfield
extracts is cheaper than two narrow loads. Currently, this conversion is
enabled only in cortex-a57 on which performance benefits were verified.
llvm-svn: 252316
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The bug was that the sldi instructions have immediate widths dependant on
their element size. So sldi.d has a 1-bit immediate and sldi.b has a 4-bit
immediate. All of these were using 4-bit immediates previously.
Reviewers: vkalintiris
Subscribers: llvm-commits, atanasyan, dsanders
Differential Revision: http://reviews.llvm.org/D14018
llvm-svn: 252297
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Reviewers: vkalintiris
Subscribers: atanasyan, dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D14016
llvm-svn: 252296
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The bug was that the MIPS32R6/MIPS64R6/microMIPS32R6 versions of LSA and DLSA
(unlike the MSA version) failed to account for the off-by-one encoding of the
immediate. The range is actually 1..4 rather than 0..3.
Reviewers: vkalintiris
Subscribers: atanasyan, dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D14015
llvm-svn: 252295
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: vkalintiris
Subscribers: dsanders, atanasyan, llvm-commits
Differential Revision: http://reviews.llvm.org/D14013
llvm-svn: 252294
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Without these patterns we would generate a complete LL/SC sequence.
This would be problematic for memory regions marked as WRITE-only or
READ-only, as the instructions LL/SC would read/write to the protected
memory regions correspondingly.
Reviewers: dsanders
Subscribers: llvm-commits, dsanders
Differential Revision: http://reviews.llvm.org/D14397
llvm-svn: 252293
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D13804
llvm-svn: 252291
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds the EH_RESTORE x86 pseudo instr, which is responsible for
restoring the stack pointers: EBP and ESP, and ESI if stack realignment
is involved. We only need this on 32-bit x86, because on x64 the runtime
restores CSRs for us.
Previously we had to keep the CATCHRET instruction around during SEH so
that we could convince X86FrameLowering to restore our frame pointers.
Now we can split these instructions earlier.
This was confusing, because we had a return instruction which wasn't
really a return and was ultimately going to be removed by
X86FrameLowering. This change also simplifies X86FrameLowering, which
really shouldn't be building new MBBs.
No observable functional change currently, but with the new register
mask stuff in D14407, CATCHRET will become a register allocator barrier,
and our existing tests rely on us having reasonable register allocation
around SEH.
llvm-svn: 252266
|
|
|
|
| |
llvm-svn: 252217
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We already had a test for this for 32-bit SEH catchpads, but those don't
actually create funclets. We had a bug that only appeared in funclet
prologues, where we would establish EBP and ESI as our FP and BP, and
then downstream prologue code would overwrite them.
While I was at it, I fixed Win64+funclets+stackrealign. This issue
doesn't come up as often there due to the ABI requring 16 byte stack
alignment, but now we can rest easy that AVX and WinEH will work well
together =P.
llvm-svn: 252210
|
|
|
|
|
|
| |
Noticed by dschff in http://reviews.llvm.org/rL252203
llvm-svn: 252208
|
|
|
|
|
|
| |
This more closely reflects the naming convention in the spec.
llvm-svn: 252204
|
|
|
|
|
|
|
|
|
| |
Mangling type information into MachineInstr opcode names was a temporary
measure, and it's starting to get hairy. At the same time, the MC instruction
printer wants to use AsmString strings for printing. This patch takes the
first step, starting the process of adding AsmStrings for instructions.
llvm-svn: 252203
|
|
|
|
|
|
|
| |
The page_size operator has been removed from the spec, and the resize_memory
operator has been changed to grow_memory.
llvm-svn: 252202
|
|
|
|
|
|
|
|
| |
Also, remove an enum hack where enum values were used as indexes into an array.
We may want to make this a real class to allow pattern-based queries/customization (D13417).
llvm-svn: 252196
|
|
|
|
|
|
|
| |
This isn't used yet; it's just a start towards eventually using MC to
do instruction printing, and eventually binary encoding.
llvm-svn: 252194
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This review is related to another review request http://reviews.llvm.org/D11268, does the same and merely fixes a couple of issues with it.
D11268 is quite old and has merge conflicts against the current trunk.
This request
- rebases D11268 onto the new trunk;
- resolves the merge conflicts;
- fixes the prologue_end tests, which do not pass due to the subprogram definitions not marked as distinct.
Reviewers: echristo, rengolin, kubabrecka
Subscribers: aemerson, rengolin, jyknight, dsanders, llvm-commits, asl
Differential Revision: http://reviews.llvm.org/D14338
llvm-svn: 252177
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes the issue of wrong CFA calculation in the following case:
0x08048400 <+0>: push %ebx
0x08048401 <+1>: sub $0x8,%esp
0x08048404 <+4>: **call 0x8048409 <test+9>**
0x08048409 <+9>: **pop %eax**
0x0804840a <+10>: add $0x1bf7,%eax
0x08048410 <+16>: mov %eax,%ebx
0x08048412 <+18>: call 0x80483f0 <bar>
0x08048417 <+23>: add $0x8,%esp
0x0804841a <+26>: pop %ebx
0x0804841b <+27>: ret
The highlighted instructions are a product of movpc instruction. The call
instruction changes the stack pointer, and pop instruction restores its
value. However, the rule for computing CFA is not updated and is wrong on
the pop instruction. So, e.g. backtrace in gdb does not work when on the pop
instruction. This adds cfi instructions for both call and pop instructions.
cfi_adjust_cfa_offset** instruction is used with the appropriate offset for
setting the rules to calculate CFA correctly.
Patch by Violeta Vukobrat.
Differential Revision: http://reviews.llvm.org/D14021
llvm-svn: 252176
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: The spec uses "or" for inclusive-or and "xor" for exclusive-or
Reviewers: sunfish
Subscribers: jfb, llvm-commits, dschuff
Differential Revision: http://reviews.llvm.org/D14362
llvm-svn: 252174
|
|
|
|
|
|
|
|
|
|
| |
We can conservatively know that CMOV's known bits are the intersection of known bits for each of its operands. This helps PerformCMOVToBFICombine find more opportunities.
I tried hard to create a testcase for this and failed - we have to sufficiently confuse DAG.computeKnownBits which can see through all the cheap tricks I tried to narrow my larger testcase down :(
This code is actually exercised in CodeGen/ARM/bfi.ll, there's just no functional difference because DAG.computeKnownBits gets the right answer in that case.
llvm-svn: 252168
|
|
|
|
|
|
| |
[X86][AVX512] add comi with Sae
llvm-svn: 252154
|
|
|
|
|
|
|
|
| |
add builtin_ia32_vcomisd and builtin_ia32_vcomisd
Differential Revision: http://reviews.llvm.org/D14331
llvm-svn: 252153
|
|
|
|
|
|
|
|
| |
VPBROADCASTMW2D and VPBROADCASTMB2Q
Differential Revision: http://reviews.llvm.org/D14335
llvm-svn: 252151
|
|
|
|
| |
llvm-svn: 252145
|
|
|
|
|
|
|
| |
This doesn't quite match how SC prints it, which doesn't put it in a
comment.
llvm-svn: 252144
|
|
|
|
| |
llvm-svn: 252142
|
|
|
|
|
|
|
|
|
|
| |
The operand layout is slightly different for the atomic
opcodes from the usual MUBUF loads and stores.
This should only fix it on SI/CI. VI is still broken
because it still emits the addr64 replacement.
llvm-svn: 252140
|
|
|
|
|
|
|
| |
vaddr comes before srsrc in every other MUBUF instruction,
and is the order it is printed.
llvm-svn: 252139
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The CLR's personality routine passes the pointer to the establisher frame
in RCX, not RDX.
Reviewers: pgavlin, majnemer, rnk
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14343
llvm-svn: 252135
|
|
|
|
|
|
|
|
| |
This brings back the behavior from before r252090 for out of range symbols.
Should bring some arm bots back.
llvm-svn: 252119
|
|
|
|
| |
llvm-svn: 252116
|
|
|
|
|
|
|
|
| |
The generic infrastructure already did a lot of work to decide if the
fixup value is know or not. It doesn't make sense to reimplement a very
basic case: same fragment.
llvm-svn: 252090
|
|
|
|
|
|
|
|
|
|
| |
Win64 has some strict requirements for the epilogue. As a result, we disable
shrink-wrapping for Win64 unless the block that gets the epilogue is already an
exit block.
Fixes PR24193.
llvm-svn: 252088
|
|
|
|
| |
llvm-svn: 252078
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch improves the memory folding of the inserted float element for the (V)INSERTPS instruction.
The existing implementation occurs in the DAGCombiner and relies on the narrowing of a whole vector load into a scalar load (and then converted into a vector) to (hopefully) allow folding to occur later on. Not only has this proven problematic for debug builds, it also prevents other memory folds (notably stack reloads) from happening.
This patch removes the old implementation and moves the folding code to the X86 foldMemoryOperand handler. A new private 'special case' function - foldMemoryOperandCustom - has been added to deal with memory folding of instructions that can't just use the lookup tables - (V)INSERTPS is the first of several that could be done.
It also tweaks the memory operand folding code with an additional pointer offset that allows existing memory addresses to be modified, in this case to convert the vector address to the explicit address of the scalar element that will be inserted.
Unlike the previous implementation we now set the insertion source index to zero, although this is ignored for the (V)INSERTPSrm version, anything that relied on shuffle decodes (such as unfolding of insertps loads) was incorrectly calculating the source address - I've added a test for this at insertps-unfold-load-bug.ll
Differential Revision: http://reviews.llvm.org/D13988
llvm-svn: 252074
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is intended to make a later change simpler.
Note: adding this bounds checking required fixing `X86FastISel`. As
far I can tell I've preserved original behavior but a careful review
will be appreciated.
Reviewers: reames
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14304
llvm-svn: 252073
|