| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
|
|
|
| |
This feature enables the fusion of the comparison and the conditional select
instructions together.
Differential revision: https://reviews.llvm.org/D42392
llvm-svn: 325939
|
| |
|
|
| |
llvm-svn: 325938
|
| |
|
|
|
|
|
|
|
|
| |
is not.
The test changes you can see are related to the changes in ReplaceNodeResults. Though shuffle-vs-trunc-512.ll does have a test that exercises the code in LowerBITCAST. Looks like the test output didn't change because DAG combining is able to clean up the resulting type legalization. Adding the custom hook just makes type legalization work less hard.
Differential Revision: https://reviews.llvm.org/D43447
llvm-svn: 325933
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Add a target option AllowRegisterRenaming that is used to opt in to
post-register-allocation renaming of registers. This is set to 0 by
default, which causes the hasExtraSrcRegAllocReq/hasExtraDstRegAllocReq
fields of all opcodes to be set to 1, causing
MachineOperand::isRenamable to always return false.
Set the AllowRegisterRenaming flag to 1 for all in-tree targets that
have lit tests that were effected by enabling COPY forwarding in
MachineCopyPropagation (AArch64, AMDGPU, ARM, Hexagon, Mips, PowerPC,
RISCV, Sparc, SystemZ and X86).
Add some more comments describing the semantics of the
MachineOperand::isRenamable function and how it is set and maintained.
Change isRenamable to check the operand's opcode
hasExtraSrcRegAllocReq/hasExtraDstRegAllocReq bit directly instead of
relying on it being consistently reflected in the IsRenamable bit
setting.
Clear the IsRenamable bit when changing an operand's register value.
Remove target code that was clearing the IsRenamable bit when changing
registers/opcodes now that this is done conservatively by default.
Change setting of hasExtraSrcRegAllocReq in AMDGPU target to be done in
one place covering all opcodes that have constant pipe read limit
restrictions.
Reviewers: qcolombet, MatzeB
Subscribers: aemerson, arsenm, jyknight, mcrosier, sdardis, nhaehnle, javed.absar, tpr, arichardson, kristof.beyls, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, niosHD, escha, nemanjai, llvm-commits
Differential Revision: https://reviews.llvm.org/D43042
llvm-svn: 325931
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
The following set of instructions was originally planned to be added for Power 9
and so code was added to support them. However, a decision was made later on to
withdraw support for these instructions in the hardware.
xscmpnedp
xvcmpnesp
xvcmpnedp
This patch removes support for the instructions that were not added.
Differential Revision: https://reviews.llvm.org/D43641
llvm-svn: 325918
|
| |
|
|
|
|
| |
r325916 missed to remove calls in constructor.
llvm-svn: 325917
|
| |
|
|
|
|
| |
Unused fields cause buildbreak if -Werror,-Wunused-private-field is passed.
llvm-svn: 325916
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Adds support for this flag. There is also another piece for clang
(separate review). More info:
https://bugs.llvm.org/show_bug.cgi?id=36221
By Ruslan Nikolaev!
Differential Revision: https://reviews.llvm.org/D43107
llvm-svn: 325900
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
There are transformation that change setcc into other constructs, and transform that try to reconstruct a setcc from the brcond condition. Depending on what order these transform are done, the end result differs.
Most of the time, it is preferable to get a setcc as a brcond argument (and this is why brcond try to recreate the setcc in the first place) so we ensure this is done every time by also doing it at the setcc level when the only user is a brcond.
Reviewers: spatel, hfinkel, niravd, craig.topper
Subscribers: nhaehnle, llvm-commits
Differential Revision: https://reviews.llvm.org/D41235
llvm-svn: 325892
|
| |
|
|
|
|
|
|
|
|
|
| |
Add GlobalISel infrastructure up to the point where we can select a ret
void.
Patch by Petar Avramovic.
Differential Revision: https://reviews.llvm.org/D43583
llvm-svn: 325888
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This handles def-after-use of physregs, and allows us to merge loads and
stores even across some physreg defs (typically M0 defs).
Change-Id: I076484b2bda27c2cf46013c845a0380c5b89b67b
Reviewers: arsenm, mareko, rampitec
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D42647
llvm-svn: 325882
|
| |
|
|
|
|
|
|
|
|
| |
Enable multiple COPY hints to eliminate more COPYs during register allocation.
Note that this is something all targets should do, see
https://reviews.llvm.org/D38128.
Review: Simon Dardis
llvm-svn: 325870
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is combination of two patches by Nicholas Wilson:
1. https://reviews.llvm.org/D41954
2. https://reviews.llvm.org/D42495
Along with a few local modifications:
- One change I made was to add the UNDEFINED bit to the binary format
to avoid the extra byte used when writing data symbols. Although this
bit is redundant for other symbols types (i.e. undefined can be
implied if a function or global is a wasm import)
- I prefer to be explicit and consistent and not have derived flags.
- Some field renaming.
- Some reverting of unrelated minor changes.
- No test output differences.
Differential Revision: https://reviews.llvm.org/D43147
llvm-svn: 325860
|
| |
|
|
|
|
| |
This is causing miscompiles in some situations. See the llvm-commits thread for the commit for details.
llvm-svn: 325852
|
| |
|
|
|
|
|
|
|
|
| |
avoid an invert
We won't be able to fold the constant pool load, but its still better than materialing ones and xoring for the invert if we used PCMPEQ.
This will fix another regression from D42948.
llvm-svn: 325845
|
| |
|
|
|
|
|
|
|
| |
Move checks for each fusion case into separate functions for better
legibility and maintainability.
Differential revision: https://reviews.llvm.org/D43649
llvm-svn: 325844
|
| |
|
|
|
|
| |
Thank to Eric Christopher for noticing.
llvm-svn: 325842
|
| |
|
|
|
|
|
|
|
|
| |
avoid an invert
This will fix one of the regressions from D42948.
Differential Revision: https://reviews.llvm.org/D43531
llvm-svn: 325840
|
| |
|
|
|
|
|
| |
toString conflicts with llvm::toString here. Yay for overly generic
function names.
llvm-svn: 325833
|
| |
|
|
|
|
|
|
|
|
| |
Previously this code overrode the flags and opcode used by the later code in LowerVSETCC. This makes the code difficult to read and follow.
This patch moves all the SUBUS code into its own function and makes it responsible for creating its own SDNodes on success.
Differential Revision: https://reviews.llvm.org/D43530
llvm-svn: 325827
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
.NAME is a bit of an odd duck, in that we should really treat it like
a template argument, but we currently don't, and so when and where
NAME is initialized and how is pretty inconsistent. Best to just avoid
using it as a field of already instantiated records, and use cast to
string instead.
Change-Id: I5a0c202401cede3d5c3827ab9c7858ea48b29108
Reviewers: arsenm, rampitec
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D43551
llvm-svn: 325794
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Implement c.lui immediate constraint to [1, 31] and [0xfffe0, 0xfffff].
The RISC-V ISA describes the constraint as [1, 63], with that value
being loaded in to bits 17-12 of the destination register and sign extended
from bit 17. Therefore, this 6-bit immediate can represent values in the
ranges [1, 31] and [0xfffe0, 0xfffff].
Differential Revision: https://reviews.llvm.org/D42834
llvm-svn: 325792
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There were no memory dependencies made between stores generated
when lowering formal arguments and loads generated when
call lowering byVal arguments which made the Post-RA scheduler
place a load before a matching store.
Make the fixed object stored to mutable so that the load
instructions can have their memory dependencies added
Set the frame object as isAliased which clears the underlying
objects vector in ScheduleDAGInstrs::buildSchedGraph().
This results in addition of all stores as dependenies for loads.
This problem appeared when passing a byVal parameter
coupled with a fastcc function call.
Differential Revision: https://reviews.llvm.org/D37515
llvm-svn: 325782
|
| |
|
|
|
|
|
|
|
|
|
| |
As pointed out by @sabuasal in a comment on D23568, the logic in
RISCVMCCodeEmitter::getImmOpValue could be more defensive. Although with the
current instruction definitions it is always the case that `VK_RISCV_LO` is
always used with either an I- or S-format instruction, this may not always be
the case in the future. Add a check to ensure we will get an assertion in
debug builds if that changes.
llvm-svn: 325775
|
| |
|
|
|
|
|
| |
This recommits r325754; the modified and failing test case
actually didn't need any modifications.
llvm-svn: 325765
|
| |
|
|
|
|
|
|
|
|
| |
Fixup to rL325573 for large xor constants.
Thanks to Eli Friedman for the catch.
Differential revision: https://reviews.llvm.org/D43549
llvm-svn: 325761
|
| |
|
|
| |
llvm-svn: 325756
|
| |
|
|
|
|
|
|
|
|
|
| |
This is a follow up of r325012, that allowed half types in constant pools.
Proper alignment was enforced when a big basic block was split up, but not when
a CPE was placed before/after a block; the successor block had the wrong
alignment.
Differential Revision: https://reviews.llvm.org/D43580
llvm-svn: 325754
|
| |
|
|
|
|
| |
"a a" -> "a"
llvm-svn: 325752
|
| |
|
|
|
|
|
|
|
|
|
| |
An FRem instruction inside a loop should prevent the loop from being converted
into a CTR loop since this is not an operation that is legal on any PPC
subtarget. This will always be a call to a library function which means the
loop will be invalid if this instruction is in the body.
Fixes PR36292.
llvm-svn: 325739
|
| |
|
|
|
|
|
|
| |
vectors as well as v2i32
Also handle both cases where the lower 32-bits of the MMX is undef or zero extended.
llvm-svn: 325736
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The pahole does not work with BPF backend properly:
-bash-4.2$ cat test.c
struct test_t {
int a;
int b;
};
int test(struct test_t *s) {
return s->a;
}
-bash-4.2$ clang -g -O2 -target bpf -c test.c
-bash-4.2$ pahole test.o
struct clang version 7.0.0 (trunk 325446) (llvm/trunk 325464) {
clang version 7.0.0 (trunk 325446) (llvm/trunk 325464) clang version 7.0.0 (trunk 325446) (llvm/trunk 325464); /* 0 4 */
clang version 7.0.0 (trunk 325446) (llvm/trunk 325464) clang version 7.0.0 (trunk 325446) (llvm/trunk 325464); /* 4 4 */
/* size: 8, cachelines: 1, members: 2 */
/* last cacheline: 8 bytes */
};
-bash-4.2$
The reason is that BPF backend is not yet implemented in elfutils backend
https://github.com/threatstack/elfutils/tree/master/backends
and pahole depends on elfutils for dwarf parsing and resolving relocation.
More specifically, the unsupported relocation in .debug_info for type/member name
against symbol table caused the incorrect result above. The following is
the raw .rel.debug_info for the above example,
Hex dump of section '.rel.debug_info':
0x00000000 06000000 00000000 0a000000 0b000000 ................
0x00000010 0c000000 00000000 0a000000 01000000 ................
0x00000020 12000000 00000000 0a000000 02000000 ................
0x00000030 16000000 00000000 0a000000 0e000000 ................
0x00000040 1a000000 00000000 0a000000 03000000 ................
----------------- -------- --------
reloc location type symtab index
Hex dump of section '.debug_info':
0x00000000 7b000000 04000000 00000801 00000000 {...............
0x00000010 0c000000 00000000 00000000 00000000 ................
0x00000020 00000000 00001000 00000200 00000000 ................
Based on "type", the proper value will be extracted from symbol table
and filled in .debug_info so later on .debug_info can be properly
resolved against debug strings.
There are two ways to fix this problem. One is to fix elfutils by adding
BPF support which is desirable. This could take a long time and won't work
with already deployed pahole. For a short term workaround, we can disable
dwarf cross-section relation which specifically avoids debug_info and
symbol table cross relocation. This should help any dwarf-related tool
which has not implement BPF specific relocations yet.
Now .rel.debug_info does not have any relocation for symbol table and
.debug_info itself contains necessary relocation information by itself.
Hex dump of section '.debug_info':
0x00000000 7b000000 04000000 00000801 00000000 {...............
0x00000010 0c003700 00000000 00003e00 00000000 ..7.......>.....
0x00000020 00000000 00001000 00000200 00000000 ................
location 0xc has 0, 0x12 has 0x37, 0x1a has 0x3e in place which
will be used in relocation resolution. Here, the values of 0, 0x37 and 0x3e
are offset in .debug_str section.
Please note the difference between two above .debug_info dumps.
With the fix, pahole works properly with BPF backend:
-bash-4.2$ clang -O2 -g -target bpf -c test.c
-bash-4.2$ pahole test.o
struct test_t {
int a; /* 0 4 */
int b; /* 4 4 */
/* size: 8, cachelines: 1, members: 2 */
/* last cacheline: 8 bytes */
};
Signed-off-by: Yonghong Song <yhs@fb.com>
llvm-svn: 325735
|
| |
|
|
| |
llvm-svn: 325731
|
| |
|
|
|
|
|
|
|
|
| |
Enable multiple COPY hints to eliminate more COPYs during register allocation.
Note that this is something all targets should do, see
https://reviews.llvm.org/D38128.
Review: Krzysztof Parzyszek
llvm-svn: 325697
|
| |
|
|
| |
llvm-svn: 325695
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Global Dynamic and Local Dynamic call relocations only implicitly
reference __tls_get_addr; there is no connection in the ELF file between
the relocations and the symbol other than the specification for the
relocations' semantics. However, it still needs to be in the symbol
table despite the lack of explicit references to the symbol table entry,
since it needs to be bound at link time for these relocations, otherwise
any objects will fail to link.
For details, see https://sourceware.org/bugzilla/show_bug.cgi?id=22832.
Path by: James Clarke (jrtc27)
Differential revision: https://reviews.llvm.org/D43271
llvm-svn: 325688
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Since this pass operates on machine SSA form, this should only really
affect M0 in practice.
Fixes various piglit variable-indexing/vs-varying-array-mat4-index-*
Change-Id: Ib2a1dc3a8d7b08225a8da49a86f533faa0986aa8
Fixes: r317751 ("AMDGPU: Merge S_BUFFER_LOAD_DWORD_IMM into x2, x4")
Reviewers: arsenm, mareko, rampitec
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D40343
llvm-svn: 325677
|
| |
|
|
|
|
|
|
|
| |
See bug 28234: https://bugs.llvm.org/show_bug.cgi?id=28234
Differential Revision: https://reviews.llvm.org/D43472
Reviewers: vpykhtin, artem.tamazov, arsenm
llvm-svn: 325676
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Cannon Lake does not support CLWB, therefore it
does not include all features listed under SKX anymore.
Instead, enumerate all SKX features with the exception of CLWB.
Patch by Gabor Buella
Differential Revision: https://reviews.llvm.org/D43380
llvm-svn: 325654
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch provides mitigation for CVE-2017-5715, Spectre variant two,
which affects the P5600 and P6600. It implements the LLVM part of
-mindirect-jump=hazard. It is _not_ enabled by default for the P5600.
The migitation strategy suggested by MIPS for these processors is to use
hazard barrier instructions. 'jalr.hb' and 'jr.hb' are hazard
barrier variants of the 'jalr' and 'jr' instructions respectively.
These instructions impede the execution of instruction stream until
architecturally defined hazards (changes to the instruction stream,
privileged registers which may affect execution) are cleared. These
instructions in MIPS' designs are not speculated past.
These instructions are used with the attribute +use-indirect-jump-hazard
when branching indirectly and for indirect function calls.
These instructions are defined by the MIPS32R2 ISA, so this mitigation
method is not compatible with processors which implement an earlier
revision of the MIPS ISA.
Performance benchmarking of this option with -fpic and lld using
-z hazardplt shows a difference of overall 10%~ time increase
for the LLVM testsuite. Certain benchmarks such as methcall show a
substantially larger increase in time due to their nature.
Reviewers: atanasyan, zoran.jovanovic
Differential Revision: https://reviews.llvm.org/D43486
llvm-svn: 325653
|
| |
|
|
|
|
|
|
|
|
| |
https://reviews.llvm.org/rL325518
It breaks following OpenCL conformance tests:
- Basic - parameter_types
- Basic - vload_private
llvm-svn: 325643
|
| |
|
|
|
|
|
|
|
|
|
| |
Get rid of icky goto loops and make the code easier to maintain. Otherwise,
NFC.
Restore r324903 and fix PR36369.
Differentail revision: https://reviews.llvm.org/D43364
llvm-svn: 325621
|
| |
|
|
|
|
|
|
| |
This case wasn't handled yet.
Differential Revision: https://reviews.llvm.org/D43508
llvm-svn: 325616
|
| |
|
|
| |
llvm-svn: 325606
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
uses of the condition.
SimplifyDemandedBits forces the demanded mask to all 1s if the node has multiple uses, unless the AssumeSingleUse flag is set.
So previously we were only really likely to simplify something if the condition had a single use. And on the off chance we did simplify with multiple uses the demanded mask being used was all ones so there was no reason to create a shrunkblend.
This patch now checks that the condition is only used by selects first, and then sets the AssumeSingleUse flag for the simplifcation. Then we convert the selects to shrunkblend, and finally replace condition.
Differential Revision: https://reviews.llvm.org/D43446
llvm-svn: 325604
|
| |
|
|
|
|
|
|
|
|
| |
This allows us to avoid an opsize prefix. And forcing some move immediates to i32 avoids a length changing prefix on those instructions.
This mostly replaces the existing combine we had for zext/sext+cmov of constants. I left in a case for sign extending a 32 bit cmov of constants to 64 bits.
Differential Revision: https://reviews.llvm.org/D43327
llvm-svn: 325601
|
| |
|
|
|
|
|
|
| |
An upcoming patch D41434, changes the ordering of the matcher table
for assembly. This patch corrects the definition of the normal MIPS
cvt.d.w not to be available in microMIPS.
llvm-svn: 325589
|
| |
|
|
|
|
|
|
|
|
|
|
| |
parameter save area when needed
Current implementation always allocates the parameter save area conservatively
for fastcc functions. There is no reason to allocate the parameter save area if
all the parameters can be passed via registers.
Differential Revision: https://reviews.llvm.org/D42602
llvm-svn: 325581
|
| |
|
|
| |
llvm-svn: 325580
|
| |
|
|
|
|
|
|
|
|
| |
We can always convert xor %a, -1 into MVN, even in thumb 1 where the -1
would not otherwise be considered a cheap constant. This prevents the
-1's from being pulled out into constants and potentially hoisted.
Differential Revision: https://reviews.llvm.org/D43451
llvm-svn: 325573
|