| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
on targets without AVX512BW.
LowerIntUnary as its name says has an assert for integer types. But for the bitcast case one side might be an FP type.
Rather than making sure the function really works for fp types and renaming it. Just do really basic splitting directly. The LowerIntUnary has the advantage that it can peek through BUILD_VECTOR because every other call is during Lowering. But these calls are during legalization and will be followed by a DAG combine round.
Revert some change to LowerVectorIntUnary that were originally made just to make these two calls work even in pure integer cases.
This was found purely by compiling the avx512f-builtins.c test from clang so I've copied over the offending function from that.
llvm-svn: 329616
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a code size win in code that takes offseted addresses
frequently, such as C++ constructors that typically need to compute
an offseted address of a vtable. It reduces the size of Chromium for
Android's .text section by 46KB, or 56KB with ThinLTO (which exposes
more opportunities to use a direct access rather than a GOT access).
Because the addend range is limited in COFF and Mach-O, this is
enabled for ELF only.
Differential Revision: https://reviews.llvm.org/D45199
llvm-svn: 329611
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit r329591.
It breaks various bots:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/16516
http://lab.llvm.org:8011/builders/clang-ppc64be-linux/builds/17374
http://lab.llvm.org:8011/builders/clang-ppc64le-linux/builds/15992
http://lab.llvm.org:8011/builders/clang-ppc64be-linux-lnt
http://lab.llvm.org:8011/builders/clang-ppc64le-linux-lnt/builds/11251
...
llvm-svn: 329610
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.
To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.
Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort.
Refer the comments section in D44363 for a list of all the required patches.
Reviewers: sunfish, RKSimon
Reviewed By: sunfish
Subscribers: jfb, dschuff, sbc100, jgravelle-google, aheejin, llvm-commits
Differential Revision: https://reviews.llvm.org/D44873
llvm-svn: 329607
|
| |
|
|
|
|
| |
While it appears to be correct information based on Intel's optimization manual and Agner's data, it causes perf regressions on a couple of the benchmarks in our internal list.
llvm-svn: 329593
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Author: Samuel Pitoiset
ds_read_b128 and ds_write_b128 have been recently enabled
under the amdgpu-ds128 option because the performance benefit
is unclear.
Though, using 128-bit loads/stores for the local address space
appears to introduce regressions in tessellation shaders. Not
sure what is broken, but as ds_read_b128/ds_write_b128 are not
enabled by default, just introduce a global option and enable
128-bit only if requested (until it's fixed/used correctly).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105464
llvm-svn: 329591
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This fixes AMDGPU GlobalISel test failures when enabling the AMDGPU
target without any other targets that use GlobalISel.
Reviewers: arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, llvm-commits, t-tye
Differential Revision: https://reviews.llvm.org/D45353
llvm-svn: 329588
|
| |
|
|
| |
llvm-svn: 329568
|
| |
|
|
| |
llvm-svn: 329567
|
| |
|
|
| |
llvm-svn: 329565
|
| |
|
|
|
|
|
|
|
|
|
| |
See bugs
36841: https://bugs.llvm.org/show_bug.cgi?id=36841
36842: https://bugs.llvm.org/show_bug.cgi?id=36842
Differential Revision: https://reviews.llvm.org/D45251
Reviewers: artem.tamazov, arsenm, timcorringham
llvm-svn: 329562
|
| |
|
|
|
|
| |
The RR/RM itineraries were the wrong way around
llvm-svn: 329561
|
| |
|
|
|
|
| |
The RM folded itineraries were incorrectly using the f64 version.
llvm-svn: 329556
|
| |
|
|
|
|
| |
"is is" -> "is", "are are" -> "are"
llvm-svn: 329546
|
| |
|
|
|
|
|
|
|
|
| |
The TargetSchedModel is always initialized using the TargetSubtargetInfo's
MCSchedModel and TargetInstrInfo, so we don't need to extract those and
pass 3 parameters to init().
Differential Revision: https://reviews.llvm.org/D44789
llvm-svn: 329540
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Cmov and setcc previously used WriteALU, but on Intel processors at least they are more restricted than basic ALU ops.
This patch adds new SchedWrites for them and removes the InstRWs. I had to leave some InstRWs for CMOVA/CMOVBE and SETA/SETBE because those have an extra uop relative to the other condition codes on Intel CPUs.
The test changes are due to fixing a missing ZnAGU dependency on the memory form of setcc.
Reviewers: RKSimon, andreadb, GGanesh
Reviewed By: RKSimon
Subscribers: GGanesh, llvm-commits
Differential Revision: https://reviews.llvm.org/D45380
llvm-svn: 329539
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This removes the InstRWs for BLENDVPS/PD in favor of WriteFVarBlend. The latency listed was 3 cycles but WriteFVarBlend is defined as 1 cycle latency. The 1 cycle latency matches Agner Fog's data.
The patterns were missing the VEX forms which is why there are no test changes. We don't test "-mcpu=znver1 -mattr=-avx"
Reviewers: RKSimon, GGanesh
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D44841
llvm-svn: 329538
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.
To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.
Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort.
Refer the comments section in D44363 for a list of all the required patches.
Reviewers: hfinkel, RKSimon
Reviewed By: RKSimon
Subscribers: nemanjai, kbarton, llvm-commits
Differential Revision: https://reviews.llvm.org/D44870
llvm-svn: 329535
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.
To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.
Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort.
Refer the comments section in D44363 for a list of all the required patches.
Reviewers: chandlerc, craig.topper, RKSimon
Reviewed By: chandlerc, craig.topper
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D44874
llvm-svn: 329534
|
| |
|
|
| |
llvm-svn: 329524
|
| |
|
|
|
|
|
|
| |
lowering.
Previously we used a custom lowering for this because of the AVX1 splitting requirement. But we can do the split during DAG combine if we check the types and subtarget
llvm-svn: 329510
|
| |
|
|
| |
llvm-svn: 329498
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Should fix UBSan bot by also checking there's no "uwtable" attribute
before skipping. Otherwise the unwind table will be useless since its
moves expect CSRs to actually be preserved.
A noreturn nounwind function can be expected to never return in any way, and by
never returning it will also never have to restore any callee-saved registers
for its caller. This makes it possible to skip spills of those registers during
function entry, saving some stack space and time in the process. This is rather
useful for embedded targets with limited stack space.
Should fix PR9970.
Patch mostly by myeisha (pmb).
llvm-svn: 329494
|
| |
|
|
|
|
|
|
| |
Breaks ubsan test TestCases/Misc/missing_return.cpp on ARM
This reverts commit r329287
llvm-svn: 329486
|
| |
|
|
|
|
|
|
|
| |
Previously HalfTy was not handled which would either trigger an assertion,
or result in array initialized with garbage.
Differential Revision: https://reviews.llvm.org/D45391
llvm-svn: 329463
|
| |
|
|
|
|
|
|
|
| |
v2f16 is a special case in NVPTX. v4f16 may be loaded as a pair of v2f16
and that was not previously handled correctly by tryLDGLDU()
Differential Revision: https://reviews.llvm.org/D45339
llvm-svn: 329456
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch implements a tablegen-driven Instruction Compression
mechanism for generating RISCV compressed instructions
(C Extension) from the expanded instruction form.
This tablegen backend processes CompressPat declarations in a
td file and generates all the compile-time and runtime checks
required to validate the declarations, validate the input
operands and generate correct instructions.
The checks include validating register operands, immediate
operands, fixed register operands and fixed immediate operands.
Example:
class CompressPat<dag input, dag output> {
dag Input = input;
dag Output = output;
list<Predicate> Predicates = [];
}
let Predicates = [HasStdExtC] in {
def : CompressPat<(ADD GPRNoX0:$rs1, GPRNoX0:$rs1, GPRNoX0:$rs2),
(C_ADD GPRNoX0:$rs1, GPRNoX0:$rs2)>;
}
The result is an auto-generated header file
'RISCVGenCompressEmitter.inc' which exports two functions for
compressing/uncompressing MCInst instructions, plus
some helper functions:
bool compressInst(MCInst& OutInst, const MCInst &MI,
const MCSubtargetInfo &STI,
MCContext &Context);
bool uncompressInst(MCInst& OutInst, const MCInst &MI,
const MCRegisterInfo &MRI,
const MCSubtargetInfo &STI);
The clients that include this auto-generated header file and
invoke these functions can compress an instruction before emitting
it, in the target-specific ASM or ELF streamer, or can uncompress
an instruction before printing it, when the expanded instruction
format aliases is favored.
The following clients were added to implement compression\uncompression
for RISCV:
1) RISCVAsmParser::MatchAndEmitInstruction:
Inserted a call to compressInst() to compresses instructions
parsed by llvm-mc coming from an ASM input.
2) RISCVAsmPrinter::EmitInstruction:
Inserted a call to compressInst() to compress instructions that
were lowered from Machine Instructions (MachineInstr).
3) RVInstPrinter::printInst:
Inserted a call to uncompressInst() to print the expanded
version of the instruction instead of the compressed one (e.g,
add s0, s0, a5 instead of c.add s0, a5) when -riscv-no-aliases
is not passed.
This patch squashes D45119, D42780 and D41932. It was reviewed in smaller patches by
asb, efriedma, apazos and mgrang.
Reviewers: asb, efriedma, apazos, llvm-commits, sabuasal
Reviewed By: sabuasal
Subscribers: mgorny, eraman, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, niosHD, kito-cheng, shiva0217, zzheng
Differential Revision: https://reviews.llvm.org/D45385
llvm-svn: 329455
|
| |
|
|
|
|
|
|
|
| |
See bug 36843: https://bugs.llvm.org/show_bug.cgi?id=36843
Differential Revision: https://reviews.llvm.org/D45268
Reviewers: artem.tamazov, arsenm, timcorringham
llvm-svn: 329440
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The compiler is generating packet with the following instructions,
which causes an undefined register assert in the verifier.
$r0 = IMPLICIT_DEF
$r1 = IMPLICIT_DEF
S2_storerd_io killed $r29, 0, killed %d0
The problem is that the packetizer is not saving the IMPLICIT_DEF
instructions, which are needed when checking if it is legal to
add the store instruction. The fix is to add the IMPLICIT_DEF
instructions to the CurrentPacketMIs structure.
Patch by Brendon Cahoon.
llvm-svn: 329439
|
| |
|
|
|
|
|
|
|
|
| |
Packetizer keeps two zero-latency bound instrctions in the same packet ignoring
the stalls on the later instruction. This should not be the case if there is no
data dependence.
Patch by Sumanth Gundapaneni.
llvm-svn: 329437
|
| |
|
|
| |
llvm-svn: 329436
|
| |
|
|
| |
llvm-svn: 329434
|
| |
|
|
|
|
|
|
|
| |
See bug 36844: https://bugs.llvm.org/show_bug.cgi?id=36844
Differential Revision: https://reviews.llvm.org/D45313
Reviewers: artem.tamazov, arsenm, timcorringham
llvm-svn: 329430
|
| |
|
|
|
|
| |
ADC/SBB.
llvm-svn: 329424
|
| |
|
|
|
|
|
|
|
| |
See bug 36840: https://bugs.llvm.org/show_bug.cgi?id=36840
Differential Revision: https://reviews.llvm.org/D45250
Reviewers: artem.tamazov, arsenm, timcorringham
llvm-svn: 329419
|
| |
|
|
|
|
|
|
| |
scheduler model.
We can get this right through WriteALU and friends now.
llvm-svn: 329417
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Haswell/Broadwell/Skylake scheduler models without InstRWs
Summary:
This patch removes InstRW overrides for basic arithmetic/logic instructions. To do this I've added the store address port to RMW. And used a WriteSequence to make the latency additive. It does not cover ADC/SBB because they have different latency.
Apparently we were inconsistent about whether the store has latency or not thus the test changes.
I've also left out Sandy Bridge because the load latency there is currently 4 cycles and should be 5.
Reviewers: RKSimon, andreadb
Reviewed By: andreadb
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D45351
llvm-svn: 329416
|
| |
|
|
|
|
|
|
| |
Bridge/Broadwell/Haswell/Skylake scheduler model.
Even those the address was calculated for the load, its calculated again for the store.
llvm-svn: 329415
|
| |
|
|
|
|
| |
These are very simple flag setting instructions that appear to only be a single uop. They're unlikely to need this separation.
llvm-svn: 329414
|
| |
|
|
|
|
|
|
|
| |
See bug 36839: https://bugs.llvm.org/show_bug.cgi?id=36839
Differential Revision: https://reviews.llvm.org/D45249
Reviewers: artem.tamazov, arsenm, timcorringham
llvm-svn: 329408
|
| |
|
|
|
|
|
|
|
| |
Add disassembler support for instructions which writeback STATUS32.
https://reviews.llvm.org/D45148
Patch by Yan Luo! (Yan.Luo2@synopsys.com)
llvm-svn: 329404
|
| |
|
|
|
|
|
|
|
| |
See bug 36838: https://bugs.llvm.org/show_bug.cgi?id=36838
Differential Revision: https://reviews.llvm.org/D45247
Reviewers: artem.tamazov, arsenm, timcorringham
llvm-svn: 329397
|
| |
|
|
|
|
| |
Noticed this during D44654
llvm-svn: 329389
|
| |
|
|
|
|
|
|
|
|
|
|
| |
As mentioned on D44647, this patch increases the default memory latency to +5cy , which more closely matches what most custom cases are doing for reg-mem instructions.
I've bumped LoadLatency, ReadAfterLd and WriteLoad values to 5cy to be consistent.
As Sandy Bridge is currently our default generic model, this affects a lot of scheduling tests...
Differential Revision: https://reviews.llvm.org/D44654
llvm-svn: 329388
|
| |
|
|
| |
llvm-svn: 329386
|
| |
|
|
|
|
|
|
|
|
| |
VSX D-form load/store instructions of POWER9 require the offset be a multiple of 16 and a helper`isOffsetMultipleOf` is used to check this.
So far, the helper handles FrameIndex + offset case, but not handling FrameIndex without offset case. Due to this, we are missing opportunities to exploit D-form instructions when accessing an object or array allocated on stack.
For example, x-form store (stxvx) is used for int a[4] = {0}; instead of d-form store (stxv). For larger arrays, D-form instruction is not used when accessing the first 16-byte. Using D-form instructions reduces register pressure as well as instructions.
Differential Revision: https://reviews.llvm.org/D45079
llvm-svn: 329377
|
| |
|
|
|
|
|
| |
Use double braces in std::array initialization
to keep Darwin builders happy.
llvm-svn: 329363
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Replace ArrayRefs by actual std::array objects so that there are
no dangling references.
Reviewers: rsmith, gkistanova
Subscribers: sdardis, arichardson, llvm-commits
Differential Revision: https://reviews.llvm.org/D45338
llvm-svn: 329359
|
| |
|
|
|
|
| |
According to Agner's data, CDQE is closer to CWDE.
llvm-svn: 329354
|
| |
|
|
| |
llvm-svn: 329351
|