summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
* Pass Divergence Analysis data to Selection DAG to drive divergenceAlexander Timofeev2018-03-052-16/+59
| | | | | | | | dependent instruction selection. Differential revision: https://reviews.llvm.org/D35267 llvm-svn: 326703
* [X86] Add a DAG combine to turn stores of vXi1 constants into scalar stores.Craig Topper2018-03-042-120/+26
| | | | llvm-svn: 326679
* [X86] Add a 32-bit mode command line to avx512-mask-op.ll. Add tests for ↵Craig Topper2018-03-041-0/+711
| | | | | | storing v2i1 and v4i1 constants. llvm-svn: 326678
* [DAGCombiner] Add a peekThroughBitcast to MergeStoresOfConstantsOrVecElts to ↵Craig Topper2018-03-041-0/+32
| | | | | | | | fix a crash if we are storing a bitcast of a constant. Loading a constant into a k-register in AVX512 requires a bitcast from a scalar constant. In the test case here we have a k-register store that gets split into multiple parts of KNL. MergeConsecutiveStores sees each of these pieces as a consecutive store and looks through the bitcast to find the underly scalar constant. But when we went to create the combined store we didn't look through the same bitcast. llvm-svn: 326677
* [X86][X87] Add X87 folded integer arithmetic testsSimon Pilgrim2018-03-041-0/+621
| | | | | | | | Add tests for FIADD/FISUB/FISUBR/FIMUL/FIDIV/FIDIVR Shows we have more FILD stack usage than necessary (arg load, spill, reload to x87) llvm-svn: 326674
* [X86] Combine (store (v1i1 (scalar_to_vector (i8 X)))) -> (store (i8 X)).Craig Topper2018-03-041-14/+14
| | | | llvm-svn: 326670
* [X86] Lower v1i1/v2i1/v4i1/v8i1 load/stores to i8 load/store during op ↵Craig Topper2018-03-041-6/+2
| | | | | | | | legalization if AVX512DQ is not supported. We were previously doing this with isel patterns. Moving it to op legalization gives us chance to see the required bitcast earlier. And it lets us remove some isel patterns. llvm-svn: 326669
* [LegalizeVectorTypes] When scalarizing the operand of a unary op like TRUNC, ↵Craig Topper2018-03-021-38/+11
| | | | | | | | | | | | use a SCALAR_TO_VECTOR rather than a single element BUILD_VECTOR to convert back to a vector type. X86 considers v1i1 a legal type under AVX512 and as such a truncate from a v1iX type to v1i1 can be turned into a scalar truncate plus a conversion to v1i1. We would much prefer a v1i1 SCALAR_TO_VECTOR over a one element BUILD_VECTOR. During lowering we were detecting the v1i1 BUILD_VECTOR as a splat BUILD_VECTOR like we try to do for v2i1/v4i1/etc. In this case we create (select i1 splat_elt, v1i1 all-ones, v1i1 all-zeroes). That goes through some more legalization and we end up with a CMOV choosing between 0 and 1 in scalar and a scalar_to_vector. Arguably we could detect the v1i1 BUILD_VECTOR and do this better in X86 target code. But just using a SCALAR_TO_VECTOR in legalization is much easier. llvm-svn: 326637
* [Hexagon] Generate valignb for shifting shuffles (instead of vdelta)Krzysztof Parzyszek2018-03-022-0/+2548
| | | | llvm-svn: 326627
* [SystemZ] Fix test cases after r326613Ulrich Weigand2018-03-026-185/+50
| | | | | | I forgot to check in the updated test cases after the r326613 commit. llvm-svn: 326616
* [SystemZ] Add support for anyregcc calling conventionUlrich Weigand2018-03-023-0/+613
| | | | | | | | | | | | | This adds back-end support for the anyregcc calling convention for use with patchpoints. Since all registers are considered call-saved with anyregcc (except for 0 and 1 which may still be clobbered by PLT stubs and the like), this required adding support for saving and restoring vector registers in prologue/epilogue code for the first time. This is not used by any other calling convention. llvm-svn: 326612
* [SystemZ] Support stackmaps and patchpointsUlrich Weigand2018-03-025-0/+871
| | | | | | | This adds back-end support for the @llvm.experimental.stackmap and @llvm.experimental.patchpoint intrinsics. llvm-svn: 326611
* [SystemZ] Fix common-code users of stack sizeUlrich Weigand2018-03-021-0/+38
| | | | | | | | | | | | | | | | | | | On SystemZ we need to provide a register save area of 160 bytes to any called function. This size needs to be added when allocating stack in the function prologue. However, it was not accounted for as part of MachineFrameInfo::getStackSize(); instead the back-end used a private routine getAllocatedStackSize(). This is OK for code-gen purposes, but it breaks other users of the getStackSize() routine, in particular it breaks the recently- added -stack-size-section feature. Fix this by updating the main stack size tracked by common code (in emitPrologue) instead of using the private routine. No change in code generation intended. llvm-svn: 326610
* [SystemZ] Support vector registers in inline asmUlrich Weigand2018-03-021-0/+138
| | | | | | | | This adds support for specifying vector registers for use with inline asm statements, either via the 'v' constraint or by explicit register names (v0 ... v31). llvm-svn: 326609
* [Hexagon] Handle VACOPY in isel loweringKrzysztof Parzyszek2018-03-021-0/+18
| | | | llvm-svn: 326599
* [X86][BTVER2] Fix throughput of YMM bitwise instructionsSimon Pilgrim2018-03-021-12/+12
| | | | | | | | These instructions are double-pumped, split into 2 128-bit ops and then passing through either FPU pipe. Found while testing llvm-mca (D43951) llvm-svn: 326597
* [X86] Reject xmm16-31 in inline asm constraints when AVX512 is disabledCraig Topper2018-03-021-0/+8
| | | | | | | | Fixes PR36532 Differential Revision: https://reviews.llvm.org/D43960 llvm-svn: 326596
* [X86][x32] Save callee-save register used as base pointer for x32 ABIDerek Schuff2018-03-021-1/+3
| | | | | | | | | | | | | For the x32 ABI, since the base pointer register (EBX) is a callee save register it should be saved before use. This fixes https://bugs.llvm.org/show_bug.cgi?id=36011 Differential Revision: https://reviews.llvm.org/D42358 Patch by Pratik Bhatu llvm-svn: 326593
* AMDGPU/GlobalISel: InstrMapping for G_ZEXTMatt Arsenault2018-03-021-0/+31
| | | | llvm-svn: 326589
* AMDGPU/GlobalISel: InstrMapping for G_TRUNCMatt Arsenault2018-03-021-0/+31
| | | | llvm-svn: 326588
* AMDGPU/GlobalISel: Define InstrMappings for G_FCMPMatt Arsenault2018-03-021-0/+69
| | | | | | Patch by Tom Stellard llvm-svn: 326587
* AMDGPU/GlobalISel: Define instruction mapping for @llvm.minnumMatt Arsenault2018-03-021-0/+66
| | | | | | Patch by Tom Stellard llvm-svn: 326586
* [ARM] Fix access to stack arguments when re-aligning SP in Armv6mMomchil Velikov2018-03-021-0/+416
| | | | | | | | | | | | | | | | | | | | When an Armv6m function dynamically re-aligns the stack, access to incoming stack arguments (and to stack area, allocated for register varargs) is done via SP, which is incorrect, as the SP is offset by an unknown amount relative to the value of SP upon function entry. This patch fixes it, by making access to "fixed" frame objects be done via FP when the function needs stack re-alignment. It also changes the access to "fixed" frame objects be done via FP (instead of using R6/BP) also for the case when the stack frame contains variable sized objects. This should allow more objects to fit within the immediate offset of the load instruction. All of the above via a small refactoring to reuse the existing `ARMFrameLowering::ResolveFrameIndexReference.` Differential Revision: https://reviews.llvm.org/D43566 llvm-svn: 326584
* [MergeICmps] Revert 324317 "Enable the MergeICmps Pass by default."Clement Courbet2018-03-023-16/+41
| | | | | | While working on PR36557. llvm-svn: 326575
* [ARM] Fix codegen for VLD3/VLD4/VST3/VST4 with WBFlorian Hahn2018-03-024-0/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Code generation of VLD3, VLD4, VST3 and VST4 with register writeback is broken due to 2 separate bugs: 1) VLD1d64TPseudoWB_register and VLD1d64QPseudoWB_register are missing rules to expand them to non pseudo MIR. These are selected for ARMISD::VLD3_UPD/VLD4_UPD with v1i64 vectors in SelectVLD. 2) Selection of the right VLD/VST instruction is broken for load and store of 3 and 4 v1i64 vectors. SelectVLD and SelectVST are called with MIR opcode for fixed writeback (ie increment is access size) and call getVLDSTRegisterUpdateOpcode() to select an opcode with register writeback if base register update is of a different size. Since getVLDSTRegisterUpdateOpcode() only knows about VLD1/VLD2/VST1/VST2 the call is currently conditional on the number of element in the vector. However, VLD1/VST1 is selected by SelectVLD/SelectVST's caller for load and stores of 3 or 4 v1i64 vectors. Therefore the opcode is not updated which later lead to a fixed writeback instruction being constructed with an extra operand for the register writeback. This patch addresses the two issues as follows: - it adds the necessary mapping from VLD1d64TPseudoWB_register and VLD1d64QPseudoWB_register to VLD1d64Twb_register and VLD1d64Qwb_register respectively. Like for the existing _fixed variants, the cost of these is bumped for unaligned access. - it changes the logic in SelectVLD and SelectVSD to call isVLDfixed and isVSTfixed respectively to decide whether the opcode should be updated. It also reworks the logic and comments for pushing the writeback offset operand and r0 operand to clarify the logic: writeback offset needs to be pushed if it's a register writeback, r0 needs to be pushed if not and the instruction is a VLD1/VLD2/VST1/VST2. Reviewers: rengolin, t.p.northover, samparker Reviewed By: samparker Patch by Thomas Preud'homme <thomas.preudhomme@arm.com> Differential Revision: https://reviews.llvm.org/D42970 llvm-svn: 326570
* AMDGPU/GlobalISel: Define instruction mapping for @llvm.maxnumMatt Arsenault2018-03-021-0/+66
| | | | | | Patch by Tom Stellard llvm-svn: 326567
* AMDGPU/GCN: Promote i16 ctpopJan Vesely2018-03-021-0/+334
| | | | | | | | | i16 capable ASICs do not support i16 operands for this instruction. Add tablegen pattern to merge chained i16 additions. Differential Revision: https://reviews.llvm.org/D43985 llvm-svn: 326535
* AMDGPU/GlobalISel: Define instruction mapping for G_FPTOSIMatt Arsenault2018-03-021-0/+31
| | | | | | Patch by Tom Stellard llvm-svn: 326534
* AMDGPU/GlobalISel: Define instruction mapping for G_FPTOUIMatt Arsenault2018-03-021-0/+31
| | | | | | Patch by Tom Stellard llvm-svn: 326533
* AMDGPU/GlobalISel: Define instruction mapping for G_FMULMatt Arsenault2018-03-021-0/+69
| | | | llvm-svn: 326532
* AMDGPU/GlobalISel: Define instruction mapping for G_FADDMatt Arsenault2018-03-021-0/+69
| | | | | | Patch by Tom Stellard llvm-svn: 326526
* AMDGPU/GlobalISel: Define instruction mapping for G_SHLMatt Arsenault2018-03-021-0/+68
| | | | | | Patch by Tom Stellard llvm-svn: 326525
* AMDGPU/GlobalISel: Define instruction mapping for G_XORMatt Arsenault2018-03-021-0/+68
| | | | llvm-svn: 326524
* AMDGPU/GlobalISel: Define instruction mapping for G_ANDMatt Arsenault2018-03-021-0/+68
| | | | | | Patch by Tom Stellard llvm-svn: 326523
* [DAGCombiner] When combining zero_extend of a truncate, only mask before ↵Craig Topper2018-03-0115-90/+68
| | | | | | | | | | extending for vectors. Masking first, prevents the extend from being combine with loads. Its also interfering with some vXi1 extraction code. Differential Revision: https://reviews.llvm.org/D42679 llvm-svn: 326500
* [X86][MMX] Improve handling of 64-bit MMX constantsSimon Pilgrim2018-03-012-24/+4
| | | | | | | | | | | | 64-bit MMX constant generation usually ends up lowering into SSE instructions before being spilled/reloaded as a MMX type. This patch bitcasts the constant to a double value to allow correct loading directly to the MMX register. I've added MMX constant asm comment support to improve testing, it's better to always print the double values as hex constants as MMX is mainly an integer unit (and even with 3DNow! its just floats). Differential Revision: https://reviews.llvm.org/D43616 llvm-svn: 326497
* [SelectionDAG] Support some SimplifySetCC cases for comparing against vector ↵Craig Topper2018-03-012-50/+30
| | | | | | | | | | | | | | splats of constants. This supports things like (setcc ugt X, 0) -> (setcc ne X, 0) I've restricted to only make changes to vectors before legalize ops because I doubt all targets have accurate condition code legality information for vectors given how little we did before. Differential Revision: https://reviews.llvm.org/D42948 llvm-svn: 326495
* [X86][AVX] Add v2f32 <-> v2i8/v2i16/v2i32 vector testsSimon Pilgrim2018-03-011-0/+342
| | | | llvm-svn: 326494
* AMDGPU/GlobalISel: Define instruction mapping for @llvm.amdgcn.cvt.pkrtzMatt Arsenault2018-03-011-0/+66
| | | | | | Patch by Tom Stellard llvm-svn: 326490
* AMDGPU/GlobalISel: Define instruction mapping for G_ORMatt Arsenault2018-03-011-0/+68
| | | | | | Patch by Tom Stellard llvm-svn: 326489
* [X86][SSE] Regenerate float to/from i8/i16 vector testsSimon Pilgrim2018-03-011-20/+238
| | | | llvm-svn: 326488
* [X86][SSE] Regenerate odd sized sext/zext testsSimon Pilgrim2018-03-011-10/+158
| | | | llvm-svn: 326484
* AMDGPU/GlobalISel: Define instruction mapping for G_BITCASTMatt Arsenault2018-03-011-0/+31
| | | | | | Patch by Tom Stellard llvm-svn: 326482
* AMDGPU/GlobalISel: Mark i32->i64 zext as legalMatt Arsenault2018-03-011-0/+14
| | | | llvm-svn: 326481
* AMDGPU/GlobalISel: InstrMapping for llvm.amdgcn.exp.comprMatt Arsenault2018-03-011-0/+67
| | | | | | Patch by Tom Stellard llvm-svn: 326479
* AMDGPU/GlobalISel: Define instruction mapping for @llvm.amdgcn.expMatt Arsenault2018-03-011-0/+77
| | | | | | Patch by Tom Stellard llvm-svn: 326477
* AMDGPU/GlobalISel: Define InstrMappings for G_ICMPMatt Arsenault2018-03-011-0/+67
| | | | | | Patch by Tom Stellard llvm-svn: 326472
* AMDGPU/GlobalISel: Make i32 mul legalMatt Arsenault2018-03-011-0/+18
| | | | llvm-svn: 326471
* AMDGPU/GlobalISel: Define instruction mapping for G_IMPLICIT_DEFMatt Arsenault2018-03-011-6/+27
| | | | | | Patch by Tom Stellard llvm-svn: 326470
* AMDGPU/GlobalISel: Define instruction mapping for G_FCONSTANTMatt Arsenault2018-03-011-0/+31
| | | | | | Patch by Tom Stellard llvm-svn: 326468
OpenPOWER on IntegriCloud