summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
* Fix typo in test (verify-machine-instrs -> verify-machineinstrs)Jessica Paquette2018-04-201-2/+2
| | | | llvm-svn: 330494
* [MachineOutliner] XFAIL machine-outliner-noredzone.llJessica Paquette2018-04-201-2/+3
| | | | | | | | | | The verifier began complaining about an undefined physical register in this test. XFAILing for the purposes of getting a bot up while I look into it. Failure: http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-expensive/11385/ llvm-svn: 330493
* [X86] Add WriteFSign/WriteFLogic scheduler classesSimon Pilgrim2018-04-205-851/+851
| | | | | | | | | | | | | | Split the fp and integer vector logical instruction scheduler classes - older CPUs especially often handled these on different pipes. This unearthed a couple of things that are also handled in this patch: (1) We were tagging avx512 fp logic ops as WriteFAdd, probably because of the lack of WriteFLogic (2) SandyBridge had integer logic ops only using Port5, when afaict they can use Ports015. (3) Cleaned up x86 FCHS/FABS scheduling as they are typically treated as fp logic ops. Differential Revision: https://reviews.llvm.org/D45629 llvm-svn: 330480
* [Hexagon] Improve HVX instruction selection (bitcast, vsplat)Krzysztof Parzyszek2018-04-202-0/+76
| | | | | | | | | | There was some unfortunate interaction between VSPLAT and BITCAST related to the selection of constant vectors (coming from selecting shuffles). Introduce VSPLATW that always splats a 32-bit word, and can have arbitrary result type (to avoid BITCASTs of VSPLAT). Clean up the previous selection of BITCAST/VSPLAT. llvm-svn: 330471
* [Hexagon] Skip fixed-stack indexes in HexagonConstExtendersKrzysztof Parzyszek2018-04-201-0/+51
| | | | | | | Fixed slots have negative values, and TRI::stackSlot2Index and TRI::index2StackSlot do not handle negative numbers. llvm-svn: 330468
* [X86] WaitPKG instructionsGabor Buella2018-04-201-0/+67
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Three new instructions: umonitor - Sets up a linear address range to be monitored by hardware and activates the monitor. The address range should be a writeback memory caching type. umwait - A hint that allows the processor to stop instruction execution and enter an implementation-dependent optimized state until occurrence of a class of events. tpause - Directs the processor to enter an implementation-dependent optimized state until the TSC reaches the value in EDX:EAX. Also modifying the description of the mfence instruction, as the rep prefix (0xF3) was allowed before, which would conflict with umonitor during disassembly. Before: $ echo 0xf3,0x0f,0xae,0xf0 | llvm-mc -disassemble .text mfence After: $ echo 0xf3,0x0f,0xae,0xf0 | llvm-mc -disassemble .text umonitor %rax Reviewers: craig.topper, zvi Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D45253 llvm-svn: 330462
* [MachineOutliner] Change B instruction for tail calls to TCRETURNdiJessica Paquette2018-04-201-2/+2
| | | | | | | | | | First off, this is more correct than having the B. Second off, this was making a bot upset. This fixes that. Update the test to include -verify-machineinstrs as well to prevent stuff like this slipping by non debug/assert builds in the future. llvm-svn: 330459
* [x86] auto-generate checks; NFCSanjay Patel2018-04-201-213/+411
| | | | | | | There's a proposal to change/add to this file in D45653, so we should know exactly what those differences would be. llvm-svn: 330445
* [DAGCombine] (float)((int) f) --> ftrunc (PR36617)Sanjay Patel2018-04-208-330/+83
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This was originally committed at rL328921 and reverted at rL329920 to investigate failures in Chrome. This time I've added to the ReleaseNotes to warn users of the potential of exposing UB and let me repeat that here for more exposure: Optimization of floating-point casts is improved. This may cause surprising results for code that is relying on undefined behavior. Code sanitizers can be used to detect affected patterns such as this: int main() { float x = 4294967296.0f; x = (float)((int)x); printf("junk in the ftrunc: %f\n", x); return 0; } $ clang -O1 ftrunc.c -fsanitize=undefined ; ./a.out ftrunc.c:5:15: runtime error: 4.29497e+09 is outside the range of representable values of type 'int' junk in the ftrunc: 0.000000 Original commit message: fptosi / fptoui round towards zero, and that's the same behavior as ISD::FTRUNC, so replace a pair of casts with the equivalent node. We don't have to account for special cases (NaN, INF) because out-of-range casts are undefined. Differential Revision: https://reviews.llvm.org/D44909 llvm-svn: 330437
* Revert "This pass, fixing an erratum in some LEON 2 processors..."Daniel Cederman2018-04-201-11/+0
| | | | | | | | | | | | | | | | | | | | | Summary: Reading Atmel's AT697E errata document this does not seem like a valid workaround. While the text only mentions SDIV, it says that the ICC flags can be wrong, and those are only generated by SDIVcc. Verification on hardware shows that simply replacing SDIV with SDIVcc does not avoid the bug with negative operands. This reverts r283727. Reviewers: lero_chris, jyknight Reviewed By: jyknight Subscribers: fedor.sergeev, jrtc27, llvm-commits Differential Revision: https://reviews.llvm.org/D45813 llvm-svn: 330397
* [Sparc] Use synthetic instruction clr to zero register instead of sethiDaniel Cederman2018-04-204-4/+51
| | | | | | | | | | | | | | | Using `clr reg`/`mov %g0, reg`/`or %g0, %g0, reg` to zero a register looks much better than `sethi 0, reg`. Reviewers: jyknight, venkatra Reviewed By: jyknight Subscribers: eraman, fedor.sergeev, jrtc27, llvm-commits Differential Revision: https://reviews.llvm.org/D45810 llvm-svn: 330396
* AMDGPU: Legalize the operand of SI_INIT_M0Nicolai Haehnle2018-04-201-0/+15
| | | | | | | | | | | | | | | | | | | | Summary: This fixes a case where the argument to a sendmsg intrinsic ends up in a VGPR, for whatever reason. The underlying performance issue is that a multiplication that can be an s_mul_i32 is instead needlessly generated as v_mul_u32_u24, but this is not addressed by this patch. Change-Id: I61fd4034314d5acdf6074632c30b65364dfa7328 Reviewers: arsenm, rampitec Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D45826 llvm-svn: 330393
* [Sparc] Fix addressing mode when using 64-bit values in inline assemblyDaniel Cederman2018-04-201-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: If a 64-bit register is used as an operand in inline assembly together with a memory reference, the memory addressing will be wrong. The addressing will be a single reg, instead of reg+reg or reg+imm. This will generate a bad offset value or an exception in printMemOperand(). For example: ``` long long int val = 5; long long int mem; __asm__ volatile ("std %1, %0":"=m"(mem):"r"(val)); ``` becomes: ``` std %i0, [%i2+589833] ``` The problem is that SelectInlineAsmMemoryOperand() is never called for the memory references if one of the operands is a 64-bit register. By calling SelectInlineAsmMemoryOperands() in tryInlineAsm() the Sparc version of SelectInlineAsmMemoryOperand() gets called for each memory reference. Reviewers: jyknight, venkatra Reviewed By: jyknight Subscribers: eraman, fedor.sergeev, jrtc27, llvm-commits Differential Revision: https://reviews.llvm.org/D45761 llvm-svn: 330392
* [AMDGPU] Use packed literals with zero either lower or hi partStanislav Mekhanoshin2018-04-194-5/+98
| | | | | | Differential Revision: https://reviews.llvm.org/D45790 llvm-svn: 330365
* [X86] Enable popcnt false dependency breaking on Silvermont and Goldmont.Craig Topper2018-04-191-3/+5
| | | | | | Silvermont and Goldmont have the same issue on popcnt as Sandy Bridge, Haswell, Broadwell, and Skylake. Believe it is fixed in Goldmont Plus. llvm-svn: 330358
* [X86] Correct the scheduling data for register forms of XCHG and XADD on ↵Craig Topper2018-04-192-76/+76
| | | | | | | | | | | | Intel CPUs. The XCHG16rr/XCHG32rr/XCHG64rr instructions should be 3 uops just like XCHG8rr. I believe they're just implemented as 3 move uops with a temporary register. XADD is probably 2 moves and an add also using a temporary register. Change the latency for both from 2 cycles to 3 cycles. Only 2 of the uops are serialized in their execution, the move into the temporary and the move out of the temporary. The move from one GPR to the other should be able to go in parallel with this if there are ALU resources available. llvm-svn: 330349
* [if-converter] Handle BBs that terminate in ret during diamond conversionKrzysztof Parzyszek2018-04-192-0/+59
| | | | | | | | | | This fixes https://llvm.org/PR36825. Original patch by Valentin Churavy (D45218). Differential Revision: https://reviews.llvm.org/D45731 llvm-svn: 330345
* [Hexagon] Use legal types when lowering CONCAT_VECTORS via BUILD_VECTORKrzysztof Parzyszek2018-04-191-0/+34
| | | | llvm-svn: 330344
* [AMDGPU] Do not only rely on BB number when finding bottom loopMark Searles2018-04-191-0/+59
| | | | | | | | We should also check that the "bottom" basic block of a loopis a successor of the "header" basic block, otherwise we don't propagate the information correctly when the CFG is complex. This fixes an important rendering problem with Wolfsentein 2, because of one vector-memory wait was missing. Differential Revision: https://reviews.llvm.org/D43831 llvm-svn: 330337
* [Hexagon] Generate code for vector bswap intrinsicsKrzysztof Parzyszek2018-04-191-0/+45
| | | | llvm-svn: 330333
* [Hexagon] Add/fix patterns for 32/64-bit vector compares and logical opsKrzysztof Parzyszek2018-04-194-3/+356
| | | | llvm-svn: 330330
* [mips] Correct the definitions of the unaligned word memory operation ↵Simon Dardis2018-04-192-0/+159
| | | | | | | | | | | | | | | | instructions These instructions lacked the correct predicates, were not marked as loads and stores and lacked the proper instruction mapping information. In the case of microMIPS sw(l|r)e (EVA) these instructions were using the load EVA description. Reviewers: abeserminji, smaksimovic, atanasyan Differential Revision: https://reviews.llvm.org/D45626 llvm-svn: 330326
* Lowering x86 adds/addus/subs/subus intrinsics (llvm part)Alexander Ivchenko2018-04-1912-1954/+5028
| | | | | | | | | | | | | This is the patch that lowers x86 intrinsics to native IR in order to enable optimizations. The patch also includes folding of previously missing saturation patterns so that IR emits the same machine instructions as the intrinsics. Patch by tkrupa Differential Revision: https://reviews.llvm.org/D44785 llvm-svn: 330322
* [ARM] Add some missing FP16 VSEL test casesSjoerd Meijer2018-04-191-8/+83
| | | | | | Differential Revision: https://reviews.llvm.org/D45724 llvm-svn: 330313
* [X86] Scrub scheduling information for MUL/IMUL on Intel CPUs.Craig Topper2018-04-191-38/+38
| | | | | | This removes a bunch of unnecessary InstRW overrides. It also cleans up the missing information from the Sandy Bridge model. Other fixes to other models. llvm-svn: 330308
* [NVPTX, CUDA] Added support for m8n32k16 and m32n8k16 variants of wmma ↵Artem Belevich2018-04-181-18/+23
| | | | | | | | | | instructions. The new instructions were added added for sm_70+ GPUs in CUDA-9.1. Differential Revision: https://reviews.llvm.org/D45068 llvm-svn: 330296
* [RISCV] Add test changes missed from rL330293Alex Bradbury2018-04-181-4/+0
| | | | llvm-svn: 330294
* [RISCV] Introduce pattern for materialising immediates with 0 for lower 12 bitsAlex Bradbury2018-04-185-52/+29
| | | | | | | These immediates can be materialised with just an lui, rather than an lui+addi pair. llvm-svn: 330293
* [RISCV] Add imm-cse.ll test caseAlex Bradbury2018-04-181-0/+39
| | | | | | | | | This test case demonstrates that common subexpression elimination takes place between code sequences for materialising constants. In particular, it demonstrates that redundant lui aren't generated. This would capture a regression if applying a patch such as D41949. llvm-svn: 330291
* [NFC] test case clean upLei Huang2018-04-181-54/+14
| | | | | | | 1. remove redundant tests 2. update XForm_tests to generated expected code gen llvm-svn: 330290
* [RISCV] Expand codegen -> compression sanity checks and move to a single fileAlex Bradbury2018-04-184-85/+169
| | | | | | | | | | The objdump tests interfere with update_llc_test_checks.py and can't be automatically update them. Put the sanitify check for compression on the codegen codepath into a separate file, and expand it to also include tests of integer materialisation. This would catch changes such as those triggered by D41949. llvm-svn: 330288
* Revert "[RISCV] implement li pseudo instruction"Alex Bradbury2018-04-185-104/+127
| | | | | | | | | Reverts rL330224, while issues with the C extension and missed common subexpression elimination opportunities are addressed. Neither of these issues are visible in current RISC-V backend unit tests, which clearly need expanding. llvm-svn: 330281
* [Power9]Legalize and emit code for converting Unsigned HWord/Char to ↵Lei Huang2018-04-181-0/+141
| | | | | | | | | | | | | | | | Quad-Precision Legalize and emit code for converting unsigned HWord/Char to QP: xscvsdqp xscvudqp Only covering patterns for unsigned forms cause we don't have part-word sign-extending integer loads into VSX registers. Differential Revision: https://reviews.llvm.org/D45494 llvm-svn: 330278
* [AArch64] Add isel pattern for v8i8->v2f32 NVCASTs.Amara Emerson2018-04-181-0/+16
| | | | | | rdar://39454635 llvm-svn: 330276
* [RISCV] Add specific tests for materialising imm32hi20 constantsAlex Bradbury2018-04-181-0/+16
| | | | | | | i.e. constants that can be materialised with a single lui, as the lower 12 bits are zero. llvm-svn: 330274
* [Power9]Legalize and emit code for converting (Un)Signed Word to Quad-PrecisionLei Huang2018-04-181-1/+160
| | | | | | | | | | | Legalize and emit code for converting (Un)Signed Word to quad-precision via: xscvsdqp xscvudqp Differential Revision: https://reviews.llvm.org/D45389 llvm-svn: 330273
* [x86] Switch EFLAGS copy lowering to use reg-reg form of testing forChandler Carruth2018-04-186-31/+31
| | | | | | | | | | | | | | | | a zero register. Previously I tried this and saw LLVM unable to transform this to fold with memory operands such as spill slot rematerialization. However, it clearly works as shown in this patch. We turn these into `cmpb $0, <mem>` when useful for folding a memory operand without issue. This form has no disadvantage compared to `testb $-1, <mem>`. So overall, this is likely no worse and may be slightly smaller in some cases due to the `testb %reg, %reg` form. Differential Revision: https://reviews.llvm.org/D45475 llvm-svn: 330269
* [x86] Fix PR37100 by teaching the EFLAGS copy lowering to rewrite usesChandler Carruth2018-04-183-0/+110
| | | | | | | | | | | | | | | | | | | | | | across basic blocks in the limited cases where it is very straight forward to do so. This will also be useful for other places where we do some limited EFLAGS propagation across CFG edges and need to handle copy rewrites afterward. I think this is rapidly approaching the maximum we can and should be doing here. Everything else begins to require either heroic analysis to prove how to do PHI insertion manually, or somehow managing arbitrary PHI-ing of EFLAGS with general PHI insertion. Neither of these seem at all promising so if those cases come up, we'll almost certainly need to rewrite the parts of LLVM that produce those patterns. We do now require dominator trees in order to reliably diagnose patterns that would require PHI nodes. This is a bit unfortunate but it seems better than the completely mysterious crash we would get otherwise. Differential Revision: https://reviews.llvm.org/D45673 llvm-svn: 330264
* [AMDGPU] Fix issues for backend divergence trackingDavid Stuttard2018-04-182-0/+64
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: A change to use divergence analysis in the AMDGPU backend was getting formal arguments incorrect (not tagged as divergent) unless they were VGPR0, VGPR1 or VGPR2 For graphics shaders it is possible to have more than these passed in as VGPR Modified the checking code to check for any VGPR registers passed in as formal arguments. Also, some intrinsics that are sources of divergence may have been lowered during instruction selection and are missed on subsequent calls to isSDNodeSourceOfDivergence - added the relevant AMDGPUISD checks as well. Finally, the FunctionLoweringInfo tracks virtual registers that are live across basic block boundaries. This is used to check for divergence of CopyFromRegister registers using the DivergenceAnalysis analysis. For multiple blocks the lazily evaluated inverted map VirtReg2Value was not cleared when the ValueMap map was. Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D45372 Change-Id: I112f3bd6dfe0f62e63ce9b43b893982778e4bee3 llvm-svn: 330257
* Add tests for shrink wrapping and VLAsMomchil Velikov2018-04-182-0/+189
| | | | | | Differential revision: https://reviews.llvm.org/D45727 llvm-svn: 330253
* [X86] Give CMOV 2 cycle latency on SLM.Craig Topper2018-04-181-180/+180
| | | | llvm-svn: 330239
* [X86] Don't crash on bad operand modifiers in inline assemblyCraig Topper2018-04-181-0/+8
| | | | | | | | | | | | | | Summary: Previously if a modifer was placed on a non-GPR register class we would hit an assert or crash. Reviewers: echristo Reviewed By: echristo Subscribers: eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D45751 llvm-svn: 330238
* [AMDGPU] Enabled v2.16 literals for VOP3PStanislav Mekhanoshin2018-04-1712-56/+49
| | | | | | | | Literal encoding needs op_sel_hi to select low 16 bit in this case. Differential Revision: https://reviews.llvm.org/D45745 llvm-svn: 330230
* [RISCV] implement li pseudo instructionAlex Bradbury2018-04-174-125/+104
| | | | | | | | | | | | | | The implementation follows the MIPS backend and expands the pseudo instruction directly during asm parsing. As the result, only real MC instructions are emitted to the MCStreamer. Additionally, PseudoLI instructions are emitted during codegen. The actual expansion to real instructions is performed during MI to MC lowering and is similar to the expansion performed by the GNU Assembler. Differential Revision: https://reviews.llvm.org/D41949 Patch by Mario Werner. llvm-svn: 330224
* LoadStoreVectorizer crashes due to unsized typeStanislav Mekhanoshin2018-04-171-0/+16
| | | | | | | | | When we skip bitcasts while looking for GEP in LoadSoreVectorizer we should also verify that the type is sized otherwise we assert Differential Revision: https://reviews.llvm.org/D45709 llvm-svn: 330221
* [XRay] Typed event logging intrinsicKeith Wyss2018-04-172-2/+47
| | | | | | | | | | | | | | | | | | | | | Summary: Add an LLVM intrinsic for type discriminated event logging with XRay. Similar to the existing intrinsic for custom events, but also accepts a type tag argument to allow plugins to be aware of different types and semantically interpret logged events they know about without choking on those they don't. Relies on a symbol defined in compiler-rt patch D43668. I may wait to submit before I can see demo everything working together including a still to come clang patch. Reviewers: dberris, pelikan, eizan, rSerge, timshen Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45633 llvm-svn: 330219
* [WebAssembly] Teach fast-isel to gracefully recover from illegal return types.Dan Gohman2018-04-171-0/+16
| | | | | | Fixes PR36564. llvm-svn: 330215
* [X86] Add separate scheduling class for PSADBW instruction.Craig Topper2018-04-173-23/+19
| | | | llvm-svn: 330204
* [X86] Remove -mcpu=skx/knl from some tests and use -mattr instead.Craig Topper2018-04-173-33/+34
| | | | | | mcpu exposes other tuning flags. These tests are only trying to test instruction set features so it is better to use mattr. llvm-svn: 330196
* Revert "Fix incorrect choice of callee-saved registers save/restore points ↵Momchil Velikov2018-04-171-114/+0
| | | | | | | | | (take 2)" Revert in order to fix the test to not run when required targets aren't configured. llvm-svn: 330193
OpenPOWER on IntegriCloud