path: root/llvm/test/CodeGen
Commit message | Author | Age | Files | Lines
* [Power9] Legalize and emit code for converting (Un)Signed Word to Quad-Precision | Lei Huang | 2018-04-18 | 1 | -1/+160
  Legalize and emit code for converting (Un)Signed Word to quad-precision via:
    xscvsdqp
    xscvudqp
  Differential Revision: https://reviews.llvm.org/D45389
  llvm-svn: 330273
* [x86] Switch EFLAGS copy lowering to use reg-reg form of testing for a zero register. | Chandler Carruth | 2018-04-18 | 6 | -31/+31
  Previously I tried this and saw LLVM unable to transform this to fold with memory operands such as spill slot rematerialization. However, it clearly works as shown in this patch. We turn these into `cmpb $0, <mem>` when useful for folding a memory operand without issue. This form has no disadvantage compared to `testb $-1, <mem>`. So overall, this is likely no worse and may be slightly smaller in some cases due to the `testb %reg, %reg` form.
  Differential Revision: https://reviews.llvm.org/D45475
  llvm-svn: 330269
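To illustrate why the three instruction forms named in this commit can stand in for one another when checking for zero, here is a minimal sketch that models only the zero flag (ZF) for 8-bit operands. The function names are hypothetical and this is not LLVM code; it assumes the usual x86 semantics where `test` ANDs its operands and `cmp` subtracts them.

```python
def zf_test(a: int, b: int) -> bool:
    """ZF after `testb a, b`: set when the 8-bit AND of the operands is zero."""
    return ((a & b) & 0xFF) == 0


def zf_cmp(imm: int, value: int) -> bool:
    """ZF after `cmpb $imm, value`: set when value - imm is zero modulo 256."""
    return ((value - imm) & 0xFF) == 0


def all_forms_agree() -> bool:
    """Check every 8-bit value: all three forms set ZF exactly when it is zero."""
    for v in range(256):
        reg_reg = zf_test(v, v)            # testb %reg, %reg
        cmp_zero = zf_cmp(0, v)            # cmpb $0, <mem>
        test_all_ones = zf_test(0xFF, v)   # testb $-1, <mem>
        if not (reg_reg == cmp_zero == test_all_ones == (v == 0)):
            return False
    return True
```

Only ZF is modelled here; the other flags these instructions write are outside the scope of the sketch.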
* [x86] Fix PR37100 by teaching the EFLAGS copy lowering to rewrite uses across basic blocks in the limited cases where it is very straightforward to do so. | Chandler Carruth | 2018-04-18 | 3 | -0/+110
  This will also be useful for other places where we do some limited EFLAGS propagation across CFG edges and need to handle copy rewrites afterward. I think this is rapidly approaching the maximum we can and should be doing here. Everything else begins to require either heroic analysis to prove how to do PHI insertion manually, or somehow managing arbitrary PHI-ing of EFLAGS with general PHI insertion. Neither of these seems at all promising, so if those cases come up, we'll almost certainly need to rewrite the parts of LLVM that produce those patterns.
  We do now require dominator trees in order to reliably diagnose patterns that would require PHI nodes. This is a bit unfortunate but it seems better than the completely mysterious crash we would get otherwise.
  Differential Revision: https://reviews.llvm.org/D45673
  llvm-svn: 330264
* [AMDGPU] Fix issues for backend divergence tracking | David Stuttard | 2018-04-18 | 2 | -0/+64
  Summary:
  A change to use divergence analysis in the AMDGPU backend was getting formal arguments incorrect (not tagged as divergent) unless they were VGPR0, VGPR1 or VGPR2. For graphics shaders it is possible to have more than these passed in as VGPRs. Modified the checking code to check for any VGPR registers passed in as formal arguments.
  Also, some intrinsics that are sources of divergence may have been lowered during instruction selection and are missed on subsequent calls to isSDNodeSourceOfDivergence. Added the relevant AMDGPUISD checks as well.
  Finally, the FunctionLoweringInfo tracks virtual registers that are live across basic block boundaries. This is used to check for divergence of CopyFromRegister registers using the DivergenceAnalysis analysis. For multiple blocks, the lazily evaluated inverted map VirtReg2Value was not cleared when the ValueMap map was.
  Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tpr, t-tye, llvm-commits
  Differential Revision: https://reviews.llvm.org/D45372
  Change-Id: I112f3bd6dfe0f62e63ce9b43b893982778e4bee3
  llvm-svn: 330257
* Add tests for shrink wrapping and VLAs | Momchil Velikov | 2018-04-18 | 2 | -0/+189
  Differential revision: https://reviews.llvm.org/D45727
  llvm-svn: 330253
* [X86] Give CMOV 2 cycle latency on SLM. | Craig Topper | 2018-04-18 | 1 | -180/+180
  llvm-svn: 330239
* [X86] Don't crash on bad operand modifiers in inline assembly | Craig Topper | 2018-04-18 | 1 | -0/+8
  Summary: Previously, if a modifier was placed on a non-GPR register class, we would hit an assert or crash.
  Reviewers: echristo
  Reviewed By: echristo
  Subscribers: eraman, llvm-commits
  Differential Revision: https://reviews.llvm.org/D45751
  llvm-svn: 330238
* [AMDGPU] Enabled v2.16 literals for VOP3P | Stanislav Mekhanoshin | 2018-04-17 | 12 | -56/+49
  Literal encoding needs op_sel_hi to select the low 16 bits in this case.
  Differential Revision: https://reviews.llvm.org/D45745
  llvm-svn: 330230
* [RISCV] implement li pseudo instruction | Alex Bradbury | 2018-04-17 | 4 | -125/+104
  The implementation follows the MIPS backend and expands the pseudo instruction directly during asm parsing. As a result, only real MC instructions are emitted to the MCStreamer. Additionally, PseudoLI instructions are emitted during codegen. The actual expansion to real instructions is performed during MI to MC lowering and is similar to the expansion performed by the GNU Assembler.
  Differential Revision: https://reviews.llvm.org/D41949
  Patch by Mario Werner.
  llvm-svn: 330224
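For readers unfamiliar with how an `li` pseudo instruction is typically expanded on RV32, the sketch below shows the common `lui` + `addi` scheme. It is an illustration under stated assumptions, not the code from this patch: the function names are hypothetical, and the exact sequences chosen by LLVM or the GNU Assembler may differ.

```python
from typing import List


def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def expand_li_rv32(rd: str, imm: int) -> List[str]:
    """Expand `li rd, imm` into real RV32I instructions (textual form)."""
    imm = sign_extend(imm, 32)
    # Small immediates fit in the 12-bit signed field of a single addi.
    if -2048 <= imm <= 2047:
        return [f"addi {rd}, zero, {imm}"]
    # addi sign-extends its 12-bit immediate, so the upper 20 bits are
    # computed from (imm - lo12) to compensate for a negative low part.
    lo12 = sign_extend(imm, 12)
    hi20 = ((imm - lo12) >> 12) & 0xFFFFF
    insts = [f"lui {rd}, {hi20}"]
    if lo12 != 0:
        insts.append(f"addi {rd}, {rd}, {lo12}")
    return insts
```

For example, `expand_li_rv32("a0", 0x12345FFF)` rounds the upper part up and uses `addi a0, a0, -1` for the low part, which is the compensation step described in the comment.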
* LoadStoreVectorizer crashes due to unsized type | Stanislav Mekhanoshin | 2018-04-17 | 1 | -0/+16
  When we skip bitcasts while looking for a GEP in LoadStoreVectorizer, we should also verify that the type is sized; otherwise we assert.
  Differential Revision: https://reviews.llvm.org/D45709
  llvm-svn: 330221
* [XRay] Typed event logging intrinsic | Keith Wyss | 2018-04-17 | 2 | -2/+47
  Summary:
  Add an LLVM intrinsic for type discriminated event logging with XRay. Similar to the existing intrinsic for custom events, but also accepts a type tag argument to allow plugins to be aware of different types and semantically interpret logged events they know about without choking on those they don't.
  Relies on a symbol defined in compiler-rt patch D43668. I may wait to submit until I can demo everything working together, including a still-to-come clang patch.
  Reviewers: dberris, pelikan, eizan, rSerge, timshen
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D45633
  llvm-svn: 330219
* [WebAssembly] Teach fast-isel to gracefully recover from illegal return types. | Dan Gohman | 2018-04-17 | 1 | -0/+16
  Fixes PR36564.
  llvm-svn: 330215
* [X86] Add separate scheduling class for PSADBW instruction. | Craig Topper | 2018-04-17 | 3 | -23/+19
  llvm-svn: 330204
* [X86] Remove -mcpu=skx/knl from some tests and use -mattr instead. | Craig Topper | 2018-04-17 | 3 | -33/+34
  mcpu exposes other tuning flags. These tests are only trying to test instruction set features, so it is better to use mattr.
  llvm-svn: 330196
* Revert "Fix incorrect choice of callee-saved registers save/restore points (take 2)" | Momchil Velikov | 2018-04-17 | 1 | -114/+0
  Revert in order to fix the test to not run when required targets aren't configured.
  llvm-svn: 330193
* Fix incorrect choice of callee-saved registers save/restore points (take 2) | Momchil Velikov | 2018-04-17 | 1 | -0/+114
  Add the accidentally omitted testcase.
  Differential revision: https://reviews.llvm.org/D45524
  llvm-svn: 330192
* [Hexagon] Do not merge initializers for stack and non-stack expressions | Krzysztof Parzyszek | 2018-04-17 | 1 | -0/+35
  Stack addressing needs addressing modes that provide an offset field immediately following the frame index. An initializer from a non-stack expression could force the stack address to use a form that does not provide an offset field.
  llvm-svn: 330191
* [PowerPC] Mark the BDNZ intrinsic as NoDuplicate | Nemanja Ivanovic | 2018-04-17 | 1 | -0/+75
  Duplicating this intrinsic is not generally valid because it has the side-effect of decrementing the CTR. Any passes that duplicate it would need to be taught to keep the regions formed completely disjoint.
  This patch should be NFC for typical uses as CTRLoops runs after the remaining loop passes. It only affects situations where the loop passes are scheduled on the IR after the codegen passes (as is the case with some JIT pipelines).
  Fixes https://bugs.llvm.org/show_bug.cgi?id=37050
  llvm-svn: 330186
* [X86] Add FP comparison scheduler classes | Simon Pilgrim | 2018-04-17 | 1 | -12/+12
  Split VCMP/VMAX/VMIN instructions off to WriteFCmp and VCOMIS instructions off to WriteFCom instead of assuming they match WriteFAdd.
  Differential Revision: https://reviews.llvm.org/D45656
  llvm-svn: 330179
* [DAGCombiner] Fix for oss-fuzz bug | Gerolf Hoflehner | 2018-04-17 | 1 | -0/+53
  llvm-svn: 330178
* [MIR-Canon] Fixing a test failure caused by COPY Folding. | Puyan Lotfi | 2018-04-16 | 1 | -3/+1
  llvm-svn: 330115
* [MIR-Canon] Adding ISA-Agnostic COPY Folding. | Puyan Lotfi | 2018-04-16 | 1 | -0/+45
  Transforms the following:
    %vreg1234:gpr32 = COPY %42
    %vreg1235:gpr32 = COPY %vreg1234
    %vreg1236:gpr32 = COPY %vreg1235
    $w0 = COPY %vreg1236
  into:
    $w0 = COPY %42
  Assuming %42 is also a gpr32
  llvm-svn: 330113
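The COPY chain in this commit message can be folded with a simple forward walk. The sketch below is a toy model, not the MIR-Canon implementation: instructions are plain `(dest, src)` tuples, the `fold_copies` name is hypothetical, and it assumes every intermediate `%vreg` has no other uses and shares the register class of the original source.

```python
def fold_copies(instrs):
    """Fold chains of COPYs given as (dest, src) pairs in program order."""
    source_of = {}   # maps an intermediate vreg to the root value it copies
    folded = []
    for dest, src in instrs:
        # Look through any earlier COPY that defined src.
        root = source_of.get(src, src)
        if dest.startswith("%vreg"):
            # Intermediate copy: remember its root and drop the instruction.
            source_of[dest] = root
        else:
            # Copy into a physical register: keep it, reading from the root.
            folded.append((dest, root))
    return folded


chain = [
    ("%vreg1234", "%42"),
    ("%vreg1235", "%vreg1234"),
    ("%vreg1236", "%vreg1235"),
    ("$w0", "%vreg1236"),
]
```

Calling `fold_copies(chain)` leaves the single copy from `%42` into `$w0`. A real pass would also have to check register classes and remaining uses before deleting anything.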
* [X86] Introduce archs: goldmont-plus & tremont | Gabor Buella | 2018-04-16 | 1 | -0/+2
  Using Goldmont's cost tables for these two upcoming atom archs.
  Reviewers: craig.topper
  Reviewed By: craig.topper
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D45612
  llvm-svn: 330109
* [DAGCombiner, PowerPC] allow X - (fpext(-Y)) --> X + fpext(Y) with multiple uses | Sanjay Patel | 2018-04-15 | 1 | -4/+4
  This is a transform that I limited in instcombine in rL329821 because it was creating more instructions in IR when the cast has multiple uses. But if the cast is free, then we can do the transform regardless of other uses because it improves the potential throughput of the calculation by removing a dependency on the fneg.
  Differential Revision: https://reviews.llvm.org/D45598
  llvm-svn: 330098
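As a quick numeric sanity check of the identity behind this transform, the sketch below compares `X - fpext(-Y)` with `X + fpext(Y)` using single-precision `Y` and double-precision `X`. The helper names are hypothetical; the float-to-double extension is exact, so both expressions are expected to agree.

```python
import struct


def to_float32(value: float) -> float:
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", value))[0]


def fpext(value32: float) -> float:
    """Extend float to double; exact, and Python floats are already doubles."""
    return float(value32)


def before(x: float, y: float) -> float:
    """The original form: X - fpext(-Y)."""
    return x - fpext(-to_float32(y))


def after(x: float, y: float) -> float:
    """The rewritten form: X + fpext(Y)."""
    return x + fpext(to_float32(y))
```

The check says nothing about instruction counts or throughput, which are the actual motivation given in the commit message.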
* [X86] Tests for unsigned saturation downconvert detection. | Artur Gainullin | 2018-04-14 | 1 | -4/+32
  One more test for smax(smin(x, C2), C1) pattern.
  llvm-svn: 330090
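To show why a `smax(smin(x, C2), C1)` pattern followed by a truncation can be treated as an unsigned saturating downconvert, here is a small sketch. The commit does not state the constants; the sketch assumes `C1 = 0` and `C2 = 2^bits - 1`, and all function names are hypothetical.

```python
def smin(a: int, b: int) -> int:
    """Signed minimum."""
    return a if a < b else b


def smax(a: int, b: int) -> int:
    """Signed maximum."""
    return a if a > b else b


def clamp_then_truncate(x: int, bits: int = 8) -> int:
    """smax(smin(x, C2), C1) followed by truncation to `bits` bits."""
    c1, c2 = 0, (1 << bits) - 1   # assumed constants, not taken from the commit
    return smax(smin(x, c2), c1) & ((1 << bits) - 1)


def unsigned_saturate(x: int, bits: int = 8) -> int:
    """Reference: saturate a signed value into the unsigned `bits`-bit range."""
    limit = (1 << bits) - 1
    if x < 0:
        return 0
    return limit if x > limit else x
```

With those constants the two functions agree for every input, which is the property a pattern matcher would rely on.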
* [X86] Tests for unsigned saturation downconvert detection. | Artur Gainullin | 2018-04-14 | 1 | -0/+310
  llvm-svn: 330088
* [X86][MMX] Set PAVG/PHADD/PMIN/PMAX/PSIGN instructions to use same scheduler classes as SSE/AVX | Simon Pilgrim | 2018-04-14 | 1 | -78/+78
  llvm-svn: 330085
* MachO: trap unreachable instructions | Tim Northover | 2018-04-13 | 10 | -7/+35
  Debuggability is more important than saving 4 bytes to let us fall through to nonsense.
  llvm-svn: 330073
* Revert r329956, "AArch64: Introduce a DAG combine for folding offsets into addresses." | Peter Collingbourne | 2018-04-13 | 6 | -172/+68
  Caused a hang and eventually an assertion failure in LTO builds of 7zip-benchmark on aarch64 iOS targets.
  http://green.lab.llvm.org/green/job/lnt-ctmark-aarch64-O3-flto/2024/
  llvm-svn: 330063
* [mips] Materialize constants for multiplication | Simon Dardis | 2018-04-13 | 1 | -223/+64
  Previously, the MIPS backend would always break down constant multiplications into a series of shifts, adds, and subs. This patch changes that so the cost of doing so is estimated.
  The cost is estimated against worst case constant materialization and retrieving the results from the HI/LO registers.
  For cases where the value type of the multiplication is not legal, the cost of legalization is estimated and is accounted for before performing the optimization of breaking down the constant.
  This resolves PR36884.
  Thanks to npl for reporting the issue!
  Reviewers: abeserminji, smaksimovic
  Differential Revision: https://reviews.llvm.org/D45316
  llvm-svn: 330037
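The trade-off this commit describes can be sketched with a toy cost model: decompose a multiplication by a positive constant into shifts and adds, count the operations, and only prefer the decomposition when it beats an assumed cost for materializing the constant and multiplying. The function names and the `mul_cost` value are hypothetical and do not reflect the heuristic in the patch, which also considers subs and legalization.

```python
def shift_amounts(c: int):
    """Shift amounts whose shifted copies of x sum to x * c, for c > 0."""
    return [bit for bit in range(c.bit_length()) if (c >> bit) & 1]


def multiply_by_shifts(x: int, c: int) -> int:
    """Compute x * c using only shifts and adds."""
    return sum(x << s for s in shift_amounts(c))


def should_decompose(c: int, mul_cost: int = 4) -> bool:
    """Prefer shifts/adds only when they are cheaper than the assumed mul cost."""
    shifts = shift_amounts(c)
    # One shift per nonzero shift amount, plus one add per extra term.
    cost = sum(1 for s in shifts if s) + (len(shifts) - 1)
    return cost < mul_cost
```

Under this model a power of two such as 8 is always decomposed (one shift), while a constant with many set bits such as 0b10101011 is left as a real multiplication.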
* [ARM] FP16 vmaxnm/vminnm scalar instructions | Sjoerd Meijer | 2018-04-13 | 3 | -1/+790
  This adds code generation support for the FP16 vmaxnm/vminnm scalar instructions.
  Differential Revision: https://reviews.llvm.org/D44675
  llvm-svn: 330034
* [X86][AVX512] UNPCKL/H PS and PD should be scheduled with WriteFShuffle not WriteFAdd | Simon Pilgrim | 2018-04-13 | 1 | -192/+192
  llvm-svn: 330023
* [PostRASink] Add register dependency check for implicit operands | Jun Bum Lim | 2018-04-13 | 1 | -0/+63
  Summary:
  This change extends the register dependency check for implicit operands in Copy instructions. Fixes PR36902.
  Reviewers: thegameg, sebpop, uweigand, jnspaulsson, gberry, mcrosier, qcolombet, MatzeB
  Reviewed By: thegameg
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D44958
  llvm-svn: 330018
* [NEON] Support intrinsic for scalar and vector versions of the VRINTN instruction | Ivan A. Kosarev | 2018-04-13 | 1 | -0/+11
  Differential Revision: https://reviews.llvm.org/D45514
  llvm-svn: 330011
* [NFC] fix trivial typos in comments | Hiroshi Inoue | 2018-04-13 | 2 | -2/+2
  "the the" -> "the", "we we" -> "we", etc
  llvm-svn: 330006
* [X86] Introduce cldemote instruction | Gabor Buella | 2018-04-13 | 1 | -0/+21
  Hint to hardware to move the cache line containing the address to a more distant level of the cache without writing back to memory.
  Reviewers: craig.topper, zvi
  Reviewed By: craig.topper
  Differential Revision: https://reviews.llvm.org/D45256
  llvm-svn: 329992
* [X86] Remove the pmuldq/pmuludq intrinsics and replace with native IR. | Craig Topper | 2018-04-13 | 11 | -1643/+2553
  This completes the work started in r329604 and r329605 when we changed clang to no longer use the intrinsics.
  We lost some InstCombine SimplifyDemandedBit optimizations through this change as we aren't able to fold 'and', bitcast, shuffle very well.
  llvm-svn: 329990
* [PowerPC] add fsub-fneg test; NFC | Sanjay Patel | 2018-04-12 | 1 | -0/+21
  This is a test for a transform that was suggested in the post-commit mailing list thread for rL329821. The target in question is not in trunk, so PPC gets to stand in for it because it's the only in-tree target that sets 'isFPExtFree()' to 'true'.
  llvm-svn: 329963
* AArch64: Introduce a DAG combine for folding offsets into addresses. | Peter Collingbourne | 2018-04-12 | 6 | -68/+172
  This is a code size win in code that frequently takes offset addresses, such as C++ constructors that typically need to compute an offset address of a vtable. This reduces the size of Chromium for Android's .text section by 108KB.
  Differential Revision: https://reviews.llvm.org/D45199
  llvm-svn: 329956
* [X86] Introduce LLVM wbinvd intrinsic | Gabor Buella | 2018-04-12 | 1 | -0/+19
  A previously missing intrinsic for an old instruction.
  Reviewers: craig.topper, echristo
  Reviewed By: craig.topper
  Differential Revision: https://reviews.llvm.org/D45312
  llvm-svn: 329936
* [Power9] Legalize and emit code for converting (Un)Signed DWord to Quad-Precision | Lei Huang | 2018-04-12 | 1 | -0/+140
  Legalize and emit code for:
    * xscvsdqp
    * xscvudqp
  Differential Revision: https://reviews.llvm.org/D45230
  llvm-svn: 329931
* [AArch64] Move AFI->setRedZone(false) to top of emitPrologue | Jessica Paquette | 2018-04-12 | 1 | -5/+35
  AFI->setRedZone(false) was put in the wrong place before, and so it only fired on functions that didn't have stack frames. This moves that to the top of emitPrologue to make sure that every function without a redzone has it set correctly.
  This also adds a function representing one of the early exit cases (GHC calling convention) to the MachineOutliner noredzone test to ensure that we can outline from functions like these, where we never use a redzone.
  llvm-svn: 329922
* revert r328921 - [DAGCombine] (float)((int) f) --> ftrunc (PR36617) | Sanjay Patel | 2018-04-12 | 8 | -83/+330
  This change is exposing UB in source code - as was warned/predicted. :)
  See D44909 for discussion. Reverting while we figure out how to fix things.
  llvm-svn: 329920
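One way to picture how the reverted fold can change observable behaviour: for in-range inputs, `(float)((int) f)` and `ftrunc(f)` agree, but for out-of-range inputs the C cast is undefined while `ftrunc` is well defined. The sketch below is a toy model with hypothetical names. It assumes the cast yields `INT32_MIN` when the value is out of range, which is what some hardware conversion instructions produce, purely for illustration. It ignores NaN, infinities and signed zero.

```python
import math

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def cast_roundtrip_model(f: float) -> float:
    """Model of (float)((int) f) with an assumed INT32_MIN out-of-range result."""
    truncated = math.trunc(f)
    if not INT32_MIN <= truncated <= INT32_MAX:
        truncated = INT32_MIN   # assumption; the C result is undefined here
    return float(truncated)


def ftrunc_model(f: float) -> float:
    """Model of ftrunc: round toward zero without any integer range limit."""
    return float(math.trunc(f))
```

For example, both models return 3.0 for 3.7, but for 1e10 the cast model returns -2147483648.0 while the `ftrunc` model returns 1e10, so code that relied on the old result sees a difference.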
* [RISCV] Change function alignment to 4 bytes, and 2 bytes for RVC | Shiva Chen | 2018-04-12 | 1 | -0/+13
  Summary:
  According to the RISC-V ELF psABI specification, the base RV32 and RV64 ISAs only allow 32-bit instruction alignment, but instructions are allowed to be aligned to 16-bit boundaries with the C extension. So aligning to 4 bytes, or to 2 bytes with the C extension, is enough.
  Reviewers: asb, apazos
  Differential Revision: https://reviews.llvm.org/D45560
  Patch by Kito Cheng.
  llvm-svn: 329899
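The alignment rule stated in this commit can be written down in a few lines. This is a minimal sketch with hypothetical function names, not the backend code: it only encodes "4 bytes by default, 2 bytes when the C extension is enabled" and shows how an address would be rounded up to that boundary.

```python
def function_alignment(has_c_extension: bool) -> int:
    """Function alignment in bytes: 2 with the C extension (RVC), else 4."""
    return 2 if has_c_extension else 4


def align_up(address: int, alignment: int) -> int:
    """Round address up to the next multiple of a power-of-two alignment."""
    return (address + alignment - 1) & ~(alignment - 1)
```

For instance, a function that would otherwise start at address 0x1002 stays at 0x1002 with RVC enabled and moves to 0x1004 without it.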
* [MIPS GlobalISel] minor update to MIR tests added in r329819 | Petar Jovanovic | 2018-04-12 | 3 | -12/+0
  Remove 'registers' section, as suggested by D. Sanders in code review https://reviews.llvm.org/D44304
  llvm-svn: 329888
* [NFC] fix trivial typos in documents and comments | Hiroshi Inoue | 2018-04-12 | 1 | -1/+1
  "is is" -> "is", "if if" -> "if", "or or" -> "or"
  llvm-svn: 329878
* [RISCV] Codegen support for RV32D floating point comparison operations | Alex Bradbury | 2018-04-12 | 4 | -0/+1327
  Also add double-previous-failure.ll, which captures a test case that at one point triggered a compiler crash while developing calling convention support for f64 on RV32D with soft-float ABI.
  llvm-svn: 329877
* [RISCV] Codegen support for RV32D floating point conversion operations | Alex Bradbury | 2018-04-12 | 2 | -0/+106
  This also includes support and a test for truncating stores, which are now possible thanks to the fpround pattern.
  llvm-svn: 329876
* [RISCV] Add codegen support for RV32D floating point arithmetic operations | Alex Bradbury | 2018-04-12 | 1 | -0/+256
  llvm-svn: 329874
* [RISCV] Add tests missed in r329871 | Alex Bradbury | 2018-04-12 | 6 | -0/+464
  llvm-svn: 329872