path: root/llvm/test/CodeGen
Commit message (Author, Age, Files, Lines)
* [X86] PMOV*X* tests - remove unnecessary mcpu arguments and regenerate (Simon Pilgrim, 2015-10-25, 1 file, -38/+110)
  llvm-svn: 251230
* [X86] Stack folding tests - just use mtriple - no need for mcpu in these tests (Simon Pilgrim, 2015-10-25, 7 files, -7/+7)
  llvm-svn: 251229
* [X86] Use correct calling convention for MCU psABI libcalls (Michael Kuperstein, 2015-10-25, 1 file, -0/+11)
  When using the MCU psABI, compiler-generated library calls should pass some parameters in-register. However, since inreg marking for x86 is currently done by the front end, it will not be applied to backend-generated calls.
  This is a workaround for PR3997, which describes a similar issue for -mregparm.
  Differential Revision: http://reviews.llvm.org/D13977
  llvm-svn: 251223
* [X86][SSE] Use lowerVectorShuffleWithUNPCK instead of custom matches. (Simon Pilgrim, 2015-10-24, 4 files, -24/+24)
  Most 128-bit and 256-bit shuffles were manually matching UNPCK patterns - use lowerVectorShuffleWithUNPCK to be more thorough.
  llvm-svn: 251211
* Removed old FIXME - we do generate movddup for SSE3 and higher (Simon Pilgrim, 2015-10-24, 1 file, -2/+1)
  llvm-svn: 251205
* [DAGCombiner] Generalize masking of constant rotates. (Simon Pilgrim, 2015-10-24, 2 files, -58/+32)
  We don't need a mask of a rotation result to be a constant splat - any constant scalar/vector can be usefully folded.
  Followup to D13851.
  llvm-svn: 251197
* X86ISelLowering: Support tail calls to/from callee pop functions (Hans Wennborg, 2015-10-24, 1 file, -10/+159)
  This enables tail calls with thiscall, stdcall, vectorcall and fastcall functions.
  Differential Revision: http://reviews.llvm.org/D13999
  llvm-svn: 251190
* [X86][XOP] Add support for lowering vector rotations (Simon Pilgrim, 2015-10-24, 2 files, -302/+126)
  This patch adds support for lowering to the XOP VPROT / VPROTI vector bit rotation instructions.
  This has required changes to the DAGCombiner rotation pattern matching to support vector types - so far I've only changed it to support splat vectors, but generalising this further is feasible in the future.
  Differential Revision: http://reviews.llvm.org/D13851
  llvm-svn: 251188
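For readers unfamiliar with the rotation pattern mentioned in the commit above, the following is a minimal Python sketch of the shift/or idiom that a rotate stands for, applied lane by lane with one shared ("splat") amount. The function names and the 32-bit lane width are assumptions made for illustration; this is not the DAGCombiner code.

```python
def rotl(x, amount, bits=32):
    """Rotate-left written as the shift/or idiom: (x << c) | (x >> (bits - c))."""
    amount %= bits
    mask = (1 << bits) - 1
    return ((x << amount) | (x >> (bits - amount))) & mask

def rotl_splat(lanes, amount, bits=32):
    """Rotate every lane by the same amount, i.e. a splat rotation amount."""
    return [rotl(lane, amount, bits) for lane in lanes]

print([hex(v) for v in rotl_splat([0x12345678, 0x80000001], 4)])
```

With a splat amount, every lane uses the same count, which is the only case the commit message says is handled so far.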
* [X86] Clean up the tail call eligibility logic (Reid Kleckner, 2015-10-23, 1 file, -0/+40)
  Summary: The logic here isn't straightforward because of our support for TargetOptions::GuaranteedTailCallOpt.
  Also fix a bug where we were allowing tail calls to cdecl functions from fastcall and vectorcall functions. We were special casing thiscall and stdcall callers rather than checking for any convention that requires clearing stack arguments before returning.
  Reviewers: hans
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D14024
  llvm-svn: 251137
* [ARM] Renaming +t2dsp feature into +dsp, as discussed on llvm-dev (Artyom Skrobov, 2015-10-23, 3 files, -4/+4)
  llvm-svn: 251125
* [ARM CodeGen] @llvm.debugtrap call may be removed when restoring callee saved registers (Oleg Ranevskyy, 2015-10-23, 1 file, -0/+17)
  Summary: When ARMFrameLowering::emitPopInst generates a "pop" instruction to restore the callee saved registers, it checks if the LR register is among them. If so, the function may decide to remove the basic block's terminator and replace it with a "pop" to the PC register instead of LR.
  This leads to a problem when the block's terminator is preceded by a "llvm.debugtrap" call. The MI iterator points to the trap in such a case, which is also a terminator. If the function decides to restore LR to PC, it erroneously removes the trap.
  Reviewers: asl, rengolin
  Subscribers: aemerson, jfb, rengolin, dschuff, llvm-commits
  Differential Revision: http://reviews.llvm.org/D13672
  llvm-svn: 251123
* [CodeGen] Mark setjmp/catchret MBBs address-taken (Joseph Tremoulet, 2015-10-23, 4 files, -10/+88)
  Summary: This ensures that BranchFolding (and similar) won't remove these blocks.
  Also allow AsmPrinter::EmitBasicBlockStart to process MBBs which are address-taken but do not have BBs that are address-taken, since otherwise its call to getAddrLabelSymbolTableToEmit would fail an assertion on such blocks. I audited the other callers of getAddrLabelSymbolTableToEmit (and getAddrLabelSymbol); they all have BBs known to be address-taken except for the call through getAddrLabelSymbol from WinException::create32bitRef; that call is actually now unreachable, so I've removed it and updated the signature of create32bitRef.
  This fixes PR25168.
  Reviewers: majnemer, andrew.w.kaylor, rnk
  Subscribers: pgavlin, llvm-commits
  Differential Revision: http://reviews.llvm.org/D13774
  llvm-svn: 251113
* Revert "[AArch64]Merge halfword loads into a 32-bit load" (James Molloy, 2015-10-23, 1 file, -49/+0)
  This reverts commit r250719. This introduced a codegen fault in SPEC2000.gcc, when compiled for Cortex-A53.
  llvm-svn: 251108
* [X86] - Catch extra combine opportunities for redundant imuls. (Zia Ansari, 2015-10-22, 1 file, -0/+163)
  When we fold "mul ((add x, c1), c2)" -> "add ((mul x, c2), c1*c2)", we bail if (add x, c1) has multiple users, which would result in an extra add instruction. In such cases, this patch adds a check to see if we can eliminate a multiply instruction in exchange for the extra add.
  I also added the capability of doing the existing optimization with non-splatted vectors (splatted also works).
  Differential Revision: http://reviews.llvm.org/D13740
  llvm-svn: 251028
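The fold in the commit above is an instance of distributivity, which also holds under wrapping integer arithmetic. A minimal Python sketch, assuming 32-bit wrapping integers; the helper names and the sample constants are illustrative and do not come from the patch:

```python
MASK32 = 0xFFFFFFFF  # assumption: model i32 with wrap-around on overflow

def mul_of_add(x, c1, c2):
    """Original form: mul (add x, c1), c2."""
    return (((x + c1) & MASK32) * c2) & MASK32

def add_of_mul(x, c1, c2):
    """Folded form: add (mul x, c2), c1*c2, where c1*c2 is a constant."""
    folded_const = (c1 * c2) & MASK32
    return (((x * c2) & MASK32) + folded_const) & MASK32

# Both forms agree, including at the wrap-around boundaries.
for x in (0, 1, 7, 0x7FFFFFFF, 0xFFFFFFFF):
    assert mul_of_add(x, 5, 12) == add_of_mul(x, 5, 12)
```

The two forms compute the same value; which one is cheaper depends on whether the add or the multiply can be shared with other users, which is the trade-off the commit message describes.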
* Fix incorrect target triple in fp16-promote.ll (Pirama Arumuga Nainar, 2015-10-22, 1 file, -96/+99)
  Summary: Hyphens were missing from the triple, causing it to be parsed incorrectly. This patch updates the triple and makes necessary changes to the expected output.
  Patch is from Vinicius Tinti.
  Reviewers: ab, tinti
  Subscribers: srhines, llvm-commits
  Differential Revision: http://reviews.llvm.org/D13792
  llvm-svn: 251020
* [mips][mips16] Fix typo in FileCheck directive. (Daniel Sanders, 2015-10-22, 1 file, -1/+1)
  llvm-svn: 251019
* [X86][AVX512] extend vcvtph2ps to support xmm/ymm and sae versions (Asaf Badouh, 2015-10-22, 2 files, -0/+76)
  Differential Revision: http://reviews.llvm.org/D13945
  llvm-svn: 251018
* AVX-512: Fixed a bug in select_cc for i1 type (Elena Demikhovsky, 2015-10-22, 1 file, -0/+25)
  Fixed failure:
    LLVM ERROR: Cannot select: t33: i1 = select_cc t25, Constant:i32<0>, t45, t42, seteq:ch
  Added a test.
  Differential Revision: http://reviews.llvm.org/D13943
  llvm-svn: 250996
* WebAssembly: fix more syntax (JF Bastien, 2015-10-22, 3 files, -9/+9)
  br_if shouldn't start with a dot. div and rem went from prefix u/s to suffix.
  llvm-svn: 250972
* Add missing load/store flags to thumb2 instructions. (Pete Cooper, 2015-10-22, 1 file, -1/+1)
  These were the cause of a verifier error when building 7zip with -verify-machineinstrs. Running 'make check' with the verifier triggered the same error on the test here, so I've updated the test to run the verifier on one of its runs instead of adding a new one.
  While looking at this code, there was a stale comment that these instructions were only used for disassembly. This probably used to be the case, but they are now used in the 'ARM load / store optimization pass' too.
  This reapplies r242300, which was reverted in r242428 due to bot failures. Ultimately those failures were spurious and completely unrelated to this commit. I reverted this at the time because it was thought to be at fault.
  llvm-svn: 250969
* AMDGPU: Fix verifier error in SIFoldOperands (Matt Arsenault, 2015-10-21, 1 file, -1/+1)
  There may be other use operands that also need their kill flags cleared.
  This happens in a few tests when SIFoldOperands is moved after PeepholeOptimizer. PeepholeOptimizer rewrites cases that look like:
    %vreg0 = ...
    %vreg1 = COPY %vreg0
    use %vreg1<kill>
    %vreg2 = COPY %vreg0
    use %vreg2<kill>
  to use the earlier source:
    %vreg0 = ...
    use %vreg0
    use %vreg0
  Currently SIFoldOperands sees the copied registers, so there is only one use. So far I haven't managed to come up with a test that currently has multiple uses of a foldable VGPR -> VGPR copy.
  llvm-svn: 250960
* Drop assert that a call with struct return goes to a function with sret attribute. (Joerg Sonnenberger, 2015-10-21, 1 file, -0/+9)
  Clang incorrectly misses it on __muldc3 and friends and the type system doesn't include it properly either.
  llvm-svn: 250938
* [WinEH] Add test for llvm.va.start in catchpad (Reid Kleckner, 2015-10-21, 1 file, -0/+103)
  It already works, but we should have a test for it. This used to be PR23094 in the old model.
  llvm-svn: 250936
* [x86] add test case that shows holes in LEA isel (Sanjay Patel, 2015-10-21, 1 file, -0/+125)
  llvm-svn: 250910
* [mips][mips16] Re-work the inline assembly stubs to work with IAS. NFC. (Daniel Sanders, 2015-10-21, 3 files, -137/+137)
  Summary: Previously, we were inserting an InlineAsm statement for each line of the inline assembly. This works for GAS but it triggers prologue/epilogue emission when IAS is in use. This caused:
    .set noreorder
    .cpload $25
  to be emitted as:
    .set push
    .set reorder
    .set noreorder
    .set pop
    .set push
    .set reorder
    .cpload $25
    .set pop
  which led to assembler errors and caused the test to fail.
  The whitespace-after-comma changes included in this patch are necessary to match the output when IAS is in use.
  Reviewers: vkalintiris
  Subscribers: rkotler, llvm-commits, dsanders
  Differential Revision: http://reviews.llvm.org/D13653
  llvm-svn: 250895
* Masked Load/Store optimization for scalar code (Elena Demikhovsky, 2015-10-21, 1 file, -0/+37)
  When we have to convert masked.load and masked.store to scalar code, we generate a chain of conditional basic blocks. I added an optimization for a constant mask vector.
  Differential Revision: http://reviews.llvm.org/D13855
  llvm-svn: 250893
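To illustrate what the commit above describes, here is a rough Python sketch of a scalarized masked load: masked-off lanes keep the pass-through value, and each lane needs its own test unless the mask is a known constant. All function names are hypothetical, memory is modelled as a plain list, and the sketch does not reflect how the pass is actually implemented.

```python
def masked_load_scalarized(memory, base, mask, passthru):
    """General case: one conditional step per lane (the 'chain of blocks')."""
    result = list(passthru)
    for lane, enabled in enumerate(mask):
        if enabled:  # run-time test, one per lane
            result[lane] = memory[base + lane]
    return result

def masked_load_constant_mask(memory, base, const_mask, passthru):
    """Constant mask: the enabled lanes are known up front, so only
    straight-line loads for those lanes remain."""
    enabled_lanes = [i for i, bit in enumerate(const_mask) if bit]  # decided once
    result = list(passthru)
    for lane in enabled_lanes:  # no per-lane test on the mask
        result[lane] = memory[base + lane]
    return result

memory = [10, 11, 12, 13, 14, 15]
print(masked_load_constant_mask(memory, 1, [1, 0, 0, 1], [0, 0, 0, 0]))
```

Both functions return the same values; the difference being modelled is only whether the lane selection happens per element at run time or once ahead of time.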
* [mips][msa] Remove copy_u.d and move copy_u.w to MSA64. (Daniel Sanders, 2015-10-21, 1 file, -2/+3)
  Summary: The forwards compatibility strategy employed by MIPS is to consider registers to be infinitely sign-extended. Then, on ISAs with a wider register, the results of existing instructions are sign-extended to register width and zero-extended counterparts are added.
  copy_u.w on MSA32 and copy_u.d on MSA64 violate this strategy and we have therefore corrected the MSA specs to fix this.
  We still keep track of sign/zero-extension during legalization but we now match copy_s.[wd] where required.
  No change required to clang since __builtin_msa_copy_u_[wd] will map to copy_s.[wd] where appropriate for the target.
  Reviewers: vkalintiris
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D13472
  llvm-svn: 250887
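The difference between the sign-extending and zero-extending copies discussed in the commit above can be shown with plain integer arithmetic. A minimal sketch, assuming a 32-bit element copied into a 64-bit register; the helper names are made up for illustration and are not MSA intrinsics:

```python
def sign_extend(value, from_bits, to_bits):
    """Widen by replicating the top bit of the narrow value."""
    value &= (1 << from_bits) - 1
    if value & (1 << (from_bits - 1)):
        value -= 1 << from_bits
    return value & ((1 << to_bits) - 1)

def zero_extend(value, from_bits):
    """Widen by filling the upper bits with zeros."""
    return value & ((1 << from_bits) - 1)

element = 0x80000000  # a 32-bit element with its top bit set
print(hex(sign_extend(element, 32, 64)))  # upper 32 bits are all ones
print(hex(zero_extend(element, 32)))      # upper 32 bits are all zeros
```

The two results differ only when the top bit of the element is set, which is the case where the choice between the sign-extending and zero-extending forms matters.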
* Let MachineVerifier be aware of mem-to-mem instructions. (Jonas Paulsson, 2015-10-21, 1 file, -1/+1)
  A mem-to-mem instruction (one that both loads and stores) that stores to an FI cannot pass the verifier, because the verifier thinks it is loading from the FI.
  For the mem-to-mem instruction, do a looser check in visitMachineOperand() and only check liveness at the reg-slot while analyzing a frame index operand.
  Needed to make CodeGen/SystemZ/xor-01.ll pass with -verify-machineinstrs, which now runs with this flag.
  Reviewed by Evan Cheng and Quentin Colombet.
  llvm-svn: 250885
* Tail duplication can mix incompatible registers in phi nodes (Krzysztof Parzyszek, 2015-10-21, 1 file, -0/+28)
  Do not tail duplicate blocks where the successor has a phi node, and the corresponding value in that phi node uses a subregister.
  http://reviews.llvm.org/D13922
  llvm-svn: 250877
* WebAssembly: support imports (JF Bastien, 2015-10-21, 1 file, -0/+21)
  C/C++ code can declare an extern function, which will show up as an import in WebAssembly's output. It's expected that the linker will resolve these, and mark unresolved imports as call_import (I have a patch which does this in wasmate).
  llvm-svn: 250875
* [Hexagon] Bit-based instruction simplification (Krzysztof Parzyszek, 2015-10-20, 6 files, -4/+137)
  Analyze bit patterns of operands and values of instructions to perform various simplifications, dead/redundant code elimination, etc.
  llvm-svn: 250868
* [X86][SSE] Add 256-bit vector bit rotation tests. (Simon Pilgrim, 2015-10-20, 1 file, -0/+1200)
  llvm-svn: 250853
* [SystemZ] Comment fix in test/CodeGen/SystemZ/fp-cmp-05.ll (Jonas Paulsson, 2015-10-20, 1 file, -3/+1)
  llvm-svn: 250828
* Adding support for TargetLoweringBase::LibCall (Artyom Skrobov, 2015-10-20, 1 file, -1/+1)
  Summary: TargetLoweringBase::Expand is defined as "Try to expand this to other ops, otherwise use a libcall." For ISD::UDIV and ISD::SDIV, the choice between the two possibilities was defined in a rather convoluted way:
  - if DIVREM is legal, expand to DIVREM
  - if DIVREM has a custom lowering, expand to DIVREM
  - if DIVREM libcall is defined and a remainder from the same division is computed elsewhere, expand to a DIVREM libcall
  - else, expand to a DIV libcall
  This had the undesirable effect that if both DIV and DIVREM are implemented as libcalls, then ISD::UDIV and ISD::SDIV are expanded to the heavier DIVREM libcall, even when the remainder isn't used.
  The new code adds a new LegalizeAction, TargetLoweringBase::LibCall, so that backends can directly control whether they prefer an expansion or a conversion to a libcall. This makes the generic lowering code even more generic, allowing its reuse in a wider range of target-specific configurations.
  The useful effect is that the ARM backend will now generate a call to __aeabi_{i,u}div rather than __aeabi_{i,u}divmod in cases where it doesn't need the remainder. There's no functional change outside the ARM backend.
  Reviewers: t.p.northover, rengolin
  Subscribers: t.p.northover, llvm-commits, aemerson
  Differential Revision: http://reviews.llvm.org/D13862
  llvm-svn: 250826
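The four-way choice listed in the commit above can be restated as a small decision function. This is only a Python transcription of the bullet list, with hypothetical parameter names and string results; it is not the legalizer code and it does not model the new LibCall action.

```python
def old_div_expansion(divrem_legal, divrem_custom,
                      divrem_libcall_defined, remainder_used_elsewhere):
    """Old choice for ISD::UDIV / ISD::SDIV under 'Expand', as listed above."""
    if divrem_legal:
        return "DIVREM"
    if divrem_custom:
        return "DIVREM"
    if divrem_libcall_defined and remainder_used_elsewhere:
        return "DIVREM libcall"
    return "DIV libcall"

# A custom DIVREM lowering wins even when no remainder is needed.
print(old_div_expansion(False, True, True, False))
```

Written this way, it is easy to see that whether the remainder is used is only consulted in the third rule, after the legality and custom-lowering checks.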
* AVX512: Implemented encoding and intrinsics for VPBROADCASTB/W/D/Q instructions. (Igor Breger, 2015-10-20, 4 files, -10/+262)
  Differential Revision: http://reviews.llvm.org/D13884
  llvm-svn: 250819
* [x86] Fix AVX maskload/store intrinsic prototypes. (Andrea Di Biagio, 2015-10-20, 3 files, -28/+26)
  The mask value type for maskload/maskstore GCC builtins is never a vector of packed floats/doubles.
  This patch fixes the following issues:
  1. The mask argument for builtin_ia32_maskloadpd and builtin_ia32_maskstorepd should be of type llvm_v2i64_ty and not llvm_v2f64_ty.
  2. The mask argument for builtin_ia32_maskloadpd256 and builtin_ia32_maskstorepd256 should be of type llvm_v4i64_ty and not llvm_v4f64_ty.
  3. The mask argument for builtin_ia32_maskloadps and builtin_ia32_maskstoreps should be of type llvm_v4i32_ty and not llvm_v4f32_ty.
  4. The mask argument for builtin_ia32_maskloadps256 and builtin_ia32_maskstoreps256 should be of type llvm_v8i32_ty and not llvm_v8f32_ty.
  Differential Revision: http://reviews.llvm.org/D13776
  llvm-svn: 250817
* AMDGPU: Stop reserving v[254:255] (Matt Arsenault, 2015-10-20, 1 file, -52/+52)
  This wasn't doing anything useful. They weren't explicitly used anywhere, and the RegScavenger ignores reserved registers.
  This for some reason caused a random scheduling change in the test. Getting the check lines to pass is too frustrating, and there's probably not too much value in checking the vector case's operands N times.
  llvm-svn: 250794
* WebAssembly: fix call/return syntax. (JF Bastien, 2015-10-20, 2 files, -3/+13)
  They are now typeless, unlike other operations.
  llvm-svn: 250793
* WebAssembly: fix syntax for br_if. (JF Bastien, 2015-10-20, 1 file, -13/+13)
  llvm-svn: 250777
* Enhance loop rotation with existence of profile data in MachineBlockPlacement pass. (Cong Hou, 2015-10-19, 2 files, -0/+202)
  Currently, in the MachineBlockPlacement pass, the loop is rotated so that the best exit is the last BB in the loop chain, to maximize the fall-through from the loop to outside. With profile data, we can determine the cost in terms of missed fall-through opportunities when rotating a loop chain and select the best rotation. Basically, there are three kinds of cost to consider for each rotation:
  1. The possibly missed fall-through edge (if it exists) from a BB outside the loop to the loop header.
  2. The possibly missed fall-through edges (if they exist) from the loop exits to BBs outside the loop.
  3. The missed fall-through edge (if it exists) from the last BB to the first BB in the loop chain.
  Therefore, the cost for a given rotation is the sum of the costs listed above. We select the rotation with the smallest cost. This is only for PGO mode, when we have more precise edge frequencies.
  Differential revision: http://reviews.llvm.org/D10717
  llvm-svn: 250754
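The three cost terms in the commit above can be turned into a toy search over all rotations of a chain. The sketch below is a simplified model written for illustration: the data layout (block names, frequency dictionaries) and the function name are assumptions, and the real pass works on machine basic blocks and block frequency info rather than on these structures.

```python
def best_rotation(chain, header, entry_freq, exit_freq, edge_freq):
    """Return (cost, rotated_chain) with the smallest total cost.

    chain      -- block names in the current loop chain order
    header     -- name of the loop header
    entry_freq -- frequency of the fall-through edge into the header from outside
    exit_freq  -- dict: block -> frequency of its exit edge out of the loop
    edge_freq  -- dict: (src, dst) -> frequency of an edge inside the loop
    """
    best = None
    for i in range(len(chain)):
        rotated = chain[i:] + chain[:i]
        first, last = rotated[0], rotated[-1]
        # 1. Fall-through into the header is missed unless the header is first.
        cost = entry_freq if first != header else 0
        # 2. Only the last block can fall through to a block outside the loop.
        cost += sum(f for bb, f in exit_freq.items() if bb != last)
        # 3. The edge from the last block to the first can never fall through.
        cost += edge_freq.get((last, first), 0)
        if best is None or cost < best[0]:
            best = (cost, rotated)
    return best

print(best_rotation(
    ["H", "B", "L"], "H", entry_freq=1,
    exit_freq={"B": 90, "L": 10},
    edge_freq={("H", "B"): 100, ("B", "L"): 10, ("L", "H"): 99}))
```

In this made-up example the hot exit leaves from block B, so the cheapest rotation places B last even though that gives up the rarely taken fall-through into the header.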
* [AArch64]Merge halfword loads into a 32-bit load (Jun Bum Lim, 2015-10-19, 1 file, -0/+49)
  Convert two halfword loads into a single 32-bit word load with bitfield extract instructions. For example:
    ldrh w0, [x2]
    ldrh w1, [x2, #2]
  becomes
    ldr w0, [x2]
    ubfx w1, w0, #16, #16
    and w0, w0, #0xffff
  llvm-svn: 250719
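The intended equivalence between the two instruction sequences in the commit above can be checked on a byte buffer. A small Python sketch, assuming little-endian memory and modelling registers as integers; the function names are illustrative only:

```python
import struct

def two_halfword_loads(mem, addr):
    """Models: ldrh w0, [x2] ; ldrh w1, [x2, #2]."""
    w0 = struct.unpack_from("<H", mem, addr)[0]
    w1 = struct.unpack_from("<H", mem, addr + 2)[0]
    return w0, w1

def merged_word_load(mem, addr):
    """Models: ldr w0, [x2] ; ubfx w1, w0, #16, #16 ; and w0, w0, #0xffff."""
    word = struct.unpack_from("<I", mem, addr)[0]  # single 32-bit load
    w1 = (word >> 16) & 0xFFFF                     # extract bits 16..31
    w0 = word & 0xFFFF                             # keep bits 0..15
    return w0, w1

mem = bytes([0x34, 0x12, 0x78, 0x56])
assert two_halfword_loads(mem, 0) == merged_word_load(mem, 0)
```

The sketch covers only the value computed from ordinary memory; it says nothing about alignment, ordering, or other conditions a real transformation has to respect.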
* [Hexagon] Delay emission of CFI instructions (Krzysztof Parzyszek, 2015-10-19, 3 files, -6/+70)
  Emit the CFI instructions after all code transformations have been done. This will avoid any interference between CFI instructions and packetization.
  llvm-svn: 250714
* Fix mapping of @llvm.arm.ssat/usat intrinsics to ssat/usat instructions (Asiri Rathnayake, 2015-10-19, 5 files, -0/+100)
  The mapping of these two intrinsics in ARMInstrInfo.td had a small omission which led to their operands not being validated/transformed before being lowered into usat and ssat instructions. This can cause incorrect instructions to be emitted.
  I've also added tests for the remaining two saturating arithmetic intrinsics @llvm.arm.qadd and @llvm.arm.qsub as they are missing codegen tests.
  llvm-svn: 250697
* [X86][SSE] Add vector bit rotation tests. (Simon Pilgrim, 2015-10-18, 1 file, -0/+1684)
  llvm-svn: 250656
* [X86][AVX512DQ] add scalar fpclass (Asaf Badouh, 2015-10-18, 1 file, -0/+34)
  Differential Revision: http://reviews.llvm.org/D13769
  llvm-svn: 250650
* AVX512: Lowering i8/i16 vector CTLZ using the dword LZCNT vector instruction (Igor Breger, 2015-10-18, 3 files, -1376/+515)
  Differential Revision: http://reviews.llvm.org/D13632
  llvm-svn: 250649
* [X86][XOP] Add VPROT rotate by immediate intrinsics tests (Simon Pilgrim, 2015-10-17, 1 file, -0/+28)
  llvm-svn: 250618
* [X86][FastISel] Teach how to select SSE4A nontemporal stores. (Simon Pilgrim, 2015-10-17, 1 file, -12/+53)
  Add FastISel support for SSE4A scalar float / double non-temporal stores.
  Follow up to D13698.
  Differential Revision: http://reviews.llvm.org/D13773
  llvm-svn: 250610
* [Hexagon] Reverting test file change. (Colin LeMahieu, 2015-10-17, 1 file, -1/+2)
  llvm-svn: 250601
* [Hexagon] Adding skeleton of HVX extension instructions. (Colin LeMahieu, 2015-10-17, 1 file, -2/+1)
  llvm-svn: 250600