summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
...
* [X86][AVX512] Regenerate avx512 arithmetic testsSimon Pilgrim2017-06-271-129/+129
| | | | llvm-svn: 306389
* [globalisel][tablegen] Add support for EXTRACT_SUBREG.Daniel Sanders2017-06-271-2/+2
| | | | | | | | | | | | | | | | Summary: After this patch, we finally have test cases that require multiple instruction emission. Depends on D33590 Reviewers: ab, qcolombet, t.p.northover, rovka, kristof.beyls Subscribers: javed.absar, llvm-commits, igorb Differential Revision: https://reviews.llvm.org/D33596 llvm-svn: 306388
* [ARM] GlobalISel: Support G_SELECT for i32Diana Picus2017-06-274-0/+106
| | | | | | | | | | * Mark as legal for (s32, i1, s32, s32) * Map everything into GPRs * Select to two instructions: a CMP of the condition against 0, to set the flags, and a MOVCCr to select between the two inputs based on the flags that we've just set llvm-svn: 306382
* [PowerPC] fix incorrect processor name for -mcpu in a test caseHiroshi Inoue2017-06-271-1/+1
| | | | | | to surpress warnings. ppc970 should be 970 (or g5) llvm-svn: 306380
* AMDGPU: M0 operands to spill/restore opcodes are deadNicolai Haehnle2017-06-271-0/+22
| | | | | | | | | | | | | | | | | Summary: With scalar stores, M0 is clobbered and therefore marked as implicitly defined. However, it is also dead. This fixes an assertion when the Greedy Register Allocator decides to optimize a spill/restore pair away again (via tryHintsRecoloring). Reviewers: arsenm Subscribers: qcolombet, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D33319 llvm-svn: 306375
* [GlobalISel][X86] Add fp32/62 legalizer, regbank-select, selection tests for ↵Igor Breger2017-06-2715-157/+970
| | | | | | G_FADD, G_FSUB, G_FMUL, G_FDIV. NFC. llvm-svn: 306370
* [PowerPC] set optimization level in SelectionDAGISelHiroshi Inoue2017-06-274-45/+45
| | | | | | | | | PowerPC backend does not pass the current optimization level to SelectionDAGISel and so SelectionDAGISel works with the default optimization level regardless of the current optimization level. This patch makes the PowerPC backend set the optimization level correctly. Differential Revision: https://reviews.llvm.org/D34615 llvm-svn: 306367
* ScheduleDAGInstrs: Fix fixupKills() adding too many kill flags.Matthias Braun2017-06-271-0/+45
| | | | | | | | | Remove invalid shortcut in fixupKills(): A register needs to be marked live even when we are not adding a kill flag. This is because a partially live register must not get a kill flags, but it still needs to be fully marked live when walking backwards. llvm-svn: 306352
* DAGCombine: Make sure we only eliminate trunc/extend when the scales of ↵Wolfgang Pieb2017-06-261-0/+35
| | | | | | | | | | | | truncation and extension match. This fixes PR33368. Reviewer: rksimon Differential Revision: https://reviews.llvm.org/D34069 llvm-svn: 306345
* [x86] add tests for missing sbb transforms; NFCSanjay Patel2017-06-261-0/+80
| | | | llvm-svn: 306337
* RenameIndependentSubregs: Fix iterator problemMatt Arsenault2017-06-261-0/+67
| | | | | | | | | | Fixes bug 33597. Use of substituteRegister in the tied operand case messes up the register use iterator, causing some uses to be left unprocessed. llvm-svn: 306333
* AArch64: legalize G_EXTRACT operations.Tim Northover2017-06-263-8/+92
| | | | | | | This is the dual problem to legalizing G_INSERTs so most of the code and testing was cribbed from there. llvm-svn: 306328
* AArch64: remove all kill flags when extending register liveness.Tim Northover2017-06-261-0/+19
| | | | | | | | | | | | When we forward a stored value to a load and eliminate it entirely we need to make sure the liveness of the register is maintained all the way to its use. Previously we only cleared liveness on the store doing the forwarding, but there could be other killing uses in between. We already do the right thing when the load has to be converted into something else, it was just this one path that skipped it. llvm-svn: 306318
* [X86][SSE] Check SSE2/SSE3 codegen tests on i686 and x86_64Simon Pilgrim2017-06-262-184/+456
| | | | llvm-svn: 306314
* AMDGPU: Setup SP/FP in callee function prolog/epilogMatt Arsenault2017-06-262-3/+32
| | | | llvm-svn: 306312
* [SystemZ] Fix missing emergency spill slot corner caseUlrich Weigand2017-06-261-0/+76
| | | | | | | | | | | | | | | | We sometimes need emergency spill slots for the register scavenger. This may be the case when code needs to access a stack slot that has an offset of 4096 or more relative to the stack pointer. To make that determination, processFunctionBeforeFrameFinalized currently simply checks the total stack frame size of the current function. But this is not enough, since code may need to access stack slots in the caller's stack frame as well, in particular incoming arguments stored on the stack. This commit fixes the problem by taking argument slots into account. llvm-svn: 306305
* [X86][SSE] Add combine tests for PMULDQ/PMULUDQSimon Pilgrim2017-06-261-0/+110
| | | | | | Found several missed optimizations while investigating replacing _mm_mul_epi32/_mm_mul_epu32 with generic implementations llvm-svn: 306302
* [X86][AVX-512] Don't raise inexact in ceil, floor, round, trunc.Ahmed Bougacha2017-06-261-8/+8
| | | | | | | | | | | | The non-AVX-512 behavior was changed in r248266 to match N1778 (C bindings for IEEE-754 (2008)), which defined the four functions to not raise the inexact exception ("rint" is still defined as raising it). Update the AVX-512 lowering of these functions to match that: it should not be different. llvm-svn: 306299
* AMDGPU/GlobalISel: Mark 32-bit G_SHL as legalTom Stellard2017-06-261-0/+18
| | | | | | | | | | | | Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D34589 llvm-svn: 306298
* [X86] Add test case for PR15981Simon Pilgrim2017-06-261-0/+59
| | | | llvm-svn: 306296
* [x86] transform vector inc/dec to use -1 constant (PR33483)Sanjay Patel2017-06-2620-1605/+1714
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert vector increment or decrement to sub/add with an all-ones constant: add X, <1, 1...> --> sub X, <-1, -1...> sub X, <1, 1...> --> add X, <-1, -1...> The all-ones vector constant can be materialized using a pcmpeq instruction that is commonly recognized as an idiom (has no register dependency), so that's better than loading a splat 1 constant. AVX512 uses 'vpternlogd' for 512-bit vectors because there is apparently no better way to produce 512 one-bits. The general advantages of this lowering are: 1. pcmpeq has lower latency than a memop on every uarch I looked at in Agner's tables, so in theory, this could be better for perf, but... 2. That seems unlikely to affect any OOO implementation, and I can't measure any real perf difference from this transform on Haswell or Jaguar, but... 3. It doesn't look like it from the diffs, but this is an overall size win because we eliminate 16 - 64 constant bytes in the case of a vector load. If we're broadcasting a scalar load (which might itself be a bug), then we're replacing a scalar constant load + broadcast with a single cheap op, so that should always be smaller/better too. 4. This makes the DAG/isel output more consistent - we use pcmpeq already for padd x, -1 and psub x, -1, so we should use that form for +1 too because we can. If there's some reason to favor a constant load on some CPU, let's make the reverse transform for all of these cases (either here in the DAG or in a later machine pass). This should fix: https://bugs.llvm.org/show_bug.cgi?id=33483 Differential Revision: https://reviews.llvm.org/D34336 llvm-svn: 306289
* [Hexagon] Handle cases when the aligned stack pointer is missingKrzysztof Parzyszek2017-06-262-0/+81
| | | | llvm-svn: 306288
* [SystemZ] Add a check against zero before calling getTestUnderMaskCond()Jonas Paulsson2017-06-261-0/+20
| | | | | | | | | | | | | | Csmith discovered that this function can be called with a zero argument, in which case an assert for this triggered. This patch also adds a guard before the other call to this function since it was missing, although the test only covers the case where it was discovered. Reduced test case attached as CodeGen/SystemZ/int-cmp-54.ll. Review: Ulrich Weigand llvm-svn: 306287
* [X86][LLVM][test]Expanding Supports lowerInterleavedStore() in ↵Michael Zuckerman2017-06-261-0/+59
| | | | | | | | X86InterleavedAccess test. Adding base tast (to trunk) for Store strid=4 vf=32. llvm-svn: 306286
* This reverts commit r306272.Serguei Katkov2017-06-262-54/+4
| | | | | | | | Revert "[MBP] do not rotate loop if it creates extra branch" It breaks the sanitizer build bots. Need to fix this. llvm-svn: 306276
* [MBP] do not rotate loop if it creates extra branchSerguei Katkov2017-06-262-4/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a last fix for the corner case of PR32214. Actually this is not really corner case in general. We should not do a loop rotation if we create an additional branch due to it. Consider the case where we have a loop chain H, M, B, C , where H is header with viable fallthrough from pre-header and exit from the loop M - some middle block B - backedge to Header but with exit from the loop also. C - some cold block of the loop. Let's H is determined as a best exit. If we do a loop rotation M, B, C, H we can introduce the extra branch. Let's compute the change in number of branches: +1 branch from pre-header to header -1 branch from header to exit +1 branch from header to middle block if there is such -1 branch from cold bock to header if there is one So if C is not a predecessor of H then we introduce extra branch. This change actually prohibits rotation of the loop if both true 1) Best Exit has next element in chain as successor. 2) Last element in chain is not a predecessor of first element of chain. Reviewers: iteratee, xur Reviewed By: iteratee Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34271 llvm-svn: 306272
* AMDGPU: Partially fix implicit.buffer.ptr intrinsic handlingMatt Arsenault2017-06-262-0/+59
| | | | | | | | | | | | | | This should not be treated as a different version of private_segment_buffer. These are distinct things with different uses and register classes, and requires the function argument info to have more context about the function's type and environment. Also add missing test coverage for the intrinsic, and emit an error for HSA. This also encovers that the intrinsic is broken unless there happen to be stack objects. llvm-svn: 306264
* [X86] Add test case for PR15705Simon Pilgrim2017-06-251-0/+48
| | | | llvm-svn: 306246
* AVX-512: Fixed a crash during legalization of <3 x i8> typeElena Demikhovsky2017-06-251-0/+31
| | | | | | | | | The compiler fails with assertion during legalization of SETCC for <3 x i8> operands. The result is extended to <4 x i8> and then truncated <4 x i1>. It does not happen on AVX2, because the final result of SETCC is <4 x i32>. Differential Revision: https://reviews.llvm.org/D34503 llvm-svn: 306242
* [GlobalISel][X86] Support vector type G_EXTRACT selection.Igor Breger2017-06-252-0/+207
| | | | | | | | | | | | | | | | Summary: Support vector type G_EXTRACT selection. For now G_EXTRACT marked as legal for any type, so nothing to do in legalizer. Split from https://reviews.llvm.org/D33665 Reviewers: qcolombet, t.p.northover, zvi, guyblank Reviewed By: guyblank Subscribers: guyblank, rovka, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D33957 llvm-svn: 306240
* fix trivial typos in comment, NFCHiroshi Inoue2017-06-241-1/+1
| | | | llvm-svn: 306211
* [SelectionDAG] set dereferenceable flag when expanding memcpy/memmoveHiroshi Inoue2017-06-241-0/+74
| | | | | | | | | | When SelectionDAG expands memcpy (or memmove) call into a sequence of load and store instructions, it disregards dereferenceable flag even the source pointer is known to be dereferenceable. This results in an assertion failure if SelectionDAG commonizes a load instruction generated for memcpy with another load instruction for the source pointer. This patch makes SelectionDAG to set the dereferenceable flag for the load instructions properly to avoid the assertion failure. Differential Revision: https://reviews.llvm.org/D34467 llvm-svn: 306209
* Add missing %s to RUN line.Rafael Espindola2017-06-241-1/+1
| | | | llvm-svn: 306199
* Test the object file creation too.Rafael Espindola2017-06-241-0/+10
| | | | | | | This should *really* be a llvm-mc test, but the parser is broken. See PR33579 for the parser bug. llvm-svn: 306198
* Update constants in complex-return test to prevent reduction to smaller ↵Nirav Dave2017-06-241-1/+1
| | | | | | constants llvm-svn: 306192
* [MSP430] Fix data layout string.Vadzim Dambrouski2017-06-232-1/+58
| | | | | | | | | | | | | | | | Summary: Without this patch some types have incorrect size and/or alignment according to the MSP430 EABI. Reviewers: asl, awygle Reviewed By: asl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34561 llvm-svn: 306159
* Add bitcast store-merge test.Nirav Dave2017-06-231-2/+29
| | | | llvm-svn: 306158
* Revert "[Hexagon] Handle decreasing of stack alignment in frame lowering"Krzysztof Parzyszek2017-06-231-51/+0
| | | | | | This breaks passing of aligned function arguments. llvm-svn: 306145
* [AArch64] Prefer Bcc to CBZ/CBNZ/TBZ/TBNZ when NZCV flags can be set for "free".Chad Rosier2017-06-237-20/+189
| | | | | | | | | | | | | | | | | | | | | | | | | | | This patch contains a pass that transforms CBZ/CBNZ/TBZ/TBNZ instructions into a conditional branch (Bcc), when the NZCV flags can be set for "free". This is preferred on targets that have more flexibility when scheduling Bcc instructions as compared to CBZ/CBNZ/TBZ/TBNZ (assuming all other variables are equal). This can reduce register pressure and is also the default behavior for GCC. A few examples: add w8, w0, w1 -> cmn w0, w1 ; CMN is an alias of ADDS. cbz w8, .LBB_2 -> b.eq .LBB0_2 ; single def/use of w8 removed. add w8, w0, w1 -> adds w8, w0, w1 ; w8 has multiple uses. cbz w8, .LBB1_2 -> b.eq .LBB1_2 sub w8, w0, w1 -> subs w8, w0, w1 ; w8 has multiple uses. tbz w8, #31, .LBB6_2 -> b.ge .LBB6_2 In looking at all current sub-target machine descriptions, this transformation appears to be either positive or neutral. Differential Revision: https://reviews.llvm.org/D34220. llvm-svn: 306144
* [x86] fix value types for SBB transform (PR33560)Sanjay Patel2017-06-231-0/+25
| | | | | | | | | | | I'm not sure yet why this wouldn't fail in the simple case, but clearly I used the wrong value type with: https://reviews.llvm.org/rL306040 ...and the bug manifests with: https://bugs.llvm.org/show_bug.cgi?id=33560 llvm-svn: 306139
* [X86][AVX] Regenerate i256 bitcasted store testSimon Pilgrim2017-06-231-6/+20
| | | | | | Check on slow/fast unaligned memory targets llvm-svn: 306138
* Regenerate extract-store.ll testsSimon Pilgrim2017-06-231-6/+123
| | | | llvm-svn: 306131
* [Hexagon] Handle decreasing of stack alignment in frame loweringKrzysztof Parzyszek2017-06-231-0/+51
| | | | llvm-svn: 306124
* GlobalISel: remove G_SEQUENCE instruction.Tim Northover2017-06-232-41/+13
| | | | | | | | It was trying to do too many things. The basic lumping together of values for legalization purposes is now handled by G_MERGE_VALUES. More complex things involving gaps and odd sizes are handled by G_INSERT sequences. llvm-svn: 306120
* GlobalISel: convert buildSequence to use non-deprecated instructions.Tim Northover2017-06-233-9/+24
| | | | | | | | G_SEQUENCE is going away soon so as a first step the MachineIRBuilder needs to be taught how to emulate it with alternatives. We use G_MERGE_VALUES where possible, and a sequence of G_INSERTs if not. llvm-svn: 306119
* [SystemZ] Remove unnecessary serialization before volatile loadsUlrich Weigand2017-06-231-21/+0
| | | | | | | | | | | | | | | | | This reverts the use of TargetLowering::prepareVolatileOrAtomicLoad introduced by r196905. Nothing in the semantics of the "volatile" keyword or the definition of the z/Architecture actually requires that volatile loads are preceded by a serialization operation, and no other compiler on the platform actually implements this. Since we've now seen a use case where this additional serialization causes noticable performance degradation, this patch removes it. The patch still leaves in the serialization before atomic loads, which is now implemented directly in lowerATOMIC_LOAD. (This also seems overkill, but that can be addressed separately.) llvm-svn: 306117
* [x86] auto-generate complete checks; NFCSanjay Patel2017-06-231-41/+93
| | | | llvm-svn: 306114
* [x86] auto-generate complete checks; NFCSanjay Patel2017-06-231-85/+212
| | | | llvm-svn: 306113
* AMDGPU/GlobalISel: Mark 32-bit G_AND as legalTom Stellard2017-06-231-0/+22
| | | | | | | | | | | | Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D34349 llvm-svn: 306112
* [x86] remove overridden target settings in test; NFCSanjay Patel2017-06-231-2/+0
| | | | | | r306109 was supposed to make this change, but I committed the wrong version. llvm-svn: 306110
OpenPOWER on IntegriCloud