path: root/llvm/test
Commit message | Author | Age | Files | Lines
* Write AArch64 big endian data fixup entries as BE. | Keith Walker | 2016-01-20 | 1 | -0/+6
    Previously, AArch64 big endian data fixup entries were written in BE only in the .eh_frame section. This is changed to write all such fixup entries in BE with no restriction on the section. This is similar to the existing support for fixup entries for ARM.
    A test is added to check the length field in the .debug_line section as this is an example of where such a fixup occurs.
    Differential Revision: http://reviews.llvm.org/D16064
    llvm-svn: 258320
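As a rough illustration of what writing a data fixup "as BE" means, the sketch below serializes a fixup value in either byte order. The function name `encode_data_fixup` and its parameters are hypothetical and are not part of LLVM's MC layer; it only shows the byte-order choice, not how LLVM applies fixups.

```python
def encode_data_fixup(value, size, big_endian):
    """Serialize an unsigned fixup value of `size` bytes in the target byte order.

    Hypothetical helper for illustration only; not an LLVM API.
    """
    byteorder = "big" if big_endian else "little"
    return value.to_bytes(size, byteorder)


# A 4-byte length field holding 0x2A, such as one that might appear in .debug_line.
be_bytes = encode_data_fixup(0x2A, 4, big_endian=True)
le_bytes = encode_data_fixup(0x2A, 4, big_endian=False)
```

On a big endian target the most significant byte comes first, so the same value produces a different byte sequence than on a little endian target.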
* [AVX512] Adding VPERMB Intrinsics | Michael Zuckerman | 2016-01-20 | 2 | -0/+63
    Differential Revision: http://reviews.llvm.org/D16296
    llvm-svn: 258316
* Proper handling of diamond-like cases in if-conversion | Krzysztof Parzyszek | 2016-01-20 | 1 | -0/+43
    The if-converter was somewhat careless about "diamond" cases where there was no join block, or in other words, where the true/false blocks did not have analyzable branches. In such cases, it was possible for it to remove (needed) branches, resulting in a loss of entire basic blocks.
    Differential Revision: http://reviews.llvm.org/D16156
    llvm-svn: 258310
* AVX512: Store (MOVNTPD, MOVNTPS, MOVNTDQ) using non-temporal hint intrinsic implementation. | Igor Breger | 2016-01-20 | 1 | -0/+32
    Differential Revision: http://reviews.llvm.org/D16350
    llvm-svn: 258309
* [AArch64] Fix two bugs in the .inst directive | Oliver Stannard | 2016-01-20 | 1 | -2/+13
    The AArch64 .inst directive was implemented using EmitIntValue, which resulted in both $x and $d (code and data) mapping symbols being emitted at the same address. This fixes it to only emit the $x mapping symbol.
    EmitIntValue also emits the value in big-endian order when targeting big-endian systems, but instructions are always emitted in little-endian order for AArch64.
    Differential Revision: http://reviews.llvm.org/D16349
    llvm-svn: 258308
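The byte-order point above can be sketched as follows. The helper `encode_inst` is a hypothetical name used only for illustration; the constant is the AArch64 NOP encoding. The sketch shows that the instruction bytes stay little-endian regardless of the target's data endianness.

```python
AARCH64_NOP = 0xD503201F  # 32-bit encoding of the AArch64 NOP instruction


def encode_inst(word, target_big_endian):
    """Encode a 32-bit AArch64 instruction word.

    `target_big_endian` is accepted only to make the point explicit:
    it does not affect the result, because AArch64 instructions are
    always stored in little-endian order. Hypothetical helper, not an LLVM API.
    """
    return word.to_bytes(4, "little")


nop_on_le = encode_inst(AARCH64_NOP, target_big_endian=False)
nop_on_be = encode_inst(AARCH64_NOP, target_big_endian=True)
```

Both calls yield the same four bytes, which is the behaviour the fix restores for `.inst` on big-endian targets.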
* [SelectionDAG] Fold more offsets into GlobalAddresses | Dan Gohman | 2016-01-20 | 3 | -11/+683
    SelectionDAG previously missed opportunities to fold constants into GlobalAddresses in several areas. For example, given `(add (add GA, c1), y)`, it would often reassociate to `(add (add GA, y), c1)`, missing the opportunity to create `(add GA+c, y)`. This isn't often visible on targets such as X86 which effectively reassociate adds in their complex address-mode folding logic, however it is currently visible on WebAssembly since it currently has very simple address mode folding code that doesn't reassociate anything.
    This patch fixes this by making SelectionDAG fold offsets into GlobalAddresses at the same time that it folds constants together, so that it doesn't miss any opportunities to perform such folding.
    Differential Revision: http://reviews.llvm.org/D16090
    llvm-svn: 258296
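The folding described above can be illustrated on a toy expression tree. This is a minimal sketch under stated assumptions: the tuple encoding (`"add"`, `"ga"`, `"const"`, `"var"`) and the function names are invented for the example and do not mirror SelectionDAG's data structures or its actual combine logic.

```python
def flatten_add(expr):
    """Return the operands of a (possibly nested) add tree as a flat list."""
    if expr[0] == "add":
        return flatten_add(expr[1]) + flatten_add(expr[2])
    return [expr]


def fold_global_offsets(expr):
    """Fold every constant operand of an add tree into its single global address.

    Nodes are tuples: ("add", a, b), ("ga", name, offset), ("const", c),
    ("var", name). Trees without exactly one global are returned unchanged.
    """
    terms = flatten_add(expr)
    globals_ = [t for t in terms if t[0] == "ga"]
    if len(globals_) != 1:
        return expr
    const_sum = sum(t[1] for t in terms if t[0] == "const")
    others = [t for t in terms if t[0] not in ("ga", "const")]
    result = ("ga", globals_[0][1], globals_[0][2] + const_sum)
    for term in others:
        result = ("add", result, term)
    return result


# (add (add GA, y), c1) with c1 = 8: the reassociated form that hid the constant.
reassociated = ("add", ("add", ("ga", "g", 0), ("var", "y")), ("const", 8))
folded = fold_global_offsets(reassociated)
```

Here `folded` has the shape `(add GA+8, y)`: the constant ends up inside the global address node no matter where it sat in the add tree.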
* [WebAssembly] Tighten up some regexes in some tests. | Dan Gohman | 2016-01-20 | 4 | -80/+80
    llvm-svn: 258295
* [WebAssembly] Don't stackify stores across instructions with side effects. | Dan Gohman | 2016-01-20 | 2 | -12/+36
    llvm-svn: 258285
* [Inliner/WinEH] Honor implicit nounwinds | Joseph Tremoulet | 2016-01-20 | 1 | -0/+455
    Summary:
    Funclet EH tables require that a given funclet have only one unwind destination for exceptional exits. The verifier will therefore reject e.g. two cleanuprets with different unwind dests for the same cleanup, or two invokes exiting the same funclet but to different unwind dests.
    Because catchswitch has no 'nounwind' variant, and because IR producers are not *required* to annotate calls which will not unwind as 'nounwind', it is legal to nest a call or an "unwind to caller" catchswitch within a funclet pad that has an unwind destination other than caller; it is undefined behavior for such a call or catchswitch to unwind.
    Normally when inlining an invoke, calls in the inlined sequence are rewritten to invokes that unwind to the callsite invoke's unwind destination, and "unwind to caller" catchswitches in the inlined sequence are rewritten to unwind to the callsite invoke's unwind destination.
    However, if such a call or "unwind to caller" catchswitch is located in a callee funclet that has another exceptional exit with an unwind destination within the callee, applying the normal transformation would give that callee funclet multiple unwind destinations for its exceptional exits. There would be no way for EH table generation to determine which is the "true" exit, and the verifier would reject the function accordingly.
    Add logic to the inliner to detect these cases and leave such calls and "unwind to caller" catchswitches as calls and "unwind to caller" catchswitches in the inlined sequence.
    This fixes PR26147.
    Reviewers: rnk, andrew.w.kaylor, majnemer
    Subscribers: alexcrichton, llvm-commits
    Differential Revision: http://reviews.llvm.org/D16319
    llvm-svn: 258273
* AMDGPU/SI: Prevent the DAGCombiner from creating setcc with i1 inputs | Tom Stellard | 2016-01-20 | 3 | -2/+69
    Reviewers: arsenm
    Subscribers: arsenm, llvm-commits
    Differential Revision: http://reviews.llvm.org/D15035
    llvm-svn: 258256
* [MachineSink] Don't break ImplicitNulls | Sanjoy Das | 2016-01-20 | 1 | -0/+49
    Summary:
    This teaches MachineSink to not sink instructions that might break the implicit null check optimization that runs later. This should not affect frontends that do not use implicit null checks.
    Reviewers: aadg, reames, hfinkel, atrick
    Subscribers: majnemer, llvm-commits
    Differential Revision: http://reviews.llvm.org/D14632
    llvm-svn: 258254
* [X86] Do not run shrink-wrapping on function with split-stack attribute or HiPE calling convention. | Quentin Colombet | 2016-01-19 | 1 | -6/+77
    The implementation of the related callbacks in the x86 backend for such functions is not ready to deal with a prologue block that is not the entry block of the function.
    This fixes PR26107, but the longer term solution would be to fix those callbacks.
    llvm-svn: 258221
* add tests to show missing memset/malloc optimizations (PR25892) | Sanjay Patel | 2016-01-19 | 2 | -0/+75
    llvm-svn: 258218
* [MC, COFF] Add .reloc support for WinCOFF | David Majnemer | 2016-01-19 | 2 | -0/+46
    This adds rudimentary support for a few relocations that we will use for the CodeView debug format.
    llvm-svn: 258216
* [X86][SSE] Add VZEXT_MOVL target shuffle decoding. | Simon Pilgrim | 2016-01-19 | 2 | -12/+4
    Add support for decoding VZEXT_MOVL target shuffle masks, allowing it to be used as a source in target shuffle combines.
    llvm-svn: 258215
* [X86][SSE] Add INSERTPS target shuffle combines. | Simon Pilgrim | 2016-01-19 | 3 | -28/+8
    As vector shuffles can only reference two inputs, many (V)INSERTPS patterns end up being split over two target shuffles.
    This patch adds combines to attempt to combine (V)INSERTPS nodes with input/output nodes that are just zeroing out these additional vector elements.
    Differential Revision: http://reviews.llvm.org/D16072
    llvm-svn: 258205
* [SCEV] Fix PR26207 | Sanjoy Das | 2016-01-19 | 1 | -0/+20
    In some cases, the max backedge taken count can be more conservative than the exact backedge taken count (for instance, because ScalarEvolution::getRange is not control-flow sensitive whereas computeExitLimitFromICmp can be). In these cases, computeExitLimitFromCond (specifically the bit that deals with `and` and `or` instructions) can create an ExitLimit instance with a `SCEVCouldNotCompute` max backedge count expression, but a computable exact backedge count expression. This violates an implicit SCEV assumption: a computable exact BE count should imply a computable max BE count.
    This change
     - Makes the above implicit invariant explicit by adding an assert to ExitLimit's constructor
     - Changes `computeExitLimitFromCond` to be more robust around conservative max backedge counts
    llvm-svn: 258184
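The invariant stated above (a computable exact count implies a computable max count) can be sketched as a constructor check. This is only an analogy: `ExitLimitSketch` and its fields are hypothetical, with `None` standing in for `SCEVCouldNotCompute`, and it does not reproduce the real `ExitLimit` class.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExitLimitSketch:
    """Toy stand-in for an exit limit; None models "could not compute"."""

    exact: Optional[int]
    max_count: Optional[int]

    def __post_init__(self):
        # Invariant: a known exact count must come with a known max count.
        if self.exact is not None and self.max_count is None:
            raise ValueError("exact count is computable but max count is not")


# A conservative limit (unknown exact count, known max) is still valid.
conservative = ExitLimitSketch(exact=None, max_count=10)
```

Constructing `ExitLimitSketch(exact=3, max_count=None)` raises, which corresponds to the state the patch rules out.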
* [AVX512] Adding VPERMT2B and VPERMI2B instructions. | Michael Zuckerman | 2016-01-19 | 1 | -0/+239
    Differential Revision: http://reviews.llvm.org/D16297
    llvm-svn: 258161
* [LibCallSimplifier] use instruction-level fast-math-flags to shrink calls | Sanjay Patel | 2016-01-19 | 1 | -20/+18
    This is a continuation of adding FMF to call instructions: http://reviews.llvm.org/rL255555
    llvm-svn: 258158
* [LibCallSimplifier] use instruction-level fast-math-flags to transform pow(x, [small integer]) calls | Sanjay Patel | 2016-01-19 | 1 | -21/+20
    This is a continuation of adding FMF to call instructions: http://reviews.llvm.org/rL255555
    As with D15937, the intent of the patch is to preserve the current behavior of the transform except that we use the pow call's 'fast' attribute as a trigger rather than a function-level attribute.
    The TODO comment notes a potential follow-on patch that would propagate FMF to the new instructions.
    Differential Revision: http://reviews.llvm.org/D16122
    llvm-svn: 258153
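To make the kind of transform concrete, the sketch below replaces `pow(x, n)` for a small positive integer `n` with a chain of multiplications, gated on a per-call `fast` flag. The function name, the bound `MAX_EXPANDED_EXPONENT`, and the square-and-multiply scheme are assumptions chosen for illustration; they are not taken from LibCallSimplifier.

```python
import math

MAX_EXPANDED_EXPONENT = 32  # illustrative bound, not LLVM's actual limit


def expand_pow(x, n, fast):
    """Compute x**n with multiplications when the call is marked fast.

    Without the fast flag (or for exponents outside the small range) the
    library call is kept, modelled here by math.pow.
    """
    if not fast or not (0 < n <= MAX_EXPANDED_EXPONENT):
        return math.pow(x, n)
    result = None
    base = x
    while n:
        if n & 1:
            # Multiply in the current power of x for each set bit of n.
            result = base if result is None else result * base
        n >>= 1
        if n:
            base = base * base
    return result


fast_result = expand_pow(3.0, 4, fast=True)
strict_result = expand_pow(3.0, 4, fast=False)
```

The multiplication chain may round differently from the library call for some inputs, which is why the rewrite is tied to a fast-math flag on the call itself.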
* [AVX512] Adding VPERMB instruction | Michael Zuckerman | 2016-01-19 | 1 | -0/+123
    Differential Revision: http://reviews.llvm.org/D16294
    llvm-svn: 258144
* [WebAssembly] Rematerialize constants rather than hold them live in registers. | Dan Gohman | 2016-01-19 | 8 | -60/+81
    Teach the register stackifier to rematerialize constants that have multiple uses instead of leaving them in registers. In the WebAssembly encoding, it's the same code size to materialize most constants as it is to read a value from a register.
    llvm-svn: 258142
* [WebAssembly] Change a FIXME to a TODO in a comment. | Dan Gohman | 2016-01-19 | 1 | -1/+1
    llvm-svn: 258139
* [WebAssembly] Re-enable this test, now that interactions with the coalescer are resolved. | Dan Gohman | 2016-01-19 | 1 | -3/+8
    llvm-svn: 258138
* [X86] Add support for "xlat m8" | Marina Yatsina | 2016-01-19 | 1 | -0/+4
    According to x86 spec "xlat m8" is a legal instruction and it is equivalent to "xlatb".
    Differential Revision: http://reviews.llvm.org/D15150
    llvm-svn: 258135
* Fix constant folding of constant vector GEPs with undef or null as pointer argument. | Manuel Jacob | 2016-01-19 | 1 | -0/+4
    Reviewers: eddyb
    Subscribers: llvm-commits
    Differential Revision: http://reviews.llvm.org/D16321
    llvm-svn: 258134
* [X86] Adding support for missing variations of X86 string related instructions | Marina Yatsina | 2016-01-19 | 2 | -0/+40
    The following are legal according to X86 spec:
        ins mem, DX
        outs DX, mem
        lods mem
        stos mem
        scas mem
        cmps mem, mem
        movs mem, mem
    Differential Revision: http://reviews.llvm.org/D14827
    llvm-svn: 258132
* [WebAssembly] Re-enable loop idiom recognition for memcpy et al. | Dan Gohman | 2016-01-19 | 1 | -53/+0
    llvm-svn: 258125
* [X86][AVX512] fix dag & add intrinsics for fixupimm | Asaf Badouh | 2016-01-19 | 2 | -0/+346
    Cover all widths and types (pd/ps/sd/ss) of the fixupimm instruction and intrinsics.
    Differential Revision: http://reviews.llvm.org/D16313
    llvm-svn: 258124
* [LTO] Restore original linkage of externals prior to splitting | Tobias Edler von Koch | 2016-01-18 | 1 | -0/+24
    Summary:
    This is a companion patch for http://reviews.llvm.org/D16124.
    Internalized symbols increase the size of strongly-connected components in SCC-based module splitting and thus reduce the amount of parallelism. This patch records the original linkage of non-local symbols prior to internalization and then restores it just before splitting/CodeGen. This is also useful for cases where the linker requires symbols to remain external, for instance, so they can be placed according to linker script rules.
    It's currently under its own flag (-restore-globals) but should eventually share a common flag with D16124.
    Reviewers: joker.eph, pcc
    Subscribers: slarin, llvm-commits, joker.eph
    Differential Revision: http://reviews.llvm.org/D16229
    llvm-svn: 258100
* AMDGPU: Reduce 64-bit SRAs | Matt Arsenault | 2016-01-18 | 3 | -20/+30
    llvm-svn: 258096
* AMDGPU: Split 64-bit and of constant up | Matt Arsenault | 2016-01-18 | 3 | -51/+399
    This breaks the tests that were meant for testing 64-bit inline immediates, so move those to shl where they won't be broken up.
    This should be repeated for the other related bit ops.
    llvm-svn: 258095
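Splitting a 64-bit `and` with a constant into two 32-bit operations can be sketched as below. The function name and the integer modelling of registers are assumptions for illustration; the sketch only checks the arithmetic identity, not how the AMDGPU backend performs the split.

```python
MASK32 = 0xFFFFFFFF


def and64_split(x, k):
    """AND a 64-bit value with a 64-bit constant using two 32-bit ANDs.

    Both arguments are assumed to be unsigned values below 2**64.
    """
    lo = (x & MASK32) & (k & MASK32)  # low halves
    hi = (x >> 32) & (k >> 32)        # high halves
    return (hi << 32) | lo


value = 0x123456789ABCDEF0
low_only = and64_split(value, 0x00000000FFFFFFFF)
mixed = and64_split(value, 0x0000FFFF00FF00FF)
```

One reason such a split can pay off: when one half of the constant is all zeros or all ones, that half's AND reduces to zero or to a plain copy, as in the `low_only` case above.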
* [X86][AVX2] Ensure integer execution domain for integer blend tests | Simon Pilgrim | 2016-01-18 | 1 | -2/+6
    llvm-svn: 258094
* AMDGPU: Generalize shl combine | Matt Arsenault | 2016-01-18 | 1 | -0/+47
    Reduce 64-bit shl with constant > 32. We already special cased this for the == 32 case, but this also works for any >= 32 constant.
    llvm-svn: 258092
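The reduction for shift amounts of 32 or more can be written out as a small check. This is a sketch with a hypothetical function name that models registers as Python integers: for `c >= 32` the low 32 bits of the result are zero and the high 32 bits are the low input half shifted by `c - 32`, so only a 32-bit shift is needed.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def shl64_by_const(x, c):
    """Compute (x << c) mod 2**64 for 32 <= c < 64 with one 32-bit shift."""
    if not 32 <= c < 64:
        raise ValueError("this reduction only applies to 32 <= c < 64")
    lo = x & MASK32
    new_hi = (lo << (c - 32)) & MASK32  # 32-bit shift of the low half
    new_lo = 0                          # low half of the result is always zero
    return (new_hi << 32) | new_lo


value = 0x123456789ABCDEF0
shifted_40 = shl64_by_const(value, 40)
```

The case `c == 32` is simply the instance where the 32-bit shift amount is zero, which matches the commit's note that the earlier special case generalizes.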
* [X86][SSE] Regenerate vector blend commutation tests | Simon Pilgrim | 2016-01-18 | 2 | -46/+48
    llvm-svn: 258091
* AMDGPU: Reduce 64-bit lshr by constant to 32-bit | Matt Arsenault | 2016-01-18 | 2 | -2/+64
    64-bit shifts are very slow on some subtargets.
    llvm-svn: 258090
* [JIT] Add small-code model test for ELF. | Davide Italiano | 2016-01-18 | 1 | -0/+15
    The coverage is almost non-existent, hopefully more will come after this.
    Differential Revision: http://reviews.llvm.org/D16096
    llvm-svn: 258087
* AMDGPU: Cleanup sra test | Matt Arsenault | 2016-01-18 | 1 | -163/+206
    llvm-svn: 258086
* Add to the split module utility an SCC based method which allows not to globalize any local variables. | Sergei Larin | 2016-01-18 | 7 | -0/+313
    Summary:
    Currently llvm::SplitModule as the first step globalizes all local objects, which might not be desirable in some scenarios. This change adds a new flag to llvm::SplitModule that uses SCC approach to search for a balanced partition without the need to externalize symbols. Such partition might not be possible or fully balanced for a given number of partitions, and is a function of the module properties (global/local dependencies within the module).
    Joint development by Tobias Edler von Koch (tobias@codeaurora.org) and Sergei Larin (slarin@codeaurora.org)
    Subscribers: llvm-commits, joker.eph
    Differential Revision: http://reviews.llvm.org/D16124
    llvm-svn: 258083
* [X86][AVX2] Broadcast subvectors | Simon Pilgrim | 2016-01-18 | 7 | -18/+137
    AVX2 can only broadcast from the zero'th element of a vector, but if the broadcastable element is the zero'th element of a 128-bit subvector it's advantageous to extract the subvector, broadcast from that and avoid the loading of shuffle mask data that would be needed for VPERMPS/VPERMD. The only exception is when the source type is 4f64 or 4i64, which can use the immediate shuffle VPERMPD/VPERMQ directly.
    Differential Revision: http://reviews.llvm.org/D16050
    llvm-svn: 258081
* AVX512: Masked store intrinsic implementation. | Igor Breger | 2016-01-18 | 4 | -12/+400
    Implemented intrinsics for the following store instructions: VMOVDQU8/16/32/64, VMOVDQA32/64, VMOVAPS/PD, VMOVUPS/PD.
    Differential Revision: http://reviews.llvm.org/D16271
    llvm-svn: 258047
* AVX512: Change v8i1 bitconvert GR8 pattern, remove unnecessary movzbl instruction. | Igor Breger | 2016-01-18 | 11 | -1360/+1126
    Code example, previous implementation:
        movzbl %dil, %eax
        kmovw %eax, %k0
    New code:
        kmovw %edi, %k0
    Differential Revision: http://reviews.llvm.org/D16287
    llvm-svn: 258045
* [ARM] Operands for PKHTB alias should be swapped | Oliver Stannard | 2016-01-18 | 2 | -2/+2
    When the shift immediate is zero, PKHTB is an alias for PKHBT, but the order of the input operands needs to be swapped.
    Differential Revision: http://reviews.llvm.org/D16288
    llvm-svn: 258044
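The operand swap can be checked with a small model of the two halfword-packing operations. The Python functions below are illustrative stand-ins, not assembler or LLVM code: `pkhbt` takes its bottom halfword from the first operand and its top halfword from the (left-shifted) second operand, while the unshifted `pkhtb` form takes its top halfword from the first operand and its bottom halfword from the second.

```python
MASK32 = 0xFFFFFFFF


def pkhbt(rn, rm, lsl=0):
    """Bottom halfword from rn, top halfword from (rm << lsl)."""
    shifted = (rm << lsl) & MASK32
    return (rn & 0x0000FFFF) | (shifted & 0xFFFF0000)


def pkhtb_no_shift(rn, rm):
    """PKHTB written without a shift: top halfword from rn, bottom from rm."""
    return (rn & 0xFFFF0000) | (rm & 0x0000FFFF)


a, b = 0x11112222, 0x33334444
via_alias = pkhbt(b, a)           # PKHBT with the operands swapped
direct = pkhtb_no_shift(a, b)     # what the unshifted PKHTB form means
unswapped = pkhbt(a, b)           # PKHBT without swapping, for contrast
```

The swapped PKHBT reproduces the unshifted PKHTB result, whereas keeping the operand order gives a different value.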
* [IndVars] Fix PR25576 | Sanjoy Das | 2016-01-17 | 1 | -0/+31
    `LCSSASafePhiForRAUW` as computed was incorrect -- in cases like these (this exact example does not actually trigger the bug):
        define i32 @f(i32 %n, i1* %c) {
        entry:
          br label %outer.loop
        outer.loop:
          br label %inner.loop
        inner.loop:
          %iv = phi i32 [ 0, %outer.loop ], [ %iv.inc, %inner.loop ]
          %iv.inc = add nuw nsw i32 %iv, 1
          %tc = udiv i32 %n, 13
          %be.cond = icmp ult i32 %iv, %tc
          br i1 %be.cond, label %inner.loop, label %inner.exit
        inner.exit:
          %iv.lcssa = phi i32 [ %iv, %inner.loop ]
          %outer.be.cond = load volatile i1, i1* %c
          br i1 %outer.be.cond, label %outer.loop, label %leave
        leave:
          %iv.lcssa.lcssa = phi i32 [ %iv.lcssa, %inner.exit ]
          ret i32 %iv.lcssa.lcssa
        }
    `LCSSASafePhiForRAUW` is true for `%iv.lcssa` when re-rewriting the exit value of `%iv` for `%inner.loop` to `%tc` (this can happen due to `SCEVExpander::findExistingExpansion`), but the RAUW breaks LCSSA.
    To fix this, instead of computing `SafePhi` with special logic, decide the safety of RAUW directly via `replacementPreservesLCSSAForm`.
    llvm-svn: 258016
* [X86][AVX512] Regenerate v1 shuffle tests | Simon Pilgrim | 2016-01-17 | 1 | -2/+2
    llvm-svn: 258013
* Push isDereferenceableAndAlignedPointer down into isSafeToLoadUnconditionally | Artur Pilipenko | 2016-01-17 | 3 | -1/+69
    Reviewed By: reames
    Differential Revision: http://reviews.llvm.org/D16226
    llvm-svn: 258010
* [AVX512] Adding VPERMW/D/Q VPERMPS/D Intrinsics | Michael Zuckerman | 2016-01-17 | 4 | -1/+231
    Differential Revision: http://reviews.llvm.org/D16189
    llvm-svn: 258008
* [AVX512] Adding VPERMQ VPERMPD Intrinsics | Michael Zuckerman | 2016-01-17 | 2 | -0/+86
    Differential Revision: http://reviews.llvm.org/D16194
    llvm-svn: 258006
* Remove some stale comments and fix a typo as suggested by David Blaikie in his review of r257343. | Lang Hames | 2016-01-17 | 2 | -2/+0
    Thanks Dave!
    llvm-svn: 258002
* [llvm-readobj][ELF] Teach llvm-readobj to show dynamic relocation in REL format | Simon Atanasyan | 2016-01-16 | 2 | -0/+71
    MIPS 32-bit ABI uses REL relocation record format to save dynamic relocations. The patch teaches llvm-readobj to show dynamic relocations in this format.
    Differential Revision: http://reviews.llvm.org/D16114
    llvm-svn: 258001