summaryrefslogtreecommitdiffstats
path: root/llvm/test
Commit message (Collapse)AuthorAgeFilesLines
* [X86][LLVM][test]Expanding Supports lowerInterleavedStore() in ↵Michael Zuckerman2017-06-281-0/+58
| | | | | | | | | X86InterleavedAccess test. Exapnding the test to include AVX target. Adding base tast (to trunk) for Store strid=4 vf=32. llvm-svn: 306543
* Add zero-length check to memcpy/memset load store loop expansionTeresa Johnson2017-06-281-0/+4
| | | | | | | | | | | | | | | | Summary: I was testing using this expansion logic in other cases besides NVPTX, and found some runtime failures due to the lack of a check for a zero length memcpy/memset before the loop. There is already such a check in the memmove expansion code though. Reviewers: hfinkel Subscribers: jholewinski, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D34707 llvm-svn: 306541
* [GlobalISel][X86] Test G_CONSTANT i32 0 TableGen'erated selection.NFC.Igor Breger2017-06-281-0/+21
| | | | llvm-svn: 306537
* Revert r306528Nikolai Bozhenov2017-06-281-1/+1
| | | | llvm-svn: 306536
* [GlobalISel][X86] Support bitwise operations : G_AND, G_OR, G_XORIgor Breger2017-06-2810-0/+1098
| | | | | | | | | | | | | | Summary: Support G_AND, G_OR, G_XOR for i8/i16/i32/i64. Selection done via TableGen'erated code. Reviewers: zvi, guyblank, aymanmus, m_zuckerman Reviewed By: aymanmus Subscribers: rovka, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D34605 llvm-svn: 306533
* Reverting commit 306414 on behalf of @gadi.haberMichael Zuckerman2017-06-2830-4307/+3620
| | | | llvm-svn: 306532
* [X86][AVX2] Dropped -mcpu from avx2 arithmetic/intrinsics testsSimon Pilgrim2017-06-2810-384/+364
| | | | | | Use triple and attribute only for consistency llvm-svn: 306531
* [X86] Correct dwarf unwind information in function epiloguePetar Jovanovic2017-06-2855-243/+1058
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CFI instructions that set appropriate cfa offset and cfa register are now inserted in emitEpilogue() in X86FrameLowering. Majority of the changes in this patch: 1. Ensure that CFI instructions do not affect code generation. 2. Enable maintaining correct information about cfa offset and cfa register in a function when basic blocks are reordered, merged, split, duplicated. These changes are target independent and described below. Changed CFI instructions so that they: 1. are duplicable 2. are not counted as instructions when tail duplicating or tail merging 3. can be compared as equal Add information to each MachineBasicBlock about cfa offset and cfa register that are valid at its entry and exit (incoming and outgoing CFI info). Add support for updating this information when basic blocks are merged, split, duplicated, created. Add a verification pass (CFIInfoVerifier) that checks that outgoing cfa offset and register of predecessor blocks match incoming values of their successors. Incoming and outgoing CFI information is used by a late pass (CFIInstrInserter) that corrects CFA calculation rule for a basic block if needed. That means that additional CFI instructions get inserted at basic block beginning to correct the rule for calculating CFA. Having CFI instructions in function epilogue can cause incorrect CFA calculation rule for some basic blocks. This can happen if, due to basic block reordering, or the existence of multiple epilogue blocks, some of the blocks have wrong cfa offset and register values set by the epilogue block above them. Patch by Violeta Vukobrat. Differential Revision: https://reviews.llvm.org/D18046 llvm-svn: 306529
* [ValueTracking] Enabling existing ValueTracking patch by default.Nikolai Bozhenov2017-06-281-1/+1
| | | | | | | | | | | | | | | The original patch was an improvement to IR ValueTracking on non-negative integers. It has been checked in to trunk (D18777, r284022). But was disabled by default due to performance regressions. Perf impact has improved. The patch would be enabled by default. Reviewers: reames Differential Revision: https://reviews.llvm.org/D34101 Patch by: Olga Chupina <olga.chupina@intel.com> llvm-svn: 306528
* [InstCombine] Canonicalize clamp of float types to minmax in fast mode.Nikolai Bozhenov2017-06-281-27/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This commit allows matchSelectPattern to recognize clamp of float arguments in the presence of FMF the same way as already done for integers. This case is a little different though. With integers, given the min/max pattern is recognized, DAGBuilder starts selecting MIN/MAX "automatically". That is not the case for float, because for them only full FMINNAN/FMINNUM/FMAXNAN/FMAXNUM ISD nodes exist and they do care about NaNs. On the other hand, some backends (e.g. X86) have only FMIN/FMAX nodes that do not care about NaNS and the former NAN/NUM nodes are illegal thus selection is not happening. So I decided to do such kind of transformation in IR (InstCombiner) instead of complicating the logic in the backend. Reviewers: spatel, jmolloy, majnemer, efriedma, craig.topper Reviewed By: efriedma Subscribers: hiraditya, javed.absar, n.bozhenov, llvm-commits Patch by Andrei Elovikov <andrei.elovikov@intel.com> Differential Revision: https://reviews.llvm.org/D33186 llvm-svn: 306525
* Add tests to document current InstCombine behavior for clamp pattern.Nikolai Bozhenov2017-06-281-0/+500
| | | | | | | | | | | | | | | | | | | Summary: This commit adds the tests for clamp pattern as a prerequisite of D33186 to make the impact of that fix more clear and also to document current behavior. Reviewers: spatel, jmolloy Reviewed By: spatel Subscribers: n.bozhenov, llvm-commits Patch by Andrei Elovikov <andrei.elovikov@intel.com> Differential Revision: https://reviews.llvm.org/D34350 llvm-svn: 306524
* [ARM] Make -mcpu=generic schedule for an in-order core (Cortex-A8).Kristof Beyls2017-06-2847-413/+405
| | | | | | | | | | | | | | | | The benchmarking summarized in http://lists.llvm.org/pipermail/llvm-dev/2017-May/113525.html showed this is beneficial for a wide range of cores. As is to be expected, quite a few small adaptations are needed to the regressions tests, as the difference in scheduling results in: - Quite a few small instruction schedule differences. - A few changes in register allocation decisions caused by different instruction schedules. - A few changes in IfConversion decisions, due to a difference in instruction schedule and/or the estimated cost of a branch mispredict. llvm-svn: 306514
* [InstCombine] Add test case demonstrating that we don't handle icmp eq ↵Craig Topper2017-06-281-0/+21
| | | | | | (trunc (lshr(X, cst1)), cst->icmp (and X, mask), cst when the shift type is larger than 64-bits. NFC llvm-svn: 306510
* Revert r306508 "[InstCombine] Add test case demonstrating that we don't ↵Craig Topper2017-06-281-21/+0
| | | | | | | | handle icmp eq (trunc (lshr(X, cst1)), cst->icmp (and X, mask), cst when the shift type is larger than 64-bits. NFC" I accidentally had a extra change in there. llvm-svn: 306509
* [InstCombine] Add test case demonstrating that we don't handle icmp eq ↵Craig Topper2017-06-281-0/+21
| | | | | | (trunc (lshr(X, cst1)), cst->icmp (and X, mask), cst when the shift type is larger than 64-bits. NFC llvm-svn: 306508
* [AMDGPU] Add pattern for v_alignbit_b32 with immediateStanislav Mekhanoshin2017-06-283-16/+42
| | | | | | | | If immediate in shift is less than 32 we can use alignbit too. Differential Revision: https://reviews.llvm.org/D34729 llvm-svn: 306500
* Allow to truncate left shift with non-constant shift amountStanislav Mekhanoshin2017-06-282-69/+74
| | | | | | | | | | | That is pretty common for clang to produce code like (shl %x, (and %amt, 31)). In this situation we can still perform trunc (shl) into shl (trunc) conversion given the known value range of shift amount. Differential Revision: https://reviews.llvm.org/D34723 llvm-svn: 306499
* [COFF, ARM64] Add support for Windows ARM64 COFF formatMandeep Singh Grang2017-06-271-0/+8
| | | | | | | | | | | | | | | | Summary: This is the llvm part of the initial implementation to support Windows ARM64 COFF format. I will gradually add more functionality in subsequent patches. Reviewers: ruiu, rnk, t.p.northover, compnerd Reviewed By: ruiu, compnerd Subscribers: aemerson, mgorny, javed.absar, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D34705 llvm-svn: 306490
* Object: Teach irsymtab::read() to try to use the irsymtab that we wrote to disk.Peter Collingbourne2017-06-271-0/+13
| | | | | | | | Fixes PR27551. Differential Revision: https://reviews.llvm.org/D33974 llvm-svn: 306488
* Bitcode: Write the irsymtab to disk.Peter Collingbourne2017-06-2711-8/+60
| | | | | | Differential Revision: https://reviews.llvm.org/D33973 llvm-svn: 306487
* [CGP] add specialization for memcmp expansion with only one basic blockSanjay Patel2017-06-274-162/+93
| | | | llvm-svn: 306485
* [NewPM/Inliner] Reduce threshold for cold callsites in the non-PGO caseEaswaran Raman2017-06-272-43/+90
| | | | | | Differential Revision: https://reviews.llvm.org/D34312 llvm-svn: 306484
* [AArch64] Inline callee if its target-features are a subset of the callerFlorian Hahn2017-06-271-0/+40
| | | | | | | | | | | | | | | | | | | Summary: Similar to X86, it should be safe to inline callees if their target-features are a subset of the caller. This change matches GCC's inlining behavior with respect to attributes [1]. [1] https://gcc.gnu.org/onlinedocs/gcc/AArch64-Function-Attributes.html#AArch64-Function-Attributes Reviewers: kristof.beyls, javed.absar, rengolin, t.p.northover Reviewed By: t.p.northover Subscribers: aemerson, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D34698 llvm-svn: 306478
* [EarlyCSE][MemorySSA] Enable MemorySSA in function-simplification pass of ↵Geoff Berry2017-06-272-0/+2
| | | | | | EarlyCSE. llvm-svn: 306477
* [GISel]: Add G_FEXP, G_FEXP2 opcodesAditya Nandakumar2017-06-271-0/+20
| | | | | | | Also add IRTranslator support. https://reviews.llvm.org/D34710 llvm-svn: 306475
* re-commit r306336: Enable vectorizer-maximize-bandwidth by default.Dehao Chen2017-06-2711-67/+76
| | | | | | Differential Revision: https://reviews.llvm.org/D33341 llvm-svn: 306473
* [CGP] eliminate a sub instruction in memcmp expansionSanjay Patel2017-06-275-141/+118
| | | | | | | | | | | | | | | | | | | | | | As noted in D34071, there are some IR optimization opportunities that could be handled by normal IR passes if this expansion wasn't happening so late in CGP. Regardless of that, it seems wasteful to knowingly produce suboptimal IR here, so I'm proposing this change: %s = sub i32 %x, %y %r = icmp ne %s, 0 => %r = icmp ne %x, %y Changing the predicate to 'eq' mimics what InstCombine would do, so that's just an efficiency improvement if we decide this expansion should happen sooner. The fact that the PowerPC backend doesn't eliminate the 'subf.' might be something for PPC folks to investigate separately. Differential Revision: https://reviews.llvm.org/D34416 llvm-svn: 306471
* GlobalISel: verify that a COPY is trivial when created.Tim Northover2017-06-271-1/+1
| | | | | | | | | | | | Without this check, COPY instructions can actually be one of the generic casts in disguise. That's confusing and bad. At some point during ISel this restriction has to be relaxed since the fully selected instructions will usually use COPY for those purposes. Right now I think it's possible that relaxation occurs during RegBankSelect (hence the change there). I'm not convinced that's where it belongs long-term though. llvm-svn: 306470
* Clean up a test caseXinliang David Li2017-06-271-34/+41
| | | | llvm-svn: 306468
* Create a PHI value when merging with a known undef live-inKrzysztof Parzyszek2017-06-271-0/+35
| | | | | | Differential Revision: https://reviews.llvm.org/D34640 llvm-svn: 306466
* [WebAssembly] Only run WebAssembly objdump tests if it is enabled as a targetSam Clegg2017-06-271-0/+2
| | | | | | Differential Revision: https://reviews.llvm.org/D34712 llvm-svn: 306464
* [WebAssembly] Add support for printing relocations with llvm-objdumpSam Clegg2017-06-271-0/+8
| | | | | | Differential Revision: https://reviews.llvm.org/D34658 llvm-svn: 306461
* [WebAssembly] Add data size and alignement to linking sectionSam Clegg2017-06-275-46/+75
| | | | | | | | | The overal size of the data section (including BSS) is otherwise not included in the wasm binary. Differential Revision: https://reviews.llvm.org/D34657 llvm-svn: 306459
* [Hexagon] Use proper predicate register state when expanding PS_vselectKrzysztof Parzyszek2017-06-271-0/+53
| | | | llvm-svn: 306458
* [InstCombine] Propagate nsw flag when turning mul by pow2 into shift when ↵Craig Topper2017-06-271-5/+3
| | | | | | | | | | | | the constant is a vector splat or the scalar bit width is larger than 64-bits The check to see if we can propagate the nsw flag used m_ConstantInt(uint64_t*&) which doesn't work with splat vectors and has a restriction that the bitwidth of the ConstantInt must be 64-bits are less. This patch changes it to use m_APInt to remove both these issues Differential Revision: https://reviews.llvm.org/D34699 llvm-svn: 306457
* [AMDGPU] Add 2 new alignbit patternsStanislav Mekhanoshin2017-06-271-0/+142
| | | | | | Differential Revision: https://reviews.llvm.org/D34655 llvm-svn: 306449
* [CodeExtractor] Prevent extraction of block involving blockaddressSerge Guelton2017-06-272-0/+79
| | | | | | | | | BlockAddress are only valid within their function context, which does not interact well with CodeExtractor. Detect this case and prevent it. Differential Revision: https://reviews.llvm.org/D33839 llvm-svn: 306448
* [AMDGPU] Simplify setcc (sext from i1 b), -1|0, ccStanislav Mekhanoshin2017-06-271-0/+292
| | | | | | | | | | | Depending on the compare code that can be either an argument of sext or negate of it. This helps to avoid v_cndmask_b64 instruction for sext. A reversed value can be further simplified and folded into its parent comparison if possible. Differential Revision: https://reviews.llvm.org/D34545 llvm-svn: 306446
* [Hexagon] Update kills in hexagon-nvj even more properly than beforeKrzysztof Parzyszek2017-06-271-0/+18
| | | | | | | Account for the fact that both, the feeder and the compare can be moved over instructions that kill registers. llvm-svn: 306443
* RenameIndependentSubregs: Fix infinite loopMatt Arsenault2017-06-271-0/+19
| | | | | | | | Apparently this replacement can really be substituting the same as the original register. Avoid restarting the loop when there's been no change in the register uses. llvm-svn: 306441
* [SROA] Fix APInt size when alloca address space is not 0Yaxun Liu2017-06-271-1/+30
| | | | | | | | SROA assumes alloca address space is 0, which causes assertion. This patch fixes that. Differential Revision: https://reviews.llvm.org/D34104 llvm-svn: 306440
* [AMDGPU] Combine and x, (sext cc from i1) => select cc, x, 0Stanislav Mekhanoshin2017-06-272-0/+47
| | | | | | | | | | Also factored out function to check if a boolean is an already deserialized value which does not require v_cndmask_b32 to be loaded. Added binary logical operators to its check. Differential Revision: https://reviews.llvm.org/D34500 llvm-svn: 306439
* [InstCombine] canonicalize icmp predicate feeding selectSanjay Patel2017-06-278-63/+68
| | | | | | | | | | | | | | | | | | | | | This canonicalization was suggested in D33172 as a way to make InstCombine behavior more uniform. We have this transform for icmp+br, so unless there's some reason that icmp+select should be treated differently, we should do the same thing here. The benefit comes from increasing the chances of creating identical instructions. This is shown in the tests in logical-select.ll (PR32791). InstCombine doesn't fold those directly, but EarlyCSE can simplify the identical cmps, and then InstCombine can fold the selects together. The possible regression for the tests in select.ll raises questions about poison/undef: http://lists.llvm.org/pipermail/llvm-dev/2017-May/113261.html ...but that transform is just as likely to be triggered by this canonicalization as it is to be missed, so we're just pointing out a commutation deficiency in the pattern matching: https://reviews.llvm.org/rL228409 Differential Revision: https://reviews.llvm.org/D34242 llvm-svn: 306435
* [InstCombine] Add test case demonstrating that we don't propagate nsw flag ↵Craig Topper2017-06-271-0/+10
| | | | | | when converting mul by pow2 to shl when the type is larger than 64-bits. NFC llvm-svn: 306427
* [InstCombine] Add test cases to show that we don't propagate 'nsw' flags ↵Craig Topper2017-06-271-0/+20
| | | | | | when converting mul by pow2 constant to shl for splat vectors. NFC llvm-svn: 306426
* [X86][AsmParser][MS-compatability] Binary/Unary operators enhancementsCoby Tayree2017-06-271-0/+17
| | | | | | | | | | | Introducing MOD binary operator https://msdn.microsoft.com/en-us/library/hha180wt.aspx Enhancing unary operators NEG and NOT, to support more complex patterns Differential Revision: https://reviews.llvm.org/D33876 llvm-svn: 306425
* Another test commitChih-Hung Hsieh2017-06-271-4/+4
| | | | llvm-svn: 306420
* [PatternMatch] Remove 64-bit or less restriction from m_SpecificIntCraig Topper2017-06-271-8/+4
| | | | | | | | | | Not sure why this restriction existed, but it seems like we should support any size Constant here. The particular pattern in the tests is not the only use of this matcher in the tree. There's one in CodeGenPrepare and one in InstSimplify as well. Differential Revision: https://reviews.llvm.org/D34666 llvm-svn: 306417
* [JumpThreading] Add test case that was supposed to go with r306085.Craig Topper2017-06-271-0/+125
| | | | | | Looks like I forgot to 'git add' when I submitted the commit. Thanks to Chandler for noticing. llvm-svn: 306416
* Updated and extended the information about each instruction in HSW and SNB ↵Gadi Haber2017-06-2731-3632/+4339
| | | | | | | | | | | | | | | | | | | to include the following data: •static latency •number of uOps from which the instructions consists •all ports used by the instruction Reviewers:  RKSimon zvi aymanmus m_zuckerman Differential Revision: https://reviews.llvm.org/D33897 llvm-svn: 306414
OpenPOWER on IntegriCloud