path: root/llvm/test
Commit message | Author | Age | Files | Lines
* [MSP430] Fix PR33050: Don't use ADD16ri to lower FrameIndex.Vadzim Dambrouski2017-05-241-1/+1
| | | | | | | | | Use the ADDframe pseudo instruction instead. This will fix a machine verifier error, and will help to fix PR32146. Differential Revision: https://reviews.llvm.org/D33452 llvm-svn: 303758
* [InstCombine] add tests to show potential missing folds; NFCSanjay Patel2017-05-241-0/+39
| | | | | | | | | As noted in https://bugs.llvm.org/show_bug.cgi?id=33138 and the comments, there are multiple ways to view this. If we choose not to solve this in InstCombine, these tests will serve as documentation of that choice. llvm-svn: 303755
* [InstCombine] add tests to document bitcast + bitwise-logic behavior; NFCSanjay Patel2017-05-241-0/+45
| | | | | | | | The solution for PR26702 ( https://bugs.llvm.org/show_bug.cgi?id=26702 ) added a canonicalization rule, but the minimal regression tests don't demonstrate how that rule interacts with other folds. llvm-svn: 303750
* Revert "[SCEV] Do not fold dominated SCEVUnknown into AddRecExpr start"Diana Picus2017-05-245-115/+18
| | | | | | This reverts commit r303730 because it broke all the buildbots. llvm-svn: 303747
* [LoopVectorizer] Let target prefer scalar addressing computations.Jonas Paulsson2017-05-241-0/+72
| | | | | | | | | | | | | | | | | | | | | | The loop vectorizer usually vectorizes any instruction it can and then extracts the elements for a scalarized use. On SystemZ, all elements containing addresses must be extracted into address registers (GRs). Since this extraction is not free, it is better to have the address in a suitable register to begin with. By forcing address arithmetic instructions and loads of addresses to be scalar after vectorization, two benefits result: * No need to extract the register * LSR optimizations trigger (LSR isn't handling vector addresses currently) Benchmarking shows improvements on SystemZ with this new behaviour. Any other target could try this by returning false in the new hook prefersVectorizedAddressing(). Review: Renato Golin, Elena Demikhovsky, Ulrich Weigand https://reviews.llvm.org/D32422 llvm-svn: 303744
* MachineCSE: Respect interblock physreg livenessMikael Holmen2017-05-241-0/+35
| | | | | | | | | | | | | | | | | | | | Summary: This is a fix for PR32538. MachineCSE first looks at MO.isDead(), but if it is not marked dead, MachineCSE still wants to do its own check to see if it is trivially dead. This check for the trivial case assumed that physical registers cannot be live out of a block. Patch by Mattias Eriksson. Reviewers: qcolombet, jbhateja Reviewed By: qcolombet, jbhateja Subscribers: jbhateja, llvm-commits Differential Revision: https://reviews.llvm.org/D33408 llvm-svn: 303731
* [SCEV] Do not fold dominated SCEVUnknown into AddRecExpr startMax Kazantsev2017-05-245-18/+115
| | | | | | | | | | | | | | | | | | | | | | | | | | | When folding arguments of AddExpr or MulExpr with recurrences, we rely on the fact that the loop of our base recurrence is the bottom-most in terms of domination. This assumption may be broken by an expression which is treated as invariant, and which depends on a complex Phi for which SCEVUnknown was created. If such a Phi is a loop Phi, and this loop is lower than the chosen AddRecExpr's loop, it is invalid to fold our expression with the recurrence. Another reason why it might be invalid to fold SCEVUnknown into the Phi start value is that, unlike other SCEVs, SCEVUnknown are sometimes position-bound. For example, here: for (...) { // loop phi = {A,+,B} } X = load ... Folding phi + X into {A+X,+,B}<loop> actually makes no sense, because X does not exist and cannot exist while we are iterating in the loop (this memory may not even be allocated or filled at this point). It is only valid to make such a folding if X is defined before the loop. In this case the recurrence {A+X,+,B}<loop> may exist. This patch prohibits folding of SCEVUnknown (and expressions that use them) into the start value of an AddRecExpr, if this instruction is dominated by the loop. Merging the dominating unknown values is still valid. Some tests that relied on the fact that some SCEVUnknown should be folded into AddRecs are changed so that they no longer expect such behavior. llvm-svn: 303730
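The folding discussed in this commit message can be illustrated with a small numeric sketch. This is not SCEV code: `addrec` is a hypothetical helper that models the recurrence {start,+,step} as a plain function of the iteration number, and the values are made up. It only shows why the fold is arithmetically sound when X is a value that already exists before the loop starts.

```python
def addrec(start, step):
    """Model {start,+,step}: the value on iteration i is start + step * i."""
    return lambda i: start + step * i


# X is defined before the loop, so it is a fixed number on every iteration.
A, B, X = 3, 2, 10

phi = addrec(A, B)          # {A,+,B}
folded = addrec(A + X, B)   # {A+X,+,B}

# With a loop-invariant X, phi + X and the folded recurrence agree on
# every iteration.
agree = all(phi(i) + X == folded(i) for i in range(100))
```

If X were instead produced after the loop (the `X = load ...` case in the message), there would be no value of X to plug into the start of the recurrence while the loop is running, which is the situation the patch rules out.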
* Explicitly set CPU and -slow-incdec to try to fix r303678's test on ↵Daniel Sanders2017-05-241-1/+1
| | | | | | llvm-clang-x86_64-expensive-checks-win. llvm-svn: 303727
* Revert r303720: Tweak r303678's test to try to fix ↵Daniel Sanders2017-05-241-1/+1
| | | | | | | | llvm-clang-x86_64-expensive-checks-win. It doesn't fix that builder. llvm-svn: 303721
* Tweak r303678's test to try to fix llvm-clang-x86_64-expensive-checks-win.Daniel Sanders2017-05-241-1/+1
| | | | | | | | I suspect this buildbot has slow-incdec set by default, most likely due to the default CPU having this set. This feature bit can prevent optsize from having an effect on this IR. llvm-svn: 303720
* [NewGVN] Update additionalUsers when we simplify to a value.Davide Italiano2017-05-241-0/+45
| | | | | | | | | | Otherwise we don't revisit an instruction that could be simplified, and when we verify, we discover there's something that changed, i.e. what we had wasn't a maximal fixpoint. Fixes PR32836. llvm-svn: 303715
* Revert "Disable coverage opt-out for strong postdominator blocks."George Karpenkov2017-05-241-27/+0
| | | | | | | This reverts commit 2ed06f05fc10869dd1239cff96fcdea2ee8bf4ef. Buildbots do not like this on Linux. llvm-svn: 303710
* Revert "Fixes for tests for r303698"George Karpenkov2017-05-242-7/+2
| | | | | | This reverts commit 69bfaf72e7502eb08bbca88a57925fa31c6295c6. llvm-svn: 303709
* Fixes for tests for r303698George Karpenkov2017-05-232-2/+7
| | | | llvm-svn: 303701
* [LIR] Strengthen the check for recurrence variable in popcnt/CTLZ.Davide Italiano2017-05-231-0/+35
| | | | | | | Fixes PR33114. Differential Revision: https://reviews.llvm.org/D33420 llvm-svn: 303700
* Disable coverage opt-out for strong postdominator blocks.George Karpenkov2017-05-231-0/+27
| | | | | | | | | | | | | | | | Coverage instrumentation has an optimization not to instrument extra blocks, if the pass is already "accounted for" by a successor/predecessor basic block. However (https://github.com/google/sanitizers/issues/783) this reasoning may become circular, which stops valid paths from having coverage. In the worst case this can cause fuzzing to stop working entirely. This change simplifies logic to something which trivially can not have such circular reasoning, as losing valid paths does not seem like a good trade-off for a ~15% decrease in the # of instrumented basic blocks. llvm-svn: 303698
* [MSP430] Add subtarget features for hardware multiplier.Vadzim Dambrouski2017-05-233-0/+3
| | | | | | | | Also add more processors to make -mcpu option behave similar to gcc. Differential Revision: https://reviews.llvm.org/D33335 llvm-svn: 303695
* [AMDGPU] Add INDIRECT_BASE_ADDR to R600_Reg32 class (PR33045)Simon Pilgrim2017-05-2317-23/+23
| | | | | | | | This fixes 17 of the 41 -verify-machineinstrs test failures identified in PR33045 Differential Revision: https://reviews.llvm.org/D33451 llvm-svn: 303691
* AsmPrinter: mark the beginning and the end of a function in verbose modeFrancis Visoiu Mistrih2017-05-235-21/+42
| | | | llvm-svn: 303690
* AMDGPU/SI: Move the local memory usage related checking after calling ↵Changpeng Fang2017-05-231-0/+22
| | | | | | | | | | | | | | | convention checking in PromoteAlloca Summary: Promoting Alloca to Vector and Promoting Alloca to LDS are two independent ways of handling Alloca and should not affect each other. As a result, we should not give up promoting to vector if there is not enough LDS. This patch factors out the local memory usage checking and places it after the calling convention checking. Reviewer: arsenm Differential Revision: http://reviews.llvm.org/D33139 llvm-svn: 303684
* [AMDGPU] Combine and (srl) into shl (bfe)Stanislav Mekhanoshin2017-05-231-0/+41
| | | | | | | | | | | | | | | | | Perform DAG combine: and (srl x, c), mask => shl (bfe x, nb + c, mask >> nb), nb where nb is the number of trailing zeroes in mask. It replaces two instructions with two, and BFE is generally the more expensive one. However this is only done if we are selecting a byte or word at an aligned boundary, which results in a proper SDWA operand pattern. It is only done if SDWA is supported. TODO: improve the SDWA pass to actually convert this pattern. It is not done now because we have an immediate in the instruction, which has to be moved into a VGPR. Differential Revision: https://reviews.llvm.org/D33455 llvm-svn: 303681
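A quick way to convince oneself that this combine preserves the value is to model it on 32-bit integers. The sketch below is only an illustration, not the DAG combine itself. It assumes that BFE means an unsigned bit-field extract taking an offset and a width, and that the width is the number of set bits in `mask >> nb` (so `mask >> nb` is a contiguous run of low bits). All function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def bfe_u32(x, offset, width):
    """Unsigned bit-field extract: `width` bits of x starting at bit `offset`."""
    return (x >> offset) & ((1 << width) - 1)


def trailing_zeros(v):
    """Number of trailing zero bits of a non-zero integer."""
    return (v & -v).bit_length() - 1


def and_srl(x, c, mask):
    """Original pattern: and (srl x, c), mask."""
    return ((x & MASK32) >> c) & mask


def shl_bfe(x, c, mask):
    """Rewritten pattern: shl (bfe x, nb + c, width), nb."""
    nb = trailing_zeros(mask)
    width = bin(mask >> nb).count("1")  # assumes mask >> nb is a low-bit mask
    return (bfe_u32(x & MASK32, nb + c, width) << nb) & MASK32


# Selecting one byte at an aligned boundary, the case the message describes.
x, c, mask = 0x12345678, 8, 0xFF00
same = and_srl(x, c, mask) == shl_bfe(x, c, mask)
```

Here the mask selects bits 8 to 15 of the shifted value, so nb is 8 and the extract reads 8 bits of x starting at bit 16.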
* [ARM] Temporarily disable globals promotion to constant pools to prevent ↵Oleg Ranevskyy2017-05-233-15/+15
| | | | | | | | | | | | | | | | | | | | | miscompilation Summary: A temporary workaround for PR32780 - rematerialized instructions accessing the same promoted global through different constant pool entries. The patch turns off the globals promotion optimization leaving all its code in place, so that it can be easily turned on once PR32780 is fixed. Since this is a miscompilation issue causing generation of misbehaving code, and the problem is very subtle, the patch might be valuable enough to get into 4.0.1. Reviewers: efriedma, jmolloy Reviewed By: efriedma Subscribers: aemerson, javed.absar, llvm-commits, rengolin, asl, tstellar Differential Revision: https://reviews.llvm.org/D33446 llvm-svn: 303679
* [globalisel][tablegen] Add support for (set $dst, 1) and test X86's ↵Daniel Sanders2017-05-232-0/+122
| | | | | | | | | | | | | | | | | | | | | | | | | OptForSize predicate. Summary: It's rare but a small number of patterns use IntInit's at the root of the match. On X86, one such rule is enabled by the OptForSize predicate and causes the compiler to use the smaller: %0 = MOV32r1 instead of the usual: %0 = MOV32ri 1 This patch adds support for matching IntInit's at the root and uses this as a test case for the optsize attribute that was implemented in r301750 Reviewers: qcolombet, ab, t.p.northover, rovka, kristof.beyls, aditya_nandakumar Reviewed By: qcolombet Subscribers: igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D32791 llvm-svn: 303678
* [InstSimplify] Add more tests for undef inputs and multiplying by 0 for the ↵Craig Topper2017-05-231-0/+92
| | | | | | add/sub/mul with overflow intrinsics. NFC llvm-svn: 303671
* [InstSimplify] auto-generate test checks. NFCCraig Topper2017-05-231-33/+80
| | | | llvm-svn: 303664
* [InstCombine] auto-generate test checks; NFCSanjay Patel2017-05-231-19/+18
| | | | llvm-svn: 303663
* [InstCombine] allow icmp-xor folds for vectors (PR33138)Sanjay Patel2017-05-231-12/+4
| | | | | | | | | This fixes the first part of: https://bugs.llvm.org/show_bug.cgi?id=33138 More work is needed for the bitcasted variant. llvm-svn: 303660
* [InstCombine] Use update_test_checks to regenerate the ctpop test. NFCCraig Topper2017-05-231-9/+18
| | | | llvm-svn: 303659
* [InstCombine] add icmp-xor tests to show vector neglect; NFCSanjay Patel2017-05-232-87/+197
| | | | | | | | | | Also, rename the tests and the file, add comments, and add more tests because there are no existing tests for some of these folds. These patterns are particularly important for crippled vector ISAs that have limited compare predicates (PR33138). llvm-svn: 303652
* [AMDGPU] Convert shl (add) into add (shl)Stanislav Mekhanoshin2017-05-231-0/+40
| | | | | | | | | shl (or|add x, c2), c1 => or|add (shl x, c1), (c2 << c1) This allows folding a constant into an address in some cases, as well as eliminating a second shift if the expression is used as an address and the second shift is the result of a GEP. Differential Revision: https://reviews.llvm.org/D33432 llvm-svn: 303641
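The rewrite relies on a left shift distributing over `add` (modulo the register width) and over `or` (bit by bit). The following sketch checks both forms on 32-bit values. It is an illustration only: the 32-bit width and the helper names are assumptions, not taken from the patch.

```python
MASK32 = 0xFFFFFFFF


def shl32(v, n):
    """32-bit left shift with wraparound."""
    return (v << n) & MASK32


def add32(a, b):
    """32-bit addition with wraparound."""
    return (a + b) & MASK32


def shl_of_add(x, c1, c2):
    """Original pattern: shl (add x, c2), c1."""
    return shl32(add32(x, c2), c1)


def add_of_shl(x, c1, c2):
    """Rewritten pattern: add (shl x, c1), (c2 << c1)."""
    return add32(shl32(x, c1), shl32(c2, c1))


def shl_of_or(x, c1, c2):
    """Original pattern: shl (or x, c2), c1."""
    return shl32(x | c2, c1)


def or_of_shl(x, c1, c2):
    """Rewritten pattern: or (shl x, c1), (c2 << c1)."""
    return shl32(x, c1) | shl32(c2, c1)


samples = [(5, 2, 3), (0xFFFFFFFF, 4, 1), (0x80000000, 1, 0x7FFFFFFF)]
add_ok = all(shl_of_add(*s) == add_of_shl(*s) for s in samples)
or_ok = all(shl_of_or(*s) == or_of_shl(*s) for s in samples)
```

After the rewrite, `c2 << c1` is a constant, which is what makes it a candidate for folding into an address offset.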
* [RuntimeDyld, PowerPC] Fix check for external symbols when detecting ↵Ulrich Weigand2017-05-233-2/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | relocation overflow The PowerPC part of processRelocationRef currently assumes that external symbols can be identified by checking for SymType == SymbolRef::ST_Unknown. This is actually incorrect in some cases, causing relocation overflows to be mis-detected. The correct check is to test whether Value.SymbolName is null. Includes test case. Note that it is a bit tricky to replicate the exact condition that triggers the bug in a test case. The one included here seems to fail reliably (before the fix) across different operating system versions on Power, but it still makes a few assumptions (called out in the test case comments). Also add ppc64le platform name to the supported list in the lit.local.cfg files for the MCJIT and OrcMCJIT directories, since those tests were currently not run at all. Fixes PR32650. Reviewer: hfinkel Differential Revision: https://reviews.llvm.org/D33402 llvm-svn: 303637
* [JumpThreading] Safely replace uses of conditionAnna Thomas2017-05-233-52/+184
| | | | | | | | | | | | | | | | | | | | | | This patch builds over https://reviews.llvm.org/rL303349 and replaces the use of the condition only if it is safe to do so. We should not blindly RAUW the condition if experimental.guard or assume is a use of that condition. This is because LVI may have used the guard/assume to identify the value of the condition, and RAUWing will fold the guard/assume and uses before the guards/assumes. Reviewers: sanjoy, reames, trentxintong, mkazantsev Reviewed by: sanjoy, reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33257 llvm-svn: 303633
* [AMDGPU] SDWA: Add assembler support for GFX9Sam Kolton2017-05-231-169/+268
| | | | | | | | | | | | | Summary: Added separate pseudo and real instructions for GFX9 SDWA instructions. Currently supported only in the assembler. Depends on D32493 Reviewers: vpykhtin, artem.tamazov Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D33132 llvm-svn: 303620
* [AArch64] Make instruction fusion more aggressive. Florian Hahn2017-05-231-74/+71
| | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch makes instruction fusion more aggressive by * adding artificial edges between the successors of FirstSU and SecondSU, similar to BaseMemOpClusterMutation::clusterNeighboringMemOps. * updating PostGenericScheduler::tryCandidate to keep clusters together, similar to GenericScheduler::tryCandidate. This change increases the number of AES instruction pairs generated on Cortex-A57 and Cortex-A72. This doesn't change code at all in most benchmarks or general code, but we've seen improvement on kernels using AESE/AESMC and AESD/AESIMC. Reviewers: evandro, kristof.beyls, t.p.northover, silviu.baranga, atrick, rengolin, MatzeB Reviewed By: evandro Subscribers: aemerson, rengolin, MatzeB, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D33230 llvm-svn: 303618
* [GlobalISel][X86] G_LOAD/G_STORE vec256/512 supportIgor Breger2017-05-235-40/+530
| | | | | | | | | | | | | | Summary: mark G_LOAD/G_STORE vec256/512 legal for AVX/AVX512. Implement instruction selection. Reviewers: zvi, guyblank Reviewed By: zvi Subscribers: rovka, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D33268 llvm-svn: 303617
* [LV] Report multiple reasons for not vectorizing under allowExtraAnalysisAyal Zaks2017-05-231-16/+108
| | | | | | | | | | | | | | | | | | | | The default behavior of -Rpass-analysis=loop-vectorizer is to report only the first reason encountered for not vectorizing, if one is found, at which time the vectorizer aborts its handling of the loop. This patch allows multiple reasons for not vectorizing to be identified and reported, at the potential expense of additional compile-time, under allowExtraAnalysis which can currently be turned on by Clang's -fsave-optimization-record and opt's -pass-remarks-missed. Removed from LoopVectorizationLegality::canVectorize() the redundant checking and reporting if we CantComputeNumberOfIterations, as LAI::canAnalyzeLoop() also does that. This redundancy is caught by a lit test once multiple reasons are reported. Patch initially developed by Dror Barak. Differential Revision: https://reviews.llvm.org/D33396 llvm-svn: 303613
* libDebugInfo: Support symbolizing using DWP filesDavid Blaikie2017-05-233-0/+8
| | | | llvm-svn: 303609
* [AArch64] Fix PR33100.Akira Hatanaka2017-05-231-0/+19
| | | | | | | | | | | | | This commit fixes a bug introduced in r301019 where optimizeLogicalImm would replace a logical node's immediate operand that was CSE'd and was also an operand of another node. This commit fixes the bug by replacing the logical node instead of its immediate operand. rdar://problem/32295276 llvm-svn: 303607
* Update expected result for or-branch.ll . NFCAmaury Sechet2017-05-231-13/+53
| | | | llvm-svn: 303606
* Support for taking the max of module flags when linking, use for PIE/PICTeresa Johnson2017-05-233-9/+15
| | | | | | | | | | | | | | | | | | | | | | Summary: Add Max ModFlagBehavior, which can be used to take the max of two module flag values when merging modules. Use it for the PIE and PIC levels. This avoids an error when we try to import from a module built -fpic into a module built -fPIC, for example. For both PIE and PIC levels, this will be legal, since the code generation gets more conservative as the level is increased. Therefore we can take the max instead of somehow trying to block importing between modules compiled with different levels. Reviewers: tmsriram, pcc Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33418 llvm-svn: 303590
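The merge rule described here can be sketched as a dictionary merge that keeps the larger value for each flag. This is a simplified model, not the linker's implementation: `merge_max` is a hypothetical helper, and the flag name "PIC Level" with level 1 for -fpic and level 2 for -fPIC is an assumption made for the example.

```python
def merge_max(dst_flags, src_flags):
    """Merge two module-flag maps, keeping the maximum value for each flag."""
    merged = dict(dst_flags)
    for name, level in src_flags.items():
        # A flag present in only one module is kept unchanged.
        merged[name] = max(level, merged.get(name, level))
    return merged


# Importing from a module built with the lower level into one built with
# the higher level (or the other way round) yields the higher level.
small_pic = {"PIC Level": 1}
big_pic = {"PIC Level": 2}
result = merge_max(big_pic, small_pic)
```

Because the result does not depend on the order of the two modules, the merge cannot fail in the way a "values must match" rule would.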
* InstructionSimplify: don't speculate about Constants changing.Tim Northover2017-05-221-0/+39
| | | | | | | | When presented with an icmp/select pair, we can end up asking what would happen if we replaced one constant with another in an instruction. This is a mistake: while non-constant Values could become a constant, constants cannot change, and trying to do so can lead to completely invalid IR (a GEP referencing a non-existent field in the original case). llvm-svn: 303580
* Infer relocation model from module flags in relocatable LTO link.Evgeniy Stepanov2017-05-221-0/+63
| | | | | | Fix for PR33096. llvm-svn: 303578
* Implement various flavors of type merging.Zachary Turner2017-05-229-2/+421
| | | | | | | | | | | | Previous algorithm assumed that types and ids are in a single unified stream. For inputs that come from object files, this is the case. But if the input is already a PDB, or is the result of a previous merge, then the types and ids will already have been split up, in which case we need an algorithm that can operate on independent streams of types and ids that refer across stream boundaries to each other. Differential Revision: https://reviews.llvm.org/D33417 llvm-svn: 303577
* Don't generate line&scope debug info for meta-instructions.Adrian Prantl2017-05-221-0/+122
| | | | | | | | | | | | | | | MachineInstructions that don't generate any code (such as IMPLICIT_DEFs) should not generate any debug info either. Fixes PR33107. https://bugs.llvm.org/show_bug.cgi?id=33107 This reapplies r303566 without any modifications. The stage2 build failures persisted even after reverting this patch, and looking back through history, it looks like these tests are flaky. llvm-svn: 303575
* Fix update VP metadata after inlining for instrumentation PGOTeresa Johnson2017-05-222-0/+57
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: With instrumentation profiling, when updating the VP metadata after an inline, VP metadata on the inlined copy was inadvertently having all counts zeroed out. This was causing indirect calls from code inlined during the call step to be marked as cold in the ThinLTO summaries and not imported. The CallerBFI needs to be passed down so that the CallSiteCount can be computed from the profile summary info. With Sample PGO this was working since the count is extracted from the branch weight metadata on the call being inlined (even before we stopped looking at metadata for non-sample PGO in r302844 this largely wasn't working for instrumentation PGO since only promoted indirect calls would be getting inlined and have the metadata). Added an instrumentation PGO test and renamed the sample PGO test. Reviewers: danielcdh, eraman Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D33389 llvm-svn: 303574
* Revert "Don't generate line&scope debug info for meta-instructions."Adrian Prantl2017-05-221-122/+0
| | | | | | This reverts commit r303566 while investigating a stage2 buildbot failure. llvm-svn: 303570
* [AMDGPU] Narrow lshl from 64 to 32 bit if possibleStanislav Mekhanoshin2017-05-2213-26/+72
| | | | | | | | | Turn expensive 64 bit shift into 32 bit if shift does not overflow int: shl (ext x) => zext (shl x) Differential Revision: https://reviews.llvm.org/D33367 llvm-svn: 303569
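The condition "shift does not overflow int" can be made concrete with a small model. The sketch below assumes the extension is a zero extension of a 32-bit value and compares the 64-bit shift with the narrowed 32-bit shift. The helper names are hypothetical, and the way the compiler actually proves the no-overflow condition is not shown here.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def shl64_of_zext(x, n):
    """Original pattern: 64-bit shl of a zero-extended 32-bit value."""
    return ((x & MASK32) << n) & MASK64


def zext_of_shl32(x, n):
    """Narrowed pattern: 32-bit shl, then zero-extend to 64 bits."""
    return ((x & MASK32) << n) & MASK32


def shift_fits_in_32(x, n):
    """True when the shifted value has no bits above bit 31."""
    return ((x & MASK32) << n) <= MASK32


# Safe case: the top 16 bits of x are zero, so a shift by 16 stays in range.
safe_equal = shl64_of_zext(0x0000FFFF, 16) == zext_of_shl32(0x0000FFFF, 16)

# Unsafe case: bit 16 is set, so a shift by 16 moves it into bit 32.
unsafe_equal = shl64_of_zext(0x00010000, 16) == zext_of_shl32(0x00010000, 16)
```

In the second case the 32-bit shift drops the bit that the 64-bit shift keeps, which is why the narrowing is restricted to shifts that cannot overflow.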
* Don't generate line&scope debug info for meta-instructions.Adrian Prantl2017-05-221-0/+122
| | | | | | | | | | | MachineInstructions that don't generate any code (such as IMPLICIT_DEFs) should not generate any debug info either. Fixes PR33107. https://bugs.llvm.org/show_bug.cgi?id=33107 llvm-svn: 303566
* [X86] Remove target feature info from mul-i256.ll test. NFC.Nirav Dave2017-05-221-2/+1
| | | | llvm-svn: 303558
* [mips] Support micromips attribute passed by front-endSimon Atanasyan2017-05-221-0/+39
| | | | | | | | | This patch adds handling of the `micromips` and `nomicromips` attributes passed by front-end. The patch depends on D33363. Differential revision: https://reviews.llvm.org/D33364 llvm-svn: 303545