path: root/llvm/lib
Commit message | Author | Age | Files | Lines
...
* [SLPVectorizer] Replace VL[0] to VL0 with assert, add propagateIRFlags extra ↵Dinar Temirbulatov2017-08-151-7/+7
| | | | | | | | parameter VL0, replace E->Scalars[0] to VL0, NFCI. llvm-svn: 310904
* [CMake] Add install target for LLVMFuzzerPetr Hosek2017-08-141-0/+21
| | | | | | | | This allows including LLVMFuzzer as distribution component. Differential Revision: https://reviews.llvm.org/D36540 llvm-svn: 310897
* Add missing dependency in ICP. (NFC)Dehao Chen2017-08-141-4/+9
| | | | llvm-svn: 310896
* [MachineOutliner] Only outline candidates of length >= 2Jessica Paquette2017-08-141-0/+7
| | | | | | | | | Since we don't factor in instruction lengths into outlining calculations right now, it's never the case that a candidate could have length < 2. Thus, we should quit early when we see such candidates. llvm-svn: 310894
* [InstSimplify] Teach decomposeBitTestICmp to handle non-canonical comparesCraig Topper2017-08-141-0/+28
| | | | | | | | This adds support for non-canonical compare predicates. InstSimplify can't rely on canonicalization to have occurred. Differential Revision: https://reviews.llvm.org/D36646 llvm-svn: 310893
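To make the idea of decomposing a compare into a bit test concrete, here is a minimal standalone sketch on 8-bit values. It is not LLVM's code: the names `Pred`, `BitTest`, and `decompose` are hypothetical, and the exact set of predicates and constants the real decomposeBitTestICmp accepts is an assumption. The SLE, SGE, ULE, and UGE cases stand in for the non-canonical forms mentioned above.

```cpp
#include <cstdint>

// Hypothetical names throughout; this only illustrates the idea.
enum class Pred { SLT, SLE, SGT, SGE, ULT, ULE, UGT, UGE };

struct BitTest {
  uint8_t Mask; // bits of X that take part in the test
  bool IsEq;    // true: (X & Mask) == 0, false: (X & Mask) != 0
};

static bool isPowerOf2(unsigned V) { return V != 0 && (V & (V - 1)) == 0; }

// Try to rewrite "X <pred> C" (8-bit X) as a test of (X & Mask) against zero.
// Returns false when this sketch knows no such rewrite.
bool decompose(Pred P, uint8_t C, BitTest &Out) {
  const uint8_t SignBit = 0x80;
  switch (P) {
  case Pred::SLT: // X <s 0: sign bit set
    if (C == 0) { Out = BitTest{SignBit, false}; return true; }
    break;
  case Pred::SLE: // X <=s -1: same test, non-canonical spelling
    if (C == 0xFF) { Out = BitTest{SignBit, false}; return true; }
    break;
  case Pred::SGT: // X >s -1: sign bit clear
    if (C == 0xFF) { Out = BitTest{SignBit, true}; return true; }
    break;
  case Pred::SGE: // X >=s 0: same test, non-canonical spelling
    if (C == 0) { Out = BitTest{SignBit, true}; return true; }
    break;
  case Pred::ULT: // X <u 2^k: no bit at or above position k is set
    if (isPowerOf2(C)) { Out = BitTest{static_cast<uint8_t>(-C), true}; return true; }
    break;
  case Pred::ULE: // X <=u 2^k - 1: same test, non-canonical spelling
    if (isPowerOf2(C + 1u)) { Out = BitTest{static_cast<uint8_t>(~C), true}; return true; }
    break;
  case Pred::UGT: // X >u 2^k - 1: some bit at or above position k is set
    if (isPowerOf2(C + 1u)) { Out = BitTest{static_cast<uint8_t>(~C), false}; return true; }
    break;
  case Pred::UGE: // X >=u 2^k: same test, non-canonical spelling
    if (isPowerOf2(C)) { Out = BitTest{static_cast<uint8_t>(-C), false}; return true; }
    break;
  }
  return false;
}
```

A caller that cannot assume canonical input would probably want every spelling of the same test to produce the same mask, which is why `X <=u 7` and `X <u 8` both map to mask 0xF8 here.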
* Remove checks for debug info intrinsics in use lists, NFCReid Kleckner2017-08-143-4/+0
| | | | | | | These haven't done anything since debug info intrinsics stopped appearing in Value use lists in 2014. llvm-svn: 310892
* [MIPS] Implement support for -mstack-alignment.John Baldwin2017-08-146-16/+32
| | | | | | | | | | | | | | | | | | | | | Summary: This is modeled on the implementation for x86 which stores the command line option in a 'StackAlignOverride' field in MipsSubtarget and then uses this to compute a 'stackAlignment' value in MipsSubtarget::initializeSubtargetDependencies. The stackAlignment() method in MipsSubtarget is renamed to getStackAlignment() and returns the computed 'stackAlignment'. Reviewers: sdardis Reviewed By: sdardis Subscribers: llvm-commits, arichardson Differential Revision: https://reviews.llvm.org/D35874 llvm-svn: 310891
* Recommit r310869, "[InstSimplify][InstCombine] Modify the interface of ↵Craig Topper2017-08-145-51/+54
| | | | | | | | | | | | | | | | | | | | decomposeBitTestICmp and use it in the InstSimplify" This recommits r310869, with the moved files and no extra changes. Original commit message: This addresses a fixme in InstSimplify about using decomposeBitTest. This also fixes InstSimplify to handle ugt and ult compares too. I've modified the interface a little to return only the APInt version of the mask that InstSimplify needs. InstCombine now has a small wrapper routine to create a Constant out of it. I've also dropped the returning of 0 since InstSimplify doesn't need that. So InstCombine creates a zero constant itself. I also had to make decomposeBitTest support vectors since InstSimplify needs that. As InstSimplify can't use something from the Transforms library, I've moved the CmpInstAnalysis code to the Analysis library. Differential Revision: https://reviews.llvm.org/D36593 llvm-svn: 310889
* [InlineCost] Refactor the checks for different analyses to be a bit moreChandler Carruth2017-08-141-62/+62
| | | | | | | | | | | | | | | | | | | localized to the code that uses those analyses. Technically, this can change behavior as we no longer require the existence of the ProfileSummaryInfo analysis to use local profile information via BFI. We didn't actually require the PSI to have an interesting profile though, so this only really impacts the behavior in non-default pass pipelines. IMO, this makes it substantially less surprising how everything works -- before an analysis that wasn't actually used had to exist to trigger *any* profile aware inlining. I think the new organization makes it more obvious where various checks for profile signals happen. Differential Revision: https://reviews.llvm.org/D36710 llvm-svn: 310888
* Add strictfp attribute to prevent unwanted optimizations of libm callsAndrew Kaylor2017-08-1411-82/+117
| | | | | | Differential Revision: https://reviews.llvm.org/D34163 llvm-svn: 310885
* [libFuzzer] try to use less RAM while processing the initial corpusKostya Serebryany2017-08-141-1/+2
| | | | llvm-svn: 310881
* [libFuzzer] explicitly use -fsanitize-coverage=trace-pc-guard in ↵Kostya Serebryany2017-08-143-7/+7
| | | | | | test/dump_coverage.test; mark print_coverage/dump_coverage as To-be-deprecated llvm-svn: 310877
* IPRA: Allow target to enable IPRA by defaultMatt Arsenault2017-08-142-6/+10
| | | | llvm-svn: 310876
* IPRA: Run RegUsageInfoPropagate much laterMatt Arsenault2017-08-141-3/+3
| | | | | | | | | | | | This was running immediately after isel, before isel pseudos were even expanded which is really unreasonable. Move this to before pre-regalloc passes in case some other pre-regalloc pass wants to use the updated regmask info. Fixes one of the reasons IPRA doesn't do anything on AMDGPU currently. Tests will be included with future patch after a few more are fixed. llvm-svn: 310875
* Revert r310869 "[InstSimplify][InstCombine] Modify the interface of ↵Craig Topper2017-08-145-37/+143
| | | | | | | | decomposeBitTestICmp and use it in the InstSimplify" Failed to add the two files that moved. And then added an extra change I didn't mean to while trying to fix that. Reverting everything. llvm-svn: 310873
* Revert r310870 "[InstCombine][InstSimplify] 'git add' two files that moved ↵Craig Topper2017-08-141-137/+0
| | | | | | | | in r310869." An extra change crept in here. llvm-svn: 310872
* [InstCombine][InstSimplify] 'git add' two files that moved in r310869.Craig Topper2017-08-141-0/+137
| | | | llvm-svn: 310870
* [InstSimplify][InstCombine] Modify the interface of decomposeBitTestICmp and ↵Craig Topper2017-08-145-143/+37
| | | | | | | | | | | | | | | | use it in the InstSimplify This addresses a fixme in InstSimplify about using decomposeBitTest. This also fixes InstSimplify to handle ugt and ult compares too. I've modified the interface a little to return only the APInt version of the mask that InstSimplify needs. InstCombine now has a small wrapper routine to create a Constant out of it. I've also dropped the returning of 0 since InstSimplify doesn't need that. So InstCombine creates a zero constant itself. I also had to make decomposeBitTest support vectors since InstSimplify needs that. As InstSimplify can't use something from the Transforms library, I've moved the CmpInstAnalysis code to the Analysis library. Differential Revision: https://reviews.llvm.org/D36593 llvm-svn: 310869
* [PowerPC] Add codegen for VSX word extract convert to FPLei Huang2017-08-141-0/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | Add codegen for VSX word extract conversion from signed/unsigned to single/double precision. For UINT_TO_FP: Extract word unsigned and convert to float was implemented in https://reviews.llvm.org/D20239. Here we will add the missing extract integer and conversion to double. This utilizes the new P9 instruction xxextractuw to extract an integer element when the result will be converted to double, thereby saving 2 direct moves (VSR <-> GPR). For SINT_TO_FP: We will implement the following sequence which will also reduce the number of instructions by saving 2 direct moves. v4i32->f32: xxspltw xvcvsxwsp xscvspdpn v4i32->f64: xxspltw xvcvsxwdp Differential Revision: https://reviews.llvm.org/D35859 llvm-svn: 310866
* [ValueTracking] Don't delete assumes of side-effectful instructionsHal Finkel2017-08-141-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | ValueTracking has to strike a balance when attempting to propagate information backwards from assumes, because if the information is trivially propagated backwards, it can appear to LLVM that the assumption is known to be true, and therefore can be removed. This is sound (because an assumption has no semantic effect except for causing UB), but prevents the assume from allowing further optimizations. The isEphemeralValueOf check exists to try and prevent this issue by not removing the source of an assumption. This tries to make it a little bit more general to handle the case of side-effectful instructions, such as in %0 = call i1 @get_val() %1 = xor i1 %0, true call void @llvm.assume(i1 %1) Patch by Ariel Ben-Yehuda, thanks! Differential Revision: https://reviews.llvm.org/D36590 llvm-svn: 310859
* Revert "Reland "[mips][mt][6/7] Add support for mftr, mttr instructions.""Simon Dardis2017-08-147-375/+1
| | | | | | | This reverts r310834. It didn't pacify the buildbot, FileCheck is still crashing. llvm-svn: 310854
* [x86] fold the mask op on 8- and 16-bit rotatesSanjay Patel2017-08-141-3/+39
| | | | | | | | | | | | | | | | | | | | | | | Ref the post-commit thread for r310770: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170807/478507.html The motivating cases as 'C' source examples can look like this: unsigned char rotate_right_8(unsigned char v, int shift) { // shift &= 7; v = ( v >> shift ) | ( v << ( 8 - shift ) ); return v; } https://godbolt.org/g/K6rc1A Notice that the source doesn't contain UB-safe masked shift amounts, but instcombine created those in order to produce narrow rotate patterns. This should be the last step needed to resolve PR34046: https://bugs.llvm.org/show_bug.cgi?id=34046 Differential Revision: https://reviews.llvm.org/D36644 llvm-svn: 310849
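For contrast with the source pattern quoted above, here is a sketch of the UB-safe form with masked shift amounts, written for both 8-bit and 16-bit values. The function names are hypothetical. This is the standard masked-rotate idiom in C++, not code taken from the patch.

```cpp
#include <cstdint>

// Rotate right with both shift amounts masked into the valid range, so no
// shift count ever reaches the bit width of the rotated value.
uint8_t rotateRight8(uint8_t V, unsigned Shift) {
  Shift &= 7;
  return static_cast<uint8_t>((V >> Shift) | (V << ((8 - Shift) & 7)));
}

uint16_t rotateRight16(uint16_t V, unsigned Shift) {
  Shift &= 15;
  return static_cast<uint16_t>((V >> Shift) | (V << ((16 - Shift) & 15)));
}
```

With the masks in place, a shift of 0 or of a full width returns the input unchanged instead of relying on an out-of-range shift.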
* [SLPVectorizer] Schedule bundle with different opcodes.Dinar Temirbulatov2017-08-141-52/+140
| | | | | | | | | | This change lets us schedule a bundle with different opcodes in it, for example: [ load, add, add, add ] Reviewers: mkuper, RKSimon, ABataev, mzolotukhin, spatel, filcab Subscribers: llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D36518 llvm-svn: 310847
* [X86] Fix a place that was mishandling X86ISD::UMUL.Craig Topper2017-08-141-1/+1
| | | | | | | | According to the X86ISelLowering.h, UMUL results are low, high, and flags. But this place was treating result 1 or 2 as flags. Differential Revision: https://reviews.llvm.org/D36654 llvm-svn: 310846
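The result ordering described above (low, high, flags) can be pictured with a small standalone model of a widening 32-bit unsigned multiply. The struct and function names are hypothetical and this is not the SelectionDAG node itself. The overflow rule (flag set when the high half is nonzero) matches how the x86 MUL instruction sets CF and OF.

```cpp
#include <cstdint>

// Hypothetical mirror of a three-result multiply, in the order the commit
// message gives: result 0 is the low half, result 1 the high half, and
// result 2 the flags.
struct UMulResult {
  uint32_t Lo;   // result 0
  uint32_t Hi;   // result 1
  bool Overflow; // result 2: set when the product does not fit in Lo
};

UMulResult umul32(uint32_t A, uint32_t B) {
  const uint64_t Wide = static_cast<uint64_t>(A) * B;
  const uint32_t Hi = static_cast<uint32_t>(Wide >> 32);
  return UMulResult{static_cast<uint32_t>(Wide), Hi, Hi != 0};
}
```

Code that treats result 1 as the flags would be reading the high half of the product, which is the kind of mix-up the fix addresses.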
* [X86] Remove flag setting ISD nodes from computeKnownBitsForTargetNodeCraig Topper2017-08-141-15/+0
| | | | | | | | | | | | | | | Summary: The flag result is an i32 type. But it's only really used for connectivity. I don't think anything even assumes a particular format. We don't ever do any real operations on it. So known bits don't help us optimize anything. My main motivation is that the UMUL behavior is actually wrong. I was going to fix this in D36654, but then realized there was just no reason for it to be here. Reviewers: RKSimon, zvi, spatel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36657 llvm-svn: 310845
* [AVX512] Make the itinerary parameter actually pass through the ↵Craig Topper2017-08-141-2/+2
| | | | | | | | | | | | | | | | AVX512_maskable_common multiclass Summary: This looks to have been disconnected about 3 years ago in r219358. Reviewers: gadi.haber, RKSimon, zvi Reviewed By: gadi.haber Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36658 llvm-svn: 310844
* [AVX512] Remove leftover code for when i1 was a legal type from the fast ↵Craig Topper2017-08-141-14/+0
| | | | | | | | | | | | | | | | | | | isel load/store code. Summary: I don't think we need this code anymore. It only existed because i1 used to be legal. There's probably more unneeded code in fast isel still. Reviewers: guyblank, zvi Reviewed By: guyblank Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36652 llvm-svn: 310843
* [BDCE] reduce scope of an assert (PR34179)Sanjay Patel2017-08-141-10/+7
| | | | | | | | | | | | | | | | | The assert was added with r310779 and is usually correct, but as the test shows, not always. The 'volatile' on the load is needed to expose the faulty path because without it, DemandedBits would return that the load is just dead rather than not demanded, and so we wouldn't hit the bogus assert. Also, since the lambda is just a single-line now, get rid of it and inline the DB.isAllOnesValue() calls. This should fix (prevent execution of a faulty assert): https://bugs.llvm.org/show_bug.cgi?id=34179 llvm-svn: 310842
* Reland "[mips][mt][6/7] Add support for mftr, mttr instructions."Simon Dardis2017-08-147-1/+375
| | | | | | | | | | | | | | | | | | This adjusts the tests to hopefully pacify the llvm-clang-x86_64-expensive-checks-win buildbot. Unlike many other instructions, these instructions have aliases which take coprocessor registers, gpr register, accumulator (and dsp accumulator) registers, floating point registers, floating point control registers and coprocessor 2 data and control operands. For the moment, these aliases are treated as pseudo instructions which are expanded into the underlying instruction. As a result, disassembling these instructions shows the underlying instruction and not the alias. Reviewers: slthakur, atanasyan Differential Revision: https://reviews.llvm.org/D35253 llvm-svn: 310834
* [DAGCombine] Do not try to deduplicate commutative operations if both ↵Amaury Sechet2017-08-141-3/+3
| | | | | | | | | | | | operands are the same. Summary: It is creating useless work as the commuted node is the same as the node we are working on in that case. Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33840 llvm-svn: 310832
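The saving described above can be sketched with a toy node table. This is an illustrative model, not DAGCombiner code: `NodeKey`, `findCommutedNode`, and the integer ids are all hypothetical. The point is only that when both operands are the same, the commuted form is the node itself, so the lookup can be skipped.

```cpp
#include <map>
#include <tuple>

// Hypothetical node table keyed by (opcode, lhs id, rhs id).
using NodeKey = std::tuple<int, int, int>;

// Returns the id of an existing node that computes the commuted form
// (Opcode RHS, LHS), or -1 if there is none. Lookups counts table probes.
int findCommutedNode(const std::map<NodeKey, int> &Nodes, int Opcode, int LHS,
                     int RHS, unsigned &Lookups) {
  if (LHS == RHS)
    return -1; // commuting (op x, x) yields the same node: nothing to find
  ++Lookups;
  auto It = Nodes.find(std::make_tuple(Opcode, RHS, LHS));
  return It == Nodes.end() ? -1 : It->second;
}
```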
* [SelectionDAG] combine vextract (v1iX extract_subvector(vNiX, Idx))Elad Cohen2017-08-141-0/+9
| | | | | | | | | into vextract(vNiX,Idx) when creating vextract with getNode(). This case appeared in AVX512 after fixing pr33349 in r310552. Differential revision: https://reviews.llvm.org/D36571 llvm-svn: 310828
* MachineInstr: Reason locally about some memory objects before going to AA.Balaram Makam2017-08-141-17/+42
| | | | | | This addresses a FIXME in MachineInstr::mayAlias. llvm-svn: 310825
* [LoopUnroll] Enable option to peel remainder loopSam Parker2017-08-143-10/+37
| | | | | | | | | | | | | | | | | | On some targets, the penalty of executing runtime unrolling checks and then not the unrolled loop can be significantly detrimental to performance. This results in the need to be more conservative with the unroll count, keeping a trip count of 2 reduces the overhead as well as increasing the chance of the unrolled body being executed. But being conservative leaves performance gains on the table. This patch enables the unrolling of the remainder loop introduced by runtime unrolling. This can help reduce the overhead of misunrolled loops because the cost of non-taken branches is much less than the cost of the backedge that would normally be executed in the remainder loop. This allows larger unroll factors to be used without suffering performance losses with smaller iteration counts. Differential Revision: https://reviews.llvm.org/D36309 llvm-svn: 310824
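At the source level, the shape of the transformation described above looks roughly like the following hand-written sketch. The unroll factor of 4 and the function name are assumptions for illustration. The actual pass works on LLVM IR and chooses its own factor.

```cpp
#include <cstddef>

// Sum N ints with the main body unrolled by 4 (hypothetical factor).
int sumUnrolledBy4(const int *A, std::size_t N) {
  int Sum = 0;
  std::size_t I = 0;

  // Runtime-unrolled main loop: runs only while 4 full elements remain.
  for (; I + 4 <= N; I += 4)
    Sum += A[I] + A[I + 1] + A[I + 2] + A[I + 3];

  // Remainder: at most 3 iterations are left. Instead of a loop with a
  // backedge, they are laid out as straight-line code behind forward
  // branches, which is the form the message argues is cheaper.
  if (I < N) Sum += A[I++];
  if (I < N) Sum += A[I++];
  if (I < N) Sum += A[I++];
  return Sum;
}
```

For short trip counts the main loop is skipped entirely and only the guarded remainder runs, which is the case the message is concerned with.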
* [AArch64] Remove unused MC functionSam Parker2017-08-141-18/+0
| | | | | | | | | | | | An unused function warning was raised in https://bugs.llvm.org/show_bug.cgi?id=34178. The offending function, in AArch64MCCodeEmitter.cpp, was committed by me last week. Differential Revision: https://reviews.llvm.org/D36665 llvm-svn: 310823
* Revert "[DAGCombiner] Extending pattern detection for vector shuffle ↵Elad Cohen2017-08-141-47/+2
| | | | | | | | (REAPPLIED)" This reverts commit r310782. llvm-svn: 310822
* [ValueTracking] Revert r310583 which enabled functionality that still isChandler Carruth2017-08-141-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | causing compile time issues. Moreover, the patch *deleted* the flag in addition to changing the default, and links to a code review that doesn't even discuss the flag and just has an update to a Clang test case. I've followed up on the commit thread to ask for numbers on compile time at this point, leaving the flag in place until things stabilize, and pointing at specific code that seems to exhibit excessive compile time with this patch. Original commit message for r310583: """ [ValueTracking] Enabling ValueTracking patch by default (recommit). Part 2. The original patch was an improvement to IR ValueTracking on non-negative integers. It has been checked in to trunk (D18777, r284022). But was disabled by default due to performance regressions. Perf impact has improved. The patch would be enabled by default. """ llvm-svn: 310816
* [AVX-512] Add hasSideEffects = 0 to the 8-bit and 16-bit register broadcasts.Craig Topper2017-08-141-1/+1
| | | | llvm-svn: 310813
* [X86] Remove unused argument from the vextract_for_size multiclass. NFCCraig Topper2017-08-141-14/+7
| | | | llvm-svn: 310812
* [AVX512] Remove comment I should have removed in r310808. NFCCraig Topper2017-08-141-3/+0
| | | | llvm-svn: 310811
* [PowerPC] Revert r310346 (and followups r310356 & r310424) whichChandler Carruth2017-08-141-132/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | introduce a miscompile bug. There appears to be a bug where the generated code to extract the sign bit doesn't work correctly for 32-bit inputs. I've replied to the original commit pointing out the problem. I think I see by inspection (and reading the manual for PPC) how to fix this, but I can't be 100% confident and I also don't know what the best way to test this is. Currently it seems nearly impossible to get the backend to hit this code path, but the patch author is likely in a better position to craft such test cases than I am, and based on where the bug is it should be easily done. Original commit message for r310346: """ [PowerPC] Eliminate compares - add i32 sext/zext handling for SETLE/SETGE Adds handling for SETLE/SETGE comparisons on i32 values. Furthermore, it adds the handling for the special case where RHS == 0. Differential Revision: https://reviews.llvm.org/D34048 """ llvm-svn: 310809
* [AVX512] Simplify the instruction definition for VEXTRACT. NFCICraig Topper2017-08-141-33/+16
| | | | | | The comment about why we couldn't use avx512_maskable appears to have been incorrect. llvm-svn: 310808
* [ARM] Tidy-up Cortex-A15 DPR-SPR optimizer implementationJaved Absar2017-08-141-27/+12
| | | | | | | | | Modernise the code with range-loops etc Reviewed by: @fhahn, @rovka Differential Revision: https://reviews.llvm.org/D36502 llvm-svn: 310807
* [InstCombine] Simplify and inline FoldOrWithConstants/FoldXorWithConstantsCraig Topper2017-08-141-85/+19
| | | | | | | | | | | | | | | | | Summary: These functions were overly complicated. The body of this function was rechecking for an And operation to find the constant, but we already knew we were looking at two Ands ORed together and the pieces are in variables. We already had earlier nearby code that checked for ConstantInts. So just inline the remaining parts into the earlier code. Next step is to use m_APInt instead of ConstantInt. Reviewers: spatel, efriedma, davide, majnemer Reviewed By: spatel Subscribers: zzheng, llvm-commits Differential Revision: https://reviews.llvm.org/D36439 llvm-svn: 310806
* [X86] Fix typo from r310794. Index = 0 should have been Index == 0.Craig Topper2017-08-131-2/+2
| | | | llvm-svn: 310801
* [X86] Remove unused pattern fragment that referenced MVT::i1. NFCCraig Topper2017-08-131-5/+0
| | | | llvm-svn: 310799
* [COFF, ARM64] Use '//' as comment character in assembly files in GNU ↵Martin Storsjo2017-08-133-2/+19
| | | | | | | | | | | | environments This allows using semicolons for bundling up more than one statement per line. This is used within the mingw-w64 project in some assembly files that contain code for multiple architectures. Differential Revision: https://reviews.llvm.org/D36366 llvm-svn: 310797
* [AVX512] Correct isExtractSubvectorCheap so that it will return the correct ↵Craig Topper2017-08-131-1/+7
| | | | | | | | | | answers for extracting 128-bits from a 512-bit vector and for mask registers. Previously it would not return true for extracting either of the upper quarters of a 512-bit register. For mask registers we support extracting anything from index 0. And otherwise we only support extracting the upper half of a register. Differential Revision: https://reviews.llvm.org/D36638 llvm-svn: 310794
* [X86][ARM][TargetLowering] Add SrcVT to isExtractSubvectorCheapCraig Topper2017-08-135-5/+7
| | | | | | | | | | | | | | | Summary: Without the SrcVT it's hard to know what is really being asked for. For example, if your target has 128-, 256-, and 512-bit vectors, maybe extracting 128 from 256 is cheap, but maybe extracting 128 from 512 is not. For x86 we do support extracting a quarter of a 512-bit register. But for i1 vectors we don't have isel patterns for extracting arbitrary pieces. So we need this to have a correct implementation of isExtractSubvectorCheap for mask vectors. Reviewers: RKSimon, zvi, efriedma Reviewed By: RKSimon Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D36649 llvm-svn: 310793
* [X86][SandyBridge] Additional updates to the SNB instructions scheduling ↵Gadi Haber2017-08-131-824/+988
| | | | | | | | | | | | | | | | information This is a continuation patch for commit r307529 which completely replaces the scheduling information for the SandyBridge architecture target by modifying the file X86SchedSandyBridge.td located under the X86 Target (see also https://reviews.llvm.org/D35019). In this patch we added the scheduling information of additional SNB instructions that were missing from the patch commit r307529, fixed the scheduling of several resource groups that include only port0 instead of port05 (i.e., port0 OR port5) and fixed several incorrect instructions' scheduling in the r307529 commit. The patch also includes the X87 instructions which were missing in previous patch commit r307529 as reported in bugzilla bug 34080. Reviewers: zvi, RKSimon, chandlerc, igorb, m_zuckerman, craig.topper, aymanmus, dim Differential Revision: https://reviews.llvm.org/D36388 llvm-svn: 310792
* [X86][AsmParser][AVX512] Error appropriately when K0 is tried as a write-maskCoby Tayree2017-08-131-1/+4
| | | | | | | | | K0 isn't expected as a write-mask, so provide a detailed error here, instead of the more generic one (invalid op for insn) Conforms with gas Differential Revision: https://reviews.llvm.org/D36570 llvm-svn: 310789