summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [x86] fold the mask op on 8- and 16-bit rotatesSanjay Patel2017-08-141-3/+39
| | | | | | | | | | | | | | | | | | | | | | | Ref the post-commit thread for r310770: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170807/478507.html The motivating cases as 'C' source examples can look like this: unsigned char rotate_right_8(unsigned char v, int shift) { // shift &= 7; v = ( v >> shift ) | ( v << ( 8 - shift ) ); return v; } https://godbolt.org/g/K6rc1A Notice that the source doesn't contain UB-safe masked shift amounts, but instcombine created those in order to produce narrow rotate patterns. This should be the last step needed to resolve PR34046: https://bugs.llvm.org/show_bug.cgi?id=34046 Differential Revision: https://reviews.llvm.org/D36644 llvm-svn: 310849
* [SLPVectorizer] Schedule bundle with different opcodes.Dinar Temirbulatov2017-08-141-52/+140
| | | | | | | | | | | | This change let us schedule a bundle with different opcodes in it, for example : [ load, add, add, add ] Reviewers: mkuper, RKSimon, ABataev, mzolotukhin, spatel, filcab Subscribers: llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D36518 llvm-svn: 310847
* [X86] Fix a place that was mishandling X86ISD::UMUL.Craig Topper2017-08-141-1/+1
| | | | | | | | According to the X86ISelLowering.h, UMUL results are low, high, and flags. But this place was treating result 1 or 2 as flags. Differential Revision: https://reviews.llvm.org/D36654 llvm-svn: 310846
* [X86] Remove flag setting ISD nodes from computeKnownBitsForTargetNodeCraig Topper2017-08-141-15/+0
| | | | | | | | | | | | | | | | | Summary: The flag result is an i32 type. But its only really used for connectivity. I don't think anything even assumes a particular format. We don't ever do any real operations on it. So known bits don't help us optimize anything. My main motivation is that the UMUL behavior is actually wrong. I was going to fix this in D36654, but then realized there was just no reason for it to be here. Reviewers: RKSimon, zvi, spatel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36657 llvm-svn: 310845
* [AVX512] Make the itinerary parameter actually pass through the the ↵Craig Topper2017-08-141-2/+2
| | | | | | | | | | | | | | | | AVX512_maskable_common multiclass Summary: This looks to have been disconnected about 3 years ago in r219358. Reviewers: gadi.haber, RKSimon, zvi Reviewed By: gadi.haber Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36658 llvm-svn: 310844
* [AVX512] Remove leftover code for when i1 was a legal type from the fast ↵Craig Topper2017-08-141-14/+0
| | | | | | | | | | | | | | | | | | | isel load/store code. Summary: I don't think we need this code anymore. It only existed because i1 used to be legal. There's probably more unneeded code in fast isel still. Reviewers: guyblank, zvi Reviewed By: guyblank Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36652 llvm-svn: 310843
* [BDCE] reduce scope of an assert (PR34179)Sanjay Patel2017-08-141-10/+7
| | | | | | | | | | | | | | | | | The assert was added with r310779 and is usually correct, but as the test shows, not always. The 'volatile' on the load is needed to expose the faulty path because without it, DemandedBits would return that the load is just dead rather than not demanded, and so we wouldn't hit the bogus assert. Also, since the lambda is just a single-line now, get rid of it and inline the DB.isAllOnesValue() calls. This should fix (prevent execution of a faulty assert): https://bugs.llvm.org/show_bug.cgi?id=34179 llvm-svn: 310842
* Reland "[mips][mt][6/7] Add support for mftr, mttr instructions."Simon Dardis2017-08-147-1/+375
| | | | | | | | | | | | | | | | | | | | This adjusts the tests to hopfully pacify the llvm-clang-x86_64-expensive-checks-win buildbot. Unlike many other instructions, these instructions have aliases which take coprocessor registers, gpr register, accumulator (and dsp accumulator) registers, floating point registers, floating point control registers and coprocessor 2 data and control operands. For the moment, these aliases are treated as pseudo instructions which are expanded into the underlying instruction. As a result, disassembling these instructions shows the underlying instruction and not the alias. Reviewers: slthakur, atanasyan Differential Revision: https://reviews.llvm.org/D35253 llvm-svn: 310834
* [DAGCombine] Do not try to deduplicate commutative operations if both ↵Amaury Sechet2017-08-141-3/+3
| | | | | | | | | | | | | | operand are the same. Summary: It is creating useless work as the commuted nodes is the same as the node we are working on in that case. Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33840 llvm-svn: 310832
* [SelectionDAG] combine vextract (v1iX extract_subvector(vNiX, Idx))Elad Cohen2017-08-141-0/+9
| | | | | | | | | into vextract(vNiX,Idx) when creating vextract with getNode(). This case appeared in AVX512 after fixing pr33349 in r310552. Differential revision: https://reviews.llvm.org/D36571 llvm-svn: 310828
* MachineInstr: Reason locally about some memory objects before going to AA.Balaram Makam2017-08-141-17/+42
| | | | | | This addresses a FIXME in MachineInstr::mayAlias. llvm-svn: 310825
* [LoopUnroll] Enable option to peel remainder loopSam Parker2017-08-143-10/+37
| | | | | | | | | | | | | | | | | | | | On some targets, the penalty of executing runtime unrolling checks and then not the unrolled loop can be significantly detrimental to performance. This results in the need to be more conservative with the unroll count, keeping a trip count of 2 reduces the overhead as well as increasing the chance of the unrolled body being executed. But being conservative leaves performance gains on the table. This patch enables the unrolling of the remainder loop introduced by runtime unrolling. This can help reduce the overhead of misunrolled loops because the cost of non-taken branches is much less than the cost of the backedge that would normally be executed in the remainder loop. This allows larger unroll factors to be used without suffering performance loses with smaller iteration counts. Differential Revision: https://reviews.llvm.org/D36309 llvm-svn: 310824
* [AArch64] Remove unused MC functionSam Parker2017-08-141-18/+0
| | | | | | | | | | | | An unused function warning was raised in https://bugs.llvm.org/show_bug.cgi?id=34178. The offending function, in AArch64MCCodeEmitter.cpp, was committed by me last week. Differential Revision: https://reviews.llvm.org/D36665 llvm-svn: 310823
* Revert "[DAGCombiner] Extending pattern detection for vector shuffle ↵Elad Cohen2017-08-141-47/+2
| | | | | | | | (REAPPLIED)" This reverts commit r310782. llvm-svn: 310822
* [ValueTracking] Revert r310583 which enabled functionality that still isChandler Carruth2017-08-141-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | causing compile time issues. Moreover, the patch *deleted* the flag in addition to changing the default, and links to a code review that doesn't even discuss the flag and just has an update to a Clang test case. I've followed up on the commit thread to ask for numbers on compile time at this point, leaving the flag in place until things stabilize, and pointing at specific code that seems to exhibit excessive compile time with this patch. Original commit message for r310583: """ [ValueTracking] Enabling ValueTracking patch by default (recommit). Part 2. The original patch was an improvement to IR ValueTracking on non-negative integers. It has been checked in to trunk (D18777, r284022). But was disabled by default due to performance regressions. Perf impact has improved. The patch would be enabled by default. """" llvm-svn: 310816
* [AVX-512] Add hasSideEffects = 0 to the 8-bit and 16-bit register broadcasts.Craig Topper2017-08-141-1/+1
| | | | llvm-svn: 310813
* [X86] Remove unused argument from the vextract_for_size multiclass. NFCCraig Topper2017-08-141-14/+7
| | | | llvm-svn: 310812
* [AVX512] Remove comment I should have removed in r310808. NFCCraig Topper2017-08-141-3/+0
| | | | llvm-svn: 310811
* [PowerPC] Revert r310346 (and followups r310356 & r310424) whichChandler Carruth2017-08-141-132/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | introduce a miscompile bug. There appears to be a bug where the generated code to extract the sign bit doesn't work correctly for 32-bit inputs. I've replied to the original commit pointing out the problem. I think I see by inspection (and reading the manual for PPC) how to fix this, but I can't be 100% confident and I also don't know what the best way to test this is. Currently it seems nearly impossible to get the backend to hit this code path, but the patch autohr is likely in a better position to craft such test cases than I am, and based on where the bug is it should be easily done. Original commit message for r310346: """ [PowerPC] Eliminate compares - add i32 sext/zext handling for SETLE/SETGE Adds handling for SETLE/SETGE comparisons on i32 values. Furthermore, it adds the handling for the special case where RHS == 0. Differential Revision: https://reviews.llvm.org/D34048 """ llvm-svn: 310809
* [AVX512] Simplify the instruction defintion for VEXTRACT. NFCICraig Topper2017-08-141-33/+16
| | | | | | The comment about why we couldn't use avx512_maskable appears to have been incorrect. llvm-svn: 310808
* [ARM] Tidy-up Cortex-A15 DPR-SPR optimizer implementationJaved Absar2017-08-141-27/+12
| | | | | | | | | Modernise the code with range-loops etc Reviewed by: @fhahn, @rovka Differential Revision: https://reviews.llvm.org/D36502 llvm-svn: 310807
* [InstCombine] Simplify and inline FoldOrWithConstants/FoldXorWithConstantsCraig Topper2017-08-141-85/+19
| | | | | | | | | | | | | | | | | Summary: These functions were overly complicated. The body of this function was rechecking for an And operation to find the constant, but we already knew we were looking at two Ands ORed together and the pieces are in variables. We already had earlier nearby code that checked for ConstantInts. So just inline the remaining parts into the earlier code. Next step is to use m_APInt instead of ConstantInt. Reviewers: spatel, efriedma, davide, majnemer Reviewed By: spatel Subscribers: zzheng, llvm-commits Differential Revision: https://reviews.llvm.org/D36439 llvm-svn: 310806
* [X86] Fix typo from r310794. Index = 0 should have been Index == 0.Craig Topper2017-08-131-2/+2
| | | | llvm-svn: 310801
* [X86] Remove unused pattern fragment that referenced MVT::i1. NFCCraig Topper2017-08-131-5/+0
| | | | llvm-svn: 310799
* [COFF, ARM64] Use '//' as comment character in assembly files in GNU ↵Martin Storsjo2017-08-133-2/+19
| | | | | | | | | | | | environments This allows using semicolons for bundling up more than one statement per line. This is used within the mingw-w64 project in some assembly files that contain code for multiple architectures. Differential Revision: https://reviews.llvm.org/D36366 llvm-svn: 310797
* [AVX512] Correct isExtractSubvectorCheap so that it will return the correct ↵Craig Topper2017-08-131-1/+7
| | | | | | | | | | | | answers for extracting 128-bits from a 512-bit vector and for mask registers. Previously it would not return true for extracting either of the upper quarters of a 512-bit registers. For mask registers we support extracting anything from index 0. And otherwise we only support extracting the upper half of a register. Differential Revision: https://reviews.llvm.org/D36638 llvm-svn: 310794
* [X86][ARM][TargetLowering] Add SrcVT to isExtractSubvectorCheapCraig Topper2017-08-135-5/+7
| | | | | | | | | | | | | | | | | Summary: Without the SrcVT its hard to know what is really being asked for. For example if your target has 128, 256, and 512 bit vectors. Maybe extracting 128 from 256 is cheap, but maybe extracting 128 from 512 is not. For x86 we do support extracting a quarter of a 512-bit register. But for i1 vectors we don't have isel patterns for extracting arbitrary pieces. So we need this to have a correct implementation of isExtractSubvectorCheap for mask vectors. Reviewers: RKSimon, zvi, efriedma Reviewed By: RKSimon Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D36649 llvm-svn: 310793
* [X86][SandyBridge] Additional updates to the SNB instructions scheduling ↵Gadi Haber2017-08-131-824/+988
| | | | | | | | | | | | | | | | information This is a continuation patch for commit r307529 which completely replaces the scheduling information for the SandyBridge architecture target by modifying the file X86SchedSandyBridge.td located under the X86 Target (see also https://reviews.llvm.org/D35019). In this patch we added the scheduling information of additional SNB instructions that were missing from the patch commit r307529, fixed the scheduling of several resource groups that include only port0 instead of port05 (i.e., port0 OR port5) and fixed several incorrect instructions' scheduling in the r307529 commit. The patch also includes the X87 instructions which were missing in previous patch commit r307529 as reported in bugzilla bug 34080. Reviewers: zvi, RKSimon, chandlerc, igorb, m_zuckerman, craig.topper, aymanmus, dim Differential Revision: https://reviews.llvm.org/D36388 llvm-svn: 310792
* [X86][AsmParser][AVX512] Error appropriately when K0 is tried as a write-maskCoby Tayree2017-08-131-1/+4
| | | | | | | | | K0 isn't expected as a write-mask, so provide a detailed error here, instead of the more generic one (invalid op for insn) Conforms with gas Differential Revision: https://reviews.llvm.org/D36570 llvm-svn: 310789
* [X86][AVX512] Add combine for TESTMGuy Blank2017-08-131-9/+16
| | | | | | | | | | | | Add an X86 combine for TESTM when one of the operands is a BUILD_VECTOR(0,0,...). TESTM op0, BUILD_VECTOR(0,0,...) -> BUILD_VECTOR(0,0,...) TESTM BUILD_VECTOR(0,0,...), op1 -> BUILD_VECTOR(0,0,...) Differential Revision: https://reviews.llvm.org/D36536 llvm-svn: 310787
* [X86] Early out of combineInsertSubvector for mask vectors.Craig Topper2017-08-121-1/+6
| | | | | | The combines here shouldn't be done for mask vectors, but it wasn't clear anything was preventing that. llvm-svn: 310786
* [X86] Fix bad comment. NFCCraig Topper2017-08-121-1/+1
| | | | llvm-svn: 310785
* [X86] When handling addcarry intrinsic, create the flag result with the ↵Craig Topper2017-08-121-2/+2
| | | | | | | | | | | | | | | | | | | | | correct type so we don't crash if we use a memory instruction Summary: Previously we were creating the flag result with MVT::Other which is interpretted as a Chain node. If we used a memory form of the instruction we would end up with a copyToReg that consumed the chain result of the adcx instruction instead of the flag result. Pretty sure we should be using MVT::i32 here, that's what we do other places we create these node types. We should probably consider this for 5.0 as well. Reviewers: RKSimon, zvi, spatel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36645 llvm-svn: 310784
* [DAGCombiner] Extending pattern detection for vector shuffle (REAPPLIED)Simon Pilgrim2017-08-121-2/+47
| | | | | | | | | | | | If all the operands of a BUILD_VECTOR extract elements from same vector then split the vector efficiently based on the maximum vector access index. Reapplied with fix to only work with simple value types. Committed on behalf of @jbhateja (Jatin Bhateja) Differential Revision: https://reviews.llvm.org/D35788 llvm-svn: 310782
* [Triple] Add isThumb and isARM functions.Florian Hahn2017-08-125-15/+5
| | | | | | | | | | | | | | | | | | | Summary: isThumb returns true for Thumb triples (little and big endian), isARM returns true for ARM triples (little and big endian). There are a few more checks using arm/thumb that are not covered by those functions, e.g. that the architecture is either ARM or Thumb (little endian) or ARM/Thumb little endian only. Reviewers: javed.absar, rengolin, kristof.beyls, t.p.northover Reviewed By: rengolin Subscribers: llvm-commits, aemerson Differential Revision: https://reviews.llvm.org/D34682 llvm-svn: 310781
* [BDCE] clear poison generators after turning a value into zero (PR33695, ↵Sanjay Patel2017-08-121-0/+47
| | | | | | | | | | | | | | | | PR34037) nsw, nuw, and exact carry implicit assumptions about their operands, so we need to clear those after trivializing a value. We decided there was no danger for llvm.assume or metadata, so there's just a comment about that. This fixes miscompiles as shown in: https://bugs.llvm.org/show_bug.cgi?id=33695 https://bugs.llvm.org/show_bug.cgi?id=34037 Differential Revision: https://reviews.llvm.org/D36592 llvm-svn: 310779
* D36604: PR34148: Do not assume we can use a copy relocation for an ↵Richard Smith2017-08-111-2/+3
| | | | | | | | | | | `external_weak` global An `external_weak` global may be intended to resolve as a null pointer if it's not defined, so it doesn't make sense to use a copy relocation for it. Differential Revision: https://reviews.llvm.org/D36604 llvm-svn: 310773
* [libFuzzer] experimental support for Clang's coverage ↵Kostya Serebryany2017-08-116-15/+99
| | | | | | (fprofile-instr-generate), Linux-only llvm-svn: 310771
* [MIPS] Use ABI to determine stack alignment.John Baldwin2017-08-111-1/+3
| | | | | | | | | | | | | | | | Summary: The stack alignment depends on the ABI (16 bytes for N32 and N64 and 8 bytes for O32), not the CPU type. Reviewers: sdardis Reviewed By: sdardis Subscribers: atanasyan, arichardson, llvm-commits Differential Revision: https://reviews.llvm.org/D36326 llvm-svn: 310768
* [Analysis] Fix some Clang-tidy modernize-use-using and Include What You Use ↵Eugene Zelenko2017-08-116-77/+187
| | | | | | warnings; other minor fixes (NFC). llvm-svn: 310766
* [OptDiag] Updating Remarks in SampleProfileEli Friedman2017-08-112-30/+55
| | | | | | | | | | | | | | | | Updating remark API to newer OptimizationDiagnosticInfo API. This allows remarks to show up in diagnostic yaml file, and enables use of opt-viewer tool. Hotness information for remarks (L505 and L751) do not display hotness information, most likely due to profile information not being propagated yet. Unsure if this is the desired outcome. Patch by Tarun Rajendran. Differential Revision: https://reviews.llvm.org/D36127 llvm-svn: 310763
* [X86] Don't use fsin/fcos/fsincos instructions everCraig Topper2017-08-111-15/+12
| | | | | | | | | | | | | | | | | Summary: Previously we would use these instructions if sse was disabled and fastmath was enabled. As mentioned in D28335, this is a bad idea. Reviewers: efriedma, scanon, DavidKreitzer Reviewed By: DavidKreitzer Subscribers: zvi, llvm-commits Differential Revision: https://reviews.llvm.org/D36344 llvm-svn: 310762
* Fix access to undefined weak symbols in pic codeRafael Espindola2017-08-111-1/+17
| | | | | | | | | | When the access to a weak symbol is not a call, the access has to be able to produce the value 0 at runtime. We were sometimes producing code sequences where that was not possible if the code was leaded more than 4g away from 0. llvm-svn: 310756
* AMDGPU: Start adding tail call supportMatt Arsenault2017-08-119-21/+294
| | | | | | Handle the sibling call cases. llvm-svn: 310753
* [libFuzzer] Re-enable coverage.test on Darwin.George Karpenkov2017-08-111-2/+0
| | | | llvm-svn: 310750
* Revert r310716 (and r310735): [globalisel][tablegen] Support ↵Daniel Sanders2017-08-111-11/+1
| | | | | | | | | | | zero-instruction emission. Two of the Windows bots are failing test\CodeGen\X86\GlobalISel\select-inc.mir which should not have been affected by the change. Reverting while I investigate. Also reverted r310735 because it builds on r310716. llvm-svn: 310745
* [LLD/PDB] Write actual records to the globals stream.Zachary Turner2017-08-115-15/+111
| | | | | | | | | | | | | | | | Previously we were writing an empty globals stream. Windows tools interpret this as "private symbols are not present in this PDB", even when they are, so we need to fix this. Regardless, without it we don't have information about global variables, so we need to fix it anyway. This patch does that. With this patch, the "lm" command in WinDbg correctly reports that we have private symbols available, but the "dv" command still refuses to display local variables. Differential Revision: https://reviews.llvm.org/D36535 llvm-svn: 310743
* [mips] clang-format MipsSubtarget.cpp.John Baldwin2017-08-111-3/+3
| | | | | | This only fixes a few things and serves as my initial test commit. llvm-svn: 310742
* [AMDGPU] Fix santizer error after last commitStanislav Mekhanoshin2017-08-111-1/+0
| | | | | | Removed useless assert. llvm-svn: 310738
* Fix typo /NFCXinliang David Li2017-08-111-1/+1
| | | | llvm-svn: 310737
OpenPOWER on IntegriCloud