summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* Move some classes into anonymous namespaces. NFC.Benjamin Kramer2019-02-117-4/+14
| | | | llvm-svn: 353710
* [MCA] Return a mask of busy resources from method ↵Andrea Di Biagio2019-02-112-10/+23
| | | | | | | | | | | ResourceManager::checkAvailability(). NFCI In case of bottlenecks caused by pipeline pressure, we want to be able to correctly report the set of problematic pipelines. This is a first step towards adding support for bottleneck hints in llvm-mca (see PR37494). No functional change intended. llvm-svn: 353706
* [AMDGPU] Remove unused variableBenjamin Kramer2019-02-111-2/+0
| | | | llvm-svn: 353704
* [AMDGPU] Fix DPP sequence in atomic optimizer.Neil Henning2019-02-111-38/+38
| | | | | | | | | | This commit fixes the DPP sequence in the atomic optimizer (which was previously missing the row_shr:3 step), and works around a read_register exec bug by using a ballot instead. Differential Revision: https://reviews.llvm.org/D57737 llvm-svn: 353703
* Revert "[X86][SSE] Generalize X86ISD::BLENDI support to more value types"Sam McCall2019-02-112-63/+60
| | | | | | | | | This reverts commit r353610. It causes a miscompile visible in macro expansion in a bootstrapped clang. http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190211/626590.html llvm-svn: 353699
* [ARM] Add v8m.base pattern for add negative immSam Parker2019-02-111-0/+5
| | | | | | | | | | | The v8m.base ISA contains movw, which can operate on an unsigned 16-bit value. Add the pattern that converts an add with a negative value, that could fit into 16-bits when negated, into a sub with that positive value. Differential Revision: https://reviews.llvm.org/D57942 llvm-svn: 353692
* [AMDGPU] Enable DPP combiner pass by default.Valery Pykhtin2019-02-111-1/+1
| | | | | | Related revisions: https://reviews.llvm.org/D55444, https://reviews.llvm.org/D55314 llvm-svn: 353691
* [ARM] LoadStoreOptimizer: reoder limitSjoerd Meijer2019-02-111-1/+6
| | | | | | | | | | | | | | | The whole design of generating LDMs/STMs is fragile and unreliable: it depends on rescheduling here in the LoadStoreOptimizer that isn't register pressure aware and regalloc that isn't aware of generating LDMs/STMs. This patch adds a (hidden) option to control the total number of instructions that can be re-ordered. I appreciate this looks only a tiny bit better than a hard-coded constant, but at least it allows more easy experimentation with different values for now. Ideally we calculate this reorder limit based on some heuristics, and take register pressure into account. I might be looking into that next. Differential Revision: https://reviews.llvm.org/D57954 llvm-svn: 353678
* Move CFLGraph and the AA summary code over to the new `CallBase`Chandler Carruth2019-02-113-38/+36
| | | | | | instruction base class rather than the `CallSite` wrapper. llvm-svn: 353676
* Remove `CallSite` from the CodeMetrics analysis, moving it to the newChandler Carruth2019-02-111-7/+4
| | | | | | `CallBase` and simpler APIs therein. llvm-svn: 353673
* [ARM] LoadStoreOptimizer: just a clean-up. NFC.Sjoerd Meijer2019-02-111-35/+25
| | | | | | Differential Revision: https://reviews.llvm.org/D57955 llvm-svn: 353670
* Update more files added with the old header to the new one.Chandler Carruth2019-02-111-4/+3
| | | | llvm-svn: 353667
* Update files that were mistakenly added with the old file header to theChandler Carruth2019-02-112-8/+6
| | | | | | new one. llvm-svn: 353665
* [CallSite removal] Port InstSimplify over to use `CallBase` both in itsChandler Carruth2019-02-111-19/+17
| | | | | | | | interface and implementation. Port code with: `cast<CallBase>(CS.getInstruction())`. llvm-svn: 353662
* [CallSite removal] Migrate ConstantFolding APIs and implementation toChandler Carruth2019-02-116-35/+42
| | | | | | | | | `CallBase`. Users have been updated. You can see how to update any out-of-tree usages: pass `cast<CallBase>(CS.getInstruction())`. llvm-svn: 353661
* [CallSite removal] Migrate the statepoint GC infrastructure to use theChandler Carruth2019-02-118-178/+157
| | | | | | | | | | | | | | | `CallBase` class rather than `CallSite` wrappers. I pushed this change down through most of the statepoint infrastructure, completely removing the use of CallSite where I could reasonably do so. I ended up making a couple of cut-points: generic call handling (instcombine, TLI, SDAG). As soon as it hit truly generic handling with users outside the immediate code, I simply transitioned into or out of a `CallSite` to make this a reasonable sized chunk. Differential Revision: https://reviews.llvm.org/D56122 llvm-svn: 353660
* [X86] Removed unused SDTypeProfile. NFCCraig Topper2019-02-111-2/+0
| | | | llvm-svn: 353659
* [X86] EltsFromConsecutiveLoads - replace SmallBitVector with APInt (NFC).Simon Pilgrim2019-02-101-10/+11
| | | | | | Minor refactor to simplify some incoming patches to improve broadcast loads. llvm-svn: 353655
* [CodeGen][X86] Don't scalarize vector saturating add/subNikita Popov2019-02-101-15/+6
| | | | | | | | | | | Now that we have vector support for [US](ADD|SUB)O we no longer need to scalarize when expanding [US](ADD|SUB)SAT. This matches what the cost model already does. Differential Revision: https://reviews.llvm.org/D57348 llvm-svn: 353651
* [DAG] Add optional AllowUndefs to isNullOrNullSplatSimon Pilgrim2019-02-102-7/+3
| | | | | | No change in default behaviour (AllowUndefs = false) llvm-svn: 353646
* [DAGCombine] Simplify funnel shifts with undef/zero args to bitshiftsSimon Pilgrim2019-02-101-2/+41
| | | | | | | | Now that we have SimplifyDemandedBits support for funnel shifts (rL353539), we need to simplify funnel shifts back to bitshifts in cases where either argument has been folded to undef/zero. Differential Revision: https://reviews.llvm.org/D58009 llvm-svn: 353645
* [x86] narrow 256-bit horizontal ops via demanded elementsSanjay Patel2019-02-101-0/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | 256-bit horizontal math ops are an x86 monstrosity (and thankfully have not been extended to 512-bit AFAIK). The two 128-bit halves operate on separate halves of the inputs. So if we don't demand anything in the upper half of the result, we can extract the low halves of the inputs, do the math, and then insert that result into a 256-bit output. All of the extract/insert is free (ymm<-->xmm), so we're left with a narrower (cheaper) version of the original op. In the affected tests based on: https://bugs.llvm.org/show_bug.cgi?id=33758 https://bugs.llvm.org/show_bug.cgi?id=38971 ...we see that the h-op narrowing can result in further narrowing of other math via existing generic transforms. I originally drafted this patch as an exact pattern match starting from extract_vector_elt, but I thought we might see diffs starting from extract_subvector too, so I changed it to a more general demanded elements solution. There are no extra existing regression test improvements from that switch though, so we could go back. Differential Revision: https://reviews.llvm.org/D57841 llvm-svn: 353641
* [TargetLowering] refactor setcc folds to fix another miscompile (PR40657)Sanjay Patel2019-02-101-55/+55
| | | | | | | | | | SimplifySetCC still has much room for improvement, but this should fix the remaining problem examples from: https://bugs.llvm.org/show_bug.cgi?id=40657 The initial fix for this problem was rL353615. llvm-svn: 353639
* [Local] Delete a redundant check. NFCFangrui Song2019-02-101-1/+1
| | | | | | isInstructionTriviallyDead also performs the use_empty() check. llvm-svn: 353637
* [X86] Move some vector InstAliases out from under unnecessary 'let ↵Craig Topper2019-02-102-87/+77
| | | | | | | | Predicates'. NFCI We don't have any assembler predicates for vector ISAs so this isn't necessary. It just adds extra lines and identation. llvm-svn: 353631
* [InstCombine] Fix an unused variable warning.Craig Topper2019-02-101-1/+1
| | | | llvm-svn: 353630
* [X86] CombineOr - fold to generic funnel shiftsSimon Pilgrim2019-02-091-21/+22
| | | | | | | | As discussed on D57389, this is a first step towards moving the SHLD/SHRD matching code to DAGCombiner using FSHL/FSHR instead. There's a bit of work to do before I can do that, so this just folds to FSHL/FSHR in the existing code (handling the different SHRD/FSHR argument ordering), which fixes the issue we had with i16 shift amounts not being correctly masked. llvm-svn: 353626
* llvm-lib: Implement /list flagNico Weber2019-02-092-0/+49
| | | | | | Differential Revision: https://reviews.llvm.org/D57952 llvm-svn: 353620
* [TargetLowering] add tests to show effect of setcc sub->shift; NFCSanjay Patel2019-02-091-1/+0
| | | | | | | | | There's effectively no difference for the cases with variables. We just trade a sub for an add on those. But the case with a subtract from constant would require an extra move instruction on x86, so this looks like a reasonable generic combine. llvm-svn: 353619
* [TargetLowering] avoid miscompile in setcc transform (PR40657)Sanjay Patel2019-02-091-1/+3
| | | | llvm-svn: 353615
* Revert "[SelectionDAG] Extract [US]MULO expansion into TL method; NFC"Nikita Popov2019-02-092-112/+141
| | | | | | | | This reverts commit r353611. Triggers an assertion during the libcall expansion on ARM. llvm-svn: 353612
* [SelectionDAG] Extract [US]MULO expansion into TL method; NFCNikita Popov2019-02-092-141/+112
| | | | | | | | | In preparation for supporting vector expansion. Also drop a variant of ExpandLibCall, of which the MULO expansions were the only user. llvm-svn: 353611
* [X86][SSE] Generalize X86ISD::BLENDI support to more value typesSimon Pilgrim2019-02-092-60/+63
| | | | | | | | | | | | | | | | D42042 introduced the ability for the ExecutionDomainFixPass to more easily change between BLENDPD/BLENDPS/PBLENDW as the domains required. With this ability, we can avoid most bitcasts/scaling in the DAG that was occurring with X86ISD::BLENDI lowering/combining, blend with the vXi32/vXi64 vectors directly and use isel patterns to lower to the float vector equivalent vectors. This helps the shuffle combining and SimplifyDemandedVectorElts be more aggressive as we lose track of fewer UNDEF elements than when we go up/down through bitcasts. I've introduced a basic blend(bitcast(x),bitcast(y)) -> bitcast(blend(x,y)) fold, there are more generalizations I can do there (e.g. widening/scaling and handling the tricky v16i16 repeated mask case). The vector-reduce-smin/smax regressions will be fixed in a future improvement to SimplifyDemandedBits to peek through bitcasts and support X86ISD::BLENDV. Differential Revision: https://reviews.llvm.org/D57888 llvm-svn: 353610
* [lib/ObjectYAML] - Fix BB after r353607 [2]. NFC.George Rimar2019-02-091-2/+1
| | | | | | | | | | | The second and the last place it seems. Error was: [ 4%] Building CXX object lib/Support/CMakeFiles/LLVMSupport.dir/Error.cpp.o /Users/buildslave/as-bldslv9_new/lld-x86_64-darwin13/llvm.src/lib/ObjectYAML/ELFYAML.cpp:993:15: error: unused variable 'Object' [-Werror,-Wunused-variable] const auto *Object = static_cast<ELFYAML::Object *>(IO.getContext()); llvm-svn: 353609
* [lib/ObjectYAML] - Fix BB after r353607. NFC.George Rimar2019-02-091-2/+1
| | | | | | | | | | Error was: [ 4%] Building CXX object lib/Support/CMakeFiles/LLVMSupport.dir/DAGDeltaAlgorithm.cpp.o /Users/buildslave/as-bldslv9_new/lld-x86_64-darwin13/llvm.src/lib/ObjectYAML/ELFYAML.cpp:666:15: error: unused variable 'Object' [-Werror,-Wunused-variable] const auto *Object = static_cast<ELFYAML::Object *>(IO.getContext()); (http://lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/29920) llvm-svn: 353608
* [yaml2obj][obj2yaml] - Add support for dumping/parsing .dynamic sections.George Rimar2019-02-091-0/+46
| | | | | | | | | This teaches the tools to parse and dump the .dynamic section and its dynamic tags. Differential revision: https://reviews.llvm.org/D57691 llvm-svn: 353606
* [GlobalOpt] Simplify __cxa_atexit eliminationFangrui Song2019-02-091-39/+9
| | | | | | | | | | | | | cxxDtorIsEmpty checks callers recursively to determine if the __cxa_atexit-registered function is empty, and eliminates the __cxa_atexit call accordingly. This recursive check is unnecessary as redundant instructions and function calls can be removed by early-cse and inliner. In addition, cxxDtorIsEmpty does not mark visited function and it may visit a function exponential times (multiplication principle). llvm-svn: 353603
* [MC] Clean up unused inline function and non-anchor defaulted destructors; NFCIHubert Tong2019-02-093-6/+0
| | | | | | | | | | | | | | | | | | | | | | Summary: Take care of some missing clean-ups that belong with r249548 and some other copy/paste that had happened. In particular, the destructors are no longer vtable anchors after r249548; and `setSectionName` in `MCSectionWasm` is private and unused since r313058 culled its only caller. The destructors are now implicitly defined, and the unused function is removed. Reviewers: nemanjai, jasonliu, grosbach Reviewed By: nemanjai Subscribers: sbc100, aheejin, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57182 llvm-svn: 353597
* Extra processing for BitCast + PHI in InstCombineGabor Buella2019-02-091-11/+34
| | | | | | | | | | | | | | | | | | | | | | | For some specific cases with bitcast A->B->A with intervening PHI nodes InstCombiner::optimizeBitCastFromPhi transformation creates extra PHI nodes, which are actually a copy of already created PHI or in another words, they are redundant. These extra PHI nodes could lead to extra move instructions generated after DeSSA transformation. This happens when several conditions are met - SROA kicks in and creates new alloca; - there is a simple assignment L = R, which falls under 'canonicalize loads' done by combineLoadToOperationType (this transformation is by default). Exactly this transformation is the reason of bitcasts generated; - the alloca is then used in A->B->A + PHI chain; - there is a loop unrolling. As a result optimizeBitCastFromPhi creates as many of PHI nodes for each new SROA alloca as loop unrolling factor is. These new extra PHI nodes are redundant actually except of one and should not be created. Moreover the idea of optimizeBitCastFromPhi is to get rid of the cast (when possible) but that doesn't happen in these conditions. The proposed fix is to do the cast replacement for the whole calculated/accumulated PHI closure not for one cast only, which is an argument to the optimizeBitCastFromPhi. These will help to accomplish several things: 1) avoid extra PHI nodes generated as all casts which may trigger optimizeBitCastFromPhi transformation will be replaced, 3) bitcasts will be replaced, and 3) create more opportunities to remove dead code, which appears after the replacement. A new test case shows that it's possible to get rid of all bitcasts completely and get quite good code reduction. Author: Igor Tsimbalist <igor.v.tsimbalist@intel.com> Reviewed By: Carrot Differential Revision: https://reviews.llvm.org/D57053 llvm-svn: 353595
* This reverts commit 1440a848a635849b97f7a5cfa0ecc40d37451f5b.Mikhail R. Gadelha2019-02-092-848/+1
| | | | | | | | and commit a1853e834c65751f92521f7481b15cf0365e796b. They broke arm and aarch64 llvm-svn: 353590
* [AMDGPU] Split dot-insts featureStanislav Mekhanoshin2019-02-095-24/+56
| | | | | | Differential Revision: https://reviews.llvm.org/D57971 llvm-svn: 353587
* [NFC] Avoid passing blocks vector to the OutlineRegionInfo constructor by value.Sergey Dmitriev2019-02-081-4/+3
| | | | | | | | | | | | | | Reviewers: vsk, fhahn, davidxl Reviewed By: vsk Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57957 llvm-svn: 353582
* Re-apply r353553 "[GISel][NFC]: Add missing call to record CSE hits in the ↵Francis Visoiu Mistrih2019-02-082-9/+10
| | | | | | | | CSEMIRBuilder" With a fix after r353563 that adds some more opcodes. llvm-svn: 353579
* Revert r353553 "[GISel][NFC]: Add missing call to record CSE hits in the ↵Francis Visoiu Mistrih2019-02-082-10/+9
| | | | | | | | | | | | CSEMIRBuilder" This reverts commit r353553. This breaks CodeGen/AArch64/GlobalISel/legalize-ext-csedebug-output.mir: http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/57963/console llvm-svn: 353575
* [X86] Add FPCW as an implicit use on floating point load instructions.Craig Topper2019-02-081-7/+7
| | | | | | These instructions can generate a stack overflow exception so technically they read the stack overflow exception mask bit. llvm-svn: 353564
* Implementation of asm-goto support in LLVMCraig Topper2019-02-0855-74/+654
| | | | | | | | | | | | | | | | | | | | | | | | | This patch accompanies the RFC posted here: http://lists.llvm.org/pipermail/llvm-dev/2018-October/127239.html This patch adds a new CallBr IR instruction to support asm-goto inline assembly like gcc as used by the linux kernel. This instruction is both a call instruction and a terminator instruction with multiple successors. Only inline assembly usage is supported today. This also adds a new INLINEASM_BR opcode to SelectionDAG and MachineIR to represent an INLINEASM block that is also considered a terminator instruction. There will likely be more bug fixes and optimizations to follow this, but we felt it had reached a point where we would like to switch to an incremental development model. Patch by Craig Topper, Alexander Ivchenko, Mikhail Dvoretckii Differential Revision: https://reviews.llvm.org/D53765 llvm-svn: 353563
* [CodeExtractor] Restore outputs after creating exit stubsVedant Kumar2019-02-081-35/+44
| | | | | | | | | | | | | | | | | When CodeExtractor saves the result of InvokeInst at the first insertion point of the 'normal destination' basic block, this block can be omitted in the outlined region, so store is placed outside of the function. The suggested solution is to process saving outputs after creating exit stubs for new function, and stores will be placed in that blocks before return in this case. Patch by Sergei Kachkov! Fixes llvm.org/PR40455. Differential Revision: https://reviews.llvm.org/D57919 llvm-svn: 353562
* AMDGPU: Eliminate GPU specific SubtargetFeaturesMatt Arsenault2019-02-084-80/+69
| | | | | | | | | | | Inline compatability is determined from the individual feature bits. These are just sets of the separate features, but will always be treated as incompatible unless they are specifically ignored. Defining the ISA version number here in tablegen would be nice, but it turns out this wasn't actually used. llvm-svn: 353558
* [DAGCombine] Optimize pow(X, 0.75) to sqrt(X) * sqrt(sqrt(X))Nemanja Ivanovic2019-02-081-4/+14
| | | | | | | | | | | | The sqrt case is faster and we already do this for the case where the exponent is 0.25. This adds the 0.75 case which is also not sensitive to signed zeros. Patch by Whitney Tsang (Whitney) Differential revision: https://reviews.llvm.org/D57434 llvm-svn: 353557
* [GISel][NFC]: Add missing call to record CSE hits in the CSEMIRBuilderAditya Nandakumar2019-02-082-9/+10
| | | | | | | | | | https://reviews.llvm.org/D57932 Add some logging + tests to make sure CSEInfo prints debug output. reviewed by: arsenm llvm-svn: 353553
OpenPOWER on IntegriCloud