bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	Move some classes into anonymous namespaces. NFC.	Benjamin Kramer	2019-02-11	7	-4/+14
\| \| \| \|	llvm-svn: 353710
*	[MCA] Return a mask of busy resources from method ResourceManager::checkAvailability(). NFCI	Andrea Di Biagio	2019-02-11	2	-10/+23
\| \| \| \| \| \| \| \| \| \| \|	In case of bottlenecks caused by pipeline pressure, we want to be able to correctly report the set of problematic pipelines. This is a first step towards adding support for bottleneck hints in llvm-mca (see PR37494). No functional change intended. llvm-svn: 353706
*	[AMDGPU] Remove unused variable	Benjamin Kramer	2019-02-11	1	-2/+0
\| \| \| \|	llvm-svn: 353704
*	[AMDGPU] Fix DPP sequence in atomic optimizer.	Neil Henning	2019-02-11	1	-38/+38
\| \| \| \| \| \| \| \| \| \|	This commit fixes the DPP sequence in the atomic optimizer (which was previously missing the row_shr:3 step), and works around a read_register exec bug by using a ballot instead. Differential Revision: https://reviews.llvm.org/D57737 llvm-svn: 353703
*	Revert "[X86][SSE] Generalize X86ISD::BLENDI support to more value types"	Sam McCall	2019-02-11	2	-63/+60
\| \| \| \| \| \| \| \| \|	This reverts commit r353610. It causes a miscompile visible in macro expansion in a bootstrapped clang. http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190211/626590.html llvm-svn: 353699
*	[ARM] Add v8m.base pattern for add negative imm	Sam Parker	2019-02-11	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \|	The v8m.base ISA contains movw, which can operate on an unsigned 16-bit value. Add the pattern that converts an add with a negative value, that could fit into 16-bits when negated, into a sub with that positive value. Differential Revision: https://reviews.llvm.org/D57942 llvm-svn: 353692
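The condition described above can be sketched in plain C++. This is only an illustration of the arithmetic, not the TableGen pattern from the patch; the helper names `subImmForNegativeAdd` and `addViaSub` are hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper (not part of LLVM): if `imm` is negative and its
// negation fits in an unsigned 16-bit field, store the positive value in
// `subImm` and return true. Otherwise return false.
bool subImmForNegativeAdd(int32_t imm, uint32_t &subImm) {
  if (imm >= 0)
    return false;
  // Negate in 64 bits so that INT32_MIN cannot overflow.
  int64_t negated = -static_cast<int64_t>(imm);
  if (negated > 0xFFFF)
    return false;
  subImm = static_cast<uint32_t>(negated);
  return true;
}

// add x, imm is equivalent to sub x, -imm under 32-bit wraparound arithmetic.
uint32_t addViaSub(uint32_t x, int32_t imm) {
  uint32_t subImm = 0;
  if (subImmForNegativeAdd(imm, subImm))
    return x - subImm;
  return x + static_cast<uint32_t>(imm);
}
```

For example, `addViaSub(100, -30)` takes the subtraction path with the positive immediate 30, while an immediate of -65536 does not qualify because 65536 needs 17 bits.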
*	[AMDGPU] Enable DPP combiner pass by default.	Valery Pykhtin	2019-02-11	1	-1/+1
\| \| \| \| \| \|	Related revisions: https://reviews.llvm.org/D55444, https://reviews.llvm.org/D55314 llvm-svn: 353691
*	[ARM] LoadStoreOptimizer: reorder limit	Sjoerd Meijer	2019-02-11	1	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The whole design of generating LDMs/STMs is fragile and unreliable: it depends on rescheduling here in the LoadStoreOptimizer that isn't register pressure aware and regalloc that isn't aware of generating LDMs/STMs. This patch adds a (hidden) option to control the total number of instructions that can be re-ordered. I appreciate this looks only a tiny bit better than a hard-coded constant, but at least it allows easier experimentation with different values for now. Ideally we calculate this reorder limit based on some heuristics, and take register pressure into account. I might be looking into that next. Differential Revision: https://reviews.llvm.org/D57954 llvm-svn: 353678
*	Move CFLGraph and the AA summary code over to the new `CallBase`	Chandler Carruth	2019-02-11	3	-38/+36
\| \| \| \| \| \|	instruction base class rather than the `CallSite` wrapper. llvm-svn: 353676
*	Remove `CallSite` from the CodeMetrics analysis, moving it to the new	Chandler Carruth	2019-02-11	1	-7/+4
\| \| \| \| \| \|	`CallBase` and simpler APIs therein. llvm-svn: 353673
*	[ARM] LoadStoreOptimizer: just a clean-up. NFC.	Sjoerd Meijer	2019-02-11	1	-35/+25
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D57955 llvm-svn: 353670
*	Update more files added with the old header to the new one.	Chandler Carruth	2019-02-11	1	-4/+3
\| \| \| \|	llvm-svn: 353667
*	Update files that were mistakenly added with the old file header to the	Chandler Carruth	2019-02-11	2	-8/+6
\| \| \| \| \| \|	new one. llvm-svn: 353665
*	[CallSite removal] Port InstSimplify over to use `CallBase` both in its	Chandler Carruth	2019-02-11	1	-19/+17
\| \| \| \| \| \| \| \|	interface and implementation. Port code with: `cast<CallBase>(CS.getInstruction())`. llvm-svn: 353662
*	[CallSite removal] Migrate ConstantFolding APIs and implementation to	Chandler Carruth	2019-02-11	6	-35/+42
\| \| \| \| \| \| \| \| \|	`CallBase`. Users have been updated. You can see how to update any out-of-tree usages: pass `cast<CallBase>(CS.getInstruction())`. llvm-svn: 353661
*	[CallSite removal] Migrate the statepoint GC infrastructure to use the	Chandler Carruth	2019-02-11	8	-178/+157
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	`CallBase` class rather than `CallSite` wrappers. I pushed this change down through most of the statepoint infrastructure, completely removing the use of CallSite where I could reasonably do so. I ended up making a couple of cut-points: generic call handling (instcombine, TLI, SDAG). As soon as it hit truly generic handling with users outside the immediate code, I simply transitioned into or out of a `CallSite` to make this a reasonable sized chunk. Differential Revision: https://reviews.llvm.org/D56122 llvm-svn: 353660
*	[X86] Removed unused SDTypeProfile. NFC	Craig Topper	2019-02-11	1	-2/+0
\| \| \| \|	llvm-svn: 353659
*	[X86] EltsFromConsecutiveLoads - replace SmallBitVector with APInt (NFC).	Simon Pilgrim	2019-02-10	1	-10/+11
\| \| \| \| \| \|	Minor refactor to simplify some incoming patches to improve broadcast loads. llvm-svn: 353655
*	[CodeGen][X86] Don't scalarize vector saturating add/sub	Nikita Popov	2019-02-10	1	-15/+6
\| \| \| \| \| \| \| \| \| \| \|	Now that we have vector support for [US](ADD\|SUB)O we no longer need to scalarize when expanding [US](ADD\|SUB)SAT. This matches what the cost model already does. Differential Revision: https://reviews.llvm.org/D57348 llvm-svn: 353651
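As a scalar illustration of why an overflow-reporting add is enough to build a saturating add, the sketch below models UADDO/SADDO on 32-bit values and derives the saturating forms from them. It is a model of the arithmetic only, not the expansion code touched by this commit, which works on vectors; the function names are made up for the example.

```cpp
#include <cstdint>

// Models UADDO: returns the wrapped sum and reports unsigned overflow.
uint32_t uaddo(uint32_t a, uint32_t b, bool &overflow) {
  uint32_t sum = a + b;
  overflow = sum < a; // wrapped around iff the sum is smaller than an operand
  return sum;
}

// UADDSAT built from UADDO: clamp to the maximum on overflow.
uint32_t uaddsat(uint32_t a, uint32_t b) {
  bool overflow = false;
  uint32_t sum = uaddo(a, b, overflow);
  return overflow ? UINT32_MAX : sum;
}

// Models SADDO: add in unsigned arithmetic to get a well-defined wrapped
// result, then detect signed overflow from the sign bits.
int32_t saddo(int32_t a, int32_t b, bool &overflow) {
  uint32_t wrapped = static_cast<uint32_t>(a) + static_cast<uint32_t>(b);
  int32_t result = static_cast<int32_t>(wrapped);
  // Overflow iff both operands differ in sign from the result.
  overflow = ((a ^ result) & (b ^ result)) < 0;
  return result;
}

// SADDSAT built from SADDO: on overflow the wrapped result has the opposite
// sign of the true result, so its sign selects the clamp value.
int32_t saddsat(int32_t a, int32_t b) {
  bool overflow = false;
  int32_t result = saddo(a, b, overflow);
  if (!overflow)
    return result;
  return result < 0 ? INT32_MAX : INT32_MIN;
}
```

Each lane of a vector saturating add can be thought of as one such select between the wrapped sum and a clamp constant.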
*	[DAG] Add optional AllowUndefs to isNullOrNullSplat	Simon Pilgrim	2019-02-10	2	-7/+3
\| \| \| \| \| \|	No change in default behaviour (AllowUndefs = false) llvm-svn: 353646
*	[DAGCombine] Simplify funnel shifts with undef/zero args to bitshifts	Simon Pilgrim	2019-02-10	1	-2/+41
\| \| \| \| \| \| \| \|	Now that we have SimplifyDemandedBits support for funnel shifts (rL353539), we need to simplify funnel shifts back to bitshifts in cases where either argument has been folded to undef/zero. Differential Revision: https://reviews.llvm.org/D58009 llvm-svn: 353645
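The identities behind this fold can be checked with a small scalar model of the funnel shift semantics (concatenate the two operands, shift by the amount modulo the bit width, keep one half). This is a sketch on 32-bit integers under that assumption, not the DAGCombiner code, and it does not model undef operands.

```cpp
#include <cstdint>

// fshl(x, y, z): shift the 64-bit concatenation x:y left by z % 32 and
// return the high 32 bits.
uint32_t fshl(uint32_t x, uint32_t y, uint32_t z) {
  uint32_t s = z % 32;
  if (s == 0)
    return x; // avoid the out-of-range shift by 32 below
  return (x << s) | (y >> (32 - s));
}

// fshr(x, y, z): shift the 64-bit concatenation x:y right by z % 32 and
// return the low 32 bits.
uint32_t fshr(uint32_t x, uint32_t y, uint32_t z) {
  uint32_t s = z % 32;
  if (s == 0)
    return y;
  return (x << (32 - s)) | (y >> s);
}
```

With a zero operand, one side of the OR disappears: `fshl(x, 0, c)` equals `x << (c % 32)` and `fshr(0, y, c)` equals `y >> (c % 32)`. For a nonzero `c % 32`, `fshl(0, y, c)` equals `y >> (32 - c % 32)` and `fshr(x, 0, c)` equals `x << (32 - c % 32)`.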
*	[x86] narrow 256-bit horizontal ops via demanded elements	Sanjay Patel	2019-02-10	1	-0/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	256-bit horizontal math ops are an x86 monstrosity (and thankfully have not been extended to 512-bit AFAIK). The two 128-bit halves operate on separate halves of the inputs. So if we don't demand anything in the upper half of the result, we can extract the low halves of the inputs, do the math, and then insert that result into a 256-bit output. All of the extract/insert is free (ymm<-->xmm), so we're left with a narrower (cheaper) version of the original op. In the affected tests based on: https://bugs.llvm.org/show_bug.cgi?id=33758 https://bugs.llvm.org/show_bug.cgi?id=38971 ...we see that the h-op narrowing can result in further narrowing of other math via existing generic transforms. I originally drafted this patch as an exact pattern match starting from extract_vector_elt, but I thought we might see diffs starting from extract_subvector too, so I changed it to a more general demanded elements solution. There are no extra existing regression test improvements from that switch though, so we could go back. Differential Revision: https://reviews.llvm.org/D57841 llvm-svn: 353641
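The lane structure described above can be modelled with plain arrays. The sketch below assumes the HADDPS/VHADDPS element layout (each 128-bit lane holds two pairwise sums from the first input followed by two from the second) and shows that the low half of the 256-bit result depends only on the low halves of the inputs. The type and function names are illustrative and not taken from the patch.

```cpp
#include <array>

using V4 = std::array<float, 4>; // models an xmm register of 4 floats
using V8 = std::array<float, 8>; // models a ymm register of 8 floats

// 128-bit horizontal add: pairwise sums of a, then pairwise sums of b.
V4 hadd128(const V4 &a, const V4 &b) {
  return V4{{a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3]}};
}

// 256-bit horizontal add: the two 128-bit lanes are computed independently.
V8 hadd256(const V8 &a, const V8 &b) {
  return V8{{a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3],
             a[4] + a[5], a[6] + a[7], b[4] + b[5], b[6] + b[7]}};
}

// Extract the low 128-bit half (free on x86: ymm -> xmm).
V4 lowHalf(const V8 &v) { return V4{{v[0], v[1], v[2], v[3]}}; }

// If only the low half of the result is demanded, the narrow op on the low
// halves of the inputs produces the same elements.
V4 hadd256LowNarrowed(const V8 &a, const V8 &b) {
  return hadd128(lowHalf(a), lowHalf(b));
}
```

Here `lowHalf(hadd256(a, b))` and `hadd256LowNarrowed(a, b)` compute the same four sums, which is the equivalence the narrowing relies on.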
*	[TargetLowering] refactor setcc folds to fix another miscompile (PR40657)	Sanjay Patel	2019-02-10	1	-55/+55
\| \| \| \| \| \| \| \| \| \|	SimplifySetCC still has much room for improvement, but this should fix the remaining problem examples from: https://bugs.llvm.org/show_bug.cgi?id=40657 The initial fix for this problem was rL353615. llvm-svn: 353639
*	[Local] Delete a redundant check. NFC	Fangrui Song	2019-02-10	1	-1/+1
\| \| \| \| \| \|	isInstructionTriviallyDead also performs the use_empty() check. llvm-svn: 353637
*	[X86] Move some vector InstAliases out from under unnecessary 'let Predicates'. NFCI	Craig Topper	2019-02-10	2	-87/+77
\| \| \| \| \| \| \| \|	We don't have any assembler predicates for vector ISAs so this isn't necessary. It just adds extra lines and indentation. llvm-svn: 353631
*	[InstCombine] Fix an unused variable warning.	Craig Topper	2019-02-10	1	-1/+1
\| \| \| \|	llvm-svn: 353630
*	[X86] CombineOr - fold to generic funnel shifts	Simon Pilgrim	2019-02-09	1	-21/+22
\| \| \| \| \| \| \| \|	As discussed on D57389, this is a first step towards moving the SHLD/SHRD matching code to DAGCombiner using FSHL/FSHR instead. There's a bit of work to do before I can do that, so this just folds to FSHL/FSHR in the existing code (handling the different SHRD/FSHR argument ordering), which fixes the issue we had with i16 shift amounts not being correctly masked. llvm-svn: 353626
*	llvm-lib: Implement /list flag	Nico Weber	2019-02-09	2	-0/+49
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D57952 llvm-svn: 353620
*	[TargetLowering] add tests to show effect of setcc sub->shift; NFC	Sanjay Patel	2019-02-09	1	-1/+0
\| \| \| \| \| \| \| \| \|	There's effectively no difference for the cases with variables. We just trade a sub for an add on those. But the case with a subtract from constant would require an extra move instruction on x86, so this looks like a reasonable generic combine. llvm-svn: 353619
*	[TargetLowering] avoid miscompile in setcc transform (PR40657)	Sanjay Patel	2019-02-09	1	-1/+3
\| \| \| \|	llvm-svn: 353615
*	Revert "[SelectionDAG] Extract [US]MULO expansion into TL method; NFC"	Nikita Popov	2019-02-09	2	-112/+141
\| \| \| \| \| \| \| \|	This reverts commit r353611. Triggers an assertion during the libcall expansion on ARM. llvm-svn: 353612
*	[SelectionDAG] Extract [US]MULO expansion into TL method; NFC	Nikita Popov	2019-02-09	2	-141/+112
\| \| \| \| \| \| \| \| \|	In preparation for supporting vector expansion. Also drop a variant of ExpandLibCall, of which the MULO expansions were the only user. llvm-svn: 353611
*	[X86][SSE] Generalize X86ISD::BLENDI support to more value types	Simon Pilgrim	2019-02-09	2	-60/+63
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	D42042 introduced the ability for the ExecutionDomainFixPass to more easily change between BLENDPD/BLENDPS/PBLENDW as the domains required. With this ability, we can avoid most bitcasts/scaling in the DAG that was occurring with X86ISD::BLENDI lowering/combining, blend with the vXi32/vXi64 vectors directly and use isel patterns to lower to the float vector equivalent vectors. This helps the shuffle combining and SimplifyDemandedVectorElts be more aggressive as we lose track of fewer UNDEF elements than when we go up/down through bitcasts. I've introduced a basic blend(bitcast(x),bitcast(y)) -> bitcast(blend(x,y)) fold, there are more generalizations I can do there (e.g. widening/scaling and handling the tricky v16i16 repeated mask case). The vector-reduce-smin/smax regressions will be fixed in a future improvement to SimplifyDemandedBits to peek through bitcasts and support X86ISD::BLENDV. Differential Revision: https://reviews.llvm.org/D57888 llvm-svn: 353610
*	[lib/ObjectYAML] - Fix BB after r353607 [2]. NFC.	George Rimar	2019-02-09	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \|	The second and the last place it seems. Error was: [ 4%] Building CXX object lib/Support/CMakeFiles/LLVMSupport.dir/Error.cpp.o /Users/buildslave/as-bldslv9_new/lld-x86_64-darwin13/llvm.src/lib/ObjectYAML/ELFYAML.cpp:993:15: error: unused variable 'Object' [-Werror,-Wunused-variable] const auto Object = static_cast<ELFYAML::Object >(IO.getContext()); llvm-svn: 353609
*	[lib/ObjectYAML] - Fix BB after r353607. NFC.	George Rimar	2019-02-09	1	-2/+1
\| \| \| \| \| \| \| \| \| \|	Error was: [ 4%] Building CXX object lib/Support/CMakeFiles/LLVMSupport.dir/DAGDeltaAlgorithm.cpp.o /Users/buildslave/as-bldslv9_new/lld-x86_64-darwin13/llvm.src/lib/ObjectYAML/ELFYAML.cpp:666:15: error: unused variable 'Object' [-Werror,-Wunused-variable] const auto Object = static_cast<ELFYAML::Object >(IO.getContext()); (http://lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/29920) llvm-svn: 353608
*	[yaml2obj][obj2yaml] - Add support for dumping/parsing .dynamic sections.	George Rimar	2019-02-09	1	-0/+46
\| \| \| \| \| \| \| \| \|	This teaches the tools to parse and dump the .dynamic section and its dynamic tags. Differential revision: https://reviews.llvm.org/D57691 llvm-svn: 353606
*	[GlobalOpt] Simplify __cxa_atexit elimination	Fangrui Song	2019-02-09	1	-39/+9
\| \| \| \| \| \| \| \| \| \| \| \| \|	cxxDtorIsEmpty checks callers recursively to determine if the __cxa_atexit-registered function is empty, and eliminates the __cxa_atexit call accordingly. This recursive check is unnecessary as redundant instructions and function calls can be removed by early-cse and inliner. In addition, cxxDtorIsEmpty does not mark visited functions, so it may visit a function exponentially many times (multiplication principle). llvm-svn: 353603
*	[MC] Clean up unused inline function and non-anchor defaulted destructors; NFCI	Hubert Tong	2019-02-09	3	-6/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Take care of some missing clean-ups that belong with r249548 and some other copy/paste that had happened. In particular, the destructors are no longer vtable anchors after r249548; and `setSectionName` in `MCSectionWasm` is private and unused since r313058 culled its only caller. The destructors are now implicitly defined, and the unused function is removed. Reviewers: nemanjai, jasonliu, grosbach Reviewed By: nemanjai Subscribers: sbc100, aheejin, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57182 llvm-svn: 353597
*	Extra processing for BitCast + PHI in InstCombine	Gabor Buella	2019-02-09	1	-11/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For some specific cases of a bitcast chain A->B->A with intervening PHI nodes, the InstCombiner::optimizeBitCastFromPhi transformation creates extra PHI nodes that are copies of already created PHIs; in other words, they are redundant. These extra PHI nodes could lead to extra move instructions generated after DeSSA transformation. This happens when several conditions are met: - SROA kicks in and creates a new alloca; - there is a simple assignment L = R, which falls under 'canonicalize loads' done by combineLoadToOperationType (this transformation is on by default). This transformation is what generates the bitcasts; - the alloca is then used in an A->B->A + PHI chain; - there is loop unrolling. As a result, for each new SROA alloca, optimizeBitCastFromPhi creates as many PHI nodes as the loop unrolling factor. All but one of these new PHI nodes are redundant and should not be created. Moreover, the idea of optimizeBitCastFromPhi is to get rid of the cast (when possible), but that doesn't happen under these conditions. The proposed fix is to do the cast replacement for the whole calculated/accumulated PHI closure, not only for the one cast that is the argument to optimizeBitCastFromPhi. This will help to accomplish several things: 1) avoid generating extra PHI nodes, as all casts which may trigger the optimizeBitCastFromPhi transformation will be replaced, 2) bitcasts will be replaced, and 3) create more opportunities to remove dead code, which appears after the replacement. A new test case shows that it's possible to get rid of all bitcasts completely and get quite good code reduction. Author: Igor Tsimbalist <igor.v.tsimbalist@intel.com> Reviewed By: Carrot Differential Revision: https://reviews.llvm.org/D57053 llvm-svn: 353595
*	This reverts commit 1440a848a635849b97f7a5cfa0ecc40d37451f5b.	Mikhail R. Gadelha	2019-02-09	2	-848/+1
\| \| \| \| \| \| \| \|	and commit a1853e834c65751f92521f7481b15cf0365e796b. They broke arm and aarch64 llvm-svn: 353590
*	[AMDGPU] Split dot-insts feature	Stanislav Mekhanoshin	2019-02-09	5	-24/+56
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D57971 llvm-svn: 353587
*	[NFC] Avoid passing blocks vector to the OutlineRegionInfo constructor by value.	Sergey Dmitriev	2019-02-08	1	-4/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reviewers: vsk, fhahn, davidxl Reviewed By: vsk Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57957 llvm-svn: 353582
*	Re-apply r353553 "[GISel][NFC]: Add missing call to record CSE hits in the CSEMIRBuilder"	Francis Visoiu Mistrih	2019-02-08	2	-9/+10
\| \| \| \| \| \| \| \|	With a fix after r353563 that adds some more opcodes. llvm-svn: 353579
*	Revert r353553 "[GISel][NFC]: Add missing call to record CSE hits in the CSEMIRBuilder"	Francis Visoiu Mistrih	2019-02-08	2	-10/+9
\| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit r353553. This breaks CodeGen/AArch64/GlobalISel/legalize-ext-csedebug-output.mir: http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/57963/console llvm-svn: 353575
*	[X86] Add FPCW as an implicit use on floating point load instructions.	Craig Topper	2019-02-08	1	-7/+7
\| \| \| \| \| \|	These instructions can generate a stack overflow exception so technically they read the stack overflow exception mask bit. llvm-svn: 353564
*	Implementation of asm-goto support in LLVM	Craig Topper	2019-02-08	55	-74/+654
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch accompanies the RFC posted here: http://lists.llvm.org/pipermail/llvm-dev/2018-October/127239.html This patch adds a new CallBr IR instruction to support asm-goto inline assembly like gcc as used by the linux kernel. This instruction is both a call instruction and a terminator instruction with multiple successors. Only inline assembly usage is supported today. This also adds a new INLINEASM_BR opcode to SelectionDAG and MachineIR to represent an INLINEASM block that is also considered a terminator instruction. There will likely be more bug fixes and optimizations to follow this, but we felt it had reached a point where we would like to switch to an incremental development model. Patch by Craig Topper, Alexander Ivchenko, Mikhail Dvoretckii Differential Revision: https://reviews.llvm.org/D53765 llvm-svn: 353563
*	[CodeExtractor] Restore outputs after creating exit stubs	Vedant Kumar	2019-02-08	1	-35/+44
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When CodeExtractor saves the result of an InvokeInst at the first insertion point of the 'normal destination' basic block, this block can be omitted from the outlined region, so the store is placed outside of the function. The suggested solution is to save outputs after creating exit stubs for the new function; in this case the stores will be placed in those blocks before the return. Patch by Sergei Kachkov! Fixes llvm.org/PR40455. Differential Revision: https://reviews.llvm.org/D57919 llvm-svn: 353562
*	AMDGPU: Eliminate GPU specific SubtargetFeatures	Matt Arsenault	2019-02-08	4	-80/+69
\| \| \| \| \| \| \| \| \| \| \|	Inline compatibility is determined from the individual feature bits. These are just sets of the separate features, but will always be treated as incompatible unless they are specifically ignored. Defining the ISA version number here in tablegen would be nice, but it turns out this wasn't actually used. llvm-svn: 353558
*	[DAGCombine] Optimize pow(X, 0.75) to sqrt(X) * sqrt(sqrt(X))	Nemanja Ivanovic	2019-02-08	1	-4/+14
\| \| \| \| \| \| \| \| \| \| \| \|	The sqrt case is faster and we already do this for the case where the exponent is 0.25. This adds the 0.75 case which is also not sensitive to signed zeros. Patch by Whitney Tsang (Whitney) Differential revision: https://reviews.llvm.org/D57434 llvm-svn: 353557
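The rewrite rests on the identity x^0.75 = x^0.5 * x^0.25, where x^0.25 is the square root of the square root. A minimal numeric sketch follows; the function name is hypothetical, and the sketch says nothing about the conditions under which the combiner actually applies the transform.

```cpp
#include <cmath>

// Computes x^0.75 as sqrt(x) * sqrt(sqrt(x)), reusing the first square root.
// Intended for non-negative finite x; illustrative only.
double pow075ViaSqrt(double x) {
  double root = std::sqrt(x);     // x^0.5
  return root * std::sqrt(root);  // x^0.5 * x^0.25
}
```

For inputs such as 16.0 the result is exact (4 * 2); in general it should agree with `std::pow(x, 0.75)` up to floating-point rounding.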
*	[GISel][NFC]: Add missing call to record CSE hits in the CSEMIRBuilder	Aditya Nandakumar	2019-02-08	2	-9/+10
\| \| \| \| \| \| \| \| \| \|	https://reviews.llvm.org/D57932 Add some logging + tests to make sure CSEInfo prints debug output. reviewed by: arsenm llvm-svn: 353553