bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	Sink some IntrinsicInst.h and Intrinsics.h out of llvm/include	Reid Kleckner	2017-09-07	6	-0/+7
\| \| \| \| \| \| \|	Many of these uses can get by with forward declarations. Hopefully this speeds up compilation after adding a single intrinsic. llvm-svn: 312759
*	Revert r312318, r312325, r312424, r312489	Richard Trieu	2017-09-07	3	-39/+1
\| \| \| \| \| \| \| \| \| \|	r312318 - Debug info for variables whose type is shrinked to bool r312325, r312424, r312489 - Test case for r312318 Revision 312318 introduced a null dereference bug. Details in https://bugs.llvm.org/show_bug.cgi?id=34490 llvm-svn: 312758
*	Move duplicate helpers from DbgValueInst / DbgDeclareInst to DbgInfoIntrinsic	Reid Kleckner	2017-09-07	1	-28/+11
\| \| \| \| \| \|	NFC llvm-svn: 312754
*	[DWARF] Line 0 should not have a discriminator.	Paul Robinson	2017-09-07	1	-2/+2
\| \| \| \| \| \| \| \|	It's meaningless and takes up extra space in the line table. Differential Revision: https://reviews.llvm.org/D37364 llvm-svn: 312751
*	[yaml2obj][ELF] Add support for symbol indexes greater than SHN_LORESERVE	Petr Hosek	2017-09-07	1	-0/+37
\| \| \| \| \| \| \| \| \| \| \| \| \|	Right now Symbols must be either undefined or defined in a specific section. Some symbols have section indexes like SHN_ABS however. This change adds support for outputting symbols that have such section indexes. Patch by Jake Ehrlich Differential Revision: https://reviews.llvm.org/D37391 llvm-svn: 312745
*	COFF: PDB: Allow multiple modules with the same name.	Peter Collingbourne	2017-09-07	1	-18/+3
\| \| \| \| \| \| \| \| \| \|	It is possible for two modules to have the same name if they are archive members with the same name, or if we are doing LTO (in which case all modules will have the name "lto.tmp"). Differential Revision: https://reviews.llvm.org/D37589 llvm-svn: 312744
*	Remove dead code. NFCI.	Peter Collingbourne	2017-09-07	1	-8/+0
\| \| \| \|	llvm-svn: 312740
*	[CUDA] Added rudimentary support for CUDA-9 and sm_70.	Artem Belevich	2017-09-07	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \|	For now CUDA-9 is not included in the list of CUDA versions clang searches for, so the path to CUDA-9 must be explicitly passed via --cuda-path=. On LLVM side NVPTX added sm_70 GPU type which bumps required PTX version to 6.0, but otherwise is equivalent to sm_62 at the moment. Differential Revision: https://reviews.llvm.org/D37576 llvm-svn: 312734
*	AMDGPU: Start selecting v_mad_mix_f32	Matt Arsenault	2017-09-07	4	-5/+105
\| \| \| \|	llvm-svn: 312732
*	DAG: Allow creating extract_vector_elt post-legalize	Matt Arsenault	2017-09-07	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixes some combine issues for AMDGPU where we weren't getting the many extract_vector_elt combines expected in a future patch. This should really be checking isOperationLegalOrCustom on the extract. That improves a number of x86 lit tests, but a few get stuck in an infinite loop from one place where a similar looking extract is created. I have a different workaround in the backend for that which keeps many of those improvements, but also adds a few regressions. llvm-svn: 312730
*	AMDGPU: Handle non-temporal loads and stores	Konstantin Zhuravlyov	2017-09-07	1	-23/+59
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D36862 llvm-svn: 312729
*	AMDGPU: Handle more than one memory operand in SIMemoryLegalizer	Konstantin Zhuravlyov	2017-09-07	2	-58/+145
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D37397 llvm-svn: 312725
*	[ARM] Remove redundant vcvt patterns.	Benjamin Kramer	2017-09-07	1	-14/+0
\| \| \| \| \| \| \| \|	These don't add any value as they're just compositions of existing patterns. However, they can confuse the cost logic in ISel, leading to duplicated vcvt instructions like in PR33199. llvm-svn: 312724
*	[X86][LLVM]Expanding Supports lowerInterleavedLoad() in X86InterleavedAccess (VF{8\|16\|32} stride 3).	Michael Zuckerman	2017-09-07	1	-20/+193
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch expands the support of lowerInterleavedLoad to {8\|16\|32}x8i stride 3. LLVM creates suboptimal shuffle code-gen for AVX2. Overall, this patch is a specific fix for the pattern (Stride=3 VF={8\|16\|32}) and we plan to include the store (deinterleaved side). The patch goal is to optimize the following sequence: a0 b0 c0 a1 b1 c1 a2 b2 c2 a3 b3 c3 a4 b4 c4 a5 b5 c5 a6 b6 c6 a7 b7 c7 into a0 a1 a2 a3 a4 a5 a6 a7 b0 b1 b2 b3 b4 b5 b6 b7 c0 c1 c2 c3 c4 c5 c6 c7 Reviewers 1. zvi 2. igor 3. guyblank 4. dorit 5. Ayal llvm-svn: 312722
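The rearrangement described in the commit above can be illustrated with a scalar sketch. This is not the AVX2 shuffle sequence the patch emits; it only shows the data movement for a stride-3 interleaved load. The function name `deinterleaveStride3` is hypothetical and does not exist in LLVM.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of LLVM: splits a stride-3 interleaved
// buffer (a0 b0 c0 a1 b1 c1 ...) into three contiguous lanes
// (a0 a1 ..., b0 b1 ..., c0 c1 ...).
// Trailing elements that do not form a full (a, b, c) group are ignored.
std::array<std::vector<std::uint8_t>, 3>
deinterleaveStride3(const std::vector<std::uint8_t> &in) {
  const std::size_t groups = in.size() / 3;
  std::array<std::vector<std::uint8_t>, 3> out;
  for (auto &lane : out)
    lane.reserve(groups);
  for (std::size_t i = 0; i < groups; ++i)
    for (std::size_t lane = 0; lane < 3; ++lane)
      out[lane].push_back(in[3 * i + lane]);
  return out;
}
```

Calling it on 24 bytes laid out as a0 b0 c0 ... a7 b7 c7 yields three 8-element lanes, which corresponds to the VF=8 case named in the commit subject.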
*	[mips] Use RegisterMCAsmBackend to register all MIPS asm backends. NFC	Simon Atanasyan	2017-09-07	5	-81/+28
\| \| \| \| \| \| \| \| \| \| \| \| \|	This change converts the `MipsAsmBackend` constructor to the "standard" form, which makes it possible to use `RegisterMCAsmBackend` for backend registration. Now we pass a `Triple` instance to the `MipsAsmBackend` ctor and deduce all required options like endianness and bitness from the triple. We still need to implement explicit ABI checking to provide correct options to backends. Differential revision: https://reviews.llvm.org/D37519 llvm-svn: 312720
*	[MachineCombiner] Update instruction depths incrementally for large BBs.	Florian Hahn	2017-09-07	2	-23/+90
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: For large basic blocks with lots of combinable instructions, the MachineTraceMetrics computations in MachineCombiner can dominate the compile time, as computing the trace information is quadratic in the number of instructions in a BB and its relevant successors/predecessors. In most cases, knowing the instruction depth should be enough to make combination decisions. As we already iterate over all instructions in a basic block, the instruction depth can be computed incrementally. This reduces the cost of machine-combine drastically in cases where lots of instructions are combined. The major drawback is that AFAIK, computing the critical path length cannot be done incrementally. Therefore we only compute instruction depths incrementally, for basic blocks with more instructions than inc_threshold. The -machine-combiner-inc-threshold option can be used to set the threshold and allows for easier experimenting and checking if using incremental updates for all basic blocks has any impact on the performance. Reviewers: sanjoy, Gerolf, MatzeB, efriedma, fhahn Reviewed By: fhahn Subscribers: kiranchandramohan, javed.absar, efriedma, llvm-commits Differential Revision: https://reviews.llvm.org/D36619 llvm-svn: 312719
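As a rough illustration of why instruction depths lend themselves to incremental computation, the sketch below computes depths in one forward pass over a block. It is a simplified model, not the MachineTraceMetrics implementation: the `Instr` struct, the fixed per-instruction latency, and `computeDepths` are all assumptions made for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical, simplified instruction: the indices of earlier instructions
// in the same block that define its operands, plus a fixed latency in cycles.
struct Instr {
  std::vector<std::size_t> defs; // every index must be smaller than this instruction's own
  unsigned latency;
};

// One forward pass over the block. An instruction's depth is the earliest
// cycle at which all of its operands are available, i.e. the maximum of
// depth[def] + latency[def] over its defining instructions.
std::vector<unsigned> computeDepths(const std::vector<Instr> &block) {
  std::vector<unsigned> depth(block.size(), 0);
  for (std::size_t i = 0; i < block.size(); ++i)
    for (std::size_t def : block[i].defs)
      depth[i] = std::max(depth[i], depth[def] + block[def].latency);
  return depth;
}
```

Because each depth depends only on instructions that come earlier, the depth of a newly created instruction can be derived from values that are already known, without recomputing the rest of the block. In this model nothing comparable holds for critical path length, which depends on what follows the instruction.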
*	[MachineTraceMetrics] Add computeDepth function (NFCI).	Florian Hahn	2017-09-07	1	-54/+46
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This function is used in D36619 to update the instruction depths incrementally. Reviewers: efriedma, Gerolf, MatzeB, fhahn Reviewed By: fhahn Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36696 llvm-svn: 312714
*	[Sparc][NFC] Clean up SelectCC lowering	Alex Bradbury	2017-09-07	1	-44/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The ARM, BPF, MSP430, Sparc and Mips backends all use a similar code sequence for lowering SelectCC. As pointed out by @reames in D29937, this code isn't particularly clear and in most of these backends doesn't actually match the comments. This patch makes the code sequence clearer for the Sparc backend through better variable naming and more accurate comments (e.g. we are inserting triangle control flow, _not_ diamond). There is no functional change. Differential Revision: https://reviews.llvm.org/D37194 llvm-svn: 312713
*	Revert "[RegAlloc] Make sure live-ranges reflect the state of the IR when removing them"	Jonas Paulsson	2017-09-07	2	-8/+2
\| \| \| \| \| \| \| \| \| \|	This temporarily reverts commit 463fa38 (r311401). See https://bugs.llvm.org/show_bug.cgi?id=34502 llvm-svn: 312708
*	X86: Improve AVX512 fptoui lowering	Zvi Rackover	2017-09-07	3	-0/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Add patterns for fptoui <16 x float> to <16 x i8> fptoui <16 x float> to <16 x i16> Reviewers: igorb, delena, craig.topper Reviewed By: craig.topper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37505 llvm-svn: 312704
*	[X86] Force shuffle lowering to only create X86ISD::VPERM2X128 with 64-bit element types so we can remove some patterns from isel.	Craig Topper	2017-09-07	2	-22/+5
\| \| \| \| \| \| \| \| \| \|	Intrinsic handling is still creating these nodes with 32-bit elements as well. But at least this gets rid of 8 and 16. Ideally, someday we'll convert the intrinsics to generic vector shuffles and remove the intrinsics. llvm-svn: 312702
*	AMDGPU: Don't legalize i16 extloads to i32 with legal i16	Matt Arsenault	2017-09-07	3	-1/+8
\| \| \| \| \| \| \|	Keeping non-i16 extloads makes it easier to match some new gfx9 load instructions. llvm-svn: 312699
*	ModuleSummaryAnalysis: Correctly handle all function operand references.	Peter Collingbourne	2017-09-07	1	-7/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The current code that handles personality functions when creating a module summary does not correctly handle the case where a function's personality function operand refers to the function indirectly (e.g. via a bitcast). This patch handles such cases by treating personality function references like any other reference, i.e. by adding them to the function's reference list. This has the minor side benefit of allowing personality functions to participate in early dead stripping. We do this by calling findRefEdges on the function itself. This way we also end up handling other function operands (specifically prefix data and prologue data) for free. Differential Revision: https://reviews.llvm.org/D37553 llvm-svn: 312698
*	[X86] Remove patterns for selecting a v8f32 X86ISD::MOVSS or v4f64 X86ISD::MOVSD.	Craig Topper	2017-09-07	2	-48/+0
\| \| \| \| \| \| \| \|	I don't think we ever generate these. If we did, I would expect we would also be able to generate v16f32 and v8f64, but we don't have those patterns. llvm-svn: 312694
*	ARM: track globals promoted to coalesced const pool entries	Saleem Abdulrasool	2017-09-07	3	-13/+27
\| \| \| \| \| \| \| \| \| \| \| \| \|	Globals that are promoted to an ARM constant pool may alias with another existing constant pool entry. We need to keep a reference to all globals that were promoted to each constant pool value so that we can emit a distinct label for each promoted global. These labels are necessary so that debug info can refer to the promoted global without an undefined reference during linking. Patch by Stephen Crane! llvm-svn: 312692
*	Object: Downgrade invalid weak externals from an assert fail to an llvm::Error when creating an irsymtab.	Peter Collingbourne	2017-09-07	1	-3/+6
\| \| \| \| \| \| \| \|	This fixes bitcode emission for modules containing invalid weak externals. llvm-svn: 312686
*	InstSimplify: canonicalize is idempotent	Matt Arsenault	2017-09-07	1	-0/+1
\| \| \| \|	llvm-svn: 312685
*	LTO: Remove unnecessary Windows support code.	Peter Collingbourne	2017-09-07	1	-15/+0
\| \| \| \| \| \| \|	I empirically verified that open files can in fact be renamed on Windows with sys::fs::rename, so remove the incorrect code and comment. llvm-svn: 312683
*	[Pass] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).	Eugene Zelenko	2017-09-06	2	-32/+43
\| \| \| \| \| \|	llvm-svn: 312679
*	[AMDGPU] Use v_pk_max_f16 for fcanonicalize	Stanislav Mekhanoshin	2017-09-06	1	-5/+10
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D37325 llvm-svn: 312676
*	[WebAssembly] Only treat imports/exports as symbols when reading relocatable object files	Sam Clegg	2017-09-06	2	-34/+68
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change only treats imported and exported functions and globals as symbol table entries if the object has a "linking" section (i.e. it is a relocatable object file). In this case all globals must be of type I32 and initialized with i32.const. This was previously being assumed but not checked for and was causing a failure on big endian machines due to using the wrong value of the union. See: https://bugs.llvm.org/show_bug.cgi?id=34487 Differential Revision: https://reviews.llvm.org/D37497 llvm-svn: 312674
*	Removes redundant `llvm::`, add comments and simplify a return type of a function.	Rui Ueyama	2017-09-06	1	-29/+32
\| \| \| \| \| \| \| \|	No functional change intended. llvm-svn: 312673
*	Insert IMPLICIT_DEFS for undef uses in tail merging	Matthias Braun	2017-09-06	6	-84/+138
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Tail merging can convert an undef use into a normal one when creating a common tail. Doing so can make the register live out from a block which previously contained the undef use. To keep the liveness up-to-date, insert IMPLICIT_DEFs in such blocks when necessary. To enable this patch the computeLiveIns() function which used to compute live-ins for a block and set them immediately is split into new functions: - computeLiveIns() just computes the live-ins in a LivePhysRegs set. - addLiveIns() applies the live-ins to a block live-in list. - computeAndAddLiveIns() is a convenience function combining the other two functions and behaving like computeLiveIns() before this patch. Based on a patch by Krzysztof Parzyszek <kparzysz@codeaurora.org> Differential Revision: https://reviews.llvm.org/D37034 llvm-svn: 312668
*	Disable jump threading into loop headers	Krzysztof Parzyszek	2017-09-06	1	-4/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Consider this type of a loop: for (...) { ... if (...) continue; ... } Normally, the "continue" would branch to the loop control code that checks whether the loop should continue iterating and which contains the (often) unique loop latch branch. In certain cases jump threading can "thread" the inner branch directly to the loop header, creating a second loop latch. Loop canonicalization would then transform this loop into a loop nest. The problem with this is that in such a loop nest neither loop is countable even if the original loop was. This may inhibit subsequent loop optimizations and be detrimental to performance. Differential Revision: https://reviews.llvm.org/D36404 llvm-svn: 312664
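A concrete instance of the loop shape discussed in the commit above might look like the following. The function `sumNonNegative` is a hypothetical example chosen for illustration; it is not taken from the patch or its tests.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example of a countable loop with an inner `continue`.
// At the source level the `continue` transfers control to the loop control
// code (the increment and the bound check), which holds the single latch
// branch. Threading that inner branch straight to the loop header would
// create the second latch the commit message warns about.
int sumNonNegative(const std::vector<int> &values) {
  int sum = 0;
  for (std::size_t i = 0; i < values.size(); ++i) {
    if (values[i] < 0)
      continue; // goes to ++i and the i < size() check, not to the header directly
    sum += values[i];
  }
  return sum;
}
```

The trip count here is `values.size()`, so the loop is countable as written; that is the property the commit aims to preserve.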
*	[X86] Move more isel patterns to X86InstrVecCompiler.td. NFC	Craig Topper	2017-09-06	3	-437/+184
\| \| \| \| \| \|	This moves more of our subvector insert/extract tricks to X86InstrVecCompiler.td and refactors them into multiclasses. llvm-svn: 312661
*	[AMDGPU] Fixed encoding of v_pk_mul_f16 in fcanonicalize	Stanislav Mekhanoshin	2017-09-06	1	-1/+1
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D37522 llvm-svn: 312660
*	[IfConversion] Remove kill flags from common instructions as well	Krzysztof Parzyszek	2017-09-06	1	-4/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When if-converting a diamond, two separate blocks will be placed back to back to form a straight line code. To ensure correctness of the liveness information, any registers that are live in the second block should not be killed in the first block, even if they were in the original code. Additionally, when the two blocks share common instructions at the beginning, these instructions will not be duplicated, but only placed once, before both of the blocks. Since the function "isIdenticalTo" (as used here) ignores kill flags, the common initial code in one block may have a kill flag for a register that is live in the other block. Because the code that removes kill flags only runs for the non-common parts of the predicated blocks, a kill flag mismatch in the common code could still lead to a live register being killed prematurely. llvm-svn: 312654
*	[X86] Actually add the new file that was supposed to go with r312649.	Craig Topper	2017-09-06	1	-0/+179
\| \| \| \|	llvm-svn: 312650
*	[X86] Introduce a new td file to hold some of the non-instruction patterns from SSE and AVX512	Craig Topper	2017-09-06	3	-211/+1
\| \| \| \| \| \| \| \| \| \| \| \|	This patch moves some of the similar non-instruction patterns from X86InstrSSE.td and X86InstrAVX512.td to a common file. This is intended as a starting point. There are many other optimization patterns that exist in both files that we could move here. Differential Revision: https://reviews.llvm.org/D37455 llvm-svn: 312649
*	Fix PR33878: BasicAA incorrectly assumes different address spaces don't alias	Nuno Lopes	2017-09-06	1	-5/+0
\| \| \| \| \| \| \| \| \|	Remove code that assumed that a nullptr of address space != 0 couldn't alias with a non-null pointer. This is incorrect, since nothing can be concluded about a null pointer in an address space != 0. This code was written before address spaces were introduced. Differential Revision: https://reviews.llvm.org/D37518 llvm-svn: 312648
*	Minor style fixes in lib/Support/**/Program.(inc\|cpp).	Alexander Kornienko	2017-09-06	3	-72/+70
\| \| \| \| \| \|	No functional changes intended. llvm-svn: 312646
*	[Hexagon] Add option to generate calls to "abort" for "unreachable"	Krzysztof Parzyszek	2017-09-06	1	-0/+6
\| \| \| \|	llvm-svn: 312644
*	[TailCall] Allow llvm.memcpy/memset/memmove to be tail calls when parent function returns the intrinsic's first argument.	Wei Mi	2017-09-06	1	-0/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	llvm.memcpy/memset/memmove return void but they will return the first argument after they are expanded as libcalls. Currently, if the parent function has any return value, llvm.memcpy cannot be turned into a tail call after expansion. The patch handles that case in SelectionDAGBuilder so that when the caller function returns the same value as the first argument of llvm.memcpy, a tail call is allowed. Differential Revision: https://reviews.llvm.org/D37406 llvm-svn: 312641
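The source-level pattern this commit targets can be sketched as below. The wrapper name `copyBytes` is hypothetical. Whether a compiler actually emits a tail call for it depends on the target, the optimization level, and the calling convention, so treat this as an illustration of the shape rather than a guarantee.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical wrapper: it returns its first argument, which is also the
// value the memcpy library call returns. That makes the copy a candidate
// for being lowered as a tail call instead of a call followed by a return.
void *copyBytes(void *dst, const void *src, std::size_t n) {
  std::memcpy(dst, src, n);
  return dst;
}
```

If the function returned anything other than `dst`, the return value of the library call could not be forwarded directly, and the call would have to remain an ordinary call.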
*	[AMDGPU] Fix shouldClusterMemOps to process flat loads	Stanislav Mekhanoshin	2017-09-06	1	-0/+4
\| \| \| \| \| \| \| \|	Flat loads do not have vdata operand but have vdst instead. Differential Revision: https://reviews.llvm.org/D37502 llvm-svn: 312640
*	AMDGPU: Make worst-case assumption about the wait states in inline assembly	Nicolai Haehnle	2017-09-06	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Mesa still uses a hack where empty inline assembly is used as a kind of optimization barrier. This exposed a problem where not enough wait states were inserted, because the hazard recognizer implicitly assumed that each inline assembly "instruction" has at least one wait state. Reviewers: arsenm Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D37205 llvm-svn: 312635
*	[X86][X87] Ensure x87 instructions are tagged as altering the FPSW reg	Simon Pilgrim	2017-09-06	1	-7/+8
\| \| \| \| \| \| \| \| \| \|	As noted in PR34080, a lot of x87 instructions alter the FPSW status register (or leave it in an undefined state) but aren't tagged as such in the tablegen. This patch tags the control word, stack, wait and math instructions as altering FPSW, which matches what the AMD APMs suggest happens. Differential Revision: https://reviews.llvm.org/D36414 llvm-svn: 312629
*	[RISCV][NFC] Fix sorting of includes in lib/Target/RISCV	Alex Bradbury	2017-09-06	2	-6/+6
\| \| \| \|	llvm-svn: 312624
*	[DAGCombiner] When combining EXTRACT_SUBVECTOR of a BUILD_VECTOR, make sure we don't create a BUILD_VECTOR with an illegal type after type legalization.	Craig Topper	2017-09-06	1	-2/+3
\| \| \| \| \| \|	llvm-svn: 312621
*	[x86] Fix PR34377 by disabling cmov conversion when we relied on it performing a zext of a register.	Chandler Carruth	2017-09-06	1	-0/+10
\| \| \| \| \| \| \| \| \| \| \|	On the PR there is discussion of how to more effectively handle this, but this patch prevents us from miscompiling code. Differential Revision: https://reviews.llvm.org/D37504 llvm-svn: 312620
*	[X86] Add more FMA3 patterns to cover a load in all 3 possible positions.	Craig Topper	2017-09-06	2	-68/+137
\| \| \| \| \| \|	This matches what we already do for AVX512. The peephole pass makes up for this in most if not all cases. But this makes isel behavior for these consistent with every other instruction. llvm-svn: 312613