bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	[X86][SSE] Add X86ISD::PACKSS\PACKUS to ↵	Simon Pilgrim	2019-05-01	1	-1/+7
	SimplifyDemandedVectorEltsForTargetNode vector splitting llvm-svn: 359673
*	[X86][SSE] Add X86ISD::UNPCKL\UNPCK to ↵	Simon Pilgrim	2019-05-01	1	-2/+4
	SimplifyDemandedVectorEltsForTargetNode vector splitting llvm-svn: 359670
*	[X86][SSE] Move extract_subvector(pshufb) fold to ↵	Simon Pilgrim	2019-05-01	1	-12/+3
	SimplifyDemandedVectorEltsForTargetNode This lets us hit more cases than combineExtractSubvector and allows us to reuse more code. llvm-svn: 359669
*	[X86] SimplifyDemandedVectorEltsForTargetNode - pull out vector halving ↵	Simon Pilgrim	2019-05-01	1	-10/+13
	code. NFCI. Pull out the HADD/HSUB code to halve vector widths if the upper half isn't used - prep work for adding support for other opcodes. llvm-svn: 359667
*	[X86][SSE] Extract i1 elements from vXi1 bool vectors	Simon Pilgrim	2019-05-01	1	-0/+33
	This is an alternative to D59669 which more aggressively extracts i1 elements from vXi1 bool vectors using a MOVMSK. Differential Revision: https://reviews.llvm.org/D61189 llvm-svn: 359666
*	[X86FixupLEAs] Hoist the calls to isLEA out of the 3 separate functions and ↵	Craig Topper	2019-05-01	1	-14/+9
	put it in the basic block instruction loop. NFC No need to check it 3 different times. Just do it once at the top of the loop. llvm-svn: 359658
*	Revert "[llvm] r359313 - [PowerPC] Update P9 vector costs for insert/extract ↵	David L. Jones	2019-05-01	1	-29/+0
	element" This causes segfaults during optimized builds. More details, including a reproducer, are on the llvm-commits thread for r359313. llvm-svn: 359648
*	[JITLink] Make sure we explicitly deallocate memory on failure.	Lang Hames	2019-05-01	2	-4/+11
	JITLinkGeneric phases 2 and 3 (focused on applying fixups and finalizing memory, respectively) may fail for various reasons. If this happens, we need to explicitly de-allocate the memory allocated in phase 1 (explicitly, because deallocation may also fail and so is implemented as a method returning error). No testcase yet: I am still trying to decide on the right way to test totally platform agnostic code like this. llvm-svn: 359643
*	[WebAssembly] Update expectations for gcc torture tests	Sam Clegg	2019-04-30	1	-0/+12
	This is needed to make the wasm waterfall green again after we land the update to WASI: https://github.com/WebAssembly/waterfall/pull/492 Differential Revision: https://reviews.llvm.org/D61351 llvm-svn: 359634
*	[InstCombine] Limit a vector demanded elts rule which was producing invalid IR.	Philip Reames	2019-04-30	1	-0/+12
	The demanded elts rules introduced for GEPs in https://reviews.llvm.org/rL356293 replaced vector constants with undefs (by design). It turns out that the LangRef disallows such cases when indexing structs. The right fix is probably to relax the LangRef requirement, and update other passes to expect the result, but for the moment, limit the transform to avoid compiler crashes. This should fix https://bugs.llvm.org/show_bug.cgi?id=41624. llvm-svn: 359633
*	[MemorySSA] Invalidate MemorySSA if AA or DT are invalidated.	Alina Sbirlea	2019-04-30	1	-0/+9
	Summary: MemorySSA keeps internal pointers of AA and DT. If these get invalidated, so should MemorySSA. Reviewers: george.burgess.iv, chandlerc Subscribers: jlebar, Prazek, llvm-commits Tags: LLVM Differential Revision: https://reviews.llvm.org/D61043 llvm-svn: 359627
*	[ORC] Move SimpleCompiler/ConcurrentIRCompiler definitions into a .cpp file.	Lang Hames	2019-04-30	2	-0/+87
	SimpleCompiler is no longer templated, so there's no reason for this code to be in a header any more. llvm-svn: 359626
*	[AliasAnalysis/NewPassManager] Invalidate AAManager less often.	Alina Sbirlea	2019-04-30	1	-4/+8
	Summary: This is a redo of D60914. The objective is to not invalidate AAManager, which is stateless, unless there is an explicit invalidate in one of the AAResults. To achieve this, this patch adds an API to PAC, to check precisely this: is this analysis not invalidated explicitly == is this analysis not abandoned == is this analysis stateless, so preserved without explicitly being marked as preserved by everyone Reviewers: chandlerc Subscribers: mehdi_amini, jlebar, george.burgess.iv, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61284 llvm-svn: 359622
*	[AMDGPU] gfx1010 VMEM and SMEM implementation	Stanislav Mekhanoshin	2019-04-30	16	-317/+1071
	Differential Revision: https://reviews.llvm.org/D61330 llvm-svn: 359621
*	Fix a few -Werror warnings:	Eric Christopher	2019-04-30	1	-4/+3
	- Remove a variable only used in an assert - Fix pessimizing move warning around copy elision llvm-svn: 359617
*	[PassManagerBuilder] Add option for interleaved loops, for loop vectorize.	Alina Sbirlea	2019-04-30	1	-4/+2
	Summary: Match NewPassManager behavior: add option for interleaved loops in the old pass manager, and use that instead of the flag used to disable loop unroll. No changes in the defaults. Reviewers: chandlerc Subscribers: mehdi_amini, jlebar, dmgreen, hsaito, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61030 llvm-svn: 359615
*	[JITLink] Add debugging output to print resolved external atoms.	Lang Hames	2019-04-30	1	-0/+6
	llvm-svn: 359614
*	[ORC][JITLink] Name in-memory compiled objects after their source modules.	Lang Hames	2019-04-30	1	-1/+2
	In-memory compiled object buffer identifiers will now be derived from the identifiers of their source IR modules. This makes it easier to connect in-memory objects with their source modules in debugging output. llvm-svn: 359613
*	[llvm-profdata] Add overlap command to compute similarity b/w two profile files	Rong Xu	2019-04-30	3	-0/+283
	Add overlap functionality to llvm-profdata tool to compute the similarity between two profile files. Differential Revision: https://reviews.llvm.org/D60977 llvm-svn: 359612
*	[NFC][InlineCost] cleanup - comments, overflow handling.	Fedor Sergeev	2019-04-30	1	-52/+61
	Reviewed By: apilipenko Tags: #llvm Differential Revision: https://reviews.llvm.org/D60751 llvm-svn: 359609
*	[X86][SSE] Fold extract_subvector(extend(x)) -> extend_vector_inreg(x)	Simon Pilgrim	2019-04-30	1	-5/+7
	This adds any extend support - folding to zero_extend_vector_inreg (PMOVZX) for legality. Minor improvement for PR39709. llvm-svn: 359608
*	Fix stack-use-after free after r359580	Nico Weber	2019-04-30	1	-3/+4
	`Candidate` was a StringRef referring to a temporary string. Instead, create a local variable for the string and use a StringRef referring to that. llvm-svn: 359604
*	[WebAssembly] Support EXPLICIT_NAME symbols in llvm-readobj	Dan Gohman	2019-04-30	1	-0/+1
	Teach llvm-readobj about WASM_SYMBOL_EXPLICIT_NAME. Differential Revision: https://reviews.llvm.org/D61323 Reviewer: sbc100 llvm-svn: 359602
*	[WebAssembly] Support f16 libcalls	Dan Gohman	2019-04-30	2	-1/+23
	Add support for f16 libcalls in WebAssembly. This entails adding signatures for the remaining F16 libcalls, and renaming gnu_f2h_ieee/gnu_h2f_ieee to truncsfhf2/extendhfsf2 for consistency between f32 and f64/f128 (compiler-rt already supports this). Differential Revision: https://reviews.llvm.org/D61287 Reviewer: dschuff llvm-svn: 359600
*	[X86] Remove if that's always true	Craig Topper	2019-04-30	1	-2/+1
	It's been like this since it was added in a refactor of this code. Fixes PR41659 llvm-svn: 359597
*	[SimplifyLibCalls] Clean up code (NFC)	Evandro Menezes	2019-04-30	1	-6/+8
	Fix pointer check after dereferencing (PR41665). llvm-svn: 359595
*	[X86] If PreprocessISelDAG reorders a load before a call, make sure we ↵	Craig Topper	2019-04-30	1	-0/+5
	remove dead nodes from the graph The reordering can leave at least a dead TokenFactor in the graph. This causes the linearize scheduler to fail with something like the assert seen in PR22614. This is only one of many ways we can break the linearize scheduler today so I can't say for sure that any of the other failures in that bug were caused by this issue. This takes the heavy hammer approach of just running RemoveDeadNodes unconditionally at the end of the PreprocessISelDAG. If this turns out to be a compile time hit, we can try to refine it. Differential Revision: https://reviews.llvm.org/D61164 llvm-svn: 359582
*	[X86] Initial cleanups on the FixupLEAs pass. Separate Atom LEA creation ↵	Craig Topper	2019-04-30	1	-91/+75
	from other LEA optimizations. This removes some of the class variables. Merge basic block processing into runOnMachineFunction to keep the flags local. Pass MachineBasicBlock around instead of an iterator. We can get the iterator in the few places that need it. Allows a range-based outer for loop. Separate the Atom optimization from the rest of the optimizations. This allows fixupIncDec to create INC/DEC and still allow Atom to turn it back into LEA when profitable by its heuristics. I'd like to improve fixupIncDec to turn LEAs into ADD any time the base or index register is equal to the destination register. This is profitable regardless of the various slow flags. But again we would want Atom to be able to undo that. Differential Revision: https://reviews.llvm.org/D60993 llvm-svn: 359581
*	Re-reland "[Option] Fix PR37006 prefix choice in findNearest"	Nico Weber	2019-04-30	1	-24/+24
	This was first reviewed in https://reviews.llvm.org/D46776 and landed in r332299, but got reverted because it broke the PS4 bots. https://reviews.llvm.org/D50410 fixed this, and then this change was re-reviewed at https://reviews.llvm.org/D50515 and relanded in r341329. It got reverted due to causing MSan issues. However, nobody wrote down the error message and the bot link is dead, so I'm relanding this to capture the MSan error. I'll then either fix it, or copy it somewhere and revert if fixing looks difficult. llvm-svn: 359580
*	[SelectionDAG] remove div-by-zero constant folding restriction	Sanjay Patel	2019-04-30	1	-7/+3
	We don't have this restriction in IR, so it should not be here either simply out of consistency. Code that wants to handle FP exceptions is expected to use the 'strict' variants of these nodes. We don't get the frem case because frem by 0.0 produces NaN (invalid), and that's the remaining check here (so the removed check for frem was dead code AFAIK). This is the only place in SDAG that uses "HasFPExceptions", so I think we should remove that entirely as a follow-up patch. llvm-svn: 359566
*	[TableGen] Fix null pointer dereferencing in token parser.	Simon Pilgrim	2019-04-30	1	-8/+10
	Reported in https://www.viva64.com/en/b/0629/ llvm-svn: 359559
*	Revert rL359519 : [MemorySSA] Invalidate MemorySSA if AA or DT are invalidated.	Simon Pilgrim	2019-04-30	1	-9/+0
	Summary: MemorySSA keeps internal pointers of AA and DT. If these get invalidated, so should MemorySSA. Reviewers: george.burgess.iv, chandlerc Subscribers: jlebar, Prazek, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61043 ........ This was causing windows build bot failures llvm-svn: 359555
*	[ARM] Implement TTI::getMemcpyCost	Sjoerd Meijer	2019-04-30	3	-0/+43
	This implements TargetTransformInfo method getMemcpyCost, which estimates the number of instructions to which a memcpy instruction expands. Differential Revision: https://reviews.llvm.org/D59787 llvm-svn: 359547
*	Fix for bug 41512: lower INSERT_VECTOR_ELT(ZeroVec, 0, Elt) to ↵	Simon Pilgrim	2019-04-30	1	-0/+8
	SCALAR_TO_VECTOR(Elt) for all SSE flavors Current LLVM uses pxor+pinsrb on SSE4+ for INSERT_VECTOR_ELT(ZeroVec, 0, Elt) instead of the much simpler movd. INSERT_VECTOR_ELT(ZeroVec, 0, Elt) is an idiomatic construct which is used e.g. for _mm_cvtsi32_si128(Elt) and for lowest element initialization in _mm_set_epi32. So such inefficient lowering leads to significant performance degradations in certain cases when switching from SSSE3 to SSE4. https://bugs.llvm.org/show_bug.cgi?id=41512 Here INSERT_VECTOR_ELT(ZeroVec, 0, Elt) is simply converted to SCALAR_TO_VECTOR(Elt) when applicable, since the latter is a closer match to the desired behavior and is always efficiently lowered to movd and the like. Committed on behalf of @Serge_Preis (Serge Preis) Differential Revision: https://reviews.llvm.org/D60852 llvm-svn: 359545
*	[TargetLowering] findOptimalMemOpLowering. NFCI.	Sjoerd Meijer	2019-04-30	2	-123/+119
	This was a local static function in SelectionDAG, which I've promoted to TargetLowering so that I can reuse it to estimate the cost of a memory operation in D59787. Differential Revision: https://reviews.llvm.org/D59766 llvm-svn: 359543
*	[ARM GlobalISel] Widen small shift operands	Diana Picus	2019-04-30	1	-0/+1
	The legalizer was already widening the shift amount. Add tests for that behaviour, and also support widening the shifted value. llvm-svn: 359542
*	[AsmPrinter] Make AsmPrinter::HandlerInfo::Handler a unique_ptr	Fangrui Song	2019-04-30	2	-16/+15
	Handlers.clear() in AsmPrinter::doFinalization() will destroy these handlers. A unique_ptr makes the ownership clearer. llvm-svn: 359541
*	[ARM GlobalISel] Be more careful about bailing out	Diana Picus	2019-04-30	1	-2/+2
	Bail out on function arguments/returns with types aggregating an unsupported type. This fixes cases where we would happily and incorrectly lower functions taking e.g. [1 x i64] parameters, when we don't even support plain i64 yet. llvm-svn: 359540
*	[TargetLowering] Change getOptimalMemOpType to take a function attribute list	Sjoerd Meijer	2019-04-30	16	-49/+40
	The MachineFunction wasn't used in getOptimalMemOpType, but more importantly, this allows reuse of findOptimalMemOpLowering that is calling getOptimalMemOpType. This is the groundwork for the changes in D59766 and D59787, that allows implementation of TTI::getMemcpyCost. Differential Revision: https://reviews.llvm.org/D59785 llvm-svn: 359537
*	MSan: handle llvm.lifetime.start intrinsic	Alexander Potapenko	2019-04-30	1	-8/+54
	Summary: When a variable goes into scope several times within a single function or when two variables from different scopes share a stack slot it may be incorrect to poison such scoped locals at the beginning of the function. In the former case it may lead to false negatives (see https://github.com/google/sanitizers/issues/590), in the latter - to incorrect reports (because only one origin remains on the stack). If Clang emits lifetime intrinsics for such scoped variables we insert code poisoning them after each call to llvm.lifetime.start(). If for a certain intrinsic we fail to find a corresponding alloca, we fall back to poisoning allocas for the whole function, as it's now impossible to tell which alloca was missed. The new instrumentation may slow down hot loops containing local variables with lifetime intrinsics, so we allow disabling it with -mllvm -msan-handle-lifetime-intrinsics=false. Reviewers: eugenis, pcc Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60617 llvm-svn: 359536
*	[DebugInfo] DW_OP_deref_size in PrologEpilogInserter.	Markus Lavin	2019-04-30	7	-3/+53
	The PrologEpilogInserter needs to insert a DW_OP_deref_size before prepending a memory location expression to an already implicit expression to avoid having the existing expression act on the memory address instead of the value behind it. The reason for using DW_OP_deref_size and not plain DW_OP_deref is that big-endian targets need to read the right size as simply truncating a larger read would yield the wrong result (LSB bytes are not at the lower address). This re-commit fixes issues reported in the first one. Namely, deref was inserted under the wrong conditions and additionally the deref_size argument was incorrectly encoded. Differential Revision: https://reviews.llvm.org/D59687 llvm-svn: 359535
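The big-endian point in the DW_OP_deref_size commit above can be illustrated with a small sketch. It models memory as a byte string and is not LLVM or DWARF code: the helper names, the 4-byte "wide" read and the byte values are all assumptions chosen for illustration.

```python
# Illustration only: a 1-byte variable at offset 0, followed by unrelated bytes.
MEMORY = bytes([0x2A, 0x11, 0x22, 0x33])


def sized_read(memory, offset, size, byteorder):
    """Read exactly `size` bytes, which is what a sized dereference does."""
    return int.from_bytes(memory[offset:offset + size], byteorder)


def wide_read_then_truncate(memory, offset, wide, size, byteorder):
    """Read `wide` bytes, then keep only the low `size` bytes of the value."""
    value = int.from_bytes(memory[offset:offset + wide], byteorder)
    return value & ((1 << (8 * size)) - 1)


# Little-endian: the least significant byte sits at the lowest address,
# so truncating a wider read happens to give the right answer.
little = wide_read_then_truncate(MEMORY, 0, 4, 1, "little")

# Big-endian: the least significant byte sits at the highest address,
# so truncation picks up an unrelated neighbouring byte.
big = wide_read_then_truncate(MEMORY, 0, 4, 1, "big")

# Reading the right size is correct regardless of byte order.
big_sized = sized_read(MEMORY, 0, 1, "big")
```

Here only the sized read returns the stored byte on both byte orders, which is the reason the message gives for preferring DW_OP_deref_size.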
*	[DAGCombiner] Do not generate ISD::ADDE node if adde is not legal for the ↵	Zi Xuan Wu	2019-04-30	1	-1/+3
	target when combine ISD::TRUNC node Do not combine (trunc adde(X, Y, Carry)) into (adde trunc(X), trunc(Y), Carry) if adde is not legal for the target, even at the type-legalize phase, because adde is special and will not be legalized later at the operation-legalize phase. This fixes: PR40922 https://bugs.llvm.org/show_bug.cgi?id=40922 Differential Revision: https://reviews.llvm.org//D60854 llvm-svn: 359532
*	[ORC] Allow JITDylib definition generators to return Errors.	Lang Hames	2019-04-30	2	-58/+94
	Background: A definition generator can be attached to a JITDylib to generate new definitions in response to queries. For example: a generator that forwards calls to dlsym can map symbols from a dynamic library into the JIT process on demand. If definition generation fails then the generator should be able to return an error. This allows the JIT API to distinguish between the case where a generator does not provide a definition, and the case where it was not able to determine whether it provided a definition due to an error. The immediate motivation for this is cross-process symbol lookups: If the remote-lookup generator is attached to a JITDylib early in the search list, and if a generator failure is misinterpreted as "no definition in this JITDylib" then lookup may continue and bind to a different definition in a later JITDylib, which is a bug. llvm-svn: 359521
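The distinction drawn in the ORC commit above, between "no definition here" and "could not determine", can be sketched as a toy search-order lookup. This is not the ORC API: every class, function and exception name below is hypothetical, and addresses are plain integers.

```python
class GeneratorError(Exception):
    """Hypothetical: the generator could not tell whether it has a definition."""


class SymbolNotFound(Exception):
    """Hypothetical: nothing in the search order defines the symbol."""


class JITDylib:
    """Toy stand-in for a JITDylib: a symbol table plus an optional generator."""

    def __init__(self, name, symbols=None, generator=None):
        self.name = name
        self.symbols = dict(symbols or {})
        self.generator = generator

    def find(self, symbol):
        if symbol in self.symbols:
            return self.symbols[symbol]
        if self.generator is not None:
            # Returns None for "no definition"; raises GeneratorError on failure.
            address = self.generator(symbol)
            if address is not None:
                self.symbols[symbol] = address
                return address
        return None


def lookup(search_order, symbol):
    """Walk the search order; a generator failure aborts the whole lookup."""
    for dylib in search_order:
        address = dylib.find(symbol)
        if address is not None:
            return dylib.name, address
    raise SymbolNotFound(symbol)


def empty_generator(symbol):
    return None  # definitively no definition, so the search may continue


def failing_remote_generator(symbol):
    raise GeneratorError(f"remote lookup of {symbol!r} failed")
```

With `empty_generator` attached to the first entry, the lookup falls through to a later entry. With `failing_remote_generator`, the error propagates instead of silently binding to the later definition, which is the behaviour the message argues for.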
*	[MemorySSA] Invalidate MemorySSA if AA or DT are invalidated.	Alina Sbirlea	2019-04-29	1	-0/+9
	Summary: MemorySSA keeps internal pointers of AA and DT. If these get invalidated, so should MemorySSA. Reviewers: george.burgess.iv, chandlerc Subscribers: jlebar, Prazek, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61043 llvm-svn: 359519
*	[PDB] Fix hash function used to write /src/headerblock	Nico Weber	2019-04-29	2	-1/+8
	lld-link used to write PDB files that DIA couldn't recover natvis files from if: - The global strings table was > 64kiB - There were at least 3 natvis files The cause was that the hash function for the /src/headerblock stream was incorrect: It needs to be truncated to 16 bit. If the global strings table was <= 64kiB, truncating to 16 bit is a no-op, so this wasn't needed for small programs. If there are only 1 or 2 natvis files, then the growth strategy in HashTable::grow() would mean the hash table would have 2 buckets (for 1 natvis file) or 4 buckets (for 2 natvis files), and since the hash function is used modulo number of buckets, and since 2 and 4 divide 0x10000, the missing `% 0x10000` is a no-op there too. For 3 natvis files, the hash table grows to 6 buckets, which has a factor that's not common with 0x10000 and the difference starts to matter. Fixes PR41626. Differential Revision: https://reviews.llvm.org/D61277 llvm-svn: 359515
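The bucket arithmetic in the PDB commit above can be checked with a short sketch. It uses plain integers as stand-in hash values (the real /src/headerblock hash function is not reproduced here) together with the bucket counts 2, 4 and 6 named in the message; the function names are hypothetical.

```python
def bucket_without_truncation(hash_value, num_buckets):
    """Bucket index when the hash is used as-is."""
    return hash_value % num_buckets


def bucket_with_truncation(hash_value, num_buckets):
    """Bucket index when the hash is first truncated to 16 bits."""
    return (hash_value % 0x10000) % num_buckets


def truncation_matters(num_buckets, hash_values):
    """True if any hash value lands in a different bucket once truncated."""
    return any(
        bucket_without_truncation(h, num_buckets)
        != bucket_with_truncation(h, num_buckets)
        for h in hash_values
    )


# Stand-in hash values, including some above 0xFFFF where truncation applies.
SAMPLE_HASHES = range(0x40000)

results = {n: truncation_matters(n, SAMPLE_HASHES) for n in (2, 4, 6)}
```

Because 2 and 4 divide 0x10000, `results[2]` and `results[4]` are false, while `results[6]` is true: for example, 0x10000 % 6 is 4 but (0x10000 % 0x10000) % 6 is 0.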
*	[ORC] Replace the LLJIT/LLLazyJIT Create methods with Builder utilities.	Lang Hames	2019-04-29	1	-105/+131
	LLJITBuilder and LLLazyJITBuilder construct LLJIT and LLLazyJIT instances respectively. Over time these will allow more configurable options to be added while remaining easy to use in the default case, which for default in-process JITing is now: auto J = ExitOnErr(LLJITBuilder.create()); llvm-svn: 359511
*	[WebAssembly] Make an assertion message prettier. NFC.	Dan Gohman	2019-04-29	1	-2/+2
	This is a follow-up to https://reviews.llvm.org/D59521. llvm-svn: 359509
*	[ThinLTO] Adding architecture name into saved object filename	Steven Wu	2019-04-29	1	-9/+10
	Summary: For ThinLTOCodegenerator, it has an option to save the object file outputs into a directory which is essential for debug info. Tools like lldb and dsymutil will look for these object files for debug info. On Darwin platform, you can link fat binaries with one single clang driver invocation like: $ clang -arch x86_64 -arch i386 -Wl,-object_path_lto,$TMPDIR ... Unfortunately, the output object files for one architecture are going to overwrite the previous ones and one architecture slice will end up with no debug info. One example for this is to turn on ThinLTO for sanitizer dylibs in compiler-rt project. To fix the issue, add the name for the architecture into the name of the output object file. rdar://problem/35482935 Reviewers: tejohnson, bd1976llvm, dexonsmith, JDevlieghere Reviewed By: dexonsmith Subscribers: mehdi_amini, aprantl, inglorion, eraman, hiraditya, jkorous, dang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60924 llvm-svn: 359508
*	[WebAssembly] Define the signature for __stack_chk_fail	Dan Gohman	2019-04-29	1	-1/+9
	The WebAssembly backend needs to know the signatures of all runtime libcall functions. This adds the signature for __stack_chk_fail which was previously missing. Also, make the error message for a missing libcall include the name of the function. Differential Revision: https://reviews.llvm.org/D59521 Reviewed By: sbc100 llvm-svn: 359505
*	[PowerPC] Try harder to avoid load/move-to VSR for partial vector loads	Roland Froese	2019-04-29	1	-15/+36
	Change the PPCISelLowering.cpp function that decides to avoid update form in favor of partial vector loads to know about newer load types and to not be confused by the chain operand. Differential Revision: https://reviews.llvm.org/D60102 llvm-svn: 359504