bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	[WebAssembly] Optimize BUILD_VECTOR lowering for size	Thomas Lively	2019-01-30	3	-112/+111
	Summary: Implements custom lowering logic that finds the optimal value for the initial splat of the vector and either uses it or uses v128.const if it is available and if it would produce smaller code. This logic replaces large TableGen ISEL patterns that would lower all non-splat BUILD_VECTORs into a splat followed by a fixed number of replace_lane instructions. This CL fixes PR39685.
	Reviewers: aheejin
	Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits
	Differential Revision: https://reviews.llvm.org/D56633
	llvm-svn: 352592
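The splat-selection idea in the summary above can be sketched in a few lines. This is only an illustration under stated assumptions, not the LLVM implementation: `plan_build_vector` is a hypothetical name, lanes are modeled as plain Python values, and the size comparison against v128.const is not modeled. The sketch picks the most frequent lane value as the splat so that the fewest lanes need a follow-up replace_lane.

```python
from collections import Counter

def plan_build_vector(lanes):
    """Return (splat_value, [(lane_index, value), ...]) for a list of lane values.

    Hypothetical helper: models only the choice of the initial splat value.
    """
    # Splat the most frequent lane value so the fewest lanes need patching.
    splat_value, _ = Counter(lanes).most_common(1)[0]
    # Every lane that differs from the splat needs one replace_lane.
    replacements = [(i, v) for i, v in enumerate(lanes) if v != splat_value]
    return splat_value, replacements

splat, fixes = plan_build_vector([7, 7, 3, 7])
print(splat, fixes)  # 7 [(2, 3)]
```

With a fixed "splat lane 0, then replace every other lane" pattern, the same vector would need three replace_lane operations instead of one.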
*	GlobalISel: Handle some odd splits in fewerElementsVector	Matt Arsenault	2019-01-30	1	-10/+55
	Also add some quick hacks to AMDGPU legality for the tests.
	llvm-svn: 352591
*	GlobalISel: Handle more cases for widenScalar for G_STORE	Matt Arsenault	2019-01-30	1	-3/+10
	llvm-svn: 352585
*	[PowerPC] more opportunity for converting reg+reg to reg+imm	Chen Zheng	2019-01-30	1	-2/+2
	Differential Revision: https://reviews.llvm.org/D57314
	llvm-svn: 352583
*	GlobalISel: Verify memory size for load/store	Matt Arsenault	2019-01-30	1	-4/+9
	llvm-svn: 352578
*	Remove a redundant space from an error message; NFC	George Burgess IV	2019-01-30	1	-1/+1
	llvm-svn: 352576
*	[WebAssembly] Add missing SymbolRef update from rL352551	Sam Clegg	2019-01-30	1	-2/+2
	This change broke some MC tests which are now fixed.
	Differential Revision: https://reviews.llvm.org/D57424
	llvm-svn: 352573
*	[WebAssembly] Lower SCALAR_TO_VECTOR to splats	Thomas Lively	2019-01-29	1	-0/+13
	Reviewers: aheejin
	Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish
	Differential Revision: https://reviews.llvm.org/D57269
	llvm-svn: 352568
*	GlobalISel: Fix unused variable warning in release builds	Matt Arsenault	2019-01-29	1	-2/+1
	llvm-svn: 352565
*	[IR] Use CallBase to reduce code duplication. NFC	Craig Topper	2019-01-29	1	-4/+2
	Noticed in the asm-goto patch. Callbr needs to go here too. One cast and call is better than 3.
	Differential Revision: https://reviews.llvm.org/D57295
	llvm-svn: 352563
*	GlobalISel: Verify pointer casts	Matt Arsenault	2019-01-29	1	-0/+44
	Not sure if the old AArch64 tests should be just deleted or not.
	llvm-svn: 352562
*	GlobalISel: Partially implement widenScalar for MERGE_VALUES	Matt Arsenault	2019-01-29	2	-7/+48
	llvm-svn: 352560
*	Check bool attribute value in getOptionalBoolLoopAttribute.	Alina Sbirlea	2019-01-29	1	-1/+4
	Summary: Check the bool value of the attribute in getOptionalBoolLoopAttribute, not just its existence. Eliminates the warning noise generated when vectorization is explicitly disabled.
	Reviewers: Meinersbur, hfinkel, dmgreen
	Subscribers: jlebar, sanjoy, llvm-commits
	Differential Revision: https://reviews.llvm.org/D57260
	llvm-svn: 352555
*	[WebAssembly] Ensure BasicSymbolRef.getRawDataRefImpl().p is non-null	Sam Clegg	2019-01-29	1	-4/+6
	Store a non-zero value to ref.d.a and use ref.d.b to store the symbol index. This means that ref.p is never null, which was confusing llvm-nm.
	Fixes PR40497
	Differential Revision: https://reviews.llvm.org/D57373
	llvm-svn: 352551
*	[AArch64][GlobalISel] Unmerge into scalars from a vector should use FPR bank.	Amara Emerson	2019-01-29	1	-1/+5
	This currently shows up as a selection fallback since the dest regs were given GPR banks but the source was a vector FPR reg.
	Differential Revision: https://reviews.llvm.org/D57408
	llvm-svn: 352545
*	[DWARF] Emit reasonable debug info for empty .s files.	Paul Robinson	2019-01-29	1	-0/+3
	llvm-svn: 352541
*	[InstCombine] canonicalize cmp/select form of uadd saturate with constant	Sanjay Patel	2019-01-29	1	-0/+20
	I'm circling back around to a loose end from D51929. The backend (either CGP or DAG) doesn't recognize this pattern, so we end up with different asm for these IR variants. Regardless of any future changes to canonicalize to saturation/overflow intrinsics, we want to get raw IR variations into the minimal number of raw IR forms. If/when we can canonicalize to intrinsics, that will make that step easier.
	  Pre: C2 == ~C1
	  %a = add i32 %x, C1
	  %c = icmp ugt i32 %x, C2
	  %r = select i1 %c, i32 -1, i32 %a
	  =>
	  %a = add i32 %x, C1
	  %c2 = icmp ult i32 %x, C2
	  %r = select i1 %c2, i32 %a, i32 -1
	https://rise4fun.com/Alive/pkH
	Differential Revision: https://reviews.llvm.org/D57352
	llvm-svn: 352536
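The equivalence of the two select forms under the precondition C2 == ~C1 can be checked by brute force. The sketch below is an assumption-laden illustration, not part of the patch: it uses 8-bit values instead of i32 to keep the search small, and the function names are hypothetical. The boundary case x == C2 is where `ugt` and `ult` disagree, but there the add already yields all ones, so both forms return -1.

```python
BITS = 8                  # assumption: 8-bit stand-in for i32
MASK = (1 << BITS) - 1    # all-ones value, i.e. -1 as an unsigned pattern

def select_ugt_form(x, c1):
    c2 = ~c1 & MASK                  # Pre: C2 == ~C1
    a = (x + c1) & MASK              # %a = add %x, C1 (wrapping)
    return MASK if x > c2 else a     # select (icmp ugt %x, C2), -1, %a

def select_ult_form(x, c1):
    c2 = ~c1 & MASK
    a = (x + c1) & MASK
    return a if x < c2 else MASK     # select (icmp ult %x, C2), %a, -1

def saturating_add(x, c1):
    return min(x + c1, MASK)         # unsigned add clamped at all ones

# Exhaustively compare both forms with each other and with a saturating add.
assert all(
    select_ugt_form(x, c1) == select_ult_form(x, c1) == saturating_add(x, c1)
    for x in range(MASK + 1)
    for c1 in range(MASK + 1)
)
```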
*	[DAGCombiner] fold extract_subvector of extract_subvector	Sanjay Patel	2019-01-29	1	-0/+13
	This is the sibling fold for insert-of-insert that was added with D56604. Now that we have x86 shuffle narrowing (D57156), this change shows improvements for lots of AVX512 reduction code (not sure that we would ever expect extract-of-extract otherwise). There's a small regression in some of the partial-permute tests (extracting followed by splat). That is tracked by PR40500: https://bugs.llvm.org/show_bug.cgi?id=40500
	Differential Revision: https://reviews.llvm.org/D57336
	llvm-svn: 352528
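The idea behind folding an extract of an extract can be illustrated with plain lists: taking a slice of a slice is the same as taking one slice of the original at the summed offset. This is a conceptual sketch only; `extract_subvector` here is a hypothetical Python helper, and the type and legality checks the real DAG combine needs are not modeled.

```python
def extract_subvector(vec, index, length):
    """Return `length` contiguous elements of `vec` starting at `index`."""
    assert 0 <= index and index + length <= len(vec)
    return vec[index:index + length]

v = list(range(16))

# Nested form: extract the upper half, then the upper half of that.
inner = extract_subvector(v, 8, 8)
nested = extract_subvector(inner, 4, 4)

# Folded form: a single extract with the two indices added together.
folded = extract_subvector(v, 8 + 4, 4)

assert nested == folded
print(folded)  # [12, 13, 14, 15]
```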
*	GlobalISel: Fix narrowScalar for load/store with different mem size	Matt Arsenault	2019-01-29	2	-4/+49
	This was ignoring the memory size, and producing multiple loads/stores if the operand size was different from the memory size. I assume this is the intent of not having an explicit G_ANYEXTLOAD (although I think that would probably be better).
	llvm-svn: 352523
*	[X86][Btver2] Improved latency/throughput model for scalar int-to-float conversions.	Andrea Di Biagio	2019-01-29	3	-14/+16
	Account for bypass delays when computing the latency of scalar int-to-float conversions. On Jaguar we need to account for an extra 6cy latency (see AMD fam16h SOG). This patch also fixes the number of micro-opcodes for the register-memory variants of scalar int-to-float conversions.
	Differential Revision: https://reviews.llvm.org/D57148
	llvm-svn: 352518
*	Adjust documentation for git migration.	James Y Knight	2019-01-29	2	-9/+9
	This fixes most references to the paths:
	  llvm.org/svn/
	  llvm.org/git/
	  llvm.org/viewvc/
	  github.com/llvm-mirror/
	  github.com/llvm-project/
	  reviews.llvm.org/diffusion/
	to instead point to https://github.com/llvm/llvm-project.
	This is not a trivial substitution, because additionally, all the checkout instructions had to be migrated to instruct users on how to use the monorepo layout, setting LLVM_ENABLE_PROJECTS instead of checking out various projects into various subdirectories.
	I've attempted to not change any scripts here, only documentation. The scripts will have to be addressed separately.
	Additionally, I've deleted one document which appeared to be outdated and unneeded: lldb/docs/building-with-debug-llvm.txt
	Differential Revision: https://reviews.llvm.org/D57330
	llvm-svn: 352514
*	[SelectionDAGBuilder] Remove redundant variable. NFCI.	Nirav Dave	2019-01-29	1	-7/+3
	llvm-svn: 352506
*	Reversing the checkin for version 352484 as tests are failing.	Ayonam Ray	2019-01-29	2	-28/+52
	llvm-svn: 352504
*	[AMDGPU] Fix a weird WWM intrinsic issue.	Neil Henning	2019-01-29	2	-17/+17
	I found a really strange WWM issue through a very convoluted shader that essentially boils down to a bug in SIInstrInfo where canReadVGPR did not correctly identify that WWM is like a copy and can have a VGPR as its source.
	Differential Revision: https://reviews.llvm.org/D56002
	llvm-svn: 352500
*	[CodeGen] Omit range checks from jump tables when lowering switches with unreachable default	Ayonam Ray	2019-01-29	2	-52/+28
	When a switch is lowered to a jump table, a range check is performed before indexing into the table: if the switch value is outside the jump table range, a conditional branch jumps to the default block. If the default block is unreachable, this conditional branch can be omitted. This patch omits the conditional branch for unreachable defaults.
	Review ID: D52002
	Reviewers: Hans Wennborg, Eli Friedman, Roman Lebedev
	llvm-svn: 352484
*	[WebAssembly] Re-enable main-function signature rewriting	Dan Gohman	2019-01-29	1	-12/+24
	Re-enable the code to rewrite main-function signatures into "int main(int argc, char *argv[])", but limited to only handling the case of "int main(void)", so that it doesn't silently strip an argument in the "int main(int argc, char *argv[], char *envp[])" case. This allows main to be called by C startup code, since WebAssembly requires caller and callee signatures to match, so it can't rely on passing main a different number of arguments than it expects.
	Differential Revision: https://reviews.llvm.org/D57323
	llvm-svn: 352479
*	[ARM] Use sub for negative offset load/store in thumb1	David Green	2019-01-29	2	-6/+47
	This attempts to optimise negative values used in load/store operands a little. We currently try to select them as rr, materialising the negative constant using a MOV/MVN pair. This instead selects ri with an immediate of 0, forcing the add node to become a simpler sub.
	Differential Revision: https://reviews.llvm.org/D57121
	llvm-svn: 352475
*	[IPCP] Don't crash due to arg count/type mismatch between caller/callee	Bjorn Pettersson	2019-01-29	1	-0/+12
	Summary: This patch avoids an assert in IPConstantPropagation when there is an argument count/type mismatch between the caller and the callee. While this is actually UB on C-level (clang emits a warning), the IR verifier seems to accept it. I'm not sure what other frontends/languages might think about this, so simply bailing out to avoid hitting an assert (in CallSiteBase<>::getArgOperand or Value::doRAUW) seems like a simple solution. The problem is exposed by the fact that AbstractCallSites will look through a bitcast at the callee position of a call/invoke.
	Reviewers: jdoerfert, reames, efriedma
	Reviewed By: jdoerfert, efriedma
	Subscribers: eli.friedman, efriedma, llvm-commits
	Differential Revision: https://reviews.llvm.org/D57052
	llvm-svn: 352469
*	[DebugInfo][DAG] Process FrameIndex dbg.values unconditionally	Jeremy Morse	2019-01-29	1	-0/+15
	A FrameIndex should be valid throughout a block regardless of what instructions get selected in that block -- therefore we shouldn't harness dbg.values that refer to FrameIndexes to an SDNode. There are numerous codegen reasons why an SDNode never appears or doesn't become a location that a DBG_VALUE can refer to. None of them actually affect the variable location.
	Therefore, before any other tests to encode dbg_values in a SelectionDAG, identify FrameIndex operands and encode them unattached to any SDNode.
	Differential Revision: https://reviews.llvm.org/D57328
	llvm-svn: 352467
*	[NFC] Use ArrayRef instead of SmallVectorImpl where possible	Max Kazantsev	2019-01-29	1	-6/+6
	llvm-svn: 352466
*	[COFF, ARM64] Don't put jump table into a separate COFF section for EK_LabelDifference32	Martin Storsjo	2019-01-29	2	-4/+13
	Windows ARM64 uses the PIC relocation model and jump table kind EK_LabelDifference32. This produces a jump table entry such as ".word LBB123 - LJTI1_2", which represents the distance between the block and the jump table. A new relocation type (IMAGE_REL_ARM64_REL32) is needed to do the fixup correctly if they are in different COFF sections. This change saves the jump table to the same COFF section as the associated code. An ideal fix could be utilizing the IMAGE_REL_ARM64_REL32 relocation type.
	Patch by Tom Tan!
	Differential Revision: https://reviews.llvm.org/D57277
	llvm-svn: 352465
*	[CodeGenPrepare] Handle all debug calls in dupRetToEnableTailCallOpts()	Jonas Paulsson	2019-01-29	1	-4/+2
	This patch makes sure that a debug value that is after the bitcast in dupRetToEnableTailCallOpts() is also skipped. The reduced test case is from SPEC-2006 on SystemZ.
	Review: Vedant Kumar, Wolfgang Pieb
	https://reviews.llvm.org/D57050
	llvm-svn: 352462
*	Fix compiler warning when using clang 3.6.0	Mikael Holmen	2019-01-29	1	-1/+1
	Without the fix we get the following (with -Werror):
	../lib/Target/X86/X86ISelLowering.cpp:14181:58: error: suggest braces around initialization of subobject [-Werror,-Wmissing-braces]
	  SmallVector<std::array<int, 2>, 2> LaneSrcs(NumLanes, {-1, -1});
	                                                         ^~~~~~
	                                                         {     }
	1 error generated.
	llvm-svn: 352455
*	[SCEV] Take correct loop in AddRec simplification. PR40420	Max Kazantsev	2019-01-29	1	-1/+1
	The code of AddRec simplification is using the wrong loop when it creates a new AddRecExpr. It should be using AddRecLoop which we have saved and against which all gate checks are made, and not calling AddRec->getLoop() over and over again because AddRec may change and become an AddRecurrency from an outer loop during the transform iterations.
	Considering this change trivial, committing for postcommit review.
	llvm-svn: 352451
*	[WebAssembly] Handle more types of uses in WebAssemblyAddMissingPrototypes	Sam Clegg	2019-01-29	1	-37/+10
	Previously we were only handling bitcast operations, however prototypeless functions can also appear in other places such as comparisons and as function params. Switch to using replaceAllUsesWith() to replace the prototype-less function uses. This new approach results in some redundant bitcasting but is much simpler and handles all cases.
	Differential Revision: https://reviews.llvm.org/D56938
	llvm-svn: 352445
*	[PPC] Include tablegenerated PPCGenCallingConv.inc once	Reid Kleckner	2019-01-29	7	-147/+137
	Move the CC analysis implementation to its own .cpp file instead of duplicating it and artificially using functions in PPCISelLowering.cpp and PPCFastISel.cpp. Follow-up to the same change done for X86, ARM, and AArch64.
	llvm-svn: 352444
*	[WebAssembly] Expand BUILD_PAIR nodes	Thomas Lively	2019-01-28	1	-0/+3
	Reviewers: aheejin
	Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish
	Differential Revision: https://reviews.llvm.org/D57276
	llvm-svn: 352442
*	[ThinLTO] Add option to dump per-module summary dot graph	Teresa Johnson	2019-01-28	1	-0/+14
	Summary: I found that there currently isn't a way to invoke exportToDot from the command line for a per-module summary index, and therefore no testing of that case. Add an internal option and use it to test dumping of per module summary indexes. In particular, I am looking at fixing the limitation that causes the aliasee GUID in the per-module summary to be 0, and want to be able to test that change.
	Reviewers: evgeny777
	Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits
	Differential Revision: https://reviews.llvm.org/D57206
	llvm-svn: 352441
*	Demanded elements support for vector GEPs	Philip Reames	2019-01-28	1	-0/+12
	GEPs can produce either scalar or vector results. If we're extracting only a subset of the vector lanes, simplifying the operands is helpful in eliminating redundant computation, and (eventually) allowing further optimizations.
	Differential Revision: https://reviews.llvm.org/D57177
	llvm-svn: 352440
*	[ThinLTO] Refine reachability check to fix compile time increase	Teresa Johnson	2019-01-28	1	-8/+17
	Summary: A recent fix to the ThinLTO whole program dead code elimination (D56117) increased the thin link time on a large MSAN'ed binary by 2x. It's likely that the time increased elsewhere, but was more noticeable here since it was already large and ended up timing out.
	That change made it so we would repeatedly scan all copies of linkonce symbols for liveness every time they were encountered during the graph traversal. This was needed since we only mark one copy of an aliasee as live when we encounter a live alias. This patch fixes the issue in a more efficient manner by simply proactively visiting the aliasee (thus marking all copies live) when we encounter a live alias.
	Two notes: One, this requires a hash table lookup (finding the aliasee summary in the index based on aliasee GUID). However, the impact of this seems to be small compared to the original pre-D56117 thin link time. It could be addressed if we keep the aliasee ValueInfo in the alias summary instead of the aliasee GUID, which I am exploring in a separate patch.
	Second, we only populate the aliasee GUID field when reading summaries from bitcode (whether we are reading individual summaries and merging on the fly to form the compiled index, or reading in a serialized combined index). Thankfully, that's currently the only way we can get to this code as we don't yet support reading summaries from LLVM assembly directly into a tool that performs the thin link (they must be converted to bitcode first). I added a FIXME, however I have the fix under test already. The easiest fix is to simply populate this field always, which isn't hard, but more likely the change I am exploring to store the ValueInfo instead as described above will subsume this. I don't want to hold up the regression fix for this though.
	Reviewers: trentxintong
	Subscribers: mehdi_amini, inglorion, dexonsmith, llvm-commits
	Differential Revision: https://reviews.llvm.org/D57203
	llvm-svn: 352438
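The traversal strategy described in the summary above can be sketched as a worklist over a toy index. Everything here is an assumption made for illustration: the `Summary` class, its `refs` and `aliasee_guid` fields, and `compute_live` are hypothetical stand-ins, not the ThinLTO data structures. The point shown is that marking happens once per GUID (covering all copies), and that a live alias triggers an index lookup by aliasee GUID so the aliasee is visited proactively.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Summary:
    """One copy of a symbol's summary (hypothetical, simplified)."""
    refs: List[int] = field(default_factory=list)  # GUIDs this copy references
    aliasee_guid: Optional[int] = None             # set only for aliases
    live: bool = False

def compute_live(index, roots):
    """Mark everything reachable from `roots`; `index` maps GUID -> copies."""
    live = set()
    worklist = []

    def visit(guid):
        if guid in live or guid not in index:
            return                      # already handled, or unknown symbol
        live.add(guid)
        for copy in index[guid]:        # mark ALL copies live in one go
            copy.live = True
        worklist.append(guid)

    for root in roots:
        visit(root)
    while worklist:
        guid = worklist.pop()
        for copy in index[guid]:
            if copy.aliasee_guid is not None:
                visit(copy.aliasee_guid)    # hash lookup by aliasee GUID
            for ref in copy.refs:
                visit(ref)
    return live

index = {
    1: [Summary(aliasee_guid=2)],           # live alias
    2: [Summary(refs=[3]), Summary()],      # aliasee with two copies
    3: [Summary()],
    4: [Summary()],                         # unreachable
}
print(sorted(compute_live(index, [1])))  # [1, 2, 3]
```

Because each GUID enters the worklist at most once, copies are not rescanned every time the symbol is encountered again during the traversal.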
*	Recommit r352255 "[SelectionDAG][X86] Don't use SEXTLOAD for promoting masked loads in the type legalizer"	Craig Topper	2019-01-28	2	-4/+16
	This did not cause the buildbot failure it was previously reverted for.
	Original commit message:
	I'm not sure why we were using SEXTLOAD. EXTLOAD seems more appropriate since we don't care about the upper bits. This patch changes this and then modifies the X86 post legalization combine to emit an extending shuffle instead of a sign_extend_vector_inreg. Could maybe use an any_extend_vector_inre On AVX512 targets I think we might be able to use a masked vpmovzx and not have to expand this at all.
	llvm-svn: 352433
*	[RuntimeDyld] load all sections with ProcessAllSections	Yonghong Song	2019-01-28	1	-0/+19
	This patch tried to address the following use case.
	  . bcc (https://github.com/iovisor/bcc) utilizes llvm JIT to compile for BPF target.
	  . with -g, .BTF and .BTF.ext sections (BPF debug info) will be generated by LLVM.
	  . .BTF does not have relocations and .BTF.ext has some relocations.
	  . With ProcessAllSections, .BTF.ext is loaded by JIT dynamic linker and is available to application. But .BTF is not loaded.
	The bcc application needs both .BTF.ext and .BTF for debugging purpose, and .BTF is not loaded. This patch addressed this issue by iterating over all sections and loading any missing sections, after symbol/relocation processing in loadObjectImpl().
	Signed-off-by: Yonghong Song <yhs@fb.com>
	Differential Revision: https://reviews.llvm.org/D55943
	llvm-svn: 352432
*	[ARM] Deduplicate table generated CC analysis code	Reid Kleckner	2019-01-28	6	-275/+324
	Create ARMCallingConv.cpp and emit code for calling convention analysis from there.
	llvm-svn: 352431
*	[AArch64] Include AArch64GenCallingConv.inc once	Reid Kleckner	2019-01-28	6	-129/+171
	Summary: Avoids duplicating generated static helpers for calling convention analysis. This also means you can modify AArch64CallingConv.td without recompiling the AArch64ISelLowering.cpp monolith, so it provides faster incremental rebuilds. Saves 12K in llc.exe, but adds a new object file, which is large.
	Reviewers: efriedma, t.p.northover
	Subscribers: mgorny, javed.absar, kristof.beyls, hiraditya, llvm-commits
	Differential Revision: https://reviews.llvm.org/D56948
	llvm-svn: 352430
*	[GlobalISel][AArch64] Add legalization for G_FLOG	Jessica Paquette	2019-01-28	3	-2/+9
	This adds support for legalizing G_FLOG into a RTLib call. It adds a legalizer test, and updates the existing floating point tests.
	https://reviews.llvm.org/D57347
	llvm-svn: 352429
*	AMDGPU: Add DS append/consume intrinsics	Matt Arsenault	2019-01-28	4	-17/+94
	Since these pass the pointer in m0 unlike other DS instructions, these need to worry about whether the address is uniform or not. This assumes the address is dynamically uniform, and just uses readfirstlane to get a copy into an SGPR.
	I don't know if these have the same 16-bit add for the addressing mode offset problem on SI or not, but I've just assumed they do.
	Also includes some misc. changes to avoid test differences between the LDS and GDS versions.
	llvm-svn: 352422
*	[GlobalISel][AArch64] Add instruction selection support for @llvm.log10	Jessica Paquette	2019-01-28	3	-2/+9
	This adds instruction selection support for @llvm.log10 in AArch64. It teaches GISel to lower it to a library call, updates the relevant tests, and adds a legalizer test for log10.
	https://reviews.llvm.org/D57341
	llvm-svn: 352418
*	[AliasSetTracker] Cleanup more comments. [NFCI]	Alina Sbirlea	2019-01-28	1	-4/+6
	llvm-svn: 352416
*	[MC] Do not consider .ifdef/.ifndef as a use	Scott Linder	2019-01-28	1	-2/+2
	This is allowed by GAS and seems correct.
	Differential Revision: https://reviews.llvm.org/D55439
	llvm-svn: 352414
*	[AArch64] Add 'apple-latest' CPU alias	Francis Visoiu Mistrih	2019-01-28	1	-0/+3
	The 'apple-latest' alias is supposed to provide a CPU that contains the latest Apple processor model supported by LLVM. This is supposed to be used by tools like lldb to provide a target that supports most of the CPU features. For now, this is mapped to Cyclone.
	Differential Revision: https://reviews.llvm.org/D56384
	llvm-svn: 352412