bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	[X86] Give AVX512VL instructions priority over their AVX equivalents.	Craig Topper	2017-11-02	1	-2/+2
\| \| \| \| \| \|	I thought we had gotten all these priority bugs worked out, but I guess not. llvm-svn: 317283
*	IndVarSimplify: preserve debug information attached to widened PHI nodes.	Adrian Prantl	2017-11-02	1	-0/+10
\| \| \| \| \| \| \| \| \| \|	This fixes PR35015. https://bugs.llvm.org/show_bug.cgi?id=35015 Differential Revision: https://reviews.llvm.org/D39345 llvm-svn: 317282
*	AMDGPU: Fix warning discovered by r317266 [-Wunused-private-field]	Konstantin Zhuravlyov	2017-11-02	1	-1/+0
\| \| \| \|	llvm-svn: 317280
*	Irreducible loop metadata for more accurate block frequency under PGO.	Hiroshi Yamauchi	2017-11-02	8	-2/+87
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Currently the block frequency analysis is an approximation for irreducible loops. The new irreducible loop metadata is used to annotate the irreducible loop headers with their header weights based on the PGO profile (currently this is approximated to be evenly weighted) and to help improve the accuracy of the block frequency analysis for irreducible loops. This patch adds basic support for this. Reviewers: davidxl Reviewed By: davidxl Subscribers: mehdi_amini, llvm-commits, eraman Differential Revision: https://reviews.llvm.org/D39028 llvm-svn: 317278
*	[Hexagon] Prefer L2_loadrub_io over L4_loadrub_rr	Krzysztof Parzyszek	2017-11-02	1	-52/+82
\| \| \| \| \| \| \|	If the offset is an immediate, avoid putting it in a register to get Rs+Rt<<#0. llvm-svn: 317275
*	[LoopPredication] Enable predication when latchCheckIV is wider than rangeCheck	Anna Thomas	2017-11-02	1	-10/+96
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch allows us to predicate range checks that have a type narrower than the latch check type. We leverage SCEV analysis to identify a truncate for the latchLimit and latchStart. There are also safety checks in place which require the start and limit to be known at compile time. We require this to make sure that the SCEV truncate expr for the IV corresponding to the latch does not cause us to lose information about the IV range. Added tests show the loop predication over range checks that are of various types and are narrower than the latch type. This enhancement has been in our downstream tree for a while. Reviewers: apilipenko, sanjoy, mkazantsev Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39500 llvm-svn: 317269
*	AMDGPU: Remove outdated fixme (it was already fixed)	Konstantin Zhuravlyov	2017-11-02	1	-3/+0
\| \| \| \|	llvm-svn: 317266
*	[X86] Simplify the pentium4 code in getHostCPUName to be based on feature ↵	Craig Topper	2017-11-02	1	-34/+6
\| \| \| \| \| \| \| \|	flags. Don't use 'x86-64' ever. 'x86-64' has started to reflect a sort of generic tuning flag for more modern 64-bit CPUs. We probably shouldn't be using it as the name of an unidentifiable pentium4. So use nocona for all 64-bit pentium4s instead. llvm-svn: 317230
*	[X86] Change getHostCPUName fallback code to not select 'x86-64' for unknown ↵	Craig Topper	2017-11-02	1	-2/+7
\| \| \| \| \| \| \| \|	CPUs in family 6 that have 64-bit support but not any newer SSE features. Use 'core2' instead. We know that's the earliest CPU with 64-bit support. x86-64 has taken on a role of representing a more modern 64-bit CPU so we probably shouldn't be using that when we can't identify things. llvm-svn: 317229
*	Strip off invariant.start because memory locations aren't invariant	Anna Thomas	2017-11-02	1	-9/+33
\| \| \| \| \| \| \| \| \| \| \| \|	The original change was reverted in rL317217 because of the failure in the RS4GC testcase. I couldn't reproduce the failure on my local machine (macbook) but could reproduce it on a linux box. The failure was around removing the uses of invariant.start. The fix here is to just RAUW undef (which was the first implementation in D39388). This is perfectly valid IR as discussed in the review. llvm-svn: 317225
*	Revert "[RS4GC] Strip off invariant.start because memory locations aren't ↵	Anna Thomas	2017-11-02	1	-39/+9
\| \| \| \| \| \| \| \|	invariant" This reverts commit r317215, investigating the test failure. llvm-svn: 317217
*	[RS4GC] Strip off invariant.start because memory locations aren't invariant	Anna Thomas	2017-11-02	1	-9/+39
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Invariant.start on memory locations has the property that the memory location is unchanging. However, this is not true in the face of rewriting statepoints for GC. Teach RS4GC about removing invariant.start so that optimizations after RS4GC do not incorrectly sink a load from the memory location past a statepoint. Added test showcasing the issue. Reviewers: reames, apilipenko, dneilson Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39388 llvm-svn: 317215
*	Revert "[ExpandMemCmp] Split ExpandMemCmp from CodeGen into its own pass."	Clement Courbet	2017-11-02	5	-838/+712
\| \| \| \| \| \| \| \| \|	undefined reference to `llvm::TargetPassConfig::ID' on clang-ppc64le-linux-multistage This reverts commit eea333c33fa73ad225ef28607795984829f65688. llvm-svn: 317213
*	[ExpandMemCmp] Split ExpandMemCmp from CodeGen into its own pass.	Clement Courbet	2017-11-02	5	-712/+838
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is mostly a noop (most of the test diffs are renamed blocks). There are a few temporary register renames (eax<->ecx) and a few blocks are shuffled around. See the discussion in PR33325 for more details. Reviewers: spatel Subscribers: mgorny Differential Revision: https://reviews.llvm.org/D39456 llvm-svn: 317211
*	[X86] Fix bug in legalize vector types - Split large loads	Ayman Musa	2017-11-02	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	When splitting a large load to smaller legally-typed loads, the last load should be padded to reach the size of the previous one so a CONCAT_VECTORS node could reunite them again. The code currently pads the last load to reach the size of the first load (instead of the previous). Differential Revision: https://reviews.llvm.org/D38495 Change-Id: Ib60b55ed26ce901fabf68108daf52683fbd5013f llvm-svn: 317206
*	[mips] Use register scavenging with MSA.	Simon Dardis	2017-11-02	2	-24/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	MSA stores and loads to the stack are more likely to require an emergency GPR spill slot due to the smaller offsets available with those instructions. Handle this by overestimating the size of the stack: determine the largest offset presuming that all callee save registers are spilled, and account for incoming arguments when determining whether an emergency spill slot is required. Reviewers: atanasyan Differential Revision: https://reviews.llvm.org/D39056 llvm-svn: 317204
*	Temporary workaround for msan false positive.	Sam McCall	2017-11-02	1	-1/+1
\| \| \| \|	llvm-svn: 317203
*	Allow inaccessiblememonly and inaccessiblemem_or_argmemonly to be overwritten ↵	Yichao Yu	2017-11-02	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	on call site with operand bundle Summary: Similar to argmemonly, readonly and readnone. Fix PR35128 Reviewers: andrew.w.kaylor, chandlerc, hfinkel Reviewed By: hfinkel Subscribers: hfinkel, llvm-commits Differential Revision: https://reviews.llvm.org/D39434 llvm-svn: 317201
*	[AsmPrinterDwarf] Add support for .cfi_restore directive	Francis Visoiu Mistrih	2017-11-02	6	-0/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	As of today we only use .cfi_offset to specify the offset of a CSR, but we never use .cfi_restore when the CSR is restored. If we want to perform a more advanced type of shrink-wrapping, we need to use .cfi_restore in order to switch the CFI state between blocks. This patch only aims at adding support for the directive. Differential Revision: https://reviews.llvm.org/D36114 llvm-svn: 317199
*	[SimplifyCFG] Discard speculated dbg intrinsics	Bjorn Pettersson	2017-11-02	1	-1/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: SpeculativelyExecuteBB can flatten the CFG by doing speculative execution followed by a select instruction. When the speculatively executed BB contained dbg intrinsics the result could be a little bit weird, since those dbg intrinsics were inserted before the select in the flattened CFG. So when single stepping in the debugger, printing the value of the variable referenced in the dbg intrinsic, it could happen that it looked like the variable had values that never actually were assigned to the variable. This patch simply discards all dbg intrinsics that were found in the speculatively executed BB. Reviewers: aprantl, chandlerc, craig.topper Reviewed By: aprantl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39494 llvm-svn: 317198
*	[ARM] and, or, xor and add with shl combine	Sam Parker	2017-11-02	1	-7/+120
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The generic dag combiner will fold: (shl (add x, c1), c2) -> (add (shl x, c2), c1 << c2) (shl (or x, c1), c2) -> (or (shl x, c2), c1 << c2) This can create constants which are too large to use as an immediate. Many ALU operations are also capable of performing the shl, so we can unfold the transformation to prevent a mov imm instruction from being generated. Other patterns, such as b + ((a << 1) \| 510), can also be simplified in the same manner. Differential Revision: https://reviews.llvm.org/D38084 llvm-svn: 317197
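The fold and its reversal described in this message rest on shifts distributing over add and or in fixed-width arithmetic. The following is a minimal sketch that models both forms on 32-bit values; the helper names and the 32-bit width are assumptions made for illustration and are not taken from the patch.

```python
MASK32 = 0xFFFFFFFF  # assumption: model 32-bit registers


def shl(x, n):
    """Logical shift left, truncated to 32 bits."""
    return (x << n) & MASK32


def add(x, y):
    """Wrapping 32-bit addition."""
    return (x + y) & MASK32


def folded_add(x, c1, c2):
    # Form produced by the generic combiner: (add (shl x, c2), c1 << c2)
    return add(shl(x, c2), shl(c1, c2))


def unfolded_add(x, c1, c2):
    # Original form, with the small constant c1: (shl (add x, c1), c2)
    return shl(add(x, c1), c2)


def folded_or(x, c1, c2):
    # (or (shl x, c2), c1 << c2)
    return shl(x, c2) | shl(c1, c2)


def unfolded_or(x, c1, c2):
    # (shl (or x, c1), c2)
    return shl(x | c1, c2)


def example_folded(a, b):
    # b + ((a << 1) | 510), as written in the commit message
    return add(b, shl(a, 1) | 510)


def example_unfolded(a, b):
    # 510 == 255 << 1, so the constant can shrink to 255 before the shift
    return add(b, shl(a | 255, 1))
```

Both forms compute the same value; the unfolded one simply keeps the smaller constant next to an operation that can absorb the shift.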
*	The patch updates sched numbers for YMM AVX instrs such as VMOVx, VORx, ↵	Andrew V. Tischenko	2017-11-02	1	-0/+93
\| \| \| \| \| \| \| \| \|	VXOR, VPERMILx, VBROADCASTx, etc. PR32857 should be closed. Differential Revision: https://reviews.llvm.org/D39227 llvm-svn: 317196
*	[X86] Remove the model checks from the 486 detection code in Host.cpp	Craig Topper	2017-11-02	1	-14/+1
\| \| \| \| \| \|	This just provided a bunch of comments to read and not much else. llvm-svn: 317185
*	[X86] Simplify the detection of pentium-mmx in Host.cpp.	Craig Topper	2017-11-02	1	-21/+6
\| \| \| \| \| \|	Rather than looking at model numbers just check for the mmx feature flag. While there promote INTEL_PENTIUM_MMX to a CPU type instead of a subtype so that we don't have a weird type with only one subtype. llvm-svn: 317184
*	[yaml2obj][ELF] Add support for setting alignment in program headers	Jake Ehrlich	2017-11-01	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Sometimes program headers have larger alignments than any of the sections they contain. Currently yaml2obj can't produce such files. A bug recently appeared in llvm-objcopy that failed in such a case. I'd like to be able to add tests to llvm-objcopy for such cases. This change adds an optional alignment parameter to program headers that will be used instead of calculating the alignment. Differential Revision: https://reviews.llvm.org/D39130 llvm-svn: 317139
*	loop-unroll: teach remapInstruction to update dbg.value intrinsics.	Adrian Prantl	2017-11-01	1	-1/+15
\| \| \| \| \| \| \| \|	Fixes PR35112. https://bugs.llvm.org/show_bug.cgi?id=35112 llvm-svn: 317138
*	Revert "Correct dwarf unwind information in function epilogue for X86"	Petar Jovanovic	2017-11-01	10	-474/+14
\| \| \| \| \| \| \|	This reverts r317100 as it introduced sanitizer-x86_64-linux-autoconf buildbot failure (build #15606). llvm-svn: 317136
*	[LLVM-C] Expose functions to create debug locations via DIBuilder.	whitequark	2017-11-01	1	-1/+81
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	These include: * Several functions for creating an LLVMDIBuilder, * LLVMDIBuilderCreateCompileUnit, * LLVMDIBuilderCreateFile, * LLVMDIBuilderCreateDebugLocation. Patch by Harlan Haskins. Differential Revision: https://reviews.llvm.org/D32368 llvm-svn: 317135
*	[X86] Use foreach in X86.td to combine some of the CPU names that are ↵	Craig Topper	2017-11-01	1	-52/+40
\| \| \| \| \| \|	obviously aliases. NFC llvm-svn: 317134
*	[X86] Add CMOV feature to 'i686' processor, making it a proper alias of ↵	Craig Topper	2017-11-01	1	-1/+1
\| \| \| \| \| \| \| \|	pentiumpro which I believe it should be. This is consistent with current gcc behavior. llvm-svn: 317133
*	[globalisel][regbank] Warn about MIR ambiguities when register bank/class ↵	Daniel Sanders	2017-11-01	1	-0/+4
\| \| \| \| \| \|	names clash. llvm-svn: 317132
*	[X86][SSE] Add PACKUS support to LowerTruncate	Simon Pilgrim	2017-11-01	1	-12/+26
\| \| \| \| \| \| \| \|	Similar to the existing code to lower to PACKSS, we can use PACKUS if the input vector's leading zero bits extend all the way to the packed/truncated value. We have to account for pre-SSE41 targets not supporting PACKUSDW llvm-svn: 317128
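The condition in this message can be illustrated with a one-lane model: an unsigned-saturating pack treats its input as signed and clamps it to the unsigned destination range, so when every bit above the destination width is already zero the clamp never fires and the pack equals a plain truncate. This is a sketch only; the function names and the 16-to-8-bit default are hypothetical and do not come from the patch.

```python
def packus_lane(value, src_bits=16, dst_bits=8):
    """Model one lane of an unsigned-saturating pack.

    The input is read as a signed src_bits integer and clamped to
    the range [0, 2**dst_bits - 1].
    """
    value &= (1 << src_bits) - 1
    if value >= 1 << (src_bits - 1):
        value -= 1 << src_bits  # reinterpret as negative
    return max(0, min(value, (1 << dst_bits) - 1))


def truncate_lane(value, dst_bits=8):
    """Plain truncation: keep only the low dst_bits bits."""
    return value & ((1 << dst_bits) - 1)


def leading_zeros_cover(value, src_bits=16, dst_bits=8):
    """True when all bits above dst_bits are zero in the source lane."""
    return (value & ((1 << src_bits) - 1)) >> dst_bits == 0
```

When `leading_zeros_cover` holds, `packus_lane` and `truncate_lane` agree; otherwise the pack saturates and the two can differ, which is why the lowering needs the known-zero-bits check.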
*	Rewrite FileOutputBuffer as two separate classes.	Rui Ueyama	2017-11-01	1	-82/+120
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch rewrites FileOutputBuffer as two separate classes: one for a file-backed output buffer and the other for a memory-backed output buffer. I think the new code is easier to follow because two different implementations are now actually separated as different classes. Unlike the previous implementation, the class that does not replace the final output file using rename(2) does not create a temporary file at all. Instead, it allocates memory using mmap(2) and uses it. I think this is an improvement because it is now guaranteed that the temporary memory region doesn't trigger any I/O and there's now zero chance to leave a temporary file behind. Also, it shouldn't impose new restrictions because we were using mmap IO too. Differential Revision: https://reviews.llvm.org/D39449 llvm-svn: 317127
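The split described in this message can be sketched as two small classes with the same interface. This is an illustrative model in Python rather than the C++ from the patch; the class names, the `commit` method, and the assumption that `size` is greater than zero are all hypothetical.

```python
import mmap
import os
import tempfile


class OnDiskBuffer:
    """File-backed variant: map a temporary file, rename it on commit."""

    def __init__(self, path, size):
        self.path = path
        # The temporary file lives next to the target so the rename
        # stays on the same filesystem.
        fd, self.tmp_path = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
        os.ftruncate(fd, size)
        self.file = os.fdopen(fd, "r+b")
        self.buf = mmap.mmap(self.file.fileno(), size)

    def commit(self):
        self.buf.flush()
        self.buf.close()
        self.file.close()
        os.replace(self.tmp_path, self.path)  # replaces the final output


class InMemoryBuffer:
    """Memory-backed variant: anonymous mapping, no temporary file."""

    def __init__(self, path, size):
        self.path = path
        self.buf = mmap.mmap(-1, size)  # anonymous memory, no file I/O yet

    def commit(self):
        # The only file touched is the final output itself.
        with open(self.path, "wb") as out:
            out.write(self.buf[:])
        self.buf.close()
```

In both cases the caller writes into `buf` and then calls `commit()`; only the first variant ever creates a temporary file that could be left behind.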
*	[X86] Add custom code to EVEX to VEX pass to turn unmasked 128-bit ↵	Craig Topper	2017-11-01	1	-0/+21
\| \| \| \| \| \| \| \| \| \|	VPALIGND/Q into VPALIGNR if the extended registers aren't being used. This will enable us to prefer VALIGND/Q during shuffle lowering in order to get the extended register encoding space when BWI isn't available. But if we end up not using the extended registers we can switch VPALIGNR for the shorter VEX encoding. Differential Revision: https://reviews.llvm.org/D39401 llvm-svn: 317122
*	loop-rotate: avoid duplicating dbg.value intrinsics in the entry block.	Adrian Prantl	2017-11-01	1	-0/+24
\| \| \| \| \| \| \| \|	This fixes the second half of PR35113. This reapplies r317106 without modifications. llvm-svn: 317121
*	loop-rotate: eliminate duplicate debug intrinsics after splicing.	Adrian Prantl	2017-11-01	1	-1/+26
\| \| \| \| \| \| \| \| \|	Fixes part of PR35113. This reapplies r317105 with an additional check for isa<Instruction> as found by the bots. llvm-svn: 317120
*	Include GUIDs from the same module when computing GUIDs that need to be ↵	Dehao Chen	2017-11-01	1	-15/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	imported. Summary: In the compile phase of SamplePGO+ThinLTO, ICP is not invoked. Instead, indirect call targets will be included as function metadata for ThinIndex to build the call graph. This should not only include functions defined in other modules, but also functions defined in the same module, otherwise ThinIndex may find the callee dead and eliminate it, while ICP in backend will revive the symbol, which leads to undefined symbol. Reviewers: tejohnson Reviewed By: tejohnson Subscribers: sanjoy, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D39480 llvm-svn: 317118
*	Revert 317016 and 317048	Philip Reames	2017-11-01	1	-44/+50
\| \| \| \| \| \|	The former appears to have introduced a miscompile in a stage2 clang build. Revert so I can investigate offline. llvm-svn: 317116
*	AMDGPU: Fix set but not used warnings related to AMDGPUAS	Konstantin Zhuravlyov	2017-11-01	7	-36/+32
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D39499 llvm-svn: 317114
*	[X86] Prevent fast isel from folding loads into the instructions listed in ↵	Craig Topper	2017-11-01	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	hasPartialRegUpdate. This patch moves the check for opt size and hasPartialRegUpdate into the lower level implementation of foldMemoryOperandImpl to catch the entry point that fast isel uses. We're still folding undef register instructions in AVX that we should also probably disable, but that's a problem for another patch. Unfortunately, this requires reordering a bunch of functions which is why the diff is so large. I can do the function reordering separately if we want. Differential Revision: https://reviews.llvm.org/D39402 llvm-svn: 317112
*	Adds code to PPC ISEL lowering to recognize half-word inserts from ↵	Graham Yiu	2017-11-01	3	-5/+139
\| \| \| \| \| \| \| \|	vector_shuffles, and use P9 shift and vector insert instructions instead of vperm. Differential Revision: https://reviews.llvm.org/D34160 llvm-svn: 317111
*	Revert r317105 to investigate bot breakage.	Adrian Prantl	2017-11-01	1	-23/+1
\| \| \| \|	llvm-svn: 317110
*	Revert r317106 to facilitate reverting r317105.	Adrian Prantl	2017-11-01	1	-24/+0
\| \| \| \|	llvm-svn: 317109
*	LTO: Apply global DCE to ThinLTO modules at LTO opt level 0.	Peter Collingbourne	2017-11-01	2	-31/+42
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is necessary because DCE is applied to full LTO modules. Without this change, a reference from a dead ThinLTO global to a dead full LTO global will result in an undefined reference at link time. This problem is only observable when --gc-sections is disabled, or when targeting COFF, as the COFF port of lld requires all symbols to have a definition even if all references are dead (this is consistent with link.exe). This change also adds an EliminateAvailableExternally pass at -O0. This is necessary to handle the situation on Windows where a non-prevailing copy of a linkonce_odr function has an SEH filter function; any such filters must be DCE'd because they will contain a call to the llvm.localrecover intrinsic, passing as an argument the address of the function that the filter belongs to, and llvm.localrecover requires this function to be defined locally. Fixes PR35142. Differential Revision: https://reviews.llvm.org/D39484 llvm-svn: 317108
*	loop-rotate: avoid duplicating dbg.value intrinsics in the entry block.	Adrian Prantl	2017-11-01	1	-0/+24
\| \| \| \| \| \|	This fixes the second half of PR35113. llvm-svn: 317106
*	loop-rotate: eliminate duplicate debug intrinsics after splicing.	Adrian Prantl	2017-11-01	1	-1/+23
\| \| \| \| \| \|	Fixes part of PR35113. llvm-svn: 317105
*	[X86] Add 64-bit int to float/double conversion with AVX to ↵	Craig Topper	2017-11-01	1	-3/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	X86FastISel::X86SelectSIToFP Summary: [X86] Teach fast isel to handle i64 sitofp with AVX. For some reason we only handled i32 sitofp with AVX. But with SSE only we support i64 so we should do the same with AVX. Also add i686 command lines for the 32-bit tests. 64-bit tests are in a separate file to avoid a fast-isel abort failure in 32-bit mode. Reviewers: RKSimon, zvi Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39450 llvm-svn: 317102
*	Update VCVTx, VMOVNTPx and VROUNDYPx instructions scheduling on btver2.	Andrew V. Tischenko	2017-11-01	1	-0/+39
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D39059 llvm-svn: 317101
*	Correct dwarf unwind information in function epilogue for X86	Petar Jovanovic	2017-11-01	10	-14/+474
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch aims to provide correct dwarf unwind information in function epilogue for X86. It consists of two parts. The first part inserts CFI instructions that set appropriate cfa offset and cfa register in emitEpilogue() in X86FrameLowering. This part is X86 specific. The second part is platform independent and ensures that: - CFI instructions do not affect code generation - Unwind information remains correct when a function is modified by different passes. This is done in a late pass by analyzing information about cfa offset and cfa register in BBs and inserting additional CFI directives where necessary. Changed CFI instructions so that they: - are duplicable - are not counted as instructions when tail duplicating or tail merging - can be compared as equal Added CFIInstrInserter pass: - analyzes each basic block to determine cfa offset and register valid at its entry and exit - verifies that outgoing cfa offset and register of predecessor blocks match incoming values of their successors - inserts additional CFI directives at basic block beginning to correct the rule for calculating CFA Having CFI instructions in function epilogue can cause incorrect CFA calculation rule for some basic blocks. This can happen if, due to basic block reordering, or the existence of multiple epilogue blocks, some of the blocks have wrong cfa offset and register values set by the epilogue block above them. CFIInstrInserter is currently run only on X86, but can be used by any target that implements support for adding CFI instructions in epilogue. Patch by Violeta Vukobrat. Differential Revision: https://reviews.llvm.org/D35844 llvm-svn: 317100
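The late-pass idea in this message (compare each block's required incoming CFA rule with what the block laid out above it leaves in effect, and insert a directive on mismatch) can be modelled in a few lines. This is a simplified sketch, not the CFIInstrInserter implementation: the `Block` type, the function name, and the assumption that each block's incoming and outgoing state is already known are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Block:
    name: str
    incoming: tuple  # (cfa_offset, cfa_register) required on entry
    outgoing: tuple  # (cfa_offset, cfa_register) in effect at exit


def insert_cfi_fixups(layout):
    """Return (block name, directive) pairs for blocks whose required
    incoming CFA rule differs from the rule left by the previous block
    in layout order."""
    fixups = []
    prev_state = layout[0].incoming
    for block in layout:
        if block.incoming != prev_state:
            offset, reg = block.incoming
            prev_offset, prev_reg = prev_state
            if offset != prev_offset and reg != prev_reg:
                directive = f".cfi_def_cfa {reg}, {offset}"
            elif reg != prev_reg:
                directive = f".cfi_def_cfa_register {reg}"
            else:
                directive = f".cfi_def_cfa_offset {offset}"
            fixups.append((block.name, directive))
        prev_state = block.outgoing
    return fixups


# An epilogue block placed above another block in layout leaves the
# wrong rule in effect for the block that follows it.
layout = [
    Block("entry", (8, "%rsp"), (16, "%rbp")),
    Block("epilogue1", (16, "%rbp"), (8, "%rsp")),
    Block("body2", (16, "%rbp"), (16, "%rbp")),
]
fixups = insert_cfi_fixups(layout)
```

Here `body2` needs a directive at its beginning because `epilogue1`, which sits above it in layout, has already switched the rule back to the caller's frame.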
*	[X86][SSE] Begun generalizing truncateVectorWithPACKSS to work with ↵	Simon Pilgrim	2017-11-01	1	-11/+14
\| \| \| \| \| \| \| \|	PACKSS/PACKUS functions Renamed to truncateVectorWithPACK llvm-svn: 317098