bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[AVX-512] Support spills of XMM16-31 and YMM16-31 when VLX isn't available.	Craig Topper	2016-09-29	2	-8/+137
	This adds new pseudo instructions that can be selected during register allocation to represent loads and stores of XMM/YMM registers when AVX512F is available, but VLX isn't. They will be converted to VEX encoded moves if the register turns out to be XMM0-15/YMM0-15. Otherwise either an EVEX VEXTRACT (store) or VBROADCAST (load) will be used. Fixes one of the cases from PR29112. llvm-svn: 282690
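The expansion rule in this message can be modeled as a small decision function. This is a minimal sketch in Python, not the LLVM implementation: the function name `expand_spill_pseudo` and the returned strings are hypothetical, and only the instruction families named in the message are used.

```python
def expand_spill_pseudo(reg_index: int, is_load: bool) -> str:
    """Model the encoding chosen for a spill pseudo of an XMM/YMM register
    when AVX512F is available but VLX is not (hypothetical helper)."""
    if not 0 <= reg_index <= 31:
        raise ValueError("register index must be in 0..31")
    if reg_index <= 15:
        # XMM0-15/YMM0-15 are reachable with a VEX encoded move.
        return "VEX move"
    # XMM16-31/YMM16-31 need an EVEX instruction instead.
    return "EVEX VBROADCAST" if is_load else "EVEX VEXTRACT"
```

The register index only becomes known after register allocation, which is why the message describes selecting a pseudo first and expanding it later.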
*	[AVX-512] Replicate pattern from AVX to select VMOVDDUP for (v2f64 ↵	Craig Topper	2016-09-29	1	-2/+6
	(X86VBroadcast f64:)). Add AVX512VL to command line of existing AVX2 test that hits this condition. llvm-svn: 282688
*	[X86] Add EVEX encoded VBROADCASTSS/SD and VPBROADCASTD/Q to execution ↵	Craig Topper	2016-09-29	1	-0/+10
	domain fixing table. llvm-svn: 282687
*	[X86] Remove AddedComplexity adjustments that don't seem to be needed.	Craig Topper	2016-09-29	1	-6/+4
	llvm-svn: 282686
*	[X86] Add VBROADCASTF128/VBROADCASTI128 to execution domain fixing tables.	Craig Topper	2016-09-29	1	-1/+2
	llvm-svn: 282684
*	Add explanatory comment.	Peter Collingbourne	2016-09-29	1	-0/+4
	llvm-svn: 282678
*	Remove an unnecessary duplicate initialization of TLOF from the Mips	Eric Christopher	2016-09-29	1	-4/+0
	AsmPrinter. This was reinitializing the Mangler after we moved the Mangler down to TLOF and causing us to have two different unnamed global values accessed with the same name. This should fix the problems on the ubsan tests here: http://lab.llvm.org:8011/builders/clang-cmake-mips/builds/15307 llvm-svn: 282675
*	Remove the default constructor and count variable from the Mangler since	Eric Christopher	2016-09-29	1	-1/+1
	we can just use the size of the DenseMap as a unique counter. llvm-svn: 282674
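The idea of using a map's size as a unique counter can be sketched as follows. This is an illustrative Python model with a plain dict standing in for the DenseMap; the class name `AnonymousIds` and the method `id_for` are hypothetical and are not the Mangler's API.

```python
class AnonymousIds:
    """Hand out a stable, unique number per key without a separate counter."""

    def __init__(self):
        self._ids = {}

    def id_for(self, key) -> int:
        # A new key receives the current map size as its number, so the
        # map itself plays the role of the counter.
        if key not in self._ids:
            self._ids[key] = len(self._ids)
        return self._ids[key]
```

The numbers stay unique only as long as entries are never removed from the map, which is an assumption of this sketch.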
*	Update comment about initializing TLOF with a pointer at the previous	Eric Christopher	2016-09-29	1	-1/+3
	line or the other commented out place. llvm-svn: 282673
*	Tidy spelling and grammar.	Eric Christopher	2016-09-29	1	-1/+1
	llvm-svn: 282672
*	MachineFunction: Add missing newline in debug print()	Matthias Braun	2016-09-29	1	-0/+1
	Should not be a functional but an aesthetic change. llvm-svn: 282669
*	AMDGPU: Partially fix control flow at -O0	Matt Arsenault	2016-09-29	7	-21/+426
	Fixes to allow spilling all registers at the end of the block to work with exec modifications. Don't emit s_and_saveexec_b64 for if lowering, and instead emit copies. Mark control flow mask instructions as terminators to get correct spill code placement with fast regalloc, and then have a separate optimization pass form the saveexec. This should work if SGPRs are spilled to VGPRs, but will likely fail in the case that an SGPR spills to memory and no workitem takes a divergent branch. llvm-svn: 282667
*	LTO: Fix use-after-scope error.	Peter Collingbourne	2016-09-29	1	-0/+1
	llvm-svn: 282665
*	AArch64: Set shift bit of TLSLE HI12 add instruction	Lei Liu	2016-09-29	1	-0/+8
	Summary: The AArch64 LLVM assembler emits the add instruction without the shift bit when calculating the higher 12 bits of the address of TLS variables in the local exec model. This generates a wrong code sequence for accessing TLS variables with a thread offset larger than 0x1000. Reviewers: t.p.northover, peter.smith, rovka Subscribers: salim.nasser, aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D24702 llvm-svn: 282661
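The effect of the missing shift bit can be checked with plain arithmetic. The following Python sketch models a two-add local exec sequence in which the offset is split into a high and a low 12-bit part; the function names are hypothetical, and the 24-bit offset limit is an assumption of this model rather than something stated in the message.

```python
def split_tls_offset(offset: int) -> tuple[int, int]:
    """Split a thread-pointer offset into its HI12 and LO12 parts."""
    if not 0 <= offset < (1 << 24):
        raise ValueError("offset must fit in 24 bits for this model")
    return (offset >> 12) & 0xFFF, offset & 0xFFF


def tlsle_address(thread_pointer: int, offset: int, shift_hi12: bool) -> int:
    """Model the result of the two adds, with or without the shift on HI12."""
    hi12, lo12 = split_tls_offset(offset)
    # With the shift bit set, the HI12 immediate is scaled by 4096.
    hi_term = (hi12 << 12) if shift_hi12 else hi12
    return thread_pointer + hi_term + lo12
```

For offsets below 0x1000 the HI12 part is zero, so both variants agree; the two results diverge as soon as the offset reaches 0x1000, which matches the failure the summary describes.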
*	Wisely choose sext or zext when widening IV.	Evgeny Stupachenko	2016-09-28	1	-37/+88
	Summary: The patch fixes a regression caused by two earlier patches, D18777 and D18867. Reviewers: reames, sanjoy Differential Revision: http://reviews.llvm.org/D24280 From: Li Huang llvm-svn: 282650
*	Next set of additional error checks for invalid Mach-O files for the	Kevin Enderby	2016-09-28	1	-0/+32
	load command that uses the MachO::rpath_command type, which is not used in llvm libObject code but is used in llvm tool code. This includes just the LC_RPATH load command. llvm-svn: 282649
*	[RegisterBankInfo] Uniquely generate OperandsMapping.	Quentin Colombet	2016-09-28	2	-24/+90
	This is a step toward statically allocating InstructionMapping. Like the previous few commits, the goal is to move toward a TableGen'ed-like structure with no dynamic allocation at all. This should already improve compile time by getting rid of a bunch of memmoves of SmallVectors. llvm-svn: 282643
*	[RegisterBankInfo] Rework the APIs of ValueMapping.	Quentin Colombet	2016-09-28	1	-10/+12
	This is a preparatory commit for more TableGen-like structure. NFC llvm-svn: 282642
*	Remove dead code from LiveDebugVariables.cpp (NFC)	Adrian Prantl	2016-09-28	1	-44/+10
	LiveDebugVariables doesn't propagate DBG_VALUEs across basic block boundaries any more; this functionality was split into LiveDebugValues. We can thus drop the now dead references to LexicalScopes from LiveDebugVariables. llvm-svn: 282638
*	Next set of additional error checks for invalid Mach-O files for the	Kevin Enderby	2016-09-28	1	-0/+32
	other load commands that use the MachO::version_min_command type, which is not used in llvm libObject code but is used in llvm tool code. This includes the LC_VERSION_MIN_MACOSX, LC_VERSION_MIN_IPHONEOS, LC_VERSION_MIN_TVOS and LC_VERSION_MIN_WATCHOS load commands. llvm-svn: 282635
*	Refactor the ProfileSummaryInfo to use doInitialization and doFinalization ↵	Dehao Chen	2016-09-28	4	-20/+15
	to handle Module update. Summary: This refactors the change in r282616 Reviewers: davidxl, eraman, mehdi_amini Subscribers: mehdi_amini, davide, llvm-commits Differential Revision: https://reviews.llvm.org/D25041 llvm-svn: 282630
*	IfConversion: Add implicit uses for redefined regs with live subregisters	Krzysztof Parzyszek	2016-09-28	1	-0/+11
	Normally, if conversion would add implicit uses for redefined registers, e.g. R0<def> = add_if ..., R0<imp-use>. However, if only subregisters of R0 are known to be live but not R0 itself, such implicit uses will not be added, causing prior definitions of such subregisters and R0 itself to become dead. llvm-svn: 282626
*	[AMDGPU] Promote uniform i16 ops to i32 ops for targets that have 16 bit ↵	Konstantin Zhuravlyov	2016-09-28	2	-3/+238
	instructions Differential Revision: https://reviews.llvm.org/D24125 llvm-svn: 282624
*	Fix the bug introduced in r282616.	Dehao Chen	2016-09-28	1	-1/+1
	llvm-svn: 282618
*	Fix the bug when -compile-twice is specified, the PSI will be invalidated.	Dehao Chen	2016-09-28	1	-3/+12
	Summary: When using llc with -compile-twice, the module is generated twice, but getAnalysis<ProfileSummaryInfoWrapperPass>().getPSI will still get the old PSI with the original (invalidated) Module. This patch checks whether the module has changed when calling getPSI; if it has, it updates the module and invalidates the Summary. The bug does not show up in the current llc because PSI is not used in CodeGen yet. But with https://reviews.llvm.org/D24989, the bug will be exposed by test/CodeGen/PowerPC/pr26378.ll Reviewers: eraman, davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24993 llvm-svn: 282616
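The check described in this summary follows a common pattern: a cached result is tied to the object it was computed from and is dropped when a different object is passed in. A minimal Python sketch of that pattern, with a hypothetical `SummaryCache` class standing in for the wrapper pass (none of these names come from LLVM):

```python
class SummaryCache:
    """Cache one summary and drop it when a different module is seen."""

    def __init__(self, compute):
        self._compute = compute  # callable that builds a summary from a module
        self._module = None
        self._summary = None

    def get(self, module):
        # A different module object means the cached summary is stale.
        if module is not self._module:
            self._module = module
            self._summary = None
        if self._summary is None:
            self._summary = self._compute(module)
        return self._summary
```

Comparing by identity rather than by value mirrors the situation in the summary, where the second compilation produces a fresh module object.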
*	Don't look through addrspacecast in GetPointerBaseWithConstantOffset	Artur Pilipenko	2016-09-28	1	-2/+7
	Pointers in different addrspaces can have different sizes, so it's not valid to look through an addrspacecast when calculating the base and offset for a value. This is similar to D13008. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D24729 llvm-svn: 282612
*	Teach LiveDebugValues about lexical scopes.	Adrian Prantl	2016-09-28	1	-8/+43
	This addresses PR26055 (LiveDebugValues is very slow). Contrary to the old LiveDebugVariables pass, LiveDebugValues currently doesn't look at the lexical scopes before inserting a DBG_VALUE intrinsic. This means that we often propagate DBG_VALUEs much further down than necessary. This is especially noticeable in large C++ functions with many inlined method calls that all use the same "this"-pointer. For example, in the following code it makes no sense to propagate the inlined variable a from the first inlined call to f() into any of the subsequent basic blocks, because the variable will always be out of scope: void sink(int a); void __attribute((always_inline)) f(int a) { sink(a); } void foo(int i) { f(i); if (i) f(i); f(i); } This patch reuses the LexicalScopes infrastructure we have for LiveDebugVariables to take this into account. The effect on compile time and memory consumption is quite noticeable: I tested a benchmark that is a large C++ source with an enormous amount of inlined "this"-pointers that would previously eat >24GiB (most of them for DBG_VALUE intrinsics) and whose compile time was dominated by LiveDebugValues. With this patch applied the memory consumption is 1GiB and 1.7% of the time is spent in LiveDebugValues. https://reviews.llvm.org/D24994 Thanks to Daniel Berlin and Keith Walker for reviewing! llvm-svn: 282611
*	Rewrite loops to use range-based for. (NFC)	Adrian Prantl	2016-09-28	1	-17/+5
	llvm-svn: 282608
*	[NVPTX] Added intrinsics for atom.gen.{sys\|cta}.* instructions.	Artem Belevich	2016-09-28	7	-16/+263
	These are only available on sm_60+ GPUs. Differential Revision: https://reviews.llvm.org/D24943 llvm-svn: 282607
*	[SCEV] Use a SmallPtrSet as a temporary union predicate; NFC	Sanjoy Das	2016-09-28	1	-55/+65
	Summary: Instead of creating and destroying SCEVUnionPredicate instances (which internally creates and destroys a DenseMap), use temporary SmallPtrSet instances to remember the set of predicates that will get reified into a SCEVUnionPredicate. Reviewers: silviu.baranga, sbaranga Subscribers: sanjoy, mcrosier, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D25000 llvm-svn: 282606
*	Revert "In visitSTORE, always use FindBetterChain, rather than only when ↵	Nirav Dave	2016-09-28	3	-121/+282
	UseAA is enabled." This reverts commit r282600 due to test failures with MCJIT. llvm-svn: 282604
*	[AVR] Rename the builtin calling convention names	Dylan McKay	2016-09-28	1	-3/+3
	'BUILTIN' is clearer than 'RT' in this context. llvm-svn: 282602
*	[x86] Accept 'retn' as an alias to 'ret[lqw]'\'ret' (At&t\Intel)	Marina Yatsina	2016-09-28	1	-0/+6
	Implement 'retn' simply by aliasing it to the relevant 'ret' instruction. Commit on behalf of coby. Differential Revision: https://reviews.llvm.org/D24346 llvm-svn: 282601
*	In visitSTORE, always use FindBetterChain, rather than only when UseAA is ↵	Nirav Dave	2016-09-28	3	-282/+121
	enabled. Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. When merging stores, search up the chain through a single load, and find all possible stores by looking down through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and generally the output CodeGen (with some exceptions). Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seemed sufficient to not cause regressions in tests. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable. Some tests relying on the order were changed to use volatile memory operations. Noteworthy tests: CodeGen/AArch64/argument-blocks.ll - It's not entirely clear what the test_varargs_stackalign test is supposed to be asserting, but the new code looks right. CodeGen/AArch64/arm64-memset-inline.lli - CodeGen/AArch64/arm64-stur.ll - CodeGen/ARM/memset-inline.ll - The backend now generates worse code due to store merging succeeding, as we do do a 16-byte constant-zero store efficiently.
CodeGen/AArch64/merge-store.ll - Improved, but there still seems to be an extraneous vector insert from an element to itself? CodeGen/PowerPC/ppc64-align-long-double.ll - Worse code emitted in this case, due to the improved store->load forwarding. CodeGen/X86/dag-merge-fast-accesses.ll - CodeGen/X86/MergeConsecutiveStores.ll - CodeGen/X86/stores-merging.ll - CodeGen/Mips/load-store-left-right.ll - Restored correct merging of non-aligned stores CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll - Improved. Correctly merges buffer_store_dword calls CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll - Improved. Sidesteps loading a stored value and merges two stores CodeGen/X86/pr18023.ll - This test has been removed, as it was asserting incorrect behavior. Non-volatile stores CAN be moved past volatile loads, and now are. CodeGen/X86/vector-idiv.ll - CodeGen/X86/vector-lzcnt-128.ll - It's basically impossible to tell what these tests are actually testing. But, looks like the code got better due to the memory operations being recognized as non-aliasing. CodeGen/X86/win32-eh.ll - Both loads of the securitycookie are now merged. CodeGen/AMDGPU/vgpr-spill-emergency-stack-slot-compute.ll - This test appears to work but no longer exhibits the spill behavior. Reviewers: arsenm, hfinkel, tstellarAMD, nhaehnle, jyknight Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, resistor, tstellarAMD, t.p.northover, spatel Differential Revision: https://reviews.llvm.org/D14834 llvm-svn: 282600
*	[AVR] Import the LLVM namespace inside AVRMCTargetDesc.cpp	Dylan McKay	2016-09-28	1	-0/+2
	llvm-svn: 282598
*	[AVR] Add AVRMCTargetDesc.cpp	Dylan McKay	2016-09-28	3	-4/+97
	Summary: This adds the AVRMCTargetDesc file in tree. It allows creation of the core classes used in the backend. Reviewers: arsenm, kparzysz Subscribers: wdng, beanz, mgorny Differential Revision: https://reviews.llvm.org/D25023 llvm-svn: 282597
*	[AVR] Update the signature of createAVRAsmBackend	Dylan McKay	2016-09-28	1	-1/+3
	It has been recently changed to also take a MCTargetOptions structure. llvm-svn: 282594
*	[AVR] Enable the assembly parser	Dylan McKay	2016-09-28	2	-15/+18
	We very recently landed the code. This commit enables the parser. It also adds a missing include to AVRAsmParser.cpp. llvm-svn: 282593
*	[InstSimplify] allow or-of-icmps folds with vector splat constants	Sanjay Patel	2016-09-28	1	-12/+11
	llvm-svn: 282592
*	[InstSimplify] allow and-of-icmps folds with vector splat constants	Sanjay Patel	2016-09-28	1	-10/+9
	llvm-svn: 282590
*	[AVR] Merge most recent changes to AVRInstrInfo.td	Dylan McKay	2016-09-28	1	-21/+85
	This adds two new things: operand types per fixup, and atomic pseudo operations. llvm-svn: 282588
*	[AVR] Update the data layout	Dylan McKay	2016-09-28	1	-1/+3
	The previous data layout caused issues when dealing with atomics. For example, it is illegal to load a 16-bit value with less than 16 bits of alignment. This changes the data layout so that all types are aligned by at least their own width. Interestingly, this also _slightly_ decreased register pressure in some cases. llvm-svn: 282587
*	[AVR] Handle AVR relocations when handling ELF files	Dylan McKay	2016-09-28	1	-0/+7
	llvm-svn: 282586
*	[AVR] Add assembly parser	Dylan McKay	2016-09-28	5	-1/+658
	Summary: This patch adds the AVRAsmParser library. Reviewers: arsenm, kparzysz Subscribers: wdng, beanz, mgorny, kparzysz, simoncook, jtbandes, llvm-commits Differential Revision: https://reviews.llvm.org/D20046 llvm-svn: 282584
*	[X86][FastISel] Use a COPY from K register to a GPR instead of a K operation	Guy Blank	2016-09-28	1	-27/+31
	The KORTEST was introduced due to a bug where a TEST instruction used a K register. However, it turns out that the opposite case, a KORTEST using a GPR, is now happening. The change removes the KORTEST flow and adds a COPY instruction from the K reg to a GPR. Differential Revision: https://reviews.llvm.org/D24953 llvm-svn: 282580
*	Strip trailing whitespace	Simon Pilgrim	2016-09-28	1	-1/+1
	llvm-svn: 282579
*	[SystemZ] Implementation of getUnrollingPreferences().	Jonas Paulsson	2016-09-28	3	-6/+62
	This commit enables more unrolling for SystemZ by implementing the SystemZTargetTransformInfo::getUnrollingPreferences() method. It has been found that it is better to only unroll moderately, so the DefaultUnrollRuntimeCount has been moved into UnrollingPreferences in order to set this to a lower value for SystemZ (4). Reviewers: Evgeny Stupachenko, Ulrich Weigand. https://reviews.llvm.org/D24451 llvm-svn: 282570
*	[DAG] Remove isVectorClearMaskLegal() check from vector_build dagcombine	Michael Kuperstein	2016-09-28	1	-7/+0
	This check currently doesn't seem to do anything useful on any in-tree target: On non-x86, it always evaluates to false, so we never hit the code path that creates the shuffle with zero. On x86, it just forwards to isShuffleMaskLegal(), which is a reasonable thing to query in general, but doesn't make sense if only restricted to zero blends. Differential Revision: https://reviews.llvm.org/D24625 llvm-svn: 282567
*	[libFuzzer] speedup TracePC::FinalizeTrace	Kostya Serebryany	2016-09-28	2	-15/+22
	llvm-svn: 282562
*	[LAA] Rename emitAnalysis to recordAnalysis. NFC	Adam Nemet	2016-09-28	1	-23/+20
	Ever since LAA was split out into an analysis on its own, this function stopped emitting the report directly. Instead it stores it to be retrieved by the client which can then emit it as its own report (e.g. -Rpass-analysis=loop-vectorize). llvm-svn: 282561