bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	[AVX-512] Use EVEX encoded logic operations for scalar types when they are ↵	Craig Topper	2016-12-17	1	-4/+4
\| \| \| \| \| \|	available. This gives the register allocator more registers to work with. llvm-svn: 290049
*	[AVX-512] Update scalar logic test to show missed opportunity to use EVEX ↵	Craig Topper	2016-12-17	1	-19/+40
\| \| \| \| \| \|	encoded logic instructions to get more registers to use. llvm-svn: 290048
*	Revert "AArch64CollectLOH: Rewrite as block-local analysis."	Matthias Braun	2016-12-17	4	-194/+9
\| \| \| \| \| \| \| \|	It is still breaking Chrome. http://llvm.org/PR31361 This reverts commit r290026. llvm-svn: 290047
*	Move test to correct directory	Matthias Braun	2016-12-17	1	-0/+0
\| \| \| \| \| \|	See also test/CodeGen/MIR/README llvm-svn: 290032
*	Revert "[GVNHoist] Move GVNHoist to function simplification part of pipeline."	Evgeniy Stepanov	2016-12-17	1	-38/+0
\| \| \| \| \| \| \| \|	This reverts r289696, which caused TSan perf regression. See PR31382. llvm-svn: 290030
*	AArch64CollectLOH: Rewrite as block-local analysis.	Matthias Braun	2016-12-17	4	-9/+194
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Re-apply r288561: Liveness tracking should be correct now after r290014. Previously this pass was using up to 5% compile time in some cases which is a bit much for what it is doing. The pass featured a full blown data-flow analysis which in the default configuration was restricted to a single block. This rewrites the pass under the assumption that we only ever work on a single block. This is done in a single pass maintaining a state machine per general purpose register to catch LOH patterns. Differential Revision: https://reviews.llvm.org/D27329 llvm-svn: 290026
*	[sancov] skip dead files from computations	Mike Aizatsky	2016-12-17	4	-8/+32
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D27863 llvm-svn: 290017
*	AArch64: Enable post-ra liveness updates	Matthias Braun	2016-12-16	2	-6/+6
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D27559 llvm-svn: 290014
*	Allow "line 0" to be the first explicit debug location in a function.	Paul Robinson	2016-12-16	1	-1/+6
\| \| \| \| \| \|	Feedback on r289468 from Adrian Prantl. llvm-svn: 290012
*	Fix a bug with using some Mach-O command line flags like "-arch armv7m".	Kevin Enderby	2016-12-16	3	-2/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A Mach-O command line flag like "-arch armv7m" does not match the arch name part of its llvm Triple, which is "thumbv7m-apple-darwin". I think the best way to fix this is to have llvm::object::MachOObjectFile::getArchTriple() optionally return the name of the Mach-O arch flag that would be used with -arch that matches the CPUType and CPUSubType. Then change llvm::object::MachOUniversalBinary::ObjectForArch::getArchTypeName() to use that and change it to getArchFlagName() as the type name is really part of the Triple and the -arch flag name is a Mach-O thing for a specific Triple with a specific Mcpu value. rdar://29663637 llvm-svn: 290001
*	Resubmit "[CodeView] Hook CodeViewRecordIO for reading/writing symbols."	Zachary Turner	2016-12-16	2	-22/+7
\| \| \| \| \| \| \|	The original patch was broken due to some undefined behavior as well as warnings that were triggering -Werror. llvm-svn: 290000
*	[ThinLTO] Import composite types as declarations	Teresa Johnson	2016-12-16	2	-0/+75
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: When reading the metadata bitcode, create a type declaration when possible for composite types when we are importing. Doing this in the bitcode reader saves memory. Also it works naturally in the case when the type ODR map contains a definition for the same composite type because it was used in the importing module (buildODRType will automatically use the existing definition and not create a type declaration). For Chromium built with -g2, this reduces the aggregate size of the generated native object files by 66% (from 31G to 10G). It reduced the time through the ThinLTO link and backend phases by about 20% on my machine. Reviewers: mehdi_amini, dblaikie, aprantl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27775 llvm-svn: 289993
*	Preserve loop metadata when folding branches to a common destination.	Michael Kuperstein	2016-12-16	1	-0/+28
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D27830 llvm-svn: 289992
*	[CodeGenPrep] Skip merging empty case blocks	Jun Bum Lim	2016-12-16	7	-14/+215
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a recommit of r287553 after fixing the invalid loop info after eliminating an empty block and unit test failures in AVR and WebAssembly: Summary: Merging an empty case block into the header block of switch could cause ISel to add COPY instructions in the header of switch, instead of the case block, if the case block is used as an incoming block of a PHI. This could potentially increase dynamic instructions, especially when the switch is in a loop. I added a test case which was reduced from the benchmark I was targeting. Reviewers: t.p.northover, mcrosier, manmanren, wmi, joerg, davidxl Subscribers: joerg, qcolombet, danielcdh, hfinkel, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D22696 llvm-svn: 289988
*	Revert "[IR] Remove the DIExpression field from DIGlobalVariable."	Adrian Prantl	2016-12-16	166	-566/+418
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit 289920 (again). I forgot to implement a Bitcode upgrade for the case where a DIGlobalVariable has no DIExpression. Unfortunately it is not possible to safely upgrade these variables without adding a flag to the bitcode record indicating which version they are. My plan of record is to roll the planned follow-up patch that adds a unit: field to DIGlobalVariable into this patch before recommitting. This way we only need one Bitcode upgrade for both changes (with a version flag in the bitcode record to safely distinguish the record formats). Sorry for the churn! llvm-svn: 289982
*	Revert "[CodeView] Hook CodeViewRecordIO for reading/writing symbols."	Zachary Turner	2016-12-16	2	-7/+22
\| \| \| \| \| \| \|	This reverts commit r289978, which is failing due to some rebase/merge issues. llvm-svn: 289981
*	[CodeView] Hook CodeViewRecordIO for reading/writing symbols.	Zachary Turner	2016-12-16	2	-22/+7
\| \| \| \| \| \| \| \| \|	This is the 3rd of 3 patches to get reading and writing of CodeView symbol and type records to use a single codepath. Differential Revision: https://reviews.llvm.org/D26427 llvm-svn: 289978
*	Strip invalid TBAA when reading bitcode	Mehdi Amini	2016-12-16	1	-2/+6
\| \| \| \| \| \| \| \|	This ensures backward compatibility on bitcode loading. Differential Revision: https://reviews.llvm.org/D27839 llvm-svn: 289977
*	Reapply "[LV] Enable vectorization of loops with conditional stores by default"	Matthew Simpson	2016-12-16	5	-6/+6
\| \| \| \| \| \| \| \|	This patch reapplies r289863. The original patch was reverted because it exposed a bug causing the loop vectorizer to crash in the Python runtime on PPC. The underlying issue was fixed with r289958. llvm-svn: 289975
*	Fix CodeGenPrepare::stripInvariantGroupMetadata	Sanjoy Das	2016-12-16	1	-3/+4
\| \| \| \| \| \| \| \| \| \| \| \|	`dropUnknownNonDebugMetadata` takes a list of "known" metadata IDs. The only reason it worked at all is that `getMetadataID` returns something unrelated -- it returns the subclass ID of the receiver (which is used in `dyn_cast` etc.). That does not numerically match `LLVMContext::MD_invariant_group` and ends up dropping `invariant_group` along with every other metadata that does not numerically match `LLVMContext::MD_invariant_group`. llvm-svn: 289973
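The failure mode described above can be illustrated with a toy model. This is only a sketch, not LLVM code: `MetadataMap`, `dropUnknown`, `dropKind`, and the kind ID values are hypothetical stand-ins, and the toy ignores the special handling of debug metadata. It shows why passing an unrelated number as the "known" list wipes out every entry, whereas a targeted removal touches only one kind.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for an instruction's metadata: kind ID -> payload.
using MetadataMap = std::map<unsigned, std::string>;

// Hypothetical kind IDs for this sketch only; not LLVM's real values.
enum : unsigned { KindTBAA = 1, KindRange = 4, KindInvariantGroup = 16 };

// Keep only the entries whose kind ID appears in knownIDs; drop the rest.
void dropUnknown(MetadataMap &md, const std::vector<unsigned> &knownIDs) {
  for (auto it = md.begin(); it != md.end();) {
    bool known =
        std::find(knownIDs.begin(), knownIDs.end(), it->first) != knownIDs.end();
    it = known ? std::next(it) : md.erase(it);
  }
}

// Drop exactly one kind and leave every other entry alone.
void dropKind(MetadataMap &md, unsigned kind) { md.erase(kind); }
```

In this model, calling `dropUnknown(md, {7u})` with an ID that matches nothing empties the map, while `dropKind(md, KindInvariantGroup)` removes a single entry.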
*	[ARM] Add ARMISD::VLD1DUP to match vld1_dup more consistently.	Eli Friedman	2016-12-16	2	-6/+223
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, there are substantial problems forming vld1_dup even if the VDUP survives legalization. The lack of an actual node leads to terrible results: not only can we not form post-increment vld1_dup instructions, but we form scalar pre-increment and post-increment loads which force the loaded value into a GPR. This patch fixes that by combining the vdup+load into an ARMISD node before DAGCombine messes it up. Also includes a crash fix for vld2_dup (see testcase @vld2dupi8_postinc_variable). Recommitting with fix to avoid forming vld1dup if the type of the load doesn't match the type of the vdup (see https://llvm.org/bugs/show_bug.cgi?id=31404). Differential Revision: https://reviews.llvm.org/D27694 llvm-svn: 289972
*	AMDGPU: Fix name for v_ashrrev_i16	Matt Arsenault	2016-12-16	6	-12/+12
\| \| \| \|	llvm-svn: 289967
*	Revert "dwarfdump: Support/process relocations on a CU's abbrev_off"	David Blaikie	2016-12-16	2	-8/+0
\| \| \| \| \| \| \| \| \|	Reverting because this breaks lld's gdb_index support - it's probably double counting the abbrev relocation offset. This reverts commit r289954. llvm-svn: 289961
*	Revert "[CodeGenPrep] Skip merging empty case blocks"	Jun Bum Lim	2016-12-16	5	-212/+11
\| \| \| \| \| \|	This reverts commit r289951. llvm-svn: 289960
*	[InstCombine] auto-generate checks; NFC	Sanjay Patel	2016-12-16	1	-2/+15
\| \| \| \|	llvm-svn: 289959
*	[LV] Don't attempt to type-shrink scalarized instructions	Matthew Simpson	2016-12-16	1	-0/+53
\| \| \| \| \| \| \| \| \| \|	After r288909, instructions feeding predicated instructions may be scalarized if profitable. Since these instructions will remain scalar, we shouldn't attempt to type-shrink them. We should only truncate vector types to their minimal bit widths. This bug was exposed by enabling the vectorization of loops containing conditional stores by default. llvm-svn: 289958
*	Pass sample pgo flags to thinlto.	Dehao Chen	2016-12-16	2	-0/+27
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: ThinLTO needs to invoke SampleProfileLoader pass during link time in order to annotate profile correctly after module importing. Reviewers: davidxl, mehdi_amini, tejohnson Subscribers: pcc, davide, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D27790 llvm-svn: 289957
*	[X86] Fold (setcc (cmp (atomic_load_add x, -C) C), COND) to (setcc (LADD x, ↵	Hans Wennborg	2016-12-16	1	-0/+41
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	-C), COND) (PR31367)
	atomic_load_add returns the value before addition, but sets EFLAGS based on the result of the addition. That means it's setting the flags based on effectively subtracting C from the value at x, which is also what the outer cmp does.
	This targets a pattern that occurs frequently with reference counting pointers:
	  void decrement(long volatile *ptr) {
	    if (_InterlockedDecrement(ptr) == 0)
	      release();
	  }
	Clang would previously compile it (for 32-bit at -Os) as:
	  00000000 <?decrement@@YAXPCJ@Z>:
	     0: 8b 44 24 04          mov 0x4(%esp),%eax
	     4: 31 c9                xor %ecx,%ecx
	     6: 49                   dec %ecx
	     7: f0 0f c1 08          lock xadd %ecx,(%eax)
	     b: 83 f9 01             cmp $0x1,%ecx
	     e: 0f 84 00 00 00 00    je 14 <?decrement@@YAXPCJ@Z+0x14>
	    14: c3                   ret
	and with this patch it becomes:
	  00000000 <?decrement@@YAXPCJ@Z>:
	     0: 8b 44 24 04          mov 0x4(%esp),%eax
	     4: f0 ff 08             lock decl (%eax)
	     7: 0f 84 00 00 00 00    je d <?decrement@@YAXPCJ@Z+0xd>
	     d: c3                   ret
	(Equivalent variants with _InterlockedExchangeAdd, std::atomic<>'s fetch_add or pre-decrement operator generate the same code.)
	Differential Revision: https://reviews.llvm.org/D27781 llvm-svn: 289955
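For readers without the MSVC intrinsics, the same reference-counting pattern can be written portably with std::atomic, which the message names as an equivalent variant. The following is a minimal sketch; `RefCounted` and its `released` flag are hypothetical names used only for illustration. The key point is that `fetch_add` returns the value before the addition, so comparing that result against 1 is the same test as comparing the new value against 0.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical reference-counted object; the names are illustrative only.
struct RefCounted {
  std::atomic<long> refs{1};
  bool released = false;
};

// Returns true when this call dropped the last reference.
// fetch_add(-1) yields the old value, so "old == 1" means "new == 0".
inline bool decrement(RefCounted &obj) {
  if (obj.refs.fetch_add(-1) == 1) {
    obj.released = true; // stands in for release()
    return true;
  }
  return false;
}
```

Whether a given compiler version emits the shorter `lock decl` sequence for this exact source depends on the target and optimization level.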
*	dwarfdump: Support/process relocations on a CU's abbrev_off	David Blaikie	2016-12-16	2	-0/+8
\| \| \| \| \| \| \| \| \| \|	Input can be produced by ld -r, for example (a normal LLVM workflow never hits this - LLVM only ever produces a single abbrev table in an object (shared by multiple CUs), so the reloc's always 0, and when it's linked together the relocation's resolved so it doesn't need to be handled) llvm-svn: 289954
*	[CodeGenPrep] Skip merging empty case blocks	Jun Bum Lim	2016-12-16	5	-11/+212
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a recommit of r287553 after fixing the invalid loop info after eliminating an empty block: Summary: Merging an empty case block into the header block of switch could cause ISel to add COPY instructions in the header of switch, instead of the case block, if the case block is used as an incoming block of a PHI. This could potentially increase dynamic instructions, especially when the switch is in a loop. I added a test case which was reduced from the benchmark I was targeting. Reviewers: t.p.northover, mcrosier, manmanren, wmi, joerg, davidxl Subscribers: joerg, qcolombet, danielcdh, hfinkel, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D22696 llvm-svn: 289951
*	[X86][AVX512] use a single shufps for 512-bit vectors when it can save ↵	Simon Pilgrim	2016-12-16	1	-8/+3
\| \| \| \| \| \| \| \| \| \| \| \|	instructions This is the 512-bit counterpart to the 128-bit transform checked in here: https://reviews.llvm.org/rL289837 This patch is based on the draft by @sroland (Roland Scheidegger) that is attached to PR27885: https://llvm.org/bugs/show_bug.cgi?id=27885 llvm-svn: 289946
*	[X86][AVX512] Add tests showing missed opportunity to efficiently lower ↵	Simon Pilgrim	2016-12-16	1	-0/+32
\| \| \| \| \| \|	v16i32 to VSHUFPS (PR27885) llvm-svn: 289945
*	Speculatively revert r289925, see PR31407	Nico Weber	2016-12-16	13	-70/+31
\| \| \| \|	llvm-svn: 289944
*	[ARM] GlobalISel: Select add i32, i32	Diana Picus	2016-12-16	5	-0/+125
\| \| \| \| \| \| \| \| \| \| \| \| \|	Add the minimal support necessary to select a function that returns the sum of two i32 values. This includes some support for argument/return lowering of i32 values through registers, as well as the handling of copy and add instructions throughout the GlobalISel pipeline. Differential Revision: https://reviews.llvm.org/D26677 llvm-svn: 289940
*	[X86][SSE] Combine shuffles to MOVSS/MOVSD whatever the domain.	Simon Pilgrim	2016-12-16	1	-6/+2
\| \| \| \| \| \|	We already do the same thing in shuffle lowering; but don't do it if we have SSE41 (PBLEND) instead. llvm-svn: 289937
*	[AVR] Add a test for 64-bit left shifts	Dylan McKay	2016-12-16	1	-0/+8
\| \| \| \|	llvm-svn: 289936
*	Revert r289863: [LV] Enable vectorization of loops with conditional	Chandler Carruth	2016-12-16	5	-6/+6
\| \| \| \| \| \| \| \| \| \|	stores by default This uncovers a crasher in the loop vectorizer on PPC when building the Python runtime. I'll send the testcase to the review thread for the original commit. llvm-svn: 289934
*	Extra coverage tests to demonstrate fixes in D72618 and D26855	Andrew V. Tischenko	2016-12-16	2	-0/+334
\| \| \| \|	llvm-svn: 289931
*	Revert r289638: [PowerPC] Fix logic dealing with nop after calls (and ↵	Chandler Carruth	2016-12-16	2	-133/+4
\| \| \| \| \| \| \| \| \| \| \| \| \|	tail-call eligibility) This patch appears to result in trampolines in vtables being miscompiled when they in turn tail call a method. I've posted some preliminary details about the failure on the thread for this commit and talked to Hal. He was comfortable going ahead and reverting until we sort out what is wrong. llvm-svn: 289928
*	Update .debug_line section version information to match DWARF version.	Ekaterina Romanova	2016-12-16	13	-31/+70
\| \| \| \| \| \| \| \| \| \| \| \|	One more attempt to re-commit the patch r285355, which I had to revert in r285362, because some tests were failing (the reason is because the size of the line_table varied depending on the full file name). In the past the compiler always emitted .debug_line version 2, though some opcodes from DWARF 3 (e.g. DW_LNS_set_prologue_end, DW_LNS_set_epilogue_begin or DW_LNS_set_isa) and from DWARF 4 could be emitted by the compiler. This patch changes version information of .debug_line to exactly match the DWARF version. For .debug_line version 4, a new field maximum_operations_per_instruction is emitted. Differential Revision: https://reviews.llvm.org/D16697 llvm-svn: 289925
*	Revert 279703, it caused PR31404.	Nico Weber	2016-12-16	2	-163/+6
\| \| \| \|	llvm-svn: 289923
*	[IR] Remove the DIExpression field from DIGlobalVariable.	Adrian Prantl	2016-12-16	166	-418/+566
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch implements PR31013 by introducing a DIGlobalVariableExpression that holds a pair of DIGlobalVariable and DIExpression. Currently, DIGlobalVariable holds a DIExpression. This is not the best way to model this: (1) The DIGlobalVariable should describe the source level variable, not how to get to its location. (2) It makes it unsafe/hard to update the expressions when we call replaceExpression on the DIGlobalVariable. (3) It makes it impossible to represent a global variable that is in more than one location (e.g., a variable with multiple DW_OP_LLVM_fragment-s). We also moved away from attaching the DIExpression to DILocalVariable for the same reasons. This reapplies r289902 with additional testcase upgrades. <rdar://problem/29250149> https://llvm.org/bugs/show_bug.cgi?id=31013 Differential Revision: https://reviews.llvm.org/D26769 llvm-svn: 289920
*	Revert patch series introducing the DAG combine to match a load-by-bytes	Chandler Carruth	2016-12-16	3	-1193/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	idiom. r289538: Match load by bytes idiom and fold it into a single load r289540: Fix a buildbot failure introduced by r289538 r289545: Use more detailed assertion messages in the code ... r289646: Add a couple of assertions to the load combine code ... This DAG combine has a bad crash in it that is quite hard to trigger sadly -- it relies on sneaking code with UB through the SDAG build and into this particular combine. I've responded to the original commit with a test case that reproduces it. However, the code also has other problems that will require substantial changes to address and so I'm going ahead and reverting it for now. This should unblock us and perhaps others that are hitting the crash in the wild and will let a fresh patch with updated approach come in cleanly afterward. Sorry for any trouble or disruption! llvm-svn: 289916
*	Revert "[IR] Remove the DIExpression field from DIGlobalVariable."	Adrian Prantl	2016-12-16	163	-561/+414
\| \| \| \| \| \|	This reverts commit 289902 while investigating bot breakage. llvm-svn: 289906
*	[IR] Remove the DIExpression field from DIGlobalVariable.	Adrian Prantl	2016-12-16	163	-414/+561
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch implements PR31013 by introducing a DIGlobalVariableExpression that holds a pair of DIGlobalVariable and DIExpression. Currently, DIGlobalVariable holds a DIExpression. This is not the best way to model this: (1) The DIGlobalVariable should describe the source level variable, not how to get to its location. (2) It makes it unsafe/hard to update the expressions when we call replaceExpression on the DIGlobalVariable. (3) It makes it impossible to represent a global variable that is in more than one location (e.g., a variable with multiple DW_OP_LLVM_fragment-s). We also moved away from attaching the DIExpression to DILocalVariable for the same reasons. <rdar://problem/29250149> https://llvm.org/bugs/show_bug.cgi?id=31013 Differential Revision: https://reviews.llvm.org/D26769 llvm-svn: 289902
*	[PPC] corrections in two testcases	Ehsan Amiri	2016-12-16	1	-14/+14
\| \| \| \| \| \| \| \| \|	Removing sensitivity to scheduling (by using CHECK-DAG instead of CHECK) and some other minor corrections. In preparation to commit Power9 processor model. llvm-svn: 289900
*	IPO: Introduce ThinLTOBitcodeWriter pass.	Peter Collingbourne	2016-12-16	6	-0/+159
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This pass prepares a module containing type metadata for ThinLTO by splitting it into regular and thin LTO parts if possible, and writing both parts to a multi-module bitcode file. Modules that do not contain type metadata are written unmodified as a single module. All globals with type metadata are added to the regular LTO module, and the rest are added to the thin LTO module. Differential Revision: https://reviews.llvm.org/D27324 llvm-svn: 289899
*	[ThinLTO] Thin link efficiency improvement: don't re-export globals (NFC)	Teresa Johnson	2016-12-15	2	-0/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: We were reinvoking exportGlobalInModule numerous times redundantly. No need to re-export globals referenced by a global that was already imported from its module. This resulted in a large speedup in the thin link for a big application, particularly when importing aggressiveness was cranked up. Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27687 llvm-svn: 289896
*	[SimplifyLibCalls] Add a test to make sure we lower fls(0) correctly.	Davide Italiano	2016-12-15	1	-0/+9
\| \| \| \|	llvm-svn: 289895
*	[SimplifyLibCalls] Lower fls() to llvm.ctlz().	Davide Italiano	2016-12-15	1	-0/+48
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D14590 llvm-svn: 289894
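The relationship between fls() and a count-leading-zeros operation, including the fls(0) case that the companion test commit checks, can be sketched as follows. This is an illustration under stated assumptions rather than the transform's actual code: `ctlz32` and `fls32` are hypothetical helper names, the width is fixed at 32 bits, and `ctlz32` is defined to return 32 for an input of 0, which mirrors llvm.ctlz when its "zero is undefined" flag is false.

```cpp
#include <cassert>
#include <cstdint>

// Count leading zeros of a 32-bit value; returns 32 for an input of 0.
inline int ctlz32(std::uint32_t x) {
  int n = 0;
  for (std::uint32_t mask = 0x80000000u; mask != 0 && (x & mask) == 0;
       mask >>= 1)
    ++n;
  return n;
}

// fls: 1-based index of the most significant set bit, or 0 if no bit is set.
inline int fls32(std::uint32_t x) { return 32 - ctlz32(x); }
```

With a well-defined result for zero, no separate branch is needed: `fls32(0)` falls out as 32 - 32 = 0.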