bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[SLPVectorizer][NFC] Make a loop more readable.	Clement Courbet	2018-02-07	1	-7/+5
\| \| \| \|	llvm-svn: 324482
*	Revert r324455 "[ThinLTO] - Simplify code in ThinLTOBitcodeWriter."	George Rimar	2018-02-07	1	-11/+37
\| \| \| \| \| \| \|	It broke BB: http://lab.llvm.org:8011/builders/sanitizer-windows/builds/23721 llvm-svn: 324458
*	[ThinLTO] - Simplify code in ThinLTOBitcodeWriter.	George Rimar	2018-02-07	1	-37/+11
\| \| \| \| \| \| \| \| \| \|	Recently introduced convertToDeclaration is very similar to code used in filterModule function. Patch reuses it to reduce duplication. Differential revision: https://reviews.llvm.org/D42971 llvm-svn: 324455
*	[LoopPrediction] Introduce utility function getLatchPredicateForGuard. NFC.	Serguei Katkov	2018-02-07	1	-17/+30
\| \| \| \| \| \| \|	Factor out getting the predicate for latch condition in a guard to utility function getLatchPredicateForGuard. llvm-svn: 324450
*	[DSE] Upgrade uses of MemoryIntrinsic::getAlignment() to new API. (NFC)	Daniel Neilson	2018-02-06	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This change is part of step five in the series of changes to remove alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the DeadStoreElimination pass to cease using the old getAlignment() API of MemoryIntrinsic in favour of getting dest specific alignments through the new API. Steps: Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. ( rL322965, rC322964, rL322963 ) Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. ( rL323597 ) Step 3) Update Clang to use the new IRBuilder API. ( rC323617 ) Step 4) Update Polly to use the new IRBuilder API. ( rL323618 ) Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use MemIntrinsicInst::[get\|set]Alignment() to use [get\|set]DestAlignment() and [get\|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278, rL324384, rL324395 ) Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get\|set]Alignment() methods. Reference http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html llvm-svn: 324402
*	[InferAddressSpaces] Update uses of IRBuilder memory intrinsic creation to new API	Daniel Neilson	2018-02-06	1	-5/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This change is part of step five in the series of changes to remove alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the InferAddressSpaces pass to cease using: 1) The old getAlignment() API of MemoryIntrinsic in favour of getting source & dest specific alignments through the new API. 2) The old IRBuilder CreateMemCpy/CreateMemMove single-alignment APIs in favour of the new API that allows setting source and destination alignments independently. Steps: Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. ( rL322965, rC322964, rL322963 ) Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. ( rL323597 ) Step 3) Update Clang to use the new IRBuilder API. ( rC323617 ) Step 4) Update Polly to use the new IRBuilder API. ( rL323618 ) Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use MemIntrinsicInst::[get\|set]Alignment() to use [get\|set]DestAlignment() and [get\|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278, rL324384 ) Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get\|set]Alignment() methods. Reference http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html llvm-svn: 324395
*	[InlineFunction] Update deprecated use of IRBuilder CreateMemCpy (NFC)	Daniel Neilson	2018-02-06	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This change is part of step five in the series of changes to remove alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the InlineFunction pass to cease using the old IRBuilder CreateMemCpy single-alignment API in favour of the new API that allows setting source and destination alignments independently. Steps: Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. ( rL322965, rC322964, rL322963 ) Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. ( rL323597 ) Step 3) Update Clang to use the new IRBuilder API. ( rC323617 ) Step 4) Update Polly to use the new IRBuilder API. ( rL323618 ) Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use MemIntrinsicInst::[get\|set]Alignment() to use [get\|set]DestAlignment() and [get\|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278 ) Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get\|set]Alignment() methods. Reference http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html llvm-svn: 324384
*	[MergeICmps] Handle chains with several complex BCE basic blocks.	Clement Courbet	2018-02-06	1	-3/+4
\| \| \| \| \| \| \| \| \| \|	- Fix condition for detecting that a complex basic block was the first in the chain. - Add tests. This was caught by buildbots when submitting rL324319. llvm-svn: 324341
*	[DeadArgumentElim] Set pointer to DISubprogram before calling RAUW. NFC	Petar Jovanovic	2018-02-06	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \|	It is better to update the pointer to the DISubprogram before we call RAUW for still-live arguments of the function, because with the change reviewed in D42541, RAUW compares DISubprograms rather than the functions themselves. Patch by Djordje Todorovic. Differential Revision: https://reviews.llvm.org/D42794 llvm-svn: 324335
*	[MergeICmps][NFC] Add more assertions.	Clement Courbet	2018-02-06	1	-0/+4
\| \| \| \|	llvm-svn: 324323
*	ThinLTOBitcodeWriter: Do not include module-level inline asm in the merged module.	Peter Collingbourne	2018-02-06	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \|	If the inline asm provides the definition of a symbol, this can result in duplicate symbol errors. Differential Revision: https://reviews.llvm.org/D42944 llvm-svn: 324313
*	[LoopStrengthReduce, x86] don't add cost for a cmp that will be macro-fused (PR35681)	Sanjay Patel	2018-02-05	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the motivating case from PR35681 and represented by the macro-fuse-cmp test: https://bugs.llvm.org/show_bug.cgi?id=35681 ...there's a 37 -> 31 byte size win for the loop because we eliminate the big base address offsets. SPEC2017 on Ryzen shows no significant perf difference. Differential Revision: https://reviews.llvm.org/D42607 llvm-svn: 324289
*	[LowerMemIntrinsics] Update uses of deprecated MemIntrinsic::getAlignment API (NFC)	Daniel Neilson	2018-02-05	1	-7/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This change is part of step five in the series of changes to remove alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the LowerMemIntrinsics pass to cease using the old getAlignment() API of MemoryIntrinsic in favour of getting source & dest specific alignments through the new API. Steps: Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. ( rL322965, rC322964, rL322963 ) Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. ( rL323597 ) Step 3) Update Clang to use the new IRBuilder API. ( rC323617 ) Step 4) Update Polly to use the new IRBuilder API. ( rL323618 ) Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use MemIntrinsicInst::[get\|set]Alignment() to use [get\|set]DestAlignment() and [get\|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273 ) Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get\|set]Alignment() methods. Reference http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html llvm-svn: 324278
*	[InstCombine] don't try to evaluate instructions with >1 use (revert r324014)	Sanjay Patel	2018-02-05	1	-17/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This example causes a compile-time explosion: define i16 @foo(i16 %in) { %x = zext i16 %in to i32 %a1 = mul i32 %x, %x %a2 = mul i32 %a1, %a1 %a3 = mul i32 %a2, %a2 %a4 = mul i32 %a3, %a3 %a5 = mul i32 %a4, %a4 %a6 = mul i32 %a5, %a5 %a7 = mul i32 %a6, %a6 %a8 = mul i32 %a7, %a7 %a9 = mul i32 %a8, %a8 %a10 = mul i32 %a9, %a9 %a11 = mul i32 %a10, %a10 %a12 = mul i32 %a11, %a11 %a13 = mul i32 %a12, %a12 %a14 = mul i32 %a13, %a13 %a15 = mul i32 %a14, %a14 %a16 = mul i32 %a15, %a15 %a17 = mul i32 %a16, %a16 %a18 = mul i32 %a17, %a17 %a19 = mul i32 %a18, %a18 %a20 = mul i32 %a19, %a19 %a21 = mul i32 %a20, %a20 %a22 = mul i32 %a21, %a21 %a23 = mul i32 %a22, %a22 %a24 = mul i32 %a23, %a23 %T = trunc i32 %a24 to i16 ret i16 %T } llvm-svn: 324276
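The growth behind the compile-time explosion in r324276 can be modelled with a small sketch. This is not InstCombine's code; `countNaiveVisits` is a hypothetical helper that assumes a recursive walk with no visited set, where each `mul` names the previous value as both operands and the walk therefore descends into the same sub-chain twice per level.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the chain in the commit message: value k is
// "mul %a(k-1), %a(k-1)" and value 0 is the leaf (%x).
// A walker that keeps no visited set recurses into both operands,
// even though they are the same value.
std::uint64_t countNaiveVisits(unsigned depth) {
  if (depth == 0)
    return 1; // the leaf is visited once per path that reaches it
  // One visit for this node, plus a full walk for each of its two operands.
  return 1 + 2 * countNaiveVisits(depth - 1);
}
```

Under this model the number of visits is 2^(depth+1) - 1, so the 24 multiplies in the example correspond to roughly 33.5 million visits, and every additional multiply doubles that.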
*	[SimplifyLibCalls] Update from deprecated IRBuilder API for creating memory intrinsics (NFC)	Daniel Neilson	2018-02-05	1	-25/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This change is part of step five in the series of changes to remove alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the SimplifyLibCalls pass to cease using the old IRBuilder CreateMemCpy/CreateMemMove single-alignment APIs in favour of the new API that allows setting source and destination alignments independently. Steps: Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. ( rL322965, rC322964, rL322963 ) Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. ( rL323597 ) Step 3) Update Clang to use the new IRBuilder API. ( rC323617 ) Step 4) Update Polly to use the new IRBuilder API. ( rL323618 ) Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use MemIntrinsicInst::[get\|set]Alignment() to use [get\|set]DestAlignment() and [get\|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148 ) Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get\|set]Alignment() methods. Reference http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html llvm-svn: 324273
*	[InstCombine] add unsigned saturation subtraction canonicalizations	Sanjay Patel	2018-02-05	1	-1/+56
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is the instcombine part of unsigned saturation canonicalization. Backend patches already committed: https://reviews.llvm.org/D37510 https://reviews.llvm.org/D37534 It converts unsigned saturated subtraction patterns to forms recognized by the backend: (a > b) ? a - b : 0 -> ((a > b) ? a : b) - b; (b < a) ? a - b : 0 -> ((a > b) ? a : b) - b; (b > a) ? 0 : a - b -> ((a > b) ? a : b) - b; (a < b) ? 0 : a - b -> ((a > b) ? a : b) - b; (a > b) ? b - a : 0 -> -(((a > b) ? a : b) - b); (b < a) ? b - a : 0 -> -(((a > b) ? a : b) - b); (b > a) ? 0 : b - a -> -(((a > b) ? a : b) - b); (a < b) ? 0 : b - a -> -(((a > b) ? a : b) - b). Patch by Yulia Koval! Differential Revision: https://reviews.llvm.org/D41480 llvm-svn: 324255
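The equivalence behind r324255 can be checked exhaustively on a small type. The sketch below is plain C++ rather than LLVM IR, and the function names are hypothetical; it only compares the select form `(a > b) ? a - b : 0` with the max-based form `max(a, b) - b` for every pair of 8-bit unsigned values.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Source form: unsigned saturating subtraction, clamped at zero.
std::uint8_t usubSatSelect(std::uint8_t a, std::uint8_t b) {
  return a > b ? static_cast<std::uint8_t>(a - b) : std::uint8_t{0};
}

// Canonical form from the commit message: max(a, b) - b.
std::uint8_t usubSatMax(std::uint8_t a, std::uint8_t b) {
  return static_cast<std::uint8_t>(std::max(a, b) - b);
}

// Compare the two forms over all 8-bit input pairs.
bool formsAgree() {
  for (unsigned a = 0; a < 256; ++a)
    for (unsigned b = 0; b < 256; ++b) {
      auto x = static_cast<std::uint8_t>(a);
      auto y = static_cast<std::uint8_t>(b);
      if (usubSatSelect(x, y) != usubSatMax(x, y))
        return false;
    }
  return true;
}
```

When `a <= b` the maximum is `b`, so the difference is zero; otherwise it is `a - b`. The negated patterns in the commit message follow by negating both sides.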
*	LTO: Include dso-local bit in ThinLTO cache key.	Peter Collingbourne	2018-02-05	1	-10/+2
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D42713 llvm-svn: 324253
*	[InstCombine] only allow narrow/wide evaluation of values with >1 use if that user is a binop	Sanjay Patel	2018-02-05	1	-4/+6
\| \| \| \| \| \| \| \| \| \| \|	There was a logic hole in D42739 / rL324014 because we're not accounting for select and phi instructions that might have repeated operands. This is likely a source of an infinite loop. I haven't manufactured a test case to prove that, but it should be safe to speculatively limit this transform to binops while we try to create that test. llvm-svn: 324252
*	Revert r323472 "[Debug] Add dbg.value intrinsics for PHIs created during LCSSA."	Hans Wennborg	2018-02-05	1	-7/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This broke the Chromium build; see PR36238. > This patch is an enhancement to propagate dbg.value information when > Phis are created on behalf of LCSSA. I noticed a case where a value > carried across a loop was reported as <optimized out>. > > Specifically this case: > > int bar(int x, int y) { > return x + y; > } > > int foo(int size) { > int val = 0; > for (int i = 0; i < size; ++i) { > val = bar(val, i); // Both val and i are correct > } > return val; // <optimized out> > } > > In the above case, after all of the interesting computation completes > our value is reported as "optimized out." This change will add a > dbg.value to correct this. > > This patch also moves the dbg.value insertion routine from > LoopRotation.cpp into Local.cpp, so that we can share it in both places > (LoopRotation and LCSSA). > > Patch by Matt Davis! > > Differential Revision: https://reviews.llvm.org/D42551 llvm-svn: 324247
*	[ThinLTO] Convert dead alias to declarations	Teresa Johnson	2018-02-05	1	-12/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This complements the fixes in r323633 and r324075 which drop the definitions of dead functions and variables, respectively. Fixes PR36208. Reviewers: grimar, rafael Subscribers: mehdi_amini, llvm-commits, inglorion Differential Revision: https://reviews.llvm.org/D42856 llvm-svn: 324242
*	Revert [SimplifyCFG] Relax restriction for folding unconditional branches	Serguei Katkov	2018-02-05	1	-4/+1
\| \| \| \| \| \| \| \| \| \| \|	The patch causes the failure of the test compiler-rt/test/profile/Linux/counter_promo_nest.c To unblock buildbot, revert the patch while investigation is in progress. Differential Revision: https://reviews.llvm.org/D42691 llvm-svn: 324214
*	[SimplifyCFG] Relax restriction for folding unconditional branches	Serguei Katkov	2018-02-05	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The commit rL308422 introduces a restriction for folding unconditional branches. Specifically, if an empty block with an unconditional branch leads to the header of a loop, then elimination of this basic block is prohibited. However, this condition seems unnecessarily strict. If elimination of this basic block does not introduce more back edges, then we can eliminate this block. The patch implements this relaxation of the restriction. Reviewers: efriedma, mcrosier, pacxx, hsung, davidxl Reviewed By: pacxx Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42691 llvm-svn: 324208
*	Re-apply [SCEV] Fix isLoopEntryGuardedByCond usage	Serguei Katkov	2018-02-05	1	-8/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ScalarEvolution::isKnownPredicate invokes isLoopEntryGuardedByCond without checking that the SCEV is available at the entry point of the loop. This is incorrect and is fixed by this patch. Two bugs are additionally fixed: the assert is moved after the check that the loop is not a nullptr, and the usage of isLoopEntryGuardedByCond in ScalarEvolution::isImpliedCondOperandsViaNoOverflow is now guarded by isAvailableAtLoopEntry. Reviewers: sanjoy, mkazantsev, anna, dorit, reames Reviewed By: mkazantsev Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42417 llvm-svn: 324204
*	[InlineFunction] Set arg attrs even if there only are VarArg attrs.	Florian Hahn	2018-02-04	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	When using the partial inliner, we might have attributes for forwarded varargs, but the CodeExtractor does not create an empty argument attribute set for regular arguments in that case, because it does not know of the additional arguments. So in case we have attributes for VarArgs, we also have to make sure we create (empty) attributes for all regular arguments. This fixes PR36210. llvm-svn: 324197
*	[LV] Use Demanded Bits and ValueTracking for reduction type-shrinking	Chad Rosier	2018-02-04	2	-76/+158
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The type-shrinking logic in reduction detection, although narrow in scope, is also rather ad-hoc, which has led to bugs (e.g., PR35734). This patch modifies the approach to rely on the demanded bits and value tracking analyses, if available. We currently perform type-shrinking separately for reductions and other instructions in the loop. Long-term, we should probably think about computing minimal bit widths in a more complete way for the loops we want to vectorize. PR35734 Differential Revision: https://reviews.llvm.org/D42309 llvm-svn: 324195
*	[InstCombine] Allow common type conversions to i8/i16/i32	David Green	2018-02-03	1	-1/+9
\| \| \| \| \| \| \| \| \| \| \|	This, in instcombine, allows conversions to i8/i16/i32 (very common cases) even if the resulting type is not legal according to the data layout. This can often open up extra combine opportunities. Differential Revision: https://reviews.llvm.org/D42424 llvm-svn: 324174
*	[InstCombine] Use getDestAlignment in SimplifyMemSet (NFC)	Daniel Neilson	2018-02-02	1	-2/+2
\| \| \| \| \| \| \| \|	Summary: Small NFC change to change the name of the function used getting and setting the alignment of a memset. llvm-svn: 324148
*	[InstCombine] simplify logic for swapMayExposeCSEOpportunities; NFCI	Sanjay Patel	2018-02-02	1	-23/+9
\| \| \| \|	llvm-svn: 324122
*	[InstCombine] fix typos, formatting; NFC	Sanjay Patel	2018-02-02	1	-7/+6
\| \| \| \|	llvm-svn: 324118
*	[ThinLTO] - Add comment. NFC.	George Rimar	2018-02-02	1	-0/+2
\| \| \| \| \| \|	Was requested during review of D42798. llvm-svn: 324095
*	[GlobalOpt] Include padding in debug fragments	Mikael Holmen	2018-02-02	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: When creating the debug fragments for a SRA'd variable, use the types' allocation sizes. This fixes issues where the pass would emit too small fragments, placed at the wrong offset, for padded types. An example of this is long double on x86. The type is represented using x86_fp80, which is 10 bytes, but the value is aligned to 12/16 bytes. The padding is included in the type's DW_AT_byte_size attribute; therefore, the fragments should also include that. Newer GCC releases (I tested 7.2.0) emit 12/16-byte pieces for long double. Earlier releases, e.g. GCC 5.5.0, behaved as LLVM did, i.e. by emitting a 10-byte piece, followed by an empty 2/6-byte piece for the padding. Failing to cover all `DW_AT_byte_size' bytes of a value with non-empty pieces results in the value being printed as <optimized out> by GDB. Patch by: David Stenberg Reviewers: aprantl, JDevlieghere Reviewed By: aprantl, JDevlieghere Subscribers: llvm-commits Tags: #debug-info Differential Revision: https://reviews.llvm.org/D42807 llvm-svn: 324066
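The size arithmetic described in r324066 can be sketched as follows. This is a simplified model, not GlobalOpt's code: the helper names are hypothetical, and the alignments of 4 and 16 bytes are assumptions chosen to match the "12/16 bytes" quoted in the message for a 10-byte x86_fp80.

```cpp
#include <cassert>
#include <cstdint>

// Round a type's store size up to its ABI alignment. This is the usual
// definition of an allocation size; the helper name is hypothetical.
constexpr std::uint64_t allocSizeBytes(std::uint64_t storeBytes,
                                       std::uint64_t abiAlignBytes) {
  return (storeBytes + abiAlignBytes - 1) / abiAlignBytes * abiAlignBytes;
}

// Byte offset of the element at `index` when consecutive fragments are
// laid out by allocation size, i.e. with the padding included.
constexpr std::uint64_t fragmentOffsetBytes(unsigned index,
                                            std::uint64_t storeBytes,
                                            std::uint64_t abiAlignBytes) {
  return index * allocSizeBytes(storeBytes, abiAlignBytes);
}
```

With 10-byte elements, using the store size instead would place the second fragment at offset 10, while the padded layout places it at 12 or 16, which is the wrong-offset problem the commit describes.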
*	Add missing includes	David Blaikie	2018-02-02	1	-0/+3
\| \| \| \|	llvm-svn: 324040
*	[InstCombine] allow multi-use values in canEvaluate* if all uses are in 1 inst	Sanjay Patel	2018-02-01	1	-5/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is the enhancement suggested in D42536 to fix a shortcoming in regular InstCombine's canEvaluate* functionality. When we have multiple uses of a value, but they're all in one instruction, we can allow that expression to be narrowed or widened for the same cost as a single-use value. AFAICT, this can only matter for multiply: sub/and/or/xor/select would be simplified away if the operands are the same value; add becomes shl; shifts with a variable shift amount aren't handled. Differential Revision: https://reviews.llvm.org/D42739 llvm-svn: 324014
*	Revert commit rL323951	David Green	2018-02-01	1	-6/+2
\| \| \| \| \| \| \|	Looks like it's causing timeouts on at least the ppc64le buildbots. llvm-svn: 323959
*	[InstCombine] Allow common type conversions to i8/i16/i32	David Green	2018-02-01	1	-2/+6
\| \| \| \| \| \| \| \| \| \| \|	This, in instcombine, allows conversions to i8/i16/i32 (very common cases) even if the resulting type is not legal according to the data layout. This can often open up extra combine opportunities. Differential Revision: https://reviews.llvm.org/D42424 llvm-svn: 323951
*	[LSR] Don't force bases of foldable formulae to the final type.	Mikael Holmen	2018-02-01	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Before emitting code for scaled registers, we prevent SCEVExpander from hoisting any scaled addressing mode by emitting all the bases first. However, these bases are being forced to the final type, resulting in some odd code. For example, if the type of the base is an integer and the final type is a pointer, we will emit an inttoptr for the base, a ptrtoint for the scale, and then a 'reverse' GEP where the GEP pointer is actually the base integer and the index is the pointer. It's more intuitive to use the pointer as a pointer and the integer as index. Patch by: Bevin Hansson Reviewers: atrick, qcolombet, sanjoy Reviewed By: qcolombet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42103 llvm-svn: 323946
*	[GlobalOpt] Improve common case efficiency of static global initializer evaluation	Amara Emerson	2018-01-31	1	-2/+126
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For very, very large global initializers which can be statically evaluated, the code would create vectors of temporary Constants, modifying them in place, before committing the resulting Constant aggregate to the global's initializer value. This had effectively O(n^2) complexity in the size of the global initializer and would cause memory and non-termination issues compiling some workloads. This change performs the static initializer evaluation and creation in batches, once for each global in the evaluated IR memory. The existing code is maintained as a last resort when the initializers are more complex than simple values in a large aggregate. This should theoretically be NFC, no test as the example case is massive. The existing test cases pass with this, as well as the llvm test suite. To give an example, consider the following C++ code adapted from the clang regression tests: struct S { int n = 10; int m = 2 * n; S(int a) : n(a) {} }; template<typename T> struct U { T *r = &q; T q = 42; U *p = this; }; U<S> e; The global static constructor for 'e' will need to initialize 'r' and 'p' of the outer struct, while also initializing the 'n' and 'm' members of the inner struct 'q'. This batch algorithm will simply use the general CommitValueTo() method to handle the complex nested S struct initialization of 'q', before processing the outermost members in a single batch. Using CommitValueTo() to handle members in the outer struct is inefficient when the struct/array is very large, as we end up creating and destroying constant arrays for each initialization. For the above case, we expect the following IR to be generated: %struct.U = type { %struct.S*, %struct.S, %struct.U* } %struct.S = type { i32, i32 } @e = global %struct.U { %struct.S* gep inbounds (%struct.U, %struct.U* @e, i64 0, i32 1), %struct.S { i32 42, i32 84 }, %struct.U* @e } The %struct.S { i32 42, i32 84 } inner initializer is treated as a complex constant expression, while the other two elements of @e are "simple". Differential Revision: https://reviews.llvm.org/D42612 llvm-svn: 323933
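The cost difference that r323933 describes can be illustrated with a toy model. Nothing here is LLVM API: `Aggregate` is a stand-in for an immutable constant aggregate, and both function names are hypothetical. The sketch assumes that changing one element of such a constant means rebuilding the whole aggregate, which is what makes one-at-a-time commits quadratic.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Stand-in for an immutable constant aggregate (not LLVM's Constant).
using Aggregate = std::vector<int>;
using Store = std::pair<std::size_t, int>; // (element index, new value)

// One-at-a-time commit: every store rebuilds the full aggregate.
// Returns the number of element copies performed.
std::size_t commitOneByOne(Aggregate &init, const std::vector<Store> &stores) {
  std::size_t copies = 0;
  for (const Store &s : stores) {
    Aggregate rebuilt(init); // copies all n elements
    copies += rebuilt.size();
    rebuilt[s.first] = s.second;
    init = std::move(rebuilt);
  }
  return copies;
}

// Batched commit: apply all simple stores to one mutable buffer,
// then replace the aggregate once.
std::size_t commitBatched(Aggregate &init, const std::vector<Store> &stores) {
  Aggregate rebuilt(init); // single copy of n elements
  std::size_t copies = rebuilt.size();
  for (const Store &s : stores)
    rebuilt[s.first] = s.second;
  init = std::move(rebuilt);
  return copies;
}
```

For n stores into an n-element aggregate, the first function copies n * n elements and the second copies n, while both end with the same contents.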
*	Utils: Fix DomTree update for entry block	Matt Arsenault	2018-01-31	1	-5/+14
\| \| \| \| \| \| \|	If SplitBlockPredecessors was used on a function entry block, it wouldn't update the dominator tree. llvm-svn: 323928
*	[AggressiveInstCombine] Fixed TruncCombine class to handle TruncInst leaf node correctly.	Amjad Aboud	2018-01-31	1	-4/+12
\| \| \| \| \| \| \| \| \| \| \|	This covers the case where TruncInst leaf node is a constant expression. See PR36121 for more details. Differential Revision: https://reviews.llvm.org/D42622 llvm-svn: 323926
*	[GlobalOpt] Fix exponential compile-time with selects.	Eli Friedman	2018-01-31	1	-17/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If you have a long chain of select instructions created from something like `int* p = &g; if (foo()) p += 4; if (foo2()) p += 4;` etc., a naive recursive visitor will recursively visit each select twice, which is O(2^N) in the number of select instructions. Use the visited set to cut off recursion in this case. (No testcase because this doesn't actually change the behavior, just the time.) Differential Revision: https://reviews.llvm.org/D42451 llvm-svn: 323910
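The visited-set cutoff mentioned in r323910 can be sketched on a toy graph. This is not the GlobalOpt visitor: the node numbering and the function name are hypothetical, and the sketch uses an explicit worklist instead of recursion so that long chains do not overflow the stack. It assumes each select's two operands both lead back to the previous pointer value.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of the select chain: node i > 0 has two operands that
// both lead back to node i - 1 (the old pointer and the old pointer plus 4).
// Node 0 is the base pointer (&g).
// Returns how many nodes the walk actually processes.
std::size_t countVisitsWithVisitedSet(std::size_t chainLength) {
  std::vector<bool> visited(chainLength + 1, false);
  std::vector<std::size_t> worklist{chainLength};
  std::size_t processed = 0;
  while (!worklist.empty()) {
    std::size_t node = worklist.back();
    worklist.pop_back();
    if (visited[node])
      continue; // already handled: cut off the repeated walk
    visited[node] = true;
    ++processed;
    if (node > 0) {
      worklist.push_back(node - 1); // "true" operand
      worklist.push_back(node - 1); // "false" operand
    }
  }
  return processed;
}
```

Each node is processed once, so the work is linear in the chain length instead of O(2^N).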
*	AMDGPU: Add intrinsics llvm.amdgcn.cvt.{pknorm.i16, pknorm.u16, pk.i16, pk.u16}	Marek Olsak	2018-01-31	1	-0/+12
\| \| \| \| \| \| \| \| \| \|	Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D41663 llvm-svn: 323908
*	[SeparateConstOffsetFromGEP] Preserve metadata when splitting GEPs	Marek Olsak	2018-01-31	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: !amdgpu.uniform needs to be preserved for AMDGPU, otherwise bad things happen. Reviewers: arsenm, nhaehnle, jingyue, broune, majnemer, bjarke.roune, dblaikie Subscribers: wdng, tpr, llvm-commits Differential Revision: https://reviews.llvm.org/D42744 llvm-svn: 323907
*	[InstCombine] reduce code duplication for canEvaluate* functions; NFCI	Sanjay Patel	2018-01-31	1	-47/+43
\| \| \| \| \| \|	We'd have to make the change suggested in D42536 3x otherwise. llvm-svn: 323877
*	[AggressiveInstCombine] Make TruncCombine class ignore unreachable basic blocks.	Amjad Aboud	2018-01-31	3	-5/+19
\| \| \| \| \| \| \| \| \|	Because dead code may contain non-standard IR that causes infinite looping or crashes in underlying analysis. See PR36134 for more details. Differential Revision: https://reviews.llvm.org/D42683 llvm-svn: 323862
*	LTO: Drop comdats when converting definitions to declarations.	Peter Collingbourne	2018-01-31	1	-0/+2
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D42715 llvm-svn: 323844
*	Teach ValueMapper to use ODR uniqued types when available	Teresa Johnson	2018-01-30	1	-4/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is exposed during ThinLTO compilation, when we import an alias by creating a clone of the aliasee. Without this fix the debug type is unnecessarily cloned and we get a duplicate, undoing the uniquing. Fixes PR36089. Reviewers: mehdi_amini, pcc Subscribers: eraman, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D41669 llvm-svn: 323813
*	Re-commit : [PowerPC] Add handling for ColdCC calling convention and a pass to mark candidates with coldcc attribute.	Zaara Syeda	2018-01-30	1	-6/+158
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This recommits r322721 reverted due to sanitizer memory leak build bot failures. Original commit message: This patch adds support for the coldcc calling convention for Power. This changes the set of non-volatile registers. It includes a pass to stress test the implementation by marking all static directly called functions with the coldcc attribute through the option -enable-coldcc-stress-test. It also includes an option, -ppc-enable-coldcc, to add the coldcc attribute to functions which are cold at all call sites based on BlockFrequencyInfo when the containing function does not call any non cold functions. Differential Revision: https://reviews.llvm.org/D38413 llvm-svn: 323778
*	[RS4GC] Handle call/invoke instructions as base defining values of vectors	Daniel Neilson	2018-01-30	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: There's an asymmetry in the definitions of findBaseDefiningValueOfVector() and findBaseDefiningValue() of RS4GC. The latter handles call and invoke instructions, and the former does not. This appears to be a simple oversight. This patch remedies the oversight by adding the call and invoke cases to findBaseDefiningValueOfVector(). Reviewers: DaniilSuchkov, anna Reviewed By: anna Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42653 llvm-svn: 323764
*	[DSE] make sure memory is not modified before partial store merging (PR36129)	Sanjay Patel	2018-01-30	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \|	We missed a critical check in D30703. We must make sure that no intermediate store is sitting between the stores that we want to merge. This should fix: https://bugs.llvm.org/show_bug.cgi?id=36129 Differential Revision: https://reviews.llvm.org/D42663 llvm-svn: 323759
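One way the merge in r323759 can go wrong without the added check is sketched below. This is a byte-level simulation, not the DSE implementation; the values, offsets, and function names are made up for illustration, and little-endian byte order is assumed.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Memory = std::array<std::uint8_t, 4>;

// Write the low `size` bytes of `value` at `offset`, little-endian.
void store(Memory &mem, std::size_t offset, std::uint32_t value,
           std::size_t size) {
  for (std::size_t i = 0; i < size; ++i)
    mem[offset + i] = static_cast<std::uint8_t>(value >> (8 * i));
}

// Original order: wide store, an intermediate store, then a narrow store.
Memory runOriginal() {
  Memory mem{};
  store(mem, 0, 0xAABBCCDDu, 4); // earlier, wide store
  store(mem, 2, 0xFFu, 1);       // intermediate store to byte 2
  store(mem, 2, 0x11u, 1);       // later, narrow store to byte 2
  return mem;
}

// Narrow store folded into the wide one while ignoring the intermediate
// store, which is the situation the added check rules out.
Memory runMergedIgnoringIntermediate() {
  Memory mem{};
  store(mem, 0, 0xAA11CCDDu, 4); // wide store with byte 2 replaced by 0x11
  store(mem, 2, 0xFFu, 1);       // intermediate store now runs last
  return mem;
}
```

In the original order byte 2 ends up as 0x11, but after the unchecked merge the intermediate store runs last and leaves 0xFF, so the two programs disagree.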
*	[JumpThreading][NFC] Rename LoadInst variables	Brian M. Rzycki	2018-01-29	1	-43/+46
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The JumpThreading pass has several locations where the variable name LI refers to a LoadInst type. This is confusing and inhibits the ability to use LI for LoopInfo as a member of the JumpThreading class. Minor formatting and comments were also altered to reflect this change. Reviewers: dberlin, kuba, spop, sebpop Reviewed by: sebpop Subscribers: sebpop, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D42601 llvm-svn: 323695