bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[Debug] LCSSA: Insert dbg.value at the first available insertion point	Vedant Kumar	2018-01-25	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \|	Inserting a dbg.value instruction at the start of a basic block with a landingpad instruction triggers a verifier failure. We should be OK if we insert the instruction a bit later. Speculative fix for the bot failure described here: https://reviews.llvm.org/D42551 llvm-svn: 323482
*	[SyntheticCounts] Rewrite the code using only graph traits.	Easwaran Raman	2018-01-25	1	-6/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The intent of this is to allow the code to be used with ThinLTO. In the thin-link phase, a traditional call graph cannot be computed even though all the necessary information (nodes and edges of a call graph) is available. This is because the CallGraph class is closely tied to the IR. This patch first extends GraphTraits to add a CallGraphTraits graph. This is then used to implement a version of counts propagation on a generic call graph. Reviewers: davidxl Subscribers: mehdi_amini, tejohnson, llvm-commits Differential Revision: https://reviews.llvm.org/D42311 llvm-svn: 323475
*	[Debug] Add dbg.value intrinsics for PHIs created during LCSSA.	Vedant Kumar	2018-01-25	1	-2/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch is an enhancement to propagate dbg.value information when Phis are created on behalf of LCSSA. I noticed a case where a value carried across a loop was reported as <optimized out>. Specifically this case: int bar(int x, int y) { return x + y; } int foo(int size) { int val = 0; for (int i = 0; i < size; ++i) { val = bar(val, i); // Both val and i are correct } return val; // <optimized out> } In the above case, after all of the interesting computation completes our value is reported as "optimized out." This change will add a dbg.value to correct this. This patch also moves the dbg.value insertion routine from LoopRotation.cpp into Local.cpp, so that we can share it in both places (LoopRotation and LCSSA). Patch by Matt Davis! Differential Revision: https://reviews.llvm.org/D42551 llvm-svn: 323472
*	[Debug] Add a utility to propagate dbg.value to new PHIs, NFC	Vedant Kumar	2018-01-25	2	-33/+39
\| \| \| \| \| \| \| \| \| \|	This simply moves an existing utility to Utils for reuse. Split out of: https://reviews.llvm.org/D42551 Patch by Matt Davis! llvm-svn: 323471
*	[asan] Fix kernel callback naming in instrumentation module.	Evgeniy Stepanov	2018-01-25	1	-3/+1
\| \| \| \| \| \| \| \| \| \|	Right now clang uses "_n" suffix for some user space callbacks and "N" for the matching kernel ones. There's no need for this and it actually breaks kernel build with inline instrumentation. Use the same callback names for user space and the kernel (and also make them consistent with the names GCC uses). Patch by Andrey Konovalov. Differential Revision: https://reviews.llvm.org/D42423 llvm-svn: 323470
*	Re-land "[ThinLTO] Add call edges' relative block frequency to per-module ↵	Easwaran Raman	2018-01-25	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	summary." It was reverted after buildbot regressions. Original commit message: This allows relative block frequency of call edges to be passed to the thinlink stage where it will be used to compute synthetic entry counts of functions. llvm-svn: 323460
*	Revert "[SLP] Fix for PR32086: Count InsertElementInstr of the same elements ↵	Alexey Bataev	2018-01-25	1	-350/+129
\| \| \| \| \| \| \| \|	as shuffle." This reverts commit r323441 to fix buildbots. llvm-svn: 323447
*	[SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle.	Alexey Bataev	2018-01-25	1	-129/+350
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: If the same value is going to be vectorized several times in the same tree entry, this entry is considered to be a gather entry, and the cost of this gather is counted as the cost of InsertElementInstrs for each gathered value. But we can consider these elements as a ShuffleInstr with the SK_PermuteSingle shuffle kind. Reviewers: spatel, RKSimon, mkuper, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38697 llvm-svn: 323441
*	[InstCombine] narrow masked zexted binops (PR35792)	Sanjay Patel	2018-01-25	2	-0/+71
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is guarded by shouldChangeType(), so the tests show that we don't do the fold if the narrower type is not legal. Note that there is a proposal (D42424) that would change the results for the specific cases shown in these tests. That difference is also discussed in PR35792: https://bugs.llvm.org/show_bug.cgi?id=35792 Alive proofs for the cases handled here as well as the bitwise logic binops that we should already do better on: https://rise4fun.com/Alive/c97 https://rise4fun.com/Alive/Lc5E https://rise4fun.com/Alive/kdf llvm-svn: 323437
*	Revert "[SLP] Fix for PR32086: Count InsertElementInstr of the same elements ↵	Alexey Bataev	2018-01-25	1	-352/+130
\| \| \| \| \| \| \| \|	as shuffle." This reverts commit r323430 to fix buildbots. llvm-svn: 323432
*	[SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle.	Alexey Bataev	2018-01-25	1	-130/+352
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: If the same value is going to be vectorized several times in the same tree entry, this entry is considered to be a gather entry, and the cost of this gather is counted as the cost of InsertElementInstrs for each gathered value. But we can consider these elements as a ShuffleInstr with the SK_PermuteSingle shuffle kind. Reviewers: spatel, RKSimon, mkuper, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38697 llvm-svn: 323430
*	Another try to commit 323321 (aggressive instruction combine).	Amjad Aboud	2018-01-25	10	-3/+674
\| \| \| \|	llvm-svn: 323416
*	[GlobalOpt] Emit fragments using field offsets from struct layout	Mikael Holmen	2018-01-25	1	-4/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: When creating the debug fragments for a SRA'd struct, use the fields' offsets, taken from the struct layout, as the offsets for the resulting fragments. This fixes an issue where GlobalOpt would emit fragments with incorrect offsets for padded fields. This should solve PR36016. Patch by David Stenberg. Reviewers: aprantl Reviewed By: aprantl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42489 llvm-svn: 323411
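The padding problem described above can be illustrated outside LLVM. The following is a minimal sketch, not the GlobalOpt code: it uses Python's `ctypes` to stand in for a struct layout, and the `Padded` struct and both helper functions are hypothetical names introduced only for this illustration. It shows how offsets obtained by summing field sizes diverge from the offsets recorded in the layout once a field is padded.

```python
import ctypes


class Padded(ctypes.Structure):
    # Hypothetical struct: a 1-byte field followed by a 4-byte field, so
    # alignment padding is inserted before `b` on typical platforms.
    _fields_ = [("a", ctypes.c_uint8), ("b", ctypes.c_uint32)]


def naive_fragment_offsets(struct_type):
    """Bit offsets obtained by summing field sizes, ignoring padding."""
    offsets, running_bytes = {}, 0
    for name, field_type in struct_type._fields_:
        offsets[name] = running_bytes * 8
        running_bytes += ctypes.sizeof(field_type)
    return offsets


def layout_fragment_offsets(struct_type):
    """Bit offsets read from the struct layout, which accounts for padding."""
    return {
        name: getattr(struct_type, name).offset * 8
        for name, _ in struct_type._fields_
    }


if __name__ == "__main__":
    print("naive :", naive_fragment_offsets(Padded))
    print("layout:", layout_fragment_offsets(Padded))
```

Only the layout-derived offsets describe where `b` actually lives, which is the property the commit relies on when it takes fragment offsets from the struct layout.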
*	Revert "[SLP] Fix for PR32086: Count InsertElementInstr of the same elements ↵	Alexey Bataev	2018-01-24	1	-346/+122
\| \| \| \| \| \| \| \|	as shuffle." This reverts commit r323348 because of the broken buildbots. llvm-svn: 323359
*	Revert r321751, "StructurizeCFG: Fix broken backedge detection"	Nicolai Haehnle	2018-01-24	1	-28/+82
\| \| \| \| \| \| \| \| \| \| \|	It causes regressions in various OpenGL test suites. Keep the test cases introduced by r321751 as XFAIL, and add a test case for the regression. Change-Id: I90b4cc354f68cebe5fcef1f2422dc8fe1c6d3514 Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=36015 llvm-svn: 323355
*	[SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle.	Alexey Bataev	2018-01-24	1	-122/+346
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: If the same value is going to be vectorized several times in the same tree entry, this entry is considered to be a gather entry, and the cost of this gather is counted as the cost of InsertElementInstrs for each gathered value. But we can consider these elements as a ShuffleInstr with the SK_PermuteSingle shuffle kind. Reviewers: spatel, RKSimon, mkuper, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38697 llvm-svn: 323348
*	Reverted 323321.	Amjad Aboud	2018-01-24	10	-671/+3
\| \| \| \|	llvm-svn: 323326
*	[InstCombine] Introducing Aggressive Instruction Combine pass ↵	Amjad Aboud	2018-01-24	10	-3/+671
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	(-aggressive-instcombine). Combine expression patterns to form expressions with fewer, simpler instructions. This pass does not modify the CFG. For example, this pass reduces the width of expressions post-dominated by a TruncInst to a smaller width when applicable. It differs from the instcombine pass in that it contains pattern optimizations that require higher than O(1) complexity; thus, it should run fewer times than the instcombine pass. Differential Revision: https://reviews.llvm.org/D38313 llvm-svn: 323321
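The width-reduction idea can be checked with plain integer arithmetic. This is a minimal sketch, not the pass itself: it assumes a single `add` whose `i8` operands are zero-extended to `i32` and whose result is truncated back to `i8`, and it verifies exhaustively that doing the add directly in `i8` gives the same value. The function names are hypothetical.

```python
from itertools import product


def wide_then_trunc(a, b):
    """zext both i8 operands to i32, add in i32, then trunc the result to i8."""
    wide_sum = (a + b) & 0xFFFFFFFF
    return wide_sum & 0xFF


def narrow_add(a, b):
    """The same add performed directly in i8 (wrapping modulo 2**8)."""
    return (a + b) & 0xFF


def narrowing_is_sound():
    """Compare both forms for every pair of i8 inputs."""
    return all(
        wide_then_trunc(a, b) == narrow_add(a, b)
        for a, b in product(range(256), repeat=2)
    )


if __name__ == "__main__":
    print("narrowing preserves the truncated result:", narrowing_is_sound())
```

The check passes because the low 8 bits of a sum depend only on the low 8 bits of its operands, so the wider computation is unnecessary once its result is truncated.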
*	[NFC] Remove overconfident assert from IRCE	Max Kazantsev	2018-01-24	1	-2/+0
\| \| \| \| \| \| \| \| \|	This patch removes an assert that SCEV is able to prove that a value is non-negative. In fact, SCEV can sometimes be unable to do this because its cache does not update properly. This assert will be restored once this problem is resolved. llvm-svn: 323309
*	BlockExtractor: Remove unused variable. NFC.	Volkan Keles	2018-01-23	1	-1/+1
\| \| \| \|	llvm-svn: 323271
*	[llvm-extract] Support extracting basic blocks	Volkan Keles	2018-01-23	4	-153/+176
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Currently, there is no way to extract a basic block from a function easily. This patch extends llvm-extract to extract the specified basic block(s). Reviewers: loladiro, rafael, bogner Reviewed By: bogner Subscribers: hintonda, mgorny, qcolombet, llvm-commits Differential Revision: https://reviews.llvm.org/D41638 llvm-svn: 323266
*	Revert "[SLP] Fix for PR32086: Count InsertElementInstr of the same elements ↵	Alexey Bataev	2018-01-23	1	-343/+121
\| \| \| \| \| \| \| \|	as shuffle." This reverts commit r323246 because of the broken buildbots. llvm-svn: 323252
*	[SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle.	Alexey Bataev	2018-01-23	1	-121/+343
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: If the same value is going to be vectorized several times in the same tree entry, this entry is considered to be a gather entry, and the cost of this gather is counted as the cost of InsertElementInstrs for each gathered value. But we can consider these elements as a ShuffleInstr with the SK_PermuteSingle shuffle kind. Reviewers: spatel, RKSimon, mkuper, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38697 llvm-svn: 323246
*	This change adds optimization remarks to the LoopVersioning LICM pass.	Ashutosh Nema	2018-01-23	1	-2/+55
\| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch adds remark messages to the LoopVersioning LICM pass, which will be useful for the optimization remark emitter (ORE) infrastructure. Patch by: Deepak Porwal Reviewers: anemet, ashutosh.nema, eastig Subscribers: eastig, vivekvpandya, fhahn, llvm-commits llvm-svn: 323183
*	asan: allow inline instrumentation for the kernel	Dmitry Vyukov	2018-01-22	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \|	Currently ASan instrumentation pass forces callback instrumentation when applied to the kernel. This patch changes the current behavior to allow using inline instrumentation in this case. Authored by andreyknvl. Reviewed in: https://reviews.llvm.org/D42384 llvm-svn: 323140
*	[ThinLTO] Re-commit of dot dumper after test fix	Eugene Leviant	2018-01-22	3	-4/+4
\| \| \| \|	llvm-svn: 323116
*	Revert [SCEV] Fix isLoopEntryGuardedByCond usage	Serguei Katkov	2018-01-22	1	-11/+8
\| \| \| \| \| \| \|	It causes buildbot failures: the newly added assert fires. It seems that not all usages of isLoopEntryGuardedByCond are fixed. llvm-svn: 323079
*	[SCEV] Fix isLoopEntryGuardedByCond usage	Serguei Katkov	2018-01-22	1	-8/+11
\| \| \| \| \| \| \| \| \| \| \| \|	ScalarEvolution::isKnownPredicate invokes isLoopEntryGuardedByCond without checking that the SCEV is available at the entry point of the loop. This is incorrect and is fixed by this patch. Reviewers: sanjoy, mkazantsev, anna, dorit Reviewed By: mkazantsev Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42165 llvm-svn: 323077
*	[InstCombine] (X << Y) / X -> 1 << Y	Sanjay Patel	2018-01-21	1	-7/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	...when the shift is known to not overflow with the matching signed-ness of the division. This closes an optimization gap caused by canonicalizing mul by power-of-2 to shl as shown in PR35709: https://bugs.llvm.org/show_bug.cgi?id=35709 Patch by Anton Bikineev! Differential Revision: https://reviews.llvm.org/D42032 llvm-svn: 323068
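The no-overflow condition on this fold can be checked by brute force. The sketch below is an illustration under stated assumptions, not the InstCombine code: it models only the unsigned case (`udiv` with a shift that does not wrap) at a width of 8 bits, and the function names are hypothetical.

```python
BITS = 8
MASK = (1 << BITS) - 1


def fold_holds_without_overflow():
    """Check (x << y) / x == 1 << y for all 8-bit x != 0 when x << y does not wrap."""
    for x in range(1, MASK + 1):
        for y in range(BITS):
            if (x << y) > MASK:
                continue  # the shift would overflow, so the fold does not apply
            if (x << y) // x != (1 << y):
                return False
    return True


def overflow_counterexample():
    """Show what goes wrong if the shift is allowed to wrap."""
    x, y = 0x81, 1
    shifted = (x << y) & MASK  # 0x102 wraps to 0x02 in 8 bits
    return shifted // x, 1 << y


if __name__ == "__main__":
    print("fold valid without overflow:", fold_holds_without_overflow())
    print("with overflow (original, folded):", overflow_counterexample())
```

The counterexample shows why the fold is guarded: once the shift wraps, the division no longer recovers the shifted factor.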
*	Temporarily revert r323062 to investigate buildbot failures	Eugene Leviant	2018-01-21	3	-4/+4
\| \| \| \|	llvm-svn: 323065
*	[ThinLTO] Implement summary visualizer	Eugene Leviant	2018-01-21	3	-4/+4
\| \| \| \| \| \|	Differential revision: https://reviews.llvm.org/D41297 llvm-svn: 323062
*	[DSE] Factor out common code [NFC]	Philip Reames	2018-01-21	1	-37/+27
\| \| \| \| \| \|	We already had the pointer being stored to in the MemLoc, reuse that code. In merging cases, it turned out the interface of the getLocForWrite had become inconsistent with other related utilities. Fix that by making sure the input passes hasAnalyzableWrite as well. llvm-svn: 323056
*	[DSE] Minor rename for clarity sake [NFC]	Philip Reames	2018-01-21	1	-8/+14
\| \| \| \|	llvm-svn: 323055
*	[ObjCARC] Do not turn a call to @objc_autoreleaseReturnValue into a call	Akira Hatanaka	2018-01-19	3	-1/+35
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	to @objc_autorelease if its operand is a PHI and the PHI has an equivalent value that is used by a return instruction. For example, ARC optimizer shouldn't replace the call in the following example, as doing so breaks the AutoreleaseRV/RetainRV optimization: %v1 = bitcast i32* %v0 to i8* br label %bb3 bb2: %v3 = bitcast i32* %v2 to i8* br label %bb3 bb3: %p = phi i8* [ %v1, %bb1 ], [ %v3, %bb2 ] %retval = phi i32* [ %v0, %bb1 ], [ %v2, %bb2 ] ; equivalent to %p %v4 = tail call i8* @objc_autoreleaseReturnValue(i8* %p) ret i32* %retval Also, make sure ObjCARCContract replaces @objc_autoreleaseReturnValue's operand uses with its value so that the call gets tail-called. rdar://problem/15894705 llvm-svn: 323009
*	Remove alignment argument from memcpy/memmove/memset in favour of alignment ↵	Daniel Neilson	2018-01-19	2	-11/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	attributes (Step 1) Summary: This is a resurrection of work first proposed and discussed in Aug 2015: http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html and initially landed (but then backed out) in Nov 2015: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html The @llvm.memcpy/memmove/memset intrinsics currently have an explicit argument which is required to be a constant integer. It represents the alignment of the dest (and source), and so must be the minimum of the actual alignment of the two. This change is the first in a series that allows source and dest to each have their own alignments by using the alignment attribute on their arguments. In this change we: 1) Remove the alignment argument. 2) Add alignment attributes to the source & dest arguments. We, temporarily, require that the alignments for source & dest be equal. For example, code which used to read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false) will now read call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 100, i1 false) Downstream users may have to update their lit tests that check for @llvm.memcpy/memmove/memset call/declaration patterns. The following extended sed script may help with updating the majority of your tests, but it does not catch all possible patterns so some manual checking and updating will be required. 
s~declare void @llvm\.mem(set\|cpy\|move)\.p([^(])\((.), i32, i1\)~declare void @llvm.mem\1.p\2(\3, i1)~g s~call void @llvm\.memset\.p([^(])i8\(i8([^])\ (.), i8 (.), i8 (.), i32 [01], i1 ([^)])\)~call void @llvm.memset.p\1i8(i8\2* \3, i8 \4, i8 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i16\(i8([^])\ (.), i8 (.), i16 (.), i32 [01], i1 ([^)])\)~call void @llvm.memset.p\1i16(i8\2* \3, i8 \4, i16 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i32\(i8([^])\ (.), i8 (.), i32 (.), i32 [01], i1 ([^)])\)~call void @llvm.memset.p\1i32(i8\2* \3, i8 \4, i32 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i64\(i8([^])\ (.), i8 (.), i64 (.), i32 [01], i1 ([^)])\)~call void @llvm.memset.p\1i64(i8\2* \3, i8 \4, i64 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i128\(i8([^])\ (.), i8 (.), i128 (.), i32 [01], i1 ([^)])\)~call void @llvm.memset.p\1i128(i8\2* \3, i8 \4, i128 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i8\(i8([^])\ (.), i8 (.), i8 (.), i32 ([0-9]), i1 ([^)])\)~call void @llvm.memset.p\1i8(i8\2 align \6 \3, i8 \4, i8 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(])i16\(i8([^])\ (.), i8 (.), i16 (.), i32 ([0-9]), i1 ([^)])\)~call void @llvm.memset.p\1i16(i8\2 align \6 \3, i8 \4, i16 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(])i32\(i8([^])\ (.), i8 (.), i32 (.), i32 ([0-9]), i1 ([^)])\)~call void @llvm.memset.p\1i32(i8\2 align \6 \3, i8 \4, i32 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(])i64\(i8([^])\ (.), i8 (.), i64 (.), i32 ([0-9]), i1 ([^)])\)~call void @llvm.memset.p\1i64(i8\2 align \6 \3, i8 \4, i64 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(])i128\(i8([^])\ (.), i8 (.), i128 (.), i32 ([0-9]), i1 ([^)])\)~call void @llvm.memset.p\1i128(i8\2 align \6 \3, i8 \4, i128 \5, i1 \7)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i8\(i8([^])\ (.), i8([^])\ (.), i8 (.), i32 [01], i1 ([^)])\)~call void @llvm.mem\1.p\2i8(i8\3 \4, i8\5* \6, i8 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i16\(i8([^])\ (.), i8([^])\ (.), i16 (.), i32 [01], i1 ([^)])\)~call 
void @llvm.mem\1.p\2i16(i8\3 \4, i8\5* \6, i16 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i32\(i8([^])\ (.), i8([^])\ (.), i32 (.), i32 [01], i1 ([^)])\)~call void @llvm.mem\1.p\2i32(i8\3 \4, i8\5* \6, i32 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i64\(i8([^])\ (.), i8([^])\ (.), i64 (.), i32 [01], i1 ([^)])\)~call void @llvm.mem\1.p\2i64(i8\3 \4, i8\5* \6, i64 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i128\(i8([^])\ (.), i8([^])\ (.), i128 (.), i32 [01], i1 ([^)])\)~call void @llvm.mem\1.p\2i128(i8\3 \4, i8\5* \6, i128 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i8\(i8([^])\ (.), i8([^])\ (.), i8 (.), i32 ([0-9]), i1 ([^)])\)~call void @llvm.mem\1.p\2i8(i8\3* align \8 \4, i8\5* align \8 \6, i8 \7, i1 \9)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i16\(i8([^])\ (.), i8([^])\ (.), i16 (.), i32 ([0-9]), i1 ([^)])\)~call void @llvm.mem\1.p\2i16(i8\3* align \8 \4, i8\5* align \8 \6, i16 \7, i1 \9)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i32\(i8([^])\ (.), i8([^])\ (.), i32 (.), i32 ([0-9]), i1 ([^)])\)~call void @llvm.mem\1.p\2i32(i8\3* align \8 \4, i8\5* align \8 \6, i32 \7, i1 \9)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i64\(i8([^])\ (.), i8([^])\ (.), i64 (.), i32 ([0-9]), i1 ([^)])\)~call void @llvm.mem\1.p\2i64(i8\3* align \8 \4, i8\5* align \8 \6, i64 \7, i1 \9)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i128\(i8([^])\ (.), i8([^])\ (.), i128 (.), i32 ([0-9]), i1 ([^)])\)~call void @llvm.mem\1.p\2i128(i8\3* align \8 \4, i8\5* align \8 \6, i128 \7, i1 \9)~g The remaining changes in the series will: Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. Step 3) Update Clang to use the new IRBuilder API. Step 4) Update Polly to use the new IRBuilder API. 
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use MemIntrinsicInst::[get\|set]Alignment() to use getDestAlignment() and getSourceAlignment() instead. Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get\|set]Alignment() methods. Reviewers: pete, hfinkel, lhames, reames, bollu Reviewed By: reames Subscribers: niosHD, reames, jholewinski, qcolombet, jfb, sanjoy, arsenm, dschuff, dylanmckay, mehdi_amini, sdardis, nemanjai, david2050, nhaehnle, javed.absar, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, llvm-commits Differential Revision: https://reviews.llvm.org/D41675 llvm-svn: 322965
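The call rewrite described in this commit message can be sketched for the simplest shape only. This is a hedged illustration, not a replacement for the sed script: the regular expression and the function `move_alignment_to_attributes` are hypothetical, they handle only `memcpy`/`memmove` calls with plain `i8*` operands as in the message's example, and, following the sed script's treatment of alignments 0 and 1, they emit no `align` attribute for those values.

```python
import re

# Matches only the simple shape from the commit message's example:
#   call void @llvm.memcpy.<suffix>(i8* <dest>, i8* <src>, iN <len>, i32 <align>, i1 <volatile>)
OLD_CALL = re.compile(
    r"call void @llvm\.mem(cpy|move)\.(\S+?)\("
    r"i8\* (\S+), i8\* (\S+), (i\d+ \S+), i32 (\d+), i1 (\S+)\)"
)


def move_alignment_to_attributes(line):
    """Drop the i32 alignment argument and attach it to dest and src instead."""

    def rewrite(match):
        kind, suffix, dest, src, length, align, volatile = match.groups()
        # Alignment 0 or 1 carries no information, so no attribute is added.
        attr = "" if align in ("0", "1") else f"align {align} "
        return (
            f"call void @llvm.mem{kind}.{suffix}("
            f"i8* {attr}{dest}, i8* {attr}{src}, {length}, i1 {volatile})"
        )

    return OLD_CALL.sub(rewrite, line)


if __name__ == "__main__":
    old = "call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false)"
    print(move_alignment_to_attributes(old))
```

As the message itself notes for the sed script, a pattern-based rewrite like this does not catch every form, so converted tests still need manual review.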
*	[SLP] Fix vectorization for tree with trunc to minimum required bit width.	Alexey Bataev	2018-01-19	1	-2/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: If the vectorized tree has a truncate to the minimum required bit width and the vector type of the cast operation after the truncation is the same as the vector type of the cast operands, count the cost of the vector cast operation as 0, because this cast will be removed later. Also, if the vectorization tree root operations are integer cast operations, do not consider them as candidates for truncation. Doing so would only create extra copies of the same vector/scalar operations, which would be removed by instcombiner. Reviewers: RKSimon, spatel, mkuper, hfinkel, mssimpso Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41948 llvm-svn: 322946
*	[NFC] fix trivial typos in comments	Hiroshi Inoue	2018-01-19	5	-7/+7
\| \| \| \| \| \|	"the the" -> "the" llvm-svn: 322934
*	[InstCombine] Make foldSelectOpOp able to handle two-operand getelementptr	John Brawn	2018-01-19	1	-7/+19
\| \| \| \| \| \| \| \| \| \|	Three (or more) operand getelementptrs could plausibly also be handled, but handling only two-operand fits in easily with the existing BinaryOperator handling. Differential Revision: https://reviews.llvm.org/D39958 llvm-svn: 322930
*	[HWAsan] Fix uninitialized variable.	Benjamin Kramer	2018-01-18	1	-0/+1
\| \| \| \| \| \|	Found by msan. llvm-svn: 322847
*	[hwasan] LLVM-level flags for linux kernel-compatible hwasan instrumentation.	Evgeniy Stepanov	2018-01-17	1	-7/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: -hwasan-mapping-offset defines the non-zero shadow base address. -hwasan-kernel disables calls to __hwasan_init in module constructors. Unlike ASan, -hwasan-kernel does not force callback instrumentation. This is controlled separately with -hwasan-instrument-with-calls. Reviewers: kcc Subscribers: srhines, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D42141 llvm-svn: 322785
*	Add a ProfileCount class to represent entry counts.	Easwaran Raman	2018-01-17	6	-24/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The class wraps a uint64_t and an enum to represent the type of profile count (real and synthetic) with some helper methods. Reviewers: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41883 llvm-svn: 322771
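The shape of such a wrapper can be sketched briefly. This is a hypothetical Python analogue, not the LLVM class: the names `ProfileCountType`, `ProfileCount`, `is_synthetic`, and the range check are assumptions made for illustration, based only on the description of a `uint64_t` paired with a real/synthetic tag.

```python
from dataclasses import dataclass
from enum import Enum

UINT64_MAX = (1 << 64) - 1


class ProfileCountType(Enum):
    # The two kinds of count named in the commit message.
    REAL = "real"
    SYNTHETIC = "synthetic"


@dataclass(frozen=True)
class ProfileCount:
    count: int
    kind: ProfileCountType

    def __post_init__(self):
        # Mimic the uint64_t range of the wrapped value.
        if not 0 <= self.count <= UINT64_MAX:
            raise ValueError("count must fit in a uint64_t")

    def is_synthetic(self):
        """Helper distinguishing synthetic counts from real ones."""
        return self.kind is ProfileCountType.SYNTHETIC


if __name__ == "__main__":
    entry = ProfileCount(1000, ProfileCountType.SYNTHETIC)
    print(entry.count, entry.is_synthetic())
```

Carrying the tag alongside the value lets a consumer tell a measured entry count from one that was synthesized, rather than treating both as interchangeable integers.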
*	[SCEV] Fix typo. NFC.	Javed Absar	2018-01-17	1	-1/+1
\| \| \| \| \| \|	Fix confusing typo in comment. llvm-svn: 322765
*	Revert [PowerPC] This reverts commit rL322721	Zaara Syeda	2018-01-17	1	-158/+6
\| \| \| \| \| \|	Failing build bots. Revert the commit now. llvm-svn: 322748
*	[PowerPC] Add handling for ColdCC calling convention and a pass to mark	Zaara Syeda	2018-01-17	1	-6/+158
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	candidates with coldcc attribute. This patch adds support for the coldcc calling convention for Power. This changes the set of non-volatile registers. It includes a pass to stress test the implementation by marking all static directly called functions with the coldcc attribute through the option -enable-coldcc-stress-test. It also includes an option, -ppc-enable-coldcc, to add the coldcc attribute to functions which are cold at all call sites based on BlockFrequencyInfo when the containing function does not call any non cold functions. Differential Revision: https://reviews.llvm.org/D38413 llvm-svn: 322721
*	[InstCombine] fix demanded-bits propagation for zext/trunc	Sanjay Patel	2018-01-17	1	-1/+1
\| \| \| \| \| \| \| \| \|	I was comparing the demanded-bits implementations between InstCombine and TargetLowering as part of investigating questions in D42088 and noticed that this was wrong in IR. We were losing all of the prior known bits when we got back to the 'zext'. llvm-svn: 322662
*	[AMDGPU] add LDS f32 intrinsics	Daniil Fukalov	2018-01-17	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \| \|	Added llvm.amdgcn.atomic.{add\|min\|max}.f32 intrinsics to allow generating the ds_{add\|min\|max}[_rtn]_f32 instructions needed for OpenCL float atomics in LDS. Reviewed by: arsenm Differential Revision: https://reviews.llvm.org/D37985 llvm-svn: 322656
*	[Transforms] Support making mutable versions of new-format TBAA access tags	Ivan A. Kosarev	2018-01-17	1	-16/+2
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D41565 llvm-svn: 322650
*	[SCEV] fix typo	Javed Absar	2018-01-17	1	-1/+1
\| \| \| \|	llvm-svn: 322629
*	[hwasan] Rename sized load/store callbacks to be consistent with ASan.	Evgeniy Stepanov	2018-01-16	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: __hwasan_load is now __hwasan_loadN. Reviewers: kcc Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D42138 llvm-svn: 322601
*	[CallSiteSplitting] Pass list of (BB, Conditions) pairs to splitCallSite.	Florian Hahn	2018-01-16	1	-76/+71
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This removes some duplication from splitCallSite and makes it easier to add additional code dealing with each predecessor. It also allows us to split for more than 2 predecessors, although that is not enabled for now. Reviewers: junbuml, mcrosier, davidxl, davide Reviewed By: junbuml Differential Revision: https://reviews.llvm.org/D41858 llvm-svn: 322599