bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	loop-unroll: teach remapInstruction to update dbg.value intrinsics.	Adrian Prantl	2017-11-01	1	-1/+15
\| \| \| \| \| \| \| \|	Fixes PR35112. https://bugs.llvm.org/show_bug.cgi?id=35112 llvm-svn: 317138
*	loop-rotate: avoid duplicating dbg.value intrinsics in the entry block.	Adrian Prantl	2017-11-01	1	-0/+24
\| \| \| \| \| \| \| \|	This fixes the second half of PR35113. This reapplies r317106 without modifications. llvm-svn: 317121
*	loop-rotate: eliminate duplicate debug intrinsics after splicing.	Adrian Prantl	2017-11-01	1	-1/+26
\| \| \| \| \| \| \| \| \|	Fixes part of PR35113. This reapplies r317105 with an additional check for isa<Instruction> as found by the bots. llvm-svn: 317120
*	Include GUIDs from the same module when computing GUIDs that needs to be ↵	Dehao Chen	2017-11-01	1	-15/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	imported. Summary: In the compile phase of SamplePGO+ThinLTO, ICP is not invoked. Instead, indirect call targets will be included as function metadata for ThinIndex to build the call graph. This should not only include functions defined in other modules, but also functions defined in the same module, otherwise ThinIndex may find the callee dead and eliminate it, while ICP in backend will revive the symbol, which leads to an undefined symbol. Reviewers: tejohnson Reviewed By: tejohnson Subscribers: sanjoy, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D39480 llvm-svn: 317118
*	Revert 317016 and 317048	Philip Reames	2017-11-01	1	-44/+50
\| \| \| \| \| \|	The former appears to have introduced a miscompile in a stage2 clang build. Revert so I can investigate offline. llvm-svn: 317116
*	Revert r317105 to investigate bot breakage.	Adrian Prantl	2017-11-01	1	-23/+1
\| \| \| \|	llvm-svn: 317110
*	Revert r317106 to facilitate reverting r317105.	Adrian Prantl	2017-11-01	1	-24/+0
\| \| \| \|	llvm-svn: 317109
*	LTO: Apply global DCE to ThinLTO modules at LTO opt level 0.	Peter Collingbourne	2017-11-01	1	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is necessary because DCE is applied to full LTO modules. Without this change, a reference from a dead ThinLTO global to a dead full LTO global will result in an undefined reference at link time. This problem is only observable when --gc-sections is disabled, or when targeting COFF, as the COFF port of lld requires all symbols to have a definition even if all references are dead (this is consistent with link.exe). This change also adds an EliminateAvailableExternally pass at -O0. This is necessary to handle the situation on Windows where a non-prevailing copy of a linkonce_odr function has an SEH filter function; any such filters must be DCE'd because they will contain a call to the llvm.localrecover intrinsic, passing as an argument the address of the function that the filter belongs to, and llvm.localrecover requires this function to be defined locally. Fixes PR35142. Differential Revision: https://reviews.llvm.org/D39484 llvm-svn: 317108
*	loop-rotate: avoid duplicating dbg.value intrinsics in the entry block.	Adrian Prantl	2017-11-01	1	-0/+24
\| \| \| \| \| \|	This fixes the second half of PR35113. llvm-svn: 317106
*	loop-rotate: eliminate duplicate debug intrinsics after splicing.	Adrian Prantl	2017-11-01	1	-1/+23
\| \| \| \| \| \|	Fixes part of PR35113. llvm-svn: 317105
*	Revert rL311205 "[IRCE] Fix buggy behavior in Clamp"	Max Kazantsev	2017-11-01	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	This patch reverts rL311205 that was initially a wrong fix. The real problem was in intersection of signed and unsigned ranges (see rL316552), and the patch being reverted masked the problem instead of fixing it. By now, the test against which rL311205 was made works OK even without this code. This revert patch also contains a test case that demonstrates incorrect behavior caused by rL311205: it is caused by incorrect choice of signed max instead of unsigned. llvm-svn: 317088
*	[CodeExtractor] Fix iterator invalidation in findOrCreateBlockForHoisting.	Florian Hahn	2017-11-01	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: By replacing branches to CommonExitBlock, we remove the node from CommonExitBlock's predecessors, invalidating the iterator. The problem is exposed when the common exit block has multiple predecessors and needs to sink lifetime info. The modification in the test case triggers the issue. Reviewers: davidxl, davide, wmi Reviewed By: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39112 llvm-svn: 317084
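The hazard in this commit is the general one of mutating a container while iterating over it. Below is a minimal sketch in plain C++, not the LLVM code: `Block`, `setSuccessor`, and `redirectPreds` are hypothetical stand-ins, and iterating over a snapshot is one common remedy that may differ from what the patch itself does.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a CFG node; this is not llvm::BasicBlock.
struct Block {
  std::vector<Block *> preds; // blocks that branch to this one
  Block *succ = nullptr;      // single successor, to keep the sketch small
};

// Make `from` branch to `to`, keeping both predecessor lists in sync.
void setSuccessor(Block &from, Block &to) {
  if (from.succ) {
    std::vector<Block *> &p = from.succ->preds;
    p.erase(std::remove(p.begin(), p.end(), &from), p.end());
  }
  from.succ = &to;
  to.preds.push_back(&from);
}

// Redirect every predecessor of `oldExit` to `newExit` and return how many
// were moved. Iterating oldExit.preds directly would be invalid, because
// setSuccessor erases from that same vector; iterate over a copy instead.
int redirectPreds(Block &oldExit, Block &newExit) {
  std::vector<Block *> snapshot = oldExit.preds;
  int moved = 0;
  for (Block *pred : snapshot) {
    setSuccessor(*pred, newExit);
    ++moved;
  }
  assert(oldExit.preds.empty());
  return moved;
}
```

The copy costs one allocation per call, which is usually acceptable when the predecessor list is short.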
*	[SimplifyIndVar] Inline makIVComparisonInvariant to eliminate code ↵	Philip Reames	2017-10-31	1	-51/+29
\| \| \| \| \| \| \| \|	duplication [NFC] This formulation might be slightly slower since I eagerly compute the cheap replacements. If anyone sees this having a compile time impact, let me know and I'll use lazy population instead. llvm-svn: 317048
*	loop-rotate: simplify code by using llvm::findDbgValues(). (NFC)	Adrian Prantl	2017-10-31	1	-31/+23
\| \| \| \|	llvm-svn: 317037
*	[coro] Make Spill a proper struct instead of deriving from pair.	Benjamin Kramer	2017-10-31	1	-12/+10
\| \| \| \| \| \|	No functionality change. llvm-svn: 317027
*	[SimplifyCFG] Use a more generic name for the selects created by ↵	Craig Topper	2017-10-31	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	SpeculativelyExecuteBB to prevent long names from being created Currently the selects are created with the names of their inputs concatenated together. It's possible to get cases that chain these selects together resulting in long names due to multiple levels of concatenation. Our internal branch of llvm managed to generate names over 100000 characters in length on a particular test due to an extreme compounding of the names. This patch changes the name to a generic name that is not dependent on its inputs. Differential Revision: https://reviews.llvm.org/D39440 llvm-svn: 317024
*	[IndVarSimplify] Extract wrapper around SE-.isLoopInvariantPredicate [NFC]	Philip Reames	2017-10-31	1	-17/+33
\| \| \| \| \| \|	This is an intermediate state; the next patch will re-inline the markLoopInvariantPredicate function to reduce code duplication. llvm-svn: 317016
*	[IndVarSimplify] Simplify code using a dictionary	Philip Reames	2017-10-31	1	-16/+8
\| \| \| \| \| \|	Possibly very slightly slower, but this code is not performance critical and the readability benefit alone is huge. llvm-svn: 317012
*	[asan] Upgrade private linkage globals to internal linkage on COFF	Reid Kleckner	2017-10-31	1	-2/+7
\| \| \| \| \| \| \|	COFF comdats require symbol table entries, which means the comdat leader cannot have private linkage. llvm-svn: 317009
*	[LoopVectorize] Replace manual VPlan memory management with unique_ptr.	Benjamin Kramer	2017-10-31	1	-26/+10
\| \| \| \| \| \|	No functionality change intended. llvm-svn: 317003
*	[InstCombine] Simplify selects that test cmpxchg instructions	Matthew Simpson	2017-10-31	1	-0/+76
\| \| \| \| \| \| \| \| \| \|	If a select instruction tests the returned flag of a cmpxchg instruction and selects between the returned value of the cmpxchg instruction and its compare operand, the result of the select will always be equal to its false value. Differential Revision: https://reviews.llvm.org/D39383 llvm-svn: 316994
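The reasoning behind this fold can be checked with a small model. The sketch below is plain C++ rather than LLVM IR; `cmpxchgModel` and the three wrapper functions are hypothetical names introduced only for illustration. When the exchange succeeds, the returned value equals the compare operand, so both select arms agree; when it fails, the select picks its false arm. Either way the result is the false value.

```cpp
#include <cstdint>
#include <utility>

// Models the {loaded value, success flag} pair that a cmpxchg yields.
// Single-threaded model only; it says nothing about atomicity.
std::pair<int32_t, bool> cmpxchgModel(int32_t &mem, int32_t compare,
                                      int32_t desired) {
  int32_t loaded = mem;
  bool success = (loaded == compare);
  if (success)
    mem = desired;
  return std::make_pair(loaded, success);
}

// select(success, compare, loaded): true arm is the compare operand.
int32_t selectCompareElseLoaded(int32_t &mem, int32_t compare,
                                int32_t desired) {
  std::pair<int32_t, bool> r = cmpxchgModel(mem, compare, desired);
  return r.second ? compare : r.first;
}

// select(success, loaded, compare): true arm is the loaded value.
int32_t selectLoadedElseCompare(int32_t &mem, int32_t compare,
                                int32_t desired) {
  std::pair<int32_t, bool> r = cmpxchgModel(mem, compare, desired);
  return r.second ? r.first : compare;
}

// What the first select reduces to: its false value, the loaded value.
int32_t foldedToLoaded(int32_t &mem, int32_t compare, int32_t desired) {
  return cmpxchgModel(mem, compare, desired).first;
}
```

The second form, `selectLoadedElseCompare`, always yields `compare` for the same reason.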
*	[LoopUnroll] Clean up remarks for unroll remainder	David Green	2017-10-31	2	-31/+35
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The optimisation remarks for loop unrolling with an unrolled remainder look something like: test.c:7:18: remark: completely unrolled loop with 3 iterations [-Rpass=loop-unroll] C[i] += A[i*N+j]; ^ test.c:6:9: remark: unrolled loop by a factor of 4 with run-time trip count [-Rpass=loop-unroll] for(int j = 0; j < N; j++) ^ This removes the first of the two messages. Differential revision: https://reviews.llvm.org/D38725 llvm-svn: 316986
*	[IRCE][NFC] Rename fields of InductiveRangeCheck	Max Kazantsev	2017-10-31	1	-23/+23
\| \| \| \| \| \| \| \| \| \|	Rename `Offset`, `Scale`, `Length` into `Begin`, `Step`, `End` respectively to make naming of similar entities for Ranges and Range Checks more consistent. Differential Revision: https://reviews.llvm.org/D39414 llvm-svn: 316979
*	[NFC] Get rid of variables used in assert only	Max Kazantsev	2017-10-31	1	-6/+6
\| \| \| \|	llvm-svn: 316977
*	[IndVarSimplify] Simplify code using preheader assumption	Philip Reames	2017-10-31	1	-44/+6
\| \| \| \| \| \| \| \|	As noted in the nice block comment, the previous code didn't actually handle multi-entry loops correctly; it just assumed SCEV didn't analyze such loops. Given SCEV has comments to the contrary, that seems a bit suspect. More importantly, the pass actually requires loopsimplify form which ensures a loop-preheader is available. Remove the excessive generality and shorten the code greatly. Note that we do successfully analyze many multi-entry loops, but we do so by converting them to single entry loops. See the added test case. llvm-svn: 316976
*	Reapply "[GVN] Prevent LoadPRE from hoisting across instructions that don't ↵	Max Kazantsev	2017-10-31	1	-0/+97
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	pass control flow to successors" This patch fixes the miscompile that happens when PRE hoists loads across guards and other instructions that don't always pass control flow to their successors. PRE is now prohibited from hoisting across such instructions because there is no guarantee that a load placed after such an instruction is still valid before it. For example, a load from under a guard may be invalid before the guard in the following case: int array[LEN]; ... guard(0 <= index && index < LEN); use(array[index]); Differential Revision: https://reviews.llvm.org/D37460 llvm-svn: 316975
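The guard example in this message can be modelled in ordinary C++. This is a sketch under assumptions: `guard` is a hypothetical stand-in that throws when its condition fails (one way of "not passing control to the successor"), and `guardedLoad` is an illustrative name, not anything from GVN.

```cpp
#include <cstddef>
#include <stdexcept>

const std::size_t LEN = 4;

// Hypothetical guard: control reaches the next statement only when the
// condition holds; otherwise the function is left via an exception.
void guard(bool cond) {
  if (!cond)
    throw std::out_of_range("guard failed");
}

int guardedLoad(const int (&array)[LEN], long index) {
  guard(0 <= index && index < static_cast<long>(LEN));
  // The load is only known to be in bounds here, after the guard.
  // Hoisting it above the guard would read array[index] for every index,
  // including out-of-bounds ones, which is the miscompile being avoided.
  return array[index];
}
```

With an out-of-range index the function exits through the guard and never touches the array, which is exactly the property a hoisted load would break.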
*	[SimplifyIndVar] Extract out invariant expression handling	Philip Reames	2017-10-31	1	-82/+107
\| \| \| \| \| \| \| \|	Previously, the code returned early from the function when it couldn't find a free expansion; it should be returning from the transform. I don't have a test case; I noticed this via inspection. As a follow up, I'm going to revisit the logic in the extract function. I think that essentially the whole helper routine can be replaced with SCEVExpander, but I wanted to do that in a series of separate commits. llvm-svn: 316974
*	Undo accidental commit	Philip Reames	2017-10-31	1	-237/+82
\| \| \| \| \| \|	These files shouldn't have been submitted in 316967 llvm-svn: 316968
*	[CGP] Fix crash on i96 bit multiply	Philip Reames	2017-10-30	1	-82/+237
\| \| \| \| \| \| \| \|	Issue found by llvm-isel-fuzzer on OSS fuzz, https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=3725 If anyone actually cares about > 64 bit arithmetic, there's a lot more to do in this area. There's a bunch of obviously wrong code in the same function. I don't have the time to fix all of them and am just using this to understand what the workflow for fixing fuzzer cases might look like. llvm-svn: 316967
*	InferAddressSpaces: Fix bug about replacing addrspacecast	Yaxun Liu	2017-10-30	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	InferAddressSpaces assumes the pointee type of addrspacecast is the same as the operand, which is not always true and causes invalid IR. This bug causes a build failure in HCC. This patch fixes that. Differential Revision: https://reviews.llvm.org/D39432 llvm-svn: 316957
*	[NewGVN] Stop assuming PHI args ordering when looking at phi-of-ops.	Davide Italiano	2017-10-30	1	-1/+1
\| \| \| \| \| \| \| \|	It's not guaranteed. There's a bug open to sort them in predecessor order, but it won't happen anytime soon. In the meanwhile, passes will have to do an O(#preds) scan. Such is life. llvm-svn: 316953
*	Create instruction classes for identifying any atomicity of memory ↵	Daniel Neilson	2017-10-30	2	-5/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	intrinsic. (NFC)
	Summary: For reference, see: http://lists.llvm.org/pipermail/llvm-dev/2017-August/116589.html
	This patch fleshes out the instruction class hierarchy with respect to atomic and non-atomic memory intrinsics. With this change, the relevant part of the class hierarchy becomes:
	IntrinsicInst
	  -> MemIntrinsicBase (methods-only class)
	    -> MemIntrinsic (non-atomic intrinsics)
	      -> MemSetInst
	      -> MemTransferInst
	        -> MemCpyInst
	        -> MemMoveInst
	    -> AtomicMemIntrinsic (atomic intrinsics)
	      -> AtomicMemSetInst
	      -> AtomicMemTransferInst
	        -> AtomicMemCpyInst
	        -> AtomicMemMoveInst
	    -> AnyMemIntrinsic (both atomicities)
	      -> AnyMemSetInst
	      -> AnyMemTransferInst
	        -> AnyMemCpyInst
	        -> AnyMemMoveInst
	This involves some class renaming:
	  ElementUnorderedAtomicMemCpyInst -> AtomicMemCpyInst
	  ElementUnorderedAtomicMemMoveInst -> AtomicMemMoveInst
	  ElementUnorderedAtomicMemSetInst -> AtomicMemSetInst
	A script for doing this renaming in downstream trees is included below. An example of where the Any* classes should be used in LLVM is when reasoning about the effects of an instruction (ex: aliasing).
	---
	Script for renaming AtomicMem* classes:
	  PREFIXES="[<,([:space:]]"
	  CLASSES="MemIntrinsic|MemTransferInst|MemSetInst|MemMoveInst|MemCpyInst"
	  SUFFIXES="[;)>,[:space:]]"
	  REGEX="(${PREFIXES})ElementUnorderedAtomic(${CLASSES})(${SUFFIXES})"
	  REGEX2="visitElementUnorderedAtomic(${CLASSES})"
	  FILES=$( grep -E "(${REGEX}|${REGEX2})" -r . | tr ':' ' ' | awk '{print $1}' | sort | uniq )
	  SED_SCRIPT="s~${REGEX}~\1Atomic\2\3~g"
	  SED_SCRIPT2="s~${REGEX2}~visitAtomic\1~g"
	  for f in $FILES; do
	    echo "Processing: $f"
	    sed -i ".bak" -E "${SED_SCRIPT};${SED_SCRIPT2};${EA_SED_SCRIPT};${EA_SED_SCRIPT2}" $f
	  done
	Reviewers: sanjoy, deadalnix, apilipenko, anna, skatkov, mkazantsev Reviewed By: sanjoy Subscribers: hfinkel, jholewinski, arsenm, sdardis, nhaehnle, JDevlieghere, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D38419 llvm-svn: 316950
*	[GVNHoist] Fix non-deterministic sort order of PHIs for identical instructions	Mandeep Singh Grang	2017-10-30	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This fixes failure in Transforms/GVNHoist/hoist.ll uncovered by D39245. Reviewers: hiraditya, spop, dberlin Reviewed By: dberlin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39410 llvm-svn: 316949
*	[CodeGen][ExpandMemcmp] Allow memcmp to expand to vector loads (2).	Clement Courbet	2017-10-30	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \| \|	- Targets that want to support memcmp expansions now return the list of supported load sizes. - Expansion codegen does not assume that all power-of-two load sizes smaller than the max load size are valid. For example, this is not the case for x86(32bit)+sse2. Fixes PR34887. llvm-svn: 316905
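To illustrate why an explicit list of load sizes matters, here is a rough sketch of planning loads from a target-supplied list instead of assuming every power of two is available. `planLoads` is a hypothetical helper, the greedy strategy is an assumption rather than a description of the actual expansion code, and the size list used in the usage note (16, 4, 2, 1, with no 8-byte load) is an illustrative guess at a target that lacks one of the intermediate sizes.

```cpp
#include <cassert>
#include <vector>

// Greedily cover `numBytes` using only the load sizes the target reports.
// `supportedDesc` must be sorted from largest to smallest. Any remainder
// that cannot be covered is simply left out of the plan.
std::vector<unsigned> planLoads(unsigned numBytes,
                                const std::vector<unsigned> &supportedDesc) {
  std::vector<unsigned> plan;
  for (unsigned size : supportedDesc) {
    assert(size != 0 && "a zero-sized load would never make progress");
    while (numBytes >= size) {
      plan.push_back(size);
      numBytes -= size;
    }
  }
  return plan;
}
```

For example, with the assumed list `{16, 4, 2, 1}`, a 31-byte compare is planned as loads of 16, 4, 4, 4, 2, and 1 bytes; a planner that assumed an 8-byte load existed would have emitted an unsupported access.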
*	Recommit r315288: [SCCP] Propagate integer range info for parameters in IPSCCP.	Florian Hahn	2017-10-30	1	-7/+95
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This version of the patch includes a fix addressing a stage2 LTO buildbot failure and addresses some additional nits. Original commit message: This updates the SCCP solver to use the ValueElement lattice for parameters, which provides integer range information. The range information is used to remove unneeded icmp instructions. For the following function, f() can be optimized to ret i32 2 with this change: source_filename = "sccp.c" target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" ; Function Attrs: norecurse nounwind readnone uwtable define i32 @main() local_unnamed_addr #0 { entry: %call = tail call fastcc i32 @f(i32 1) %call1 = tail call fastcc i32 @f(i32 47) %add3 = add nsw i32 %call, %call1 ret i32 %add3 } ; Function Attrs: noinline norecurse nounwind readnone uwtable define internal fastcc i32 @f(i32 %x) unnamed_addr #1 { entry: %c1 = icmp sle i32 %x, 100 %cmp = icmp sgt i32 %x, 300 %. = select i1 %cmp, i32 1, i32 2 ret i32 %. } attributes #1 = { noinline } Reviewers: davide, sanjoy, efriedma, dberlin Reviewed By: davide, dberlin Subscribers: mcrosier, gberry, mssimpso, dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D36656 llvm-svn: 316891
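The argument in this message, that f() folds to 2 because its parameter is only ever 1 or 47, can be sketched with a toy interval lattice. Everything below is hypothetical and far simpler than the solver's real lattice: `Range`, `join`, `signedGreaterThan`, and `foldF` are illustrative names, and `foldF` returns 0 to mean "cannot fold".

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical closed integer range [lo, hi].
struct Range {
  int32_t lo;
  int32_t hi;
};

// Merge the ranges seen at different call sites.
Range join(Range a, Range b) {
  return Range{std::min(a.lo, b.lo), std::max(a.hi, b.hi)};
}

enum class Tri { False, True, Unknown };

// Evaluate the signed comparison `x > k` for every x in r.
Tri signedGreaterThan(Range r, int32_t k) {
  assert(r.lo <= r.hi);
  if (r.lo > k)
    return Tri::True;
  if (r.hi <= k)
    return Tri::False;
  return Tri::Unknown;
}

// Models f(x) = (x > 300) ? 1 : 2. Returns the constant result when the
// range decides the comparison, or 0 when it does not.
int32_t foldF(Range x) {
  switch (signedGreaterThan(x, 300)) {
  case Tri::True:
    return 1;
  case Tri::False:
    return 2;
  default:
    return 0;
  }
}
```

Joining the two call-site constants gives the range [1, 47], for which `x > 300` is always false, so the select collapses to its second operand.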
*	[IRCE][NFC] Store Length as SCEV in RangeCheck instead of Value	Max Kazantsev	2017-10-30	1	-6/+6
\| \| \| \|	llvm-svn: 316889
*	Revert r316887 to fix buildbot failures.	Florian Hahn	2017-10-30	1	-93/+7
\| \| \| \|	llvm-svn: 316888
*	Recommit r315288: [SCCP] Propagate integer range info for parameters in IPSCCP.	Florian Hahn	2017-10-30	1	-7/+93
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This version of the patch includes a fix addressing a stage2 LTO buildbot failure and addresses some additional nits. Original commit message: This updates the SCCP solver to use the ValueElement lattice for parameters, which provides integer range information. The range information is used to remove unneeded icmp instructions. For the following function, f() can be optimized to ret i32 2 with this change: source_filename = "sccp.c" target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" ; Function Attrs: norecurse nounwind readnone uwtable define i32 @main() local_unnamed_addr #0 { entry: %call = tail call fastcc i32 @f(i32 1) %call1 = tail call fastcc i32 @f(i32 47) %add3 = add nsw i32 %call, %call1 ret i32 %add3 } ; Function Attrs: noinline norecurse nounwind readnone uwtable define internal fastcc i32 @f(i32 %x) unnamed_addr #1 { entry: %c1 = icmp sle i32 %x, 100 %cmp = icmp sgt i32 %x, 300 %. = select i1 %cmp, i32 1, i32 2 ret i32 %. } attributes #1 = { noinline } Reviewers: davide, sanjoy, efriedma, dberlin Reviewed By: davide, dberlin Subscribers: mcrosier, gberry, mssimpso, dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D36656 llvm-svn: 316887
*	[GVN][NFC] Mark instruction for deletion instead of immediate erasing in LoadPRE	Max Kazantsev	2017-10-30	1	-2/+1
\| \| \| \| \| \| \| \|	It is done to uniformly handle instructions removal. Differential Revision: https://reviews.llvm.org/D39369 llvm-svn: 316884
*	[SimplifyCFG] use pass options and remove the latesimplifycfg pass	Sanjay Patel	2017-10-28	3	-77/+68
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is no-functional-change-intended. This is repackaging the functionality of D30333 (defer switch-to-lookup-tables) and D35411 (defer folding unconditional branches) with pass parameters rather than a named "latesimplifycfg" pass. Now that we have individual options to control the functionality, we could decouple when these fire (but that's an independent patch if desired). The next planned step would be to add another option bit to disable the sinking transform mentioned in D38566. This should also make it clear that the new pass manager needs to be updated to limit simplifycfg in the same way as the old pass manager. Differential Revision: https://reviews.llvm.org/D38631 llvm-svn: 316835
*	[PartialInlineLibCalls] Teach PartialInlineLibCalls to honor nobuiltin, ↵	Craig Topper	2017-10-28	1	-6/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	properly check the function signature, and check TLI::has Summary: We shouldn't do this transformation if the function is marked nobuiltin. We were only checking that the return type is floating point, we really should be checking the argument types and argument count as well. This can be accomplished by using the other version of getLibFunc that takes the Function and not just the name. We should also be checking TLI::has since sqrtf is a macro on Windows. Fixes PR32559. Reviewers: hfinkel, spatel, davide, efriedma Reviewed By: davide, efriedma Subscribers: efriedma, llvm-commits, eraman Differential Revision: https://reviews.llvm.org/D39381 llvm-svn: 316819
*	[LoopPredication] Handle the case when the guard and the latch IV have ↵	Artur Pilipenko	2017-10-27	1	-61/+87
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	different offsets This is a follow up change for D37569. Currently the transformation is limited to the case when: * The loop has a single latch with the condition of the form: ++i <pred> latchLimit, where <pred> is u<, u<=, s<, or s<=. * The step of the IV used in the latch condition is 1. * The IV of the latch condition is the same as the post increment IV of the guard condition. * The guard condition is of the form i u< guardLimit. This patch enables the transform in the case when the latch is latchStart + i <pred> latchLimit, where <pred> is u<, u<=, s<, or s<=. And the guard is guardStart + i u< guardLimit Reviewed By: anna Differential Revision: https://reviews.llvm.org/D39097 llvm-svn: 316768
*	[GVN][NFC] Refactor loop iteration with foreach	Max Kazantsev	2017-10-27	1	-6/+6
\| \| \| \|	llvm-svn: 316748
*	[Transforms] Fix some Clang-tidy modernize and Include What You Use ↵	Eugene Zelenko	2017-10-27	12	-178/+350
\| \| \| \| \| \|	warnings; other minor fixes (NFC). llvm-svn: 316724
*	[SimplifyIndVars] Shorten code by using SCEV helper [NFC]	Philip Reames	2017-10-26	1	-7/+4
\| \| \| \|	llvm-svn: 316709
*	Do not add discriminator encoding for debug intrinsics.	Dehao Chen	2017-10-26	2	-3/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: There are certain requirements for debug location of debug intrinsics, e.g. the scope of the DILocalVariable should be the same as the scope of its debug location. As a result, we should not add discriminator encoding for debug intrinsics. Reviewers: dblaikie, aprantl Reviewed By: aprantl Subscribers: JDevlieghere, aprantl, bjope, sanjoy, llvm-commits Differential Revision: https://reviews.llvm.org/D39343 llvm-svn: 316703
*	[LICM] Restructure implicit exit handling to be more clear [NFCI]	Philip Reames	2017-10-26	1	-27/+34
\| \| \| \| \| \|	When going to explain this to someone else, I got tripped up by the complicated meaning of IsKnownNonEscapingObject in load-store promotion. Extract a helper routine and clarify naming/scopes to make this a bit more obvious. llvm-svn: 316699
*	Reapply r316582 [Local] Fix a bug in the domtree update logic for ↵	Balaram Makam	2017-10-26	1	-3/+7
\| \| \| \| \| \| \| \| \| \| \| \|	MergeBasicBlockIntoOnlyPred. Summary: This reverts r316612 to reapply r316582. The buildbot failure was unrelated to this commit. Reviewers: Subscribers: llvm-svn: 316669
*	[LSV] Avoid adding vectors of pointers as candidates	Bjorn Pettersson	2017-10-26	1	-3/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: We no longer add vectors of pointers as candidates for load/store vectorization. It does not seem to work anyway, but without this patch we can end up in asserts when trying to create casts between an integer type and the pointer of vectors type. The test case I've added used to assert like this when trying to cast between i64 and <2 x i16>: opt: ../lib/IR/Instructions.cpp:2565: Assertion `castIsValid(op, S, Ty) && "Invalid cast!"' failed. #0 PrintStackTraceSignalHandler(void) #1 SignalHandler(int) #2 __restore_rt #3 __GI_raise #4 __GI_abort #5 __GI___assert_fail #6 llvm::CastInst::Create(llvm::Instruction::CastOps, llvm::Value, llvm::Type, llvm::Twine const&, llvm::Instruction) #7 llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>::CreateBitOrPointerCast(llvm::Value, llvm::Type, llvm::Twine const&) #8 Vectorizer::vectorizeStoreChain(llvm::ArrayRef<llvm::Instruction>, llvm::SmallPtrSet<llvm::Instruction, 16u>) Reviewers: arsenm Reviewed By: arsenm Subscribers: nhaehnle, llvm-commits Differential Revision: https://reviews.llvm.org/D39296 llvm-svn: 316665
*	[LSV] Skip all non-byte sizes, not only less than eight bits	Bjorn Pettersson	2017-10-26	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The code comments indicate that no effort has been spent on handling load/stores when the size isn't a multiple of the byte size correctly. However, the code only avoided types smaller than 8 bits. So for example a load of an i28 could still be considered as a candidate for vectorization. This patch adjusts the code to behave according to the code comment. The test case used to hit the following assert when trying to use "cast" an i32 to i28 using CreateBitOrPointerCast: opt: ../lib/IR/Instructions.cpp:2565: Assertion `castIsValid(op, S, Ty) && "Invalid cast!"' failed. #0 PrintStackTraceSignalHandler(void) #1 SignalHandler(int) #2 __restore_rt #3 __GI_raise #4 __GI_abort #5 __GI___assert_fail #6 llvm::CastInst::Create(llvm::Instruction::CastOps, llvm::Value, llvm::Type, llvm::Twine const&, llvm::Instruction) #7 llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>::CreateBitOrPointerCast(llvm::Value, llvm::Type, llvm::Twine const&) #8 (anonymous namespace)::Vectorizer::vectorizeLoadChain(llvm::ArrayRef<llvm::Instruction>, llvm::SmallPtrSet<llvm::Instruction, 16u>*) Reviewers: arsenm Reviewed By: arsenm Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39295 llvm-svn: 316663
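The difference between the two filters described in this message is small enough to show directly. This is a sketch of the predicates as the message describes them, not the vectorizer's code; both function names are hypothetical.

```cpp
// Earlier behaviour as described: only sizes below eight bits were skipped.
bool skippedByOldCheck(unsigned sizeInBits) {
  return sizeInBits < 8;
}

// Behaviour after the change as described: skip any size that is not a
// whole number of bytes.
bool skippedByNewCheck(unsigned sizeInBits) {
  return sizeInBits % 8 != 0;
}
```

An i28 passes the old filter (28 is not below 8) but is rejected by the new one (28 is not a multiple of 8), while an i32 is accepted by both.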