bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	[LICM] Adjust how moving the re-hoist point works	John Brawn	2019-01-04	1	-3/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In some cases the order that we hoist instructions in means that when rehoisting (which uses the same order as hoisting) we can rehoist to a block A, then a block B, then block A again. This currently causes an assertion failure as it expects that when changing the hoist point it only ever moves to a block that dominates the hoist point being moved from. Fix this by moving the re-hoist point when it doesn't dominate the dominator of hoisted instruction, or in other words when it wouldn't dominate the uses of the instruction being rehoisted. Differential Revision: https://reviews.llvm.org/D55266 llvm-svn: 350408
*	[memcpyopt] Remove a few unnecessary isVolatile() checks. NFC	Xin Tong	2019-01-04	1	-6/+4
\| \| \| \| \| \|	We already checked for isSimple() on the store. llvm-svn: 350378
*	[BDCE] Remove instructions without demanded bits	Nikita Popov	2019-01-02	1	-2/+6
\| \| \| \| \| \| \| \| \|	If an instruction has no demanded bits, remove it directly during BDCE, instead of leaving it for something else to clean up. Differential Revision: https://reviews.llvm.org/D56185 llvm-svn: 350257
*	Reapply "[BDCE][DemandedBits] Detect dead uses of undead instructions"	Nikita Popov	2019-01-01	1	-15/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This (mostly) fixes https://bugs.llvm.org/show_bug.cgi?id=39771. BDCE currently detects instructions that don't have any demanded bits and replaces their uses with zero. However, if an instruction has multiple uses, then some of the uses may be dead (have no demanded bits) even though the instruction itself is still live. This patch extends DemandedBits/BDCE to detect such uses and replace them with zero. While this will not immediately render any instructions dead, it may lead to simplifications (in the motivating case, by converting a rotate into a simple shift), break dependencies, etc. The implementation tries to strike a balance between analysis power and complexity/memory usage. Originally I wanted to track demanded bits on a per-use level, but ultimately we're only really interested in whether a use is entirely dead or not. I'm using an extra set to track which uses are dead. However, as initially all uses are dead, I'm not storing uses whose user is also dead. This case is checked separately instead. The previous attempt to land this led to miscompiles, because cases where uses were initially dead but were later found to be live during further analysis were not always correctly removed from the DeadUses set. This is fixed now and the added test case demonstrates such an instance. Differential Revision: https://reviews.llvm.org/D55563 llvm-svn: 350188
*	Drop SE cache early because loop parent can change in LoopSimplifyCFG	Max Kazantsev	2018-12-29	1	-3/+7
\| \| \| \|	llvm-svn: 350145
*	Temporarily disable term folding in LoopSimplifyCFG, add tests	Max Kazantsev	2018-12-28	1	-1/+1
\| \| \| \|	llvm-svn: 350117
*	[LoopSimplifyCFG] Delete dead blocks in RPO	Max Kazantsev	2018-12-28	1	-5/+8
\| \| \| \| \| \| \| \| \|	Deletion of dead blocks in arbitrary order may lead to failure of assertion in `DeleteDeadBlock` that requires that we have deleted all predecessors before we can delete the current block. We should instead delete them in RPO order. llvm-svn: 350116
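The ordering requirement described above can be illustrated with a small sketch. This is not the LoopSimplifyCFG code: the adjacency-list `Graph` type and the function names are hypothetical, chosen only for illustration. The sketch computes a reverse post-order (RPO) from an entry block; in an acyclic region every predecessor appears before its successors in that order, which is the property needed to delete predecessors first.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CFG: Succs[B] lists the successors of block B.
using Graph = std::vector<std::vector<std::size_t>>;

// Depth-first walk that records a block only after its successors.
static void postOrder(const Graph &Succs, std::size_t B,
                      std::vector<bool> &Seen,
                      std::vector<std::size_t> &Out) {
  Seen[B] = true;
  for (std::size_t S : Succs[B])
    if (!Seen[S])
      postOrder(Succs, S, Seen, Out);
  Out.push_back(B);
}

// Reverse post-order of the blocks reachable from Entry.
std::vector<std::size_t> reversePostOrder(const Graph &Succs,
                                          std::size_t Entry) {
  assert(Entry < Succs.size() && "entry must be a valid block index");
  std::vector<bool> Seen(Succs.size(), false);
  std::vector<std::size_t> Order;
  postOrder(Succs, Entry, Seen, Order);
  std::reverse(Order.begin(), Order.end());
  return Order;
}
```

For a diamond-shaped graph (block 0 branching to 1 and 2, both reaching 3), the join block 3 comes last, after both of its predecessors.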
*	[LoopIdiomRecognize] Add CTTZ support	Craig Topper	2018-12-26	1	-64/+85
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Existing LIR recognizes CTLZ where shifting input variable right until it is zero. (Shift-Until-Zero idiom) This commit: 1. Augments Shift-Until-Zero idiom to recognize CTTZ where input variable is shifted left. 2. Prepare for BitScan idiom recognition. Patch by Yuanfang Chen (tabloid.adroit) Reviewers: craig.topper, evstupac Reviewed By: craig.topper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D55876 llvm-svn: 350074
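As a rough illustration of the Shift-Until-Zero idiom mentioned above, the following sketch shows the two loop shapes at source level. The function names are hypothetical and the 32-bit width is an assumption; the commit itself works on LLVM IR, and this sketch only shows how the trip count relates to a leading or trailing zero count.

```cpp
#include <cstdint>

// Right-shift form: for X != 0 the trip count is 32 - ctlz(X).
unsigned shiftRightUntilZero(std::uint32_t X) {
  unsigned N = 0;
  while (X != 0) {
    X >>= 1;
    ++N;
  }
  return N;
}

// Left-shift form: for X != 0 the trip count is 32 - cttz(X).
unsigned shiftLeftUntilZero(std::uint32_t X) {
  unsigned N = 0;
  while (X != 0) {
    X <<= 1;
    ++N;
  }
  return N;
}
```

With X = 8 (bit 3 set), the right-shift loop runs 4 times and the left-shift loop runs 29 times, matching 32 - 28 and 32 - 3 respectively.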
*	[NFC] Use utility function for guards detection	Max Kazantsev	2018-12-26	1	-3/+3
\| \| \| \|	llvm-svn: 350064
*	[NFC] Reuse variables instead of re-calling getParent	Max Kazantsev	2018-12-25	1	-2/+1
\| \| \| \|	llvm-svn: 350062
*	[LoopSimplifyCFG] Delete dead exiting edges	Max Kazantsev	2018-12-24	1	-8/+111
\| \| \| \| \| \| \| \| \| \|	This patch teaches LoopSimplifyCFG to remove dead exiting edges from loops. Differential Revision: https://reviews.llvm.org/D54025 Reviewed By: fedor.sergeev llvm-svn: 350049
*	Return "[LoopSimplifyCFG] Delete dead in-loop blocks"	Max Kazantsev	2018-12-24	1	-10/+32
\| \| \| \| \| \| \| \|	The underlying bug that caused the revert should be fixed by rL348567. Differential Revision: https://reviews.llvm.org/D54023 llvm-svn: 350045
*	[LoopIdioms] More LocationSize::precise annotations; NFC	George Burgess IV	2018-12-24	1	-2/+3
\| \| \| \| \| \| \| \| \|	Both of these places reference memset-like loops. Memset is precise. Trying to keep these patches super small so they're easily post-commit verifiable, as requested in D44748. llvm-svn: 350044
*	[MemCpyOpt] Use LocationSize instead of ints; NFC	George Burgess IV	2018-12-23	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Trying to keep these patches super small so they're easily post-commit verifiable, as requested in D44748. srcSize is derived from the size of an alloca, and we quit out if the size of that is > the size of the thing we're copying to. Hence, we should always copy everything over, so these sizes are precise. Don't make srcSize itself a LocationSize, since optionality isn't helpful, and we do some comparisons against other sizes elsewhere in that function. llvm-svn: 350019
*	[IR] Add Instruction::isLifetimeStartOrEnd, NFC	Vedant Kumar	2018-12-21	2	-10/+5
\| \| \| \| \| \| \| \| \| \| \|	Instruction::isLifetimeStartOrEnd() checks whether an Instruction is an llvm.lifetime.start or an llvm.lifetime.end intrinsic. This was suggested as a cleanup in D55967. Differential Revision: https://reviews.llvm.org/D56019 llvm-svn: 349964
*	[memcpyopt] Add debug logs when forwarding memcpy src to dst	Reid Kleckner	2018-12-21	1	-0/+2
\| \| \| \|	llvm-svn: 349873
*	Introduce llvm.loop.parallel_accesses and llvm.access.group metadata.	Michael Kruse	2018-12-20	5	-7/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The current llvm.mem.parallel_loop_access metadata has a problem in that it uses LoopIDs. LoopID unfortunately is not a loop identifier. It is neither unique (there's even a regression test assigning the same LoopID to multiple loops; can otherwise happen if passes such as LoopVersioning make copies of entire loops) nor persistent (every time a property is removed/added from a LoopID's MDNode, it will also receive a new LoopID; this happens e.g. when calling Loop::setLoopAlreadyUnrolled()). Since most loop transformation passes change the loop attributes (even if it is just to mark that a loop should not be processed again as llvm.loop.isvectorized does, for the versioned and unversioned loop), the parallel access information is lost for any subsequent pass. This patch unlinks LoopIDs and parallel accesses. llvm.mem.parallel_loop_access metadata on instructions is replaced by llvm.access.group metadata. llvm.access.group points to a distinct MDNode with no operands (avoiding the problem of ever needing to add/remove operands), called an "access group". Alternatively, it can point to a list of access groups. The LoopID then has an attribute llvm.loop.parallel_accesses with all the access groups that are parallel (no dependencies carried by this loop). This intentionally avoids any kind of "ID". Loops that are clones/have their attributes modified retain the llvm.loop.parallel_accesses attribute. Access instructions that are cloned point to the same access group. It is not necessary for each access to have its own "ID" MDNode, but those memory access instructions with the same behavior can be grouped together. The behavior of llvm.mem.parallel_loop_access is not changed by this patch, but should be considered deprecated. Differential Revision: https://reviews.llvm.org/D52116 llvm-svn: 349725
*	Revert "[BDCE][DemandedBits] Detect dead uses of undead instructions"	Nikita Popov	2018-12-19	1	-23/+15
\| \| \| \| \| \| \|	This reverts commit r349674. It causes a failure in test-suite enc-3des.execution_time. llvm-svn: 349684
*	[BDCE][DemandedBits] Detect dead uses of undead instructions	Nikita Popov	2018-12-19	1	-15/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This (mostly) fixes https://bugs.llvm.org/show_bug.cgi?id=39771. BDCE currently detects instructions that don't have any demanded bits and replaces their uses with zero. However, if an instruction has multiple uses, then some of the uses may be dead (have no demanded bits) even though the instruction itself is still live. This patch extends DemandedBits/BDCE to detect such uses and replace them with zero. While this will not immediately render any instructions dead, it may lead to simplifications (in the motivating case, by converting a rotate into a simple shift), break dependencies, etc. The implementation tries to strike a balance between analysis power and complexity/memory usage. Originally I wanted to track demanded bits on a per-use level, but ultimately we're only really interested in whether a use is entirely dead or not. I'm using an extra set to track which uses are dead. However, as initially all uses are dead, I'm not storing uses whose user is also dead. This case is checked separately instead. The test case has a couple of cases that are not simplified yet. In particular, we're only looking at uses of instructions right now. I think it would make sense to also extend this to arguments. Furthermore DemandedBits doesn't yet know some of the tricks that InstCombine does for the demanded bits of bitwise or/and/xor in combination with known bits information. Differential Revision: https://reviews.llvm.org/D55563 llvm-svn: 349674
*	[SCCP] Get rid of redundant call for getPredicateInfoFor (NFC).	Florian Hahn	2018-12-18	1	-1/+1
\| \| \| \| \| \|	We can use the result fetched a few lines above. llvm-svn: 349527
*	[LoopUnroll] Honor '#pragma unroll' even with -fno-unroll-loops.	Michael Kruse	2018-12-18	1	-18/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When using clang with `-fno-unroll-loops` (implicitly added with `-O1`), the LoopUnrollPass is not added to the (legacy) pass pipeline. This also means that it will not process any loop metadata such as llvm.loop.unroll.enable (which is generated by #pragma unroll), and WarnMissedTransformationsPass emits a warning that a forced transformation has not been applied (see https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20181210/610833.html). Such explicit transformations should take precedence over disabling heuristics. This patch unconditionally adds LoopUnrollPass to the optimizing pipeline (that is, it is still not added with `-O0`), but passes a flag indicating whether automatic unrolling is dis-/enabled. This is the same approach as LoopVectorize uses. The new pass manager's pipeline builder has no option to disable unrolling, hence the problem does not apply. Differential Revision: https://reviews.llvm.org/D55716 llvm-svn: 349509
*	SROA: preserve alignment tags on loads and stores.	Tim Northover	2018-12-18	1	-16/+43
\| \| \| \| \| \| \| \| \| \| \|	When splitting up an alloca's uses we were dropping any explicit alignment tags, which means they default to the ABI-required default alignment and this can cause miscompiles if the real value was smaller. Also refactor the TBAA metadata into a parent class since it's shared by both children anyway. llvm-svn: 349465
*	[EarlyCSE] If DI can't be salvaged, mark it as unavailable.	Davide Italiano	2018-12-17	1	-1/+2
\| \| \| \| \| \|	Fixes PR39874. llvm-svn: 349323
*	[NewGVN] Update use counts for SSA copies when replacing them by their operands.	Florian Hahn	2018-12-15	1	-4/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The current code relies on LeaderUseCount to determine if we can remove an SSA copy, but in that the LeaderUseCount does not refer to the SSA copy. If a SSA copy is a dominating leader, we use the operand as dominating leader instead. This means we removed a user of a ssa copy and we should decrement its use count, so we can remove the ssa copy once it becomes dead. Fixes PR38804. Reviewers: efriedma, davide Reviewed By: davide Differential Revision: https://reviews.llvm.org/D51595 llvm-svn: 349217
*	[TransformWarning] Do not warn missed transformations in optnone functions.	Michael Kruse	2018-12-14	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Optimization transformations are intentionally disabled by the 'optnone' function attribute. Therefore do not warn if transformation metadata is still present. Using the legacy pass manager structure, the `skipFunction` method takes care of the optnone attribute (already called before this patch). For the new pass manager, there is no equivalent, so we check for the 'optnone' attribute manually. Differential Revision: https://reviews.llvm.org/D55690 llvm-svn: 349184
*	Reapply "[MemCpyOpt] memset->memcpy forwarding with undef tail"	Nikita Popov	2018-12-13	1	-16/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently memcpyopt optimizes cases like memset(a, byte, N); memcpy(b, a, M); to memset(a, byte, N); memset(b, byte, M); if M <= N. Often this allows further simplifications down the line, which drop the first memset entirely. This patch extends this optimization for the case where M > N, but we know that the bytes a[N..M] are undef due to alloca/lifetime.start. This situation arises relatively often for Rust code, because Rust does not initialize trailing structure padding and loves to insert redundant memcpys. This also fixes https://bugs.llvm.org/show_bug.cgi?id=39844. The previous version of this patch did not perform dependency checking properly: While the dependency is checked at the position of the memset, the used size must be that of the memcpy. Previously the size of the memset was used, which missed modification in the region MemSetSize..CopySize, resulting in miscompiles. The added tests cover variations of this issue. Differential Revision: https://reviews.llvm.org/D55120 llvm-svn: 349078
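The pre-existing M <= N case described above can be sketched at source level as follows. The function names are hypothetical and the code is only an illustration of why the two sequences leave the destination with the same bytes; it does not model the new M > N case, which depends on the tail of the source being undef.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Original sequence: fill A, then copy the first M bytes of A into B.
void fillThenCopy(unsigned char *A, unsigned char *B, int Byte,
                  std::size_t N, std::size_t M) {
  assert(M <= N && "sketch only covers the M <= N case");
  std::memset(A, Byte, N);
  std::memcpy(B, A, M);
}

// Forwarded sequence: the memcpy becomes a second memset of the same byte.
void fillThenFill(unsigned char *A, unsigned char *B, int Byte,
                  std::size_t N, std::size_t M) {
  assert(M <= N && "sketch only covers the M <= N case");
  std::memset(A, Byte, N);
  std::memset(B, Byte, M);
}
```

Both functions write the same M bytes to B, so once the memcpy no longer reads A, a later pass may find that the first memset is dead.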
*	Revert r348645 - "[MemCpyOpt] memset->memcpy forwarding with undef tail"	David L. Jones	2018-12-13	1	-30/+16
\| \| \| \| \| \| \|	This revision caused truncated memsets for structs with padding. See: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20181210/610520.html llvm-svn: 349002
*	[Unroll/UnrollAndJam/Vectorizer/Distribute] Add followup loop attributes.	Michael Kruse	2018-12-12	7	-19/+300
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When multiple loop transformations are defined in a loop's metadata, their order of execution is defined by the order of their respective passes in the pass pipeline. For instance, #pragma clang loop unroll_and_jam(enable) #pragma clang loop distribute(enable) is the same as #pragma clang loop distribute(enable) #pragma clang loop unroll_and_jam(enable) and will try to loop-distribute before Unroll-And-Jam because the LoopDistribute pass is scheduled after UnrollAndJam pass. UnrollAndJamPass only supports one inner loop, i.e. it will necessarily fail after loop distribution. It is not possible to specify another execution order. Also, the order of passes in the pipeline is subject to change between versions of LLVM, optimization options and which pass manager is used. This patch adds 'followup' attributes to various loop transformation passes. These attributes define which attributes the resulting loop of a transformation should have. For instance, !0 = !{!0, !1, !2} !1 = !{!"llvm.loop.unroll_and_jam.enable"} !2 = !{!"llvm.loop.unroll_and_jam.followup_inner", !3} !3 = !{!"llvm.loop.distribute.enable"} defines a loop ID (!0) to be unrolled-and-jammed (!1) and then the attribute !3 to be added to the jammed inner loop, which contains the instruction to distribute the inner loop. Currently, in both pass managers, pass execution is in a fixed order and UnrollAndJamPass will not execute again after LoopDistribute. We hope to fix this in the future by allowing pass managers to run passes until a fixpoint is reached, use Polly to perform these transformations, or add a loop transformation pass which takes the order issue into account. For mandatory/forced transformations (e.g. by having been declared by #pragma omp simd), the user must be notified when a transformation could not be performed. 
It is not possible that the responsible pass emits such a warning because the transformation might be 'hidden' in a followup attribute when it is executed, or it is not present in the pipeline at all. For this reason, this patch introduces a WarnMissedTransformations pass, to warn about orphaned transformations. Since this changes the user-visible diagnostic message when a transformation is applied, two test cases in the clang repository need to be updated. To ensure that no other transformation is executed before the intended one, the attribute `llvm.loop.disable_nonforced` can be added which should disable transformation heuristics before the intended transformation is applied. E.g. it would be surprising if a loop is distributed before a #pragma unroll_and_jam is applied. With more supported code transformations (loop fusion, interchange, stripmining, offloading, etc.), transformations can be used as building blocks for more complex transformations (e.g. stripmining+stripmining+interchange -> tiling). Reviewed By: hfinkel, dmgreen Differential Revision: https://reviews.llvm.org/D49281 Differential Revision: https://reviews.llvm.org/D55288 llvm-svn: 348944
*	[NewPM] fixing asserts on deleted loop in -print-after-all	Fedor Sergeev	2018-12-11	1	-1/+5
\| \| \| \| \| \| \| \| \| \| \|	IR-printing AfterPass instrumentation might be called on a loop that has just been invalidated. We should skip printing it to avoid spurious asserts. Reviewed By: chandlerc, philip.pfaffe Differential Revision: https://reviews.llvm.org/D54740 llvm-svn: 348887
*	[Local] Promote an utility that could be used elsewhere. NFCI.	Davide Italiano	2018-12-10	1	-7/+1
\| \| \| \|	llvm-svn: 348804
*	[MemCpyOpt] memset->memcpy forwarding with undef tail	Nikita Popov	2018-12-07	1	-16/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently memcpyopt optimizes cases like memset(a, byte, N); memcpy(b, a, M); to memset(a, byte, N); memset(b, byte, M); if M <= N. Often this allows further simplifications down the line, which drop the first memset entirely. This patch extends this optimization for the case where M > N, but we know that the bytes a[N..M] are undef due to alloca/lifetime.start. This situation arises relatively often for Rust code, because Rust does not initialize trailing structure padding and loves to insert redundant memcpys. This also fixes https://bugs.llvm.org/show_bug.cgi?id=39844. For the implementation, I'm reusing a bit of code for a similar existing optimization (direct memcpy of undef). I've also added memset support to MemDepAnalysis GetLocation -- Instead, getPointerDependencyFrom could be used, but it seems to make more sense to add this to GetLocation and thus make the computation cachable. Differential Revision: https://reviews.llvm.org/D55120 llvm-svn: 348645
*	Reapply "[DemandedBits][BDCE] Support vectors of integers"	Nikita Popov	2018-12-07	1	-6/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	DemandedBits and BDCE currently only support scalar integers. This patch extends them to also handle vector integer operations. In this case bits are not tracked for individual vector elements, instead a bit is demanded if it is demanded for any of the elements. This matches the behavior of computeKnownBits in ValueTracking and SimplifyDemandedBits in InstCombine. Unlike the previous iteration of this patch, getDemandedBits() can now again be called on arbitrary (sized) instructions, even if they don't have integer or vector of integer type. (For vector types the size of the returned mask will now be the scalar size in bits though.) The added LoopVectorize test case shows a case which triggered an assertion failure with the previous attempt, because getDemandedBits() was called on a pointer-typed instruction. Differential Revision: https://reviews.llvm.org/D55297 llvm-svn: 348602
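The rule "a bit is demanded if it is demanded for any of the elements" amounts to a bitwise OR across lanes. The sketch below only illustrates that rule: `demandedForVector` is a hypothetical helper, the 32-bit element width is an assumption, and the real analysis does not materialize per-element masks (it tracks a single scalar-sized mask per value).

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: PerElement[i] is the demanded-bit mask of lane i.
// The vector-wide mask demands a bit if any lane demands it.
std::uint32_t demandedForVector(const std::vector<std::uint32_t> &PerElement) {
  std::uint32_t Mask = 0;
  for (std::uint32_t M : PerElement)
    Mask |= M;
  return Mask;
}
```

If one lane only needs its low nibble and another only its next nibble, the combined mask covers the low byte, so no bit in that byte can be treated as dead for the vector as a whole.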
*	Introduce llvm.experimental.widenable_condition intrinsic	Max Kazantsev	2018-12-07	3	-0/+122
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch introduces a new intrinsic `@llvm.experimental.widenable_condition` that allows explicit representation for guards. It is an alternative to using `@llvm.experimental.guard` intrinsic that does not contain implicit control flow. We keep finding places where `@llvm.experimental.guard` is not supported or treated too conservatively, and there are 2 reasons for that: - `@llvm.experimental.guard` has memory write side effect to model implicit control flow, and this sometimes confuses passes and analyses that work with memory; - Not all passes and analyses are aware of the semantics of guards. These passes treat them as a regular throwing call and have no idea that the condition of a guard may be used to prove something. One well-known place which had caused us trouble in the past is explicit loop iteration count calculation in SCEV. Another example is new loop unswitching which is not aware of guards. Whenever a new pass appears, we potentially have this problem there. Rather than go and fix all these places (and commit to keep track of them and add support in future), it seems more reasonable to leverage the existing optimizer's logic as much as possible. The only significant difference between guards and regular explicit branches is that guard's condition can be widened. It means that a guard contains (explicitly or implicitly) a `deopt` block successor, and it is always legal to go there no matter what the guard condition is. The other successor is a guarded block, and it is only legal to go there if the condition is true. This patch introduces a new explicit form of guards alternative to `@llvm.experimental.guard` intrinsic. 
Now a widenable guard can be represented in the CFG explicitly like this: %widenable_condition = call i1 @llvm.experimental.widenable.condition() %new_condition = and i1 %cond, %widenable_condition br i1 %new_condition, label %guarded, label %deopt guarded: ; Guarded instructions deopt: call type @llvm.experimental.deoptimize(<args...>) [ "deopt"(<deopt_args...>) ] The new intrinsic `@llvm.experimental.widenable.condition` has semantics of an `undef`, but the intrinsic prevents the optimizer from folding it early. This form should exploit all optimization boons provided to the `br` instruction, and it still can be widened by replacing the result of `@llvm.experimental.widenable.condition()` with `and` with any arbitrary boolean value (as long as the branch that is taken when it is `false` has a deopt and has no side-effects). For more motivation, please check llvm-dev discussion "[llvm-dev] Giving up using implicit control flow in guards". This patch introduces this new intrinsic with respective LangRef changes and a pass that converts old-style guards (expressed as intrinsics) into the new form. The naming discussion is still ongoing. Merging this to unblock further items. We can later change the name of this intrinsic. Reviewed By: reames, fedor.sergeev, sanjoy Differential Revision: https://reviews.llvm.org/D51207 llvm-svn: 348593
*	[LoopSimplifyCFG] Do not deal with loops with irreducible CFG inside	Max Kazantsev	2018-12-07	1	-0/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The current algorithm that collects live/dead/inloop blocks relies on some invariants related to RPO and PO traversals. In particular, the important fact it requires is that the loop's only latch is the first block in PO traversal. It also relies on the fact that during RPO we visit all predecessors of a block before we visit this block (backedges ignored). If a loop has an irreducible non-loop cycle inside, both these assumptions may break. This patch adds detection for this situation and prohibits the terminator folding for loops with irreducible CFG. We can in theory support this later, but this requires some algorithmic changes. Besides, irreducible CFG is not a frequent situation and we can simply not bother. Thanks @uabelho for finding this! Differential Revision: https://reviews.llvm.org/D55357 Reviewed By: skatkov llvm-svn: 348567
*	Revert "[DemandedBits][BDCE] Support vectors of integers"	Nikita Popov	2018-12-07	1	-7/+6
\| \| \| \| \| \| \|	This reverts commit r348549. Causing assertion failures during clang build. llvm-svn: 348558
*	[DemandedBits][BDCE] Support vectors of integers	Nikita Popov	2018-12-06	1	-6/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	DemandedBits and BDCE currently only support scalar integers. This patch extends them to also handle vector integer operations. In this case bits are not tracked for individual vector elements, instead a bit is demanded if it is demanded for any of the elements. This matches the behavior of computeKnownBits in ValueTracking and SimplifyDemandedBits in InstCombine. The getDemandedBits() method can now only be called on instructions that have integer or vector of integer type. Previously it could be called on any sized instruction (even if it was not particularly useful). The size of the return value is now always the scalar size in bits (while previously it was the type size in bits). Differential Revision: https://reviews.llvm.org/D55297 llvm-svn: 348549
*	[GVN] Don't perform scalar PRE on GEPs	Alexandros Lamprineas	2018-12-06	1	-0/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Partial Redundancy Elimination of GEPs prevents CodeGenPrepare from sinking the addressing mode computation of memory instructions back to its uses. The problem comes from the insertion of PHIs, which confuse CGP and make it bail. I've autogenerated the check lines of an existing test and added a store instruction to demonstrate the motivation behind this change. The store is now using the gep instead of a phi. Differential Revision: https://reviews.llvm.org/D55009 llvm-svn: 348496
*	Revert "[LoopSimplifyCFG] Delete dead in-loop blocks"	Ilya Biryukov	2018-12-06	1	-32/+10
\| \| \| \| \| \| \| \|	This reverts commit r348457. The original commit causes clang to crash when doing an instrumented build with a new pass manager. Reverting to unbreak our integrate. llvm-svn: 348484
*	[LoopSimplifyCFG] Delete dead in-loop blocks	Max Kazantsev	2018-12-06	1	-10/+32
\| \| \| \| \| \| \| \| \| \|	This patch teaches LoopSimplifyCFG to delete loop blocks that have become unreachable after terminator folding has been done. Differential Revision: https://reviews.llvm.org/D54023 Reviewed By: anna llvm-svn: 348457
*	[LICM] Actually disable ControlFlowHoisting.	Alina Sbirlea	2018-12-05	1	-14/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The remaining code paths that ControlFlowHoisting introduced that were not disabled, increased compile time by 3x for some benchmarks. The time is spent in DominatorTree updates. Reviewers: john.brawn, mkazantsev Subscribers: sanjoy, jlebar, llvm-commits Differential Revision: https://reviews.llvm.org/D55313 llvm-svn: 348345
*	[SimpleLoopUnswitch] Remove debug dump.	Alina Sbirlea	2018-12-04	1	-3/+1
\| \| \| \|	llvm-svn: 348267
*	Update MemorySSA in SimpleLoopUnswitch.	Alina Sbirlea	2018-12-04	1	-77/+235
\| \| \| \| \| \| \| \| \| \| \|	Summary: Teach SimpleLoopUnswitch to preserve MemorySSA. Subscribers: sanjoy, jlebar, Prazek, george.burgess.iv, llvm-commits Differential Revision: https://reviews.llvm.org/D47022 llvm-svn: 348263
*	[LoopSimplifyCFG] Update MemorySSA in terminator folding. PR39783	Max Kazantsev	2018-11-30	1	-6/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Terminator folding transform lacks MemorySSA update for memory Phis, while they exist within MemorySSA analysis. They need exactly the same type of updates as regular Phis. Failing to update them properly ends up with inconsistent MemorySSA and manifests in various assertion failures. This patch adds Memory Phi updates to this transform. Thanks to @jonpa for finding this! Differential Revision: https://reviews.llvm.org/D55050 Reviewed By: asbirlea llvm-svn: 347979
*	[LICM] Reapply r347776 "Make LICM able to hoist phis" with fix	John Brawn	2018-11-29	1	-15/+314
\| \| \| \| \| \| \| \| \| \|	This commit caused a large compile-time slowdown in some cases when NDEBUG is off due to the dominator tree verification it added. Fix this by only doing dominator tree and loop info verification when something has been hoisted. Differential Revision: https://reviews.llvm.org/D52827 llvm-svn: 347889
*	[CallSiteSplitting] Report edge deletion to DomTreeUpdater	Joseph Tremoulet	2018-11-29	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: When splitting musttail calls, the split blocks' original terminators get removed; inform the DTU when this happens. Also add a testcase that fails an assertion in the DTU without this fix. Reviewers: fhahn, junbuml Reviewed By: fhahn Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D55027 llvm-svn: 347872
*	[CVP] tidy processCmp(); NFC	Sanjay Patel	2018-11-29	1	-14/+14
\| \| \| \| \| \| \| \|	1. The variables were confusing: 'C' typically refers to a constant, but here it was the Cmp. 2. Formatting violations. 3. Simplify code to return true/false constant. llvm-svn: 347868
*	Revert "[LICM] Enable control flow hoisting by default" and "[LICM] Reapply r347190 "Make LICM able to hoist phis" with fix"	Martin Storsjo	2018-11-29	1	-312/+15
\| \| \| \| \| \| \| \| \| \| \|	This reverts commits r347776 and r347778. The first one, r347776, caused significant compile time regressions for certain input files, see PR39836 for details. llvm-svn: 347867
*	Disable TermFolding in LoopSimplifyCFG until PR39783 is fixed	Max Kazantsev	2018-11-29	1	-1/+1
\| \| \| \|	llvm-svn: 347844
*	[LoopStrengthReduce] ComplexityLimit as an option	Sam Parker	2018-11-29	1	-3/+5
\| \| \| \| \| \| \| \|	Convert ComplexityLimit into a command line value. Differential Revision: https://reviews.llvm.org/D54899 llvm-svn: 347843
*	[LICM] Enable control flow hoisting by default	John Brawn	2018-11-28	1	-1/+1
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D54949 llvm-svn: 347778