bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	More fixes for subreg join failure in RegCoalescer	Tim Renouf	2018-07-17	1	-4/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Part of the adjustCopiesBackFrom method wasn't correctly dealing with SubRange intervals when updating. 2 changes. The first to ensure that bogus SubRange Segments aren't propagated when encountering Segments of the form [1234r, 1234d:0) when preparing to merge value numbers. These can be removed in this case. The second forces a shrinkToUses call if SubRanges end on the copy index (instead of just the parent register). V2: Addressed review comments, plus MIR test instead of ll test Subscribers: MatzeB, qcolombet, nhaehnle Differential Revision: https://reviews.llvm.org/D40308 Change-Id: I1d2b2b4beea802fce11da01edf71feb2064aab05 llvm-svn: 337273
*	[DAGCombiner] Call SimplifyDemandedVectorElts from EXTRACT_VECTOR_ELT	Simon Pilgrim	2018-07-17	1	-4/+23
\| \| \| \| \| \| \| \|	If we are only extracting vector elements via EXTRACT_VECTOR_ELT(s) we may be able to use SimplifyDemandedVectorElts to avoid unnecessary vector ops. Differential Revision: https://reviews.llvm.org/D49262 llvm-svn: 337258
*	[Intrinsics] define funnel shift IR intrinsics + DAG builder support	Sanjay Patel	2018-07-16	1	-0/+37
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As discussed here: http://lists.llvm.org/pipermail/llvm-dev/2018-May/123292.html http://lists.llvm.org/pipermail/llvm-dev/2018-July/124400.html We want to add rotate intrinsics because the IR expansion of that pattern is 4+ instructions, and we can lose pieces of the pattern before it gets to the backend. Generalizing the operation by allowing 2 different input values (plus the 3rd shift/rotate amount) gives us a "funnel shift" operation which may also be a single hardware instruction. Initially, I thought we needed to define new DAG nodes for these ops, and I spent time working on that (much larger patch), but then I concluded that we don't need it. At least as a first step, we have all of the backend support necessary to match these ops...because it was required. And shepherding these through the IR optimizer is the primary concern, so the IR intrinsics are likely all that we'll ever need. There was also a question about converting the intrinsics to the existing ROTL/ROTR DAG nodes (along with improving the oversized shift documentation). Again, I don't think that's strictly necessary (as the test results here prove). That can be an efficiency improvement as a small follow-up patch. So all we're left with is documentation, definition of the IR intrinsics, and DAG builder support. Differential Revision: https://reviews.llvm.org/D49242 llvm-svn: 337221
*	[CodeGen] Fix inconsistent declaration parameter name	Fangrui Song	2018-07-16	37	-89/+89
\| \| \| \|	llvm-svn: 337200
*	[llvm] Change 2 instances of std::sort to llvm::sort	Mandeep Singh Grang	2018-07-16	1	-1/+1
\| \| \| \|	llvm-svn: 337192
*	[RegAlloc] Skip global splitting if the live range is huge and its spill is	Wei Mi	2018-07-16	1	-0/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	trivially rematerializable. We run into a case where machineLICM hoists a large number of live ranges outside of a big loop because it thinks those live ranges are trivially rematerializable. In regalloc, global splitting is tried out first for those live ranges before they are spilled and rematerialized. Because the global splitting algorithm is quadratic, a large increase in global splitting candidates causes a huge compile time increase (50s to 1400s on my local machine when compiling a module). However, we think for live ranges which are very large and are trivially rematerializable, it is better to just skip global splitting so as to save compile time with little chance of sacrificing performance. We use the segment size of the live range to indirectly evaluate whether the global splitting of the live range can introduce high cost, and use an option as a knob to adjust the size limit threshold. Differential Revision: https://reviews.llvm.org/D49353 llvm-svn: 337186
*	[X86][AArch64][DAGCombine] Unfold 'check for [no] signed truncation' pattern	Roman Lebedev	2018-07-16	1	-0/+78
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: [[ https://bugs.llvm.org/show_bug.cgi?id=38149 \| PR38149 ]] As discussed in https://reviews.llvm.org/D49179#1158957 and later, the IR for 'check for [no] signed truncation' pattern can be improved: https://rise4fun.com/Alive/gBf ^ that pattern will be produced by Implicit Integer Truncation sanitizer, https://reviews.llvm.org/D48958 https://bugs.llvm.org/show_bug.cgi?id=21530 in signed case, therefore it is probably a good idea to improve it. But the IR-optimal pattern does not lower efficiently, so we want to undo it. This handles the simple pattern. There is a second pattern with predicate and constants inverted. NOTE: we do not check uses here; we always do the transform. Reviewers: spatel, craig.topper, RKSimon, javed.absar Reviewed By: spatel Subscribers: kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D49266 llvm-svn: 337166
*	Avoid losing Hi part when expanding VAARG nodes on big endian machines	Daniel Cederman	2018-07-16	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: If the high part of the load is not used the offset to the next element will not be set correctly. For example, on Sparc V8, the following code will read val2 from offset 4 instead of 8. ``` int val = __builtin_va_arg(va, long long); int val2 = __builtin_va_arg(va, int); ``` Reviewers: jyknight Reviewed By: jyknight Subscribers: fedor.sergeev, jrtc27, llvm-commits Differential Revision: https://reviews.llvm.org/D48595 llvm-svn: 337161
*	[AccelTable] Provide DWARF5AccelTableStaticData for dsymutil.	Jonas Devlieghere	2018-07-16	1	-39/+81
\| \| \| \| \| \| \| \| \| \| \|	For dsymutil we want to store offsets in the accelerator table entries rather than DIE pointers. In addition, we need a way to communicate which CU a DIE belongs to. This patch provides support for both of these issues. Differential revision: https://reviews.llvm.org/D49102 llvm-svn: 337158
*	Recommit r335794 "Add support for generating a call graph profile from ↵	Michael J. Spencer	2018-07-16	1	-7/+53
\| \| \| \| \| \|	Branch Frequency Info." with fix for removed functions. llvm-svn: 337140
*	[DAGCombiner] fix typo in comment; NFC	Sanjay Patel	2018-07-15	1	-1/+1
\| \| \| \|	llvm-svn: 337132
*	[DAGCombiner] extend(ifpositive(X)) -> shift-right (not X)	Sanjay Patel	2018-07-15	1	-0/+37
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is almost the same as an existing IR canonicalization in instcombine, so I'm assuming this is a good early generic DAG combine too. The motivation comes from reduced bit-hacking for select-of-constants in IR after rL331486. We want to restore that functionality in the DAG as noted in the commit comments for that change and the llvm-dev discussion here: http://lists.llvm.org/pipermail/llvm-dev/2018-July/124433.html The PPC and AArch tests show that those targets are already doing something similar. x86 will be neutral in the minimal case and generally better when this pattern is extended with other ops as shown in the signbit-shift.ll tests. Note the asymmetry: we don't include the (extend (ifneg X)) transform because it already exists in SimplifySelectCC(), and that is verified in the later unchanged tests in the signbit-shift.ll files. Without the 'not' op, the general transform to use a shift is always a win because that's a single instruction. Alive proofs: https://rise4fun.com/Alive/ysli Name: if pos, get -1 %c = icmp sgt i16 %x, -1 %r = sext i1 %c to i16 => %n = xor i16 %x, -1 %r = ashr i16 %n, 15 Name: if pos, get 1 %c = icmp sgt i16 %x, -1 %r = zext i1 %c to i16 => %n = xor i16 %x, -1 %r = lshr i16 %n, 15 Differential Revision: https://reviews.llvm.org/D48970 llvm-svn: 337130
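The two Alive proofs above can be checked by brute force over every 16-bit value. The following is a minimal Python sketch, not LLVM code: the helper names (`to_signed`, `ashr`, `lshr`, `folds_agree`) are hypothetical and only model the i16 semantics of the IR shown in the commit message.

```python
BITS = 16
MASK = (1 << BITS) - 1  # 0xFFFF, the bit pattern of i16 -1


def to_signed(value):
    """Interpret a 16-bit pattern as a two's-complement integer."""
    return value - (1 << BITS) if value & (1 << (BITS - 1)) else value


def ashr(value, amount):
    """Arithmetic shift right on a 16-bit pattern (sign bit is replicated)."""
    return (to_signed(value) >> amount) & MASK


def lshr(value, amount):
    """Logical shift right on a 16-bit pattern (zeros are shifted in)."""
    return (value & MASK) >> amount


def sext_ifpositive(x):
    # %c = icmp sgt i16 %x, -1 ; %r = sext i1 %c to i16
    return MASK if to_signed(x) > -1 else 0


def zext_ifpositive(x):
    # %c = icmp sgt i16 %x, -1 ; %r = zext i1 %c to i16
    return 1 if to_signed(x) > -1 else 0


def sext_folded(x):
    # %n = xor i16 %x, -1 ; %r = ashr i16 %n, 15
    return ashr(x ^ MASK, BITS - 1)


def zext_folded(x):
    # %n = xor i16 %x, -1 ; %r = lshr i16 %n, 15
    return lshr(x ^ MASK, BITS - 1)


def folds_agree():
    """Compare both forms for every possible i16 input."""
    return all(
        sext_ifpositive(x) == sext_folded(x)
        and zext_ifpositive(x) == zext_folded(x)
        for x in range(1 << BITS)
    )
```

Calling `folds_agree()` walks all 65536 inputs, which is cheap at this width; wider types would need a real prover such as Alive.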
*	[MachineOutliner] Check the last instruction from the sequence when updating ↵	Francis Visoiu Mistrih	2018-07-14	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	liveness The MachineOutliner was doing an std::for_each from the call (inserted before the outlined sequence) to the iterator at the end of the sequence. std::for_each needs the iterator past the end, so the last instruction was not taken into account when propagating the liveness information. This fixes the machine verifier issue in machine-outliner-disubprogram.ll. Differential Revision: https://reviews.llvm.org/D49295 llvm-svn: 337090
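The off-by-one described above comes from half-open ranges: a traversal from `first` to `last` visits everything up to but not including `last`. The following is a rough Python analogue, not the MachineOutliner code; the function name and the instruction labels are made up for illustration.

```python
def visit_half_open(items, first, last):
    """Visit items[first:last]; as with std::for_each, `last` is not visited."""
    visited = []
    for index in range(first, last):
        visited.append(items[index])
    return visited


# Hypothetical block: a call followed by the instructions of the sequence.
sequence = ["call", "insn_a", "insn_b", "insn_last", "after"]
call_index = 0
last_index = 3  # index OF the last instruction, not one past it

# Passing the index of the last instruction as the end skips that instruction.
buggy = visit_half_open(sequence, call_index, last_index)

# Passing one past the last instruction covers the whole sequence.
fixed = visit_half_open(sequence, call_index, last_index + 1)
```

Here `buggy` stops at `insn_b`, while `fixed` also includes `insn_last`, which matches the missing liveness update the commit describes.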
*	[SLH] Introduce a new pass to do Speculative Load Hardening to mitigate	Chandler Carruth	2018-07-13	1	-0/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Spectre variant #1 for x86. There is a lengthy, detailed RFC thread on llvm-dev which discusses the high level issues. High level discussion is probably best there. I've split the design document out of this patch and will land it separately once I update it to reflect the latest edits and updates to the Google doc used in the RFC thread. This patch is really just an initial step. It isn't quite ready for prime time and is only exposed via debugging flags. It has three major limitations currently: 1) It only supports x86-64, and only certain ABIs. Many assumptions are currently hard-coded and need to be factored out of the code here. 2) It doesn't include any options for more fine-grained control, either of which control flow edges are significant or which loads are important to be hardened. 3) The code is still quite rough and the testing lighter than I'd like. However, this is enough for people to begin using. I have had numerous requests from people to be able to experiment with this patch to understand the trade-offs it presents and how to use it. We would also like to encourage work to similar effect in other toolchains. The ARM folks are actively developing a system based on this for AArch64. We hope to merge this with their efforts when both are far enough along. But we also don't want to block making this available on that effort. Many thanks to the numerous people who helped along the way here. For this patch in particular, both Eric and Craig did a ton of review to even have confidence in it as an early, rough cut at this functionality. Differential Revision: https://reviews.llvm.org/D44824 llvm-svn: 336990
*	[LiveDebugValues] Tracking copying value between registers	Petar Jovanovic	2018-07-13	1	-52/+127
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In long functions, or functions with a lot of inlined code, a tracked value can be transferred from one register to another. The transfer is recognized only if the destination register is a callee-saved register and the source register is killed. We do not salvage caller-saved registers, since there is a great chance that the killed register would outlive it. Patch by Nikola Prica. Differential Revision: https://reviews.llvm.org/D44016 llvm-svn: 336978
*	CodeGen: Remove pipeline dependencies on StackProtector; NFC	Matthias Braun	2018-07-13	13	-69/+83
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This re-applies r336929 with a fix to accommodate the Mips target scheduling multiple SelectionDAG instances into the pass pipeline. PrologEpilogInserter and StackColoring depend on the StackProtector analysis being alive from the point it is run until PEI, which requires that they are all scheduled in the same FunctionPassManager. Inserting a (machine) ModulePass between StackProtector and PEI results in these passes being in separate FunctionPassManagers and the StackProtector is not available for PEI. PEI and StackColoring don't use much information from the StackProtector pass, so transferring the required information to MachineFrameInfo is cleaner than keeping the StackProtector pass around. This commit moves the SSP layout information to MFI instead of keeping it in the pass. This patch set (D37580, D37581, D37582, D37583, D37584, D37585, D37586, D37587) is a first draft of the pagerando implementation described in http://lists.llvm.org/pipermail/llvm-dev/2017-June/113794.html. Patch by Stephen Crane <sjc@immunant.com> Differential Revision: https://reviews.llvm.org/D49256 llvm-svn: 336964
*	Revert "(HEAD -> master, origin/master, arcpatch-D37582) CodeGen: Remove ↵	Matthias Braun	2018-07-12	13	-83/+69
\| \| \| \| \| \| \| \| \| \|	pipeline dependencies on StackProtector; NFC" This was triggering pass scheduling failures. This reverts commit r336929. llvm-svn: 336934
*	CodeGen: Remove pipeline dependencies on StackProtector; NFC	Matthias Braun	2018-07-12	13	-69/+83
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	PrologEpilogInserter and StackColoring depend on the StackProtector analysis being alive from the point it is run until PEI, which requires that they are all scheduled in the same FunctionPassManager. Inserting a (machine) ModulePass between StackProtector and PEI results in these passes being in separate FunctionPassManagers and the StackProtector is not available for PEI. PEI and StackColoring don't use much information from the StackProtector pass, so transferring the required information to MachineFrameInfo is cleaner than keeping the StackProtector pass around. This commit moves the SSP layout information to MFI instead of keeping it in the pass. This patch set (D37580, D37581, D37582, D37583, D37584, D37585, D37586, D37587) is a first draft of the pagerando implementation described in http://lists.llvm.org/pipermail/llvm-dev/2017-June/113794.html. Patch by Stephen Crane <sjc@immunant.com> Differential Revision: https://reviews.llvm.org/D49256 llvm-svn: 336929
*	[DWARF v5] Generate range list tables into the .debug_rnglists section. No ↵	Wolfgang Pieb	2018-07-12	6	-19/+138
\| \| \| \| \| \| \| \| \| \| \| \|	support for split DWARF and no use of DW_FORM_rnglistx with the DW_AT_ranges attribute. Reviewer: aprantl Differential Revision: https://reviews.llvm.org/D49214 llvm-svn: 336927
*	[CodeGen] Emit more precise AssertZext/AssertSext nodes.	Eli Friedman	2018-07-11	1	-25/+8
\| \| \| \| \| \| \| \| \| \| \| \|	This is marginally helpful for removing redundant extensions, and the code is easier to read, so it seems like an all-around win. In the new test i8-phi-ext.ll, we used to emit an AssertSext i8; now we emit an AssertZext i2, which allows the extension of the return value to be eliminated. Differential Revision: https://reviews.llvm.org/D49004 llvm-svn: 336868
*	[CodeGen] Ignore debug uses in MachineCopyPropagation	Krzysztof Parzyszek	2018-07-11	1	-1/+1
\| \| \| \| \| \| \|	Debug uses should not count as real uses, since the presence of debug information could affect the generated code. llvm-svn: 336803
*	[NFC][InstCombine] Converts isLegalNarrowLoad into isLegalNarrowLdSt	Diogo N. Sampaio	2018-07-11	1	-41/+55
\| \| \| \| \| \| \| \| \| \| \|	Reuse this function to test the correctness and profitability of reducing the width of either load or store operations. Reviewers: samparker Differential Revision: https://reviews.llvm.org/D48624 llvm-svn: 336800
*	[SelectionDAG] Add constant buildvector support to isKnownNeverZero	Simon Pilgrim	2018-07-11	2	-6/+9
\| \| \| \| \| \|	This allows us to use SelectionDAG::isKnownNeverZero in DAGCombiner::visitREM (visitSDIVLike/visitUDIVLike handle the checking for constants). llvm-svn: 336779
*	[DAGCombiner] Support non-uniform X%C -> X-(X/C)*C folds	Simon Pilgrim	2018-07-11	1	-1/+4
\| \| \| \| \| \| \| \| \| \|	First stage in PR38057 - support non-uniform constant vectors in the combine to reuse the division-by-constant logic. We can definitely do better for srem pow2 remainders (and avoid that extra multiply....) but this at least helps keep everything on the vector unit. Differential Revision: https://reviews.llvm.org/D48975 llvm-svn: 336774
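The X%C -> X-(X/C)*C fold above can be illustrated per lane with a non-uniform constant vector. This is a small Python sketch under stated assumptions: lanes are modelled as unsigned integers in plain lists, every divisor is non-zero, and `urem_via_div` is a hypothetical name, not a DAGCombiner function.

```python
def urem_via_div(xs, cs):
    """Per-lane x - (x / c) * c for unsigned lanes; every c must be non-zero."""
    return [x - (x // c) * c for x, c in zip(xs, cs)]


xs = [100, 37, 255, 8]
cs = [7, 5, 16, 3]  # non-uniform: each lane has its own constant divisor

expanded = urem_via_div(xs, cs)
reference = [x % c for x, c in zip(xs, cs)]
```

Both lists come out as `[2, 2, 15, 2]`. The point of the expansion is that the division can then reuse the existing division-by-constant lowering, at the cost of the extra multiply and subtract that the commit message mentions.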
*	[DAGCombiner] Add (urem X, -1) -> select(X == -1, 0, x) fold	Simon Pilgrim	2018-07-11	1	-0/+6
\| \| \| \|	llvm-svn: 336773
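The fold named in the commit title, (urem X, -1) -> select(X == -1, 0, X), holds because an unsigned -1 is the all-ones value, the largest value of the type: every other X is smaller than the divisor and is its own remainder. Here is a minimal Python check at 8 bits; the width and the helper names are assumptions chosen to keep the exhaustive loop tiny.

```python
BITS = 8
ALL_ONES = (1 << BITS) - 1  # the unsigned bit pattern of -1


def urem_by_all_ones(x):
    """Reference semantics: unsigned remainder of x divided by all-ones."""
    return x % ALL_ONES


def folded(x):
    """select(x == -1, 0, x)"""
    return 0 if x == ALL_ONES else x


fold_is_exact = all(
    urem_by_all_ones(x) == folded(x) for x in range(1 << BITS)
)
```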
*	[DAGCombiner] Add special case fast paths for udiv x,1 and udiv x,-1	Simon Pilgrim	2018-07-10	1	-0/+9
\| \| \| \| \| \|	udiv x,-1 was going down the (slow) BuildUDIV route resulting in unnecessary shifts. llvm-svn: 336701
*	Revert "[AccelTable] Provide abstraction for emitting DWARF5 accelerator ↵	Jonas Devlieghere	2018-07-10	1	-56/+18
\| \| \| \| \| \| \| \| \|	tables." This reverts r336529 because an alternative approach turned out to be a better fit for dsymutil. llvm-svn: 336698
*	[DAGCombiner] visitREM - call visitSDIVLike/visitUDIVLike directly to avoid ↵	Simon Pilgrim	2018-07-10	1	-12/+9
\| \| \| \| \| \| \| \|	recursive combining. As suggested by @efriedma on D48975 use the visitSDIVLike/visitUDIVLike functions introduced at rL336656. llvm-svn: 336664
*	[DAGCombiner] Split SDIV/UDIV optimization expansions from the rest of the ↵	Simon Pilgrim	2018-07-10	1	-15/+44
\| \| \| \| \| \| \| \|	combines. NFCI. As suggested by @efriedma on D48975, this patch separates the BuildDiv/Pow2 style optimizations from the rest of the visitSDIV/visitUDIV to make it easier to reuse the combines and will allow us to avoid some rather nasty node recursive combining in visitREM. llvm-svn: 336656
*	[DWARF][NFC] Refactor range list emission to use a static helper	Wolfgang Pieb	2018-07-10	1	-57/+59
\| \| \| \| \| \| \| \| \| \| \|	This is prep for DWARF v5 range list emission. Emission of a single range list is moved to a static helper function. Reviewer: jdevlieghere Differential Revision: https://reviews.llvm.org/D49098 llvm-svn: 336621
*	RenameIndependentSubregs: Fix handling of undef tied operands	Mark Searles	2018-07-09	1	-6/+10
\| \| \| \| \| \| \| \| \|	Ensure that, when updating a tied operand pair, only that pair is updated. Differential Revision: https://reviews.llvm.org/D49052 llvm-svn: 336593
*	[globalisel][irtranslator] Add support for atomicrmw and (strong) cmpxchg	Daniel Sanders	2018-07-09	2	-1/+224
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch adds support for the atomicrmw instructions and the strong cmpxchg instruction to the IRTranslator. I've left out weak cmpxchg because LangRef.rst isn't entirely clear on what difference it makes to the backend. As far as I can tell from the code, it only matters to AtomicExpandPass which is run at the LLVM-IR level. Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar, volkan, javed.absar Reviewed By: qcolombet Subscribers: kristof.beyls, javed.absar, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D40092 llvm-svn: 336589
*	[X86][TLI] DAGCombine: Unfold variable bit-clearing mask to two shifts.	Roman Lebedev	2018-07-09	1	-0/+58
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This adds a reverse transform for the instcombine canonicalizations that were added in D47980, D47981. As discussed later, that was worse at least for the code size, and potentially for the performance, too. https://rise4fun.com/Alive/Zmpl Reviewers: craig.topper, RKSimon, spatel Reviewed By: spatel Subscribers: reames, llvm-commits Differential Revision: https://reviews.llvm.org/D48768 llvm-svn: 336585
*	[SelectionDAG] Add VT consistency checks to the creation of ISD::FMA.	Craig Topper	2018-07-09	1	-0/+3
\| \| \| \| \| \|	This is similar to what is done for binops. I don't know if this would have helped us catch the bug fixed in r336566 earlier or not, but I figured it couldn't hurt. llvm-svn: 336576
*	[AccelTable] Provide abstraction for emitting DWARF5 accelerator tables.	Jonas Devlieghere	2018-07-09	1	-18/+56
\| \| \| \| \| \| \| \| \| \| \| \|	When emitting the DWARF accelerator tables from dsymutil, we don't have a DwarfDebug instance and we use a custom class to represent Dwarf compile units. This patch adds an interface AccelTableWriterInfo to abstract these from the Dwarf5AccelTableWriter, so we can have a custom implementation for this in dsymutil. Differential revision: https://reviews.llvm.org/D49031 llvm-svn: 336529
*	[AccelTable] Dwarf5AccelTableEmitter -> Writer (NFC)	Jonas Devlieghere	2018-07-09	1	-38/+38
\| \| \| \| \| \| \|	Renames Dwarf5AccelTableEmitter to Dwarf5AccelTableWriter as suggested in D49031. llvm-svn: 336525
*	[SelectionDAG] Split float and integer isKnownNeverZero tests	Simon Pilgrim	2018-07-07	1	-1/+11
\| \| \| \| \| \| \| \| \| \|	Splits off isKnownNeverZeroFloat to handle +/- 0 float cases. This will make it easier to be more aggressive with the integer isKnownNeverZero tests (similar to ValueTracking), use computeKnownBits etc. Differential Revision: https://reviews.llvm.org/D48969 llvm-svn: 336492
*	Use const APInt& to avoid extra copy. NFCI.	Simon Pilgrim	2018-07-07	1	-1/+1
\| \| \| \| \| \|	As discussed on D48825. llvm-svn: 336491
*	[DAGCombiner] Add EXTRACT_SUBVECTOR to SimplifyDemandedVectorElts	Simon Pilgrim	2018-07-07	2	-0/+22
\| \| \| \| \| \| \| \|	As discussed on PR37989, this patch adds EXTRACT_SUBVECTOR handling to TargetLowering::SimplifyDemandedVectorElts and calls it from DAGCombiner::visitEXTRACT_SUBVECTOR. Differential Revision: https://reviews.llvm.org/D48825 llvm-svn: 336490
*	Use Type::isIntOrPtrTy where possible, NFC	Vedant Kumar	2018-07-06	1	-5/+3
\| \| \| \| \| \| \| \| \| \| \|	It's a bit neater to write T.isIntOrPtrTy() over `T.isIntegerTy() \|\| T.isPointerTy()`. I used Python's re.sub with this regex to update users: r'([\w.\->()]+)isIntegerTy\(\)\s\\|\\|\s\1isPointerTy\(\)' llvm-svn: 336462
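The `re.sub` rewrite described above might look like the sketch below. The search pattern is the one quoted in the commit message (with `||` escaped as `\|\|`); the replacement template `\1isIntOrPtrTy()` and the sample input line are assumptions, since the message does not show them.

```python
import re

# Pattern from the commit message: capture the receiver expression, then
# require the same receiver (\1) on both sides of the `||`.
PATTERN = re.compile(r'([\w.\->()]+)isIntegerTy\(\)\s\|\|\s\1isPointerTy\(\)')

# Assumed replacement: keep the captured receiver and call the combined helper.
REPLACEMENT = r'\1isIntOrPtrTy()'


def rewrite(source_line):
    """Collapse `X.isIntegerTy() || X.isPointerTy()` into `X.isIntOrPtrTy()`."""
    return PATTERN.sub(REPLACEMENT, source_line)


before = "if (Ty->isIntegerTy() || Ty->isPointerTy())"
after = rewrite(before)

# Different receivers on each side must be left alone.
untouched = rewrite("if (A->isIntegerTy() || B->isPointerTy())")
```

The backreference `\1` is what keeps the rewrite safe: a line that tests two different values does not match and is returned unchanged.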
*	Revert 336426 (and follow-ups 428, 440), it very likely caused PR38084.	Nico Weber	2018-07-06	1	-105/+0
\| \| \| \|	llvm-svn: 336453
*	[Local] replaceAllDbgUsesWith: Update debug values before RAUW	Vedant Kumar	2018-07-06	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The replaceAllDbgUsesWith utility helps passes preserve debug info when replacing one value with another. This improves upon the existing insertReplacementDbgValues API by: - Updating debug intrinsics in-place, while preventing use-before-def of the replacement value. - Falling back to salvageDebugInfo when a replacement can't be made. - Moving the responsibility for rewriting llvm.dbg.* DIExpressions into common utility code. Along with the API change, this teaches replaceAllDbgUsesWith how to create DIExpressions for three basic integer and pointer conversions: - The no-op conversion. Applies when the values have the same width, or have bit-for-bit compatible pointer representations. - Truncation. Applies when the new value is wider than the old one. - Zero/sign extension. Applies when the new value is narrower than the old one. Testing: - check-llvm, check-clang, a stage2 `-g -O3` build of clang, regression/unit testing. - This resolves a number of mis-sized dbg.value diagnostics from Debugify. Differential Revision: https://reviews.llvm.org/D48676 llvm-svn: 336451
*	Added missing semicolon	Diogo N. Sampaio	2018-07-06	1	-2/+1
\| \| \| \|	llvm-svn: 336428
*	[SelectionDAG] https://reviews.llvm.org/D48278	Diogo N. Sampaio	2018-07-06	1	-0/+106
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	D48278 Allow to reduce redundant shift masks. For example: x1 = x & 0xAB00 x2 = (x >> 8) & 0xAB can be reduced to: x1 = x & 0xAB00 x2 = x1 >> 8 It only allows folding when the masks and shift values are constants. llvm-svn: 336426
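The shift-mask reduction in the example above can be verified exhaustively for 16-bit inputs. This Python sketch uses the constants from the commit message; the function names and the 16-bit input range are assumptions made for the check.

```python
def original(x):
    """Two independent masks, as in the unoptimized form."""
    x1 = x & 0xAB00
    x2 = (x >> 8) & 0xAB
    return x1, x2


def reduced(x):
    """Second mask dropped: x2 is derived from the already-masked x1."""
    x1 = x & 0xAB00
    x2 = x1 >> 8
    return x1, x2


# The two forms agree because 0xAB00 >> 8 == 0xAB, so masking before or
# after the shift selects the same bits.
forms_agree = all(original(x) == reduced(x) for x in range(1 << 16))
```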
*	Testing commit permission	Diogo N. Sampaio	2018-07-05	1	-1/+1
\| \| \| \|	llvm-svn: 336384
*	[MachineOutliner] Fix typo in getOutliningCandidateInfo function name	Yvan Roux	2018-07-04	1	-1/+1
\| \| \| \| \| \| \| \|	getOutlininingCandidateInfo -> getOutliningCandidateInfo Differential Revision: https://reviews.llvm.org/D48867 llvm-svn: 336285
*	[ImplicitNullChecks] Check for rewrite of register used in 'test' instruction	Max Kazantsev	2018-07-04	1	-2/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The following code pattern: mov %rax, %rcx test %rax, %rax %rax = .... je throw_npe mov(%rcx), %r9 mov(%rax), %r10 gets transformed into the following incorrect code after implicit null check pass: mov %rax, %rcx %rax = .... faulting_load_op("movl (%rax), %r10", throw_npe) mov(%rcx), %r9 For implicit null check pass, if the register that is checked for null value (ie, the register used in the 'test' instruction) is written into before the condition jump, we should avoid doing the optimization. Patch by Surya Kumari Jangala! Differential Revision: https://reviews.llvm.org/D48627 Reviewed By: skatkov llvm-svn: 336241
*	[DAGCombiner] visitSDIV - Permit MIN_SIGNED_VALUE in pow2 vector codegen	Simon Pilgrim	2018-07-03	1	-2/+0
\| \| \| \| \| \|	Now that D45806 has landed, we can re-enable support for MIN_SIGNED_VALUE in the sdiv by pow2-constant code llvm-svn: 336198
*	[CodeGen] Make block removal order deterministic in CodeGenPrepare	David Stenberg	2018-07-02	1	-3/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Replace use of a SmallPtrSet with a SmallSetVector to make the worklist iteration order deterministic. This is done because the order in which the blocks are removed may affect whether or not PHI nodes in successor blocks are removed. For example, consider the following case where %bb1 and %bb2 are removed: bb1: br i1 undef, label %bb3, label %bb4 bb2: br i1 undef, label %bb4, label %bb3 bb3: pv1 = phi type [ undef, %bb1 ], [ undef, %bb2 ], [ v0, %other ] br label %bb4 bb4: pv2 = phi type [ undef, %bb1 ], [ undef, %bb2 ], [ pv1, %bb3 ], [ v0, %other ] If %bb2 is removed before %bb1, the incoming values from %bb1 and %bb2 to pv1 will be removed before %bb1 is removed as a predecessor to %bb4. The pv1 node will thus be optimized out (to v0) at the time %bb1 is removed as a predecessor to %bb4, leaving the blocks as follows when the incoming value from %bb1 has been removed: bb3: ; pv1 optimized out, incoming value to pv2 is v0 br label %bb4 bb4: pv2 = phi type [ v0, %bb3 ], [ v0, %other ] The pv2 PHI node will be optimized away by removePredecessor() as all incoming values are identical. In case %bb2 is removed after %bb1, pv1 will not be optimized out at the time %bb2 is removed as a predecessor to %bb4, leaving the blocks as follows when the incoming value from %bb2 to pv2 has been removed: bb3: pv1 = phi type [ undef, %bb2 ], [ v0, %other ] br label %bb4 bb4: pv2 = phi type [ pv1, %bb3 ], [ v0, %other ] The pv2 PHI node will thus not be removed in this case, ultimately leading to the following output: bb3: ; pv1 optimized out, incoming value to pv2 is v0 br label %bb4 bb4: pv2 = phi type [ v0, %bb3 ], [ v0, %other ] I have not looked into changing DeleteDeadBlock() so that the redundant PHI nodes are removed.
I have not added a test case, as I was not able to create a particularly small (and not messy) reproducer. This is likely due to SmallPtrSet behaving deterministically when in small mode. Reviewers: void, dexonsmith, spatel, skatkov, fhahn, bkramer, nhaehnle Reviewed By: fhahn Subscribers: mgrang, llvm-commits Differential Revision: https://reviews.llvm.org/D48369 llvm-svn: 336109
*	Implement strip.invariant.group	Piotr Padlewski	2018-07-02	3	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch introduces a new intrinsic, strip.invariant.group, that was described in the RFC: Devirtualization v2. Reviewers: rsmith, hfinkel, nlopes, sanjoy, amharc, kuhar Subscribers: arsenm, nhaehnle, JDevlieghere, hiraditya, xbolva00, llvm-commits Differential Revision: https://reviews.llvm.org/D47103 Co-authored-by: Krzysztof Pszeniczny <krzysztof.pszeniczny@gmail.com> llvm-svn: 336073