bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	[SelectionDAG] Use APInt::isSubsetOf. NFC	Craig Topper	2017-06-16	2	-4/+4
\| \| \| \|	llvm-svn: 305606
*	[SelectionDAG] Use APInt::isNullValue/isOneValue. NFC	Craig Topper	2017-06-16	2	-5/+5
\| \| \| \|	llvm-svn: 305605
*	[TargetLowering] Use ConstantSDNode::isOne and getSExtValue instead of ↵	Craig Topper	2017-06-16	1	-6/+6
\| \| \| \| \| \|	getting the underlying APInt first. NFC llvm-svn: 305604
*	Improve the accuracy of variable ranges .debug_loc location lists.	Adrian Prantl	2017-06-16	1	-12/+54
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For the following motivating example bool c(); void f(); bool start() { bool result = c(); if (!c()) { result = false; goto exit; } f(); result = true; exit: return result; } we would previously generate a single DW_AT_const_value(1) because only the DBG_VALUE in the second-to-last basic block survived codegen. This patch improves the heuristic used to determine when a DBG_VALUE is available at the beginning of its variable's enclosing lexical scope: - Stop giving singular constants blanket permission to take over the entire scope. There is still a special case for constants in the function prologue that we also miight want to retire later. - Use the lexical scope information to determine available-at-entry instead of proximity to the function prologue. After this patch we generate a location list with a more accurate narrower availability for the constant true value. As a pleasant side effect, we also generate inline locations instead of location lists where a loacation covers the entire range of the enclosing lexical scope. Measured on compiling llc with four targets this doesn't have an effect on compile time and reduces the size of the debug info for llc by ~600K. rdar://problem/30286912 llvm-svn: 305599
*	Revert "RegScavenging: Add scavengeRegisterBackwards()"	Matthias Braun	2017-06-16	1	-315/+116
\| \| \| \| \| \| \| \| \|	Revert because of reports of some PPC input starting to spill when it was predicted that it wouldn't and no spillslot was reserved. This reverts commit r305516. llvm-svn: 305566
*	[Atomics] Rename and change prototype for atomic memcpy intrinsic	Daniel Neilson	2017-06-16	2	-26/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Background: http://lists.llvm.org/pipermail/llvm-dev/2017-May/112779.html This change is to alter the prototype for the atomic memcpy intrinsic. The prototype itself is being changed to more closely resemble the semantics and parameters of the llvm.memcpy intrinsic -- to ease later combination of the llvm.memcpy and atomic memcpy intrinsics. Furthermore, the name of the atomic memcpy intrinsic is being changed to make it clear that it is not a generic atomic memcpy, but specifically a memcpy is unordered atomic. Reviewers: reames, sanjoy, efriedma Reviewed By: reames Subscribers: mzolotukhin, anna, llvm-commits, skatkov Differential Revision: https://reviews.llvm.org/D33240 llvm-svn: 305558
*	[MachineBlockPlacement] trivial fix in comments, NFC	Hiroshi Inoue	2017-06-16	1	-5/+5
\| \| \| \| \| \| \| \|	- Topologocal is abbreviated as "topo" in comments, but "top" is used in only one comment. Modify it for consistency. - Capitalize "succ" and "pred" for consistency in one figure. - Other trivial fixes. llvm-svn: 305552
*	Revert "[DAG] Allow truncated and extend memory operations in Store Merge. ↵	Ahmed Bougacha	2017-06-15	1	-21/+10
\| \| \| \| \| \| \| \|	NFCI." This reverts commit r305468, as it caused PR33475. llvm-svn: 305527
*	RegScavenging: Add scavengeRegisterBackwards()	Matthias Braun	2017-06-15	1	-116/+315
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Re-apply r276044/r279124. Trying to reproduce or disprove the ppc64 problems reported in the stage2 build last time, which I cannot reproduce right now. This is a variant of scavengeRegister() that works for enterBasicBlockEnd()/backward(). The benefit of the backward mode is that it is not affected by incomplete kill flags. This patch also changes PrologEpilogInserter::doScavengeFrameVirtualRegs() to use the register scavenger in backwards mode. Differential Revision: http://reviews.llvm.org/D21885 llvm-svn: 305516
*	[MachineLICM] Hoist TOC-based address instructions	Lei Huang	2017-06-15	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add condition for MachineLICM to safely hoist instructions that utilize non constant registers that are reserved. On PPC, global variable access is done through the table of contents (TOC) which is always in register X2. The ABI reserves this register in any functions that have calls or access global variables. A call through a function pointer involves saving, changing and restoring this register around the call and thus MachineLICM does not consider it to be invariant. We can however guarantee the register is preserved across the call and thus is invariant. Differential Revision: https://reviews.llvm.org/D33562 llvm-svn: 305490
*	Fold variable into assert.	Benjamin Kramer	2017-06-15	1	-2/+1
\| \| \| \| \| \|	Silences an unused variable warning in Release builds. llvm-svn: 305488
*	ISel: Fix FastISel of swifterror values	Arnold Schwaighofer	2017-06-15	3	-14/+125
\| \| \| \| \| \| \| \| \| \| \| \|	The code assumed that we process instructions in basic block order. FastISel processes instructions in reverse basic block order. We need to pre-assign virtual registers before selecting otherwise we get def-use relationships wrong. This only affects code with swifterror registers. rdar://32659327 llvm-svn: 305484
*	[DAG] As StoreMerge now generates only legal nodes remove unecessary guard ↵	Nirav Dave	2017-06-15	1	-4/+2
\| \| \| \| \| \|	when run post-legalization NFCI. llvm-svn: 305477
*	[DAG] Defer Pre/Post IndexStore merge to after mergestore. NFCI.	Nirav Dave	2017-06-15	1	-4/+4
\| \| \| \| \| \| \| \|	In preparation for doing storemerge post-legalization, reorder visitSTORE passes to move pre/post-index combining after store merge. Reordered passes other than store merge are unaffected. llvm-svn: 305473
*	[DAG] Allow truncated and extend memory operations in Store Merge. NFCI.	Nirav Dave	2017-06-15	1	-10/+21
\| \| \| \| \| \| \| \|	As all store merges checks are based on the memory operation performed, allow use of truncated stores and extended loads as valid input candidates for merging. llvm-svn: 305468
*	[DAG] Make MergeStores generate legalized stores. NFCI.	Nirav Dave	2017-06-15	1	-4/+21
\| \| \| \| \| \| \|	Realized merged stores as truncstores if store will be realized as such by legalization. llvm-svn: 305467
*	[DAG] Use correct size for truncated store merge of load. NFCI.	Nirav Dave	2017-06-15	1	-2/+2
\| \| \| \| \| \| \|	Avoid non-legal memory ops by checking correct size when merging stores of loads into a extload-truncstore pair. llvm-svn: 305466
*	[ARM] GlobalISel: Add support for i32 modulo	Diana Picus	2017-06-15	1	-17/+37
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add support for modulo for targets that have hardware division and for those that don't. When hardware division is not available, we have to choose the correct libcall to use. This is generally straightforward, except for AEABI. The AEABI variant is trickier than the other libcalls because it returns { quotient, remainder }, instead of just one value like the other libcalls that we've seen so far. Therefore, we need to use custom lowering for it. However, we don't want to have too much special code, so we refactor the target-independent code in the legalizer by adding a helper for replacing an instruction with a libcall. This helper is used by the legalizer itself when dealing with simple calls, and also by the custom ARM legalization for the more complicated AEABI divmod calls. llvm-svn: 305459
*	Allow -profile-guided-section-prefix more than once	David Callahan	2017-06-14	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: At present, `-profile-guided-section-prefix` is a `cl::Optional` option, which means it demands to be passed exactly zero or one times. Our build system makes it pretty tricky to guarantee this. We often accidentally pass the flag more than once (but always with the same "false" value) which results in an error, after which compilation fails: ``` clang (LLVM option parsing): for the -profile-guided-section-prefix option: may only occur zero or one times! ``` While we work on improving our build system, it also seems reasonable just to allow `-profile-guided-section-prefix` to be passed more than once, by to `cl::ZeroOrMore`. Quoting [[ http://llvm.org/docs/CommandLine.html#controlling-the-number-of-occurrences-required-and-allowed \| the documentation ]]: > The cl::ZeroOrMore modifier ... indicates that your program will allow the option to be specified zero or more times. > ... > If an option is specified multiple times for an option of the cl::opt class, only the last value will be retained. Reviewers: danielcdh Reviewed By: danielcdh Subscribers: twoh, david2050, llvm-commits Differential Revision: https://reviews.llvm.org/D34219 llvm-svn: 305413
*	[mips] Fix multiprecision arithmetic.	Simon Dardis	2017-06-14	1	-4/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For multiprecision arithmetic on MIPS, rather than using ISD::ADDE / ISD::ADDC, get SelectionDAG to break down the operation into ISD::ADDs and ISD::SETCCs. For MIPS, only the DSP ASE has a carry flag, so in the general case it is not useful to directly support ISD::{ADDE, ADDC, SUBE, SUBC} nodes. Also improve the generation code in such cases for targets with TargetLoweringBase::ZeroOrOneBooleanContent by directly using the result of the comparison node rather than using it in selects. Similarly for ISD::SUBE / ISD::SUBC. Address optimization breakage by moving the generation of MIPS specific integer multiply-accumulate nodes to before legalization. This revolves PR32713 and PR33424. Thanks to Simonas Kazlauskas and Pirama Arumuga Nainar for reporting the issue! Reviewers: slthakur Differential Revision: https://reviews.llvm.org/D33494 llvm-svn: 305389
*	Align definition of DW_OP_plus with DWARF spec [3/3]	Florian Hahn	2017-06-14	5	-20/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch is part of 3 patches that together form a single patch, but must be introduced in stages in order not to break things. The way that LLVM interprets DW_OP_plus in DIExpression nodes is basically that of the DW_OP_plus_uconst operator since LLVM expects an unsigned constant operand. This unnecessarily restricts the DW_OP_plus operator, preventing it from being used to describe the evaluation of runtime values on the expression stack. These patches try to align the semantics of DW_OP_plus and DW_OP_minus with that of the DWARF definition, which pops two elements off the expression stack, performs the operation and pushes the result back on the stack. This is done in three stages: • The first patch (LLVM) adds support for DW_OP_plus_uconst. • The second patch (Clang) contains changes all its uses from DW_OP_plus to DW_OP_plus_uconst. • The third patch (LLVM) changes the semantics of DW_OP_plus and DW_OP_minus to be in line with its DWARF meaning. This patch includes the bitcode upgrade from legacy DIExpressions. Patch by Sander de Smalen. Reviewers: echristo, pcc, aprantl Reviewed By: aprantl Subscribers: fhahn, javed.absar, aprantl, llvm-commits Differential Revision: https://reviews.llvm.org/D33894 llvm-svn: 305386
*	[globalisel][legalizer] G_LOAD/G_STORE NarrowScalar should not emit G_GEP x, 0.	Daniel Sanders	2017-06-13	2	-12/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: When legalizing G_LOAD/G_STORE using NarrowScalar, we should avoid emitting %0 = G_CONSTANT ty 0 %1 = G_GEP %x, %0 since it's cheaper to not emit the redundant instructions than it is to fold them away later. Reviewers: qcolombet, t.p.northover, ab, rovka, aditya_nandakumar, kristof.beyls Reviewed By: qcolombet Subscribers: javed.absar, llvm-commits, igorb Differential Revision: https://reviews.llvm.org/D32746 llvm-svn: 305340
*	Align definition of DW_OP_plus with DWARF spec [1/3]	Florian Hahn	2017-06-13	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch is part of 3 patches that together form a single patch, but must be introduced in stages in order not to break things. The way that LLVM interprets DW_OP_plus in DIExpression nodes is basically that of the DW_OP_plus_uconst operator since LLVM expects an unsigned constant operand. This unnecessarily restricts the DW_OP_plus operator, preventing it from being used to describe the evaluation of runtime values on the expression stack. These patches try to align the semantics of DW_OP_plus and DW_OP_minus with that of the DWARF definition, which pops two elements off the expression stack, performs the operation and pushes the result back on the stack. This is done in three stages: • The first patch (LLVM) adds support for DW_OP_plus_uconst. • The second patch (Clang) contains changes all its uses from DW_OP_plus to DW_OP_plus_uconst. • The third patch (LLVM) changes the semantics of DW_OP_plus and DW_OP_minus to be in line with its DWARF meaning. This patch includes the bitcode upgrade from legacy DIExpressions. Patch by Sander de Smalen. Reviewers: pcc, echristo, aprantl Reviewed By: aprantl Subscribers: fhahn, aprantl, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D33892 llvm-svn: 305304
*	Fix an assertion failure when duplicate dbg.declares are present.	Adrian Prantl	2017-06-12	1	-0/+7
\| \| \| \| \| \| \| \| \| \|	This fixes PR33157. https://bugs.llvm.org//show_bug.cgi?id=33157 We might also think about disallowing duplicate dbg.declare intrinsics entirely, but this may complicate some passes needlessly. llvm-svn: 305244
*	SplitKit: Fix partially live subreg splitting	Matthias Braun	2017-06-12	1	-2/+1
\| \| \| \| \| \| \| \| \|	Fix thinko/typo in subreg aware liverange splitting logic. I'm not sure how to write a proper testcase for this. The original problem only happens on an out-of-tree target. Forcing subreg enabled targets to spill and split in a predictable way is near impossible. llvm-svn: 305228
*	IR: Replace the "Linker Options" module flag with "llvm.linker.options" ↵	Peter Collingbourne	2017-06-12	2	-38/+17
\| \| \| \| \| \| \| \| \| \|	named metadata. The new metadata is easier to manipulate than module flags. Differential Revision: https://reviews.llvm.org/D31349 llvm-svn: 305227
*	[SelectionDAG] Allow sin/cos -> sincos optimization on GNU triples w/ just ↵	Geoff Berry	2017-06-12	1	-14/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	-fno-math-errno Summary: This change enables the sin(x) cos(x) -> sincos(x) optimization on GNU target triples. This optimization was being inhibited when -ffast-math wasn't set because sincos in GLibC does not set errno, while sin and cos do. However, this optimization will only run if the attributes on the sin/cos calls include readnone, which is how clang represents the fact that it doesn't care about the errno values set by these functions (via the -fno-math-errno flag). Reviewers: hfinkel, bogner Subscribers: mcrosier, javed.absar, llvm-commits, paul.redmond Differential Revision: https://reviews.llvm.org/D32921 llvm-svn: 305204
*	StackColoring: smarter check for slot overlap	Than McIntosh	2017-06-12	1	-60/+177
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The old check for slot overlap treated 2 slots `S` and `T` as overlapping if there existed a CFG node in which both of the slots could possibly be active. That is overly conservative and caused stack blowups in Rust programs. Instead, check whether there is a single CFG node in which both of the slots are possibly active together. Fixes PR32488. Patch by Ariel Ben-Yehuda <ariel.byd@gmail.com> Reviewers: thanm, nagisa, llvm-commits, efriedma, rnk Reviewed By: thanm Subscribers: dotdash Differential Revision: https://reviews.llvm.org/D31583 llvm-svn: 305193
*	[DAG] add helper to bind memop chains; NFCI	Sanjay Patel	2017-06-12	2	-15/+19
\| \| \| \| \| \| \| \| \| \|	This step is just intended to reduce code duplication rather than change any functionality. A follow-up would be to replace PPCTargetLowering::spliceIntoChain() usage with this new helper. Differential Revision: https://reviews.llvm.org/D33649 llvm-svn: 305192
*	[DAGCombine] Make sure we check the ResNo from UADDO before combining	Amaury Sechet	2017-06-11	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: UADDO has 2 result, and one must check the result no before doing any kind of combine. Without it, the transform is invalid. Reviewers: joerg Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34088 llvm-svn: 305162
*	[CGP] add a reference to DataLayout in MemCmpExpansion; NFCI	Sanjay Patel	2017-06-09	1	-20/+22
\| \| \| \| \| \| \| \|	We're currently passing endian-ness around as a param (and not uniformly), so this eliminates the need for that. I'd like to add a constant fold call too, and that requires a DL. llvm-svn: 305129
*	SelectionDAG: Remove deleted nodes from legalized set to avoid clash with ↵	Zvi Rackover	2017-06-09	1	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	newly created nodes Summary: During DAG legalization loop in SelectionDAG::Legalize(), bookkeeping of the SDNodes that were already legalized is implemented with SmallPtrSet (LegalizedNodes). This kind of set stores only pointers to objects, not the objects themselves. Unfortunately, if SDNode is deleted during legalization for some reason, LegalizedNodes set is not informed about this fact. This wouldn’t be so bad, if SelectionDAG wouldn’t reuse space deallocated after deletion of unused nodes, for creation of new ones. Because of this, new nodes, created during legalization often can have pointers identical to ones that have been previously legalized, added to the LegalizedNodes set, and deleted afterwards. This in turn causes, that newly created nodes, sharing the same pointer as deleted old ones, are present in LegalizedNodes already at the moment of creation, so we never call Legalize on them. The fix facilitates the fact, that DAG notifies listeners about each modification. I have registered DAGNodeDeletedListener inside SelectionDAG::Legalize, with a callback function that removes any pointer of any deleted SDNode from the LegalizedNodes set. With this modification, LegalizeNodes set does not contain pointers to nodes that were deleted, so newly created nodes can always be inserted to it, even if they share pointers with old deleted nodes. Patch by pawel.szczerbuk@intel.com The issue this patch addresses causes failures in an out-of-tree target, and i was not able to create a reproducer for an in-tree target, hence there is no test-case. Reviewers: delena, spatel, RKSimon, hfinkel, davide, qcolombet Reviewed By: delena Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33891 llvm-svn: 305084
*	Reland "[SelectionDAG] Enable target specific vector scalarization of calls ↵	Simon Dardis	2017-06-09	4	-66/+189
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	and returns" By target hookifying getRegisterType, getNumRegisters, getVectorBreakdown, backends can request that LLVM to scalarize vector types for calls and returns. The MIPS vector ABI requires that vector arguments and returns are passed in integer registers. With SelectionDAG's new hooks, the MIPS backend can now handle LLVM-IR with vector types in calls and returns. E.g. 'call @foo(<4 x i32> %4)'. Previously these cases would be scalarized for the MIPS O32/N32/N64 ABI for calls and returns if vector types were not legal. If vector types were legal, a single 128bit vector argument would be assigned to a single 32 bit / 64 bit integer register. By teaching the MIPS backend to inspect the original types, it can now implement the MIPS vector ABI which requires a particular method of scalarizing vectors. Previously, the MIPS backend relied on clang to scalarize types such as "call @foo(<4 x float> %a) into "call @foo(i32 inreg %1, i32 inreg %2, i32 inreg %3, i32 inreg %4)". This patch enables the MIPS backend to take either form for vector types. The previous version of this patch had a "conditional move or jump depends on uninitialized value". Reviewers: zoran.jovanovic, jaydeep, vkalintiris, slthakur Differential Revision: https://reviews.llvm.org/D27845 llvm-svn: 305083
*	[XRay] Fix computation of function size subject to XRay threshold	Serge Rogatch	2017-06-09	1	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Currently XRay compares its threshold against `Function::size()` . However, `Function::size()` returns the number of basic blocks (as I understand, such as cycle bodies, if/else bodies, switch-case bodies, etc.), rather than the number of instructions. The name of the parameter `-fxray-instruction-threshold=N`, as well as XRay documentation at http://llvm.org/docs/XRay.html , suggests that instructions should be counted, rather than the number of basic blocks. I see two options: 1. Count the number of MachineInstr`s in MachineFunction : this gives better estimate for the number of assembly instructions on the target. So a user can check in disassembly that the threshold works more or less correctly. 2. Count the number of Instruction`s in a Function : AFAIK, this gives correct number of IR instructions, which the user can check in IR listing. However, this number may be far (several times for small functions) from the number of assembly instructions finally emitted. Option 1 is implemented in this patch because I think that having the closer estimate for the number of assembly instructions emitted is more important than to have a clear definition of the metric. Reviewers: dberris, rengolin Reviewed By: dberris Subscribers: llvm-commits, iid_iunknown Differential Revision: https://reviews.llvm.org/D34027 llvm-svn: 305072
*	Prevent RemoveDeadNodes from deleted already deleted node.	Nirav Dave	2017-06-09	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This prevents against assertion errors like PR32659 which occur from a replacement deleting a node after it's been added to the list argument of RemoveDeadNodes. The specific failure from PR32659 does not currently happen, but it is still potentially possible. The underlying cause is that the callers of the change dfunction builds up a list of nodes to delete after having moved their uses and it possible that a move of a later node will cause a previously deleted nodes to be deleted. Reviewers: bkramer, spatel, davide Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33731 llvm-svn: 305070
*	sink DebugCompressionType into MC for exposing to clang	Saleem Abdulrasool	2017-06-09	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a preparatory change to expose the debug compression style to clang. It requires exposing the enumeration and passing the actual value through to the backend from the frontend in actual value form rather than a boolean that selects the GNU style of debug info compression. Minor tweak to the ELF Object Writer to use a variable for re-used values. Add an assertion that debug information format is one of the two currently known types if debug information is being compressed. llvm-svn: 305038
*	RegAllocPBQP: Do not assign reserved physical register	Matthias Braun	2017-06-08	3	-5/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	(0) RegAllocPBQP: Since getRawAllocationOrder() may return a collection that includes reserved physical registers, iterate to find an un-reserved physical register. (1) VirtRegMap: Enforce the invariant: "no reserved physical registers" in assignVirt2Phys(). Previously, this was checked only after the fact in VirtRegRewriter::rewrite. (2) MachineVerifier: updated the test per MatzeB's review. (3) +testcase Patch by Nick Johnson<Nicholas.Paul.Johnson@deshawresearch.com>! Differential Revision: https://reviews.llvm.org/D33947 llvm-svn: 305016
*	fix formatting; NFC	Sanjay Patel	2017-06-08	1	-6/+6
\| \| \| \|	llvm-svn: 305008
*	[CGP] don't expand a memcmp with nobuiltin attribute	Sanjay Patel	2017-06-08	1	-6/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This matches the behavior used in the SDAG when expanding memcmp. For reference, we're intentionally treating the earlier fortified call transforms differently after: https://bugs.llvm.org/show_bug.cgi?id=23093 https://reviews.llvm.org/rL233776 One motivation for not transforming nobuiltin calls is that it can interfere with sanitizers: https://reviews.llvm.org/D19781 https://reviews.llvm.org/D19801 Differential Revision: https://reviews.llvm.org/D34043 llvm-svn: 305007
*	[CGP / PowerPC] avoid multi-block overhead for simple memcmp expansion	Sanjay Patel	2017-06-08	1	-21/+42
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The test diff for PowerPC shows we can better optimize if this case is one block. For x86, there's would be a substantial difference if CGP expansion was enabled because branches are assumed cheap and SDAG can't optimize across blocks. Instead of this: _cmp_eq8: movq (%rdi), %rax cmpq (%rsi), %rax je LBB23_1 ## BB#2: ## %res_block movl $1, %ecx jmp LBB23_3 LBB23_1: xorl %ecx, %ecx LBB23_3: ## %endblock xorl %eax, %eax testl %ecx, %ecx sete %al retq We get this: cmp_eq8: movq (%rdi), %rcx xorl %eax, %eax cmpq (%rsi), %rcx sete %al retq And that matches the optimal codegen that we get from the current expansion in SelectionDAGBuilder::visitMemCmpCall(). If this looks right, then I just need to confirm that vector-sized expansion will work from here, and we can enable CGP memcmp() expansion for x86. Ie, we'll bypass the power-of-2 special cases currently optimized in SDAG because we can lower the IR produced here optimally. Differential Revision: https://reviews.llvm.org/D34005 llvm-svn: 304987
*	[CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use ↵	Eugene Zelenko	2017-06-07	8	-83/+113
\| \| \| \| \| \|	warnings; other minor fixes (NFC). llvm-svn: 304954
*	[DAG] Improve Store Merge candidate pruning. NFC.	Nirav Dave	2017-06-07	1	-3/+15
\| \| \| \| \| \| \| \| \|	When considering merging stores values are the results of loads only consider stores whose values come from loads from the same base. This fixes much of the longer compile times in PR33330. llvm-svn: 304934
*	[CGP] avoid zext/trunc of a memcmp expansion compare	Sanjay Patel	2017-06-07	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This could be viewed as another shortcoming of the DAGCombiner: when both operands of a compare are zexted from the same source type, we should be able to compare the original types. The effect on PowerPC perf is likely unnoticeable, but there's a visible regression for x86 if we feed the suboptimal IR for memcmp expansion to the DAG: _cmp_eq4_zexted_to_i64: movl (%rdi), %ecx movl (%rsi), %edx xorl %eax, %eax cmpq %rdx, %rcx sete %al _cmp_eq4_better: movl (%rdi), %ecx xorl %eax, %eax cmpl (%rsi), %ecx sete %al llvm-svn: 304923
*	[CGP] pass size as param in MemCmpExpansion; NFCI	Sanjay Patel	2017-06-07	1	-10/+5
\| \| \| \| \| \|	Avoid extracting the constant int twice. llvm-svn: 304920
*	[CGP] pass size as param in MemCmpExpansion; NFCI	Sanjay Patel	2017-06-07	1	-13/+8
\| \| \| \| \| \|	Avoid extracting the constant int twice. llvm-svn: 304917
*	[CGP] getParent()->getParent() --> getFunction(); NFCI	Sanjay Patel	2017-06-07	1	-5/+4
\| \| \| \|	llvm-svn: 304916
*	[DAG] Move SelectionDAG::isCommutativeBinOp to TargetLowering.	Simon Pilgrim	2017-06-07	3	-7/+7
\| \| \| \| \| \| \| \|	This will allow commutation of target-specific DAG nodes in future patches Differential Revision: https://reviews.llvm.org/D33882 llvm-svn: 304911
*	[CGP] add helper function for generating compare of load pairs; NFCI	Sanjay Patel	2017-06-07	1	-5/+16
\| \| \| \| \| \| \| \|	In the special (but also the likely common) case, we can avoid the multi-block complexity of the general algorithm, so moving this part off on its own will make it re-usable. llvm-svn: 304908
*	[CGP] fix formatting in MemCmpExpansion; NFC	Sanjay Patel	2017-06-07	1	-8/+6
\| \| \| \|	llvm-svn: 304903
*	Update libdeps to add BinaryFormat, introduced in r304864.	NAKAMURA Takumi	2017-06-07	1	-1/+1
\| \| \| \|	llvm-svn: 304869