bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	Introduce new @llvm.get.dynamic.area.offset.i{32, 64} intrinsics.	Yury Gribov	2015-12-01	3	-0/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The @llvm.get.dynamic.area.offset.* intrinsic family is used to get the offset from native stack pointer to the address of the most recent dynamic alloca on the caller's stack. These intrinsics are intended for use in combination with @llvm.stacksave and @llvm.stackrestore to get a pointer to the most recent dynamic alloca. This is useful, for example, for AddressSanitizer's stack unpoisoning routines. Patch by Max Ostapenko. Differential Revision: http://reviews.llvm.org/D14983 llvm-svn: 254404
*	Extend debug info for function parameters in SDAG.	Evgeniy Stepanov	2015-12-01	1	-16/+11
\| \| \| \| \| \| \| \| \| \| \| \| \|	SDAG currently can emit debug location for function parameters when an llvm.dbg.declare points to either a function argument SSA temp, or to an AllocaInst. This change extends this logic by adding a fallback case when neither of the above is true. This is required for SafeStack, which may copy the contents of a byval function argument into something that is not an alloca, and then describe the target as the new location of the said argument. llvm-svn: 254352
*	Have 'optnone' respect the -fast-isel=false option.	Paul Robinson	2015-11-30	1	-3/+7
\| \| \| \| \| \| \| \|	This is primarily useful for debugging optnone v. ISel issues. Differential Revision: http://reviews.llvm.org/D14792 llvm-svn: 254335
*	Use a lambda instead of std::bind and std::mem_fn I introduced in r254242. NFC	Craig Topper	2015-11-29	1	-2/+3
\| \| \| \|	llvm-svn: 254260
*	[SelectionDAG] Use std::any_of instead of a manually coded loop. NFC	Craig Topper	2015-11-29	1	-8/+4
\| \| \| \|	llvm-svn: 254242
*	[Stack realignment] Handling of aligned allocas.	Jonas Paulsson	2015-11-28	1	-13/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch implements dynamic realignment of stack objects for targets with a non-realigned stack pointer. Behaviour in FunctionLoweringInfo is changed so that for a target that has StackRealignable set to false, over-aligned static allocas are considered to be variable-sized objects and are handled with DYNAMIC_STACKALLOC nodes. It would be good to group aligned allocas into a single big alloca as an optimization, but this is yet todo. SystemZ benefits from this, due to its stack frame layout. New tests SystemZ/alloca-03.ll for aligned allocas, and SystemZ/alloca-04.ll for "no-realign-stack" attribute on functions. Review and help from Ulrich Weigand and Hal Finkel. llvm-svn: 254227
*	Expose isXxxConstant() functions from SelectionDAGNodes.h (NFC)	Artyom Skrobov	2015-11-25	2	-20/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Many target lowerings copy-paste the code to test SDValues for known constants. This code can instead be shared in SelectionDAG.cpp, and reused in the targets. Reviewers: MatzeB, andreadb, tstellarAMD Subscribers: arsenm, jyknight, llvm-commits Differential Revision: http://reviews.llvm.org/D14945 llvm-svn: 254085
*	Fix some places where we were assuming that memory type had been legalized	Eric Christopher	2015-11-25	2	-3/+2
\| \| \| \| \| \| \| \| \| \|	to a simple type when lowering a truncating store of a vector type. In this case for an EVT we'll return Expand as we should in all of the cases anyhow. The testcase triggered at the one in VectorLegalizer::LegalizeOp, inspection found the rest. llvm-svn: 254061
*	Let SelectionDAG start to use probability-based interface to add successors.	Cong Hou	2015-11-24	4	-214/+226
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The patch in http://reviews.llvm.org/D13745 is broken into four parts: 1. New interfaces without functional changes. 2. Use new interfaces in SelectionDAG, while in other passes treat probabilities as weights. 3. Use new interfaces in all other passes. 4. Remove old interfaces. This is the second patch above. In this patch SelectionDAG starts to use probability-based interfaces in MBB to add successors but other MC passes are still using weight-based interfaces. Therefore, we need to maintain a correct weight list in MBB even when probability-based interfaces are used. This is done by updating the weight list in probability-based interfaces by treating the numerator of probabilities as weights. This change affects many test cases that check successor weight values. I will update those test cases once this patch looks good to you. Differential revision: http://reviews.llvm.org/D14361 llvm-svn: 253965
*	Remove duplicate getValueType() calls. NFCI.	Simon Pilgrim	2015-11-22	1	-2/+2
\| \| \| \|	llvm-svn: 253823
*	[DAGCombiner] Bugfix for lost chain dependency.	Jonas Paulsson	2015-11-21	1	-13/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	When MergeConsecutiveStores() combines two loads and two stores into wider loads and stores, the chain users of both of the original loads must be transferred to the new load, because it may be that a chain user only depends on one of the loads. New test case: test/CodeGen/SystemZ/dag-combine-01.ll Reviewed by James Y Knight. Bugzilla: https://llvm.org/bugs/show_bug.cgi?id=25310#c6 llvm-svn: 253779
*	Partially revert r253662: some unrelated work was accidentally committed ↵	Daniel Sanders	2015-11-20	1	-1/+0
\| \| \| \| \| \| \| \|	with it. Sorry. llvm-svn: 253663
*	Revert the revert 253497 and 253539 - These commits aren't the cause of the ↵	Daniel Sanders	2015-11-20	1	-0/+1
\| \| \| \| \| \| \| \|	clang-cmake-mips failures. Sorry for the noise. llvm-svn: 253662
*	X86: More efficient legalization of wide integer compares	Hans Wennborg	2015-11-19	5	-1/+79
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In particular, this makes the code for 64-bit compares on 32-bit targets much more efficient. Example: define i32 @test_slt(i64 %a, i64 %b) { entry: %cmp = icmp slt i64 %a, %b br i1 %cmp, label %bb1, label %bb2 bb1: ret i32 1 bb2: ret i32 2 } Before this patch: test_slt: movl 4(%esp), %eax movl 8(%esp), %ecx cmpl 12(%esp), %eax setae %al cmpl 16(%esp), %ecx setge %cl je .LBB2_2 movb %cl, %al .LBB2_2: testb %al, %al jne .LBB2_4 movl $1, %eax retl .LBB2_4: movl $2, %eax retl After this patch: test_slt: movl 4(%esp), %eax movl 8(%esp), %ecx cmpl 12(%esp), %eax sbbl 16(%esp), %ecx jge .LBB1_2 movl $1, %eax retl .LBB1_2: movl $2, %eax retl Differential Revision: http://reviews.llvm.org/D14496 llvm-svn: 253572
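The compare-with-borrow sequence in the "after" listing can be modeled in a few lines of Python. This is only an illustrative sketch, not the X86 lowering code: the helper names (`split64`, `slt64_via_cmp_sbb`) are hypothetical, and the flag computation is a hand-written model of what `cmpl` followed by `sbbl` leaves in CF, SF and OF.

```python
MASK32 = 0xFFFFFFFF


def split64(x):
    """Split a signed 64-bit integer into (low, high) unsigned 32-bit halves."""
    x &= 0xFFFFFFFFFFFFFFFF
    return x & MASK32, x >> 32


def to_signed32(v):
    """Reinterpret an unsigned 32-bit value as a signed one."""
    return v - (1 << 32) if v >> 31 else v


def slt64_via_cmp_sbb(a, b):
    """Model of `cmpl lo; sbbl hi; jl` for a signed 64-bit a < b."""
    a_lo, a_hi = split64(a)
    b_lo, b_hi = split64(b)

    # cmpl: subtract the low halves; only the borrow (carry flag) is kept.
    borrow = 1 if a_lo < b_lo else 0

    # sbbl: subtract the high halves and the borrow.
    diff = (a_hi - b_hi - borrow) & MASK32
    sign_flag = diff >> 31

    # Overflow flag: the exact signed result does not fit in 32 bits.
    exact = to_signed32(a_hi) - to_signed32(b_hi) - borrow
    overflow_flag = 0 if -(1 << 31) <= exact < (1 << 31) else 1

    # "Less than" is SF != OF; the jge in the listing is the inverse branch.
    return sign_flag != overflow_flag
```

The point of the sketch is that a single borrow-propagating subtraction of the two halves yields the same flags as a full 64-bit subtraction, so no `setcc`/`testb` sequence is needed to combine two partial comparisons.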
*	Revert "Change memcpy/memset/memmove to have dest and source alignments."	Pete Cooper	2015-11-19	1	-34/+30
\| \| \| \| \| \| \| \| \| \|	This reverts commit r253511. This likely broke the bots in http://lab.llvm.org:8011/builders/clang-ppc64-elf-linux2/builds/20202 http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/3787 llvm-svn: 253543
*	Change memcpy/memset/memmove to have dest and source alignments.	Pete Cooper	2015-11-18	1	-30/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Note, this was reviewed (and more details are in) http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html These intrinsics currently have an explicit alignment argument which is required to be a constant integer. It represents the alignment of the source and dest, and so must be the minimum of those. This change allows source and dest to each have their own alignments by using the alignment attribute on their arguments. The alignment argument itself is removed. There are a few places in the code for which the code needs to be checked by an expert as to whether using only src/dest alignment is safe. For those places, they currently take the minimum of src/dest alignments which matches the current behaviour. For example, code which used to read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 500, i32 8, i1 false) will now read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %dest, i8* align 8 %src, i32 500, i1 false) For out of tree owners, I was able to strip alignment from calls using sed by replacing: (call.llvm\.memset.)i32\ [0-9]\,\ i1 false\) with: $1i1 false) and similarly for memmove and memcpy. I then added back in alignment to test cases which needed it. A similar commit will be made to clang which actually has many differences in alignment as now IRBuilder can generate different source/dest alignments on calls. In IRBuilder itself, a new argument was added. Instead of calling: CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, /* isVolatile */ false) you now call CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, SrcAlign, /* isVolatile */ false) There is a temporary class (IntegerAlignment) which takes the source alignment and rejects implicit conversion from bool. This is to prevent isVolatile here from passing its default parameter to the source alignment. Note, changes in future can now be made to codegen. I didn't change anything here, but this change should enable better memcpy code sequences. Reviewed by Hal Finkel. llvm-svn: 253511
*	[DAGCombiner] Vector constant folding for comparisons	Simon Pilgrim	2015-11-18	1	-6/+14
\| \| \| \| \| \| \| \| \| \|	This patch adds support for vector constant folding of integer/float comparisons. This requires FoldConstantVectorArithmetic to support scalar constant operands (in this case ISD::CONDCODE). In future we should be able to support other scalar constant types as necessary (and possibly start calling FoldConstantVectorArithmetic for all node creations). Differential Revision: http://reviews.llvm.org/D14683 llvm-svn: 253504
*	[PGO] Value profiling support	Betul Buyukkurt	2015-11-18	1	-1/+2
\| \| \| \| \| \| \| \| \|	This change introduces an instrumentation intrinsic instruction for value profiling purposes, the lowering of the instrumentation intrinsic and raw reader updates. The raw profile data files for llvm-profdata testing are updated. llvm-svn: 253484
*	[SelectionDAGBuilder] Make sure DemoteReg ends up in right reg-class.	Jonas Paulsson	2015-11-18	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The virtual register containing the address for returned value on stack should in the DAG be represented with a CopyFromReg node and not a Register node. Otherwise, InstrEmitter will not make sure that it ends up in the right register class for the target instruction. SystemZ needs this, because the reg class for address registers is a subset of the general 64 bit register class. test/SystemZ/CodeGen/args-07.ll and args-04.ll updated to run with -verify-machineinstrs. Reviewed by Hal Finkel. llvm-svn: 253461
*	[WinEH] Move WinEHFuncInfo from MachineModuleInfo to MachineFunction	Reid Kleckner	2015-11-17	2	-26/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Now that there is a one-to-one mapping from MachineFunction to WinEHFuncInfo, we don't need to use a DenseMap to select the right WinEHFuncInfo for the current funclet. The main challenge here is that X86WinEHStatePass is an IR pass that doesn't have access to the MachineFunction. I gave it its own WinEHFuncInfo object that it uses to calculate state numbers, which it then throws away. As long as nobody creates or removes EH pads between this pass and SDAG construction, we will get the same state numbers. The other thing X86WinEHStatePass does is to mark the EH registration node. Instead of communicating which alloca was the registration through WinEHFuncInfo, I added the llvm.x86.seh.ehregnode intrinsic. This intrinsic generates no code and simply marks the alloca in use. Reviewers: JCTremoulet Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14668 llvm-svn: 253378
*	Lower statepoints with multi-def targets.	Pat Gavlin	2015-11-17	1	-7/+11
\| \| \| \| \| \| \| \| \| \|	Statepoint lowering currently expects that the target method of a statepoint only defines a single value. This precludes using statepoints with ABIs that return values in multiple registers (e.g. the SysV AMD64 ABI). This change adds support for lowering statepoints with multi-def targets. llvm-svn: 253339
*	[SDAG] Fix expansion of BITREVERSE	James Molloy	2015-11-13	1	-3/+5
\| \| \| \| \| \| \| \| \| \|	Richard Trieu noted that UBSan detected an overflowing shift, and the obvious fix caused a crash. What was happening was that the shiftee (1U) was indeed too small for the possible range of shifts it had to handle, but also we were using "VT.getSizeInBits()" to get the maximum type bitwidth, but we wanted "VT.getScalarSizeInBits()" to get the vector lane size instead of the entire vector size. Use an APInt for the shift and VT.getScalarSizeInBits(). llvm-svn: 253023
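The shift-and-mask expansion this fix is about can be sketched as follows. It is a Python model under stated assumptions, not the LegalizeDAG code: `bitreverse` and `bitreverse_lanes` are hypothetical names, and `bits` plays the role of the scalar (lane) width that `VT.getScalarSizeInBits()` would return. Python integers are arbitrary precision, so the overflowing `1U` shift from the original bug cannot be reproduced here; the sketch only shows why the loop bound has to be the lane width.

```python
def bitreverse(value, bits):
    """Reverse the low `bits` bits of `value`, one bit position at a time."""
    result = 0
    for i in range(bits):
        j = bits - 1 - i  # destination position of source bit i
        # Move bit i to position j, shifting left or right as needed.
        moved = (value << (j - i)) if i < j else (value >> (i - j))
        # Keep only the destination bit; the mask must be as wide as the lane.
        result |= moved & (1 << j)
    return result & ((1 << bits) - 1)


def bitreverse_lanes(lanes, lane_bits):
    """Apply the expansion per vector lane, using the lane width, not the
    width of the whole vector."""
    return [bitreverse(lane, lane_bits) for lane in lanes]
```

For a two-lane vector of 8-bit elements, `bitreverse_lanes([1, 2], 8)` reverses each element within its own 8 bits; using the full 16-bit vector width as the loop bound would move bits across lane boundaries.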
*	Revert "Remove unnecessary call to getAllocatableRegClass"	Tom Stellard	2015-11-12	1	-8/+4
\| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit r252565. This also includes the revert of the commit mentioned below in order to avoid breaking tests in AMDGPU: Revert "AMDGPU: Set isAllocatable = 0 on VS_32/VS_64" This reverts commit r252674. llvm-svn: 252956
*	[SDAG] Introduce a new BITREVERSE node along with a corresponding LLVM intrinsic	James Molloy	2015-11-12	6	-1/+62
\| \| \| \| \| \| \| \| \| \|	Several backends have instructions to reverse the order of bits in an integer. Conceptually matching such patterns is similar to @llvm.bswap, and it was mentioned in http://reviews.llvm.org/D14234 that it would be best if these patterns were matched in InstCombine instead of reimplemented in every different target. This patch introduces an intrinsic @llvm.bitreverse.i* that operates similarly to @llvm.bswap. For plumbing purposes there is also a new ISD node ISD::BITREVERSE, with simple expansion and promotion support. The intention is that InstCombine's BSWAP detection logic will be extended to support BITREVERSE too, and @llvm.bitreverse intrinsics emitted (if the backend supports lowering it efficiently). llvm-svn: 252878
*	LegalizeDAG: Fix and improve FCOPYSIGN/FABS legalization	Matthias Braun	2015-11-12	1	-73/+145
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- Factor out code to query and modify the sign bit of a floatingpoint value as an integer. This also works if none of the targets integer types is big enough to hold all bits of the floatingpoint value. - Legalize FABS(x) as FCOPYSIGN(x, 0.0) if FCOPYSIGN is available, otherwise perform bit manipulation on the sign bit. The previous code used "x >u 0 ? x : -x" which is incorrect for x being -0.0! It also takes 34 instructions on ARM Cortex-M4. With this patch we only require 5: vldr d0, LCPI0_0 vmov r2, r3, d0 lsrs r2, r3, #31 bfi r1, r2, #31, #1 bx lr (This could be further improved if the compiler would recognize that r2, r3 is zero). - Only lower FCOPYSIGN(x, y) = sign(x) ? -FABS(x) : FABS(x) if FABS is available otherwise perform bit manipulation on the sign bit. - Perform the sign(x) test by masking out the sign bit and comparing with 0 rather than shifting the sign bit to the highest position and testing for "<s 0". For x86 copysignl (on 80bit values) this gets us: testl $32768, %eax rather than: shlq $48, %rax sets %al testb %al, %al Differential Revision: http://reviews.llvm.org/D11172 llvm-svn: 252839
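The sign-bit manipulation described above can be illustrated on IEEE-754 doubles. This is a hedged sketch rather than the legalizer's logic: all function names are hypothetical, and the value is reinterpreted as a 64-bit integer via `struct`, which stands in for the "query and modify the sign bit as an integer" helper the message mentions.

```python
import struct

SIGN_BIT = 1 << 63
MAGNITUDE_MASK = SIGN_BIT - 1


def f64_to_bits(x):
    """Reinterpret a double as an unsigned 64-bit integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def bits_to_f64(b):
    """Reinterpret an unsigned 64-bit integer as a double."""
    return struct.unpack("<d", struct.pack("<Q", b))[0]


def sign_is_negative(x):
    """Test the sign by masking the sign bit and comparing with 0."""
    return (f64_to_bits(x) & SIGN_BIT) != 0


def fabs_bits(x):
    """FABS by clearing the sign bit."""
    return bits_to_f64(f64_to_bits(x) & MAGNITUDE_MASK)


def fcopysign_bits(x, y):
    """FCOPYSIGN: magnitude bits of x combined with the sign bit of y."""
    return bits_to_f64((f64_to_bits(x) & MAGNITUDE_MASK) | (f64_to_bits(y) & SIGN_BIT))


def fabs_select(x):
    """Compare-and-negate form, shown only for contrast."""
    return x if x > 0 else -x
```

Because +0.0 and -0.0 compare equal, a comparison against zero cannot see the sign of a zero, so the compare-and-negate form returns a zero with the wrong sign for one of the two zero inputs. The bit-level versions handle both zeros, since they never compare floating-point values.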
*	[DAGCombiner] Improve zextload optimization.	Geoff Berry	2015-11-11	1	-22/+72
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Don't fold (zext (and (load x), cst)) -> (and (zextload x), (zext cst)) if (and (load x) cst) will match as a zextload already and has additional users. For example, the following IR: %load = load i32, i32* %ptr, align 8 %load16 = and i32 %load, 65535 %load64 = zext i32 %load16 to i64 store i32 %load16, i32* %dst1, align 4 store i64 %load64, i64* %dst2, align 8 used to produce the following aarch64 code: ldr w8, [x0] and w9, w8, #0xffff and x8, x8, #0xffff str w9, [x1] str x8, [x2] but with this change produces the following aarch64 code: ldrh w8, [x0] str w8, [x1] str x8, [x2] Reviewers: resistor, mcrosier Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14340 llvm-svn: 252789
*	Add target preference for GatherAllAliases max depth	Matt Arsenault	2015-11-11	1	-1/+1
\| \| \| \|	llvm-svn: 252775
*	LegalizeDAG: Implement promote for scalar_to_vector	Matt Arsenault	2015-11-10	1	-0/+28
\| \| \| \| \| \| \| \| \| \| \|	This allows avoiding the default Expand behavior which introduces stack usage. Bitcast the scalar and replace the missing elements with undef. This is covered by existing tests and used by a future commit which makes 64-bit vectors legal types on AMDGPU. llvm-svn: 252632
*	LegalizeDAG: Implement promote for insert_vector_elt	Matt Arsenault	2015-11-10	1	-1/+52
\| \| \| \| \| \| \|	This is covered by existing tests and used by a future commit which makes 64-bit vectors legal types on AMDGPU. llvm-svn: 252631
*	LegalizeDAG: Implement promote for extract_vector_elt	Matt Arsenault	2015-11-10	1	-4/+58
\| \| \| \| \| \| \| \| \| \|	This is for AMDGPU to implement v2i64 extract as extract of half of a v4i32. This is covered by existing tests and used by a future commit which makes 64-bit vectors legal types on AMDGPU. llvm-svn: 252630
*	Remove unnecessary call to getAllocatableRegClass	Matt Arsenault	2015-11-10	1	-4/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I'm not sure what the point of this was. I'm not sure why you would ever define an instruction that produces an unallocatable register class. No tests fail with this removed, and it seems like it should be a verifier error to define such an instruction. This was problematic for AMDGPU because it would make bad decisions by arbitrarily changing the register class when unsetting isAllocatable for VS_32/VS_64, which is currently set as a workaround to this problem. AMDGPU uses the VS_32/VS_64 register classes to represent operands which can use either VGPRs or SGPRs. When isAllocatable is unset for these, this would need to pick either the SGPR or VGPR class and insert either a copy we don't want, or an illegal copy we would need to deal with later. A semi-arbitrary register class ordering decision is made in tablegen, which resulted in always picking a VGPR class because it happens to have more registers than the SGPR register class. We really just want to use whatever register class the original register had. llvm-svn: 252565
*	add a SelectionDAG method to check if no common bits are set in two nodes; NFCI	Sanjay Patel	2015-11-09	2	-16/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This was suggested in: http://reviews.llvm.org/D13956 and is a follow-on to: http://reviews.llvm.org/rL252515 http://reviews.llvm.org/rL252519 This lets us remove logically equivalent/duplicated code from DAGCombiner and X86ISelDAGToDAG. A corresponding function for IR instructions already exists in ValueTracking. llvm-svn: 252539
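One plausible shape for such a check, sketched in Python: two values share no set bits if every bit position is known to be zero in at least one of them. The function name and the known-zero-mask representation are assumptions made for illustration; the actual SelectionDAG method is not reproduced here.

```python
def have_no_common_bits_set(known_zero_a, known_zero_b, bits):
    """Return True if each of the low `bits` positions is known to be zero
    in at least one of the two operands (given as known-zero masks)."""
    all_ones = (1 << bits) - 1
    return ((known_zero_a | known_zero_b) & all_ones) == all_ones


def add_or_equivalent(a, b):
    """When no bits overlap, addition produces no carries, so a + b == a | b."""
    return (a + b) == (a | b)
```

For example, with `a = x & 0xF0` and `b = y & 0x0F` on 8-bit values, the known-zero masks are `0x0F` and `0xF0`, the check succeeds, and an ADD of the two can be treated as an OR.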
*	[WinEH] Don't emit CATCHRET from visitCatchPad	David Majnemer	2015-11-09	1	-12/+5
\| \| \| \| \| \| \|	Instead, emit a CATCHPAD node which will get selected to a target specific sequence. llvm-svn: 252528
*	[CodeGen] Always promote f16 if not legal	Oliver Stannard	2015-11-09	2	-0/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We don't currently have any runtime library functions for operations on f16 values (other than conversions to and from f32 and f64), so we should always promote it to f32, even if that is not a legal type. In that case, the f32 values would be softened to f32 library calls. SoftenFloatRes_FP_EXTEND now needs to check the promoted operand's type, as it may be a no-op or require a different library call. getCopyFromParts and getCopyToParts now need to cope with a floating-point value stored in a larger integer part, as is the case for any target that needs to store an f16 value in a 32-bit integer register. Differential Revision: http://reviews.llvm.org/D12856 llvm-svn: 252459
*	[WinEH] Update exception pointer registers	Joseph Tremoulet	2015-11-07	2	-5/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The CLR's personality routine passes these in rdx/edx, not rax/eax. Make getExceptionPointerRegister a virtual method parameterized by personality function to allow making this distinction. Similarly make getExceptionSelectorRegister a virtual method parameterized by personality function, for symmetry. Reviewers: pgavlin, majnemer, rnk Subscribers: jyknight, dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D14344 llvm-svn: 252383
*	DAGCombiner: Check shouldReduceLoadWidth before combining (and (load), x) -> ↵	Tom Stellard	2015-11-06	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \|	extload Reviewers: resistor, arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13805 llvm-svn: 252349
*	[WinEH] Fix funclet prologues with stack realignment	Reid Kleckner	2015-11-05	1	-2/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	We already had a test for this for 32-bit SEH catchpads, but those don't actually create funclets. We had a bug that only appeared in funclet prologues, where we would establish EBP and ESI as our FP and BP, and then downstream prologue code would overwrite them. While I was at it, I fixed Win64+funclets+stackrealign. This issue doesn't come up as often there due to the ABI requiring 16 byte stack alignment, but now we can rest easy that AVX and WinEH will work well together =P. llvm-svn: 252210
*	[StatepointLowering] Remove distinction between call and invoke safepoints	Igor Laevsky	2015-11-04	2	-24/+31
\| \| \| \| \| \| \| \| \| \| \| \|	There is no point in having invoke safepoints handled differently than the call safepoints. All relevant decisions could be made by looking at whether or not gc.result and gc.relocate lie in the same basic block. This change will allow lowering call safepoints with relocates and results in different basic blocks. See test case for example. Differential Revision: http://reviews.llvm.org/D14158 llvm-svn: 252028
*	[SelectionDAG] Use existing constant nodes instead of recreating them. NFC.	Simon Pilgrim	2015-11-03	1	-9/+6
\| \| \| \|	llvm-svn: 251990
*	Fix two issues in MergeConsecutiveStores:	James Y Knight	2015-11-02	1	-2/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	1) PR25154. This is basically a repeat of PR18102, which was fixed in r200201, and broken again by r234430. The latter changed which of the store nodes was merged into from the first to the last. Thus, we now also need to prefer merging a later store at a given address into the target node, instead of an earlier one. 2) While investigating that, I also realized I'd introduced a bug in r236850. There, I removed a check for alignment -- not realizing that nothing except the alignment check was ensuring that none of the stores were overlapping! This is a really bogus way to ensure there's no aliased stores. A better solution to both of these issues is likely to always use the code added in the 'if (UseAA)' branches which rearrange the chain based on a more principled analysis. I'll look into whether that can be used always, but in the interest of getting things back to working, I think a minimal change makes sense. llvm-svn: 251816
*	[ValueTracking] Use !range metadata more aggressively in KnownBits	Sanjoy Das	2015-10-28	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Teach `computeKnownBitsFromRangeMetadata` to use `!range` metadata more aggressively. Reviewers: majnemer, nlewycky, jingyue Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14100 llvm-svn: 251487
*	[SelectionDAG] Don't inspect !range metadata for extended loads	Sanjoy Das	2015-10-28	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Don't call `computeKnownBitsFromRangeMetadata` for extended loads -- this can cause a mismatch between the width of the !range metadata and the width of the APInt's accumulating `KnownZero` (and `KnownOne` in the future). This isn't a problem now, but will be after a future change. Note: this can be made more aggressive in the future. Reviewers: nlewycky Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14107 llvm-svn: 251486
*	Make the SelectionDAG graph printer use SDNode::PersistentId labels.	James Y Knight	2015-10-27	1	-3/+10
\| \| \| \| \| \| \| \|	r248010 changed the -debug output to use short ids, but did not similarly modify the graph printer. Change to be consistent, for ease of cross-reference. llvm-svn: 251465
*	Use the 'arcp' fast-math-flag when combining repeated FP divisors	Sanjay Patel	2015-10-27	1	-5/+11
\| \| \| \| \| \| \| \| \| \| \| \|	This is a usage of the IR-level fast-math-flags now that they are propagated to SDNodes. This was originally part of D8900. Removing the global 'enable-unsafe-fp-math' checks will require auto-upgrade and possibly other changes. Differential Revision: http://reviews.llvm.org/D9708 llvm-svn: 251450
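The transformation behind repeated-divisor combining can be sketched numerically: several divisions by the same value are replaced by one reciprocal and several multiplications. The sketch below is an illustration in Python with hypothetical function names; it says nothing about when DAGCombiner actually decides the rewrite is profitable.

```python
def divide_each(values, divisor):
    """Reference form: one division per value."""
    return [v / divisor for v in values]


def divide_each_arcp(values, divisor):
    """Reciprocal form: a single division, then one multiply per value.
    The result may differ from true division in the last bit, which is why
    the rewrite is only valid when a reciprocal is explicitly allowed."""
    reciprocal = 1.0 / divisor
    return [v * reciprocal for v in values]
```

With a power-of-two divisor the reciprocal is exact and both forms agree bit for bit; for other divisors the two results are only guaranteed to be close, which is the rounding difference the 'arcp' flag gives permission to ignore.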
*	Create a new interface addSuccessorWithoutWeight(MBB*) in MBB to add ↵	Cong Hou	2015-10-27	2	-14/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	successors when optimization is disabled. When optimization is disabled, edge weights that are stored in MBB won't be used, so we don't have to store them. Currently, this is done by adding successors with default weight 0, and if all successors have default weights, the weight list will be empty. But an empty weight list doesn't necessarily mean that optimization is disabled (as is stated several times in MachineBasicBlock.cpp): it may also mean all successors just have default weights. We should discourage using default weights when adding successors, because it is very easy for users to forget to update the correct edge weights instead of using default ones (one exception is when the MBB only has one successor). In order to detect such usages, it is better to differentiate using default weights from the case when optimization is disabled. In this patch, a new interface addSuccessorWithoutWeight(MBB*) is created for when optimization is disabled. In this case, MBB will try to maintain an empty weight list, but it cannot guarantee this, because many uses of addSuccessor() do not check whether optimization is disabled. But it can guarantee that if optimization is enabled, then the weight list always has the same size as the successor list. Differential revision: http://reviews.llvm.org/D13963 llvm-svn: 251429
*	Do not use "else" when both branches return (NFC)	Mehdi Amini	2015-10-27	1	-2/+1
\| \| \| \| \|	From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 251398
*	Fix llc crash processing S/UREM for -Oz builds caused by rL250825.	Steve King	2015-10-27	1	-5/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	When taking the remainder of a value divided by a constant, visitREM() attempts to convert the REM to a longer but faster sequence of instructions. This conversion calls combine() on a speculative DIV instruction. Commit rL250825 may cause this combine() to return a DIVREM, corrupting nearby nodes. Flow eventually hits unreachable(). This patch adds a test case and a check to prevent visitREM() from trying to convert the REM instruction in cases where a DIVREM is possible. See http://reviews.llvm.org/D14035 llvm-svn: 251373
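The "longer but faster sequence" for a remainder by a constant is commonly the identity x % c = x - (x / c) * c, with the division then strength-reduced separately. The following Python sketch assumes that identity is the one meant here and uses hypothetical names; it models C-style truncating signed division, since Python's own `//` rounds toward negative infinity.

```python
def sdiv(a, b):
    """Signed division that truncates toward zero, as SDIV does."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def srem_via_div(a, c):
    """Remainder rebuilt from a divide, a multiply and a subtract."""
    return a - sdiv(a, c) * c
```

The rewrite only pays off if the speculative division really becomes something cheaper than a divide. If the target instead combines it into a single DIVREM, the original remainder is better left alone, which is the case the added check guards against.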
*	[X86] Use correct calling convention for MCU psABI libcalls	Michael Kuperstein	2015-10-25	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \|	When using the MCU psABI, compiler-generated library calls should pass some parameters in-register. However, since inreg marking for x86 is currently done by the front end, it will not be applied to backend-generated calls. This is a workaround for PR3997, which describes a similar issue for -mregparm. Differential Revision: http://reviews.llvm.org/D13977 llvm-svn: 251223
*	[DAGCombiner] Tidy up ConstantFP commutation. NFCI	Simon Pilgrim	2015-10-24	1	-37/+21
\| \| \| \| \| \|	Move ConstantFP canonicalization of commutative instructions to start of 2-op node creation (matches integer) - simplifies constant folding code. llvm-svn: 251203
*	[DAGCombiner] Generalize masking of constant rotates.	Simon Pilgrim	2015-10-24	1	-5/+10
\| \| \| \| \| \| \| \|	We don't need a mask of a rotation result to be a constant splat - any constant scalar/vector can be usefully folded. Followup to D13851. llvm-svn: 251197