bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[x86,sdag] Two interrelated changes to the x86 and sdag code.	Chandler Carruth	2015-02-19	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	First, don't combine bit masking into vector shuffles (even ones the target can handle) once operation legalization has taken place. Custom legalization of vector shuffles may exist for these patterns (making the predicate return true) but that custom legalization may in some cases produce the exact bit math this matches. We only really want to handle this prior to operation legalization. However, the x86 backend, in a fit of awesome, relied on this. What it would do is mark VSELECTs as expand, which would turn them into arithmetic, which this would then match back into vector shuffles, which we would then lower properly. Amazing. Instead, the second change is to teach the x86 backend to directly form vector shuffles from VSELECT nodes with constant conditions, and to mark all of the vector types we support lowering blends as shuffles as custom VSELECT lowering. We still mark the forms which actually support variable blends as legal so that the custom lowering is bypassed, and the legal lowering can even be used by the vector shuffle legalization (yes, i know, this is confusing. but that's how the patterns are written). This makes the VSELECT lowering much more sensible, and in fact should fix a bunch of bugs with it. However, as you'll see in the test cases, right now what it does is point out the hilarious deficiency of the new vector shuffle lowering when it comes to blends. Fortunately, my very next patch fixes that. I can't submit it yet, because that patch, somewhat obviously, forms the exact and/or pattern that the DAG combine is matching here! Without this patch, teaching the vector shuffle lowering to produce the right code infloops in the DAG combiner. With this patch alone, we produce terrible code but at least lower through the right paths. 
With both patches, all the regressions here should be fixed, and a bunch of the improvements (like using 2 shufps with no memory loads instead of 2 andps with memory loads and an orps) will stay. Win! There is one other change worth noting here. We had hilariously wrong vectorization cost estimates for vselect because we fell through to the code path that assumed all "expand" vector operations are scalarized. However, the "expand" lowering of VSELECT is vector bit math, most definitely not scalarized. So now we go back to the correct if horribly naive cost of "1" for "not scalarized". If anyone wants to add actual modeling of shuffle costs, that would be cool, but this seems an improvement on its own. Note the removal of 16 and 32 "costs" for doing a blend. Even in SSE2 we can blend in fewer than 16 instructions. ;] Of course, we don't right now because of OMG bad code, but I'm going to fix that. Next patch. I promise. llvm-svn: 229835
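The relationship between a VSELECT with a constant condition and a vector shuffle, as described in the message above, can be illustrated with a small model. This is only a sketch on plain Python lists, not the x86 backend code; the helper names `vselect`, `blend_mask`, and `shuffle` are hypothetical, and the mask convention (indices 0..n-1 pick from the first operand, n..2n-1 from the second) is assumed to follow the usual two-input shuffle numbering.

```python
def vselect(cond, a, b):
    """Element-wise select: a[i] where cond[i] is true, otherwise b[i]."""
    return [x if c else y for c, x, y in zip(cond, a, b)]


def blend_mask(cond):
    """Shuffle mask equivalent to a select with a constant condition.

    Lane i reads index i (first operand) when cond[i] is true,
    and index i + n (second operand) otherwise.
    """
    n = len(cond)
    return [i if c else i + n for i, c in enumerate(cond)]


def shuffle(a, b, mask):
    """Two-input shuffle over the concatenation of a and b."""
    both = a + b
    return [both[i] for i in mask]


cond = [True, False, False, True]
a = [10, 11, 12, 13]
b = [20, 21, 22, 23]
mask = blend_mask(cond)
selected = vselect(cond, a, b)
shuffled = shuffle(a, b, mask)
```

Because every lane stays in its own position, the resulting shuffle is a blend, which is why a constant-condition select can be expressed directly as a shuffle node.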
*	Fixes two issues in SimplifyDemandedBits of sext_in_reg:	Michael Kuperstein	2015-02-18	1	-11/+18
\| \| \| \| \| \| \| \| \| \| \|	1) We should not try to simplify if the sext has multiple uses 2) There is no need to simplify if the source value is already sign-extended. Patch by Gil Rapaport <gil.rapaport@intel.com> Differential Revision: http://reviews.llvm.org/D6949 llvm-svn: 229659
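Point 2 above rests on the fact that sign-extending in register is a no-op when the value is already sign-extended from the narrower width. A minimal sketch on plain integers, assuming a 32-bit register; `sext_in_reg` and `is_sign_extended` are hypothetical helpers for illustration, not the SelectionDAG API.

```python
def sext_in_reg(x, from_bits, width=32):
    """Sign-extend the low from_bits of x to width bits (unsigned result)."""
    mask = (1 << width) - 1
    low = x & ((1 << from_bits) - 1)
    sign = 1 << (from_bits - 1)
    # XOR then subtract the sign bit: the standard sign-extension trick.
    return ((low ^ sign) - sign) & mask


def is_sign_extended(x, from_bits, width=32):
    """True when x already equals its own sign extension from from_bits."""
    return sext_in_reg(x, from_bits, width) == (x & ((1 << width) - 1))


already_extended = 0xFFFFFF80   # 32-bit pattern of -128, extended from 8 bits
not_extended = 0x00000080       # bit 7 set, upper bits clear
```

For `already_extended` the operation returns its input unchanged, so there is nothing left to simplify; for `not_extended` it changes the upper bits.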
*	Canonicalize splats as build_vectors (PR22283)	Sanjay Patel	2015-02-17	2	-28/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a follow-on patch to: http://reviews.llvm.org/D7093 That patch canonicalized constant splats as build_vectors, and this patch removes the constant check so we can canonicalize all splats as build_vectors. This fixes the 2nd test case in PR22283: http://llvm.org/bugs/show_bug.cgi?id=22283 The unfortunate code duplication between SelectionDAG and DAGCombiner is discussed in the earlier patch review. At least this patch is just removing code... This improves an existing x86 AVX test and changes codegen in an ARM test. Differential Revision: http://reviews.llvm.org/D7389 llvm-svn: 229511
*	Prefer SmallVector::append/insert over push_back loops.	Benjamin Kramer	2015-02-17	5	-32/+11
\| \| \| \| \| \|	Same functionality, but hoists the vector growth out of the loop. llvm-svn: 229500
*	SelectionDAG: fold (fp_to_u/sint (s/uint_to_fp)) here too	Mehdi Amini	2015-02-16	1	-2/+46
\| \| \| \| \| \| \|	Update SPARC tests to match. From: Fiona Glaser <fglaser@apple.com> llvm-svn: 229438
*	AArch64: Safely handle the incoming sret call argument.	Andrew Trick	2015-02-16	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This adds a safe interface to the machine independent InputArg struct for accessing the index of the original (IR-level) argument. When a non-native return type is lowered, we generate the hidden machine-level sret argument on-the-fly. Before this fix, we were representing this argument as OrigArgIndex == 0, which is an outright lie. In particular this crashed in the AArch64 backend where we actually try to access the type of the original argument. Now we use a sentinel value for machine arguments that have no original argument index. AArch64, ARM, Mips, and PPC now check for this case before accessing the original argument. Fixes <rdar://19792160> Null pointer assertion in AArch64TargetLowering llvm-svn: 229413
*	[SDAG] Teach the SelectionDAG to canonicalize vector shuffles of splats	Chandler Carruth	2015-02-15	1	-0/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	directly into blends of the splats. These patterns show up even very late in the vector shuffle lowering where we don't have any chance for DAG combining to kick in, and blending is a tremendously simpler operation to model. By coercing the shuffle into a blend we can much more easily match and lower shuffles of splats. Immediately with this change there are significantly more blends being matched in the x86 vector shuffle lowering. llvm-svn: 229308
*	[x86] Fix PR22377, a regression with the new vector shuffle legality	Chandler Carruth	2015-02-15	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \|	test. This was just a matter of the DAG combine for vector shuffles being too aggressive. This is a bit of a grey area, but I think generally if we can re-use intermediate shuffles, we should. Certainly, given the test cases I have available, this seems like the right call. llvm-svn: 229285
*	CodeGen: Canonicalize access to function attributes, NFC	Duncan P. N. Exon Smith	2015-02-14	3	-21/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Canonicalize access to function attributes to use the simpler API. getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind) => getFnAttribute(Kind) getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind) => hasFnAttribute(Kind) Also, add `Function::getFnStackAlignment()`, and canonicalize: getAttributes().getStackAlignment(AttributeSet::FunctionIndex) => getFnStackAlignment() llvm-svn: 229208
*	Unify the two EH personality classification routines I wrote	Reid Kleckner	2015-02-14	1	-1/+1
\| \| \| \| \| \|	We only need one. llvm-svn: 229193
*	Re-sort #include lines using my handy dandy ./utils/sort_includes.py	Chandler Carruth	2015-02-13	2	-2/+2
\| \| \| \| \| \|	script. This is in preparation for changes to lots of include lines. llvm-svn: 229088
*	[SDAG] Don't try to use FP_EXTEND/FP_ROUND for int<->fp promotions	Hal Finkel	2015-02-12	1	-3/+5
\| \| \| \| \| \| \| \| \| \|	The PowerPC backend has long promoted some floating-point vector operations (such as select) to integer vector operations. Unfortunately, this behavior was broken by r216555. When using FP_EXTEND/FP_ROUND for promotions, we must check that both the old and new types are floating-point types. Otherwise, we must use BITCAST as we did prior to r216555 for everything. llvm-svn: 228969
*	MathExtras: Bring Count(Trailing\|Leading)Ones and CountPopulation in line ↵	Benjamin Kramer	2015-02-12	2	-4/+4
\| \| \| \| \| \| \| \|	with countTrailingZeros Update all callers. llvm-svn: 228930
*	[CodeGen] Don't blindly combine (fp_round (fp_round x)) to (fp_round x).	Ahmed Bougacha	2015-02-12	1	-5/+10
\| \| \| \| \| \| \| \| \| \| \| \|	We used to do this DAG combine, but it's not always correct: If the first fp_round isn't a value preserving truncation, it might introduce a tie in the second fp_round, that wouldn't occur in the single-step fp_round we want to fold to. In other words, double rounding isn't the same as rounding. Differential Revision: http://reviews.llvm.org/D7571 llvm-svn: 228911
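The double-rounding problem described above can be reproduced outside of LLVM. The sketch below uses Python's `struct` module to round a double to single and half precision; the helper `round_to` and the chosen value are illustrative assumptions, not taken from the patch. The value sits just above the midpoint between two adjacent half-precision numbers, so rounding to single precision first lands exactly on the midpoint and creates a tie that the direct rounding never sees.

```python
import struct


def round_to(fmt, x):
    """Round x to the IEEE format named by fmt ('<f' single, '<e' half)."""
    return struct.unpack(fmt, struct.pack(fmt, x))[0]


# Slightly above the midpoint between 1.0 and 1.0 + 2**-10 in half precision.
x = 1.0 + 2.0**-11 + 2.0**-30

direct = round_to("<e", x)                     # one rounding step
two_step = round_to("<e", round_to("<f", x))   # fp_round of fp_round
```

Here `direct` rounds up to 1.0009765625, while `two_step` first drops the 2**-30 term and then resolves the resulting tie to even, giving 1.0.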
*	Fix SelectionDAG compile time issue with alias analysis.	Jonas Paulsson	2015-02-11	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \|	Add new token factor node and its users to worklist if alias analysis is turned on, in DAGCombiner::visitTokenFactor(). Alias analysis may cause a lot of new token factors to be inserted into the DAG, and they need to be optimized to avoid significant slow-downs. Reviewed by Hal Finkel. llvm-svn: 228841
*	Fix makeLibCall argument (signed) in SoftenFloatRes_XINT_TO_FP function	Petar Jovanovic	2015-02-10	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The isSigned argument of makeLibCall function was hard-coded to false (unsigned). This caused zero extension on MIPS64 soft float. As the result SingleSource/Benchmarks/Stanford/FloatMM test and SingleSource/UnitTests/2005-07-17-INT-To-FP test failed. The solution was to use the proper argument. Patch by Strahinja Petrovic. Differential Revision: http://reviews.llvm.org/D7292 llvm-svn: 228765
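Why the signedness flag matters can be seen on a single bit pattern. This is a sketch of the arithmetic only, assuming a 32-bit source value widened before conversion; `zext` and `sext` are hypothetical helpers and do not model `makeLibCall` itself.

```python
def zext(value, from_bits):
    """Zero-extend: keep the low from_bits and treat them as unsigned."""
    return value & ((1 << from_bits) - 1)


def sext(value, from_bits):
    """Sign-extend: interpret the low from_bits as a two's complement value."""
    low = value & ((1 << from_bits) - 1)
    sign = 1 << (from_bits - 1)
    return (low ^ sign) - sign


bits_i32 = 0xFFFFFFFF                     # bit pattern of the signed value -1
as_unsigned = float(zext(bits_i32, 32))   # what zero extension converts
as_signed = float(sext(bits_i32, 32))     # what a signed conversion expects
```

With zero extension the signed input -1 is converted as 4294967295.0 instead of -1.0, which is the kind of wrong result a hard-coded unsigned flag produces for a signed int-to-fp conversion.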
*	Adding support for llvm.eh.begincatch and llvm.eh.endcatch intrinsics and ↵	Andrew Kaylor	2015-02-10	1	-0/+3
\| \| \| \| \| \| \| \|	beginning the documentation of native Windows exception handling. Differential Revision: http://reviews.llvm.org/D7398 llvm-svn: 228733
*	Two comment typo fixes in lib/CodeGen/SelectionDAG/DAGCombiner.cpp.	Jonas Paulsson	2015-02-10	1	-2/+2
\| \| \| \|	llvm-svn: 228700
*	[x86] Fix PR22524: the DAG combiner was incorrectly handling illegal	Chandler Carruth	2015-02-10	1	-13/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	nodes when folding bitcasts of constants. We can't fold things and then check after-the-fact whether it was legal. Once we have formed the DAG node, arbitrary other nodes may have been collapsed to it. There is no easy way to go back. Instead, we need to test for the specific folding cases we're interested in and ensure those are legal first. This could in theory make this less powerful for bitcasting from an integer to some vector type, but AFAICT, that can't actually happen in the SDAG so it's fine. Now, we only whitelist specific int->fp and fp->int bitcasts for post-legalization folding. I've added the test case from the PR. (Also as a note, this does not appear to be in 3.6, no backport needed) llvm-svn: 228656
*	[CodeGen] Add hook/combine to form vector extloads, enabled on X86.	Ahmed Bougacha	2015-02-05	1	-12/+121
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The combine that forms extloads used to be disabled on vector types, because "None of the supported targets knows how to perform load and sign extend on vectors in one instruction." That's not entirely true, since at least SSE4.1 X86 knows how to do those sextloads/zextloads (with PMOVS/ZX). But there are several aspects to getting this right. First, vector extloads are controlled by a profitability callback. For instance, on ARM, several instructions have folded extload forms, so it's not always beneficial to create an extload node (and trying to match extloads is a whole 'nother can of worms). The interesting optimization enables folding of s/zextloads to illegal (splittable) vector types, expanding them into smaller legal extloads. It's not ideal (it introduces some legalization-like behavior in the combine) but it's better than the obvious alternative: form illegal extloads, and later try to split them up. If you do that, you might generate extloads that can't be split up, but have a valid ext+load expansion. At vector-op legalization time, it's too late to generate this kind of code, so you end up forced to scalarize. It's better to just avoid creating egregiously illegal nodes. This optimization is enabled unconditionally on X86. Note that the splitting combine is happy with "custom" extloads. As is, this bypasses the actual custom lowering, and just unrolls the extload. But from what I've seen, this is still much better than the current custom lowering, which does some kind of unrolling at the end anyway (see for instance load_sext_4i8_to_4i64 on SSE2, and the added FIXME). Also note that the existing combine that forms extloads is now also enabled on legal vectors. This doesn't have a big effect on X86 (because sext+load is usually combined to sext_inreg+aextload). 
On ARM it fires on some rare occasions; that's for a separate commit. Differential Revision: http://reviews.llvm.org/D6904 llvm-svn: 228325
*	Fixes a bug in vector load legalization that confused bits and bytes.	Michael Kuperstein	2015-02-04	1	-3/+3
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D7400 llvm-svn: 228168
*	Add nullptr checks for TargetSelectionDAGInfo in SelectionDAG.	Manuel Jacob	2015-01-28	1	-13/+19
\| \| \| \| \| \|	TSI is not guaranteed to be non-null in SelectionDAG. llvm-svn: 227397
*	Revert r227242 - Merge vector stores into wider vector stores (PR21711).	Quentin Colombet	2015-01-27	1	-54/+30
\| \| \| \| \| \| \|	This commit creates an infinite loop in DAG combine in the LLVM test-suite for aarch64 with mcpu=cyclone (just having neon may be enough to expose this). llvm-svn: 227272
*	Merge vector stores into wider vector stores (PR21711)	Sanjay Patel	2015-01-27	1	-30/+54
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch resolves part of PR21711 ( http://llvm.org/bugs/show_bug.cgi?id=21711 ). The 'f3' test case in that report presents a situation where we have two 128-bit stores extracted from a 256-bit source vector. Instead of producing this: vmovaps %xmm0, (%rdi) vextractf128 $1, %ymm0, 16(%rdi) This patch merges the 128-bit stores into a single 256-bit store: vmovups %ymm0, (%rdi) Differential Revision: http://reviews.llvm.org/D7208 llvm-svn: 227242
*	Add a FIXME in SelectionDAGBuilder before an assert that is valid only on X86.	Manuel Jacob	2015-01-27	1	-0/+3
\| \| \| \| \| \| \| \| \|	When lowering memcpy, memset or memmove, this assert checks whether the pointer operands are in an address space < 256 which means "user defined address space" on X86. However, this notion of "user defined address space" does not exist for other targets. llvm-svn: 227191
*	Grab the TargetLowering info from the DAG rather than querying for	Eric Christopher	2015-01-27	1	-3/+2
\| \| \| \| \| \|	a subtarget. llvm-svn: 227156
*	[SelectionDAG] Fix assert message copypasta. NFC.	Ahmed Bougacha	2015-01-26	1	-2/+2
\| \| \| \|	llvm-svn: 227119
*	Move DataLayout back to the TargetMachine from TargetSubtargetInfo	Eric Christopher	2015-01-26	3	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	derived classes. Since global data alignment, layout, and mangling is often based on the DataLayout, move it to the TargetMachine. This ensures that global data is going to be laid out and mangled consistently if the subtarget changes on a per function basis. Prior to this all targets() have had subtarget dependent code moved out and onto the TargetMachine. One target hasn't been migrated as part of this change: R600. The R600 port has, as a subtarget feature, the size of pointers and this affects global data layout. I've currently hacked in a FIXME to enable progress, but the port needs to be updated to either pass the 64-bitness to the TargetMachine, or fix the DataLayout to avoid subtarget dependent features. llvm-svn: 227113
*	Revert GCStrategy ownership changes	Philip Reames	2015-01-26	3	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \|	This change reverts the interesting parts of 226311 (and 227046). This change introduced two problems, and I've been convinced that an alternate approach is preferable anyway. The bugs were: - Registry appears to require all users be within the same linkage unit. After this change, asking for "statepoint-example" in Transform/ would sometimes get you nullptr, whereas asking the same question in CodeGen would return the right GCStrategy. The correct long term fix is to get rid of the utter hack which is Registry, but I don't have time for that right now. 227046 appears to have been an attempt to fix this, but I don't believe it does so completely. - GCMetadataPrinter::finishAssembly was being called more than once per GCStrategy. Each Strategy was being added to the GCModuleInfo multiple times. Once I get time again, I'm going to split GCModuleInfo into the gc.root specific part and a GCStrategy owning Analysis pass. I'm probably also going to kill off the Registry. Once that's done, I'll move the new GCStrategyAnalysis and all built in GCStrategies into Analysis. (As originally suggested by Chandler.) This will accomplish my original goal of being able to access GCStrategy from Transform/ without adding all of the builtin GCs to IR/. llvm-svn: 227109
*	[DAG] Fix wrong canonicalization performed on shuffle nodes.	Andrea Di Biagio	2015-01-24	1	-7/+9
\| \| \| \| \| \| \| \| \| \| \|	This fixes a regression introduced by r226816. When replacing a splat shuffle node with a constant build_vector, make sure that the new build_vector has a valid number of elements. Thanks to Patrik Hagglund for reporting this problem and providing a small reproducible. llvm-svn: 227002
*	Classify functions by EH personality type rather than using the triple	Reid Kleckner	2015-01-23	1	-4/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This mostly reverts commit r222062 and replaces it with a new enum. At some point this enum will grow at least for other MSVC EH personalities. Also beefs up the way we were sniffing the personality function. Previously we would emit the Itanium LSDA despite using __C_specific_handler. Reviewers: majnemer Differential Revision: http://reviews.llvm.org/D6987 llvm-svn: 226920
*	DAGCombine: always constant fold FMA when target disables FP exceptions	Mehdi Amini	2015-01-23	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: When trying to constant fold an FMA in the DAG, getNode() fails to fold the FMA if an operand is not finite. In this case this patch allows the constant folding if !TLI->hasFloatingPointExceptions() Reviewers: resistor Reviewed By: resistor Subscribers: hfinkel, llvm-commits Differential Revision: http://reviews.llvm.org/D6912 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 226901
*	SelectionDAG: Add KnownBits and SignBits computation for EXTRACT_ELEMENT	Jan Vesely	2015-01-22	1	-0/+30
\| \| \| \| \| \| \| \| \| \| \| \|	v2: use getZExtValue add missing break codestyle v3: add few more comments Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com> llvm-svn: 226880
*	Intrinsics: introduce llvm_any_ty aka ValueType Any	Ramkumar Ramachandra	2015-01-22	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Specifically, gc.result benefits from this greatly. Instead of: gc.result.int.* gc.result.float.* gc.result.ptr.* ... We now have a gc.result.* that can specialize to literally any type. Differential Revision: http://reviews.llvm.org/D7020 llvm-svn: 226857
*	merge consecutive stores of extracted vector elements (PR21711)	Sanjay Patel	2015-01-22	1	-92/+162
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a 2nd try at the same optimization as http://reviews.llvm.org/D6698. That patch was checked in at r224611, but reverted at r225031 because it caused a failure outside of the regression tests. The cause of the crash was not recognizing consecutive stores that have mixed source values (loads and vector element extracts), so this patch adds a check to bail out if any store value is not coming from a vector element extract. This patch also refactors the shared logic of the constant source and vector extracted elements source cases into a helper function. Differential Revision: http://reviews.llvm.org/D6850 llvm-svn: 226845
*	[DAGCombine] Produce better code for constant splats	Michael Kuperstein	2015-01-22	2	-2/+41
\| \| \| \| \| \| \| \| \| \| \|	This solves PR22276. Splats of constants would sometimes produce redundant shuffles, sometimes ridiculously so (see the PR for details). Fold these shuffles into BUILD_VECTORs early on instead. Differential Revision: http://reviews.llvm.org/D7093 Fixed recommit of r226811. llvm-svn: 226816
*	Revert r226811, MSVC accepts code sane compilers don't.	Michael Kuperstein	2015-01-22	2	-41/+2
\| \| \| \|	llvm-svn: 226814
*	[DAGCombine] Produce better code for constant splats	Michael Kuperstein	2015-01-22	2	-2/+41
\| \| \| \| \| \| \| \| \|	This solves PR22276. Splats of constants would sometimes produce redundant shuffles, sometimes ridiculously so (see the PR for details). Fold these shuffles into BUILD_VECTORs early on instead. Differential Revision: http://reviews.llvm.org/D7093 llvm-svn: 226811
*	Fixed a bug in type legalizer for masked load/store intrinsics.	Elena Demikhovsky	2015-01-22	6	-38/+85
\| \| \| \| \| \| \| \| \| \| \| \|	The problem occurs when after vectorization we have type <2 x i32>. This type is promoted to <2 x i64> and then requires additional efforts for expanding loads and truncating stores. I added EXPAND / TRUNCATE attributes to the masked load/store SDNodes. The code now contains additional shuffles. I've prepared changes in the cost estimation for masked memory operations, it will be submitted separately. llvm-svn: 226808
*	Fixed a comment	Elena Demikhovsky	2015-01-22	1	-1/+1
\| \| \| \|	llvm-svn: 226806
*	Fixed a bug in narrowing store operation.	Elena Demikhovsky	2015-01-22	1	-2/+5
\| \| \| \| \| \| \| \| \|	Type MVT::i1 became legal in KNL, but store operation can't be narrowed to this type, since the size of VT (1 bit) is not equal to its actual store size(8 bits). Added a test provided by David (dag@cray.com) llvm-svn: 226805
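The size mismatch mentioned above comes from stores working in whole bytes. A minimal sketch, assuming the store size is the bit width rounded up to a byte multiple; `store_size_in_bits` and `can_narrow_store_to` are hypothetical names used for illustration, not LLVM functions.

```python
def store_size_in_bits(size_in_bits):
    """Bits actually written by a store: the width rounded up to whole bytes."""
    return ((size_in_bits + 7) // 8) * 8


def can_narrow_store_to(size_in_bits):
    """Narrowing is only safe when the type fills its store size exactly."""
    return store_size_in_bits(size_in_bits) == size_in_bits


i1_store = store_size_in_bits(1)
```

A 1-bit type occupies 8 bits when stored, so a store narrowed to it would write more bits than the type holds; 8-bit and 32-bit types do not have this problem.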
*	DAGCombine: fold (or (and X, M), (and X, N)) -> (and X, (or M, N))	Tim Northover	2015-01-21	1	-0/+11
\| \| \| \| \| \| \|	It can help with argument juggling on some targets, and is generally a good idea. llvm-svn: 226740
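The fold named in the title is the distributive law of AND over OR, and it can be checked exhaustively for a small bit width. The sketch below assumes 4-bit values purely to keep the search small; the identity is bitwise, so the width does not affect the result.

```python
from itertools import product

WIDTH = 4
values = range(1 << WIDTH)

# Compare both sides of the fold for every (X, M, N) triple.
fold_holds = all(
    ((x & m) | (x & n)) == (x & (m | n))
    for x, m, n in product(values, repeat=3)
)
```

The folded form uses one AND and one OR instead of two ANDs and an OR, and when M and N are constants the OR of the masks can be computed at compile time.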
*	Revert "DAGCombine: fold (or (and X, M), (and X, N)) -> (and X, (or M, N))"	Tim Northover	2015-01-21	1	-11/+0
\| \| \| \| \| \| \| \|	It hadn't gone through review yet, but was still on my local copy. This reverts commit r226663 llvm-svn: 226665
*	DAGCombine: fold (or (and X, M), (and X, N)) -> (and X, (or M, N))	Tim Northover	2015-01-21	1	-0/+11
\| \| \| \|	llvm-svn: 226663
*	Prevent binary-tree deterioration in sparse switch statements.	Daniel Jasper	2015-01-20	1	-8/+10
\| \| \| \| \| \| \| \| \| \| \| \| \|	This addresses part of llvm.org/PR22262. Specifically, it prevents considering the densities of sub-ranges that have fewer than TLI.getMinimumJumpTableEntries() elements. Those densities won't help jump tables. This is not a complete solution but works around the most pressing issue. Review: http://reviews.llvm.org/D7070 llvm-svn: 226600
*	Factor out a splitSwitchCase() function so that it can be reused.	Daniel Jasper	2015-01-20	2	-21/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is in preparation for a fix to llvm.org/PR22262. One of the ideas here is to first find a good jump table range and then split before and after it. Thereby, we don't need to use the split-based-on-density heuristic at all, which can make the "binary tree" deteriorate in various cases. Also some minor cleanups. No functional changes. llvm-svn: 226551
*	[PM] Remove the Pass argument from all of the critical edge splitting	Chandler Carruth	2015-01-19	1	-4/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	APIs and replace it and numerous booleans with an option struct. The critical edge splitting API has a really large surface of flags and so it seems worth burning a small option struct / builder. This struct can be constructed with the various preserved analyses and then flags can be flipped in a builder style. The various users are now responsible for directly passing along their analysis information. This should be enough for the critical edge splitting to work cleanly with the new pass manager as well. This API is still pretty crufty and could be cleaned up a lot, but I've focused on this change just threading an option struct rather than a pass through the API. llvm-svn: 226456
*	Improve DAG combine pass on certain IR vector patterns	Mehdi Amini	2015-01-17	1	-1/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Loading 2 2x32-bit float vectors into the bottom half of a 256-bit vector produced suboptimal code in AVX2 mode with certain IR combinations. In particular, the IR optimizer folded 2f32 + 2f32 -> 4f32, 4f32 + 4f32 (undef) -> 8f32 into a 2f32 + 2f32 -> 8f32, which seems more canonical, but then mysteriously generated rather bad code; the movq/movhpd combination didn't match. The problem lay in the BUILD_VECTOR optimization path. The 2f32 inputs would get promoted to 4f32 by the type legalizer, eventually resulting in a BUILD_VECTOR on two 4f32 into an 8f32. The BUILD_VECTOR then, recognizing these were both half the output size, concatted them and then produced a shuffle. However, the resulting concat + shuffle was more complex than it should be; in the case where the upper half of the output is undef, we probably want to generate shuffle + concat instead. This enhancement causes the vector_shuffle combine step to recognize this suboptimal pattern and correct it. I included it there instead of in BUILD_VECTOR in case the same suboptimal pattern occurs for other reasons. This results in the optimizer correctly producing the optimal movq + movhpd sequence for all three variations on this IR, even with AVX2. I've included a test case. Radar link: rdar://problem/19287012 Fix for PR 21943. From: Fiona Glaser <fglaser@apple.com> llvm-svn: 226360
*	Move ownership of GCStrategy objects to LLVMContext	Philip Reames	2015-01-16	3	-3/+4
\| \| \| \| \| \| \| \| \| \| \| \|	Note: This change ended up being slightly more controversial than expected. Chandler has tentatively okayed this for the moment, but I may be revisiting this in the near future after we settle some high level questions. Rather than have the GCStrategy object owned by the GCModuleInfo - which is an immutable analysis pass used mainly by gc.root - have it be owned by the LLVMContext. This simplifies the ownership logic (i.e. can you have two instances of the same strategy at once?), but more importantly, allows us to access the GCStrategy in the middle end optimizer. To this end, I add an accessor through Function which becomes the canonical way to get at a GCStrategy instance. In the near future, this will allow me to move some of the checks from http://reviews.llvm.org/D6808 into the Verifier itself, and to introduce optimization legality predicates for some of the recent additions to InstCombine. (These will follow as separate changes.) Differential Revision: http://reviews.llvm.org/D6811 llvm-svn: 226311
*	Fix SelectionDAG -view-*-dags filtering	Mehdi Amini	2015-01-15	1	-1/+1
\| \| \| \|	llvm-svn: 226163