bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	Debug info: Infrastructure to support debug locations for fragmented	Adrian Prantl	2014-08-01	9	-73/+301
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	variables (for example, by-value struct arguments passed in registers, or large integer values split across several smaller registers). On the IR level, this adds a new type of complex address operation OpPiece to DIVariable that describes size and offset of a variable fragment. On the DWARF emitter level, all pieces describing the same variable are collected, sorted and emitted as DWARF expressions using the DW_OP_piece and DW_OP_bit_piece operators. http://reviews.llvm.org/D3373 rdar://problem/15928306 What this patch doesn't do / Future work: - This patch only adds the backend machinery to make this work, patches that change SROA and SelectionDAG's type legalizer to actually create such debug info will follow. (http://reviews.llvm.org/D2680) - Making the DIVariable complex expressions into an argument of dbg.value will reduce the memory footprint of the debug metadata. - The sorting/uniquing of pieces should be moved into DebugLocEntry, to facilitate the merging of multi-piece entries. llvm-svn: 214576
*	[SDAG] MorphNodeTo recursively deletes dead operands of the old	Chandler Carruth	2014-08-01	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \|	fromulation of the node, which isn't really the desired behavior from within the combiner or legalizer, but is necessary within ISel. I've added a hopefully helpful comment and fixed the only two places where this took place. Yet another step toward the combiner and legalizer not needing to use update listeners with virtual calls to manage the worklists behind legalization and combining. llvm-svn: 214574
*	[SDAG] Begin simplifying the way in which the legalizer deletes nodes.	Chandler Carruth	2014-08-01	1	-41/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This lifts the (very few) places the legalizer would delete dead nodes into the outer loop around the legalizer. This is significantly simpler because it doesn't require the legalizer itself to manage the iterator validity, and it doesn't require the legalizer to be a DAG update listener in order to remove things from the legalized set. It also makes the interface much less contrived for the case of the legalizer running inside the last phase of DAG combining. I'm working on centralizing the deletion of nodes during both legalizing and combining as much as possible. My hope is to remove the need for DAG update listeners from the combiner next, which would remove a costly virtual dispatch chain on every deletion. This in turn should allow us to more aggressively delete DAG nodes during combining which will in turn allow us to combine more aggressively by exposing the actual nodes which have single users to the combine phases. llvm-svn: 214546
*	Explicitly report runtime stack realignment in StackMap section	Philip Reames	2014-08-01	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \|	This change adds code to explicitly mark a function which requires runtime stack realignment as not having a fixed frame size in the StackMap section. As it happens, this is not actually a functional change. The size that would be reported without the check is also "-1", but as far as I can tell, that's an accident. The code change makes this explicit. Note: There's a separate bug in handling of stackmaps and patchpoints in functions which need dynamic frame realignment. The current code assumes that offsets can be calculated from RBP, but realigned frames must use RSP. (There's a variable gap between RBP and the spill slots.) This change set does not address that issue. Reviewers: atrick, ributzka Differential Revision: http://reviews.llvm.org/D4572 llvm-svn: 214534
*	[PowerPC] Generate unaligned vector loads using intrinsics instead of ↵	Hal Finkel	2014-08-01	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	regular loads Altivec vector loads on PowerPC have an interesting property: They always load from an aligned address (by rounding down the address actually provided if necessary). In order to generate an actual unaligned load, you can generate two load instructions, one with the original address, one offset by one vector length, and use a special permutation to extract the bytes desired. When this was originally implemented, I generated these two loads using regular ISD::LOAD nodes, now marked as aligned. Unfortunately, there is a problem with this: The alignment of a load does not contribute to its identity, and SDNodes are uniqued. So, imagine that we have some unaligned load, L1, that is not aligned. The routine will create two loads, L1(aligned) and (L1+16)(aligned). Further imagine that there had already existed a load (L1+16)(unaligned) with the same chain operand as the load L1. When (L1+16)(aligned) is created as part of the lowering of L1, this load is also the (L1+16)(unaligned) node, just now marked as aligned (because the new alignment overwrites the old). But the original users of (L1+16)(unaligned) now get the data intended for the permutation yielding the data for L1, and (L1+16)(unaligned) no longer exists to get its own permutation-based expansion. This was PR19991. A second potential problem has to do with the MMOs on these loads, which can be used by AA during instruction scheduling to break chain-based dependencies. If the new "aligned" loads get the MMO from the original unaligned load, this does not represent the fact that it will load data from below the original address. Normally, this would not matter, but this load might be combined with another load pair for a previous vector, and then the dependency on the otherwise- ignored lower bytes can matter. To fix both problems, instead of generating the necessary loads using regular ISD::LOAD instructions, ppc_altivec_lvx intrinsics are used instead. These are provided with MMOs with a conservative address range. Unfortunately, I no longer have a failing test case (since PR19991 was reported, other changes in CodeGen have forced this bug back into hiding it again). Nevertheless, this should fix the underlying problem. llvm-svn: 214481
*	White space fix.	Louis Gerbarg	2014-07-31	1	-1/+1
\| \| \| \|	llvm-svn: 214455
*	Make sure no loads resulting from load->switch DAGCombine are marked invariant	Louis Gerbarg	2014-07-31	6	-40/+53
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently when DAGCombine converts loads feeding a switch into a switch of addresses feeding a load the new load inherits the isInvariant flag of the left side. This is incorrect since invariant loads can be reordered in cases where it is illegal to reoarder normal loads. This patch adds an isInvariant parameter to getExtLoad() and updates all call sites to pass in the data if they have it or false if they don't. It also changes the DAGCombine to use that data to make the right decision when creating the new load. llvm-svn: 214449
*	Disable IsSub subregister assert. pr18663.	Will Schmidt	2014-07-31	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a follow-up to the activity in the bug at http://llvm.org/bugs/show_bug.cgi?id=18663 . The underlying issue has to do with how the KILL pseudo-instruction is handled. I defer to Hal/Jakob/Uli for additional details and background. This will disable the (bad?) assert, add an associated fixme comment, and add a pair of tests. The code change and the pr18663-2.ll test are copied from the referenced bug. That test does not immediately fail in my environment, but I have added the pr18663.ll test which does. (Comment from Hal) to provide everyone else with some context, this assert was not bad when it was written. At that time, we only generated KILL pseudo instructions around subregister copies. This logic, unfortunately, had its own problems. In r199797, the relevant logic in MachineCopyPropagation was replaced to generate KILLs for other kinds of copies too. This change in semantics broke this now-problematic assumption in AggressiveAntiDepBreaker. The AggressiveAntiDepBreaker really needs a proper cleanup to deal with the change, but removing the assert (which just allows the function to return false) is a safe conservative behavior, and should do for the time being. llvm-svn: 214429
*	[FastISel] Fix the patchpoint intrinsic lowering in FastISel for large ↵	Juergen Ributzka	2014-07-31	1	-1/+1
\| \| \| \| \| \| \| \| \|	target addresses. This fixes a mistake where I accidentially dropped the upper 32bit of a 64bit pointer during FastISel lowering of the patchpoint intrinsic. llvm-svn: 214367
*	Refactor duplicated code.	Rafael Espindola	2014-07-30	2	-24/+29
\| \| \| \|	llvm-svn: 214328
*	Retain alignment requirements for load->selects modified by DAGCombine	Louis Gerbarg	2014-07-30	1	-2/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	DAGCombine may choose to rewrite graphs where two loads feed a select into graphs where a select of two addresses feed a load. While it sanity checks the loads to make sure they are broadly equivalent it currently just uses the alignment restriction of the left node. In cases where the right node has stronger alignment requiresment this may lead to bad codegen, such as generating an aligned load where an unaligned load is required. This patch makes the combine generate a load with an alignment that is the same as whichever is more restrictive of the two alignments. Tests included. rdar://17762530 llvm-svn: 214322
*	Add the missing hasLinkOnceODRLinkage predicate.	Rafael Espindola	2014-07-30	1	-2/+1
\| \| \| \|	llvm-svn: 214312
*	Don't manually (and forcibly) run the verifier on the entire module from	Chandler Carruth	2014-07-30	1	-4/+0
\| \| \| \| \| \| \| \| \| \| \| \| \|	the jump instruction table pass. First, the verifier is already built into all the tools. The test case is adapted to just run llvm-as demonstrating that we still catch the broken module. Second, the verifier is extremely slow. This was responsible for very significant compile time regressions. If you have deployed a Clang binary anywhere from r210280 to this commit, you really want to re-deploy. llvm-svn: 214287
*	Add support for scalarizing ctlz_zero_undef	Petar Jovanovic	2014-07-30	1	-0/+1
\| \| \| \| \| \| \| \| \|	Fix the missing case in ScalarizeVectorResult() that was exposed with libclcore.bc in Android. Differential Revision: http://reviews.llvm.org/D4645 llvm-svn: 214266
*	Header hygiene: remove using directive and #undef DEBUG_TYPE once we're done.	Richard Smith	2014-07-30	1	-0/+2
\| \| \| \|	llvm-svn: 214263
*	Feedback on r214189, no functionality change.	Manman Ren	2014-07-29	2	-2/+2
\| \| \| \|	llvm-svn: 214240
*	[Debug Info] remove DITrivialType and use null to represent unspecified param.	Manman Ren	2014-07-29	2	-5/+7
\| \| \| \| \| \| \| \| \| \| \| \|	Per feedback on r214111, we are going to use null to represent unspecified parameter. If the type array is {null}, it means a function that returns void; If the type array is {null, null}, it means a variadic function that returns void. In summary if we have more than one element in the type array and the last element is null, it is a variadic function. rdar://17628609 llvm-svn: 214189
*	CodeGenPrep: fall back to MVT::Other if instruction's type isn't an EVT.	Tim Northover	2014-07-29	1	-3/+6
\| \| \| \| \| \| \| \| \|	The test being performed is just an approximation anyway, so it really shouldn't crash when things don't go entirely as expected. Should fix PR20474. llvm-svn: 214177
*	ARM: fix @llvm.convert.from.fp16 on softfloat targets.	Tim Northover	2014-07-29	1	-1/+6
\| \| \| \| \| \| \| \|	We need to make sure we use the softened version of all appropriate operands in the libcall, or things go horribly wrong. This may entail actually executing a 1-stage softening. llvm-svn: 214175
*	Add TargetInstrInfo interface isAsCheapAsAMove.	Jiangning Liu	2014-07-29	5	-5/+5
\| \| \| \|	llvm-svn: 214158
*	[Debug Info] unique MDNodes in the enum types of each compile unit.	Manman Ren	2014-07-28	1	-2/+7
\| \| \| \| \| \| \| \| \|	The enum types array by design contains pointers to MDNodes rather than DIRefs. Unique them when handling the enum types in DwarfDebug. rdar://17628609 llvm-svn: 214139
*	[Debug Info] add DISubroutineType and its creation takes DITypeArray.	Manman Ren	2014-07-28	3	-12/+12
\| \| \| \| \| \| \| \| \| \| \|	DITypeArray is an array of DITypeRef, at its creation, we will create DITypeRef (i.e use the identifier if the type node has an identifier). This is the last patch to unique the type array of a subroutine type. rdar://17628609 llvm-svn: 214132
*	[Debug Info] rename getTypeArray to getElements, setTypeArray to setArrays.	Manman Ren	2014-07-28	2	-8/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is the second of a series of patches to handle type uniqueing of the type array for a subroutine type. For vector and array types, getElements returns the array of subranges, so it is a better name than getTypeArray. Even for class, struct and enum types, getElements returns the members, which can be subprograms. setArrays can set up to two arrays, the second is the templates. This commit should have no functionality change. llvm-svn: 214112
*	[SDAG] Add DEBUG logging to the legalizer, fixing a "bug" found by	Chandler Carruth	2014-07-28	2	-6/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	inspection in the proccess, and shuffle the logging in the DAG combiner around a bit. With this it is much easier to follow what the legalizer is doing. It should even accurately present most of the strange legalization operations where a single node is replaced by multiple nodes, etc. There is still some information lost (we log SDNodes not SDValues so we don't log which result is used for which thing), but I think this is much closer to a usable system. Notably, this will make it much more apparant when legalization is actually happening inside the combiner, or when there is a cycle caused by interactions of the legalizer and the combiner. The "bug" I fixed here I'm not sure is remotely possible to trigger. We were only adding one of the nodes in a replacement to the updated set rather than all of the nodes in the replacement. Realistically, the worst result of this are nodes not getting back onto the worklist in the DAG combiner. I doubt it is possible to trigger this today, and I certainly don't have any ideas about how, but this at least brings the code into alignment with the principled operation of the routine. llvm-svn: 214105
*	Add alignment value to allowsUnalignedMemoryAccess	Matt Arsenault	2014-07-27	3	-12/+17
\| \| \| \| \| \| \| \| \| \|	Rename to allowsMisalignedMemoryAccess. On R600, 8 and 16 byte accesses are mostly OK with 4-byte alignment, and don't need to be split into multiple accesses. Vector loads with an alignment of the element type are not uncommon in OpenCL code. llvm-svn: 214055
*	[SDAG] Add an assert that we don't mess up the number of values when	Chandler Carruth	2014-07-26	1	-0/+3
\| \| \| \| \| \| \| \|	replacing nodes in the legalizer. This caught a number of bugs for me during development. llvm-svn: 214022
*	[SDAG] Simplify the code for handling single-value nodes and add	Chandler Carruth	2014-07-26	1	-8/+12
\| \| \| \| \| \|	a missing transfer of debug information (without which tests fail). llvm-svn: 214021
*	[SDAG] When performing post-legalize DAG combining, run the legalizer	Chandler Carruth	2014-07-26	2	-61/+107
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	over each node in the worklist prior to combining. This allows the combiner to produce new nodes which need to go back through legalization. This is particularly useful when generating operands to target specific nodes in a post-legalize DAG combine where the operands are significantly easier to express as pre-legalized operations. My immediate use case will be PSHUFB formation where we need to build a constant shuffle mask with a build_vector node. This also refactors the relevant functionality in the legalizer to support this, and updates relevant tests. I've spoken to the R600 folks and these changes look like improvements to them. The avx512 change needs to be investigated, I suspect there is a disagreement between the legalizer and the DAG combiner there, but it seems a minor issue so leaving it to be re-evaluated after this patch. Differential Revision: http://reviews.llvm.org/D4564 llvm-svn: 214020
*	Add @llvm.assume, lowering, and some basic properties	Hal Finkel	2014-07-25	3	-3/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is the first commit in a series that add an @llvm.assume intrinsic which can be used to provide the optimizer with a condition it may assume to be true (when the control flow would hit the intrinsic call). Some basic properties are added here: - llvm.invariant(true) is dead. - llvm.invariant(false) is unreachable (this directly corresponds to the documented behavior of MSVC's __assume(0)), so is llvm.invariant(undef). The intrinsic is tagged as writing arbitrarily, in order to maintain control dependencies. BasicAA has been updated, however, to return NoModRef for any particular location-based query so that we don't unnecessarily block code motion. llvm-svn: 213973
*	[stack protector] Fix a potential security bug in stack protector where the	Akira Hatanaka	2014-07-25	2	-6/+50
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	address of the stack guard was being spilled to the stack. Previously the address of the stack guard would get spilled to the stack if it was impossible to keep it in a register. This patch introduces a new target independent node and pseudo instruction which gets expanded post-RA to a sequence of instructions that load the stack guard value. Register allocator can now just remat the value when it can't keep it in a register. <rdar://problem/12475629> llvm-svn: 213967
*	Reapply "DebugInfo: Don't put fission type units in comdat sections."	David Blaikie	2014-07-25	3	-12/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This recommits r208930, r208933, and r208975 (by reverting r209338) and reverts r209529 (the FIXME to readd this functionality once the tools were fixed) now that DWP has been fixed to cope with a single section for all fission type units. Original commit message: "Since type units in the dwo file are handled by a debug aware tool, they don't need to leverage the ELF comdat grouping to implement deduplication. Avoid creating all the .group sections for these as a space optimization." llvm-svn: 213956
*	Recommit r212203: Don't try to construct debug LexicalScopes hierarchy for ↵	David Blaikie	2014-07-25	4	-4/+47
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	functions that do not have top level debug information. Reverted by Eric Christopher (Thanks!) in r212203 after Bob Wilson reported LTO issues. Duncan Exon Smith and Aditya Nandakumar helped provide a reduced reproduction, though the failure wasn't too hard to guess, and even easier with the example to confirm. The assertion that the subprogram metadata associated with an llvm::Function matches the scope data referenced by the DbgLocs on the instructions in that function is not valid under LTO. In LTO, a C++ inline function might exist in multiple CUs and the subprogram metadata nodes will refer to the same llvm::Function. In this case, depending on the order of the CUs, the first intance of the subprogram metadata may not be the one referenced by the instructions in that function and the assertion will fail. A test case (test/DebugInfo/cross-cu-linkonce-distinct.ll) is added, the assertion removed and a comment added to explain this situation. This was then reverted again in r213581 as it caused PR20367. The root cause of this was the early exit in LiveDebugVariables meant that spurious DBG_VALUE intrinsics that referenced dead variables were not removed, causing an assertion/crash later on. The fix is to have LiveDebugVariables strip all DBG_VALUE intrinsics in functions without debug info as they're not needed anyway. Test case added to cover this situation (that occurs when a debug-having function is inlined into a nodebug function) in test/DebugInfo/X86/nodebug_with_debug_loc.ll Original commit message: If a function isn't actually in a CU's subprogram list in the debug info metadata, ignore all the DebugLocs and don't try to build scopes, track variables, etc. While this is possibly a minor optimization, it's also a correctness fix for an incoming patch that will add assertions to LexicalScopes and the debug info verifier to ensure that all scope chains lead to debug info for the current function. Fix up a few test cases that had broken/incomplete debug info that could violate this constraint. Add a test case where this occurs by design (inlining a debug-info-having function in an attribute nodebug function - we want this to work because /if/ the nodebug function is then inlined into a debug-info-having function, it should be fine (and will work fine - we just stitch the scopes up as usual), but should the inlining not happen we need to not assert fail either). llvm-svn: 213952
*	[SDAG] Don't insert the VRBase into a mapping from SDValues when the def	Chandler Carruth	2014-07-25	1	-6/+10
\| \| \| \| \| \| \|	doesn't actually correspond to an SDValue at all. Fixes most of the remaining asserts on out-of-range SDValue result numbers. llvm-svn: 213930
*	Store nodes only have 1 result.	Matt Arsenault	2014-07-25	1	-1/+1
\| \| \| \|	llvm-svn: 213928
*	[SDAG] Start plumbing an assert into SDValues that we don't form one	Chandler Carruth	2014-07-25	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	with a result number outside the range of results for the node. I don't know how we managed to not really check this very basic invariant for so long, but the code is very broken at this point. I have over 270 test failures with the assert enabled. I'm committing it disabled so that others can join in the cleanup effort and reproduce the issues. I've also included one of the obvious fixes that I already found. More fixes to come. llvm-svn: 213926
*	[SDAG] Introduce a combined set to the DAG combiner which tracks nodes	Chandler Carruth	2014-07-24	1	-5/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	which have successfully round-tripped through the combine phase, and use this to ensure all operands to DAG nodes are visited by the combiner, even if they are only added during the combine phase. This is critical to have the combiner reach nodes that are introduced during combining. Previously these would sometimes be visited and sometimes not be visited based on whether they happened to end up on the worklist or not. Now we always run them through the combiner. This fixes quite a few bad codegen test cases lurking in the suite while also being more principled. Among these, the TLS codegeneration is particularly exciting for programs that have this in the critical path like TSan-instrumented binaries (although I think they engineer to use a different TLS that is faster anyways). I've tried to check for compile-time regressions here by running llc over a merged (but not LTO-ed) clang bitcode file and observed at most a 3% slowdown in llc. Given that this is essentially a worst case (none of opt or clang are running at this phase) I think this is tolerable. The actual LTO case should be even less costly, and the cost in normal compilation should be negligible. With this combining logic, it is possible to re-legalize as we combine which is necessary to implement PSHUFB formation on x86 as a post-legalize DAG combine (my ultimate goal). Differential Revision: http://reviews.llvm.org/D4638 llvm-svn: 213898
*	[x86] Make vector legalization of extloads work more like the "normal"	Chandler Carruth	2014-07-24	1	-5/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	vector operation legalization with support for custom target lowering and fallback to expand when it fails, and use this to implement sext and anyext load lowering for x86 in a more principled way. Previously, the x86 backend relied on a target DAG combine to "combine away" sextload and extload nodes prior to legalization, or would expand them during legalization with terrible code. This is particularly problematic because the DAG combine relies on running over non-canonical DAG nodes at just the right time to match several common and important patterns. It used a combine rather than lowering because we didn't have good lowering support, and to expose some tricks being employed to more combine phases. With this change it becomes a proper lowering operation, the backend marks that it can lower these nodes, and I've added support for handling the canonical forms that don't have direct legal representations such as sextload of a v4i8 -> v4i64 on AVX1. With this change, our test cases for this behavior continue to pass even after the DAG combiner beigns running more systematically over every node. There is some noise caused by this in the test suite where we actually use vector extends instead of subregister extraction. This doesn't really seem like the right thing to do, but is unlikely to be a critical regression. We do regress in one case where by lowering to the target-specific patterns early we were able to combine away extraneous legal math nodes. However, this regression is completely addressed by switching to a widening based legalization which is what I'm working toward anyways, so I've just switched the test to that mode. Differential Revision: http://reviews.llvm.org/D4654 llvm-svn: 213897
*	[X86] Optimize stackmap shadows on X86.	Lang Hames	2014-07-24	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch minimizes the number of nops that must be emitted on X86 to satisfy stackmap shadow constraints. To minimize the number of nops inserted, the X86AsmPrinter now records the size of the most recent stackmap's shadow in the StackMapShadowTracker class, and tracks the number of instruction bytes emitted since the that stackmap instruction was encountered. Padding is emitted (if it is required at all) immediately before the next stackmap/patchpoint instruction, or at the end of the basic block. This optimization should reduce code-size and improve performance for people using the llvm stackmap intrinsic on X86. <rdar://problem/14959522> llvm-svn: 213892
*	Add scoped-noalias metadata	Hal Finkel	2014-07-24	2	-0/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This commit adds scoped noalias metadata. The primary motivations for this feature are: 1. To preserve noalias function attribute information when inlining 2. To provide the ability to model block-scope C99 restrict pointers Neither of these two abilities are added here, only the necessary infrastructure. In fact, there should be no change to existing functionality, only the addition of new features. The logic that converts noalias function parameters into this metadata during inlining will come in a follow-up commit. What is added here is the ability to generally specify noalias memory-access sets. Regarding the metadata, alias-analysis scopes are defined similar to TBAA nodes: !scope0 = metadata !{ metadata !"scope of foo()" } !scope1 = metadata !{ metadata !"scope 1", metadata !scope0 } !scope2 = metadata !{ metadata !"scope 2", metadata !scope0 } !scope3 = metadata !{ metadata !"scope 2.1", metadata !scope2 } !scope4 = metadata !{ metadata !"scope 2.2", metadata !scope2 } Loads and stores can be tagged with an alias-analysis scope, and also, with a noalias tag for a specific scope: ... = load %ptr1, !alias.scope !{ !scope1 } ... = load %ptr2, !alias.scope !{ !scope1, !scope2 }, !noalias !{ !scope1 } When evaluating an aliasing query, if one of the instructions is associated with an alias.scope id that is identical to the noalias scope associated with the other instruction, or is a descendant (in the scope hierarchy) of the noalias scope associated with the other instruction, then the two memory accesses are assumed not to alias. Note that is the first element of the scope metadata is a string, then it can be combined accross functions and translation units. The string can be replaced by a self-reference to create globally unqiue scope identifiers. [Note: This overview is slightly stylized, since the metadata nodes really need to just be numbers (!0 instead of !scope0), and the scope lists are also global unnamed metadata.] Existing noalias metadata in a callee is "cloned" for use by the inlined code. This is necessary because the aliasing scopes are unique to each call site (because of possible control dependencies on the aliasing properties). For example, consider a function: foo(noalias a, noalias b) { a = b; } that gets inlined into bar() { ... if (...) foo(a1, b1); ... if (...) foo(a2, b2); } -- now just because we know that a1 does not alias with b1 at the first call site, and a2 does not alias with b2 at the second call site, we cannot let inlining these functons have the metadata imply that a1 does not alias with b2. llvm-svn: 213864
*	AA metadata refactoring (introduce AAMDNodes)	Hal Finkel	2014-07-24	13	-120/+126
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In order to enable the preservation of noalias function parameter information after inlining, and the representation of block-level __restrict__ pointer information (etc.), additional kinds of aliasing metadata will be introduced. This metadata needs to be carried around in AliasAnalysis::Location objects (and MMOs at the SDAG level), and so we need to generalize the current scheme (which is hard-coded to just one TBAA MDNode). This commit introduces only the necessary refactoring to allow for the introduction of other aliasing metadata types, but does not actually introduce any (that will come in a follow-up commit). What it does introduce is a new AAMDNodes structure to hold all of the aliasing metadata nodes associated with a particular memory-accessing instruction, and uses that structure instead of the raw MDNode in AliasAnalysis::Location, etc. No functionality change intended. llvm-svn: 213859
*	Fix indenting.	Eric Christopher	2014-07-23	1	-13/+14
\| \| \| \|	llvm-svn: 213811
*	Reorganize and simplify local variables.	Eric Christopher	2014-07-23	1	-13/+11
\| \| \| \|	llvm-svn: 213809
*	Remove the query for TargetMachine and TargetInstrInfo since we're	Eric Christopher	2014-07-23	1	-3/+1
\| \| \| \| \| \|	already inside TargetInstrInfo. llvm-svn: 213806
*	DAG: fp->int conversion for non-splat constants.	Jim Grosbach	2014-07-23	1	-12/+11
\| \| \| \| \| \| \| \| \| \|	Constant fold the lanes of the input constant build_vector individually so we correctly handle when the vector elements are not all the same constant value. PR20394 llvm-svn: 213798
*	[AArch64] Lower sdiv x, pow2 using add + select + shift.	Chad Rosier	2014-07-23	1	-3/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The target-independent DAGcombiner will generate: asr w1, X, #31 w1 = splat sign bit. add X, X, w1, lsr #28 X = X + 0 or pow2-1 asr w0, X, asr #4 w0 = X/pow2 However, the add + shifts is expensive, so generate: add w0, X, 15 w0 = X + pow2-1 cmp X, wzr X - 0 csel X, w0, X, lt X = (X < 0) ? X + pow2-1 : X; asr w0, X, asr 4 w0 = X/pow2 llvm-svn: 213758
*	Enable partial libcall inlining for all targets by default.	James Molloy	2014-07-23	1	-0/+5
\| \| \| \| \| \| \| \|	This pass attempts to speculatively use a sqrt instruction if one exists on the target, falling back to a libcall if the target instruction returned NaN. This was enabled for MIPS and System-Z, but is well guarded and is good for most targets - GCC does this for (that I've checked) X86, ARM and AArch64. llvm-svn: 213752
*	[SDAG] Make the DAGCombine worklist not grow endlessly due to duplicate	Chandler Carruth	2014-07-23	1	-52/+71
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	insertions. The old behavior could cause arbitrarily bad memory usage in the DAG combiner if there was heavy traffic of adding nodes already on the worklist to it. This commit switches the DAG combine worklist to work the same way as the instcombine worklist where we null-out removed entries and only add new entries to the worklist. My measurements of codegen time shows slight improvement. The memory utilization is unsurprisingly dominated by other factors (the IR and DAG itself I suspect). This change results in subtle, frustrating churn in the particular order in which DAG combines are applied which causes a number of minor regressions where we fail to match a pattern previously matched by accident. AFAICT, all of these should be using AddToWorklist to directly or should be written in a less brittle way. None of the changes seem drastically bad, and a few of the changes seem distinctly better. A major change required to make this work is to significantly harden the way in which the DAG combiner handle nodes which become dead (zero-uses). Previously, we relied on the ability to "priority-bump" them on the combine worklist to achieve recursive deletion of these nodes and ensure that the frontier of remaining live nodes all were added to the worklist. Instead, I've introduced a routine to just implement that precise logic with no indirection. It is a significantly simpler operation than that of the combiner worklist proper. I suspect this will also fix some other problems with the combiner. I think the x86 changes are really minor and uninteresting, but the avx512 change at least is hiding a "regression" (despite the test case being just noise, not testing some performance invariant) that might be looked into. Not sure if any of the others impact specific "important" code paths, but they didn't look terribly interesting to me, or the changes were really minor. The consensus in review is to fix any regressions that show up after the fact here. Thanks to the other reviewers for checking the output on other architectures. There is a specific regression on ARM that Tim already has a fix prepped to commit. Differential Revision: http://reviews.llvm.org/D4616 llvm-svn: 213727
*	[SDAG] Refactor the code for inserting a newly allocated SDNode into the	Chandler Carruth	2014-07-22	1	-96/+86
\| \| \| \| \| \| \| \| \| \| \|	DAG into a helper function. This adds a trip through the (very minimal) verification logic in a bunch of places that were missing it, but shouldn't have any other impact outside of refactoring. I'm hoping to use this to do more clever things when DAG nodes are inserted into the graph. llvm-svn: 213612
*	[SDAG] Remove a giant pile of asserts that may have helped track down	Chandler Carruth	2014-07-22	1	-40/+3
\| \| \| \| \| \| \| \| \| \| \|	a bug in 2010 when they were added but are adding no value today. In fact, they are utter lies. NodeAllocator is used to allocate almost all of these node types. I don't know what we were trying to assert here, and the docs don't give any answer. Until we once again stumble upon a bug needing help, let's clear the path for improvements. llvm-svn: 213610
*	Revert "Recommit r212203: Don't try to construct debug LexicalScopes ↵	David Blaikie	2014-07-21	4	-33/+4
\| \| \| \| \| \| \| \|	hierarchy for functions that do not have top level debug information." This reverts commit r212649 while I investigate/reduce/etc PR20367. llvm-svn: 213581