bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[SimplifyCFG] threshold for folding branches with common destination	Jingyue Wu	2014-09-30	1	-7/+15
	Summary:
	This patch adds a threshold that controls the number of bonus instructions allowed for folding branches with a common destination. The original code allows at most one bonus instruction. With this patch, users can customize the threshold to allow multiple bonus instructions. The default threshold is still 1, so the code behaves the same as before when users do not specify this threshold.
	The motivation of this change is that tuning this threshold significantly (up to 25%) improves the performance of some CUDA programs in our internal code base. In general, branch instructions are very expensive for GPU programs. Therefore, it is sometimes worth trading more arithmetic computation for straighter control flow. Here's a reduced example:
	  __global__ void foo(int a, int b, int c, int d, int e, int n,
	                      const int *input, int *output) {
	    int sum = 0;
	    for (int i = 0; i < n; ++i)
	      sum += (((i ^ a) > b) && (((i | c) ^ d) > e)) ? 0 : input[i];
	    *output = sum;
	  }
	The select statement in the loop body translates to two branch instructions "if ((i ^ a) > b)" and "if (((i | c) ^ d) > e)" which share a common destination. With the default threshold, SimplifyCFG is unable to fold them, because computing the condition of the second branch "((i | c) ^ d) > e" requires two bonus instructions. With the threshold increased, SimplifyCFG can fold the two branches so that the loop body contains only one branch, making the code conceptually look like:
	  sum += (((i ^ a) > b) & (((i | c) ^ d) > e)) ? 0 : input[i];
	Increasing the threshold significantly improves the performance of this particular example. In the configuration where both conditions are guaranteed to be true, increasing the threshold from 1 to 2 improves the performance by 18.24%. Even in the configuration where the first condition is false and the second condition is true, which favors shortcuts, increasing the threshold from 1 to 2 still improves the performance by 4.35%.
	We are still looking for a good threshold and maybe a better cost model than just counting the number of bonus instructions. However, according to the above numbers, we think it is at least worth adding a threshold to enable more experiments and tuning. Let me know what you think. Thanks!
	Test Plan: Added one test case to check the threshold is in effect
	Reviewers: nadav, eliben, meheff, resistor, hfinkel
	Reviewed By: hfinkel
	Subscribers: hfinkel, llvm-commits
	Differential Revision: http://reviews.llvm.org/D5529
	llvm-svn: 218711
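The trade-off described in this commit message can be illustrated on the host with a small C++ sketch. It is not the SimplifyCFG implementation and not the CUDA kernel itself; the function names `sumShortCircuit` and `sumFolded` are hypothetical, chosen only for this illustration. The first form keeps the second comparison behind its own branch, while the second form evaluates both comparisons unconditionally (the `|` and `^` are the two "bonus" instructions) and combines them with a bitwise `&`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Short-circuit form: the second comparison is only evaluated when the
// first one is true, which conceptually needs two branches.
int sumShortCircuit(int a, int b, int c, int d, int e,
                    const std::vector<int> &input) {
  assert(input.size() <= 2147483647u); // indices must fit in an int
  int sum = 0;
  for (std::size_t k = 0; k < input.size(); ++k) {
    const int i = static_cast<int>(k);
    sum += (((i ^ a) > b) && (((i | c) ^ d) > e)) ? 0 : input[k];
  }
  return sum;
}

// Folded form: both comparisons are always evaluated and combined with a
// bitwise AND, so a single branch (or select) remains.
int sumFolded(int a, int b, int c, int d, int e,
              const std::vector<int> &input) {
  assert(input.size() <= 2147483647u); // indices must fit in an int
  int sum = 0;
  for (std::size_t k = 0; k < input.size(); ++k) {
    const int i = static_cast<int>(k);
    const bool first = (i ^ a) > b;
    const bool second = ((i | c) ^ d) > e; // two bonus instructions: |, ^
    sum += (first & second) ? 0 : input[k];
  }
  return sum;
}
```

Both functions return the same value for any arguments, because neither comparison has side effects; only the amount of work per iteration and the number of branches differ.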
*	Make use of @llvm.assume in ValueTracking (computeKnownBits, etc.)	Hal Finkel	2014-09-07	1	-4/+9
	This change, which allows @llvm.assume to be used from within computeKnownBits (and other associated functions in ValueTracking), adds some (optional) parameters to computeKnownBits and friends. These functions now (optionally) take a "context" instruction pointer, an AssumptionTracker pointer, and also a DomTree pointer, and most of the changes are just to pass this new information when it is easily available from InstSimplify, InstCombine, etc.
	As explained below, the significant conceptual change is that known properties of a value might depend on the control-flow location of the use (we care that the @llvm.assume dominates the use, because assumptions have control-flow dependencies). This means that, when we ask if bits are known in a value, we might get different answers for different uses.
	The significant changes are all in ValueTracking. Two main changes: First, as with the rest of the code, new parameters need to be passed around. To make this easier, I grouped them into a structure, and I made internal static versions of the relevant functions that take this structure as a parameter. The new code does what you might expect: it looks for @llvm.assume calls that make use of the value we're trying to learn something about (often indirectly), attempts to pattern match that expression, and uses the result if successful. By making use of the AssumptionTracker, the process of finding @llvm.assume calls is not expensive.
	Part of the structure being passed around inside ValueTracking is a set of already-considered @llvm.assume calls. This is to prevent a query using, for example, the assume(a == b), to recurse on itself. The context and DT params are used to find applicable assumptions. An assumption needs to dominate the context instruction, or come after it deterministically. In this latter case we only handle the specific case where both the assumption and the context instruction are in the same block, and we need to exclude assumptions from being used to simplify their own ephemeral values (those which contribute only to the assumption) because otherwise the assumption would prove its feeding comparison trivial and would be removed.
	This commit adds the plumbing and the logic for a simple masked-bit propagation (just enough to write a regression test). Future commits add more patterns (and, correspondingly, more regression tests).
	llvm-svn: 217342
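As a rough illustration of what "masked-bit propagation" from an assumption can mean, here is a toy C++ model. It is only a sketch under stated assumptions: `KnownBits`, `MaskedAssume`, and `applyMaskedAssume` are hypothetical names for this example and are not the ValueTracking API, and the dominance check is reduced to a boolean supplied by the caller. The modeled fact has the form assume((v & mask) == value).

```cpp
#include <cassert>
#include <cstdint>

// Toy record of which bits of a 32-bit value are known to be 0 or 1.
struct KnownBits {
  std::uint32_t zero = 0; // bits known to be 0
  std::uint32_t one = 0;  // bits known to be 1
};

// Hypothetical model of a fact of the form: assume((v & mask) == value).
struct MaskedAssume {
  std::uint32_t mask;
  std::uint32_t value;
};

// Merge the assumption into what is already known about v. The fact is
// only usable when it applies at the query's context (for example, when
// the assume dominates the use), modeled here by a plain flag.
KnownBits applyMaskedAssume(KnownBits known, const MaskedAssume &a,
                            bool appliesAtContext) {
  assert((a.value & ~a.mask) == 0); // value must not set bits outside mask
  if (!appliesAtContext)
    return known; // same value, different use: nothing learned here
  known.one |= a.mask & a.value;   // masked bits that must be 1
  known.zero |= a.mask & ~a.value; // masked bits that must be 0
  return known;
}
```

Calling the function twice for the same value, once with the flag set and once without, gives different answers, which mirrors the point that known bits can depend on where the value is used.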
*	Remove 'using std::error_code' from lib.	Rafael Espindola	2014-06-13	1	-1/+0
	llvm-svn: 210871
*	Don't use 'using std::error_code' in include/llvm.	Rafael Espindola	2014-06-12	1	-0/+1
	This should make sure that most new uses use the std prefix. llvm-svn: 210835
*	[C++] Use 'nullptr'. Transforms edition.	Craig Topper	2014-04-25	1	-5/+5
	llvm-svn: 207196
*	[Modules] Fix potential ODR violations by sinking the DEBUG_TYPE	Chandler Carruth	2014-04-22	1	-1/+2
	definition below all of the header #include lines, lib/Transforms/... edition. This one is tricky for two reasons. We again have a couple of passes that define something else before the includes as well. I've sunk their name macros with the DEBUG_TYPE. Also, InstCombine contains headers that need DEBUG_TYPE, so now those headers #define and #undef DEBUG_TYPE around their code, leaving them well formed modular headers. Fixing these headers was a large motivation for all of these changes, as "leaky" macros of this form are hard on the modules implementation. llvm-svn: 206844
*	[C++11] Add 'override' keyword to virtual methods that override their base ↵	Craig Topper	2014-03-05	1	-2/+2
	class. llvm-svn: 202953
*	[Modules] Move CFG.h to the IR library as it defines graph traits over	Chandler Carruth	2014-03-04	1	-1/+1
	IR types. llvm-svn: 202827
*	Make DataLayout a plain object, not a pass.	Rafael Espindola	2014-02-25	1	-1/+2
	Instead, have a DataLayoutPass that holds one. This will allow parts of LLVM that don't handle passes to also use DataLayout. llvm-svn: 202168
*	Rename a few more DataLayout variables.	Rafael Espindola	2014-02-21	1	-5/+5
	llvm-svn: 201833
*	Disable most IR-level transform passes on functions marked 'optnone'.	Paul Robinson	2014-02-06	1	-0/+3
	Ideally only those transform passes that run at -O0 remain enabled; in reality we get as close as we reasonably can. Passes are responsible for disabling themselves; it's not the job of the pass manager to do it for them. llvm-svn: 200892
*	Reapply r188119 now that the bug it exposed is fixed.	Peter Collingbourne	2013-08-12	1	-160/+5
	llvm-svn: 188217
*	Revert r188119 "Kill some duplicated code for removing unreachable BBs."	Arnold Schwaighofer	2013-08-10	1	-5/+160
	It is breaking buildbots with libgmalloc enabled on Mac OS X.
	  $ cd llvm ; mkdir release ; cd release
	  $ ../configure --enable-optimized --prefix=$PWD/install
	  $ make
	  $ make check
	  $ Release+Asserts/bin/llvm-lit -v --param use_gmalloc=1 --param \
	      gmalloc_path=/usr/lib/libgmalloc.dylib \
	      ../test/Instrumentation/DataFlowSanitizer/args-unreachable-bb.ll
	llvm-svn: 188142
*	Kill some duplicated code for removing unreachable BBs.	Peter Collingbourne	2013-08-09	1	-160/+5
	This moves removeUnreachableBlocksFromFn from SimplifyCFGPass.cpp to Utils/Local.cpp and uses it to replace the implementation of llvm::removeUnreachableBlocks, which appears to do a strict subset of what removeUnreachableBlocksFromFn does. Differential Revision: http://llvm-reviews.chandlerc.com/D1334 llvm-svn: 188119
*	Factor FlattenCFG out from SimplifyCFG	Tom Stellard	2013-08-06	1	-51/+13
	Patch by: Mei Ye llvm-svn: 187764
*	SimplifyCFG: Use parallel-and and parallel-or mode to consolidate branch ↵	Tom Stellard	2013-07-27	1	-22/+59
	conditions Merge consecutive if-regions if they contain identical statements. Both transformations reduce the number of branches. The transformation is guarded by a target-hook, and is currently enabled only for +R600, but the correctness has been tested on the X86 target using a variety of CPU benchmarks. Patch by: Mei Ye llvm-svn: 187278
*	Make SimplifyCFG simply depend upon TargetTransformInfo and pass it	Chandler Carruth	2013-01-07	1	-9/+15
	through as a reference rather than a pointer. There is always some implementation of this available, so this simplifies code by not having to test for whether it is available or not. Further, it turns out there were piles of places where SimplifyCFG was recursing and not passing down either TD or TTI. These are fixed to be more pedantically consistent even though I don't have any particular cases where it would matter. llvm-svn: 171691
*	Move TargetTransformInfo to live under the Analysis library. This no	Chandler Carruth	2013-01-07	1	-1/+1
	longer would violate any dependency layering and it is in fact an analysis. =] llvm-svn: 171686
*	Move all of the header files which are involved in modelling the LLVM IR	Chandler Carruth	2013-01-02	1	-6/+6
	into their new header subdirectory: include/llvm/IR. This matches the directory structure of lib, and begins to correct a long standing point of file layout clutter in LLVM. There are still more header files to move here, but I wanted to handle them in separate commits to make tracking what files make sense at each layer easier. The only really questionable files here are the target intrinsic tablegen files. But that's a battle I'd rather not fight today. I've updated both CMake and Makefile build systems (I think, and my tests think, but I may have missed something). I've also re-sorted the includes throughout the project. I'll be committing updates to Clang, DragonEgg, and Polly momentarily. llvm-svn: 171366
*	Optimize tree walking in markAliveBlocks.	Evgeniy Stepanov	2012-12-17	1	-4/+3
	Check whether a BB is known as reachable before adding it to the worklist. This way BBs with multiple predecessors are added to the list no more than once. llvm-svn: 170335
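The worklist change described in this commit message can be sketched in standalone C++ as follows. This is a generic reachability walk under stated assumptions, not the body of markAliveBlocks: blocks are plain integer indices, `succs` is a hypothetical successor list, and the `pushes` counter exists only to make the "no more than once" property observable.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct ReachResult {
  std::vector<bool> reachable; // reachable[b] is true if block b is alive
  std::size_t pushes;          // how many times any block was queued
};

// Mark every block reachable from `entry`. A successor is marked as
// reachable *before* it is pushed, so a block with several predecessors
// is queued at most once.
ReachResult markAlive(const std::vector<std::vector<int>> &succs, int entry) {
  assert(entry >= 0 && static_cast<std::size_t>(entry) < succs.size());
  ReachResult r{std::vector<bool>(succs.size(), false), 0};
  std::vector<int> worklist;
  r.reachable[entry] = true;
  worklist.push_back(entry);
  ++r.pushes;
  while (!worklist.empty()) {
    const int bb = worklist.back();
    worklist.pop_back();
    for (int s : succs[bb]) {
      if (!r.reachable[s]) { // check first, then queue
        r.reachable[s] = true;
        worklist.push_back(s);
        ++r.pushes;
      }
    }
  }
  return r;
}
```

On a diamond-shaped graph (0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3), block 3 has two predecessors but is queued only once, so the number of pushes equals the number of reachable blocks.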
*	Use the new script to sort the includes of every file under lib.	Chandler Carruth	2012-12-03	1	-7/+7
	Sooooo many of these had incorrect or strange main module includes. I have manually inspected all of these, and fixed the main module include to be the nearest plausible thing I could find. If you own or care about any of these source files, I encourage you to take some time and check that these edits were sensible. I can't have broken anything (I strictly added headers, and reordered them, never removed), but they may not be the headers you'd really like to identify as containing the API being implemented. Many forward declarations and missing includes were added to header files to allow them to parse cleanly when included first. The main module rule does in fact have its merits. =] llvm-svn: 169131
*	Use TargetTransformInfo to control switch-to-lookup table transformation	Hans Wennborg	2012-10-30	1	-4/+8
	When the switch-to-lookup tables transform landed in SimplifyCFG, it was pointed out that this could be inappropriate for some targets. Since there was no way at the time for the pass to know anything about the target, an awkward reverse-transform was added in CodeGenPrepare that turned lookup tables back into switches for some targets. This patch uses the new TargetTransformInfo to determine if a switch should be transformed, and removes CodeGenPrepare::ConvertLoadToSwitch. llvm-svn: 167011
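For readers unfamiliar with the switch-to-lookup-table transform this commit message refers to, the following C++ sketch shows the two equivalent source-level shapes. It is an illustration only: the functions `viaSwitch`, `viaTable`, and `checkEquivalent` and the table contents are hypothetical, and the real transform operates on LLVM IR, with the target consulted to decide whether the table form is appropriate.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Branchy form: one case per input value, plus a default.
int viaSwitch(int x) {
  switch (x) {
  case 0: return 10;
  case 1: return 20;
  case 2: return 15;
  case 3: return 7;
  default: return -1;
  }
}

// Table form: a range check followed by a single indexed load.
int viaTable(int x) {
  static constexpr std::array<int, 4> table = {10, 20, 15, 7};
  if (x < 0 || x >= static_cast<int>(table.size()))
    return -1; // same result as the switch's default case
  return table[static_cast<std::size_t>(x)];
}

// Check that both forms agree on every input in [lo, hi].
void checkEquivalent(int lo, int hi) {
  for (int x = lo; x <= hi; ++x)
    assert(viaSwitch(x) == viaTable(x));
}
```

Calling `checkEquivalent(-2, 6)` exercises every case and both out-of-range sides. Which form is cheaper depends on the target, which is why the decision is delegated to target information rather than hard-coded in the pass.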
*	Move TargetData to DataLayout.	Micah Villmow	2012-10-08	1	-3/+3
	llvm-svn: 165402
*	Update function names to conform to guidelines.	Jim Grosbach	2012-09-06	1	-26/+26
	No functional change. llvm-svn: 163279
*	Clean whitespaces.	Nadav Rotem	2012-07-24	1	-27/+27
	llvm-svn: 160668
*	fix PR13339 (remove the predecessor from the unwind BB when removing an invoke)	Nuno Lopes	2012-07-16	1	-0/+1
	llvm-svn: 160325
*	fix the regression I introduced in r159385 (it's necessary to update PHI ↵	Nuno Lopes	2012-07-02	1	-0/+3
	nodes in unwind BB) llvm-svn: 159534
*	make simplifyCFG erase invokes to readonly/readnone functions	Nuno Lopes	2012-06-28	1	-6/+7
	llvm-svn: 159385
*	improve optimization of invoke instructions:	Nuno Lopes	2012-06-25	1	-2/+7
	- simplifycfg: invoke undef/null -> unreachable
	- instcombine: invoke new -> invoke expect(0, 0) (an arbitrary NOOP intrinsic; only done if the allocated memory is unused, of course)
	- verifier: allow invoke of intrinsics (to make the previous step work)
	llvm-svn: 159146
*	Convert CallInst and InvokeInst APIs to use ArrayRef.	Jay Foad	2011-07-15	1	-2/+1
	llvm-svn: 135265
*	Preserve line number information while converting Invoke into a Call.	Devang Patel	2011-06-02	1	-0/+1
	llvm-svn: 132505
*	Add a parameter to ConstantFoldTerminator() that callers can use to ask it ↵	Frits van Bommel	2011-05-22	1	-1/+1
	to also clean up the condition of any conditional terminator it folds to be unconditional, if that turns the condition into dead code. This just means it calls RecursivelyDeleteTriviallyDeadInstructions() in strategic spots. It defaults to the old behavior. I also changed -simplifycfg, -jump-threading and -codegenprepare to use this to produce slightly better code without any extra cleanup passes (AFAICT this was the only place in -simplifycfg where now-dead conditions of replaced terminators weren't being cleaned up). The only other user of this function is -sccp, but I didn't read that thoroughly enough to figure out whether it might be holding pointers to instructions that could be deleted by this. llvm-svn: 131855
*	Simplify cfg inserts a call to trap when unreachable code is detected. ↵	Devang Patel	2011-04-27	1	-1/+2
	Assign DebugLoc to this new trap instruction. llvm-svn: 130315
*	Remove PHINode::reserveOperandSpace(). Instead, add a parameter to	Jay Foad	2011-03-30	1	-2/+2
	PHINode::Create() giving the (known or expected) number of operands. llvm-svn: 128537
*	(Almost) always call reserveOperandSpace() on newly created PHINodes.	Jay Foad	2011-03-30	1	-2/+3
	llvm-svn: 128535
*	Get rid of static constructors for pass registration. Instead, every pass ↵	Owen Anderson	2010-10-19	1	-1/+3
	exposes an initializeMyPassFunction(), which must be called in the pass's constructor. This function uses static dependency declarations to recursively initialize the pass's dependencies. Clients that only create passes through the createFooPass() APIs will require no changes. Clients that want to use the CommandLine options for passes will need to manually call the appropriate initialization functions in PassInitialization.h before parsing commandline arguments. I have tested this with all standard configurations of clang and llvm-gcc on Darwin. It is possible that there are problems with the static dependencies that will only be visible with non-standard options. If you encounter any crash in pass registration/creation, please send the testcase to me directly. llvm-svn: 116820
*	Now with fewer extraneous semicolons!	Owen Anderson	2010-10-07	1	-1/+1
	llvm-svn: 115996
*	Teach SimplifyCFG how to simplify indirectbr instructions.	Dan Gohman	2010-08-14	1	-3/+2
	- Eliminate redundant successors.
	- Convert an indirectbr with one successor into a direct branch.
	Also, generalize SimplifyCFG to be able to be run on a function entry block. It knows quite a few simplifications which are applicable to the entry block, and it only needs a few checks to avoid trouble with the entry block.
	llvm-svn: 111060
*	Reapply r110396, with fixes to appease the Linux buildbot gods.	Owen Anderson	2010-08-06	1	-1/+1
	llvm-svn: 110460
*	Revert r110396 to fix buildbots.	Owen Anderson	2010-08-06	1	-1/+1
	llvm-svn: 110410
*	Don't use PassInfo* as a type identifier for passes. Instead, use the ↵	Owen Anderson	2010-08-05	1	-1/+1
	address of the static ID member as the sole unique type identifier. Clean up APIs related to this change. llvm-svn: 110396
*	Fix batch of converting RegisterPass<> to INITIALIZE_PASS().	Owen Anderson	2010-07-21	1	-1/+2
	llvm-svn: 109045
*	SimplifyCFG: don't turn volatile stores to null/undef into unreachable. ↵	Benjamin Kramer	2010-06-13	1	-0/+3
	Fixes PR7369. llvm-svn: 105914
*	make simplifycfg insert an llvm.trap before the 'unreachable' it introduces	Chris Lattner	2010-05-08	1	-3/+11
	when it detects undefined behavior. llvm.trap generally codegens into something really small (e.g. a 2 byte ud2 instruction on x86) and debugging this sort of thing is "nontrivial". For example, we now compile:
	  void foo() { *(int*)0 = 42; }
	into:
	  _foo:
	    pushl %ebp
	    movl %esp, %ebp
	    ud2
	Some may even claim that this is a security hole, though that seems dubious to me. This addresses rdar://7958343 - Optimizing away null dereference potentially allows arbitrary code execution
	llvm-svn: 103356
*	Finally land the InvokeInst operand reordering.	Gabor Greif	2010-03-24	1	-1/+1
	I have audited all getOperandNo calls now, fixing hidden assumptions. CallSite related ugliness will be eliminated successively. Note this patch has a long and grievous history; for all the back-and-forths, have a look at CallSite.h's log. llvm-svn: 99399
*	backing out r99170 because it still fails on clang-x86_64-darwin10-fnt	Gabor Greif	2010-03-22	1	-1/+1
	llvm-svn: 99171
*	Now that hopefully all direct accesses to InvokeInst operands are fixed	Gabor Greif	2010-03-22	1	-1/+1
	we can reapply the InvokeInst operand reordering patch. (see r98957). llvm-svn: 99170
*	back out r98957, it broke ↵	Gabor Greif	2010-03-19	1	-1/+1
	http://smooshlab.apple.com:8010/builders/clang-x86_64-darwin10-fnt/builds/703 in the nightly test suite llvm-svn: 98958
*	Recommit r80858 again (which has been backed out in r80871).	Gabor Greif	2010-03-19	1	-1/+1
	This time I did a self-hosted bootstrap on Linux x86-64, with no problems. Let's see how darwin 64-bit self-hosting goes. At the first sign of failure I'll back this out. Maybe the valgrind bots give me a hint of what may be wrong (if at all). llvm-svn: 98957
*	In "empty" bb, the return instruction may not be first instruction, if dbg ↵	Devang Patel	2010-03-15	1	-1/+1
	value intrinsics are present in this bb. Use terminator to find return instructions. llvm-svn: 98565