bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	Clang format of a file introduced in 228090 (NFC)	Philip Reames	2015-02-04	1	-29/+22
	llvm-svn: 228091
*	Add a pass for inserting safepoints into (nearly) arbitrary IR	Philip Reames	2015-02-04	3	-0/+982
	This pass is responsible for figuring out where to place call safepoints and safepoint polls. It doesn't actually make the relocations explicit; that's the job of the RewriteStatepointsForGC pass (http://reviews.llvm.org/D6975).
	Note that this code is not yet finalized. It's moving in tree for incremental development, but further cleanup is needed and will happen over the next few days. It is not yet part of the standard pass order.
	Planned changes in the near future:
	- I plan on restructuring the statepoint rewrite to use the functions added to the IRBuilder a while back.
	- In the current pass, the function "gc.safepoint_poll" is treated specially but is not an intrinsic. I plan to make identifying the poll function a property of the GCStrategy at some point in the near future.
	- As follow on patches, I will be separating a collection of test cases we have out of tree and submitting them upstream.
	- It's not explicit in the code, but these two patches are introducing a new state for a statepoint which looks a lot like a patchpoint. There's now a transient form which doesn't yet have the relocations explicitly represented, but does prevent reordering of memory operations. Once this is in, I need to actually make this explicit by reserving the 'unused' argument of the statepoint as a flag, updating the docs, and making the code explicitly check for such a thing. This wasn't really planned, but once I split the two passes - which was done for other reasons - the intermediate state fell out. Just reminds us once again that we need to merge statepoints and patchpoints at some point in the not that distant future.
	Future directions planned:
	- Identifying more cases where a backedge safepoint isn't required to ensure timely execution of a safepoint poll.
	- Tweaking the insertion process to generate easier to optimize IR. (For example, investigating making SplitBackedge the default.)
	- Adding opt-in flags for a GCStrategy to use this pass. Once done, add this pass to the actual pass ordering.
	Differential Revision: http://reviews.llvm.org/D6981
	llvm-svn: 228090
*	[LV] Split off memcheck block really at the first check	Adam Nemet	2015-02-03	1	-1/+1
	I've noticed this while trying to move addRuntimeCheck to LoopAccessAnalysis. I think that the intention was to early exit from the overflow checking before the code for the memchecks. This is the entire reason why we compute FirstCheckInst but then we don't use that as the splitting instruction but the final check. Looks like an oversight.
	llvm-svn: 228056
*	Allow PRE to insert no-cost phi nodes	Daniel Berlin	2015-02-03	1	-44/+68
	llvm-svn: 228024
*	Add straight-line strength reduction to LLVM	Jingyue Wu	2015-02-03	3	-0/+276
	Summary: Straight-line strength reduction (SLSR) is implemented in GCC but not yet in LLVM. It has proven to effectively simplify statements derived from an unrolled loop, and can potentially benefit many other cases too. For example, LLVM unrolls
		#pragma unroll
		for (int i = 0; i < 3; ++i) {
		  sum += foo((b + i) * s);
		}
	into
		sum += foo(b * s);
		sum += foo((b + 1) * s);
		sum += foo((b + 2) * s);
	However, no optimizations yet reduce the internal redundancy of the three expressions:
		b * s
		(b + 1) * s
		(b + 2) * s
	With SLSR, LLVM can optimize these three expressions into:
		t1 = b * s
		t2 = t1 + s
		t3 = t2 + s
	This commit is only an initial step towards implementing a series of such optimizations. I will implement more (see TODO in the file commentary) in the near future. This optimization is enabled for the NVPTX backend for now. However, I am more than happy to push it to the standard optimization pipeline after more thorough performance tests.
	Test Plan: test/StraightLineStrengthReduce/slsr.ll
	Reviewers: eliben, HaoLiu, meheff, hfinkel, jholewinski, atrick
	Reviewed By: jholewinski, atrick
	Subscribers: karthikthecool, jholewinski, llvm-commits
	Differential Revision: http://reviews.llvm.org/D7310
	llvm-svn: 228016
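The rewrite described in this message can be checked with a small standalone sketch. It is not the pass itself: `naive`, `reduced`, and `checkEquivalent` are hypothetical names introduced here, and the sketch only models the arithmetic on plain integers.

```cpp
#include <array>
#include <cassert>

// Hypothetical helper: the three unrolled expressions computed directly,
// costing one multiplication each.
std::array<long, 3> naive(long b, long s) {
  return {b * s, (b + 1) * s, (b + 2) * s};
}

// The same three values after strength reduction: a single multiplication,
// then each later value is the previous one plus s.
std::array<long, 3> reduced(long b, long s) {
  long t1 = b * s;
  long t2 = t1 + s;
  long t3 = t2 + s;
  return {t1, t2, t3};
}

// Aborts if the two forms disagree for the given inputs.
void checkEquivalent(long b, long s) { assert(naive(b, s) == reduced(b, s)); }
```

The equivalence rests on the identity (b + k) * s = b * s + k * s, so each step replaces a multiplication with an addition.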
*	[LoopVectorize] Fix rebase glitch in r227751	Adam Nemet	2015-02-03	1	-5/+3
	LoopVectorizationLegality::{getNumLoads,getNumStores} should forward to LoopAccessAnalysis now. Thanks to Takumi for noticing this!
	llvm-svn: 227992
*	Adding AArch64 support to ASan instrumentation	Renato Golin	2015-02-03	1	-0/+4
	For the time being, it is still hardcoded to support only the 39 VA bits variant. I plan to work on supporting the 42 and 48 VA bits variants, but I don't have access to such hardware at the moment. Patch by Chrystophe Lyon.
	llvm-svn: 227965
*	Resurrect initializers for NumLoads and NumStores in ↵	NAKAMURA Takumi	2015-02-03	1	-2/+3
	LoopVectorizationLegality to suppress undefined behavior. FIXME: Shall they be managed in LAA?
	llvm-svn: 227940
*	Resurrect the assertion removed by r227717	Jingyue Wu	2015-02-02	2	-9/+9
	Summary: MSVC can compile "LoopID->getOperand(0) == LoopID" when LoopID is MDNode*.
	Test Plan: no regression
	Reviewers: mkuper
	Subscribers: jholewinski, llvm-commits
	Differential Revision: http://reviews.llvm.org/D7327
	llvm-svn: 227853
*	Fix: SLPVectorizer crashes with assertion when vectorizing a cmp instruction.	Erik Eckstein	2015-02-02	1	-0/+1
	The commit r225977 uncovered this bug. The problem was that the vectorizer tried to read the second operand of an already deleted instruction. The bug didn't show up before r225977 because the freed memory still contained a non-null pointer. With r225977 deletion of instructions is delayed and the read operand pointer is always null.
	llvm-svn: 227800
*	LoopVectorize: Remove initializer list that blocks MSVC.	Benjamin Kramer	2015-02-01	1	-4/+4
	llvm-svn: 227766
*	[LoopVectorize] Move LoopAccessAnalysis to its own module	Adam Nemet	2015-02-01	1	-1216/+1
	Other than moving code and adding the boilerplate for the new files, the code being moved is unchanged.
	There are a few global functions that are shared with the rest of the LoopVectorizer. I moved these to the new module as well (emitLoopAnalysis, stripIntegerCast, replaceSymbolicStrideSCEV) along with the Report class used by emitLoopAnalysis. There is probably room for further improvement in this area.
	I kept DEBUG_TYPE "loop-vectorize" because it's used as the PassName with emitOptimizationRemarkAnalysis. This will obviously have to change.
	NFC. This is part of the patchset that splits out the memory dependence logic from LoopVectorizationLegality into a new class LoopAccessAnalysis. LoopAccessAnalysis will be used by the new Loop Distribution pass.
	llvm-svn: 227756
*	[LoopVectorize] Move RuntimePointerCheck under LoopAccessAnalysis	Adam Nemet	2015-02-01	1	-44/+49
	This class needs to remain public because it's used by LoopVectorizationLegality::addRuntimeCheck.
	NFC. This is part of the patchset that splits out the memory dependence logic from LoopVectorizationLegality into a new class LoopAccessAnalysis. LoopAccessAnalysis will be used by the new Loop Distribution pass.
	llvm-svn: 227755
*	[LoopVectorize] Pass parameters explicitly to MemoryDepChecker	Adam Nemet	2015-02-01	1	-14/+51
	Rather than using globals, use a structure to pass parameters from the vectorizer. This prepares the class to be moved outside the LoopVectorizer.
	It's not great how all this is passed through in LoopAccessAnalysis, but this is all expected to change once the class starts servicing the Loop Distribution pass as well, where some of these parameters make no sense.
	NFC. This is part of the patchset that splits out the memory dependence logic from LoopVectorizationLegality into a new class LoopAccessAnalysis. LoopAccessAnalysis will be used by the new Loop Distribution pass.
	llvm-svn: 227754
*	[LoopVectorize] Split out LoopAccessAnalysis from LoopVectorizationLegality	Adam Nemet	2015-02-01	1	-18/+85
	Move the canVectorizeMemory functionality from LoopVectorizationLegality to a new class LoopAccessAnalysis and forward users.
	Currently the collection of the symbolic stride information is kept with LoopVectorizationLegality and it becomes an input to LoopAccessAnalysis.
	NFC. This is part of the patchset that splits out the memory dependence logic from LoopVectorizationLegality into a new class LoopAccessAnalysis. LoopAccessAnalysis will be used by the new Loop Distribution pass.
	llvm-svn: 227751
*	[LoopVectorize] Add accessors for Num{Stores,Loads,PredStores} in AccessAnalysis	Adam Nemet	2015-02-01	1	-7/+18
	These members are moving to LoopAccessAnalysis. The accessors help to hide this.
	NFC. This is part of the patchset that splits out the memory dependence logic from LoopVectorizationLegality into a new class LoopAccessAnalysis. LoopAccessAnalysis will be used by the new Loop Distribution pass.
	llvm-svn: 227750
*	[LoopVectorize] Rename the Report class to VectorizationReport	Adam Nemet	2015-02-01	1	-50/+72
	This class will become public in the new LoopAccessAnalysis header so the name needs to be more global.
	NFC. This is part of the patchset that splits out the memory dependence logic from LoopVectorizationLegality into a new class LoopAccessAnalysis. LoopAccessAnalysis will be used by the new Loop Distribution pass.
	llvm-svn: 227749
*	[LoopVectorize] Factor out duplicated code into Report::emitAnalysis	Adam Nemet	2015-02-01	1	-10/+16
	The logic in emitAnalysis is duplicated across multiple functions. This splits it into a function. Another use will be added by the patchset.
	NFC. This is part of the patchset that splits out the memory dependence logic from LoopVectorizationLegality into a new class LoopAccessAnalysis. LoopAccessAnalysis will be used by the new Loop Distribution pass.
	llvm-svn: 227748
*	[LoopVectorize] Split out RuntimePointerCheck from LoopVectorizationLegality	Adam Nemet	2015-02-01	1	-46/+47
	RuntimePointerCheck will be used through LoopAccessAnalysis in LoopVectorizationLegality. Later in the patchset it will become a local class of LoopAccessAnalysis.
	NFC. This is part of the patchset that splits out the memory dependence logic from LoopVectorizationLegality into a new class LoopAccessAnalysis. LoopAccessAnalysis will be used by the new Loop Distribution pass.
	llvm-svn: 227747
*	[multiversion] Kill FunctionTargetTransformInfo, TTI itself is now	Chandler Carruth	2015-02-01	1	-8/+3
	per-function and supports the exact desired interface.
	llvm-svn: 227743
*	EarlyCSE: Replace custom hash mixing with Hashing.h	Benjamin Kramer	2015-02-01	1	-14/+4
	Brings it in line with the other hashes in EarlyCSE.
	llvm-svn: 227733
*	[multiversion] Thread a function argument through all the callers of the	Chandler Carruth	2015-02-01	15	-25/+33
	getTTI method used to get an actual TTI object.
	No functionality changed. This just threads the argument and ensures code like the inliner can correctly look up the callee's TTI rather than using a fixed one.
	The next change will use this to implement per-function subtarget usage by TTI. The changes after that should eliminate the need for FTTI as that will have become the default.
	llvm-svn: 227730
*	[PM] Port SimplifyCFG to the new pass manager.	Chandler Carruth	2015-02-01	1	-44/+65
	This should be sufficient to replace the initial (minor) function pass pipeline in Clang with the new pass manager. I'll probably add an (off by default) flag to do that just to ensure we can get extra testing.
	llvm-svn: 227726
*	[PM] Port EarlyCSE to the new pass manager.	Chandler Carruth	2015-02-01	1	-1/+23
	I've added RUN lines both to the basic test for EarlyCSE and the target-specific test, as this serves as a nice test that the TTI layer in the new pass manager is in fact working well.
	llvm-svn: 227725
*	Removed assert that doesn't typecheck and breaks debug MSVC build.	Michael Kuperstein	2015-02-01	1	-1/+0
	llvm-svn: 227717
*	[SeparateConstOffsetFromGEP] skip optnone functions	Jingyue Wu	2015-02-01	1	-0/+3
	llvm-svn: 227705
*	[SeparateConstOffsetFromGEP] set PreservesCFG flag	Jingyue Wu	2015-02-01	1	-0/+1
	SeparateConstOffsetFromGEP does not change the shape of the control flow graph.
	llvm-svn: 227704
*	[NVPTX] Emit .pragma "nounroll" for loops marked with nounroll	Jingyue Wu	2015-02-01	2	-22/+28
	Summary: The CUDA driver can unroll loops when jit-compiling PTX. To prevent the CUDA driver from unrolling a loop marked with llvm.loop.unroll.disable, we need to emit .pragma "nounroll" at the header of that loop.
	This patch also extracts getting unroll metadata from loop ID metadata into a shared helper function.
	Test Plan: test/CodeGen/NVPTX/nounroll.ll
	Reviewers: eliben, meheff, jholewinski
	Reviewed By: jholewinski
	Subscribers: jholewinski, llvm-commits
	Differential Revision: http://reviews.llvm.org/D7041
	llvm-svn: 227703
*	Fix PR22393. When recursively replacing an aggregate with a smaller	Adrian Prantl	2015-02-01	1	-3/+12
	aggregate or scalar, the debug info needs to refer to the absolute offset (relative to the entire variable) instead of storing the offset inside the smaller aggregate.
	llvm-svn: 227702
*	[asan][mips] Fix MIPS64 Asan mapping	Kumar Sukhani	2015-01-31	1	-1/+1
	llvm-svn: 227684
*	[PM] Change the core design of the TTI analysis to use a polymorphic	Chandler Carruth	2015-01-31	15	-46/+57
	type erased interface and a single analysis pass rather than an extremely complex analysis group.
	The end result is that the TTI analysis can contain a type erased implementation that supports the polymorphic TTI interface. We can build one from a target-specific implementation or from a dummy one in the IR.
	I've also factored all of the code into "mix-in"-able base classes, including CRTP base classes to facilitate calling back up to the most specialized form when delegating horizontally across the surface. These aren't as clean as I would like and I'm planning to work on cleaning some of this up, but I wanted to start by putting it into the right form.
	There are a number of reasons for this change, and this particular design. The first and foremost reason is that an analysis group is complete overkill, and the chaining delegation strategy was so opaque, confusing, and high overhead that TTI was suffering greatly for it. Several of the TTI functions had failed to be implemented in all places because the chaining-based delegation left no checking of this. A few other functions were implemented with incorrect delegation. The message to me was very clear working on this -- the delegation and analysis group structure was too confusing to be useful here.
	The other reason of course is that this is a much more natural fit for the new pass manager. This will lay the ground work for a type-erased per-function info object that can look up the correct subtarget and even cache it.
	Yet another benefit is that this will significantly simplify the interaction of the pass managers and the TargetMachine. See the future work below.
	The downside of this change is that it is very, very verbose. I'm going to work to improve that, but it is somewhat an implementation necessity in C++ to do type erasure. =/ I discussed this design really extensively with Eric and Hal prior to going down this path, and afterward showed them the result. No one was really thrilled with it, but there doesn't seem to be a substantially better alternative. Using a base class and virtual method dispatch would make the code much shorter, but as discussed in the update to the programmer's manual and elsewhere, a polymorphic interface feels like the more principled approach even if this is perhaps the least compelling example of it. ;]
	Ultimately, there is still a lot more to be done here, but this was the huge chunk that I couldn't really split things out of because this was the interface change to TTI. I've tried to minimize all the other parts of this. The follow up work should include at least:
	1) Improving the TargetMachine interface by having it directly return a TTI object. Because we have a non-pass object with value semantics and an internal type erasure mechanism, we can narrow the interface of the TargetMachine to just do what we need: build and return a TTI object that we can then insert into the pass pipeline.
	2) Make the TTI object be fully specialized for a particular function. This will include splitting off a minimal form of it which is sufficient for the inliner and the old pass manager.
	3) Add a new pass manager analysis which produces TTI objects from the target machine for each function. This may actually be done as part of #2 in order to use the new analysis to implement #2.
	4) Work on narrowing the API between TTI and the targets so that it is easier to understand and less verbose to type erase.
	5) Work on narrowing the API between TTI and its clients so that it is easier to understand and less verbose to forward.
	6) Try to improve the CRTP-based delegation. I feel like this code is just a bit messy and exacerbating the complexity of implementing the TTI in each target.
	Many thanks to Eric and Hal for their help here. I ended up blocked on this somewhat more abruptly than I expected, and so I appreciate getting it sorted out very quickly.
	Differential Revision: http://reviews.llvm.org/D7293
	llvm-svn: 227669
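For readers unfamiliar with the idiom, the following is a minimal sketch of a type-erased value wrapper in the general style this message describes. It is not the TTI code: `CostInfo`, `getAddCost`, `DummyImpl`, `FancyTargetImpl`, and `costOfTwoAdds` are hypothetical names invented for illustration, and the real interface is far larger.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical wrapper with value semantics. Clients see one concrete type;
// the implementation type is erased behind a private virtual interface.
class CostInfo {
  struct Concept {
    virtual ~Concept() = default;
    virtual int getAddCost() const = 0;
  };

  // Adapts any type T that provides getAddCost() to the Concept interface.
  template <typename T> struct Model final : Concept {
    explicit Model(T Impl) : Impl(std::move(Impl)) {}
    int getAddCost() const override { return Impl.getAddCost(); }
    T Impl;
  };

  std::unique_ptr<Concept> Ptr;

public:
  // Build from any implementation; T need not inherit from anything.
  template <typename T>
  explicit CostInfo(T Impl) : Ptr(new Model<T>(std::move(Impl))) {}

  int getAddCost() const {
    assert(Ptr && "used a moved-from CostInfo");
    return Ptr->getAddCost();
  }
};

// Two unrelated implementations: a default one and a target-specific one.
struct DummyImpl { int getAddCost() const { return 1; } };
struct FancyTargetImpl { int getAddCost() const { return 4; } };

// A client that only knows about the erased wrapper.
int costOfTwoAdds(const CostInfo &Info) { return 2 * Info.getAddCost(); }
```

The verbosity the message complains about is visible even here: every interface method has to be written three times (in `Concept`, in `Model`, and on the wrapper itself).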
*	Silence "not all paths return a value" warning in MSVC	Reid Kleckner	2015-01-30	1	-0/+1
	llvm-svn: 227614
*	Remove a redundant dyn_cast.	Adrian Prantl	2015-01-30	1	-3/+2
	llvm-svn: 227605
*	Inliner: Use replaceDbgDeclareForAlloca() instead of splicing the	Adrian Prantl	2015-01-30	3	-18/+20
	instruction and generalize it to optionally dereference the variable. Follow-up to r227544.
	llvm-svn: 227604
*	[PM] Sink the population of the pass manager with target-specific	Chandler Carruth	2015-01-30	1	-7/+1
	analyses back into the LTO code generator.
	The pass manager builder (and the transforms library in general) shouldn't be referencing the target machine at all.
	This makes the LTO population work like the others -- the data layout and target transform info need to be pre-populated.
	llvm-svn: 227576
*	Fix a warning introduced by r227557 due to a default label in a fully	Chandler Carruth	2015-01-30	1	-1/+0
	covering switch.
	llvm-svn: 227575
*	[LoopVectorize] Induction variables: support arbitrary constant step.	Hao Liu	2015-01-30	1	-133/+129
	Previously, only -1 and +1 step values were supported for induction variables. This patch extends LV to support arbitrary constant steps.
	Initial patch by Alexey Volkov. Some bug fixes were added in the following version.
	Differential Revision: http://reviews.llvm.org/D6051 and http://reviews.llvm.org/D7193
	llvm-svn: 227557
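As a rough model of what a constant step means once the induction variable is widened, the sketch below compares the scalar sequence with a lane-by-lane computation. This is an assumption-laden illustration, not the vectorizer's code: `scalarIV` and `widenedIV` are hypothetical helpers, and the sketch assumes the trip count is a whole multiple of the vector width.

```cpp
#include <cassert>
#include <vector>

// Values taken by a scalar induction variable: start, start+step, ...
std::vector<long> scalarIV(long start, long step, int n) {
  std::vector<long> out;
  long iv = start;
  for (int i = 0; i < n; ++i) {
    out.push_back(iv);
    iv += step;
  }
  return out;
}

// The same values produced vf lanes at a time. Lane `lane` of the vector
// iteration that starts at scalar iteration `base` holds
// start + (base + lane) * step, for any constant step.
std::vector<long> widenedIV(long start, long step, int n, int vf) {
  assert(vf > 0 && n % vf == 0 && "sketch handles whole vectors only");
  std::vector<long> out;
  for (int base = 0; base < n; base += vf)
    for (int lane = 0; lane < vf; ++lane)
      out.push_back(start + (base + lane) * step);
  return out;
}
```

With a step of +1 or -1 the per-lane offsets are just consecutive integers; a general constant step scales those offsets.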
*	Fix PR22386. The inliner moves static allocas to the entry basic block	Adrian Prantl	2015-01-30	1	-0/+8
	so we need to move the dbg.declare intrinsics that describe them, too.
	llvm-svn: 227544
*	[LoopReroll] Alter the data structures used during reroll validation.	James Molloy	2015-01-29	1	-159/+207
	The validation algorithm used an incremental approach, building each iteration's data structures temporarily, validating them, then adding them to a global set.
	This does not scale well to having multiple sets of Root nodes, as the set of instructions used in each iteration is the union over all the root nodes. Therefore, refactor the logic to create a single, simple container to which later logic then refers. This makes it simpler control-flow wise to make the creation of the container more complex with the addition of multiple root sets.
	llvm-svn: 227499
*	[GVN] don't propagate equality comparisons of FP zero (PR22376)	Sanjay Patel	2015-01-29	1	-3/+10
	In http://reviews.llvm.org/D6911, we allowed GVN to propagate FP equalities to allow some simple value range optimizations.
	But that introduced a bug when comparing to -0.0 or 0.0: these compare equal even though they are not bitwise identical.
	This patch disallows propagating zero constants in equality comparisons.
	Fixes: http://llvm.org/bugs/show_bug.cgi?id=22376
	Differential Revision: http://reviews.llvm.org/D7257
	llvm-svn: 227491
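The hazard can be reproduced outside the compiler. The sketch below is a hand-written model of the unsound substitution, not GVN itself; `signOf` and `signAfterPropagation` are hypothetical names, and the code assumes IEEE 754 doubles compiled without fast-math options.

```cpp
#include <cassert>
#include <cmath>

// Returns +1.0 or -1.0 according to the sign bit of x, so it can tell
// 0.0 and -0.0 apart even though they compare equal.
double signOf(double x) { return std::copysign(1.0, x); }

// Models the bad transform: on the path where x == 0.0 is known to hold,
// x is replaced by the constant 0.0 from the comparison.
double signAfterPropagation(double x) {
  if (x == 0.0)
    return signOf(0.0); // x substituted by the literal constant
  return signOf(x);
}

// Aborts if substitution ever changes the observable result for x.
void checkSubstitutionIsSafe(double x) {
  assert(signOf(x) == signAfterPropagation(x));
}
```

Calling `checkSubstitutionIsSafe(-0.0)` trips the assertion: the comparison succeeds, yet the original value and the substituted constant behave differently afterwards.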
*	[LoopReroll] Refactor most of reroll() into a helper class	James Molloy	2015-01-29	1	-220/+273
	reroll() was slightly monolithic and a pain to modify. Refactor a bunch of its state from local variables to member variables of a helper class, and do some trivial simplification while we're there.
	llvm-svn: 227439
*	Teach SplitBlockPredecessors how to handle landingpad blocks.	Philip Reames	2015-01-28	4	-42/+29
	Patch by: Igor Laevsky <igor@azulsystems.com>
	"Currently SplitBlockPredecessors generates incorrect code in case if basic block we are going to split has a landingpad. Also seems like it is fairly common case among its users to conditionally call either SplitBlockPredecessors or SplitLandingPadPredecessors. Because of this I think it is reasonable to add this condition directly into SplitBlockPredecessors."
	Differential Revision: http://reviews.llvm.org/D7157
	llvm-svn: 227390
*	[LPM] Stop using the string based preservation API. It is an	Chandler Carruth	2015-01-28	3	-3/+2
	abomination.
	For starters, this API is incredibly slow. In order to look up the name of a pass it must take a memory fence to acquire a pointer to the managed static pass registry, and then potentially acquire locks while it consults this registry for information about what passes exist by that name. This stops the world of LLVMs in your process no matter how little they cared about the result.
	To make this more joyful, you'll note that we are preserving many passes which do not exist any more, or are not even analyses which one might wish to have be preserved. This means we do all the work only to say "nope" with no error to the user.
	String-based APIs are a bad idea. String-based APIs that cannot produce any meaningful error are an even worse idea. =/
	I have a patch that simply removes this API completely, but I'm hesitant to commit it as I don't really want to perniciously break out-of-tree users of the old pass manager. I'd rather they just have to migrate to the new one at some point. If others disagree and would like me to kill it with fire, just say the word. =]
	llvm-svn: 227294
*	Move EH personality type classification to Analysis/LibCallSemantics.h	Reid Kleckner	2015-01-28	1	-28/+14
	Summary: Also add enum types for __C_specific_handler and _CxxFrameHandler3 for which we know a few things.
	Reviewers: majnemer
	Subscribers: llvm-commits
	Differential Revision: http://reviews.llvm.org/D7214
	llvm-svn: 227284
*	SymbolRewriter: allow rewriting with comdats	Saleem Abdulrasool	2015-01-27	1	-0/+20
	COMDATs must be identically named to the symbol. When support for COMDATs was introduced, the symbol rewriter was not updated, resulting in rewriting failing for symbols which were placed into COMDATs. This corrects the behaviour and adds test cases for this.
	llvm-svn: 227261
*	SymbolRewriter: prevent unnecessary rewrite	Saleem Abdulrasool	2015-01-27	1	-0/+3
	The rewrite for the pattern based rewrite is unnecessary if the existing name matches the pattern.
	llvm-svn: 227260
*	[SimplifyLibCalls] Don't confuse strcpy_chk for stpcpy_chk.	Ahmed Bougacha	2015-01-27	1	-10/+14
	This was introduced in a faulty refactoring (r225640, mea culpa): the tests weren't testing the return values, so, for both __strcpy_chk and __stpcpy_chk, we would return the end of the buffer (matching stpcpy) instead of the beginning (for strcpy).
	The root cause was the prefix "__" being ignored when comparing, which made us always pick LibFunc::stpcpy_chk. Pass the LibFunc::Func directly to avoid this kind of error. Also, make the testcases as explicit as possible to prevent this.
	The now-useful testcases expose another, entangled, stpcpy problem, with the further simplification. This was introduced in a refactoring (r225640) to match the original behavior. However, this leads to problems when successive simplifications generate several similar instructions, none of which are removed by the custom replaceAllUsesWith.
	For instance, InstCombine (the main user) doesn't erase the instruction in its custom RAUW. When trying to simplify say __stpcpy_chk:
	- first, an stpcpy is created (fortified simplifier),
	- second, a memcpy is created (normal simplifier), but the stpcpy call isn't removed.
	- third, InstCombine later revisits the instructions, and simplifies the first stpcpy to a memcpy. We now have two memcpys.
	llvm-svn: 227250
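The difference in return values that the tests had missed is easy to demonstrate. Because stpcpy is a POSIX function rather than standard C++, the sketch uses a hypothetical stand-in, `copyReturningEnd`, with the same documented behaviour; `returnsDiffer` is likewise invented for illustration.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for stpcpy: copies src (including its terminator)
// into dst and returns a pointer to the terminating NUL in dst.
char *copyReturningEnd(char *dst, const char *src) {
  std::size_t n = std::strlen(src);
  std::memcpy(dst, src, n + 1);
  return dst + n;
}

// strcpy returns the start of the destination; the stpcpy-style copy
// returns the end of the copied string. Returns true when both hold.
bool returnsDiffer() {
  char a[8];
  char b[8];
  char *begin = std::strcpy(a, "abc");
  char *end = copyReturningEnd(b, "abc");
  assert(std::strcmp(a, b) == 0 && "both copies hold the same text");
  return begin == a && end == b + 3;
}
```

Both calls leave identical bytes in the buffer, which is why a test that ignores the return value cannot tell them apart.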
*	Teach IRCE to look at branch weights when recognizing range checks	Sanjoy Das	2015-01-27	1	-3/+14
	Splitting a loop to make range checks redundant is profitable only if the range check "never" fails. Make this fact a part of recognizing a range check -- a branch is a range check only if it is expected to pass (via branch_weights metadata).
	Differential Revision: http://reviews.llvm.org/D7192
	llvm-svn: 227249
*	tsan: properly instrument unaligned accesses	Dmitry Vyukov	2015-01-27	1	-1/+22
	If a memory access is unaligned, emit __tsan_unaligned_read/write callbacks instead of __tsan_read/write. Required to change semantics of __tsan_unaligned_read/write to not do the user memory. But since they were unused (other than through __sanitizer_unaligned_load/store) this is fine. Fixes long standing issue 17: https://code.google.com/p/thread-sanitizer/issues/detail?id=17
	llvm-svn: 227231
*	[InstCombine] Teach how to fold a select into a cttz/ctlz with the ↵	Andrea Di Biagio	2015-01-27	1	-0/+63
	'is_zero_undef' flag.
	This patch teaches the Instruction Combiner how to fold a cttz/ctlz followed by an icmp plus select into a single cttz/ctlz with flag 'is_zero_undef' cleared.
	Added test InstCombine/select-cmp-cttz-ctlz.ll.
	llvm-svn: 227197
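The fold can be modelled on 32-bit integers as follows. This is a sketch under assumptions, not the InstCombine code: `cttzNonZero`, `cttzTotal`, and `selectPattern` are hypothetical names, and only the cttz case with a select constant equal to the bit width is shown.

```cpp
#include <cassert>
#include <cstdint>

// Models cttz with is_zero_undef set: the result is only meaningful for
// non-zero input, which the assertion enforces in this sketch.
unsigned cttzNonZero(std::uint32_t x) {
  assert(x != 0 && "undefined for zero in this model");
  unsigned n = 0;
  while ((x & 1u) == 0) {
    x >>= 1;
    ++n;
  }
  return n;
}

// Models cttz with is_zero_undef cleared: defined for every input and
// yielding the bit width (32) when x is zero.
unsigned cttzTotal(std::uint32_t x) {
  unsigned n = 0;
  while (n < 32 && ((x >> n) & 1u) == 0)
    ++n;
  return n;
}

// The source pattern: a compare and select guard the undefined case.
unsigned selectPattern(std::uint32_t x) {
  return x == 0 ? 32u : cttzNonZero(x);
}
```

Since `selectPattern` and `cttzTotal` agree on every input, the compare and select can be dropped in favour of the single total operation.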