bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	Exploit dereferenceable_or_null attribute in LICM pass	Sanjoy Das	2015-05-18	1	-0/+90
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Allow hoisting of loads from values marked with dereferenceable_or_null attribute. For values marked with the attribute perform context-sensitive analysis to determine whether it's known-non-null or not. Patch by Artur Pilipenko! Reviewers: hfinkel, sanjoy, reames Reviewed By: reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9253 llvm-svn: 237593
*	Preserve the order of READ_REGISTER and WRITE_REGISTER	Hal Finkel	2015-05-18	1	-0/+34
\| \| \| \| \| \| \| \| \| \| \| \| \|	At the present time, we don't have a way to represent general dependency relationships, so everything is represented using memory dependency. In order to preserve the data dependency of a READ_REGISTER on WRITE_REGISTER, we need to model WRITE_REGISTER as writing (which we had been doing) and model READ_REGISTER as reading (which we had not been doing). Fix this, and also the way that the chain operands were generated at the SDAG level. Patch by Nicholas Paul Johnson, thanks! Test case by me. llvm-svn: 237584
*	[LoopAccesses] Rearrange printed lines in -analyze	Adam Nemet	2015-05-18	1	-1/+0
\| \| \| \| \| \| \|	"Store to invariant address..." is moved as the last line. This is not the prime result of the analysis. Plus it simplifies some of the tests. llvm-svn: 237573
*	Reapply r237520 with another fix for infinite looping	James Molloy	2015-05-17	1	-0/+99
\| \| \| \| \| \| \| \| \|	SimplifyDemandedBits was "simplifying" a constant by removing just sign bits. This caused a canonicalization race between different parts of instcombine. Fix and regression test added - third time lucky? llvm-svn: 237539
*	Revert commits r237521 and r237520.	James Molloy	2015-05-16	1	-84/+0
\| \| \| \| \| \| \| \|	The AArch64 LNT bot is unhappy - I've found that the problem is in SimpliftDemandedBits, but that's going to require another code review so reverting in the meantime. llvm-svn: 237528
*	Update to r237520 - swap order of CHECK-NEXT lines.	James Molloy	2015-05-16	1	-1/+1
\| \| \| \| \| \| \|	... I'd copied the check-next lines from a previous test so they were slightly wrong, and had managed to test the wrong source tree. D'oh! llvm-svn: 237521
*	Reapply r237453 with a fix for the test timeouts.	James Molloy	2015-05-16	1	-0/+84
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The test timeouts were due to instcombine fighting itself. Regression test added. Original log message: Canonicalize min/max expressions correctly. This patch introduces a canonical form for min/max idioms where one operand is extended or truncated. This often happens when the other operand is a constant. For example: %1 = icmp slt i32 %a, i32 0 %2 = sext i32 %a to i64 %3 = select i1 %1, i64 %2, i64 0 Would now be canonicalized into: %1 = icmp slt i32 %a, i32 0 %2 = select i1 %1, i32 %a, i32 0 %3 = sext i32 %2 to i64 This builds upon a patch posted by David Majenemer (https://www.marc.info/?l=llvm-commits&m=143008038714141&w=2). That pass passively stopped instcombine from ruining canonical patterns. This patch additionally actively makes instcombine canonicalize too. Canonicalization of expressions involving a change in type from int->fp or fp->int are not yet implemented. llvm-svn: 237520
*	[MemCpyOpt] Turn memcpy from just-memset'd source into memset.	Ahmed Bougacha	2015-05-16	3	-2/+104
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There's no point in copying around constants, so, when all else fails, we can still transform memcpy of memset into two independent memsets. To quote the example, we can turn: memset(dst1, c, dst1_size); memcpy(dst2, dst1, dst2_size); into: memset(dst1, c, dst1_size); memset(dst2, c, dst2_size); When dst2_size <= dst1_size. Like r235232 for copy constructors, this can occur in move constructors. Differential Revision: http://reviews.llvm.org/D9682 llvm-svn: 237506
*	Remove dead code in testcase. NFC.	Ahmed Bougacha	2015-05-16	1	-4/+0
\| \| \| \|	llvm-svn: 237501
*	Add a speculative execution pass	Jingyue Wu	2015-05-15	1	-0/+195
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is a pass for speculative execution of instructions for simple if-then (triangle) control flow. It's aimed at GPUs, but could perhaps be used in other contexts. Enabling this pass gives us a 1.0% geomean improvement on Google benchmark suites, with one benchmark improving 33%. Credit goes to Jingyue Wu for writing an earlier version of this pass. Patched by Bjarke Roune. Test Plan: This patch adds a set of tests in test/Transforms/SpeculativeExecution/spec.ll The pass is controlled by a flag which defaults to having the pass not run. Reviewers: eliben, dberlin, meheff, jingyue, hfinkel Reviewed By: jingyue, hfinkel Subscribers: majnemer, jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D9360 llvm-svn: 237459
*	Revert "Canonicalize min/max expressions correctly."	James Molloy	2015-05-15	1	-73/+0
\| \| \| \| \| \| \|	This reverts r237453 - it was causing timeouts on some bots. Reverting while I investigate (it's probably InstCombine fighting itself...) llvm-svn: 237458
*	[SLSR] handle (B \| i) * S	Jingyue Wu	2015-05-15	1	-0/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Consider (B \| i) * S as (B + i) * S if B and i have no bits set in common. Test Plan: @or in slsr-mul.ll Reviewers: broune, meheff Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9788 llvm-svn: 237456
*	Canonicalize min/max expressions correctly.	James Molloy	2015-05-15	1	-0/+73
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch introduces a canonical form for min/max idioms where one operand is extended or truncated. This often happens when the other operand is a constant. For example: %1 = icmp slt i32 %a, i32 0 %2 = sext i32 %a to i64 %3 = select i1 %1, i64 %2, i64 0 Would now be canonicalized into: %1 = icmp slt i32 %a, i32 0 %2 = select i1 %1, i32 %a, i32 0 %3 = sext i32 %2 to i64 This builds upon a patch posted by David Majenemer (https://www.marc.info/?l=llvm-commits&m=143008038714141&w=2). That pass passively stopped instcombine from ruining canonical patterns. This patch additionally actively makes instcombine canonicalize too. Canonicalization of expressions involving a change in type from int->fp or fp->int are not yet implemented. llvm-svn: 237453
*	NFC - Test case invokes llc on a file rather than redirected from a file.	Nemanja Ivanovic	2015-05-15	1	-1/+1
\| \| \| \| \| \| \| \|	This has caused some local failures. Updating the test case to be more like the majority of the similar test cases. Committing on behalf of Hubert Tong (hstong@ca.ibm.com). llvm-svn: 237449
*	[PlaceSafepoints] Fix a bug that came in with rL236672.	Sanjoy Das	2015-05-15	1	-0/+42
\| \| \| \| \| \| \| \|	Transfer the calling convention from the invoke being replaced by PlaceStatepoints to the new invoke to gc.statepoint created. Add a test case that would have caught this issue. llvm-svn: 237414
*	[PlaceSafepoints] Fix a bug that came in with rL236672.	Sanjoy Das	2015-05-15	1	-0/+42
\| \| \| \| \| \| \| \| \|	rL236672 would generate all invoke statepoints with deopt args set to a list containing the single element "0", instead of an empty list. Also add a test case that would have caught this. llvm-svn: 237413
*	[ValueTracking] refactor: extract method haveNoCommonBitsSet	Jingyue Wu	2015-05-14	1	-0/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Extract method haveNoCommonBitsSet so that we don't have to duplicate this logic in InstCombine and SeparateConstOffsetFromGEP. This patch also makes SeparateConstOffsetFromGEP more precise by passing DominatorTree to computeKnownBits. Test Plan: value-tracking-domtree.ll that tests ValueTracking indeed leverages dominating conditions Reviewers: broune, meheff, majnemer Reviewed By: majnemer Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D9734 llvm-svn: 237407
*	Add another InstCombine pass after LoopUnroll.	Wei Mi	2015-05-14	1	-0/+85
\| \| \| \| \| \| \| \|	This is to cleanup some redundency generated by LoopUnroll pass. Such redundency may not be cleaned up by existing passes after LoopUnroll. Differential Revision: http://reviews.llvm.org/D9777 llvm-svn: 237395
*	[ConstantFolding] Fix wrong folding of intrinsic 'convert.from.fp16'.	Andrea Di Biagio	2015-05-14	1	-0/+97
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Function 'ConstantFoldScalarCall' (in ConstantFolding.cpp) works under the wrong assumption that a call to 'convert.from.fp16' returns a value of type 'float'. However, intrinsic 'convert.from.fp16' can be overloaded; for example, we can call 'convert.from.fp16.f64' to convert from half to double; etc. Before this patch, the following example would have triggered an assertion failure in opt (with -constprop): ``` define double @foo() { entry: %0 = call double @llvm.convert.from.fp16.f64(i16 0) ret double %0 } ``` This patch fixes the problem in ConstantFolding.cpp. When folding a call to convert.from.fp16, we perform a different kind of conversion based on the call return type. Added test 'Transform/ConstProp/convert-from-fp16.ll'. Differential Revision: http://reviews.llvm.org/D9771 llvm-svn: 237377
*	New Loop Distribution pass	Adam Nemet	2015-05-14	6	-0/+484
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This implements the initial version as was proposed earlier this year (http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-January/080462.html). Since then Loop Access Analysis was split out from the Loop Vectorizer and was made into a separate analysis pass. Loop Distribution becomes the second user of this analysis. The pass is off by default and can be enabled with -enable-loop-distribution. There is currently no notion of profitability; if there is a loop with dependence cycles, the pass will try to split them off from other memory operations into a separate loop. I decided to remove the control-dependence calculation from this first version. This and the issues with the PDT are actively discussed so it probably makes sense to treat it separately. Right now I just mark all terminator instruction required which keeps identical CFGs for each distributed loop. This seems to be working pretty well for 456.hmmer where even though there is an empty if-then block in the distributed loop initially, it gets completely removed. The pass keeps DominatorTree and LoopInfo updated. I've tested this with -loop-distribute-verify with the testsuite where we distribute ~90 loops. SimplifyLoop is violated in some cases and I have a FIXME covering this. Reviewers: hfinkel, nadav, aschwaighofer Reviewed By: aschwaighofer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8831 llvm-svn: 237358
*	[PlaceSafepoints] New attributes for patchable statepoints.	Sanjoy Das	2015-05-13	1	-0/+44
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch teaches the PlaceSafepoints pass about two `CallSite` function attributes: * "statepoint-id": if the string value of this attribute can be parsed as an integer, then it is propagated to the ID parameter of the statepoint created. * "statepoint-num-patch-bytes": if the string value of this attribute can be parsed as an integer, then it is propagated to the `num patch bytes` parameter of the statepoint created. This change intentionally does not assert on a malformed value for these attributes, given that they're not "official" attributes. Reviewers: reames, pgavlin Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9735 llvm-svn: 237286
*	[NaryReassociate] avoid running forever	Jingyue Wu	2015-05-13	1	-0/+11
\| \| \| \| \| \| \| \| \|	Avoid running forever by checking we are not reassociating an expression into the same form. Tested with @avoid_infinite_loops in nary-add.ll llvm-svn: 237269
*	Add function entry counts from sample profiles.	Diego Novillo	2015-05-13	2	-0/+27
\| \| \| \| \| \| \| \| \| \|	This patch uses the new function profile metadata "function_entry_count" to annotate entry counts from sample profiles. In a sampling profile, the total samples collected at the function entry are an approximation for the number of times that function was invoked. llvm-svn: 237265
*	[SLSR] handles non-canonicalized Mul candidates	Jingyue Wu	2015-05-13	1	-0/+21
\| \| \| \| \| \| \| \|	such as (2 + B) * S. Tested by @non_canonicalized in slsr-mul.ll llvm-svn: 237216
*	[Statepoints] Support for "patchable" statepoints.	Sanjoy Das	2015-05-12	29	-137/+130
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This change adds two new parameters to the statepoint intrinsic, `i64 id` and `i32 num_patch_bytes`. `id` gets propagated to the ID field in the generated StackMap section. If the `num_patch_bytes` is non-zero then the statepoint is lowered to `num_patch_bytes` bytes of nops instead of a call (the spill and reload code remains unchanged). A non-zero `num_patch_bytes` is useful in situations where a language runtime requires complete control over how a call is lowered. This change brings statepoints one step closer to patchpoints. With some additional work (that is not part of this patch) it should be possible to get rid of `TargetOpcode::STATEPOINT` altogether. PlaceSafepoints generates `statepoint` wrappers with `id` set to `0xABCDEF00` (the old default value for the ID reported in the stackmap) and `num_patch_bytes` set to `0`. This can be made more sophisticated later. Reviewers: reames, pgavlin, swaroop.sridhar, AndyAyers Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9546 llvm-svn: 237214
*	CVP: Improve handling of Selects used as incoming PHI values	Bjorn Steinbrink	2015-05-12	1	-0/+53
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: If the branch that leads to the PHI node and the Select instruction depend on correlated conditions, we might be able to directly use the corresponding value from the Select instruction as the incoming value for the PHI node, allowing later removal of the select instruction. Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9051 llvm-svn: 237201
*	[RewriteStatepointsForGC] Extend base pointer to handle more cases w/vectors	Philip Reames	2015-05-12	1	-0/+25
\| \| \| \| \| \| \| \| \| \|	When relocating a pointer, we need to determine a base pointer for the derived pointer being relocated. We have limited support for handling a pointer extracted from a vector; the current code only handled the case where the entire vector was known to contain base pointers. This patch extends the reasoning to handle chains of insertelements where the indices are constants. This case turns out to be fairly common in vectorized code. We can now handle vectors which contains mixtures of base and derived pointers provided the insertelements use constant indices. Note that this doesn't solve the general problem. To handle variable indexed insertelements, we'd need to scalarize and introduce conditional branching based on the index. Alternatively, we could eagerly scalarize, but the code structure doesn't currently make either fix easy. The patch also doesn't handle shufflevector or other vector manipulation for much the same reasons. I plan to defer this work until I have a motivating test case. Differential Revision: http://reviews.llvm.org/D9676 llvm-svn: 237200
*	MergeFunctions: Two different sized allocas are not the same	Arnold Schwaighofer	2015-05-12	1	-0/+33
\| \| \| \|	llvm-svn: 237193
*	[PlaceSafepoints] Remove dependence on LoopSimplify	Philip Reames	2015-05-12	1	-4/+4
\| \| \| \| \| \| \| \|	As a step towards getting rid of internal pass manager hack entirely, remove the need for loop simplify to run in the inner pass manager. The new code does produce slightly different loop structures, so this isn't technically NFC. Differential Revision: http://reviews.llvm.org/D9585 llvm-svn: 237172
*	Reimplement heuristic for estimating complete-unroll optimization effects.	Michael Zolotukhin	2015-05-12	2	-2/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch reimplements heuristic that tries to estimate optimization beneftis from complete loop unrolling. In this patch I kept the minimal changes - e.g. I removed code handling branches and folding compares. That's a promising area, but now there are too many questions to discuss before we can enable it. Test Plan: Tests are included in the patch. Reviewers: hfinkel, chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8816 llvm-svn: 237156
*	Changed renaming of local symbols by inserting a dot vefore the numeric suffix.	Sunil Srivastava	2015-05-12	14	-78/+78
\| \| \| \| \| \| \|	One code change and several test changes to match that details in http://reviews.llvm.org/D9481 llvm-svn: 237150
*	[MemCpyOpt] Look at any dependency -not just source- for memset+memcpy.	Ahmed Bougacha	2015-05-11	1	-0/+18
\| \| \| \| \| \| \| \| \| \|	This fixes another miscompile introduced by r235232: when there was a dependency on the memcpy destination other than the memset, we would ignore it, because we only looked at the source dependency. It was a mistake to use SrcDepInfo. Instead, just use DepInfo. llvm-svn: 237066
*	[RewriteStatepointsForGC] Fix a bug on creating gc_relocate for pointer to ↵	Sanjoy Das	2015-05-11	8	-28/+69
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	vector of pointers Summary: In RewriteStatepointsForGC pass, we create a gc_relocate intrinsic for each relocated pointer, and the gc_relocate has the same type with the pointer. During the creation of gc_relocate intrinsic, llvm requires to mangle its type. However, llvm does not support mangling of all possible types. RewriteStatepointsForGC will hit an assertion failure when it tries to create a gc_relocate for pointer to vector of pointers because mangling for vector of pointers is not supported. This patch changes the way RewriteStatepointsForGC pass creates gc_relocate. For each relocated pointer, we erase the type of pointers and create an unified gc_relocate of type i8 addrspace(1)*. Then a bitcast is inserted to convert the gc_relocate to the correct type. In this way, gc_relocate does not need to deal with different types of pointers and the unsupported type mangling is no longer a problem. This change would also ease further merge when LLVM erases types of pointers and introduces an unified pointer type. Some minor changes are also introduced to gc_relocate related part in InstCombineCalls, CodeGenPrepare, and Verifier accordingly. Patch by Chen Li! Reviewers: reames, AndyAyers, sanjoy Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9592 llvm-svn: 237009
*	[InstCombine/PowerPC] Fix single-precision QPX load/store replacement	Hal Finkel	2015-05-11	1	-0/+3
\| \| \| \| \| \| \| \| \| \|	The QPX single-precision load/store intrinsics have implied truncation/extension from/to the declared value type of <4 x double> to the memory type of <4 x float>. When we can prove the alignment of the pointer argument, and thus replace the intrinsic with a regular load or store, we need to load or store the correct data type (<4 x float>) instead of (<4 x double>). llvm-svn: 236973
*	Make buildbots happy	David Majnemer	2015-05-11	1	-1/+1
\| \| \| \|	llvm-svn: 236970
*	[InstCombine] Canonicalize single element array store	David Majnemer	2015-05-11	1	-0/+20
\| \| \| \| \| \| \| \|	Use the element type instead of the aggregate type. Differential Revision: http://reviews.llvm.org/D9591 llvm-svn: 236969
*	[InstCombine] Canonicalize single element array load	David Majnemer	2015-05-11	1	-0/+25
\| \| \| \| \| \| \| \|	Use the element type instead of the aggregate type. Differential Revision: http://reviews.llvm.org/D9596 llvm-svn: 236968
*	Extend the statepoint intrinsic to allow statepoints to be marked as ↵	Pat Gavlin	2015-05-08	23	-91/+91
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	transitions from GC-aware code to code that is not GC-aware. This changes the shape of the statepoint intrinsic from: @llvm.experimental.gc.statepoint(anyptr target, i32 # call args, i32 unused, ...call args, i32 # deopt args, ...deopt args, ...gc args) to: @llvm.experimental.gc.statepoint(anyptr target, i32 # call args, i32 flags, ...call args, i32 # transition args, ...transition args, i32 # deopt args, ...deopt args, ...gc args) This extension offers the backend the opportunity to insert (somewhat) arbitrary code to manage the transition from GC-aware code to code that is not GC-aware and back. In order to support the injection of transition code, this extension wraps the STATEPOINT ISD node generated by the usual lowering lowering with two additional nodes: GC_TRANSITION_START and GC_TRANSITION_END. The transition arguments that were passed passed to the intrinsic (if any) are lowered and provided as operands to these nodes and may be used by the backend during code generation. Eventually, the lowering of the GC_TRANSITION_{START,END} nodes should be informed by the GC strategy in use for the function containing the intrinsic call; for now, these nodes are instead replaced with no-ops. Differential Revision: http://reviews.llvm.org/D9501 llvm-svn: 236888
*	[NoTTI] reject negative scale in addressing mode	Jingyue Wu	2015-05-08	1	-0/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: I noticed this bug when deubging a WIP on LSR. I wonder whether and how we should add a regression test for this. Test Plan: no tests failed. Reviewers: atrick Subscribers: hfinkel, llvm-commits Differential Revision: http://reviews.llvm.org/D9536 llvm-svn: 236887
*	Populate list of vectorizable functions for Accelerate library.	Michael Zolotukhin	2015-05-07	1	-0/+450
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch adds majority of supported by Accelerate library functions to the list of vectorizable functions. The full list of available vector functions could be found here: https://developer.apple.com/library/mac/documentation/Performance/Conceptual/vecLib/index.html Test Plan: Unit tests are added. Reviewers: hfinkel, aschwaighofer, nadav Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9543 llvm-svn: 236747
*	Update InstCombine to transform aggregate loads into scalar loads.	Mehdi Amini	2015-05-07	1	-2/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: One step further getting aggregate loads and store being optimized properly. This will only handle struct with one element at this point. Test Plan: Added unit tests for the new supported cases. Reviewers: chandlerc, joker-eph, joker.eph, majnemer Reviewed By: majnemer Subscribers: pete, llvm-commits Differential Revision: http://reviews.llvm.org/D8339 Patch by Amaury Sechet. From: Amaury Sechet <amaury@fb.com> llvm-svn: 236695
*	[JumpThreading] Simplify comparisons when simplifying branches	Philip Reames	2015-05-07	1	-0/+69
\| \| \| \| \| \| \| \| \| \|	If we have recognized that a conditional is constant at a particular location in the code (while trying to decide if we can simplify a conditional branch), we can eagerly replace that condition with a constant if it's definition is post dominated by the branch in question. In practice, this ends up being a compile time savings at most. JumpThreading would have visited each using branch anyways. CVP would have visited the cmp itself again. Unless LVI gives up early, we shouldn't gain any addition power by doing this transformation early. What we do gain is simplicity and compile time. Differential Revision: http://reviews.llvm.org/D9312 llvm-svn: 236684
*	Let llc and opt override "-target-cpu" and "-target-features" via command line	Akira Hatanaka	2015-05-06	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \|	options. This commit fixes a bug in llc and opt where "-mcpu" and "-mattr" wouldn't override function attributes "-target-cpu" and "-target-features" in the IR. Differential Revision: http://reviews.llvm.org/D9537 llvm-svn: 236677
*	[IRBuilder] Add a CreateGCStatepointInvoke.	Sanjoy Das	2015-05-06	1	-5/+6
\| \| \| \| \| \| \| \| \| \| \|	Renames the original CreateGCStatepoint to CreateGCStatepointCall, and moves invoke creating functionality from PlaceSafepoints.cpp to IRBuilder.cpp. This changes the labels generated for PlaceSafepoints/invokes.ll so use a regex there to make the basic block labels more resilient. llvm-svn: 236672
*	Allow 0-weight branches in BranchProbabilityInfo.	Diego Novillo	2015-05-06	3	-14/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: When computing branch weights in BPI, we used to disallow branches with weight 0. This is a minor nuisance, because a branch with weight 0 is different to "don't have information". In the context of instrumentation, it may mean "never executed", in the context of sampling, it means "never or seldom executed". In allowing 0 weight branches, I ran into issues with the switch expansion code in selection DAG. It is currently hardwired to not handle branches with weight 0. To maintain the current behaviour, I changed it to use 1 when it finds 0, but perhaps the algorithm needs changes to tolerate branches with weight zero. Reviewers: hansw Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9533 llvm-svn: 236617
*	[X86] Disable loop unrolling in loop vectorization pass when VF is 1.	Wei Mi	2015-05-06	2	-1/+40
\| \| \| \| \| \| \| \| \| \| \| \| \|	The patch disabled unrolling in loop vectorization pass when VF==1 on x86 architecture, by setting MaxInterleaveFactor to 1. Unrolling in loop vectorization pass may introduce the cost of overflow check, memory boundary check and extra prologue/epilogue code when regular unroller will unroll the loop another time. Disable it when VF==1 remove the unnecessary cost on x86. The same can be done for other platforms after verifying interleaving/memory bound checking to be not perf critical on those platforms. Differential Revision: http://reviews.llvm.org/D9515 llvm-svn: 236613
*	[Inliner] Discard empty COMDAT groups	David Majnemer	2015-05-05	1	-0/+15
\| \| \| \| \| \| \| \| \|	COMDAT groups which have become rendered unused because of inline are discardable if we can prove that we've made the group empty. This fixes PR22285. llvm-svn: 236539
*	Update BasicAliasAnalysis to understand that nothing aliases with undef values.	Daniel Berlin	2015-05-05	1	-0/+15
\| \| \| \| \| \| \| \| \| \|	It got this in some cases (if one of them was an identified object), but not in all cases. This caused stores to undef to block load-forwarding in some cases, etc. Added test to Transforms/GVN to verify optimization occurs as expected. llvm-svn: 236511
*	InstCombineSimplifyDemanded: Remove nsw/nuw flags when optimizing demanded bits	Matthias Braun	2015-04-30	1	-0/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When optimizing demanded bits of the operands of an Add we have to remove the nsw/nuw flags as we have no guarantee anymore that we don't wrap. This is legal here because the top bit is not demanded. In fact this operaion was already performed but missed in the case of an Add with a constant on the right side. To fix this this patch refactors the code to unify the code paths in SimplifyDemandedUseBits() handling of Add/Sub: - The transformation of Add->Or is removed from the simplify demand code because the equivalent transformation exists in InstCombiner::visitAdd() - KnownOnes/KnownZero are not adjusted for Add x, C anymore as computeKnownBits() already performs these computations. - The simplification of the operands is unified. In this new version constant on the right side of a Sub are shrunk now as I could not find a reason why not to do so. - The special case for clearing nsw/nuw in ShrinkDemandedConstant() is not necessary anymore as the caller does that already. Differential Revision: http://reviews.llvm.org/D9415 llvm-svn: 236269
*	[InstCombine] Add new rule for MIN(MAX(~A, ~B), ~C) et. al.	Sanjoy Das	2015-04-30	1	-0/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Optimizing these well are especially interesting for IRCE since it "clamps" values by generating this sort of pattern through SCEV expressions. Depends on D9352. Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9353 llvm-svn: 236203