bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[RewriteStatepointsForGC] Delete stale comment [NFC]	Philip Reames	2015-09-02	1	-3/+0
\| \| \| \|	llvm-svn: 246722
*	[RewriteStatepointsForGC] Pull a function out of anon namespace [NFC]	Philip Reames	2015-09-02	1	-1/+5
\| \| \| \| \| \|	Thanks to David Blaikie for noticing in previous commit. llvm-svn: 246721
*	IR: Remove a redundant function. NFC	Justin Bogner	2015-09-02	1	-7/+0
\| \| \| \| \| \| \|	Function::print isn't interestingly different from Value::print. Just let the only caller (in PrintCallGraphPass) call the Value version. llvm-svn: 246720
*	[X86] Cleanup nontemporal fragments. NFCI.	Ahmed Bougacha	2015-09-02	1	-15/+6
\| \| \| \| \| \| \| \|	We can chain other fragments to avoid repeating conditions. This also fixes a potential bug (that realistically can't happen), where we would match indexed nontemporal stores for i32/i64. llvm-svn: 246719
*	[RewriteStatepointsForGC] Bugfix for change 246133	Philip Reames	2015-09-02	1	-16/+16
\| \| \| \| \| \| \| \|	Fix a bug in change 246133. I didn't handle the case where we had a cycle in the use graph and could add an instruction we were about to erase back on to the worklist. Oddly, I have not been able to write a small test case for this, even with the AssertingVH added. I have confirmed the basic theory for the fix on a large failing example, but all attempts to reduce that to something appropriate for a test case have failed. Differential Revision: http://reviews.llvm.org/D12575 llvm-svn: 246718
*	Fix release build warning for unused function	Philip Reames	2015-09-02	1	-1/+2
\| \| \| \|	llvm-svn: 246717
*	[RewriteStatepointsForGC] Improve debug output [NFC]	Philip Reames	2015-09-02	1	-30/+36
\| \| \| \|	llvm-svn: 246713
*	[PowerPC] Cleanup cost model for unaligned vector loads/stores	Hal Finkel	2015-09-02	1	-22/+37
\| \| \| \| \| \| \| \| \| \|	I'm adding a regression test to better cover code generation for unaligned vector loads and stores, but there's no functional change to the code generation here. There is an improvement to the cost model for unaligned vector loads and stores, mostly for QPX (for which we were not previously accounting for the permutation-based loads), and the cost model implementation is cleaner. llvm-svn: 246712
*	Move twice-repeated clang path operation into a new function.	Douglas Katzman	2015-09-02	1	-2/+10
\| \| \| \| \| \|	And make it more robust in the edge case of exactly "./" as input. llvm-svn: 246711
*	assuem(X) handling in GVN bugfix	Piotr Padlewski	2015-09-02	1	-1/+20
\| \| \| \| \| \| \| \| \| \|	There was infinite loop because it was trying to change assume(true) into assume(true) Also added handling when assume(false) appear http://reviews.llvm.org/D12516 llvm-svn: 246697
*	Constant propagation after hitting assume(cmp) bugfix	Piotr Padlewski	2015-09-02	3	-11/+50
\| \| \| \| \| \| \| \| \|	Last time code run into assertion `BBE.isSingleEdge()` in lib/IR/Dominators.cpp:200. http://reviews.llvm.org/D12170 llvm-svn: 246696
*	Constant propagation after hiting llvm.assume	Piotr Padlewski	2015-09-02	1	-3/+68
\| \| \| \| \| \| \| \| \| \| \|	After hitting @llvm.assume(X) we can: - propagate equality that X == true - if X is icmp/fcmp (with eq operation), and one of operand is constant we can change all variables with constants in the same BasicBlock http://reviews.llvm.org/D11918 llvm-svn: 246695
*	[RemoveDuplicatePHINodes] Start over after removing a PHI.	Benjamin Kramer	2015-09-02	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This makes RemoveDuplicatePHINodes more effective and fixes an assertion failure. Triggering the assertions requires a DenseSet reallocation so this change only contains a constructive test. I'll explain the issue with a small example. In the following function there's a duplicate PHI, %4 and %5 are identical. When this is found the DenseSet in RemoveDuplicatePHINodes contains %2, %3 and %4. define void @F() { br label %1 ; <label>:1 ; preds = %1, %0 %2 = phi i32 [ 42, %0 ], [ %4, %1 ] %3 = phi i32 [ 42, %0 ], [ %5, %1 ] %4 = phi i32 [ 42, %0 ], [ 23, %1 ] %5 = phi i32 [ 42, %0 ], [ 23, %1 ] br label %1 } after RemoveDuplicatePHINodes runs the function looks like this. %3 has changed and is now identical to %2, but RemoveDuplicatePHINodes never saw this. define void @F() { br label %1 ; <label>:1 ; preds = %1, %0 %2 = phi i32 [ 42, %0 ], [ %4, %1 ] %3 = phi i32 [ 42, %0 ], [ %4, %1 ] %4 = phi i32 [ 42, %0 ], [ 23, %1 ] br label %1 } If the DenseSet does a reallocation now it will reinsert all keys and stumble over %3 now having a different hash value than it had when inserted into the map for the first time. This change clears the set whenever a PHI is deleted and starts the progress from the beginning, allowing %3 to be deleted and avoiding inconsistent DenseSet state. This potentially has a negative performance impact because it rescans all PHIs, but I don't think that this ever makes a difference in practice. llvm-svn: 246694
*	use "unpredictable" metadata in fast-isel when splitting compares	Sanjay Patel	2015-09-02	1	-1/+4
\| \| \| \| \| \| \| \|	This patch uses the metadata defined in D12341 to avoid creating an unpredictable branch. Differential Revision: http://reviews.llvm.org/D12342 llvm-svn: 246692
*	use "unpredictable" metadata in SelectionDAG when splitting compares	Sanjay Patel	2015-09-02	1	-4/+5
\| \| \| \| \| \| \| \|	This patch uses the metadata defined in D12341 to avoid creating an unpredictable branch. Differential Revision: http://reviews.llvm.org/D12343 llvm-svn: 246691
*	[libFuzzer] honour -only_ascii=1 when reading the initial corpus. Also, ↵	Kostya Serebryany	2015-09-02	3	-5/+10
\| \| \| \| \| \|	remove ugly #ifdef llvm-svn: 246689
*	add unpredictable metadata type for control flow	Sanjay Patel	2015-09-02	2	-0/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch defines 'unpredictable' metadata. This metadata can be used to signal to the optimizer or backend that a branch or switch is unpredictable, and therefore, it's probably better to not split a compound predicate into multiple branches such as in CodeGenPrepare::splitBranchCondition(). This was discussed in: https://llvm.org/bugs/show_bug.cgi?id=23827 Dependent patches to alter codegen and expose this in clang to follow. Differential Revision; http://reviews.llvm.org/D12341 llvm-svn: 246688
*	[AArch64] More consistently separate asm opc and operands with '\t'.	Ahmed Bougacha	2015-09-02	1	-30/+30
\| \| \| \| \| \|	Somehow missed these in r246686. llvm-svn: 246687
*	[AArch64] Consistently separate asm opc and operands with '\t'.	Ahmed Bougacha	2015-09-02	1	-17/+17
\| \| \| \| \| \| \| \|	Some of the instructions use ' ', which drives OCD-me nuts. Let's put an end to this. NFC-ish: hopefully nobody cares about whitespace. llvm-svn: 246686
*	IR: Invert a condition to make it more legible. NFC	Justin Bogner	2015-09-02	1	-18/+16
\| \| \| \| \| \|	Also updates the style to more modern conventions. llvm-svn: 246681
*	[ValueTracking] Look through casts when both operands are casts.	James Molloy	2015-09-02	1	-5/+17
\| \| \| \| \| \| \| \| \| \| \|	We only looked through casts when one operand was a constant. We can also look through casts when both operands are non-constant, but both are in fact the same cast type. For example: %1 = icmp ult i8 %a, %b %2 = zext i8 %a to i32 %3 = zext i8 %b to i32 %4 = select i1 %1, i32 %2, i32 %3 llvm-svn: 246678
*	[PowerPC] Don't always consider P8Altivec-only masks in LowerVECTOR_SHUFFLE	Hal Finkel	2015-09-02	1	-6/+8
\| \| \| \| \| \| \| \| \| \| \|	LowerVECTOR_SHUFFLE needs to decide whether to pass a vector shuffle off to the TableGen-generated matching code, and it does this by testing the same predicates used by the TableGen files. Unfortunately, when we added new P8Altivec-only predicates, we started universally testing them in LowerVECTOR_SHUFFLE, and if then matched when targeting a system prior to a P8, we'd end up with a selection failure. llvm-svn: 246675
*	[x86] fix allowsMisalignedMemoryAccesses() for 8-byte and smaller accesses	Sanjay Patel	2015-09-02	1	-5/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a continuation of the fix from: http://reviews.llvm.org/D10662 and discussion in: http://reviews.llvm.org/D12154 Here, we distinguish slow unaligned SSE (128-bit) accesses from slow unaligned scalar (64-bit and under) accesses. Other lowering (eg, getOptimalMemOpType) assumes that unaligned scalar accesses are always ok, so this changes allowsMisalignedMemoryAccesses() to match that behavior. Differential Revision: http://reviews.llvm.org/D12543 llvm-svn: 246658
*	[X86][AVX512VLBW] add support in byte shift and SAD	Asaf Badouh	2015-09-02	4	-7/+83
\| \| \| \| \| \| \| \| \|	add byte shift left/right add SAD - compute sum of absolute differences Differential Revision: http://reviews.llvm.org/D12479 llvm-svn: 246654
*	[TableGen] Allow TokenTy in intrinsic signatures	Joseph Tremoulet	2015-09-02	2	-16/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Add the necessary plumbing so that llvm_token_ty can be used as an argument/return type in intrinsic definitions and correspondingly require TokenTy in function types. TokenTy is an opaque type that has no target lowering, but can be used in machine-independent intrinsics. It is required for the upcoming llvm.eh.padparam intrinsic. Reviewers: majnemer, rnk Subscribers: stoklund, llvm-commits Differential Revision: http://reviews.llvm.org/D12532 llvm-svn: 246651
*	AVX512: Implemented encoding and intrinsics for VGETMANTPD/S , VGETMANTSD/S ↵	Igor Breger	2015-09-02	5	-17/+63
\| \| \| \| \| \| \| \| \| \|	instructions Added tests for intrinsics and encoding. Differential Revision: http://reviews.llvm.org/D11593 llvm-svn: 246642
*	AVX512: Implemented encoding and intrinsics for vshufps/d.	Igor Breger	2015-09-02	3	-44/+36
\| \| \| \| \| \| \| \|	Added tests for intrinsics and encoding. Differential Revision: http://reviews.llvm.org/D11709 llvm-svn: 246640
*	[LV] Don't bail to MiddleBlock if a runtime check fails, bail to ScalarPH ↵	James Molloy	2015-09-02	1	-60/+27
\| \| \| \| \| \| \| \| \| \|	instead We were bailing to two places if our runtime checks failed. If the initial overflow check failed, we'd go to ScalarPH. If any other check failed, we'd go to MiddleBlock. This caused us to have to have an extra PHI per induction and reduction as the vector loop's exit block was not dominated by its latch. There's no need to have this behavior - if we just always go to ScalarPH we can get rid of a bunch of complexity. llvm-svn: 246637
*	[LV] Move some code around slightly to make the intent of the function more ↵	James Molloy	2015-09-02	1	-3/+1
\| \| \| \| \| \| \| \|	clear. NFC. llvm-svn: 246636
*	[LV] Cleanup: Sink an IRBuilder closer to its uses.	James Molloy	2015-09-02	1	-10/+5
\| \| \| \| \| \|	NFC. llvm-svn: 246635
*	[LV] Refactor all runtime check emissions into helper functions.	James Molloy	2015-09-02	1	-100/+126
\| \| \| \| \| \| \| \|	This reduces the complexity of createEmptyBlock() and will open the door to further refactoring. The test change is simply because we're now constant folding a trivial test. llvm-svn: 246634
*	[LV] Pull creation of trip counts into a helper function.	James Molloy	2015-09-02	1	-63/+101
\| \| \| \| \| \|	... and do a tad of tidyup while we're at it. Because StartIdx must now be zero, there's no difference between Count and EndIdx. llvm-svn: 246633
*	[LV] Factor the creation of the loop induction variable out of createEmptyLoop()	James Molloy	2015-09-02	1	-19/+43
\| \| \| \| \| \|	It makes things easier to understand if this is in a helper method. This is part of my ongoing spaghetti-removal operation on createEmptyLoop. llvm-svn: 246632
*	[LV] Never widen an induction variable.	James Molloy	2015-09-02	1	-115/+49
\| \| \| \| \| \| \| \| \| \|	There's no need to widen canonical induction variables. It's just as efficient to create a new, wide, induction variable. Consider, if we widen an indvar, then we'll have to truncate it before its uses anyway (1 trunc). If we create a new indvar instead, we'll have to truncate that instead (1 trunc) [besides which IndVars should go and clean up our mess after us anyway on principle]. This lets us remove a ton of special-casing code. llvm-svn: 246631
*	[LV] Switch to using canonical induction variables.	James Molloy	2015-09-02	1	-14/+8
\| \| \| \| \| \| \| \| \| \|	Vectorized loops only ever have one induction variable. All induction PHIs from the scalar loop are rewritten to be in terms of this single indvar. We were trying very hard to pick an indvar that already existed, even if that indvar wasn't canonical (didn't start at zero). But trying so hard is really fruitless - creating a new, canonical, indvar only results in one extra add in the worst case and that add is trivially easy to push through the PHI out of the loop by instcombine. If we try and be less clever here and instead let instcombine clean up our mess (as we do in many other places in LV), we can remove unneeded complexity. llvm-svn: 246630
*	AVX-512: store <4 x i1> and <2 x i1> values in memory	Elena Demikhovsky	2015-09-02	1	-0/+5
\| \| \| \| \| \| \| \|	Enabled DAG pattern lowering for SKX with DQI predicate. Differential Revision: http://reviews.llvm.org/D12550 llvm-svn: 246625
*	Optimization for Gather/Scatter with uniform base	Elena Demikhovsky	2015-09-02	1	-31/+43
\| \| \| \| \| \| \| \| \|	Vector 'getelementptr' with scalar base is an opportunity for gather/scatter intrinsic to generate a better sequence. While looking for uniform base, we want to use the scalar base pointer of GEP, if exists. Differential Revision: http://reviews.llvm.org/D11121 llvm-svn: 246622
*	Move createEliminateAvailableExternallyPass earlier in the pass pipeline	Yaron Keren	2015-09-02	1	-11/+13
\| \| \| \| \| \| \|	to save running many ModulePasses on available external functions that are thrown away anyhow. llvm-svn: 246619
*	[CodeGen] Fix FREM on 32-bit MSVC on x86	Vedant Kumar	2015-09-02	1	-1/+11
\| \| \| \| \| \| \| \|	Patch by Dylan McKay! Differential Revision: http://reviews.llvm.org/D12099 llvm-svn: 246615
*	[MC] Generate a timestamp for COFF object files	David Majnemer	2015-09-01	1	-2/+7
\| \| \| \| \| \| \| \| \| \| \|	The MS incremental linker seems to inspect the timestamp written into the object file to determine whether or not it's contents need to be considered. Failing to set the timestamp to a date newer than the executable will result in the object file not participating in subsequent links. To ameliorate this, write the current time into the object file's TimeDateStamp field. llvm-svn: 246607
*	[MC] Remove MCAssembler's copy of OS	David Majnemer	2015-09-01	2	-5/+5
\| \| \| \| \| \| \|	We can just ask the ObjectWriter for it's stream instead of caching around our own reference to it. No functionality change is intended. llvm-svn: 246604
*	[ARM] Don't abort on variable-idx extractelt in ReconstructShuffle.	Ahmed Bougacha	2015-09-01	1	-0/+4
\| \| \| \| \| \| \| \| \|	The code introduced in r244314 assumed that EXTRACT_VECTOR_ELT only takes constant indices, but it does accept variables. Bail out for those: we can't use them, as the shuffles we want to reconstruct do require constant masks. llvm-svn: 246594
*	[MC] Add support for generating COFF CRCs	David Majnemer	2015-09-01	3	-0/+125
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	COFF sections are accompanied with an auxiliary symbol which includes a checksum. This checksum used to be filled with just zero but this seems to upset LINK.exe when it is processing a /INCREMENTAL link job. Instead, fill the CheckSum field with the JamCRC of the section contents. This matches MSVC's behavior. This fixes PR19666. N.B. A rather simple implementation of JamCRC is given. It implements a byte-wise calculation using the method given by Sarwate. There are implementations with higher throughput like slice-by-eight and making use of PCLMULQDQ. We can switch to one of those techniques if it turns out to be a significant use of time. llvm-svn: 246590
*	rename "slow-unaligned-mem-under-32" to slow-unaligned-mem-16" (NFCI)	Sanjay Patel	2015-09-01	5	-53/+59
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a follow-on suggested by: http://reviews.llvm.org/D12154 ( http://reviews.llvm.org/rL245729 ) http://reviews.llvm.org/D10662 ( http://reviews.llvm.org/rL245075 ) This makes the attribute name match most of the existing lowering logic and regression test expectations. But the current use of this attribute is inconsistent; see the FIXME comment for "allowsMisalignedMemoryAccesses()". That change will result in functional changes and should be coming soon. llvm-svn: 246585
*	DeadArgElim: don't eliminate arguments from naked functions	Hans Wennborg	2015-09-01	1	-0/+21
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D12534 llvm-svn: 246564
*	New bitcode linker flags:	Artem Belevich	2015-09-01	1	-13/+33
\| \| \| \| \| \| \| \| \|	-only-needed -- link in only symbols needed by destination module -internalize -- internalize linked symbols Differential Revision: http://reviews.llvm.org/D12459 llvm-svn: 246561
*	[AArch64] Lower READCYCLECOUNTER using MRS PMCCTNR_EL0.	Ahmed Bougacha	2015-09-01	5	-6/+25
\| \| \| \| \| \| \| \| \| \|	This matches the ARM behavior. In both cases, the register is part of the optional Performance Monitors extension, so, add the feature, and enable it for the A-class processors we support. Differential Revision: http://reviews.llvm.org/D12425 llvm-svn: 246555
*	[MC] Allow MCObjectWriter's output stream to be swapped out	David Majnemer	2015-09-01	3	-79/+58
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There are occasions where it is useful to consider the entirety of the contents of a section. For example, compressed debug info needs the entire section available before it can compress it and write it out. The compressed debug info scenario was previously implemented by mirroring the implementation of writeSectionData in the ELFObjectWriter. Instead, allow the output stream to be swapped on demand. This lets callers redirect the output stream to a more convenient location before it hits the object file. No functionality change is intended. Differential Revision: http://reviews.llvm.org/D12509 llvm-svn: 246554
*	AVX512: Implemented intrinsics for valign.	Igor Breger	2015-09-01	1	-0/+8
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D12526 llvm-svn: 246551
*	[AArch64] Turn on by default interleaved access vectorization	Silviu Baranga	2015-09-01	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This change turns on by default interleaved access vectorization for AArch64. We also clean up some tests which were spedifically enabling this behaviour. Reviewers: rengolin Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D12149 llvm-svn: 246542