bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	Converting the ErrorHandlerMutex to a ManagedStatic to avoid the static ↵	Chris Bieneman	2014-10-03	1	-4/+5
\| \| \| \| \| \|	constructor and destructor. llvm-svn: 219028
*	[x86] Adjust the patterns for lowering X86vzmovl nodes which don't	Chandler Carruth	2014-10-03	2	-47/+54
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	perform a load to use blendps rather than movss when it is available. For non-loads, blendps is much faster. It can execute on two ports in Sandy Bridge and Ivy Bridge, and three ports on Haswell. This fixes one of the "regressions" from aggressively taking the "insertion" path in the new vector shuffle lowering. This does highlight one problem with blendps -- it isn't commuted as heavily as it should be. That's future work though. llvm-svn: 219022
*	PR21145: Teach LLVM about C++14 sized deallocation functions.	Richard Smith	2014-10-03	2	-1/+9
\| \| \| \| \| \| \| \|	C++14 adds new builtin signatures for 'operator delete'. This change allows new/delete pairs to be removed in C++14 onwards, as they were in C++11 and before. llvm-svn: 219014
*	Revert "Revert "DI: Fold constant arguments into a single MDString""	Duncan P. N. Exon Smith	2014-10-03	7	-616/+542
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit r218918, effectively reapplying r218914 after fixing an Ocaml bindings test and an Asan crash. The root cause of the latter was a tightened-up check in `DILexicalBlock::Verify()`, so I'll file a PR to investigate who requires the loose check (and why). Original commit message follows. -- This patch addresses the first stage of PR17891 by folding constant arguments together into a single MDString. Integers are stringified and a `\0` character is used as a separator. Part of PR17891. Note: I've attached my testcases upgrade scripts to the PR. If I've just broken your out-of-tree testcases, they might help. llvm-svn: 219010
*	[ISel] Keep matching state consistent when folding during X86 address match	Adam Nemet	2014-10-03	2	-0/+51
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the X86 backend, matching an address is initiated by the 'addr' complex pattern and its friends. During this process we may reassociate and-of-shift into shift-of-and (FoldMaskedShiftToScaledMask) to allow folding of the shift into the scale of the address. However as demonstrated by the testcase, this can trigger CSE of not only the shift and the AND which the code is prepared for but also the underlying load node. In the testcase this node is sitting in the RecordedNode and MatchScope data structures of the matcher and becomes a deleted node upon CSE. Returning from the complex pattern function, we try to access it again hitting an assert because the node is no longer a load even though this was checked before. Now obviously changing the DAG this late is bending the rules but I think it makes sense somewhat. Outside of addresses we prefer and-of-shift because it may lead to smaller immediates (FoldMaskAndShiftToScale is an even better example because it create a non-canonical node). We currently don't recognize addresses during DAGCombiner where arguably this canonicalization should be performed. On the other hand, having this in the matcher allows us to cover all the cases where an address can be used in an instruction. I've also talked a little bit to Dan Gohman on llvm-dev who added the RAUW for the new shift node in FoldMaskedShiftToScaledMask. This RAUW is responsible for initiating the recursive CSE on users (http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-September/076903.html) but it is not strictly necessary since the shift is hooked into the visited user. Of course it's safer to keep the DAG consistent at all times (e.g. for accurate number of uses, etc.). So rather than changing the fundamentals, I've decided to continue along the previous patches and detect the CSE. 
This patch installs a very targeted DAGUpdateListener for the duration of a complex-pattern match and updates the matching state accordingly. (Previous patches used HandleSDNode to detect the CSE but that's not practical here). The listener is only installed on X86. I tested that there is no measurable overhead due to this while running through the spec2k BC files with llc. The only thing we pay for is the creation of the listener. The callback never ever triggers in spec2k since this is a corner case. Fixes rdar://problem/18206171 llvm-svn: 219009
*	R600: Align functions to 256 bytes	Tom Stellard	2014-10-03	2	-3/+12
\| \| \| \|	llvm-svn: 219002
*	Eliminate some deep std::vector copies. NFC.	Benjamin Kramer	2014-10-03	13	-45/+21
\| \| \| \|	llvm-svn: 218999
*	MCParser: Modernize memory handling.	Benjamin Kramer	2014-10-03	1	-37/+22
\| \| \| \| \| \|	NFC. llvm-svn: 218998
*	llvm-readobj: print out the fields of the COFF delay-import table	Rui Ueyama	2014-10-03	1	-0/+6
\| \| \| \|	llvm-svn: 218996
*	[Power] Use lwsync for non-seq_cst fences	Robin Morisset	2014-10-03	1	-1/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: hwsync is only required for seq_cst fences, acquire and release one can use the cheaper lwsync. Test Plan: Added some cases to atomics.ll + make check-all Reviewers: jfb, wschmidt Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5317 llvm-svn: 218995
*	MipsAsmParser.cpp: fix VS2012 build	Hans Wennborg	2014-10-03	1	-1/+1
\| \| \| \|	llvm-svn: 218991
*	HexagonMCCodeEmitter.h: deleted member functions are not supported in VS2012	Hans Wennborg	2014-10-03	1	-2/+2
\| \| \| \|	llvm-svn: 218990
*	[mips] Print warning when using register names not available in N32/64	Daniel Sanders	2014-10-03	2	-0/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The register names t4-t7 are not available in the N32 and N64 ABIs. This patch prints a warning, when those names are used in N32/64, along with a fix-it with the correct register names. Patch by Vasileios Kalintiris Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5272 llvm-svn: 218989
*	Fix build break on Hexagon	Sid Manning	2014-10-03	1	-1/+2
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D5600 llvm-svn: 218987
*	Adding skeleton for unit testing Hexagon Code Emission	Sid Manning	2014-10-03	6	-8/+173
\| \| \| \| \| \| \| \| \| \| \|	Adding and modifying CMakeLists.txt files to run unit tests under unittests/Target/* if the directory exists. Adding basic unit test to check that code emitter object can be retrieved. Differential Revision: http://reviews.llvm.org/D5523 Change by: Colin LeMahieu llvm-svn: 218986
*	[x86] Teach the new vector shuffle lowering to aggressively form MOVSS	Chandler Carruth	2014-10-03	2	-5/+37
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	and MOVSD nodes for single element vector inserts. This is particularly important because a number of patterns in the backend detect these patterns and leverage them to simplify things. It also fixes quite a few of the insertion bad code examples. However, it regresses a specific area: when available, blendps and blendpd are dramatically faster than movss and movsd respectively. But it doesn't really work to form the blend logic first because the blends aren't as crazy efficient when the data is coming from memory anyways, and thus will have a movss or movsd regardless. Also, doing that would block a bunch of the patterns that this is designed to hit. So my plan is to go into the patterns for lowering MOVSS and MOVSD and lower them via blends when available. However that's a pretty invasive restructuring so it will need to be a follow-up patch. I have already gone into the patterns to lower MOVSS and MOVSD from memory using MOVLPD, etc. Without that, several of the test cases I already have regress. llvm-svn: 218985
*	Revert 202433 - Provide a target override for the latest regalloc heuristic	Renato Golin	2014-10-03	3	-8/+1
\| \| \| \| \| \| \| \| \| \| \|	That commit was introduced in order to help investigate a problem in ARM codegen breaking from commit 202304 (Add a limit to the heuristic that register allocates instructions in local order). Recent analysis indicated that the problem no longer exists, so I'm reverting this change. See PR18996. llvm-svn: 218981
*	[x86] Refactor the element insertion logic in the new vector shuffle	Chandler Carruth	2014-10-03	1	-19/+21
\| \| \| \| \| \| \| \| \| \| \|	lowering to handle the potential mirroring of 2-element vectors (because we can't reliably sort them one way) in the caller rather than in the insertion logic. This will simplify things considerably as more ways to fail to match the insertion are added because now we have a nice try and retry point. llvm-svn: 218980
*	[x86] Significantly improve the ability of the new vector shuffle	Chandler Carruth	2014-10-03	1	-26/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	lowering to match VZEXT_MOVL patterns. I hadn't realized that these had sufficient pattern smarts in the backend to lower zext-ing from the low element of a vector without it being a scalar_to_vector node. They do, and this is how to match a bunch of patterns for movq, movss, etc. There is a weird propensity to end up using pshufd to place the element afterward even though it means domain crossing (or rather, to use xorps+movss to zext the element rather than movq) but that's an orthogonal problem with VZEXT_MOVL that someone should probably look at. llvm-svn: 218977
*	[x86] Unbreak SSE1 with the new vector shuffle lowering. We can't widen	Chandler Carruth	2014-10-03	1	-4/+8
\| \| \| \| \| \| \| \| \|	element types to form illegal vector types. I've added a special SSE1 test case here that makes sure we don't break this going forward. llvm-svn: 218974
*	Revert r215343.	James Molloy	2014-10-03	1	-25/+1
\| \| \| \| \| \|	This was contentious and needs investigation. llvm-svn: 218971
*	[BasicAA] Revert r218714 - Make better use of zext and sign information.	Lang Hames	2014-10-03	1	-29/+2
\| \| \| \| \| \| \| \| \|	This patch broke 447.dealII on Darwin. I'm currently working on a reduced test-case, but reverting for now to keep the bots happy. <rdar://problem/18530107> llvm-svn: 218944
*	constify TargetMachine parameter.	Eric Christopher	2014-10-03	4	-5/+6
\| \| \| \|	llvm-svn: 218934
*	llvm-readobj: print COFF delay-load import table	Rui Ueyama	2014-10-03	1	-13/+90
\| \| \| \| \| \| \| \| \|	This patch adds another iterator to access the delay-load import table and use it from llvm-readobj. http://reviews.llvm.org/D5594 llvm-svn: 218933
*	constify TargetMachine argument.	Eric Christopher	2014-10-03	4	-4/+4
\| \| \| \|	llvm-svn: 218930
*	We can grab the options struct from the TargetMachine, no need to	Eric Christopher	2014-10-03	3	-5/+4
\| \| \| \| \| \|	pass it down in the constructor. llvm-svn: 218929
*	[AVX512] Pull pattern for subvector insert into the instruction definition	Adam Nemet	2014-10-02	1	-8/+4
\| \| \| \| \| \| \| \| \| \|	No functional change intended. Very similar to the change I made for subvector extract in r218480. test/CodeGen/X86/avx512-insert-extract.ll covers this. llvm-svn: 218928
*	[AVX512] Refactor subvector inserts	Adam Nemet	2014-10-02	1	-102/+55
\| \| \| \| \| \| \| \| \| \|	No functional change. Very similar to the extract refactoring I did in r218478. Compared X86.td.expanded before and after. llvm-svn: 218927
*	[AVX512] Fix i256mem->f256mem typo in VINSERTF64x4rm	Adam Nemet	2014-10-02	1	-1/+1
\| \| \| \| \| \| \|	Just like in the case of extracts, the refactoring is uncovering some typos in the code. llvm-svn: 218926
*	[PowerPC] Modern Book-E cores support sync	Hal Finkel	2014-10-02	4	-17/+24
\| \| \| \| \| \| \| \| \| \| \| \| \|	Older Book-E cores, such as the PPC 440, support only msync (which has the same encoding as sync 0), but not any of the other sync forms. Newer Book-E cores, however, do support sync, and for performance reasons we should allow the use of the more-general form. This refactors msync use into its own feature group so that it applies by default only to older Book-E cores (of the relevant cores, we only have definitions for the PPC440/450 currently). llvm-svn: 218923
*	[Power] Improve the expansion of atomic loads/stores	Robin Morisset	2014-10-02	3	-4/+26
	Summary: Atomic loads and stores of up to the native size (32 bits, or 64 for PPC64) can be lowered to a simple load or store instruction (as the synchronization is already handled by AtomicExpand, and the atomicity is guaranteed thanks to the alignment requirements of atomic accesses). This is exactly what this patch does. Previously, these were implemented by complex load-linked/store-conditional loops, an obvious performance problem.

	For example, this patch turns
```
define void @store_i8_unordered(i8* %mem) {
  store atomic i8 42, i8* %mem unordered, align 1
  ret void
}
```
	from
```
_store_i8_unordered:                    ; @store_i8_unordered
; BB#0:
    rlwinm r2, r3, 3, 27, 28
    li r4, 42
    xori r5, r2, 24
    rlwinm r2, r3, 0, 0, 29
    li r3, 255
    slw r4, r4, r5
    slw r3, r3, r5
    and r4, r4, r3
LBB4_1:                                 ; =>This Inner Loop Header: Depth=1
    lwarx r5, 0, r2
    andc r5, r5, r3
    or r5, r4, r5
    stwcx. r5, 0, r2
    bne cr0, LBB4_1
; BB#2:
    blr
```
	into
```
_store_i8_unordered:                    ; @store_i8_unordered
; BB#0:
    li r2, 42
    stb r2, 0(r3)
    blr
```
	which looks like a pretty clear win to me.

	Test Plan: fixed the tests + new test for indexed accesses + make check-all
	Reviewers: jfb, wschmidt, hfinkel
	Subscribers: llvm-commits
	Differential Revision: http://reviews.llvm.org/D5587
	llvm-svn: 218922
*	Fix the threshold added in r186434 (a re-apply of r185393) and updated	Chandler Carruth	2014-10-02	2	-31/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	to be a ManagedStatic in r218163 to not be a global variable written and read to from within the innards of SpillPlacement. This will fix a really scary race condition for anyone that has two copies of LLVM running spill placement concurrently. Yikes! This will also fix a really significant compile time hit that r218163 caused because the spill placement threshold read is actually in the very hot path of this code. The memory fence on each read was showing up as huge compile time regressions when spilling is responsible for most of the compile time. For example, optimizing sanitized code showed over 50% compile time regressions here. =/ llvm-svn: 218921
*	[Stackmaps] Make the frame-pointer required for stackmaps.	Juergen Ributzka	2014-10-02	2	-2/+4
\| \| \| \| \| \| \| \| \|	Do not eliminate the frame pointer if there is a stackmap or patchpoint in the function. All stackmap references should be FP relative. This fixes PR21107. llvm-svn: 218920
*	Revert "DI: Fold constant arguments into a single MDString"	Duncan P. N. Exon Smith	2014-10-02	7	-528/+608
\| \| \| \| \| \|	This reverts commit r218914 while I investigate some bots. llvm-svn: 218918
*	llvm-readobj: print COFF imported symbols	Rui Ueyama	2014-10-02	1	-0/+90
\| \| \| \| \| \| \| \|	This patch defines a new iterator for the imported symbols. Make a change to COFFDumper to use that iterator to print out imported symbols and their ordinals. llvm-svn: 218915
*	DI: Fold constant arguments into a single MDString	Duncan P. N. Exon Smith	2014-10-02	7	-608/+528
\| \| \| \| \| \| \| \| \| \| \| \| \|	This patch addresses the first stage of PR17891 by folding constant arguments together into a single MDString. Integers are stringified and a `\0` character is used as a separator. Part of PR17891. Note: I've attached my testcases upgrade scripts to the PR. If I've just broken your out-of-tree testcases, they might help. llvm-svn: 218914
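The folding described in r218914 can be illustrated outside of LLVM. The following is a minimal sketch, assuming plain `std::string` in place of `MDString`; the helper names `foldFields` and `splitFields` are hypothetical and are not taken from the patch. Integers are stringified by the caller and the fields are joined with a `\0` separator.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not LLVM API): join already-stringified fields
// into one string, using '\0' as the separator.
std::string foldFields(const std::vector<std::string> &fields) {
  std::string out;
  for (std::size_t i = 0; i < fields.size(); ++i) {
    if (i != 0)
      out.push_back('\0');
    out += fields[i];
  }
  return out;
}

// Hypothetical inverse: split a folded string back into its fields.
// An empty input yields a single empty field.
std::vector<std::string> splitFields(const std::string &folded) {
  std::vector<std::string> fields;
  std::size_t start = 0;
  while (true) {
    std::size_t end = folded.find('\0', start);
    if (end == std::string::npos) {
      fields.push_back(folded.substr(start));
      break;
    }
    fields.push_back(folded.substr(start, end - start));
    start = end + 1;
  }
  return fields;
}
```

A caller would pass something like `{std::to_string(tag), "foo.c", std::to_string(0)}`; because `std::string` tracks its length explicitly, the embedded `\0` bytes survive the round trip.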
*	[x86] Teach the new vector shuffle lowering to widen floating point	Chandler Carruth	2014-10-02	2	-8/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	elements as well as integer elements in order to form simpler shuffle patterns. This is the primary reason why we were failing to match some of the 2-and-2 floating point shuffles such as PR21140. Even after fixing this we need to support some extra patterns in the backend in order to match the resulting X86ISD::UNPCKL nodes into the correct instructions. This commit should fix PR21140 and includes more comprehensive testing of insertion patterns in v4 shuffles. Not all of the added tests are beautiful. For example, we don't have clever instructions to insert-via-load in the integer domain. There are also some places where we aren't sufficiently cunning with our use of movq and movd, but that's future work. llvm-svn: 218911
*	LTO: Document the Boolean argument from r218784	Duncan P. N. Exon Smith	2014-10-02	1	-1/+2
\| \| \| \|	llvm-svn: 218907
*	Optimize square root squared (PR21126).	Sanjay Patel	2014-10-02	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \|	When unsafe-fp-math is enabled, we can turn sqrt(X) * sqrt(X) into X. This can happen in the real world when calculating x ** 3/2. This occurs in test-suite/SingleSource/Benchmarks/BenchmarkGame/n-body.c. Differential Revision: http://reviews.llvm.org/D5584 llvm-svn: 218906
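A small sketch of the identity behind r218906, written as ordinary C++ rather than as the optimizer change itself. The function names are hypothetical. Cubing a square root produces a `sqrt(X) * sqrt(X)` subexpression that can be replaced by `X`; the rewrite is only valid under relaxed floating-point rules because, for example, a negative `X` gives NaN before folding and `X` after.

```cpp
#include <cmath>

// sqrt(x) * sqrt(x), as written in source code.
double sqrtSquared(double x) { return std::sqrt(x) * std::sqrt(x); }

// x ** 3/2 computed by cubing the root: contains sqrt(x) * sqrt(x).
double cubeOfRoot(double x) {
  double r = std::sqrt(x);
  return r * r * r;
}

// The same quantity after folding sqrt(x) * sqrt(x) into x.
double cubeOfRootFolded(double x) { return x * std::sqrt(x); }
```

For `x = 4.0` both `cubeOfRoot` and `cubeOfRootFolded` return `8.0`, while `sqrtSquared(-1.0)` is NaN, which is why the transformation is gated on unsafe-fp-math.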
*	InstrProf: Avoid linear search in a hot loop	Justin Bogner	2014-10-02	1	-5/+6
\| \| \| \| \| \| \| \| \| \|	Every time we were adding or removing an expression when generating a coverage mapping we were doing a linear search to try and deduplicate the list. The indices in the list are important, so we can't just replace it by a DenseMap entirely, but an auxiliary DenseMap for fast lookup massively improves the performance issues I was seeing here. llvm-svn: 218892
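The pattern described in r218892 (keep the indexed list, add a side map for lookup) can be sketched as follows. This is an illustration under assumptions, not the patch: `std::unordered_map` stands in for `DenseMap`, expressions are modelled as strings, and the class name `ExpressionTable` is hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical sketch: the vector preserves stable indices, the map
// gives average O(1) deduplication instead of a linear search.
class ExpressionTable {
public:
  // Returns the index of expr, appending it only if it is new.
  std::size_t getOrAdd(const std::string &expr) {
    auto it = indexOf.find(expr);
    if (it != indexOf.end())
      return it->second;
    std::size_t idx = exprs.size();
    exprs.push_back(expr);
    indexOf.emplace(expr, idx);
    return idx;
  }

  const std::string &at(std::size_t i) const { return exprs[i]; }
  std::size_t size() const { return exprs.size(); }

private:
  std::vector<std::string> exprs;                       // index -> expression
  std::unordered_map<std::string, std::size_t> indexOf; // expression -> index
};
```

Adding the same expression twice returns the original index and leaves the list unchanged, so indices handed out earlier stay valid.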
*	This patch adds a new flag "-coff-imports" to llvm-readobj.	Rui Ueyama	2014-10-02	1	-5/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	When the flag is given, the command prints out the COFF import table. Currently only the import table directory will be printed. I'm going to make another patch to print out the imported symbols. The implementation of import directory entry iterator in COFFObjectFile.cpp was buggy. This patch fixes that too. http://reviews.llvm.org/D5569 llvm-svn: 218891
*	Reapply "InstrProf: Don't keep a large sparse list around just to zero it"	Justin Bogner	2014-10-02	1	-24/+43
\| \| \| \| \| \| \| \| \| \|	When I was preparing r218879 for commit, I removed an early return that I decided was just noise. It wasn't. This is r218879 no-crash edition. This reverts commit r218881, reapplying r218879. llvm-svn: 218887
*	Remove an extra whitespace.	Adrian Prantl	2014-10-02	1	-1/+1
\| \| \| \|	llvm-svn: 218886
*	Pretty-printer: Paper over an ambiguity between line table entries	Adrian Prantl	2014-10-02	1	-1/+3
\| \| \| \| \| \| \| \|	and tagged mdnodes. fixes http://llvm.org/bugs/show_bug.cgi?id=21131 llvm-svn: 218885
*	Revert "InstrProf: Don't keep a large sparse list around just to zero it"	Justin Bogner	2014-10-02	1	-38/+24
\| \| \| \| \| \| \| \|	This seems to be crashing on some buildbots. Reverting to investigate. This reverts commit r218879. llvm-svn: 218881
*	InstrProf: Don't keep a large sparse list around just to zero it	Justin Bogner	2014-10-02	1	-24/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The Terms vector here represented a polynomial of all possible counters, and is used to simplify expressions when generating coverage mapping. There are a few problems with this: 1. Keeping the vector as a member is wasteful, since we clear it every time we use it. 2. Most expressions refer to a subset of the counters, so we end up iterating over a large number of zeros doing nothing a lot of the time. This updates the user of the vector to store the terms locally, and uses a sort and combine approach so that we only operate on counters that are actually used in a given expression. For small cases this makes very little difference, but in cases with a very large number of counted regions this is a significant performance fix. llvm-svn: 218879
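The sort-and-combine approach mentioned in r218879 can be sketched generically. The code below is an assumption-laden illustration, not the coverage-mapping implementation: a term is modelled as a (counter id, coefficient) pair, and the names `Term` and `combineTerms` are hypothetical.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical term: (counter id, signed coefficient).
using Term = std::pair<unsigned, int>;

// Sort terms by counter id, merge neighbours that share an id, and drop
// terms whose coefficients cancel. Work is proportional to the number of
// terms actually present, not to the total number of counters.
std::vector<Term> combineTerms(std::vector<Term> terms) {
  std::sort(terms.begin(), terms.end(),
            [](const Term &a, const Term &b) { return a.first < b.first; });
  std::vector<Term> out;
  for (const Term &t : terms) {
    if (!out.empty() && out.back().first == t.first)
      out.back().second += t.second; // same counter: add coefficients
    else
      out.push_back(t);
  }
  out.erase(std::remove_if(out.begin(), out.end(),
                           [](const Term &t) { return t.second == 0; }),
            out.end());
  return out;
}
```

Compared with a dense vector indexed by every possible counter, the local list is built per expression and never needs clearing.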
*	Use the local variable that other clauses around here are already using.	Sanjay Patel	2014-10-02	1	-1/+1
\| \| \| \|	llvm-svn: 218876
*	Remove duplicate function names from comments. NFC.	Sanjay Patel	2014-10-02	1	-43/+35
\| \| \| \|	llvm-svn: 218875
*	[NVPTX] Remove dead code.	Tilmann Scheller	2014-10-02	1	-9/+3
\| \| \| \| \| \|	Found by the Clang static analyzer. llvm-svn: 218874
*	Support padding unaligned data in .text.	Joerg Sonnenberger	2014-10-02	1	-1/+6
\| \| \| \|	llvm-svn: 218870