bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[DAG] Move legal type checks in store merge to be checked only	Nirav Dave	2017-05-26	1	-2/+4
\| \| \| \| \| \|	on non-legal cases. NFC. llvm-svn: 303994
*	[ARM] Fix lowering of misaligned memcpy/memset	John Brawn	2017-05-26	4	-22/+63
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently getOptimalMemOpType returns i32 for large enough sizes without checking for alignment, leading to poor code generation when misaligned accesses aren't permitted, as we generate a word store then later split it up into byte stores. This means we inadvertently go over the MaxStoresPerMemcpy limit, and for memset we splat the memset value into a word then immediately split it up again. Fix this by leaving it up to FindOptimalMemOpLowering to figure out which type to use, but also fix a bug there where it wasn't correctly checking if misaligned memory accesses are allowed. Differential Revision: https://reviews.llvm.org/D33442 llvm-svn: 303990
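The store-count problem described in this commit can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the LLVM implementation: `chooseStoreWidth` and `countStores` are hypothetical helpers, and the model considers only 4-, 2-, and 1-byte stores.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified model (not LLVM code): pick the widest store, in
// bytes, that fits in the remaining size and is allowed by the alignment.
unsigned chooseStoreWidth(uint64_t BytesLeft, unsigned AlignBytes,
                          bool AllowsUnaligned) {
  assert(BytesLeft > 0 && AlignBytes > 0);
  for (unsigned Width = 4; Width > 1; Width /= 2) {
    bool AlignedEnough = AllowsUnaligned || AlignBytes % Width == 0;
    if (BytesLeft >= Width && AlignedEnough)
      return Width;
  }
  return 1; // Byte stores are always permitted.
}

// Count how many stores an inline expansion of Size bytes would need.
unsigned countStores(uint64_t Size, unsigned AlignBytes,
                     bool AllowsUnaligned) {
  unsigned NumStores = 0;
  while (Size > 0) {
    Size -= chooseStoreWidth(Size, AlignBytes, AllowsUnaligned);
    ++NumStores;
  }
  return NumStores;
}
```

In this model an 8-byte copy with 1-byte alignment needs eight stores when unaligned accesses are not permitted, but only two when they are. A store limit has to be checked against the first number, which is the kind of mismatch the commit describes.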
*	nits in wide-integer-cmp.ll . NFC	Amaury Sechet	2017-05-26	1	-1/+0
\| \| \| \|	llvm-svn: 303989
*	[ARM] Add tests for 6-M memcpy/memset code generation	John Brawn	2017-05-26	2	-10/+32
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D33495 llvm-svn: 303987
*	The fix for PR22004: X86AsmParser.cpp asserts: OperandStack.size() > 1 && ↵	Andrew V. Tischenko	2017-05-26	2	-2/+6
\| \| \| \| \| \|	"Too few operands." llvm-svn: 303985
*	Revert "[DWARF] - Make collectAddressRanges() return section index in ↵	George Rimar	2017-05-26	18	-96/+42
\| \| \| \| \| \| \| \| \| \| \| \|	addition to Low/High PC" Broke BB again: TEST 'LLVM :: DebugInfo/X86/dbg-value-regmask-clobber.ll' FAILED ... LLVM ERROR: Section was outside of section table. llvm-svn: 303984
*	Recommit r303978 "[DWARF] - Make collectAddressRanges() return section index ↵	George Rimar	2017-05-26	18	-42/+96
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	in addition to Low/High PC" With fix of test compilation. Initial commit message: This change is intended for use by LLD in D33183. The problem we have in LLD when building .gdb_index is that we need to know which section an address range belongs to. Previously this was solved on the LLD side by providing fake section addresses through the llvm::LoadedObjectInfo interface. We assigned file offsets as addresses. Then, after obtaining the range lists, we had to find the section ID for each range. That was not only slow, but also complicated the implementation and caused incorrect behavior when sections share the same offsets, as D33176 shows. This patch makes the DWARF parsers return the section index as well, which solves the problem mentioned above. Differential revision: https://reviews.llvm.org/D33184 llvm-svn: 303983
*	Revert r303978 "[DWARF] - Make collectAddressRanges() return section index ↵	George Rimar	2017-05-26	17	-90/+36
\| \| \| \| \| \| \| \|	in addition to Low/High PC" It failed BB. llvm-svn: 303981
*	Fix signedness of constant. NFC.	Nirav Dave	2017-05-26	1	-5/+5
\| \| \| \|	llvm-svn: 303980
*	Export the required symbol from DynamicLibraryTests	Roger Ferrer Ibanez	2017-05-26	2	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Running unittests/Support/DynamicLibrary/DynamicLibraryTests fails when LLVM is configured with LLVM_EXPORT_SYMBOLS_FOR_PLUGINS=ON, because the test's version script only contains symbols extracted from the static libraries, that the test links with, but not those from the main object/executable itself. The patch explicitly exports the one symbol needed by the test. This change fixes https://bugs.llvm.org/show_bug.cgi?id=32893 Patch authored by Momchil Velikov. Differential Revision: https://reviews.llvm.org/D33490 llvm-svn: 303979
*	[DWARF] - Make collectAddressRanges() return section index in addition to ↵	George Rimar	2017-05-26	17	-36/+90
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Low/High PC This change is intended for use by LLD in D33183. The problem we have in LLD when building .gdb_index is that we need to know which section an address range belongs to. Previously this was solved on the LLD side by providing fake section addresses through the llvm::LoadedObjectInfo interface. We assigned file offsets as addresses. Then, after obtaining the range lists, we had to find the section ID for each range. That was not only slow, but also complicated the implementation and caused incorrect behavior when sections share the same offsets, as D33176 shows. This patch makes the DWARF parsers return the section index as well, which solves the problem mentioned above. Differential revision: https://reviews.llvm.org/D33184 llvm-svn: 303978
*	Remove unnecessary double-assignment triggering -Wsequence-point.	Daniel Jasper	2017-05-26	1	-1/+1
\| \| \| \|	llvm-svn: 303974
*	Re-enable "[SCEV] Do not fold dominated SCEVUnknown into AddRecExpr start"	Max Kazantsev	2017-05-26	7	-20/+189
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The patch rL303730 was reverted because the test lsr-expand-quadratic.ll failed on many non-X86 configs with this patch. The reason is that the patch makes a correctness fix that changes the optimizer's behavior for this test. Without the change, LSR was making an overconfident simplification based on a wrong SCEV. Apparently it did not need the IV analysis to do this. With the change, it chose a different way to simplify (one that wasn't so confident), and this way required the IV analysis.
	Now, following the right execution path, LSR tries to make a transformation relying on IV Users analysis. This analysis is target-dependent due to this code:
	    // LSR is not APInt clean, do not touch integers bigger than 64-bits.
	    // Also avoid creating IVs of non-native types. For example, we don't want a
	    // 64-bit IV in 32-bit code just because the loop has one 64-bit cast.
	    uint64_t Width = SE->getTypeSizeInBits(I->getType());
	    if (Width > 64 || !DL.isLegalInteger(Width))
	      return false;
	To make a proper transformation in this test case, the type i32 needs to be legal for the specified data layout. When the test runs on some non-X86 configuration (e.g. pure ARM 64), opt gets confused by the specified target and does not use it, rejecting the specified data layout as well. Instead, it uses some default layout that does not treat i32 as a legal type (currently the layout that is used when none is specified does not have legal types at all). As a result, the transformation we expect to happen does not happen for this test.
	This re-enabling patch does not have any source code changes compared to the original patch rL303730. The only difference is that the failing test is moved to the X86 directory and is now required to run on x86 only, to comply with the specified target triple and data layout. Differential Revision: https://reviews.llvm.org/D33543 llvm-svn: 303971
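The target dependence of the width check quoted in this commit message can be seen in a small standalone model. This is a sketch under stated assumptions: `LayoutModel` is a hypothetical stand-in for a data layout's list of native integer widths and is not the LLVM `DataLayout` class, and `isCandidateIVWidth` is a hypothetical name for the quoted check.

```cpp
#include <cstdint>
#include <set>

// Hypothetical stand-in for a data layout: the set of native integer widths.
struct LayoutModel {
  std::set<uint64_t> NativeWidths;
  bool isLegalInteger(uint64_t Width) const {
    return NativeWidths.count(Width) != 0;
  }
};

// Mirrors the quoted check: reject widths above 64 bits and widths that the
// layout does not list as native.
bool isCandidateIVWidth(const LayoutModel &DL, uint64_t Width) {
  if (Width > 64 || !DL.isLegalInteger(Width))
    return false;
  return true;
}
```

With a layout that lists 8, 16, 32, and 64 as native widths, a 32-bit value passes the check. With an empty layout, which models the "no legal types at all" default the message mentions, the same value is rejected, so the transformation is skipped.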
*	LivePhysRegs: Skip reserved regs in computeLiveIns; NFCI	Matthias Braun	2017-05-26	5	-7/+13
\| \| \| \| \| \| \| \| \| \|	Re-commit r303937 + r303949 as they were not the cause for the build failures. We do not track liveness of reserved registers so adding them to the liveins list in computeLiveIns() was completely unnecessary. llvm-svn: 303970
*	Revert rL303923 since it broke the sanitizer bootstrap build bot.	Wei Mi	2017-05-26	5	-271/+26
\| \| \| \|	llvm-svn: 303969
*	[InstSimplify] Use APInt::isMask instead of manually implementing it. NFC	Craig Topper	2017-05-26	1	-2/+2
\| \| \| \|	llvm-svn: 303968
*	[InstSimplify] Use m_ConstantInt matchers to shorten some code. NFC	Craig Topper	2017-05-26	1	-7/+5
\| \| \| \|	llvm-svn: 303967
*	[IR] Add an iterator and range accessor for the PHI nodes of a basic	Chandler Carruth	2017-05-26	4	-7/+130
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	block. This allows writing much more natural and readable range based for loops directly over the PHI nodes. It also takes advantage of the same tricks for terminating the sequence as the hand coded versions. I've replaced one example of this mostly to showcase the difference and I've added a unit test to make sure the facilities really work the way they're intended. I want to use this inside of SimpleLoopUnswitch but it seems generally nice. Differential Revision: https://reviews.llvm.org/D33533 llvm-svn: 303964
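The idea of iterating only over the leading PHI nodes of a block with a range-based for loop can be sketched without LLVM. Everything here is an assumption made for illustration: `Inst`, `PhiRange`, `phis`, and `phiNames` are hypothetical, and the end of the range is found eagerly with `std::find_if` rather than with the termination trick the commit refers to.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical instruction model: PHI nodes always come first in a block.
struct Inst {
  bool IsPhi;
  std::string Name;
};

// A minimal range type usable in a range-based for loop.
struct PhiRange {
  std::vector<Inst>::const_iterator First, Last;
  std::vector<Inst>::const_iterator begin() const { return First; }
  std::vector<Inst>::const_iterator end() const { return Last; }
};

// Returns the range covering the leading PHI nodes of the block.
PhiRange phis(const std::vector<Inst> &Block) {
  auto FirstNonPhi = std::find_if(Block.begin(), Block.end(),
                                  [](const Inst &I) { return !I.IsPhi; });
  return {Block.begin(), FirstNonPhi};
}

// Example client: collect the names of all PHI nodes in the block.
std::vector<std::string> phiNames(const std::vector<Inst> &Block) {
  std::vector<std::string> Names;
  for (const Inst &I : phis(Block))
    Names.push_back(I.Name);
  return Names;
}
```

The client loop no longer needs a hand-written iterator walk with a cast-and-check on every step, which is the readability gain the commit message points to.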
*	Revert "LivePhysRegs: Fix addLiveOutsNoPristines() for return blocks past PEI"	Matthias Braun	2017-05-26	2	-93/+27
\| \| \| \| \| \| \| \| \| \|	Tentatively revert this to see if it fixes the buildbot stage2 breakages. This reverts commit r303938. This reverts commit r303954. llvm-svn: 303960
*	Revert "LivePhysRegs: Skip reserved regs in computeLiveIns; NFCI"	Matthias Braun	2017-05-26	5	-13/+7
\| \| \| \| \| \| \| \| \| \|	Tentatively revert, suspecting that it caused breakage in stage2 buildbots. This reverts commit r303949. This reverts commit r303937. llvm-svn: 303955
*	Test for r303938	Matthias Braun	2017-05-26	1	-0/+52
\| \| \| \|	llvm-svn: 303954
*	[PM] Enable the new simple loop unswitch pass in the new pass manager	Chandler Carruth	2017-05-26	2	-4/+2
\| \| \| \| \| \| \| \| \|	(where it is the only realistic option). This passes the LLVM test suite for me, but I'm clearly still hammering on this. llvm-svn: 303952
*	Tidy up RelocVisitor.h.	Rui Ueyama	2017-05-26	1	-325/+188
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: RelocVisitor had too many, too small functions. This patch groups them by architecture rather than by relocation type. Reviewers: grimar, dblaikie Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33580 llvm-svn: 303950
*	LivePhysRegs: Follow-up to r303937	Matthias Braun	2017-05-26	1	-1/+1
\| \| \| \| \| \| \|	We may have situations in which a superregister is reserved and not added to liveins, so we have to add the subregisters. llvm-svn: 303949
*	[llvm-pdbdump] Don't crash when displaying padding.	Zachary Turner	2017-05-26	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We have a lot of complicated logic to determine where padding is in a record, and the debug info doesn't always provide enough information to figure it out with laser precision. In this case we were putting the padding in the wrong place causing an out of bounds access on a BitVector. Right now we decide that any trailing padding of a child type will be truncated during record layout, but this is only true insofar as the class still is sized properly to end on an alignment boundary, which the algorithm doesn't yet know about. For now, just don't crash, even though we display padding twice in this case. llvm-svn: 303946
*	[Examples] Fix some Clang-tidy modernize-use-using and Include What You Use ↵	Eugene Zelenko	2017-05-26	7	-44/+43
\| \| \| \| \| \|	warnings; other minor fixes (NFC). llvm-svn: 303944
*	Return a lit.Test.Result object from TestRunner's executeShTest()	Dimitry Andric	2017-05-25	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: For various clang analyzer tests, which were unsupported, I got lit exceptions, similar to the following:
	    Exception during script execution:
	    Traceback (most recent call last):
	      File "utils/lit/lit/run.py", line 190, in execute_test
	        result = test.config.test_format.execute(test, lit_config)
	      File "tools/clang/test/Analysis/analyzer_test.py", line 11, in execute
	        if result.code == lit.Test.FAIL:
	    AttributeError: 'tuple' object has no attribute 'code'
	This is because executeShTest() in utils/lit/lit/TestRunner.py is supposed to return a lit.Test.Result object, but in case of unsupported tests, it returns a plain tuple. Fix this by returning a properly initialized lit.Test.Result object instead. Reviewers: rnk, rafael, modocache Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33579 llvm-svn: 303943
*	Remove unused member.	Zachary Turner	2017-05-25	1	-2/+0
\| \| \| \|	llvm-svn: 303942
*	[PPC] Add text for assert.	Tim Shen	2017-05-25	1	-1/+1
\| \| \| \|	llvm-svn: 303940
*	LTO: Do summary-based prevailing symbol resolution at --lto-O0.	Peter Collingbourne	2017-05-25	2	-13/+23
\| \| \| \| \| \| \| \| \|	Prevailing symbol resolution is necessary for correctness. Without this we can end up dropping a referenced linkonce symbol from the link. Differential Revision: https://reviews.llvm.org/D33570 llvm-svn: 303939
*	LivePhysRegs: Fix addLiveOutsNoPristines() for return blocks past PEI	Matthias Braun	2017-05-25	1	-27/+41
\| \| \| \| \| \| \| \| \| \| \| \| \|	- addLiveOutsNoPristines() needs to add callee saved registers that are actually saved and restored somewhere to the set (they are not pristine). - Cleanup/rewrite the code for addLiveOuts()/addLiveOutsNoPristines(). This fixes the problem from D32156. Differential Revision: https://reviews.llvm.org/D32464 llvm-svn: 303938
*	LivePhysRegs: Skip reserved regs in computeLiveIns; NFCI	Matthias Braun	2017-05-25	5	-6/+12
\| \| \| \| \| \| \|	We do not track liveness of reserved registers so adding them to the liveins list in computeLiveIns() was completely unnecessary. llvm-svn: 303937
*	[CV Type Merging] Find nested type indices faster.	Zachary Turner	2017-05-25	8	-355/+954
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Merging two type streams is one of the most time consuming parts of generating a PDB, and as such it needs to be as fast as possible. The visitor abstractions used for interoperating nicely with many different types of inputs and outputs have been used widely and help greatly for testability and implementing tools, but the abstractions build up and get in the way of performance. This patch removes all of the visitation stuff from the type stream merger, essentially re-inventing the leaf / member switch and loop, but at a very low level. This allows us many other optimizations, such as not actually deserializing any records (even member records which don't describe their own length), as the operation of "figure out how long this record is" is somewhat faster than "figure out how long this record is and get all its fields out". Furthermore, whereas before we had to deserialize, re-write type indices, then re-serialize, now we don't have to do any of those 3 steps. We just find out where the type indices are and pull them directly out of the byte stream and re-write them. This is worth a 50-60% performance increase. On top of all other optimizations that have been applied this week, I now get the following numbers when linking lld.exe and lld.pdb:
	    MSVC: 25.67s
	    Before This Patch: 18.59s
	    After This Patch: 8.92s
	So this is a huge performance win. Differential Revision: https://reviews.llvm.org/D33564 llvm-svn: 303935
*	DebugInfo: Simplify scopes+subprogram handling since the subprogram<>cu link ↵	David Blaikie	2017-05-25	1	-18/+8
\| \| \| \| \| \| \| \| \| \| \| \|	inversion Previously this code was defensive to the situation in which the debug info scopes would lead to a different subprogram from the subprogram in the CU's subprogram list (this could've happened with linkonce functions, etc as per the comment being removed). Since the CU<>SP link reversal this is no longer possible. llvm-svn: 303933
*	[PPC] Fix atomics lowering in DAG lowering.	Tim Shen	2017-05-25	2	-1/+26
\| \| \| \| \| \| \| \| \| \| \|	I forgot to forward the chain, causing some missing instruction dependencies. The test crashes the compiler without this patch. Inspired by the test case, D33519 also tries to remove the extra sync. Differential Revision: https://reviews.llvm.org/D33573 llvm-svn: 303931
*	Fix test to handle running on platforms which don't enable pubnames at all	David Blaikie	2017-05-25	1	-6/+4
\| \| \| \| \| \| \|	Check that there are no entries in the pub sections, but that they may either be not present or present-but-empty. llvm-svn: 303927
*	[InstCombine] Add an InstCombine specific wrapper around ↵	Craig Topper	2017-05-25	4	-14/+14
\| \| \| \| \| \| \| \|	isKnownToBeAPowerOfTwo to shorten code. NFC We have wrappers for several other ValueTracking methods that take care of passing all of the analysis and assumption cache parameters. This extends it to isKnownToBeAPowerOfTwo. llvm-svn: 303924
*	[GVN] Add phi-translate support in scalarpre.	Wei Mi	2017-05-25	5	-26/+271
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Right now scalarpre doesn't have phi-translate support, so it will miss some simple PRE opportunities. In the following testcase, current scalarpre cannot recognize that the last "a * b" is fully redundant, because a and b used by the last "a * b" expr are both defined by phis.
	    long a[100], b[100], g1, g2, g3;
	    __attribute__((pure)) long goo();
	    void foo(long a, long b, long c, long d) {
	      g1 = a * b;
	      if (__builtin_expect(g2 > 3, 0)) {
	        a = c;
	        b = d;
	        g2 = a * b;
	      }
	      g3 = a * b; // fully redundant.
	    }
	The patch adds phi-translate support in scalarpre. This is only a temporary solution before the newpre based on newgvn is available. Differential Revision: https://reviews.llvm.org/D32252 llvm-svn: 303923
*	Add constrained intrinsics for some libm-equivalent operations	Andrew Kaylor	2017-05-25	16	-72/+1134
\| \| \| \| \| \|	Differential revision: https://reviews.llvm.org/D32319 llvm-svn: 303922
*	CodeGen: Rename DEBUG_TYPE to match passnames	Matthias Braun	2017-05-25	74	-162/+146
\| \| \| \| \| \| \| \|	Rename the DEBUG_TYPE to match the names of corresponding passes where it makes sense. Also establish the pattern of simply referencing DEBUG_TYPE instead of repeating the passname where possible. llvm-svn: 303921
*	[lld] Fix a bug where we continually re-follow type servers.	Zachary Turner	2017-05-25	2	-8/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Originally this was intended to be set up so that when linking a PDB which refers to a type server, it would only visit the PDB once, and on subsequent visitations it would just skip it since all the records had already been added. Due to some C++ scoping issues, this was not occurring and it was revisiting the type server every time, which caused every record to end up being thrown away on all subsequent visitations. This doesn't affect the performance of linking clang-cl generated object files because we don't use type servers, but when linking object files and libraries generated with /Zi via MSVC, this means only 1 object file has to be linked instead of N object files, so the speedup is quite large. llvm-svn: 303920
*	[CodeView Type Merging] Don't keep re-allocating temp serializer.	Zachary Turner	2017-05-25	5	-21/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, every time we wanted to serialize a field list record, we would create a new copy of FieldListRecordBuilder, which would in turn create a temporary instance of TypeSerializer, which itself had a std::vector<> that was about 128K in size. So this 128K allocation was happening every time. We can re-use the same instance over and over, we just have to clear its internal hash table and seen records list between each run. This saves us from the constant re-allocations. This is worth an ~18.5% speed increase (3.75s -> 3.05s) in my tests. Differential Revision: https://reviews.llvm.org/D33506 llvm-svn: 303919
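The reuse pattern described in this commit, clearing state between runs instead of constructing a new object with a large buffer each time, can be sketched as follows. `ScratchSerializer` is a hypothetical class for illustration only and does not reflect the real TypeSerializer interface. The 128K figure is taken from the commit message.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical serializer model that owns one large scratch buffer.
class ScratchSerializer {
  std::vector<uint8_t> Buffer;

public:
  // The expensive allocation happens once, at construction.
  ScratchSerializer() { Buffer.reserve(128 * 1024); }

  void write(uint8_t Byte) { Buffer.push_back(Byte); }

  // Drop the contents between runs; the allocated capacity is kept.
  void reset() { Buffer.clear(); }

  std::size_t size() const { return Buffer.size(); }
  std::size_t capacity() const { return Buffer.capacity(); }
};
```

A caller would keep one `ScratchSerializer` alive and call `reset()` before each record, so the allocation cost is paid once rather than per record.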
*	Make BinaryStreamReader::readCString a bit faster.	Zachary Turner	2017-05-25	1	-13/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously it would do a character by character search for a null terminator, to account for the fact that an arbitrary stream need not store its data contiguously so you couldn't just do a memchr. However, the stream API has a function which will return the longest contiguous chunk without doing a copy, and by using this function we can do a memchr on the individual chunks. For certain types of streams like data from object files etc, this is guaranteed to find the null terminator with only a single memchr, but even with discontiguous streams such as MappedBlockStream, it's rare that any given string will cross a block boundary, so even those will almost always be satisfied with a single memchr. This optimization is worth a 10-12% reduction in link time (4.2 seconds -> 3.75 seconds) Differential Revision: https://reviews.llvm.org/D33503 llvm-svn: 303918
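The chunk-wise search this commit describes can be sketched with a simple model of a discontiguous stream. This is an illustration under stated assumptions: `Chunks` and this `readCString` are hypothetical and do not reproduce the BinaryStreamReader API. Each inner vector stands for one contiguous chunk.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stream model: a sequence of contiguous chunks.
using Chunks = std::vector<std::vector<char>>;

// Reads a null-terminated string that may span several chunks. Each chunk is
// searched with a single memchr instead of a byte-by-byte loop. Returns false
// if no terminator is found anywhere in the stream.
bool readCString(const Chunks &Stream, std::string &Out) {
  Out.clear();
  for (const std::vector<char> &Chunk : Stream) {
    if (Chunk.empty())
      continue;
    const void *Nul = std::memchr(Chunk.data(), '\0', Chunk.size());
    if (Nul) {
      std::size_t Len = static_cast<const char *>(Nul) - Chunk.data();
      Out.append(Chunk.data(), Len);
      return true;
    }
    // No terminator in this chunk: take all of it and continue.
    Out.append(Chunk.data(), Chunk.size());
  }
  return false;
}
```

When the whole string sits inside one chunk, which the commit message says is the common case, the loop body runs once and a single `memchr` finds the terminator.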
*	[pdb] pad source file name buffer at the end instead of the beginning	Bob Haarman	2017-05-25	5	-9/+53
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: DbiStreamBuilder calculated the offset of the source file names inside the file info substream as the size of the file info substream minus the size of the file names. Since the file info substream is padded to a multiple of 4 bytes, this caused the first file name to be aligned on a 4-byte boundary. By contrast, DbiModuleList would read the file names immediately after the file name offset table, without skipping to the next 4-byte boundary. This change makes it so that the file names are written to the location where DbiModuleList expects them, and puts any necessary padding for the file info substream after the file names instead of before it. Reviewers: amccarth, rnk, zturner Reviewed By: amccarth, zturner Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33475 llvm-svn: 303917
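The offset mismatch described in this commit comes down to simple arithmetic, sketched below. The function and field names are hypothetical, and the model covers only what the message states: the old writer derived the name offset from the padded substream size, while the reader expects the names directly after the offset tables.

```cpp
#include <cstdint>

// Round N up to the next multiple of 4.
constexpr uint32_t alignTo4(uint32_t N) { return (N + 3u) & ~3u; }

// Hypothetical layout summary for the file info substream.
struct FileInfoLayout {
  uint32_t NamesOffset; // Where the file names start.
  uint32_t TotalSize;   // Padded size of the substream.
};

// Old behavior: offset = padded substream size minus size of the names, so
// any padding ends up before the names.
FileInfoLayout layoutPaddingBeforeNames(uint32_t TablesSize,
                                        uint32_t NamesSize) {
  uint32_t Total = alignTo4(TablesSize + NamesSize);
  return {Total - NamesSize, Total};
}

// New behavior: names start right after the tables; padding follows them.
FileInfoLayout layoutPaddingAfterNames(uint32_t TablesSize,
                                       uint32_t NamesSize) {
  return {TablesSize, alignTo4(TablesSize + NamesSize)};
}
```

With 20 bytes of tables and 5 bytes of names, the old layout places the names at offset 23 while a reader that starts right after the tables looks at offset 20. The new layout puts them at 20, and both layouts have the same padded size of 28.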
*	Fix a bug in MappedBlockStream.	Zachary Turner	2017-05-25	3	-52/+47
\| \| \| \| \| \| \| \| \| \|	It was using the number of blocks of the entire PDB file as the number of blocks of each stream that was created. This was only an issue in the readLongestContiguousChunk function, which was never called prior. This bug surfaced when I updated an algorithm to use this function and the algorithm broke. llvm-svn: 303916
*	[WebAssembly] MC: Include unnamed data when writing wasm files	Sam Clegg	2017-05-25	2	-18/+69
\| \| \| \| \| \| \| \| \| \| \| \|	Also, include global entries for all data symbols, not just external ones, since these are referenced by the relocation records. Add a test case that includes unnamed data. Differential Revision: https://reviews.llvm.org/D33079 llvm-svn: 303915
*	[CodeView Type Merging] Avoid record deserialization when possible.	Zachary Turner	2017-05-25	10	-148/+288
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A profile shows the majority of time doing type merging is spent deserializing records from sequences of bytes into friendly C++ structures that we can easily access members of in order to find the type indices to re-write. Records are prefixed with their length, however, and most records have type indices that appear at fixed offsets in the record. For these records, we can save some cycles by just looking at the right place in the byte sequence and re-writing the value, then skipping the record in the type stream. This saves us from the costly deserialization of examining every field, including potentially null terminated strings which are the slowest, even though it was unnecessary to begin with. In addition, we apply another optimization. Previously, after deserializing a record and re-writing its type indices, we would unconditionally re-serialize it in order to compute the hash of the re-written record. This would result in an alloc and memcpy for every record. If no type indices were re-written, however, this was an unnecessary allocation. In this patch re-writing is made two phase. The first phase discovers the indices that need to be rewritten and their new values. This information is passed through to the de-duplication code, which only copies and re-writes type indices in the serialized byte sequence if at least one type index is different. Some records have type indices which only appear after variable length strings, or which have lists of type indices, or various other situations that can make it tricky to make this optimization. While I'm not giving up on optimizing these cases as well, for now we can get the easy cases out of the way and lay the groundwork for more complicated cases later. This patch yields another 50% speedup on top of the already large speedups submitted over the past 2 days. 
In two tests I have run, I went from 9 seconds to 3 seconds, and from 16 seconds to 8 seconds. Differential Revision: https://reviews.llvm.org/D33480 llvm-svn: 303914
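The fixed-offset rewriting described in this commit can be sketched on a raw byte buffer. This is a minimal model under stated assumptions, not the CodeView merging code: `remapTypeIndices` is a hypothetical helper, type indices are treated as 32-bit values in host byte order, and the offsets are supplied by the caller.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <unordered_map>
#include <vector>

// Rewrites 32-bit type indices found at known byte offsets inside a
// serialized record, without deserializing the record. Returns true only if
// at least one index actually changed, so callers can skip copying or
// re-hashing untouched records.
bool remapTypeIndices(std::vector<uint8_t> &Record,
                      const std::vector<std::size_t> &Offsets,
                      const std::unordered_map<uint32_t, uint32_t> &IndexMap) {
  bool Changed = false;
  for (std::size_t Off : Offsets) {
    if (Off + sizeof(uint32_t) > Record.size())
      continue; // Offset does not fit in this record; ignore it.
    uint32_t Old;
    std::memcpy(&Old, Record.data() + Off, sizeof(Old));
    auto It = IndexMap.find(Old);
    if (It == IndexMap.end() || It->second == Old)
      continue; // Nothing to rewrite at this offset.
    std::memcpy(Record.data() + Off, &It->second, sizeof(uint32_t));
    Changed = true;
  }
  return Changed;
}
```

The boolean result corresponds to the two-phase idea in the message: work such as allocation and copying is needed only when some index differs from its remapped value.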
*	Update the documentation and CMake file for Visual Studio generators.	Aaron Ballman	2017-05-25	2	-0/+11
\| \| \| \| \| \|	By default, CMake uses a 32-bit toolchain, even when on a 64-bit platform targeting a 64-bit build. However, due to the size of the binaries involved, this can cause linker instabilities (such as the linker running out of memory). Guide people to the correct solution to get CMake to use the native toolchain. llvm-svn: 303912
*	PPC: Correct Size for GETtlsADDR	Kyle Butt	2017-05-25	1	-1/+3
\| \| \| \| \| \| \| \| \|	PPC::GETtlsADDR is lowered to a branch and a nop, by the assembly printer. Its size was incorrectly marked as 4, correct it to 8. The incorrect size can cause incorrect branch relaxation in PPCBranchSelector under the right conditions. llvm-svn: 303904
*	Revert r303859, CodeGen/AMDGPU/llvm.amdgcn.s.getpc.ll fails on bots.	Nico Weber	2017-05-25	3	-28/+1
\| \| \| \|	llvm-svn: 303902