bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	Make helper functions static. NFC.	Benjamin Kramer	2017-05-26	8	-12/+22
	llvm-svn: 304029
*	Fix the ManagedStatic list ordering when using ↵	Frederich Munch	2017-05-26	1	-3/+8
	DynamicLibrary::addPermanentLibrary. Summary: r295737 included a fix for leaking libraries loaded via DynamicLibrary::addPermanentLibrary. This created a problem where static constructors in a library could insert llvm::ManagedStatic objects before DynamicLibrary would register its own ManagedStatic, meaning a crash could occur at shutdown. r301562 exacerbated this problem by cleaning up the DynamicLibrary ManagedStatic during llvm_shutdown. Reviewers: v.g.vassilev, lhames, efriedma Reviewed By: efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33581 llvm-svn: 304027
*	[InstSimplify] Move a variable declaration to make simplifyAndOfICmps look ↵	Craig Topper	2017-05-26	1	-1/+1
	more like simplifyOrOfICmps. NFC llvm-svn: 304023
*	[InstSimplify] Use commutable matchers to shorten some code	Craig Topper	2017-05-26	1	-13/+5
	This code was replicated two additional times to handle commuted cases, but I think a commutable matcher can take care of it. Differential Revision: https://reviews.llvm.org/D33585 llvm-svn: 304022
*	[InstSimplify] Use m_APInt instead of m_ConstantInt in ((V + N) & C1) | (V & ↵	Craig Topper	2017-05-26	1	-10/+10
	C2) handling in order to support splat vectors. The tests here have operands commuted to provide more coverage. I also commuted one of the instructions in the scalar tests so the 4 tests cover the 4 commuted variations. Differential Revision: https://reviews.llvm.org/D33599 llvm-svn: 304021
*	DebugInfo: Do not emit empty CUs	David Blaikie	2017-05-26	2	-15/+30
	Consistent with GCC and addresses a shortcoming with ThinLTO where many imported CUs may end up being empty (because the functions imported from them either ended up not being used (and were then discarded, since they're imported as available_externally) or optimized away entirely).
	Test cases previously testing empty CUs (either intentionally, or because they didn't need anything more complicated) had a trivial 'int' or similar basic type added to their retained types list.
	This is a first order approximation - a deeper implementation could do things like:
	1) Be more lazy about construction of the CU - for example if two CUs containing a single identical retained type are linked together, with this change one of the two CUs will be produced but empty (since a duplicate type won't be produced).
	2) Go further and invert all the CU links the same way the subprogram link is inverted - keep named CU lists of retained types, macros, etc, and have those link back to the CU. Then if they're emitted, the CU is emitted, but never otherwise - this would allow the metadata itself to be dropped earlier too, though it seems unlikely that's an important optimization as there shouldn't be many CUs relative to the number of other entities.
	llvm-svn: 304020
*	PMB: Run the whole-program-devirt pass during LTO at --lto-O0.	Peter Collingbourne	2017-05-26	1	-0/+6
	The whole-program-devirt pass needs to run at -O0 because only it knows about the llvm.type.checked.load intrinsic: it needs to both lower the intrinsic itself and handle it in the summary. Differential Revision: https://reviews.llvm.org/D33571 llvm-svn: 304019
*	[InstCombine] Pass the DominatorTree, AssumptionCache, and context ↵	Craig Topper	2017-05-26	3	-4/+7
	instruction to a few calls to isKnownPositive, isKnownNegative, and isKnownNonZero Every other place in InstCombine that uses these methods in ValueTracking already passes this information. This makes the remaining sites consistent. Differential Revision: https://reviews.llvm.org/D33567 llvm-svn: 304018
*	[AMDGPU][MC][GFX9] Corrected encoding of flat_scratch* for SDWA opcodes	Dmitry Preobrazhensky	2017-05-26	1	-1/+1
	See bug 33171: https://bugs.llvm.org/show_bug.cgi?id=33171 Reviewers: Sam Kolton Differential Revision: https://reviews.llvm.org/D33553 llvm-svn: 304015
*	Revert r304002 "[DWARF] - Make collectAddressRanges() return section index ↵	George Rimar	2017-05-26	7	-61/+34
	in addition to Low/High PC" Revert it again. Now another bot is unhappy: http://lab.llvm.org:8011/builders/clang-s390x-linux/builds/8750 llvm-svn: 304011
*	DebugInfo: Don't include locations for debug-having code inlined into ↵	David Blaikie	2017-05-26	1	-0/+4
	nodebug functions This produced 'strange' DWARF anyway - the CU would have no ranges (or at least not a range including the inlined code) nor any subprogram or inlined_subroutine - yet the line table would have entries for these instructions. (this actually becomes more relevant with changes coming after this, where a CU without any contents will be omitted entirely - so there would be no line table to put this on anyway) llvm-svn: 304004
*	AMDGPU/GlobalISel: Mark 32-bit float constants as legal	Tom Stellard	2017-05-26	2	-0/+6
	Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D33212 llvm-svn: 304003
*	[DWARF] - Make collectAddressRanges() return section index in addition to ↵	George Rimar	2017-05-26	7	-34/+61
	Low/High PC This change is intended for use by LLD in D33183. The problem we have in LLD when building .gdb_index is that we need to know which section an address range belongs to. Previously this was solved on the LLD side by providing fake section addresses through the llvm::LoadedObjectInfo interface. We assigned file offsets as addresses. Then, after obtaining the range lists, we had to find the section ID for each range. That was not only slow, but also complicated the implementation and caused incorrect behavior when sections share the same offsets, as D33176 shows. This patch makes the DWARF parsers return the section index as well, which solves the problem mentioned above. Differential revision: https://reviews.llvm.org/D33184 llvm-svn: 304002
*	LivePhysRegs: Fix addLiveOutsNoPristines() for return blocks past PEI	Matthias Braun	2017-05-26	1	-31/+48
	Re-commit r303938 and r303954 with a fix for addLiveIns(): the internal addPristines() function must be called on an empty set or it may accidentally reset saved registers.
	- addLiveOutsNoPristines() needs to add callee saved registers that are actually saved and restored somewhere to the set (they are not pristine).
	- Cleanup/rewrite the code for addLiveOuts()/addLiveOutsNoPristines().
	This fixes the problem from D32156. Differential Revision: https://reviews.llvm.org/D32464 llvm-svn: 304001
*	[AMDGPU] SDWA: add disassembler support for GFX9	Sam Kolton	2017-05-26	5	-31/+113
	Summary: Added decoder methods and tests Reviewers: vpykhtin, artem.tamazov, dp Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D33545 llvm-svn: 303999
*	[DAGCombiner] use narrow vector ops to eliminate concat/extract (PR32790)	Sanjay Patel	2017-05-26	1	-0/+96
	In the best case:
	extract (binop (concat X1, X2), (concat Y1, Y2)), N --> binop XN, YN
	...we kill all of the extract/concat and just have narrow binops remaining.
	If only one of the binop operands is amenable, this transform is still worthwhile because we kill some of the extract/concat. Optional bitcasting makes the code more complicated, but there doesn't seem to be a way to avoid that.
	The TODO about extending to more than bitwise logic is there because we really will regress several x86 tests including madd, psad, and even a plain integer-multiply-by-2 or shift-left-by-1. I don't think there's anything fundamentally wrong with this patch that would cause those regressions; those folds are just missing or brittle.
	If we extend to more binops, I found that this patch will fire on at least one non-x86 regression test. There's an ARM NEON test in test/CodeGen/ARM/coalesce-subregs.ll with a pattern like:
	t5: v2f32 = vector_shuffle<0,3> t2, t4
	t6: v1i64 = bitcast t5
	t8: v1i64 = BUILD_VECTOR Constant:i64<0>
	t9: v2i64 = concat_vectors t6, t8
	t10: v4f32 = bitcast t9
	t12: v4f32 = fmul t11, t10
	t13: v2i64 = bitcast t12
	t16: v1i64 = extract_subvector t13, Constant:i32<0>
	There was no functional change in the codegen from this transform from what I could see though.
	For the x86 test changes:
	1. PR32790() is the closest call. We don't reduce the AVX1 instruction count in that case, but we improve throughput. Also, on a core like Jaguar that double-pumps 256-bit ops, there's an unseen win because two 128-bit ops have the same cost as the wider 256-bit op. SSE/AVX2/AVX512 are not affected which is expected because only AVX1 has the extract/concat ops to match the pattern.
	2. do_not_use_256bit_op() is the best case. Everyone wins by avoiding the concat/extract. Related bug for IR filed as: https://bugs.llvm.org/show_bug.cgi?id=33026
	3. The SSE diffs in vector-trunc-math.ll are just scheduling/RA, so nothing real AFAICT.
	4. The AVX1 diffs in vector-tzcnt-256.ll are all the same pattern: we reduced the instruction count by one in each case by eliminating two insert/extract while adding one narrower logic op.
	https://bugs.llvm.org/show_bug.cgi?id=32790 Differential Revision: https://reviews.llvm.org/D33137 llvm-svn: 303997
*	[DAG] Move legal type checks in store merge to be checked only	Nirav Dave	2017-05-26	1	-2/+4
	on non-legal cases. NFC. llvm-svn: 303994
*	[ARM] Fix lowering of misaligned memcpy/memset	John Brawn	2017-05-26	2	-18/+12
	Currently getOptimalMemOpType returns i32 for large enough sizes without checking for alignment, leading to poor code generation when misaligned accesses aren't permitted as we generate a word store then later split it up into byte stores. This means we inadvertently go over the MaxStoresPerMemcpy limit and for memset we splat the memset value into a word then immediately split it up again. Fix this by leaving it up to FindOptimalMemOpLowering to figure out which type to use, but also fix a bug there where it wasn't correctly checking if misaligned memory accesses are allowed. Differential Revision: https://reviews.llvm.org/D33442 llvm-svn: 303990
*	The fix for PR22004: X86AsmParser.cpp asserts: OperandStack.size() > 1 && ↵	Andrew V. Tischenko	2017-05-26	1	-2/+3
	"Too few operands." llvm-svn: 303985
*	Revert "[DWARF] - Make collectAddressRanges() return section index in ↵	George Rimar	2017-05-26	7	-59/+34
	addition to Low/High PC" Broke BB again: TEST 'LLVM :: DebugInfo/X86/dbg-value-regmask-clobber.ll' FAILED ... LLVM ERROR: Section was outside of section table. llvm-svn: 303984
*	Recommit r303978 "[DWARF] - Make collectAddressRanges() return section index ↵	George Rimar	2017-05-26	7	-34/+59
	in addition to Low/High PC" With fix of test compilation. Initial commit message: This change is intended for use by LLD in D33183. The problem we have in LLD when building .gdb_index is that we need to know which section an address range belongs to. Previously this was solved on the LLD side by providing fake section addresses through the llvm::LoadedObjectInfo interface. We assigned file offsets as addresses. Then, after obtaining the range lists, we had to find the section ID for each range. That was not only slow, but also complicated the implementation and caused incorrect behavior when sections share the same offsets, as D33176 shows. This patch makes the DWARF parsers return the section index as well, which solves the problem mentioned above. Differential revision: https://reviews.llvm.org/D33184 llvm-svn: 303983
*	Revert r303978 "[DWARF] - Make collectAddressRanges() return section index ↵	George Rimar	2017-05-26	7	-59/+34
	in addition to Low/High PC" It failed BB. llvm-svn: 303981
*	Fix signedness of constant. NFC.	Nirav Dave	2017-05-26	1	-5/+5
	llvm-svn: 303980
*	[DWARF] - Make collectAddressRanges() return section index in addition to ↵	George Rimar	2017-05-26	7	-34/+59
	Low/High PC This change is intended for use by LLD in D33183. The problem we have in LLD when building .gdb_index is that we need to know which section an address range belongs to. Previously this was solved on the LLD side by providing fake section addresses through the llvm::LoadedObjectInfo interface. We assigned file offsets as addresses. Then, after obtaining the range lists, we had to find the section ID for each range. That was not only slow, but also complicated the implementation and caused incorrect behavior when sections share the same offsets, as D33176 shows. This patch makes the DWARF parsers return the section index as well, which solves the problem mentioned above. Differential revision: https://reviews.llvm.org/D33184 llvm-svn: 303978
*	Re-enable "[SCEV] Do not fold dominated SCEVUnknown into AddRecExpr start"	Max Kazantsev	2017-05-26	1	-2/+59
	The patch rL303730 was reverted because test lsr-expand-quadratic.ll failed on many non-X86 configs with this patch. The reason for this is that the patch makes a correctness fix that changes the optimizer's behavior for this test. Without the change, LSR was making an overconfident simplification based on a wrong SCEV. Apparently it did not need the IV analysis to do this. With the change, it chose a different way to simplify (that wasn't so confident), and this way required the IV analysis. Now, following the right execution path, LSR tries to make a transformation relying on IV Users analysis. This analysis is target-dependent due to this code:
	// LSR is not APInt clean, do not touch integers bigger than 64-bits.
	// Also avoid creating IVs of non-native types. For example, we don't want a
	// 64-bit IV in 32-bit code just because the loop has one 64-bit cast.
	uint64_t Width = SE->getTypeSizeInBits(I->getType());
	if (Width > 64 || !DL.isLegalInteger(Width))
	  return false;
	To make a proper transformation in this test case, the type i32 needs to be legal for the specified data layout. When the test runs on some non-X86 configuration (e.g. pure ARM 64), opt gets confused by the specified target and does not use it, rejecting the specified data layout as well. Instead, it uses some default layout that does not treat i32 as a legal type (currently the layout that is used when it is not specified does not have legal types at all). As a result, the transformation we expect to happen does not happen for this test.
	This re-enabling patch does not have any source code changes compared to the original patch rL303730. The only difference is that the failing test is moved to the X86 directory and now has the requirement of running on x86 only, to comply with the specified target triple and data layout.
	Differential Revision: https://reviews.llvm.org/D33543 llvm-svn: 303971
*	LivePhysRegs: Skip reserved regs in computeLiveIns; NFCI	Matthias Braun	2017-05-26	4	-6/+12
	Re-commit r303937 + r303949 as they were not the cause for the build failures. We do not track liveness of reserved registers so adding them to the liveins list in computeLiveIns() was completely unnecessary. llvm-svn: 303970
*	Revert rL303923 since it broke the sanitizer bootstrap build bot.	Wei Mi	2017-05-26	1	-136/+21
	llvm-svn: 303969
*	[InstSimplify] Use APInt::isMask instead of manually implementing it. NFC	Craig Topper	2017-05-26	1	-2/+2
	llvm-svn: 303968
*	[InstSimplify] Use m_ConstantInt matchers to shorten some code. NFC	Craig Topper	2017-05-26	1	-7/+5
	llvm-svn: 303967
*	[IR] Add an iterator and range accessor for the PHI nodes of a basic	Chandler Carruth	2017-05-26	1	-7/+9
	block. This allows writing much more natural and readable range based for loops directly over the PHI nodes. It also takes advantage of the same tricks for terminating the sequence as the hand coded versions. I've replaced one example of this mostly to showcase the difference and I've added a unit test to make sure the facilities really work the way they're intended. I want to use this inside of SimpleLoopUnswitch but it seems generally nice. Differential Revision: https://reviews.llvm.org/D33533 llvm-svn: 303964
*	Revert "LivePhysRegs: Fix addLiveOutsNoPristines() for return blocks past PEI"	Matthias Braun	2017-05-26	1	-41/+27
	Tentatively revert this to see if it fixes the buildbot stage2 breakages. This reverts commit r303938. This reverts commit r303954. llvm-svn: 303960
*	Revert "LivePhysRegs: Skip reserved regs in computeLiveIns; NFCI"	Matthias Braun	2017-05-26	4	-12/+6
	Tentatively revert, suspecting that it caused breakage in stage2 buildbots. This reverts commit r303949. This reverts commit r303937. llvm-svn: 303955
*	[PM] Enable the new simple loop unswitch pass in the new pass manager	Chandler Carruth	2017-05-26	1	-4/+1
	(where it is the only realistic option). This passes the LLVM test suite for me, but I'm clearly still hammering on this. llvm-svn: 303952
*	LivePhysRegs: Follow-up to r303937	Matthias Braun	2017-05-26	1	-1/+1
	We may have situations in which a superregister is reserved and not added to liveins, so we have to add the subregisters. llvm-svn: 303949
*	Remove unused member.	Zachary Turner	2017-05-25	1	-2/+0
	llvm-svn: 303942
*	[PPC] Add text for assert.	Tim Shen	2017-05-25	1	-1/+1
	llvm-svn: 303940
*	LTO: Do summary-based prevailing symbol resolution at --lto-O0.	Peter Collingbourne	2017-05-25	1	-13/+12
	Prevailing symbol resolution is necessary for correctness. Without this we can end up dropping a referenced linkonce symbol from the link. Differential Revision: https://reviews.llvm.org/D33570 llvm-svn: 303939
*	LivePhysRegs: Fix addLiveOutsNoPristines() for return blocks past PEI	Matthias Braun	2017-05-25	1	-27/+41
	- addLiveOutsNoPristines() needs to add callee saved registers that are actually saved and restored somewhere to the set (they are not pristine).
	- Cleanup/rewrite the code for addLiveOuts()/addLiveOutsNoPristines().
	This fixes the problem from D32156. Differential Revision: https://reviews.llvm.org/D32464 llvm-svn: 303938
*	LivePhysRegs: Skip reserved regs in computeLiveIns; NFCI	Matthias Braun	2017-05-25	4	-5/+11
	We do not track liveness of reserved registers so adding them to the liveins list in computeLiveIns() was completely unnecessary. llvm-svn: 303937
*	[CV Type Merging] Find nested type indices faster.	Zachary Turner	2017-05-25	3	-347/+413
	Merging two type streams is one of the most time consuming parts of generating a PDB, and as such it needs to be as fast as possible. The visitor abstractions used for interoperating nicely with many different types of inputs and outputs have been used widely and help greatly for testability and implementing tools, but the abstractions build up and get in the way of performance.
	This patch removes all of the visitation stuff from the type stream merger, essentially re-inventing the leaf / member switch and loop, but at a very low level. This allows us many other optimizations, such as not actually deserializing any records (even member records which don't describe their own length), as the operation of "figure out how long this record is" is somewhat faster than "figure out how long this record is and get all its fields out". Furthermore, whereas before we had to deserialize, re-write type indices, then re-serialize, now we don't have to do any of those 3 steps. We just find out where the type indices are and pull them directly out of the byte stream and re-write them.
	This is worth a 50-60% performance increase. On top of all other optimizations that have been applied this week, I now get the following numbers when linking lld.exe and lld.pdb
	MSVC: 25.67s
	Before This Patch: 18.59s
	After This Patch: 8.92s
	So this is a huge performance win. Differential Revision: https://reviews.llvm.org/D33564 llvm-svn: 303935
*	DebugInfo: Simplify scopes+subprogram handling since the subprogram<>cu link ↵	David Blaikie	2017-05-25	1	-18/+8
	inversion Previously this code was defensive to the situation in which the debug info scopes would lead to a different subprogram from the subprogram in the CU's subprogram list (this could've happened with linkonce functions, etc as per the comment being removed). Since the CU<>SP link reversal this is no longer possible. llvm-svn: 303933
*	[PPC] Fix atomics lowering in DAG lowering.	Tim Shen	2017-05-25	1	-1/+3
	I forgot to forward the chain, causing some missing instruction dependencies. The test crashes the compiler without this patch. Inspired by the test case, D33519 also tries to remove the extra sync. Differential Revision: https://reviews.llvm.org/D33573 llvm-svn: 303931
*	[InstCombine] Add an InstCombine specific wrapper around ↵	Craig Topper	2017-05-25	4	-14/+14
	isKnownToBeAPowerOfTwo to shorten code. NFC We have wrappers for several other ValueTracking methods that take care of passing all of the analysis and assumption cache parameters. This extends it to isKnownToBeAPowerOfTwo. llvm-svn: 303924
*	[GVN] Add phi-translate support in scalarpre.	Wei Mi	2017-05-25	1	-21/+136
	Right now scalarpre doesn't have phi-translate support, so it will miss some simple pre opportunities. In the following testcase, current scalarpre cannot recognize that the last "a * b" is fully redundant because a and b used by the last "a * b" expr are both defined by phis.
	long a[100], b[100], g1, g2, g3;
	__attribute__((pure)) long goo();
	void foo(long a, long b, long c, long d) {
	  g1 = a * b;
	  if (__builtin_expect(g2 > 3, 0)) {
	    a = c;
	    b = d;
	    g2 = a * b;
	  }
	  g3 = a * b; // fully redundant.
	}
	The patch adds phi-translate support in scalarpre. This is only a temporary solution before the newpre based on newgvn is available. Differential Revision: https://reviews.llvm.org/D32252 llvm-svn: 303923
*	Add constrained intrinsics for some libm-equivalent operations	Andrew Kaylor	2017-05-25	7	-64/+233
	Differential revision: https://reviews.llvm.org/D32319 llvm-svn: 303922
*	CodeGen: Rename DEBUG_TYPE to match passnames	Matthias Braun	2017-05-25	51	-128/+112
	Rename the DEBUG_TYPE to match the names of corresponding passes where it makes sense. Also establish the pattern of simply referencing DEBUG_TYPE instead of repeating the passname where possible. llvm-svn: 303921
*	[lld] Fix a bug where we continually re-follow type servers.	Zachary Turner	2017-05-25	1	-5/+6
	Originally this was intended to be set up so that when linking a PDB which refers to a type server, it would only visit the PDB once, and on subsequent visitations it would just skip it since all the records had already been added. Due to some C++ scoping issues, this was not occurring and it was revisiting the type server every time, which caused every record to end up being thrown away on all subsequent visitations. This doesn't affect the performance of linking clang-cl generated object files because we don't use type servers, but when linking object files and libraries generated with /Zi via MSVC, this means only 1 object file has to be linked instead of N object files, so the speedup is quite large. llvm-svn: 303920
*	[CodeView Type Merging] Don't keep re-allocating temp serializer.	Zachary Turner	2017-05-25	3	-11/+21
	Previously, every time we wanted to serialize a field list record, we would create a new copy of FieldListRecordBuilder, which would in turn create a temporary instance of TypeSerializer, which itself had a std::vector<> that was about 128K in size. So this 128K allocation was happening every time. We can re-use the same instance over and over, we just have to clear its internal hash table and seen records list between each run. This saves us from the constant re-allocations. This is worth an ~18.5% speed increase (3.75s -> 3.05s) in my tests. Differential Revision: https://reviews.llvm.org/D33506 llvm-svn: 303919
*	Make BinaryStreamReader::readCString a bit faster.	Zachary Turner	2017-05-25	1	-13/+14
	Previously it would do a character by character search for a null terminator, to account for the fact that an arbitrary stream need not store its data contiguously so you couldn't just do a memchr. However, the stream API has a function which will return the longest contiguous chunk without doing a copy, and by using this function we can do a memchr on the individual chunks. For certain types of streams like data from object files etc, this is guaranteed to find the null terminator with only a single memchr, but even with discontiguous streams such as MappedBlockStream, it's rare that any given string will cross a block boundary, so even those will almost always be satisfied with a single memchr. This optimization is worth a 10-12% reduction in link time (4.2 seconds -> 3.75 seconds). Differential Revision: https://reviews.llvm.org/D33503 llvm-svn: 303918
*	[pdb] pad source file name buffer at the end instead of the beginning	Bob Haarman	2017-05-25	1	-9/+16
	Summary: DbiStreamBuilder calculated the offset of the source file names inside the file info substream as the size of the file info substream minus the size of the file names. Since the file info substream is padded to a multiple of 4 bytes, this caused the first file name to be aligned on a 4-byte boundary. By contrast, DbiModuleList would read the file names immediately after the file name offset table, without skipping to the next 4-byte boundary. This change makes it so that the file names are written to the location where DbiModuleList expects them, and puts any necessary padding for the file info substream after the file names instead of before it. Reviewers: amccarth, rnk, zturner Reviewed By: amccarth, zturner Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33475 llvm-svn: 303917
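
The writer/reader mismatch described in the last entry (r303917) comes down to where the padding goes. The sketch below is a simplified model, not the DbiStreamBuilder or DbiModuleList code: `align_to`, `names_offset_old`, `names_offset_new`, `prefix_size`, and the sample sizes are all hypothetical. Only the 4-byte padding rule, the "substream size minus names size" calculation, and the "names immediately after the offset table" reader behaviour are taken from the commit message.

```python
def align_to(n: int, alignment: int = 4) -> int:
    """Round n up to the next multiple of alignment."""
    return (n + alignment - 1) // alignment * alignment


def names_offset_old(prefix_size: int, names_size: int) -> int:
    # Old writer (as described in the message): the offset is the padded
    # substream size minus the names size, so padding lands before the names.
    return align_to(prefix_size + names_size) - names_size


def names_offset_new(prefix_size: int, names_size: int) -> int:
    # New writer: names start right after the file name offset table;
    # any padding for the substream follows the names.
    return prefix_size


# Hypothetical sizes: 16 bytes of counts and tables, then two names.
prefix_size = 16
names = b"a.cpp\0b.h\0"        # 6 + 4 = 10 bytes
reader_expects = prefix_size   # the reader does not skip any padding

old = names_offset_old(prefix_size, len(names))
new = names_offset_new(prefix_size, len(names))
print(old, new, reader_expects)
```

With these made-up sizes the old calculation places the names 2 bytes past the point where the reader starts, while the new one agrees with the reader; the two only coincide when no padding is needed.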