bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	test: Always treat .mir files as tests even outside of CodeGen/MIR	Matthias Braun	2016-04-04	5	-4/+4
	We missed a handful of .mir tests that existed outside the test/CodeGen/MIR directory. Also fix the three powerpc .mir tests that nobody noticed were broken. llvm-svn: 265350
*	Re-commit r265039 "[X86] Merge adjacent stack adjustments in eliminateCallFramePseudoInstr (PR27140)"	Hans Wennborg	2016-04-04	8	-18/+77
	The original commit miscompiled things on 32-bit Windows, e.g. a Clang bootstrap. It turns out that mergeSPUpdates() was a bit too generous in what it interpreted as a stack adjustment, causing the following code:
		addl $12, %esp
		leal -4(%ebp), %esp
	To be "optimized" into simply:
		addl $8, %esp
	This commit tightens up mergeSPUpdates() and includes a new test (test14 in movtopush.ll) for this situation. llvm-svn: 265345
*	Enable unroll for constant bound loops when TripCount is not a multiple of the unroll factor, reducing the factor to the maximum power-of-2 that satisfies the threshold limit.	Zia Ansari	2016-04-04	1	-0/+29
	Commit for Evgeny Stupachenko (evstupac@gmail.com) Differential Revision: http://reviews.llvm.org/D18290 llvm-svn: 265337
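The "maximum power-of-2 that satisfies the threshold limit" step can be pictured with a small sketch. This is not the unroller's actual code: the function name, the parameters, and the cost model (unrolled size = loop size times count) are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not LLVM's implementation: returns the largest
// power-of-two unroll count that is <= maxCount and keeps
// loopSize * count <= threshold. Returns 1 (no unrolling) if nothing fits.
std::uint32_t largestPow2UnrollCount(std::uint32_t loopSize,
                                     std::uint32_t threshold,
                                     std::uint32_t maxCount) {
  assert(loopSize > 0 && "loop must contain at least one instruction");
  std::uint32_t count = 1;
  // Keep doubling while the doubled count still respects both limits.
  while (count <= maxCount / 2 &&
         static_cast<std::uint64_t>(loopSize) * (count * 2) <= threshold) {
    count *= 2;
  }
  return count;
}
```

Under these assumptions, a loop of size 10 with a threshold of 70 and a cap of 8 gets a count of 4, because a count of 8 would produce an unrolled size of 80.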
*	Fix bot errors from r265327, exact GUID which depends on path	Teresa Johnson	2016-04-04	1	-34/+33
	E.g. http://bb.pgr.jp/builders/ninja-x64-msvc-RA-centos6/builds/21919 The source file path name will affect exact GUID, don't try to match exact value. llvm-svn: 265334
*	Beef up some dllexport tests.	Sean Silva	2016-04-04	1	-1/+23
	Adds some dllexport tests to verify that:
	- Variables in bss are exported appropriately
	- Non-dllexport symbols aliased to dllexport symbols are not exported
	- Symbols declared as dllexport but not defined are not exported
	We plan to enable dllimport/dllexport support for the PS4, and these additional tests are for points we noticed in our internal testing. Patch by Warren Ristow! Differential Revision: http://reviews.llvm.org/D18682 llvm-svn: 265333
*	[PGO] Avoid instrumenting direct callees at value sites.	Betul Buyukkurt	2016-04-04	1	-0/+10
	Direct callees that are cast to other function prototypes show up in the Call/Invoke instructions as ConstantExprs. Currently llvm::CallSite's getCalledFunction() fails to return the callees in such expressions as direct calls. Value profiling should avoid instrumenting such cases. Mostly NFC. llvm-svn: 265330
*	ARM, AArch64, X86: Check preserved registers for tail calls.	Matthias Braun	2016-04-04	2	-0/+46
	We can only perform a tail call to a callee that preserves all the registers that the caller needs to preserve. This situation happens with calling conventions like preserve_mostcc or cxx_fast_tls. It was explicitly handled for fast_tls and failing for preserve_most. This patch generalizes the check to any calling convention. Related to rdar://24207743 Differential Revision: http://reviews.llvm.org/D18680 llvm-svn: 265329
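The rule described above is a subset test: every register the caller must preserve has to be preserved by the callee too. A minimal sketch follows, assuming a made-up 64-register machine where bit i of a mask means "register i is preserved". The type and function names are hypothetical and are not the backend's API.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>

// Hypothetical model: bit i set means register i is preserved across a call.
using PreservedMask = std::uint64_t;

// Builds a mask from a list of register indices (illustrative helper).
PreservedMask maskOf(std::initializer_list<unsigned> regs) {
  PreservedMask mask = 0;
  for (unsigned reg : regs) {
    assert(reg < 64 && "this model only covers 64 registers");
    mask |= PreservedMask{1} << reg;
  }
  return mask;
}

// A tail call is only safe if no register the caller must preserve is
// missing from the callee's preserved set.
bool calleePreservesCallerRegs(PreservedMask callerPreserved,
                               PreservedMask calleePreserved) {
  return (callerPreserved & ~calleePreserved) == 0;
}
```

In this model a caller that must preserve registers {1, 2, 4} cannot tail-call a callee that only preserves {1, 2, 3}, since register 4 could be clobbered with no frame left to restore it.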
*	[ThinLTO] Add option to dump value name to GUID mapping	Teresa Johnson	2016-04-04	1	-1/+36
	Summary: Useful for debugging since we lose this correlation after the per-module summary/VST is read and until we later materialize source modules in the function importer. Reviewers: joker.eph Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D18555 llvm-svn: 265327
*	[DependenceAnalysis] Check if result of getConstantPart is null	Brendon Cahoon	2016-04-04	1	-0/+73
	A seg-fault occurs due to a reference of a null pointer, which is the value returned by getConstantPart. This function returns null if the constant part is not found. The code that calls this function needs to check for the null return value. Differential Revision: http://reviews.llvm.org/D18718 llvm-svn: 265319
*	Revert r265309 and r265312 because they caused some errors I need to investigate.	Wei Mi	2016-04-04	5	-200/+518
	llvm-svn: 265317
*	Add MachineFunctionProperty checks for AllVRegsAllocated for target passes	Derek Schuff	2016-04-04	1	-0/+1
	Summary: This adds the same checks that were added in r264593 to all target-specific passes that run after register allocation. Reviewers: qcolombet Subscribers: jyknight, dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D18525 llvm-svn: 265313
*	Replace analyzeSiblingValues with new algorithm to fix its compile time issue.	Wei Mi	2016-04-04	5	-518/+200
	The patch is to solve PR17409 and its duplicates. analyzeSiblingValues is an N x N complexity algorithm where N is the number of siblings generated by reg splitting. Although it causes a significant compile time issue when N is large, it is also important for performance since it removes redundant spills and enables rematerialization. To solve the compile time issue, the patch removes analyzeSiblingValues and replaces it with lower cost alternatives containing two parts. The first part creates a new spill hoisting method in postOptimization of register allocation. It does spill hoisting at once after all the spills are generated instead of inside every instance of selectOrSplit. The second part queries the define expr of the original register for rematerialization and keeps it always available during register allocation even if it is already dead. It deletes those dead instructions only in postOptimization. With the two parts in the patch, it can remove analyzeSiblingValues without sacrificing performance. Differential Revision: http://reviews.llvm.org/D15302 llvm-svn: 265309
*	[mips] Range check simm32 and fold MIPS16's imm32 into simm32.	Daniel Sanders	2016-04-04	1	-12/+6
	Summary: At this point we should be able to enable IAS by default for O32 without breaking check-all, or recursion. Reviewers: vkalintiris Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D18439 llvm-svn: 265302
*	[SystemZ] Add compare-and-branch instructions to MC	Ulrich Weigand	2016-04-04	2	-0/+1176
	This adds MC support for fused compare + indirect branch instructions, i.e. CRB, CGRB, CLRB, CLGRB, CIB, CGIB, CLIB, CLGIB. They aren't actually generated yet -- this is preparation for their use for conditional returns in the next iteration of D17339. Author: koriakin Differential Revision: http://reviews.llvm.org/D18742 llvm-svn: 265296
*	[SystemZ] Support ATOMIC_FENCE	Ulrich Weigand	2016-04-04	2	-0/+29
	A cross-thread sequentially consistent fence should be lowered into z/Architecture's BCR serialization instruction, instead of causing a fatal error in the back-end. Author: bryanpkc Differential Revision: http://reviews.llvm.org/D18644 llvm-svn: 265292
*	[SystemZ] Support llvm.frameaddress/llvm.returnaddress intrinsics	Ulrich Weigand	2016-04-04	2	-0/+43
	Enable the SystemZ back-end to lower FRAMEADDR and RETURNADDR, which previously would cause the back-end to crash. Currently, only a frame count of zero is supported. Author: bryanpkc Differential Revision: http://reviews.llvm.org/D18514 llvm-svn: 265291
*	AVX-512: Truncating store for i1 vectors	Elena Demikhovsky	2016-04-04	2	-416/+170
	Implemented truncstore for KNL and skylake-avx512. Covered vectors from v2i1 to v64i1. We save the value in bits (not in bytes) - v32i1 is saved in 4 bytes. Differential Revision: http://reviews.llvm.org/D18740 llvm-svn: 265283
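Storing "in bits (not in bytes)" means each i1 element occupies one bit, so 32 elements fit in 4 bytes. The sketch below illustrates that packing in plain C++. The bit order (element i in bit i % 8 of byte i / 8) and the rounding up to whole bytes for short vectors are assumptions of this example, not something the commit message states.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Packs one bit per i1 element. Assumed layout (not taken from the commit):
// element i lands in bit (i % 8) of byte (i / 8); a partial last byte is
// padded with zero bits.
std::vector<std::uint8_t> packMaskBits(const std::vector<bool> &mask) {
  assert(!mask.empty() && "expects at least one element");
  std::vector<std::uint8_t> bytes((mask.size() + 7) / 8, 0);
  for (std::size_t i = 0; i < mask.size(); ++i) {
    if (mask[i])
      bytes[i / 8] |= static_cast<std::uint8_t>(1u << (i % 8));
  }
  return bytes;
}
```

With this layout a 32-element mask yields 4 bytes, matching the v32i1 figure quoted above, whereas a byte-per-element store would need 32.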
*	[DebugInfo] Fix tests in Assembler/	Davide Italiano	2016-04-04	6	-0/+59
	Each DISubprogram with isDefinition : true must belong to a compile unit. llvm-svn: 265281
*	[X86][SSE] Refreshed MOVMSK sign bit tests	Simon Pilgrim	2016-04-03	1	-26/+48
	llvm-svn: 265267
*	[CodeGenPrepare] Avoid sinking soft-FP comparisons	Peter Zotov	2016-04-03	1	-0/+29
	Sinking comparisons in CGP can undo the job of hoisting them done earlier by LICM, and soft-FP makes this an expensive mistake. A common pattern that produces floating point comparisons uniform over a loop is an explicit check for division by zero. If the divisor is hoisted out of the loop, the comparison can also be, but hoisting the function that unwinds is never legal, since it may cause side effects in the loop body prior to the unwinding to not be executed. Differential Revision: http://reviews.llvm.org/D18744 llvm-svn: 265264
*	Mark some FP intrinsics as safe to speculatively execute	Peter Zotov	2016-04-03	1	-0/+61
	Floating point intrinsics in LLVM are generally not speculatively executed, since most of them are defined to behave the same as libm functions, which set errno. However, the only error that can happen when executing ceil, floor, nearbyint, rint and round libm functions per POSIX.1-2001 is -ERANGE, and that requires the maximum value of the exponent to be smaller than the number of mantissa bits, which is not the case with any of the floating point types supported by LLVM. The trunc and copysign functions never set errno per POSIX.1-2001. Differential Revision: http://reviews.llvm.org/D18643 llvm-svn: 265262
*	AVX-512: Load and Extended Load for i1 vectors	Elena Demikhovsky	2016-04-03	4	-906/+139
	Implemented load+{sign|zero}_extend for i1 vectors. Fixed failures in i1 vector load. Covered loading of v2i1, v4i1, v8i1, v16i1, v32i1, v64i1 vectors for KNL and SKX. Differential Revision: http://reviews.llvm.org/D18737 llvm-svn: 265259
*	[mips][microMIPS] Revert commits r264245 and r264248.	Zoran Jovanovic	2016-04-02	9	-575/+17
	Commit r264245 was the reason for failing tests in LLVM test suite. Commit r264248 depends on the first one. llvm-svn: 265249
*	[X86][SSE] Added 1024-bit vector comparison tests	Simon Pilgrim	2016-04-02	1	-0/+4894
	More examples of PR22603, poor vector splitting for AVX512F targets as well as missing uses of PACKSS/MOVMSK llvm-svn: 265248
*	[X86][AVX512] Added AVX512 comparison tests	Simon Pilgrim	2016-04-02	1	-0/+98
	llvm-svn: 265247
*	AArch64: support .cpu directive	Saleem Abdulrasool	2016-04-02	1	-0/+63
	Add support for the AArch64 .cpu directive. This is a slightly involved directive since the parameter is actually a variable encoded string. The general structure is: <cpu>[[+-]<feature>]* We now map some of the supported string names for features for internal representation of feature flags. If we encounter one which we do not support, bail out as we cannot validate the assembly any longer. Resolves PR27010. llvm-svn: 265240
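To make the `<cpu>[[+-]<feature>]*` shape concrete, here is a rough sketch that splits such a string exactly as the grammar is written. It is not the assembler's parser: `CpuSpec` and `parseCpuSpec` are hypothetical names, and the sketch assumes the CPU name itself contains neither '+' nor '-', so a hyphenated CPU name would need different handling.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical result type: the CPU name plus (feature, enabled) pairs in
// the order they appear in the directive argument.
struct CpuSpec {
  std::string cpu;
  std::vector<std::pair<std::string, bool>> features;
};

// Splits "<cpu>[[+-]<feature>]*" literally. Assumes the CPU name has no
// '+' or '-' in it; this is a simplification of this sketch.
CpuSpec parseCpuSpec(const std::string &text) {
  CpuSpec spec;
  std::size_t pos = text.find_first_of("+-");
  spec.cpu = text.substr(0, pos);
  assert(!spec.cpu.empty() && "a CPU name must come first");
  while (pos != std::string::npos) {
    const bool enabled = text[pos] == '+';
    const std::size_t next = text.find_first_of("+-", pos + 1);
    const std::size_t len =
        next == std::string::npos ? std::string::npos : next - pos - 1;
    spec.features.emplace_back(text.substr(pos + 1, len), enabled);
    pos = next;
  }
  return spec;
}
```

For the illustrative input `generic+crc-crypto` this produces the CPU `generic` with `crc` enabled and `crypto` disabled. A real implementation would then look each feature name up and reject unknown ones, as the message describes.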
*	Bitcode: Try to emit metadata in function blocks	Duncan P. N. Exon Smith	2016-04-02	1	-0/+75
	Whenever metadata is only referenced by a single function, emit the metadata just in that function block. This should improve lazy-loading by reducing the amount of metadata in the global block. For now, this should catch all DILocations, and anything else that happens to be referenced only by a single function.
	It's also a first step toward a couple of possible future directions (which this commit does not implement):
	1. Some debug info metadata is only referenced from compile units and individual functions. If we can drop the link from the compile unit, this optimization will get more powerful.
	2. Any uniqued metadata that isn't referenced globally can in theory be emitted in every function block that references it (trading off bitcode size and full-parse time vs. lazy-load time).
	Note: this assumes the new BitcodeReader error checking from r265223. The metadata stored in function blocks gets purged after parsing each function, which means unresolved forward references will get lost. Since all the global metadata should have already been resolved by the time we get to the function metadata blocks we just need to check for that case. (If for some reason we need to handle bitcode that fails the checks in r265223, the fix is to store about-to-be-dropped unresolved nodes in MetadataList::shrinkTo until they can be handled successfully by a future call to MetadataList::tryToResolveCycles.)
	llvm-svn: 265226
*	[X86][AVX] Added vector float truncation (double2float) tests	Simon Pilgrim	2016-04-02	1	-0/+168
	llvm-svn: 265222
*	AArch64: avoid clobbering SP for dead MOVimm pseudos.	Tim Northover	2016-04-01	1	-0/+46
	We were producing ORR, which actually defines a GPR32sp rather than a GPR32. Should fix PR23209. llvm-svn: 265198
*	Add missing emissionKind flags to the DICompileUnits of several old testcases.	Adrian Prantl	2016-04-01	9	-9/+9
	llvm-svn: 265192
*	ThinLTO: special handling for LinkOnce functions	Mehdi Amini	2016-04-01	2	-0/+79
	These functions can be dropped by the compiler if they are no longer referenced in the current module. However there is a chance that another module is still referencing them because of the import. Multiple solutions can be used:
	- Always import LinkOnce when a caller is imported. This ensures that every module with a call to a LinkOnce has the definition and will be able to emit it if it emits the call.
	- Turn the LinkOnce into Weak, so that it is always emitted.
	- Turn all LinkOnce into available_externally and come back after all modules are codegen'ed to emit only one copy of the linkonce, when there is still a reference to it.
	This patch implements the second option, with an optimization that only one module will turn the LinkOnce into Weak, while the others will turn it into available_externally, so that there is exactly one copy emitted for the whole compilation.
	http://reviews.llvm.org/D18346 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265190
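The chosen scheme (one Weak copy, every other copy available_externally) can be modeled in a few lines. This is only an illustration: the `Linkage` enum, `resolveLinkOnce`, and the idea of passing in the index of the module that keeps the definition are assumptions of the sketch. The message does not say how that module is picked.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified stand-in for the linkage kinds mentioned above.
enum class Linkage { LinkOnce, Weak, AvailableExternally };

// For one LinkOnce function present in `moduleCount` modules, returns the
// linkage each module's copy ends up with: Weak in the module at index
// `prevailing` (so it is always emitted there), available_externally in
// all the others. How `prevailing` is chosen is left to the caller.
std::vector<Linkage> resolveLinkOnce(std::size_t moduleCount,
                                     std::size_t prevailing) {
  assert(prevailing < moduleCount && "prevailing module must exist");
  std::vector<Linkage> result(moduleCount, Linkage::AvailableExternally);
  result[prevailing] = Linkage::Weak;
  return result;
}
```

Whatever index is chosen, exactly one entry comes back as Weak, which is the "exactly one copy emitted for the whole compilation" property the patch aims for.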
*	Swift Calling Convention: add swifterror attribute.	Manman Ren	2016-04-01	4	-0/+64
	A ``swifterror`` attribute can be applied to a function parameter or an AllocaInst. This commit does not include any target-specific change. The target-specific optimization will come as a follow-up patch. Differential Revision: http://reviews.llvm.org/D18092 llvm-svn: 265189
*	[X86][SSE] Regenerated vector float tests - fabs / floor(etc.) / fneg / float2double	Simon Pilgrim	2016-04-01	4	-205/+534
	llvm-svn: 265186
*	[X86][SSE] Vector i64 load tests	Simon Pilgrim	2016-04-01	1	-11/+32
	llvm-svn: 265185
*	[X86][SSE] Regenerated comparison mask and float immediate tests	Simon Pilgrim	2016-04-01	2	-19/+66
	llvm-svn: 265184
*	[X86][SSE] Regenerated the vec_extract tests.	Simon Pilgrim	2016-04-01	5	-180/+431
	llvm-svn: 265183
*	[X86][SSE] Regenerated the vec_insert tests.	Simon Pilgrim	2016-04-01	9	-121/+410
	llvm-svn: 265179
*	[X86][SSE] Regenerated vec_partial tests.	Simon Pilgrim	2016-04-01	1	-10/+11
	llvm-svn: 265173
*	[x86] add an SSE2 + fast-unaligned accesses run for memset nonzero tests	Sanjay Patel	2016-04-01	1	-4/+122
	Was there really no other way to splat a byte in SSE2?
		punpcklbw {{.*#+}} xmm0 = xmm0[0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7]
		pshuflw {{.*#+}} xmm0 = xmm0[0,0,0,0,4,5,6,7]
		pshufd {{.*#+}} xmm0 = xmm0[0,0,1,1]
	llvm-svn: 265172
*	[X86][SSE] Regenerated vec_logical tests.	Simon Pilgrim	2016-04-01	1	-27/+72
	llvm-svn: 265171
*	AMDGPU: Implement {BUFFER,FLAT}_ATOMIC_CMPSWAP{,_X2}	Tom Stellard	2016-04-01	1	-0/+89
	Summary: Implement BUFFER_ATOMIC_CMPSWAP{,_X2} instructions on all GCN targets, and FLAT_ATOMIC_CMPSWAP{,_X2} on CI+. 32-bit instruction variants tested manually on Kabini and Bonaire. Tests and parts of code provided by Jan Veselý. Patch by: Vedran Miletić Reviewers: arsenm, tstellarAMD, nhaehnle Subscribers: jvesely, scchan, kanarayan, arsenm Differential Revision: http://reviews.llvm.org/D17280 llvm-svn: 265170
*	[X86][SSE] Regenerated vector sdiv to shifts tests	Simon Pilgrim	2016-04-01	1	-46/+239
	Added SSE + AVX1 tests as well as AVX2 llvm-svn: 265169
*	[sancov] save entry block from pruning (it is always full dominator)	Mike Aizatsky	2016-04-01	1	-3/+2
	llvm-svn: 265168
*	[x86] add an SSE1 run for these tests	Sanjay Patel	2016-04-01	1	-105/+106
	Note however that this is identical to the existing SSE2 run. What we really want is yet another run for an SSE2 machine that also has fast unaligned 16-byte accesses. llvm-svn: 265167
*	[X86][SSE] Regenerated vec_setcc tests.	Simon Pilgrim	2016-04-01	1	-111/+131
	llvm-svn: 265164
*	[X86][SSE] Regenerated the vec_set tests.	Simon Pilgrim	2016-04-01	13	-128/+214
	Replaced lots of dodgy greps with actual codegen llvm-svn: 265163
*	[x86] avoid intermediate splat for non-zero memsets (PR27100)	Sanjay Patel	2016-04-01	1	-18/+10
	Follow-up to http://reviews.llvm.org/D18566 and http://reviews.llvm.org/D18676 - where we noticed that an intermediate splat was being generated for memsets of non-zero chars. That was because we told getMemsetStores() to use a 32-bit vector element type, and it happily obliged by producing that constant using an integer multiply. The 16-byte test that was added in D18566 is now equivalent for AVX1 and AVX2 (no splats, just a vector load), but we have PR27141 to track that splat difference. Note that the SSE1 path is not changed in this patch. That can be a follow-up. This patch should resolve PR27100. llvm-svn: 265161
*	[InstCombine] Don't sink an instr after a catchswitch	David Majnemer	2016-04-01	1	-0/+45
	A catchswitch is a terminator, instructions cannot be inserted after it. llvm-svn: 265158
*	[SLPVectorizer] Don't insert an extractelement before a catchswitch	David Majnemer	2016-04-01	1	-0/+50
	A catchswitch cannot be preceded by another instruction in the same basic block (other than a PHI node). Instead, insert the extract element right after the materialization of the vectorized value. This isn't optimal but is a reasonable compromise given the constraints of WinEH. This fixes PR27163. llvm-svn: 265157
*	[x86] avoid intermediate splat for non-zero memsets (PR27100)	Sanjay Patel	2016-04-01	1	-113/+66
	Follow-up to D18566 - where we noticed that an intermediate splat was being generated for memsets of non-zero chars. That was because we told getMemsetStores() to use a 32-bit vector element type, and it happily obliged by producing that constant using an integer multiply. The tests that were added in the last patch are now equivalent for AVX1 and AVX2 (no splats, just a vector load), but we have PR27141 to track that splat difference. In the new tests, the splat via shuffling looks ok to me, but there might be some room for improvement depending on uarch there. Note that the SSE1/2 paths are not changed in this patch. That can be a follow-up. This patch should resolve PR27100. Differential Revision: http://reviews.llvm.org/D18676 llvm-svn: 265148