bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	test: Always treat .mir files as tests even outside of CodeGen/MIR	Matthias Braun	2016-04-04	4	-3/+3
\| \| \| \| \| \| \| \| \|	We missed a handful of .mir tests that existed outside the test/CodeGen/MIR directory. Also fix the three powerpc .mir tests that nobody noticed were broken. llvm-svn: 265350
*	Re-commit r265039 "[X86] Merge adjacent stack adjustments in ↵	Hans Wennborg	2016-04-04	8	-18/+77
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	eliminateCallFramePseudoInstr (PR27140)" The original commit miscompiled things on 32-bit Windows, e.g. a Clang bootstrap. It turns out that mergeSPUpdates() was a bit too generous in what it interpreted as a stack adjustment, causing the following code: addl $12, %esp; leal -4(%ebp), %esp to be "optimized" into simply: addl $8, %esp. This commit tightens up mergeSPUpdates() and includes a new test (test14 in movtopush.ll) for this situation. llvm-svn: 265345
*	Beef up some dllexport tests.	Sean Silva	2016-04-04	1	-1/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Adds some dllexport tests to verify that: - Variables in bss are exported appropriately - Non-dllexport symbols aliased to dllexport symbols are not exported - Symbols declared as dllexport but not defined are not exported. We plan to enable dllimport/dllexport support for the PS4, and these additional tests are for points we noticed in our internal testing. Patch by Warren Ristow! Differential Revision: http://reviews.llvm.org/D18682 llvm-svn: 265333
*	ARM, AArch64, X86: Check preserved registers for tail calls.	Matthias Braun	2016-04-04	2	-0/+46
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We can only perform a tail call to a callee that preserves all the registers that the caller needs to preserve. This situation happens with calling conventions like preserve_mostcc or cxx_fast_tls. It was explicitly handled for fast_tls and failing for preserve_most. This patch generalizes the check to any calling convention. Related to rdar://24207743 Differential Revision: http://reviews.llvm.org/D18680 llvm-svn: 265329
*	Revert r265309 and r265312 because they caused some errors I need to ↵	Wei Mi	2016-04-04	5	-200/+518
\| \| \| \| \| \|	investigate. llvm-svn: 265317
*	Add MachineFunctionProperty checks for AllVRegsAllocated for target passes	Derek Schuff	2016-04-04	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This adds the same checks that were added in r264593 to all target-specific passes that run after register allocation. Reviewers: qcolombet Subscribers: jyknight, dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D18525 llvm-svn: 265313
*	Replace analyzeSiblingValues with new algorithm to fix its compile	Wei Mi	2016-04-04	5	-518/+200
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	time issue. The patch is to solve PR17409 and its duplicates. analyzeSiblingValues is an N x N complexity algorithm where N is the number of siblings generated by reg splitting. Although it causes significant compile time issues when N is large, it is also important for performance since it removes redundant spills and enables rematerialization. To solve the compile time issue, the patch removes analyzeSiblingValues and replaces it with lower cost alternatives containing two parts. The first part creates a new spill hoisting method in postOptimization of register allocation. It does spill hoisting at once after all the spills are generated instead of inside every instance of selectOrSplit. The second part queries the define expr of the original register for rematerialization and keeps it always available during register allocation even if it is already dead. It deletes those dead instructions only in postOptimization. With the two parts in the patch, it can remove analyzeSiblingValues without sacrificing performance. Differential Revision: http://reviews.llvm.org/D15302 llvm-svn: 265309
*	[SystemZ] Support ATOMIC_FENCE	Ulrich Weigand	2016-04-04	2	-0/+29
\| \| \| \| \| \| \| \| \| \| \|	A cross-thread sequentially consistent fence should be lowered into z/Architecture's BCR serialization instruction, instead of causing a fatal error in the back-end. Author: bryanpkc Differential Revision: http://reviews.llvm.org/D18644 llvm-svn: 265292
*	[SystemZ] Support llvm.frameaddress/llvm.returnaddress intrinsics	Ulrich Weigand	2016-04-04	2	-0/+43
\| \| \| \| \| \| \| \| \| \| \|	Enable the SystemZ back-end to lower FRAMEADDR and RETURNADDR, which previously would cause the back-end to crash. Currently, only a frame count of zero is supported. Author: bryanpkc Differential Revision: http://reviews.llvm.org/D18514 llvm-svn: 265291
*	AVX-512: Truncating store for i1 vectors	Elena Demikhovsky	2016-04-04	2	-416/+170
\| \| \| \| \| \| \| \| \|	Implemented truncstore for KNL and skylake-avx512. Covered vectors from v2i1 to v64i1. We save the value in bits (not in bytes) - v32i1 is saved in 4 bytes. Differential Revision: http://reviews.llvm.org/D18740 llvm-svn: 265283
*	[X86][SSE] Refreshed MOVMSK sign bit tests	Simon Pilgrim	2016-04-03	1	-26/+48
\| \| \| \|	llvm-svn: 265267
*	AVX-512: Load and Extended Load for i1 vectors	Elena Demikhovsky	2016-04-03	4	-906/+139
\| \| \| \| \| \| \| \| \| \|	Implemented load+{sign\|zero}_extend for i1 vectors. Fixed failures in i1 vector load. Covered loading of v2i1, v4i1, v8i1, v16i1, v32i1, v64i1 vectors for KNL and SKX. Differential Revision: http://reviews.llvm.org/D18737 llvm-svn: 265259
*	[mips][microMIPS] Revert commits r264245 and r264248.	Zoran Jovanovic	2016-04-02	7	-567/+9
\| \| \| \| \| \| \|	Commit r264245 was the reason for failing tests in LLVM test suite. Commit r264248 depends on the first one. llvm-svn: 265249
*	[X86][SSE] Added 1024-bit vector comparison tests	Simon Pilgrim	2016-04-02	1	-0/+4894
\| \| \| \| \| \|	More examples of PR22603, poor vector splitting for AVX512F targets as well as missing uses of PACKSS/MOVMSK llvm-svn: 265248
*	[X86][AVX512] Added AVX512 comparison tests	Simon Pilgrim	2016-04-02	1	-0/+98
\| \| \| \|	llvm-svn: 265247
*	[X86][AVX] Added vector float truncation (double2float) tests	Simon Pilgrim	2016-04-02	1	-0/+168
\| \| \| \|	llvm-svn: 265222
*	AArch64: avoid clobbering SP for dead MOVimm pseudos.	Tim Northover	2016-04-01	1	-0/+46
\| \| \| \| \| \| \| \|	We were producing ORR, which actually defines a GPR32sp rather than a GPR32. Should fix PR23209. llvm-svn: 265198
*	Add missing emissionKind flags to the DICompileUnits of several old testcases.	Adrian Prantl	2016-04-01	2	-2/+2
\| \| \| \|	llvm-svn: 265192
*	[X86][SSE] Regenerated vector float tests - fabs / floor(etc.) / fneg / ↵	Simon Pilgrim	2016-04-01	4	-205/+534
\| \| \| \| \| \|	float2double llvm-svn: 265186
*	[X86][SSE] Vector i64 load tests	Simon Pilgrim	2016-04-01	1	-11/+32
\| \| \| \|	llvm-svn: 265185
*	[X86][SSE] Regenerated comparison mask and float immediate tests	Simon Pilgrim	2016-04-01	2	-19/+66
\| \| \| \|	llvm-svn: 265184
*	[X86][SSE] Regenerated the vec_extract tests.	Simon Pilgrim	2016-04-01	5	-180/+431
\| \| \| \|	llvm-svn: 265183
*	[X86][SSE] Regenerated the vec_insert tests.	Simon Pilgrim	2016-04-01	9	-121/+410
\| \| \| \|	llvm-svn: 265179
*	[X86][SSE] Regenerated vec_partial tests.	Simon Pilgrim	2016-04-01	1	-10/+11
\| \| \| \|	llvm-svn: 265173
*	[x86] add an SSE2 + fast-unaligned accesses run for memset nonzero tests	Sanjay Patel	2016-04-01	1	-4/+122
\| \| \| \| \| \| \| \| \|	Was there really no other way to splat a byte in SSE2? punpcklbw {{.*#+}} xmm0 = xmm0[0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7]; pshuflw {{.*#+}} xmm0 = xmm0[0,0,0,0,4,5,6,7]; pshufd {{.*#+}} xmm0 = xmm0[0,0,1,1] llvm-svn: 265172
*	[X86][SSE] Regenerated vec_logical tests.	Simon Pilgrim	2016-04-01	1	-27/+72
\| \| \| \|	llvm-svn: 265171
*	AMDGPU: Implement {BUFFER,FLAT}_ATOMIC_CMPSWAP{,_X2}	Tom Stellard	2016-04-01	1	-0/+89
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Implement BUFFER_ATOMIC_CMPSWAP{,_X2} instructions on all GCN targets, and FLAT_ATOMIC_CMPSWAP{,_X2} on CI+. 32-bit instruction variants tested manually on Kabini and Bonaire. Tests and parts of code provided by Jan Veselý. Patch by: Vedran Miletić Reviewers: arsenm, tstellarAMD, nhaehnle Subscribers: jvesely, scchan, kanarayan, arsenm Differential Revision: http://reviews.llvm.org/D17280 llvm-svn: 265170
*	[X86][SSE] Regenerated vector sdiv to shifts tests	Simon Pilgrim	2016-04-01	1	-46/+239
\| \| \| \| \| \|	Added SSE + AVX1 tests as well as AVX2 llvm-svn: 265169
*	[x86] add an SSE1 run for these tests	Sanjay Patel	2016-04-01	1	-105/+106
\| \| \| \| \| \| \| \|	Note however that this is identical to the existing SSE2 run. What we really want is yet another run for an SSE2 machine that also has fast unaligned 16-byte accesses. llvm-svn: 265167
*	[X86][SSE] Regenerated vec_setcc tests.	Simon Pilgrim	2016-04-01	1	-111/+131
\| \| \| \|	llvm-svn: 265164
*	[X86][SSE] Regenerated the vec_set tests.	Simon Pilgrim	2016-04-01	13	-128/+214
\| \| \| \| \| \|	Replaced lots of dodgy greps with actual codegen llvm-svn: 265163
*	[x86] avoid intermediate splat for non-zero memsets (PR27100)	Sanjay Patel	2016-04-01	1	-18/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Follow-up to http://reviews.llvm.org/D18566 and http://reviews.llvm.org/D18676 - where we noticed that an intermediate splat was being generated for memsets of non-zero chars. That was because we told getMemsetStores() to use a 32-bit vector element type, and it happily obliged by producing that constant using an integer multiply. The 16-byte test that was added in D18566 is now equivalent for AVX1 and AVX2 (no splats, just a vector load), but we have PR27141 to track that splat difference. Note that the SSE1 path is not changed in this patch. That can be a follow-up. This patch should resolve PR27100. llvm-svn: 265161
*	[x86] avoid intermediate splat for non-zero memsets (PR27100)	Sanjay Patel	2016-04-01	1	-113/+66
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Follow-up to D18566 - where we noticed that an intermediate splat was being generated for memsets of non-zero chars. That was because we told getMemsetStores() to use a 32-bit vector element type, and it happily obliged by producing that constant using an integer multiply. The tests that were added in the last patch are now equivalent for AVX1 and AVX2 (no splats, just a vector load), but we have PR27141 to track that splat difference. In the new tests, the splat via shuffling looks ok to me, but there might be some room for improvement depending on uarch there. Note that the SSE1/2 paths are not changed in this patch. That can be a follow-up. This patch should resolve PR27100. Differential Revision: http://reviews.llvm.org/D18676 llvm-svn: 265148
*	[X86][AVX512] Regenerated intrinsics tests	Simon Pilgrim	2016-04-01	1	-126/+146
\| \| \| \|	llvm-svn: 265135
*	[X86] Introduce Lakemont CPU.	Andrey Turetskiy	2016-04-01	1	-0/+9
\| \| \| \| \| \| \| \|	Add a new Intel MCU CPU Lakemont, which doesn't support X87. Differential Revision: http://reviews.llvm.org/D18650 llvm-svn: 265128
*	Improve CHECK-NOT robustness of dllexport tests	Sean Silva	2016-04-01	2	-5/+20
\| \| \| \| \| \| \| \| \| \| \| \| \|	This changes some dllexport tests, to verify that some symbols that should not be exported are not, in a way that improves the robustness of CHECK-SAME interaction with CHECK-NOT. We plan to enable dllimport/dllexport support for the PS4, and these changes are for points we noticed in our internal testing. Patch by Warren Ristow! llvm-svn: 265106
*	Don't use an i64 return type with webkit_jscc	Sanjoy Das	2016-04-01	1	-4/+4
\| \| \| \| \| \| \| \| \|	Re-enable an assertion enabled by Justin Lebar in rL265092. rL265092 was breaking test/CodeGen/X86/deopt-intrinsic.ll because webkit_jscc does not like non-i64 return types. Change the test case to not do that. llvm-svn: 265099
*	Fix Sub-register Rewriting in Aggressive Anti-Dependence Breaker	Chuang-Yu Cheng	2016-04-01	1	-0/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, HandleLastUse would delete RegRef information for sub-registers if they were dead even if their corresponding super-register were still live. If the super-register were later renamed, then the definitions of the sub-register would not be updated appropriately. This patch alters the behavior so that RegInfo information for sub-registers is only deleted when the sub-register and super-register are both dead. This resolves PR26775. This is the mirror image of Hal's r227311 commit. Author: Tom Jablin (tjablin) Reviewers: kbarton uweigand nemanjai hfinkel http://reviews.llvm.org/D18448 llvm-svn: 265097
*	[NVPTX] Read __CUDA_FTZ from module flags in NVVMReflect.	Justin Lebar	2016-04-01	1	-0/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Previously the NVVMReflect pass would read its configuration from command-line flags or a static configuration given to the pass at instantiation time. This doesn't quite work for clang's use-case. It needs to pass a value for __CUDA_FTZ down on a per-module basis. We use a module flag for this, so the NVVMReflect pass needs to be updated to read said flag. Reviewers: tra, rnk Subscribers: cfe-commits, jholewinski Differential Revision: http://reviews.llvm.org/D18672 llvm-svn: 265090
*	testcase gardening: update the emissionKind enum to the new syntax. (NFC)	Adrian Prantl	2016-04-01	72	-73/+73
\| \| \| \|	llvm-svn: 265081
*	Move the DebugEmissionKind enum from DIBuilder into DICompileUnit.	Adrian Prantl	2016-03-31	30	-31/+31
\| \| \| \| \| \| \| \| \| \| \| \| \|	This mostly cosmetic patch moves the DebugEmissionKind enum from DIBuilder into DICompileUnit. DIBuilder is not the right place for this enum to live in — a metadata consumer should not have to include DIBuilder.h. I also added a Verifier check that checks that the emission kind of a DICompileUnit is actually legal. http://reviews.llvm.org/D18612 <rdar://problem/25427165> llvm-svn: 265077
*	Move asm-printer-topological-order.ll to PowerPC backend	Tim Shen	2016-03-31	1	-1/+1
\| \| \| \|	llvm-svn: 265067
*	[AsmPrinter] Print aliases in topological order	Tim Shen	2016-03-31	1	-0/+15
\| \| \| \| \| \| \| \| \| \|	Print aliases in topological order, that is, for any alias a = b, b must be printed before a. This is because on some targets (e.g. PowerPC) the linker expects aliases in such an order to generate correct TOC information. GCC also prints aliases in topological order. llvm-svn: 265064
*	[AArch64] Allow loads with imp-def to be handled in getMemOpBaseRegImmOfsWidth()	Jun Bum Lim	2016-03-31	2	-1/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This change will allow loads with imp-def to be clustered in machine-scheduler pass. areMemAccessesTriviallyDisjoint() can also handle loads with imp-def. Reviewers: mcrosier, jmolloy, t.p.northover Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18665 llvm-svn: 265051
*	[PowerPC] Cleanup test/CodeGen/PowerPC/qpx-load-splat.ll	Hal Finkel	2016-03-31	1	-14/+6
\| \| \| \| \| \|	Removing unnecessary attributes and metadata... llvm-svn: 265049
*	[x86] add memset tests to show another potential improvement	Sanjay Patel	2016-03-31	1	-0/+203
\| \| \| \|	llvm-svn: 265048
*	[PowerPC] Add a late MI-level pass for QPX load/splat simplification	Hal Finkel	2016-03-31	1	-0/+83
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Chapter 3 of the QPX manual states that, "Scalar floating-point load instructions, defined in the Power ISA, cause a replication of the source data across all elements of the target register." Thus, if we have a load followed by a QPX splat (from the first lane), the splat is redundant. This adds a late MI-level pass to remove the redundant splats in some of these cases (specifically when both occur in the same basic block). This optimization is scheduled just prior to post-RA scheduling. It can't happen before anything that might replace the load with some already-computed quantity (i.e. store-to-load forwarding). llvm-svn: 265047
*	Revert r265039 "[X86] Merge adjacent stack adjustments in ↵	Hans Wennborg	2016-03-31	8	-46/+18
\| \| \| \| \| \| \| \| \| \|	eliminateCallFramePseudoInstr (PR27140)" I think it might have caused these build breakages: http://lab.llvm.org:8011/builders/clang-x86-win2008-selfhost/builds/7234/steps/build%20stage%202/logs/stdio http://lab.llvm.org:8011/builders/sanitizer-windows/builds/19566/steps/run%20tests/logs/stdio llvm-svn: 265046
*	[X86][SSE] Some basic tests for variable shuffles	Simon Pilgrim	2016-03-31	2	-0/+1942
\| \| \| \| \| \|	We don't really support non-constant shuffle masks, but these tests are for cases where BUILD_VECTOR is made up from vector extracts (as well as undef/zero scalars). llvm-svn: 265045
*	[ARM] Expand v1i64 and v2i64 ctpop.	Benjamin Kramer	2016-03-31	1	-0/+16
\| \| \| \| \| \| \|	The default is legal, which results in 'Cannot select' errors. This is triggered during selfhost due to a recent cost model change. llvm-svn: 265040