bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	[X86] Fix patterns that turn cmove/cmovne+ctlz/cttz into lzcnt/tzcnt instructions.	Craig Topper	2016-04-24	1	-209/+0
\| \| \| \| \| \|	Only one of the conditions should be valid for each pattern, not both. Update tests accordingly. llvm-svn: 267311
*	DebugInfo: Remove MDString-based type references	Duncan P. N. Exon Smith	2016-04-23	2	-14/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Eliminate DITypeIdentifierMap and make DITypeRef a thin wrapper around DIType*. It is no longer legal to refer to a DICompositeType by its 'identifier:', and DIBuilder no longer retains all types with an 'identifier:' automatically. Aside from the bitcode upgrade, this is mainly removing logic to resolve an MDString-based reference to an actual DIType. The commits leading up to this have made the implicit type map in DICompileUnit's 'retainedTypes:' field superfluous. This does not remove DITypeRef, DIScopeRef, DINodeRef, and DITypeRefArray, or stop using them in DI-related metadata. Although as of this commit they aren't serving a useful purpose, there are patches under review to reuse them for CodeView support. The tests in LLVM were updated with deref-typerefs.sh, which is attached to the thread "[RFC] Lazy-loading of debug info metadata": http://lists.llvm.org/pipermail/llvm-dev/2016-April/098318.html llvm-svn: 267296
*	Revert "[AArch64] Fix optimizeCondBranch logic."	Renato Golin	2016-04-23	1	-1/+1
\| \| \| \| \| \|	This reverts commit r267206, as it broke self-hosting on AArch64. llvm-svn: 267294
*	[X86][XOP] Added VPPERM -> BLEND-WITH-ZERO Test	Simon Pilgrim	2016-04-23	1	-0/+9
\| \| \| \| \| \|	Currently failing due to poor blend matching, found whilst investigating PR27472 llvm-svn: 267282
*	[CodeGen] When promoting CTTZ operations to larger type, don't insert a select to detect if the input is zero to return the original size instead of the extended size.	Craig Topper	2016-04-23	1	-58/+3
\| \| \| \| \| \|	Instead just set the first bit in the zero extended part. llvm-svn: 267280
*	AMDGPU: sext_inreg (srl x, K), vt -> bfe x, K, vt.Size	Matt Arsenault	2016-04-22	1	-21/+123
\| \| \| \|	llvm-svn: 267244
*	Fix llvm/test/CodeGen/ARM/Windows/dbzchk.ll not to check mixed output, take #2.	NAKAMURA Takumi	2016-04-22	1	-2/+2
\| \| \| \|	llvm-svn: 267242
*	AMDGPU: Re-visit nodes in performAndCombine	Matt Arsenault	2016-04-22	2	-9/+12
\| \| \| \| \| \|	This fixes test regressions when i64 loads/stores are made promote. llvm-svn: 267240
*	Differential Revision: http://reviews.llvm.org/D19040	Sriraman Tallam	2016-04-22	2	-4/+123
\| \| \| \|	llvm-svn: 267229
*	Introduce llvm.load.relative intrinsic.	Peter Collingbourne	2016-04-22	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This intrinsic takes two arguments, ``%ptr`` and ``%offset``. It loads a 32-bit value from the address ``%ptr + %offset``, adds ``%ptr`` to that value and returns it. The constant folder specifically recognizes the form of this intrinsic and the constant initializers it may load from; if a loaded constant initializer is known to have the form ``i32 trunc(x - %ptr)``, the intrinsic call is folded to ``x``. LLVM provides that the calculation of such a constant initializer will not overflow at link time under the medium code model if ``x`` is an ``unnamed_addr`` function. However, it does not provide this guarantee for a constant initializer folded into a function body. This intrinsic can be used to avoid the possibility of overflows when loading from such a constant. Differential Revision: http://reviews.llvm.org/D18367 llvm-svn: 267223
*	DAGCombiner: Relax alignment restriction when changing store type	Matt Arsenault	2016-04-22	2	-1/+54
\| \| \| \| \| \|	If the target allows the alignment, this should be OK. llvm-svn: 267217
*	CodeGen: Use PLT relocations for relative references to unnamed_addr functions.	Peter Collingbourne	2016-04-22	3	-0/+51
\| \| \| \| \| \| \| \| \| \| \| \| \|	The relative vtable ABI (PR26723) needs PLT relocations to refer to virtual functions defined in other DSOs. The unnamed_addr attribute means that the function's address is not significant, so we're allowed to substitute it with the address of a PLT entry. Also includes a bonus feature: addends for COFF image-relative references. Differential Revision: http://reviews.llvm.org/D17938 llvm-svn: 267211
*	DAGCombiner: Relax alignment restriction when changing load type	Matt Arsenault	2016-04-22	4	-5/+43
\| \| \| \| \| \|	If the target allows the alignment, this should still be OK. llvm-svn: 267209
*	[AArch64] Fix optimizeCondBranch logic.	Quentin Colombet	2016-04-22	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	The opcode for the optimized branch does not depend on the size of the active bits in the AND masks, but on the AND opcode itself. Indeed, we need to use an X or W variant based on the AND variant, not based on whether the mask fits into the related variant. Otherwise, we may end up using the W variant of the optimized branch for 64-bit register inputs! This fixes the last make check verifier issues for AArch64: PR27479. llvm-svn: 267206
*	MachineScheduler: Limit the size of the ready list.	Matthias Braun	2016-04-22	1	-0/+1
\| \| \| \| \| \| \| \| \|	Avoid quadratic complexity in unusually large basic blocks by limiting the size of the ready lists. Differential Revision: http://reviews.llvm.org/D19349 llvm-svn: 267189
*	[AArch64] When creating MRS instruction, make sure the destination register is declared as a definition.	Quentin Colombet	2016-04-22	1	-1/+1
\| \| \| \| \| \| \| \|	This fixes the machine verifier error for CodeGen/AArch64/nzcv-save.ll. llvm-svn: 267185
*	[AArch64][AdvSIMDScalar] Update the kill flags correctly.	Quentin Colombet	2016-04-22	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We used to simply set the kill flags to true when transforming a scalar instruction to a vector one: SrcScalar1 = copy SrcVector1; ... = opScalar SrcScalar1 => SrcScalar1 = copy SrcVector1; ... = opVector SrcVector1<kill>. This is obviously wrong. The proper update consists of: 1. Propagating the kill status from the copy to the new opVector. 2. Resetting the kill status on the copy, since the live range of SrcVector1 got extended. This fixes some of the machine verifier errors for AArch64 with make check. llvm-svn: 267180
*	test: split test into two runs	Saleem Abdulrasool	2016-04-22	1	-8/+9
\| \| \| \| \| \| \| \| \| \|	Rather than checking both stdout and stderr simultaneously, split it into two tests. This apparently breaks on Windows where MSVCRT does not buffer output correctly. NFC. Thanks to chapuni for bringing the issue to my attention! llvm-svn: 267179
*	[Hexagon] Properly close live range in HexagonBlockRanges ---add testcase	Krzysztof Parzyszek	2016-04-22	1	-0/+55
\| \| \| \|	llvm-svn: 267174
*	[AMDGPU] Insert nop pass: take care of outstanding feedback	Konstantin Zhuravlyov	2016-04-22	1	-4/+8
\| \| \| \| \| \| \| \| \| \| \|	- Switch a few loops to range-based for loops - Fix nop insertion at the end of BB - Fix formatting - Check for endpgm Differential Revision: http://reviews.llvm.org/D19380 llvm-svn: 267167
*	[Hexagon] Teach mux expansion how to deal with undef predicates	Krzysztof Parzyszek	2016-04-22	1	-0/+22
\| \| \| \|	llvm-svn: 267165
*	Emit code16 in assembly in 16-bit mode	Nirav Dave	2016-04-22	1	-0/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: When generating assembly using -m16 we must explicitly mark it as 16-bit. Emit .code16 at beginning of file. Fixes wrong results when using -fno-integrated-as. Reviewers: dwmw2 Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19392 llvm-svn: 267152
*	[mips] Fix select patterns for MIPS64	Simon Dardis	2016-04-22	1	-0/+50
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	When targeting MIPS64R6, some of the patterns for select were guarded by a broken predicate. The predicate was supposed to test if a constant value could fit in a 16-bit zero-extended field. Instead, the value was tested to fit in a 16-bit sign-extended field. For negative constants of native word width this resulted in wrong code generation. Reviewers: vkalintiris, dsanders Differential Review: http://reviews.llvm.org/D19378 llvm-svn: 267151
*	[mips][microMIPS] Implement SLT, SLTI, SLTIU, SLTU microMIPS32r6 instructions	Hrvoje Varga	2016-04-22	4	-2/+10
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D19354 llvm-svn: 267137
*	Revert r267098 - [MachineCombiner] Support for floating-point FMA on ARM64	Daniel Sanders	2016-04-22	2	-264/+0
\| \| \| \| \| \|	It introduced buildbot failures on clang-cmake-mips, clang-ppc64le-linux, among others. llvm-svn: 267127
*	AMDGPU/SI: add llvm.amdgcn.ps.live intrinsic	Nicolai Haehnle	2016-04-22	1	-0/+59
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This intrinsic returns true if the current thread belongs to a live pixel and false if it belongs to a pixel that we are executing only for derivative computation. It will be used by Mesa to implement gl_HelperInvocation. Note that for pixels that are killed during the shader, this implementation also returns true, but it doesn't matter because those pixels are always disabled in the EXEC mask. This unearthed a corner case in the instruction verifier, which complained about a v_cndmask 0, 1, exec, exec<imp-use> instruction. That's stupid but correct code, so make the verifier accept it as such. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19191 llvm-svn: 267102
*	[AVX512] Teach lowering to use vplzcntd/q to implement 128/256-bit CTTZ_ZERO_UNDEF even without VLX support.	Craig Topper	2016-04-22	3	-2/+317
\| \| \| \| \| \|	We can just extend to 512-bits and extract like we do for CTLZ. llvm-svn: 267100
*	[MachineCombiner] Support for floating-point FMA on ARM64	Gerolf Hoflehner	2016-04-22	2	-0/+264
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Evaluates fmul+fadd -> fmadd combines and similar code sequences in the machine combiner. It adds support for float and double similar to the existing integer implementation. The key features are: - DAGCombiner checks whether it should combine greedily or let the machine combiner do the evaluation. This is only supported on ARM64. - It gives preference to throughput over latency: the heuristic used is to always combine in loops. The target decides whether the machine combiner should optimize for throughput or latency. - Support for fmadd, f(n)msub, fmla, fmls patterns - On by default at O3 ffast-math llvm-svn: 267098
*	Try to fix UNRESOLVED: LLVM :: CodeGen/AArch64/arm64-regress-opt-cmp.s on bots.	Nico Weber	2016-04-22	1	-0/+1
\| \| \| \| \| \| \| \|	This test used to write a .s file until r266971 fixed that. But on most bots, the .s file still exists. Add an rm statement to clean up the bots. In a few days, this statement can go away again. llvm-svn: 267095
*	ARM: fix test for Windows division	Saleem Abdulrasool	2016-04-22	1	-4/+4
\| \| \| \| \| \| \|	This was meant to be part of SVN r267080. cbz cannot use a high register, which would be silently truncated. This has now been fixed. llvm-svn: 267092
*	[WebAssembly] Limit alignment hints to natural alignment.	Dan Gohman	2016-04-21	3	-17/+21
\| \| \| \| \| \|	This follows the current binary format rules. llvm-svn: 267082
*	ARM: restrict register class for WIN__DBZCHK	Saleem Abdulrasool	2016-04-21	1	-0/+47
\| \| \| \| \| \| \| \| \| \| \|	WIN__DBZCHK will insert a CBZ instruction into the stream. This instruction reserves 3 bits for the condition register (rn). As such, we must ensure that we restrict the register to a low register. Use the tGPR class instead of GPR to ensure that this is properly constrained. In debug builds, we would attempt to use lr as a condition register which would silently get truncated with no hint that the register selection was incorrect. llvm-svn: 267080
*	add tests for disguised fabs/fneg	Sanjay Patel	2016-04-21	1	-0/+29
\| \| \| \|	llvm-svn: 267053
*	use FileCheck; add test for disguised fabs	Sanjay Patel	2016-04-21	1	-4/+27
\| \| \| \|	llvm-svn: 267051
*	[Hexagon] Expand handling of the small-data/bss section	Krzysztof Parzyszek	2016-04-21	5	-5/+88
\| \| \| \|	llvm-svn: 267034
*	DAGCombiner: Reduce 64-bit BFE pattern to pattern on 32-bit component	Matt Arsenault	2016-04-21	5	-7/+515
\| \| \| \| \| \| \|	If the extracted bits are restricted to the upper half or lower half, this can be truncated. llvm-svn: 267024
*	[PowerPC] [SSP] Fix stack guard load for 32-bit.	Marcin Koscielnicki	2016-04-21	1	-1/+1
\| \| \| \| \| \| \| \|	r266809 incorrectly used LD to load the stack guard, it should be LWZ. Differential Revision: http://reviews.llvm.org/D19358 llvm-svn: 267017
*	Updated a test not to produce an empty s-file.	Evgeny Astigeevich	2016-04-21	1	-1/+1
\| \| \| \|	llvm-svn: 266971
*	[AArch64][CodeGen] Fix of PR27158: incorrect peephole optimization in AArch64InstrInfo::optimizeCompareInstr	Evgeny Astigeevich	2016-04-21	2	-0/+64
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	AArch64InstrInfo::optimizeCompareInstr has bug PR27158 which causes generation of incorrect code. A compare instruction is substituted with another instruction which does not produce the same flags as the original compare instruction. This patch contains: 1. Fix of the bug. 2. A regression test in MIR. 3. A new test to check that SUBS is replaced by SUB. Differential Revision: http://reviews.llvm.org/D18838 llvm-svn: 266969
*	[AVX512] Add CTTZ support for v8i64 and v16i32 vectors.	Craig Topper	2016-04-21	1	-182/+166
\| \| \| \|	llvm-svn: 266968
*	[X86] Fix vector-tzcnt-512 test to disable CDI while enabling BWI for one of the runs.	Craig Topper	2016-04-21	1	-18/+134
\| \| \| \| \| \|	Update check patterns accordingly. llvm-svn: 266967
*	Fix test command line to explicitly disable CDI instructions for one test.	Craig Topper	2016-04-21	1	-1/+1
\| \| \| \|	llvm-svn: 266966
*	[AVX512] Add support for lowering CTTZ v64i8 and v32i16 with BWI instructions.	Craig Topper	2016-04-21	1	-729/+125
\| \| \| \|	llvm-svn: 266963
*	[AVX512] Add support for popcount of v8i64 and v16i32 with and without BWI instructions.	Craig Topper	2016-04-21	1	-94/+84
\| \| \| \| \| \| \| \|	Without BWI we have to split the vectors into 256-bit vectors so we can use AVX2 pshufb and then concatenate the results. llvm-svn: 266950
*	[lanai] Add subword scheduling itineraries.	Jacques Pienaar	2016-04-20	1	-0/+29
\| \| \| \| \| \| \| \|	Differentiate between word and subword memory operations as they take different amount of cycles to complete. This just adds a basic model of the subword latency to the scheduler. llvm-svn: 266898
*	[RDF] Consider register as live if any alias is live	Krzysztof Parzyszek	2016-04-20	1	-0/+28
\| \| \| \| \| \|	This only affects the recomputation of kill flags. llvm-svn: 266875
*	[X86] enable PIE for functions	Asaf Badouh	2016-04-20	1	-0/+41
\| \| \| \| \| \| \| \|	Call locally defined function directly for PIE/fPIE Differential Revision: http://reviews.llvm.org/D19226 llvm-svn: 266863
*	[AVX512] Add avx512cd+vl runs to vector-tzcnt-128/256 tests to show using the vplzcntd/q instructions.	Craig Topper	2016-04-20	2	-132/+594
\| \| \| \| \| \|	llvm-svn: 266860
*	[AVX512] Update vector-tzcnt-512 test to show how bad v32i16 and v64i8 are with avx512bw enabled.	Craig Topper	2016-04-20	1	-113/+859
\| \| \| \| \| \|	llvm-svn: 266859
*	[AVX512] Add popcount support for v32i16 and v64i8.	Craig Topper	2016-04-20	1	-41/+69
\| \| \| \|	llvm-svn: 266858
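
As a rough illustration of the CTTZ promotion change in r267280 listed above, the following Python sketch models the idea of setting the first bit of the zero-extended part instead of using a select to handle a zero input. The helper names cttz and promoted_cttz and the 8-bit to 32-bit widths are assumptions chosen for illustration only; they are not LLVM APIs and this is not the actual legalizer code.

```python
def cttz(value: int, width: int) -> int:
    """Count trailing zeros of a width-bit value; returns width when the value is 0."""
    value &= (1 << width) - 1
    if value == 0:
        return width
    # value & -value isolates the lowest set bit; its bit_length minus 1 is its index.
    return (value & -value).bit_length() - 1


def promoted_cttz(value: int, narrow: int = 8, wide: int = 32) -> int:
    """Model cttz on a narrow type performed in a wider type (hypothetical helper).

    Instead of selecting on "input == 0", set the lowest bit of the
    zero-extended part, so a zero input yields the narrow width.
    """
    extended = value & ((1 << narrow) - 1)  # zero-extend the narrow value
    extended |= 1 << narrow                 # first bit of the extended part
    return cttz(extended, wide)


if __name__ == "__main__":
    # Every 8-bit input should give the same answer as a native 8-bit cttz.
    mismatches = [v for v in range(256) if promoted_cttz(v) != cttz(v, 8)]
    print("mismatches:", mismatches)
```

Because the extra bit sits just above the narrow value, it never changes the result for non-zero inputs, and for a zero input the wide count stops at the narrow width.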