bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	AMDGPU/SI: Make sure llvm.amdgcn.implicitarg.ptr() is 8-byte aligned for HSA	Tom Stellard	2016-09-09	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	Reviewers: arsenm Subscribers: arsenm, wdng, nhaehnle, llvm-commits Differential Revision: https://reviews.llvm.org/D24405 llvm-svn: 281080
*	[pdb] Print out some more info when dumping a raw stream.	Zachary Turner	2016-09-09	1	-10/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We have various command line options that print the type of a stream, the size of a stream, etc but nowhere that it can all be viewed together. Since a previous patch introduced the ability to dump the bytes of a stream, this seems like a good place to present a full view of the stream's properties including its size, what kind of data it represents, and the blocks it occupies. So I added the ability to print that information to the -stream-data command line option. llvm-svn: 281077
*	Do not widen load for different variable in GVN.	Dehao Chen	2016-09-09	6	-17/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Widening load in GVN is too early because it will block other optimizations like PRE, LICM. https://llvm.org/bugs/show_bug.cgi?id=29110
	The SPECCPU2006 benchmark impact of this patch:
	Reference: o2_nopatch
	(1): o2_patched
	Benchmark                          Base:Reference   (1)
	-------------------------------------------------------
	spec/2006/fp/C++/444.namd          25.2             -0.08%
	spec/2006/fp/C++/447.dealII        45.92            +1.05%
	spec/2006/fp/C++/450.soplex        41.7             -0.26%
	spec/2006/fp/C++/453.povray        35.65            +1.68%
	spec/2006/fp/C/433.milc            23.79            +0.42%
	spec/2006/fp/C/470.lbm             41.88            -1.12%
	spec/2006/fp/C/482.sphinx3         47.94            +1.67%
	spec/2006/int/C++/471.omnetpp      22.46            -0.36%
	spec/2006/int/C++/473.astar        21.19            +0.24%
	spec/2006/int/C++/483.xalancbmk    36.09            -0.11%
	spec/2006/int/C/400.perlbench      33.28            +1.35%
	spec/2006/int/C/401.bzip2          22.76            -0.04%
	spec/2006/int/C/403.gcc            32.36            +0.12%
	spec/2006/int/C/429.mcf            41.04            -0.41%
	spec/2006/int/C/445.gobmk          26.94            +0.04%
	spec/2006/int/C/456.hmmer          24.5             -0.20%
	spec/2006/int/C/458.sjeng          28               -0.46%
	spec/2006/int/C/462.libquantum     55.25            +0.27%
	spec/2006/int/C/464.h264ref        45.87            +0.72%
	geometric mean                                      +0.23%
	For most benchmarks, it's a wash, but we do see stable improvements on some benchmarks, e.g. 447,453,482,400.
	Reviewers: davidxl, hfinkel, dberlin, sanjoy, reames Subscribers: gberry, junbuml Differential Revision: https://reviews.llvm.org/D24096 llvm-svn: 281074
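The geometric-mean row of the benchmark table can be approximated from the per-benchmark deltas. The sketch below is only an illustration: the function name is hypothetical, and because it works from the rounded percentages printed in the commit message rather than the raw scores, its result may differ from the reported +0.23% in the last digit.

```python
import math

# Per-benchmark changes (in percent) copied from the commit message above.
deltas_percent = [
    -0.08, 1.05, -0.26, 1.68, 0.42, -1.12, 1.67,    # fp benchmarks
    -0.36, 0.24, -0.11, 1.35, -0.04, 0.12, -0.41,   # int benchmarks
    0.04, -0.20, -0.46, 0.27, 0.72,
]


def geometric_mean_change(deltas):
    """Geometric mean of the ratios (1 + delta/100), expressed as a percent change."""
    ratios = [1.0 + d / 100.0 for d in deltas]
    log_mean = sum(math.log(r) for r in ratios) / len(ratios)
    return (math.exp(log_mean) - 1.0) * 100.0


if __name__ == "__main__":
    print(f"geometric mean change: {geometric_mean_change(deltas_percent):+.2f}%")
```

Averaging in log space keeps a +1% gain and a -1% loss from cancelling exactly, which is why a geometric mean rather than an arithmetic one is the usual choice for ratios of benchmark scores.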
*	[llvm-cov] Try to fix the native_separators.c test some more	Vedant Kumar	2016-09-09	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	It's still breaking this bot (though, it looks like it always had been): http://lab.llvm.org:8011/builders/clang-x86-windows-msvc2015 This time, add quotes around llvm-{cov,config} so that lit won't expand them. Thanks to Reid for suggesting the patch! llvm-svn: 281072
*	[pdb] Add command line options for dumping individual streams and blocks	Zachary Turner	2016-09-09	3	-112/+54
\| \| \| \| \| \| \| \| \| \| \| \| \|	I ran into a situation where I wanted to print out the contents of page 6 of a PDB as a binary blob, and there was no straightforward way to do that. In addition to adding that, this patch also adds the ability to dump a stream by index as a binary blob, and it will stitch together all the blocks and dump the whole thing as one seemingly contiguous sequence of bytes. llvm-svn: 281070
*	[pdb] Write PDB TPI Stream from Yaml.	Zachary Turner	2016-09-09	2	-9/+12
\| \| \| \| \| \| \| \| \| \|	This writes the full sequence of type records described in Yaml to the TPI stream of the PDB file. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D24316 llvm-svn: 281063
*	[codeview] Don't assert if the array element type is incomplete	Reid Kleckner	2016-09-09	1	-0/+140
\| \| \| \| \| \| \| \| \|	This can happen when the frontend knows the debug info will be emitted somewhere else. Usually this happens for dynamic classes with out of line constructors or key functions, but it can also happen when modules are enabled. llvm-svn: 281060
*	[Bitcode] Add compatibility test for the 3.9 release	Vedant Kumar	2016-09-09	2	-0/+1664
\| \| \| \| \| \| \|	Fork off compatibility.ll for the 3.9 release. The *.bc file in this commit was produced using a Release build of the release_39 branch. llvm-svn: 281059
*	[InstCombine] add tests to show pattern matching failures due to commutation	Sanjay Patel	2016-09-09	3	-0/+148
\| \| \| \| \| \| \|	I was looking to fix a bug in getComplexity(), and these cases showed up as obvious failures. I'm not sure how to find these in general though. llvm-svn: 281055
*	[AMDGPU] Assembler: better support for immediate literals in assembler.	Sam Kolton	2016-09-09	9	-168/+1056
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Previously the assembler parsed all literals as either 32-bit integers or 32-bit floating-point values. Because of this we couldn't support f64 literals. E.g. in instruction "v_fract_f64 v[0:1], 0.5", literal 0.5 was encoded as 32-bit literal 0x3f000000, which is incorrect and will be interpreted as 3.0517578125E-5 instead of 0.5. Correct encoding is inline constant 240 (optimal) or 32-bit literal 0x3FE00000 at least.
	With this change the way immediate literals are parsed is changed. All literals are always parsed as 64-bit values, either integer or floating-point. Then we convert parsed literals to the correct form based on information about the type of operand parsed (was the literal floating or binary) and the type of expected instruction operands (is this an f32/64 or b32/64 instruction).
	Here are the rules for how we convert literals:
	- We parsed fp literal:
	    - Instruction expects 64-bit operand:
	        - If parsed literal is inlinable (e.g. v_fract_f64_e32 v[0:1], 0.5)
	            - then we do nothing with this literal
	        - Else if literal is not-inlinable but instruction requires to inline it (e.g. this is e64 encoding, v_fract_f64_e64 v[0:1], 1.5)
	            - report error
	        - Else literal is not-inlinable but we can encode it as additional 32-bit literal constant
	            - If instruction expects fp operand type (f64)
	                - Check if low 32 bits of literal are zeroes (e.g. v_fract_f64 v[0:1], 1.5)
	                    - If so then do nothing
	                - Else (e.g. v_fract_f64 v[0:1], 3.1415)
	                    - report warning that low 32 bits will be set to zeroes and precision will be lost
	                    - set low 32 bits of literal to zeroes
	            - Instruction expects integer operand type (e.g. s_mov_b64_e32 s[0:1], 1.5)
	                - report error as it is unclear how to encode this literal
	    - Instruction expects 32-bit operand:
	        - Convert parsed 64 bit fp literal to 32 bit fp. Allow loss of precision but not overflow or underflow
	        - Is this literal inlinable and are we required to inline literal (e.g. v_trunc_f32_e64 v0, 0.5)
	            - do nothing
	            - Else report error
	        - Do nothing. We can encode any other 32-bit fp literal (e.g. v_trunc_f32 v0, 10000000.0)
	- Parsed binary literal:
	    - Is this literal inlinable (e.g. v_trunc_f32_e32 v0, 35)
	        - do nothing
	    - Else, are we required to inline this literal (e.g. v_trunc_f32_e64 v0, 35)
	        - report error
	    - Else, literal is not-inlinable and we are not required to inline it
	        - Are high 32 bits of literal zeroes or same as sign bit (32 bit)
	            - do nothing (e.g. v_trunc_f32 v0, 0xdeadbeef)
	        - Else
	            - report error (e.g. v_trunc_f32 v0, 0x123456789abcdef0)
	For this change it is required that we know operand types of instruction (are they f32/64 or b32/64). I added several new register operands (they extend previous register operands) and set operand types to corresponding types:
	''' enum OperandType { OPERAND_REG_IMM32_INT, OPERAND_REG_IMM32_FP, OPERAND_REG_INLINE_C_INT, OPERAND_REG_INLINE_C_FP, } '''
	This is not working yet:
	- Several tests are failing
	- Problems with predicate methods for inline immediates
	- LLVM generated assembler parts try to select e64 encoding before e32.
	More changes are required for several AsmOperands.
	Reviewers: vpykhtin, tstellarAMD Subscribers: arsenm, kzhuravl, artem.tamazov Differential Revision: https://reviews.llvm.org/D22922 llvm-svn: 281050
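The encoding arithmetic quoted in the summary above can be checked with a few lines of Python. This is a minimal sketch using only the standard `struct` module; the helper names are hypothetical and are not part of the AMDGPU assembler.

```python
import struct


def f32_bits(x):
    """IEEE-754 single-precision bit pattern of x."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def f64_bits(x):
    """IEEE-754 double-precision bit pattern of x."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def f64_from_high32(high):
    """Double whose high 32 bits are `high` and whose low 32 bits are zero."""
    return struct.unpack("<d", struct.pack("<Q", high << 32))[0]


def low32_is_zero(x):
    """True if x fits a 32-bit literal that supplies only the high half of an f64."""
    return (f64_bits(x) & 0xFFFFFFFF) == 0


if __name__ == "__main__":
    print(hex(f32_bits(0.5)))            # the old, wrong encoding for an f64 operand
    print(hex(f64_bits(0.5) >> 32))      # high half of the f64 encoding
    print(f64_from_high32(0x3F000000))   # what the hardware sees with the old encoding
    print(low32_is_zero(1.5), low32_is_zero(3.1415))
```

The last line mirrors the "low 32 bits are zeroes" rule: 1.5 survives the truncation unchanged, whereas 3.1415 would lose precision and trigger the warning described in the rules.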
*	[Sparc][LEON] Removed the parts of the errata fixes implemented using inline ↵	Chris Dewhurst	2016-09-09	3	-68/+9
\| \| \| \| \| \|	assembly as this is not the desired behaviour for end-users. Small change to a unit test to implement this without requiring the inline assembly. llvm-svn: 281047
*	[ARM] ADD with a negative offset can become SUB for free	James Molloy	2016-09-09	1	-0/+17
\| \| \| \| \| \|	So model that directly in TTI::getIntImmCost(). llvm-svn: 281044
*	[ARM] icmp %x, -C can be lowered to a simple ADDS or CMN	James Molloy	2016-09-09	1	-0/+34
\| \| \| \| \| \|	Tell TargetTransformInfo about this so ConstantHoisting is informed. llvm-svn: 281043
*	[SelectionDAG] Ensure DAG::getZeroExtendInReg is called with a scalar type	Simon Pilgrim	2016-09-09	1	-0/+107
\| \| \| \| \| \|	Fixes issue with rL280927 identified by Mikael Holmén llvm-svn: 281042
*	[Thumb] Select (CMPZ X, -C) -> (CMPZ (ADDS X, C), 0)	James Molloy	2016-09-09	5	-6/+39
\| \| \| \| \| \|	The CMPZ #0 disappears during peepholing, leaving just a tADDi3, tADDi8 or t2ADDri. This avoids having to materialize the expensive negative constant in Thumb-1, and allows a shrinking from a 32-bit CMN to a 16-bit ADDS in Thumb-2. llvm-svn: 281040
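The rewrite relies on the fact that, in 32-bit modular arithmetic, X equals -C exactly when X + C wraps to zero. A small sketch of that equivalence follows; the function names are hypothetical and model only the Z flag, not the other condition flags.

```python
MASK32 = 0xFFFFFFFF


def z_flag_cmp_neg_const(x, c):
    """Z flag of comparing a 32-bit x against the constant -c."""
    return (x & MASK32) == ((-c) & MASK32)


def z_flag_adds_then_cmpz(x, c):
    """Z flag after ADDS x, c: set when the 32-bit sum wraps to zero."""
    return ((x + c) & MASK32) == 0


if __name__ == "__main__":
    for x in (0, 3, 0xFFFFFFFD, 0xFFFFFFFF):
        print(hex(x), z_flag_cmp_neg_const(x, 3), z_flag_adds_then_cmpz(x, 3))
```

Because the two predicates agree for every input, the positive constant C can be used as an add immediate instead of materializing -C.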
*	GlobalISel: remove G_TYPE and G_PHI	Tim Northover	2016-09-09	10	-17/+17
\| \| \| \| \| \| \| \|	These instructions were only necessary when type information was stored in the MachineInstr (because only generic MachineInstrs possessed a type). Now that it's in MachineRegisterInfo, COPY and PHI work fine. llvm-svn: 281037
*	GlobalISel: move type information to MachineRegisterInfo.	Tim Northover	2016-09-09	26	-747/+755
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We want each register to have a canonical type, which means the best place to store this is in MachineRegisterInfo rather than on every MachineInstr that happens to use or define that register. Most changes following from this are pretty simple (you need an MRI anyway if you're going to be doing any transformations, so just check the type there). But legalization doesn't really want to check redundant operands (when, for example, a G_ADD only ever has one type) so I've made use of MCInstrDesc's operand type field to encode these constraints and limit legalization's work. As an added bonus, more validation is possible, both in MachineVerifier and MachineIRBuilder (coming soon). llvm-svn: 281035
*	Revert "[mips] Fix c.<cc>.<fmt> instruction definition."	Simon Dardis	2016-09-09	22	-409/+405
\| \| \| \| \| \| \|	This reverts commit r281022. Mips buildbot broke, due to unhandled register class FCC. llvm-svn: 281033
*	[AMDGPU] Assembler: rename amd_kernel_code_t asm names according to spec	Sam Kolton	2016-09-09	6	-188/+188
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Also removed duplicate code from AMDGPUTargetAsmStreamer. This change only changes how amd_kernel_code_t is parsed and printed. No variable names are changed. Reviewers: vpykhtin, tstellarAMD Subscribers: arsenm, wdng, nhaehnle Differential Revision: https://reviews.llvm.org/D24296 llvm-svn: 281028
*	[Thumb1] Teach optimizeCompareInstr about thumb1 compares	James Molloy	2016-09-09	2	-5/+57
\| \| \| \| \| \| \| \|	This avoids us doing a completely unneeded "cmp r0, #0" after a flag-setting instruction if we only care about the Z or C flags. Add LSL/LSR to the whitelist while we're here and add testing. This code could really do with a spring clean. llvm-svn: 281027
*	[mips] Fix c.<cc>.<fmt> instruction definition.	Simon Dardis	2016-09-09	22	-405/+409
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As part of this effort, remove MipsFCmp nodes and use tablegen patterns rather than custom lowering through C++. Unexpectedly, this improves codesize for microMIPS as previous floating point setcc expansions would materialize 0 and 1 into GPRs before using the relevant mov[tf].[sd] instruction. Now $zero is used directly. Reviewers: dsanders, vkalintiris, zoran.jovanovic Differential Review: https://reviews.llvm.org/D23118 llvm-svn: 281022
*	[Sparc][LEON] Unit test for CASA instruction supported by some LEON ↵	Chris Dewhurst	2016-09-09	1	-0/+14
\| \| \| \| \| \|	processors added. llvm-svn: 281021
*	[Coroutines] Part13: Handle single edge PHINodes across suspends	Gor Nishanov	2016-09-09	1	-0/+48
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: If one of the uses of the value is a single edge PHINode, handle it. Original: %val = something <suspend> %p = PHINode [%val] After Spill + Part13: %val = something %slot = gep val.spill.slot store %val, %slot <suspend> %p = load %slot Plus tiny fixes/changes: * use correct index for coro.free in CoroCleanup * fixup id parameter in coro.free to allow authoring coroutine in plain C with __builtins Reviewers: majnemer Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D24242 llvm-svn: 281020
*	[AVX-512] Add VPCMP instructions to the load folding tables and make them ↵	Craig Topper	2016-09-09	1	-12/+6
\| \| \| \| \| \|	commutable. llvm-svn: 281013
*	[AVX-512] Add more integer vector comparison tests with loads. Some of these ↵	Craig Topper	2016-09-09	1	-0/+198
\| \| \| \| \| \| \| \|	show opportunities where we can commute to fold loads. Commutes will be added in a followup commit. llvm-svn: 281012
*	[llvm-cov] Emit a summary in the report directory's index	Vedant Kumar	2016-09-09	2	-7/+29
\| \| \| \| \| \| \| \|	llvm-cov writes out an index file in '-output-dir' mode, albeit not a very informative one. Try to fix that by using the CoverageReport API to include some basic summary information in the index file. llvm-svn: 281011
*	[llvm-cov] Speculative fix for a Windows-only test (NFC)	Vedant Kumar	2016-09-09	1	-4/+2
\| \| \| \| \| \| \| \| \| \| \| \|	This test should have broken after r280896. Fix up the test case speculatively, since I don't have a way to test it. I wonder why I didn't get any angry bot emails about this. Maybe none of the win32 bots test llvm-cov? That could explain it, since the test says it 'REQUIRES: system-windows', which is restricted to win32 hosts. Also: why is 'system-windows' not defined for non-win32 Windows bots? llvm-svn: 281008
*	[X86] Add more baseline tests for "irregular" shuffles. NFC.	Michael Kuperstein	2016-09-09	1	-13/+1136
\| \| \| \| \| \| \|	This adds more tests for shuffles where the output width does not match the input width and/or the output is generated from more than two inputs. llvm-svn: 281005
*	Win64: Don't use REX prefix for direct tail calls	Hans Wennborg	2016-09-08	4	-4/+4
\| \| \| \| \| \| \| \| \| \|	The REX prefix should be used on indirect jmps, but not direct ones. For direct jumps, the unwinder looks at the offset to determine if it's inside the current function. Differential Revision: https://reviews.llvm.org/D24359 llvm-svn: 281003
*	Remove debug info when hoisting instruction from then/else branch.	Dehao Chen	2016-09-08	1	-0/+83
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: The hoisted instruction is executed speculatively. It could affect the debugging experience, as the user would see gdb step into code that may not be expected to execute. It will also affect sample profile accuracy by assigning an incorrect frequency to source within the then/else branch. Reviewers: davidxl, dblaikie, chandlerc, kcc, echristo Subscribers: mehdi_amini, probinson, eric_niebler, andreadb, llvm-commits Differential Revision: https://reviews.llvm.org/D24164 llvm-svn: 280995
*	[InstCombine] regenerate checks	Sanjay Patel	2016-09-08	1	-228/+284
\| \| \| \|	llvm-svn: 280993
*	[LV] Ensure proper handling of multi-use case when collecting uniforms	Matthew Simpson	2016-09-08	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \|	The test case included in r280979 wasn't checking what it was supposed to be checking for the predicated store case. Fixing the test revealed that the multi-use case (when a pointer is used by both vectorized and scalarized memory accesses) wasn't being handled properly. We can't skip over non-consecutive-like pointers since they may have looked consecutive-like with a different memory access. llvm-svn: 280992
*	[InstCombine] regenerate checks	Sanjay Patel	2016-09-08	1	-60/+77
\| \| \| \|	llvm-svn: 280991
*	[RDF] Further improve handling of multiple phis reached from shadows	Krzysztof Parzyszek	2016-09-08	1	-0/+40
\| \| \| \|	llvm-svn: 280987
*	[llvm-cov] Fix issues with segment highlighting in the html view	Vedant Kumar	2016-09-08	2	-14/+14
\| \| \| \| \| \| \| \| \| \| \| \| \|	The text and html coverage views take different approaches to emitting highlighted regions. That's because this problem is easier in the text view: there's no need to worry about escaping text or adding tooltip content to a highlighted snippet. Unfortunately, the html view didn't get region highlighting quite right. This patch fixes the situation, bringing parity between the two views. llvm-svn: 280981
*	[LV] Don't mark pointers used by scalarized memory accesses uniform	Matthew Simpson	2016-09-08	1	-0/+268
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, all consecutive pointers were marked uniform after vectorization. However, if a consecutive pointer is used by a memory access that is eventually scalarized, the pointer won't remain uniform after all. An example is predicated stores. Even though a predicated store may be consecutive, it will still be scalarized, making its pointer operand non-uniform. This patch updates the logic in collectLoopUniforms to consider the cases where a memory access may be scalarized. If a memory access may be scalarized, its pointer operand is not marked uniform. The determination of whether a given memory instruction will be scalarized or not has been moved into a common function that is used by the vectorizer, cost model, and legality analysis. Differential Revision: https://reviews.llvm.org/D24271 llvm-svn: 280979
*	[Hexagon] Expand sext- and zextloads of vector types, not just extloads	Krzysztof Parzyszek	2016-09-08	1	-0/+10
\| \| \| \| \| \|	Recent change exposed this issue, breaking the Hexagon buildbots. llvm-svn: 280973
*	AMDGPU: Try to commute when selecting s_addk_i32/s_mulk_i32	Matt Arsenault	2016-09-08	2	-0/+32
\| \| \| \|	llvm-svn: 280972
*	AArch64 .arch directive - Include default arch attributes with extensions.	Eric Christopher	2016-09-08	2	-15/+14
\| \| \| \| \| \| \| \|	Fix the .arch asm parser to use the full set of features for the architecture and any extensions on the command line. Add and update testcases accordingly as well as add an extension that was used but not supported. llvm-svn: 280971
*	AMDGPU: Support commuting with immediate in src0	Matt Arsenault	2016-09-08	30	-79/+80
\| \| \| \|	llvm-svn: 280970
*	Revert "[XRay] ARM 32-bit no-Thumb support in LLVM"	Renato Golin	2016-09-08	2	-48/+0
\| \| \| \| \| \| \| \| \| \|	And associated commits, as they broke the Thumb bots. This reverts commit r280935. This reverts commit r280891. This reverts commit r280888. llvm-svn: 280967
*	Add unittest for r280760	Dehao Chen	2016-09-08	1	-0/+31
\| \| \| \|	llvm-svn: 280963
*	[InstCombine][X86] Regenerate masked memory op combine tests	Simon Pilgrim	2016-09-08	1	-88/+114
\| \| \| \|	llvm-svn: 280960
*	[InstCombine][X86] Regenerate vperm2f128/vperm2i128 combine tests	Simon Pilgrim	2016-09-08	1	-86/+116
\| \| \| \|	llvm-svn: 280959
*	[InstCombine][X86] Regenerate insertps combine tests	Simon Pilgrim	2016-09-08	1	-43/+59
\| \| \| \|	llvm-svn: 280957
*	[TableGen] AsmMatcher: Add AsmVariantName to Instruction class.	Sam Kolton	2016-09-08	1	-0/+46
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This allows specifying instructions that are available only in a specific assembler variant. If AsmVariantName is specified then the instruction will be present only in the MatchTable for this variant. If not specified then assembler variants will be determined based on AsmString. Also this allows splitting assembler match tables in the same way as it is done in the disassembler. Reviewers: ab, tstellarAMD, craig.topper, vpykhtin Subscribers: wdng Differential Revision: https://reviews.llvm.org/D24249 llvm-svn: 280952
*	Give an x86 assembler test a triple	Reid Kleckner	2016-09-08	1	-1/+1
\| \| \| \|	llvm-svn: 280950
*	[SDAGBuilder] Don't create a binary tree for switches in minsize mode	James Molloy	2016-09-08	1	-0/+34
\| \| \| \| \| \|	This bloats codesize - all of the non-leaf nodes are extra code. llvm-svn: 280932
*	[Thumb1] AND with a constant operand can be converted into BIC	James Molloy	2016-09-08	1	-0/+18
\| \| \| \| \| \| \|	So model the cost of materializing the constant operand C as the minimum of C and ~C. llvm-svn: 280929
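The idea of taking the cheaper of C and ~C can be sketched as follows. The cost function here is a deliberately simplified, hypothetical stand-in (1 for values below 256, 2 otherwise) and is not the real Thumb-1 cost table; only the AND/BIC identity and the min() are taken from the commit message.

```python
MASK32 = 0xFFFFFFFF


def toy_imm_cost(value):
    """Hypothetical cost model: small immediates are cheap, everything else is not."""
    return 1 if 0 <= value < 256 else 2


def bic(a, b):
    """Bit clear on 32-bit values: a AND NOT b."""
    return a & ~b & MASK32


def and_imm_cost(c):
    """Cost of the constant operand of AND, allowing the BIC form with ~c."""
    return min(toy_imm_cost(c & MASK32), toy_imm_cost(~c & MASK32))


if __name__ == "__main__":
    x, c = 0x12345678, 0xFFFFFF00
    print((x & c) == bic(x, ~c & MASK32))   # AND with c equals BIC with ~c
    print(and_imm_cost(c))                  # ~c is 0xFF, so the cheap form applies
```

Under this toy model a mask such as 0xFFFFFF00 costs the same as 0xFF, because the complemented constant is the one that actually has to be materialized.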
*	[Thumb1] Fix cost calculation for complemented immediates	James Molloy	2016-09-08	1	-0/+21
\| \| \| \| \| \| \| \| \| \| \| \|	Materializing something like "-3" can be done as 2 instructions: MOV r0, #3 MVN r0, r0 This has a cost of 2, not 3. It looks like we were already trying to detect this pattern in TII::getIntImmCost(), but were taking the complement of the zero-extended value instead of the sign-extended value which is unlikely to ever produce a number < 256. There were no tests failing after changing this... :/ llvm-svn: 280928
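The difference between complementing the zero-extended and the sign-extended value can be seen with the "-3" example from the message. This sketch assumes the 32-bit immediate is widened to 64 bits before being complemented; that width is an assumption for illustration, and the helper names are hypothetical.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def complement_of_zext(imm):
    """Complement after zero-extending a 32-bit immediate to 64 bits."""
    zext = imm & MASK32
    return ~zext & MASK64


def complement_of_sext(imm):
    """Complement after sign-extending a 32-bit immediate."""
    value = imm & MASK32
    if value & 0x80000000:
        value -= 1 << 32      # reinterpret as a negative number
    return ~value


if __name__ == "__main__":
    print(hex(complement_of_zext(-3)))   # upper bits are all ones, never below 256
    print(complement_of_sext(-3))        # small value, fits an 8-bit MOV immediate
```

With zero extension the upper 32 bits become ones after the complement, so the "< 256" check cannot succeed for negative immediates; with sign extension ~(-3) is 2, matching the two-instruction MOV/MVN sequence.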