bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[InstCombine] rename and reorganize some icmp folding functions; NFC	Sanjay Patel	2016-09-10	2	-24/+23
	Everything under foldICmpInstWithConstant() should now be working for splat vectors via m_APInt matchers. Ie, I've removed all of the FIXMEs that I added while cleaning that section up. Note that not all of the associated FIXMEs in the regression tests are gone though, because some of the tests require earlier folds that are still scalar-only. llvm-svn: 281139
*	We also need to pass swifterror in R12 under swiftcc not only under ccc	Arnold Schwaighofer	2016-09-10	1	-0/+3
	rdar://28190687 llvm-svn: 281138
*	[AMDGPU] Refactor MUBUF/MTBUF instructions	Valery Pykhtin	2016-09-10	6	-1168/+1306
	Differential revision: https://reviews.llvm.org/D24295 llvm-svn: 281137
*	[WebAssembly] Fix typos in comments	Heejin Ahn	2016-09-10	1	-11/+14
	llvm-svn: 281131
*	[libFuzzer] print a failed-merge warning only in the merge mode	Kostya Serebryany	2016-09-10	1	-0/+1
	llvm-svn: 281130
*	AMDGPU: Implement is{LoadFrom\|StoreTo}FrameIndex	Matt Arsenault	2016-09-10	6	-21/+90
	llvm-svn: 281128
*	AMDGPU: Fix scheduling info for spill pseudos	Matt Arsenault	2016-09-10	1	-2/+3
	These defaulted to Write32Bit. I don't think this actually matters since these don't exist during scheduling. llvm-svn: 281127
*	[asan] Add flag to allow lifetime analysis of problematic allocas	Vitaly Buka	2016-09-10	1	-0/+6
	Summary: Could be useful for comparison when we suspect that alloca was skipped because of this. Reviewers: eugenis Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24437 llvm-svn: 281126
*	[CodeGen] Rename MachineInstr::isInvariantLoad to ↵	Justin Lebar	2016-09-10	8	-16/+16
	isDereferenceableInvariantLoad. NFC Summary: I want to separate out the notions of invariance and dereferenceability at the MI level, so that they correspond to the equivalent concepts at the IR level. (Currently an MI load is MI-invariant iff it's IR-invariant and IR-dereferenceable.) First step is renaming this function. Reviewers: chandlerc Subscribers: MatzeB, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D23370 llvm-svn: 281125
*	[libFuzzer] don't print help for internal flags	Kostya Serebryany	2016-09-10	2	-0/+3
	llvm-svn: 281124
*	[libFuzzer] print a visible message if merge fails due to a crash	Kostya Serebryany	2016-09-10	3	-0/+24
	llvm-svn: 281122
*	AMDGPU: Fix immediate folding logic when shrinking instructions	Matt Arsenault	2016-09-09	3	-16/+10
	If the literal is being folded into src0, it doesn't matter if it's an SGPR because it's being replaced with the literal. Also fixes initially selecting 32-bit versions of some instructions which also confused commuting. llvm-svn: 281117
*	Inliner: Don't mark swifterror allocas with lifetime markers	Arnold Schwaighofer	2016-09-09	1	-0/+3
	This would create a bitcast use which fails the verifier: swifterror values may only be used by loads, stores, and as function arguments. rdar://28233244 llvm-svn: 281114
*	X86: Fold tail calls into conditional branches also for 64-bit (PR26302)	Hans Wennborg	2016-09-09	4	-12/+40
	This extends the optimization in r280832 to also work for 64-bit. The only quirk is that we can't do this for 64-bit Windows (yet). Differential Revision: https://reviews.llvm.org/D24423 llvm-svn: 281113
*	AMDGPU: Run LoadStoreVectorizer pass by default	Matt Arsenault	2016-09-09	2	-1/+4
	llvm-svn: 281112
*	[libFuzzer] use sizeof() in tests instead of 4 and 8	Kostya Serebryany	2016-09-09	2	-6/+6
	llvm-svn: 281111
*	LSV: Fix incorrectly increasing alignment	Matt Arsenault	2016-09-09	1	-18/+16
	If the unaligned access has a dynamic offset, the offset may be odd, which would make the adjusted alignment incorrect to use. llvm-svn: 281110
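	A minimal sketch of why a dynamic offset defeats an alignment increase. It is not the LoadStoreVectorizer code; `alignment_of` and the example base address are hypothetical names chosen for illustration. The point is only that a well-aligned base plus an odd offset yields a byte-aligned address.

```python
def alignment_of(addr: int) -> int:
    """Largest power of two that divides addr (addr must be non-zero)."""
    return addr & -addr


# Hypothetical 16-byte-aligned base address.
BASE = 0x1000

# With a dynamic offset, any of these addresses may occur at run time.
# Only the worst case is safe to assume.
worst_case = min(alignment_of(BASE + offset) for offset in range(1, 16))
```

	Because the worst case over unknown offsets is 1, raising the alignment of the base object says nothing about the alignment of the access itself.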
*	[InstCombine] use m_APInt to allow icmp ult X, C folds for splat constant ↵	Sanjay Patel	2016-09-09	1	-8/+13
	vectors llvm-svn: 281107
*	[libFuzzer] one more puzzle for value profile	Kostya Serebryany	2016-09-09	3	-0/+25
	llvm-svn: 281106
*	[X86][XOP] Fix VPERMIL2PD mask creation on 32-bit targets	Simon Pilgrim	2016-09-09	1	-5/+5
	Use getConstVector helper to correctly create v2i64/v4i64 constants on 32-bit targets llvm-svn: 281105
*	[Hexagon] Fix disassembler crash after r279255	Krzysztof Parzyszek	2016-09-09	1	-0/+3
	When p0 was added as an explicit operand to the duplex subinstructions, the disassembler was not updated to reflect this. llvm-svn: 281104
*	Create phi nodes for swifterror values at the end of the phi instructions list	Arnold Schwaighofer	2016-09-09	1	-1/+1
	ISel makes assumptions about the order of phi nodes. rdar://28190150 llvm-svn: 281095
*	[NVPTX] Implement llvm.fabs.f32, llvm.max.f32, etc.	Justin Lebar	2016-09-09	2	-16/+132
	Summary: Previously these only worked via NVPTX-specific intrinsics. This change will allow us to convert these target-specific intrinsics into the general LLVM versions, allowing existing LLVM passes to reason about their behavior. It also gets us some minor codegen improvements as-is, from situations where we canonicalize code into one of these llvm intrinsics. Reviewers: majnemer Subscribers: llvm-commits, jholewinski, tra Differential Revision: https://reviews.llvm.org/D24300 llvm-svn: 281092
*	ARM: move the builtins libcall CC setup	Saleem Abdulrasool	2016-09-09	3	-166/+171
	Move the target specific setup into the target specific lowering setup. As pointed out by Anton, the initial change was moving this too high up the stack resulting in a violation of the layering (the target generic code path setup target specific bits). Sink this into the ARM specific setup. NFC. llvm-svn: 281088
*	Add a lower level zlib::uncompress.	Rafael Espindola	2016-09-09	1	-6/+13
	SmallVectors are convenient, but they don't cover every use case. In particular, they are fairly large (3 pointers + one element) and there is no way to take ownership of the buffer to put it somewhere else. This patch then adds a lower level interface that works with any buffer. llvm-svn: 281082
*	AMDGPU : Fix mqsad_u32_u8 instruction incorrect data type.	Wei Ding	2016-09-09	3	-9/+17
	Differential Revision: http://reviews.llvm.org/D23700 llvm-svn: 281081
*	AMDGPU/SI: Make sure llvm.amdgcn.implicitarg.ptr() is 8-byte aligned for HSA	Tom Stellard	2016-09-09	2	-1/+6
	Reviewers: arsenm Subscribers: arsenm, wdng, nhaehnle, llvm-commits Differential Revision: https://reviews.llvm.org/D24405 llvm-svn: 281080
*	[pdb] Print out some more info when dumping a raw stream.	Zachary Turner	2016-09-09	1	-0/+4
	We have various command line options that print the type of a stream, the size of a stream, etc but nowhere that it can all be viewed together. Since a previous patch introduced the ability to dump the bytes of a stream, this seems like a good place to present a full view of the stream's properties including its size, what kind of data it represents, and the blocks it occupies. So I added the ability to print that information to the -stream-data command line option. llvm-svn: 281077
*	Do not widen load for different variable in GVN.	Dehao Chen	2016-09-09	1	-37/+1
	Summary: Widening load in GVN is too early because it will block other optimizations like PRE, LICM.
	https://llvm.org/bugs/show_bug.cgi?id=29110
	The SPECCPU2006 benchmark impact of this patch:
	Reference: o2_nopatch
	(1): o2_patched
	Benchmark                         Base:Reference   (1)
	-------------------------------------------------------
	spec/2006/fp/C++/444.namd         25.2             -0.08%
	spec/2006/fp/C++/447.dealII       45.92            +1.05%
	spec/2006/fp/C++/450.soplex       41.7             -0.26%
	spec/2006/fp/C++/453.povray       35.65            +1.68%
	spec/2006/fp/C/433.milc           23.79            +0.42%
	spec/2006/fp/C/470.lbm            41.88            -1.12%
	spec/2006/fp/C/482.sphinx3        47.94            +1.67%
	spec/2006/int/C++/471.omnetpp     22.46            -0.36%
	spec/2006/int/C++/473.astar       21.19            +0.24%
	spec/2006/int/C++/483.xalancbmk   36.09            -0.11%
	spec/2006/int/C/400.perlbench     33.28            +1.35%
	spec/2006/int/C/401.bzip2         22.76            -0.04%
	spec/2006/int/C/403.gcc           32.36            +0.12%
	spec/2006/int/C/429.mcf           41.04            -0.41%
	spec/2006/int/C/445.gobmk         26.94            +0.04%
	spec/2006/int/C/456.hmmer         24.5             -0.20%
	spec/2006/int/C/458.sjeng         28               -0.46%
	spec/2006/int/C/462.libquantum    55.25            +0.27%
	spec/2006/int/C/464.h264ref       45.87            +0.72%
	geometric mean                                     +0.23%
	For most benchmarks, it's a wash, but we do see stable improvements on some benchmarks, e.g. 447,453,482,400.
	Reviewers: davidxl, hfinkel, dberlin, sanjoy, reames Subscribers: gberry, junbuml Differential Revision: https://reviews.llvm.org/D24096 llvm-svn: 281074
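	A rough sketch of how a geometric-mean change can be recomputed from the per-benchmark percentages quoted in the message above. It assumes each percentage is the relative change of the patched score against the reference; the function name is hypothetical. Because the quoted deltas are already rounded to two decimals, the recomputed value should land close to the reported +0.23% rather than match it digit for digit.

```python
import math

# Per-benchmark changes in percent, copied from the commit message.
deltas_percent = [
    -0.08, 1.05, -0.26, 1.68, 0.42, -1.12, 1.67,         # fp benchmarks
    -0.36, 0.24, -0.11,                                  # int C++ benchmarks
    1.35, -0.04, 0.12, -0.41, 0.04, -0.20, -0.46, 0.27, 0.72,  # int C benchmarks
]


def geometric_mean_change(deltas):
    """Geometric mean of the ratios (1 + d/100), expressed as a percent change."""
    log_sum = sum(math.log(1.0 + d / 100.0) for d in deltas)
    return (math.exp(log_sum / len(deltas)) - 1.0) * 100.0


print(f"geometric mean change: {geometric_mean_change(deltas_percent):+.2f}%")
```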
*	Fix another -Wunused-variable for non-assert build.	Rui Ueyama	2016-09-09	1	-3/+4
	llvm-svn: 281073
*	Fix -Wunused-variable for non-assert build.	Rui Ueyama	2016-09-09	1	-3/+2
	llvm-svn: 281069
*	[pdb] Pass CVRecord's through the visitor as non-const references.	Zachary Turner	2016-09-09	5	-85/+85
	This simplifies a lot of code, and will actually be necessary for an upcoming patch to serialize TPI record hash values. The idea before was that visitors should be examining records, not modifying them. But this is no longer true with a visitor that constructs a CVRecord from Yaml. To handle this until now, we were doing some fixups on CVRecord objects at a higher level, but the code is really awkward, and it makes sense to just have the visitor write the bytes into the CVRecord. In doing so I uncovered a few bugs related to `Data` and `RawData` and fixed those. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D24362 llvm-svn: 281067
*	[libFuzzer] one more puzzle, value_profile cracks it in a second	Kostya Serebryany	2016-09-09	3	-0/+25
	llvm-svn: 281066
*	[pdb] Write PDB TPI Stream from Yaml.	Zachary Turner	2016-09-09	9	-74/+177
	This writes the full sequence of type records described in Yaml to the TPI stream of the PDB file. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D24316 llvm-svn: 281063
*	[codeview] Don't assert if the array element type is incomplete	Reid Kleckner	2016-09-09	1	-15/+26
	This can happen when the frontend knows the debug info will be emitted somewhere else. Usually this happens for dynamic classes with out of line constructors or key functions, but it can also happen when modules are enabled. llvm-svn: 281060
*	[AMDGPU] Assembler: better support for immediate literals in assembler.	Sam Kolton	2016-09-09	14	-351/+708
	Summary: Previously the assembler parsed all literals as either 32-bit integers or 32-bit floating-point values. Because of this we couldn't support f64 literals. E.g. in instruction "v_fract_f64 v[0:1], 0.5", literal 0.5 was encoded as 32-bit literal 0x3f000000, which is incorrect and will be interpreted as 3.0517578125E-5 instead of 0.5. Correct encoding is inline constant 240 (optimal) or 32-bit literal 0x3FE00000 at least.
	With this change the way immediate literals are parsed is changed. All literals are always parsed as 64-bit values, either integer or floating-point. Then we convert parsed literals to the correct form based on information about the type of operand parsed (was the literal floating or binary) and the type of expected instruction operands (is this an f32/64 or b32/64 instruction).
	Here are the rules for how we convert literals:
	- We parsed fp literal:
		- Instruction expects 64-bit operand:
			- If parsed literal is inlinable (e.g. v_fract_f64_e32 v[0:1], 0.5)
				- then we do nothing with this literal
			- Else if literal is not-inlinable but instruction requires to inline it (e.g. this is e64 encoding, v_fract_f64_e64 v[0:1], 1.5)
				- report error
			- Else literal is not-inlinable but we can encode it as additional 32-bit literal constant
				- If instruction expects fp operand type (f64)
					- Check if low 32 bits of literal are zeroes (e.g. v_fract_f64 v[0:1], 1.5)
						- If so then do nothing
					- Else (e.g. v_fract_f64 v[0:1], 3.1415)
						- report warning that low 32 bits will be set to zeroes and precision will be lost
						- set low 32 bits of literal to zeroes
				- Instruction expects integer operand type (e.g. s_mov_b64_e32 s[0:1], 1.5)
					- report error as it is unclear how to encode this literal
		- Instruction expects 32-bit operand:
			- Convert parsed 64 bit fp literal to 32 bit fp. Allow loss of precision but not overflow or underflow
			- Is this literal inlinable and are we required to inline literal (e.g. v_trunc_f32_e64 v0, 0.5)
				- do nothing
				- Else report error
			- Do nothing. We can encode any other 32-bit fp literal (e.g. v_trunc_f32 v0, 10000000.0)
	- Parsed binary literal:
		- Is this literal inlinable (e.g. v_trunc_f32_e32 v0, 35)
			- do nothing
		- Else, are we required to inline this literal (e.g. v_trunc_f32_e64 v0, 35)
			- report error
		- Else, literal is not-inlinable and we are not required to inline it
			- Are high 32 bit of literal zeroes or same as sign bit (32 bit)
				- do nothing (e.g. v_trunc_f32 v0, 0xdeadbeef)
			- Else
				- report error (e.g. v_trunc_f32 v0, 0x123456789abcdef0)
	For this change it is required that we know operand types of instruction (are they f32/64 or b32/64). I added several new register operands (they extend previous register operands) and set operand types to corresponding types:
	'''
	enum OperandType {
	  OPERAND_REG_IMM32_INT,
	  OPERAND_REG_IMM32_FP,
	  OPERAND_REG_INLINE_C_INT,
	  OPERAND_REG_INLINE_C_FP,
	}
	'''
	This is not working yet:
	- Several tests are failing
	- Problems with predicate methods for inline immediates
	- LLVM generated assembler parts try to select e64 encoding before e32.
	More changes are required for several AsmOperands.
	Reviewers: vpykhtin, tstellarAMD Subscribers: arsenm, kzhuravl, artem.tamazov Differential Revision: https://reviews.llvm.org/D22922 llvm-svn: 281050
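	The encoding numbers quoted in the message above can be checked with a small sketch. This is only an illustration of the IEEE-754 bit patterns involved, not the assembler's code; the helper names are hypothetical. It assumes a 32-bit literal attached to a 64-bit operand supplies the high 32 bits, with the low 32 bits taken as zero, as the message describes.

```python
import struct


def f32_bits(x: float) -> int:
    """Bit pattern of x encoded as an IEEE-754 single."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def f64_bits(x: float) -> int:
    """Bit pattern of x encoded as an IEEE-754 double."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def f64_from_high32(hi: int) -> float:
    """Double whose high 32 bits are hi and whose low 32 bits are zero."""
    return struct.unpack("<d", struct.pack("<Q", hi << 32))[0]


def low32_is_zero(x: float) -> bool:
    """True if x survives truncation to a 32-bit literal without losing bits."""
    return f64_bits(x) & 0xFFFFFFFF == 0


old_literal = f32_bits(0.5)                # the f32 pattern the old parser emitted
misread = f64_from_high32(old_literal)     # what an f64 instruction would see
correct_literal = f64_bits(0.5) >> 32      # high half of the real f64 encoding

print(hex(old_literal), misread, hex(correct_literal))
print(low32_is_zero(1.5), low32_is_zero(3.1415))
```

	The last line mirrors the two f64 cases in the rules: 1.5 fits in the high half exactly, while 3.1415 has non-zero low bits and would trigger the precision warning.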
*	[Sparc][LEON] Removed the parts of the errata fixes implemented using inline ↵	Chris Dewhurst	2016-09-09	1	-76/+0
	assembly as this is not the desired behaviour for end-users. Small change to a unit test to implement this without requiring the inline assembly. llvm-svn: 281047
*	[ARM] ADD with a negative offset can become SUB for free	James Molloy	2016-09-09	1	-0/+4
	So model that directly in TTI::getIntImmCost(). llvm-svn: 281044
*	[ARM] icmp %x, -C can be lowered to a simple ADDS or CMN	James Molloy	2016-09-09	1	-0/+11
	Tell TargetTransformInfo about this so ConstantHoisting is informed. llvm-svn: 281043
*	[SelectionDAG] Ensure DAG::getZeroExtendInReg is called with a scalar type	Simon Pilgrim	2016-09-09	2	-3/+3
	Fixes issue with rL280927 identified by Mikael Holmén llvm-svn: 281042
*	[Thumb] Select (CMPZ X, -C) -> (CMPZ (ADDS X, C), 0)	James Molloy	2016-09-09	1	-0/+42
	The CMPZ #0 disappears during peepholing, leaving just a tADDi3, tADDi8 or t2ADDri. This avoids having to materialize the expensive negative constant in Thumb-1, and allows a shrinking from a 32-bit CMN to a 16-bit ADDS in Thumb-2. llvm-svn: 281040
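	The equivalence behind this selection can be sketched with 32-bit modular arithmetic. The code below is an illustration only, with hypothetical function names: it models just the zero (equality) result that CMPZ cares about, and shows that comparing X against -C gives the same answer as adding C to X and testing the sum against zero.

```python
MASK32 = 0xFFFFFFFF


def cmpz_with_negative_const(x: int, c: int) -> bool:
    """Equality result of comparing a 32-bit register x against the constant -c."""
    return (x & MASK32) == ((-c) & MASK32)


def adds_then_cmpz_zero(x: int, c: int) -> bool:
    """Equality result of (x + c) compared against 0, with 32-bit wraparound."""
    return ((x + c) & MASK32) == 0


# Check a handful of register values, including wraparound cases.
samples = [0, 1, 5, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFB, 0xFFFFFFFF]
agree = all(
    cmpz_with_negative_const(x, c) == adds_then_cmpz_zero(x, c)
    for x in samples
    for c in (1, 5, 255)
)
```

	Only the positive constant C appears as an immediate in the second form, which is why the negative constant no longer needs to be materialized.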
*	GlobalISel: remove G_TYPE and G_PHI	Tim Northover	2016-09-09	5	-20/+3
	These instructions were only necessary when type information was stored in the MachineInstr (because only generic MachineInstrs possessed a type). Now that it's in MachineRegisterInfo, COPY and PHI work fine. llvm-svn: 281037
*	GlobalISel: fix comments and add assertions for valid instructions.	Tim Northover	2016-09-09	1	-4/+88
	llvm-svn: 281036
*	GlobalISel: move type information to MachineRegisterInfo.	Tim Northover	2016-09-09	17	-383/+275
	We want each register to have a canonical type, which means the best place to store this is in MachineRegisterInfo rather than on every MachineInstr that happens to use or define that register. Most changes following from this are pretty simple (you need an MRI anyway if you're going to be doing any transformations, so just check the type there). But legalization doesn't really want to check redundant operands (when, for example, a G_ADD only ever has one type) so I've made use of MCInstrDesc's operand type field to encode these constraints and limit legalization's work. As an added bonus, more validation is possible, both in MachineVerifier and MachineIRBuilder (coming soon). llvm-svn: 281035
*	Revert "[mips] Fix c.<cc>.<fmt> instruction definition."	Simon Dardis	2016-09-09	15	-539/+209
	This reverts commit r281022. Mips buildbot broke, due to unhandled register class FCC. llvm-svn: 281033
*	[AMDGPU] Assembler: rename amd_kernel_code_t asm names according to spec	Sam Kolton	2016-09-09	3	-242/+85
	Summary: Also removed duplicate code from AMDGPUTargetAsmStreamer. This change only changes how amd_kernel_code_t is parsed and printed. No variable names are changed. Reviewers: vpykhtin, tstellarAMD Subscribers: arsenm, wdng, nhaehnle Differential Revision: https://reviews.llvm.org/D24296 llvm-svn: 281028
*	[Thumb1] Teach optimizeCompareInstr about thumb1 compares	James Molloy	2016-09-09	1	-4/+21
	This avoids us doing a completely unneeded "cmp r0, #0" after a flag-setting instruction if we only care about the Z or C flags. Add LSL/LSR to the whitelist while we're here and add testing. This code could really do with a spring clean. llvm-svn: 281027
*	[AMDGPU] Assembler: match e32 VOP instructions before e64.	Sam Kolton	2016-09-09	7	-32/+126
	Summary: Split assembler match table in 4 tables with assembler variants:
	- Default - all instructions except VOP3, SDWA and DPP
	- VOP3
	- SDWA
	- DPP
	First match Default table then VOP3, SDWA and DPP.
	Reviewers: tstellarAMD, artem.tamazov, vpykhtin Subscribers: arsenm, wdng, nhaehnle, AMDGPU Differential Revision: https://reviews.llvm.org/D24252 llvm-svn: 281023
*	[mips] Fix c.<cc>.<fmt> instruction definition.	Simon Dardis	2016-09-09	15	-209/+539
	As part of this effort, remove MipsFCmp nodes and use tablegen patterns rather than custom lowering through C++. Unexpectedly, this improves codesize for microMIPS as previous floating point setcc expansions would materialize 0 and 1 into GPRs before using the relevant mov[tf].[sd] instruction. Now $zero is used directly. Reviewers: dsanders, vkalintiris, zoran.jovanovic Differential Review: https://reviews.llvm.org/D23118 llvm-svn: 281022
*	[Coroutines] Part13: Handle single edge PHINodes across suspends	Gor Nishanov	2016-09-09	3	-4/+28
	Summary: If one of the uses of the value is a single edge PHINode, handle it.
	Original:
		%val = something
		<suspend>
		%p = PHINode [%val]
	After Spill + Part13:
		%val = something
		%slot = gep val.spill.slot
		store %val, %slot
		<suspend>
		%p = load %slot
	Plus tiny fixes/changes:
		* use correct index for coro.free in CoroCleanup
		* fixup id parameter in coro.free to allow authoring coroutine in plain C with __builtins
	Reviewers: majnemer Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D24242 llvm-svn: 281020