path: root/llvm/lib
* [CodeGen] Add hook/combine to form vector extloads, enabled on X86. (Ahmed Bougacha, 2015-02-05, 3 files, -12/+128)

  The combine that forms extloads used to be disabled on vector types,
  because "None of the supported targets knows how to perform load and
  sign extend on vectors in one instruction." That's not entirely true,
  since at least SSE4.1 X86 knows how to do those sextloads/zextloads
  (with PMOVS/ZX). But there are several aspects to getting this right.

  First, vector extloads are controlled by a profitability callback.
  For instance, on ARM, several instructions have folded extload forms,
  so it's not always beneficial to create an extload node (and trying
  to match extloads is a whole 'nother can of worms).

  The interesting optimization enables folding of s/zextloads to
  illegal (splittable) vector types, expanding them into smaller legal
  extloads. It's not ideal (it introduces some legalization-like
  behavior in the combine) but it's better than the obvious
  alternative: form illegal extloads, and later try to split them up.
  If you do that, you might generate extloads that can't be split up,
  but have a valid ext+load expansion. At vector-op legalization time,
  it's too late to generate this kind of code, so you end up forced to
  scalarize. It's better to just avoid creating egregiously illegal
  nodes.

  This optimization is enabled unconditionally on X86.

  Note that the splitting combine is happy with "custom" extloads. As
  is, this bypasses the actual custom lowering, and just unrolls the
  extload. But from what I've seen, this is still much better than the
  current custom lowering, which does some kind of unrolling at the end
  anyway (see for instance load_sext_4i8_to_4i64 on SSE2, and the added
  FIXME).

  Also note that the existing combine that forms extloads is now also
  enabled on legal vectors. This doesn't have a big effect on X86
  (because sext+load is usually combined to sext_inreg+aextload). On
  ARM it fires on some rare occasions; that's for a separate commit.

  Differential Revision: http://reviews.llvm.org/D6904
  llvm-svn: 228325
* X86 ABI fix for return values > 24 bytes. (Andrew Trick, 2015-02-05, 1 file, -8/+9)

  The return value's address must be returned in %rax, i.e. the callee
  needs to copy the sret argument (%rdi) into the return value (%rax).
  This probably won't manifest as a bug when the caller is
  LLVM-compiled code. But it is an ABI guarantee and tools expect it.

  llvm-svn: 228321
* [Hexagon] Renaming A2_addi and formatting. (Colin LeMahieu, 2015-02-05, 7 files, -37/+34)
  llvm-svn: 228318
* move fold comments to the corresponding fold; NFC (Sanjay Patel, 2015-02-05, 1 file, -3/+9)
  llvm-svn: 228317
* [Hexagon] Since decoding conflicts have been resolved, isCodeGenOnly = 0 by default and remove explicitly setting it. (Colin LeMahieu, 2015-02-05, 6 files, -532/+207)
  llvm-svn: 228316
* LowerSwitch: Use ConstantInt for CaseRange::{Low,High} (Hans Wennborg, 2015-02-05, 1 file, -20/+20)

  Case values are always ConstantInt. This allows us to remove a bunch
  of casts. NFC.

  llvm-svn: 228312
* LowerSwitch: remove default args from CaseRange ctor; NFC (Hans Wennborg, 2015-02-05, 1 file, -3/+2)
  llvm-svn: 228311
* R600/SI: Fix bug in TTI loop unrolling preferences (Tom Stellard, 2015-02-05, 1 file, -1/+1)

  We should be setting UnrollingPreferences::MaxCount to MAX_UINT
  instead of UnrollingPreferences::Count. Count is a 'forced unrolling
  factor', while MaxCount sets an upper limit to the unrolling factor.

  Setting Count to MAX_UINT was causing the loop in the testcase to be
  unrolled 15 times, when it only had a maximum of 4 iterations.

  llvm-svn: 228303
* R600/SI: Fix bug from insertion of llvm.SI.end.cf into loop headers (Tom Stellard, 2015-02-05, 1 file, -3/+24)

  The llvm.SI.end.cf intrinsic is used to mark the end of if-then
  blocks, if-then-else blocks, and loops. It is responsible for
  updating the exec mask to re-enable threads that had been masked
  during the preceding control flow block. For example:

    s_mov_b64 exec, 0x3                 ; Initial exec mask
    s_mov_b64 s[0:1], exec              ; Saved exec mask
    v_cmpx_gt_u32 exec, s[2:3], v0, 0   ; llvm.SI.if
    do_stuff()
    s_or_b64 exec, exec, s[0:1]         ; llvm.SI.end.cf

  The bug fixed by this patch was one where the llvm.SI.end.cf
  intrinsic was being inserted into the header of loops. This would
  happen when an if block terminated in a loop header and we would end
  up with code like this:

    s_mov_b64 exec, 0x3                 ; Initial exec mask
    s_mov_b64 s[0:1], exec              ; Saved exec mask
    v_cmpx_gt_u32 exec, s[2:3], v0, 0   ; llvm.SI.if
    do_stuff()

  LOOP:                                 ; Start of loop header
    s_or_b64 exec, exec, s[0:1]         ; llvm.SI.end.cf <- BUG: The exec
                                        ; mask has the same value at the
                                        ; beginning of each loop iteration.
    do_stuff();
    s_cbranch_execnz LOOP

  The fix is to create a new basic block before the loop and insert the
  llvm.SI.end.cf there. This way the exec mask is restored before the
  start of the loop instead of at the beginning of each iteration.

  llvm-svn: 228302
* [PowerPC] Implement the vclz instructions for PWR8 (Bill Schmidt, 2015-02-05, 2 files, -4/+22)

  Patch by Kit Barton.

  Add the vector count leading zeros instruction for byte, halfword,
  word, and doubleword sizes. This is a fairly straightforward addition
  after the changes made for vpopcnt:

  1. Add the correct definitions for the various instructions in
     PPCInstrAltivec.td
  2. Make the CTLZ operation legal on vector types when using P8Altivec
     in PPCISelLowering.cpp

  Test Plan: Created new test case in test/CodeGen/PowerPC/vec_clz.ll
  to check the instructions are being generated when the CTLZ operation
  is used in LLVM. Check the encoding and decoding in
  test/MC/PowerPC/ppc_encoding_vmx.s and
  test/Disassembler/PowerPC/ppc_encoding_vmx.txt respectively.

  llvm-svn: 228301
* Add a FIXME. (Rafael Espindola, 2015-02-05, 1 file, -0/+3)
  Thanks to Eric for the suggestion.
  llvm-svn: 228300
* Removing an unused variable warning I accidentally introduced with my last warning fix; NFC. (Aaron Ballman, 2015-02-05, 1 file, -1/+1)
  llvm-svn: 228295
* Silencing an MSVC warning about a switch statement with no cases; NFC. (Aaron Ballman, 2015-02-05, 1 file, -8/+5)
  llvm-svn: 228294
* [X86][MMX] Handle i32->mmx conversion using movd (Bruno Cardoso Lopes, 2015-02-05, 4 files, -0/+38)

  Implement a BITCAST dag combine to transform i32->mmx conversion
  patterns into an X86-specific node (MMX_MOVW2D) and guarantee that
  moves between i32 and x86mmx are better handled, i.e., don't use
  store-load to do the conversion.

  llvm-svn: 228293
* [X86][MMX] Move MMX DAG node to proper file (Bruno Cardoso Lopes, 2015-02-05, 2 files, -3/+8)
  llvm-svn: 228291
* Teach isDereferenceablePointer() to look through bitcast constant expressions. (Michael Kuperstein, 2015-02-05, 1 file, -1/+1)

  This fixes a LICM regression due to the new load+store pair
  canonicalization.

  Differential Revision: http://reviews.llvm.org/D7411
  llvm-svn: 228284
* [X86] Add xrstors/xsavec/xsaves/clflushopt/clwb/pcommit instructions (Craig Topper, 2015-02-05, 3 files, -4/+27)
  llvm-svn: 228283
* [X86] Remove two feature flags that covered sets of instructions that have no patterns or intrinsics. (Craig Topper, 2015-02-05, 6 files, -21/+4)

  Since we don't check feature flags in the assembler parser for any
  instruction sets, these flags don't provide any value. This frees up
  2 of the fully utilized feature flags.

  llvm-svn: 228282
* R600/SI: Fix i64 truncate to i1 (Matt Arsenault, 2015-02-05, 1 file, -0/+6)
  llvm-svn: 228273
* Disable enumeral mismatch warning when compiling llvm with gcc. (Larisse Voufo, 2015-02-05, 1 file, -2/+3)

  Tested with gcc 4.9.2. Compiling with -Werror was producing:

    .../llvm/lib/Target/X86/X86ISelLowering.cpp: In function
    'llvm::SDValue lowerVectorShuffleAsBitMask(llvm::SDLoc, llvm::MVT,
    llvm::SDValue, llvm::SDValue, llvm::ArrayRef<int>,
    llvm::SelectionDAG&)':
    .../llvm/lib/Target/X86/X86ISelLowering.cpp:7771:40: error:
    enumeral mismatch in conditional expression:
    'llvm::X86ISD::NodeType' vs 'llvm::ISD::NodeType'
    [-Werror=enum-compare]
       V = DAG.getNode(VT.isFloatingPoint() ? X86ISD::FAND : ISD::AND, DL, VT, V,
                                            ^

  llvm-svn: 228271
* Implement new heuristic for complete loop unrolling. (Michael Zolotukhin, 2015-02-05, 1 file, -2/+332)

  Complete loop unrolling can make some loads constant, thus enabling a
  lot of other optimizations. To catch such cases, we look for loads
  that might become constants and estimate the number of instructions
  that would be simplified or become dead after substitution.

  Example: Suppose we have:

    int a[] = {0, 1, 0};
    v = 0;
    for (i = 0; i < 3; i ++)
      v += b[i]*a[i];

  If we completely unroll the loop, we would get:

    v = b[0]*a[0] + b[1]*a[1] + b[2]*a[2]

  Which then will be simplified to:

    v = b[0]* 0 + b[1]* 1 + b[2]* 0

  And finally:

    v = b[1]

  llvm-svn: 228265
* Value soft float calls as more expensive in the inliner. (Cameron Esfahani, 2015-02-05, 5 files, -1/+46)

  Summary: When evaluating floating point instructions in the inliner,
  ask the TTI whether it is an expensive operation. By default, it's
  not an expensive operation. This keeps the default behavior the same
  as before. The ARM TTI has been updated to return back TCC_Expensive
  for targets which don't have hardware floating point.

  Reviewers: chandlerc, echristo
  Reviewed By: echristo
  Subscribers: t.p.northover, aemerson, llvm-commits
  Differential Revision: http://reviews.llvm.org/D6936

  llvm-svn: 228263
* Try to fix the build in MCValue.cpp (Reid Kleckner, 2015-02-05, 1 file, -1/+1)
  llvm-svn: 228256
* Fixup. (Sean Silva, 2015-02-05, 1 file, -2/+2)
  Didn't see these calls in my release build locally when testing.
  llvm-svn: 228254
* [MC] Remove various unused MCAsmInfo parameters. (Sean Silva, 2015-02-05, 4 files, -15/+10)
  llvm-svn: 228244
* IR: Rename 'operator ==()' to 'isKeyOf()', NFC (Duncan P. N. Exon Smith, 2015-02-05, 1 file, -4/+4)
  `isKeyOf()` is a clearer name than overloading `operator==()`.
  llvm-svn: 228242
* [Hexagon] Deleting unused instructions and adding isCodeGenOnly to some defs. (Colin LeMahieu, 2015-02-05, 3 files, -34/+8)
  llvm-svn: 228238
* [Hexagon] Updating load extend to i64 patterns. (Colin LeMahieu, 2015-02-04, 1 file, -85/+30)
  llvm-svn: 228237
* [fuzzer] add flag prefer_small_during_initial_shuffle, be a bit more verbose (Kostya Serebryany, 2015-02-04, 4 files, -6/+32)
  llvm-svn: 228235
* [Hexagon] Cleaning up i1 load and extension patterns. (Colin LeMahieu, 2015-02-04, 1 file, -24/+11)
  llvm-svn: 228232
* [Hexagon] Simplifying more load and store patterns and using new addressing patterns. (Colin LeMahieu, 2015-02-04, 1 file, -72/+41)
  llvm-svn: 228231
* R600/SI: Enable subreg liveness by default (Tom Stellard, 2015-02-04, 1 file, -0/+4)
  llvm-svn: 228228
* [Hexagon] Simplifying some load and store patterns. (Colin LeMahieu, 2015-02-04, 1 file, -68/+35)
  llvm-svn: 228227
* AsmParser: Split out LineField, NFC (Duncan P. N. Exon Smith, 2015-02-04, 1 file, -2/+17)
  Split out `LineField`, which restricts the legal line numbers. This
  will make it easier to be consistent between different node parsers.
  llvm-svn: 228226
* [Hexagon] Converting absolute-address load patterns to use AddrGP. (Colin LeMahieu, 2015-02-04, 1 file, -48/+13)
  llvm-svn: 228225
* [Hexagon] Converting atomic store/load to use AddrGP addressing. (Colin LeMahieu, 2015-02-04, 1 file, -33/+10)
  llvm-svn: 228223
* [Hexagon] Simplifying some store patterns. Adding AddrGP addressing forms. (Colin LeMahieu, 2015-02-04, 3 files, -24/+16)
  llvm-svn: 228220
* [fuzzer] add -runs=N to limit the number of runs per session. Also, make sure we do some mutations w/o cross over. (Kostya Serebryany, 2015-02-04, 4 files, -9/+18)
  llvm-svn: 228214
* Fix GCC error caused by r228211 (Duncan P. N. Exon Smith, 2015-02-04, 1 file, -0/+4)
  llvm-svn: 228213
* IR: Reduce boilerplate in DenseMapInfo overrides, NFC (Duncan P. N. Exon Smith, 2015-02-04, 1 file, -91/+63)
  Minimize the boilerplate required for the `MDNode` subclass
  `DenseMapInfo<>` overrides in `LLVMContextImpl`.
  llvm-svn: 228212
* AsmParser: Move MDField details to source file, NFC (Duncan P. N. Exon Smith, 2015-02-04, 2 files, -38/+44)
  Move all the types of `MDField` to an anonymous namespace in the
  source file. This also eliminates the duplication of `ParseMDField()`
  declarations in the header for each new field type.
  llvm-svn: 228211
* AsmParser: Simplify assertion, NFC (Duncan P. N. Exon Smith, 2015-02-04, 1 file, -1/+1)
  llvm-svn: 228209
* AsmParser: Remove dead code, NFC (Duncan P. N. Exon Smith, 2015-02-04, 1 file, -4/+0)
  This condition is checked in the generic `ParseMDField()`.
  llvm-svn: 228208
* AsmParser: Simplify MDUnsignedField (Duncan P. N. Exon Smith, 2015-02-04, 2 files, -17/+14)
  We only need `uint64_t` for storage.
  llvm-svn: 228205
* IR: Initialize MDNode abbreviations en masse, NFC (Duncan P. N. Exon Smith, 2015-02-04, 1 file, -3/+4)
  llvm-svn: 228203
* IR: Define MDNode uniquing sets automatically, NFC (Duncan P. N. Exon Smith, 2015-02-04, 1 file, -3/+2)
  llvm-svn: 228200
* Don't try to make sections in comdats SHF_MERGE. (Rafael Espindola, 2015-02-04, 1 file, -4/+4)

  Parts of llvm were not expecting it and we wouldn't print the entity
  size of the section.

  Given what comdats are used for, having SHF_MERGE sections would be
  just a small improvement, so just disable it for now.

  Fixes pr22463.

  llvm-svn: 228196
* R600/SI: Expand misaligned 16-bit memory accesses (Tom Stellard, 2015-02-04, 1 file, -0/+5)
  llvm-svn: 228190
* R600/SI: Make more store operations legal (Tom Stellard, 2015-02-04, 2 files, -12/+0)

  v2i32, i32, trunc i32 to i16, and trunc i32 to i8 stores are legal
  for all address spaces. We had marked them as custom in order to
  lower them for the private address space, but this is no longer
  necessary.

  This enables lowering of misaligned stores of these types in the
  DAGLegalizer.

  llvm-svn: 228189
* R600: Don't promote i64 stores to v2i32 during DAG legalization (Tom Stellard, 2015-02-04, 2 files, -3/+25)

  We take care of this during instruction selection now. This fixes a
  potential infinite loop when lowering misaligned stores.

  llvm-svn: 228188