bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	[Hexagon] Adding add64 and sub64 instructions.	Colin LeMahieu	2014-11-25	1	-0/+44
\| \| \| \|	llvm-svn: 222795
*	Reverting 222792	Colin LeMahieu	2014-11-25	1	-41/+0
\| \| \| \|	llvm-svn: 222793
*	[Hexagon] Adding compare with immediate instructions.	Colin LeMahieu	2014-11-25	1	-0/+41
\| \| \| \|	llvm-svn: 222792
*	[Hexagon] Adding NOP encoding bits.	Colin LeMahieu	2014-11-25	2	-5/+6
\| \| \| \|	llvm-svn: 222791
*	R600/SI: Only use one DEBUG()	Matt Arsenault	2014-11-25	1	-2/+1
\| \| \| \|	llvm-svn: 222789
*	[AVX512] Add 512b integer shift by variable intrinsics and patterns.	Cameron McInally	2014-11-25	3	-48/+43
\| \| \| \|	llvm-svn: 222786
*	[Hexagon] Adding C2_mux instruction.	Colin LeMahieu	2014-11-25	5	-32/+31
\| \| \| \|	llvm-svn: 222784
*	Remove space before tab in all AVX512 mnemonic strings.	Craig Topper	2014-11-25	1	-137/+137
\| \| \| \|	llvm-svn: 222778
*	[Hexagon] Replacing cmp* instructions with ones that contain encoding bits.	Colin LeMahieu	2014-11-25	5	-56/+79
\| \| \| \|	llvm-svn: 222771
*	Revert r222746: That commit did not update any tests and caused two R600	Chandler Carruth	2014-11-25	1	-2/+1
\| \| \| \| \| \| \| \|	tests to start failing. Original commit log: R600/SI: Disable commutativity for MIN/MAX_LEGACY llvm-svn: 222753
*	[mips][micromips] Use call instructions with short delay slots	Zoran Jovanovic	2014-11-25	1	-21/+49
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D6338 llvm-svn: 222752
*	R600/SI: Disable commutativity for MIN/MAX_LEGACY	Marek Olsak	2014-11-25	1	-1/+2
\| \| \| \|	llvm-svn: 222746
*	R600/SI: Fix allocating flat_scr_lo / flat_scr_hi	Matt Arsenault	2014-11-25	1	-0/+2
\| \| \| \| \| \| \| \|	Only the super register flat_scr was marked as reserved, so in some cases with high register usage it would still try to allocate the subregisters. llvm-svn: 222737
*	[FastISel][AArch64] Fix and extend the tbz/tbnz pattern matching.	Juergen Ributzka	2014-11-25	1	-19/+20
\| \| \| \| \| \| \| \| \| \|	The pattern matching failed to recognize all instances of "-1", because when comparing against "-1" we didn't use an APInt of the same bitwidth. This commit fixes this and also adds inverse versions of the condition to catch more cases. llvm-svn: 222722
*	[PowerPC] Add the 'attn' instruction	Hal Finkel	2014-11-25	2	-0/+8
\| \| \| \| \| \| \| \|	The attn instruction is not part of the Power ISA, but is documented in the A2 user manual, and is accepted by the GNU assembler for the A2 and the POWER4+. Reported as part of PR21650. llvm-svn: 222712
*	[PowerPC] Implement combineRepeatedFPDivisors	Hal Finkel	2014-11-24	2	-0/+23
\| \| \| \| \| \| \| \|	This does not matter on newer cores (where we can use reciprocal estimates in fast-math mode anyway), but for older cores this allows us to generate better fast-math code where we have multiple FDIVs with a common divisor. llvm-svn: 222710
*	[AArch64] Fix clobber computation in A57LoadBalancing pass.	Chad Rosier	2014-11-24	1	-1/+7
\| \| \| \| \| \| \|	Extremely difficult to reproduce, so no test case included. PR21637 llvm-svn: 222677
*	Removing unused variable.	Colin LeMahieu	2014-11-24	1	-1/+0
\| \| \| \|	llvm-svn: 222676
*	[PowerPC] Fix PR 21652 - copy st_other bits on symbol assignment	Ulrich Weigand	2014-11-24	1	-0/+17
\| \| \| \| \| \| \| \| \| \| \|	When processing an assignment in the integrated assembler that sets a symbol to the value of another symbol, we need to copy the st_other bits that encode the local entry point offset. Modeled after MipsTargetELFStreamer::emitAssignment handling of the ELF::STO_MIPS_MICROMIPS flag. llvm-svn: 222672
*	[Hexagon] Adding asrh instruction, removing unused multiclasses.	Colin LeMahieu	2014-11-24	2	-40/+7
\| \| \| \|	llvm-svn: 222670
*	[Hexagon] Adding aslh instruction.	Colin LeMahieu	2014-11-24	2	-5/+7
\| \| \| \|	llvm-svn: 222668
*	[Hexagon] Adding zxth instruction.	Colin LeMahieu	2014-11-24	2	-5/+7
\| \| \| \|	llvm-svn: 222662
*	[Hexagon] Adding zxtb instruction.	Colin LeMahieu	2014-11-24	2	-5/+43
\| \| \| \|	llvm-svn: 222660
*	[mips][microMIPS] Fix JRADDIUSP instruction	Jozef Kolek	2014-11-24	1	-1/+0
\| \| \| \| \| \| \| \| \|	Fix JRADDIUSP instruction, remove delay slot flag because this instruction doesn't have delay slot. Differential Revision: http://reviews.llvm.org/D6365 llvm-svn: 222658
*	[mips][microMIPS] Implement LBU16, LHU16, LW16, SB16, SH16 and SW16 instructions	Jozef Kolek	2014-11-24	5	-0/+164
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D5122 llvm-svn: 222653
*	[mips][microMIPS] Implement 16-bit instructions registers including ZERO instead of S0	Jozef Kolek	2014-11-24	3	-0/+41
\| \| \| \| \| \| \| \| \| \|	Implement microMIPS 16-bit instructions register set: $0, $2-$7 and $17. Differential Revision: http://reviews.llvm.org/D5780 llvm-svn: 222652
*	Removing a variable that is initialized but never read. The original author has been alerted to the warning, in case this variable is meant to be used. Fixes -Werror builds in the meantime.	Aaron Ballman	2014-11-24	1	-6/+2
\| \| \| \| \| \|	llvm-svn: 222649
*	[mips][microMIPS] Implement disassembler support for 16-bit instructions	Jozef Kolek	2014-11-24	2	-14/+57
\| \| \| \| \| \| \| \| \| \| \|	With the help of new method readInstruction16() two bytes are read and decodeInstruction() is called with DecoderTableMicroMips16, if this fails four bytes are read and decodeInstruction() is called with DecoderTableMicroMips32. Differential Revision: http://reviews.llvm.org/D6149 llvm-svn: 222648
*	[X86] Improved target specific combine on VSELECT dag nodes.	Andrea Di Biagio	2014-11-24	1	-89/+8
\| \| \| \| \| \| \| \| \| \| \|	This patch teaches function 'transformVSELECTtoBlendVECTOR_SHUFFLE' how to convert VSELECT dag nodes to shuffles on targets that do not have SSE4.1. On pre-SSE4.1 targets, we can still perform blend operations using movss/movsd. Also, removed a target specific combine that performed a premature lowering of VSELECT nodes to target specific MOVSS/MOVSD nodes. llvm-svn: 222647
*	[X86] Fixes bug in build_vector v4x32 lowering	Michael Kuperstein	2014-11-23	1	-3/+6
\| \| \| \| \| \| \| \| \| \| \| \| \|	r222375 made some improvements to build_vector lowering of v4x32 and v4xf32 into an insertps, but it missed a case where: 1. A single extracted element is used twice. 2. The lower of the two non-zero indexes should be preserved, and the higher should be used for the dest mask. This caused a crash, since the source value for the insertps ends up uninitialized. Differential Revision: http://reviews.llvm.org/D6377 llvm-svn: 222635
*	Add missing override keywords.	Craig Topper	2014-11-23	1	-2/+2
\| \| \| \|	llvm-svn: 222634
*	Masked Vector Load and Store Intrinsics.	Elena Demikhovsky	2014-11-23	4	-4/+166
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Introduced new target-independent intrinsics in order to support masked vector loads and stores. The loop vectorizer optimizes loops containing conditional memory accesses by generating these intrinsics for existing targets AVX2 and AVX-512. The vectorizer asks the target about availability of masked vector loads and stores. Added SDNodes for masked operations and lowering patterns for X86 code generator. Examples: <16 x i32> @llvm.masked.load.v16i32(i8* %addr, <16 x i32> %passthru, i32 4 /* align */, <16 x i1> %mask) declare void @llvm.masked.store.v8f64(i8* %addr, <8 x double> %value, i32 4, <8 x i1> %mask) Scalarizer for other targets (not AVX2/AVX-512) will be done in a separate patch. http://reviews.llvm.org/D6191 llvm-svn: 222632
*	R600: Fix extloads of i1 on R600/Evergreen	Matt Arsenault	2014-11-23	1	-0/+5
\| \| \| \|	llvm-svn: 222631
*	R600: Fix assert on copy of an i1 on pre-SI	Matt Arsenault	2014-11-23	1	-1/+2
\| \| \| \| \| \| \|	i1 is not a legal type on Evergreen, so this combine proceeded and tried to produce a bitcast between i1 and i8. llvm-svn: 222630
*	Tidied up target triple OS detection. NFC	Simon Pilgrim	2014-11-22	3	-11/+6
\| \| \| \| \| \|	Use Triple::isOS*() helper functions where possible. llvm-svn: 222622
*	[x86] Teach the vector shuffle yet another step of canonicalization.	Chandler Carruth	2014-11-22	1	-2/+13
\| \| \| \| \| \| \| \|	No functionality changed yet, but this will prevent subsequent patches from having to handle permutations of various interleaved shuffle patterns. llvm-svn: 222614
*	Fix transformation of add with pc argument to adr for non-immediate	Joerg Sonnenberger	2014-11-21	1	-5/+25
\| \| \| \| \| \|	arguments. llvm-svn: 222587
*	R600/SI: Add an s_mov_b32 to patterns which use the M0RegClass	Tom Stellard	2014-11-21	2	-24/+8
\| \| \| \| \| \| \|	We need to use a s_mov_b32 rather than a copy, so that CSE will eliminate redundant moves to the m0 register. llvm-svn: 222584
*	R600/SI: Emit s_mov_b32 m0, -1 before every DS instruction	Tom Stellard	2014-11-21	6	-40/+28
\| \| \| \| \| \| \| \| \| \| \| \|	This s_mov_b32 will write to a virtual register from the M0Reg class and all the ds instructions now take an extra M0Reg explicit argument. This change is necessary to prevent issues with the scheduler mixing together instructions that expect different values in the m0 registers. llvm-svn: 222583
*	R600/SI: Add SIFoldOperands pass	Tom Stellard	2014-11-21	4	-0/+209
\| \| \| \| \| \| \|	This pass attempts to fold the source operands of mov and copy instructions into their uses. llvm-svn: 222581
*	[mips][microMIPS] This patch implements functionality in MIPS delay slot	Jozef Kolek	2014-11-21	3	-11/+53
\| \| \| \| \| \| \| \| \| \| \|	filler such that, if the delay slot filler has to put a NOP instruction into the delay slot of a microMIPS BEQ or BNE instruction that uses the register $0, then instead of emitting the NOP the instruction is replaced by the corresponding microMIPS compact branch instruction, i.e. BEQZC or BNEZC. Differential Revision: http://reviews.llvm.org/D3566 llvm-svn: 222580
*	R600/SI: Mark s_mov_b32 and s_mov_b64 as rematerializable	Tom Stellard	2014-11-21	1	-0/+2
\| \| \| \|	llvm-svn: 222579
*	[Hexagon] Adding sxth instruction.	Colin LeMahieu	2014-11-21	3	-8/+10
\| \| \| \|	llvm-svn: 222577
*	[Hexagon] Adding sxtb instruction. Renaming some identically named classes that will be removed after converting referencing defs.	Colin LeMahieu	2014-11-21	2	-15/+87
\| \| \| \| \| \|	llvm-svn: 222575
*	[Hexagon] Removing SUB_rr and replacing with A2_sub.	Colin LeMahieu	2014-11-21	4	-23/+6
\| \| \| \|	llvm-svn: 222571
*	Add a feature flag for slow 32-byte unaligned memory accesses [x86].	Sanjay Patel	2014-11-21	4	-10/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds a feature flag to avoid unaligned 32-byte load/store AVX codegen for Sandy Bridge and Ivy Bridge. There is no functionality change intended for those chips. Previously, the absence of AVX2 was being used as a proxy to detect this feature. But that hindered codegen for AVX-enabled AMD chips such as btver2 that do not have the 32-byte unaligned access slowdown. Performance measurements are included in PR21541 ( http://llvm.org/bugs/show_bug.cgi?id=21541 ). Differential Revision: http://reviews.llvm.org/D6355 llvm-svn: 222544
*	[x86] Restructure the checking patterns for v16 and v32 avx2 vector	Chandler Carruth	2014-11-21	1	-28/+24
\| \| \| \| \| \| \| \| \| \|	shuffle lowering to allow much better blend matching. Specifically, with the new structure the code seems clearer to me and we correctly can hit the cases where merging two 128-bit lanes is a clear win and can be shuffled cheaply afterward. llvm-svn: 222539
*	[x86] Make the previous logic significantly less conservative and get	Chandler Carruth	2014-11-21	1	-14/+10
\| \| \| \| \| \| \| \| \| \| \| \| \|	a bunch more improvements. Non-lane-crossing is fine, the key is that lane merging only makes sense for single-input shuffles. Not sure why I got so turned around here. The code all works, I was just using the wrong model for it. This only updates v4 and v8 lowering. The v16 and v32 lowering requires restructuring the entire check sequence. llvm-svn: 222537
*	[x86] Teach the x86 vector shuffle lowering to detect mergable 128-bit	Chandler Carruth	2014-11-21	1	-4/+154
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	lanes. By special casing these we can often either reduce the total number of shuffles significantly or reduce the number of (high latency on Haswell) AVX2 shuffles that potentially cross 128-bit lanes. Even when these don't actually cross lanes, they have much higher latency to support that. Doing two of them and a blend is worse than doing a single insert across the 128-bit lanes to blend and then doing a single interleaved shuffle. While this seems like a narrow case, it kept cropping up on me and the difference is huge as you can see in many of the test cases. I first hit this trying to perfectly fix the interleaving shuffle patterns used by Halide for AVX2. llvm-svn: 222533
*	[X86] For Silvermont CPU use 16-bit division instead of 64-bit for small positive numbers	Alexey Volkov	2014-11-21	4	-12/+23
\| \| \| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D5938 llvm-svn: 222521
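
The rewrite described in the "[PowerPC] Implement combineRepeatedFPDivisors" entry (llvm-svn: 222710) can be illustrated outside LLVM. The following is a minimal sketch in Python, not the PowerPC backend's code: both function names are hypothetical, and it only shows the arithmetic idea of replacing several divisions that share a divisor with one reciprocal and several multiplications. The two forms can differ in the last bit of the result, which is why the commit message ties the transformation to fast-math code.

```python
def divide_by_common_divisor(numerators, divisor):
    """Baseline form: one floating-point division per numerator."""
    return [n / divisor for n in numerators]


def divide_via_reciprocal(numerators, divisor):
    """Rewritten form: a single division computes the reciprocal,
    and each numerator is then multiplied by it.

    Hypothetical helper for illustration only; not an LLVM API.
    """
    reciprocal = 1.0 / divisor  # the only division that remains
    return [n * reciprocal for n in numerators]


if __name__ == "__main__":
    values = [1.0, 5.0, 7.0]
    print(divide_by_common_divisor(values, 3.0))
    print(divide_via_reciprocal(values, 3.0))
```

The benefit comes from division typically having much higher latency than multiplication, so the rewrite pays off once the same divisor is used more than once.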