bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	Part 1 of 3 patches that completes very long conditional branches	Reed Kotler	2013-11-29	2	-16/+44
\| \| \| \| \| \| \| \| \| \| \| \|	in constant islands for Mips16. We introduce JalB16 as a synonym for Jal16. It makes it easier to read and is also necessary because Jal16 is a call instruction but JalB16 is being used as a branch. Various parts of LLVM will not work properly even in this late stage of the backend if we use what was declared as a call instruction to function as a branch. For one, basic block labels may not get emitted in some situations. llvm-svn: 195968
*	Revert revision 195965.	Zoran Jovanovic	2013-11-29	1	-3/+1
\| \| \| \|	llvm-svn: 195967
*	Fixed issue with microMIPS long branch.	Zoran Jovanovic	2013-11-29	1	-1/+3
\| \| \| \|	llvm-svn: 195965
*	Adjust PPC A2 input operand latencies	Hal Finkel	2013-11-29	1	-52/+52
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On the PPC A2, instructions are only issued after their input operands are ready. Model this by specifying that input operands are read at dispatch (0 cycles after issue). This changes all input operand latencies from 1 to 0.
	Significant test-suite performance changes (these are 99.5% confidence intervals on 6 runs for both before and after):
	speedups:
	MultiSource/Benchmarks/sim/sim -1.21915% +/- 0.175063%
	MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt -1.23946% +/- 1.05133%
	SingleSource/Benchmarks/Misc/flops-2 -1.24237% +/- 0.681362%
	MultiSource/Applications/JM/lencod/lencod -1.33992% +/- 0.757498%
	MultiSource/Benchmarks/TSVC/InductionVariable-flt/InductionVariable-flt -1.51802% +/- 1.21468%
	MultiSource/Benchmarks/TSVC/GlobalDataFlow-flt/GlobalDataFlow-flt -2.18818% +/- 1.28605%
	MultiSource/Benchmarks/TSVC/Packing-flt/Packing-flt -2.21977% +/- 1.19499%
	SingleSource/Benchmarks/BenchmarkGame/spectral-norm -2.29822% +/- 0.671871%
	MultiSource/Benchmarks/TSVC/Packing-dbl/Packing-dbl -2.40975% +/- 0.355931%
	SingleSource/Benchmarks/Misc/fp-convert -2.41899% +/- 1.04751%
	MultiSource/Benchmarks/TSVC/Searching-dbl/Searching-dbl -2.50349% +/- 0.126765%
	SingleSource/Benchmarks/Misc/flops-3 -3.00214% +/- 0.700795%
	MultiSource/Benchmarks/TSVC/LoopRestructuring-flt/LoopRestructuring-flt -3.56995% +/- 3.2929%
	MultiSource/Applications/sgefa/sgefa -4.24908% +/- 2.00413%
	MultiSource/Benchmarks/ASC_Sequoia/IRSmk/IRSmk -18.1294% +/- 3.96489%
	regressions:
	MultiSource/Benchmarks/TSVC/Reductions-dbl/Reductions-dbl 1.03249% +/- 0.178547%
	MultiSource/Applications/hexxagon/hexxagon 1.16597% +/- 0.285235%
	MultiSource/Benchmarks/TSVC/IndirectAddressing-flt/IndirectAddressing-flt 1.39576% +/- 1.07855%
	SingleSource/Benchmarks/Misc-C++/stepanov_v1p2 1.71539% +/- 0.173182%
	MultiSource/Benchmarks/Fhourstones-3.1/fhourstones3.1 1.90013% +/- 0.866472%
	MultiSource/Benchmarks/TSVC/Recurrences-dbl/Recurrences-dbl 2.39854% +/- 1.05914%
	MultiSource/Benchmarks/TSVC/ControlFlow-dbl/ControlFlow-dbl 2.4402% +/- 0.817904%
	MultiSource/Benchmarks/TSVC/LoopRestructuring-dbl/LoopRestructuring-dbl 5.87997% +/- 3.3172%
	MultiSource/Benchmarks/Trimaran/netbench-crc/netbench-crc 9.02643% +/- 5.79591%
	MultiSource/Benchmarks/VersaBench/bmm/bmm 10.3517% +/- 1.227%
	Obviously, there are data points on both sides of this; but I think, overall, this supports making the change. llvm-svn: 195951
*	Teach LocalStackSlotAllocation that stackmaps/patchpoints don't have range	Lang Hames	2013-11-29	1	-3/+5
\| \| \| \| \| \|	constraints on their frame offsets. llvm-svn: 195950
*	Create a PPC440 SchedMachineModel	Hal Finkel	2013-11-29	2	-6/+20
\| \| \| \| \| \| \|	Some of the older PPC processor definitions don't have associated SchedMachineModels; correct this for the PPC440. llvm-svn: 195949
*	Fixup PPC440 load/store operand latencies	Hal Finkel	2013-11-29	1	-19/+19
\| \| \| \| \| \| \| \|	The operand latencies for loads and stores in the PPC440 itinerary were wrong (the store operands are all inputs, and the "with update" (pre-increment) instructions need a latency for the additional output). llvm-svn: 195948
*	Adjust PPC440 operand latencies	Hal Finkel	2013-11-29	1	-54/+54
\| \| \| \| \| \| \| \| \| \| \| \|	The operand latencies for the PPC440 should be specified relative to dispatch, not relative to the initial fetch-and-decode stages. Because most instructions (ignoring bypass) wait in dispatch until their operands are ready, this is modeled as reading input operands "at dispatch" (0 cycles after issue), and so every input and output operand has 4 cycles subtracted from it. This could alter scheduling slightly, but I don't expect a large effect. llvm-svn: 195947
*	Don't model the fetch and decode units for the PPC440	Hal Finkel	2013-11-29	1	-180/+61
\| \| \| \| \| \| \| \| \| \|	Modeling the fetch and decode units in the PPC440 itinerary does not add anything to the hazard detection capability (and so modeling them just wastes compile time). No functionality change intended. llvm-svn: 195946
*	Remove unused variable from r195944.	Lang Hames	2013-11-29	1	-1/+0
\| \| \| \|	llvm-svn: 195945
*	Refactor a lot of patchpoint/stackmap related code to simplify and make it	Lang Hames	2013-11-29	9	-232/+211
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	target independent. Most of the x86 specific stackmap/patchpoint handling was necessitated by the use of the native address-mode format for frame index operands. PEI has now been modified to treat stackmap/patchpoint similarly to DEBUG_INFO, allowing us to use a simple, platform independent register/offset pair for frame indexes on stackmap/patchpoints. Notes: - Folding is now platform independent and automatically supported. - Emitting patchpoints with direct memory references now just involves calling the TargetLoweringBase::emitPatchPoint utility method from the target's XXXTargetLowering::EmitInstrWithCustomInserter method. (See X86TargetLowering for an example). - No more ugly platform-specific operand parsers. This patch shouldn't change the generated output for X86. llvm-svn: 195944
*	AArch64: The pattern match should check the range of the immediate value.	Hao Liu	2013-11-29	1	-113/+123
\| \| \| \| \| \| \|	Or we can generate some illegal instructions. E.g. shrn2 v0.4s, v1.2d, #35. The legal range should be in [1, 16]. llvm-svn: 195941
*	Add missing pattern for supporting intrinsic function vbsl_f64 with	Jiangning Liu	2013-11-29	1	-0/+3
\| \| \| \| \| \|	argument double floating point. llvm-svn: 195938
*	[AArch64 NEON] Fix an assertion failure when disassembling the SHLL instruction.	Kevin Qin	2013-11-29	2	-35/+73
\| \| \| \|	llvm-svn: 195936
*	Rein in overzealous InstCombine of fptrunc(OP(fpextend, fpextend)).	Stephen Canon	2013-11-28	1	-26/+82
\| \| \| \|	llvm-svn: 195934
*	Refactor to remove a bit of duplication. No functionality change.	Rafael Espindola	2013-11-28	1	-24/+24
\| \| \| \|	llvm-svn: 195933
*	Silence sign-compare warning and reduce nesting.	Benjamin Kramer	2013-11-28	1	-7/+7
\| \| \| \| \| \|	No functionality change. llvm-svn: 195932
*	Remove an always true parameter.	Rafael Espindola	2013-11-28	1	-6/+2
\| \| \| \|	llvm-svn: 195931
*	[CMake] Let add_public_tablegen_target() provide intrinsics_gen, too.	NAKAMURA Takumi	2013-11-28	11	-22/+0
\| \| \| \| \| \| \| \| \| \|	I think, in principle, intrinsics_gen may be added explicitly. That said, it can be added incidentally, since each target already has dependencies on llvm-tblgen. Almost all source files depend on both CommonTableGen and intrinsics_gen. Explicit add_dependencies() calls have been pruned under lib/Target. llvm-svn: 195929
*	[CMake] Make add_public_tablegen_target responsible for providing the dependency on CommonTableGen.	NAKAMURA Takumi	2013-11-28	61	-109/+11
\| \| \| \| \| \| \| \| \|	add_public_tablegen_target adds *CommonTableGen to LLVM_COMMON_DEPENDS. LLVM_COMMON_DEPENDS affects add_llvm_library (and other add_target stuff) within its scope. llvm-svn: 195927
*	The global prefix is always one char. Don't use a string for it.	Rafael Espindola	2013-11-28	5	-13/+8
\| \| \| \|	llvm-svn: 195926
*	[CMake] Prune include_directories() in llvm/lib/Target, take #2.	NAKAMURA Takumi	2013-11-28	16	-35/+0
\| \| \| \| \| \|	I forgot to commit them. They were staging in my local repo. llvm-svn: 195924
*	[mips] Revert test commit r195922.	Daniel Sanders	2013-11-28	1	-1/+0
\| \| \| \|	llvm-svn: 195923
*	[mips] A test commit to test my Herald and Audit workflow	Daniel Sanders	2013-11-28	1	-0/+1
\| \| \| \| \| \|	Will be reverted in the next commit llvm-svn: 195922
*	[CMake] Prune include_directories() in llvm/lib/Target. add_llvm_target() sets them.	NAKAMURA Takumi	2013-11-28	25	-53/+0
\| \| \| \| \| \|	llvm-svn: 195921
*	Add newline at eof.	NAKAMURA Takumi	2013-11-28	1	-1/+1
\| \| \| \|	llvm-svn: 195920
*	Use the mangler consistently instead of using getGlobalPrefix directly.	Rafael Espindola	2013-11-28	5	-19/+17
\| \| \| \|	llvm-svn: 195911
*	Don't share functional units among the PPC itineraries	Hal Finkel	2013-11-28	9	-1261/+1364
\| \| \| \| \| \| \| \| \| \| \|	Instead of sharing functional unit names between the various PPC itineraries, give each core its own unit names prefixed with the core name. This follows the convention used by other backends (such as ARM), and removes a non-obvious ordering dependency between the various PPCSchedule*.td files. No functionality change intended. llvm-svn: 195908
*	Remove the variable only used by assert to avoid the build failure	Jiangning Liu	2013-11-28	1	-2/+2
\| \| \| \| \| \|	caused by build options [-Werror,-Wunused-variable]. llvm-svn: 195905
*	AArch64: Fix a bug about disassembling post-index load single element to 4 vectors	Hao Liu	2013-11-28	1	-4/+4
\| \| \| \| \| \|	llvm-svn: 195903
*	Check in conditional branches for constant islands. Still need to finish	Reed Kotler	2013-11-28	2	-5/+157
\| \| \| \| \| \| \| \| \| \| \| \|	conditional branches for very large targets. That will be the next small patch. Everything should now, in principle, work as well (functionality-wise) as without constant islands, so we decided at Mips/Imagination to make constant islands the default for Mips16 now so that it will get exercised a lot. This port is still experimental, though hopefully we will change that status soon. Some more cleanup and code review is in order, but things are converging fast. llvm-svn: 195902
*	[mips] Redefine TAILCALL as a pseudo instruction.	Akira Hatanaka	2013-11-27	3	-10/+15
\| \| \| \| \| \|	No functionality change. llvm-svn: 195896
*	DebugInfo: Do not include variables only referenced by templates in aranges.	David Blaikie	2013-11-27	1	-3/+6
\| \| \| \| \| \| \| \|	ARanges included even extern variables referenced by pointer non-type template parameters even though that variable isn't part of this compilation unit. llvm-svn: 195895
*	Add MipsOptimizePICCall.cpp to CMakeLists.txt.	Akira Hatanaka	2013-11-27	1	-0/+1
\| \| \| \|	llvm-svn: 195894
*	[mips] Implement the following optimizations using dominance information to	Akira Hatanaka	2013-11-27	4	-8/+305
\| \| \| \| \| \| \| \| \| \| \|	make PIC calls a little more efficient: 1. Remove instructions setting up $gp if it is known that a function has been called at least once. 2. Save the address of a called function in a register instead of loading it from the GOT at every call site. llvm-svn: 195892
*	Add IIC_ prefix to PPC instruction-class names	Hal Finkel	2013-11-27	13	-2355/+2366
\| \| \| \| \| \| \| \| \| \| \| \| \|	This adds the IIC_ prefix to the instruction itinerary class names, giving the PPC backend a naming convention for itinerary classes that is more consistent with that used by the X86 and ARM backends. Instruction scheduling in the PPC backend needs a bunch of cleanup and improvement (especially for the ooo cores). This is just a preliminary step. No functionality change intended. llvm-svn: 195890
*	Don't set GlobalPrefix to the default value.	Rafael Espindola	2013-11-27	2	-2/+0
\| \| \| \|	llvm-svn: 195884
*	The R600 has its own asm printer which doesn't use GlobalPrefix. Drop it.	Rafael Espindola	2013-11-27	1	-1/+0
\| \| \| \|	llvm-svn: 195883
*	R600: Expand vector FABS	Tom Stellard	2013-11-27	1	-0/+1
\| \| \| \| \|	NOTE: This is a candidate for the 3.4 branch. llvm-svn: 195881
*	R600/SI: Implement spilling of SGPRs v5	Tom Stellard	2013-11-27	6	-13/+161
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	SGPRs are spilled into VGPRs using the {READ,WRITE}LANE_B32 instructions. v2: - Fix encoding of Lane Mask - Use correct register flags, so we don't overwrite the low dword when restoring multi-dword registers. v3: - Register spilling seems to hang the GPU, so replace all shaders that need spilling with a dummy shader. v4: - Fix *LANE definitions - Change destination reg class for 32-bit SMRD instructions v5: - Remove small optimization that was crashing Serious Sam 3. https://bugs.freedesktop.org/show_bug.cgi?id=68224 https://bugs.freedesktop.org/show_bug.cgi?id=71285 NOTE: This is a candidate for the 3.4 branch. llvm-svn: 195880
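The spilling scheme described in r195880 can be pictured with a toy model. This is only a sketch under stated assumptions, not the backend's code: `Vgpr`, `writeLane`, `readLane`, `spill64` and `restore64` are hypothetical names, a VGPR is modeled as 64 lanes of 32 bits (one SI wavefront), and each spilled dword gets its own lane so that restoring a multi-dword register does not overwrite the low dword.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Toy model (assumption): a VGPR holds one 32-bit value per lane, 64 lanes.
struct Vgpr {
  std::array<std::uint32_t, 64> lanes{};
};

// Stands in for WRITELANE_B32: store a scalar value into one lane.
void writeLane(Vgpr &v, unsigned lane, std::uint32_t value) {
  assert(lane < v.lanes.size());
  v.lanes[lane] = value;
}

// Stands in for READLANE_B32: read one lane back as a scalar value.
std::uint32_t readLane(const Vgpr &v, unsigned lane) {
  assert(lane < v.lanes.size());
  return v.lanes[lane];
}

// Spill a 64-bit scalar as two dwords into two consecutive lanes.
void spill64(Vgpr &v, unsigned firstLane, std::uint64_t s) {
  writeLane(v, firstLane, static_cast<std::uint32_t>(s));            // low dword
  writeLane(v, firstLane + 1, static_cast<std::uint32_t>(s >> 32));  // high dword
}

// Restore the 64-bit scalar from the same two lanes.
std::uint64_t restore64(const Vgpr &v, unsigned firstLane) {
  std::uint64_t lo = readLane(v, firstLane);
  std::uint64_t hi = readLane(v, firstLane + 1);
  return (hi << 32) | lo;
}
```

Spilling a value with `spill64(v, 10, x)` and reading it back with `restore64(v, 10)` round-trips `x`, with the low dword in lane 10 and the high dword in lane 11.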
*	R600/SI: Use SGPR_32 register class for 32-bit SMRD outputs	Tom Stellard	2013-11-27	1	-2/+5
\| \| \| \| \| \| \| \|	Writing to the M0 register from an SMRD instruction hangs the GPU, so we need to use the SGPR_32 register class, which does not include M0. NOTE: This is a candidate for the 3.4 branch. llvm-svn: 195879
*	R600: Add support for ISD::FROUND	Tom Stellard	2013-11-27	3	-4/+18
\| \| \| \| \|	NOTE: This is a candidate for the 3.4 branch. llvm-svn: 195878
*	Show stackmap entry encodings in stackmap debug logs. This makes it easier to	Lang Hames	2013-11-27	1	-23/+27
\| \| \| \| \| \| \|	cross-reference debug output with encoded stack-maps, and to create stackmap test-cases. llvm-svn: 195874
*	Remove dead code.	Rafael Espindola	2013-11-27	1	-36/+4
\| \| \| \| \| \|	MO_ExternalSymbol and MO_JumpTableIndex don't show up in inline asm. llvm-svn: 195861
*	Convert two if sequences to switches.	Rafael Espindola	2013-11-27	1	-10/+21
\| \| \| \|	llvm-svn: 195859
*	Use a switch.	Rafael Espindola	2013-11-27	1	-5/+11
\| \| \| \|	llvm-svn: 195857
*	Use the same tls section name as msvc.	Rafael Espindola	2013-11-27	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We currently error in clang with: "error: thread-local storage is unsupported for the current target", but we can start to get the llvm level ready. When compiling template<typename T> struct foo { static __declspec(thread) int bar; }; template<typename T> __declspec(thread) int foo<T>::bar; template struct foo<int>; msvc produces SECTION HEADER #3 .tls$ name 0 physical address 0 virtual address 4 size of raw data 12F file pointer to raw data (0000012F to 00000132) 0 file pointer to relocation table 0 file pointer to line numbers 0 number of relocations 0 number of line numbers C0301040 flags Initialized Data COMDAT; sym= "public: static int foo<int>::bar" (?bar@?$foo@H@@2HA) 4 byte align Read Write gcc produces a ".data$__emutls_v.<symbol>" for the testcase with __declspec(thread) replaced with thread_local. llvm-svn: 195849
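For readers without MSVC, the test case from r195849 can be written with standard C++11 `thread_local`, which is the substitution the message mentions for gcc. This is a sketch of that variant only; the `= 0` initializer and the helper `bump()` are additions for illustration and are not part of the original test case.

```cpp
// Portable variant of the test case: a thread-local static data member
// of a class template, explicitly instantiated for int.
template <typename T>
struct foo {
  static thread_local int bar;
};

// Out-of-class definition of the member (initializer added for clarity).
template <typename T>
thread_local int foo<T>::bar = 0;

// Explicit instantiation, as in the commit message.
template struct foo<int>;

// Hypothetical helper: increments the calling thread's copy of the member.
int bump() { return ++foo<int>::bar; }
```

Each thread gets its own `foo<int>::bar`, so calling `bump()` in one thread does not change the value seen by another.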
*	Remove more dead code now that this is only used for inline asm.	Rafael Espindola	2013-11-27	1	-4/+1
\| \| \| \| \| \| \|	MO_ConstantPoolIndex is handled in printLeaMemReference. MO_JumpTableIndex and MO_ExternalSymbol don't show up in inline asm. llvm-svn: 195847
*	Fix the AArch64 NEON bug exposed by checking constant integer argument range of ACLE intrinsics.	Jiangning Liu	2013-11-27	2	-164/+82
\| \| \| \| \| \|	llvm-svn: 195843
*	Convert more methods in static helpers.	Rafael Espindola	2013-11-27	2	-33/+26
\| \| \| \|	llvm-svn: 195826