bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	Add support for a preserve_most calling convention to the AArch64 backend.	Roman Levenstein	2016-03-10	1	-0/+40
	This change adds support for a preserve_most calling convention to the AArch64 backend, similar to how it was done for X86-64. There is also a subsequent patch on top of this one to add tail-call support for this calling convention. Differential Revision: http://reviews.llvm.org/D18016 llvm-svn: 263092
*	Add a flag to the LLVMContext to disable name for Value other than GlobalValue	Mehdi Amini	2016-03-10	1	-0/+26
	Summary: This is intended to be a performance flag, on the same level as clang cc1 option "--disable-free". LLVM will never initialize it by default, it will be up to the client creating the LLVMContext to request this behavior. Clang will do it by default in Release build (just like --disable-free). "opt" and "llc" can opt-in using -disable-named-value command line option. When performing LTO on llvm-tblgen, the initial merging of IR peaks at 92MB without this patch, and 86MB after this patch; setNameImpl() drops from 6.5MB to 0.5MB. The total link time goes from ~29.5s to ~27.8s. Compared to a compile-time flag (like the IRBuilder one), it performs very close. I profiled on SROA and obtained these results: 420ms with an IRBuilder that preserves names; 372ms with an IRBuilder that strips names; 375ms with an IRBuilder that preserves names, and a runtime flag to strip. Reviewers: chandlerc, dexonsmith, bogner Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D17946 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 263086
*	[PM] Port memdep to the new pass manager.	Chandler Carruth	2016-03-10	1	-0/+8
	This is a fairly straightforward port to the new pass manager with one exception. It removes a very questionable use of releaseMemory() in the old pass to invalidate its caches between runs on a function. I don't think this is really guaranteed to be safe. I've just used the more direct port to the new PM to address this by nuking the results object each time the pass runs. While this could cause some minor malloc traffic increase, I don't expect the compile time performance hit to be noticeable, and it makes the correctness and other aspects of the pass much easier to reason about. In some cases, it may make things faster by making the sets and maps smaller with better locality. Indeed, the measurements collected by Bruno (thanks!!!) show mostly compile time improvements. There is sadly very limited testing at this point as there are only two tests of memdep, and both rely on GVN. I'll be porting GVN next and that will exercise this heavily though. Differential Revision: http://reviews.llvm.org/D17962 llvm-svn: 263082
*	[CGP] Duplicate addressing computation in cold paths if required to sink addressing mode	Philip Reames	2016-03-09	1	-0/+196
	This patch teaches CGP to duplicate addressing mode computations into cold paths (detected via explicit cold attribute on calls) if required to let addressing mode be safely sunk into the basic block containing each load and store. In general, duplicating code into cold blocks may result in code growth, but should not affect performance. In this case, it's better to duplicate some code than to put extra pressure on the register allocator by making it keep the address through the entirety of the fast path. This patch only handles addressing computations, but in principle, we could implement a more general cold scheduling heuristic which tries to reduce register pressure in the fast path by duplicating code into the cold path. Getting the profitability of the general case right seemed likely to be challenging, so I stuck to the existing case (addressing computation) we already had. Differential Revision: http://reviews.llvm.org/D17652 llvm-svn: 263074
*	[LICM] Store promotion when memory is thread local	Philip Reames	2016-03-09	1	-0/+133
	This patch teaches LICM's implementation of store promotion to exploit the fact that the memory location being accessed might be provably thread local. The fact it's thread local weakens the requirements for where we can insert stores since no other thread can observe the write. This allows us to perform store promotion even in cases where the store is not guaranteed to execute in the loop. Two key assumptions worth drawing out are that this assumes a) no-capture is strong enough to imply no-escape, and b) standard allocation functions like malloc, calloc, and operator new return values which can be assumed not to have previously escaped. In future work, it would be nice to generalize this so that it works without directly seeing the allocation site. I believe that the nocapture return attribute should be suitable for this purpose, but haven't investigated carefully. It's also likely that we could support unescaped allocas with similar reasoning, but since SROA and Mem2Reg should destroy those, they're less interesting than they first might seem. Differential Revision: http://reviews.llvm.org/D16783 llvm-svn: 263072
*	[x86] fix cost model inaccuracy for vector memory ops	Sanjay Patel	2016-03-09	1	-2/+2
	The irony of this patch is that one CPU that is affected is AMD Jaguar, and Jaguar has a completely double-pumped AVX implementation. But getting the cost model to reflect that is a much bigger problem. The small goal here is simply to improve on the lie that !AVX2 == SandyBridge. Differential Revision: http://reviews.llvm.org/D18000 llvm-svn: 263069
*	[x86, AVX] optimize masked loads with constant masks	Sanjay Patel	2016-03-09	1	-23/+77
	Instead of a variable-blend instruction, form a blend with immediate because those are always cheaper. Differential Revision: http://reviews.llvm.org/D17899 llvm-svn: 263067
*	[InstCombine] (icmp sgt smin(PosA, B) 0) -> (icmp sgt B 0)	Philip Reames	2016-03-09	1	-0/+34
	When checking whether an smin is positive, we can move the comparison to one of the inputs if the other is known positive. If the known positive one is the min, then the other can't be negative. If the other is the min, then we compute the min. Differential Revision: http://reviews.llvm.org/D17873 llvm-svn: 263059
*	[LLE] Add missing check for unit stride	Adam Nemet	2016-03-09	1	-0/+43
	I somehow missed this. The case in GCC (global_alloc) was similar to the new testcase except it had an array of structs rather than a two dimensional array. Fixes PR26885. llvm-svn: 263058
*	[llvm-readobj] Enable GNU style section group print	Hemant Kulkarni	2016-03-09	1	-0/+13
	Differential Revision: http://reviews.llvm.org/D17822 llvm-svn: 263050
*	InstCombine: Restrict computeKnownBits() on all Values to OptLevel > 2	Matthias Braun	2016-03-09	2	-7/+6
	As part of r251146 InstCombine was extended to call computeKnownBits on every value in the function to determine whether it happens to be constant. This increases typical compile time by 1-3% (5% in irgen+opt time) in my measurements. On the other hand this case did not trigger once in the whole llvm-testsuite. This patch introduces the notion of ExpensiveCombines which are only enabled for OptLevel > 2. I removed the check in InstructionSimplify as that is called from various places where the OptLevel is not known but given the rarity of the situation I think a check in InstCombine is enough. Differential Revision: http://reviews.llvm.org/D16835 llvm-svn: 263047
*	This change adds co-processor condition branching and conditional traps to the Sparc back-end.	Chris Dewhurst	2016-03-09	6	-8/+856
	This will allow inline assembler code to utilize these features, but no automatic lowering is provided, except for the previously provided @llvm.trap, which lowers to "ta 5". The change also separates out the different assembly language syntaxes for V8 and V9 Sparc. Previously, only V9 Sparc assembly syntax was provided. The change also corrects the selection order of trap disassembly, allowing, e.g. "ta %g0 + 15" to be rendered, more readably, as "ta 15", ignoring the %g0 register. This is per the sparc v8 and v9 manuals. Check-in includes many extra unit tests to check this works correctly on both V8 and V9 Sparc processors. Code Reviewed at http://reviews.llvm.org/D17960. llvm-svn: 263044
*	add a test RUN to show unexpected behavior	Sanjay Patel	2016-03-09	1	-7/+10
	llvm-svn: 263037
*	[PPC] backend changes to generate xvabs[s,d]p and xvnabs[s,d]p instructions	Kit Barton	2016-03-09	1	-0/+80
	This has to be committed before the FE changes. Phabricator: http://reviews.llvm.org/D17837 llvm-svn: 263035
*	Don't crash when compiling inline assembler containing .file directives.	Adrian Prantl	2016-03-09	1	-0/+65
	Removing the assertion is safe to do because any module level inline assembly is always emitted first via AsmPrinter::doInitialization(). http://reviews.llvm.org/D16101 rdar://22690666 llvm-svn: 263033
*	[AMDGPU] add AMDGPU target support to ELFObjectFile.h header	Valery Pykhtin	2016-03-09	3	-0/+25
	Differential Revision: http://reviews.llvm.org/D17144 llvm-svn: 263026
*	SelectionDAG: Fix a crash on inline asm when output register supports multiple types	Tom Stellard	2016-03-09	1	-0/+12
	Summary: The code in SelectionDAG did not handle the case where the register type and output types were different, but had the same size. Reviewers: arsenm, echristo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17940 llvm-svn: 263022
*	Reland r262337 "calculate builtin_object_size if arg is a removable pointer"	Petar Jovanovic	2016-03-09	1	-0/+34
	Original commit message: calculate builtin_object_size if argument is a removable pointer This patch fixes calculating correct value for builtin_object_size function when pointer is used only in builtin_object_size function call and never after that. Patch by Strahinja Petrovic. Differential Revision: http://reviews.llvm.org/D17337 Reland the original change with a small modification (first do a null check and then do the cast) to satisfy ubsan. llvm-svn: 263011
*	[AMDGPU] Assembler: Support DPP instructions.	Sam Kolton	2016-03-09	2	-3/+146
	Support DPP syntax as used in SP3 (except several operands syntax). Added dpp-specific operands in td-files. Added DPP flag to TSFlags to determine if instruction is dpp in InstPrinter. Support for VOP2 DPP instructions in td-files. Some tests for DPP instructions. ToDo: - VOP2bInst: - vcc is considered as operand - AsmMatcher doesn't apply mnemonic aliases when parsing operands - v_mac_f32 - v_nop - disable instructions with 64-bit operands - change dpp_ctrl assembler representation to conform to sp3 Review: http://reviews.llvm.org/D17804 llvm-svn: 263008
*	[AMDGPU] Assembler: Support abs() syntax.	Nikolay Haustov	2016-03-09	1	-0/+44
	Support legacy SP3 abs(v1) syntax. InstPrinter still uses |v1|. Add tests. Differential Revision: http://reviews.llvm.org/D17887 llvm-svn: 263006
*	[AMDGPU] Assembler: Fix s_setpc_b64	Nikolay Haustov	2016-03-09	1	-2/+2
	s_setpc_b64 has just one 64-bit source, which is the address of the instruction to jump to. Differential Revision: http://reviews.llvm.org/D17888 llvm-svn: 263005
*	Fix ThinLTO test: depends on the X86 backend	Mehdi Amini	2016-03-09	3	-0/+3
	From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 262993
*	[WebAssembly] Implement irreducible control flow.	Dan Gohman	2016-03-09	1	-0/+88
	This implements a very simple conservative transformation that doesn't require more than linear code size growth. There's room for much more optimization in this space. llvm-svn: 262982
*	Remove trailing newline from test case; NFC	Sanjoy Das	2016-03-09	1	-1/+0
	llvm-svn: 262980
*	[SCEV] Slightly generalize getRangeViaFactoring	Sanjoy Das	2016-03-09	1	-0/+58
	Building on the previous change, this generalizes ScalarEvolution::getRangeViaFactoring to work with {Ext(C?A:B)+k0,+,Ext(C?A:B)+k1} where Ext can be a zero extend, sign extend or truncate operation, and k0 and k1 are constants. llvm-svn: 262979
*	[SCEV] Slightly generalize getRangeViaFactoring	Sanjoy Das	2016-03-09	1	-2/+53
	This change generalizes ScalarEvolution::getRangeViaFactoring to work with {Ext(C?A:B),+,Ext(C?A:B)} where Ext can be a zero extend, sign extend or truncate operation. llvm-svn: 262978
*	libLTO: add a ThinLTOCodeGenerator on the model of LTOCodeGenerator.	Mehdi Amini	2016-03-09	2	-0/+171
	This is intended to provide a parallel (threaded) ThinLTO scheme for linker plugin use through the libLTO C API. The intent of this patch is to provide a first implementation as a proof-of-concept and to allow linkers to start supporting ThinLTO by defining the libLTO C API. Some parts of the libLTO API are not yet implemented. Following patches will add support for these. The current implementation can link all clang/llvm binaries. Differential Revision: http://reviews.llvm.org/D17066 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 262977
*	[llvm-pdbdump] Dump line table information.	Zachary Turner	2016-03-08	1	-0/+12
	This patch adds the -lines command line option which will dump source/line information for each compiland and source file. llvm-svn: 262962
*	[AArch64] Disable the MI scheduler to turn bots green after r262942.	Chad Rosier	2016-03-08	1	-4/+4
	llvm-svn: 262944
*	Revert r262759 and r262760.	Quentin Colombet	2016-03-08	1	-30/+0
	The fix, which consists of using the library call for atomic compare and swap when the instruction is not safe to use, may be incorrect. Indeed, the library call may not exist on all platforms. In other words, we need a better fix! llvm-svn: 262943
*	Revert r262599 "[X86][SSE] Improve vector ZERO_EXTEND by combining to ZERO_EXTEND_VECTOR_INREG"	Hans Wennborg	2016-03-08	2	-18/+33
	This caused PR26870. llvm-svn: 262935
*	AVX512: Add extract_subvector patterns v8i1->v4i1 , v4i1->v2i1.	Igor Breger	2016-03-08	1	-0/+23
	Differential Revision: http://reviews.llvm.org/D17953 llvm-svn: 262929
*	[X86] Regenerated vector float extension tests	Simon Pilgrim	2016-03-08	1	-19/+65
	llvm-svn: 262919
*	Remove pr25342 test-case.	Junmo Park	2016-03-08	1	-93/+0
	This commit removes the pr25342 test case so that r262670 is reverted cleanly. llvm-svn: 262918
*	[Power9] Implement new vsx instructions: load, store instructions for vector and scalar	Kit Barton	2016-03-08	2	-0/+176
	We follow the comments mentioned in http://reviews.llvm.org/D16842#344378 to implement this new patch. This patch implements the following vsx instructions: Vector load/store: lxv, lxvx, lxvb16x, lxvl, lxvll, lxvh8x, lxvwsx, stxv, stxvb16x, stxvh8x, stxvl, stxvll, stxvx; Scalar load/store: lxsd, lxssp, lxsibzx, lxsihzx, stxsd, stxssp, stxsibx, stxsihx (21 instructions). Phabricator: http://reviews.llvm.org/D16919 llvm-svn: 262906
*	[WebAssembly] Update for spec change from tableswitch to br_table.	Dan Gohman	2016-03-08	2	-4/+4
	Also note that the operand order changed; the default label is now listed after the regular labels. llvm-svn: 262903
*	[AArch64][GlobalISel] Add a test case for the IRTranslator.	Quentin Colombet	2016-03-08	1	-0/+18
	llvm-svn: 262898
*	[MIR] Teach the parser/printer that generic virtual registers do not need a register class.	Quentin Colombet	2016-03-08	1	-7/+10
	llvm-svn: 262893
*	[MIR] Teach the parser how to parse complex types of generic machine instructions.	Quentin Colombet	2016-03-08	1	-0/+27
	By complex types, I mean aggregate or vector types. llvm-svn: 262890
*	Revert revisions 262636, 262643, 262679, and 262682.	Easwaran Raman	2016-03-08	3	-147/+0
	llvm-svn: 262883
*	[MIR] Print the type of generic machine instructions.	Quentin Colombet	2016-03-08	1	-1/+1
	llvm-svn: 262880
*	[MIR] Teach the mir parser about types on generic machine instructions.	Quentin Colombet	2016-03-08	1	-1/+2
	llvm-svn: 262879
*	[lit] Teach lit about global-isel requirement.	Quentin Colombet	2016-03-08	1	-0/+14
	llvm-svn: 262878
*	[tsan] Add support for pointer typed atomic stores, loads, and cmpxchg	Anna Zaks	2016-03-07	1	-0/+35
	TSan instrumentation functions for atomic stores, loads, and cmpxchg work on integer value types. This patch adds casts before calling TSan instrumentation functions in cases where the value is a pointer. Differential Revision: http://reviews.llvm.org/D17833 llvm-svn: 262876
*	[x86] add test to show missing optimization	Sanjay Patel	2016-03-07	1	-0/+31
	This should make it clearer how this proposed patch: http://reviews.llvm.org/D11393 ...will change codegen. llvm-svn: 262875
*	[x86] simplify test and tighten checks	Sanjay Patel	2016-03-07	1	-15/+22
	I noticed this test as part of: http://reviews.llvm.org/D11393 ...which is confusing enough as-is. Let's show the exact codegen, so the changes will be more obvious. llvm-svn: 262874
*	[MIR] Teach the MIPrinter about size for generic virtual registers.	Quentin Colombet	2016-03-07	1	-1/+1
	llvm-svn: 262867
*	AMDGPU: Match more med3 integer patterns	Matt Arsenault	2016-03-07	2	-0/+694
	llvm-svn: 262864
*	[MIR] Teach the parser how to handle the size of generic virtual registers.	Quentin Colombet	2016-03-07	1	-0/+17
	llvm-svn: 262862
*	[ScopedNoAliasAA] Make test basic.ll less confusing	Adam Nemet	2016-03-07	1	-3/+3
	Summary: This testcase had me confused. It made me believe that you can use alias scopes and alias scope lists interchangeably with alias.scope and noalias. Both langref and the other testcase use scope lists so I went looking. Turns out using scope directly only happens to work by chance. When ScopedNoAliasAAResult::mayAliasInScopes traverses this as a scope list: !1 = !{!1, !0, !"some scope"} , the first entry is in fact a scope but only because the scope happens to be defined self-referentially to make it unique globally. The remaining elements in the tuple (!0, !"some scope") are considered as scopes but AliasScopeNode::getDomain will just bail on those without any error. This change avoids this ambiguity in the test but I've also been wondering if we should issue some sort of diagnostic. Reviewers: dexonsmith, hfinkel Subscribers: mssimpso, llvm-commits Differential Revision: http://reviews.llvm.org/D16670 llvm-svn: 262841