bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[AMDGPU] Do not allow register coalescer to create big superregs	Stanislav Mekhanoshin	2017-01-18	2	-0/+27
	Limit the register coalescer by not allowing it to artificially increase the size of registers beyond a dword. Such super-registers are in fact register sequences and not distinct HW registers. With more super-regs we would need to allocate adjacent registers and constrain regalloc more than needed. Moreover, our super-registers overlap: for instance, we have VGPR0_VGPR1_VGPR2, VGPR1_VGPR2_VGPR3, VGPR2_VGPR3_VGPR4, etc., which complicates register allocation even more, resulting in excessive spilling.
	Differential Revision: https://reviews.llvm.org/D28782
	llvm-svn: 292413
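The overlap described above can be illustrated with a small sketch. It is not taken from the LLVM sources: the function names and the eight-register file are hypothetical, and registers are modelled as plain integers. It only shows how many other 3-wide tuples become unusable once one tuple is allocated.

```python
def super_registers(num_regs, width):
    """Enumerate every run of `width` adjacent registers, e.g. (0, 1, 2), (1, 2, 3), ..."""
    return [tuple(range(start, start + width))
            for start in range(num_regs - width + 1)]


def conflicts(allocated, tuples):
    """Return the tuples that share at least one register with `allocated`."""
    taken = set(allocated)
    return [t for t in tuples if t != allocated and taken & set(t)]


# Hypothetical register file of 8 VGPRs with 3-wide super-registers.
tuples = super_registers(8, 3)
blocked = conflicts((2, 3, 4), tuples)
print(len(tuples), "tuples;", len(blocked), "blocked by allocating (2, 3, 4)")
```

In this toy model a single allocation rules out four of the five remaining tuples, which is the kind of pressure the commit message attributes to overlapping super-registers.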
*	[AMDGPU] Assembler: fix v_mac_f16 immediates	Sam Kolton	2017-01-17	2	-10/+18
	Reviewers: vpykhtin, artem.tamazov, tstellarAMD
	Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye
	Differential Revision: https://reviews.llvm.org/D28802
	llvm-svn: 292224
*	AMDGPU: Add replacement export intrinsics	Matt Arsenault	2017-01-17	4	-20/+80
	llvm-svn: 292205
*	AMDGPU: Remove dead pattern	Matt Arsenault	2017-01-17	1	-5/+0
	This is the unsafe conversion pattern, but not guarded by an unsafe math check. It is also already done in LegalizeDAG.
	llvm-svn: 292173
*	ADMGPU/EG,CM: Implement _noret global atomics	Jan Vesely	2017-01-16	2	-7/+113
	_RTN versions will be a lot more complicated
	Differential Revision: https://reviews.llvm.org/D28067
	llvm-svn: 292162
*	[AMDGPU] Implement f16 fcopysign and fcopysign(f32, f64)	Konstantin Zhuravlyov	2017-01-13	2	-0/+37
	Differential Revision: https://reviews.llvm.org/D28496
	llvm-svn: 291954
*	Apply clang-tidy's performance-unnecessary-value-param to LLVM.	Benjamin Kramer	2017-01-13	3	-12/+13
	With some minor manual fixes for using function_ref instead of std::function. No functional change intended.
	llvm-svn: 291904
*	[CodeGen] Rename MachineInstrBuilder::addOperand. NFC	Diana Picus	2017-01-13	10	-146/+137
	Rename from addOperand to just add, to match the other method that has been added to MachineInstrBuilder for adding more than just 1 operand. See https://reviews.llvm.org/D28057 for the whole discussion.
	Differential Revision: https://reviews.llvm.org/D28556
	llvm-svn: 291891
*	AMDGPU: Skip fneg/select combine if it can fold into other	Matt Arsenault	2017-01-12	1	-29/+40
	llvm-svn: 291792
*	AMDGPU: Fold free fneg into sin	Matt Arsenault	2017-01-12	1	-1/+5
	llvm-svn: 291790
*	AMDGPU: Fold fneg into fmul_legacy	Matt Arsenault	2017-01-12	1	-2/+5
	llvm-svn: 291784
*	AMDGPU: Fold fneg into rcp	Matt Arsenault	2017-01-12	1	-1/+7
	llvm-svn: 291779
*	AMDGPU: Fold fneg into fp_round	Matt Arsenault	2017-01-12	1	-2/+18
	llvm-svn: 291778
*	AMDGPU: Fold fneg into fp_extend	Matt Arsenault	2017-01-12	1	-0/+14
	llvm-svn: 291777
*	AMDGPU: Fix sub_oneuse being marked commutative	Matt Arsenault	2017-01-12	1	-1/+2
	llvm-svn: 291748
*	AMDGPU: Fold fneg into fma or fmad	Matt Arsenault	2017-01-12	1	-0/+24
	Patch mostly by Fiona Glaser
	llvm-svn: 291733
*	AMDGPU: Fold fneg into fmul	Matt Arsenault	2017-01-12	1	-0/+17
	Patch mostly by Fiona Glaser
	llvm-svn: 291732
*	AMDGPU: Fold fneg into fadd	Matt Arsenault	2017-01-12	2	-0/+61
	Patch mostly by Fiona Glaser
	llvm-svn: 291731
*	AMDGPU: Pull fneg/fabs out of a select	Matt Arsenault	2017-01-11	1	-0/+74
	Allows better source modifier usage.
	llvm-svn: 291729
*	AMDGPU: Fix shrinking of addc/subb.	Matt Arsenault	2017-01-11	1	-7/+25
	To shrink to VOP2 the input carry must also be VCC.
	llvm-svn: 291720
*	AMDGPU: Fix sext_inreg for i1 in i16	Matt Arsenault	2017-01-11	1	-0/+5
	This produces worse code when i16 is legal, mostly due to combines getting confused by conversions inserted for uniform 16-bit operations.
	llvm-svn: 291717
*	AMDGPU: Fix breaking VOP3 v_add_i32s	Matt Arsenault	2017-01-11	1	-1/+11
	This was shrinking the instruction even though the carry output register was a virtual register, not known VCC.
	llvm-svn: 291716
*	AMDGPU: Fix folding immediates into mac src2	Matt Arsenault	2017-01-11	1	-2/+30
	The legality check needs to be made against the instruction it will be replaced with.
	llvm-svn: 291711
*	[AMDGPU] Assembler: SDWA/DPP should not accept scalar registers and immediate operands	Sam Kolton	2017-01-11	5	-39/+133
	Reviewers: artem.tamazov, nhaustov, vpykhtin, tstellarAMD
	Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye
	Differential Revision: https://reviews.llvm.org/D28157
	llvm-svn: 291668
*	[X86] updating TTI costs for arithmetic instructions on X86\SLM arch.	Mohammed Agabaria	2017-01-11	2	-2/+3
	Updated instructions: pmulld, pmullw, pmulhw, mulsd, mulps, mulpd, divss, divps, divsd, divpd, addpd and subpd. Also adds a special optimization case that replaces pmulld with a pmullw\pmulhw\pshuf sequence when the real operand bit width is <= 16.
	Differential Revision: https://reviews.llvm.org/D28104
	llvm-svn: 291657
*	AMDGPU/EG,CM: Add fp16 conversion instructions	Jan Vesely	2017-01-11	1	-1/+3
	Differential Revision: https://reviews.llvm.org/D28164
	llvm-svn: 291622
*	AMDGPU: Constant fold when immediate is materialized	Matt Arsenault	2017-01-10	1	-141/+228
	In future commits these patterns will appear after moveToVALU changes.
	llvm-svn: 291615
*	AMDGPU: Add tests for HasMultipleConditionRegisters	Matt Arsenault	2017-01-10	1	-0/+7
	This was enabled without many specific tests or the comment.
	llvm-svn: 291586
*	AMDGPU: Add Assert[SZ]Ext during argument load creation	Matt Arsenault	2017-01-09	2	-13/+17
	For i16 zeroext arguments when i16 was a legal type, the known bits information from the truncate was lost. Insert a zeroext so the known bits optimizations work with the 32-bit loads. Fixes code quality regressions vs. SI in min.ll test.
	llvm-svn: 291461
*	Reapply r291025 ("AMDGPU: Remove unneccessary intermediate vector")	Matt Arsenault	2017-01-09	1	-19/+33
	llvm-svn: 291460
*	AMDGPU/R600: Don't use REGISTER_{LOAD,STORE} ISD nodes	Jan Vesely	2017-01-06	5	-159/+159
	This will make transition to SCRATCH_MEMORY easier
	Differential Revision: https://reviews.llvm.org/D24746
	llvm-svn: 291279
*	[AMDGPU] Remove extra semicolon. NFC	Konstantin Zhuravlyov	2017-01-06	1	-1/+1
	llvm-svn: 291246
*	[AMDGPU] Do not emit .AMDGPU.config section for amdhsa	Konstantin Zhuravlyov	2017-01-06	1	-4/+6
	Differential Revision: https://reviews.llvm.org/D27732
	llvm-svn: 291245
*	Revert "Reapply r291025 ("AMDGPU: Remove unneccessary intermediate vector")"	Evgeniy Stepanov	2017-01-05	1	-33/+19
	Summary: This reverts commit r291144. It breaks build bots.
	http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-autoconf/builds/3270
	http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fuzzer/builds/2058
	lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp:1638:12: error: could not convert ‘(const unsigned int)(& Variants)’ from ‘const unsigned int’ to ‘llvm::ArrayRef<unsigned int>’
	return Variants;
	Reviewers: eugenis, tstellarAMD
	Patch by Alex Shlyapnikov.
	Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits
	Differential Revision: https://reviews.llvm.org/D28372
	llvm-svn: 291168
*	Reapply r291025 ("AMDGPU: Remove unneccessary intermediate vector")	Matt Arsenault	2017-01-05	1	-19/+33
	Arrays are supposed to be static const
	llvm-svn: 291144
*	Revert r291025 ("AMDGPU: Remove unneccessary intermediate vector")	Richard Smith	2017-01-05	1	-22/+18
	This caused buildbot failures due to returning ArrayRefs referencing local (temporary) objects.
	llvm-svn: 291067
*	AMDGPU: Remove unneccessary intermediate vector	Matt Arsenault	2017-01-04	1	-18/+22
	llvm-svn: 291025
*	AMDGPU/SI: Implement sendmsghalt intrinsic	Jan Vesely	2017-01-04	6	-4/+21
	v2: expose using amdgcn prefix
	Differential Revision: https://reviews.llvm.org/D23511
	llvm-svn: 290977
*	[AMDGPU][mc] Enable absolute expressions in .hsa_code_object_isa directive	Artem Tamazov	2016-12-29	1	-12/+17
	Among other things, this allows using the predefined .option.machine_version_major/minor/stepping symbols in the directive. The relevant test is expanded at the same time (and the file renamed for clarity).
	Differential Revision: https://reviews.llvm.org/D28140
	llvm-svn: 290710
*	[AMDGPU][llvm-mc] Predefined symbols to access register counts (.kernel.{v|s}gpr_count)	Artem Tamazov	2016-12-27	1	-7/+56
	The feature allows for conditional assembly, filling the entries of .amd_kernel_code_t, etc. Symbols are defined with value 0 at the beginning of each kernel scope. After each register usage, the respective symbol is set to:
	value = max( value, ( register index + 1 ) )
	Thus, at the end of a scope the value represents the count of used registers. A kernel scope begins at an .amdgpu_hsa_kernel directive and ends at the next .amdgpu_hsa_kernel (or EOF, whichever comes first). There is also a dummy scope that runs from the beginning of the source file until the first .amdgpu_hsa_kernel. Test added.
	Differential Revision: https://reviews.llvm.org/D27859
	llvm-svn: 290608
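The counting rule in this commit message can be sketched as follows. This is a simplified model, not the assembler's implementation: the register syntax handled here (v<N>, s<N> and ranges such as s[4:5]), the "<dummy>" scope name, and the function name are assumptions made for illustration only.

```python
import re

# Assumed register spellings for this sketch: v3, s4, v[0:1], s[4:5].
REG = re.compile(r"\b([vs])(?:(\d+)|\[(\d+):(\d+)\])")


def kernel_register_counts(lines):
    """Return {scope: {"v": count, "s": count}} using value = max(value, index + 1)."""
    scope = "<dummy>"  # scope from the start of the file to the first kernel
    counts = {scope: {"v": 0, "s": 0}}
    for line in lines:
        text = line.strip()
        if text.startswith(".amdgpu_hsa_kernel"):
            # A new kernel scope starts; its symbols start again at 0.
            scope = text.split()[1]
            counts[scope] = {"v": 0, "s": 0}
            continue
        for kind, single, _low, high in REG.findall(text):
            # For a range, the highest register index decides the count.
            index = int(single) if single else int(high)
            counts[scope][kind] = max(counts[scope][kind], index + 1)
    return counts


example = [
    "s_mov_b32 s2, 0",
    ".amdgpu_hsa_kernel k0",
    "s_load_dwordx2 s[4:5], s[0:1], 0x0",
    "v_mov_b32 v3, s4",
    ".amdgpu_hsa_kernel k1",
    "v_add_f32 v1, v0, v7",
]
counts = kernel_register_counts(example)
print(counts)
```

Because each scope restarts at 0, the counts for k1 are unaffected by the registers used in k0 or in the dummy scope.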
*	[AMDGPU] Assembler: support SDWA and DPP for VOP2b instructions	Sam Kolton	2016-12-27	3	-6/+37
	Reviewers: nhaustov, artem.tamazov, vpykhtin, tstellarAMD
	Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye
	Differential Revision: https://reviews.llvm.org/D28051
	llvm-svn: 290599
*	AMDGPU: split ret/noret patterns for global atomics	Jan Vesely	2016-12-23	3	-22/+52
	Differential Revision: https://reviews.llvm.org/D27989
	llvm-svn: 290435
*	Enable '-Wstring-conversion' and fix some bad asserts that it helped find.	Chandler Carruth	2016-12-23	1	-1/+1
	Notable is the assert in NewGVN which had no effect because of the bug.
	llvm-svn: 290400
*	AMDGPU: Invert cmp + select with constant	Matt Arsenault	2016-12-22	1	-0/+19
	Canonicalize a select with a constant to the false side. This enables more instruction shrinking opportunities since an inline immediate can be used for the false side of v_cndmask_b32_e32. This seems to usually be better but causes some code size regressions in some tests.
	llvm-svn: 290372
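The canonicalization described above amounts to inverting the comparison and swapping the select operands. A minimal sketch, assuming integer comparisons only and a toy representation (constants as Python ints, other values as strings); the predicate names follow LLVM IR icmp spelling, and the function is hypothetical rather than the DAG combine itself:

```python
# Inverse of each signed integer comparison predicate.
INVERSE = {"eq": "ne", "ne": "eq",
           "slt": "sge", "sge": "slt",
           "sgt": "sle", "sle": "sgt"}


def canonicalize_select(pred, true_val, false_val):
    """Rewrite select(cmp pred, K, x) as select(cmp inverse(pred), x, K).

    Only fires when the true side is a constant and the false side is not,
    so the constant always ends up on the false side.
    """
    if isinstance(true_val, int) and not isinstance(false_val, int):
        return INVERSE[pred], false_val, true_val
    return pred, true_val, false_val


print(canonicalize_select("slt", 7, "%x"))
print(canonicalize_select("slt", "%x", 7))
```

Floating-point comparisons need more care, since the inverse of an ordered predicate is an unordered one; that case is left out of this sketch.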
*	AMDGPU: Use i16 for i16 shift amount	Matt Arsenault	2016-12-22	2	-8/+10
	llvm-svn: 290351
*	AMDGPU: Fix missing 16-bit cmpx instructions	Matt Arsenault	2016-12-22	1	-0/+39
	llvm-svn: 290349
*	AMDGPU: Use i16 comparison instructions	Matt Arsenault	2016-12-22	2	-5/+43
	llvm-svn: 290348
*	AMDGPU: Fixed '!NodePtr->isKnownSentinel()' assert	Matt Arsenault	2016-12-22	1	-17/+4
	Caused by dereferencing end iterator when trying to const cast the iterator.
	Patch by Martin Sherburn
	llvm-svn: 290347
*	[AMDGPU] Add pseudo SDWA instructions	Sam Kolton	2016-12-22	5	-85/+159
	Summary: This is needed for later SDWA support in CodeGen.
	Reviewers: vpykhtin, tstellarAMD
	Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye
	Differential Revision: https://reviews.llvm.org/D27412
	llvm-svn: 290338
*	[AMDGPU] Disassembler: fix for disaasembling v_mac_f32/16_dpp/sdwa	Sam Kolton	2016-12-22	4	-5/+26
	Summary: Real instruction should copy constraints from real instruction. This allows auto-generated disassembler to correctly process tied operands.
	Reviewers: nhaustov, vpykhtin, tstellarAMD
	Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye
	Differential Revision: https://reviews.llvm.org/D27847
	llvm-svn: 290336