bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[AMDGPU] Mark .note section SHF_ALLOC so lld creates a segment for it	Konstantin Zhuravlyov	2016-10-17	1	-2/+4
	Differential Revision: https://reviews.llvm.org/D25694 llvm-svn: 284435
*	AMDGPU/SI: LowerParameter() should be computing align based on memory type	Tom Stellard	2016-10-17	1	-1/+1
	Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D25203 llvm-svn: 284398
*	AMDGPU/SI: Fix LowerParameter() for i16 arguments	Tom Stellard	2016-10-17	1	-10/+18
	Summary: If we are loading an i16 value from a 32-bit memory location, then we need to be able to truncate the loaded value to i16. Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D25198 llvm-svn: 284397
*	AMDGPU/SI: Handle s_getreg hazard in GCNHazardRecognizer	Tom Stellard	2016-10-15	2	-0/+49
	Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D25526 llvm-svn: 284298
*	AMDGPU/SI: Use new SimplifyDemandedBits helper for multi-use operations	Tom Stellard	2016-10-14	1	-13/+10
	Summary: We are using this helper for our 24-bit arithmetic combines, so we are now able to eliminate multi-use operations that mask the high bits of 24-bit inputs (e.g. and x, 0xffffff). Reviewers: arsenm, nhaehnle Subscribers: tony-tye, arsenm, kzhuravl, wdng, nhaehnle, llvm-commits, yaxunl Differential Revision: https://reviews.llvm.org/D24672 llvm-svn: 284267
*	AMDGPU/SI: Don't allow unaligned scratch access	Tom Stellard	2016-10-14	4	-0/+21
	Summary: The hardware doesn't support this. Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D25523 llvm-svn: 284257
*	AMDGPU: Select 64-bit {ADD,SUB}{C,E} nodes	Nicolai Haehnle	2016-10-14	1	-10/+37
	Summary: This will be used for 64-bit MULHU, which is in turn used for the 64-bit divide-by-constant optimization (see D24822). Reviewers: arsenm, tstellarAMD Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D25289 llvm-svn: 284224
*	AMDGPU: Fix use-after-frees	Nicolai Haehnle	2016-10-14	2	-15/+16
	Reviewers: arsenm, tstellarAMD Subscribers: kzhuravl, wdng, yaxunl, tony-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D25312 llvm-svn: 284215
*	[AMDGPU] Emit 32-bit lo/hi got and pc relative variant kinds for external ↵	Konstantin Zhuravlyov	2016-10-14	6	-21/+79
	and global address space variables Differential Revision: https://reviews.llvm.org/D25562 llvm-svn: 284196
*	[AMDGPU] Add 32-bit lo/hi got and pc relative variant kinds and emit ↵	Konstantin Zhuravlyov	2016-10-14	1	-0/+8
	appropriate relocations Differential Revision: https://reviews.llvm.org/D25548 llvm-svn: 284195
*	Revert "In visitSTORE, always use FindBetterChain, rather than only when ↵	Nirav Dave	2016-10-13	1	-0/+10
	UseAA is enabled." This reverts commit r284151, which appears to be triggering LTO failures on Hexagon. llvm-svn: 284157
*	In visitSTORE, always use FindBetterChain, rather than only when UseAA is ↵	Nirav Dave	2016-10-13	1	-10/+0
	enabled.
	Retrying after upstream changes.
	Simplify Consecutive Merge Store Candidate Search
	Now that address aliasing is much less conservative, push through a simplified store-merging search which only checks for parallel stores through the chain subgraph. This is cleaner, as it separates non-interfering loads/stores from the store-merging logic.
	When merging stores, search up the chain through a single load, and find all possible stores by looking down through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and generally the output CodeGen (with some exceptions).
	Additional Minor Changes:
	1. Finishes removing unused AliasLoad code
	2. Unifies the chain aggregation in the merged stores across code paths
	3. Re-add the Store node to the worklist after calling SimplifyDemandedBits.
	4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seemed sufficient to not cause regressions in tests.
	This finishes the change Matt Arsenault started in r246307 and jyknight's original patch.
	Many tests required some changes as memory operations are now reorderable. Some tests relying on the order were changed to use volatile memory operations.
	Noteworthy tests:
	CodeGen/AArch64/argument-blocks.ll - It's not entirely clear what the test_varargs_stackalign test is supposed to be asserting, but the new code looks right.
	CodeGen/AArch64/arm64-memset-inline.lli -
	CodeGen/AArch64/arm64-stur.ll -
	CodeGen/ARM/memset-inline.ll - The backend now generates worse code due to store merging succeeding, as we do not do a 16-byte constant-zero store efficiently.
	CodeGen/AArch64/merge-store.ll - Improved, but there still seems to be an extraneous vector insert from an element to itself?
	CodeGen/PowerPC/ppc64-align-long-double.ll - Worse code emitted in this case, due to the improved store->load forwarding.
	CodeGen/X86/dag-merge-fast-accesses.ll -
	CodeGen/X86/MergeConsecutiveStores.ll -
	CodeGen/X86/stores-merging.ll -
	CodeGen/Mips/load-store-left-right.ll - Restored correct merging of non-aligned stores
	CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll - Improved. Correctly merges buffer_store_dword calls
	CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll - Improved. Sidesteps loading a stored value and merges two stores
	CodeGen/X86/pr18023.ll - This test has been removed, as it was asserting incorrect behavior. Non-volatile stores CAN be moved past volatile loads, and now are.
	CodeGen/X86/vector-idiv.ll -
	CodeGen/X86/vector-lzcnt-128.ll - It's basically impossible to tell what these tests are actually testing. But it looks like the code got better due to the memory operations being recognized as non-aliasing.
	CodeGen/X86/win32-eh.ll - Both loads of the securitycookie are now merged.
	CodeGen/AMDGPU/vgpr-spill-emergency-stack-slot-compute.ll - This test appears to work but no longer exhibits the spill behavior.
	Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, dsanders, resistor, tstellarAMD, t.p.northover, spatel Differential Revision: https://reviews.llvm.org/D14834 llvm-svn: 284151
*	AMDGPU: Assume spilling will occur at -O0	Matt Arsenault	2016-10-13	1	-1/+5
	Because everything live is spilled at the end of a block by fast regalloc, assume this will happen and avoid the copies of the resource descriptor. llvm-svn: 284119
*	AMDGPU: Fix truncate to bool warnings	Matt Arsenault	2016-10-13	1	-5/+5
	llvm-svn: 284116
*	AMDGPU: Initial implementation of VGPR indexing mode	Matt Arsenault	2016-10-12	3	-43/+194
	This is the most basic handling of the indirect access pseudos using GPR indexing mode. This currently only enables the mode for a single v_mov_b32 and then disables it. This is much more complicated to use than the movrel instructions, so a new optimization pass is probably needed to fold the access into the uses and keep the mode enabled for them. llvm-svn: 284031
*	AMDGPU: Add instruction definitions for VGPR indexing	Matt Arsenault	2016-10-12	10	-8/+126
	VI added a second method of indexing into VGPRs besides using v_movrel*. llvm-svn: 284027
*	AMDGPU/SI: Change mimg intrinsic signatures	Tom Stellard	2016-10-12	1	-18/+23
	This makes more fields overridable and removes redundant bits. Patch by: Changpeng Fang llvm-svn: 284024
*	[AMDGPU] Refactor waitcnt encoding	Konstantin Zhuravlyov	2016-10-11	5	-66/+171
	- Refactor bit packing/unpacking
	- Calculate bit mask given bit shift and bit width
	- Introduce function for decoding bits of waitcnt
	- Introduce function for encoding bits of waitcnt
	- Introduce function for getting waitcnt mask (instead of using bare numbers)
	- Introduce function for getting max waitcnt(s) (instead of using bare numbers)
	Differential Revision: https://reviews.llvm.org/D25298 llvm-svn: 283919
*	AMDGPU/SI: Update ISA version numbers for Tonga and Polaris10/11.	Changpeng Fang	2016-10-11	4	-3/+8
	Differential Revision: http://reviews.llvm.org/D25454 Reviewers: tstellarAMD llvm-svn: 283893
*	Revert r283690, "MC: Remove unused entities."	Peter Collingbourne	2016-10-10	1	-1/+1
	llvm-svn: 283814
*	Move the global variables representing each Target behind accessor function	Mehdi Amini	2016-10-09	8	-23/+35
	This avoids the "static initialization order fiasco". Differential Revision: https://reviews.llvm.org/D25412 llvm-svn: 283702
*	MC: Remove unused entities.	Peter Collingbourne	2016-10-09	1	-1/+1
	llvm-svn: 283691
*	Target: Remove unused entities.	Peter Collingbourne	2016-10-09	1	-1/+0
	llvm-svn: 283690
*	AMDGPU/SI: Handle div_fmas hazard in GCNHazardRecognizer	Tom Stellard	2016-10-07	2	-0/+23
	Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D25250 llvm-svn: 283622
*	AMDGPU/SI: Add support for 8-byte relocations	Tom Stellard	2016-10-07	1	-0/+2
	Reviewers: arsenm, kzhuravl Subscribers: wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D25375 llvm-svn: 283593
*	AMDGPU/SI: Emit fixups for long branches	Tom Stellard	2016-10-07	1	-0/+18
	Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D25366 llvm-svn: 283570
*	[AMDGPU][mc] Add support for buffer_load_dwordx3, buffer_store_dwordx3.	Artem Tamazov	2016-10-07	1	-0/+10
	Partially fixes Bug 28232. Lit tests added. Differential Revision: https://reviews.llvm.org/D25367 llvm-svn: 283567
*	[AMDGPU] Assembler: support v_mac_f32 DPP and SDWA. Move getNamedOperandIdx ↵	Sam Kolton	2016-10-07	8	-54/+139
	to AMDGPUBaseInfo.h Reviewers: artem.tamazov, tstellarAMD Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D25084 llvm-svn: 283560
*	[AMDGPU] AMDGPUCodeGenPrepare: remove extra ';'	Konstantin Zhuravlyov	2016-10-07	1	-1/+1
	llvm-svn: 283558
*	[AMDGPU] Promote uniform (i1, i16] operations to i32	Konstantin Zhuravlyov	2016-10-07	1	-97/+101
	Differential Revision: https://reviews.llvm.org/D25302 llvm-svn: 283555
*	AMDGPU: Fix use-after-free in SIOptimizeExecMasking	Nicolai Haehnle	2016-10-07	1	-1/+4
	Summary: There was a bug with sequences like
	s_mov_b64 s[0:1], exec
	s_and_b64 s[2:3]<def>, s[0:1], s[2:3]<kill>
	...
	s_mov_b64_term exec, s[2:3]
	because s[2:3] was defined and used in the same instruction, ending up with SaveExecInst inside OtherUseInsts. Note that the test case also exposes an unrelated bug.
	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98028 Reviewers: tstellarAMD, arsenm Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D25306 llvm-svn: 283528
*	Target: Remove unused patterns and transforms. NFC.	Peter Collingbourne	2016-10-07	3	-33/+0
	llvm-svn: 283515
*	AMDGPU: Don't fold undef uses or copies with implicit uses	Matt Arsenault	2016-10-06	1	-4/+22
	llvm-svn: 283476
*	AMDGPU: Remove scheduling info from si_mask_branch	Matt Arsenault	2016-10-06	1	-0/+2
	llvm-svn: 283475
*	AMDGPU: Remove leftover implicit operands when folding immediates	Matt Arsenault	2016-10-06	1	-7/+26
	When constant folding an operation to a copy or an immediate mov, the implicit uses/defs of the old instruction were left behind, e.g. replacing v_or_b32 left the implicit exec use on the new copy. llvm-svn: 283471
*	Reapply "AMDGPU: Support using tablegened MC pseudo expansions"	Matt Arsenault	2016-10-06	5	-44/+75
	Fix bad merge llvm-svn: 283470
*	Revert "AMDGPU: Support using tablegened MC pseudo expansions"	Matt Arsenault	2016-10-06	5	-68/+44
	llvm-svn: 283469
*	AMDGPU: Support using tablegened MC pseudo expansions	Matt Arsenault	2016-10-06	5	-44/+68
	Make the necessary refactorings to make use of PseudoInstExpansion llvm-svn: 283467
*	BranchRelaxation: Support expanding unconditional branches	Matt Arsenault	2016-10-06	9	-17/+271
	AMDGPU needs to expand unconditional branches in a new block with an indirect branch. llvm-svn: 283464
*	[AMDGPU] Disassembler: print label names in branch instructions	Sam Kolton	2016-10-06	3	-66/+156
	Summary: Add AMDGPUSymbolizer for finding names for labels from ELF symbol table. Initialize MCObjectFileInfo with some default values. Reviewers: vpykhtin, artem.tamazov, tstellarAMD Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D24802 llvm-svn: 283450
*	AMDGPU: Partially fix reported code size for some instructions	Matt Arsenault	2016-10-06	4	-4/+8
	These need to have the size set on the pseudo instruction for getInstSizeInBytes to work correctly. These also have a statically known size. llvm-svn: 283437
*	[AMDGPU] Promote uniform i16 bitreverse intrinsic to i32	Konstantin Zhuravlyov	2016-10-06	1	-11/+65
	Differential Revision: https://reviews.llvm.org/D25121 llvm-svn: 283415
*	AMDGPU: Do not re-use tmpreg in spill/restore lowering	Matthias Braun	2016-10-05	1	-2/+2
	The register scavenging code does not support multiple definitions of the same vreg. Differential Revision: https://reviews.llvm.org/D25220 llvm-svn: 283369
*	AMDGPU: Refactor indirect vector lowering	Matt Arsenault	2016-10-04	1	-36/+42
	Allow inserting multiple instructions in the expanded loop. llvm-svn: 283177
*	AMDGPU: Factor SGPR spilling into separate functions	Matt Arsenault	2016-10-04	2	-129/+166
	llvm-svn: 283175
*	[AMDGPU] Pass optimization level to SelectionDAGISel	Konstantin Zhuravlyov	2016-10-03	3	-8/+11
	llvm-svn: 283133
*	[AMDGPU] Sign extend AShr when promoting (instead of zero extending)	Konstantin Zhuravlyov	2016-10-03	1	-2/+2
	llvm-svn: 283130
*	AMDGPU: Fix typo	Matt Arsenault	2016-10-03	1	-1/+1
	llvm-svn: 283108
*	Add new target hooks for LoadStoreVectorizer	Volkan Keles	2016-10-03	2	-2/+2
	Summary: Added 6 new target hooks for the vectorizer in order to filter types, handle size constraints and decide how to split chains. Reviewers: tstellarAMD, arsenm Subscribers: arsenm, mzolotukhin, wdng, llvm-commits, nhaehnle Differential Revision: https://reviews.llvm.org/D24727 llvm-svn: 283099
*	[AMDGPU] Remove unused variables from SIOptimizeExecMasking	Konstantin Zhuravlyov	2016-10-03	1	-3/+0
	Differential Revision: https://reviews.llvm.org/D25110 llvm-svn: 283087
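
Note on the "[AMDGPU] Refactor waitcnt encoding" entry (r283919): its message describes computing a bit mask from a shift and a width, then encoding and decoding counter fields through that mask instead of bare numbers. The following is a minimal sketch of that general technique, not the code from the commit. All names here (Field, getBitMask, packBits, unpackBits, getMaxCount, encodeWaitcnt) and the field positions are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of one counter field inside a packed word.
struct Field {
  unsigned Shift; // position of the lowest bit of the field
  unsigned Width; // number of bits in the field (assumed < 32)
};

// Assumed layout, for illustration only: three counters in one 16-bit immediate.
constexpr Field VmCnt{0, 4};
constexpr Field ExpCnt{4, 3};
constexpr Field LgkmCnt{8, 4};

// Mask covering a field, derived from its shift and width.
constexpr uint32_t getBitMask(Field F) {
  return ((1u << F.Width) - 1u) << F.Shift;
}

// Largest value the field can hold.
constexpr uint32_t getMaxCount(Field F) { return (1u << F.Width) - 1u; }

// Write Value into field F of Dst; bits of Value beyond the width are dropped.
constexpr uint32_t packBits(uint32_t Value, uint32_t Dst, Field F) {
  return (Dst & ~getBitMask(F)) | ((Value << F.Shift) & getBitMask(F));
}

// Read field F out of Src.
constexpr uint32_t unpackBits(uint32_t Src, Field F) {
  return (Src & getBitMask(F)) >> F.Shift;
}

// Pack all three counters into one word.
constexpr uint32_t encodeWaitcnt(uint32_t Vm, uint32_t Exp, uint32_t Lgkm) {
  return packBits(Lgkm, packBits(Exp, packBits(Vm, 0, VmCnt), ExpCnt), LgkmCnt);
}

// Runtime check: every legal counter combination survives encode then decode.
inline void checkRoundTrip() {
  for (uint32_t Vm = 0; Vm <= getMaxCount(VmCnt); ++Vm)
    for (uint32_t Exp = 0; Exp <= getMaxCount(ExpCnt); ++Exp)
      for (uint32_t Lgkm = 0; Lgkm <= getMaxCount(LgkmCnt); ++Lgkm) {
        uint32_t W = encodeWaitcnt(Vm, Exp, Lgkm);
        assert(unpackBits(W, VmCnt) == Vm);
        assert(unpackBits(W, ExpCnt) == Exp);
        assert(unpackBits(W, LgkmCnt) == Lgkm);
      }
}
```

Deriving masks and maxima from a single (shift, width) pair keeps the layout in one place, so changing a field's width for a different hardware generation does not require hunting for hard-coded constants.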