bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	AMDGPU/NFC: Move amdgpu code object metadata to support	Konstantin Zhuravlyov	2017-06-06	3	-617/+2
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D31437 llvm-svn: 304812
*	[AMDGPU] Return correct value from SDWA pass	Stanislav Mekhanoshin	2017-06-06	1	-1/+2
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D33927 llvm-svn: 304805
*	AMDGPU/GlobalISel: Mark 32-bit G_ICMP as legal	Tom Stellard	2017-06-06	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \|	Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D33890 llvm-svn: 304797
*	Sort the remaining #include lines in include/... and lib/....	Chandler Carruth	2017-06-06	50	-81/+75
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line with a #include and let it re-sort things according to the precise LLVM rules for include ordering baked into clang-format these days. I've reverted a number of files where the results of sorting includes isn't healthy. Either places where we have legacy code relying on particular include ordering (where possible, I'll fix these separately) or where we have particular formatting around #include lines that I didn't want to disturb in this patch. This patch is entirely mechanical. If you get merge conflicts or anything, just ignore the changes in this patch and run clang-format over your #include lines in the files. Sorry for any noise here, but it is important to keep these things stable. I was seeing an increasing number of patches with irrelevant re-ordering of #include lines because clang-format was used. This patch at least isolates that churn, makes it easy to skip when resolving conflicts, and gets us to a clean baseline (again). llvm-svn: 304787
*	[llvm] Remove double semicolons	Mandeep Singh Grang	2017-06-06	3	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \|	Reviewers: craig.topper, arsenm, mehdi_amini Reviewed By: mehdi_amini Subscribers: mehdi_amini, wdng, nhaehnle, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D33924 llvm-svn: 304767
*	AMDGPU: Remove deprecated and unused elf definitions	Konstantin Zhuravlyov	2017-06-05	5	-144/+0
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D33689 llvm-svn: 304737
*	[AMDGPU] Fix uninit'ed var (RevisitLoop)	Mark Searles	2017-06-05	1	-1/+1
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D33907 llvm-svn: 304729
*	[AMDGPU] Fix SIFoldOperands crash with clamp	Stanislav Mekhanoshin	2017-06-05	1	-1/+2
\| \| \| \| \| \| \| \| \|	Fixes bug #33302. The pass did not account for the fact that Src1 of the max instruction can be an immediate. Differential Revision: https://reviews.llvm.org/D33884 llvm-svn: 304696
*	[AMDGPU] Untangle SDWA pass from SIShrinkInstructions	Stanislav Mekhanoshin	2017-06-03	2	-26/+67
\| \| \| \| \| \| \| \| \| \| \| \|	Remove dependency of SDWA pass on SIShrinkInstructions. The goal is to move SDWA even higher in the stack to avoid second run of MachineLICM, MachineCSE and SIFoldOperands. Also added handling to preserve original src modifiers. Differential Revision: https://reviews.llvm.org/D33860 llvm-svn: 304665
*	AMDGPU/GlobalISel: Mark 1-bit integer constants as legal	Tom Stellard	2017-06-03	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: These are mostly legal, but will probably need special lowering for some cases. Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D33791 llvm-svn: 304628
*	[AMDGPU] Preserve operand order in SIFoldOperands	Stanislav Mekhanoshin	2017-06-03	1	-3/+18
\| \| \| \| \| \| \| \| \|	SIFoldOperands can commute operands even if no folding was done. This change preserves the IR if no folding was done. Differential Revision: https://reviews.llvm.org/D33802 llvm-svn: 304625
*	[AMDGPU] V_DIV_FIXUP_F16 is not a commutable operation	Stanislav Mekhanoshin	2017-06-03	1	-1/+2
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D33808 llvm-svn: 304619
*	AMDGPU: Register AMDGPUAlwaysInline	Matt Arsenault	2017-06-02	3	-3/+10
\| \| \| \|	llvm-svn: 304574
*	AMDGPU: Make auto waitcnt before barrier a feature	Konstantin Zhuravlyov	2017-06-02	5	-8/+16
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D33793 llvm-svn: 304571
*	AMDGPUAnnotateUniformValue should always treat volatile loads as divergent	Alexander Timofeev	2017-06-02	2	-1/+2
\| \| \| \|	llvm-svn: 304554
*	[AMDGPU] Turn on the new waitcnt insertion pass. Adjust tests.	Mark Searles	2017-06-02	1	-1/+1
\| \| \| \| \| \| \| \| \|	-enable-si-insert-waitcnts=1 becomes the default; pass -enable-si-insert-waitcnts=0 to use the old pass. Differential Revision: https://reviews.llvm.org/D33730 llvm-svn: 304551
*	[AMDGPU] Fix kernel arg segment size for amdgizcl	Yaxun Liu	2017-06-01	1	-1/+2
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D33307 llvm-svn: 304482
*	AMDGPU: Remove error on call in AsmPrinter	Matt Arsenault	2017-06-01	1	-29/+26
\| \| \| \| \| \| \|	Partial revert of r301938 which is making it harder to split patches up. llvm-svn: 304418
*	AMDGPU: Set high getCSRFirstUseCost	Matt Arsenault	2017-06-01	1	-0/+6
\| \| \| \|	llvm-svn: 304416
*	TargetMachine: Indicate whether machine verifier passes.	Matthias Braun	2017-05-31	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \|	This adds a callback to the LLVMTargetMachine that lets targets indicate that they do not pass the machine verifier checks in all cases yet. This is intended to be a temporary measure while the targets are fixed, allowing us to enable the machine verifier by default with EXPENSIVE_CHECKS enabled! Differential Revision: https://reviews.llvm.org/D33696 llvm-svn: 304320
*	[AMDGPU] Fix bugs in new waitcnt pass. Add test.	Mark Searles	2017-05-31	1	-4/+22
\| \| \| \| \| \| \| \| \| \| \|	- new waitcnt pass remains off by default; -enable-si-insert-waitcnts=1 to enable it - fix handling of PERMUTE ops - fix insertion of waitcnt instrs at function begin/end ( port of analogous code that was added to old waitcnt pass ) - add new test Differential Revision: https://reviews.llvm.org/D33114 llvm-svn: 304311
*	[AMDGPU][MC] New syntax for ds_swizzle_b32 offset	Dmitry Preobrazhensky	2017-05-31	8	-5/+505
\| \| \| \| \| \| \| \| \| \|	See Bug 28601: https://bugs.llvm.org//show_bug.cgi?id=28601 Reviewers: artem.tamazov, vpykhtin Differential Revision: https://reviews.llvm.org/D33542 llvm-svn: 304309
*	TargetPassConfig: Keep a reference to an LLVMTargetMachine; NFC	Matthias Braun	2017-05-30	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \|	TargetPassConfig is not useful for targets that do not use the CodeGen library, so we may just as well store a pointer to an LLVMTargetMachine instead of just to a TargetMachine. While at it, also change the constructor to take a reference instead of a pointer as the TM must not be nullptr. llvm-svn: 304247
*	[AMDGPU] Allow SDWA in instructions with immediates and SGPRs	Stanislav Mekhanoshin	2017-05-30	3	-17/+42
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The encoding does not allow SDWA in an instruction with scalar operands, either literals or SGPRs. It is, however, possible to copy these operands into a VGPR first. Several copies of the value are produced if multiple SDWA conversions were done. To clean up, runs of MachineLICM (to hoist copies out of loops), MachineCSE (to remove duplicate copies) and SIFoldOperands (to replace an SGPR-to-VGPR copy with an immediate copy directly into the VGPR) are added after the SDWA pass. Differential Revision: https://reviews.llvm.org/D33583 llvm-svn: 304219
*	[AMDGPU] Require waitcnt before barrier for all targets; adjust tests.	Mark Searles	2017-05-30	1	-1/+1
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D33576 llvm-svn: 304217
*	Resubmit r303859 with test fixed.	Konstantin Zhuravlyov	2017-05-26	1	-1/+3
\| \| \| \| \| \| \| \| \| \|	[AMDGPU] add intrinsic for s_getpc Summary: The s_getpc instruction is exposed as intrinsic llvm.amdgcn.s.getpc. Patch by Tim Corringham llvm-svn: 304031
*	Make helper functions static. NFC.	Benjamin Kramer	2017-05-26	3	-3/+7
\| \| \| \|	llvm-svn: 304029
*	[AMDGPU][MC][GFX9] Corrected encoding of flat_scratch* for SDWA opcodes	Dmitry Preobrazhensky	2017-05-26	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	See bug 33171: https://bugs.llvm.org/show_bug.cgi?id=33171 Reviewers: Sam Kolton Differential Revision: https://reviews.llvm.org/D33553 llvm-svn: 304015
*	AMDGPU/GlobalISel: Mark 32-bit float constants as legal	Tom Stellard	2017-05-26	2	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \|	Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D33212 llvm-svn: 304003
*	[AMDGPU] SDWA: add disassembler support for GFX9	Sam Kolton	2017-05-26	5	-31/+113
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: Added decoder methods and tests Reviewers: vpykhtin, artem.tamazov, dp Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D33545 llvm-svn: 303999
*	Revert r303859, CodeGen/AMDGPU/llvm.amdgcn.s.getpc.ll fails on bots.	Nico Weber	2017-05-25	1	-3/+1
\| \| \| \|	llvm-svn: 303902
*	[AMDGPU] add intrinsic for s_getpc	Tim Corringham	2017-05-25	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The s_getpc instruction is exposed as intrinsic llvm.amdgcn.s.getpc. Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D32862 llvm-svn: 303859
*	[AMDGPU] Prevent too large store merges in AMDGPU Subtargets. NFCI.	Nirav Dave	2017-05-24	4	-0/+24
\| \| \| \| \| \| \| \| \|	Various address spaces on the SI and R600 subtargets have stricter limits on memory access size than other address spaces. Use the canMergeStoresTo predicate to prevent the DAGCombiner from creating these stores, as they will be split up during legalization. llvm-svn: 303767
*	Revert "AMDGPU: Fold CI-specific complex SMRD patterns into existing complex patterns"	Marek Olsak	2017-05-24	4	-18/+51
\| \| \| \| \| \| \| \| \| \| \|	This reverts commit e065977c4b5f68ab845400b256f6a3822b1325fa. It doesn't work. S_LOAD_DWORD_IMM_ci and friends aren't selected by any of the patterns, so it was putting 32-bit literals into the 8-bit field. llvm-svn: 303754
*	[AMDGPU] Add INDIRECT_BASE_ADDR to R600_Reg32 class (PR33045)	Simon Pilgrim	2017-05-23	1	-1/+1
\| \| \| \| \| \| \| \|	This fixes 17 of the 41 -verify-machineinstrs test failures identified in PR33045 Differential Revision: https://reviews.llvm.org/D33451 llvm-svn: 303691
*	AMDGPU/SI: Move the local memory usage related checking after calling convention checking in PromoteAlloca	Changpeng Fang	2017-05-23	1	-99/+114
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Promoting Alloca to Vector and Promoting Alloca to LDS are two independent ways of handling an Alloca and should not affect each other. As a result, we should not give up promoting to vector if there is not enough LDS. This patch factors out the local memory usage related checking and places it after the calling convention checking. Reviewer: arsenm Differential Revision: http://reviews.llvm.org/D33139 llvm-svn: 303684
*	[AMDGPU] Combine and (srl) into shl (bfe)	Stanislav Mekhanoshin	2017-05-23	3	-11/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Perform DAG combine: and (srl x, c), mask => shl (bfe x, nb + c, mask >> nb), nb where nb is the number of trailing zeroes in mask. It replaces two instructions with two, and BFE is generally the more expensive one. However, this is only done if we are selecting a byte or word at an aligned boundary, which results in a proper SDWA operand pattern. It is only done if SDWA is supported. TODO: improve SDWA pass to actually convert this pattern. It is not done now because we have an immediate in the instruction, which has to be moved into a VGPR. Differential Revision: https://reviews.llvm.org/D33455 llvm-svn: 303681
*	AMDGPU: Fold CI-specific complex SMRD patterns into existing complex patterns	Marek Olsak	2017-05-23	4	-51/+18
\| \| \| \| \| \| \| \| \| \| \| \|	This is just a cleanup. Also, it adds checking that ByteCount is aligned to 4. Reviewers: arsenm, nhaehnle, tstellarAMD Subscribers: kzhuravl, wdng, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D28994 llvm-svn: 303658
*	[AMDGPU] Convert shl (add) into add (shl)	Stanislav Mekhanoshin	2017-05-23	2	-2/+43
\| \| \| \| \| \| \| \| \| \| \|	shl (or\|add x, c2), c1 => or\|add (shl x, c1), (c2 << c1) This allows folding a constant into an address in some cases, as well as eliminating the second shift if the expression is used as an address and the second shift is a result of a GEP. Differential Revision: https://reviews.llvm.org/D33432 llvm-svn: 303641
*	[AMDGPU] SDWA: Add assembler support for GFX9	Sam Kolton	2017-05-23	13	-64/+552
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Added separate pseudo and real instructions for GFX9 SDWA instructions. Currently supported only in the assembler. Depends on D32493 Reviewers: vpykhtin, artem.tamazov Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D33132 llvm-svn: 303620
*	[AMDGPU] Narrow lshl from 64 to 32 bit if possible	Stanislav Mekhanoshin	2017-05-22	1	-11/+33
\| \| \| \| \| \| \| \| \|	Turn expensive 64 bit shift into 32 bit if shift does not overflow int: shl (ext x) => zext (shl x) Differential Revision: https://reviews.llvm.org/D33367 llvm-svn: 303569
*	[AMDGPU] Fix incorrect register usage tracking in GCNUpwardTracker	Valery Pykhtin	2017-05-22	2	-62/+86
\| \| \| \| \| \|	Differential revision: https://reviews.llvm.org/D33289 llvm-svn: 303548
*	[AMDGPU][MC] Corrected disassembler to decode instructions with 2 literals	Dmitry Preobrazhensky	2017-05-19	2	-4/+12
\| \| \| \| \| \| \| \| \| \|	See bug 32922: https://bugs.llvm.org//show_bug.cgi?id=32922 Reviewers: artem.tamazov, vpykhtin Differential Revision: https://reviews.llvm.org/D32912 llvm-svn: 303428
*	[AMDGPU][MC] Fixed bugs in export instruction	Dmitry Preobrazhensky	2017-05-19	2	-10/+31
\| \| \| \| \| \| \| \| \| \| \| \|	See Bugs 33019, 33056: https://bugs.llvm.org//show_bug.cgi?id=33019 https://bugs.llvm.org//show_bug.cgi?id=33056 Reviewers: artem.tamazov, vpykhtin Differential Revision: https://reviews.llvm.org/D33288 llvm-svn: 303423
*	[LegacyPassManager] Remove TargetMachine constructors	Francis Visoiu Mistrih	2017-05-18	12	-83/+86
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This provides a new way to access the TargetMachine through TargetPassConfig, as a dependency. The patterns replaced here are: * Passes handling a null TargetMachine call `getAnalysisIfAvailable<TargetPassConfig>`. * Passes not handling a null TargetMachine `addRequired<TargetPassConfig>` and call `getAnalysis<TargetPassConfig>`. * MachineFunctionPasses now use MF.getTarget(). * Remove all the TargetMachine constructors. * Remove INITIALIZE_TM_PASS. This fixes a crash when running `llc -start-before prologepilog`. PEI needs StackProtector, which gets constructed without a TargetMachine by the pass manager. The StackProtector pass doesn't handle the case where there is no TargetMachine, so it segfaults. Related to PR30324. Differential Revision: https://reviews.llvm.org/D33222 llvm-svn: 303360
*	[AMDGPU] SDWA operands should not intersect with potential MIs	Sam Kolton	2017-05-18	1	-13/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: There should be no intersection between SDWA operands and potential MIs. E.g.: ``` v_and_b32 v0, 0xff, v1 -> src:v1 sel:BYTE_0 v_and_b32 v2, 0xff, v0 -> src:v0 sel:BYTE_0 v_add_u32 v3, v4, v2 ``` In that example it is possible that we would fold the 2nd instruction into the 3rd (v_add_u32_sdwa) and then try to fold the 1st instruction into the 2nd (which was already destroyed). So if an SDWAOperand is also a potential MI, do not apply it. Reviewers: vpykhtin, arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D32804 llvm-svn: 303347
*	AMDGPU: Start defining a calling convention	Matt Arsenault	2017-05-17	22	-116/+461
\| \| \| \| \| \| \| \|	Partially implement callee-side for arguments and return values. byval doesn't work properly, and most likely neither do sret or other on-stack return values. llvm-svn: 303308
*	AMDGPU: Expand frame indexes to be relative to scratch wave offset	Matt Arsenault	2017-05-17	1	-6/+71
\| \| \| \| \| \| \| \| \| \| \| \|	In order for an arbitrary callee to access an object in a caller's stack frame, the 32-bit offset used as the private pointer needs to be relative to the kernel's scratch wave offset register. Convert to this by finding the difference from the current stack frame and scaling by the wavefront size. llvm-svn: 303303
*	AMDGPU: Change mubuf soffset register when SP relative	Matt Arsenault	2017-05-17	2	-15/+53
\| \| \| \| \| \| \| \| \| \|	Check the MachinePointerInfo for whether the access is supposed to be relative to the stack pointer. No tests because this is used in later commits implementing calls. llvm-svn: 303301
*	AMDGPU: Make better use of op_sel with high components	Matt Arsenault	2017-05-17	2	-8/+57
\| \| \| \| \| \|	Handle more general swizzles. llvm-svn: 303296
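
Three of the commit messages above state algebraic rewrites: r303641 (shl of add), r303681 (and of srl into shl of bfe) and r303569 (narrowing a 64-bit shl). The following is a minimal Python sketch that checks those identities on 32-bit values. It is not the LLVM implementation. All function names are hypothetical, and `bfe` is modeled as a plain shift-and-mask that takes a mask operand, following the notation in the r303681 message; the aligned byte/word and SDWA-support conditions from that message are not modeled.

```python
import random

M32 = 0xFFFFFFFF          # 32-bit wraparound mask
M64 = 0xFFFFFFFFFFFFFFFF  # 64-bit wraparound mask


def shl32(x, s):
    """32-bit logical shift left with wraparound."""
    return (x << s) & M32


def srl32(x, s):
    """32-bit logical shift right."""
    return (x & M32) >> s


def trailing_zeros(mask):
    """Number of trailing zero bits in a non-zero mask."""
    assert mask != 0
    return (mask & -mask).bit_length() - 1


# r303641: shl (add x, c2), c1 => add (shl x, c1), (c2 << c1)
def shl_of_add(x, c1, c2):
    return shl32((x + c2) & M32, c1)


def add_of_shl(x, c1, c2):
    return (shl32(x, c1) + shl32(c2, c1)) & M32


# r303681: and (srl x, c), mask => shl (bfe x, nb + c, mask >> nb), nb
def and_of_srl(x, c, mask):
    return srl32(x, c) & mask


def shl_of_bfe(x, c, mask):
    nb = trailing_zeros(mask)
    # Assumption: "bfe x, off, m" is modeled as (x >> off) & m.
    field = srl32(x, nb + c) & (mask >> nb)
    return shl32(field, nb)


# r303569: shl (ext x) => zext (shl x), valid when the shift fits in 32 bits
def shl64_of_zext(x, s):
    return ((x & M32) << s) & M64


def zext_of_shl32(x, s):
    return shl32(x, s)


def check_all(trials=2000, seed=0):
    """Compare both sides of each rewrite on random 32-bit inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = rng.getrandbits(32)
        c1 = rng.randrange(32)
        c2 = rng.getrandbits(32)
        if shl_of_add(x, c1, c2) != add_of_shl(x, c1, c2):
            return False
        c = rng.randrange(32)
        mask = rng.getrandbits(32) or 1  # avoid a zero mask
        if and_of_srl(x, c, mask) != shl_of_bfe(x, c, mask):
            return False
        s = rng.randrange(32)
        if (x << s) <= M32 and shl64_of_zext(x, s) != zext_of_shl32(x, s):
            return False
    return True
```

Calling `check_all()` compares both sides of each rewrite on random inputs. For the narrowing case the comparison is made only when `x << s` fits in 32 bits, which mirrors the "if shift does not overflow int" condition in the commit message.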