bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	[SimplifyCFG] Use auto * when the type is obvious. NFCI.	Davide Italiano	2017-11-10	1	-11/+8
	llvm-svn: 317923
*	Recommit r317904: [Hexagon] Create HexagonISelDAGToDAG.h, NFC	Krzysztof Parzyszek	2017-11-10	2	-109/+139
	The Windows builder did not reconstruct the HexagonGenDAGISel.inc file after the TableGen binary changed. llvm-svn: 317921
*	AMDGPU/NFC: Split Processors.td into GCNProcessors.td and R600Processors.td	Konstantin Zhuravlyov	2017-11-10	4	-218/+258
	Differential Revision: https://reviews.llvm.org/D39880 llvm-svn: 317920
*	Expand IRBuilder interface for atomic memcpy to require pointer alignments. (NFC)	Daniel Neilson	2017-11-10	2	-11/+16
	Summary: The specification of the @llvm.memcpy.element.unordered.atomic intrinsic requires that the pointer arguments have alignments of at least the element size. The existing IRBuilder interface to create a call to this intrinsic does not allow for providing the alignment of these pointer args. Having an interface that makes it easy to construct invalid intrinsic calls doesn't seem sensible, so this patch simply adds the requirement that one provide the argument alignments when using IRBuilder to create atomic memcpy calls. llvm-svn: 317918
*	Revert "[Hexagon] Create HexagonISelDAGToDAG.h, NFC"	Krzysztof Parzyszek	2017-11-10	2	-139/+109
	This reverts r317904: broke Windows build. llvm-svn: 317916
*	[X86] Merge the template method selectAddrOfGatherScatterNode into selectVectorAddr. NFCI	Craig Topper	2017-11-10	1	-25/+16
	Just need to initialize a couple variables differently based on the node type. No need for a whole separate template method. llvm-svn: 317915
*	[CVP] Remove some {s\|u}add.with.overflow checks.	Sanjoy Das	2017-11-10	1	-0/+55
	Summary: This adds logic to CVP to remove some overflow checks. It uses LVI to remove operations with at least one constant. Specifically, this can remove many overflow intrinsics immediately following an overflow check in the source code, such as: if (x < INT_MAX) ... x + 1 ... Patch by Joel Galenson! Reviewers: sanjoy, regehr Reviewed By: sanjoy Subscribers: fhahn, pirama, srhines, llvm-commits Differential Revision: https://reviews.llvm.org/D39483 llvm-svn: 317911
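A minimal sketch of the source pattern the commit above describes, not code from the patch. The helper names `checkedIncrement` and `guardedIncrement` are hypothetical; `__builtin_sadd_overflow` is the Clang/GCC builtin, which Clang lowers to the `llvm.sadd.with.overflow` intrinsic. Because the guard bounds `x` below `INT_MAX`, the overflow flag on the guarded path can never be set, which is the kind of check the pass can drop.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper: computes x + 1 and reports signed overflow via a flag.
// Assumes a Clang or GCC compiler that provides __builtin_sadd_overflow.
int checkedIncrement(int x, bool &overflowed) {
  int result = 0;
  overflowed = __builtin_sadd_overflow(x, 1, &result);
  return result;
}

// The guard proves x is in [INT_MIN, INT_MAX - 1], so x + 1 cannot overflow
// and the overflow flag on this path is always false.
int guardedIncrement(int x) {
  if (x < INT_MAX) {
    bool overflowed = false;
    int next = checkedIncrement(x, overflowed);
    assert(!overflowed && "unreachable given the guard above");
    return next;
  }
  return x; // Saturate at INT_MAX instead of overflowing.
}
```

Calling `guardedIncrement(41)` takes the guarded path and returns 42; `guardedIncrement(INT_MAX)` skips the addition entirely.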
*	[RISCV] Silence an unused variable warning in release builds [NFC]	Mandeep Singh Grang	2017-11-10	2	-5/+5
	Summary: Also minor cleanups:
	1. Avoided multiple calls to Fixup.getKind()
	2. Avoided multiple calls to getFixupKindInfo()
	3. Removed a redundant return.
	Reviewers: asb, apazos Reviewed By: asb Subscribers: rbar, johnrusso, llvm-commits Differential Revision: https://reviews.llvm.org/D39881 llvm-svn: 317908
*	[Hexagon] Create HexagonISelDAGToDAG.h, NFC	Krzysztof Parzyszek	2017-11-10	2	-109/+139
	llvm-svn: 317904
*	[X86] Add a def file to CPU vendor, type, and subtype encodings used by Host.cpp	Craig Topper	2017-11-10	1	-271/+101
	Summary: I want to leverage this to clean up some of the code in clang. This will allow us to simplify D39521 which was trying to do some of the same. If we accurately keep the code in Host.cpp synced with new CPUs added to compiler-rt/libgcc we should be able to use this file as a proxy for what's implemented in the libraries. The entries for the CPUs recognized by the libraries use separate macros that define additional parameters like the name for __builtin_cpu_is and an alias string for the couple of cases where __builtin_cpu_is accepts two different names. All of the macros contain an ARCHNAME that is usually the same as the __builtin_cpu_is string, but sometimes isn't. This represents the name recognized by X86.td and -march. I'm following the precedent set by ARM and AArch64 and adding this information to lib/Support/TargetParser.cpp Reviewers: erichkeane, echristo, asbirlea Reviewed By: echristo Subscribers: llvm-commits, aemerson, kristof.beyls Differential Revision: https://reviews.llvm.org/D39782 llvm-svn: 317900
*	LTO: don't fatal when value for cache key already exists	Bob Haarman	2017-11-10	1	-2/+15
	Summary: LTO/Caching.cpp uses file rename to atomically set the value for a cache key. On Windows, this fails when the destination file already exists. Previously, LLVM would report_fatal_error in such cases. However, because the old and the new value for the cache key are supposed to be equivalent, it actually doesn't matter which one we keep. This change makes it so that failing the rename when an openable file with the desired name already exists causes us to report success instead of fataling. Reviewers: pcc, hans Subscribers: mehdi_amini, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D39874 llvm-svn: 317899
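The fallback described above can be sketched with the C++ standard library alone. This is an illustration under stated assumptions, not the code in LTO/Caching.cpp: `commitCacheEntry` is a hypothetical name, and `std::rename` stands in for whatever rename primitive LLVM actually calls.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical helper: publish tmpPath under destPath.
// Returns true if destPath holds a usable value afterwards.
bool commitCacheEntry(const std::string &tmpPath, const std::string &destPath) {
  // Normal case: the rename succeeds and the new value is in place.
  if (std::rename(tmpPath.c_str(), destPath.c_str()) == 0)
    return true;

  // Rename failed. If an openable file already exists at the destination,
  // another writer stored an equivalent value, so report success.
  std::ifstream existing(destPath, std::ios::binary);
  if (existing.is_open()) {
    std::remove(tmpPath.c_str()); // Best-effort cleanup of our copy.
    return true;
  }

  // No usable destination file: this is a genuine failure.
  return false;
}
```

The design point is that equivalence of old and new values makes the race benign, so only the case with no readable destination is treated as an error.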
*	[WebAssembly] Fix stack offsets of return values from call lowering.	Jatin Bhateja	2017-11-10	1	-2/+2
	Summary: Fixes PR35220 Reviewers: vadimcn, alexcrichton Reviewed By: alexcrichton Subscribers: pepyakin, alexcrichton, jfb, dschuff, sbc100, jgravelle-google, llvm-commits, aheejin Differential Revision: https://reviews.llvm.org/D39866 llvm-svn: 317895
*	[AMDGPU] Prevent Machine Copy Propagation from replacing live copy with the dead one	Alexander Timofeev	2017-11-10	1	-0/+2
	Differential revision: https://reviews.llvm.org/D38754 llvm-svn: 317884
*	[llvm-opt-fuzzer] Introduce llvm-opt-fuzzer for fuzzing optimization passes	Igor Laevsky	2017-11-10	1	-0/+34
	This change adds generic fuzzing tools capable of running libFuzzer tests on any optimization pass or combination of them. Differential Revision: https://reviews.llvm.org/D39555 llvm-svn: 317883
*	[RegisterCoalescer] Move debug value after rematerialize trivial def	Karl-Johan Karlsson	2017-11-10	1	-0/+3
	Summary: The associated debug value is updated when the virtual source register of a copy is completely eliminated and replaced with a rematerialized value in the defined register of the copy. As the debug value is now associated with another register, it also needs to be moved; otherwise the debug value isn't valid. Reviewers: aprantl Reviewed By: aprantl Subscribers: MatzeB, llvm-commits, qcolombet Differential Revision: https://reviews.llvm.org/D38024 llvm-svn: 317880
*	[RegAlloc, SystemZ] Increase number of LOCRs by passing "hard" regalloc hints.	Jonas Paulsson	2017-11-10	8	-10/+115
	* The method getRegAllocationHints() now returns bool instead of void. If true is returned, regalloc (AllocationOrder) will only try to allocate the hints, as opposed to merely trying them before non-hinted registers.
	* TargetRegisterInfo::getRegAllocationHints() is implemented for SystemZ with an increase in number of LOCRs. In this case, it is desired to force the hints even though there is a slight increase in spilling, because if a non-hinted register would be allocated, the LOCRMux pseudo would have to be expanded with a jump sequence. The LOCR (Load On Condition) SystemZ instruction must have both operands in either the low or high part of the 64 bit register.
	Reviewers: Quentin Colombet and Ulrich Weigand https://reviews.llvm.org/D36795 llvm-svn: 317879
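The difference between soft and hard hints can be illustrated with a small model. This is a sketch only: `buildAllocationOrder` and the `Reg` alias are hypothetical and do not mirror the real AllocationOrder class. With soft hints the hinted registers are tried first and the rest follow; with hard hints nothing else is tried.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

using Reg = unsigned; // Hypothetical stand-in for a physical register id.

// Builds the order in which candidate registers are tried.
// hintsAreHard models the new bool result of getRegAllocationHints().
std::vector<Reg> buildAllocationOrder(const std::vector<Reg> &hints,
                                      const std::vector<Reg> &defaultOrder,
                                      bool hintsAreHard) {
  std::vector<Reg> order = hints; // Hints always come first.
  if (hintsAreHard)
    return order; // Hard hints: non-hinted registers are never considered.

  // Soft hints: append the remaining registers in their default order.
  for (Reg r : defaultOrder)
    if (std::find(hints.begin(), hints.end(), r) == hints.end())
      order.push_back(r);
  return order;
}
```

With hint `{3}` and default order `{1, 2, 3}`, the soft variant tries 3, 1, 2 while the hard variant tries only 3, which is what can increase spilling when the hinted register is unavailable.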
*	[X86] Add support for combining FMADDSUB(A, B, FNEG(C))->FMSUBADD(A, B, C)	Craig Topper	2017-11-10	1	-0/+31
	Support the opposite direction as well. Also add a TODO for not being able to combine FMSUB/FNMADD/FNMSUB with FNEG. llvm-svn: 317878
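The identity behind this combine can be checked with a scalar model. The sketch below assumes the usual x86 lane convention (FMADDSUB subtracts in even lanes and adds in odd lanes, FMSUBADD does the opposite); the function names are illustrative and the model ignores fused rounding.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Vec4 = std::array<double, 4>;

// Scalar model of FMADDSUB: even lanes a*b - c, odd lanes a*b + c.
Vec4 fmaddsub(const Vec4 &a, const Vec4 &b, const Vec4 &c) {
  Vec4 r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = (i % 2 == 0) ? a[i] * b[i] - c[i] : a[i] * b[i] + c[i];
  return r;
}

// Scalar model of FMSUBADD: even lanes a*b + c, odd lanes a*b - c.
Vec4 fmsubadd(const Vec4 &a, const Vec4 &b, const Vec4 &c) {
  Vec4 r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = (i % 2 == 0) ? a[i] * b[i] + c[i] : a[i] * b[i] - c[i];
  return r;
}

// Lane-wise negation, standing in for FNEG.
Vec4 fneg(const Vec4 &v) {
  Vec4 r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = -v[i];
  return r;
}
```

Negating `c` swaps the add and subtract lanes, so `fmaddsub(a, b, fneg(c))` and `fmsubadd(a, b, c)` produce the same vector; the same argument gives the opposite direction.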
*	[AMDGPU] Fix pointer info for lowering load/store for r600 for amdgiz environment	Yaxun Liu	2017-11-10	1	-3/+7
	r600 uses dummy pointer info for lowering load/store. Since dummy pointer info assumes address space 0, this causes isel failure when temporary load/store SDNodes are generated for amdgiz environment. Since the offset is not constant, FixedStack pseudo source value cannot be used to create the pointer info. This patch creates pointer info using llvm undef value. At least this provides correct address space so that isel can be done correctly. Differential Revision: https://reviews.llvm.org/D39698 llvm-svn: 317862
*	[AMDGPU] Fix pointer info for pseudo source for r600	Yaxun Liu	2017-11-10	2	-0/+21
	The pointer info for pseudo source for r600 is not correct when alloca addr space is not 0, which causes invalid SDNode for r600---amdgiz. This patch fixes that. Differential Revision: https://reviews.llvm.org/D39670 llvm-svn: 317861
*	[ThinLTO] Fix missing call graph edges for calls with bitcasts.	Volodymyr Sapsai	2017-11-10	1	-3/+7
	This change doesn't fix the root cause of the miscompile PR34966 as the root cause is in the linker ld64. This change makes the call graph more complete, allowing better module imports/exports. rdar://problem/35344706 Reviewers: tejohnson Reviewed By: tejohnson Subscribers: mehdi_amini, inglorion, eraman, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D39356 llvm-svn: 317853
*	[support] allocate exact size required for mapping in Support/Windows/Path.inc	Bob Haarman	2017-11-10	1	-2/+2
	Summary: zturner suggested that mapped_file_region::init() on Windows seems to create mappings that are larger than they need to be: Offset+Size instead of Size. Indeed, that appears to be the case. I confirmed that tests pass with mappings of just Size bytes, and fail with Size-1 bytes, suggesting that Size is indeed the correct value. Reviewers: amccarth, zturner Reviewed By: zturner Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D39876 llvm-svn: 317850
*	Add a wrapper function to set branch weights metadata.	Easwaran Raman	2017-11-09	1	-28/+34
	Summary: This wrapper checks if there is at least one non-zero weight before setting the metadata. Reviewers: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39872 llvm-svn: 317845
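The check described above amounts to a guard in front of the metadata write. A minimal sketch follows; the `Branch` struct and `setBranchWeightsIfUseful` are hypothetical stand-ins and not the LLVM types or the wrapper's real name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an instruction that can carry branch weights.
struct Branch {
  bool hasWeights = false;
  std::vector<std::uint32_t> weights;
};

// Attaches weights only if at least one of them is non-zero.
// Returns true when the weights were actually set.
bool setBranchWeightsIfUseful(Branch &br,
                              const std::vector<std::uint32_t> &weights) {
  bool anyNonZero = std::any_of(weights.begin(), weights.end(),
                                [](std::uint32_t w) { return w != 0; });
  if (!anyNonZero)
    return false; // All-zero weights carry no information; skip them.
  br.weights = weights;
  br.hasWeights = true;
  return true;
}
```

Centralizing the guard in one wrapper keeps each call site from repeating the all-zero test.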
*	Fix out-of-order stepping behavior in programs with hoisted constants.	Paul Robinson	2017-11-09	1	-1/+6
	When the Constant Hoisting pass moves expensive constants into a common block, it would assign a debug location equal to the last use of that constant. While this is certainly intuitive, it places the constant in an out-of-order location, according to the debug location information. This produces out-of-order stepping when debugging programs affected by this pass. This patch creates in-order stepping behavior by merging the debug locations for hoisted constants, and the new insertion point. Patch by Matthew Voss! Differential Revision: https://reviews.llvm.org/D38088 llvm-svn: 317827
*	Preserve debug info when DAG-combining (zext (truncate x)) -> (and x, mask).	Adrian Prantl	2017-11-09	3	-33/+46
	rdar://problem/27139077 llvm-svn: 317825
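For readers unfamiliar with the combine named in this title, the equivalence is easy to see on plain integers. The sketch below uses an 8-bit truncation of a 32-bit value as an assumed example width; the function names are illustrative.

```cpp
#include <cassert>
#include <cstdint>

// (zext (truncate x)): drop the high 24 bits, then zero-extend back to 32.
std::uint32_t zextOfTrunc(std::uint32_t x) {
  return static_cast<std::uint32_t>(static_cast<std::uint8_t>(x));
}

// (and x, mask): keep only the low 8 bits directly.
std::uint32_t andWithMask(std::uint32_t x) {
  return x & 0xFFu;
}
```

Both functions return the low byte of `x` in a 32-bit value, so the pair of conversions can be replaced by a single AND.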
*	[Support] Make llvm::Error and Expected faster.	Zachary Turner	2017-11-09	1	-0/+12
	Whenever LLVM_ENABLE_ABI_BREAKING_CHECKS is enabled, which is usually the case for example when asserts are enabled, Error's destructor does some additional checking to make sure that it does not represent an error condition and that it was checked. However, this is -- by definition -- not the likely codepath. Some profiling shows that at least with some compilers, simply calling assertIsChecked -- in a release build with full optimizations -- can account for up to 15% of the entire runtime of the program, even though this function should almost literally be a no-op. The problem is that the assertIsChecked function can be considered too big to inline depending on the compiler's inliner. Since it's unlikely to ever need the failure path though, we can move it out of line and force it to not be inlined, so that the fast path can be inlined. In my test (using lld to link clang with CMAKE_BUILD_TYPE=Release and LLVM_ENABLE_ASSERTIONS=ON), this reduces link time from 27 seconds to 23.5 seconds, which is a solid 15% gain. llvm-svn: 317824
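The technique described above, keeping a tiny inline check and moving the failure handling into a separate non-inlined function, can be sketched as follows. This is an illustration and not llvm::Error: the class `CheckedResult` and its members are hypothetical, and the attribute and `__builtin_expect` spellings assume GCC or Clang.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// Hypothetical result type that must be inspected before it is destroyed.
class CheckedResult {
  bool Failed;
  bool Checked = false;

  // Slow path: out of line and never inlined, so it adds no size to callers.
  __attribute__((noinline, noreturn)) void reportUnchecked() const;

  // Fast path: a single predictable branch that is cheap to inline.
  void assertIsChecked() const {
    if (__builtin_expect(!Checked, 0))
      reportUnchecked();
  }

public:
  explicit CheckedResult(bool failed) : Failed(failed) {}
  ~CheckedResult() { assertIsChecked(); }

  // Inspecting the result marks it as checked.
  bool failed() {
    Checked = true;
    return Failed;
  }
};

void CheckedResult::reportUnchecked() const {
  std::fputs("CheckedResult destroyed without being checked\n", stderr);
  std::abort();
}

// Usage helper: create a result, check it, and let it go out of scope.
bool checkAndDrop(bool failed) {
  CheckedResult r(failed);
  return r.failed();
}
```

Because the destructor body reduces to one branch, an inliner can fold it into every caller while the diagnostic code stays in a single cold function.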
*	[SLP] Fix PR23510: Try to find best possible vectorizable stores.	Alexey Bataev	2017-11-09	1	-23/+30
	Summary: The analysis of the store sequence goes in straight order - from the first store to the last. But the best opportunity for vectorization will happen if we're going to use reverse order - from last store to the first. It may be best because usually users have some initialization part + further processing and this first initialization may confuse SLP vectorizer. Reviewers: RKSimon, hfinkel, mkuper, spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39606 llvm-svn: 317821
*	[Reassociate] don't name values "tmp"; NFCI	Sanjay Patel	2017-11-09	1	-2/+2
	The toxic stew of created values named 'tmp' and tests that already have values named 'tmp' and CHECK lines looking for values named 'tmp' causes bad things to happen in our test line auto-generation scripts because it wants to use 'TMP' as a prefix for unnamed values. Use less 'tmp' to avoid that. llvm-svn: 317818
*	[GlobalMerge] Stable sort GlobalSets to fix non-deterministic sort order	Mandeep Singh Grang	2017-11-09	1	-1/+1
	Summary: This fixes failure in CodeGen/AArch64/global-merge-group-by-use.ll uncovered by D39245. Reviewers: ab, asl Reviewed By: ab Subscribers: aemerson, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D39635 llvm-svn: 317817
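Why a stable sort fixes non-deterministic ordering can be shown with a small example. The `GlobalSet` struct and the `useCount` key below are hypothetical; the real sort key in GlobalMerge may differ. The point is only that `std::stable_sort` keeps elements with equal keys in their input order, whereas `std::sort` makes no such promise.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record; only useCount participates in the comparison.
struct GlobalSet {
  std::string name;
  unsigned useCount;
};

// Sorts by useCount; ties keep their original relative order.
void sortByUseCount(std::vector<GlobalSet> &sets) {
  std::stable_sort(sets.begin(), sets.end(),
                   [](const GlobalSet &a, const GlobalSet &b) {
                     return a.useCount < b.useCount;
                   });
}
```

Given the input order a(2), b(1), c(2), d(1), the result is always b, d, a, c, so output that depends on this order is reproducible across standard library implementations.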
*	revert r317812 [BasicAA] fix build break by converting the previously introduced assert into an if stmt	Nuno Lopes	2017-11-09	1	-2/+1
	The code has a bug, but some tests regress. I'll discuss this further on the mailing list. llvm-svn: 317815
*	[BasicAA] fix build break by converting the previously introduced assert into an if stmt	Nuno Lopes	2017-11-09	1	-2/+2
	Apparently V1Size == -1 doesn't imply V2Size == -1, which is a bit surprising to me. llvm-svn: 317812
*	[SystemZ] Add support for the "o" inline asm constraint	Ulrich Weigand	2017-11-09	2	-0/+5
	We don't really need any special handling of "offsettable" memory addresses, but since some existing code uses inline asm statements with the "o" constraint, add support for this constraint for compatibility purposes. llvm-svn: 317807
*	[BasicAA] add assertion for corner case in aliasGEP()	Nuno Lopes	2017-11-09	1	-0/+1
	llvm-svn: 317803
*	[mips] Correct microMIPS jump and add unconditional branch pseudo	Simon Dardis	2017-11-09	4	-18/+29
	Correct the definition of 'j' as being unavailable for microMIPS32R6 and provide the 'b' assembly idiom for codegen purposes for microMIPS32r3. Provide the necessary 'br' pattern for microMIPS32R6 as it no longer incorrectly uses the 'j' instruction. Reviewers: atanasyan Differential Revision: https://reviews.llvm.org/D39741 llvm-svn: 317801
*	[RISCV] MC layer support for the standard RV32A instruction set extension	Alex Bradbury	2017-11-09	6	-12/+128
	llvm-svn: 317791
*	Fix 'not all control paths return a value' warning on MSVC builds	Simon Pilgrim	2017-11-09	1	-0/+1
	llvm-svn: 317790
*	Reapply: Allow yaml2obj to order implicit sections for ELF	Dave Lee	2017-11-09	1	-1/+1
	Summary: This change allows yaml input to control the order of implicitly added sections (`.symtab`, `.strtab`, `.shstrtab`). The order is controlled by adding a placeholder section of the given name to the Sections field. This change is to support changes in D39582, where it is desirable to control the location of the `.dynsym` section. This reapplied version fixes:
	1. use of a function call within an assert
	2. failing lld test which has an unnamed section
	3. incorrect section count when given an unnamed section
	Additionally, one more test to cover the unnamed section failure. Reviewers: compnerd, jakehehrlich Reviewed By: jakehehrlich Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39749 llvm-svn: 317789
*	[RISCV] MC layer support for the standard RV32M instruction set extension	Alex Bradbury	2017-11-09	4	-4/+45
	llvm-svn: 317788
*	Sched model improving on btver2: JFPU01 resource, vtestp* for xmm.	Andrew V. Tischenko	2017-11-09	1	-11/+26
	Differential Revision: https://reviews.llvm.org/D39802 llvm-svn: 317785
*	Add -print-schedule scheduling comments to inline asm.	Andrew V. Tischenko	2017-11-09	4	-14/+17
	Differential Revision: https://reviews.llvm.org/D39728 llvm-svn: 317782
*	[X86] Give priority to EVEX FMA instructions over FMA4 instructions.	Craig Topper	2017-11-09	3	-63/+69
	No existing processor has both so it doesn't really matter what we do here. But we were previously just relying on pattern order which gave FMA4 priority. llvm-svn: 317775
*	Fix "default label in switch which covers all enumeration values" warning	Vitaly Buka	2017-11-09	1	-2/+0
	llvm-svn: 317771
*	[SectionMemoryManager] Abstract out mmap, munmap, mprotect even more ; NFC	Sanjoy Das	2017-11-09	1	-25/+69
	Summary: This will let ORC JIT clients plug in custom logic for the mmap, munmap and mprotect paths. Reviewers: loladiro, dblaikie Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D39300 llvm-svn: 317770
*	[X86] Make X86ISD::FMADDS3 isel patterns commutable.	Craig Topper	2017-11-09	1	-4/+4
	This was missed when FMADDS3 was split from X86ISD::FMADDS3_RND. llvm-svn: 317769
*	[GVN PRE] Patch the source for Phi node in PRE	Serguei Katkov	2017-11-09	1	-2/+5
	We must patch all existing incoming values of the Phi node; otherwise it is possible that we see poison where the program does not expect to see it. This is similar to what GVN does. The added test test/Transforms/GVN/PRE/pre-jt-add.ll shows an example of a wrong optimization done by jump threading because GVN PRE did not patch the existing incoming value. Reviewers: mkazantsev, wmi, dberlin, davide Reviewed By: dberlin Subscribers: efriedma, llvm-commits Differential Revision: https://reviews.llvm.org/D39637 llvm-svn: 317768
*	[Coverage] Use the wrapped segment when a line has entry segments	Vedant Kumar	2017-11-09	1	-4/+4
	We've worked around bugs in the frontend by ignoring the count from wrapped segments when a line has at least one region entry segment. Those frontend bugs are now fixed, so it's time to regenerate the checked-in covmapping files and remove the workaround. llvm-svn: 317761
*	AMDGPU: Merge BUFFER_STORE_DWORD_OFFEN/OFFSET into x2, x4	Marek Olsak	2017-11-09	1	-4/+109
	Summary: Only 56 shaders (out of 48486) are affected.
	Totals from affected shaders (changed stats only):
	SGPRS: 2420 -> 2460 (1.65 %)
	Spilled VGPRs: 94 -> 112 (19.15 %)
	Scratch size: 524 -> 528 (0.76 %) dwords per thread
	Code Size: 187400 -> 184992 (-1.28 %) bytes
	One DiRT Showdown shader spills 6 more VGPRs. One Grid Autosport shader spills 12 more VGPRs. The other 54 shaders only have a decrease in code size. (I'm ignoring the SGPR noise) Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D39012 llvm-svn: 317755
*	AMDGPU: Lower buffer store and atomic intrinsics manually	Marek Olsak	2017-11-09	5	-20/+206
	Summary: Without this, SIMemoryLegalizer inserts s_waitcnt vmcnt(0) before every buffer store and atomic instruction. Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D39060 llvm-svn: 317754
*	AMDGPU: Merge BUFFER_LOAD_DWORD_OFFSET into x2, x4	Marek Olsak	2017-11-09	1	-13/+37
	Summary: Only 3 (out of 48486) shaders are affected. Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D38951 llvm-svn: 317753
*	AMDGPU: Merge BUFFER_LOAD_DWORD_OFFEN into x2, x4	Marek Olsak	2017-11-09	1	-26/+141
	Summary: -9.9% code size decrease in affected shaders.
	Totals (changed stats only):
	SGPRS: 2151462 -> 2170646 (0.89 %)
	VGPRS: 1634612 -> 1640288 (0.35 %)
	Spilled SGPRs: 8942 -> 8940 (-0.02 %)
	Code Size: 52940672 -> 51727288 (-2.29 %) bytes
	Max Waves: 373066 -> 371718 (-0.36 %)
	Totals from affected shaders:
	SGPRS: 283520 -> 302704 (6.77 %)
	VGPRS: 227632 -> 233308 (2.49 %)
	Spilled SGPRs: 3966 -> 3964 (-0.05 %)
	Code Size: 12203080 -> 10989696 (-9.94 %) bytes
	Max Waves: 44070 -> 42722 (-3.06 %)
	Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D38950 llvm-svn: 317752
*	AMDGPU: Merge S_BUFFER_LOAD_DWORD_IMM into x2, x4	Marek Olsak	2017-11-09	2	-14/+125
	Summary: Only constant offsets (*_IMM opcodes) are merged. It reuses code for LDS load/store merging. It relies on the scheduler to group loads. The results are mixed, I think they are mostly positive. Most shaders are affected, so here are total stats only:
	SGPRS: 2072198 -> 2151462 (3.83 %)
	VGPRS: 1628024 -> 1634612 (0.40 %)
	Spilled SGPRs: 7883 -> 8942 (13.43 %)
	Spilled VGPRs: 97 -> 101 (4.12 %)
	Scratch size: 1488 -> 1492 (0.27 %) dwords per thread
	Code Size: 60222620 -> 52940672 (-12.09 %) bytes
	Max Waves: 374337 -> 373066 (-0.34 %)
	There is a 13.4% increase in SGPR spilling, DiRT Showdown spills a few more VGPRs (now 37), but a 12% decrease in code size. These are the new stats for SGPR spilling. We already spill a lot of SGPRs, so it's uncertain whether more spilling will make any difference since SGPRs are always spilled to VGPRs:
	SGPR SPILLING APPS     Shaders  SpillSGPR  AvgPerSh
	alien_isolation           2938        100       0.0
	batman_arkham_origins      589          6       0.0
	bioshock-infinite         1769          4       0.0
	borderlands2              3968         22       0.0
	counter_strike_glob..     1142         60       0.1
	deus_ex_mankind_div..     1410         79       0.1
	dirt-showdown              533          4       0.0
	dirt_rally                 364       1163       3.2
	divinity                  1052          2       0.0
	dota2                     1747          7       0.0
	f1-2015                    776       1515       2.0
	grid_autosport            1767       1505       0.9
	hitman                    1413        273       0.2
	left_4_dead_2             1762          4       0.0
	life_is_strange           1296         26       0.0
	mad_max                    358         96       0.3
	metro_2033_redux          2670         60       0.0
	payday2                   1362         22       0.0
	portal                     474          3       0.0
	saints_row_iv             1704          8       0.0
	serious_sam_3_bfe          392       1348       3.4
	shadow_of_mordor          1418         12       0.0
	shadow_warrior            3956        239       0.1
	talos_principle            324       1735       5.4
	thea                       172         17       0.1
	tomb_raider               1449        215       0.1
	total_war_warhammer        242         56       0.2
	ue4_effects_cave           295         55       0.2
	ue4_elemental              572         12       0.0
	unigine_tropics            210         56       0.3
	unigine_valley             278        152       0.5
	victor_vran               1262         84       0.1
	yofrankie                   82          2       0.0
	Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D38949 llvm-svn: 317751