bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[mips] Prevent shrink-wrap for BuildPairF64, ExtractElementF64 when they use $sp	Vladimir Stefanovic	2018-08-29	2	-0/+182
	For a certain combination of options, BuildPairF64{_64} and ExtractElementF64{_64} may be expanded into instructions using the stack. Add implicit operand $sp for such cases so that ShrinkWrapping doesn't move prologue setup below them. Fixes MultiSource/Benchmarks/MallocBench/cfrac for '--target=mips-img-linux-gnu -mcpu=mips32r6 -mfpxx -mnan=2008' and '--target=mips-img-linux-gnu -mcpu=mips32r6 -mfp64 -mnan=2008 -mno-odd-spreg'. Differential Revision: https://reviews.llvm.org/D50986 llvm-svn: 340927
*	[DAGCombiner] Add X / X -> 1 & X % X -> 0 folds	Simon Pilgrim	2018-08-29	4	-259/+42
	Adds more divrem folds to try and get in sync with InstructionSimplify Differential Revision: https://reviews.llvm.org/D50636 llvm-svn: 340919
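As a rough illustration of the two folds named in this commit, the sketch below applies X / X -> 1 and X % X -> 0 to a toy expression tuple. The `fold_divrem` helper and the `(op, lhs, rhs)` encoding are hypothetical and are not the DAGCombiner API; the fold is assumed to be valid because division by zero is undefined, so the X == 0 case need not be preserved.

```python
def fold_divrem(node):
    """Fold a toy (op, lhs, rhs) node when both operands are the same value."""
    op, lhs, rhs = node
    if lhs == rhs:
        if op in ("sdiv", "udiv"):
            return ("const", 1)  # X / X -> 1
        if op in ("srem", "urem"):
            return ("const", 0)  # X % X -> 0
    return node  # nothing to fold


print(fold_divrem(("udiv", "x", "x")))
print(fold_divrem(("srem", "x", "y")))
```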
*	[DAGCombiner] Add X / X -> 1 & X % X -> 0 folds (test tweaks)	Simon Pilgrim	2018-08-29	1	-37/+37
	Adjust missed test to avoid the X / X -> 1 & X % X -> 0 folds while keeping their original purposes. Differential Revision: https://reviews.llvm.org/D50636 llvm-svn: 340917
*	[DAGCombiner] Add X / X -> 1 & X % X -> 0 folds (test tweaks)	Simon Pilgrim	2018-08-29	3	-50/+63
	Adjust tests to avoid the X / X -> 1 & X % X -> 0 folds while keeping their original purposes. Differential Revision: https://reviews.llvm.org/D50636 llvm-svn: 340916
*	[X86][AVX] Prefer VPBLENDW+VPBLENDD to VPBLENDVB for v16i16 blend shuffles	Simon Pilgrim	2018-08-29	5	-85/+53
	Noticed while looking at D49562 codegen - we can avoid a large constant mask load and a slow VPBLENDVB select op by using VPBLENDW+VPBLENDD instead. TODO: As discussed on the patch, we should investigate adding VPBLENDVB handling to target shuffle combining as well, that will allow us to extend this to VPBLENDW+VPBLENDW+VPBLENDD. Differential Revision: https://reviews.llvm.org/D50074 llvm-svn: 340913
*	AMDGPU: Fix getInstSizeInBytes	Nicolai Haehnle	2018-08-29	1	-1/+1
	Summary: Add some optional code to validate getInstSizeInBytes for emitted instructions. This flushed out some issues which are fixed by this patch: - Streamline getInstSizeInBytes - Properly define the VI readlane/writelane instruction as VOP3 - Fix the inline constant determination. Specifically, this change fixes an issue where a 32-bit value of 0xffffffff was recorded as unsigned. This is equal to -1 when restricting to a 32-bit comparison, and an inline constant can be used. Reviewers: arsenm, rampitec Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D50629 Change-Id: Id87c3b7975839da0de8156a124b0ce98c5fb47f2 llvm-svn: 340903
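The inline constant point in this message can be sketched as follows: a value stored as unsigned 0xffffffff is -1 once it is reinterpreted as a signed 32-bit integer. The helper names are hypothetical, and the inline integer range of -16 to 64 is an assumption stated here for illustration rather than something taken from the commit.

```python
def as_signed32(value):
    """Reinterpret the low 32 bits of value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def is_inline_int(value, lo=-16, hi=64):
    """Check the signed 32-bit view against an assumed inline range [lo, hi]."""
    return lo <= as_signed32(value) <= hi


print(as_signed32(0xFFFFFFFF), is_inline_int(0xFFFFFFFF))
```

Comparing the raw unsigned value against the same range would reject 0xffffffff, which is the mismatch the message describes.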
*	[X86] Support v2i32 gather/scatter indices with -x86-experimental-vector-widening-legalization	Craig Topper	2018-08-29	1	-0/+482
	Summary: This is split out from D41062 to cover the code in LegalVectorTypes.cpp Reviewers: RKSimon, spatel, efriedma Reviewed By: efriedma Subscribers: sdardis, jvesely, nhaehnle, jrtc27, atanasyan, llvm-commits Differential Revision: https://reviews.llvm.org/D51337 llvm-svn: 340891
*	Start reserving x18 by default on Android targets.	Peter Collingbourne	2018-08-29	1	-0/+1
	Differential Revision: https://reviews.llvm.org/D45588 llvm-svn: 340889
*	[X86] Add intrinsics for KADD instructions	Craig Topper	2018-08-28	2	-0/+100
	These are intrinsics for supporting kadd builtins in clang. These builtins are already in gcc to implement intrinsics from icc, though they are missing from the Intel Intrinsics Guide. This instruction adds two mask registers together as if they were scalar rather than a vXi1. We might be able to get away with a bitcast to scalar and a normal add instruction, but that would require DAG combine smarts in the backend to recognize add+bitcast. For now I'd prefer to go with the easiest implementation so we can get these builtins into clang with good codegen. Differential Revision: https://reviews.llvm.org/D51370 llvm-svn: 340869
*	AMDGPU: Don't delete instructions if S_ENDPGM has implicit uses	Matt Arsenault	2018-08-28	1	-0/+17
	This can leave behind the uses with the defs removed. Since this should only really happen in tests, it's not worth the effort of trying to handle this. llvm-svn: 340866
*	[GISel]: Add missing opcodes for overflow intrinsics	Aditya Nandakumar	2018-08-28	2	-4/+14
	https://reviews.llvm.org/D51197 Currently, IRTranslator (and GISel) seems to be arbitrarily picking which overflow intrinsics get mapped into opcodes which either have a carry as an input or not. For intrinsics such as Intrinsic::uadd_with_overflow, translate it to an opcode (G_UADDO) which doesn't have any carry inputs (similar to LLVM IR). This patch adds 4 missing opcodes for completeness - G_UADDO, G_USUBO, G_SSUBE and G_SADDE. llvm-svn: 340865
*	AMDGPU: Force shrinking of add/sub even if the carry is used	Matt Arsenault	2018-08-28	1	-4/+31
	The original motivating example uses a 64-bit add, so the carry is used. Insert a copy from VCC. This may allow shrinking of the used carry instruction. At worst, we are replacing a mov to materialize the constant with a copy of vcc. llvm-svn: 340862
*	AMDGPU: Shrink insts to fold immediates	Matt Arsenault	2018-08-28	2	-0/+426
	This needs to be done in the SSA fold operands pass to be effective, so there is a bit of overlap with SIShrinkInstructions but I don't think this is practically avoidable. llvm-svn: 340859
*	[WebAssembly][NFC] Fix up SIMD bitwise tests	Thomas Lively	2018-08-28	1	-12/+15
	Summary: The updated tests were previously infallible because the SIMD bitwise operations do not contain vector types in their names. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D51369 llvm-svn: 340858
*	[WebAssembly] v128.not	Thomas Lively	2018-08-28	1	-0/+49
	Implementation and tests. llvm-svn: 340857
*	[X86] Fix copy paste mistake in vector-idiv-v2i32.ll. Add missing test case.	Craig Topper	2018-08-28	1	-61/+169
	Some of the test cases contained the same load twice instead of a different load. llvm-svn: 340833
*	[AMDGPU] Add support for a16 modifier for gfx9	Ryan Taylor	2018-08-28	2	-0/+571
	Summary: Adding support for a16 for gfx9. A16 bit replaces r128 bit for gfx9. Change-Id: Ie8b881e4e6d2f023fb5e0150420893513e5f4841 Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D50575 llvm-svn: 340831
*	[X86][SSE] Improve variable scalar shift of vXi8 vectors (PR34694)	Simon Pilgrim	2018-08-28	10	-987/+357
	This patch creates the shift mask and actual shift using the vXi16 vector shift ops. Differential Revision: https://reviews.llvm.org/D51263 llvm-svn: 340813
*	[X86][SSE] Avoid vector extraction/insertion for non-constant uniform shifts	Simon Pilgrim	2018-08-28	6	-30/+32
	As discussed on D51263, we're better off using byte shifts to clear the upper bits on pre-SSE41 hardware. llvm-svn: 340810
*	[DAGCombiner][AMDGPU][Mips] Fold bitcast with volatile loads if the resulting load is legal for the target.	Craig Topper	2018-08-28	4	-35/+19
	Summary: I'm not sure if this patch is correct or if it needs more qualifying somehow. Bitcast shouldn't change the size of the load so it should be ok? We already do something similar for stores. We'll change the type of a volatile store if the resulting store is Legal or Custom. I'm not sure we should be allowing Custom there... I was playing around with converting X86 atomic loads/stores(except seq_cst) into regular volatile loads and stores during lowering. This would allow some special RMW isel patterns in X86InstrCompiler.td to be removed. But there's some floating point patterns in there that didn't work because we don't fold (f64 (bitconvert (i64 volatile load))) or (f32 (bitconvert (i32 volatile load))). Reviewers: efriedma, atanasyan, arsenm Reviewed By: efriedma Subscribers: jvesely, arsenm, sdardis, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, arichardson, jrtc27, atanasyan, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D50491 llvm-svn: 340797
*	[PPC] Remove Darwin support from POWER backend.	Kit Barton	2018-08-28	120	-652/+579
	This patch issues an error message if Darwin ABI is attempted with the PPC backend. It also cleans up existing test cases, either converting the test to use an alternative triple or removing the test if the coverage is no longer needed. Updated Tests: The majority of test cases were updated to use a different triple that does not include the Darwin ABI. Many tests were also updated to use FileCheck, in place of grep. Deleted Tests: llvm/test/tools/dsymutil/PowerPC/sibling.test was originally added to test specific functionality of dsymutil using an object file created with an old version of llvm-gcc for a Powerbook G4. After a discussion with @JDevlieghere he suggested removing the test. llvm/test/CodeGen/PowerPC/combine_loads_from_build_pair.ll was converted from a PPC test to a SystemZ test, as the behavior is also reproducible there. All other tests that were deleted were specific to the darwin/ppc ABI and no longer necessary. Phabricator Review: https://reviews.llvm.org/D50988 llvm-svn: 340795
*	[x86] add AVX runs to show more potential scalar->vector mov opportunities; NFC	Sanjay Patel	2018-08-27	1	-219/+470
	llvm-svn: 340785
*	[Pipeliner] Fix incorrect phi values in the epilog and kernel	Brendon Cahoon	2018-08-27	1	-0/+33
	The code that generates the loop definition operand for phis in the epilog and kernel is incorrect in some cases. In the kernel, when a phi refers to another phi, the code that updates PhiOp2 needs to include the stage difference between the two phis. In the epilog, the check for using the loop definition instead of the phi definition uses the StageDiffAdj value (the difference between the phi stage and the loop definition stage), but the adjustment is not needed to determine if the current stage contains an iteration with the loop definition. Differential Revision: https://reviews.llvm.org/D51167 llvm-svn: 340782
*	[X86] Reverse the check prefixes in the test added in r340774.	Craig Topper	2018-08-27	1	-497/+497
	The 32-bit and 64-bit checks were reversed. llvm-svn: 340775
*	[X86] Add test cases to show current codegen of v2i32 div/rem in 32-bit and 64-bit modes	Craig Topper	2018-08-27	1	-0/+624
	In particular this shows that we end up using libcalls in 32-bit mode even for division by constant. llvm-svn: 340774
*	[x86] add tests for possibly avoiding scalar->vector move; NFC	Sanjay Patel	2018-08-27	1	-0/+437
	llvm-svn: 340773
*	DAG: Check transformed type for forming fminnum/fmaxnum from vselect	Matt Arsenault	2018-08-27	2	-44/+12
	Follow up to r340655 to fix vector types which are split. llvm-svn: 340766
*	MachineVerifier: Fix assert on implicit virtreg use	Matt Arsenault	2018-08-27	1	-0/+21
	If the liveness of a physical register was invalid, this was attempting to iterate the subregisters of all register uses of the instruction, which would assert when it encountered an implicit virtual register operand. llvm-svn: 340763
*	[NVPTX] Implement isLegalToVectorizeLoadChain	Benjamin Kramer	2018-08-27	1	-0/+29
	This lets LSV nicely split up underaligned chains. Differential Revision: https://reviews.llvm.org/D51306 llvm-svn: 340760
*	[X86] When lowering v32i8 MULHS/MULHU, shuffle after the PACKUS rather than before.	Craig Topper	2018-08-27	5	-48/+35
	We're using a 256-bit PACKUS to do the truncation, but that instruction operates on 128-bit lanes. So previously we shuffled first to rearrange the lanes. But that requires 2 shuffles. Instead we can shuffle after the PACKUS using a single VPERMQ. This matches what our normal LowerTRUNCATE code does when it uses PACKUS. Differential Revision: https://reviews.llvm.org/D51284 llvm-svn: 340757
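The lane behaviour described in this message can be modelled with plain lists. This is a toy sketch, not backend code: `packus_256` and `vpermq` are hypothetical helpers that imitate a 256-bit unsigned-saturating pack of two 16-element inputs and a permute of four 64-bit chunks, and the chunk order `[0, 2, 1, 3]` is the assumed fix-up.

```python
def packus_256(a, b):
    """Pack two 16-element word vectors to bytes, one 128-bit lane at a time."""
    def sat(x):
        return max(0, min(255, x))  # unsigned saturation to a byte

    out = []
    for lane in range(2):
        out += [sat(x) for x in a[lane * 8:(lane + 1) * 8]]  # 8 words from a
        out += [sat(x) for x in b[lane * 8:(lane + 1) * 8]]  # 8 words from b
    return out


def vpermq(v, order):
    """Reorder the four 8-byte (64-bit) chunks of a 32-byte vector."""
    chunks = [v[i * 8:(i + 1) * 8] for i in range(4)]
    return [x for idx in order for x in chunks[idx]]


a = list(range(16))
b = list(range(16, 32))
packed = packus_256(a, b)            # chunks come out as a_lo, b_lo, a_hi, b_hi
fixed = vpermq(packed, [0, 2, 1, 3])  # one permute restores a_lo, a_hi, b_lo, b_hi
print(fixed == list(range(32)))
```

Because the pack interleaves the two inputs per lane, a single chunk permute after it is enough to restore element order, which is the saving over rearranging both inputs beforehand.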
*	[X86] Add support for matching paddus patterns where one of the vectors is a constant.	Craig Topper	2018-08-27	1	-1614/+252
	InstCombine mucks these up a bit. So we need to do some additional pattern matching to fix it. There are still a few special cases not handled, but this covers the general case. Differential Revision: https://reviews.llvm.org/D50952 llvm-svn: 340756
*	[WebAssembly] Added default stack-only instruction mode for MC.	Wouter van Oortmerssen	2018-08-27	83	-110/+306
	Summary: Made it convert from register to stack based instructions, and removed the registers. Fixes to related code that was expecting register based instructions. Added the correct testing flag to all tests, depending on the format they were expecting so far. Translated one test to stack format as example: reg-stackify-stack.ll tested: llvm-lit -v `find test -name WebAssembly` unittests/MC/* Reviewers: dschuff, sunfish Subscribers: sbc100, jgravelle-google, eraman, aheejin, llvm-commits, jfb Differential Revision: https://reviews.llvm.org/D51241 llvm-svn: 340750
*	[PowerPC] Revert commit r339779	Nemanja Ivanovic	2018-08-27	2	-100/+7
	This commit has caused failures in some internal benchmarks. Temporarily reverting this patch until the issue can be diagnosed and fixed. llvm-svn: 340740
*	[X86] Adding the test pointing to the fail case of D45653	Aleksandr Urakov	2018-08-27	1	-0/+29
	Summary: This commit adds the case of tail calling a sret function from a non-sret function when both functions have the C calling convention. llvm-svn: 340737
*	[NFC][X86] Fix `sibcall.ll` formatting	Aleksandr Urakov	2018-08-27	1	-317/+303
	Summary: Remove unnecessary lines from `sibcall.ll` and rename labels according to @RKSimon's recommendations in the D45653 conversation. llvm-svn: 340735
*	[PowerPC] Recommit r340016 after fixing the reported issue	Nemanja Ivanovic	2018-08-27	1	-0/+17
	The internal benchmark failure reported by Google was due to a missing check for the result type for the sign-extend and shift DAG. This commit adds the check and re-commits the patch. llvm-svn: 340734
*	[Sparc] Add support for the cycle counter available in GR740	Daniel Cederman	2018-08-27	1	-0/+11
	Summary: The GR740 provides an up cycle counter in the registers ASR22 and ASR23. As these registers cannot be read together atomically, we only use the value of ASR23 for llvm.readcyclecounter(). The ASR23 register holds the 32 LSBs of the up-counter. Reviewers: jyknight, venkatra Reviewed By: jyknight Subscribers: jfb, fedor.sergeev, jrtc27, llvm-commits Differential Revision: https://reviews.llvm.org/D48638 llvm-svn: 340733
*	[Sparc] Custom bitcast between f64 and v2i32	Daniel Cederman	2018-08-27	1	-9/+15
	Summary: Currently bitcasting constants from f64 to v2i32 is done by storing the value to the stack and then loading it again. This is not necessary, but seems to happen because v2i32 is a valid type for Sparc V8. If it had not been legal, we would have gotten help from the type legalizer. This patch tries to do the same work as the legalizer would have done by bitcasting the floating point constant and splitting the value up into a vector of two i32 values. Reviewers: venkatra, jyknight Reviewed By: jyknight Subscribers: glaubitz, fedor.sergeev, jrtc27, llvm-commits Differential Revision: https://reviews.llvm.org/D49219 llvm-svn: 340723
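The constant split described here can be sketched outside the compiler: take the 64 bits of a double and cut them into two 32-bit integers. The `f64_to_v2i32` helper is hypothetical, and putting the high word first is an assumption based on a big-endian layout, not something the message states.

```python
import struct


def f64_to_v2i32(value):
    """Return the IEEE-754 bits of a double as (high word, low word)."""
    bits = struct.unpack(">Q", struct.pack(">d", value))[0]
    return (bits >> 32) & 0xFFFFFFFF, bits & 0xFFFFFFFF


hi, lo = f64_to_v2i32(1.0)
print(hex(hi), hex(lo))
```

Since the input is a constant, both words are known at compile time, so no round trip through the stack is needed to materialise them.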
*	[RISCV] atomic_store_nn have a different layout to regular store	Roger Ferrer Ibanez	2018-08-27	1	-12/+12
	We cannot directly reuse the patterns of StPat because for some reason the store DAG node and the atomic_store_nn DAG nodes put the ptr and the value in different positions. Currently we attempt to store the address to an address formed by the value. Differential Revision: https://reviews.llvm.org/D51217 llvm-svn: 340722
*	[X86] Add FeatureCMOV explicitly to all CPUs that support it. Remove FeatureCMOV implication from Feature64Bit and FeatureSSE1	Craig Topper	2018-08-26	2	-3/+3
	Summary: Previously most CPUs inherited cmov support through Feature64Bit (or FeatureCMPXCHG16B implying Feature64Bit) or FeatureSSE1. This has the surprising side effect that -mattr=-cmov causes an assert to fire in 64-bit mode because it clears the Feature64Bit. Or in 32-bit mode, -mattr=-cmov disables any sse/avx features which seems surprising. This patch removes the implication and instead updates hasCMOV in X86Subtarget to check SSE1 or is64Bit in addition to the regular cmov flag. This should keep most things working the way they did before. I don't believe there is a way to specify "-cmov" directly from clang so this should only affect our lower level tools. This does stop -mattr=cx16(cmpxchg16b) from implying cmov is enabled via the 64bit flag as you can see from one of the changed tests. But that was a 32-bit test so I don't know why it enabled cx16 anyway. For the other test I had to add -sse to override the new sse check in hasCMOV. Reviewers: RKSimon, DavidKreitzer, spatel Reviewed By: RKSimon Subscribers: llvm-commits, jfb Differential Revision: https://reviews.llvm.org/D51228 llvm-svn: 340707
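The updated check described in this message can be sketched as a predicate over three booleans. This is a toy model with hypothetical names and is not the X86Subtarget code: cmov is treated as available when the explicit flag is set, or when SSE1 or 64-bit mode is enabled.

```python
def has_cmov(cmov_flag, has_sse1, is_64bit):
    """Toy model: the explicit cmov flag, SSE1, or 64-bit mode each enable cmov."""
    return cmov_flag or has_sse1 or is_64bit


# Clearing only the cmov flag (as with -mattr=-cmov) leaves cmov available
# while SSE1 is still enabled; also disabling SSE1 (-sse) turns it off in 32-bit mode.
print(has_cmov(False, True, False), has_cmov(False, False, False))
```

This mirrors the note about the changed test that needed -sse added to override the new SSE check.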
*	[X86] Add FeatureCMOV to athlon and athlon-tbird cpus.	Craig Topper	2018-08-26	1	-0/+321
	Summary: This matches gcc and one cpuid dump I found online. Given that these are considered 7th generation x86 CPUs, it seems likely they support cmov, since cmov was added by Intel in their 6th generation. Reviewers: RKSimon, spatel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D51264 llvm-svn: 340706
*	[SelectionDAG][x86] turn insertelement into undef with variable index into splat	Sanjay Patel	2018-08-26	1	-280/+218
	I noticed this along with the patterns in D51125, but when the index is variable, we don't convert insertelement into a build_vector. For x86, that means these get expanded at legalization time into the loading/spilling code that we see in the tests. I think it's always better to avoid going to memory on these, and we get the optimal 'broadcast' if it's available. I suspect other targets may want to look at enabling the hook. AArch64 and AMDGPU have regression tests that would be affected (although I did not check what would happen in those cases). In the most basic cases shown here, AArch64 would probably do much better with a splat. Differential Revision: https://reviews.llvm.org/D51186 llvm-svn: 340705
*	[MIPS GlobalISel] Legalize i8 and i16 add	Petar Jovanovic	2018-08-26	2	-3/+261
	Legalize G_ADD for types smaller than i32. LegalizationArtifactCombiner replaces extend instructions with appropriate bitwise instructions. Patch by Petar Avramovic. Differential Revision: https://reviews.llvm.org/D51213 llvm-svn: 340697
*	[X86] Add test cases for D50952, paddus patterns involving constants. NFC	Craig Topper	2018-08-26	1	-0/+2895
	llvm-svn: 340694
*	[X86] Replace support for vXi32 SMUL_LOHI/UMUL_LOHI with MULHS/MULHU support instead.	Craig Topper	2018-08-25	8	-221/+184
	Summary: The only time vector SMUL_LOHI/UMUL_LOHI nodes are created is during division/remainder lowering. If it's created before op legalization, generic DAGCombine immediately turns that SMUL_LOHI/UMUL_LOHI into a MULHS/MULHU since only the upper half is used. That node will stick around through vector op legalization and will be turned back into UMUL_LOHI/SMUL_LOHI during op legalization. It will then be custom lowered by the X86 backend. Due to this two-step lowering, the vector shuffles created by the custom lowering get legalized after their inputs rather than before. This prevents the shuffles from being combined with any build_vector of constants. This patch changes vXi32 to use MULHS/MULHU instead. This is what the later DAG combine did anyway. But by skipping the change back to UMUL_LOHI/SMUL_LOHI we lower it before any constant BUILD_VECTORS. This allows the vector_shuffle creation to constant fold with the build_vectors. This accounts for the test changes here. Reviewers: RKSimon, spatel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D51254 llvm-svn: 340690
*	[x86] try harder to use broadcast to load a scalar into vector reg	Sanjay Patel	2018-08-25	6	-107/+101
	This is a preliminary step for a preliminary step for D50992. I noticed that x86 often misses chances to load a scalar directly into a vector register. So this patch is just allowing more of those cases to match a broadcast op in lowerBuildVectorAsBroadcast(). The old code comment said it doesn't make sense to use a broadcast when we're loading a single element and everything else is undef, but I think that's the best case in the improved tests in insert-loaded-scalar.ll. We avoid scalar-to-vector-register move and/or less efficient shuffling. Note that there are some existing types that were already producing a broadcast, but that happens semi-accidentally. I.e., it's not happening as part of lowerBuildVectorAsBroadcast(). The build vector gets expanded into load + shuffle, and then shuffle lowering produces the broadcast. Description of the other test diffs: 1. avx-basic.ll - replacing load+shuffle is a win. 2. sse3-avx-addsub-2.ll - vmovddup vs. vbroadcastss is neutral 3. sse41.ll - don't care - we convert that intrinsic to generic IR now, so this test is deprecated 4. vector-shuffle-128-v8.ll / vector-shuffle-256-v16.ll - pshufb alternatives with an extra instruction are not obviously bad Differential Revision: https://reviews.llvm.org/D51125 llvm-svn: 340685
*	[AMDGPU] Add support for multi-dword s.buffer.load intrinsic	Tim Renouf	2018-08-25	1	-9/+187
	Summary: Patch by Marek Olsak and David Stuttard, both of AMD. This adds a new amdgcn intrinsic supporting s.buffer.load, in particular multiple dword variants. These are convenient to use from some front-end implementations. Also modified the existing llvm.SI.load.const intrinsic to common up the underlying implementation. This modification also requires that we can lower to non-uniform loads correctly by splitting larger dword variants into sizes supported by the non-uniform versions of the load. V2: Addressed minor review comments. V3: i1 glc is now i32 cachepolicy for consistency with buffer and tbuffer intrinsics, plus fixed formatting issue. V4: Added glc test. Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D51098 Change-Id: I83a6e00681158bb243591a94a51c7baa445f169b llvm-svn: 340684
*	[X86] Make requested test changes from D50636	Simon Pilgrim	2018-08-25	3	-39/+74
	The tests were relying on X / X -> 1 and X % X -> 0 combines not happening in the DAG. llvm-svn: 340682
*	[CodeGen] Set FrameSetup/FrameDestroy on BUNDLE instructions	Bjorn Pettersson	2018-08-25	1	-0/+62
	Summary: If any of the bundled instructions are marked as FrameSetup or FrameDestroy, then that property is set on the BUNDLE instruction as well. As long as the scheduler/packetizer aren't mixing prologue/epilogue instructions (i.e. all the bundled instructions have the same property) then this simply gives the bundle the correct property (so when using a bundle iterator in late passes a bundle will be correctly identified as FrameSetup/FrameDestroy). When for example bundling a mix of FrameSetup instructions with non-FrameSetup instructions it could be discussed if the bundle should have the property or not. The choice here has been to set these properties on the BUNDLE instruction if any of the bundled instructions have the property set. Reviewers: #debug-info, kparzysz Reviewed By: kparzysz Subscribers: vsk, thegameg, llvm-commits Differential Revision: https://reviews.llvm.org/D50637 llvm-svn: 340680
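The "any bundled instruction" rule chosen in this commit can be sketched with a toy model. Instructions are represented as plain dicts and `bundle_flags` is a hypothetical helper, not the MachineInstr API.

```python
def bundle_flags(bundled):
    """Set each flag on the bundle if any bundled instruction carries it."""
    return {
        "FrameSetup": any(inst.get("FrameSetup", False) for inst in bundled),
        "FrameDestroy": any(inst.get("FrameDestroy", False) for inst in bundled),
    }


# A mixed bundle: one prologue instruction plus one ordinary instruction.
mixed = [{"FrameSetup": True}, {}]
print(bundle_flags(mixed))
```

Under this rule a mixed bundle is still reported as FrameSetup, which is the case the message flags as debatable.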
*	[LiveDebugVariables] Avoid faulty addDefsFromCopies in computeIntervals	Bjorn Pettersson	2018-08-25	1	-0/+67
	Summary: When computeIntervals is looking through a COPY instruction to extend the location mapping for a debug variable it did not handle subregisters correctly. For example DBG_VALUE debug-use %0.sub_8bit_hi, ... %1:gr16 = COPY %0 was transformed into DBG_VALUE debug-use %0.sub_8bit_hi, ... %1:gr16 = COPY %0 DBG_VALUE debug-use %1, ... So the subregister index was missing in the added DBG_VALUE. As long as the subreg referred to the least significant bits of the superreg, then I guess we could get the correct result in a debugger even when referring to the superreg. But as in the example above when the subreg refers to other parts of the superreg, then debuginfo would be incorrect. I'm not sure exactly how to fix this properly, so this patch just avoids looking through the COPY when there is a subreg involved (for more info, see the FIXME added in the code). Reviewers: rnk, aprantl Reviewed By: aprantl Subscribers: JDevlieghere, llvm-commits Tags: #debug-info Differential Revision: https://reviews.llvm.org/D50788 llvm-svn: 340679