bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	[DWARFv5] Allow ".loc 0" to refer to the root file.	Paul Robinson	2018-06-22	2	-4/+6
	DWARF v5 explicitly represents file #0 in the line table. Prior versions did not, so ".loc 0" is still an error in those cases. Differential Revision: https://reviews.llvm.org/D48452 llvm-svn: 335350
*	[SLPVectorizer] Relax alternate opcodes to accept any BinaryOperator pair	Simon Pilgrim	2018-06-22	1	-27/+11
	SLP currently only accepts (F)Add/(F)Sub alternate counterpart ops to be merged into an alternate shuffle. This patch relaxes this to accept any pair of BinaryOperator opcodes instead, assuming the target's cost model accepts the vectorization+shuffle. Differential Revision: https://reviews.llvm.org/D48477 llvm-svn: 335349
*	[InstCombine] rearrange shuffle-of-binops logic; NFC	Sanjay Patel	2018-06-22	1	-17/+12
	The commutative matcher makes things more complicated here, and I'm planning an enhancement where this form is more readable. llvm-svn: 335343
*	Recommit r335333 "[MC] - Add .stack_size sections into groups and link them with .text"	George Rimar	2018-06-22	2	-1/+23
	With compilation fix. Original commit message: D39788 added a '.stack-size' section containing metadata on function stack sizes to output ELF files behind the new -stack-size-section flag. This change does the following two things on top:
	1) Imagine the case when there are -ffunction-sections flag given and there are text sections in COMDATs. The patch adds a '.stack-size' section into corresponding COMDAT group, so that linker will be able to eliminate them fast during resolving the COMDATs.
	2) Patch sets a SHF_LINK_ORDER flag and links '.stack-size' with the corresponding .text. With that linker will be able to do -gc-sections on dead stack sizes sections.
	Differential revision: https://reviews.llvm.org/D46874 llvm-svn: 335336
*	[IR] Use Instruction::isBinaryOp helper instead of raw enum range tests. NFCI.	Simon Pilgrim	2018-06-22	2	-6/+3
	llvm-svn: 335335
*	Revert r335332 "[MC] - Add .stack_size sections into groups and link them with .text"	George Rimar	2018-06-22	2	-23/+1
	It broke bots. http://lab.llvm.org:8011/builders/clang-ppc64le-linux-lnt/builds/12891 http://lab.llvm.org:8011/builders/clang-cmake-x86_64-sde-avx512-linux/builds/9443 http://lab.llvm.org:8011/builders/lldb-x86_64-ubuntu-14.04-buildserver/builds/25551 llvm-svn: 335333
*	[MC] - Add .stack_size sections into groups and link them with .text	George Rimar	2018-06-22	2	-1/+23
	D39788 added a '.stack-size' section containing metadata on function stack sizes to output ELF files behind the new -stack-size-section flag. This change does the following two things on top:
	1) Imagine the case when there are -ffunction-sections flag given and there are text sections in COMDATs. The patch adds a '.stack-size' section into corresponding COMDAT group, so that linker will be able to eliminate them fast during resolving the COMDATs.
	2) Patch sets a SHF_LINK_ORDER flag and links '.stack-size' with the corresponding .text. With that linker will be able to do -gc-sections on dead stack sizes sections.
	Differential revision: https://reviews.llvm.org/D46874 llvm-svn: 335332
*	Recommit of r335326, with the test fixed that I missed.	Sjoerd Meijer	2018-06-22	1	-3/+6
	llvm-svn: 335331
*	[CostModel][AArch64] Add some initial costs for SK_Select and SK_PermuteSingleSrc	Simon Pilgrim	2018-06-22	1	-17/+32
	AArch64 was only setting costs for SK_Transpose, which meant that many of the simpler shuffles (e.g. SK_Select and SK_PermuteSingleSrc for larger vector elements) were being severely overestimated by the default shuffle expansion. This patch adds costs to help improve SLP performance and avoid a regression in reductions introduced by D48174. I'm not very knowledgeable about AArch64 shuffle lowering so I've kept the extra costs to a minimum - someone who knows this code can add extra costs which should improve vectorization a lot more. Differential Revision: https://reviews.llvm.org/D48172 llvm-svn: 335329
*	Reverting r335326 while I look at the test failure	Sjoerd Meijer	2018-06-22	1	-6/+3
	llvm-svn: 335328
*	Revert r335324 due to a builtbot failure	Eugene Leviant	2018-06-22	1	-30/+3
	llvm-svn: 335327
*	[ARM] ARMv6m and v8m.baseline strict align	Sjoerd Meijer	2018-06-22	1	-3/+6
	This sets target feature FeatureStrictAlign for Armv6-m and Armv8-m.baseline, because it has no support for unaligned accesses. It looks like we always pass target feature "+strict-align" from Clang, so this is not a user facing problem, but querying the subtarget (in e.g. llc) for unaligned access support is incorrect. Differential Revision: https://reviews.llvm.org/D48437 llvm-svn: 335326
*	AMDGPU: Add patterns for i32/i64 local atomic load/store	Matt Arsenault	2018-06-22	4	-1/+54
	Not sure why the 32/64 split is needed in the atomic_load store hierarchies. The regular PatFrags do this, but we don't do it for the existing handling for global. llvm-svn: 335325
*	[Evaluator] Improve evaluation of call instruction	Eugene Leviant	2018-06-22	1	-3/+30
	Differential revision: https://reviews.llvm.org/D46584 llvm-svn: 335324
*	[X86] Changing the check for valid inputs in combineScalarToVector	Mikhail Dvoretckii	2018-06-22	1	-5/+6
	Changing the logic of scalar mask folding to check for valid input types rather than against invalid ones, making it more robust and fixing PR37879. Differential Revision: https://reviews.llvm.org/D48366 llvm-svn: 335323
*	Revert r335306 (and r335314) - the Call Graph Profile pass.	Chandler Carruth	2018-06-22	6	-181/+7
	This is the first pass in the main pipeline to use the legacy PM's ability to run function analyses "on demand". Unfortunately, it turns out there are bugs in that somewhat-hacky approach. At the very least, it leaks memory and doesn't support -debug-pass=Structure. Unclear if there are larger issues or not, but this should get the sanitizer bots back to green by fixing the memory leaks. llvm-svn: 335320
*	AMDGPU/GlobalISel: Default to using TableGen'd instruction selector	Tom Stellard	2018-06-22	1	-7/+0
	Summary: We can select all instructions that are marked as legal in a full piglit run, so now is a good time to make the TableGen'd instruction selector default for all opcodes. This is NFC for a full piglit run, which is why there are no tests. Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D48198 llvm-svn: 335319
*	AMDGPU/GlobalISel: legalize and select 32-bit G_ASHR	Tom Stellard	2018-06-22	4	-0/+47
	Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D48196 llvm-svn: 335318
*	[LegacyPM] Fix PR37888 by teaching the legacy loop pass manager how to	Chandler Carruth	2018-06-22	1	-1/+10
	clear out deleted loops from the current queue beyond just the current loop. This is important because SimpleLoopUnswitch will now enqueue the same loop to be re-processed. When it does this with the legacy PM, we don't have a way of canceling the rest of the pipeline and so we can end up deleting the loop before we reprocess it. =/ This change also makes it easy to support deleting other loops in the queue to process, although I don't have any use cases for that. Differential Revision: https://reviews.llvm.org/D48470 llvm-svn: 335317
*	AMDGPU/GlobalISel: legalize and select 32-bit G_SITOFP	Tom Stellard	2018-06-22	4	-0/+18
	Reviewers: arsenm, nhaehnle Reviewed By: arsenm Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D48195 llvm-svn: 335316
*	AMDGPU/GlobalISel: Implement select() for COPY	Tom Stellard	2018-06-22	1	-1/+4
	Reviewers: arsenm, nhaehnle Reviewed By: nhaehnle Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D46151 llvm-svn: 335315
*	[InstCombine] fix shuffle-of-binops bug	Sanjay Patel	2018-06-21	1	-2/+8
	With non-commutative binops, we could be using the same variable value as operand 0 in 1 binop and operand 1 in the other, so we have to check for that possibility and bail out. llvm-svn: 335312
*	AMDGPU/GlobalISel: Implement select() for G_IMPLICIT_DEF	Tom Stellard	2018-06-21	2	-0/+16
	Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D46150 llvm-svn: 335307
*	[Instrumentation] Add Call Graph Profile pass	Michael J. Spencer	2018-06-21	6	-7/+181
	This patch adds support for generating a call graph profile from Branch Frequency Info. The CGProfile module pass simply gets the block profile count for each BB and scans for call instructions. For each call instruction it adds an edge from the current function to the called function with the current BB block profile count as the weight. After scanning all the functions, it generates an appending module flag containing the data. The format looks like:
		!llvm.module.flags = !{!0}
		!0 = !{i32 5, !"CG Profile", !1}
		!1 = !{!2, !3, !4} ; List of edges
		!2 = !{void ()* @a, void ()* @b, i64 32} ; Edge from a to b with a weight of 32
		!3 = !{void (i1)* @freq, void ()* @a, i64 11}
		!4 = !{void (i1)* @freq, void ()* @b, i64 20}
	Differential Revision: https://reviews.llvm.org/D48105 llvm-svn: 335306
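The edge-building step described in this message can be sketched in Python. This is only an illustration under stated assumptions, not the pass itself: `build_cg_profile` and the dictionary input format are hypothetical, and summing the weights of repeated caller/callee pairs is an assumption that the message does not spell out. The sample module is chosen so that it reproduces the three edges in the metadata example.

```python
from collections import defaultdict


def build_cg_profile(functions):
    """Build call graph edges weighted by block profile counts.

    `functions` maps a caller name to a list of basic blocks, each given as
    (block_profile_count, [callee names called in that block]).
    Hypothetical input format, for illustration only.
    """
    edges = defaultdict(int)
    for caller, blocks in functions.items():
        for count, callees in blocks:
            for callee in callees:
                # Assumption: repeated caller/callee pairs accumulate.
                edges[(caller, callee)] += count
    return dict(edges)


# Hypothetical module matching the metadata example in the message.
module = {
    "a": [(32, ["b"])],
    "freq": [(11, ["a"]), (20, ["b"])],
    "b": [(52, [])],
}
profile = build_cg_profile(module)
for (caller, callee), weight in sorted(profile.items()):
    print(f"{caller} -> {callee}: {weight}")
```

Each resulting (caller, callee, weight) triple corresponds to one edge node such as `!2` in the module flag.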
*	[X86] Fix 32-bit mingw comdat names, only add one underscore	Reid Kleckner	2018-06-21	1	-11/+6
	llvm-svn: 335304
*	Revert r335297 "[X86] Implement more of x86-64 large and medium PIC code models"	Reid Kleckner	2018-06-21	6	-115/+29
	MCJIT can't handle R_X86_64_GOT64 yet. llvm-svn: 335300
*	[X86] Commit some comments that weren't in the medium code model patch	Reid Kleckner	2018-06-21	1	-4/+4
	llvm-svn: 335298
*	[X86] Implement more of x86-64 large and medium PIC code models	Reid Kleckner	2018-06-21	6	-27/+113
	Summary: The large code model allows code and data segments to exceed 2GB, which means that some symbol references may require a displacement that cannot be encoded as a displacement from RIP. The large PIC model even relaxes the assumption that the GOT itself is within 2GB of all code. Therefore, we need a special code sequence to materialize it:
		.LtmpN:
		  leaq .LtmpN(%rip), %rbx
		  movabsq $_GLOBAL_OFFSET_TABLE_-.LtmpN, %rax # Scratch
		  addq %rax, %rbx # GOT base reg
	From that, non-local references go through the GOT base register instead of being PC-relative loads. Local references typically use GOTOFF symbols, like this:
		movq extern_gv@GOT(%rbx), %rax
		movq local_gv@GOTOFF(%rbx), %rax
	All calls end up being indirect:
		movabsq $local_fn@GOTOFF, %rax
		addq %rbx, %rax
		callq *%rax
	The medium code model retains the assumption that the code segment is less than 2GB, so calls are once again direct, and the RIP-relative loads can be used to access the GOT. Materializing the GOT is easy:
		leaq _GLOBAL_OFFSET_TABLE_(%rip), %rbx # GOT base reg
	DSO local data accesses will use it:
		movq local_gv@GOTOFF(%rbx), %rax
	Non-local data accesses will use RIP-relative addressing, which means we may not always need to materialize the GOT base:
		movq extern_gv@GOTPCREL(%rip), %rax
	Direct calls are basically the same as they are in the small code model: They use direct, PC-relative addressing, and the PLT is used for calls to non-local functions.
	This patch adds reasonably comprehensive testing of LEA, but there are lots of interesting folding opportunities that are unimplemented.
	Reviewers: chandlerc, echristo Subscribers: hiraditya, llvm-commits
	Differential Revision: https://reviews.llvm.org/D47211 llvm-svn: 335297
*	[GVN] Avoid casting a vector of size less than 8 bits to i8	Matthew Voss	2018-06-21	1	-1/+7
	Summary: A reprise of D25849. This crash was found through fuzzing some time ago and was documented in PR28879. No check for load size has been added due to the following tests:
	- Transforms/GVN/invariant.group.ll
	- Transforms/GVN/pr10820.ll
	These tests expect load sizes that are not a multiple of eight. Thanks to @davide for the original patch.
	Reviewers: nlopes, davide, RKSimon, reames, efriedma Reviewed By: efriedma Subscribers: davide, llvm-commits, Prazek
	Differential Revision: https://reviews.llvm.org/D48330 llvm-svn: 335294
*	[SCEV] Re-apply r335197 (with Polly fixes).	Tim Shen	2018-06-21	1	-0/+54
	Summary: This initiates a discussion on changing Polly accordingly while re-applying r335197 (D48338). I have never worked on Polly. The proposed change to param_div_div_div_2.ll is not educated, but just patterns that match the output. All LLVM files are already reviewed in D48338. Reviewers: jdoerfert, bollu, efriedma Subscribers: jlebar, sanjoy, hiraditya, llvm-commits, bixia Differential Revision: https://reviews.llvm.org/D48453 llvm-svn: 335292
*	AMDGPU: Remove ability to reserve VGPRs for debugger	Konstantin Zhuravlyov	2018-06-21	6	-50/+2
	Differential Revision: https://reviews.llvm.org/D48234 llvm-svn: 335288
*	[mingw] Fix GCC ABI compatibility for comdat things	Reid Kleckner	2018-06-21	3	-9/+46
	Summary: GCC and the binutils COFF linker do comdats differently from MSVC. If we want to be ABI compatible, we have to do what they do, which is to emit unique section names like ".text$_Z3foov" instead of short section names like ".text". Otherwise, the binutils linker gets confused and reports multiple definition errors when two object files from GCC and Clang containing the same inline function are linked together. The best description of the issue is probably at https://github.com/Alexpux/MINGW-packages/issues/1677, we don't seem to have a good one in our tracker. I fixed up the .pdata and .xdata sections needed everywhere other than 32-bit x86. GCC doesn't use associative comdats for those, it appears to rely on the section name. Reviewers: smeenai, compnerd, mstorsjo, martell, mati865 Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D48402 llvm-svn: 335286
*	[InstCombine] fold vector select of binops with constant ops to 1 binop (PR37806)	Sanjay Patel	2018-06-21	1	-0/+51
	This is the simplest case from PR37806: https://bugs.llvm.org/show_bug.cgi?id=37806
	If we have a common variable operand used in a pair of binops with vector constants that are vector selected together, then we can constant shuffle the constant vectors to eliminate the shuffle instruction. This has some tricky parts that are hopefully addressed in the tests and their respective comments:
	1. If the shuffle mask contains an undef element, then that lane of the result is undef: http://llvm.org/docs/LangRef.html#shufflevector-instruction Therefore, we can replace the constant in that lane with an undef value except for div/rem. With div/rem, an undef in the divisor would cause the whole op to be undef. So I'm using the same hack as in D47686 - replace the undefs with '1'.
	2. Intersect the wrapping and FMF of the original binops for the new binop. There should be no extra poison or fast-math potential in the new binop that wasn't possible in the original code.
	3. Disregard other uses. Given that we're eliminating uses (shortening the dependency chain), I think that's always the right IR canonicalization. But I purposely chose the udiv test to demonstrate the scenario where both intermediate values have other uses because that seems likely worse for codegen with an expensive math op. This seems like a very rare possibility to me, so I don't think it requires a backend patch first.
	Differential Revision: https://reviews.llvm.org/D48401 llvm-svn: 335283
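A small Python model may help make the fold and point 1 concrete. It is a sketch under stated assumptions, not the InstCombine code: vectors are plain lists, undef is modeled as `None`, all helper names are hypothetical, and the mask is assumed to be a select mask (lane i picks lane i of either input). It checks that `binop X, (shuffle C0, C1, mask)` agrees with `shuffle (binop X, C0), (binop X, C1), mask` in every lane the original defines, and that undef lanes in a div/rem constant are replaced with 1.

```python
import operator

UNDEF = None  # models an undef lane


def shuffle(v0, v1, mask):
    """Model of shufflevector: index into the concatenation of v0 and v1."""
    both = v0 + v1
    return [UNDEF if m is UNDEF else both[m] for m in mask]


def binop(op, x, c):
    """Lane-wise binop; a lane with an undef operand is treated as undef."""
    return [UNDEF if (a is UNDEF or b is UNDEF) else op(a, b)
            for a, b in zip(x, c)]


def original(op, x, c0, c1, mask):
    return shuffle(binop(op, x, c0), binop(op, x, c1), mask)


def folded(op, x, c0, c1, mask, is_div_rem=False):
    new_c = shuffle(c0, c1, mask)  # shuffle the constants instead
    if is_div_rem:
        # Avoid an undef divisor: use 1 in lanes the mask leaves undef.
        new_c = [1 if c is UNDEF else c for c in new_c]
    return binop(op, x, new_c)


def refines(orig, new):
    """Every lane the original defines must be unchanged in the new value."""
    return all(o is UNDEF or o == n for o, n in zip(orig, new))


x = [10, 20, 30, 40]
c0, c1 = [1, 2, 3, 4], [5, 6, 7, 8]
add_ok = original(operator.add, x, c0, c1, [0, 5, 2, 7]) == \
    folded(operator.add, x, c0, c1, [0, 5, 2, 7])
div_ok = refines(
    original(operator.floordiv, x, c0, c1, [0, UNDEF, 2, 7]),
    folded(operator.floordiv, x, c0, c1, [0, UNDEF, 2, 7], is_div_rem=True),
)
print(add_ok, div_ok)
```

The model does not cover points 2 and 3 (flag intersection and extra uses), which have no equivalent in plain integer lists.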
*	[AMDGPU] Update assembler for HSA Code Object v3	Scott Linder	2018-06-21	6	-75/+698
	Update AMDGPU assembler syntax behind the code-object-v3 feature:
	* Replace/rename most AMDGPU assembler directives/symbols and document them.
	* Provide more diagnostics (e.g. values out of range, missing values, repeated values).
	* Provide path for backwards compatibility, even with underlying descriptor changes.
	Differential Revision: https://reviews.llvm.org/D47736 llvm-svn: 335281
*	Revert r335206 "Recommit r333268: [IPSCCP] Use PredicateInfo to propagate facts from cmp instructions."	Francis Visoiu Mistrih	2018-06-21	2	-134/+10
	This reverts commit r335206. As discussed here: https://reviews.llvm.org/rL333740, a fix will come tomorrow. In the meanwhile, revert this to fix some bots. llvm-svn: 335272
*	[mips] Modify comment to test new email address (NFC).	Simon Dardis	2018-06-21	1	-1/+1
	llvm-svn: 335269
*	[AMDGPU] Fix bug with tracking processed blocks in SIInsertWaitcnts	Scott Linder	2018-06-21	1	-0/+1
	BlockWaitcntProcessedSet was not being cleared between calls, so it was producing incorrect counts in cases where MBB addresses happened to coincide across multiple calls. Differential Revision: https://reviews.llvm.org/D48391 llvm-svn: 335268
*	AMDGPU/AMDHSA: Remove GridWorkGroupCountX/Y/Z	Konstantin Zhuravlyov	2018-06-21	5	-51/+0
	and everything that comes with it from implementation and v3 header files. Leave definition in v2 header files for backwards compatibility. Differential Revision: https://reviews.llvm.org/D48191 llvm-svn: 335267
*	[DebugInfo] Ignore DBG_VALUE instructions in PostRA Machine Sink	Matt Davis	2018-06-21	1	-25/+36
	Summary: The logic for handling the sinking of COPY instructions was generating different code when building with debug flags. The original code did not take into consideration debug instructions. This resulted in the registers in the DBG_VALUE instructions being treated as used, and prevented the COPY from being sunk. This patch avoids analyzing debug instructions when trying to sink COPY instructions. This patch also creates a routine from the code in MachineSinking::SinkInstruction to perform the logic of sinking an instruction along with its debug instructions. This functionality is used in multiple places, including the code for sinking COPY instrs. Reviewers: junbuml, javed.absar, MatzeB, bjope Reviewed By: bjope Subscribers: aprantl, probinson, thegameg, jonpa, bjope, vsk, kristof.beyls, JDevlieghere, llvm-commits Tags: #debug-info Differential Revision: https://reviews.llvm.org/D45637 llvm-svn: 335264
*	[InstCombine] use constant pattern matchers with icmp+sext	Sanjay Patel	2018-06-21	1	-14/+11
	The previous code worked with vectors, but it failed when the vector constants contained undef elements. The matchers handle those cases. llvm-svn: 335262
*	[InstCombine] simplify binops before trying other folds	Sanjay Patel	2018-06-21	4	-52/+64
	This is outwardly NFC from what I can tell, but it should be more efficient to simplify first (despite the name, SimplifyAssociativeOrCommutative does not actually simplify as InstSimplify does - it creates/morphs instructions). This should make it easier to refactor duplicated code that runs for all binops. llvm-svn: 335258
*	[DWARF] Warn on and ignore ".file 0" for DWARF v4 and earlier.	Paul Robinson	2018-06-21	1	-2/+4
	This had been messing with the directory table for prior versions, and also could induce a crash when generating asm output. llvm-svn: 335254
*	Revert "[AArch64] Coalesce Copy Zero during instruction selection"	Sirish Pande	2018-06-21	1	-29/+1
	This reverts commit d8f57105010cc7e78026e511d5def873fc91e0e7.
	Original Commit:
		Author: Haicheng Wu <haicheng@codeaurora.org>
		Date: Sun Feb 18 13:51:33 2018 +0000
		[AArch64] Coalesce Copy Zero during instruction selection
		Add special case for copy of zero to avoid a double copy.
		Differential Revision: https://reviews.llvm.org/D36104
	Author's intention is to remove a BB that has one mov instruction. In order to do that, d8f571050 pessimizes MachineSinking by introducing a copy, such that mov instruction is NOT moved to the BB. Optimization downstream gets rid of the BB with only mov instruction. This works well if we have only one fall through branch as there is only one "extra" mov instruction. If we have multiple fall throughs, we will have a lot of redundant movs. In such a case, it's better to have this BB which has one mov instruction. This is causing degradation in jpeg, fft and other codebases. I believe if we want to remove a BB with only one branch instruction, we should not pessimize Machine Sinking at all, and find some other solution.
	llvm-svn: 335251
*	DAG combine "and\|or (select c, -1, 0), x" -> "select c, x, 0\|-1"	Stanislav Mekhanoshin	2018-06-21	1	-3/+14
	Allowed folding for "and/or" binops with non-constant operand if arguments of select are 0/-1 values. Normally this code with the "and" opcode does not reach the DAG combiner because it is already simplified in InstCombine. However AMDGPU produces it during lowering and InstCombine has no chance to optimize it out. In turn the same pattern with "or" opcode can reach DAG. Differential Revision: https://reviews.llvm.org/D48301 llvm-svn: 335250
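The algebra behind this combine can be checked exhaustively on a narrow integer type. The Python sketch below is an assumption-laden illustration: it models -1 as the all-ones 8-bit value, uses a hypothetical `select` helper, and verifies the four identities for and/or with a 0/-1 select. It says nothing about which of these forms the patch itself matches.

```python
BITS = 8
ALL_ONES = (1 << BITS) - 1  # -1 viewed as an unsigned 8-bit value


def select(c, t, f):
    """Model of `select c, t, f`."""
    return t if c else f


def check_identities():
    """Exhaustively verify the and/or-of-select identities on 8-bit values."""
    for c in (False, True):
        for x in range(1 << BITS):
            # and (select c, -1, 0), x  ==  select c, x, 0
            assert (select(c, ALL_ONES, 0) & x) == select(c, x, 0)
            # and (select c, 0, -1), x  ==  select c, 0, x
            assert (select(c, 0, ALL_ONES) & x) == select(c, 0, x)
            # or (select c, -1, 0), x   ==  select c, -1, x
            assert (select(c, ALL_ONES, 0) | x) == select(c, ALL_ONES, x)
            # or (select c, 0, -1), x   ==  select c, x, -1
            assert (select(c, 0, ALL_ONES) | x) == select(c, x, ALL_ONES)
    return True


identities_hold = check_identities()
print(identities_hold)
```

In each case the bitwise op disappears because one select arm is the identity element of the op and the other is its absorbing element.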
*	[ARM] Enable useAA() for the in-order Cortex-R52	David Green	2018-06-21	2	-1/+13
	This option allows codegen (such as DAGCombine or MI scheduling) to use alias analysis information, which can help with the codegen on in-order CPUs, especially machine scheduling. Here I have done things the same way as AArch64, adding a subtarget feature to enable this for specific cores, and enabled it for the R52 where we have a schedule to make use of it. Differential Revision: https://reviews.llvm.org/D48074 llvm-svn: 335249
*	[InstCombine] make div/rem vector constant utility function; NFCI	Sanjay Patel	2018-06-21	2	-13/+24
	This was originally in D48401 and will be used there. llvm-svn: 335242
*	[RISCV] Tail calls don't need to save return address	Sameer AbuAsal	2018-06-21	1	-2/+6
	Summary: When expanding the PseudoTail in expandFunctionCall() we were using X6 to save the return address. Since this is a tail call the return address is not needed, this patch replaces it with X0 to be ignored. This matches the behaviour listed in the ISA V2.2 document page 110.
		tail offset -----> jalr x0, x6, offset
	GCC exhibits the same behavior.
	Reviewers: apazos, asb, mgrang Reviewed By: asb Subscribers: rbar, johnrusso, simoncook, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, rogfer01
	Differential Revision: https://reviews.llvm.org/D48343 llvm-svn: 335239
*	[x86] Lower some trunc + shuffle patterns to vpmov[q\|d][b\|w]	Mikhail Dvoretckii	2018-06-21	1	-7/+118
	This should help in lowering the following four intrinsics:
		_mm256_cvtepi32_epi8
		_mm256_cvtepi64_epi16
		_mm256_cvtepi64_epi8
		_mm512_cvtepi64_epi8
	Differential Revision: https://reviews.llvm.org/D46957 llvm-svn: 335238
*	[CodeGen] Avoid handling DBG_VALUE in LiveRegUnits::stepBackward	Krzysztof Parzyszek	2018-06-21	1	-2/+2
	Patch by Jesper Antonsson. Differential Revision: https://reviews.llvm.org/D48420 llvm-svn: 335233
*	AMDGPU: Remove redundant MIMG instruction variants	Nicolai Haehnle	2018-06-21	1	-20/+67
	Summary: For sample and gather ops, we can accurately determine the set of vaddr-size instruction variants that are required. This reduces the size of instruction tables by ~5%. The number of machine instruction opcodes is reduced from 10002 to 9476. Change-Id: Ie7fc65d3657b762c7816017fe70b2e9bec644a8a Reviewers: arsenm, rampitec Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D48168 llvm-svn: 335232