bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[NEON] Fix combining of vldx_dup intrinsics with updating of base addresses	Ivan A. Kosarev	2018-07-05	1	-0/+43
	Resolves: Unsupported ARM Neon intrinsics in Target-specific DAG combine function for VLDDUP https://bugs.llvm.org/show_bug.cgi?id=38031 Related diff: D48439 Differential Revision: https://reviews.llvm.org/D48920 llvm-svn: 336325
*	Partial revert of "NFC - Various typo fixes in tests"	Mikael Holmen	2018-07-05	1	-10/+11
	This partially reverts r336268 since it causes buildbot failures. Added FIXME at the places where the CHECKs are misspelled. llvm-svn: 336323
*	[ARM] ParallelDSP: only support i16 loads for now	Sjoerd Meijer	2018-07-05	1	-1/+46
	We were miscompiling i8 loads, so reject them as unsupported narrow operations for now. Differential Revision: https://reviews.llvm.org/D48944 llvm-svn: 336319
*	NFC - Various typo fixes in tests	Gabor Buella	2018-07-04	6	-19/+19
	llvm-svn: 336268
*	[ARM] Fix PR37382: Don't optimize mul.with.overflow on thumbv6m.	Vadzim Dambrouski	2018-07-02	1	-0/+5
	Reviewers: efriedma, rogfer01, javed.absar Reviewed By: efriedma, rogfer01 Subscribers: kristof.beyls, chrib, llvm-commits Differential Revision: https://reviews.llvm.org/D48846 llvm-svn: 336144
*	[ARM] Parallel DSP Pass	Sjoerd Meijer	2018-06-28	13	-0/+677
	Armv6 introduced instructions to perform 32-bit SIMD operations. The purpose of this pass is to do some straightforward IR pattern matching to create ACLE DSP intrinsics, which map onto these 32-bit SIMD operations. Currently, only the SMLAD instruction gets recognised. This instruction performs two multiplications with 16-bit operands, and stores the result in an accumulator. We will follow this up with patches to recognise SMLAD in more cases, and also to generate other DSP instructions (e.g. SADD16). Patch by: Sam Parker and Sjoerd Meijer Differential Revision: https://reviews.llvm.org/D48128 llvm-svn: 335850
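As a rough illustration of the operation this pass targets, the arithmetic of SMLAD described above (two 16-bit multiplications added to an accumulator) can be modelled in a few lines of Python. This is a sketch, not code from the patch: the helper names `to_signed16`, `pack16` and `smlad_model` are hypothetical, and the model ignores the Q flag that the instruction can set when the accumulation overflows.

```python
def to_signed16(value: int) -> int:
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def pack16(hi: int, lo: int) -> int:
    """Pack two 16-bit halves into one 32-bit register value (hypothetical helper)."""
    return ((hi & 0xFFFF) << 16) | (lo & 0xFFFF)


def smlad_model(rn: int, rm: int, ra: int) -> int:
    """Model ra + lo(rn)*lo(rm) + hi(rn)*hi(rm), wrapped to 32 bits.

    Assumption: overflow simply wraps; the Q flag is not modelled.
    """
    lo_product = to_signed16(rn) * to_signed16(rm)
    hi_product = to_signed16(rn >> 16) * to_signed16(rm >> 16)
    return (ra + lo_product + hi_product) & 0xFFFFFFFF


if __name__ == "__main__":
    # 10 + 2*4 + 3*5
    print(smlad_model(pack16(3, 2), pack16(5, 4), 10))
```

A source loop that accumulates products of adjacent 16-bit elements has this shape, which is the kind of pattern the commit message says the pass matches.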
*	[NEON] Support vldNq intrinsics in AArch32 (LLVM part)	Ivan A. Kosarev	2018-06-27	1	-0/+234
	This patch adds support for the q versions of the dup (load-to-all-lanes) NEON intrinsics, such as vld2q_dup_f16() for example.
	Currently, non-q versions of the dup intrinsics are implemented in clang by generating IR that first loads the elements of the structure into the first lane with the lane (to-single-lane) intrinsics, and then propagating it to other lanes. There are at least two problems with this approach.
	First, there are no double-spaced to-single-lane byte-element instructions. For example, there is no such instruction as 'vld2.8 { d0[0], d2[0] }, [r0]'. That means we cannot rely on the to-single-lane intrinsics and instructions to implement the q versions of the dup intrinsics. Note that to-all-lanes instructions do support all sizes of data items, including bytes.
	The second problem with the current approach is that we need a separate vdup instruction to propagate the structure to each lane. So for vld4q_dup_f16() we would need four vdup instructions in addition to the initial vld instruction.
	This patch introduces dup LLVM intrinsics and reworks handling of the currently supported (non-q) NEON dup intrinsics to expand them into those LLVM intrinsics, thus eliminating the need for using to-single-lane intrinsics and instructions.
	Additionally, this patch adds support for u64 and s64 dup NEON intrinsics. These are marked as AArch64-only in the ARM NEON Reference, but it seems there are no reasons to not support them in AArch32 mode. Please correct, if that is wrong.
	That's what we generate with this patch applied:
	vld2q_dup_f16:
	  vld2.16 {d0[], d2[]}, [r0]
	  vld2.16 {d1[], d3[]}, [r0]
	vld3q_dup_f16:
	  vld3.16 {d0[], d2[], d4[]}, [r0]
	  vld3.16 {d1[], d3[], d5[]}, [r0]
	vld4q_dup_f16:
	  vld4.16 {d0[], d2[], d4[], d6[]}, [r0]
	  vld4.16 {d1[], d3[], d5[], d7[]}, [r0]
	Differential Revision: https://reviews.llvm.org/D48439 llvm-svn: 335733
*	[X86,ARM] Retain split-stack prolog check for sibling calls	Than McIntosh	2018-06-26	1	-0/+18
	Summary: If a routine with no stack frame makes a sibling call, we need to preserve the stack space check even if the local stack frame is empty, since the call target could be a "no-split" function (in which case the linker needs to be able to fix up the prolog sequence in order to switch to a larger stack). This fixes PR37807. Reviewers: cherry, javed.absar Subscribers: srhines, llvm-commits Differential Revision: https://reviews.llvm.org/D48444 llvm-svn: 335604
*	Recommit r335333 "[MC] - Add .stack_size sections into groups and link them with .text"	George Rimar	2018-06-22	1	-2/+2
	With compilation fix. Original commit message: D39788 added a '.stack-size' section containing metadata on function stack sizes to output ELF files behind the new -stack-size-section flag. This change does the following two things on top:
	1) Imagine the case when the -ffunction-sections flag is given and there are text sections in COMDATs. The patch adds a '.stack-size' section into the corresponding COMDAT group, so that the linker will be able to eliminate them fast while resolving the COMDATs.
	2) The patch sets the SHF_LINK_ORDER flag and links '.stack-size' with the corresponding .text. With that, the linker will be able to do -gc-sections on dead stack-size sections.
	Differential revision: https://reviews.llvm.org/D46874 llvm-svn: 335336
*	Revert r335332 "[MC] - Add .stack_size sections into groups and link them with .text"	George Rimar	2018-06-22	1	-2/+2
	It broke bots. http://lab.llvm.org:8011/builders/clang-ppc64le-linux-lnt/builds/12891 http://lab.llvm.org:8011/builders/clang-cmake-x86_64-sde-avx512-linux/builds/9443 http://lab.llvm.org:8011/builders/lldb-x86_64-ubuntu-14.04-buildserver/builds/25551 llvm-svn: 335333
*	[MC] - Add .stack_size sections into groups and link them with .text	George Rimar	2018-06-22	1	-2/+2
	D39788 added a '.stack-size' section containing metadata on function stack sizes to output ELF files behind the new -stack-size-section flag. This change does the following two things on top:
	1) Imagine the case when the -ffunction-sections flag is given and there are text sections in COMDATs. The patch adds a '.stack-size' section into the corresponding COMDAT group, so that the linker will be able to eliminate them fast while resolving the COMDATs.
	2) The patch sets the SHF_LINK_ORDER flag and links '.stack-size' with the corresponding .text. With that, the linker will be able to do -gc-sections on dead stack-size sections.
	Differential revision: https://reviews.llvm.org/D46874 llvm-svn: 335332
*	Recommit of r335326, with the test fixed that I missed.	Sjoerd Meijer	2018-06-22	1	-10/+9
	llvm-svn: 335331
*	Reverting r335326 while I look at the test failure	Sjoerd Meijer	2018-06-22	1	-9/+10
	llvm-svn: 335328
*	[ARM] ARMv6m and v8m.baseline strict align	Sjoerd Meijer	2018-06-22	1	-10/+9
	This sets target feature FeatureStrictAlign for Armv6-m and Armv8-m.baseline, because they have no support for unaligned accesses. It looks like we always pass target feature "+strict-align" from Clang, so this is not a user-facing problem, but querying the subtarget (in e.g. llc) for unaligned access support is incorrect. Differential Revision: https://reviews.llvm.org/D48437 llvm-svn: 335326
*	[ARM] Enable useAA() for the in-order Cortex-R52	David Green	2018-06-21	1	-0/+26
	This option allows codegen (such as DAGCombine or MI scheduling) to use alias analysis information, which can help with the codegen on in-order CPUs, especially machine scheduling. Here I have done things the same way as AArch64, adding a subtarget feature to enable this for specific cores, and enabled it for the R52 where we have a schedule to make use of it. Differential Revision: https://reviews.llvm.org/D48074 llvm-svn: 335249
*	[NFC][ARM] ldrd/strd negative tests	Sam Parker	2018-06-21	1	-0/+52
	Add negative tests for load and stores of alignment 2. llvm-svn: 335241
*	[DAGCombine] Fix alignment for offset loads/stores	David Green	2018-06-21	1	-0/+25
	The alignment parameter to getExtLoad is treated as a base alignment, not the alignment of the load (base + offset). When we infer a better alignment for a Ptr we need to ensure that it applies to the base to prevent the alignment on the load from being wrong. This fixes a bug where the alignment could then be used to incorrectly prove noalias between a load and a store, leading to a miscompile. Differential Revision: https://reviews.llvm.org/D48029 llvm-svn: 335210
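The difference between a base alignment and the alignment of the access at base + offset can be illustrated with a small model. This is a sketch under stated assumptions, not the code from the patch: `alignment_at_offset` is a hypothetical helper that returns the largest power of two guaranteed to divide base + offset when only the alignment of the base is known.

```python
def alignment_at_offset(base_align: int, offset: int) -> int:
    """Largest power of two known to divide (base + offset).

    base_align must be a power of two; offset is a non-negative byte offset.
    The result is the lowest set bit of (base_align | offset).
    """
    if base_align <= 0 or base_align & (base_align - 1):
        raise ValueError("base_align must be a positive power of two")
    if offset < 0:
        raise ValueError("offset must be non-negative")
    combined = base_align | offset
    return combined & -combined


if __name__ == "__main__":
    # A base aligned to 8 accessed at offset 4 is only known to be 4-aligned.
    print(alignment_at_offset(8, 4))
```

Treating an alignment inferred for the pointer at base + offset as if it were the alignment of the base would overstate what is known, which is the kind of mistake the commit message describes.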
*	Generalize MergeBlockIntoPredecessor. Replace uses of MergeBasicBlockIntoOnlyPred.	Alina Sbirlea	2018-06-20	1	-1/+1
	Summary: Two utils methods have essentially the same functionality. This is an attempt to merge them into one.
	1. lib/Transforms/Utils/Local.cpp : MergeBasicBlockIntoOnlyPred
	2. lib/Transforms/Utils/BasicBlockUtils.cpp : MergeBlockIntoPredecessor
	Prior to the patch:
	1. MergeBasicBlockIntoOnlyPred
	   Updates either DomTree or DeferredDominance
	   Moves all instructions from Pred to BB, deletes Pred
	   Asserts BB has single predecessor
	   If address was taken, replace the block address with constant 1 (?)
	2. MergeBlockIntoPredecessor
	   Updates DomTree, LoopInfo and MemoryDependenceResults
	   Moves all instructions from BB to Pred, deletes BB
	   Returns if doesn't have a single predecessor
	   Returns if BB's address was taken
	After the patch:
	Method 2. MergeBlockIntoPredecessor is attempting to become the new default:
	   Updates DomTree or DeferredDominance, and LoopInfo and MemoryDependenceResults
	   Moves all instructions from BB to Pred, deletes BB
	   Returns if doesn't have a single predecessor
	   Returns if BB's address was taken
	Uses of MergeBasicBlockIntoOnlyPred that need to be replaced:
	1. lib/Transforms/Scalar/LoopSimplifyCFG.cpp
	   Updated in this patch. No challenges.
	2. lib/CodeGen/CodeGenPrepare.cpp
	   Updated in this patch.
	   i. eliminateFallThrough is straightforward, but I added a temporary array to avoid the iterator invalidation.
	   ii. eliminateMostlyEmptyBlock(s) methods also now use a temporary array for blocks
	   Some interesting aspects:
	   - Since Pred is not deleted (BB is), the entry block does not need updating.
	   - The entry block was being updated with the deleted block in eliminateMostlyEmptyBlock. Added assert to make obvious that BB=SinglePred.
	   - isMergingEmptyBlockProfitable assumes BB is the one to be deleted.
	   - eliminateMostlyEmptyBlock(BB) does not delete BB on one path, it deletes its unique predecessor instead.
	   - adding some test owner as subscribers for the interesting tests modified: test/CodeGen/X86/avx-cmp.ll test/CodeGen/AMDGPU/nested-loop-conditions.ll test/CodeGen/AMDGPU/si-annotate-cf.ll test/CodeGen/X86/hoist-spill.ll test/CodeGen/X86/2006-11-17-IllegalMove.ll
	3. lib/Transforms/Scalar/JumpThreading.cpp
	   Not covered in this patch. It is the only use case using the DeferredDominance. I would defer to Brian Rzycki to make this replacement.
	Reviewers: chandlerc, spatel, davide, brzycki, bkramer, javed.absar
	Subscribers: qcolombet, sanjoy, nemanjai, nhaehnle, jlebar, tpr, kbarton, RKSimon, wmi, arsenm, llvm-commits
	Differential Revision: https://reviews.llvm.org/D48202 llvm-svn: 335183
*	ARM: convert ORR instructions to ADD where possible on Thumb.	Tim Northover	2018-06-20	2	-1/+42
	Thumb has more 16-bit encoding space dedicated to ADD than ORR, allowing both a 3-address encoding and a wider range of immediates. So, particularly when optimizing for code size (but it doesn't make things worse elsewhere) it's beneficial to select an OR operation to an ADD if we know overflow won't occur. This is made even better by LLVM's penchant for putting operations in canonical form by converting the other way. llvm-svn: 335119
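The equivalence this selection relies on can be checked with a short model. This is a sketch, not the backend's logic: it assumes the condition for substituting ADD is that the two operands share no set bits, so no carry can occur, and the function names `no_common_bits` and `or_matches_add` are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def no_common_bits(a: int, b: int) -> bool:
    """True when no bit is set in both 32-bit operands, so a + b cannot carry."""
    return (a & b & MASK32) == 0


def or_matches_add(a: int, b: int) -> bool:
    """Check whether OR and 32-bit wrapping ADD give the same result for a and b."""
    a &= MASK32
    b &= MASK32
    return (a | b) == ((a + b) & MASK32)


if __name__ == "__main__":
    # Disjoint bits: OR and ADD agree.
    print(no_common_bits(0xF0, 0x0F), or_matches_add(0xF0, 0x0F))
    # Overlapping bit 0: ADD carries, so the results differ.
    print(no_common_bits(3, 1), or_matches_add(3, 1))
```

A typical case is combining an aligned address with a small constant, for example `ptr | 4` where `ptr` is known to be a multiple of 8.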
*	[ARM] Add Thumb1 coverage for cmn testcases.	Eli Friedman	2018-06-19	1	-6/+45
	There's a missed optimization for immediates: we can save two instructions by using adds instead of movs+mvns+cmp. llvm-svn: 335002
*	[ARM] Testcase for missed optimization with i16 compare.	Eli Friedman	2018-06-19	1	-0/+21
	The result looks weird because the DAG actually has an explicit shift; I haven't figured out why, exactly. llvm-svn: 335000
*	easing the constraint for isNegatibleForFree and GetNegatedExpression	Michael Berg	2018-06-14	1	-7/+5
	Summary: Here we relax the old constraint which utilized unsafe with the TargetOption flag HonorSignDependentRoundingFPMathOption, with the assertion that unsafe is no longer needed or never was required for correctness on FDIV/FMUL. Reviewers: spatel, hfinkel, wristow, arsenm, javed.absar Reviewed By: spatel Subscribers: efriedma, wdng, tpr Differential Revision: https://reviews.llvm.org/D48057 llvm-svn: 334769
*	DAG: Fix extract_subvector combine for a single element	Matt Arsenault	2018-06-11	1	-2/+15
	This would fail before because 1x vectors aren't legal, so instead just use the scalar type. Avoids regressions in a future AMDGPU commit to add v4i16/v4f16 as legal types. Test update is just the one test that this triggers on in tree now. It wasn't checking anything before. The result is completely changed since the selects are eliminated. Not sure if it's considered better or not. llvm-svn: 334440
*	[NEON] Support VST1xN intrinsics in AArch32 mode (LLVM part)	Ivan A. Kosarev	2018-06-10	1	-0/+363
	We currently support them only in AArch64. The NEON Reference, however, says they are 'ARMv7, ARMv8' intrinsics. Differential Revision: https://reviews.llvm.org/D47447 llvm-svn: 334361
*	[ARM] Allow CMPZ transforms even if the input has multiple uses.	Eli Friedman	2018-06-08	2	-1/+19
	It looks like this got left in by accident in r289794; I can't think of any reason this check would be necessary. (Maybe it was meant to be a check that the AND has one use? But we check that a few lines earlier.) Differential Revision: https://reviews.llvm.org/D47921 llvm-svn: 334322
*	[AArch64, ARM] Add support for Samsung Exynos M4	Evandro Menezes	2018-06-06	1	-0/+28
	Create a separate feature set for Exynos M4 and add test cases. llvm-svn: 334115
*	[GlobalMerge] Set the alignment on merged global structs	David Green	2018-06-06	2	-3/+27
	If no alignment is set, the abi/preferred alignment of structs will be used which may be higher than required. This can lead to extra padding and in the end an increase in data size. Differential Revision: https://reviews.llvm.org/D47633 llvm-svn: 334099
*	[MC] Pass MCSubtargetInfo to fixupNeedsRelaxation and applyFixup	Peter Smith	2018-06-06	1	-0/+34
	On targets like Arm some relaxations may only be performed when certain architectural features are available. As functions can be compiled with differing levels of architectural support we must make a judgement on whether we can relax based on the MCSubtargetInfo for the function. This change passes through the MCSubtargetInfo for the function to fixupNeedsRelaxation so that the decision on whether to relax can be made per function. In this patch, only the ARM backend makes use of this information. We must also pass the MCSubtargetInfo to applyFixup because some fixups skip error checking on the assumption that relaxation has occurred; to prevent code-generation errors, applyFixup must see the same MCSubtargetInfo as fixupNeedsRelaxation. Differential Revision: https://reviews.llvm.org/D44928 llvm-svn: 334078
*	[NEON] Support VLD1xN intrinsics in AArch32 mode (LLVM part)	Ivan A. Kosarev	2018-06-02	1	-0/+242
	We currently support them only in AArch64. The NEON Reference, however, says they are 'ARMv7, ARMv8' intrinsics. Differential Revision: https://reviews.llvm.org/D47120 llvm-svn: 333825
*	Revert r333819 "[NEON] Support VLD1xN intrinsics in AArch32 mode (Clang part)"	Ivan A. Kosarev	2018-06-02	1	-242/+0
	The LLVM part was committed instead of the Clang part. Differential Revision: https://reviews.llvm.org/D47121 llvm-svn: 333824
*	[NEON] Support VLD1xN intrinsics in AArch32 mode (Clang part)	Ivan A. Kosarev	2018-06-02	1	-0/+242
	We currently support them only in AArch64. The NEON Reference, however, says they are 'ARMv7, ARMv8' intrinsics. Differential Revision: https://reviews.llvm.org/D47121 llvm-svn: 333819
*	[ARM] Enable SETCCCARRY lowering for Thumb1.	Eli Friedman	2018-05-29	2	-35/+76
	We've had Thumb1 support for ARMISD::SUBE for a while now, so this just works. Reduces codesize a bit for 64-bit integer comparisons. Differential Revision: https://reviews.llvm.org/D47387 llvm-svn: 333445
*	ARM: be conservative when asked load/store alignment of weird type.	Tim Northover	2018-05-21	1	-0/+8
	Chances are we'll be asked again after type legalization, but before that point it's better to claim misaligned accesses aren't allowed than to assert. llvm-svn: 332840
*	[GlobalMerge] Exit early if only one global is to be merged	Haicheng Wu	2018-05-19	1	-4/+4
	To save some compilation time and prevent some unnecessary changes. Differential Revision: https://reviews.llvm.org/D46640 llvm-svn: 332813
*	[ARM] preserve test intent by removing undef	Sanjay Patel	2018-05-17	1	-20/+20
	We need to clean up the DAG floating-point undef logic. This process is similar to how we handled integer undef logic in https://reviews.llvm.org/D43141. And as we did there, I'm trying to reduce the patch by changing tests that would probably become meaningless once we correct FP undef folding. llvm-svn: 332638
*	[ARM] preserve test intent by removing undef	Sanjay Patel	2018-05-17	1	-4/+4
	We need to clean up the DAG floating-point undef logic. This process is similar to how we handled integer undef logic in https://reviews.llvm.org/D43141. And as we did there, I'm trying to reduce the patch by changing tests that would probably become meaningless once we correct FP undef folding. Follow-up to: https://reviews.llvm.org/rL332538 ...because that change wasn't enough. llvm-svn: 332637
*	[ARM] preserve test intent by removing undef	Sanjay Patel	2018-05-16	1	-13/+6
	We need to clean up the DAG floating-point undef logic. This process is similar to how we handled integer undef logic in D43141. And as we did there, I'm trying to reduce the patch by changing tests that would probably become meaningless once we correct FP undef folding. llvm-svn: 332539
*	[ARM] preserve test intent by removing undef	Sanjay Patel	2018-05-16	1	-8/+8
	We need to clean up the DAG floating-point undef logic. This process is similar to how we handled integer undef logic in D43141. And as we did there, I'm trying to reduce the patch by changing tests that would probably become meaningless once we correct FP undef folding. llvm-svn: 332538
*	[ARM] preserve test intent by removing undef	Sanjay Patel	2018-05-16	1	-4/+4
	We need to clean up the DAG floating-point undef logic. This process is similar to how we handled integer undef logic in D43141. And as we did there, I'm trying to reduce the patch by changing tests that would probably become meaningless once we correct FP undef folding. llvm-svn: 332537
*	[ARM] preserve test intent by removing undef	Sanjay Patel	2018-05-16	1	-2/+2
	We need to clean up the DAG floating-point undef logic. This process is similar to how we handled integer undef logic in D43141. And as we did there, I'm trying to reduce the patch by changing tests that would probably become meaningless once we correct FP undef folding. llvm-svn: 332533
*	[ARM] preserve test intent by removing undef	Sanjay Patel	2018-05-16	1	-4/+4
	We need to clean up the DAG floating-point undef logic. This process is similar to how we handled integer undef logic in D43141. And as we did there, I'm trying to reduce the patch by changing tests that would probably become meaningless once we correct FP undef folding. llvm-svn: 332532
*	[AArch64] Gangup loads and stores for pairing.	Sirish Pande	2018-05-16	1	-4/+3
	Keep loads and stores together (target defines how many loads and stores to gang up), such that it will help in pairing and vectorization. Differential Revision: https://reviews.llvm.org/D46477 llvm-svn: 332482
*	[GlobalISel][IRTranslator] Split aggregates during IR translation.	Amara Emerson	2018-05-16	2	-23/+103
	We currently handle all aggregates by creating one large LLT, and letting the legalizer deal with splitting them up. However using this approach means that we can't support big endian code correctly. This patch changes the way that the IRTranslator deals with aggregate values, by splitting them up into their constituent element values. To do this, parts of the translator need to be modified to deal with multiple VRegs for a single Value. A new Value to VReg mapper is introduced to help keep compile time under control; currently there is no measurable impact on CTMark despite the extra code being generated in some cases. Patch is based on the original work of Tim Northover. Differential Revision: https://reviews.llvm.org/D46018 llvm-svn: 332449
*	[DAG] propagate FMF for all FPMathOperators	Sanjay Patel	2018-05-15	1	-1/+1
	This is a simple hack based on what's proposed in D37686, but we can extend it if needed in follow-ups. It gets us most of the FMF functionality that we want without adding any state bits to the flags. It also intentionally leaves out non-FMF flags (nsw, etc) to minimize the patch. It should provide a superset of the functionality from D46563 - the extra tests show propagation and codegen diffs for fcmp, vecreduce, and FP libcalls. The PPC log2() test shows the limits of this most basic approach - we only applied 'afn' to the last node created for the call. AFAIK, there aren't any libcall optimizations based on the flags currently, so that shouldn't make any difference. Differential Revision: https://reviews.llvm.org/D46854 llvm-svn: 332358
*	[ARM] Back up R4 and LR if calling the stack probe function	Martin Storsjo	2018-05-14	1	-1/+1
	Differential Revision: https://reviews.llvm.org/D46777 llvm-svn: 332298
*	[DebugInfo] Add DILabel metadata and intrinsic llvm.dbg.label.	Shiva Chen	2018-05-09	22	-39/+39
	In order to set breakpoints on labels and list source code around labels, we need to collect debug information for labels, i.e., the label name, the function the label belongs to, the line number in the file, and the address where the label is located. To keep this information in LLVM IR and to allow the backend to generate debug information correctly, we create a new kind of metadata for labels, DILabel. The format of DILabel is
	  !DILabel(scope: !1, name: "foo", file: !2, line: 3)
	We hope to keep as much debug information as possible even when the code is optimized. So, we create a new kind of intrinsic for label metadata to prevent the metadata from being eliminated with the basic block. The intrinsic persists as long as it is not optimized out. The format of the intrinsic is
	  llvm.dbg.label(metadata !1)
	It has only one argument, the DILabel metadata. The intrinsic immediately follows the label. The backend can get the label metadata through the intrinsic's parameter.
	We also create DIBuilder API for labels to be used by the frontend. The frontend can use createLabel() to allocate DILabel objects, and use insertLabel() to insert the llvm.dbg.label intrinsic in LLVM IR.
	Differential Revision: https://reviews.llvm.org/D45024 Patch by Hsiangkai Wang. llvm-svn: 331841
*	[globalisel] Remove redundant -global-isel option from tests that use -run-pass. NFC	Daniel Sanders	2018-05-05	12	-31/+31
	As Roman Tereshin pointed out in https://reviews.llvm.org/D45541, the -global-isel option is redundant when -run-pass is given. -global-isel sets up the GlobalISel passes in the pass manager but -run-pass skips that entirely and configures its own pipeline. llvm-svn: 331603
*	ARM: don't try to over-align large vectors as arguments.	Tim Northover	2018-05-03	2	-19/+62
	By default LLVM thinks very large vectors get aligned to their size when passed across functions. Unfortunately no-one told the ARM backend so it doesn't trigger stack realignment and so accesses can cause the usual misalignment issues (e.g. a data abort). This changes the ABI alignment to the stack alignment, which in practice (and as a bonus) also coincides with the alignment "natural" vectors get. llvm-svn: 331451
*	[DAGCombiner] Fix SDLoc in a (zext (zextload x)) combine (4/N)	Vedant Kumar	2018-05-01	1	-0/+38
	The logic for this combine is almost identical to the logic for a (sext (sextload x)) combine. This commit factors out the logic so it can be shared by both combines, and corrects the SDLoc assigned in the zext version of the combine. Prior to this patch, for the given test case, we would apply the location associated with the udiv instruction to instructions which perform the load. Part of: llvm.org/PR37262 llvm-svn: 331303
*	[DAGCombiner] Fix SDLoc in a (sext (sextload x)) combine (3/N)	Vedant Kumar	2018-05-01	1	-0/+38
	Prior to this patch, for the given test case, we would apply the location associated with the sdiv instruction to instructions which perform the load. Part of: llvm.org/PR37262. Differential Revision: https://reviews.llvm.org/D46222 llvm-svn: 331302