bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[ARM] Make RWPI use movw/movt when available	Christof Douma	2017-02-07	1	-19/+125
	When constructing global address literals while targeting the RWPI relocation model, LLVM currently only uses literal pools. If MOVW/MOVT instructions are available we can use these instead. Besides being more efficient, this allows -arm-execute-only to work with -relocation-model=RWPI as well. When we generate MOVW/MOVT for global addresses when targeting the RWPI relocation model, we need to use base relative relocations. This patch does the needed plumbing in MC to generate these for MOVW/MOVT. Differential Revision: https://reviews.llvm.org/D29487 Change-Id: I446786e43a6f5aa9b6a5bb2cd216d60d41c7755d llvm-svn: 294298
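To make the MOVW/MOVT idea concrete: MOVW writes a 16-bit immediate into the low half of a register and clears the top half, and MOVT then writes the top half while leaving the low half untouched. The sketch below only models that split and recombination in plain C++. The helper names (`lower16`, `upper16`, `movw`, `movt`, `materialize`) are hypothetical and are not taken from the patch; under RWPI the 32-bit value would presumably be an offset relative to the static base rather than an absolute address.

```cpp
#include <cstdint>

// Hypothetical helpers: the two 16-bit halves that a MOVW/MOVT pair encodes.
constexpr uint16_t lower16(uint32_t value) {
  return static_cast<uint16_t>(value & 0xFFFFu);
}
constexpr uint16_t upper16(uint32_t value) {
  return static_cast<uint16_t>(value >> 16);
}

// Model of MOVW: set the low half, clear the top half.
constexpr uint32_t movw(uint16_t imm) { return imm; }

// Model of MOVT: set the top half, keep the low half.
constexpr uint32_t movt(uint32_t reg, uint16_t imm) {
  return (reg & 0xFFFFu) | (static_cast<uint32_t>(imm) << 16);
}

// Build a full 32-bit value from the two immediates, with no literal pool load.
constexpr uint32_t materialize(uint32_t value) {
  return movt(movw(lower16(value)), upper16(value));
}
```

Because no data load is involved, a sequence like this can live in execute-only memory, which is the property the commit message relies on.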
*	[DAGCombiner] Support bswap as a part of load combine patterns	Artur Pilipenko	2017-02-06	2	-0/+68
	Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D29397 llvm-svn: 294201
*	Add DAGCombiner load combine tests with non-zero offset	Artur Pilipenko	2017-02-06	2	-0/+360
	This is separated from https://reviews.llvm.org/D29394 review. llvm-svn: 294185
*	MachineCopyPropagation: Respect implicit operands of COPY	Matthias Braun	2017-02-04	1	-0/+22
	The code failed to check implicit operands of COPY instructions for defs/uses. Differential Revision: https://reviews.llvm.org/D29522 llvm-svn: 294088
*	[ARM] Change TCReturn to tBL if tailcall optimization fails.	Sanne Wouda	2017-02-03	1	-0/+23
	Summary: The tail call optimisation is performed before register allocation, so at that point we don't know if LR is being spilt or not. If LR was spilt to the stack, then we cannot do a tail call optimisation. That would involve popping back into LR which is not possible in Thumb1 code. Reviewers: rengolin, jmolloy, rovka, olista01 Reviewed By: olista01 Subscribers: llvm-commits, aemerson Differential Revision: https://reviews.llvm.org/D29020 llvm-svn: 294000
*	[LLC] Add an inline assembly diagnostics handler.	Sanne Wouda	2017-02-03	1	-1/+1
	Summary: llc would hit a fatal error for errors in inline assembly. The diagnostics message is now printed. Reviewers: rengolin, MatzeB, javed.absar, anemet Reviewed By: anemet Subscribers: jyknight, nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D29408 llvm-svn: 293999
*	[ARM] Classification Improvements to ARM Sched-Model. NFCI.	Javed Absar	2017-02-02	1	-0/+128
	This is the second in the series of patches to make adding machine sched-models for ARM processors easier and more compact. This patch focuses on integer instructions and adds missing sched definitions. Reviewers: rovka, rengolin Differential Revision: https://reviews.llvm.org/D29127 llvm-svn: 293935
*	Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled."	Nirav Dave	2017-02-02	5	-65/+64
	This reverts commit r293893 which is miscompiling lua on ARM and bootstrapping for x86-windows. llvm-svn: 293915
*	In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled.	Nirav Dave	2017-02-02	5	-64/+65
	Recommitting after fixing X86 inc/dec chain bug.
	* Simplify Consecutive Merge Store Candidate Search
	Now that address aliasing is much less conservative, push through simplified store merging search and chain alias analysis which only checks for parallel stores through the chain subgraph. This is cleaner as it separates non-interfering loads/stores from the store-merging logic. When merging stores, search up the chain through a single load, and find all possible stores by looking down through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and the output Codegen (save perhaps for some ARM cases where we correctly construct wider loads, but then promote them to float operations which appear but require more expensive constant generation). Some minor peephole optimizations to deal with improved SubDAG shapes (listed below).
	Additional Minor Changes:
	1. Finishes removing unused AliasLoad code
	2. Unifies the chain aggregation in the merged stores across code paths
	3. Re-add the Store node to the worklist after calling SimplifyDemandedBits.
	4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seems sufficient to not cause regressions in tests.
	5. Remove Chain dependencies of Memory operations on CopyfromReg nodes as these are captured by data dependence
	6. Forward loads-store values through tokenfactors containing {CopyToReg,CopyFromReg} Values.
	7. Peephole to convert buildvector of extract_vector_elt to extract_subvector if possible (see CodeGen/AArch64/store-merge.ll)
	8. Store merging for the ARM target is restricted to 32-bit as in some contexts invalid 64-bit operations are being generated. This can be removed once appropriate checks are added.
	This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable, improving load-store forwarding. One test in particular is worth noting: CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store forwarding converts a load-store pair into a parallel store and a memory-realized bitcast of the same value. However, because we lose the sharing of the explicit and implicit store values we must create another local store. A similar transformation happens before SelectionDAG as well. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle llvm-svn: 293893
*	[ARM] GlobalISel: Lower pointer args and returns	Diana Picus	2017-02-02	2	-0/+67
	It is important to change the ArgInfo's type from pointer to integer, otherwise the CC assign function won't know what to do. Instead of hacking it up, we use ComputeValueVTs and introduce some of the helpers that we will need later on for lowering more complex types. llvm-svn: 293889
*	[ARM] GlobalISel: Legalize loading pointers	Diana Picus	2017-02-02	2	-0/+37
	Make it legal to load pointer values. Also check that pointers are assigned to the GPR reg bank by default. llvm-svn: 293886
*	[ARM] GlobalISel: Test default banks for load results. NFC.	Diana Picus	2017-02-02	1	-0/+32
	Check that all scalars are loaded into the GPR by default. llvm-svn: 293883
*	[ARM] Enable Cortex-M23 and Cortex-M33 support.	Javed Absar	2017-02-01	2	-0/+67
	Add both cores to the target parser and TableGen. Test that eabi attributes are set correctly for both cores. Additionally, test the absence and presence of MOVT in Cortex-M23 and Cortex-M33, respectively. Committed on behalf of Sanne Wouda. Reviewers: rengolin, olista01. Differential Revision: https://reviews.llvm.org/D29073 llvm-svn: 293761
*	CodeGen: Allow small copyable blocks to "break" the CFG.	Kyle Butt	2017-01-31	4	-24/+29
	When choosing the best successor for a block, ordinarily we would have preferred a block that preserves the CFG unless there is a strong probability in the other direction. For small blocks that can be duplicated we now skip that requirement as well, subject to some simple frequency calculations. Differential Revision: https://reviews.llvm.org/D28583 llvm-svn: 293716
*	[ARM] Avoid using ARM instructions in Thumb mode	Sam Parker	2017-01-31	2	-2/+0
	The Requires class overrides the target requirements of an instruction, rather than adding to them, so all ARM instructions need to include the IsARM predicate when they have overwritten requirements. This caused the swp and swpb instructions to be allowed in Thumb mode assembly, and the ARM encoding of CDP to be selected in codegen (which is different for conditional instructions). Differential Revision: https://reviews.llvm.org/D29283 llvm-svn: 293634
*	DAG: Constant fold fp16_to_fp/fp_to_fp16	Matt Arsenault	2017-01-30	2	-5/+7
	This fixes emitting conversions of constants on targets without legal f16 that need to use these for legalization. llvm-svn: 293499
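As a rough illustration of what folding such a conversion at compile time has to compute, the sketch below converts the raw bits of an IEEE 754 half-precision value to a `float`. This is not the code from the patch (LLVM would normally do this through its own arbitrary-precision float support); `halfBitsToFloat` is a hypothetical name used only here.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: widen IEEE 754 binary16 bits to a binary32 float.
float halfBitsToFloat(uint16_t h) {
  uint32_t sign = static_cast<uint32_t>(h & 0x8000u) << 16;
  uint32_t exp = (h >> 10) & 0x1Fu;
  uint32_t mant = h & 0x3FFu;
  uint32_t bits;
  if (exp == 0x1Fu) {
    // Infinity or NaN: all-ones exponent, payload shifted into place.
    bits = sign | 0x7F800000u | (mant << 13);
  } else if (exp != 0) {
    // Normal number: rebias the exponent from 15 to 127 (difference 112).
    bits = sign | ((exp + 112u) << 23) | (mant << 13);
  } else if (mant == 0) {
    // Signed zero.
    bits = sign;
  } else {
    // Subnormal half: normalize so the leading 1 becomes implicit.
    uint32_t shift = 0;
    while ((mant & 0x400u) == 0) {
      mant <<= 1;
      ++shift;
    }
    mant &= 0x3FFu;
    bits = sign | ((113u - shift) << 23) | (mant << 13);
  }
  float result;
  std::memcpy(&result, &bits, sizeof(result));
  return result;
}
```

When the operand is a constant, the compiler can evaluate this once and emit the resulting float constant instead of a runtime conversion.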
*	ARM: support `-mlong-calls` with AEABI TLS on ELF	Saleem Abdulrasool	2017-01-29	1	-0/+26
	Support lowering AEABI TLS access (__aeabi_read_tp) with long calls. This requires adjusting the call sequence to use an indirect call to get full addressability. Resolves PR31769! llvm-svn: 293433
*	[ARM/AArch64] Relocate and update InterleavedAccessPass tests (NFC)	Matthew Simpson	2017-01-27	2	-548/+0
	The interleaved access pass is an IR-to-IR transformation that runs before code generation. It matches interleaved memory operations to target-specific intrinsics (that are later lowered to load and store multiple instructions on ARM/AArch64). We place tests for similar passes (e.g., GlobalMergePass) under test/Transforms. This patch moves the InterleavedAccessPass tests out of test/CodeGen and into target-specific directories under test/Transforms/InterleavedAccess. Although the pass is an IR pass, many of the existing tests were llc tests rather than opt tests. For example, the tests would check for ldN/stN instructions generated by llc rather than the intrinsic calls the pass actually inserts. Thus, this patch updates all tests to be opt tests that check for the inserted intrinsics. We already have separate CodeGen tests that ensure we lower the interleaved access intrinsics to their corresponding ldN/stN instructions. In addition to migrating the tests to opt, this patch also performs some minor clean-up (to ensure consistent naming, etc.). Differential Revision: https://reviews.llvm.org/D29184 llvm-svn: 293309
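For readers unfamiliar with the term, an "interleaved memory operation" with factor 2 reads a buffer laid out as x0, y0, x1, y1, ... and separates it into an x stream and a y stream. The scalar sketch below shows only that data movement. `deinterleave2` is a hypothetical name; the pass itself works on IR vector loads and shuffles, not on `std::vector`.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: split {x0, y0, x1, y1, ...} into {x0, x1, ...} and
// {y0, y1, ...}. A trailing unpaired element, if any, is ignored.
std::pair<std::vector<int>, std::vector<int>>
deinterleave2(const std::vector<int> &in) {
  std::vector<int> even;
  std::vector<int> odd;
  for (std::size_t i = 0; i + 1 < in.size(); i += 2) {
    even.push_back(in[i]);
    odd.push_back(in[i + 1]);
  }
  return {even, odd};
}
```

A structured load such as ld2 performs this separation in one instruction, which is why the pass rewrites the pattern into a target intrinsic.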
*	ARM: fix vectorized division on WoA	Saleem Abdulrasool	2017-01-27	1	-37/+46
	The Windows on ARM target uses custom division for normal division as the backend needs to insert division-by-zero checks. However, it is designed to only handle non-vectorized division. ARM has custom lowering for vectorized division as that can avoid loading registers with the values and invoking a division routine for each one, preferring to lower using NEON instructions. Fall back to the custom lowering for the NEON instructions if we encounter a vectorized division. Resolves PR31778! llvm-svn: 293259
*	Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled."	Nirav Dave	2017-01-26	5	-65/+64
	This reverts commit r293184 which is failing in LTO builds. llvm-svn: 293188
*	In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled.	Nirav Dave	2017-01-26	5	-64/+65
	* Simplify Consecutive Merge Store Candidate Search
	Now that address aliasing is much less conservative, push through simplified store merging search and chain alias analysis which only checks for parallel stores through the chain subgraph. This is cleaner as it separates non-interfering loads/stores from the store-merging logic. When merging stores, search up the chain through a single load, and find all possible stores by looking down through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and the output Codegen (save perhaps for some ARM cases where we correctly construct wider loads, but then promote them to float operations which appear but require more expensive constant generation). Some minor peephole optimizations to deal with improved SubDAG shapes (listed below).
	Additional Minor Changes:
	1. Finishes removing unused AliasLoad code
	2. Unifies the chain aggregation in the merged stores across code paths
	3. Re-add the Store node to the worklist after calling SimplifyDemandedBits.
	4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seems sufficient to not cause regressions in tests.
	5. Remove Chain dependencies of Memory operations on CopyfromReg nodes as these are captured by data dependence
	6. Forward loads-store values through tokenfactors containing {CopyToReg,CopyFromReg} Values.
	7. Peephole to convert buildvector of extract_vector_elt to extract_subvector if possible (see CodeGen/AArch64/store-merge.ll)
	8. Store merging for the ARM target is restricted to 32-bit as in some contexts invalid 64-bit operations are being generated. This can be removed once appropriate checks are added.
	This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable, improving load-store forwarding. One test in particular is worth noting: CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store forwarding converts a load-store pair into a parallel store and a memory-realized bitcast of the same value. However, because we lose the sharing of the explicit and implicit store values we must create another local store. A similar transformation happens before SelectionDAG as well. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle llvm-svn: 293184
*	[ARM] GlobalISel: Load i1, i8 and i16 args from stack	Diana Picus	2017-01-26	3	-8/+84
	Add support for loading i1, i8 and i16 arguments from the stack, with or without the ABI extension flags. When the ABI extension flags are present, we load a 4-byte value, otherwise we preserve the size of the load and let the instruction selector replace it with a LDRB/LDRH. This generates the same thing as DAGISel. Differential Revision: https://reviews.llvm.org/D27803 llvm-svn: 293163
*	SDag: fix how initial loads are formed when splitting vector ops.	Tim Northover	2017-01-25	1	-0/+10
	Later code expects the vector loads produced to be directly concatenable, which means we shouldn't pad anything except the last load produced with UNDEF. llvm-svn: 293088
*	[DAGCombiner] Match load by bytes idiom and fold it into a single load. Attempt #2.	Artur Pilipenko	2017-01-25	2	-0/+500
	The previous patch (https://reviews.llvm.org/rL289538) got reverted because of a bug. Chandler also requested some changes to the algorithm. http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20161212/413479.html This is an updated patch. The key difference is that collectBitProviders (renamed to calculateByteProvider) now collects the origin of one byte, not the whole value. It simplifies the implementation and allows stopping the traversal earlier if we know that the result won't be used.
	From the original commit: Match a pattern where a wide type scalar value is loaded by several narrow loads and combined by shifts and ors. Fold it into a single load or a load and a bswap if the target supports it. Assuming little endian target:
	i8 *a = ...
	i32 val = a[0] | (a[1] << 8) | (a[2] << 16) | (a[3] << 24)
	=>
	i32 val = *((i32 *)a)
	i8 *a = ...
	i32 val = (a[0] << 24) | (a[1] << 16) | (a[2] << 8) | a[3]
	=>
	i32 val = BSWAP(*((i32 *)a))
	This optimization was discussed on llvm-dev some time ago in the "Load combine pass" thread. We came to the conclusion that we want to do this transformation late in the pipeline because in the presence of atomic loads, load widening is an irreversible transformation and it might hinder other optimizations. Eventually we'd like to support folding patterns like this where the offset has a variable and a constant part:
	i32 val = a[i] | (a[i + 1] << 8) | (a[i + 2] << 16) | (a[i + 3] << 24)
	Matching the pattern above is easier at SelectionDAG level since address reassociation has already happened and the fact that the loads are adjacent is clear. Understanding that these loads are adjacent at IR level would have involved looking through geps/zexts/adds while looking at the addresses.
	The general scheme is to match OR expressions by recursively calculating the origin of individual bytes which constitute the resulting OR value. If all the OR bytes come from memory, verify that they are adjacent and match the little or big endian encoding of a wider value. If so, and the load of the wider type (and bswap if needed) is allowed by the target, generate a load and a bswap if needed. Reviewed By: RKSimon, filcab, chandlerc Differential Revision: https://reviews.llvm.org/D27861 llvm-svn: 293036
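The source-level idiom this combine targets can be written in portable C++ as below. This is a sketch of the input pattern only, not of the DAGCombiner code; the function names `loadLittleEndian32`, `loadBigEndian32` and `byteSwap32` are made up for the example.

```cpp
#include <cstdint>

// Little-endian assembly of four bytes: on a little-endian target this is
// the pattern that can fold to a single 32-bit load.
uint32_t loadLittleEndian32(const uint8_t *a) {
  return static_cast<uint32_t>(a[0]) | (static_cast<uint32_t>(a[1]) << 8) |
         (static_cast<uint32_t>(a[2]) << 16) |
         (static_cast<uint32_t>(a[3]) << 24);
}

// Big-endian assembly: on a little-endian target this can fold to a 32-bit
// load followed by a byte swap.
uint32_t loadBigEndian32(const uint8_t *a) {
  return (static_cast<uint32_t>(a[0]) << 24) |
         (static_cast<uint32_t>(a[1]) << 16) |
         (static_cast<uint32_t>(a[2]) << 8) | static_cast<uint32_t>(a[3]);
}

// Reference byte swap, showing how the two forms relate.
uint32_t byteSwap32(uint32_t v) {
  return (v >> 24) | ((v >> 8) & 0x0000FF00u) | ((v << 8) & 0x00FF0000u) |
         (v << 24);
}
```

Both load functions give the same result on any host regardless of its endianness, which is why code is often written this way and why recognizing the idiom pays off.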
*	[ARM] GlobalISel: Support i1 add and ABI extensions	Diana Picus	2017-01-25	4	-0/+113
	Add support for:
	* i1 add
	* i1 function arguments, if passed through registers
	* i1 returns, with ABI signext/zeroext
	Differential Revision: https://reviews.llvm.org/D27706 llvm-svn: 293035
*	[ARM] GlobalISel: Support i8/i16 ABI extensions	Diana Picus	2017-01-25	4	-0/+141
	At the moment, this means supporting the signext/zeroext attribute on the return type of the function. For function arguments, signext/zeroext should be handled by the caller, so there's nothing for us to do until we start lowering calls. Note that this does not include support for other extensions (i8 to i16); those will be added later. Differential Revision: https://reviews.llvm.org/D27705 llvm-svn: 293034
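The difference between the two attributes comes down to how a narrow value is widened to the 32-bit register. The sketch below shows that arithmetic for an 8-bit value. The helper names are hypothetical and the code only models the widening, not GlobalISel's lowering.

```cpp
#include <cstdint>

// Hypothetical model of zeroext: the upper 24 bits become zero.
uint32_t zeroExtend8(uint8_t value) { return value; }

// Hypothetical model of signext: the upper 24 bits copy bit 7.
// Flipping the sign bit and subtracting 0x80 avoids relying on
// implementation-defined narrowing conversions.
int32_t signExtend8(uint8_t value) {
  return static_cast<int32_t>(value ^ 0x80u) - 0x80;
}
```

For example, the byte 0xFF widens to 255 under zeroext and to -1 under signext, so a function returning i8 has to pick the one its ABI attribute asks for.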
*	[ARM] Classification Improvements to ARM Sched-Models. NFCI.	Javed Absar	2017-01-23	1	-0/+69
	This is a series of patches to make adding machine sched models for ARM processors easier and more compact. They define new sched-readwrites for groups of ARM instructions. This has been missing so far, and as a consequence, machine scheduler models for individual sub-targets have tended to be larger than they needed to be. The current patch focuses on floating-point instructions. Reviewers: Diana Picus (rovka), Renato Golin (rengolin) Differential Revision: https://reviews.llvm.org/D28194 llvm-svn: 292825
*	[Thumb] Add support for tMUL in the compare instruction peephole optimizer.	Sjoerd Meijer	2017-01-20	2	-0/+186
	We also want to optimise tests like this: return a*b == 0. The MULS instruction is flag setting, so we don't need the CMP instruction but can instead branch on the result of the MULS. The generated instruction sequence for this example was: MULS, MOVS, MOVS, CMP. The MOVS instructions load the boolean values resulting from the select instruction, but these MOVS instructions are flag setting and were thus preventing this optimisation. Now we first reorder and move the MULS to before the CMP and generate the sequence MOVS, MOVS, MULS, CMP so that the optimisation can trigger. Reordering of the MULS and MOVS is safe to do because the subsequent MOVS instructions just set the CPSR register and don't use it, i.e. the CPSR is dead. Differential Revision: https://reviews.llvm.org/D27990 llvm-svn: 292608
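A source function of the kind the message refers to might look like the sketch below. The name `productIsZero` is invented for the example, and unsigned operands are used (an assumption, the commit message does not give types) so that the multiplication wraps instead of overflowing.

```cpp
#include <cstdint>

// Hypothetical test function: the comparison against zero can reuse the
// flags set by a flag-setting multiply instead of needing a separate compare.
bool productIsZero(uint32_t a, uint32_t b) { return a * b == 0; }
```

Whether a given compiler version actually drops the CMP here depends on the target and optimisation level, so the generated assembly is worth checking rather than assuming.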
*	[XRay][Arm] Repair XRay table emission on Arm32 and add tests to identify such problem earlier	Serge Rogatch	2017-01-19	2	-0/+12
	Summary: Emission of the XRay table was accidentally disabled for Arm32, but this bug was not detected at the time because earlier (also by mistake) testing of XRay was accidentally disabled on 32-bit Arm targets. This patch should fix that problem and detect such problems in the future. This patch is one of a series, see also - https://reviews.llvm.org/D28623 Reviewers: rengolin, dberris Reviewed By: dberris Subscribers: llvm-commits, aemerson, rengolin, dberris, iid_iunknown Differential Revision: https://reviews.llvm.org/D28624 llvm-svn: 292516
*	Revert "[XRay][Arm] Repair XRay table emission on Arm32 and add tests to identify such problem earlier"	Renato Golin	2017-01-18	2	-12/+0
	This reverts commit r292210, as it broke the Thumb buildbot with: clang-5.0: error: the clang compiler does not support '-fxray-instrument on thumbv7-unknown-linux-gnueabihf'. llvm-svn: 292357
*	[XRay][Arm] Repair XRay table emission on Arm32 and add tests to identify such problem earlier	Serge Rogatch	2017-01-17	2	-0/+12
	Summary: Emission of the XRay table was accidentally disabled for Arm32, but this bug was not detected at the time because earlier (also by mistake) testing of XRay was accidentally disabled on 32-bit Arm targets. This patch should fix that problem and detect such problems in the future. This patch is one of a series, see also - https://reviews.llvm.org/D28623 Reviewers: rengolin, dberris Reviewed By: dberris Subscribers: llvm-commits, aemerson, rengolin, dberris, iid_iunknown Differential Revision: https://reviews.llvm.org/D28624 llvm-svn: 292210
*	[SelectionDAG] Add support for BITREVERSE constant folding	Simon Pilgrim	2017-01-16	1	-1/+2
	We were relying on constant folding of the legalized instructions to do what constant folding we had previously. llvm-svn: 292114
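For reference, the value a folded BITREVERSE of a 32-bit constant has to produce can be computed with the usual swap-by-halves trick, sketched below. `bitReverse32` is a hypothetical helper for illustration; the patch itself would rely on LLVM's own integer utilities.

```cpp
#include <cstdint>

// Hypothetical helper: reverse the bit order of a 32-bit value by swapping
// progressively larger groups (bits, pairs, nibbles, bytes, half-words).
uint32_t bitReverse32(uint32_t v) {
  v = ((v >> 1) & 0x55555555u) | ((v & 0x55555555u) << 1);
  v = ((v >> 2) & 0x33333333u) | ((v & 0x33333333u) << 2);
  v = ((v >> 4) & 0x0F0F0F0Fu) | ((v & 0x0F0F0F0Fu) << 4);
  v = ((v >> 8) & 0x00FF00FFu) | ((v & 0x00FF00FFu) << 8);
  return (v >> 16) | (v << 16);
}
```

Folding the node directly means a constant operand yields a constant result before legalization expands the operation into a sequence like the one above.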
*	Revert "CodeGen: Allow small copyable blocks to "break" the CFG."	Kyle Butt	2017-01-11	5	-30/+25
	This reverts commit ada6595a526d71df04988eb0a4b4fe84df398ded. This needs a simple probability check because there are some cases where it is not profitable. llvm-svn: 291695
*	[ARM] More aggressive matching for vpadd and vpaddl.	Eli Friedman	2017-01-11	2	-18/+234
	The new matchers work after legalization to make them simpler, and to avoid blocking other optimizations. Differential Revision: https://reviews.llvm.org/D27779 llvm-svn: 291693
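As background on what is being matched: a pairwise add sums adjacent elements of a vector, and the "long" form widens each sum to twice the element width. The scalar sketch below models the long form for four 16-bit lanes. `pairwiseAddLong` is a hypothetical name and the code is a model of the semantics, not of the matcher.

```cpp
#include <array>
#include <cstdint>

// Hypothetical scalar model of a signed pairwise add long:
// {a, b, c, d} as 16-bit lanes becomes {a + b, c + d} as 32-bit lanes.
std::array<int32_t, 2> pairwiseAddLong(const std::array<int16_t, 4> &v) {
  return {static_cast<int32_t>(v[0]) + static_cast<int32_t>(v[1]),
          static_cast<int32_t>(v[2]) + static_cast<int32_t>(v[3])};
}
```

Because the sums are widened, 30000 + 30000 gives 60000 rather than wrapping in 16 bits, which is the property that separates the long form from a plain pairwise add.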
*	[ARM] Fix test CodeGen/ARM/fpcmp_ueq.ll broken by rL290616	Evgeny Astigeevich	2017-01-11	1	-1/+5
	Commit rL290616 (https://reviews.llvm.org/rL290616) changed a checking command for the triple arm-apple-darwin in LLVM::CodeGen/ARM/fpcmp_ueq.ll. As a result of the changes, the test could fail for valid generated code. These changes fix the test to check only the instructions we would expect. Differential Revision: https://reviews.llvm.org/D28159 llvm-svn: 291678
*	CodeGen: Allow small copyable blocks to "break" the CFG.	Kyle Butt	2017-01-10	5	-25/+30
	When choosing the best successor for a block, ordinarily we would have preferred a block that preserves the CFG unless there is a strong probability in the other direction. For small blocks that can be duplicated we now skip that requirement as well. Differential revision: https://reviews.llvm.org/D27742 llvm-svn: 291609
*	DAG: Avoid OOB when legalizing vector indexing	Matt Arsenault	2017-01-10	2	-3/+4
	If a vector index is out of bounds, the result is supposed to be undefined but is not undefined behavior. Change the legalization for indexing the vector on the stack so that an out of bounds index does not create an out of bounds memory access. llvm-svn: 291604
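One way to keep a dynamic index inside the stack slot is to clamp it before forming the address: mask it when the element count is a power of two, otherwise take the minimum with the last valid index. The sketch below illustrates that idea under the assumption that this is roughly the strategy used; `clampIndex` and `extractElement` are hypothetical names, not functions from the patch.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>

// Hypothetical helper: force idx into [0, numElts). numElts must be nonzero.
std::size_t clampIndex(std::size_t idx, std::size_t numElts) {
  if ((numElts & (numElts - 1)) == 0) {
    // Power-of-two element count: a mask is enough.
    return idx & (numElts - 1);
  }
  return std::min(idx, numElts - 1);
}

// Reads some in-bounds element even for an out-of-bounds index. The value
// returned in that case is arbitrary, but no memory outside v is touched.
template <std::size_t N>
int extractElement(const std::array<int, N> &v, std::size_t idx) {
  return v[clampIndex(idx, N)];
}
```

The result for an out-of-bounds index is still unspecified, which matches the semantics described in the commit message: an undefined value, but no out-of-bounds access.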
*	Emit .cfi_sections before the first .cfi_startproc	Joerg Sonnenberger	2017-01-02	2	-0/+55
	GNU as rejects input where .cfi_sections is used after .cfi_startproc, if the new section differs from the old. Adjust our output to always emit .cfi_sections before the first .cfi_startproc to minimize necessary code. Differential Revision: https://reviews.llvm.org/D28011 llvm-svn: 290817
*	test: modernise ARM CodeGen tests	Saleem Abdulrasool	2016-12-27	14	-1535/+1609
	Replace the use of grep with FileCheck. Tidy up some of the tests. A few of the tests have been left as weak as previously, though some have been made more stringent. llvm-svn: 290616
*	Make the canonicalisation on shifts benefit more cases.	Zijiao Ma	2016-12-23	1	-0/+31
	1. Fix pessimized case in FIXME.
	2. Add tests for it.
	3. The canonicalisation on shifts results in a different sequence for tests of machine-licm. Correct some check lines.
	Differential Revision: https://reviews.llvm.org/D27916 llvm-svn: 290410
*	Renumber testcase metadata nodes after r290153.	Adrian Prantl	2016-12-22	4	-273/+324
	This patch renumbers the metadata nodes in debug info testcases after https://reviews.llvm.org/D26769. This is a separate patch because it causes so much churn. This was implemented with a python script that pipes the testcases through llvm-as - | llvm-dis - and then goes through the original and new output side by side to insert all comments at a close-enough location. Differential Revision: https://reviews.llvm.org/D27765 llvm-svn: 290292
*	Legalize metadata in legacy testcases	Adrian Prantl	2016-12-21	1	-0/+3
	llvm-svn: 290285
*	[ARM] Implement isExtractSubvectorCheap.	Eli Friedman	2016-12-20	4	-51/+72
	See https://reviews.llvm.org/D6678 for the history of isExtractSubvectorCheap. Essentially the same considerations apply to ARM. This temporarily breaks the formation of vpadd/vpaddl in certain cases; AddCombineToVPADDL essentially assumes that we won't form VUZP shuffles. See https://reviews.llvm.org/D27779 for followup fix. Differential Revision: https://reviews.llvm.org/D27774 llvm-svn: 290198
*	[ARM] Generate checks for shuffle tests using update_llc_test_checks.py.	Eli Friedman	2016-12-20	3	-143/+542
	llvm-svn: 290196
*	[IR] Remove the DIExpression field from DIGlobalVariable.	Adrian Prantl	2016-12-20	4	-12/+12
	This patch implements PR31013 by introducing a DIGlobalVariableExpression that holds a pair of DIGlobalVariable and DIExpression. Currently, DIGlobalVariable holds a DIExpression. This is not the best way to model this: (1) The DIGlobalVariable should describe the source level variable, not how to get to its location. (2) It makes it unsafe/hard to update the expressions when we call replaceExpression on the DIGlobalVariable. (3) It makes it impossible to represent a global variable that is in more than one location (e.g., a variable with multiple DW_OP_LLVM_fragment-s). We also moved away from attaching the DIExpression to DILocalVariable for the same reasons. This reapplies r289902 with additional testcase upgrades and a change to the Bitcode record for DIGlobalVariable that makes upgrading the old format unambiguous also for variables without DIExpressions. <rdar://problem/29250149> https://llvm.org/bugs/show_bug.cgi?id=31013 Differential Revision: https://reviews.llvm.org/D26769 llvm-svn: 290153
*	Add ARM support to update_llc_test_checks.py	Eli Friedman	2016-12-19	1	-34/+64
	Just the minimal support to get it working at the moment. Includes checks for test/CodeGen/ARM/vzip.ll as an example. Differential Revision: https://reviews.llvm.org/D27829 llvm-svn: 290144
*	[ARM] GlobalISel: Add more checks to test	Diana Picus	2016-12-19	1	-0/+4
	llvm-svn: 290108
*	[ARM] GlobalISel: Minor style fixup in test	Diana Picus	2016-12-19	1	-3/+3
	llvm-svn: 290107
*	[ARM] GlobalISel: Lower i8 and i16 register args	Diana Picus	2016-12-19	2	-8/+52
	This allows lowering i8 and i16 arguments if they can fit in the registers. Note that the lowering is incomplete - ABI extensions are handled in a subsequent patch. (Last part of) Differential Revision: https://reviews.llvm.org/D27704 llvm-svn: 290106
*	[ARM] GlobalISel: Allow i8 and i16 adds	Diana Picus	2016-12-19	3	-5/+122
	Teach the instruction selector and legalizer that it's ok to have adds with 8 or 16-bit integers. This is the second part of https://reviews.llvm.org/D27704 llvm-svn: 290105