bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[mips] Add more checks to the tls.ll test case. NFC	Simon Atanasyan	2018-07-23	1	-49/+106
	llvm-svn: 337705
*	[FPEnv] Legalize double width StrictFP vector operations	Cameron McInally	2018-07-23	1	-48/+940
	Differential Revision: https://reviews.llvm.org/D48809
	llvm-svn: 337698
*	[ARM] ARMCodeGenPrepare backend pass	Sam Parker	2018-07-23	3	-0/+905
	Arm specific codegen prepare is implemented to perform type promotion on icmp operands, which can enable the removal of uxtb and uxth (unsigned extend) instructions. This is possible because performing type promotion before ISel alleviates this duty from the DAG builder which has to perform legalisation, but has a limited view on data ranges.
	The pass visits any instruction operand of an icmp and creates a worklist to traverse the use-def tree to determine whether the values can simply be promoted. Our concern is values in the registers overflowing the narrow (i8, i16) data range, so instructions marked with nuw can be promoted easily. For add and sub instructions, we are able to use the parallel dsp instructions to operate on scalar data types and avoid overflowing bits. Underflowing adds and subs are also permitted when the result is only used by an unsigned icmp.
	Differential Revision: https://reviews.llvm.org/D48832
	llvm-svn: 337687
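A minimal sketch of why an add that cannot wrap (the nuw case) can be promoted without changing an unsigned compare, modelling i8 values as Python integers. All function names here are hypothetical illustrations and not part of the ARMCodeGenPrepare pass.

```python
MASK8 = 0xFF  # assumption: we model the narrow type as i8


def add_i8(a, b):
    """Narrow add: the result wraps to 8 bits, as an i8 add would."""
    return (a + b) & MASK8


def add_promoted(a, b):
    """Promoted add: operands are zero-extended to 32 bits, so nothing wraps at 8 bits."""
    return (a + b) & 0xFFFFFFFF


def icmp_ult(a, b):
    """Unsigned less-than comparison."""
    return a < b


def wraps_i8(a, b):
    """True when the narrow add would overflow, i.e. it could not carry nuw."""
    return a + b > MASK8


def promotion_is_safe(c):
    """Check every non-wrapping i8 add against the constant c."""
    return all(
        icmp_ult(add_i8(a, b), c) == icmp_ult(add_promoted(a, b), c)
        for a in range(256)
        for b in range(256)
        if not wraps_i8(a, b)
    )
```

When the add does wrap, the two forms can disagree: `add_i8(200, 100)` is 44 while the promoted result is 300, so a compare against 50 gives different answers. That is the overflow case the pass has to rule out or handle separately.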
*	[GVN] Don't use the eliminated load as an available value in phi construction	John Brawn	2018-07-23	2	-0/+121
	In ConstructSSAForLoadSet if an available value is actually the load that we're doing SSA construction to eliminate, then we can omit it as SSAUpdate will add in the value for the phi that will be replacing it anyway. This can result in simpler IR which can allow further optimisation.
	Differential Revision: https://reviews.llvm.org/D44160
	llvm-svn: 337686
*	[MemorySSAUpdater] Update Phi operands after trivial Phi elimination	Alexandros Lamprineas	2018-07-23	1	-0/+119
	Bug fix for PR37445. The underlying problem and its fix are similar to PR37808. The bug lies in MemorySSAUpdater::getPreviousDefRecursive(), where PhiOps is computed before the call to tryRemoveTrivialPhi() and it ends up being out of date, pointing to stale data. We have now turned each of the PhiOps into a TrackingVH<MemoryAccess>.
	Differential Revision: https://reviews.llvm.org/D49425
	llvm-svn: 337680
*	[NFC][MCA] ZnVer1: Update RegisterFile to identify false dependencies on partially written registers.	Roman Lebedev	2018-07-23	5	-95/+97
	Summary: Pretty mechanical follow-up for D49196.
	As microarchitecture.pdf notes, "20 AMD Ryzen pipeline", "20.8 Register renaming and out-of-order schedulers": The integer register file has 168 physical registers of 64 bits each. The floating point register file has 160 registers of 128 bits each.
	"20.14 Partial register access": The processor always keeps the different parts of an integer register together. ... An instruction that writes to part of a register will therefore have a false dependence on any previous write to the same register or any part of it.
	Reviewers: andreadb, courbet, RKSimon, craig.topper, GGanesh
	Reviewed By: GGanesh
	Subscribers: gbedwell, llvm-commits
	Differential Revision: https://reviews.llvm.org/D49393
	llvm-svn: 337676
*	[NFC][MCA] ZnVer1: add partial-reg-update tests	Roman Lebedev	2018-07-23	7	-0/+460
	Reviewers: andreadb, courbet, RKSimon, craig.topper, GGanesh
	Reviewed By: GGanesh
	Subscribers: gbedwell, llvm-commits
	Differential Revision: https://reviews.llvm.org/D49392
	llvm-svn: 337675
*	[GVNHoist] safeToHoistLdSt allows illegal hoisting	Alexandros Lamprineas	2018-07-23	1	-0/+76
	Bug fix for PR36787. When reasoning if it's safe to hoist a load we want to make sure that the defining memory access dominates the new insertion point of the hoisted instruction. safeToHoistLdSt calls firstInBB(InsertionPoint,DefiningAccess) which returns false if InsertionPoint == DefiningAccess, and therefore it falsely thinks it's safe to hoist.
	Differential Revision: https://reviews.llvm.org/D49555
	llvm-svn: 337674
*	[x86/SLH] Fix a bug where we would harden tail calls twice -- once as	Chandler Carruth	2018-07-23	1	-6/+0
	a call, and then again as a return.
	Also added a comment to try and explain better why we would be doing what we're doing when hardening the (non-call) returns.
	llvm-svn: 337673
*	[x86/SLH] Add a test covering indirect forms of control flow. NFC.	Chandler Carruth	2018-07-23	1	-0/+234
	This specifically covers different ways of making indirect calls and jumps. There are some bugs in SLH that I will be fixing in subsequent patches where the diff in the generated instructions makes the bug fix much more clear, so just checking in a baseline of this test to start.
	I'm also going to be adding direct mitigation for variant 1.2 which this file very specifically tests in the various forms it can arise on x86. Again, the diff to the generated instructions should make the change for that much more clear, so having the test as a baseline seems useful.
	llvm-svn: 337672
*	[X86] Remove the max vector width restriction from combineLoopMAddPattern and rely on splitOpsAndApply to handle splitting.	Craig Topper	2018-07-22	2	-329/+186
	This seems to be a net improvement. There's still an issue under avx512f where we have a 512-bit vpaddd, but not vpmaddwd, so we end up doing two 256-bit vpmaddwds and inserting the results before a 512-bit vpaddd. It might be better to do two 512-bit paddds with zeros in the upper half. Same number of instructions, but breaks a dependency.
	llvm-svn: 337656
*	[X86] Add more MADD recurrence test cases with larger and narrower vector widths.	Craig Topper	2018-07-22	1	-259/+1173
	llvm-svn: 337650
*	[mips] Move out the WrapperPat declaration from the NotInMicroMips predicate	Simon Atanasyan	2018-07-21	2	-0/+88
	This is a follow-up to rL335185. That commit adds some WrapperPat patterns for the microMIPS target. But the declaration of the WrapperPat class is under the NotInMicroMips predicate, and the microMIPS patterns cannot be selected because the predicate (Subtarget->inMicroMipsMode()) && (!Subtarget->inMicroMipsMode()) is always false. This change moves the WrapperPat class declaration out of the NotInMicroMips predicate and enables the microMIPS WrapperPat patterns.
	Differential revision: https://reviews.llvm.org/D49533
	llvm-svn: 337646
*	[InstrSimplify] fold sdiv if two operands are negated and non-overflow	Chen Zheng	2018-07-21	1	-24/+7
	Differential Revision: https://reviews.llvm.org/D49382
	llvm-svn: 337642
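A small sketch of the arithmetic behind this fold, assuming it rewrites `X sdiv (0 - X)` to -1 when the negation cannot overflow. The commit title does not spell out the result, so that reading is an assumption, and the helper names are hypothetical. Values are modelled as i8.

```python
def to_i8(v):
    """Wrap an integer to the signed 8-bit range, as i8 arithmetic would."""
    v &= 0xFF
    return v - 0x100 if v >= 0x80 else v


def neg_i8(x):
    """Compute 0 - x in i8; this wraps for x == -128."""
    return to_i8(-x)


def sdiv_i8(a, b):
    """Signed division that truncates toward zero, on i8 values."""
    q = abs(a) // abs(b)
    return to_i8(q if (a < 0) == (b < 0) else -q)


def fold_holds(x):
    """True when x sdiv (0 - x) equals -1 for this x."""
    return sdiv_i8(x, neg_i8(x)) == -1
```

For every nonzero x other than -128 the quotient is -1. At x = -128 the negation wraps back to -128 and the quotient is 1, which is why the fold needs the non-overflow condition.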
*	[Hexagon] Disable packets in test to avoid ordering issues in checks	Krzysztof Parzyszek	2018-07-20	1	-2/+2
	llvm-svn: 337624
*	[COFF] Adjust how we flag weak externals	Martin Storsjo	2018-07-20	3	-5/+58
	This fixes PR36096.
	Originally based on a patch by Martell Malone.
	Differential Revision: https://reviews.llvm.org/D44357
	llvm-svn: 337613
*	[FileCheck] Provide an option for FileCheck to dump original input to stderr on failure	George Karpenkov	2018-07-20	1	-0/+17
	The option can be set either using an environment variable (e.g. env FILECHECK_DUMP_INPUT_ON_FAILURE=1 ninja check-fuzzer) or with a FileCheck flag.
	This can be extremely useful for debugging, cf. https://groups.google.com/forum/#!topic/llvm-dev/kLrzg8OM_h8 for discussion.
	Differential Revision: https://reviews.llvm.org/D49328
	llvm-svn: 337609
*	Reapply "[LSV] Refactoring + supporting bitcasts to a type of different size"	Roman Tereshin	2018-07-20	2	-4/+67
	This reapplies commit r337489 reverted by r337541.
	Additionally, this commit contains a speculative fix to the issue reported in r337541 (the report does not contain an actionable reproducer, just a stack trace).
	llvm-svn: 337606
*	[FileCheck] Fix search ranges for DAG-NOT-DAG	Joel E. Denny	2018-07-20	2	-1/+78
	A DAG-NOT-DAG is a CHECK-DAG group, X, followed by a CHECK-NOT group, N, followed by a CHECK-DAG group, Y. Let y be the initial directive of Y. This patch makes the following changes to the behavior:
	1. Directives in N can no longer match within part of Y's match range just because y happens not to be the earliest match from Y. Specifically, this patch withdraws N's search range end from y's match range start to Y's match range start.
	2. y can no longer match within X's match range, where a y match produced a reordering complaint, which is thus no longer possible. Specifically, this patch withdraws y's search range start from X's permitted range start to X's match range end, which was already the search range start for other members of Y.
	Both of these changes can only increase the number of test passes: #1 constrains the ability of CHECK-NOTs to match, and #2 expands the ability of CHECK-DAGs to match without complaints.
	These changes are based on discussions at:
	<http://lists.llvm.org/pipermail/llvm-dev/2018-May/123550.html>
	<https://reviews.llvm.org/D47106>
	which conclude that:
	1. These changes simplify the FileCheck conceptual model. First, they make search ranges for DAG-NOT-DAG more consistent with other cases. Second, it was confusing that y was treated differently from the rest of Y.
	2. These changes add theoretical use cases for DAG-NOT-DAG that had no obvious means to be expressed otherwise. We can justify the first half of this assertion with the observation that these changes can only increase the number of test passes.
	3. Reordering detection for DAG-NOT-DAG had no obvious real benefit.
	We don't have evidence from real use cases to help us debate conclusions #2 and #3, but #1 at least seems intuitive.
	Reviewed By: probinson
	Differential Revision: https://reviews.llvm.org/D48986
	llvm-svn: 337605
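The two search-range changes can be sketched with plain offsets. This is a simplified model under stated assumptions, not FileCheck's implementation: `Match` and all four helpers are hypothetical, a group's "match range" is taken to span from the earliest start to the latest end of its members' matches, and the first entry of `y_matches` stands for y.

```python
from collections import namedtuple

# Hypothetical model of one directive's match as [start, end) offsets in the input.
Match = namedtuple("Match", ["start", "end"])


def not_range_end_before(y_matches):
    """Old behavior: N's search range ended where y (first directive of Y) matched."""
    return y_matches[0].start


def not_range_end_after(y_matches):
    """New behavior: N's search range ends at the start of Y's whole match range."""
    return min(m.start for m in y_matches)


def y_search_start_before(x_permitted_start, x_matches):
    """Old behavior: y could start searching at X's permitted range start."""
    return x_permitted_start


def y_search_start_after(x_permitted_start, x_matches):
    """New behavior: y starts searching at X's match range end, like the rest of Y."""
    return max(m.end for m in x_matches)
```

With `y_matches = [Match(40, 45), Match(20, 25)]`, the CHECK-NOT range used to end at offset 40 and now ends at 20, so a forbidden string at offset 30 no longer causes a failure.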
*	[llvm-objcopy] Add basic support for --rename-section	Jordan Rupprecht	2018-07-20	2	-0/+71
	Summary: Add basic support for --rename-section=old=new to llvm-objcopy. A full replacement for GNU objcopy requires also modifying flags (i.e. --rename-section=old=new,flag1,flag2); I'd like to keep that in a separate change to keep this simple.
	Reviewers: jakehehrlich, alexshap
	Subscribers: llvm-commits
	Differential Revision: https://reviews.llvm.org/D49576
	llvm-svn: 337604
*	And add a lit substitution for llvm-undname, as the comment says to	Reid Kleckner	2018-07-20	1	-1/+1
	llvm-svn: 337600
*	Make check-llvm depend on llvm-undname	Reid Kleckner	2018-07-20	1	-0/+1
	llvm-svn: 337597
*	Fix a few warnings and style issues in MS demangler.	Zachary Turner	2018-07-20	1	-3/+0
	Also remove a broken test case.
	llvm-svn: 337591
*	[X86] Remove isel patterns for MOVSS/MOVSD ISD opcodes with integer types.	Craig Topper	2018-07-20	4	-74/+75
	Ideally our ISD node types going into the isel table would have types consistent with their instruction domain. This prevents us having to duplicate patterns with different types for the same instruction.
	Unfortunately, it seems our shuffle combining is currently relying on this a little to remove some bitcasts. This seems to enable some switching between shufps and shufd. Hopefully there's some way we can address this in the combining.
	Differential Revision: https://reviews.llvm.org/D49280
	llvm-svn: 337590
*	[llvm-mca][x86] Add movsx/movzx instructions to general x86_64 resource tests	Simon Pilgrim	2018-07-20	10	-10/+700
	llvm-svn: 337586
*	Add a Microsoft Demangler.	Zachary Turner	2018-07-20	3	-0/+610
	This adds initial support for a demangling library (LLVMDemangle) and tool (llvm-undname) for demangling Microsoft names. This doesn't cover 100% of cases and there are some known limitations which I intend to address in followup patches, at least until such time that we have (near) 100% test coverage matching up with all of the test cases in clang/test/CodeGenCXX/mangle-ms-*.
	Differential Revision: https://reviews.llvm.org/D49552
	llvm-svn: 337584
*	[X86][XOP] Fix SUB constant folding for VPSHA/VPSHL shift lowering	Simon Pilgrim	2018-07-20	5	-221/+158
	We can safely use getConstant here as we're still lowering, which allows constant folding to kick in and simplify the vector shift codegen.
	Noticed while working on D49562.
	llvm-svn: 337578
*	[ARM] Add new feature to enable optimizing the VFP registers	Evandro Menezes	2018-07-20	1	-16/+10
	Enable the optimization of operations on DPR and SPR via a feature instead of checking the target.
	Differential revision: https://reviews.llvm.org/D49463
	llvm-svn: 337575
*	[MSan] run materializeChecks() before materializeStores()	Alexander Potapenko	2018-07-20	1	-0/+3
	When pointer checking is enabled, it's important that every pointer is checked before its value is used. For stores MSan used to generate code that calculates shadow/origin addresses from a pointer before checking it. For userspace this isn't a problem, because the shadow calculation code is quite simple and the compiler is able to move it after the check on -O2. But for KMSAN getShadowOriginPtr() creates a runtime call, so we want the check to be performed strictly before that call.
	Swapping materializeChecks() and materializeStores() resolves the issue: both functions insert code before the given IR location, so the new insertion order guarantees that the code calculating shadow address is between the address check and the memory access.
	llvm-svn: 337571
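The ordering argument can be sketched with a list standing in for a basic block. This is an illustration only: the instruction strings and the Python helpers are hypothetical stand-ins for the two MSan functions, which in LLVM operate on IR.

```python
STORE = "store value, ptr"
CHECK = "check(ptr)"
SHADOW = "shadow = getShadowOriginPtr(ptr)"


def insert_before(block, anchor, new_instr):
    """Insert new_instr immediately before anchor, like inserting before an IR location."""
    block.insert(block.index(anchor), new_instr)


def materialize_checks(block):
    """Stand-in for materializeChecks(): emit the pointer check before the store."""
    insert_before(block, STORE, CHECK)


def materialize_stores(block):
    """Stand-in for materializeStores(): emit the shadow address computation before the store."""
    insert_before(block, STORE, SHADOW)


def build(steps):
    """Run the given steps in order on a block that initially holds only the store."""
    block = [STORE]
    for step in steps:
        step(block)
    return block
```

Because each step inserts directly before the store, whichever step runs first ends up earliest in the block. Running `materialize_checks` first therefore yields check, shadow computation, store, while the old order put the shadow computation ahead of the check.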
*	[X86][SSE] Use SplitOpsAndApply to improve HADD/HSUB lowering	Simon Pilgrim	2018-07-20	4	-64/+31
	Improve AVX1 256-bit vector HADD/HSUB matching by using SplitOpsAndApply to split into 128-bit instructions.
	llvm-svn: 337568
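A rough sketch of why splitting works for horizontal adds, assuming 32-bit elements and the per-128-bit-half behaviour of AVX horizontal adds. The helper below is a hypothetical Python analogue; it does not reproduce the signature of LLVM's SplitOpsAndApply.

```python
def hadd128(a, b):
    """128-bit horizontal add on four elements per operand."""
    return [a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3]]


def hadd256(a, b):
    """Reference 256-bit horizontal add: each 128-bit half is processed independently."""
    return [
        a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3],
        a[4] + a[5], a[6] + a[7], b[4] + b[5], b[6] + b[7],
    ]


def split_ops_and_apply(op, a, b, width=4):
    """Split both operands into width-element chunks, apply op per chunk, concatenate."""
    out = []
    for i in range(0, len(a), width):
        out += op(a[i:i + width], b[i:i + width])
    return out
```

Applying `hadd128` to the low halves and then the high halves and concatenating gives the same vector as `hadd256`, which is the property that lets a 256-bit operation be expressed as two 128-bit instructions.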
*	[llvm-objcopy, tests] Fix several llvm-objcopy tests	Stella Stamenova	2018-07-20	1	-1/+9
	Summary: In Python 3, sys.stdout.write expects a string rather than bytes. In order to be able to write the bytes to stdout, we need to use the buffer directly instead. This change is borrowing the implementation for writing to stdout that cat.py uses. Note that we cannot use cat.py directly because the file we are trying to open is a gzip file.
	Reviewers: asmith, bkramer, alexshap, jakehehrlich
	Reviewed By: alexshap, jakehehrlich
	Subscribers: jakehehrlich, llvm-commits
	Differential Revision: https://reviews.llvm.org/D49515
	llvm-svn: 337567
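A minimal sketch of the technique described, writing raw bytes through the binary buffer underneath a text stream. The function names are hypothetical and this is not the test script from the commit.

```python
import gzip
import sys


def write_bytes(data, stream=None):
    """Write raw bytes to stdout (or a given stream) without decoding them."""
    if stream is None:
        stream = sys.stdout
    # Python 3 text streams expose their underlying binary stream as .buffer;
    # if the attribute is missing, assume the stream already accepts bytes.
    target = getattr(stream, "buffer", stream)
    target.write(data)


def dump_gzip(path, stream=None):
    """Decompress a gzip file and write its raw contents to the stream."""
    with gzip.open(path, "rb") as f:
        write_bytes(f.read(), stream)
```

Calling `sys.stdout.write(b"...")` directly raises a TypeError in Python 3, which is the failure mode the commit describes.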
*	[X86][AVX] Add support for i16 256-bit vector horizontal op redundant shuffle removal	Simon Pilgrim	2018-07-20	1	-2/+0
	llvm-svn: 337566
*	[X86][AVX] Add v16i16 horizontal op redundant shuffle tests	Simon Pilgrim	2018-07-20	1	-0/+127
	llvm-svn: 337565
*	[DAG] Avoid Node Update assertion due to AND simplification	Nirav Dave	2018-07-20	1	-0/+64
	Check for construction-time folding for incomplete AND nodes in BackwardsPropagateMask. Fixes PR38185.
	Reviewers: RKSimon, samparker
	Reviewed By: samparker
	Subscribers: llvm-commits, hiraditya
	Differential Revision: https://reviews.llvm.org/D49444
	llvm-svn: 337563
*	[X86][AVX] Add support for 32/64 bits 256-bit vector horizontal op redundant shuffle removal	Simon Pilgrim	2018-07-20	1	-6/+0
	llvm-svn: 337561
*	[DAG] Fix Memory ordering check in ReduceLoadOpStore.	Nirav Dave	2018-07-20	1	-9/+2
	When merging through a TokenFactor we need to check that the load may be ordered such that no other aliasing memory operations may happen. It is not sufficient to just check that the load is a member of the chain token factor as there may be an indirect chain. Require the load's chain has only one use.
	This fixes PR37826.
	Reviewers: spatel, davide, efriedma, craig.topper, RKSimon
	Subscribers: hiraditya, llvm-commits
	Differential Revision: https://reviews.llvm.org/D49388
	llvm-svn: 337560
*	[X86][AVX] Add 256-bit vector horizontal op redundant shuffle tests	Simon Pilgrim	2018-07-20	1	-1/+260
	llvm-svn: 337558
*	Regenerate partial vector fold test. NFCI.	Simon Pilgrim	2018-07-20	1	-4/+18
	llvm-svn: 337551
*	[NFC][testcases] fold sdiv if two operands are negated and non-overflow	Chen Zheng	2018-07-20	1	-0/+147
	llvm-svn: 337549
*	Recommit r328307: [IPSCCP] Use constant range information for comparisons of parameters.	Florian Hahn	2018-07-20	1	-13/+42
	This version contains a fix to add values for which the state in ParamState changes to the worklist if the state in ValueState did not change. To avoid adding the same value multiple times, mergeInValue returns true if it added the value to the worklist. The value is added to the worklist depending on its state in ValueState.
	Original message: For comparisons with parameters, we can use the ParamState lattice elements which also provide constant range information. This improves the code for PR33253 further and gets us closer to using ValueLatticeElement for all values.
	Also, as we are using the range information in the solver directly, we do not need tryToReplaceWithConstantRange afterwards anymore.
	Reviewers: dberlin, mssimpso, davide, efriedma
	Reviewed By: mssimpso
	Differential Revision: https://reviews.llvm.org/D43762
	llvm-svn: 337548
*	Regenerate remainder test.	Simon Pilgrim	2018-07-20	1	-25/+30
	llvm-svn: 337546
*	[InstSimplify] fold srem instruction if its two operands are negated.	Chen Zheng	2018-07-20	1	-25/+11
	Differential Revision: https://reviews.llvm.org/D49423
	llvm-svn: 337545
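A small check of the arithmetic, assuming the fold rewrites `X srem (0 - X)` to 0. The commit title does not state the result, so this reading is an assumption, and the helpers are hypothetical. Values are modelled as i8.

```python
def to_i8(v):
    """Wrap an integer to the signed 8-bit range, as i8 arithmetic would."""
    v &= 0xFF
    return v - 0x100 if v >= 0x80 else v


def neg_i8(x):
    """Compute 0 - x in i8; this wraps for x == -128."""
    return to_i8(-x)


def srem_i8(a, b):
    """Signed remainder whose sign follows the dividend, on i8 values."""
    r = abs(a) % abs(b)
    return -r if a < 0 else r


def fold_holds(x):
    """True when x srem (0 - x) equals 0 for this x."""
    return srem_i8(x, neg_i8(x)) == 0
```

In this model the remainder is 0 for every nonzero x, including -128 where the negation wraps, since both operands always have the same magnitude.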
*	[DebugInfo] Generate .debug_names section when it makes sense	Pavel Labath	2018-07-20	3	-10/+78
	Summary: This patch makes us generate the debug_names section in response to some user-facing commands (previously it was only generated if explicitly selected via the -accel-tables option). My goal was to make this work for DWARF>=5 (as it's an official part of that standard), and also, as an extension, for DWARF<5 if one is explicitly tuning for lldb as a debugger (because it brings a large performance improvement there).
	This is slightly complicated by the fact that the debug_names tables are incompatible with the DWARF v4 type units (they assume that the type units are in the debug_info section), and unfortunately, right now we generate DWARF v4-style type units even for -gdwarf-5. For this reason, I disable all accelerator tables if the user requested type unit generation. I do this even for apple tables, as they have the same problem (in fact generating type units for apple targets makes us crash even before we get around to emitting the accelerator tables).
	Reviewers: JDevlieghere, aprantl, dblaikie, echristo, probinson
	Subscribers: llvm-commits
	Differential Revision: https://reviews.llvm.org/D49420
	llvm-svn: 337544
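The default described above can be read as a small decision function. This is a simplified sketch built only from the summary: the function and parameter names are hypothetical, and it ignores the explicit -accel-tables option and any target-specific choice between table kinds.

```python
def emits_debug_names_by_default(dwarf_version, tuning_for_lldb, type_units):
    """Return True when the summary's rules would generate .debug_names by default."""
    # Type units are incompatible with the accelerator tables, so they win.
    if type_units:
        return False
    # Part of the DWARF v5 standard, and an extension when tuning for lldb.
    return dwarf_version >= 5 or tuning_for_lldb
```

For example, DWARF v4 with lldb tuning and no type units yields True, while any configuration that requests type units yields False.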
*	[NFC][testcases] more testcases for folding srem if its two operands are negated.	Chen Zheng	2018-07-20	1	-0/+34
	llvm-svn: 337543
*	[SystemZ] Test case formatting fixes	Ulrich Weigand	2018-07-20	215	-2814/+2814
	Fix systematically wrong whitespace from a prior automated change. NFC.
	llvm-svn: 337542
*	Revert "[LSV] Refactoring + supporting bitcasts to a type of different size"	Sam McCall	2018-07-20	2	-35/+4
	This reverts commit r337489. It causes asserts to fire in some TensorFlow tests, e.g. tensorflow/compiler/tests/gather_test.py on GPU.
	Example stack trace:
	Start test case: GatherTest.testHigherRank
	assertion failed at third_party/llvm/llvm/lib/Support/APInt.cpp:819 in llvm::APInt llvm::APInt::trunc(unsigned int) const: width && "Can't truncate to 0 bits"
	@ 0x5559446ebe10 __assert_fail
	@ 0x55593ef32f5e llvm::APInt::trunc()
	@ 0x55593d78f86e (anonymous namespace)::Vectorizer::lookThroughComplexAddresses()
	@ 0x55593d78f2bc (anonymous namespace)::Vectorizer::areConsecutivePointers()
	@ 0x55593d78d128 (anonymous namespace)::Vectorizer::isConsecutiveAccess()
	@ 0x55593d78c926 (anonymous namespace)::Vectorizer::vectorizeInstructions()
	@ 0x55593d78c221 (anonymous namespace)::Vectorizer::vectorizeChains()
	@ 0x55593d78b948 (anonymous namespace)::Vectorizer::run()
	@ 0x55593d78b725 (anonymous namespace)::LoadStoreVectorizer::runOnFunction()
	@ 0x55593edf4b17 llvm::FPPassManager::runOnFunction()
	@ 0x55593edf4e55 llvm::FPPassManager::runOnModule()
	@ 0x55593edf563c (anonymous namespace)::MPPassManager::runOnModule()
	@ 0x55593edf5137 llvm::legacy::PassManagerImpl::run()
	@ 0x55593edf5b71 llvm::legacy::PassManager::run()
	@ 0x55593ced250d xla::gpu::IrDumpingPassManager::run()
	@ 0x55593ced5033 xla::gpu::(anonymous namespace)::EmitModuleToPTX()
	@ 0x55593ced40ba xla::gpu::(anonymous namespace)::CompileModuleToPtx()
	@ 0x55593ced33d0 xla::gpu::CompileToPtx()
	@ 0x55593b26b2a2 xla::gpu::NVPTXCompiler::RunBackend()
	@ 0x55593b21f973 xla::Service::BuildExecutable()
	@ 0x555938f44e64 xla::LocalService::CompileExecutable()
	@ 0x555938f30a85 xla::LocalClient::Compile()
	@ 0x555938de3c29 tensorflow::XlaCompilationCache::BuildExecutable()
	@ 0x555938de4e9e tensorflow::XlaCompilationCache::CompileImpl()
	@ 0x555938de3da5 tensorflow::XlaCompilationCache::Compile()
	@ 0x555938c5d962 tensorflow::XlaLocalLaunchBase::Compute()
	@ 0x555938c68151 tensorflow::XlaDevice::Compute()
	@ 0x55593f389e1f tensorflow::(anonymous namespace)::ExecutorState::Process()
	@ 0x55593f38a625 tensorflow::(anonymous namespace)::ExecutorState::ScheduleReady()::$_1::operator()()
	* SIGABRT received by PID 7798 (TID 7837) from PID 7798; *
	llvm-svn: 337541
*	[SystemZ] Reimplement SchedModel IssueWidth and WriteRes/ReadAdvance mappings.	Jonas Paulsson	2018-07-20	10	-69/+67
	As a consequence of recent discussions (http://lists.llvm.org/pipermail/llvm-dev/2018-May/123164.html), this patch changes the SystemZ SchedModels so that the IssueWidth is 6, which is the decoder capacity, and NumMicroOps become the number of decoder slots needed per instruction.
	In addition, the SchedWrite latencies now match the MachineInstructions def-operand indexes, and ReadAdvances have been added on instructions with one register operand and one memory operand.
	Review: Ulrich Weigand
	https://reviews.llvm.org/D47008
	llvm-svn: 337538
*	Reapply "AMDGPU: Fix handling of alignment padding in DAG argument lowering"	Matt Arsenault	2018-07-20	4	-21/+206
	Reverts r337079 with fix for msan error.
	llvm-svn: 337535
*	[AArch64][SVE] Asm: Support for bit/byte reverse operations.	Sander de Smalen	2018-07-20	9	-0/+206
	This patch adds the following instructions:
	RBIT reverse bits within each active element (predicated), e.g. rbit z0.d, p0/m, z1.d for 8, 16, 32 and 64 bit elements.
	REV reverse order of elements in data/predicate vector (unpredicated), e.g. rev z0.d, z1.d rev p0.d, p1.d for 8, 16, 32 and 64 bit elements.
	REVB reverse order of bytes within each active element, e.g. revb z0.d, p0/m, z1.d for 16, 32 and 64 bit elements.
	REVH reverse order of 16-bit half-words within each active element, e.g. revh z0.d, p0/m, z1.d for 32 and 64 bit elements.
	REVW reverse order of 32-bit words within each active element, e.g. revw z0.d, p0/m, z1.d for 64 bit elements.
	llvm-svn: 337534
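A scalar sketch of what these operations compute per element, modelling a vector as a Python list of unsigned integers. The function names mirror the mnemonics but are hypothetical helpers. The `p0/m` form is modelled as merging predication, where inactive elements keep the destination's previous value.

```python
def rbit(value, bits):
    """Reverse the bit order within one element of the given width."""
    out = 0
    for i in range(bits):
        if value & (1 << i):
            out |= 1 << (bits - 1 - i)
    return out


def revb(value, bits):
    """Reverse the byte order within one element of the given width."""
    return int.from_bytes(value.to_bytes(bits // 8, "little"), "big")


def rev(vector):
    """Reverse the order of the elements in a vector (unpredicated)."""
    return vector[::-1]


def predicated(op, dest, pred, src, bits):
    """Apply op to active elements; inactive elements keep the destination value."""
    return [op(s, bits) if p else d for d, p, s in zip(dest, pred, src)]
```

For instance, `rbit(0x01, 8)` gives `0x80` and `revb(0x11223344, 32)` gives `0x44332211`. With a predicate of `[True, False]`, only the first element of the destination is overwritten.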
*	[AArch64][SVE] Asm: Support for FTMAD instruction.	Sander de Smalen	2018-07-20	2	-0/+64
	Floating-point trigonometric multiply-add coefficient, e.g. ftmad z0.h, z0.h, z1.h, #7 with variants for 16, 32 and 64-bit elements.
	llvm-svn: 337533