bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	[AArch64] [Windows] Don't skip constructing UnwindHelp.	Eli Friedman	2019-02-28	1	-5/+4
\| \| \| \| \| \| \| \| \| \| \| \| \|	In certain cases, the first non-frame-setup instruction in a function is a branch. For example, it could be a cbz on an argument. Make sure we correctly allocate the UnwindHelp, and find an appropriate register to use to initialize it. Fixes https://bugs.llvm.org/show_bug.cgi?id=40184 Differential Revision: https://reviews.llvm.org/D58752 llvm-svn: 355136
*	[AArch64] Improve FP16 vector convert from short instructions.	Abderrazek Zaafrani	2019-02-28	1	-6/+15
\| \| \| \| \| \|	https://reviews.llvm.org/D58563 llvm-svn: 355134
*	[x86] scalarize extract element 0 of FP math	Sanjay Patel	2019-02-28	1	-0/+59
	This is another step towards ensuring that we produce the optimal code for reductions, but there are other potential benefits as seen in the test diffs:
	1. Memory loads may get scalarized, resulting in more efficient code.
	2. Memory stores may get scalarized, resulting in more efficient code.
	3. Complex ops like fdiv/sqrt get scalarized, which may be faster instructions depending on uarch.
	4. Even simple ops like addss/subss/mulss/roundss may result in faster operation/less frequency throttling when scalarized, depending on uarch.
	The TODO comment suggests one or more follow-ups for opcodes that can currently result in regressions. Differential Revision: https://reviews.llvm.org/D58282 llvm-svn: 355130
*	bpf: disassembler support for XADD under sub-register mode	Jiong Wang	2019-02-28	1	-1/+2
	Like the other load/store instructions, the "w" register is preferred when disassembling BPF_STX | BPF_W | BPF_XADD. v1 -> v2: - Updated testcase insn-unit.s (Yonghong) Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> llvm-svn: 355127
*	bpf: enable sub-register code-gen for XADD	Jiong Wang	2019-02-28	2	-5/+27
	Supporting sub-register code-gen for XADD is like supporting any other load and store pattern. No new instruction is introduced.
	  lock *(u32 *)(r1 + 0) += w2
	has exactly the same underlying insn as:
	  lock *(u32 *)(r1 + 0) += r2
	The BPF_W width modifier guarantees they behave the same at runtime. This patch merely teaches the BPF back-end that the BPF_W width modifier can work on the GPR32 register class, and that is all that is needed for sub-register code-gen support for XADD. test/CodeGen/BPF/xadd.ll is updated to include sub-register code-gen tests. A new testcase test/CodeGen/BPF/xadd_legal.ll is added to make sure the legal case passes in all code-gen modes. It also tests the dead Def check on GPR32: without proper handling like what has been done inside BPFMIChecking.cpp:hasLivingDefs, this testcase will fail. Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> llvm-svn: 355126
*	bpf: improve dead Defs check for XADD	Jiong Wang	2019-02-28	1	-1/+92
	BPF XADD semantics require that all Defs of XADD are dead, meaning no result of the XADD insn is used. However, the BPF backend hasn't enabled sub-register liveness tracking, so when the source and destination operands of XADD are GPR32, there is no sub-register dead info. If we rely on the generic MachineInstr::allDefsAreDead, we will raise a false alarm on a GPR32 Def. This was fine as there was no sub-register code-gen support for XADD, which will be added by the next patch.
	To support GPR32 Defs, ideally we could just enable sub-register liveness tracking on the BPF backend, so that allDefsAreDead works on GPR32 Defs. This requires implementing TargetSubtargetInfo::enableSubRegLiveness on BPF. However, the sub-register liveness tracking module inside LLVM is designed for the situation where one register can be split into more than one sub-register, in which case each sub-register can have its own liveness and killing one of them doesn't kill the others. So tracking liveness for each makes sense. For BPF, each 64-bit register has only one 32-bit sub-register. This is exactly the case that LLVM considers to bring no benefit from sub-register tracking, because the live range of the sub-register must always equal that of its parent register; therefore liveness tracking is disabled even if the back-end has implemented enableSubRegLiveness. The detailed information is at r232695:
	  Author: Matthias Braun <matze@braunis.de>
	  Date: Thu Mar 19 00:21:58 2015 +0000
	  Do not track subregister liveness when it brings no benefits
	Hence, for BPF, we enhance MachineInstr::allDefsAreDead. Given that the sole sub-register always has the same liveness as its parent register, LLVM already attaches an implicit 64-bit register Def whenever there is a sub-register Def. The liveness of the implicit 64-bit Def is available.
	For example, for "lock *(u32 *)(r0 + 4) += w9", the MachineOperand info could be:
	  $w9 = XADDW32 killed $r0, 4, $w9(tied-def 0), implicit killed $r9, implicit-def dead $r9
	Even though w9 is not marked as Dead, the parent register r9 is marked as Dead correctly, and it is safe to use such information for our purpose. v1 -> v2: - Simplified code logic inside hasLiveDefs. (Yonghong) Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> llvm-svn: 355124
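The parent-register check described in this commit message can be sketched roughly as follows. This is a minimal Python model, not the code in BPFMIChecking.cpp: the `Operand` record, the `SUPER_REG` table, and this `has_live_defs` function are assumptions made for illustration, and only the operand list is taken from the message above.

```python
from dataclasses import dataclass


# Hypothetical stand-in for a machine operand; the real pass inspects
# LLVM MachineOperand objects instead.
@dataclass(frozen=True)
class Operand:
    reg: str
    is_def: bool = False
    is_dead: bool = False
    is_implicit: bool = False


# Assumption: every 32-bit sub-register wN has exactly one 64-bit parent rN.
SUPER_REG = {f"w{i}": f"r{i}" for i in range(11)}


def has_live_defs(operands):
    """Return True if some def is live.

    A sub-register def without a dead flag is still treated as dead when
    the implicit def of its 64-bit parent register is marked dead.
    """
    defs = [op for op in operands if op.is_def]
    for op in defs:
        if op.is_dead:
            continue
        parent = SUPER_REG.get(op.reg)
        if parent is not None and any(
            d.is_implicit and d.is_dead and d.reg == parent for d in defs
        ):
            continue  # liveness comes from the parent's implicit def
        return True
    return False


# $w9 = XADDW32 killed $r0, 4, $w9(tied-def 0),
#       implicit killed $r9, implicit-def dead $r9
xadd_dead = [
    Operand("w9", is_def=True),
    Operand("r0"),
    Operand("w9"),
    Operand("r9", is_implicit=True),
    Operand("r9", is_def=True, is_dead=True, is_implicit=True),
]

# Same instruction, but the implicit parent def is not dead (result is used).
xadd_live = xadd_dead[:-1] + [Operand("r9", is_def=True, is_implicit=True)]
```

With this model, `has_live_defs(xadd_dead)` is false even though the `w9` def itself carries no dead flag, while `has_live_defs(xadd_live)` is true.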
*	[X86] Don't peek through bitcasts before checking ISD::isBuildVectorOfConstantSDNodes in combineTruncatedArithmetic	Craig Topper	2019-02-28	1	-2/+5
	We don't have any combines that can look through a bitcast to truncate a build vector of constants. So the truncate will stick around and give us something like this pattern (binop (trunc X), (trunc (bitcast (build_vector)))), which has two truncates in it. This will be reversed by hoistLogicOpWithSameOpcodeHands in the generic DAG combiner, causing an infinite loop. Even if we had a combine for (truncate (bitcast (build_vector))), I think it would need to be implemented in getNode, otherwise DAG combiner visit ordering would probably still visit the binop first and reverse it. Or combineTruncatedArithmetic would need to do its own constant folding. Differential Revision: https://reviews.llvm.org/D58705 llvm-svn: 355116
*	Revert "[AArch64][GlobalISel] Add support for 64 bit vector shuffle using TBL1."	Amara Emerson	2019-02-28	1	-118/+26
\| \| \| \| \| \|	Seems to break some neon intrinsics tests. llvm-svn: 355115
*	[WebAssembly] Remove uses of ThreadModel	Thomas Lively	2019-02-28	2	-14/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: In the clang UI, replaces -mthread-model posix with -matomics as the source of truth on threading. In the backend, replaces -thread-model=posix with the atomics target feature, which is now collected on the WebAssemblyTargetMachine along with all other used features. These collected features will also be used to emit the target features section in the future. The default configuration for the backend is thread-model=posix and no atomics, which was previously an invalid configuration. This change makes the default valid because the thread model is ignored. A side effect of this change is that objects are never emitted with passive segments. It will instead be up to the linker to decide whether sections should be active or passive based on whether atomics are used in the final link. Reviewers: aheejin, sbc100, dschuff Subscribers: mehdi_amini, jgravelle-google, hiraditya, sunfish, steven_wu, dexonsmith, rupprecht, jfb, jdoerfert, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D58742 llvm-svn: 355112
*	[AArch64][GlobalISel] Add support for 64 bit vector shuffle using TBL1.	Amara Emerson	2019-02-28	1	-26/+118
\| \| \| \| \| \| \| \| \|	This extends the existing support for shufflevector to handle cases like <2 x float>, which we can implement by concating the vectors and using a TBL1. Differential Revision: https://reviews.llvm.org/D58684 llvm-svn: 355104
*	[Target][ARM] Add a usage for SrcSz to unbreak build-bots without assertions	Kadir Cetinkaya	2019-02-28	1	-0/+1
\| \| \| \|	llvm-svn: 355101
*	Add support for computing "zext of value" in KnownBits. NFCI	Bjorn Pettersson	2019-02-28	3	-7/+4
	Summary: The descriptions of KnownBits::zext() and KnownBits::zextOrTrunc() have confusingly stated that the operation is equivalent to zero-extending the value we're tracking. That has not been true; instead, the user has been forced to explicitly set the extended bits as known zero afterwards. This patch adds a second argument to KnownBits::zext() and KnownBits::zextOrTrunc() to control whether the extended bits should be considered known zero or unknown. Reviewers: craig.topper, RKSimon Reviewed By: RKSimon Subscribers: javed.absar, hiraditya, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58650 llvm-svn: 355099
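To illustrate the distinction this commit draws, here is a minimal Python sketch of a known-bits record with a two-argument `zext`. It is a model only, not the LLVM class: the field layout and the parameter name `ext_bits_known_zero` are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnownBits:
    """Toy model: 'zero' and 'one' are masks of bits known to be 0 / 1."""
    width: int
    zero: int
    one: int

    def zext(self, new_width, ext_bits_known_zero):
        """Widen the tracked value to new_width bits.

        If ext_bits_known_zero is True, the new high bits are recorded as
        known zero (a real zero extension of the value). Otherwise they are
        left unknown, which only widens the bookkeeping.
        """
        assert new_width >= self.width
        zero = self.zero
        if ext_bits_known_zero:
            old_mask = (1 << self.width) - 1
            new_mask = (1 << new_width) - 1
            zero |= new_mask & ~old_mask
        return KnownBits(new_width, zero, self.one)


# 8-bit value: high nibble known zero, bit 0 known one.
kb = KnownBits(width=8, zero=0xF0, one=0x01)
as_value = kb.zext(16, True)    # zero mask becomes 0xFFF0
as_unknown = kb.zext(16, False)  # zero mask stays 0x00F0
```

The second call corresponds to the old behaviour described above, where callers had to set the extended bits as known zero themselves afterwards.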
*	[PowerPC] Removed STATISTIC that was causing build errors.	Stefan Pintilie	2019-02-28	1	-1/+0
\| \| \| \|	llvm-svn: 355087
*	[PowerPC] Move the stack pointer update instruction later in the prologue and earlier in the epilogue.	Stefan Pintilie	2019-02-28	5	-33/+229
	Move the stdu instruction in the prologue and epilogue. This should provide a small performance boost in functions that are able to do this. I've kept this change rather conservative at the moment and functions with frame pointers or base pointers will not try to move the stack pointer update. Differential Revision: https://reviews.llvm.org/D42590 llvm-svn: 355085
*	[X86][AVX] Remove superfluous insert_subvector(zero, bitcast(x)) -> bitcast(insert_subvector(zero, x)) fold	Simon Pilgrim	2019-02-28	1	-14/+0
	This is caught by other existing bitcast folds. llvm-svn: 355084
*	[ARM GlobalISel] Make arm_i32imm an IntImmLeaf	Diana Picus	2019-02-28	2	-15/+8
\| \| \| \| \| \| \| \| \| \| \| \| \|	This gets rid of some duplication in the TableGen definition, but it forces us to keep both a pointer and a reference to the subtarget in the ARMInstructionSelector. That is pretty ugly but it might be a reasonable trade-off, since the TableGen descriptions should outlive the code in the selector (or in the worst case we can update to use just the reference when we get rid of DAGISel). Differential Revision: https://reviews.llvm.org/D58031 llvm-svn: 355083
*	[X86][AVX] Fold vf64 concat_vectors(movddup(x),movddup(x)) -> broadcast(x)	Simon Pilgrim	2019-02-28	1	-1/+11
\| \| \| \|	llvm-svn: 355078
*	[ARM GlobalISel] Support global variables for Thumb2	Diana Picus	2019-02-28	2	-25/+73
	Add the same level of support as for ARM mode (i.e. still no TLS support). In most cases, it is sufficient to replace the opcodes with the t2-equivalent, but there are some idiosyncrasies that I decided to preserve because I don't understand the full implications:
	* For ARM we use LDRi12 to load from constant pools, but for Thumb we use t2LDRpci (I'm not sure if the ideal would be to use t2LDRi12 for Thumb as well, or to use LDRcp for ARM).
	* For Thumb we don't have an equivalent for MOV|LDRLIT_ga_pcrel_ldr, so we have to generate MOV|LDRLIT_ga_pcrel plus a load from GOT.
	The tests are in separate files because they're hard enough to read even without doubling the number of checks. llvm-svn: 355077
*	[X86] Use PreprocessISelDAG to convert vector sra/srl/shl to the X86 specific variable shift ISD opcodes.	Craig Topper	2019-02-28	3	-121/+40
	This allows us to use the same set of isel patterns for sra/srl/shl, which are undefined for out-of-range shifts, and intrinsic shifts, which aren't undefined. Doing this late allows DAG combine to have every opportunity to optimize the sra/srl/shl nodes. This removes about 7000 bytes from the isel table and simplifies the td files. llvm-svn: 355071
*	[X86] Use X86::LAST_VALID_COND instead of assuming X86::COND_S is the last encoding. NFC	Craig Topper	2019-02-28	1	-1/+1
	llvm-svn: 355059
*	AMDGPU: Fix typo	Matt Arsenault	2019-02-28	1	-1/+1
\| \| \| \|	llvm-svn: 355056
*	AMDGPU: Enable function calls by default	Matt Arsenault	2019-02-28	1	-4/+9
\| \| \| \| \| \| \|	Fixes some crashes on illegal call situations which are unfortunately still valid IR. llvm-svn: 355051
*	[AArch64] Generate FP16 vector compare instructions.	Abderrazek Zaafrani	2019-02-28	1	-4/+4
\| \| \| \| \| \|	https://reviews.llvm.org/D58561 llvm-svn: 355050
*	AMDGPU: Fix crashes in invalid call cases	Matt Arsenault	2019-02-28	2	-6/+15
\| \| \| \| \| \| \|	We have to at least tolerate calls to kernels, possibly with a mismatched calling convention on the callsite. llvm-svn: 355049
*	GlobalISel: Implement fewerElementsVector for phi	Matt Arsenault	2019-02-28	1	-0/+1
\| \| \| \|	llvm-svn: 355048
*	GlobalISel: Implement moreElementsVector for phi	Matt Arsenault	2019-02-28	1	-0/+1
\| \| \| \|	llvm-svn: 355047
*	Default to Secure PLT on PPC for NetBSD and OpenBSD.	Joerg Sonnenberger	2019-02-27	1	-0/+3
\| \| \| \| \| \|	This matches the default settings of clang. llvm-svn: 355038
*	Separate volatility and atomicity/ordering in SelectionDAG	Philip Reames	2019-02-27	4	-0/+45
	At the moment, we mark every atomic memory access as also being volatile. This is unnecessarily conservative and prohibits many legal transforms (DCE, folding, etc.). This patch removes MOVolatile from the MachineMemOperands of atomic, but not volatile, instructions. This should be strictly NFC after a series of previous patches which have gone in to ensure backend code is conservative about handling of isAtomic MMOs. Once it's in and baked for a bit, we'll start working through removing unnecessary bailouts one by one. We applied this same strategy to the middle end a few years ago, with good success.
	To make sure this patch itself is NFC, it is built on top of a series of other patches which adjust code to (for the moment) be as conservative for an atomic access as for a volatile access and build up a test corpus (mostly in test/CodeGen/X86/atomics-unordered.ll).
	Previously landed:
	  D57593 Fix a bug in the definition of isUnordered on MachineMemOperand
	  D57596 [CodeGen] Be conservative about atomic accesses as for volatile
	  D57802 Be conservative about unordered accesses for the moment
	  rL353959: [Tests] First batch of cornercase tests for unordered atomics.
	  rL353966: [Tests] RMW folding tests w/unordered atomic operations.
	  rL353972: [Tests] More unordered atomic lowering tests.
	  rL353989: [SelectionDAG] Inline a single use helper function, and remove last non-MMO interface
	  rL354740: [Hexagon, SystemZ] Be super conservative about atomics
	  rL354800: [Lanai] Be super conservative about atomics
	  rL354845: [ARM] Be super conservative about atomics
	Attention Out of Tree Backend Owners: This patch may break you. If it does, you can use the TLI getMMOFlags hook to restore the MOVolatile to any instruction you need to. (See llvm-dev thread titled "PSA: Changes to how atomics are handled in backends" started Feb 27, 2019.)
	Differential Revision: https://reviews.llvm.org/D57601 llvm-svn: 355025
*	[X86][AVX] Pull out some INSERT_SUBVECTOR combines into a combineConcatVectorOps helper. NFCI	Simon Pilgrim	2019-02-27	1	-51/+66
	A lot of the INSERT_SUBVECTOR combines can be more generally handled as if they have come from a CONCAT_VECTORS node. I've been investigating adding a CONCAT_VECTORS combine to X86, but this is a much easier first step that avoids the issue of handling a number of pre-legalization issues that I've encountered. Differential Revision: https://reviews.llvm.org/D58583 llvm-svn: 355015
*	[AMDGPU][MC] Added register size check for VOP3/SDWA/DPP operands	Dmitry Preobrazhensky	2019-02-27	2	-13/+17
\| \| \| \| \| \| \| \| \| \|	See bug 37943: https://bugs.llvm.org/show_bug.cgi?id=37943 Reviewers: artem.tamazov, arsenm, rampitec Differential Revision: https://reviews.llvm.org/D58287 llvm-svn: 354974
*	[AMDGPU][MC][GFX8+] Added syntactic sugar for 'vgpr index' operand of instructions s_set_gpr_idx_on and s_set_gpr_idx_mode	Dmitry Preobrazhensky	2019-02-27	7	-28/+149
	See bug 39331: https://bugs.llvm.org/show_bug.cgi?id=39331 Reviewers: artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D58288 llvm-svn: 354969
*	[X86][AVX] Only combine loads to broadcasts for legal types	Simon Pilgrim	2019-02-27	1	-9/+11
\| \| \| \| \| \|	Thanks to @echristo for spotting this. llvm-svn: 354961
*	[BPF] Don't fail for static variables	Yonghong Song	2019-02-27	1	-6/+6
	Currently, LLVM will print an error like
	  Unsupported relocation: try to compile with -O2 or above, or check your static variable usage
	if the user defines more than one static variable in a single ELF section (e.g., .bss or .data). There is an ongoing effort to support static and global variables in libbpf and the kernel. This patch removes the assertion so user programs with static variables won't fail compilation. The static variable's in-section offset is written to the "imm" field of the corresponding to-be-relocated bpf instruction. Below is an example to show how the application (e.g., libbpf) can relate variables to relocations.
	  -bash-4.4$ cat g1.c
	  static volatile long a = 2;
	  static volatile int b = 3;
	  int test() { return a + b; }
	  -bash-4.4$ clang -target bpf -O2 -c g1.c
	  -bash-4.4$ llvm-readelf -r g1.o
	  Relocation section '.rel.text' at offset 0x158 contains 2 entries:
	      Offset             Info             Type        Symbol's Value  Symbol's Name
	  0000000000000000  0000000400000001 R_BPF_64_64  0000000000000000 .data
	  0000000000000018  0000000400000001 R_BPF_64_64  0000000000000000 .data
	  -bash-4.4$ llvm-readelf -s g1.o
	  Symbol table '.symtab' contains 6 entries:
	     Num:    Value          Size Type    Bind   Vis      Ndx Name
	       0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
	       1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS g1.c
	       2: 0000000000000000     8 OBJECT  LOCAL  DEFAULT    4 a
	       3: 0000000000000008     4 OBJECT  LOCAL  DEFAULT    4 b
	       4: 0000000000000000     0 SECTION LOCAL  DEFAULT    4
	       5: 0000000000000000    64 FUNC    GLOBAL DEFAULT    2 test
	  -bash-4.4$ llvm-objdump -d g1.o
	  g1.o: file format ELF64-BPF
	  Disassembly of section .text:
	  0000000000000000 test:
	       0: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
	       2: 79 11 00 00 00 00 00 00 r1 = *(u64 *)(r1 + 0)
	       3: 18 02 00 00 08 00 00 00 00 00 00 00 00 00 00 00 r2 = 8 ll
	       5: 61 20 00 00 00 00 00 00 r0 = *(u32 *)(r2 + 0)
	       6: 0f 10 00 00 00 00 00 00 r0 += r1
	       7: 95 00 00 00 00 00 00 00 exit
	  -bash-4.4$
	. from symbol table, static variable "a" is in section #4, offset 0.
	. from symbol table, static variable "b" is in section #4, offset 8.
	. the first relocation is against symbol #4:
	    4: 0000000000000000 0 SECTION LOCAL DEFAULT 4
	  and in-section offset 0 (see llvm-objdump result)
	. the second relocation is against symbol #4:
	    4: 0000000000000000 0 SECTION LOCAL DEFAULT 4
	  and in-section offset 8 (see llvm-objdump result)
	. therefore, the first relocation is for variable "a", and the second relocation is for variable "b".
	Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 354954
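The lookup that this message walks through by hand can be sketched in a few lines of Python. This is only an illustration of the reasoning, not libbpf code: the `Symbol` record, the `IMM_AT` table, and `variable_for_reloc` are hypothetical names, and the data is copied from the llvm-readelf and llvm-objdump output quoted in the message.

```python
from collections import namedtuple

Symbol = namedtuple("Symbol", "num value size type ndx name")

# Symbol table entries from `llvm-readelf -s g1.o` (UND/ABS kept as strings).
SYMBOLS = [
    Symbol(0, 0x0, 0, "NOTYPE", "UND", ""),
    Symbol(1, 0x0, 0, "FILE", "ABS", "g1.c"),
    Symbol(2, 0x0, 8, "OBJECT", 4, "a"),
    Symbol(3, 0x8, 4, "OBJECT", 4, "b"),
    Symbol(4, 0x0, 0, "SECTION", 4, ""),
    Symbol(5, 0x0, 64, "FUNC", 2, "test"),
]

# (r_offset, r_info) pairs from `llvm-readelf -r g1.o`.
RELOCS = [(0x00, 0x0000000400000001), (0x18, 0x0000000400000001)]

# "imm" field of the instruction at each relocated byte offset, read off the
# disassembly (r1 = 0 ll at byte 0x00, r2 = 8 ll at byte 0x18).
IMM_AT = {0x00: 0, 0x18: 8}


def variable_for_reloc(offset, info):
    """Map one relocation to the static variable it refers to."""
    sym_index = info >> 32           # ELF64: high 32 bits of r_info
    section = SYMBOLS[sym_index].ndx  # the section symbol's own section
    imm = IMM_AT[offset]              # in-section offset stored in the insn
    for sym in SYMBOLS:
        if (sym.type == "OBJECT" and sym.ndx == section
                and sym.value <= imm < sym.value + sym.size):
            return sym.name
    return None


names = [variable_for_reloc(off, info) for off, info in RELOCS]
```

Here `names` ends up pairing the first relocation with `a` and the second with `b`, matching the conclusion stated in the message.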
*	[WebAssembly] Fix ScopeTops info in CFGStackify for EH pads	Heejin Ahn	2019-02-27	1	-5/+16
	Summary: When creating `ScopeTops` info for `try` ~ `catch` ~ `end_try`, we should create not only the `end_try` -> `try` mapping but also the `catch` -> `try` mapping. If this is not created, `block` and `end_block` markers added later may span across an existing `catch`, resulting in incorrect code like:
	```
	try
	  block     --|  (X)
	catch         |
	  end_block --|
	end_try
	```
	Reviewers: dschuff Subscribers: sunfish, sbc100, jgravelle-google, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58605 llvm-svn: 354945
*	[WebAssembly] Remove unnecessary instructions after TRY marker placement	Heejin Ahn	2019-02-27	1	-2/+94
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This removes unnecessary instructions after TRY marker placement. There are two cases: - `end`/`end_block` can be removed if they overlap with `try`/`end_try` and they have the same return types. - `br` right before `catch` that branches to after `end_try` can be deleted. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58591 llvm-svn: 354939
*	[SystemZ] Pass regalloc hints to help Load-and-Test transformations.	Jonas Paulsson	2019-02-27	1	-15/+38
\| \| \| \| \| \| \| \| \| \| \| \|	Since there is no "Load-and-Test-High" instruction, the 32 bit load of a register to be compared with 0 can only be implemented with LT if the virtual GRX32 register ends up in a low part (GR32 register). This patch detects these cases and passes the GR32 registers (low parts) as (soft) hints in getRegAllocationHints(). Review: Ulrich Weigand. llvm-svn: 354935
*	[AMDGPU] Fixed hang during DAG combine	Stanislav Mekhanoshin	2019-02-26	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	SITargetLowering::reassociateScalarOps() does not touch constants so that DAGCombiner::ReassociateOps() does not revert the combine. However a global address is not a ConstantSDNode. Switched to the method used by DAGCombiner::ReassociateOps() itself to detect constants. Differential Revision: https://reviews.llvm.org/D58695 llvm-svn: 354926
*	[X86] Fix bug in vectorcall calling convention	Reid Kleckner	2019-02-26	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \|	Original implementation can't correctly handle __m256 and __m512 types passed by reference through stack. This patch fixes it. Patch by Wei Xiao! Differential Revision: https://reviews.llvm.org/D57643 llvm-svn: 354921
*	[MIPS GlobalISel] Select G_UADDO	Petar Avramovic	2019-02-26	1	-1/+1
\| \| \| \| \| \| \| \| \|	Lower G_UADDO. Legalize G_UADDO for MIPS32 Differential Revision: https://reviews.llvm.org/D58671 llvm-svn: 354900
*	[X86] AMD znver2 enablement	Ganesh Gopalasubramanian	2019-02-26	1	-2/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch enables the following 1) AMD family 17h "znver2" tune flag (-march, -mcpu). 2) ISAs that are enabled for "znver2" architecture. 3) For the time being, it uses the znver1 scheduler model. 4) Tests are updated. 5) Scheduler descriptions are yet to be put in place. Reviewers: craig.topper Differential Revision: https://reviews.llvm.org/D58343 llvm-svn: 354897
*	[SystemZ] Wait with selection of legal vector/FP constants until Select().	Jonas Paulsson	2019-02-26	5	-163/+173
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch aims to make sure that any such constant that can be generated with a vector instruction (for example VGBM) is recognized as such during legalization and kept as a target independent node through post-legalize DAGCombining. Two new functions named isVectorConstantLegal() and loadVectorConstant() replace old ways of handling vector/FP constants. A new struct named SystemZVectorConstantInfo is used to cache the results of isVectorConstantLegal() and pass them onto loadVectorConstant(). Support for fp128 constants in the presence of FeatureVectorEnhancements1 (z14) has been added. Review: Ulrich Weigand https://reviews.llvm.org/D58270 llvm-svn: 354896
*	[mips] Emit `.module softfloat` directive	Simon Atanasyan	2019-02-26	2	-3/+7
	This change fixes a crash on an assertion when using the `soft float` ABI for the mips32r6 target. llvm-svn: 354882
*	[llvm-objdump] Implement -Mreg-names-raw/-std options.	Igor Kudrin	2019-02-26	3	-6/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The --disassembler-options, or -M, are used to customize the disassembler and affect its output. The two implemented options allow selecting register names on ARM: * With -Mreg-names-raw, the disassembler uses rNN for all registers. * With -Mreg-names-std it prints sp, lr and pc for r13, r14 and r15, which is the default behavior of llvm-objdump. Differential Revision: https://reviews.llvm.org/D57680 llvm-svn: 354870
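The effect of the two options on ARM register naming can be modelled with a small Python function. This is a sketch of the behaviour described above, not llvm-objdump's implementation; the function name and the `raw` flag are assumptions for the example.

```python
# Names printed for r13, r14 and r15 in the default ("std") mode.
STD_NAMES = {13: "sp", 14: "lr", 15: "pc"}


def arm_reg_name(number, raw=False):
    """Return the printed name of core register `number` (0..15).

    raw=True  models -Mreg-names-raw: every register prints as rNN.
    raw=False models -Mreg-names-std: r13/r14/r15 print as sp/lr/pc.
    """
    if not 0 <= number <= 15:
        raise ValueError("ARM core registers are r0..r15")
    if raw:
        return f"r{number}"
    return STD_NAMES.get(number, f"r{number}")
```

For example, `arm_reg_name(13)` gives `sp`, while `arm_reg_name(13, raw=True)` gives `r13`.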
*	[ARM] Add Cortex-M35P	Luke Cheeseman	2019-02-26	1	-0/+10
	- Add LLVM backend support for Cortex-M35P - Documentation can be found at https://developer.arm.com/products/processors/cortex-m/cortex-m35p Differential Revision: https://reviews.llvm.org/D57763 llvm-svn: 354868
*	[WebAssembly] Properly align fp128 arguments in outgoing varargs arguments	Dan Gohman	2019-02-26	1	-3/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	For outgoing varargs arguments, it's necessary to check the OrigAlign field of the corresponding OutputArg entry to determine argument alignment, rather than just computing an alignment from the argument value type. This is because types like fp128 are split into multiple argument values, with narrower types that don't reflect the ABI alignment of the full fp128. This fixes the printf("printfL: %4.*Lf\n", 2, lval); testcase. Differential Revision: https://reviews.llvm.org/D58656 llvm-svn: 354846
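The alignment problem described here can be shown with a toy layout of the outgoing varargs buffer. This Python sketch is not the WebAssembly backend code: the tuple layout and the `orig_align` values are assumptions, including the assumption that only the first piece of a split fp128 carries the full 16-byte original alignment.

```python
def align_to(offset, align):
    """Round offset up to the next multiple of align."""
    return (offset + align - 1) // align * align


def layout(pieces, use_orig_align):
    """Assign buffer offsets to argument pieces.

    Each piece is (size, value_align, orig_align). With use_orig_align=False
    only the alignment of the piece's own value type is honoured.
    """
    offsets, offset = [], 0
    for size, value_align, orig_align in pieces:
        align = max(value_align, orig_align) if use_orig_align else value_align
        offset = align_to(offset, align)
        offsets.append(offset)
        offset += size
    return offsets, offset


# Varargs of printf("printfL: %4.*Lf\n", 2, lval): one i32, then an fp128
# that has been split into two i64 pieces.
pieces = [
    (4, 4, 4),    # int 2
    (8, 8, 16),   # low half of lval, carries the fp128 alignment
    (8, 8, 1),    # high half of lval
]

by_value_type = layout(pieces, use_orig_align=False)
by_orig_align = layout(pieces, use_orig_align=True)
```

Laying out by value type alone places the fp128 at offset 8, which is not 16-byte aligned; honouring the original alignment moves it to offset 16.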
*	[ARM] Be super conservative about atomics	Philip Reames	2019-02-26	1	-2/+5
	As requested during review of D57601 (https://reviews.llvm.org/D57601), be equally conservative for atomic MMOs as for volatile MMOs in all in-tree backends. At the moment, all atomic MMOs are also volatile, but I'm about to change that. Differential Revision: https://reviews.llvm.org/D58490 Note: D58498 landed in several pieces as individual backends were approved. This is the last chunk. llvm-svn: 354845
*	[WebAssembly] Fix a bug deleting instruction in a ranged for loop	Heejin Ahn	2019-02-26	1	-2/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: We shouldn't delete elements while iterating a ranged for loop. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58519 llvm-svn: 354844
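The hazard named in this summary has a close analogue in Python, shown below as a sketch. The commit itself concerns a C++ range-based for loop, where erasing during iteration invalidates the iterator rather than merely skipping elements, and the collect-then-delete pattern here is one common remedy, not necessarily the one the patch uses.

```python
def remove_while_iterating(items, should_delete):
    """Buggy pattern: mutates the list that the loop is walking."""
    for item in items:
        if should_delete(item):
            items.remove(item)  # shifts later elements under the iterator
    return items


def remove_after_collecting(items, should_delete):
    """Safer pattern: decide first, delete after the loop has finished."""
    to_delete = [item for item in items if should_delete(item)]
    for item in to_delete:
        items.remove(item)
    return items


def is_two(x):
    return x == 2


buggy = remove_while_iterating([1, 2, 2, 3], is_two)
fixed = remove_after_collecting([1, 2, 2, 3], is_two)
```

In the buggy version the second `2` is skipped because the list shrinks under the loop, so one matching element survives; the collecting version removes both.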
*	[X86] Fix bug in x86_intrcc with arg copy elision	Reid Kleckner	2019-02-26	4	-43/+51
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Use a custom calling convention handler for interrupts instead of fixing up the locations in LowerMemArgument. This way, the offsets are correct when constructed and we don't need to account for them in as many places. Depends on D56883 Replaces D56275 Reviewers: craig.topper, phil-opp Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D56944 llvm-svn: 354837
*	RegBankSelect: Handle slightly more complex value mappings	Matt Arsenault	2019-02-25	2	-8/+47
\| \| \| \| \| \| \| \|	Try to use concat_vectors. Also remove unnecessary assert on pointers. Fixes asserting for <4 x s16> operations and 64-bit pointers for AMDGPU. llvm-svn: 354828
*	AMDGPU/GlobalISel: Fix bit ops for non-power-of-2 sizes	Matt Arsenault	2019-02-25	1	-0/+2
\| \| \| \|	llvm-svn: 354825