bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	[X86][AVX1] Enable *_EXTEND_VECTOR_INREG lowering of 256-bit vectors	Simon Pilgrim	2018-10-09	1	-15/+34
\| \| \| \| \| \| \| \|	As discussed on D52964, this adds 256-bit *_EXTEND_VECTOR_INREG lowering support for AVX1 targets to help improve SimplifyDemandedBits handling. Differential Revision: https://reviews.llvm.org/D52980 llvm-svn: 344019
*	[MIPS GlobalISel] Legalize i64 add	Petar Jovanovic	2018-10-08	2	-1/+50
\| \| \| \| \| \| \| \| \| \|	Custom legalize s64 G_ADD for MIPS32. Patch by Petar Avramovic. Differential Revision: https://reviews.llvm.org/D52652 llvm-svn: 344007
*	[X86] Revert r343993 condition branches folding for three-way conditional codes	Rong Xu	2018-10-08	6	-601/+0
\| \| \| \| \| \|	Some buildbots failed. llvm-svn: 343998
*	[X86] Prefer isTypeLegal over checking isSimple in a DAG combine.	Craig Topper	2018-10-08	1	-1/+3
\| \| \| \| \| \|	Simple types are a superset of the types that any in-tree LLVM target could possibly treat as legal. This means the behavior of using isSimple to check for a supported type for X86 could change over time. For example, this could change if a v256i1 type was added to MVT in the future. llvm-svn: 343995
*	[X86] condition branches folding for three-way conditional codes	Rong Xu	2018-10-08	6	-0/+601
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch implements a pass that optimizes conditional branches on x86 by taking advantage of the three-way conditional code generated by compare instructions. Currently, it tries to hoist EQ and NE conditional branches to a dominant conditional branch condition where the same EQ/NE conditional code is computed. An example: bb_0: cmp %0, 19; jg bb_1; jmp bb_2. bb_1: cmp %0, 40; jg bb_3; jmp bb_4. bb_4: cmp %0, 20; je bb_5; jmp bb_6. Here we could combine the two compares in bb_0 and bb_4 and have the following code: bb_0: cmp %0, 20; jg bb_1; jl bb_2; jmp bb_5. bb_1: cmp %0, 40; jg bb_3; jmp bb_6. For the case of %0 == 20 (bb_5), we eliminate two jumps, and the control height for bb_6 is also reduced. bb_4 is gone after the optimization. This optimization is motivated by the branch pattern generated by the switch lowering: we always have a pivot-1 compare for the inner nodes and we do a pivot compare again at the leaf (as in the pattern above). This pass is currently enabled on Intel's Sandybridge and later arches. Some reviewers pointed out that on some arches (like AMD Jaguar), this pass may increase branch density to the point where it hurts the performance of the branch predictor. Differential Revision: https://reviews.llvm.org/D46662 llvm-svn: 343993
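The equivalence of the two control-flow shapes in the example above can be checked with a small model. This is only an illustrative sketch, not code from the patch: the function names and the `Block` enum are hypothetical, and each function simply returns the label of the block that the example would finally reach for a given value of `%0`.

```cpp
#include <cassert>

// Destination blocks named in the commit message's example.
enum Block { BB2 = 2, BB3 = 3, BB5 = 5, BB6 = 6 };

// Control flow before the pass: bb_0 compares against 19 (pivot - 1),
// bb_1 against 40, and the leaf bb_4 compares against 20 (the pivot) again.
Block beforeFolding(int x) {
  if (x > 19) {      // bb_0: cmp %0, 19 ; jg bb_1
    if (x > 40)      // bb_1: cmp %0, 40 ; jg bb_3
      return BB3;
    if (x == 20)     // bb_4: cmp %0, 20 ; je bb_5
      return BB5;
    return BB6;      // bb_4: jmp bb_6
  }
  return BB2;        // bb_0: jmp bb_2
}

// Control flow after the pass: a single compare against 20 in bb_0
// feeds three outcomes (greater, less, equal), and bb_4 disappears.
Block afterFolding(int x) {
  if (x > 20)                    // bb_0: cmp %0, 20 ; jg bb_1
    return x > 40 ? BB3 : BB6;   // bb_1: cmp %0, 40 ; jg bb_3 ; jmp bb_6
  if (x < 20)                    // bb_0: jl bb_2
    return BB2;
  return BB5;                    // bb_0: jmp bb_5
}

// Returns true if both shapes reach the same block for every x in [lo, hi].
bool equivalentOnRange(int lo, int hi) {
  assert(lo <= hi);
  for (int x = lo; x <= hi; ++x)
    if (beforeFolding(x) != afterFolding(x))
      return false;
  return true;
}
```

Calling `equivalentOnRange(-1000, 1000)` exercises every interesting boundary (19, 20, 21, 40, 41) of the example.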
*	[AMDGPU] Legalize VGPR Rsrc operands for MUBUF instructions	Scott Linder	2018-10-08	4	-108/+293
\| \| \| \| \| \| \| \| \| \| \|	Emit a waterfall loop in the general case for a potentially-divergent Rsrc operand. When practical, avoid this by using Addr64 instructions. Recommits r341413 with changes to update the MachineDominatorTree when present. Differential Revision: https://reviews.llvm.org/D51742 llvm-svn: 343992
*	[X86][AVX2] Enable ZERO_EXTEND_VECTOR_INREG lowering of 256-bit vectors	Simon Pilgrim	2018-10-08	1	-7/+5
\| \| \| \| \| \| \| \|	Some necessary yak shaving before lowering *_EXTEND_VECTOR_INREG 256-bit vectors on AVX1 targets as suggested by D52964. Differential Revision: https://reviews.llvm.org/D52970 llvm-svn: 343991
*	[x86] make horizontal binop matching clearer; NFCI	Sanjay Patel	2018-10-08	1	-41/+37
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The instructions are complicated, so this code will probably never be very obvious, but hopefully this makes it better. As shown in PR39195: https://bugs.llvm.org/show_bug.cgi?id=39195 ...we need to improve the matching to not miss cases where we're h-opping on 1 source vector, and that should be a small patch after this rearranging. llvm-svn: 343989
*	AMDGPU/GlobalISel: Select amdgcn.cvt.pkrtz to 64-bit instructions	Tom Stellard	2018-10-08	1	-3/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The 32-bit variants do not exist on VI+. Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D52958 llvm-svn: 343985
*	[AMDGPU] Add an AMDGPU specific atomic optimizer.	Neil Henning	2018-10-08	4	-0/+438
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This commit adds a new IR level pass to the AMDGPU backend to perform atomic optimizations. It works by: - Running through a function and finding atomicrmw add/sub or uses of the atomic buffer intrinsics for add/sub. - If all arguments except the value to be added/subtracted are uniform, record the value to be optimized. - Run through the atomic operations we can optimize and, depending on whether the value is uniform/divergent use wavefront wide operations (DPP in the divergent case) to calculate the total amount to be atomically added/subtracted. - Then let only a single lane of each wavefront perform the atomic operation, reducing the total number of atomic operations in flight. - Lastly we recombine the result from the single lane to each lane of the wavefront, and calculate our individual lanes offset into the final result. Differential Revision: https://reviews.llvm.org/D51969 llvm-svn: 343973
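The steps listed above can be modelled on the host as follows. This is a hedged sketch under strong simplifying assumptions: a wavefront is represented as a plain vector of per-lane addends, `std::atomic` stands in for the memory location, and a sequential loop stands in for the wavefront-wide (DPP) scan. The name `wavefrontAtomicAdd` is hypothetical and does not appear in the patch.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Model of the optimized sequence for one wavefront:
//  1. compute an exclusive prefix sum of the per-lane values and their total,
//  2. let a single lane perform one atomic add of the total,
//  3. give each lane the value it would have observed: old + its prefix.
std::vector<uint32_t> wavefrontAtomicAdd(std::atomic<uint32_t> &counter,
                                         const std::vector<uint32_t> &laneValues) {
  std::vector<uint32_t> prefix(laneValues.size());
  uint32_t total = 0;
  for (std::size_t lane = 0; lane < laneValues.size(); ++lane) {
    prefix[lane] = total;          // exclusive scan: sum of earlier lanes
    total += laneValues[lane];
  }

  // One atomic operation for the whole wavefront instead of one per lane.
  const uint32_t old = counter.fetch_add(total);

  std::vector<uint32_t> results(laneValues.size());
  for (std::size_t lane = 0; lane < laneValues.size(); ++lane)
    results[lane] = old + prefix[lane];   // this lane's offset into the result
  return results;
}

// Small usage example with four lanes adding 1, 2, 3 and 4 to a counter of 100.
void checkWavefrontExample() {
  std::atomic<uint32_t> counter{100};
  const std::vector<uint32_t> results = wavefrontAtomicAdd(counter, {1, 2, 3, 4});
  assert(counter.load() == 110);
  assert(results.size() == 4);
  assert(results[0] == 100 && results[3] == 106);
}
```

The per-lane results match what the lanes would have seen had they performed their atomics one after another in lane order, which is one of the orderings the unoptimized code could have produced.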
*	[AArch64][v8.5A] Don't create BR instructions in outliner when BTI enabled	Oliver Stannard	2018-10-08	1	-1/+9
\| \| \| \| \| \| \| \| \| \|	When branch target identification is enabled, we can only do indirect tail-calls through x16 or x17. This means that the outliner can't transform a BLR instruction at the end of an outlined region into a BR. Differential revision: https://reviews.llvm.org/D52869 llvm-svn: 343969
*	[AArch64][v8.5A] Restrict indirect tail calls to use x16/17 only when using BTI	Oliver Stannard	2018-10-08	4	-2/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When branch target identification is enabled, all indirectly-callable functions start with a BTI C instruction. This instruction can only be the target of certain indirect branches (direct branches and fall-through are not affected): - A BLR instruction, in either a protected or unprotected page. - A BR instruction in a protected page, using x16 or x17. - A BR instruction in an unprotected page, using any register. Without BTI, we can use any non call-preserved register to hold the address for an indirect tail call. However, when BTI is enabled, then the code being compiled might be loaded into a BTI-protected page, where only x16 and x17 can be used for indirect tail calls. Legacy code without this restriction can still indirectly tail-call BTI-protected functions, because it will be loaded into an unprotected page, so any register is allowed. Differential revision: https://reviews.llvm.org/D52868 llvm-svn: 343968
*	[AArch64][v8.5A] Branch Target Identification code-generation pass	Oliver Stannard	2018-10-08	4	-0/+142
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The Branch Target Identification extension, introduced to AArch64 in Armv8.5-A, adds the BTI instruction, which is used to mark valid targets of indirect branches. When enabled, the processor will trap if an instruction in a protected page tries to perform an indirect branch to any instruction other than a BTI. The BTI instruction uses encodings which were NOPs in earlier versions of the architecture, so BTI-enabled code will still run on earlier hardware, just without the extra protection. There are 3 variants of the BTI instruction, which are valid targets for different kinds of branches: - BTI C can be targeted by call instructions, and is intended to be used at function entry points. These are the BLR instruction, as well as BR with x16 or x17. These BR instructions are allowed for use in PLT entries, and we can also use them to allow indirect tail-calls. - BTI J can be targeted by BR only, and is intended to be used by jump tables. - BTI JC acts as both a BTI C and a BTI J instruction, and can be targeted by any BLR or BR instruction. Note that RET instructions are not restricted by branch target identification; the reason for this is that return addresses can be protected more effectively using return address signing. Direct branches and calls are also unaffected, as it is assumed that an attacker cannot modify executable pages (if they could, they wouldn't need to do a ROP/JOP attack). This patch adds a MachineFunctionPass which: - Adds a BTI C at the start of every function which could be indirectly called (either because it is address-taken, or externally visible so could be address-taken in another translation unit). - Adds a BTI J at the start of every basic block which could be indirectly branched to. This could be either done by a jump table, or by taking the address of the block (e.g. using the GCC label values extension). We only need to use BTI JC when a function is indirectly-callable, and takes the address of the entry block. I've not been able to trigger this from C or IR, but I've included a MIR test just in case. Using BTI C at function entries relies on the fact that no other code in BTI-protected pages uses indirect tail-calls, unless they use x16 or x17 to hold the address. I'll add that code-generation restriction as a separate patch. Differential revision: https://reviews.llvm.org/D52867 llvm-svn: 343967
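The targeting rules described in the BTI commit messages can be summarised as a small predicate. This is a sketch of the rules as stated in those messages, not of the pass itself; the enum and function names are hypothetical, and the model assumes the branch target lives in a BTI-protected page (otherwise nothing is checked).

```cpp
#include <cassert>

enum class BtiKind { C, J, JC };   // the three BTI variants
enum class Branch { BLR, BR };     // indirect call and indirect branch

// Returns true if an indirect branch may land on the given BTI variant
// without trapping, assuming the target is in a protected page.
bool mayTarget(BtiKind target, Branch branch, bool usesX16OrX17,
               bool branchInProtectedPage) {
  if (target == BtiKind::JC)
    return true;                    // accepts any BLR or BR
  if (branch == Branch::BLR)
    return target == BtiKind::C;    // calls may only land on BTI C (or JC)
  if (target == BtiKind::J)
    return true;                    // BR may always land on BTI J
  // BR landing on BTI C: needs x16/x17 when issued from a protected page;
  // code in an unprotected page may use any register.
  return usesX16OrX17 || !branchInProtectedPage;
}

// Worked examples matching the cases listed in the commit messages.
void checkBtiExamples() {
  assert(mayTarget(BtiKind::C, Branch::BLR, false, true));
  assert(mayTarget(BtiKind::C, Branch::BR, true, true));    // tail call via x16/x17
  assert(!mayTarget(BtiKind::C, Branch::BR, false, true));  // would trap
  assert(mayTarget(BtiKind::C, Branch::BR, false, false));  // legacy caller
  assert(mayTarget(BtiKind::J, Branch::BR, false, true));   // jump table
  assert(!mayTarget(BtiKind::J, Branch::BLR, false, true));
}
```

The third case is the one that motivates restricting indirect tail calls to x16/x17 when BTI is enabled.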
*	[GlobalIsel][X86] Support G_UDIV/G_UREM/G_SREM	Alexander Ivchenko	2018-10-08	2	-52/+178
\| \| \| \| \| \| \| \| \| \|	Support G_UDIV/G_UREM/G_SREM. The instruction selection code is taken from FastISel with only minor tweaks to adapt for GlobalISel. Differential Revision: https://reviews.llvm.org/D49781 llvm-svn: 343966
*	[IRBuilder] Fixup CreateIntrinsic to allow specifying Types to Mangle.	Neil Henning	2018-10-08	2	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The IRBuilder CreateIntrinsic method wouldn't allow you to specify the types that you wanted the intrinsic to be mangled with. To fix this I've: - Added an ArrayRef<Type > member to both CreateIntrinsic overloads. - Used that array to pass into the Intrinsic::getDeclaration call. - Added a CreateUnaryIntrinsic to replace the most common use of CreateIntrinsic where the type was auto-deduced from operand 0. - Added a bunch more unit tests to test CreateIntrinsic calls that weren't being tested (including the FMF flag that wasn't checked). This was suggested as part of the AMDGPU specific atomic optimizer review (https://reviews.llvm.org/D51969). Differential Revision: https://reviews.llvm.org/D52087 llvm-svn: 343962
*	[ARM] Account for implicit IT when calculating inline asm size	Peter Smith	2018-10-08	2	-3/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When deciding if it is safe to optimize a conditional branch to a CBZ or CBNZ the offsets of the BasicBlocks from the start of the function are estimated. For inline assembly the generic getInlineAsmLength() function is used to get a worst case estimate of the inline assembly by multiplying the number of instructions by the max instruction size of 4 bytes. This unfortunately doesn't take into account the generation of Thumb implicit IT instructions. In edge cases such as when all the instructions in the block are 4-bytes in size and there is an implicit IT then the size is underestimated. This can cause an out of range CBZ or CBNZ to be generated. The patch takes a conservative approach and assumes that every instruction in the inline assembly block may have an implicit IT. Fixes pr31805 Differential Revision: https://reviews.llvm.org/D52834 llvm-svn: 343960
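The arithmetic behind the underestimate can be written out directly. The sketch below assumes a Thumb IT instruction occupies 2 bytes and that the conservative estimate charges one possible IT per instruction; the constant and function names are hypothetical and are not taken from the patch.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kMaxInstBytes = 4;  // worst-case instruction size from the message
constexpr std::size_t kImplicitItBytes = 2;  // assumption: one 16-bit IT instruction

// Previous worst-case estimate: instruction count times 4 bytes.
std::size_t oldInlineAsmEstimate(std::size_t numInsts) {
  return numInsts * kMaxInstBytes;
}

// Conservative estimate: every instruction may carry an implicit IT.
std::size_t conservativeInlineAsmEstimate(std::size_t numInsts) {
  return numInsts * (kMaxInstBytes + kImplicitItBytes);
}

// Edge case from the message: all instructions are 4 bytes and one of them
// gets an implicit IT, so the real size exceeds the old estimate.
void checkItExample() {
  const std::size_t numInsts = 3;
  const std::size_t actualBytes = numInsts * 4 + kImplicitItBytes;  // 14
  assert(oldInlineAsmEstimate(numInsts) < actualBytes);             // 12 < 14
  assert(conservativeInlineAsmEstimate(numInsts) >= actualBytes);   // 18 >= 14
}
```

Overestimating is the safe direction here: it can only cause a CBZ/CBNZ opportunity to be skipped, never an out-of-range branch to be emitted.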
*	[AArch64] Fix verifier error when outlining indirect calls	Oliver Stannard	2018-10-08	3	-2/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The MachineOutliner for AArch64 transforms indirect calls into indirect tail calls, replacing the call with the TCRETURNri pseudo-instruction. This pseudo lowers to a BR, but has the isCall and isReturn flags set. The problem is that TCRETURNri takes a tcGPR64 as the register argument, to prevent indirect tail-calls from using caller-saved registers. The indirect calls transformed by the outliner could use caller-saved registers. This is fine, because the outliner ensures that the register is available at all call sites. However, this causes a verifier failure when the register is not in tcGPR64. The fix is to add a new pseudo-instruction like TCRETURNri, but which accepts any GPR. Differential revision: https://reviews.llvm.org/D52829 llvm-svn: 343959
*	[X86] getFauxShuffleMask - Handle undef + sentinel values in subvector insertion	Simon Pilgrim	2018-10-06	1	-1/+3
\| \| \| \|	llvm-svn: 343926
*	[X86][AVX] Ensure resolveTargetShuffleInputs shuffle masks are the correct width	Simon Pilgrim	2018-10-06	1	-1/+2
\| \| \| \| \| \|	Don't handle ZERO_EXTEND style shuffles until we support bitcasts. Found by inspection. llvm-svn: 343924
*	[X86] combinePMULDQ - add op back to worklist if SimplifyDemandedBits succeeds on either operand	Simon Pilgrim	2018-10-06	1	-2/+6
\| \| \| \| \| \| \| \|	Prevents missing other simplifications that may occur deep in the operand chain where CommitTargetLoweringOpt won't add the PMULDQ back to the worklist itself. llvm-svn: 343922
*	[X86][SSE] SimplifyDemandedVectorEltsForTargetNode - simplify PSHUFB masks	Simon Pilgrim	2018-10-06	1	-0/+9
\| \| \| \| \| \|	Attempt to simplify PSHUFB masks (even non-constant ones) - we should probably be able to simplify other variable shuffles as well as the need arises. llvm-svn: 343919
*	[X86] Use the SimplifyDemandedBits wrappers where possible. NFCI.	Simon Pilgrim	2018-10-06	1	-23/+4
\| \| \| \| \| \|	Leave the wrapper to handle TargetLowering::TargetLoweringOpt and CommitTargetLoweringOpt. llvm-svn: 343918
*	[RISCV] Compress addiw rd, x0, simm6 to c.li rd, simm6	Alex Bradbury	2018-10-06	1	-0/+2
\| \| \| \| \| \| \| \|	A pattern was present for addi rd, x0, simm6 but not addiw which is semantically identical when the source register is x0. This patch addresses that, and the benefit can be seen in rv64c-aliases-valid.s. llvm-svn: 343911
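The claim that the two instructions agree when the source register is x0 can be checked with a small model of their RV64 semantics. This is a sketch, not the TableGen pattern from the patch; the function names are hypothetical, and the narrowing cast relies on the usual two's-complement conversion behaviour.

```cpp
#include <cassert>
#include <cstdint>

// RV64 addi: 64-bit add of a sign-extended immediate (wrapping).
int64_t addi(int64_t rs1, int64_t imm) {
  return static_cast<int64_t>(static_cast<uint64_t>(rs1) +
                              static_cast<uint64_t>(imm));
}

// RV64 addiw: add, keep the low 32 bits, then sign-extend to 64 bits.
int64_t addiw(int64_t rs1, int64_t imm) {
  const uint32_t low = static_cast<uint32_t>(static_cast<uint64_t>(rs1) +
                                             static_cast<uint64_t>(imm));
  return static_cast<int64_t>(static_cast<int32_t>(low));
}

// With rs1 = x0 (always zero), every simm6 value gives the same result,
// which is why both forms can be compressed to c.li rd, simm6.
bool identicalForX0() {
  for (int64_t imm = -32; imm <= 31; ++imm)
    if (addi(0, imm) != addiw(0, imm))
      return false;
  return true;
}

// The two differ for other sources, e.g. when the 32-bit sum overflows.
void checkAddiwExample() {
  assert(identicalForX0());
  assert(addi(0x7fffffff, 1) == 0x80000000LL);
  assert(addiw(0x7fffffff, 1) == -0x80000000LL);
}
```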
*	AMDGPU: Consolidate SMRD TableGen patterns	Tom Stellard	2018-10-06	1	-100/+80
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Merge the SMRD patterns for CI into the same multiclass as the patterns for other sub-targets. This removes some duplicate code and will make it easier for some future GlobalISel changes I would like to do. Reviewers: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D52557 llvm-svn: 343909
*	X86, AArch64, ARM: Do not attach debug location to spill/reload instructions	Matthias Braun	2018-10-05	3	-27/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This rebases and recommits r343520. hwasan should be fixed now and this shouldn't break the tests anymore. Spill/reload instructions are artificially generated by the compiler and have no relation to the original source code. So the best thing to do is not attach any debug location to them (instead of just taking the next debug location we find on following instructions). Differential Revision: https://reviews.llvm.org/D52125 llvm-svn: 343895
*	[X86][AVX] Limit getFauxShuffleMask INSERT_SUBVECTOR support to 2 inputs	Simon Pilgrim	2018-10-05	1	-1/+2
\| \| \| \| \| \| \| \|	rL343853 didn't limit the number of subinputs, but we don't currently support faux shuffles with more than 2 total inputs, so put a limiter in place until this is fixed. Found by Artem Dergachev. llvm-svn: 343891
*	[X86] Don't promote i16 compares to i32 if the immediate will fit in 8 bits.	Craig Topper	2018-10-05	1	-2/+5
\| \| \| \| \| \| \| \|	The comments in this code say we were trying to avoid 16-bit immediates, but if the immediate fits in 8-bits this isn't an issue. This avoids creating a zero extend that probably won't go away. The movmskb related changes are interesting. The movmskb instruction writes a 32-bit result, but fills the upper bits with 0. So the zero_extend we were previously emitting was free, but we turned a -1 immediate that would fit in 8-bits into a 32-bit immediate so it was still bad. llvm-svn: 343871
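The decision described above boils down to a range check on the immediate. The following is a simplified sketch rather than the code in the X86 backend: both helper names are hypothetical, and the model ignores every other condition the real lowering considers.

```cpp
#include <cassert>
#include <cstdint>

// True if the value can be encoded as a sign-extended 8-bit immediate.
bool fitsInSigned8(int32_t imm) {
  return imm >= -128 && imm <= 127;
}

// Simplified model: promote an i16 compare to i32 only when the constant
// would otherwise need a 16-bit immediate.
bool shouldPromoteI16Compare(int16_t imm) {
  return !fitsInSigned8(imm);
}

void checkComparePromotionExample() {
  assert(!shouldPromoteI16Compare(-1));    // fits in 8 bits: keep the i16 compare
  assert(!shouldPromoteI16Compare(127));
  assert(shouldPromoteI16Compare(1000));   // needs 16 bits: promote
  assert(shouldPromoteI16Compare(-129));
}
```

The `-1` case corresponds to the movmskb example in the message, where promotion used to turn an 8-bit immediate into a 32-bit one.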
*	[X86] Move ReadAfterLd functionality into X86FoldableSchedWrite (PR36957)	Simon Pilgrim	2018-10-05	20	-477/+534
\| \| \| \| \| \| \| \| \| \| \| \|	Currently we hardcode instructions with ReadAfterLd if the register operands don't need to be available until the folded load has completed. This doesn't take into account the different load latencies of different memory operands (PR36957). This patch adds a ReadAfterFold def into X86FoldableSchedWrite to replace ReadAfterLd, allowing us to specify the load latency at a scheduler class level. I've added ReadAfterVec*Ld classes that match the XMM/Scl, XMM and YMM/ZMM WriteVecLoad classes that we currently use, we can tweak these values in future patches once this infrastructure is in place. Differential Revision: https://reviews.llvm.org/D52886 llvm-svn: 343868
*	[X86][AVX] getFauxShuffleMask - add support for INSERT_SUBVECTOR subvector shuffles	Simon Pilgrim	2018-10-05	1	-0/+36
\| \| \| \| \| \| \| \| \| \|	Decode subvector shuffles from INSERT_SUBVECTOR(SRC0, SHUFFLE(EXTRACT_SUBVECTOR(SRC1))). This was found necessary while investigating PR39161. llvm-svn: 343853
*	[TargetRegisterInfo] Remove temporary hook enableMultipleCopyHints()	Jonas Paulsson	2018-10-05	11	-21/+0
\| \| \| \| \| \| \| \| \| \| \| \|	Finally all targets are enabling multiple regalloc hints, so the hook to disable this can now be removed. NFC. Review: Simon Pilgrim https://reviews.llvm.org/D52316 llvm-svn: 343851
*	AMDGPU/GlobalISel: Add support for G_INTTOPTR	Tom Stellard	2018-10-05	3	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is a no-op. Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D52916 llvm-svn: 343839
*	[WebAssembly] Saturating arithmetic intrinsics	Thomas Lively	2018-10-05	3	-0/+45
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: Depends on D52805. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D52813 llvm-svn: 343833
*	[WebAssembly] Ignore DBG_VALUE in WebAssemblyCFGStackify pass when looking for block start	Yury Delendik	2018-10-04	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Fixes https://bugs.llvm.org/show_bug.cgi?id=39158 and a regression caused by D49034, though it is possible the problem existed before and was exposed by additional DBG_VALUEs. Reviewers: sunfish, dschuff, aheejin Reviewed By: aheejin Subscribers: sbc100, aheejin, llvm-commits, alexcrichton, jgravelle-google Differential Revision: https://reviews.llvm.org/D52837 llvm-svn: 343827
*	[RISCV] Support named operands for CSR instructions.	Ana Pazos	2018-10-04	17	-88/+637
\| \| \| \| \| \| \| \| \| \| \| \|	Reviewers: asb, mgrang Reviewed By: asb Subscribers: jocewei, mgorny, jfb, PkmX, MartinMosbeck, brucehoult, the_o, rkruppe, rogfer01, rbar, johnrusso, simoncook, jordy.potman.lists, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones Differential Revision: https://reviews.llvm.org/D46759 llvm-svn: 343822
*	[X86][LegalizeVectorOps] Use MERGE_VALUES to return two results from LowerLoad.	Craig Topper	2018-10-04	1	-11/+6
\| \| \| \| \| \| \| \| \| \|	Remove special case code in LegalizeVectorOps that allowed us to only return one result. Previously we replaced the chain use ourselves and returned the data result. LegalizeVectorOps then detected that we'd done this and assumed the chain had already been handled. This commit instead returns a MERGE_VALUES node with two results joined from nodes. This allows LegalizeVectorOps to do all the replacements for us without any special casing. The MERGE_VALUES will be removed by DAG combine. llvm-svn: 343817
*	[WebAssembly] Don't modify preds/succs iterators while erasing from them	Heejin Ahn	2018-10-04	1	-4/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This caused out-of-bound bugs. Found by `-DLLVM_ENABLE_EXPENSIVE_CHECKS=ON`. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D52902 llvm-svn: 343814
*	AMDGPU: Rename isAmdCodeObjectV2 -> isAmdHsaOrMesa	Konstantin Zhuravlyov	2018-10-04	5	-18/+16
\| \| \| \| \| \| \| \| \| \| \| \|	The isAmdCodeObjectV2 is a misleading name which actually checks whether the os is amdhsa or mesa. Also add a test to make sure we do not generate old kernel header for code object v3. Differential Revision: https://reviews.llvm.org/D52897 llvm-svn: 343813
*	[COFF] [X86] Don't use llvm_unreachable for unsupported relocation types	Martin Storsjo	2018-10-04	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This can happen if assembling a reference to _GLOBAL_OFFSET_TABLE_. While it doesn't make sense to try to assemble that for COFF, the fact that we previously used llvm_unreachable meant that the code had undefined behaviour if something tried to assemble that. The configure script of libgmp would try to assemble such a snippet (which should signal a failure). If llvm is built without assertions, the undefined behaviour meant a (near) infinite loop. Differential Revision: https://reviews.llvm.org/D52903 llvm-svn: 343811
*	AArch64: Fix XSeqPairs/WSeqPairs problems	Matthias Braun	2018-10-04	1	-18/+68
\| \| \| \| \| \| \| \| \| \|	- Fix spill/reloads of XSeqPairs failing with vregs (only physregs worked correctly) - Add missing spill/reload code for WSeqPairs class Differential Revision: https://reviews.llvm.org/D52761 llvm-svn: 343799
*	[AMDGPU] Match signed dot4/8 pattern.	Farhana Aleen	2018-10-04	1	-48/+59
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch matches signed dot4 and dot8 pattern. Author: FarhanaAleen Reviewed By: msearles Differential Revision: https://reviews.llvm.org/D52520 llvm-svn: 343798
*	[RISCV] Remove overzealous is64Bit checks	Alex Bradbury	2018-10-04	2	-4/+3
\| \| \| \| \| \| \| \|	lowerGlobalAddress, lowerBlockAddress, and insertIndirectBranch contain overzealous checks for is64Bit. These functions are all safe as-implemented for RV64. llvm-svn: 343781
*	[X86] Set correct MMO offset on scalarized load pieces	David Greene	2018-10-04	1	-3/+9
\| \| \| \| \| \| \|	When scalarizing a load, be sure to update the offset in the MachineMemOperand for each scalar load. llvm-svn: 343776
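The offset bookkeeping amounts to advancing the memory operand by one element size per scalar piece. The sketch below only illustrates that arithmetic; `scalarPieceOffsets` is a hypothetical helper, not an LLVM API, and it assumes the elements are laid out contiguously with no padding.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Byte offset each scalar load should record in its memory operand when a
// vector load at baseOffset is split into numElts pieces of eltBytes each.
std::vector<uint64_t> scalarPieceOffsets(uint64_t baseOffset, unsigned numElts,
                                         unsigned eltBytes) {
  std::vector<uint64_t> offsets;
  offsets.reserve(numElts);
  for (unsigned i = 0; i < numElts; ++i)
    offsets.push_back(baseOffset + static_cast<uint64_t>(i) * eltBytes);
  return offsets;
}

// Example: a 4 x i32 load at offset 16 becomes loads at 16, 20, 24 and 28.
void checkOffsetsExample() {
  const std::vector<uint64_t> offsets = scalarPieceOffsets(16, 4, 4);
  assert(offsets.size() == 4);
  assert(offsets[0] == 16 && offsets[1] == 20);
  assert(offsets[2] == 24 && offsets[3] == 28);
}
```

Reusing the base offset for every piece would make all four scalar loads appear to access the same bytes, which is the kind of inaccuracy the commit corrects.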
*	Fix MSVC "not all control paths return a value" warning. NFCI.	Simon Pilgrim	2018-10-04	1	-0/+1
\| \| \| \|	llvm-svn: 343765
*	[RISCV] Bugfix for floats passed on the stack with the ILP32 ABI on RV32F	Alex Bradbury	2018-10-04	1	-4/+7
\| \| \| \| \| \| \| \| \| \| \| \| \|	f32 values passed on the stack would previously cause an assertion in unpackFromMemLoc. This would only trigger in the presence of the F extension making f32 a legal type. Otherwise the f32 would be legalized. This patch fixes that by keeping LocVT=f32 when a float is passed on the stack. It also adds test coverage for this case, and tests that also demonstrate lw/sw/flw/fsw will be selected when most profitable, i.e. there is no unnecessary i32<->f32 conversion in registers. llvm-svn: 343756
*	[X86] Merge matchANDXORWithAllOnesAsANDNP into combineANDXORWithAllOnesIntoANDNP. NFCI	Craig Topper	2018-10-04	1	-25/+12
\| \| \| \| \| \| \| \|	It's the only caller and the logic is pretty easy to combine. llvm-svn: 343754
*	[RISCV][NFC] Fix naming of RISCVISelLowering::{LowerRETURNADDR,LowerFRAMEADDR}	Alex Bradbury	2018-10-04	2	-7/+7
\| \| \| \| \| \| \|	Rename to lowerRETURNADDR, lowerFRAMEADDR in order to be consistent with the LLVM coding style and the other functions in this file. llvm-svn: 343752
*	[RISCV] Handle redundant SplitF64+BuildPairF64 pairs in a DAGCombine	Alex Bradbury	2018-10-03	3	-12/+20
\| \| \| \| \| \| \| \|	r343712 performed this optimisation during instruction selection. As Eli Friedman pointed out in post-commit review, implementing this as a DAGCombine might allow opportunities for further optimisations. llvm-svn: 343741
*	[WebAssembly] Bitselect intrinsic and instruction	Thomas Lively	2018-10-03	3	-0/+31
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: Depends on D52755. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D52805 llvm-svn: 343739
*	[RISCV][NFC] Refactor LocVT<->ValVT conversion in RISCVISelLowering	Alex Bradbury	2018-10-03	1	-40/+33
\| \| \| \| \| \| \| \| \| \|	There was some duplicated logic for using the LocInfo of a CCValAssign in order to convert from the ValVT to LocVT or vice versa. Resolve this by factoring out convertLocVTFromValVT from unpackFromRegLoc. Also rename packIntoRegLoc to the more appropriate convertValVTToLocVT and call these helper functions consistently. llvm-svn: 343737
*	[WebAssembly] Refactor WasmSignature and use it for MCSymbolWasm	Derek Schuff	2018-10-03	10	-118/+106
\| \| \| \| \| \| \| \| \| \| \| \|	MCContext does not destroy MCSymbols on shutdown. So, rather than putting SmallVectors (which may heap-allocate) inside MCSymbolWasm, use unowned pointer to a WasmSignature instead. The signatures are now owned by the AsmPrinter. Also uses WasmSignature instead of param and result vectors in TargetStreamer, and leaves some TODOs for further simplification. Differential Revision: https://reviews.llvm.org/D52580 llvm-svn: 343733