bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	CodeGen: Introduce a class for registers	Matt Arsenault	2019-06-24	2	-2/+2
	Avoids using a plain unsigned for registers throughout codegen. Doesn't attempt to change every register use, just something a little more than the set needed to build after changing the return type of MachineOperand::getReg(). llvm-svn: 364191
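The idea of wrapping a register number in a small class instead of passing a bare `unsigned` can be sketched as below. `RegSketch`, `getRegSketch` and `legacyConsumer` are hypothetical names for illustration, not the class introduced by the commit. The implicit conversions and the "0 means no register" convention are assumptions about how such a wrapper could be adopted gradually.

```cpp
// Hypothetical stand-in for a register wrapper type; not LLVM's real class.
class RegSketch {
  unsigned Id;

public:
  // Implicit construction from unsigned keeps old call sites compiling.
  constexpr RegSketch(unsigned Val = 0) : Id(Val) {}

  // Implicit conversion back to unsigned for code that is not yet migrated.
  constexpr operator unsigned() const { return Id; }

  // Convention assumed in this sketch: 0 means "no register".
  constexpr bool isValid() const { return Id != 0; }
};

// An accessor that used to return unsigned can now return the wrapper...
inline RegSketch getRegSketch(unsigned RawOperandValue) {
  return RegSketch(RawOperandValue);
}

// ...while a legacy consumer that still takes unsigned keeps working.
inline unsigned legacyConsumer(unsigned Reg) { return Reg + 1; }
```

With conversions in both directions, only code that wants the stronger type has to change, which fits the commit's note that it does not try to update every register use.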
*	[DAGCombine] visitMUL - allow shift by zero in MulByConstant.	Simon Pilgrim	2019-06-24	1	-6/+6
	This can occur under certain circumstances when undefs are created later on in the constant multipliers (e.g. in this case due to SimplifyDemandedVectorElts). It's better to let the shift by zero occur and perform any cleanup afterward. Fixes OSS Fuzz #15429 llvm-svn: 364179
*	[SelectionDAG] Remove the code that attempts to calculate the alignment for ↵	Craig Topper	2019-06-23	2	-27/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	the second half of a split masked load/store. The code divides the alignment by 2 if the original alignment is equal to the original VT size. But this wouldn't be correct if the alignment was larger than the VT size. The memory operand object already takes care of calling MinAlign on the base alignment and the memory pointer offset. So we don't need any special code at all. llvm-svn: 364151
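The alignment reasoning in this message can be checked with a small standalone sketch. `minAlignSketch` re-implements the "largest power of two dividing both values" computation that the message attributes to MinAlign. `secondHalfAlign` and the byte sizes used below are hypothetical and chosen only for illustration.

```cpp
#include <cstdint>

// Largest power of two that divides both A and B (both assumed non-zero here).
// Standalone re-implementation for illustration only.
constexpr uint64_t minAlignSketch(uint64_t A, uint64_t B) {
  return (A | B) & (1 + ~(A | B));
}

// Alignment that can be assumed for the second half of a split access,
// given the base alignment and the byte offset of that half.
constexpr uint64_t secondHalfAlign(uint64_t BaseAlign, uint64_t HalfOffset) {
  return minAlignSketch(BaseAlign, HalfOffset);
}
```

For a 32-byte access split into two 16-byte halves, a base alignment of 32 gives 16 for the second half, which matches "divide by 2". A base alignment of 64 also gives 16, so special-casing only the case where the alignment equals the type size does not generalize.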
*	[DAGCombine] narrowExtractedVectorBinOp - pull out repeated getOpcode(). NFCI.	Simon Pilgrim	2019-06-21	1	-2/+2
\| \| \| \|	llvm-svn: 364076
*	[DAGCombine] narrowInsertExtractVectorBinOp - reuse "extract from insert" ↵	Simon Pilgrim	2019-06-21	1	-11/+15
\| \| \| \| \| \| \| \|	detection code. Move the "extract from insert detection code" into a lambda helper function. llvm-svn: 364059
*	[DAGCombiner] Use getAPIntValue() instead of getZExtValue() where possible.	Simon Pilgrim	2019-06-20	1	-21/+20
\| \| \| \| \| \|	Better handling of out-of-i64-range values due to large integer types or from fuzz tests. llvm-svn: 363955
*	[DAGCombiner][NFC] Remove unused var	Jordan Rupprecht	2019-06-20	1	-1/+0
\| \| \| \|	llvm-svn: 363954
*	[DAGCombiner] Support (shl (zext (srl x, C)), C) -> (zext (shl (srl x, C), ↵	Simon Pilgrim	2019-06-20	1	-17/+19
\| \| \| \| \| \| \| \|	C)) non-uniform folds. Use matchBinaryPredicate instead of isConstOrConstSplat to let us handle non-uniform shift cases. llvm-svn: 363929
*	[DAGCombine] Add TODOs for some combines that should support non-uniform vectors	Simon Pilgrim	2019-06-20	1	-0/+15
\| \| \| \| \| \|	We tend to only test for scalar/scalar consts when really we could support non-uniform vectors using ISD::matchUnaryPredicate/matchBinaryPredicate etc. llvm-svn: 363924
*	[DAGCombine] Reduce scope of ShAmtVal variable. NFCI.	Simon Pilgrim	2019-06-20	1	-2/+1
	Fixes cppcheck warning. Use the more capable getAPIntValue() instead of getZExtValue() as well since I'm here. llvm-svn: 363921
*	[DAGCombine] Use ConstantSDNode::getAPIntValue() instead of getZExtValue().	Simon Pilgrim	2019-06-19	1	-2/+2
\| \| \| \| \| \|	Use getAPIntValue() in a few more places. Most of the time getZExtValue() is fine, but occasionally there's fuzzed code or someone decides to create i65536 or something..... llvm-svn: 363887
*	[TargetLowering] SimplifyDemandedBits - add ANY_EXTEND_VECTOR_INREG support	Simon Pilgrim	2019-06-19	1	-11/+12
\| \| \| \| \| \|	Move 'lowest' demanded elt -> bitcast fold out of ZERO_EXTEND_VECTOR_INREG into ANY_EXTEND_VECTOR_INREG case. llvm-svn: 363856
*	[TargetLowering] SimplifyDemandedBits ZERO_EXTEND_VECTOR_INREG -> ↵	Simon Pilgrim	2019-06-19	1	-3/+4
\| \| \| \| \| \| \| \| \| \|	ANY_EXTEND_VECTOR_INREG Simplify ZERO_EXTEND_VECTOR_INREG if the extended bits are not required. Matches what we already do for ZERO_EXTEND. llvm-svn: 363850
*	[TargetLowering] SimplifyDemandedBits SIGN_EXTEND_VECTOR_INREG -> ↵	Simon Pilgrim	2019-06-19	1	-6/+10
\| \| \| \| \| \| \| \| \| \|	ANY/ZERO_EXTEND_VECTOR_INREG Simplify SIGN_EXTEND_VECTOR_INREG if the extended bits are not required/known zero. Matches what we already do for SIGN_EXTEND. llvm-svn: 363802
*	[DAGCombiner] Support (shl (ext (shl x, c1)), c2) -> (shl (ext x), (add c1, ↵	Simon Pilgrim	2019-06-19	1	-16/+15
\| \| \| \| \| \| \| \|	c2)) non-uniform folds. Use matchBinaryPredicate instead of isConstOrConstSplat to let us handle non-uniform shift cases. llvm-svn: 363793
*	[DAGCombiner] Support (shl (ext (shl x, c1)), c2) -> 0 non-uniform folds.	Simon Pilgrim	2019-06-19	2	-11/+27
\| \| \| \| \| \| \| \|	Use matchBinaryPredicate instead of isConstOrConstSplat to let us handle non-uniform shift cases. This requires us to tweak matchBinaryPredicate to allow it to (optionally) handle constants with different type widths. llvm-svn: 363792
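The two shift folds above can be sanity-checked on scalars. The sketch below models an i8 to i32 zero-extension. The precondition that the outer shift amount is at least the number of bits added by the extension (24 here) is an assumption of this sketch, chosen because it is sufficient for the rewrite to be exact. The log does not state the guard the combiner actually uses. All function names are hypothetical.

```cpp
#include <cstdint>

// Original form: shift in the narrow type, zero-extend, then shift again.
// Requires C1 < 8 and C2 < 32 so that every C++ shift is well defined.
inline uint32_t shlExtShl(uint8_t X, unsigned C1, unsigned C2) {
  // Bits shifted past bit 7 are dropped by the narrow type.
  uint8_t Inner = static_cast<uint8_t>(X << C1);
  return static_cast<uint32_t>(Inner) << C2;
}

// Folded form: extend first, then shift once by C1 + C2. When the combined
// amount reaches the wide bit width, the result is the constant 0.
inline uint32_t foldedShl(uint8_t X, unsigned C1, unsigned C2) {
  unsigned Sum = C1 + C2;
  if (Sum >= 32)
    return 0;
  return static_cast<uint32_t>(X) << Sum;
}

// Exhaustively compare both forms for every i8 value and every shift pair
// with C2 >= 24 (the number of bits added by the extension).
inline bool foldsAgreeForAllInputs() {
  for (unsigned X = 0; X < 256; ++X)
    for (unsigned C1 = 0; C1 < 8; ++C1)
      for (unsigned C2 = 24; C2 < 32; ++C2)
        if (shlExtShl(static_cast<uint8_t>(X), C1, C2) !=
            foldedShl(static_cast<uint8_t>(X), C1, C2))
          return false;
  return true;
}
```

Without that precondition the forms can differ: with X = 0x80, C1 = 1 and C2 = 1 the original form yields 0, while the folded form keeps the bit that the narrow shift discarded.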
*	[DAGCombiner] visitSHL - pull out repeated shift amount VT. NFCI.	Simon Pilgrim	2019-06-19	1	-6/+6
\| \| \| \|	llvm-svn: 363789
*	[DAGCombine] Fix (shl (ext (shl x, c1)), c2) -> (shl (ext x), (add c1, c2)) ↵	Simon Pilgrim	2019-06-19	1	-1/+2
\| \| \| \| \| \| \| \|	comment. NFCI. We pre-extend, not post. llvm-svn: 363787
*	Rename ExpandISelPseudo->FinalizeISel, delay register reservation	Matt Arsenault	2019-06-19	2	-1/+30
	This allows targets to make more decisions about reserved registers after isel. For example, it should now be certain whether there are calls or stack objects in the frame, which could have been introduced by legalization. Patch by Matthias Braun llvm-svn: 363757
*	[TargetLowering] SimplifyDemandedBits - Cleanup ANY_EXTEND handling	Simon Pilgrim	2019-06-18	1	-2/+8
\| \| \| \| \| \|	Match SIGN_EXTEND + ZERO_EXTEND handling - will be adding ANY_EXTEND_VECTOR_INREG support in a future patch. llvm-svn: 363716
*	[TargetLowering] SimplifyDemandedBits - Merge ↵	Simon Pilgrim	2019-06-18	1	-24/+16
\| \| \| \| \| \| \| \|	ZERO_EXTEND+ZERO_EXTEND_VECTOR_INREG handling Other than adding consistent demanded elts handling which was a trivial addition, the other differences in functionality will be added in later patches. llvm-svn: 363713
*	[TargetLowering] SimplifyDemandedBits - Merge ↵	Simon Pilgrim	2019-06-18	1	-25/+17
\| \| \| \| \| \| \| \|	SIGN_EXTEND+SIGN_EXTEND_VECTOR_INREG handling Other than adding consistent demanded elts handling which was a trivial addition, the other differences in functionality will be added in later patches. llvm-svn: 363710
*	[TargetLowering] SimplifyDemandedVectorElts - support MUL and ↵	Simon Pilgrim	2019-06-18	1	-0/+9
\| \| \| \| \| \| \| \| \| \|	ANY_EXTEND_VECTOR_INREG Also fold ANY_EXTEND_VECTOR_INREG -> BITCAST if we only need the bottom element. Fixes temporary regression introduced in rL363693. llvm-svn: 363694
*	[SelectionDAG] Legalize vaargs that require vector splitting	Simon Pilgrim	2019-06-18	2	-0/+24
\| \| \| \| \| \| \| \| \| \|	This adds vector splitting for vaarg instructions during type legalization Committed on behalf of @luke (Luke Lau) Differential Revision: https://reviews.llvm.org/D60762 llvm-svn: 363671
*	[DAGCombiner] [CodeGenPrepare] More comprehensive GEP splitting	Luis Marques	2019-06-17	1	-3/+63
	Some GEPs were not being split, presumably because that split would just be undone by the DAGCombiner. Not performing those splits can prevent important optimizations, such as (partially) folding the element indices / member offsets into load/store instruction immediates.
	This patch:
	- Makes the splits also occur in the cases where the base address and the GEP are in the same BB.
	- Ensures that the DAGCombiner doesn't reassociate them back again.
	Differential Revision: https://reviews.llvm.org/D60294 llvm-svn: 363544
*	[SelectionDAG] Fold insert_subvector(undef, extract_subvector(v, c), c) -> v ↵	Simon Pilgrim	2019-06-17	1	-0/+6
\| \| \| \| \| \| \| \|	in getNode This is already done in DAGCombiner::visitINSERT_SUBVECTOR, but this helps a number of shuffles across different vector widths recognise when they come from the same source. llvm-svn: 363542
*	adding more fmf propagation for selects plus updated tests	Michael Berg	2019-06-15	2	-20/+37
\| \| \| \|	llvm-svn: 363484
*	Revert "adding more fmf propagation for selects plus tests"	Fangrui Song	2019-06-15	2	-37/+20
\| \| \| \| \| \| \| \| \| \| \|	This reverts rL363474. -debug-only=isel was added to some tests that don't specify `REQUIRES: asserts`. This causes failures on -DLLVM_ENABLE_ASSERTIONS=off builds. I chose to revert instead of fixing the tests because I'm not sure whether we should add `REQUIRES: asserts` to more tests. llvm-svn: 363482
*	adding more fmf propagation for selects plus tests	Michael Berg	2019-06-14	2	-20/+37
\| \| \| \|	llvm-svn: 363474
*	[TargetLowering] Add MachineMemOperand::Flags to allowsMemoryAccess tests ↵	Simon Pilgrim	2019-06-12	2	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	(PR42123) As discussed on D62910, we need to check whether particular types of memory access are allowed, not just their alignment/address-space. This NFC patch adds a MachineMemOperand::Flags argument to allowsMemoryAccess and allowsMisalignedMemoryAccesses, and wires up calls to pass the relevant flags to them. If people are happy with this approach I can then update X86TargetLowering::allowsMisalignedMemoryAccesses to handle misaligned NT load/stores. Differential Revision: https://reviews.llvm.org/D63075 llvm-svn: 363179
*	[TargetLowering] Add allowsMemoryAccess(MachineMemOperand) helper wrapper. NFCI.	Simon Pilgrim	2019-06-11	2	-45/+39
\| \| \| \| \| \|	As suggested by @arsenm on D63075 - this adds a TargetLowering::allowsMemoryAccess wrapper that takes a Load/Store node's MachineMemOperand to handle the AddressSpace/Alignment arguments and will also implicitly handle the MachineMemOperand::Flags change in D63075. llvm-svn: 363048
*	[DAGCombine] GetNegatedExpression - constant float vector support (PR42105)	Simon Pilgrim	2019-06-11	1	-9/+40
\| \| \| \| \| \| \| \|	Add support for negation of constant build vectors. Differential Revision: https://reviews.llvm.org/D62963 llvm-svn: 363040
*	Change semantics of fadd/fmul vector reductions.	Sander de Smalen	2019-06-11	1	-8/+10
	This patch changes how LLVM handles the accumulator/start value in the reduction, by never ignoring it regardless of the presence of fast-math flags on callsites. This change introduces the following new intrinsics to replace the existing ones:
	  llvm.experimental.vector.reduce.fadd -> llvm.experimental.vector.reduce.v2.fadd
	  llvm.experimental.vector.reduce.fmul -> llvm.experimental.vector.reduce.v2.fmul
	and adds functionality to auto-upgrade existing LLVM IR and bitcode.
	Reviewers: RKSimon, greened, dmgreen, nikic, simoll, aemerson Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D60261 llvm-svn: 363035
*	[FastISel] Skip creating unnecessary vregs for arguments	Francis Visoiu Mistrih	2019-06-10	2	-33/+36
	This behavior was added in r130928 for both FastISel and SD, and then disabled in r131156 for FastISel. This re-enables it for FastISel with the corresponding fix. This is triggered only when FastISel can't lower the arguments and falls back to SelectionDAG for it.
	FastISel contains a map of "register fixups" where at the end of the selection phase it replaces all uses of a register with another register that FastISel sometimes pre-assigned. Code at the end of SelectionDAGISel::runOnMachineFunction is doing the replacement at the very end of the function, while other pieces that come in before that look through the MachineFunction and assume everything is done.
	In this case, the real issue is that the code emitting COPY instructions for the liveins (physreg to vreg) (EmitLiveInCopies) is checking if the vreg assigned to the physreg is used, and if it's not, it will skip the COPY. If a register wasn't replaced with its assigned fixup yet, the copy will be skipped and we'll end up with uses of undefined registers. This fix moves the replacement of registers before the emission of copies for the live-ins.
	The initial motivation for this fix is to enable tail calls for swiftself functions, which were blocked because we couldn't prove that the swiftself argument (which is callee-save) comes from a function argument (live-in), because there was an extra copy (vreg to vreg).
	A few tests are affected by this:
	* llvm/test/CodeGen/AArch64/swifterror.ll: we used to spill x21 (callee-save) but never reload it because it's attached to the return. We now don't even spill it anymore.
	* llvm/test/CodeGen//swiftself.ll: we tail-call now.
	* llvm/test/CodeGen/AMDGPU/mubuf-legalize-operands.ll: I believe this test was not really testing the right thing, but it worked because the same registers were re-used.
	* llvm/test/CodeGen/ARM/cmpxchg-O0.ll: regalloc changes
	* llvm/test/CodeGen/ARM/swifterror.ll: get rid of a copy
	* llvm/test/CodeGen/Mips/: get rid of spills and copies
	* llvm/test/CodeGen/SystemZ/swift-return.ll: smaller stack
	* llvm/test/CodeGen/X86/atomic-unordered.ll: smaller stack
	* llvm/test/CodeGen/X86/swifterror.ll: same as AArch64
	* llvm/test/DebugInfo/X86/dbg-declare-arg.ll: stack size changed
	Differential Revision: https://reviews.llvm.org/D62361 llvm-svn: 362963
*	[DAGCombine] Match a pattern where a wide type scalar value is stored by ↵	QingShan Zhang	2019-06-10	1	-0/+180
	several narrow stores
	This opportunity is found from spec 2017 557.xz_r. And it is used by the sha encrypt/decrypt. See sha-2/sha512.c
	    static void store64(u64 x, unsigned char* y) {
	      for(int i = 0; i != 8; ++i)
	        y[i] = (x >> ((7-i) * 8)) & 255;
	    }
	    static u64 load64(const unsigned char* y) {
	      u64 res = 0;
	      for(int i = 0; i != 8; ++i)
	        res |= (u64)(y[i]) << ((7-i) * 8);
	      return res;
	    }
	The load64 has been implemented by https://reviews.llvm.org/D26149
	This patch is trying to implement the store pattern. Match a pattern where a wide type scalar value is stored by several narrow stores. Fold it into a single store or a BSWAP and a store if the target supports it.
	Assuming little endian target:
	    i8 *p = ...
	    i32 val = ...
	    p[0] = (val >> 0) & 0xFF;
	    p[1] = (val >> 8) & 0xFF;
	    p[2] = (val >> 16) & 0xFF;
	    p[3] = (val >> 24) & 0xFF;
	    => *((i32)p) = val;
	    i8 *p = ...
	    i32 val = ...
	    p[0] = (val >> 24) & 0xFF;
	    p[1] = (val >> 16) & 0xFF;
	    p[2] = (val >> 8) & 0xFF;
	    p[3] = (val >> 0) & 0xFF;
	    => *((i32)p) = BSWAP(val);
	Differential Revision: https://reviews.llvm.org/D62897 llvm-svn: 362921
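The equivalence this combine relies on can be demonstrated in plain C++. This is a minimal sketch with hypothetical function names. It uses `std::memcpy` for the wide store and detects host endianness at run time, so it does not assume a little-endian machine the way the message's example does.

```cpp
#include <cstdint>
#include <cstring>

// Four narrow stores, least significant byte first.
inline void storeBytesLowFirst(uint32_t Val, unsigned char *P) {
  for (int I = 0; I != 4; ++I)
    P[I] = (Val >> (8 * I)) & 0xFF;
}

// Four narrow stores, most significant byte first.
inline void storeBytesHighFirst(uint32_t Val, unsigned char *P) {
  for (int I = 0; I != 4; ++I)
    P[I] = (Val >> (8 * (3 - I))) & 0xFF;
}

// Portable 32-bit byte swap.
inline uint32_t byteSwap32(uint32_t V) {
  return (V >> 24) | ((V >> 8) & 0x0000FF00u) | ((V << 8) & 0x00FF0000u) |
         (V << 24);
}

// True when the lowest-addressed byte of a uint32_t holds its low 8 bits.
inline bool hostIsLittleEndian() {
  const uint32_t One = 1;
  unsigned char First;
  std::memcpy(&First, &One, 1);
  return First == 1;
}

// Single wide store equivalent to storeBytesLowFirst.
inline void storeWideLowFirst(uint32_t Val, unsigned char *P) {
  uint32_t V = hostIsLittleEndian() ? Val : byteSwap32(Val);
  std::memcpy(P, &V, sizeof V);
}

// Single wide store equivalent to storeBytesHighFirst (byte swap + store on
// a little-endian host).
inline void storeWideHighFirst(uint32_t Val, unsigned char *P) {
  uint32_t V = hostIsLittleEndian() ? byteSwap32(Val) : Val;
  std::memcpy(P, &V, sizeof V);
}

// Check that narrow and wide variants write identical bytes.
inline bool sameBytes(uint32_t Val) {
  unsigned char A[4], B[4], C[4], D[4];
  storeBytesLowFirst(Val, A);
  storeWideLowFirst(Val, B);
  storeBytesHighFirst(Val, C);
  storeWideHighFirst(Val, D);
  return std::memcmp(A, B, 4) == 0 && std::memcmp(C, D, 4) == 0;
}
```

On a little-endian host the low-first variant becomes a plain store and the high-first variant a byte swap followed by a store, matching the two cases in the message.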
*	[TargetLowering] Simplify (ctpop x) == 1	David Bolvansky	2019-06-09	1	-1/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reviewers: craig.topper, spatel, RKSimon, bkramer Reviewed By: spatel Subscribers: javed.absar, lebedev.ri, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63004 llvm-svn: 362912
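The log gives only the title of this simplification. One well-known population-count-free equivalent of `ctpop(x) == 1` is "x is non-zero and clearing its lowest set bit leaves zero". The sketch below checks that equivalence. Whether the patch emits exactly this form is an assumption, and the function names are hypothetical.

```cpp
#include <cstdint>

// Reference population count: clear one set bit per iteration.
inline unsigned popCountLoop(uint32_t X) {
  unsigned N = 0;
  for (; X != 0; X &= X - 1)
    ++N;
  return N;
}

// The comparison as written in the commit title.
inline bool hasOneBitViaCtpop(uint32_t X) { return popCountLoop(X) == 1; }

// Equivalent test that needs no population count.
inline bool hasOneBitNoCtpop(uint32_t X) {
  return X != 0 && (X & (X - 1)) == 0;
}

// Compare both predicates for every value below Limit.
inline bool formsAgree(uint32_t Limit) {
  for (uint32_t X = 0; X < Limit; ++X)
    if (hasOneBitViaCtpop(X) != hasOneBitNoCtpop(X))
      return false;
  return true;
}
```

The rewritten form is mainly attractive on targets without a native population-count instruction, where expanding ctpop is comparatively expensive.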
*	Use for-range loop. NFCI.	Simon Pilgrim	2019-06-09	1	-3/+1
\| \| \| \|	llvm-svn: 362897
*	[DAGCombine] visitAND - merge (zext_inreg ((s)extload x)) -> (zextload x) ↵	Simon Pilgrim	2019-06-08	1	-21/+4
\| \| \| \| \| \| \| \|	combines. NFCI. Same codegen, only differ by the oneuse limit for the sextload case. llvm-svn: 362880
*	Factor out SelectionDAG's switch analysis and lowering into a separate ↵	Amara Emerson	2019-06-08	3	-767/+86
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	component. In order for GlobalISel to re-use the significant amount of analysis and optimization code in SDAG's switch lowering, we first have to extract it and create an interface to be used by both frameworks. No test changes as it's NFC. Differential Revision: https://reviews.llvm.org/D62745 llvm-svn: 362857
*	[DAGCombine] visitAND - fix local shadow variable warnings. NFCI.	Simon Pilgrim	2019-06-07	1	-24/+24
\| \| \| \|	llvm-svn: 362825
*	[DAGCombine] Use APInt::extractBits in "sub-splat" constant mask detection. ↵	Simon Pilgrim	2019-06-07	1	-3/+3
\| \| \| \| \| \|	NFCI. llvm-svn: 362820
*	[AIX] Implement function descriptor on SDAG	Jason Liu	2019-06-06	1	-0/+7
	Summary:
	(1) Function descriptor on AIX
	On AIX, a called routine may have 2 distinct symbols associated with it:
	* A function descriptor (Name)
	* A function entry point (.Name)
	The descriptor structure on AIX is the same as those in the ELF V1 ABI:
	* The address of the entry point of the function.
	* The TOC base address for the function.
	* The environment pointer.
	The descriptor symbol uses the same name as the source level function in C. The function entry point is analogous to the symbol we would generate for a function in a non-descriptor-based ABI, except that it is renamed by prepending a ".".
	Which symbol gets referenced depends on the context:
	* Taking the address of the function references the descriptor symbol.
	* Calling the function references the entry point symbol.
	(2) As for the implementation on AIX, for a direct function call target we create a proper MCSymbol SDNode (e.g. ".foo") while constructing the SDAG to replace the original TargetGlobalAddress SDNode. Then down the path, we can take advantage of this MCSymbol.
	Patch by: Xiangling_L Reviewed by: sfertile, hubert.reinterpretcast, jasonliu, syzaara Differential Revision: https://reviews.llvm.org/D62532 llvm-svn: 362735
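The three-word descriptor and the naming rule from this message can be pictured with a short sketch. The struct and field names are hypothetical and `void *` is used only to show the layout. Pointer width and any padding on a real AIX target are outside what the log states.

```cpp
#include <cstddef>
#include <string>

// Hypothetical layout sketch; fields appear in the order the message lists.
struct FunctionDescriptorSketch {
  void *EntryPoint;  // address of the entry point of the function (".Name")
  void *TocBase;     // TOC base address for the function
  void *Environment; // environment pointer
};

// The entry point symbol is the source-level name with "." prepended.
inline std::string entryPointSymbol(const std::string &Name) {
  return "." + Name;
}

// Symbol referenced in each context, following the rule in the message:
// taking the address uses the descriptor, calling uses the entry point.
inline std::string symbolFor(const std::string &Name, bool IsCall) {
  return IsCall ? entryPointSymbol(Name) : Name;
}
```

For a function `foo`, a direct call would reference `.foo` while `&foo` would reference the descriptor symbol `foo`.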
*	[DAGCombine] MergeConsecutiveStores - improve non-temporal load\store ↵	Simon Pilgrim	2019-06-06	1	-7/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	handling (PR42123) This patch is the first step towards ensuring MergeConsecutiveStores correctly handles non-temporal loads\stores: 1 - When merging load\stores we must ensure that they all have the same non-temporal flag. This is unlikely to occur, but can in strange cases where we're storing at the end of one page and the beginning of another. 2 - The merged load\store node must retain the non-temporal flag. Differential Revision: https://reviews.llvm.org/D62910 llvm-svn: 362723
*	[DAGCombine] Cleanup isNegatibleForFree/GetNegatedExpression. NFCI.	Simon Pilgrim	2019-06-06	1	-20/+21
\| \| \| \| \| \|	Prep work for PR42105 - clang-format, use auto for cast and merge nested if()s llvm-svn: 362695
*	Allow target to handle STRICT floating-point nodes	Ulrich Weigand	2019-06-05	3	-10/+18
	The ISD::STRICT_ nodes used to implement the constrained floating-point intrinsics are currently never passed to the target back-end, which makes it impossible to handle them correctly (e.g. mark instructions as depending on a floating-point status and control register, or mark instructions as possibly trapping).
	This patch allows the target to use setOperationAction to switch the action on ISD::STRICT_ nodes to Legal. If this is done, the SelectionDAG common code will stop converting the STRICT nodes to regular floating-point nodes, but instead pass the STRICT nodes to the target using normal SelectionDAG matching rules.
	To avoid having the back-end duplicate all the floating-point instruction patterns to handle both strict and non-strict variants, we make the MI codegen explicitly aware of the floating-point exceptions by introducing two new concepts:
	- A new MCID flag "mayRaiseFPException" that the target should set on any instruction that can possibly raise an FP exception according to the architecture definition.
	- A new MI flag FPExcept that CodeGen/SelectionDAG will set on any MI instruction resulting from expansion of any constrained FP intrinsic.
	Any MI instruction that is both marked as mayRaiseFPException and FPExcept then needs to be considered as raising exceptions by MI-level codegen (e.g. scheduling).
	Setting those two new flags is straightforward. The mayRaiseFPException flag is simply set via TableGen by marking all relevant instruction patterns in the .td files. The FPExcept flag is set in SDNodeFlags when creating the STRICT_ nodes in the SelectionDAG, and gets inherited in the MachineSDNode nodes created from it during instruction selection. The flag is then transferred to an MIFlag when creating the MI from the MachineSDNode. This is handled just like fast-math flags like no-nans are handled today.
	This patch includes both common code changes required to implement the new features, and the SystemZ implementation. Reviewed By: andrew.w.kaylor Differential Revision: https://reviews.llvm.org/D55506 llvm-svn: 362663
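The rule for combining the two flags can be written as a small predicate. This is a toy model with hypothetical type and function names, not LLVM's MachineInstr API. It only restates the "both flags must be set" condition from the message.

```cpp
// Toy model of the two pieces of information described in the message.
struct InstrSketch {
  // Static property of the opcode, set by the target in its instruction
  // description when the instruction can raise an FP exception.
  bool MayRaiseFPException;
  // Per-instruction flag, set when the instruction results from expanding a
  // constrained FP intrinsic.
  bool FPExcept;
};

// MI-level passes such as scheduling would treat the instruction as raising
// exceptions only when both flags are present.
constexpr bool mustTreatAsRaisingFPException(const InstrSketch &I) {
  return I.MayRaiseFPException && I.FPExcept;
}
```

Splitting the information this way lets one instruction pattern serve both strict and non-strict code, because the same opcode is constrained only when the per-instruction flag is also set.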
*	IR: make getParamByValType Just Work. NFC.	Tim Northover	2019-06-05	2	-3/+4
\| \| \| \| \| \| \| \| \| \| \|	Most parts of LLVM don't care whether the byval type is derived from an explicit Attribute or from the parameter's pointee type, so it makes sense for the main access function to just return the right value. The very few users who do care (only BitcodeReader so far) can find out how it's specified by accessing the Attribute directly. llvm-svn: 362642
*	Fix shadow local variable warning. NFCI.	Simon Pilgrim	2019-06-05	1	-6/+6
\| \| \| \|	llvm-svn: 362622
*	[TargetLowering] SimplifyDemandedBits - pull out shift value type. NFCI.	Simon Pilgrim	2019-06-05	1	-1/+2
\| \| \| \| \| \|	Will be used more in an upcoming patch. llvm-svn: 362595
*	[SelectionDAG][FIX] Allow "returned" arguments to be bit-casted	Johannes Doerfert	2019-06-04	1	-2/+5
	Summary: An argument that is returned by a function but bitcast beforehand can still be annotated as "returned". Make sure we do not crash in this case. Reviewers: sunfish, stephenwlin, niravd, arsenm Subscribers: wdng, hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59917 llvm-svn: 362546
*	Revert r362472 as it is breaking PPC build bots	Nemanja Ivanovic	2019-06-04	1	-179/+0
\| \| \| \| \| \| \|	The patch https://reviews.llvm.org/rL362472 broke PPC LNT buildbots. Reverting it to bring the bots back to green. llvm-svn: 362539