path: root/llvm/test/CodeGen/AArch64
Commit message | Author | Age | Files | Lines (-deleted/+added)
...
* [MachineOutliner] Disable outlining from noreturn functions
  Jessica Paquette | 2019-10-04 | 1 file changed, -0/+56

  Outlining from noreturn functions doesn't do the correct thing right now. The outliner should respect that the caller is marked noreturn. In the event that we have a noreturn function, and the outlined code is in tail position, the outliner will not see that the outlined function should be tail called. As a result, you end up with a regular call containing a return.

  Fixing this requires that we check that all candidates live inside noreturn functions. So, for the sake of correctness, don't outline from noreturn functions right now.

  Add machine-outliner-noreturn.mir to test this.

  llvm-svn: 373791
* [AArch64][SVE] Move the testcase into CodeGen dir
  Jinsong Ji | 2019-10-03 | 1 file changed, -0/+25

  https://reviews.llvm.org/rL373600 added an AArch64 testcase in the top-level dir; it should be moved to the CodeGen dir.

  llvm-svn: 373657
* [AArch64] Static (de)allocation of SVE stack objects.
  Sander de Smalen | 2019-10-03 | 1 file changed, -0/+121

  Adds support to AArch64FrameLowering to allocate fixed-stack SVE objects. The focus of this patch is purely to allow the stack frame to allocate/deallocate space for scalable SVE objects. More dynamic allocation (at compile-time, i.e. determining placement of SVE objects on the stack), or resolving frame-index references that include scalable-sized offsets, are left for subsequent patches.

  SVE objects are allocated in the stack frame as a separate region below the callee-save area, and above the alignment gap. This is done so that the SVE objects can be accessed directly from the FP at (runtime) VL-based offsets to benefit from using the VL-scaled addressing modes.

  The layout looks as follows:

       +-------------+
       | stack arg   |
       +-------------+
       | Callee Saves|
       |  X29, X30   | (if available)
       |-------------| <- FP (if available)
       |      :      |
       |  SVE area   |
       |      :      |
       +-------------+
       |/////////////| alignment gap.
       |      :      |
       | Stack objs  |
       |      :      |
       +-------------+ <- SP after call and frame-setup

  SVE and non-SVE stack objects are distinguished using different StackIDs. The offsets for objects with TargetStackID::SVEVector should be interpreted as purely scalable offsets within their respective SVE region.

  Reviewers: thegameg, rovka, t.p.northover, efriedma, rengolin, greened
  Reviewed By: efriedma
  Differential Revision: https://reviews.llvm.org/D61437

  llvm-svn: 373585
* [AArch64][SVE] Implement int_aarch64_sve_cnt intrinsic
  Kerry McLaughlin | 2019-10-02 | 1 file changed, -0/+83

  Summary: This patch includes tests for the VecOfBitcastsToInt type added by D68021

  Reviewers: c-rhodes, sdesmalen, rovka
  Reviewed By: c-rhodes
  Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits, cfe-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D68023

  llvm-svn: 373468
* [Dominators][CodeGen] Don't mark MachineDominatorTree as preserved in MachineLICM
  Jakub Kuderski | 2019-10-01 | 1 file changed, -0/+1

  llvm-svn: 373378
* [Dominators][CodeGen] Add MachinePostDominatorTree verification
  Jakub Kuderski | 2019-10-01 | 1 file changed, -0/+1

  Summary: This patch implements Machine PostDominator Tree verification and ensures that the verification doesn't fail the in-tree tests. MPDT verification can be enabled using `verify-machine-dom-info` -- the same flag used by Machine Dominator Tree verification. Flipping the flag revealed that MachineSink falsely claimed to preserve CFG and MDT/MPDT. This patch fixes that.

  Reviewers: arsenm, hliao, rampitec, vpykhtin, grosser
  Reviewed By: hliao
  Subscribers: wdng, hiraditya, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D68235

  llvm-svn: 373341
* GlobalISel: Implement widenScalar for G_SITOFP/G_UITOFP sources
  Matt Arsenault | 2019-10-01 | 1 file changed, -3/+3

  Legalize 16-bit G_SITOFP/G_UITOFP for AMDGPU.

  llvm-svn: 373287
* [AArch64][SVE] Implement punpk[hi|lo] intrinsics
  Kerry McLaughlin | 2019-09-30 | 1 file changed, -0/+65

  Summary: Adds the following two intrinsics:
  - int_aarch64_sve_punpkhi
  - int_aarch64_sve_punpklo

  This patch also contains a fix which allows LLVMHalfElementsVectorType to forward reference overloadable arguments.

  Reviewers: sdesmalen, rovka, rengolin
  Reviewed By: sdesmalen
  Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, greened, cfe-commits, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D67830

  llvm-svn: 373232
* [AArch64][GlobalISel] Support lowering variadic musttail calls
  Jessica Paquette | 2019-09-30 | 1 file changed, -0/+223

  This adds support for lowering variadic musttail calls. To do this, we have to...
  - Detect a musttail call in a variadic function before attempting to lower the call's formal arguments. This is done in the IRTranslator.
  - Compute forwarded registers in `lowerFormalArguments`, and add copies for those registers.
  - Restore the forwarded registers in `lowerTailCall`.

  Because there doesn't seem to be any nice way to wrap these up into the outgoing argument handler, the restore code in `lowerTailCall` is done separately.

  Also, irritatingly, you have to make sure that the registers don't overlap with any passed parameters. Otherwise, the scheduler doesn't know what to do with the extra copies and asserts.

  Add call-translator-variadic-musttail.ll to test this. This is pretty much the same as the X86 musttail-varargs.ll test. We didn't have as nice of a test to base this off of, but the idea is the same.

  Differential Revision: https://reviews.llvm.org/D68043

  llvm-svn: 373226
* [TargetLowering] Simplify expansion of S{ADD,SUB}O
  Roger Ferrer Ibanez | 2019-09-30 | 4 files changed, -690/+312

  ISD::SADDO uses the suggested sequence described in section 2.4 of the RISCV Spec v2.2. ISD::SSUBO uses the dual approach, but checks for (non-zero) positive instead.

  Differential Revision: https://reviews.llvm.org/D47927

  llvm-svn: 373187
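  A minimal IR sketch (not one of the updated test files) of the kind of input that reaches the ISD::SADDO expansion; the comment paraphrases the branch-free check suggested by the RISC-V spec:

  ```
  declare { i32, i1 } @llvm.sadd.with.overflow.i32(i32, i32)

  define i1 @saddo_overflows(i32 %a, i32 %b) {
    ; Legalized through ISD::SADDO. The expanded check is roughly:
    ;   %sum = %a + %b;  overflow iff (%b < 0) != (%sum < %a)   (signed compares)
    %res = call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
    %ov = extractvalue { i32, i1 } %res, 1
    ret i1 %ov
  }
  ```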
* [GlobalISel] Enable memcpy inlining with optsize.
  Amara Emerson | 2019-09-28 | 1 file changed, -0/+92

  We should be disabling inline for minsize, not optsize.

  llvm-svn: 373143
* Add an operand to memory intrinsics to denote the "tail" marker.
  Amara Emerson | 2019-09-28 | 7 files changed, -56/+106

  We need to propagate this information from the IR in order to be able to safely do tail call optimizations on the intrinsics during legalization. Assuming it's safe to do tail call opt without checking for the marker isn't safe because the mem libcall may use allocas from the caller.

  This adds an extra immediate operand to the end of the intrinsics and fixes the legalizer to handle it.

  Differential Revision: https://reviews.llvm.org/D68151

  llvm-svn: 373140
* Revert XFAIL a codegen test AArch64/tailmerging_in_mbp.ll
  Jakub Kuderski | 2019-09-27 | 1 file changed, -1/+0

  This reverts r373103 (git commit a524e630a793e18e7d5fabc2262781f310eb0279)

  llvm-svn: 373116
* XFAIL a codegen test AArch64/tailmerging_in_mbp.ll
  Jakub Kuderski | 2019-09-27 | 1 file changed, -0/+1

  This test fails when the machine dominator tree verifier is run. Needs more investigation, as this is not a new failure.

  llvm-svn: 373103
* Revert r372893 "[CodeGen] Replace -max-jump-table-size with -max-jump-table-targets"
  Hans Wennborg | 2019-09-27 | 1 file changed, -23/+23

  This caused severe compile-time regressions, see PR43455.

  > Modern processors predict the targets of an indirect branch regardless of
  > the size of any jump table used to glean its target address. Moreover,
  > branch predictors typically use resources limited by the number of actual
  > targets that occur at run time.
  >
  > This patch changes the semantics of the option `-max-jump-table-size` to limit
  > the number of different targets instead of the number of entries in a jump
  > table. Thus, it is now renamed to `-max-jump-table-targets`.
  >
  > Before, when `-max-jump-table-size` was specified, it could happen that
  > cluster jump tables could have targets used repeatedly, but each one was
  > counted and typically resulted in tables with the same number of entries.
  > With this patch, when specifying `-max-jump-table-targets`, tables may have
  > different lengths, since the number of unique targets is counted towards the
  > limit, but the number of unique targets in tables is the same, but for the
  > last one containing the balance of targets.
  >
  > Differential revision: https://reviews.llvm.org/D60295

  llvm-svn: 373060
* hwasan: Compatibility fixes for short granules.
  Peter Collingbourne | 2019-09-27 | 1 file changed, -30/+20

  We can't use short granules with stack instrumentation when targeting older API levels because the rest of the system won't understand the short granule tags stored in shadow memory.

  Moreover, we need to be able to let old binaries (which won't understand short granule tags) run on a new system that supports short granule tags. Such binaries will call the __hwasan_tag_mismatch function when their outlined checks fail. We can compensate for the binary's lack of support for short granules by implementing the short granule part of the check in the __hwasan_tag_mismatch function. Unfortunately we can't do anything about inline checks, but I don't believe that we can generate these by default on aarch64, nor did we do so when the ABI was fixed.

  A new function, __hwasan_tag_mismatch_v2, is introduced that lets code targeting the new runtime avoid redoing the short granule check. Because tag mismatches are rare this isn't important from a performance perspective; the main benefit is that it introduces a symbol dependency that prevents binaries targeting the new runtime from running on older (i.e. incompatible) runtimes.

  Differential Revision: https://reviews.llvm.org/D68059

  llvm-svn: 373035
* [DAGCombine][X86][AArch64][NFC] Add tests for shift-by-signext
  Roman Lebedev | 2019-09-26 | 1 file changed, -0/+122

  llvm-svn: 373014
* [Verifier] add invariant check for callbr
  Nick Desaulniers | 2019-09-25 | 1 file changed, -14/+14

  Summary: The list of indirect labels should ALWAYS have their blockaddresses as argument operands to the callbr (but not necessarily the other way around). Add an invariant that checks this.

  The verifier catches a bad test case that was added recently in r368478. I think that was a simple mistake, and the test was made less strict in regards to the precise addresses (as those weren't specifically the point of the test).

  This invariant will be used to find a reported bug.

  Link: https://www.spinics.net/lists/arm-kernel/msg753473.html
  Link: https://github.com/ClangBuiltLinux/linux/issues/649

  Reviewers: craig.topper, void, chandlerc
  Reviewed By: void
  Subscribers: ychen, lebedev.ri, javed.absar, kristof.beyls, hiraditya, llvm-commits, srhines
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D67196

  llvm-svn: 372923
* [AArch64][GlobalISel] Choose CCAssignFns per-argument for tail call lowering
  Jessica Paquette | 2019-09-25 | 1 file changed, -16/+61

  When checking for tail call eligibility, we should use the correct CCAssignFn for each argument, rather than just checking if the caller/callee is varargs or not. This is important for tail call lowering with varargs. If we don't check it, then basically any varargs callee with parameters cannot be tail called on Darwin, for one thing. If the parameters are all guaranteed to be in registers, this should be entirely safe.

  On top of that, not checking for this could potentially make it so that we have the wrong stack offsets when checking for tail call eligibility.

  Also refactor some of the stuff for CCAssignFnForCall and pull it out into a helper function.

  Update call-translator-tail-call.ll to show that we can now correctly tail call on Darwin. Also add two extra tail call checks. The first verifies that we still respect the caller's stack size, and the second verifies that we still don't tail call when a varargs function has a memory argument.

  Differential Revision: https://reviews.llvm.org/D67939

  llvm-svn: 372897
* [CodeGen] Replace -max-jump-table-size with -max-jump-table-targets
  Evandro Menezes | 2019-09-25 | 1 file changed, -23/+23

  Modern processors predict the targets of an indirect branch regardless of the size of any jump table used to glean its target address. Moreover, branch predictors typically use resources limited by the number of actual targets that occur at run time.

  This patch changes the semantics of the option `-max-jump-table-size` to limit the number of different targets instead of the number of entries in a jump table. Thus, it is now renamed to `-max-jump-table-targets`.

  Before, when `-max-jump-table-size` was specified, it could happen that cluster jump tables could have targets used repeatedly, but each one was counted and typically resulted in tables with the same number of entries. With this patch, when specifying `-max-jump-table-targets`, tables may have different lengths, since the number of unique targets is counted towards the limit, but the number of unique targets in tables is the same, but for the last one containing the balance of targets.

  Differential revision: https://reviews.llvm.org/D60295

  llvm-svn: 372893
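  As a usage illustration only (the entry above shows r372893 was later reverted, so the option is only present with this patch applied; the triple and the threshold of 4 are assumptions, not taken from the commit), a test along these lines would cap generated jump tables at four distinct targets:

  ```
  ; RUN: llc -mtriple=aarch64-linux-gnu -max-jump-table-targets=4 -o - %s
  define void @dispatch(i32 %x) {
  entry:
    ; Six case values but only three distinct targets (%a, %b, %c), so this
    ; switch stays under a limit of four unique targets.
    switch i32 %x, label %exit [
      i32 0, label %a
      i32 1, label %b
      i32 2, label %c
      i32 3, label %a
      i32 4, label %b
      i32 5, label %c
    ]
  a:
    br label %exit
  b:
    br label %exit
  c:
    br label %exit
  exit:
    ret void
  }
  ```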
* [AArch64] Convert neon_ushl and neon_sshl with positive constants to VSHL.
  Florian Hahn | 2019-09-25 | 1 file changed, -26/+45

  I think we should be able to use shl instead of sshl and ushl for positive constant shift values, unless I am missing something. We already have the machinery in place to ensure we only replace nodes, if the shift value is positive and <= the element width.

  This is a generalization of an earlier patch rL372565.

  Reviewers: t.p.northover, samparker, dmgreen, anemet
  Reviewed By: anemet
  Differential Revision: https://reviews.llvm.org/D67955

  llvm-svn: 372824
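  A minimal IR sketch of the shape being combined (illustrative only, not one of the test cases added by the commit): a NEON shift intrinsic whose shift amounts are all the same small positive constant, which can now select a plain vector left shift instead of USHL:

  ```
  declare <4 x i32> @llvm.aarch64.neon.ushl.v4i32(<4 x i32>, <4 x i32>)

  define <4 x i32> @ushl_by_3(<4 x i32> %a) {
    ; Every lane shifts left by the positive constant 3, which is within the
    ; element width, so a plain shift instruction suffices.
    %r = call <4 x i32> @llvm.aarch64.neon.ushl.v4i32(<4 x i32> %a, <4 x i32> <i32 3, i32 3, i32 3, i32 3>)
    ret <4 x i32> %r
  }
  ```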
* [AArch64][GlobalISel] Tweak legalization rule for G_BSWAP to handle widening s16.
  Amara Emerson | 2019-09-25 | 1 file changed, -0/+44

  llvm-svn: 372812
* [GlobalISel][IRTranslator] Fix switch table lowering to use signed LE not unsigned.
  Amara Emerson | 2019-09-24 | 1 file changed, -0/+42

  We were miscompiling switch value comparisons with the wrong signedness, which shows up when we have things like switch case values with i1 types, which end up being legalized incorrectly.

  Fixes PR43383

  llvm-svn: 372675
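  A minimal sketch of the class of input affected (an illustration, not the exact reproducer from PR43383): a switch whose case value has type i1, where the signedness chosen for the lowered comparison matters once the value is widened:

  ```
  define i32 @switch_on_i1(i1 %c) {
  entry:
    ; The case value is i1 1; a signed interpretation widens it to -1 and an
    ; unsigned one to 1, so the wrong choice selects the wrong successor.
    switch i1 %c, label %false_bb [
      i1 1, label %true_bb
    ]
  true_bb:
    ret i32 1
  false_bb:
    ret i32 0
  }
  ```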
* [AArch64] support neon_sshl and neon_ushl in performIntrinsicCombine.
  Florian Hahn | 2019-09-23 | 1 file changed, -6/+184

  Try to generate ushll/sshll for aarch64_neon_ushl/aarch64_neon_sshl, if their first operand is extended and the second operand is a constant.

  Also adds a few tests marked with FIXME, where we can further improve codegen.

  Reviewers: t.p.northover, samparker, dmgreen, anemet
  Reviewed By: anemet
  Differential Revision: https://reviews.llvm.org/D62308

  llvm-svn: 372565
* [AArch64][GlobalISel] Implement selection for G_SHL of <2 x i64>
  Amara Emerson | 2019-09-21 | 1 file changed, -0/+29

  Simple continuation of existing selection support.

  llvm-svn: 372467
* [AArch64][GlobalISel] Selection support for G_ASHR of <2 x s64>
  Amara Emerson | 2019-09-21 | 1 file changed, -0/+30

  Just add an extra case to the existing selection logic.

  llvm-svn: 372466
* [AArch64][GlobalISel] Make <4 x s32> G_ASHR and G_LSHR legal.
  Amara Emerson | 2019-09-21 | 1 file changed, -0/+78

  llvm-svn: 372465
* [GlobalISel] Defer setting HasCalls on MachineFrameInfo to selection time.
  Amara Emerson | 2019-09-20 | 2 files changed, -1/+20

  We currently always set the HasCalls flag on MFI during translation and legalization if we're handling a call or legalizing to a libcall. However, if that call is later optimized to a tail call then we don't need the flag. The flag being set to true causes frame lowering to always save and restore FP/LR, which adds unnecessary code.

  This change does the same thing as SelectionDAG and ports over some code that scans instructions after selection, using TargetInstrInfo to determine if target opcodes are known calls.

  Code size geomean improvements on CTMark:
    -O0 : 0.1%
    -Os : 0.3%

  Differential Revision: https://reviews.llvm.org/D67868

  llvm-svn: 372443
* [MTE] Handle MTE instructions in AArch64LoadStoreOptimizer.
  Evgeniy Stepanov | 2019-09-20 | 2 files changed, -1/+286

  Summary: Generate pre- and post-indexed forms of ST*G and STGP when possible.

  Reviewers: ostannard, vitalybuka
  Subscribers: kristof.beyls, hiraditya, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D67741

  llvm-svn: 372412
* [aarch64] add def-pats for dot product
  Sebastian Pop | 2019-09-20 | 1 file changed, -0/+142

  This patch adds the patterns to select the dot product instructions. Tested on aarch64-linux with make check-all.

  Differential Revision: https://reviews.llvm.org/D67645

  llvm-svn: 372408
* [FastISel] Fix insertion of unconditional branches during FastISel
  David Tellenbach | 2019-09-20 | 1 file changed, -0/+44

  The insertion of an unconditional branch during FastISel can differ depending on building with or without debug information. This happens because FastISel::fastEmitBranch emits an unconditional branch depending on the size of the current basic block without distinguishing between debug and non-debug instructions.

  This patch fixes this issue by ignoring debug instructions when getting the size of the basic block.

  Reviewers: aprantl
  Reviewed By: aprantl
  Subscribers: ormris, aprantl, hiraditya, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D67703

  llvm-svn: 372389
* Reapply r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics"
  Matt Arsenault | 2019-09-19 | 1 file changed, -2/+1

  This reverts r372314, reapplying r372285 and the commits which depend on it (r372286-r372293, and r372296-r372297).

  This was missing one switch to getTargetConstant in an untested case.

  llvm-svn: 372338
* Revert r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics"
  Hans Wennborg | 2019-09-19 | 1 file changed, -1/+2

  This broke the Chromium build, causing it to fail with e.g.

    fatal error: error in backend: Cannot select: t362: v4i32 = X86ISD::VSHLI t392, Constant:i8<15>

  See llvm-commits thread of r372285 for details.

  This also reverts r372286, r372287, r372288, r372289, r372290, r372291, r372292, r372293, r372296, and r372297, which seemed to depend on the main commit.

  > Encode them directly as an imm argument to G_INTRINSIC*.
  >
  > Since now intrinsics can now define what parameters are required to be
  > immediates, avoid using registers for them. Intrinsics could
  > potentially want a constant that isn't a legal register type. Also,
  > since G_CONSTANT is subject to CSE and legalization, transforms could
  > potentially obscure the value (and create extra work for the
  > selector). The register bank of a G_CONSTANT is also meaningful, so
  > this could throw off future folding and legalization logic for AMDGPU.
  >
  > This will be much more convenient to work with than needing to call
  > getConstantVRegVal and checking if it may have failed for every
  > constant intrinsic parameter. AMDGPU has quite a lot of intrinsics wth
  > immarg operands, many of which need inspection during lowering. Having
  > to find the value in a register is going to add a lot of boilerplate
  > and waste compile time.
  >
  > SelectionDAG has always provided TargetConstant for constants which
  > should not be legalized or materialized in a register. The distinction
  > between Constant and TargetConstant was somewhat fuzzy, and there was
  > no automatic way to force usage of TargetConstant for certain
  > intrinsic parameters. They were both ultimately ConstantSDNode, and it
  > was inconsistently used. It was quite easy to mis-select an
  > instruction requiring an immediate. For SelectionDAG, start emitting
  > TargetConstant for these arguments, and using timm to match them.
  >
  > Most of the work here is to cleanup target handling of constants. Some
  > targets process intrinsics through intermediate custom nodes, which
  > need to preserve TargetConstant usage to match the intrinsic
  > expectation. Pattern inputs now need to distinguish whether a constant
  > is merely compatible with an operand or whether it is mandatory.
  >
  > The GlobalISelEmitter needs to treat timm as a special case of a leaf
  > node, simlar to MachineBasicBlock operands. This should also enable
  > handling of patterns for some G_* instructions with immediates, like
  > G_FENCE or G_EXTRACT.
  >
  > This does include a workaround for a crash in GlobalISelEmitter when
  > ARM tries to uses "imm" in an output with a "timm" pattern source.

  llvm-svn: 372314
* GlobalISel: Don't materialize immarg arguments to intrinsics
  Matt Arsenault | 2019-09-19 | 1 file changed, -2/+1

  Encode them directly as an imm argument to G_INTRINSIC*.

  Since intrinsics can now define what parameters are required to be immediates, avoid using registers for them. Intrinsics could potentially want a constant that isn't a legal register type. Also, since G_CONSTANT is subject to CSE and legalization, transforms could potentially obscure the value (and create extra work for the selector). The register bank of a G_CONSTANT is also meaningful, so this could throw off future folding and legalization logic for AMDGPU.

  This will be much more convenient to work with than needing to call getConstantVRegVal and checking if it may have failed for every constant intrinsic parameter. AMDGPU has quite a lot of intrinsics with immarg operands, many of which need inspection during lowering. Having to find the value in a register is going to add a lot of boilerplate and waste compile time.

  SelectionDAG has always provided TargetConstant for constants which should not be legalized or materialized in a register. The distinction between Constant and TargetConstant was somewhat fuzzy, and there was no automatic way to force usage of TargetConstant for certain intrinsic parameters. They were both ultimately ConstantSDNode, and it was inconsistently used. It was quite easy to mis-select an instruction requiring an immediate. For SelectionDAG, start emitting TargetConstant for these arguments, and using timm to match them.

  Most of the work here is to clean up target handling of constants. Some targets process intrinsics through intermediate custom nodes, which need to preserve TargetConstant usage to match the intrinsic expectation. Pattern inputs now need to distinguish whether a constant is merely compatible with an operand or whether it is mandatory.

  The GlobalISelEmitter needs to treat timm as a special case of a leaf node, similar to MachineBasicBlock operands. This should also enable handling of patterns for some G_* instructions with immediates, like G_FENCE or G_EXTRACT.

  This does include a workaround for a crash in GlobalISelEmitter when ARM tries to use "imm" in an output with a "timm" pattern source.

  llvm-svn: 372285
* [AArch64][GlobalISel] Support lowering musttail calls
  Jessica Paquette | 2019-09-18 | 3 files changed, -6/+26

  Since we now lower most tail calls, it makes sense to support musttail.

  Instead of always falling back to SelectionDAG, only fall back when a musttail call was not able to be emitted as a tail call. Once we can handle most incoming and outgoing arguments, we can change this to a `report_fatal_error` like in ISelLowering.

  Remove the assert that we don't have varargs and a musttail, and replace it with a return false. Implementing this requires that we implement `saveVarArgRegisters` from AArch64ISelLowering, which is an entirely different patch.

  Add GlobalISel lines to vararg-tallcall.ll to make sure that we produce correct code. Right now we only fall back, but eventually this will be relevant.

  Differential Revision: https://reviews.llvm.org/D67681

  llvm-svn: 372273
* [AArch64] Don't implicitly enable global isel on Darwin if code-model==large.
  Lang Hames | 2019-09-18 | 1 file changed, -0/+16

  Summary: AArch64 GlobalISel doesn't support MachO's large code model, so this patch adds a check for that combination before implicitly enabling it.

  Reviewers: paquette
  Subscribers: kristof.beyls, ributzka, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D67724

  llvm-svn: 372256
* Revert "[AArch64][DebugInfo] Do not recompute CalleeSavedStackSize"
  Krasimir Georgiev | 2019-09-18 | 1 file changed, -92/+0

  Summary: This reverts commit r372204.

  This change causes build bot failures under msan: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/35236/steps/check-llvm%20msan/logs/stdio:

  ```
  FAIL: LLVM :: DebugInfo/AArch64/asan-stack-vars.mir (19531 of 33579)
  ******************** TEST 'LLVM :: DebugInfo/AArch64/asan-stack-vars.mir' FAILED ********************
  Script:
  --
  : 'RUN: at line 1'; /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/llc -O0 -start-before=livedebugvalues -filetype=obj -o - /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/DebugInfo/AArch64/asan-stack-vars.mir | /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/llvm-dwarfdump -v - | /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/DebugInfo/AArch64/asan-stack-vars.mir
  --
  Exit Code: 2
  Command Output (stderr):
  --
  ==62894==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0xdfcafb in llvm::AArch64FrameLowering::resolveFrameOffsetReference(llvm::MachineFunction const&, int, bool, unsigned int&, bool, bool) const /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:1658:3
    #1 0xdfae8a in resolveFrameIndexReference /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:1580:10
    #2 0xdfae8a in llvm::AArch64FrameLowering::getFrameIndexReference(llvm::MachineFunction const&, int, unsigned int&) const /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:1536
    #3 0x46642c1 in (anonymous namespace)::LiveDebugValues::extractSpillBaseRegAndOffset(llvm::MachineInstr const&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:582:21
    #4 0x4647cb3 in transferSpillOrRestoreInst /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:883:11
    #5 0x4647cb3 in process /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:1079
    #6 0x4647cb3 in (anonymous namespace)::LiveDebugValues::ExtendRanges(llvm::MachineFunction&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:1361
    #7 0x463ac0e in (anonymous namespace)::LiveDebugValues::runOnMachineFunction(llvm::MachineFunction&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:1415:18
    #8 0x4854ef0 in llvm::MachineFunctionPass::runOnFunction(llvm::Function&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/MachineFunctionPass.cpp:73:13
    #9 0x53b0b01 in llvm::FPPassManager::runOnFunction(llvm::Function&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1648:27
    #10 0x53b15f6 in llvm::FPPassManager::runOnModule(llvm::Module&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1685:16
    #11 0x53b298d in runOnModule /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1750:27
    #12 0x53b298d in llvm::legacy::PassManagerImpl::run(llvm::Module&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1863
    #13 0x905f21 in compileModule(char**, llvm::LLVMContext&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/tools/llc/llc.cpp:601:8
    #14 0x8fdc4e in main /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/tools/llc/llc.cpp:355:22
    #15 0x7f67673632e0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e0)
    #16 0x882369 in _start (/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/llc+0x882369)

  MemorySanitizer: use-of-uninitialized-value /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:1658:3 in llvm::AArch64FrameLowering::resolveFrameOffsetReference(llvm::MachineFunction const&, int, bool, unsigned int&, bool, bool) const
  Exiting
  error: -: The file was not recognized as a valid object file
  FileCheck error: '-' is empty.
  FileCheck command line: /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/DebugInfo/AArch64/asan-stack-vars.mir
  ```

  Reviewers: bkramer
  Reviewed By: bkramer
  Subscribers: sdardis, aprantl, kristof.beyls, jrtc27, atanasyan, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D67710

  llvm-svn: 372228
* [AArch64][DebugInfo] Do not recompute CalleeSavedStackSize
  Sander de Smalen | 2019-09-18 | 1 file changed, -0/+92

  This patch fixes a bug exposed by D65653 where a subsequent invocation of `determineCalleeSaves` ends up with a different size for the callee save area, leading to different frame-offsets in debug information.

  In the invocation by PEI, `determineCalleeSaves` tries to determine whether it needs to spill an extra callee-saved register to get an emergency spill slot. To do this, it calls 'estimateStackSize' and manually adds the size of the callee-saves to this. PEI then allocates the spill objects for the callee saves and the remaining frame layout is calculated accordingly.

  A second invocation in LiveDebugValues causes estimateStackSize to return the size of the stack frame including the callee-saves. Given that the size of the callee-saves is added to this, these callee-saves are counted twice, which leads `determineCalleeSaves` to believe the stack has become big enough to require spilling an extra callee-save as emergency spillslot. It then updates CalleeSavedStackSize with a larger value. Since CalleeSavedStackSize is used in the calculation of the frame offset in getFrameIndexReference, this leads to incorrect offsets for variables/locals when this information is recalculated after PEI.

  Reviewers: omjavaid, eli.friedman, thegameg, efriedma
  Reviewed By: efriedma
  Differential Revision: https://reviews.llvm.org/D66935

  llvm-svn: 372204
* [AArch64][GlobalISel] Support -tailcallopt
  Jessica Paquette | 2019-09-17 | 2 files changed, -1/+2

  This adds support for `-tailcallopt` tail calls to CallLowering. This piggy-backs off the changes from D67577, since doing it without a bit of refactoring gets extremely ugly.

  Support is basically ported from AArch64ISelLowering. The main difference here is that tail calls in `-tailcallopt` change the ABI, so there's some extra bookkeeping for the stack.

  Show that we are correctly lowering these by updating tail-call.ll. Also show that we don't do anything strange in general by updating fastcc-reserved.ll, which passes `-tailcallopt`, but doesn't emit any tail calls.

  Differential Revision: https://reviews.llvm.org/D67580

  llvm-svn: 372177
* [GlobalISel] Partially revert r371901.
  Amara Emerson | 2019-09-16 | 1 file changed, -0/+99

  r371901 was overeager: widenScalarDst() and the like in the legalizer attempt to increment the insert point given, in order to add new instructions after the one currently being legalized. In cases where the insertion point is not exactly the current instruction, callers need to de-compensate for that behaviour by decrementing the insertion iterator before calling them. It's not a nice state of affairs; for now, just undo the problematic parts of the change.

  llvm-svn: 372050
* [SVE][Inline-Asm] Add constraints for SVE predicate registers
  Kerry McLaughlin | 2019-09-16 | 1 file changed, -0/+26

  Summary: Adds the following inline asm constraints for SVE:
  - Upl: One of the low eight SVE predicate registers, P0 to P7 inclusive
  - Upa: SVE predicate register with full range, P0 to P15

  Reviewers: t.p.northover, sdesmalen, rovka, momchil.velikov, cameron.mcinally, greened, rengolin
  Reviewed By: rovka
  Subscribers: javed.absar, tschuett, rkruppe, psnobl, cfe-commits, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D66524

  llvm-svn: 371967
* [AArch64] Some more FP16 FMA pattern matching
  Sjoerd Meijer | 2019-09-16 | 2 files changed, -16/+53

  After our previous machinecombiner exercises (rL371321, rL371818, rL371833), we were still missing a few FP16 FMA patterns.

  Differential Revision: https://reviews.llvm.org/D67576

  llvm-svn: 371960
* [GlobalISel] Fix insertion point of new instructions to be after PHIs.
  Amara Emerson | 2019-09-13 | 1 file changed, -0/+49

  For some reason we sometimes insert new instructions one instruction before the first non-PHI when legalizing. This can result in having non-PHI instructions before PHIs, which means that PHI elimination doesn't catch them.

  Differential Revision: https://reviews.llvm.org/D67570

  llvm-svn: 371901
* [AArch64][GlobalISel] Tail call memory intrinsics
  Jessica Paquette | 2019-09-13 | 2 files changed, -12/+81

  Because memory intrinsics are handled differently than other calls, we need to check them for tail call eligibility in the legalizer. This allows us to still inline them when it's beneficial to do so, but also tail call when possible.

  This adds simple tail calling support for when the intrinsic is followed by a return.

  It ports the attribute checks from `TargetLowering::isInTailCallPosition` into a similarly-named function in LegalizerHelper.cpp. The target-specific `isUsedByReturnOnly` hook is not ported here.

  Update tailcall-mem-intrinsics.ll to show that GlobalISel can now tail call memory intrinsics. Update legalize-memcpy-et-al.mir to have a case where we don't tail call.

  Differential Revision: https://reviews.llvm.org/D67566

  llvm-svn: 371893
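  A minimal IR sketch of the eligible shape (an illustration, not a copy of tailcall-mem-intrinsics.ll): a memory intrinsic immediately followed by a return, so the memcpy libcall it lowers to can be emitted as a tail call:

  ```
  declare void @llvm.memcpy.p0i8.p0i8.i64(i8*, i8*, i64, i1)

  define void @copy_then_return(i8* %dst, i8* %src, i64 %n) {
  entry:
    ; Nothing lives between the intrinsic and the return, so the libcall is
    ; in tail position.
    call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dst, i8* %src, i64 %n, i1 false)
    ret void
  }
  ```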
* [AArch64][GlobalISel] Add support for sibcalling callees with varargs
  Jessica Paquette | 2019-09-13 | 1 file changed, -14/+25

  This adds support for tail calling callees with varargs, equivalent to how it is done in AArch64ISelLowering.

  This only works for sibling calls, and does not add the necessary support for musttail with varargs. (See r345641 for equivalent ISelLowering support.) This should be implemented when we stop falling back on musttail.

  Update call-translator-tail-call.ll to show that we can now tail call varargs.

  Differential Revision: https://reviews.llvm.org/D67518

  llvm-svn: 371868
* [AArch64] More @llvm.fma.f16 tests
  Sjoerd Meijer | 2019-09-13 | 1 file changed, -3/+43

  Follow up of rL371321 that added FMA FP16 patterns. This adds more tests for @llvm.fma.f16. This probably shows we miss one fmsub optimisation opportunity, which I will look into.

  llvm-svn: 371833
* [AArch64][GlobalISel] Support tail calling with swiftself parameters
  Jessica Paquette | 2019-09-12 | 1 file changed, -0/+21

  Swiftself uses a callee-saved register. We can tail call when the register used in the caller and callee is the same. This behaviour is equivalent to that in `TargetLowering::parametersInCSRMatch`.

  Update call-translator-tail-call.ll to verify that we can do this. When we support inline assembly, we can write a check similar to the one in the general swiftself.ll. For now, we need to verify that we get the correct COPY instruction after call lowering.

  Differential Revision: https://reviews.llvm.org/D67511

  llvm-svn: 371788
* [AArch64][GlobalISel] Support sibling calls with outgoing arguments
  Jessica Paquette | 2019-09-12 | 5 files changed, -48/+72

  This adds support for lowering sibling calls with outgoing arguments. e.g.

  ```
  define void @foo(i32 %a)
  ```

  Support is ported from AArch64ISelLowering's `isEligibleForTailCallOptimization`. The only thing that is missing is a full port of `TargetLowering::parametersInCSRMatch`. So, if we're using swiftself, we'll never tail call.

  - Rename `analyzeCallResult` to `analyzeArgInfo`, since the function is now used for both outgoing and incoming arguments
  - Teach `OutgoingArgHandler` about tail calls. Tail calls use frame indices for stack arguments.
  - Teach `lowerFormalArguments` to set the bytes in the caller's stack argument area. This is used later to check if the tail call's parameters will fit on the caller's stack.
  - Add `areCalleeOutgoingArgsTailCallable` to perform the eligibility check on the callee's outgoing arguments.

  For testing:

  - Update call-translator-tail-call to verify that we can now tail call with outgoing arguments, use G_FRAME_INDEX for stack arguments, and respect the size of the caller's stack
  - Remove GISel-specific check lines from speculation-hardening.ll, since GISel now tail calls like the other selectors
  - Add a GISel test line to tailcall-string-rvo.ll since we can tail call in that test now
  - Add a GISel test line to tailcall_misched_graph.ll since we tail call there now. Add specific check lines for GISel, since the debug output from the machine-scheduler differs with GlobalISel. The dependency still holds, but the output comes out in a different order.

  Differential Revision: https://reviews.llvm.org/D67471

  llvm-svn: 371780
* AArch64: support arm64_32, an ILP32 slice for watchOS.
  Tim Northover | 2019-09-12 | 30 files changed, -160/+1819

  This is the main CodeGen patch to support the arm64_32 watchOS ABI in LLVM. FastISel is mostly disabled for now since it would generate incorrect code for ILP32.

  llvm-svn: 371722
* [AArch64][GlobalISel] Fall back on attempts to allocate split types on the stack.
  Amara Emerson | 2019-09-11 | 2 files changed, -0/+23

  First we were asserting that the ValNo of a VA was the wrong value. It doesn't actually make a difference for us in CallLowering but fix that anyway to silence the assert.

  The bigger issue was that after fixing the assert we were generating invalid MIR because the merging/unmerging of values split across multiple registers wasn't also implemented for memory locs. This happens when we run out of registers and have to pass the split types like i128 -> i64 x 2 on the stack. This is do-able, but for now just fall back.

  llvm-svn: 371693