bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[DAGCombiner] Don't combine (addcarry (uaddo X, Y), 0, Carry) -> (addcarry ↵	Craig Topper	2019-07-04	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	X, Y, Carry) if the Carry comes from the uaddo. Summary: The uaddo won't be removed and the addcarry will still be dependent on the uaddo. So we'll just increase the use count of X and Y and potentially require a COPY. Reviewers: spatel, RKSimon, deadalnix Reviewed By: RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64190 llvm-svn: 365149
*	GlobalISel: Fix widenScalar for pointer typed G_MERGE_VALUES	Matt Arsenault	2019-07-03	1	-1/+1
\| \| \| \|	llvm-svn: 365093
*	[CodeGen] Make branch funnels pass the machine verifier	Francis Visoiu Mistrih	2019-07-03	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We previously marked all the tests with branch funnels as `-verify-machineinstrs=0`. This is an attempt to fix it. 1) `ICALL_BRANCH_FUNNEL` has no defs. Mark it as `let OutOperandList = (outs)` 2) After that we hit an assert: ``` Assertion failed: (Op.getValueType() != MVT::Other && Op.getValueType() != MVT::Glue && "Chain and glue operands should occur at end of operand list!"), function AddOperand, file /Users/francisvm/llvm/llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp, line 461. ``` The chain operand was added at the beginning of the operand list. Move that to the end. 3) After that we hit another verifier issue in the pseudo expansion where the registers used in the cmps and jmps are not added to the livein lists. Add the `EFLAGS` to all the new MBBs that we create. PR39436 Differential Review: https://reviews.llvm.org/D54155 llvm-svn: 365058
*	Use getAllOnesConstants instead of -1 in DAGCombiner. NFC	Amaury Sechet	2019-07-03	1	-1/+1
\| \| \| \|	llvm-svn: 365054
*	[DAGCombine] More diamong carry pattern optimization.	Amaury Sechet	2019-07-03	1	-27/+92
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This diff improve the capability of DAGCOmbine to generate linear carries propagation in presence of a diamond pattern. It is now able to match a large variety of different patterns rather than some hardcoded one. Arguably, the codegen in test cases is not better, but this is to be expected. The goal of this transformation is more about canonicalisation than actual optimisation. Reviewers: hfinkel, RKSimon, craig.topper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D57302 llvm-svn: 365051
*	[SelectionDAG] Propagate alias metadata to target intrinsic nodes	James Molloy	2019-07-03	2	-6/+8
\| \| \| \| \| \| \| \|	When a target intrinsic has been determined to touch memory, we construct a MachineMemOperand during SDAG construction. In this case, we should propagate AAMDNodes metadata to the MachineMemOperand where available. Differential revision: https://reviews.llvm.org/D64131 llvm-svn: 365043
*	[ARM] Thumb2: favor R4-R7 over R12/LR in allocation order when opt for minsize	Oliver Stannard	2019-07-03	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For Thumb2, we prefer low regs (costPerUse = 0) to allow narrow encoding. However, current allocation order is like: R0-R3, R12, LR, R4-R11 As a result, a lot of instructs that use R12/LR will be wide instrs. This patch changes the allocation order to: R0-R7, R12, LR, R8-R11 for thumb2 and -Osize. In most cases, there is no extra push/pop instrs as they will be folded into existing ones. There might be slight performance impact due to more stack usage, so we only enable it when opt for min size. https://reviews.llvm.org/D30324 llvm-svn: 365014
*	[Codegen][X86][AArch64][ARM][PowerPC] Inc-of-add vs sub-of-not (PR42457)	Roman Lebedev	2019-07-03	1	-0/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is the backend part of [[ https://bugs.llvm.org/show_bug.cgi?id=42457 \| PR42457 ]]. In middle-end, we'd want to prefer the form with two adds - D63992, but as this diff shows, not every target will prefer that pattern. Out of 4 targets for which i added tests all seem to be ok with inc-of-add for scalars, but only X86 prefer that same pattern for vectors. Here i'm adding a new TLI hook, always defaulting to the inc-of-add, but adding AArch64,ARM,PowerPC overrides to prefer inc-of-add only for scalars. Reviewers: spatel, RKSimon, efriedma, t.p.northover, hfinkel Reviewed By: efriedma Subscribers: nemanjai, javed.absar, kristof.beyls, kbarton, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64090 llvm-svn: 365010
*	Revert Changing CodeView debug info type record representation in assembly ↵	Nilanjana Basu	2019-07-03	1	-41/+8
\| \| \| \| \| \| \| \|	files to make it more human-readable & editable This reverts r364982 (git commit 2082bf28ebea76cc187b508f801122866420d9ff) llvm-svn: 364987
*	Changing CodeView debug info type record representation in assembly files to ↵	Nilanjana Basu	2019-07-03	1	-8/+41
\| \| \| \| \| \|	make it more human-readable & editable llvm-svn: 364982
*	[RA] Fix spelling of Greedy register allocator internal option	Teresa Johnson	2019-07-02	1	-1/+1
\| \| \| \| \| \| \| \|	The internal option added with r323870 has a typo. It isn't being used by any tests, but I decided to fix the spelling and leave it in for use in debugging the changes added in that patch. llvm-svn: 364958
*	GlobalISel: Add G_FENCE	Matt Arsenault	2019-07-02	2	-0/+15
\| \| \| \| \| \| \|	The pattern importer is for some reason emitting checks for G_CONSTANT for the immediate operands. llvm-svn: 364926
*	[NFC][TargetLowering] Some preparatory cleanups around 'prepareUREMEqFold()' ↵	Roman Lebedev	2019-07-02	1	-17/+18
\| \| \| \| \| \|	from D63963 llvm-svn: 364921
*	[TailDuplicator] Fix copy instruction emitting into the wrong block.	Amara Emerson	2019-07-02	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	The code for duplicating instructions could sometimes try to emit copies intended to deal with unconstrainable register classes to the tail block of the original instruction, rather than before the newly cloned instruction in the predecessor block. This was exposed by GlobalISel on arm64. Differential Revision: https://reviews.llvm.org/D64049 llvm-svn: 364888
*	[DAGCombiner] Exploiting more about the transformation of ↵	Zi Xuan Wu	2019-07-02	1	-4/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	TransformFPLoadStorePair function For a given floating point load / store pair, if the load value isn't used by any other operations, then consider transforming the pair to integer load / store operations if the target deems the transformation profitable. And we can exploiting much more when there are other operation nodes with chain operand between the load/store pair so long as we keep the chain ordering original. We only replace the register used to load/store from float to integer. I only add testcase in ARM because the TLI.isDesirableToTransformToIntegerOp hook is only enabled in ARM target. Differential Revision: https://reviews.llvm.org/D60601 llvm-svn: 364883
*	Testing commit access through minor formatting change	Nilanjana Basu	2019-07-01	1	-2/+3
\| \| \| \|	llvm-svn: 364843
*	GlobalISel: Try to widen merges with other merges	Matt Arsenault	2019-07-01	1	-2/+28
\| \| \| \| \| \| \| \|	If the requested source type an be used as a merge source type, create a merge of merges. This avoids creating large, illegal extensions and bit-ops directly to the result type. llvm-svn: 364841
*	GlobalISel: Verify G_MERGE_VALUES operand sizes	Matt Arsenault	2019-07-01	1	-0/+10
\| \| \| \|	llvm-svn: 364822
*	[GlobalISel]: Allow backends to custom legalize Intrinsics	Aditya Nandakumar	2019-07-01	2	-0/+10
\| \| \| \| \| \| \| \| \|	https://reviews.llvm.org/D31359 Add a hook "legalizeInstrinsic" to allow backends to override this and custom lower/legalize intrinsics. llvm-svn: 364821
*	GlobalISel: Implement lower for min/max	Matt Arsenault	2019-07-01	1	-0/+36
\| \| \| \|	llvm-svn: 364816
*	Fixup r364512	Diana Picus	2019-07-01	1	-10/+12
\| \| \| \| \| \| \| \| \| \|	Fix stack-use-after-scope errors from r364512. One instance was already fixed in r364611 - this patch simplifies that fix and addresses one more instance of similar code. Discussed in: https://reviews.llvm.org/D63905 llvm-svn: 364778
*	[SelectionDAG] Do minnum->minimum at legalization time instead of building time	Benjamin Kramer	2019-07-01	2	-16/+17
\| \| \| \| \| \| \| \|	The SDAGBuilder behavior stems from the days when we didn't have fast math flags available in SDAG. We do now and doing the transformation in the legalizer has the advantage that it also works for vector types. llvm-svn: 364743
*	[DebugInfo] Avoid adding too much indirection to pointer-valued variables	Jeremy Morse	2019-07-01	1	-2/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch addresses PR41675, where a stack-pointer variable is dereferenced too many times by its location expression, presenting a value on the stack as the pointer to the stack. The difference between a stack pointer DBG_VALUE and one that refers to a value on the stack, is currently the indirect flag. However the DWARF backend will also try to guess whether something is a memory location or not, based on whether there is any computation in the location expression. By simply prepending the stack offset to existing expressions, we can accidentally convert a register location into a memory location, which introduces a suprise (and unintended) dereference. The solution is to add DW_OP_stack_value whenever we add a DIExpression computation to a stack pointer. It's an implicit location computed on the expression stack, thus needs to be flagged as a stack_value. For the edge case where the offset is zero and the location could be a register location, DIExpression::prepend will still generate opcodes, and thus DW_OP_stack_value must still be added. Differential Revision: https://reviews.llvm.org/D63429 llvm-svn: 364736
*	[ARM] WLS/LE Code Generation	Sam Parker	2019-07-01	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Backend changes to enable WLS/LE low-overhead loops for armv8.1-m: 1) Use TTI to communicate to the HardwareLoop pass that we should try to generate intrinsics that guard the loop entry, as well as setting the loop trip count. 2) Lower the BRCOND that uses said intrinsic to an Arm specific node: ARMWLS. 3) ISelDAGToDAG the node to a new pseudo instruction: t2WhileLoopStart. 4) Add support in ArmLowOverheadLoops to handle the new pseudo instruction. Differential Revision: https://reviews.llvm.org/D63816 llvm-svn: 364733
*	Cleanup: llvm::bsearch -> llvm::partition_point after r364719	Fangrui Song	2019-06-30	2	-7/+6
\| \| \| \|	llvm-svn: 364720
*	[SelectionDAG] Use the memory VT instead of result VT for FoldingSet ↵	Craig Topper	2019-06-30	1	-3/+2
\| \| \| \| \| \| \| \| \|	profiling in getMaskedLoad/getMaskedStore. This matches what is done by the Profile function. Otherwise CSE won't work properly. llvm-svn: 364717
*	[HardwareLoops] Loop counter guard intrinsic	Sam Parker	2019-06-28	1	-16/+105
\| \| \| \| \| \| \| \| \|	Introduce llvm.test.set.loop.iterations which sets the loop counter and also produces an i1 after testing that the count is not zero. Differential Revision: https://reviews.llvm.org/D63809 llvm-svn: 364628
*	GlobalISel: Use Register	Matt Arsenault	2019-06-28	3	-39/+39
\| \| \| \|	llvm-svn: 364618
*	GlobalISel: Convert rest of MachineIRBuilder to using Register	Matt Arsenault	2019-06-28	1	-50/+50
\| \| \| \|	llvm-svn: 364615
*	[GlobalISel][IRTranslator] Fix some PHI bugs related to jump tables when ↵	Amara Emerson	2019-06-27	1	-14/+26
\| \| \| \| \| \| \| \| \| \| \| \|	optimizations are used. The new switch lowering code that tries to generate jump tables and range checks were tested at -O0 on arm64, but on -O3 the generic switch lowering code goes to town on trying to generate optimized lowerings, e.g. multiple jump tables, range checks etc. This exposed bugs in the way PHI nodes are handled because the CFG looks even stranger after all of this is done. llvm-svn: 364613
*	Fix ASAN error caused by commit r364512.	Rumeet Dhindsa	2019-06-27	1	-4/+6
\| \| \| \| \| \| \| \| \|	This patch intends to fix ASAN stack-use-after-scope error. This is at least a short-term fix to unbreak LLVM's mainline. Differential Revision: https://reviews.llvm.org/D63905 llvm-svn: 364611
*	[CodeGen] [SelectionDAG] More efficient code for X % C == 0 (UREM case) (try 3)	Roman Lebedev	2019-06-27	1	-0/+109
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: I'm submitting a new revision since i don't understand how to reclaim/reopen/take over the existing one, D50222. There is no such action in "Add Action" menu... This implements an optimization described in Hacker's Delight 10-17: when `C` is constant, the result of `X % C == 0` can be computed more cheaply without actually calculating the remainder. The motivation is discussed here: https://bugs.llvm.org/show_bug.cgi?id=35479. This is a recommit, the original commit rL364563 was reverted in rL364568 because test-suite detected miscompile - the new comparison constant 'Q' was being computed incorrectly (we divided by `D0` instead of `D`). Original patch D50222 by @hermord (Dmytro Shynkevych) Notes: - In principle, it's possible to also handle the `X % C1 == C2` case, as discussed on bugzilla. This seems to require an extra branch on overflow, so I refrained from implementing this for now. - An explicit check for when the `REM` can be reduced to just its LHS is included: the `X % C` == 0 optimization breaks `test1` in `test/CodeGen/X86/jump_sign.ll` otherwise. I hadn't managed to find a better way to not generate worse output in this case. - The `test/CodeGen/X86/jump_sign.ll` regresses, and is being fixed by a followup patch D63390. Reviewers: RKSimon, craig.topper, spatel, hermord, xbolva00 Reviewed By: RKSimon, xbolva00 Subscribers: dexonsmith, kristina, xbolva00, javed.absar, llvm-commits, hermord Tags: #llvm Differential Revision: https://reviews.llvm.org/D63391 llvm-svn: 364600
*	Revert "[LiveDebugValues] Emit the debug entry values"	Djordje Todorovic	2019-06-27	1	-139/+20
\| \| \| \| \| \| \| \| \|	Appears that the 'test/DebugInfo/MIR/X86/dbginfo-entryvals.mir' does not pass on Windows. This reverts commit rL364553. llvm-svn: 364571
*	Revert "[CodeGen] [SelectionDAG] More efficient code for X % C == 0 (UREM ↵	Roman Lebedev	2019-06-27	1	-107/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	case) (try 2)" Appears to break test-suite on http://lab.llvm.org:8011/builders/clang-cmake-x86_64-sde-avx512-linux/builds/23790 FAIL: burg.execution_time FAIL: spiff.execution_time FAIL: employ.execution_time FAIL: llu.execution_time FAIL: gramschmidt.execution_time FAIL: fdtd-apml.execution_time This reverts commit r364563. llvm-svn: 364568
*	[CodeGen] [SelectionDAG] More efficient code for X % C == 0 (UREM case) (try 2)	Roman Lebedev	2019-06-27	1	-0/+107
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: I'm submitting a new revision since i don't understand how to reclaim/reopen/take over the existing one, D50222. There is no such action in "Add Action" menu... Original patch D50222 by @hermord (Dmytro Shynkevych) This implements an optimization described in Hacker's Delight 10-17: when `C` is constant, the result of `X % C == 0` can be computed more cheaply without actually calculating the remainder. The motivation is discussed here: https://bugs.llvm.org/show_bug.cgi?id=35479. Original patch author: @hermord (Dmytro Shynkevych)! Notes: - In principle, it's possible to also handle the `X % C1 == C2` case, as discussed on bugzilla. This seems to require an extra branch on overflow, so I refrained from implementing this for now. - An explicit check for when the `REM` can be reduced to just its LHS is included: the `X % C` == 0 optimization breaks `test1` in `test/CodeGen/X86/jump_sign.ll` otherwise. I hadn't managed to find a better way to not generate worse output in this case. - The `test/CodeGen/X86/jump_sign.ll` regresses, and is being fixed by a followup patch D63390. Reviewers: RKSimon, craig.topper, spatel, hermord, xbolva00 Reviewed By: RKSimon, xbolva00 Subscribers: xbolva00, javed.absar, llvm-commits, hermord Tags: #llvm Differential Revision: https://reviews.llvm.org/D63391 llvm-svn: 364563
*	[LiveDebugValues] Emit the debug entry values	Djordje Todorovic	2019-06-27	1	-20/+139
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Emit replacements for clobbered parameters location if the parameter has unmodified value throughout the funciton. This is basic scenario where we can use the debug entry values. ([12/13] Introduce the debug entry values.) Co-authored-by: Ananth Sowda <asowda@cisco.com> Co-authored-by: Nikola Prica <nikola.prica@rt-rk.com> Co-authored-by: Ivan Baev <ibaev@cisco.com> Differential Revision: https://reviews.llvm.org/D58042 llvm-svn: 364553
*	[LiveRangeEdit] Fix build failure caused by the rL364536	Djordje Todorovic	2019-06-27	1	-1/+1
\| \| \| \|	llvm-svn: 364549
*	[TargetLowering] SimplifyDemandedVectorElts - add shift/rotate support.	Simon Pilgrim	2019-06-27	1	-0/+18
\| \| \| \|	llvm-svn: 364548
*	[DWARF] Handle the DW_OP_entry_value operand	Djordje Todorovic	2019-06-27	6	-5/+53
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add the IR and the AsmPrinter parts for handling of the DW_OP_entry_values DWARF operation. ([11/13] Introduce the debug entry values.) Co-authored-by: Ananth Sowda <asowda@cisco.com> Co-authored-by: Nikola Prica <nikola.prica@rt-rk.com> Co-authored-by: Ivan Baev <ibaev@cisco.com> Differential Revision: https://reviews.llvm.org/D60866 llvm-svn: 364542
*	[TargetLowering] SimplifyDemandedBits - use DemandedElts to better identify ↵	Simon Pilgrim	2019-06-27	1	-11/+21
\| \| \| \| \| \|	partial splat shift amounts llvm-svn: 364541
*	[Backend] Keep call site info valid through the backend	Djordje Todorovic	2019-06-27	7	-6/+47
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Handle call instruction replacements and deletions in order to preserve valid state of the call site info of the MachineFunction. NOTE: If the call site info is enabled for a new target, the assertion from the MachineFunction::DeleteMachineInstr() should help to locate places where the updateCallSiteInfo() should be called in order to preserve valid state of the call site info. ([10/13] Introduce the debug entry values.) Co-authored-by: Ananth Sowda <asowda@cisco.com> Co-authored-by: Nikola Prica <nikola.prica@rt-rk.com> Co-authored-by: Ivan Baev <ibaev@cisco.com> Differential Revision: https://reviews.llvm.org/D61062 llvm-svn: 364536
*	[ISEL][X86] Tracking of registers that forward call arguments	Djordje Todorovic	2019-06-27	1	-2/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	While lowering calls, collect info about registers that forward arguments into following function frame. We store such info into the MachineFunction of the call. This is used very late when dumping DWARF info about call site parameters. ([9/13] Introduce the debug entry values.) Co-authored-by: Ananth Sowda <asowda@cisco.com> Co-authored-by: Nikola Prica <nikola.prica@rt-rk.com> Co-authored-by: Ivan Baev <ibaev@cisco.com> Differential Revision: https://reviews.llvm.org/D60715 llvm-svn: 364516
*	[DebugInfo] Avoid register coalesing unsoundly changing DBG_VALUE locations	Jeremy Morse	2019-06-27	1	-2/+85
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Once MIR code leaves SSA form and the liveness of a vreg is considered, DBG_VALUE insts are able to refer to non-live vregs, because their debug-uses do not contribute to liveness. This non-liveness becomes problematic for optimizations like register coalescing, as they can't ``see'' the debug uses in the liveness analyses. As a result registers get coalesced regardless of debug uses, and that can lead to invalid variable locations containing unexpected values. In the added test case, the first vreg operand of ADD32rr is merged with various copies of the vreg (great for performance), but a DBG_VALUE of the unmodified operand is blindly updated to the modified operand. This changes what value the variable will appear to have in a debugger. Fix this by changing any DBG_VALUE whose operand will be resurrected by register coalescing to be a $noreg DBG_VALUE, i.e. give the variable no location. This is an overapproximation as some coalesced locations are safe (others are not) -- an extra domination analysis would be required to work out which, and it would be better if we just don't generate non-live DBG_VALUEs. This fixes PR40010. Differential Revision: https://reviews.llvm.org/D56151 llvm-svn: 364515
*	[GlobalISel] Remove [un]packRegs from IRTranslator	Diana Picus	2019-06-27	1	-29/+4
\| \| \| \| \| \| \| \| \| \| \|	Remove the last use of packRegs from IRTranslator and delete pack/unpackRegs. This introduces a fallback to DAGISel for intrinsics with aggregate arguments, since we don't have a testcase for them so it's hard to tell how we'd want to handle them. Discussed in https://reviews.llvm.org/D63551 llvm-svn: 364514
*	[GlobalISel] Accept multiple vregs for lowerCall's args	Diana Picus	2019-06-27	2	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Change the interface of CallLowering::lowerCall to accept several virtual registers for each argument, instead of just one. This is a follow-up to D46018. CallLowering::lowerReturn was similarly refactored in D49660 and lowerFormalArguments in D63549. With this change, we no longer pack the virtual registers generated for aggregates into one big lump before delegating to the target. Therefore, the target can decide itself whether it wants to handle them as separate pieces or use one big register. ARM and AArch64 have been updated to use the passed in virtual registers directly, which means we no longer need to generate so many merge/extract instructions. NFCI for AMDGPU, Mips and X86. Differential Revision: https://reviews.llvm.org/D63551 llvm-svn: 364512
*	[GlobalISel] Accept multiple vregs for lowerCall's result	Diana Picus	2019-06-27	2	-14/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Change the interface of CallLowering::lowerCall to accept several virtual registers for the call result, instead of just one. This is a follow-up to D46018. CallLowering::lowerReturn was similarly refactored in D49660 and lowerFormalArguments in D63549. With this change, we no longer pack the virtual registers generated for aggregates into one big lump before delegating to the target. Therefore, the target can decide itself whether it wants to handle them as separate pieces or use one big register. ARM and AArch64 have been updated to use the passed in virtual registers directly, which means we no longer need to generate so many merge/extract instructions. NFCI for AMDGPU, Mips and X86. Differential Revision: https://reviews.llvm.org/D63550 llvm-svn: 364511
*	[GlobalISel] Accept multiple vregs in lowerFormalArgs	Diana Picus	2019-06-27	2	-20/+49
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Change the interface of CallLowering::lowerFormalArguments to accept several virtual registers for each formal argument, instead of just one. This is a follow-up to D46018. CallLowering::lowerReturn was similarly refactored in D49660. lowerCall will be refactored in the same way in follow-up patches. With this change, we forward the virtual registers generated for aggregates to CallLowering. Therefore, the target can decide itself whether it wants to handle them as separate pieces or use one big register. We also copy the pack/unpackRegs helpers to CallLowering to facilitate this. ARM and AArch64 have been updated to use the passed in virtual registers directly, which means we no longer need to generate so many merge/extract instructions. AArch64 seems to have had a bug when lowering e.g. [1 x i8*], which was put into a s64 instead of a p0. Added a test-case which illustrates the problem more clearly (it crashes without this patch) and fixed the existing test-case to expect p0. AMDGPU has been updated to unpack into the virtual registers for kernels. I think the other code paths fall back for aggregates, so this should be NFC. Mips doesn't support aggregates yet, so it's also NFC. x86 seems to have code for dealing with aggregates, but I couldn't find the tests for it, so I just added a fallback to DAGISel if we get more than one virtual register for an argument. Differential Revision: https://reviews.llvm.org/D63549 llvm-svn: 364510
*	[GlobalISel] Allow multiple VRegs in ArgInfo. NFC	Diana Picus	2019-06-27	2	-5/+11
\| \| \| \| \| \| \| \| \| \| \|	Allow CallLowering::ArgInfo to contain more than one virtual register. This is useful when passes split aggregates into several virtual registers, but need to also provide information about the original type to the call lowering. Used in follow-up patches. Differential Revision: https://reviews.llvm.org/D63548 llvm-svn: 364509
*	[MachineFunction] Base support for call site info tracking	Djordje Todorovic	2019-06-27	3	-0/+88
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add an attribute into the MachineFunction that tracks call site info. ([8/13] Introduce the debug entry values.) Co-authored-by: Ananth Sowda <asowda@cisco.com> Co-authored-by: Nikola Prica <nikola.prica@rt-rk.com> Co-authored-by: Ivan Baev <ibaev@cisco.com> Differential Revision: https://reviews.llvm.org/D61061 llvm-svn: 364506
*	[IR] Add DISuprogram and DIE for a func decl	Djordje Todorovic	2019-06-27	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A unique DISubprogram may be attached to a function declaration used for call site debug info. ([6/13] Introduce the debug entry values.) Co-authored-by: Ananth Sowda <asowda@cisco.com> Co-authored-by: Nikola Prica <nikola.prica@rt-rk.com> Co-authored-by: Ivan Baev <ibaev@cisco.com> Differential Revision: https://reviews.llvm.org/D60713 llvm-svn: 364500