bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[AArch64] Change AArch64 Windows EH UnwindHelp object to be a fixed object	Daniel Frampton	2020-06-25	5	-24/+92
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The UnwindHelp object is used during exception handling by runtime code. It must be findable from a fixed offset from FP. This change allocates the UnwindHelp object as a fixed object (as is done for x86_64) to ensure that both the generated code and runtime agree on the location of the object. Fixes https://bugs.llvm.org/show_bug.cgi?id=45346 Differential Revision: https://reviews.llvm.org/D77016 (cherry picked from commit 494abe139a9aab991582f1b3f3370b99b252944c)
*	[AArch64] Fix mismatch in prologue and epilogue for funclets on Windows	Daniel Frampton	2020-06-25	1	-0/+62
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The generated code for a funclet can have an add to sp in the epilogue for which there is no corresponding sub in the prologue. This patch removes the early return from emitPrologue that was preventing the sub to sp, and instead conditionalizes the appropriate parts of the rest of the function. Fixes https://bugs.llvm.org/show_bug.cgi?id=45345 Differential Revision: https://reviews.llvm.org/D77015 (cherry picked from commit 522b4c4b88a5606b0074926e8658e7fede97c230)
*	[AARch64] Add Marvell ThunderX3T110 support	Wei Zhao	2020-06-17	5	-0/+5
\| \| \| \| \| \| \| \| \| \| \|	This is the first checkin to support Marvell ThunderX3T110. Initial definition of the micro-ops of the instructions in ThunderX3T110 is included. Differential Revision: https://reviews.llvm.org/D78129 (cherry picked from commit 382d3a85e2a9269569e7fb8caa487d7ef57900c6)
*	[AArch64] Fix BTI instruction emission.	Daniel Kiss	2020-06-16	1	-2/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: SCTLR_EL1.BT[01] controls the PACI[AB]SP compatibility with PBYTE 11 (see [1]) This bit will be set to zero so PACI[AB]SP are equal to BTI C instruction only. [1] https://developer.arm.com/docs/ddi0595/b/aarch64-system-registers/sctlr_el1 Reviewers: chill, tamas.petz, pbarrio, ostannard Reviewed By: tamas.petz, ostannard Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81746 (cherry picked from commit b8ae3fdfa579dbf366b1bb1cbfdbf8c51db7fa55)
*	[AArch64] Fix BTI landing pad generation.	Daniel Kiss	2020-06-16	1	-0/+31
\| \| \| \| \| \| \| \| \| \|	In some cases BTI landing pad is inserted even compatible instruction was there already. Meta instruction does not count in this case therefore skip them in the check for first instructions in the function. Differential revision: https://reviews.llvm.org/D74492 (cherry picked from commit d5a186a60014dc1a8c979c978cb32aba7ecb9102)
*	No longer generate calls to *_finite	serge-sans-paille	2020-02-28	1	-12/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	According to Joseph Myers, a libm maintainer > They were only ever an ABI (selected by use of -ffinite-math-only or > options implying it, which resulted in the headers using "asm" to redirect > calls to some libm functions), not an API. The change means that ABI has > turned into compat symbols (only available for existing binaries, not for > anything newly linked, not included in static libm at all, not included in > shared libm for future glibc ports such as RV32), so, yes, in any case > where tools generate direct calls to those functions (rather than just > following the "asm" annotations on function declarations in the headers), > they need to stop doing so. As a consequence, we should no longer assume these symbols are available on the target system. Still keep the TargetLibraryInfo for constant folding. Differential Revision: https://reviews.llvm.org/D74712 (cherry picked from commit 6d15c4deab51498b70825fb6cefbbfe8f3d9bdcf) For https://bugs.llvm.org/show_bug.cgi?id=45034
*	[Codegen] Revert rL354676/rL354677 and followups - introduced PR43446 miscompile	Roman Lebedev	2020-02-26	1	-4/+27
\| \| \| \| \| \| \| \| \| \|	This reverts https://reviews.llvm.org/D58468 (rL354676, 44037d7a6377ec8e5542cced73583283334b516b), and all and any follow-ups to that code block. https://bugs.llvm.org/show_bug.cgi?id=43446 (cherry picked from commit d20907d1de89bf63b589fadd8c096d4895e47fba)
*	[AArch64][FPenv] Update chain of int to fp conversion	Diogo Sampaio	2020-02-18	1	-0/+67
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: When using strict fp, it is required to update the chain when performing integer type promotion of a operand to a integer to floating point conversion. Reviewers: craig.topper, john.brawn Reviewed By: craig.topper Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74597 (cherry picked from commit 8bc790f9e6a6fc6d8fe8f41a7120269366fa0957)
*	[FPEnv][AArch64] Add lowering of f128 STRICT_FSETCC	John Brawn	2020-02-18	1	-1/+191
\| \| \| \| \| \| \| \|	These get lowered to function calls, like the non-strict versions. Differential Revision: https://reviews.llvm.org/D73784 (cherry picked from commit 68cf574857c81f711f498a479855a17e7bea40f7)
*	[FPEnv][AArch64] Add lowering and instruction selection for strict conversions	John Brawn	2020-02-18	2	-39/+410
\| \| \| \| \| \| \| \| \| \|	Strict fp-to-int and int-to-fp conversions can be handled in the same way that the non-strict versions are (by using the appropriate instruction or converting to a function call when we have no instruction). Differential Revision: https://reviews.llvm.org/D73625 (cherry picked from commit 0bb9a27c9895c0fbc3f55f56ad7f1e1927398fce)
*	[FPEnv][AArch64] Add lowering and instruction selection for STRICT_FP_ROUND	John Brawn	2020-02-18	1	-3/+18
\| \| \| \| \| \| \| \| \| \|	This gets selected to the appropriate fcvt instruction. Handling from there on isn't fully correct yet, as we need to model fcvt reading and writing to fpsr and fpcr. Differential Revision: https://reviews.llvm.org/D73201 (cherry picked from commit 258d8dd76afd88a12539b182a53ff21dcba16a2d)
*	Add lowering of STRICT_FSETCC and STRICT_FSETCCS	John Brawn	2020-02-18	1	-0/+974
\| \| \| \| \| \| \| \| \| \|	These become STRICT_FCMP and STRICT_FCMPE, which then get selected to the corresponding FCMP and FCMPE instructions, though the handling from there on isn't fully correct as we don't model reads and writes to FPCR and FPSR. Differential Revision: https://reviews.llvm.org/D73368 (cherry picked from commit 2224407ef5baf6100fa22420feb4d25af1a9493f)
*	Revert "[DebugInfo] Remove some users of DBG_VALUEs IsIndirect field"	Jeremy Morse	2020-02-12	2	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit ed29dbaafa49bb8c9039a35f768244c394411fea. I'm backing out D68945, which as the discussion for D73526 shows, doesn't seem to handle the -O0 path through the codegen backend correctly. I'll reland the patch when a fix is worked out, apologies for all the churn. The two parent commits are part of this revert too. Conflicts: llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp llvm/test/DebugInfo/X86/dbg-addr-dse.ll SelectionDAGBuilder conflict is due to a nearby change in e39e2b4a79c6 that's technically unrelated. dbg-addr-dse.ll conflicted because 41206b61e30c (legitimately) changes the order of two lines. There are further modifications to dbg-value-func-arg.ll: it landed after the patch being reverted, and I've converted indirection to be represented by the isIndirect field rather than DW_OP_deref. (cherry picked from commit 6531a78ac4b5b229bce272706593a0bc873877d7)
*	[AArch64] Add option to enable/disable load-store renaming.	Florian Hahn	2020-02-10	8	-15/+21
\| \| \| \| \| \| \| \|	This patch adds a new option to enable/disable register renaming in the load-store optimizer. Defaults to disabled, as there is a potential mis-compile caused by this. (cherry picked from commit 8e3f59b45ae185cc9b4e3a817d7ac958f1d55976)
*	[X86] -fpatchable-function-entry=N,0: place patch label after ENDBR{32,64}	Fangrui Song	2020-02-05	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Similar to D73680 (AArch64 BTI). A local linkage function whose address is not taken does not need ENDBR32/ENDBR64. Placing the patch label after ENDBR32/ENDBR64 has the advantage that code does not need to differentiate whether the function has an initial ENDBR. Also, add 32-bit tests and test that .cfi_startproc is at the function entry. The line information has a general implementation and is tested by AArch64/patchable-function-entry-empty.mir Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D73760 (cherry picked from commit 8ff86fcf4c038c7156ed4f01e7ed35cae49489e2)
*	[ARM][VecReduce] Force expand vector_reduce_fmin	David Green	2020-02-05	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Under MVE, we do not have any lowering for fminimum, which a vector_reduce_fmin without NoNan will be expanded into. As with the other recent patches, force this to expand in the pre-isel pass. Note that Neon lowering would be OK because the scalar fminimum uses the vector VMIN instruction, but is probably better to just rely on the scalar operations, which is what is done here. Also fixes what appears to be the reversal of INF vs -INF in the vector_reduce_fmin widening code. (cherry picked from commit 362d00e0510ee75750499e2993a782428e377215)
*	[AArch64][ARM] Always expand ordered vector reductions (PR44600)	Nikita Popov	2020-02-05	3	-0/+330
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	fadd/fmul reductions without reassoc are lowered to VECREDUCE_STRICT_FADD/FMUL nodes, which don't have legalization support. Until that is in place, expand these intrinsics on ARM and AArch64. Other targets always expand the vector reduction intrinsics. Additionally expand fmax/fmin reductions without nonan flag on AArch64, as the backend asserts that the flag is present when lowering VECREDUCE_FMIN/FMAX. This fixes https://bugs.llvm.org/show_bug.cgi?id=44600. Differential Revision: https://reviews.llvm.org/D73135 (cherry picked from commit 70d345e687caba4ac1f95655c6924dfa91e0083f)
*	[AArch64] -fpatchable-function-entry=N,0: place patch label after BTI	Fangrui Song	2020-02-03	1	-2/+43
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: For -fpatchable-function-entry=N,0 -mbranch-protection=bti, after 9a24488cb67a90f889529987275c5e411ce01dda, we place the NOP sled after the initial BTI. ``` .Lfunc_begin0: bti c nop nop .section __patchable_function_entries,"awo",@progbits,f,unique,0 .p2align 3 .xword .Lfunc_begin0 ``` This patch adds a label after the initial BTI and changes the __patchable_function_entries entry to reference the label: ``` .Lfunc_begin0: bti c .Lpatch0: nop nop .section __patchable_function_entries,"awo",@progbits,f,unique,0 .p2align 3 .xword .Lpatch0 ``` This placement is compatible with the resolution in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92424 . A local linkage function whose address is not taken does not need a BTI. Placing the patch label after BTI has the advantage that code does not need to differentiate whether the function has an initial BTI. Reviewers: mrutland, nickdesaulniers, nsz, ostannard Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73680 (cherry picked from commit 06b8e32d4fd3f634f793e3bc0bc4fdb885e7a3ac)
*	Drop arm triple from test/CodeGen/AArch64/global-merge-hidden-minsize.ll	Hans Wennborg	2020-01-30	1	-1/+0
\| \| \| \| \| \| \|	Because it's in the AArch64/ directory, it runs in cases where the arm target may not be available, see comment on D73235. (cherry picked from commit 6be9acdfa814dee6c57833d5351137c72c11fbd3)
*	[GlobalMerge] Preserve symbol visibility when merging globals	Michael Spang	2020-01-29	1	-0/+26
\| \| \| \| \| \| \| \| \| \| \|	Symbols created for merged external global variables have default visibility. This can break programs when compiling with -Oz -fvisibility=hidden as symbols that should be hidden will be exported at link time. Differential Revision: https://reviews.llvm.org/D73235 (cherry picked from commit a2fb2c0ddca14c133f24d08af4a78b6a3d612ec6)
*	[PatchableFunction] Allow empty entry MachineBasicBlock	Fangrui Song	2020-01-24	1	-0/+64
\| \| \| \| \| \| \| \|	Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D73301 (cherry picked from commit 50a3ff30e1587235d1830fec9694c1239302ab9f)
*	Add function attribute "patchable-function-prefix" to support ↵	Fangrui Song	2020-01-24	2	-13/+73
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	-fpatchable-function-entry=N,M where M>0 Similar to the function attribute `prefix` (prefix data), "patchable-function-prefix" inserts data (M NOPs) before the function entry label. -fpatchable-function-entry=2,1 (1 NOP before entry, 1 NOP after entry) will look like: ``` .type foo,@function .Ltmp0: # @foo nop foo: .Lfunc_begin0: # optional `bti c` (AArch64 Branch Target Identification) or # `endbr64` (Intel Indirect Branch Tracking) nop .section __patchable_function_entries,"awo",@progbits,get,unique,0 .p2align 3 .quad .Ltmp0 ``` -fpatchable-function-entry=N,0 + -mbranch-protection=bti/-fcf-protection=branch has two reasonable placements (https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01185.html): ``` (a) (b) func: func: .Ltmp0: bti c bti c .Ltmp0: nop nop ``` (a) needs no additional code. If the consensus is to go for (b), we will need more code in AArch64BranchTargets.cpp / X86IndirectBranchTracking.cpp . Differential Revision: https://reviews.llvm.org/D73070 (cherry picked from commit 22467e259507f5ead2a87d989251b4c951a587e4)
*	[AsmPrinter] Don't emit __patchable_function_entries entry if ↵	Fangrui Song	2020-01-24	2	-16/+24
\| \| \| \| \| \| \| \|	"patchable-function-entry"="0" Add improve tests (cherry picked from commit d232c215669cb57f5eb4ead40a4a336220dbc429)
*	[CodeGen] Move fentry-insert, xray-instrumentation and patchable-function ↵	Fangrui Song	2020-01-24	3	-6/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	before addPreEmitPass() This intention is to move patchable-function before aarch64-branch-targets (configured in AArch64PassConfig::addPreEmitPass) so that we emit BTI before NOPs (see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92424). This also allows addPreEmitPass() passes to know the precise instruction sizes if they want. Tried x86-64 Debug/Release builds of ccls with -fxray-instrument -fxray-instruction-threshold=1. No output difference with this commit and the previous commit. (cherry picked from commit 9a24488cb67a90f889529987275c5e411ce01dda)
*	[AArch64] Don't rename registers with pseudo defs in Ld/St opt.	Florian Hahn	2020-01-22	1	-0/+33
\| \| \| \| \| \| \| \| \| \| \| \|	If the root def of for renaming is a noop-pseudo instruction like kill, we would end up without a correct def for the renamed register, causing miscompiles. This patch conservatively bails out on any pseudo instruction. This fixes https://bugs.chromium.org/p/chromium/issues/detail?id=1037912#c70 (cherry picked from commit 300997c41a00b705ca10264c15910dd8d691ab75)
*	Revert rG6078f2fedcac5797ac39ee5ef3fd7a35ef1202d5 - "[AArch64][GlobalISel]: ↵	Simon Pilgrim	2020-01-15	2	-42/+0
\| \| \| \| \| \| \| \| \| \| \| \| \|	Support @llvm.{return,frame}address selection." These intrinsics expand to a variable number of instructions so just like in ISelLowering.cpp we use custom code to deal with them. Committing Tim's original patch. Differential Revision: https://reviews.llvm.org/D65656 ---- Breaks EXPENSIVE_CHECKS builds.
*	[AArch64][SVE] Add ptest intrinsics	Cullen Rhodes	2020-01-15	2	-0/+62
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Implements the following intrinsics: * @llvm.aarch64.sve.ptest.any * @llvm.aarch64.sve.ptest.first * @llvm.aarch64.sve.ptest.last Reviewers: sdesmalen, efriedma, dancgr, mgudim, cameron.mcinally, rengolin Reviewed By: efriedma Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72398
*	[AArch64][GlobalISel]: Support @llvm.{return,frame}address selection.	Amara Emerson	2020-01-14	2	-0/+42
\| \| \| \| \| \| \| \| \|	These intrinsics expand to a variable number of instructions so just like in ISelLowering.cpp we use custom code to deal with them. Committing Tim's original patch. Differential Revision: https://reviews.llvm.org/D65656
*	[SVE] Add patterns for MUL immediate instruction.	Danilo Carvalho Grael	2020-01-14	3	-0/+106
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: Add the missing MUL pattern for integer immediate instructions. Reviewers: sdesmalen, huntergr, efriedma, c-rhodes, kmclaughlin Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits, amehsan Tags: #llvm Differential Revision: https://reviews.llvm.org/D72654
*	[MachineScheduler] Reduce reordering due to mem op clustering	Jay Foad	2020-01-14	6	-20/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Mem op clustering adds a weak edge in the DAG between two loads or stores that should be clustered, but the direction of this edge is pretty arbitrary (it depends on the sort order of MemOpInfo, which represents the operands of a load or store). This often means that two loads or stores will get reordered even if they would naturally have been scheduled together anyway, which leads to test case churn and goes against the scheduler's "do no harm" philosophy. The fix makes sure that the direction of the edge always matches the original code order of the instructions. Reviewers: atrick, MatzeB, arsenm, rampitec, t.p.northover Subscribers: jvesely, wdng, nhaehnle, kristof.beyls, hiraditya, javed.absar, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72706
*	[AArch64] Fix save register pairing for Windows AAPCS	Sanne Wouda	2020-01-14	1	-0/+35
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: On Windows, when a function does not have an unwind table (for example, EH filtering funclets), we don't correctly pair FP and LR to form the frame record in all circumstances. Fix this by invalidating a pair when the second register is FP when compiling for Windows, even when CFI is not needed. Fixes PR44271 introduced by D65653. Reviewers: efriedma, sdesmalen, rovka, rengolin, t.p.northover, thegameg, greened Reviewed By: rengolin Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71754
*	[GlobalISel] Change representation of shuffle masks in MachineOperand.	Eli Friedman	2020-01-13	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	We're planning to remove the shufflemask operand from ShuffleVectorInst (D72467); fix GlobalISel so it doesn't depend on that Constant. The change to prelegalizercombiner-shuffle-vector.mir happens because the input contains a literal "-1" in the mask (so the parser/verifier weren't really handling it properly). We now treat it as equivalent to "undef" in all contexts. Differential Revision: https://reviews.llvm.org/D72663
*	[AArch64][SVE] Add patterns for some arith SVE instructions.	Danilo Carvalho Grael	2020-01-13	1	-0/+365
\| \| \| \| \| \| \| \| \| \| \|	Summary: Add patterns for the following instructions: - smax, smin, umax, umin Reviewers: sdesmalen, huntergr, rengolin, efriedma, c-rhodes, mgudim, kmclaughlin Subscribers: amehsan Differential Revision: https://reviews.llvm.org/D71779
*	[AArch64] Emit HINT instead of PAC insns in Armv8.2-A or below	Pablo Barrio	2020-01-13	11	-99/+185
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The Pointer Authentication Extension (PAC) was added in Armv8.3-A. Some instructions are implemented in the HINT space to allow compiling code common to CPUs regardless of whether they feature PAC or not, and still benefit from PAC protection in the PAC-enabled CPUs. The 8.3-specific mnemonics were currently enabled in any architecture, and LLVM was emitting them in assembly files when PAC code generation was enabled. This was ok for compilations where both LLVM codegen and the integrated assembler were used. However, the LLVM codegen was not compatible with other assemblers (e.g. GAS). Given the fact that the approach from these assemblers (i.e. to disallow Armv8.3-A mnemonics if compiling for Armv8.2-A or lower) is entirely reasonable, this patch makes LLVM to emit HINT when building for Armv8.2-A and below, instead of PACIASP, AUTIASP and friends. Then, LLVM assembly should be compatible with other assemblers. Reviewers: samparker, chill, LukeCheeseman Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71658
*	[SelectionDAG] ComputeKnownBits - minimum leading/trailing zero bits in ↵	Simon Pilgrim	2020-01-13	2	-9/+4
\| \| \| \| \| \| \| \| \| \|	LSHR/SHL (PR44526) As detailed in https://blog.regehr.org/archives/1709 we don't make use of the known leading/trailing zeros for shifted values in cases where we don't know the shift amount value. This patch adds support to SelectionDAG::ComputeKnownBits to use KnownBits::countMinTrailingZeros and countMinLeadingZeros to set the minimum guaranteed leading/trailing known zero bits. Differential Revision: https://reviews.llvm.org/D72573
*	This option allows selecting the TLS size in the local exec TLS model,	KAWASHIMA Takahiro	2020-01-13	2	-45/+106
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	which is the default TLS model for non-PIC objects. This allows large/ many thread local variables or a compact/fast code in an executable. Specification is same as that of GCC. For example, the code model option precedes the TLS size option. TLS access models other than local-exec are not changed. It means supoort of the large code model is only in the local exec TLS model. Patch By KAWASHIMA Takahiro (kawashima-fj <t-kawashima@fujitsu.com>) Reviewers: dmgreen, mstorsjo, t.p.northover, peter.smith, ostannard Reviewd By: peter.smith Committed by: peter.smith Differential Revision: https://reviews.llvm.org/D71688
*	__patchable_function_entries: don't use linkage field 'unique' with ↵	Fangrui Song	2020-01-12	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \|	-no-integrated-as .section name, "flags"G, @type, GroupName[, linkage] As of binutils 2.33, linkage cannot be 'unique'. For integrated assembler, we use both 'o' flag and 'unique' linkage to support --gc-sections and COMDAT with lld. https://sourceware.org/ml/binutils/2019-11/msg00266.html
*	[AArch64] Don't generate libcalls for wide shifts on Darwin	Jessica Paquette	2020-01-10	1	-0/+5
\| \| \| \| \| \| \|	Similar to cff90f07cb5cc3. Darwin doesn't always use compiler-rt, and so we can't assume that these functions are available (at least on arm64).
*	[AArch64] Add function attribute "patchable-function-entry" to add NOPs at ↵	Fangrui Song	2020-01-10	1	-0/+55
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	function entry The Linux kernel uses -fpatchable-function-entry to implement DYNAMIC_FTRACE_WITH_REGS for arm64 and parisc. GCC 8 implemented -fpatchable-function-entry, which can be seen as a generalized form of -mnop-mcount. The N,M form (function entry points before the Mth NOP) is currently only used by parisc. This patch adds N,0 support to AArch64 codegen. N is represented as the function attribute "patchable-function-entry". We will use a different function attribute for M, if we decide to implement it. The patch reuses the existing patchable-function pass, and TargetOpcode::PATCHABLE_FUNCTION_ENTER which is currently used by XRay. When the integrated assembler is used, __patchable_function_entries will be created for each text section with the SHF_LINK_ORDER flag to prevent --gc-sections (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93197) and COMDAT (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93195) issues. Retrospectively, __patchable_function_entries should use a PC-relative relocation type to avoid the SHF_WRITE flag and dynamic relocations. "patchable-function-entry"'s interaction with Branch Target Identification is still unclear (see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92424 for GCC discussions). Reviewed By: peter.smith Differential Revision: https://reviews.llvm.org/D72215
*	Add support for __declspec(guard(nocf))	Andrew Paverd	2020-01-10	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Avoid using the `nocf_check` attribute with Control Flow Guard. Instead, use a new `"guard_nocf"` function attribute to indicate that checks should not be added on indirect calls within that function. Add support for `__declspec(guard(nocf))` following the same syntax as MSVC. Reviewers: rnk, dmajor, pcc, hans, aaron.ballman Reviewed By: aaron.ballman Subscribers: aaron.ballman, tomrittervg, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D72167
*	Relax opcode checks in test for G_READCYCLECOUNTER to check for only a ↵	Douglas Yung	2020-01-09	1	-1/+1
\| \| \| \|	number instead of a specific number.
*	[AArch64][GlobalISel] Implement selection of <2 x float> vector splat.	Amara Emerson	2020-01-09	3	-6/+71
\| \| \| \| \| \|	Also requires making G_IMPLICIT_DEF of v2s32 legal. Differential Revision: https://reviews.llvm.org/D72422
*	[GlobalISel][AArch64] Import + select LDRroW and STRroW patterns	Jessica Paquette	2020-01-09	2	-0/+483
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This adds support for selecting a large chunk of the load/store roW patterns. This is pretty much a straight port of AArch64DAGToDAGISel::SelectAddrModeWRO into GISel. The code is very similar to the XRO code. The main difference is that in the roW patterns, we want to try and fold in an extend, and possibly a shift along with it. A good portion of this patch is refactoring the existing XRO code. - Add selectAddrModeWRO - Factor out the code from selectAddrModeShiftedExtendXReg which is used by both selectAddrModeXRO and selectAddrModeWRO into selectExtendedSHL. This is similar to the function of the same name in AArch64DAGToDAGISel. - Add support for extends to the factored out code in selectExtendedSHL. - Teach getExtendTypeForInst how to handle AND masks that are intended to be used in loads/stores (necessary for this addressing mode.) - Make getExtendTypeForInst not static because moving it made an annoying diff and I wanted to have the WRO/XRO functions close to each other while I was writing the code. Differential Revision: https://reviews.llvm.org/D72426
*	Revert "Merge memtag instructions with adjacent stack slots."	Evgenii Stepanov	2020-01-08	4	-308/+13
\| \| \| \| \| \| \| \| \| \| \| \|	* Bad machine code: Tied use must be a register * - function: stg_alloca17 - basic block: %bb.0 entry (0x20076710580) - instruction: early-clobber %0:gpr64common, early-clobber %1:gpr64sp = STGloop 272, %stack.0.a :: (store 272 into %ir.a, align 16) - operand 3: %stack.0.a http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/21481/steps/test-check-all/logs/stdio This reverts commit b675a7628ce6a21b1e4a71c079a67badfb8b073d.
*	Merge memtag instructions with adjacent stack slots.	Evgenii Stepanov	2020-01-08	4	-13/+308
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Detect a run of memory tagging instructions for adjacent stack frame slots, and replace them with a shorter instruction sequence * replace STG + STG with ST2G * replace STGloop + STGloop with STGloop This code needs to run when stack slot offsets are already known, but before FrameIndex operands in STG instructions are eliminated; that's the reason for the new hook in PrologueEpilogue. This change modifies STGloop and STZGloop pseudos to take the size as an immediate integer operand, and base address as a FI operand when possible. This is needed to simplify recognizing an STGloop instruction as operating on a stack slot post-regalloc. This improves memtag code size by ~0.25%, and it looks like an additional ~0.1% is possible by rearranging the stack frame such that consecutive STG instructions reference adjacent slots (patch pending). Reviewers: pcc, ostannard Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70286
*	AArch64: add missing Apple CPU names and use them by default.	Tim Northover	2020-01-08	1	-0/+1
\| \| \| \| \| \| \| \|	Apple's CPUs are called A7-A13 in official communication, occasionally with weird suffixes which we probably don't need to care about. This adds each one and describes its features. It also switches the default CPU to the canonical name for Cyclone, but leaves legacy support in so that existing bitcode still compiles.
*	[AArch64][GlobalISel] Fold a chain of two G_PTR_ADDs of constant offsets.	Amara Emerson	2020-01-07	1	-0/+72
\| \| \| \| \| \| \| \| \| \|	E.g. %addr1 = G_PTR_ADD %base, G_CONSTANT 20 %addr2 = G_PTR_ADD %addr1, G_CONSTANT 8 --> %addr2 = G_PTR_ADD %base, G_CONSTANT 28 Differential Revision: https://reviews.llvm.org/D72351
*	[MachineOutliner][AArch64] Save + restore LR in noreturn functions	Jessica Paquette	2020-01-07	2	-56/+103
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Conservatively always save + restore LR in noreturn functions. These functions do not end in a RET, and so they aren't guaranteed to have an instruction which uses LR in any way. So, as a result, you can end up in unfortunate situations where you can't backtrace out of these functions in a debugger. Remove the old noreturn test, and add a new one which is more descriptive. Remove the restriction that we can't outline from noreturn functions as well since we now do the right thing.
*	Lower TAGPstack with negative offset to SUBG.	Evgenii Stepanov	2020-01-06	1	-0/+37
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This never really occurs in the current codegen, so only a MIR test is possible. Reviewers: ostannard, pcc Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72123
*	GlobalISel: Define G_READCYCLECOUNTER	Matt Arsenault	2020-01-04	2	-0/+15
\|