summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/AArch64
Commit message (Collapse)AuthorAgeFilesLines
...
* Reland [AArch64][DebugInfo] Do not recompute CalleeSavedStackSize (Take 2)Sander de Smalen2019-10-282-4/+57
| | | | | Fixed up test/DebugInfo/MIR/Mips/live-debug-values-reg-copy.mir that broke r375425.
* Add Windows Control Flow Guard checks (/guard:cf).Andrew Paverd2019-10-287-3/+36
| | | | | | | | | | | | | | | | | | | Summary: A new function pass (Transforms/CFGuard/CFGuard.cpp) inserts CFGuard checks on indirect function calls, using either the check mechanism (X86, ARM, AArch64) or or the dispatch mechanism (X86-64). The check mechanism requires a new calling convention for the supported targets. The dispatch mechanism adds the target as an operand bundle, which is processed by SelectionDAG. Another pass (CodeGen/CFGuardLongjmp.cpp) identifies and emits valid longjmp targets, as required by /guard:cf. This feature is enabled using the `cfguard` CC1 option. Reviewers: thakis, rnk, theraven, pcc Subscribers: ychen, hans, metalcanine, dmajor, tomrittervg, alex, mehdi_amini, mgorny, javed.absar, kristof.beyls, hiraditya, steven_wu, dexonsmith, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D65761
* [AArch64] Fix unannotated fall-through between switch labelsJinsong Ji2019-10-281-0/+1
| | | | | | This is breaking buildbot with -Werror,-Wimplicit-fallthrough on. eg: http://lab.llvm.org:8011/builders/ppc64le-lld-multistage-test/builds/6881
* [ARM][AArch64] Implement __cls, __clsl and __clsll intrinsics from ACLEvhscampos2019-10-281-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: Writing support for three ACLE functions: unsigned int __cls(uint32_t x) unsigned int __clsl(unsigned long x) unsigned int __clsll(uint64_t x) CLS stands for "Count number of leading sign bits". In AArch64, these two intrinsics can be translated into the 'cls' instruction directly. In AArch32, on the other hand, this functionality is achieved by implementing it in terms of clz (count number of leading zeros). Reviewers: compnerd Reviewed By: compnerd Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D69250
* [AArch64][SVE] Implement masked load intrinsicsKerry McLaughlin2019-10-287-4/+136
| | | | | | | | | | | | | | | | Summary: Adds support for codegen of masked loads, with non-extending, zero-extending and sign-extending variants. Reviewers: huntergr, rovka, greened, dmgreen Reviewed By: dmgreen Subscribers: dmgreen, samparker, tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68877
* [Alignment][NFC] getMemoryOpCost uses MaybeAlignGuillaume Chatelet2019-10-252-5/+5
| | | | | | | | | | | | | | | Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: nemanjai, hiraditya, kbarton, MaskRay, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69307
* [Mips] Use appropriate private label prefix based on Mips ABIMirko Brkusanin2019-10-231-1/+2
| | | | | | | | | | MipsMCAsmInfo was using '$' prefix for Mips32 and '.L' for Mips64 regardless of -target-abi option. By passing MCTargetOptions to MCAsmInfo we can find out Mips ABI and pick appropriate prefix. Tags: #llvm, #clang, #lldb Differential Revision: https://reviews.llvm.org/D66795
* [GISel][CombinerHelper] Add a combine turning shuffle_vector into concat_vectorsQuentin Colombet2019-10-211-0/+2
| | | | | | | | | | Teach the CombinerHelper how to turn shuffle_vectors, that concatenate vectors, into concat_vectors and add this combine to the AArch64 pre-legalizer combiner. Differential Revision: https://reviews.llvm.org/D69149 llvm-svn: 375452
* Reverted r375425 as it broke some buildbots.Sander de Smalen2019-10-212-57/+4
| | | | llvm-svn: 375444
* [AArch64][DebugInfo] Do not recompute CalleeSavedStackSize (Take 2)Sander de Smalen2019-10-212-4/+57
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit message from D66935: This patch fixes a bug exposed by D65653 where a subsequent invocation of `determineCalleeSaves` ends up with a different size for the callee save area, leading to different frame-offsets in debug information. In the invocation by PEI, `determineCalleeSaves` tries to determine whether it needs to spill an extra callee-saved register to get an emergency spill slot. To do this, it calls 'estimateStackSize' and manually adds the size of the callee-saves to this. PEI then allocates the spill objects for the callee saves and the remaining frame layout is calculated accordingly. A second invocation in LiveDebugValues causes estimateStackSize to return the size of the stack frame including the callee-saves. Given that the size of the callee-saves is added to this, these callee-saves are counted twice, which leads `determineCalleeSaves` to believe the stack has become big enough to require spilling an extra callee-save as emergency spillslot. It then updates CalleeSavedStackSize with a larger value. Since CalleeSavedStackSize is used in the calculation of the frame offset in getFrameIndexReference, this leads to incorrect offsets for variables/locals when this information is recalculated after PEI. This patch fixes the lldb unit tests in `functionalities/thread/concurrent_events/*` Changes after D66935: Ensures AArch64FunctionInfo::getCalleeSavedStackSize does not return the uninitialized CalleeSavedStackSize when running `llc` on a specific pass where the MIR code has already been expected to have gone through PEI. Instead, getCalleeSavedStackSize (when passed the MachineFrameInfo) will try to recalculate the CalleeSavedStackSize from the CalleeSavedInfo. In debug mode, the compiler will assert the recalculated size equals the cached size as calculated through a call to determineCalleeSaves. This fixes two tests: test/DebugInfo/AArch64/asan-stack-vars.mir test/DebugInfo/AArch64/compiler-gen-bbs-livedebugvalues.mir that otherwise fail when compiled using msan. Reviewed By: omjavaid, efriedma Tags: #llvm Differential Revision: https://reviews.llvm.org/D68783 llvm-svn: 375425
* Use Align for TFL::TransientStackAlignmentGuillaume Chatelet2019-10-211-1/+1
| | | | | | | | | | | | | | | | | Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: arsenm, dschuff, jyknight, sdardis, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, fedor.sergeev, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69216 llvm-svn: 375398
* Prune a LegacyDivergenceAnalysis and MachineLoopInfo include eachReid Kleckner2019-10-191-0/+1
| | | | | | Now X86ISelLowering doesn't depend on many IR analyses. llvm-svn: 375320
* Prune two MachineInstr.h includes, fix up depsReid Kleckner2019-10-191-0/+1
| | | | | | | | | | MachineInstr.h included AliasAnalysis.h, which includes a world of IR constructs mostly unneeded in CodeGen. Prune it. Same for DebugInfoMetadata.h. Noticed with -ftime-trace. llvm-svn: 375311
* [GISel][CallLowering] Make isIncomingArgumentHandler a pure virtual methodQuentin Colombet2019-10-181-0/+2
| | | | | | | | | | | | | The default implementation of isIncomingArgumentHandler could lead to generating incorrect code. Make it a pure virtual method, so that targets know they have to override it to produce correct code. NFC Differential Revision: https://reviews.llvm.org/D69187 llvm-svn: 375277
* [AArch64] Adding support for PMMIR_EL1 registerVictor Campos2019-10-184-1/+16
| | | | | | | | | | | | | | | | | | Summary: The PMMIR_EL1 register is present in Armv8.4 with PMU extension. This patch adds support for it. Reviewers: t.p.northover, dnsampaio Reviewed By: dnsampaio Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68940 llvm-svn: 375228
* [AArch64][SVE] Add SPLAT_VECTOR ISD NodeGraham Hunter2019-10-184-8/+54
| | | | | | | | | | | | | | | | | | | | | | | | | Adds a new ISD node to replicate a scalar value across all elements of a vector. This is needed for scalable vectors, since BUILD_VECTOR cannot be used. Fixes up default type legalization for scalable vectors after the new MVT type ranges were introduced. At present I only use this node for scalable vectors. A DAGCombine has been added to transform a BUILD_VECTOR into a SPLAT_VECTOR if all elements are the same, but only if the default operation action of Expand has been overridden by the target. I've only added result promotion legalization for scalable vector i8/i16/i32/i64 types in AArch64 for now. Reviewers: t.p.northover, javed.absar, greened, cameron.mcinally, jmolloy Reviewed By: jmolloy Differential Revision: https://reviews.llvm.org/D47775 llvm-svn: 375222
* [AArch64] Don't combine callee-save and local stack adjustment when ↵David Green2019-10-181-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | optimizing for size For arm64, D18619 introduced the ability to combine bumping the stack pointer upfront in case it needs to be bumped for both the callee-save area as well as the local stack area. That diff already remarks that "This change can cause an increase in instructions", but argues that even when that happens, it should be still be a performance benefit because the number of micro-ops is reduced. We have observed that this code-size increase can be significant in practice. This diff disables combining stack bumping for methods that are marked as optimize-for-size. Example of a prologue with the behavior before this diff (combining stack bumping when possible): sub sp, sp, #0x40 stp d9, d8, [sp, #0x10] stp x20, x19, [sp, #0x20] stp x29, x30, [sp, #0x30] add x29, sp, #0x30 [... compute x8 somehow ...] stp x0, x8, [sp] And after this diff, if the method is marked as optimize-for-size: stp d9, d8, [sp, #-0x30]! stp x20, x19, [sp, #0x10] stp x29, x30, [sp, #0x20] add x29, sp, #0x20 [... compute x8 somehow ...] stp x0, x8, [sp, #-0x10]! Note that without combining the stack bump there are two auto-decrements, nicely folded into the stp instructions, whereas otherwise there is a single sub sp, ... instruction, but not folded. Patch by Nikolai Tillmann! Differential Revision: https://reviews.llvm.org/D68530 llvm-svn: 375217
* [AArch64][SVE] Implement unpack intrinsicsKerry McLaughlin2019-10-185-5/+39
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: Implements the following intrinsics: - int_aarch64_sve_sunpkhi - int_aarch64_sve_sunpklo - int_aarch64_sve_uunpkhi - int_aarch64_sve_uunpklo This patch also adds AArch64ISD nodes for UNPK instead of implementing the intrinsics directly, as they are required for a future patch which implements the sign/zero extension of legal vectors. This patch includes tests for the Subdivide2Argument type added by D67549 Reviewers: sdesmalen, SjoerdMeijer, greened, rengolin, rovka Reviewed By: greened Subscribers: tschuett, kristof.beyls, rkruppe, psnobl, cfe-commits, llvm-commits Differential Revision: https://reviews.llvm.org/D67550 llvm-svn: 375210
* [Alignment][NFC] Use Align for TargetFrameLowering/SubtargetGuillaume Chatelet2019-10-171-1/+1
| | | | | | | | | | | | | | | | | Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, Jim, lenary, s.egerton, pzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68993 llvm-svn: 375084
* [gicombiner] Add the run-time rule disable optionDaniel Sanders2019-10-172-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: Each generated helper can be configured to generate an option that disables rules in that helper. This can be used to bisect rulesets. The disable bits are stored in a SparseVector as this is very cheap for the common case where nothing is disabled. It gets more expensive the more rules are disabled but you're generally doing that for debug purposes where performance is less of a concern. Depends on D68426 Reviewers: volkan, bogner Reviewed By: volkan Subscribers: hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68438 llvm-svn: 375067
* [GISel][CombinerHelper] Add concat_vectors(build_vector, build_vector) => ↵Quentin Colombet2019-10-171-0/+2
| | | | | | | | | | | | | build_vector Teach the combiner helper how to flatten concat_vectors of build_vectors into a build_vector. Add this combine as part of AArch64 pre-legalizer combiner. Differential Revision: https://reviews.llvm.org/D69071 llvm-svn: 375066
* [gicombiner] Hoist pure C++ combine into the tablegen definitionDaniel Sanders2019-10-162-8/+3
| | | | | | | | | | | | | | | | | | | | | | Summary: This is just moving the existing C++ code around and will be NFC w.r.t AArch64. Renamed 'CombineBr' to something more descriptive ('ElideByByInvertingCond') at the same time. The remaining combines in AArch64PreLegalizeCombiner require features that aren't implemented at this point and will be hoisted as they are added. Depends on D68424 Reviewers: bogner, volkan Subscribers: kristof.beyls, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68426 llvm-svn: 375057
* [AArch64] Fix offset calculationShoaib Meenai2019-10-162-4/+4
| | | | | | | | | | | | | | | | | r374772 changed Offset to be an int64_t but left NewOffset as an int. Scale is unsigned, so in the calculation `Offset - NewOffset * Scale`, `NewOffset * Scale` was promoted to unsigned and was then zero-extended to 64 bits, leading to an incorrect computation which manifested as an out-of-memory when building the Swift standard library for Android aarch64. Promote NewOffset to int64_t to fix this, and promote EmittableOffset as well, since its one user passes it to a function which takes an int64_t anyway. Test case based on a suggestion by Sander de Smalen! Differential Revision: https://reviews.llvm.org/D69018 llvm-svn: 375043
* [AArch64,Assembler] Compiler support for ID_MMFR5_EL1Mark Murray2019-10-161-0/+1
| | | | | | | | | | | | Summary: Add read-only system register ID_MMFR5_EL1 and unit tests. Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69039 llvm-svn: 375010
* [Alignment][NFC] Value::getPointerAlignment returns MaybeAlignGuillaume Chatelet2019-10-151-2/+2
| | | | | | | | | | | | | | | | | Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet, jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68398 llvm-svn: 374889
* [AArch64] Stackframe accesses to SVE objects.Sander de Smalen2019-10-144-41/+108
| | | | | | | | | | | | | | | | | Materialize accesses to SVE frame objects from SP or FP, whichever is available and beneficial. This patch still assumes the objects are pre-allocated. The automatic layout of SVE objects within the stackframe will be added in a separate patch. Reviewers: greened, cameron.mcinally, efriedma, rengolin, thegameg, rovka Reviewed By: cameron.mcinally Differential Revision: https://reviews.llvm.org/D67749 llvm-svn: 374772
* recommit: [LoopVectorize][PowerPC] Estimate int and float register pressure ↵Zi Xuan Wu2019-10-121-1/+2
| | | | | | | | | | | | | | | | | | | | | | | separately in loop-vectorize In loop-vectorize, interleave count and vector factor depend on target register number. Currently, it does not estimate different register pressure for different register class separately(especially for scalar type, float type should not be on the same position with int type), so it's not accurate. Specifically, it causes too many times interleaving/unrolling, result in too many register spills in loop body and hurting performance. So we need classify the register classes in IR level, and importantly these are abstract register classes, and are not the target register class of backend provided in td file. It's used to establish the mapping between the types of IR values and the number of simultaneous live ranges to which we'd like to limit for some set of those types. For example, POWER target, register num is special when VSX is enabled. When VSX is enabled, the number of int scalar register is 32(GPR), float is 64(VSR), but for int and float vector register both are 64(VSR). So there should be 2 kinds of register class when vsx is enabled, and 3 kinds of register class when VSX is NOT enabled. It runs on POWER target, it makes big(+~30%) performance improvement in one specific bmk(503.bwaves_r) of spec2017 and no other obvious degressions. Differential revision: https://reviews.llvm.org/D67148 llvm-svn: 374634
* [AArch64][SVE] Implement sdot and udot (lane) intrinsicsKerry McLaughlin2019-10-113-22/+35
| | | | | | | | | | | | | | | | | | | | | Summary: Implements the following arithmetic intrinsics: - int_aarch64_sve_sdot - int_aarch64_sve_sdot_lane - int_aarch64_sve_udot - int_aarch64_sve_udot_lane This patch includes tests for the Subdivide4Argument type added by D67549 Reviewers: sdesmalen, SjoerdMeijer, greened, rengolin, rovka Reviewed By: sdesmalen Subscribers: tschuett, kristof.beyls, rkruppe, psnobl, cfe-commits, llvm-commits Differential Revision: https://reviews.llvm.org/D67551 llvm-svn: 374566
* [System Model] [TTI] Update cache and prefetch TTI interfacesDavid Greene2019-10-093-28/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Re-apply 9fdfb045ae8b/r365676 with fixes for PPC and Hexagon. This involved moving defaults from TargetTransformInfoImplBase to MCSubtargetInfo. Rework the TTI cache and software prefetching APIs to prepare for the introduction of a general system model. Changes include: - Marking existing interfaces const and/or override as appropriate - Adding comments - Adding BasicTTIImpl interfaces that delegate to a subtarget implementation - Moving the default TargetTransformInfoImplBase implementation to a default MCSubtarget implementation Only a handful of targets use these interfaces currently: AArch64, Hexagon, PPC and SystemZ. AArch64 already has a custom subtarget implementation, so its custom TTI implementation is migrated to use the new facilities in BasicTTIImpl to invoke its custom subtarget implementation. The custom TTI implementations continue to exist for the other targets with this change. They are not moved over to subtarget-based implementations. The end goal is to have the default subtarget implementation defer to the system model defined by the target. With this change, the default MCSubtargetInfo implementation essentially returns the defaults TargetTransformInfoImplBase used to return. Existing users of TTI defaults will hit the defaults now in MCSubtargetInfo. Targets that define their own custom TTI implementations won't use the BasicTTIImpl implementations that route to the subtarget. Once system models are in place for the targets that use these interfaces, their custom TTI implementations can be removed. Differential Revision: https://reviews.llvm.org/D63614 llvm-svn: 374205
* [AArch64] Ensure no tagged memory is left in the unallocated portion of theMomchil Velikov2019-10-091-15/+68
| | | | | | | | | | | | | | | | | stack This patch makes sure that if we tag some memory, we untag that memory before the function returns/throws via any exit, reachable from the tag operation. For that we place the untag operation either at: a) the lifetime end call for the alloca, if that call post-dominates the lifetime start call (where the tag operation is placed), or it (the lifetime end call) dominates all reachable exits, otherwise b) at the reachable exits Differential Revision: https://reviews.llvm.org/D68469 llvm-svn: 374182
* Revert "[LoopVectorize][PowerPC] Estimate int and float register pressure ↵Jinsong Ji2019-10-081-2/+1
| | | | | | | | | | | | | | separately in loop-vectorize" Also Revert "[LoopVectorize] Fix non-debug builds after rL374017" This reverts commit 9f41deccc0e648a006c9f38e11919f181b6c7e0a. This reverts commit 18b6fe07bcf44294f200bd2b526cb737ed275c04. The patch is breaking PowerPC internal build, checked with author, reverting on behalf of him for now due to timezone. llvm-svn: 374091
* fix fmls fp16Sebastian Pop2019-10-081-12/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Tim Northover remarked that the added patterns for fmls fp16 produce wrong code in case the fsub instruction has a multiplication as its first operand, i.e., all the patterns FMLSv*_OP1: > define <8 x half> @test_FMLSv8f16_OP1(<8 x half> %a, <8 x half> %b, <8 x half> %c) { > ; CHECK-LABEL: test_FMLSv8f16_OP1: > ; CHECK: fmls {{v[0-9]+}}.8h, {{v[0-9]+}}.8h, {{v[0-9]+}}.8h > entry: > > %mul = fmul fast <8 x half> %c, %b > %sub = fsub fast <8 x half> %mul, %a > ret <8 x half> %sub > } > > This doesn't look right to me. The exact instruction produced is "fmls > v0.8h, v2.8h, v1.8h", which I think calculates "v0 - v2*v1", but the > IR is calculating "v2*v1-v0". The equivalent <4 x float> code also > doesn't emit an fmls. This patch generates an fmla and negates the value of the operand2 of the fsub. Inspecting the pattern match, I found that there was another mistake in the opcode to be selected: matching FMULv4*16 should generate FMLSv4*16 and not FMLSv2*32. Tested on aarch64-linux with make check-all. Differential Revision: https://reviews.llvm.org/D67990 llvm-svn: 374044
* [SVE][IR] Scalable Vector size queries and IR instruction supportGraham Hunter2019-10-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | * Adds a TypeSize struct to represent the known minimum size of a type along with a flag to indicate that the runtime size is a integer multiple of that size * Converts existing size query functions from Type.h and DataLayout.h to return a TypeSize result * Adds convenience methods (including a transparent conversion operator to uint64_t) so that most existing code 'just works' as if the return values were still scalars. * Uses the new size queries along with ElementCount to ensure that all supported instructions used with scalable vectors can be constructed in IR. Reviewers: hfinkel, lattner, rkruppe, greened, rovka, rengolin, sdesmalen Reviewed By: rovka, sdesmalen Differential Revision: https://reviews.llvm.org/D53137 llvm-svn: 374042
* [ISEL][ARM][AARCH64] Tracking simple parameter forwarding registersNikola Prica2019-10-081-1/+16
| | | | | | | | | | | | | | Support for tracking registers that forward function parameters into the following function frame. For now we only support cases when parameter is forwarded through single register. Reviewers: aprantl, vsk, t.p.northover Reviewed By: vsk Differential Revision: https://reviews.llvm.org/D66953 llvm-svn: 374033
* [LoopVectorize][PowerPC] Estimate int and float register pressure separately ↵Zi Xuan Wu2019-10-081-1/+2
| | | | | | | | | | | | | | | | | | | | | | | in loop-vectorize In loop-vectorize, interleave count and vector factor depend on target register number. Currently, it does not estimate different register pressure for different register class separately(especially for scalar type, float type should not be on the same position with int type), so it's not accurate. Specifically, it causes too many times interleaving/unrolling, result in too many register spills in loop body and hurting performance. So we need classify the register classes in IR level, and importantly these are abstract register classes, and are not the target register class of backend provided in td file. It's used to establish the mapping between the types of IR values and the number of simultaneous live ranges to which we'd like to limit for some set of those types. For example, POWER target, register num is special when VSX is enabled. When VSX is enabled, the number of int scalar register is 32(GPR), float is 64(VSR), but for int and float vector register both are 64(VSR). So there should be 2 kinds of register class when vsx is enabled, and 3 kinds of register class when VSX is NOT enabled. It runs on POWER target, it makes big(+~30%) performance improvement in one specific bmk(503.bwaves_r) of spec2017 and no other obvious degressions. Differential revision: https://reviews.llvm.org/D67148 llvm-svn: 374017
* [AArch64InstPrinter] prefer bfi to bfc for < armv8.2-aNick Desaulniers2019-10-031-1/+2
| | | | | | | | | | | | | | | | | | | Summary: Fixes pr/42576. Link: https://github.com/ClangBuiltLinux/linux/issues/697 Reviewers: t.p.northover Reviewed By: t.p.northover Subscribers: kristof.beyls, hiraditya, llvm-commits, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D68356 llvm-svn: 373655
* Revert "[Alignment][NFC] Allow constexpr Align"Guillaume Chatelet2019-10-031-1/+1
| | | | | | This reverts commit b3af236fb5fc6e50fcc1b54d868f0bff557f3fb1. llvm-svn: 373619
* [AArch64][SVE] Adding patterns for floating point SVE add instructions.Ehsan Amiri2019-10-032-12/+14
| | | | llvm-svn: 373600
* [AArch64] Static (de)allocation of SVE stack objects.Sander de Smalen2019-10-035-13/+171
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Adds support to AArch64FrameLowering to allocate fixed-stack SVE objects. The focus of this patch is purely to allow the stack frame to allocate/deallocate space for scalable SVE objects. More dynamic allocation (at compile-time, i.e. determining placement of SVE objects on the stack), or resolving frame-index references that include scalable-sized offsets, are left for subsequent patches. SVE objects are allocated in the stack frame as a separate region below the callee-save area, and above the alignment gap. This is done so that the SVE objects can be accessed directly from the FP at (runtime) VL-based offsets to benefit from using the VL-scaled addressing modes. The layout looks as follows: +-------------+ | stack arg | +-------------+ | Callee Saves| | X29, X30 | (if available) |-------------| <- FP (if available) | : | | SVE area | | : | +-------------+ |/////////////| alignment gap. | : | | Stack objs | | : | +-------------+ <- SP after call and frame-setup SVE and non-SVE stack objects are distinguished using different StackIDs. The offsets for objects with TargetStackID::SVEVector should be interpreted as purely scalable offsets within their respective SVE region. Reviewers: thegameg, rovka, t.p.northover, efriedma, rengolin, greened Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D61437 llvm-svn: 373585
* Fix uninitialized variable warning. NFCISimon Pilgrim2019-10-031-1/+1
| | | | llvm-svn: 373583
* [Alignment][NFC] Allow constexpr AlignGuillaume Chatelet2019-10-031-1/+1
| | | | | | | | | | | | | | | | | Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68329 llvm-svn: 373580
* [gicombiner] Fix windows issue where single quotes in the command are passed ↵Daniel Sanders2019-10-021-1/+1
| | | | | | through to tablegen llvm-svn: 373545
* [gicombiner] Add the boring boilerplate for the declarative combinerDaniel Sanders2019-10-024-0/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is the first of a series of patches extracted from a much bigger WIP patch. It merely establishes the tblgen pass and the way empty combiner helpers are declared and integrated into a combiner info. The tablegen pass takes a -combiners option to select the combiner helper that will be generated. This can be given multiple values to generate multiple combiner helpers at once. Doing so helps to minimize parsing overhead. The reason for creating a GlobalISel subdirectory in utils/TableGen is that there will be quite a lot of non-pass files (~15) by the time the patch series is done. Reviewers: volkan Subscribers: mgorny, hiraditya, simoncook, Petar.Avramovic, s.egerton, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68286 llvm-svn: 373527
* [AArch64][SVE] Implement int_aarch64_sve_cnt intrinsicKerry McLaughlin2019-10-022-6/+16
| | | | | | | | | | | | | | | | Summary: This patch includes tests for the VecOfBitcastsToInt type added by D68021 Reviewers: c-rhodes, sdesmalen, rovka Reviewed By: c-rhodes Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits, cfe-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68023 llvm-svn: 373468
* TLI: Remove DAG argument from getRegisterByNameMatt Arsenault2019-10-012-5/+5
| | | | | | | | | | | Replace with the MachineFunction. X86 is the only user, and only uses it for the function. This removes one obstacle from using this in GlobalISel. The other is the more tolerable EVT argument. The X86 use of the function seems questionable to me. It checks hasFP, before frame lowering. llvm-svn: 373292
* [AArch64][SVE] Implement punpk[hi|lo] intrinsicsKerry McLaughlin2019-09-302-2/+15
| | | | | | | | | | | | | | | | | | | | | | Summary: Adds the following two intrinsics: - int_aarch64_sve_punpkhi - int_aarch64_sve_punpklo This patch also contains a fix which allows LLVMHalfElementsVectorType to forward reference overloadable arguments. Reviewers: sdesmalen, rovka, rengolin Reviewed By: sdesmalen Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, greened, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67830 llvm-svn: 373232
* [AArch64][GlobalISel] Support lowering variadic musttail callsJessica Paquette2019-09-301-11/+69
| | | | | | | | | | | | | | | | | | | | | | | | | | This adds support for lowering variadic musttail calls. To do this, we have to... - Detect a musttail call in a variadic function before attempting to lower the call's formal arguments. This is done in the IRTranslator. - Compute forwarded registers in `lowerFormalArguments`, and add copies for those registers. - Restore the forwarded registers in `lowerTailCall`. Because there doesn't seem to be any nice way to wrap these up into the outgoing argument handler, the restore code in `lowerTailCall` is done separately. Also, irritatingly, you have to make sure that the registers don't overlap with any passed parameters. Otherwise, the scheduler doesn't know what to do with the extra copies and asserts. Add call-translator-variadic-musttail.ll to test this. This is pretty much the same as the X86 musttail-varargs.ll test. We didn't have as nice of a test to base this off of, but the idea is the same. Differential Revision: https://reviews.llvm.org/D68043 llvm-svn: 373226
* [Alignment][NFC] Remove AllocaInst::setAlignment(unsigned)Guillaume Chatelet2019-09-301-3/+4
| | | | | | | | | | | | | | | | | Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: jholewinski, arsenm, jvesely, nhaehnle, eraman, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D68141 llvm-svn: 373207
* [GlobalISel Enable memcpy inlining with optsize.Amara Emerson2019-09-281-1/+1
| | | | | | We should be disabling inline for minsize, not optsize. llvm-svn: 373143
* [Alignment][NFC] Remove unneeded llvm:: scoping on Align typesGuillaume Chatelet2019-09-273-8/+7
| | | | llvm-svn: 373081
OpenPOWER on IntegriCloud