summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
Commit message (Collapse)AuthorAgeFilesLines
* [AArch64][SVE] Add patterns for some arith SVE instructions.Danilo Carvalho Grael2020-01-131-0/+29
| | | | | | | | | | | Summary: Add patterns for the following instructions: - smax, smin, umax, umin Reviewers: sdesmalen, huntergr, rengolin, efriedma, c-rhodes, mgudim, kmclaughlin Subscribers: amehsan Differential Revision: https://reviews.llvm.org/D71779
* [SelectionDAG] Disallow indirect "i" constraintFangrui Song2019-12-291-1/+0
| | | | | | | | | This allows us to delete InlineAsm::Constraint_i workarounds in SelectionDAGISel::SelectInlineAsmMemoryOperand overrides and TargetLowering::getInlineAsmMemConstraint overrides. They were introduced to X86 in r237517 to prevent crashes for constraints like "=*imr". They were later copied to other targets.
* Fix "result of 32-bit shift implicitly converted to 64 bits" warning. NFC.Simon Pilgrim2019-12-211-1/+1
|
* [AArch64][SVE] Fold constant multiply of element countCullen Rhodes2019-12-201-0/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: E.g. %0 = tail call i64 @llvm.aarch64.sve.cntw(i32 31) %mul = mul i64 %0, <const> Should emit: cntw x0, all, mul #<const> For <const> in the range 1-16. Patch by Kerry McLaughlin Reviewers: sdesmalen, huntergr, dancgr, rengolin, efriedma Reviewed By: sdesmalen Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71014
* [AArch64] match fcvtl2 with bitcasted extractSanjay Patel2019-12-181-0/+35
| | | | | | | | | | | | | This should eliminate a regression seen in D63815. If we are FP extending the high half extract of a vector, we should be able to peek through a bitcast sitting between the extract and extend. This replaces tablegen patterns with a more general DAG to DAG override, so we can handle any casted type. Differential Revision: https://reviews.llvm.org/D71515
* [AArch64][SVE] Change pattern generation code to fix -Wimplicit-fallthrough ↵Fangrui Song2019-12-161-4/+11
| | | | after D71483
* [AArch64][SVE] Add patterns for logical immediate operations.Danilo Carvalho Grael2019-12-161-0/+36
| | | | | | | | | | | | | | Summary: Add pattern matching for the following SVE logical vector and immediate instructions: - and/bic, orr/orn, eor/eon. Reviewers: sdesmalen, huntergr, rengolin, efriedma, c-rhodes, mgudim, kmclaughlin Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits, amehsan Tags: #llvm Differential Revision: https://reviews.llvm.org/D71483
* [AArch64][SVE] Add integer arithmetic with immediate instructions.Danilo Carvalho Grael2019-12-121-0/+41
| | | | | | | | | | | | | | | | | | | | Summary: Add pattern matching for the following instructions: - add, sub, subr, sqadd, sqsub, uqadd, uqsub This patch required complex patterns to match the immediate with optinal left shift. I re-used the Select function from the other SVE repo to implement the complext pattern. I plan on doing another patch to also match constant vector of the same immediate. Reviewers: sdesmalen, huntergr, rengolin, efriedma, c-rhodes, mgudim, kmclaughlin Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits, amehsan Tags: #llvm Differential Revision: https://reviews.llvm.org/D71370
* [IR] Split out target specific intrinsic enums into separate headersReid Kleckner2019-12-111-0/+1
| | | | | | | | | | | | | | | | | | | | This has two main effects: - Optimizes debug info size by saving 221.86 MB of obj file size in a Windows optimized+debug build of 'all'. This is 3.03% of 7,332.7MB of object file size. - Incremental step towards decoupling target intrinsics. The enums are still compact, so adding and removing a single target-specific intrinsic will trigger a rebuild of all of LLVM. Assigning distinct target id spaces is potential future work. Part of PR34259 Reviewers: efriedma, echristo, MaskRay Reviewed By: echristo, MaskRay Differential Revision: https://reviews.llvm.org/D71320
* [PGO][PGSO] DAG.shouldOptForSize part.Hiroshi Yamauchi2019-11-211-6/+2
| | | | | | | | | | | | | | | Summary: (Split of off D67120) SelectionDAG::shouldOptForSize changes for profile guided size optimization. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70095
* [AArch64] Fix unannotated fall-through between switch labelsJinsong Ji2019-10-281-0/+1
| | | | | | This is breaking buildbot with -Werror,-Wimplicit-fallthrough on. eg: http://lab.llvm.org:8011/builders/ppc64le-lld-multistage-test/builds/6881
* [AArch64][SVE] Implement masked load intrinsicsKerry McLaughlin2019-10-281-0/+20
| | | | | | | | | | | | | | | | Summary: Adds support for codegen of masked loads, with non-extending, zero-extending and sign-extending variants. Reviewers: huntergr, rovka, greened, dmgreen Reviewed By: dmgreen Subscribers: dmgreen, samparker, tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68877
* [DAG] Add SelectionDAG::MaxRecursionDepth constantSimon Pilgrim2019-09-191-1/+1
| | | | | | | | | | As commented on D67557 we have a lot of uses of depth checks all using magic numbers. This patch adds the SelectionDAG::MaxRecursionDepth constant and moves over some general cases to use this explicitly. Differential Revision: https://reviews.llvm.org/D67711 llvm-svn: 372315
* [aarch64] move custom isel of extract_vector_elt to td file - NFCSebastian Pop2019-09-131-43/+0
| | | | | | | | | | | | | | | | | | | In preparation for def-pat selection of dot product instructions, this patch moves the custom instruction selection of extract_vector_elt to the td file. Without this change it is impossible to catch a pattern that starts with an extract_vector_elt: the custom cpp code is executed first ahead of the patterns in the td files that are only executed at the end of the switch statement in SelectCode(Node). With this patch applied, it becomes possible to select a different pattern that starts with extract_vector_elt by selecting a higher complexity than this pattern. The patch has been tested on aarch64-linux with make check-all. Differential Revision: https://reviews.llvm.org/D67497 llvm-svn: 371887
* Basic codegen for MTE stack tagging.Evgeniy Stepanov2019-07-171-1/+59
| | | | | | | | | | | | Implement IR intrinsics for stack tagging. Generated code is very unoptimized for now. Two special intrinsics, llvm.aarch64.irg.sp and llvm.aarch64.tagp are used to implement a tagged stack frame pointer in a virtual register. Differential Revision: https://reviews.llvm.org/D64172 llvm-svn: 366360
* AArch64: Add support for reading pc using llvm.read_register.Peter Collingbourne2019-06-221-0/+8
| | | | | | | | | | | | | | | This is useful for allowing code to efficiently take an address that can be later mapped onto debug info. Currently the hwasan pass achieves this by taking the address of the current function: http://llvm-cs.pcc.me.uk/lib/Transforms/Instrumentation/HWAddressSanitizer.cpp#921 but this costs two instructions (plus a GOT entry in PIC code) per function with stack variables. This will allow the cost to be reduced to a single instruction. Differential Revision: https://reviews.llvm.org/D63471 llvm-svn: 364126
* Fix MSVC "32-bit shift implicitly converted to 64 bits" warning. NFCI.Simon Pilgrim2019-04-231-2/+2
| | | | llvm-svn: 358969
* [AArch64] Add support for MTE intrinsicsJaved Absar2019-04-231-17/+46
| | | | | | | | | | | This patch provides intrinsics support for Memory Tagging Extension (MTE), which was introduced with the Armv8.5-a architecture. The intrinsics are described in detail in the latest ACLE Q1 2019 documentation: https://developer.arm.com/docs/101028/latest Reviewed by: David Spickett Differential Revision: https://reviews.llvm.org/D60486 llvm-svn: 358963
* [IR] Refactor attribute methods in Function class (NFC)Evandro Menezes2019-04-041-1/+1
| | | | | | | | Rename the functions that query the optimization kind attributes. Differential revision: https://reviews.llvm.org/D60287 llvm-svn: 357731
* Update the file headers across all of the LLVM projects in the monorepoChandler Carruth2019-01-191-4/+3
| | | | | | | | | | | | | | | | | to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636
* [AArch64] Always use the version of computeKnownBits that returns a value. NFCI.Simon Pilgrim2018-12-211-6/+3
| | | | | | Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output paramater version. llvm-svn: 349908
* [AArch64][v8.5A] Add speculation restriction system registersOliver Stannard2018-09-271-1/+1
| | | | | | | | | | | This adds some new system registers which can be used to restrict certain types of speculative execution. Patch by Pablo Barrio and David Spickett! Differential revision: https://reviews.llvm.org/D52482 llvm-svn: 343218
* [SDAG] Remove the reliance on MI's allocation strategy forChandler Carruth2018-08-141-21/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | `MachineMemOperand` pointers attached to `MachineSDNodes` and instead have the `SelectionDAG` fully manage the memory for this array. Prior to this change, the memory management was deeply confusing here -- The way the MI was built relied on the `SelectionDAG` allocating memory for these arrays of pointers using the `MachineFunction`'s allocator so that the raw pointer to the array could be blindly copied into an eventual `MachineInstr`. This creates a hard coupling between how `MachineInstr`s allocate their array of `MachineMemOperand` pointers and how the `MachineSDNode` does. This change is motivated in large part by a change I am making to how `MachineFunction` allocates these pointers, but it seems like a layering improvement as well. This would run the risk of increasing allocations overall, but I've implemented an optimization that should avoid that by storing a single `MachineMemOperand` pointer directly instead of allocating anything. This is expected to be a net win because the vast majority of uses of these only need a single pointer. As a side-effect, this makes the API for updating a `MachineSDNode` and a `MachineInstr` reasonably different which seems nice to avoid unexpected coupling of these two layers. We can map between them, but we shouldn't be *surprised* at where that occurs. =] Differential Revision: https://reviews.llvm.org/D50680 llvm-svn: 339740
* Revert "[AArch64] Coalesce Copy Zero during instruction selection"Sirish Pande2018-06-211-29/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit d8f57105010cc7e78026e511d5def873fc91e0e7. Original Commit: Author: Haicheng Wu <haicheng@codeaurora.org> Date: Sun Feb 18 13:51:33 2018 +0000 [AArch64] Coalesce Copy Zero during instruction selection Add special case for copy of zero to avoid a double copy. Differential Revision: https://reviews.llvm.org/D36104 Author's intention is to remove a BB that has one mov instruction. In order to do that, d8f571050 pessmizes MachineSinking by introducing a copy, such that mov instruction is NOT moved to the BB. Optimization downstream gets rid of the BB with only mov instruction. This works well if we have only one fall through branch as there is only one "extra" mov instruction. If we have multiple fall throughs, we will have a lot of redundant movs. In such a case, it's better to have this BB which has one mov instruction. This is causing degradation in jpeg, fft and other codebases. I believe if we want to remove a BB with only one branch instruction, we should not pessimize Machine Sinking at all, and find some other solution. llvm-svn: 335251
* [AArch64] Take advantage of variable shift/rotate amount implicit mod operation.Geoff Berry2018-05-241-0/+111
| | | | | | | | | | | | | | | | Summary: Optimize code generated for variable shifts/rotates by taking advantage of the implicit and/mod done on the variable shift amount register. Resolves bug 27582 and bug 37421. Reviewers: t.p.northover, qcolombet, MatzeB, javed.absar Subscribers: rengolin, kristof.beyls, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D46844 llvm-svn: 333214
* Rename DEBUG macro to LLVM_DEBUG.Nicola Zaghen2018-05-141-8/+10
| | | | | | | | | | | | | | | | The DEBUG() macro is very generic so it might clash with other projects. The renaming was done as follows: - git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g' - git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM - Manual change to APInt - Manually chage DOCS as regex doesn't match it. In the transition period the DEBUG() macro is still present and aliased to the LLVM_DEBUG() one. Differential Revision: https://reviews.llvm.org/D43624 llvm-svn: 332240
* Remove \brief commands from doxygen comments.Adrian Prantl2018-05-011-3/+3
| | | | | | | | | | | | | | | | We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done Differential Revision: https://reviews.llvm.org/D46290 llvm-svn: 331272
* Reland r329956, "AArch64: Introduce a DAG combine for folding offsets into ↵Peter Collingbourne2018-04-231-8/+10
| | | | | | | | | | | | | | | | | | | | addresses.", with a fix for the bot failure. This reland includes a check to prevent the DAG combiner from folding an offset that is smaller than the existing one. This can cause oscillations between two possible DAGs, which was the cause of the hang and later assertion failure observed on the lnt-ctmark-aarch64-O3-flto bot. http://green.lab.llvm.org/green/job/lnt-ctmark-aarch64-O3-flto/2024/ Original commit message: > This is a code size win in code that takes offseted addresses > frequently, such as C++ constructors that typically need to compute > an offseted address of a vtable. This reduces the size of Chromium > for Android's .text section by 108KB. Differential Revision: https://reviews.llvm.org/D45199 llvm-svn: 330630
* Revert r329956, "AArch64: Introduce a DAG combine for folding offsets into ↵Peter Collingbourne2018-04-131-10/+8
| | | | | | | | | | addresses." Caused a hang and eventually an assertion failure in LTO builds of 7zip-benchmark on aarch64 iOS targets. http://green.lab.llvm.org/green/job/lnt-ctmark-aarch64-O3-flto/2024/ llvm-svn: 330063
* AArch64: Introduce a DAG combine for folding offsets into addresses.Peter Collingbourne2018-04-121-8/+10
| | | | | | | | | | | This is a code size win in code that takes offseted addresses frequently, such as C++ constructors that typically need to compute an offseted address of a vtable. This reduces the size of Chromium for Android's .text section by 108KB. Differential Revision: https://reviews.llvm.org/D45199 llvm-svn: 329956
* Revert r329611, "AArch64: Allow offsets to be folded into addresses with ELF."Peter Collingbourne2018-04-101-10/+8
| | | | | | Caused a build failure in check-tsan. llvm-svn: 329718
* AArch64: Allow offsets to be folded into addresses with ELF.Peter Collingbourne2018-04-091-8/+10
| | | | | | | | | | | | | | | This is a code size win in code that takes offseted addresses frequently, such as C++ constructors that typically need to compute an offseted address of a vtable. It reduces the size of Chromium for Android's .text section by 46KB, or 56KB with ThinLTO (which exposes more opportunities to use a direct access rather than a GOT access). Because the addend range is limited in COFF and Mach-O, this is enabled for ELF only. Differential Revision: https://reviews.llvm.org/D45199 llvm-svn: 329611
* [AArch64] Fix UB about shift amount exceeds data bit-widthWeiming Zhao2018-03-081-1/+1
| | | | | | | | | | | | | | | | Summary: Fixes an UB caught by sanitizer. The shift amount might be larger than 32 so the operand should be 1ULL. In this patch, we replace the original expression with existing API with uint64_t type. Reviewers: eli.friedman, rengolin Reviewed By: rengolin Subscribers: rengolin, javed.absar, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D44234 llvm-svn: 326969
* [AArch64] Coalesce Copy Zero during instruction selectionHaicheng Wu2018-02-181-1/+29
| | | | | | | | Add special case for copy of zero to avoid a double copy. Differential Revision: https://reviews.llvm.org/D36104 llvm-svn: 325459
* [SelectionDAGISel] Add a debug print before call to Select. Adjust where ↵Craig Topper2018-01-261-5/+0
| | | | | | | | | | | | blank lines are printed during isel process to make things more sensibly grouped. Previously some targets printed their own message at the start of Select to indicate what they were selecting. For the targets that didn't, it means there was no print of the root node before any custom handling in the target executed. So if the target did something custom and never called SelectNodeCommon, no print would be made. For the targets that did print a message in Select, if they didn't custom handle a node SelectNodeCommon would reprint the root node before walking the isel table. It seems better to just print the message before the call to Select so all targets behave the same. And then remove the root node printing from SelectNodeCommon and just leave a message that says we're starting the table search. There were also some oddities in blank line behavior. Usually due to a \n after a call to SelectionDAGNode::dump which already inserted a new line. llvm-svn: 323551
* AArch64: get type from correct result when forming BFXTim Northover2018-01-231-1/+1
| | | | | | | Some nodes produce multiple values so when obtaining the type of an ISD::OR we need to make sure we ask for the correct one. Hopefully that's all of them. llvm-svn: 323205
* AArch64: get type from correct result when forming BFI/BFMTim Northover2018-01-231-1/+1
| | | | | | | Some nodes produce multiple values so when obtaining the type of an ISD::OR we need to make sure we ask for the correct one. llvm-svn: 323202
* MachineFunction: Return reference from getFunction(); NFCMatthias Braun2017-12-151-1/+1
| | | | | | The Function can never be nullptr so we can return a reference. llvm-svn: 320884
* [AArch64] Avoid selecting XZR inline ASM memory operandYi Kong2017-07-141-4/+11
| | | | | | | | | | | | | Restricting register class to PointerRegClass for memory operands. Also fix the PointerRegClass for AArch64 from GPR64 to GPR64sp, since XZR cannot hold a memory pointer while SP is. Fixes PR33134. Differential Revision: https://reviews.llvm.org/D34999 llvm-svn: 308060
* [AARCH64][LSE] Preliminary support for ARMv8.1 LSE Atomics.Christof Douma2017-06-211-4/+11
| | | | | | | | | | | | | | | | | | | | | | Implemented support to AArch64 codegen for ARMv8.1 Large System Extensions atomic instructions. Where supported, these instructions can provide atomic operations with higher performance. Currently supported operations include: fetch_add, fetch_or, fetch_xor, fetch_smin, fetch_min/max (signed and unsigned), swap, and compare_exchange. This implementation implies sequential-consistency ordering, more relaxed ordering is under development. Subtarget->hasLSE is currently supported for Cavium ThunderX2T99. Patch by Ananth Jasty. Differential Revision: https://reviews.llvm.org/D33586 Change-Id: I82f6d3d64255622791ceb0715b7ab9f4dc4d4b2c llvm-svn: 305893
* [llvm] Remove double semicolonsMandeep Singh Grang2017-06-061-1/+1
| | | | | | | | | | | | Reviewers: craig.topper, arsenm, mehdi_amini Reviewed By: mehdi_amini Subscribers: mehdi_amini, wdng, nhaehnle, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D33924 llvm-svn: 304767
* [SelectionDAG] Use KnownBits struct in DAG's computeKnownBits and ↵Craig Topper2017-04-281-12/+13
| | | | | | | | | | | | simplifyDemandedBits This patch replaces the separate APInts for KnownZero/KnownOne with a single KnownBits struct. This is similar to what was done to ValueTracking's version recently. This is largely a mechanical transformation from KnownZero to Known.Zero. Differential Revision: https://reviews.llvm.org/D32569 llvm-svn: 301620
* [APInt] Use inplace shift methods where possible. NFCICraig Topper2017-04-281-10/+10
| | | | llvm-svn: 301612
* Revert "[APInt] Fix a few places that use APInt::getRawData to operate ↵Renato Golin2017-04-231-10/+10
| | | | | | | | | | | | | | | | within the normal API." This reverts commit r301105, 4, 3 and 1, as a follow up of the previous revert, which broke even more bots. For reference: Revert "[APInt] Use operator<<= where possible. NFC" Revert "[APInt] Use operator<<= instead of shl where possible. NFC" Revert "[APInt] Use ashInPlace where possible." PR32754. llvm-svn: 301111
* [APInt] Use operator<<= instead of shl where possible. NFCCraig Topper2017-04-231-10/+10
| | | | llvm-svn: 301103
* [APInt] Use lshrInPlace to replace lshr where possibleCraig Topper2017-04-181-4/+4
| | | | | | | | | | This patch uses lshrInPlace to replace code where the object that lshr is called on is being overwritten with the result. This adds an lshrInPlace(const APInt &) version as well. Differential Revision: https://reviews.llvm.org/D32155 llvm-svn: 300566
* Revert "Instrument SDISel C++ patterns"Quentin Colombet2017-04-011-327/+322
| | | | | | | | This reverts commit r299284. Didn't intend to commit this :( llvm-svn: 299286
* Instrument SDISel C++ patternsQuentin Colombet2017-04-011-322/+327
| | | | llvm-svn: 299284
* [AArch64] Add new subtarget feature to fold LSL into address mode.Balaram Makam2017-03-311-3/+44
| | | | | | | | | | | | | Summary: This feature enables folding of logical shift operations of up to 3 places into addressing mode on Kryo and Falkor that have a fastpath LSL. Reviewers: mcrosier, rengolin, t.p.northover Subscribers: junbuml, gberry, llvm-commits, aemerson Differential Revision: https://reviews.llvm.org/D31113 llvm-svn: 299240
* [AArch64] Fix useful bits detection for BFM instructionsSilviu Baranga2016-11-301-9/+38
| | | | | | | | | | | | | | | | | | | Summary: When computing useful bits for a BFM instruction, we need to take into consideration the case where both operands of the BFM are equal and provide data that we need to track. Not doing this can cause us to miss useful bits. Fixes PR31138 (https://llvm.org/bugs/show_bug.cgi?id=31138) Reviewers: t.p.northover, jmolloy Subscribers: evandro, gberry, srhines, pirama, mcrosier, aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D27130 llvm-svn: 288253
OpenPOWER on IntegriCloud