path: root/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
Commit history for this file, newest first. Each entry shows the commit subject, author, commit date, and diffstat (+lines added/-lines removed).
...
* [AArch64] Transfer memory operands when lowering vector load/store intrinsics (Sanjin Sijaric, 2016-11-07, +11/-0)

    Summary: Some vector loads and stores generated from AArch64 intrinsics
    alias each other unnecessarily, preventing better scheduling. We just need
    to transfer memory operands during lowering.

    Reviewers: mcrosier, t.p.northover, jmolloy
    Subscribers: aemerson, rengolin, llvm-commits
    Differential Revision: https://reviews.llvm.org/D26313
    llvm-svn: 286168
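    A minimal C++ sketch of the idea (illustrative only: setNodeMemRefs is
    the later spelling of this API, the 2016 patch used the older setMemRefs
    iterator interface, and transferMemOperand is a stand-in name, not a
    function in this file):

        #include "llvm/CodeGen/SelectionDAG.h"
        using namespace llvm;

        // Copy the memory operand from the intrinsic's SDNode onto the
        // machine node built for it, so the scheduler sees real alias
        // information instead of assuming worst-case aliasing.
        static void transferMemOperand(SelectionDAG *CurDAG, SDNode *N,
                                       SDNode *Result) {
          MachineMemOperand *MemOp =
              cast<MemIntrinsicSDNode>(N)->getMemOperand();
          CurDAG->setNodeMemRefs(cast<MachineSDNode>(Result), {MemOp});
        }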
* Use StringRef in Pass/PassManager APIs (NFC) (Mehdi Amini, 2016-10-01, +1/-1)

    llvm-svn: 283004

* [AArch64] Improve add/sub/cmp isel of uxtw forms. (Geoff Berry, 2016-09-26, +5/-0)

    Don't match the UXTW extended reg forms of ADD/ADDS/SUB/SUBS if the 32-bit
    to 64-bit zero-extend can be done for free by taking advantage of the
    32-bit defining instruction zeroing the upper 32-bits of the X register
    destination. This enables better instruction selection in a few cases,
    such as:

        sub x0, xzr, x8
    instead of:
        mov x8, xzr
        sub x0, x8, w9, uxtw

        madd x0, x1, x1, x8
    instead of:
        mul x9, x1, x1
        add x0, x9, w8, uxtw

        cmp x2, x8
    instead of:
        sub x8, x2, w8, uxtw
        cmp x8, #0

        add x0, x8, x1, lsl #3
    instead of:
        lsl x9, x1, #3
        add x0, x9, w8, uxtw

    Reviewers: t.p.northover, jmolloy
    Subscribers: mcrosier, aemerson, llvm-commits, rengolin
    Differential Revision: https://reviews.llvm.org/D24747
    llvm-svn: 282413

* getValueType().getSizeInBits() -> getValueSizeInBits(); NFCI (Sanjay Patel, 2016-09-14, +1/-1)

    llvm-svn: 281493

* getValueType().getScalarSizeInBits() -> getScalarValueSizeInBits(); NFCI (Sanjay Patel, 2016-09-14, +1/-1)

    llvm-svn: 281490

* getScalarType().getSizeInBits() -> getScalarSizeInBits(); NFCI (Sanjay Patel, 2016-09-14, +1/-1)

    llvm-svn: 281489
* Use the range variant of transform instead of unpacking begin/end (David Majnemer, 2016-08-12, +4/-4)

    No functionality change is intended.

    llvm-svn: 278476
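    The shape of the cleanup, as a small self-contained example (not the
    call sites touched in this file; llvm::transform lives in
    llvm/ADT/STLExtras.h):

        #include "llvm/ADT/STLExtras.h"
        #include <algorithm>
        #include <vector>

        // Before: the iterator pair is spelled out at every call site.
        std::vector<int> doubledBefore(const std::vector<int> &In) {
          std::vector<int> Out(In.size());
          std::transform(In.begin(), In.end(), Out.begin(),
                         [](int V) { return 2 * V; });
          return Out;
        }

        // After: llvm::transform takes the range directly.
        std::vector<int> doubledAfter(const std::vector<int> &In) {
          std::vector<int> Out(In.size());
          llvm::transform(In, Out.begin(), [](int V) { return 2 * V; });
          return Out;
        }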
* AArch64: TableGenerate system instruction operands. (Tim Northover, 2016-07-05, +22/-22)

    The way the named arguments for various system instructions are handled at
    the moment has a few problems:

      - Large-scale duplication between AArch64BaseInfo.h and
        AArch64BaseInfo.cpp
      - That weird Mapping class that I have no idea what I was on when I
        thought it was a good idea.
      - Searches are performed linearly through the entire list.
      - We print absolutely all registers in upper-case, even though some are
        canonically mixed case (SPSel for example).
      - The ARM ARM specifies sysregs in terms of 5 fields, but those are
        relegated to comments in our implementation, with a slightly opaque
        hex value indicating the canonical encoding LLVM will use.

    This adds a new TableGen backend to produce efficiently searchable tables,
    and switches AArch64 over to using that infrastructure.

    llvm-svn: 274576

* AArch64: use correct SDValue # when looking for bitfield placement. (Tim Northover, 2016-07-05, +3/-2)

    The other use really does only care about the SDNode (it checks the opcode
    against a whitelist), but bitFieldPlacement can be misled if the node
    produces multiple results.

    Patch by Ismail Badawi.

    llvm-svn: 274567

* [AArch64] Remove an overly aggressive assert. (Chad Rosier, 2016-06-22, +0/-5)

    llvm-svn: 273458
* Apply most suggestions of clang-tidy's performance-unnecessary-value-param (Benjamin Kramer, 2016-06-08, +1/-1)

    Avoids unnecessary copies. All changes audited & pass tests with asan. No
    functional change intended.

    llvm-svn: 272190
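    What the check flags, in a generic example (illustrative code, not the
    parameter changed in this file):

        #include <string>

        // Flagged: the string is copied on every call even though it is
        // only read.
        static bool hasNeonSuffixByValue(std::string Name) {
          return Name.size() >= 5 &&
                 Name.compare(Name.size() - 5, 5, ".neon") == 0;
        }

        // Fixed: a const reference (or, in LLVM code, a StringRef) avoids
        // the copy.
        static bool hasNeonSuffixByRef(const std::string &Name) {
          return Name.size() >= 5 &&
                 Name.compare(Name.size() - 5, 5, ".neon") == 0;
        }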
* [AArch64] Spot SBFX-compatible code expressed with sign_extend. (Chad Rosier, 2016-06-03, +30/-0)

    This is very similar to r271677, but for extracts from i32 with the
    SIGN_EXTEND acting on an arithmetic shift.

    llvm-svn: 271717

* [AArch64] Spot SBFX-compatible code expressed with sign_extend_inreg. (Chad Rosier, 2016-06-03, +37/-0)

    We were assuming all SBFX-like operations would have the shl/asr form, but
    often when the field being extracted is an i8 or i16, we end up with a
    SIGN_EXTEND_INREG acting on a shift instead. This is a port of r213754
    from ARM to AArch64.

    llvm-svn: 271677
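    A hypothetical source-level example of the shape this combine targets:
    truncating a shifted value to i8 and sign-extending it reaches isel as
    SIGN_EXTEND_INREG of a shift rather than the canonical shl/asr pair,
    and should now still select a single SBFX:

        #include <cstdint>

        // Expected to select "sbfx w0, w0, #3, #8" on AArch64:
        // extract bits [10:3] of X and sign-extend them.
        int32_t signedField(int32_t X) {
          return static_cast<int8_t>(X >> 3);
        }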
* [AArch64] Generate a BFI/BFXIL from 'or (and X, MaskImm), OrImm'. (Chad Rosier, 2016-05-26, +95/-1)

    If and only if the value being inserted sets only known zero bits. This
    combine transforms things like:

        and w8, w0, #0xfffffff0
        movz w9, #5
        orr w0, w8, w9
    into:
        movz w8, #5
        bfxil w0, w8, #0, #4

    The combine is tuned to make sure we always reduce the number of
    instructions. We avoid churning code for what are expected to be
    performance-neutral changes (e.g., converting AND+OR to OR+BFI).

    Differential Revision: http://reviews.llvm.org/D20387
    llvm-svn: 270846

* [AArch64] Generate a BFXIL from 'or (and X, Mask0Imm), (and Y, Mask1Imm)'. (Chad Rosier, 2016-05-19, +59/-0)

    Mask0Imm and ~Mask1Imm must be equivalent and one of the MaskImms must be
    a shifted mask (e.g., 0x000ffff0). Both 'and's must have a single use.
    This changes code like:

        and w8, w0, #0xffff000f
        and w9, w1, #0x0000fff0
        orr w0, w9, w8
    into:
        lsr w8, w1, #4
        bfi w0, w8, #4, #12

    llvm-svn: 270063
* [AArch64] Push comment into function. NFC. (Chad Rosier, 2016-05-18, +9/-9)

    llvm-svn: 270003

* [AArch64] Minor refactoring. NFC. (Chad Rosier, 2016-05-18, +5/-4)

    llvm-svn: 269963

* Use proper capitalization and punctuation per coding standards. NFC. (Chad Rosier, 2016-05-16, +1/-1)

    llvm-svn: 269652

* [AArch64] Update local variable names to conform to coding standard. NFC. (Chad Rosier, 2016-05-14, +31/-31)

    llvm-svn: 269573

* [AArch64] Simplify logic to reduce vertical space. NFC. (Chad Rosier, 2016-05-13, +2/-6)

    llvm-svn: 269512
* SDAG: Implement Select instead of SelectImpl in AArch64DAGToDAGISel (Justin Bogner, 2016-05-12, +1161/-802)

    This one has a lot of code churn, but it's all mechanical and
    straightforward.

      - Where we were returning a node before, call ReplaceNode instead.
      - Where we would return null to fall back to another selector, rename
        the method to try* and return a bool for success.
      - Where we were calling SelectNodeTo, just return afterwards.

    Part of llvm.org/pr26808.

    llvm-svn: 269379
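    A toy illustration of the interface change (the Node and selector types
    here are stand-ins, not the real SelectionDAG classes or AArch64
    handlers):

        struct Node { int Opcode = 0; };

        struct BeforeStyle {
          // Old contract: return the replacement node, or null to fall
          // back to the table-generated matcher.
          Node *SelectAdd(Node *N) {
            if (N->Opcode != 1)
              return nullptr;
            return makeMachineNode(N);
          }
          Node *makeMachineNode(Node *N) { return N; }
        };

        struct AfterStyle {
          // New contract: try* naming, bool for "handled", and the
          // replacement is done in place before returning.
          bool trySelectAdd(Node *N) {
            if (N->Opcode != 1)
              return false;
            replaceNode(N, makeMachineNode(N));
            return true;
          }
          void replaceNode(Node *, Node *) {}
          Node *makeMachineNode(Node *N) { return N; }
        };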
* SDAG: Clean up dangling nodes in AArch64ISelDAGToDAG::SelectImpl (Justin Bogner, 2016-05-12, +8/-5)

    When we convert to the void Select interface, leaving unreferenced nodes
    around won't be allowed anymore.

    Part of llvm.org/pr26808.

    llvm-svn: 269345

* [AArch64] Give function a more appropriate name. (Chad Rosier, 2016-05-12, +3/-3)

    llvm-svn: 269335

* [AArch64] Minor refactoring to simplify future patch. NFC. (Chad Rosier, 2016-05-12, +16/-29)

    llvm-svn: 269329

* [AArch64] Add support for unscaled narrow stores in getUsefulBitsForUse. (Chad Rosier, 2016-05-12, +2/-0)

    llvm-svn: 269263

* [AArch64] Remove floating-point narrow stores from getUsefulBitsForUse. (Chad Rosier, 2016-05-12, +0/-2)

    While not impossible, it's unlikely we'd be performing bitwise operations
    on FP values.

    llvm-svn: 269260
* [AArch64] Improve getUsefulBitsForUse for narrow stores. (Chad Rosier, 2016-05-11, +14/-0)

    For narrow stores (e.g., strb, strh) we know the upper bits of the
    register are unused/not useful. In some cases we can use this information
    to eliminate unnecessary instructions. For example, without this patch we
    generate (from the 2nd test case):

        ldr w8, [x0]
        and w8, w8, #0xfff0
        bfxil w8, w2, #16, #4
        strh w8, [x1]

    and after the patch the 'and' is removed:

        ldr w8, [x0]
        bfxil w8, w2, #16, #4
        strh w8, [x1]
        ret

    During the lowering of the bitfield insert instruction the 'and' is
    eliminated because we know the upper 16 bits that are masked off are
    unused and the lower 4 bits that are masked off are overwritten by the
    insert itself. Therefore, the 'and' is unnecessary.

    Differential Revision: http://reviews.llvm.org/D20175
    llvm-svn: 269226
* SDAG: Rename Select->SelectImpl and repurpose Select as returning void (Justin Bogner, 2016-05-05, +2/-2)

    This is a step towards removing the rampant undefined behaviour in
    SelectionDAG, which is a part of llvm.org/PR26808.

    We rename SelectionDAGISel::Select to SelectImpl and update targets to
    match, and then change Select to return void and consolidate the sketchy
    behaviour we're trying to get away from there.

    Next, we'll update backends to implement `void Select(...)` instead of
    SelectImpl and eventually drop the base Select implementation.

    llvm-svn: 268693
* AArch64: expand cmpxchg after regalloc at -O0. (Tim Northover, 2016-04-14, +37/-0)

    FastRegAlloc works only at the basic-block level and spills all live-out
    registers. Unfortunately for a stack-based cmpxchg near the spill slots,
    this can perpetually clear the exclusive monitor, which means the cmpxchg
    will never succeed.

    I believe the only way to handle this within LLVM is by expanding the loop
    post-regalloc. We don't want this in general because it severely limits
    the optimisations that can be done, so we limit this to -O0 compilations.

    It's an ugly hack, and about the one good point in the whole mess is that
    we can treat all cmpxchg operations in the most naive way possible
    (seq_cst, no clrex faff) without affecting correctness.

    Should fix PR25526.

    llvm-svn: 266339
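    A contrived C++ illustration of the failure mode (not a reduced test
    case from PR25526): an atomic object on the stack sits near the -O0
    spill slots, and any spill store the fast register allocator inserts
    between the exclusive load and exclusive store of the compare-exchange
    loop can clear the exclusive monitor, so the loop could spin forever.

        #include <atomic>

        bool bumpIfZero() {
          std::atomic<int> Flag{0}; // lives on the stack, near spill slots
          int Expected = 0;
          // At -O0 this lowers to an LDAXR/STLXR loop; spill code inside
          // the loop is what this patch keeps out of the monitored region.
          return Flag.compare_exchange_strong(Expected, 1);
        }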
* NFC: make AtomicOrdering an enum class (JF Bastien, 2016-04-06, +1/-1)

    Summary: In the context of http://wg21.link/lwg2445 C++ uses the concept
    of 'stronger' ordering but doesn't define it properly. This should be
    fixed in C++17 barring a small question that's still open.

    The code currently plays fast and loose with the AtomicOrdering enum.
    Using an enum class is one step towards tightening things. I later also
    want to tighten related enums, such as clang's AtomicOrderingKind (which
    should be shared with LLVM as a 'C++ ABI' enum).

    This change touches a few lines of code which can be improved later, I'd
    like to keep it as NFC for now as it's already quite complex. I have
    related changes for clang.

    As a follow-up I'll add:

        bool operator<(AtomicOrdering, AtomicOrdering) = delete;
        bool operator>(AtomicOrdering, AtomicOrdering) = delete;
        bool operator<=(AtomicOrdering, AtomicOrdering) = delete;
        bool operator>=(AtomicOrdering, AtomicOrdering) = delete;

    This is separate so that clang and LLVM changes don't need to be in sync.

    Reviewers: jyknight, reames
    Subscribers: jyknight, llvm-commits
    Differential Revision: http://reviews.llvm.org/D18775
    llvm-svn: 265602
* Simplify some boolean conditional return statements in AArch64. (Eric Christopher, 2016-02-29, +2/-7)

    http://reviews.llvm.org/D9979

    Patch by Richard Thomson (and some conflict resolution by me).

    llvm-svn: 262266
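    The shape of the cleanup (the condition is illustrative, not one of the
    statements changed in this file):

        // Before: a conditional return of a boolean literal.
        static bool is64BitBefore(unsigned SizeInBits) {
          if (SizeInBits == 64)
            return true;
          return false;
        }

        // After: return the condition directly.
        static bool is64BitAfter(unsigned SizeInBits) {
          return SizeInBits == 64;
        }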
* GlobalValue: use getValueType() instead of getType()->getPointerElementType(). (Manuel Jacob, 2016-01-16, +1/-1)

    Reviewers: mjacob
    Subscribers: jholewinski, arsenm, dsanders, dblaikie

    Patch by Eduard Burtescu.

    Differential Revision: http://reviews.llvm.org/D16260
    llvm-svn: 257999
* [AArch64] Fix a corner case in BitField select (Weiming Zhao, 2015-12-01, +11/-5)

    Summary: When there are no useful bits, BitWidth becomes 0 and APInt will
    not be happy. See https://llvm.org/bugs/show_bug.cgi?id=25571

    We can just mark the operand as IMPLICIT_DEF if none of its bits are used.

    Reviewers: t.p.northover, jmolloy
    Subscribers: gberry, jmolloy, mgrang, aemerson, llvm-commits, rengolin
    Differential Revision: http://reviews.llvm.org/D14803
    llvm-svn: 254440
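    A sketch of the guard (the helper name and surrounding plumbing are
    stand-ins for this file's bitfield-insert logic, not the actual patch):
    when no bits of an operand are used, a zero-width APInt mask would trip
    asserts downstream, so the operand is materialized as IMPLICIT_DEF
    instead.

        #include "llvm/CodeGen/SelectionDAG.h"
        #include "llvm/CodeGen/TargetOpcodes.h"
        using namespace llvm;

        static SDValue materializeOrUndef(SelectionDAG *CurDAG,
                                          const SDLoc &DL, EVT VT,
                                          SDValue Op,
                                          const APInt &UsefulBits) {
          // No useful bits: don't build a zero-width mask; tie the
          // operand to an IMPLICIT_DEF instead.
          if (UsefulBits == 0)
            return SDValue(CurDAG->getMachineNode(
                               TargetOpcode::IMPLICIT_DEF, DL, VT),
                           0);
          return Op;
        }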
* [AArch64] Add ARMv8.2-A UAO PSTATE bit (Oliver Stannard, 2015-11-26, +1/-1)

    ARMv8.2-A adds a new PSTATE bit, PSTATE.UAO, which allows the LDTR/STTR
    instructions to behave the same as LDR/STR with respect to execute-only
    pages at higher privilege levels. New variants of the MSR/MRS instructions
    are added to allow reading and writing this bit. It is a required part of
    ARMv8.2-A, so no additional subtarget features are required.

    Differential Revision: http://reviews.llvm.org/D15020
    llvm-svn: 254157

* Make a bunch of static arrays const. (Craig Topper, 2015-10-18, +4/-4)

    llvm-svn: 250642

* [MC layer][AArch64] llvm-mc accepts 4-bit immediate values for "msr pan, #imm", while only 1-bit immediate values should be valid. (Alexandros Lamprineas, 2015-10-05, +9/-1)

    Changed encoding and decoding for msr pstate instructions.

    Differential Revision: http://reviews.llvm.org/D13011
    llvm-svn: 249313
* Don't raise inexact when lowering ceil, floor, round, trunc. (Stephen Canon, 2015-09-22, +1/-174)

    The C standard has historically not specified whether or not these
    functions should raise the inexact flag. Traditionally on Darwin, these
    functions *did* raise inexact, and the llvm lowerings followed that
    convention. n1778 (C bindings for IEEE-754 (2008)) clarifies that these
    functions should not set inexact. This patch brings the lowerings for
    arm64 and x86 in line with the newly specified behavior. This also lets us
    fold some logic into TD patterns, which is nice.

    Differential Revision: http://reviews.llvm.org/D12969
    llvm-svn: 248266
* [AArch64] Improved bitfield instruction selection. (Geoff Berry, 2015-09-18, +67/-11)

    Summary: For bitfield insert OR matching, check both operands for the
    larger pattern first before checking for the smaller pattern. Add a
    pattern for unsigned bitfield insert-in-zero done with SHL+AND.

    Resolves PR21631.

    Reviewers: jmolloy, t.p.northover
    Subscribers: aemerson, rengolin, llvm-commits, mcrosier
    Differential Revision: http://reviews.llvm.org/D12908
    llvm-svn: 248006
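    A source-level shape the new SHL+AND pattern covers (a hypothetical
    example, not one of the PR21631 test cases): shift a field into
    position, then mask everything else to zero.

        #include <cstdint>

        // Expected to select a single "ubfiz w0, w0, #4, #8" (insert bits
        // [7:0] of X at bit 4 of a zeroed register) instead of a shift
        // followed by an AND.
        uint32_t insertInZero(uint32_t X) {
          return (X << 4) & 0xFF0u;
        }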
* Typos. NFC. (Chad Rosier, 2015-09-17, +5/-5)

    llvm-svn: 247884

* Fix an undefined behavior introduced in r247234 (Steven Wu, 2015-09-10, +1/-1)

    llvm-svn: 247296
* [ADT] Switch a bunch of places in LLVM that were doing single-character splits to actually use the single character split routine (Chandler Carruth, 2015-09-10, +1/-1)

    The single character split routine does less work, and in a debug build is
    *substantially* faster.

    llvm-svn: 247245
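    The cheap form, for reference (splitKeyValue is an illustrative helper,
    not a call site changed here): StringRef::split has a char overload that
    avoids the substring-search path taken by the StringRef overload.

        #include "llvm/ADT/StringRef.h"
        #include <utility>
        using namespace llvm;

        static std::pair<StringRef, StringRef> splitKeyValue(StringRef Line) {
          return Line.split('='); // previously often written Line.split("=")
        }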
* [AArch64] Match FI+offset in STNP addressing mode. (Ahmed Bougacha, 2015-09-10, +13/-0)

    First, we need to teach isFrameOffsetLegal about STNP. It already knew
    about the STP/LDP variants, but those were probably never exercised,
    because it's only the load/store optimizer that generates STP/LDP, and the
    only user of the method is frame lowering, which runs earlier.

    The STP/LDP cases were wrong: they didn't take into account the fact that
    they return two results, not one, so the immediate offset will be the 4th
    operand, not the 3rd.

    Follow-up to r247234.

    llvm-svn: 247236

* [AArch64] Match base+offset in STNP addressing mode. (Ahmed Bougacha, 2015-09-10, +16/-0)

    Follow-up to r247231.

    llvm-svn: 247234

* [AArch64] Support selecting STNP. (Ahmed Bougacha, 2015-09-10, +33/-0)

    We could go through the load/store optimizer and match STNP where we would
    have matched a nontemporal-annotated STP, but that's not reliable enough
    as an opportunistic optimization. Instead, we can guarantee emitting STNP
    by matching them at ISel.

    Since there are no single-input nontemporal stores, we have to resort to
    some high-bits-extracting trickery to generate an STNP from a plain store.

    Also, we need to support another, LDP/STP-specific addressing mode: base +
    signed scaled 7-bit immediate offset. For now, only match the base. Let's
    make it smart separately.

    Part of PR24086.

    llvm-svn: 247231
* Add missing break in AArch64DAGToDAGISel::Select() switch case (Mehdi Amini, 2015-08-23, +1/-0)

    Reported by coverity.

    From: Mehdi Amini <mehdi.amini@apple.com>
    llvm-svn: 245800
* wrap OptSize and MinSize attributes for easier and consistent access (NFCI) (Sanjay Patel, 2015-08-04, +1/-3)

    Create wrapper methods in the Function class for the OptimizeForSize and
    MinSize attributes. We want to hide the logic of "or'ing" them together
    when optimizing just for size (-Os).

    Currently, we are not consistent about this and rely on a front-end to
    always set OptimizeForSize (-Os) if MinSize (-Oz) is on. Thus, there are
    18 FIXME changes here that should be added as follow-on patches with
    regression tests.

    This patch is NFC-intended: it just replaces existing direct accesses of
    the attributes by the equivalent wrapper call.

    Differential Revision: http://reviews.llvm.org/D11734
    llvm-svn: 243994
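    The spirit of the wrappers, as a free-function sketch (the real methods
    were added to Function itself; these names approximate them): -Oz
    implies -Os, so the "or" of the two attributes is computed in one place
    rather than at every call site.

        #include "llvm/IR/Function.h"
        using namespace llvm;

        static bool optForMinSize(const Function &F) {
          return F.hasFnAttribute(Attribute::MinSize);
        }

        static bool optForSize(const Function &F) {
          return F.hasFnAttribute(Attribute::OptimizeForSize) ||
                 optForMinSize(F);
        }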
* [AArch64] Add isel support for f16 indexed LD/ST. (Ahmed Bougacha, 2015-08-04, +2/-0)

    llvm-svn: 243935

* [AArch64] Match float round and convert to int instructions. (Geoff Berry, 2015-07-28, +116/-12)

    Summary: Add patterns for doing floating point round with various rounding
    modes followed by conversion to int as a single FCVT* instruction.

    Reviewers: t.p.northover, jmolloy
    Subscribers: aemerson, rengolin, mcrosier, llvm-commits
    Differential Revision: http://reviews.llvm.org/D11424
    llvm-svn: 243422
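    Source shapes these patterns target (assuming the usual libm-to-
    intrinsic folding happens first): a round in a specific direction
    followed by a float-to-int conversion becomes one FCVT* instruction
    instead of an FRINT* followed by an FCVTZS.

        #include <cmath>
        #include <cstdint>

        int32_t ceilToInt(float X) {
          return (int32_t)ceilf(X);  // fcvtps: round toward +inf
        }
        int32_t floorToInt(float X) {
          return (int32_t)floorf(X); // fcvtms: round toward -inf
        }
        int32_t roundToInt(float X) {
          return (int32_t)roundf(X); // fcvtas: nearest, ties away from zero
        }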
* AArch64: add comment missed out from earlier patch. (Tim Northover, 2015-07-17, +4/-0)

    Helps explain some of the background behind this bit of code.

    llvm-svn: 242503

* AArch64: make inexact signalling on round Darwin-specific (Tim Northover, 2015-07-16, +1/-1)

    C11 leaves the choice of whether round-to-integer operations set the
    inexact flag implementation-defined. Darwin does expect it to be set, but
    this seems to be against the intent of the IEEE document and slower to
    implement anyway. So it should be opt-in.

    llvm-svn: 242446