path: root/llvm/lib/Target/AArch64/AArch64FastISel.cpp
Commit message | Author | Age | Files | Lines
...
* Swift Calling Convention: swifterror target support.Manman Ren2016-04-111-1/+36
| | | | | | Differential Revision: http://reviews.llvm.org/D18716 llvm-svn: 265997
* Swift Calling Convention: add swiftself attribute.Manman Ren2016-03-291-1/+3
| | | | | | Differential Revision: http://reviews.llvm.org/D17866 llvm-svn: 264754
* Simplify some boolean conditional return statements in AArch64.Eric Christopher2016-02-291-4/+1
| | | | | | | | http://reviews.llvm.org/D9979 Patch by Richard Thomson (and some conflict resolution by me). llvm-svn: 262266
* [NFC] Replace several manual GEP loops with gep_type_iterator.Eduard Burtescu2016-01-201-16/+9
| | | | | | | | | | Reviewers: dblaikie Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16335 llvm-svn: 258262
* [opaque pointer types] [NFC] GEP: replace get(Pointer)ElementType uses with get{Source,Result}ElementType.Eduard Burtescu2016-01-191-1/+7
| | | | | | | | | | | | | | | | | | Summary: GEPOperator: provide getResultElementType alongside getSourceElementType. This is made possible by adding a result element type field to GetElementPtrConstantExpr, which GetElementPtrInst already has. GEP: replace get(Pointer)ElementType uses with get{Source,Result}ElementType. Reviewers: mjacob, dblaikie Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16275 llvm-svn: 258145
* CXX_FAST_TLS calling convention: performance improvement for AArch64.Manman Ren2015-12-161-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The access function has a short entry and a short exit; the initialization block is only run the first time. To improve the performance, we want to have a short frame at the entry and exit. We explicitly handle most of the CSRs via copies. Only the CSRs that are not handled via copies will be in CSR_SaveList. Frame lowering and prologue/epilogue insertion will generate a short frame in the entry and exit according to CSR_SaveList. The majority of the CSRs will be handled by the register allocator. The register allocator will try to spill and reload them in the initialization block. We add CSRsViaCopy, which will be explicitly handled during lowering. 1> we first set FunctionLoweringInfo->SplitCSR if conditions are met (the target supports it for the given machine function and the function has only return exits). We also call TLI->initializeSplitCSR to perform initialization. 2> we call TLI->insertCopiesSplitCSR to insert copies from CSRsViaCopy to virtual registers at beginning of the entry block and copies from virtual registers to CSRsViaCopy at beginning of the exit blocks. 3> we also need to make sure the explicit copies will not be eliminated. The target independent portion was committed as r255353. rdar://problem/23557469 Differential Revision: http://reviews.llvm.org/D15341 llvm-svn: 255821
* AArch64FastISel: Use cbz/cbnz to branch on i1Matthias Braun2015-12-031-61/+25
| | | | | | | | | In the case of a conditional branch without a preceding cmp we used to emit a "and; cmp; b.eq/b.ne" sequence, use tbz/tbnz instead. Differential Revision: http://reviews.llvm.org/D15122 llvm-svn: 254621
* Let SelectionDAG start to use probability-based interface to add successors.Cong Hou2015-11-241-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | The patch in http://reviews.llvm.org/D13745 is broken into four parts: 1. New interfaces without functional changes. 2. Use new interfaces in SelectionDAG, while in other passes treat probabilities as weights. 3. Use new interfaces in all other passes. 4. Remove old interfaces. This is the second patch above. In this patch SelectionDAG starts to use probability-based interfaces in MBB to add successors but other MC passes are still using weight-based interfaces. Therefore, we need to maintain a correct weight list in MBB even when probability-based interfaces are used. This is done by updating the weight list in probability-based interfaces by treating the numerator of probabilities as weights. This change affects many test cases that check successor weight values. I will update those test cases once this patch looks good to you. Differential revision: http://reviews.llvm.org/D14361 llvm-svn: 253965
* Revert "Change memcpy/memset/memmove to have dest and source alignments."Pete Cooper2015-11-191-4/+3
| | | | | | | | | | This reverts commit r253511. This likely broke the bots in http://lab.llvm.org:8011/builders/clang-ppc64-elf-linux2/builds/20202 http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/3787 llvm-svn: 253543
* Change memcpy/memset/memmove to have dest and source alignments.Pete Cooper2015-11-181-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Note, this was reviewed (and more details are in) http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html These intrinsics currently have an explicit alignment argument which is required to be a constant integer. It represents the alignment of the source and dest, and so must be the minimum of those. This change allows source and dest to each have their own alignments by using the alignment attribute on their arguments. The alignment argument itself is removed. There are a few places in the code for which the code needs to be checked by an expert as to whether using only src/dest alignment is safe. For those places, they currently take the minimum of src/dest alignments which matches the current behaviour. For example, code which used to read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 500, i32 8, i1 false) will now read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %dest, i8* align 8 %src, i32 500, i1 false) For out of tree owners, I was able to strip alignment from calls using sed by replacing: (call.*llvm\.memset.*)i32\ [0-9]*\,\ i1 false\) with: $1i1 false) and similarly for memmove and memcpy. I then added back in alignment to test cases which needed it. A similar commit will be made to clang which actually has many differences in alignment as now IRBuilder can generate different source/dest alignments on calls. In IRBuilder itself, a new argument was added. Instead of calling: CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, /* isVolatile */ false) you now call CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, SrcAlign, /* isVolatile */ false) There is a temporary class (IntegerAlignment) which takes the source alignment and rejects implicit conversion from bool. This is to prevent isVolatile here from passing its default parameter to the source alignment. Note, changes in future can now be made to codegen. 
I didn't change anything here, but this change should enable better memcpy code sequences. Reviewed by Hal Finkel. llvm-svn: 253511
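The sed substitution quoted in the message above can be tried out in isolation. The following is a minimal sketch that applies the same pattern with `std::regex`; the helper name `stripMemsetAlign` is hypothetical and is not part of LLVM, and only the `llvm.memset` form from the message is handled.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper (not an LLVM API): removes the explicit i32 alignment
// operand from a textual llvm.memset call, mirroring the sed pattern quoted
// in the commit message.
std::string stripMemsetAlign(const std::string &Line) {
  static const std::regex Pattern(
      R"((call.*llvm\.memset.*)i32 [0-9]*, i1 false\))");
  // "$1" re-emits everything that precedes the alignment operand.
  return std::regex_replace(Line, Pattern, "$1i1 false)");
}
```

Lines that do not match the pattern are returned unchanged, so the helper can be run over a whole test file line by line. As the message notes, alignment then has to be added back by hand as `align N` attributes where a test depends on it.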
* [AArch64][FastISel] Don't even try to select vector icmps.Ahmed Bougacha2015-11-061-0/+4
| | | | | | | | | | | | We used to try to constant-fold them to i32 immediates. Given that fast-isel doesn't otherwise support vNi1, when selecting the result users, we'd fallback to SDAG anyway. However, if the users were in another block, we'd insert broken cross-class copies (GPR32 to FPR64). Give up, let SDAG agree with itself on a vNi1 legalization strategy. llvm-svn: 252364
* Create a new interface addSuccessorWithoutWeight(MBB*) in MBB to add successors when optimization is disabled.Cong Hou2015-10-271-5/+6
| | | | | | | | | | | | | | When optimization is disabled, edge weights that are stored in MBB won't be used, so we don't have to store them. Currently, this is done by adding successors with default weight 0, and if all successors have default weights, the weight list will be empty. But an empty weight list doesn't mean that optimization is disabled (as is stated several times in MachineBasicBlock.cpp): it may also mean all successors just have default weights. We should discourage using default weights when adding successors, because it is very easy for users to forget to update the correct edge weights instead of using default ones (one exception is that the MBB only has one successor). In order to detect such usages, it is better to differentiate using default weights from the case when optimization is disabled. In this patch, a new interface addSuccessorWithoutWeight(MBB*) is created for when optimization is disabled. In this case, MBB will try to maintain an empty weight list, but it cannot guarantee this because many uses of addSuccessor() do not check whether optimization is disabled. But it can guarantee that if optimization is enabled, then the weight list always has the same size as the successor list. Differential revision: http://reviews.llvm.org/D13963 llvm-svn: 251429
* AArch64: Remove implicit ilist iterator conversions, NFCDuncan P. N. Exon Smith2015-10-131-2/+2
| | | | llvm-svn: 250216
* FastISel: Factor out common code; NFC intendedMatthias Braun2015-08-261-40/+5
| | | | | | | | | This should be no functional change but for the record: For three cases in X86FastISel this will change the order in which the FalseMBB and TrueMBB of a conditional branch are added to the successor/predecessor lists. llvm-svn: 245997
* [AArch64][FastISel] Don't fold shifts with UB.Juergen Ributzka2015-08-191-13/+38
| | | | | | | | We are already falling back to SelectionDAG when encountering a shift with UB. This adds the same checks for shifts with UB that get folded into arithmetic or logical operations. This fixes rdar://problem/22345295. llvm-svn: 245499
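The kind of check described above can be sketched as a small predicate. This is an illustration only: it assumes the undefined behaviour in question is a shift amount that is not smaller than the operand width, and the name `canFoldShiftImm` is hypothetical rather than taken from AArch64FastISel.cpp.

```cpp
#include <cassert>

// Hypothetical predicate: an immediate shift may be folded into the
// instruction that consumes it only when the shift amount is smaller than
// the operand width. Larger amounts are undefined at the IR level, so the
// fold is refused and selection can fall back to SelectionDAG.
bool canFoldShiftImm(unsigned ShiftAmt, unsigned BitWidth) {
  return ShiftAmt < BitWidth;
}
```

A caller would test this before folding, for example `canFoldShiftImm(32, 32)` is false, so a 32-bit shift by 32 would not be folded.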
* PseudoSourceValue: Replace global manager with a manager in a machine function.Alex Lorenz2015-08-111-4/+4
| | | | | | | | | | | | | | | | | | | | | | This commit removes the global manager variable which is responsible for storing and allocating pseudo source values and instead it introduces a new manager class named 'PseudoSourceValueManager'. Machine functions now own an instance of the pseudo source value manager class. This commit also modifies the 'get...' methods in the 'MachinePointerInfo' class to construct pseudo source values using the instance of the pseudo source value manager object from the machine function. This commit updates calls to the 'get...' methods from the 'MachinePointerInfo' class in a lot of different files because those calls now need to pass in a reference to a machine function to those methods. This change will make it easier to serialize pseudo source values as it will enable me to transform the mips specific MipsCallEntry PseudoSourceValue subclass into two target independent subclasses. Reviewers: Akira Hatanaka llvm-svn: 244693
* Fix some comment typos.Benjamin Kramer2015-08-081-1/+1
| | | | llvm-svn: 244402
* [AArch64][FastISel] Always use AND before checking the branch flag.Juergen Ributzka2015-08-061-1/+5
| | | | | | | | | | | | | When we are not emitting the condition for the branch, because the condition is in another BB or SDAG did the selection for us, then we have to mask the flag in the register with AND. This is required when the condition comes from a truncate, because SDAG only truncates down to a legal size of i32. This fixes rdar://problem/22161062. llvm-svn: 244291
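Why the mask matters can be shown with a tiny model of the branch decision. This is a sketch under the assumption that the i1 condition lives in bit 0 of a 32-bit register whose upper bits may hold leftovers from a wider value; `branchTaken` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model: only bit 0 of the register is the i1 condition.
// Masking with AND discards upper bits that a truncate selected elsewhere
// may have left in place.
bool branchTaken(uint32_t Reg) {
  return (Reg & 1u) != 0;
}
```

Without the mask, a register holding 0x2 would compare as non-zero even though its i1 value is false.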
* Revert "[AArch64][FastISel] Add more truncation tests." and "[AArch64][FastISel] Always use an AND instruction when truncating to non-legal types."Juergen Ributzka2015-08-061-24/+31
| | | | | | | | | | | This reverts commits r243198 and r243304. Turns out this wasn't the correct fix for this problem. It works only within FastISel, but fails when the truncate is selected by SDAG. llvm-svn: 244287
* Move BB succ_iterator to be inside TerminatorInst. NFC.Pete Cooper2015-08-051-2/+2
| | | | | | | | | | | | | | | | | | | | | To get the successors of a BB we currently do successors(BB) which ultimately walks the successors of the BB's terminator. This moves the iterator to TerminatorInst as that's what we're actually using to do the iteration, and adds a member function to TerminatorInst to allow us to iterate directly over successors given an instruction. For example, we can now do for (auto *Succ : BI->successors()) instead of for (unsigned i = 0, e = BI->getNumSuccessors(); i != e; ++i) Reviewed by Tobias Grosser. llvm-svn: 244074
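The two loop shapes contrasted above can be compared side by side on a stand-in type. The `Block` and `Terminator` structs below are hypothetical and far simpler than LLVM's classes; they only expose the three accessors named in the message.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins, not LLVM classes.
struct Block {
  int Id;
};

struct Terminator {
  std::vector<Block *> Succs;

  unsigned getNumSuccessors() const {
    return static_cast<unsigned>(Succs.size());
  }
  Block *getSuccessor(unsigned I) const { return Succs[I]; }
  // Range accessor that enables the range-based for loop.
  const std::vector<Block *> &successors() const { return Succs; }
};

// Old style: index-based iteration.
int sumSuccessorIdsIndexed(const Terminator &T) {
  int Sum = 0;
  for (unsigned I = 0, E = T.getNumSuccessors(); I != E; ++I)
    Sum += T.getSuccessor(I)->Id;
  return Sum;
}

// New style: iterate directly over the successor range.
int sumSuccessorIdsRange(const Terminator &T) {
  int Sum = 0;
  for (const Block *Succ : T.successors())
    Sum += Succ->Id;
  return Sum;
}
```

Both functions visit the same blocks in the same order; the range form simply removes the index bookkeeping.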
* Convert some AArch64 code to foreach loops. NFC.Pete Cooper2015-08-031-4/+3
| | | | | | | Also converted a cast<> to dyn_cast while I was working on the same line of code. llvm-svn: 243894
* De-constify pointers to Type since they can't be modified. NFCCraig Topper2015-08-011-1/+1
| | | | | | This was already done in most places a while ago. This just fixes the ones that crept in over time. llvm-svn: 243842
* [AArch64][FastISel] Always use an AND instruction when truncating to non-legal types.Juergen Ributzka2015-07-251-31/+24
| | | | | | | | | | | | | | When truncating to non-legal types (such as i16, i8 and i1) always use an AND instruction to mask out the upper bits. This was only done when the source type was an i64, but not when the source type was an i32. This commit fixes this and adds the missing i32 truncate tests. This fixes rdar://problem/21990703. llvm-svn: 243198
* Make TargetLowering::getPointerTy() taking DataLayout as an argumentMehdi Amini2015-07-091-20/+21
| | | | | | | | | | | | | | Summary: This change is part of a series of commits dedicated to having a single DataLayout during compilation by always using the one owned by the module. Reviewers: echristo Subscribers: jholewinski, ted, yaron.keren, rafael, llvm-commits Differential Revision: http://reviews.llvm.org/D11028 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241775
* Redirect DataLayout from TargetMachine to Module in ComputeValueVTs()Mehdi Amini2015-07-091-1/+1
| | | | | | | | | | | | | | | | | | Summary: Avoid using the TargetMachine owned DataLayout and use the Module owned one instead. This requires passing the DataLayout up the stack to ComputeValueVTs(). This change is part of a series of commits dedicated to having a single DataLayout during compilation by always using the one owned by the module. Reviewers: echristo Subscribers: jholewinski, yaron.keren, rafael, llvm-commits Differential Revision: http://reviews.llvm.org/D11019 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241773
* fix formatting; NFCSanjay Patel2015-07-011-2/+2
| | | | llvm-svn: 241175
* Use MCSymbols for FastISel.Rafael Espindola2015-06-231-13/+16
| | | | | | | | | | | The summary is that it moves the mangling earlier and replaces a few calls to .addExternalSymbol with addSym. I originally wanted to replace all the uses of addExternalSymbol with addSym, but noticed it was a lot of work and doesn't need to be done all at once. llvm-svn: 240395
* On behalf of Alexandros Lamprineas:Evgeny Astigeevich2015-06-151-0/+6
| | | | | | | | | | | | | | | | LLVM targeting aarch64 doesn't correctly produce aligned accesses for non-aligned data at -O0/fast-isel (-mno-unaligned-access). The root cause seems to be in fast-isel not producing unaligned access correctly for -mno-unaligned-access. The patch just aborts fast-isel for loads and stores when -mno-unaligned-access is present. The regression test is updated to check this new test case (-mno-unaligned-access together with fast-isel). Differential Revision: http://reviews.llvm.org/D10360 llvm-svn: 239732
* Change Function::getIntrinsicID() to return an Intrinsic::ID. NFC.Pete Cooper2015-05-201-2/+2
| | | | | | | | Now that Intrinsic::ID is a typed enum, we can forward declare it and so return it from this method. This updates all users which were either using an unsigned to store it, or had a now unnecessary cast. llvm-svn: 237810
* [AArch64] Fix sext/zext folding in address arithmetic.Pete Cooper2015-05-071-29/+32
| | | | | | | | We were accidentally folding a sign/zero extend in to address arithmetic in a different BB when the extend wasn't available there. Cross BB fast-isel isn't safe, so restrict this to only when the extend is in the same BB as the use. llvm-svn: 236764
* [AArch64][FastISel] Variant of the logical instructions that use two inputQuentin Colombet2015-05-011-1/+1
| | | | | | | | registers cannot write on SP. rdar://problem/20748715 llvm-svn: 236352
* [AArch64][FastISel] Fix the setting of kill flags for MUL -> UMULH sequences.Quentin Colombet2015-05-011-2/+8
| | | | | | rdar://problem/20748715 llvm-svn: 236346
* [AArch64] Fix bad register class constraint in fast-isel for TST instruction.Quentin Colombet2015-04-301-1/+4
| | | | | | rdar://problem/20748715 llvm-svn: 236273
* Disable AArch64 fast-isel on big-endian call vector returns.Pete Cooper2015-04-161-0/+5
| | | | | | | | A big-endian vector return needs a byte-swap which we aren't doing right now. For now just bail on these cases to get correctness back. llvm-svn: 235133
* [AArch64][FastISel] Fix integer extend optimization.Juergen Ributzka2015-04-091-5/+6
| | | | | | | | | | | | | | The integer extend optimization tries to fold the extend into the load instruction. This requires us to identify if the extend has already been emitted or not and act accordingly on it. The check that was originally performed for this was not sufficient. Besides checking the ValueMap for a mapped register we also need to check if the virtual register has already an associated machine instruction that defines it. This fixes rdar://problem/20470788. llvm-svn: 234529
* Refactor: Simplify boolean expressions in AArch64 targetDavid Blaikie2015-03-241-1/+1
| | | | | | | | | | | | Simplify boolean expressions using `true` and `false` with `clang-tidy` Patch by Richard Thomson. Reviewed By: rengolin Differential Revision: http://reviews.llvm.org/D8525 llvm-svn: 233089
* Have getCallPreservedMask and getThisCallPreservedMask take aEric Christopher2015-03-111-1/+1
| | | | | | | MachineFunction argument so that we can grab subtarget specific features off of it. llvm-svn: 231979
* Clean up some uses of getSubtarget in AArch64.Eric Christopher2015-01-301-4/+4
| | | | llvm-svn: 227530
* Migrate AArch64 except for TTI and AsmPrinter away from getSubtargetImpl.Eric Christopher2015-01-281-1/+1
| | | | llvm-svn: 227293
* [AArch64] Implement GHC calling conventionGreg Fitzgerald2015-01-191-0/+2
| | | | | | | | | | Original patch by Luke Iannini. Minor improvements and test added by Erik de Castro Lopo. Differential Revision: http://reviews.llvm.org/D6877 From: Erik de Castro Lopo <erikd@mega-nerd.com> llvm-svn: 226473
* [AArch64] MachO large code-model: Materialize FP constants in code.Juergen Ributzka2014-12-101-0/+18
| | | | | | | | | | | | | In the large code model we have to first get the address of the GOT entry, load the address of the constant, and then load the constant itself. To avoid these loads and the GOT entry altogether this commit changes how FP constants are materialized in the large code model. The constants are now materialized in a GPR and then bitconverted/moved into the FPR. Reviewed by Tim Northover Fixes rdar://problem/16572564. llvm-svn: 223941
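The idea of building an FP constant from its integer bit pattern can be illustrated in plain C++. This is a sketch that assumes an IEEE-754 64-bit `double`; the function names are hypothetical and model only the bit reinterpretation, not the instructions FastISel emits.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == sizeof(uint64_t),
              "sketch assumes a 64-bit double");

// The integer immediate that would be materialized in a general-purpose
// register for a given FP constant.
uint64_t fpBitsAsInteger(double Value) {
  uint64_t Bits;
  std::memcpy(&Bits, &Value, sizeof(Bits));
  return Bits;
}

// Models the final move into an FP register: the same bits, reinterpreted.
double integerBitsAsFP(uint64_t Bits) {
  double Value;
  std::memcpy(&Value, &Bits, sizeof(Value));
  return Value;
}
```

Because the bit pattern is an ordinary integer immediate, no constant-pool load or GOT entry is needed to produce it.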
* [FastISel][AArch64] Fix a missing nullptr check in 'computeAddress'.Juergen Ributzka2014-12-091-1/+1
| | | | | | | | | The load/store value type is currently not available when lowering the memcpy intrinsic. Add the missing nullptr check to support this in 'computeAddress'. Fixes rdar://problem/19178947. llvm-svn: 223818
* AArch64: treat [N x Ty] as a block during procedure calls.Tim Northover2014-11-271-0/+1
| | | | | | | | | | | | | | The AAPCS treats small structs and homogeneous floating (or vector) aggregates specially, and guarantees they either get passed as a contiguous block of registers, or prevent any future use of those registers and get passed on the stack. This concept can fit quite neatly into LLVM's own type system, mapping an HFA to [N x float] and so on, and small structs to [N x i64]. Doing so allows front-ends to emit AAPCS compliant code without having to duplicate the register counting logic. llvm-svn: 222903
* [FastISel][AArch64] Fix and extend the tbz/tbnz pattern matching.Juergen Ributzka2014-11-251-19/+20
| | | | | | | | The pattern matching failed to recognize all instances of "-1", because when comparing against "-1" we didn't use an APInt of the same bitwidth. This commit fixes this and also adds inverse versions of the condition to catch more cases. llvm-svn: 222722
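One plausible reading of the bit-width problem is that "-1" only means "all bits set" relative to a given width. The sketch below models that with plain integers instead of APInt; `isAllOnes` is a hypothetical helper and the exact failure in the original code may have differed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true when the low BitWidth bits of Value are all set,
// i.e. when Value represents -1 at that width.
bool isAllOnes(uint64_t Value, unsigned BitWidth) {
  assert(BitWidth >= 1 && BitWidth <= 64 && "unsupported width");
  const uint64_t Mask =
      BitWidth == 64 ? ~0ULL : ((1ULL << BitWidth) - 1);
  return (Value & Mask) == Mask;
}
```

The pattern 0xFFFFFFFF is -1 for a 32-bit value but not for a 64-bit one, which is why the comparison has to be made at the width of the value being matched.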
* [FastISel][AArch64] Also allow folding of sign-/zero-extend and arithmeticChad Rosier2014-11-181-2/+3
| | | | | | | | | shift-right for booleans (i1). Arithmetic shift-right immediate with sign-/zero-extensions also works for boolean values. Update the assert and the test cases to reflect that fact. llvm-svn: 222272
* [FastISel][AArch64] Also allow folding of sign-/zero-extend and logicalChad Rosier2014-11-181-2/+3
| | | | | | | | | shift-right for booleans (i1). Logical shift-right immediate with sign-/zero-extensions also works for boolean values. Update the assert and the test cases to reflect that fact. llvm-svn: 222270
* [FastISel][AArch64] Follow-up fix for "Fix shift-immediate emission for "zero" shifts."Juergen Ributzka2014-11-181-17/+26
| | | | | | | | | | | Shifts also perform sign-/zero-extends to larger types, which requires us to emit an integer extend instead of a simple COPY. Related to PR21594. llvm-svn: 222257
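The decision described above can be summarized as a three-way classification. This is a sketch only; the enum and function names are hypothetical and the real code emits machine instructions rather than returning a tag.

```cpp
#include <cassert>

// Hypothetical classification of how a shift-immediate gets lowered.
enum class ShiftLowering { Copy, Extend, Shift };

ShiftLowering classifyShiftImm(unsigned SrcBits, unsigned DstBits,
                               unsigned ShiftAmt) {
  assert(SrcBits <= DstBits && "shifts here only widen or keep the type");
  if (ShiftAmt != 0)
    return ShiftLowering::Shift; // a real shift instruction is needed
  // A zero shift moves no bits, but a wider result still has to be
  // sign- or zero-extended, so a plain COPY is only valid at equal widths.
  return SrcBits == DstBits ? ShiftLowering::Copy : ShiftLowering::Extend;
}
```

For instance, a zero shift from 8 to 32 bits is classified as `Extend`, while a zero shift at 32 bits on both sides is a `Copy`.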
* [FastISel][AArch64] Fix shift-immediate emission for "zero" shifts.Juergen Ributzka2014-11-181-6/+33
| | | | | | | | This change emits a COPY for a shift-immediate with a "zero" shift value. This fixes PR21594 where we emitted a shift instruction with an incorrect immediate operand. llvm-svn: 222247
* [FastISel][AArch64] Don't bail during simple GEP instruction selection.Juergen Ributzka2014-11-131-0/+23
| | | | | | | | | | | | | The generic FastISel code would bail, because it can't emit a sign-extend for AArch64. This copies the code over and uses AArch64 specific emit functions. This is not ideal and 'computeAddress' should handle this, so it can fold the address computation into the memory operation. I plan to clean up 'computeAddress' anyway, so I will add that in a future commit. Related to rdar://problem/18962471. llvm-svn: 221923
* [FastISel][AArch64] Optimize select when one of the operands is a 'true' or 'false' value.Juergen Ributzka2014-11-131-0/+61
| | | | | | | | | | | Optimize selects of i1 in the presence of 'true' and 'false' operands to simple logic operations. This fixes rdar://problem/18960150. llvm-svn: 221848
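The boolean identities that make this kind of rewrite possible are easy to check directly. The sketch below lists the four cases for `select C, T, F` on i1 values; the function names are hypothetical, and which machine instructions the commit actually picks for each case is not stated in the message.

```cpp
#include <cassert>

// select C, true, F   is equivalent to   C | F
bool foldTrueValIsTrue(bool C, bool F) { return C || F; }

// select C, false, F  is equivalent to   !C & F
bool foldTrueValIsFalse(bool C, bool F) { return !C && F; }

// select C, T, true   is equivalent to   !C | T
bool foldFalseValIsTrue(bool C, bool T) { return !C || T; }

// select C, T, false  is equivalent to   C & T
bool foldFalseValIsFalse(bool C, bool T) { return C && T; }
```

Each function can be compared against the conditional expression it replaces, for example `foldTrueValIsFalse(C, F)` against `C ? false : F`, over all four input combinations.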