path: root/llvm/lib/Target/AArch64
Commit message | Author | Age | Files | Lines
* AArch64: rename compact unwind forms back to UNWIND_ARM64_*. NFC. | Tim Northover | 2016-02-23 | 1 | -30/+30
  Looks like the global rename last year was a bit over-zealous. These things really are referred to with ARM64 elsewhere (ld64, libunwind, ...).
  llvm-svn: 261698
* [AArch64] Generate csinv instruction more often | Geoff Berry | 2016-02-23 | 1 | -0/+8
  Reviewers: t.p.northover, jmolloy
  Subscribers: aemerson, rengolin, mcrosier, llvm-commits
  Differential Revision: http://reviews.llvm.org/D17546
  llvm-svn: 261675
* [AArch64] Fix fastcc -tailcallopt epilog code generation. | Geoff Berry | 2016-02-23 | 1 | -6/+23
  Summary: Fix a bug in epilog generation where the incoming stack arguments were not being popped for fastcc functions when -tailcallopt was passed.
  Reviewers: t.p.northover, mcrosier, jmolloy, rengolin
  Subscribers: aemerson, rengolin, mcrosier, llvm-commits
  Differential Revision: http://reviews.llvm.org/D16894
  llvm-svn: 261650
* [AArch64] Fix comment typo in Cyclone scheduling defs. NFC. | Chad Rosier | 2016-02-23 | 1 | -1/+1
  llvm-svn: 261637
* CodeGen: TII: Take MachineInstr& in predicate API, NFC | Duncan P. N. Exon Smith | 2016-02-23 | 1 | -4/+4
  Change TargetInstrInfo API to take `MachineInstr&` instead of `MachineInstr*` in the functions related to predicated instructions (I'll try to come back later and get some of the rest). All of these functions require non-null parameters already, so references are more clear. As a bonus, this happens to factor away a host of implicit iterator => pointer conversions.
  No functionality change intended.
  llvm-svn: 261605
* Revert "CodeGen: MachineInstr::getIterator() => getInstrIterator(), NFC" | Duncan P. N. Exon Smith | 2016-02-22 | 2 | -3/+3
  This reverts commit r261504, since it's not obvious the new name is better:
  http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160222/334298.html
  I'll recommit if we get consensus that it's the right direction.
  llvm-svn: 261567
* Reapply "CodeGen: Use references in MachineTraceMetrics::Trace, NFC" | Duncan P. N. Exon Smith | 2016-02-22 | 1 | -2/+2
  This reverts commit r261510, effectively reapplying r261509. The original commit missed a caller in AArch64ConditionalCompares.
  Original commit message: Pass non-null arguments by reference in MachineTraceMetrics::Trace, simplifying future work to remove implicit iterator => pointer conversions.
  llvm-svn: 261511
* CodeGen: MachineInstr::getIterator() => getInstrIterator(), NFC | Duncan P. N. Exon Smith | 2016-02-21 | 2 | -3/+3
  Delete MachineInstr::getIterator(), since the term "iterator" is overloaded when talking about MachineInstr.
  - Downcast to ilist_node in iplist::getNextNode() and getPrevNode() so that ilist_node::getIterator() is still available.
  - Add it back as MachineInstr::getInstrIterator(). This matches the naming in MachineBasicBlock.
  - Add MachineInstr::getBundleIterator(). This is explicitly called "bundle" (not matching MachineBasicBlock) to distinguish it clearly from ilist_node::getIterator().
  - Update all calls. Some of these I switched to `auto` to remove boiler-plate, since the new name is clear about the type.
  There was one call I updated that looked fishy, but it wasn't clear what the right answer was. This was in X86FrameLowering::inlineStackProbe(), added in r252578 in lib/Target/X86/X86FrameLowering.cpp. I opted to leave the behaviour unchanged, but I'll reply to the original commit on the list in a moment.
  llvm-svn: 261504
* [AArch64][ShrinkWrap] Fix bug in prolog clobbering live reg when shrink wrapping. | Geoff Berry | 2016-02-19 | 2 | -9/+63
  Summary: See bug https://llvm.org/bugs/show_bug.cgi?id=26642
  Reviewers: qcolombet, t.p.northover
  Subscribers: aemerson, rengolin, mcrosier, llvm-commits
  Differential Revision: http://reviews.llvm.org/D17350
  llvm-svn: 261349
* Remove uses of builtin comma operator. | Richard Trieu | 2016-02-18 | 1 | -12/+24
  Cleanup for upcoming Clang warning -Wcomma. No functionality change intended.
  llvm-svn: 261270
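The diff for this commit is not part of the listing, so the following is only a hedged sketch of the kind of rewrite the message describes. The function names are hypothetical; the point is that a statement chained with the built-in comma operator can be split into ordinary statements without changing behavior.

```cpp
#include <cassert>
#include <utility>

// Before: two assignments chained with the built-in comma operator.
// This is the style the commit message says it is cleaning up.
std::pair<int, int> resetWithComma(bool cond, int a, int b) {
  if (cond)
    a = 0, b = 0;
  return {a, b};
}

// After: the same effect written as two ordinary statements in braces.
std::pair<int, int> resetWithBraces(bool cond, int a, int b) {
  if (cond) {
    a = 0;
    b = 0;
  }
  return {a, b};
}

// Self-check: both forms agree for either value of cond.
void checkCommaRewrite() {
  assert(resetWithComma(true, 3, 4) == resetWithBraces(true, 3, 4));
  assert(resetWithComma(false, 3, 4) == resetWithBraces(false, 3, 4));
}
```

The braced form costs a few extra lines, which is consistent with the -12/+24 line count reported for the commit.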
* [AArch64] Reduce vector insert/extract cost for Kryo | Matthew Simpson | 2016-02-18 | 1 | -0/+2
  Differential Revision: http://reviews.llvm.org/D17379
  llvm-svn: 261237
* AArch64: always clear kill flags up to last eliminated copy | Tim Northover | 2016-02-17 | 1 | -7/+7
  After r261154, we were only clearing flags if the known-zero register was originally live-in to the basic block, but we have to do it even if not when more than one COPY has been eliminated, otherwise the user of the first COPY may still have <kill> marked.
  E.g.
      BB#N:
          %X0 = COPY %XZR
          STRXui %X0<kill>, <fi#0>
          %X0 = COPY %XZR
          STRXui %X0<kill>, <fi#1>
  We can eliminate both copies, X0 is not live-in, but we must clear the kill on the first store.
  Unfortunately, I've been unable to come up with a non-fragile test for this. I've only seen it in the wild with regalloc-created spills, and attempts to reproduce that in a reasonable way run afoul of COPY coalescing. Even volatile asm clobbers were moved around. Should fix the aarch64 bot though.
  llvm-svn: 261175
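The rule in this message can be illustrated with a toy model. This is a sketch under stated assumptions, not the pass itself: `ToyInst` and `eliminateZeroCopies` are hypothetical names, every instruction is assumed to touch the same register, and every zero copy is assumed to be redundant.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a machine instruction that touches one register (say X0).
struct ToyInst {
  enum Kind { ZeroCopy, Use } kind; // ZeroCopy models "%X0 = COPY %XZR"
  bool kill;                        // models the <kill> flag on a use of X0
};

// Drops every ZeroCopy (assumed redundant because X0 is already known to be
// zero) and clears <kill> on all uses that precede the last dropped copy.
std::vector<ToyInst> eliminateZeroCopies(const std::vector<ToyInst>& block) {
  const std::size_t none = block.size();
  std::size_t lastCopy = none;
  for (std::size_t i = 0; i < block.size(); ++i)
    if (block[i].kind == ToyInst::ZeroCopy)
      lastCopy = i;

  std::vector<ToyInst> out;
  for (std::size_t i = 0; i < block.size(); ++i) {
    if (block[i].kind == ToyInst::ZeroCopy)
      continue; // the copy itself is removed
    ToyInst inst = block[i];
    if (lastCopy != none && i < lastCopy)
      inst.kill = false; // X0 must stay live up to the last removed copy
    out.push_back(inst);
  }
  assert(out.size() <= block.size());
  return out;
}
```

Applied to the four-instruction example in the message, the first store loses its kill flag and the second keeps it, which is the outcome the message asks for.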
* AArch64: improve redundant copy elimination. | Tim Northover | 2016-02-17 | 1 | -40/+46
  Mostly, this fixes the bug that if the CBZ guaranteed Xn but Wn was used, we didn't sort out the use-def chain properly.
  I've also made it check more than just the last instruction for a compatible CBZ (so it can cope without fallthroughs). I'd have liked to do that separately, but it helps with writing the test.
  Finally, I removed some custom loops in favour of MachineInstr helpers and refactored the control flow to flatten it and avoid possibly quadratic iterations in blocks with many copies. NFC for these, just a general tidy-up.
  llvm-svn: 261154
* [AArch64] Add pass to remove redundant copy after RA | Jun Bum Lim | 2016-02-16 | 4 | -0/+181
  Summary: This change will add a pass to remove unnecessary zero copies in target blocks of cbz/cbnz instructions. E.g., the copy instruction in the code below can be removed because the cbz jumps to BB1 when x0 is zero:
      BB0:
        cbz x0, .BB1
      BB1:
        mov x0, xzr
  Jun
  Reviewers: gberry, jmolloy, HaoLiu, MatzeB, mcrosier
  Subscribers: mcrosier, mssimpso, haicheng, bmakam, llvm-commits, aemerson, rengolin
  Differential Revision: http://reviews.llvm.org/D16203
  llvm-svn: 261004
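The idea in this summary can be sketched on a toy instruction list. The types and the function name are hypothetical, and the sketch assumes the cbz edge is the only way into the block; a real pass has to verify that and handle much more.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy instruction: either "mov reg, xzr", some other write to reg, or
// anything that does not write a register we care about.
struct ToyOp {
  enum Kind { MovZero, OtherDef, Other } kind;
  int reg; // register written by MovZero / OtherDef; ignored for Other
};

// `block` is the target of "cbz Rn, block", so Rn is zero on entry.
// Removes "mov Rn, xzr" until something else redefines Rn.
// Returns the number of removed instructions.
std::size_t removeRedundantZeroCopies(std::vector<ToyOp>& block, int knownZeroReg) {
  std::size_t removed = 0;
  std::size_t i = 0;
  while (i < block.size()) {
    const ToyOp op = block[i];
    if (op.kind == ToyOp::MovZero && op.reg == knownZeroReg) {
      block.erase(block.begin() + static_cast<std::ptrdiff_t>(i));
      ++removed;
      continue; // re-examine the instruction that slid into slot i
    }
    if (op.kind == ToyOp::OtherDef && op.reg == knownZeroReg)
      break; // the register is no longer known to be zero
    ++i;
  }
  assert(removed <= block.size() + removed);
  return removed;
}
```

Stopping at the first other definition keeps the sketch conservative: a later `mov Rn, xzr` is left alone because the register may no longer hold zero at that point.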
* [GlobalISel] Re-apply r260922-260923 with MSVC-friendly code. | Quentin Colombet | 2016-02-16 | 7 | -88/+161
  Original message: Get rid of the ifdefs in TargetLowering. Introduce a new API used only by GlobalISel: CallLowering. This API will contain target hooks dedicated to call lowering.
  llvm-svn: 260998
* Reverting r260922-260923; they cause link failures with MSVC. | Aaron Ballman | 2016-02-16 | 7 | -159/+88
  http://lab.llvm.org:8011/builders/lldb-x86-windows-msvc2015/builds/15436/steps/build/logs/stdio
  http://bb.pgr.jp/builders/msbuild-llvmclang-x64-msc18-DA/builds/961/steps/build_llvm/logs/stdio
  llvm-svn: 260972
* [GlobalISel] Get rid of the ifdefs in TargetLowering. | Quentin Colombet | 2016-02-16 | 7 | -88/+159
  Introduce a new API used only by GlobalISel: CallLowering. This API will contain target hooks dedicated to call lowering.
  llvm-svn: 260922
* [AArch64] Enable post-RA MI scheduler for Kryo. | Chad Rosier | 2016-02-12 | 1 | -1/+1
  This should have landed in r260686.
  llvm-svn: 260739
* [AArch64] Reduce number of callee-save save/restores. | Geoff Berry | 2016-02-12 | 1 | -126/+160
  Summary: Before this change, callee-save registers would be rounded up to even pairs of GPRs and FPRs. This change eliminates these extra padding load/stores, though it does keep the stack allocation the same size unless both the GPR and FPR sets have an odd size, in which case one full pair stack slot (16 bytes) is saved.
  This optimization cannot currently be done for MachO targets since they rely on a fast-path .debug_frame equivalent that can only encode callee-save registers as pairs.
  Reviewers: t.p.northover, rengolin, mcrosier, jmolloy
  Subscribers: aemerson, rengolin, mcrosier, llvm-commits
  Differential Revision: http://reviews.llvm.org/D17000
  llvm-svn: 260689
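The size arithmetic in this summary can be worked through with a small model. It is a sketch only: it assumes every saved register occupies 8 bytes and that the save area is rounded up to the 16-byte stack alignment, and the function names are hypothetical rather than taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kSlotBytes = 8;   // assumed size of one saved register
constexpr std::uint64_t kStackAlign = 16; // assumed alignment of the save area

constexpr std::uint64_t roundUpTo(std::uint64_t value, std::uint64_t multiple) {
  return (value + multiple - 1) / multiple * multiple;
}

// Old scheme as described: each register class is padded to an even count.
std::uint64_t paddedCalleeSaveBytes(unsigned gprs, unsigned fprs) {
  return kSlotBytes * (roundUpTo(gprs, 2) + roundUpTo(fprs, 2));
}

// New scheme as described: only the combined area is rounded up.
std::uint64_t packedCalleeSaveBytes(unsigned gprs, unsigned fprs) {
  return roundUpTo(kSlotBytes * (gprs + fprs), kStackAlign);
}

// Worked cases: a saving appears only when both counts are odd.
void checkCalleeSaveSizes() {
  assert(paddedCalleeSaveBytes(3, 3) == 64 && packedCalleeSaveBytes(3, 3) == 48);
  assert(paddedCalleeSaveBytes(3, 2) == 48 && packedCalleeSaveBytes(3, 2) == 48);
}
```

With 3 GPRs and 3 FPRs the model saves 16 bytes, one full pair slot; with 3 and 2 the allocation is unchanged, matching the behavior stated in the summary.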
* [AArch64] Add support for Qualcomm Kryo CPU. | Chad Rosier | 2016-02-12 | 8 | -5/+2506
  Machine model description by Dave Estes <cestes@codeaurora.org>.
  llvm-svn: 260686
* [AArch64] Merge two adjacent str WZR into str XZR | Jun Bum Lim | 2016-02-12 | 1 | -15/+30
  Summary: This change merges adjacent 32 bit zero stores into a 64 bit zero store. e.g.,
      str wzr, [x0]
      str wzr, [x0, #4]
  becomes
      str xzr, [x0]
  Therefore, four adjacent 32 bit zero stores will be a single stp. e.g.,
      str wzr, [x0]
      str wzr, [x0, #4]
      str wzr, [x0, #8]
      str wzr, [x0, #12]
  becomes
      stp xzr, xzr, [x0]
  Reviewers: mcrosier, jmolloy, gberry, t.p.northover
  Subscribers: aemerson, rengolin, mcrosier, llvm-commits
  Differential Revision: http://reviews.llvm.org/D16933
  llvm-svn: 260682
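The two-step effect described in this summary (32-bit zero stores fused into a 64-bit zero store, then 64-bit zero stores paired into an stp) can be sketched as follows. This is an illustration only: `mergeZeroStores` is a hypothetical helper that works on sorted byte offsets from a single base register `x0` and ignores encoding limits, aliasing and intervening instructions, all of which the real optimizer must respect.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Formats an address operand relative to the assumed base register x0.
static std::string addr(int off) {
  return off == 0 ? std::string("[x0]")
                  : "[x0, #" + std::to_string(off) + "]";
}

// Input: sorted byte offsets of 32-bit zero stores ("str wzr") off x0.
// Output: the stores left after merging, as assembly-like strings.
std::vector<std::string> mergeZeroStores(const std::vector<int>& offs32) {
  struct Store { int off; int bytes; };

  // Step 1: fuse 32-bit zero stores at (o, o+4) into one 64-bit store at o.
  std::vector<Store> stores;
  for (std::size_t i = 0; i < offs32.size(); ++i) {
    if (i + 1 < offs32.size() && offs32[i + 1] == offs32[i] + 4) {
      stores.push_back({offs32[i], 8});
      ++i; // the second half has been consumed
    } else {
      stores.push_back({offs32[i], 4});
    }
  }

  // Step 2: pair 64-bit zero stores at (o, o+8) into one stp.
  std::vector<std::string> out;
  for (std::size_t i = 0; i < stores.size(); ++i) {
    const Store& s = stores[i];
    const bool canPair = s.bytes == 8 && i + 1 < stores.size() &&
                         stores[i + 1].bytes == 8 &&
                         stores[i + 1].off == s.off + 8;
    if (canPair) {
      out.push_back("stp xzr, xzr, " + addr(s.off));
      ++i; // the partner store has been consumed
    } else if (s.bytes == 8) {
      out.push_back("str xzr, " + addr(s.off));
    } else {
      out.push_back("str wzr, " + addr(s.off));
    }
  }
  assert(out.size() <= offs32.size());
  return out;
}
```

Calling `mergeZeroStores({0, 4})` yields the single `str xzr, [x0]` from the first example in the summary, and `mergeZeroStores({0, 4, 8, 12})` yields the single `stp xzr, xzr, [x0]` from the second.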
* [AArch64] Implements the lowering of formal arguments for GlobalISel. | Quentin Colombet | 2016-02-11 | 2 | -0/+53
  This is just a trivial implementation:
  - Support only arguments passed in registers.
  - Support only "plain" arguments, i.e., no sext/zext attribute.
  At this point, it is possible to play with the IRTranslator on AArch64:
      llc -mtriple arm64-<vendor>-<os> -print-machineinstrs <input.ll> -o - -global-isel
  For now, we only support the translation of programs with adds and returns.
  Follow-up patches are on their way to add a test case (the MIRParser is not ready as it is).
  llvm-svn: 260600
* [AArch64] Trivial implementation of lower return for the IRTranslator. | Quentin Colombet | 2016-02-11 | 2 | -0/+34
  llvm-svn: 260574
* [AArch64] Plug the beginning of the GlobalISel pipeline. | Quentin Colombet | 2016-02-11 | 2 | -1/+14
  llvm-svn: 260569
* [AArch64] Refactoring findMatchingStore() in aarch64-ldst-opt; NFC | Jun Bum Lim | 2016-02-11 | 1 | -11/+13
  Summary: This change makes findMatchingStore() follow the same coding style introduced in r260275.
  Reviewers: gberry, junbuml
  Subscribers: aemerson, rengolin, haicheng, bmakam, mssimpso
  Differential Revision: http://reviews.llvm.org/D17083
  llvm-svn: 260534
* [AArch64] Improve load/store optimizer to handle LDUR + LDR. | Chad Rosier | 2016-02-11 | 1 | -11/+68
  This patch allows the mixing of scaled and unscaled load/stores to form load/store pairs.
  This is a reapplication of r259812, which had an incorrect assert. The test_stur_str_no_assert() test is a reduced version of the issue hit in the AArch64 self-host.
  PR24465
  llvm-svn: 260523
* [AArch64] Refactor is logic into a helper function. NFC. | Chad Rosier | 2016-02-10 | 1 | -12/+22
  llvm-svn: 260419
* [AArch64] Update comment to match reality. NFC. | Chad Rosier | 2016-02-10 | 1 | -2/+2
  llvm-svn: 260406
* [AArch64] This bit of logic is specific to pairing. NFC. | Chad Rosier | 2016-02-10 | 1 | -8/+10
  llvm-svn: 260383
* [CodeGen] Prefer "if (SDValue R = ...)" to "if (R.getNode())". NFCI. | Ahmed Bougacha | 2016-02-09 | 1 | -12/+6
  llvm-svn: 260316
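The idiom named in this subject line can be shown without LLVM headers. The sketch below uses a hypothetical `ToyValue` wrapper as a stand-in for `SDValue`; the only property it relies on is an explicit conversion to `bool` that is true when the wrapped node is non-null.

```cpp
#include <cassert>

struct ToyNode { int id; };

// Hypothetical stand-in for SDValue: a nullable handle to a node.
class ToyValue {
  ToyNode* N = nullptr;
public:
  ToyValue() = default;
  explicit ToyValue(ToyNode* n) : N(n) {}
  ToyNode* getNode() const { return N; }
  explicit operator bool() const { return N != nullptr; }
};

// Hypothetical combine step: succeeds only for nodes with an even id.
ToyValue tryCombine(ToyNode* n) {
  return (n && n->id % 2 == 0) ? ToyValue(n) : ToyValue();
}

// Older style: declare first, then test getNode().
int combineOldStyle(ToyNode* n) {
  ToyValue R = tryCombine(n);
  if (R.getNode())
    return R.getNode()->id;
  return -1;
}

// Preferred style: declare and test in the if condition.
int combineNewStyle(ToyNode* n) {
  if (ToyValue R = tryCombine(n))
    return R.getNode()->id;
  return -1;
}

// Self-check: both styles agree on success and on failure.
void checkIfInitIdiom() {
  ToyNode even{4}, odd{3};
  assert(combineOldStyle(&even) == combineNewStyle(&even));
  assert(combineOldStyle(&odd) == combineNewStyle(&odd));
}
```

The second form scopes `R` to the `if` statement and drops one line per use, which fits the -12/+6 line count reported for the commit.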
* [AArch64] This check is specific to merging instructions. NFC. | Chad Rosier | 2016-02-09 | 1 | -4/+4
  llvm-svn: 260283
* [AArch64] AArch64LoadStoreOptimizer: fix bug in pre-inc check iterator | Geoff Berry | 2016-02-09 | 1 | -8/+9
  Summary: Fix case where a pre-inc/dec load/store would not be formed if the add/sub that forms the inc/dec part of the operation was the first instruction in the block being examined.
  Reviewers: mcrosier, jmolloy, t.p.northover, junbuml
  Subscribers: aemerson, rengolin, mcrosier, llvm-commits
  Differential Revision: http://reviews.llvm.org/D16785
  llvm-svn: 260275
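The listing does not include the code that was fixed, so the following only illustrates the class of boundary condition the summary describes: a backward scan that must also examine the first instruction of the block. The function and its integer "instructions" are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scans backward from index `from` (exclusive) to the start of the block and
// returns the index of the nearest element equal to `wanted`, or -1.
// The loop form `i-- > 0` visits index 0 as well, so a match on the very
// first instruction is not skipped.
long findUpdateBackward(const std::vector<int>& block, std::size_t from, int wanted) {
  assert(from <= block.size());
  for (std::size_t i = from; i-- > 0;) {
    if (block[i] == wanted)
      return static_cast<long>(i);
  }
  return -1;
}
```

A scan that stops when its iterator reaches the beginning of the block, without testing that element, would return -1 for a match at index 0; the form above returns 0.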
* [AArch64] Bail even earlier if the instruction modifies the base register. NFC. | Chad Rosier | 2016-02-09 | 1 | -5/+6
  llvm-svn: 260274
* [AArch64] Simplify. NFC. | Chad Rosier | 2016-02-09 | 1 | -3/+1
  llvm-svn: 260273
* [AArch64] Add an assert to ensure we don't scale an offset that can't be scaled. | Chad Rosier | 2016-02-09 | 1 | -1/+3
  llvm-svn: 260272
* [AArch64] Add a FIXME about invalid KILL markers after the ld/st opt pass. | Chad Rosier | 2016-02-09 | 1 | -0/+5
  llvm-svn: 260264
* [AArch64] Remove redundant calls and clang format. NFC. | Chad Rosier | 2016-02-09 | 1 | -42/+40
  llvm-svn: 260260
* [AArch64] Hoist now common logic. NFC. | Chad Rosier | 2016-02-09 | 1 | -13/+9
  llvm-svn: 260257
* [AArch64] Rename variable to make it clear we're merging here, not pairing. | Chad Rosier | 2016-02-09 | 1 | -19/+19
  llvm-svn: 260256
* [AArch64] Separate the codegen logic for widening vs. pairing. NFC. | Chad Rosier | 2016-02-09 | 1 | -38/+94
  llvm-svn: 260249
* [AArch64] Cleanup to simplify logic when widening vs. pairing loads/stores. NFC. | Chad Rosier | 2016-02-09 | 1 | -13/+50
  The logic to pair instructions and merge narrow instructions has become kludgy and error prone. This patch begins to unravel these two similar, but distinct optimizations.
  llvm-svn: 260242
* [AArch64] Rename variable to improve readability. NFC. | Chad Rosier | 2016-02-09 | 1 | -5/+5
  llvm-svn: 260228
* [AArch64] Remove stale comment. | Chad Rosier | 2016-02-09 | 1 | -3/+0
  llvm-svn: 260226
* AArch64: match correct order in subtraction pattern. | Tim Northover | 2016-02-08 | 1 | -4/+4
  The accumulator in multiply-and-subtract instructions is actually subtracted *from* so these patterns were computing the wrong value.
  llvm-svn: 260131
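The operand order this message refers to can be stated as plain arithmetic. The sketch below is a scalar model with hypothetical function names; it does not reproduce the patterns changed by the commit.

```cpp
#include <cassert>
#include <cstdint>

// Multiply-and-subtract as the message describes it: the product is
// subtracted from the accumulator.
std::int64_t multiplySubtract(std::int64_t acc, std::int64_t a, std::int64_t b) {
  return acc - a * b;
}

// The mistaken order: the accumulator is subtracted from the product.
std::int64_t multiplySubtractWrongOrder(std::int64_t acc, std::int64_t a, std::int64_t b) {
  return a * b - acc;
}

// Worked example: 10 - 2*3 = 4, while 2*3 - 10 = -4.
void checkMultiplySubtract() {
  assert(multiplySubtract(10, 2, 3) == 4);
  assert(multiplySubtractWrongOrder(10, 2, 3) == -4);
}
```

The two results differ in sign whenever they are non-zero, so a pattern written in the wrong order computes a different value, as the message says.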
* [AArch64] Add the scheduling model for Exynos-M1 | Evandro Menezes | 2016-02-06 | 2 | -2/+361
  Summary: Add the core scheduling model for the Samsung Exynos-M1 (ARMv8-A).
  Reviewers: jmolloy, rengolin, christof, MinSeongKIM, t.p.northover
  Subscribers: aemerson, rengolin, MatzeB
  Differential Revision: http://reviews.llvm.org/D16644
  llvm-svn: 259958
* [AArch64] Refactoring aarch64-ldst-opt. NFC. | Jun Bum Lim | 2016-02-05 | 1 | -25/+38
  Remove narrow load / store instructions from getMatchingPairOpcode(), and add getMatchingWideOpcode().
  llvm-svn: 259914
* Revert "[AArch64] Improve load/store optimizer to handle LDUR + LDR (take 3)." | Renato Golin | 2016-02-05 | 1 | -76/+21
  This reverts commit r259812 as it broke AArch64 self-hosting.
  llvm-svn: 259881
* [AArch64] Bound the number of instructions we scan when searching for updates. | Chad Rosier | 2016-02-04 | 1 | -14/+26
  This only impacts the creation of pre-/post-index instructions. The bound was set high enough such that it did not change code generation for SPEC200X.
  llvm-svn: 259828
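A bounded search of this kind might look like the sketch below. The function name, the integer "instructions" and the `limit` parameter are assumptions made for illustration; the listing does not give the actual bound or how it is configured.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scans forward from `start` and returns the index of the first element equal
// to `wanted`, examining at most `limit` elements. Returns -1 when the match
// lies beyond the limit or does not exist.
long findUpdateForward(const std::vector<int>& block, std::size_t start,
                       int wanted, unsigned limit) {
  assert(start <= block.size());
  unsigned scanned = 0;
  for (std::size_t i = start; i < block.size() && scanned < limit; ++i) {
    ++scanned;
    if (block[i] == wanted)
      return static_cast<long>(i);
  }
  return -1;
}
```

A limit turns a scan that can be linear in the block size into one with a fixed worst case per candidate, at the cost of missing matches that lie further away. The message indicates the chosen bound was high enough that this did not change generated code for the benchmarks it mentions.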
* [AArch64] Improve load/store optimizer to handle LDUR + LDR (take 3). | Chad Rosier | 2016-02-04 | 1 | -21/+76
  This patch allows the mixing of scaled and unscaled load/stores to form load/store pairs.
  PR24465
  http://reviews.llvm.org/D12116
  Many thanks to Ahmed and Michael for fixes and code review.
  This is a reapplication of r246769 and r259790. The tramp3d failure was caused by an incorrect refactoring in the patch. Specifically, we weren't always properly clearing the SExtIdx flag.
  llvm-svn: 259812
* [AArch64] Multiply extended 32-bit ints with `[U|S]MADDL' | Silviu Baranga | 2016-02-04 | 1 | -0/+40
  During instruction selection, the AArch64 backend can recognise the following pattern and generate an [U|S]MADDL instruction, i.e. a multiply of two 32-bit operands with a 64-bit result:
      (mul (sext i32), (sext i32))
  However, when one of the operands is constant, the sign extension gets folded into the constant in SelectionDAG::getNode(). This means that the instruction selection sees this:
      (mul (sext i32), i64)
  ...which doesn't match the pattern. Sign-extension and 64-bit multiply instructions are generated, which are slower than one 32-bit multiply.
  Add a pattern to match this and generate the correct instruction, for both signed and unsigned multiplies.
  Patch by Chris Diamand!
  llvm-svn: 259800
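The equivalence behind this pattern can be checked with scalar arithmetic. The sketch below is not the selection pattern itself: `smaddl` is a plain C++ model of a signed widening multiply-add, and `fitsInSignedI32` is one plausible condition (an assumption here, not taken from the patch) under which a 64-bit constant can be treated as a sign-extended 32-bit operand.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// True when the 64-bit constant is representable as a signed 32-bit value,
// i.e. when it equals the sign extension of its low 32 bits.
bool fitsInSignedI32(std::int64_t c) {
  return c >= std::numeric_limits<std::int32_t>::min() &&
         c <= std::numeric_limits<std::int32_t>::max();
}

// Scalar model of a signed widening multiply-add: widen both 32-bit
// operands, multiply, then add the 64-bit accumulator.
std::int64_t smaddl(std::int32_t a, std::int32_t b, std::int64_t acc) {
  return static_cast<std::int64_t>(a) * static_cast<std::int64_t>(b) + acc;
}

// Scalar model of (mul (sext i32), i64): what selection sees after the
// extension has been folded into the constant.
std::int64_t mulSextByConst(std::int32_t a, std::int64_t c) {
  return static_cast<std::int64_t>(a) * c;
}

// When the constant fits in 32 bits, both forms produce the same value.
void checkWideningMultiply() {
  const std::int64_t c = 100000;
  assert(fitsInSignedI32(c));
  assert(mulSextByConst(-3, c) == smaddl(-3, static_cast<std::int32_t>(c), 0));
}
```

The unsigned case mentioned in the message would follow the same shape with zero extension and unsigned types.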