path: root/llvm/lib/Target/ARM
Commit message | Author | Date | Files | Lines
* Revert "[Thumb] Teach ISel how to lower compares of AND bitmasks efficiently"James Molloy2016-09-142-138/+5
| | | | | | This reverts commit r281323. It caused chromium test failures and a selfhost failure. llvm-svn: 281451
* This reapplies r281304 | Sjoerd Meijer | 2016-09-14 | 3 | -42/+48
    The issue was that I had missed copying the new isAdd field in the TableGen data structure.
    llvm-svn: 281447
* Revert r281336 (and r281337), it caused PR30372. | Nico Weber | 2016-09-13 | 1 | -7/+68
    llvm-svn: 281361
* Defer asm errors to post-statement failure | Nirav Dave | 2016-09-13 | 1 | -68/+7
    Recommitting after fixing AsmParser initialization.
    Allow errors to be deferred and emitted as part of cleanup to simplify and shorten Assembly parser code. This will allow error messages to be emitted in helper functions and be modified by the caller, which has better context.
    As part of this, many minor cleanups to the Parser:
    * Unify parser cleanup on error.
    * Add a workaround for incorrect return values in ParseDirective instances.
    * Tighten checks on error-signifying return values for parser functions and fix in-tree TargetParsers to be more consistent with the changes.
    * Fix AArch64 test cases checking for spurious error messages that are now fixed.
    These changes should be backwards compatible with current Target Parsers so long as the error statuses are correctly returned in appropriate functions.
    Reviewers: rnk, majnemer
    Subscribers: aemerson, jyknight, llvm-commits
    Differential Revision: https://reviews.llvm.org/D24047
    llvm-svn: 281336
* Revert "[ARM] Promote small global constants to constant pools"James Molloy2016-09-134-144/+1
| | | | | | This reverts commit r281314. Speculatively revert as it's possible this caused linker errors: http://lab.llvm.org:8011/builders/clang-native-arm-lnt/builds/19656 llvm-svn: 281327
* [ARM] Add ".code 32" to functions in the ARM instruction setPablo Barrio2016-09-131-1/+2
| | | | | | | | | | | | | | | | | | | Before, only Thumb functions were marked as ".code 16". These ".code x" directives are effective until the next directive of its kind is encountered. Therefore, in code with interleaved ARM and Thumb functions, it was possible to declare a function as ARM and end up with a Thumb function after assembly. A test has been added. An existing test has also been fixed to take this change into account. Reviewers: aschwaighofer, t.p.northover, jmolloy, rengolin Subscribers: aemerson, rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D24337 llvm-svn: 281324
* [Thumb] Teach ISel how to lower compares of AND bitmasks efficiently | James Molloy | 2016-09-13 | 2 | -5/+138
    For the common pattern (CMPZ (AND x, #bitmask), #0), we can do some more efficient instruction selection if the bitmask is one consecutive sequence of set bits (32 - clz(bm) - ctz(bm) == popcount(bm)).
    1) If the bitmask touches the LSB, then we can remove all the upper bits and set the flags by doing one LSLS.
    2) If the bitmask touches the MSB, then we can remove all the lower bits and set the flags with one LSRS.
    3) If the bitmask has popcount == 1 (only one set bit), we can shift that bit into the sign bit with one LSLS and change the condition query from NE/EQ to MI/PL (we could also implement this by shifting into the carry bit and branching on BCC/BCS).
    4) Otherwise, we can emit a sequence of LSLS+LSRS to remove the upper and lower zero bits of the mask.
    1-3 require only one 16-bit instruction and can elide the CMP. 4 requires two 16-bit instructions but can elide the CMP and doesn't require materializing a complex immediate, so is also a win.
    llvm-svn: 281323
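    A minimal standalone sketch (an assumed helper, not the committed ISel code) of the mask classification the four cases above describe; bm is assumed to be a non-zero 32-bit AND mask:

        #include <cstdint>

        // Classify a non-zero AND mask for (CMPZ (AND x, #bm), #0), mirroring
        // cases 1-4 above. Uses GCC/Clang builtins; bm must be non-zero.
        enum MaskKind { NotContiguous, TouchesLSB, TouchesMSB, SingleBit, Middle };

        MaskKind classifyMask(uint32_t bm) {
          int clz = __builtin_clz(bm), ctz = __builtin_ctz(bm), pop = __builtin_popcount(bm);
          if (32 - clz - ctz != pop)
            return NotContiguous;   // not one run of set bits: fall back to CMP
          if (ctz == 0)
            return TouchesLSB;      // case 1: one LSLS clears the upper bits and sets flags
          if (clz == 0)
            return TouchesMSB;      // case 2: one LSRS clears the lower bits and sets flags
          if (pop == 1)
            return SingleBit;       // case 3: LSLS the bit into the sign bit, test MI/PL
          return Middle;            // case 4: LSLS then LSRS strips both zero ends
        }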
* [ARM] Support ldr.w in pseudo instruction ldr rd,=immediate | Peter Smith | 2016-09-13 | 2 | -0/+7
    The changes made in r269352, r269353 and r269354 to support the transformation of ldr rd,=immediate to mov introduced a regression from 3.8: ldr.w rd, =immediate was no longer supported. This change puts support back in for ldr.w by means of a t2InstAlias for the .w form. The .w is ignored in ARM state and propagated to the ldr in Thumb2.
    llvm-svn: 281319
* [ARM] Promote small global constants to constant pools | James Molloy | 2016-09-13 | 4 | -1/+144
    If a constant is unnamed_addr and is only used within one function, we can save on the code size and runtime cost of an indirection by changing the global's storage to inside the constant pool. For example, instead of:

        ldr r0, .CPI0
        bl printf
        bx lr
    .CPI0: &format_string
    format_string: .asciz "hello, world!\n"

    We can emit:

        adr r0, .CPI0
        bl printf
        bx lr
    .CPI0: .asciz "hello, world!\n"

    This can cause significant code size savings when many small strings are used in one function (4 bytes per string).
    llvm-svn: 281314
* Revert r281304 as it is causing build bot failures in Hexagon hwloop regression tests | Sjoerd Meijer | 2016-09-13 | 3 | -48/+42
    These tests pass locally; will be investigating where these differences come from.
    llvm-svn: 281306
* This adds a new field isAdd to MCInstrDesc | Sjoerd Meijer | 2016-09-13 | 3 | -42/+48
    The ARM and Hexagon instruction descriptions now tag add instructions, and the Hexagon backend is using this to identify loop induction statements.
    Patch by Sam Parker and Sjoerd Meijer.
    Differential Revision: https://reviews.llvm.org/D23601
    llvm-svn: 281304
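    A hedged sketch of how a backend might consume the new flag (the helper name and the exact operand check are illustrative assumptions, not the Hexagon code):

        #include "llvm/CodeGen/MachineInstr.h"

        using namespace llvm;

        // Illustrative only: with an "is add" bit in the instruction description,
        // a pass can query the MCInstrDesc directly instead of switching over
        // target-specific opcodes, e.g. when looking for a loop-induction update.
        static bool looksLikeInductionUpdate(const MachineInstr &MI) {
          return MI.getDesc().isAdd() &&      // tagged as an add in TableGen
                 MI.getOperand(0).isReg();    // produces a register result
        }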
* Temporarily Revert "[MC] Defer asm errors to post-statement failure" as it's causing errors on the sanitizer bots | Eric Christopher | 2016-09-13 | 1 | -7/+68
    This reverts commit r281249.
    llvm-svn: 281280
* Revert r281215, it caused PR30358. | Nico Weber | 2016-09-12 | 2 | -139/+5
    llvm-svn: 281263
* [MC] Defer asm errors to post-statement failure | Nirav Dave | 2016-09-12 | 1 | -68/+7
    Allow errors to be deferred and emitted as part of cleanup to simplify and shorten Assembly parser code. This will allow error messages to be emitted in helper functions and be modified by the caller, which has better context.
    As part of this, many minor cleanups to the Parser:
    * Unify parser cleanup on error.
    * Add a workaround for incorrect return values in ParseDirective instances.
    * Tighten checks on error-signifying return values for parser functions and fix in-tree TargetParsers to be more consistent with the changes.
    * Fix AArch64 test cases checking for spurious error messages that are now fixed.
    These changes should be backwards compatible with current Target Parsers so long as the error statuses are correctly returned in appropriate functions.
    Reviewers: rnk, majnemer
    Subscribers: aemerson, jyknight, llvm-commits
    Differential Revision: https://reviews.llvm.org/D24047
    llvm-svn: 281249
* Revert "[ARM] Promote small global constants to constant pools"James Molloy2016-09-124-123/+1
| | | | | | This reverts commit r281213. It made a bot go bang: http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15-full/builds/14625 llvm-svn: 281228
* [Thumb] Teach ISel how to lower compares of AND bitmasks efficiently | James Molloy | 2016-09-12 | 2 | -5/+139
    For the common pattern (CMPZ (AND x, #bitmask), #0), we can do some more efficient instruction selection if the bitmask is one consecutive sequence of set bits (32 - clz(bm) - ctz(bm) == popcount(bm)).
    1) If the bitmask touches the LSB, then we can remove all the upper bits and set the flags by doing one LSLS.
    2) If the bitmask touches the MSB, then we can remove all the lower bits and set the flags with one LSRS.
    3) If the bitmask has popcount == 1 (only one set bit), we can shift that bit into the sign bit with one LSLS and change the condition query from NE/EQ to MI/PL (we could also implement this by shifting into the carry bit and branching on BCC/BCS).
    4) Otherwise, we can emit a sequence of LSLS+LSRS to remove the upper and lower zero bits of the mask.
    1-3 require only one 16-bit instruction and can elide the CMP. 4 requires two 16-bit instructions but can elide the CMP and doesn't require materializing a complex immediate, so is also a win.
    llvm-svn: 281215
* [ARM] Promote small global constants to constant pools | James Molloy | 2016-09-12 | 4 | -1/+123
    If a constant is unnamed_addr and is only used within one function, we can save on the code size and runtime cost of an indirection by changing the global's storage to inside the constant pool. For example, instead of:

        ldr r0, .CPI0
        bl printf
        bx lr
    .CPI0: &format_string
    format_string: .asciz "hello, world!\n"

    We can emit:

        adr r0, .CPI0
        bl printf
        bx lr
    .CPI0: .asciz "hello, world!\n"

    This can cause significant code size savings when many small strings are used in one function (4 bytes per string).
    llvm-svn: 281213
* CodeGen: Give MachineBasicBlock::reverse_iterator a handle to the current MI | Duncan P. N. Exon Smith | 2016-09-11 | 1 | -9/+2
    Now that MachineBasicBlock::reverse_instr_iterator knows when it's at the end (since r281168 and r281170), implement MachineBasicBlock::reverse_iterator directly on top of an ilist::reverse_iterator by adding an IsReverse template parameter to MachineInstrBundleIterator. This replaces another hard-to-reason-about use of std::reverse_iterator on list iterators, matching the changes for ilist::reverse_iterator from r280032 (see the "out of scope" section at the end of that commit message).
    MachineBasicBlock::reverse_iterator now has a handle to the current node and has obvious invalidation semantics. r280032 has a more detailed explanation of how list-style reverse iterators (invalidated when the pointed-at node is deleted) are different from vector-style reverse iterators like std::reverse_iterator (invalidated on every operation). A great motivating example is this commit's changes to lib/CodeGen/DeadMachineInstructionElim.cpp.
    Note: If your out-of-tree backend deletes instructions while iterating on a MachineBasicBlock::reverse_iterator or converts between MachineBasicBlock::iterator and MachineBasicBlock::reverse_iterator, you'll need to update your code in similar ways to r280032. The following table might help:

        [Old]                               ==>  [New]
        delete &*RI, RE = end()                  delete &*RI++
        RI->erase(), RE = end()                  RI++->erase()
        reverse_iterator(I)                      std::prev(I).getReverse()
        reverse_iterator(I)                      ++I.getReverse()
        --reverse_iterator(I)                    I.getReverse()
        reverse_iterator(std::next(I))           I.getReverse()
        RI.base()                                std::prev(RI).getReverse()
        RI.base()                                ++RI.getReverse()
        --RI.base()                              RI.getReverse()
        std::next(RI).base()                     RI.getReverse()

    (For more details, have a look at r280032.)
    llvm-svn: 281172
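    A small standard-C++ illustration (not LLVM code) of the off-by-one the conversion table above compensates for: a std::reverse_iterator dereferences to the element before the iterator it actually stores.

        #include <cassert>
        #include <iterator>
        #include <list>

        int main() {
          std::list<int> L = {1, 2, 3};
          auto RI = L.rbegin();                       // dereferences to 3 ...
          assert(*RI == 3);
          assert(RI.base() == L.end());               // ... but stores end(): one node past 3
          assert(&*std::prev(RI.base()) == &L.back());
          // An ilist-style reverse iterator instead holds the node it dereferences to,
          // which is why the new conversions pair getReverse() with a std::prev/++ step.
          return 0;
        }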
* [CodeGen] Split out the notions of MI invariance and MI dereferenceability. | Justin Lebar | 2016-09-11 | 3 | -11/+17
    Summary: An IR load can be invariant, dereferenceable, neither, or both. But currently, MI's notion of invariance is IR-invariant && IR-dereferenceable.
    This patch splits up the notions of invariance and dereferenceability at the MI level. It's NFC, so adds some probably-unnecessary "is-dereferenceable" checks, which we can remove later if desired.
    Reviewers: chandlerc, tstellarAMD
    Subscribers: jholewinski, arsenm, nemanjai, llvm-commits
    Differential Revision: https://reviews.llvm.org/D23371
    llvm-svn: 281151
* ARM: move the builtins libcall CC setup | Saleem Abdulrasool | 2016-09-09 | 2 | -0/+169
    Move the target-specific setup into the target-specific lowering setup. As pointed out by Anton, the initial change was moving this too high up the stack, resulting in a violation of the layering (the target-generic code path set up target-specific bits). Sink this into the ARM-specific setup. NFC.
    llvm-svn: 281088
* [ARM] ADD with a negative offset can become SUB for free | James Molloy | 2016-09-09 | 1 | -0/+4
    So model that directly in TTI::getIntImmCost().
    llvm-svn: 281044
* [ARM] icmp %x, -C can be lowered to a simple ADDS or CMN | James Molloy | 2016-09-09 | 1 | -0/+11
    Tell TargetTransformInfo about this so ConstantHoisting is informed.
    llvm-svn: 281043
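    A minimal sketch of the costing idea behind this entry and the ADD-with-negative-offset entry above it (a standalone helper with a toy cost function, not the actual ARMTTIImpl hook): since ADD x, -C can be emitted as SUB x, C and CMP x, -C as CMN x, C, an immediate should cost no more than its negation.

        #include <algorithm>
        #include <cstdint>

        // Toy materialization cost: free if the value fits in an 8-bit unsigned
        // immediate, else one extra instruction. Purely illustrative.
        static int materializationCost(int64_t Imm) {
          return (Imm >= 0 && Imm < 256) ? 0 : 1;
        }

        // For ADD/SUB and integer compares, -C is as cheap as C because the
        // instruction can be flipped to its complementary form (ADD <-> SUB,
        // CMP <-> CMN), so constant hoisting should not pull -C out of the loop.
        int addOrCmpImmCost(int64_t Imm) {
          return std::min(materializationCost(Imm), materializationCost(-Imm));
        }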
* [Thumb] Select (CMPZ X, -C) -> (CMPZ (ADDS X, C), 0) | James Molloy | 2016-09-09 | 1 | -0/+42
    The CMPZ #0 disappears during peepholing, leaving just a tADDi3, tADDi8 or t2ADDri. This avoids having to materialize the expensive negative constant in Thumb-1, and allows a shrinking from a 32-bit CMN to a 16-bit ADDS in Thumb-2.
    llvm-svn: 281040
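    The identity the transform relies on, written out in plain C++ (illustrative, not the DAG pattern itself): under two's-complement wraparound, x == -C exactly when x + C is zero, so the flag-setting ADDS answers the same question as the compare.

        #include <cstdint>

        // x == -C  <=>  (x + C) == 0 with 32-bit wraparound, so a flag-setting
        // add of +C can replace a compare against the negative constant -C.
        bool equalsNegC(uint32_t X, uint32_t C) {
          return X + C == 0;   // same result as X == (uint32_t)(0 - C)
        }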
* [Thumb1] Teach optimizeCompareInstr about thumb1 compares | James Molloy | 2016-09-09 | 1 | -4/+21
    This avoids us doing a completely unneeded "cmp r0, #0" after a flag-setting instruction if we only care about the Z or C flags. Add LSL/LSR to the whitelist while we're here and add testing.
    This code could really do with a spring clean.
    llvm-svn: 281027
* Revert "[XRay] ARM 32-bit no-Thumb support in LLVM"Renato Golin2016-09-086-119/+0
| | | | | | | | | | And associated commits, as they broke the Thumb bots. This reverts commit r280935. This reverts commit r280891. This reverts commit r280888. llvm-svn: 280967
* [ARM XRay] Try to fix Thumb-only failure | Renato Golin | 2016-09-08 | 1 | -1/+1
    I missed the check that it had to support ARM to work. This commit tries to fix that, to make sure we don't emit ARM code in Thumb-only mode.
    llvm-svn: 280935
* [Thumb1] AND with a constant operand can be converted into BIC | James Molloy | 2016-09-08 | 1 | -0/+4
    So model the cost of materializing the constant operand C as the minimum of the costs of C and ~C.
    llvm-svn: 280929
* [Thumb1] Fix cost calculation for complemented immediates | James Molloy | 2016-09-08 | 1 | -1/+1
    Materializing something like "-3" can be done as 2 instructions:

        MOV r0, #3
        MVN r0, r0

    This has a cost of 2, not 3. It looks like we were already trying to detect this pattern in TII::getIntImmCost(), but were taking the complement of the zero-extended value instead of the sign-extended value which is unlikely to ever produce a number < 256.
    There were no tests failing after changing this... :/
    llvm-svn: 280928
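    A standalone sketch of the bug described above (not the TTI code itself): complementing the zero-extended value never yields a small number, while complementing the sign-extended value does.

        #include <cassert>
        #include <cstdint>

        int main() {
          int32_t Imm = -3;                       // candidate for MOV #3 + MVN
          uint64_t ZExt = (uint32_t)Imm;          // 0x00000000FFFFFFFD
          int64_t  SExt = Imm;                    // 0xFFFFFFFFFFFFFFFD as 64 bits
          assert(~ZExt == 0xFFFFFFFF00000002ULL); // complement of ZExt: never < 256 (the bug)
          assert(~(uint64_t)SExt == 2);           // complement of SExt: 2 < 256, MOV+MVN applies
          return 0;
        }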
* Revert "[ARM] Lower UDIV+UREM to UDIV+MLS (and the same for SREM)"Pablo Barrio2016-09-081-18/+1
| | | | | | | | | | This reverts commit r280808. It is possible that this change results in an infinite loop. This is causing timeouts in some tests on ARM, and a Chromebook bot is failing. llvm-svn: 280918
* [XRay] ARM 32-bit no-Thumb support in LLVM | Dean Michael Berris | 2016-09-08 | 6 | -0/+119
    This is a port of XRay to ARM 32-bit, without Thumb support yet. The XRay instrumentation support is moving up to AsmPrinter. This is one of 3 commits to different repositories of the XRay ARM port. The other 2 are:
    1. https://reviews.llvm.org/D23932 (Clang test)
    2. https://reviews.llvm.org/D23933 (compiler-rt)
    Differential Revision: https://reviews.llvm.org/D23931
    llvm-svn: 280888
* [ARM] Lower UDIV+UREM to UDIV+MLS (and the same for SREM) | Pablo Barrio | 2016-09-07 | 1 | -1/+18
    Summary: This saves a library call to __aeabi_uidivmod. However, the processor must feature hardware division in order to benefit from the transformation.
    Reviewers: scott-0, jmolloy, compnerd, rengolin
    Subscribers: t.p.northover, compnerd, aemerson, rengolin, samparker, llvm-commits
    Differential Revision: https://reviews.llvm.org/D24133
    llvm-svn: 280808
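    A plain C++ sketch of the rewrite (illustrative; the actual change is a lowering in the backend): once the quotient is computed with a hardware UDIV, the remainder falls out of one multiply-and-subtract, which ARM/Thumb-2 can express as MLS, so no __aeabi_uidivmod call is needed.

        #include <cstdint>

        // Compute quotient and remainder with a single division: UDIV for Quot,
        // then MLS-style Rem = N - Quot * D instead of a second runtime call.
        void udivrem(uint32_t N, uint32_t D, uint32_t &Quot, uint32_t &Rem) {
          Quot = N / D;          // lowers to UDIV when the core has hardware divide
          Rem  = N - Quot * D;   // lowers to MLS (multiply-and-subtract)
        }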
* ARM: workaround bundled operation predication | Saleem Abdulrasool | 2016-09-06 | 1 | -0/+3
    This is a Windows ARM specific issue. If the code path in the if conversion ends up using a relocation which will form an IMAGE_REL_ARM_MOV32T, we end up with a bundle to ensure that the mov.w/mov.t pair is not split up. This is normally fine; however, if the branch is also predicated, then we end up trying to predicate the bundle.
    For now, report a bundle as being unpredicatable. Although this is false, this would trigger a failure case previously anyways, so this is no worse. That is, there should not be any code which would previously have been if-converted and predicated which would not be now.
    Under certain circumstances, it may be possible to "predicate the bundle". This would require scanning all bundle instructions, ensuring that the bundle contains only predicatable instructions, and converting the bundle into an IT block sequence. If the bundle is larger than the maximal IT block length (4 instructions), it would require materializing multiple IT blocks from the single bundle.
    llvm-svn: 280689
* [Thumb1] Add relocations for fixups fixup_arm_thumb_{br,bcc} | James Molloy | 2016-09-05 | 1 | -0/+6
    These need to be mapped through to R_ARM_THM_JUMP{11,8} respectively. Fixes PR30279.
    llvm-svn: 280651
* Setting fp trapping mode and denormal type | Sjoerd Meijer | 2016-09-02 | 1 | -16/+24
    This is an improvement of r280246 and calculates compatibility of function attributes in a better way.
    Differential Revision: https://reviews.llvm.org/D24070
    llvm-svn: 280534
* Clang patch r280064 introduced ways to set the FP exceptions and denormal types | Sjoerd Meijer | 2016-08-31 | 1 | -10/+31
    This is the LLVM counterpart and it adds options that map onto FP exceptions and denormal build attributes allowing better fp math library selections.
    Differential Revision: https://reviews.llvm.org/D24070
    llvm-svn: 280246
* Handle empty functions with debug info in load/store opt pass | Pablo Barrio | 2016-08-26 | 1 | -1/+1
    Summary: In functions that contained debug info but were empty otherwise, the ARM load/store optimizer could abort. This was because function MergeReturnIntoLDM handled the special case where a machine basic block is empty by calling MBB.empty(). However, this returns false in the presence of debug info, although the function should be considered empty in the eyes of the load/store optimizer. This has been fixed by handling the case where searching through the block finds only debug instructions.
    Reviewers: rengolin, dexonsmith, llvm-commits, jmolloy
    Subscribers: t.p.northover, aemerson, rengolin, samparker
    Differential Revision: https://reviews.llvm.org/D23847
    llvm-svn: 279820
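    A hypothetical helper sketching the fix described above (the name is invented; the real change lives inside MergeReturnIntoLDM): a block whose only contents are debug instructions is treated as empty.

        #include "llvm/CodeGen/MachineBasicBlock.h"
        #include "llvm/CodeGen/MachineInstr.h"

        using namespace llvm;

        // A block containing nothing but debug values is "empty" as far as the
        // load/store optimizer is concerned, even though MBB.empty() is false.
        static bool isEffectivelyEmpty(const MachineBasicBlock &MBB) {
          for (const MachineInstr &MI : MBB)
            if (!MI.isDebugValue())
              return false;
          return true;
        }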
* ARM: by default don't set the Thumb bit on MachO relocated values. | Tim Northover | 2016-08-25 | 1 | -10/+11
    Its existence is largely historical; apparently we tried to make ARM object files look maybe-almost-possibly runnable by putting our best guess at the actual value into relocated locations. Of course, the real linker then comes along and can completely change things.
    But it should only be there for word-sized and movw/movt relocations. It can't be encoded in branch relocations, and I've seen it mess up validity calculations twice in the last couple of weeks, so the default is clearly problematic.
    llvm-svn: 279773
* MachineFunctionProperties/MIRParser: Rename AllVRegsAllocated->NoVRegs, compute it | Matthias Braun | 2016-08-25 | 6 | -6/+6
    Rename AllVRegsAllocated to NoVRegs. This avoids the connotation of running after register allocation and simply describes that no vregs are used in a machine function. With that we can simply compute the property and do not need to dump/parse it in .mir files.
    Differential Revision: http://reviews.llvm.org/D23850
    llvm-svn: 279698
* ARM: don't diagnose cbz/cbnz to Thumb functions. | Tim Northover | 2016-08-24 | 1 | -1/+2
    A branch-distance to a Thumb function shouldn't be forced to be odd for CBZ/CBNZ instructions because (assuming it's within range) it's going to be a valid, even offset.
    llvm-svn: 279665
* Use isTargetMachO instead of isTargetDarwin. | Rafael Espindola | 2016-08-24 | 1 | -1/+1
    llvm-svn: 279655
* [ARM] Generate consistent frame records for Thumb2 | Oliver Stannard | 2016-08-23 | 4 | -33/+39
    There is not an official documented ABI for frame pointers in Thumb2, but we should try to emit something which is useful. We use r7 as the frame pointer for Thumb code, which currently means that if a function needs to save a high register (r8-r11), it will get pushed to the stack between the frame pointer (r7) and link register (r14). This means that while a stack unwinder can follow the chain of frame pointers up the stack, it cannot know the offset to lr, so does not know which functions correspond to the stack frames.
    To fix this, we need to push the callee-saved registers in two batches, with the first push saving the low registers, fp and lr, and the second push saving the high registers. This is already implemented, but previously only used for iOS. This patch turns it on for all Thumb2 targets when frame pointers are required by the ABI, and the frame pointer is r7 (Windows uses r11, so this isn't a problem there). If frame pointer elimination is enabled we still emit a single push/pop even if we need a frame pointer for other reasons, to avoid increasing code size.
    We must also ensure that lr is pushed to the stack when using a frame pointer, so that we end up with a complete frame record. Situations that could cause this were rare, because we already push lr in most situations so that we can return using the pop instruction.
    Differential Revision: https://reviews.llvm.org/D23516
    llvm-svn: 279506
* ARM: Avoid dereferencing end() in ARMFrameLowering::emitEpilogue | Duncan P. N. Exon Smith | 2016-08-21 | 1 | -2/+2
    This fixes the crash from PR29072, where the MachineBasicBlock::iterator wasn't being properly checked against MachineBasicBlock::end() before iterating. This was another bug exposed by the new ilist::iterator::operator*() assertion from r279314.
    This testcase is poor quality. bugpoint couldn't reduce any further, and I haven't had time to dig into what's going on so I can't invent a better one. I didn't even get good CHECK lines in: this is just a crasher. I'm committing anyway since this is a real crash with an obvious fix, but I'll leave PR29072 open and ask an ARM maintainer to help improve the testcase.
    llvm-svn: 279391
* [SelectionDAG] Rename fextend -> fpextend, fround -> fpround, frnd -> fround | Michael Kuperstein | 2016-08-18 | 1 | -4/+4
    The names of the tablegen defs now match the names of the ISD nodes. This makes the world a slightly saner place, as previously "fround" matched ISD::FP_ROUND and not ISD::FROUND.
    Differential Revision: https://reviews.llvm.org/D23597
    llvm-svn: 279129
* Replace a few more "fall through" comments with LLVM_FALLTHROUGH | Justin Bogner | 2016-08-17 | 4 | -7/+8
    Follow up to r278902. I had missed "fall through", with a space.
    llvm-svn: 278970
* GlobalISel: support irtranslation of icmp instructions. | Tim Northover | 2016-08-17 | 1 | -0/+1
    llvm-svn: 278969
* Replace "fallthrough" comments with LLVM_FALLTHROUGHJustin Bogner2016-08-1712-23/+26
| | | | | | | This is a mechanical change of comments in switches like fallthrough, fall-through, or fall-thru to use the LLVM_FALLTHROUGH macro instead. llvm-svn: 278902
* Some places that could use TargetParser in LLVM. NFC. | Zijiao Ma | 2016-08-17 | 2 | -3/+8
    llvm-svn: 278888
* ARM: Avoid dereferencing end() in ARMFrameLowering::emitPrologue | Duncan P. N. Exon Smith | 2016-08-17 | 1 | -1/+2
    llvm::tryFoldSPUpdateIntoPushPop assumes its arguments are valid MachineInstrs. Update ARMFrameLowering::emitPrologue to respect that; when LastPush==end(), it can't possibly be a push instruction anyway.
    llvm-svn: 278880
* Correct the upper bound for a CBZ/CBNZ branch target. | Prakhar Bahuguna | 2016-08-16 | 1 | -2/+4
    Summary: Fix for the upper bound check that was causing a build failure.
    Reviewers: olista01, rengolin, t.p.northover
    Subscribers: llvm-commits
    Differential Revision: https://reviews.llvm.org/D23501
    llvm-svn: 278789
* [Thumb] Validate branch target for CBZ/CBNZ instructions. | Prakhar Bahuguna | 2016-08-16 | 2 | -0/+11
    Summary: The assembler currently does not check the branch target for CBZ/CBNZ instructions, which only permit branching forwards with a positive offset. This adds validation for the branch target to ensure negative PC-relative offsets are not encoded into the instruction, whether specified as a literal or as an assembler symbol.
    Reviewers: rengolin, t.p.northover
    Subscribers: llvm-commits, rengolin
    Differential Revision: https://reviews.llvm.org/D23312
    llvm-svn: 278788
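    A hedged sketch of the kind of check being added (the helper and bounds come from the Thumb CB(N)Z encoding, not from the patch itself): the instruction encodes a 6-bit immediate scaled by 2, so only small, even, non-negative PC-relative offsets are representable.

        #include <cstdint>

        // Offset is the value that fills the i:imm5:'0' field, i.e. the distance
        // from the PC read by the instruction to the branch target. It must be
        // non-negative (forward only), even, and no larger than 126.
        bool isEncodableCbzOffset(int64_t Offset) {
          return Offset >= 0 && Offset <= 126 && (Offset % 2) == 0;
        }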