path: root/llvm/lib/Target/ARM
Commit message | Author | Age | Files | Lines
* [ARM] Don't pessimize i32 vselect. | Charlie Turner | 2015-11-17 | 1 | -3/+0
  The underlying issues surrounding codegen for 32-bit vselects have been resolved. The pessimistic costs for 64-bit vselects remain due to the bad scalarization that is still happening there.
  I tested this on A57 in T32, A32 and A64 modes. I saw no regressions, and some improvements. From my benchmarks, I saw these improvements in A57 (T32):
    spec.cpu2000.ref.177_mesa                                          5.95%
    lnt.SingleSource/Benchmarks/Shootout/strcat                       12.93%
    lnt.MultiSource/Benchmarks/MiBench/telecomm-CRC32/telecomm-CRC32  11.89%
  I also measured A57 A32, A53 T32 and A9 T32 and found no performance regressions. I see much bigger wins in third-party benchmarks with this change.
  Differential Revision: http://reviews.llvm.org/D14743
  llvm-svn: 253349
* [ARM] Default to ARMv4t in favour of adding Other to ARMArch | Bradley Smith | 2015-11-17 | 2 | -2/+2
  llvm-svn: 253335
* [ARM] Match VABDL from log2 shuffles. | Charlie Turner | 2015-11-17 | 1 | -0/+23
  Differential Revision: http://reviews.llvm.org/D14664
  llvm-svn: 253334
* [ARM] Properly initialize ARMArch in the ARM subtarget | Bradley Smith | 2015-11-17 | 2 | -3/+3
  llvm-svn: 253331
* [Assembler] Make fatal assembler errors non-fatal | Oliver Stannard | 2015-11-17 | 3 | -27/+53
  Currently, if the assembler encounters an error after parsing (such as an out-of-range fixup), it reports this as a fatal error, and so stops after the first error. However, for most of these there is an obvious way to recover after emitting the error, such as emitting the fixup with a value of zero. This means that we can report on all of the errors in a file, not just the first one. MCContext::reportError records the fact that an error was encountered, so we won't actually emit an object file with the incorrect contents.
  Differential Revision: http://reviews.llvm.org/D14717
  llvm-svn: 253328
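  A minimal sketch of the recovery pattern described above, assuming the surrounding code has an MCContext and the fixup's source location at hand; the function name and the 24-bit range check are made up for illustration:

    // Assumes llvm/MC/MCContext.h and llvm/Support/MathExtras.h (isInt) are included.
    // Instead of report_fatal_error, report through MCContext (which records that an
    // error occurred, so no object file is emitted) and keep assembling with a safe value.
    void checkFixupRange(llvm::MCContext &Ctx, llvm::SMLoc Loc, int64_t &Value) {
      if (!llvm::isInt<24>(Value)) {          // hypothetical range check
        Ctx.reportError(Loc, "fixup value out of range");
        Value = 0;                            // recover so later errors are still diagnosed
      }
    }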
* Drop prelink support. | Rafael Espindola | 2015-11-17 | 1 | -2/+0
  The way prelink used to work was:
    * The compiler decides if a given section only has relocations that are known to point to the same DSO. If so, it names it .data.rel.ro.local<something>.
    * The static linker puts all of these together.
    * The prelinker program assigns addresses to each library and resolves the local relocations.
  There are many problems with this:
    * It is incompatible with address space randomization.
    * The information passed by the compiler is redundant. The linker knows if a given relocation is in the same DSO or not. It could sort by that if so desired.
    * There are newer ways of speeding up DSO (gnu hash for example).
    * Even if we want to implement this again in the compiler, the previous implementation is pretty broken. It talks about relocations that are "resolved by the static linker". If they are resolved, there are none left for the prelinker. What one needs to track is if an expression will require only dynamic relocations that point to the same DSO.
  At this point it looks like the prelinker is an historical curiosity. For example, fedora has retired it because it failed to build for two releases (http://pkgs.fedoraproject.org/cgit/prelink.git/commit/?id=eb43100a8331d91c801ee3dcdb0a0bb9babfdc1f).
  This patch removes support for it. That is, it stops printing the ".local" sections.
  llvm-svn: 253280
* [ARM] Prevent use of a value pointed by end() iterator when placing a jump table | Petr Pavlu | 2015-11-16 | 1 | -0/+2
  Function ARMConstantIslands::doInitialJumpTablePlacement() iterates over all basic blocks in a machine function. It calls `MI = MBB.getLastNonDebugInstr()` to get the last instruction in each block and then uses MI->getOpcode() to decide what to do. If getLastNonDebugInstr() returns MBB.end() (for example, when the block does not contain any instructions) then calling getOpcode() on this value is incorrect. Avoid this problem by checking the result of getLastNonDebugInstr().
  Differential Revision: http://reviews.llvm.org/D14694
  llvm-svn: 253222
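  A simplified sketch of the guard, not the exact ARMConstantIslands code; the loop body handling is elided:

    for (MachineBasicBlock &MBB : MF) {
      MachineBasicBlock::iterator MI = MBB.getLastNonDebugInstr();
      if (MI == MBB.end())
        continue;                  // empty or debug-only block: nothing to inspect
      switch (MI->getOpcode()) {
        // ... handle jump-table terminators as before ...
      }
    }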
* [ARM,AArch64] Store source location of asm constant pool entries | Oliver Stannard | 2015-11-16 | 2 | -4/+7
  Storing the source location of the expression that created a constant pool entry allows us to emit better error messages if we later discover that the expression cannot be represented by a relocation.
  Differential Revision: http://reviews.llvm.org/D14646
  llvm-svn: 253220
* [ARM,AArch64] Store source location for values in assembly files | Oliver Stannard | 2015-11-16 | 2 | -2/+2
  The MCValue class can store an SMLoc to allow better error messages to be emitted if an error is detected after parsing. The ARM and AArch64 assembly parsers were not setting this, so error messages did not have source information.
  Differential Revision: http://reviews.llvm.org/D14645
  llvm-svn: 253219
* Handle ARMv6KZ naming | Artyom Skrobov | 2015-11-16 | 3 | -6/+5
  Summary:
    * ARMv6KZ is the "canonical" name, given in the ARMARM
    * ARMv6Z is an "official abbreviation" for it, mentioned in the ARMARM
    * ARMv6ZK is a popular misspelling, which we should support as an alias.
  The patch corrects the handling of the names.
  Functional changes:
    * ARMv6Z no longer treated as an architecture in its own right
    * ARMv6ZK renamed to ARMv6KZ, accepting ARMv6ZK as an alias
    * arm1176jz-s and arm1176jzf-s recognized as ARMv6ZK, instead of ARMv6K
    * default ARMv6K CPU changed to arm1176j-s
  Reviewers: rengolin, logan, compnerd
  Subscribers: aemerson, llvm-commits, rengolin
  Differential Revision: http://reviews.llvm.org/D14568
  llvm-svn: 253206
* [ARM] Introduce subtarget features per ARM architecture. | Bradley Smith | 2015-11-16 | 4 | -358/+407
  This allows for accurate architecture targeting as well as removing duplicate information (hardcoded feature strings) from MCTargetDesc.
  llvm-svn: 253196
* Properly check if a CMPZ node is in fact comparing against zero | James Molloy | 2015-11-16 | 1 | -0/+6
  This was left implicit and never ever checked, which means we could have a CMPZ against some non-zero value and we were carrying on with BFI conversion regardless. Caught by Oliver Stannard using csmith; regression test added.
  llvm-svn: 253195
* Reduce the size of MCRelaxableFragment. | Akira Hatanaka | 2015-11-14 | 1 | -1/+6
  MCRelaxableFragment previously kept a copy of MCSubtargetInfo and MCInst to enable re-encoding the MCInst later during relaxation. A copy of MCSubtargetInfo (instead of a reference or pointer) was needed because the feature bits could be modified by the parser.
  This commit replaces the MCSubtargetInfo copy in MCRelaxableFragment with a constant reference to MCSubtargetInfo. The copies of MCSubtargetInfo are kept in MCContext, and the target parsers are now responsible for asking MCContext to provide a copy whenever the feature bits of MCSubtargetInfo have to be toggled.
  With this patch, I saw a 4% reduction in peak memory usage when I compiled verify-uselistorder.lto.bc using llc.
  rdar://problem/21736951
  Differential Revision: http://reviews.llvm.org/D14346
  llvm-svn: 253127
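  A minimal sketch of the layout change, with simplified member names rather than the actual LLVM class: the fragment holds a reference to an MCSubtargetInfo owned elsewhere (for example by MCContext) instead of its own copy.

    class MCRelaxableFragmentSketch {
      MCInst Inst;
      const MCSubtargetInfo &STI;   // was: MCSubtargetInfo STI; (one full copy per fragment)
    public:
      MCRelaxableFragmentSketch(const MCInst &I, const MCSubtargetInfo &S)
          : Inst(I), STI(S) {}       // S must outlive the fragment, e.g. owned by MCContext
      const MCSubtargetInfo &getSubtargetInfo() const { return STI; }
    };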
* [MCTargetAsmParser] Move the member variables that reference MCSubtargetInfo in the subclasses into MCTargetAsmParser and define a member function getSTI | Akira Hatanaka | 2015-11-14 | 1 | -17/+16
  This is done in preparation for making changes to shrink the size of MCRelaxableFragment (see http://reviews.llvm.org/D14346).
  llvm-svn: 253124
* [ARM] Replace ARMISD::RBIT with ISD::BITREVERSE | James Molloy | 2015-11-13 | 4 | -10/+7
  ISD::BITREVERSE matches "rbit" completely, so remove ARMISD::RBIT and mark ISD::BITREVERSE as legal, adding a test for lowering.
  llvm-svn: 253047
* Cull non-standard variants of ARM architectures (NFC) | Artyom Skrobov | 2015-11-12 | 2 | -14/+5
  Summary: This patch changes ARMV5, ARMV5E, ARMV6SM, ARMV6HL, ARMV7, ARMV7L, ARMV7HL, ARMV7EM to be treated as aliases for the corresponding standard architectures, instead of as actual architectures.
  Reviewers: rengolin
  Subscribers: aemerson, llvm-commits, rengolin
  Differential Revision: http://reviews.llvm.org/D14577
  llvm-svn: 252903
* [ARM] CMOV->BFI combining: handle both senses of CMPZ | James Molloy | 2015-11-12 | 1 | -0/+10
  I completely misunderstood what ARMISD::CMPZ means. It's not "compare equal to zero", it's "compare, only setting the zero/Z flag". It can either be equal-to-zero or not-equal-to-zero, and we weren't checking what sense it was. If it's equal-to-zero, we can swap the operands around and pretend like it is not-equal-to-zero, which is both a bug fix and lets us handle more cases.
  llvm-svn: 252891
* Revert "[ARM] Enable shrink-wrapping by default." | Renato Golin | 2015-11-12 | 1 | -5/+0
  This reverts commit r252825, as it broke ASAN on ARM. Investigating...
  llvm-svn: 252889
* [ARM] Enable shrink-wrapping by default. | Quentin Colombet | 2015-11-11 | 1 | -0/+5
  Differential Revision: http://reviews.llvm.org/D14357
  rdar://problem/21942589
  llvm-svn: 252825
* Properly fix unused variable in disable-assert builds. | Diego Novillo | 2015-11-11 | 1 | -1/+3
  I missed the side-effects of ParseBFI in my previous attempt (r252748). Thanks dblaikie for the suggestion of adding a void use of the unused variable instead.
  llvm-svn: 252751
* Remove unused variable in disable-assert builds. NFC. | Diego Novillo | 2015-11-11 | 1 | -2/+1
  llvm-svn: 252748
* [ARM] Combine BFIs together | James Molloy | 2015-11-11 | 1 | -2/+109
  If we have a chain of BFIs, we may be able to combine several together into one merged BFI. We can do this if the "from" bits from one BFI OR'd with the "from" bits from the other BFI form a contiguous range, and the same with the "to" bits.
  llvm-svn: 252740
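  For illustration only, at the C++ source level rather than in the DAG combine itself, and with made-up constants: two bitfield inserts whose source and destination bit ranges are contiguous are equivalent to one wider insert, which is the pattern the combine targets at the BFI level.

    // Two inserts covering contiguous ranges...
    unsigned insertTwice(unsigned y, unsigned x) {
      y = (y & ~0x00FFu) | (x & 0x00FFu);   // insert bits [7:0]
      y = (y & ~0xFF00u) | (x & 0xFF00u);   // insert bits [15:8]
      return y;
    }
    // ...are equivalent to one insert of the merged range [15:0]:
    unsigned insertOnce(unsigned y, unsigned x) {
      return (y & ~0xFFFFu) | (x & 0xFFFFu);
    }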
* [ARM] add overrides for isCheapToSpeculateCttz() and isCheapToSpeculateCtlz() | Sanjay Patel | 2015-11-10 | 2 | -0/+11
  ARM V6T2 has instructions for efficient count-leading/trailing-zeros, so this should be considered a cheap operation (and therefore fair game for speculation) for any ARM V6T2 implementation.
  The net result of allowing this speculation for the regression tests in this patch is that we get this code:
    ctlz:
      clz  r0, r0
      bx   lr
    cttz:
      rbit r0, r0
      clz  r0, r0
      bx   lr
  Instead of:
    ctlz:
      cmp    r0, #0
      moveq  r0, #32
      clzne  r0, r0
      bx     lr
    cttz:
      cmp    r0, #0
      moveq  r0, #32
      rbitne r0, r0
      clzne  r0, r0
      bx     lr
  This will help solve a general speculation/despeculation problem noted in PR24818:
  https://llvm.org/bugs/show_bug.cgi?id=24818
  Differential Revision: http://reviews.llvm.org/D14469
  llvm-svn: 252639
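  A rough illustration of the kind of source pattern involved, an assumption about the shape of the tests rather than code copied from the patch: the guard below exists only because ctlz/cttz of zero needs a defined answer, and once the operations are considered cheap the whole thing can be speculated into the branchless clz / rbit+clz sequences shown above.

    int ctlz(unsigned x) { return x ? __builtin_clz(x) : 32; }
    int cttz(unsigned x) { return x ? __builtin_ctz(x) : 32; }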
* Reapply "[ARM] Combine CMOV into BFI where possible"James Molloy2015-11-102-0/+113
| | | | | | | | | | | | | | | | | | Added fixes for stage2 failures: CMOV is not commutable; commuting the operands results in the condition being flipped! d'oh! Original commit message: If we have a CMOV, OR and AND combination such as: if (x & CN) y |= CM; And: * CN is a single bit; * All bits covered by CM are known zero in y; Then we can convert this to a sequence of BFI instructions. This will always be a win if CM is a single bit, will always be no worse than the TST & OR sequence if CM is two bits, and for thumb will be no worse if CM is three bits (due to the extra IT instruction). llvm-svn: 252606
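  A concrete instance of the pattern, with constants made up purely for illustration: CN is a single bit and the bit set in CM is known zero in y, so the conditional OR is really a move of one bit of x into one bit of y, which a BFI-style bitfield move can express without a compare plus conditional OR.

    unsigned setFlag(unsigned x, unsigned y) {
      // Assumption: the caller guarantees (y & 0x100) == 0 here.
      if (x & 0x4)        // CN: a single bit
        y |= 0x100;       // CM: bits known zero in y
      return y;           // conceptually: insert bit 2 of x into bit 8 of y
    }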
* [ARM] Handle t2ADDri in ARMAsmPrinter::EmitUnwindingInstruction. | Akira Hatanaka | 2015-11-10 | 1 | -0/+1
  This fixes a bug in ARMAsmPrinter::EmitUnwindingInstruction where llvm_unreachable was reached because t2ADDri wasn't handled. Test case provided by Tim Northover.
  rdar://problem/23270609
  http://reviews.llvm.org/D14518
  llvm-svn: 252557
* [EABI] Add LLVM support for -meabi flag | Renato Golin | 2015-11-09 | 3 | -8/+42
  "GCC requires the freestanding environment provide memcpy, memmove, memset and memcmp": https://gcc.gnu.org/onlinedocs/gcc-5.2.0/gcc/Standards.html
  Hence in GNUEABI targets LLVM should not convert 'memops' to their equivalent '__aeabi_memops'. This conversion violates the GCC contract.
  The -meabi flag controls whether or not LLVM will modify 'memops' in GNUEABI targets:
    Without -meabi: use the triple default EABI.
    With -meabi=default: use the triple default EABI.
    With -meabi=gnu: use 'memops'.
    With -meabi=4 or -meabi=5: use '__aeabi_memops'.
    With -meabi set to an unknown value: same as -meabi=default.
  Patch by Vinicius Tinti.
  llvm-svn: 252462
* Revert "[ARM] Combine CMOV into BFI where possible"Renato Golin2015-11-092-116/+0
| | | | | | | This reverts commit r252057, as it broke ARM self-hosting buildbots, probably due to a code-gen fault. llvm-svn: 252460
* [AsmParser] Backends can parameterize ASM tokenization.Colin LeMahieu2015-11-091-0/+7
| | | | llvm-svn: 252439
* [WinEH] Update exception pointer registers | Joseph Tremoulet | 2015-11-07 | 2 | -7/+24
  Summary: The CLR's personality routine passes these in rdx/edx, not rax/eax. Make getExceptionPointerRegister a virtual method parameterized by personality function to allow making this distinction. Similarly make getExceptionSelectorRegister a virtual method parameterized by personality function, for symmetry.
  Reviewers: pgavlin, majnemer, rnk
  Subscribers: jyknight, dsanders, llvm-commits
  Differential Revision: http://reviews.llvm.org/D14344
  llvm-svn: 252383
* [WinEH] Mark funclet entries and exits as clobbering all registers | Reid Kleckner | 2015-11-06 | 1 | -1/+1
  Summary: In this implementation, LiveIntervalAnalysis invents a few register masks on basic block boundaries that preserve no registers. The nice thing about this is that it prevents the prologue inserter from thinking it needs to spill all XMM CSRs, because it doesn't see any explicit physreg defs in the MI.
  Reviewers: MatzeB, qcolombet, JosephTremoulet, majnemer
  Subscribers: MatzeB, llvm-commits
  Differential Revision: http://reviews.llvm.org/D14407
  llvm-svn: 252318
* Remove windows line endings introduced by r252177. NFC. | Tim Northover | 2015-11-05 | 2 | -40/+40
  llvm-svn: 252217
* [DebugInfo] Fix ARM/AArch64 prologue_end position. Related to D11268. | Oleg Ranevskyy | 2015-11-05 | 2 | -37/+40
  Summary: This review is related to another review request http://reviews.llvm.org/D11268, does the same and merely fixes a couple of issues with it. D11268 is quite old and has merge conflicts against the current trunk. This request:
    - rebases D11268 onto the new trunk;
    - resolves the merge conflicts;
    - fixes the prologue_end tests, which do not pass due to the subprogram definitions not marked as distinct.
  Reviewers: echristo, rengolin, kubabrecka
  Subscribers: aemerson, rengolin, jyknight, dsanders, llvm-commits, asl
  Differential Revision: http://reviews.llvm.org/D14338
  llvm-svn: 252177
* [ARM] Compute known bits for ARMISD::CMOV | James Molloy | 2015-11-05 | 1 | -0/+10
  We can conservatively know that CMOV's known bits are the intersection of known bits for each of its operands. This helps PerformCMOVToBFICombine find more opportunities.
  I tried hard to create a testcase for this and failed - we have to sufficiently confuse DAG.computeKnownBits which can see through all the cheap tricks I tried to narrow my larger testcase down :( This code is actually exercised in CodeGen/ARM/bfi.ll, there's just no functional difference because DAG.computeKnownBits gets the right answer in that case.
  llvm-svn: 252168
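  A simplified sketch of the "intersection" idea, using the KnownZero/KnownOne representation of that era rather than the exact hook in ARMISelLowering: a bit is only known in the CMOV result if it is known, with the same value, in both operands.

    void computeKnownBitsForCMOV(const APInt &KnownZero0, const APInt &KnownOne0,
                                 const APInt &KnownZero1, const APInt &KnownOne1,
                                 APInt &KnownZero, APInt &KnownOne) {
      KnownZero = KnownZero0 & KnownZero1;   // zero in both operands -> zero in the result
      KnownOne  = KnownOne0  & KnownOne1;    // one in both operands  -> one in the result
    }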
* Go back to producing relocations for out of range symbols. | Rafael Espindola | 2015-11-05 | 1 | -6/+4
  This brings back the behavior from before r252090 for out of range symbols. Should bring some arm bots back.
  llvm-svn: 252119
* Slightly saner handling of thumb branches. | Rafael Espindola | 2015-11-04 | 1 | -9/+15
  The generic infrastructure already did a lot of work to decide if the fixup value is known or not. It doesn't make sense to reimplement a very basic case: same fragment.
  llvm-svn: 252090
* [ARM] Combine CMOV into BFI where possible | James Molloy | 2015-11-04 | 2 | -0/+106
  If we have a CMOV, OR and AND combination such as:
    if (x & CN)
      y |= CM;
  And:
    * CN is a single bit;
    * All bits covered by CM are known zero in y;
  Then we can convert this to a sequence of BFI instructions. This will always be a win if CM is a single bit, will always be no worse than the TST & OR sequence if CM is two bits, and for thumb will be no worse if CM is three bits (due to the extra IT instruction).
  llvm-svn: 252057
* WatchOS: update default CPU for triple after t2dsp -> dsp rename | Tim Northover | 2015-11-02 | 1 | -2/+2
  llvm-svn: 251814
* Recognize that ARM1176JZ[F]-S support TrustZone | Artyom Skrobov | 2015-10-29 | 2 | -1/+4
  Summary: ARMv6KZ cores were set up incorrectly in ARM.td; also, the SMI mnemonic (the old name for SMC, as defined in ARMv6KZ) wasn't supported.
  Reviewers: jmolloy, rengolin
  Subscribers: aemerson, rengolin, llvm-commits
  Differential Revision: http://reviews.llvm.org/D14154
  llvm-svn: 251627
* ARM: add support for WatchOS's compact unwind information. | Tim Northover | 2015-10-28 | 6 | -9/+240
  llvm-svn: 251573
* ARM: teach backend about WatchOS and TvOS libcalls. | Tim Northover | 2015-10-28 | 2 | -24/+48
  The most substantial changes are again for watchOS: libcalls are hard-float if needed and sincos has a different calling convention.
  llvm-svn: 251571
* ARM: add backend support for the ABI used in WatchOS | Tim Northover | 2015-10-28 | 8 | -16/+42
  At the LLVM level this ABI is essentially a minimal modification of AAPCS to support 16-byte alignment for vector types and the stack.
  llvm-svn: 251570
* ARM: support .watchos_version_min and .tvos_version_min. | Tim Northover | 2015-10-28 | 1 | -1/+9
  These MachO file directives are used by linkers and other tools to provide compatibility information, much like the existing .ios_version_min and .macosx_version_min.
  llvm-svn: 251569
* [ARM] Allow SP in rGPR, starting from ARMv8 | Artyom Skrobov | 2015-10-28 | 2 | -13/+36
  Summary: This patch handles assembly and disassembly, but not codegen, as of yet. Additionally, it fixes a bug whereby SP and PC as shifted-reg operands were treated as predictable in ARMv7 Thumb; and it enables the tests for invalid and unpredictable instructions to run on both ARMv7 and ARMv8.
  Reviewers: jmolloy, rengolin
  Subscribers: aemerson, rengolin, llvm-commits
  Differential Revision: http://reviews.llvm.org/D14141
  llvm-svn: 251516
* Remove templates from CostTableLookup functions. All instantiations had the same type. | Craig Topper | 2015-10-28 | 1 | -14/+9
  This also lets us remove the versions of the functions that took a statically sized array as we can rely on ArrayRef implicit conversion now.
  llvm-svn: 251490
* [ARM] Expand ROTL and ROTR of vector value types | Charlie Turner | 2015-10-27 | 1 | -1/+5
  Summary: After D13851 landed, we saw backend crashes when compiling the reduced test case included in this patch. The right fix seems to be to allow these vector types for expansion in instruction selection.
  Reviewers: rengolin, t.p.northover
  Subscribers: RKSimon, t.p.northover, aemerson, llvm-commits, rengolin
  Differential Revision: http://reviews.llvm.org/D14082
  llvm-svn: 251401
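  Roughly what "allow these vector types for expansion" means in TargetLowering terms; a sketch rather than the exact lines from the patch, and the set of MVTs is illustrative:

    // In the ARM target's lowering setup, mark vector rotates as Expand so
    // legalization turns them into shifts and ORs instead of crashing in isel.
    for (MVT VT : {MVT::v4i16, MVT::v8i16, MVT::v2i32, MVT::v4i32}) {
      setOperationAction(ISD::ROTL, VT, Expand);
      setOperationAction(ISD::ROTR, VT, Expand);
    }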
* Convert cost table lookup functions to return a pointer to the entry or nullptr instead of the index. | Craig Topper | 2015-10-27 | 1 | -39/+32
  This avoids mentioning the table name an extra time and allows the lookup to be done directly in the ifs by relying on the bool conversion of the pointer. While there, make use of ArrayRef and std::find_if.
  llvm-svn: 251382
* ARM: make sure VFP loads and stores are properly aligned. | Tim Northover | 2015-10-26 | 1 | -10/+12
  Both VLDRS and VLDRD fault if the memory is not 4 byte aligned, which wasn't really being checked before, leading to faults at runtime.
  llvm-svn: 251352
* ARM/ELF: Restore original (pre-r251322) logic for deciding whether to use GOT. | Peter Collingbourne | 2015-10-26 | 2 | -2/+2
  Unbreaks linking with gold, which cannot resolve direct relocations referring to global symbols.
  llvm-svn: 251342
* ARM/ELF: Better codegen for global variable addresses. | Peter Collingbourne | 2015-10-26 | 13 | -168/+68
  In PIC mode we were previously computing global variable addresses (or GOT entry addresses) by adding the PC, the PC-relative GOT displacement and the GOT-relative symbol/GOT entry displacement. Because the latter two displacements are fixed, we ended up performing one more addition than necessary.
  This change causes us to compute addresses using a single PC-relative displacement, resulting in a shorter code sequence. This reduces code size by about 4% in a recent build of Chromium for Android.
  As a result of this change we no longer need to compute the GOT base address in the ARM backend, which allows us to remove the Global Base Reg pass and SDAG lowering for the GOT.
  We also now no longer use the GOT when addressing a symbol which is known to be defined in the same linkage unit. Specifically, the symbol must have either hidden visibility or a strong definition in the current module in order to not use the GOT. This is a change from the previous behaviour where we would use the GOT to address externally visible symbols defined in the same module. I think the only cases where this could matter are cases involving symbol interposition, but we don't really support that well anyway.
  Differential Revision: http://reviews.llvm.org/D13650
  llvm-svn: 251322
* [ARM] Handle the inline asm constraint type 'o' | James Molloy | 2015-10-26 | 2 | -0/+3
  This means "memory with offset" and requires very little plumbing to get working. This fixes PR25317.
  llvm-svn: 251280