path: root/llvm/lib/Target/ARM
Commit message | Author | Age | Files | Lines
* [ARM] GlobalISel: Fixup r307365 | Diana Picus | 2017-07-07 | 1 | -11/+10
    Rename member DebugLoc -> DbgLoc (so it doesn't conflict with the class name).
    llvm-svn: 307366
* [ARM] GlobalISel: Select hard G_FCMP for s32 | Diana Picus | 2017-07-07 | 1 | -63/+234
    We lower to a sequence consisting of:
    - MOVi 0 into a register
    - VCMPS to do the actual comparison and set the VFP flags
    - FMSTAT to move the flags out of the VFP unit
    - MOVCCi to either use the "zero register" that we have previously set with the MOVi, or move 1 into the result register, based on the values of the flags

    As was the case with soft-float, for some predicates (one, ueq) we actually need two comparisons instead of just one. When that happens, we generate two VCMPS-FMSTAT-MOVCCi sequences and chain them by using the result of the first MOVCCi as the "zero register" for the second one. This is a bit overkill, since one comparison followed by two non-flag-setting conditional moves should be enough. In any case, the backend manages to CSE one of the comparisons away, so it doesn't matter much.

    Note that unlike SelectionDAG and FastISel, we always use VCMPS, and not VCMPES. This makes the code a lot simpler, and it also seems correct since the LLVM Lang Ref defines simple true/false returns if the operands are QNaNs. For SNaNs, even VCMPS throws an Invalid Operation exception, so they won't be slipping through unnoticed.

    Implementation-wise, this introduces a template so we can share the same code that we use for handling integer comparisons, since the only differences are in the details (exact opcodes to be used etc). Hopefully this will be easy to extend to s64 G_FCMP.
    llvm-svn: 307365
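The chaining described in this commit can be pictured with a small Python model. It is only a sketch of the idea, not LLVM code: the flag outcomes are abstracted to four strings, and the `PREDICATE_GROUPS` table and the function names are made up for illustration.

```python
import math

def compare(a: float, b: float) -> str:
    """Abstract stand-in for VCMPS + FMSTAT: report how a relates to b."""
    if math.isnan(a) or math.isnan(b):
        return "unordered"
    if a < b:
        return "lt"
    if a > b:
        return "gt"
    return "eq"

# Hypothetical table: for each predicate, one set of accepted outcomes per
# VCMPS-FMSTAT-MOVCCi sequence. "one" and "ueq" need two sequences.
PREDICATE_GROUPS = {
    "oeq": [{"eq"}],
    "ogt": [{"gt"}],
    "olt": [{"lt"}],
    "uno": [{"unordered"}],
    "one": [{"gt"}, {"lt"}],
    "ueq": [{"eq"}, {"unordered"}],
}

def select_fcmp(pred: str, a: float, b: float) -> int:
    result = 0  # MOVi 0: the initial "zero register"
    for accepted in PREDICATE_GROUPS[pred]:
        outcome = compare(a, b)       # VCMPS + FMSTAT
        zero_reg = result             # the previous result feeds this MOVCCi
        result = 1 if outcome in accepted else zero_reg  # MOVCCi
    return result
```

Because each conditional move falls back to the previous result rather than to a fresh zero, a predicate is true as soon as either of its two comparisons succeeds.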
* [ARM] GlobalISel: Map s32 G_FCMP in reg bank select | Diana Picus | 2017-07-06 | 1 | -0/+14
    Map hard G_FCMP operands to FPR and the result to GPR.
    llvm-svn: 307245
* [ARM] GlobalISel: Legalize G_FCMP for s32 | Diana Picus | 2017-07-06 | 2 | -0/+162
    This covers both hard and soft float.

    Hard float is easy, since it's just Legal.

    Soft float is more involved, because there are several different ways to handle it based on the predicate: one and ueq need not only one, but two libcalls to get a result. Furthermore, we have large differences between the values returned by the AEABI and GNU functions. AEABI functions return a nice 1 or 0 representing true and false, respectively. GNU functions generally return a value that needs to be compared against 0 (e.g. for ogt, the value returned by the libcall is > 0 for true). We could introduce redundant comparisons for AEABI as well, but they don't seem easy to remove afterwards, so we do different processing based on whether or not the result really needs to be compared against something (and just truncate if it doesn't).
    llvm-svn: 307243
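As a rough illustration of the two result conventions, the Python sketch below models the `ogt` case only. The two stand-in functions imitate the return conventions of `__aeabi_fcmpgt` and `__gtsf2`; the `finish_compare` helper and the table layout are hypothetical and are not taken from the legalizer.

```python
import math

def aeabi_fcmpgt(a: float, b: float) -> int:
    """Stand-in for __aeabi_fcmpgt: returns 1 for true, 0 for false."""
    return 1 if a > b else 0

def gtsf2(a: float, b: float) -> int:
    """Stand-in for __gtsf2: the result is > 0 exactly when a > b."""
    if math.isnan(a) or math.isnan(b):
        return -1
    return (a > b) - (a < b)

# Hypothetical table: predicate -> (libcall, comparison to apply to its result).
# None means the result is already 0/1 and only needs truncating.
AEABI_LIBCALLS = {"ogt": (aeabi_fcmpgt, None)}
GNU_LIBCALLS = {"ogt": (gtsf2, ">")}

def finish_compare(raw: int, compare_with_zero) -> int:
    if compare_with_zero is None:
        return raw & 1            # AEABI: just truncate to one bit
    if compare_with_zero == ">":
        return int(raw > 0)       # GNU: extra comparison against 0
    raise ValueError(f"unsupported comparison {compare_with_zero!r}")

def soft_fcmp(pred: str, a: float, b: float, use_aeabi: bool) -> int:
    libcall, cmp = (AEABI_LIBCALLS if use_aeabi else GNU_LIBCALLS)[pred]
    return finish_compare(libcall(a, b), cmp)
```

Both paths give the same boolean; the only difference is whether an extra comparison against 0 is emitted after the call.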
* [ARM] GlobalISel: Widen s1, s8, s16 G_CONSTANT | Diana Picus | 2017-07-06 | 1 | -0/+2
    Get the legalizer to widen small constants.
    llvm-svn: 307239
* [GlobalISel] Refactor Legalizer helpers for libcalls | Diana Picus | 2017-07-05 | 1 | -4/+9
    We used to have a helper that replaced an instruction with a libcall. That turns out to be too aggressive, since sometimes we need to replace the instruction with at least two libcalls. Therefore, change our existing helper to only create the libcall and leave the instruction removal as a separate step. Also rename the helper accordingly.
    llvm-svn: 307149
* [AsmParser] Mnemonic Spell Corrector | Sjoerd Meijer | 2017-07-05 | 1 | -2/+8
    This implements suggesting other mnemonics when an invalid one is specified, for example:

    $ echo "adXd r1,r2,#3" | llvm-mc -triple arm
    <stdin>:1:1: error: invalid instruction, did you mean: add, qadd?
    adXd r1,r2,#3
    ^

    The implementation is target agnostic, but as a first step I have added it only to the ARM backend; so the ARM backend is a good example if someone wants to enable this too for another target.

    Differential Revision: https://reviews.llvm.org/D33128
    llvm-svn: 307148
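One way such suggestions could be produced is by ranking known mnemonics by edit distance. The Python sketch below uses an insert/delete-only distance and a threshold of 2; both choices, the function names, and the short mnemonic list are assumptions made for illustration and are not necessarily what the LLVM implementation does.

```python
def indel_distance(a: str, b: str) -> int:
    """Edit distance that allows insertions and deletions but no substitutions.

    Computed as len(a) + len(b) - 2 * LCS(a, b), where LCS is the length of
    the longest common subsequence.
    """
    prev = [0] * (len(b) + 1)
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, start=1):
            if ca == cb:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return len(a) + len(b) - 2 * prev[-1]

def suggest(mnemonic: str, known: list[str], max_distance: int = 2) -> list[str]:
    """Return the known mnemonics within max_distance of the given one."""
    lowered = mnemonic.lower()
    return [m for m in known if indel_distance(lowered, m) <= max_distance]
```

With a hypothetical mnemonic list `["add", "and", "mov", "qadd", "sub"]`, `suggest("adXd", ...)` keeps `add` (distance 1) and `qadd` (distance 2) and drops the rest.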
* [ARM] GlobalISel: Extract tiny helper. NFC | Diana Picus | 2017-07-05 | 1 | -2/+5
    Extract functionality for determining if the target uses AEABI.
    llvm-svn: 307145
* fix trivial typos in comments; NFC | Hiroshi Inoue | 2017-07-04 | 1 | -1/+1
    llvm-svn: 307094
* [globalisel][tablegen] Partially fix compile-time regressions by converting matcher to state-machine(s) | Daniel Sanders | 2017-07-04 | 1 | -0/+2
    Summary: Replace the matcher if-statements for each rule with a state-machine. This significantly reduces compile time, memory allocations, and cumulative memory allocation when compiling AArch64InstructionSelector.cpp.o after r303259 is recommitted.

    The following patches will expand on this further to fully fix the regressions.

    Reviewers: rovka, ab, t.p.northover, qcolombet, aditya_nandakumar
    Reviewed By: ab
    Subscribers: vitalybuka, aemerson, javed.absar, igorb, llvm-commits, kristof.beyls
    Differential Revision: https://reviews.llvm.org/D33758
    llvm-svn: 307079
* Remove the default ARMSubtarget from the ARM TargetMachine. | Eric Christopher | 2017-07-01 | 3 | -11/+20
    This enables us to ensure better LTO and code generation in the face of module linking.

    Remove a report_fatal_error from the TargetMachine and replace it with an assert in ARMSubtarget - and remove the test that depended on the error. The assertion will still fire in the case that we were reporting before, but error reporting needs to be in front end tools if possible for options parsing.
    llvm-svn: 306939
* Rewrite ARM execute only support to avoid the use of a command line flag and unqualified ARMSubtarget lookup. | Eric Christopher | 2017-07-01 | 4 | -29/+21
    Paired with a clang commit to use the new behavior.
    llvm-svn: 306927
* [ARM] Move GISel accessor initialization from TargetMachine to Subtarget. | Quentin Colombet | 2017-07-01 | 2 | -54/+63
    NFC
    llvm-svn: 306920
* Rename and adjust processFixupValue. | Rafael Espindola | 2017-06-30 | 2 | -12/+11
    It was not processing any value. All that it ever did was force relocations, so name it shouldForceRelocation.
    llvm-svn: 306906
* ARM: fix big-endian 64-bit cmpxchg. | Tim Northover | 2017-06-30 | 1 | -4/+11
    On big-endian machines the high and low parts of the value accessed by ldrexd and strexd are swapped around. To account for this we swap inputs and outputs in ISelLowering.

    Patch by Bharathi Seshadri.
    llvm-svn: 306865
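The swap can be illustrated with a tiny Python model of how a 64-bit value is split across the register pair used by ldrexd/strexd. This is a sketch under the assumption that the first register of the pair corresponds to the lower memory address; the helper names are invented for illustration.

```python
MASK32 = 0xFFFFFFFF

def split_for_pair(value: int, big_endian: bool) -> tuple[int, int]:
    """Split a 64-bit value into (first_reg, second_reg) for a register pair.

    Assumption: the first register maps to the lower address, which holds
    the low half on little-endian and the high half on big-endian.
    """
    lo = value & MASK32
    hi = (value >> 32) & MASK32
    return (hi, lo) if big_endian else (lo, hi)

def join_from_pair(first: int, second: int, big_endian: bool) -> int:
    """Inverse of split_for_pair."""
    lo, hi = (second, first) if big_endian else (first, second)
    return (hi << 32) | lo
```

Applying the same endianness to both the split and the join round-trips the value, which is the property the swapped inputs and outputs are meant to restore.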
* [GlobalISel] Make multi-step legalization work. | Kristof Beyls | 2017-06-30 | 1 | -38/+1
    In r301116, a custom lowering needed to be introduced to be able to legalize 8 and 16-bit divisions on ARM targets without a division instruction, since 2-step legalization (WidenScalar from 8 bit to 32 bit, then Libcall the 32-bit division) doesn't work.

    This fixes this and makes this kind of multi-step legalization, where first the size of the type needs to be changed and then some action is needed that doesn't require changing the size of the type, straightforward to specify.

    Differential Revision: https://reviews.llvm.org/D32529
    llvm-svn: 306806
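The two-step flow (widen first, then apply a size-preserving action) can be sketched as a loop over a rule table. The Python below is only a toy model: the `RULES` table, its contents, and the `legalize` function are hypothetical and do not mirror the real legalizer API.

```python
# Hypothetical rule table: (opcode, bit width) -> (action, new bit width).
# Anything not listed is treated as already Legal.
RULES = {
    ("G_SDIV", 8): ("WidenScalar", 32),
    ("G_SDIV", 16): ("WidenScalar", 32),
    ("G_SDIV", 32): ("Libcall", None),
}

def legalize(opcode: str, bits: int) -> list[tuple[str, int]]:
    """Return the sequence of (action, width it was applied at)."""
    steps = []
    while True:
        action, new_bits = RULES.get((opcode, bits), ("Legal", None))
        steps.append((action, bits))
        if action != "WidenScalar":
            return steps  # Libcall/Legal do not change the type size
        bits = new_bits   # re-query the table at the widened size
```

For an 8-bit division this yields a widening step at 8 bits followed by a libcall at 32 bits, with no custom lowering needed in the table.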
* Unified logic for computing target ABI in backend and front end by moving this common code to Support/TargetParser. | Eric Christopher | 2017-06-30 | 1 | -50/+10
    Modeled Triple::GNU after front end code (aapcs abi) and updated tests that expect apcs abi.

    Based heavily on a patch by Ana Pazos!
    llvm-svn: 306768
* [llvm-objdump] Handle invalid instruction gracefully on ARM | Eugene Leviant | 2017-06-29 | 1 | -1/+1
    Differential revision: https://reviews.llvm.org/D34813
    llvm-svn: 306687
* [ARM] Add tGPRwithpc register class and use it for TBB/TBH | Florian Hahn | 2017-06-29 | 2 | -4/+8
    Summary: TBB and TBH allow using a Thumb GPR or the PC as destination operand. A few machine verifier failures were due to those instructions not expecting PC as destination operand.

    Add -verify-machineinstrs to test/CodeGen/ARM/jump-table-tbh.ll to add test coverage even if expensive checks are disabled.

    Reviewers: MatzeB, t.p.northover, jmolloy
    Reviewed By: MatzeB
    Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits
    Differential Revision: https://reviews.llvm.org/D34610
    llvm-svn: 306654
* Don't repeat names and reformat. NFC. | Rafael Espindola | 2017-06-28 | 1 | -46/+37
    llvm-svn: 306556
* [ARM] Improve if-conversion for M-class CPUs without branch predictors | John Brawn | 2017-06-28 | 5 | -14/+85
    The current heuristic in isProfitableToIfCvt assumes we have a branch predictor, and so gives the wrong answer in some cases when we don't. This patch adds a subtarget feature to indicate that a subtarget has no branch predictor, and changes the heuristic in isProfitableToIfCvt when it's present. This gives a slight overall improvement in a set of embedded benchmarks on Cortex-M4 and Cortex-M33.

    Differential Revision: https://reviews.llvm.org/D34398
    llvm-svn: 306547
* [ARM] Make -mcpu=generic schedule for an in-order core (Cortex-A8). | Kristof Beyls | 2017-06-28 | 1 | -1/+1
    The benchmarking summarized in http://lists.llvm.org/pipermail/llvm-dev/2017-May/113525.html showed this is beneficial for a wide range of cores.

    As is to be expected, quite a few small adaptations are needed to the regression tests, as the difference in scheduling results in:
    - Quite a few small instruction schedule differences.
    - A few changes in register allocation decisions caused by different instruction schedules.
    - A few changes in IfConversion decisions, due to a difference in instruction schedule and/or the estimated cost of a branch mispredict.
    llvm-svn: 306514
* [ARM] GlobalISel: Support G_SELECT for pointers | Diana Picus | 2017-06-27 | 1 | -0/+1
    All we need to do is mark it as legal, otherwise it's just like s32.
    llvm-svn: 306390
* [ARM] GlobalISel: Support G_SELECT for i32 | Diana Picus | 2017-06-27 | 3 | -0/+65
    * Mark as legal for (s32, i1, s32, s32)
    * Map everything into GPRs
    * Select to two instructions: a CMP of the condition against 0, to set the flags, and a MOVCCr to select between the two inputs based on the flags that we've just set
    llvm-svn: 306382
* Simplify the processFixupValue interface. NFC. | Rafael Espindola | 2017-06-24 | 2 | -8/+3
    llvm-svn: 306202
* Remove redundant argument. | Rafael Espindola | 2017-06-24 | 2 | -2/+3
    llvm-svn: 306189
* ARM: move some logic from processFixupValue to applyFixup. | Rafael Espindola | 2017-06-23 | 2 | -23/+35
    processFixupValue is called on every relaxation iteration. applyFixup is only called once at the very end. applyFixup is then the correct place to do last minute changes and value checks.

    While here, do proper range checks again for fixup_arm_thumb_bl. We used to do it, but dropped it because of thumb2. We now do it again, but use the thumb2 range.
    llvm-svn: 306177
* COFF: Produce an error on invalid pcrel relocs. | Rafael Espindola | 2017-06-23 | 1 | -3/+4
    X86_64 COFF only has support for 32 bit pcrel relocations. Produce an error on all others.

    Note that GNU as has extended the relocation values to support this. It is not clear if we should support the GNU extension.
    llvm-svn: 306082
* [ARM] Create relocations for beq.w branches to ARM function syms. | Florian Hahn | 2017-06-22 | 1 | -0/+1
    Summary: The ARM ELF ABI requires the linker to do interworking for wide conditional branches from Thumb code to ARM code. That was pointed out by @peter.smith in the comments for D33436.

    Reviewers: rafael, peter.smith, echristo
    Reviewed By: peter.smith
    Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits, peter.smith
    Differential Revision: https://reviews.llvm.org/D34447
    llvm-svn: 306009
* Don't conditionalize Neon instructions, even in IT blocks. | Kristof Beyls | 2017-06-22 | 1 | -3/+5
    This has been deprecated since ARMARM v7-AR, release C.b, published back in 2012.

    This also removes test/CodeGen/Thumb2/ifcvt-neon.ll that originally was introduced to check that conditionalization of Neon instructions did happen when generating Thumb2. However, the test had evolved and was no longer testing that. Rather than trying to adapt that test, this commit introduces test/CodeGen/Thumb2/ifcvt-neon-deprecated.mir, since we can now use the MIR framework to write nicer/more maintainable tests.
    llvm-svn: 305998
* [ARM] Add .w aliases of MOV with shifted operand | John Brawn | 2017-06-22 | 2 | -2/+14
    These appear to have been simply missing.

    Differential Revision: https://reviews.llvm.org/D34461
    llvm-svn: 305993
* [ARM] Clean up choice of narrow instructions in ARMAsmParser, NFC | John Brawn | 2017-06-22 | 1 | -33/+27
    This patch makes a couple of changes to how we decide whether to use the narrow or wide encoding of thumb2 instructions:
    * Factor out the detection of the .w qualifier
    * Check for the CPSR operand in a consistent way

    Differential Revision: https://reviews.llvm.org/D34460
    llvm-svn: 305992
* [ARM] Add macro fusion for AES instructions. | Florian Hahn | 2017-06-22 | 6 | -1/+99
    Summary: This patch adds a macro fusion using CodeGen/MacroFusion.cpp to pair AES instructions back to back and adds FeatureFuseAES to enable the feature.

    Reviewers: evandro, javed.absar, rengolin, t.p.northover
    Reviewed By: javed.absar
    Subscribers: aemerson, mgorny, kristof.beyls, llvm-commits
    Differential Revision: https://reviews.llvm.org/D34142
    llvm-svn: 305988
* Use a MutableArrayRef. NFC. | Rafael Espindola | 2017-06-21 | 2 | -5/+5
    llvm-svn: 305968
* [ARM] Support constant pools in data when generating execute-only code. | Alexandros Lamprineas | 2017-06-20 | 3 | -15/+44
    Resubmission of r305387, which was reverted at r305390. The Address Sanitizer caught a stack-use-after-scope of a Twine variable. This is now fixed by passing the Twine directly as a function parameter.

    The ARM backend asserts against constant pool lowering when it generates execute-only code in order to prevent the generation of constant pools in the text section. It appears that target independent optimizations might generate DAG nodes that represent constant pools. By lowering such nodes as global addresses we don't violate the semantics of execute-only code, and it is also guaranteed that execute-only behaves correctly with the position-independent addressing modes that support execute-only code.

    Differential Revision: https://reviews.llvm.org/D33773
    llvm-svn: 305776
* [ARM] GlobalISel: Support G_ICMP for s8 and s16 | Diana Picus | 2017-06-19 | 1 | -0/+2
    Widen to s32 (like all other binary ops).
    llvm-svn: 305683
* [ARM] GlobalISel: Support G_ICMP for i32 and pointers | Diana Picus | 2017-06-19 | 3 | -0/+119
    Add support throughout the pipeline:
    - mark as legal for s32 and pointers
    - map to GPRs
    - lower to a sequence of instructions, which moves 0 or 1 into the result register based on the flags set by a CMPrr

    We have copied from FastISel a helper function which maps CmpInst predicates into ARMCC codes. Ideally, we should be able to move it somewhere that both FastISel and GlobalISel can use.
    llvm-svn: 305672
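To make the predicate-to-condition-code mapping concrete, here is a small Python model. It is an illustrative sketch, not the helper copied from FastISel: the table uses lower-case IR predicate names as keys, and `cmp_flags`, `condition_holds`, and `icmp` are invented names. The flag and condition definitions follow the standard ARM NZCV semantics for a 32-bit CMP.

```python
MASK32 = 0xFFFFFFFF

# Integer compare predicate -> ARM condition code.
ICMP_TO_ARMCC = {
    "eq": "EQ", "ne": "NE",
    "ugt": "HI", "uge": "HS", "ult": "LO", "ule": "LS",
    "sgt": "GT", "sge": "GE", "slt": "LT", "sle": "LE",
}

def cmp_flags(a: int, b: int) -> dict[str, bool]:
    """Model the NZCV flags set by a 32-bit CMP a, b (computes a - b)."""
    a &= MASK32
    b &= MASK32
    diff = (a - b) & MASK32
    sign_a, sign_b, sign_r = a >> 31, b >> 31, diff >> 31
    return {
        "N": bool(sign_r),
        "Z": diff == 0,
        "C": a >= b,                                   # no borrow
        "V": sign_a != sign_b and sign_r != sign_a,    # signed overflow
    }

def condition_holds(cc: str, f: dict[str, bool]) -> bool:
    return {
        "EQ": f["Z"], "NE": not f["Z"],
        "HS": f["C"], "LO": not f["C"],
        "HI": f["C"] and not f["Z"], "LS": (not f["C"]) or f["Z"],
        "GE": f["N"] == f["V"], "LT": f["N"] != f["V"],
        "GT": (not f["Z"]) and f["N"] == f["V"],
        "LE": f["Z"] or f["N"] != f["V"],
    }[cc]

def icmp(pred: str, a: int, b: int) -> int:
    result = 0                          # start with 0 in the result register
    flags = cmp_flags(a, b)             # CMPrr
    if condition_holds(ICMP_TO_ARMCC[pred], flags):
        result = 1                      # conditional move of 1
    return result
```

The same bit patterns compare differently under signed and unsigned predicates: `0xFFFFFFFF` is the largest unsigned value but -1 when read as signed.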
* [ARM] GlobalISel: Add support for i32 modulo | Diana Picus | 2017-06-15 | 1 | -0/+45
    Add support for modulo for targets that have hardware division and for those that don't. When hardware division is not available, we have to choose the correct libcall to use. This is generally straightforward, except for AEABI.

    The AEABI variant is trickier than the other libcalls because it returns { quotient, remainder }, instead of just one value like the other libcalls that we've seen so far. Therefore, we need to use custom lowering for it. However, we don't want to have too much special code, so we refactor the target-independent code in the legalizer by adding a helper for replacing an instruction with a libcall. This helper is used by the legalizer itself when dealing with simple calls, and also by the custom ARM legalization for the more complicated AEABI divmod calls.
    llvm-svn: 305459
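The difference between the two libcall shapes can be sketched in Python. The functions below are stand-ins that imitate the return conventions of `__aeabi_idivmod` (a quotient/remainder pair) and `__modsi3` (the remainder only); `lower_srem` is a hypothetical name for the choice between them, and none of this is the actual legalizer code.

```python
def aeabi_idivmod(n: int, d: int) -> tuple[int, int]:
    """Stand-in for __aeabi_idivmod: returns (quotient, remainder).

    Uses C-style division that truncates toward zero, unlike Python's //.
    """
    q = abs(n) // abs(d)
    if (n < 0) != (d < 0):
        q = -q
    return q, n - q * d

def modsi3(n: int, d: int) -> int:
    """Stand-in for __modsi3: returns only the remainder."""
    return aeabi_idivmod(n, d)[1]

def lower_srem(n: int, d: int, use_aeabi: bool) -> int:
    if use_aeabi:
        # Custom path: call the divmod routine, keep the second result only.
        _quotient, remainder = aeabi_idivmod(n, d)
        return remainder
    # Simple path: the libcall's single return value is the answer.
    return modsi3(n, d)
```

The extra unpacking step in the AEABI branch is what makes that case need custom handling, while the other libcalls can be dropped in directly.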
* [ARM] GlobalISel: Lower only homogeneous struct args | Diana Picus | 2017-06-15 | 1 | -31/+24
    Lowering mixed struct args, params and returns used G_INSERT, which is a bit more convoluted to support through the entire pipeline. Since they don't occur that often in practice, it's probably wiser to leave them out until later.

    Meanwhile, we can lower homogeneous structs using G_MERGE_VALUES, which has good support in the legalizer. These occur e.g. as the return of __aeabi_idivmod, so it's nice to be able to support them.
    llvm-svn: 305458
* Revert "[ARM] Support constant pools in data when generating execute-only code." | Alexandros Lamprineas | 2017-06-14 | 3 | -43/+15
    This reverts commit 3a204faa093c681a1e96c5e0622f50649b761ee0.

    I've upset a buildbot which runs the address sanitizer:
    ERROR: AddressSanitizer: stack-use-after-scope lib/Target/ARM/ARMISelLowering.cpp:2690
    That Twine variable is used illegally.
    llvm-svn: 305390
* [ARM] Support constant pools in data when generating execute-only code. | Alexandros Lamprineas | 2017-06-14 | 3 | -15/+43
    The ARM backend asserts against constant pool lowering when it generates execute-only code in order to prevent the generation of constant pools in the text section. It appears that target independent optimizations might generate DAG nodes that represent constant pools. By lowering such nodes as global addresses we don't violate the semantics of execute-only code, and it is also guaranteed that execute-only behaves correctly with the position-independent addressing modes that support execute-only code.

    Differential Revision: https://reviews.llvm.org/D33773
    llvm-svn: 305387
* [ARM] Add scheduling classes for VFNM[AS] | Oliver Stannard | 2017-06-13 | 1 | -6/+12
    The VFNM[AS] instructions did not have scheduling information attached, which was causing assertion failures with the Cortex-A57 scheduling model and -fp-contract=fast, because the Cortex-A57 sched model claims to be complete.

    Differential Revision: https://reviews.llvm.org/D34139
    llvm-svn: 305288
* Const correctness for TTI::getRegisterBitWidth | Daniel Neilson | 2017-06-12 | 1 | -1/+1
    Summary: The method TargetTransformInfo::getRegisterBitWidth() is declared const, but the type erasing implementation classes (TargetTransformInfo::Concept & TargetTransformInfo::Model) that were introduced by Chandler in https://reviews.llvm.org/D7293 do not have the method declared const. This is an NFC to tidy up the const consistency between TTI and its implementation.

    Reviewers: chandlerc, rnk, reames
    Reviewed By: reames
    Subscribers: reames, jfb, arsenm, dschuff, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, llvm-commits
    Differential Revision: https://reviews.llvm.org/D33903
    llvm-svn: 305189
* [ARM] Custom machine-scheduler. NFCI. | Javed Absar | 2017-06-09 | 1 | -0/+15
    This patch creates a customised machine-scheduler for ARM targets, so that subsequently DAG mutations etc can be added.

    Reviewed by: hahn, rengolin, rovka.
    Differential Revision: https://reviews.llvm.org/D34039
    llvm-svn: 305078
* [ARM] Add scheduling info for VFMS | Oliver Stannard | 2017-06-09 | 1 | -3/+6
    The scalar VFMS instructions did not have scheduling information attached (but VFMA did), which was causing assertion failures with the Cortex-A57 scheduling model and -fp-contract=fast.

    Differential Revision: https://reviews.llvm.org/D34040
    llvm-svn: 305064
* [ARM] Use FixupKind variable in processFixupValue (cleanup, NFC). | Florian Hahn | 2017-06-07 | 1 | -10/+10
    llvm-svn: 304905
* [ARM] GlobalISel: Purge G_SEQUENCE | Diana Picus | 2017-06-07 | 3 | -53/+52
    According to the commit message from r296921, G_MERGE_VALUES and G_INSERT are to be preferred over G_SEQUENCE. Therefore, stop generating G_SEQUENCE in the ARM backend and remove the code dealing with it.

    This boils down to the code breaking up double values for the soft float calling convention. Use G_MERGE_VALUES + G_UNMERGE_VALUES instead of G_SEQUENCE + G_EXTRACT for it. This maps very nicely to VMOVDRR + VMOVRRD and simplifies the code in the instruction selector.

    There's one occurrence of G_SEQUENCE left in arm-irtranslator.ll, but that is part of the target-independent code for translating constant structs. Therefore, it is beyond the scope of this commit.
    llvm-svn: 304902
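Breaking a double into two 32-bit halves and putting it back together can be shown with a short Python sketch. It illustrates the data movement only, loosely analogous to an unmerge followed by a merge; the function names are hypothetical and the (low, high) ordering is an assumption made for the example.

```python
import struct

def unmerge_double(value: float) -> tuple[int, int]:
    """Split a 64-bit double into (low 32 bits, high 32 bits)."""
    bits = struct.unpack("<Q", struct.pack("<d", value))[0]
    return bits & 0xFFFFFFFF, bits >> 32

def merge_double(lo: int, hi: int) -> float:
    """Rebuild a double from its low and high 32-bit halves."""
    return struct.unpack("<d", struct.pack("<Q", (hi << 32) | lo))[0]
```

Round-tripping any finite value through `unmerge_double` and `merge_double` returns the original double, since the two halves together carry all 64 bits.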
* [ARM] GlobalISel: Support G_XOR | Diana Picus | 2017-06-07 | 2 | -1/+2
    Same as the other binary operators:
    - legalize to 32 bits
    - map to GPRs
    - select to EORrr via TableGen'erated code
    llvm-svn: 304898
* [ARM] GlobalISel: Support G_OR | Diana Picus | 2017-06-07 | 2 | -1/+2
    Same as the other binary operators:
    - legalize to 32 bits
    - map to GPRs
    - select ORRrr thanks to TableGen'erated code
    llvm-svn: 304890
* [ARM] GlobalISel: Support G_AND | Diana Picus | 2017-06-07 | 2 | -1/+2
    This is identical to the support for the other binary operators:
    - widen to s32
    - map into GPR
    - select ANDrr (via TableGen'erated code)
    llvm-svn: 304885