path: root/llvm/test/CodeGen/ARM
Commit message | Author | Age | Files | Lines
* [ARM] GlobalISel: Select hard G_FCMP for s32 | Diana Picus | 2017-07-07 | 2 | -0/+637
    We lower to a sequence consisting of:
    - MOVi 0 into a register
    - VCMPS to do the actual comparison and set the VFP flags
    - FMSTAT to move the flags out of the VFP unit
    - MOVCCi to either use the "zero register" that we have previously set with the MOVi, or move 1 into the result register, based on the values of the flags

    As was the case with soft-float, for some predicates (one, ueq) we actually need two comparisons instead of just one. When that happens, we generate two VCMPS-FMSTAT-MOVCCi sequences and chain them by using the result of the first MOVCCi as the "zero register" for the second one. This is a bit overkill, since one comparison followed by two non-flag-setting conditional moves should be enough. In any case, the backend manages to CSE one of the comparisons away, so it doesn't matter much.

    Note that unlike SelectionDAG and FastISel, we always use VCMPS, and not VCMPES. This makes the code a lot simpler, and it also seems correct since the LLVM Lang Ref defines simple true/false returns if the operands are QNaNs. For SNaNs, even VCMPS raises an Invalid Operation exception, so they won't be slipping through unnoticed.

    Implementation-wise, this introduces a template so we can share the same code that we use for handling integer comparisons, since the only differences are in the details (exact opcodes to be used etc).

    Hopefully this will be easy to extend to s64 G_FCMP.

    llvm-svn: 307365
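The chaining described in this commit can be modelled in a few lines of Python. This is only a sketch of the data flow, not the selector's code: the helper names are hypothetical, and the split of `one` into "greater or less" and of `ueq` into "equal or unordered" is an assumption about which two conditions are tested.

```python
import math


def vcmps_flags(a, b):
    """Model the outcome of a VCMPS + FMSTAT pair as a relation name."""
    if math.isnan(a) or math.isnan(b):
        return "unordered"
    if a < b:
        return "lt"
    if a > b:
        return "gt"
    return "eq"


def movcc(zero_reg, condition_holds):
    """Model MOVCCi: keep the incoming register, or move 1 if the condition holds."""
    return 1 if condition_holds else zero_reg


def fcmp_one(a, b):
    """Ordered-and-not-equal, built from two chained compare/move steps."""
    result = 0                                          # MOVi 0
    result = movcc(result, vcmps_flags(a, b) == "gt")   # first VCMPS-FMSTAT-MOVCCi
    result = movcc(result, vcmps_flags(a, b) == "lt")   # second one reuses `result`
    return result


def fcmp_ueq(a, b):
    """Unordered-or-equal, built the same way."""
    result = 0
    result = movcc(result, vcmps_flags(a, b) == "eq")
    result = movcc(result, vcmps_flags(a, b) == "unordered")
    return result
```

Because the second conditional move starts from the first one's result, a true outcome from either comparison survives to the end, which is why the first result can serve as the "zero register" of the second sequence.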
* RegisterScavenging: Fix PR33687 | Matthias Braun | 2017-07-07 | 1 | -0/+66
    When scavenging for a use in instruction MI, we will reload after that instruction and hence cannot spill uses/defs of this instruction.

    This fixes http://llvm.org/PR33687

    llvm-svn: 307352
* [ARM] GlobalISel: Map s32 G_FCMP in reg bank select | Diana Picus | 2017-07-06 | 1 | -0/+29
    Map hard G_FCMP operands to FPR and the result to GPR.

    llvm-svn: 307245
* [ARM] GlobalISel: Legalize G_FCMP for s32 | Diana Picus | 2017-07-06 | 1 | -0/+654
    This covers both hard and soft float.

    Hard float is easy, since it's just Legal.

    Soft float is more involved, because there are several different ways to handle it based on the predicate: one and ueq need not one, but two libcalls to get a result. Furthermore, we have large differences between the values returned by the AEABI and GNU functions.

    AEABI functions return a nice 1 or 0 representing true and false respectively. GNU functions generally return a value that needs to be compared against 0 (e.g. for ogt, the value returned by the libcall is > 0 for true). We could introduce redundant comparisons for AEABI as well, but they don't seem easy to remove afterwards, so we do different processing based on whether or not the result really needs to be compared against something (and just truncate if it doesn't).

    llvm-svn: 307243
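The two return conventions can be contrasted with a small Python model. The two helper functions below are hypothetical stand-ins for the runtime libcalls, written only to mirror the behaviour this message describes for ogt; they are not the real library routines.

```python
import math


def aeabi_style_gt(a, b):
    """Hypothetical AEABI-style helper: returns 1 for true and 0 for false."""
    return 1 if a > b else 0


def gnu_style_gt(a, b):
    """Hypothetical GNU-style helper: the result is > 0 exactly when a > b."""
    if math.isnan(a) or math.isnan(b):
        return -1
    return (a > b) - (a < b)


def ogt_via_aeabi(a, b):
    # The libcall result is already 0 or 1, so truncating to one bit is enough.
    return aeabi_style_gt(a, b) & 1


def ogt_via_gnu(a, b):
    # The libcall result still has to be compared against 0.
    return 1 if gnu_style_gt(a, b) > 0 else 0
```

Both paths produce the same boolean; only the second needs the extra comparison, which is the distinction the legalizer keeps track of.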
* [ARM] GlobalISel: Widen s1, s8, s16 G_CONSTANT | Diana Picus | 2017-07-06 | 1 | -0/+15
    Get the legalizer to widen small constants.

    llvm-svn: 307239
* [DAGCombiner] visitRotate patch to optimize pair of ROTR/ROTL instructions into one with combined shift operand. | Andrew Zhogin | 2017-07-05 | 1 | -6/+3
    For two ROTR operations with shifts C1, C2; combined shift operand will be (C1 + C2) % bitsize.

    Differential revision: https://reviews.llvm.org/D12833

    llvm-svn: 307179
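The identity behind this combine can be checked with a short Python model of a rotate-right on fixed-width integers. The function names are illustrative only and do not come from the DAGCombiner source.

```python
def rotr(value, amount, bits=32):
    """Rotate `value` right by `amount` within a `bits`-wide word."""
    mask = (1 << bits) - 1
    value &= mask
    amount %= bits
    return ((value >> amount) | (value << (bits - amount))) & mask


def rotr_twice(value, c1, c2, bits=32):
    """Two separate rotates, as written before the combine."""
    return rotr(rotr(value, c1, bits), c2, bits)


def rotr_combined(value, c1, c2, bits=32):
    """One rotate whose amount is (C1 + C2) % bitsize, as after the combine."""
    return rotr(value, (c1 + c2) % bits, bits)
```

For any word and any pair of shift amounts the two functions return the same value, including when C1 + C2 wraps past the bit width.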
* [ARM][test] Added test/CodeGen/ARM/ror.ll test. NFC precommit for D12833. | Andrew Zhogin | 2017-07-04 | 1 | -0/+36
    llvm-svn: 307103
* Remove the default ARMSubtarget from the ARM TargetMachine. | Eric Christopher | 2017-07-01 | 1 | -10/+0
    This enables us to ensure better LTO and code generation in the face of module linking.

    Remove a report_fatal_error from the TargetMachine and replace it with an assert in ARMSubtarget - and remove the test that depended on the error. The assertion will still fire in the case that we were reporting before, but error reporting needs to be in front end tools if possible for options parsing.

    llvm-svn: 306939
* Rewrite ARM execute only support to avoid the use of a command line flag and unqualified ARMSubtarget lookup. | Eric Christopher | 2017-07-01 | 4 | -15/+15
    Paired with a clang commit to use the new behavior.

    llvm-svn: 306927
* GlobalISel: add G_IMPLICIT_DEF instruction. | Tim Northover | 2017-06-30 | 1 | -5/+5
    It looks like there are two target-independent but not GISel instructions that need legalization, IMPLICIT_DEF and PHI. These are already anomalies since their operands have important LLTs attached, so to make things more uniform it seems like a good idea to add generic variants. Starting with G_IMPLICIT_DEF.

    llvm-svn: 306875
* ARM: fix big-endian 64-bit cmpxchg. | Tim Northover | 2017-06-30 | 1 | -0/+26
    On big-endian machines the high and low parts of the value accessed by ldrexd and strexd are swapped around. To account for this we swap inputs and outputs in ISelLowering.

    Patch by Bharathi Seshadri.

    llvm-svn: 306865
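The swap can be illustrated by modelling a doubleword access as two consecutive 32-bit words in memory. This is a sketch under the assumption that the first register of the pair receives the word at the lower address; the function name is hypothetical.

```python
import struct


def load_word_pair(value, big_endian):
    """Store a 64-bit value to memory, then read it back as two 32-bit words.

    Returns (word at lower address, word at higher address), standing in
    for the register pair filled by a doubleword exclusive load.
    """
    byte_order = ">" if big_endian else "<"
    memory = struct.pack(byte_order + "Q", value)
    first, second = struct.unpack(byte_order + "II", memory)
    return first, second
```

With the value 0x1122334455667788, the little-endian pair is (low half, high half) while the big-endian pair is (high half, low half), so code that assumes the little-endian order has to exchange the two registers on big-endian targets.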
* Unified logic for computing target ABI in backend and front end by moving this common code to Support/TargetParser. | Eric Christopher | 2017-06-30 | 6 | -7/+7
    Modeled Triple::GNU after front end code (aapcs abi) and updated tests that expect apcs abi.

    Based heavily on a patch by Ana Pazos!

    llvm-svn: 306768
* [NFC] Use stdin for some tests instead of positional argument. | Nikolai Bozhenov | 2017-06-29 | 2 | -3/+3
    Summary: Otherwise unexpected matches with the path to the tests might happen.

    Reviewers: rengolin, spatel, efriedma, RKSimon

    Reviewed By: spatel

    Subscribers: n.bozhenov, javed.absar, llvm-commits

    Patch by Andrei Elovikov <andrei.elovikov@intel.com>

    Differential Revision: https://reviews.llvm.org/D32994

    llvm-svn: 306684
* [ARM] Add tGPRwithpc register class and use it for TBB/TBH | Florian Hahn | 2017-06-29 | 1 | -2/+2
    Summary: TBB and TBH allow using a Thumb GPR or the PC as destination operand. A few machine verifier failures were due to those instructions not expecting PC as destination operand.

    Add -verify-machineinstrs to test/CodeGen/ARM/jump-table-tbh.ll to add test coverage even if expensive checks are disabled.

    Reviewers: MatzeB, t.p.northover, jmolloy

    Reviewed By: MatzeB

    Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits

    Differential Revision: https://reviews.llvm.org/D34610

    llvm-svn: 306654
* [ARM] Make -mcpu=generic schedule for an in-order core (Cortex-A8). | Kristof Beyls | 2017-06-28 | 45 | -407/+402
    The benchmarking summarized in http://lists.llvm.org/pipermail/llvm-dev/2017-May/113525.html showed this is beneficial for a wide range of cores.

    As is to be expected, quite a few small adaptations are needed to the regression tests, as the difference in scheduling results in:
    - Quite a few small instruction schedule differences.
    - A few changes in register allocation decisions caused by different instruction schedules.
    - A few changes in IfConversion decisions, due to a difference in instruction schedule and/or the estimated cost of a branch mispredict.

    llvm-svn: 306514
* [ARM] GlobalISel: Support G_SELECT for pointers | Diana Picus | 2017-06-27 | 3 | -3/+76
    All we need to do is mark it as legal, otherwise it's just like s32.

    llvm-svn: 306390
* [ARM] GlobalISel: Support G_SELECT for i32 | Diana Picus | 2017-06-27 | 4 | -0/+106
    * Mark as legal for (s32, i1, s32, s32)
    * Map everything into GPRs
    * Select to two instructions: a CMP of the condition against 0, to set the flags, and a MOVCCr to select between the two inputs based on the flags that we've just set

    llvm-svn: 306382
* GlobalISel: convert buildSequence to use non-deprecated instructions. | Tim Northover | 2017-06-23 | 1 | -2/+5
    G_SEQUENCE is going away soon so as a first step the MachineIRBuilder needs to be taught how to emulate it with alternatives. We use G_MERGE_VALUES where possible, and a sequence of G_INSERTs if not.

    llvm-svn: 306119
* Don't conditionalize Neon instructions, even in IT blocks. | Kristof Beyls | 2017-06-22 | 2 | -6/+5
    This has been deprecated since ARMARM v7-AR, release C.b, published back in 2012.

    This also removes test/CodeGen/Thumb2/ifcvt-neon.ll that originally was introduced to check that conditionalization of Neon instructions did happen when generating Thumb2. However, the test had evolved and was no longer testing that. Rather than trying to adapt that test, this commit introduces test/CodeGen/Thumb2/ifcvt-neon-deprecated.mir, since we can now use the MIR framework to write nicer/more maintainable tests.

    llvm-svn: 305998
* [ARM] Add macro fusion for AES instructions. | Florian Hahn | 2017-06-22 | 1 | -0/+203
    Summary: This patch adds a macro fusion using CodeGen/MacroFusion.cpp to pair AES instructions back to back and adds FeatureFuseAES to enable the feature.

    Reviewers: evandro, javed.absar, rengolin, t.p.northover

    Reviewed By: javed.absar

    Subscribers: aemerson, mgorny, kristof.beyls, llvm-commits

    Differential Revision: https://reviews.llvm.org/D34142

    llvm-svn: 305988
* [XRay] Reduce synthetic references emitted by XRay | Dean Michael Berris | 2017-06-21 | 2 | -10/+8
    Summary: When we're building with XRay instrumentation, we use a trick that preserves references from the function to a function sled index. This index table lives in a separate section, and without this trick the linker is free to garbage-collect this section and all the segments it refers to. Until we're able to tell the linkers to preserve these sections, we use this reference trick to keep around both the index and the entries in the instrumentation map.

    Before this change we emitted both a synthetic reference to the label in the instrumentation map, and to the entry in the function map index. This change removes the first synthetic reference and only emits one synthetic reference to the index -- the index entry has the references to the labels in the instrumentation map, so the linker will still preserve those if the function itself is preserved.

    This reduces the size of the synthetic references we emit from 16 bytes to just 8 bytes on x86_64, and similarly on other platforms.

    Reviewers: dblaikie

    Subscribers: javed.absar, kpw, pelikan, llvm-commits

    Differential Revision: https://reviews.llvm.org/D34340

    llvm-svn: 305880
* DAG: correctly legalize UMULO. | Tim Northover | 2017-06-20 | 1 | -0/+16
    We were incorrectly sign extending into the high word (as you would for SMULO) when legalizing UMULO in terms of a wider full multiplication.

    Patch by James Duley.

    llvm-svn: 305800
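Why the kind of extension matters can be seen in a small Python model of an unsigned 32-bit multiply-with-overflow computed through a 64-bit product. This is an illustrative model only, not the legalizer's code; the function and its `sign_extend_inputs` switch are hypothetical.

```python
def umulo32_wide(a, b, sign_extend_inputs=False):
    """Unsigned 32-bit multiply with overflow flag, via a 64-bit product.

    With sign_extend_inputs=True the operands are widened the way SMULO
    would need, which gives wrong overflow results for unsigned inputs.
    """
    def widen(x):
        if sign_extend_inputs and x & 0x80000000:
            return x | 0xFFFFFFFF00000000   # sign extension into the high word
        return x                            # zero extension

    product = (widen(a) * widen(b)) & 0xFFFFFFFFFFFFFFFF
    low = product & 0xFFFFFFFF
    overflow = (product >> 32) != 0         # any bit in the high word means overflow
    return low, overflow
```

For example, 0x80000000 * 1 fits in 32 bits, so no overflow should be reported; with sign extension the high word becomes all ones and the model wrongly reports overflow.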
* [ARM] Support constant pools in data when generating execute-only code. | Alexandros Lamprineas | 2017-06-20 | 1 | -0/+50
    Resubmission of r305387, which was reverted at r305390. The Address Sanitizer caught a stack-use-after-scope of a Twine variable. This is now fixed by passing the Twine directly as a function parameter.

    The ARM backend asserts against constant pool lowering when it generates execute-only code in order to prevent the generation of constant pools in the text section. It appears that target independent optimizations might generate DAG nodes that represent constant pools. By lowering such nodes as global addresses we don't violate the semantics of execute-only code and also it is guaranteed that execute-only behaves correctly with the position-independent addressing modes that support execute-only code.

    Differential Revision: https://reviews.llvm.org/D33773

    llvm-svn: 305776
* [ARM] GlobalISel: Support G_ICMP for s8 and s16 | Diana Picus | 2017-06-19 | 2 | -3/+74
    Widen to s32 (like all other binary ops).

    llvm-svn: 305683
* [ARM] GlobalISel: Support G_ICMP for i32 and pointers | Diana Picus | 2017-06-19 | 4 | -0/+457
    Add support throughout the pipeline:
    - mark as legal for s32 and pointers
    - map to GPRs
    - lower to a sequence of instructions, which moves 0 or 1 into the result register based on the flags set by a CMPrr

    We have copied from FastISel a helper function which maps CmpInst predicates into ARMCC codes. Ideally, we should be able to move it somewhere that both FastISel and GlobalISel can use.

    llvm-svn: 305672
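The kind of mapping such a helper performs can be written down as a table. The sketch below pairs LLVM integer comparison predicates with the standard ARM condition-code mnemonics; the dictionary and the evaluation function are illustrative and are not copied from the FastISel helper.

```python
# Integer compare predicate -> ARM condition code mnemonic.
ICMP_TO_ARMCC = {
    "eq": "EQ", "ne": "NE",
    "ugt": "HI", "uge": "HS", "ult": "LO", "ule": "LS",
    "sgt": "GT", "sge": "GE", "slt": "LT", "sle": "LE",
}


def to_signed32(x):
    """Reinterpret a 32-bit pattern as a signed integer."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def icmp(pred, lhs, rhs):
    """Return 1 or 0, like the value moved into the result register."""
    u_lhs, u_rhs = lhs & 0xFFFFFFFF, rhs & 0xFFFFFFFF
    s_lhs, s_rhs = to_signed32(lhs), to_signed32(rhs)
    outcomes = {
        "EQ": u_lhs == u_rhs, "NE": u_lhs != u_rhs,
        "HI": u_lhs > u_rhs, "HS": u_lhs >= u_rhs,
        "LO": u_lhs < u_rhs, "LS": u_lhs <= u_rhs,
        "GT": s_lhs > s_rhs, "GE": s_lhs >= s_rhs,
        "LT": s_lhs < s_rhs, "LE": s_lhs <= s_rhs,
    }
    return 1 if outcomes[ICMP_TO_ARMCC[pred]] else 0
```

The unsigned and signed predicates diverge only when the top bit is set: 0xFFFFFFFF is greater than 1 as an unsigned value (HI) but less than 1 as a signed value (LT).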
* RegScavenging: Add scavengeRegisterBackwards() | Matthias Braun | 2017-06-17 | 3 | -8/+10
    Re-apply r276044/r279124/r305516. Fixed a problem where we would refuse to place spills as the very first instruction of a basic block and thus artificially increase pressure (test in test/CodeGen/PowerPC/scavenging.mir:spill_at_begin).

    This is a variant of scavengeRegister() that works for enterBasicBlockEnd()/backward(). The benefit of the backward mode is that it is not affected by incomplete kill flags.

    This patch also changes PrologEpilogInserter::doScavengeFrameVirtualRegs() to use the register scavenger in backwards mode.

    Differential Revision: http://reviews.llvm.org/D21885

    llvm-svn: 305625
* Revert "RegScavenging: Add scavengeRegisterBackwards()" | Matthias Braun | 2017-06-16 | 3 | -10/+8
    Revert because of reports of some PPC input starting to spill when it was predicted that it wouldn't and no spillslot was reserved.

    This reverts commit r305516.

    llvm-svn: 305566
* RegScavenging: Add scavengeRegisterBackwards() | Matthias Braun | 2017-06-15 | 3 | -8/+10
    Re-apply r276044/r279124. Trying to reproduce or disprove the ppc64 problems reported in the stage2 build last time, which I cannot reproduce right now.

    This is a variant of scavengeRegister() that works for enterBasicBlockEnd()/backward(). The benefit of the backward mode is that it is not affected by incomplete kill flags.

    This patch also changes PrologEpilogInserter::doScavengeFrameVirtualRegs() to use the register scavenger in backwards mode.

    Differential Revision: http://reviews.llvm.org/D21885

    llvm-svn: 305516
* ISel: Fix FastISel of swifterror values | Arnold Schwaighofer | 2017-06-15 | 1 | -0/+28
    The code assumed that we process instructions in basic block order. FastISel processes instructions in reverse basic block order. We need to pre-assign virtual registers before selecting, otherwise we get def-use relationships wrong.

    This only affects code with swifterror registers.

    rdar://32659327

    llvm-svn: 305484
* [ARM] GlobalISel: Add support for i32 modulo | Diana Picus | 2017-06-15 | 2 | -0/+96
    Add support for modulo for targets that have hardware division and for those that don't. When hardware division is not available, we have to choose the correct libcall to use. This is generally straightforward, except for AEABI.

    The AEABI variant is trickier than the other libcalls because it returns { quotient, remainder }, instead of just one value like the other libcalls that we've seen so far. Therefore, we need to use custom lowering for it. However, we don't want to have too much special code, so we refactor the target-independent code in the legalizer by adding a helper for replacing an instruction with a libcall. This helper is used by the legalizer itself when dealing with simple calls, and also by the custom ARM legalization for the more complicated AEABI divmod calls.

    llvm-svn: 305459
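The shape of the AEABI case can be sketched in Python: a single helper returns both quotient and remainder, and the modulo result is simply the second element. The helper below is a hypothetical model with C-style truncating division, not the runtime routine itself.

```python
def divmod_pair_model(numerator, denominator):
    """Model of a divmod helper returning (quotient, remainder).

    Division truncates toward zero, so the remainder takes the sign of
    the numerator, unlike Python's built-in divmod.
    """
    quotient = abs(numerator) // abs(denominator)
    if (numerator < 0) != (denominator < 0):
        quotient = -quotient
    remainder = numerator - quotient * denominator
    return quotient, remainder


def srem_via_divmod(numerator, denominator):
    """Signed modulo obtained by discarding the quotient of the pair."""
    _, remainder = divmod_pair_model(numerator, denominator)
    return remainder
```

A libcall that returns a single value can be substituted for the instruction directly, whereas this one needs an extra step to pick the remainder out of the returned pair, which is why it gets custom lowering.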
* [ARM] GlobalISel: Lower only homogeneous struct args | Diana Picus | 2017-06-15 | 2 | -158/+45
    Lowering mixed struct args, params and returns used G_INSERT, which is a bit more convoluted to support through the entire pipeline. Since they don't occur that often in practice, it's probably wiser to leave them out until later.

    Meanwhile, we can lower homogeneous structs using G_MERGE_VALUES, which has good support in the legalizer. These occur e.g. as the return of __aeabi_idivmod, so it's nice to be able to support them.

    llvm-svn: 305458
* Revert "[ARM] Support constant pools in data when generating execute-only code." | Alexandros Lamprineas | 2017-06-14 | 1 | -50/+0
    This reverts commit 3a204faa093c681a1e96c5e0622f50649b761ee0.

    I've upset a buildbot which runs the address sanitizer:
    ERROR: AddressSanitizer: stack-use-after-scope lib/Target/ARM/ARMISelLowering.cpp:2690
    That Twine variable is used illegally.

    llvm-svn: 305390
* [ARM] Support constant pools in data when generating execute-only code. | Alexandros Lamprineas | 2017-06-14 | 1 | -0/+50
    The ARM backend asserts against constant pool lowering when it generates execute-only code in order to prevent the generation of constant pools in the text section. It appears that target independent optimizations might generate DAG nodes that represent constant pools. By lowering such nodes as global addresses we don't violate the semantics of execute-only code and also it is guaranteed that execute-only behaves correctly with the position-independent addressing modes that support execute-only code.

    Differential Revision: https://reviews.llvm.org/D33773

    llvm-svn: 305387
* Align definition of DW_OP_plus with DWARF spec [3/3] | Florian Hahn | 2017-06-14 | 1 | -3/+3
    Summary: This patch is part of 3 patches that together form a single patch, but must be introduced in stages in order not to break things.

    The way that LLVM interprets DW_OP_plus in DIExpression nodes is basically that of the DW_OP_plus_uconst operator since LLVM expects an unsigned constant operand. This unnecessarily restricts the DW_OP_plus operator, preventing it from being used to describe the evaluation of runtime values on the expression stack. These patches try to align the semantics of DW_OP_plus and DW_OP_minus with that of the DWARF definition, which pops two elements off the expression stack, performs the operation and pushes the result back on the stack.

    This is done in three stages:
    • The first patch (LLVM) adds support for DW_OP_plus_uconst.
    • The second patch (Clang) changes all its uses from DW_OP_plus to DW_OP_plus_uconst.
    • The third patch (LLVM) changes the semantics of DW_OP_plus and DW_OP_minus to be in line with its DWARF meaning. This patch includes the bitcode upgrade from legacy DIExpressions.

    Patch by Sander de Smalen.

    Reviewers: echristo, pcc, aprantl

    Reviewed By: aprantl

    Subscribers: fhahn, javed.absar, aprantl, llvm-commits

    Differential Revision: https://reviews.llvm.org/D33894

    llvm-svn: 305386
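The difference between the two operators can be shown with a tiny expression-stack evaluator in Python. It is a sketch that follows the DWARF-style semantics described in this message; the list-of-tokens encoding and the `evaluate` function are assumptions made for illustration and do not reflect how DIExpression is stored.

```python
def evaluate(expression, initial_value):
    """Evaluate a small DWARF-like expression on a value stack."""
    stack = [initial_value]
    i = 0
    while i < len(expression):
        op = expression[i]
        if op == "DW_OP_plus_uconst":
            # Takes an inline unsigned constant operand.
            stack.append(stack.pop() + expression[i + 1])
            i += 2
        elif op == "DW_OP_constu":
            # Pushes an unsigned constant onto the stack.
            stack.append(expression[i + 1])
            i += 2
        elif op == "DW_OP_plus":
            # Pops two entries and pushes their sum; no inline operand.
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(lhs + rhs)
            i += 1
        elif op == "DW_OP_minus":
            # Pops two entries; subtracts the former top from the second entry.
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(lhs - rhs)
            i += 1
        else:
            raise ValueError("unsupported operator: %r" % (op,))
    return stack[-1]
```

Under these semantics `["DW_OP_plus_uconst", 8]` and `["DW_OP_constu", 8, "DW_OP_plus"]` compute the same address offset, but only the second form can also add a value that was produced at run time by earlier operations.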
* [ARM] Add scheduling classes for VFNM[AS] | Oliver Stannard | 2017-06-13 | 1 | -0/+38
    The VFNM[AS] instructions did not have scheduling information attached, which was causing assertion failures with the Cortex-A57 scheduling model and -fp-contract=fast, because the Cortex-A57 sched model claims to be complete.

    Differential Revision: https://reviews.llvm.org/D34139

    llvm-svn: 305288
* [SelectionDAG] Allow sin/cos -> sincos optimization on GNU triples w/ just -fno-math-errno | Geoff Berry | 2017-06-12 | 1 | -16/+51
    Summary: This change enables the sin(x) cos(x) -> sincos(x) optimization on GNU target triples. This optimization was being inhibited when -ffast-math wasn't set because sincos in GLibC does not set errno, while sin and cos do. However, this optimization will only run if the attributes on the sin/cos calls include readnone, which is how clang represents the fact that it doesn't care about the errno values set by these functions (via the -fno-math-errno flag).

    Reviewers: hfinkel, bogner

    Subscribers: mcrosier, javed.absar, llvm-commits, paul.redmond

    Differential Revision: https://reviews.llvm.org/D32921

    llvm-svn: 305204
* [ARM] Add scheduling info for VFMS | Oliver Stannard | 2017-06-09 | 1 | -5/+86
    The scalar VFMS instructions did not have scheduling information attached (but VFMA did), which was causing assertion failures with the Cortex-A57 scheduling model and -fp-contract=fast.

    Differential Revision: https://reviews.llvm.org/D34040

    llvm-svn: 305064
* [ARM] GlobalISel: Add more tests. NFC | Diana Picus | 2017-06-08 | 1 | -0/+149
    Add a couple of tests to increase coverage for the TableGen'erated code, in particular for rules where 2 generic instructions may be combined into a single machine instruction.

    llvm-svn: 304971
* [ARM] GlobalISel: Purge G_SEQUENCE | Diana Picus | 2017-06-07 | 5 | -64/+46
    According to the commit message from r296921, G_MERGE_VALUES and G_INSERT are to be preferred over G_SEQUENCE. Therefore, stop generating G_SEQUENCE in the ARM backend and remove the code dealing with it.

    This boils down to the code breaking up double values for the soft float calling convention. Use G_MERGE_VALUES + G_UNMERGE_VALUES instead of G_SEQUENCE + G_EXTRACT for it. This maps very nicely to VMOVDRR + VMOVRRD and simplifies the code in the instruction selector.

    There's one occurrence of G_SEQUENCE left in arm-irtranslator.ll, but that is part of the target-independent code for translating constant structs. Therefore, it is beyond the scope of this commit.

    llvm-svn: 304902
* [ARM] GlobalISel: Support G_XOR | Diana Picus | 2017-06-07 | 4 | -0/+168
    Same as the other binary operators:
    - legalize to 32 bits
    - map to GPRs
    - select to EORrr via TableGen'erated code

    llvm-svn: 304898
* [ARM] GlobalISel: Support G_OR | Diana Picus | 2017-06-07 | 4 | -0/+168
    Same as the other binary operators:
    - legalize to 32 bits
    - map to GPRs
    - select ORRrr thanks to TableGen'erated code

    llvm-svn: 304890
* [ARM] GlobalISel: Support G_AND | Diana Picus | 2017-06-07 | 4 | -0/+170
    This is identical to the support for the other binary operators:
    - widen to s32
    - map into GPR
    - select ANDrr (via TableGen'erated code)

    llvm-svn: 304885
* [Improve CodeGen Testing] | Vivek Pandya | 2017-06-06 | 5 | -113/+113
    This patch re-enables MIRPrinter printing of fields whose value is equal to the default. If the -simplify-mir option is passed, MIRPrinter will not print such fields. This change also required some lit test cases in the CodeGen directory to be changed.

    Reviewed By: MatzeB

    Differential Revision: https://reviews.llvm.org/D32304

    llvm-svn: 304779
* [ARM] GlobalISel: Constrain callee register on indirect calls | Diana Picus | 2017-06-05 | 1 | -4/+6
    When lowering calls, we generate instructions with machine opcodes rather than generic ones. Therefore, we need to constrain the register classes of the operands.

    Also enable the machine verifier on the arm-irtranslator.ll test, since that would've caught this issue.

    Fixes (part of) PR32146.

    llvm-svn: 304712
* Add support for #pragma clang section | Javed Absar | 2017-06-05 | 1 | -0/+140
    This patch provides a means to specify section-names for global variables, functions and static variables, using #pragma directives. This feature is only defined to work sensibly for ELF targets.

    One can specify section names as:
        #pragma clang section bss="myBSS" data="myData" rodata="myRodata" text="myText"

    One can "unspecify" a section name with empty string e.g.
        #pragma clang section bss="" data="" text="" rodata=""

    Reviewers: Roger Ferrer, Jonathan Roelofs, Reid Kleckner

    Differential Revision: https://reviews.llvm.org/D33413

    llvm-svn: 304704
* [GlobalMerge] Don't merge globals that may be preempted | John Brawn | 2017-06-02 | 1 | -0/+1
    When a global may be preempted it needs to be accessed directly, instead of indirectly through a MergedGlobals symbol, for the preemption to work.

    This fixes PR33136.

    Differential Revision: https://reviews.llvm.org/D33727

    llvm-svn: 304537
* [ARM] GlobalISel: Support struct params/returns | Diana Picus | 2017-06-02 | 2 | -4/+71
    Very very similar to the support for arrays. As with arrays, we don't support returning large structs that wouldn't fit in R0-R3. Most front-ends would likely use sret arguments for that anyway.

    The only significant difference is that when splitting a struct, we need to make sure we set the correct original alignment on each member, otherwise it may get split incorrectly between stack and registers.

    llvm-svn: 304536
* [ARM] Cortex-A57 scheduling model for ARM backend (AArch32) | Javed Absar | 2017-06-02 | 11 | -0/+487
    This patch implements the Cortex-A57 scheduling model. The main code is in ARMScheduleA57.td and ARMScheduleA57WriteRes.td. There are small changes in .cpp and .h files to support the required scheduling predicates.

    Scheduling model implemented according to: http://infocenter.arm.com/help/topic/com.arm.doc.uan0015b/Cortex_A57_Software_Optimization_Guide_external.pdf.

    Patch by: Andrew Zhogin (submitted on his behalf, as requested).

    Reviewed by: Renato Golin, Diana Picus, Javed Absar, Kristof Beyls.

    Differential Revision: https://reviews.llvm.org/D28152

    llvm-svn: 304530
* ARM: Fix cmpxchg O0 expansion | Matthias Braun | 2017-05-31 | 1 | -3/+6
    This is the equivalent of r304048 for ARM:
    - Rewrite livein calculation to use the computeLiveIns() helper function. This is slightly less efficient but easier to reason about and doesn't unnecessarily add pristine and reserved registers[1]
    - Zero the status register at the beginning of the loop to make sure it has a defined value.
    - Remove kill flags of values that need to stay alive throughout the loop.

    [1] An upcoming commit of mine will tighten the MachineVerifier to catch these.

    llvm-svn: 304267
* MIR: remove explicit "noVRegs" property. | Tim Northover | 2017-05-30 | 1 | -2/+0
    We can infer this from the incoming MIR, so there's no reason to represent it with a special flag.

    llvm-svn: 304246