summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/ARM
Commit message (Collapse)AuthorAgeFilesLines
* [ARM] GlobalISel: Allow i8 and i16 addsDiana Picus2016-12-191-1/+6
| | | | | | | | | Teach the instruction selector and legalizer that it's ok to have adds with 8 or 16-bit integers. This is the second part of https://reviews.llvm.org/D27704 llvm-svn: 290105
* [ARM] GlobalISel: Select i8 and i16 copiesDiana Picus2016-12-191-2/+9
| | | | | | | | | Teach the instruction selector that it's ok to copy small values from physical registers. First part of https://reviews.llvm.org/D27704 llvm-svn: 290104
* [ARM] GlobalISel: Lower more than 4 argumentsDiana Picus2016-12-191-10/+22
| | | | | | | | | | This adds support for lowering more than 4 arguments (although still i32 only). It uses the handleAssignments / ValueHandler infrastructure extracted from the AArch64 backend in r288658. Differential Revision: https://reviews.llvm.org/D27195 llvm-svn: 290098
* [ARM] GlobalISel: Support loading from the stackDiana Picus2016-12-193-10/+45
| | | | | | | | | | Add support for selecting simple G_LOAD and G_FRAME_INDEX instructions (32-bit scalars only). This will be useful for functions that need to pass arguments on the stack. First part of https://reviews.llvm.org/D27195. llvm-svn: 290096
* [ARM] Add ARMISD::VLD1DUP to match vld1_dup more consistently.Eli Friedman2016-12-163-19/+93
| | | | | | | | | | | | | | | | | | | | Currently, there are substantial problems forming vld1_dup even if the VDUP survives legalization. The lack of an actual node leads to terrible results: not only can we not form post-increment vld1_dup instructions, but we form scalar pre-increment and post-increment loads which force the loaded value into a GPR. This patch fixes that by combining the vdup+load into an ARMISD node before DAGCombine messes it up. Also includes a crash fix for vld2_dup (see testcase @vld2dupi8_postinc_variable). Recommiting with fix to avoid forming vld1dup if the type of the load doesn't match the type of the vdup (see https://llvm.org/bugs/show_bug.cgi?id=31404). Differential Revision: https://reviews.llvm.org/D27694 llvm-svn: 289972
* [GlobalISel] Silence unused variable warnings in Release builds.Benjamin Kramer2016-12-161-5/+4
| | | | llvm-svn: 289941
* [ARM] GlobalISel: Select add i32, i32Diana Picus2016-12-167-9/+299
| | | | | | | | | | | | | Add the minimal support necessary to select a function that returns the sum of two i32 values. This includes some support for argument/return lowering of i32 values through registers, as well as the handling of copy and add instructions throughout the GlobalISel pipeline. Differential Revision: https://reviews.llvm.org/D26677 llvm-svn: 289940
* [ARM] Expose methods to get the CCAssignFn. NFCIDiana Picus2016-12-162-17/+21
| | | | | | | | | | | Add two public methods to ARMTargetLowering: CCAssignFnForCall and CCAssignFnForReturn, which are just calling the already existing private method CCAssignFnForNode. These will come in handy for GlobalISel on ARM. We also replace all calls to CCAssignFnForNode in ARMISelLowering.cpp, because the new methods are friendlier to the reader. llvm-svn: 289932
* Revert 279703, it caused PR31404.Nico Weber2016-12-163-92/+19
| | | | llvm-svn: 289923
* [GlobalISel] Drop workaround for Legalizer member/class sharing a name. NFC.Ahmed Bougacha2016-12-151-1/+1
| | | | | | | | MachineLegalizer used to be the name of both the class and the member, causing GCC errors. r276522 fixed that by renaming the member to just 'Legalizer'. The 'class' workaround isn't necessary anymore; drop it. llvm-svn: 289848
* [Thumb] Teach ISel how to lower compares of AND bitmasks efficientlySjoerd Meijer2016-12-152-4/+141
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is essentially a recommit of r285893, but with a correctness fix. The problem of the original commit was that this: bic r5, r7, #31 cbz r5, .LBB2_10 got rewritten into: lsrs r5, r7, #5 beq .LBB2_10 The result in destination register r5 is not the same and this is incorrect when r5 is not dead. So this fix includes checking the uses of the AND destination register. And also, compared to the original commit, some regression tests didn't need changing anymore because of this extra check. For completeness, this was the original commit message: For the common pattern (CMPZ (AND x, #bitmask), #0), we can do some more efficient instruction selection if the bitmask is one consecutive sequence of set bits (32 - clz(bm) - ctz(bm) == popcount(bm)). 1) If the bitmask touches the LSB, then we can remove all the upper bits and set the flags by doing one LSLS. 2) If the bitmask touches the MSB, then we can remove all the lower bits and set the flags with one LSRS. 3) If the bitmask has popcount == 1 (only one set bit), we can shift that bit into the sign bit with one LSLS and change the condition query from NE/EQ to MI/PL (we could also implement this by shifting into the carry bit and branching on BCC/BCS). 4) Otherwise, we can emit a sequence of LSLS+LSRS to remove the upper and lower zero bits of the mask. 1-3 require only one 16-bit instruction and can elide the CMP. 4 requires two 16-bit instructions but can elide the CMP and doesn't require materializing a complex immediate, so is also a win. Differential Revision: https://reviews.llvm.org/D27761 llvm-svn: 289794
* Fix for build warning in execute-only supportPrakhar Bahuguna2016-12-151-2/+2
| | | | llvm-svn: 289788
* [ARM] Implement execute-only support in CodeGenPrakhar Bahuguna2016-12-1513-17/+113
| | | | | | | | | | | | | | | | | | | | This implements execute-only support for ARM code generation, which prevents the compiler from generating data accesses to code sections. The following changes are involved: * Add the CodeGen option "-arm-execute-only" to the ARM code generator. * Add the clang flag "-mexecute-only" as well as the GCC-compatible alias "-mpure-code" to enable this option. * When enabled, literal pools are replaced with MOVW/MOVT instructions, with VMOV used in addition for floating-point literals. As the MOVT instruction is required, execute-only support is only available in Thumb mode for targets supporting ARMv8-M baseline or Thumb2. * Jump tables are placed in data sections when in execute-only mode. * The execute-only text section is assigned section ID 0, and is marked as unreadable with the SHF_ARM_PURECODE flag with symbol 'y'. This also overrides selection of ELF sections for globals. llvm-svn: 289784
* [ARM] Split 128-bit vectors in BUILD_VECTOR loweringEli Friedman2016-12-141-0/+21
| | | | | | | | | | | | | Given that INSERT_VECTOR_ELT operates on D registers anyway, combining 64-bit vectors into a 128-bit vector is basically free. Therefore, try to split BUILD_VECTOR nodes before giving up and lowering them to a series of INSERT_VECTOR_ELT instructions. Sometimes this allows dramatically better lowerings; see testcases for examples. Inspired by similar code in the x86 backend for AVX. Differential Revision: https://reviews.llvm.org/D27624 llvm-svn: 289706
* [ARM] Add ARMISD::VLD1DUP to match vld1_dup more consistently.Eli Friedman2016-12-143-19/+92
| | | | | | | | | | | | | | | | Currently, there are substantial problems forming vld1_dup even if the VDUP survives legalization. The lack of an actual node leads to terrible results: not only can we not form post-increment vld1_dup instructions, but we form scalar pre-increment and post-increment loads which force the loaded value into a GPR. This patch fixes that by combining the vdup+load into an ARMISD node before DAGCombine messes it up. Also includes a crash fix for vld2_dup (see testcase @vld2dupi8_postinc_variable). Differential Revision: https://reviews.llvm.org/D27694 llvm-svn: 289703
* Replace APFloatBase static fltSemantics data members with getter functionsStephan Bergmann2016-12-142-2/+2
| | | | | | | | | | | | | At least the plugin used by the LibreOffice build (<https://wiki.documentfoundation.org/Development/Clang_plugins>) indirectly uses those members (through inline functions in LLVM/Clang include files in turn using them), but they are not exported by utils/extract_symbols.py on Windows, and accessing data across DLL/EXE boundaries on Windows is generally problematic. Differential Revision: https://reviews.llvm.org/D26671 llvm-svn: 289647
* Add support for Samsung Exynos M3 (NFC)Evandro Menezes2016-12-131-0/+6
| | | | llvm-svn: 289613
* Generalize strided store pattern in interleave access passAlina Sbirlea2016-12-131-6/+36
| | | | | | | | | | | | | | | | | Summary: This patch aims to generalize matching of the strided store accesses to more general masks. The more general rule is to have consecutive accesses based on the stride: [x, y, ... z, x+1, y+1, ...z+1, x+2, y+2, ...z+2, ...] All elements in the masks need not form a contiguous space, there may be gaps. As before, undefs are allowed and filled in with adjacent element loads. Reviewers: HaoLiu, mssimpso Subscribers: mkuper, delena, llvm-commits Differential Revision: https://reviews.llvm.org/D23646 llvm-svn: 289573
* LivePhysReg: Use reference instead of pointer in init(); NFCMatthias Braun2016-12-081-1/+1
| | | | llvm-svn: 289002
* [ARM] Better error message for invalid flag-preserving Thumb1 instsOliver Stannard2016-12-061-1/+4
| | | | | | | | | | When we see a non flag-setting instruction for which only the flag-setting version is available in Thumb1, we should give a better error message than "invalid instruction". Differential Revision: https://reviews.llvm.org/D27414 llvm-svn: 288805
* IR: Change the gep_type_iterator API to avoid always exposing the "current" ↵Peter Collingbourne2016-12-021-1/+1
| | | | | | | | | | | | | type. Instead, expose whether the current type is an array or a struct, if an array what the upper bound is, and if a struct the struct type itself. This is in preparation for a later change which will make PointerType derive from Type rather than SequentialType. Differential Revision: https://reviews.llvm.org/D26594 llvm-svn: 288458
* [ARM] Fix for 64-bit CAS expansion on ARM32 with -O0Oleg Ranevskyy2016-12-011-7/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch fixes comparison of 64-bit atomic with its expected value in CMP_SWAP_64 expansion. Currently, the low words are compared with CMP, while the high words are compared with SBC. SBC expects the carry flag to be set if CMP detects a difference. CMP might leave the carry unset for unequal arguments though if the first one is >= than the second. This might cause the comparison logic to detect false equality. Example of the broken C++ code: ``` std::atomic<long long> at(2); long long ll = 1; std::atomic_compare_exchange_strong(&at, &ll, 3); ``` Even though the atomic `at` and the expected value `ll` are not equal and `atomic_compare_exchange_strong` returns `false`, `at` is changed to 3. The patch replaces SBC with CMPEQ. Reviewers: t.p.northover Subscribers: aemerson, rengolin, llvm-commits, asl Differential Revision: https://reviews.llvm.org/D27315 llvm-svn: 288433
* Move most EH from MachineModuleInfo to MachineFunctionMatthias Braun2016-12-011-3/+2
| | | | | | | | | | | | | | | | | | | | | | | Recommitting r288293 with some extra fixes for GlobalISel code. Most of the exception handling members in MachineModuleInfo is actually per function data (talks about the "current function") so it is better to keep it at the function instead of the module. This is a necessary step to have machine module passes work properly. Also: - Rename TidyLandingPads() to tidyLandingPads() - Use doxygen member groups instead of "//===- EH ---"... so it is clear where a group ends. - I had to add an ugly const_cast at two places in the AsmPrinter because the available MachineFunction pointers are const, but the code wants to call tidyLandingPads() in between (markFunctionEnd()/endFunction()). Differential Revision: https://reviews.llvm.org/D27227 llvm-svn: 288405
* Temporarily Revert "Move most EH from MachineModuleInfo to MachineFunction"Eric Christopher2016-12-011-2/+3
| | | | | | | | | This apprears to have broken the global isel bot: http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-globalisel_build/5174/console This reverts commit r288293. llvm-svn: 288322
* Move most EH from MachineModuleInfo to MachineFunctionMatthias Braun2016-11-301-3/+2
| | | | | | | | | | | | | | | | | | | | | Most of the exception handling members in MachineModuleInfo is actually per function data (talks about the "current function") so it is better to keep it at the function instead of the module. This is a necessary step to have machine module passes work properly. Also: - Rename TidyLandingPads() to tidyLandingPads() - Use doxygen member groups instead of "//===- EH ---"... so it is clear where a group ends. - I had to add an ugly const_cast at two places in the AsmPrinter because the available MachineFunction pointers are const, but the code wants to call tidyLandingPads() in between (markFunctionEnd()/endFunction()). Differential Revision: https://reviews.llvm.org/D27227 llvm-svn: 288293
* Move FrameInstructions from MachineModuleInfo to MachineFunctionMatthias Braun2016-11-302-28/+28
| | | | | | | | | | | This is per function data so it is better kept at the function instead of the module. This is a necessary step to have machine module passes work properly. Differential Revision: https://reviews.llvm.org/D27185 llvm-svn: 288291
* Clarify rules for reserved regs, fix aarch64 ones.Matthias Braun2016-11-301-9/+11
| | | | | | | | | No test case necessary as the problematic condition is checked with the newly introduced assertAllSuperRegsMarked() function. Differential Revision: https://reviews.llvm.org/D26648 llvm-svn: 288277
* [xray] Add XRay support for Mach-O in CodeGenKuba Mracek2016-11-231-17/+27
| | | | | | | | Currently, XRay only supports emitting the XRay table (xray_instr_map) on ELF binaries. Let's add Mach-O support. Differential Revision: https://reviews.llvm.org/D26983 llvm-svn: 287734
* CodeGen: simplify TargetMachine::getSymbol interface. NFC.Tim Northover2016-11-221-1/+1
| | | | | | | | | No-one actually had a mangler handy when calling this function, and getSymbol itself went most of the way towards getting its own mangler (with a local TLOF variable) so forcing all callers to supply one was just extra complication. llvm-svn: 287645
* [ARM] Relax restriction on variadic functions for tailcall optimizationPablo Barrio2016-11-171-5/+0
| | | | | | | | | | | | | | Summary: Variadic functions can be treated in the same way as normal functions with respect to the number and types of parameters. Reviewers: grosbach, olista01, t.p.northover, rengolin Subscribers: javed.absar, aemerson, llvm-commits Differential Revision: https://reviews.llvm.org/D26748 llvm-svn: 287219
* ARM: fix CodeGen for 64-bit shifts.Tim Northover2016-11-161-17/+31
| | | | | | | | | One half of the shifts obviously needed conditional selection based on whether the shift amount is more than 32-bits, but leaving the other half as the natural shift isn't acceptable either: it's undefined behaviour to shift a 32-bit value by more than 31. llvm-svn: 287149
* GlobalISel: remove unused variable to silence warning.Tim Northover2016-11-152-2/+1
| | | | llvm-svn: 287027
* [ARM] GlobalISel: Remove unused members. NFCIDiana Picus2016-11-153-8/+4
| | | | | | This silences some warnings that I didn't see with my host compiler. llvm-svn: 286981
* [ARM] Make sure GlobalISel is only initialized once. NFCIDiana Picus2016-11-151-12/+12
| | | | | | | | | Move some code inside the proper 'if' block to make sure it is only run once, when the subtarget is first created. Things can still break if we use different ARM target machines or if we have functions with different 'target-cpu' or 'target-features', we should fix that too in the future. llvm-svn: 286974
* [ARM] Add machine scheduler for Cortex-R52 Javed Absar2016-11-153-1/+985
| | | | | | | | | | | | | This patch adds the Sched Machine Model for Cortex-R52. Details of the pipeline and descriptions are in comments in file ARMScheduleR52.td included in this patch. Reviewers: rengolin, jmolloy Differential Revision: https://reviews.llvm.org/D26500 llvm-svn: 286949
* ARM: try to fix GCC 4.8 compilation again after r286881.Tim Northover2016-11-141-1/+2
| | | | llvm-svn: 286882
* Recommit: ARM: sort register lists by encoding in push/pop instructions.Tim Northover2016-11-143-2/+28
| | | | | | | | | | | | | | | | | For example we were producing push {r8, r10, r11, r4, r5, r7, lr} This is misleading (r4, r5 and r7 are actually pushed before the rest), and other components (stack folding recently) often forget to deal with the extra complexity coming from the different order, leading to miscompiles. Finally, we warn about our own code in -no-integrated-as mode without this, which is really not a good idea. Fixed usage of std::sort so that we (hopefully) use instantiations that actually exist in GCC 4.8. llvm-svn: 286881
* Revert "ARM: sort register lists by encoding in push/pop instructions."Tim Northover2016-11-143-28/+2
| | | | | | | This reverts commit 286866. It broke a bot, something to do with exactly which templates std::sort accepts. llvm-svn: 286867
* ARM: sort register lists by encoding in push/pop instructions.Tim Northover2016-11-143-2/+28
| | | | | | | | | | | | | | For example we were producing push {r8, r10, r11, r4, r5, r7, lr} This is misleading (r4, r5 and r7 are actually pushed before the rest), and other components (stack folding recently) often forget to deal with the extra complexity coming from the different order, leading to miscompiles. Finally, we warn about our own code in -no-integrated-as mode without this, which is really not a good idea. llvm-svn: 286866
* [ARM] Add plumbing for GlobalISelDiana Picus2016-11-1113-4/+407
| | | | | | Add GlobalISel skeleton, up to the point where we can select a ret void. llvm-svn: 286573
* [Target] Rename X86/ARM Assembly printer to reflect reality.Davide Italiano2016-11-101-1/+1
| | | | | | | This shows up a lot profiling LTO testcases with -time-passes, so better have a non confusing name. llvm-svn: 286488
* [ARM] Thumb2 LDR (literal) should accept PC as the destinationOliver Stannard2016-11-101-1/+1
| | | | | | | | | The version of this instruction with the .w suffix already correctly accepts this, but the alias without the .w did not. Differential Revision: https://reviews.llvm.org/D26499 llvm-svn: 286446
* [Thumb1] Move padding earlier when synthesizing TBBs off of the PCJames Molloy2016-11-071-0/+8
| | | | | | | | | | When the base register (register pointing to the jump table) is the PC, we expect the jump table to directly follow the jump sequence with no intervening padding. If there is intervening padding, the calculated offsets will not be correct. One solution would be to account for any padding in the emitted LDRB instruction, but at the moment we don't support emitting MCExprs for the load offset. In the meantime, it's correct and only a slight amount worse to just move the padding up, from just before the jump table to just before the jump instruction sequence. We can do that by emitting code alignment before the jump sequence, as we know the number of instructions in the sequence is always 4. llvm-svn: 286107
* ARM: lower fpowi appropriately for Windows ARMSaleem Abdulrasool2016-11-061-0/+57
| | | | | | | | | | | This handles the last case of the builtin function calls that we would generate code which differed from Microsoft's ABI. Rather than generating a call to `__pow{d,s}i2` we now promote the parameter to a float or double and invoke `powf` or `pow` instead. Addresses PR30825! llvm-svn: 286082
* [Cortex-M0] Atomic loweringWeiming Zhao2016-11-032-7/+14
| | | | | | | | | | | | Summary: ARMv6m supports dmb etc fench instructions but not ldrex/strex etc. So for some atomic load/store, LLVM should inline instructions instead of lowering to __sync_ calls. Reviewers: rengolin, efriedma, t.p.northover, jmolloy Subscribers: efriedma, aemerson, llvm-commits Differential Revision: https://reviews.llvm.org/D26120 llvm-svn: 285969
* Remove a redundant condition found by PVS-Studio.Chandler Carruth2016-11-031-2/+2
| | | | | | | Filed http://llvm.org/PR30897 to teach Clang to warn on this kind of stuff. llvm-svn: 285945
* Revert "[Thumb] Teach ISel how to lower compares of AND bitmasks efficiently"James Molloy2016-11-032-138/+5
| | | | | | This reverts commit r285893. It caused (probably) http://lab.llvm.org:8011/builders/clang-cmake-thumbv7-a15-full-sh/builds/83 . llvm-svn: 285912
* [Thumb] Teach ISel how to lower compares of AND bitmasks efficientlyJames Molloy2016-11-032-5/+138
| | | | | | | | | | | | | | | This recommits r281323, which was backed out for two reasons. One, a selfhost failure, and two, it apparently caused Chromium failures. Actually, the latter was a red herring. The log has expired from the former, but I suspect that was a red herring too (actually caused by another problematic patch of mine). Therefore reapplying, and will watch the bots like a hawk. For the common pattern (CMPZ (AND x, #bitmask), #0), we can do some more efficient instruction selection if the bitmask is one consecutive sequence of set bits (32 - clz(bm) - ctz(bm) == popcount(bm)). 1) If the bitmask touches the LSB, then we can remove all the upper bits and set the flags by doing one LSLS. 2) If the bitmask touches the MSB, then we can remove all the lower bits and set the flags with one LSRS. 3) If the bitmask has popcount == 1 (only one set bit), we can shift that bit into the sign bit with one LSLS and change the condition query from NE/EQ to MI/PL (we could also implement this by shifting into the carry bit and branching on BCC/BCS). 4) Otherwise, we can emit a sequence of LSLS+LSRS to remove the upper and lower zero bits of the mask. 1-3 require only one 16-bit instruction and can elide the CMP. 4 requires two 16-bit instructions but can elide the CMP and doesn't require materializing a complex immediate, so is also a win. llvm-svn: 285893
* [ARM][MC] Cleanup ARM Target Assembly ParserNirav Dave2016-11-021-540/+343
| | | | | | | | | | | | | | Summary: Correctly parse end-of-statement tokens and handle preprocessor end-of-line comments in ARM assembly processor. Reviewers: rnk, majnemer Subscribers: aemerson, rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D26152 llvm-svn: 285830
* [TableGen] Move OperandMatchResultTy enum to MCTargetAsmParser.hAlex Bradbury2016-11-011-20/+20
| | | | | | | | | | | | | | | As it stands, the OperandMatchResultTy is only included in the generated header if there is custom operand parsing. However, almost all backends make use of MatchOperand_Success and friends from OperandMatchResultTy for e.g. parseRegister. This is a pain when starting an AsmParser for a new backend that doesn't yet have custom operand parsing. Move the enum to MCTargetAsmParser.h. This patch is a prerequisite for D23563 Differential Revision: https://reviews.llvm.org/D23496 llvm-svn: 285705
OpenPOWER on IntegriCloud