summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/ARM
Commit message (Collapse)AuthorAgeFilesLines
* Fix ARM paired GPR COPY loweringJF Bastien2013-07-121-0/+17
| | | | | | | | | | | | | ARM paired GPR COPY was being lowered to two MOVr without CC. This patch puts the CC back. My test is a reduction of the case where I encountered the issue, 64-bit atomics use paired GPRs. The issue only occurs with selectionDAG, FastISel doesn't encounter it so I didn't bother calling it. llvm-svn: 186226
* Start using CHECK-LABEL in some tests.Stephen Lin2013-07-122-26/+26
| | | | llvm-svn: 186163
* Add a comment to this change, requested by Eric Christopher.Joey Gouly2013-07-081-0/+1
| | | | llvm-svn: 185853
* ARM: Improve codegen for generic vselect.Jim Grosbach2013-07-082-30/+39
| | | | | | | | Fall back to by-element insert rather than building it up on the stack. rdar://14351991 llvm-svn: 185846
* Stop putting operations after a tail call.Tim Northover2013-07-061-0/+16
| | | | | | | | This prevents the emission of DAG-generated vreg definitions after a tail call be dropping them entirely (on the grounds that nothing could use them anyway, and they interfere with O0 CodeGen). llvm-svn: 185754
* ARM: Add a pack pattern for matching arithmetic shift rightArnold Schwaighofer2013-07-051-0/+10
| | | | llvm-svn: 185714
* ARM: Fix incorrect pack patternArnold Schwaighofer2013-07-051-1/+14
| | | | | | | | | | | A "pkhtb x, x, y asr #num" uses the lower 16 bits of "y asr #num" and packs them in the bottom half of "x". An arithmetic and logic shift are only equivalent in this context if the shift amount is 16. We would be shifting in ones into the bottom 16bits instead of zeros if "y" is negative. radar://14338767 llvm-svn: 185712
* PR16490: fix a crash in ARMDAGToDAGISel::SelectInlineAsm.Joey Gouly2013-07-051-0/+5
| | | | | | | | | | | In the SelectionDAG immediate operands to inline asm are constructed as two separate operands. The first is a constant of value InlineAsm::Kind_Imm and the second is a constant with the value of the immediate. In ARMDAGToDAGISel::SelectInlineAsm, if we reach an operand of Kind_Imm we should skip over the next operand too. llvm-svn: 185688
* [ARM] Improve the instruction selection of vector loads.Quentin Colombet2013-07-032-1/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | In the ARM back-end, build_vector nodes are lowered to a target specific build_vector that uses floating point type. This works well, unless the inserted bitcasts survive until instruction selection. In that case, they incur moves between integer unit and floating point unit that may result in inefficient code. In other words, this conversion may introduce artificial dependencies when the code leading to the build vector cannot be completed with a floating point type. In particular, this happens when loads are not aligned. Before this patch, in that case, the compiler generates general purpose loads and creates the floating point vector from them, instead of directly using the vector unit. The patch uses a vector friendly sequence of code when the inserted bitcasts to floating point survived DAGCombine. This is done by a target specific DAGCombine that changes the target specific build_vector into a sequence of insert_vector_elt that get rid of the bitcasts. <rdar://problem/14170854> llvm-svn: 185587
* Prefix failing commands with not to make clear they are expected to fail.Rafael Espindola2013-07-033-3/+3
| | | | llvm-svn: 185554
* ARM: relax the atomic release barrier to "dmb ishst" on SwiftTim Northover2013-07-033-56/+101
| | | | | | | | | | | Swift cores implement store barriers that are stronger than the ARM specification but weaker than general barriers. They are, in fact, just about enough to provide the ordering needed for atomic operations with release semantics. This patch makes use of that quirk. llvm-svn: 185527
* Revert r185339 (ARM: relax the atomic release barrier to "dmb ishst")Tim Northover2013-07-012-90/+56
| | | | | | | | | Turns out I'd misread the architecture reference manual and thought that was a load/store-store barrier, when it's not. Thanks for pointing it out Eli! llvm-svn: 185356
* ARM: relax the atomic release barrier to "dmb ishst"Tim Northover2013-07-012-56/+90
| | | | | | | | | | | I believe the full "dmb ish" barrier is not required to guarantee release semantics for atomic operations. The weaker "dmb ishst" prevents previous operations being reordered with a store executed afterwards, which is enough. A key point to note (fortunately already correct) is that this barrier alone is *insufficient* for sequential consistency, no matter how liberally placed. llvm-svn: 185339
* Add missing case to switch statement - DAGTypeLegalizer::ExpandIntegerResultLang Hames2013-06-281-0/+15
| | | | | | | | | | | | should expand ATOMIC_CMP_SWAP nodes the same way that it does for ATOMIC_SWAP. Since ATOMIC_LOADs on some targets (e.g. older ARM variants) get legalized to ATOMIC_CMP_SWAPs, the missing case had been causing i64 atomic loads to crash during isel. <rdar://problem/14074644> llvm-svn: 185186
* Bug 13662: Enable GPRPair for all i64 operands of inline asm on ARMWeiming Zhao2013-06-281-3/+36
| | | | | | | | This patch assigns paired GPRs for inline asm with 64-bit data on ARM. It's enabled for both ARM and Thumb to support modifiers like %H, %Q, %R. llvm-svn: 185169
* ARM: ensure fixed-point conversions have sane typesTim Northover2013-06-282-0/+82
| | | | | | | | | | | We were generating intrinsics for NEON fixed-point conversions that didn't exist (e.g. float -> i16). There are two cases to consider: + iN is smaller than float. In this case we can do the conversion but need an extend or truncate as well. + iN is larger than float. In this case using the NEON conversion would be incorrect so we don't perform any combining. llvm-svn: 185158
* Debug Info: clean up usage of Verify.Manman Ren2013-06-281-4/+4
| | | | | | | | | | | No functionality change. It should suffice to check the type of a debug info metadata, instead of calling Verify. For cases where we know the type of a DI metadata, use assert. Also update testing cases to make them conform to the format of DI classes. llvm-svn: 185135
* Add a Subtarget feature 'v8fp' to the ARM backend.Joey Gouly2013-06-271-0/+10
| | | | llvm-svn: 185073
* Add a subtarget feature 'v8' to the ARM backend.Joey Gouly2013-06-261-7/+16
| | | | | | This allows for targeting the ARMv8 AArch32 variant. llvm-svn: 184967
* Remove the 'generic' CPU from the ARM eabi attributes printer.Joey Gouly2013-06-261-4/+3
| | | | | | Make v4 the default ARM architecture attribute, to match CodeGen. llvm-svn: 184962
* DebugInfo: Don't lose unreferenced non-trivial by-value parametersDavid Blaikie2013-06-211-1/+1
| | | | | | | | | | | | A FastISel optimization was causing us to emit no information for such parameters & when they go missing we end up emitting a different function type. By avoiding that shortcut we not only get types correct (very important) but also location information (handy) - even if it's only live at the start of a function & may be clobbered later. Reviewed/discussion by Evan Cheng & Dan Gohman. llvm-svn: 184604
* ARM: Remove a (false) dependency on the memoryoperand's value as we do not useQuentin Colombet2013-06-202-2/+44
| | | | | | | | | | it at the moment. This allows to form more paired loads even when stack coloring pass destroys the memoryoperand's value. <rdar://problem/13978317> llvm-svn: 184492
* During SelectionDAG building explicitly set a node to constant zero when theQuentin Colombet2013-06-181-1/+1
| | | | | | | | | | | | value is zero. This allows optmizations to kick in more easily. Fix some test cases so that they remain meaningful (i.e., not completely dead coded) when optimizations apply. <rdar://problem/14096009> superfluous multiply by high part of zero-extended value. llvm-svn: 184222
* Switch spill weights from a basic loop depth estimation to BlockFrequencyInfo.Benjamin Kramer2013-06-171-2/+1
| | | | | | | | | | | | | | | | | | The main advantages here are way better heuristics, taking into account not just loop depth but also __builtin_expect and other static heuristics and will eventually learn how to use profile info. Most of the work in this patch is pushing the MachineBlockFrequencyInfo analysis into the right places. This is good for a 5% speedup on zlib's deflate (x86_64), there were some very unfortunate spilling decisions in its hottest loop in longest_match(). Other benchmarks I tried were mostly neutral. This changes register allocation in subtle ways, update the tests for it. 2012-02-20-MachineCPBug.ll was deleted as it's very fragile and the instruction it looked for was gone already (but the FileCheck pattern picked up unrelated stuff). llvm-svn: 184105
* Debug Info: Simplify Frame Index handling in DBG_VALUE Machine InstructionsDavid Blaikie2013-06-162-2/+2
| | | | | | | | | | | | | | | | | | | | Rather than using the full power of target-specific addressing modes in DBG_VALUEs with Frame Indicies, simply use Frame Index + Offset. This reduces the complexity of debug info handling down to two representations of values (reg+offset and frame index+offset) rather than three or four. Ideally we could ensure that frame indicies had been eliminated by the time we reached an assembly or dwarf generation, but I haven't spent the time to figure out where the FIs are leaking through into that & whether there's a good place to convert them. Some FI+offset=>reg+offset conversion is done (see PrologEpilogInserter, for example) which is necessary for some SelectionDAG assumptions about registers, I believe, but it might be possible to make this a more thorough conversion & ensure there are no remaining FIs no matter how instruction selection is performed. llvm-svn: 184066
* DebugInfo: follow up to 184045 to constrain the tests further to ensure they ↵David Blaikie2013-06-151-2/+2
| | | | | | don't contain +0 offsets llvm-svn: 184046
* DebugInfo: print DBG_VALUE MachineInstrs with [] for deref and drop the ↵David Blaikie2013-06-152-3/+3
| | | | | | offset when it's zero llvm-svn: 184045
* Make PrologEpilogInserter save/restore all callee saved registersDerek Schuff2013-06-141-0/+18
| | | | | | | | | | | in functions which call __builtin_unwind_init() __builtin_unwind_init() is an undocumented gcc intrinsic which has this effect, and is used in libgcc_eh. Goes part of the way toward fixing PR8541. llvm-svn: 183984
* Enable FastISel on ARM for Linux and NaCl, not MCJITJF Bastien2013-06-1425-12/+85
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a resubmit of r182877, which was reverted because it broken MCJIT tests on ARM. The patch leaves MCJIT on ARM as it was before: only enabled for iOS. I've CC'ed people from the original review and revert. FastISel was only enabled for iOS ARM and Thumb2, this patch enables it for ARM (not Thumb2) on Linux and NaCl, but not MCJIT. Thumb2 support needs a bit more work, mainly around register class restrictions. The patch punts to SelectionDAG when doing TLS relocation on non-Darwin targets. I will fix this and other FastISel-to-SelectionDAG failures in a separate patch. The patch also forces FastISel to retain frame pointers: iOS always keeps them for backtracking (so emitted code won't change because of this), but Linux was getting much worse code that was incorrect when using big frames (such as test-suite's lencod). I'll also fix this in a later patch, it will probably require a peephole so that FastISel doesn't rematerialize frame pointers back-to-back. The test changes are straightforward, similar to: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130513/174279.html They also add a vararg test that got dropped in that change. I ran all of lnt test-suite on A15 hardware with --optimize-option=-O0 and all the tests pass. All the tests also pass on x86 make check-all. I also re-ran the check-all tests that failed on ARM, and they all seem to pass. llvm-svn: 183966
* Add test for ARM FastISel load/store register classesJF Bastien2013-06-101-0/+70
| | | | | | r183624 fixed an issue that was tested indirectly. Test it directly with this new test. llvm-svn: 183634
* Refine the ARM EHABI test cases.Logan Chien2013-06-099-413/+323
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since we have ARM unwind directive parser and assembler, we can check the correctness in two stages: 1. From LLVM assembly (.ll) to ARM assembly (.s) 2. From ARM assembly (.s) to ELF object file (.o) We already have several "*.s to *.o" test cases. This CL adds some "*.ll to *.s" test cases and removes the redundant "*.ll to *.o" test cases. New test cases to check "*.ll to *.s" code generator: - ehabi.ll: Check the correctness of the generated unwind directives. - section-name.ll: Check the section name of functions. Removed test cases: - ehabi-mc-cantunwind.ll (Covered by ehabi-cantunwind.ll, and eh-directive-cantunwind.s) - ehabi-mc-compact-pr0.ll (Covered by ehabi.ll, eh-compact-pr0.s, eh-directive-save.s, and eh-directive-setfp.s) - ehabi-mc-compact-pr1.ll (Covered by ehabi.ll, eh-compact-pr1.s, eh-directive-save.s, and eh-directive-setfp.s) - ehabi-mc.ll (Covered by ehabi.ll, and eh-directive-integrated-test.s) - ehabi-mc-section-group.ll (Covered by section-name.ll, and eh-directive-section-comdat.s) - ehabi-mc-section.ll (Covered by section-name.ll, and eh-directive-section.s) - ehabi-mc-sh_link.ll (Covered by eh-directive-text-section.s, and eh-directive-section.s) llvm-svn: 183628
* Fix ARM unwind opcode assembler in several cases.Logan Chien2013-06-092-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Changes to ARM unwind opcode assembler: * Fix multiple .save or .vsave directives. Besides, the order is preserved now. * For the directives which will generate multiple opcodes, such as ".save {r0-r11}", the order of the unwind opcode is fixed now, i.e. the registers with less encoding value are popped first. * Fix the $sp offset calculation. Now, we can use the .setfp, .pad, .save, and .vsave directives at any order. Changes to test cases: * Add test cases to check the order of multiple opcodes for the .save directive. * Fix the incorrect $sp offset in the test case. The stack pointer offset specified in the test case was incorrect. (Changed test cases: ehabi-mc-section.ll and ehabi-mc.ll) * The opcode to restore $sp are slightly reordered. The behavior are not changed, and the new output is same as the output of GNU as. (Changed test cases: eh-directive-pad.s and eh-directive-setfp.s) llvm-svn: 183627
* Reapply r183552. This time, use a standard type for the option to avoid templateQuentin Colombet2013-06-081-0/+24
| | | | | | | | | | | | | | | instantiation issue with non-standard type. Add a backend option to warn on a given stack size limit. Option: -mllvm -warn-stack-size=<limit> Output (if limit is exceeded): warning: Stack size limit exceeded (<actual size>) in <functionName>. The longer term plan is to hook that to a clang warning. PR:4072 <rdar://problem/13987214>. llvm-svn: 183595
* Revert commits related to stack warning.Quentin Colombet2013-06-071-24/+0
| | | | llvm-svn: 183579
* Explicit triple in warn stack size test cases to not depend on OS.Quentin Colombet2013-06-071-2/+2
| | | | llvm-svn: 183574
* Add a backend option to warn on a given stack size limit.Quentin Colombet2013-06-071-0/+24
| | | | | | | | | | | | Option: -mllvm -warn-stack-size=<limit> Output (if limit is exceeded): warning: Stack size limit exceeded (<actual size>) in <functionName>. The longer term plan is to hook that to a clang warning. PR:4072 <rdar://problem/13987214> llvm-svn: 183552
* ARM FastISel integer sext/zext improvementsJF Bastien2013-06-079-29/+188
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | My recent ARM FastISel patch exposed this bug: http://llvm.org/bugs/show_bug.cgi?id=16178 The root cause is that it can't select integer sext/zext pre-ARMv6 and asserts out. The current integer sext/zext code doesn't handle other cases gracefully either, so this patch makes it handle all sext and zext from i1/i8/i16 to i8/i16/i32, with and without ARMv6, both in Thumb and ARM mode. This should fix the bug as well as make FastISel faster because it bails to SelectionDAG less often. See fastisel-ext.patch for this. fastisel-ext-tests.patch changes current tests to always use reg-imm AND for 8-bit zext instead of UXTB. This simplifies code since it is supported on ARMv4t and later, and at least on A15 both should perform exactly the same (both have exec 1 uop 1, type I). 2013-05-31-char-shift-crash.ll is a bitcode version of the above bug 16178 repro. fast-isel-ext.ll tests all sext/zext combinations that ARM FastISel should now handle. Note that my ARM FastISel enabling patch was reverted due to a separate failure when dealing with MCJIT, I'll fix this second failure and then turn FastISel on again for non-iOS ARM targets. I've tested "make check-all" on my x86 box, and "lnt test-suite" on A15 hardware. llvm-svn: 183551
* Teach AsmPrinter how to print odd constants.Quentin Colombet2013-06-071-0/+18
| | | | | | | | | | | Fix an assertion when the compiler encounters big constants whose bit width is not a multiple of 64-bits. Although clang would never generate something like this, the backend should be able to handle any legal IR. <rdar://problem/13363576> llvm-svn: 183544
* Cortex-R5 can issue Thumb2 integer division instructions.Evan Cheng2013-06-041-11/+12
| | | | llvm-svn: 183275
* ARM: Fix crash in ARM backend inside of ARMConstantIslandPassDavid Majnemer2013-06-041-0/+14
| | | | | | | | | | The ARM backend did not expect LDRBi12 to hold a constant pool operand. Allow for LLVM to deal with the instruction similar to how it deals with LDRi12. This fixes PR16215. llvm-svn: 183238
* Revert r182937 and r182877.Rafael Espindola2013-05-3024-82/+12
| | | | | | | | | r182877 broke MCJIT tests on ARM and r182937 was working around another failure by r182877. This should make the ARM bots green. llvm-svn: 182960
* Change how we iterate over relocations on ELF.Rafael Espindola2013-05-303-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | For COFF and MachO, sections semantically have relocations that apply to them. That is not the case on ELF. In relocatable objects (.o), a section with relocations in ELF has offsets to another section where the relocations should be applied. In dynamic objects and executables, relocations don't have an offset, they have a virtual address. The section sh_info may or may not point to another section, but that is not actually used for resolving the relocations. This patch exposes that in the ObjectFile API. It has the following advantages: * Most (all?) clients can handle this more efficiently. They will normally walk all relocations, so doing an effort to iterate in a particular order doesn't save time. * llvm-readobj now prints relocations in the same way the native readelf does. * probably most important, relocations that don't point to any section are now visible. This is the case of relocations in the rela.dyn section. See the updated relocation-executable.test for example. llvm-svn: 182908
* Order CALLSEQ_START and CALLSEQ_END nodes.Andrew Trick2013-05-291-10/+9
| | | | | | | | | | | | Fixes PR16146: gdb.base__call-ar-st.exp fails after pre-RA-sched=source fixes. Patch by Xiaoyi Guo! This also fixes an unsupported dbg.value test case. Codegen was previously incorrect but the test was passing by luck. llvm-svn: 182885
* Enable FastISel on ARM for Linux and NaClJF Bastien2013-05-2924-12/+82
| | | | | | | | | | | | | | | | | | | | | | | | | | | | FastISel was only enabled for iOS ARM and Thumb2, this patch enables it for ARM (not Thumb2) on Linux and NaCl. Thumb2 support needs a bit more work, mainly around register class restrictions. The patch punts to SelectionDAG when doing TLS relocation on non-Darwin targets. I will fix this and other FastISel-to-SelectionDAG failures in a separate patch. The patch also forces FastISel to retain frame pointers: iOS always keeps them for backtracking (so emitted code won't change because of this), but Linux was getting much worse code that was incorrect when using big frames (such as test-suite's lencod). I'll also fix this in a later patch, it will probably require a peephole so that FastISel doesn't rematerialize frame pointers back-to-back. The test changes are straightforward, similar to: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130513/174279.html They also add a vararg test that got dropped in that change. I ran all of test-suite on A15 hardware with --optimize-option=-O0 and all the tests pass. llvm-svn: 182877
* Track IR ordering of SelectionDAG nodes 4/4.Andrew Trick2013-05-251-0/+21
| | | | | | Unit test cases for -pre-RA-sched=source. llvm-svn: 182706
* Track IR ordering of SelectionDAG nodes 3/4.Andrew Trick2013-05-252-5/+5
| | | | | | | Remove the old IR ordering mechanism and switch to new one. Fix unit test failures. llvm-svn: 182704
* ARM: implement @llvm.readcyclecounter intrinsicTim Northover2013-05-231-0/+24
| | | | | | | | | | | | | This implements the @llvm.readcyclecounter intrinsic as the specific MRC instruction specified in the ARM manuals for CPUs with the Power Management extensions. Older CPUs had slightly different methods which may also have to be implemented eventually, but this should cover all v7 cases. rdar://problem/13939186 llvm-svn: 182603
* Fix PR16110: Handle DBG_VALUE in ConnectedVNInfoEqClasses::Distribute().Jakob Stoklund Olesen2013-05-231-0/+113
| | | | | | | | | | | | Now that the LiveDebugVariables pass is running *after* register coalescing, the ConnectedVNInfoEqClasses class needs to deal with DBG_VALUE instructions. This only comes up when rematerialization during coalescing causes the remaining live range of a virtual register to separate into two connected components. llvm-svn: 182592
* PR15868 fix.Stepan Dyatkovskiy2013-05-203-8/+64
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduction: In case when stack alignment is 8 and GPRs parameter part size is not N*8: we add padding to GPRs part, so part's last byte must be recovered at address K*8-1. We need to do it, since remained (stack) part of parameter starts from address K*8, and we need to "attach" "GPRs head" without gaps to it: Stack: |---- 8 bytes block ----| |---- 8 bytes block ----| |---- 8 bytes... [ [padding] [GPRs head] ] [ ------ Tail passed via stack ------ ... FIX: Note, once we added padding we need to correct *all* Arg offsets that are going after padded one. That's why we need this fix: Arg offsets were never corrected before this patch. See new test-cases included in patch. We also don't need to insert padding for byval parameters that are stored in GPRs only. We need pad only last byval parameter and only in case it outsides GPRs and stack alignment = 8. Though, stack area, allocated for recovered byval params, must satisfy "Size mod 8 = 0" restriction. This patch reduces stack usage for some cases: We can reduce ArgRegsSaveArea since inner N*4 bytes sized byval params my be "packed" with alignment 4 in some cases. llvm-svn: 182237
* Support unaligned load/store on more ARM targetsJF Bastien2013-05-172-130/+147
| | | | | | | | | | | | | | | | | | | | | This patch matches GCC behavior: the code used to only allow unaligned load/store on ARM for v6+ Darwin, it will now allow unaligned load/store for v6+ Darwin as well as for v7+ on Linux and NaCl. The distinction is made because v6 doesn't guarantee support (but LLVM assumes that Apple controls hardware+kernel and therefore have conformant v6 CPUs), whereas v7 does provide this guarantee (and Linux/NaCl behave sanely). The patch keeps the -arm-strict-align command line option, and adds -arm-no-strict-align. They behave similarly to GCC's -mstrict-align and -mnostrict-align. I originally encountered this discrepancy in FastIsel tests which expect unaligned load/store generation. Overall this should slightly improve performance in most cases because of reduced I$ pressure. llvm-svn: 182175
OpenPOWER on IntegriCloud