summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/ARM/ARMISelLowering.cpp
Commit message (Collapse)AuthorAgeFilesLines
...
* [CodeGen] Refactor TLI/AtomicExpand interface to make LLSC explicit.Ahmed Bougacha2015-09-111-5/+10
| | | | | | | | | | | | | | | We used to have this magic "hasLoadLinkedStoreConditional()" callback, which really meant two things: - expand cmpxchg (to ll/sc). - expand atomic loads using ll/sc (rather than cmpxchg). Remove it, and, instead, introduce explicit callbacks: - bool shouldExpandAtomicCmpXchgInIR(inst) - AtomicExpansionKind shouldExpandAtomicLoadInIR(inst) Differential Revision: http://reviews.llvm.org/D12557 llvm-svn: 247429
* [CodeGen] Rename AtomicRMWExpansionKind to AtomicExpansionKind.Ahmed Bougacha2015-09-111-3/+3
| | | | | | This lets us generalize its usage to the other atomic instructions. llvm-svn: 247428
* [ARM] Do not use vtrn for vectorshuffle if the order is reversedJames Molloy2015-09-101-4/+13
| | | | | | | | The tests in isVTRNMask and isVTRN_v_undef_Mask should also check that the elements of the upper and lower half of the vectorshuffle occur in the correct order when both halves are used. Without this test the code assumes that it is correct to use vector transpose (vtrn) for the masks <1, 1, 0, 0> and <1, 3, 0, 2>, among others, but the transpose actually incorrectly generates shuffles for <0, 0, 1, 1> and <0, 2, 1, 3> in this case. Patch by Jeroen Ketema! llvm-svn: 247254
* [ARM] Don't abort on variable-idx extractelt in ReconstructShuffle.Ahmed Bougacha2015-09-011-0/+4
| | | | | | | | | The code introduced in r244314 assumed that EXTRACT_VECTOR_ELT only takes constant indices, but it does accept variables. Bail out for those: we can't use them, as the shuffles we want to reconstruct do require constant masks. llvm-svn: 246594
* [CodeGen] Support (and default to) expanding READCYCLECOUNTER to 0.Ahmed Bougacha2015-08-281-30/+21
| | | | | | | | | | | For targets that didn't support this, this will let us respect the langref instead of failing to select. Note that we don't need to change the 32-bit x86/PPC lowerings (to account for the result type/# difference) because they're both custom and bypass type legalization. llvm-svn: 246258
* [WinEH] Add some support for code generating catchpadReid Kleckner2015-08-271-4/+4
| | | | | | | We can now run 32-bit programs with empty catch bodies. The next step is to change PEI so that we get funclet prologues and epilogues. llvm-svn: 246235
* [ARM] Use AEABI helpers for i64 div and remScott Douglass2015-08-241-5/+58
| | | | | | Differential Revision: http://reviews.llvm.org/D12232 llvm-svn: 245830
* [ARM] Refactor LowerDivRem before adding LowerREM (nfc)Scott Douglass2015-08-241-17/+36
| | | | | | Differential Revision: http://reviews.llvm.org/D12230 llvm-svn: 245829
* [ARM] Don't try and custom lower a vNi64 SETCC.James Molloy2015-08-201-0/+6
| | | | | | | | It won't go well. We've already marked 64-bit SETCCs as non-Custom, but it's just possible that a SETCC has a legal result type but an illegal operand type. If this happens, bail out before we create unselectable nodes. Fixes PR24292. I tried to create a testcase but in 99% of cases we can't trigger this - not surprising that this bug has been latent since 2009. llvm-svn: 245577
* [ARM] Add instruction selection patterns for vmin/vmaxSilviu Baranga2015-08-191-2/+20
| | | | | | | | | | | | | | | | Summary: The mid-end was generating vector smin/smax/umin/umax nodes, but we were using vbsl to generatate the code. This adds the vmin/vmax patterns and a test to check that we are now generating vmin/vmax instructions. Reviewers: rengolin, jmolloy Subscribers: aemerson, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D12105 llvm-svn: 245439
* [ARM] Fix crash when targetting CPU without NEONJames Molloy2015-08-171-3/+3
| | | | | | | | We emulate a scalar vmin/vmax with NEON instructions as they don't exist in the VFP ISA. So only mark these as legal when NEON is available. Found here: https://code.google.com/p/chromium/issues/detail?id=521671 llvm-svn: 245231
* Rip out hand-rolled matching code for VMIN, VMAX, VMINNM and VMAXNMJames Molloy2015-08-171-194/+0
| | | | | | This is no longer needed - SDAGBuilder will do this for us. llvm-svn: 245197
* [ARM] FMINNAN/FMAXNAN of f64 are not legal.James Molloy2015-08-131-2/+0
| | | | | | | | This was my error. We've got f32 marked as legal because they're simulated using a v2f32 instruction, but there's no equivalent for f64. This will get test coverage imminently when D12015 lands. llvm-svn: 244916
* PseudoSourceValue: Replace global manager with a manager in a machine function.Alex Lorenz2015-08-111-75/+92
| | | | | | | | | | | | | | | | | | | | | | This commit removes the global manager variable which is responsible for storing and allocating pseudo source values and instead it introduces a new manager class named 'PseudoSourceValueManager'. Machine functions now own an instance of the pseudo source value manager class. This commit also modifies the 'get...' methods in the 'MachinePointerInfo' class to construct pseudo source values using the instance of the pseudo source value manager object from the machine function. This commit updates calls to the 'get...' methods from the 'MachinePointerInfo' class in a lot of different files because those calls now need to pass in a reference to a machine function to those methods. This change will make it easier to serialize pseudo source values as it will enable me to transform the mips specific MipsCallEntry PseudoSourceValue subclass into two target independent subclasses. Reviewers: Akira Hatanaka llvm-svn: 244693
* [ARM] Match fminnan/fmaxnan for vector vmin/vmax instead of an intrinsicJames Molloy2015-08-111-0/+16
| | | | | | | | Lower Intrinsic::arm_neon_vmins/vmaxs to fminnan/fmaxnan and match that instead. This is important because SDAG will soon be able to select FMINNAN itself, so we need a unified lowering path for intrinsics and SDAG. NFCI. llvm-svn: 244593
* [ARM] Match fminnum/fmaxnum for vector vminnm/vmaxnm instead of an intrinsicJames Molloy2015-08-111-0/+12
| | | | | | | | Lower the intrinsic to a FMINNUM/FMAXNUM node and select that instead. This is important because soon SDAG will be able to select FMINNUM/FMAXNUM itself, so we need an integrated lowering path between SDAG and intrinsics. NFCI. llvm-svn: 244592
* [ARM] Replace ARMISD::VMINNM/VMAXNM with ISD::FMINNUM/FMAXNUMJames Molloy2015-08-111-6/+8
| | | | | | NFCI. This replaces another custom ISDNode with a generic equivalent. llvm-svn: 244591
* [ARM] Replace ARMISD::FMIN/FMAX with the shiny new ISD::FMINNAN/FMAXNAN.James Molloy2015-08-111-4/+10
| | | | | | NFCI. This removes a custom ISDNode. llvm-svn: 244590
* Fix some comment typos.Benjamin Kramer2015-08-081-2/+2
| | | | llvm-svn: 244402
* Fix unused variable warning introduced in r244314Silviu Baranga2015-08-071-2/+4
| | | | llvm-svn: 244315
* [ARM] Update ReconstructShuffle to handle mismatched typesSilviu Baranga2015-08-071-92/+154
| | | | | | | | | | | | | | | | | | Summary: Port the ReconstructShuffle function from AArch64 to ARM to handle mismatched incoming types in the BUILD_VECTOR node. This fixes an outstanding FIXME in the ReconstructShuffle code. Reviewers: t.p.northover, rengolin Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D11720 llvm-svn: 244314
* wrap OptSize and MinSize attributes for easier and consistent access (NFCI)Sanjay Patel2015-08-041-3/+2
| | | | | | | | | | | | | | | | | Create wrapper methods in the Function class for the OptimizeForSize and MinSize attributes. We want to hide the logic of "or'ing" them together when optimizing just for size (-Os). Currently, we are not consistent about this and rely on a front-end to always set OptimizeForSize (-Os) if MinSize (-Oz) is on. Thus, there are 18 FIXME changes here that should be added as follow-on patches with regression tests. This patch is NFC-intended: it just replaces existing direct accesses of the attributes by the equivalent wrapper call. Differential Revision: http://reviews.llvm.org/D11734 llvm-svn: 243994
* ARM: support windows division routinesSaleem Abdulrasool2015-08-041-0/+5
| | | | | | | | | This adds the software division routines for the Windows RTABI. These are not expected to be used often though as most modern Windows ARM capable targets support hardware division. In the case that the target CPU doesnt support hardware division, this will be the fallback. llvm-svn: 243952
* ARM: make Darwin libcall registration table driven (NFC)Saleem Abdulrasool2015-08-041-71/+65
| | | | | | | Make the libcall updating table driven similar to the approach that the Linux and Windows codepath does below. NFC. llvm-svn: 243951
* De-constify pointers to Type since they can't be modified. NFCCraig Topper2015-08-011-3/+3
| | | | | | This was already done in most places a while ago. This just fixes the ones that crept in over time. llvm-svn: 243842
* [ARM] Lower modulo operation to generate __aeabi_divmod on AndroidSumanth Gundapaneni2015-07-311-3/+4
| | | | | | | | | | | | | | For a modulo (reminder) operation, clang -target armv7-none-linux-gnueabi generates "__modsi3" clang -target armv7-none-eabi generates "__aeabi_idivmod" clang -target armv7-linux-androideabi generates "__modsi3" Android bionic libc doesn't provide a __modsi3, instead it provides a "__aeabi_idivmod". This patch fixes the LLVM ARMISelLowering to generate the correct call when ever there is a modulo operation. Differential Revision: http://reviews.llvm.org/D11661 llvm-svn: 243717
* fix memcpy/memset/memmove lowering when optimizing for sizeSanjay Patel2015-07-301-3/+3
| | | | | | | | | | | | | | | | | | | | Fixing MinSize attribute handling was discussed in D11363. This is a prerequisite patch to doing that. The handling of OptSize when lowering mem* functions was broken on Darwin because it wants to ignore -Os for these cases, but the existing logic also made it ignore -Oz (MinSize). The Linux change demonstrates a widespread problem. The backend doesn't usually recognize the MinSize attribute by itself; it assumes that if the MinSize attribute exists, then the OptSize attribute must also exist. Fixing this more generally will be a follow-on patch or two. Differential Revision: http://reviews.llvm.org/D11568 llvm-svn: 243693
* Implement target independent TLS compatible with glibc's emutls.c.Chih-Hung Hsieh2015-07-281-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The 'common' section TLS is not implemented. Current C/C++ TLS variables are not placed in common section. DWARF debug info to get the address of TLS variables is not generated yet. clang and driver changes in http://reviews.llvm.org/D10524 Added -femulated-tls flag to select the emulated TLS model, which will be used for old targets like Android that do not support ELF TLS models. Added TargetLowering::LowerToTLSEmulatedModel as a target-independent function to convert a SDNode of TLS variable address to a function call to __emutls_get_address. Added into lib/Target/*/*ISelLowering.cpp to call LowerToTLSEmulatedModel for TLSModel::Emulated. Although all targets supporting ELF TLS models are enhanced, emulated TLS model has been tested only for Android ELF targets. Modified AsmPrinter.cpp to print the emutls_v.* and emutls_t.* variables for emulated TLS variables. Modified DwarfCompileUnit.cpp to skip some DIE for emulated TLS variabls. TODO: Add proper DIE for emulated TLS variables. Added new unit tests with emulated TLS. Differential Revision: http://reviews.llvm.org/D10522 llvm-svn: 243438
* [ARM] - Fix lowering of shufflevectors in AArch32Luke Cheeseman2015-07-241-38/+127
| | | | | | | | | | | | | | | | | | | Some shufflevectors are currently being incorrectly lowered in the AArch32 backend as the existing checks for detecting the NEON operations from the shufflevector instruction expects the shuffle mask and the vector operands to be of the same length. This is not always the case as the mask may be twice as long as the operand; here only the lower half of the shufflemask gets checked, so provided the lower half of the shufflemask looks like a vector transpose (or even is just all -1 for undef) then the intrinsics may get incorrectly lowered into a vector transpose (VTRN) instruction. This patch fixes this by accommodating for both cases and adds regression tests. Differential Revision: http://reviews.llvm.org/D11407 llvm-svn: 243103
* When lowering vector shifts a check is performed to see if the value to shift byLuke Cheeseman2015-07-241-4/+8
| | | | | | | | | | | | is an immediate, in this check the value is negated and stored in and int64_t. The value can be -2^63 yet the result cannot be stored in an int64_t and this gives some undefined behaviour causing failures. The negation is only necessary when the values is within a certain range and so it should not need to negate -2^63, this patch introduces this and also a regression test. Differential Revision: http://reviews.llvm.org/D11408 llvm-svn: 243100
* [ARM] Use [SU]ABSDIFF nodes instead of intrinsics for VABD/VABAJames Molloy2015-07-171-0/+14
| | | | | | | No functional change, but it preps codegen for the future when SABSDIFF will start getting generated in anger. llvm-svn: 242546
* Fix __builtin_setjmp in combination with sjlj exception handling.Matthias Braun2015-07-161-5/+17
| | | | | | | | | | | | | | | | | | | llvm.eh.sjlj.setjmp was used as part of the SjLj exception handling style but is also used in clang to implement __builtin_setjmp. The ARM backend needs to output additional dispatch tables for the SjLj exception handling style, these tables however can't be emitted if llvm.eh.sjlj.setjmp is simply used for __builtin_setjmp and no actual landing pad blocks exist. To solve this issue a new llvm.eh.sjlj.setup_dispatch intrinsic is introduced which is used instead of llvm.eh.sjlj.setjmp in the SjLj exception handling lowering, so we can differentiate between the case where we actually need to setup a dispatch table and the case where we just need the __builtin_setjmp semantic. Differential Revision: http://reviews.llvm.org/D9313 llvm-svn: 242481
* ARM: Fix cttz expansion on vector types.Logan Chien2015-07-131-2/+98
| | | | | | | | | | | | The 64/128-bit vector types are legal if NEON instructions are available. However, there was no matching patterns for @llvm.cttz.*() intrinsics and result in fatal error. This commit fixes the problem by lowering cttz to: a. ctpop((x & -x) - 1) b. width - ctlz(x & -x) - 1 llvm-svn: 242037
* Allow {e,r}bp as the target of {read,write}_register.Pat Gavlin2015-07-091-2/+2
| | | | | | | | | | This patch allows the read_register and write_register intrinsics to read/write the RBP/EBP registers on X86 iff the targeted register is the frame pointer for the containing function. Differential Revision: http://reviews.llvm.org/D10977 llvm-svn: 241827
* Remove getDataLayout() from TargetLoweringMehdi Amini2015-07-091-19/+23
| | | | | | | | | | | | | | | | Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: yaron.keren, rafael, llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D11042 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241779
* Make isLegalAddressingMode() taking DataLayout as an argumentMehdi Amini2015-07-091-3/+3
| | | | | | | | | | | | | | | | Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, llvm-commits, rafael, yaron.keren Differential Revision: http://reviews.llvm.org/D11040 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241778
* Make TargetLowering::getPointerTy() taking DataLayout as an argumentMehdi Amini2015-07-091-67/+72
| | | | | | | | | | | | | | | | Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, ted, yaron.keren, rafael, llvm-commits Differential Revision: http://reviews.llvm.org/D11028 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241775
* Remove IsLittleEndian from TargetLowering and redirect to DataLayoutMehdi Amini2015-07-081-8/+10
| | | | | | | | | | | | | | | | Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: llvm-commits, rafael, yaron.keren Differential Revision: http://reviews.llvm.org/D11017 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241655
* [ARM] Define a subtarget feature and use it to decide whether long calls shouldAkira Hatanaka2015-07-071-6/+1
| | | | | | | | | | | | | | | | | be emitted. This is needed to enable ARM long calls for LTO and enable and disable it on a per-function basis. Out-of-tree projects currently using EnableARMLongCalls to emit long calls should start passing "+long-calls" to the feature string (see the changes made to clang in r241565). rdar://problem/21529937 Differential Revision: http://reviews.llvm.org/D9364 llvm-svn: 241566
* IR: Do not consider available_externally linkage to be linker-weak.Peter Collingbourne2015-07-051-3/+3
| | | | | | | | | | | | | | | From the linker's perspective, an available_externally global is equivalent to an external declaration (per isDeclarationForLinker()), so it is incorrect to consider it to be a weak definition. Also clean up some logic in the dead argument elimination pass and clarify its comments to better explain how its behavior depends on linkage, introduce GlobalValue::isStrongDefinitionForLinker() and start using it throughout the optimizers and backend. Differential Revision: http://reviews.llvm.org/D10941 llvm-svn: 241413
* [TargetLowering] StringRefize asm constraint getters.Benjamin Kramer2015-07-051-5/+3
| | | | | | | | There is some functional change here because it changes target code from atoi(3) to StringRef::getAsInteger which has error checking. For valid constraints there should be no difference. llvm-svn: 241411
* [ARM] Lower interleaved memory accesses to vldN/vstN intrinsics.Hao Liu2015-06-261-0/+161
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch also adds a function to calculate the cost of interleaved memory accesses. E.g. Lower an interleaved load: %wide.vec = load <8 x i32>, <8 x i32>* %ptr, align 4 %v0 = shuffle %wide.vec, undef, <0, 2, 4, 6> %v1 = shuffle %wide.vec, undef, <1, 3, 5, 7> into: %vld2 = { <4 x i32>, <4 x i32> } call llvm.arm.neon.vld2(%ptr, 4) %vec0 = extractelement { <4 x i32>, <4 x i32> } %vld2, i32 0 %vec1 = extractelement { <4 x i32>, <4 x i32> } %vld2, i32 1 E.g. Lower an interleaved store: %i.vec = shuffle <8 x i32> %v0, <8 x i32> %v1, <0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11> store <12 x i32> %i.vec, <12 x i32>* %ptr, align 4 into: %sub.v0 = shuffle <8 x i32> %v0, <8 x i32> v1, <0, 1, 2, 3> %sub.v1 = shuffle <8 x i32> %v0, <8 x i32> v1, <4, 5, 6, 7> %sub.v2 = shuffle <8 x i32> %v0, <8 x i32> v1, <8, 9, 10, 11> call void llvm.arm.neon.vst3(%ptr, %sub.v0, %sub.v1, %sub.v2, 4) Differential Revision: http://reviews.llvm.org/D10533 llvm-svn: 240755
* Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC)Alexander Kornienko2015-06-231-1/+1
| | | | | | Apparently, the style needs to be agreed upon first. llvm-svn: 240390
* Fixed/added namespace ending comments using clang-tidy. NFCAlexander Kornienko2015-06-191-1/+1
| | | | | | | | | | | | | The patch is generated using this command: tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \ -checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \ llvm/lib/ Thanks to Eugene Kosov for the original patch! llvm-svn: 240137
* [ARM] Look through concat when lowering in-place shuffles (VZIP, ..)Ahmed Bougacha2015-06-191-0/+38
| | | | | | | | | | | | | | | | | | | | | | | | | Currently, we canonicalize shuffles that produce a result larger than their operands with: shuffle(concat(v1, undef), concat(v2, undef)) -> shuffle(concat(v1, v2), undef) because we can access quad vectors (see PerformVECTOR_SHUFFLECombine). This is useful in the general case, but there are special cases where native shuffles produce larger results: the two-result ops. We can look through the concat when lowering them: shuffle(concat(v1, v2), undef) -> concat(VZIP(v1, v2):0, :1) This lets us generate the native shuffles instead of scalarizing to dozens of VMOVs. Differential Revision: http://reviews.llvm.org/D10424 llvm-svn: 240118
* [ARM] Factor out two-result shuffle matching. NFCI.Ahmed Bougacha2015-06-191-26/+35
| | | | | | | In preparation for a future patch: makes it easier to do the same matching to generate different nodes, without duplication. llvm-svn: 240116
* Clean up redundant copies of Triple objects. NFCDaniel Sanders2015-06-161-1/+1
| | | | | | | | | | | | | | Summary: Reviewers: rengolin Reviewed By: rengolin Subscribers: llvm-commits, rengolin, jholewinski Differential Revision: http://reviews.llvm.org/D10382 llvm-svn: 239823
* Revert "Move dllimport name mangling to IR mangler."Reid Kleckner2015-06-111-2/+7
| | | | | | | | | This reverts commit r239437. This broke clang-cl self-hosts. We'd end up calling the __imp_ symbol directly instead of using it to do an indirect function call. llvm-svn: 239502
* Move dllimport name mangling to IR mangler.Peter Collingbourne2015-06-091-7/+2
| | | | | | | | This ensures that LTO clients see the correct external symbol name. Differential Revision: http://reviews.llvm.org/D10318 llvm-svn: 239437
* Remove DisableTailCalls from TargetOptions and the code in resetTargetOptionsAkira Hatanaka2015-06-091-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | that was resetting it. Remove the uses of DisableTailCalls in subclasses of TargetLowering and use the value of function attribute "disable-tail-calls" instead. Also, unconditionally add pass TailCallElim to the pipeline and check the function attribute at the start of runOnFunction to disable the pass on a per-function basis. This is part of the work to remove TargetMachine::resetTargetOptions, and since DisableTailCalls was the last non-fast-math option that was being reset in that function, we should be able to remove the function entirely after the work to propagate IR-level fast-math flags to DAG nodes is completed. Out-of-tree users should remove the uses of DisableTailCalls and make changes to attach attribute "disable-tail-calls"="true" or "false" to the functions in the IR. rdar://problem/13752163 Differential Revision: http://reviews.llvm.org/D10099 llvm-svn: 239427
OpenPOWER on IntegriCloud