-rw-r--r-- | llvm/docs/Atomics.rst                                 |  174
-rw-r--r-- | llvm/include/llvm/CodeGen/RuntimeLibcalls.h           |   73
-rw-r--r-- | llvm/include/llvm/Target/TargetLowering.h             |   19
-rw-r--r-- | llvm/lib/CodeGen/AtomicExpandPass.cpp                 |  500
-rw-r--r-- | llvm/lib/CodeGen/TargetLoweringBase.cpp               |   64
-rw-r--r-- | llvm/lib/Target/Sparc/SparcISelLowering.cpp           |    7
-rw-r--r-- | llvm/test/Transforms/AtomicExpand/SPARC/libcalls.ll   |  257
-rw-r--r-- | llvm/test/Transforms/AtomicExpand/SPARC/lit.local.cfg |    2
8 files changed, 22 insertions, 1074 deletions
diff --git a/llvm/docs/Atomics.rst b/llvm/docs/Atomics.rst index 89f5f44dae6..ff667480446 100644 --- a/llvm/docs/Atomics.rst +++ b/llvm/docs/Atomics.rst @@ -413,28 +413,19 @@ The MachineMemOperand for all atomic operations is currently marked as volatile; this is not correct in the IR sense of volatile, but CodeGen handles anything marked volatile very conservatively. This should get fixed at some point. -One very important property of the atomic operations is that if your backend -supports any inline lock-free atomic operations of a given size, you should -support *ALL* operations of that size in a lock-free manner. - -When the target implements atomic ``cmpxchg`` or LL/SC instructions (as most do) -this is trivial: all the other operations can be implemented on top of those -primitives. However, on many older CPUs (e.g. ARMv5, SparcV8, Intel 80386) there -are atomic load and store instructions, but no ``cmpxchg`` or LL/SC. As it is -invalid to implement ``atomic load`` using the native instruction, but -``cmpxchg`` using a library call to a function that uses a mutex, ``atomic -load`` must *also* expand to a library call on such architectures, so that it -can remain atomic with regards to a simultaneous ``cmpxchg``, by using the same -mutex. - -AtomicExpandPass can help with that: it will expand all atomic operations to the -proper ``__atomic_*`` libcalls for any size above the maximum set by -``setMaxAtomicSizeInBitsSupported`` (which defaults to 0). +Common architectures have some way of representing at least a pointer-sized +lock-free ``cmpxchg``; such an operation can be used to implement all the other +atomic operations which can be represented in IR up to that size. Backends are +expected to implement all those operations, but not operations which cannot be +implemented in a lock-free manner. It is expected that backends will give an +error when given an operation which cannot be implemented. (The LLVM code +generator is not very helpful here at the moment, but hopefully that will +change.) On x86, all atomic loads generate a ``MOV``. SequentiallyConsistent stores generate an ``XCHG``, other stores generate a ``MOV``. SequentiallyConsistent fences generate an ``MFENCE``, other fences do not cause any code to be -generated. ``cmpxchg`` uses the ``LOCK CMPXCHG`` instruction. ``atomicrmw xchg`` +generated. cmpxchg uses the ``LOCK CMPXCHG`` instruction. ``atomicrmw xchg`` uses ``XCHG``, ``atomicrmw add`` and ``atomicrmw sub`` use ``XADD``, and all other ``atomicrmw`` operations generate a loop with ``LOCK CMPXCHG``. Depending on the users of the result, some ``atomicrmw`` operations can be translated into @@ -455,151 +446,10 @@ atomic constructs. Here are some lowerings it can do: ``emitStoreConditional()`` * large loads/stores -> ll-sc/cmpxchg by overriding ``shouldExpandAtomicStoreInIR()``/``shouldExpandAtomicLoadInIR()`` -* strong atomic accesses -> monotonic accesses + fences by overriding - ``shouldInsertFencesForAtomic()``, ``emitLeadingFence()``, and - ``emitTrailingFence()`` +* strong atomic accesses -> monotonic accesses + fences + by using ``setInsertFencesForAtomic()`` and overriding ``emitLeadingFence()`` + and ``emitTrailingFence()`` * atomic rmw -> loop with cmpxchg or load-linked/store-conditional by overriding ``expandAtomicRMWInIR()`` -* expansion to __atomic_* libcalls for unsupported sizes. For an example of all of these, look at the ARM backend. - -Libcalls: __atomic_* -==================== - -There are two kinds of atomic library calls that are generated by LLVM. 
Please -note that both sets of library functions somewhat confusingly share the names of -builtin functions defined by clang. Despite this, the library functions are -not directly related to the builtins: it is *not* the case that ``__atomic_*`` -builtins lower to ``__atomic_*`` library calls and ``__sync_*`` builtins lower -to ``__sync_*`` library calls. - -The first set of library functions are named ``__atomic_*``. This set has been -"standardized" by GCC, and is described below. (See also `GCC's documentation -<https://gcc.gnu.org/wiki/Atomic/GCCMM/LIbrary>`_) - -LLVM's AtomicExpandPass will translate atomic operations on data sizes above -``MaxAtomicSizeInBitsSupported`` into calls to these functions. - -There are four generic functions, which can be called with data of any size or -alignment:: - - void __atomic_load(size_t size, void *ptr, void *ret, int ordering) - void __atomic_store(size_t size, void *ptr, void *val, int ordering) - void __atomic_exchange(size_t size, void *ptr, void *val, void *ret, int ordering) - bool __atomic_compare_exchange(size_t size, void *ptr, void *expected, void *desired, int success_order, int failure_order) - -There are also size-specialized versions of the above functions, which can only -be used with *naturally-aligned* pointers of the appropriate size. In the -signatures below, "N" is one of 1, 2, 4, 8, and 16, and "iN" is the appropriate -integer type of that size; if no such integer type exists, the specialization -cannot be used:: - - iN __atomic_load_N(iN *ptr, iN val, int ordering) - void __atomic_store_N(iN *ptr, iN val, int ordering) - iN __atomic_exchange_N(iN *ptr, iN val, int ordering) - bool __atomic_compare_exchange_N(iN *ptr, iN *expected, iN desired, int success_order, int failure_order) - -Finally there are some read-modify-write functions, which are only available in -the size-specific variants (any other sizes use a ``__atomic_compare_exchange`` -loop):: - - iN __atomic_fetch_add_N(iN *ptr, iN val, int ordering) - iN __atomic_fetch_sub_N(iN *ptr, iN val, int ordering) - iN __atomic_fetch_and_N(iN *ptr, iN val, int ordering) - iN __atomic_fetch_or_N(iN *ptr, iN val, int ordering) - iN __atomic_fetch_xor_N(iN *ptr, iN val, int ordering) - iN __atomic_fetch_nand_N(iN *ptr, iN val, int ordering) - -This set of library functions have some interesting implementation requirements -to take note of: - -- They support all sizes and alignments -- including those which cannot be - implemented natively on any existing hardware. Therefore, they will certainly - use mutexes in for some sizes/alignments. - -- As a consequence, they cannot be shipped in a statically linked - compiler-support library, as they have state which must be shared amongst all - DSOs loaded in the program. They must be provided in a shared library used by - all objects. - -- The set of atomic sizes supported lock-free must be a superset of the sizes - any compiler can emit. That is: if a new compiler introduces support for - inline-lock-free atomics of size N, the ``__atomic_*`` functions must also have a - lock-free implementation for size N. This is a requirement so that code - produced by an old compiler (which will have called the ``__atomic_*`` function) - interoperates with code produced by the new compiler (which will use native - the atomic instruction). 
- -Note that it's possible to write an entirely target-independent implementation -of these library functions by using the compiler atomic builtins themselves to -implement the operations on naturally-aligned pointers of supported sizes, and a -generic mutex implementation otherwise. - -Libcalls: __sync_* -================== - -Some targets or OS/target combinations can support lock-free atomics, but for -various reasons, it is not practical to emit the instructions inline. - -There's two typical examples of this. - -Some CPUs support multiple instruction sets which can be swiched back and forth -on function-call boundaries. For example, MIPS supports the MIPS16 ISA, which -has a smaller instruction encoding than the usual MIPS32 ISA. ARM, similarly, -has the Thumb ISA. In MIPS16 and earlier versions of Thumb, the atomic -instructions are not encodable. However, those instructions are available via a -function call to a function with the longer encoding. - -Additionally, a few OS/target pairs provide kernel-supported lock-free -atomics. ARM/Linux is an example of this: the kernel `provides -<https://www.kernel.org/doc/Documentation/arm/kernel_user_helpers.txt>`_ a -function which on older CPUs contains a "magically-restartable" atomic sequence -(which looks atomic so long as there's only one CPU), and contains actual atomic -instructions on newer multicore models. This sort of functionality can typically -be provided on any architecture, if all CPUs which are missing atomic -compare-and-swap support are uniprocessor (no SMP). This is almost always the -case. The only common architecture without that property is SPARC -- SPARCV8 SMP -systems were common, yet it doesn't support any sort of compare-and-swap -operation. - -In either of these cases, the Target in LLVM can claim support for atomics of an -appropriate size, and then implement some subset of the operations via libcalls -to a ``__sync_*`` function. Such functions *must* not use locks in their -implementation, because unlike the ``__atomic_*`` routines used by -AtomicExpandPass, these may be mixed-and-matched with native instructions by the -target lowering. - -Further, these routines do not need to be shared, as they are stateless. So, -there is no issue with having multiple copies included in one binary. Thus, -typically these routines are implemented by the statically-linked compiler -runtime support library. - -LLVM will emit a call to an appropriate ``__sync_*`` routine if the target -ISelLowering code has set the corresponding ``ATOMIC_CMPXCHG``, ``ATOMIC_SWAP``, -or ``ATOMIC_LOAD_*`` operation to "Expand", and if it has opted-into the -availablity of those library functions via a call to ``initSyncLibcalls()``. 
- -The full set of functions that may be called by LLVM is (for ``N`` being 1, 2, -4, 8, or 16):: - - iN __sync_val_compare_and_swap_N(iN *ptr, iN expected, iN desired) - iN __sync_lock_test_and_set_N(iN *ptr, iN val) - iN __sync_fetch_and_add_N(iN *ptr, iN val) - iN __sync_fetch_and_sub_N(iN *ptr, iN val) - iN __sync_fetch_and_and_N(iN *ptr, iN val) - iN __sync_fetch_and_or_N(iN *ptr, iN val) - iN __sync_fetch_and_xor_N(iN *ptr, iN val) - iN __sync_fetch_and_nand_N(iN *ptr, iN val) - iN __sync_fetch_and_max_N(iN *ptr, iN val) - iN __sync_fetch_and_umax_N(iN *ptr, iN val) - iN __sync_fetch_and_min_N(iN *ptr, iN val) - iN __sync_fetch_and_umin_N(iN *ptr, iN val) - -This list doesn't include any function for atomic load or store; all known -architectures support atomic loads and stores directly (possibly by emitting a -fence on either side of a normal load or store.) - -There's also, somewhat separately, the possibility to lower ``ATOMIC_FENCE`` to -``__sync_synchronize()``. This may happen or not happen independent of all the -above, controlled purely by ``setOperationAction(ISD::ATOMIC_FENCE, ...)``. diff --git a/llvm/include/llvm/CodeGen/RuntimeLibcalls.h b/llvm/include/llvm/CodeGen/RuntimeLibcalls.h index cfd13a3250c..bdf8d9ce086 100644 --- a/llvm/include/llvm/CodeGen/RuntimeLibcalls.h +++ b/llvm/include/llvm/CodeGen/RuntimeLibcalls.h @@ -336,11 +336,7 @@ namespace RTLIB { // EXCEPTION HANDLING UNWIND_RESUME, - // Note: there's two sets of atomics libcalls; see - // <http://llvm.org/docs/Atomics.html> for more info on the - // difference between them. - - // Atomic '__sync_*' libcalls. + // Family ATOMICs SYNC_VAL_COMPARE_AND_SWAP_1, SYNC_VAL_COMPARE_AND_SWAP_2, SYNC_VAL_COMPARE_AND_SWAP_4, @@ -402,73 +398,6 @@ namespace RTLIB { SYNC_FETCH_AND_UMIN_8, SYNC_FETCH_AND_UMIN_16, - // Atomic '__atomic_*' libcalls. - ATOMIC_LOAD, - ATOMIC_LOAD_1, - ATOMIC_LOAD_2, - ATOMIC_LOAD_4, - ATOMIC_LOAD_8, - ATOMIC_LOAD_16, - - ATOMIC_STORE, - ATOMIC_STORE_1, - ATOMIC_STORE_2, - ATOMIC_STORE_4, - ATOMIC_STORE_8, - ATOMIC_STORE_16, - - ATOMIC_EXCHANGE, - ATOMIC_EXCHANGE_1, - ATOMIC_EXCHANGE_2, - ATOMIC_EXCHANGE_4, - ATOMIC_EXCHANGE_8, - ATOMIC_EXCHANGE_16, - - ATOMIC_COMPARE_EXCHANGE, - ATOMIC_COMPARE_EXCHANGE_1, - ATOMIC_COMPARE_EXCHANGE_2, - ATOMIC_COMPARE_EXCHANGE_4, - ATOMIC_COMPARE_EXCHANGE_8, - ATOMIC_COMPARE_EXCHANGE_16, - - ATOMIC_FETCH_ADD_1, - ATOMIC_FETCH_ADD_2, - ATOMIC_FETCH_ADD_4, - ATOMIC_FETCH_ADD_8, - ATOMIC_FETCH_ADD_16, - - ATOMIC_FETCH_SUB_1, - ATOMIC_FETCH_SUB_2, - ATOMIC_FETCH_SUB_4, - ATOMIC_FETCH_SUB_8, - ATOMIC_FETCH_SUB_16, - - ATOMIC_FETCH_AND_1, - ATOMIC_FETCH_AND_2, - ATOMIC_FETCH_AND_4, - ATOMIC_FETCH_AND_8, - ATOMIC_FETCH_AND_16, - - ATOMIC_FETCH_OR_1, - ATOMIC_FETCH_OR_2, - ATOMIC_FETCH_OR_4, - ATOMIC_FETCH_OR_8, - ATOMIC_FETCH_OR_16, - - ATOMIC_FETCH_XOR_1, - ATOMIC_FETCH_XOR_2, - ATOMIC_FETCH_XOR_4, - ATOMIC_FETCH_XOR_8, - ATOMIC_FETCH_XOR_16, - - ATOMIC_FETCH_NAND_1, - ATOMIC_FETCH_NAND_2, - ATOMIC_FETCH_NAND_4, - ATOMIC_FETCH_NAND_8, - ATOMIC_FETCH_NAND_16, - - ATOMIC_IS_LOCK_FREE, - // Stack Protector Fail. STACKPROTECTOR_CHECK_FAIL, diff --git a/llvm/include/llvm/Target/TargetLowering.h b/llvm/include/llvm/Target/TargetLowering.h index 994071bf9d2..7c97f3c05a2 100644 --- a/llvm/include/llvm/Target/TargetLowering.h +++ b/llvm/include/llvm/Target/TargetLowering.h @@ -1059,14 +1059,6 @@ public: /// \name Helpers for atomic expansion. /// @{ - /// Returns the maximum atomic operation size (in bits) supported by - /// the backend. 
Atomic operations greater than this size (as well - /// as ones that are not naturally aligned), will be expanded by - /// AtomicExpandPass into an __atomic_* library call. - unsigned getMaxAtomicSizeInBitsSupported() const { - return MaxAtomicSizeInBitsSupported; - } - /// Whether AtomicExpandPass should automatically insert fences and reduce /// ordering for this atomic. This should be true for most architectures with /// weak memory ordering. Defaults to false. @@ -1464,14 +1456,6 @@ protected: MinStackArgumentAlignment = Align; } - /// Set the maximum atomic operation size supported by the - /// backend. Atomic operations greater than this size (as well as - /// ones that are not naturally aligned), will be expanded by - /// AtomicExpandPass into an __atomic_* library call. - void setMaxAtomicSizeInBitsSupported(unsigned SizeInBits) { - MaxAtomicSizeInBitsSupported = SizeInBits; - } - public: //===--------------------------------------------------------------------===// // Addressing mode description hooks (used by LSR etc). @@ -1881,9 +1865,6 @@ private: /// The preferred loop alignment. unsigned PrefLoopAlignment; - /// Size in bits of the maximum atomics size the backend supports. - /// Accesses larger than this will be expanded by AtomicExpandPass. - unsigned MaxAtomicSizeInBitsSupported; /// If set to a physical register, this specifies the register that /// llvm.savestack/llvm.restorestack should save and restore. diff --git a/llvm/lib/CodeGen/AtomicExpandPass.cpp b/llvm/lib/CodeGen/AtomicExpandPass.cpp index 6d69de12422..8c0c0f4acba 100644 --- a/llvm/lib/CodeGen/AtomicExpandPass.cpp +++ b/llvm/lib/CodeGen/AtomicExpandPass.cpp @@ -8,10 +8,10 @@ //===----------------------------------------------------------------------===// // // This file contains a pass (at IR level) to replace atomic instructions with -// __atomic_* library calls, or target specific instruction which implement the -// same semantics in a way which better fits the target backend. This can -// include the use of (intrinsic-based) load-linked/store-conditional loops, -// AtomicCmpXchg, or type coercions. +// target specific instruction which implement the same semantics in a way +// which better fits the target backend. This can include the use of either +// (intrinsic-based) load-linked/store-conditional loops, AtomicCmpXchg, or +// type coercions. 
// //===----------------------------------------------------------------------===// @@ -64,95 +64,19 @@ namespace { bool expandAtomicCmpXchg(AtomicCmpXchgInst *CI); bool isIdempotentRMW(AtomicRMWInst *AI); bool simplifyIdempotentRMW(AtomicRMWInst *AI); - - bool expandAtomicOpToLibcall(Instruction *I, unsigned Size, unsigned Align, - Value *PointerOperand, Value *ValueOperand, - Value *CASExpected, AtomicOrdering Ordering, - AtomicOrdering Ordering2, - ArrayRef<RTLIB::Libcall> Libcalls); - void expandAtomicLoadToLibcall(LoadInst *LI); - void expandAtomicStoreToLibcall(StoreInst *LI); - void expandAtomicRMWToLibcall(AtomicRMWInst *I); - void expandAtomicCASToLibcall(AtomicCmpXchgInst *I); }; } char AtomicExpand::ID = 0; char &llvm::AtomicExpandID = AtomicExpand::ID; -INITIALIZE_TM_PASS(AtomicExpand, "atomic-expand", "Expand Atomic instructions", - false, false) +INITIALIZE_TM_PASS(AtomicExpand, "atomic-expand", + "Expand Atomic calls in terms of either load-linked & store-conditional or cmpxchg", + false, false) FunctionPass *llvm::createAtomicExpandPass(const TargetMachine *TM) { return new AtomicExpand(TM); } -namespace { -// Helper functions to retrieve the size of atomic instructions. -unsigned getAtomicOpSize(LoadInst *LI) { - const DataLayout &DL = LI->getModule()->getDataLayout(); - return DL.getTypeStoreSize(LI->getType()); -} - -unsigned getAtomicOpSize(StoreInst *SI) { - const DataLayout &DL = SI->getModule()->getDataLayout(); - return DL.getTypeStoreSize(SI->getValueOperand()->getType()); -} - -unsigned getAtomicOpSize(AtomicRMWInst *RMWI) { - const DataLayout &DL = RMWI->getModule()->getDataLayout(); - return DL.getTypeStoreSize(RMWI->getValOperand()->getType()); -} - -unsigned getAtomicOpSize(AtomicCmpXchgInst *CASI) { - const DataLayout &DL = CASI->getModule()->getDataLayout(); - return DL.getTypeStoreSize(CASI->getCompareOperand()->getType()); -} - -// Helper functions to retrieve the alignment of atomic instructions. -unsigned getAtomicOpAlign(LoadInst *LI) { - unsigned Align = LI->getAlignment(); - // In the future, if this IR restriction is relaxed, we should - // return DataLayout::getABITypeAlignment when there's no align - // value. - assert(Align != 0 && "An atomic LoadInst always has an explicit alignment"); - return Align; -} - -unsigned getAtomicOpAlign(StoreInst *SI) { - unsigned Align = SI->getAlignment(); - // In the future, if this IR restriction is relaxed, we should - // return DataLayout::getABITypeAlignment when there's no align - // value. - assert(Align != 0 && "An atomic StoreInst always has an explicit alignment"); - return Align; -} - -unsigned getAtomicOpAlign(AtomicRMWInst *RMWI) { - // TODO(PR27168): This instruction has no alignment attribute, but unlike the - // default alignment for load/store, the default here is to assume - // it has NATURAL alignment, not DataLayout-specified alignment. - const DataLayout &DL = RMWI->getModule()->getDataLayout(); - return DL.getTypeStoreSize(RMWI->getValOperand()->getType()); -} - -unsigned getAtomicOpAlign(AtomicCmpXchgInst *CASI) { - // TODO(PR27168): same comment as above. - const DataLayout &DL = CASI->getModule()->getDataLayout(); - return DL.getTypeStoreSize(CASI->getCompareOperand()->getType()); -} - -// Determine if a particular atomic operation has a supported size, -// and is of appropriate alignment, to be passed through for target -// lowering. 
(Versus turning into a __atomic libcall) -template <typename Inst> -bool atomicSizeSupported(const TargetLowering *TLI, Inst *I) { - unsigned Size = getAtomicOpSize(I); - unsigned Align = getAtomicOpAlign(I); - return Align >= Size && Size <= TLI->getMaxAtomicSizeInBitsSupported() / 8; -} - -} // end anonymous namespace - bool AtomicExpand::runOnFunction(Function &F) { if (!TM || !TM->getSubtargetImpl(F)->enableAtomicExpand()) return false; @@ -176,33 +100,6 @@ bool AtomicExpand::runOnFunction(Function &F) { auto CASI = dyn_cast<AtomicCmpXchgInst>(I); assert((LI || SI || RMWI || CASI) && "Unknown atomic instruction"); - // If the Size/Alignment is not supported, replace with a libcall. - if (LI) { - if (!atomicSizeSupported(TLI, LI)) { - expandAtomicLoadToLibcall(LI); - MadeChange = true; - continue; - } - } else if (SI) { - if (!atomicSizeSupported(TLI, SI)) { - expandAtomicStoreToLibcall(SI); - MadeChange = true; - continue; - } - } else if (RMWI) { - if (!atomicSizeSupported(TLI, RMWI)) { - expandAtomicRMWToLibcall(RMWI); - MadeChange = true; - continue; - } - } else if (CASI) { - if (!atomicSizeSupported(TLI, CASI)) { - expandAtomicCASToLibcall(CASI); - MadeChange = true; - continue; - } - } - if (TLI->shouldInsertFencesForAtomic(I)) { auto FenceOrdering = AtomicOrdering::Monotonic; bool IsStore, IsLoad; @@ -247,7 +144,7 @@ bool AtomicExpand::runOnFunction(Function &F) { assert(LI->getType()->isIntegerTy() && "invariant broken"); MadeChange = true; } - + MadeChange |= tryExpandAtomicLoad(LI); } else if (SI) { if (SI->getValueOperand()->getType()->isFloatingPointTy()) { @@ -936,384 +833,3 @@ bool llvm::expandAtomicRMWToCmpXchg(AtomicRMWInst *AI, return true; } - -// This converts from LLVM's internal AtomicOrdering enum to the -// memory_order_* value required by the __atomic_* libcalls. -static int libcallAtomicModel(AtomicOrdering AO) { - enum { - AO_ABI_memory_order_relaxed = 0, - AO_ABI_memory_order_consume = 1, - AO_ABI_memory_order_acquire = 2, - AO_ABI_memory_order_release = 3, - AO_ABI_memory_order_acq_rel = 4, - AO_ABI_memory_order_seq_cst = 5 - }; - - switch (AO) { - case AtomicOrdering::NotAtomic: - llvm_unreachable("Expected atomic memory order."); - case AtomicOrdering::Unordered: - case AtomicOrdering::Monotonic: - return AO_ABI_memory_order_relaxed; - // Not implemented yet in llvm: - // case AtomicOrdering::Consume: - // return AO_ABI_memory_order_consume; - case AtomicOrdering::Acquire: - return AO_ABI_memory_order_acquire; - case AtomicOrdering::Release: - return AO_ABI_memory_order_release; - case AtomicOrdering::AcquireRelease: - return AO_ABI_memory_order_acq_rel; - case AtomicOrdering::SequentiallyConsistent: - return AO_ABI_memory_order_seq_cst; - } - llvm_unreachable("Unknown atomic memory order."); -} - -// In order to use one of the sized library calls such as -// __atomic_fetch_add_4, the alignment must be sufficient, the size -// must be one of the potentially-specialized sizes, and the value -// type must actually exist in C on the target (otherwise, the -// function wouldn't actually be defined.) -static bool canUseSizedAtomicCall(unsigned Size, unsigned Align, - const DataLayout &DL) { - // TODO: "LargestSize" is an approximation for "largest type that - // you can express in C". It seems to be the case that int128 is - // supported on all 64-bit platforms, otherwise only up to 64-bit - // integers are supported. If we get this wrong, then we'll try to - // call a sized libcall that doesn't actually exist. 
There should - // really be some more reliable way in LLVM of determining integer - // sizes which are valid in the target's C ABI... - unsigned LargestSize = DL.getLargestLegalIntTypeSize() >= 64 ? 16 : 8; - return Align >= Size && - (Size == 1 || Size == 2 || Size == 4 || Size == 8 || Size == 16) && - Size <= LargestSize; -} - -void AtomicExpand::expandAtomicLoadToLibcall(LoadInst *I) { - static const RTLIB::Libcall Libcalls[6] = { - RTLIB::ATOMIC_LOAD, RTLIB::ATOMIC_LOAD_1, RTLIB::ATOMIC_LOAD_2, - RTLIB::ATOMIC_LOAD_4, RTLIB::ATOMIC_LOAD_8, RTLIB::ATOMIC_LOAD_16}; - unsigned Size = getAtomicOpSize(I); - unsigned Align = getAtomicOpAlign(I); - - bool expanded = expandAtomicOpToLibcall( - I, Size, Align, I->getPointerOperand(), nullptr, nullptr, - I->getOrdering(), AtomicOrdering::NotAtomic, Libcalls); - (void)expanded; - assert(expanded && "expandAtomicOpToLibcall shouldn't fail tor Load"); -} - -void AtomicExpand::expandAtomicStoreToLibcall(StoreInst *I) { - static const RTLIB::Libcall Libcalls[6] = { - RTLIB::ATOMIC_STORE, RTLIB::ATOMIC_STORE_1, RTLIB::ATOMIC_STORE_2, - RTLIB::ATOMIC_STORE_4, RTLIB::ATOMIC_STORE_8, RTLIB::ATOMIC_STORE_16}; - unsigned Size = getAtomicOpSize(I); - unsigned Align = getAtomicOpAlign(I); - - bool expanded = expandAtomicOpToLibcall( - I, Size, Align, I->getPointerOperand(), I->getValueOperand(), nullptr, - I->getOrdering(), AtomicOrdering::NotAtomic, Libcalls); - (void)expanded; - assert(expanded && "expandAtomicOpToLibcall shouldn't fail tor Store"); -} - -void AtomicExpand::expandAtomicCASToLibcall(AtomicCmpXchgInst *I) { - static const RTLIB::Libcall Libcalls[6] = { - RTLIB::ATOMIC_COMPARE_EXCHANGE, RTLIB::ATOMIC_COMPARE_EXCHANGE_1, - RTLIB::ATOMIC_COMPARE_EXCHANGE_2, RTLIB::ATOMIC_COMPARE_EXCHANGE_4, - RTLIB::ATOMIC_COMPARE_EXCHANGE_8, RTLIB::ATOMIC_COMPARE_EXCHANGE_16}; - unsigned Size = getAtomicOpSize(I); - unsigned Align = getAtomicOpAlign(I); - - bool expanded = expandAtomicOpToLibcall( - I, Size, Align, I->getPointerOperand(), I->getNewValOperand(), - I->getCompareOperand(), I->getSuccessOrdering(), I->getFailureOrdering(), - Libcalls); - (void)expanded; - assert(expanded && "expandAtomicOpToLibcall shouldn't fail tor CAS"); -} - -static ArrayRef<RTLIB::Libcall> GetRMWLibcall(AtomicRMWInst::BinOp Op) { - static const RTLIB::Libcall LibcallsXchg[6] = { - RTLIB::ATOMIC_EXCHANGE, RTLIB::ATOMIC_EXCHANGE_1, - RTLIB::ATOMIC_EXCHANGE_2, RTLIB::ATOMIC_EXCHANGE_4, - RTLIB::ATOMIC_EXCHANGE_8, RTLIB::ATOMIC_EXCHANGE_16}; - static const RTLIB::Libcall LibcallsAdd[6] = { - RTLIB::UNKNOWN_LIBCALL, RTLIB::ATOMIC_FETCH_ADD_1, - RTLIB::ATOMIC_FETCH_ADD_2, RTLIB::ATOMIC_FETCH_ADD_4, - RTLIB::ATOMIC_FETCH_ADD_8, RTLIB::ATOMIC_FETCH_ADD_16}; - static const RTLIB::Libcall LibcallsSub[6] = { - RTLIB::UNKNOWN_LIBCALL, RTLIB::ATOMIC_FETCH_SUB_1, - RTLIB::ATOMIC_FETCH_SUB_2, RTLIB::ATOMIC_FETCH_SUB_4, - RTLIB::ATOMIC_FETCH_SUB_8, RTLIB::ATOMIC_FETCH_SUB_16}; - static const RTLIB::Libcall LibcallsAnd[6] = { - RTLIB::UNKNOWN_LIBCALL, RTLIB::ATOMIC_FETCH_AND_1, - RTLIB::ATOMIC_FETCH_AND_2, RTLIB::ATOMIC_FETCH_AND_4, - RTLIB::ATOMIC_FETCH_AND_8, RTLIB::ATOMIC_FETCH_AND_16}; - static const RTLIB::Libcall LibcallsOr[6] = { - RTLIB::UNKNOWN_LIBCALL, RTLIB::ATOMIC_FETCH_OR_1, - RTLIB::ATOMIC_FETCH_OR_2, RTLIB::ATOMIC_FETCH_OR_4, - RTLIB::ATOMIC_FETCH_OR_8, RTLIB::ATOMIC_FETCH_OR_16}; - static const RTLIB::Libcall LibcallsXor[6] = { - RTLIB::UNKNOWN_LIBCALL, RTLIB::ATOMIC_FETCH_XOR_1, - RTLIB::ATOMIC_FETCH_XOR_2, RTLIB::ATOMIC_FETCH_XOR_4, - RTLIB::ATOMIC_FETCH_XOR_8, 
RTLIB::ATOMIC_FETCH_XOR_16}; - static const RTLIB::Libcall LibcallsNand[6] = { - RTLIB::UNKNOWN_LIBCALL, RTLIB::ATOMIC_FETCH_NAND_1, - RTLIB::ATOMIC_FETCH_NAND_2, RTLIB::ATOMIC_FETCH_NAND_4, - RTLIB::ATOMIC_FETCH_NAND_8, RTLIB::ATOMIC_FETCH_NAND_16}; - - switch (Op) { - case AtomicRMWInst::BAD_BINOP: - llvm_unreachable("Should not have BAD_BINOP."); - case AtomicRMWInst::Xchg: - return ArrayRef<RTLIB::Libcall>(LibcallsXchg); - case AtomicRMWInst::Add: - return ArrayRef<RTLIB::Libcall>(LibcallsAdd); - case AtomicRMWInst::Sub: - return ArrayRef<RTLIB::Libcall>(LibcallsSub); - case AtomicRMWInst::And: - return ArrayRef<RTLIB::Libcall>(LibcallsAnd); - case AtomicRMWInst::Or: - return ArrayRef<RTLIB::Libcall>(LibcallsOr); - case AtomicRMWInst::Xor: - return ArrayRef<RTLIB::Libcall>(LibcallsXor); - case AtomicRMWInst::Nand: - return ArrayRef<RTLIB::Libcall>(LibcallsNand); - case AtomicRMWInst::Max: - case AtomicRMWInst::Min: - case AtomicRMWInst::UMax: - case AtomicRMWInst::UMin: - // No atomic libcalls are available for max/min/umax/umin. - return ArrayRef<RTLIB::Libcall>(); - } - llvm_unreachable("Unexpected AtomicRMW operation."); -} - -void AtomicExpand::expandAtomicRMWToLibcall(AtomicRMWInst *I) { - ArrayRef<RTLIB::Libcall> Libcalls = GetRMWLibcall(I->getOperation()); - - unsigned Size = getAtomicOpSize(I); - unsigned Align = getAtomicOpAlign(I); - - bool Success = false; - if (!Libcalls.empty()) - Success = expandAtomicOpToLibcall( - I, Size, Align, I->getPointerOperand(), I->getValOperand(), nullptr, - I->getOrdering(), AtomicOrdering::NotAtomic, Libcalls); - - // The expansion failed: either there were no libcalls at all for - // the operation (min/max), or there were only size-specialized - // libcalls (add/sub/etc) and we needed a generic. So, expand to a - // CAS libcall, via a CAS loop, instead. - if (!Success) { - expandAtomicRMWToCmpXchg(I, [this](IRBuilder<> &Builder, Value *Addr, - Value *Loaded, Value *NewVal, - AtomicOrdering MemOpOrder, - Value *&Success, Value *&NewLoaded) { - // Create the CAS instruction normally... - AtomicCmpXchgInst *Pair = Builder.CreateAtomicCmpXchg( - Addr, Loaded, NewVal, MemOpOrder, - AtomicCmpXchgInst::getStrongestFailureOrdering(MemOpOrder)); - Success = Builder.CreateExtractValue(Pair, 1, "success"); - NewLoaded = Builder.CreateExtractValue(Pair, 0, "newloaded"); - - // ...and then expand the CAS into a libcall. - expandAtomicCASToLibcall(Pair); - }); - } -} - -// A helper routine for the above expandAtomic*ToLibcall functions. -// -// 'Libcalls' contains an array of enum values for the particular -// ATOMIC libcalls to be emitted. All of the other arguments besides -// 'I' are extracted from the Instruction subclass by the -// caller. Depending on the particular call, some will be null. -bool AtomicExpand::expandAtomicOpToLibcall( - Instruction *I, unsigned Size, unsigned Align, Value *PointerOperand, - Value *ValueOperand, Value *CASExpected, AtomicOrdering Ordering, - AtomicOrdering Ordering2, ArrayRef<RTLIB::Libcall> Libcalls) { - assert(Libcalls.size() == 6); - - LLVMContext &Ctx = I->getContext(); - Module *M = I->getModule(); - const DataLayout &DL = M->getDataLayout(); - IRBuilder<> Builder(I); - IRBuilder<> AllocaBuilder(&I->getFunction()->getEntryBlock().front()); - - bool UseSizedLibcall = canUseSizedAtomicCall(Size, Align, DL); - Type *SizedIntTy = Type::getIntNTy(Ctx, Size * 8); - - unsigned AllocaAlignment = DL.getPrefTypeAlignment(SizedIntTy); - - // TODO: the "order" argument type is "int", not int32. 
So - // getInt32Ty may be wrong if the arch uses e.g. 16-bit ints. - ConstantInt *SizeVal64 = ConstantInt::get(Type::getInt64Ty(Ctx), Size); - Constant *OrderingVal = - ConstantInt::get(Type::getInt32Ty(Ctx), libcallAtomicModel(Ordering)); - Constant *Ordering2Val = CASExpected - ? ConstantInt::get(Type::getInt32Ty(Ctx), - libcallAtomicModel(Ordering2)) - : nullptr; - bool HasResult = I->getType() != Type::getVoidTy(Ctx); - - RTLIB::Libcall RTLibType; - if (UseSizedLibcall) { - switch (Size) { - case 1: RTLibType = Libcalls[1]; break; - case 2: RTLibType = Libcalls[2]; break; - case 4: RTLibType = Libcalls[3]; break; - case 8: RTLibType = Libcalls[4]; break; - case 16: RTLibType = Libcalls[5]; break; - } - } else if (Libcalls[0] != RTLIB::UNKNOWN_LIBCALL) { - RTLibType = Libcalls[0]; - } else { - // Can't use sized function, and there's no generic for this - // operation, so give up. - return false; - } - - // Build up the function call. There's two kinds. First, the sized - // variants. These calls are going to be one of the following (with - // N=1,2,4,8,16): - // iN __atomic_load_N(iN *ptr, int ordering) - // void __atomic_store_N(iN *ptr, iN val, int ordering) - // iN __atomic_{exchange|fetch_*}_N(iN *ptr, iN val, int ordering) - // bool __atomic_compare_exchange_N(iN *ptr, iN *expected, iN desired, - // int success_order, int failure_order) - // - // Note that these functions can be used for non-integer atomic - // operations, the values just need to be bitcast to integers on the - // way in and out. - // - // And, then, the generic variants. They look like the following: - // void __atomic_load(size_t size, void *ptr, void *ret, int ordering) - // void __atomic_store(size_t size, void *ptr, void *val, int ordering) - // void __atomic_exchange(size_t size, void *ptr, void *val, void *ret, - // int ordering) - // bool __atomic_compare_exchange(size_t size, void *ptr, void *expected, - // void *desired, int success_order, - // int failure_order) - // - // The different signatures are built up depending on the - // 'UseSizedLibcall', 'CASExpected', 'ValueOperand', and 'HasResult' - // variables. - - AllocaInst *AllocaCASExpected = nullptr; - Value *AllocaCASExpected_i8 = nullptr; - AllocaInst *AllocaValue = nullptr; - Value *AllocaValue_i8 = nullptr; - AllocaInst *AllocaResult = nullptr; - Value *AllocaResult_i8 = nullptr; - - Type *ResultTy; - SmallVector<Value *, 6> Args; - AttributeSet Attr; - - // 'size' argument. - if (!UseSizedLibcall) { - // Note, getIntPtrType is assumed equivalent to size_t. - Args.push_back(ConstantInt::get(DL.getIntPtrType(Ctx), Size)); - } - - // 'ptr' argument. - Value *PtrVal = - Builder.CreateBitCast(PointerOperand, Type::getInt8PtrTy(Ctx)); - Args.push_back(PtrVal); - - // 'expected' argument, if present. - if (CASExpected) { - AllocaCASExpected = AllocaBuilder.CreateAlloca(CASExpected->getType()); - AllocaCASExpected->setAlignment(AllocaAlignment); - AllocaCASExpected_i8 = - Builder.CreateBitCast(AllocaCASExpected, Type::getInt8PtrTy(Ctx)); - Builder.CreateLifetimeStart(AllocaCASExpected_i8, SizeVal64); - Builder.CreateAlignedStore(CASExpected, AllocaCASExpected, AllocaAlignment); - Args.push_back(AllocaCASExpected_i8); - } - - // 'val' argument ('desired' for cas), if present. 
- if (ValueOperand) { - if (UseSizedLibcall) { - Value *IntValue = - Builder.CreateBitOrPointerCast(ValueOperand, SizedIntTy); - Args.push_back(IntValue); - } else { - AllocaValue = AllocaBuilder.CreateAlloca(ValueOperand->getType()); - AllocaValue->setAlignment(AllocaAlignment); - AllocaValue_i8 = - Builder.CreateBitCast(AllocaValue, Type::getInt8PtrTy(Ctx)); - Builder.CreateLifetimeStart(AllocaValue_i8, SizeVal64); - Builder.CreateAlignedStore(ValueOperand, AllocaValue, AllocaAlignment); - Args.push_back(AllocaValue_i8); - } - } - - // 'ret' argument. - if (!CASExpected && HasResult && !UseSizedLibcall) { - AllocaResult = AllocaBuilder.CreateAlloca(I->getType()); - AllocaResult->setAlignment(AllocaAlignment); - AllocaResult_i8 = - Builder.CreateBitCast(AllocaResult, Type::getInt8PtrTy(Ctx)); - Builder.CreateLifetimeStart(AllocaResult_i8, SizeVal64); - Args.push_back(AllocaResult_i8); - } - - // 'ordering' ('success_order' for cas) argument. - Args.push_back(OrderingVal); - - // 'failure_order' argument, if present. - if (Ordering2Val) - Args.push_back(Ordering2Val); - - // Now, the return type. - if (CASExpected) { - ResultTy = Type::getInt1Ty(Ctx); - Attr = Attr.addAttribute(Ctx, AttributeSet::ReturnIndex, Attribute::ZExt); - } else if (HasResult && UseSizedLibcall) - ResultTy = SizedIntTy; - else - ResultTy = Type::getVoidTy(Ctx); - - // Done with setting up arguments and return types, create the call: - SmallVector<Type *, 6> ArgTys; - for (Value *Arg : Args) - ArgTys.push_back(Arg->getType()); - FunctionType *FnType = FunctionType::get(ResultTy, ArgTys, false); - Constant *LibcallFn = - M->getOrInsertFunction(TLI->getLibcallName(RTLibType), FnType, Attr); - CallInst *Call = Builder.CreateCall(LibcallFn, Args); - Call->setAttributes(Attr); - Value *Result = Call; - - // And then, extract the results... 
- if (ValueOperand && !UseSizedLibcall) - Builder.CreateLifetimeEnd(AllocaValue_i8, SizeVal64); - - if (CASExpected) { - // The final result from the CAS is {load of 'expected' alloca, bool result - // from call} - Type *FinalResultTy = I->getType(); - Value *V = UndefValue::get(FinalResultTy); - Value *ExpectedOut = - Builder.CreateAlignedLoad(AllocaCASExpected, AllocaAlignment); - Builder.CreateLifetimeEnd(AllocaCASExpected_i8, SizeVal64); - V = Builder.CreateInsertValue(V, ExpectedOut, 0); - V = Builder.CreateInsertValue(V, Result, 1); - I->replaceAllUsesWith(V); - } else if (HasResult) { - Value *V; - if (UseSizedLibcall) - V = Builder.CreateBitOrPointerCast(Result, I->getType()); - else { - V = Builder.CreateAlignedLoad(AllocaResult, AllocaAlignment); - Builder.CreateLifetimeEnd(AllocaResult_i8, SizeVal64); - } - I->replaceAllUsesWith(V); - } - I->eraseFromParent(); - return true; -} diff --git a/llvm/lib/CodeGen/TargetLoweringBase.cpp b/llvm/lib/CodeGen/TargetLoweringBase.cpp index d4aa5d5adad..8cadbb2dcd0 100644 --- a/llvm/lib/CodeGen/TargetLoweringBase.cpp +++ b/llvm/lib/CodeGen/TargetLoweringBase.cpp @@ -405,66 +405,7 @@ static void InitLibcallNames(const char **Names, const Triple &TT) { Names[RTLIB::SYNC_FETCH_AND_UMIN_4] = "__sync_fetch_and_umin_4"; Names[RTLIB::SYNC_FETCH_AND_UMIN_8] = "__sync_fetch_and_umin_8"; Names[RTLIB::SYNC_FETCH_AND_UMIN_16] = "__sync_fetch_and_umin_16"; - - Names[RTLIB::ATOMIC_LOAD] = "__atomic_load"; - Names[RTLIB::ATOMIC_LOAD_1] = "__atomic_load_1"; - Names[RTLIB::ATOMIC_LOAD_2] = "__atomic_load_2"; - Names[RTLIB::ATOMIC_LOAD_4] = "__atomic_load_4"; - Names[RTLIB::ATOMIC_LOAD_8] = "__atomic_load_8"; - Names[RTLIB::ATOMIC_LOAD_16] = "__atomic_load_16"; - - Names[RTLIB::ATOMIC_STORE] = "__atomic_store"; - Names[RTLIB::ATOMIC_STORE_1] = "__atomic_store_1"; - Names[RTLIB::ATOMIC_STORE_2] = "__atomic_store_2"; - Names[RTLIB::ATOMIC_STORE_4] = "__atomic_store_4"; - Names[RTLIB::ATOMIC_STORE_8] = "__atomic_store_8"; - Names[RTLIB::ATOMIC_STORE_16] = "__atomic_store_16"; - - Names[RTLIB::ATOMIC_EXCHANGE] = "__atomic_exchange"; - Names[RTLIB::ATOMIC_EXCHANGE_1] = "__atomic_exchange_1"; - Names[RTLIB::ATOMIC_EXCHANGE_2] = "__atomic_exchange_2"; - Names[RTLIB::ATOMIC_EXCHANGE_4] = "__atomic_exchange_4"; - Names[RTLIB::ATOMIC_EXCHANGE_8] = "__atomic_exchange_8"; - Names[RTLIB::ATOMIC_EXCHANGE_16] = "__atomic_exchange_16"; - - Names[RTLIB::ATOMIC_COMPARE_EXCHANGE] = "__atomic_compare_exchange"; - Names[RTLIB::ATOMIC_COMPARE_EXCHANGE_1] = "__atomic_compare_exchange_1"; - Names[RTLIB::ATOMIC_COMPARE_EXCHANGE_2] = "__atomic_compare_exchange_2"; - Names[RTLIB::ATOMIC_COMPARE_EXCHANGE_4] = "__atomic_compare_exchange_4"; - Names[RTLIB::ATOMIC_COMPARE_EXCHANGE_8] = "__atomic_compare_exchange_8"; - Names[RTLIB::ATOMIC_COMPARE_EXCHANGE_16] = "__atomic_compare_exchange_16"; - - Names[RTLIB::ATOMIC_FETCH_ADD_1] = "__atomic_fetch_add_1"; - Names[RTLIB::ATOMIC_FETCH_ADD_2] = "__atomic_fetch_add_2"; - Names[RTLIB::ATOMIC_FETCH_ADD_4] = "__atomic_fetch_add_4"; - Names[RTLIB::ATOMIC_FETCH_ADD_8] = "__atomic_fetch_add_8"; - Names[RTLIB::ATOMIC_FETCH_ADD_16] = "__atomic_fetch_add_16"; - Names[RTLIB::ATOMIC_FETCH_SUB_1] = "__atomic_fetch_sub_1"; - Names[RTLIB::ATOMIC_FETCH_SUB_2] = "__atomic_fetch_sub_2"; - Names[RTLIB::ATOMIC_FETCH_SUB_4] = "__atomic_fetch_sub_4"; - Names[RTLIB::ATOMIC_FETCH_SUB_8] = "__atomic_fetch_sub_8"; - Names[RTLIB::ATOMIC_FETCH_SUB_16] = "__atomic_fetch_sub_16"; - Names[RTLIB::ATOMIC_FETCH_AND_1] = "__atomic_fetch_and_1"; - Names[RTLIB::ATOMIC_FETCH_AND_2] 
= "__atomic_fetch_and_2"; - Names[RTLIB::ATOMIC_FETCH_AND_4] = "__atomic_fetch_and_4"; - Names[RTLIB::ATOMIC_FETCH_AND_8] = "__atomic_fetch_and_8"; - Names[RTLIB::ATOMIC_FETCH_AND_16] = "__atomic_fetch_and_16"; - Names[RTLIB::ATOMIC_FETCH_OR_1] = "__atomic_fetch_or_1"; - Names[RTLIB::ATOMIC_FETCH_OR_2] = "__atomic_fetch_or_2"; - Names[RTLIB::ATOMIC_FETCH_OR_4] = "__atomic_fetch_or_4"; - Names[RTLIB::ATOMIC_FETCH_OR_8] = "__atomic_fetch_or_8"; - Names[RTLIB::ATOMIC_FETCH_OR_16] = "__atomic_fetch_or_16"; - Names[RTLIB::ATOMIC_FETCH_XOR_1] = "__atomic_fetch_xor_1"; - Names[RTLIB::ATOMIC_FETCH_XOR_2] = "__atomic_fetch_xor_2"; - Names[RTLIB::ATOMIC_FETCH_XOR_4] = "__atomic_fetch_xor_4"; - Names[RTLIB::ATOMIC_FETCH_XOR_8] = "__atomic_fetch_xor_8"; - Names[RTLIB::ATOMIC_FETCH_XOR_16] = "__atomic_fetch_xor_16"; - Names[RTLIB::ATOMIC_FETCH_NAND_1] = "__atomic_fetch_nand_1"; - Names[RTLIB::ATOMIC_FETCH_NAND_2] = "__atomic_fetch_nand_2"; - Names[RTLIB::ATOMIC_FETCH_NAND_4] = "__atomic_fetch_nand_4"; - Names[RTLIB::ATOMIC_FETCH_NAND_8] = "__atomic_fetch_nand_8"; - Names[RTLIB::ATOMIC_FETCH_NAND_16] = "__atomic_fetch_nand_16"; - + if (TT.getEnvironment() == Triple::GNU) { Names[RTLIB::SINCOS_F32] = "sincosf"; Names[RTLIB::SINCOS_F64] = "sincos"; @@ -836,9 +777,6 @@ TargetLoweringBase::TargetLoweringBase(const TargetMachine &tm) : TM(tm) { GatherAllAliasesMaxDepth = 6; MinStackArgumentAlignment = 1; MinimumJumpTableEntries = 4; - // TODO: the default will be switched to 0 in the next commit, along - // with the Target-specific changes necessary. - MaxAtomicSizeInBitsSupported = 1024; InitLibcallNames(LibcallRoutineNames, TM.getTargetTriple()); InitCmpLibcallCCs(CmpLibcallCCs); diff --git a/llvm/lib/Target/Sparc/SparcISelLowering.cpp b/llvm/lib/Target/Sparc/SparcISelLowering.cpp index 32d88f9be74..d9f00095589 100644 --- a/llvm/lib/Target/Sparc/SparcISelLowering.cpp +++ b/llvm/lib/Target/Sparc/SparcISelLowering.cpp @@ -1611,13 +1611,6 @@ SparcTargetLowering::SparcTargetLowering(TargetMachine &TM, } // ATOMICs. - // Atomics are only supported on Sparcv9. (32bit atomics are also - // supported by the Leon sparcv8 variant, but we don't support that - // yet.) - if (Subtarget->isV9()) - setMaxAtomicSizeInBitsSupported(64); - else - setMaxAtomicSizeInBitsSupported(0); setOperationAction(ISD::ATOMIC_SWAP, MVT::i32, Legal); setOperationAction(ISD::ATOMIC_CMP_SWAP, MVT::i32, diff --git a/llvm/test/Transforms/AtomicExpand/SPARC/libcalls.ll b/llvm/test/Transforms/AtomicExpand/SPARC/libcalls.ll deleted file mode 100644 index afab7a39b27..00000000000 --- a/llvm/test/Transforms/AtomicExpand/SPARC/libcalls.ll +++ /dev/null @@ -1,257 +0,0 @@ -; RUN: opt -S %s -atomic-expand | FileCheck %s - -;;; NOTE: this test is actually target-independent -- any target which -;;; doesn't support inline atomics can be used. (E.g. X86 i386 would -;;; work, if LLVM is properly taught about what it's missing vs i586.) - -;target datalayout = "e-m:e-p:32:32-f64:32:64-f80:32-n8:16:32-S128" -;target triple = "i386-unknown-unknown" -target datalayout = "e-m:e-p:32:32-i64:64-f128:64-n32-S64" -target triple = "sparc-unknown-unknown" - -;; First, check the sized calls. Except for cmpxchg, these are fairly -;; straightforward. 
- -; CHECK-LABEL: @test_load_i16( -; CHECK: %1 = bitcast i16* %arg to i8* -; CHECK: %2 = call i16 @__atomic_load_2(i8* %1, i32 5) -; CHECK: ret i16 %2 -define i16 @test_load_i16(i16* %arg) { - %ret = load atomic i16, i16* %arg seq_cst, align 4 - ret i16 %ret -} - -; CHECK-LABEL: @test_store_i16( -; CHECK: %1 = bitcast i16* %arg to i8* -; CHECK: call void @__atomic_store_2(i8* %1, i16 %val, i32 5) -; CHECK: ret void -define void @test_store_i16(i16* %arg, i16 %val) { - store atomic i16 %val, i16* %arg seq_cst, align 4 - ret void -} - -; CHECK-LABEL: @test_exchange_i16( -; CHECK: %1 = bitcast i16* %arg to i8* -; CHECK: %2 = call i16 @__atomic_exchange_2(i8* %1, i16 %val, i32 5) -; CHECK: ret i16 %2 -define i16 @test_exchange_i16(i16* %arg, i16 %val) { - %ret = atomicrmw xchg i16* %arg, i16 %val seq_cst - ret i16 %ret -} - -; CHECK-LABEL: @test_cmpxchg_i16( -; CHECK: %1 = bitcast i16* %arg to i8* -; CHECK: %2 = alloca i16, align 2 -; CHECK: %3 = bitcast i16* %2 to i8* -; CHECK: call void @llvm.lifetime.start(i64 2, i8* %3) -; CHECK: store i16 %old, i16* %2, align 2 -; CHECK: %4 = call zeroext i1 @__atomic_compare_exchange_2(i8* %1, i8* %3, i16 %new, i32 5, i32 0) -; CHECK: %5 = load i16, i16* %2, align 2 -; CHECK: call void @llvm.lifetime.end(i64 2, i8* %3) -; CHECK: %6 = insertvalue { i16, i1 } undef, i16 %5, 0 -; CHECK: %7 = insertvalue { i16, i1 } %6, i1 %4, 1 -; CHECK: %ret = extractvalue { i16, i1 } %7, 0 -; CHECK: ret i16 %ret -define i16 @test_cmpxchg_i16(i16* %arg, i16 %old, i16 %new) { - %ret_succ = cmpxchg i16* %arg, i16 %old, i16 %new seq_cst monotonic - %ret = extractvalue { i16, i1 } %ret_succ, 0 - ret i16 %ret -} - -; CHECK-LABEL: @test_add_i16( -; CHECK: %1 = bitcast i16* %arg to i8* -; CHECK: %2 = call i16 @__atomic_fetch_add_2(i8* %1, i16 %val, i32 5) -; CHECK: ret i16 %2 -define i16 @test_add_i16(i16* %arg, i16 %val) { - %ret = atomicrmw add i16* %arg, i16 %val seq_cst - ret i16 %ret -} - - -;; Now, check the output for the unsized libcalls. i128 is used for -;; these tests because the "16" suffixed functions aren't available on -;; 32-bit i386. 
- -; CHECK-LABEL: @test_load_i128( -; CHECK: %1 = bitcast i128* %arg to i8* -; CHECK: %2 = alloca i128, align 8 -; CHECK: %3 = bitcast i128* %2 to i8* -; CHECK: call void @llvm.lifetime.start(i64 16, i8* %3) -; CHECK: call void @__atomic_load(i32 16, i8* %1, i8* %3, i32 5) -; CHECK: %4 = load i128, i128* %2, align 8 -; CHECK: call void @llvm.lifetime.end(i64 16, i8* %3) -; CHECK: ret i128 %4 -define i128 @test_load_i128(i128* %arg) { - %ret = load atomic i128, i128* %arg seq_cst, align 16 - ret i128 %ret -} - -; CHECK-LABEL @test_store_i128( -; CHECK: %1 = bitcast i128* %arg to i8* -; CHECK: %2 = alloca i128, align 8 -; CHECK: %3 = bitcast i128* %2 to i8* -; CHECK: call void @llvm.lifetime.start(i64 16, i8* %3) -; CHECK: store i128 %val, i128* %2, align 8 -; CHECK: call void @__atomic_store(i32 16, i8* %1, i8* %3, i32 5) -; CHECK: call void @llvm.lifetime.end(i64 16, i8* %3) -; CHECK: ret void -define void @test_store_i128(i128* %arg, i128 %val) { - store atomic i128 %val, i128* %arg seq_cst, align 16 - ret void -} - -; CHECK-LABEL: @test_exchange_i128( -; CHECK: %1 = bitcast i128* %arg to i8* -; CHECK: %2 = alloca i128, align 8 -; CHECK: %3 = bitcast i128* %2 to i8* -; CHECK: call void @llvm.lifetime.start(i64 16, i8* %3) -; CHECK: store i128 %val, i128* %2, align 8 -; CHECK: %4 = alloca i128, align 8 -; CHECK: %5 = bitcast i128* %4 to i8* -; CHECK: call void @llvm.lifetime.start(i64 16, i8* %5) -; CHECK: call void @__atomic_exchange(i32 16, i8* %1, i8* %3, i8* %5, i32 5) -; CHECK: call void @llvm.lifetime.end(i64 16, i8* %3) -; CHECK: %6 = load i128, i128* %4, align 8 -; CHECK: call void @llvm.lifetime.end(i64 16, i8* %5) -; CHECK: ret i128 %6 -define i128 @test_exchange_i128(i128* %arg, i128 %val) { - %ret = atomicrmw xchg i128* %arg, i128 %val seq_cst - ret i128 %ret -} - -; CHECK-LABEL: @test_cmpxchg_i128( -; CHECK: %1 = bitcast i128* %arg to i8* -; CHECK: %2 = alloca i128, align 8 -; CHECK: %3 = bitcast i128* %2 to i8* -; CHECK: call void @llvm.lifetime.start(i64 16, i8* %3) -; CHECK: store i128 %old, i128* %2, align 8 -; CHECK: %4 = alloca i128, align 8 -; CHECK: %5 = bitcast i128* %4 to i8* -; CHECK: call void @llvm.lifetime.start(i64 16, i8* %5) -; CHECK: store i128 %new, i128* %4, align 8 -; CHECK: %6 = call zeroext i1 @__atomic_compare_exchange(i32 16, i8* %1, i8* %3, i8* %5, i32 5, i32 0) -; CHECK: call void @llvm.lifetime.end(i64 16, i8* %5) -; CHECK: %7 = load i128, i128* %2, align 8 -; CHECK: call void @llvm.lifetime.end(i64 16, i8* %3) -; CHECK: %8 = insertvalue { i128, i1 } undef, i128 %7, 0 -; CHECK: %9 = insertvalue { i128, i1 } %8, i1 %6, 1 -; CHECK: %ret = extractvalue { i128, i1 } %9, 0 -; CHECK: ret i128 %ret -define i128 @test_cmpxchg_i128(i128* %arg, i128 %old, i128 %new) { - %ret_succ = cmpxchg i128* %arg, i128 %old, i128 %new seq_cst monotonic - %ret = extractvalue { i128, i1 } %ret_succ, 0 - ret i128 %ret -} - -; This one is a verbose expansion, as there is no generic -; __atomic_fetch_add function, so it needs to expand to a cmpxchg -; loop, which then itself expands into a libcall. 
- -; CHECK-LABEL: @test_add_i128( -; CHECK: %1 = alloca i128, align 8 -; CHECK: %2 = alloca i128, align 8 -; CHECK: %3 = load i128, i128* %arg, align 16 -; CHECK: br label %atomicrmw.start -; CHECK:atomicrmw.start: -; CHECK: %loaded = phi i128 [ %3, %0 ], [ %newloaded, %atomicrmw.start ] -; CHECK: %new = add i128 %loaded, %val -; CHECK: %4 = bitcast i128* %arg to i8* -; CHECK: %5 = bitcast i128* %1 to i8* -; CHECK: call void @llvm.lifetime.start(i64 16, i8* %5) -; CHECK: store i128 %loaded, i128* %1, align 8 -; CHECK: %6 = bitcast i128* %2 to i8* -; CHECK: call void @llvm.lifetime.start(i64 16, i8* %6) -; CHECK: store i128 %new, i128* %2, align 8 -; CHECK: %7 = call zeroext i1 @__atomic_compare_exchange(i32 16, i8* %4, i8* %5, i8* %6, i32 5, i32 5) -; CHECK: call void @llvm.lifetime.end(i64 16, i8* %6) -; CHECK: %8 = load i128, i128* %1, align 8 -; CHECK: call void @llvm.lifetime.end(i64 16, i8* %5) -; CHECK: %9 = insertvalue { i128, i1 } undef, i128 %8, 0 -; CHECK: %10 = insertvalue { i128, i1 } %9, i1 %7, 1 -; CHECK: %success = extractvalue { i128, i1 } %10, 1 -; CHECK: %newloaded = extractvalue { i128, i1 } %10, 0 -; CHECK: br i1 %success, label %atomicrmw.end, label %atomicrmw.start -; CHECK:atomicrmw.end: -; CHECK: ret i128 %newloaded -define i128 @test_add_i128(i128* %arg, i128 %val) { - %ret = atomicrmw add i128* %arg, i128 %val seq_cst - ret i128 %ret -} - -;; Ensure that non-integer types get bitcast correctly on the way in and out of a libcall: - -; CHECK-LABEL: @test_load_double( -; CHECK: %1 = bitcast double* %arg to i8* -; CHECK: %2 = call i64 @__atomic_load_8(i8* %1, i32 5) -; CHECK: %3 = bitcast i64 %2 to double -; CHECK: ret double %3 -define double @test_load_double(double* %arg, double %val) { - %1 = load atomic double, double* %arg seq_cst, align 16 - ret double %1 -} - -; CHECK-LABEL: @test_store_double( -; CHECK: %1 = bitcast double* %arg to i8* -; CHECK: %2 = bitcast double %val to i64 -; CHECK: call void @__atomic_store_8(i8* %1, i64 %2, i32 5) -; CHECK: ret void -define void @test_store_double(double* %arg, double %val) { - store atomic double %val, double* %arg seq_cst, align 16 - ret void -} - -; CHECK-LABEL: @test_cmpxchg_ptr( -; CHECK: %1 = bitcast i16** %arg to i8* -; CHECK: %2 = alloca i16*, align 4 -; CHECK: %3 = bitcast i16** %2 to i8* -; CHECK: call void @llvm.lifetime.start(i64 4, i8* %3) -; CHECK: store i16* %old, i16** %2, align 4 -; CHECK: %4 = ptrtoint i16* %new to i32 -; CHECK: %5 = call zeroext i1 @__atomic_compare_exchange_4(i8* %1, i8* %3, i32 %4, i32 5, i32 2) -; CHECK: %6 = load i16*, i16** %2, align 4 -; CHECK: call void @llvm.lifetime.end(i64 4, i8* %3) -; CHECK: %7 = insertvalue { i16*, i1 } undef, i16* %6, 0 -; CHECK: %8 = insertvalue { i16*, i1 } %7, i1 %5, 1 -; CHECK: %ret = extractvalue { i16*, i1 } %8, 0 -; CHECK: ret i16* %ret -; CHECK: } -define i16* @test_cmpxchg_ptr(i16** %arg, i16* %old, i16* %new) { - %ret_succ = cmpxchg i16** %arg, i16* %old, i16* %new seq_cst acquire - %ret = extractvalue { i16*, i1 } %ret_succ, 0 - ret i16* %ret -} - -;; ...and for a non-integer type of large size too. 
- -; CHECK-LABEL: @test_store_fp128 -; CHECK: %1 = bitcast fp128* %arg to i8* -; CHECK: %2 = alloca fp128, align 8 -; CHECK: %3 = bitcast fp128* %2 to i8* -; CHECK: call void @llvm.lifetime.start(i64 16, i8* %3) -; CHECK: store fp128 %val, fp128* %2, align 8 -; CHECK: call void @__atomic_store(i32 16, i8* %1, i8* %3, i32 5) -; CHECK: call void @llvm.lifetime.end(i64 16, i8* %3) -; CHECK: ret void -define void @test_store_fp128(fp128* %arg, fp128 %val) { - store atomic fp128 %val, fp128* %arg seq_cst, align 16 - ret void -} - -;; Unaligned loads and stores should be expanded to the generic -;; libcall, just like large loads/stores, and not a specialized one. -;; NOTE: atomicrmw and cmpxchg don't yet support an align attribute; -;; when such support is added, they should also be tested here. - -; CHECK-LABEL: @test_unaligned_load_i16( -; CHECK: __atomic_load( -define i16 @test_unaligned_load_i16(i16* %arg) { - %ret = load atomic i16, i16* %arg seq_cst, align 1 - ret i16 %ret -} - -; CHECK-LABEL: @test_unaligned_store_i16( -; CHECK: __atomic_store( -define void @test_unaligned_store_i16(i16* %arg, i16 %val) { - store atomic i16 %val, i16* %arg seq_cst, align 1 - ret void -} diff --git a/llvm/test/Transforms/AtomicExpand/SPARC/lit.local.cfg b/llvm/test/Transforms/AtomicExpand/SPARC/lit.local.cfg deleted file mode 100644 index 9a34b657815..00000000000 --- a/llvm/test/Transforms/AtomicExpand/SPARC/lit.local.cfg +++ /dev/null @@ -1,2 +0,0 @@ -if not 'Sparc' in config.root.targets: - config.unsupported = True |
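For readers following the removed Atomics.rst text above: it observes that the generic ``__atomic_*`` entry points can be implemented in an entirely target-independent way by falling back to a mutex. The C sketch below illustrates that idea for the two generic prototypes quoted in the removed section, ``__atomic_load`` and ``__atomic_compare_exchange``. It is an illustration only, not part of the patch: the ``generic_`` function names, the single global lock, and the pthread-based locking are assumptions standing in for libatomic's real (reserved-name, lock-sharded) implementation.

/*
 * Illustrative sketch only -- not part of the patch.  These functions
 * mirror the documented prototypes of __atomic_load and
 * __atomic_compare_exchange from the removed Atomics.rst text.  The
 * names are changed (generic_*) because the real symbols are reserved
 * and normally provided by libatomic, and one global lock stands in
 * for libatomic's per-address lock sharding.
 */
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

static pthread_mutex_t atomic_lock = PTHREAD_MUTEX_INITIALIZER;

/* Mirrors: void __atomic_load(size_t size, void *ptr, void *ret, int ordering) */
void generic_atomic_load(size_t size, void *ptr, void *ret, int ordering) {
  (void)ordering;                     /* the lock already gives seq_cst behaviour */
  pthread_mutex_lock(&atomic_lock);
  memcpy(ret, ptr, size);             /* copy the current value out under the lock */
  pthread_mutex_unlock(&atomic_lock);
}

/* Mirrors: bool __atomic_compare_exchange(size_t size, void *ptr, void *expected,
 *                                         void *desired, int success_order,
 *                                         int failure_order) */
bool generic_atomic_compare_exchange(size_t size, void *ptr, void *expected,
                                     void *desired, int success_order,
                                     int failure_order) {
  bool ok;
  (void)success_order;
  (void)failure_order;
  pthread_mutex_lock(&atomic_lock);
  ok = memcmp(ptr, expected, size) == 0;
  if (ok)
    memcpy(ptr, desired, size);       /* success: install the new value        */
  else
    memcpy(expected, ptr, size);      /* failure: report the value actually seen */
  pthread_mutex_unlock(&atomic_lock);
  return ok;
}

Because the lock is shared state, an implementation of this shape has to live in a shared library visible to every DSO in the process, which is exactly the deployment constraint the removed text describes for the ``__atomic_*`` family (and why the lock-based ``__atomic_*`` routines must never be mixed with the lock-free ``__sync_*`` ones on the same object).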