-rw-r--r-- | llvm/docs/Atomics.rst                                 |  174
-rw-r--r-- | llvm/include/llvm/CodeGen/RuntimeLibcalls.h           |   73
-rw-r--r-- | llvm/include/llvm/Target/TargetLowering.h             |   19
-rw-r--r-- | llvm/lib/CodeGen/AtomicExpandPass.cpp                 |  500
-rw-r--r-- | llvm/lib/CodeGen/TargetLoweringBase.cpp               |   64
-rw-r--r-- | llvm/lib/Target/Sparc/SparcISelLowering.cpp           |    7
-rw-r--r-- | llvm/test/Transforms/AtomicExpand/SPARC/libcalls.ll   |  257
-rw-r--r-- | llvm/test/Transforms/AtomicExpand/SPARC/lit.local.cfg |    2
8 files changed, 22 insertions, 1074 deletions
diff --git a/llvm/docs/Atomics.rst b/llvm/docs/Atomics.rst index 89f5f44dae6..ff667480446 100644 --- a/llvm/docs/Atomics.rst +++ b/llvm/docs/Atomics.rst @@ -413,28 +413,19 @@ The MachineMemOperand for all atomic operations is currently marked as volatile; this is not correct in the IR sense of volatile, but CodeGen handles anything marked volatile very conservatively. This should get fixed at some point. -One very important property of the atomic operations is that if your backend -supports any inline lock-free atomic operations of a given size, you should -support *ALL* operations of that size in a lock-free manner. - -When the target implements atomic ``cmpxchg`` or LL/SC instructions (as most do) -this is trivial: all the other operations can be implemented on top of those -primitives. However, on many older CPUs (e.g. ARMv5, SparcV8, Intel 80386) there -are atomic load and store instructions, but no ``cmpxchg`` or LL/SC. As it is -invalid to implement ``atomic load`` using the native instruction, but -``cmpxchg`` using a library call to a function that uses a mutex, ``atomic -load`` must *also* expand to a library call on such architectures, so that it -can remain atomic with regards to a simultaneous ``cmpxchg``, by using the same -mutex. - -AtomicExpandPass can help with that: it will expand all atomic operations to the -proper ``__atomic_*`` libcalls for any size above the maximum set by -``setMaxAtomicSizeInBitsSupported`` (which defaults to 0). +Common architectures have some way of representing at least a pointer-sized +lock-free ``cmpxchg``; such an operation can be used to implement all the other +atomic operations which can be represented in IR up to that size. Backends are +expected to implement all those operations, but not operations which cannot be +implemented in a lock-free manner. It is expected that backends will give an +error when given an operation which cannot be implemented. (The LLVM code +generator is not very helpful here at the moment, but hopefully that will +change.) On x86, all atomic loads generate a ``MOV``. SequentiallyConsistent stores generate an ``XCHG``, other stores generate a ``MOV``. SequentiallyConsistent fences generate an ``MFENCE``, other fences do not cause any code to be -generated. ``cmpxchg`` uses the ``LOCK CMPXCHG`` instruction. ``atomicrmw xchg`` +generated. cmpxchg uses the ``LOCK CMPXCHG`` instruction. ``atomicrmw xchg`` uses ``XCHG``, ``atomicrmw add`` and ``atomicrmw sub`` use ``XADD``, and all other ``atomicrmw`` operations generate a loop with ``LOCK CMPXCHG``. Depending on the users of the result, some ``atomicrmw`` operations can be translated into @@ -455,151 +446,10 @@ atomic constructs. Here are some lowerings it can do: ``emitStoreConditional()`` * large loads/stores -> ll-sc/cmpxchg by overriding ``shouldExpandAtomicStoreInIR()``/``shouldExpandAtomicLoadInIR()`` -* strong atomic accesses -> monotonic accesses + fences by overriding - ``shouldInsertFencesForAtomic()``, ``emitLeadingFence()``, and - ``emitTrailingFence()`` +* strong atomic accesses -> monotonic accesses + fences + by using ``setInsertFencesForAtomic()`` and overriding ``emitLeadingFence()`` + and ``emitTrailingFence()`` * atomic rmw -> loop with cmpxchg or load-linked/store-conditional by overriding ``expandAtomicRMWInIR()`` -* expansion to __atomic_* libcalls for unsupported sizes. For an example of all of these, look at the ARM backend. - -Libcalls: __atomic_* -==================== - -There are two kinds of atomic library calls that are generated by LLVM. 
Please -note that both sets of library functions somewhat confusingly share the names of -builtin functions defined by clang. Despite this, the library functions are -not directly related to the builtins: it is *not* the case that ``__atomic_*`` -builtins lower to ``__atomic_*`` library calls and ``__sync_*`` builtins lower -to ``__sync_*`` library calls. - -The first set of library functions are named ``__atomic_*``. This set has been -"standardized" by GCC, and is described below. (See also `GCC's documentation -<https://gcc.gnu.org/wiki/Atomic/GCCMM/LIbrary>`_) - -LLVM's AtomicExpandPass will translate atomic operations on data sizes above -``MaxAtomicSizeInBitsSupported`` into calls to these functions. - -There are four generic functions, which can be called with data of any size or -alignment:: - - void __atomic_load(size_t size, void *ptr, void *ret, int ordering) - void __atomic_store(size_t size, void *ptr, void *val, int ordering) - void __atomic_exchange(size_t size, void *ptr, void *val, void *ret, int ordering) - bool __atomic_compare_exchange(size_t size, void *ptr, void *expected, void *desired, int success_order, int failure_order) - -There are also size-specialized versions of the above functions, which can only -be used with *naturally-aligned* pointers of the appropriate size. In the -signatures below, "N" is one of 1, 2, 4, 8, and 16, and "iN" is the appropriate -integer type of that size; if no such integer type exists, the specialization -cannot be used:: - - iN __atomic_load_N(iN *ptr, iN val, int ordering) - void __atomic_store_N(iN *ptr, iN val, int ordering) - iN __atomic_exchange_N(iN *ptr, iN val, int ordering) - bool __atomic_compare_exchange_N(iN *ptr, iN *expected, iN desired, int success_order, int failure_order) - -Finally there are some read-modify-write functions, which are only available in -the size-specific variants (any other sizes use a ``__atomic_compare_exchange`` -loop):: - - iN __atomic_fetch_add_N(iN *ptr, iN val, int ordering) - iN __atomic_fetch_sub_N(iN *ptr, iN val, int ordering) - iN __atomic_fetch_and_N(iN *ptr, iN val, int ordering) - iN __atomic_fetch_or_N(iN *ptr, iN val, int ordering) - iN __atomic_fetch_xor_N(iN *ptr, iN val, int ordering) - iN __atomic_fetch_nand_N(iN *ptr, iN val, int ordering) - -This set of library functions have some interesting implementation requirements -to take note of: - -- They support all sizes and alignments -- including those which cannot be - implemented natively on any existing hardware. Therefore, they will certainly - use mutexes in for some sizes/alignments. - -- As a consequence, they cannot be shipped in a statically linked - compiler-support library, as they have state which must be shared amongst all - DSOs loaded in the program. They must be provided in a shared library used by - all objects. - -- The set of atomic sizes supported lock-free must be a superset of the sizes - any compiler can emit. That is: if a new compiler introduces support for - inline-lock-free atomics of size N, the ``__atomic_*`` functions must also have a - lock-free implementation for size N. This is a requirement so that code - produced by an old compiler (which will have called the ``__atomic_*`` function) - interoperates with code produced by the new compiler (which will use native - the atomic instruction). 
- -Note that it's possible to write an entirely target-independent implementation -of these library functions by using the compiler atomic builtins themselves to -implement the operations on naturally-aligned pointers of supported sizes, and a -generic mutex implementation otherwise. - -Libcalls: __sync_* -================== - -Some targets or OS/target combinations can support lock-free atomics, but for -various reasons, it is not practical to emit the instructions inline. - -There's two typical examples of this. - -Some CPUs support multiple instruction sets which can be swiched back and forth -on function-call boundaries. For example, MIPS supports the MIPS16 ISA, which -has a smaller instruction encoding than the usual MIPS32 ISA. ARM, similarly, -has the Thumb ISA. In MIPS16 and earlier versions of Thumb, the atomic -instructions are not encodable. However, those instructions are available via a -function call to a function with the longer encoding. - -Additionally, a few OS/target pairs provide kernel-supported lock-free -atomics. ARM/Linux is an example of this: the kernel `provides -<https://www.kernel.org/doc/Documentation/arm/kernel_user_helpers.txt>`_ a -function which on older CPUs contains a "magically-restartable" atomic sequence -(which looks atomic so long as there's only one CPU), and contains actual atomic -instructions on newer multicore models. This sort of functionality can typically -be provided on any architecture, if all CPUs which are missing atomic -compare-and-swap support are uniprocessor (no SMP). This is almost always the -case. The only common architecture without that property is SPARC -- SPARCV8 SMP -systems were common, yet it doesn't support any sort of compare-and-swap -operation. - -In either of these cases, the Target in LLVM can claim support for atomics of an -appropriate size, and then implement some subset of the operations via libcalls -to a ``__sync_*`` function. Such functions *must* not use locks in their -implementation, because unlike the ``__atomic_*`` routines used by -AtomicExpandPass, these may be mixed-and-matched with native instructions by the -target lowering. - -Further, these routines do not need to be shared, as they are stateless. So, -there is no issue with having multiple copies included in one binary. Thus, -typically these routines are implemented by the statically-linked compiler -runtime support library. - -LLVM will emit a call to an appropriate ``__sync_*`` routine if the target -ISelLowering code has set the corresponding ``ATOMIC_CMPXCHG``, ``ATOMIC_SWAP``, -or ``ATOMIC_LOAD_*`` operation to "Expand", and if it has opted-into the -availablity of those library functions via a call to ``initSyncLibcalls()``. 
- -The full set of functions that may be called by LLVM is (for ``N`` being 1, 2, -4, 8, or 16):: - - iN __sync_val_compare_and_swap_N(iN *ptr, iN expected, iN desired) - iN __sync_lock_test_and_set_N(iN *ptr, iN val) - iN __sync_fetch_and_add_N(iN *ptr, iN val) - iN __sync_fetch_and_sub_N(iN *ptr, iN val) - iN __sync_fetch_and_and_N(iN *ptr, iN val) - iN __sync_fetch_and_or_N(iN *ptr, iN val) - iN __sync_fetch_and_xor_N(iN *ptr, iN val) - iN __sync_fetch_and_nand_N(iN *ptr, iN val) - iN __sync_fetch_and_max_N(iN *ptr, iN val) - iN __sync_fetch_and_umax_N(iN *ptr, iN val) - iN __sync_fetch_and_min_N(iN *ptr, iN val) - iN __sync_fetch_and_umin_N(iN *ptr, iN val) - -This list doesn't include any function for atomic load or store; all known -architectures support atomic loads and stores directly (possibly by emitting a -fence on either side of a normal load or store.) - -There's also, somewhat separately, the possibility to lower ``ATOMIC_FENCE`` to -``__sync_synchronize()``. This may happen or not happen independent of all the -above, controlled purely by ``setOperationAction(ISD::ATOMIC_FENCE, ...)``. diff --git a/llvm/include/llvm/CodeGen/RuntimeLibcalls.h b/llvm/include/llvm/CodeGen/RuntimeLibcalls.h index cfd13a3250c..bdf8d9ce086 100644 --- a/llvm/include/llvm/CodeGen/RuntimeLibcalls.h +++ b/llvm/include/llvm/CodeGen/RuntimeLibcalls.h @@ -336,11 +336,7 @@ namespace RTLIB { // EXCEPTION HANDLING UNWIND_RESUME, - // Note: there's two sets of atomics libcalls; see - // <http://llvm.org/docs/Atomics.html> for more info on the - // difference between them. - - // Atomic '__sync_*' libcalls. + // Family ATOMICs SYNC_VAL_COMPARE_AND_SWAP_1, SYNC_VAL_COMPARE_AND_SWAP_2, SYNC_VAL_COMPARE_AND_SWAP_4, @@ -402,73 +398,6 @@ namespace RTLIB { SYNC_FETCH_AND_UMIN_8, SYNC_FETCH_AND_UMIN_16, - // Atomic '__atomic_*' libcalls. - ATOMIC_LOAD, - ATOMIC_LOAD_1, - ATOMIC_LOAD_2, - ATOMIC_LOAD_4, - ATOMIC_LOAD_8, - ATOMIC_LOAD_16, - - ATOMIC_STORE, - ATOMIC_STORE_1, - ATOMIC_STORE_2, - ATOMIC_STORE_4, - ATOMIC_STORE_8, - ATOMIC_STORE_16, - - ATOMIC_EXCHANGE, - ATOMIC_EXCHANGE_1, - ATOMIC_EXCHANGE_2, - ATOMIC_EXCHANGE_4, - ATOMIC_EXCHANGE_8, - ATOMIC_EXCHANGE_16, - - ATOMIC_COMPARE_EXCHANGE, - ATOMIC_COMPARE_EXCHANGE_1, - ATOMIC_COMPARE_EXCHANGE_2, - ATOMIC_COMPARE_EXCHANGE_4, - ATOMIC_COMPARE_EXCHANGE_8, - ATOMIC_COMPARE_EXCHANGE_16, - - ATOMIC_FETCH_ADD_1, - ATOMIC_FETCH_ADD_2, - ATOMIC_FETCH_ADD_4, - ATOMIC_FETCH_ADD_8, - ATOMIC_FETCH_ADD_16, - - ATOMIC_FETCH_SUB_1, - ATOMIC_FETCH_SUB_2, - ATOMIC_FETCH_SUB_4, - ATOMIC_FETCH_SUB_8, - ATOMIC_FETCH_SUB_16, - - ATOMIC_FETCH_AND_1, - ATOMIC_FETCH_AND_2, - ATOMIC_FETCH_AND_4, - ATOMIC_FETCH_AND_8, - ATOMIC_FETCH_AND_16, - - ATOMIC_FETCH_OR_1, - ATOMIC_FETCH_OR_2, - ATOMIC_FETCH_OR_4, - ATOMIC_FETCH_OR_8, - ATOMIC_FETCH_OR_16, - - ATOMIC_FETCH_XOR_1, - ATOMIC_FETCH_XOR_2, - ATOMIC_FETCH_XOR_4, - ATOMIC_FETCH_XOR_8, - ATOMIC_FETCH_XOR_16, - - ATOMIC_FETCH_NAND_1, - ATOMIC_FETCH_NAND_2, - ATOMIC_FETCH_NAND_4, - ATOMIC_FETCH_NAND_8, - ATOMIC_FETCH_NAND_16, - - ATOMIC_IS_LOCK_FREE, - // Stack Protector Fail. STACKPROTECTOR_CHECK_FAIL, diff --git a/llvm/include/llvm/Target/TargetLowering.h b/llvm/include/llvm/Target/TargetLowering.h index 994071bf9d2..7c97f3c05a2 100644 --- a/llvm/include/llvm/Target/TargetLowering.h +++ b/llvm/include/llvm/Target/TargetLowering.h @@ -1059,14 +1059,6 @@ public: /// \name Helpers for atomic expansion. /// @{ - /// Returns the maximum atomic operation size (in bits) supported by - /// the backend. 
Atomic operations greater than this size (as well - /// as ones that are not naturally aligned), will be expanded by - /// AtomicExpandPass into an __atomic_* library call. - unsigned getMaxAtomicSizeInBitsSupported() const { - return MaxAtomicSizeInBitsSupported; - } - /// Whether AtomicExpandPass should automatically insert fences and reduce /// ordering for this atomic. This should be true for most architectures with /// weak memory ordering. Defaults to false. @@ -1464,14 +1456,6 @@ protected: MinStackArgumentAlignment = Align; } - /// Set the maximum atomic operation size supported by the - /// backend. Atomic operations greater than this size (as well as - /// ones that are not naturally aligned), will be expanded by - /// AtomicExpandPass into an __atomic_* library call. - void setMaxAtomicSizeInBitsSupported(unsigned SizeInBits) { - MaxAtomicSizeInBitsSupported = SizeInBits; - } - public: //===--------------------------------------------------------------------===// // Addressing mode description hooks (used by LSR etc). @@ -1881,9 +1865,6 @@ private: /// The preferred loop alignment. unsigned PrefLoopAlignment; - /// Size in bits of the maximum atomics size the backend supports. - /// Accesses larger than this will be expanded by AtomicExpandPass. - unsigned MaxAtomicSizeInBitsSupported; /// If set to a physical register, this specifies the register that /// llvm.savestack/llvm.restorestack should save and restore. diff --git a/llvm/lib/CodeGen/AtomicExpandPass.cpp b/llvm/lib/CodeGen/AtomicExpandPass.cpp index 6d69de12422..8c0c0f4acba 100644 --- a/llvm/lib/CodeGen/AtomicExpandPass.cpp +++ b/llvm/lib/CodeGen/AtomicExpandPass.cpp @@ -8,10 +8,10 @@ //===----------------------------------------------------------------------===// // // This file contains a pass (at IR level) to replace atomic instructions with -// __atomic_* library calls, or target specific instruction which implement the -// same semantics in a way which better fits the target backend. This can -// include the use of (intrinsic-based) load-linked/store-conditional loops, -// AtomicCmpXchg, or type coercions. +// target specific instruction which implement the same semantics in a way +// which better fits the target backend. This can include the use of either +// (intrinsic-based) load-linked/store-conditional loops, AtomicCmpXchg, or +// type coercions. 
// //===----------------------------------------------------------------------===// @@ -64,95 +64,19 @@ namespace { bool expandAtomicCmpXchg(AtomicCmpXchgInst *CI); bool isIdempotentRMW(AtomicRMWInst *AI); bool simplifyIdempotentRMW(AtomicRMWInst *AI); - - bool expandAtomicOpToLibcall(Instruction *I, unsigned Size, unsigned Align, - Value *PointerOperand, Value *ValueOperand, - Value *CASExpected, AtomicOrdering Ordering, - AtomicOrdering Ordering2, - ArrayRef<RTLIB::Libcall> Libcalls); - void expandAtomicLoadToLibcall(LoadInst *LI); - void expandAtomicStoreToLibcall(StoreInst *LI); - void expandAtomicRMWToLibcall(AtomicRMWInst *I); - void expandAtomicCASToLibcall(AtomicCmpXchgInst *I); }; } char AtomicExpand::ID = 0; char &llvm::AtomicExpandID = AtomicExpand::ID; -INITIALIZE_TM_PASS(AtomicExpand, "atomic-expand", "Expand Atomic instructions", - false, false) +INITIALIZE_TM_PASS(AtomicExpand, "atomic-expand", + "Expand Atomic calls in terms of either load-linked & store-conditional or cmpxchg", + false, false) FunctionPass *llvm::createAtomicExpandPass(const TargetMachine *TM) { return new AtomicExpand(TM); } -namespace { -// Helper functions to retrieve the size of atomic instructions. -unsigned getAtomicOpSize(LoadInst *LI) { - const DataLayout &DL = LI->getModule()->getDataLayout(); - return DL.getTypeStoreSize(LI->getType()); -} - -unsigned getAtomicOpSize(StoreInst *SI) { - const DataLayout &DL = SI->getModule()->getDataLayout(); - return DL.getTypeStoreSize(SI->getValueOperand()->getType()); -} - -unsigned getAtomicOpSize(AtomicRMWInst *RMWI) { - const DataLayout &DL = RMWI->getModule()->getDataLayout(); - return DL.getTypeStoreSize(RMWI->getValOperand()->getType()); -} - -unsigned getAtomicOpSize(AtomicCmpXchgInst *CASI) { - const DataLayout &DL = CASI->getModule()->getDataLayout(); - return DL.getTypeStoreSize(CASI->getCompareOperand()->getType()); -} - -// Helper functions to retrieve the alignment of atomic instructions. -unsigned getAtomicOpAlign(LoadInst *LI) { - unsigned Align = LI->getAlignment(); - // In the future, if this IR restriction is relaxed, we should - // return DataLayout::getABITypeAlignment when there's no align - // value. - assert(Align != 0 && "An atomic LoadInst always has an explicit alignment"); - return Align; -} - -unsigned getAtomicOpAlign(StoreInst *SI) { - unsigned Align = SI->getAlignment(); - // In the future, if this IR restriction is relaxed, we should - // return DataLayout::getABITypeAlignment when there's no align - // value. - assert(Align != 0 && "An atomic StoreInst always has an explicit alignment"); - return Align; -} - -unsigned getAtomicOpAlign(AtomicRMWInst *RMWI) { - // TODO(PR27168): This instruction has no alignment attribute, but unlike the - // default alignment for load/store, the default here is to assume - // it has NATURAL alignment, not DataLayout-specified alignment. - const DataLayout &DL = RMWI->getModule()->getDataLayout(); - return DL.getTypeStoreSize(RMWI->getValOperand()->getType()); -} - -unsigned getAtomicOpAlign(AtomicCmpXchgInst *CASI) { - // TODO(PR27168): same comment as above. - const DataLayout &DL = CASI->getModule()->getDataLayout(); - return DL.getTypeStoreSize(CASI->getCompareOperand()->getType()); -} - -// Determine if a particular atomic operation has a supported size, -// and is of appropriate alignment, to be passed through for target -// lowering. 
(Versus turning into a __atomic libcall) -template <typename Inst> -bool atomicSizeSupported(const TargetLowering *TLI, Inst *I) { - unsigned Size = getAtomicOpSize(I); - unsigned Align = getAtomicOpAlign(I); - return Align >= Size && Size <= TLI->getMaxAtomicSizeInBitsSupported() / 8; -} - -} // end anonymous namespace - bool AtomicExpand::runOnFunction(Function &F) { if (!TM || !TM->getSubtargetImpl(F)->enableAtomicExpand()) return false; @@ -176,33 +100,6 @@ bool AtomicExpand::runOnFunction(Function &F) { auto CASI = dyn_cast<AtomicCmpXchgInst>(I); assert((LI || SI || RMWI || CASI) && "Unknown atomic instruction"); - // If the Size/Alignment is not supported, replace with a libcall. - if (LI) { - if (!atomicSizeSupported(TLI, LI)) { - expandAtomicLoadToLibcall(LI); - MadeChange = true; - continue; - } - } else if (SI) { - if (!atomicSizeSupported(TLI, SI)) { - expandAtomicStoreToLibcall(SI); - MadeChange = true; - continue; - } - } else if (RMWI) { - if (!atomicSizeSupported(TLI, RMWI)) { - expandAtomicRMWToLibcall(RMWI); - MadeChange = true; - continue; - } - } else if (CASI) { - if (!atomicSizeSupported(TLI, CASI)) { - expandAtomicCASToLibcall(CASI); - MadeChange = true; - continue; - } - } - if (TLI->shouldInsertFencesForAtomic(I)) { auto FenceOrdering = AtomicOrdering::Monotonic; bool IsStore, IsLoad; @@ -247,7 +144,7 @@ bool AtomicExpand::runOnFunction(Function &F) { assert(LI->getType()->isIntegerTy() && "invariant broken"); MadeChange = true; } - + MadeChange |= tryExpandAtomicLoad(LI); } else if (SI) { if (SI->getValueOperand()->getType()->isFloatingPointTy()) { @@ -936,384 +833,3 @@ bool llvm::expandAtomicRMWToCmpXchg(AtomicRMWInst *AI, return true; } - -// This converts from LLVM's internal AtomicOrdering enum to the -// memory_order_* value required by the __atomic_* libcalls. -static int libcallAtomicModel(AtomicOrdering AO) { - enum { - AO_ABI_memory_order_relaxed = 0, - AO_ABI_memory_order_consume = 1, - AO_ABI_memory_order_acquire = 2, - AO_ABI_memory_order_release = 3, - AO_ABI_memory_order_acq_rel = 4, - AO_ABI_memory_order_seq_cst = 5 - }; - - switch (AO) { - case AtomicOrdering::NotAtomic: - llvm_unreachable("Expected atomic memory order."); - case AtomicOrdering::Unordered: - case AtomicOrdering::Monotonic: - return AO_ABI_memory_order_relaxed; - // Not implemented yet in llvm: - // case AtomicOrdering::Consume: - // return AO_ABI_memory_order_consume; - case AtomicOrdering::Acquire: - return AO_ABI_memory_order_acquire; - case AtomicOrdering::Release: - return AO_ABI_memory_order_release; - case AtomicOrdering::AcquireRelease: - return AO_ABI_memory_order_acq_rel; - case AtomicOrdering::SequentiallyConsistent: - return AO_ABI_memory_order_seq_cst; - } - llvm_unreachable("Unknown atomic memory order."); -} - -// In order to use one of the sized library calls such as -// __atomic_fetch_add_4, the alignment must be sufficient, the size -// must be one of the potentially-specialized sizes, and the value -// type must actually exist in C on the target (otherwise, the -// function wouldn't actually be defined.) -static bool canUseSizedAtomicCall(unsigned Size, unsigned Align, - const DataLayout &DL) { - // TODO: "LargestSize" is an approximation for "largest type that - // you can express in C". It seems to be the case that int128 is - // supported on all 64-bit platforms, otherwise only up to 64-bit - // integers are supported. If we get this wrong, then we'll try to - // call a sized libcall that doesn't actually exist. 
There should - // really be some more reliable way in LLVM of determining integer - // sizes which are valid in the target's C ABI... - unsigned LargestSize = DL.getLargestLegalIntTypeSize() >= 64 ? 16 : 8; - return Align >= Size && - (Size == 1 || Size == 2 || Size == 4 || Size == 8 || Size == 16) && - Size <= LargestSize; -} - -void AtomicExpand::expandAtomicLoadToLibcall(LoadInst *I) { - static const RTLIB::Libcall Libcalls[6] = { - RTLIB::ATOMIC_LOAD, RTLIB::ATOMIC_LOAD_1, RTLIB::ATOMIC_LOAD_2, - RTLIB::ATOMIC_LOAD_4, RTLIB::ATOMIC_LOAD_8, RTLIB::ATOMIC_LOAD_16}; - unsigned Size = getAtomicOpSize(I); - unsigned Align = getAtomicOpAlign(I); - - bool expanded = expandAtomicOpToLibcall( - I, Size, Align, I->getPointerOperand(), nullptr, nullptr, - I->getOrdering(), AtomicOrdering::NotAtomic, Libcalls); - (void)expanded; - assert(expanded && "expandAtomicOpToLibcall shouldn't fail tor Load"); -} - -void AtomicExpand::expandAtomicStoreToLibcall(StoreInst *I) { - static const RTLIB::Libcall Libcalls[6] = { - RTLIB::ATOMIC_STORE, RTLIB::ATOMIC_STORE_1, RTLIB::ATOMIC_STORE_2, - RTLIB::ATOMIC_STORE_4, RTLIB::ATOMIC_STORE_8, RTLIB::ATOMIC_STORE_16}; - unsigned Size = getAtomicOpSize(I); - unsigned Align = getAtomicOpAlign(I); - - bool expanded = expandAtomicOpToLibcall( - I, Size, Align, I->getPointerOperand(), I->getValueOperand(), nullptr, - I->getOrdering(), AtomicOrdering::NotAtomic, Libcalls); - (void)expanded; - assert(expanded && "expandAtomicOpToLibcall shouldn't fail tor Store"); -} - -void AtomicExpand::expandAtomicCASToLibcall(AtomicCmpXchgInst *I) { - static const RTLIB::Libcall Libcalls[6] = { - RTLIB::ATOMIC_COMPARE_EXCHANGE, RTLIB::ATOMIC_COMPARE_EXCHANGE_1, - RTLIB::ATOMIC_COMPARE_EXCHANGE_2, RTLIB::ATOMIC_COMPARE_EXCHANGE_4, - RTLIB::ATOMIC_COMPARE_EXCHANGE_8, RTLIB::ATOMIC_COMPARE_EXCHANGE_16}; - unsigned Size = getAtomicOpSize(I); - unsigned Align = getAtomicOpAlign(I); - - bool expanded = expandAtomicOpToLibcall( - I, Size, Align, I->getPointerOperand(), I->getNewValOperand(), - I->getCompareOperand(), I->getSuccessOrdering(), I->getFailureOrdering(), - Libcalls); - (void)expanded; - assert(expanded && "expandAtomicOpToLibcall shouldn't fail tor CAS"); -} - -static ArrayRef<RTLIB::Libcall> GetRMWLibcall(AtomicRMWInst::BinOp Op) { - static const RTLIB::Libcall LibcallsXchg[6] = { - RTLIB::ATOMIC_EXCHANGE, RTLIB::ATOMIC_EXCHANGE_1, - RTLIB::ATOMIC_EXCHANGE_2, RTLIB::ATOMIC_EXCHANGE_4, - RTLIB::ATOMIC_EXCHANGE_8, RTLIB::ATOMIC_EXCHANGE_16}; - static const RTLIB::Libcall LibcallsAdd[6] = { - RTLIB::UNKNOWN_LIBCALL, RTLIB::ATOMIC_FETCH_ADD_1, - RTLIB::ATOMIC_FETCH_ADD_2, RTLIB::ATOMIC_FETCH_ADD_4, - RTLIB::ATOMIC_FETCH_ADD_8, RTLIB::ATOMIC_FETCH_ADD_16}; - static const RTLIB::Libcall LibcallsSub[6] = { - RTLIB::UNKNOWN_LIBCALL, RTLIB::ATOMIC_FETCH_SUB_1, - RTLIB::ATOMIC_FETCH_SUB_2, RTLIB::ATOMIC_FETCH_SUB_4, - RTLIB::ATOMIC_FETCH_SUB_8, RTLIB::ATOMIC_FETCH_SUB_16}; - static const RTLIB::Libcall LibcallsAnd[6] = { - RTLIB::UNKNOWN_LIBCALL, RTLIB::ATOMIC_FETCH_AND_1, - RTLIB::ATOMIC_FETCH_AND_2, RTLIB::ATOMIC_FETCH_AND_4, - RTLIB::ATOMIC_FETCH_AND_8, RTLIB::ATOMIC_FETCH_AND_16}; - static const RTLIB::Libcall LibcallsOr[6] = { - RTLIB::UNKNOWN_LIBCALL, RTLIB::ATOMIC_FETCH_OR_1, - RTLIB::ATOMIC_FETCH_OR_2, RTLIB::ATOMIC_FETCH_OR_4, - RTLIB::ATOMIC_FETCH_OR_8, RTLIB::ATOMIC_FETCH_OR_16}; - static const RTLIB::Libcall LibcallsXor[6] = { - RTLIB::UNKNOWN_LIBCALL, RTLIB::ATOMIC_FETCH_XOR_1, - RTLIB::ATOMIC_FETCH_XOR_2, RTLIB::ATOMIC_FETCH_XOR_4, - RTLIB::ATOMIC_FETCH_XOR_8, 
RTLIB::ATOMIC_FETCH_XOR_16}; - static const RTLIB::Libcall LibcallsNand[6] = { - RTLIB::UNKNOWN_LIBCALL, RTLIB::ATOMIC_FETCH_NAND_1, - RTLIB::ATOMIC_FETCH_NAND_2, RTLIB::ATOMIC_FETCH_NAND_4, - RTLIB::ATOMIC_FETCH_NAND_8, RTLIB::ATOMIC_FETCH_NAND_16}; - - switch (Op) { - case AtomicRMWInst::BAD_BINOP: - llvm_unreachable("Should not have BAD_BINOP."); - case AtomicRMWInst::Xchg: - return ArrayRef<RTLIB::Libcall>(LibcallsXchg); - case AtomicRMWInst::Add: - return ArrayRef<RTLIB::Libcall>(LibcallsAdd); - case AtomicRMWInst::Sub: - return ArrayRef<RTLIB::Libcall>(LibcallsSub); - case AtomicRMWInst::And: - return ArrayRef<RTLIB::Libcall>(LibcallsAnd); - case AtomicRMWInst::Or: - return ArrayRef<RTLIB::Libcall>(LibcallsOr); - case AtomicRMWInst::Xor: - return ArrayRef<RTLIB::Libcall>(LibcallsXor); - case AtomicRMWInst::Nand: - return ArrayRef<RTLIB::Libcall>(LibcallsNand); - case AtomicRMWInst::Max: - case AtomicRMWInst::Min: - case AtomicRMWInst::UMax: - case AtomicRMWInst::UMin: - // No atomic libcalls are available for max/min/umax/umin. - return ArrayRef<RTLIB::Libcall>(); - } - llvm_unreachable("Unexpected AtomicRMW operation."); -} - -void AtomicExpand::expandAtomicRMWToLibcall(AtomicRMWInst *I) { - ArrayRef<RTLIB::Libcall> Libcalls = GetRMWLibcall(I->getOperation()); - - unsigned Size = getAtomicOpSize(I); - unsigned Align = getAtomicOpAlign(I); - - bool Success = false; - if (!Libcalls.empty()) - Success = expandAtomicOpToLibcall( - I, Size, Align, I->getPointerOperand(), I->getValOperand(), nullptr, - I->getOrdering(), AtomicOrdering::NotAtomic, Libcalls); - - // The expansion failed: either there were no libcalls at all for - // the operation (min/max), or there were only size-specialized - // libcalls (add/sub/etc) and we needed a generic. So, expand to a - // CAS libcall, via a CAS loop, instead. - if (!Success) { - expandAtomicRMWToCmpXchg(I, [this](IRBuilder<> &Builder, Value *Addr, - Value *Loaded, Value *NewVal, - AtomicOrdering MemOpOrder, - Value *&Success, Value *&NewLoaded) { - // Create the CAS instruction normally... - AtomicCmpXchgInst *Pair = Builder.CreateAtomicCmpXchg( - Addr, Loaded, NewVal, MemOpOrder, - AtomicCmpXchgInst::getStrongestFailureOrdering(MemOpOrder)); - Success = Builder.CreateExtractValue(Pair, 1, "success"); - NewLoaded = Builder.CreateExtractValue(Pair, 0, "newloaded"); - - // ...and then expand the CAS into a libcall. - expandAtomicCASToLibcall(Pair); - }); - } -} - -// A helper routine for the above expandAtomic*ToLibcall functions. -// -// 'Libcalls' contains an array of enum values for the particular -// ATOMIC libcalls to be emitted. All of the other arguments besides -// 'I' are extracted from the Instruction subclass by the -// caller. Depending on the particular call, some will be null. -bool AtomicExpand::expandAtomicOpToLibcall( - Instruction *I, unsigned Size, unsigned Align, Value *PointerOperand, - Value *ValueOperand, Value *CASExpected, AtomicOrdering Ordering, - AtomicOrdering Ordering2, ArrayRef<RTLIB::Libcall> Libcalls) { - assert(Libcalls.size() == 6); - - LLVMContext &Ctx = I->getContext(); - Module *M = I->getModule(); - const DataLayout &DL = M->getDataLayout(); - IRBuilder<> Builder(I); - IRBuilder<> AllocaBuilder(&I->getFunction()->getEntryBlock().front()); - - bool UseSizedLibcall = canUseSizedAtomicCall(Size, Align, DL); - Type *SizedIntTy = Type::getIntNTy(Ctx, Size * 8); - - unsigned AllocaAlignment = DL.getPrefTypeAlignment(SizedIntTy); - - // TODO: the "order" argument type is "int", not int32. 
So - // getInt32Ty may be wrong if the arch uses e.g. 16-bit ints. - ConstantInt *SizeVal64 = ConstantInt::get(Type::getInt64Ty(Ctx), Size); - Constant *OrderingVal = - ConstantInt::get(Type::getInt32Ty(Ctx), libcallAtomicModel(Ordering)); - Constant *Ordering2Val = CASExpected - ? ConstantInt::get(Type::getInt32Ty(Ctx), - libcallAtomicModel(Ordering2)) - : nullptr; - bool HasResult = I->getType() != Type::getVoidTy(Ctx); - - RTLIB::Libcall RTLibType; - if (UseSizedLibcall) { - switch (Size) { - case 1: RTLibType = Libcalls[1]; break; - case 2: RTLibType = Libcalls[2]; break; - case 4: RTLibType = Libcalls[3]; break; - case 8: RTLibType = Libcalls[4]; break; - case 16: RTLibType = Libcalls[5]; break; - } - } else if (Libcalls[0] != RTLIB::UNKNOWN_LIBCALL) { - RTLibType = Libcalls[0]; - } else { - // Can't use sized function, and there's no generic for this - // operation, so give up. - return false; - } - - // Build up the function call. There's two kinds. First, the sized - // variants. These calls are going to be one of the following (with - // N=1,2,4,8,16): - // iN __atomic_load_N(iN *ptr, int ordering) - // void __atomic_store_N(iN *ptr, iN val, int ordering) - // iN __atomic_{exchange|fetch_*}_N(iN *ptr, iN val, int ordering) - // bool __atomic_compare_exchange_N(iN *ptr, iN *expected, iN desired, - // int success_order, int failure_order) - // - // Note that these functions can be used for non-integer atomic - // operations, the values just need to be bitcast to integers on the - // way in and out. - // - // And, then, the generic variants. They look like the following: - // void __atomic_load(size_t size, void *ptr, void *ret, int ordering) - // void __atomic_store(size_t size, void *ptr, void *val, int ordering) - // void __atomic_exchange(size_t size, void *ptr, void *val, void *ret, - // int ordering) - // bool __atomic_compare_exchange(size_t size, void *ptr, void *expected, - // void *desired, int success_order, - // int failure_order) - // - // The different signatures are built up depending on the - // 'UseSizedLibcall', 'CASExpected', 'ValueOperand', and 'HasResult' - // variables. - - AllocaInst *AllocaCASExpected = nullptr; - Value *AllocaCASExpected_i8 = nullptr; - AllocaInst *AllocaValue = nullptr; - Value *AllocaValue_i8 = nullptr; - AllocaInst *AllocaResult = nullptr; - Value *AllocaResult_i8 = nullptr; - - Type *ResultTy; - SmallVector<Value *, 6> Args; - AttributeSet Attr; - - // 'size' argument. - if (!UseSizedLibcall) { - // Note, getIntPtrType is assumed equivalent to size_t. - Args.push_back(ConstantInt::get(DL.getIntPtrType(Ctx), Size)); - } - - // 'ptr' argument. - Value *PtrVal = - Builder.CreateBitCast(PointerOperand, Type::getInt8PtrTy(Ctx)); - Args.push_back(PtrVal); - - // 'expected' argument, if present. - if (CASExpected) { - AllocaCASExpected = AllocaBuilder.CreateAlloca(CASExpected->getType()); - AllocaCASExpected->setAlignment(AllocaAlignment); - AllocaCASExpected_i8 = - Builder.CreateBitCast(AllocaCASExpected, Type::getInt8PtrTy(Ctx)); - Builder.CreateLifetimeStart(AllocaCASExpected_i8, SizeVal64); - Builder.CreateAlignedStore(CASExpected, AllocaCASExpected, AllocaAlignment); - Args.push_back(AllocaCASExpected_i8); - } - - // 'val' argument ('desired' for cas), if present. 
- if (ValueOperand) { - if (UseSizedLibcall) { - Value *IntValue = - Builder.CreateBitOrPointerCast(ValueOperand, SizedIntTy); - Args.push_back(IntValue); - } else { - AllocaValue = AllocaBuilder.CreateAlloca(ValueOperand->getType()); - AllocaValue->setAlignment(AllocaAlignment); - AllocaValue_i8 = - Builder.CreateBitCast(AllocaValue, Type::getInt8PtrTy(Ctx)); - Builder.CreateLifetimeStart(AllocaValue_i8, SizeVal64); - Builder.CreateAlignedStore(ValueOperand, AllocaValue, AllocaAlignment); - Args.push_back(AllocaValue_i8); - } - } - - // 'ret' argument. - if (!CASExpected && HasResult && !UseSizedLibcall) { - AllocaResult = AllocaBuilder.CreateAlloca(I->getType()); - AllocaResult->setAlignment(AllocaAlignment); - AllocaResult_i8 = - Builder.CreateBitCast(AllocaResult, Type::getInt8PtrTy(Ctx)); - Builder.CreateLifetimeStart(AllocaResult_i8, SizeVal64); - Args.push_back(AllocaResult_i8); - } - - // 'ordering' ('success_order' for cas) argument. - Args.push_back(OrderingVal); - - // 'failure_order' argument, if present. - if (Ordering2Val) - Args.push_back(Ordering2Val); - - // Now, the return type. - if (CASExpected) { - ResultTy = Type::getInt1Ty(Ctx); - Attr = Attr.addAttribute(Ctx, AttributeSet::ReturnIndex, Attribute::ZExt); - } else if (HasResult && UseSizedLibcall) - ResultTy = SizedIntTy; - else - ResultTy = Type::getVoidTy(Ctx); - - // Done with setting up arguments and return types, create the call: - SmallVector<Type *, 6> ArgTys; - for (Value *Arg : Args) - ArgTys.push_back(Arg->getType()); - FunctionType *FnType = FunctionType::get(ResultTy, ArgTys, false); - Constant *LibcallFn = - M->getOrInsertFunction(TLI->getLibcallName(RTLibType), FnType, Attr); - CallInst *Call = Builder.CreateCall(LibcallFn, Args); - Call->setAttributes(Attr); - Value *Result = Call; - - // And then, extract the results... 
- if (ValueOperand && !UseSizedLibcall) - Builder.CreateLifetimeEnd(AllocaValue_i8, SizeVal64); - - if (CASExpected) { - // The final result from the CAS is {load of 'expected' alloca, bool result - // from call} - Type *FinalResultTy = I->getType(); - Value *V = UndefValue::get(FinalResultTy); - Value *ExpectedOut = - Builder.CreateAlignedLoad(AllocaCASExpected, AllocaAlignment); - Builder.CreateLifetimeEnd(AllocaCASExpected_i8, SizeVal64); - V = Builder.CreateInsertValue(V, ExpectedOut, 0); - V = Builder.CreateInsertValue(V, Result, 1); - I->replaceAllUsesWith(V); - } else if (HasResult) { - Value *V; - if (UseSizedLibcall) - V = Builder.CreateBitOrPointerCast(Result, I->getType()); - else { - V = Builder.CreateAlignedLoad(AllocaResult, AllocaAlignment); - Builder.CreateLifetimeEnd(AllocaResult_i8, SizeVal64); - } - I->replaceAllUsesWith(V); - } - I->eraseFromParent(); - return true; -} diff --git a/llvm/lib/CodeGen/TargetLoweringBase.cpp b/llvm/lib/CodeGen/TargetLoweringBase.cpp index d4aa5d5adad..8cadbb2dcd0 100644 --- a/llvm/lib/CodeGen/TargetLoweringBase.cpp +++ b/llvm/lib/CodeGen/TargetLoweringBase.cpp @@ -405,66 +405,7 @@ static void InitLibcallNames(const char **Names, const Triple &TT) { Names[RTLIB::SYNC_FETCH_AND_UMIN_4] = "__sync_fetch_and_umin_4"; Names[RTLIB::SYNC_FETCH_AND_UMIN_8] = "__sync_fetch_and_umin_8"; Names[RTLIB::SYNC_FETCH_AND_UMIN_16] = "__sync_fetch_and_umin_16"; - - Names[RTLIB::ATOMIC_LOAD] = "__atomic_load"; - Names[RTLIB::ATOMIC_LOAD_1] = "__atomic_load_1"; - Names[RTLIB::ATOMIC_LOAD_2] = "__atomic_load_2"; - Names[RTLIB::ATOMIC_LOAD_4] = "__atomic_load_4"; - Names[RTLIB::ATOMIC_LOAD_8] = "__atomic_load_8"; - Names[RTLIB::ATOMIC_LOAD_16] = "__atomic_load_16"; - - Names[RTLIB::ATOMIC_STORE] = "__atomic_store"; - Names[RTLIB::ATOMIC_STORE_1] = "__atomic_store_1"; - Names[RTLIB::ATOMIC_STORE_2] = "__atomic_store_2"; - Names[RTLIB::ATOMIC_STORE_4] = "__atomic_store_4"; - Names[RTLIB::ATOMIC_STORE_8] = "__atomic_store_8"; - Names[RTLIB::ATOMIC_STORE_16] = "__atomic_store_16"; - - Names[RTLIB::ATOMIC_EXCHANGE] = "__atomic_exchange"; - Names[RTLIB::ATOMIC_EXCHANGE_1] = "__atomic_exchange_1"; - Names[RTLIB::ATOMIC_EXCHANGE_2] = "__atomic_exchange_2"; - Names[RTLIB::ATOMIC_EXCHANGE_4] = "__atomic_exchange_4"; - Names[RTLIB::ATOMIC_EXCHANGE_8] = "__atomic_exchange_8"; - Names[RTLIB::ATOMIC_EXCHANGE_16] = "__atomic_exchange_16"; - - Names[RTLIB::ATOMIC_COMPARE_EXCHANGE] = "__atomic_compare_exchange"; - Names[RTLIB::ATOMIC_COMPARE_EXCHANGE_1] = "__atomic_compare_exchange_1"; - Names[RTLIB::ATOMIC_COMPARE_EXCHANGE_2] = "__atomic_compare_exchange_2"; - Names[RTLIB::ATOMIC_COMPARE_EXCHANGE_4] = "__atomic_compare_exchange_4"; - Names[RTLIB::ATOMIC_COMPARE_EXCHANGE_8] = "__atomic_compare_exchange_8"; - Names[RTLIB::ATOMIC_COMPARE_EXCHANGE_16] = "__atomic_compare_exchange_16"; - - Names[RTLIB::ATOMIC_FETCH_ADD_1] = "__atomic_fetch_add_1"; - Names[RTLIB::ATOMIC_FETCH_ADD_2] = "__atomic_fetch_add_2"; - Names[RTLIB::ATOMIC_FETCH_ADD_4] = "__atomic_fetch_add_4"; - Names[RTLIB::ATOMIC_FETCH_ADD_8] = "__atomic_fetch_add_8"; - Names[RTLIB::ATOMIC_FETCH_ADD_16] = "__atomic_fetch_add_16"; - Names[RTLIB::ATOMIC_FETCH_SUB_1] = "__atomic_fetch_sub_1"; - Names[RTLIB::ATOMIC_FETCH_SUB_2] = "__atomic_fetch_sub_2"; - Names[RTLIB::ATOMIC_FETCH_SUB_4] = "__atomic_fetch_sub_4"; - Names[RTLIB::ATOMIC_FETCH_SUB_8] = "__atomic_fetch_sub_8"; - Names[RTLIB::ATOMIC_FETCH_SUB_16] = "__atomic_fetch_sub_16"; - Names[RTLIB::ATOMIC_FETCH_AND_1] = "__atomic_fetch_and_1"; - Names[RTLIB::ATOMIC_FETCH_AND_2] 
= "__atomic_fetch_and_2"; - Names[RTLIB::ATOMIC_FETCH_AND_4] = "__atomic_fetch_and_4"; - Names[RTLIB::ATOMIC_FETCH_AND_8] = "__atomic_fetch_and_8"; - Names[RTLIB::ATOMIC_FETCH_AND_16] = "__atomic_fetch_and_16"; - Names[RTLIB::ATOMIC_FETCH_OR_1] = "__atomic_fetch_or_1"; - Names[RTLIB::ATOMIC_FETCH_OR_2] = "__atomic_fetch_or_2"; - Names[RTLIB::ATOMIC_FETCH_OR_4] = "__atomic_fetch_or_4"; - Names[RTLIB::ATOMIC_FETCH_OR_8] = "__atomic_fetch_or_8"; - Names[RTLIB::ATOMIC_FETCH_OR_16] = "__atomic_fetch_or_16"; - Names[RTLIB::ATOMIC_FETCH_XOR_1] = "__atomic_fetch_xor_1"; - Names[RTLIB::ATOMIC_FETCH_XOR_2] = "__atomic_fetch_xor_2"; - Names[RTLIB::ATOMIC_FETCH_XOR_4] = "__atomic_fetch_xor_4"; - Names[RTLIB::ATOMIC_FETCH_XOR_8] = "__atomic_fetch_xor_8"; - Names[RTLIB::ATOMIC_FETCH_XOR_16] = "__atomic_fetch_xor_16"; - Names[RTLIB::ATOMIC_FETCH_NAND_1] = "__atomic_fetch_nand_1"; - Names[RTLIB::ATOMIC_FETCH_NAND_2] = "__atomic_fetch_nand_2"; - Names[RTLIB::ATOMIC_FETCH_NAND_4] = "__atomic_fetch_nand_4"; - Names[RTLIB::ATOMIC_FETCH_NAND_8] = "__atomic_fetch_nand_8"; - Names[RTLIB::ATOMIC_FETCH_NAND_16] = "__atomic_fetch_nand_16"; - + if (TT.getEnvironment() == Triple::GNU) { Names[RTLIB::SINCOS_F32] = "sincosf"; Names[RTLIB::SINCOS_F64] = "sincos"; @@ -836,9 +777,6 @@ TargetLoweringBase::TargetLoweringBase(const TargetMachine &tm) : TM(tm) { GatherAllAliasesMaxDepth = 6; MinStackArgumentAlignment = 1; MinimumJumpTableEntries = 4; - // TODO: the default will be switched to 0 in the next commit, along - // with the Target-specific changes necessary. - MaxAtomicSizeInBitsSupported = 1024; InitLibcallNames(LibcallRoutineNames, TM.getTargetTriple()); InitCmpLibcallCCs(CmpLibcallCCs); diff --git a/llvm/lib/Target/Sparc/SparcISelLowering.cpp b/llvm/lib/Target/Sparc/SparcISelLowering.cpp index 32d88f9be74..d9f00095589 100644 --- a/llvm/lib/Target/Sparc/SparcISelLowering.cpp +++ b/llvm/lib/Target/Sparc/SparcISelLowering.cpp @@ -1611,13 +1611,6 @@ SparcTargetLowering::SparcTargetLowering(TargetMachine &TM, } // ATOMICs. - // Atomics are only supported on Sparcv9. (32bit atomics are also - // supported by the Leon sparcv8 variant, but we don't support that - // yet.) - if (Subtarget->isV9()) - setMaxAtomicSizeInBitsSupported(64); - else - setMaxAtomicSizeInBitsSupported(0); setOperationAction(ISD::ATOMIC_SWAP, MVT::i32, Legal); setOperationAction(ISD::ATOMIC_CMP_SWAP, MVT::i32, diff --git a/llvm/test/Transforms/AtomicExpand/SPARC/libcalls.ll b/llvm/test/Transforms/AtomicExpand/SPARC/libcalls.ll deleted file mode 100644 index afab7a39b27..00000000000 --- a/llvm/test/Transforms/AtomicExpand/SPARC/libcalls.ll +++ /dev/null @@ -1,257 +0,0 @@ -; RUN: opt -S %s -atomic-expand | FileCheck %s - -;;; NOTE: this test is actually target-independent -- any target which -;;; doesn't support inline atomics can be used. (E.g. X86 i386 would -;;; work, if LLVM is properly taught about what it's missing vs i586.) - -;target datalayout = "e-m:e-p:32:32-f64:32:64-f80:32-n8:16:32-S128" -;target triple = "i386-unknown-unknown" -target datalayout = "e-m:e-p:32:32-i64:64-f128:64-n32-S64" -target triple = "sparc-unknown-unknown" - -;; First, check the sized calls. Except for cmpxchg, these are fairly -;; straightforward. 
- -; CHECK-LABEL: @test_load_i16( -; CHECK: %1 = bitcast i16* %arg to i8* -; CHECK: %2 = call i16 @__atomic_load_2(i8* %1, i32 5) -; CHECK: ret i16 %2 -define i16 @test_load_i16(i16* %arg) { - %ret = load atomic i16, i16* %arg seq_cst, align 4 - ret i16 %ret -} - -; CHECK-LABEL: @test_store_i16( -; CHECK: %1 = bitcast i16* %arg to i8* -; CHECK: call void @__atomic_store_2(i8* %1, i16 %val, i32 5) -; CHECK: ret void -define void @test_store_i16(i16* %arg, i16 %val) { - store atomic i16 %val, i16* %arg seq_cst, align 4 - ret void -} - -; CHECK-LABEL: @test_exchange_i16( -; CHECK: %1 = bitcast i16* %arg to i8* -; CHECK: %2 = call i16 @__atomic_exchange_2(i8* %1, i16 %val, i32 5) -; CHECK: ret i16 %2 -define i16 @test_exchange_i16(i16* %arg, i16 %val) { - %ret = atomicrmw xchg i16* %arg, i16 %val seq_cst - ret i16 %ret -} - -; CHECK-LABEL: @test_cmpxchg_i16( -; CHECK: %1 = bitcast i16* %arg to i8* -; CHECK: %2 = alloca i16, align 2 -; CHECK: %3 = bitcast i16* %2 to i8* -; CHECK: call void @llvm.lifetime.start(i64 2, i8* %3) -; CHECK: store i16 %old, i16* %2, align 2 -; CHECK: %4 = call zeroext i1 @__atomic_compare_exchange_2(i8* %1, i8* %3, i16 %new, i32 5, i32 0) -; CHECK: %5 = load i16, i16* %2, align 2 -; CHECK: call void @llvm.lifetime.end(i64 2, i8* %3) -; CHECK: %6 = insertvalue { i16, i1 } undef, i16 %5, 0 -; CHECK: %7 = insertvalue { i16, i1 } %6, i1 %4, 1 -; CHECK: %ret = extractvalue { i16, i1 } %7, 0 -; CHECK: ret i16 %ret -define i16 @test_cmpxchg_i16(i16* %arg, i16 %old, i16 %new) { - %ret_succ = cmpxchg i16* %arg, i16 %old, i16 %new seq_cst monotonic - %ret = extractvalue { i16, i1 } %ret_succ, 0 - ret i16 %ret -} - -; CHECK-LABEL: @test_add_i16( -; CHECK: %1 = bitcast i16* %arg to i8* -; CHECK: %2 = call i16 @__atomic_fetch_add_2(i8* %1, i16 %val, i32 5) -; CHECK: ret i16 %2 -define i16 @test_add_i16(i16* %arg, i16 %val) { - %ret = atomicrmw add i16* %arg, i16 %val seq_cst - ret i16 %ret -} - - -;; Now, check the output for the unsized libcalls. i128 is used for -;; these tests because the "16" suffixed functions aren't available on -;; 32-bit i386. 
- -; CHECK-LABEL: @test_load_i128( -; CHECK: %1 = bitcast i128* %arg to i8* -; CHECK: %2 = alloca i128, align 8 -; CHECK: %3 = bitcast i128* %2 to i8* -; CHECK: call void @llvm.lifetime.start(i64 16, i8* %3) -; CHECK: call void @__atomic_load(i32 16, i8* %1, i8* %3, i32 5) -; CHECK: %4 = load i128, i128* %2, align 8 -; CHECK: call void @llvm.lifetime.end(i64 16, i8* %3) -; CHECK: ret i128 %4 -define i128 @test_load_i128(i128* %arg) { - %ret = load atomic i128, i128* %arg seq_cst, align 16 - ret i128 %ret -} - -; CHECK-LABEL @test_store_i128( -; CHECK: %1 = bitcast i128* %arg to i8* -; CHECK: %2 = alloca i128, align 8 -; CHECK: %3 = bitcast i128* %2 to i8* -; CHECK: call void @llvm.lifetime.start(i64 16, i8* %3) -; CHECK: store i128 %val, i128* %2, align 8 -; CHECK: call void @__atomic_store(i32 16, i8* %1, i8* %3, i32 5) -; CHECK: call void @llvm.lifetime.end(i64 16, i8* %3) -; CHECK: ret void -define void @test_store_i128(i128* %arg, i128 %val) { - store atomic i128 %val, i128* %arg seq_cst, align 16 - ret void -} - -; CHECK-LABEL: @test_exchange_i128( -; CHECK: %1 = bitcast i128* %arg to i8* -; CHECK: %2 = alloca i128, align 8 -; CHECK: %3 = bitcast i128* %2 to i8* -; CHECK: call void @llvm.lifetime.start(i64 16, i8* %3) -; CHECK: store i128 %val, i128* %2, align 8 -; CHECK: %4 = alloca i128, align 8 -; CHECK: %5 = bitcast i128* %4 to i8* -; CHECK: call void @llvm.lifetime.start(i64 16, i8* %5) -; CHECK: call void @__atomic_exchange(i32 16, i8* %1, i8* %3, i8* %5, i32 5) -; CHECK: call void @llvm.lifetime.end(i64 16, i8* %3) -; CHECK: %6 = load i128, i128* %4, align 8 -; CHECK: call void @llvm.lifetime.end(i64 16, i8* %5) -; CHECK: ret i128 %6 -define i128 @test_exchange_i128(i128* %arg, i128 %val) { - %ret = atomicrmw xchg i128* %arg, i128 %val seq_cst - ret i128 %ret -} - -; CHECK-LABEL: @test_cmpxchg_i128( -; CHECK: %1 = bitcast i128* %arg to i8* -; CHECK: %2 = alloca i128, align 8 -; CHECK: %3 = bitcast i128* %2 to i8* -; CHECK: call void @llvm.lifetime.start(i64 16, i8* %3) -; CHECK: store i128 %old, i128* %2, align 8 -; CHECK: %4 = alloca i128, align 8 -; CHECK: %5 = bitcast i128* %4 to i8* -; CHECK: call void @llvm.lifetime.start(i64 16, i8* %5) -; CHECK: store i128 %new, i128* %4, align 8 -; CHECK: %6 = call zeroext i1 @__atomic_compare_exchange(i32 16, i8* %1, i8* %3, i8* %5, i32 5, i32 0) -; CHECK: call void @llvm.lifetime.end(i64 16, i8* %5) -; CHECK: %7 = load i128, i128* %2, align 8 -; CHECK: call void @llvm.lifetime.end(i64 16, i8* %3) -; CHECK: %8 = insertvalue { i128, i1 } undef, i128 %7, 0 -; CHECK: %9 = insertvalue { i128, i1 } %8, i1 %6, 1 -; CHECK: %ret = extractvalue { i128, i1 } %9, 0 -; CHECK: ret i128 %ret -define i128 @test_cmpxchg_i128(i128* %arg, i128 %old, i128 %new) { - %ret_succ = cmpxchg i128* %arg, i128 %old, i128 %new seq_cst monotonic - %ret = extractvalue { i128, i1 } %ret_succ, 0 - ret i128 %ret -} - -; This one is a verbose expansion, as there is no generic -; __atomic_fetch_add function, so it needs to expand to a cmpxchg -; loop, which then itself expands into a libcall. 
- -; CHECK-LABEL: @test_add_i128( -; CHECK: %1 = alloca i128, align 8 -; CHECK: %2 = alloca i128, align 8 -; CHECK: %3 = load i128, i128* %arg, align 16 -; CHECK: br label %atomicrmw.start -; CHECK:atomicrmw.start: -; CHECK: %loaded = phi i128 [ %3, %0 ], [ %newloaded, %atomicrmw.start ] -; CHECK: %new = add i128 %loaded, %val -; CHECK: %4 = bitcast i128* %arg to i8* -; CHECK: %5 = bitcast i128* %1 to i8* -; CHECK: call void @llvm.lifetime.start(i64 16, i8* %5) -; CHECK: store i128 %loaded, i128* %1, align 8 -; CHECK: %6 = bitcast i128* %2 to i8* -; CHECK: call void @llvm.lifetime.start(i64 16, i8* %6) -; CHECK: store i128 %new, i128* %2, align 8 -; CHECK: %7 = call zeroext i1 @__atomic_compare_exchange(i32 16, i8* %4, i8* %5, i8* %6, i32 5, i32 5) -; CHECK: call void @llvm.lifetime.end(i64 16, i8* %6) -; CHECK: %8 = load i128, i128* %1, align 8 -; CHECK: call void @llvm.lifetime.end(i64 16, i8* %5) -; CHECK: %9 = insertvalue { i128, i1 } undef, i128 %8, 0 -; CHECK: %10 = insertvalue { i128, i1 } %9, i1 %7, 1 -; CHECK: %success = extractvalue { i128, i1 } %10, 1 -; CHECK: %newloaded = extractvalue { i128, i1 } %10, 0 -; CHECK: br i1 %success, label %atomicrmw.end, label %atomicrmw.start -; CHECK:atomicrmw.end: -; CHECK: ret i128 %newloaded -define i128 @test_add_i128(i128* %arg, i128 %val) { - %ret = atomicrmw add i128* %arg, i128 %val seq_cst - ret i128 %ret -} - -;; Ensure that non-integer types get bitcast correctly on the way in and out of a libcall: - -; CHECK-LABEL: @test_load_double( -; CHECK: %1 = bitcast double* %arg to i8* -; CHECK: %2 = call i64 @__atomic_load_8(i8* %1, i32 5) -; CHECK: %3 = bitcast i64 %2 to double -; CHECK: ret double %3 -define double @test_load_double(double* %arg, double %val) { - %1 = load atomic double, double* %arg seq_cst, align 16 - ret double %1 -} - -; CHECK-LABEL: @test_store_double( -; CHECK: %1 = bitcast double* %arg to i8* -; CHECK: %2 = bitcast double %val to i64 -; CHECK: call void @__atomic_store_8(i8* %1, i64 %2, i32 5) -; CHECK: ret void -define void @test_store_double(double* %arg, double %val) { - store atomic double %val, double* %arg seq_cst, align 16 - ret void -} - -; CHECK-LABEL: @test_cmpxchg_ptr( -; CHECK: %1 = bitcast i16** %arg to i8* -; CHECK: %2 = alloca i16*, align 4 -; CHECK: %3 = bitcast i16** %2 to i8* -; CHECK: call void @llvm.lifetime.start(i64 4, i8* %3) -; CHECK: store i16* %old, i16** %2, align 4 -; CHECK: %4 = ptrtoint i16* %new to i32 -; CHECK: %5 = call zeroext i1 @__atomic_compare_exchange_4(i8* %1, i8* %3, i32 %4, i32 5, i32 2) -; CHECK: %6 = load i16*, i16** %2, align 4 -; CHECK: call void @llvm.lifetime.end(i64 4, i8* %3) -; CHECK: %7 = insertvalue { i16*, i1 } undef, i16* %6, 0 -; CHECK: %8 = insertvalue { i16*, i1 } %7, i1 %5, 1 -; CHECK: %ret = extractvalue { i16*, i1 } %8, 0 -; CHECK: ret i16* %ret -; CHECK: } -define i16* @test_cmpxchg_ptr(i16** %arg, i16* %old, i16* %new) { - %ret_succ = cmpxchg i16** %arg, i16* %old, i16* %new seq_cst acquire - %ret = extractvalue { i16*, i1 } %ret_succ, 0 - ret i16* %ret -} - -;; ...and for a non-integer type of large size too. 
- -; CHECK-LABEL: @test_store_fp128 -; CHECK: %1 = bitcast fp128* %arg to i8* -; CHECK: %2 = alloca fp128, align 8 -; CHECK: %3 = bitcast fp128* %2 to i8* -; CHECK: call void @llvm.lifetime.start(i64 16, i8* %3) -; CHECK: store fp128 %val, fp128* %2, align 8 -; CHECK: call void @__atomic_store(i32 16, i8* %1, i8* %3, i32 5) -; CHECK: call void @llvm.lifetime.end(i64 16, i8* %3) -; CHECK: ret void -define void @test_store_fp128(fp128* %arg, fp128 %val) { - store atomic fp128 %val, fp128* %arg seq_cst, align 16 - ret void -} - -;; Unaligned loads and stores should be expanded to the generic -;; libcall, just like large loads/stores, and not a specialized one. -;; NOTE: atomicrmw and cmpxchg don't yet support an align attribute; -;; when such support is added, they should also be tested here. - -; CHECK-LABEL: @test_unaligned_load_i16( -; CHECK: __atomic_load( -define i16 @test_unaligned_load_i16(i16* %arg) { - %ret = load atomic i16, i16* %arg seq_cst, align 1 - ret i16 %ret -} - -; CHECK-LABEL: @test_unaligned_store_i16( -; CHECK: __atomic_store( -define void @test_unaligned_store_i16(i16* %arg, i16 %val) { - store atomic i16 %val, i16* %arg seq_cst, align 1 - ret void -} diff --git a/llvm/test/Transforms/AtomicExpand/SPARC/lit.local.cfg b/llvm/test/Transforms/AtomicExpand/SPARC/lit.local.cfg deleted file mode 100644 index 9a34b657815..00000000000 --- a/llvm/test/Transforms/AtomicExpand/SPARC/lit.local.cfg +++ /dev/null @@ -1,2 +0,0 @@ -if not 'Sparc' in config.root.targets: - config.unsupported = True |
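For readers following the removed Atomics.rst text above: it observes that the generic ``__atomic_*`` entry points can be implemented in an entirely target-independent way by falling back to a mutex. The C sketch below illustrates that idea for the two generic prototypes quoted in the removed section, ``__atomic_load`` and ``__atomic_compare_exchange``. It is an illustration only, not part of the patch: the ``generic_`` function names, the single global lock, and the pthread-based locking are assumptions standing in for libatomic's real (reserved-name, lock-sharded) implementation.

/*
 * Illustrative sketch only -- not part of the patch.  These functions
 * mirror the documented prototypes of __atomic_load and
 * __atomic_compare_exchange from the removed Atomics.rst text.  The
 * names are changed (generic_*) because the real symbols are reserved
 * and normally provided by libatomic, and one global lock stands in
 * for libatomic's per-address lock sharding.
 */
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

static pthread_mutex_t atomic_lock = PTHREAD_MUTEX_INITIALIZER;

/* Mirrors: void __atomic_load(size_t size, void *ptr, void *ret, int ordering) */
void generic_atomic_load(size_t size, void *ptr, void *ret, int ordering) {
  (void)ordering;                     /* the lock already gives seq_cst behaviour */
  pthread_mutex_lock(&atomic_lock);
  memcpy(ret, ptr, size);             /* copy the current value out under the lock */
  pthread_mutex_unlock(&atomic_lock);
}

/* Mirrors: bool __atomic_compare_exchange(size_t size, void *ptr, void *expected,
 *                                         void *desired, int success_order,
 *                                         int failure_order) */
bool generic_atomic_compare_exchange(size_t size, void *ptr, void *expected,
                                     void *desired, int success_order,
                                     int failure_order) {
  bool ok;
  (void)success_order;
  (void)failure_order;
  pthread_mutex_lock(&atomic_lock);
  ok = memcmp(ptr, expected, size) == 0;
  if (ok)
    memcpy(ptr, desired, size);       /* success: install the new value        */
  else
    memcpy(expected, ptr, size);      /* failure: report the value actually seen */
  pthread_mutex_unlock(&atomic_lock);
  return ok;
}

Because the lock is shared state, an implementation of this shape has to live in a shared library visible to every DSO in the process, which is exactly the deployment constraint the removed text describes for the ``__atomic_*`` family (and why the lock-based ``__atomic_*`` routines must never be mixed with the lock-free ``__sync_*`` ones on the same object).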