path: root/llvm/test/CodeGen/ARM/atomic-64bit.ll
* ARM: Better codegen for 64-bit compares. (Peter Collingbourne, 2016-03-21; 1 file changed, -96/+72)

  This introduces a custom lowering for ISD::SETCCE (introduced in r253572)
  that allows us to emit a short code sequence for 64-bit compares.

  Before:
      push {r7, lr}
      cmp r0, r2
      mov.w r0, #0
      mov.w r12, #0
      it hs
      movhs r0, #1
      cmp r1, r3
      it ge
      movge.w r12, #1
      it eq
      moveq r12, r0
      cmp.w r12, #0
      bne .LBB1_2
      @ BB#1:   @ %bb1
      bl f
      pop {r7, pc}
      .LBB1_2:  @ %bb2
      bl g
      pop {r7, pc}

  After:
      push {r7, lr}
      subs r0, r0, r2
      sbcs.w r0, r1, r3
      bge .LBB1_2
      @ BB#1:   @ %bb1
      bl f
      pop {r7, pc}
      .LBB1_2:  @ %bb2
      bl g
      pop {r7, pc}

  Saves around 80KB in Chromium's libchrome.so.

  Some notes on this patch:
  - I don't much like the ARMISD::BRCOND and ARMISD::CMOV combines I
    introduced (nothing else needs them). However, they are necessary in
    order to avoid poor codegen, and they seem similar to existing combines
    in other backends (e.g. X86 combines (brcond (cmp (setcc Compare))) to
    (brcond Compare)).
  - No support for Thumb-1. This is in principle possible, but we'd need to
    implement ARMISD::SUBE for Thumb-1.

  Differential Revision: http://reviews.llvm.org/D15256

  llvm-svn: 263962
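  For context, a minimal LLVM IR sketch of the kind of input this lowering
  improves: a signed 64-bit compare feeding a branch. The function and
  callee names here are illustrative, not taken from the test:

      define void @test(i64 %a, i64 %b) {
        %cmp = icmp slt i64 %a, %b          ; 64-bit compare lowered via SETCCE
        br i1 %cmp, label %bb1, label %bb2
      bb1:
        call void @f()
        ret void
      bb2:
        call void @g()
        ret void
      }
      declare void @f()
      declare void @g()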
* ARM: sink atomic release barrier as far as possible into cmpxchg. (Tim Northover, 2016-02-22; 1 file changed, -9/+9)

  DMB instructions can be expensive, so it's best to avoid them if possible.
  In atomicrmw operations there will always be an attempted store so a
  release barrier is always needed, but in the cmpxchg case we can delay the
  DMB until we know we'll definitely try to perform a store (and so need
  release semantics).

  In the strong cmpxchg case this isn't quite free: we must duplicate the
  LDREX instructions to skip the barrier on subsequent iterations. The basic
  outline becomes:

      ldrex rOld, [rAddr]
      cmp rOld, rDesired
      bne Ldone
      dmb
  Lloop:
      strex rRes, rNew, [rAddr]
      cbz rRes Ldone
      ldrex rOld, [rAddr]
      cmp rOld, rDesired
      beq Lloop
  Ldone:

  So we'll skip this version for strong operations in "minsize" functions.

  llvm-svn: 261568
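  For context, the IR-level operation being lowered here is a strong
  compare-and-swap whose success path carries release semantics; a minimal
  sketch (pointer and value names are illustrative):

      %pair = cmpxchg i64* %addr, i64 %expected, i64 %new release monotonic
      %old  = extractvalue { i64, i1 } %pair, 0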
* AArch64: use ldxp/stxp pair to implement 128-bit atomic loads. (Tim Northover, 2015-12-02; 1 file changed, -0/+6)

  The ARM ARM is clear that 128-bit loads are only guaranteed to have been
  atomic if there has been a corresponding successful stxp. It's less clear
  for AArch32, so I'm leaving that alone for now.

  llvm-svn: 254524
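  For context, the IR whose lowering is in question is a 128-bit atomic
  load such as the sketch below (names illustrative); on AArch64 it is now
  implemented as an ldxp/stxp loop rather than a bare ldxp:

      %val = load atomic i128, i128* %ptr seq_cst, align 16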
* [opaque pointer type] Add textual IR support for explicit type parameter to load instruction (David Blaikie, 2015-02-27; 1 file changed, -1/+1)

  Essentially the same as the GEP change in r230786.

  A similar migration script can be used to update test cases, though a few
  more test case improvements/changes were required this time around:
  (r229269-r229278)

      import fileinput
      import sys
      import re

      pat = re.compile(r"((?:=|:|^)\s*load (?:atomic )?(?:volatile )?(.*?))(| addrspace\(\d+\) *)\*($| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$)")

      for line in sys.stdin:
        sys.stdout.write(re.sub(pat, r"\1, \2\3*\4", line))

  Reviewers: rafael, dexonsmith, grosser

  Differential Revision: http://reviews.llvm.org/D7649

  llvm-svn: 230794
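  For illustration, the kind of change this makes to a test line (value and
  pointer names hypothetical): the pointee type is now spelled explicitly as
  the first operand of the load, so

      %val = load i64* %ptr
      %val = load atomic i64* %ptr seq_cst, align 8

  become

      %val = load i64, i64* %ptr
      %val = load atomic i64, i64* %ptr seq_cst, align 8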
* Add a new string member to the TargetOptions struct for the name of the ABI we should be using (Eric Christopher, 2014-12-18; 1 file changed, -1/+1)

  For targets that don't use the option there's no change; otherwise this
  allows external users to set the ABI via string and avoid some of the
  -backend-option pain in clang.

  Use this option to move the ABI for the ARM port from the Subtarget to the
  TargetMachine, and update the testcases accordingly since it's no longer
  valid to set via -mattr.

  llvm-svn: 224492
* Model ARM backend ABI selection after the front end code doing the same (Eric Christopher, 2014-12-18; 1 file changed, -1/+1)

  This will change the "bare metal" ABI from APCS to AAPCS. The only
  difference between the front and back end code is that the code for
  Triple::GNU was added for environment. That will migrate to the front end
  shortly.

  Tests updated with the ABI they were originally testing in the case of
  bare metal (e.g. -mtriple armv7) or with a -gnu for arm-linux triples.

  llvm-svn: 224489
* Atomics: make use of the "cmpxchg weak" instruction. (Tim Northover, 2014-06-13; 1 file changed, -2/+3)

  This also simplifies the IR we create slightly: instead of working out
  where success & failure should go manually, it turns out we can just
  always jump to a success/failure block created for the purpose. Later
  phases will sort out the mess without much difficulty.

  llvm-svn: 210917
* IR: add "cmpxchg weak" variant to support permitted failure.Tim Northover2014-06-131-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit adds a weak variant of the cmpxchg operation, as described in C++11. A cmpxchg instruction with this modifier is permitted to fail to store, even if the comparison indicated it should. As a result, cmpxchg instructions must return a flag indicating success in addition to their original iN value loaded. Thus, for uniformity *all* cmpxchg instructions now return "{ iN, i1 }". The second flag is 1 when the store succeeded. At the DAG level, a new ATOMIC_CMP_SWAP_WITH_SUCCESS node has been added as the natural representation for the new cmpxchg instructions. It is a strong cmpxchg. By default this gets Expanded to the existing ATOMIC_CMP_SWAP during Legalization, so existing backends should see no change in behaviour. If they wish to deal with the enhanced node instead, they can call setOperationAction on it. Beware: as a node with 2 results, it cannot be selected from TableGen. Currently, no use is made of the extra information provided in this patch. Test updates are almost entirely adapting the input IR to the new scheme. Summary for out of tree users: ------------------------------ + Legacy Bitcode files are upgraded during read. + Legacy assembly IR files will be invalid. + Front-ends must adapt to different type for "cmpxchg". + Backends should be unaffected by default. llvm-svn: 210903
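  For illustration, the new form in IR looks roughly like this (names
  illustrative, typed-pointer syntax of the time):

      %res     = cmpxchg weak i32* %addr, i32 %expected, i32 %new seq_cst seq_cst
      %loaded  = extractvalue { i32, i1 } %res, 0
      %success = extractvalue { i32, i1 } %res, 1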
* ARM & AArch64: make use of common cmpxchg idioms after expansionTim Northover2014-05-301-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | The C and C++ semantics for compare_exchange require it to return a bool indicating success. This gets mapped to LLVM IR which follows each cmpxchg with an icmp of the value loaded against the desired value. When lowered to ldxr/stxr loops, this extra comparison is redundant: its results are implicit in the control-flow of the function. This commit makes two changes: it replaces that icmp with appropriate PHI nodes, and then makes sure earlyCSE is called after expansion to actually make use of the opportunities revealed. I've also added -{arm,aarch64}-enable-atomic-tidy options, so that existing fragile tests aren't perturbed too much by the change. Many of them either rely on undef/unreachable too pervasively to be restored to something well-defined (particularly while making sure they test the same obscure assert from many years ago), or depend on a particular CFG shape, which is disrupted by SimplifyCFG. rdar://problem/16227836 llvm-svn: 209883
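  The redundant idiom being targeted looks roughly like this in the IR of
  the time (at this point cmpxchg still returned only the loaded value;
  names are illustrative):

      %old     = cmpxchg i32* %addr, i32 %desired, i32 %new seq_cst seq_cst
      %success = icmp eq i32 %old, %desired

  Once the cmpxchg is expanded into an ldxr/stxr loop, success or failure is
  already implied by which exit block is reached, so the icmp can be
  replaced with PHI nodes.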
* ARM big endian function argument passing (Christian Pirker, 2014-05-08; 1 file changed, -45/+86)

  llvm-svn: 208316
* ARM: expand atomic ldrex/strex loops in IR (Tim Northover, 2014-04-03; 1 file changed, -41/+114)

  The previous situation where ATOMIC_LOAD_WHATEVER nodes were expanded at
  MachineInstr emission time had grown to be extremely large and involved,
  to account for the subtly different code needed for the various flavours
  (8/16/32/64 bit, cmpxchg/add/minmax).

  Moving this transformation into the IR clears up the code substantially,
  and makes future optimisations much easier:
  1. an atomicrmw followed by using the *new* value can be more efficient.
     As an IR pass, simple CSE could handle this efficiently.
  2. Making use of cmpxchg success/failure orderings only has to be done in
     one (simpler) place.
  3. The common "cmpxchg; did we store?" idiom can be exposed to
     optimisation.

  I intend to gradually improve this situation within the ARM backend and
  make sure there are no hidden issues before moving the code out into
  CodeGen to be shared with (at least ARM64/AArch64, though I think PPC &
  Mips could benefit too).

  llvm-svn: 205525
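  For illustration, the expansion produces an explicit exclusive
  load/store loop in IR using the ARM intrinsics; a rough sketch for a
  32-bit "atomicrmw add" (block and value names illustrative, barriers and
  ordering details omitted):

      loop:
        %old   = call i32 @llvm.arm.ldrex.p0i32(i32* %addr)
        %new   = add i32 %old, %val
        %fail  = call i32 @llvm.arm.strex.p0i32(i32 %new, i32* %addr)
        %retry = icmp ne i32 %fail, 0          ; strex returns 0 on success
        br i1 %retry, label %loop, label %done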
* IR: add a second ordering operand to cmpxchg for failure (Tim Northover, 2014-03-11; 1 file changed, -1/+1)

  The syntax for "cmpxchg" should now look something like:

      cmpxchg i32* %addr, i32 42, i32 3 acquire monotonic

  where the second ordering argument gives the required semantics in the
  case that no exchange takes place. It should be no stronger than the first
  ordering constraint and cannot be either "release" or "acq_rel" (since no
  store will have taken place).

  rdar://problem/15996804

  llvm-svn: 203559
* [ARM] Use the load-acquire/store-release instructions optimally in AArch32. (Amara Emerson, 2013-09-26; 1 file changed, -15/+1)

  Patch by Artyom Skrobov.

  llvm-svn: 191428
* [ARM] Constrain some register classes in EmitAtomicBinary64 so that we pass these tests with -verify-machineinstrs (Joey Gouly, 2013-08-22; 1 file changed, -1/+1)

  llvm-svn: 189006
* Mass update to CodeGen tests to use CHECK-LABEL for labels corresponding to function definitions for more informative error messages (Stephen Lin, 2013-07-14; 1 file changed, -13/+13)

  No functionality change and all updated tests passed locally.

  This update was done with the following bash script:

      find test/CodeGen -name "*.ll" | \
      while read NAME; do
        echo "$NAME"
        if ! grep -q "^; *RUN: *llc.*debug" $NAME; then
          TEMP=`mktemp -t temp`
          cp $NAME $TEMP
          sed -n "s/^define [^@]*@\([A-Za-z0-9_]*\)(.*$/\1/p" < $NAME | \
          while read FUNC; do
            sed -i '' "s/;\(.*\)\([A-Za-z0-9_-]*\):\( *\)$FUNC: *\$/;\1\2-LABEL:\3$FUNC:/g" $TEMP
          done
          sed -i '' "s/;\(.*\)-LABEL-LABEL:/;\1-LABEL:/" $TEMP
          sed -i '' "s/;\(.*\)-NEXT-LABEL:/;\1-NEXT:/" $TEMP
          sed -i '' "s/;\(.*\)-NOT-LABEL:/;\1-NOT:/" $TEMP
          sed -i '' "s/;\(.*\)-DAG-LABEL:/;\1-DAG:/" $TEMP
          mv $TEMP $NAME
        fi
      done

  llvm-svn: 186280
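  For illustration, the kind of change this makes inside a test (function
  name hypothetical): a check line such as

      ; CHECK: test1:

  becomes

      ; CHECK-LABEL: test1: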
* Convert CodeGen/*/*.ll tests to use the new CHECK-LABEL for easier debugging (Stephen Lin, 2013-07-13; 1 file changed, -13/+13)

  No functionality change and all tests pass after conversion.

  This was done with the following sed invocation to catch label lines
  demarking function boundaries:

      sed -i '' "s/^;\( *\)\([A-Z0-9_]*\):\( *\)test\([A-Za-z0-9_-]*\):\( *\)$/;\1\2-LABEL:\3test\4:\5/g" test/CodeGen/*/*.ll

  which was written conservatively to avoid false positives rather than
  false negatives. I scanned through all the changes and everything looks
  correct.

  llvm-svn: 186258
* ARM: relax the atomic release barrier to "dmb ishst" on Swift (Tim Northover, 2013-07-03; 1 file changed, -50/+50)

  Swift cores implement store barriers that are stronger than the ARM
  specification but weaker than general barriers. They are, in fact, just
  about enough to provide the ordering needed for atomic operations with
  release semantics.

  This patch makes use of that quirk.

  llvm-svn: 185527
* Revert r185339 (ARM: relax the atomic release barrier to "dmb ishst") (Tim Northover, 2013-07-01; 1 file changed, -50/+50)

  Turns out I'd misread the architecture reference manual and thought that
  was a load/store-store barrier, when it's not. Thanks for pointing it out
  Eli!

  llvm-svn: 185356
* ARM: relax the atomic release barrier to "dmb ishst" (Tim Northover, 2013-07-01; 1 file changed, -50/+50)

  I believe the full "dmb ish" barrier is not required to guarantee release
  semantics for atomic operations. The weaker "dmb ishst" prevents previous
  operations being reordered with a store executed afterwards, which is
  enough.

  A key point to note (fortunately already correct) is that this barrier
  alone is *insufficient* for sequential consistency, no matter how
  liberally placed.

  llvm-svn: 185339
* Fix 64-bit atomic operations in Thumb mode. (Tim Northover, 2013-01-29; 1 file changed, -0/+147)

  The ARM and Thumb variants of LDREXD and STREXD have different constraints
  and take different operands. Previously the code expanding atomic
  operations didn't take this into account and asserted in Thumb mode.

  llvm-svn: 173780
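  Roughly, the difference is that in ARM mode LDREXD/STREXD require an
  even/odd register pair (Rt and Rt+1), while in Thumb-2 the two data
  registers are independent operands; the register choices below are
  arbitrary, for illustration only:

      @ ARM mode: consecutive even/odd pair required
      ldrexd  r0, r1, [r2]
      strexd  r4, r0, r1, [r2]

      @ Thumb-2: any two suitable registers
      ldrexd  r1, r4, [r2]
      strexd  r5, r1, r4, [r2]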
* Fixed the condition codes for the atomic64 min/umin code generation on ARM. (Silviu Baranga, 2013-01-25; 1 file changed, -2/+2)

  If the subtraction of the higher 32-bit parts gives a 0 result, we need to
  do the store operation.

  llvm-svn: 173437
* Added atomic 64 min/max/umin/umax intrinsics support in the ARM backend. (Silviu Baranga, 2012-11-29; 1 file changed, -0/+61)

  llvm-svn: 168886
* Remove hard coded registers in ARM ldrexd and strexd instructions (Weiming Zhao, 2012-11-16; 1 file changed, -41/+41)

  This patch replaces the hard-coded GPR pair [R0, R1] of
  Intrinsic:arm_ldrexd and [R2, R3] of Intrinsic:arm_strexd with the
  even/odd GPRPair register class, similar to the lowering of the atomic_64
  operation.

  llvm-svn: 168207
* Generic expansion for atomic load/store into cmpxchg/atomicrmw xchg; implements 64-bit atomic load/store for ARM. (Eli Friedman, 2011-08-31; 1 file changed, -7/+37)

  llvm-svn: 138872
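  Conceptually, an atomic store is expanded into an exchange whose result is
  discarded, and an atomic load into a compare-and-swap against zero, which
  returns the current value whether or not it stores. A rough IR-level
  sketch of the idea (the actual expansion operates on DAG nodes; names
  illustrative, orderings simplified, single-ordering cmpxchg syntax of the
  time):

      ; store atomic i64 %v, i64* %p seq_cst, align 8   becomes roughly
      %old = atomicrmw xchg i64* %p, i64 %v seq_cst

      ; %r = load atomic i64* %p seq_cst, align 8       becomes roughly
      %r = cmpxchg i64* %p, i64 0, i64 0 seq_cst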
* 64-bit atomic cmpxchg for ARM. (Eli Friedman, 2011-08-31; 1 file changed, -0/+15)

  llvm-svn: 138868
* Some minor cleanups for r138845. (Eli Friedman, 2011-08-31; 1 file changed, -1/+1)

  llvm-svn: 138846
* Some 64-bit atomic operations on ARM. 64-bit cmpxchg coming next. (Eli Friedman, 2011-08-31; 1 file changed, -0/+83)

  llvm-svn: 138845