summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/AArch64/cmpxchg-O0.ll
Commit message (Collapse)AuthorAgeFilesLines
* [AArch64] Improve codegen of volatile load/store of i128Victor Campos2019-12-181-4/+2
| | | | | | | | | | | | | | | | Summary: Instead of generating two i64 instructions for each load or store of a volatile i128 value (two LDRs or STRs), now emit a single LDP or STP. Reviewers: labrinea, t.p.northover, efriedma Reviewed By: efriedma Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69559
* Revert r329611, "AArch64: Allow offsets to be folded into addresses with ELF."Peter Collingbourne2018-04-101-4/+5
| | | | | | Caused a build failure in check-tsan. llvm-svn: 329718
* AArch64: Allow offsets to be folded into addresses with ELF.Peter Collingbourne2018-04-091-5/+4
| | | | | | | | | | | | | | | This is a code size win in code that takes offseted addresses frequently, such as C++ constructors that typically need to compute an offseted address of a vtable. It reduces the size of Chromium for Android's .text section by 46KB, or 56KB with ThinLTO (which exposes more opportunities to use a direct access rather than a GOT access). Because the addend range is limited in COFF and Mach-O, this is enabled for ELF only. Differential Revision: https://reviews.llvm.org/D45199 llvm-svn: 329611
* [AArch64][GlobalISel] Enable GlobalISel at -O0 by defaultAmara Emerson2018-01-021-1/+1
| | | | | | | | | | | Tests updated to explicitly use fast-isel at -O0 instead of implicitly. This change also allows an explicit -fast-isel option to override an implicitly enabled global-isel. Otherwise -fast-isel would have no effect at -O0. Differential Revision: https://reviews.llvm.org/D41362 llvm-svn: 321655
* AArch64: Fix cmpxchg O0 expansionMatthias Braun2017-05-261-3/+7
| | | | | | | | | | | | | | - Rewrite livein calculation to use the computeLiveIns() helper function. This is slightly less efficient but easier to reason about and doesn't unnecessarily add pristine and reserved registers[1] - Zero the status register at the beginning of the loop to make sure it has a defined value. - Remove kill flags of values that need to stay alive throughout the loop. [1] An upcoming commit of mine will tighten the MachineVerifier to catch these. llvm-svn: 304048
* AArch64: fix 128-bit cmpxchg at -O0 (again, again).Tim Northover2016-12-011-4/+8
| | | | | | | | | | | | | | | This time the issue is fortunately just a simple mistake rather than a horrible design spectre. I thought SUBS/SBCS provided sufficient NZCV flags for comparing two 64-bit values, but they don't. The fix is slightly clunkier in AArch64 because we can't use conditional execution to emit a pair of CMPs. Traditionally an "icmp ne i128" would map to an EOR/EOR/ORR/CBNZ, but that uses more registers so it's easier to go with a CSET/CINC/CBNZ combination. Slightly less efficient, but this is -O0 anyway. Thanks to Anton Korobeynikov for pointing out the issue. llvm-svn: 288418
* AArch64: don't assume all i128s are BUILD_PAIRsTim Northover2016-08-041-0/+26
| | | | | | | It leads to a crash when they're not. I'm *sure* I've made this mistake before, at least once. llvm-svn: 277755
* [AArch64][FastISel] Select -O0 legal cmpxchg.Ahmed Bougacha2016-07-201-1/+1
| | | | | | | | | | | At -O0, cmpxchg survives AtomicExpand: it's mostly straightforward to select it in fast-isel, and let the pseudo be expanded later. extractvalues on the result are the tricky part: the generic logic only works for legal types (and it would be painful to make it support illegal types), so we can only support i32/i64 cmpxchg. llvm-svn: 276183
* [AArch64] Set correct successors in CMPXCHG pseudo expansion.Ahmed Bougacha2016-04-271-1/+1
| | | | | | | | | | | transferSuccessors() would LoadCmpBB a successor of DoneBB, whereas it should be a successor of the original MBB. Follow-up to r266339. Unfortunately, it's tricky to catch this in the verifier. llvm-svn: 267779
* AArch64: expand cmpxchg after regalloc at -O0.Tim Northover2016-04-141-0/+75
FastRegAlloc works only at the basic-block level and spills all live-out registers. Unfortunately for a stack-based cmpxchg near the spill slots, this can perpetually clear the exclusive monitor, which means the cmpxchg will never succeed. I believe the only way to handle this within LLVM is by expanding the loop post-regalloc. We don't want this in general because it severely limits the optimisations that can be done, so we limit this to -O0 compilations. It's an ugly hack, and about the one good point in the whole mess is that we can treat all cmpxchg operations in the most naive way possible (seq_cst, no clrex faff) without affecting correctness. Should fix PR25526. llvm-svn: 266339
OpenPOWER on IntegriCloud