path: root/llvm/lib/CodeGen/AtomicExpandPass.cpp
Commit log (newest first): subject (author, date; files changed, lines -/+)
...
* Restrict scope of variables [NFC] (Philip Reames, 2016-02-18; 1 file, -2/+2)

    llvm-svn: 261250
* Speculative fix for Windows build (Philip Reames, 2015-12-16; 1 file, -0/+1)

    llvm-svn: 255743
* [IR] Add support for floating point atomic loads and stores (Philip Reames, 2015-12-16; 1 file, -4/+92)

    This patch allows atomic loads and stores of floating point values to be
    specified in the IR, and adds an adapter so they can be lowered via the
    existing backend support for the bitcast-to-equivalent-integer idiom.

    Previously, the only way to specify an atomic float operation was to
    bitcast the pointer to an i32, load the value as an i32, then bitcast it
    to a float. At its most basic, this patch simply moves this expansion
    step to the point where we start lowering to the backend.

    This patch does not add canonicalization rules to convert the bitcast
    idioms to the appropriate atomic loads. I plan to do that in the future,
    but for now let's simply add the support. I'd like to get instruction
    selection working through at least one backend (x86-64) without the
    bitcast conversion before canonicalizing into this form.

    Similarly, I haven't yet added the target hooks to opt out of the
    lowering step I added to AtomicExpand. I figured it would make more
    sense to add those once at least one backend (x86) was ready to
    actually opt out.

    As you can see from the included tests, the generated code quality is
    not great. I plan on submitting some patches to fix this, but help from
    others along that line would be very welcome. I'm not super familiar
    with the backend and my ramp-up time may be material.

    Differential Revision: http://reviews.llvm.org/D15471

    llvm-svn: 255737
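For illustration, a minimal sketch of the bitcast-to-equivalent-integer
expansion described above, written against the LLVM 3.x-era IRBuilder API.
The function name and details are illustrative, not the patch's exact code:

    #include "llvm/IR/DataLayout.h"
    #include "llvm/IR/IRBuilder.h"
    #include "llvm/IR/Instructions.h"
    #include "llvm/IR/Module.h"
    using namespace llvm;

    // Turn `load atomic float, float* %p` into an atomic integer load of the
    // same width, followed by a bitcast back to float.
    static Value *convertAtomicFloatLoadToInteger(LoadInst *LI) {
      IRBuilder<> Builder(LI);
      const DataLayout &DL = LI->getModule()->getDataLayout();
      Type *FPTy = LI->getType();
      // i32 for float, i64 for double: same bit width, so bitcasts are lossless.
      Type *IntTy = Builder.getIntNTy(DL.getTypeSizeInBits(FPTy));

      Value *IntPtr = Builder.CreateBitCast(
          LI->getPointerOperand(),
          IntTy->getPointerTo(LI->getPointerAddressSpace()));
      LoadInst *IntLoad = Builder.CreateLoad(IntTy, IntPtr, "loaded.int");
      IntLoad->setAtomic(LI->getOrdering());
      IntLoad->setAlignment(LI->getAlignment()); // era API: byte count as unsigned

      Value *AsFP = Builder.CreateBitCast(IntLoad, FPTy, "loaded.fp");
      LI->replaceAllUsesWith(AsFP);
      LI->eraseFromParent();
      return AsFP;
    }

Stores are handled symmetrically: bitcast the value to the integer type,
then emit an atomic integer store, which the backend already knows how
to select.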
* AArch64: use ldxp/stxp pair to implement 128-bit atomic loads. (Tim Northover, 2015-12-02; 1 file, -30/+38)

    The ARM ARM is clear that 128-bit loads are only guaranteed to have been
    atomic if there has been a corresponding successful stxp. It's less clear
    for AArch32, so I'm leaving that alone for now.

    llvm-svn: 254524
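The shape of the expansion: since only a successful stxp guarantees the
paired ldxp read was atomic, the load becomes a load-linked/store-conditional
loop that stores the value it just read. A hedged sketch using the era's
TargetLowering hooks (function name and control flow simplified):

    #include "llvm/IR/BasicBlock.h"
    #include "llvm/IR/Function.h"
    #include "llvm/IR/IRBuilder.h"
    #include "llvm/IR/Instructions.h"
    #include "llvm/Target/TargetLowering.h"
    using namespace llvm;

    static Value *expandAtomicLoadToLLSC(LoadInst *LI, const TargetLowering *TLI) {
      BasicBlock *BB = LI->getParent();
      Function *F = BB->getParent();
      LLVMContext &Ctx = F->getContext();

      // Split so we can insert a retry loop at the load's position.
      BasicBlock *ExitBB = BB->splitBasicBlock(LI->getIterator(), "atomic_load.end");
      BasicBlock *LoopBB = BasicBlock::Create(Ctx, "atomic_load.loop", F, ExitBB);
      BB->getTerminator()->setSuccessor(0, LoopBB);

      IRBuilder<> Builder(LoopBB);
      // ldxp: load-linked the 128-bit value.
      Value *Loaded = TLI->emitLoadLinked(Builder, LI->getPointerOperand(),
                                          LI->getOrdering());
      // stxp of the same value: success proves the paired load was atomic.
      Value *Status = TLI->emitStoreConditional(Builder, Loaded,
                                                LI->getPointerOperand(),
                                                LI->getOrdering());
      Value *Success =
          Builder.CreateICmpEQ(Status, Builder.getInt32(0), "success");
      Builder.CreateCondBr(Success, ExitBB, LoopBB);

      LI->replaceAllUsesWith(Loaded);
      LI->eraseFromParent();
      return Loaded;
    }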
* CodeGen: Start removing implicit conversions to/from list iterators, NFC (Duncan P. N. Exon Smith, 2015-10-09; 1 file, -3/+3)

    Start removing implicit conversions to/from list iterators in CodeGen,
    ala r249782 for IR. A lot more to go after this.

    llvm-svn: 249851
* [AArch64] Emit clrex in the expanded cmpxchg fail block. (Ahmed Bougacha, 2015-09-22; 1 file, -3/+14)

    In the comparison failure block of a cmpxchg expansion, the initial
    ldrex/ldxr will not be followed by a matching strex/stxr. On ARM/AArch64,
    this unnecessarily ties up the execution monitor, which might have a
    negative performance impact on some uarchs.

    Instead, release the monitor in the failure block. The clrex instruction
    was designed for this: use it.

    Also see ARMARM v8-A B2.10.2: "Exclusive access instructions and
    Shareable memory locations".

    Differential Revision: http://reviews.llvm.org/D13033

    llvm-svn: 248291
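A sketch of what the failure block gains: a clrex to clear the exclusive
monitor left armed by the initial load-exclusive. This assumes the
llvm.arm.clrex intrinsic and is illustrative, not the patch's exact code:

    #include "llvm/IR/IRBuilder.h"
    #include "llvm/IR/Intrinsics.h"
    #include "llvm/IR/Module.h"
    using namespace llvm;

    // Called while building the cmpxchg failure block, where no strex/stxr
    // will follow the pending exclusive access.
    static void emitClrexForFailedCmpXchg(IRBuilder<> &Builder) {
      Module *M = Builder.GetInsertBlock()->getParent()->getParent();
      // clrex: clear the local monitor without performing a store.
      Builder.CreateCall(Intrinsic::getDeclaration(M, Intrinsic::arm_clrex));
    }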
* [CodeGen] Fix AtomicExpand invalidation issue caused by r247429. (Ahmed Bougacha, 2015-09-12; 1 file, -2/+4)

    llvm-svn: 247514
* [CodeGen] Refactor TLI/AtomicExpand interface to make LLSC explicit. (Ahmed Bougacha, 2015-09-11; 1 file, -18/+17)

    We used to have this magic "hasLoadLinkedStoreConditional()" callback,
    which really meant two things:
      - expand cmpxchg (to ll/sc).
      - expand atomic loads using ll/sc (rather than cmpxchg).

    Remove it, and, instead, introduce explicit callbacks:
      - bool shouldExpandAtomicCmpXchgInIR(inst)
      - AtomicExpansionKind shouldExpandAtomicLoadInIR(inst)

    Differential Revision: http://reviews.llvm.org/D12557

    llvm-svn: 247429
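A sketch of how an LL/SC-style target might override the two new callbacks.
The class name and the 64-bit threshold are hypothetical; the callback
signatures are taken from the commit message:

    #include "llvm/IR/Instructions.h"
    #include "llvm/Target/TargetLowering.h"
    using namespace llvm;

    class MyTargetLowering : public TargetLowering {
    public:
      // LL/SC target: expand cmpxchg into a load-linked/store-conditional loop.
      bool shouldExpandAtomicCmpXchgInIR(AtomicCmpXchgInst *CI) const override {
        return true;
      }

      // Loads wider than the native exclusive width get the LL/SC expansion;
      // everything else is left alone.
      AtomicExpansionKind shouldExpandAtomicLoadInIR(LoadInst *LI) const override {
        return LI->getType()->getPrimitiveSizeInBits() > 64
                   ? AtomicExpansionKind::LLSC
                   : AtomicExpansionKind::None;
      }
    };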
* [CodeGen] Rename AtomicRMWExpansionKind to AtomicExpansionKind. (Ahmed Bougacha, 2015-09-11; 1 file, -3/+3)

    This lets us generalize its usage to the other atomic instructions.

    llvm-svn: 247428
* Fix an alignment error in `llvm::expandAtomicRMWToCmpXchg` without breaking the build where X86 isn't enabled. (Richard Diamond, 2015-08-06; 1 file, -1/+1)

    Summary: Divide the primitive size in bits by eight so the initial
    load's alignment is in bytes as expected. Tested with the included
    unit test.

    Reviewers: rengolin, jfb

    Subscribers: llvm-commits

    Differential Revision: http://reviews.llvm.org/D11804

    llvm-svn: 244229
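The arithmetic at issue, in sketch form: getPrimitiveSizeInBits() is in
bits, while LoadInst alignment is a byte count, so the value must be divided
by eight. The helper name is hypothetical; the API is the era's
unsigned-based one:

    #include "llvm/IR/Instructions.h"
    using namespace llvm;

    static void setByteAlignment(LoadInst *InitLoaded, Type *ValueTy) {
      unsigned SizeInBits = ValueTy->getPrimitiveSizeInBits();
      // Alignment is measured in bytes; passing bits over-aligned by 8x.
      InitLoaded->setAlignment(SizeInBits / 8);
    }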
* Revert "Divide the primitive size in bits by eight so the initial load's ↵Renato Golin2015-08-061-1/+1
| | | | | | | | | alignment is in bytes as expected. Tested with the included unit test." This reverts commit r244155, as it was breaking the buildbots for too long. Should be reapplied with proper fix. llvm-svn: 244205
* Divide the primitive size in bits by eight so the initial load's alignment is in bytes as expected. (Richard Diamond, 2015-08-05; 1 file, -1/+1)

    Tested with the included unit test.

    llvm-svn: 244155
* Write access test. (Richard Diamond, 2015-08-05; 1 file, -1/+2)

    llvm-svn: 244103
* Refactor AtomicExpand::expandAtomicRMWToCmpXchg into a standalone function. (JF Bastien, 2015-08-03; 1 file, -66/+82)

    Summary: This is useful for PNaCl's `RewriteAtomics` pass. NaCl
    intrinsics don't exist for some of the more exotic RMW instructions, so
    by refactoring this function into a standalone one, `RewriteAtomics` can
    share code rewriting those atomics with `AtomicExpand`, while
    additionally saving a few cycles by generating the `cmpxchg`
    NaCl-specific intrinsic with the callback.

    Without this patch, `RewriteAtomics` would require two extra passes over
    functions: first requiring use of the full `AtomicExpand` pass just to
    expand the leftover exotic RMWs, and then running itself again to expand
    the resulting `cmpxchg`s.

    NFC

    Reviewers: jfb

    Subscribers: jfb, llvm-commits

    Differential Revision: http://reviews.llvm.org/D11422

    llvm-svn: 243880
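The refactored entry point, roughly as described: the caller supplies how a
cmpxchg is materialized, so PNaCl can emit its intrinsic instead of the IR
instruction. A sketch with the parameter names hedged from the commit
description:

    #include "llvm/ADT/STLExtras.h"
    #include "llvm/IR/IRBuilder.h"
    #include "llvm/IR/Instructions.h"
    using namespace llvm;

    // Callback that materializes one cmpxchg attempt: given the address, the
    // previously loaded value, and the desired new value, it must set Success
    // and NewLoaded for the loop that drives the expansion.
    typedef function_ref<void(IRBuilder<> &, Value *Addr, Value *Loaded,
                              Value *NewVal, AtomicOrdering Ordering,
                              Value *&Success, Value *&NewLoaded)>
        CreateCmpXchgInstFun;

    // Expand an atomic RMW (nand, [u]min, [u]max, ...) into a cmpxchg loop,
    // delegating cmpxchg creation to the caller. Returns true if expanded.
    bool expandAtomicRMWToCmpXchg(AtomicRMWInst *AI,
                                  CreateCmpXchgInstFun CreateCmpXchg);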
* Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC) (Alexander Kornienko, 2015-06-23; 1 file, -1/+1)

    Apparently, the style needs to be agreed upon first.

    llvm-svn: 240390
* Fixed/added namespace ending comments using clang-tidy. NFC (Alexander Kornienko, 2015-06-19; 1 file, -1/+1)

    The patch is generated using this command:

        tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \
          -checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \
          llvm/lib/

    Thanks to Eugene Kosov for the original patch!

    llvm-svn: 240137
* Fix "the the" in comments.Eric Christopher2015-06-191-1/+1
| | | | llvm-svn: 240112
* Mutate TargetLowering::shouldExpandAtomicRMWInIR to specifically dictate how AtomicRMWInsts are expanded. (JF Bastien, 2015-03-04; 1 file, -9/+24)

    Summary: In PNaCl, most atomic instructions have their own
    @llvm.nacl.atomic.* function; each one, with a few exceptions,
    represents consistent behaviour across all NaCl-supported targets.
    Unfortunately, the atomic RMW operations nand, [u]min, and [u]max
    aren't directly represented by any such @llvm.nacl.atomic.* function.

    This patch refines shouldExpandAtomicRMWInIR in TargetLowering so that
    a future `Le32TargetLowering` class can selectively inform the caller
    how the target desires the atomic RMW instruction to be expanded (i.e.
    via load-linked/store-conditional for ARM/AArch64, via cmpxchg for
    X86/others?, or not at all for Mips), if at all.

    This does not represent a behavioural change and as such no tests were
    added.

    Patch by: Richard Diamond.

    Reviewers: jfb

    Reviewed By: jfb

    Subscribers: jfb, aemerson, t.p.northover, llvm-commits

    Differential Revision: http://reviews.llvm.org/D7713

    llvm-svn: 231250
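What the mutated hook enables, sketched as a free helper for a hypothetical
target. The enum here, AtomicRMWExpansionKind, is the one renamed to
AtomicExpansionKind later in this log; the dispatch logic is illustrative:

    #include "llvm/IR/Instructions.h"
    #include "llvm/Target/TargetLowering.h"
    using namespace llvm;

    static TargetLoweringBase::AtomicRMWExpansionKind
    howToExpandRMW(AtomicRMWInst *RMWI) {
      switch (RMWI->getOperation()) {
      case AtomicRMWInst::Nand:
      case AtomicRMWInst::Min:
      case AtomicRMWInst::Max:
      case AtomicRMWInst::UMin:
      case AtomicRMWInst::UMax:
        // No @llvm.nacl.atomic.* equivalent: have AtomicExpand lower these
        // to a cmpxchg loop.
        return TargetLoweringBase::AtomicRMWExpansionKind::CmpXChg;
      default:
        // Everything else maps directly; leave it alone.
        return TargetLoweringBase::AtomicRMWExpansionKind::None;
      }
    }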
* Migrate AtomicExpandPass and DwarfEHPrepare to using a Function-ized getSubtargetImpl. (Eric Christopher, 2015-01-27; 1 file, -2/+2)

    llvm-svn: 227159
* Cache the lookup of TargetLowering in the atomic expand pass. (Eric Christopher, 2015-01-26; 1 file, -31/+17)

    llvm-svn: 227121
* Lower idempotent RMWs to fence+load (Robin Morisset, 2014-09-25; 1 file, -2/+43)

    Summary: I originally tried doing this specifically for X86 in the
    backend in D5091, but it was rather brittle and generally running too
    late to be general. Furthermore, other targets may want to implement
    similar optimizations. So I reimplemented it at the IR level, fitting
    it into AtomicExpandPass, as it interacts with that pass (which could
    not be cleanly done before at the backend level).

    This optimization relies on a new target hook, which is only used by
    X86 for now, as the correctness of the optimization on other targets
    remains an open question. If it is found correct on other targets, it
    should be trivial to enable it for them.

    Details of the optimization are discussed in D5091.

    Test Plan: make check-all + a new test

    Reviewers: jfb

    Subscribers: llvm-commits

    Differential Revision: http://reviews.llvm.org/D5422

    llvm-svn: 218455
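The idempotence check at the heart of this optimization, in sketch form.
After confirming idempotence, the pass asks the new target hook whether the
RMW may be rewritten as fence + plain atomic load; the helper below is
illustrative, not the patch's exact code:

    #include "llvm/IR/Constants.h"
    #include "llvm/IR/Instructions.h"
    using namespace llvm;

    // An atomic RMW is idempotent when it can never change the stored value,
    // e.g. `atomicrmw or %p, 0` or `atomicrmw and %p, -1`.
    static bool isIdempotentRMW(AtomicRMWInst *RMWI) {
      auto *C = dyn_cast<ConstantInt>(RMWI->getValOperand());
      if (!C)
        return false;
      switch (RMWI->getOperation()) {
      case AtomicRMWInst::Add:
      case AtomicRMWInst::Sub:
      case AtomicRMWInst::Or:
      case AtomicRMWInst::Xor:
        return C->isZero();     // x op 0 == x
      case AtomicRMWInst::And:
        return C->isMinusOne(); // x & ~0 == x
      default:
        return false;
      }
    }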
* [X86] Make wide loads be managed by AtomicExpand (Robin Morisset, 2014-09-23; 1 file, -0/+29)

    Summary: AtomicExpand already had logic for expanding wide loads and
    stores on LL/SC architectures, and for expanding wide stores on CmpXchg
    architectures, but not for wide loads on CmpXchg architectures. This
    patch fills this hole, and makes use of this new feature in the X86
    backend.

    Only one functional change: we now lose the SynchScope attribute. It is
    regrettable, but I have another patch that I will submit soon that will
    solve this for all of AtomicExpand (it seemed better to split it apart
    as it is a different concern).

    Test Plan: make check-all (lots of tests for this functionality already
    exist)

    Reviewers: jfb

    Subscribers: llvm-commits

    Differential Revision: http://reviews.llvm.org/D5404

    llvm-svn: 218332
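The hole being filled, sketched: on a CmpXchg architecture a wide atomic
load can be synthesized as cmpxchg(ptr, 0, 0), which never visibly changes
memory but returns an atomic snapshot of the value. Era-style IRBuilder API;
assumes the load's ordering is at least monotonic:

    #include "llvm/IR/Constants.h"
    #include "llvm/IR/IRBuilder.h"
    #include "llvm/IR/Instructions.h"
    using namespace llvm;

    static Value *expandAtomicLoadToCmpXchg(LoadInst *LI) {
      IRBuilder<> Builder(LI);
      AtomicOrdering Order = LI->getOrdering();
      Value *Addr = LI->getPointerOperand();
      Value *Zero = Constant::getNullValue(LI->getType());

      // cmpxchg %p, 0, 0: if *%p == 0, 0 is stored back (no visible change);
      // either way the returned pair carries an atomic snapshot of *%p.
      Value *Pair = Builder.CreateAtomicCmpXchg(
          Addr, Zero, Zero, Order,
          AtomicCmpXchgInst::getStrongestFailureOrdering(Order));
      Value *Loaded = Builder.CreateExtractValue(Pair, 0, "loaded");

      LI->replaceAllUsesWith(Loaded);
      LI->eraseFromParent();
      return Loaded;
    }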
* Add AtomicExpandPass::bracketInstWithFences, and use it whenever getInsertFencesForAtomic would trigger in SelectionDAGBuilder (Robin Morisset, 2014-09-23; 1 file, -38/+70)

    Summary: The goal is to eventually remove all the code related to
    getInsertFencesForAtomic in SelectionDAGBuilder, as it is wrong
    (designed for ARM, not really portable, works mostly by accident
    because the backends are overly conservative), and repeats the same
    logic that goes in emitLeading/TrailingFence.

    In this patch, I make AtomicExpandPass insert the fences, as it knows
    better where to put them. Because this requires getting the fences and
    not just passing an IRBuilder around, I had to change the return type
    of emitLeading/TrailingFence.

    This code only triggers on ARM for now. Because it is earlier in the
    pipeline than SelectionDAGBuilder, it fires first and lowers the atomic
    accesses itself, so SelectionDAGBuilder does not add barriers anymore
    on ARM.

    If this patch is accepted, I plan to implement emitLeading/TrailingFence
    for all backends that setInsertFencesForAtomic(true), which will allow
    both making them less conservative and simplifying SelectionDAGBuilder
    once they are all using this interface.

    This should not cause any functional change, so the existing tests are
    used and not modified.

    Test Plan: make check-all, benefits from existing tests of atomics on
    ARM

    Reviewers: jfb, t.p.northover

    Subscribers: aemerson, llvm-commits

    Differential Revision: http://reviews.llvm.org/D5179

    llvm-svn: 218329
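A sketch of the helper's shape: it brackets the instruction with the
target's leading and trailing fences and returns them, which is why
emitLeading/TrailingFence had to start returning the fences they create.
Signatures follow the era's hooks; details are hedged:

    #include "llvm/IR/IRBuilder.h"
    #include "llvm/IR/Instructions.h"
    #include "llvm/Target/TargetLowering.h"
    #include <utility>
    using namespace llvm;

    static std::pair<Value *, Value *>
    bracketInstWithFences(Instruction *I, AtomicOrdering Order, bool IsStore,
                          bool IsLoad, const TargetLowering *TLI) {
      // Leading fence goes immediately before the atomic access...
      IRBuilder<> Before(I);
      Value *Leading = TLI->emitLeadingFence(Before, Order, IsStore, IsLoad);
      // ...trailing fence immediately after it (atomic accesses are never
      // terminators, so getNextNode() is safe here).
      IRBuilder<> After(I->getNextNode());
      Value *Trailing = TLI->emitTrailingFence(After, Order, IsStore, IsLoad);
      return std::make_pair(Leading, Trailing);
    }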
* [X86] Use the generic AtomicExpandPass instead of X86AtomicExpandPass (Robin Morisset, 2014-09-17; 1 file, -52/+133)

    This required a new hook called hasLoadLinkedStoreConditional to know
    whether to expand atomics to LL/SC (ARM, AArch64, in a future patch
    Power) or to CmpXchg (X86).

    Apart from that, the new code in AtomicExpandPass is mostly moved from
    X86AtomicExpandPass. The main result of this patch is to get rid of
    that pass, which had lots of code duplicated with AtomicExpandPass.

    llvm-svn: 217928
* Refactor AtomicExpandPass and add a generic isAtomic() method to Instruction (Robin Morisset, 2014-09-03; 1 file, -30/+31)

    Summary: Split shouldExpandAtomicInIR() into different versions for
    Stores/Loads/RMWs/CmpXchgs. This makes runOnFunction cleaner (no more
    redundant checking/casting), and will help moving the X86 backend to
    this pass.

    This requires a way of easily detecting which instructions are atomic.
    I followed the pattern of mayReadFromMemory, mayWriteOrReadMemory,
    etc., in making isAtomic() a method of Instruction implemented by a
    switch on the opcodes.

    Test Plan: make check

    Reviewers: jfb

    Subscribers: mcrosier, llvm-commits

    Differential Revision: http://reviews.llvm.org/D5035

    llvm-svn: 217080
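A sketch of the predicate, as it might read in lib/IR/Instruction.cpp,
following the mayReadFromMemory pattern the message cites (simplified; uses
the era's unscoped AtomicOrdering values):

    bool Instruction::isAtomic() const {
      switch (getOpcode()) {
      default:
        return false;
      case Instruction::AtomicCmpXchg:
      case Instruction::AtomicRMW:
      case Instruction::Fence:
        return true;
      case Instruction::Load:
        // Plain loads/stores are atomic only if they carry an ordering.
        return cast<LoadInst>(this)->getOrdering() != NotAtomic;
      case Instruction::Store:
        return cast<StoreInst>(this)->getOrdering() != NotAtomic;
      }
    }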
* Use target-dependent emitLeading/TrailingFence instead of the target-independent insertLeading/TrailingFence (in AtomicExpandPass) (Robin Morisset, 2014-09-03; 1 file, -51/+46)

    Fixes two latent bugs:
      - There was no fence inserted before an expanded seq_cst load (unsound
        on Power).
      - There was only a release fence before seq_cst stores (again unsound,
        in particular on Power).

    It is not even clear if this is correct on ARM Swift processors (where
    release fences are DMB ishst instead of DMB ish). This behaviour is
    currently preserved on ARM Swift, as it is not clear whether it is
    incorrect; I would love to get documentation stating whether it is
    correct or not.

    These two bugs were not triggered because Power is not (yet) using this
    pass, and these behaviours happen to be (mostly?) working on ARM
    (although they completely butchered the semantics of the LLVM IR). See
    http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-August/075821.html for
    an example of the problems that can be caused by the second of these
    bugs.

    I couldn't see a way of fixing these in a completely target-independent
    way without adding lots of unnecessary fences on ARM, hence the
    target-dependent parts of this patch.

    This patch implements the new target-dependent parts only for ARM (the
    default of not doing anything is enough for AArch64); other
    architectures will use this infrastructure in later patches.

    llvm-svn: 217076
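To make the target-dependent part concrete: a hedged sketch of a
leading-fence policy that addresses the two bugs above, a full fence before
seq_cst accesses and a release fence before releasing stores. It is
expressed with plain IR fences and a hypothetical helper name; real targets
(e.g. ARM) emit their own barrier intrinsics instead:

    #include "llvm/IR/IRBuilder.h"
    using namespace llvm;

    static Instruction *emitLeadingFenceSketch(IRBuilder<> &Builder,
                                               AtomicOrdering Ord,
                                               bool IsStore, bool IsLoad) {
      switch (Ord) {
      case SequentiallyConsistent:
        // The fixes above: seq_cst loads *and* stores get a full leading fence.
        return Builder.CreateFence(SequentiallyConsistent);
      case Release:
      case AcquireRelease:
        // A release fence is needed only ahead of the store half.
        return IsStore ? Builder.CreateFence(Release) : nullptr;
      default:
        return nullptr; // acquire/monotonic: no leading fence required
      }
    }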
* Rename AtomicExpandLoadLinked into AtomicExpand (Robin Morisset, 2014-08-21; 1 file, -0/+384)

    AtomicExpandLoadLinked is currently rather ARM-specific. This patch is
    the first of a group that aim at making it more target-independent. See
    http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-August/075873.html
    for details.

    The command line option is "atomic-expand".

    llvm-svn: 216231