path: root/llvm/lib/CodeGen
Commit message | Author | Age | Files | Lines
...
* Erase fence insertion from SelectionDAGBuilder.cpp (NFC) | Robin Morisset | 2014-10-16 | 1 | -67/+20
  Summary:
  Backends can use setInsertFencesForAtomic to signal to the middle-end that monotonic is the only memory ordering they can accept for stores/loads/rmws/cmpxchg. The code lowering those accesses with a stronger ordering to fences + monotonic accesses currently lives in SelectionDAGBuilder.cpp. In this patch I propose moving this logic out of it for several reasons:
  - There is lots of redundancy to avoid: extremely similar logic already exists in AtomicExpand.
  - The current code in SelectionDAGBuilder does not use any target hooks; it does the same transformation for every backend that requires it.
  - As a result it is plain *unsound*, as it was apparently designed for ARM. It happens to mostly work for the other targets because they are extremely conservative, but Power for example had to switch to AtomicExpand to be able to use lwsync safely (see r218331).
  - Because it produces IR-level fences, it cannot be made sound! This is noted in the C++11 standard (section 29.3, page 1140):
  ```
  Fences cannot, in general, be used to restore sequential consistency for atomic
  operations with weaker ordering semantics.
  ```
  It can also be seen in the following example (called IRIW in the literature):
  ```
  atomic<int> x = y = 0;
  int r1, r2, r3, r4;
  Thread 0: x.store(1);
  Thread 1: y.store(1);
  Thread 2: r1 = x.load(); r2 = y.load();
  Thread 3: r3 = y.load(); r4 = x.load();
  ```
  r1 = r3 = 1 and r2 = r4 = 0 is impossible as long as the accesses are all seq_cst. But if they are lowered to monotonic accesses, no amount of fences can prevent it.

  This patch does three things (I could cut it into parts, but then some of them would not be tested/testable, please tell me if you would prefer that):
  - it provides a default implementation for emitLeadingFence/emitTrailingFence in terms of IR-level fences that mimics the original logic of SelectionDAGBuilder. As we saw above, this is unsound, but it is the best that can be done without knowing the targets well (and there is a comment warning about this risk).
  - it then switches Mips/Sparc/XCore to use AtomicExpand, relying on this default implementation (which exactly replicates the logic of SelectionDAGBuilder, so no functional change).
  - it finally erases this logic from SelectionDAGBuilder, as it is now dead code.

  Ideally, each target would define its own override for emitLeading/TrailingFence using target-specific fences, but I do not know the Sparc/Mips/XCore memory model well enough to do this, and they appear to be dealing fine with the ARM-inspired default expansion for now (probably because they are overly conservative, as Power was). If anyone wants to compile fences more aggressively on these platforms, the long comment should make it clear why he should first override emitLeading/TrailingFence.

  Test Plan: make check-all, no functional change
  Reviewers: jfb, t.p.northover
  Subscribers: aemerson, llvm-commits
  Differential Revision: http://reviews.llvm.org/D5474
  llvm-svn: 219957
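The fence-plus-monotonic expansion described in this commit message can be modelled outside LLVM. The following is only a sketch: `Ordering`, `isAtLeastRelease`, `isAtLeastAcquire` and `lowerWithFences` are hypothetical names, not LLVM's API, and the mapping (a leading fence for release-or-stronger accesses that write, a trailing fence for acquire-or-stronger accesses) is an assumption about what the default expansion does.

```cpp
#include <string>

// Hypothetical stand-in for LLVM's AtomicOrdering enum (not the real type).
enum class Ordering { Monotonic, Acquire, Release, AcquireRelease, SeqCst };

// True for orderings that include release semantics.
bool isAtLeastRelease(Ordering O) {
  return O == Ordering::Release || O == Ordering::AcquireRelease ||
         O == Ordering::SeqCst;
}

// True for orderings that include acquire semantics.
bool isAtLeastAcquire(Ordering O) {
  return O == Ordering::Acquire || O == Ordering::AcquireRelease ||
         O == Ordering::SeqCst;
}

// Describes the assumed default expansion of one atomic access:
// an optional leading fence, the access itself demoted to monotonic,
// and an optional trailing fence.
std::string lowerWithFences(Ordering Ord, bool WritesMemory) {
  std::string Out;
  if (WritesMemory && isAtLeastRelease(Ord))
    Out += "fence; ";
  Out += "monotonic access";
  if (isAtLeastAcquire(Ord))
    Out += "; fence";
  return Out;
}
```

Under this model a seq_cst store becomes "fence; monotonic access; fence", which is the shape that, per the IRIW argument above, cannot restore sequential consistency on every target.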
* Avoid caching the MachineFunction, we don't use it outside of | Eric Christopher | 2014-10-15 | 1 | -9/+7
  runOnMachineFunction.
  llvm-svn: 219847
* Simplify handling of --noexecstack by using getNonexecutableStackSection. | Rafael Espindola | 2014-10-15 | 2 | -7/+9
  llvm-svn: 219799
* [MachineSink] Use the real post dominator tree | Jingyue Wu | 2014-10-15 | 1 | -21/+14
  Summary:
  Fixes a FIXME in MachineSinking. Instead of using the simple heuristics in isPostDominatedBy, use the real MachinePostDominatorTree and MachineLoopInfo. The old heuristics caused instructions to sink unnecessarily, and might create register pressure.
  This is the second try of the fix. The first one (D4814) caused a performance regression due to failing to sink instructions out of loops (PR21115). This patch fixes PR21115 by sinking an instruction from a deeper loop to a shallower one regardless of whether the target block post-dominates the source.
  Thanks Alexey Volkov for reporting PR21115!
  Test Plan:
  Added a NVPTX codegen test to verify that our change prevents the backend from over-sinking. It also shows the unnecessary register pressure caused by over-sinking.
  Added an X86 test to verify we can sink instructions out of loops regardless of the dominance relationship. This test is reduced from Alexey's test in PR21115.
  Updated an affected test in X86.
  Also ran SPEC CINT2006 and llvm-test-suite for compilation time and runtime performance. Results are attached separately in the review thread.
  Reviewers: Jiangning, resistor, hfinkel
  Reviewed By: hfinkel
  Subscribers: hfinkel, bruno, volkalexey, llvm-commits, meheff, eliben, jholewinski
  Differential Revision: http://reviews.llvm.org/D5633
  llvm-svn: 219773
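The sinking rule described in this message might be sketched as a pure decision function. This is a simplified model, not the pass itself: `SinkCandidate` and `isProfitableToSink` are hypothetical names, and the second rule (sinking only pays off when the target does not post-dominate the source) is an assumption inferred from the remark that the old heuristics sank instructions unnecessarily.

```cpp
// Simplified model of the sinking decision. The real pass consults
// MachinePostDominatorTree and MachineLoopInfo instead of plain values.
struct SinkCandidate {
  unsigned SrcLoopDepth;    // loop depth of the block holding the instruction
  unsigned DstLoopDepth;    // loop depth of the candidate target block
  bool DstPostDominatesSrc; // does the target post-dominate the source?
};

bool isProfitableToSink(const SinkCandidate &C) {
  // Leaving a deeper loop for a shallower one is treated as profitable
  // regardless of post-dominance (the PR21115 fix described above).
  if (C.DstLoopDepth < C.SrcLoopDepth)
    return true;
  // Assumption: if the target post-dominates the source, the instruction
  // executes on every path anyway, so sinking only lengthens live ranges.
  return !C.DstPostDominatesSrc;
}
```

The ordering of the two checks matters in this sketch: the loop-depth test comes first so that a post-dominating block outside the loop is still accepted as a target.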
* [AArch64] Optimize CSINC-branch sequence | Gerolf Hoflehner | 2014-10-14 | 1 | -0/+12
  Peephole optimization that generates a single conditional branch for csinc-branch sequences like in the examples below. This is possible when the csinc sets or clears a register based on a condition code and the branch checks that register. Also, the condition code must not be modified between the csinc and the original branch.
  Examples:
  1. Convert csinc w9, wzr, wzr, <CC>; tbnz w9, #0, 0x44 to b.<invCC>
  2. Convert csinc w9, wzr, wzr, <CC>; tbz w9, #0, 0x44 to b.<CC>
  rdar://problem/18506500
  llvm-svn: 219742
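Why the two examples fold the way they do can be checked with a small model. This is a sketch with hypothetical helper names (`csincZero`, `tbnzTaken`, `tbzTaken`, `foldedBranchCond`). It relies on two facts about A64: `csinc Wd, wzr, wzr, CC` yields 0 when CC holds and 1 otherwise, and a condition code other than AL is inverted by flipping its lowest encoding bit.

```cpp
#include <cstdint>

// A64 condition codes with their architectural encodings (AL/NV omitted).
enum CondCode : std::uint8_t {
  EQ = 0, NE = 1, HS = 2, LO = 3, MI = 4, PL = 5, VS = 6,
  VC = 7, HI = 8, LS = 9, GE = 10, LT = 11, GT = 12, LE = 13
};

// Inverting a condition flips the low bit of its encoding.
CondCode invert(CondCode CC) { return static_cast<CondCode>(CC ^ 1); }

// csinc Wd, wzr, wzr, CC: Wd = CC ? wzr : wzr + 1.
std::uint32_t csincZero(bool CondHolds) { return CondHolds ? 0u : 1u; }

// tbnz Wd, #0: branch when bit 0 is set. tbz Wd, #0: branch when it is clear.
bool tbnzTaken(std::uint32_t Reg) { return (Reg & 1u) != 0; }
bool tbzTaken(std::uint32_t Reg) { return (Reg & 1u) == 0; }

// Condition of the single b.<cond> that replaces the csinc + test-branch pair.
CondCode foldedBranchCond(CondCode CC, bool IsTBNZ) {
  return IsTBNZ ? invert(CC) : CC;
}
```

The tbnz branch is taken exactly when CC does not hold, hence `b.<invCC>`. The tbz branch is taken exactly when CC holds, hence `b.<CC>`.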
* Remove unused member variable. | Rafael Espindola | 2014-10-14 | 2 | -5/+3
  Fixes pr20904.
  llvm-svn: 219706
* DebugInfo: Ensure that all debug location scope chains from instructions within a function, lead to the function itself. | David Blaikie | 2014-10-14 | 1 | -2/+7
  Let me tell you a tale...
  Originally committed in r211723 after discovering a nasty case of weird scoping due to inlining, this was reverted in r211724 after it fired in ASan/compiler-rt.
  (minor diversion where I accidentally committed/reverted again in r211871/r211873)
  After further testing and fixing bugs in ArgumentPromotion (r211872) and Inlining (r212065) it was recommitted in r212085. Reverted in r212089 after the sanitizer buildbots still showed problems.
  Fixed another bug in ArgumentPromotion (r212128) found by this assertion.
  Recommitted in r212205, reverted in r212226 after it crashed some more on sanitizer buildbots.
  Fix clang some more in r212761.
  Recommitted in r212776, reverted in r212793. ASan failures.
  Recommitted in r213391, reverted in r213432, trying to reproduce flakey ASan build failure.
  Fixed bugs in r213805 (ArgPromo + DebugInfo), r213952 (LiveDebugVariables strips dbg_value intrinsics in functions not described by debug info).
  Recommitted in r214761, reverted in r214999, flakey failure on Windows buildbot.
  Fixed DeadArgElimination + DebugInfo bug in r219210.
  Recommitted in r219215, reverted in r219512, failure on ObjC++ atomic properties in the test-suite on Darwin.
  Fixed ObjC++ atomic properties issue in Clang in r219690.
  [This commit is provided 'as is' with no hope that this is the last time I commit this change either expressed or implied]
  llvm-svn: 219702
* Revert "Fix stuff... again." | David Blaikie | 2014-10-14 | 1 | -7/+2
  Accidental commit.
  This reverts commit r219693.
  llvm-svn: 219695
* Revert some parts of r196288 that were confusing and untested. | David Blaikie | 2014-10-14 | 1 | -8/+2
  If we figure out why they should be here, let's add some testing of some kind so we can better demonstrate why it's needed.
  llvm-svn: 219694
* Fix stuff... again. | David Blaikie | 2014-10-14 | 1 | -2/+7
  llvm-svn: 219693
* Remove unnecessary TargetMachine.h includes. | Eric Christopher | 2014-10-14 | 25 | -26/+1
  llvm-svn: 219672
* Grab the subtarget and subtarget dependent variables off of | Eric Christopher | 2014-10-14 | 4 | -21/+10
  MachineFunction rather than TargetMachine.
  llvm-svn: 219671
* Grab the subtarget and subtarget dependent variables off of | Eric Christopher | 2014-10-14 | 2 | -9/+6
  MachineFunction rather than TargetMachine.
  llvm-svn: 219670
* Instead of the TargetMachine cache the MachineFunction | Eric Christopher | 2014-10-14 | 1 | -14/+13
  and TargetRegisterInfo in the peephole optimizer. This makes it easier to grab subtarget dependent variables off of the MachineFunction rather than the TargetMachine.
  llvm-svn: 219669
* Access subtarget specific variables off of the MachineFunction's | Eric Christopher | 2014-10-14 | 2 | -6/+4
  cached subtarget and not the TargetMachine.
  llvm-svn: 219668
* Access the subtarget off of the MachineFunction via the DAG | Eric Christopher | 2014-10-14 | 1 | -9/+7
  scheduler or via the SelectionDAG if available. Otherwise grab the subtarget off of the MachineFunction by going up the parent chain.
  llvm-svn: 219666
* Remove the use and member variable of the TargetMachine from | Eric Christopher | 2014-10-14 | 1 | -6/+4
  MachineLICM as we can get the same data off of the MachineFunction.
  llvm-svn: 219663
* Have MachineInstrBundle use the MachineFunction for subtarget | Eric Christopher | 2014-10-14 | 1 | -5/+5
  access rather than the TargetMachine.
  llvm-svn: 219662
* Access the subtarget off of the MachineFunction rather than | Eric Christopher | 2014-10-14 | 1 | -4/+2
  through the TargetMachine.
  llvm-svn: 219661
* Remove the TargetMachine from DFAPacketizer since it was only | Eric Christopher | 2014-10-14 | 1 | -2/+2
  being used to grab subtarget specific things that we can grab from the MachineFunction anyhow.
  llvm-svn: 219650
* Migrate another set of getSubtargetImpl away. | Eric Christopher | 2014-10-13 | 1 | -2/+2
  llvm-svn: 219636
* Add an assertion about the integrity of the iterator. | Adrian Prantl | 2014-10-13 | 1 | -0/+5
  Broken parent scope pointers in inlined DIVariables can cause ensureAbstractVariableIsCreated to insert new abstract scopes, thus invalidating the iterator in this loop and leading to hard-to-debug crashes. Useful when manually reducing IR for testcases.
  llvm-svn: 219628
* constify the getters in SDNodeDbgValue. | Adrian Prantl | 2014-10-13 | 1 | -12/+12
  llvm-svn: 219627
* Refactor debug statement and remove dead argument. NFC. | Chad Rosier | 2014-10-13 | 2 | -18/+13
  llvm-svn: 219626
* Modernize old-style static asserts. NFC. | Benjamin Kramer | 2014-10-12 | 1 | -1/+1
  llvm-svn: 219588
* Revert "DebugInfo: Ensure that all debug location scope chains from instructions within a function, lead to the function itself." | David Blaikie | 2014-10-10 | 1 | -7/+2
  This invariant is violated (& the assertions fire) on some Objective C++ in the test-suite. Reverting while I investigate.
  This reverts commit r219215.
  llvm-svn: 219523
* [MiSched] Fix a logic error in tryPressure() | Hal Finkel | 2014-10-10 | 1 | -2/+2
  Fixes a logic error in the MachineScheduler found by Steve Montgomery (and confirmed by Andy). This has gone unfixed for months because the fix has been found to introduce some small performance regressions. However, Andy has recommended that, at this point, we fix this to avoid further dependence on the incorrect behavior (and then follow-up separately on any regressions), and I agree.
  Fixes PR18883.
  llvm-svn: 219512
* Simplify a few uses of DwarfDebug::SPMap | David Blaikie | 2014-10-10 | 2 | -22/+4
  llvm-svn: 219510
* Reorder functions in WinCodeViewLineTables.cpp [NFC] | Timur Iskhodzhanov | 2014-10-10 | 1 | -51/+53
  This helps read the comments and understand the code in a natural order
  llvm-svn: 219508
* Reduce double set lookups. NFC. | Benjamin Kramer | 2014-10-10 | 1 | -6/+2
  llvm-svn: 219505
* Fix a small typo, NFC | Timur Iskhodzhanov | 2014-10-10 | 1 | -1/+1
  llvm-svn: 219492
* Sink the per-CU part of DwarfDebug::finishSubprogramDefinitions into DwarfCompileUnit. | David Blaikie | 2014-10-10 | 3 | -15/+21
  llvm-svn: 219477
* Sink most of DwarfDebug::constructAbstractSubprogramScopeDIE down into DwarfCompileUnit. | David Blaikie | 2014-10-10 | 4 | -29/+40
  llvm-svn: 219476
* Avoid unnecessary map lookup/insertion. | David Blaikie | 2014-10-10 | 1 | -2/+2
  llvm-svn: 219466
* Improve sqrt estimate algorithm (fast-math) | Sanjay Patel | 2014-10-09 | 1 | -17/+16
  This patch changes the fast-math implementation for calculating sqrt(x) from:
    y = 1 / (1 / sqrt(x))
  to:
    y = x * (1 / sqrt(x))
  This has 2 benefits: less code / faster code and one less estimate instruction that may lose precision.
  The only target that will be affected (until http://reviews.llvm.org/D5658 is approved) is PPC. The difference in codegen for PPC is 2 less flops for a single-precision sqrtf or vector sqrtf and 4 less flops for a double-precision sqrt. We also eliminate a constant load and extra register usage.
  Differential Revision: http://reviews.llvm.org/D5682
  llvm-svn: 219445
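The two formulations in this message can be compared in a few lines. This is only a sketch: `rsqrtEstimate` and `recipEstimate` are hypothetical stand-ins for hardware estimate instructions (here computed with `std::sqrt` and a division), and the single Newton-Raphson refinement step is an assumption about how such an estimate is typically sharpened, not the patch's actual code.

```cpp
#include <cmath>

// One Newton-Raphson step for the reciprocal square root of X.
float refineRsqrt(float X, float Est) {
  return Est * (1.5f - 0.5f * X * Est * Est);
}

// Stand-in for a hardware reciprocal-sqrt estimate followed by refinement.
float rsqrtEstimate(float X) {
  return refineRsqrt(X, 1.0f / std::sqrt(X));
}

// Stand-in for a hardware reciprocal estimate.
float recipEstimate(float V) { return 1.0f / V; }

// Old form: y = 1 / (1 / sqrt(x)), which needs two estimates.
float sqrtOld(float X) { return recipEstimate(rsqrtEstimate(X)); }

// New form: y = x * (1 / sqrt(x)), one estimate and one multiply.
float sqrtNew(float X) { return X * rsqrtEstimate(X); }
```

One consequence of the new form in this naive sketch is that x = 0 evaluates 0 * inf, which is NaN under IEEE arithmetic, so a real lowering would presumably have to treat zero separately.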
* delete function names from comments | Sanjay Patel | 2014-10-09 | 1 | -32/+30
  llvm-svn: 219444
* Remove unused parameter | David Blaikie | 2014-10-09 | 2 | -6/+5
  llvm-svn: 219440
* Sink DwarfDebug::createAndAddScopeChildren down into DwarfCompileUnit. | David Blaikie | 2014-10-09 | 4 | -19/+17
  llvm-svn: 219437
* Sink DwarfDebug::constructSubprogramScopeDIE down into DwarfCompileUnit | David Blaikie | 2014-10-09 | 4 | -49/+55
  llvm-svn: 219436
* Sink DwarfDebug::createScopeChildrenDIE down into DwarfCompileUnit. | David Blaikie | 2014-10-09 | 4 | -28/+31
  llvm-svn: 219422
* [PBQP] Replace PBQPBuilder with composable constraints (PBQPRAConstraint). | Lang Hames | 2014-10-09 | 1 | -355/+307
  This patch removes the PBQPBuilder class and its subclasses and replaces them with a composable constraints class: PBQPRAConstraint. This allows constraints that are only required for optimisation (e.g. coalescing, soft pairing) to be mixed and matched.
  This patch also introduces support for target writers to supply custom constraints for their targets by overriding a TargetSubtargetInfo method:
    std::unique_ptr<PBQPRAConstraints> getCustomPBQPConstraints() const;
  This patch should have no effect on allocations.
  llvm-svn: 219421
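The "composable constraints" idea in this message resembles a composite pattern, which might look like the sketch below. All names here (`Graph`, `Constraint`, `ConstraintList`, `SpillCosts`, `Interference`, `Coalescing`, `buildConstraints`) are hypothetical illustrations and do not reproduce the PBQPRAConstraint interface. The null check in `add` assumes that a target hook may return no custom constraint.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the PBQP graph: records which constraints ran.
struct Graph {
  std::vector<std::string> Applied;
};

// A constraint adds costs to the graph.
struct Constraint {
  virtual ~Constraint() = default;
  virtual void apply(Graph &G) = 0;
};

struct SpillCosts : Constraint {
  void apply(Graph &G) override { G.Applied.push_back("spill-costs"); }
};
struct Interference : Constraint {
  void apply(Graph &G) override { G.Applied.push_back("interference"); }
};
struct Coalescing : Constraint {
  void apply(Graph &G) override { G.Applied.push_back("coalescing"); }
};

// A list of constraints is itself a constraint, so sets can be nested.
class ConstraintList : public Constraint {
  std::vector<std::unique_ptr<Constraint>> Items;

public:
  void add(std::unique_ptr<Constraint> C) {
    if (C) // a target may supply no custom constraint
      Items.push_back(std::move(C));
  }
  void apply(Graph &G) override {
    for (auto &C : Items)
      C->apply(G);
  }
};

// Mixes required constraints with optional and target-supplied ones.
std::unique_ptr<Constraint>
buildConstraints(bool EnableCoalescing,
                 std::unique_ptr<Constraint> TargetCustom) {
  std::unique_ptr<ConstraintList> List(new ConstraintList());
  List->add(std::unique_ptr<Constraint>(new SpillCosts()));
  List->add(std::unique_ptr<Constraint>(new Interference()));
  if (EnableCoalescing)
    List->add(std::unique_ptr<Constraint>(new Coalescing()));
  List->add(std::move(TargetCustom));
  return std::unique_ptr<Constraint>(std::move(List));
}
```

Because the list implements the same interface as a single constraint, optimisation-only constraints can be switched on or off without touching the allocator that calls `apply`.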
* Sink DwarfDebug.cpp::constructVariableDIE into DwarfCompileUnit. | David Blaikie | 2014-10-09 | 3 | -12/+14
  llvm-svn: 219419
* Move DwarfUnit::constructVariableDIE down to DwarfCompileUnit, since it's only needed there. | David Blaikie | 2014-10-09 | 4 | -72/+74
  llvm-svn: 219418
* Sink DwarfDebug::constructLexicalScopeDIE into DwarfCompileUnit | David Blaikie | 2014-10-09 | 4 | -23/+21
  llvm-svn: 219414
* Missing reformatting | David Blaikie | 2014-10-09 | 1 | -1/+1
  llvm-svn: 219413
* Sink DwarfDebug::constructInlinedScopeDIE into DwarfCompileUnit | David Blaikie | 2014-10-09 | 4 | -44/+50
  This introduces access to the AbstractSPDies map from DwarfDebug so DwarfCompileUnit can access it. Eventually this'll sink down to DwarfFile, but it'll still be generically accessible - not much encapsulation to provide it. (constructInlinedScopeDIE could stay further up, in DwarfFile to avoid exposing this - but I don't think that's particularly better)
  llvm-svn: 219411
* Remove more calls to getSubtargetImpl from the schedulers and | Eric Christopher | 2014-10-09 | 3 | -24/+17
  remove cached or unnecessary TargetMachines.
  llvm-svn: 219387
* Remove unused argument to CreateTargetScheduleState and change | Eric Christopher | 2014-10-09 | 2 | -2/+2
  the TargetMachine to a TargetSubtargetInfo since everything we wanted is off of that.
  llvm-svn: 219382
* Remove uses of getSubtargetImpl from ResourcePriorityQueue and | Eric Christopher | 2014-10-09 | 1 | -7/+5
  replace them with calls off of the MachineFunction.
  llvm-svn: 219381
* Remove the uses of getSubtargetImpl from InstrEmitter and remove | Eric Christopher | 2014-10-09 | 2 | -9/+6
  the now unused TargetMachine variable.
  llvm-svn: 219379