summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
...
* transform fadd chains to increase parallelismSanjay Patel2015-04-281-0/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a compromise: with this simple patch, we should always handle a chain of exactly 3 operations optimally, but we're not generating the optimal balanced binary tree for a longer sequence. In general, this transform will reduce the dependency chain for a sequence of instructions using N operands from a worst case N-1 dependent operations to N/2 dependent operations. The optimal balanced binary tree would reduce the chain to log2(N). The trade-off for not dealing with longer sequences is: (1) we have less complexity in the compiler, (2) we avoid unknown compile-time blowup calculating a balanced tree, and (3) we don't need to worry about the increased register pressure required to parallelize longer sequences. It also seems unlikely that we would ever encounter really long strings of dependent ops like that in the wild, but I'm not sure how to verify that speculation. FWIW, I see no perf difference for test-suite running on btver2 (x86-64) with -ffast-math and this patch. We can extend this patch to cover other associative operations such as fmul, fmax, fmin, integer add, integer mul. This is a partial fix for: https://llvm.org/bugs/show_bug.cgi?id=17305 and if extended: https://llvm.org/bugs/show_bug.cgi?id=21768 https://llvm.org/bugs/show_bug.cgi?id=23116 The issue also came up in: http://reviews.llvm.org/D8941 Differential Revision: http://reviews.llvm.org/D9232 llvm-svn: 236031
* move IR-level optimization flags into their own structSanjay Patel2015-04-282-8/+11
| | | | | | | | | | | | | | | | | | | | | | This is a preliminary step to using the IR-level floating-point fast-math-flags in the SDAG (D8900). In this patch, we introduce the optimization flags as their own struct. As noted in the TODO comment, we should eventually share this data between the IR passes and the backend. We also switch the existing nsw / nuw / exact bit functionality of the BinaryWithFlagsSDNode class to use the new struct. The tradeoff is that instead of using the free but limited space of SDNode's SubclassData, we add a data member to the subclass. This means we don't have to repeat all of the get/set methods per flag, but we're potentially adding size to all nodes of this subclassi type. In practice on 64-bit systems (measured on Linux and MacOS X), there is no size difference between an SDNode and BinaryWithFlagsSDNode after this change: they're both 80 bytes. This means that we had at least one free byte to play with due to struct alignment. Differential Revision: http://reviews.llvm.org/D9325 llvm-svn: 235997
* Reapply r235977 "[DebugInfo] Add debug locations to constant SD nodes"Sergey Dmitrouk2015-04-2813-1172/+1434
| | | | | | | | | | | | | | | | | | | | | | | | | [DebugInfo] Add debug locations to constant SD nodes This adds debug location to constant nodes of Selection DAG and updates all places that create constants to pass debug locations (see PR13269). Can't guarantee that all locations are correct, but in a lot of cases choice is obvious, so most of them should be. At least all tests pass. Tests for these changes do not cover everything, instead just check it for SDNodes, ARM and AArch64 where it's easy to get incorrect locations on constants. This is not complete fix as FastISel contains workaround for wrong debug locations, which drops locations from instructions on processing constants, but there isn't currently a way to use debug locations from constants there as llvm::Constant doesn't cache it (yet). Although this is a bit different issue, not directly related to these changes. Differential Revision: http://reviews.llvm.org/D9084 llvm-svn: 235989
* Revert "[DebugInfo] Add debug locations to constant SD nodes"Daniel Jasper2015-04-2813-1434/+1172
| | | | | | | This breaks a test: http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/23870 llvm-svn: 235987
* [DebugInfo] Add debug locations to constant SD nodesSergey Dmitrouk2015-04-2813-1172/+1434
| | | | | | | | | | | | | | | | | | | | | | | This adds debug location to constant nodes of Selection DAG and updates all places that create constants to pass debug locations (see PR13269). Can't guarantee that all locations are correct, but in a lot of cases choice is obvious, so most of them should be. At least all tests pass. Tests for these changes do not cover everything, instead just check it for SDNodes, ARM and AArch64 where it's easy to get incorrect locations on constants. This is not complete fix as FastISel contains workaround for wrong debug locations, which drops locations from instructions on processing constants, but there isn't currently a way to use debug locations from constants there as llvm::Constant doesn't cache it (yet). Although this is a bit different issue, not directly related to these changes. Differential Revision: http://reviews.llvm.org/D9084 llvm-svn: 235977
* Masked gather and scatter: Added code for SelectionDAG.Elena Demikhovsky2015-04-283-0/+203
| | | | | | | | All other patches, including tests will follow. http://reviews.llvm.org/D7665 llvm-svn: 235970
* Switch lowering: use uint32_t for weights everywhereHans Wennborg2015-04-272-6/+12
| | | | | | | | | I previously thought switch clusters would need to use uint64_t in case the weights of multiple cases overflowed a 32-bit int. It turns out that the weights on a terminator instruction are capped to allow for being added together, so using a uint32_t should be safe. llvm-svn: 235945
* Switch lowering: Take branch weight into account when ordering for fall-throughHans Wennborg2015-04-271-3/+4
| | | | | | | | | | | Previously, the code would try to put a fall-through case last, even if that meant moving a case with much higher branch weight further down the chain. Ordering by branch weight is most important, putting a fall-through block last is secondary. llvm-svn: 235942
* Switch lowering: order bit tests by branch weight.Hans Wennborg2015-04-271-1/+4
| | | | llvm-svn: 235912
* Make the message associated with a fatal error slightly more helpfulPhilip Reames2015-04-261-2/+10
| | | | | | Looking into 23095, my best guess is that the CodeGen library itself isn't getting linked and initialized properly. To make this slightly more obvious to consumers of LLVM, emit a different error message if we can tell that the registry is empty vs you've simply happened to name a collector which hasn't been registered. llvm-svn: 235824
* Fix build error from accidental changeAndrew Kaylor2015-04-241-2/+1
| | | | llvm-svn: 235792
* [WinEH] Find correct cloned entry block for outlined handler functions.Andrew Kaylor2015-04-241-1/+2
| | | | llvm-svn: 235791
* [WinEH] Find correct cloned entry block for outlined handler functions.Andrew Kaylor2015-04-241-4/+13
| | | | llvm-svn: 235789
* [DAGCombiner] Fix the type used in canFoldInAddressingMode to account for theQuentin Colombet2015-04-241-2/+2
| | | | | | | | | | | | | | | | | | right scaling. In the function canFoldInAddressingMode, VT is computed as the type of the destination/source of a LOAD/STORE operations, instead of the memory type of the operation. On targets with a scaling factor on the offset of the LOAD/STORE operations, the function may return false for actually valid cases. This may then prevent the selection of profitable pre or post indexed load/store operations, and instead select pre or post indexed load/store for unprofitable cases. Patch by Francois de Ferriere <francois.de-ferriere@st.com>! Differential Revision: http://reviews.llvm.org/D9146 llvm-svn: 235780
* Remove an unused variable to prevent -Werror build failures.Kaelyn Takata2015-04-241-1/+0
| | | | llvm-svn: 235773
* [SEH] Implement GetExceptionCode in __except blocksReid Kleckner2015-04-244-26/+89
| | | | | | | | | This introduces an intrinsic called llvm.eh.exceptioncode. It is lowered by copying the EAX value live into whatever basic block it is called from. Obviously, this only works if you insert it late during codegen, because otherwise mid-level passes might reschedule it. llvm-svn: 235768
* [AsmPrinter] Make AsmPrinter's OutStreamer member a unique_ptr.Lang Hames2015-04-2420-431/+431
| | | | | | | AsmPrinter owns the OutStreamer, so an owning pointer makes sense here. Using a reference for this is crufty. llvm-svn: 235752
* Switch lowering: fix APInt overflow causing infinite loop / OOMHans Wennborg2015-04-241-1/+2
| | | | llvm-svn: 235729
* [WinEH] Split the landingpad BB instead of cloning itReid Kleckner2015-04-241-21/+9
| | | | | | | This means we don't have to RAUW the landingpad instruction and landingpad BB, which is a nice win. llvm-svn: 235725
* RegisterCoalescer: implicit phsreg uses are fine when rematerializingMatthias Braun2015-04-241-2/+2
| | | | | | | The target hooks should have already checked them. This change is necessary to enable the remateriailzation on R600. llvm-svn: 235673
* RegisterCoalescer: Avoid unnecessary register class widening for some ↵Matthias Braun2015-04-231-3/+25
| | | | | | | | | | | rematerializations I couldn't provide a testcase as none of the public targets has wide register classes with alot of subregisters and at the same time an instruction which "ReMaterializable" and "AsCheapAsAMove" (could probably be added for R600). llvm-svn: 235668
* Re-commit "[SEH] Remove the old __C_specific_handler code now that ↵Reid Kleckner2015-04-234-121/+2
| | | | | | | | | | WinEHPrepare works" This reverts commit r235617. r235649 should have addressed the problems. llvm-svn: 235667
* [WinEH] Ignore filter clauses while mapping landing pad blocks.Andrew Kaylor2015-04-231-0/+6
| | | | llvm-svn: 235656
* Remove trivial assert to fix NDEBUG Werror buildsReid Kleckner2015-04-231-2/+0
| | | | llvm-svn: 235652
* [WinEH] Replace more lpad value uses with undefReid Kleckner2015-04-231-9/+20
| | | | | | | | | | | | | | | | | | | | | We were asserting on code like this: extern "C" unsigned long _exception_code(); void might_crash(unsigned long); void foo() { __try { might_crash(0); } __except(1) { might_crash(_exception_code()); } } Gtest and many other libraries get the exception code from the __except block. What's supposed to happen here is that EAX is live into the __except block, and it contains the exception code. Eventually we'll represent that as a use of the landingpad ehptr value, but for now we can replace it with undef. llvm-svn: 235649
* [MachineCopyPropagation] Handle undef flags conservatively so that we do notQuentin Colombet2015-04-231-1/+5
| | | | | | | | | | | | | | | | | | | | | remove copies that are useful after breaking some hardware dependencies. In other words, handle this kind of situations conservatively by assuming reg2 is redefined by the undef flag. reg1 = copy reg2 = inst reg2<undef> reg2 = copy reg1 Copy propagation used to remove the last copy. This is incorrect because the undef flag on reg2 in inst, allows next passes to put whatever trashed value in reg2 that may help. In practice we end up with this code: reg1 = copy reg2 reg2 = 0 = inst reg2<undef> reg2 = copy reg1 This fixes PR21743. llvm-svn: 235647
* [WinEH] Handle stubs for outlined functions that have only unreached ↵Andrew Kaylor2015-04-231-9/+16
| | | | | | terminators. llvm-svn: 235618
* Revert "[SEH] Remove the old __C_specific_handler code now that WinEHPrepare ↵Reid Kleckner2015-04-234-2/+121
| | | | | | | | | | works" We still have some "uses remain after removal" issues in -O0 builds. This reverts commit r235557. llvm-svn: 235617
* Re-commit r235560: Switch lowering: extract jump tables and bit tests before ↵Hans Wennborg2015-04-233-760/+898
| | | | | | | | | | | building binary tree (PR22262) Third time's the charm. The previous commit was reverted as a reverse for-loop in SelectionDAGBuilder::lowerWorkItem did 'I--' on an iterator at the beginning of a vector, causing asserts when using debugging iterators. This commit fixes that. llvm-svn: 235608
* Revert r235560; this commit was causing several failed assertions in Debug ↵Aaron Ballman2015-04-233-897/+760
| | | | | | builds using MSVC's STL. The iterator is being used outside of its valid range. llvm-svn: 235597
* [DAGCombiner] Remove extra bitcasts surrounding vector shuffles Simon Pilgrim2015-04-231-0/+45
| | | | | | | | Patch to remove extra bitcasts from shuffles, this is often a legacy of XformToShuffleWithZero being used to combine bitmaskings (of float vectors bitcast to integer vectors) into shuffles: bitcast(shuffle(bitcast(s0),bitcast(s1))) -> shuffle(s0,s1) Differential Revision: http://reviews.llvm.org/D9097 llvm-svn: 235578
* [WinEH] Don't skip landing pads that end with an unreachable instruction.Andrew Kaylor2015-04-231-6/+4
| | | | llvm-svn: 235563
* Switch lowering: extract jump tables and bit tests before building binary ↵Hans Wennborg2015-04-223-760/+897
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | tree (PR22262) This is a re-commit of r235101, which also fixes the problems with the previous patch: - Switches with only a default case and non-fallthrough were handled incorrectly - The previous patch tickled a bug in PowerPC Early-Return Creation which is fixed here. > This is a major rewrite of the SelectionDAG switch lowering. The previous code > would lower switches as a binary tre, discovering clusters of cases > suitable for lowering by jump tables or bit tests as it went along. To increase > the likelihood of finding jump tables, the binary tree pivot was selected to > maximize case density on both sides of the pivot. > > By not selecting the pivot in the middle, the binary trees would not always > be balanced, leading to performance problems in the generated code. > > This patch rewrites the lowering to search for clusters of cases > suitable for jump tables or bit tests first, and then builds the binary > tree around those clusters. This way, the binary tree will always be balanced. > > This has the added benefit of decoupling the different aspects of the lowering: > tree building and jump table or bit tests finding are now easier to tweak > separately. > > For example, this will enable us to balance the tree based on profile info > in the future. > > The algorithm for finding jump tables is quadratic, whereas the previous algorithm > was O(n log n) for common cases, and quadratic only in the worst-case. This > doesn't seem to be major problem in practice, e.g. compiling a file consisting > of a 10k-case switch was only 30% slower, and such large switches should be rare > in practice. Compiling e.g. gcc.c showed no compile-time difference. If this > does turn out to be a problem, we could limit the search space of the algorithm. > > This commit also disables all optimizations during switch lowering in -O0. > > Differential Revision: http://reviews.llvm.org/D8649 llvm-svn: 235560
* [SEH] Remove the old __C_specific_handler code now that WinEHPrepare worksReid Kleckner2015-04-224-121/+2
| | | | | | | | | | This removes the -sehprepare flag and makes __C_specific_handler functions always to use WinEHPrepare. This was tested by building all of chromium_builder_tests and running a few tests that use SEH, but if something breaks, we can revert this. llvm-svn: 235557
* [WinEH] Demote values and phis live across exception handlers up frontReid Kleckner2015-04-221-68/+277
| | | | | | | | | | | | | | | | | | In particular, this handles SSA values that are live *out* of a handler. The existing code only handles values that are live *in* to a handler. It also handles phi nodes in the block where normal control should resume after the end of a catch handler. When EH return points have phi nodes, we need to split the return edge. It is impossible for phi elimination to emit copies in the previous block if that block gets outlined. The indirectbr that we leave in the function is only notional, and is eliminated from the MachineFunction CFG early on. Reviewers: majnemer, andrew.w.kaylor Differential Revision: http://reviews.llvm.org/D9158 llvm-svn: 235545
* Test commit: fix typo in comment.Luqman Aden2015-04-221-2/+2
| | | | llvm-svn: 235526
* Fixed logic to enable complex FMA formation.Olivier Sallenave2015-04-221-2/+2
| | | | llvm-svn: 235508
* [DAGCombine] Disable select(c, load,load) for indexed loadsHal Finkel2015-04-221-0/+3
| | | | | | | | | This turned up after r235333, but was a pre-existing bug. The optimization which transforms select(c, load, load) into a load of a select of the addresses does not handle indexed loads (pre/post inc/dec). However, it did not check for them either, leading to a crash if it tried to transform one of them. llvm-svn: 235497
* [patchpoint] Add support for symbolic patchpoint targets to SelectionDAG and theLang Hames2015-04-222-19/+27
| | | | | | | | | | | | X86 backend. The code generated for symbolic targets is identical to the code generated for constant targets, except that a relocation is emitted to fix up the actual target address at link-time. This allows IR and object files containing patchpoints to be cached across JIT-invocations where the target address may change. llvm-svn: 235483
* [WinEH] Correctly handle inlined __finally blocks with capturesReid Kleckner2015-04-221-6/+33
| | | | | | | | We should also teach the inliner to collapse framerecover of frameaddress of the current frame down to an alloca, but that can happen later. llvm-svn: 235459
* DebugInfo: Remove DIArray and DITypeArray typedefsDuncan P. N. Exon Smith2015-04-213-9/+9
| | | | | | | Remove the `DIArray` and `DITypeArray` typedefs, preferring the underlying types (`DebugNodeArray` and `MDTypeRefArray`, respectively). llvm-svn: 235413
* DebugInfo: Drop rest of DIDescriptor subclassesDuncan P. N. Exon Smith2015-04-2115-83/+83
| | | | | | | Delete the remaining subclasses of (the already deleted) `DIDescriptor`. Part of PR23080. llvm-svn: 235404
* DebugInfo: Assert dbg.declare/value insts are validDuncan P. N. Exon Smith2015-04-213-9/+8
| | | | | | | | | | Remove early returns for when `getVariable()` is null, and just assert that it never happens. The Verifier already confirms that there's a valid variable on these intrinsics, so we should assume the debug info isn't broken. I also updated a check for a `!dbg` attachment, which the Verifier similarly guarantees. llvm-svn: 235400
* Re-land r235154-r235156 under the existing -sehprepare flagReid Kleckner2015-04-214-8/+117
| | | | | | | | Keep the old SEH fan-in lowering on by default for now, since projects rely on it. This will make it easy to test this change with a simple flag flip. llvm-svn: 235399
* CONCAT_VECTOR of BUILD_VECTOR - minor fixSimon Pilgrim2015-04-211-0/+12
| | | | | | | | Fixed issue with the combine of CONCAT_VECTOR of 2 BUILD_VECTOR nodes - the optimisation wasn't ensuring that the scalar operands of both nodes were the same type/size for implicit truncation. Test case spotted by Patrik Hagglund llvm-svn: 235371
* Fix generic shift expansion when shift amount is 0Pawel Bylica2015-04-211-7/+9
| | | | | | | | | | | | | | | | | | | | | Summary: This fixes http://llvm.org/bugs/show_bug.cgi?id=16439. This is one possible way to approach this. The other would be to split InL>>(nbits-Amt) into (InL>>(nbits-1-Amt))>>1, which is also valid since since we only need to care about Amt up nbits-1. It's hard to tell which one is better since the shift might be expensive if this stage of expansion is not yet a legal machine integer, whereas comparisons with zero are relatively cheap at all sizes, but more expensive than a shift if the shift is on a legal machine type. Patch by Keno Fischer! Test Plan: regression test from http://reviews.llvm.org/D7752 Reviewers: chfast, resistor Reviewed By: chfast, resistor Subscribers: sanjoy, resistor, chfast, llvm-commits Differential Revision: http://reviews.llvm.org/D4978 llvm-svn: 235370
* [WinEH] Fix problem with landing pad return values used in PHI nodes during ↵Andrew Kaylor2015-04-201-0/+4
| | | | | | outlining. llvm-svn: 235358
* DebugInfo: Delete subclasses of DIScopeDuncan P. N. Exon Smith2015-04-208-43/+45
| | | | | | | Delete subclasses of (the already defunct) `DIScope`, updating users to use the raw pointers from the `Metadata` hierarchy directly. llvm-svn: 235356
* [WinEH] Fix problem with mapping shared empty handler blocks.Andrew Kaylor2015-04-201-2/+38
| | | | | | Differential Revision: http://reviews.llvm.org/D9125 llvm-svn: 235354
* DebugInfo: Delete old subclasses of DITypeDuncan P. N. Exon Smith2015-04-204-32/+29
| | | | | | | | | | | Delete subclasses of (the already deleted) `DIType` in favour of directly using pointers from the `Metadata` hierarchy. While `DICompositeType` wraps `MDCompositeTypeBase` and `DIDerivedType` wraps `MDDerivedTypeBase`, most uses of each really meant the more specific `MDCompositeType` and `MDDerivedType`. llvm-svn: 235351
OpenPOWER on IntegriCloud