summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
* Switch lowering: fix APInt overflow causing infinite loop / OOMHans Wennborg2015-04-241-1/+2
| | | | llvm-svn: 235729
* [WinEH] Split the landingpad BB instead of cloning itReid Kleckner2015-04-241-21/+9
| | | | | | | This means we don't have to RAUW the landingpad instruction and landingpad BB, which is a nice win. llvm-svn: 235725
* RegisterCoalescer: implicit phsreg uses are fine when rematerializingMatthias Braun2015-04-241-2/+2
| | | | | | | The target hooks should have already checked them. This change is necessary to enable the remateriailzation on R600. llvm-svn: 235673
* RegisterCoalescer: Avoid unnecessary register class widening for some ↵Matthias Braun2015-04-231-3/+25
| | | | | | | | | | | rematerializations I couldn't provide a testcase as none of the public targets has wide register classes with alot of subregisters and at the same time an instruction which "ReMaterializable" and "AsCheapAsAMove" (could probably be added for R600). llvm-svn: 235668
* Re-commit "[SEH] Remove the old __C_specific_handler code now that ↵Reid Kleckner2015-04-234-121/+2
| | | | | | | | | | WinEHPrepare works" This reverts commit r235617. r235649 should have addressed the problems. llvm-svn: 235667
* [WinEH] Ignore filter clauses while mapping landing pad blocks.Andrew Kaylor2015-04-231-0/+6
| | | | llvm-svn: 235656
* Remove trivial assert to fix NDEBUG Werror buildsReid Kleckner2015-04-231-2/+0
| | | | llvm-svn: 235652
* [WinEH] Replace more lpad value uses with undefReid Kleckner2015-04-231-9/+20
| | | | | | | | | | | | | | | | | | | | | We were asserting on code like this: extern "C" unsigned long _exception_code(); void might_crash(unsigned long); void foo() { __try { might_crash(0); } __except(1) { might_crash(_exception_code()); } } Gtest and many other libraries get the exception code from the __except block. What's supposed to happen here is that EAX is live into the __except block, and it contains the exception code. Eventually we'll represent that as a use of the landingpad ehptr value, but for now we can replace it with undef. llvm-svn: 235649
* [MachineCopyPropagation] Handle undef flags conservatively so that we do notQuentin Colombet2015-04-231-1/+5
| | | | | | | | | | | | | | | | | | | | | remove copies that are useful after breaking some hardware dependencies. In other words, handle this kind of situations conservatively by assuming reg2 is redefined by the undef flag. reg1 = copy reg2 = inst reg2<undef> reg2 = copy reg1 Copy propagation used to remove the last copy. This is incorrect because the undef flag on reg2 in inst, allows next passes to put whatever trashed value in reg2 that may help. In practice we end up with this code: reg1 = copy reg2 reg2 = 0 = inst reg2<undef> reg2 = copy reg1 This fixes PR21743. llvm-svn: 235647
* [WinEH] Handle stubs for outlined functions that have only unreached ↵Andrew Kaylor2015-04-231-9/+16
| | | | | | terminators. llvm-svn: 235618
* Revert "[SEH] Remove the old __C_specific_handler code now that WinEHPrepare ↵Reid Kleckner2015-04-234-2/+121
| | | | | | | | | | works" We still have some "uses remain after removal" issues in -O0 builds. This reverts commit r235557. llvm-svn: 235617
* Re-commit r235560: Switch lowering: extract jump tables and bit tests before ↵Hans Wennborg2015-04-233-760/+898
| | | | | | | | | | | building binary tree (PR22262) Third time's the charm. The previous commit was reverted as a reverse for-loop in SelectionDAGBuilder::lowerWorkItem did 'I--' on an iterator at the beginning of a vector, causing asserts when using debugging iterators. This commit fixes that. llvm-svn: 235608
* Revert r235560; this commit was causing several failed assertions in Debug ↵Aaron Ballman2015-04-233-897/+760
| | | | | | builds using MSVC's STL. The iterator is being used outside of its valid range. llvm-svn: 235597
* [DAGCombiner] Remove extra bitcasts surrounding vector shuffles Simon Pilgrim2015-04-231-0/+45
| | | | | | | | Patch to remove extra bitcasts from shuffles, this is often a legacy of XformToShuffleWithZero being used to combine bitmaskings (of float vectors bitcast to integer vectors) into shuffles: bitcast(shuffle(bitcast(s0),bitcast(s1))) -> shuffle(s0,s1) Differential Revision: http://reviews.llvm.org/D9097 llvm-svn: 235578
* [WinEH] Don't skip landing pads that end with an unreachable instruction.Andrew Kaylor2015-04-231-6/+4
| | | | llvm-svn: 235563
* Switch lowering: extract jump tables and bit tests before building binary ↵Hans Wennborg2015-04-223-760/+897
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | tree (PR22262) This is a re-commit of r235101, which also fixes the problems with the previous patch: - Switches with only a default case and non-fallthrough were handled incorrectly - The previous patch tickled a bug in PowerPC Early-Return Creation which is fixed here. > This is a major rewrite of the SelectionDAG switch lowering. The previous code > would lower switches as a binary tre, discovering clusters of cases > suitable for lowering by jump tables or bit tests as it went along. To increase > the likelihood of finding jump tables, the binary tree pivot was selected to > maximize case density on both sides of the pivot. > > By not selecting the pivot in the middle, the binary trees would not always > be balanced, leading to performance problems in the generated code. > > This patch rewrites the lowering to search for clusters of cases > suitable for jump tables or bit tests first, and then builds the binary > tree around those clusters. This way, the binary tree will always be balanced. > > This has the added benefit of decoupling the different aspects of the lowering: > tree building and jump table or bit tests finding are now easier to tweak > separately. > > For example, this will enable us to balance the tree based on profile info > in the future. > > The algorithm for finding jump tables is quadratic, whereas the previous algorithm > was O(n log n) for common cases, and quadratic only in the worst-case. This > doesn't seem to be major problem in practice, e.g. compiling a file consisting > of a 10k-case switch was only 30% slower, and such large switches should be rare > in practice. Compiling e.g. gcc.c showed no compile-time difference. If this > does turn out to be a problem, we could limit the search space of the algorithm. > > This commit also disables all optimizations during switch lowering in -O0. > > Differential Revision: http://reviews.llvm.org/D8649 llvm-svn: 235560
* [SEH] Remove the old __C_specific_handler code now that WinEHPrepare worksReid Kleckner2015-04-224-121/+2
| | | | | | | | | | This removes the -sehprepare flag and makes __C_specific_handler functions always to use WinEHPrepare. This was tested by building all of chromium_builder_tests and running a few tests that use SEH, but if something breaks, we can revert this. llvm-svn: 235557
* [WinEH] Demote values and phis live across exception handlers up frontReid Kleckner2015-04-221-68/+277
| | | | | | | | | | | | | | | | | | In particular, this handles SSA values that are live *out* of a handler. The existing code only handles values that are live *in* to a handler. It also handles phi nodes in the block where normal control should resume after the end of a catch handler. When EH return points have phi nodes, we need to split the return edge. It is impossible for phi elimination to emit copies in the previous block if that block gets outlined. The indirectbr that we leave in the function is only notional, and is eliminated from the MachineFunction CFG early on. Reviewers: majnemer, andrew.w.kaylor Differential Revision: http://reviews.llvm.org/D9158 llvm-svn: 235545
* Test commit: fix typo in comment.Luqman Aden2015-04-221-2/+2
| | | | llvm-svn: 235526
* Fixed logic to enable complex FMA formation.Olivier Sallenave2015-04-221-2/+2
| | | | llvm-svn: 235508
* [DAGCombine] Disable select(c, load,load) for indexed loadsHal Finkel2015-04-221-0/+3
| | | | | | | | | This turned up after r235333, but was a pre-existing bug. The optimization which transforms select(c, load, load) into a load of a select of the addresses does not handle indexed loads (pre/post inc/dec). However, it did not check for them either, leading to a crash if it tried to transform one of them. llvm-svn: 235497
* [patchpoint] Add support for symbolic patchpoint targets to SelectionDAG and theLang Hames2015-04-222-19/+27
| | | | | | | | | | | | X86 backend. The code generated for symbolic targets is identical to the code generated for constant targets, except that a relocation is emitted to fix up the actual target address at link-time. This allows IR and object files containing patchpoints to be cached across JIT-invocations where the target address may change. llvm-svn: 235483
* [WinEH] Correctly handle inlined __finally blocks with capturesReid Kleckner2015-04-221-6/+33
| | | | | | | | We should also teach the inliner to collapse framerecover of frameaddress of the current frame down to an alloca, but that can happen later. llvm-svn: 235459
* DebugInfo: Remove DIArray and DITypeArray typedefsDuncan P. N. Exon Smith2015-04-213-9/+9
| | | | | | | Remove the `DIArray` and `DITypeArray` typedefs, preferring the underlying types (`DebugNodeArray` and `MDTypeRefArray`, respectively). llvm-svn: 235413
* DebugInfo: Drop rest of DIDescriptor subclassesDuncan P. N. Exon Smith2015-04-2115-83/+83
| | | | | | | Delete the remaining subclasses of (the already deleted) `DIDescriptor`. Part of PR23080. llvm-svn: 235404
* DebugInfo: Assert dbg.declare/value insts are validDuncan P. N. Exon Smith2015-04-213-9/+8
| | | | | | | | | | Remove early returns for when `getVariable()` is null, and just assert that it never happens. The Verifier already confirms that there's a valid variable on these intrinsics, so we should assume the debug info isn't broken. I also updated a check for a `!dbg` attachment, which the Verifier similarly guarantees. llvm-svn: 235400
* Re-land r235154-r235156 under the existing -sehprepare flagReid Kleckner2015-04-214-8/+117
| | | | | | | | Keep the old SEH fan-in lowering on by default for now, since projects rely on it. This will make it easy to test this change with a simple flag flip. llvm-svn: 235399
* CONCAT_VECTOR of BUILD_VECTOR - minor fixSimon Pilgrim2015-04-211-0/+12
| | | | | | | | Fixed issue with the combine of CONCAT_VECTOR of 2 BUILD_VECTOR nodes - the optimisation wasn't ensuring that the scalar operands of both nodes were the same type/size for implicit truncation. Test case spotted by Patrik Hagglund llvm-svn: 235371
* Fix generic shift expansion when shift amount is 0Pawel Bylica2015-04-211-7/+9
| | | | | | | | | | | | | | | | | | | | | Summary: This fixes http://llvm.org/bugs/show_bug.cgi?id=16439. This is one possible way to approach this. The other would be to split InL>>(nbits-Amt) into (InL>>(nbits-1-Amt))>>1, which is also valid since since we only need to care about Amt up nbits-1. It's hard to tell which one is better since the shift might be expensive if this stage of expansion is not yet a legal machine integer, whereas comparisons with zero are relatively cheap at all sizes, but more expensive than a shift if the shift is on a legal machine type. Patch by Keno Fischer! Test Plan: regression test from http://reviews.llvm.org/D7752 Reviewers: chfast, resistor Reviewed By: chfast, resistor Subscribers: sanjoy, resistor, chfast, llvm-commits Differential Revision: http://reviews.llvm.org/D4978 llvm-svn: 235370
* [WinEH] Fix problem with landing pad return values used in PHI nodes during ↵Andrew Kaylor2015-04-201-0/+4
| | | | | | outlining. llvm-svn: 235358
* DebugInfo: Delete subclasses of DIScopeDuncan P. N. Exon Smith2015-04-208-43/+45
| | | | | | | Delete subclasses of (the already defunct) `DIScope`, updating users to use the raw pointers from the `Metadata` hierarchy directly. llvm-svn: 235356
* [WinEH] Fix problem with mapping shared empty handler blocks.Andrew Kaylor2015-04-201-2/+38
| | | | | | Differential Revision: http://reviews.llvm.org/D9125 llvm-svn: 235354
* DebugInfo: Delete old subclasses of DITypeDuncan P. N. Exon Smith2015-04-204-32/+29
| | | | | | | | | | | Delete subclasses of (the already deleted) `DIType` in favour of directly using pointers from the `Metadata` hierarchy. While `DICompositeType` wraps `MDCompositeTypeBase` and `DIDerivedType` wraps `MDDerivedTypeBase`, most uses of each really meant the more specific `MDCompositeType` and `MDDerivedType`. llvm-svn: 235351
* DwarfUnit: Split MDSubroutineType version of constructTypeDIE()Duncan P. N. Exon Smith2015-04-203-32/+36
| | | | | | | | | | | | | The version of `constructTypeDIE()` for `MDSubroutineType` is unrelated to (and has different callers than) the `MDCompositeType`. Split the two in half. This simplifies an upcoming patch to delete `DICompositeType`. There shouldn't be any real functionality change here. `createTypeDIE()` is `cast<>`'ing where it didn't need to before, but that function in turn is only called for true `MDCompositeType`s. llvm-svn: 235349
* DwarfUnit: Cleanup commentsDuncan P. N. Exon Smith2015-04-202-194/+87
| | | | | | | | | | | | | | Update comment style in `DwarfUnit`. - Drop duplicated comments at definition, and update the comments at the declaration where the definition comments looked newer or more complete. - Drop the `functionName -` prefix. - Add `\brief` in a few places. - Remove a few comments entirely that weren't adding value (just turned the function name and arguments into a sentence). llvm-svn: 235345
* Refactoring and enhancement to FMA combine.Olivier Sallenave2015-04-201-171/+369
| | | | llvm-svn: 235344
* DAGCombine: Remove redundant NaN checks around ISD::FSQRTTom Stellard2015-04-201-0/+35
| | | | | | | | This folds: (select (setcc x, -0.0, *lt), NaN, (fsqrt x)) -> ( fsqrt x) llvm-svn: 235333
* DebugInfo: Remove DITypeDuncan P. N. Exon Smith2015-04-204-27/+27
| | | | | | | | This is the last major parent class, so I'll probably start deleting classes in batches now. Looks like many of the references to the DI* hierarchy were updated organically along the way. llvm-svn: 235331
* [WinEH] Fix memory leak with catch-all mapping.Andrew Kaylor2015-04-201-5/+12
| | | | llvm-svn: 235328
* DebugInfo: Remove DIScopeDuncan P. N. Exon Smith2015-04-204-28/+29
| | | | | | | | | Replace uses of `DIScope` with `MDScope*`. There was one spot where I've left an `MDScope*` uninitialized (where `DIScope` would have been default-initialized to `nullptr`) -- this is intentional, since the if/else that follows should unconditional assign it to a value. llvm-svn: 235327
* DebugInfo: Remove typedefs for DITypeRef, etc.Duncan P. N. Exon Smith2015-04-202-4/+4
| | | | | | | | | | Remove typedefs for type refs: - DITypeRef => MDTypeRef - DIScopeRef => MDScopeRef - DIDescriptorRef => DebugNodeRef llvm-svn: 235323
* [InlineAsm] Remove EarlyClobber on registers that are also inputsHal Finkel2015-04-201-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When an inline asm call has an output register marked as early-clobber, but that same register is also an input operand, what should we do? GCC accepts this, and is documented to accept this for read/write operands saying, "Furthermore, if the earlyclobber operand is also a read/write operand, then that operand is written only after it's used." For write-only operands, the situation seems less clear, but I have at least one existing codebase that assumes this will work, in part because it has syscall macros like this: ({ \ register uint64_t r0 __asm__ ("r0") = (__NR_ ## name); \ register uint64_t r3 __asm__ ("r3") = ((uint64_t) (arg0)); \ register uint64_t r4 __asm__ ("r4") = ((uint64_t) (arg1)); \ register uint64_t r5 __asm__ ("r5") = ((uint64_t) (arg2)); \ __asm__ __volatile__ \ ("sc" \ : "=&r"(r0),"=&r"(r3),"=&r"(r4),"=&r"(r5) \ : "0"(r0), "1"(r3), "2"(r4), "3"(r5) \ : "r6","r7","r8","r9","r10","r11","r12","cr0","memory"); \ r3; \ }) Furthermore, with register aliases and subregister relationships that only the backend knows about, rejecting this in the frontend seems like a difficult proposition (if we wanted to do so). However, keeping the early-clobber flag on the INLINEASM MI does not work for us, because it will cause the register's live interval to end to soon (so it will not appear defined to be used as an input). Fortunately, fixing this does not seem hard: When forming the INLINEASM MI, check to see if any of the early-clobber outputs are also inputs, and if so, remove the early-clobber flag. llvm-svn: 235283
* Remove CFIFuncName from TargetOptions as it is currently unused.Eric Christopher2015-04-191-6/+0
| | | | llvm-svn: 235268
* Remove the CFIEnforcing flag from TargetOptions as it is unused.Eric Christopher2015-04-191-2/+1
| | | | llvm-svn: 235267
* [GlobalMerge] Look at uses to create smaller global sets.Ahmed Bougacha2015-04-181-12/+240
| | | | | | | | | | | | | | | | | | | | | | | | | | Instead of merging everything together, look at the users of GlobalVariables, and try to group them by function, to create sets of globals used "together". Using that information, a less-aggressive alternative is to keep merging everything together *except* globals that are only ever used alone, that is, those for which it's clearly non-profitable to merge with others. In my testing, grouping by Function is too aggressive, but grouping by BasicBlock is too conservative. Anything in-between isn't trivially available, so stick with Function grouping for now. cl::opts are added for testing; both enabled by default. A few of the testcases aren't testing the merging proper, but just various edge cases when merging does occur. Update them to use the previous grouping behavior. Also, one of the tests is unrelated to GlobalMerge; change it accordingly. While there, switch to r234666' flags rather than the brutal -O3. Differential Revision: http://reviews.llvm.org/D8070 llvm-svn: 235249
* DebugInfo: Delete DIDescriptor (but not its subclasses)Duncan P. N. Exon Smith2015-04-183-21/+16
| | | | | | | Delete `DIDescriptor` and update the remaining users. I'll follow-up by deleting subclasses in manageable groups (top-down). llvm-svn: 235248
* Fix build wanrings and line endingsAndrew Kaylor2015-04-171-1/+0
| | | | llvm-svn: 235241
* DebugInfo: Remove DIDescriptor from the DebugInfo APIDuncan P. N. Exon Smith2015-04-172-2/+3
| | | | | | | Stop using `DIDescriptor` and its subclasses in the `DebugInfoFinder` API, as well as the rest of the API hanging around in `DebugInfo.h`. llvm-svn: 235240
* [WinEH] Fixes for a few cppeh failures.Andrew Kaylor2015-04-171-13/+82
| | | | | | Differential Review: http://reviews.llvm.org/D9065 llvm-svn: 235239
* AsmPrinter: Create a unified .debug_loc streamDuncan P. N. Exon Smith2015-04-178-86/+182
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit removes `DebugLocList` and replaces it with `DebugLocStream`. - `DebugLocEntry` no longer contains its byte/comment streams. - The `DebugLocEntry` list for a variable/inlined-at pair is allocated on the stack, and released right after `DebugLocEntry::finalize()` (possible because of the refactoring in r231023). Now, only one list is in memory at a time now. - There's a single unified stream for the `.debug_loc` section that persists, stored in the new `DebugLocStream` data structure. The last point is important: this collapses the nested `SmallVector<>`s from `DebugLocList` into unified streams. We previously had something like the following: vec<tuple<Label, CU, vec<tuple<BeginSym, EndSym, vec<Value>, vec<char>, vec<string>>>>> A `SmallVector` can avoid allocations, but is statically fairly large for a vector: three pointers plus the size of the small storage, which is the number of elements in small mode times the element size). Nesting these is expensive, since an inner vector's size contributes to the element size of an outer one. (Nesting any vector is expensive...) In the old data structure, the outer vector's *element* size was 632B, excluding allocation costs for when the middle and inner vectors exceeded their small sizes. 312B of this was for the "three" pointers in the vector-tree beneath it. If you assume 1M functions with an average of 10 variable/inlined-at pairs each (in an LTO scenario), that's almost 6GB (besides inner allocations), with almost 3GB for the "three" pointers. This came up in a heap profile a little while ago of a `clang -flto -g` bootstrap, with `DwarfDebug::collectVariableInfo()` using something like 10-15% of the total memory. With this commit, we have: tuple<vec<tuple<Label, CU, Offset>>, vec<tuple<BeginSym, EndSym, Offset, Offset>>, vec<char>, vec<string>> The offsets are used to create `ArrayRef` slices of adjacent `SmallVector`s. This reduces the number of vectors to four (unrelated to the number of variable/inlined-at pairs), and caps the number of allocations at the same number. Besides saving memory and limiting allocations, this is NFC. I don't know my way around this code very well yet, but I wonder if we could go further: why stream to a side-table, instead of directly to the output stream? llvm-svn: 235229
OpenPOWER on IntegriCloud