summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
* MC: Skip names of temporary symbols in object streamerDuncan P. N. Exon Smith2015-05-061-0/+3
| | | | | | | | | | | | | | Don't create names for temporary symbols when using an object streamer. The names never make it to the output anyway. From the starting point of r236629, my heap profile says this drops peak memory usage from 1100 MB to 1058 MB for CodeGen of `verify-uselistorder`, a savings of almost 4% on peak memory, and removes `StringMap<bool, BumpPtrAllocator...>` from the profile entirely. (I'm looking at `llc` memory usage on `verify-uselistorder.lto.opt.bc`; see r236629 for details.) llvm-svn: 236642
* CodeGen: move over-zealous assert into actual if statement.Tim Northover2015-05-061-3/+2
| | | | | | | | | | | | | | | It's quite possible to encounter an insertvalue instruction that's more deeply nested than the value we're looking for, but when that happens we really mustn't compare beyond the end of the index array. Since I couldn't see any guarantees about what comparisons std::equal makes, we probably need to directly check the size beforehand. In practice, I suspect most std::equal implementations would probably bail early, which would be OK. But just in case... rdar://20834485 llvm-svn: 236635
* DwarfDebug: Emit number of bytes in .debug_loc entry directlyDuncan P. N. Exon Smith2015-05-061-6/+3
| | | | | | | | | | | | | | | | | | | | | | Emit the number of bytes in a `.debug_loc` entry directly. The old code created temp labels (expensive), emitted the difference between them, and then emitted one on each side of the relevant bytes. (I'm looking at `llc` memory usage on `verify-uselistorder.lto.opt.bc` (the optimized version of ld64's `-save-temps` when linking the `verify-uselistorder` executable in an LTO bootstrap). I've hacked `MCContext::Allocate()` to just call `malloc()` instead of using the `BumpPtrAllocator` so that the heap profile is easier to read. As far as peak memory is concerned, `MCContext::Allocate()` is equivalent to a leak, since it only gets freed at process teardown. In my heap profile, this patch drops memory usage of `DwarfDebug::emitDebugLoc()` from 132.56 MB (11.4%) down to 29.86 MB (2.7%) at peak memory. Some of that must be noise from `SmallVector` (or other) allocations -- peak memory only dropped from 1160 MB down to 1100 MB -- but this nevertheless shaves 5% off the top.) llvm-svn: 236629
* [WinEH] Improve fatal error message about failed demotionReid Kleckner2015-05-061-1/+6
| | | | llvm-svn: 236626
* [SelectionDAG] Delete SelectionDAGBuilder::removeValue. NFC.Sanjoy Das2015-05-061-6/+0
| | | | | | SelectionDAGBuilder::removeValue is dead now, after rL236563. llvm-svn: 236618
* Allow 0-weight branches in BranchProbabilityInfo.Diego Novillo2015-05-061-1/+5
| | | | | | | | | | | | | | | | | | | | | | | Summary: When computing branch weights in BPI, we used to disallow branches with weight 0. This is a minor nuisance, because a branch with weight 0 is different to "don't have information". In the context of instrumentation, it may mean "never executed", in the context of sampling, it means "never or seldom executed". In allowing 0 weight branches, I ran into issues with the switch expansion code in selection DAG. It is currently hardwired to not handle branches with weight 0. To maintain the current behaviour, I changed it to use 1 when it finds 0, but perhaps the algorithm needs changes to tolerate branches with weight zero. Reviewers: hansw Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9533 llvm-svn: 236617
* Add ChangeTo* to MachineOperand for symbolsMatt Arsenault2015-05-061-0/+22
| | | | llvm-svn: 236612
* Reformat.NAKAMURA Takumi2015-05-062-6/+5
| | | | llvm-svn: 236601
* Revert r236546, "propagate IR-level fast-math-flags to DAG nodes (NFC)"NAKAMURA Takumi2015-05-064-65/+60
| | | | | | It caused undefined behavior. llvm-svn: 236600
* SelectionDAG: Handle out-of-bounds index in extract vector elementPawel Bylica2015-05-061-0/+4
| | | | | | | | | | | | | | | | | | Summary: This patch correctly handles undef case of EXTRACT_VECTOR_ELT node where the element index is constant and not less than vector size. Test Plan: CodeGen for X86 test included. Also one incorrect regression test fixed. Reviewers: qcolombet, chandlerc, hfinkel Reviewed By: hfinkel Subscribers: hfinkel, llvm-commits Differential Revision: http://reviews.llvm.org/D9250 llvm-svn: 236584
* [Statepoint] Clean up StatepointLowering: symbolic constants.Sanjoy Das2015-05-061-2/+3
| | | | | | | | For accessors in the `Statepoint` class, use symbolic constants for offsets into the argument vector instead of literals. This makes the code intent clearer and simpler to change. llvm-svn: 236566
* [Statepoint] Clean up Statepoint.h: accessor names.Sanjoy Das2015-05-062-16/+16
| | | | | | Use getFoo() as accessors consistently and some other naming changes. llvm-svn: 236564
* [StatepointLowering] Don't create temporary instructions. NFCI.Sanjoy Das2015-05-061-73/+69
| | | | | | | | | | | | | | Summary: Instead of creating a temporary call instruction and lowering that, use SelectionDAGBuilder::lowerCallOperands. Reviewers: reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9480 llvm-svn: 236563
* [WinEH] Reset WinEHPrepare::SEHExceptionCodeSlot when we're done.Ahmed Bougacha2015-05-061-0/+1
| | | | | | | This caused a use-after-free on test/CodeGen/X86/win32-eh.ll No functional change intended. llvm-svn: 236561
* [SelectionDAG] Make an argument optional in RFV::getCopyToRegs. NFC.Sanjoy Das2015-05-051-5/+6
| | | | | | | | | | | | | | | Summary: We default the value argument to nullptr. The only use of the value is in diagnosePossiblyInvalidConstraint and that seems to be resilient to it being nullptr. Reviewers: atrick, reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9479 llvm-svn: 236555
* [SelectionDAG] Move RegsForValue into SelectionDAGBuilder.h. NFC.Sanjoy Das2015-05-052-85/+90
| | | | | | | | | | | | | | | Summary: The exported class will be used in later change, in StatepointLowering.cpp. It is still internal to SelectionDAG (not exported via include/). Reviewers: reames, atrick Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9478 llvm-svn: 236554
* [SelectionDAG] Pass explicit type to lowerCallOperands. NFC.Sanjoy Das2015-05-052-5/+6
| | | | | | | | | | | | | | Summary: Currently this does not change anything, but change will be used in a later change to StatepointLowering.cpp Reviewers: reames, atrick Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9477 llvm-svn: 236553
* [StatepointLowering] Rename variable, NFC.Sanjoy Das2015-05-051-3/+3
| | | | | | Rename LoweredArgs to LoweredMetaArgs to clarify intent. llvm-svn: 236552
* Fix IfConverter to handle regmask machine operands.Pete Cooper2015-05-051-1/+14
| | | | | | | | | | | | | | Note, this is a recommit of r236515 after fixing an error in r236514. The buildbot ran fast enough that it picked up r236514 prior to r236515 and threw an error. r236515 itself ran 'make check' without errors. Original commit message follows: A regmask (typically seen on a call) clobbers the set of registers it lists. The IfConverter, in UpdatePredRedefs, was handling register defs, but not regmasks. These are slightly different to a def in that we need to add both an implicit use and def to appease the machine verifier. Otherwise, uses after the if converted call could think they are reading an undefined register. Reviewed by Matthias Braun and Quentin Colombet. llvm-svn: 236550
* propagate IR-level fast-math-flags to DAG nodes (NFC)Sanjay Patel2015-05-054-60/+65
| | | | | | | | | | | | | | | | | | | | | | | | This patch adds the minimum plumbing necessary to use IR-level fast-math-flags (FMF) in the backend without actually using them for anything yet. This is a follow-on to: http://reviews.llvm.org/rL235997 ...which split the existing nsw / nuw / exact flags and FMF into their own struct. There are 2 structural changes here: 1. The main diff is that we're preparing to extend the optimization flags to affect more than just binary SDNodes. Eg, IR intrinsics ( https://llvm.org/bugs/show_bug.cgi?id=21290 ) or non-binop nodes that don't even exist in IR such as FMA, FNEG, etc. 2. The other change is that we're actually copying the FP fast-math-flags from the IR instructions to SDNodes. Differential Revision: http://reviews.llvm.org/D8900 llvm-svn: 236546
* Refactor UpdatePredRedefs and StepForward to avoid duplication. NFCPete Cooper2015-05-052-31/+29
| | | | | | | | | | Note, this is a reapplication of r236515 with a fix to not assert on non-register operands, but instead only handle them until the subsequent commit. Original commit message follows. The code was basically the same here already. Just added an out parameter for a vector of seen defs so that UpdatePredRedefs can call StepForward first, then do its own post processing on the seen defs. Will be used in the next commit to also handle regmasks. llvm-svn: 236538
* [DAGCombiner] Account for getVectorIdxTy() when narrowing vector loadUlrich Weigand2015-05-051-2/+3
| | | | | | | | | | | This patch makes ReplaceExtractVectorEltOfLoadWithNarrowedLoad convert the element number from getVectorIdxTy() to PtrTy before doing pointer arithmetic on it. This is needed on z, where element numbers are i32 but pointers are i64. Original patch by Richard Sandiford. llvm-svn: 236530
* [DAGCombiner] Fix ReplaceExtractVectorEltOfLoadWithNarrowedLoad for BEUlrich Weigand2015-05-051-7/+0
| | | | | | | | | | | | | For little-endian, the function would convert (extract_vector_elt (load X), Y) to X + Y*sizeof(elt). For big-endian it would instead use X + sizeof(vec) - Y*sizeof(elt). The big-endian case wasn't right since vector index order always follows memory/array order, even for big-endian. (Note that the current handling has to be wrong for Y==0 since it would access beyond the end of the vector.) Original patch by Richard Sandiford. llvm-svn: 236529
* [LegalizeVectorTypes] Allow single loads and stores for more short vectorsUlrich Weigand2015-05-051-1/+4
| | | | | | | | | | | | | | | | | | When lowering a load or store for TypeWidenVector, the type legalizer would use a single load or store if the associated integer type was legal. E.g. it would load a v4i8 as an i32 if i32 was legal. This patch extends that behavior to promoted integers as well as legal ones. If the integer type for the full vector width is TypePromoteInteger, the element type is going to be TypePromoteInteger too, and it's still better to use a single promoting load or truncating store rather than N individual promoting loads or truncating stores. E.g. if you have a v2i8 on a target where i16 is promoted to i32, it's better to load the v2i8 as an i16 rather than load both i8s individually. Original patch by Richard Sandiford. llvm-svn: 236528
* Revert "Refactor UpdatePredRedefs and StepForward to avoid duplication. NFC"Pete Cooper2015-05-052-29/+31
| | | | | | | | This reverts commit 963cdbccf6e5578822836fd9b2ebece0ba9a60b7 (ie r236514) This is to get the bots green while i investigate. llvm-svn: 236518
* Revert "Fix IfConverter to handle regmask machine operands."Pete Cooper2015-05-051-14/+0
| | | | | | | | This reverts commit b27413cbfd78d959c18e713bfa271fb69e6b3303 (ie r236515). This is to get the bots green while i investigate the failures. llvm-svn: 236517
* Fix IfConverter to handle regmask machine operands.Pete Cooper2015-05-051-0/+14
| | | | | | | | | | A regmask (typically seen on a call) clobbers the set of registers it lists. The IfConverter, in UpdatePredRedefs, was handling register defs, but not regmasks. These are slightly different to a def in that we need to add both an implicit use and def to appease the machine verifier. Otherwise, uses after the if converted call could think they are reading an undefined register. Reviewed by Matthias Braun and Quentin Colombet. llvm-svn: 236515
* Refactor UpdatePredRedefs and StepForward to avoid duplication. NFCPete Cooper2015-05-052-31/+29
| | | | | | | | The code was basically the same here already. Just added an out parameter for a vector of seen defs so that UpdatePredRedefs can call StepForward first, then do its own post processing on the seen defs. Will be used in the next commit to also handle regmasks. llvm-svn: 236514
* Re-land "[WinEH] Add an EH registration and state insertion pass for 32-bit x86"Reid Kleckner2015-05-052-5/+8
| | | | | | | | | | | | This reverts commit r236360. This change exposed a bug in WinEHPrepare by opting win32 code into EH preparation. We already knew that WinEHPrepare has bugs, and is the status quo for x64, so I don't think that's a reason to hold off on this change. I disabled exceptions in the sanitizer tests in r236505 and an earlier revision. llvm-svn: 236508
* [ShrinkWrap] Add (a simplified version) of shrink-wrapping.Quentin Colombet2015-05-056-37/+530
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces a new pass that computes the safe point to insert the prologue and epilogue of the function. The interest is to find safe points that are cheaper than the entry and exits blocks. As an example and to avoid regressions to be introduce, this patch also implements the required bits to enable the shrink-wrapping pass for AArch64. ** Context ** Currently we insert the prologue and epilogue of the method/function in the entry and exits blocks. Although this is correct, we can do a better job when those are not immediately required and insert them at less frequently executed places. The job of the shrink-wrapping pass is to identify such places. ** Motivating example ** Let us consider the following function that perform a call only in one branch of a if: define i32 @f(i32 %a, i32 %b) { %tmp = alloca i32, align 4 %tmp2 = icmp slt i32 %a, %b br i1 %tmp2, label %true, label %false true: store i32 %a, i32* %tmp, align 4 %tmp4 = call i32 @doSomething(i32 0, i32* %tmp) br label %false false: %tmp.0 = phi i32 [ %tmp4, %true ], [ %a, %0 ] ret i32 %tmp.0 } On AArch64 this code generates (removing the cfi directives to ease readabilities): _f: ; @f ; BB#0: stp x29, x30, [sp, #-16]! mov x29, sp sub sp, sp, #16 ; =16 cmp w0, w1 b.ge LBB0_2 ; BB#1: ; %true stur w0, [x29, #-4] sub x1, x29, #4 ; =4 mov w0, wzr bl _doSomething LBB0_2: ; %false mov sp, x29 ldp x29, x30, [sp], #16 ret With shrink-wrapping we could generate: _f: ; @f ; BB#0: cmp w0, w1 b.ge LBB0_2 ; BB#1: ; %true stp x29, x30, [sp, #-16]! mov x29, sp sub sp, sp, #16 ; =16 stur w0, [x29, #-4] sub x1, x29, #4 ; =4 mov w0, wzr bl _doSomething add sp, x29, #16 ; =16 ldp x29, x30, [sp], #16 LBB0_2: ; %false ret Therefore, we would pay the overhead of setting up/destroying the frame only if we actually do the call. ** Proposed Solution ** This patch introduces a new machine pass that perform the shrink-wrapping analysis (See the comments at the beginning of ShrinkWrap.cpp for more details). It then stores the safe save and restore point into the MachineFrameInfo attached to the MachineFunction. This information is then used by the PrologEpilogInserter (PEI) to place the related code at the right place. This pass runs right before the PEI. Unlike the original paper of Chow from PLDI’88, this implementation of shrink-wrapping does not use expensive data-flow analysis and does not need hack to properly avoid frequently executed point. Instead, it relies on dominance and loop properties. The pass is off by default and each target can opt-in by setting the EnableShrinkWrap boolean to true in their derived class of TargetPassConfig. This setting can also be overwritten on the command line by using -enable-shrink-wrap. Before you try out the pass for your target, make sure you properly fix your emitProlog/emitEpilog/adjustForXXX method to cope with basic blocks that are not necessarily the entry block. ** Design Decisions ** 1. ShrinkWrap is its own pass right now. It could frankly be merged into PEI but for debugging and clarity I thought it was best to have its own file. 2. Right now, we only support one save point and one restore point. At some point we can expand this to several save point and restore point, the impacted component would then be: - The pass itself: New algorithm needed. - MachineFrameInfo: Hold a list or set of Save/Restore point instead of one pointer. - PEI: Should loop over the save point and restore point. Anyhow, at least for this first iteration, I do not believe this is interesting to support the complex cases. We should revisit that when we motivating examples. Differential Revision: http://reviews.llvm.org/D9210 <rdar://problem/3201744> llvm-svn: 236507
* CodeGen: match up correct insertvalue indices when assessing tail calls.Tim Northover2015-05-041-1/+2
| | | | | | | | | | | | When deciding whether a value comes from the aggregate or inserted value of an insertvalue instruction, we compare the indices against those of the location we're interested in. One of the lists needs reversing because the input data is backwards (so that modifications take place at the end of the SmallVector), but we were reversing both before leading to incorrect results. Should fix PR23408 llvm-svn: 236457
* ScheduleDAGInstrs should toggle kill flags on bundled instrs.Pete Cooper2015-05-041-1/+46
| | | | | | | | | | | | | | | | | | | | | | | | ScheduleDAGInstrs wasn't setting or clearing the kill flags on instructions inside bundles. This led to code such as this %R3<def> = t2ANDrr %R0 BUNDLE %ITSTATE<imp-def,dead>, %R0<imp-use,kill> t2IT 1, 24, %ITSTATE<imp-def> R6<def,tied6> = t2ORRrr %R0<kill>, ... being transformed to BUNDLE %ITSTATE<imp-def,dead>, %R0<imp-use> t2IT 1, 24, %ITSTATE<imp-def> R6<def,tied6> = t2ORRrr %R0<kill>, ... %R3<def> = t2ANDrr %R0<kill> where the kill flag was removed from the BUNDLE instruction, but not the t2ORRrr inside it. The verifier then thought that R0 was undefined when read by the AND. This change make the toggleKillFlags method also check for bundles and toggle flags on bundled instructions. Setting the kill flag is special cased as we only want to set the kill flag on the last instruction in the bundle. llvm-svn: 236428
* Masked gather and scatter intrinsics - enabled codegen for KNL.Elena Demikhovsky2015-05-033-3/+182
| | | | llvm-svn: 236394
* [DAGCombiner] Enabled vector float/double -> int constant foldingSimon Pilgrim2015-05-022-4/+4
| | | | llvm-svn: 236387
* DebugInfo: Use low_pc relative debug_ranges under fission when the CU has a ↵David Blaikie2015-05-021-1/+1
| | | | | | | | | low_pc Seems we were setting the base address on the wrong DwarfCompileUnit object so it wasn't being used when generating the ranges. llvm-svn: 236377
* Fix spelling.Jim Grosbach2015-05-021-1/+1
| | | | llvm-svn: 236367
* Revert "[WinEH] Add an EH registration and state insertion pass for 32-bit x86"Reid Kleckner2015-05-012-8/+5
| | | | | | This reverts commit r236359. Things are still broken despite testing. :( llvm-svn: 236360
* Re-land "[WinEH] Add an EH registration and state insertion pass for 32-bit x86"Reid Kleckner2015-05-012-5/+8
| | | | | | This reverts commit r236340. llvm-svn: 236359
* Revert "[WinEH] Add an EH registration and state insertion pass for 32-bit x86"Reid Kleckner2015-05-012-8/+5
| | | | | | This reverts commit r236339, it breaks the win32 clang-cl self-host. llvm-svn: 236340
* [WinEH] Add an EH registration and state insertion pass for 32-bit x86Reid Kleckner2015-05-012-5/+8
| | | | | | | | | | | | | | | | | This pass is responsible for constructing the EH registration object that gets linked into fs:00, which is all it does in this change. In the future, it will also insert stores to update the EH state number. I considered keeping this functionality in WinEHPrepare, but it's pretty separable and X86 specific. It has conceptually very little to do with the task of WinEHPrepare, which is currently outlining. WinEHPrepare is also in theory useful on ARM, but this logic is pretty x86 specific. Reviewers: andrew.w.kaylor, majnemer Differential Revision: http://reviews.llvm.org/D9422 llvm-svn: 236339
* [SelectionDAG] Unary vector constant folding integer legality fixesSimon Pilgrim2015-05-011-5/+25
| | | | | | | | | | | | This patch fixes issues with vector constant folding not correctly handling scalar input operands if they require implicit truncation - this was tested with llvm-stress as recommended by Patrik H Hagglund. The patch ensures that integer input scalars from a build vector are correctly truncated before folding, and that constant integer scalar results are promoted to a legal type before inclusion in the new folded build vector. I have added another crash test case and also a test for UINT_TO_FP / SINT_TO_FP using an non-truncated scalar input, which was failing before this patch. Differential Revision: http://reviews.llvm.org/D9282 llvm-svn: 236308
* Fix typoMatt Arsenault2015-04-301-1/+1
| | | | llvm-svn: 236283
* Commute the internal flag on MachineOperands.Pete Cooper2015-04-301-0/+4
| | | | | | | | | | | | | | When commuting a thumb instruction in the size reduction pass, thumb instructions are represented as a bundle and so some operands may be marked as internal. The internal flag has to move with the operand when commuting. This test is sensitive to register allocation so can't specifically check that this error was happening, but so long as it continues to pass with -verify then hopefully its still ok. rdar://problem/20752113 llvm-svn: 236282
* Fix for PR23103. Correctly propagate the 'IsUndef' flag to the register ↵Andrea Di Biagio2015-04-301-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | operands of a commuted instruction. Revision 220239 exposed a latent bug in method 'TargetInstrInfo::commuteInstruction'. When commuting the operands of a machine instruction, method 'commuteInstruction' didn't correctly propagate the 'IsUndef' flag to the register operands of the new (commuted) instruction. Before this patch, the following instruction: %vreg4<def> = VADDSDrr %vreg14, %vreg5<undef>; FR64:%vreg4,%vreg14,%vreg5 was wrongly converted by method 'commuteInstruction' into: %vreg4<def> = VADDSDrr %vreg5, %vreg14<undef>; FR64:%vreg4,%vreg5,%vreg14 The correct instruction should have been: %vreg4<def> = VADDSDrr %vreg5<undef>, %vreg14; FR64:%vreg4,%vreg5,%vreg14 This patch fixes the problem in method 'TargetInstrInfo::commuteInstruction'. When swapping the operands of a machine instruction, we now make sure that 'IsUndef' flags are correctly set. Added test case 'pr23103.ll'. Differential Revision: http://reviews.llvm.org/D9406 llvm-svn: 236258
* MachineVerifier: Don't crash if MachineOperand has no parentMatt Arsenault2015-04-301-2/+12
| | | | | | | | If you somehow added a MachineOperand to an instruction that did not have the parent set, the verifier would crash since it attempts to use the operand's parent. llvm-svn: 236249
* Don't rewrite jumps to empty BBs to landing pads.Pete Cooper2015-04-301-0/+5
| | | | | | | | | | | | In the test case here, the 'unreachable' BB was removed by BranchFolding because its empty. It then rewrote the jump from 'entry' to jump to its fallthrough, which was a landing pad. This results in 'entry' jumping to 2 different landing pads, which fails the machine verifier. rdar://problem/20750162 llvm-svn: 236248
* Add a note about permitting default member initializersReid Kleckner2015-04-301-4/+4
| | | | | | | Use them in WinEHPrepare so that we can spot any toolchain bugs that come up. llvm-svn: 236244
* Reinstate revisions r234755, r234759, r234760Jan Vesely2015-04-301-0/+32
| | | | | | | | | changes: Don't apply on hexagon and NVPTX since they no longer claim to support UADDO/USUBO Add location to getConstant Drop comment about the ops being turned into expand llvm-svn: 236240
* Inline local variable to silence unused warning.Daniel Jasper2015-04-301-2/+1
| | | | llvm-svn: 236212
* Masked gather and scatter - added DAGCombine visitorsElena Demikhovsky2015-04-302-0/+144
| | | | | | | | | and AVX-512 instruction selection patterns. All other patches, including tests will follow. http://reviews.llvm.org/D7665 llvm-svn: 236211
OpenPOWER on IntegriCloud