path: root/llvm/lib/CodeGen
Commit message | Author | Age | Files | Lines
...
* Revert "Debug info: (bugfix) C++ C/Dtors can be compiled to multiple functions,"Adrian Prantl2014-04-121-14/+9
| | | | | | | This reverts commit 206096 while I investigate why this broke the gdb buildbot. llvm-svn: 206103
* Use dwarf::Tag rather than unsigned for DIE::Tag to make debugging easier.David Blaikie2014-04-124-8/+13
| | | | | | | | | | | Nice to be able to just print out the Tag and have the debugger print dwarf::DW_TAG_subprogram or whatever, rather than an int. It's a bit finicky (for example DIDescriptor::getTag still returns unsigned) because some places still handle real dwarf tags + our fake tags (one day we'll remove the fake tags, hopefully). llvm-svn: 206098
* Debug info: (bugfix) C++ C/Dtors can be compiled to multiple functions,Adrian Prantl2014-04-121-9/+14
| | | | | | | | | | | | therefore, their declaration cannot have one DW_AT_linkage_name. The specific instances however can and should have that attribute. This patch reorders the code in DwarfUnit::getOrCreateSubprogramDIE() to emit linkage names for C/Dtors. rdar://problem/16362674. llvm-svn: 206096
* Reenable use of TBAA during CodeGenHal Finkel2014-04-122-14/+2
| | | | | | | | | | | | | | | | | | | | We had disabled use of TBAA during CodeGen (even when otherwise using AA) because the ptrtoint/inttoptr used by CGP for address sinking caused BasicAA to miss basic type punning that it should catch (and, thus, we'd fail to override TBAA when we should). However, when AA is in use during CodeGen, CGP now uses normal GEPs and bitcasts, instead of ptrtoint/inttoptr, when doing address sinking. As a result, BasicAA should be able to make us do the right thing in the face of type-punning, and it seems safe to enable use of TBAA again. self-hosting seems fine on PPC64/Linux on the P7, with TBAA enabled and -misched=shuffle. Note: We still don't update TBAA when merging stack slots, although because BasicAA should now catch all such cases, this is no longer a blocking issue. Nevertheless, I plan to commit code to deal with this properly in the near future. llvm-svn: 206093
* Add the ability to use GEPs for address sinking in CGPHal Finkel2014-04-121-0/+126
| | | | | | | | | | | | | | | | | | | | | | | | | | The current memory-instruction optimization logic in CGP, which sinks parts of the address computation that can be absorbed by the addressing mode, does this by explicitly converting the relevant part of the address computation into IR-level integer operations (making use of ptrtoint and inttoptr). For most targets this is currently not a problem, but for targets wishing to make use of IR-level alias analysis during CodeGen, the use of ptrtoint/inttoptr is a problem for two reasons: 1. BasicAA becomes less powerful in the face of the ptrtoint/inttoptr. 2. In cases where type-punning was used, and BasicAA was used to override TBAA, BasicAA may no longer do so. (This had forced us to disable all use of TBAA in CodeGen; something which we can now enable again.) This (use of GEPs instead of ptrtoint/inttoptr) is not currently enabled by default (except for those targets that use AA during CodeGen), and so aside from some PowerPC subtargets and SystemZ, there should be no change in behavior. We may be able to switch completely away from the ptrtoint/inttoptr sinking on all targets, but further testing is required. I've doubled-up on a number of existing tests that are sensitive to the address sinking behavior (including some store-merging tests that are sensitive to the order of the resulting ADD operations at the SDAG level). llvm-svn: 206092
* blockfreq: Rename BlockFrequencyImpl to BlockFrequencyInfoImplDuncan P. N. Exon Smith2014-04-111-2/+2
| | | | | | | | | | | | This is a shared implementation class for BlockFrequencyInfo and MachineBlockFrequencyInfo, not for BlockFrequency, a related (but distinct) class. No functionality change. <rdar://problem/14292693> llvm-svn: 206083
* [RegAllocGreedy][Last Chance Recoloring] Change the name of the exhaustive ↵Quentin Colombet2014-04-111-1/+1
| | | | | | | | | | | search option. fexhaustive-register-search => exhaustive-register-search 'f' is a Clang thing! This is related to PR18747. llvm-svn: 206075
* [RegAllocGreedy][Last Chance Recoloring] Addition ofQuentin Colombet2014-04-111-6/+14
| | | | | | | | | | | -fexhaustive-register-search option to allow an exhaustive search during last chance recoloring. This is related to PR18747 Patch by MAYUR PANDEY <mayur.p@samsung.com>. llvm-svn: 206072
* [Register Coalescer] Fix wrong live-range information with rematerialization.Quentin Colombet2014-04-111-0/+21
| | | | | | | | | | | | | | | When rematerializing an instruction that defines a super register that would be used by a physical subregister, we use the related physical super register for the definition. To keep the live-range information accurate, all the defined subregisters must be marked as dead def, otherwise the register allocation may miss some interferences. Working on a reduced test-case! <rdar://problem/16582185> llvm-svn: 206060
* Debug info: Store the DIVariable in DebugLocEntry also for constants,Adrian Prantl2014-04-112-9/+11
| | | | | | | | so DwarfDebug::emitDebugLocEntry can emit them with the correct signedness. rdar://problem/15928306 llvm-svn: 206042
* Move ExtractVectorElements to SelectionDAG.Matt Arsenault2014-04-111-0/+16
| | | | | | | This seems generally useful, and makes sense to go along with SplitVector. llvm-svn: 206041
* SelectionDAG: Use helper function to improve legalization of ISD::MULTom Stellard2014-04-111-0/+17
| | | | | | | | The TargetLowering::expandMUL() helper contains lowering code extracted from the DAGTypeLegalizer and allows the SelectionDAGLegalizer to expand more ISD::MUL patterns without having to use a library call. llvm-svn: 206037
* SelectionDAG: Factor ISD::MUL lowering code out of DAGTypeLegalizerTom Stellard2014-04-112-67/+113
| | | | | | | | | This code has been moved to a new function in the TargetLowering class called expandMUL(). The purpose of this is to be able to share lowering code between the SelectionDAGLegalize and DAGTypeLegalizer classes. No functionality change intended. llvm-svn: 206036
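To illustrate the kind of expansion a helper such as expandMUL() can share between legalizers, here is a minimal sketch in plain C++ of a double-width multiply built from half-width multiplies. It is not LLVM code: `mulWide` and its fixed 64/32-bit widths are hypothetical names and choices made for illustration only.

```cpp
#include <cstdint>
#include <utility>

// Hypothetical illustration, not LLVM code: multiply two 64-bit values using
// only 32-bit x 32-bit -> 64-bit products, returning {high half, low half}
// of the 128-bit result.
std::pair<uint64_t, uint64_t> mulWide(uint64_t A, uint64_t B) {
  const uint64_t Mask = 0xffffffffu;
  uint64_t ALo = A & Mask, AHi = A >> 32;
  uint64_t BLo = B & Mask, BHi = B >> 32;

  // Four partial products, each of which fits in 64 bits.
  uint64_t LL = ALo * BLo;
  uint64_t LH = ALo * BHi;
  uint64_t HL = AHi * BLo;
  uint64_t HH = AHi * BHi;

  // Middle column: carry out of LL plus the low halves of the cross terms.
  // The sum is below 3 * 2^32, so it cannot overflow 64 bits.
  uint64_t Mid = (LL >> 32) + (LH & Mask) + (HL & Mask);

  uint64_t Lo = (Mid << 32) | (LL & Mask);
  uint64_t Hi = HH + (LH >> 32) + (HL >> 32) + (Mid >> 32);
  return {Hi, Lo};
}
```

A caller that only needs the low half can ignore `first`; when the target can form the partial products natively, no library call is needed.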
* Implement depth_first and inverse_depth_first range factory functions.David Blaikie2014-04-111-11/+9
| | | | | | | | | | | | | | Also updated as many loops as I could find using df_begin/idf_begin - strangely I found no uses of idf_begin. Is that just used out of tree? Also a few places couldn't use df_begin because either they used the member functions of the depth first iterators or had specific ordering constraints (I added a comment in the latter case). Based on a patch by Jim Grosbach. (Jim - you just had iterator_range<T> where you needed iterator_range<idf_iterator<T>>) llvm-svn: 206016
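The range-factory idea can be sketched outside LLVM as follows. This is a toy under stated assumptions: `IteratorRange`, `Node`, `DFIterator`, `depthFirst` and `preorderIds` are hypothetical stand-ins rather than the LLVM classes, and the iterator walks a tree without a visited set. Note that the range is parameterized on the iterator type, not on the graph type, which is the distinction the commit message points out.

```cpp
#include <vector>

// Toy range type: a begin/end pair usable in a range-based for loop.
template <typename IteratorT> class IteratorRange {
  IteratorT Begin, End;

public:
  IteratorRange(IteratorT B, IteratorT E) : Begin(B), End(E) {}
  IteratorT begin() const { return Begin; }
  IteratorT end() const { return End; }
};

// Toy tree node (hypothetical; no cycles, so no visited set is needed).
struct Node {
  int Id;
  std::vector<Node *> Children;
};

// Pre-order depth-first iterator driven by an explicit stack.
class DFIterator {
  std::vector<const Node *> Stack;

public:
  DFIterator() = default; // empty stack doubles as the end iterator
  explicit DFIterator(const Node *Root) {
    if (Root)
      Stack.push_back(Root);
  }
  const Node &operator*() const { return *Stack.back(); }
  DFIterator &operator++() {
    const Node *N = Stack.back();
    Stack.pop_back();
    // Push children in reverse so the first child is visited first.
    for (auto I = N->Children.rbegin(), E = N->Children.rend(); I != E; ++I)
      Stack.push_back(*I);
    return *this;
  }
  bool operator!=(const DFIterator &O) const { return Stack != O.Stack; }
};

// Range factory: the return type names the iterator, not the node type.
IteratorRange<DFIterator> depthFirst(const Node *Root) {
  return IteratorRange<DFIterator>(DFIterator(Root), DFIterator());
}

// Example use: collect node ids in pre-order with a range-based for loop.
std::vector<int> preorderIds(const Node *Root) {
  std::vector<int> Ids;
  for (const Node &N : depthFirst(Root))
    Ids.push_back(N.Id);
  return Ids;
}
```

Code that needs the iterator's own member functions, or a specific visiting order, would still use the begin/end pair directly, as the commit message notes.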
* [c++11] Range'ify use list loops in InstrEmitter.Jim Grosbach2014-04-111-9/+3
| | | | llvm-svn: 206015
* [c++11] Range'ify use list loops in DAGCombiner.Jim Grosbach2014-04-111-18/+7
| | | | llvm-svn: 206014
* Move the segmented stack switch to a function attributeReid Kleckner2014-04-102-1/+6
| | | | | | | | | This removes the -segmented-stacks command line flag in favor of a per-function "split-stack" attribute. Patch by Luqman Aden and Alex Crichton! llvm-svn: 205997
* Debug info: Factor the retrieving of the DIVariable from a MachineInstrAdrian Prantl2014-04-101-3/+2
| | | | | | into a function. llvm-svn: 205973
* Fix to support properly cleaning up failed address sinking against constantsJim Grosbach2014-04-101-2/+3
| | | | | | | | | | | As it turns out the source of the sunkaddr can be a constant, in which case there is not an instruction to delete, causing the cleanup code introduced in r204833 to crash. This patch adds a dynamic check to ensure the deleted value is in fact an instruction and not a constant. Patch by Louis Gerbarg <lgg@apple.com> llvm-svn: 205941
* SelectionDAG: Don't constant fold target-specific nodes.Jim Grosbach2014-04-091-0/+6
| | | | | | | | | | | | FoldConstantArithmetic() only knows how to deal with a few target independent ISD opcodes. Bail early if it sees a target-specific ISD node. These nodes do funny things with operand types which may break the assumptions of the code that follows, and there's no actual folding that can be done anyway. For example, non-constant 256 bit vector shifts on X86 have a shift-amount operand that's a 128-bit v4i32 vector regardless of what the first operand type is and that breaks the assumption that the operand types must match. rdar://16530923 llvm-svn: 205937
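The early bail-out can be sketched with a toy opcode space. Everything here is hypothetical: the enum, the `foldConstantArithmetic` signature and the scalar operands are simplifications, and only the idea of a sentinel separating generic opcodes from target-specific ones is meant to mirror the description above.

```cpp
#include <cstdint>

// Hypothetical opcode space: generic opcodes, a sentinel, then target nodes.
enum Opcode : unsigned {
  OP_ADD,
  OP_SUB,
  OP_MUL,
  BUILTIN_OP_END, // first value past the generic opcodes
  TARGET_VSHIFT   // stand-in for a target-specific node
};

// Returns true and writes Out when the operation was folded.
bool foldConstantArithmetic(unsigned Opc, uint64_t L, uint64_t R,
                            uint64_t &Out) {
  // Target-specific nodes may break the operand assumptions made below,
  // so refuse to fold them before looking at the operands at all.
  if (Opc >= BUILTIN_OP_END)
    return false;

  switch (Opc) {
  case OP_ADD:
    Out = L + R;
    return true;
  case OP_SUB:
    Out = L - R;
    return true;
  case OP_MUL:
    Out = L * R;
    return true;
  default:
    return false;
  }
}
```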
* [DAGCombiner] DAG combine does not know how to combine indexed loads withQuentin Colombet2014-04-091-2/+5
| | | | | | | | | | | sign/zero/any extensions. However a few places were not checking properly the property of the load and were turning an indexed load into a regular extended load. Therefore the indexed value was lost during the process and this was triggering an assertion. <rdar://problem/16389332> llvm-svn: 205923
* WinCOFF: Emit common symbols as specified in the COFF specDavid Majnemer2014-04-081-3/+6
| | | | | | | | | | | | | | | | | | Summary: Local common symbols were properly inserted into the .bss section. However, putting external common symbols in the .bss section would give them a strong definition. Instead, encode them as undefined, external symbols whose symbol value is equivalent to their size. Reviewers: Bigcheese, rafael, rnk CC: llvm-commits Differential Revision: http://reviews.llvm.org/D3324 llvm-svn: 205811
* Bug 19348: Check for legal ExtLoad operation before foldingMatt Arsenault2014-04-081-9/+12
| | | | | | | | (aext (zextload x)) -> (aext (truncate (*extload x))) Patch by Stanislav Mekhanoshin! llvm-svn: 205805
* RegAlloc: Account for a variable entry block frequencyDuncan P. N. Exon Smith2014-04-082-13/+58
| | | | | | | | | | | | | | | | | | | | Until r197284, the entry frequency was constant -- i.e., set to 2^14. Although current ToT still has a constant entry frequency, since r197284 that has been an implementation detail (which is soon going to change). - r204690 made the wrong assumption for the CSRCost metric. Adjust callee-saved register cost based on entry frequency. - r185393 made the wrong assumption (although it was valid at the time). Update SpillPlacement.cpp::Threshold to be relative to the entry frequency. Since ToT still has 2^14 entry frequency, this should have no observable functionality change. <rdar://problem/14292693> llvm-svn: 205789
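Making a tuned constant relative to the entry frequency can be sketched as below. The concrete numbers are assumptions for illustration: the sketch supposes a threshold that was tuned to 2 when the entry frequency was 2^14, so it divides by 2^13 with rounding and clamps to a minimum of 1. `scaledThreshold` is a hypothetical name, not the function in SpillPlacement.cpp.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical sketch: scale a threshold that was tuned for an entry
// frequency of 2^14 (assumed tuned value: 2) to an arbitrary entry frequency.
uint64_t scaledThreshold(uint64_t EntryFreq) {
  // Divide by 2^13, rounding to nearest by adding the highest dropped bit.
  uint64_t Scaled = (EntryFreq >> 13) + ((EntryFreq >> 12) & 1);
  // Never let the threshold collapse to zero.
  return std::max<uint64_t>(1, Scaled);
}
```

With the entry frequency still at 2^14 the result is the original constant, which matches the claim that there is no observable change until the entry frequency itself varies.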
* Put a limit on ScheduleDAGSDNodes::ClusterNeighboringLoads to avoid blowing ↵Andrew Trick2014-04-071-1/+6
| | | | | | | | | | | | | | | | | | | | up compile time. Fixes PR16365 - Extremely slow compilation in -O1 and -O2. The SD scheduler has a quadratic implementation of load clustering which absolutely blows up compile time for large blocks with constant pool loads. The MI scheduler has a better implementation of load clustering. However, we have not done the work yet to completely eliminate the SD scheduler. Some benchmarks still seem to benefit from early load clustering, although maybe by chance. As an intermediate term fix, I just put a nice limit on the number of DAG users to search before finding a match. With this limit there are no binary differences in the LLVM test suite, and the PR16365 test case does not suffer any compile time impact from this routine. llvm-svn: 205738
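The effect of such a limit can be sketched with a bounded linear search. This is a simplified stand-in, not the scheduler code: `findMatchBounded`, the integer "users" and the `MaxUsers` parameter are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch: look for Wanted among Users, but inspect at most
// MaxUsers entries. Returns the index of the match, or -1 if none was found
// within the limit.
int findMatchBounded(const std::vector<int> &Users, int Wanted,
                     std::size_t MaxUsers) {
  for (std::size_t I = 0; I < Users.size() && I < MaxUsers; ++I)
    if (Users[I] == Wanted)
      return static_cast<int>(I);
  return -1;
}
```

If this search runs once per load in a block with N loads, the cap bounds the total work by N * MaxUsers instead of N * N, at the price of occasionally missing a match that lies beyond the limit.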
* Minor change to StackMapLiveness DEBUG output.Andrew Trick2014-04-041-1/+1
| | | | llvm-svn: 205656
* Add DAG parameter to ComputeNumSignBitsForTargetNodeMatt Arsenault2014-04-042-1/+2
| | | | | | | | This way, you can check the number of sign bits in the operands. The depth parameter it already has is pretty useless without this. llvm-svn: 205649
* DAGLegalize: add last-ditch type-legalization for VSELECT.Tim Northover2014-04-042-0/+16
| | | | | | | | | | | When LLVM sees something like (v1iN (vselect v1i1, v1iN, v1iN)) it can decide that the result is OK (v1i64 is legal on AArch64, for example) but it still needs scalarising because of that v1i1. There was no code to do this though. AArch64 and ARM64 have DAG combines to produce efficient code and prevent that occurring in *most* such situations, but there are edge cases that they miss. This adds a legalization to cope with that. llvm-svn: 205626
* ARM64: handle v1i1 types arising from setcc properly.Tim Northover2014-04-041-3/+15
| | | | | | | | | | | | | | | | | | | | There were several overlapping problems here, and this solution is closely inspired by the one adopted in AArch64 in r201381. Firstly, scalarisation of v1i1 setcc operations simply fails if the input types are legal. This is fixed in LegalizeVectorTypes.cpp this time, and allows AArch64 code to be simplified slightly. Second, vselect with such a setcc feeding into it ends up in ScalarizeVectorOperand, where it's not handled. I experimented with an implementation, but found that whatever DAG came out was rather horrific. I think Hao's DAG combine approach is a good one for quality, though there are edge cases it won't catch (to be fixed separately). Should fix PR19335. llvm-svn: 205625
* Make consistent use of MCPhysReg instead of uint16_t throughout the tree.Craig Topper2014-04-048-8/+8
| | | | llvm-svn: 205610
* [RegAllocGreedy][Last Chance Recoloring] Emit diagnostics when last chanceQuentin Colombet2014-04-041-1/+35
| | | | | | | | | | recoloring cut-offs are encountered and register allocation failed. This is related to PR18747 Patch by MAYUR PANDEY <mayur.p@samsung.com>. llvm-svn: 205601
* Revert r205599, the commit was not intended to have so many changesQuentin Colombet2014-04-041-35/+1
| | | | llvm-svn: 205600
* [RegAllocGreedy][Last Chance Recoloring] Emit diagnostics when last chanceQuentin Colombet2014-04-041-1/+35
| | | | | | | | | | recoloring cut-offs are hit. This is related to PR18747. Patch by MAYUR PANDEY <mayur.p@samsung.com> llvm-svn: 205599
* Fix for PR 19261:Eric Christopher2014-04-031-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | llc doesn't generate nodes for unconditional fall-through branches for targets without FastISel implementation (X86 has it, but can be disabled by "-fast-isel=false") in SelectionDAGBuilder::visitBr(). So for line 4 in the following testcase 1: void foo(int i){ 2: switch(i){ 3: default: 4: break; 5: } 6: return; 7: } there is no corresponding line in .debug_line section, and a debugger cannot set a breakpoint at line 4. Fix this by always emitting a branch when we're not optimizing and add a testcase to ensure that there's code on every line we'd want to break. Patch by Daniil Fukalov. llvm-svn: 205529
* DebugInfo: Use a 64 bit type for the subrangeDavid Blaikie2014-04-031-4/+4
| | | | | | | | | While we were encoding 64 bit values (data8) in the subrange itself, using a 32 bit type for the subrange was still confusing gdb. Oh, and make it unsigned too. As the comment points out, this could be pushed into the frontend so that it would be 32 or 64 bit as appropriate, etc. llvm-svn: 205512
* [CodeGen] Fix peephole optimizer bug introduced in r205481. Fixes PR19318.Lang Hames2014-04-031-9/+11
| | | | | | I should have read that comment a little more carefully. ;) Regression test in the works, committing in the meantime to un-break people. llvm-svn: 205511
* Account for scalarization costs in BasicTTI::getMemoryOpCost for extending ↵Hal Finkel2014-04-031-2/+24
| | | | | | | | | | | | | | | | | | | | vector loads When a vector type legalizes to a larger vector type, and the target does not support the associated extending load (or truncating store), then legalization will scalarize the load (or store) resulting in an associated scalarization cost. BasicTTI::getMemoryOpCost needs to account for this. Between this, and r205487, PowerPC on the P7 with VSX enabled shows: MultiSource/Benchmarks/PAQ8p/paq8p: 43% speedup SingleSource/Benchmarks/BenchmarkGame/puzzle: 51% speedup SingleSource/UnitTests/Vectorizer/gcc-loops 28% speedup (some of these are new; some of these, such as PAQ8p, just reverse regressions that VSX support would trigger) llvm-svn: 205495
* Fix multi-register costs in BasicTTI::getCastInstrCostHal Finkel2014-04-021-1/+2
| | | | | | | | | | | For a cast (extension, etc.), the current logic predicts a low cost if the associated operation (keyed on the destination type) is legal (or promoted). This is not true when the number of values required to legalize the type is changing. For example, <8 x i16> being sign extended to <8 x i32> is not generically cheap on PPC with VSX, even though sign extension to v4i32 is legal, because two output v4i32 values are required compared to the single v8i16 input value, and without custom logic in the target, this conversion will scalarize. llvm-svn: 205487
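The rule described here, that a cast is only assumed cheap when source and destination legalize to the same number of values, can be sketched as a toy cost function. All names and cost numbers are hypothetical: `numLegalParts` simply counts fixed-width registers, and the non-cheap cost of two operations per element is a placeholder rather than what BasicTTI computes.

```cpp
// Hypothetical: number of RegBits-wide registers needed to hold NumElts
// elements of EltBits bits each (rounded up).
unsigned numLegalParts(unsigned NumElts, unsigned EltBits, unsigned RegBits) {
  unsigned TotalBits = NumElts * EltBits;
  return (TotalBits + RegBits - 1) / RegBits;
}

// Toy cast cost: cheap only when the number of legalized values is unchanged.
unsigned castCost(unsigned NumElts, unsigned SrcEltBits, unsigned DstEltBits,
                  unsigned RegBits) {
  unsigned SrcParts = numLegalParts(NumElts, SrcEltBits, RegBits);
  unsigned DstParts = numLegalParts(NumElts, DstEltBits, RegBits);
  if (SrcParts == DstParts)
    return 1; // assume a single legal operation
  // Placeholder scalarization cost: one extract and one insert per element.
  return 2 * NumElts;
}
```

For the example in the message, with 128-bit registers, <8 x i16> occupies one register and <8 x i32> occupies two, so the toy model does not report the cast as cheap.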
* [CodeGen] Teach the peephole optimizer to remember (and exploit) all foldingLang Hames2014-04-021-35/+44
| | | | | | | | opportunities in the current basic block, rather than just the last one seen. <rdar://problem/16478629> llvm-svn: 205481
* Add comments and test case for [DAG] Keep the opaque constant flag when ↵Juergen Ributzka2014-04-021-1/+5
| | | | | | performing unary constant folding operations (r204737). llvm-svn: 205474
* Simplify resolveFrameIndex() signature.Jim Grosbach2014-04-021-1/+1
| | | | | | | | Just pass a MachineInstr reference rather than an MBB iterator. Creating a MachineInstr& is the first thing every implementation did anyway. llvm-svn: 205453
* ARM: Add support for segmented stacksOliver Stannard2014-04-021-0/+3
| | | | | | Patch by Alex Crichton, ILyoan, Luqman Aden and Svetoslav. llvm-svn: 205430
* clarify commentAdrian Prantl2014-04-021-1/+2
| | | | llvm-svn: 205429
* Adjust comments regarding non-relocated abbrev offset in debug_info.dwoDavid Blaikie2014-04-022-2/+4
| | | | | | | | I'm not sure the comment in the implementation really adds a lot of value (it's clear that we emit zero when no symbol is provided, but it doesn't explain why we would do that). Happy to iterate. llvm-svn: 205386
* Split debug_loc and debug_loc.dwo emission into two separate functionsDavid Blaikie2014-04-022-21/+32
| | | | | | Based on code review feedback from Eric Christopher on r204697 llvm-svn: 205385
* DebugInfo: Introduce DebugLocList to encapsulate a list of DebugLocEntries ↵David Blaikie2014-04-025-12/+39
| | | | | | | | | | | | and an MC Label to refer to them This removes the magic-number-esque code creating/retrieving the same label for a debug_loc entry from two places and removes the last small piece of reusable logic from emitDebugLoc so that there will be less duplication when refactoring it into two functions (one for debug_loc, the other for debug_loc.dwo). llvm-svn: 205382
* Add a doxygen comment to DebugLocEntry::Merge.Adrian Prantl2014-04-011-0/+3
| | | | llvm-svn: 205374
* DebugLocEntry: Actually merge the loc entry when returning true.David Blaikie2014-04-011-1/+5
| | | | | | | | | | Seems we didn't have any test coverage for merging... awesome. So I added some - but hit an llvm-objdump bug while I was there. I'm choosing not to shave that yak right now. Code review feedback/bug catch by Adrian Prantl in r205360. llvm-svn: 205373
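The bug class fixed here, reporting a successful merge without actually extending the entry, can be shown with a small sketch. `LocEntry` and its integer fields are hypothetical simplifications of a location-list entry; labels and values are plain numbers here.

```cpp
// Hypothetical simplified location-list entry covering [Begin, End).
struct LocEntry {
  unsigned Begin;
  unsigned End;
  int Value;

  // Merge Next into this entry if it is adjacent and describes the same
  // value. The entry must be extended before returning true; returning true
  // without updating End would silently drop Next's range.
  bool merge(const LocEntry &Next) {
    if (End == Next.Begin && Value == Next.Value) {
      End = Next.End;
      return true;
    }
    return false;
  }
};
```

A test that checks the merged entry's end point, and not only the return value, would catch the original mistake.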
* Fix accidental fallthrough in DebugLocEntry::hasSameValueOrLocationDavid Blaikie2014-04-011-5/+10
| | | | | | | | | | No test case (this would invoke UB by examining uninitialized members, etc, at best - and this code is apparently untested anyway - I'm about to fix that) Code review feedback from Adrian Prantl on r205360. llvm-svn: 205367
* Remove unused function DebugLocEntry::isEmptyDavid Blaikie2014-04-011-3/+0
| | | | llvm-svn: 205365