path: root/llvm/lib/CodeGen
Commit message | Author | Age | Files | Lines
...
* Push the CURangeList down into the skeleton CU (where available) rather than the full CU | David Blaikie | 2014-11-03 | 3 | -3/+3
  So that it may be shared between skeleton/full compile unit, for CU ranges and other ranges to be added for fission+gmlt.
  (at some point we might want some kind of object shared between the skeleton and full compile units for all those things we only want one of in that scope, rather than having the full unit always look through to the skeleton... - alternatively, we might be able to have the skeleton pointer (or another, separate pointer) point to the skeleton or to the unit itself in non-fission, so we don't have to special case its absence)
  llvm-svn: 221186
* Add DwarfCompileUnit::BaseAddress to track the base address used by relative addressing in debug_ranges and debug_loc | David Blaikie | 2014-11-03 | 3 | -7/+11
  This is one of a few steps to generalize range handling to include the CU range (thus the CU's range list will be moved into the range list list, losing track of the base address in the process), which means generalizing ranges from both the skeleton and full unit under fission. And... then I can use that generalized support for ranges in fission+gmlt where there'll be a bunch more ranges in the skeleton.
  llvm-svn: 221182
* Normally an 'optnone' function goes through fast-isel, which does not | Paul Robinson | 2014-11-03 | 1 | -0/+7
  call DAGCombiner. But we ran into a case (on Windows) where the calling convention causes argument lowering to bail out of fast-isel, and we end up in CodeGenAndEmitDAG() which does run DAGCombiner. So, we need to make DAGCombiner check for 'optnone' after all.
  Commit includes the test that found this, plus another one that got missed in the original optnone work.
  llvm-svn: 221168
* Cleanup some unused or trivial functions in DwarfCompileUnit | David Blaikie | 2014-11-03 | 2 | -7/+1
  llvm-svn: 221164
* Sink DwarfUnit::CURanges into DwarfCompileUnit | David Blaikie | 2014-11-03 | 3 | -8/+7
  llvm-svn: 221161
* Revert r221150, as it broke sanitizer tests | Oliver Stannard | 2014-11-03 | 2 | -2/+2
  llvm-svn: 221151
* Emit .eh_frame with relocations to functions, rather than sections | Oliver Stannard | 2014-11-03 | 2 | -2/+2
  When LLVM emits DWARF call frame information, it currently creates a local, section-relative symbol in the code section, which is pointed to by a relocation on the .eh_frame section.
  However, for C++ we emit some functions in section groups, and the SysV ABI has some rules to make it easier to remove these sections (http://www.sco.com/developers/gabi/latest/ch4.sheader.html#section_group_rules):
    A symbol table entry with STB_LOCAL binding that is defined relative to one of a group's sections, and that is contained in a symbol table section that is not part of the group, must be discarded if the group members are discarded. References to this symbol table entry from outside the group are not allowed.
  This means that we need to use the function symbol for the relocation, not a temporary symbol.
  There was a comment in the code claiming that the local symbol was used to avoid creating a relocation, but a relocation must be created anyway as the code and CFI are in different sections.
  llvm-svn: 221150
* Sink range list handling down from DwarfUnit into its only use, in DwarfCompileUnit. | David Blaikie | 2014-11-03 | 2 | -15/+15
  llvm-svn: 221123
* Formatting | David Blaikie | 2014-11-02 | 1 | -1/+1
  llvm-svn: 221095
* Add DwarfUnit::isDwoUnit and use it to generalize string creation | David Blaikie | 2014-11-02 | 5 | -15/+34
  Currently we only need to emit skeleton strings into the CU header and we do this by explicitly calling "addLocalString". With gmlt-in-fission, we'll be emitting a bunch of other strings from other codepaths where it's not statically known that these strings will be local or not.
  Introduce a virtual function to indicate whether this unit is a DWO unit or not (I'm not sure if we have a good term for this, the opposite/alternative to 'skeleton' unit) and use that to generalize the string emission logic so that strings can be correctly emitted in both the skeleton and dwo unit when in split dwarf mode.
  And to demonstrate that this works, switch the existing special callers of addLocalString in the skeleton builder to addString - and they still work. Yay.
  llvm-svn: 221094
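The dispatch described in r221094 can be sketched in a few lines. This is a simplified model, not the LLVM code: the class names `Unit`, `SkeletonUnit` and `DwoUnit`, the `StringForm` enum and the `SplitDwarf` flag are all hypothetical, and the choice between an indexed form for DWO units and a direct string-table reference otherwise is an assumption about how split DWARF strings are typically encoded.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the two ways a string attribute can be encoded.
enum class StringForm { Strp, StrIndex };

// Hypothetical base class: one addString() entry point for every caller.
struct Unit {
  virtual ~Unit() = default;
  // The virtual query the commit message describes.
  virtual bool isDwoUnit() const = 0;

  std::vector<std::pair<std::string, StringForm>> Attrs;

  // Callers no longer need to know statically whether the string is local;
  // the unit decides based on what kind of unit it is.
  void addString(const std::string &Text, bool SplitDwarf) {
    assert(!Text.empty() && "sketch assumes non-empty strings");
    StringForm Form = (SplitDwarf && isDwoUnit()) ? StringForm::StrIndex
                                                  : StringForm::Strp;
    Attrs.emplace_back(Text, Form);
  }
};

struct SkeletonUnit : Unit {
  bool isDwoUnit() const override { return false; }
};

struct DwoUnit : Unit {
  bool isDwoUnit() const override { return true; }
};
```

With this shape, code that builds a skeleton unit and code that builds a DWO unit can both call `addString` and get the form appropriate to the unit, which is the generalization the message is after.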
* Remove the last mention of LineTablesOnly from DwarfUnit, sinking it into DwarfCompileUnit | David Blaikie | 2014-11-02 | 3 | -4/+7
  This is a useful distinction/invariant/delineation to make because LineTablesOnly mode is never relevant to type units, so it's clear that we're not doing weird line-tables-only-with-types by making this API choice.
  It also lays the foundations nicely for adding gmlt-like data to fission skeleton CUs while limiting the effects to CUs and not TUs.
  llvm-svn: 221093
* Sink DwarfUnit::applySubprogramAttributesToDefinition into DwarfCompileUnit | David Blaikie | 2014-11-02 | 4 | -8/+10
  llvm-svn: 221092
* Sink DwarfUnit::addExpr into DwarfCompileUnit | David Blaikie | 2014-11-02 | 4 | -10/+10
  llvm-svn: 221090
* Fix the build from the last commit | David Blaikie | 2014-11-02 | 1 | -2/+2
  llvm-svn: 221089
* Sink DwarfUnit::applyVariableAttributes into DwarfCompileUnit | David Blaikie | 2014-11-02 | 4 | -12/+12
  llvm-svn: 221088
* Sink DwarfUnit::addLocationList down into DwarfCompileUnit | David Blaikie | 2014-11-02 | 4 | -13/+12
  llvm-svn: 221087
* Sink DwarfUnit::addComplexAddress down into DwarfCompileUnit | David Blaikie | 2014-11-02 | 4 | -67/+65
  llvm-svn: 221086
* Push DwarfUnit::addAddress down into DwarfCompileUnit | David Blaikie | 2014-11-02 | 4 | -24/+22
  llvm-svn: 221085
* Sink DwarfUnit::addVariableAddress into DwarfCompileUnit since type units don't have variables | David Blaikie | 2014-11-02 | 4 | -18/+17
  llvm-svn: 221084
* DebugInfo: Sink accelerator table lists down (GlobalNames/Types) into DwarfCompileUnit | David Blaikie | 2014-11-02 | 6 | -33/+42
  llvm-svn: 221083
* Add DwarfUnit::addGlobalType to match DwarfUnit::addGlobalName | David Blaikie | 2014-11-02 | 2 | -7/+15
  (these will shortly become virtual, with a null implementation in DwarfUnit (since type units don't have accelerator tables in the current schema) and the current implementation down in DwarfCompileUnit, moving the actual maps there too)
  llvm-svn: 221082
* DebugInfo: Refactor index type DIE initialization by rolling it into the accessor | David Blaikie | 2014-11-02 | 2 | -10/+13
  llvm-svn: 221080
* Be sure to initialize DwarfCompileUnit::LabelBegin now that it may be skipped in initSection | David Blaikie | 2014-11-02 | 1 | -1/+1
  llvm-svn: 221079
* Don't bother creating LabelBegin for .dwo units | David Blaikie | 2014-11-02 | 2 | -4/+8
  This would help catch cases where we might otherwise try to reference a dwo CU label, which would be weird - because without relocations in the dwo file it's not generally meaningful to talk about the CU offsets there (or, if it is, we can do so in absolute terms without using a relocation to compute it).
  llvm-svn: 221078
* Drop DwarfCompileUnit::getLocalLabel* in favor of just mapping through the skeleton explicitly. | David Blaikie | 2014-11-02 | 2 | -17/+5
  Confusing to do this two different ways - I'm not too wedded to either one, but here goes.
  llvm-svn: 221076
* Sink DwarfUnit::LabelBegin down into DwarfCompileUnit since that's the only place it's needed. | David Blaikie | 2014-11-02 | 3 | -10/+10
  llvm-svn: 221075
* Sink dwarf unit length emission down into DwarfUnit::emitHeader | David Blaikie | 2014-11-01 | 4 | -7/+14
  This allows the CU label to be emitted only for compile units, as they're the only ones that need it (so they can be referenced from pubnames)
  llvm-svn: 221072
* Remove DwarfUnit::LabelEnd in favor of computing the length of the section directly | David Blaikie | 2014-11-01 | 5 | -12/+5
  This was a compile-unit specific label (unused in type units) and seems unnecessary anyway when we can more easily directly compute the size of the compile unit.
  llvm-svn: 221067
* Sink DwarfUnit::SectionSym into DwarfCompileUnit as it's only needed/used there. | David Blaikie | 2014-11-01 | 4 | -40/+28
  llvm-svn: 221062
* Make DwarfCompileUnit::Skeleton more narrowly typed (DwarfCompileUnit* instead of DwarfUnit*) now that it's specific to DwarfCompileUnit anyway. | David Blaikie | 2014-11-01 | 1 | -3/+3
  llvm-svn: 221060
* Sink DwarfUnit::Skeleton down into DwarfCompileUnit | David Blaikie | 2014-11-01 | 4 | -25/+25
  Type units no longer have skeletons and it's misleading to be able to query for a type unit's skeleton (it might incorrectly lead one to conclude that if a unit doesn't have a skeleton it's not in a .dwo file... ).
  llvm-svn: 221055
* Sink DwarfDebug::AbstractSPDies down into DwarfFile | David Blaikie | 2014-11-01 | 3 | -11/+11
  This is the first big step to allowing gmlt-like inline scope information in the skeleton CU. While this commit doesn't change the functionality, it's only a small step to call "constructAbstractSubprogramDIE" on both the InfoHolder and the SkeletonHolder (when in use) and that will at least create the abstract SP dies in that case, though still not creating the other subprograms.
  llvm-svn: 221051
* Remove unused function | David Blaikie | 2014-11-01 | 1 | -3/+0
  llvm-svn: 221037
* And... fix the build some more. | David Blaikie | 2014-11-01 | 1 | -1/+1
  llvm-svn: 221036
* Just iterate the DwarfCompileUnits rather than trying to filter them out of the list of all units. | David Blaikie | 2014-11-01 | 1 | -49/+46
  llvm-svn: 221034
* Add '*' to auto variable that is a pointer, as per the coding conventions. | David Blaikie | 2014-11-01 | 1 | -1/+1
  llvm-svn: 221033
* Add DwarfCompileUnit::getSkeleton that returns DwarfCompileUnit* to avoid having to cast from DwarfUnit* on every call. | David Blaikie | 2014-11-01 | 2 | -3/+6
  llvm-svn: 221031
* IR: MDNode => Value: Instruction::getMetadata() | Duncan P. N. Exon Smith | 2014-11-01 | 2 | -8/+8
  Change `Instruction::getMetadata()` to return `Value` as part of PR21433.
  Update most callers to use `Instruction::getMDNode()`, which wraps the result in a `cast_or_null<MDNode>`.
  llvm-svn: 221024
* Sink some of DwarfDebug::collectDeadVariables down into DwarfCompileUnit. | David Blaikie | 2014-10-31 | 4 | -20/+25
  llvm-svn: 221010
* Sink most of DwarfDebug::constructAbstractSubprogramScopeDIE into DwarfCompileUnit | David Blaikie | 2014-10-31 | 3 | -14/+13
  llvm-svn: 221005
* [CodeGenPrepare] Move extractelement close to store if they can be combined. | Quentin Colombet | 2014-10-31 | 1 | -1/+379
  This patch adds an optimization in CodeGenPrepare to move an extractelement right before a store when the target can combine them. The optimization may promote any scalar operations to vector operations along the way to make that possible.

  ** Context **
  Some targets use different register files for vector and scalar operations. This means that transitioning from one domain to another may incur a copy from one register file to another. These copies are not coalescable and may be expensive. For example, according to the scheduling model, on cortex-A8 a vector to GPR move is 20 cycles.

  ** Motivating Example **
  Let us consider an example:
      define void @foo(<2 x i32>* %addr1, i32* %dest) {
        %in1 = load <2 x i32>* %addr1, align 8
        %extract = extractelement <2 x i32> %in1, i32 1
        %out = or i32 %extract, 1
        store i32 %out, i32* %dest, align 4
        ret void
      }
  As it is, this IR generates the following assembly on armv7:
      vldr d16, [r0]       @ vector load
      vmov.32 r0, d16[1]   @ cross-register-file copy: 20 cycles
      orr r0, r0, #1       @ scalar bitwise or
      str r0, [r1]         @ scalar store
      bx lr
  Whereas we could generate much faster code:
      vldr d16, [r0]               @ vector load
      vorr.i32 d16, #0x1           @ vector bitwise or
      vst1.32 {d16[1]}, [r1:32]    @ vector extract + store
      bx lr
  Half of the computation made in the vector is useless, but this allows us to get rid of the expensive cross-register-file copy.

  ** Proposed Solution **
  To avoid this cross-register-copy penalty, we promote the scalar operations to vector operations. The penalty will be removed if we manage to promote the whole chain of computation in the vector domain.
  Currently, we do that only when the chain of computation ends with a store and the target is able to combine an extract with a store.
  Stores are the most likely candidates, because other instructions produce values that would need to be promoted and so, extracted at some point[1]. Moreover, it is customary for targets to feature stores that perform a vector extract (see AArch64 and X86 for instance).
  The proposed implementation relies on the TargetTransformInfo to decide whether or not it is beneficial to promote a chain of computation in the vector domain. Unfortunately, this interface is rather inaccurate for this level of details and although this optimization may be beneficial for X86 and AArch64, the inaccuracy will lead to the optimization being too aggressive.
  Basically in TargetTransformInfo, everything that is legal has a cost of 1, whereas, even if a vector type is legal, usually a vector operation is slightly more expensive than its scalar counterpart. That will lead to too many promotions that may not be counterbalanced by the saving of the cross-register-file copy. For instance, on AArch64 this penalty is just 4 cycles.
  For now, the optimization is just enabled for ARM prior to v8, since those processors have a larger penalty on cross-register-file copies, and the scope is limited to basic blocks. Because of these two factors, we limit the effects of the inaccuracy. Indeed, I did not want to build up a fancy cost model with block frequency and everything on top of that.

  [1] We can imagine targets that can combine an extractelement with other instructions than just stores. If we want to go in that direction, the current interfaces must be augmented and, moreover, I think this becomes a global isel problem.

  Differential Revision: http://reviews.llvm.org/D5921
  <rdar://problem/14170854>
  llvm-svn: 220978
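The profitability trade-off described in r220978 can be illustrated with a small cost comparison. This is only a sketch under stated assumptions: `OpCost` and `shouldPromote` are hypothetical names, the per-operation costs are made-up inputs rather than values queried from TargetTransformInfo, and the rule (promote when the vector chain is cheaper than the scalar chain plus the cross-register-file copy) is a plausible reading of the message, not the committed heuristic.

```cpp
#include <cassert>
#include <vector>

// Hypothetical per-operation costs for one instruction in the chain
// between the extractelement and the store.
struct OpCost {
  unsigned Scalar; // cost if the operation stays in the scalar domain
  unsigned Vector; // cost if the operation is promoted to the vector domain
};

// Assumed rule: staying scalar pays for the cross-register-file copy plus
// every scalar operation; promoting pays for every vector operation and
// no copy, because the extract is folded into the store.
bool shouldPromote(const std::vector<OpCost> &Chain, unsigned CrossCopyCost) {
  assert(!Chain.empty() && "sketch assumes at least one operation to promote");
  unsigned ScalarTotal = CrossCopyCost;
  unsigned VectorTotal = 0;
  for (const OpCost &Op : Chain) {
    ScalarTotal += Op.Scalar;
    VectorTotal += Op.Vector;
  }
  return VectorTotal < ScalarTotal;
}
```

With a 20-cycle copy, as quoted for cortex-A8, a single `or` is promoted even if its vector form costs more than its scalar form. With a 4-cycle copy, as quoted for AArch64, a chain of five operations that each cost one extra cycle in vector form is not worth promoting. If every legal operation is reported with a cost of 1, the vector total can never exceed the scalar total, so the comparison always favors promotion; that is the over-aggressiveness the message warns about.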
* Correct assert text from r220923 | David Blaikie | 2014-10-31 | 1 | -1/+1
  Noticed in post-commit review by Adrian Prantl.
  llvm-svn: 220967
* PR20557: Fix the bug that bogus cpu parameter crashes llc on AArch64 backend. | Hao Liu | 2014-10-31 | 1 | -1/+5
  Initial patch by Oleg Ranevskyy.
  llvm-svn: 220945
* [SelectionDAG] When scalarizing trunc, don't assert for legal operands. | Ahmed Bougacha | 2014-10-30 | 1 | -1/+17
  r212242 introduced a legalizer hook, originally to let AArch64 widen v1i{32,16,8} rather than scalarize, because the legalizer expected, when scalarizing the result of a conversion operation, to already have scalarized the operands. On AArch64, v1i64 is legal, so that commit ensured operations such as v1i32 = trunc v1i64 wouldn't assert. It did that by choosing to widen v1 types whenever possible. However, v1i1 types, for which there's no legal widened type, would still trigger the assert.
  This commit fixes that, by only scalarizing a trunc's result when the operand has already been scalarized, and introducing an extract_elt otherwise.
  This is similar to r205625.
  Fixes PR20777.
  llvm-svn: 220937
* Fix incorrect invariant check in DAG Combine | Louis Gerbarg | 2014-10-30 | 1 | -1/+1
  Earlier this summer I fixed an issue where we were incorrectly combining multiple loads that had different constraints such as alignment, invariance, temporality, etc. Apparently in one case I made a copy-paste error and swapped alignment and invariance.
  Tests included.
  rdar://18816719
  llvm-svn: 220933
* PR21408: Workaround the appearance of duplicate variables due to problems when inlining two calls to the same function from the same call site. | David Blaikie | 2014-10-30 | 1 | -1/+6
  llvm-svn: 220923
* Whitespace. | NAKAMURA Takumi | 2014-10-29 | 5 | -40/+40
  llvm-svn: 220857
* Minimize the scope of some variables, NFC. | David Blaikie | 2014-10-28 | 1 | -2/+2
  llvm-svn: 220759
* [PBQP] Unique allowed-sets for nodes in the PBQP graph and use pairs of these | Lang Hames | 2014-10-27 | 1 | -29/+50
  sets as keys into a cache of interference matrix values in the Interference constraint adder.
  Creating interference matrices was one of the large remaining time-sinks in PBQP. Caching them reduces the total compile time (when using PBQP) on the nightly test suite by ~10%.
  llvm-svn: 220688
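The caching idea in r220688 can be sketched as follows: give each distinct allowed-set one small id, then key a matrix cache on the pair of ids. This is a toy model with hypothetical names (`AllowedSet`, `Matrix`, `InterferenceCache`), and the interference rule used here (two entries interfere only when they name the same register) is a simplifying assumption; the real constraint adder is not reproduced.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-ins: an allowed-set is a sorted list of register ids,
// and a matrix entry is true when the row/column registers interfere.
using AllowedSet = std::vector<unsigned>;
using Matrix = std::vector<std::vector<bool>>;

class InterferenceCache {
  std::map<AllowedSet, unsigned> Ids; // allowed-set contents -> unique id
  std::vector<AllowedSet> Sets;       // unique id -> allowed-set contents
  std::map<std::pair<unsigned, unsigned>, std::shared_ptr<const Matrix>> Cache;
  std::size_t Builds = 0;

public:
  // Return the id shared by every allowed-set with the same contents.
  unsigned unique(const AllowedSet &S) {
    auto Res = Ids.emplace(S, static_cast<unsigned>(Sets.size()));
    if (Res.second)
      Sets.push_back(S);
    return Res.first->second;
  }

  // Build the matrix for (A, B) on first request; reuse it afterwards.
  std::shared_ptr<const Matrix> get(unsigned A, unsigned B) {
    assert(A < Sets.size() && B < Sets.size() && "ids must come from unique()");
    auto Key = std::make_pair(A, B);
    auto It = Cache.find(Key);
    if (It != Cache.end())
      return It->second;

    ++Builds;
    const AllowedSet &Rows = Sets[A];
    const AllowedSet &Cols = Sets[B];
    auto M = std::make_shared<Matrix>(Rows.size(),
                                      std::vector<bool>(Cols.size(), false));
    for (std::size_t I = 0; I < Rows.size(); ++I)
      for (std::size_t J = 0; J < Cols.size(); ++J)
        (*M)[I][J] = Rows[I] == Cols[J]; // toy rule: same register interferes
    Cache.emplace(Key, M);
    return M;
  }

  // Number of matrices actually constructed, for observing cache hits.
  std::size_t builds() const { return Builds; }
};
```

Many virtual registers share a register class and therefore the same allowed-set, so in this model most `get` calls would hit the cache; how much that saves in practice is what the ~10% figure in the message reports for the real implementation.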
* Remove some unnecessary casts. | David Blaikie | 2014-10-26 | 1 | -2/+2
  llvm-svn: 220658