path: root/llvm/lib/CodeGen
Commit message (Author, Age, Files, Lines)
* Lower stackmap intrinsics directly to their target opcode in the DAG builder. (Andrew Trick, 2013-10-31, 3 files, -11/+216)
  llvm-svn: 193769
* whitespace (Andrew Trick, 2013-10-31, 1 file, -7/+7)
  llvm-svn: 193765
* Remove the --shrink-wrap option. (Rafael Espindola, 2013-10-31, 4 files, -1394/+69)
  It had no tests, was unused and was "experimental at best".
  llvm-svn: 193749
* Legalize: Improve legalization of long vector extends. (Jim Grosbach, 2013-10-31, 2 files, -3/+63)
  When an extend more than doubles the size of the elements (e.g., a zext from v16i8 to v16i32), the normal legalization method of splitting the vectors will run into problems as by the time the destination vector is legal, the source vector is illegal. The end result is the operation often becoming scalarized, with the typical horrible performance. For example, on x86_64, the simple input of:

    define void @bar(<16 x i8> %a, <16 x i32>* %p) nounwind {
      %tmp = zext <16 x i8> %a to <16 x i32>
      store <16 x i32> %tmp, <16 x i32>*%p
      ret void
    }

  Generates:

      .section __TEXT,__text,regular,pure_instructions
      .section __TEXT,__const
      .align 5
    LCPI0_0:
      .long 255 ## 0xff
      .long 255 ## 0xff
      .long 255 ## 0xff
      .long 255 ## 0xff
      .long 255 ## 0xff
      .long 255 ## 0xff
      .long 255 ## 0xff
      .long 255 ## 0xff
      .section __TEXT,__text,regular,pure_instructions
      .globl _bar
      .align 4, 0x90
    _bar:
      vpunpckhbw %xmm0, %xmm0, %xmm1
      vpunpckhwd %xmm0, %xmm1, %xmm2
      vpmovzxwd %xmm1, %xmm1
      vinsertf128 $1, %xmm2, %ymm1, %ymm1
      vmovaps LCPI0_0(%rip), %ymm2
      vandps %ymm2, %ymm1, %ymm1
      vpmovzxbw %xmm0, %xmm3
      vpunpckhwd %xmm0, %xmm3, %xmm3
      vpmovzxbd %xmm0, %xmm0
      vinsertf128 $1, %xmm3, %ymm0, %ymm0
      vandps %ymm2, %ymm0, %ymm0
      vmovaps %ymm0, (%rdi)
      vmovaps %ymm1, 32(%rdi)
      vzeroupper
      ret

  So instead we can check if there are legal types that enable us to split more cleverly when the input vector is already legal such that we don't turn it into an illegal type. If the extend is such that it's more than doubling the size of the input we check if
  - the number of vector elements is even,
  - the source type is legal,
  - the type of a split source is illegal,
  - the type of an extended (by doubling element size) source is legal, and
  - the type of that extended source when split is legal.

  If the conditions are met, instead of just splitting both the destination and the source types, we create an extend that only goes up one "step" (doubling the element width), and then continue legalizing the rest of the operation normally. The result is that this operates as a new, more efficient, termination condition for the loop of "split the operation until the destination type is legal."

  With this change, the above example now compiles to:

    _bar:
      vpxor %xmm1, %xmm1, %xmm1
      vpunpcklbw %xmm1, %xmm0, %xmm2
      vpunpckhwd %xmm1, %xmm2, %xmm3
      vpunpcklwd %xmm1, %xmm2, %xmm2
      vinsertf128 $1, %xmm3, %ymm2, %ymm2
      vpunpckhbw %xmm1, %xmm0, %xmm0
      vpunpckhwd %xmm1, %xmm0, %xmm3
      vpunpcklwd %xmm1, %xmm0, %xmm0
      vinsertf128 $1, %xmm3, %ymm0, %ymm0
      vmovaps %ymm0, 32(%rdi)
      vmovaps %ymm2, (%rdi)
      vzeroupper
      ret

  This generalizes a custom lowering that was added a while back to the ARM backend. That lowering is no longer necessary, and is removed. The testcases for it, however, provide excellent ARM tests for this change and so remain.
  rdar://14735100
  llvm-svn: 193727
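The five checks in the commit message above can be modelled outside LLVM. The following is only a toy sketch, not the legalizer's code: `VecTy`, `isLegal`, `splitInHalf`, `doubleEltWidth` and `shouldExtendOneStep` are hypothetical names, and legality is assumed to mean "the vector exactly fills a 128-bit or 256-bit register", which roughly matches the x86_64 example but is an assumption of this sketch.

```cpp
#include <cassert>

// Hypothetical stand-in for a vector type: NumElts elements of EltBits bits.
struct VecTy {
  unsigned NumElts;
  unsigned EltBits;
  unsigned totalBits() const { return NumElts * EltBits; }
};

// Assumed legality rule for this sketch only: a vector is legal when it
// exactly fills a 128-bit or 256-bit register.
bool isLegal(VecTy T) {
  return T.totalBits() == 128 || T.totalBits() == 256;
}

VecTy splitInHalf(VecTy T) { return {T.NumElts / 2, T.EltBits}; }
VecTy doubleEltWidth(VecTy T) { return {T.NumElts, T.EltBits * 2}; }

// Returns true when an extend of Src to DstEltBits-wide elements should first
// go up a single "step" (doubling the element width) instead of being split.
bool shouldExtendOneStep(VecTy Src, unsigned DstEltBits) {
  assert(DstEltBits > Src.EltBits && "not an extend");
  if (DstEltBits <= 2 * Src.EltBits)
    return false; // The extend does not more than double the element size.
  VecTy Wider = doubleEltWidth(Src);
  return Src.NumElts % 2 == 0 &&       // the number of elements is even
         isLegal(Src) &&               // the source type is legal
         !isLegal(splitInHalf(Src)) && // a split source is illegal
         isLegal(Wider) &&             // the one-step extended source is legal
         isLegal(splitInHalf(Wider));  // and so is that type when split
}
```

Under these assumptions, a zext from v16i8 to v16i32 takes the one-step path, because v16i8 is legal, v8i8 is not, and both v16i16 and v8i16 are legal.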
* Fix CodeGen for unaligned loads with address spaces (Matt Arsenault, 2013-10-30, 1 file, -2/+5)
  llvm-svn: 193721
* Produce .weak_def_can_be_hidden for some linkonce_odr values (Rafael Espindola, 2013-10-30, 1 file, -7/+20)
  With this patch LLVM produces .weak_def_can_be_hidden for linkonce_odr values that are also unnamed_addr or don't have their address taken. There is not a lot of documentation about .weak_def_can_be_hidden, but from the old discussion about linkonce_odr_auto_hide and the name of the directive this looks correct: these symbols can be hidden. Testing this with the ld64 in Xcode 5 linking clang reduces the number of exported symbols from 21053 to 19049.
  llvm-svn: 193718
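The condition described in this commit message can be written as a small predicate. This is a hedged illustration only: `Linkage`, `GlobalInfo` and `canBeHidden` are hypothetical names invented for the sketch and are not the types or functions the patch touches.

```cpp
// Hypothetical, simplified view of a global value for this sketch.
enum class Linkage { External, LinkOnceODR, WeakODR };

struct GlobalInfo {
  Linkage Link;
  bool UnnamedAddr;  // the value is marked unnamed_addr
  bool AddressTaken; // some use takes the value's address
};

// True when the symbol qualifies for .weak_def_can_be_hidden as described:
// linkonce_odr, and either unnamed_addr or never address-taken.
bool canBeHidden(const GlobalInfo &G) {
  return G.Link == Linkage::LinkOnceODR &&
         (G.UnnamedAddr || !G.AddressTaken);
}
```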
* DebugInfo: Push header handling down into CompileUnit (David Blaikie, 2013-10-30, 3 files, -18/+29)
  This is a preliminary step to handling type units by abstracting over all (type or compile) units.
  llvm-svn: 193714
* DwarfDebug: Change Abbreviations member from pointer to reference (David Blaikie, 2013-10-30, 2 files, -10/+10)
  llvm-svn: 193699
* Revert "SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too." (Juergen Ributzka, 2013-10-30, 2 files, -35/+8)
  Now Hexagon and SystemZ are not happy with it :-(
  llvm-svn: 193677
* SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too. (Juergen Ributzka, 2013-10-30, 2 files, -8/+35)
  The Type Legalizer recognizes that VSELECT needs to be split, because the type is too wide for the given target. The same does not always apply to SETCC, because less space is required to encode the result of a comparison. As a result VSELECT is split and SETCC is unrolled into scalar comparisons.
  This commit fixes the issue by checking for VSELECT-SETCC patterns in the DAG Combiner. If a matching pattern is found, then the result mask of SETCC is promoted to the expected vector mask type for the given target. This mask usually has the same size as the VSELECT return type (except for Intel KNL). Now the type legalizer will split both VSELECT and SETCC.
  This allows the following X86 DAG Combine code to successfully detect the MIN/MAX pattern. This fixes PR16695, PR17002, and <rdar://problem/14594431>.
  Reviewed by Nadav
  llvm-svn: 193676
* Reformat code with clang-format. (Josh Magee, 2013-10-30, 1 file, -37/+40)
  Differential Revision: http://llvm-reviews.chandlerc.com/D2057
  llvm-svn: 193672
* Debug Info: code clean up. (Manman Ren, 2013-10-29, 2 files, -3/+5)
  Use EmitLabelOffsetDifference on Darwin platforms, where non-Darwin platforms use EmitLabelPlusOffset. Also fix a bug in EmitLabelOffsetDifference where the size is hard-coded to 4 even though Size is passed in as an argument.
  llvm-svn: 193660
* Debug Info: support for DW_FORM_ref_addr. (Manman Ren, 2013-10-29, 6 files, -3/+67)
  To support ref_addr, we calculate the section offset of a DIE (i.e. offset of a DIE from beginning of the debug info section). The Offset field in DIE is currently CU-relative.
  To calculate the section offset, we add a DebugInfoOffset field in CompileUnit to store the offset of a CU from beginning of the debug info section. We set the value in DwarfUnits::computeSizeAndOffset for each CompileUnit.
  A helper function DIE::getCompileUnit is added to return the CU DIE that the input DIE belongs to. We also add a map CUDieMap in DwarfDebug to help finding the CU for a given CU DIE.
  For a cross-referenced DIE, we first find the CU DIE it belongs to with getCompileUnit, then we use CUDieMap to get the corresponding CU for the CU DIE. Adding the section offset of the CU with the CU-relative offset of a DIE gives us the section offset of the DIE.
  We correctly emit ref_addr with relocation using EmitLabelPlusOffset when doesDwarfUseRelocationsAcrossSections is true.
  This commit handles the emission of DW_FORM_ref_addr when we have an attribute with FORM_ref_addr. A follow-on patch will start using ref_addr when adding a DIEEntry. This commit will be tested and verified in the follow-on patch.
  Reviewed off-list by Eric, Thanks.
  llvm-svn: 193658
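The offset computation described in this commit message can be sketched in isolation. The types below are simplified stand-ins rather than LLVM's classes: the names `DebugInfoOffset`, `getCompileUnit` and `CUDieMap` are borrowed from the message, but their shapes here (a free function, a `std::unordered_map`) and the `sectionOffset` helper are assumptions made for illustration.

```cpp
#include <cassert>
#include <unordered_map>

// Simplified stand-in for a DIE: a parent link and a CU-relative offset.
struct DIE {
  DIE *Parent;     // null for the compile-unit DIE itself
  unsigned Offset; // offset from the start of the owning CU
};

// Simplified stand-in for a compile unit.
struct CompileUnit {
  DIE *CUDie;               // the root DIE of this unit
  unsigned DebugInfoOffset; // offset of the CU from the start of the section
};

// Walks the parent chain up to the CU DIE that owns D.
const DIE *getCompileUnit(const DIE *D) {
  while (D->Parent)
    D = D->Parent;
  return D;
}

// Maps a CU DIE back to its compile unit.
using CUDieMap = std::unordered_map<const DIE *, const CompileUnit *>;

// Section offset = offset of the owning CU + CU-relative offset of the DIE.
unsigned sectionOffset(const DIE *D, const CUDieMap &Map) {
  auto It = Map.find(getCompileUnit(D));
  assert(It != Map.end() && "DIE is not owned by a known compile unit");
  return It->second->DebugInfoOffset + D->Offset;
}
```

For example, a DIE at CU-relative offset 42 inside a CU that starts 100 bytes into the section has section offset 142.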
* Debug Info: instead of calling addToContextOwner which constructs the context after the DIE creation, we construct the context first. (Manman Ren, 2013-10-29, 2 files, -21/+9)
  Ensure that we create the context before we create a type so that we can add the newly created type to the parent. Remove last use of addToContextOwner now that it's not needed. We use createAndAddDIE to wrap around "new DIE(". Now all shareable DIEs should be added to their parents right after the creation.
  Reviewed off-list by Eric, Thanks.
  llvm-svn: 193657
* [stackprotector] Update the StackProtector pass to perform datalayout analysis. (Josh Magee, 2013-10-29, 1 file, -27/+60)
  This modifies the pass to classify every SSP-triggering AllocaInst according to an SSPLayoutKind (LargeArray, SmallArray, AddrOf). This analysis is collected by the pass and made available for use, but no other pass uses it yet. The next patch will make use of this analysis in PEI and StackSlot passes. The end goal is to support ssp-strong stack layout rules. WIP.
  Differential Revision: http://llvm-reviews.chandlerc.com/D1789
  llvm-svn: 193653
* Move getSymbol to TargetLoweringObjectFile. (Rafael Espindola, 2013-10-29, 2 files, -9/+9)
  This allows constructing a Mangler with just a TargetMachine.
  llvm-svn: 193630
* Add a helper getSymbol to AsmPrinter. (Rafael Espindola, 2013-10-29, 4 files, -14/+17)
  llvm-svn: 193627
* Debug Info: instead of calling addToContextOwner which constructs the context after the DIE creation, we construct the context first. (Manman Ren, 2013-10-29, 1 file, -7/+17)
  This touches creation of namespaces and global variables. The purpose is to handle all DIE creations similarly: construct the context first, then create the DIE and immediately add it to its parent. We use createAndAddDIE to wrap around "new DIE(".
  llvm-svn: 193589
* Fix "existant" typos (Alp Toker, 2013-10-29, 1 file, -1/+1)
  llvm-svn: 193579
* Debug Info: use createAndAddDIE to wrap around "new DIE" in DwarfDebug. (Manman Ren, 2013-10-29, 1 file, -6/+5)
  This commit ensures DIEs are constructed within a compile unit and immediately added to their parents.
  Reviewed off-list by Eric.
  llvm-svn: 193568
* Debug Info: use createAndAddDIE for newly-created Subprogram DIEs. (Manman Ren, 2013-10-29, 1 file, -9/+5)
  More patches will be submitted to convert "new DIE(" to use createAndAddDIE in DwarfCompileUnit.cpp. This will simplify implementation of addDIEEntry where we have to decide between ref4 and ref_addr, because DIEs that can be shared across CUs will be added to a CU already.
  Reviewed off-list by Eric.
  llvm-svn: 193567
* Debug Info: add a helper function createAndAddDIE. (Manman Ren, 2013-10-29, 2 files, -29/+28)
  It wraps around "new DIE(" and handles the bookkeeping part of the newly-created DIE. It adds the DIE to its parent, and calls insertDIE if necessary. It makes sure that bookkeeping is done at the earliest time and we should not see parentless DIEs if all constructions of DIEs go through this helper function.
  Later on, we can use an allocator for DIE allocation, and will only need to change createAndAddDIE instead of modifying all the "new DIE(".
  Reviewed off-list by Eric.
  llvm-svn: 193566
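The idea of a single creation helper that never leaves a DIE without a parent can be shown with a small sketch. This is a simplified model, not the code from the patch: the `DIE` struct and the signature of `createAndAddDIE` below are assumptions, and the `insertDIE` bookkeeping mentioned in the message is left out.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Simplified stand-in for a DIE that owns its children.
struct DIE {
  unsigned Tag;
  DIE *Parent = nullptr;
  std::vector<std::unique_ptr<DIE>> Children;
  explicit DIE(unsigned Tag) : Tag(Tag) {}
};

// Creates a DIE and links it to Parent in one step, so callers can never
// observe a parentless DIE. Because this is the only place that allocates,
// switching to a custom allocator later would only require changing here.
DIE &createAndAddDIE(unsigned Tag, DIE &Parent) {
  std::unique_ptr<DIE> Child(new DIE(Tag));
  Child->Parent = &Parent;
  Parent.Children.push_back(std::move(Child));
  return *Parent.Children.back();
}
```

A caller would write something like `DIE &SP = createAndAddDIE(0x2e, CU);`, where 0x2e is the DWARF value of DW_TAG_subprogram, and can rely on `SP.Parent` being set on return.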
* [DAGCombiner] Respect volatility when checking for aliases (Richard Sandiford, 2013-10-28, 1 file, -18/+25)
  Making useAA() default to true for SystemZ showed that the combiner alias analysis wasn't handling volatile accesses. This hit many of the SystemZ tests, but I arbitrarily picked one for the purpose of this patch.
  llvm-svn: 193518
* Keep TBAA info when rewriting SelectionDAG loads and stores (Richard Sandiford, 2013-10-28, 8 files, -191/+181)
  Most SelectionDAG code drops the TBAA info when creating a new form of a load and store (e.g. during legalization, or when converting a plain load to an extending one). This patch tries to catch all cases where the TBAA information can legitimately be carried over.
  The patch adds alternative forms of getLoad() and getExtLoad() that take a MachineMemOperand instead of individual fields. (The corresponding getTruncStore() already exists.) The idea is to use the MachineMemOperand forms when all fields are carried over (size, pointer info, isVolatile, isNonTemporal, alignment and TBAA info). If some adjustment is being made, e.g. to narrow the load, then we still pass the individual fields but also pass the TBAA info.
  llvm-svn: 193517
* DIEHash: Summary hashing of member functions (David Blaikie, 2013-10-25, 1 file, -1/+1)
  llvm-svn: 193432
* DIEHash: Summary hashing of nested types (David Blaikie, 2013-10-25, 2 files, -1/+26)
  llvm-svn: 193427
* LegalizeDAG: allow libcalls for max/min atomic operations (Tim Northover, 2013-10-25, 2 files, -0/+60)
  ARM processors without ldrex/strex need to be able to make libcalls for all atomic operations, including the newer min/max versions.
  The alternative would probably be expanding these operations in terms of cmpxchg (as x86 does always), but in the configurations where this matters code-size tends to be paramount so the libcall is more desirable.
  llvm-svn: 193398
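For readers unfamiliar with the cmpxchg alternative mentioned in this message, here is a rough source-level sketch of an atomic max built from a compare-and-exchange loop. It uses standard `std::atomic`, not LLVM code; `atomicFetchMax` is a hypothetical name, and the sketch only illustrates the shape of the expansion and why it costs more code than a single libcall.

```cpp
#include <atomic>

// Atomically stores max(current, Val) into Obj and returns the value that was
// observed before the update.
int atomicFetchMax(std::atomic<int> &Obj, int Val) {
  int Old = Obj.load();
  // compare_exchange_weak writes the freshly observed value into Old when it
  // fails, so each retry compares against up-to-date contents. The loop ends
  // when the stored value is already >= Val or the exchange succeeds.
  while (Old < Val && !Obj.compare_exchange_weak(Old, Val)) {
  }
  return Old;
}
```

Every such operation expands to a load, a compare, and a retry loop at each use, which is the code-size cost the message weighs against a libcall.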
* Optimize concat_vectors(X, undef) -> scalar_to_vector(X). (Nadav Rotem, 2013-10-25, 1 file, -1/+28)
  This optimization is not SSE specific so I am moving it to DAGco. The new scalar_to_vector dag node exposed a missing pattern in the AArch64 target that I needed to add.
  llvm-svn: 193393
* MCStreamer: Reimplement the virtual EmitRawText as a protected member, EmitRawTextImpl, to avoid string literal ambiguities (David Blaikie, 2013-10-24, 1 file, -1/+1)
  Also improve the implementation of EmitRawText(Twine) so it doesn't bother using the SmallString buffer if the Twine is a simple StringRef anyway.
  llvm-svn: 193378
* DWARF emission: Remove unnecessary/redundant DIE reference code (David Blaikie, 2013-10-24, 1 file, -7/+0)
  The default case at the end of the switch handles this just fine.
  llvm-svn: 193374
* Fix name of variable in comment. (Eric Christopher, 2013-10-24, 1 file, -1/+1)
  llvm-svn: 193373
* Grammar. (Eric Christopher, 2013-10-24, 1 file, -1/+1)
  llvm-svn: 193372
* Update misleading comment. (Eric Christopher, 2013-10-24, 1 file, -2/+3)
  llvm-svn: 193371
* DIEHash: Const correct and use references where non-null/non-rebound. (David Blaikie, 2013-10-24, 3 files, -49/+49)
  llvm-svn: 193363
* DIEHash: Do not use shallow type hashing for unnamed types (David Blaikie, 2013-10-24, 1 file, -4/+6)
  llvm-svn: 193361
* DIEHash: Refactor ref attribute hashing into smaller functions (David Blaikie, 2013-10-24, 3 files, -68/+98)
  llvm-svn: 193360
* Remove unused debug-only member variable. (David Blaikie, 2013-10-24, 1 file, -4/+0)
  This may've been used at some point but the 'print' member function grew an Indent parameter that entirely shadows this parameter.
  llvm-svn: 193358
* Debug Info: code clean up. (Manman Ren, 2013-10-23, 2 files, -31/+23)
  Since we never insert DIE for DITemplateTypeParameter to a map, there is no need to call getDIE in getOrCreateTemplateTypeParameterDIE. It is also renamed to constructTemplateTypeParameterDIE to match with other construct functions in CompileUnit.
  Same applies to getOrCreateTemplateValueParameterDIE.
  llvm-svn: 193287
* Debug Info: code clean up. (Manman Ren, 2013-10-23, 2 files, -5/+5)
  Rename createMemberDIE to constructMemberDIE to match other construct functions in CompileUnit.
  llvm-svn: 193286
* Debug Info: code clean up. (Manman Ren, 2013-10-23, 2 files, -18/+12)
  Remove the unneeded return values from createMemberDIE, constructEnumTypeDIE, getOrCreateTemplateTypeParameterDIE, and getOrCreateTemplateValueParameterDIE.
  llvm-svn: 193285
* Debug Info: code clean up. (Manman Ren, 2013-10-23, 2 files, -18/+18)
  Unify the argument ordering of private construct functions in CompileUnit to follow constructTypeDIE(DIE &, DIBasicType), constructTypeDIE(DIE &, DIDerivedType), constructTypeDIE(DIE &, DICompositeType), constructSubrangeDIE and constructArrayTypeDIE.
  llvm-svn: 193284
* Remove {} from one-line block. (Manman Ren, 2013-10-23, 1 file, -2/+1)
  llvm-svn: 193276
* Reduce casting and use a fully covered switch. (Rafael Espindola, 2013-10-23, 1 file, -9/+14)
  llvm-svn: 193272
* SelectionDAG: Pass along the original argument/element type in ISD::InputArg (Tom Stellard, 2013-10-23, 2 files, -6/+9)
  For some targets, it is useful to be able to look at the original type of an argument without having to dig through the original IR.
  This also fixes a bug in SelectionDAGBuilder where InputArg.PartOffset was not taking into account the offset of structure elements.
  Patch by: Justin Holewinski
  Tom Stellard:
  - Changed the type of ArgVT to EVT, so it can store non-simple types like v3i32.
  llvm-svn: 193214
* Debug Info: code clean up. (Manman Ren, 2013-10-22, 1 file, -7/+1)
  Remove unnecessary creation of LexicalScope in collectDeadVariables. The created LexicalScope was only used to get isAbstractScope, which should be false from the creation: "new LexicalScope(NULL, DIDescriptor(SP), NULL, false);".
  We can also remove a DenseMap that holds the created LexicalScopes.
  llvm-svn: 193196
* DIEHashing: Provide an assert for unreachable functionality regarding friends. (David Blaikie, 2013-10-22, 1 file, -0/+3)
  Since (as of r190716) Clang no longer emits debug info for C++ friend declarations (and it seems GCC never has/does, which was the motivation for the Clang change), there's no actual reachable case for implementing the part of DWARF 4, Section 7.27 part 5 that pertains to friends. Leave an assert here so that if/when we do have a client producing friends and using type units, we can fill in the gap and add appropriate (unit and feature) tests.
  llvm-svn: 193193
* DWARF type hashing: pointers to members (David Blaikie, 2013-10-22, 1 file, -11/+14)
  Includes a test case/FIXME demonstrating a bug/limitation in pointer to member hashing. To be honest I'm not sure why we don't just always use summary hashing for referenced types... but perhaps I'm missing something.
  llvm-svn: 193175
* Using FoldingSet in SelectionDAG::getVTList. (Wan Xiaofei, 2013-10-22, 1 file, -59/+64)
  VTList has a long life cycle through the module and getVTList is frequently called. The current getVTList uses a sequential search over a std::vector, which is inefficient in big modules. This patch uses FoldingSet to implement a hashing mechanism for the search.
  Reviewer: Nadav Rotem
  Test: Pass unit tests & LNT test suite
  llvm-svn: 193150
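The move from a linear scan to hashed lookup can be illustrated with the standard library. This sketch deliberately swaps in `std::unordered_set` in place of LLVM's FoldingSet, and `VTList`, `VTListHash` and `VTListPool` are hypothetical names; it shows the uniquing idea only, not the patch's implementation.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_set>
#include <vector>

// Stand-in for a list of value types; ints replace the real type enum.
using VTList = std::vector<int>;

// Combines the length and every element into one hash value.
struct VTListHash {
  std::size_t operator()(const VTList &L) const {
    std::size_t H = L.size();
    for (int VT : L)
      H = H * 31 + std::hash<int>()(VT);
    return H;
  }
};

// Uniques type lists: equal lists always map to the same stored copy, and
// each lookup is a hash probe instead of a scan over all existing lists.
class VTListPool {
  std::unordered_set<VTList, VTListHash> Lists;

public:
  // Returns a pointer to the unique stored copy of L, inserting it if needed.
  // Pointers to unordered_set elements stay valid across later insertions.
  const VTList *get(const VTList &L) { return &*Lists.insert(L).first; }
  std::size_t size() const { return Lists.size(); }
};
```

Requesting the same list twice returns the same pointer, so callers can compare lists by address, and the cost of a lookup no longer grows with the number of lists already created.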
* Formatting/whitespace. (Eric Christopher, 2013-10-22, 1 file, -4/+4)
  llvm-svn: 193135
* DWARF Type Hashing: Include reference and rvalue reference type in the declarable summary hashing path (David Blaikie, 2013-10-21, 1 file, -1/+3)
  More support for 7.25 Part 5.
  llvm-svn: 193129