summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
...
* [Stackmap] Add AnyReg calling convention support for patchpoint intrinsic.Juergen Ributzka2013-11-084-36/+100
| | | | | | | | | | | | | | The idea of the AnyReg Calling Convention is to provide the call arguments in registers, but not to force them to be placed in a paticular order into a specified set of registers. Instead it is up tp the register allocator to assign any register as it sees fit. The same applies to the return value (if applicable). Differential Revision: http://llvm-reviews.chandlerc.com/D2009 Reviewed by Andy llvm-svn: 194293
* increase the accuracy of register pressure computation in the presence of ↵Pedro Artigas2013-11-082-16/+36
| | | | | | dead definitions by using live intervals, if available, to identify dead definitions and proceed accordingly. llvm-svn: 194286
* Fix some minor issues with r194282 to get the tree healthy again.Lang Hames2013-11-081-1/+2
| | | | llvm-svn: 194284
* Add a method to get the object-file appropriate stack map section.Lang Hames2013-11-081-2/+1
| | | | | | Thanks to Eric Christopher for the tips on the appropriate way to do this. llvm-svn: 194282
* Revert "CalculateSpillWeights does not need to be a pass"Arnaud A. de Grandmaison2013-11-085-9/+29
| | | | | | Temporarily revert my previous commit until I understand why it breaks 3 target tests. llvm-svn: 194272
* [VirtRegMap] Fix for PR17825. Do not ignore noreturn definitions when settingQuentin Colombet2013-11-081-1/+5
| | | | | | | isPhysRegUsed if the unwind information is required. Indeed, the runtime may need a correct stack to be able to unwind the call. llvm-svn: 194271
* CalculateSpillWeights does not need to be a passArnaud A. de Grandmaison2013-11-085-29/+9
| | | | | | | | | | Based on discussions with Lang Hames and Jakob Stoklund Olesen at the hacker's lab, and in the light of upcoming work on the PBQP register allocator, it was though that CalcSpillWeights does not need to be a pass. This change will enable to customize / tune the spill weight computation depending on the allocator. Update the documentation style while there. No functionnal change. llvm-svn: 194269
* CalculateSpillWeights cleanup: remove unneeded includesArnaud A. de Grandmaison2013-11-081-2/+0
| | | | llvm-svn: 194259
* Slightly change the way stackmap and patchpoint intrinsics are lowered.Andrew Trick2013-11-051-9/+27
| | | | | | | | | | | | | | MorphNodeTo is not safe to call during DAG building. It eagerly deletes dependent DAG nodes which invalidates the NodeMap. We could expose a safe interface for morphing nodes, but I don't think it's worth it. Just create a new MachineNode and replaceAllUsesWith. My understaning of the SD design has been that we want to support early target opcode selection. That isn't very well supported, but generally works. It seems reasonable to rely on this feature even if it isn't widely used. llvm-svn: 194102
* Comment some and reformat for clarity beginFunction.Eric Christopher2013-11-011-30/+42
| | | | llvm-svn: 193894
* [Stackmap] Remove erroneous assert.Juergen Ributzka2013-11-011-3/+0
| | | | llvm-svn: 193871
* Remove linkonce_odr_auto_hide.Rafael Espindola2013-11-011-4/+2
| | | | | | | | | | | | | | | linkonce_odr_auto_hide was in incomplete attempt to implement a way for the linker to hide symbols that are known to be available in every TU and whose addresses are not relevant for a particular DSO. It was redundant in that it all its uses are equivalent to linkonce_odr+unnamed_addr. Unlike those, it has never been connected to clang or llvm's optimizers, so it was effectively dead. Given that nothing produces it, this patch just nukes it (other than the llvm-c enum value). llvm-svn: 193865
* Commenting out this assert because it is causing the build bots to fail. ↵Aaron Ballman2013-11-011-2/+2
| | | | | | This effectively reverts r193861, but needs to be fixed as part of r193769. llvm-svn: 193862
* Fixing an order of evaluation error in an assert.Aaron Ballman2013-11-011-1/+1
| | | | llvm-svn: 193861
* DebugInfo: Emit member variable locations as data instead of expressions in ↵David Blaikie2013-11-011-29/+31
| | | | | | | | | blocks Drive by space optimization. Also makes the DIEs more regular which might speed up DWARF parsing. llvm-svn: 193835
* Unused variableAndrew Trick2013-10-311-0/+1
| | | | llvm-svn: 193819
* Add support for stack map generation in the X86 backend.Andrew Trick2013-10-313-0/+217
| | | | | | Originally implemented by Lang Hames. llvm-svn: 193811
* Debug Info: remove duplication of DIEs when a DIE can be shared across CUs.Manman Ren2013-10-316-11/+87
| | | | | | | | | | | | | | | | | | | We add a map in DwarfDebug to map MDNodes that are shareable across CUs to the corresponding DIEs: MDTypeNodeToDieMap. These DIEs can be shared across CUs, that is why we keep the maps in DwarfDebug instead of CompileUnit. We make the assumption that if a DIE is not added to an owner yet, we assume it belongs to the current CU. Since DIEs for the type system are added to their owners immediately after creation, and other DIEs belong to the current CU, the assumption should be true. A testing case is added to show that we only create a single DIE for a type MDNode and we use ref_addr to refer to the type DIE. We also add a testing case to show ref_addr relocations for non-darwin platforms. llvm-svn: 193779
* Lower stackmap intrinsics directly to their target opcode in the DAG builder.Andrew Trick2013-10-313-11/+216
| | | | llvm-svn: 193769
* whitespaceAndrew Trick2013-10-311-7/+7
| | | | llvm-svn: 193765
* Remove the --shrink-wrap option.Rafael Espindola2013-10-314-1394/+69
| | | | | | It had no tests, was unused and was "experimental at best". llvm-svn: 193749
* Legalize: Improve legalization of long vector extends.Jim Grosbach2013-10-312-3/+63
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When an extend more than doubles the size of the elements (e.g., a zext from v16i8 to v16i32), the normal legalization method of splitting the vectors will run into problems as by the time the destination vector is legal, the source vector is illegal. The end result is the operation often becoming scalarized, with the typical horrible performance. For example, on x86_64, the simple input of: define void @bar(<16 x i8> %a, <16 x i32>* %p) nounwind { %tmp = zext <16 x i8> %a to <16 x i32> store <16 x i32> %tmp, <16 x i32>*%p ret void } Generates: .section __TEXT,__text,regular,pure_instructions .section __TEXT,__const .align 5 LCPI0_0: .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .section __TEXT,__text,regular,pure_instructions .globl _bar .align 4, 0x90 _bar: vpunpckhbw %xmm0, %xmm0, %xmm1 vpunpckhwd %xmm0, %xmm1, %xmm2 vpmovzxwd %xmm1, %xmm1 vinsertf128 $1, %xmm2, %ymm1, %ymm1 vmovaps LCPI0_0(%rip), %ymm2 vandps %ymm2, %ymm1, %ymm1 vpmovzxbw %xmm0, %xmm3 vpunpckhwd %xmm0, %xmm3, %xmm3 vpmovzxbd %xmm0, %xmm0 vinsertf128 $1, %xmm3, %ymm0, %ymm0 vandps %ymm2, %ymm0, %ymm0 vmovaps %ymm0, (%rdi) vmovaps %ymm1, 32(%rdi) vzeroupper ret So instead we can check if there are legal types that enable us to split more cleverly when the input vector is already legal such that we don't turn it into an illegal type. If the extend is such that it's more than doubling the size of the input we check if - the number of vector elements is even, - the source type is legal, - the type of a split source is illegal, - the type of an extended (by doubling element size) source is legal, and - the type of that extended source when split is legal. If the conditions are met, instead of just splitting both the destination and the source types, we create an extend that only goes up one "step" (doubling the element width), and the continue legalizing the rest of the operation normally. The result is that this operates as a new, more effecient, termination condition for the loop of "split the operation until the destination type is legal." With this change, the above example now compiles to: _bar: vpxor %xmm1, %xmm1, %xmm1 vpunpcklbw %xmm1, %xmm0, %xmm2 vpunpckhwd %xmm1, %xmm2, %xmm3 vpunpcklwd %xmm1, %xmm2, %xmm2 vinsertf128 $1, %xmm3, %ymm2, %ymm2 vpunpckhbw %xmm1, %xmm0, %xmm0 vpunpckhwd %xmm1, %xmm0, %xmm3 vpunpcklwd %xmm1, %xmm0, %xmm0 vinsertf128 $1, %xmm3, %ymm0, %ymm0 vmovaps %ymm0, 32(%rdi) vmovaps %ymm2, (%rdi) vzeroupper ret This generalizes a custom lowering that was added a while back to the ARM backend. That lowering is no longer necessary, and is removed. The testcases for it, however, provide excellent ARM tests for this change and so remain. rdar://14735100 llvm-svn: 193727
* Fix CodeGen for unaligned loads with address spacesMatt Arsenault2013-10-301-2/+5
| | | | llvm-svn: 193721
* Produce .weak_def_can_be_hidden for some linkonce_odr valuesRafael Espindola2013-10-301-7/+20
| | | | | | | | | | | | | | With this patch llvm produces a weak_def_can_be_hidden for linkonce_odr if they are also unnamed_addr or don't have their address taken. There is not a lot of documentation about .weak_def_can_be_hidden, but from the old discussion about linkonce_odr_auto_hide and the name of the directive this looks correct: these symbols can be hidden. Testing this with the ld64 in Xcode 5 linking clang reduces the number of exported symbols from 21053 to 19049. llvm-svn: 193718
* DebugInfo: Push header handling down into CompileUnitDavid Blaikie2013-10-303-18/+29
| | | | | | | This is a preliminary step to handling type units by abstracting over all (type or compile) units. llvm-svn: 193714
* DwarfDebug: Change Abbreviations member from pointer to referenceDavid Blaikie2013-10-302-10/+10
| | | | llvm-svn: 193699
* Revert "SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs ↵Juergen Ributzka2013-10-302-35/+8
| | | | | | | | splitting too." Now Hexagon and SystemZ are not happy with it :-( llvm-svn: 193677
* SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too.Juergen Ributzka2013-10-302-8/+35
| | | | | | | | | | | | | | | | | | | | The Type Legalizer recognizes that VSELECT needs to be split, because the type is to wide for the given target. The same does not always apply to SETCC, because less space is required to encode the result of a comparison. As a result VSELECT is split and SETCC is unrolled into scalar comparisons. This commit fixes the issue by checking for VSELECT-SETCC patterns in the DAG Combiner. If a matching pattern is found, then the result mask of SETCC is promoted to the expected vector mask type for the given target. This mask has usually the same size as the VSELECT return type (except for Intel KNL). Now the type legalizer will split both VSELECT and SETCC. This allows the following X86 DAG Combine code to sucessfully detect the MIN/MAX pattern. This fixes PR16695, PR17002, and <rdar://problem/14594431>. Reviewed by Nadav llvm-svn: 193676
* Reformat code with clang-format.Josh Magee2013-10-301-37/+40
| | | | | | Differential Revision: http://llvm-reviews.chandlerc.com/D2057 llvm-svn: 193672
* Debug Info: code clean up.Manman Ren2013-10-292-3/+5
| | | | | | | | | | Use EmitLabelOffsetDifference for handling on darwin platform when non-darwin platforms use EmitLabelPlusOffset. Also fix a bug in EmitLabelOffsetDifference where the size is hard-coded to 4 even though Size is passed in as an argument. llvm-svn: 193660
* Debug Info: support for DW_FORM_ref_addr.Manman Ren2013-10-296-3/+67
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | To support ref_addr, we calculate the section offset of a DIE (i.e. offset of a DIE from beginning of the debug info section). The Offset field in DIE is currently CU-relative. To calculate the section offset, we add a DebugInfoOffset field in CompileUnit to store the offset of a CU from beginning of the debug info section. We set the value in DwarfUnits::computeSizeAndOffset for each CompileUnit. A helper function DIE::getCompileUnit is added to return the CU DIE that the input DIE belongs to. We also add a map CUDieMap in DwarfDebug to help finding the CU for a given CU DIE. For a cross-referenced DIE, we first find the CU DIE it belongs to with getCompileUnit, then we use CUDieMap to get the corresponding CU for the CU DIE. Adding the section offset of the CU with the CU-relative offset of a DIE gives us the seciton offset of the DIE. We correctly emit ref_addr with relocation using EmitLabelPlusOffset when doesDwarfUseRelocationsAcrossSections is true. This commit handles the emission of DW_FORM_ref_addr when we have an attribute with FORM_ref_addr. A follow-on patch will start using ref_addr when adding a DIEEntry. This commit will be tested and verified in the follow-on patch. Reviewed off-list by Eric, Thanks. llvm-svn: 193658
* Debug Info: instead of calling addToContextOwner which constructs the contextManman Ren2013-10-292-21/+9
| | | | | | | | | | | | | | | after the DIE creation, we construct the context first. Ensure that we create the context before we create a type so that we can add the newly created type to the parent. Remove last use of addToContextOwner now that it's not needed. We use createAndAddDIE to wrap around "new DIE(". Now all shareable DIEs should be added to their parents right after the creation. Reviewed off-list by Eric, Thanks. llvm-svn: 193657
* [stackprotector] Update the StackProtector pass to perform datalayout analysis.Josh Magee2013-10-291-27/+60
| | | | | | | | | | | | | | | This modifies the pass to classify every SSP-triggering AllocaInst according to an SSPLayoutKind (LargeArray, SmallArray, AddrOf). This analysis is collected by the pass and made available for use, but no other pass uses it yet. The next patch will make use of this analysis in PEI and StackSlot passes. The end goal is to support ssp-strong stack layout rules. WIP. Differential Revision: http://llvm-reviews.chandlerc.com/D1789 llvm-svn: 193653
* Move getSymbol to TargetLoweringObjectFile.Rafael Espindola2013-10-292-9/+9
| | | | | | This allows constructing a Mangler with just a TargetMachine. llvm-svn: 193630
* Add a helper getSymbol to AsmPrinter.Rafael Espindola2013-10-294-14/+17
| | | | llvm-svn: 193627
* Debug Info: instead of calling addToContextOwner which constructs the contextManman Ren2013-10-291-7/+17
| | | | | | | | | | | | after the DIE creation, we construct the context first. This touches creation of namespaces and global variables. The purpose is to handle all DIE creations similarly: constructs the context first, then creates the DIE and immediately adds the DIE to its parent. We use createAndAddDIE to wrap around "new DIE(". llvm-svn: 193589
* Fix "existant" typosAlp Toker2013-10-291-1/+1
| | | | llvm-svn: 193579
* Debug Info: use createAndAddDIE to wrap around "new DIE" in DwarfDebug.Manman Ren2013-10-291-6/+5
| | | | | | | | | This commit ensures DIEs are constructed within a compile unit and immediately added to their parents. Reviewed off-list by Eric. llvm-svn: 193568
* Debug Info: use createAndAddDIE for newly-created Subprogram DIEs.Manman Ren2013-10-291-9/+5
| | | | | | | | | | | More patches will be submitted to convert "new DIE(" to use createAddAndDIE in DwarfCompileUnit.cpp. This will simplify implementation of addDIEEntry where we have to decide between ref4 and ref_addr, because DIEs that can be shared across CU will be added to a CU already. Reviewed off-list by Eric. llvm-svn: 193567
* Debug Info: add a helper function createAndAddDIE.Manman Ren2013-10-292-29/+28
| | | | | | | | | | | | | | It wraps around "new DIE(" and handles the bookkeeping part of the newly-created DIE. It adds the DIE to its parent, and calls insertDIE if necessary. It makes sure that bookkeeping is done at the earliest time and we should not see parentless DIEs if all constructions of DIEs go through this helper function. Later on, we can use an allocator for DIE allocation, and will only need to change createAndAddDIE instead of modifying all the "new DIE(". Reviewed off-list by Eric. llvm-svn: 193566
* [DAGCombiner] Respect volatility when checking for aliasesRichard Sandiford2013-10-281-18/+25
| | | | | | | | Making useAA() default to true for SystemZ showed that the combiner alias analysis wasn't handling volatile accesses. This hit many of the SystemZ tests, but I arbitrarily picked one for the purpose of this patch. llvm-svn: 193518
* Keep TBAA info when rewriting SelectionDAG loads and storesRichard Sandiford2013-10-288-191/+181
| | | | | | | | | | | | | | | | | Most SelectionDAG code drops the TBAA info when creating a new form of a load and store (e.g. during legalization, or when converting a plain load to an extending one). This patch tries to catch all cases where the TBAA information can legitimately be carried over. The patch adds alternative forms of getLoad() and getExtLoad() that take a MachineMemOperand instead of individual fields. (The corresponding getTruncStore() already exists.) The idea is to use the MachineMemOperand forms when all fields are carried over (size, pointer info, isVolatile, isNonTemporal, alignment and TBAA info). If some adjustment is being made, e.g. to narrow the load, then we still pass the individual fields but also pass the TBAA info. llvm-svn: 193517
* DIEHash: Summary hashing of member functionsDavid Blaikie2013-10-251-1/+1
| | | | llvm-svn: 193432
* DIEHash: Summary hashing of nested typesDavid Blaikie2013-10-252-1/+26
| | | | llvm-svn: 193427
* LegalizeDAG: allow libcalls for max/min atomic operationsTim Northover2013-10-252-0/+60
| | | | | | | | | | | ARM processors without ldrex/strex need to be able to make libcalls for all atomic operations, including the newer min/max versions. The alternative would probably be expanding these operations in terms of cmpxchg (as x86 does always), but in the configurations where this matters code-size tends to be paramount so the libcall is more desirable. llvm-svn: 193398
* Optimize concat_vectors(X, undef) -> scalar_to_vector(X).Nadav Rotem2013-10-251-1/+28
| | | | | | | This optimization is not SSE specific so I am moving it to DAGco. The new scalar_to_vector dag node exposed a missing pattern in the AArch64 target that I needed to add. llvm-svn: 193393
* MCStreamer: Reimplement the virtual EmitRawText as a protected member, ↵David Blaikie2013-10-241-1/+1
| | | | | | | | | | EmitRawTextImpl, to avoid string literal ambiguities Also improve the implementation of EmitRawText(Twine) so it doesn't bother using the SmallString buffer if the Twine is a simple StringRef anyway. llvm-svn: 193378
* DWARF emission: Remove unnecessary/redundant DIE reference codeDavid Blaikie2013-10-241-7/+0
| | | | | | The default case at the end of the switch handles this just fine. llvm-svn: 193374
* Fix name of variable in comment.Eric Christopher2013-10-241-1/+1
| | | | llvm-svn: 193373
* Grammar.Eric Christopher2013-10-241-1/+1
| | | | llvm-svn: 193372
OpenPOWER on IntegriCloud