path: root/llvm/lib
Commit message (author, date, files changed, lines -removed/+added)
...
* [ThinLTO] Add GraphTraits for FunctionSummaries (Charles Saternos, 2018-02-11, 3 files, -1/+32)
  Add GraphTraits definitions to the FunctionSummary and ModuleSummaryIndex classes. These GraphTraits will be used to find SCCs in ThinLTO analysis passes.
  llvm-svn: 324854
* [CodeView] Allow variable names to be as long as the codeview format supports (Brock Wyma, 2018-02-11, 1 file, -3/+4)
  Instead of reserving 0xF00 bytes for the fixed length portion of the CodeView symbol name, calculate the actual length of the fixed length portion.
  Differential Revision: https://reviews.llvm.org/D42125
  llvm-svn: 324850
* [X86][SSE] Use SplitBinaryOpsAndApply to recognise PSUBUS patterns before they're split on AVX1 (Simon Pilgrim, 2018-02-11, 1 file, -8/+16)
  This needs to be generalised further to support AVX512BW cases but I want to add non-uniform constants first.
  llvm-svn: 324844
* [InstCombine] X / (X * Y) -> 1 / Y if the multiplication does not overflow (Sanjay Patel, 2018-02-11, 1 file, -0/+11)
  The related cases for (X * Y) / X were handled in rL124487.
  https://rise4fun.com/Alive/6k9
  The division in these tests is subsequently eliminated by existing instcombines for 1/X.
  llvm-svn: 324843
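  A minimal IR sketch of the fold (value names are illustrative; the no-overflow flag on the multiply is what licenses it):

      %m = mul nuw i32 %x, %y
      %d = udiv i32 %x, %m
      ; --> %d = udiv i32 1, %y   (the sdiv form needs nsw on the mul)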
* [X86] Use min/max for vector ult/ugt compares if it avoids a sign flip. (Craig Topper, 2018-02-11, 1 file, -27/+34)
  Summary:
  Currently we only use min/max to help with ule/uge compares because it removes an invert of the result that would otherwise be needed. But we can also use it for ult/ugt compares if it prevents the sign-bit flip needed to use pcmpgt, at the cost of requiring an invert after the compare.
  I also refactored the code so that the max/min code is self-contained and does its own return instead of setting a flag to manipulate the rest of the function's behavior.
  Most of the test cases look ok with this. I did notice that we added instructions when one of the operands being sign flipped is a constant vector that we were able to constant fold the flip into.
  I also noticed that sometimes the SSE min/max clobbers a register that is needed after the compare. This resulted in an extra move being inserted before the min/max to preserve the register. We could try to detect this and switch from min to max and change the compare operands to use the operand that gets reused in the compare.
  Reviewers: spatel, RKSimon
  Reviewed By: RKSimon
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D42935
  llvm-svn: 324842
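  Sketched for one case (the lowering notes are illustrative, not the exact instruction selection):

      %r = icmp ult <4 x i32> %a, %b
      ; without min/max: flip the sign bit of both operands (two pxor with a
      ; 0x80000000 splat) so that signed pcmpgtd computes the unsigned result
      ; with min/max:    a <u b  <=>  not (umax(a, b) == a), i.e.
      ;                  pmaxud + pcmpeqd + one invert, no sign-flip constants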
* [X86][SSE] Moved SplitBinaryOpsAndApply earlier so more methods can use it. NFCI. (Simon Pilgrim, 2018-02-11, 1 file, -47/+47)
  llvm-svn: 324841
* [TargetLowering] try to create -1 constant operand for math ops via demanded bits (Sanjay Patel, 2018-02-11, 1 file, -0/+21)
  This reverses instcombine's demanded-bits transform, which always tries to clear bits in constants. As noted in PR35792 and shown in the test diffs (https://bugs.llvm.org/show_bug.cgi?id=35792), we can do better in codegen by trying to form -1.
  The x86 sub test shows a missed opportunity. I did investigate changing instcombine's behavior, but it would be more work to change canonicalization in IR. Clearing bits / shrinking constants can allow killing instructions, so we'd have to figure out how to not regress those cases.
  Differential Revision: https://reviews.llvm.org/D42986
  llvm-svn: 324839
* [X86][SSE] Enable SMIN/SMAX/UMIN/UMAX custom lowering for all legal types (Simon Pilgrim, 2018-02-11, 2 files, -8/+66)
  This allows us to recognise more saturation patterns and also simplify some MINMAX codegen that was failing to combine CMPGE comparisons to a legal CMPGT.
  Differential Revision: https://reviews.llvm.org/D43014
  llvm-svn: 324837
* [X86] Reduce Store Forward Block issues in HW (Lama Saba, 2018-02-11, 4 files, -0/+563)
  If a load follows a store and reloads data that the store has written to memory, Intel microarchitectures can in many cases forward the data directly from the store to the load. This "store forwarding" saves cycles by enabling the load to obtain the data directly instead of accessing it from cache or memory.
  A "store forward block" occurs when a store cannot be forwarded to the load. The most typical case on Intel Core microarchitectures is a small store that cannot be forwarded to a large load. The estimated penalty for a store forward block is ~13 cycles.
  This pass tries to recognize and handle cases where a store forward block is created by the compiler when lowering memcpy calls to a sequence of a load and a store. The pass currently only handles cases where the memcpy is lowered to XMM/YMM registers; it tries to break the memcpy into smaller copies. Breaking the memcpy should be possible since there is no atomicity guarantee for loads and stores to XMM/YMM.
  Change-Id: I620b6dc91583ad9a1444591e3ddc00dd25d81748
  llvm-svn: 324835
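  A sketch of the shape the pass targets (IR names are illustrative):

      %p = bitcast i8* %src to i32*
      store i32 1, i32* %p    ; narrow 4-byte store into the copy source
      call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dst, i8* %src, i64 32, i1 false)
      ; lowered as a single 32-byte YMM load/store, the reload of %src
      ; overlaps the 4-byte store and blocks forwarding; splitting the copy
      ; into smaller loads and stores avoids the penalty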
* [X86] Don't make 512-bit vectors legal when preferred vector width is 256 bits and 512 bits aren't required (Craig Topper, 2018-02-11, 4 files, -25/+77)
  This patch adds a new function attribute "required-vector-width" that can be set by the frontend to indicate the maximum vector width present in the original source code. The idea is that this would be set based on ABI requirements, intrinsics or explicit vector types being used, maybe simd pragmas, etc. The backend will then use this information to determine if it's safe to make 512-bit vectors illegal when the preference is for 256-bit vectors.
  For code that has no vectors in it originally and only gets vectors through the loop and slp vectorizers, this allows us to generate code largely similar to our AVX2-only output while still enabling AVX512 features like mask registers and gather/scatter. The loop vectorizer doesn't always obey TTI and will create oversized vectors with the expectation that the backend will legalize them. In order to avoid changing the vectorizer and potentially harming our AVX2 codegen, this patch tries to make the legalizer behavior similar.
  This is restricted to CPUs that support AVX512F and AVX512VL so that we have good fallback options to use 128 and 256-bit vectors and still get masking.
  I've qualified every place I could find in X86ISelLowering.cpp and added test cases for many of them with 2 different values for the attribute to see the codegen differences.
  We still need to do frontend work for the attribute and teach the inliner how to merge it, etc. But this gets the codegen layer ready for it.
  Differential Revision: https://reviews.llvm.org/D42724
  llvm-svn: 324834
* [X86] Remove setOperationAction lines for promoting vXi1 SINT_TO_FP/UINT_TO_FP. (Craig Topper, 2018-02-11, 1 file, -26/+0)
  We promote these via a DAG combine now before lowering gets the chance. Also remove the v2i1 custom handling since it will no longer be triggered.
  llvm-svn: 324833
* [SelectionDAG] Remove TargetLowering::getConstTrueVal. Use SelectionDAG::getBoolConstant in the one place it was used. (Craig Topper, 2018-02-11, 2 files, -12/+3)
  SelectionDAG::getBoolConstant was recently introduced. At the time I didn't know getConstTrueVal existed, but I think getBoolConstant is better, as it will use the source VT to make sure it can properly detect floating point if it is configured differently.
  llvm-svn: 324832
* [X86] Remove some redundant qualifications from the setOperationAction blocks. NFC (Craig Topper, 2018-02-11, 1 file, -4/+2)
  These were added as part of the refactoring for prefer-vector-width. At the time I thought the hasAVX512 here would be replaced with "allow 512 bit vectors" so that it would read "allow 512 bit vectors OR VLX". But now the plan is to only give the option of disabling 512-bit vectors when VLX is enabled, so we don't need this qualification at all.
  llvm-svn: 324831
* [X86] Change signatures of avx512 packed fp compare intrinsics to return a vXi1 mask type to be closer to an fcmp. (Craig Topper, 2018-02-10, 2 files, -25/+85)
  Summary:
  This patch changes the signature of the avx512 packed fp compare intrinsics to return a vXi1 vector and no longer take a mask as input. The casts to scalar type will now need to be explicit in the IR. The masking node will now be an explicit 'and' in the IR.
  This makes the intrinsic look much more similar to an fcmp instruction, which we wish we could use for these but can't. We already use icmp instructions for integer compares.
  Previously the lowering step of isel would turn the intrinsic into an X86-specific ISD node and emit the masking nodes as well as some bitcasts. This means DAG combines can't see the vXi1 type until somewhat late, making it more difficult to combine out gpr<->mask transition sequences. By exposing the vXi1 type explicitly in the IR and initial SelectionDAG we give earlier DAG combines and even InstCombine the chance to see it and optimize it.
  This should make any issues with gpr<->mask sequences the same between integer and fp, meaning we only have to fix them once.
  Reviewers: spatel, delena, RKSimon, zvi
  Reviewed By: RKSimon
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D43137
  llvm-svn: 324827
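  The shape of the change, sketched (intrinsic signatures abbreviated; see the review for the exact names and operands):

      ; before: mask as an input, scalar mask as the result
      %r = call i16 @llvm.x86.avx512.mask.cmp.ps.512(<16 x float> %a, <16 x float> %b, i32 1, i16 %mask, i32 4)
      ; after: vXi1 result; the masking and the scalar cast are explicit IR
      %c = call <16 x i1> @llvm.x86.avx512.cmp.ps.512(<16 x float> %a, <16 x float> %b, i32 1, i32 4)
      %m = and <16 x i1> %c, %maskvec
      %r = bitcast <16 x i1> %m to i16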
* [InstCombine] Add constant vector support for ~(C >> Y) --> ~C >> Y (Simon Pilgrim, 2018-02-10, 1 file, -5/+7)
  Includes adding an m_NonNegative constant pattern matcher.
  llvm-svn: 324825
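  For the arithmetic-shift case, which holds for any constant because bitwise 'not' commutes with ashr, the fold now also applies to vectors:

      %s = ashr <2 x i8> <i8 42, i8 42>, %y
      %r = xor <2 x i8> %s, <i8 -1, i8 -1>
      ; --> %r = ashr <2 x i8> <i8 -43, i8 -43>, %y

  The new m_NonNegative matcher guards the variants that are only valid when the constant's sign bit is clear.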
* [X86][SSE] Increase PMULLD costs to better match hardware (Simon Pilgrim, 2018-02-10, 1 file, -3/+5)
  Until Skylake, most hardware could only issue a PMULLD op every other cycle.
  llvm-svn: 324823
* [X86] Custom legalize (v2i32 (setcc (v2f32))) so that we don't end up with a (v4i1 (setcc (v4f32))) (Craig Topper, 2018-02-10, 1 file, -0/+32)
  Under VLX, getSetCCResultType returns v2i1/v4i1 for v2f32/v4f32, so default type legalization will end up changing the setcc result type back to vXi1 if it had been extended. The resulting extend gets messed up further by type legalization and is difficult to recombine back to (v4i32 (setcc (v4f32))) after legalization.
  I went ahead and enabled this for SSE2 and later since it's always the result we want and this helps type legalization get there in fewer steps.
  llvm-svn: 324822
* [X86] Extend inputs with elements smaller than i32 to sint_to_fp/uint_to_fp before type legalization. (Craig Topper, 2018-02-10, 1 file, -6/+3)
  This prevents extends of masks being introduced during lowering where it becomes difficult to combine them out.
  There are a few oddities in here. We sometimes concatenate two k-registers produced by two compares, sign_extend the combined pair, then extract two halves. This worked better previously because the sign_extend wasn't created until after the fp_to_sint was split, which led to a split sign_extend being created.
  We probably also need to custom type legalize (v2i32 (sext v2i1)) via widening.
  llvm-svn: 324820
* [X86] Custom legalize (v2i1 (fp_to_uint/fp_to_sint v2f64)) without AVX512VL. (Craig Topper, 2018-02-10, 1 file, -4/+2)
  Strangely, the code was already present; just the setOperationAction wasn't being called without VLX.
  llvm-svn: 324806
* [X86] Legalize zero extends from vXi1 to vXi16/vXi32/vXi64 using a sign extend and a shift. (Craig Topper, 2018-02-10, 1 file, -5/+12)
  This avoids a constant pool load to create 1.
  The int->float tests show converts to mask and back. We probably need to widen inputs to sint_to_fp/uint_to_fp before type legalization.
  llvm-svn: 324805
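  The trick, written in IR for readability (the legalization itself runs on SelectionDAG nodes):

      ; zext <8 x i1> %m to <8 x i16> without loading a splat of 1:
      %s = sext <8 x i1> %m to <8 x i16>    ; each lane becomes 0 or -1
      %z = lshr <8 x i16> %s, <i16 15, i16 15, i16 15, i16 15, i16 15, i16 15, i16 15, i16 15>    ; each lane is now 0 or 1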
* [X86] Teach combineExtSetcc to handle ZERO_EXTEND by widening the setcc and then masking. A later DAG combine will convert to a shift. (Craig Topper, 2018-02-10, 1 file, -5/+6)
  This helps to avoid a constant pool load needed to zero extend from the mask.
  llvm-svn: 324804
* [DAG] Make early exit hasPredecessorHelper return true. NFCI. (Nirav Dave, 2018-02-10, 2 files, -10/+4)
  All uses conservatively assume in the early-exit case that it will be a predecessor. Changing the default removes checking code in all uses.
  llvm-svn: 324797
* [X86] Teach combineInsertSubvector how to combine some k-register insert_subvector and extract_subvector sequences to remove extra zeroing. (Craig Topper, 2018-02-10, 1 file, -6/+25)
  llvm-svn: 324791
* Make LLVM timer reprintable: that is, make more than one print action on the same timer feasible (George Karpenkov, 2018-02-10, 1 file, -2/+10)
  Currently, each LLVM timer can be only printed once, as the act of printing clears the timer. Moreover, the current printing mechanism implicitly assumes that the timer is stopped -- and prints zero otherwise.
  This patch relaxes this assumption and makes printing statistics multiple times a possibility.
  Differential Revision: https://reviews.llvm.org/D43136
  llvm-svn: 324788
* [LV] Fix analyzeInterleaving when -pass-remarks enabled (Mircea Trofin, 2018-02-10, 1 file, -1/+6)
  Summary:
  If -pass-remarks=loop-vectorize, atomic ops will be seen by analyzeInterleaving(), even though canVectorizeMemory() == false. This is because we are requesting extra analysis instead of bailing out.
  In such a case, we end up with a Group in both Load- and StoreGroups, and then we'll try to access freed memory when traversing LoadGroups after having released the Group when iterating over StoreGroups.
  The fix is to include mayWriteToMemory() when validating that two instructions are the same kind of memory operation.
  Reviewers: mssimpso, davidxl
  Reviewed By: davidxl
  Subscribers: hsaito, fhahn, llvm-commits
  Differential Revision: https://reviews.llvm.org/D43064
  llvm-svn: 324786
* [Hexagon] Update uses of deprecated IRBuilder CreateMemCpy/Move calls (Daniel Neilson, 2018-02-09, 1 file, -5/+6)
  Summary:
  This change is part of step five in the series of changes to remove the alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the Hexagon LoopIdiom pass to cease using the old IRBuilder CreateMemCpy/CreateMemMove single-alignment APIs in favour of the new API that allows setting source and destination alignments independently.
  Steps:
  Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. (rL322965, rC322964, rL322963)
  Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. (rL323597)
  Step 3) Update Clang to use the new IRBuilder API. (rC323617)
  Step 4) Update Polly to use the new IRBuilder API. (rL323618)
  Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment() and [get|set]SourceAlignment() instead. (rL323886, rL323891, rL324148, rL324273, rL324278, rL324384, rL324395, rL324402, rL324626, rL324642, rL324653, rL324654, rL324773, rL324774, rL324781)
  Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get|set]Alignment() methods.
  Reference:
  http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
  http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html
  llvm-svn: 324784
* [X86] Teach lower1BitVectorShuffle to recognize shuffles that are just filling upper elements with zero. Replace with insert_subvector. (Craig Topper, 2018-02-09, 1 file, -1/+30)
  There are still some extra kshifts in one of the modified test cases here, but hopefully that's only a DAG combine away.
  llvm-svn: 324782
* [ARMFastISel] Replace deprecated calls to MemoryIntrinsic::getAlignment() (NFCI) (Daniel Neilson, 2018-02-09, 1 file, -1/+2)
  This change is part of step five in the series of changes to remove the alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes ARMFastISel to cease using the old getAlignment() API of MemoryIntrinsic in favour of getting source & dest specific alignments through the new API.
  Steps:
  Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. (rL322965, rC322964, rL322963)
  Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. (rL323597)
  Step 3) Update Clang to use the new IRBuilder API. (rC323617)
  Step 4) Update Polly to use the new IRBuilder API. (rL323618)
  Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment() and [get|set]SourceAlignment() instead. (rL323886, rL323891, rL324148, rL324273, rL324278, rL324384, rL324395, rL324402, rL324626, rL324642, rL324653, rL324654, rL324773, rL324774)
  Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get|set]Alignment() methods.
  Reference:
  http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
  http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html
  llvm-svn: 324781
* [WebAssembly] Add mechanisms for specifying an explicit import module name. (Dan Gohman, 2018-02-09, 3 files, -2/+29)
  This adds a wasm-import-module function attribute and a .import_module assembler directive, for specifying module import names for WebAssembly. Currently these may only be used for function symbols; global variables may be considered in the future.
  WebAssembly has a two-level namespace scheme for symbols, and it's normally the linker's job to assign the module name, which is the first-level name. The attributes here allow users to specify their own module names explicitly, which is useful for tools generating bindings to modules defined in other languages.
  This feature is not fully usable yet. It will evolve along with the ongoing symbol table and lld changes.
  Differential Revision: https://reviews.llvm.org/D42520
  llvm-svn: 324778
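  A hypothetical use of the new attribute in IR (the function and module names are invented for illustration):

      ; import @log from the module "console" instead of the linker default
      declare void @log(i32) #0
      attributes #0 = { "wasm-import-module"="console" }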
* [WebAssembly] Add an LLVM_FALLTHROUGH to address a warning. NFC. (Dan Gohman, 2018-02-09, 1 file, -0/+1)
  llvm-svn: 324777
* [AMDGPUPromoteAlloca] Replace deprecated memory intrinsic APIs (NFCI) (Daniel Neilson, 2018-02-09, 1 file, -7/+7)
  Summary:
  This change is part of step five in the series of changes to remove the alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the AMDGPUPromoteAlloca pass to cease using:
  1) The old getAlignment() API of MemoryIntrinsic in favour of getting source & dest specific alignments through the new API.
  2) The old IRBuilder CreateMemCpy/CreateMemMove single-alignment APIs in favour of the new API that allows setting source and destination alignments independently.
  Steps:
  Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. (rL322965, rC322964, rL322963)
  Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. (rL323597)
  Step 3) Update Clang to use the new IRBuilder API. (rC323617)
  Step 4) Update Polly to use the new IRBuilder API. (rL323618)
  Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment() and [get|set]SourceAlignment() instead. (rL323886, rL323891, rL324148, rL324273, rL324278, rL324384, rL324395, rL324402, rL324626, rL324642, rL324653, rL324654, rL324773)
  Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get|set]Alignment() methods.
  Reference:
  http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
  http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html
  llvm-svn: 324774
* [AArch64FastISel] Replace deprecated calls to MemoryIntrinsic::getAlignment() (NFCI) (Daniel Neilson, 2018-02-09, 1 file, -1/+2)
  Summary:
  This change is part of step five in the series of changes to remove the alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes AArch64FastISel to cease using the old getAlignment() API of MemoryIntrinsic in favour of getting source & dest specific alignments through the new API.
  Steps:
  Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. (rL322965, rC322964, rL322963)
  Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. (rL323597)
  Step 3) Update Clang to use the new IRBuilder API. (rC323617)
  Step 4) Update Polly to use the new IRBuilder API. (rL323618)
  Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment() and [get|set]SourceAlignment() instead. (rL323886, rL323891, rL324148, rL324273, rL324278, rL324384, rL324395, rL324402, rL324626, rL324642, rL324653, rL324654)
  Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get|set]Alignment() methods.
  Reference:
  http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
  http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html
  llvm-svn: 324773
* [X86][MC] Fix assembling rip-relative addressing + immediate displacements (Francis Visoiu Mistrih, 2018-02-09, 1 file, -3/+7)
  In the rare case where the input contains rip-relative addressing with immediate displacements, *and* the instruction ends with an immediate, we encode the instruction in the wrong way:

      movl $12345678, 0x400(%rdi)  // all good, no rip-relative addr
      movl %eax, 0x400(%rip)       // all good, no immediate at the end of the instruction
      movl $12345678, 0x400(%rip)  // fails, encodes address as 0x3fc(%rip)

  Offset is a label:

      movl $12345678, foo(%rip)

  we want to account for the size of the immediate (in this case, $12345678, 4 bytes).

  Offset is an immediate:

      movl $12345678, 0x400(%rip)

  we should not account for the size of the immediate, assuming the immediate offset is what the user wanted.
  Differential Revision: https://reviews.llvm.org/D43050
  llvm-svn: 324772
* [WebAssembly] Report undefined symbols correctly in objdump (Sam Clegg, 2018-02-09, 1 file, -8/+13)
  Previously we were reporting undefined symbols as being defined by the IMPORT sections. This change reports undefined symbols in the same way that other formats do, and also removes the need to store the section with each symbol (since it can be derived from the symbol type).
  Differential Revision: https://reviews.llvm.org/D43101
  llvm-svn: 324770
* [CodeGen] Print predecessors as MIR comments in -debug output (Francis Visoiu Mistrih, 2018-02-09, 1 file, -3/+7)
  Make -debug MBB headers more copy-pastable into mir files.
  llvm-svn: 324769
* [AArch64] Adjust the cost model for Exynos M3 (Evandro Menezes, 2018-02-09, 1 file, -2/+6)
  Fix the modeling of transfers between a generic register and a partial ASIMD one.
  llvm-svn: 324766
* [Utils] Salvage debug info from dead 'or' instructions (Vedant Kumar, 2018-02-09, 3 files, -7/+32)
  Extend salvageDebugInfo to preserve the debug info from a dead 'or' with a constant.
  Patch by Ismail Badawi!
  Differential Revision: https://reviews.llvm.org/D43129
  llvm-svn: 324764
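  A sketch of the case this covers (variable metadata abbreviated):

      %p = or i32 %x, 255
      call void @llvm.dbg.value(metadata i32 %p, metadata !var, metadata !DIExpression())
      ; when %p becomes dead and is erased, the location can be rewritten in
      ; terms of %x:
      ;   !DIExpression(DW_OP_constu, 255, DW_OP_or, DW_OP_stack_value)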
* [Hexagon] Add code to select QTRUE and QFALSE (Krzysztof Parzyszek, 2018-02-09, 3 files, -0/+29)
  Fixes http://llvm.org/PR36320.
  llvm-svn: 324763
* [tablegen] Fixed a few !foreach evaluation issues. (Artem Belevich, 2018-02-09, 1 file, -41/+26)
  * !foreach on lists didn't evaluate operands of the RHS operator. This made nested operators silently fail.
  * A typo in the code could result in a wrong value substituted for an operation, which produced a false "!foreach requires an operator" error.
  * Keep recursion over the DAG within ForeachHelper. This simplifies things a bit, as we no longer need to pass the Type around in order to prevent recursion.
  Differential Revision: https://reviews.llvm.org/D43083
  llvm-svn: 324758
* [ThinLTO] Teach ThinLTO about auto hide symbols (Steven Wu, 2018-02-09, 1 file, -0/+7)
  Summary:
  Symbols that have linkonce_odr linkage and unnamed_addr can be auto-hidden by the linker to avoid weak external symbols. Teach ThinLTO to perform auto-hide so it can safely promote linkonce_odr to weak symbols without breaking this nice property.
  Reviewers: tejohnson, mehdi_amini
  Reviewed By: tejohnson
  Subscribers: inglorion, eraman, rnk, pcc, llvm-commits
  Differential Revision: https://reviews.llvm.org/D43130
  llvm-svn: 324757
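  A minimal example of an eligible symbol in IR:

      ; ODR-equivalent across translation units, address not significant
      define linkonce_odr i32 @f() unnamed_addr {
        ret i32 1
      }
      ; ThinLTO can now promote this to a weak definition while telling the
      ; linker it may still be hidden, instead of exporting a weak external
      ; symbol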
* AMDGPU: Remove tied operand from si_else (Matt Arsenault, 2018-02-09, 1 file, -1/+0)
  llvm-svn: 324751
* Emit smaller exception tables for non-SJLJ mode. (Rafael Espindola, 2018-02-09, 1 file, -8/+11)
  * Use uleb128 for code offsets in the LSDA call site table.
  * Omit the TTBase offset if the type table is empty.
  This change can reduce the size of the DWARF/Itanium LSDA by about half.
  Patch by Ryan Prichard!
  llvm-svn: 324750
* Use assembler expressions to lay out the EH LSDA. (Rafael Espindola, 2018-02-09, 8 files, -123/+89)
  Rely on the assembler to finalize the layout of the DWARF/Itanium exception-handling LSDA. Rather than calculate the exact size of each thing in the LSDA, use assembler directives:

  To emit the offset to the TTBase label:

      .uleb128 .Lttbase0-.Lttbaseref0
      .Lttbaseref0:

  To emit the size of the call site table:

      .uleb128 .Lcst_end0-.Lcst_begin0
      .Lcst_begin0:
      ... call site table entries ...
      .Lcst_end0:

  To align the type info table:

      ... action table ...
      .balign 4
      .long _ZTIi
      .long _ZTIl
      .Lttbase0:

  Using assembler directives simplifies the compiler and allows switching the encoding of offsets in the call site table from udata4 to uleb128 for a large code size savings. (This commit does not change the encoding.)
  The combination of the uleb128 followed by a balign creates an unfortunate dependency cycle that the assembler must sometimes resolve either by padding an LEB or by inserting zero padding before the type table. See PR35809 or GNU as bug 4029.
  Patch by Ryan Prichard!
  llvm-svn: 324749
* Reapply "AMDGPU: Add 32-bit constant address space"Matt Arsenault2018-02-0913-19/+86
| | | | | | This reverts r324494 and reapplies r324487. llvm-svn: 324747
* AMDGPU: Fix layering issue (Matt Arsenault, 2018-02-09, 6 files, -21/+22)
  Move utility function that depends on codegen. Fixes build with r324487 reapplied.
  llvm-svn: 324746
* [AArch64] Refactor stand-alone methods (NFC) (Evandro Menezes, 2018-02-09, 2 files, -142/+144)
  Make stand-alone methods in AArch64InstrInfo static.
  llvm-svn: 324745
* [Hexagon] Express calling conventions via .td file instead of hand-coding (Krzysztof Parzyszek, 2018-02-09, 4 files, -531/+255)
  Additionally, simplify the rest of the argument/parameter lowering code.
  llvm-svn: 324737
* [DebugInfo] Don't insert DEBUG_VALUE after terminators (Stefan Maksimovic, 2018-02-09, 1 file, -1/+1)
  r314974 introduced insertion of DEBUG_VALUEs after each redefinition of the debug value register in the slot index range. In case the instruction redefining the debug value register was a terminator, the machine verifier would complain, since it enforces the rule of no non-terminator instructions following the first terminator.
  Differential Revision: https://reviews.llvm.org/D42801
  llvm-svn: 324734
* [SelectionDAG] Provide adequate register class for RegisterSDNode (Stefan Maksimovic, 2018-02-09, 1 file, -1/+16)
  When adding operands to machine instructions in case of RegisterSDNodes, generate a COPY node in case the register class does not match the one in the instruction definition.
  Differential Revision: https://reviews.llvm.org/D35561
  llvm-svn: 324733
* [ELF] Print the .type assembly directive correctly for STT_NOTYPE (Oliver Stannard, 2018-02-09, 1 file, -1/+1)
  The llvm assembly parser and gas both accept "@notype" in the .type assembly directive, but we were printing it as "@no_type", which isn't accepted by either assembler.
  Differential revision: https://reviews.llvm.org/D43116
  llvm-svn: 324731