summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
* DebugInfo: Add metadata support for disabling DWARF pub sectionsDavid Blaikie2018-08-165-38/+68
| | | | | | | | | | | | | | | | | | | | | | | In cases where the debugger load time is a worthwhile tradeoff (or less costly - such as loading from a DWP instead of a variety of DWOs (possibly over a high-latency/distributed filesystem)) against object file size, it can be reasonable to disable pubnames and corresponding gdb-index creation in the linker. A backend-flag version of this was implemented for NVPTX in D44385/r327994 - which was fine for NVPTX which wouldn't mix-and-match CUs. Now that it's going to be a user-facing option (likely powered by "-gno-pubnames", the same as GCC) it should be encoded in the DICompileUnit so it can vary per-CU. After this, likely the NVPTX support should be migrated to the metadata & the previous flag implementation should be removed. Reviewers: aprantl Differential Revision: https://reviews.llvm.org/D50213 llvm-svn: 339939
* [MachineVerifier] Check if predecessor is jointly dominated by undefsKrzysztof Parzyszek2018-08-161-1/+11
| | | | | | | | Each use of a value should be jointly dominated by the union of defs and undefs. It can happen that it will only be jointly dominated by undefs, and that is still legal. Make sure that the verifier is aware of that. llvm-svn: 339924
* [SelectionDAG] Improve the legalisation lowering of UMULO.Eli Friedman2018-08-161-17/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is no way in the universe, that doing a full-width division in software will be faster than doing overflowing multiplication in software in the first place, especially given that this same full-width multiplication needs to be done anyway. This patch replaces the previous implementation with a direct lowering into an overflowing multiplication algorithm based on half-width operations. Correctness of the algorithm was verified by exhaustively checking the output of this algorithm for overflowing multiplication of 16 bit integers against an obviously correct widening multiplication. Baring any oversights introduced by porting the algorithm to DAG, confidence in correctness of this algorithm is extremely high. Following table shows the change in both t = runtime and s = space. The change is expressed as a multiplier of original, so anything under 1 is “better” and anything above 1 is worse. +-------+-----------+-----------+-------------+-------------+ | Arch | u64*u64 t | u64*u64 s | u128*u128 t | u128*u128 s | +-------+-----------+-----------+-------------+-------------+ | X64 | - | - | ~0.5 | ~0.64 | | i686 | ~0.5 | ~0.6666 | ~0.05 | ~0.9 | | armv7 | - | ~0.75 | - | ~1.4 | +-------+-----------+-----------+-------------+-------------+ Performance numbers have been collected by running overflowing multiplication in a loop under `perf` on two x86_64 (one Intel Haswell, other AMD Ryzen) based machines. Size numbers have been collected by looking at the size of function containing an overflowing multiply in a loop. All in all, it can be seen that both performance and size has improved except in the case of armv7 where code size has regressed for 128-bit multiply. u128*u128 overflowing multiply on 32-bit platforms seem to benefit from this change a lot, taking only 5% of the time compared to original algorithm to calculate the same thing. The final benefit of this change is that LLVM is now capable of lowering the overflowing unsigned multiply for integers of any bit-width as long as the target is capable of lowering regular multiplication for the same bit-width. Previously, 128-bit overflowing multiply was the widest possible. Patch by Simonas Kazlauskas! Differential Revision: https://reviews.llvm.org/D50310 llvm-svn: 339922
* [RegisterCoalescer] Shrink to uses if needed after removeCopyByCommutingDefKrzysztof Parzyszek2018-08-161-24/+54
| | | | llvm-svn: 339912
* [TargetLowering] Add support for non-uniform vectors to BuildSDIVSimon Pilgrim2018-08-161-10/+24
| | | | | | | | | | This patch refactors the existing TargetLowering::BuildSDIV base implementation to support non-uniform constant vector denominators. This is the last patch necessary to close PR36545 Differential Revision: https://reviews.llvm.org/D50765 llvm-svn: 339908
* [TargetLowering] Refactor BuildSDIV in preparation for D50765. NFCI.Simon Pilgrim2018-08-161-24/+36
| | | | | | Pull out magic factor calculators into a helper function, use 0/+1/-1 multiplication factor to (optionally) add/sub the numerator. llvm-svn: 339898
* [CodeGenPrepare] Add BothExtension type to PromotedInstsGuozhi Wei2018-08-151-7/+49
| | | | | | | | | | | | This patch fixes PR38125. Instruction extension types are recorded in PromotedInsts, it can be used later in function canGetThrough. If an instruction has two users with different extension types, it will be inserted into PromotedInsts two times in function promoteOperandForOther. The second one overwrites the first one, and the final extension type is wrong, later causes problem in canGetThrough. This patch changes the simple bool extension type to 2-bit enum type, add a BothExtension type in addition to zero/sign extension. When an user sees BothExtension for an instruction, it actually knows nothing about how that instruction is extended. Differential Revision: https://reviews.llvm.org/D49512 llvm-svn: 339822
* DAG: Use getObjectOffset helperMatt Arsenault2018-08-151-4/+1
| | | | llvm-svn: 339813
* DAG: Try to custom lower when promoting float operandsMatt Arsenault2018-08-151-0/+5
| | | | | | | For some reason this wasn't done for floats like integers. llvm-svn: 339811
* [RegisterCoalescer] Ensure that both registers have subranges if one doesKrzysztof Parzyszek2018-08-151-1/+4
| | | | llvm-svn: 339792
* [RegisterCoalescer] Reset VNInfo def when copying segments overKrzysztof Parzyszek2018-08-151-2/+6
| | | | llvm-svn: 339788
* [RegAlloc] Check that subreg liveness tracking applies to given virtual regKrzysztof Parzyszek2018-08-151-1/+1
| | | | | | | | Subregister liveness applies selectively to register classes with certain properties. Make sure that when it's enabled, it applies to a given virtual register (in virtual register rewriter). llvm-svn: 339784
* [TargetLowering] Minor cleanup of TargetLowering::BuildSDIV. NFCI.Simon Pilgrim2018-08-151-21/+20
| | | | | | Pull out some types to match layout in TargetLowering::BuildUDIV. Early step towards adding non-uniform vector support. llvm-svn: 339763
* [TargetLowering] Minor refactor to TargetLowering::BuildUDIV to merge ↵Simon Pilgrim2018-08-151-41/+31
| | | | | | | | scalar/vector magic value collection. NFCI. Use the same ISD::matchUnaryPredicate pattern that was used in D50392. llvm-svn: 339758
* [DagCombiner] Don't bother adding to the work list if TLI.BuildSDIVPow2 ↵Simon Pilgrim2018-08-151-4/+6
| | | | | | | | failed. NFCI. Matches the code in BuildSDIV/BuildUDIV llvm-svn: 339757
* [TargetLowering] Add support for non-uniform vectors to BuildExactSDIVSimon Pilgrim2018-08-151-12/+24
| | | | | | | | This patch refactors the existing BuildExactSDIV implementation to support non-uniform constant vector denominators. Differential Revision: https://reviews.llvm.org/D50392 llvm-svn: 339756
* [SDAG] Remove the reliance on MI's allocation strategy forChandler Carruth2018-08-146-40/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | `MachineMemOperand` pointers attached to `MachineSDNodes` and instead have the `SelectionDAG` fully manage the memory for this array. Prior to this change, the memory management was deeply confusing here -- The way the MI was built relied on the `SelectionDAG` allocating memory for these arrays of pointers using the `MachineFunction`'s allocator so that the raw pointer to the array could be blindly copied into an eventual `MachineInstr`. This creates a hard coupling between how `MachineInstr`s allocate their array of `MachineMemOperand` pointers and how the `MachineSDNode` does. This change is motivated in large part by a change I am making to how `MachineFunction` allocates these pointers, but it seems like a layering improvement as well. This would run the risk of increasing allocations overall, but I've implemented an optimization that should avoid that by storing a single `MachineMemOperand` pointer directly instead of allocating anything. This is expected to be a net win because the vast majority of uses of these only need a single pointer. As a side-effect, this makes the API for updating a `MachineSDNode` and a `MachineInstr` reasonably different which seems nice to avoid unexpected coupling of these two layers. We can map between them, but we shouldn't be *surprised* at where that occurs. =] Differential Revision: https://reviews.llvm.org/D50680 llvm-svn: 339740
* [FPEnv] Scalarize StrictFP vector operationsCameron McInally2018-08-142-0/+50
| | | | | | | | Add a helper function to scalarize constrained FP operations as needed. Differential Revision: https://reviews.llvm.org/D50720 llvm-svn: 339735
* [ARM] Make PerformSHLSimplify add nodes to the DAG worklist correctly.Eli Friedman2018-08-141-2/+3
| | | | | | | | | | | | | | | | | | | | | Intentionally excluding nodes from the DAGCombine worklist is likely to lead to weird optimizations and infinite loops, so it's generally a bad idea. To avoid the infinite loops, fix DAGCombine to use the isDesirableToCommuteWithShift target hook before performing the transforms in question, and implement the target hook in the ARM backend disable the transforms in question. Fixes https://bugs.llvm.org/show_bug.cgi?id=38530 . (I don't have a reduced testcase for that bug. But we should have sufficient test coverage for PerformSHLSimplify given that we're not playing weird tricks with the worklist. I can try to bugpoint it if necessary, though.) Differential Revision: https://reviews.llvm.org/D50667 llvm-svn: 339734
* [DebugInfoMetadata] Added DIFlags interface in DIBasicType.Adrian Prantl2018-08-141-0/+5
| | | | | | | | | | | Flags in DIBasicType will be used to pass attributes used in DW_TAG_base_type, such as DW_AT_endianity. Patch by Chirag Patel! Differential Revision: https://reviews.llvm.org/D49610 llvm-svn: 339714
* Revert "[DebugInfo] Generate DWARF debug information for labels. (Fix leak ↵Bruno Cardoso Lopes2018-08-1416-407/+149
| | | | | | | | | | | | problems)" This reverts commit cb8c5e417d55141f3f079a8a876e786f44308336 / r339676. This causing a test to fail in http://green.lab.llvm.org/green/job/clang-stage1-configure-RA/48406/ LLVM :: DebugInfo/Generic/debug-label.ll llvm-svn: 339700
* [DAG] Avoid redundant chain transversal in store merge cycle check. NFCI.Nirav Dave2018-08-141-1/+2
| | | | | | Patch by Henric Karlsson. llvm-svn: 339688
* [DebugInfo] Generate DWARF debug information for labels. (Fix leak problems)Hsiangkai Wang2018-08-1416-149/+407
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are two forms for label debug information in DWARF format. 1. Labels in a non-inlined function: DW_TAG_label DW_AT_name DW_AT_decl_file DW_AT_decl_line DW_AT_low_pc 2. Labels in an inlined function: DW_TAG_label DW_AT_abstract_origin DW_AT_low_pc We will collect label information from DBG_LABEL. Before every DBG_LABEL, we will generate a temporary symbol to denote the location of the label. The symbol could be used to get DW_AT_low_pc afterwards. So, we create a mapping between 'inlined label' and DBG_LABEL MachineInstr in DebugHandlerBase. The DBG_LABEL in the mapping is used to query the symbol before it. The AbstractLabels in DwarfCompileUnit is used to process labels in inlined functions. We also keep a mapping between scope and labels in DwarfFile to help to generate correct tree structure of DIEs. It also generates label debug information under global isel. Differential Revision: https://reviews.llvm.org/D45556 llvm-svn: 339676
* [GlobalISel][IRTranslator] Fix a bug in handling repeating struct types ↵Amara Emerson2018-08-141-0/+2
| | | | | | | | during argument lowering. Differential Revision: https://reviews.llvm.org/D49442 llvm-svn: 339674
* [CodeGen] Fix assert in SelectionDAG::computeKnownBitsScott Linder2018-08-131-2/+2
| | | | | | | | | | Fix SelectionDAG::computeKnownBits asserting when handling EXTRACT_SUBVECTOR when zero extending the demanded elements mask if it is already as long as the source vector. Differential Revision: https://reviews.llvm.org/D49574 llvm-svn: 339600
* [DAGCombiner] simplifyDivRem - add comment describing divide by undef/zero ↵Simon Pilgrim2018-08-131-0/+5
| | | | | | combine. NFC. llvm-svn: 339561
* [CGP] Fix GEP issue with out of range APInt constant values not fitting in ↵Simon Pilgrim2018-08-131-2/+7
| | | | | | | | int64_t Test case reduced from https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=7173 llvm-svn: 339556
* [SelectionDAG] In PromoteFloatOp_BITCAST, insert a bitcast after the ↵Craig Topper2018-08-131-8/+11
| | | | | | | | fp_to_fp16 in case the result type isn't a scalar integer. This is another variation of PR38533. In this case, the result type of the bitcast is legal and 16-bits wide, but not a scalar integer. So we need to emit the convert to i16 and then bitcast it to the true result type. This new bitcast will be further type legalized if necessary. llvm-svn: 339536
* [SelectionDAG] In PromoteIntRes_BITCAST, when the input is TypePromoteFloat, ↵Craig Topper2018-08-131-2/+2
| | | | | | | | | | make sure the output type is scalar. For vectors, use a store and load of temporary. Previously if the result type was a vector, we emitted a FP_TO_FP16 with a vector result type which isn't valid. This is basically the opposite case of the root cause of PR38533. llvm-svn: 339535
* Restore correct x86_64 EH encodings in kernel code modelLei Liu2018-08-131-9/+14
| | | | | | | | | | | | | Fixes PR37524. The exception handling encodings for x86_64 in kernel code model has been changed with r309884. Restore it to correct ones. These encodings include PersonalityEncoding, LSDAEncoding and TTypeEncoding. Differential Revision: https://reviews.llvm.org/D50490 llvm-svn: 339534
* [SelectionDAG] In PromoteFloatRes_BITCAST, insert a bitcast before the ↵Craig Topper2018-08-131-2/+4
| | | | | | | | | | fp16_to_fp in case the input type isn't an i16. The bitcast can be further legalized as needed. Fixes PR38533. llvm-svn: 339533
* DAG: Check no-signed-zeros instead of unsafe-fp-mathMatt Arsenault2018-08-121-3/+3
| | | | | | | Addresses fixme, although this should still be checking individual operand flags. llvm-svn: 339525
* [TargetLowering] Simplify one of the special cases in SimplifyDemandedBits ↵Craig Topper2018-08-121-21/+21
| | | | | | | | for XOR. NFCI We were checking for all bits being Known by checking Known.Zero|Known.One, but if all the bits are known then the value should be a Constant and we can just check for that instead. llvm-svn: 339509
* [TargetLowering] Use APInt::isSubsetOf to simplify some code. NFCCraig Topper2018-08-121-1/+1
| | | | llvm-svn: 339508
* Rename the cfguard module flag to cfguardtableHans Wennborg2018-08-101-1/+1
| | | | | | | | | | The previous name sounds like it inserts cfguard implementation, but it really just emits the table of address-taken functions. Change the name to better reflect that. Clang will be updated in the next commit. llvm-svn: 339419
* [MC] Move EH DWARF encodings from MC to CodeGen, NFCReid Kleckner2018-08-091-0/+156
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The TType encoding, LSDA encoding, and personality encoding are all passed explicitly by CodeGen to the assembler through .cfi_* directives, so only the AsmPrinter needs to know about them. The FDE CFI encoding however, controls the encoding of the label implicitly created by the .cfi_startproc directive. That directive seems to be special in that it doesn't take an encoding, so the assembler just has to know how to encode one DSO-local label reference from .eh_frame to .text. As a result, it looks like MC will continue to have to know when the large code model is in use. Perhaps we could invent a '.cfi_startproc [large]' flag so that this knowledge doesn't need to pollute the assembler. Reviewers: davide, lliu0, JDevlieghere Subscribers: hiraditya, fedor.sergeev, llvm-commits Differential Revision: https://reviews.llvm.org/D50533 llvm-svn: 339397
* [SelectionDAG] try harder to convert funnel shift to rotateSanjay Patel2018-08-091-3/+10
| | | | | | | | | | | | Similar to rL337966 - if the DAGCombiner's rotate matching was working as expected, I don't think we'd see any test diffs here. AArch only goes right, and PPC only goes left. x86 has both, so no diffs there. Differential Revision: https://reviews.llvm.org/D50091 llvm-svn: 339359
* extend folding fsub/fadd to fneg for FMFMichael Berg2018-08-091-8/+10
| | | | | | | | | | | | | | Summary: This change provides a common optimization path for both Unsafe and FMF driven optimization for this fsub fold adding reassociation, as it the flag that most closely represents the translation Reviewers: spatel, wristow, arsenm Reviewed By: spatel Subscribers: wdng Differential Revision: https://reviews.llvm.org/D50195 llvm-svn: 339357
* [MC] Remove PhysRegSize from MCRegisterClassBjorn Pettersson2018-08-091-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: The interface to get size and spill size of a register was moved from MCRegisterInfo to TargetRegisterInfo over a year ago. Afaik the old interface has bee around to give out-of-tree targets a chance to adapt to the new interface. One problem with the old MCRegisterClass::PhysRegSize was that it represented the size of a register as "size in bits" / 8. So a register had to be a multiple of eight bits wide for the size to be correct (and the byte size for the target needed to be eight bits). Reviewers: kparzysz, qcolombet Reviewed By: kparzysz Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D47199 llvm-svn: 339350
* [TargetLowering] Add BuildSDIVPattern helper to BuildExactSDIV (NFCI).Simon Pilgrim2018-08-091-14/+23
| | | | | | As requested in D50392, pull the magic constant calculations out into a helper function. llvm-svn: 339346
* [DAGCombiner] loosen constraints for fsub+fadd foldSanjay Patel2018-08-081-14/+7
| | | | | | | | | isNegatibleForFree() should not matter here (as the test diffs show) because it's always a win to replace an fsub+fadd with fneg. The problem in D50195 persists because either (1) we are doing these folds in the wrong order or (2) we're missing another fold for fadd. llvm-svn: 339299
* [DAGCombiner] move fadd simplification ahead of other foldsSanjay Patel2018-08-081-9/+6
| | | | | | | | | I don't know if it's possible to expose this diff in a test, but we should always try simplifications (no new nodes created) before more complicated transforms for efficiency (similar to what we do in IR). llvm-svn: 339298
* revert '[CodeGen] emit inline asm clobber list warnings for reserved'Ties Stuij2018-08-081-78/+32
| | | | llvm-svn: 339274
* [DebugInfo] Fine tune emitting flags as part of the producerJonas Devlieghere2018-08-081-1/+1
| | | | | | | | | When using APPLE extensions, don't duplicate the compiler invocation's flags both in AT_producer and AT_APPLE_flags. Differential revision: https://reviews.llvm.org/D50453 llvm-svn: 339268
* [DAG] DAGCombiner::visitSDIVLike - remove unnecessary isConstOrConstSplat ↵Simon Pilgrim2018-08-081-4/+1
| | | | | | | | call. NFCI. The isConstOrConstSplat result is only used in a ISD::matchUnaryPredicate call which can perform the equivalent iteration just as quickly. llvm-svn: 339262
* [CodeGen] emit inline asm clobber list warnings for reservedTies Stuij2018-08-081-32/+78
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Currently, in line with GCC, when specifying reserved registers like sp or pc on an inline asm() clobber list, we don't always preserve the original value across the statement. And in general, overwriting reserved registers can have surprising results. For example: ``` extern int bar(int[]); int foo(int i) { int a[i]; // VLA asm volatile( "mov r7, #1" : : : "r7" ); return 1 + bar(a); } ``` Compiled for thumb, this gives: ``` $ clang --target=arm-arm-none-eabi -march=armv7a -c test.c -o - -S -O1 -mthumb ... foo: .fnstart @ %bb.0: @ %entry .save {r4, r5, r6, r7, lr} push {r4, r5, r6, r7, lr} .setfp r7, sp, #12 add r7, sp, #12 .pad #4 sub sp, #4 movs r1, #7 add.w r0, r1, r0, lsl #2 bic r0, r0, #7 sub.w r0, sp, r0 mov sp, r0 @APP mov.w r7, #1 @NO_APP bl bar adds r0, #1 sub.w r4, r7, #12 mov sp, r4 pop {r4, r5, r6, r7, pc} ... ``` r7 is used as the frame pointer for thumb targets, and this function needs to restore the SP from the FP because of the variable-length stack allocation a. r7 is clobbered by the inline assembly (and r7 is included in the clobber list), but LLVM does not preserve the value of the frame pointer across the assembly block. This type of behavior is similar to GCC's and has been discussed on the bugtracker: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=11807 . No consensus seemed to have been reached on the way forward. Clang behavior has briefly been discussed on the CFE mailing (starting here: http://lists.llvm.org/pipermail/cfe-dev/2018-July/058392.html). I've opted for following Eli Friedman's advice to print warnings when there are reserved registers on the clobber list so as not to diverge from GCC behavior for now. The patch uses MachineRegisterInfo's target-specific knowledge of reserved registers, just before we convert the inline asm string in the AsmPrinter. If we find a reserved register, we print a warning: ``` repro.c:6:7: warning: inline asm clobber list contains reserved registers: R7 [-Winline-asm] "mov r7, #1" ^ ``` Reviewers: eli.friedman, olista01, javed.absar, efriedma Reviewed By: efriedma Subscribers: efriedma, eraman, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D49727 llvm-svn: 339257
* [TargetLowering] BuildUDIV - Add support for divide by one (PR38477)Simon Pilgrim2018-08-081-7/+8
| | | | | | | | Provide a pass-through of the numerator for divide by one cases - this is the same approach we take in DAGCombiner::visitSDIVLike. I investigated whether we could achieve this by magic MULHU/SRL values but nothing appeared to work as we don't have a way for MULHU(x,c) -> x llvm-svn: 339254
* [TargetLowering] Remove APInt divisor argument from BuildExactSDIV (NFCI).Simon Pilgrim2018-08-081-14/+22
| | | | | | | | As requested in D50392, this is a minor refactor to BuildExactSDIV to stop taking the uniform constant APInt divisor and instead extract it locally. I also cleanup the operands and valuetypes to better match BuildUDiv (and BuildSDIV in the near future). llvm-svn: 339246
* test commit accessTies Stuij2018-08-081-4/+4
| | | | | | | | | | Summary: changing a few typos Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D50445 llvm-svn: 339245
* [TargetLowering] BuildUDIV - Early out for divide by one (PR38477)Simon Pilgrim2018-08-081-0/+4
| | | | | | We're not handling the UDIV by one special case properly - for now just early out. llvm-svn: 339229
OpenPOWER on IntegriCloud