summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
...
* [SelectionDAG] In PromoteFloatRes_BITCAST, insert a bitcast before the ↵Craig Topper2018-08-131-2/+4
| | | | | | | | | | fp16_to_fp in case the input type isn't an i16. The bitcast can be further legalized as needed. Fixes PR38533. llvm-svn: 339533
* DAG: Check no-signed-zeros instead of unsafe-fp-mathMatt Arsenault2018-08-121-3/+3
| | | | | | | Addresses fixme, although this should still be checking individual operand flags. llvm-svn: 339525
* [TargetLowering] Simplify one of the special cases in SimplifyDemandedBits ↵Craig Topper2018-08-121-21/+21
| | | | | | | | for XOR. NFCI We were checking for all bits being Known by checking Known.Zero|Known.One, but if all the bits are known then the value should be a Constant and we can just check for that instead. llvm-svn: 339509
* [TargetLowering] Use APInt::isSubsetOf to simplify some code. NFCCraig Topper2018-08-121-1/+1
| | | | llvm-svn: 339508
* Rename the cfguard module flag to cfguardtableHans Wennborg2018-08-101-1/+1
| | | | | | | | | | The previous name sounds like it inserts cfguard implementation, but it really just emits the table of address-taken functions. Change the name to better reflect that. Clang will be updated in the next commit. llvm-svn: 339419
* [MC] Move EH DWARF encodings from MC to CodeGen, NFCReid Kleckner2018-08-091-0/+156
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The TType encoding, LSDA encoding, and personality encoding are all passed explicitly by CodeGen to the assembler through .cfi_* directives, so only the AsmPrinter needs to know about them. The FDE CFI encoding however, controls the encoding of the label implicitly created by the .cfi_startproc directive. That directive seems to be special in that it doesn't take an encoding, so the assembler just has to know how to encode one DSO-local label reference from .eh_frame to .text. As a result, it looks like MC will continue to have to know when the large code model is in use. Perhaps we could invent a '.cfi_startproc [large]' flag so that this knowledge doesn't need to pollute the assembler. Reviewers: davide, lliu0, JDevlieghere Subscribers: hiraditya, fedor.sergeev, llvm-commits Differential Revision: https://reviews.llvm.org/D50533 llvm-svn: 339397
* [SelectionDAG] try harder to convert funnel shift to rotateSanjay Patel2018-08-091-3/+10
| | | | | | | | | | | | Similar to rL337966 - if the DAGCombiner's rotate matching was working as expected, I don't think we'd see any test diffs here. AArch only goes right, and PPC only goes left. x86 has both, so no diffs there. Differential Revision: https://reviews.llvm.org/D50091 llvm-svn: 339359
* extend folding fsub/fadd to fneg for FMFMichael Berg2018-08-091-8/+10
| | | | | | | | | | | | | | Summary: This change provides a common optimization path for both Unsafe and FMF driven optimization for this fsub fold adding reassociation, as it the flag that most closely represents the translation Reviewers: spatel, wristow, arsenm Reviewed By: spatel Subscribers: wdng Differential Revision: https://reviews.llvm.org/D50195 llvm-svn: 339357
* [MC] Remove PhysRegSize from MCRegisterClassBjorn Pettersson2018-08-091-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: The interface to get size and spill size of a register was moved from MCRegisterInfo to TargetRegisterInfo over a year ago. Afaik the old interface has bee around to give out-of-tree targets a chance to adapt to the new interface. One problem with the old MCRegisterClass::PhysRegSize was that it represented the size of a register as "size in bits" / 8. So a register had to be a multiple of eight bits wide for the size to be correct (and the byte size for the target needed to be eight bits). Reviewers: kparzysz, qcolombet Reviewed By: kparzysz Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D47199 llvm-svn: 339350
* [TargetLowering] Add BuildSDIVPattern helper to BuildExactSDIV (NFCI).Simon Pilgrim2018-08-091-14/+23
| | | | | | As requested in D50392, pull the magic constant calculations out into a helper function. llvm-svn: 339346
* [DAGCombiner] loosen constraints for fsub+fadd foldSanjay Patel2018-08-081-14/+7
| | | | | | | | | isNegatibleForFree() should not matter here (as the test diffs show) because it's always a win to replace an fsub+fadd with fneg. The problem in D50195 persists because either (1) we are doing these folds in the wrong order or (2) we're missing another fold for fadd. llvm-svn: 339299
* [DAGCombiner] move fadd simplification ahead of other foldsSanjay Patel2018-08-081-9/+6
| | | | | | | | | I don't know if it's possible to expose this diff in a test, but we should always try simplifications (no new nodes created) before more complicated transforms for efficiency (similar to what we do in IR). llvm-svn: 339298
* revert '[CodeGen] emit inline asm clobber list warnings for reserved'Ties Stuij2018-08-081-78/+32
| | | | llvm-svn: 339274
* [DebugInfo] Fine tune emitting flags as part of the producerJonas Devlieghere2018-08-081-1/+1
| | | | | | | | | When using APPLE extensions, don't duplicate the compiler invocation's flags both in AT_producer and AT_APPLE_flags. Differential revision: https://reviews.llvm.org/D50453 llvm-svn: 339268
* [DAG] DAGCombiner::visitSDIVLike - remove unnecessary isConstOrConstSplat ↵Simon Pilgrim2018-08-081-4/+1
| | | | | | | | call. NFCI. The isConstOrConstSplat result is only used in a ISD::matchUnaryPredicate call which can perform the equivalent iteration just as quickly. llvm-svn: 339262
* [CodeGen] emit inline asm clobber list warnings for reservedTies Stuij2018-08-081-32/+78
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Currently, in line with GCC, when specifying reserved registers like sp or pc on an inline asm() clobber list, we don't always preserve the original value across the statement. And in general, overwriting reserved registers can have surprising results. For example: ``` extern int bar(int[]); int foo(int i) { int a[i]; // VLA asm volatile( "mov r7, #1" : : : "r7" ); return 1 + bar(a); } ``` Compiled for thumb, this gives: ``` $ clang --target=arm-arm-none-eabi -march=armv7a -c test.c -o - -S -O1 -mthumb ... foo: .fnstart @ %bb.0: @ %entry .save {r4, r5, r6, r7, lr} push {r4, r5, r6, r7, lr} .setfp r7, sp, #12 add r7, sp, #12 .pad #4 sub sp, #4 movs r1, #7 add.w r0, r1, r0, lsl #2 bic r0, r0, #7 sub.w r0, sp, r0 mov sp, r0 @APP mov.w r7, #1 @NO_APP bl bar adds r0, #1 sub.w r4, r7, #12 mov sp, r4 pop {r4, r5, r6, r7, pc} ... ``` r7 is used as the frame pointer for thumb targets, and this function needs to restore the SP from the FP because of the variable-length stack allocation a. r7 is clobbered by the inline assembly (and r7 is included in the clobber list), but LLVM does not preserve the value of the frame pointer across the assembly block. This type of behavior is similar to GCC's and has been discussed on the bugtracker: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=11807 . No consensus seemed to have been reached on the way forward. Clang behavior has briefly been discussed on the CFE mailing (starting here: http://lists.llvm.org/pipermail/cfe-dev/2018-July/058392.html). I've opted for following Eli Friedman's advice to print warnings when there are reserved registers on the clobber list so as not to diverge from GCC behavior for now. The patch uses MachineRegisterInfo's target-specific knowledge of reserved registers, just before we convert the inline asm string in the AsmPrinter. If we find a reserved register, we print a warning: ``` repro.c:6:7: warning: inline asm clobber list contains reserved registers: R7 [-Winline-asm] "mov r7, #1" ^ ``` Reviewers: eli.friedman, olista01, javed.absar, efriedma Reviewed By: efriedma Subscribers: efriedma, eraman, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D49727 llvm-svn: 339257
* [TargetLowering] BuildUDIV - Add support for divide by one (PR38477)Simon Pilgrim2018-08-081-7/+8
| | | | | | | | Provide a pass-through of the numerator for divide by one cases - this is the same approach we take in DAGCombiner::visitSDIVLike. I investigated whether we could achieve this by magic MULHU/SRL values but nothing appeared to work as we don't have a way for MULHU(x,c) -> x llvm-svn: 339254
* [TargetLowering] Remove APInt divisor argument from BuildExactSDIV (NFCI).Simon Pilgrim2018-08-081-14/+22
| | | | | | | | As requested in D50392, this is a minor refactor to BuildExactSDIV to stop taking the uniform constant APInt divisor and instead extract it locally. I also cleanup the operands and valuetypes to better match BuildUDiv (and BuildSDIV in the near future). llvm-svn: 339246
* test commit accessTies Stuij2018-08-081-4/+4
| | | | | | | | | | Summary: changing a few typos Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D50445 llvm-svn: 339245
* [TargetLowering] BuildUDIV - Early out for divide by one (PR38477)Simon Pilgrim2018-08-081-0/+4
| | | | | | We're not handling the UDIV by one special case properly - for now just early out. llvm-svn: 339229
* Support inline asm with multiple 64bit output in 32bit GPRThomas Preud'homme2018-08-081-16/+37
| | | | | | | | | | | | | | Summary: Extend fix for PR34170 to support inline assembly with multiple output operands that do not naturally go in the register class it is constrained to (eg. double in a 32-bit GPR as in the PR). Reviewers: bogner, t.p.northover, lattner, javed.absar, efriedma Reviewed By: efriedma Subscribers: efriedma, tra, eraman, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D45437 llvm-svn: 339225
* [SelectionDAG] When splitting scatter nodes during DAGCombine, create a ↵Craig Topper2018-08-071-12/+10
| | | | | | | | | | serial chain dependency. Scatter could have multiple identical indices. We need to maintain sequential order. We get this right in LegalizeVectorTypes, but not in this code. Differential Revision: https://reviews.llvm.org/D50374 llvm-svn: 339157
* [DAG] Allow non-uniform constant vectors to call BuildSDIVSimon Pilgrim2018-08-071-1/+2
| | | | | | | | This was missed in D50185. NFC until we add actual non-uniform support to BuildSDIV (similar BuildUDIV support in D49248) - for now it just early outs. llvm-svn: 339147
* [TargetLowering] Use pre-computed Shift value type in BuildUDIV (NFCI)Simon Pilgrim2018-08-071-9/+5
| | | | | | This was missed in D49248 llvm-svn: 339146
* Fix inconsistency with/without debug information (-g)Jonas Devlieghere2018-08-071-1/+1
| | | | | | | | | | | | | This fixes an inconsistency in code generation when compiling with or without debug information (-g). When debug information is available in an empty block, the original test would fail, resulting in possibly different code. Patch by: Jeroen Dobbelaere Differential revision: https://reviews.llvm.org/D49467 llvm-svn: 339129
* [DebugInfo] Reduce debug_str_offsets section sizePavel Labath2018-08-074-14/+56
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The accelerator tables use the debug_str section to store their strings. However, they do not support the indirect method of access that is available for the debug_info section (DW_FORM_strx et al.). Currently our code is assuming that all strings can/will be referenced indirectly, and puts all of them into the debug_str_offsets section. This is generally true for regular (unsplit) dwarf, but in the DWO case, most of the strings in the debug_str section will only be used from the accelerator tables. Therefore the contents of the debug_str_offsets section will be largely unused and bloating the main executable. This patch rectifies this by teaching the DwarfStringPool to differentiate between strings accessed directly and indirectly. When a user inserts a string into the pool it has to declare whether that string will be referenced directly or not. If at least one user requsts indirect access, that string will be assigned an index ID and put into debug_str_offsets table. Otherwise, the offset table is skipped. This approach reduces the overall binary size (when compiled with -gdwarf-5 -gsplit-dwarf) in my tests by about 2% (debug_str_offsets is shrunk by 99%). Reviewers: probinson, dblaikie, JDevlieghere Subscribers: aprantl, mgrang, llvm-commits Differential Revision: https://reviews.llvm.org/D49493 llvm-svn: 339122
* [TargetLowering] Add support for non-uniform vectors to BuildUDIVSimon Pilgrim2018-08-072-61/+136
| | | | | | | | | | This patch refactors the existing TargetLowering::BuildUDIV base implementation to support non-uniform constant vector denominators. It also includes a fold for MULHU by pow2 constants to SRL which can now more readily occur from BuildUDIV. Differential Revision: https://reviews.llvm.org/D49248 llvm-svn: 339121
* [SelectionDAG][X86] Rename MaskedLoadSDNode::getSrc0 to getPassThru.Craig Topper2018-08-073-19/+19
| | | | | | Src0 doesn't really convey any meaning to what the operand is. Passthru matches what's used in the documentation for the intrinsic this comes from. llvm-svn: 339101
* [SelectionDAG][X86] Rename getValue to getPassThru for gather SDNodes.Craig Topper2018-08-074-25/+26
| | | | | | getValue is more meaningful name for scatter than it is for gather. Split them and use getPassThru for gather. llvm-svn: 339096
* Fix a -Wsign-compareReid Kleckner2018-08-061-1/+1
| | | | llvm-svn: 339059
* [X86] Fix assertion in subreg extractionReid Kleckner2018-08-061-1/+1
| | | | | | | | | | | This assert fires when attempting to extract a subregister from the global PIC base register. This virtual register SD node is not in the VRBaseMap, so we shouldn't call getVR to look it up there. If this is a RegisterSDNode, we should be able to use the virtual register directly. Fixes PR38385 llvm-svn: 339056
* [RegisterCoalescer] Delay live interval update work until the rematerializationWei Mi2018-08-061-6/+57
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | for all the uses from the same def is done. We run into a compile time problem with flex generated code combined with `-fno-jump-tables`. The cause is that machineLICM hoists a lot of invariants outside of a big loop, and drastically increases the compile time in global register splitting and copy coalescing. https://reviews.llvm.org/D49353 relieves the problem in global splitting. This patch is to handle the problem in copy coalescing. About the situation where the problem in copy coalescing happens. After machineLICM, we have several defs outside of a big loop with hundreds or thousands of uses inside the loop. Rematerialization in copy coalescing happens for each use and everytime rematerialization is done, shrinkToUses will be called to update the huge live interval. Because we have 'n' uses for a def, and each live interval update will have at least 'n' complexity, the total update work is n^2. To fix the problem, we try to do the live interval update work in a collective way. If a def has many copylike uses larger than a threshold, each time rematerialization is done for one of those uses, we won't do the live interval update in time but delay that work until rematerialization for all those uses are completed, so we only have to do the live interval update work once. Delaying the live interval update could potentially change the copy coalescing result, so we hope to limit that change to those defs with many (like above a hundred) copylike uses, and the cutoff can be adjusted by the option -mllvm -late-remat-update-threshold=xxx. Differential Revision: https://reviews.llvm.org/D49519 llvm-svn: 339035
* [DebugInfo] Refactor DbgInfoIntrinsic class hierarchy.Hsiangkai Wang2018-08-061-1/+1
| | | | | | | | | | | | | | | | In the past, DbgInfoIntrinsic has a strong assumption that these intrinsics all have variables and expressions attached to them. However, it is too strong to derive the class for other debug entities. Now, it has problems for debug labels. In order to make DbgInfoIntrinsic as a base class for 'debug info', I create a class for 'variable debug info', DbgVariableIntrinsic. DbgDeclareInst, DbgAddrIntrinsic, and DbgValueInst will be derived from it. Differential Revision: https://reviews.llvm.org/D50220 llvm-svn: 338984
* [GISel]: Add Opcodes for CTLZ/CTTZ/CTPOPAditya Nandakumar2018-08-041-0/+20
| | | | | | | | https://reviews.llvm.org/D48600 Added IRTranslator support to translate these known intrinsics into GISel opcodes. llvm-svn: 338944
* [SelectionDAG] Teach LegalizeVectorTypes to widen the mask input to a masked ↵Craig Topper2018-08-031-11/+28
| | | | | | | | | | store. The mask operand is visited before the data operand so we need to be able to widen it. Fixes PR38436. llvm-svn: 338915
* DAG: Enhance isKnownNeverNaNMatt Arsenault2018-08-032-8/+102
| | | | | | | | | | | | Add a parameter for testing specifically for sNaNs - at least one instruction pattern on AMDGPU needs to check specifically for this. Also handle more cases, and add a target hook for custom nodes, similar to the hooks for known bits. llvm-svn: 338910
* [TargetLowering] Generalise BuildSDIV functionSimon Pilgrim2018-08-032-15/+14
| | | | | | | | First step towards a BuildSDIV equivalent to D49248 for non-uniform vector support - this just pushes the splat detection down into TargetLowering::BuildSDIV where its still used. Differential Revision: https://reviews.llvm.org/D50185 llvm-svn: 338838
* [GlobalMerge] Allow merging globals with explicit section markings.Eli Friedman2018-08-021-10/+22
| | | | | | | | | | | | | | | At least on ELF, it's impossible to tell from the object file whether two globals with the same section marking were merged: the merged global uses "private" linkage to hide its symbol, and the aliases look like regular symbols. I can't think of any other reason to disallow it. (Of course, we can only merge globals in the same section.) The weird alignment handling matches AsmPrinter; our alignment handling for global variables should probably be refactored. Differential Revision: https://reviews.llvm.org/D49822 llvm-svn: 338791
* DAG: Fix vector widening fcanonicalizeMatt Arsenault2018-08-021-0/+1
| | | | llvm-svn: 338715
* [GlobalISel] Rewrite CallLowering::lowerReturn to accept multiple VRegs per ↵Alexander Ivchenko2018-08-021-4/+6
| | | | | | | | | | Value This is logical continuation of https://reviews.llvm.org/D46018 (r332449) Differential Revision: https://reviews.llvm.org/D49660 llvm-svn: 338685
* Fix FCOPYSIGN expansionLei Liu2018-08-021-16/+12
| | | | | | | | | | | | In expansion of FCOPYSIGN, the shift node is missing when the two operands of FCOPYSIGN are of the same size. We should always generate shift node (if the required shift bit is not zero) to put the sign bit into the right position, regardless of the size of underlying types. Differential Revision: https://reviews.llvm.org/D49973 llvm-svn: 338665
* [AArch64] DWARF: do not generate AT_location for thread localLei Liu2018-08-011-0/+4
| | | | | | | | | | | | | | | AArch64 ELF ABI does not define a static relocation type for TLS offset within a module, which makes it impossible for compiler to generate a valid DW_AT_location content for thread local variables. Currently LLVM generates an invalid R_AARCH64_ABS64 relocation at the DW_AT_location field for a TLS variable. That causes trouble for linker because thread local variable does not have an absolute address at link time. AArch64 GCC solves the problem by not generating DW_AT_location for thread local variables. We should do the same in LLVM. Differential Revision: https://reviews.llvm.org/D43860 llvm-svn: 338655
* [DEBUGINFO] Disable emission of the dwarf sections, but allow directives.Alexey Bataev2018-08-014-1/+36
| | | | | | | | | | | | | | | | Summary: Added an option that allows to emit only '.loc' and '.file' kind debug directives, but disables emission of the DWARF sections. Required for NVPTX target to support profiling. It requires '.loc' and '.file' directives, but does not require any DWARF sections for the profiler. Reviewers: probinson, echristo, dblaikie Subscribers: aprantl, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D46021 llvm-svn: 338616
* [NFC] small addendum to r334242, FMF propagationMichael Berg2018-08-011-1/+1
| | | | llvm-svn: 338604
* [SelectionDAG] fix bug in translating funnel shift with non-power-of-2 typeSanjay Patel2018-08-011-31/+39
| | | | | | | | | | | | | | | | | | The bug is visible in the constant-folded x86 tests. We can't use the negated shift amount when the type is not power-of-2: https://rise4fun.com/Alive/US1r ...so in that case, use the regular lowering that includes a select to guard against a shift-by-bitwidth. This path is improved by only calculating the modulo shift amount once now. Also, improve the rotate (with power-of-2 size) lowering to use a negate rather than subtract from bitwidth. This improves the codegen whether we have a rotate instruction or not (although we can still see that we're not matching to a legal rotate in all cases). llvm-svn: 338592
* [SelectionDAG] Make binop reduction matcher available to all targetsSimon Pilgrim2018-08-011-0/+58
| | | | | | | | There is nothing x86-specific about this code, so it'd be nice to make this available for other targets to use in the future (and get it out of X86ISelLowering!). Differential Revision: https://reviews.llvm.org/D50083 llvm-svn: 338586
* [FPEnv] Widen illegal width StrictFP vector operations as neededCameron McInally2018-08-012-58/+205
| | | | | | Differential Revision: https://reviews.llvm.org/D49806 llvm-svn: 338562
* [MC] Report fatal error for DWARF types for non-ELF object filesJonas Devlieghere2018-08-011-1/+3
| | | | | | | | | | | | | | Getting the DWARF types section is only implemented for ELF object files. We already disabled emitting debug types in clang (r337717), but now we also report an fatal error (rather than crashing) when trying to obtain this section in MC. Additionally we ignore the generate debug types flag for unsupported target triples. See PR38190 for more information. Differential revision: https://reviews.llvm.org/D50057 llvm-svn: 338527
* [DWARF] Basic support for producing DWARFv5 .debug_addr sectionVictor Leschuk2018-08-014-2/+31
| | | | | | | | | | | | | | This revision implements support for generating DWARFv5 .debug_addr section. The implementation is pretty straight-forward: we just check the dwarf version and emit section header if needed. Reviewers: aprantl, dblaikie, probinson Reviewed by: dblaikie Differential Revision: https://reviews.llvm.org/D50005 llvm-svn: 338487
* [GlobalISel][IRTranslator] Use RPO traversal when visiting blocks to translate.Amara Emerson2018-08-011-5/+8
| | | | | | | | | | | | Previously we were just visiting the blocks in the function in IR order, which is rather arbitrary. Therefore we wouldn't always visit defs before uses, but the translation code relies on this assumption in some places. Only codegen change seen in tests is an elision of a redundant copy. Fixes PR38396 llvm-svn: 338476
OpenPOWER on IntegriCloud