path: root/llvm/lib/CodeGen
Commit message (Author, Age, Files, Lines)
...
* Reapply: "DebugInfo: Emit only one kind of accelerated access/name table" (David Blaikie, 2019-04-23, 3 files, -4/+6)
| Originally committed in r358931
| Reverted in r358997
|
| Seems this change made Apple accelerator tables miss names (because names started respecting the CU NameTableKind GNU & assuming that shouldn't produce accelerated names too), which is never correct (apple accelerator tables don't have separators or CU lists - if present, they must describe all names in all CUs).
|
| Original Description:
| Currently to opt in to debug_names in DWARFv5, the IR must contain 'nameTableKind: Default' which also enables debug_pubnames.
|
| Instead, only allow one of {debug_names, apple_names, debug_pubnames, debug_gnu_pubnames}.
|
| nameTableKind: Default gives debug_names in DWARFv5 and greater, debug_pubnames in v4 and earlier - and apple_names when tuning for lldb on MachO.
| nameTableKind: GNU always gives gnu_pubnames
|
| llvm-svn: 359026
* [AArch64][GlobalISel] Legalize G_INTRINSIC_TRUNC (Jessica Paquette, 2019-04-23, 1 file, -0/+1)
| Same patch as G_FCEIL etc.
|
| Add the missing switch case in widenScalar, add G_INTRINSIC_TRUNC to the correct rule in AArch64LegalizerInfo.cpp, and add a test.
|
| llvm-svn: 359021
* Revert "DebugInfo: Emit only one kind of accelerated access/name table" (David Blaikie, 2019-04-23, 3 files, -8/+3)
| Regresses some apple_names situations - still investigating.
|
| This reverts commit r358931.
|
| llvm-svn: 358997
* Use llvm::stable_sort (Fangrui Song, 2019-04-23, 12 files, -62/+56)
| While touching the code, simplify if feasible.
|
| llvm-svn: 358996
* [DAGCombiner] generalize binop-of-splats scalarization (Sanjay Patel, 2019-04-23, 1 file, -46/+38)
| If we only match build vectors, we can miss some patterns that use shuffles as seen in the affected tests.
|
| Note that the underlying calls within getSplatSourceVector() have the potential for compile-time explosion because of exponential recursion looking through binop opcodes, but currently the list of supported opcodes is very limited. Both of those problems should be addressed in follow-up patches.
|
| llvm-svn: 358984
* [DAGCombiner] Combine OR as ADD when no common bits are set (Bjorn Pettersson, 2019-04-23, 1 file, -16/+40)
| Summary:
| The DAGCombiner is rewriting (canonicalizing) an ISD::ADD with no common bits set in the operands as an ISD::OR node. This could sometimes result in "missing out" on some combines that normally are performed for ADD. To be more specific, this could happen if we have already rewritten an ADD into OR, and later (after legalizations or combines) we expose patterns that could have been optimized if we had seen the OR as an ADD (e.g. reassociations based on ADD).
|
| To make the DAG combiner less sensitive to whether ADD or OR is used for these "no common bits set" ADD/OR operations, we now apply most of the ADD combines also to an OR operation, when value tracking indicates that the operands have no common bits set.
|
| Reviewers: spatel, RKSimon, craig.topper, kparzysz
|
| Reviewed By: spatel
|
| Subscribers: arsenm, rampitec, lebedev.ri, jvesely, nhaehnle, hiraditya, javed.absar, llvm-commits
|
| Tags: #llvm
|
| Differential Revision: https://reviews.llvm.org/D59758
|
| llvm-svn: 358965
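The equivalence this combine relies on can be checked in isolation. The following is a minimal sketch in plain C++, not DAGCombiner code: `noCommonBits`, `addViaOr`, `original` and `reassociated` are hypothetical helper names, and 32-bit unsigned integers stand in for DAG values.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true when no bit position is set in both operands.
bool noCommonBits(uint32_t a, uint32_t b) { return (a & b) == 0; }

// With disjoint bits no column ever produces a carry, so OR equals ADD.
uint32_t addViaOr(uint32_t a, uint32_t b) {
  assert(noCommonBits(a, b) && "operands must not share set bits");
  return a | b;
}

// The form a combiner might see after ADD was canonicalized to OR:
// ((x << 4) | 3) + 5. The low four bits of (x << 4) are zero, so the
// OR operands share no bits.
uint32_t original(uint32_t x) { return addViaOr(x << 4, 3) + 5; }

// Reading the OR as an ADD allows reassociating the two constants:
// (x << 4) + (3 + 5).
uint32_t reassociated(uint32_t x) { return (x << 4) + 8; }
```

Both `original` and `reassociated` compute the same value for any `x`; the second form only becomes reachable once the OR is treated as an ADD.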
* Revert "Use const DebugLoc&" (Chandler Carruth, 2019-04-23, 1 file, -2/+2)
| This reverts r358910 (git commit 2b744665308fc8d30a3baecb4947f2bd81aa7d30)
|
| While this patch *seems* trivial and safe and correct, it is not. The copies are actually load bearing copies. You can observe this with MSan or other ways of checking for use-after-destroy, but otherwise this may result in ... difficult to debug inexplicable behavior.
|
| I suspect the issue is that the debug location is used after the original reference to it is removed. The metadata backing it gets destroyed as its last reference goes away, and then we reference it later through these const references.
|
| llvm-svn: 358940
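The difference between a "load bearing" copy and a const reference can be illustrated with an analogy. This is only a sketch: `std::shared_ptr` stands in for a reference-tracked handle, which is an assumption made for illustration and not how `DebugLoc` is implemented, and all function names are hypothetical.

```cpp
#include <cassert>
#include <memory>

// Binding a const reference does not add an owner: the count stays at 1,
// so the payload dies as soon as the original handle lets go.
long ownersWithReference() {
  auto owner = std::make_shared<int>(42);
  const std::shared_ptr<int> &ref = owner;
  return ref.use_count();
}

// Taking a copy adds a second owner.
long ownersWithCopy() {
  auto owner = std::make_shared<int>(42);
  std::shared_ptr<int> copy = owner;
  return copy.use_count();
}

// The copy keeps the payload alive after the original owner is reset.
// Reading through a reference to `owner` at this point would not be valid.
int readThroughCopyAfterReset() {
  auto owner = std::make_shared<int>(42);
  std::shared_ptr<int> copy = owner;
  owner.reset();
  return *copy;
}
```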
* DebugInfo: Emit only one kind of accelerated access/name table (David Blaikie, 2019-04-22, 3 files, -3/+8)
| Currently to opt in to debug_names in DWARFv5, the IR must contain 'nameTableKind: Default' which also enables debug_pubnames.
|
| Instead, only allow one of {debug_names, apple_names, debug_pubnames, debug_gnu_pubnames}.
|
| nameTableKind: Default gives debug_names in DWARFv5 and greater, debug_pubnames in v4 and earlier - and apple_names when tuning for lldb on MachO.
| nameTableKind: GNU always gives gnu_pubnames
|
| llvm-svn: 358931
* [SelectionDAG] move splat util functions up from x86 lowering (Sanjay Patel, 2019-04-22, 1 file, -0/+52)
| This was supposed to be NFC, but the change in SDLoc definitions causes instruction scheduling changes.
|
| There's nothing x86-specific in this code, and it can likely be used from DAGCombiner's simplifyVBinOp().
|
| llvm-svn: 358930
* Use const DebugLoc& (Matt Arsenault, 2019-04-22, 1 file, -2/+2)
| llvm-svn: 358910
* GlobalISel: Legalize scalar G_EXTRACT sources (Matt Arsenault, 2019-04-22, 1 file, -0/+7)
| llvm-svn: 358892
* [TargetLowering][AMDGPU][X86] Improve SimplifyDemandedBits bitcast handling (Simon Pilgrim, 2019-04-22, 1 file, -1/+25)
| This patch adds support for BigBitWidth -> SmallBitWidth bitcasts, splitting the DemandedBits/Elts accordingly.
|
| The AMDGPU backend needed an extra (srl (and x, c1 << c2), c2) -> (and (srl x, c2), c1) combine to encourage BFE creation. I investigated putting this in DAGCombine but it caused a lot of noise on other targets - some improvements, some regressions.
|
| The X86 changes are all definite wins.
|
| Differential Revision: https://reviews.llvm.org/D60462
|
| llvm-svn: 358887
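The shift/mask rewrite mentioned for AMDGPU is an identity on unsigned integers, which can be checked outside the DAG. The sketch below uses plain 32-bit arithmetic; `maskThenShift` and `shiftThenMask` are hypothetical names, and the shift amount is assumed to be smaller than the bit width.

```cpp
#include <cassert>
#include <cstdint>

// Left-hand side of the combine: (srl (and x, c1 << c2), c2).
uint32_t maskThenShift(uint32_t x, uint32_t c1, unsigned c2) {
  assert(c2 < 32 && "shift amount must be smaller than the bit width");
  return (x & (c1 << c2)) >> c2;
}

// Right-hand side of the combine: (and (srl x, c2), c1).
// Shifting first leaves the mask in its unshifted form, which is the
// shape a bitfield-extract pattern can match.
uint32_t shiftThenMask(uint32_t x, uint32_t c1, unsigned c2) {
  assert(c2 < 32 && "shift amount must be smaller than the bit width");
  return (x >> c2) & c1;
}
```

For example, with `x = 0xDEADBEEF`, `c1 = 0xFF` and `c2 = 8`, both forms extract the byte `0xBE`.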
* [DAGCombiner] make variable name less ambiguous; NFC (Sanjay Patel, 2019-04-22, 1 file, -4/+4)
| llvm-svn: 358886
* [DAGCombiner] prepare shuffle-of-splat to handle more patterns; NFC (Sanjay Patel, 2019-04-22, 1 file, -11/+16)
| llvm-svn: 358884
* Revert r358800. Breaks Obsequi from the test suite. (Amara Emerson, 2019-04-20, 1 file, -95/+4)
| The last attempt fixed gcc and consumer-typeset, but Obsequi seems to fail with a different issue.
|
| llvm-svn: 358829
* [ExecutionDomainFix] Optimize a binary search insertion (Fangrui Song, 2019-04-20, 1 file, -3/+3)
| llvm-svn: 358815
* Revert "Revert "[GlobalISel] Add legalization support for non-power-2 loads and stores"" (Amara Emerson, 2019-04-19, 1 file, -4/+95)
| We were shifting the wrong component of a split load when trying to combine them back into a single value.
|
| llvm-svn: 358800
* [GlobalISel][AArch64] Legalize + select G_FRINT (Jessica Paquette, 2019-04-19, 1 file, -0/+2)
| Exactly the same as G_FCEIL, G_FABS, etc.
|
| Add tests for the fp16/nofp16 behaviour, update arm64-vfloatintrinsics, etc.
|
| Differential Revision: https://reviews.llvm.org/D60895
|
| llvm-svn: 358799
* [GlobalISel] Add IRTranslator support for G_FRINT (Jessica Paquette, 2019-04-19, 1 file, -0/+2)
| Add it as a simple intrinsic, update arm64-irtranslator.ll.
|
| Differential Revision: https://reviews.llvm.org/D60893
|
| llvm-svn: 358787
* Attempt to fix buildbot failure in commit 1bb57bac959ac163fd7d8a76d734ca3e0ecee6ab. (Amy Huang, 2019-04-19, 1 file, -1/+1)
| llvm-svn: 358786
* [MS] Emit S_HEAPALLOCSITE debug info (Amy Huang, 2019-04-19, 4 files, -0/+36)
| Summary:
| This emits labels around heapallocsite calls and S_HEAPALLOCSITE debug info in codeview. Currently only changes FastISel, so emitting labels still needs to be implemented in SelectionDAG.
|
| Reviewers: hans, rnk
|
| Subscribers: aprantl, hiraditya, cfe-commits, llvm-commits
|
| Tags: #clang, #llvm
|
| Differential Revision: https://reviews.llvm.org/D60800
|
| llvm-svn: 358783
* Revert "[GlobalISel] Add legalization support for non-power-2 loads and stores" (Amara Emerson, 2019-04-19, 1 file, -95/+4)
| This introduces some runtime failures which I'll need to investigate further.
|
| llvm-svn: 358771
* [GlobalISel][AArch64] Legalize vector G_FPOW (Jessica Paquette, 2019-04-19, 1 file, -0/+1)
| This instruction is legalized in the same way as G_FSIN, G_FCOS, G_FLOG10, etc.
|
| Update legalize-pow.mir and arm64-vfloatintrinsics.ll to reflect the change.
|
| Differential Revision: https://reviews.llvm.org/D60218
|
| llvm-svn: 358764
* [SelectionDAG] soften splat mask assert/unreachable (PR41535) (Sanjay Patel, 2019-04-19, 1 file, -1/+4)
| These are general queries, so they should not die when given a degenerate input like an all undef mask. Callers should be able to deal with an op that will eventually be simplified away.
|
| llvm-svn: 358761
* [CodeGen] Add "const" to MachineInstr::mayAlias (Bjorn Pettersson, 2019-04-19, 5 files, -15/+18)
| Summary:
| The basic idea here is to make it possible to use MachineInstr::mayAlias also when the MachineInstr is const (or the "Other" MachineInstr is const).
|
| The addition of const in MachineInstr::mayAlias then rippled down to the need for adding const in several other places, such as TargetInstrInfo::getMemOperandWithOffset.
|
| Reviewers: hfinkel
|
| Reviewed By: hfinkel
|
| Subscribers: hfinkel, MatzeB, arsenm, jvesely, nhaehnle, hiraditya, javed.absar, llvm-commits
|
| Tags: #llvm
|
| Differential Revision: https://reviews.llvm.org/D60856
|
| llvm-svn: 358744
* [PATCH] [MachineScheduler] Check pending instructions when an instruction is scheduled (James Molloy, 2019-04-19, 1 file, -0/+2)
| Pending instructions that may have been blocked from being available by the HazardRecognizer may no longer be blocked when an instruction is scheduled; pending instructions should be re-checked in this case.
|
| This is primarily aimed at VLIW targets with large parallelism and esoteric constraints. No testcase as no in-tree targets have this behavior.
|
| Differential revision: https://reviews.llvm.org/D60861
|
| llvm-svn: 358743
* [llvm] Prevent duplicate files in debug line header in dwarf 5: another attempt (Ali Tamur, 2019-04-19, 1 file, -2/+3)
| Another attempt to land the changes in debug line header to prevent duplicate files in Dwarf 5. I rolled back my previous commit because of a mistake in generating the object file in a test. Meanwhile, I addressed some offline comments and changed the implementation; the largest difference is that MCDwarfLineTableHeader does not keep DwarfVersion but gets it as a parameter. I also merged the patch to fix two lld tests that will start to fail into this patch.
|
| Original Commit:
|
| https://reviews.llvm.org/D59515
|
| Original Message:
| Motivation: In previous dwarf versions, file name indexes started from 1, and the primary source file was not explicit. Dwarf 5 standard (6.2.4) prescribes the primary source file to be explicitly given an entry with an index number 0.
|
| The current implementation honors the specification by just duplicating the main source file, once with index number 0, and later maybe with another index number. While this is compliant with the letter of the standard, the duplication causes problems for consumers of this information such as lldb. (Some files are duplicated, where only some of them have a line table although all refer to the same file)
|
| With this change, dwarf 5 debug line section files always start from 0, and the zeroth entry is not duplicated whenever possible. This requires different handling of dwarf 4 and dwarf 5 during generation (e.g. when a function returns an index zero for a file name, it signals an error in dwarf 4, but not in dwarf 5) However, I think the minor complication is worth it, because it enables all consumers (lldb, gdb, dwarfdump, objdump, and so on) to treat all files in the file name list homogeneously.
|
| llvm-svn: 358732
* [NFC] FMF propagation for GlobalIsel (Michael Berg, 2019-04-18, 2 files, -8/+8)
| llvm-svn: 358702
* Fix a typo in comments. [NFC] (Ali Tamur, 2019-04-18, 1 file, -1/+1)
| llvm-svn: 358639
* [GISel]:IRTranslator: Prefer a buildInstr form that allows CSE of cast instructions (Aditya Nandakumar, 2019-04-18, 1 file, -1/+1)
| https://reviews.llvm.org/D60844
|
| Use the style of buildInstr that allows CSEing.
|
| llvm-svn: 358637
* [AsmPrinter] hoist %a output template to base class for ARM+AArch64 (Nick Desaulniers, 2019-04-17, 1 file, -4/+11)
| Summary:
| X86 is quite complicated; so I intend to leave it as is. ARM+AArch64 do basically the same thing (AArch64 did not correctly handle immediates, ARM has a test llvm/test/CodeGen/ARM/2009-04-06-AsmModifier.ll that uses %a with an immediate) for a flag that should be target independent anyways.
|
| Reviewers: echristo, peter.smith
|
| Reviewed By: echristo
|
| Subscribers: javed.absar, eraman, kristof.beyls, hiraditya, llvm-commits, srhines
|
| Tags: #llvm
|
| Differential Revision: https://reviews.llvm.org/D60841
|
| llvm-svn: 358618
* Add a getSizeInBits() accessor to MachineMemOperand. NFC. (Amara Emerson, 2019-04-17, 2 files, -8/+8)
| Cleans up a bunch of places where we do getSize() * 8.
|
| Differential Revision: https://reviews.llvm.org/D60799
|
| llvm-svn: 358617
* [GlobalISel] Add legalization support for non-power-2 loads and stores (Amara Emerson, 2019-04-17, 1 file, -3/+94)
| Legalize things like i24 load/store by splitting them into smaller power of 2 operations.
|
| This matches how SelectionDAG handles these operations.
|
| Differential Revision: https://reviews.llvm.org/D59971
|
| llvm-svn: 358613
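As a rough illustration of the splitting idea, the sketch below models an i24 load and store as a 16-bit piece plus an 8-bit piece. It is plain C++ over a byte buffer, not legalizer code; `loadI24` and `storeI24` are hypothetical names, and a little-endian memory layout is assumed.

```cpp
#include <cassert>
#include <cstdint>

// Load a 24-bit value as a 16-bit piece followed by an 8-bit piece
// (little-endian layout assumed), then merge the pieces.
uint32_t loadI24(const uint8_t *p) {
  uint32_t lo16 = static_cast<uint32_t>(p[0]) |
                  (static_cast<uint32_t>(p[1]) << 8);
  uint32_t hi8 = p[2];
  // Only the high piece is shifted when the pieces are recombined;
  // shifting the low piece instead would scramble the value.
  return lo16 | (hi8 << 16);
}

// Store a 24-bit value as a 16-bit piece followed by an 8-bit piece.
void storeI24(uint8_t *p, uint32_t v) {
  assert(v <= 0xFFFFFFu && "value must fit in 24 bits");
  uint32_t lo16 = v & 0xFFFFu;
  uint32_t hi8 = (v >> 16) & 0xFFu;
  p[0] = static_cast<uint8_t>(lo16 & 0xFFu);
  p[1] = static_cast<uint8_t>(lo16 >> 8);
  p[2] = static_cast<uint8_t>(hi8);
}
```

Both pieces have power-of-2 widths, so each can be handled by an ordinary load or store of that size.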
* [AsmPrinter] defer %c to base class for ARM, PPC, and Hexagon. NFC (Nick Desaulniers, 2019-04-17, 1 file, -0/+3)
| Summary:
| None of these derived classes do anything that the base class cannot. If we remove these case statements, then the base class can handle them just fine.
|
| Reviewers: peter.smith, echristo
|
| Reviewed By: echristo
|
| Subscribers: nemanjai, javed.absar, eraman, kristof.beyls, hiraditya, kbarton, jsji, llvm-commits, srhines
|
| Tags: #llvm
|
| Differential Revision: https://reviews.llvm.org/D60803
|
| llvm-svn: 358603
* [DAGCombine] Add SimplifyDemandedBits helper that handles demanded elts mask as well (Simon Pilgrim, 2019-04-17, 1 file, -4/+13)
| The other SimplifyDemandedBits helpers become wrappers to this new demanded elts variant.
|
| llvm-svn: 358585
* [ScheduleDAGRRList] Recompute topological ordering on demand. (Florian Hahn, 2019-04-17, 2 files, -24/+68)
| Currently there is a single point in ScheduleDAGRRList, where we actually query the topological order (besides init code). Currently we are recomputing the order after adding a node (which does not have predecessors) and then we add predecessors edge-by-edge.
|
| We can avoid adding edges one-by-one after we added a new node. In that case, we can just rebuild the order from scratch after adding the edges to the DAG and avoid all the updates to the ordering.
|
| Also, we can delay updating the DAG until we query the DAG, if we keep a list of added edges. Depending on the number of updates, we can either apply them when needed or recompute the order from scratch.
|
| This brings the geomean compile time of CTMark with -O1 down 0.3% on X86, with no regressions.
|
| Reviewers: MatzeB, atrick, efriedma, niravd, paquette
|
| Reviewed By: efriedma
|
| Differential Revision: https://reviews.llvm.org/D60125
|
| llvm-svn: 358583
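The "recompute on demand" idea can be sketched with a small stand-alone class. This is a simplification under stated assumptions, not the ScheduleDAGRRList code: `LazyTopoOrder` and its members are hypothetical, the graph is assumed to stay acyclic, and the sketch always rebuilds the whole order on the first query after a change rather than choosing between incremental updates and a full rebuild.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical sketch: a DAG whose topological order is rebuilt only
// when it is queried after a modification.
class LazyTopoOrder {
public:
  explicit LazyTopoOrder(std::size_t numNodes)
      : succs(numNodes), position(numNodes, 0) {}

  // Adding a node only marks the order as stale.
  std::size_t addNode() {
    succs.emplace_back();
    position.push_back(0);
    dirty = true;
    return succs.size() - 1;
  }

  // Record the edge; the ordering is not touched yet.
  void addEdge(std::size_t from, std::size_t to) {
    assert(from < succs.size() && to < succs.size());
    succs[from].push_back(to);
    dirty = true;
  }

  // True if `a` is placed before `b`; rebuilds only if something changed.
  bool comesBefore(std::size_t a, std::size_t b) {
    if (dirty)
      recompute();
    return position[a] < position[b];
  }

  unsigned recomputeCount() const { return recomputes; }

private:
  // Kahn's algorithm; assumes the graph is acyclic.
  void recompute() {
    std::vector<std::size_t> inDegree(succs.size(), 0);
    for (const auto &out : succs)
      for (std::size_t s : out)
        ++inDegree[s];
    std::queue<std::size_t> ready;
    for (std::size_t n = 0; n < succs.size(); ++n)
      if (inDegree[n] == 0)
        ready.push(n);
    std::size_t next = 0;
    while (!ready.empty()) {
      std::size_t n = ready.front();
      ready.pop();
      position[n] = next++;
      for (std::size_t s : succs[n])
        if (--inDegree[s] == 0)
          ready.push(s);
    }
    dirty = false;
    ++recomputes;
  }

  std::vector<std::vector<std::size_t>> succs;
  std::vector<std::size_t> position;
  bool dirty = true;
  unsigned recomputes = 0;
};
```

Adding a node and several edges costs nothing until the next `comesBefore` call, at which point a single rebuild covers all pending changes.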
* Change some llvm::{lower,upper}_bound to llvm::bsearch. NFC (Fangrui Song, 2019-04-17, 1 file, -4/+2)
| llvm-svn: 358564
* [TargetLowering] Rename preferShiftsToClearExtremeBits and shouldFoldShiftPairToMask (PR41359) (Simon Pilgrim, 2019-04-16, 1 file, -2/+2)
| As discussed on PR41359, this patch renames the pair of shift-mask target feature functions to make their purposes more obvious.
|
| shouldFoldShiftPairToMask -> shouldFoldConstantShiftPairToMask
|
| preferShiftsToClearExtremeBits -> shouldFoldMaskToVariableShiftPair
|
| llvm-svn: 358526
* [DAGCombiner] Add missing flag to addressing mode check (Luis Marques, 2019-04-16, 1 file, -0/+2)
| The checks in `canFoldInAddressingMode` tested for addressing modes that have a base register but didn't set the `HasBaseReg` flag to true (it's false by default). This patch fixes that. Although the omission of the flag was technically incorrect it had no known observable impact, so no tests were changed by this patch.
|
| Differential Revision: https://reviews.llvm.org/D60314
|
| llvm-svn: 358502
* [AArch64][GlobalISel] Don't do extending loads combine for non-pow-2 types. (Amara Emerson, 2019-04-15, 1 file, -0/+5)
| Since non-pow-2 types are going to get split up into multiple loads anyway, don't do the [SZ]EXTLOAD combine for those and save us trouble later in legalization.
|
| llvm-svn: 358458
* DAG: propagate ConsecutiveRegs flags to returns too. (Tim Northover, 2019-04-15, 1 file, -0/+18)
| Arguments already have a flag to inform backends when they have been split up. The AArch64 arm64_32 ABI makes use of these on return types too, so that code emitted for armv7k can be ABI-compliant.
|
| There should be no CodeGen changes yet, just making more information available.
|
| llvm-svn: 358399
* DAG: propagate whether an arg is a pointer for CallingConv decisions. (Tim Northover, 2019-04-15, 2 files, -5/+30)
| The arm64_32 ABI specifies that pointers (despite being 32-bits) should be zero-extended to 64-bits when passed in registers for efficiency reasons. This means that the SelectionDAG needs to be able to tell the backend that an argument was originally a pointer, which is implemented here.
|
| Additionally, some memory intrinsics need to be declared as taking an i8* instead of an iPTR.
|
| There should be no CodeGen change yet, but it will be triggered when AArch64 backend support for ILP32 is added.
|
| llvm-svn: 358398
* [SelectionDAG] Use KnownBits::computeForAddSub/computeForAddCarry (Bjorn Pettersson, 2019-04-15, 1 file, -58/+21)
| Summary:
| Use KnownBits::computeForAddSub/computeForAddCarry in SelectionDAG::computeKnownBits when doing value tracking for addition/subtraction.
|
| This should improve the precision of the known bits, as we only used to make a simple estimate of known zeroes. The KnownBits support functions are also able to deduce bits that are known to be one in the result.
|
| Reviewers: spatel, RKSimon, nikic, lebedev.ri
|
| Reviewed By: nikic
|
| Subscribers: nikic, javed.absar, lebedev.ri, hiraditya, llvm-commits
|
| Tags: #llvm
|
| Differential Revision: https://reviews.llvm.org/D60460
|
| llvm-svn: 358372
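To make "known ones as well as known zeroes" concrete, here is a small sketch of a carry-based known-bits computation for addition. It is written from the description above and is not the LLVM implementation: `Known32` and `knownBitsOfAdd` are hypothetical names, values are fixed at 32 bits, and no carry-in is modelled.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a known-bits record on 32-bit values:
// a bit set in `zero` is known to be 0, a bit set in `one` is known to be 1.
struct Known32 {
  uint32_t zero;
  uint32_t one;
};

Known32 knownBitsOfAdd(Known32 lhs, Known32 rhs) {
  assert((lhs.zero & lhs.one) == 0 && (rhs.zero & rhs.one) == 0);
  // Sum with every unknown bit set to 1, and with every unknown bit set to 0.
  uint32_t sumOfMax = ~lhs.zero + ~rhs.zero;
  uint32_t sumOfMin = lhs.one + rhs.one;
  // Each sum bit is a ^ b ^ carry-in, so the carry-in bits of both sums
  // can be recovered by XOR-ing the operands back out.
  uint32_t carryKnownZero = ~(sumOfMax ^ lhs.zero ^ rhs.zero);
  uint32_t carryKnownOne = sumOfMin ^ lhs.one ^ rhs.one;
  // A result bit is known only where both operand bits and the carry-in
  // are known.
  uint32_t known = (lhs.zero | lhs.one) & (rhs.zero | rhs.one) &
                   (carryKnownZero | carryKnownOne);
  return {~sumOfMax & known, sumOfMin & known};
}
```

For example, adding the constant 4 to a value whose upper 30 bits are known zero yields a result in the range 4 to 7: bit 2 is known one, bits 3 and up are known zero, and the low two bits stay unknown.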
* [GlobalISel] Enable CSE in the IRTranslator & legalizer for -O0 with constants only. (Amara Emerson, 2019-04-15, 5 files, -15/+14)
| Other opcodes shouldn't be CSE'd until we can be sure debug info quality won't be degraded.
|
| This change also improves the IRTranslator so that in most places, but not all, it creates constants using the MIRBuilder directly instead of first creating a new destination vreg and then creating a constant. By doing this, the buildConstant() method can just return the vreg of an existing G_CONSTANT instead of having to create a COPY from it.
|
| I measured a 0.2% improvement in compile time and a 0.9% improvement in code size at -O0 ARM64.
|
| Compile time:
| Program                                      base     cse      diff
| test-suite...ark/tramp3d-v4/tramp3d-v4.test  9.04     9.12     0.8%
| test-suite...Mark/mafft/pairlocalalign.test  2.68     2.66     -0.7%
| test-suite...-typeset/consumer-typeset.test  5.53     5.51     -0.4%
| test-suite :: CTMark/lencod/lencod.test      5.30     5.28     -0.3%
| test-suite :: CTMark/Bullet/bullet.test      25.82    25.76    -0.2%
| test-suite...:: CTMark/ClamAV/clamscan.test  6.92     6.90     -0.2%
| test-suite...TMark/7zip/7zip-benchmark.test  34.24    34.17    -0.2%
| test-suite :: CTMark/SPASS/SPASS.test        6.25     6.24     -0.1%
| test-suite...:: CTMark/sqlite3/sqlite3.test  1.66     1.66     -0.1%
| test-suite :: CTMark/kimwitu++/kc.test       13.61    13.60    -0.0%
| Geomean difference                                             -0.2%
|
| Code size:
| Program                                      base     cse      diff
| test-suite...-typeset/consumer-typeset.test  1315632  1266480  -3.7%
| test-suite...:: CTMark/ClamAV/clamscan.test  1313892  1297508  -1.2%
| test-suite :: CTMark/lencod/lencod.test      1439504  1423112  -1.1%
| test-suite...TMark/7zip/7zip-benchmark.test  2936980  2904172  -1.1%
| test-suite :: CTMark/Bullet/bullet.test      3478276  3445460  -0.9%
| test-suite...ark/tramp3d-v4/tramp3d-v4.test  8082868  8033492  -0.6%
| test-suite :: CTMark/kimwitu++/kc.test       3870380  3853972  -0.4%
| test-suite :: CTMark/SPASS/SPASS.test        1434904  1434896  -0.0%
| test-suite...Mark/mafft/pairlocalalign.test  764528   764528   0.0%
| test-suite...:: CTMark/sqlite3/sqlite3.test  782092   782092   0.0%
| Geomean difference                                             -0.9%
|
| Differential Revision: https://reviews.llvm.org/D60580
|
| llvm-svn: 358369
* [GlobalISel] Introduce a CSEConfigBase class to allow targets to define their own CSE configs. (Amara Emerson, 2019-04-15, 4 files, -8/+23)
| Because CodeGen can't depend on GlobalISel, we need a way to encapsulate the CSE configs that can be passed between TargetPassConfig and the targets' custom pass configs. This CSEConfigBase allows targets to create custom CSE configs which is then used by the GISel passes for the CSEMIRBuilder.
|
| This support will be used in a follow up commit to allow constant-only CSE for -O0 compiles in D60580.
|
| llvm-svn: 358368
* [AArch64][GlobalISel] Enable copy elision in the pre-legalizer combine and fix a crash. (Amara Emerson, 2019-04-13, 1 file, -0/+1)
| This enables the simple copy combine that already exists in the CombinerHelper. However, it exposed a bug in the GISelChangeObserver where it wouldn't clear a set of MIs to process, and so would end up causing a crash when deleted MIs were being added to the combiner worklist again.
|
| Differential Revision: https://reviews.llvm.org/D60579
|
| llvm-svn: 358318
* [GlobalISel] Fix a crash when handling an invalid MVT during call lowering. (Amara Emerson, 2019-04-12, 1 file, -1/+1)
| This crash was introduced in r358032 as we try to construct an EVT from an MVT in order to find the register type for the calling conv. Fall back instead of trying to do this with an invalid MVT coming from i256.
|
| llvm-svn: 358314
* [DAGCombiner] narrow shuffle of concatenated vectors (Sanjay Patel, 2019-04-12, 1 file, -0/+50)
| // shuffle (concat X, undef), (concat Y, undef), Mask -->
| // concat (shuffle X, Y, Mask0), (shuffle X, Y, Mask1)
|
| The ARM changes with 'vtrn' and narrowed 'vuzp' are improvements.
|
| The x86 changes look neutral or better. There's one test with an extra instruction, but that could be reversed for a subtarget with the right attributes. But by default, we want to avoid the 256-bit op when possible (in my motivating benchmark, a handful of ymm ops sprinkled into a sequence of xmm ops are triggering frequency throttling on Haswell resulting in significantly worse perf).
|
| Differential Revision: https://reviews.llvm.org/D60545
|
| llvm-svn: 358291
* Add options for MaxLoadsPerMemcmp(OptSize). (Hiroshi Yamauchi, 2019-04-12, 1 file, -2/+15)
| Reviewers: davidxl
|
| Reviewed By: davidxl
|
| Subscribers: llvm-commits
|
| Tags: #llvm
|
| Differential Revision: https://reviews.llvm.org/D60587
|
| llvm-svn: 358287
* Revert r358268 "[DebugInfo] DW_OP_deref_size in PrologEpilogInserter." (Hans Wennborg, 2019-04-12, 3 files, -22/+3)
| It causes clang to crash while building Chromium. See https://crbug.com/952230 for reproducer.
|
| > The PrologEpilogInserter needs to insert a DW_OP_deref_size before
| > prepending a memory location expression to an already implicit
| > expression to avoid having the existing expression act on the memory
| > address instead of the value behind it.
| >
| > The reason for using DW_OP_deref_size and not plain DW_OP_deref is that
| > big-endian targets need to read the right size as simply truncating a
| > larger read would yield the wrong result (LSB bytes are not at the lower
| > address).
| >
| > Differential Revision: https://reviews.llvm.org/D59687
|
| llvm-svn: 358281