summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/GlobalISel
Commit message (Collapse)AuthorAgeFilesLines
...
* [GlobalISel][Localizer] Rewrite localizer to run in 2 phases, inter & intra ↵Amara Emerson2019-06-171-61/+158
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | block. Inter-block localization is the same as what currently happens, except now it only runs on the entry block because that's where the problematic constants with long live ranges come from. The second phase is a new intra-block localization phase which attempts to re-sink the already localized instructions further right before one of the multiple uses. One additional change is to also localize G_GLOBAL_VALUE as they're constants too. However, on some targets like arm64 it takes multiple instructions to materialize the value, so some additional heuristics with a TTI hook have been introduced attempt to prevent code size regressions when localizing these. Overall, these changes improve CTMark code size on arm64 by 1.2%. Full code size results: Program baseline new diff ------------------------------------------------------------------------------ test-suite...-typeset/consumer-typeset.test 1249984 1217216 -2.6% test-suite...:: CTMark/ClamAV/clamscan.test 1264928 1232152 -2.6% test-suite :: CTMark/SPASS/SPASS.test 1394092 1361316 -2.4% test-suite...Mark/mafft/pairlocalalign.test 731320 714928 -2.2% test-suite :: CTMark/lencod/lencod.test 1340592 1324200 -1.2% test-suite :: CTMark/kimwitu++/kc.test 3853512 3820420 -0.9% test-suite :: CTMark/Bullet/bullet.test 3406036 3389652 -0.5% test-suite...ark/tramp3d-v4/tramp3d-v4.test 8017000 8016992 -0.0% test-suite...TMark/7zip/7zip-benchmark.test 2856588 2856588 0.0% test-suite...:: CTMark/sqlite3/sqlite3.test 765704 765704 0.0% Geomean difference -1.2% Differential Revision: https://reviews.llvm.org/D63303 llvm-svn: 363632
* Propagate fmf in IRTranslate for fnegMichael Berg2019-06-171-8/+17
| | | | | | | | | | | | | | Summary: This case is related to D63405 in that we need to be propagating FMF on negates. Reviewers: volkan, spatel, arsenm Reviewed By: arsenm Subscribers: wdng, javed.absar Differential Revision: https://reviews.llvm.org/D63458 llvm-svn: 363631
* [globalisel] Fix iterator invalidation in the extload combinesDaniel Sanders2019-06-171-54/+38
| | | | | | | | | | | | | | | | | | | | | | | | Summary: Change the way we deal with iterator invalidation in the extload combines as it was still possible to neglect to visit a use. Even worse, it happened in the in-tree test cases and the checks weren't good enough to detect it. We now take a cheap copy of the use list before iterating over it. This prevents iterator invalidation from occurring and has the nice side effect of making the existing schedule-for-erase/schedule-for-insert mechanism moot. Reviewers: aditya_nandakumar Reviewed By: aditya_nandakumar Subscribers: rovka, kristof.beyls, javed.absar, volkan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61813 llvm-svn: 363616
* GlobalISel: Ignore callsite attributes when picking intrinsic typeMatt Arsenault2019-06-171-1/+3
| | | | | | | | | | | A target intrinsic may be defined as possibly reading memory, but the call site may have additional knowledge that it doesn't read memory. The intrinsic lowering will expect the pessimistic assumption of the intrinsic definition, so the chain should still be used. I fixed the same bug in SelectionDAG in r287593. llvm-svn: 363580
* Reapply "GlobalISel: Avoid producing Illegal copies in RegBankSelect"Matt Arsenault2019-06-151-4/+26
| | | | | | | This reapplies r363410, avoiding null dereference if there is no AltRegBank. llvm-svn: 363478
* Revert "GlobalISel: Avoid producing Illegal copies in RegBankSelect"Mitch Phillips2019-06-141-25/+4
| | | | | | | | | | | This patch breaks UBSan build bots. See https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild for a guide as to how to reproduce the error. This reverts commit c2864c0de07efb5451d32d27a7d4ff2984830929. This reverts rL363410. llvm-svn: 363476
* [GlobalISel] Add a G_BRJT opcode.Amara Emerson2019-06-141-0/+11
| | | | | | | | | | | | | This is a branch opcode that takes a jump table pointer, jump table index and an index into the table to do an indirect branch. We pass both the table pointer and JTI to allow targets like ARM64 to more easily use the existing jump table compression optimization without having to walk up the block to find a paired G_JUMP_TABLE. Differential Revision: https://reviews.llvm.org/D63159 llvm-svn: 363434
* GlobalISel: Avoid producing Illegal copies in RegBankSelectMatt Arsenault2019-06-141-4/+25
| | | | | | | | | | | | | | | | | | | | | | Avoid producing illegal register bank copies for reg_sequence and phi. The default implementation assumes it is possible to pick any operand's bank and use that for the result, introducing a copy for operands with a different bank. This does not check for illegal copies. It is not legal to introduce a VGPR->SGPR copy, so any VGPR operand requires the result to be a VGPR. The changes in getInstrMappingImpl aren't strictly necessary, since AMDGPU now just bypasses this for reg_sequence/phi. This could be replaced with an assert in case other targets run into this. It is currently responsible for producing the error for unsatisfiable copies, but this will be better served with a verifier check. For phis, for now assume any undetermined operands must be VGPRs. Eventually, this needs to be able to defer mapping these operations. This also does not yet have a way to check for whether the block is in a divergent region. llvm-svn: 363410
* RegBankSelect: Remove checks for invalid mappingsMatt Arsenault2019-06-141-5/+2
| | | | | | | | | Avoid a check for valid and a set of redundant asserts. The place InstructionMapping is constructed asserts all of the default fields are passed anyway for an invalid mapping, so don't overcomplicate this. llvm-svn: 363391
* [GlobalISel][IRTranslator] Add debug loc with line 0 to constants emitted ↵Amara Emerson2019-06-131-3/+14
| | | | | | | | | | | | | | | into the entry block. Constants, including G_GLOBAL_VALUE, are all emitted into the entry block which lets us use the vreg def assuming it dominates all other users. However, it can cause jumpy debug behaviour since the DebugLoc attached to these MIs are from a user instruction that could be in a different block. Fixes PR40887. Differential Revision: https://reviews.llvm.org/D63286 llvm-svn: 363331
* [GlobalISel] Add a G_JUMP_TABLE opcode.Amara Emerson2019-06-111-0/+6
| | | | | | | | | | | | This opcode generates a pointer to the address of the jump table specified by the source operand, which is a jump table index. It will be used in conjunction with an upcoming G_BRJT opcode to support jump table codegen with GlobalISel. Differential Revision: https://reviews.llvm.org/D63111 llvm-svn: 363096
* [GlobalISel] Translate memset/memmove/memcpy from undef ptrs into nopsJessica Paquette2019-06-101-0/+13
| | | | | | | | | | | | | If the source is undef, then just don't do anything. This matches SelectionDAG's behaviour in SelectionDAG.cpp. Also add a test showing that we do the right thing here. (irtranslator-memfunc-undef.ll) Differential Revision: https://reviews.llvm.org/D63095 llvm-svn: 362989
* [GlobalISel] IRTranslator: Translate the intrinsics ignored by CodeGenVolkan Keles2019-06-071-0/+5
| | | | | | | | | | | | | | | | | | Summary: Translate `llvm.assume`, `llvm.var.annotation` and `llvm.sideeffect` to nothing as they have no effect on CodeGen. Reviewers: qcolombet, aditya_nandakumar, dsanders, paquette, aemerson, arsenm Reviewed By: arsenm Subscribers: hiraditya, wdng, rovka, kristof.beyls, javed.absar, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63022 llvm-svn: 362834
* [MIPS GlobalISel] Select floor and ceilPetar Avramovic2019-06-061-1/+9
| | | | | | | | Select G_FFLOOR and G_FCEIL for MIPS32. Differential Revision: https://reviews.llvm.org/D62901 llvm-svn: 362688
* Allow target to handle STRICT floating-point nodesUlrich Weigand2019-06-051-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ISD::STRICT_ nodes used to implement the constrained floating-point intrinsics are currently never passed to the target back-end, which makes it impossible to handle them correctly (e.g. mark instructions are depending on a floating-point status and control register, or mark instructions as possibly trapping). This patch allows the target to use setOperationAction to switch the action on ISD::STRICT_ nodes to Legal. If this is done, the SelectionDAG common code will stop converting the STRICT nodes to regular floating-point nodes, but instead pass the STRICT nodes to the target using normal SelectionDAG matching rules. To avoid having the back-end duplicate all the floating-point instruction patterns to handle both strict and non-strict variants, we make the MI codegen explicitly aware of the floating-point exceptions by introducing two new concepts: - A new MCID flag "mayRaiseFPException" that the target should set on any instruction that possibly can raise FP exception according to the architecture definition. - A new MI flag FPExcept that CodeGen/SelectionDAG will set on any MI instruction resulting from expansion of any constrained FP intrinsic. Any MI instruction that is *both* marked as mayRaiseFPException *and* FPExcept then needs to be considered as raising exceptions by MI-level codegen (e.g. scheduling). Setting those two new flags is straightforward. The mayRaiseFPException flag is simply set via TableGen by marking all relevant instruction patterns in the .td files. The FPExcept flag is set in SDNodeFlags when creating the STRICT_ nodes in the SelectionDAG, and gets inherited in the MachineSDNode nodes created from it during instruction selection. The flag is then transfered to an MIFlag when creating the MI from the MachineSDNode. This is handled just like fast-math flags like no-nans are handled today. This patch includes both common code changes required to implement the new features, and the SystemZ implementation. Reviewed By: andrew.w.kaylor Differential Revision: https://reviews.llvm.org/D55506 llvm-svn: 362663
* Reapply: IR: add optional type to 'byval' function parametersTim Northover2019-05-301-1/+4
| | | | | | | | | | | | | | | | | When we switch to opaque pointer types we will need some way to describe how many bytes a 'byval' parameter should occupy on the stack. This adds a (for now) optional extra type parameter. If present, the type must match the pointee type of the argument. The original commit did not remap byval types when linking modules, which broke LTO. This version fixes that. Note to front-end maintainers: if this causes test failures, it's probably because the "byval" attribute is printed after attributes without any parameter after this change. llvm-svn: 362128
* Revert "IR: add optional type to 'byval' function parameters"Tim Northover2019-05-291-4/+1
| | | | | | | The IRLinker doesn't delve into the new byval attribute when mapping types, and this breaks LTO. llvm-svn: 362029
* IR: add optional type to 'byval' function parametersTim Northover2019-05-291-1/+4
| | | | | | | | | | | | | | When we switch to opaque pointer types we will need some way to describe how many bytes a 'byval' parameter should occupy on the stack. This adds a (for now) optional extra type parameter. If present, the type must match the pointee type of the argument. Note to front-end maintainers: if this causes test failures, it's probably because the "byval" attribute is printed after attributes without any parameter after this change. llvm-svn: 362012
* Revert "[X86] Use 'llvm_unreachable' instead of nullptr in unreachable code to"Pengfei Wang2019-05-291-3/+1
| | | | | | This reverts commit c1b3716614bc0a107e6f41a7d3d503baefad8a5b. llvm-svn: 361918
* [X86] Use 'llvm_unreachable' instead of nullptr in unreachable code toPengfei Wang2019-05-291-1/+3
| | | | | | | | | | | | | | | | | avoid static check fail RegClassOrBank is an object of RegClassOrRegBank, which is defined as using llvm::RegClassOrRegBank = typedef PointerUnion<const TargetRegisterClass *, const RegisterBank *> so control flow can not get here. Use ""llvm_unreachable" here to avoid "null pointer" confusion. Patch by Shengchen Kan (skan) Differential Revision: https://reviews.llvm.org/D62006 Signed-off-by: pengfei <pengfei.wang@intel.com> llvm-svn: 361912
* GlobalISel: support swifterror attribute on AArch64.Tim Northover2019-05-242-17/+86
| | | | | | | | swifterror marks an argument as a register pretending to be a pointer, so we need a guaranteed mem2reg-like analysis of its uses. Fortunately most of the infrastructure can be reused from the DAG world. llvm-svn: 361608
* AMDGPU/GlobalISel: Legality for integer min/maxMatt Arsenault2019-05-231-1/+13
| | | | llvm-svn: 361519
* GlobalISel: Implement lower for S64->S32 [SU]ITOFPMatt Arsenault2019-05-171-0/+121
| | | | | | | | | | | | | This is ported from the custom AMDGPU DAG implementation. I think this is a better default expansion than what the DAG currently uses, at least if the target has CTLZ. This implements the signed version in terms of the unsigned conversion, which is implemented with bit operations. SelectionDAG has several other implementations that should eventually be ported depending on what instructions are legal. llvm-svn: 361081
* GlobalISel: Define integer min/max instructionsMatt Arsenault2019-05-171-1/+5
| | | | | | | Doesn't attempt to emit them for anything yet, but some legalizations I want to port use them. llvm-svn: 361061
* AMDGPU/GlobalISel: Legalize G_FCOPYSIGNMatt Arsenault2019-05-171-0/+1
| | | | llvm-svn: 361025
* [GlobalISel] Fix -Wsign-compare on 32-bit -DLLVM_ENABLE_ASSERTIONS=on buildsFangrui Song2019-05-171-1/+2
| | | | llvm-svn: 360989
* GlobalISel: Add DstOp version of buildIntrinsicMatt Arsenault2019-05-161-0/+12
| | | | llvm-svn: 360879
* GlobalISel: Add buildFConstant for APFloatMatt Arsenault2019-05-161-0/+7
| | | | llvm-svn: 360853
* GlobalISel: Fix indentationMatt Arsenault2019-05-161-1/+1
| | | | llvm-svn: 360851
* GlobalISel: Add G_FCOPYSIGNMatt Arsenault2019-05-161-0/+2
| | | | llvm-svn: 360850
* [IRTranslator] Don't hardcode GEP index typeDiana Picus2019-05-141-2/+8
| | | | | | | | | | | | | | | | | | When breaking up loads and stores of aggregates, the IRTranslator uses LLT::scalar(64) for the index type of the G_GEP instructions that compute the addresses. This is unnecessarily large for 32-bit targets. Use the int ptr type provided by the DataLayout instead. Note that we're already doing the right thing when translating getelementptr instructions from the IR. This is just an oversight when generating new ones while translating loads/stores. Both x86 and AArch64 already have tests confirming that the old behaviour is preserved for 64-bit targets. Differential Revision: https://reviews.llvm.org/D61852 llvm-svn: 360656
* [IRTranslator] Use the alloc size instead of the store size when translating ↵Quentin Colombet2019-05-031-1/+1
| | | | | | | | | | | | | | allocas We use to incorrectly use the store size instead of the alloc size when creating the stack slot for allocas. On aarch64 this can be demonstrated by allocating weirdly sized types. For instance, in the added test case, we use an alloca for i19. We used to allocate a slot of size 24-bit (19 rounded up to the next byte), whereas we really want to use a full 32-bit slot for this type. llvm-svn: 359856
* [globalisel] Improve Legalizer debug outputDaniel Sanders2019-04-292-6/+62
| | | | | | | | | | * LegalizeAction should be printed by name rather than number * Newly created instructions are incomplete at the point the observer first sees them. They are therefore recorded in a small vector and printed just before the legalizer moves on to another instruction. By this point, the instruction must be complete. llvm-svn: 359481
* [GlobalISel] Fix inserting copies in the right position for reg definitionsMarcello Maggioni2019-04-262-12/+38
| | | | | | | | | | | | | When constrainRegClass is called if the constraining happens on a use the COPY needs to be inserted before the instruction that contains the MachineOperand, but if we are constraining a definition it actually needs to be added after the instruction. In addition, the COPY needs to have its operands flipped (in the use case we are copying from the old unconstrained register to the new constrained register, while in the definition case we are copying from the new constrained register that the instruction defines to the old unconstrained register). llvm-svn: 359282
* [GlobalISel][AArch64] Legalize G_FNEARBYINTJessica Paquette2019-04-251-0/+2
| | | | | | | | | Add legalizer support for G_FNEARBYINT. It's the same as G_FCEIL etc. Since the importer allows us to automatically select this after legalization, also add tests for selection etc. Also update arm64-vfloatintrinsics.ll. llvm-svn: 359204
* [GlobalISel] Add IRTranslator support for G_FNEARBYINTJessica Paquette2019-04-251-0/+2
| | | | | | | | | Translate llvm.nearbyint into G_FNEARBYINT as a simple intrinsic. Update arm64-irtranslator.ll. Differential Revision: https://reviews.llvm.org/D60922 llvm-svn: 359203
* Add "const" in GetUnderlyingObjects. NFCBjorn Pettersson2019-04-241-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Both the input Value pointer and the returned Value pointers in GetUnderlyingObjects are now declared as const. It turned out that all current (in-tree) uses of GetUnderlyingObjects were trivial to update, being satisfied with have those Value pointers declared as const. Actually, in the past several of the users had to use const_cast, just because of ValueTracking not providing a version of GetUnderlyingObjects with "const" Value pointers. With this patch we get rid of those const casts. Reviewers: hfinkel, materi, jkorous Reviewed By: jkorous Subscribers: dexonsmith, jkorous, jholewinski, sdardis, eraman, hiraditya, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61038 llvm-svn: 359072
* [AArch64][GlobalISel] Legalize G_INTRINSIC_ROUNDJessica Paquette2019-04-231-0/+1
| | | | | | | Add it to the same rule as G_FCEIL etc. Add a legalizer test, and add a missing switch case to AArch64LegalizerInfo.cpp. llvm-svn: 359033
* [AArch64][GlobalISel] Legalize G_INTRINSIC_TRUNCJessica Paquette2019-04-231-0/+1
| | | | | | | | | Same patch as G_FCEIL etc. Add the missing switch case in widenScalar, add G_INTRINSIC_TRUNC to the correct rule in AArch64LegalizerInfo.cpp, and add a test. llvm-svn: 359021
* GlobalISel: Legalize scalar G_EXTRACT sourcesMatt Arsenault2019-04-221-0/+7
| | | | llvm-svn: 358892
* Revert r358800. Breaks Obsequi from the test suite.Amara Emerson2019-04-201-95/+4
| | | | | | | The last attempt fixed gcc and consumer-typeset, but Obsequi seems to fail with a different issue. llvm-svn: 358829
* Revert "Revert "[GlobalISel] Add legalization support for non-power-2 loads ↵Amara Emerson2019-04-191-4/+95
| | | | | | | | | and stores"" We were shifting the wrong component of a split load when trying to combine them back into a single value. llvm-svn: 358800
* [GlobalISel][AArch64] Legalize + select G_FRINTJessica Paquette2019-04-191-0/+2
| | | | | | | | | | Exactly the same as G_FCEIL, G_FABS, etc. Add tests for the fp16/nofp16 behaviour, update arm64-vfloatintrinsics, etc. Differential Revision: https://reviews.llvm.org/D60895 llvm-svn: 358799
* [GlobalISel] Add IRTranslator support for G_FRINTJessica Paquette2019-04-191-0/+2
| | | | | | | | Add it as a simple intrinsic, update arm64-irtranslator.ll. Differential Revision: https://reviews.llvm.org/D60893 llvm-svn: 358787
* Revert "[GlobalISel] Add legalization support for non-power-2 loads and stores"Amara Emerson2019-04-191-95/+4
| | | | | | This introduces some runtime failures which I'll need to investigate further. llvm-svn: 358771
* [GlobalISel][AArch64] Legalize vector G_FPOWJessica Paquette2019-04-191-0/+1
| | | | | | | | | | This instruction is legalized in the same way as G_FSIN, G_FCOS, G_FLOG10, etc. Update legalize-pow.mir and arm64-vfloatintrinsics.ll to reflect the change. Differential Revision: https://reviews.llvm.org/D60218 llvm-svn: 358764
* [NFC] FMF propagation for GlobalIselMichael Berg2019-04-182-8/+8
| | | | llvm-svn: 358702
* [GISel]:IRTranslator: Prefer a buidInstr form that allows CSE of cast ↵Aditya Nandakumar2019-04-181-1/+1
| | | | | | | | | | instructions https://reviews.llvm.org/D60844 Use the style of buildInstr that allows CSEing. llvm-svn: 358637
* Add a getSizeInBits() accessor to MachineMemOperand. NFC.Amara Emerson2019-04-171-5/+5
| | | | | | | | Cleans up a bunch of places where we do getSize() * 8. Differential Revision: https://reviews.llvm.org/D60799 llvm-svn: 358617
* [GlobalISel] Add legalization support for non-power-2 loads and storesAmara Emerson2019-04-171-3/+94
| | | | | | | | | | Legalize things like i24 load/store by splitting them into smaller power of 2 operations. This matches how SelectionDAG handles these operations. Differential Revision: https://reviews.llvm.org/D59971 llvm-svn: 358613
OpenPOWER on IntegriCloud