summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/GlobalISel
Commit message (Collapse)AuthorAgeFilesLines
* GlobalISel: Try to widen merges with other mergesMatt Arsenault2019-07-011-2/+28
| | | | | | | | If the requested source type an be used as a merge source type, create a merge of merges. This avoids creating large, illegal extensions and bit-ops directly to the result type. llvm-svn: 364841
* [GlobalISel]: Allow backends to custom legalize IntrinsicsAditya Nandakumar2019-07-012-0/+10
| | | | | | | | | https://reviews.llvm.org/D31359 Add a hook "legalizeInstrinsic" to allow backends to override this and custom lower/legalize intrinsics. llvm-svn: 364821
* GlobalISel: Implement lower for min/maxMatt Arsenault2019-07-011-0/+36
| | | | llvm-svn: 364816
* Fixup r364512Diana Picus2019-07-011-10/+12
| | | | | | | | | | Fix stack-use-after-scope errors from r364512. One instance was already fixed in r364611 - this patch simplifies that fix and addresses one more instance of similar code. Discussed in: https://reviews.llvm.org/D63905 llvm-svn: 364778
* Cleanup: llvm::bsearch -> llvm::partition_point after r364719Fangrui Song2019-06-301-5/+4
| | | | llvm-svn: 364720
* GlobalISel: Use RegisterMatt Arsenault2019-06-283-39/+39
| | | | llvm-svn: 364618
* GlobalISel: Convert rest of MachineIRBuilder to using RegisterMatt Arsenault2019-06-281-50/+50
| | | | llvm-svn: 364615
* [GlobalISel][IRTranslator] Fix some PHI bugs related to jump tables when ↵Amara Emerson2019-06-271-14/+26
| | | | | | | | | | | | optimizations are used. The new switch lowering code that tries to generate jump tables and range checks were tested at -O0 on arm64, but on -O3 the generic switch lowering code goes to town on trying to generate optimized lowerings, e.g. multiple jump tables, range checks etc. This exposed bugs in the way PHI nodes are handled because the CFG looks even stranger after all of this is done. llvm-svn: 364613
* Fix ASAN error caused by commit r364512.Rumeet Dhindsa2019-06-271-4/+6
| | | | | | | | | This patch intends to fix ASAN stack-use-after-scope error. This is at least a short-term fix to unbreak LLVM's mainline. Differential Revision: https://reviews.llvm.org/D63905 llvm-svn: 364611
* [GlobalISel] Remove [un]packRegs from IRTranslatorDiana Picus2019-06-271-29/+4
| | | | | | | | | | | Remove the last use of packRegs from IRTranslator and delete pack/unpackRegs. This introduces a fallback to DAGISel for intrinsics with aggregate arguments, since we don't have a testcase for them so it's hard to tell how we'd want to handle them. Discussed in https://reviews.llvm.org/D63551 llvm-svn: 364514
* [GlobalISel] Accept multiple vregs for lowerCall's argsDiana Picus2019-06-272-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | Change the interface of CallLowering::lowerCall to accept several virtual registers for each argument, instead of just one. This is a follow-up to D46018. CallLowering::lowerReturn was similarly refactored in D49660 and lowerFormalArguments in D63549. With this change, we no longer pack the virtual registers generated for aggregates into one big lump before delegating to the target. Therefore, the target can decide itself whether it wants to handle them as separate pieces or use one big register. ARM and AArch64 have been updated to use the passed in virtual registers directly, which means we no longer need to generate so many merge/extract instructions. NFCI for AMDGPU, Mips and X86. Differential Revision: https://reviews.llvm.org/D63551 llvm-svn: 364512
* [GlobalISel] Accept multiple vregs for lowerCall's resultDiana Picus2019-06-272-14/+7
| | | | | | | | | | | | | | | | | | | | | | | | Change the interface of CallLowering::lowerCall to accept several virtual registers for the call result, instead of just one. This is a follow-up to D46018. CallLowering::lowerReturn was similarly refactored in D49660 and lowerFormalArguments in D63549. With this change, we no longer pack the virtual registers generated for aggregates into one big lump before delegating to the target. Therefore, the target can decide itself whether it wants to handle them as separate pieces or use one big register. ARM and AArch64 have been updated to use the passed in virtual registers directly, which means we no longer need to generate so many merge/extract instructions. NFCI for AMDGPU, Mips and X86. Differential Revision: https://reviews.llvm.org/D63550 llvm-svn: 364511
* [GlobalISel] Accept multiple vregs in lowerFormalArgsDiana Picus2019-06-272-20/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change the interface of CallLowering::lowerFormalArguments to accept several virtual registers for each formal argument, instead of just one. This is a follow-up to D46018. CallLowering::lowerReturn was similarly refactored in D49660. lowerCall will be refactored in the same way in follow-up patches. With this change, we forward the virtual registers generated for aggregates to CallLowering. Therefore, the target can decide itself whether it wants to handle them as separate pieces or use one big register. We also copy the pack/unpackRegs helpers to CallLowering to facilitate this. ARM and AArch64 have been updated to use the passed in virtual registers directly, which means we no longer need to generate so many merge/extract instructions. AArch64 seems to have had a bug when lowering e.g. [1 x i8*], which was put into a s64 instead of a p0. Added a test-case which illustrates the problem more clearly (it crashes without this patch) and fixed the existing test-case to expect p0. AMDGPU has been updated to unpack into the virtual registers for kernels. I think the other code paths fall back for aggregates, so this should be NFC. Mips doesn't support aggregates yet, so it's also NFC. x86 seems to have code for dealing with aggregates, but I couldn't find the tests for it, so I just added a fallback to DAGISel if we get more than one virtual register for an argument. Differential Revision: https://reviews.llvm.org/D63549 llvm-svn: 364510
* [GlobalISel] Allow multiple VRegs in ArgInfo. NFCDiana Picus2019-06-272-5/+11
| | | | | | | | | | | Allow CallLowering::ArgInfo to contain more than one virtual register. This is useful when passes split aggregates into several virtual registers, but need to also provide information about the original type to the call lowering. Used in follow-up patches. Differential Revision: https://reviews.llvm.org/D63548 llvm-svn: 364509
* GlobalISel: Remove unsigned variant of SrcOpMatt Arsenault2019-06-244-217/+217
| | | | | | | | | Force using Register. One downside is the generated register enums require explicit conversion. llvm-svn: 364194
* CodeGen: Introduce a class for registersMatt Arsenault2019-06-244-105/+105
| | | | | | | | | Avoids using a plain unsigned for registers throughoug codegen. Doesn't attempt to change every register use, just something a little more than the set needed to build after changing the return type of MachineOperand::getReg(). llvm-svn: 364191
* Make GlobalISel depend on SelectionDAG after D63169Fangrui Song2019-06-221-1/+1
| | | | | | | | | | | GlobalISel/IRTranslator.cpp now references SelectionDAG/FunctionLoweringInfo.cpp. This fixes a link error in -DBUILD_SHARED_LIBS=on builds: ld.lld: error: undefined symbol: llvm::FunctionLoweringInfo::clear() >>> referenced by IRTranslator.cpp:2198 (../lib/CodeGen/GlobalISel/IRTranslator.cpp:2198) >>> lib/CodeGen/GlobalISel/CMakeFiles/LLVMGlobalISel.dir/IRTranslator.cpp.o:(llvm::IRTranslator::finalizeFunction()) llvm-svn: 364124
* [GlobalISel][IRTranslator] Change switch table translation to generate jump ↵Amara Emerson2019-06-211-54/+440
| | | | | | | | | | | | | | | | | | | | | | | | | | tables and range checks. This change makes use of the newly refactored SwitchLoweringUtils code from SelectionDAG to in order to generate jump tables and range checks where appropriate. Much of this code is ported from SDAG with some modifications. We generate G_JUMP_TABLE and G_BRJT instructions when JT opportunities are found. This means that targets which previously relied on the naive one MBB per case stmt translation will now start falling back until they add support for the new opcodes. For range checks, we don't generate any previously unused operations. This just recognizes contiguous ranges of case values and generates a single block per range. Single case value blocks are just a special case of ranges so we get that support almost for free. There are still some optimizations missing that I haven't ported over, and bit-tests are also unimplemented. This patch series is already complex enough. Actual arm64 support for selection of jump tables is coming in a later patch. Differential Revision: https://reviews.llvm.org/D63169 llvm-svn: 364085
* [GlobalISel][Localizer] Allow localization of G_INTTOPTR and chains of ↵Amara Emerson2019-06-211-14/+15
| | | | | | | | | | | | | | | | | | | | instructions. G_INTTOPTR can prevent the localizer from moving G_CONSTANTs, but since it's essentially a side effect free cast instruction we can remat both instructions. This patch changes the localizer to enable localization of the chains by iterating over the entry block instructions in reverse order. That way, uses will localized first, and then the defs are free to be localized as well. This also changes the previous SmallPtrSet of localized instructions to use a SetVector instead. We're dealing with pointers and need deterministic iteration order. Overall, this change improves ARM64 -O0 CTMark code size by around 0.7% geomean. Differential Revision: https://reviews.llvm.org/D63630 llvm-svn: 364001
* [MIPS GlobalISel] Select integer to floating point conversionsPetar Avramovic2019-06-201-2/+2
| | | | | | | | Select G_SITOFP and G_UITOFP for MIPS32. Differential Revision: https://reviews.llvm.org/D63542 llvm-svn: 363912
* [MIPS GlobalISel] Select floating point to integer conversionsPetar Avramovic2019-06-201-2/+3
| | | | | | | | Select G_FPTOSI and G_FPTOUI for MIPS32. Differential Revision: https://reviews.llvm.org/D63541 llvm-svn: 363911
* [GlobalISel][Localizer] Remove redundant set lookup.Amara Emerson2019-06-181-1/+1
| | | | | | | After changing the algorithm to only process the entry block we never revisit a processed instruction. llvm-svn: 363745
* GlobalISel: Remove redundant pass initializationTom Stellard2019-06-185-13/+4
| | | | | | | | | | | | | | | | | | | Summary: All the GlobalISel passes are initialized when the target calls initializeGlobalISel(), so we don't need to call the initializers from the pass constructors. Reviewers: qcolombet, t.p.northover, paquette, dsanders, aemerson, aditya_nandakumar Reviewed By: aemerson Subscribers: rovka, kristof.beyls, hiraditya, volkan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63235 llvm-svn: 363642
* GlobalISel: Use the original flags when lowering fneg to fsubMatt Arsenault2019-06-171-2/+1
| | | | | | | | | | This was ignoring the flag on fneg, and using the source instruction's flags. Also fixes tests missing from r358702. Note the expansion itself isn't correct without nnan, but that should be fixed separately. llvm-svn: 363637
* [GlobalISel][Localizer] Rewrite localizer to run in 2 phases, inter & intra ↵Amara Emerson2019-06-171-61/+158
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | block. Inter-block localization is the same as what currently happens, except now it only runs on the entry block because that's where the problematic constants with long live ranges come from. The second phase is a new intra-block localization phase which attempts to re-sink the already localized instructions further right before one of the multiple uses. One additional change is to also localize G_GLOBAL_VALUE as they're constants too. However, on some targets like arm64 it takes multiple instructions to materialize the value, so some additional heuristics with a TTI hook have been introduced attempt to prevent code size regressions when localizing these. Overall, these changes improve CTMark code size on arm64 by 1.2%. Full code size results: Program baseline new diff ------------------------------------------------------------------------------ test-suite...-typeset/consumer-typeset.test 1249984 1217216 -2.6% test-suite...:: CTMark/ClamAV/clamscan.test 1264928 1232152 -2.6% test-suite :: CTMark/SPASS/SPASS.test 1394092 1361316 -2.4% test-suite...Mark/mafft/pairlocalalign.test 731320 714928 -2.2% test-suite :: CTMark/lencod/lencod.test 1340592 1324200 -1.2% test-suite :: CTMark/kimwitu++/kc.test 3853512 3820420 -0.9% test-suite :: CTMark/Bullet/bullet.test 3406036 3389652 -0.5% test-suite...ark/tramp3d-v4/tramp3d-v4.test 8017000 8016992 -0.0% test-suite...TMark/7zip/7zip-benchmark.test 2856588 2856588 0.0% test-suite...:: CTMark/sqlite3/sqlite3.test 765704 765704 0.0% Geomean difference -1.2% Differential Revision: https://reviews.llvm.org/D63303 llvm-svn: 363632
* Propagate fmf in IRTranslate for fnegMichael Berg2019-06-171-8/+17
| | | | | | | | | | | | | | Summary: This case is related to D63405 in that we need to be propagating FMF on negates. Reviewers: volkan, spatel, arsenm Reviewed By: arsenm Subscribers: wdng, javed.absar Differential Revision: https://reviews.llvm.org/D63458 llvm-svn: 363631
* [globalisel] Fix iterator invalidation in the extload combinesDaniel Sanders2019-06-171-54/+38
| | | | | | | | | | | | | | | | | | | | | | | | Summary: Change the way we deal with iterator invalidation in the extload combines as it was still possible to neglect to visit a use. Even worse, it happened in the in-tree test cases and the checks weren't good enough to detect it. We now take a cheap copy of the use list before iterating over it. This prevents iterator invalidation from occurring and has the nice side effect of making the existing schedule-for-erase/schedule-for-insert mechanism moot. Reviewers: aditya_nandakumar Reviewed By: aditya_nandakumar Subscribers: rovka, kristof.beyls, javed.absar, volkan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61813 llvm-svn: 363616
* GlobalISel: Ignore callsite attributes when picking intrinsic typeMatt Arsenault2019-06-171-1/+3
| | | | | | | | | | | A target intrinsic may be defined as possibly reading memory, but the call site may have additional knowledge that it doesn't read memory. The intrinsic lowering will expect the pessimistic assumption of the intrinsic definition, so the chain should still be used. I fixed the same bug in SelectionDAG in r287593. llvm-svn: 363580
* Reapply "GlobalISel: Avoid producing Illegal copies in RegBankSelect"Matt Arsenault2019-06-151-4/+26
| | | | | | | This reapplies r363410, avoiding null dereference if there is no AltRegBank. llvm-svn: 363478
* Revert "GlobalISel: Avoid producing Illegal copies in RegBankSelect"Mitch Phillips2019-06-141-25/+4
| | | | | | | | | | | This patch breaks UBSan build bots. See https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild for a guide as to how to reproduce the error. This reverts commit c2864c0de07efb5451d32d27a7d4ff2984830929. This reverts rL363410. llvm-svn: 363476
* [GlobalISel] Add a G_BRJT opcode.Amara Emerson2019-06-141-0/+11
| | | | | | | | | | | | | This is a branch opcode that takes a jump table pointer, jump table index and an index into the table to do an indirect branch. We pass both the table pointer and JTI to allow targets like ARM64 to more easily use the existing jump table compression optimization without having to walk up the block to find a paired G_JUMP_TABLE. Differential Revision: https://reviews.llvm.org/D63159 llvm-svn: 363434
* GlobalISel: Avoid producing Illegal copies in RegBankSelectMatt Arsenault2019-06-141-4/+25
| | | | | | | | | | | | | | | | | | | | | | Avoid producing illegal register bank copies for reg_sequence and phi. The default implementation assumes it is possible to pick any operand's bank and use that for the result, introducing a copy for operands with a different bank. This does not check for illegal copies. It is not legal to introduce a VGPR->SGPR copy, so any VGPR operand requires the result to be a VGPR. The changes in getInstrMappingImpl aren't strictly necessary, since AMDGPU now just bypasses this for reg_sequence/phi. This could be replaced with an assert in case other targets run into this. It is currently responsible for producing the error for unsatisfiable copies, but this will be better served with a verifier check. For phis, for now assume any undetermined operands must be VGPRs. Eventually, this needs to be able to defer mapping these operations. This also does not yet have a way to check for whether the block is in a divergent region. llvm-svn: 363410
* RegBankSelect: Remove checks for invalid mappingsMatt Arsenault2019-06-141-5/+2
| | | | | | | | | Avoid a check for valid and a set of redundant asserts. The place InstructionMapping is constructed asserts all of the default fields are passed anyway for an invalid mapping, so don't overcomplicate this. llvm-svn: 363391
* [GlobalISel][IRTranslator] Add debug loc with line 0 to constants emitted ↵Amara Emerson2019-06-131-3/+14
| | | | | | | | | | | | | | | into the entry block. Constants, including G_GLOBAL_VALUE, are all emitted into the entry block which lets us use the vreg def assuming it dominates all other users. However, it can cause jumpy debug behaviour since the DebugLoc attached to these MIs are from a user instruction that could be in a different block. Fixes PR40887. Differential Revision: https://reviews.llvm.org/D63286 llvm-svn: 363331
* [GlobalISel] Add a G_JUMP_TABLE opcode.Amara Emerson2019-06-111-0/+6
| | | | | | | | | | | | This opcode generates a pointer to the address of the jump table specified by the source operand, which is a jump table index. It will be used in conjunction with an upcoming G_BRJT opcode to support jump table codegen with GlobalISel. Differential Revision: https://reviews.llvm.org/D63111 llvm-svn: 363096
* [GlobalISel] Translate memset/memmove/memcpy from undef ptrs into nopsJessica Paquette2019-06-101-0/+13
| | | | | | | | | | | | | If the source is undef, then just don't do anything. This matches SelectionDAG's behaviour in SelectionDAG.cpp. Also add a test showing that we do the right thing here. (irtranslator-memfunc-undef.ll) Differential Revision: https://reviews.llvm.org/D63095 llvm-svn: 362989
* [GlobalISel] IRTranslator: Translate the intrinsics ignored by CodeGenVolkan Keles2019-06-071-0/+5
| | | | | | | | | | | | | | | | | | Summary: Translate `llvm.assume`, `llvm.var.annotation` and `llvm.sideeffect` to nothing as they have no effect on CodeGen. Reviewers: qcolombet, aditya_nandakumar, dsanders, paquette, aemerson, arsenm Reviewed By: arsenm Subscribers: hiraditya, wdng, rovka, kristof.beyls, javed.absar, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63022 llvm-svn: 362834
* [MIPS GlobalISel] Select floor and ceilPetar Avramovic2019-06-061-1/+9
| | | | | | | | Select G_FFLOOR and G_FCEIL for MIPS32. Differential Revision: https://reviews.llvm.org/D62901 llvm-svn: 362688
* Allow target to handle STRICT floating-point nodesUlrich Weigand2019-06-051-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ISD::STRICT_ nodes used to implement the constrained floating-point intrinsics are currently never passed to the target back-end, which makes it impossible to handle them correctly (e.g. mark instructions are depending on a floating-point status and control register, or mark instructions as possibly trapping). This patch allows the target to use setOperationAction to switch the action on ISD::STRICT_ nodes to Legal. If this is done, the SelectionDAG common code will stop converting the STRICT nodes to regular floating-point nodes, but instead pass the STRICT nodes to the target using normal SelectionDAG matching rules. To avoid having the back-end duplicate all the floating-point instruction patterns to handle both strict and non-strict variants, we make the MI codegen explicitly aware of the floating-point exceptions by introducing two new concepts: - A new MCID flag "mayRaiseFPException" that the target should set on any instruction that possibly can raise FP exception according to the architecture definition. - A new MI flag FPExcept that CodeGen/SelectionDAG will set on any MI instruction resulting from expansion of any constrained FP intrinsic. Any MI instruction that is *both* marked as mayRaiseFPException *and* FPExcept then needs to be considered as raising exceptions by MI-level codegen (e.g. scheduling). Setting those two new flags is straightforward. The mayRaiseFPException flag is simply set via TableGen by marking all relevant instruction patterns in the .td files. The FPExcept flag is set in SDNodeFlags when creating the STRICT_ nodes in the SelectionDAG, and gets inherited in the MachineSDNode nodes created from it during instruction selection. The flag is then transfered to an MIFlag when creating the MI from the MachineSDNode. This is handled just like fast-math flags like no-nans are handled today. This patch includes both common code changes required to implement the new features, and the SystemZ implementation. Reviewed By: andrew.w.kaylor Differential Revision: https://reviews.llvm.org/D55506 llvm-svn: 362663
* Reapply: IR: add optional type to 'byval' function parametersTim Northover2019-05-301-1/+4
| | | | | | | | | | | | | | | | | When we switch to opaque pointer types we will need some way to describe how many bytes a 'byval' parameter should occupy on the stack. This adds a (for now) optional extra type parameter. If present, the type must match the pointee type of the argument. The original commit did not remap byval types when linking modules, which broke LTO. This version fixes that. Note to front-end maintainers: if this causes test failures, it's probably because the "byval" attribute is printed after attributes without any parameter after this change. llvm-svn: 362128
* Revert "IR: add optional type to 'byval' function parameters"Tim Northover2019-05-291-4/+1
| | | | | | | The IRLinker doesn't delve into the new byval attribute when mapping types, and this breaks LTO. llvm-svn: 362029
* IR: add optional type to 'byval' function parametersTim Northover2019-05-291-1/+4
| | | | | | | | | | | | | | When we switch to opaque pointer types we will need some way to describe how many bytes a 'byval' parameter should occupy on the stack. This adds a (for now) optional extra type parameter. If present, the type must match the pointee type of the argument. Note to front-end maintainers: if this causes test failures, it's probably because the "byval" attribute is printed after attributes without any parameter after this change. llvm-svn: 362012
* Revert "[X86] Use 'llvm_unreachable' instead of nullptr in unreachable code to"Pengfei Wang2019-05-291-3/+1
| | | | | | This reverts commit c1b3716614bc0a107e6f41a7d3d503baefad8a5b. llvm-svn: 361918
* [X86] Use 'llvm_unreachable' instead of nullptr in unreachable code toPengfei Wang2019-05-291-1/+3
| | | | | | | | | | | | | | | | | avoid static check fail RegClassOrBank is an object of RegClassOrRegBank, which is defined as using llvm::RegClassOrRegBank = typedef PointerUnion<const TargetRegisterClass *, const RegisterBank *> so control flow can not get here. Use ""llvm_unreachable" here to avoid "null pointer" confusion. Patch by Shengchen Kan (skan) Differential Revision: https://reviews.llvm.org/D62006 Signed-off-by: pengfei <pengfei.wang@intel.com> llvm-svn: 361912
* GlobalISel: support swifterror attribute on AArch64.Tim Northover2019-05-242-17/+86
| | | | | | | | swifterror marks an argument as a register pretending to be a pointer, so we need a guaranteed mem2reg-like analysis of its uses. Fortunately most of the infrastructure can be reused from the DAG world. llvm-svn: 361608
* AMDGPU/GlobalISel: Legality for integer min/maxMatt Arsenault2019-05-231-1/+13
| | | | llvm-svn: 361519
* GlobalISel: Implement lower for S64->S32 [SU]ITOFPMatt Arsenault2019-05-171-0/+121
| | | | | | | | | | | | | This is ported from the custom AMDGPU DAG implementation. I think this is a better default expansion than what the DAG currently uses, at least if the target has CTLZ. This implements the signed version in terms of the unsigned conversion, which is implemented with bit operations. SelectionDAG has several other implementations that should eventually be ported depending on what instructions are legal. llvm-svn: 361081
* GlobalISel: Define integer min/max instructionsMatt Arsenault2019-05-171-1/+5
| | | | | | | Doesn't attempt to emit them for anything yet, but some legalizations I want to port use them. llvm-svn: 361061
* AMDGPU/GlobalISel: Legalize G_FCOPYSIGNMatt Arsenault2019-05-171-0/+1
| | | | llvm-svn: 361025
* [GlobalISel] Fix -Wsign-compare on 32-bit -DLLVM_ENABLE_ASSERTIONS=on buildsFangrui Song2019-05-171-1/+2
| | | | llvm-svn: 360989
OpenPOWER on IntegriCloud