path: root/llvm/lib
Commit message | Author | Age | Files | Lines
...
* [MC] Add parameter `Address` to MCInstPrinter::printInstruction | Fangrui Song | 2020-01-06 | 38 | -51/+51
  Follow-up of D72172.
  Reviewed By: jhenderson, rnk
  Differential Revision: https://reviews.llvm.org/D72180
* [MC] Add parameter `Address` to MCInstPrinter::printInst | Fangrui Song | 2020-01-06 | 43 | -99/+119
  printInst prints a branch/call instruction as `b offset` (there are many variants on various targets) instead of `b address`.

  It is a convention to use address instead of offset in most external symbolizers/disassemblers. This difference makes `llvm-objdump -d` output unsatisfactory.

  Add `uint64_t Address` to printInst(), so that it can pass the argument to printInstruction(). `raw_ostream &OS` is moved to the last position to be consistent with other print* methods.

  The next step is to pass `Address` to printInstruction() (generated by tablegen from the instruction set description). We can gradually migrate targets to print addresses instead of offsets.

  In any case, downstream projects which don't know `Address` can pass 0 as the argument.

  Reviewed By: jhenderson
  Differential Revision: https://reviews.llvm.org/D72172
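The difference between the `b offset` and `b address` forms can be sketched outside LLVM. The helper below is hypothetical (`formatBranch` is not an LLVM function) and assumes the offset is relative to the address of the branch instruction itself; real targets differ on that base, so treat it as an illustration only.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper, not LLVM API: renders a PC-relative branch either as
// a raw offset (the old output) or as an absolute target (Address + Offset).
// Assumption: the offset is relative to the address of the branch itself.
std::string formatBranch(uint64_t Address, int64_t Offset, bool PrintAddress) {
  std::ostringstream OS;
  OS << "b ";
  if (PrintAddress) {
    // Unsigned addition wraps around, so negative offsets (backward
    // branches) resolve to the expected lower address.
    OS << "0x" << std::hex << (Address + static_cast<uint64_t>(Offset));
  } else {
    // Old style: print the signed offset in decimal.
    OS << Offset;
  }
  return OS.str();
}
```

A caller that has no meaningful address can pass 0, in which case the printed target is simply the offset again, which mirrors the advice for downstream projects in the commit message.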
* AMDGPU/GlobalISel: Fix unused variable warning in release | Matt Arsenault | 2020-01-06 | 1 | -2/+1
* AMDGPU: Select llvm.amdgcn.interp.p2.f16 directly | Matt Arsenault | 2020-01-06 | 2 | -26/+12
  This will enable automatic GlobalISel support in a future commit.
* AMDGPU: Use default operands for clamp/omod | Matt Arsenault | 2020-01-06 | 1 | -10/+28
  We have a lot of complex pattern variants that just set the source modifiers that are really handled, and then set the output modifiers to 0. We're unlikely to ever match output modifiers from the use instruction side, and we already match clamp/omod in a separate pass.
* [WebAssembly] Fix landingpad-only case in Emscripten EH | Heejin Ahn | 2020-01-06 | 1 | -1/+1
  Summary:
  Previously we didn't set `Changed` to true when there were only landing pads but no invokes. This fixes it, and we now set `Changed` to true whenever we have landing pads. (There can't be invokes without landing pads, so that case is covered too.) The test case for this has to be a separate file because this pass is a `ModulePass` and `Changed` is computed based on the whole module.

  Reviewers: tlively
  Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D72308
* AMDGPU/GlobalISel: Legalize G_READCYCLECOUNTER | Matt Arsenault | 2020-01-06 | 2 | -1/+5
* Add Triple::isX86() | Fangrui Song | 2020-01-06 | 2 | -5/+2
  Reviewed By: craig.topper, skan
  Differential Revision: https://reviews.llvm.org/D72247
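The kind of predicate this commit adds can be sketched with a stand-in class. `TripleSketch` below is hypothetical and only mimics the relevant corner of `llvm::Triple`; the assumption is that `isX86()` covers both the 32-bit and 64-bit architectures, replacing a two-way comparison repeated at call sites.

```cpp
// Stand-in for llvm::Triple; only the pieces needed for the illustration.
class TripleSketch {
public:
  enum ArchType { UnknownArch, aarch64, x86, x86_64 };

  explicit TripleSketch(ArchType A) : Arch(A) {}

  ArchType getArch() const { return Arch; }

  // One predicate replaces the repeated
  // `getArch() == x86 || getArch() == x86_64` comparison at call sites.
  bool isX86() const { return Arch == x86 || Arch == x86_64; }

private:
  ArchType Arch;
};
```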
* AMDGPU/GlobalISel: Select G_UADDE/G_USUBE | Matt Arsenault | 2020-01-06 | 2 | -11/+31
* AMDGPU/GlobalISel: Replace handling of boolean values | Matt Arsenault | 2020-01-06 | 8 | -249/+427
  This solves selection failures with generated selection patterns, which would fail due to inferring the SGPR reg bank for virtual registers with a set register class instead of the VCC bank. Selection of the use instruction would constrain the virtual register to a specific class, so when the def was selected later the bank was no longer set to VCC.

  Remove the SCC reg bank. SCC isn't directly addressable, so it requires copying from SCC to an allocatable 32-bit register during selection, so these might as well be treated as 32-bit SGPR values.

  Now any scalar boolean value that will produce an output in SCC should be widened during RegBankSelect to s32. Any s1 value should be a vector boolean during selection. This makes the vcc register bank unambiguous with a normal SGPR during selection.

  Summary of how this should now work:
  - G_TRUNC is always a no-op, and never should use a vcc bank result.
  - SALU boolean operations should be promoted to s32 in RegBankSelect apply mapping.
  - An s1 value means vcc bank at selection. The exception is for legalization artifacts that use s1, which are never VCC. All other contexts should infer the VCC register classes for s1 typed registers. The LLT for the register is now needed to infer the correct register class. Extensions with vcc sources should be legalized to a select of constants during RegBankSelect.
  - Copy from non-vcc to vcc ensures high bits of the input value are cleared during selection.
  - SALU boolean inputs should ensure the inputs are 0/1. This includes select, conditional branches, and carry-ins.

  There are a few somewhat dirty details. One is that G_TRUNC/G_*EXT selection ignores the usual register-bank from register class functions, and can't handle truncates with VCC result banks. I think this is OK, since the artifacts are specially treated anyway. This does require some care to avoid producing cases with vcc. There will also be no 100% reliable way to verify this rule is followed in selection in case of register classes, and violations manifest themselves as invalid copy instructions much later.

  Standard phi handling also only considers the bank of the result register, and doesn't insert copies to make the source banks match. This doesn't work for vcc, so we have to manually correct phi inputs in this case. We should add a verifier check to make sure there are no phis with mixed vcc and non-vcc register bank inputs.

  There's also some duplication with the LegalizerHelper, and some code which should live in the helper. I don't see a good way to share special knowledge about what types to use for intermediate operations depending on the bank, for example. Using the helper to replace extensions with selects also seems somewhat awkward to me.

  Another issue is there are some contexts calling getRegBankFromRegClass that apparently don't have the LLT type for the register, but I haven't yet run into a real issue from this.

  This also introduces new unnecessary instructions in most cases, since we don't yet try to optimize out the zext when the source is known to come from a compare.
* GlobalISel: Implement lower for G_INTRINSIC_ROUND | Matt Arsenault | 2020-01-06 | 3 | -3/+31
  Mostly copied from AMDGPU lowering implementation, except used G_SITOFP instead of directly creating a select on -1.0, 0.0.
* [X86] Move an enum definition into a header to simplify future patches [NFC] | Philip Reames | 2020-01-06 | 2 | -24/+26
* Don't rely on 'l'(ell) modifiers to indicate a label reference | Bill Wendling | 2020-01-06 | 1 | -19/+16
  Summary:
  It's not necessary to use an 'l'(ell) modifier when referencing a label. Treat block addresses and MBB references as if the modifier is used anyway. This prevents us from generating references to fictitious labels.

  Reviewers: jyknight, nickdesaulniers, hfinkel
  Subscribers: hiraditya, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D71849
* [FileCheck] Remove FileCheck prefix in API | Thomas Preud'homme | 2020-01-06 | 2 | -231/+204
  Summary:
  When FileCheck was made a library, types in the public API were renamed to add a FileCheck prefix, such as Pattern to FileCheckPattern. Many types were moved into a private interface and thus don't need this prefix anymore. This commit removes those unneeded prefixes.

  Reviewers: jhenderson, jdenny, probinson, grimar, arichardson, rnk
  Reviewed By: jhenderson
  Subscribers: hiraditya, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D72186
* [PowerPC][NFC] Rename record instructions to use _rec suffix instead of o | Jinsong Ji | 2020-01-06 | 13 | -1007/+1050
  We use the o suffix to indicate record form instructions (as it is similar to the dot '.' in the mnemonic?). This was fine before, as we did not support XO-form. However, with https://reviews.llvm.org/D66902, we now have XO-form support. It becomes confusing to still use 'o' for record form, and it is weird to have something like 'Oo'.

  This patch renames all 'o' instructions to use '_rec' instead. Also rename `isDot` to `isRecordForm`.

  Reviewed By: #powerpc, hfinkel, nemanjai, steven.zhang, lkail
  Differential Revision: https://reviews.llvm.org/D70758
* GlobalISel: Fix unsupported legalize action | Matt Arsenault | 2020-01-06 | 1 | -0/+5
  This would complain about invalid legalizer rules otherwise. Mark some operations as unsupported for AMDGPU. This currently seems to produce the same legalize error as when no rules are defined, but eventually this should produce a proper user facing error.
* GlobalISel: Correct result type for G_FCMP in lowerFPTOUI | Matt Arsenault | 2020-01-06 | 1 | -1/+3
  Using the final result type doesn't make any sense. Use the natural default boolean type for the select condition.
* GlobalISel: Start adding computeNumSignBits to GISelKnownBits | Matt Arsenault | 2020-01-06 | 1 | -0/+70
* AMDGPU: Fix legalizing f16 fpow | Matt Arsenault | 2020-01-06 | 2 | -0/+2
  The existing test only covered one case for r600. The use of mul_legacy also looks suspicious to me, but leave it for now. The patterns are also not making use of source modifiers.
* AMDGPU: Use ImmLeaf | Matt Arsenault | 2020-01-06 | 1 | -2/+2
  This solves one GlobalISel importer error, but the pattern still fails for another reason.
* AMDGPU: Use ImmLeaf for inline immediate predicates | Matt Arsenault | 2020-01-06 | 6 | -9/+63
* llc/MIR: Fix setFunctionAttributes for MIR functions | Matt Arsenault | 2020-01-06 | 1 | -17/+28
  A random set of attributes is implemented by llc/opt forcing the string attributes on the IR functions before processing anything. This would not happen for MIR functions, which have not yet been created at this point.

  Use a callback in the MIR parser, purely to avoid dealing with the ugliness that the command line flags are in a .inc file, and would require allowing access to these flags from multiple places (either from the MIR parser directly, or a new utility pass to implement these flags). It would probably be better to clean up the flag handling into a separate library.

  This is in preparation for treating more command line flags with a corresponding function attribute in a more uniform way. The fast math flags in particular have a messy system where the command line flag sets the behavior from a function attribute if present, and otherwise the command line flag. This means if any other pass tries to inspect the function attributes directly, it will be inconsistent with the intended behavior. This is also inconsistent with the current behavior of -mcpu and -mattr, which overwrite any pre-existing function attributes. I would like to move this to consistently have the command line flags not overwrite any pre-existing attributes, and to always ensure the command line flags are consistent with the function attributes.
* [X86] Improve v4i32->v4f64 uint_to_fp for AVX1/AVX2 targets. | Craig Topper | 2020-01-06 | 1 | -0/+15
  Use zext+or+fsub to do the conversion. Similar to D71971.
  Differential Revision: https://reviews.llvm.org/D71971
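A scalar model may help show why zext+or+fsub yields an exact unsigned-to-double conversion. The sketch below assumes the sequence is the usual bias trick (OR the zero-extended integer into the mantissa of 2^52, then subtract 2^52) and that `double` is IEEE-754 binary64; the function name is made up, and the actual commit operates on vectors in the X86 backend.

```cpp
#include <cstdint>
#include <cstring>

// Scalar illustration of zext + or + fsub (assumed to be the 2^52 bias trick).
// Assumption: double is IEEE-754 binary64.
double uintToDoubleViaOrSub(uint32_t X) {
  // 0x4330000000000000 is the bit pattern of 2^52. At that exponent one unit
  // in the last place equals 1.0, so the low mantissa bits count integers.
  const uint64_t MagicBits = 0x4330000000000000ULL;

  // zext + or: place the 32-bit value into the low mantissa bits.
  uint64_t Bits = MagicBits | static_cast<uint64_t>(X);

  double Biased, Magic;
  std::memcpy(&Biased, &Bits, sizeof(Biased));     // value is 2^52 + X
  std::memcpy(&Magic, &MagicBits, sizeof(Magic));  // value is 2^52

  // fsub: removing the bias leaves X, exactly representable as a double.
  return Biased - Magic;
}
```

Every step is exact because a 32-bit integer fits comfortably within the 52-bit mantissa, so no rounding occurs in the subtraction.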
* [LegalizeTypes] Add widening support for STRICT_FSETCC/FSETCCS | Craig Topper | 2020-01-06 | 2 | -0/+86
  This patch adds widening which really just scalarizes because we don't have a strategy for the extra elements we would need to pad with.
  Differential Revision: https://reviews.llvm.org/D72193
* Lower TAGPstack with negative offset to SUBG. | Evgenii Stepanov | 2020-01-06 | 2 | -3/+12
  Summary:
  This never really occurs in the current codegen, so only a MIR test is possible.

  Reviewers: ostannard, pcc
  Subscribers: hiraditya, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D72123
* [X86] Fix an 8 bit testb being selected when folding a volatile i32 load pattern. | Amara Emerson | 2020-01-06 | 1 | -0/+11
  Differential Revision: https://reviews.llvm.org/D71581
* [PowerPC][LoopVectorize] Extend getRegisterClassForType to consider double and other floating point type | Jinsong Ji | 2020-01-06 | 1 | -3/+9
  In https://reviews.llvm.org/D67148, we use isFloatTy to test floating point type, otherwise we return GPRRC. So 'double' will be classified as GPRRC, which is not accurate.

  This patch covers other floating point types.

  Reviewed By: #powerpc, nemanjai
  Differential Revision: https://reviews.llvm.org/D71946
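The misclassification described above can be modelled in a few lines. Everything in this sketch is a stand-in: `TypeKind`, `RegClass`, and the two classify functions are hypothetical, and whether the actual patch uses a general floating-point query or enumerates types is not stated in the message.

```cpp
// Stand-ins for llvm::Type queries and the PowerPC register-class enum; the
// names below are illustrative, not the in-tree declarations.
enum class TypeKind { Integer, Half, Float, Double, FP128 };
enum class RegClass { GPRRC, FPRRC };

// Mirrors the narrow check: true only for 32-bit float.
bool isFloatTy(TypeKind T) { return T == TypeKind::Float; }

// Mirrors a broad check: true for every floating point kind.
bool isFloatingPointTy(TypeKind T) { return T != TypeKind::Integer; }

// Behaviour described as the problem: only 'float' is treated as FP, so
// 'double' falls through to the general-purpose class.
RegClass classifyOld(TypeKind T) {
  return isFloatTy(T) ? RegClass::FPRRC : RegClass::GPRRC;
}

// Behaviour after covering the other floating point types.
RegClass classifyNew(TypeKind T) {
  return isFloatingPointTy(T) ? RegClass::FPRRC : RegClass::GPRRC;
}
```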
* [AIX] Use csect reference for function address constants | diggerlin | 2020-01-06 | 1 | -1/+1
  SUMMARY:
  We currently emit a reference for function address constants as labels; for example:

      foo_ptr:
      .long foo

  however, there may be no such label in the case where the function is undefined. Although the label exists when the function is defined, we will (to be consistent) also use a csect reference in that case.

  Address one comment https://reviews.llvm.org/D71144#inline-653255

  Reviewers: daltenty, hubert.reinterpretcast, jasonliu, Xiangling_L
  Subscribers: cebowleratibm, wuzish, nemanjai
  Differential Revision: https://reviews.llvm.org/D71144
* [ARM] Use the correct opcodes for Thumb2 segmented stack frame lowering | David Green | 2020-01-06 | 1 | -2/+4
  The segmented stack lowering code appears to be using ARM opcodes under Thumb2. The MRC opcode will be the same for Thumb and ARM, but t2LDR seems wrong. Either way, using the correct thumb vs arm opcodes is more correct.
  Differential Revision: https://reviews.llvm.org/D72074
* [ARM] Use correct TRAP opcode for thumb in FastISel | David Green | 2020-01-06 | 1 | -2/+6
  We were previously unconditionally using the ARM::TRAP opcode, even under Thumb. My understanding is that these are essentially the same thing (they both result in a trap under Thumb), but the ARM::TRAP opcode is marked as requiring IsARM, so it is more correct to use ARM::tTRAP.
  Differential Revision: https://reviews.llvm.org/D72075
* [AIX] Use csect reference for function address constants | diggerlin | 2020-01-06 | 1 | -0/+22
  SUMMARY:
  We currently emit a reference for function address constants as labels; for example:

      foo_ptr:
      .long foo

  however, there may be no such label in the case where the function is undefined. Although the label exists when the function is defined, we will (to be consistent) also use a csect reference in that case.

  Reviewers: daltenty, hubert.reinterpretcast, jasonliu, Xiangling_L
  Subscribers: cebowleratibm, wuzish, nemanjai
  Differential Revision: https://reviews.llvm.org/D71144
* [AMDGPU] Fix "use of uninitialized variable" static analyzer warning. NFCI. | Simon Pilgrim | 2020-01-06 | 1 | -0/+1
  Add "unreachable" default case to AMDGPUTargetStreamer::getArchNameFromElfMach
* Fix "use of uninitialized variable" static analyzer warnings. NFCI. | Simon Pilgrim | 2020-01-06 | 1 | -0/+2
  Add "unreachable" default cases like we do for the other switch()s in X86MCInstLower::Lower
* Fix "use of uninitialized variable" static analyzer warning. NFCI. | Simon Pilgrim | 2020-01-06 | 1 | -1/+1
* [ARM,MVE] Fix many signedness errors in MVE intrinsics. | Simon Tatham | 2020-01-06 | 1 | -28/+49
  Summary:
  Running an end-to-end test last week I noticed that a lot of the ACLE intrinsics that operate differently on vectors of signed and unsigned integers were ending up generating the signed version of the instruction unconditionally. This is because the IR intrinsics had no way to distinguish signed from unsigned: the LLVM type system just calls them both `v8i16` (or whatever), so you need either separate intrinsics for signed and unsigned, or a flag parameter that tells ISel which one to choose.

  This patch fixes all the problems of that kind that I've noticed, by adding an i32 flag parameter to many of the IR intrinsics which is set to 1 for unsigned (matching the existing practice in cases where we got it right), and conditioning all the isel patterns on that flag. So the fundamental change is in `IntrinsicsARM.td`, changing the low-level IR intrinsics API; there are knock-on changes in `arm_mve.td` (adjusting code gen for the ACLE intrinsics to use the modified API) and in `ARMInstrMVE.td` (adjusting isel to expect the new unsigned flags). The rest of this patch is boringly updating tests.

  Reviewers: dmgreen, miyuki, MarkMurrayARM
  Reviewed By: dmgreen
  Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
  Tags: #clang, #llvm
  Differential Revision: https://reviews.llvm.org/D72270
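Why a sign-agnostic element type needs an explicit flag can be seen with a single 16-bit lane. The function below is a hypothetical scalar model, not an MVE intrinsic: it takes raw lane bits plus a flag that is 1 for unsigned, following the convention described in the commit message, and returns the maximum under the chosen interpretation.

```cpp
#include <cstdint>

// Reinterpret 16 raw bits as a signed value without relying on
// implementation-defined narrowing conversions.
static int32_t asSigned16(uint16_t Bits) {
  return static_cast<int32_t>(Bits ^ 0x8000) - 0x8000;
}

// Hypothetical scalar model of one lane of a vector max. The lane type
// carries no signedness, so IsUnsigned (1 = unsigned, 0 = signed) decides
// which comparison is used.
uint16_t maxLane(uint16_t A, uint16_t B, uint32_t IsUnsigned) {
  if (IsUnsigned == 1)
    return A > B ? A : B;
  return asSigned16(A) > asSigned16(B) ? A : B;
}
```

With lanes holding `0xFFFF` and `0x0001`, the unsigned interpretation picks `0xFFFF` (65535) while the signed one picks `0x0001`, because `0xFFFF` reads as -1. Always emitting the signed form therefore gives wrong results for unsigned intrinsics.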
* [ARM,MVE] Generate the right instruction for vmaxnmq_m_f16. | Simon Tatham | 2020-01-06 | 1 | -2/+2
  Summary:
  Due to a copy-paste error in the isel patterns, the predicated version of this intrinsic was expanding to the `VMAXNMT.F32` instruction instead of `VMAXNMT.F16`. Similarly for vminnm.

  Reviewers: dmgreen, miyuki, MarkMurrayARM
  Reviewed By: dmgreen
  Subscribers: kristof.beyls, hiraditya, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D72269
* AMDGPU/GlobalISel: Select scalar v2s16 G_BUILD_VECTOR | Matt Arsenault | 2020-01-06 | 3 | -25/+40
* AMDGPU/GlobalISel: Select more G_EXTRACTs correctly | Matt Arsenault | 2020-01-06 | 1 | -5/+19
  This assumed a 32-bit extract size, which would produce invalid copies with 64-bit extracts. Handle the easy case. Ideally we would have a way to get the proper subreg index for any 32-bit offset, but there should probably be a tablegen-generated way of getting the subreg index for any size and offset.
* [CostModel][X86] Add missing scalar i64->f32 uitofp costs | Simon Pilgrim | 2020-01-06 | 1 | -0/+4
* [DAG] DAGCombiner::XformToShuffleWithZero - use APInt::extractBits helper. NFCI. | Simon Pilgrim | 2020-01-06 | 1 | -8/+4
* [NFC] Fix trivial typos in comments | James Henderson | 2020-01-06 | 26 | -30/+30
  Reviewed By: jhenderson
  Differential Revision: https://reviews.llvm.org/D72143
  Patch by Kazuaki Ishizaki.
* [ARM][MVE] More MVETailPredication debug messages. NFC. | Sjoerd Meijer | 2020-01-06 | 2 | -65/+96
  I've added a few more debug messages to MVETailPredication because I wanted to trace better which instructions are added/removed. And while I was at it, I factored out one function which I thought was clearer, and have added some comments to describe better the flow between MVETailPredication and ARMLowOverheadLoops.
  Differential Revision: https://reviews.llvm.org/D71549
* Add interface emitPrefix for MCCodeEmitter | Shengchen Kan | 2020-01-06 | 1 | -89/+133
  Differential Revision: https://reviews.llvm.org/D72047
* [APFloat] Fix compilation warnings | Ehud Katz | 2020-01-06 | 2 | -2/+4
* Add ExternalAAWrapperPass to createLegacyPMAAResults. | Neil Henning | 2020-01-06 | 1 | -0/+5
  Our out-of-tree custom aliasing solution for the HPC# Burst compiler here at Unity makes use of the `ExternalAAWrapperPass` infrastructure to insert our custom aliasing resolution into the core of LLVM. This is great for all cases except for function inlining, where because `createLegacyPMAAResults` does not make use of `ExternalAAWrapperPass`, when we have a definite no-alias result within a function it won't be propagated to the calling function during inlining.

  This commit just rectifies this oversight by adding the missing dependency.

  Differential Revision: https://reviews.llvm.org/D71348
* [APFloat] Add recoverable string parsing errors to APFloat | Ehud Katz | 2020-01-06 | 5 | -68/+116
  Implementing the APFloat part in PR4745.
  Differential Revision: https://reviews.llvm.org/D69770
* [Metadata] Add TBAA struct metadata to `AAMDNodes` | Anton Afanasyev | 2020-01-06 | 2 | -10/+8
  Summary:
  Make `AAMDNodes`' `getAAMetadata()` and `setAAMetadata()` take `!tbaa.struct` into account as well as `!tbaa`. This impacts llvm.org/pr42022.

  This is a temporary fix needed to keep the `!tbaa.struct` tag through the SROA pass. The new field `TBAAStruct` should be deleted when the `!tbaa` tag replaces `!tbaa.struct`. Merging two `!tbaa.struct`'s into one is conservatively considered to be `nullptr` (giving `MayAlias`); this could be enhanced, but we are relying on the said future replacement.

  Reviewers: RKSimon, spatel, vporpo
  Subscribers: hiraditya, kosarev, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D70924
* [TargetLowering] Use SETCC input type to call getBooleanContents instead of the setcc result type. | Craig Topper | 2020-01-05 | 1 | -1/+1
  This isn't a functional change since we also check the bit width is the same and the input type is integer. This guarantees the input and output type are the same. But passing the input type makes the code more readable.
* [MC] Reorder MCFragment members to decrease padding | Fangrui Song | 2020-01-05 | 1 | -2/+2
  sizeof(MCFragment) does not change, but some of its subclasses do, e.g. on a 64-bit platform, sizeof(MCEncodedFragment) decreases from 64 to 56, sizeof(MCDataFragment) decreases from 224 to 216.
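The general effect of member order on padding can be shown with two toy structs. These are hypothetical and do not reflect the real `MCFragment` layout, where the saving shows up in subclasses rather than in the base class itself; the sketch only illustrates that grouping small members avoids alignment gaps.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical layout with a wide member sandwiched between two narrow ones:
// the compiler pads after Kind to align Offset, and again after Flag.
struct PaddedLayout {
  uint8_t Kind;
  uint64_t Offset;
  uint8_t Flag;
};

// Same members, widest first: the two narrow members share one aligned slot.
struct ReorderedLayout {
  uint64_t Offset;
  uint8_t Kind;
  uint8_t Flag;
};

// Holds on common ABIs; exact sizes are implementation-defined.
static_assert(sizeof(ReorderedLayout) <= sizeof(PaddedLayout),
              "reordering should never make this layout larger");
```

On a typical 64-bit ABI the first struct occupies 24 bytes and the second 16, though the exact numbers depend on the platform's alignment rules.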
* [DAGCombine] Don't check the legality of type when combining SIGN_EXTEND_INREG | QingShan Zhang | 2020-01-06 | 1 | -2/+3
  This is the DAG node for SIGN_EXTEND_INREG:

      t21: v4i32 = sign_extend_inreg t18, ValueType:ch:v4i16

  It has two operands. The first one is the value it wants to extend, and the second one is the type that specifies how to extend the value. In this example, it sign-extends t18 (v4i32) from v4i16 to v4i32. That matches the semantics of this C code:

      vector int foo(vector int m) {
        return m << 16 >> 16;
      }

  And it could be any vector type for which the hardware supports the operation, even though the type 'v4i16' is NOT legal for the target. When we try to combine the srl + sra, what we currently do is call TLI.isOperationLegal(), which also checks the legality of the type. That doesn't make sense.

  Differential Revision: https://reviews.llvm.org/D70230
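The per-lane semantics of `sign_extend_inreg` from 16 to 32 bits can be written out in portable C++. The function names below are made up for illustration, and the vector version simply applies the scalar rule to four lanes; it models the `m << 16 >> 16` idiom without relying on shifts of negative values.

```cpp
#include <array>
#include <cstdint>

// Sign-extend the low 16 bits of V "in register": the value stays 32 bits
// wide, but bits 16..31 become copies of bit 15.
int32_t signExtendInReg16(int32_t V) {
  uint32_t Low = static_cast<uint32_t>(V) & 0xFFFFu;  // keep the low half
  // Flipping bit 15 and subtracting 0x8000 maps [0, 0xFFFF] onto
  // [-32768, 32767] using only well-defined arithmetic.
  return static_cast<int32_t>(Low ^ 0x8000u) - 0x8000;
}

// Lane-wise model of `v4i32 = sign_extend_inreg t, v4i16`.
std::array<int32_t, 4> signExtendInRegV4(const std::array<int32_t, 4> &M) {
  std::array<int32_t, 4> R{};
  for (std::size_t I = 0; I < M.size(); ++I)
    R[I] = signExtendInReg16(M[I]);
  return R;
}
```

Note that the `v4i16` in the node is only a description of where the sign bit sits; no value of that type is ever materialized, which is why its legality is not relevant to the combine.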