summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [ARM][MVE] Masked gathers from base + vector of offsetsAnna Welker2020-01-141-38/+162
| | | | | | | | Enables the masked gather pass to create a masked gather loading from a base and vector of offsets. This also enables v8i16 and v16i8 gather loads. Differential Revision: https://reviews.llvm.org/D72330
* [TableGen] Introduce an if/then/else statement.Simon Tatham2020-01-144-12/+137
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This allows you to make some of the defs in a multiclass or `foreach` conditional on an expression computed from the parameters or iteration variables. It was already possible to simulate an if statement using a `foreach` with a dummy iteration variable and a list constructed using `!if` so that it had length 0 or 1 depending on the condition, e.g. foreach unusedIterationVar = !if(condition, [1], []<int>) in { ... } But this syntax is nicer to read, and also more convenient because it allows an else clause. To avoid upheaval in the implementation, I've implemented `if` as pure syntactic sugar on the `foreach` implementation: internally, `ParseIf` actually does construct exactly the kind of foreach shown above (and another reversed one for the else clause if present). Reviewers: nhaehnle, hfinkel Reviewed By: hfinkel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71474
* [TableGen] Introduce a `defvar` statement.Simon Tatham2020-01-144-5/+135
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This allows you to define a global or local variable to an arbitrary value, and refer to it in subsequent definitions. The main use I anticipate for this is if you have to compute some difficult function of the parameters of a multiclass, and then use it many times. For example: multiclass Foo<int i, string s> { defvar op = !cast<BaseClass>("whatnot_" # s # "_" # i); def myRecord { dag a = (op this, (op that, the other), (op x, y, z)); int b = op.subfield; } def myOtherRecord<"template params including", op>; } There are a couple of ways to do this already, but they're not really satisfactory. You can replace `defvar x = y` with a loop over a singleton list, `foreach x = [y] in { ... }` - but that's unintuitive to someone who hasn't seen that workaround idiom before, and requires an extra pair of braces that you often didn't really want. Or you can define a nested pair of multiclasses, with the inner one taking `x` as a template parameter, and the outer one instantiating it just once with the desired value of `x` computed from its other parameters - but that makes it awkward to sequentially compute each value based on the previous ones. I think `defvar` makes things considerably easier. You can also use `defvar` at the top level, where it inserts globals into the same map used by `defset`. That allows you to define global constants without having to make a dummy record for them to live in: defvar MAX_BUFSIZE = 512; // previously: // def Dummy { int MAX_BUFSIZE = 512; } // and then refer to Dummy.MAX_BUFSIZE everywhere Reviewers: nhaehnle, hfinkel Reviewed By: hfinkel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71407
* [AMDGPU] Model distance to instruction in bundleStanislav Mekhanoshin2020-01-141-5/+17
| | | | | | | This change allows to model the height of the instruction within a bundle for latency adjustment purposes. Differential Revision: https://reviews.llvm.org/D72669
* [AMDGPU] Fix getInstrLatency() always returning 1Stanislav Mekhanoshin2020-01-142-3/+7
| | | | | | | | We do not have InstrItinerary so generic getInstLatency() was always defaulting to return 1 cycle. We need to use TargetSchedModel instead to compute an instruction's latency. Differential Revision: https://reviews.llvm.org/D72655
* [MC] Don't resolve relocations referencing STB_LOCAL STT_GNU_IFUNCFangrui Song2020-01-131-1/+2
|
* [X86] Copy the nofpexcept flag when folding a load into an instruction using ↵Craig Topper2020-01-131-0/+4
| | | | the load folding tables./
* [GlobalISel] Change representation of shuffle masks in MachineOperand.Eli Friedman2020-01-138-53/+27
| | | | | | | | | | | | We're planning to remove the shufflemask operand from ShuffleVectorInst (D72467); fix GlobalISel so it doesn't depend on that Constant. The change to prelegalizercombiner-shuffle-vector.mir happens because the input contains a literal "-1" in the mask (so the parser/verifier weren't really handling it properly). We now treat it as equivalent to "undef" in all contexts. Differential Revision: https://reviews.llvm.org/D72663
* [PGO][CHR] Guard against 0-to-0 branch weight and avoid division by zero crash.Hiroshi Yamauchi2020-01-131-0/+4
| | | | | | | | | | | | Summary: This fixes a crash in internal builds under SamplePGO. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72653
* Revert "[DWARF5][DebugInfo]: Added support for DebugInfo generation for auto ↵Amy Huang2020-01-131-8/+0
| | | | | | | return type for C++ member functions." This reverts commit c958639098a8702b831952b1a1a677ae19190a55, which causes a crash. See https://reviews.llvm.org/D70524 for details.
* [ThinLTO/WPD] Fix index-based WPD for alias vtablesTeresa Johnson2020-01-131-1/+1
| | | | | | | | | | | | | | | | | Summary: A recent fix in D69452 fixed index based WPD in the presence of available_externally vtables. It added a cast of the vtable def summary to a GlobalVarSummary. However, in some cases one def may be an alias, in which case we need to get the base object before casting, otherwise we will crash. Reviewers: evgeny777, steven_wu, aganea Subscribers: mehdi_amini, inglorion, hiraditya, dexonsmith, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71040
* [LegalizeIntegerTypes][X86] Add support for expanding input of ↵Craig Topper2020-01-131-6/+30
| | | | | | | | STRICT_SINT_TO_FP/STRICT_UINT_TO_FP into a libcall. Needed to support i128->fp128 on 32-bit X86. Add full set of strict sint_to_fp/uint_to_fp conversion tests for fp128.
* [Dsymutil][Debuginfo][NFC] #3 Refactor dsymutil to separate DWARF optimizing ↵Alexey Lapshin2020-01-131-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | part. Summary: This is the next portion of patches for dsymutil. Create DwarfEmitter interface to generate all debug info tables. Put DwarfEmitter into DwarfLinker library and make tools/dsymutil/DwarfStreamer to be child of DwarfEmitter. It passes check-all testing. MD5 checksum for clang .dSYM bundle matches for the dsymutil with/without that patch. Reviewers: JDevlieghere, friss, dblaikie, aprantl Reviewed By: JDevlieghere Subscribers: merge_guards_bot, hiraditya, thegameg, probinson, llvm-commits Tags: #llvm, #debug-info Differential Revision: https://reviews.llvm.org/D72476
* [LTO] Constify lto::Config reference passed to backends (NFC)Teresa Johnson2020-01-132-18/+17
| | | | | | | | The lto::Config object saved on the global LTO object should not be updated by any of the LTO backends. Otherwise we could run into interference between threads utilizing it. Motivated by some proposed changes that would have caused it to get modified in the ThinLTO backends.
* Rework be15dfa88fb1 such that it works with GlobalISel which doesn't use EVTDaniel Sanders2020-01-131-3/+11
| | | | | | | | | | | | | | | | | | | | | Summary: be15dfa88fb1 broke GlobalISel's usage of getSetCCInverse() which currently appears to be limited to our out-of-tree backend. GlobalISel doesn't use EVT's and isn't able to derive them from the information it has as it doesn't distinguish between integer and floating point types (that distinction is made by operations rather than values). Bring back the bool version of getSetCCInverse() in a way that doesn't break the intent of be15dfa88fb1 but also allows GlobalISel to continue using it. Reviewers: spatel, bogner, arichardson Reviewed By: arichardson Subscribers: rovka, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72309
* [X86][Disassembler] Fix a bug when disassembling an empty stringFangrui Song2020-01-131-1/+3
| | | | | | | | | | | | | readPrefixes() assumes insn->bytes is non-empty. The code path is not exercised in llvm-mc because llvm-mc does not feed empty input to MCDisassembler::getInstruction(). This bug is uncovered by a5994c789a2982a770254ae1607b5b4cb641f73c. An empty string did not crash before because the deleted regionReader() allowed UINT64_C(-1) as insn->readerCursor. Bytes.size() <= Address -> R->Base 0 <= UINT64_C(-1) - UINT32_C(-1)
* [llvm][MIRVRegNamerUtils] Adding hashing on FrameIndex MachineOperands.Puyan Lotfi2020-01-131-1/+2
| | | | | | | | | | | | This patch makes it so that cases where multiple instructions that differ only in their FrameIndex MachineOperand values no longer collide. For instance: %1:_(p0) = G_FRAME_INDEX %stack.0 %2:_(p0) = G_FRAME_INDEX %stack.1 Prior to this patch these instructions would collide together. Differential Revision: https://reviews.llvm.org/D71583
* AMDGPU/GlobalISel: Select llvm.amdgcn.ds.ordered.{add|swap}Matt Arsenault2020-01-132-0/+88
|
* [SelectionDAG] ComputeNumSignBits add getValidMaximumShiftAmountConstant() ↵Simon Pilgrim2020-01-131-0/+31
| | | | | | for ISD::SHL support Allows us to handle non-uniform SHL shifts to determine the minimum number of sign bits remaining (based off the maximum shift amount value)
* AMDGPU/GlobalISel: Set insert point after waterfall loopMatt Arsenault2020-01-131-2/+3
| | | | | | | | | The current users of the waterfall loop utility functions do not make use of the restored original insert point. The insertion is either done, or they set the insert point somewhere else. A future change will want to insert instructions after the waterfall loop, but figuring out the point after the loop is more difficult than ensuring the insert point is there after the loop.
* AMDGPU/GlobalISel: Fix branch targets when emitting SI_IFMatt Arsenault2020-01-131-7/+30
| | | | | | | | The branch target needs to be changed depending on whether there is an unconditional branch or not. Loops also need to be similarly fixed, but compiling a simple testcase end to end requires another set of patches that aren't upstream yet.
* AMDGPU/GlobalISel: Simplify assertMatt Arsenault2020-01-131-11/+3
|
* [LegalizeTypes] Add SoftenFloatResult support for ↵Andrew Wei2020-01-141-8/+16
| | | | | | | | | STRICT_SINT_TO_FP/STRICT_UINT_TO_FP Some target like arm/riscv with soft-float will have compiling crash when using -fno-unsafe-math-optimization option. This patch will add the missing strict FP support to SoftenFloatRes_XINT_TO_FP. Differential Revision: https://reviews.llvm.org/D72277
* [SelectionDAG] ComputeNumSignBits add getValidMinimumShiftAmountConstant() ↵Simon Pilgrim2020-01-131-1/+4
| | | | | | ISD::SRA support Allows us to handle more non-uniform SRA sign bits cases
* [Scheduler] Remove superfluous casts. NFCDavid Green2020-01-132-5/+3
|
* [AArch64][SVE] Add patterns for some arith SVE instructions.Danilo Carvalho Grael2020-01-135-10/+67
| | | | | | | | | | | Summary: Add patterns for the following instructions: - smax, smin, umax, umin Reviewers: sdesmalen, huntergr, rengolin, efriedma, c-rhodes, mgudim, kmclaughlin Subscribers: amehsan Differential Revision: https://reviews.llvm.org/D71779
* [DebugInfo] Make debug line address size mismatch non-fatal to parsingJames Henderson2020-01-131-11/+20
| | | | | | | | | Reasonable assumptions can be made when a parsed address length does not match the expected length, so there's no need for this to be fatal. Reviewed by: ikudrin Differential Revision: https://reviews.llvm.org/D72154
* [Inlining] Add PreInlineThreshold for the new pass managerKazu Hirata2020-01-131-2/+6
| | | | | | | | | | | | | | | Summary: This patch makes it easy to try out different preinlining thresholds with a command-line switch just like -preinline-threshold for the legacy pass manager. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72618
* [RISCV] Handle globals and block addresses in asm operandsLuís Marques2020-01-131-0/+8
| | | | | | | | | | Summary: These seem to be the machine operand types currently needed by the RISC-V target. Reviewers: asb, lenary Reviewed By: lenary Tags: #llvm Differential Revision: https://reviews.llvm.org/D72275
* [AArch64] Emit HINT instead of PAC insns in Armv8.2-A or belowPablo Barrio2020-01-131-15/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The Pointer Authentication Extension (PAC) was added in Armv8.3-A. Some instructions are implemented in the HINT space to allow compiling code common to CPUs regardless of whether they feature PAC or not, and still benefit from PAC protection in the PAC-enabled CPUs. The 8.3-specific mnemonics were currently enabled in any architecture, and LLVM was emitting them in assembly files when PAC code generation was enabled. This was ok for compilations where both LLVM codegen and the integrated assembler were used. However, the LLVM codegen was not compatible with other assemblers (e.g. GAS). Given the fact that the approach from these assemblers (i.e. to disallow Armv8.3-A mnemonics if compiling for Armv8.2-A or lower) is entirely reasonable, this patch makes LLVM to emit HINT when building for Armv8.2-A and below, instead of PACIASP, AUTIASP and friends. Then, LLVM assembly should be compatible with other assemblers. Reviewers: samparker, chill, LukeCheeseman Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71658
* [MIPS] Don't emit R_(MICRO)MIPS_JALR relocations against data symbolsAlex Richardson2020-01-131-0/+9
| | | | | | | | | | | | | | The R_(MICRO)MIPS_JALR optimization only works when used against functions. Using the relocation against a data symbol (e.g. function pointer) will cause some linkers that don't ignore the hint in this case (e.g. LLD prior to commit 5bab291b7b) to generate a relative branch to the data symbol which crashes at run time. Before this patch, LLVM was erroneously emitting these relocations against local-dynamic TLS function pointers and global function pointers with internal visibility. Reviewers: atanasyan, jrtc27, vstefanovic Reviewed By: atanasyan Differential Revision: https://reviews.llvm.org/D72571
* [MIPS][ELF] Use PC-relative relocations in .eh_frame when possibleAlex Richardson2020-01-132-3/+11
| | | | | | | | | | | | | | | | | | | | When compiling position-independent executables, we now use DW_EH_PE_pcrel | DW_EH_PE_sdata4. However, the MIPS ABI does not define a 64-bit PC-relative ELF relocation so we cannot use sdata8 for the large code model case. When using the large code model, we fall back to the previous behaviour of generating absolute relocations. With this change clang-generated .o files can be linked by LLD without having to pass -Wl,-z,notext (which creates text relocations). This is simpler than the approach used by ld.bfd, which rewrites the .eh_frame section to convert absolute relocations into relative references. I saw in D13104 that apparently ld.bfd did not accept pc-relative relocations for MIPS ouput at some point. However, I also checked that recent ld.bfd can process the clang-generated .o files so this no longer seems true. Reviewed By: atanasyan Differential Revision: https://reviews.llvm.org/D72228
* [SelectionDAG] ComputeNumSignBits - Use getValidShiftAmountConstant for ↵Simon Pilgrim2020-01-131-15/+8
| | | | | | shift opcodes getValidShiftAmountConstant handles out of bounds shift amounts for us, allowing us to remove the local handling.
* [SelectionDAG] ComputeKnownBits - Add DemandedElts support to ↵Simon Pilgrim2020-01-131-8/+14
| | | | getValidShiftAmountConstant/getValidMinimumShiftAmountConstant()
* [FPEnv] Fix chain handling for fpexcept.strict nodesUlrich Weigand2020-01-132-14/+81
| | | | | | | | | | | | | | | | | We need to ensure that fpexcept.strict nodes are not optimized away even if the result is unused. To do that, we need to chain them into the block's terminator nodes, like already done for PendingExcepts. This patch adds two new lists of pending chains, PendingConstrainedFP and PendingConstrainedFPStrict to hold constrained FP intrinsic nodes without and with fpexcept.strict markers. This allows not only to solve the above problem, but also to relax chains a bit further by no longer flushing all FP nodes before a store or other memory access. (They are still flushed before nodes with other side effects.) Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D72341
* [SelectionDAG] ComputeKnownBits add getValidMinimumShiftAmountConstant() ↵Simon Pilgrim2020-01-131-0/+3
| | | | | | ISD::SHL support As mentioned on D72573
* [SelectionDAG] ComputeKnownBits - minimum leading/trailing zero bits in ↵Simon Pilgrim2020-01-131-0/+11
| | | | | | | | | | LSHR/SHL (PR44526) As detailed in https://blog.regehr.org/archives/1709 we don't make use of the known leading/trailing zeros for shifted values in cases where we don't know the shift amount value. This patch adds support to SelectionDAG::ComputeKnownBits to use KnownBits::countMinTrailingZeros and countMinLeadingZeros to set the minimum guaranteed leading/trailing known zero bits. Differential Revision: https://reviews.llvm.org/D72573
* [X86] Fix MSVC "truncation from 'int' to 'bool'" warning. NFCI.Simon Pilgrim2020-01-131-2/+2
|
* ARMLowOverheadLoops: return earlier to avoid printing irrelevant dbg msg. NFCSjoerd Meijer2020-01-131-0/+1
|
* This option allows selecting the TLS size in the local exec TLS model,KAWASHIMA Takahiro2020-01-133-26/+117
| | | | | | | | | | | | | | | | | | which is the default TLS model for non-PIC objects. This allows large/ many thread local variables or a compact/fast code in an executable. Specification is same as that of GCC. For example, the code model option precedes the TLS size option. TLS access models other than local-exec are not changed. It means supoort of the large code model is only in the local exec TLS model. Patch By KAWASHIMA Takahiro (kawashima-fj <t-kawashima@fujitsu.com>) Reviewers: dmgreen, mstorsjo, t.p.northover, peter.smith, ostannard Reviewd By: peter.smith Committed by: peter.smith Differential Revision: https://reviews.llvm.org/D71688
* [RISCV] Collect Statistics on Compressed InstructionsSam Elliott2020-01-132-0/+14
| | | | | | | | | | | | | | | | | Summary: It is useful to keep statistics on how many instructions we have compressed, so we can see if future changes are increasing or decreasing this number. Reviewers: asb, luismarques Reviewed By: asb, luismarques Subscribers: xbolva00, sameer.abuasal, hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67495
* [SCEV] Follow up of D71563: addressing post commit comment. NFC.Sjoerd Meijer2020-01-131-11/+6
|
* [DWARF5][DebugInfo]: Added support for DebugInfo generation for auto return ↵Awanish Pandey2020-01-131-0/+8
| | | | | | | | | | | | | | | | | | type for C++ member functions. Summary: This patch will provide support for auto return type for the C++ member functions. Before this return type of the member function is deduced and stored in the DIE. This patch includes llvm side implementation of this feature. Patch by: Awanish Pandey <Awanish.Pandey@amd.com> Reviewers: dblaikie, aprantl, shafik, alok, SouraVX, jini.susan.george Reviewed by: dblaikie Differential Revision: https://reviews.llvm.org/D70524
* [X86] Use SDNPOptInGlue instead of SDNPInGlue on a couple SDNodes.Craig Topper2020-01-121-2/+2
| | | | | At least one of these is used without a Glue. This doesn't seem to change the X86GenDAGISel.inc output so maybe it doesn't matter?
* AMDGPU/GlobalISel: Don't use XEXEC class for SGPRsMatt Arsenault2020-01-121-1/+1
| | | | | We don't use the xexec register classes for arbitrary values anymore. Avoids a test variance beween GlobalISel and SelectionDAG>
* AMDGPU/GlobalISel: Copy type when inserting readfirstlaneMatt Arsenault2020-01-121-0/+2
| | | | | getDefIgnoringCopies will fail to find any def if no type is set if we try to use it on the use's operand, so propagate the type.
* [SCEV] accurate range for addrecexpr with nuw flagZheng Chen2020-01-121-5/+6
| | | | | | | | | If addrecexpr has nuw flag, the value should never be less than its start value and start value does not required to be SCEVConstant. Reviewed By: nikic, sanjoy Differential Revision: https://reviews.llvm.org/D71690
* [RISCV] Check register class for AMO memory operandsJames Clarke2020-01-132-1/+6
| | | | | | | | | | | | | | | | | | | Summary: AMO memory operands use a custom parser in order to accept both (reg) and 0(reg). However, the validation predicate used for these operands was only checking that they were registers, and not the register class, so non-GPRs (such as FPRs) were also accepted. Thus, fix this by making the predicate check that they are GPRs. Reviewers: asb, lenary Reviewed By: asb, lenary Subscribers: hiraditya, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72471
* [MC][ELF] Emit a relocation if target is defined in the same section and is ↵Fangrui Song2020-01-121-21/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | non-local For a target symbol defined in the same section, currently we don't emit a relocation if VariantKind is VK_None (with few exceptions like RISC-V relaxation), while GNU as emits one. This causes program behavior differences with and without -ffunction-sections, and can break intended symbol interposition in a -shared link. ``` .globl foo foo: call foo # no relocation. On other targets, may be written as b foo, etc call bar # a relocation if bar is in another section (e.g. -ffunction-sections) call foo@plt # a relocation ``` Unify these cases by always emitting a relocation. If we ever want to optimize `call foo` in -shared links, we should emit a STB_LOCAL alias and call via the alias. ARM/thumb2-beq-fixup.s: we now emit a relocation to global_thumb_fn as GNU as does. X86/Inputs/align-branch-64-2.s: we now emit R_X86_64_PLT32 to foo as GNU does. ELF/relax.s: rewrite the test as target-in-same-section.s . We omitted relocations to `global` and now emit R_X86_64_PLT32. Note, GNU as does not emit a relocation for `jmp global` (maybe its own bug). Our new behavior is compatible except `jmp global`. Reviewed By: peter.smith Differential Revision: https://reviews.llvm.org/D72197
* __patchable_function_entries: don't use linkage field 'unique' with ↵Fangrui Song2020-01-121-18/+21
| | | | | | | | | | | | -no-integrated-as .section name, "flags"G, @type, GroupName[, linkage] As of binutils 2.33, linkage cannot be 'unique'. For integrated assembler, we use both 'o' flag and 'unique' linkage to support --gc-sections and COMDAT with lld. https://sourceware.org/ml/binutils/2019-11/msg00266.html
OpenPOWER on IntegriCloud