summaryrefslogtreecommitdiffstats
path: root/llvm/include
Commit message (Collapse)AuthorAgeFilesLines
...
* [TableGen] Add bang-operators !getop and !setop.Simon Tatham2019-12-111-2/+2
| | | | | | | | | | | | | | | | | | | | | | Summary: These allow you to get and set the operator of a dag node, without affecting its list of arguments. `!getop` is slightly fiddly because in many contexts you need its return value to have a static type more specific than 'any record'. It works to say `!cast<BaseClass>(!getop(...))`, but it's cumbersome, so I made `!getop` take an optional type suffix itself, so that can be written as the shorter `!getop<BaseClass>(...)`. Reviewers: hfinkel, nhaehnle Reviewed By: nhaehnle Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71191
* [AArch64][SVE] Implement intrinsics for non-temporal loads & storesKerry McLaughlin2019-12-111-0/+26
| | | | | | | | | | | | | | | | | | | | Summary: Adds the following intrinsics: - llvm.aarch64.sve.ldnt1 - llvm.aarch64.sve.stnt1 This patch creates masked loads and stores with the MONonTemporal flag set when used with the intrinsics above. Reviewers: sdesmalen, paulwalker-arm, dancgr, mgudim, efriedma, rengolin Reviewed By: efriedma Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71000
* [ARM][LowOverheadLoops] Remove dead loop update instructions.Sjoerd Meijer2019-12-112-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After creating a low-overhead loop, the loop update instruction was still lingering around hurting performance. This removes dead loop update instructions, which in our case are mostly SUBS instructions. To support this, some helper functions were added to MachineLoopUtils and ReachingDefAnalysis to analyse live-ins of loop exit blocks and find uses before a particular loop instruction, respectively. This is a first version that removes a SUBS instruction when there are no other uses inside and outside the loop block, but there are some more interesting cases in test/CodeGen/Thumb2/LowOverheadLoops/mve-tail-data-types.ll which shows that there is room for improvement. For example, we can't handle this case yet: .. dlstp.32 lr, r2 .LBB0_1: mov r3, r2 subs r2, #4 vldrh.u32 q2, [r1], #8 vmov q1, q0 vmla.u32 q0, q2, r0 letp lr, .LBB0_1 @ %bb.2: vctp.32 r3 .. which is a lot more tricky because r2 is not only used by the subs, but also by the mov to r3, which is used outside the low-overhead loop by the vctp instruction, and that requires a bit of a different approach, and I will follow up on this. Differential Revision: https://reviews.llvm.org/D71007
* [ARM][MVE] Add intrinsics for immediate shifts. (reland)Simon Tatham2019-12-111-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds the family of `vshlq_n` and `vshrq_n` ACLE intrinsics, which shift every lane of a vector left or right by a compile-time immediate. They mostly work by expanding to the IR `shl`, `lshr` and `ashr` operations, with their second operand being a vector splat of the immediate. There's a fiddly special case, though. ACLE specifies that the immediate in `vshrq_n` can take values up to //and including// the bit size of the vector lane. But LLVM IR thinks that shifting right by the full size of the lane is UB, and feels free to replace the `lshr` with an `undef` half way through the optimization pipeline. Hence, to keep this legal in source code, I have to detect it at codegen time. Logical (unsigned) right shifts by the element size are handled by simply emitting the zero vector; arithmetic ones are converted into a shift of one bit less, which will always give the same output. In order to do that check, I also had to enhance the tablegen MveEmitter so that it can cope with converting a builtin function's operand into a bare integer to pass to a code-generating subfunction. Previously the only bare integers it knew how to handle were flags generated from within `arm_mve.td`. Reviewers: dmgreen, miyuki, MarkMurrayARM, ostannard Reviewed By: dmgreen, MarkMurrayARM Subscribers: echristo, hokein, rdhindsa, kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71065
* [NFC] Correct the example in the comments of JSON.h to avoid misleadQingShan Zhang2019-12-111-1/+1
| | | | user
* [MCRegInfo] Add sub_and_superregs_inclusive iterator range.Florian Hahn2019-12-111-0/+8
| | | | | | | | Reviewers: evandro, qcolombet, paquette, MatzeB, arsenm Reviewed By: paquette Differential Revision: https://reviews.llvm.org/D70566
* [AArch64][SVE] Move TableGen class definitions for gather loads (NFC)Andrzej Warzynski2019-12-111-17/+17
| | | | | | | Move 2 intrinsic class definitions so that they're all clustered in one place. Patch submitted to test commit access.
* [LiveRegUnits] Add phys_regs_and_masks iterator range (NFC).Florian Hahn2019-12-111-0/+13
| | | | | | | | | | | This iterator range just includes physical registers and register masks, which are interesting when dealing with register liveness. Reviewers: evandro, t.p.northover, paquette, MatzeB, arsenm Reviewed By: paquette Differential Revision: https://reviews.llvm.org/D70562
* [FPEnv][X86] Constrained FCmp intrinsics enabling on X86Wang, Pengfei2019-12-111-1/+5
| | | | | | | | | | | | Summary: This is a follow up of D69281, it enables the X86 backend support for the FP comparision. Reviewers: uweigand, kpn, craig.topper, RKSimon, cameron.mcinally, andrew.w.kaylor Subscribers: hiraditya, llvm-commits, annita.zhang, LuoYuanke, LiuChen3 Tags: #llvm Differential Revision: https://reviews.llvm.org/D70582
* Revert "Reapply: [DebugInfo] Recover debug intrinsics when killing ↵Vlad Tsyrklevich2019-12-101-2/+0
| | | | | | | | duplicated/empty..." This reverts commit f2ba93971ccc236c0eef5323704d31f48107e04f, it was causing build timeouts on sanitizer-x86_64-linux-autoconf such as http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-autoconf/builds/44917
* [IR] allow undefined elements when checking for splat constantsSanjay Patel2019-12-102-6/+8
| | | | | This mimics the related call in SDAG. The caller is responsible for ensuring that undef values are propagated safely.
* Fix -Wincomplete-umbrella warning in the modules buildVedant Kumar2019-12-101-0/+1
| | | | | [281/3666] Building CXX object lib/IR/CMakeFiles/LLVMCore.dir/IntrinsicInst.cpp.o /Users/vsk/src/llvm-project-master/llvm/lib/IR/IntrinsicInst.cpp:155:2: warning: missing submodule 'LLVM_IR.ConstrainedOps' [-Wincomplete-umbrella]
* [FPEnv] clang support for constrained FP builtinsKevin P. Neal2019-12-101-0/+33
| | | | | | | | Change the IRBuilder and clang so that constrained FP intrinsics will be emitted for builtins when appropriate. Only non-target-specific builtins are affected in this patch. Differential Revision: https://reviews.llvm.org/D70256
* [VectorUtils] Fix -Wunused-private-field after D67572Fangrui Song2019-12-101-4/+2
|
* [VectorUtils] Introduce the Vector Function Database (VFDatabase).Francesco Petrogalli2019-12-102-5/+85
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduced the VFDatabase, the framework proposed in http://lists.llvm.org/pipermail/llvm-dev/2019-June/133484.html. [*] In this patch the VFDatabase is used to bridge the TargetLibraryInfo (TLI) calls that were previously used to query for the availability of vector counterparts of scalar functions. The VFISAKind field `ISA` of VFShape have been moved into into VFInfo, under the assumption that different vector ISAs may provide the same vector signature. At the moment, the vectorizer accepts any of the available ISAs as long as the signature provided by the VFDatabase matches the one expected in the vectorization process. For example, when targeting AVX or AVX2, which both have 256-bit registers, the IR signature of the two vector functions associated to the two ISAs is the same. The `getVectorizedFunction` method at the moment returns the first available match. We will need to add more heuristics to the search system to decide which of the available version (TLI, AVX, AVX2, ...) the system should prefer, when multiple versions with the same VFShape are present. Some of the code in this patch is based on the work done by Sumedh Arani in https://reviews.llvm.org/D66025. [*] Notice that in the proposal the VFDatabase was called SVFS. The name VFDatabase is more in line with LLVM recommendations for naming classes and variables. Differential Revision: https://reviews.llvm.org/D67572
* [ARM][MVE] Refactor complex vector intrinsics [NFCI]Mikhail Maltsev2019-12-101-1/+3
| | | | | | | | | | | | | | | | | | | | | | | Summary: This patch refactors instruction selection of the complex vector addition, multiplication and multiply-add intrinsics, so that it is now based on TableGen patterns rather than C++ code. It also changes the first parameter (halving vs non-halving) of the arm_mve_vcaddq IR intrinsic to match the corresponding instruction encoding, hence it requires some changes in the tests. The patch addresses David's comment in https://reviews.llvm.org/D71190 Reviewers: dmgreen, ostannard, simon_tatham, MarkMurrayARM Reviewed By: dmgreen Subscribers: merge_guards_bot, kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71245
* [DebugInfo] Support to emit debugInfo for extern variablesYonghong Song2019-12-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Extern variable usage in BPF is different from traditional pure user space application. Recent discussion in linux bpf mailing list has two use cases where debug info types are required to use extern variables: - extern types are required to have a suitable interface in libbpf (bpf loader) to provide kernel config parameters to bpf programs. https://lore.kernel.org/bpf/CAEf4BzYCNo5GeVGMhp3fhysQ=_axAf=23PtwaZs-yAyafmXC9g@mail.gmail.com/T/#t - extern types are required so kernel bpf verifier can verify program which uses external functions more precisely. This will make later link with actual external function no need to reverify. https://lore.kernel.org/bpf/87eez4odqp.fsf@toke.dk/T/#m8d5c3e87ffe7f2764e02d722cb0d8cbc136880ed This patch added clang support to emit debuginfo for extern variables with a TargetInfo hook to enable it. The debuginfo for the extern variable is emitted only if that extern variable is referenced in the current compilation unit. Currently, only BPF target enables to generate debug info for extern variables. The emission of such debuginfo is disabled for C++ at this moment since BPF only supports a subset of C language. Emission with C++ can be enabled later if an appropriate use case is identified. -fstandalone-debug permits us to see more debuginfo with the cost of bloated binary size. This patch did not add emission of extern variable debug info with -fstandalone-debug. This can be re-evaluated if there is a real need. Differential Revision: https://reviews.llvm.org/D70696
* [Alignment][NFC] CreateMemSet use MaybeAlignGuillaume Chatelet2019-12-101-4/+4
| | | | | | | | | | | | | | | Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: arsenm, jvesely, nhaehnle, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71213
* Reapply: [DebugInfo] Recover debug intrinsics when killing duplicated/empty...stozer2019-12-101-0/+2
| | | | | | | | | | | basic blocks Originally applied in 72ce759928e6dfee6a9efa310b966c19722352ba. Fixed a build failure caused by incorrect use of cast instead of dyn_cast. This reverts commit 8b0780f795eb58fca0a2456e308adaaa1a0b5013.
* [AArch64] Fix issues with large arrays on stackKiran Chandramohan2019-12-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch fixes a few issues when large arrays are allocated on the stack. Currently, clang has inconsistent behaviour, for debug builds there is an assertion failure when the array size on stack is around 2GB but there is no assertion when the stack is around 8GB. For release builds there is no assertion, the compilation succeeds but generates incorrect code. The incorrect code generated is due to using int/unsigned int instead of their 64-bit counterparts. This patch, 1) Removes the assertion in frame legality check. 2) Converts int/unsigned int in some places to the 64-bit variants. This helps in generating correct code and removes the inconsistent behaviour. 3) Adds a test which runs without optimisations. Reviewers: sdesmalen, efriedma, fhahn, aemerson Reviewed By: efriedma Subscribers: eli.friedman, fpetrogalli, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70496
* [OpenMP][NFCI] Introduce llvm/IR/OpenMPConstants.hJohannes Doerfert2019-12-102-0/+148
| | | | | | | | | | | | | | | | | | | Summary: The new OpenMPConstants.h is a location for all OpenMP related constants (and helpers) to live. This patch moves the directives there (the enum OpenMPDirectiveKind) and rewires Clang to use the new location. Initially part of D69785. Reviewers: kiranchandramohan, ABataev, RaviNarayanaswamy, gtbercea, grokos, sdmitriev, JonChesterfield, hfinkel, fghanim Subscribers: jholewinski, ppenzin, penzn, llvm-commits, cfe-commits, jfb, guansong, bollu, hiraditya, mgorny Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D69853
* [MC] Delete MCCodePadderFangrui Song2019-12-096-384/+0
| | | | | | | | | | | | | | | | | | | | | D34393 added MCCodePadder as an infrastructure for padding code with NOP instructions. It lacked tests and was not being worked on since then. Intel has now worked on an assembler patch to mitigate performance loss after applying microcode update for the Jump Conditional Code Erratum. https://www.intel.com/content/www/us/en/support/articles/000055650/processors.html This new patch shares similarity with MCCodePadder, but has a concrete use case in mind and is being actively developed. The infrastructure it introduces can potentially be used for general performance improvement via alignment. Delete the unused MCCodePadder so that people can develop the new feature from a clean state. Reviewed By: jyknight, skan Differential Revision: https://reviews.llvm.org/D71106
* Revert "[ARM][MVE] Add intrinsics for immediate shifts."Eric Christopher2019-12-091-8/+0
| | | | | | | | | | | | | | and two follow-on commits: one warning fix and one functionality. As it's breaking at least the lto bot: http://lab.llvm.org:8011/builders/clang-with-lto-ubuntu/builds/15132/steps/test-stage1-compiler/logs/stdio This reverts commits: 8d70f3c933a5b81a87a5ab1af0e3e98ee2cd7c67 ff4dceef9201c5ae3924e92f6955977f243ac71d d97b3e3e65cd77a81b39732af84a1a4229e95091
* Implement LWG#1203 for raw_ostream.Christian Sigg2019-12-091-0/+12
| | | | | | | | | | | | | | | | | | | Implement LWG#1203 (https://cplusplus.github.io/LWG/issue1203) for raw_ostream like libc++ does for std::basic_ostream<...>. Add a operator<< overload that takes an rvalue reference of a typed derived from raw_ostream, streams the value to it and returns the stream of the same type as the argument. This allows free operator<< to work with rvalue reference raw_ostreams: raw_ostream& operator<<(raw_ostream&, const SomeType& Value); raw_os_ostream(std::cout) << SomeType(); It also allows using the derived type like: auto Foo = (raw_string_ostream(buffer) << "foo").str(); Author: Christian Sigg <csigg@google.com> Differential Revision: https://reviews.llvm.org/D70686
* [PGO][PGSO] Instrument the code gen / target passes.Hiroshi Yamauchi2019-12-093-0/+16
| | | | | | | | | | | | | | | | | | | | Summary: Split off of D67120. Add the profile guided size optimization instrumentation / queries in the code gen or target passes. This doesn't enable the size optimizations in those passes yet as they are currently disabled in shouldOptimizeForSize (for non-IR pass queries). A second try after reverted D71072. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71149
* [ARM][MVE][Intrinsics] Add VQADDQ, VHADDQ, VRHADDQ, VQSUBQ, VHSUBQ, ↵Mark Murray2019-12-091-0/+36
| | | | | | | | | | | | | | VQDMULHQ, VQRDMULHQ intrinsics. Summary: Add VQADDQ, VHADDQ, VRHADDQ, VQSUBQ, VHSUBQ, VQDMULHQ, VQRDMULHQ intrinsics and unit tests. Reviewers: simon_tatham, ostannard, dmgreen, miyuki Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71198
* [ARM][MVE][Intrinsics] Add VMULL[BT]Q_(INT|POLY) intrinsics.Mark Murray2019-12-091-0/+14
| | | | | | | | | | | | Summary: Add VMULL[BT]Q_(INT|POLY) intrinsics and unit tests. Reviewers: simon_tatham, ostannard, dmgreen Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71066
* [ARM][MVE] Add intrinsics for immediate shifts.Simon Tatham2019-12-091-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This adds the family of `vshlq_n` and `vshrq_n` ACLE intrinsics, which shift every lane of a vector left or right by a compile-time immediate. They mostly work by expanding to the IR `shl`, `lshr` and `ashr` operations, with their second operand being a vector splat of the immediate. There's a fiddly special case, though. ACLE specifies that the immediate in `vshrq_n` can take values up to //and including// the bit size of the vector lane. But LLVM IR thinks that shifting right by the full size of the lane is UB, and feels free to replace the `lshr` with an `undef` half way through the optimization pipeline. Hence, to keep this legal in source code, I have to detect it at codegen time. Logical (unsigned) right shifts by the element size are handled by simply emitting the zero vector; arithmetic ones are converted into a shift of one bit less, which will always give the same output. In order to do that check, I also had to enhance the tablegen MveEmitter so that it can cope with converting a builtin function's operand into a bare integer to pass to a code-generating subfunction. Previously the only bare integers it knew how to handle were flags generated from within `arm_mve.td`. Reviewers: dmgreen, miyuki, MarkMurrayARM, ostannard Reviewed By: MarkMurrayARM Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71065
* [DebugInfo] Nerf placeDbgValues, with prejudiceJeremy Morse2019-12-091-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CodeGenPrepare::placeDebugValues moves variable location intrinsics to be immediately after the Value they refer to. This makes tracking of locations very easy; but it changes the order in which assignments appear to the debugger, from the source programs order to the order in which the optimised program computes values. This then leads to PR43986 and PR38754, where variable locations that were in a conditional block are made unconditional, which is highly misleading. This patch adjusts placeDbgValues to only re-order variable location intrinsics if they use a Value before it is defined, significantly reducing the damage that it does. This is still not 100% safe, but the rest of CodeGenPrepare needs polishing to correctly update debug info when optimisations are performed to fully fix this. This will probably break downstream debuginfo tests -- if the instruction-stream position of variable location changes isn't the focus of the test, an easy fix should be to manually apply placeDbgValues' behaviour to the failing tests, moving dbg.value intrinsics next to SSA variable definitions thus: %foo = inst1 %bar = ... %baz = ... void call @llvm.dbg.value(metadata i32 %foo, ... to %foo = inst1 void call @llvm.dbg.value(metadata i32 %foo, ... %bar = ... %baz = ... This should return your test to exercising whatever it was testing before. Differential Revision: https://reviews.llvm.org/D58453
* [ARM][MVE] Add complex vector intrinsicsMikhail Maltsev2019-12-091-0/+36
| | | | | | | | | | | | | | | | | | | | Summary: This patch adds intrinsics for the following MVE instructions: * VCADD, VHCADD * VCMUL * VCMLA Each of the above 3 groups has a corresponding new LLVM IR intrinsic. Reviewers: simon_tatham, MarkMurrayARM, ostannard, dmgreen Reviewed By: MarkMurrayARM Subscribers: merge_guards_bot, kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71190
* [CommandLine] Add missing CallbacksDavid Green2019-12-091-0/+8
| | | | | | | | | | It appears that the cl::bits options are not used anywhere in-tree. In the recent addition to add Callback's to the options, the Callback was missing from this one. This fixes it by adding the same code from the other classes. It also adds a simple test, of sorts, just to make sure these continue compiling.
* [ARM] Teach the Arm cost model that a Shift can be folded into other ↵David Green2019-12-094-17/+22
| | | | | | | | | | | | | | | | | | | | | | | | | instructions This attempts to teach the cost model in Arm that code such as: %s = shl i32 %a, 3 %a = and i32 %s, %b Can under Arm or Thumb2 become: and r0, r1, r2, lsl #3 So the cost of the shift can essentially be free. To do this without trying to artificially adjust the cost of the "and" instruction, it needs to get the users of the shl and check if they are a type of instruction that the shift can be folded into. And so it needs to have access to the actual instruction in getArithmeticInstrCost, which if available is added as an extra parameter much like getCastInstrCost. We otherwise limit it to shifts with a single user, which should hopefully handle most of the cases. The list of instruction that the shift can be folded into include ADC, ADD, AND, BIC, CMP, EOR, MVN, ORR, ORN, RSB, SBC and SUB. This translates to Add, Sub, And, Or, Xor and ICmp. Differential Revision: https://reviews.llvm.org/D70966
* [DebugInfo] Make describeLoadedValue() reg awareDavid Stenberg2019-12-091-8/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Currently the describeLoadedValue() hook is assumed to describe the value of the instruction's first explicit define. The hook will not be called for instructions with more than one explicit define. This commit adds a register parameter to the describeLoadedValue() hook, and invokes the hook for all registers in the worklist. This will allow us to for example describe instructions which produce more than two parameters' values; e.g. Hexagon's various combine instructions. This also fixes situations in our downstream target where we may pass smaller parameters in the high part of a register. If such a parameter's value is produced by a larger copy instruction, we can't describe the call site value using the super-register, and we instead need to know which sub-register that should be used. This also allows us to handle cases like this: $ebx = [...] $rdi = MOVSX64rr32 $ebx $esi = MOV32rr $edi CALL64pcrel32 @call The hook will first be invoked for the MOV32rr instruction, which will say that @call's second parameter (passed in $esi) is described by $edi. As $edi is not preserved it will be added to the worklist. When we get to the MOVSX64rr32 instruction, we need to describe two values; the sign-extended value of $ebx -> $rdi for the first parameter, and $ebx -> $edi for the second parameter, which is now possible. This commit modifies the dbgcall-site-lea-interpretation.mir test case. In the test case, the values of some 32-bit parameters were produced with LEA64r. Perhaps we can in general cases handle such by emitting expressions that AND out the lower 32-bits, but I have not been able to land in a case where a LEA64r is used for a 32-bit parameter instead of LEA64_32 from C code. I have not found a case where it would be useful to describe parameters using implicit defines, so in this patch the hook is still only invoked for explicit defines of forwarding registers. Reviewers: djtodoro, NikolaPrica, aprantl, vsk Reviewed By: djtodoro, vsk Subscribers: ormris, hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D70431
* Revert "[DebugInfo] Make describeLoadedValue() reg aware"David Stenberg2019-12-091-17/+8
| | | | | This reverts commit 3cd93a4efcdeabeb20cb7bec9fbddcb540d337a1. I'll recommit with a well-formatted arcanist commit message.
* [DebugInfo] Make describeLoadedValue() reg awareDavid Stenberg2019-12-091-8/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the describeLoadedValue() hook is assumed to describe the value of the instruction's first explicit define. The hook will not be called for instructions with more than one explicit define. This commit adds a register parameter to the describeLoadedValue() hook, and invokes the hook for all registers in the worklist. This will allow us to for example describe instructions which produce more than two parameters' values; e.g. Hexagon's various combine instructions. This also fixes a case in our downstream target where we may pass smaller parameters in the high part of a register. If such a parameter's value is produced by a larger copy instruction, we can't describe the call site value using the super-register, and we instead need to know which sub-register that should be used. This also allows us to handle cases like this: $ebx = [...] $rdi = MOVSX64rr32 $ebx $esi = MOV32rr $edi CALL64pcrel32 @call The hook will first be invoked for the MOV32rr instruction, which will say that @call's second parameter (passed in $esi) is described by $edi. As $edi is not preserved it will be added to the worklist. When we get to the MOVSX64rr32 instruction, we need to describe two values; the sign-extended value of $ebx -> $rdi for the first parameter, and $ebx -> $edi for the second parameter, which is now possible. This commit modifies the dbgcall-site-lea-interpretation.mir test case. In the test case, the values of some 32-bit parameters were produced with LEA64r. Perhaps we can in general cases handle such by emitting expressions that AND out the lower 32-bits, but I have not been able to land in a case where a LEA64r is used for a 32-bit parameter instead of LEA64_32 from C code. I have not found a case where it would be useful to describe parameters using implicit defines, so in this patch the hook is still only invoked for explicit defines of forwarding registers.
* [FPEnv] Constrained FCmp intrinsicsUlrich Weigand2019-12-075-1/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds support for constrained floating-point comparison intrinsics. Specifically, we add: declare <ty2> @llvm.experimental.constrained.fcmp(<type> <op1>, <type> <op2>, metadata <condition code>, metadata <exception behavior>) declare <ty2> @llvm.experimental.constrained.fcmps(<type> <op1>, <type> <op2>, metadata <condition code>, metadata <exception behavior>) The first variant implements an IEEE "quiet" comparison (i.e. we only get an invalid FP exception if either argument is a SNaN), while the second variant implements an IEEE "signaling" comparison (i.e. we get an invalid FP exception if either argument is any NaN). The condition code is implemented as a metadata string. The same set of predicates as for the fcmp instruction is supported (except for the "true" and "false" predicates). These new intrinsics are mapped by SelectionDAG codegen onto two new ISD opcodes, ISD::STRICT_FSETCC and ISD::STRICT_FSETCCS, again representing quiet vs. signaling comparison operations. Otherwise those nodes look like SETCC nodes, with an additional chain argument and result as usual for strict FP nodes. The patch includes support for the common legalization operations for those nodes. The patch also includes full SystemZ back-end support for the new ISD nodes, mapping them to all available SystemZ instruction to fully implement strict semantics (scalar and vector). Differential Revision: https://reviews.llvm.org/D69281
* [CommandLine] Add callbacks to OptionsDon Hinton2019-12-061-0/+57
| | | | | | | | | | | | | | | | | | | | | | Summary: Add a new cl::callback attribute to Option. This attribute specifies a callback function that is called when an option is seen, and can be used to set other options, as in option A implies option B. If the option is a `cl::list`, and `cl::CommaSeparated` is also specified, the callback will fire once for each value. This could be used to validate combinations or selectively set other options. Reviewers: beanz, thomasfinch, MaskRay, thopre, serge-sans-paille Reviewed By: beanz Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70620
* [WebAssebmly][MC] Support .import_name/.import_field asm directivesSam Clegg2019-12-061-0/+1
| | | | | | | | Convert the MC test to use asm rather than bitcode. This is a precursor to https://reviews.llvm.org/D70520. Differential Revision: https://reviews.llvm.org/D70877
* [MC] Rewrite tablegen for printInstrAlias to comiple faster, NFCReid Kleckner2019-12-061-0/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before this change, the *InstPrinter.cpp files of each target where some of the slowest objects to compile in all of LLVM. See this snippet produced by ClangBuildAnalyzer: https://reviews.llvm.org/P8171$96 Search for "InstPrinter", and see that it shows up in a few places. Tablegen was emitting a large switch containing a sequence of operand checks, each of which created many conditions and many BBs. Register allocation and jump threading both did not scale well with such a large repetitive sequence of basic blocks. So, this change essentially turns those control flow structures into data. The previous structure looked like: switch (Opc) { case TGT::ADD: // check alias 1 if (MI->getOperandCount() == N && // check num opnds MI->getOperand(0).isReg() && // check opnd 0 ... MI->getOperand(1).isImm() && // check opnd 1 AsmString = "foo"; break; } // check alias 2 if (...) ... return false; The new structure looks like: OpToPatterns: Sorted table of opcodes mapping to pattern indices. \-> Patterns: List of patterns. Previous table points to subrange of patterns to match. \-> Conds: The if conditions above encoded as a kind and 32-bit value. See MCInstPrinter.cpp for the details of how the new data structures are interpreted. Here are some before and after metrics. Time to compile AArch64InstPrinter.cpp: 0m29.062s vs. 0m2.203s size of the obj: 3.9M vs. 676K size of clang.exe: 97M vs. 96M I have not benchmarked disassembly performance, but typically disassemblers are bottlenecked on IO and string processing, not alias matching, so I'm not sure it's interesting enough to be worth doing. Reviewers: RKSimon, andreadb, xbolva00, craig.topper Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D70650
* Revert "[PGO][PGSO] Instrument the code gen / target passes."Hiroshi Yamauchi2019-12-062-12/+0
| | | | | | This reverts commit 9a0b5e14075a1f42a72eedb66fd4fde7985d37ac. This seems to break buildbots.
* [PGO][PGSO] Instrument the code gen / target passes.Hiroshi Yamauchi2019-12-062-0/+12
| | | | | | | | | | | | | | | | | | Summary: Split off of D67120. Add the profile guided size optimization instrumentation / queries in the code gen or target passes. This doesn't enable the size optimizations in those passes yet as they are currently disabled in shouldOptimizeForSize (for non-IR pass queries). Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71072
* [AArch64][SVE] Implement integer compare intrinsicsCullen Rhodes2019-12-062-1/+30
| | | | | | | | | | | | | | | | | | | | | | | | Summary: Adds intrinsics for the following: * cmphs, cmphi * cmpge, cmpgt * cmpeq, cmpne * cmplt, cmple * cmplo, cmpls Includes a minor change to `TLI.getMemValueType` that fixes a crash due to the scalable flag being dropped. Reviewers: sdesmalen, efriedma, rengolin, rovka, dancgr, huntergr Reviewed By: efriedma Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70889
* [Dsymutil][NFC] Move NonRelocatableStringpool into common CodeGen folder.Alexey Lapshin2019-12-061-0/+83
| | | | | | | That refactoring moves NonRelocatableStringpool into common CodeGen folder. So that NonRelocatableStringpool could be used not only inside dsymutil. Differential Revision: https://reviews.llvm.org/D71068
* [JITLink] Use Blocks rather than Symbols for SectionRange.Lang Hames2019-12-051-14/+13
| | | | This ensures that anonymous blocks are included in the section range.
* [JITLink] Remove the Section::symbols_empty() method.Lang Hames2019-12-051-4/+1
| | | | llvm::empty(Sec.symbols()) can be used instead.
* Add lookup functions for efficient lookups of addresses when using ↵Greg Clayton2019-12-056-17/+217
| | | | | | | | | | | | | | | | | GsymReader classes. Summary: Lookup functions are designed to not fully decode a FunctionInfo, LineTable or InlineInfo, they decode only what is needed into a LookupResult object. This allows lookups to avoid costly memory allocations and avoid parsing large amounts of information one a suitable match is found. LookupResult objects contain the address that was looked up, the concrete function address range, the name of the concrete function, and a list of source locations. One for each inline function, and one for the concrete function. This allows one address to turn into multiple frames and improves the signal you get when symbolicating addresses in GSYM files. Reviewers: labath, aprantl Subscribers: mgorny, hiraditya, llvm-commits, lldb-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70993
* [AutoFDO] Properly merge context-sensitive profile of inlinee back to ↵Wenlei He2019-12-052-1/+13
| | | | | | | | | | | | | | | | | | | | outlined function Summary: When sample profile loader decides not to inline a previously inlined call-site, we adjust the profile of outlined function simply by scaling up its profile counts by call-site count. This means the context-sensitive profile of that inlined instance will be thrown away. This commit try to keep context-sensitive profile for such cases: - Instead of scaling outlined function's profile, we now properly merge the FunctionSamples of inlined instance into outlined function, including all recursively inlined profile. - Instead of adjusting the profile for negative inline decision at the end of the sample profile loader pass, we do the profile merge right after processing each function. This change paired with top-down ordering of annotation/inline-replay (a separate diff) will make sure we recursively merge profile back before the profile is used for annotation and inline replay. A new switch -sample-profile-merge-inlinee is added to enable the new profile merge for tuning. It should be the default behavior eventually. Reviewers: wmi, davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70653
* [IR] Move ctor in the NDEBUG branchFangrui Song2019-12-051-2/+1
|
* [IR] Add a default copy constructor for -Wdeprecated-copyFangrui Song2019-12-051-0/+2
|
* [GlobalISel] Localizer: Allow targets not to run the pass conditionallyVolkan Keles2019-12-051-0/+5
| | | | | | | | | | | | | | | | | | | Summary: Previously, it was not possible to skip running the localizer pass conditionally. This patch adds an input function to the pass which decides if the pass should run on the given MachineFunction or not. No test case as there is no upstream target needs this functionality. Reviewers: qcolombet Reviewed By: qcolombet Subscribers: rovka, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71038
OpenPOWER on IntegriCloud