path: root/llvm/lib/CodeGen
Commit message | Author | Age | Files | Lines
...
* Adding inline comments to code view type record directives for better readability | Nilanjana Basu | 2019-07-17 | 1 | -2/+15
    llvm-svn: 366372
* [PEI] Don't re-allocate a pre-allocated stack protector slot | Francis Visoiu Mistrih | 2019-07-17 | 2 | -2/+27
    The LocalStackSlotPass pre-allocates a stack protector and makes sure that it comes before the local variables on the stack. We need to make sure that later during PEI we don't re-allocate a new stack protector slot. If that happens, the new stack protector slot will end up being **after** the local variables that it should be protecting.
    Therefore, we would have two slots assigned for two different stack protectors, one at the top of the stack, and one at the bottom. Since PEI will overwrite the assigned slot for the stack protector, the load that is used to compare the value of the stack protector will use the slot assigned by PEI, which is wrong.
    For this, we need to check if the object is pre-allocated, and re-use that pre-allocated slot.
    Differential Revision: https://reviews.llvm.org/D64757
    llvm-svn: 366371
* [CodeGen][NFC] Simplify checks for stack protector index checking | Francis Visoiu Mistrih | 2019-07-17 | 2 | -13/+11
    Use `hasStackProtectorIndex()` instead of `getStackProtectorIndex() >= 0`.
    llvm-svn: 366369
* GlobalISel: Handle widenScalar of arbitrary G_MERGE_VALUES sources | Matt Arsenault | 2019-07-17 | 2 | -48/+87
    Extract the sources to the GCD of the original size and target size, padding with implicit_def as necessary.
    Also fix the case where the requested source type is wider than the original result type. This was ignoring the type, and just using the destination. Do the operation in the requested type and truncate back.
    llvm-svn: 366367
* GlobalISel: Handle more cases for widenScalar of G_MERGE_VALUES | Matt Arsenault | 2019-07-17 | 1 | -4/+23
    Use an anyext to the requested type for the leftover operand to produce a slightly wider type, and then truncate the final merge.
    I have another implementation almost ready which handles arbitrary widens, but I think it produces worse code in this example (which I think is 90% due to not folding redundant copies or folding out implicit_def users), so I wanted to add this as a baseline first.
    llvm-svn: 366366
* Basic codegen for MTE stack tagging. | Evgeniy Stepanov | 2019-07-17 | 1 | -0/+13
    Implement IR intrinsics for stack tagging. Generated code is very unoptimized for now.
    Two special intrinsics, llvm.aarch64.irg.sp and llvm.aarch64.tagp are used to implement a tagged stack frame pointer in a virtual register.
    Differential Revision: https://reviews.llvm.org/D64172
    llvm-svn: 366360
* [AsmPrinter] Make the encoding of call sites in .gcc_except_table configurable and use for RISC-V | Alex Bradbury | 2019-07-17 | 3 | -6/+28
    The original behavior was to always emit the offsets to each call site in the call site table as uleb128 values. However, on some architectures (e.g. RISC-V) these uleb128 offsets into the code cannot always be resolved until link time (because relaxation will invalidate any calculated offsets), and there are no appropriate relocations for uleb128 values. As a consequence it needs to be possible to specify an alternative.
    This also switches RISC-V to use DW_EH_PE_udata4 for call site encodings in .gcc_except_table.
    Differential Revision: https://reviews.llvm.org/D63415
    Patch by Edward Jones.
    llvm-svn: 366329
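To illustrate why a variable-length encoding is awkward when offsets can still change at link time, here is a minimal sketch that is not taken from the patch. It encodes a value as ULEB128 and as a fixed 4-byte field. The function names `encodeULEB128` and `encodeUData4` are chosen for this example, and little-endian byte order is an assumption.

```cpp
#include <cstdint>
#include <vector>

// Encode Value as unsigned LEB128: 7 payload bits per byte, low bits first,
// with the top bit set on every byte except the last.
std::vector<uint8_t> encodeULEB128(uint64_t Value) {
  std::vector<uint8_t> Bytes;
  do {
    uint8_t Byte = Value & 0x7f;
    Value >>= 7;
    if (Value != 0)
      Byte |= 0x80; // more bytes follow
    Bytes.push_back(Byte);
  } while (Value != 0);
  return Bytes;
}

// Encode Value as a fixed-width 4-byte field (udata4-style).
// Little-endian byte order is an assumption of this sketch.
std::vector<uint8_t> encodeUData4(uint32_t Value) {
  std::vector<uint8_t> Bytes;
  for (int I = 0; I < 4; ++I)
    Bytes.push_back(static_cast<uint8_t>(Value >> (8 * I)));
  return Bytes;
}
```

An offset of 128 takes two ULEB128 bytes while 127 takes one, so if relaxation shrinks the code and an offset drops from 128 to 127, the size of the table entry itself changes. The fixed-width field is always four bytes, whatever the value.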
* [RISCV] Set correct encodings for DWARF exception handling | Alex Bradbury | 2019-07-17 | 1 | -0/+8
    This patch sets correct encodings for DWARF exception handling for RISC-V (other than call site encoding, which must be udata4 rather than uleb128 and is handled by D63415).
    This has the same intent as D63409, except this version matches GCC/binutils behaviour which uses the same encodings regardless of PIC/non-PIC and medlow/medany code model.
    llvm-svn: 366327
* [MIPS GlobalISel] ClampScalar and select pointer G_ICMP | Petar Avramovic | 2019-07-17 | 1 | -0/+36
    Add narrowScalar to half of original size for G_ICMP. ClampScalar G_ICMP's operands 2 and 3 to s32. Select G_ICMP for pointers for MIPS32. Pointer compare is the same as for integers; it is enough to declare them as a legal type.
    Differential Revision: https://reviews.llvm.org/D64856
    llvm-svn: 366317
* GlobalISel: Add overload of handleAssignments with CCState | Matt Arsenault | 2019-07-16 | 1 | -2/+11
    AMDGPU needs to allocate special argument registers separately from the user function argument list, so needs direct control over the CCState.
    The ArgLocs argument is only really necessary because CCState doesn't allow access to it.
    llvm-svn: 366279
* DWARF: Skip zero column for inline call sites | David Blaikie | 2019-07-16 | 1 | -1/+2
    D64033 <https://reviews.llvm.org/D64033> added DW_AT_call_column for inline sites. However, that change wasn't aware of "-gno-column-info". To avoid adding column info when "-gno-column-info" is used, DW_AT_call_column is now only added when we have a non-zero column (when "-gno-column-info" is used, the column will be zero).
    Patch by Wenlei He!
    Differential Revision: https://reviews.llvm.org/D64784
    llvm-svn: 366264
* [Strict FP] Allow more relaxed scheduling | Ulrich Weigand | 2019-07-16 | 1 | -10/+21
    Reimplement scheduling constraints for strict FP instructions in ScheduleDAGInstrs::buildSchedGraph to allow for more relaxed scheduling. Specifically, allow one strict FP instruction to be scheduled across another, as long as it is not moved across any global barrier.
    Differential Revision: https://reviews.llvm.org/D64412
    Reviewed By: cameron.mcinally
    llvm-svn: 366222
* [Remarks][NFC] Combine ParserFormat and SerializerFormat | Francis Visoiu Mistrih | 2019-07-16 | 1 | -0/+1
    It's useless to have both.
    llvm-svn: 366216
* [DAGCombiner] fold (addcarry (xor a, -1), b, c) -> (subcarry b, a, !c) and flip carry. | Amaury Sechet | 2019-07-16 | 1 | -16/+28
    Summary: As per title. DAGCombiner only matches the special case where b = 0; this patch extends the pattern to match any value of b.
    Depends on D57302
    Reviewers: hfinkel, RKSimon, craig.topper
    Subscribers: llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D59208
    llvm-svn: 366214
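The identity behind this fold can be checked exhaustively on a small bit width. The sketch below is a standalone 8-bit model, not DAGCombiner code. `addCarry8`, `subCarry8` and `foldHoldsForAll8BitInputs` are names invented for this example, and "carry" on the subtract side is modelled as a borrow-out.

```cpp
#include <cstdint>

struct CarryResult {
  uint8_t Value; // low 8 bits of the result
  bool Carry;    // carry-out (add) or borrow-out (sub)
};

// A + B + CarryIn on 8 bits.
CarryResult addCarry8(uint8_t A, uint8_t B, bool CarryIn) {
  unsigned Wide = unsigned(A) + unsigned(B) + (CarryIn ? 1u : 0u);
  return {static_cast<uint8_t>(Wide), Wide > 0xff};
}

// A - B - BorrowIn on 8 bits.
CarryResult subCarry8(uint8_t A, uint8_t B, bool BorrowIn) {
  int Wide = int(A) - int(B) - (BorrowIn ? 1 : 0);
  return {static_cast<uint8_t>(Wide), Wide < 0};
}

// Check addcarry(~a, b, c) against subcarry(b, a, !c) for every input:
// the values must match and the carry-out must be the flipped borrow-out.
bool foldHoldsForAll8BitInputs() {
  for (unsigned A = 0; A < 256; ++A)
    for (unsigned B = 0; B < 256; ++B)
      for (int C = 0; C < 2; ++C) {
        CarryResult Add = addCarry8(static_cast<uint8_t>(~A), B, C != 0);
        CarryResult Sub = subCarry8(B, A, C == 0);
        if (Add.Value != Sub.Value || Add.Carry == Sub.Carry)
          return false;
      }
  return true;
}
```

The reasoning is that `~a` equals `-a - 1` in two's complement, so `~a + b + c` equals `b - a - (1 - c)`. The add carries out exactly when that difference is non-negative, which is when the subtract does not borrow.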
* Fix parameter name comments using clang-tidy. NFC. | Rui Ueyama | 2019-07-16 | 9 | -24/+24
    This patch applies clang-tidy's bugprone-argument-comment tool to LLVM, clang and lld source trees. Here is how I created this patch:
        $ git clone https://github.com/llvm/llvm-project.git
        $ cd llvm-project
        $ mkdir build
        $ cd build
        $ cmake -GNinja -DCMAKE_BUILD_TYPE=Debug \
            -DLLVM_ENABLE_PROJECTS='clang;lld;clang-tools-extra' \
            -DCMAKE_EXPORT_COMPILE_COMMANDS=On -DLLVM_ENABLE_LLD=On \
            -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ ../llvm
        $ ninja
        $ parallel clang-tidy -checks='-*,bugprone-argument-comment' \
            -config='{CheckOptions: [{key: StrictMode, value: 1}]}' -fix \
            ::: ../llvm/lib/**/*.{cpp,h} ../clang/lib/**/*.{cpp,h} ../lld/**/*.{cpp,h}
    llvm-svn: 366177
* [WebAssembly] Rename except_ref type to exnref | Heejin Ahn | 2019-07-15 | 1 | -1/+1
    Summary: We agreed to rename `except_ref` to `exnref` for consistency with other reference types in https://github.com/WebAssembly/exception-handling/issues/79. This also renames WebAssemblyInstrExceptRef.td to WebAssemblyInstrRef.td in order to use the file for other reference types in future.
    Reviewers: dschuff
    Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, jfb, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D64703
    llvm-svn: 366145
* GlobalISel: Implement narrowScalar for vector extract/insert indexes | Matt Arsenault | 2019-07-15 | 1 | -0/+11
    llvm-svn: 366113
* [PowerPC] Support fp128 libcalls | Fangrui Song | 2019-07-15 | 1 | -0/+28
    On PowerPC, IEEE 754 quadruple-precision libcall names use "kf" instead of "tf". In libgcc, libgcc/config/rs6000/float128-sed converts TF names to KF names. This patch implements its 24 substitution rules.
    Reviewed By: hfinkel
    Differential Revision: https://reviews.llvm.org/D64282
    llvm-svn: 366039
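As a rough picture of what such a renaming looks like, the sketch below maps a handful of "tf" libcall names to their "kf" counterparts. It is only an illustration: the four pairs are a small subset picked for the example, not the 24 rules the commit refers to, and `toPPCFloat128Libcall` is a hypothetical helper rather than an LLVM function.

```cpp
#include <map>
#include <string>

// Return the PowerPC "kf" name for a known "tf" libcall, or the name
// unchanged if it is not in this (deliberately incomplete) table.
std::string toPPCFloat128Libcall(const std::string &Name) {
  static const std::map<std::string, std::string> Renames = {
      {"__addtf3", "__addkf3"},
      {"__subtf3", "__subkf3"},
      {"__multf3", "__mulkf3"},
      {"__divtf3", "__divkf3"},
  };
  auto It = Renames.find(Name);
  return It == Renames.end() ? Name : It->second;
}
```

An explicit table is used here instead of a blanket "tf" to "kf" text substitution, because the commit describes a fixed list of rules rather than a general pattern.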
* [DebugInfo] Add column info for inline sites | Jonas Devlieghere | 2019-07-12 | 1 | -0/+1
    The column field is missing for all inline sites; currently it's always zero. This change populates the DW_AT_call_column field for inline sites. Test case modified to cover this change.
    Patch by: Wenlei He
    Differential revision: https://reviews.llvm.org/D64033
    llvm-svn: 365945
* Delete dead stores | Fangrui Song | 2019-07-12 | 4 | -15/+3
    llvm-svn: 365903
* [DAGCombine] narrowExtractedVectorBinOp - wrap subvector extraction in helper. NFCI. | Simon Pilgrim | 2019-07-12 | 1 | -9/+11
    First step towards supporting 'free' subvector extractions other than concat_vectors.
    llvm-svn: 365896
* Revert "[DwarfDebug] Dump call site debug info" | Djordje Todorovic | 2019-07-12 | 10 | -393/+43
    A build failure was found on the SystemZ platform.
    This reverts commit 9e7e73578e54cd22b3c7af4b54274d743b6607cc.
    llvm-svn: 365886
* [MachinePipeliner] Fix order for nodes with Anti dependence in same cycle | Jinsong Ji | 2019-07-12 | 1 | -0/+8
    Summary: Problem exposed in PowerPC functional testing. We did not consider Anti dependence for nodes in the same cycle, so we may end up generating bad machine code. E.g. the reduced test won't verify:
        *** Bad machine code: Using an undefined physical register ***
        - function: lame_encode_buffer_interleaved
        - basic block: %bb.4 (0x4bde4e12928)
        - instruction: %29:gprc = ADDZE %27:gprc, implicit-def dead $carry, implicit $carry
        - operand 3: implicit $carry
    Reviewers: bcahoon, kparzysz, hfinkel
    Subscribers: MaskRay, wuzish, nemanjai, hiraditya, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D64192
    llvm-svn: 365859
* [DAGCombine] narrowInsertExtractVectorBinOp - add CONCAT_VECTORS support | Simon Pilgrim | 2019-07-11 | 1 | -4/+14
    We already split extract_subvector(binop(insert_subvector(v,x),insert_subvector(w,y))) -> binop(x,y).
    This patch adds support for extract_subvector(binop(concat_vectors(),concat_vectors())) cases as well.
    In particular this means we don't have to wait for X86 lowering to convert concat_vectors to insert_subvector chains, which helps avoid some cases where demandedelts/combine calls occur too late to split large vector ops.
    The fast-isel-store.ll load folding regression is annoying but I don't think is that critical.
    Differential Revision: https://reviews.llvm.org/D63653
    llvm-svn: 365785
* RegUsageInfoCollector: Skip calling conventions I missed before | Matt Arsenault | 2019-07-11 | 1 | -0/+3
    llvm-svn: 365784
* GlobalISel: Use Register | Matt Arsenault | 2019-07-11 | 1 | -5/+5
    llvm-svn: 365780
* OpaquePtr: switch to GlobalValue::getValueType in a few places. NFC. | Tim Northover | 2019-07-11 | 1 | -2/+2
    llvm-svn: 365770
* OpaquePtr: use byval accessor instead of inspecting pointer type. NFC. | Tim Northover | 2019-07-11 | 1 | -3/+2
    The accessor can deal with both "byval(ty)" and "ty* byval" forms seamlessly.
    llvm-svn: 365769
* [SDAG] commute setcc operands to match a subtract | Sanjay Patel | 2019-07-10 | 1 | -0/+11
    If we have:
        R = sub X, Y
        P = cmp Y, X
    ...then flipping the operands in the compare instruction can allow using a subtract that sets compare flags.
    Motivated by diffs in D58875 - not sure if this changes anything there, but this seems like a good thing independent of that.
    There's a more involved version of this transform already in IR (in instcombine although that seems misplaced to me) - see "swapMayExposeCSEOpportunities()".
    Differential Revision: https://reviews.llvm.org/D63958
    llvm-svn: 365711
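Flipping the operands of a compare is only valid if the predicate is swapped along with them. The following sketch models that rule on plain integers. The `Pred` enum and the helper names are invented for this example and do not correspond to the SelectionDAG condition codes.

```cpp
enum class Pred { EQ, NE, LT, LE, GT, GE };

// Predicate to use after swapping the two compare operands.
Pred swapOperands(Pred P) {
  switch (P) {
  case Pred::LT: return Pred::GT;
  case Pred::LE: return Pred::GE;
  case Pred::GT: return Pred::LT;
  case Pred::GE: return Pred::LE;
  case Pred::EQ:
  case Pred::NE:
    break; // symmetric predicates are unchanged
  }
  return P;
}

// Evaluate "L <pred> R" on signed integers.
bool evaluate(Pred P, int L, int R) {
  switch (P) {
  case Pred::EQ: return L == R;
  case Pred::NE: return L != R;
  case Pred::LT: return L < R;
  case Pred::LE: return L <= R;
  case Pred::GT: return L > R;
  case Pred::GE: return L >= R;
  }
  return false; // not reached for valid predicates
}

// "cmp Y, X" with P must agree with "cmp X, Y" with the swapped predicate.
bool swapPreservesResult() {
  const Pred All[] = {Pred::EQ, Pred::NE, Pred::LT,
                      Pred::LE, Pred::GT, Pred::GE};
  for (Pred P : All)
    for (int X = -3; X <= 3; ++X)
      for (int Y = -3; Y <= 3; ++Y)
        if (evaluate(P, Y, X) != evaluate(swapOperands(P), X, Y))
          return false;
  return true;
}
```

After the swap the compare has the same operand order as `sub X, Y`, which is what lets a flag-setting subtract serve both uses.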
* [AArch64][GlobalISel] Optimize compare and branch cases with G_INTTOPTR and unknown values. | Amara Emerson | 2019-07-10 | 1 | -0/+3
    Since we have distinct types for pointers and scalars, G_INTTOPTRs can sometimes obstruct attempts to find constant source values. These usually come about when we try to do some kind of null pointer check. Teaching getConstantVRegValWithLookThrough about this operation allows the CBZ/CBNZ optimization to catch more cases.
    This change also improves the case where we can't find a constant source at all. Previously we would emit a cmp, cset and tbnz for that. Now we try to just emit a cmp and conditional branch, saving an instruction.
    The cumulative code size improvement of this change plus D64354 is 5.5% geomean on arm64 CTMark -O0.
    Differential Revision: https://reviews.llvm.org/D64377
    llvm-svn: 365690
* Move three folds for FADD, FSUB and FMUL in the DAG combiner away from Unsafe to more aligned checks that reflect context | Michael Berg | 2019-07-10 | 1 | -4/+4
    Summary: Unsafe does not map well alone for each of these three cases as it is missing NoNan context when accessed directly with clang. I have migrated the fold guards to reflect the expectations of handling NaN and zero contexts directly (NoNan, NSZ), along with some tests. Unsafe does include NSZ, however there is already precedent for using the target option directly to reflect that context.
    Reviewers: spatel, wristow, hfinkel, craig.topper, arsenm
    Reviewed By: arsenm
    Subscribers: michele.scandale, wdng, javed.absar
    Differential Revision: https://reviews.llvm.org/D64450
    llvm-svn: 365679
* [TargetLowering] support BlockAddress as "i" inline asm constraint | Nick Desaulniers | 2019-07-10 | 1 | -0/+7
    Summary: This allows passing address of labels to inline assembly "i" input constraints.
    Fixes pr/42502.
    Reviewers: ostannard
    Reviewed By: ostannard
    Subscribers: void, echristo, nathanchance, ostannard, javed.absar, hiraditya, llvm-commits, srhines
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D64167
    llvm-svn: 365664
* GlobalISel: Legalization for G_FMINNUM/G_FMAXNUM | Matt Arsenault | 2019-07-10 | 2 | -0/+71
    llvm-svn: 365658
* GlobalISel: Define the full family of FP min/max instructions | Matt Arsenault | 2019-07-10 | 1 | -0/+8
    llvm-svn: 365657
* [DAGCombine] visitINSERT_SUBVECTOR - use uint64_t subvector index. NFCI. | Simon Pilgrim | 2019-07-10 | 1 | -1/+1
    Keep the uint64_t type from getZExtValue() to stop truncation/extension overflow warnings in MSVC in subvector index math.
    llvm-svn: 365621
* Fix const/non-const lambda return type warning. NFCI. | Simon Pilgrim | 2019-07-10 | 1 | -1/+1
    llvm-svn: 365613
* GlobalISel: Implement lower for G_FCOPYSIGN | Matt Arsenault | 2019-07-09 | 1 | -0/+50
    In SelectionDAG AMDGPU treated these as legal, but this was mostly because the bitcasts required for FP types were painful. Theoretically the bitpattern should eventually match to bfi, so don't bother trying to get the patterns to import.
    llvm-svn: 365583
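For readers unfamiliar with the operation, copysign can be expressed purely with integer bit operations: keep the magnitude bits of one operand and take the sign bit of the other. The sketch below shows that generic formulation for 32-bit floats. It assumes IEEE 754 binary32 `float`, `copySignViaBits` is a name made up for this example, and it is not necessarily the exact instruction sequence the patch emits.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(uint32_t),
              "sketch assumes a 32-bit float");

// Return Mag with its sign bit replaced by the sign bit of Sign.
float copySignViaBits(float Mag, float Sign) {
  uint32_t MagBits, SignBits;
  std::memcpy(&MagBits, &Mag, sizeof(MagBits));    // bitcast float -> integer
  std::memcpy(&SignBits, &Sign, sizeof(SignBits));

  const uint32_t SignMask = 0x80000000u;
  uint32_t ResultBits = (MagBits & ~SignMask) | (SignBits & SignMask);

  float Result;
  std::memcpy(&Result, &ResultBits, sizeof(Result)); // bitcast back
  return Result;
}
```

The and/and/or combination is a bitfield insert of one bit, which fits the commit's remark that the pattern should eventually match a bfi-style instruction.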
* GlobalISel: Combine unmerge of merge with intermediate cast | Matt Arsenault | 2019-07-09 | 1 | -3/+9
    This eliminates some illegal intermediate vectors when operations are scalarized.
    llvm-svn: 365566
* [X86][AMDGPU][DAGCombiner] Move call to allowsMemoryAccess into isLoadBitCastBeneficial/isStoreBitCastBeneficial to allow X86 to bypass it | Craig Topper | 2019-07-09 | 1 | -15/+8
    Basically the problem is that X86 doesn't set the Fast flag from allowsMemoryAccess on certain CPUs due to slow unaligned memory subtarget features. This prevents bitcasts from being folded into loads and stores. But all vector loads and stores of the same width are the same cost on X86.
    This patch merges the allowsMemoryAccess call into isLoadBitCastBeneficial to allow X86 to skip it.
    Differential Revision: https://reviews.llvm.org/D64295
    llvm-svn: 365549
* Revert "[HardwareLoops] NFC - move hardware loop checking code to isHardwareLoopProfitable()" | Jinsong Ji | 2019-07-09 | 1 | -12/+33
    This reverts commit d95557306585404893d610784edb3e32f1bfce18.
    llvm-svn: 365520
* [AArch64][GlobalISel] Optimize conditional branches followed by unconditional branches | Amara Emerson | 2019-07-09 | 1 | -0/+62
    If we have an icmp->brcond->br sequence where the brcond just branches to the next block jumping over the br, while the br takes the false edge, then we can modify the conditional branch to jump to the br's target while inverting the condition of the incoming icmp. This means we can eliminate the br as an unconditional branch to the fallthrough block.
    Differential Revision: https://reviews.llvm.org/D64354
    llvm-svn: 365510
* [DAGCombine] LoadedSlice - keep getOffsetFromBase() uint64_t offset. NFCI. | Simon Pilgrim | 2019-07-09 | 1 | -1/+1
    Keep the uint64_t type from getOffsetFromBase() to stop truncation/extension overflow warnings in MSVC in alignment math.
    llvm-svn: 365504
* [HardwareLoops] NFC - move hardware loop checking code to isHardwareLoopProfitable() | Chen Zheng | 2019-07-09 | 1 | -33/+12
    Differential Revision: https://reviews.llvm.org/D64197
    llvm-svn: 365497
* [MIPS GlobalISel] Register bank select for G_PHI. Select i64 phi | Petar Avramovic | 2019-07-09 | 1 | -0/+28
    Select gprb or fprb when def/use register operand of G_PHI is used/defined by either: copy to/from physical register or instruction with only one mapping available for that use/def operand.
    Integer s64 phi is handled with narrowScalar when mapping is applied, produced artifacts are combined away. Manually set gprb to all register operands of instructions created during narrowScalar.
    Differential Revision: https://reviews.llvm.org/D64351
    llvm-svn: 365494
* [CodeGen] AccelTable - remove non-constexpr (MSVC) Atom defs | Simon Pilgrim | 2019-07-09 | 1 | -20/+0
    Now that we've dropped VS2015 support (D64326) we can enable the constexpr variables on MSVC builds as VS2017+ correctly handles them
    llvm-svn: 365477
* [NFC][AsmPrinter] Fix the formatting for the rL365467 | Djordje Todorovic | 2019-07-09 | 2 | -23/+21
    In addition, fix the build failure for the 'unused' variable. The variable was used inside the 'LLVM_DEBUG()'.
    llvm-svn: 365469
* OpaquePtr: add Type parameter to Loads analysis API. | Tim Northover | 2019-07-09 | 1 | -2/+4
    This makes the functions in Loads.h require a type to be specified independently of the pointer Value so that when pointers have no structure other than address-space, it can still do its job.
    Most callers had an obvious memory operation handy to provide this type, but SROA and ArgumentPromotion were doing more complicated analysis. They get updated to merge the properties of the various instructions they were considering.
    llvm-svn: 365468
* [DwarfDebug] Dump call site debug info | Djordje Todorovic | 2019-07-09 | 10 | -38/+390
    Dump the DWARF information about call sites and call site parameters into debug info sections.
    The patch also provides an interface for the interpretation of instructions that could load values of call site parameters in order to generate DWARF about the call site parameters.
    ([13/13] Introduce the debug entry values.)
    Co-authored-by: Ananth Sowda <asowda@cisco.com>
    Co-authored-by: Nikola Prica <nikola.prica@rt-rk.com>
    Co-authored-by: Ivan Baev <ibaev@cisco.com>
    Differential Revision: https://reviews.llvm.org/D60716
    llvm-svn: 365467
* [SelectionDAG] Simplify some calls to getSetCCResultType. NFC | Bjorn Pettersson | 2019-07-09 | 3 | -8/+4
    DAGTypeLegalizer and SelectionDAGLegalize have helper functions wrapping the call to TLI.getSetCCResultType(...). Use those helpers in more places.
    llvm-svn: 365456
* [LegalizeTypes] Fix saturation bug for smul.fix.sat | Bjorn Pettersson | 2019-07-09 | 1 | -3/+3
    Summary: Make sure we use SETGE instead of SETGT when checking if the sign bit is zero at SMULFIXSAT expansion.
    The faulty expansion occurred when doing "expand" of SMULFIXSAT and the scale was exactly matching the size of the smaller type. For example doing i64 Z = SMULFIXSAT X, Y, 32 and expanding X/Y/Z into using two i32 values.
    The problem was that we sometimes did not saturate to min when overflowing.
    Here is an example using Q3.4 numbers. Consider that we are multiplying X and Y:
        X = 0x80 (-8.0 as Q3.4)
        Y = 0x20 (2.0 as Q3.4)
    To avoid loss of precision we do a widening multiplication, getting a 16 bit result:
        Z = 0xF000 (-16.0 as Q7.8)
    To detect negative overflow we should check if the five most significant bits in Z are less than -1. Assume that we name the 4 most significant bits as HH and the next 4 bits as HL. Then we can do the check by examining if
        (HH < -1) or (HH == -1 && "sign bit in HL is zero").
    The fault was that we have been doing the check as
        (HH < -1) or (HH == -1 && HL > 0)
    instead of
        (HH < -1) or (HH == -1 && HL >= 0).
    In our example HH is -1 and HL is 0, so the old code did not trigger saturation and simply truncated the result to 0x00 (0.0). With the bugfix we instead detect that we should saturate to min, and the result will be set to 0x80 (-8.0).
    Reviewers: leonardchan, bevinh
    Reviewed By: leonardchan
    Subscribers: hiraditya, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D64331
    llvm-svn: 365455
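The Q3.4 example from this commit message can be replayed with a small standalone model. The sketch below is not the LegalizeTypes code. It re-implements the described HH/HL check on an 8-bit by 8-bit multiply, with a flag that selects between the old `HL > 0` test and the fixed `HL >= 0` test. The function names and the positive-overflow branch are additions made for this illustration.

```cpp
#include <cstdint>

// Interpret the low 4 bits of V as a signed 4-bit integer.
int signExtend4(unsigned V) {
  V &= 0xF;
  return (V & 0x8) ? static_cast<int>(V) - 16 : static_cast<int>(V);
}

// Q3.4 x Q3.4 saturating multiply via a 16-bit (Q7.8) intermediate.
// UseFixedCheck selects HL >= 0 (fixed) or HL > 0 (the old, faulty test).
int8_t smulFixSatQ3_4(int8_t X, int8_t Y, bool UseFixedCheck) {
  int Z = static_cast<int>(X) * static_cast<int>(Y); // fits in 16 bits
  unsigned Bits = static_cast<unsigned>(Z) & 0xFFFF;
  int HH = signExtend4(Bits >> 12); // bits 15..12 of Z
  int HL = signExtend4(Bits >> 8);  // bits 11..8 of Z

  bool HLSignClear = UseFixedCheck ? (HL >= 0) : (HL > 0);
  if (HH < -1 || (HH == -1 && HLSignClear))
    return INT8_MIN; // negative overflow: saturate to min
  if (HH > 0 || (HH == 0 && HL < 0))
    return INT8_MAX; // positive overflow: saturate to max

  // No overflow: the Q3.4 result is bits 11..4 of Z.
  int Result = static_cast<int>((Bits >> 4) & 0xFF);
  return static_cast<int8_t>(Result >= 128 ? Result - 256 : Result);
}
```

With X = 0x80 and Y = 0x20 (passed as -128 and 32), the old check returns 0 (0x00) while the fixed check returns -128 (0x80), matching the values given in the commit message.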