path: root/llvm/lib
Commit message | Author | Age | Files | Lines
...
* [NFC] Fix compilation of CrashRecoveryContext.cpp on mingw | Markus Böck | 2020-01-12 | 1 | -1/+2
| | | | | | Patch by Markus Böck. Differential Revision: https://reviews.llvm.org/D72564
* [PowerPC] Delete PPCDarwinAsmPrinter and PPCMCAsmInfoDarwin | Fangrui Song | 2020-01-12 | 4 | -201/+1
| | | | | | | | Darwin support has been removed. Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D72063
* [X86][AVX] Use lowerShuffleAsLanePermuteAndSHUFP to lower binary v4f64 shuffles. | Simon Pilgrim | 2020-01-12 | 1 | -0/+12
| | | | | | Only perform this if we are shuffling lower and upper lane elements across the lanes (otherwise splitting to lower xmm shuffles would be better). This is a regression if we shuffle build_vectors, due to getVectorShuffle canonicalizing 'blend of splat' build vectors. For now I've set this not to shuffle build_vector nodes at all to avoid this.
* [X86][AVX] lowerShuffleAsLanePermuteAndSHUFP - only set the demanded elements of the lane mask. | Simon Pilgrim | 2020-01-12 | 1 | -2/+1
| | | | | | Fixes a cyclic dependency issue with an upcoming patch where getVectorShuffle canonicalizes masks with splat build vector sources.
* [X86][Disassembler] Merge X86DisassemblerDecoder.cpp into X86Disassembler.cpp and refactor | Fangrui Song | 2020-01-12 | 4 | -1868/+1569
|
* [X86][Disassembler] Simplify | Fangrui Song | 2020-01-12 | 3 | -45/+7
|
* [NFC] Refactor memory ops cluster method | Qiu Chaofan | 2020-01-12 | 1 | -14/+7
| | | | | | | | | | | Current implementation of BaseMemOpsClusterMutation is a little bit obscure. This patch directly uses a map from store chain ID to set of memory instrs to make it simpler, so that future improvements are easier to read, update and review. Reviewed By: evandro Differential Revision: https://reviews.llvm.org/D72070
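The data structure described in this commit can be sketched as follows. This is only an illustration under stated assumptions, not the code from the patch: `MemOp`, `groupByChain`, and the integer fields are hypothetical stand-ins for LLVM's scheduling types.

```cpp
#include <map>
#include <vector>

// Hypothetical stand-in for a memory instruction seen by the scheduler.
struct MemOp {
  unsigned ChainID; // ID of the store chain this op hangs off (assumed)
  int InstrIndex;   // position of the instruction in the block (assumed)
};

// Groups memory ops by chain ID so that each group can be considered for
// clustering independently of the others.
std::map<unsigned, std::vector<int>>
groupByChain(const std::vector<MemOp> &Ops) {
  std::map<unsigned, std::vector<int>> Groups;
  for (const MemOp &Op : Ops)
    Groups[Op.ChainID].push_back(Op.InstrIndex);
  return Groups;
}
```

Keying the map by chain ID keeps the grouping step separate from whatever clustering decision is applied to each group afterwards.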
* [X86] Don't call LowerSETCC from LowerSELECT for STRICT_FSETCC/STRICT_FSETCCS nodes. | Craig Topper | 2020-01-11 | 1 | -3/+1
| | | | | | | | | | | This causes the STRICT_FSETCC/STRICT_FSETCCS nodes to be lowered early while lowering SELECT, but the output chain doesn't get connected. Then we visit the node again when it is its turn because we haven't replaced the use of the chain result. In the case of the fp128 libcall lowering, after D72341 this will cause the libcall to be emitted twice.
* [SCEV] more accurate range for addrecexpr with nsw flag. | Zheng Chen | 2020-01-11 | 1 | -9/+17
| | | | | | Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D72436
* [LegalizeVectorOps] Parallelize the lo/hi part of STRICT_UINT_TO_FLOAT legalization. | Craig Topper | 2020-01-11 | 1 | -3/+6
| | | | | | | The lo and hi computations are independent. Give them the same input chain and TokenFactor the results together.
* [TargetLowering][X86] Connect the chain from STRICT_FSETCC in TargetLowering::expandFP_TO_UINT and X86TargetLowering::FP_TO_INTHelper. | Craig Topper | 2020-01-11 | 2 | -5/+9
|
* [LegalizeVectorOps] Expand vector MERGE_VALUES immediately. | Craig Topper | 2020-01-11 | 1 | -0/+11
| | | | | | Custom legalization can produce MERGE_VALUES to return multiple results. We can expand them immediately instead of leaving them around for DAG combine to clean up.
* [X86][Disassembler] Optimize argument passing and immediate reading | Fangrui Song | 2020-01-11 | 3 | -74/+41
|
* [Disassembler] Delete the VStream parameter of MCDisassembler::getInstruction() | Fangrui Song | 2020-01-11 | 23 | -75/+38
| | | | | | | | | | The argument is llvm::null() everywhere except llvm::errs() in llvm-objdump in -DLLVM_ENABLE_ASSERTIONS=On builds. It is used by no target but X86 in -DLLVM_ENABLE_ASSERTIONS=On builds. If we ever need to add verbose logging to disassemblers, we can record the log with a member function instead of passing it around as an argument.
* [ORC] Fix argv handling in runAsMain / lli. | Lang Hames | 2020-01-11 | 1 | -1/+1
| | | | | | | | This fixes an off-by-one error in the argc value computed by runAsMain, and switches lli back to using the input bitcode (rather than the string "lli") as the effective program name. Thanks to Stefan Graenitz for spotting the bug.
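As a rough illustration of the invariant involved (not the actual runAsMain code), the sketch below builds a C-style argument vector from a program name and a list of arguments. `MainArgs` is a hypothetical helper invented for this example. The point is that argc counts the program name plus every argument, and that argv[argc] is a null pointer.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not part of ORC): owns the argument strings and
// exposes them as a null-terminated argv array.
struct MainArgs {
  std::vector<std::string> Storage; // owns the characters
  std::vector<char *> Argv;         // pointers into Storage, then nullptr

  MainArgs(const std::string &ProgramName,
           const std::vector<std::string> &Args) {
    Storage.reserve(Args.size() + 1);
    Storage.push_back(ProgramName); // argv[0] is the effective program name
    for (const std::string &A : Args)
      Storage.push_back(A);
    for (std::string &S : Storage)
      Argv.push_back(&S[0]);
    Argv.push_back(nullptr); // argv[argc] must be a null pointer
    assert(Argv.size() == Storage.size() + 1);
  }

  // Copying would leave Argv pointing into the source object's strings.
  MainArgs(const MainArgs &) = delete;
  MainArgs &operator=(const MainArgs &) = delete;

  // argc counts the program name and arguments, not the null terminator.
  int argc() const { return static_cast<int>(Argv.size()) - 1; }
  char **argv() { return Argv.data(); }
};
```

With a program name of "input.bc" and two arguments, `argc()` is 3 and `argv()[3]` is null; forgetting either the program name or the terminator gives an off-by-one count.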
* [Support] Optionally call signal handlers when a function wrapped by the CrashRecoveryContext fails | Alexandre Ganea | 2020-01-11 | 3 | -53/+85
| | | | | | | | | This patch allows for handling a failure inside a CrashRecoveryContext in the same way as the global exception/signal handler. A failure will have the same side effects, such as cleanup of temporary files, printing the call stack, calling relevant signal handlers, and finally returning an exception code. This is an optional feature, disabled by default. This is a support patch for D69825. Differential Revision: https://reviews.llvm.org/D70568
* [X86][Disassembler] Replace custom logger with LLVM_DEBUG | Fangrui Song | 2020-01-11 | 3 | -56/+14
| | | | | | | The run time of llvm-objdump -d on clang decreases from 7.8s to 7.4s. The improvement is likely due to the elimination of logger setup and dbgprintf(), which has a large overhead.
* [LegalizeVectorOps] Remove some of the simpler Expand methods. Pass Results vector to a couple. NFCI | Craig Topper | 2020-01-11 | 1 | -125/+77
| | | | | | | | | | | Some of the simplest handlers just call TLI and, if that fails, fall back to unrolling. For those, just inline the TLI call and share the unrolling call with the default case of Expand. Pass the Results vector to ExpandFSUB and ExpandBITREVERSE so that it's obvious they sometimes don't return results and want to defer to LegalizeDAG.
* [LegalizeVectorOps] Only pass SDNode* instead of SDValue to all of the Expand* and Promote* methods. | Craig Topper | 2020-01-11 | 1 | -251/+251
| | | | | | | All the Expand* and Promote* functions assume they are being called with result 0 anyway. Just hardcode result 0 into them.
* [X86][Disassembler] Simplify and optimize reader functions | Fangrui Song | 2020-01-11 | 3 | -180/+101
| | | | The run time of llvm-objdump -d on clang decreases from 8.2s to 7.8s.
* [X86] Turn FP_ROUND/STRICT_FP_ROUND into X86ISD::VFPROUND/STRICT_VFPROUND during PreprocessISelDAG to remove some duplicate isel patterns. | Craig Topper | 2020-01-11 | 3 | -67/+4
|
* [ExecutionEngine] Re-enable FastISel for non-iOS arm targets. | Lang Hames | 2020-01-11 | 1 | -7/+0
| | | | | | Patch by Nicolas Capens. Thanks Nicolas! https://reviews.llvm.org/D65015
* [X86] Adjust nop emission by compiler to consider target decode limitations | Philip Reames | 2020-01-11 | 1 | -0/+17
| | | | | | The primary motivation of this change is to bring the code more closely in sync, behavior-wise, with the assembler's version of nop emission. I'd like to eventually factor them into one, but that's hard to do when one has features the other doesn't. The longest encodable nop on x86 is 15 bytes, but many processors (for instance, all Intel chips) can't decode the 15-byte form efficiently. On those processors, it's better to use either a 10-byte or an 11-byte sequence, depending on the processor.
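The effect of a per-target cap on nop length can be sketched as below. This is a simplified illustration, not the code from the patch: `splitNopPadding` and its parameters are hypothetical names, and the sketch only computes the lengths of the nops, not their encodings.

```cpp
#include <cassert>
#include <vector>

// Splits NumBytes of padding into a sequence of nop lengths, none longer
// than MaxNopLen. MaxNopLen would be 15 on targets that decode the longest
// form well, and 10 or 11 on targets that do not (assumed to be supplied
// by the caller).
std::vector<unsigned> splitNopPadding(unsigned NumBytes, unsigned MaxNopLen) {
  assert(MaxNopLen >= 1 && MaxNopLen <= 15 && "x86 nops are 1 to 15 bytes");
  std::vector<unsigned> Lengths;
  while (NumBytes > 0) {
    unsigned Len = NumBytes < MaxNopLen ? NumBytes : MaxNopLen;
    Lengths.push_back(Len);
    NumBytes -= Len;
  }
  return Lengths;
}
```

For 25 bytes of padding, a cap of 15 yields nops of 15 and 10 bytes, while a cap of 10 yields 10, 10, and 5.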
* [X86AsmBackend] Move static function before sole use [NFC] | Philip Reames | 2020-01-11 | 1 | -34/+34
|
* [X86AsmBackend] Be consistent about placing definitions out of line [NFC] | Philip Reames | 2020-01-11 | 1 | -49/+57
|
* Fix uninitialized value clang static analyzer warning. NFC. | Simon Pilgrim | 2020-01-11 | 1 | -1/+1
|
* moveOperands - assert Src/Dst MachineOperands are non-null. | Simon Pilgrim | 2020-01-11 | 1 | -1/+1
| | | | Fixes static-analyzer warnings.
* [X86] Fix outdated comment | Simon Pilgrim | 2020-01-11 | 1 | -2/+1
| | | | The generic saturated math opcodes are no longer widened inside X86TargetLowering
* [X86][AVX] Add lowerShuffleAsLanePermuteAndSHUFP lowering | Simon Pilgrim | 2020-01-11 | 1 | -0/+40
| | | | | | | | Add initial support for lowering v4f64 shuffles to SHUFPD(VPERM2F128(V1, V2), VPERM2F128(V1, V2)). Eventually this could be used for v8f32 (and maybe v8f64/v16f32), but I'm being conservative for the initial implementation as only v4f64 can always succeed. This is currently only called from lowerShuffleAsLanePermuteAndShuffle, so it only gets used for unary shuffles, and we limit this to cases where we use upper elements, as otherwise concatenating 2 xmm shuffles is probably the better case. Helps with poor shuffles mentioned in D66004.
* DSE: fix bug where we would only check libcalls for name rather than whole decl | Nuno Lopes | 2020-01-11 | 1 | -9/+12
|
* [InstCombine] Preserve nuw on sub of geps (PR44419) | Nikita Popov | 2020-01-11 | 2 | -4/+16
| | | | | | | | | Fix https://bugs.llvm.org/show_bug.cgi?id=44419 by preserving the nuw on sub of geps. We only do this if the offset has a multiplication as the final operation, as we can't be sure the operation is nuw in the other cases without more thorough analysis. Differential Revision: https://reviews.llvm.org/D72048
* [X86] Remove dead code from X86DAGToDAGISel::Select that is no longer needed now that we don't mutate strict fp nodes. NFC | Craig Topper | 2020-01-11 | 1 | -28/+0
|
* [X86] Simplify code by removing an unreachable condition. NFCI | Craig Topper | 2020-01-10 | 1 | -12/+2
| | | | | | For X87<->SSE conversions, the SSE type is always smaller than the X87 type. So we can always use the smallest type for the memory type.
* [X86] Preserve fpexcept property when turning strict_fp_extend and strict_fp_round into stack operations. | Craig Topper | 2020-01-10 | 2 | -4/+37
| | | | | | | | | | | We use the stack for X87 fp_round and for moving from SSE f32/f64 to X87 f64/f80, or from X87 f64/f80 to SSE f32/f64. Note that for the SSE<->X87 conversions, the conversion always happens in the X87 domain. The load/store ops in the X87 instructions are able to signal exceptions.
* [X86][Disassembler] Simplify readPrefixes | Fangrui Song | 2020-01-10 | 1 | -43/+25
|
* [X86] Use ReplaceAllUsesWith instead of ReplaceAllUsesOfValueWith to simplify some code. NFCI | Craig Topper | 2020-01-10 | 1 | -12/+2
|
* [AMDGPU] Remove unnecessary v_mov from a register to itself in WQM lowering. | Michael Bedy | 2020-01-10 | 1 | -5/+22
| | | | | | | | | | | | | | | | | | | Summary: - The SI Whole Quad Mode phase replaces WQM pseudo instructions with v_mov instructions. While this is necessary for the special handling of moving results out of WWM live ranges, it is not necessary for WQM live ranges. The result is a v_mov from a register to itself after every WQM operation. This change uses a COPY pseudo in these cases, which allows the register allocator to coalesce the moves away. Reviewers: tpr, dstuttard, foad, nhaehnle Reviewed By: nhaehnle Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71386
* [TargetLowering][ARM][Mips][WebAssembly] Remove the ordered FP compare from RuntimeLibcalls.def and all associated usages | Craig Topper | 2020-01-10 | 6 | -27/+7
| | | | | | | | | | | | | | | | | | | Summary: This always just used the same libcall as unordered, but the comparison predicate was different. This change appears to have been made when targets were given the ability to override the predicates. Before that they were hardcoded into the type legalizer. At that time we never inverted predicates and we handled ugt/ult/uge/ule compares by emitting an unordered check ORed with an ogt/olt/oge/ole check. So only ordered needed an inverted predicate. Later ugt/ult/uge/ule were optimized to only call a single libcall and invert the compare. This patch removes the ordered entries and just uses the inverting logic that is now present. This removes some odd things in both the Mips and WebAssembly code. Reviewers: efriedma, ABataev, uweigand, cameron.mcinally, kpn Reviewed By: efriedma Subscribers: dschuff, sdardis, sbc100, arichardson, jgravelle-google, kristof.beyls, hiraditya, aheejin, sunfish, atanasyan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72536
* [AArch64] Don't generate libcalls for wide shifts on Darwin | Jessica Paquette | 2020-01-10 | 1 | -1/+1
| | | | | | | Similar to cff90f07cb5cc3. Darwin doesn't always use compiler-rt, and so we can't assume that these functions are available (at least on arm64).
* [NFC][InlineCost] Factor cost modeling out of CallAnalyzer traversal. | Mircea Trofin | 2020-01-10 | 1 | -328/+431
| | | | | | | | | | | | | | | | | | Summary: The goal is to simplify experimentation on the cost model. Today, CallAnalyzer decides 2 things: legality, and benefit. The refactoring keeps legality assessment in CallAnalyzer, and factors benefit evaluation out, as an extension. Reviewers: davidxl, eraman Reviewed By: davidxl Subscribers: kamleshbhalui, fedor.sergeev, hiraditya, baloghadamsoftware, haicheng, a.sidorin, Szelethus, donat.nagy, dkrupp, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71733
* [LockFileManager] Make default waitForUnlock timeout a parameter, NFC | Vedant Kumar | 2020-01-10 | 1 | -4/+2
| | | | Patch by Xi Ge!
* Let targets adjust operand latency of bundles | Stanislav Mekhanoshin | 2020-01-10 | 4 | -42/+34
| | | | | | | | | | | This reverts the AMDGPU DAG mutation implemented in D72487 and gives a more general way of adjusting BUNDLE operand latency. It also replaces FixBundleLatencyMutation with the adjustSchedDependency callback in AMDGPU, fixing not only successor latencies but predecessors' as well. Differential Revision: https://reviews.llvm.org/D72535
* [AArch64] Add isAuthenticated predicate to MCInstDesc | Vedant Kumar | 2020-01-10 | 2 | -6/+14
| | | | | | | | | | Add a predicate to MCInstDesc that allows tools to determine whether an instruction authenticates a pointer. This can be used by diagnostic tools to hint at pointer authentication failures. Differential Revision: https://reviews.llvm.org/D70329 rdar://55089604
* [TargetLowering] Use SelectionDAG::getSetCC and remove a repeated call to getSetCCResultType in softenSetCCOperands. NFCI | Craig Topper | 2020-01-10 | 1 | -8/+4
|
* [CMake] Fix modules build after DWARFLinker reorganization | Jonas Devlieghere | 2020-01-10 | 1 | -0/+2
| | | | | Create a dedicated module for the DWARFLinker and make it depend on intrinsics gen.
* [TargetLowering][ARM][X86] Change softenSetCCOperands handling of ONE to avoid spurious exceptions for QNANs with strict FP quiet compares | Craig Topper | 2020-01-10 | 1 | -10/+9
| | | | | | | | | | | | ONE is currently softened to OGT | OLT. But the OGT and OLT libcalls will trigger an exception for QNAN, at least for X86 with libgcc. UEQ, on the other hand, uses UO | OEQ. The UO and OEQ libcalls will not trigger an exception for QNAN. This patch changes ONE to use the inverse of the UEQ lowering. So we now produce O & UNE. Technically the existing behavior was correct for a signalling ONE, but since I don't know how to generate one of those from clang that seemed like something we can deal with later as we would need to fix other predicates as well. Also removing spurious exceptions seemed better than missing an exception. There are also problems with quiet OGT/OLT/OLE/OGE, but those are harder to fix. Differential Revision: https://reviews.llvm.org/D72477
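The equivalence of the predicates named in this commit can be checked with a small sketch. It compares only the boolean results of the three formulations and does not model floating-point exception flags, which are the actual subject of the change. The function names are hypothetical, and plain C++ operators stand in for the soft-float libcalls.

```cpp
#include <cmath>

// O: both operands are ordered (neither is a NaN).
bool isOrdered(double A, double B) {
  return !std::isnan(A) && !std::isnan(B);
}

// Old softening: ONE as OGT | OLT.
bool oneViaOGTOrOLT(double A, double B) { return (A > B) || (A < B); }

// New softening: ONE as O & UNE. In C++, A != B is true when the operands
// are unordered or unequal, which matches UNE.
bool oneViaOrderedAndUNE(double A, double B) {
  return isOrdered(A, B) && (A != B);
}

// The same result written as the inverse of UEQ, where UEQ is UO | OEQ.
bool oneViaNotUEQ(double A, double B) {
  bool UEQ = !isOrdered(A, B) || (A == B);
  return !UEQ;
}
```

All three return false when either operand is a NaN, false for equal operands, and true for ordered, unequal operands.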
* [LegalizeVectorOps] Improve handling of multi-result operations. | Craig Topper | 2020-01-10 | 1 | -173/+271
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This system wasn't very well designed for multi-result nodes. As a consequence they weren't consistently registered in the LegalizedNodes map, leading to nodes being revisited for different results. I've removed the "Result" variable from the main LegalizeOp method and used a SDNode* instead. The result number from the incoming Op SDValue is only used for deciding which result to return to the caller. When LegalizeOp is called it should always register a legalized result for all of its results. Future calls for any other result should be pulled from the LegalizedNodes map. Legal nodes will now register all of their results in the map instead of just the one we were called for. The Expand and Promote handling now uses a vector of results similar to LegalizeDAG. Each of the new results is then re-legalized and logged in the LegalizedNodes map for all of the Results for the node being legalized. None of the handlers register their own results now. And none call ReplaceAllUsesOfValueWith now. Custom handling now always passes result number 0 to LowerOperation. This matches what LegalizeDAG does. Since the introduction of STRICT nodes, I've encountered several issues with X86's custom handling being called with an SDValue pointing at the chain and our custom handlers using that to get a VT instead of result 0. This should prevent us from having any more of those issues. On return we will update the LegalizedNodes map for all results so we shouldn't call the custom handler again for each result number. I want to push SDNode* further into the Expand and Promote handlers, but I've left that for a follow-up to keep this patch size down. I've created a dummy SDValue(Node, 0) to keep the handlers working. Differential Revision: https://reviews.llvm.org/D72224
* [X86] Support function attribute "patchable-function-entry" | Fangrui Song | 2020-01-10 | 1 | -3/+15
| | | | | | | For x86-64, we diverge from GCC -fpatchable-function-entry in that we emit multi-byte NOPs. Differential Revision: https://reviews.llvm.org/D72220
* [AArch64] Add function attribute "patchable-function-entry" to add NOPs at function entry | Fangrui Song | 2020-01-10 | 4 | -2/+70
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Linux kernel uses -fpatchable-function-entry to implement DYNAMIC_FTRACE_WITH_REGS for arm64 and parisc. GCC 8 implemented -fpatchable-function-entry, which can be seen as a generalized form of -mnop-mcount. The N,M form (function entry points before the Mth NOP) is currently only used by parisc. This patch adds N,0 support to AArch64 codegen. N is represented as the function attribute "patchable-function-entry". We will use a different function attribute for M, if we decide to implement it. The patch reuses the existing patchable-function pass, and TargetOpcode::PATCHABLE_FUNCTION_ENTER, which is currently used by XRay. When the integrated assembler is used, __patchable_function_entries will be created for each text section with the SHF_LINK_ORDER flag to prevent --gc-sections (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93197) and COMDAT (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93195) issues. In retrospect, __patchable_function_entries should use a PC-relative relocation type to avoid the SHF_WRITE flag and dynamic relocations. "patchable-function-entry"'s interaction with Branch Target Identification is still unclear (see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92424 for GCC discussions). Reviewed By: peter.smith Differential Revision: https://reviews.llvm.org/D72215
* [AIX] Allow vararg calls when all arguments reside in registers | jasonliu | 2020-01-10 | 1 | -22/+85
| | | | | | | | | | | | | | | | | | | | | | | Summary: This patch pushes the AIX vararg unimplemented error diagnostic later and allows vararg calls so long as all the arguments can be passed in registers. This patch extends the AIX calling convention implementation to initialize GPR(s) for vararg float arguments. On AIX, both GPR(s) and FPR are allocated for floating point arguments. The GPR(s) are only initialized for vararg calls, otherwise the callee is expected to retrieve the float argument in the FPR. f64 in AIX PPC32 requires special handling in order to allocate and initialize 2 GPRs. This is performed with bitcast, SRL, truncation to initialize one GPR for the MSW and bitcast, truncation to initialize the other GPR for the LSW. A future patch will follow to add support for arguments passed on the stack. Patch provided by: cebowleratibm Reviewers: sfertile, ZarkoCA, hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D71013
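The f64 split described in this commit can be mirrored in plain C++ as a rough sketch. It operates on host values rather than SelectionDAG nodes, and `GPRPair` and `splitF64` are hypothetical names used only for illustration.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical pair of 32-bit register values holding one f64.
struct GPRPair {
  uint32_t MSW; // most significant word
  uint32_t LSW; // least significant word
};

GPRPair splitF64(double V) {
  static_assert(sizeof(double) == sizeof(uint64_t),
                "sketch assumes a 64-bit double");
  uint64_t Bits;
  std::memcpy(&Bits, &V, sizeof(Bits)); // bitcast f64 -> i64
  GPRPair P;
  P.MSW = static_cast<uint32_t>(Bits >> 32); // SRL by 32, then truncate
  P.LSW = static_cast<uint32_t>(Bits);       // truncate only
  return P;
}
```

For example, 1.0 has the IEEE 754 bit pattern 0x3FF0000000000000, so the MSW is 0x3FF00000 and the LSW is 0.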