path: root/llvm/lib
Commit message | Author | Age | Files | Lines
...
* [Timers] Use the pass argument name for JSON keys in time-passes | Francis Visoiu Mistrih | 2018-06-13 | 1 | -1/+5
  When using clang --save-stats -mllvm -time-passes, both timers and stats end up in the same JSON file. We could end up with things like:

    {
      "asm-printer.EmittedInsts": 1,
      "time.pass.Virtual Register Map.wall": 2.9015541076660156e-04,
      "time.pass.Virtual Register Map.user": 2.0500000000000379e-04,
      "time.pass.Virtual Register Map.sys": 8.5000000000001741e-05,
    }

  This patch makes use of the pass argument name (if available) in the JSON key to end up with things like:

    {
      "asm-printer.EmittedInsts": 1,
      "time.pass.virtregmap.wall": 2.9015541076660156e-04,
      "time.pass.virtregmap.user": 2.0500000000000379e-04,
      "time.pass.virtregmap.sys": 8.5000000000001741e-05,
    }

  This also helps avoid writing another JSON printer to handle all the cases that we could have in our pass names.

  Differential Revision: https://reviews.llvm.org/D48109
  llvm-svn: 334649
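The key-naming rule described in the commit above can be sketched as follows. This is an illustrative Python sketch, not LLVM's C++ implementation; the function name `timer_keys` and its parameters are hypothetical, and the timing values are placeholders.

```python
import json


def timer_keys(pass_name, pass_argument, wall, user, sys_time):
    """Build time-passes style JSON entries for one pass timer (hypothetical helper)."""
    # Prefer the short pass argument (e.g. "virtregmap"); fall back to the
    # human-readable pass name when no argument is available.
    label = pass_argument if pass_argument else pass_name
    prefix = f"time.pass.{label}"
    return {
        f"{prefix}.wall": wall,
        f"{prefix}.user": user,
        f"{prefix}.sys": sys_time,
    }


# Stats and timers share one dictionary, mirroring the single JSON file.
stats = {"asm-printer.EmittedInsts": 1}
stats.update(timer_keys("Virtual Register Map", "virtregmap", 2.9e-04, 2.05e-04, 8.5e-05))
print(json.dumps(stats, indent=2))
```

Keys built from the argument name contain no spaces, which is presumably why they are easier for downstream JSON consumers to handle than keys built from the display name.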
* [X86] Move RCPSSr_Int, RSQRTSSr_Int, SQRTSDr_Int, SQRTSSr_Int to the correct load folding table. | Craig Topper | 2018-06-13 | 1 | -4/+4
  They were in the operand 1 folding table, but their foldable operand is operand 2.

  llvm-svn: 334648
* Enable ThreadPool to support tasks that return values. | Zachary Turner | 2018-06-13 | 1 | -19/+2
  Previously ThreadPool could only queue async "jobs", i.e. work that was done for its side effects and not for its result. It's useful occasionally to queue async work that returns a value. From an API perspective, this is very intuitive. The previous API just returned a shared_future<void>, so all we need to do is make it return a shared_future<T>, where T is the type of value that the operation returns.

  Making this work required a little magic, but ultimately it's not too bad. Instead of keeping a shared queue<packaged_task<void()>> we just keep a shared queue<unique_ptr<TaskBase>>, where TaskBase is a class with a pure virtual execute() method, then have a templated derived class that stores a packaged_task<T()>. Everything else works out pretty cleanly.

  Differential Revision: https://reviews.llvm.org/D48115
  llvm-svn: 334643
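The queue-of-TaskBase design described above can be illustrated with a small Python analogy. This is only a sketch of the pattern, not the LLVM C++ code: `SketchThreadPool` and `Task` are hypothetical names, and `concurrent.futures.Future` stands in for the `shared_future<T>` that the real API returns.

```python
import queue
import threading
from abc import ABC, abstractmethod
from concurrent.futures import Future


class TaskBase(ABC):
    """Uniform interface stored in the queue, whatever the result type."""

    @abstractmethod
    def execute(self):
        ...


class Task(TaskBase):
    """Wraps a callable and a future, playing the role of packaged_task<T()>."""

    def __init__(self, fn):
        self.fn = fn
        self.future = Future()

    def execute(self):
        try:
            self.future.set_result(self.fn())
        except Exception as exc:
            self.future.set_exception(exc)


class SketchThreadPool:
    def __init__(self, num_threads=2):
        self._tasks = queue.Queue()
        self._threads = [
            threading.Thread(target=self._worker, daemon=True)
            for _ in range(num_threads)
        ]
        for thread in self._threads:
            thread.start()

    def _worker(self):
        while True:
            task = self._tasks.get()
            if task is None:  # sentinel: stop this worker
                return
            task.execute()  # the worker never needs to know the result type

    def submit(self, fn):
        task = Task(fn)
        self._tasks.put(task)
        return task.future

    def shutdown(self):
        for _ in self._threads:
            self._tasks.put(None)
        for thread in self._threads:
            thread.join()
```

A caller would write `pool.submit(lambda: 6 * 7).result()` to obtain the value. The worker loop only ever calls `execute()`, which is the point of the type-erased base class in the commit.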
* [AMDGPU] Corrected computeKnownBits for V_PERM_B32 | Stanislav Mekhanoshin | 2018-06-13 | 1 | -7/+8
  Differential Revision: https://reviews.llvm.org/D48133
  llvm-svn: 334640
* LTO: Keep file handles open for memory mapped files. | Peter Collingbourne | 2018-06-13 | 5 | -102/+101
  On Windows we've observed that if you open a file, write to it, map it into memory and close the file handle, the contents of the memory mapping can sometimes be incorrect. That was what we did when adding an entry to the ThinLTO cache using the TempFile and MemoryBuffer classes, and it was causing intermittent build failures on Chromium's ThinLTO bots on Windows. More details are in the associated Chromium bug (crbug.com/786127).

  We can prevent this from happening by keeping a handle to the file open while the mapping is active. So this patch changes the mapped_file_region class to duplicate the file handle when mapping the file and close it upon unmapping it.

  One gotcha is that the file handle that we keep open must not have been created with FILE_FLAG_DELETE_ON_CLOSE, as otherwise the operating system will prevent other processes from opening the file. We can achieve this by avoiding the use of FILE_FLAG_DELETE_ON_CLOSE altogether. Instead, we use SetFileInformationByHandle with FileDispositionInfo to manage the delete-on-close bit. This lets us remove the hack that we used to use to clear the delete-on-close bit on a file opened with FILE_FLAG_DELETE_ON_CLOSE.

  A downside of using SetFileInformationByHandle/FileDispositionInfo as opposed to FILE_FLAG_DELETE_ON_CLOSE is that it prevents us from using CreateFile to open the file while the flag is set, even within the same process. This doesn't seem to matter for almost every client of TempFile, except for LockFileManager, which calls sys::fs::create_link to create a hard link from the lock file, and in the process of doing so tries to open the file. To prevent this change from breaking LockFileManager I changed it to stop using TempFile by effectively reverting r318550.

  Differential Revision: https://reviews.llvm.org/D48051
  llvm-svn: 334630
* [AMDGPU] Change enqueue kernel handle type | Yaxun Liu | 2018-06-13 | 1 | -1/+2
  Currently the handle type is a global pointer which holds 8 bytes. We need a larger type which holds 16 bytes, therefore change it to [i64 x 2].

  Differential Revision: https://reviews.llvm.org/D48094
  llvm-svn: 334625
* [AMDGPU][MC] Enabled parsing of relocations on VALU instructions | Dmitry Preobrazhensky | 2018-06-13 | 1 | -2/+2
  See bug 37566: https://bugs.llvm.org/show_bug.cgi?id=37566

  Reviewers: artem.tamazov, arsenm, nhaehnle

  Differential Revision: https://reviews.llvm.org/D47884
  llvm-svn: 334622
* [CostModel] Recognise BROADCAST shuffle mask if the elements come from the second src | Simon Pilgrim | 2018-06-13 | 1 | -4/+11
  llvm-svn: 334620
* [AMDGPU][MC][GFX8][GFX9] Allow LDS direct reads for BUFFER_LOAD_DWORDX2/X3/X4 | Dmitry Preobrazhensky | 2018-06-13 | 1 | -3/+19
  See bug 37653: https://bugs.llvm.org/show_bug.cgi?id=37653

  Reviewers: artem.tamazov, arsenm

  Differential Revision: https://reviews.llvm.org/D47885
  llvm-svn: 334609
* [DAGCombiner] remove hasOneUse() check from fadd constants transform | Sanjay Patel | 2018-06-13 | 1 | -7/+6
  We're constant folding here, so we shouldn't check uses. This matches the IR optimizer behavior.

  The x86 test shows the expected win. The AArch64 test shows something else. This only seems to happen if the "generic" AArch64 CPU model is used by MachineCombiner, so I'll file a bug report to follow up.

  llvm-svn: 334608
* AMDGPU: Move isSDNodeSourceOfDivergence() implementation to SITargetLowering | Tom Stellard | 2018-06-13 | 4 | -71/+69
  Summary: The code that handles ISD::Register and ISD::CopyFromReg assumes the target is amdgcn, so this is broken on r600. We don't need this analysis on r600 anyway, so we can safely move it to SITargetLowering.

  Reviewers: alex-t, arsenm, nhaehnle
  Reviewed By: arsenm
  Subscribers: msearles, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

  Differential Revision: https://reviews.llvm.org/D46298
  llvm-svn: 334607
* [FPEnv] Expand constrained FP operations | Cameron McInally | 2018-06-13 | 1 | -8/+88
  Add a helper function to expand constrained FP operations as needed. Note that the Strict POWI operation is not handled in this patch since the format is slightly different from the others.

  Differential Revision: https://reviews.llvm.org/D47491
  llvm-svn: 334603
* Do not enforce absolute path argv0 in windows | Hans Wennborg | 2018-06-13 | 1 | -29/+39
  Even if we support no-canonical-prefix on clang-cl (https://reviews.llvm.org/D47480), argv0 becomes an absolute path in clang-cl, and that embeds the absolute path in /showIncludes.

  This patch removes such full path normalization from InitLLVM on Windows, which removes the absolute path from clang-cl output (obj/stdout/stderr) when the debug flag is disabled.

  Patch by Takuto Ikuta!

  Differential Revision: https://reviews.llvm.org/D47578
  llvm-svn: 334602
* Revert "Improve handling of COPY instructions with identical value numbers" | Krzysztof Parzyszek | 2018-06-13 | 1 | -53/+19
  This reverts r334594, it breaks buildbots and fails with expensive checks.

  llvm-svn: 334598
* [mips][microMIPS] Extending size reduction pass with LWP and SWP | Zoran Jovanovic | 2018-06-13 | 4 | -55/+245
  Author: milena.vujosevic.janicic
  Reviewers: sdardis

  The patch extends the size reduction pass for microMIPS. It introduces reduction of two instructions into one instruction:
  Two SW instructions are transformed into one SWP instruction.
  Two LW instructions are transformed into one LWP instruction.

  Differential Revision: https://reviews.llvm.org/D39115
  llvm-svn: 334595
* Improve handling of COPY instructions with identical value numbers | Krzysztof Parzyszek | 2018-06-13 | 1 | -19/+53
  Differential Revision: https://reviews.llvm.org/D48102
  llvm-svn: 334594
* [x86] eliminate even more sign-bit tests with vector select | Sanjay Patel | 2018-06-13 | 1 | -5/+4
  This shortcoming was noted in D47330, and the test diffs show we already had other examples where we failed to fold to a SHRUNKBLEND:

    /// Dynamic (non-constant condition) vector blend where only the sign bits
    /// of the condition elements are used. This is used to enforce that the
    /// condition mask is not valid for generic VSELECT optimizations.

  This patch implements an idea from D48043 and would obsolete that patch because it catches more cases (notably the AVX1 case that was missed there).

  All we're doing is allowing the existing transform to fire more often by removing the post-legalize constraint. All of the relevant feature checks and other predicates are left as-is.

  Differential Revision: https://reviews.llvm.org/D48078
  llvm-svn: 334592
* [RISCV] Add codegen support for atomic load/stores with RV32A | Alex Bradbury | 2018-06-13 | 4 | -2/+56
  Fences are inserted according to table A.6 in the current draft of version 2.3 of the RISC-V Instruction Set Manual, which incorporates the memory model changes and definitions contributed by the RISC-V Memory Consistency Model task group.

  Instruction selection failures will now occur for 8/16/32-bit atomicrmw and cmpxchg operations when targeting RV32IA until lowering for these operations is added in a follow-on patch.

  Differential Revision: https://reviews.llvm.org/D47589
  llvm-svn: 334591
* [RISCV] Codegen support for atomic operations on RV32I | Alex Bradbury | 2018-06-13 | 3 | -0/+23
  This patch adds lowering for atomic fences and relies on AtomicExpandPass to lower atomic loads/stores, atomic rmw, and cmpxchg to __atomic_* libcalls. test/CodeGen/RISCV/atomic-* are modelled on the exhaustive test/CodeGen/PPC/atomics-regression.ll, and will prove more useful once RV32A codegen support is introduced.

  Fence mappings are taken from table A.6 in the current draft of version 2.3 of the RISC-V Instruction Set Manual, which incorporates the memory model changes and definitions contributed by the RISC-V Memory Consistency Model task group.

  Differential Revision: https://reviews.llvm.org/D47587
  llvm-svn: 334590
* [SLPVectorizer] getSameOpcode - remove useless cast [NFC] | Simon Pilgrim | 2018-06-13 | 1 | -3/+2
  There's no need to cast the base Value to an Instruction.

  llvm-svn: 334588
* [SLPVectorizer] getSameOpcode - remove unused alternate code [NFC] | Simon Pilgrim | 2018-06-13 | 1 | -4/+1
  We early-out for the case where we don't use alternate opcodes, so no need to check for it later.

  llvm-svn: 334587
* [TableGen] Emit a fatal error on inconsistencies in resource units vs cycles. | Clement Courbet | 2018-06-13 | 1 | -6/+6
  Summary: For targets I'm not familiar with, I've automatically made the "default to 1 for each resource" behaviour explicit in the td files. For more obvious cases, I've ventured a fix.

  Some notes:
   - Exynos is especially fishy.
   - AArch64SchedThunderX2T99.td had some truncated entries. If I understand correctly, the person who wrote that interpreted the ResourceCycle as a range. I made the decision to use the upper/lower bound for consistency with the 'Latency' value. I'm sure there is a better choice.
   - The change to X86ScheduleBtVer2.td is an NFC, it just makes values more explicit.

  Also see PR37310.

  Reviewers: RKSimon, craig.topper, javed.absar
  Subscribers: kristof.beyls, llvm-commits

  Differential Revision: https://reviews.llvm.org/D46356
  llvm-svn: 334586
* [PowerPC] fix trivial typos in comment, NFC | Hiroshi Inoue | 2018-06-13 | 10 | -26/+26
  llvm-svn: 334583
* Fix -DLLVM_ENABLE_THREADS=OFF build after r334537 | Hans Wennborg | 2018-06-13 | 1 | -1/+1
  llvm-svn: 334582
* [PowerPC] avoid verification failure due to PowerPC VSX Swap Removal pass | Hiroshi Inoue | 2018-06-13 | 1 | -0/+6
  This patch fixes a failure in lnt tests with the -verify-machineinstrs option. When the VSX Swap Removal pass swapped two register operands, it did not maintain the kill flags associated with the operands. This patch swaps the flags as well as the register numbers to avoid inconsistent kill flag information.

  llvm-svn: 334579
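The bug class described above (swapping register numbers while leaving per-operand flags behind) can be shown with a toy model. This Python sketch is not LLVM code; `Operand`, `swap_regs_only`, and `swap_operands` are hypothetical names used only to illustrate why the flag must travel with the register.

```python
from dataclasses import dataclass


@dataclass
class Operand:
    reg: str       # register name, e.g. "v1"
    is_kill: bool  # True if this use is the last use of the register


def swap_regs_only(ops, i, j):
    """Buggy variant: registers move, kill flags stay at their old positions."""
    ops[i].reg, ops[j].reg = ops[j].reg, ops[i].reg


def swap_operands(ops, i, j):
    """Fixed variant: register and kill flag are swapped together."""
    ops[i], ops[j] = ops[j], ops[i]


buggy = [Operand("v1", True), Operand("v2", False)]
swap_regs_only(buggy, 0, 1)
print(buggy)  # v2 is now wrongly marked as killed, v1 is not

fixed = [Operand("v1", True), Operand("v2", False)]
swap_operands(fixed, 0, 1)
print(fixed)  # the kill flag is still attached to v1
```

In the buggy variant the kill marker ends up on a register that is still live, which is the kind of inconsistency a machine-code verifier is designed to reject.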
* [DWARF/AccelTable] Remove getDIESectionOffset for DWARF v5 entries | Pavel Labath | 2018-06-13 | 2 | -11/+4
  Summary: This method was not correct for entries in DWO files as it assumed it could just add up the CU and DIE offsets to get the absolute DIE offset. This is not correct for the DWO files, as here the CU offset will reference the skeleton unit, whereas the DIE offset will be the offset in the full unit in the DWO file.

  Unfortunately, this means that we are not able to determine the absolute DIE offset using the information in the .debug_names section alone, which means we have to offload some of this work to the users of this class.

  To demonstrate how this can be done, I've added/fixed the ability to lookup entries using accelerator tables in DWO files in llvm-dwarfdump. To make this happen, I've needed to make two extra changes in other classes:
   - made the DWARFContext method to lookup a CU based on the section offset public. I've needed this functionality to lookup a CU, and this seems like a useful thing in general.
   - made DWARFUnit::getDWOId call extractDIEsIfNeeded. Before this, the DWOId was filled in only if the root DIE happened to be parsed before we called the accessor. Since the lazy parsing is supposed to happen under the hood, calling extractDIEsIfNeeded seems appropriate.

  Reviewers: JDevlieghere, aprantl, dblaikie
  Subscribers: mgrang, llvm-commits

  Differential Revision: https://reviews.llvm.org/D48009
  llvm-svn: 334578
* [X86] Remove masking from avx512vbmi2 concat and shift by immediate intrinsics. Use select in IR instead. | Craig Topper | 2018-06-13 | 3 | -23/+63
  llvm-svn: 334576
* [SimplifyIndVars] Ignore dead users | Max Kazantsev | 2018-06-13 | 1 | -0/+10
  IndVarSimplify sometimes makes transforms based on users that are trivially dead. In particular, if DCE wasn't run before it, there may be a dead `sext/zext` in the loop that will trigger widening transforms, even though it makes no sense to do so. This patch teaches IndVarSimplify to ignore the most trivial cases of that.

  Differential Revision: https://reviews.llvm.org/D47974
  Reviewed By: sanjoy
  llvm-svn: 334567
* [X86] Mark all instructions that have masked store semantics with NotMemoryFoldable. Remove dependency on SchedRW from memory table autogenerator. | Craig Topper | 2018-06-13 | 1 | -6/+8
  Previously we were whitelisting instructions based on their SchedRW value. With the masked store instructions explicitly removed via NotMemoryFoldable, we don't seem to need this check anymore.

  llvm-svn: 334563
* [X86] Remove VPCOMPRESSB/W from the autogenerated load folding table. | Craig Topper | 2018-06-13 | 1 | -2/+4
  llvm-svn: 334562
* [AMDGPU] DAG combine to produce V_PERM_B32 | Stanislav Mekhanoshin | 2018-06-12 | 5 | -1/+214
  Differential Revision: https://reviews.llvm.org/D48099
  llvm-svn: 334559
* [DAGCombiner] Recognize more patterns for ABS | Krzysztof Parzyszek | 2018-06-12 | 4 | -35/+38
  Differential Revision: https://reviews.llvm.org/D47831
  llvm-svn: 334553
* Remove malloc.h include from Intel JIT events code | Reid Kleckner | 2018-06-12 | 1 | -1/+0
  llvm-svn: 334547
* Add null check to Intel JIT event listener | Reid Kleckner | 2018-06-12 | 1 | -4/+6
  llvm-svn: 334544
* [ORC] Add a fallback definition generator for VSOs. | Lang Hames | 2018-06-12 | 1 | -66/+100
  If a VSO has a fallback definition generator attached it will be called during lookup (and lookupFlags) for any unresolved symbols. The definition generator can add new definitions to the VSO for any unresolved symbol. This allows VSOs to generate new definitions on demand.

  The immediate use case for this code is supporting VSOs that can import definitions found via dlsym on demand.

  llvm-svn: 334538
* [ORC] Refactor blocking lookup logic into the blockingLookup function, and implement existing blocking lookups (the lookup function) and JITSymbolResolverAdapter on top of that. | Lang Hames | 2018-06-12 | 4 | -60/+61
  llvm-svn: 334537
* [RuntimeDyld] Add an assert to catch misbehaving symbol resolvers. | Lang Hames | 2018-06-12 | 1 | -0/+3
  Resolvers are required to find results for all requested symbols or return an error, but if a resolver fails to adhere to this contract (by returning results for only a subset of the requested symbols) then this code will infinite loop. This assertion catches resolvers that fail to adhere to the contract.

  llvm-svn: 334536
* [MCJIT] Call materializeAll on modules before compiling them in MCJIT. | Lang Hames | 2018-06-12 | 1 | -0/+6
  This only affects modules with lazy GVMaterializers attached (usually modules read off disk using the lazy bitcode reader). For such modules, materializing before compiling prevents crashes due to missing function bodies / initializers.

  llvm-svn: 334535
* [AArch64] Support reserving x20 register | Petr Hosek | 2018-06-12 | 4 | -5/+25
  Register x20 is a callee-saved register which may be used for other purposes in certain contexts, for example to hold special variables within the kernel. This change adds support for reserving this register both to frontend and backend to make this register usable for these purposes.

  Differential Revision: https://reviews.llvm.org/D46552
  llvm-svn: 334531
* [X86] Remove mayLoad flag from AVX512 truncating store instructions. | Craig Topper | 2018-06-12 | 1 | -2/+1
  llvm-svn: 334529
* [MS][ARM64] Hoist __ImageBase handling into TargetLoweringObjectFileCOFF | Reid Kleckner | 2018-06-12 | 4 | -112/+105
  All COFF targets should use @IMGREL32 relocations for symbol differences against __ImageBase. Do the same for getSectionForConstant, so that immediates lowered to globals get merged across TUs.

  Patch by Chris January

  Differential Revision: https://reviews.llvm.org/D47783
  llvm-svn: 334523
* AMDHSA/NFC: Code object v3 updates (additional): | Konstantin Zhuravlyov | 2018-06-12 | 2 | -13/+16
  - Move section selection and alignment to AMDGPUAsmPrinter

  llvm-svn: 334521
* [MIR][MachineCSE] Implementing proper MachineInstr::getNumExplicitDefs() | Roman Tereshin | 2018-06-12 | 2 | -8/+25
  Apparently, the MachineInstr class definition as well as pretty much all of the machine passes assume that the only kind of MachineInstr operands that is variadic for variadic opcodes is explicit non-definitions.

  In particular, this assumption is made by the MachineInstr::defs(), uses(), and explicit_uses() methods, as well as by the MachineCSE pass.

  The assumption is incorrect judging from at least the TableGen backend implementation, which recognizes variable_ops in OutOperandList, and the very existence of the G_UNMERGE_VALUES generic opcode, or ARM load multiple instructions, all of which have variadic defs.

  In particular, the MachineCSE pass breaks MIR with CSE'able G_UNMERGE_VALUES instructions in it.

  This commit implements MachineInstr::getNumExplicitDefs() similar to the pre-existing MachineInstr::getNumExplicitOperands(), fixes MachineInstr::defs(), uses(), and explicit_uses(), and fixes the MachineCSE pass.

  As the issue addressed seems to affect only machine passes that could be run mid-GlobalISel pipeline at the moment, the other passes, like MachineLICM, aren't fixed by this commit: that could be done on a per-pass basis when (if ever) they get adopted for GlobalISel.

  Reviewed By: arsenm

  Differential Revision: https://reviews.llvm.org/D45640
  llvm-svn: 334520
* AMDHSA: Code object v3 updates | Konstantin Zhuravlyov | 2018-06-12 | 6 | -10/+184
  - Do not emit following assembler directives:
    - .hsa_code_object_version
    - .hsa_code_object_isa
    - .amd_amdgpu_isa
    - .amd_amdgpu_hsa_metadata
    - .amd_amdgpu_pal_metadata
  - Do not emit .note entries
  - Cleanup and bring in sync kernel descriptor header file
  - Emit kernel descriptor into .rodata with appropriate relocations and alignments

  llvm-svn: 334519
* Refactor ExecuteAndWait to take StringRefs. | Zachary Turner | 2018-06-12 | 5 | -73/+80
  This simplifies some code which had StringRefs to begin with, and makes other code more complicated which had const char* to begin with.

  In the end, I think this makes for a more idiomatic and platform agnostic API. Not all platforms launch processes with null terminated c-string arrays for the environment pointer and argv, but the API was designed that way because it allowed easy pass-through for posix-based platforms. There's a little additional overhead now, since on posix-based platforms we'll be taking StringRefs which were constructed from null terminated strings and then copying them to null terminate them again, but from a readability and usability standpoint of the API user, I think this API signature is strictly better.

  llvm-svn: 334518
* [MC] [X86] Teach leaq _GLOBAL_OFFSET_TABLE(%rip), %r15 to use R_X86_64_GOTPC32 instead of R_X86_64_PC32 | Fangrui Song | 2018-06-12 | 1 | -1/+7
  Summary: This is similar to D46319 (ARM). x86-64 psABI p40 gives an example:

    leaq _GLOBAL_OFFSET_TABLE(%rip), %r15 # GOTPC32 reloc

  GNU as creates R_X86_64_GOTPC32. However, MC currently emits R_X86_64_PC32.

  Reviewers: javed.absar, echristo
  Subscribers: kristof.beyls, llvm-commits, peter.smith, grimar

  Differential Revision: https://reviews.llvm.org/D47507
  llvm-svn: 334515
* Utilize new SDNode flag functionality to expand current support for fmul | Michael Berg | 2018-06-12 | 1 | -2/+5
  Summary: This patch originated from D46562 and is a proper subset, with some issues addressed for fmul.

  Reviewers: spatel, hfinkel, wristow, arsenm
  Reviewed By: spatel
  Subscribers: nhaehnle, wdng

  Differential Revision: https://reviews.llvm.org/D47911
  llvm-svn: 334514
* [CostModel] Replace ShuffleKind::SK_Alternate with ShuffleKind::SK_Select (PR33744) | Simon Pilgrim | 2018-06-12 | 4 | -52/+47
  As discussed on PR33744, this patch relaxes ShuffleKind::SK_Alternate, which requires shuffle masks to only match an alternating pattern from its 2 sources:

    e.g. v4f32: <0,5,2,7> or <4,1,6,3>

  This seems far too restrictive, as most SIMD hardware will implement it using a general blend/bit-select instruction, so this patch replaces it with SK_Select, permitting elements from either source as long as they are inline:

    e.g. v4f32: <0,5,2,7>, <4,1,6,3>, <0,1,6,7>, <4,1,2,3> etc.

  This initial patch just updates the name and cost model shuffle mask analysis; later patch reviews will update SLP to better utilise this. It still limits itself to SK_Alternate style patterns.

  Differential Revision: https://reviews.llvm.org/D47985
  llvm-svn: 334513
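The difference between the two mask kinds above can be sketched in a few lines. This Python sketch is an illustration of the rule as stated in the commit message, not LLVM's mask analysis: the function names are hypothetical, and undefined mask elements (which LLVM masks can contain) are deliberately not modelled.

```python
def is_select_mask(mask):
    """Every lane i takes element i from either source (index i or i + n)."""
    n = len(mask)
    return all(m in (i, i + n) for i, m in enumerate(mask))


def is_alternate_mask(mask):
    """Select mask whose lanes additionally alternate strictly between sources."""
    n = len(mask)
    if not is_select_mask(mask):
        return False
    from_second = [m >= n for m in mask]
    return all(from_second[i] != from_second[i + 1] for i in range(n - 1))


# The v4f32 examples from the commit message.
for mask in ([0, 5, 2, 7], [4, 1, 6, 3], [0, 1, 6, 7], [4, 1, 2, 3]):
    print(mask, is_select_mask(mask), is_alternate_mask(mask))
```

All four example masks keep every element in its own lane, so each is a select mask, but only the first two alternate between sources on every lane.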
* [DWARFv5] llvm-mc -dwarf-version does not imply -g. | Paul Robinson | 2018-06-12 | 1 | -7/+14
  Don't provide the assembler source as the "root file" unless the user asked to have debug info for the assembler source (with -g).

  If the source doesn't provide an explicit ".file 0" then (a) use the compilation directory as directory #0, and (b) use the file #1 info for file #0 also.

  Differential Revision: https://reviews.llvm.org/D48055
  llvm-svn: 334512
* [X86] Remove TB_ALIGN_16 from VEXTRACTF128/VEXTRACTI128 in the memory folding table. | Craig Topper | 2018-06-12 | 1 | -2/+2
  llvm-svn: 334511