path: root/llvm/lib
Commit message (Collapse) | Author | Age | Files | Lines
* [Support] Update comments about stdout, raw_fd_ostream, and outs() | Reid Kleckner | 2017-08-04 | 1 | -9/+8
    The full story is in the comments:
    // Do not attempt to close stdout or stderr. We used to try to maintain the
    // property that tools that support writing file to stdout should not also
    // write informational output to stdout, but in practice we were never able to
    // maintain this invariant. Many features have been added to LLVM and clang
    // (-fdump-record-layouts, optimization remarks, etc) that print to stdout, so
    // users must simply be aware that mixed output and remarks is a possibility.
    NFC, I am just updating comments to reflect reality.
    llvm-svn: 310016
* Teach GlobalSRA to update the debug info for split-up globals. | Adrian Prantl | 2017-08-04 | 2 | -7/+57
    This is similar to what we are doing in "regular" SROA and creates DW_OP_LLVM_fragment operations to describe the resulting variables.
    rdar://problem/33654891
    llvm-svn: 310014
* [AMDGPU] Add missing hazard for DPP-after-EXEC-write | Connor Abbott | 2017-08-04 | 1 | -1/+8
    Summary: Following the docs, we need at least 5 wait states between an EXEC write and an instruction that uses DPP.
    Reviewers: tstellar, arsenm
    Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits
    Differential Revision: https://reviews.llvm.org/D34849
    llvm-svn: 310013
* Disable libFuzzer tests on Windows | George Karpenkov | 2017-08-04 | 1 | -2/+10
    Differential Revision: https://reviews.llvm.org/D36297
    llvm-svn: 310009
* AMDGPU: Remove pointless asserts | Matt Arsenault | 2017-08-04 | 1 | -3/+0
    llvm-svn: 310007
* Use profile summary to disable peeling for huge working sets | Teresa Johnson | 2017-08-03 | 2 | -14/+42
    Summary: Detect when the working set size of a profiled application is huge, by comparing the number of counts required to reach the hot percentile in the profile summary to a large threshold*. When the working set size is determined to be huge, disable peeling to avoid bloating the working set further.
    *Note that the selected threshold (15K) is significantly larger than the largest working set value in SPEC cpu2006 (which is gcc at around 11K).
    Reviewers: davidxl
    Subscribers: mehdi_amini, mzolotukhin, eraman, llvm-commits
    Differential Revision: https://reviews.llvm.org/D36288
    llvm-svn: 310005
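The working-set check described in this commit can be sketched roughly as follows. This is an illustration, not the code from the patch: the function names, the flat vector of counts, and the 0.99 hot percentile are assumptions; only the 15K threshold comes from the commit message.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical helper: how many counts, taken largest first, are needed to
// cover `percentile` (0..1) of the sum of all counts.
std::size_t numCountsToReach(std::vector<std::uint64_t> counts,
                             double percentile) {
  std::sort(counts.begin(), counts.end(), std::greater<std::uint64_t>());
  std::uint64_t total = 0;
  for (std::uint64_t c : counts)
    total += c;
  const double target = percentile * static_cast<double>(total);
  std::uint64_t running = 0;
  std::size_t n = 0;
  for (std::uint64_t c : counts) {
    if (static_cast<double>(running) >= target)
      break; // the hot percentile is already covered
    running += c;
    ++n;
  }
  return n;
}

// Hypothetical predicate: the working set is "huge" when more counts than
// `threshold` are needed to reach the hot percentile. 15000 mirrors the 15K
// value quoted in the commit message; 0.99 is an assumed percentile.
bool hasHugeWorkingSet(const std::vector<std::uint64_t> &counts,
                       double hotPercentile = 0.99,
                       std::size_t threshold = 15000) {
  return numCountsToReach(counts, hotPercentile) > threshold;
}
```

A caller would then skip peeling whenever `hasHugeWorkingSet(counts)` returns true.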
* AMDGPU: Don't use report_fatal_error for unsupported call types | Matt Arsenault | 2017-08-03 | 3 | -9/+29
    llvm-svn: 310004
* AMDGPU: Remove error on calls for amdgcn | Matt Arsenault | 2017-08-03 | 3 | -24/+16
    Repurpose the -amdgpu-function-calls flag. Rather than requiring it in order to emit a call, use it only to decide whether to run the always-inline path.
    llvm-svn: 310003
* AMDGPU: Fix implicitarg.ptr handling special inputs | Matt Arsenault | 2017-08-03 | 4 | -8/+33
    llvm-svn: 310002
* Support: WOA64 and WOA Signals | Martell Malone | 2017-08-03 | 1 | -6/+23
    Reviewers: rnk
    Differential Revision: https://reviews.llvm.org/D21813
    llvm-svn: 310001
* AMDGPU: Pass special input registers to functions | Matt Arsenault | 2017-08-03 | 11 | -229/+423
    llvm-svn: 309998
* Fix typo. | Eric Christopher | 2017-08-03 | 1 | -1/+1
    llvm-svn: 309997
* AMDGPU: Add analysis pass for function argument info | Matt Arsenault | 2017-08-03 | 6 | -7/+326
    This will allow only adding necessary inputs to callee functions that need special inputs forwarded from the kernel.
    llvm-svn: 309996
* [Inliner] Increase threshold for hot callsites without PGO. | Easwaran Raman | 2017-08-03 | 1 | -3/+67
    Summary: This increases the inlining threshold for hot callsites. Hotness is defined in terms of block frequency of the callsite relative to the caller's entry block's frequency. Since this requires BFI in the inliner, this only affects the new PM pipeline. This is enabled by default at -O3.
    This improves the performance of some internal benchmarks. Notably, an internal benchmark for Gipfeli compression (https://github.com/google/gipfeli) improves by ~7%. Povray in SPEC2006 improves by ~2.5%. I am running more experiments and will update the thread if other benchmarks show improvement/regression.
    In terms of text size, LLVM test-suite shows a 1.22% text size increase. Diving into the results, 13 of the benchmarks in the test-suite increase by > 10%. Most of these are small, but Adobe-C++/loop_unroll (17.6% increase) and tramp3d (20.7% size increase) have >250K text size. On a large application, the text size increases by 2%.
    Reviewers: chandlerc, davidxl
    Subscribers: llvm-commits
    Differential Revision: https://reviews.llvm.org/D36199
    llvm-svn: 309994
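As a rough sketch of the hotness test described in this commit: a callsite counts as hot when its block frequency is at least some multiple of the caller's entry block frequency, and hot callsites get a larger threshold. The function names and all three tuning parameters are hypothetical; the commit message does not state the actual factor or threshold values.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical predicate: hot when callsiteFreq >= hotRelFreq * callerEntryFreq.
// Written as a division so the comparison cannot overflow.
bool isHotCallsite(std::uint64_t callsiteFreq, std::uint64_t callerEntryFreq,
                   std::uint64_t hotRelFreq) {
  if (callerEntryFreq == 0)
    return false; // no usable frequency information
  return callsiteFreq / callerEntryFreq >= hotRelFreq;
}

// Hypothetical threshold selection: a hot callsite may use the larger
// threshold, every other callsite keeps the base threshold.
int pickInlineThreshold(bool hot, int baseThreshold, int hotCallsiteThreshold) {
  return hot ? std::max(baseThreshold, hotCallsiteThreshold) : baseThreshold;
}
```

Because the test uses only frequencies relative to the entry block, it needs block frequency information but no profile data, which matches the "without PGO" wording of the title.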
* [Mips] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). | Eugene Zelenko | 2017-08-03 | 26 | -290/+436
    llvm-svn: 309993
* DAG: Provide access to Pass instance from SelectionDAG | Matt Arsenault | 2017-08-03 | 2 | -2/+4
    This allows accessing an analysis pass during lowering.
    llvm-svn: 309991
* [GlobalISel] Make GlobalISel a non-optional library. | Quentin Colombet | 2017-08-03 | 32 | -243/+36
    With this change, the GlobalISel library is always built. In particular, it is no longer possible to opt GlobalISel out of the build using the LLVM_BUILD_GLOBAL_ISEL variable.
    llvm-svn: 309990
* [NewGVN] Fix the case where we have a phi-of-ops which goes away. | Davide Italiano | 2017-08-03 | 1 | -6/+27
    Patch by Daniel Berlin, fixes PR33196 (and probably something else).
    llvm-svn: 309988
* [PDB] Fix section contributions | Reid Kleckner | 2017-08-03 | 1 | -13/+0
    Summary: PDB section contributions are supposed to use output section indices and offsets, not input section indices and offsets. This allows the debugger to look up the index of the module that it should look up in the modules stream for symbol information. With this change, windbg can now find line tables, but it still cannot print local variables.
    Fixes PR34048
    Reviewers: zturner
    Subscribers: hiraditya, ruiu, llvm-commits
    Differential Revision: https://reviews.llvm.org/D36285
    llvm-svn: 309987
* [LVI] Constant-propagate a zero extension of the switch condition value through case edges | Hiroshi Yamauchi | 2017-08-03 | 1 | -6/+114
    Summary: (This is a second attempt as https://reviews.llvm.org/D34822 was reverted.)
    LazyValueInfo currently computes the constant value of the switch condition through case edges, which allows the constant value to be propagated through the case edges. But we have seen a case where a zero-extended value of the switch condition is used past case edges for which the constant propagation doesn't occur. This patch adds a small piece of logic to handle such a case in getEdgeValueLocal().
    This is motivated by the Python 2.7 eval loop in PyEval_EvalFrameEx() where the lack of the constant propagation causes longer live ranges and more spill code than necessary. With this patch, we see that the code size of PyEval_EvalFrameEx() decreases by ~5.4% and a performance test improves by ~4.6%.
    Reviewers: sanjoy
    Reviewed By: sanjoy
    Subscribers: llvm-commits
    Differential Revision: https://reviews.llvm.org/D36247
    llvm-svn: 309986
* [libFuzzer] Un-reverting change in tests after fixing the failure on Linux. | George Karpenkov | 2017-08-03 | 1 | -1/+1
    Differential Revision: https://reviews.llvm.org/D36242
    llvm-svn: 309982
* Fix llvm-for-windows-on-linux build after LLVM r272701. | Nico Weber | 2017-08-03 | 1 | -2/+2
    The file is called "intrin.h". When building targeting Windows on a Linux system, with the SDK mounted in a case-insensitive file system, "Intrin.h" will miss clang's intrin.h header (because that's not in a case-insensitive file system) but then find intrin.h in the Microsoft SDK. clang can't handle the SDK's intrin.h.
    https://reviews.llvm.org/D36281
    llvm-svn: 309980
* Disable loop peeling during full unrolling pass. | Teresa Johnson | 2017-08-03 | 1 | -20/+27
    Summary: Peeling should not occur during the full unrolling invocation early in the pipeline, but rather later with partial and runtime loop unrolling. The later loop unrolling invocation will also eventually utilize profile summary and branch frequency information, which we would like to use to control peeling. And for ThinLTO we want to delay peeling until the backend (post thin link) phase, just as we do for most types of unrolling.
    Ensure peeling doesn't occur during the full unrolling invocation by adding a parameter to the shared implementation function, similar to the way partial and runtime loop unrolling are disabled.
    Performance results for ThinLTO suggest this has a neutral to positive effect on some internal benchmarks.
    Reviewers: chandlerc, davidxl
    Subscribers: mzolotukhin, llvm-commits, mehdi_amini
    Differential Revision: https://reviews.llvm.org/D36258
    llvm-svn: 309966
* Do not want to use BFI to get profile count for sample pgo | Dehao Chen | 2017-08-03 | 1 | -2/+18
    Summary: For SamplePGO, we already record the callsite count in the call instruction itself. So we do not want to use BFI to get profile count as it is less accurate.
    Reviewers: tejohnson, davidxl, eraman
    Reviewed By: eraman
    Subscribers: sanjoy, llvm-commits, mehdi_amini
    Differential Revision: https://reviews.llvm.org/D36025
    llvm-svn: 309964
* Revert "[AArch64] Simplify AES*Tied pseudo expansion (NFC)." | Tim Northover | 2017-08-03 | 1 | -3/+10
    This reverts commit r309821. My suggestion was wrong because it left the MachineOperands tied which confused the verifier. Since there's no easy way to untie operands, the original BuildMI solution is probably best.
    llvm-svn: 309962
* AMDGPU/SI: Don't fix a PHI under uniform branch in SIFixSGPRCopies only when sources and destination are all sgprs | Changpeng Fang | 2017-08-03 | 1 | -3/+3
    Summary: If a PHI has at least one VGPR operand, we have to fix the PHI in SIFixSGPRCopies.
    Reviewer: Matt
    Differential Revision: http://reviews.llvm.org/D34727
    llvm-svn: 309959
* [DAG] Allow merging of stores of vector loads | Nirav Dave | 2017-08-03 | 1 | -6/+0
    Remove the restriction that disallowed merging stores of vector loads into a larger store of a larger vector load.
    Reviewers: RKSimon, efriedma, spatel
    Subscribers: llvm-commits
    Differential Revision: https://reviews.llvm.org/D36158
    llvm-svn: 309951
* Revert r309923, it caused PR34045. | Nico Weber | 2017-08-03 | 2 | -156/+13
    llvm-svn: 309950
* [NewGVN] fix typos; NFC | Sanjay Patel | 2017-08-03 | 1 | -8/+8
    llvm-svn: 309946
* [LiveDebugVariables] Use lexical scope to trim debug value live intervals | Robert Lougher | 2017-08-03 | 1 | -7/+90
    The debug value live intervals computed by Live Debug Variables may extend beyond the range of the debug location's lexical scope. In this case, splitting of an interval can result in an interval outside of the scope being created, causing extra unnecessary DBG_VALUEs to be emitted. To prevent this, trim the intervals to the lexical scope.
    This resolves PR33730.
    Reviewers: aprantl
    Differential Revision: https://reviews.llvm.org/D35953
    llvm-svn: 309933
* [SelectionDAG] Resolve PR33978. | Simon Dardis | 2017-08-03 | 1 | -4/+2
    rL306209 taught SelectionDAG how to add the dereferenceable flag when expanding memcpy and memmove. The fix, however, contained a nit where the offset + size was constructed as an APInt of PointerSize rather than PointerSizeInBits. This led to isDereferenceableAndAlignedPointer() getting truncated values, or values which would be sign extended within that function, leading to incorrect results.
    Thanks to Alex Crichton for reporting the issue!
    This resolves PR33978.
    Reviewers: inouehrs
    Differential Revision: https://reviews.llvm.org/D36236
    llvm-svn: 309930
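To illustrate why a width given in bytes instead of bits corrupts the value, here is a small standalone model. It does not use APInt; `truncToWidth` and `signExtendFromWidth` are hypothetical helpers that mimic what a fixed-width integer of `bits` bits would hold, assuming a 64-bit pointer (8 bytes, 64 bits).

```cpp
#include <cstdint>

// Keep only the low `bits` bits of v, as a fixed-width integer would.
std::uint64_t truncToWidth(std::uint64_t v, unsigned bits) {
  return bits >= 64 ? v : (v & ((std::uint64_t{1} << bits) - 1));
}

// Interpret the low `bits` bits of v as a signed value (requires bits >= 1).
std::int64_t signExtendFromWidth(std::uint64_t v, unsigned bits) {
  if (bits >= 64)
    return static_cast<std::int64_t>(v);
  const std::uint64_t signBit = std::uint64_t{1} << (bits - 1);
  v = truncToWidth(v, bits);
  return static_cast<std::int64_t>((v ^ signBit) - signBit);
}
```

With an offset + size of 300, a width of 8 (the pointer size in bytes) keeps only 300 mod 256 = 44, while a width of 64 (the pointer size in bits) keeps 300. A value such as 200 even turns negative (-56) once it is sign extended from 8 bits.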
* [Cloning] Move distinct GlobalVariable debug info metadata in CloneModule | Ewan Crawford | 2017-08-03 | 1 | -1/+2
    Duplicating the distinct Subprogram and CU metadata nodes seems like the incorrect thing to do in CloneModule for GlobalVariable debug info, as it results in the scope of the GlobalVariable DI no longer being consistent with the rest of the module, and the new CU is absent from llvm.dbg.cu.
    Fixed by adding RF_MoveDistinctMDs to MapMetadata flags for GlobalVariables.
    Current unit test IR after clone:
    ```
    @gv = global i32 1, comdat($comdat), !dbg !0, !type !5
    define private void @f() comdat($comdat) personality void ()* @persfn !dbg !14 {
    !llvm.dbg.cu = !{!10}
    !0 = !DIGlobalVariableExpression(var: !1)
    !1 = distinct !DIGlobalVariable(name: "gv", linkageName: "gv", scope: !2, file: !3, line: 1, type: !9, isLocal: false, isDefinition: true)
    !2 = distinct !DISubprogram(name: "f", linkageName: "f", scope: null, file: !3, line: 4, type: !4, isLocal: true, isDefinition: true, scopeLine: 3, isOptimized: false, unit: !6, variables: !5)
    !3 = !DIFile(filename: "filename.c", directory: "/file/dir/")
    !4 = !DISubroutineType(types: !5)
    !5 = !{}
    !6 = distinct !DICompileUnit(language: DW_LANG_C99, file: !7, producer: "CloneModule", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !5, globals: !8)
    !7 = !DIFile(filename: "filename.c", directory: "/file/dir")
    !8 = !{!0}
    !9 = !DIBasicType(tag: DW_TAG_unspecified_type, name: "decltype(nullptr)")
    !10 = distinct !DICompileUnit(language: DW_LANG_C99, file: !7, producer: "CloneModule", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !5, globals: !11)
    !11 = !{!12}
    !12 = !DIGlobalVariableExpression(var: !13)
    !13 = distinct !DIGlobalVariable(name: "gv", linkageName: "gv", scope: !14, file: !3, line: 1, type: !9, isLocal: false, isDefinition: true)
    !14 = distinct !DISubprogram(name: "f", linkageName: "f", scope: null, file: !3, line: 4, type: !4, isLocal: true, isDefinition: true, scopeLine: 3, isOptimized: false, unit: !10, variables: !5)
    ```
    Patched IR after clone:
    ```
    @gv = global i32 1, comdat($comdat), !dbg !0, !type !5
    define private void @f() comdat($comdat) personality void ()* @persfn !dbg !2 {
    !llvm.dbg.cu = !{!6}
    !0 = !DIGlobalVariableExpression(var: !1)
    !1 = distinct !DIGlobalVariable(name: "gv", linkageName: "gv", scope: !2, file: !3, line: 1, type: !9, isLocal: false, isDefinition: true)
    !2 = distinct !DISubprogram(name: "f", linkageName: "f", scope: null, file: !3, line: 4, type: !4, isLocal: true, isDefinition: true, scopeLine: 3, isOptimized: false, unit: !6, variables: !5)
    !3 = !DIFile(filename: "filename.c", directory: "/file/dir/")
    !4 = !DISubroutineType(types: !5)
    !5 = !{}
    !6 = distinct !DICompileUnit(language: DW_LANG_C99, file: !7, producer: "CloneModule", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !5, globals: !8)
    !7 = !DIFile(filename: "filename.c", directory: "/file/dir")
    !8 = !{!0}
    !9 = !DIBasicType(tag: DW_TAG_unspecified_type, name: "decltype(nullptr)")
    ```
    Reviewers: aprantl, probinson, dblaikie, echristo, loladiro
    Reviewed By: aprantl
    Subscribers: llvm-commits
    Differential Revision: https://reviews.llvm.org/D36082
    llvm-svn: 309928
* [ARM] GlobalISel: Select simple G_GLOBAL_VALUE instructions | Diana Picus | 2017-08-03 | 1 | -0/+57
    Add support in the instruction selector for G_GLOBAL_VALUE for ELF and MachO for the static relocation model. We don't handle Windows yet because that's Thumb-only, and we don't handle Thumb in general at the moment.
    Support for PIC, ROPI, RWPI and TLS will be added in subsequent commits.
    Differential Revision: https://reviews.llvm.org/D35883
    llvm-svn: 309927
* [X86] SET0 to use XMM registers where possible PR26018 PR32862 | Dinar Temirbulatov | 2017-08-03 | 1 | -8/+13
    Differential Revision: https://reviews.llvm.org/D35965
    llvm-svn: 309926
* [SCEV] Re-enable "Cache results of computeExitLimit" | Max Kazantsev | 2017-08-03 | 1 | -2/+37
    The patch rL309080 was reverted because it did not clean up the cache on "forgetValue" method call. This patch re-enables this change, adds the missing check and introduces two new unit tests that make sure that the cache is cleaned properly.
    Differential Revision: https://reviews.llvm.org/D36087
    llvm-svn: 309925
* [ARM] Use ADDCARRY / SUBCARRY | Roger Ferrer Ibanez | 2017-08-03 | 2 | -13/+156
    This patch:
    - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32
    - lowering is done by first converting the boolean value into the carry flag using (_, C) <- (ARMISD::ADDC R, -1) and converted back to an integer value using (R, _) <- (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two operations does the actual addition.
    - for subtraction, given that ISD::SUBCARRY second result is actually a borrow, we need to invert the value of the second operand and result before and after using ARMISD::SUBE. We need to invert the carry result of ARMISD::SUBE to preserve the semantics.
    - given that the generic combiner may lower ISD::ADDCARRY and ISD::SUBCARRY into ISD::UADDO and ISD::USUBO we need to update their lowering as well otherwise i64 operations now would require branches. This implies updating the corresponding test for unsigned.
    - add new combiner to remove the redundant conversions from/to carry flags to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) -> C
    Differential Revision: https://reviews.llvm.org/D35192
    llvm-svn: 309923
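The boolean-to-carry round trip for addition can be modelled on plain 32-bit integers. The following is a simulation sketch, not backend code: `addc` and `adde` stand in for ARMISD::ADDC and ARMISD::ADDE, and every other name is hypothetical.

```cpp
#include <cstdint>

struct AddResult {
  std::uint32_t value; // low 32 bits of the sum
  bool carry;          // carry-out flag
};

// Model of ADDC: 32-bit add that produces a carry-out.
AddResult addc(std::uint32_t a, std::uint32_t b) {
  const std::uint64_t wide = std::uint64_t{a} + b;
  return {static_cast<std::uint32_t>(wide), (wide >> 32) != 0};
}

// Model of ADDE: 32-bit add with carry-in and carry-out.
AddResult adde(std::uint32_t a, std::uint32_t b, bool carryIn) {
  const std::uint64_t wide = std::uint64_t{a} + b + (carryIn ? 1 : 0);
  return {static_cast<std::uint32_t>(wide), (wide >> 32) != 0};
}

// (_, C) <- (ADDC R, -1): a boolean R in {0, 1} becomes the carry flag.
bool boolToCarry(std::uint32_t r) { return addc(r, 0xFFFFFFFFu).carry; }

// (R, _) <- (ADDE 0, 0, C): the carry flag becomes an integer 0 or 1.
std::uint32_t carryToBool(bool c) { return adde(0, 0, c).value; }

// Model of ISD::ADDCARRY: convert the incoming boolean to a flag, then let
// ADDE do the actual addition. The caller converts result.carry back with
// carryToBool when it needs an integer.
AddResult addCarry(std::uint32_t a, std::uint32_t b, std::uint32_t carryIn) {
  return adde(a, b, boolToCarry(carryIn));
}
```

Adding -1 to 0 does not carry, while adding -1 to 1 does, which is why the first conversion works. Since `boolToCarry(carryToBool(c))` always equals `c`, a combine that folds that pair back to C is sound in this model.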
* Fix WebAssembly target after r309911. | Daniel Jasper | 2017-08-03 | 3 | -16/+5
    llvm-svn: 309922
* Fix the ppc jit tests. | Rafael Espindola | 2017-08-03 | 1 | -3/+4
    llvm-svn: 309921
* [RegisterCoalescer] Add wrapper for Erasing Instructions | Sameer AbuAsal | 2017-08-03 | 1 | -14/+16
    Summary: To delete an instruction the coalescer needs to call eraseFromParent() on the MachineInstr, insert it in the ErasedInstrs list and update the Live Ranges structure. This patch re-factors the code to do all that in one function. This will also fix cases where previous code wasn't inserting deleted instructions in the ErasedList.
    Reviewers: qcolombet, kparzysz
    Reviewed By: qcolombet
    Subscribers: MatzeB, llvm-commits, qcolombet
    Differential Revision: https://reviews.llvm.org/D36204
    llvm-svn: 309915
* Delete Default and JITDefault code models | Rafael Espindola | 2017-08-03 | 48 | -327/+361
    IMHO it is an antipattern to have an enum value that is Default. At any given piece of code it is not clear if we have to handle Default or if it has already been mapped to a concrete value. In this case in particular, only the target can do the mapping and it is nice to make sure it is always done.
    This deletes the two default enum values of CodeModel and uses an explicit Optional<CodeModel> when it is possible that it is unspecified.
    llvm-svn: 309911
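The pattern described in this commit can be sketched as follows. This is only an illustration: `std::optional` (C++17) stands in for LLVM's own Optional, the enumerators are illustrative, and `getEffectiveCodeModel` is a hypothetical name for the target-side mapping.

```cpp
#include <optional>

// No Default enumerator: every value here is a concrete code model.
enum class CodeModel { Small, Kernel, Medium, Large };

// Hypothetical target hook: an unspecified request is mapped to the target's
// own default in exactly one place, so later code never sees "unspecified".
CodeModel getEffectiveCodeModel(std::optional<CodeModel> requested,
                                CodeModel targetDefault) {
  return requested.value_or(targetDefault);
}
```

Code that still has to cope with a missing value holds an optional and says so in its type, while code that receives a plain `CodeModel` can rely on it being concrete.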
* [ARM] Tidy up banked registers encoding | Javed Absar | 2017-08-03 | 5 | -77/+74
    Moves encoding (SYSm) information of banked registers to ARMSystemRegister.td, where it rightly belongs and forms a single point of reference in the code.
    Reviewed by: @fhahn, @rovka, @olista01
    Differential Revision: https://reviews.llvm.org/D36219
    llvm-svn: 309910
* Fix the bug when SampleProfileWriter writes out number of callsites. | Dehao Chen | 2017-08-03 | 1 | -1/+4
    Summary: As we support multiple callsites for the same location, we need to traverse all locations to get the number of callsites.
    Reviewers: davidxl
    Reviewed By: davidxl
    Subscribers: sanjoy, llvm-commits
    Differential Revision: https://reviews.llvm.org/D36246
    llvm-svn: 309907
* [Coverage] Add an API to retrieve all instantiations of a function (NFC) | Vedant Kumar | 2017-08-02 | 1 | -7/+7
    The CoverageMapping::getInstantiations() API retrieved all function records corresponding to functions with more than one instantiation (e.g. template functions with multiple specializations). However, there was no simple way to determine *which* function a given record was an instantiation of. This was an oversight, since it's useful to aggregate coverage information over all instantiations of a function.
    llvm-cov works around this by building a mapping of source locations to instantiation sets, but this duplicates logic that libCoverage already has (see FunctionInstantiationSetCollector).
    This change adds a new API, CoverageMapping::getInstantiationGroups(), which returns a list of InstantiationGroups. A group contains records for each instantiation of some particular function, and also provides utilities to get the total execution count within the group, the source location of the common definition, etc.
    This lets us remove some hacky logic in llvm-cov by reusing FunctionInstantiationSetCollector and makes the CoverageMapping API friendlier for other clients.
    llvm-svn: 309904
* Revert "[libFuzzer tests] Use substring comparison in libFuzzer tests" | George Karpenkov | 2017-08-02 | 1 | -1/+1
    This reverts commit 3592d8049660dcdd07f7c2e797f2de9790f93111.
    Breaks the bots, reverting for now.
    llvm-svn: 309899
* AMDGPU/GlobalISel: Mark 32-bit G_FMUL as legal | Tom Stellard | 2017-08-02 | 1 | -0/+2
    Reviewers: arsenm
    Reviewed By: arsenm
    Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, llvm-commits, t-tye
    Differential Revision: https://reviews.llvm.org/D36218
    llvm-svn: 309898
* [pdb/lld] Write a valid FPM. | Zachary Turner | 2017-08-02 | 5 | -11/+110
    The PDB reserves certain blocks for the FPM that describe which blocks in the file are allocated and which are free. We weren't filling that out at all, and in some cases we were even stomping it with incorrect data. This patch writes a correct FPM.
    Differential Revision: https://reviews.llvm.org/D36235
    llvm-svn: 309896
* [pdbutil] Add a command to dump the FPM. | Zachary Turner | 2017-08-02 | 3 | -18/+21
    Recently problems have been discovered in the way we write the FPM (free page map). In order to fix this, we first need to establish a baseline about what a correct FPM looks like using an MSVC generated PDB, so that we can then make our own generated PDBs match. And in order to do this, the dumper needs a mode where it can dump an FPM so that we can write tests for it.
    This patch adds a command to dump the FPM, as well as a test against a known-good PDB.
    llvm-svn: 309894
* AMDGPU/R600: Initialize more passes | Tom Stellard | 2017-08-02 | 7 | -8/+68
    Reviewers: arsenm
    Reviewed By: arsenm
    Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits
    Differential Revision: https://reviews.llvm.org/D36128
    llvm-svn: 309893
* Xray docs with description of Flight Data Recorder binary format. | Keith Wyss | 2017-08-02 | 1 | -5/+6
    Summary: Adding a new reStructuredText file to document the trace format produced with an FDR mode handler and read by the llvm-xray toolset.
    Fixed two problems in the documentation from differential review: one bad table and a missing link in the toc.
    Original commit was e97c5836a77db803fe53319c53f3bf8e8b26d2b7.
    Reviewers: dberris, pelikan
    Subscribers: llvm-commits
    Differential Revision: https://reviews.llvm.org/D36041
    llvm-svn: 309891
* LV: Don't insert runtime ptr checks on divergent targets | Matt Arsenault | 2017-08-02 | 1 | -0/+12
    llvm-svn: 309890