summaryrefslogtreecommitdiffstats
path: root/clang/lib/Frontend/CompilerInvocation.cpp
Commit message (Collapse)AuthorAgeFilesLines
* [Driver][CodeGen] Add -fpatchable-function-entry=N[,0]Fangrui Song2020-01-101-0/+2
| | | | | | | | | | | In the backend, this feature is implemented with the function attribute "patchable-function-entry". Both the attribute and XRay use TargetOpcode::PATCHABLE_FUNCTION_ENTER, so the two features are incompatible. Reviewed By: ostannard, MaskRay Differential Revision: https://reviews.llvm.org/D72222
* [HIP] Add option --gpu-max-threads-per-block=nYaxun (Sam) Liu2020-01-071-0/+6
| | | | | | Add this option to change the default launch bounds. Differential Revision: https://reviews.llvm.org/D71221
* Add Triple::isX86()Fangrui Song2020-01-061-1/+1
| | | | | | Reviewed By: craig.topper, skan Differential Revision: https://reviews.llvm.org/D72247
* [NFC] Separate getLastArgIntValue to BasicYaxun (Sam) Liu2019-12-211-27/+0
| | | | | | | | getLastArgIntValue is a useful utility function to get command line argument as an integer. Currently it is in Frontend so that it can only be used by clang -cc1. Move it to basic so that it can also be used by clang driver. Differential Revision: https://reviews.llvm.org/D71080
* [Clang FE, SystemZ] Recognize -mrecord-mcount CL option.Jonas Paulsson2019-12-191-0/+1
| | | | | | | | | | Recognize -mrecord-mcount from the command line and add a function attribute "mrecord-mcount" when passed. Only valid on SystemZ (when used with -mfentry). Review: Ulrich Weigand https://reviews.llvm.org/D71627
* Re-land "Add an -fno-temp-file flag for compilation"Hans Wennborg2019-12-191-0/+1
| | | | | | | | This time making sure to initialize FrontendOptions::UseTemporary. Patch by Zachary Henkel! Differential revision: https://reviews.llvm.org/D70615
* Revert "Add an -fno-temp-file flag for compilation"Mitch Phillips2019-12-181-1/+0
| | | | | | | This reverts commit d129aa1d5369781deff6c6b854cb612e160d3fb2. This broke the MSan buildbots. More information available in the original PR: https://reviews.llvm.org/D70615
* Add an -fno-temp-file flag for compilationHans Wennborg2019-12-181-0/+1
| | | | | | | | | | | | Our build system does not handle randomly named files created during the build well. We'd prefer to write compilation output directly without creating a temporary file. Function parameters already existed to control this behavior but were not exposed all the way out to the command line. Patch by Zachary Henkel! Differential revision: https://reviews.llvm.org/D70615
* [Frontend] Fixes -Wrange-loop-analysis warningsMark de Wever2019-12-171-3/+3
| | | | | | This avoids new warnings due to D68912 adds -Wrange-loop-analysis to -Wall. Differential Revision: https://reviews.llvm.org/D71530
* [Clang FE, SystemZ] Recognize -mpacked-stack CL optionJonas Paulsson2019-12-171-0/+1
| | | | | | | | | | | Recognize -mpacked-stack from the command line and add a function attribute "mpacked-stack" when passed. This is needed for building the Linux kernel. If this option is passed for any other target than SystemZ, an error is generated. Review: Ulrich Weigand https://reviews.llvm.org/D71441
* Default to -fuse-init-arrayFangrui Song2019-12-121-1/+1
| | | | | | | | | | | | | | | | | | | Very few ELF platforms still use .ctors/.dtors now. Linux (glibc: 1999-07), DragonFlyBSD, FreeBSD (2012-03) and Solaris have supported .init_array for many years. Some architectures like AArch64/RISC-V default to .init_array . GNU ld and gold can even convert .ctors to .init_array . It makes more sense to flip the CC1 default, and only uses -fno-use-init-array on platforms that don't support .init_array . For example, OpenBSD did not support DT_INIT_ARRAY before Aug 2016 (https://github.com/openbsd/src/commit/86fa57a2792c6374b0849dd7b818a11e676e60ba) I may miss some ELF platforms that still use .ctors, but their maintainers can easily diagnose such problems. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D71393
* [OpenMP] Use the OpenMP-IR-BuilderJohannes Doerfert2019-12-111-0/+2
| | | | | | | | | | | | | | | This is a follow up patch to use the OpenMP-IR-Builder, as discussed on the mailing list ([1] and later) and at the US Dev Meeting'19. [1] http://lists.flang-compiler.org/pipermail/flang-dev_lists.flang-compiler.org/2019-May/000197.html Reviewers: kiranchandramohan, ABataev, RaviNarayanaswamy, gtbercea, grokos, sdmitriev, JonChesterfield, hfinkel, fghanim Subscribers: ppenzin, penzn, llvm-commits, cfe-commits, jfb, guansong, bollu, hiraditya, mgorny Tags: #clang Differential Revision: https://reviews.llvm.org/D69922
* [Frontend] Allow OpenMP offloading to aarch64Bryan Chan2019-12-081-1/+2
| | | | | | | | | | | | | | | | | | | | | Summary: D30644 added OpenMP offloading to AArch64 targets, then D32035 changed the frontend to throw an error when offloading is requested for an unsupported target architecture. However the latter did not include AArch64 in the list of supported architectures, causing the following unit tests to fail: libomptarget :: api/omp_get_num_devices.c libomptarget :: mapping/pr38704.c libomptarget :: offloading/offloading_success.c libomptarget :: offloading/offloading_success.cpp Reviewers: pawosm01, gtbercea, jdoerfert, ABataev Subscribers: kristof.beyls, guansong, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D70804
* Reapply af57dbf12e54 "Add support for options -frounding-math, ↵Melanie Blower2019-12-051-0/+28
| | | | | | | | | | | | ftrapping-math, -ffp-model=, and -ffp-exception-behavior=" Patch was reverted because https://bugs.llvm.org/show_bug.cgi?id=44048 The original patch is modified to set the strictfp IR attribute explicitly in CodeGen instead of as a side effect of IRBuilder. In the 2nd attempt to reapply there was a windows lit test fail, the tests were fixed to use wildcard matching. Differential Revision: https://reviews.llvm.org/D62731
* Revert " Reapply af57dbf12e54 "Add support for options ↵Melanie Blower2019-12-041-28/+0
| | | | | | | -frounding-math, ftrapping-math, -ffp-model=, and -ffp-exception-behavior="" This reverts commit cdbed2dd856c14687efd741c2d8321686102acb8. Build break on Windows (lit fail)
* Reapply af57dbf12e54 "Add support for options -frounding-math, ↵Melanie Blower2019-12-041-0/+28
| | | | | | | | | | ftrapping-math, -ffp-model=, and -ffp-exception-behavior=" Patch was reverted because https://bugs.llvm.org/show_bug.cgi?id=44048 The original patch is modified to set the strictfp IR attribute explicitly in CodeGen instead of as a side effect of IRBuilder Differential Revision: https://reviews.llvm.org/D62731
* [ConstExprPreter] Removed the flag forcing the use of the interpreterNandor Licker2019-11-271-2/+0
| | | | | | | | | | | | | | | | | | | Summary: Removed the ```-fforce-experimental-new-constant-interpreter flag```, leaving only the ```-fexperimental-new-constant-interpreter``` one. The interpreter now always emits an error on an unsupported feature. Allowing the interpreter to bail out would require a mapping from APValue to interpreter memory, which will not be necessary in the final version. It is more sensible to always emit an error if the interpreter fails. Reviewers: jfb, Bigcheese, rsmith, dexonsmith Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D70071
* Initial implementation of -fmacro-prefix-map and -ffile-prefix-mapDan McGregor2019-11-261-0/+3
| | | | | | | | | GCC 8 implements -fmacro-prefix-map. Like -fdebug-prefix-map, it replaces a string prefix for the __FILE__ macro. -ffile-prefix-map is the union of -fdebug-prefix-map and -fmacro-prefix-map Reviewed By: rnk, Lekensteyn, maskray Differential Revision: https://reviews.llvm.org/D49466
* Move vtordisp mode from Attr class to LangOptions.h, NFCReid Kleckner2019-11-221-1/+2
| | | | | This removes one of the two uses of Attr.h in DeclCXX.h, reducing the need to include Attr.h as widely. LangOptions is already very popular.
* clang: Add -fconvergent-functions flagMatt Arsenault2019-11-191-0/+3
| | | | | | | | The CUDA builtin library is apparently compiled in C++ mode, so the assumption of convergent needs to be made in a typically non-SPMD language. The functions in the library should still be assumed convergent. Currently they are not, which is potentially incorrect and this happens to work after the library is linked.
* Work on cleaning up denormal mode handlingMatt Arsenault2019-11-191-7/+2
| | | | | | | | | | | | | | | | | | | | | | Cleanup handling of the denormal-fp-math attribute. Consolidate places checking the allowed names in one place. This is in preparation for introducing FP type specific variants of the denormal-fp-mode attribute. AMDGPU will switch to using this in place of the current hacky use of subtarget features for the denormal mode. Introduce a new header for dealing with FP modes. The constrained intrinsic classes define related enums that should also be moved into this header for uses in other contexts. The verifier could use a check to make sure the denorm-fp-mode attribute is sane, but there currently isn't one. Currently, DAGCombiner incorrectly asssumes non-IEEE behavior by default in the one current user. Clang must be taught to start emitting this attribute by default to avoid regressions when this is switched to assume ieee behavior if the attribute isn't present.
* Temporarily Revert "Add support for options -frounding-math, ftrapping-math, ↵Eric Christopher2019-11-181-30/+0
| | | | | | | | -ffp-model=, and -ffp-exception-behavior=" and a follow-up NFC rearrangement as it's causing a crash on valid. Testcase is on the original review thread. This reverts commits af57dbf12e54f3a8ff48534bf1078f4de104c1cd and e6584b2b7b2de06f1e59aac41971760cac1e1b79
* Revert "Reland "[clang] Report sanitizer blacklist as a dependency in cc1""Jan Korous2019-11-081-1/+20
| | | | This reverts commit cae4a28864f4e8a55920e2b94e2cd43617902dec.
* [clang] Fix -fsanitize-system-blacklist processing in cc1Jan Korous2019-11-081-0/+5
|
* Reland "[clang] Report sanitizer blacklist as a dependency in cc1"Jan Korous2019-11-081-20/+1
| | | | This reverts commit 3182027282c59c51d5080d83365917fccd695854.
* Reland "[clang] Report sanitizer blacklist as a dependency in cc1"Jan Korous2019-11-081-1/+20
| | | | This reverts commit 9b8413ac6e56e7a6e0ba884773d13bcf9414bd43.
* Revert "Revert "Revert "[clang] Report sanitizer blacklist as a dependency ↵Abel Kocsis2019-11-081-20/+1
| | | | | | in cc1""" This reverts commit 3182027282c59c51d5080d83365917fccd695854.
* Revert "Revert "[clang] Report sanitizer blacklist as a dependency in cc1""Abel Kocsis2019-11-081-1/+20
| | | | This reverts commit 6b45e1bc11e91ea7b57a6ab1c19461a86dba33f8.
* Revert "[clang] Report sanitizer blacklist as a dependency in cc1"Jeremy Morse2019-11-081-20/+1
| | | | | | | | | | This reverts commit 03b84e4f6d0e1c04f22d69cc445f36e1f713beb4. This breaks dfsan tests with a linking failure, in for example this build: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/24312 Reverting this patch locally makes those tests succeed.
* [clang] Report sanitizer blacklist as a dependency in cc1Jan Korous2019-11-071-1/+20
| | | | | | | | Previously these were reported from the driver which blocked clang-scan-deps from getting the full set of dependencies from cc1 commands. Also the default sanitizer blacklist that is added in driver was never reported as a dependency. I introduced -fsanitize-system-blacklist cc1 option to keep track of which blacklists were user-specified and which were added by driver and clang -MD now also reports system blacklists as dependencies. Differential Revision: https://reviews.llvm.org/D69290
* Add support for options -frounding-math, ftrapping-math, -ffp-model=, and ↵Melanie Blower2019-11-071-0/+30
| | | | | | | | | | | | -ffp-exception-behavior= Add options to control floating point behavior: trapping and exception behavior, rounding, and control of optimizations that affect floating point calculations. More details in UsersManual.rst. Reviewers: rjmccall Differential Revision: https://reviews.llvm.org/D62731
* [Clang FE] Recognize -mnop-mcount CL option (SystemZ only).Jonas Paulsson2019-11-051-0/+1
| | | | | | | | | | | | | | Recognize -mnop-mcount from the command line and add a function attribute "mnop-mcount"="true" when passed. When this option is used, a nop is added instead of a call to fentry. This is used when building the Linux Kernel. If this option is passed for any other target than SystemZ, an error is generated. Review: Ulrich Weigand https://reviews.llvm.org/D67763
* Recommit "[CodeView] Add option to disable inline line tables."Amy Huang2019-11-041-0/+1
| | | | | | | | | | | | This reverts commit 004ed2b0d1b86d424643ffc88fce20ad8bab6804. Original commit hash 6d03890384517919a3ba7fe4c35535425f278f89 Summary: This adds a clang option to disable inline line tables. When it is used, the inliner uses the call site as the location of the inlined function instead of marking it as an inline location with the function location. https://reviews.llvm.org/D67723
* [cfi] Add flag to always generate .debug_frameDavid Candler2019-10-311-0/+3
| | | | | | | | | This adds a flag to LLVM and clang to always generate a .debug_frame section, even if other debug information is not being generated. In situations where .eh_frame would normally be emitted, both .debug_frame and .eh_frame will be used. Differential Revision: https://reviews.llvm.org/D67216
* Revert "[CodeView] Add option to disable inline line tables."Amy Huang2019-10-301-1/+0
| | | | | | because it breaks compiler-rt tests. This reverts commit 6d03890384517919a3ba7fe4c35535425f278f89.
* [CodeView] Add option to disable inline line tables.Amy Huang2019-10-301-0/+1
| | | | | | | | | | | | | | | | | Summary: This adds a clang option to disable inline line tables. When it is used, the inliner uses the call site as the location of the inlined function instead of marking it as an inline location with the function location. See https://bugs.llvm.org/show_bug.cgi?id=42344 Reviewers: rnk Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D67723
* Add Windows Control Flow Guard checks (/guard:cf).Andrew Paverd2019-10-281-0/+1
| | | | | | | | | | | | | | | | | | | Summary: A new function pass (Transforms/CFGuard/CFGuard.cpp) inserts CFGuard checks on indirect function calls, using either the check mechanism (X86, ARM, AArch64) or or the dispatch mechanism (X86-64). The check mechanism requires a new calling convention for the supported targets. The dispatch mechanism adds the target as an operand bundle, which is processed by SelectionDAG. Another pass (CodeGen/CFGuardLongjmp.cpp) identifies and emits valid longjmp targets, as required by /guard:cf. This feature is enabled using the `cfguard` CC1 option. Reviewers: thakis, rnk, theraven, pcc Subscribers: ychen, hans, metalcanine, dmajor, tomrittervg, alex, mehdi_amini, mgorny, javed.absar, kristof.beyls, hiraditya, steven_wu, dexonsmith, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D65761
* Fix name of warn_ignored_hip_only_optionYaxun (Sam) Liu2019-10-221-1/+1
| | | | Differential Revision: https://reviews.llvm.org/D69268
* [HIP] Add option -fgpu-allow-device-initYaxun (Sam) Liu2019-10-221-0/+7
| | | | | | | | | | | | | | Add this option to allow device side class type global variables with non-trivial ctor/dtor. device side init/fini functions will be emitted, which will be executed by HIP runtime when the fat binary is loaded/unloaded. This feature is to facilitate implementation of device side sanitizer which requires global vars with non-trival ctors. By default this option is disabled. Differential Revision: https://reviews.llvm.org/D69268
* [Implicit Modules] Add -cc1 option -fmodules-strict-context-hash which ↵Michael J. Spencer2019-10-211-0/+20
| | | | | | | | includes search paths and diagnostics. This is a recommit of r375322 and r375327 with a fix for the Windows test breakage. llvm-svn: 375466
* Revert "[Implicit Modules] Add -cc1 option -fmodules-strict-context-hash ↵Michael J. Spencer2019-10-191-20/+0
| | | | | | | | which includes search paths and diagnostics." and "[Docs] Fix header level." The test doesn't work on Windows. I'll fix it and recommit later. llvm-svn: 375338
* [Implicit Modules] Add -cc1 option -fmodules-strict-context-hash which ↵Michael J. Spencer2019-10-191-0/+20
| | | | | | | | includes search paths and diagnostics. Differential Revision: https://reviews.llvm.org/D68528 llvm-svn: 375322
* Reland: Dead Virtual Function EliminationOliver Stannard2019-10-171-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove dead virtual functions from vtables with replaceNonMetadataUsesWith, so that CGProfile metadata gets cleaned up correctly. Original commit message: Currently, it is hard for the compiler to remove unused C++ virtual functions, because they are all referenced from vtables, which are referenced by constructors. This means that if the constructor is called from any live code, then we keep every virtual function in the final link, even if there are no call sites which can use it. This patch allows unused virtual functions to be removed during LTO (and regular compilation in limited circumstances) by using type metadata to match virtual function call sites to the vtable slots they might load from. This information can then be used in the global dead code elimination pass instead of the references from vtables to virtual functions, to more accurately determine which functions are reachable. To make this transformation safe, I have changed clang's code-generation to always load virtual function pointers using the llvm.type.checked.load intrinsic, instead of regular load instructions. I originally tried writing this using clang's existing code-generation, which uses the llvm.type.test and llvm.assume intrinsics after doing a normal load. However, it is possible for optimisations to obscure the relationship between the GEP, load and llvm.type.test, causing GlobalDCE to fail to find virtual function call sites. The existing linkage and visibility types don't accurately describe the scope in which a virtual call could be made which uses a given vtable. This is wider than the visibility of the type itself, because a virtual function call could be made using a more-visible base class. I've added a new !vcall_visibility metadata type to represent this, described in TypeMetadata.rst. The internalization pass and libLTO have been updated to change this metadata when linking is performed. This doesn't currently work with ThinLTO, because it needs to see every call to llvm.type.checked.load in the linkage unit. It might be possible to extend this optimisation to be able to use the ThinLTO summary, as was done for devirtualization, but until then that combination is rejected in the clang driver. To test this, I've written a fuzzer which generates random C++ programs with complex class inheritance graphs, and virtual functions called through object and function pointers of different types. The programs are spread across multiple translation units and DSOs to test the different visibility restrictions. I've also tried doing bootstrap builds of LLVM to test this. This isn't ideal, because only classes in anonymous namespaces can be optimised with -fvisibility=default, and some parts of LLVM (plugins and bugpoint) do not work correctly with -fvisibility=hidden. However, there are only 12 test failures when building with -fvisibility=hidden (and an unmodified compiler), and this change does not cause any new failures for either value of -fvisibility. On the 7 C++ sub-benchmarks of SPEC2006, this gives a geomean code-size reduction of ~6%, over a baseline compiled with "-O2 -flto -fvisibility=hidden -fwhole-program-vtables". The best cases are reductions of ~14% in 450.soplex and 483.xalancbmk, and there are no code size increases. I've also run this on a set of 8 mbed-os examples compiled for Armv7M, which show a geomean size reduction of ~3%, again with no size increases. I had hoped that this would have no effect on performance, which would allow it to awlays be enabled (when using -fwhole-program-vtables). However, the changes in clang to use the llvm.type.checked.load intrinsic are causing ~1% performance regression in the C++ parts of SPEC2006. It should be possible to recover some of this perf loss by teaching optimisations about the llvm.type.checked.load intrinsic, which would make it worth turning this on by default (though it's still dependent on -fwhole-program-vtables). Differential revision: https://reviews.llvm.org/D63932 llvm-svn: 375094
* Reapply: [Modules][PCH] Hash input files contentBruno Cardoso Lopes2019-10-151-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: When files often get touched during builds, the mtime based validation leads to different problems in implicit modules builds, even when the content doesn't actually change: - Modules only: module invalidation due to out of date files. Usually causing rebuild traffic. - Modules + PCH: build failures because clang cannot rebuild a module if it comes from building a PCH. - PCH: build failures because clang cannot rebuild a PCH in case one of the input headers has different mtime. This patch proposes hashing the content of input files (headers and module maps), which is performed during serialization time. When looking at input files for validation, clang only computes the hash in case there's a mtime mismatch. I've tested a couple of different hash algorithms availble in LLVM in face of building modules+pch for `#import <Cocoa/Cocoa.h>`: - `hash_code`: performace diff within the noise, total module cache increased by 0.07%. - `SHA1`: 5% slowdown. Haven't done real size measurements, but it'd be BLOCK_ID+20 bytes per input file, instead of BLOCK_ID+8 bytes from `hash_code`. - `MD5`: 3% slowdown. Like above, but BLOCK_ID+16 bytes per input file. Given the numbers above, the patch uses `hash_code`. The patch also improves invalidation error msgs to point out which type of problem the user is facing: "mtime", "size" or "content". rdar://problem/29320105 Reviewers: dexonsmith, arphaman, rsmith, aprantl Subscribers: jkorous, cfe-commits, ributzka Tags: #clang Differential Revision: https://reviews.llvm.org/D67249 > llvm-svn: 374841 llvm-svn: 374895
* Revert "Dead Virtual Function Elimination"Jorge Gorbe Moya2019-10-141-2/+0
| | | | | | This reverts commit 9f6a873268e1ad9855873d9d8007086c0d01cf4f. llvm-svn: 374844
* Temporarily Revert [Modules][PCH] Hash input files contentEric Christopher2019-10-141-2/+0
| | | | | | | | as it's breaking a few bots. This reverts r374841 (git commit 2a1386c81de504b5bda44fbecf3f7b4cdfd748fc) llvm-svn: 374842
* [Modules][PCH] Hash input files contentBruno Cardoso Lopes2019-10-141-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: When files often get touched during builds, the mtime based validation leads to different problems in implicit modules builds, even when the content doesn't actually change: - Modules only: module invalidation due to out of date files. Usually causing rebuild traffic. - Modules + PCH: build failures because clang cannot rebuild a module if it comes from building a PCH. - PCH: build failures because clang cannot rebuild a PCH in case one of the input headers has different mtime. This patch proposes hashing the content of input files (headers and module maps), which is performed during serialization time. When looking at input files for validation, clang only computes the hash in case there's a mtime mismatch. I've tested a couple of different hash algorithms availble in LLVM in face of building modules+pch for `#import <Cocoa/Cocoa.h>`: - `hash_code`: performace diff within the noise, total module cache increased by 0.07%. - `SHA1`: 5% slowdown. Haven't done real size measurements, but it'd be BLOCK_ID+20 bytes per input file, instead of BLOCK_ID+8 bytes from `hash_code`. - `MD5`: 3% slowdown. Like above, but BLOCK_ID+16 bytes per input file. Given the numbers above, the patch uses `hash_code`. The patch also improves invalidation error msgs to point out which type of problem the user is facing: "mtime", "size" or "content". rdar://problem/29320105 Reviewers: dexonsmith, arphaman, rsmith, aprantl Subscribers: jkorous, cfe-commits, ributzka Tags: #clang Differential Revision: https://reviews.llvm.org/D67249 llvm-svn: 374841
* [clang-scan-deps] Support for clang --analyze in clang-scan-depsJan Korous2019-10-141-0/+2
| | | | | | | | | | | | | | | The goal is to have 100% fidelity in clang-scan-deps behavior when --analyze is present in compilation command. At the same time I don't want to break clang-tidy which expects __static_analyzer__ macro defined as built-in. I introduce new cc1 options (-setup-static-analyzer) that controls the macro definition and is conditionally set in driver. Differential Revision: https://reviews.llvm.org/D68093 llvm-svn: 374815
* [clang][IFS] Fixing spelling errors in interface-stubs OPT flag (NFC).Puyan Lotfi2019-10-121-3/+3
| | | | | | This is just a long standing spelling error that was found recently. llvm-svn: 374638
* Dead Virtual Function EliminationOliver Stannard2019-10-111-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, it is hard for the compiler to remove unused C++ virtual functions, because they are all referenced from vtables, which are referenced by constructors. This means that if the constructor is called from any live code, then we keep every virtual function in the final link, even if there are no call sites which can use it. This patch allows unused virtual functions to be removed during LTO (and regular compilation in limited circumstances) by using type metadata to match virtual function call sites to the vtable slots they might load from. This information can then be used in the global dead code elimination pass instead of the references from vtables to virtual functions, to more accurately determine which functions are reachable. To make this transformation safe, I have changed clang's code-generation to always load virtual function pointers using the llvm.type.checked.load intrinsic, instead of regular load instructions. I originally tried writing this using clang's existing code-generation, which uses the llvm.type.test and llvm.assume intrinsics after doing a normal load. However, it is possible for optimisations to obscure the relationship between the GEP, load and llvm.type.test, causing GlobalDCE to fail to find virtual function call sites. The existing linkage and visibility types don't accurately describe the scope in which a virtual call could be made which uses a given vtable. This is wider than the visibility of the type itself, because a virtual function call could be made using a more-visible base class. I've added a new !vcall_visibility metadata type to represent this, described in TypeMetadata.rst. The internalization pass and libLTO have been updated to change this metadata when linking is performed. This doesn't currently work with ThinLTO, because it needs to see every call to llvm.type.checked.load in the linkage unit. It might be possible to extend this optimisation to be able to use the ThinLTO summary, as was done for devirtualization, but until then that combination is rejected in the clang driver. To test this, I've written a fuzzer which generates random C++ programs with complex class inheritance graphs, and virtual functions called through object and function pointers of different types. The programs are spread across multiple translation units and DSOs to test the different visibility restrictions. I've also tried doing bootstrap builds of LLVM to test this. This isn't ideal, because only classes in anonymous namespaces can be optimised with -fvisibility=default, and some parts of LLVM (plugins and bugpoint) do not work correctly with -fvisibility=hidden. However, there are only 12 test failures when building with -fvisibility=hidden (and an unmodified compiler), and this change does not cause any new failures for either value of -fvisibility. On the 7 C++ sub-benchmarks of SPEC2006, this gives a geomean code-size reduction of ~6%, over a baseline compiled with "-O2 -flto -fvisibility=hidden -fwhole-program-vtables". The best cases are reductions of ~14% in 450.soplex and 483.xalancbmk, and there are no code size increases. I've also run this on a set of 8 mbed-os examples compiled for Armv7M, which show a geomean size reduction of ~3%, again with no size increases. I had hoped that this would have no effect on performance, which would allow it to awlays be enabled (when using -fwhole-program-vtables). However, the changes in clang to use the llvm.type.checked.load intrinsic are causing ~1% performance regression in the C++ parts of SPEC2006. It should be possible to recover some of this perf loss by teaching optimisations about the llvm.type.checked.load intrinsic, which would make it worth turning this on by default (though it's still dependent on -fwhole-program-vtables). Differential revision: https://reviews.llvm.org/D63932 llvm-svn: 374539
OpenPOWER on IntegriCloud