bcm5719-llvm/llvm/test/Transforms/GlobalDCE, branch meklort-10.0.1

bcm5719-llvm/llvm/test/Transforms/GlobalDCE, branch meklort-10.0.1 Project Ortega BCM5719 LLVM https://git.raptorcs.com/git/bcm5719-llvm/atom?h=meklort-10.0.1 2019-10-17T09:58:57+00:00 Reland: Dead Virtual Function Elimination 2019-10-17T09:58:57+00:00 Oliver Stannard oliver.stannard@linaro.org 2019-10-17T09:58:57+00:00 urn:sha1:3b598b9c867a39065e6cb804423c28a6b020e6ee Remove dead virtual functions from vtables with replaceNonMetadataUsesWith, so that CGProfile metadata gets cleaned up correctly. Original commit message: Currently, it is hard for the compiler to remove unused C++ virtual functions, because they are all referenced from vtables, which are referenced by constructors. This means that if the constructor is called from any live code, then we keep every virtual function in the final link, even if there are no call sites which can use it. This patch allows unused virtual functions to be removed during LTO (and regular compilation in limited circumstances) by using type metadata to match virtual function call sites to the vtable slots they might load from. This information can then be used in the global dead code elimination pass instead of the references from vtables to virtual functions, to more accurately determine which functions are reachable. To make this transformation safe, I have changed clang's code-generation to always load virtual function pointers using the llvm.type.checked.load intrinsic, instead of regular load instructions. I originally tried writing this using clang's existing code-generation, which uses the llvm.type.test and llvm.assume intrinsics after doing a normal load. However, it is possible for optimisations to obscure the relationship between the GEP, load and llvm.type.test, causing GlobalDCE to fail to find virtual function call sites. The existing linkage and visibility types don't accurately describe the scope in which a virtual call could be made which uses a given vtable. This is wider than the visibility of the type itself, because a virtual function call could be made using a more-visible base class. I've added a new !vcall_visibility metadata type to represent this, described in TypeMetadata.rst. The internalization pass and libLTO have been updated to change this metadata when linking is performed. This doesn't currently work with ThinLTO, because it needs to see every call to llvm.type.checked.load in the linkage unit. It might be possible to extend this optimisation to be able to use the ThinLTO summary, as was done for devirtualization, but until then that combination is rejected in the clang driver. To test this, I've written a fuzzer which generates random C++ programs with complex class inheritance graphs, and virtual functions called through object and function pointers of different types. The programs are spread across multiple translation units and DSOs to test the different visibility restrictions. I've also tried doing bootstrap builds of LLVM to test this. This isn't ideal, because only classes in anonymous namespaces can be optimised with -fvisibility=default, and some parts of LLVM (plugins and bugpoint) do not work correctly with -fvisibility=hidden. However, there are only 12 test failures when building with -fvisibility=hidden (and an unmodified compiler), and this change does not cause any new failures for either value of -fvisibility. On the 7 C++ sub-benchmarks of SPEC2006, this gives a geomean code-size reduction of ~6%, over a baseline compiled with "-O2 -flto -fvisibility=hidden -fwhole-program-vtables". The best cases are reductions of ~14% in 450.soplex and 483.xalancbmk, and there are no code size increases. I've also run this on a set of 8 mbed-os examples compiled for Armv7M, which show a geomean size reduction of ~3%, again with no size increases. I had hoped that this would have no effect on performance, which would allow it to awlays be enabled (when using -fwhole-program-vtables). However, the changes in clang to use the llvm.type.checked.load intrinsic are causing ~1% performance regression in the C++ parts of SPEC2006. It should be possible to recover some of this perf loss by teaching optimisations about the llvm.type.checked.load intrinsic, which would make it worth turning this on by default (though it's still dependent on -fwhole-program-vtables). Differential revision: https://reviews.llvm.org/D63932 llvm-svn: 375094 Revert "Dead Virtual Function Elimination" 2019-10-14T23:25:25+00:00 Jorge Gorbe Moya jgorbe@google.com 2019-10-14T23:25:25+00:00 urn:sha1:b052331bd614ff2d06bbb3e5af15e899e3f7e52f This reverts commit 9f6a873268e1ad9855873d9d8007086c0d01cf4f. llvm-svn: 374844 Dead Virtual Function Elimination 2019-10-11T11:59:55+00:00 Oliver Stannard oliver.stannard@linaro.org 2019-10-11T11:59:55+00:00 urn:sha1:9f6a873268e1ad9855873d9d8007086c0d01cf4f Currently, it is hard for the compiler to remove unused C++ virtual functions, because they are all referenced from vtables, which are referenced by constructors. This means that if the constructor is called from any live code, then we keep every virtual function in the final link, even if there are no call sites which can use it. This patch allows unused virtual functions to be removed during LTO (and regular compilation in limited circumstances) by using type metadata to match virtual function call sites to the vtable slots they might load from. This information can then be used in the global dead code elimination pass instead of the references from vtables to virtual functions, to more accurately determine which functions are reachable. To make this transformation safe, I have changed clang's code-generation to always load virtual function pointers using the llvm.type.checked.load intrinsic, instead of regular load instructions. I originally tried writing this using clang's existing code-generation, which uses the llvm.type.test and llvm.assume intrinsics after doing a normal load. However, it is possible for optimisations to obscure the relationship between the GEP, load and llvm.type.test, causing GlobalDCE to fail to find virtual function call sites. The existing linkage and visibility types don't accurately describe the scope in which a virtual call could be made which uses a given vtable. This is wider than the visibility of the type itself, because a virtual function call could be made using a more-visible base class. I've added a new !vcall_visibility metadata type to represent this, described in TypeMetadata.rst. The internalization pass and libLTO have been updated to change this metadata when linking is performed. This doesn't currently work with ThinLTO, because it needs to see every call to llvm.type.checked.load in the linkage unit. It might be possible to extend this optimisation to be able to use the ThinLTO summary, as was done for devirtualization, but until then that combination is rejected in the clang driver. To test this, I've written a fuzzer which generates random C++ programs with complex class inheritance graphs, and virtual functions called through object and function pointers of different types. The programs are spread across multiple translation units and DSOs to test the different visibility restrictions. I've also tried doing bootstrap builds of LLVM to test this. This isn't ideal, because only classes in anonymous namespaces can be optimised with -fvisibility=default, and some parts of LLVM (plugins and bugpoint) do not work correctly with -fvisibility=hidden. However, there are only 12 test failures when building with -fvisibility=hidden (and an unmodified compiler), and this change does not cause any new failures for either value of -fvisibility. On the 7 C++ sub-benchmarks of SPEC2006, this gives a geomean code-size reduction of ~6%, over a baseline compiled with "-O2 -flto -fvisibility=hidden -fwhole-program-vtables". The best cases are reductions of ~14% in 450.soplex and 483.xalancbmk, and there are no code size increases. I've also run this on a set of 8 mbed-os examples compiled for Armv7M, which show a geomean size reduction of ~3%, again with no size increases. I had hoped that this would have no effect on performance, which would allow it to awlays be enabled (when using -fwhole-program-vtables). However, the changes in clang to use the llvm.type.checked.load intrinsic are causing ~1% performance regression in the C++ parts of SPEC2006. It should be possible to recover some of this perf loss by teaching optimisations about the llvm.type.checked.load intrinsic, which would make it worth turning this on by default (though it's still dependent on -fwhole-program-vtables). Differential revision: https://reviews.llvm.org/D63932 llvm-svn: 374539 [IR] Disallow llvm.global_ctors and llvm.global_dtors of the 2-field form in textual format 2019-05-15T02:35:32+00:00 Fangrui Song maskray@google.com 2019-05-15T02:35:32+00:00 urn:sha1:f4dfd63c74899e2953b176de2174ae7a8924a72c The 3-field form was introduced by D3499 in 2014 and the legacy 2-field form was planned to be removed in LLVM 4.0 For the textual format, this patch migrates the existing 2-field form to use the 3-field form and deletes the compatibility code. test/Verifier/global-ctors-2.ll checks we have a friendly error message. For bitcode, lib/IR/AutoUpgrade UpgradeGlobalVariables will upgrade the 2-field form (add i8* null as the third field). Reviewed By: rnk, dexonsmith Differential Revision: https://reviews.llvm.org/D61547 llvm-svn: 360742 Revert "Temporarily Revert "Add basic loop fusion pass."" 2019-04-17T04:52:47+00:00 Eric Christopher echristo@gmail.com 2019-04-17T04:52:47+00:00 urn:sha1:cee313d288a4faf0355d76fb6e0e927e211d08a5 The reversion apparently deleted the test/Transforms directory. Will be re-reverting again. llvm-svn: 358552 Temporarily Revert "Add basic loop fusion pass." 2019-04-17T02:12:23+00:00 Eric Christopher echristo@gmail.com 2019-04-17T02:12:23+00:00 urn:sha1:a86343512845c9c1fdbac865fea88aa5fce7142a As it's causing some bot failures (and per request from kbarton). This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda. llvm-svn: 358546 GlobalDCE: Teach isEmptyFunction() to ignore debug intrinsics. 2018-11-16T17:47:21+00:00 Adrian Prantl aprantl@apple.com 2018-11-16T17:47:21+00:00 urn:sha1:83d87520ed30e8bac4a7303c8ee4e3f580e3cf22 This fixes PR39669. https://bugs.llvm.org/show_bug.cgi?id=39669 llvm-svn: 347065 [PM] Teach the PGO instrumentation pasess to run GlobalDCE before 2017-05-25T07:15:09+00:00 Chandler Carruth chandlerc@gmail.com 2017-05-25T07:15:09+00:00 urn:sha1:f4d62c480c314f6514c3b54d69bb7eb0c0b69d5a instrumenting code. This is important in the new pass manager. The old pass manager's inliner has a small DCE routine embedded within it. The new pass manager relies on the actual GlobalDCE pass for this. Without this patch, instrumentation profiling with the new PM results in massive code bloat in the object files because the instrumentation itself ends up preventing DCE from working to remove the code. We should probably change the instrumentation (and/or DCE) so that we can eliminate dead code even if instrumented, but we shouldn't even spend the time generating instrumentation for that code so this still seems like a good patch. Differential Revision: https://reviews.llvm.org/D33535 llvm-svn: 303845 [PH] Replace uses of AssertingVH from members of analysis results with 2017-01-24T12:55:57+00:00 Chandler Carruth chandlerc@gmail.com 2017-01-24T12:55:57+00:00 urn:sha1:6acdca78a08cc8b699e3618c7e1c5d2d8e797c20 a lazy-asserting PoisoningVH. AssertVH is fundamentally incompatible with cache-invalidation of analysis results. The invaliadtion happens after the AssertingVH has already fired. Instead, use a PoisoningVH that will assert if the dangling handle is ever used rather than merely be assigned or destroyed. This patch also removes all of the (numerous) doomed attempts to work around this fundamental incompatibility. It is a pretty significant simplification IMO. The most interesting change is in the Inliner where we still do some clearing because we don't want to rely on the coarse grained invalidation strategy of the containing pass manager. However, I prefer the approach that contains this logic to the cleanup phase of the Inliner, and I think we could enhance the CGSCC analysis management layer to make this even better in the future if desired. The rest is straight cleanup. I've also added a test for one of the harder cases to work around: when a *module analysis* contains many AssertingVHes pointing at functions. Differential Revision: https://reviews.llvm.org/D29006 llvm-svn: 292928 [PM] Replace the hard invalidate in JumpThreading for LVI with correct 2017-01-23T08:33:24+00:00 Chandler Carruth chandlerc@gmail.com 2017-01-23T08:33:24+00:00 urn:sha1:e8c66b27662bf71e3ca6fb787380957b428f1ff8 invalidation of deleted functions in GlobalDCE. This was always testing a bug really triggered in GlobalDCE. Right now we have analyses with asserting value handles into IR. As long as those remain, when *deleting* an IR unit, we cannot wait for the normal invalidation scheme to kick in even though it was designed to work correctly in the face of these kinds of deletions. Instead, the pass needs to directly handle invalidating the analysis results pointing at that IR unit. I've tought the Inliner about this and this patch teaches GlobalDCE. This will handle the asserting VH case in the existing test as well as other issues of the same fundamental variety. I've moved the test into the GlobalDCE directory and added a comment explaining what is going on. Note that we cannot simply require LVI here because LVI is too lazy. llvm-svn: 292773