summaryrefslogtreecommitdiffstats
path: root/llvm/include
Commit message (Collapse)AuthorAgeFilesLines
...
* [Attributor][NFC] Expose call site traversal without QueryingAAJohannes Doerfert2019-10-131-0/+9
| | | | llvm-svn: 374700
* [Attributor][FIX] Avoid modifying naked/optnone functionsJohannes Doerfert2019-10-131-4/+12
| | | | | | | | The check for naked/optnone was insufficient for different reasons. We now check before we initialize an abstract attribute and we do it for all abstract attributes. llvm-svn: 374694
* SymbolRecord - consistently use explicit for single operand constructorsSimon Pilgrim2019-10-121-15/+15
| | | | llvm-svn: 374673
* SymbolRecord - fix uninitialized variable warnings. NFCI.Simon Pilgrim2019-10-121-111/+111
| | | | llvm-svn: 374672
* recommit: [LoopVectorize][PowerPC] Estimate int and float register pressure ↵Zi Xuan Wu2019-10-123-10/+42
| | | | | | | | | | | | | | | | | | | | | | | separately in loop-vectorize In loop-vectorize, interleave count and vector factor depend on target register number. Currently, it does not estimate different register pressure for different register class separately(especially for scalar type, float type should not be on the same position with int type), so it's not accurate. Specifically, it causes too many times interleaving/unrolling, result in too many register spills in loop body and hurting performance. So we need classify the register classes in IR level, and importantly these are abstract register classes, and are not the target register class of backend provided in td file. It's used to establish the mapping between the types of IR values and the number of simultaneous live ranges to which we'd like to limit for some set of those types. For example, POWER target, register num is special when VSX is enabled. When VSX is enabled, the number of int scalar register is 32(GPR), float is 64(VSR), but for int and float vector register both are 64(VSR). So there should be 2 kinds of register class when vsx is enabled, and 3 kinds of register class when VSX is NOT enabled. It runs on POWER target, it makes big(+~30%) performance improvement in one specific bmk(503.bwaves_r) of spec2017 and no other obvious degressions. Differential revision: https://reviews.llvm.org/D67148 llvm-svn: 374634
* llvm-dwarfdump: Add verbose printing for debug_loclistsDavid Blaikie2019-10-111-3/+9
| | | | llvm-svn: 374582
* [AArch64][SVE] Implement sdot and udot (lane) intrinsicsKerry McLaughlin2019-10-111-0/+21
| | | | | | | | | | | | | | | | | | | | | Summary: Implements the following arithmetic intrinsics: - int_aarch64_sve_sdot - int_aarch64_sve_sdot_lane - int_aarch64_sve_udot - int_aarch64_sve_udot_lane This patch includes tests for the Subdivide4Argument type added by D67549 Reviewers: sdesmalen, SjoerdMeijer, greened, rengolin, rovka Reviewed By: sdesmalen Subscribers: tschuett, kristof.beyls, rkruppe, psnobl, cfe-commits, llvm-commits Differential Revision: https://reviews.llvm.org/D67551 llvm-svn: 374566
* Dead Virtual Function EliminationOliver Stannard2019-10-114-0/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, it is hard for the compiler to remove unused C++ virtual functions, because they are all referenced from vtables, which are referenced by constructors. This means that if the constructor is called from any live code, then we keep every virtual function in the final link, even if there are no call sites which can use it. This patch allows unused virtual functions to be removed during LTO (and regular compilation in limited circumstances) by using type metadata to match virtual function call sites to the vtable slots they might load from. This information can then be used in the global dead code elimination pass instead of the references from vtables to virtual functions, to more accurately determine which functions are reachable. To make this transformation safe, I have changed clang's code-generation to always load virtual function pointers using the llvm.type.checked.load intrinsic, instead of regular load instructions. I originally tried writing this using clang's existing code-generation, which uses the llvm.type.test and llvm.assume intrinsics after doing a normal load. However, it is possible for optimisations to obscure the relationship between the GEP, load and llvm.type.test, causing GlobalDCE to fail to find virtual function call sites. The existing linkage and visibility types don't accurately describe the scope in which a virtual call could be made which uses a given vtable. This is wider than the visibility of the type itself, because a virtual function call could be made using a more-visible base class. I've added a new !vcall_visibility metadata type to represent this, described in TypeMetadata.rst. The internalization pass and libLTO have been updated to change this metadata when linking is performed. This doesn't currently work with ThinLTO, because it needs to see every call to llvm.type.checked.load in the linkage unit. It might be possible to extend this optimisation to be able to use the ThinLTO summary, as was done for devirtualization, but until then that combination is rejected in the clang driver. To test this, I've written a fuzzer which generates random C++ programs with complex class inheritance graphs, and virtual functions called through object and function pointers of different types. The programs are spread across multiple translation units and DSOs to test the different visibility restrictions. I've also tried doing bootstrap builds of LLVM to test this. This isn't ideal, because only classes in anonymous namespaces can be optimised with -fvisibility=default, and some parts of LLVM (plugins and bugpoint) do not work correctly with -fvisibility=hidden. However, there are only 12 test failures when building with -fvisibility=hidden (and an unmodified compiler), and this change does not cause any new failures for either value of -fvisibility. On the 7 C++ sub-benchmarks of SPEC2006, this gives a geomean code-size reduction of ~6%, over a baseline compiled with "-O2 -flto -fvisibility=hidden -fwhole-program-vtables". The best cases are reductions of ~14% in 450.soplex and 483.xalancbmk, and there are no code size increases. I've also run this on a set of 8 mbed-os examples compiled for Armv7M, which show a geomean size reduction of ~3%, again with no size increases. I had hoped that this would have no effect on performance, which would allow it to awlays be enabled (when using -fwhole-program-vtables). However, the changes in clang to use the llvm.type.checked.load intrinsic are causing ~1% performance regression in the C++ parts of SPEC2006. It should be possible to recover some of this perf loss by teaching optimisations about the llvm.type.checked.load intrinsic, which would make it worth turning this on by default (though it's still dependent on -fwhole-program-vtables). Differential revision: https://reviews.llvm.org/D63932 llvm-svn: 374539
* [FileCheck] Implement --ignore-case option.Kai Nacke2019-10-111-0/+1
| | | | | | | | | | | | The FileCheck utility is enhanced to support a `--ignore-case` option. This is useful in cases where the output of Unix tools differs in case (e.g. case not specified by Posix). Reviewers: Bigcheese, jakehehrlich, rupprecht, espindola, alexshap, jhenderson, MaskRay Differential Revision: https://reviews.llvm.org/D68146 llvm-svn: 374538
* [Windows] Use information from the PE32 exceptions directory to construct ↵Aleksandr Urakov2019-10-111-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | unwind plans This patch adds an implementation of unwinding using PE EH info. It allows to get almost ideal call stacks on 64-bit Windows systems (except some epilogue cases, but I believe that they can be fixed with unwind plan disassembly augmentation in the future). To achieve the goal the CallFrameInfo abstraction was made. It is based on the DWARFCallFrameInfo class interface with a few changes to make it less DWARF-specific. To implement the new interface for PECOFF object files the class PECallFrameInfo was written. It uses the next helper classes: - UnwindCodesIterator helps to iterate through UnwindCode structures (and processes chained infos transparently); - EHProgramBuilder with the use of UnwindCodesIterator constructs EHProgram; - EHProgram is, by fact, a vector of EHInstructions. It creates an abstraction over the low-level unwind codes and simplifies work with them. It contains only the information that is relevant to unwinding in the unified form. Also the required unwind codes are read from the object file only once with it; - EHProgramRange allows to take a range of EHProgram and to build an unwind row for it. So, PECallFrameInfo builds the EHProgram with EHProgramBuilder, takes the ranges corresponding to every offset in prologue and builds the rows of the resulted unwind plan. The resulted plan covers the whole range of the function except the epilogue. Reviewers: jasonmolenda, asmith, amccarth, clayborg, JDevlieghere, stella.stamenova, labath, espindola Reviewed By: jasonmolenda Subscribers: leonid.mashinskiy, emaste, mgorny, aprantl, arichardson, MaskRay, lldb-commits, llvm-commits Tags: #lldb Differential Revision: https://reviews.llvm.org/D67347 llvm-svn: 374528
* Insert module constructors in a module passVitaly Buka2019-10-112-0/+3
| | | | | | | | | | | | | | | | | | | | | Summary: If we insert them from function pass some analysis may be missing or invalid. Fixes PR42877. Reviewers: eugenis, leonardchan Reviewed By: leonardchan Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D68832 > llvm-svn: 374481 Signed-off-by: Vitaly Buka <vitalybuka@google.com> llvm-svn: 374527
* Fix modules build for r374337Pavel Labath2019-10-111-0/+2
| | | | | | | | | | A modules build failed with the following error: call to function 'operator&' that is neither visible in the template definition nor found by argument-dependent lookup Fix that by declaring the appropriate operators in the llvm::minidump namespace. llvm-svn: 374517
* [InstCombine] recognize popcount.Chen Zheng2019-10-111-5/+11
| | | | | | | | | This patch recognizes popcount intrinsic according to algorithm from website http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel Differential Revision: https://reviews.llvm.org/D68189 llvm-svn: 374512
* Revert 374481 "[tsan,msan] Insert module constructors in a module pass"Nico Weber2019-10-112-3/+0
| | | | | | | CodeGen/sanitizer-module-constructor.c fails on mac and windows, see e.g. http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/11424 llvm-svn: 374503
* [Stats] Add ALWAYS_ENABLED_STATISTIC enabled regardless of LLVM_ENABLE_STATS.Volodymyr Sapsai2019-10-111-49/+53
| | | | | | | | | | | | | | | | | | | | | The intended usage is to measure relatively expensive operations. So the cost of the statistic is negligible compared to the cost of a measured operation and can be enabled all the time without impairing the compilation time. rdar://problem/55715134 Reviewers: dsanders, bogner, rtereshin Reviewed By: dsanders Subscribers: hiraditya, jkorous, dexonsmith, ributzka, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68252 llvm-svn: 374490
* [tsan,msan] Insert module constructors in a module passVitaly Buka2019-10-102-0/+3
| | | | | | | | | | | | | | | | | | Summary: If we insert them from function pass some analysis may be missing or invalid. Fixes PR42877. Reviewers: eugenis, leonardchan Reviewed By: leonardchan Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D68832 llvm-svn: 374481
* [msan, NFC] Move option parsing into constructorVitaly Buka2019-10-101-6/+5
| | | | llvm-svn: 374480
* [JITLink] Add an initial implementation of JITLink for MachO/AArch64.Lang Hames2019-10-101-0/+60
| | | | | | | | | This implementation has support for all relocation types except TLV. Compact unwind sections are not yet supported, so exceptions/unwinding will not work. llvm-svn: 374476
* [GISel] Allow getConstantVRegVal() to return G_FCONSTANT values.Marcello Maggioni2019-10-101-5/+7
| | | | | | | | | | | | | | | | | | | | | In GISel we have both G_CONSTANT and G_FCONSTANT, but because in GISel we don't really have a concept of Float vs Int value the only difference between the two is where the data originates from. What both G_CONSTANT and G_FCONSTANT return is just a bag of bits with the constant representation in it. By making getConstantVRegVal() return G_FCONSTANTs bit representation as well we allow ConstantFold and other things to operate with G_FCONSTANT. Adding tests that show ConstantFolding to work on mixed G_CONSTANT and G_FCONSTANT sources. Differential Revision: https://reviews.llvm.org/D68739 llvm-svn: 374458
* [System Model] [TTI] Move default cache/prefetch implementationsDavid Greene2019-10-102-2/+35
| | | | | | | | | | Move the default implementations of cache and prefetch queries to TargetTransformInfoImplBase and delete them from NoTIIImpl. This brings these interfaces in line with how other TTI interfaces work. Differential Revision: https://reviews.llvm.org/D68804 llvm-svn: 374446
* Fix a documentation warning from GSYM commit.Greg Clayton2019-10-101-1/+1
| | | | llvm-svn: 374445
* [PDB] Fix bug when using multiple PCH header objects with the same name.Zachary Turner2019-10-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | A common pattern in Windows is to have all your precompiled headers use an object named stdafx.obj. If you've got a project with many different static libs, you might use a separate PCH for each one of these. During the final link step, a file from A might reference the PCH object from A, but it will have the same name (stdafx.obj) as any other PCH from another project. The only difference will be the path. For example, A might be A/stdafx.obj while B is B/stdafx.obj. The existing algorithm checks only the filename that was passed on the command line (or stored in archive), but this is insufficient in the case where relative paths are used, because depending on the command line object file / library order, it might find the wrong PCH object first resulting in a signature mismatch. The fix here is to simply check whether the absolute path of the PCH object (which is stored in the input obj file for the file that references the PCH) *ends with* the full relative path of whatever is specified on the command line (or is in the archive). Differential Revision: https://reviews.llvm.org/D66431 llvm-svn: 374442
* ADT: Save a word in every StringSet entryJordan Rose2019-10-104-19/+42
| | | | | | | | | | | | | | | Add a specialization to StringMap (actually StringMapEntry) for a value type of NoneType (the type of llvm::None), and use it for StringSet. This'll save us a word from every entry in a StringSet, used for alignment with the size_t that stores the string length. I could have gone all the way to some kind of empty base class optimization, but that seemed like overkill. Someone can consider adding that in the future, though. https://reviews.llvm.org/D68586 llvm-svn: 374440
* win: Move Parallel.h off concrt to cross-platform codeNico Weber2019-10-101-27/+0
| | | | | | | | | | | | | | | | | | | | | r179397 added Parallel.h and implemented it terms of concrt in 2013. In 2015, a cross-platform implementation of the functions has appeared and is in use everywhere but on Windows (r232419). r246219 hints that <thread> had issues in MSVC2013, but r296906 suggests they've been fixed now that we require 2015+. So remove the concrt code. It's less code, and it sounds like concrt has conceptual and performance issues, see PR41198. I built blink_core.dll in a debug component build with full symbols and in a release component build without any symbols. I couldn't measure a performance difference for linking blink_core.dll before and after this patch. Differential Revision: https://reviews.llvm.org/D68820 llvm-svn: 374421
* Add GsymCreator and GsymReader.Greg Clayton2019-10-104-9/+475
| | | | | | | | | | This patch adds the ability to create GSYM files with GsymCreator, and read them with GsymReader. Full testing has been added for both new classes. This patch differs from the original patch https://reviews.llvm.org/D53379 in that is uses a StringTableBuilder class from llvm instead of a custom version. Support for big and little endian files has been added. If the endianness matches the current host, we use efficient extraction for the header, address table and address info offset tables. Differential Revision: https://reviews.llvm.org/D68744 llvm-svn: 374381
* [Alignment][NFC] Use llv::Align in GISelKnownBitsGuillaume Chatelet2019-10-101-4/+5
| | | | | | | | | | | | | | | | | Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68786 llvm-svn: 374369
* Revert "[FileCheck] Implement --ignore-case option."Dmitri Gribenko2019-10-101-181/+180
| | | | | | | This reverts commit r374339. It broke tests: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/19066 llvm-svn: 374359
* Revert "[IRBuilder] Update IRBuilder::CreateFNeg(...) to return a UnaryOperator"Dmitri Gribenko2019-10-101-2/+4
| | | | | | | This reverts commit r374240. It broke OCaml tests: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/19014 llvm-svn: 374354
* [FileCheck] Implement --ignore-case option.Kai Nacke2019-10-101-180/+181
| | | | | | | | | | | | The FileCheck utility is enhanced to support a `--ignore-case` option. This is useful in cases where the output of Unix tools differs in case (e.g. case not specified by Posix). Reviewers: Bigcheese, jakehehrlich, rupprecht, espindola, alexshap, jhenderson, MaskRay Differential Revision: https://reviews.llvm.org/D68146 llvm-svn: 374339
* MinidumpYAML: Add support for the memory info list streamPavel Labath2019-10-102-5/+40
| | | | | | | | | | | | | | | | | | | | | Summary: The implementation is fairly straight-forward and uses the same patterns as the existing streams. The yaml form does not attempt to preserve the data in the "gaps" that can be created by setting a larger-than-required header or entry size in the stream header, because the existing consumer (lldb) does not make use of the information in the gap in any way, and attempting to preserve that would make the implementation more complicated. Reviewers: amccarth, jhenderson, clayborg Subscribers: llvm-commits, lldb-commits, markmentovai, zturner, JosephTremoulet Tags: #llvm Differential Revision: https://reviews.llvm.org/D68645 llvm-svn: 374337
* [Alignment][NFC] Make VectorUtils uas llvm::AlignGuillaume Chatelet2019-10-101-17/+14
| | | | | | | | | | | | | | | | | Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, rogfer01, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68784 llvm-svn: 374330
* [IfCvt][ARM] Optimise diamond if-conversion for code sizeOliver Stannard2019-10-101-0/+13
| | | | | | | | | | | | | | | | | Currently, the heuristics the if-conversion pass uses for diamond if-conversion are based on execution time, with no consideration for code size. This adds a new set of heuristics to be used when optimising for code size. This is mostly target-independent, because the if-conversion pass can see the code size of the instructions which it is removing. For thumb, there are a few passes (insertion of IT instructions, selection of narrow branches, and selection of CBZ instructions) which are run after if conversion and affect these heuristics, so I've added target hooks to better predict the code-size effect of a proposed if-conversion. Differential revision: https://reviews.llvm.org/D67350 llvm-svn: 374301
* [Attributor][NFC] clang formatJohannes Doerfert2019-10-101-4/+4
| | | | llvm-svn: 374281
* Reland "[TextAPI] Introduce TBDv4"Cyndy Ishida2019-10-103-2/+14
| | | | | | | | | Original Patch broke for compilations w/ gcc and exposed asan fail. This reland repairs those bugs. Differential Revision: https://reviews.llvm.org/D67529 llvm-svn: 374277
* GlobalISel: Implement fewerElementsVector for G_BUILD_VECTORMatt Arsenault2019-10-091-0/+4
| | | | | | Turn it into a G_CONCAT_VECTORS of G_BUILD_VECTOR. llvm-svn: 374252
* [IRBuilder] Update IRBuilder::CreateFNeg(...) to return a UnaryOperatorCameron McInally2019-10-091-4/+2
| | | | | | | | Also update Clang to call Builder.CreateFNeg(...) for UnaryMinus. Differential Revision: https://reviews.llvm.org/D61675 llvm-svn: 374240
* [SampleFDO] Add indexing for function profiles so they can be loaded on demandWei Mi2019-10-093-15/+53
| | | | | | | | | | | | | | | | | | | | | | | | in ExtBinary format Currently for Text, Binary and ExtBinary format profiles, when we compile a module with samplefdo, even if there is no function showing up in the profile, we have to load all the function profiles from the profile input. That is a waste of compile time. CompactBinary format profile has already had the support of loading function profiles on demand. In this patch, we add the support to load profile on demand for ExtBinary format. It will work no matter the sections in ExtBinary format profile are compressed or not. Experiment shows it reduces the time to compile a server benchmark by 30%. When profile remapping and loading function profiles on demand are both used, extra work needs to be done so that the loading on demand process will take the name remapping into consideration. It will be addressed in a follow-up patch. Differential Revision: https://reviews.llvm.org/D68601 llvm-svn: 374233
* llvm-dwarfdump: Support multiple debug_loclists contributionsDavid Blaikie2019-10-091-1/+1
| | | | | | | Also fixing the incorrect "offset" field being computed/printed for each location list. llvm-svn: 374232
* [System Model] [TTI] Fix virtual destructor warningVitaly Buka2019-10-091-0/+1
| | | | llvm-svn: 374221
* [Support] Add mathematical constantsEvandro Menezes2019-10-091-0/+37
| | | | | | | | Add own version of the mathematical constants from the upcoming C++20 `std::numbers`. Differential revision: https://reviews.llvm.org/D68257 llvm-svn: 374207
* [System Model] [TTI] Update cache and prefetch TTI interfacesDavid Greene2019-10-094-51/+122
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Re-apply 9fdfb045ae8b/r365676 with fixes for PPC and Hexagon. This involved moving defaults from TargetTransformInfoImplBase to MCSubtargetInfo. Rework the TTI cache and software prefetching APIs to prepare for the introduction of a general system model. Changes include: - Marking existing interfaces const and/or override as appropriate - Adding comments - Adding BasicTTIImpl interfaces that delegate to a subtarget implementation - Moving the default TargetTransformInfoImplBase implementation to a default MCSubtarget implementation Only a handful of targets use these interfaces currently: AArch64, Hexagon, PPC and SystemZ. AArch64 already has a custom subtarget implementation, so its custom TTI implementation is migrated to use the new facilities in BasicTTIImpl to invoke its custom subtarget implementation. The custom TTI implementations continue to exist for the other targets with this change. They are not moved over to subtarget-based implementations. The end goal is to have the default subtarget implementation defer to the system model defined by the target. With this change, the default MCSubtargetInfo implementation essentially returns the defaults TargetTransformInfoImplBase used to return. Existing users of TTI defaults will hit the defaults now in MCSubtargetInfo. Targets that define their own custom TTI implementations won't use the BasicTTIImpl implementations that route to the subtarget. Once system models are in place for the targets that use these interfaces, their custom TTI implementations can be removed. Differential Revision: https://reviews.llvm.org/D63614 llvm-svn: 374205
* [WebAssembly] Add builtin and intrinsic for v8x16.swizzleThomas Lively2019-10-091-0/+4
| | | | | | | | | | | | | | | | | | | | | | Summary: This clang builtin and corresponding LLVM intrinsic are necessary to expose the exact semantics of the underlying WebAssembly instruction to users. LLVM produces a poison value if the dynamic swizzle indices are greater than the vector size, but the WebAssembly instruction sets the corresponding output lane to zero. Users who depend on this behavior can safely use this builtin. Depends on D68527. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D68531 llvm-svn: 374189
* [AIX][XCOFF][NFC] Change the SectionLen field name of CSect Auxiliary entry ↵Jason Liu2019-10-091-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | to SectionOrLength. Summary: According the the XCOFF document, If Then XTY_SD x_scnlen contains the csect length. XTY_LD x_scnlen contains the symbol table index of the containing csect. XTY_CM x_scnlen contains the csect length. XTY_ER x_scnlen contains 0. Change the SectionLen member name to SectionOrLength is more reasonable. Authored By: DiggerLin Reviewed By: hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D68650 llvm-svn: 374179
* Fix Wdocumentation unknown parameter warning. NFCI.Simon Pilgrim2019-10-091-1/+1
| | | | llvm-svn: 374171
* Unify the two CRC implementationsHans Wennborg2019-10-092-53/+40
| | | | | | | | | | | | | | | | | | | | | David added the JamCRC implementation in r246590. More recently, Eugene added a CRC-32 implementation in r357901, which falls back to zlib's crc32 function if present. These checksums are essentially the same, so having multiple implementations seems unnecessary. This replaces the CRC-32 implementation with the simpler one from JamCRC, and implements the JamCRC interface in terms of CRC-32 since this means it can use zlib's implementation when available, saving a few bytes and potentially making it faster. JamCRC took an ArrayRef<char> argument, and CRC-32 took a StringRef. This patch changes it to ArrayRef<uint8_t> which I think is the best choice, and simplifies a few of the callers nicely. Differential revision: https://reviews.llvm.org/D68570 llvm-svn: 374148
* [TypeSize] Fix module builds (cassert)Kristina Brooks2019-10-091-0/+1
| | | | | | | | TypeSize.h uses `assert` statements without including the <cassert> header first which leads to failures in modular builds. llvm-svn: 374138
* DebugInfo: Move LLE enum handling to .def to match RLE handlingDavid Blaikie2019-10-082-15/+25
| | | | llvm-svn: 374122
* Mark several PointerIntPair methods as lvalue-onlyJordan Rose2019-10-081-5/+6
| | | | | | | | No point in mutating 'this' if it's just going to be thrown away. https://reviews.llvm.org/D63945 llvm-svn: 374102
* [tblgen] Add getOperatorAsDef() to RecordDaniel Sanders2019-10-081-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: While working with DagInit's, it's often the case that you expect the operator to be a reference to a def. This patch adds a wrapper for this common case to reduce the amount of boilerplate callers need to duplicate repeatedly. getOperatorAsDef() returns the record if the DagInit has an operator that is a DefInit. Otherwise, it prints a fatal error. There's only a few pre-existing examples in LLVM at the moment and I've left a few instances of the code this simplifies as they had more specific error messages than the generic one this produces. I'm going to be using this a fair bit in my subsequent patches. Reviewers: bogner, volkan, nhaehnle Reviewed By: nhaehnle Subscribers: nhaehnle, hiraditya, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, lenary, s.egerton, pzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68424 llvm-svn: 374101
* [BPF] do compile-once run-everywhere relocation for bitfieldsYonghong Song2019-10-081-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A bpf specific clang intrinsic is introduced: u32 __builtin_preserve_field_info(member_access, info_kind) Depending on info_kind, different information will be returned to the program. A relocation is also recorded for this builtin so that bpf loader can patch the instruction on the target host. This clang intrinsic is used to get certain information to facilitate struct/union member relocations. The offset relocation is extended by 4 bytes to include relocation kind. Currently supported relocation kinds are enum { FIELD_BYTE_OFFSET = 0, FIELD_BYTE_SIZE, FIELD_EXISTENCE, FIELD_SIGNEDNESS, FIELD_LSHIFT_U64, FIELD_RSHIFT_U64, }; for __builtin_preserve_field_info. The old access offset relocation is covered by FIELD_BYTE_OFFSET = 0. An example: struct s { int a; int b1:9; int b2:4; }; enum { FIELD_BYTE_OFFSET = 0, FIELD_BYTE_SIZE, FIELD_EXISTENCE, FIELD_SIGNEDNESS, FIELD_LSHIFT_U64, FIELD_RSHIFT_U64, }; void bpf_probe_read(void *, unsigned, const void *); int field_read(struct s *arg) { unsigned long long ull = 0; unsigned offset = __builtin_preserve_field_info(arg->b2, FIELD_BYTE_OFFSET); unsigned size = __builtin_preserve_field_info(arg->b2, FIELD_BYTE_SIZE); #ifdef USE_PROBE_READ bpf_probe_read(&ull, size, (const void *)arg + offset); unsigned lshift = __builtin_preserve_field_info(arg->b2, FIELD_LSHIFT_U64); #if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__ lshift = lshift + (size << 3) - 64; #endif #else switch(size) { case 1: ull = *(unsigned char *)((void *)arg + offset); break; case 2: ull = *(unsigned short *)((void *)arg + offset); break; case 4: ull = *(unsigned int *)((void *)arg + offset); break; case 8: ull = *(unsigned long long *)((void *)arg + offset); break; } unsigned lshift = __builtin_preserve_field_info(arg->b2, FIELD_LSHIFT_U64); #endif ull <<= lshift; if (__builtin_preserve_field_info(arg->b2, FIELD_SIGNEDNESS)) return (long long)ull >> __builtin_preserve_field_info(arg->b2, FIELD_RSHIFT_U64); return ull >> __builtin_preserve_field_info(arg->b2, FIELD_RSHIFT_U64); } There is a minor overhead for bpf_probe_read() on big endian. The code and relocation generated for field_read where bpf_probe_read() is used to access argument data on little endian mode: r3 = r1 r1 = 0 r1 = 4 <=== relocation (FIELD_BYTE_OFFSET) r3 += r1 r1 = r10 r1 += -8 r2 = 4 <=== relocation (FIELD_BYTE_SIZE) call bpf_probe_read r2 = 51 <=== relocation (FIELD_LSHIFT_U64) r1 = *(u64 *)(r10 - 8) r1 <<= r2 r2 = 60 <=== relocation (FIELD_RSHIFT_U64) r0 = r1 r0 >>= r2 r3 = 1 <=== relocation (FIELD_SIGNEDNESS) if r3 == 0 goto LBB0_2 r1 s>>= r2 r0 = r1 LBB0_2: exit Compare to the above code between relocations FIELD_LSHIFT_U64 and FIELD_LSHIFT_U64, the code with big endian mode has four more instructions. r1 = 41 <=== relocation (FIELD_LSHIFT_U64) r6 += r1 r6 += -64 r6 <<= 32 r6 >>= 32 r1 = *(u64 *)(r10 - 8) r1 <<= r6 r2 = 60 <=== relocation (FIELD_RSHIFT_U64) The code and relocation generated when using direct load. r2 = 0 r3 = 4 r4 = 4 if r4 s> 3 goto LBB0_3 if r4 == 1 goto LBB0_5 if r4 == 2 goto LBB0_6 goto LBB0_9 LBB0_6: # %sw.bb1 r1 += r3 r2 = *(u16 *)(r1 + 0) goto LBB0_9 LBB0_3: # %entry if r4 == 4 goto LBB0_7 if r4 == 8 goto LBB0_8 goto LBB0_9 LBB0_8: # %sw.bb9 r1 += r3 r2 = *(u64 *)(r1 + 0) goto LBB0_9 LBB0_5: # %sw.bb r1 += r3 r2 = *(u8 *)(r1 + 0) goto LBB0_9 LBB0_7: # %sw.bb5 r1 += r3 r2 = *(u32 *)(r1 + 0) LBB0_9: # %sw.epilog r1 = 51 r2 <<= r1 r1 = 60 r0 = r2 r0 >>= r1 r3 = 1 if r3 == 0 goto LBB0_11 r2 s>>= r1 r0 = r2 LBB0_11: # %sw.epilog exit Considering verifier is able to do limited constant propogation following branches. The following is the code actually traversed. r2 = 0 r3 = 4 <=== relocation r4 = 4 <=== relocation if r4 s> 3 goto LBB0_3 LBB0_3: # %entry if r4 == 4 goto LBB0_7 LBB0_7: # %sw.bb5 r1 += r3 r2 = *(u32 *)(r1 + 0) LBB0_9: # %sw.epilog r1 = 51 <=== relocation r2 <<= r1 r1 = 60 <=== relocation r0 = r2 r0 >>= r1 r3 = 1 if r3 == 0 goto LBB0_11 r2 s>>= r1 r0 = r2 LBB0_11: # %sw.epilog exit For native load case, the load size is calculated to be the same as the size of load width LLVM otherwise used to load the value which is then used to extract the bitfield value. Differential Revision: https://reviews.llvm.org/D67980 llvm-svn: 374099
OpenPOWER on IntegriCloud