summaryrefslogtreecommitdiffstats
path: root/clang/lib/CodeGen/CGCUDANV.cpp
Commit message (Collapse)AuthorAgeFilesLines
* Compute and preserve alignment more faithfully in IR-generation.John McCall2015-09-081-10/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce an Address type to bundle a pointer value with an alignment. Introduce APIs on CGBuilderTy to work with Address values. Change core APIs on CGF/CGM to traffic in Address where appropriate. Require alignments to be non-zero. Update a ton of code to compute and propagate alignment information. As part of this, I've promoted CGBuiltin's EmitPointerWithAlignment helper function to CGF and made use of it in a number of places in the expression emitter. The end result is that we should now be significantly more correct when performing operations on objects that are locally known to be under-aligned. Since alignment is not reliably tracked in the type system, there are inherent limits to this, but at least we are no longer confused by standard operations like derived-to-base conversions and array-to-pointer decay. I've also fixed a large number of bugs where we were applying the complete-object alignment to a pointer instead of the non-virtual alignment, although most of these were hidden by the very conservative approach we took with member alignment. Also, because IRGen now reliably asserts on zero alignments, we should no longer be subject to an absurd but frustrating recurring bug where an incomplete type would report a zero alignment and then we'd naively do a alignmentAtOffset on it and emit code using an alignment equal to the largest power-of-two factor of the offset. We should also now be emitting much more aggressive alignment attributes in the presence of over-alignment. In particular, field access now uses alignmentAtOffset instead of min. Several times in this patch, I had to change the existing code-generation pattern in order to more effectively use the Address APIs. For the most part, this seems to be a strict improvement, like doing pointer arithmetic with GEPs instead of ptrtoint. That said, I've tried very hard to not change semantics, but it is likely that I've failed in a few places, for which I apologize. ABIArgInfo now always carries the assumed alignment of indirect and indirect byval arguments. In order to cut down on what was already a dauntingly large patch, I changed the code to never set align attributes in the IR on non-byval indirect arguments. That is, we still generate code which assumes that indirect arguments have the given alignment, but we don't express this information to the backend except where it's semantically required (i.e. on byvals). This is likely a minor regression for those targets that did provide this information, but it'll be trivial to add it back in a later patch. I partially punted on applying this work to CGBuiltin. Please do not add more uses of the CreateDefaultAligned{Load,Store} APIs; they will be going away eventually. llvm-svn: 246985
* Revert r240270 ("Fixed/added namespace ending comments using clang-tidy").Alexander Kornienko2015-06-221-1/+1
| | | | llvm-svn: 240353
* Fixed/added namespace ending comments using clang-tidy. NFCAlexander Kornienko2015-06-221-1/+1
| | | | | | | | | | | | The patch is generated using this command: $ tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \ -checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \ work/llvm/tools/clang To reduce churn, not touching namespaces spanning less than 10 lines. llvm-svn: 240270
* [cuda] Include GPU binary into host object file and generate init/deinit code.Artem Belevich2015-05-071-13/+205
| | | | | | | | | | | | - added -fcuda-include-gpubinary option to incorporate results of device-side compilation into host-side one. - generate code to register GPU binaries and associated kernels with CUDA runtime and clean-up on exit. - added test case for init/deinit code generation. Differential Revision: http://reviews.llvm.org/D9507 llvm-svn: 236765
* [C++11] Add 'override' keyword to virtual methods that override their base ↵Craig Topper2014-03-121-1/+1
| | | | | | class. llvm-svn: 203643
* [Modules] Update to reflect the move of CallSite into the IR library inChandler Carruth2014-03-041-1/+1
| | | | | | LLVM r202816. llvm-svn: 202817
* Use the actual ABI-determined C calling convention for runtimeJohn McCall2013-02-281-2/+2
| | | | | | | | | | | | | | | | | | | | | | calls and declarations. LLVM has a default CC determined by the target triple. This is not always the actual default CC for the ABI we've been asked to target, and so we sometimes find ourselves annotating all user functions with an explicit calling convention. Since these calling conventions usually agree for the simple set of argument types passed to most runtime functions, using the LLVM-default CC in principle has no effect. However, the LLVM optimizer goes into histrionics if it sees this kind of formal CC mismatch, since it has no concept of CC compatibility. Therefore, if this module happens to define the "runtime" function, or got LTO'ed with such a definition, we can miscompile; so it's quite important to get this right. Defining runtime functions locally is quite common in embedded applications. llvm-svn: 176286
* Remove useless 'llvm::' qualifier from names like StringRef and others that areDmitri Gribenko2013-01-121-1/+1
| | | | | | brought into 'clang' namespace by clang/Basic/LLVM.h llvm-svn: 172323
* Rewrite #includes for llvm/Foo.h to llvm/IR/Foo.h as appropriate toChandler Carruth2013-01-021-3/+3
| | | | | | | | reflect the migration in r171366. Re-sort the #include lines to reflect the new paths. llvm-svn: 171369
* Sort all of Clang's files under 'lib', and fix up the broken headersChandler Carruth2012-12-041-1/+0
| | | | | | | | | | | | | uncovered. This required manually correcting all of the incorrect main-module headers I could find, and running the new llvm/utils/sort_includes.py script over the files. I also manually added quite a few missing headers that were uncovered by shuffling the order or moving headers up to be main-module-headers. llvm-svn: 169237
* CUDA: IR generation support for device stubsPeter Collingbourne2011-10-061-0/+92
| | | | llvm-svn: 141304
* CUDA: IR generation support for kernel call expressionsPeter Collingbourne2011-10-061-0/+34
llvm-svn: 141300
OpenPOWER on IntegriCloud