path: root/clang/lib/CodeGen
Commit message | Author | Age | Files | Lines
...
* [complex] Teach Clang to preserve different-type operands to arithmetic | Chandler Carruth | 2014-10-11 | 1 | -35/+162
| operators where one type is a C complex type, and to emit both the efficient and correct implementation for complex arithmetic according to C11 Annex G using this extra information.
|
| For both multiply and divide the old code was writing a long-hand reduced version of the math without any of the special handling of inf and NaN recommended by the standard here. Instead of putting more complexity here, this change does what GCC does, which is to emit a libcall for the fully general case.
|
| However, the old code also failed to do the proper minimization of the set of operations when there was a mixed complex and real operation. In those cases, C provides a spec for much more minimal operations that are valid. Clang now emits the exact suggested operations. This change isn't *just* about performance though: without minimizing these operations, we again lose the correct handling of infinities and NaNs. It is critical that this happen in the frontend based on asymmetric type operands to complex math operations.
|
| The performance implications of this change aren't trivial either. I've run a set of benchmarks in Eigen, an open source mathematics library that makes heavy use of complex. While a few have slowed down due to the libcall being introduced, most sped up and some by a huge amount: up to 100% and 140%.
|
| In order to make all of this work, also match the algorithm in the constant evaluator to the one in the runtime library. Currently it is a broken port of the simplifications from C's Annex G to the long-hand formulation of the algorithm.
|
| Splitting this patch up is very hard because none of this works without the AST change to preserve non-complex operands. Sorry for the enormous change.
|
| Follow-up changes will include support for sinking the libcalls onto cold paths in common cases and fastmath improvements to allow more aggressive backend folding.
|
| Differential Revision: http://reviews.llvm.org/D5698
| llvm-svn: 219557
* CodeGen: FieldMemcpyizer didn't handle copies starting inside bitfields | David Majnemer | 2014-10-10 | 1 | -4/+16
| It's possible to construct cases where the first field we are trying to copy is in the middle of an IR field. In some complicated cases, we would fail to use an appropriate offset inside the object. Earlier builds of clang seemed to miscompile the code by copying an insufficient number of bytes. Up until now, we would assert: the copying offset was insufficiently aligned.
| This fixes PR21232.
| llvm-svn: 219524
* Reduce double set lookups. NFC. | Benjamin Kramer | 2014-10-10 | 1 | -2/+1
| llvm-svn: 219504
* Unfriend CGOpenMPRegionInfo so it can go into an anonymous namespace. | Benjamin Kramer | 2014-10-10 | 2 | -8/+4
| Also remove some unnecessary virtual keywords. NFC.
| llvm-svn: 219497
* Fixing the MSVC 2013 build, NFC. | Aaron Ballman | 2014-10-10 | 1 | -0/+3
| llvm-svn: 219491
* Code reformatting and improvement for OpenMP. | Alexey Bataev | 2014-10-10 | 4 | -106/+142
| Moved CGOpenMPRegionInfo from CGOpenMPRuntime.h to CGOpenMPRuntime.cpp and reworked the code for this change. Also added processing of the ThreadID variable passed as an argument to outlined functions in parallel and task directives.
| llvm-svn: 219490
* Code improvements in OpenMP CodeGen. | Alexey Bataev | 2014-10-10 | 3 | -71/+108
| This patch makes class OMPPrivateScope a common class for all private variables. Reworked the processing of firstprivate variables (it is now based on OMPPrivateScope too).
| llvm-svn: 219486
* Revert r218865 because it introduced PR21236, a crash in codegen emitting the try block. | Nick Lewycky | 2014-10-10 | 5 | -124/+32
| llvm-svn: 219470
* Promote null pointer constants used as arguments to variadic functions | Reid Kleckner | 2014-10-10 | 2 | -1/+26
| Make it possible to pass NULL through variadic functions on 64-bit Windows targets. The Visual C++ headers define NULL to 0, when they should define it to 0LL on Win64 so that NULL is a pointer-sized integer.
| Fixes PR20949.
| Reviewers: thakis, rsmith
| Differential Revision: http://reviews.llvm.org/D5480
| llvm-svn: 219456
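To see why a bare `NULL` that expands to `0` is a problem in a variadic call, consider a callee that reads pointer-sized arguments until it finds a null sentinel. The sketch below is only illustrative (`count_strings` and `count_two` are hypothetical names): it passes an explicitly typed null pointer, which is pointer-sized on every target, rather than relying on the promotion this commit adds for Win64.

```cpp
#include <cstdarg>

// Hypothetical helper: counts string arguments up to a null-pointer sentinel.
int count_strings(const char *first, ...) {
    va_list ap;
    va_start(ap, first);
    int n = 0;
    for (const char *p = first; p != nullptr; p = va_arg(ap, const char *)) {
        ++n;
    }
    va_end(ap);
    return n;
}

// A typed null pointer guarantees that va_arg(ap, const char *) in the
// callee reads a complete pointer-sized value.
int count_two() {
    return count_strings("a", "b", static_cast<const char *>(nullptr));
}
```

If the sentinel were a 32-bit `int` zero on a target with 64-bit pointers, the callee would read 64 bits where only 32 were defined, which is the situation the commit describes for the Visual C++ headers.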
* Fix for bug http://llvm.org/PR17427. | Alexey Bataev | 2014-10-09 | 3 | -82/+17
| Assertion failed: "Computed __func__ length differs from type!"
| Reworked PredefinedExpr representation with internal StringLiteral field for function declaration.
| Differential Revision: http://reviews.llvm.org/D5365
| llvm-svn: 219393
* [OPENMP] 'omp teams' directive basic support. | Alexey Bataev | 2014-10-09 | 3 | -0/+8
| Includes parsing and semantic analysis for 'omp teams' directive support from OpenMP 4.0. Adds additional analysis to 'omp target' directive with 'omp teams' directive.
| llvm-svn: 219385
* Replace a destructor of EHCleanupScope with a Destroy() method to reflect the current usage. | Kostya Serebryany | 2014-10-08 | 2 | -2/+4
| Summary: The current code uses memset to re-initialize EHCleanupScope objects, which breaks the assumptions of the upcoming asan's intra-object-overflow checker. If there is no DTOR, the new checker will refuse to work.
| Test Plan: bootstrap with asan
| Reviewers: rnk
| Reviewed By: rnk
| Subscribers: cfe-commits
| Differential Revision: http://reviews.llvm.org/D5656
| llvm-svn: 219331
* Revert "Remove threshold on object size for inserting lifetime begin / end" | Arnaud A. de Grandmaison | 2014-10-08 | 1 | -9/+18
| Revert this patch while I investigate some sanitizer failures off-line.
| llvm-svn: 219307
* [OPENMP] Codegen for 'firstprivate' clause. | Alexey Bataev | 2014-10-08 | 5 | -68/+270
| This patch generates helper variables that are used as private copies of the corresponding original variables inside an OpenMP 'parallel' directive. These generated variables are initialized by copy using the values of the original variables (with the copy constructor, if any). For arrays, an initializer is generated for a single element and, in the codegen procedure, this initial value is automatically propagated to all elements of the private copy. In the outlined function, references to the original variables are replaced by references to these private helper variables. At the end of the initialization of the private variables, an implicit barrier is generated by calling the __kmpc_barrier(...) runtime function to ensure that all threads were initialized using the original values of the variables.
| Differential Revision: http://reviews.llvm.org/D5140
| llvm-svn: 219306
* Remove threshold on object size for inserting lifetime begin / end | Arnaud A. de Grandmaison | 2014-10-08 | 1 | -18/+9
| Bootstrapping LLVM+Clang+LLDB without a threshold on object size for lifetime marker insertion has shown there was no significant change in compile time, so let the stack slot colorizer do its optimization for all slots.
| llvm-svn: 219303
* Revert commit r219297. | Alexey Bataev | 2014-10-08 | 5 | -270/+68
| Still having trouble with OpenMP/parallel_firstprivate_codegen.cpp (now on the ARM buildbots).
| llvm-svn: 219298
* [OPENMP] Codegen for 'firstprivate' clause. | Alexey Bataev | 2014-10-08 | 5 | -68/+270
| This patch generates helper variables that are used as private copies of the corresponding original variables inside an OpenMP 'parallel' directive. These generated variables are initialized by copy using the values of the original variables (with the copy constructor, if any). For arrays, an initializer is generated for a single element and, in the codegen procedure, this initial value is automatically propagated to all elements of the private copy. In the outlined function, references to the original variables are replaced by references to these private helper variables. At the end of the initialization of the private variables, an implicit barrier is generated by calling the __kmpc_barrier(...) runtime function to ensure that all threads were initialized using the original values of the variables.
| Differential Revision: http://reviews.llvm.org/D5140
| llvm-svn: 219297
* Revert back r219295. | Alexey Bataev | 2014-10-08 | 5 | -270/+68
| To fix issues with the test OpenMP/parallel_firstprivate_codegen.cpp
| llvm-svn: 219296
* [OPENMP] Codegen for 'firstprivate' clause. | Alexey Bataev | 2014-10-08 | 5 | -68/+270
| This patch generates helper variables that are used as private copies of the corresponding original variables inside an OpenMP 'parallel' directive. These generated variables are initialized by copy using the values of the original variables (with the copy constructor, if any). For arrays, an initializer is generated for a single element and, in the codegen procedure, this initial value is automatically propagated to all elements of the private copy. In the outlined function, references to the original variables are replaced by references to these private helper variables. At the end of the initialization of the private variables, an implicit barrier is generated by calling the __kmpc_barrier(...) runtime function to ensure that all threads were initialized using the original values of the variables.
| Differential Revision: http://reviews.llvm.org/D5140
| llvm-svn: 219295
* Revert "[OPENMP] 'omp teams' directive basic support. Includes parsing and semantic analysis for 'omp teams' directive support from OpenMP 4.0. Adds additional analysis to 'omp target' directive with 'omp teams' directive." | Renato Golin | 2014-10-08 | 3 | -8/+0
| This reverts commit r219197 because it broke ARM self-hosting buildbots with segmentation fault errors in many tests.
| llvm-svn: 219289
* Fix IRGen for referencing a static local before emitting its decl | Reid Kleckner | 2014-10-08 | 5 | -45/+74
| Summary: Previously CodeGen assumed that static locals were emitted before they could be accessed, which is true for automatic storage duration locals. However, it is possible to have CodeGen emit a nested function that uses a static local before emitting the function that defines the static local, breaking that assumption.
|
| Fix it by creating the static local upon access and ensuring that the deferred function body gets emitted. We may not be able to emit the initializer properly from outside the function body, so don't try.
|
| Fixes PR18020. See also previous attempts to fix static locals in PR6769 and PR7101.
| Reviewers: rsmith
| Subscribers: cfe-commits
| Differential Revision: http://reviews.llvm.org/D4787
| llvm-svn: 219265
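One shape of source code in which a nested function uses the enclosing function's static local is sketched below. This is an assumption for illustration, not the test case from PR18020: `make_counter` and `Counter` are hypothetical names, and whether a given compiler actually emits `Counter::next` before `make_counter` is an implementation detail.

```cpp
// Returns an object of a local class whose member function uses the
// enclosing function's static local. Requires C++14 return type deduction.
inline auto make_counter() {
    static int count = 0;
    struct Counter {
        // Refers to `count`, which is declared in make_counter().
        int next() { return ++count; }
    };
    return Counter{};
}
```

A caller can write `auto c = make_counter(); c.next();`, so `Counter::next` is used from outside `make_counter` while still depending on the static local defined inside it.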
* Avoid code duplication by calling setAliasAttributes in EmitAliasDefinition. | Rafael Espindola | 2014-10-08 | 1 | -12/+3
| llvm-svn: 219258
* Allow dllexport alias to base destructors. | Rafael Espindola | 2014-10-07 | 1 | -5/+1
| We used to avoid these, but it looks like we did so just because we were not handling dllexport alias correctly. Dario Domizioli fixed that, so allow these aliases.
| Based on a patch by Dario Domizioli!
| llvm-svn: 219206
* [OPENMP] 'omp teams' directive basic support. | Alexey Bataev | 2014-10-07 | 3 | -0/+8
| Includes parsing and semantic analysis for 'omp teams' directive support from OpenMP 4.0. Adds additional analysis to 'omp target' directive with 'omp teams' directive.
| llvm-svn: 219197
* [OPENMP] Small refactoring of EmitOMPSimdLoop helper routine. | Alexander Musman | 2014-10-07 | 2 | -17/+18
| No functional changes intended. Renamed EmitOMPSimdLoop to EmitOMPInnerLoop; I plan to reuse it to emit the inner loop in future patches for CodeGen of the worksharing loop directives (omp for, omp for simd).
| llvm-svn: 219195
* Using an explicit cast to work around MSVC 2013 not picking the conversion operator as expected. | Aaron Ballman | 2014-10-06 | 1 | -1/+1
| NFC, should fix the MSVC build bots.
| llvm-svn: 219116
* Add FIXME/notes to the future. | David Blaikie | 2014-10-06 | 1 | -0/+5
| llvm-svn: 219104
* DebugInfo: Don't include implicit special members in the list of class members | David Blaikie | 2014-10-06 | 1 | -18/+17
| By leaving these members out of the member list, we avoid them being emitted into type unit definitions - while still allowing the definition/declaration to be injected into the compile unit as expected.
| llvm-svn: 219101
* DebugInfo: Don't include member function template specializations in the list of class members | David Blaikie | 2014-10-06 | 1 | -10/+0
| By leaving these members out of the member list, we avoid them being emitted into type unit definitions - while still allowing the definition/declaration to be injected into the compile unit as expected.
| llvm-svn: 219100
* MS ABI: Implement thread_local for global variables | David Majnemer | 2014-10-05 | 8 | -98/+194
| Summary: This adds support for the C++11 feature, thread_local global variables. The ABI Clang implements is an improvement of the MSVC ABI. Sadly, further improvements could be made but not without sacrificing ABI compatibility.
|
| The feature is implemented as follows:
| - All thread_local initialization routines are pointed to from the .CRT$XDU section.
| - All non-weak thread_local variables have their initialization routines called from a single function instead of getting their own .CRT$XDU section entry. This is done to open up optimization opportunities to the compiler.
| - All weak thread_local variables have their own .CRT$XDU section entry. This entry is in a COMDAT with the global variable it is initializing; this ensures that we will initialize the global exactly once.
| - Destructors are registered in the initialization function using __tlregdtor.
|
| Differential Revision: http://reviews.llvm.org/D5597
| llvm-svn: 219074
* Emit @llvm.assume for non-parameter lvalue align_value-attribute loads | Hal Finkel | 2014-10-04 | 1 | -3/+41
| We already add the align parameter attribute for function parameters that have the align_value attribute (or those with a typedef type having that attribute), which is an important special case, but does not handle pointers with value alignment assumptions that come into scope in any other way. To handle the general case, emit an @llvm.assume-based alignment assumption whenever we load the pointer-typed lvalue of an align_value-attributed variable (except for function parameters, which we already deal with at entry).
|
| I'll also note that this is more general than Intel's described support in:
|   https://software.intel.com/en-us/articles/data-alignment-to-assist-vectorization
| which states that the compiler inserts __assume_aligned directives in response to align_value-attributed variables only for function parameters and for the initializers of local variables. I think that we can make the optimizer deal with this more-general scheme (which could lead to a lot of calls to @llvm.assume inside of loop bodies, for example), but if not, I'll rework this to be less aggressive.
|
| llvm-svn: 219052
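As a rough source-level analogue of an alignment assumption attached to a loaded pointer value, the sketch below uses `__builtin_assume_aligned`, a builtin available in both GCC and Clang. This is an assumption made for illustration: it is not what Clang emits for align_value, and `sum_aligned` is a hypothetical function.

```cpp
#include <cstddef>

// States that the pointer value is 64-byte aligned at the point of use.
// Passing a pointer that is not actually 64-byte aligned is undefined.
double sum_aligned(const double *p, std::size_t n) {
    const double *q =
        static_cast<const double *>(__builtin_assume_aligned(p, 64));
    double total = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        total += q[i];
    }
    return total;
}
```

The caller is responsible for the alignment, for example by declaring the buffer as `alignas(64) double buf[8]`.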
* CFE Knob for: Add a thread-model knob for lowering atomics on baremetal & single threaded systems | Jonathan Roelofs | 2014-10-03 | 1 | -0/+5
| http://reviews.llvm.org/D4985
| llvm-svn: 219027
* Add getOpenMPSimdDefaultAlignment for PowerPC | Hal Finkel | 2014-10-03 | 1 | -0/+12
| When the aligned clause of an OpenMP simd pragma is not provided with an explicit alignment, a target-dependent default must be used. This adds such a default for PPC targets.
|
| This will become slightly more complicated when BG/Q support is added (because then it will depend on the type). For now, 16 is a correct value for all systems, and covers Altivec and VSX vectors.
|
| llvm-svn: 218994
* Initial support for the align_value attribute | Hal Finkel | 2014-10-02 | 1 | -0/+19
| This adds support for the align_value attribute. This attribute is supported by Intel's compiler (versions 14.0+), and several of my HPC users have requested support in Clang. It specifies an alignment assumption on the values to which a pointer points, and is used by numerical libraries to encourage efficient generation of vector code.
|
| Of course, we already have an aligned attribute that can specify enhanced alignment for a type, so why is this additional attribute important? The problem is that if you want to specify that an input array of T is, say, 64-byte aligned, you could try this:
|
|   typedef double aligned_double __attribute__((aligned(64)));
|   void foo(aligned_double *P) {
|     double x = P[0]; // This is fine.
|     double y = P[1]; // What alignment did those doubles have again?
|   }
|
| The access here to P[1] causes problems. P was specified as a pointer to type aligned_double, and any object of type aligned_double must be 64-byte aligned. But if P[0] is 64-byte aligned, then P[1] cannot be, and this access causes undefined behavior. Getting round this problem requires a lot of awkward casting and hand-unrolling of loops, all of which is bad.
|
| With the align_value attribute, we can accomplish what we'd like in a well defined way:
|
|   typedef double *aligned_double_ptr __attribute__((align_value(64)));
|   void foo(aligned_double_ptr P) {
|     double x = P[0]; // This is fine.
|     double y = P[1]; // This is fine too.
|   }
|
| This attribute does not create a new type (and so is not part of the type system), and so will only "propagate" through templates, auto, etc. by optimizer deduction after inlining. This seems consistent with Intel's implementation (thanks to Alexey for confirming the various Intel-compiler behaviors).
|
| As a final note, I would have chosen to call this aligned_value, not align_value, for better naming consistency with the aligned attribute, but I think it would be more useful to users to adopt Intel's name.
|
| llvm-svn: 218910
* Add __sync_fetch_and_nand (again) | Hal Finkel | 2014-10-02 | 1 | -1/+20
| Prior to GCC 4.4, __sync_fetch_and_nand was implemented as:
|
|   { tmp = *ptr; *ptr = ~tmp & value; return tmp; }
|
| but this was changed in GCC 4.4 to be:
|
|   { tmp = *ptr; *ptr = ~(tmp & value); return tmp; }
|
| In response to this change, support for __sync_fetch_and_nand (and __sync_nand_and_fetch) was removed in r99522 in order to avoid miscompiling code depending on the old semantics. However, at this point:
|
| 1. Many years have passed, and the amount of code relying on the old semantics is likely smaller.
| 2. Through the work of many contributors, all LLVM backends have been updated such that "atomicrmw nand" provides the newer GCC 4.4+ semantics (this process was complete in July of 2014; it was added to the release notes in r212635).
| 3. The lack of this intrinsic is now a needless impediment to porting codes from GCC to Clang (I've now seen several examples of this).
|
| It is true, however, that we still set __GNUC_MINOR__ to 2 (corresponding to GCC 4.2). To compensate for this, and to address the original concern regarding code relying on the old semantics, I've added a warning that specifically details the fact that the semantics have changed and that we provide the newer semantics.
|
| Fixes PR8842.
| llvm-svn: 218905
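The difference between the two semantics is easy to check on concrete bit patterns. The sketch below models both definitions as plain, non-atomic functions on 32-bit values; `fetch_and_nand_old` and `fetch_and_nand_new` are hypothetical names used only for this comparison and are not the builtins themselves.

```cpp
#include <cstdint>

// Pre-GCC-4.4 meaning: store (~old) & value. Non-atomic model.
std::uint32_t fetch_and_nand_old(std::uint32_t *ptr, std::uint32_t value) {
    std::uint32_t tmp = *ptr;
    *ptr = ~tmp & value;
    return tmp;
}

// GCC 4.4+ meaning: store ~(old & value), a true NAND. Non-atomic model.
std::uint32_t fetch_and_nand_new(std::uint32_t *ptr, std::uint32_t value) {
    std::uint32_t tmp = *ptr;
    *ptr = ~(tmp & value);
    return tmp;
}
```

Starting from `0xC` with an operand of `0xA`, the old form stores `0x2` while the new form stores `0xFFFFFFF7`; both return the previous value `0xC`.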
* [x32/NaCl] Check if method pointers straddle an eightbyte to classify Hi | Jan Wen Voung | 2014-10-02 | 1 | -3/+18
| Summary: Currently, with struct my_struct { int x; method_ptr y; }; a call to foo(my_struct s) may end up dropping the last 4 bytes of the method pointer for x86_64 NaCl and x32. When checking Has64BitPointers, also check if the method pointer straddles an eightbyte boundary and classify Hi as well as Lo if needed.
| Test Plan: test/CodeGenCXX/x86_64-arguments-nacl-x32.cpp
| Reviewers: dschuff, pavel.v.chupin
| Subscribers: jfb
| Differential Revision: http://reviews.llvm.org/D5555
| llvm-svn: 218889
* Reapply "InstrProf: Update for the LLVM API change in r218879" | Justin Bogner | 2014-10-02 | 3 | -15/+7
| Reapplying now that r218887 is in.
| This reverts commit r218882, reapplying r218880.
| llvm-svn: 218888
* Revert "InstrProf: Update for the LLVM API change in r218879" | Justin Bogner | 2014-10-02 | 3 | -7/+15
| r218879 has been reverted for now, this needs to go to match.
| This reverts commit r218880.
| llvm-svn: 218882
* InstrProf: Update for the LLVM API change in r218879 | Justin Bogner | 2014-10-02 | 3 | -15/+7
| llvm-svn: 218880
* Emit lifetime.start / lifetime.end markers for unnamed temporary objects. | Arnaud A. de Grandmaison | 2014-10-02 | 5 | -32/+124
| This will give more information to the optimizers so that they can reuse stack slots and reduce stack usage.
| llvm-svn: 218865
* DIBuilder: Encapsulate DIExpression's element type | Duncan P. N. Exon Smith | 2014-10-01 | 1 | -18/+16
| Update for corresponding LLVM API change for `DIBuilder::createExpression()`.
| llvm-svn: 218798
* Update CGDebugInfo to the updated API in LLVM. | Adrian Prantl | 2014-10-01 | 1 | -21/+25
| Complex address expressions are no longer part of DIVariable, but rather an extra argument to the debug intrinsics.
| http://reviews.llvm.org/D4919
| rdar://problem/17994491
| llvm-svn: 218788
* Reverting r218777 while investigating buildbot breakage. | Adrian Prantl | 2014-10-01 | 1 | -25/+21
| "Update CGDebugInfo to the updated API in LLVM."
| llvm-svn: 218781
* Update CGDebugInfo to the updated API in LLVM. | Adrian Prantl | 2014-10-01 | 1 | -21/+25
| Complex address expressions are no longer part of DIVariable, but rather an extra argument to the debug intrinsics.
| http://reviews.llvm.org/D4919
| rdar://problem/17994491
| llvm-svn: 218777
* Adds 'override' to overriding methods. NFC. | Fariborz Jahanian | 2014-10-01 | 1 | -1/+1
| These were uncovered by my as-yet undelivered patch.
| llvm-svn: 218774
* [OPENMP] Loop collapsing and codegen for 'omp simd' directive. | Alexander Musman | 2014-10-01 | 2 | -3/+203
| This patch implements collapsing of the loops (in particular, in the presence of the 'collapse' clause). In Sema, it calculates the number of iterations N and the expressions necessary to compute the nested loops' counter values from a new iteration variable (which goes from 0 to N-1). It also adds Codegen for 'omp simd', which uses (and tests) this feature.
| Differential Revision: http://reviews.llvm.org/D5184
| llvm-svn: 218743
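The idea of recovering nested loop counters from a single iteration variable can be sketched for a two-level nest. This is only an illustration under simple assumptions (both loops start at 0 with step 1, and `n` and `m` are non-negative); `collapsed_order` is a hypothetical function and the expressions Sema actually builds are more general.

```cpp
#include <utility>
#include <vector>

// Visits an n-by-m iteration space with one loop: iv runs from 0 to N-1,
// where N = n * m, and both original counters are recomputed from iv.
std::vector<std::pair<int, int>> collapsed_order(int n, int m) {
    std::vector<std::pair<int, int>> visited;
    const long total = static_cast<long>(n) * m;
    for (long iv = 0; iv < total; ++iv) {
        int i = static_cast<int>(iv / m);  // outer loop counter
        int j = static_cast<int>(iv % m);  // inner loop counter
        visited.emplace_back(i, j);
    }
    return visited;
}
```

For `n = 2` and `m = 3` the single loop visits (0,0), (0,1), (0,2), (1,0), (1,1), (1,2), which is the same order as the original nest.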
* InstrProf: Avoid repeated linear searches in a hot path | Justin Bogner | 2014-10-01 | 1 | -51/+33
| When generating coverage regions, we were doing a linear search through the existing regions in order to try to merge related ones. Most of the time this would find what it was looking for in a small number of steps and it wasn't a big deal, but in cases with many regions and few mergeable ones this leads to an absurd compile time regression.
|
| This changes the coverage mapping logic to do a single sort and then merge as we go, which is a bit simpler and about 100 times faster. I've also added FIXMEs on a couple of behaviours that seem a little suspect, while keeping them behaving as they were - I'll look into these soon.
|
| The test changes here are mostly tedious reorganization, because the ordering of regions we output has become slightly (but not completely) more consistent from the almost completely arbitrary ordering we got before.
|
| llvm-svn: 218738
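The "single sort, then merge as we go" approach can be sketched generically. The `Region` struct, its fields, and the merge rule (same counter, touching or overlapping ranges) are assumptions made for this example; the commit does not spell out which regions count as related, and this is not the coverage mapping code itself.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical region: a half-open [start, end) range plus a counter id.
struct Region {
    unsigned start;
    unsigned end;
    unsigned counter;
};

// Sort once, then merge in a single pass: each region is compared only with
// the most recently emitted one instead of searching all earlier regions.
std::vector<Region> merge_regions(std::vector<Region> regions) {
    std::sort(regions.begin(), regions.end(),
              [](const Region &a, const Region &b) {
                  if (a.counter != b.counter) return a.counter < b.counter;
                  if (a.start != b.start) return a.start < b.start;
                  return a.end < b.end;
              });
    std::vector<Region> merged;
    for (const Region &r : regions) {
        if (!merged.empty() && merged.back().counter == r.counter &&
            r.start <= merged.back().end) {
            // Touches or overlaps the previous region: extend it.
            merged.back().end = std::max(merged.back().end, r.end);
        } else {
            merged.push_back(r);
        }
    }
    return merged;
}
```

The sort costs O(n log n) and the merge pass is linear, whereas searching all existing regions for every new one is quadratic when few regions can be merged.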
* InstrProf: Hide SourceMappingRegion's internals (NFC) | Justin Bogner | 2014-10-01 | 1 | -12/+31
| This struct has some members that are accessed directly and others that need accessors, but it's all just public. This is confusing, so I've changed it to a class and made more members private.
| llvm-svn: 218737
* InstrProf: Remove an unused member (NFC) | Justin Bogner | 2014-09-30 | 1 | -6/+3
| llvm-svn: 218697
* [OPENMP] Codegen of the ‘aligned’ clause for the ‘omp simd’ directive. | Alexander Musman | 2014-09-30 | 3 | -10/+62
| Differential Revision: http://reviews.llvm.org/D5499
| llvm-svn: 218660