| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
| |
This reverts commit r337082, restoring r337051, since the LLVM side
patch has been restored.
llvm-svn: 337185
|
|
|
|
|
|
| |
This reverts commit r337051.
llvm-svn: 337082
|
|
|
|
|
|
|
| |
Clang change to reflect the FunctionsToImportTy type change
in the llvm changes for D48670.
llvm-svn: 337051
|
|
|
|
|
|
|
|
|
|
| |
Summary: Automatic variable initialization was generating default-aligned stores (which are deprecated) instead of using the known alignment from the alloca. Further, they didn't specify inbounds.
Subscribers: dexonsmith, cfe-commits
Differential Revision: https://reviews.llvm.org/D49209
llvm-svn: 337041
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: In the SPMD case, we need to initialize the data sharing and globalization infrastructure. This covers the case when an SPMD region calls a function in a different compilation unit.
Reviewers: ABataev, carlo.bertolli, caomhin
Reviewed By: ABataev
Subscribers: Hahnfeld, jholewinski, guansong, cfe-commits
Differential Revision: https://reviews.llvm.org/D49188
llvm-svn: 337015
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Code in `CodeGenModule::SetFunctionAttributes()` could set an empty
attribute `implicit-section-name` on a function that is affected by
`#pragma clang text="section"`. This is incorrect because the attribute
should contain a valid section name. If the function additionally also
used `__attribute__((section("section")))` then this could result in
emitting the function in a section with an empty name.
The patch fixes the issue by removing the problematic code that sets
empty `implicit-section-name` from
`CodeGenModule::SetFunctionAttributes()` because it is sufficient to set
this attribute only from a similar code in `setNonAliasAttributes()`
when the function is emitted.
Differential Revision: https://reviews.llvm.org/D48916
llvm-svn: 336842
|
|
|
|
| |
llvm-svn: 336840
|
|
|
|
|
|
|
|
|
| |
The member init list for the sole constructor for CodeGenFunction
has gotten out of hand, so this patch moves the non-parameter-dependent
initializations into the member value inits.
Note: This is what was intended to be committed in r336726
llvm-svn: 336729
|
|
|
|
| |
llvm-svn: 336727
|
|
|
|
|
|
|
|
| |
The member init list for the sole constructor for CodeGenFunction
has gotten out of hand, so this patch moves the non-parameter-dependent
initializations into the member value inits.
llvm-svn: 336726
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Make sure that loop metadata only is put on the backedge
when expanding a do-while loop.
Previously we added the loop metadata also on the branch
in the pre-header. That could confuse optimization passes
and result in the loop metadata being associated with the
wrong loop.
Fixes https://bugs.llvm.org/show_bug.cgi?id=38011
Committing on behalf of deepak2427 (Deepak Panickal)
Reviewers: #clang, ABataev, hfinkel, aaron.ballman, bjope
Reviewed By: bjope
Subscribers: bjope, rsmith, shenhan, zzheng, xbolva00, lebedev.ri, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D48721
llvm-svn: 336717
|
|
|
|
|
|
| |
__builtin_ia32_divsd_round_mask.
llvm-svn: 336628
|
|
|
|
|
|
|
|
|
|
|
|
| |
is suitable for use in scalar mask intrinsics.
This will convert the i8 mask argument to <8 x i1> and extract an i1 and then emit a select instruction. This replaces the '(__U & 1)" and ternary operator used in some of intrinsics. The old sequence was lowered to a scalar and and compare. The new sequence uses an i1 vector that will interoperate better with other mask intrinsics.
This removes the need to handle div_ss/sd specially in CGBuiltin.cpp. A follow up patch will add the GCCBuiltin name back in llvm and remove the custom handling.
I made some adjustments to legacy move_ss/sd intrinsics which we reused here to do a simpler extract and insert instead of 2 extracts and two inserts or a shuffle.
llvm-svn: 336622
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
width. Add a min_vector_width function attribute and tag all x86 instrinsics with it
This is part of an ongoing attempt at making 512 bit vectors illegal in the X86 backend type legalizer due to CPU frequency penalties associated with wide vectors on Skylake Server CPUs. We want the loop vectorizer to be able to emit IR containing wide vectors as intermediate operations in vectorized code and allow these wide vectors to be legalized to 256 bits by the X86 backend even though we are targetting a CPU that supports 512 bit vectors. This is similar to what happens with an AVX2 CPU, the vectorizer can emit wide vectors and the backend will split them. We want this splitting behavior, but still be able to use new Skylake instructions that work on 256-bit vectors and support things like masking and gather/scatter.
Of course if the user uses explicit vector code in their source code we need to not split those operations. Especially if they have used any of the 512-bit vector intrinsics from immintrin.h. And we need to make it so that merely using the intrinsics produces the expected code in order to be backwards compatible.
To support this goal, this patch adds a new IR function attribute "min-legal-vector-width" that can indicate the need for a minimum vector width to be legal in the backend. We need to ensure this attribute is set to the largest vector width needed by any intrinsics from immintrin.h that the function uses. The inliner will be reponsible for merging this attribute when a function is inlined. We may also need a way to limit inlining in the future as well, but we can discuss that in the future.
To make things more complicated, there are two different ways intrinsics are implemented in immintrin.h. Either as an always_inline function containing calls to builtins(can be target specific or target independent) or vector extension code. Or as a macro wrapper around a taget specific builtin. I believe I've removed all cases where the macro was around a target independent builtin.
To support the always_inline function case this patch adds attribute((min_vector_width(128))) that can be used to tag these functions with their vector width. All x86 intrinsic functions that operate on vectors have been tagged with this attribute.
To support the macro case, all x86 specific builtins have also been tagged with the vector width that they require. Use of any builtin with this property will implicitly increase the min_vector_width of the function that calls it. I've done this as a new property in the attribute string for the builtin rather than basing it on the type string so that we can opt into it on a per builtin basis and avoid any impact to target independent builtins.
There will be future work to support vectors passed as function arguments and supporting inline assembly. And whatever else we can find that isn't covered by this patch.
Special thanks to Chandler who suggested this direction and reviewed a preview version of this patch. And thanks to Eric Christopher who has had many conversations with me about this issue.
Differential Revision: https://reviews.llvm.org/D48617
llvm-svn: 336583
|
|
|
|
|
|
|
|
|
|
| |
In generic data-sharing mode we are allowed to not globalize local
variables that escape their declaration context iff they are declared
inside of the parallel region. We can do this because L2 parallel
regions are executed sequentially and, thus, we do not need to put
shared local variables in the global memory.
llvm-svn: 336567
|
|
|
|
|
|
| |
This allows us to handle masking in a very similar way to the default rounding version that uses llvm.fma
llvm-svn: 336507
|
|
|
|
|
|
|
|
|
|
| |
sure we optimize the all ones mask case.
This case occurs in the intrinsic headers so we should avoid emitting the mask in those cases.
Factor the code into a helper function to make this easy.
llvm-svn: 336472
|
|
|
|
|
|
|
|
| |
native IR using llvm.fma intrinsic.
This generates some extra zeroing currently, but we should be able to quickly address that with some isel patterns.
llvm-svn: 336417
|
|
|
|
|
|
|
|
|
|
| |
fmaddsub/fmsubadd IR emission.
Shufflevector is easier to generate and matches what the backend pattern matches without relying on constant selects being turned into shuffles.
While I was there I also made the IR regular expressions a little stricter to ensure operand order on the shuffle.
llvm-svn: 336388
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch removes on optimization used with the TRUE/FALSE
predicates, as was suggested in https://reviews.llvm.org/D45616
for r335339.
The optimization was buggy, since r335339 used it also
for *_mask builtins, without actually applying the mask -- the
mask argument was just ignored.
Reviewers: craig.topper, uriel.k, RKSimon, andrew.w.kaylor, spatel, scanon, efriedma
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D48715
llvm-svn: 336355
|
|
|
|
|
|
|
|
|
| |
Update clang to treat fp128 as a valid base type for homogeneous aggregate
passing and returning.
Differential Revision: https://reviews.llvm.org/D48044
llvm-svn: 336308
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Emmiting new intrinsic that strips invariant.groups to make
devirtulization sound, as described in RFC: Devirtualization v2.
Reviewers: rjmccall, rsmith, amharc, kuhar
Subscribers: llvm-commits, cfe-commits
Differential Revision: https://reviews.llvm.org/D47299
Co-authored-by: Krzysztof Pszeniczny <krzysztof.pszeniczny@gmail.com>
llvm-svn: 336137
|
|
|
|
|
|
| |
builtins instead.
llvm-svn: 335945
|
|
|
|
|
|
|
|
| |
That's where CUDA binaries appear to put them.
Differential Revision: https://reviews.llvm.org/D48615
llvm-svn: 335880
|
|
|
|
|
|
| |
Follow-up commit for r335757 to address some inconsistencies.
llvm-svn: 335834
|
|
|
|
|
|
|
|
|
|
| |
This matches the way NVCC does it. Doing module cleanup at global
destructor phase used to work, but is, apparently, too late for
the CUDA runtime in CUDA-9.2, which ends up crashing with double-free.
Differential Revision: https://reviews.llvm.org/D48613
llvm-svn: 335763
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As brought up during the discussion of the DWARF5 accelerator tables,
there is currently no way to associate Objective-C methods with the
interface they belong to, other than the .apple_objc accelerator table.
After due consideration we came to the conclusion that it makes more
sense to follow Pavel's suggestion of just emitting this information in
the .debug_info section. One concern was that categories were
emitted in the .apple_names as well, but it turns out that LLDB doesn't
rely on the accelerator tables for this information.
This patch changes the codegen behavior to emit subprograms for
structure types, like we do for C++. This will result in the
DW_TAG_subprogram being nested as a child under its
DW_TAG_structure_type. This behavior is only enabled for DWARF5 and
later, so we can have a unique code path in LLDB with regards to
obtaining the class methods.
This was tested on the LLDB side and doesn't lead to a regression.
There's already code in place to deal with member functions in C++,
which deals with this transparently.
For more background please refer to the discussion on the mailing list:
http://lists.llvm.org/pipermail/llvm-dev/2018-June/123986.html
Differential revision: https://reviews.llvm.org/D48241
llvm-svn: 335757
|
|
|
|
|
|
| |
name to match llvm.
llvm-svn: 335745
|
|
|
|
|
|
|
|
|
| |
This patch reworks the support for dup NEON intrinsics as
described in D48439.
Differential Revision: https://reviews.llvm.org/D48440
llvm-svn: 335734
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We track when we see a name-shaped expression followed by a '<' token
and parse the '<' as a comparison. Then:
* if we see a token sequence that cannot possibly be an expression but
can be a template argument (in particular, a type-id) that follows
either a ',' or the '<', diagnose that the '<' was supposed to start
a template argument list, and
* if we see '>()', diagnose that the '<' was supposed to start a
template argument list.
This only changes the diagnostic for error cases, and in practice
appears to catch the most common cases where a missing 'template'
keyword leads to parse errors within a template.
Differential Revision: https://reviews.llvm.org/D48571
llvm-svn: 335687
|
|
|
|
|
|
| |
Depends on r334313, which has been reverted in r335681.
llvm-svn: 335684
|
|
|
|
|
|
|
| |
Apparently we're now hitting an object file section limit on this
file with expensive checks enabled.
llvm-svn: 335636
|
|
|
|
|
|
|
|
|
| |
Patch tries to make better analysis of the variables that should be
globalized. From now, instead of all parallel directives it will check
only distribute parallel .. directives and check only for
firstprivte/lastprivate variables if they must be globalized.
llvm-svn: 335632
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Similarly to CFI on virtual and indirect calls, this implementation
tries to use program type information to make the checks as precise
as possible. The basic way that it works is as follows, where `C`
is the name of the class being defined or the target of a call and
the function type is assumed to be `void()`.
For virtual calls:
- Attach type metadata to the addresses of function pointers in vtables
(not the functions themselves) of type `void (B::*)()` for each `B`
that is a recursive dynamic base class of `C`, including `C` itself.
This type metadata has an annotation that the type is for virtual
calls (to distinguish it from the non-virtual case).
- At the call site, check that the computed address of the function
pointer in the vtable has type `void (C::*)()`.
For non-virtual calls:
- Attach type metadata to each non-virtual member function whose address
can be taken with a member function pointer. The type of a function
in class `C` of type `void()` is each of the types `void (B::*)()`
where `B` is a most-base class of `C`. A most-base class of `C`
is defined as a recursive base class of `C`, including `C` itself,
that does not have any bases.
- At the call site, check that the function pointer has one of the types
`void (B::*)()` where `B` is a most-base class of `C`.
Differential Revision: https://reviews.llvm.org/D47567
llvm-svn: 335569
|
|
|
|
|
|
|
|
| |
implement the mask input argument using an 'and' IR instruction.
Additional IR is emitted to convert between scalar and vXi1 type to match the expected software inferface for the builtin that clang exposes.
llvm-svn: 335564
|
|
|
|
|
|
|
|
|
|
| |
The WebAssembly backend in particular benefits from being
able to distinguish between varargs functions (...) and prototype-less
C functions.
Differential Revision: https://reviews.llvm.org/D48443
llvm-svn: 335510
|
|
|
|
|
|
|
|
|
| |
threadprivate.
Do not delay emission of the address constant variables in OpenMP mode
as they cannot be defined as threadprivate.
llvm-svn: 335483
|
|
|
|
|
|
|
|
| |
constructor calls.
Differential Revision: https://reviews.llvm.org/D48531
llvm-svn: 335445
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
In his review of https://reviews.llvm.org/D45860, @GorNishanov suggested
avoiding generating additional exception-handling IR in the case that
the resume function was marked as 'noexcept', and exceptions could not
occur. This implements that suggestion.
Test Plan: `check-clang`
Reviewers: GorNishanov, EricWF
Reviewed By: GorNishanov
Subscribers: cfe-commits, GorNishanov
Differential Revision: https://reviews.llvm.org/D47673
llvm-svn: 335422
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since we are now producing a summary also for regular LTO builds, we
need to run the NameAnonGlobals pass in those cases as well (the
summary cannot handle anonymous globals).
See https://reviews.llvm.org/D34156 for details on the original change.
This reverts commit 6c9ee4a4a438a8059aacc809b2dd57128fccd6b3.
llvm-svn: 335385
|
|
|
|
|
|
|
|
| |
If the shuffle is required for the reduced structures/big data type,
current code may cause compiler crash because of the loading of the
aggregate values. Patch fixes this problem.
llvm-svn: 335377
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Lowering some vector comparision builtins to fcmp IR instructions.
This ignores the signaling behaviour specified in the predicate
argument of said builtins.
Affected AVX512 builtins:
__builtin_ia32_cmpps128_mask
__builtin_ia32_cmpps256_mask
__builtin_ia32_cmpps512_mask
__builtin_ia32_cmppd128_mask
__builtin_ia32_cmppd256_mask
__builtin_ia32_cmppd512_mask
Reviewers: craig.topper, uriel.k, RKSimon, andrew.w.kaylor, spatel, scanon, efriedma
Reviewed By: craig.topper, spatel, efriedma
Differential Revision: https://reviews.llvm.org/D45616
llvm-svn: 335339
|
|
|
|
|
|
|
|
| |
D48464 contains changes that will loosen some of the range checks in SemaChecking to a DefaultError warning that can be disabled.
This patch adds explicit masking to avoid using the upper bits of immediates to gracefully handle the warning being disabled.
llvm-svn: 335308
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Fixes PR37898.
Reviewers: pcc, vlad.tsyrklevich
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D48454
llvm-svn: 335305
|
|
|
|
|
|
|
|
|
|
|
| |
This is breaking a couple of buildbots. We need to run the
NameAnonGlobal pass for regular LTO now as well (since we're producing a
summary). I'll post a separate patch for review to make this happen and
then re-commit.
This reverts commit c0759b7b1f4a81ff9021b952aa38a222d5fa4dfd.
llvm-svn: 335291
|
|
|
|
|
|
|
|
|
|
| |
parallel region.
If the current construct requires sharing of the local variable in the
inner parallel region, this variable must be globalized to avoid
runtime crash.
llvm-svn: 335285
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
With D33921, we gained the ability to have module summaries in regular
LTO modules without triggering ThinLTO compilation. Module summaries in
regular LTO allow garbage collection (dead stripping) before LTO
compilation and thus open up additional optimization opportunities.
This patch enables summary emission in regular LTO for all targets
except ld64-based ones (which use the legacy LTO API).
Reviewers: pcc, tejohnson, mehdi_amini
Subscribers: inglorion, eraman, cfe-commits
Differential Revision: https://reviews.llvm.org/D34156
llvm-svn: 335284
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This test is a strip down version of a function inside the
amalgamated sqlite source. When converted to IR clang produces
a phi instruction without debug location.
This patch fixes the above issue.
Differential Revision: https://reviews.llvm.org/D47720
llvm-svn: 335255
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This diff includes the logic for setting the precision bits for each primary fixed point type in the target info and logic for initializing a fixed point literal.
Fixed point literals are declared using the suffixes
```
hr: short _Fract
uhr: unsigned short _Fract
r: _Fract
ur: unsigned _Fract
lr: long _Fract
ulr: unsigned long _Fract
hk: short _Accum
uhk: unsigned short _Accum
k: _Accum
uk: unsigned _Accum
```
Errors are also thrown for illegal literal values
```
unsigned short _Accum u_short_accum = 256.0uhk; // expected-error{{the integral part of this literal is too large for this unsigned _Accum type}}
```
Differential Revision: https://reviews.llvm.org/D46915
llvm-svn: 335148
|
|
|
|
|
|
|
|
|
| |
This is not only semantically correct but ensures that they will not
be marked as address-significant once D48155 lands.
Differential Revision: https://reviews.llvm.org/D48206
llvm-svn: 334982
|