| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
| |
When using the option, draw the histogram representing the debug
location buckets. The resulting histogram will be saved in a png
file.
Differential Revision: https://reviews.llvm.org/D71869
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The encoded sequence of Elf*_Relr entries in a SHT_RELR section looks
like [ AAAAAAAA BBBBBBB1 BBBBBBB1 ... AAAAAAAA BBBBBB1 ... ]
i.e. start with an address, followed by any number of bitmaps. The address
entry encodes 1 relocation. The subsequent bitmap entries encode up to 63(31)
relocations each, at subsequent offsets following the last address entry.
More information is here:
https://github.com/llvm-mirror/llvm/blob/master/lib/Object/ELF.cpp#L272
This patch adds a support for these sections.
Differential revision: https://reviews.llvm.org/D71872
|
|
|
|
| |
This reverts commit 57cf6ee9c84434161088c39a6f8dd2aae14eb12d.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Making these changes, the code becomes more robust and easier for
adding the new features.
-Introduce the LocationStats class representing the statistics
-Add the pretty_print() method in the LocationStats class
-Add additional '-' for the program options
-Add the verify_program_inputs() function
-Add the parse_locstats() function
-Rename 'results' => 'opts'
-Add more comments
Differential Revision: https://reviews.llvm.org/D71868
|
|
|
|
|
| |
if users don't specific -mattr, the default target-feature come
from IR attribute.
|
|
|
|
|
|
| |
DWARFContext, the only user of this class, can already handle such offsets.
Differential Revision: https://reviews.llvm.org/D71834
|
| |
|
|
|
|
|
|
| |
This is a follow-up for D71546 to add a corresponding unit test.
Differential Revision: https://reviews.llvm.org/D72695
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The current implementation of skip insertion (SIInsertSkip) makes it a
mandatory pass required for correctness. Initially, the idea was to
have an optional pass. This patch inserts the s_cbranch_execz upfront
during SILowerControlFlow to skip over the sections of code when no
lanes are active. Later, SIRemoveShortExecBranches removes the skips
for short branches, unless there is a sideeffect and the skip branch is
really necessary.
This new pass will replace the handling of skip insertion in the
existing SIInsertSkip Pass.
Differential revision: https://reviews.llvm.org/D68092
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch implements minimal VE code generation for empty function bodies (no args, no value return).
Contents
* empty function code generation test.
* Minimal function prologue & epilogue emission
* Instruction formats and instruction definitions as far as required for the empty function prologue & epilogue.
* I64 register class definitions.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D72598
|
|
|
|
|
|
|
|
|
|
| |
We were performing an emulated i32->f64 in the SSE registers, then
storing that value to memory and doing a extload into the X87
domain.
After this patch we'll now just store the i32 to memory along
with an i32 0. Then do a 64-bit FILD to f80 completely in the X87
unit. This matches what we do without SSE.
|
|
|
|
|
|
| |
The mve-phireg.ll test no longer really tests what it was added for,
but the original case was fairly complex. I've left the test in as a
general codegen test.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch introduces `AAValueConstantRange`, which answers a possible range for integer value in a specific program point.
One of the motivations is propagating existing `range` metadata. (I think we need to change the situation that `range` metadata cannot be put to Argument).
The state is a tuple of `ConstantRange` and it is initialized to (known, assumed) = ([-∞, +∞], empty).
Currently, AAValueConstantRange is created in `getAssumedConstant` method when `AAValueSimplify` returns `nullptr`(worst state).
Supported
- BinaryOperator(add, sub, ...)
- CmpInst(icmp eq, ...)
- !range metadata
`AAValueConstantRange` is not intended to extend to polyhedral range value analysis.
Reviewers: jdoerfert, sstefan1
Reviewed By: jdoerfert
Subscribers: phosek, davezarzycki, baziotis, hiraditya, javed.absar, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71620
|
|
|
|
|
|
|
|
| |
ScheduleDAGMI. NFC
All the callers of this function will be ScheduleDAGMI from the
MachineScheduler. This allows us to use the extra info available in
ScheduleDAGMI without resorting to awkward casts.
|
|
|
|
|
| |
Add a test for the CMTime data formatter. The coverage report showed
that this code path was untested.
|
|
|
|
|
|
|
| |
The 'asynchronously' argument to both GetLLDBCommandsFromIOHandler and
GetPythonCommandsFromIOHandler is true for all call sites. This commit
simplifies the API by dropping it and giving the baton a default
argument.
|
| |
|
|
|
|
|
|
|
|
| |
These driver options perform some checking and delegate to MC options -x86-align-branch* and -x86-branches-within-32B-boundaries.
Reviewed By: skan
Differential Revision: https://reviews.llvm.org/D72463
|
|
|
|
|
| |
Add a check to bitfield mismatches that may have caused Clang to
give an error about the bitfield instead of being mutable.
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
As currently written, -target powerpcspe will enable SPE regardless of
disabling the feature later on in the command line. Instead, change
this to just set a default CPU to 'e500' instead of a generic CPU.
As part of this, add FeatureSPE to the e500 definition.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D72673
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Today the optimization is limited to:
- `[ClassName alloc]`
- `[self alloc]` when within a class method
However it means that when code is written this way:
```
@interface MyObject
- (id)copyWithZone:(NSZone *)zone
{
return [[self.class alloc] _initWith...];
}
@end
```
... then the optimization doesn't kick in and `+[NSObject alloc]` ends
up in IMP caches where it could have been avoided. It turns out that
`+alloc` -> `+[NSObject alloc]` is the most cached SEL/IMP pair in the
entire platform which is rather silly).
There's two theoretical risks allowing this optimization:
1. if the receiver is nil (which it can't be today), but it turns out
that `objc_alloc()`/`objc_alloc_init()` cope with a nil receiver,
2. if the `Clas` type for the receiver is a lie. However, for such a
code to work today (and not fail witn an unrecognized selector
anyway) you'd have to have implemented the `-alloc` **instance
method**.
Fortunately, `objc_alloc()` doesn't assume that the receiver is a
Class, it basically starts with a test that is similar to
`if (receiver->isa->bits & hasDefaultAWZ) { /* fastpath */ }`.
This bit is only set on metaclasses by the runtime, so if an instance
is passed to this function by accident, its isa will fail this test,
and `objc_alloc()` will gracefully fallback to `objc_msgSend()`.
The one thing `objc_alloc()` doesn't support is tagged pointer
instances. None of the tagged pointer classes implement an instance
method called `'alloc'` (actually there's a single class in the
entire Apple codebase that has such a method).
Differential Revision: https://reviews.llvm.org/D71682
Radar-Id: rdar://problem/58058316
Reviewed-By: Akira Hatanaka
Signed-off-by: Pierre Habouzit <phabouzit@apple.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
For builds with LLVM_BUILD_LLVM_DYLIB=ON and BUILD_SHARED_LIBS=OFF
this change makes all symbols in the target specific libraries hidden
by default.
A new macro called LLVM_EXTERNAL_VISIBILITY has been added to mark symbols in these
libraries public, which is mainly needed for the definitions of the
LLVMInitialize* functions.
This patch reduces the number of public symbols in libLLVM.so by about
25%. This should improve load times for the dynamic library and also
make abi checker tools, like abidiff require less memory when analyzing
libLLVM.so
One side-effect of this change is that for builds with
LLVM_BUILD_LLVM_DYLIB=ON and LLVM_LINK_LLVM_DYLIB=ON some unittests that
access symbols that are no longer public will need to be statically linked.
Before and after public symbol counts (using gcc 8.2.1, ld.bfd 2.31.1):
nm before/libLLVM-9svn.so | grep ' [A-Zuvw] ' | wc -l
36221
nm after/libLLVM-9svn.so | grep ' [A-Zuvw] ' | wc -l
26278
Reviewers: chandlerc, beanz, mgorny, rnk, hans
Reviewed By: rnk, hans
Subscribers: merge_guards_bot, luismarques, smeenai, ldionne, lenary, s.egerton, pzheng, sameer.abuasal, MaskRay, wuzish, echristo, Jim, hiraditya, michaelplatings, chapuni, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, mgrang, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, kristina, jsji, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D54439
|
|
|
|
|
|
|
|
|
| |
list constructor when initializing from {}.
We would previously pick between calling an initializer list constructor
and calling a default constructor unstably in this situation, depending
on whether the inherited default constructor had already been used
elsewhere in the program.
|
|
|
|
| |
assembler is used that is not present.
|
|
|
|
|
| |
rG7e02406f6cf180a8c89ce64665660e7cc9dbc23e switched the file to CRLF
line endings.
|
|
|
|
|
|
|
|
|
|
|
|
| |
This flag was originally part of D70157, but was removed as we carved away pieces of the review. Since we have the nop support checked in, and it appears mature(*), I think it's time to add the master flag. For now, it will default to nop padding, but once the prefix padding support lands, we'll update the defaults.
(*) I can now confirm that downstream testing of the changes which have landed to date - nop padding and compiler support for suppressions - is passing all of the functional testing we've thrown at it. There might still be something lurking, but we've gotten enough coverage to be confident of the basic approach.
Note that the new flag can be used either when assembling an .s file, or when using the integrated assembler directly from the compiler. The later will use all of the suppression mechanism and should always generate correct code. We don't yet have assembly syntax for the suppressions, so passing this directly to the assembler w/a raw .s file may result in broken code. Use at your own risk.
Also note that this isn't the wiring for the clang option. I think the most recent review for that is D72227, but I've lost track, so that might be off.
Differential Revision: https://reviews.llvm.org/D72738
|
|
|
|
|
|
|
| |
Add support for type-constraints in template type parameters.
Also add support for template type parameters as pack expansions (where the type constraint can now contain an unexpanded parameter pack).
Differential Revision: https://reviews.llvm.org/D44352
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Before this change, X86_32ABIInfo::classifyArgument would be called
twice on vector arguments to vectorcall functions. This function has
side effects to track GPR register usage, and this would lead to
incorrect GPR usage in some cases. The specific case I noticed is from
running out of XMM registers with mixed FP and vector arguments and no
aggregates of any kind. Consider this prototype:
void __vectorcall vectorcall_indirect_vec(
double xmm0, double xmm1, double xmm2, double xmm3, double xmm4,
__m128 xmm5,
__m128 ecx,
int edx,
__m128 mem);
classifyArgument has no effects when called on a plain FP type, but when
called on a vector type, it modifies FreeRegs to model GPR consumption.
However, this should not happen during the vector call first pass.
I refactored the code to unify vectorcall HVA logic with regcall HVA
logic. The conventions pass HVAs in registers differently (expanded vs.
not expanded), but if they do not fit in registers, they both pass them
indirectly by address.
Reviewers: erichkeane, craig.topper
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72110
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Before this patch adding a new /D flag when compiling a source file that consumed a PCH with clang-cl would issue a diagnostic and then fail. With the patch, the diagnostic is still issued but the definition is accepted. This matches the msvc behavior. The fuzzy-pch-msvc.c is a clone of the existing fuzzy-pch.c tests with some msvc specific rework.
msvc diagnostic:
warning C4605: '/DBAR=int' specified on current command line, but was not specified when precompiled header was built
Output of the CHECK-BAR test prior to the code change:
<built-in>(1,9): warning: definition of macro 'BAR' does not match definition in precompiled header [-Wclang-cl-pch]
#define BAR int
^
D:\repos\llvm\llvm-project\clang\test\PCH\fuzzy-pch-msvc.c(12,1): error: unknown type name 'BAR'
BAR bar = 17;
^
D:\repos\llvm\llvm-project\clang\test\PCH\fuzzy-pch-msvc.c(23,4): error: BAR was not defined
# error BAR was not defined
^
1 warning and 2 errors generated.
Reviewers: rnk, thakis, hans, zturner
Subscribers: mikerice, aganea, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72405
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Pass small FP values in GPRs or stack memory according the the normal
convention. This is what gcc -mno-sse does on Win64.
I adjusted the conditions under which we emit an error to check if the
argument or return value would be passed in an XMM register when SSE is
disabled. This has a side effect of no longer emitting an error for FP
arguments marked 'inreg' when targetting x86 with SSE disabled. Our
calling convention logic was already assigning it to FP0/FP1, and then
we emitted this error. That seems unnecessary, we can ignore 'inreg' and
compile it without SSE.
Reviewers: jyknight, aemerson
Differential Revision: https://reviews.llvm.org/D70465
|
|
|
|
| |
- There are typos introduced due to merge.
|
|
|
|
| |
The extload on X87 is free.
|
|
|
|
|
|
|
|
|
|
|
|
| |
mode i64->f32/f64/f80 uint_to_fp algorithm.
This allows us to generate better code for selecting the fixup
to load.
Previously when the sign was set we had to load offset 0. And
when it was clear we had to load offset 4. This required a testl,
setns, zero extend, and finally a mul by 4. By switching the offsets
we can just shift the sign bit into the lsb and multiply it by 4.
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Fix the ViewOpShapeFolder in case of no affine mapping associated with a Memref construct identity mapping.
Reviewers: nicolasvasilache
Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72735
|
|
|
|
|
|
|
|
|
|
|
| |
On Fuchsia, pthread API is emulated on top of C11 thread API. Using C11
thread API directly is more efficient.
While this implementation is only used by Fuchsia at the moment, it's
not Fuchsia specific, and could be used by other platforms that use C11
threads rather than pthreads in the future.
Differential Revision: https://reviews.llvm.org/D64378
|
|
|
|
| |
(clang diagnostic handler for IR input files)
|
|
|
|
|
|
|
|
| |
objects
The reference counter for global objects marked with declare target is INF. This patch prevents the runtime from incrementing /decrementing INF refcounts. Without it, the map(delete: global_object) directive actually deallocates the global on the device. With this patch, such a directive becomes a no-op.
Differential Revision: https://reviews.llvm.org/D72525
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- `dead-mi-elimination` assumes MIR in the SSA form and cannot be
arranged after phi elimination or DeSSA. It's enhanced to handle the
dead register definition by skipping use check on it. Once a register
def is `dead`, all its uses, if any, should be `undef`.
- Re-arrange the DIE in RA phase for AMDGPU by placing it directly after
`detect-dead-lanes`.
- Many relevant tests are refined due to different register assignment.
Reviewers: rampitec, qcolombet, sunfish
Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72709
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit defines a new SPIR-V dialect attribute for specifying
a SPIR-V target environment. It is a dictionary attribute containing
the SPIR-V version, supported extension list, and allowed capability
list. A SPIRVConversionTarget subclass is created to take in the
target environment and sets proper dynmaically legal ops by querying
the op availability interface of SPIR-V ops to make sure they are
available in the specified target environment. All existing conversions
targeting SPIR-V is changed to use this SPIRVConversionTarget. It
probes whether the input IR has a `spv.target_env` attribute,
otherwise, it uses the default target environment: SPIR-V 1.0 with
Shader capability and no extra extensions.
Differential Revision: https://reviews.llvm.org/D72256
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For IR input files, we currently use LLVM diagnostic handler even the
compilation is from clang. As a result, we are not able to use -Rpass
to get the transformation reports. Some warnings are not handled
properly either: We found many mysterious warnings in our ThinLTO backend
compilations in SamplePGO and CSPGO. An example of the warning:
"warning: net/proto2/public/metadata_lite.h:51:21: 0.02% (1 / 4999)"
This turns out to be a warning by Wmisexpect, which is supposed to be
filtered out by default. But since the filter is in clang's
diagnostic hander, we emit these incomplete warnings from LLVM's
diagnostic handler.
This patch uses clang diagnostic handler for IR input files. We create
a fake backendconsumer just to install the diagnostic handler.
With this change, we will have proper handling of all the warnings and we can
use -Rpass* options in IR input files compilation.
Also note that with is patch, LLVM's diagnostic options, like
"-mllvm -pass-remarks=*", are no longer be able to get optimization remarks.
Differential Revision: https://reviews.llvm.org/D72523
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This allows for users to cache printer state, which can be costly to recompute. Each of the IR print methods gain a new overload taking this new state class.
Depends On D72293
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D72294
|
|
|
|
| |
NVPTX does not support RTTI, so disable it by default.
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This was previously disabled as FunctionType TypeAttrs could not be roundtripped in the IR. This has been fixed, so we can now generically print FuncOp.
Depends On D72429
Reviewed By: jpienaar, mehdi_amini
Differential Revision: https://reviews.llvm.org/D72642
|
|
|
|
|
|
|
|
|
|
| |
Allow to build PCH's (with -building-pch-with-obj and the extra .o file)
with -fmodules-codegen -fmodules-debuginfo to allow emitting shared code
into the extra .o file, similarly to how it works with modules. A bit of
a misnomer, but the underlying functionality is the same. This saves up
to 20% of build time here.
Differential Revision: https://reviews.llvm.org/D69778
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
If a header contains 'extern template', then the template should be provided
somewhere by an explicit instantiation, so it is not necessary to generate
a copy. Worse, this can lead to an unresolved symbol, because the codegen's
object file will not actually contain functions from such a template
because of the GVA_AvailableExternally, but the object file for the explicit
instantiation will not contain them either because it will be blocked
by the information provided by the module.
Differential Revision: https://reviews.llvm.org/D69779
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This diff fixes issues with the semantics of linalg.generic on tensors that appeared when converting directly from HLO to linalg.generic.
The changes are self-contained within MLIR and can be captured and tested independently of XLA.
The linalg.generic and indexed_generic are updated to:
To allow progressive lowering from the value world (a.k.a tensor values) to
the buffer world (a.k.a memref values), a linalg.generic op accepts
mixing input and output ranked tensor values with input and output memrefs.
```
%1 = linalg.generic #trait_attribute %A, %B {other-attributes} :
tensor<?x?xf32>,
memref<?x?xf32, stride_specification>
-> (tensor<?x?xf32>)
```
In this case, the number of outputs (args_out) must match the sum of (1) the
number of output buffer operands and (2) the number of tensor return values.
The semantics is that the linalg.indexed_generic op produces (i.e.
allocates and fills) its return values.
Tensor values must be legalized by a buffer allocation pass before most
transformations can be applied. Such legalization moves tensor return values
into output buffer operands and updates the region argument accordingly.
Transformations that create control-flow around linalg.indexed_generic
operations are not expected to mix with tensors because SSA values do not
escape naturally. Still, transformations and rewrites that take advantage of
tensor SSA values are expected to be useful and will be added in the near
future.
Subscribers: bmahjour, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72555
|
|
|
|
|
|
| |
- Prefer `getVectorIdxTy()` as the index operand type for
`EXTRACT_SUBVECTOR` as targets expect different types by overloading
`getVectorIdxTy()`.
|
|
|
|
|
|
| |
There are no special virtual function handlers (like __cxa_pure_virtual)
defined for NVPTX target, so just emit such functions as null pointers
to prevent issues with linking and unresolved references.
|
|
|
|
|
|
|
|
| |
Summary: bfloat16 doesn't have a valid APFloat format, so we have to use double semantics when storing it. This change makes sure that hexadecimal values can be round-tripped properly given this fact.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D72667
|