| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
| |
This is to help recommit a fixed version of r310208. As shown in PR34097,
we could miscompile if subtraction of the constants overflowed.
llvm-svn: 310490
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This flag "--fopenmp-ptx=" enables the overwriting of the default PTX version used for GPU offloaded OpenMP target regions: "+ptx42".
Reviewers: arpith-jacob, caomhin, carlo.bertolli, ABataev, Hahnfeld, jlebar, hfinkel, tstellar
Reviewed By: ABataev
Subscribers: rengolin, cfe-commits
Differential Revision: https://reviews.llvm.org/D29660
llvm-svn: 310489
|
| |
|
|
| |
llvm-svn: 310488
|
| |
|
|
|
|
| |
space. This should be a NFC, but it will change how the compiler parses it.
llvm-svn: 310487
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
A similar error message has been removed from the ARMTargetMachineBase
constructor in r306939. With this patch, we generate an error message
for the example below, compiled with -mcpu=cortex-m0, which does not
have ARM execution mode.
__attribute__((target("arm"))) int foo(int a, int b)
{
return a + b % a;
}
__attribute__((target("thumb"))) int bar(int a, int b)
{
return a + b % a;
}
By adding this error message to ARMBaseTargetMachine::getSubtargetImpl,
we can deal with functions that set -thumb-mode in target-features.
At the moment it seems like Clang does not have access to target-feature
specific information, so adding the error message to the frontend will
be harder.
Reviewers: echristo, richard.barton.arm, t.p.northover, rengolin, efriedma
Reviewed By: echristo, efriedma
Subscribers: efriedma, aemerson, javed.absar, kristof.beyls
Differential Revision: https://reviews.llvm.org/D35627
llvm-svn: 310486
|
| |
|
|
|
|
|
|
|
|
|
| |
specifier
Currently, far jmp/call which utilizes a 48bit memory operand would have been invoked via the 'lcall/ljmp' mnemonic (intel style).
This patch align those variants to formal intel spec
Differential Revision: https://reviews.llvm.org/D35846
llvm-svn: 310485
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
target code for NVIDIA GPUs.
Summary: Previously we have added the "-c" flag which gets passed to PTXAS by default to generate relocatable OpenMP target code by default. This set of flags exposes control over this behaviour.
Reviewers: arpith-jacob, caomhin, carlo.bertolli, ABataev, Hahnfeld, jlebar, hfinkel, tstellar
Reviewed By: ABataev
Subscribers: Hahnfeld, rengolin, cfe-commits
Differential Revision: https://reviews.llvm.org/D29659
llvm-svn: 310484
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We noticed when implementing a new pragma that the TableGen-generated function
getAttributeSpellingListIndex() did not work for pragma attributes. It relies
on the values in the enum AttributeList::Syntax and a new value
AS_ContextSensitiveKeyword was added changing the value for AS_Pragma.
Apparently no tests failed since no pragmas currently make use of the
generated function.
To fix this we can move AS_Pragma back to the value that TableGen code expects.
Also to prevent changes in the enum from breaking that routine again I added
calls to getAttributeSpellingListIndex() in the unroll pragma code. That will
cause some lit test failures if the order is changed. I added a comment to
remind of this issue in the future.
This assumes we don’t need/want full TableGen support for
AS_ContextSensitiveKeyword. It currently only appears in getAttrKind and no
other TableGen-generated routines.
Patch by: mikerice
Differential Revision: https://reviews.llvm.org/D36473
llvm-svn: 310483
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: A closing parenthesis followed by a declaration or statement should always terminate the current statement.
Reviewers: djasper
Subscribers: klimek, cfe-commits
Differential Revision: https://reviews.llvm.org/D36491
llvm-svn: 310482
|
| |
|
|
|
|
|
|
|
|
|
| |
The recently improved support for `icmp` in ValueTracking
(r307304) exposes the fact that `isImplied` condition doesn't
really bail out if we hit the recursion limit (and calls
`computeKnownBits` which increases the depth and asserts).
Differential Revision: https://reviews.llvm.org/D36512
llvm-svn: 310481
|
| |
|
|
|
|
|
|
|
|
|
| |
Dot product is an optional ARMv8.2a extension, see also the public architecture
specification here:
https://developer.arm.com/products/architecture/a-profile/exploration-tools.
This patch adds AArch64 assembler support for these dot product instructions.
Differential Revision: https://reviews.llvm.org/D36515
llvm-svn: 310480
|
| |
|
|
|
|
|
|
|
| |
Original Diff: D29642
This patch was previously reverted due to an error with patch D29654
that this depends on.
llvm-svn: 310479
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
not present.
C++14 added user-defined literal support for complex numbers so that you
can write something like "complex<double> val = 2i". However, there is
an existing GNU extension supporting this syntax and interpreting the
result as a _Complex type.
This changes parsing so that such literals are interpreted in terms of
C++14's operators if an overload is present but otherwise falls back to
the original GNU extension.
(We now have more robust diagnostics for implicit conversions so the
libc++ test that caused the original revert still passes).
llvm-svn: 310478
|
| |
|
|
|
|
|
| |
Set the type of TheCall inside SemaBuiltinReserveRWPipe to reduce
duplicated code.
llvm-svn: 310477
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
By removing FeatureNoARM implies ModeThumb, we can detect cases where a
function's target-features contain -thumb-mode (enables ARM codegen for the
function), but the architecture does not support ARM mode. Previously, the
implication caused the FeatureNoARM bit to be cleared for functions with
-thumb-mode, making the assertion in ARMSubtarget::ARMSubtarget [1]
pointless for such functions.
This assertion is the only guard against generating ARM code for
architectures without ARM codegen support. Is there a place where we
could easily generate error messages for the user? At the moment, we
would generate ARM code for Thumb-only architectures. X86 has the same
behavior as ARM, as in it only has an assertion and no error message,
but I think for ARM an error message would be helpful. What do you
think?
For the example below, `llc -mtriple=armv7m-eabi test.ll -o -` will
generate ARM assembler (or fail with an assertion error with this patch).
Note that if we run the resulting assembler through llvm-mc, we get
an appropriate error message, but not when codegen is handled
through clang.
```
define void @bar() #0 {
entry:
ret void
}
attributes #0 = { "target-features"="-thumb-mode" }
```
[1] https://github.com/llvm-mirror/llvm/blob/c1f7b54cef62e9c8aa745d40bea146a167bf844e/lib/Target/ARM/ARMSubtarget.cpp#L147
Reviewers: t.p.northover, rengolin, peter.smith, aadg, silviu.baranga, richard.barton.arm, echristo
Reviewed By: rengolin, echristo
Subscribers: efriedma, aemerson, javed.absar, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D35569
llvm-svn: 310476
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
formatv_object currently uses the implicitly defined move constructor,
but it is buggy. In typical use-cases, the problem doesn't show-up
because all calls to the move constructor are elided. Thus, the buggy
constructors are never invoked.
The issue especially shows-up when code is compiled using the
-fno-elide-constructors compiler flag. For instance, this is useful when
attempting to collect accurate code coverage statistics.
The exact issue is the following:
The Parameters data member is correctly moved, thus making the
parameters occupy a new memory location in the target
object. Unfortunately, the default copying of the Adapters blindly
copies the vector of pointers, leaving each of these pointers
referencing the parameters in the original object instead of the copied
one. These pointers quickly become dangling when the original object is
deleted. This quickly leads to crashes.
The solution is to update the Adapters pointers when performing a move.
The copy constructor isn't useful for format objects and can thus be
deleted.
This resolves PR33388.
Differential Revision: https://reviews.llvm.org/D34463
llvm-svn: 310475
|
| |
|
|
| |
llvm-svn: 310474
|
| |
|
|
| |
llvm-svn: 310473
|
| |
|
|
|
|
|
|
|
|
|
| |
pointers aliases
MS InlineAsm Dot operator accepts "Bases" such as "this" (cpp) and class/struct pointer typedef.
This patch enhance its implementation with this behavior.
Differential Revision: https://reviews.llvm.org/D36450
llvm-svn: 310472
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
managed memory.
This pass is useful to automatically convert a codebase that uses malloc/free
to use their managed memory counterparts.
Currently, rewrite malloc and free to the `polly_{malloc,free}Managed` variants.
A future patch will teach ManagedMemoryRewrite to rewrite global arrays
as pointers to globally allocated managed memory.
Differential Revision: https://reviews.llvm.org/D36513
llvm-svn: 310471
|
| |
|
|
|
|
|
| |
It accidentally std::move'd from Value parameter if it deduced to an
l-value ref.
llvm-svn: 310470
|
| |
|
|
| |
llvm-svn: 310469
|
| |
|
|
|
|
| |
Patch by: Reka Nikolett Kovacs
llvm-svn: 310468
|
| |
|
|
|
|
|
|
| |
Adopt a more strict approach regarding what marks should/can appear after a destination register, when operating upon an AVX512 platform.
Differential Revision: https://reviews.llvm.org/D35785
llvm-svn: 310467
|
| |
|
|
|
|
|
|
|
|
| |
Codegen with -polly-parallel queried the unmapped MemoryAccess, but only
the MemoryKind after mapping is relevant for codegen.
This should fix various fails of the
perf-x86_64-penryn-O3-polly-parallel-fast buildbot.
llvm-svn: 310466
|
| |
|
|
|
|
|
|
|
| |
The previous value of "polly-delicm" was forgotten to to be changed when
ForwardOpTree was split from DeLICM.
Thanks to Tobias for noticing!
llvm-svn: 310465
|
| |
|
|
|
|
|
|
|
| |
constructors when deciding whether classes should be passed indirectly."
This reverts commit r310401 because it seems to have broken some ARM
bot(s).
llvm-svn: 310464
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
isLegalAddressingMode() has recently gained the extra optional Instruction*
parameter, and therefore it can now do the job that previously only
isFoldableMemAccess() could do.
The SystemZ implementation of isLegalAddressingMode() has gained the
functionality of checking for offsets, which used to be done with
isFoldableMemAccess().
The isFoldableMemAccess() hook has been removed everywhere.
Review: Quentin Colombet, Ulrich Weigand
https://reviews.llvm.org/D35933
llvm-svn: 310463
|
| |
|
|
|
|
|
|
| |
In the recursive call to isAMCompletelyFolded(), the passed offset should be
the sum of F.BaseOffset and Fixup.Offset.
Review: Quentin Colombet.
llvm-svn: 310462
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
isl_error_quota proof.
distributeDomain() and filterKnownValInst() are used in a scop
of ForwardOpTree that limits the number of isl operations.
Therefore some isl functions may return null after any operation.
Remove assertion that assume non-null results and handle
isl_*_foreach returning isl::stat::error.
I hope this fixes the crash of the asop buildbot at ihevc_recon.c.
llvm-svn: 310461
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Assert that a binary expression is actually a binary expression,
rather than potientially incorrectly attempting to handle it as a
unary expression.
This resolves PR34083.
Thanks to Simonn Pilgrim for reporting the issue!
llvm-svn: 310460
|
| |
|
|
| |
llvm-svn: 310459
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
pragmas
Summary:
This handles a case where the trailing '*/' of a multiline jsdoc eding in a
comment pragma wouldn't be put on a new line.
Reviewers: mprobst
Reviewed By: mprobst
Subscribers: cfe-commits, klimek
Differential Revision: https://reviews.llvm.org/D36359
llvm-svn: 310458
|
| |
|
|
|
|
|
|
|
|
|
|
| |
The '#' token is not a comment for all targets (on ARM and AArch64 it marks an
immediate operand), so we shouldn't treat it as such.
Comments are already converted to AsmToken::EndOfStatement by
AsmLexer::LexLineComment, so this check was unnecessary.
Differential Revision: https://reviews.llvm.org/D36405
llvm-svn: 310457
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
to Nodes when removing ref edges from a RefSCC.
This map based association turns out to be pretty expensive for large
RefSCCs and pointless as we already have embedded data members inside
nodes that we use to track the DFS state. We can reuse one of those and
the map becomes unnecessary.
This also fuses the update of those numbers into the scan across the
pending stack of nodes so that we don't walk the nodes twice during the
DFS.
With this I expect the new PM to be faster than the old PM for the test
case I have been optimizing. That said, it also seems simpler and more
direct in many ways. The side storage was always pretty awkward.
The last remaining hot-spot in the profile of the LCG once this is done
will be the edge iterator walk in the DFS. I'll take a look at improving
that next.
llvm-svn: 310456
|
| |
|
|
|
| |
Suggested-by: Hongbin Zheng <etherzhhb@gmail.com>
llvm-svn: 310455
|
| |
|
|
|
|
|
| |
No need to create an OptimizationRemarkMissed object if we are not going
to use it anyway.
llvm-svn: 310454
|
| |
|
|
| |
llvm-svn: 310453
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The available platform list was previously only accessible via the
`platform list` command, this patch makes it possible to access that
list via the SBDebugger API. The active platform list has likewise
been exposed via the SBDebugger API.
Differential Revision: https://reviews.llvm.org/D35760
llvm-svn: 310452
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
that RefSCC still connected.
This is common and can be handled much more efficiently. As soon as we
know we've covered every node in the RefSCC with the DFS, we can simply
reset our state and return. This avoids numerous data structure updates
and other complexity.
On top of other changes, this appears to get new PM back to parity with
the old PM for a large protocol buffer message source code. The dense
map updates are very hot in this function.
llvm-svn: 310451
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
limited batch updates.
Specifically, allow removing multiple reference edges starting from
a common source node. There are a few constraints that play into
supporting this form of batching:
1) The way updates occur during the CGSCC walk, about the most we can
functionally batch together are those with a common source node. This
also makes the batching simpler to implement, so it seems
a worthwhile restriction.
2) The far and away hottest function for large C++ files I measured
(generated code for protocol buffers) showed a huge amount of time
was spent removing ref edges specifically, so it seems worth focusing
there.
3) The algorithm for removing ref edges is very amenable to this
restricted batching. There are just both API and implementation
special casing for the non-batch case that gets in the way. Once
removed, supporting batches is nearly trivial.
This does modify the API in an interesting way -- now, we only preserve
the target RefSCC when the RefSCC structure is unchanged. In the face of
any splits, we create brand new RefSCC objects. However, all of the
users were OK with it that I could find. Only the unittest needed
interesting updates here.
How much does batching these updates help? I instrumented the compiler
when run over a very large generated source file for a protocol buffer
and found that the majority of updates are intrinsically updating one
function at a time. However, nearly 40% of the total ref edges removed
are removed as part of a batch of removals greater than one, so these
are the cases batching can help with.
When compiling the IR for this file with 'opt' and 'O3', this patch
reduces the total time by 8-9%.
Differential Revision: https://reviews.llvm.org/D36352
llvm-svn: 310450
|
| |
|
|
|
|
|
|
|
|
| |
statements
Patch by: Reka Nikolett Kovacs
Differential Revision: https://reviews.llvm.org/D36407
llvm-svn: 310449
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Previously, we used to compute this with `elementSizeInBits / 8`. This
would yield an element size of 0 when the array had element size < 8 in
bits.
To fix this, ask data layout what the size in bytes should be.
Differential Revision: https://reviews.llvm.org/D36459
llvm-svn: 310448
|
| |
|
|
|
|
| |
I don't know if this really affects anything. Just thought it was weird that we had all of the ADD/SUB/AND/OR/XOR instructions.
llvm-svn: 310447
|
| |
|
|
| |
llvm-svn: 310446
|
| |
|
|
|
|
|
| |
"error: unable to create target: 'No available targets are compatible
with this triple.'"
llvm-svn: 310445
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
It was timing out on this test, but for reasons unrelated to the
specific bug it was testing for. Randomly breaking in gdb with `clang
-target i686-windows -fmsc-version=1700` reveals *many* frames from
MicrosoftCXXNameMangler. So, it would seem that some caching is needed
there, as well...
Fingers crossed that specifying a triple is sufficient to work around
this.
llvm-svn: 310444
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
It is possible that dependent instruction may access memory.
In this case we must reject optimization because the memory change will
be visible in null handler basic block. So we will execute an instruction which
we must not execute if check fails.
Reviewers: sanjoy, reames
Reviewed By: sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36392
llvm-svn: 310443
|
| |
|
|
|
|
|
|
| |
For some reason I didn't see this failure the first time. The
output format changed slightly, so we just have to update the
test for the new format.
llvm-svn: 310442
|
| |
|
|
| |
llvm-svn: 310441
|