| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
Requested by: Jim Grosbach.
llvm-svn: 245963
|
|
|
|
| |
llvm-svn: 245957
|
|
|
|
| |
llvm-svn: 245955
|
|
|
|
|
|
| |
Code review feedback by Charlie Turner.
llvm-svn: 245954
|
|
|
|
|
|
|
|
|
|
|
| |
loop iterations check.
The loop minimum iterations check below ensures the loop has enough trip count so the generated
vector loop will likely be executed, and it covers the overflow check.
Differential Revision: http://reviews.llvm.org/D12107.
llvm-svn: 245952
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a follow-on from the discussion in http://reviews.llvm.org/D12154.
This change allows memset/memcpy to use SSE or AVX memory accesses for any chip that has
generally fast unaligned memory ops.
A motivating use case for this change is a clang invocation that doesn't explicitly set
the CPU, but does target a feature that we know only exists on a CPU that supports fast
unaligned memops. For example:
$ clang -O1 foo.c -mavx
This resolves a difference in lowering noted in PR24449:
https://llvm.org/bugs/show_bug.cgi?id=24449
Before this patch, we used different store types depending on whether the example can be
lowered as a memset or not.
Differential Revision: http://reviews.llvm.org/D12288
llvm-svn: 245950
|
|
|
|
|
|
|
|
|
|
|
| |
Eventually, we will need sample profiles to be incorporated into the
inliner's cost models. To do this, we need the sample profile pass to
be a module pass.
This patch makes no functional changes beyond the mechanical adjustments
needed to run SampleProfile as a module pass.
llvm-svn: 245940
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
While introducing support for MinVersionLoadCommand in llvm-readobj I noticed there's
no API to extract Major/Minor/Update components conveniently. Currently consumers
do the bit twiddling on their own, but this will change from now on.
I'll convert llvm-objdump (and llvm-readobj) in a later commit.
Differential Revision: http://reviews.llvm.org/D12282
Reviewed by: rafael
llvm-svn: 245938
|
|
|
|
|
|
|
| |
As of r245924, _ftol2 is no longer used for fptoui on MS platforms.
Remove the dead code associated with it.
llvm-svn: 245925
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes two issues in x86 fptoui lowering.
1) Makes conversions from f80 go through the right path on AVX-512.
2) Implements an inline sequence for fptoui i64 instead of a library
call. This improves performance by 6X on SSE3+ and 3X otherwise.
Incidentally, it also removes the use of ftol2 for fptoui, which was
wrong to begin with, as ftol2 converts to a signed i64, producing
wrong results for values >= 2^63.
Patch by: mitch.l.bodart@intel.com
Differential Revision: http://reviews.llvm.org/D11316
llvm-svn: 245924
|
|
|
|
| |
llvm-svn: 245921
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It doesn't solve the problem, when for example we load something, and
then assume that it is the same as some constant value, because
globalopt will fail on unknown load instruction. The proposed solution
would be to skip some instructions that we can't evaluate and they are
safe to skip (f.e. load, assume and many others) and see if they are
required to perform optimization (f.e. we don't care about ephemeral
instructions that may appear using @llvm.assume())
http://reviews.llvm.org/D12266
llvm-svn: 245919
|
|
|
|
|
|
|
|
| |
This reverts commit 433bfd94e4b7e3cc3f8b08f8513ce47817941b0c.
Broke some bot, have to see why it passed locally.
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 245917
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We removed access to the DataLayout on the TargetMachine and
deprecated the C API function LLVMGetTargetMachineData() in r243114.
However the way I tried to be backward compatible was broken: I
changed the wrapper of the TargetMachine to be a structure that
includes the DataLayout as well. However the TargetMachine is also
wrapped by the ExecutionEngine, in the more classic way. A client
using the TargetMachine wrapped by the ExecutionEngine and trying
to get the DataLayout would break.
It seems tricky to solve the problem completely in the C API
implementation. This patch tries to address this backward
compatibility in a more lighter way in the C++ API. The C API is
restored in its original state and the removed C++ API is
reintroduced, but privately. The C API is friended to the
TargetMachine and should be the only consumer for this API.
Reviewers: ributzka
Differential Revision: http://reviews.llvm.org/D12263
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 245916
|
|
|
|
|
|
|
|
| |
We might end up with a trivial copy as the addend, and if so, we should ignore
the corresponding FMA instruction. The trivial copy can be coalesced away later,
so there's nothing to do here. We should not, however, assert. Fixes PR24544.
llvm-svn: 245907
|
|
|
|
|
|
|
|
| |
Apparently std::vector::erase(const_iterator) (as opposed to the
non-const iterator) is a part of C++11 but it seems this is not available
on all the buildbots.
llvm-svn: 245900
|
|
|
|
| |
llvm-svn: 245899
|
|
|
|
| |
llvm-svn: 245898
|
|
|
|
| |
llvm-svn: 245896
|
|
|
|
| |
llvm-svn: 245895
|
|
|
|
| |
llvm-svn: 245893
|
|
|
|
|
|
|
|
|
|
|
| |
This change moves LTOCodeGenerator's ownership of the merged module to a
field of type std::unique_ptr<Module>. This helps simplify parts of the code
and clears the way for the module to be consumed by LLVM CodeGen (see D12132
review comments).
Differential Revision: http://reviews.llvm.org/D12205
llvm-svn: 245891
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Support function calls.
Reviewers: sunfish, sunfishcode
Subscribers: sunfishcode, jfb, llvm-commits
Differential revision: http://reviews.llvm.org/D12219
llvm-svn: 245887
|
|
|
|
|
|
|
|
|
|
| |
Summary: I forgot to squash git commits before doing an svn dcommit of D12219. Reverting, and re-submitting.
Subscribers: jfb, llvm-commits
Differential Revision: http://reviews.llvm.org/D12298
llvm-svn: 245886
|
|
|
|
| |
llvm-svn: 245883
|
|
|
|
| |
llvm-svn: 245882
|
|
|
|
| |
llvm-svn: 245875
|
|
|
|
|
|
|
|
| |
change.
Also convert a few loops to range-for loops and correct a comment.
llvm-svn: 245874
|
|
|
|
| |
llvm-svn: 245872
|
|
|
|
| |
llvm-svn: 245869
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes PR24546, which demonstrates a segfault during the VSX
swap removal pass. The problem is that debug value instructions were
not excluded from the list of instructions to be analyzed for webs of
related computation. I've added the test case from the PR as a crash
test in test/CodeGen/PowerPC.
llvm-svn: 245862
|
|
|
|
| |
llvm-svn: 245860
|
|
|
|
| |
llvm-svn: 245859
|
|
|
|
| |
llvm-svn: 245853
|
|
|
|
| |
llvm-svn: 245852
|
|
|
|
| |
llvm-svn: 245851
|
|
|
|
|
|
|
|
|
| |
This patch adds support for dfsan on aarch64-linux with 42-bit VMA
(current default config for 64K pagesize kernels). The support is
enabled by defining the SANITIZER_AARCH64_VMA to 42 at build time
for both clang/llvm and compiler-rt. The default VMA is 39 bits.
llvm-svn: 245840
|
|
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D12151
llvm-svn: 245835
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The FP16_TO_FP node only uses the bottom 16 bits of its input, so the
following pattern can be optimised by removing the AND:
(FP16_TO_FP (AND op, 0xffff)) -> (FP16_TO_FP op)
This is a common pattern for ARM targets when functions have __fp16
arguments, as they are passed as floats (so that they get passed in the
correct registers), but then bitcast and truncated to ignore the top 16
bits.
llvm-svn: 245832
|
|
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D12232
llvm-svn: 245830
|
|
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D12230
llvm-svn: 245829
|
|
|
|
| |
llvm-svn: 245825
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
TL-DR: SROA is followed by EarlyCSE which requires the DominatorTree.
There is no reason not to require it up-front for SROA.
Some history is necessary to understand why we ended-up here.
r123437 switched the second (Legacy)SROA in the optimizer pipeline to
use SSAUpdater in order to avoid recomputing the costly
DominanceFrontier. The purpose was to speed-up the compile-time.
Later r123609 removed the need for the DominanceFrontier in
(Legacy)SROA.
Right after, some cleanup was made in r123724 to remove any reference
to the DominanceFrontier. SROA existed in two flavors: SROA_SSAUp and
SROA_DT (the latter replacing SROA_DF).
The second argument of `createScalarReplAggregatesPass` was renamed
from `UseDomFrontier` to `UseDomTree`.
I believe this is were a mistake was made. The pipeline was not
updated and the call site was still:
PM->add(createScalarReplAggregatesPass(-1, false));
At that time, SROA was immediately followed in the pipeline by
EarlyCSE which required alread the DominatorTree. Not requiring
the DominatorTree in SROA didn't save anything, but unfortunately
it was lost at this point.
When the new SROA Pass was introduced in r163965, I believe the goal
was to have an exact replacement of the existing SROA, this bug
slipped through.
You can see currently:
$ echo "" | clang -x c++ -O3 -c - -mllvm -debug-pass=Structure
...
...
FunctionPass Manager
SROA
Dominator Tree Construction
Early CSE
After this patch:
$ echo "" | clang -x c++ -O3 -c - -mllvm -debug-pass=Structure
...
...
FunctionPass Manager
Dominator Tree Construction
SROA
Early CSE
This improves the compile time from 88s to 23s for PR17855.
https://llvm.org/bugs/show_bug.cgi?id=17855
And from 113s to 12s for PR16756
https://llvm.org/bugs/show_bug.cgi?id=16756
Reviewers: chandlerc
Differential Revision: http://reviews.llvm.org/D12267
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 245820
|
|
|
|
|
|
| |
Just a cosmetic change, no functionality change is intended.
llvm-svn: 245818
|
|
|
|
|
|
| |
Minor generalization of D12125 - peek through any bitcast to the original vector that we're extracting from.
llvm-svn: 245814
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
LoadedObjectInfo.
Not only do we not need to do anything to read correct values from the
object files, but the current logic actually wrongly applies twice the
section base address when there is no LoadedObjectInfo passed to the
DWARFContext creation (as the added test shows).
Simply do not apply any relocations on the mach-o debug info if there is
no load offset to apply.
llvm-svn: 245807
|
|
|
|
|
|
|
| |
Reported by coverity.
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 245800
|
|
|
|
|
|
|
| |
Reported by coverity.
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 245799
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
WinEHPrepare is going to require that cleanuppad and catchpad produce values
of token type which are consumed by any cleanupret or catchret exiting the
pad. This change updates the signatures of those operators to require/enforce
that the type produced by the pads is token type and that the rets have an
appropriate argument.
The catchpad argument of a `CatchReturnInst` must be a `CatchPadInst` (and
similarly for `CleanupReturnInst`/`CleanupPadInst`). To accommodate that
restriction, this change adds a notion of an operator constraint to both
LLParser and BitcodeReader, allowing appropriate sentinels to be constructed
for forward references and appropriate error messages to be emitted for
illegal inputs.
Also add a verifier rule (noted in LangRef) that a catchpad with a catchpad
predecessor must have no other predecessors; this ensures that WinEHPrepare
will see the expected linear relationship between sibling catches on the
same try.
Lastly, remove some superfluous/vestigial casts from instruction operand
setters operating on BasicBlocks.
Reviewers: rnk, majnemer
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D12108
llvm-svn: 245797
|
|
|
|
|
|
|
|
| |
There was already a good error path for this. Added a test for it & made
a minor code change to ensure the error path was actually reached,
rather than crashing before we got that far.
llvm-svn: 245795
|