| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
llvm-svn: 255704
|
|
|
|
|
|
| |
It wants to assert that the subtarget is 64-bit, not the register.
llvm-svn: 255703
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D15493
llvm-svn: 255702
|
|
|
|
| |
llvm-svn: 255701
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch improves on the suggested codegen from PR24475:
https://llvm.org/bugs/show_bug.cgi?id=24475
but only for the fmaxf() case to start, so we can sort out any bugs before
extending to fmin, f64, and vectors.
The fmax / maxnum definitions provide us flexibility for signed zeros, so the
only thing we have to worry about in this replacement sequence is NaN handling.
Note 1: It may be better to implement this as lowerFMAXNUM(), but that exposes
a problem: SelectionDAGBuilder::visitSelect() transforms compare/select
instructions into FMAXNUM nodes if we declare FMAXNUM legal or custom. Perhaps
that should be checking for NaN inputs or global unsafe-math before transforming?
As it stands, that bypasses a big set of optimizations that the x86 backend
already has in PerformSELECTCombine().
Note 2: The v2f32 test reveals another bug; the vector is extended to v4f32, so
we have completely unnecessary operations happening on undef elements of the
vector.
Differential Revision: http://reviews.llvm.org/D15294
llvm-svn: 255700
|
|
|
|
| |
llvm-svn: 255698
|
|
|
|
| |
llvm-svn: 255697
|
|
|
|
| |
llvm-svn: 255696
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is an initial version of the runtime cross-DSO CFI support
library.
It contains a number of FIXMEs, ex. it does not support the
diagnostic mode nor dlopen/dlclose, but it works and can be tested.
Diagnostic mode, in particular, would require some refactoring (we'd
like to gather all CFI hooks in the UBSan library into one function
so that we could easier pass the diagnostic information down to
__cfi_check). It will be implemented later.
Once the diagnostic mode is in, I plan to create a second test
configuration to run all existing tests in both modes. For now, this
patch includes only a few new cross-DSO tests.
llvm-svn: 255695
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Clang-side cross-DSO CFI.
* Adds a command line flag -f[no-]sanitize-cfi-cross-dso.
* Links a runtime library when enabled.
* Emits __cfi_slowpath calls is bitset test fails.
* Emits extra hash-based bitsets for external CFI checks.
* Sets a module flag to enable __cfi_check generation during LTO.
This mode does not yet support diagnostics.
llvm-svn: 255694
|
|
|
|
|
|
|
|
| |
An LTO pass that generates a __cfi_check() function that validates a
call based on a hash of the call-site-known type and the target
pointer.
llvm-svn: 255693
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: I'm not sure how things worked before without this.
Reviewers: arsenm
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D15492
llvm-svn: 255692
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ignoring specific instructions.
(This is the third attempt to check in this patch, and the first two are r255454
and r255460. The once failed test file reg-usage.ll is now moved to
test/Transform/LoopVectorize/X86 directory with target datalayout and target
triple indicated.)
LoopVectorizationCostModel::calculateRegisterUsage() is used to estimate the
register usage for specific VFs. However, it takes into account many
instructions that won't be vectorized, such as induction variables,
GetElementPtr instruction, etc.. This makes the loop vectorizer too conservative
when choosing VF. In this patch, the induction variables that won't be
vectorized plus GetElementPtr instruction will be added to ValuesToIgnore set
so that their register usage won't be considered any more.
Differential revision: http://reviews.llvm.org/D15177
llvm-svn: 255691
|
|
|
|
| |
llvm-svn: 255690
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D15426
llvm-svn: 255689
|
|
|
|
|
|
|
|
|
| |
Patch by: Johan Engelen
Introduce LLVM_LIBRARY_WEAK macro. Define LLVM_LIBRARY_WEAK
and LLVM_LIBRARY_VISIBIITY for MSVC
llvm-svn: 255688
|
|
|
|
|
|
| |
categories
llvm-svn: 255687
|
|
|
|
| |
llvm-svn: 255686
|
|
|
|
| |
llvm-svn: 255685
|
|
|
|
|
|
|
|
|
|
| |
Patch by: Johan Engelen
On windows, opening in text mode will result in
line ending chars to be appended leading to
profile corruption.
llvm-svn: 255684
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch allows GCC 4.6 and above to use `noexcept` as opposed to `throw()`.
Is it an ABI safe change to suddenly switch on `noexcept`? I imagine it must be because it's disabled in w/ clang in C++03 but not C++11.
Reviewers: danalbert, jroelofs, mclow.lists
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D15516
llvm-svn: 255683
|
|
|
|
|
|
|
|
| |
Eventually we may need to sink this include to the .cpp file or
something to suport LLVM_ENABLE_THREADS=OFF, but this solves my
immediate problem of fixing the build.
llvm-svn: 255682
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add instruction patterns for matching load and store instructions with constant
offsets in addresses. The code is fairly redundant due to the need to replicate
everything between imm, tglobaldadr, and texternalsym, but this appears to be
common tablegen practice. The main alternative appears to be to introduce
matching functions with C++ code, but sticking with purely generated matchers
seems better for now.
Also note that this doesn't yet support offsets from getelementptr, which will
be the most common case; that will depend on a change in target-independent code
in order to set the NoUnsignedWrap flag, which I'll submit separately. Until
then, the testcase uses ptrtoint+add+inttoptr with a nuw on the add.
Also implement isLegalAddressingMode with an approximation of this.
Differential Revision: http://reviews.llvm.org/D15538
llvm-svn: 255681
|
|
|
|
| |
llvm-svn: 255680
|
|
|
|
|
|
| |
Patch by B. Sivachandra Reddy!
llvm-svn: 255679
|
|
|
|
| |
llvm-svn: 255678
|
|
|
|
|
|
|
| |
We can clean this up now that we have the X86 CATCHRET instruction to
restore the FP, SP, and BP.
llvm-svn: 255677
|
|
|
|
|
|
|
| |
This allows more specialized formatters to still reuse the results
summarization display from the base class.
llvm-svn: 255676
|
|
|
|
|
|
|
|
|
|
|
| |
This updates clang to use bundle operands to associate an invoke with
the funclet which it is contained within.
Depends on D15517.
Differential Revision: http://reviews.llvm.org/D15518
llvm-svn: 255675
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
SimplifyCFG allows tail merging with code which terminates in
unreachable which, in turn, makes it possible for an invoke to end up in
a funclet which it was not originally part of.
Using operand bundles on invokes allows us to determine whether or not
an invoke was part of a funclet in the source program.
Furthermore, it allows us to unambiguously answer questions about the
legality of inlining into call sites which the personality may have
trouble with.
Differential Revision: http://reviews.llvm.org/D15517
llvm-svn: 255674
|
|
|
|
| |
llvm-svn: 255673
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
instructions
Summary:
We were previously selecting all constant loads to SMRD instructions and legalizing
the SMRDs with non-uniform addresses during the SIFixSGPRCopesPass.
This new solution is more simple and also generates much better code, because
the instruction selector is able to take advantage of all the MUBUF addressing
modes that are legalization pass wasn't able to.
We also no longer need to generate v_add_* instructions when we
have a uniform pointer and a non-uniform offset, as this is now folded into the
MUBUF instruction during instruction selection.
Reviewers: arsenm
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D15425
llvm-svn: 255672
|
|
|
|
| |
llvm-svn: 255671
|
|
|
|
| |
llvm-svn: 255670
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A large number of loop utility functions take a `Pass *` and reach
into it to find out which analyses to preserve. There are a number of
problems with this:
- The APIs have access to pretty well any Pass state they want, so
it's hard to tell what they may or may not do.
- Other APIs have copied these and pass around a `Pass *` even though
they don't even use it. Some of these just hand a nullptr to the API
since the callers don't even have a pass available.
- Passes in the new pass manager don't work like the current ones, so
the APIs can't be used as is there.
Instead, we should explicitly thread the analysis results that we
actually care about through these APIs. This is both simpler and more
reusable.
llvm-svn: 255669
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On SparcV8, doubles get passed in two 32-bit integer registers. The call
code was already handling endianness correctly, but the incoming
argument code was not -- it got the two halves in opposite order.
Also remove some dead code in LowerFormalArguments_32 to handle
less-than-32bit values, which can't actually happen.
Finally, add some test cases for the 32-bit calling convention, cribbed
from the 64abi.ll test, and run for both big and little-endian.
llvm-svn: 255668
|
|
|
|
| |
llvm-svn: 255667
|
|
|
|
| |
llvm-svn: 255665
|
|
|
|
|
|
|
|
|
| |
We only want to emit CFI adjustments when actually using DWARF.
This fixes PR25828.
Differential Revision: http://reviews.llvm.org/D15522
llvm-svn: 255664
|
|
|
|
| |
llvm-svn: 255663
|
|
|
|
|
|
| |
Patch by: Vedran Mileti
llvm-svn: 255662
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D15476
llvm-svn: 255661
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the last general step to allow more IR-level speculation with a safety harness in place in CodeGenPrepare.
The intent is to restore the behavior enabled by:
http://reviews.llvm.org/rL228826
but prevent bad performance such as:
https://llvm.org/bugs/show_bug.cgi?id=24818
Earlier patches in this sequence:
D12882 (disable SimplifyCFG speculation for expensive instructions)
D13297 (have CGP despeculate expensive ops)
D14630 (have CGP despeculate special versions of cttz/ctlz)
As shown in the test cases, we only have two instructions currently affected: ctz for some x86 and fdiv generally.
Allowing exactly one expensive instruction is a bit of a hack, but it lines up with what is currently implemented
in CGP. If we make the despeculation more general in CGP, we can make the speculation here more liberal.
A follow-up patch will adjust the cost for sqrt and possibly other typically expensive math intrinsics (currently
everything is cheap by default). GPU targets would likely want to override those expensive default costs (just as
they probably should already override the cost of div/rem) because just about any math is cheaper than control-flow
on those targets.
Differential Revision: http://reviews.llvm.org/D15213
llvm-svn: 255660
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This change adds support for specifying a weight when merging profile data with the llvm-profdata tool.
Weights are specified by using the --weighted-input=<weight>,<filename> option. Input files not specified
with this option (normal positional list after options) are given a default weight of 1.
Adding support for arbitrary weighting of input profile data allows for relative importance to be placed on the
input data from multiple training runs.
Both sampled and instrumented profiles are supported.
Reviewers: davidxl, dnovillo, bogner, silvas
Subscribers: silvas, davidxl, llvm-commits
Differential Revision: http://reviews.llvm.org/D15306
llvm-svn: 255659
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The LibCallSimplifier will turn llvm.exp2.* intrinsics into ldexp* libcalls
which do not make sense with the AMDGPU backend.
In the long run, we'll want an llvm.ldexp.* intrinsic to properly make use of
this optimization, but this works around the problem for now.
See also: http://reviews.llvm.org/D14327 (suggested llvm.ldexp.* implementation)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92709
Reviewers: arsenm, tstellarAMD
Differential Revision: http://reviews.llvm.org/D14990
llvm-svn: 255658
|
|
|
|
|
|
|
|
|
|
| |
The radeonsi fp64 support can hit these now that some redundant bitcasts
are folded.
Patch by: Michel Dänzer
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 255657
|
|
|
|
|
|
|
|
|
| |
"movl $-1, %eax" is 5 bytes, "xorl %eax, %eax; decl %eax" is 3 bytes.
This commit makes LLVM use the latter when optimizing for size.
Differential Revision: http://reviews.llvm.org/D14971
llvm-svn: 255656
|
|
|
|
| |
llvm-svn: 255655
|
|
|
|
|
|
| |
We now have 252 expected failures.
llvm-svn: 255654
|
|
|
|
| |
llvm-svn: 255653
|