| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
llvm-svn: 220973
|
|
|
|
|
|
|
|
|
|
| |
Tested this by #if 0'ing out the pthreads implementation, which
showed that this fallback did not compile; applying this patch
resolves that.
Patch by Andy Chien.
llvm-svn: 220969
|
|
|
|
|
|
| |
Noticed in post-commit review by Adrian Prantl.
llvm-svn: 220967
|
|
|
|
| |
llvm-svn: 220964
|
|
|
|
|
|
|
|
|
| |
a CMP which defines the flags used by B.CC.
http://reviews.llvm.org/D6047
Patch by Zhaoshi Zheng <zhaoshiz@codeaurora.org>!
llvm-svn: 220961
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In a case where we have a no {un,}signed wrap flag on the increment, if
RHS - Start is constant then we can avoid inserting a max operation between
the two, since we can statically determine which is greater.
This allows us to unroll loops such as:
void testcase3(int v) {
  for (int i=v; i<=v+1; ++i)
    f(i);
}
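As a rough illustration of why a constant RHS - Start makes the max unnecessary, here is a minimal C++ sketch. It is not the ScalarEvolution code: the helper names `tripCountFromDiff` and `countIterations` are hypothetical, and the sketch assumes the increment cannot wrap.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: trip count of `for (i = Start; i <= RHS; ++i)`
// when Diff = RHS - Start is a known constant and the increment is known
// not to wrap. The sign of Diff is known statically, so no runtime
// max(Start, RHS) is required.
constexpr uint64_t tripCountFromDiff(int64_t Diff) {
  return Diff < 0 ? 0 : static_cast<uint64_t>(Diff) + 1;
}

// Brute-force reference: run the loop and count the iterations.
// 64-bit arithmetic is used so Start + Diff cannot overflow here.
inline uint64_t countIterations(int32_t Start, int32_t Diff) {
  const int64_t End = static_cast<int64_t>(Start) + Diff;
  uint64_t N = 0;
  for (int64_t I = Start; I <= End; ++I)
    ++N;
  return N;
}
```

For testcase3 above, Diff is 1, so the trip count is 2 for every v for which v+1 does not overflow, which is what makes the loop a candidate for full unrolling.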
llvm-svn: 220960
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since block address values can be larger than 2GB in 64-bit code, they
cannot be loaded simply using an @l / @ha pair, but instead must be
loaded from the TOC, just like GlobalAddress, ConstantPool, and
JumpTable values are.
The commit also fixes a bug in PPCLinuxAsmPrinter::doFinalization where
temporary labels could not be used as TOC values, since code would
attempt (and fail) to use GetOrCreateSymbol to create a symbol of the
same name as the temporary label.
llvm-svn: 220959
|
|
|
|
|
|
|
| |
Do a better job classifying symbols. This increases the consistency
between the COFF handling code and the ELF side of things.
llvm-svn: 220952
|
|
|
|
| |
llvm-svn: 220949
|
|
|
|
|
|
| |
Initial patch by Oleg Ranevskyy.
llvm-svn: 220945
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
r212242 introduced a legalizer hook, originally to let AArch64 widen
v1i{32,16,8} rather than scalarize, because the legalizer expected, when
scalarizing the result of a conversion operation, to already have
scalarized the operands. On AArch64, v1i64 is legal, so that commit
ensured operations such as v1i32 = trunc v1i64 wouldn't assert.
It did that by choosing to widen v1 types whenever possible. However,
v1i1 types, for which there's no legal widened type, would still trigger
the assert.
This commit fixes that, by only scalarizing a trunc's result when the
operand has already been scalarized, and introducing an extract_elt
otherwise.
This is similar to r205625.
Fixes PR20777.
llvm-svn: 220937
|
|
|
|
| |
llvm-svn: 220936
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Earlier this summer I fixed an issue where we were incorrectly combining
multiple loads that had different constraints such as alignment, invariance,
temporality, etc. Apparently in one case I made a copy-paste error and swapped
alignment and invariance.
Tests included.
rdar://18816719
llvm-svn: 220933
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
to initialize the ManagedStatic mutex.
Summary:
This patch adds an llvm_call_once which is a wrapper around std::call_once on platforms where it is available and devoid of bugs. The patch also migrates the ManagedStatic mutex to be allocated using llvm_call_once.
These changes are philosophically equivalent to the changes added in r219638, which were reverted due to a hang on Win32 which was the result of a bug in the Windows implementation of std::call_once.
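The once-only initialization idea can be sketched as follows. This is not LLVM's llvm_call_once: `OnceFlag`, `callOnce` and `getGlobalMutex` are hypothetical names, and the sketch uses a plain atomic flag as a stand-in for platforms where std::call_once cannot be relied on.

```cpp
#include <atomic>
#include <mutex>

// Hypothetical once-flag; not LLVM's type.
struct OnceFlag {
  // 0 = not started, 1 = initializer running, 2 = finished.
  std::atomic<int> State{0};
};

// Run Init exactly once per flag, even if several threads race here.
template <typename Fn>
void callOnce(OnceFlag &Flag, Fn Init) {
  int Expected = 0;
  if (Flag.State.compare_exchange_strong(Expected, 1)) {
    Init();
    Flag.State.store(2);
    return;
  }
  // Another caller claimed the flag; wait until its initializer is done.
  while (Flag.State.load() != 2) {
  }
}

// Lazily allocate one process-wide mutex, in the spirit of the
// ManagedStatic mutex described above. Intentionally never freed.
inline std::mutex *getGlobalMutex() {
  static OnceFlag Flag;
  static std::mutex *Mutex = nullptr;
  callOnce(Flag, [] { Mutex = new std::mutex(); });
  return Mutex;
}
```

A production version would also need to decide what happens when the initializer throws or re-enters `callOnce`; the sketch ignores both cases.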
Reviewers: aaron.ballman, chapuni, chandlerc, rnk
Reviewed By: rnk
Subscribers: majnemer, llvm-commits
Differential Revision: http://reviews.llvm.org/D5922
llvm-svn: 220932
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The langref says:
LLVM explicitly allows declarations of global variables to be marked
constant, even if the final definition of the global is not. This
capability can be used to enable slightly better optimization of the
program, but requires the language definition to guarantee that
optimizations based on the ‘constantness’ are valid for the
translation units that do not include the definition.
Given that definition, when merging two declarations, we have to drop
constantness if one of them is not marked constant, since the Module
without the constant marker might not have the necessary guarantees.
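The merging rule amounts to a logical AND of the two markers. A minimal sketch, using a hypothetical `GlobalVarDecl` struct rather than the real linker data structures:

```cpp
// Hypothetical stand-in for a global variable declaration in one module.
struct GlobalVarDecl {
  bool IsConstant;
};

// When two declarations are merged, the result may only be constant if
// both inputs were: the module without the marker may lack the guarantees
// that optimizations based on constantness depend on.
inline GlobalVarDecl mergeDeclarations(const GlobalVarDecl &A,
                                       const GlobalVarDecl &B) {
  return GlobalVarDecl{A.IsConstant && B.IsConstant};
}
```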
llvm-svn: 220927
|
|
|
|
|
|
|
|
|
|
|
| |
If we load from a location with range metadata, we can use information about the ranges of the loaded value for optimization purposes. This helps to remove redundant checks and canonicalize checks for other optimization passes. This particular patch checks whether a value is known to be non-zero from the range metadata.
Currently, these tests are against InstCombine. In theory, all of these should be InstSimplify since we're not inserting any new instructions. Moving the code may follow in a separate change.
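The non-zero check can be sketched like this. It is a simplified model, not the ValueTracking code: `Range`, `rangeContains` and `isKnownNonZero` are hypothetical names, values are treated as 64-bit bit patterns, and each range is assumed to be a half-open [Lo, Hi) pair that is allowed to wrap, as range metadata permits.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical model of one range metadata pair: half-open [Lo, Hi),
// with Lo != Hi. Lo > Hi means the range wraps around.
struct Range {
  uint64_t Lo;
  uint64_t Hi;
};

inline bool rangeContains(const Range &R, uint64_t V) {
  if (R.Lo < R.Hi)
    return R.Lo <= V && V < R.Hi;
  // Wrapped range: covers [Lo, max] and [0, Hi).
  return V >= R.Lo || V < R.Hi;
}

// The loaded value is known non-zero if none of its ranges contains 0.
inline bool isKnownNonZero(const std::vector<Range> &Ranges) {
  if (Ranges.empty())
    return false; // No metadata: nothing is known.
  for (const Range &R : Ranges)
    if (rangeContains(R, 0))
      return false;
  return true;
}
```

With this, a load annotated with the range [1, 10) is known non-zero, so a later comparison of the loaded value against zero can be folded.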
Reviewed by: Hal
Differential Revision: http://reviews.llvm.org/D5947
llvm-svn: 220925
|
|
|
|
|
|
| |
when inlining two calls to the same function from the same call site.
llvm-svn: 220923
|
|
|
|
|
|
|
| |
This fixes the autobuilders I broke with a recent patch. Thanks echristo
and dblaikie for beating me with a clue stick.
llvm-svn: 220918
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch finishes up support for handling sampling profiles in both
text and binary formats. The new binary format uses uleb128 encoding to
represent numeric values. This makes profile files about 25% smaller.
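For readers unfamiliar with uleb128, a minimal encoder and decoder might look like the sketch below. The function names are chosen for illustration and the code is not taken from the profile writer; the decoder assumes well-formed input of at most ten bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Encode Value as ULEB128: 7 payload bits per byte, least significant
// group first, with the high bit set on every byte except the last.
inline std::vector<uint8_t> encodeULEB128(uint64_t Value) {
  std::vector<uint8_t> Out;
  do {
    uint8_t Byte = Value & 0x7f;
    Value >>= 7;
    if (Value != 0)
      Byte |= 0x80; // More bytes follow.
    Out.push_back(Byte);
  } while (Value != 0);
  return Out;
}

// Decode a single ULEB128 value from the start of Bytes.
// Assumes well-formed input (no more than ten bytes).
inline uint64_t decodeULEB128(const std::vector<uint8_t> &Bytes) {
  uint64_t Value = 0;
  unsigned Shift = 0;
  for (uint8_t Byte : Bytes) {
    Value |= static_cast<uint64_t>(Byte & 0x7f) << Shift;
    if ((Byte & 0x80) == 0)
      break; // Last byte of this value.
    Shift += 7;
  }
  return Value;
}
```

Small counters fit in a single byte (anything below 128), which is where the size saving over fixed-width integers comes from.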
The profile writer class can write profiles in the existing text and the
new binary format. In subsequent patches, I will add the capability to
read (and perhaps write) profiles in the gcov format used by GCC.
Additionally, I will be adding support in llvm-profdata to manipulate
sampling profiles.
There was a bit of refactoring needed to separate some code that was in
the reader files, but is actually common to both the reader and writer.
The new test checks that reading the same profile encoded as text or
raw produces the same results.
Reviewers: bogner, dexonsmith
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D6000
llvm-svn: 220915
|
|
|
|
|
|
|
| |
Refactored through AVX512_maskable
llvm-svn: 220908
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The previous calling convention prevented custom functions from being able
to access argument labels unless it knew how many variadic arguments there
were, and of which type. This restriction made it impossible to correctly
model functions in the printf family, as it is legal to pass more arguments
than required to those functions. We now pass arguments in the following order:
non-vararg arguments
labels for non-vararg arguments
[if vararg function, pointer to array of labels for vararg arguments]
[if non-void function, pointer to label for return value]
vararg arguments
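As an illustration of the ordering listed above, here is a hypothetical custom wrapper for a variadic function `int sum(int n, ...)`. The names `label_t` and `customSum` are made up for this sketch, and combining labels with bitwise OR is a simplification, not how DataFlowSanitizer unions labels.

```cpp
#include <cstdarg>
#include <cstdint>

typedef uint16_t label_t; // Hypothetical label type for this sketch.

// Parameter order follows the convention described above:
//   non-vararg arguments, their labels, pointer to the array of vararg
//   labels, pointer to the return-value label, then the varargs.
inline int customSum(int N, label_t NLabel, label_t *VaLabels,
                     label_t *RetLabel, ...) {
  va_list Args;
  va_start(Args, RetLabel);
  int Total = 0;
  label_t Combined = NLabel;
  for (int I = 0; I < N; ++I) {
    Total += va_arg(Args, int);
    Combined |= VaLabels[I]; // Simplified label union (assumption).
  }
  va_end(Args);
  *RetLabel = Combined;
  return Total;
}
```

Because the labels arrive through a pointer ahead of the variadic part, the wrapper can read exactly the labels it needs without knowing how many extra arguments the caller passed.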
Differential Revision: http://reviews.llvm.org/D6028
llvm-svn: 220906
|
|
|
|
| |
llvm-svn: 220884
|
|
|
|
|
|
| |
extra live range interference
llvm-svn: 220872
|
|
|
|
|
|
|
|
|
| |
VMULP*, VDIVP*, VMAXP*, VMINP*)
Refactored through AVX512_maskable
Added encoding tests for them.
llvm-svn: 220858
|
|
|
|
| |
llvm-svn: 220857
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Added LLVM libraries required for IntelJITEvents to LLVMBuild.txt.
* Removed 'jit' library from llvm-jitlistener.
* Added support for OptionalLibraries to llvm-build cmake files generator.
Patch by aleksey.a.bader@intel.com
Differential Revision: http://reviews.llvm.org/D5646
llvm-svn: 220848
|
|
|
|
|
|
| |
Patch by Gabriel Radanne <drupyog@zoho.com>.
llvm-svn: 220817
|
|
|
|
|
|
| |
Patch by Gabriel Radanne <drupyog@zoho.com>.
llvm-svn: 220814
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This restores the commit from SVN r219899 with an additional change to ensure
that the CodeGen is correct for the case that was identified as being incorrect
(originally PR7272).
In the case that during inlining we need to synthesize a value on the stack
(i.e. for passing a value byval), then any function involving that alloca must
be stripped of its tailness as the restriction that it does not access the
parent's stack no longer holds. Unfortunately, a single alloca can cause a
rippling effect throughout the inlining as the value may be aliased or may be
mutated through an escaped external call. As such, we simply track if an alloca
has been introduced in the frame during inlining, and strip any tail calls.
llvm-svn: 220811
|
|
|
|
|
|
| |
No functional change
llvm-svn: 220808
|
|
|
|
|
|
| |
Refactored through AVX512_maskable
llvm-svn: 220806
|
|
|
|
|
|
| |
Refactored multiclass through AVX512_maskable
llvm-svn: 220783
|
|
|
|
|
|
|
|
| |
Now cmp intrinsics are lowered like other intrinsics, through VSELECT, and the VSELECT is then transformed to AND in PerformSELECTCombine.
No functional change.
llvm-svn: 220779
|
|
|
|
|
|
|
|
|
| |
value type and true value is all ones or false value is all zeros.
This transformation worked only if the selector was produced by SETCC; however, SETCC is needed only when we consider swapping the operands, so I replaced the SETCC check for this case.
Added tests for vselect of <X x i1> values.
llvm-svn: 220777
|
|
|
|
|
|
| |
warning; NFC.
llvm-svn: 220775
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
After the commit at rev219046, the lowering of 512-bit broadcasts became non-optimal. Most of the tests on broadcasting and embedded broadcasting were changed, and they no longer produce efficient code.
The example below is from that commit's changes (it is the first test in test/CodeGen/X86/avx512-vbroadcast.ll):
define <16 x i32> @_inreg16xi32(i32 %a) {
; CHECK-LABEL: _inreg16xi32:
; CHECK: ## BB#0:
-; CHECK-NEXT: vpbroadcastd %edi, %zmm0
+; CHECK-NEXT: vmovd %edi, %xmm0
+; CHECK-NEXT: vpbroadcastd %xmm0, %ymm0
+; CHECK-NEXT: vinserti64x4 $1, %ymm0, %zmm0, %zmm0
; CHECK-NEXT: retq
%b = insertelement <16 x i32> undef, i32 %a, i32 0
%c = shufflevector <16 x i32> %b, <16 x i32> undef, <16 x i32> zeroinitializer
ret <16 x i32> %c
}
Here, a 256-bit broadcast was generated instead of a 512-bit one.
In this patch I:
1) Added vector-shuffle lowering through broadcasts
2) Removed asserts and branches like the following, because they are incorrect:
- assert(Subtarget->hasDQI() && "We can only lower v8i64 with AVX-512-DQI");
3) Fixed lowering tests
llvm-svn: 220774
|
|
|
|
| |
llvm-svn: 220773
|
|
|
|
| |
llvm-svn: 220772
|
|
|
|
| |
llvm-svn: 220771
|
|
|
|
| |
llvm-svn: 220759
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a Microsoft calling convention that supports both x86 and x86_64
subtargets. It passes vector and floating point arguments in XMM0-XMM5,
and passes them indirectly once those registers are consumed.
Homogeneous vector aggregates of up to four elements can be passed in
sequential vector registers, but this part is not implemented in LLVM
and will be handled in Clang.
On 32-bit x86, it is similar to fastcall in that it uses ecx:edx as
integer register parameters and is callee cleanup. On x86_64, it
delegates to the normal win64 calling convention.
Reviewers: majnemer
Differential Revision: http://reviews.llvm.org/D5943
llvm-svn: 220745
|
|
|
|
|
|
|
|
|
|
| |
Benchmarks have shown that it's harmless to the performance there, and having a
unified set of passes between the two cores where possible helps big.LITTLE
deployment.
Patch by Z. Zheng.
llvm-svn: 220744
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I noticed that it was untested, and forcing it on caused some tests to fail:
LLVM :: Linker/metadata-a.ll
LLVM :: Linker/prefixdata.ll
LLVM :: Linker/type-unique-odr-a.ll
LLVM :: Linker/type-unique-simple-a.ll
LLVM :: Linker/type-unique-simple2-a.ll
LLVM :: Linker/type-unique-simple2.ll
LLVM :: Linker/type-unique-type-array-a.ll
LLVM :: Linker/unnamed-addr1-a.ll
LLVM :: Linker/visibility1.ll
If it is to be resurrected, it has to be fixed and we should probably have a
-preserve-source command line option in llvm-mc and run tests with and without
it.
llvm-svn: 220741
|
|
|
|
|
|
| |
[-Winconsistent-missing-override]
llvm-svn: 220739
|
|
|
|
|
|
|
|
|
| |
This is implemented via a multiclass that derives from the vperm imm
multiclass.
Fixes <rdar://problem/18426089>
llvm-svn: 220737
|
|
|
|
|
|
|
|
| |
No functionality change. No change in X86.td.expanded except that we only set
the CD8 attributes for the memory variants. (This shouldn't be used unless we
have a memory operand.)
llvm-svn: 220736
|
|
|
|
|
|
|
|
| |
This used to derive from avx512_pshuf_imm which is confusing.
NFC. Compared X86.td.expanded.
llvm-svn: 220735
|
|
|
|
|
|
|
|
|
| |
1) i512mem -> f512mem (this is the packed FP input being permuted)
2) element size is 64 bits in EVEX_CD8 for PD.
(A good illustration why X86VectorVTInfo is useful)
llvm-svn: 220734
|
|
|
|
| |
llvm-svn: 220732
|
|
|
|
|
|
|
|
| |
For a call to not return into the stackmap shadow, the shadow must end with the call.
To do this, we must insert any required nops *before* the call, and not after it.
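The padding rule can be sketched with simple byte counts. This is an illustrative model only, not the X86 AsmPrinter code; `nopBytesBeforeCall` and its parameters are hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper: number of nop bytes to emit *before* a call so that
// the call instruction ends at or beyond the end of the stackmap shadow,
// which keeps the return address out of the shadow.
//   ShadowBytes  - required size of the shadow after the stackmap
//   EmittedBytes - bytes already emitted since the stackmap
//   CallBytes    - encoded size of the call instruction
inline uint32_t nopBytesBeforeCall(uint32_t ShadowBytes,
                                   uint32_t EmittedBytes,
                                   uint32_t CallBytes) {
  const uint64_t EndOfCall = static_cast<uint64_t>(EmittedBytes) + CallBytes;
  if (EndOfCall >= ShadowBytes)
    return 0; // The call already ends at or past the shadow.
  return static_cast<uint32_t>(ShadowBytes - EndOfCall);
}
```

Padding after the call instead would leave the return address inside the shadow, which is the situation the change is meant to avoid.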
llvm-svn: 220728
|