| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Support vector type G_INSERT legalization/selection.
Split from https://reviews.llvm.org/D33665
Reviewers: qcolombet, t.p.northover, zvi, guyblank
Reviewed By: guyblank
Subscribers: guyblank, rovka, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D33956
llvm-svn: 305989
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch adds a macro fusion using CodeGen/MacroFusion.cpp to pair AES
instructions back to back and adds FeatureFuseAES to enable the feature.
Reviewers: evandro, javed.absar, rengolin, t.p.northover
Reviewed By: javed.absar
Subscribers: aemerson, mgorny, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D34142
llvm-svn: 305988
|
|
|
|
|
|
|
|
|
|
|
|
| |
Masked gather for vector length 2 is lowered incorrectly for element type i32.
The type <2 x i32> was automatically extended to <2 x i64> and we generated VPGATHERQQ instead of VPGATHERQD.
The type <2 x float> is extended to <4 x float>, so there is no bug for this type, but the sequence may be more optimal.
In this patch I'm fixing <2 x i32>bug and optimizing <2 x float> sequence for GATHERs only. The same fix should be done for Scatters as well.
Differential revision: https://reviews.llvm.org/D34343
llvm-svn: 305987
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Added support based on merged SDWA pseudo instructions. Now peephole allow one scalar operand, omod and clamp modifiers.
Added several subtarget features for GFX9 SDWA.
This diff also contains changes from D34026.
Depends D34026
Reviewers: vpykhtin, rampitec, arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
Differential Revision: https://reviews.llvm.org/D34241
llvm-svn: 305986
|
|
|
|
|
|
| |
folding can create more instructions than it removed when there are multiple uses. NFC
llvm-svn: 305985
|
|
|
|
|
|
| |
This patch fixes trivial mishandling of 32-bit/64-bit instructions that may cause verification errors with -verify-machineinstrs.
llvm-svn: 305984
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D34442
llvm-svn: 305983
|
|
|
|
|
|
|
|
| |
This includes the safe SEH tables and the control flow guard function
table. LLD will emit the guard table soon, and I need a tool that dumps
them for testing.
llvm-svn: 305979
|
|
|
|
| |
llvm-svn: 305978
|
|
|
|
|
|
| |
error message. No functional change.
llvm-svn: 305977
|
|
|
|
| |
llvm-svn: 305976
|
|
|
|
|
|
|
| |
This reverts commit r305949 and r305950 as they didn't have the
correct commit message.
llvm-svn: 305973
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Make SizeClassAllocator32 return nullptr when it encounters OOM, which
allows the entire sanitizer's allocator to follow allocator_may_return_null=1
policy, even for small allocations (LargeMmapAllocator is already fixed
by D34243).
Will add a test for OOM in primary allocator later, when
SizeClassAllocator64 can gracefully handle OOM too.
Reviewers: eugenis
Subscribers: kubamracek, llvm-commits
Differential Revision: https://reviews.llvm.org/D34433
llvm-svn: 305972
|
|
|
|
|
|
|
|
|
|
|
| |
- Use auto where appropriate
- Use early return to reduce nesting
- Remove stray comment line
- Use C++ foreach over explicit iterator
Differential Revision: https://reviews.llvm.org/D34477
llvm-svn: 305971
|
|
|
|
|
|
|
|
| |
This is one of the nodes which also compile as v_cmp_*.
Differential Revision: https://reviews.llvm.org/D34485
llvm-svn: 305970
|
|
|
|
|
|
| |
What You Use warnings; other minor fixes (NFC).
llvm-svn: 305969
|
|
|
|
| |
llvm-svn: 305968
|
|
|
|
| |
llvm-svn: 305967
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This fixes a bug where we always treat APSInts in Codeview as
signed when writing them to YAML. One symptom of this problem is that
llvm-pdbdump raw would show Enumerator Values that differ between the
original PDB and a PDB that has been round-tripped through YAML.
Reviewers: zturner
Reviewed By: zturner
Subscribers: llvm-commits, fhahn
Differential Revision: https://reviews.llvm.org/D34013
llvm-svn: 305965
|
|
|
|
|
|
|
|
|
| |
If one of the arguments of adde/sube is zero we can fold another
add/sub into it.
Differential Revision: https://reviews.llvm.org/D34374
llvm-svn: 305964
|
|
|
|
|
|
|
|
|
| |
Add const qualifier to any dump() method where adding one
was trivial.
Differential Revision: https://reviews.llvm.org/D34481
llvm-svn: 305963
|
|
|
|
|
|
|
|
|
| |
This simplification allows to avoid generating v_cndmask_b32
to serialize condition code between compare and use.
Differential Revision: https://reviews.llvm.org/D34300
llvm-svn: 305962
|
|
|
|
|
|
|
|
|
|
|
| |
CMake emits build targets as relative paths (from build.ninja) but Ninja doesn't identify absolute path (in *.d) as relative path (in build.ninja).
So, let file names, in the command line, relative from ${CMAKE_BINARY_DIR}, where build.ninja is.
Note that tblgen is executed on ${CMAKE_BINARY_DIR} as working directory.
Differential Revision: https://reviews.llvm.org/D33707
llvm-svn: 305961
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
vectorizer-maximize-bandwidth is generally useful in terms of performance. I've tested the impact of changing this to default on speccpu benchmarks on sandybridge machines. The result shows non-negative impact:
spec/2006/fp/C++/444.namd 26.84 -0.31%
spec/2006/fp/C++/447.dealII 46.19 +0.89%
spec/2006/fp/C++/450.soplex 42.92 -0.44%
spec/2006/fp/C++/453.povray 38.57 -2.25%
spec/2006/fp/C/433.milc 24.54 -0.76%
spec/2006/fp/C/470.lbm 41.08 +0.26%
spec/2006/fp/C/482.sphinx3 47.58 -0.99%
spec/2006/int/C++/471.omnetpp 22.06 +1.87%
spec/2006/int/C++/473.astar 22.65 -0.12%
spec/2006/int/C++/483.xalancbmk 33.69 +4.97%
spec/2006/int/C/400.perlbench 33.43 +1.70%
spec/2006/int/C/401.bzip2 23.02 -0.19%
spec/2006/int/C/403.gcc 32.57 -0.43%
spec/2006/int/C/429.mcf 40.35 +0.27%
spec/2006/int/C/445.gobmk 26.96 +0.06%
spec/2006/int/C/456.hmmer 24.4 +0.19%
spec/2006/int/C/458.sjeng 27.91 -0.08%
spec/2006/int/C/462.libquantum 57.47 -0.20%
spec/2006/int/C/464.h264ref 46.52 +1.35%
geometric mean +0.29%
The regression on 453.povray seems real, but is due to secondary effects as all hot functions are bit-identical with and without the flag.
I started this patch to consult upstream opinions on this. It will be greatly appreciated if the community can help test the performance impact of this change on other architectures so that we can decided if this should be target-dependent.
Reviewers: hfinkel, mkuper, davidxl, chandlerc
Reviewed By: chandlerc
Subscribers: rengolin, sanjoy, javed.absar, bjope, dorit, magabari, RKSimon, llvm-commits, mzolotukhin
Differential Revision: https://reviews.llvm.org/D33341
llvm-svn: 305960
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We need to take type alignment padding into account whe computing physical
layouts.
The layout must be compatible with the input layout, offsets are defined in
terms of offsets within a packed struct which are computed in terms of the alloc
size of a type.
Usingthe store size we would insert padding for the following type for example:
struct {
int3 v;
long long l;
} __attribute((packed))
On x86-64 int3 is padded to int4 alignment. The swiftcc type would be
<{ <3 x float>, [4 x i8], i64 }> which is not compatible with <{ <3 x float>,
i64 }>.
The latter has i64 at offset 16 and the former at offset 20.
rdar://32618125
llvm-svn: 305956
|
|
|
|
| |
llvm-svn: 305955
|
|
|
|
|
|
|
|
| |
This is unnecessary because --gc-sections runs before ICF.
Differential Revision: https://reviews.llvm.org/D34465
llvm-svn: 305954
|
|
|
|
| |
llvm-svn: 305953
|
|
|
|
|
|
|
|
|
| |
For consistency with other MC*Streamer.cpp files and
the header file.
Differential Revision: https://reviews.llvm.org/D34466
llvm-svn: 305952
|
|
|
|
| |
llvm-svn: 305951
|
|
|
|
| |
llvm-svn: 305950
|
|
|
|
|
|
|
|
| |
Patch by John Baldwin < jhb at freebsd dot org >!
Differential Revision: https://reviews.llvm.org/D34452
llvm-svn: 305949
|
|
|
|
|
|
|
|
| |
Patch by Fedor Sergeev.
Differential Revision: https://reviews.llvm.org/D33868
llvm-svn: 305948
|
|
|
|
|
|
|
| |
Done to remove noise from https://reviews.llvm.org/D32332 (and to make
this test more resilient to changes in general).
llvm-svn: 305947
|
|
|
|
|
|
| |
improve readability. NFC
llvm-svn: 305946
|
|
|
|
| |
llvm-svn: 305945
|
|
|
|
|
|
|
|
|
|
| |
(consumer).
Reviewer: aprantl
Differential Revision: https://reviews.llvm.org/D34418
llvm-svn: 305944
|
|
|
|
| |
llvm-svn: 305943
|
|
|
|
|
|
| |
The bug that was causing this to fail was fixed in r305429.
llvm-svn: 305942
|
|
|
|
|
|
| |
methods standalone static functions. Put 'if' around variable declaration instead of after. NFC
llvm-svn: 305941
|
|
|
|
|
|
|
|
| |
enabled and #if with an undefined identifier and without #else
'HandleEndifDirective' asserts that 'WasSkipping' is false, so switch to using 'FoundNonSkip' as the hint for 'SingleFileParseMode' to keep going with parsing.
llvm-svn: 305940
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This attribute is used to ensure the guard page is triggered on stack
overflow. Stack frames larger than the guard page size will generate
a call to __probestack to touch each page so the guard page won't
be skipped.
Reviewed By: majnemer
Differential Revision: https://reviews.llvm.org/D34386
llvm-svn: 305939
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Using various methods, BasicAA tries to determine whether two
GetElementPtr memory locations alias when its base pointers are known
to be equal. When none of its heuristics are applicable, it falls back
to PartialAlias to, according to a comment, protect TBAA making a wrong
decision in case of unions and malloc. PartialAlias is not correct,
because a PartialAlias result implies that some, but not all, bytes
overlap which is not necessarily the case here.
AAResults returns the first analysis result that is not MayAlias.
BasicAA is always the first alias analysis. When it returns
PartialAlias, no other analysis is queried to give a more exact result
(which was the intention of returning PartialAlias instead of MayAlias).
For instance, ScopedAA could return a more accurate result.
The PartialAlias hack was introduced in r131781 (and re-applied in
r132632 after some reverts) to fix llvm.org/PR9971 where TBAA returns a
wrong NoAlias result due to a union. A test case for the malloc case
mentioned in the comment was not provided and I don't think it is
affected since it returns an omnipotent char anyway.
Since r303851 (https://reviews.llvm.org/D33328) clang does emit specific
TBAA for unions anymore (but "omnipotent char" instead). Hence, the
PartialAlias workaround is not required anymore.
This patch passes the test-suite and check-llvm/check-clang of a
self-hoisted build on x64.
Reviewed By: hfinkel
Differential Revision: https://reviews.llvm.org/D34318
llvm-svn: 305938
|
|
|
|
|
|
|
|
|
| |
This will be needed in order to share the irsymtab string table with
the bitcode string table.
Differential Revision: https://reviews.llvm.org/D33971
llvm-svn: 305937
|
|
|
|
|
|
| |
trunc
llvm-svn: 305936
|
|
|
|
| |
llvm-svn: 305935
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: r305009 disables recursive inlining for indirect calls in sample loader pass. The same logic applies to direct recursive calls.
Reviewers: iteratee, davidxl
Reviewed By: iteratee
Subscribers: sanjoy, llvm-commits, eraman
Differential Revision: https://reviews.llvm.org/D34456
llvm-svn: 305934
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The main complexity in adding symbol records is that we need to
"relocate" all the type indices. Type indices do not have anything like
relocations, an opaque data structure describing where to find existing
type indices for fixups. The linker just has to "know" where the type
references are in the symbol records. I added an overload of
`discoverTypeIndices` that works on symbol records, and it seems to be
able to link the standard library.
Reviewers: zturner, ruiu
Subscribers: llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D34432
llvm-svn: 305933
|
|
|
|
|
|
|
|
|
|
|
| |
Define target hook isReallyTriviallyReMaterializable() to explicitly specify
PowerPC instructions that are trivially rematerializable. This will allow
the MachineLICM pass to accurately identify PPC instructions that should always
be hoisted.
Differential Revision: https://reviews.llvm.org/D34255
llvm-svn: 305932
|
|
|
|
|
|
|
| |
I don't think there's any visible difference from having the wrong layout
for the 32-bit case at this point, but that could change in the future.
llvm-svn: 305931
|