| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Initial patch adding assembly support for Armv8.4-A.
Besides adding v8.4 as a supported architecture to the usual places, this also
adds target features for the different crypto algorithms. Armv8.4-A introduced
new crypto algorithms, made them optional, and allows different combinations:
- none of the v8.4 crypto functions are supported, which is independent of the
implementation of the Armv8.0 SHA1 and SHA2 instructions.
- the v8.4 SHA512 and SHA3 support is implemented, in this case the Armv8.0
SHA1 and SHA2 instructions must also be implemented.
- the v8.4 SM3 and SM4 support is implemented, which is independent of the
implementation of the Armv8.0 SHA1 and SHA2 instructions.
- all of the v8.4 crypto functions are supported, in this case the Armv8.0 SHA1
and SHA2 instructions must also be implemented.
The v8.4 crypto instructions are added to AArch64 only, and not AArch32,
and are made optional extensions to Armv8.2-A.
The user-facing Clang options will map on these new target features, their
naming will be compatible with GCC and added in follow-up patches.
The Armv8.4-A instruction sets can be downloaded here:
https://developer.arm.com/products/architecture/a-profile/exploration-tools
Differential Revision: https://reviews.llvm.org/D48625
llvm-svn: 335953
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
An alternative to D48597.
Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=37936 | PR37936 ]].
The problem is as follows:
1. `indvars` marks `%dec` as `NUW`.
2. `loop-instsimplify` runs `instsimplify`, which constant-folds `%dec` to -1 (D47908)
3. `loop-reduce` tries to do some further modification, but crashes
with an type assertion in cast, because `%dec` is no longer an `Instruction`,
If the runline is split into two, i.e. you first run `-indvars -loop-instsimplify`,
store that into a file, and then run `-loop-reduce`, there is no crash.
So it looks like the problem is due to `-loop-instsimplify` not discarding SCEV.
But in this case we can just not crash if it's not an `Instruction`.
This is just a local fix, unlike D48597, so there may very well be other problems.
Reviewers: mkazantsev, uabelho, sanjoy, silviu.baranga, wmi
Reviewed By: mkazantsev
Subscribers: evstupac, javed.absar, spatel, llvm-commits
Differential Revision: https://reviews.llvm.org/D48599
llvm-svn: 335950
|
|
|
|
|
|
|
|
| |
IR instead.
While there improve the coverage of the intrinsic testing and add fast-isel tests.
llvm-svn: 335944
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
We now have two sets of generated TableGen files, one for R600 and one
for GCN, so each sub-target now has its own tables of instructions,
registers, ISel patterns, etc. This should help reduce compile time
since each sub-target now only has to consider information that
is specific to itself. This will also help prevent the R600
sub-target from slowing down new features for GCN, like disassembler
support, GlobalISel, etc.
Reviewers: arsenm, nhaehnle, jvesely
Reviewed By: arsenm
Subscribers: MatzeB, kzhuravl, wdng, mgorny, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D46365
llvm-svn: 335942
|
|
|
|
|
|
|
|
|
|
|
|
| |
We don't ever check these again (unless you're using
-fno-integrated-as), so make sure the extracted bits are well-defined.
I don't think it's possible to trigger any of the assertions on trunk,
but it's difficult to prove. (The first one depends on DAGCombine to
minimize the number of set bits in AND masks; I think the others are
mathematically impossible to hit.)
llvm-svn: 335931
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a recommit of r335879.
We shouldn't add the outliner when compiling at -O0 even if
-enable-machine-outliner is passed in. This makes sure that we
don't add it in this case.
This also removes -O0 from the outliner DWARF test.
llvm-svn: 335930
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change adds experimental support for SHT_RELR sections, proposed
here: https://groups.google.com/forum/#!topic/generic-abi/bX460iggiKg
Definitions for the new ELF section type and dynamic array tags, as well
as the encoding used in the new section are all under discussion and are
subject to change. Use with caution!
Author: rahulchaudhry
Differential Revision: https://reviews.llvm.org/D47919
llvm-svn: 335922
|
|
|
|
|
|
|
|
|
| |
There's no way to expose this difference currently,
but we should use the updated variable because the
original opcodes can go stale if we transform into
something new.
llvm-svn: 335920
|
|
|
|
|
|
|
|
|
|
| |
This fixes a regression since SVN r334523, where the object files
built targeting MinGW were rejected by GNU binutils tools. Prior to
that commit, we only put constants in comdat for MSVC configurations.
Differential Revision: https://reviews.llvm.org/D48567
llvm-svn: 335918
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The InlinerFunctionImportStats will collect and dump stats regarding how
many function inlined into the module were imported by ThinLTO.
Reviewers: wmi, dexonsmith
Subscribers: mehdi_amini, inglorion, llvm-commits, eraman
Differential Revision: https://reviews.llvm.org/D48729
llvm-svn: 335914
|
|
|
|
|
|
| |
No functionality change.
llvm-svn: 335913
|
|
|
|
|
|
|
|
|
|
| |
Mostly just adding checks for Thumb2 instructions which correspond to
ARM instructions which already had diagnostics. While I'm here, also fix
ARM-mode strd to check the input registers correctly.
Differential Revision: https://reviews.llvm.org/D48610
llvm-svn: 335909
|
|
|
|
|
|
|
|
|
| |
When rewriting an alloca partition copy the DL from the
old alloca over the the new one.
Differential Revision: https://reviews.llvm.org/D48640
llvm-svn: 335904
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
FileOutputBuffer creates a temp file and on commit atomically
renames the temp file to the destination file. Sometimes we
want to modify an existing file in place, but still have the
atomicity guarantee. To do this we can initialize the contents
of the temp file from the destination file (if it exists), that
way the resulting FileOutputBuffer can have only selective
bytes modified. Committing will then atomically replace the
destination file as desired.
llvm-svn: 335902
|
|
|
|
|
|
| |
Fixes -Wpedantic warning.
llvm-svn: 335901
|
|
|
|
|
|
|
|
|
|
| |
btr/bts/btc.
This is a follow up to r335753. At the time I forgot about isProfitableToFold which makes this pretty easy.
Differential Revision: https://reviews.llvm.org/D48706
llvm-svn: 335895
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
code models""
Reverting because this is causing failures in the LLDB test suite on
GreenDragon.
LLVM ERROR: unsupported relocation with subtraction expression, symbol
'__GLOBAL_OFFSET_TABLE_' can not be undefined in a subtraction
expression
llvm-svn: 335894
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is an enhancement to D48401 that was discussed in:
https://bugs.llvm.org/show_bug.cgi?id=37806
We can convert a shift-left-by-constant into a multiply (we canonicalize IR in the other
direction because that's generally better of course). This allows us to remove the shuffle
as we do in the regular opcodes-are-the-same cases.
This requires a small hack to make sure we don't introduce any extra poison:
https://rise4fun.com/Alive/ZGv
Other examples of opcodes where this would work are add+sub and fadd+fsub, but we already
canonicalize those subs into adds, so there's nothing to do for those cases AFAICT. There
are planned enhancements for opcode transforms such or -> add.
Note that there's a different fold needed if we've already managed to simplify away a binop
as seen in the test based on PR37806, but we manage to get that one case here because this
fold is positioned above the demanded elements fold currently.
Differential Revision: https://reviews.llvm.org/D48485
llvm-svn: 335888
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Targets should be able to define whether or not they support the outliner
without the outliner being added to the pass pipeline. Before this, the
outliner pass would be added, and ask the target whether or not it supports the
outliner.
After this, it's possible to query the target in TargetPassConfig, before the
outliner pass is created. This ensures that passing -enable-machine-outliner
will not modify the pass pipeline of any target that does not support it.
https://reviews.llvm.org/D48683
llvm-svn: 335887
|
|
|
|
|
|
|
|
|
|
| |
We could get away with it for constant folded cases, but not for rL335719.
Thanks to Krzysztof Parzyszek for noticing.
Reapply original commit rL335821 which was reverted at rL335871 due to a WebAssembly bug that was fixed at rL335884.
llvm-svn: 335886
|
|
|
|
|
|
|
|
| |
compare results.
Necessary to get the rL335821 bugfix (which was reverted at rL335871) un-reverted.
llvm-svn: 335884
|
|
|
|
|
|
|
|
|
| |
-enable-machine-outliner"
I accidentally committed this instead of D48683 because I haven't had coffee
yet.
llvm-svn: 335883
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 9c7c10e4073a0bc6a759ce5cd33afbac74930091.
It relies on r335872 since that introduces the machine outliner
flags test. I meant to commit D48683 in that commit, but got mixed
up and committed D48682 instead. So, I'm reverting this and
r335872, since D48682 hasn't made it through review yet.
llvm-svn: 335882
|
|
|
|
|
|
|
|
|
|
|
| |
We shouldn't add the outliner when compiling at -O0 even if
-enable-machine-outliner is passed in. This makes sure that we
don't add it in this case.
This also updates machine-outliner-flags to reflect the change
and improves the comment describing what that test does.
llvm-svn: 335879
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add NoTrapAfterNoreturn target option which skips emission of traps
behind noreturn calls even if TrapUnreachable is enabled.
Enable the feature on Mach-O to save code size; Comments suggest it is
not possible to enable it for the other users of TrapUnreachable.
rdar://41530228
DifferentialRevision: https://reviews.llvm.org/D48674
llvm-svn: 335877
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
To enable the MachineOutliner by default on AArch64, we need to be able to
disable the MachineOutliner and also provide an option to "always" enable the
outliner.
This adds that capability. It allows the user to still use the old
-enable-machine-outliner option, which defaults to "always". This is building
up to allowing the user to specify "always" versus the target-default
outlining behaviour.
llvm-svn: 335872
|
|
|
|
|
|
|
|
| |
This reverts commit r335821.
This crashes the webassembly test, run "ninja check-llvm-codegen-webassembly" to reproduce.
llvm-svn: 335871
|
|
|
|
|
|
|
|
|
|
|
|
| |
This allows hoisting of a common code, for instance if denominator
is loop invariant. Current change is expansion only, adding licm to
the target pass list going to be a separate patch. Given this patch
changes to codegen are minor as the expansion is similar to that on
DAG. DAG expansion still must remain for R600.
Differential Revision: https://reviews.llvm.org/D48586
llvm-svn: 335868
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D48677
llvm-svn: 335866
|
|
|
|
|
|
|
|
|
|
|
|
| |
This pass is being added in order to make the information available to BasicAA,
which can't do caching of this information itself, but possibly this information
may be useful for other passes.
Incorporates code based on Daniel Berlin's implementation of Tarjan's algorithm.
Differential Revision: https://reviews.llvm.org/D47893
llvm-svn: 335857
|
|
|
|
|
|
|
|
| |
Frequency Info."
This reverts commits r335794 and r335797. Breaks ThinLTO+FDO selfhost.
llvm-svn: 335851
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Armv6 introduced instructions to perform 32-bit SIMD operations. The purpose of
this pass is to do some straightforward IR pattern matching to create ACLE DSP
intrinsics, which map on these 32-bit SIMD operations.
Currently, only the SMLAD instruction gets recognised. This instruction
performs two multiplications with 16-bit operands, and stores the result in an
accumulator. We will follow this up with patches to recognise SMLAD in more
cases, and also to generate other DSP instructions (like e.g. SADD16).
Patch by: Sam Parker and Sjoerd Meijer
Differential Revision: https://reviews.llvm.org/D48128
llvm-svn: 335850
|
|
|
|
|
|
|
|
|
|
| |
Summary: Just a silly one-character correction.
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D48709
llvm-svn: 335832
|
|
|
|
| |
llvm-svn: 335831
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We have too many mechanisms for tracking the various offsets
used for kernel arguments, so remove one. There's still a lot of
confusion with these because there are two different "implicit"
argument areas located at the beginning and end of the kernarg
segment.
Additionally, the offset was determined based on the memory
size of the split element types. This would break in a future
commit where v3i32 is decomposed into separate i32 pieces.
llvm-svn: 335830
|
|
|
|
|
|
|
|
| |
In principle nothing should stop these from working, but
work is necessary to create an ABI for dealing with the stack
related registers.
llvm-svn: 335829
|
|
|
|
|
|
| |
Not sure how this wasn't noticed before.
llvm-svn: 335828
|
|
|
|
|
|
|
|
|
|
| |
Just fix the crash for now by not doing the optimization since
figuring out how to properly convert the bits for an arbitrary
struct is a pain.
Also fix a crash when there is only an empty struct argument.
llvm-svn: 335827
|
|
|
|
|
|
|
| |
These are all benign races and only visible in !NDEBUG. tsan complains
about it, but a simple atomic bool is sufficient to make it happy.
llvm-svn: 335823
|
|
|
|
|
|
|
|
| |
We could get away with it for constant folded cases, but not for rL335719.
Thanks to Krzysztof Parzyszek for noticing.
llvm-svn: 335821
|
|
|
|
|
|
|
|
|
|
|
|
| |
SCCP does not change the CFG, so we can mark it as preserved.
Reviewers: dberlin, efriedma, davide
Reviewed By: davide
Differential Revision: https://reviews.llvm.org/D47149
llvm-svn: 335820
|
|
|
|
|
|
| |
Noticed in D45806 review.
llvm-svn: 335817
|
|
|
|
|
|
|
| |
If a trunc has a user in a block which is not reachable from entry,
we can safely perform trunc elimination as if this user didn't exist.
llvm-svn: 335816
|
|
|
|
|
|
|
|
|
|
| |
Remove unused ByteStreamer argument from function emitDebugLocValue.
Patch by Nikola Prica.
Differential Revision: https://reviews.llvm.org/D48590
llvm-svn: 335811
|
|
|
|
|
|
|
|
|
|
| |
instead of ImmLeafs with predicates where one of the two numbers was hardcoded.
This more efficient for the isel table generator since we can use CheckChildInteger instead of MoveChild, CheckPredicate, MoveParent. This reduced the table size by 1-2K.
I wish there was a way to share the values with X86BaseInfo.h and still use a PatFrag like this. These numbers are fixed by the X86 intrinsic spec going back many years and we should never need to change them. So we shouldn't waste table bytes to support sharing.
llvm-svn: 335806
|
|
|
|
|
|
|
|
|
|
|
|
| |
BMI2 added new shift by register instructions that have the ability to fold a load.
Normally without doing anything special isel would prefer folding a load over folding an immediate because the load folding pattern has higher "complexity". This would require an instruction to move the immediate into a register. We would rather fold the immediate instead and have a separate instruction for the load.
We used to enforce this priority by artificially lowering the complexity of the load pattern.
This patch changes this to instead reject the load fold in isProfitableToFoldLoad if there is an immediate. This is more consistent with other binops and feels less hacky.
llvm-svn: 335804
|
|
|
|
| |
llvm-svn: 335797
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
=== Generating the CG Profile ===
The CGProfile module pass simply gets the block profile count for each BB and scans for call instructions. For each call instruction it adds an edge from the current function to the called function with the current BB block profile count as the weight.
After scanning all the functions, it generates an appending module flag containing the data. The format looks like:
```
!llvm.module.flags = !{!0}
!0 = !{i32 5, !"CG Profile", !1}
!1 = !{!2, !3, !4} ; List of edges
!2 = !{void ()* @a, void ()* @b, i64 32} ; Edge from a to b with a weight of 32
!3 = !{void (i1)* @freq, void ()* @a, i64 11}
!4 = !{void (i1)* @freq, void ()* @b, i64 20}
```
Differential Revision: https://reviews.llvm.org/D48105
llvm-svn: 335794
|
|
|
|
|
|
|
|
| |
The code to emit the pieces of the MSF file were actually in
PDBFileBuilder. Move this to MSFBuilder so that we can
theoretically emit an MSF without having a PDB file.
llvm-svn: 335789
|
|
|
|
|
|
|
| |
This is a benign race, but tsan likes to complain about it. Just make it
happy.
llvm-svn: 335788
|