| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D21067
llvm-svn: 272006
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Using an LLVM IR aggregate return value type containing three
or more integer values causes an abort in the fast isel pass.
This patch adds two more registers to RetCC_PPC64_ELF_FIS to
allow returning up to four integers with fast isel, just the
same as is currently supported with regular isel (RetCC_PPC).
This is needed for Swift and (possibly) other non-clang frontends.
Fixes PR26190.
llvm-svn: 272005
|
| |
|
|
|
|
|
|
|
|
| |
shuffle mask directly
We currently only combine to blend+zero if the target value type has 8 elements or less, but this was missing a lot of cases where the combined mask had been widened.
This change makes it so we use the combined mask to determine the blend value type, allowing us to catch more widened cases.
llvm-svn: 272003
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
A Thumb-2 post-indexed LDR instruction such as:
ldr.w r0, [r1], #4
Can be rewritten as:
ldm.n r1!, {r0}
LDMs can be more expensive than LDRs on some cores, so this has been enabled only in minsize mode.
llvm-svn: 272002
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we have an LDM that uses only low registers and doesn't write to its base register:
ldm.w r0, {r1, r2, r3}
And that base register is dead after the LDM, then we can convert it to writeback form and use a narrow encoding:
ldm.n r0!, {r1, r2, r3}
Obviously, this introduces a new register write and so can cause WAW hazards, so I've enabled it only in minsize mode. This is a code size trick that ARM Compiler 5 ("armcc") does that we don't.
llvm-svn: 272000
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The Thumb2 conditional branch B<cond>.W has a different encoding (T3)
to the unconditional branch B.W (T4) as it needs to record <cond>.
As the encoding is different the B<cond>.W is given a different
relocation type.
ELF for the ARM Architecture 4.6.1.6 (Table-13) states that
R_ARM_THM_JUMP19 should be used for B<cond>.W. At present the
MC layer is using the R_ARM_THM_JUMP24 from B.W.
This change makes B<cond>.W use R_ARM_THM_JUMP19 and alters the
existing test that checks for R_ARM_THM_JUMP24 to expect
R_ARM_THM_JUMP19.
llvm-svn: 271997
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
native shifts
Unlike native shifts, the AVX2 per-element shift instructions VPSRAV/VPSRLV/VPSLLV handle out of range shift values (logical shifts set the result to zero, arithmetic shifts splat the sign bit).
If the shift amount is constant we can sometimes convert these instructions to native shifts:
1 - if all shift amounts are in range then the conversion is trivial.
2 - out of range arithmetic shifts can be clamped to the (bitwidth - 1) (a legal shift amount) before conversion.
3 - logical shifts just return zero if all elements have out of range shift amounts.
In addition, UNDEF shift amounts are handled - either as an UNDEF shift amount in a native shift or as an UNDEF in the logical 'all out of range' zero constant special case for logical shifts.
Differential Revision: http://reviews.llvm.org/D19675
llvm-svn: 271996
|
| |
|
|
|
|
|
|
|
|
| |
This patch adds support for folding undef/zero/constant inputs to MOVMSK instructions.
The SSE/AVX versions can be fully folded, but the MMX version can only handle undef inputs.
Differential Revision: http://reviews.llvm.org/D20998
llvm-svn: 271990
|
| |
|
|
|
|
| |
encoded instructions when VLX is enabled.
llvm-svn: 271988
|
| |
|
|
|
|
| |
instructions that have patterns that imply them. Add the same set of flags to instructions that don't have patterns to imply them.
llvm-svn: 271987
|
| |
|
|
|
|
| |
of the patterns in the .td file protects this, but its better to be explicit.
llvm-svn: 271986
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In order to efficiently write PDBs, we need to be able to make a
StreamWriter class similar to a StreamReader, which can transparently deal
with writing to discontiguous streams, and we need to use this for all
writing, similar to how we use StreamReader for all reading.
Most discontiguous streams are the typical numbered streams that appear in
a PDB file and are described by the directory, but the exception to this,
that until now has been parsed by hand, is the directory itself.
MappedBlockStream works by querying the directory to find out which blocks
a stream occupies and various other things, so naturally the same logic
could not possibly work to describe the blocks that the directory itself
resided on.
To solve this, I've introduced an abstraction IPDBStreamData, which allows
the client to query for the list of blocks occupied by the stream, as well
as the stream length. I provide two implementations of this: one which
queries the directory (for indexed streams), and one which queries the
super block (for the directory stream).
This has the side benefit of vastly simplifying the code to parse the
directory. Whereas before a mini state machine was rolled by hand, now we
simply use FixedStreamArray to read out the stream sizes, then build a
vector of FixedStreamArrays for the stream map, all in just a few lines of
code.
Reviewed By: ruiu
Differential Revision: http://reviews.llvm.org/D21046
llvm-svn: 271982
|
| |
|
|
| |
llvm-svn: 271980
|
| |
|
|
|
|
|
|
| |
because LSan is not currently supported.
Differential Revision: http://reviews.llvm.org/D20947
llvm-svn: 271979
|
| |
|
|
|
|
|
|
|
|
|
|
| |
TLS access requires an offset from the TLS index. The index itself is the
section-relative distance of the symbol. For ARM, the relevant relocation
(IMAGE_REL_ARM_SECREL) is applied as a constant. This means that the value may
not be an immediate and must be lowered into a constant pool. This offset will
not be base relocated. We were previously emitting the actual address of the
symbol which would be base relocated and would therefore be the vaue offset by
the ImageBase + TLS Offset.
llvm-svn: 271974
|
| |
|
|
|
|
|
| |
clang-format a couple of switches in preparation for a future change. Add some
enumeration comments
llvm-svn: 271973
|
| |
|
|
|
|
| |
Just adjust the whitespace for the selection patterns. NFC.
llvm-svn: 271972
|
| |
|
|
| |
llvm-svn: 271967
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit r271962 and reinstantes r271957.
MSVC's linker doesn't appear to like it if you have an empty symbol
substream, so only open a symbol substream if we're going to emit
something about globals into it.
Makes check-asan pass.
llvm-svn: 271965
|
| |
|
|
|
|
| |
world without a named variable
llvm-svn: 271964
|
| |
|
|
|
|
| |
This reverts commit r271957, it broke check-asan on Windows.
llvm-svn: 271962
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
scalarizePHI only looked for phis that have exactly two uses - the "latch"
use, and an extract. Unfortunately, we can not assume all equivalent extracts
are CSE'd, since InstCombine itself may create an extract which is a duplicate
of an existing one. This extends it to handle several distinct extracts from
the same index.
This should fix at least some of the performance regressions from PR27988.
Differential Revision: http://reviews.llvm.org/D20983
llvm-svn: 271961
|
| |
|
|
| |
llvm-svn: 271958
|
| |
|
|
|
|
|
| |
This currently emits everything as S_GDATA32, which isn't right for
things like thread locals, but it's a start.
llvm-svn: 271957
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
functions.
Arrange to call verify(Function &) on each function, followed by
verify(Module &), whether the verifier is being used from the pass or
from verifyModule(). As a side effect, this fixes an issue that caused
us not to call verify(Function &) on unmaterialized functions from
verifyModule().
Differential Revision: http://reviews.llvm.org/D21042
llvm-svn: 271956
|
| |
|
|
| |
llvm-svn: 271954
|
| |
|
|
|
|
|
| |
Remove previously unreachable code that verifies that a function definition has
an entry block. By definition, a function definition has at least one block.
llvm-svn: 271948
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Calls to this function are currently injected by the
``SanitizerCoverageModule`` pass when the both the ``indirect-calls``
and ``trace-pc`` sanitizer coverage options are enabled and the code
being instrumented has indirect calls. Previously because LibFuzzer did
not define this function this would lead to link errors when building
some of the tests on OSX.
Differential Revision: http://reviews.llvm.org/D20946
llvm-svn: 271938
|
| |
|
|
| |
llvm-svn: 271936
|
| |
|
|
|
|
|
|
| |
If we had a constant group address space cast the queue pointer
wasn't enabled for the function, resulting in a crash on noreg
later.
llvm-svn: 271935
|
| |
|
|
| |
llvm-svn: 271934
|
| |
|
|
| |
llvm-svn: 271932
|
| |
|
|
|
|
|
|
|
|
|
| |
In some cases, when simplifying with SCEV, we might consider pointer values as
just usual integer values. Thus, we might get a different type from what we
had originally in the map of simplified values, and hence we need to check
types before operating on the values.
This fixes PR28015.
llvm-svn: 271931
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Fix LSRInstance::HoistInsertPosition() to check the original insert
position block first for a canonical insertion point that is dominated
by all inputs. This leads to SCEV being able to reuse more instructions
since it currently tracks the instructions it creates for reuse by
keeping a table of <Value, insert point> pairs.
Originally reviewed in http://reviews.llvm.org/D18001
Reviewers: atrick
Subscribers: llvm-commits, mzolotukhin, mcrosier
Differential Revision: http://reviews.llvm.org/D18480
llvm-svn: 271929
|
| |
|
|
|
|
|
|
|
|
| |
The data strucutre in the new FPO stream is described in the
PE/COFF spec. There is one record per function if frame pointer
is omitted.
Differential Revision: http://reviews.llvm.org/D20999
llvm-svn: 271926
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
The code layout that TailMerging (inside BranchFolding) works on is not the
final layout optimized based on the branch probability. Generally, after
BlockPlacement, many new merging opportunities emerge.
This patch calls Tail Merging after MBP and calls MBP again if Tail Merging
merges anything.
Differential Revision: http://reviews.llvm.org/D20276
llvm-svn: 271925
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D20184
llvm-svn: 271923
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Following D20970 (committed as r271726).
This is a substantial refactoring of the host CPU detection code.
There is no functionality change intended, but the changes are extensive.
Definitions of architecture types and subtypes are by no means exhaustive or
perfectly defined, but a fair starting point.
Suggestions for futher improvements are welcome.
Reviewers: llvm-commits
Differential Revision: http://reviews.llvm.org/D20988
llvm-svn: 271921
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In r271810 ( http://reviews.llvm.org/rL271810 ), I loosened the check
above this to work for any Constant rather than ConstantInt. AFAICT,
that part makes sense if we can determine that the shrunken/extended
constant remained equal. But it doesn't make sense for this later
transform where we assume that the constant DID change.
This could assert for a ConstantExpr:
https://llvm.org/bugs/show_bug.cgi?id=28011
And it could be wrong for a vector as shown in the added regression test.
llvm-svn: 271908
|
| |
|
|
|
|
|
|
|
|
|
|
| |
src2 == VCC.
Another step for unification llvm assembler/disassembler with sp3.
Besides, CodeGen output is a bit improved, thus changes in CodeGen tests.
Assembler/Disassembler tests updated/added.
Differential Revision: http://reviews.llvm.org/D20796
llvm-svn: 271900
|
| |
|
|
|
|
|
| |
Contributed-by: Aditya Kumar <hiraditya@msn.com>
Differential Revision: http://reviews.llvm.org/D20953
llvm-svn: 271895
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D21013
llvm-svn: 271891
|
| |
|
|
|
|
| |
Found by clang-tidy's misc-assert-side-effect.
llvm-svn: 271887
|
| |
|
|
| |
llvm-svn: 271882
|
| |
|
|
|
|
| |
of vector shuffle and select.
llvm-svn: 271872
|
| |
|
|
| |
llvm-svn: 271870
|
| |
|
|
| |
llvm-svn: 271862
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This hasn't been caught before because it requires noalias or similarly
strong alias analysis to actually reproduce.
Fixes http://llvm.org/PR27952 .
Reviewers: hfinkel, sanjoy
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D20944
llvm-svn: 271858
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since FoldOpIntoPhi speculates the binary operation to potentially each
of the predecessors of the PHI node (pulling it out of arbitrary control
dependence in the process), we can FoldOpIntoPhi only if we know the
operation doesn't have UB.
This also brings up an interesting profitability question -- the way it
is written today, commonIRemTransforms will hoist out work from
dynamically dead code into code that will execute at runtime. Perhaps
that isn't the best canonicalization?
Fixes PR27968.
llvm-svn: 271857
|
| |
|
|
| |
llvm-svn: 271851
|