This was intended to be no-functional-change, but it's not - there's a test diff.
So I thought I should stop here and post it as-is to see if this looks like what was expected
based on the discussion in PR34603:
https://bugs.llvm.org/show_bug.cgi?id=34603
Notes:
1. The test improvement occurs because the existing 'LateSimplifyCFG' marker is not carried
through the recursive calls to 'SimplifyCFG()->SimplifyCFGOpt().run()->SimplifyCFG()'.
The parameter isn't passed down, so we pick up the default value from the function signature
after the first level. I assumed that was a bug, so I've passed 'Options' down in all of the
'SimplifyCFG' calls.
2. I split 'LateSimplifyCFG' into 2 bits: ConvertSwitchToLookupTable and KeepCanonicalLoops.
This would theoretically allow us to differentiate the transforms controlled by those params
independently.
3. We could stash the optional AssumptionCache pointer and 'LoopHeaders' pointer in the struct too.
I just stopped here to minimize the diffs.
4. Similarly, I stopped short of messing with the pass manager layer. I have another question that
could wait for the follow-up: why is the new pass manager creating the pass with LateSimplifyCFG
set to true no matter where in the pipeline it's creating SimplifyCFG passes?
// Create an early function pass manager to cleanup the output of the
// frontend.
EarlyFPM.addPass(SimplifyCFGPass());
-->
/// \brief Construct a pass with the default thresholds
/// and switch optimizations.
SimplifyCFGPass::SimplifyCFGPass()
: BonusInstThreshold(UserBonusInstThreshold),
LateSimplifyCFG(true) {} <-- switches get converted to lookup tables and loops may not be in canonical form
If this is unintended, then it's possible that the current behavior of dropping the 'LateSimplifyCFG'
setting via recursion was masking this bug.
Differential Revision: https://reviews.llvm.org/D38138
llvm-svn: 314308
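The options struct described in notes 1 and 2 might look roughly like the sketch below; the struct name and the field defaults are assumptions for illustration, not necessarily what D38138 landed:

  // Sketch only: bundle the knobs that used to be loose parameters so they
  // survive the recursive SimplifyCFG() -> SimplifyCFGOpt::run() calls.
  struct CFGSimplifyOptions {
    unsigned BonusInstThreshold = 1;          // assumed default threshold
    bool ConvertSwitchToLookupTable = false;  // one half of LateSimplifyCFG
    bool KeepCanonicalLoops = true;           // the other half
  };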
|
InlineCost can now understand select instructions. This patch finds free selects and
continues the propagation of SimplifiedValues, ConstantOffsetPtrs, and
SROAArgValues across them.
Differential Revision: https://reviews.llvm.org/D37198
llvm-svn: 314307
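As a purely illustrative example (not taken from the patch or its tests), a select that comes from a ternary is free once its condition is a constant at the call site, so the inliner can now keep simplifying through it when costing the callee:

  // If 'flag' is a constant at the call site, the select inside pick() folds
  // away, and the values it feeds can keep participating in SROA and
  // constant-offset reasoning instead of being treated as opaque.
  static int pick(bool flag, const int *a, const int *b) {
    return *(flag ? a : b);  // the ternary lowers to a select of the pointers
  }
  int caller(const int *a, const int *b) { return pick(true, a, b); }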
|
llvm-svn: 314301
|
llvm-svn: 314299
|
This patch makes analyzeBranch eliminate an unconditional branch to the next instruction.
After basic blocks are re-organized by optimizers, such as machine block placement, a BB may end with an unconditional branch to the next (fallthrough) BB. This patch removes such a redundant branch instruction.
Differential Revision: https://reviews.llvm.org/D37730
llvm-svn: 314297
|
Reviewed by: @MatzeB
Differential Revision: https://reviews.llvm.org/D38176
llvm-svn: 314296
|
Differential Revision: https://reviews.llvm.org/D37473
llvm-svn: 314295
|
results.
As commented on D37849 and rL313547, AVX1 targets were missing a chance to use vmovmskpd for v4f64/v4i64 results for bool vector bitcasts.
llvm-svn: 314293
|
Fixes 32-bit buildbots:
http://bb.pgr.jp/builders/test-llvm-i686-linux-RA/builds/542
http://lab.llvm.org:8011/builders/clang-cmake-thumbv7-a15/builds/11533
http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15/builds/11494
llvm-svn: 314291
|
This patch adds support for passing an offset to -debug-line.
Differential revision: https://reviews.llvm.org/D38240
llvm-svn: 314288
|
This patch adds support for passing an offset to -debug-loc.
Differential revision: https://reviews.llvm.org/D38237
llvm-svn: 314286
|
I implemented isTruncateFree in rL313533; this patch fixes the logic
to match my comment, as the previous logic was too general. Now the
only truncates that are free are i64 -> i32.
Differential Revision: https://reviews.llvm.org/D38234
llvm-svn: 314280
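A minimal sketch of what such a hook typically looks like; the class name is hypothetical, and only the i64 -> i32 rule is taken from the message above:

  // Report only i64 -> i32 truncation as free; everything else is not.
  bool MyTargetLowering::isTruncateFree(EVT SrcVT, EVT DstVT) const {
    return SrcVT == MVT::i64 && DstVT == MVT::i32;
  }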
|
llvm-svn: 314278
|
This is necessary, but not sufficient, for having working SJLJ exception
handling on x86_64.
Differential Revision: https://reviews.llvm.org/D38254
llvm-svn: 314277
|
The callsite value is already stored indexed from 0 in
the _Unwind_Context struct. When accessed via the functions
_Unwind_GetIP and _Unwind_SetIP, the value is indexed from 1,
but those functions handle the offsetting. When reading directly
from the struct here, we shouldn't subtract 1.
This matches the code generated by the ARM target, where SJLJ
exception handling is used by default on iOS.
This makes clang-built object files for 32 bit x86 mingw work when
linked with libgcc/libstdc++.
Differential Revision: https://reviews.llvm.org/D38251
llvm-svn: 314276
|
build vectors.
llvm-svn: 314274
|
them, not if we just found existing ones
llvm-svn: 314273
|
by passing in the SelectionDAG.
llvm-svn: 314271
|
Summary:
A new FDR metadata record will support logging a function call argument;
appending multiple metadata records will represent a sequence of arguments,
meaning that "holes" are not representable by the buffer format. Each
call argument is currently a 64-bit value (useful for "this" pointers and
synchronization objects).
If present, we attach this argument to the function call "entry" record it
belongs to, and alter the record's type to notify the user of its presence.
Reviewers: dberris
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32840
llvm-svn: 314269
|
This patch tries to transform cases like:
  for (unsigned i = 0; i < N; i += 2) {
    bool c0 = (i & 0x1) == 0;
    bool c1 = ((i + 1) & 0x1) == 1;
  }
into:
  for (unsigned i = 0; i < N; i += 2) {
    bool c0 = true;
    bool c1 = true;
  }
Because i starts at 0 and is incremented by 2, it is always even, so both conditions are known to be true on every iteration.
This commit also updates test/Transforms/IndVarSimplify/replace-srem-by-urem.ll to prevent constant folding.
Differential Revision: https://reviews.llvm.org/D38272
llvm-svn: 314266
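A tiny standalone check of the invariant the transform relies on (illustrative only, not part of the commit or its tests):

  #include <cassert>

  int main() {
    const unsigned N = 100;  // arbitrary bound for the check
    for (unsigned i = 0; i < N; i += 2) {
      assert((i & 0x1) == 0);        // i starts at 0 and steps by 2: always even
      assert(((i + 1) & 0x1) == 1);  // so i + 1 is always odd
    }
    return 0;
  }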
|
Summary:
Don't bail out on constant divisors for divisions that can be narrowed without
introducing control flow. This gives us a 32-bit multiply instead of an
emulated 64-bit multiply in the generated PTX assembly.
Reviewers: jlebar
Subscribers: jholewinski, mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D38265
llvm-svn: 314253
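For intuition, a sketch of the kind of code this targets (illustrative, not from the patch): both operands are known to fit in 32 bits and the divisor is a compile-time constant, so the division can be done in 32-bit arithmetic and then strength-reduced, rather than calling a 64-bit software division routine.

  // 'wide' is a zero-extended 32-bit value, so wide / 9 only needs 32-bit math
  // even though the expression is typed as a 64-bit division.
  unsigned long long narrowable_div(unsigned x) {
    unsigned long long wide = x;  // high 32 bits are known to be zero
    return wide / 9;              // constant divisor: no reason to bail out
  }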
|
Summary:
In rare cases, loads that were marked as strided but do not get prefetched
could cause a crash if they occurred in a loop with other
colliding loads.
Reviewers: mcrosier
Subscribers: aemerson, rengolin, javed.absar, kristof.beyls
Differential Revision: https://reviews.llvm.org/D38261
llvm-svn: 314252
|
correct some opcode tag computations.
Summary:
This addresses a correctness bug for LD[1234]*_POST opcodes that have
the prefetcher fix applied to them: the base register was not being
written back from the temp after being incremented, so it would appear
to never be incremented.
Also, fix some opcode tag computations based on some updated HW details
to get better tag avoidance and thus better prefetcher performance.
Reviewers: mcrosier
Subscribers: aemerson, rengolin, javed.absar, kristof.beyls
Differential Revision: https://reviews.llvm.org/D38256
llvm-svn: 314251
|
llvm-svn: 314250
|
postRA pseudos like the NOREX version of TEST.""
The late MOV8rr_NOREX that caused the crash has been removed.
llvm-svn: 314249
|
This hook is called after register allocation with two physical registers. We don't need a separate instruction at that time to force register class constraints. I left in the assert though. We also have a fatal error in X86MCCodeEmitter if we ever encode an H-reg and a REX prefix.
llvm-svn: 314248
|
llvm-svn: 314247
|
Previously these were being included as both imports and
exports, with the import being satisfied by the export
(or some strong symbol) at runtime. However, this proved
unnecessary and actually complicated linking, as it meant
there was not a 1-to-1 mapping between a wasm function/global
index and a linker symbol.
Differential Revision: https://reviews.llvm.org/D38246
llvm-svn: 314245
|
instructions
In the past while, I've committed a number of patches in the PowerPC back end
aimed at eliminating comparison instructions. However, this causes some failures
in proprietary source and these issues are not observed in SPEC or any open
source packages I've been able to run.
As a result, I'm pulling the entire series and will refactor it to:
- Have a single entry point for easy control
- Have fine-grained control over which patterns we transform
A side-effect of this is that the test cases for these patches (and those
modified by them) are XFAIL-ed. This is a temporary measure, as it is
counter-productive to remove/modify these test cases and then have to modify them again when
the refactored patch is recommitted.
The failure will be investigated in parallel to the refactoring effort and
the recommit will either have a fix for it or will leave this transformation
off by default until the problem is resolved.
llvm-svn: 314244
|
X86InterleavedAccess (VF{8|16|32} stride 3)
This patch expands the support of lowerInterleavedStore to {8|16|32}x8i stride 3.
LLVM creates suboptimal shuffle code-gen for AVX2. Overall, this patch is a specific fix for the pattern (Stride=3, VF={8|16|32}).
This patch is part two of two patches and covers the store (interleaved) side.
The goal of the patch is to optimize the following sequence:
a0 a1 a2 a3 a4 a5 a6 a7
b0 b1 b2 b3 b4 b5 b6 b7
c0 c1 c2 c3 c4 c5 c6 c7
into
a0 b0 c0 a1 b1 c1 a2 b2
c2 a3 b3 c3 a4 b4 c4 a5
b5 c5 a6 b6 c6 a7 b7 c7
Reviewers: zvi, guyblank, dorit, Ayal
Differential Revision: https://reviews.llvm.org/D37117
Change-Id: I56ced8bcbea809a37654060771911ade20246ccc
llvm-svn: 314234
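A scalar reference for the interleaving shown above (illustrative; the patch itself operates on shuffle masks, not on a loop like this): with stride 3 and VF = 8, element i of each of the three inputs lands at output positions 3*i, 3*i+1 and 3*i+2.

  // out must hold 24 elements: a0 b0 c0 a1 b1 c1 ... a7 b7 c7.
  void interleave3x8(const char a[8], const char b[8], const char c[8],
                     char out[24]) {
    for (int i = 0; i < 8; ++i) {
      out[3 * i + 0] = a[i];
      out[3 * i + 1] = b[i];
      out[3 * i + 2] = c[i];
    }
  }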
|
foldICmpAndShift.
If this transformation succeeds, we're going to remove our dependency on the shift by rewriting the 'and'. So it doesn't matter how many uses the shift has.
This distributes the one-use check to the other transforms in foldICmpAndConstConst that do need it.
Differential Revision: https://reviews.llvm.org/D38206
llvm-svn: 314233
|
It is useful for the symbol to contain the index of the
function or global it represents in the function/global
index space.
For imports we also store the import index so that the
linker can find, for example, the signature of the
corresponding function, which is defined by the import.
In the long run we need to decide whether this API
surface should be closer to the binary format (where imported
functions are separate) or the wasm spec (where the
function index space is unified).
Differential Revision: https://reviews.llvm.org/D38189
llvm-svn: 314230
|
Differential Revision: https://reviews.llvm.org/D38191
llvm-svn: 314223
|
Summary: This patch extends the v8i32/v4i32 custom lowering to support v16i32.
Reviewers: zvi, RKSimon
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38274
llvm-svn: 314221
|
llvm-svn: 314216
|
Make sure that "initializeSubtargetDependencies" sets all members that
InstrInfo and the like may depend on.
llvm-svn: 314214
|
When dsymutil generates the companion file, it strips all unnecessary
sections by omitting their body and setting the offset in their
corresponding load command to zero.
One such section is the .eh_frame section, as it contains runtime
information rather than debug information and is part of the __TEXT
segment. When reading this section, we would just read the number of
bytes specified in the load command, starting from offset 0 (i.e. the
beginning of the file).
Rather than trying to parse this obviously invalid section, dwarfdump
now skips it.
Differential revision: https://reviews.llvm.org/D38135
llvm-svn: 314208
|
The XOP rotations act as ROTL with positive values and ROTR with negative values, which means that we can treat them all as ROTL with an unsigned modulo. We already check that we're only trying to lower as ROTL for XOP rotations.
Differential Revision: https://reviews.llvm.org/D37949
llvm-svn: 314207
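A standalone illustration of the identity being relied on (not from the patch): a rotate-left by a negative amount equals a rotate-right by the corresponding positive amount once the amount is reduced with an unsigned modulo of the bit width.

  #include <cassert>
  #include <cstdint>

  // Rotate left; only the low 5 bits of the amount matter for 32-bit values.
  static uint32_t rotl32(uint32_t x, uint32_t r) {
    r &= 31;
    return r == 0 ? x : (x << r) | (x >> (32 - r));
  }

  int main() {
    uint32_t x = 0x12345678u;
    uint32_t rotr3 = (x >> 3) | (x << 29);                  // rotate right by 3
    assert(rotl32(x, static_cast<uint32_t>(-3)) == rotr3);  // -3 mod 32 == 29
    assert(rotl32(x, 29) == rotr3);
    return 0;
  }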
|
early store also wrote to (2nd try)
This is a 2nd attempt at:
https://reviews.llvm.org/rL310055
...which was reverted at rL310123 because of PR34074:
https://bugs.llvm.org/show_bug.cgi?id=34074
In this version, we break out of the inner loop after we successfully merge and kill a pair of stores. In the
earlier rev, we were continuing instead, which meant we could process the invalid info from a now-dead store.
Original commit message (authored by Filipe Cabecinhas):
This fixes PR31777.
If both stores' values are ConstantInt, we merge the two stores
(shifting the smaller store appropriately) and replace the earlier (and
larger) store with an updated constant.
In the future we should also support vectors of integers. And maybe
float/double if we can.
Differential Revision: https://reviews.llvm.org/D30703
llvm-svn: 314206
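A concrete little-endian example of the constant folding involved (illustrative, not from the patch): if a 4-byte store of 0x11223344 is later partially overwritten by a 1-byte store of 0x55 at the same address, the pair can be replaced by a single wide store of the combined constant.

  #include <cassert>
  #include <cstdint>

  int main() {
    uint32_t earlier = 0x11223344u;  // wider constant store, happens first
    uint8_t  later   = 0x55u;        // smaller constant store, same address
    // On a little-endian target the byte store replaces the low byte:
    uint32_t merged  = (earlier & 0xFFFFFF00u) | later;
    assert(merged == 0x11223355u);
    return 0;
  }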
|
https://bugs.llvm.org/show_bug.cgi?id=29061
Don't try referencing REX-needed regs when not in 64-bit mode.
This aligns with GCC.
Differential Revision: https://reviews.llvm.org/D37801
llvm-svn: 314203
|
Summary:
This fixes https://bugs.llvm.org/show_bug.cgi?id=34714.
Patch by Marco Castelluccio
Reviewers: rnk
Reviewed By: rnk
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38224
llvm-svn: 314201
|
pseudos like the NOREX version of TEST."
Makes llc crash. This reverts commit r314151.
llvm-svn: 314199
|
llvm side.
Removing X86 broadcast(f/i)32x2 intrinsics from llvm.
Adding autoUpgrade support.
Moving matching tests from avx512dq-intrinsics.ll to avx512dq-intrinsics-upgrade.ll and from avx512dqvl-intrinsics.ll to avx512dqvl-intrinsics-upgrade.ll.
Differential Revision: https://reviews.llvm.org/D38220
llvm-svn: 314195
|
Usually the frontend communicates the size of wchar_t via metadata and
we can optimize wcslen (and possibly other calls in the future). In
cases without the wchar_size metadata we would previously try to guess
the correct size based on the target triple; however, this is fragile to
keep up to date and may miss users manually changing the size via flags.
Better to be safe and stop guessing and optimizing if the frontend didn't
communicate the size.
Differential Revision: https://reviews.llvm.org/D38106
llvm-svn: 314185
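For context, the kind of simplification that is gated on this metadata (illustrative; whether it fires depends on the frontend emitting wchar_size):

  #include <cwchar>

  // With a known wchar_t size the optimizer can fold this call to the constant
  // 5; without the metadata it must now leave the call alone rather than guess
  // the size from the target triple.
  std::size_t greeting_len() { return std::wcslen(L"hello"); }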
|
Thanks to Eli Friedman for the suggestion.
llvm-svn: 314182
|
spot as the original MBB
Discovered in avr-rust/rust#62
https://github.com/avr-rust/rust/issues/62
Patch by Gergo Erdi.
llvm-svn: 314180
|
This was an oversight in the original backend data layout.
The AVR architecture does not have the concept of unaligned loads - all
loads/stores from all addresses are aligned to one byte.
Discovered in avr-rust issue #64
https://github.com/avr-rust/rust/issues/64
Patch by Gergo Erdi.
llvm-svn: 314179
|
Summary:
Sanitizer blacklist entries currently apply to all sanitizers--there
is no way to specify that an entry should only apply to a specific
sanitizer. This is important for Control Flow Integrity since there are
several different CFI modes that can be enabled at once. For maximum
security, CFI blacklist entries should be scoped to only the specific
CFI mode(s) that entry applies to.
Adding section headers to SpecialCaseLists allows users to specify more
information about list entries, like sanitizer names or other metadata,
like so:
[section1]
fun:*fun1*
[section2|section3]
fun:*fun23*
The section headers are regular expressions. For backwards compatibility,
blacklist entries entered before a section header are put into the '[*]'
section so that blacklists without sections retain the same behavior.
SpecialCaseList has been modified to also accept a section name when
matching against the blacklist. It has also been modified so the
follow-up change to clang can define a derived class that allows
matching sections by SectionMask instead of by string.
Reviewers: pcc, kcc, eugenis, vsk
Reviewed By: eugenis, vsk
Subscribers: vitalybuka, llvm-commits
Differential Revision: https://reviews.llvm.org/D37924
llvm-svn: 314170
|
It leads to some improvements, but also a regression for the simple
case, so it's not clearly a good idea.
test/CodeGen/ARM/vcvt.ll now has test coverage to show the difference.
Ultimately, the right solution is probably to custom-lower fp-to-int
conversions, to something like ARMISD::VCVT_F32_S32 plus a bitcast.
It's hard to do the right thing when the implicit bitcast isn't visible
to DAG transforms.
llvm-svn: 314169
|
R12 is used for the SwiftError parameter. It is no longer a CSR as it
is used to transfer the SwiftError, and the caller must preserve it if
they need to.
llvm-svn: 314165