| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
| |
llvm-svn: 252643
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ARM V6T2 has instructions for efficient count-leading/trailing-zeros, so this should be
considered a cheap operation (and therefore fair game for speculation) for any ARM V6T2
implementation.
The net result of allowing this speculation for the regression tests in this patch is
that we get this code:
ctlz:
clz r0, r0
bx lr
cttz:
rbit r0, r0
clz r0, r0
bx lr
Instead of:
ctlz:
cmp r0, #0
moveq r0, #32
clzne r0, r0
bx lr
cttz:
cmp r0, #0
moveq r0, #32
rbitne r0, r0
clzne r0, r0
bx lr
This will help solve a general speculation/despeculation problem noted in PR24818:
https://llvm.org/bugs/show_bug.cgi?id=24818
Differential Revision: http://reviews.llvm.org/D14469
llvm-svn: 252639
|
| |
|
|
|
|
|
|
| |
default diagnostic handler.
Differential Revision: http://reviews.llvm.org/D14520
llvm-svn: 252633
|
| |
|
|
|
|
|
|
|
|
|
| |
This allows avoiding the default Expand behavior which
introduces stack usage. Bitcast the scalar and replace
the missing elements with undef.
This is covered by existing tests and used by a future
commit which makes 64-bit vectors legal types on AMDGPU.
llvm-svn: 252632
|
| |
|
|
|
|
|
| |
This is covered by existing tests and used by a future
commit which makes 64-bit vectors legal types on AMDGPU.
llvm-svn: 252631
|
| |
|
|
|
|
|
|
|
|
| |
This is for AMDGPU to implement v2i64 extract as extract of
half of a v4i32.
This is covered by existing tests and used by a future
commit which makes 64-bit vectors legal types on AMDGPU.
llvm-svn: 252630
|
| |
|
|
|
|
|
|
|
|
| |
This is a cleaned up version of a patch by John Regehr with permission. Originally found via the souper tool.
If we add an odd number to x, then bitwise-and the result with x, we know that the low bit of the result must be zero. Either it was zero in x originally, or the add cleared it in the temporary value. As a result, one of the two values anded together must have the bit cleared.
Differential Revision: http://reviews.llvm.org/D14315
llvm-svn: 252629
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
When configuring various llvm projects that use AddLLVM.cmake, this warning is
emitted many times, flooding the screen:
Policy CMP0007 is not set: list command no longer ignores empty elements.
The fix is removing an extra semicolon.
Differential Revision: http://reviews.llvm.org/D14339
llvm-svn: 252628
|
| |
|
|
| |
llvm-svn: 252627
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Ensure WeakAny variables are imported as ExternalWeak declarations. To
handle WeakAny more consistently and fix this issue:
1) Update helper doImportAsDefinition to properly flag WeakAny variables
and aliases as not importing defintions.
Update callers of doImportAsDefinition to remove now redundant checks for
WeakAny aliases, or ignore aliases, as appropriate.
2) Add any !doImportAsDefinition GVs to DoNotLinkFromSource set during
linking of the GV prototype, where we usually add GVs to the
DoNotLinkFromSource set for other reasons.
Remove now unnecessary adding of WeakAny aliases to
DoNotLinkFromSource set from copyGlobalAliasProto.
Remove now unnecessary guard against linking non-imported function
bodies from ModuleLinker::run.
llvm-svn: 252626
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
isCheapToSpeculateCtlz()
AArch64 has instructions for efficient count-leading/trailing-zeros, so this should be
considered a cheap operation (and therefore fair game for speculation) for any AArch64
implementation.
The net result of allowing this speculation for the regression tests in this
patch is that we get this code:
ctlz:
clz w0, w0
ret
cttz:
rbit w8, w0
clz w0, w8
ret
Instead of:
ctlz:
cbz w0, .LBB0_2
clz w0, w0
ret
.LBB0_2:
orr w0, wzr, #0x20
ret
cttz:
cbz w0, .LBB1_2
rbit w8, w0
clz w0, w8
ret
.LBB1_2:
orr w0, wzr, #0x20
ret
See D14469 for the larger motivation.
Differential Revision: http://reviews.llvm.org/D14505
llvm-svn: 252625
|
| |
|
|
|
|
|
| |
This reverts commit r252604, as it broke all ARM and AArch64 buildbots, as
well as some x86, et al.
llvm-svn: 252623
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D14495
llvm-svn: 252621
|
| |
|
|
| |
llvm-svn: 252617
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is one of the problems noted in PR25016:
https://llvm.org/bugs/show_bug.cgi?id=25016
and:
http://lists.llvm.org/pipermail/llvm-dev/2015-October/090998.html
The spilling problem is independent and not addressed by this patch.
The MachineCombiner was doing reassociations that don't improve or even worsen the critical path.
This is caused by inclusion of the "slack" factor when calculating the critical path of the original
code sequence. If we don't add that, then we have a more conservative cost comparison of the old code
sequence vs. a new sequence. The more liberal calculation must be preserved, however, for the AArch64
MULADD patterns because benchmark regressions were observed without that.
The two failing test cases now have identical asm that does what we want:
a + b + c + d ---> (a + b) + (c + d)
Differential Revision: http://reviews.llvm.org/D13417
llvm-svn: 252616
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Added fixes for stage2 failures: CMOV is not commutable; commuting the operands results in the condition being flipped! d'oh!
Original commit message:
If we have a CMOV, OR and AND combination such as:
if (x & CN)
y |= CM;
And:
* CN is a single bit;
* All bits covered by CM are known zero in y;
Then we can convert this to a sequence of BFI instructions. This will always be a win if CM is a single bit, will always be no worse than the TST & OR sequence if CM is two bits, and for thumb will be no worse if CM is three bits (due to the extra IT instruction).
llvm-svn: 252606
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is fix for PR24059.
When we are hoisting instruction above some condition it may turn out
that metadata on this instruction was control dependant on the condition.
This metadata becomes invalid and we need to drop it.
This patch should cover most obvious places of speculative execution (which
I have found by greping isSafeToSpeculativelyExecute). I think there are more
cases but at least this change covers the severe ones.
Differential Revision: http://reviews.llvm.org/D14398
llvm-svn: 252604
|
| |
|
|
|
|
|
| |
This is needed for targets which do not support big-endian with the default
triple.
llvm-svn: 252603
|
| |
|
|
|
|
|
|
| |
The local variable Hi is never being read.
Issue identified by the Clang static analyzer.
llvm-svn: 252600
|
| |
|
|
|
|
|
|
|
|
|
|
| |
For big-endian targets, when we merge two halfword loads into a word load, the
order of the halfwords in the loaded value is reversed compared to
little-endian, so the load-store optimiser needs to swap the destination
registers.
This does not affect merging of two word loads, as we use ldp, which treats the
memory as two separate 32-bit words.
llvm-svn: 252597
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D14499
llvm-svn: 252595
|
| |
|
|
|
|
|
|
| |
instructions.
Differential Revision: http://reviews.llvm.org/D14492
llvm-svn: 252592
|
| |
|
|
| |
llvm-svn: 252582
|
| |
|
|
| |
llvm-svn: 252580
|
| |
|
|
| |
llvm-svn: 252579
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For CoreCLR on Windows, stack probes must be emitted as inline sequences that probe successive stack pages
between the current stack limit and the desired new stack pointer location. This implements support for
the inline expansion on x64.
For in-body alloca probes, expansion is done during instruction lowering. For prolog probes, a stub call
is initially emitted during prolog creation, and expanded after epilog generation, to avoid complications
that arise when introducing new machine basic blocks during prolog and epilog creation.
Added a new test case, modified an existing one to exclude non-x64 coreclr (for now).
Add test case
Fix tests
llvm-svn: 252578
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: brad.king, beanz
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14523
llvm-svn: 252576
|
| |
|
|
| |
llvm-svn: 252574
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
AArch64 has the ability to use the top 8-bits of an "address" for extra
information, with the memory subsystem automatically masking them off for loads
and stores. When that's happening, we can sometimes skip masks on memory
operations in the compiler.
However, this requires the host OS and support stack to preserve those bits so
it can't be enabled everywhere. In principle iOS 8.0 and above do take the
required precautions and but we'll put it under a flag for now.
llvm-svn: 252573
|
| |
|
|
|
|
|
|
| |
darwin’s nm(1).
Also a small fix to match printing of Mach-O objects with -format posix.
llvm-svn: 252567
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Lower LLVM's 'unreachable' terminator to ISD::TRAP, and lower ISD::TRAP to
wasm's 'unreachable' expression.
WebAssembly type-checks expressions, but a noreturn function with a
return type that doesn't match the context will cause a check
failure. So we lower LLVM 'unreachable' to ISD::TRAP and then lower that
to WebAssembly's 'unreachable' expression, which typechecks in any
context and causes a trap if executed.
Differential Revision: http://reviews.llvm.org/D14515
llvm-svn: 252566
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I'm not sure what the point of this was. I'm not sure why
you would ever define an instruction that produces an unallocatable
register class. No tests fail with this removed, and it seems like
it should be a verifier error to define such an instruction.
This was problematic for AMDGPU because it would make bad decisions
by arbitrarily changing the register class when unsetting isAllocatable
for VS_32/VS_64, which is currently set as a workaround to this problem.
AMDGPU uses the VS_32/VS_64 register classes to represent operands which
can use either VGPRs or SGPRs. When isAllocatable is unset for these,
this would need to pick either the SGPR or VGPR class and insert either
a copy we don't want, or an illegal copy we would need to deal with
later. A semi-arbitrary register class ordering decision is made in tablegen,
which resulted in always picking a VGPR class because it happens to have
more registers than the SGPR register class. We really just want to
use whatever register class the original register had.
llvm-svn: 252565
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Make indexed value profile data more compact by peeling out
the per-site value count field into its own smaller sized array.
- Introduced formal data structure definitions to specify value
profile data layout in indexed format. Previously the layout
of the data is only assumed in the client code (scattered in
three different places : size computation, EmitData, and ReadData
- The new data structure serves as a central place for layout documentation.
- Add interfaces to force BE output for value profile data (testing purpose)
- Add byte swap unit tests
Differential Revision: http://reviews.llvm.org/D14401
llvm-svn: 252563
|
| |
|
|
| |
llvm-svn: 252561
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes a bug in ARMAsmPrinter::EmitUnwindingInstruction where
llvm_unreachable was reached because t2ADDri wasn't handled.
Test case provided by Tim Northover.
rdar://problem/23270609
http://reviews.llvm.org/D14518
llvm-svn: 252557
|
| |
|
|
| |
llvm-svn: 252555
|
| |
|
|
|
|
|
|
|
| |
Simply perform additional report_context() calls after a report()
instead of adding more and more overloaded variations of report(). Also
improve several instances where information was output in an ad-hoc way
probably because no matching report() overload was available.
llvm-svn: 252552
|
| |
|
|
|
|
|
| |
MachineInstr::print() with SkipOppers==true does not produce a
linebreak, so we have to do that in MachineVerifier::report().
llvm-svn: 252551
|
| |
|
|
|
|
|
| |
The code was passing a target machine pointer which degraded to a true
operand to SkipOppers.
llvm-svn: 252550
|
| |
|
|
| |
llvm-svn: 252549
|
| |
|
|
|
|
| |
(Reid fixed the original error, but this seems nice to do in any case)
llvm-svn: 252548
|
| |
|
|
| |
llvm-svn: 252542
|
| |
|
|
|
|
| |
Fixes machine verification failures with David's latest EH change.
llvm-svn: 252541
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was suggested in:
http://reviews.llvm.org/D13956
and is a follow-on to:
http://reviews.llvm.org/rL252515
http://reviews.llvm.org/rL252519
This lets us remove logically equivalent/duplicated code from DAGCombiner and X86ISelDAGToDAG.
A corresponding function for IR instructions already exists in ValueTracking.
llvm-svn: 252539
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Be honest about using iterator semantics in `SlotIndex::getNextSlot()`
and `SlotIndex::getPrevSlot()`. Instead of calling `getNextNode()` --
which is documented (but fails) to check for the sentinel -- call
`&*++getIterator()`.
This is (surprisingly!) a NFC commit. `ilist_traits<IndexListEntry>`
has an `ilist_half_node<IndexListEntry>` as a sentinel (and no other
fields), and so the layout of `ilist<IndexListEntry>` is:
--
struct ilist<IndexListEntry> {
ilist_half_node<IndexListEntry> Sentinel;
IndexListEntry *Head;
IndexListEntry *getHead() { return Head; }
IndexListEntry *getSentinel() { return cast<...>(&Sentinel); }
};
--
In memory, this happens to look just like:
--
struct ilist<IndexListEntry> {
ilist_node<IndexListEntry> Sentinel;
IndexListEntry *getHead() { return Sentinel.getNext(); }
IndexListEntry *getSentinel() { return cast<...>(&Sentinel); }
};
--
As a result, `ilist_node<IndexListEntry>::getNextNode()` that checks
`getNext()` of the possible sentinel will get a pointer to the head of
the list; it will never detect the sentinel, and will return the
sentinel itself instead of `nullptr` in the special cases.
Since `getNextNode()` and `getPrevNode()` don't work, just be honest
that we're not checking for the end/beginning of the list here. Since
this code works, I guess we must never go past the sentinel.
(It's possible we're just getting lucky, and the new code will get
"lucky" in the same situations. To properly fix that hypothetical bug,
we would need to check the iterator against `end()`/`begin()`.)
llvm-svn: 252538
|
| |
|
|
|
|
|
|
|
|
|
| |
derived class objects
SCEVUnionPredicate is copied constructed here: lib/Transforms/Scalar/LoopDistribute.cpp:793
and move assigned (which can use the base class's copy ctor just
fine/without extra cost (I'd add it if it weren't for MSVC's issues
meaning = default is insufficient)) here: lib/Transforms/Utils/LoopVersioning.cpp:46
llvm-svn: 252537
|
| |
|
|
|
|
|
|
|
| |
This is a prerequisite for further optimisations of these functions,
which will be commited as a separate patch.
Differential Revision: http://reviews.llvm.org/D14219
llvm-svn: 252535
|
| |
|
|
|
|
| |
instrumentation (it will fail at start-up time)
llvm-svn: 252533
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch does a couple of things:
- Adds a new argument `--shared-mode` which accepts a list of components and prints whether or not the provided components need to be linked statically or shared.
- Fixes `--libnames` when CMake BUILD_SHARED_LIBS is used.
- Fixes `--libnames`, `--libs`, and `--libfiles` for dylib when static components aren't installed.
- Fixes `--libnames`, `--libs`, `--libfiles`, and `--components` to use LLVM_DYLIB_COMPONENTS as the component manifest for dylib linking.
- Uses the host platform's usual convention for filename extensions and such, instead of always defaulting to Unix-izms.
Because I don't own a Mac, I am not able to test the Mac platform dependent stuff locally. If someone would be willing to run a build for me on their machine (unless there's a better option), I'd appreciate it.
Reviewers: jfb, brad.king, whitequark, beanz
Subscribers: beanz, jauhien, llvm-commits
Differential Revision: http://reviews.llvm.org/D13198
llvm-svn: 252532
|
| |
|
|
|
|
|
| |
This avoids the need to have two dummy implementations of
findModulesAndOffsets.
llvm-svn: 252531
|